Table 1

Comparison of subtype assignments (jpHMM results versus the current database assignment, which is based on the original literature)

AG set and BC set

| Set | Dataset | Database subtype | Num of sequences | Num of problematic sequences [1] | Num of discordant sequences [2] |
|---|---|---|---|---|---|
| AG | Full length (world), N = 140 | A | 72 | 1 | 0 |
| AG | Full length (world), N = 140 | G | 12 | 0 | 0 |
| AG | Full length (world), N = 140 | 02 | 48 | 2 | 1 |
| AG | Full length (world), N = 140 | AG | 8 | 0 | 0 |
| BC | Full length (world), N = 509 | B | 152 | 15 | 2 |
| BC | Full length (world), N = 509 | C | 334 | 12 | 0 |
| BC | Full length (world), N = 509 | 07 | 7 | 0 | 0 |
| BC | Full length (world), N = 509 | 08 | 4 | 0 | 0 |
| BC | Full length (world), N = 509 | BC | 12 | 3 | 2 |
| BC | Fragments (Asia), N = 4413 | B | 3133 | 0 | 24 |
| BC | Fragments (Asia), N = 4413 | C | 1048 | 0 | 6 |
| BC | Fragments (Asia), N = 4413 | 07 | 17 | 0 | 6 |
| BC | Fragments (Asia), N = 4413 | 08 | 171 | 0 | 102 |
| BC | Fragments (Asia), N = 4413 | BC | 44 | 0 | 27 |

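BF set

| Set | Dataset | Database subtype | Num of sequences | Num of problematic sequences [1] | Num of discordant sequences [2] |
|---|---|---|---|---|---|
| BF | Full length (world), N = 220 | B | 152 | 15 | 2 |
| BF | Full length (world), N = 220 | F | 12 | 0 | 2 |
| BF | Full length (world), N = 220 | 12 | 11 | 0 | 6 |
| BF | Full length (world), N = 220 | 17 | 2 | 0 | 2 |
| BF | Full length (world), N = 220 | 28 | 3 | 0 | 1 |
| BF | Full length (world), N = 220 | 29 | 4 | 0 | 1 |
| BF | Full length (world), N = 220 | BF | 36 | 0 | 1 |
| BF | Fragments (S. America), N = 4153 | B | 3070 | 0 | 74 |
| BF | Fragments (S. America), N = 4153 | F | 242 | 0 | 19 |
| BF | Fragments (S. America), N = 4153 | 12 | 261 | 0 | 31 |
| BF | Fragments (S. America), N = 4153 | 17 | 0 | 0 | 0 |
| BF | Fragments (S. America), N = 4153 | 28 | 0 | 0 | 0 |
| BF | Fragments (S. America), N = 4153 | 29 | 0 | 0 | 0 |
| BF | Fragments (S. America), N = 4153 | BF | 580 | 0 | 107 |
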
1. Problematic sequences are those that could not be unequivocally assigned. They meet one of the following criteria: (a) they contain an unusually high content of the IUPAC code N (defined as > 100 continuous Ns, or > 7% N for sequences shorter than 1000 nt, or > 5% N for sequences of 1000-2999 nt, or > 3% N for sequences of 3000 nt or longer); (b) they contain an artifactual deletion of > 100 nt.

2. Classification of the sequences was compared between the database assignments (the majority of which were extracted from the literature) and the jpHMM predictions.
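
The N-content thresholds in footnote 1 can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, and it assumes that sequence length is the total number of characters, Ns included, and that the input contains no alignment gap characters. The artifactual-deletion criterion is not covered, since it would require comparison against a reference alignment.

```python
import re


def max_n_run(seq: str) -> int:
    """Length of the longest uninterrupted run of N in the sequence."""
    runs = re.findall(r"N+", seq.upper())
    return max((len(run) for run in runs), default=0)


def n_fraction_threshold(length: int) -> float:
    """Maximum tolerated fraction of N for a sequence of the given length (nt)."""
    if length < 1000:
        return 0.07
    if length < 3000:  # 1000-2999 nt
        return 0.05
    return 0.03  # 3000 nt or longer


def has_high_n_content(seq: str) -> bool:
    """True if the sequence exceeds the N-content limits of footnote 1.

    Assumption: length counts every character, including the Ns themselves.
    """
    seq = seq.upper()
    if not seq:
        return False
    if max_n_run(seq) > 100:
        return True
    return seq.count("N") / len(seq) > n_fraction_threshold(len(seq))
```

For example, a 3000 nt sequence with 100 scattered Ns (about 3.3% N) is flagged, whereas a 2999 nt sequence with the same number of Ns is not, because the 5% limit applies below 3000 nt.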
Zhang et al. Retrovirology 2010 7:25 doi:10.1186/1742-4690-7-25 |