
Table S3: Detailed information regarding the gene-based assessment of F11 assemblies

Draft Coord Pilon Coord Draft Seq


scaffold00001:36376 scaffold00001_pilon:36376 C
scaffold00001:82428-93547 scaffold00001_pilon:82428-82703 NNNNNNNNNNNNNNN
scaffold00001:98098 scaffold00001_pilon:87254 C
scaffold00001:119994 scaffold00001_pilon:109150 C
scaffold00001:206663 scaffold00001_pilon:195819 T
scaffold00001:227630-233009 scaffold00001_pilon:216786-216835 NNNNNNNNNNNNNNN
scaffold00001:234632-234736 scaffold00001_pilon:218458 GANNNNNNNNNNGAA
scaffold00001:391571-392830 scaffold00001_pilon:375292-376459 NNNNNNNNNNNNNNN
scaffold00001:822485-822741 scaffold00001_pilon:806114-806456 TTCGTCGACCATCAT
scaffold00001:824650-829730 scaffold00001_pilon:808365-813445 TCTTGACGATGCGGT
scaffold00001:1025847 scaffold00001_pilon:1009562-1009666 .
scaffold00001:1225902-12260 scaffold00001_pilon:1209722-1210275 NNNNNNNNNNNNNNN
scaffold00001:1262828-12639 scaffold00001_pilon:1247097-1248264 NNNNNNNNNNNNNNN
scaffold00001:1311817 scaffold00001_pilon:1296099 C
scaffold00001:1312655-13129 scaffold00001_pilon:1296937-1297261 NNNNNNNNNNNNNNN
scaffold00001:1642178-16423 scaffold00001_pilon:1626475-1627333 NNNNNNNNNNNNNNN
scaffold00001:1751529 scaffold00001_pilon:1736499 G
scaffold00001:1751555 scaffold00001_pilon:1736525 C
scaffold00001:1751621 scaffold00001_pilon:1736591 C
scaffold00001:1783366-17835 scaffold00001_pilon:1768336 AGGGTTAGGCCCGTT
scaffold00001:1923038-19230 scaffold00001_pilon:1907858 TGG
scaffold00001:1951717-19555 scaffold00001_pilon:1936534-1940398 NNNNNNNNNNNNNNN
scaffold00001:1959786-19659 scaffold00001_pilon:1944603-1945771 NNNNNNNNNNNNNNN
scaffold00001:1970743-19715 scaffold00001_pilon:1950539-1950777 NNNNNNNNNNNNNNN
scaffold00001:1979617-19797 scaffold00001_pilon:1958872-1959539 NNNNNNNNNNNNNNN
scaffold00001:2017173-20187 scaffold00001_pilon:1996961-1998129 NNNNNNNNNNNNNNN
scaffold00001:2622670 scaffold00001_pilon:2602067 C
scaffold00001:2685980 scaffold00001_pilon:2665377 A
scaffold00001:3331364 scaffold00001_pilon:3310761 A
scaffold00001:3382027 scaffold00001_pilon:3361424 C
scaffold00001:3811160 scaffold00001_pilon:3790556 C
scaffold00001:3860752 scaffold00001_pilon:3840148 C
scaffold00001:4323907-43241 scaffold00001_pilon:4303303 TGGGGTGATTGCCTG
scaffold00001:4425862-44258 scaffold00001_pilon:4405036-4405491 NNNNNNNNNN
scaffold00004:48 scaffold00004_pilon:48 G
scaffold00004:58 scaffold00004_pilon:58 G
scaffold00004:59 scaffold00004_pilon:58 G
scaffold00004:62 scaffold00004_pilon:61 A
scaffold00004:64 scaffold00004_pilon:63 G

Pilon Seq Event Ref Coord Assessment Notes gene loci


G SNP -3907031 Correct Perfect TBFG_13513
TTATACA ClosedGap -3860978 No worse Good fill despite R flank misasse TBFG_13465
T SNP -3853732 Correct Perfect TBFG_13458
T SNP -3831956 Correct Issue with R flank TBFG_13437
C SNP -3744993 Correct Perfect TBFG_13376
TCACCT ClosedGap -3723965 Correct Perfect intergenic
. ClosedGap -3722293 Correct Perfect intergenic
AGCGGG ClosedGap -3565522 Correct Perfect TBFG_13206 & TB
ATGGGA PartialFill -3134509 Correct 589 L & 344 R Perfect intergenic
AGGTTC PartialFill -3132023 Correct L good extension despite flank iss intergenic
CGGCGC BreakFix -2936181 Correct Perfect TBFG_12611
ATCAGC ClosedGap -2736320 Correct Perfect TBFG_12452
GAGCGG ClosedGap -2698588 Correct Perfect TBFG_12416 & TB
T SNP 935275 Correct Perfect TBFG_10852
GGCGAA PartialFill -1485202 Correct Perfect TBFG_11346
TGCAGC PartialFill -2317069 Correct Perfect TBFG_12085
T SNP -2205568 Correct Perfect TBFG_11974
T SNP -2205542 Correct Perfect TBFG_11974
T SNP -2205476 Correct Perfect TBFG_11974
. BreakFix -2173652 No worse no worse? Pilon took away 150 ra TBFG_11946
. Del -2034120 Correct Perfect TBFG_11817
ACCGGC PartialFill -2005444 Correct 1986 L & 1274 R perfect TBFG_11792 & TB
GACCGG ClosedGap -1998623 Correct Perfect TBFG_11784 & TB
CATCGA ClosedGap -1992687 Correct Perfect TBFG_11780
GTGAGC PartialFill -1984354 Correct 967 L & 291 R perfect TBFG_11774
GAGCGG ClosedGap -1947252 Correct Perfect TBFG_11739 & TB
G SNP -1343505 Correct Perfect TBFG_11220
G SNP -1280196 Correct Perfect TBFG_11025
T SNP -634655 Correct Perfect TBFG_10549
. Del -583887 Correct Perfect intergenic
T SNP -154507 Correct Perfect intergenic
T SNP -104915 Correct Perfect TBFG_10096
. BreakFix -4065750 Correct 1 SNP error in L flank TBFG_13643
CCGGGG PartialFill -3964017 Correct 608 L & 438 R perfect TBFG_13548
C SNP 3858054 Correct by IGV intergenic
. Del 3858064 Correct by IGV TBFG_13461
T SNP 3858064 Correct by IGV TBFG_13461
C SNP 3858067 Correct by IGV TBFG_13461
T SNP 3858069 Correct by IGV TBFG_13461
gene names ref range
PPE family protein 3907031
PPE family protein 3860703-3860978
hypothetical protein 3853732
conserved hypothetical protein 3831956-3831956
PPE family protein 3744993
NO GENE {falls between transposase (TBFT_13358) and transposase (TBFT_13359)} 3723916-3723965
NO GENE {falls between pterin-4-alpha-carbinolamine dehydratase moaB3 (TBFT_13356) and transposase (TBFT_13357)} 3722293-3722293
transposase(s) 3564355-3565522
NO GENE {falls between transposase (TBFT_12829) and conserved hypothetical protein (TBFT_12830)} 3134167-3134509
NO GENE {falls between transposase (TBFT_12826) and transposase (TBFT_12828)} 3126943-3132023
PE-PGRS family protein 2936077-2936181
transposase 2735767-2736320
transposase(s) 2697421-2698588
transposase 935275-935275
hypothetical adenylate cyclase (ATP pyrophosphate-lyase) (adenylyl cyclase) 1484878-1485202
hypothetical protein 2316211-2317069
hypothetical protein 2205568
hypothetical protein 2205542
hypothetical protein 2205476
PPE family protein 2173652-2173652
PPE family protein 2034120-2034120
transposase(s) 2001580-2005444
transposase(s) 1997455-1998623
transposase 1992449-1992687
transposase 1983687-1984354
transposase(s) 1946084-1947252
PPE family protein 1343505-1343505
hypothetical protein 1280196
conserved membrane protein 634655-634655
NO GENE {falls between two component system sensor histidine kinase senX3 (TBFT_10498) and two component system se 583887-583887
NO GENE (falls between TBFG_10127 & TBFG_101 154507
hypothetical protein 104915
hypothetical arginine and proline rich protein 4065750-4065750
fatty-acid-CoA ligase fadD19 3963562-3964017
NO GENE {falls between transposase (TBFT_13460) & hypothetical protein (TBFT_13461)} 3858054-3858054
hypothetical protein 3858064-3858064
hypothetical protein 3858064-3858064
hypothetical protein 3858067-3858067
hypothetical protein 3858069-3858069
