Faculty of Science Institute of Biological Science Bioinformatics SHGB 6111 Assignment

Name : AHMAD MOHAMED GUMEL

Matric no. : SGF 090001

Lecturer : Dr. Geok Yuam Annie Tan

February 2010

Q1) Below are the protein sequences of the DNA-directed RNA polymerase alpha subunit/40 kD subunit in FASTA format. The data are for the organisms: Archaeoglobus fulgidus, Thermoplasma acidophilum, Thermoplasma volcanium, Mycobacterium tuberculosis, Mycobacterium leprae and Haemophilus influenzae.
SEQUENCE RETRIEVAL

>sp|O28002|RPOD_ARCFU DNA-directed RNA polymerase subunit D OS=Archaeoglobus fulgidus GN=rpoD PE=3 SV=1
MMPEIEILEEKDFKIKFILKNASPALANSFRRAMKAEVPAMAVDYVDIYLNSSYFYDEVI
AHRLAMLPIKTYLDRFNMQSECSCGGEGCPNCQISFRLNVEGPKVVYSGDFISDDPDVVF
AIDNIPVLELFEGQQLMLEAVARLGTGREHAKFQPVSVCVYKIIPEIVVNENCNGCGDCI
EACPRNVFEKDGDKVRVKNVMACSMCGECVEVCEMNAISVNETNNFLFTVEGTGALPVRE
VMKKALEILRSKAEEMNKIIEEIQ

>sp|Q9HJD9|RPOD_THEAC DNA-directed RNA polymerase subunit D OS=Thermoplasma acidophilum GN=rpoD PE=3 SV=1
MDVKIFKLSDKYIRFEIDGITPSQANALRRTLINDIPKLAIENVTFHHGEIRDSEGNVYD
SSLPLFDEMVAHRLGLIPLKTDLTMNFRDQCSCGGKGCSLCTVTYSINKLGPSTVFSSDL
QAVSHPDLIPVDGEIPIVKLGPKQAILITAEAILGTAKEHAKWQVTSGVSYKYHREFHVS
KKDFEDWQKLKGACPKSVMSETDTEIVFTDDFGCNDLNVLFESDGVNMIEDDSRFIFQFE
TDGSLTAKETLLYALNRLKDRWDILVESLSE

>sp|Q97B93|RPOD_THEVO DNA-directed RNA polymerase subunit D OS=Thermoplasma volcanium GN=rpoD PE=3 SV=1
MEVKIFKLSDKYMSFEIDGITPSQANALRRTLINDIPKLAIENVTFHHGEIRDAEGNVYD
SSLPLFDEMVAHRLGLIPLKTDLSLNFRDQCSCGGKGCSLCTVTYSINKIGPASVMSGDI
QAISHPDLVPVDPDIPIVKLGAKQAILITAEAILGTAKEHAKWQVTSGVAYKYHREFEVN
KKLFEDWAKIKERCPKSVLSEDENTIVFTDDYGCNDLSILFESDGVQIKEDDSRFIFHFE
TDGSLTAEETLSYALNRLMDRWGILVESLSE

>sp|P66701|RPOA_MYCTU DNA-directed RNA polymerase subunit alpha OS=Mycobacterium tuberculosis GN=rpoA PE=3 SV=1
MLISQRPTLSEDVLTDNRSQFVIEPLEPGFGYTLGNSLRRTLLSSIPGAAVTSIRIDGVL
HEFTTVPGVKEDVTEIILNLKSLVVSSEEDEPVTMYLRKQGPGEVTAGDIVPPAGVTVHN
PGMHIATLNDKGKLEVELVVERGRGYVPAVQNRASGAEIGRIPVDSIYSPVLKVTYKVDA
TRVEQRTDFDKLILDVETKNSISPRDALASAGKTLVELFGLARELNVEAEGIEIGPSPAE
ADHIASFALPIDDLDLTVRSYNCLKREGVHTVGELVARTESDLLDIRNFGQKSIDEVKIK
LHQLGLSLKDSPPSFDPSEVAGYDVATGTWSTEGAYDEQDYAETEQL

>sp|Q9X798|RPOA_MYCLE DNA-directed RNA polymerase subunit alpha OS=Mycobacterium leprae GN=rpoA PE=3 SV=1
MLISQRPTLSEDILTDNRSQFVIEPLEPGFGYTLGNSLRRTLLSSIPGAAVTSIRIDGVL
HEFTTVPGVKEDVTEIILNLKGLVVSSEEDEPVTMYLRKQGPGEVTAGDIVPPAGVTLHN
PGMRIATLNDKGKIEAELVVERGRGYVPAVQNRALGAEIGRIPVDSIYSPVLKVTYKVDA
TRVEQRTDFDKLILDVETKSSITPRDALASAGKTLVELFGLARELNVEAEGIEIGPSPAE
ADHIASFALPIDDLDLTVRSYNCLKREGVHTVGELVSRTESDLLDIRNFGQKSIDEVKVK
LHQLGLSLKDSPDSFDPSEVAGYDVTTGTWSTDGAYDSQDYAETEQL

>sp|P43737|RPOA_HAEIN DNA-directed RNA polymerase subunit alpha OS=Haemophilus influenzae GN=rpoA PE=3 SV=1
MQGSVTEFLKPRLVDIEQISSTHAKVILEPLERGFGHTLGNALRRILLSSMPGCAVTEVE
IDGVLHEYSSKEGVQEDILEVLLNLKGLAVKVQNKDDVILTLNKSGIGPVVAADITYDGD
VEIVNPDHVICHLTDENASISMRIRVQRGRGYVPASSRTHTQEERPIGRLLVDACYSPVE
RIAYNVEAARVEQRTDLDKLVIELETNGALEPEEAIRRAATILAEQLDAFVDLRDVRQPE
IKEEKPEFXPILLRPVDDLELTVRSANCLKAETIHYIGDLVQRTEVELLKTPNLGKKSLT
EIKDVLASRGLSLGMRLENWPPASIAED
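Before alignment, the retrieved records can be checked programmatically. The following is a minimal sketch, assuming UniProt-style headers like those above; the function names `parse_fasta` and `short_id` are hypothetical helpers written for this illustration, and the example record is only the first 86 residues of the A. fulgidus entry.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; drop the leading '>' from the header.
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence line: remove any internal spaces and keep the residues.
            records[header].append(line.replace(" ", ""))
    return {h: "".join(chunks) for h, chunks in records.items()}


def short_id(header):
    """Return the entry name from a 'db|accession|name' style header."""
    first_token = header.split()[0]
    parts = first_token.split("|")
    return parts[2] if len(parts) >= 3 else first_token


example = """>sp|O28002|RPOD_ARCFU DNA-directed RNA polymerase subunit D
MMPEIEILEEKDFKIKFILKNASPALANSFRRAMKAEVPAMAVDYVDIYLNSSYFYDEVI
AHRLAMLPIKTYLDRFNMQSECSCGG
"""

records = parse_fasta(example)
for header, seq in records.items():
    print(short_id(header), len(seq))
```

A record whose header line lacks the leading `>` would be silently merged into the previous sequence by this parser, so comparing the number of parsed records with the expected six is a cheap sanity check.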

1a) Use ClustalW to align these sequences; choose PHYLIP format for the output.

6 356
tuberculos -----MLISQ RPTLSEDVLT DNRSQFVIEP LEPGFGYTLG NSLRRTLLSS
leprae     -----MLISQ RPTLSEDILT DNRSQFVIEP LEPGFGYTLG NSLRRTLLSS
influenzae MQGSVTEFLK PRLVDIEQIS STHAKVILEP LERGFGHTLG NALRRILLSS
acidophilu ---------- -MDVKIFKLS DKYIRFEIDG ITP----SQA NALRRTLIND
volcanium  ---------- -MEVKIFKLS DKYMSFEIDG ITP----SQA NALRRTLIND
falgidus   ---------M MPEIEILEEK DFKIKFILKN ASP----ALA NSFRRAMKAE

           IPGAAVTSIR IDGVLHEFTT VPGVKEDVTE IILNLKSLVV SSEEDEPVTM
           IPGAAVTSIR IDGVLHEFTT VPGVKEDVTE IILNLKGLVV SSEEDEPVTM
           MPGCAVTEVE IDGVLHEYSS KEGVQEDILE VLLNLKGLAV KVQNKDDVIL
           IPKLAIENVT FH--HGEIRD SEGNVYDSSL PLFDEMVAHR LGLIPLKTDL
           IPKLAIENVT FH--HGEIRD AEGNVYDSSL PLFDEMVAHR LGLIPLKTDL
           VPAMAVD--- ---------- -YVDIYLNSS YFYDEVIAHR LAMLPIKTYL

           Y-LRKQGPGE VTAGDIVPPA GVTVHNPGMH IATLNDK-GK LEVELVVERG
           Y-LRKQGPGE VTAGDIVPPA GVTLHNPGMR IATLNDK-GK IEAELVVERG
           T-LNKSGIGP VVAADITYDG DVEIVNPDHV ICHLTDENAS ISMRIRVQRG
           T-MNFRDQCS CGGKGCSLCT VTYSINKLGP STVFSSDLQA VSHPDLIP-V
           S-LNFRDQCS CGGKGCSLCT VTYSINKIGP ASVMSGDIQA ISHPDLVP-V
           DRFNMQSECS CGGEGCPNCQ ISFRLNVEGP KVVYSGD--F ISDDPDVVFA

           RGYVPAVQNR ASG--AEIGR IPVDSIYSPV LKVTYKVDAT RVEQRTDFDK
           RGYVPAVQNR ALG--AEIGR IPVDSIYSPV LKVTYKVDAT RVEQRTDFDK
           RGYVPASSRT HTQEERPIGR LLVDACYSPV ERIAYNVEAA RVEQRTDLDK
           DGEIPIVKLG PKQ----AIL ITAEAILGTA KEHAKWQVTS GVSYKYHREF
           DPDIPIVKLG AKQ----AIL ITAEAILGTA KEHAKWQVTS GVAYKYHREF
           IDNIPVLELF EGQ----QLM LEAVARLGTG REHAKFQPVS VCVYKIIPEI

           LILDVETKNS ISPRDALASA GKTLVELFGL ARELNVEAEG IEIGPSPAEA
           LILDVETKSS ITPRDALASA GKTLVELFGL ARELNVEAEG IEIGPSPAEA
           LVIELETNGA LEPEEAIRRA ATILAEQLDA FVDLRDVRQP EIKEEKPEFX
           HVSKKDFEDW QKLKGACPKS VMSETDTEIV FTDDFGCNDL NVLFES----
           EVNKKLFEDW AKIKERCPKS VLSEDENTIV FTDDYGCNDL SILFES----
           VVNEN-CNGC GDCIEACPRN VFEKDGDKVR VKNVMACSMC GECVEVCEMN

           DHIASFALPI DDLDLTVRSY NCLKREGVHT VGELVARTES DLLDIRNFGQ
           DHIASFALPI DDLDLTVRSY NCLKREGVHT VGELVSRTES DLLDIRNFGQ
           ---PILLRPV DDLELTVRSA NCLKAETIHY IGDLVQRTEV ELLKTPNLGK
           -------DGV NMIEDDSRFI FQFETDGSLT AKETLLYALN RLKDRWDILV
           -------DGV QIKEDDSRFI FHFETDGSLT AEETLSYALN RLMDRWGILV
           ---------A ISVNETNNFL FTVEGTGALP VREVMKKALE ILRSKAEEMN

           KSIDEVKIKL HQLGLSLKDS PPSFDPSEVA GYDVATGTWS TEGAYDEQDY
           KSIDEVKVKL HQLGLSLKDS PDSFDPSEVA GYDVTTGTWS TDGAYDSQDY
           KSLTEIKDVL ASRGLSLGMR LENWPPASIA ED-------- ----------
           ESLSE----- ---------- ---------- ---------- ----------
           ESLSE----- ---------- ---------- ---------- ----------
           KIIEEIQ--- ---------- ---------- ---------- ----------

           AETEQL
           AETEQL
           ------
           ------
           ------
           ------
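The interleaved PHYLIP layout used for this output (a header with the number of sequences and columns, a first block carrying 10-character names, then unnamed continuation blocks) can be read back with a few lines of code. This is a minimal sketch under those layout assumptions; `parse_interleaved_phylip` is a hypothetical helper, and the two toy sequences are invented for the illustration.

```python
def parse_interleaved_phylip(text):
    """Parse interleaved PHYLIP text into {name: aligned_sequence}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_seqs, n_cols = (int(x) for x in lines[0].split())
    names = []
    chunks = []
    for i, line in enumerate(lines[1:]):
        if i < n_seqs:
            # First block: the first 10 characters hold the sequence name.
            names.append(line[:10].strip())
            chunks.append([line[10:].replace(" ", "")])
        else:
            # Later blocks repeat the sequences in the same order, unnamed.
            chunks[i % n_seqs].append(line.replace(" ", ""))
    alignment = {n: "".join(c) for n, c in zip(names, chunks)}
    for name, seq in alignment.items():
        if len(seq) != n_cols:
            raise ValueError(f"{name}: expected {n_cols} columns, got {len(seq)}")
    return alignment


# Toy input (not the assignment data): 2 sequences, 14 alignment columns.
toy = "\n".join([
    "2 14",
    f"{'seqA':<10} MLISQRPTLS",
    f"{'seqB':<10} MQGS--EFLK",
    "",
    f"{'':<10} EDVL",
    f"{'':<10} PR-V",
])

alignment = parse_interleaved_phylip(toy)
for name, seq in alignment.items():
    print(name, seq)
```

The length check against the header is useful here because every row of a valid alignment must have exactly the declared number of columns (356 in the alignment above).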

1b) Calculate a distance matrix using BLOSUM62 and draw an NJ tree
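The two steps of 1b (pairwise distances, then neighbour joining) can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure used for the assignment: the distance function below is a simple p-distance (fraction of mismatches over ungapped columns) standing in for a BLOSUM62-based distance, and the 5-taxon matrix is a toy example with invented values. The names `p_distance` and `neighbor_joining` are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither row has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns shared by the two sequences")
    return sum(a != b for a, b in pairs) / len(pairs)


def neighbor_joining(names, dist):
    """Return the edges (node, node, branch_length) of an unrooted NJ tree.

    dist maps frozenset({x, y}) -> distance for every pair of names.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        totals = {x: sum(d[frozenset((x, y))] for y in nodes if y != x)
                  for x in nodes}
        # Pick the pair minimising Q(x, y) = (n - 2) d(x, y) - r(x) - r(y).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                x, y = nodes[i], nodes[j]
                q = (n - 2) * d[frozenset((x, y))] - totals[x] - totals[y]
                if best is None or q < best[0]:
                    best = (q, x, y)
        _, x, y = best
        dxy = d[frozenset((x, y))]
        len_x = 0.5 * dxy + (totals[x] - totals[y]) / (2 * (n - 2))
        len_y = dxy - len_x
        counter += 1
        new = f"node{counter}"
        edges.append((new, x, len_x))
        edges.append((new, y, len_y))
        # Distances from the new internal node to every remaining node.
        for z in nodes:
            if z not in (x, y):
                d[frozenset((new, z))] = 0.5 * (
                    d[frozenset((x, z))] + d[frozenset((y, z))] - dxy)
        nodes = [z for z in nodes if z not in (x, y)] + [new]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Toy distance matrix (invented values, not computed from the sequences above).
labels = ["a", "b", "c", "d", "e"]
rows = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy = {frozenset((labels[i], labels[j])): float(rows[i][j])
       for i in range(5) for j in range(i + 1, 5)}

edges = neighbor_joining(labels, toy)
for u, v, length in edges:
    print(f"{u} -- {v}: {length:g}")
```

To apply this to an alignment, one would fill the dictionary by calling `p_distance` on every pair of aligned rows; a substitution-matrix-based distance would change the numbers but not the joining procedure.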

1c) ALIGNMENT WITH DELETED SEQUENCES
PHYLIP output format of the multiple alignment using ClustalW2 after the deletion of half of the sequences.
6 214
acidophilu ---------- -MDVKIFKLS DKYIRFEIDG ITP----SQA NALRRTLIND
volcanium  ---------- -MEVKIFKLS DKYMSFEIDG ITP----SQA NALRRTLIND
falgidus   ---------M MPEIEILEEK DFKIKFILKN ASP----ALA NSFRRAMKAE
tuberculos -----MLISQ RPTLSEDVLT DNRSQFVIEP LEPGFGYTLG NSLRRTLLSS
leprae     -----MLISQ RPTLSEDILT DNRSQFVIEP LEPGFGYTLG NSLRRTLLSS
influenzae MQGSVTEFLK PRLVDIEQIS STHAKVILEP LERGFGHTLG NALRRILLSS

           IPKLAIENVT FHHGEIRDSE GNVYDSSLPL FDEMVAHRLG LIPLKTDLT-
           IPKLAIENVT FHHGEIRDAE GNVYDSSLPL FDEMVAHRLG LIPLKTDLS-
           VPAMAVDYV- ---------- -DIYLNSSYF YDEVIAHRLA MLPIKTYLDR
           IPGAAVTSIR IDGVLHEFTT VPGVKEDVTE IILNLKSLVV SSEEDEPVTM
           IPGAAVTSIR IDGVLHEFTT VPGVKEDVTE IILNLKGLVV SSEEDEPVTM
           MPGCAVTEVE IDGVLHEYSS KEGVQEDILE VLLNLKGLAV KVQNKDDVIL

           MNFRDQCSCG GKGCSLCTVT YSINKLGPST VFSSDLQAVS HPDLIPVDGE
           LNFRDQCSCG GKGCSLCTVT YSINKIGPAS VMSGDIQAIS HPDLVPVDPD
           FNMQSECSCG GEGCPNCQIS FRLNVEGPKV VYSGDFISD- DPDVVFAIDN
           YLRKQGPGEV TAGDIVPPAG VTVHNPGMHI ATLNDK-GKL EVELVVERGR
           YLRKQGPGEV TAGDIVPPAG VTLHNPGMRI ATLNDK-GKI EAELVVERGR
           TLNKSGIGPV VAADITYDGD VEIVNPDHVI CHLTDENASI SMRIRVQRGR

           IPIVKLGPKQ ---------- ---------- ---------- ----------
           IPIVKLGAK- ---------- ---------- ---------- ----------
           IPVLELFEGQ QLMLEAVARL GT-------- ---------- ----------
           GYVPAVQNRA SG--AEIGRI PVDSIYSPVL KVTYKVDATR VEQRTDFDKL
           GYVPAVQNRA LG--AEIGRI PVDSIYSPVL KVTYKVDATR VEQRTDFDKL
           GYVPASSRTH TQEERPIGRL LVDACYSPVE RIAYNVEAAR VEQRTDLDKL

           ---------- ----
           ---------- ----
           ---------- ----
           ILDVETKNSI SPRD
           ILDVETKSSI TP--
           VIELETN--- ----

Multiple alignment after exchanging parts of the sequences between A. fulgidus and H. influenzae, T. acidophilum and T. volcanium, and M. tuberculosis and M. leprae

ClustalW2 PHYLIP output format of the multiple alignment of the mixed sequences.
5 751 sp|O28002| sp|P66701| sp|Q9X798| sp|Q9HJD9| sp|Q97B93| ------------------MLISQRPTLS ------------------------------------EDILTDNRSQ ------------------------------------FVIEPLEPGF ------------------------------------GYTLGNSLRR ------------------------------------TLLSSIPGAA -------------------

---------- ---------- ---------- ---------- ------------------- ---------- ---------- ---------- ----------

VTSIRIDGVL HEFTTVPGVK EDVTEIILNL KGLVVSSEED EPVTMYLRKQ ---------- ---------- ---------- ---------- ------------------- ---------- ---------- ---------- ---------------------------GPGEVTAGDI ------------------------------------GFGYTLGNSL ------------------------------------NLKSLVVSSE ------------------LEEKDFKIKF NDKGKIEAEL NDKGKLEVEL FKLSDKYIRF KGCSLCTVTY ---------Y DATRVEQRTD DATRVEQRTD ------------------E --------MQ --------LA AEINDNA-DI ---------L ------VSYK ---------NC-------ILUSINFLUE ------------------------RLLV ----LKREGV ILEPLERGFG -------GIT -----EIVFT ---GALEPEE DSPDSFDPSE DILEVLLNLK ---NVYDSSL TLLYALNRLK ------------------VPPAGVTLHN ------------------------------------RRTLLSSIPG ------------------------------------EDEPVTMYLR ------------------ILKNASPALA VVERGRGYVP VVERGRGYVP EIDGITPSQA SINKLGPSTV VDIYLNSSYF FDKLILDVET FDKLILDVET VTFHHGEIRD IPIVKLGPKQ SECSCGGR-RELNVEAEGI RECTEDRNAP IPLKTDLT-YHREFHVSKK ------------------NZAEGNRPOA ------------------DACYSPVERI HTVGELVSRT HTLGNALRRI PSQANALRRT DDFGCNDLNV -----AIRRA -----VAGYD GLAVKVQNKD PLFDEMVAHR DRWDILVESL ------------------PGMLISQRPT ------------------------------------AAVTSIRIDG ------------------------------------KQGPGEVTAG ------------------NSFRRAMKAE AVQNRALGAE AVQNRASGAE NALRRTLIND FSSDLQAVSH YDEVIAHRLA KSSITPRDAL KNSISPRDAL SEGNVYDSSL AILITAEAIL ---GYVPASS -EIGPSPAEA OLYMERASES ---MEVKIFK DFEDWQKLKG ------------------PESVMQGSVT ------------------AYNVEAARVE ESDLLDIRNF LLSSMPGCAV LINDIPKLAI LFESDGVNMI ATILAEQLDA VTTGTWSTDG DVILTLNKSG LGLIPLKTDL SEITAEAILG ------------------LSEDVLTDNR ------------------------------------VLHEFTTVPG ------------------------------------DIVPPAGVTV ------------------M VPAMAVD--IGRIPVDSIY IGRIPVDSIY IPKLAIEN-PDLIPVDG-MLPIKTYLDR ASAGKTLVEL ASAGKTLVEL PLFDEMVAHR GTAKEHAKWQ RTHTQEERPI DHIASFALPI UBUNITALPH LSDKYMSFEI ACPKSVMSET ------------------EFLKPRLVDI ------------------QRTDLDKLVI GQKSIDEVKV TEVEIDGVLH ENVTFHHGEI EDDSRFIFQF FVDLRDVRQP AYDSQDYAET IGPVVAADIT SLNFRDQCSC 
TAKEHAKWQV ------------------SQFVIEPLEP ------------------------------------VKEDVTEIIL ---------------------MMPEIEI ----MRIATL HNPGMHIATL -----MDVKI NFRDQCSCGG ---------SPVLKVTYKV SPVLKVTYKV ------------------FN-------FG-------FGSPPRPOAH LG-------VTSG-----G--------DDLDLTVRSY AOSHAEMOPH D--------DT-------------------------EQISSTHAKV ------------------ELETN----KLHQLGLSLK EYSSKEGVQE RDAEG----ETDGSLTAKE E--------I EQLLARELNV YDGDVEIVNP G--------TSG-------

KEEKPEFXP- --------IL LRPVDDLELT VRSANCLKAE TIHYIGDLVQ EAEGIEIGPS PAEADHIASF ALPIDDLDLT VRSYNCLKRE GVHTVGELVA DHVICHLTDE NASISMRIRV QRGEGCPNCQ ISFRLNVEGP KVVYSGDFIS

---------- ---------- --GKGCSLCT VTYSINKIGP ASVMSGDIQA ---------- ---VAYKYHR EFEVNKKLFE DWAKIKERCP KSVLSEDENT RTEVELLKTP RTESDLLDIR DDPDVVFAID ISHPDLVPVD IVFTDDYGCN ---------ATGTWSTEGA IPEIVVNENC ---------NRLMDRWGIL ------------------EMNAISVNET ------------------Q NLGKKSLTEI NFGQKSIDEV NIPVLELFEG PDIPIVKLGA DLSILFESDG ---------YDEQDYAETE NGCGDCIEAC ---------VESLSE---------------------NNFLFTVEGT ------------------KDVLASRGLS KIKLHQLGLS QQLMLEAVAR KQAIL----VQIKEDDSRF ---------QL-------PRNVFEKDGD ------------------------------------GALPVREVMK ------------------LGM--RLENW LKD--SPPSF LGTGREHAKF ---------IFHFETDGSL ------------------KVRVKNVMAC ------------------------------------KALEILRSKA ------------------PPASIAED-DPSEVAGYDV QPVSVCVYKI ---------TAEETLSYAL ------------------SMCGECVEVC ------------------------------------EEMNKIIEEI -------------------
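The two sequence manipulations in 1c (deleting part of each sequence, and exchanging segments between pairs of sequences) can be made reproducible with small helpers. This is a minimal sketch; the cut points and segment boundaries actually used for the alignments above are not stated in the report, so the fraction and coordinates below are assumptions, and the function names are hypothetical.

```python
def keep_leading_fraction(seq, fraction=0.5):
    """Keep only the leading part of a sequence (assumed meaning of 'half')."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, int(len(seq) * fraction))
    return seq[:keep]


def swap_segments(seq_a, seq_b, start, end):
    """Exchange the slice [start:end) between two sequences."""
    if not 0 <= start <= end <= min(len(seq_a), len(seq_b)):
        raise ValueError("segment must lie inside both sequences")
    new_a = seq_a[:start] + seq_b[start:end] + seq_a[end:]
    new_b = seq_b[:start] + seq_a[start:end] + seq_b[end:]
    return new_a, new_b


def to_fasta(records, width=60):
    """Serialise {name: sequence} as FASTA text with fixed-width lines."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy sequences (first residues of two entries from Q1, shortened for display).
tuberculosis = "MLISQRPTLSEDVLTDNRSQ"
leprae = "MLISQRPTLSEDILTDNRSQ"

halved = keep_leading_fraction(tuberculosis)
mixed_a, mixed_b = swap_segments(tuberculosis, leprae, 10, 15)
print(to_fasta({"tuberculosis_half": halved, "mixed_a": mixed_a, "mixed_b": mixed_b}))
```

Writing only residues into the modified records matters: any header text pasted into the sequence body is treated by the aligner as residues and ends up in the alignment.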

2. Sulfolobus solfataricus gene for 16S ribosomal RNA
>gi|47609|emb|X03235.1| Sulfolobus solfataricus gene for 16S ribosomal RNA ATTCCGGTTGATCCTGCCGGACCCGACCGCTATCGGGGTAGGGATAAGCCATGGGAGTCTTACACTCCCG GGTAAGGGAGTGTGGCGGACGGCTGAGTAACACGTGGCTAACCTACCCTCGGGACGGGGATAACCCCGGG AAACTGGGGATAATCCCCGATAGGGAAGGAGTCCTGGAATGGTTCCTTCCCTAAAGGGCTATAGGCTATT TCCCGTTTGTAGCCGCCCGAGGATGGGGCTACGGCCCATCAGGCTGTCGGTGGGGTAAAGGCCCACCGAA CCTATAACGGGTAGGGGCCGTGGAAGCGGGAGCCTCCAGTTGGGCACTGAGACAAGGGCCCAGGCCCTAC GGGGCGCACCAGGCGCGAAACGTCCCCAATGCGCGAAAGCGTGAGGGCGCTACCCCGAGTGCCTCCGCAA

GGAGGCTTTTCCCCGCTCTAAAAAGGCGGGGGAATAAGCGGGGGGCAAGTCTGGTGTCAGCCGCCGCGGT AATACCAGCTCCGCGAGTGGTCGGGGTGATTACTGGGCCTAAAGCGCCTGTAGCCGGCCCACCAAGTCGC CCCTTAAAGTCCCCGGCTCAACCGGGGAACTGGGGGCGATACTGGTGGGCTAGGGGGCGGGAGAGGCGGG GGGTACTCCCGGAGTAGGGGCGAAATCCTTAGATACCGGGAGGACCACCAGTGGCGGAAGCGCCCCGCTA GAACGCGCCCGACGGTGAGAGGCGAAAGCCGGGGCAGCAAACGGGATTAGATACCCCGGTAGTCCCGGCT GTAAACGATGCGGGCTAGGTGTCGAGTAGGCTTAGAGCCTACTCGGTGCCGCAGGGAAGCCGTTAAGCCC GCCGCCTGGGGAGTACGGTCGCAAGACTGAAACTTAAAGGAATTGGCGGGGGAGCACCACAAGGGGTGGA ACCTGCGGCTCAATTGGAGTCAACGCCTGGAATCTTACCGGGGGAGACCGCAGTATGACGGCCAGGCTAA CGACCTTGCCTGACTCGCGGAGAGGAGGTGCATGGCCGTCGCCAGCTCGTGTTGTGAAATGTCCGGTTAA GTCCGGCAACGAGCGAGACCCCCACCCCTAGTTGGTATTCTGGACTCCGGTCCAGAACCACACTAGGGGG ACTGCCGGCGTAAGCCGGAGGAAGGAGGGGGCCACGGCAGGTCAGCATGCCCCGAAACTCCCGGGCCGCA CGCGGGTTACAATGGCAGGGACAACGGGATGCTACCTCGAAAGGGGGAGCCAATCCTTAAACCCTGCCGC AGTTGGGATCGAGGGCTGAAACCCGCCCTCGTGAACGAGGAATCCCTAGTAACCGCGGGTCAACAACCCG CGGTGAATACGTCCCTGCTCCTTGCACACACCGCCCGTCGCTCCACCCGAGCGCGAAAGGGGTGAGGTCC CTTGCGATAAGTGGGGGATCGAACTCCTTTCCCGCGAGGGGGGAGAAGTCGTAACAAGGTAGCCGTAGGG GAACCTGCGGCTGGATCACCTCATATATTTACTCCCCCGCTAATTGGGTGGGAGGGCTTCACTAAAACTC GTAATCTTCCCTTTTATAGATGCAGTTCTCCTCTTGGGCCAGAGGGGAATGAAGTGCCTAGGGCCCATTT GGCAGAGACATACAAATATGTCTCTGCCAAGTTAGGGCTCAATGAGGCTAGTACTAGGTAGCCACATTAT AGCCGTCTAGGAGTTCTACCCAGGGGCCGAAGCCTCCCGGTGGATGGCT

Thermococcus marinus 16S ribosomal RNA gene, partial sequence
>gi|19110964|gb|AF479012.1| Thermococcus marinus 16S ribosomal RNA gene, partial sequence ATTCCGGTTGATCCTGCCGGAGGCGCACTGCTATGGGGGTCCGACTAAGCCATGCGAGTCATGGGGCGCG CTCTGCGCGCACCGGCGGACGGCTCAGTAACACGTCGGTAACCTACCCTCGGGAGGGGGATAACCCCGGG AAACTGGGGCTAATCCCCCATAGGCCTGGGGTACTGGAAGGTCCCCAGGCCGAAAGGGGCTCTGCCCGCC CGAGGATGGGCCGGCGGCCGATTAGGTAGTTGGTGGGGTAACGGCCCACCAAGCCGAAGATCGGTACGGG CCATGAGAGTGGGAGCCCGGAGATGGACACTGAGACACGGGTCCAGGCCCTACGGGGCGCAGCAGGCGCG AAACCTCCGCAATGCGGGCAACCGCGACGGGGGGACCCCCAGTGCCGTGGCTCAGGCCACGGCTTTTCCG GAGTGTAAAAAGCTCCGGGAATAAGGGCTGGGCAAGGCCGGTGGCAGCCGCCGCGGTAATACCGGCGGCC CAAGTGGTGGCCGCTATTATTGGGCCTAAAGCGTCCGTAGCCGGGCCCGTAAGTCCCTGGCGAAATCCCA CGGCTCAACCGTGGGGCTTGCTGGGGATACTGCGGGCCTTGGGACCGGGAGAGGCCGGGGGTACCCCTGG GGTAGGGGTGAAATCCTATAATCCCGGGGGGACCGCCAGTGGCGAAGGCGCCCGGCTGGAACGGGTCCGA CGGTGAGGGACGAAGGCCAGGGGAGCGAACCGGATTAGATACCCGGGTAGTCCTGGCTGTAAAGGATGCG GGCTAGGTGTCGGGCGAGCTTCGAGCTCGGCCCGGTGCCGGAGGGCAAGCCGTTAAGCCCGCCGCCTGGG GAGTACGGCCGCAAGGCTGAAACTTAAAGGAATTGGCGGGGGAGCACTACAAGGGGTGGAGCGTGCGGTT TAATTGGATTCAACGCCGGGAACCTCACCGGGGGCGACGGCAGGATGAAGGCCAGGCTGAAGGTCTTGCC GGACACGCCGAGAGGAGGTGCATGGCCGCCGTCAGCTCGTACCGTGAGGCGTCCACTTAAGTGTGGTAAC GAGCGAGACCCGCGCCCCCAGTTGCCAAGTCCTCCCCGCTGGGGAGGAGGCACTCTGGGGGGACCACCGG CGATAAGCCGGAGGAAGGAGCGGGCGACGGTAGGTCAGTATGCCCCGAAACCCCCGGGCTACACGCGCGC TACAATGGGCGGGACAATGGGATCCGACCCCGAAAGGGGAAGGGAATCCCCTAAACCCGCCCTCAGTTCG GATCGCGGGCTGCAACTCGCCCGCGTGAAGCTGGAATCCCTAGTACCCGCGTGTCATCATCGCGCGGCGA ATACGTCCCTGCTCCTTGCACACACCGCCCGTCACTCCACCCGAGCGGGGTCTGGGTGAGGCCTGGTCTC CCTTCGGGGAGGCCGGGTCGTTCTGGGCTCCGTGAGGGGGAGAATCGTACAAGGTAGCGTAGGGAACTAC GTCSGAATCACTCTATCGCGGGA

D.radiodurans 16S rRNA gene
>gi|2108294|emb|Y11332.1| D.radiodurans 16S rRNA gene AGGGTGAACGCTGGCGGCGTGCTTAAGACATGCAAGTCGAACGCGGTCTTCGGACCGAGTGGCGCACGGG TGAGTAACACGTAACTGACCTACCCAGAAGTCACGAATAACTGGCCGAAAGGTCCGCTAATACGTGATGT GGTGATGCACCGTGGTGCATCACTAAAGATTTATCGCTTCTGGATGGGGTTGCGTTCCATCAGCTGGTTG GTGGGGTAAAGGCCTACCAAGGCGACGACGGATAGCCGGCCTGAGAGGGTGGCCGGCCACAGGGGCACTG AGACACGGGTCCCACTCCTACGGGAGGCAGCAGTTAGGAATCTTCCACAATGGGCGCAAGCCTGATGGAG CGACGCCGCGTGAGGGATGAAGGTTTTCGGATCGTAAACCTCTGAATCTGGGACGAAAGAGCCTTAGGGC AGATGACGGTACCAGAGTAATAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAA GCGTTACCCGGAATCACTGGGCGTAAAGGGCGTGTACGCGGAAATTTAAGTCTGGTTTTAAAGACCGGGG CTCAACCTCGGGGATGGACTGGATACTGGATTTCTTGACCTCTGGAGAGGTAACTGGAATTCCTGGTGTA GCGGTGGAATGCGTAGATACCAGGAGGAACACCAATGGCGAAGGCAAGTTACTGGACAGAAGGTGACGCT

GAGGCGCGAAAGTGTGGGGAGCAAACCGGATTAGATACCCGGGTAGTCCACACCCTAAACGATGTACGTT GGCTAAGCGCAGGATGCTGTGCTTGGCGAAGCTAACGCGATAAACGTACCGCCTGGGAAGTACGGCCGCA AGGTTGAAACTCAAAGGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAAC GCGAAGAACCTTACCAGGTCTTGACATGCTAGGAACTTTGCAGAGATGCAGAGGTGCCCTTCGGGGAACC TAGACACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCG CAACCCTTGCCTTTAGTTGTCAGCATTCAGTTGGACACTCTAGAGGGACTGCCTATGAAAGTAGGAGGAA GGCGGGGATGACGTCTAGTCAGCATGGTCCTTACGTCCTGGGCGACACACGTGCTACAATGGGTAGGACA ACGCGCAGCAAACCCGCGAGGGTAAGCGAATCGCTAAAACCTATCCCCAGTTCAGATCGGAGTCTGCAAC TCGACTCCGTGAAGTTGGAATCGCTAGTAATCGCGGGTCAGCATACCGCGGTGAATACGTTCCCGGGCCT TGTACACACCGCCCGTCACACCATGGGAGTAGATTGCAGTTGAAACCGCCGGGAGCTTTGCGGCAGGCGT CTAGACTGTGGTTTATGACTGGGGTGAAGTCGTAACAAGGTAACTGTACCGGAAGGTGCGGCTGGA

T.aquaticus (1) 16S ribosomal RNA, part
>gi|48084|emb|X58340.1| T.aquaticus (1) 16S ribosomal RNA, part NTGGTGGAGAGTTTGATCCTGGCTCAGGGTGAACGCTGGCGGCGTGCCTAAGACATGCAAGNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNGCTAATCCCCCATGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTNGCNNGCTTCCGGAT GGGCCCGCGTCCCATCAGCTAGTTGGTGGGGTAAAGGCCCACCAAGGCNACGACGGNNNGCCNGTCTGAG AGGATGGCCGGCCACAGGGGCACTGAGACACGGGCCCCACTCCTACGGGAGGCAGCAGTTAGGAATCTTC CGCAATGGGCGCAAGCCTGACGGAGCGACGCTNCNTGGAGGAGGAANNCCTTCGGGGTGTAAACTCCTGA ACCCGGGACGAAACCCCCGATTAGGGGACTGACGGTACCGGGGTAATAGCGCCGGCCAACTCCGTGCCAG CAGCCNCNGTAATACGGAGNGCNCNANCGTTACCCGGATTTACTGGGCGTAAANNNCGCGCAGGCGGCTT GGGNNGTCCCATGTGAAATNNCNCGGCTCAACCGGGGAGAGGCNNGGGATACGCTCAGGCTAGACGGTGG NNGAGGNNNGNNGAATTCCCGGAGTAGCGGTGAAATGCGCAGATACCGGGAGGAACGCCGATGGCGAAGG CAGCCACCTGGTCCACTCGTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACCGGATTAGATACCCGGGT AGTCCACGCCCTAAACGATGCGCGCTAGGTCTCTGGGTTATCTGNGGGGCCGAAGCTAACGGGTTAAGCG CGCCGCCTGGGGAGTACGCCNNCNNNNNNGAAACTCAAAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNGTTTAATTCGNNNNNACGCGAAGAACCTTACCAGGCCTTGACNTGCTAGGGAACCTGGGTGAA AGCCTGGGNTGCCCGCGAGGGGAGCCCTAGCACTGGTGCTGCATGGCCGTCGTCAGCTCGTGTCGTGAGA TGTTGGGTTAAGTCCCGCAACGAGCGCAACCCCTGCCGTTAGTTGCCAGCGGGTGAAGCCGGGCACTCTA ACGNNACTGCCTGCGAAAGCAGGAGGAAGGCGGGGACGACGTCTGGTCATCATNGCCCTTACGGCCTGGT CGACACACGTGCTACAATGCCCACTACAGAGCGAGTCGACCTGGCAACAGGGAGCGAATCGCAAAAAGGT GGGNGTAGTTCGGATTGGGGTCTGCAACCCGACCCCATGAAGCCGGAATCGCTAGTAATCGCGGATCAGC CATGCCGCGGTGAATACGTNCCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGTAACAAGNN NNNNNNNNNNNNNNNNNNNNNNNGATCACCTCCTTTCT

Bacillus subtilis 16S rRNA gene, strain DSM10
>gi|8980302|emb|AJ276351.1| Bacillus subtilis 16S rRNA gene, strain DSM10 CCTGGCTCAGGACGAACGCTGGCGGCGTGCCTAATACATGCAAGTCGAGCGGACAGATGGGAGCTTGCTC CCTGATGTTAGCGGCGGACGGGTGAGTAACACGTGGGTAACCTGCCTGTAAGACTGGGATAACTCCGGGA AACCGGGGCTAATACCGGATGGTTGTTTGAACCGCATGGTTCAAACATAAAAGGTGGCTTCGGCTACCAC TTACAGATGGACCCGCGGCGCATTAGCTAGTTGGTGAGGTAACGGCTCACCAAGGCAACGATGCGTAGCC GACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTAGG GAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGAGTGATGAAGGTTTTCGGATCGTAA AGCTCTGTTGTTAGGGAAGAACAAGTACCGTTCGAATAGGGCGGTACCTTGACGGTACCTAACCAGAAAG CCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGAATTATTGGGCG TAAAGGGCTCGCAGGCGGTTTCTTAAGTCTGATGTGAAAGCCCCCGGCTCAACCGGGGAGGGTCATTGGA AACTGGGGAACTTGAGTGCAGAAGAGGAGAGTGGAATTCCACGTGTAGCGGTGAAATGCGTAGAGATGTG GAGGAACACCAGTGGCGAAGGCGACTCTCTGGTCTGTAACTGACGCTGAGGAGCGAAAGCGTGGGGAGCG AACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCCCC TTAGTGCTGCAGCTAACGCATTAAGCACTCCGCCTGGGGAGTACGGTCGCAAGACTGAAACTCAAAGGAA TTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGGT CTTGACATCCTCTGACAATCCTAGAGATAGGACGTCCCCTTCGGGGGCAGAGTGACAGGTGGTGCATGGT TGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTGATCTTAGTTGCC AGCATTCAGTTGGGCACTCTAAGGTGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAATC

ATCATGCCCCTTATGACCTGGGCTACACACGTGCTACAATGGACAGAACAAAGGGCAGCGAAACCGCGAG GTTAAGCCAATCCCACAAATCTGTTCTCAGTTCGGATCGCAGTCTGCAACTCGACTGCGTGAAGCTGGAA TCGCTAGTAATCGCGGATCAGCATGCCGCGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACA CCACGAGAGTTTGTAACACCCGAAGTCGGTGAGGTAACCTTTTAGGAGCCAGCCGCCGAAGGTGGGACAG ATGATTGGGGTGAAGTCGTAACAAGGTAGCCGTATCGGAAGGTGCGG

CLUSTAL 2.0.12 multiple sequence alignment with numbers

Sulfolobus Thermococcus radiodurans Bacillus Thermus

-----ATTCCGGTTGATCCTGCCGGACCCG-ACCGCTATCGGGGTAGGGATAAGCCATGG -----ATTCCGGTTGATCCTGCCGGAGGCGCACTGCTATGGGGGTCCGACTAAGCCATGC -------------------------AGGGTGAACGCTGGCGGCGT--GCTTAAGACATGC -----------------CCTGGCTCAGGACGAACGCTGGCGGCGT--GCCTAATACATGC NTGGTGGAGAGTTTGATCCTGGCTCAGGGTGAACGCTGGCGGCGT--GCCTAAGACATGC * * *** ** ** * *** **** GAGTCTTACAC------------TCCCGGGTAAGGGAGTGTGGCGGACGGCTGAGTAACA GAGTCATGGGG------------CGCGCTCTGCGCGCAC-CGGCGGACGGCTCAGTAACA AAGTCGAACG--------------CGGTCTTCGGACCGAGTGGCGCACGGGTGAGTAACA AAGTCGAGCGGACAGATGGGAGCTTGCTCCCTGATGTTAGCGGCGGACGGGTGAGTAACA AAGNNNNNNN--------------NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN ** CGTGGCTAACCTACCCTCGGGACGGGGATAACCCCGGGAAACTGGGGATAATCCCCGATA CGTCGGTAACCTACCCTCGGGAGGGGGATAACCCCGGGAAACTGGGGCTAATCCCCCATA CGTAACTGACCTACCCAGAAGTCACGAATAACTGGCCGAAAGGTCCGCTAATACGTGATG CGTGGGTAACCTGCCTGTAAGACTGGGATAACTCCGGGAAACCGGGGCTAATACCGGATG NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCTAATCCCCCATG * **** * ** GGGAAGGAGTCCTGGAATGGTTCCTTCCCTAAAGGGCTATAGGCTATTTCCCGTTTGTAG GGCCTGGGGTACTGGAAGGTCCCCAGGCCGAAAGGGG-----------------CTCTGC ----TGGTGATGCACCGTGGTGCAT-CACTAAAGAT---------------------TTA G--TTGTTTGAACCGCATGGTTCAAACATAAAAGGTGGCT-------------TCGGCTA -----NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNT---------------------NGC

54 55 33 41 58

Sulfolobus Thermococcus radiodurans Bacillus Thermus

102 102 79 101 104

Sulfolobus Thermococcus radiodurans Bacillus Thermus

162 162 139 161 164

Sulfolobus Thermococcus radiodurans Bacillus Thermus

222 205 173 206 198

Sulfolobus Thermococcus radiodurans Bacillus Thermus

CCGCCCGAGGATGGGGCTACGGCCCATCAGGCTGTCGGTGGGGTAAAGGCCCACCGAACC CCGCCCGAGGATGGGCCGGCGGCCGATTAGGTAGTTGGTGGGGTAACGGCCCACCAAGCC TCGCTTCTGGATGGGGTTGCGTTCCATCAGCTGGTTGGTGGGGTAAAGGCCTACCAAGGC CCACTTACAGATGGACCCGCGGCGCATTAGCTAGTTGGTGAGGTAACGGCTCACCAAGGC NNGCTTCCGGATGGGCCCGCGTCCCATCAGCTAGTTGGTGGGGTAAAGGCCCACCAAGGC * ***** ** ** ** ** **** ***** *** *** * * TATAACGGGTAGGGGCCGTGGAAGCGGGAGCCTCCAGTTGGGCACTGAGACAAGGGCCCA GAAGATCGGTACGGGCCATGAGAGTGGGAGCCCGGAGATGGACACTGAGACACGGGTCCA GACGACGGATAGCCGGCCTGAGAGGGTGGCCGGCCACAGGGGCACTGAGACACGGGTCCC AACGATGCGTAGCCGACCTGAGAGGGTGATCGGCCACACTGGGACTGAGACACGGCCCAG NACGACGGNNNGCCNGTCTGAGAGGATGGCCGGCCACAGGGGCACTGAGACACGGGCCCC * * ** ** * * * * ********* ** * GGCCCTACGGGGCGCACCAGGCGCGAAACGTCCCCAATGCGCGAAAGCGTGAGGGCGCTA GGCCCTACGGGGCGCAGCAGGCGCGAAACCTCCGCAATGCGGGCAACCGCGACGGGGGGA ACTCCTACGGGAGGCAGCAGTTAGGAATCTTCCACAATGGGCGCAAGCCTGATGGAGCGA ACTCCTACGGGAGGCAGCAGTAGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAA ACTCCTACGGGAGGCAGCAGTTAGGAATCTTCCGCAATGGGCGCAAGCCTGACGGAGCGA ******** *** *** *** * *** ***** * ** ** ** * *

282 265 233 266 258

Sulfolobus Thermococcus radiodurans Bacillus Thermus

342 325 293 326 318

Sulfolobus Thermococcus radiodurans Bacillus Thermus

402 385 353 386 378

Sulfolobus Thermococcus radiodurans

CCCCGAGTGCCTCCGCAAGGA----GGCTTTT----CCCCGC--------TCTAAAAAGG 446 CCCCCAGTGCCGTGGCTCAGGCCACGGCTTTT----CCGGAG--------TGTAAAAAGC 433 CGCCGCGTGAGGGATGAAGGTTTTCGGATCGTAAACCTCTGA-ATCTGGGACGAAAGAGC 412

Bacillus Thermus

CGCCGCGTGAGTGATGAAGGTTTTCGGATCGTAAAGCTCTGTTGTTAGGGAAGAACAAGT 446 CGCTNCNTGGAGGAGGAANNCCTTCGGGGTGTAAACTCCTGA----ACCCGGGACGAAAC 434 * * ** ** * * * CG----------GGGGAAT---------------------AAGCGGGGGGCAAGTCTGGT TC----------CGGGAAT---------------------AAGGGCTGGGCAAGGCCGGT CT--------TAGGGCAGA----TGACGGTACCAGAGTA-ATAGCACCGGCTAACTCCGT ACCGTTCGAATAGGGCGGTACCTTGACGGTACCTAACCAGAAAGCCACGGCTAACTACGT CCC---CGATTAGGGGACT-----GACGGTACCGGGGTA-ATAGCGCCGGCCAACTCCGT ** * *** * ** GTCAGCCGCCGCGGTAATACCAGCTCCGCGAGTGGTCGGGGTGATTACTGGGCCTAAAGC GGCAGCCGCCGCGGTAATACCGGCGGCCCAAGTGGTGGCCGCTATTATTGGGCCTAAAGC GCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTACCCGGAATCACTGGGCGTAAAGG GCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGG GCCAGCAGCCNCNGTAATACGGAGNGCNCNANCGTTACCCGGATTTACTGGGCGTAAANN * **** *** * ******* * * * * * * * ***** **** GCCTGTAGCCGGCCCACCAAGTCGCCCCTTAAAGTCCCCGGCTCAACCGGGGAACTGGGGTCCGTAGCCGGGCCCGTAAGTCCCTGGCGAAATCCCACGGCTCAACCGTGGGGCTTGCT GCGTGTACGCGGAAATTTAAGTCTGGTTTTAAAGACCGGGGCTCAACCTCGGGGATGGAC GCTCGCAGGCGGTTTCTTAAGTCTGATGTGAAAGCCCCCGGCTCAACCGGGGAGGGTCAT NCGCGCAGGCGGCTTGGGNNGTCCCATGTGAAATNNCNCGGCTCAACCGGGGAGAGGCNN * * *** *** *** * ********* ** GGCGATACTGGTGGGCTAGGGGGCGGGAGAGGCGGGGGGTACTCCCGGAGTAGGGGCGAA GGGGATACTGCGGGCCTTGGGACCGGGAGAGGCCGGGGGTACCCCTGGGGTAGGGGTGAA TG-GATACTGGATTTCTTGACCTCTGGAGAGGTAACTGGAATTCCTGGTGTAGCGGTGGA TG-GAAACTGGGGAACTTGAGTGCAGAAGAGGAGAGTGGAATTCCACGTGTAGCGGTGAA GG-GATACGCTCAGGCTAGACGGTGGNNGAGGNNNGNNGAATTCCCGGAGTAGCGGTGAA * ** ** ** * * **** * * ** * **** ** * * ATCCTTAGATACCGGGAGGACCACCAGTGGCGGAAGCGCCCCGCTAGAACGCGCCCGACG ATCCTATAATCCCGGGGGGACCGCCAGTGGCGAAGGCGCCCGGCTGGAACGGGTCCGACG ATGCGTAGATACCAGGAGGAACACCAATGGCGAAGGCAAGTTACTGGACAGAAGGTGACG ATGCGTAGAGATGTGGAGGAACACCAGTGGCGAAGGCGACTCTCTGGTCTGTAACTGACG ATGCGCAGATACCGGGAGGAACGCCGATGGCGAAGGCAGCCACCTGGTCCACTCGTGACG ** * * ** *** * ** ***** * ** ** * **** GTGAGAGGCGAAAGCCGGGGCAGCAAACGGGATTAGATACCCCGGTAGTCCCGGCTGTAA GTGAGGGACGAAGGCCAGGGGAGCGAACCGGATTAGATACCCGGGTAGTCCTGGCTGTAA 
CTGAGGCGCGAAAGTGTGGGGAGCAAACCGGATTAGATACCCGGGTAGTCCACACCCTAA CTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTAGATACCCTGGTAGTCCACGCCGTAA CTGAGGCGCGAAAGCGTGGGGAGCAAACCGGATTAGATACCCGGGTAGTCCACGCCCTAA **** **** * *** *** *** ************* ******** * *** ACGATGCGGGCTAGGTGTCGAGTAGGCTTAGAGCCTA-CTCGGTGCCGCAGGG-AAGCCG AGGATGCGGGCTAGGTGTCGGGCGAGCTTCGAGCTCGGCCCGGTGCCGGAGGGCAAGCCG ACGATGTACGTTGGCTAAGCGCAGGATGCTGTGCTT--------GGCGAAGCT-AACGCG ACGATGAGTGCTAAGTGTTAGGGGGTTTCCGCCCCTT----AGTGCTGCAGCT-AACGCA ACGATGCGCGCTAGGTCTCTGGGTTATCTGNGGG----------GCCGAAGCT-AACGGG * **** * * * * * ** ** TTAAGCCCGCCGCCTGGGGAGTACGGTCGCAAGACTGAAACTTAAAGGAATTGGCGGGGG TTAAGCCCGCCGCCTGGGGAGTACGGCCGCAAGGCTGAAACTTAAAGGAATTGGCGGGGG ATAAACGTACCGCCTGGGAAGTACGGCCGCAAGGTTGAAACTCAAAGGAATTGACGGGGG TTAAGCACTCCGCCTGGGGAGTACGGTCGCAAGACTGAAACTCAAAGGAATTGACGGGGG TTAAGCGCGCCGCCTGGGGAGTACGCCNNCNNNNNNGAAACTCAAAGNNNNNNNNNNNNN *** * ********* ****** * ****** **** 475 462 459 506 485

Sulfolobus Thermococcus radiodurans Bacillus Thermus

Sulfolobus Thermococcus radiodurans Bacillus Thermus

535 522 519 566 545

Sulfolobus Thermococcus radiodurans Bacillus Thermus

594 582 579 626 605

Sulfolobus Thermococcus radiodurans Bacillus Thermus

654 642 638 685 664

Sulfolobus Thermococcus radiodurans Bacillus Thermus

714 702 698 745 724

Sulfolobus Thermococcus radiodurans Bacillus Thermus

774 762 758 805 784

Sulfolobus Thermococcus radiodurans Bacillus Thermus

832 822 809 860 833

Sulfolobus Thermococcus radiodurans Bacillus Thermus

892 882 869 920 893

Sulfolobus

AGCACCACAAGGGGTGGAACCTGCGGCTCAATTGGAGTCAACGCCTGGAATCTTACCGGG 952

Thermococcus radiodurans Bacillus Thermus

AGCACTACAAGGGGTGGAGCGTGCGGTTTAATTGGATTCAACGCCGGGAACCTCACCGGG CCCGC-ACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGG CCCGC-ACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGG NNNNN-NNNNNNNNNNNNNNNNNNNGTTTAATTCGNNNNNACGCGAAGAACCTTACCAGG * * **** * **** *** ** *** ** G---GAGACCGCAGT--ATGACGGCCAGGCTAACGA------CCTTGCCTGACTCGCGGA G---GCGACGGCAGG--ATGAAGGCCAGGCTGAAGG------TCTTGCCGGACACGCCGA TCTTGACATGCTAGG--AACTTTGCAGAGATGCAGAGGTGCCCTTCGGGGAACCTAGACA TCTTGACATCCTCTG--ACAATCCTAGAGATA-GGACGTCCCCTTCGGGGG-CAGAGTGA CCTTGACNTGCTAGGGAACCTGGGTGAAAGCCTGGGNTGCCCGCGAGGGGAGCCCTAGCA * * * * * * GAGGAGGTGCATGGCCGTCGCCAGCTCGTGTTGTGAAATGTCCGGTTAAGTCCGGCAACG GAGGAGGTGCATGGCCGCCGTCAGCTCGTACCGTGAGGCGTCCACTTAAGTGTGGTAACG CAGGTGCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACG CAGGTGGTGCATGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACG CTGGTGCTGCATGGCCGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACG ** * ******* * ** ******** **** ** ****** * **** AGCGAGACCCCCACCCCTAGTTGGTA--TTCTGGACTCCGGTCCAGAACCACACTAGGGG AGCGAGACCCGCGCCCCCAGTTGCCAAGTCCTCCCCGCTGGGGAGGAGGCACTCTGGGGG AGCGCAACCCTTGCCTTTAGTTGTCA-------GCATTCAGTTGGA---CACTCTAGAGG AGCGCAACCCTTGATCTTAGTTGCCA-------GCATTCAGTTGGG---CACTCTAAGGT AGCGCAACCCCTGCCGTTAGTTGCCAG-----CGGGTGAAGCCGGG---CACTCTAACGN **** **** ***** * * *** ** * GACTGCCGGCG-TAAGCCGGAGGAAGGAGGGGGCCACGGCAGGTCAGCATGCCCCGAAAC GACCACCGGCGATAAGCCGGAGGAAGGAGCGGGCGACGGTAGGTCAGTATGCCCCGAAAC GACTGCCTATGA-AAGTAGGAGGAAGGCGGGGATGACGTCTAGTCAGCATGGTCCTTACG GACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAATCATCATGCCCCTTATG NACTGCCTGCGA-AAGCAGGAGGAAGGCGGGGACGACGTCTGGTCATCATNGCCCTTACG ** ** * ** ********* * ** *** *** ** ** * TCCCGGGCCGCACGCGGGTTACAATGGCAGGGACAACGGGATGCTACCTCGAAAGGGGGA CCCCGGGCTACACGCGCGCTACAATGGGCGGGACAATGGGATCCGACCCCGAAAGGGGAA TCCTGGGCGACACACGTGCTACAATGGGTAGGACAACGCGCAGCAAACCCGCGAGGGTAA ACCTGGGCTACACACGTGCTACAATGGACAGAACAAAGGGCAGCGAAACCGCGAGGTTAA GCCTGGTCGACACACGTGCTACAATGCCCACTACAGAGCGAGTCGACCTGGCAACAGGGA ** ** * *** ** * ******* *** * * * * * * * 
GCCAATCCT-TAAACCCTGCCGCAGTTGGGATCGAGGGCTGAAACCCGCCCTCGTGAACG GGGAATCCCCTAAACCCGCCCTCAGTTCGGATCGCGGGCTGCAACTCGCCCGCGTGAAGC GCGAATCGCTAAAACCTATCCCCAGTTCAGATCGGAGTCTGCAACTCGACTCCGTGAAGT GCCAATCCCACAAATCTGTTCTCAGTTCGGATCGCAGTCTGCAACTCGACTGCGTGAAGC GCGAATCGCAAAAAGGTGGGNGTAGTTCGGATTGGGGTCTGCAACCCGACCCCATGAAGC * **** *** **** *** * * *** *** ** * * **** AGGAATCCCTAGTAACCGCGGGTCAAC-AACCCGCGGTGAATACGTCCCTGCTCCTTGCA TGGAATCCCTAGTACCCGCGTGTCATC-ATCGCGCGGCGAATACGTCCCTGCTCCTTGCA TGGAATCGCTAGTAATCGCGGGTCAGC-ATACCGCGGTGAATACGTTCCCGGGCCTTGTA TGGAATCGCTAGTAATCGCGGATCAGC-ATGCCGCGGTGAATACGTTCCCGGGCCTTGTA CGGAATCGCTAGTAATCGCGGATCAGCCATGCCGCGGTGAATACGTNCCNNNNNNNNNNN ****** ****** **** *** * * ***** ******** ** CACACCGCCCGTCGCTCCACCCGAGCGCGAAAGGGGTGAGGTC--CCTTGCGATAAGTGG CACACCGCCCGTCACTCCACCCGAGCGGGGTCTGGGTGAGGCC--TGGTCTCCCTTCGGG CACACCGCCCGTCACACCATGGGAGTAGATTGCAGTTGAAACCGCCGGG--AGCTTTGCG CACACCGCCCGTCACACCACGAGAGTTTGTAACACCCGAAGTCGGTGAGGTAACCTTTTA NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN---NNNNNNNNNNNNNN

942 928 979 952

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1001 991 986 1035 1012

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1061 1051 1046 1095 1072

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1119 1111 1096 1145 1124

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1178 1171 1155 1205 1183

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1238 1231 1215 1265 1243

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1297 1291 1275 1325 1303

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1356 1350 1334 1384 1363

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1414 1408 1392 1444 1420

Sulfolobus Thermococcus radiodurans Bacillus Thermus

GGGATCGAACTCCT---TTC--CCGCGAGG--GGGGAGAAGTCGTAACAAGGTAGCCGTA GAGGCCGGGTCGTT---CTGGGCTCCGTGA--GGGGGAGAATCGTA-CAAGGTAGC-GTA G----CAGGCGTCTAGACTGTGGTTTATGACTGGGGTGAAGTCGTAACAAGGTAACTGTA GGAGCCAGCCGCCGAAGGTGGGACAGATGATTGGGGTGAAGTCGTAACAAGGTAGCCGTA NNNNNNNNNNNNNN---NNNNNNNNNNNNNNNNNNNNNNNNNNGTAACAAGNNNNNNNNN *** **** GGGGAACCTGCGGCTGGATCACCTCATATATTTACTCCCCCGCTAATTGGGTGGGAGGGC GGGAA--CTACGTCSGAATCACTCTATCGCGGGA-------------------------CCGGAAGGTGCGGCTGGA-----------------------------------------TCGGAAGGTGCGG----------------------------------------------NNNNNNNNNNNNNNNNGATCACCTCCTTTCT-----------------------------

1467 1461 1448 1504 1477

Sulfolobus Thermococcus radiodurans Bacillus Thermus

1527 1493 1466 1517 1508

Sulfolobus Thermococcus radiodurans Bacillus Thermus

TTCACTAAAACTCGTAATCTTCCCTTTTATAGATGCAGTTCTCCTCTTGGGCCAGAGGGG 1587 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sulfolobus Thermococcus radiodurans Bacillus Thermus

AATGAAGTGCCTAGGGCCCATTTGGCAGAGACATACAAATATGTCTCTGCCAAGTTAGGG 1647 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sulfolobus Thermococcus radiodurans Bacillus Thermus

CTCAATGAGGCTAGTACTAGGTAGCCACATTATAGCCGTCTAGGAGTTCTACCCAGGGGC 1707 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sulfolobus Thermococcus radiodurans Bacillus Thermus

CGAAGCCTCCCGGTGGATGGCT 1729 -------------------------------------------------------------------------------------
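In the CLUSTAL output above, the line of `*` characters under each block marks columns where all five sequences carry the same base. A minimal sketch of how such a line can be derived from one block of aligned rows is given below; `conservation_line` is a hypothetical helper and the three toy rows are invented for the illustration.

```python
def conservation_line(rows):
    """Return a string with '*' under fully conserved, ungapped columns."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all rows of an alignment block must have equal length")
    marks = []
    for column in zip(*rows):
        # A column is conserved when every row has the same non-gap character.
        conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Toy block (not taken from the alignment above).
block = ["ACG-T", "ACGTT", "ATG-T"]
stars = conservation_line(block)
for row in block:
    print(row)
print(stars)
```

Columns in which one sequence has an ambiguity code such as `N` are not marked, which is why the long runs of `N` in the T. aquaticus entry reduce the number of starred columns.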

2b)

[Tree figure; taxa shown: D. radiodurans, Bacillus subtilis, Thermus aquaticus, Sulfolobus solfataricus, Thermococcus marinus]

(The evolutionary distances, computed using the Maximum Composite Likelihood method, appeared to decrease between D. radiodurans and B. subtilis compared with their positions in the original NJ tree above.)
[Tree figure; taxa shown: D. radiodurans, Bacillus subtilis, Thermus aquaticus, Sulfolobus solfataricus, Thermococcus marinus]

[Tree figure; taxa shown: Sulfolobus solfataricus, Thermococcus marinus, Thermus aquaticus, D. radiodurans, Bacillus subtilis]

[Tree figure; taxa shown: Sulfolobus solfataricus, Thermococcus marinus, Thermus aquaticus, D. radiodurans, Bacillus subtilis]
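The trees in 2b rest on pairwise evolutionary distances between the aligned 16S sequences. The Maximum Composite Likelihood method mentioned above is not reproduced here; as a simpler stand-in, the sketch below computes the Jukes-Cantor corrected distance, d = -(3/4) ln(1 - 4p/3), where p is the fraction of mismatched sites. Skipping gaps and ambiguous bases such as `N` is an assumption of this sketch, and `jukes_cantor` is a hypothetical helper name.

```python
import math


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned nucleotide sequences.

    Columns containing a gap or an ambiguous base (anything other than
    A, C, G, T) in either sequence are ignored.
    """
    valid = set("ACGT")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in valid and b in valid]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


# Toy aligned fragments (invented, not taken from the alignment above).
d = jukes_cantor("ACGTACGT", "ACGTACGA")
print(round(d, 4))
```

Feeding such distances for all ten pairs of the five 16S sequences into a neighbour-joining routine would give a tree comparable in kind, though not necessarily in branch lengths, to the ones shown above.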
