
INTRODUCTION TO BIOINFORMATICS

Lab Activity

SUBMITTED BY:
NA:REEMA AMIN (FA19-BSO-058)
TAJIDA HASSAN (FA19-BSO-080)
SABAHAT FAROOQ (FA19-BSO-097)

SECTION: B
SUBMITTED TO:
Dr. Hassaan Mehboob Awan
1. Download the protein sequences of the COL1A1, COL3A1, COL5A1 and COL4A2 genes for Human,
Chimpanzee, Mouse, Chicken, Zebrafish and Drosophila.
Give the scientific names of these species. Use the NCBI genome browser to download the
protein sequences.
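
Once the sequences are downloaded, the FASTA headers can be read programmatically to recover the accession and the scientific name of each species. The following is a minimal sketch, assuming every header follows the layout seen in this report ("accession description [Genus species]"); the names `FastaRecord`, `parse_fasta` and `SAMPLE` are hypothetical, and the sample sequences are truncated to 20 residues for brevity.

```python
import re
from typing import NamedTuple


class FastaRecord(NamedTuple):
    accession: str
    description: str
    species: str
    sequence: str


# Assumed header layout (without the leading '>'):
# "ACCESSION free-text description [Genus species]"
HEADER = re.compile(r"^(\S+)\s+(.*?)\s*\[([^\]]+)\]\s*$")


def parse_fasta(text: str) -> list[FastaRecord]:
    """Split multi-FASTA text into records with accession, description and species."""
    records = []
    for block in text.split(">")[1:]:
        lines = block.strip().splitlines()
        match = HEADER.match(lines[0].strip())
        if match is None:
            raise ValueError(f"unexpected header: {lines[0]!r}")
        # Sequence lines are wrapped in the download, so join them back together.
        sequence = "".join(line.strip() for line in lines[1:])
        records.append(FastaRecord(*match.groups(), sequence))
    return records


# Two headers from this report; sequences truncated to 20 residues for brevity.
SAMPLE = """>NP_000079.2 collagen alpha-1(I) chain preproprotein [Homo sapiens]
MFSFVDLRLL
LLLAATALLT
>NP_031768.2 collagen alpha-1(I) chain preproprotein [Mus musculus]
MFSFVDLRLL
LLLGATALLT
"""

if __name__ == "__main__":
    for rec in parse_fasta(SAMPLE):
        print(rec.accession, rec.species, len(rec.sequence))
```

The species field extracted this way gives the scientific name directly, so the list of names can be cross-checked against the downloaded files.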

COL1A1

Human:
>NP_000079.2 collagen alpha-1(I) chain preproprotein [Homo sapiens]
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRICVCDNGKVLC
DDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGL
PGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGE
PGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDA
GPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG
AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGP
GSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAG
QDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGE
RGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN
GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIG
PPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPR
GETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSG
EPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA
PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPP
GPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPG
PPSAGFDFSFLPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESM
TDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRA
EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL

Chimpanzee:
>XP_001169409.1 collagen alpha-1(I) chain isoform X1 [Pan troglodytes]
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRICVCDNGKVLC
DDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGL
PGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGE
PGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDA
GPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG
AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGP
GSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAG
QDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGE
RGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN
GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIG
PPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPR
GETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSG
EPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPAGPPGAPGA
PGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPP
GPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPG
PPSAGFDFSFLPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR
DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESM
TDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRA
EGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL

Mouse:
>NP_031768.2 collagen alpha-1(I) chain preproprotein [Mus musculus]
MFSFVDLRLLLLLGATALLTHGQEDIPEVSCIHNGLRVPNGETWKPEVCLICICHNGTAVCDDVQCNEEL
DCPNPQRREGECCAFCPEEYVSPNSEDVGVEGPKGDPGPQGPRGPVGPPGRDGIPGQPGLPGPPGPPGPP
GPPGLGGNFASQMSYGYDEKSAGVSVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGGSGPMGPRG
PPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGS
PGENGAPGQMGPRGLPGERGRPGPPGTAGARGNDGAVGAAGPPGPTGPTGPPGFPGAVGAKGEAGPQGAR
GSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGPQGPSGPPGPKG
NSGEPGAPGNKGDTGAKGEPGATGVQGPPGPAGEEGKRGARGEPGPSGLPGPPGERGGPGSRGFPGADGV
AGPKGPSGERGAPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPAGPP
GARGQAGVMGFPGPKGTAGEPGKAGERGLPGPPGAVGPAGKDGEAGAQGAPGPAGPAGERGEQGPAGSPG
FQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGNNGAPGNDGAKGD
TGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGARGLTGPIGPPGPAGAPGDK
GEAGPSGPPGPTGARGAPGDRGEAGPPGPAGFAGPPGADGQPGAKGEPGDTGVKGDAGPPGPAGPAGPPG
PIGNVGAPGPKGPRGAAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPVGKEGGKGPRGETGPAGRPGE
VGPPGPPGPAGEKGSPGADGPAGSPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGSS
GERGPPGPMGPPGLAGPPGESGREGSPGAEGSPGRDGAPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAG
KNGDRGETGPAGPAGPIGPAGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGSPGSPGEQGP
SGASGPAGPRGPPGSAGSPGKDGLNGLPGPIGPPGPRGRTGDSGPAGPPGPPGPPGPPGPPSGGYDFSFL
PQPPQEKSQDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWKS
GEYWIDPNQGCNLDAIKVYCNMETGQTCVFPTQPSVPQKNWYISPNPKEKKHVWFGESMTDGFPFEYGSE
GSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIELRGEGNSRFTYSTL
VDGCTSHTGTWGKTVIEYKTTKTSRLPIIDVAPLDIGAPDQEFGLDIGPACFV

Chicken:
>XP_024999899.1 collagen alpha-1(I) chain isoform X1 [Gallus gallus]
MFSFVDSRLLLLIAATVLLTRGQGEEDIQTGSCVQDGLTYNDKDVWKPEPCQICVCDSGNILCDEVICED
TSDCPNAEIPFGECCPICPDVDASPVYPESAGVEGPKGDTGPRGDRGLPGPPGRDGIPGQPGLPGPPGPP
GPPGLGGNFAPQMSYGYDEKSAGVAVPGPMGPAGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRG
PAGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGEPGPAGPKGEPGS
PGENGAPGQMGPRGLPGERGRPGPSGPAGARGNDGAPGAAGPPGPTGPAGPPGFPGAAGAKGETGPQGAR
GSEGPQGARGEPGPPGPAGAAGPAGNPGADGQPGAKGATGAPGIAGAPGFPGARGPSGPQGPSGAPGPKG
NSGEPGAPGNKGDTGAKGEPGPAGVQGPPGPAGEEGKRGARGEPGPAGLPGPAGERGAPGSRGFPGADGI
AGPKGPPGERGSPGAVGPKGSPGEAGRPGEPGLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPP
GARGQAGVMGFPGPKGAAGEPGKPGERGAPGPPGAVGAAGKDGEAGAQGPPGPTGPAGERGEQGPAGAPG
FQGLPGPAGPPGEAGKPGEQGVPGDAGAPGPAGARGERGFPGERGVQGPPGPQGPRGANGAPGNDGAKGD
AGAPGAPGNQGPPGLQGMPGERGAAGLPGAKGDRGDPGPKGADGAPGKDGLRGLTGPIGPPGPAGAPGDK
GEAGPPGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGETGDAGAKGDAGPPGPAGPTGAPG
PAGAVGAPGPKGARGSAGPPGATGFPGAAGRVGPPGPSGNIGLPGPPGPSGKEGGKGPRGETGPAGRPGE
PGPAGPPGPPGEKGSPGADGPIGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGSP
GERGPPGPMGPPGLAGPPGEAGREGAPGAEGAPGRDGAAGPKGDRGETGPAGPPGAPGAPGAPGPVGPAG
KNGDRGETGPAGPAGPPGPAGARGPAGPQGPRGDKGETGEQGDRGMKGHRGFSGLQGPPGPPGAPGEQGP
SGASGPAGPRGPPGSAGAAGKDGLNGLPGPIGPPGPRGRTGEVGPVGPPGPPGPPGPPGPPSGGFDFSFL
PQPPQEKAHDGGRYYRADDANVMRDRDLEVDTTLKSLSQQIENIRSPEGTRKNPARTCRDLKMCHGDWKS
GEYWIDPNQGCNLDAIKVYCNMETGETCVYPTQATIAQKNWYLSKNPKEKKHVWFGETMSDGFQFEYGGE
GSNPADVAIQLTFLRLMSTEATQNVTYHCKNSVAYMDHDTGNLKKALLLQGANEIEIRAEGNSRFTYGVT
EDGCTSHTGAWGKTVIEYKTTKTSRLPIIDLAPMDVGAPDQEFGIDIGPVCFL

Zebrafish:
>NP_954684.1 collagen, type I, alpha 1a precursor [Danio rerio]
MFSFVDILLALLLNATVLLARGQGEDDRTGGSCTLDGQVYNDRDVWKPEPCQICVCDSGTVMCDEVICED
TSDCPNPVIAHDECCPVCPDDDFQEPSVEGPRGSPGDKGERGPANPPGNDGIHEQSVLPVPTSHSGPAAL
GGNLSPQMSGGFDEKSSPMAVPGPMGPMGPRGAPGPPGPSGPQGFTGPPGEPGEAGAPGPMGPRGAAGPP
GKNGEDGESGKPGRPGERGPPGPQGARGFPGTPGLPGIKGHRGFSGLDGAKGDAGPAGPKGEPGAPGENG
TPGAMGPRGLPGERGRAGPPGAAGARGNDGAAGAAGPPGPTGPAGPPGFPGGPGSKGEVGPQGSRGAEGP
QGARGEAGNPGPAGPAGPAGNNGADGAPGAKGAPGAPGIAGAPGFPGPRGPPGAAGAAGAPGPKGNTGEA
GAPGAKGEAGAKGEAGAQGVQGPPGPPGEEGKRGPRGEPGAGGARGPTGERGAPGARGFPGADGAAGPRG
APGERGGPGVVGPKGATGEPGRNGEPGMPGSKGMTGSPGSPGPDGKTGPGGAPGQDGRPGPPGPVGARGQ
PGVMGFPGPKGAAGEAGKPGERGVMGAIGATGAPGKDGDVGAPGAPGPAGPAGERGEQGAAGPPGFQGLP
GPQGATGEPGKSGEQGAPGEAGAPGPSGSRGDRGFPGERGAPGPAGPVGARGSPGSAGNDGAKGESGAAG
APGAQGPPGLQGMPGERGAAGLPGLKGDRGDQGAKGADGAAGKDGIRGMTGPIGPPGPAGAPGDKGESGA
QGLVGPTGARGPPGERGETGAPGPAGFAGPPGADGLPGAKGEPGDNGAKGDAGAPGPAGATGAPGPQGPV
GATGPKGARGAAGPPGATGFPGAAGRVGPPGPSGNSGPPGPPGPAGKEGQKGNRGETGPAGRTGEVGAAG
PPGAPGEKGNPGAEGATGPAGIPGPQGIGGQRGIVGLPGQRGERGFPGLPGPSGEIGKQGPSGPSGERGP
PGPMGPPGLAGPPGEPGREGTPGNEGSAGRDGAAGPKGDRGETGPSGTPGAPGPPGAAGPIGPAGKTGDR
GETGPAGVPGPAGPSGPRGPSGPAGARGDKGETGEAGERGMKGHRGFTGMPGPPGPPGPSGESGPAGASG
PAGPRGPAGSAGSAGKDGMSGLPGPIGPPGPRGRNGEIGPAGPPGPPGPPGAPGPSGGGFDIGFIAQPQE
KAPDPFRHFRADDANVMRDRDLEVDTTLKSLSQQIESIISPDGTKKNPARTCRDLKMCHPDWKSGEYWID
PDQGCNQDAIKVYCNMETGETCVNPTESAIPKKNWYTSKNIKEKKHVWFGEAMTDGFQFEYGSEGSKPED
VNIQLTFLRLMSTEASQNITYHCKNSIAYMDQASGNLKKALLLQGSNEIEIRAEGNSRFTYSVTEDGCTS
HTGAWGKTVIDYKTTKTSRLPIIDIAPMDVGAPNQEFGIEVGPVCFL

COL3A1

Human:
>NP_000081.2 collagen alpha-1(III) chain preproprotein [Homo sapiens]
MMSFVQKGSWLLLALLHPTIILAQQEAVEGGCSHLGQSYADRDVWKPEPCQICVCDSGSVLCDDIICDDQ
ELDCPNPEIPFGECCAVCPQPPTAPTRPPNGQGPQGPKGDPGPPGIPGRNGDPGIPGQPGSPGSPGPPGI
CESCPTGPQNYSPQYDSYDVKSGVAVGGLAGYPGPAGPPGPPGPPGTSGHPGSPGSPGYQGPPGEPGQAG
PSGPPGPPGAIGPSGPAGKDGESGRPGRPGERGLPGPPGIKGPAGIPGFPGMKGHRGFDGRNGEKGETGA
PGLKGENGLPGENGAPGPMGPRGAPGERGRPGLPGAAGARGNDGARGSDGQPGPPGPPGTAGFPGSPGAK
GEVGPAGSPGSNGAPGQRGEPGPQGHAGAQGPPGPPGINGSPGGKGEMGPAGIPGAPGLMGARGPPGPAG
ANGAPGLRGGAGEPGKNGAKGEPGPRGERGEAGIPGVPGAKGEDGKDGSPGEPGANGLPGAAGERGAPGF
RGPAGPNGIPGEKGPAGERGAPGPAGPRGAAGEPGRDGVPGGPGMRGMPGSPGGPGSDGKPGPPGSQGES
GRPGPPGPSGPRGQPGVMGFPGPKGNDGAPGKNGERGGPGGPGPQGPPGKNGETGPQGPPGPTGPGGDKG
DTGPPGPQGLQGLPGTGGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAPGERGPPGLAGAPGLRGGAGP
PGPEGGKGAAGPPGPPGAAGTPGLQGMPGERGGLGSPGPKGDKGEPGGPGADGVPGKDGPRGPTGPIGPP
GPAGQPGDKGEGGAPGLPGIAGPRGSPGERGETGPPGPAGFPGAPGQNGEPGGKGERGAPGEKGEGGPPG
VAGPPGGSGPAGPPGPQGVKGERGSPGGPGAAGFPGARGLPGPPGSNGNPGPPGPSGSPGKDGPPGPAGN
TGAPGSPGVSGPKGDAGQPGEKGSPGAQGPPGAPGPLGIAGITGARGLAGPPGMPGPRGSPGPQGVKGES
GKPGANGLSGERGPPGPQGLPGLAGTAGEPGRDGNPGSDGLPGRDGSPGGKGDRGENGSPGAPGAPGHPG
PPGPVGPAGKSGDRGESGPAGPAGAPGPAGSRGAPGPQGPRGDKGETGERGAAGIKGHRGFPGNPGAPGS
PGPAGQQGAIGSPGPAGPRGPVGPSGPPGKDGTSGHPGPIGPPGPRGNRGERGSEGSPGHPGQPGPPGPP
GAPGPCCGGVGAAAIAGIGGEKAGGFAPYYGDEPMDFKINTDEIMTSLKSVNGQIESLISPDGSRKNPAR
NCRDLKFCHPELKSGEYWVDPNQGCKLDAIKVFCNMETGETCISANPLNVPRKHWWTDSSAEKKHVWFGE
SMDGGFQFSYGNPELPEDVLDVHLAFLRLLSSRASQNITYHCKNSIAYMDQASGNVKKALKLMGSNEGEF
KAEGNSKFTYTVLEDGCTKHTGEWSKTVFEYRTRKAVRLPIVDIAPYDIGGPDQEFGVDVGPVCFL

Chimpanzee:
>XP_001163809.1 collagen alpha-1(III) chain [Pan troglodytes]
MMSFVQKGSWLLLALLHPTIILAQQEAVEGGCSHLGQSYADRDVWKPEPCQICVCDSGSVLCDDIICDDQ
ELDCPNPEIPFGECCAVCPQPPTAPTRPPNGQGPQGPKGDPGPPGIPGRNGDPGIPGQPGSPGSPGPPGI
CESCPTGPQNYSPQYDSYDVKSGVAVGGLAGYPGPAGPPGPPGPPGTSGHPGSPGSPGYQGPPGEPGQAG
PSGPPGPPGAIGPSGPAGKDGESGRPGRPGERGLPGPPGIKGPAGIPGFPGMKGHRGFDGRNGEKGETGA
PGLKGENGLPGENGAPGPMGPRGAPGERGRPGLPGAAGARGNDGARGSDGQPGPPGPPGTAGFPGSPGAK
GEVGPAGSPGSNGAPGQRGEPGPQGHAGAQGPPGPPGINGSPGGKGEMGPAGIPGAPGLMGARGPPGPAG
ANGAPGLRGGAGEPGKNGAKGEPGPRGERGEAGIPGVPGAKGEDGKDGSPGEPGANGLPGAAGERGAPGF
RGPAGPNGIPGEKGPAGERGAPGPAGPRGAAGEPGRDGVPGGPGMRGMPGSPGGPGSDGKPGPPGSQGES
GRPGPPGPSGPRGQPGVMGFPGPKGNDGAPGKNGERGGPGGPGPQGPPGKNGETGPQGPPGPTGPGGDKG
DTGPPGPQGLQGLPGTGGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAPGERGPPGLAGAPGLRGGAGP
PGPEGGKGAAGPPGPPGAAGTPGLQGMPGERGGLGSPGPKGDKGEPGGPGADGVPGKDGPRGPTGPIGPP
GPAGQPGDKGEGGAPGLPGIAGPRGSPGERGETGPPGPAGFPGAPGQNGEPGGKGERGAPGEKGEGGPPG
VAGPPGGSGPAGPPGPQGVKGERGSPGGPGAAGFPGARGLPGPPGSNGNPGPPGPSGSPGKDGPPGPAGN
TGAPGSPGVSGPKGDAGQPGEKGSPGAQGPPGAPGPLGIAGITGARGLAGPPGMPGPRGSPGPQGVKGES
GKPGANGLSGERGPPGPQGLPGLAGTAGEPGRDGNPGSDGLPGRDGSPGGKGDRGENGSPGAPGAPGHPG
PPGPVGPAGKSGDRGESGPAGPAGAPGPAGSRGAPGPQGPRGDKGETGERGAAGIKGHRGFPGNPGAPGS
PGPAGQQGAIGSPGPAGPRGPVGPSGPPGKDGTSGHPGPIGPPGPRGNRGERGSEGSPGHPGQPGPPGPP
GAPGPCCGGVGAAAIAGIGGEKAGGFAPYYGDEPMDFKINTDEIMTSLKSVNGQIESLISPDGSRKNPAR
NCRDLKFCHPELKSGEYWVDPNQGCKLDAIKVFCNMETGETCISANPLNVPRKHWWTDSSAEKKHVWFGE
SMDGGFQFSYGNPELPEDVLDVHLAFLRLLSSRASQNITYHCKNSIAYMDQASGNVKKALKLMGSNEGEF
KAEGNSKFTYTVLEDGCTKHTGEWSKTVFEYRTRKAVRLPIVDIAPYDIGGPDQEFGVDVGPVCFL

Mouse:
>NP_034060.2 collagen alpha-1(III) chain preproprotein [Mus musculus]
MMSFVQSGTWFLLTLLHPTLILAQQSNVDELGCSHLGQSYESRDVWKPEPCQICVCDSGSVLCDDIICDE
EPLDCPNPEIPFGECCAICPQPSTPAPVLPDGHGPQGPKGDPGPPGIPGRNGDPGLPGQPGLPGPPGSPG
ICESCPTGGQNYSPQFDSYDVKSGVGGMGGYPGPAGPPGPPGPPGSSGHPGSPGSPGYQGPPGEPGQAGP
AGPPGPPGALGPAGPAGKDGESGRPGRPGERGLPGPPGIKGPAGMPGFPGMKGHRGFDGRNGEKGETGAP
GLKGENGLPGDNGAPGPMGPRGAPGERGRPGLPGAAGARGNDGARGSDGQPGPPGPPGTAGFPGSPGAKG
EVGPAGSPGSNGSPGQRGEPGPQGHAGAQGPPGPPGNNGSPGGKGEMGPAGIPGAPGLIGARGPPGPAGT
NGIPGTRGPSGEPGKNGAKGEPGARGERGEAGSPGIPGPKGEDGKDGSPGEPGANGLPGAAGERGPSGFR
GPAGPNGIPGEKGPPGERGGPGPAGPRGVAGEPGRDGTPGGPGIRGMPGSPGGPGNDGKPGPPGSQGESG
RPGPPGPSGPRGQPGVMGFPGPKGNDGAPGKNGERGGPGGPGLPGPAGKNGETGPQGPPGPTGPAGDKGD
SGPPGPQGLQGIPGTGGPPGENGKPGEPGPKGEVGAPGAPGGKGDSGAPGERGPPGTAGIPGARGGAGPP
GPEGGKGPAGPPGPPGASGSPGLQGMPGERGGPGSPGPKGEKGEPGGAGADGVPGKDGPRGPAGPIGPPG
PAGQPGDKGEGGSPGLPGIAGPRGGPGERGEHGPPGPAGFPGAPGQNGEPGAKGERGAPGEKGEGGPPGP
AGPTGSSGPAGPPGPQGVKGERGSPGGPGTAGFPGGRGLPGPPGNNGNPGPPGPSGAPGKDGPPGPAGNS
GSPGNPGIAGPKGDAGQPGEKGPPGAQGPPGSPGPLGIAGLTGARGLAGPPGMPGPRGSPGPQGIKGESG
KPGASGHNGERGPPGPQGLPGQPGTAGEPGRDGNPGSDGQPGRDGSPGGKGDRGENGSPGAPGAPGHPGP
PGPVGPSGKSGDRGETGPAGPSGAPGPAGARGAPGPQGPRGDKGETGERGSNGIKGHRGFPGNPGPPGSP
GAAGHQGAIGSPGPAGPRGPVGPHGPPGKDGTSGHPGPIGPPGPRGNRGERGSEGSPGHPGQPGPPGPPG
APGPCCGGGAAAIAGVGGEKSGGFSPYYGDDPMDFKINTEEIMSSLKSVNGQIESLISPDGSRKNPARNC
RDLKFCHPELKSGEYWVDPNQGCKMDAIKVFCNMETGETCINASPMTVPRKHWWTDSGAEKKHVWFGESM
NGGFQFSYGPPDLPEDVVDVQLAFLRLLSSRASQNITYHCKNSIAYMDQASGNVKKSLKLMGSNEGEFKA
EGNSKFTYTVLEDGCTKHTGEWSKTVFEYQTRKAMRLPIIDIAPYDIGGPDQEFGVDIGPVCFL

Chicken:
>NP_990711.2 collagen alpha-1(III) chain precursor [Gallus gallus]
MMSFVQKVSLFILAVFQPSVILAQQDALGGCTHLGQEYADRDVWKPEPCQICVCDSGSVLCDDIICDDQE
LDCPNPEIPFGECCPVCPQTTPQPTKLPYTQGPKGDPGSPGSPGRTGAPGPPGQPGSPGAPGPPGICQSC
PSISGGSFSPQYDSYDVKAGSVGMGYPPQPISGFPGPPGPSGPPGPPGHAGPPGSNGYQGPPGEPGQPGP
SGPPGPAGMIGPAGPPGKDGEPGRPGRNGDRGIPGLPGHKGHPGMPGMPGMKGARGFDGKDGAKGDSGAP
GPKGEAGQPGANGSPGQPGPRGPTGERGRPGNPGGPGAHGKDGAPGAAGPPGPPGPPGTAGFPGSPGFKG
EAGPPGPAGASGSPGERGEPGPQGQAGPPGPQGPPGRAGSPGNKGEMGPSGIPGAPGLPGGRGLPGPPGT
SGNPGAKGTPGEPGKNGAKGDPGPKGERGENGTPGAPGPPGEEGKRGANGEPGQNGVPGTPGERGSPGFR
GLPGSNGLPGEKGPAGERGSPGPPGPSGPAGDRGQDGGPGLPGMRGLPGIPGSPGSDGKPGPPGNQGEPG
RSGPPGPAGPRGQPGVMGFPGPKGNEGAPGKNGERGPGGPPGTPGPAGKNGDVGLPGPPGPAGPAGDRGE
PGPSGSPGLQGLPGGPGPAGENGKPGEPGPKGDIGGPGFPGPKGENGIPGERGAQGPPGPTGARGGPGPA
GSEGAKGPPGPPGAPGGTGLPGLQGMPGERGASGSPGPKGDKGEPGGKGADGLPGARGERGNVGPIGPPG
PAGPPGDKGETGPAGAPGPAGSRGGPGERGEQGLPGPAGFPGAPGQNGEPGGKGERGPPGLRGEAGPPGA
AGPQGGPGAPGPPGPQGVKGERGSPGGPGAAGFPGARGLPGPPGNNGSPGPPGNAGPPGKDGPPGPPGNT
GPPGGSGPPGLRGEPGAPGEKGPPGARGERGTPGDPGPQGIIGSRGSTGLPGPRGLPGPAGMAGGKGEDG
KPGVNGVPGERGAPGPQGPMGQRGLPGEPGRDGNPGSDGSPGRDGSPGGKGDRGESGPPGVPGPPGHPGP
AGNNGAPGKAGERGFQGPPGPPGSAGPAGARGPAGPQGPRGDKGETGERGSAGIKGHRGFPGTPGLPGPP
GPLGPQGAIGSPGASGARGPPGPAGPPGKDGRGGYPGPIGPPGPRGNRGESGPAGPPGQPGLPGPSGPPG
PCCGGGVASLGAGEKGPVGYGYEYRDEPKENEINLGEIMSSMKSINNQIENILSPDGSRKNPARNCRDLK
FCHPELKSGEYWIDPNQGCKMDAIKVYCNMETGETCLSANPATVPRKNWWTTESSGKKHVWFGESMKGGF
QFSYGDPDLPEDVSEVQLAFLRILSSRASQNITYHCKNSIAYMDQASGNVKKALKLMSSVETEIKAEGNS
KYMYAVLEDGCTKHTGEWGKTVFEYRTRKTMRLPVVDIAPIDIGGPDQEFGVDVGPVCFL

COL5A1

Human:
>NP_000084.3 collagen alpha-1(V) chain isoform 1 preproprotein [Homo sapiens]
MDVHTRWKARSALRPGAPLLPPLLLLLLWAPPPSRAAQPADLLKVLDFHNLPDGITKTTGFCATRRSSKG
PDVAYRVTKDAQLSAPTKQLYPASAFPEDFSILTTVKAKKGSQAFLVSIYNEQGIQQIGLELGRSPVFLY
EDHTGKPGPEDYPLFRGINLSDGKWHRIALSVHKKNVTLILDCKKKTTKFLDRSDHPMIDINGIIVFGTR
ILDEEVFEGDIQQLLFVSDHRAAYDYCEHYSPDCDTAVPDTPQSQDPNPDEYYTEGDGEGETYYYEYPYY
EDPEDLGKEPTPSKKPVEAAKETTEVPEELTPTPTEAAPMPETSEGAGKEEDVGIGDYDYVPSEDYYTPS
PYDDLTYGEGEENPDQPTDPGAGAEIPTSTADTSNSSNPAPPPGEGADDLEGEFTEETIRNLDENYYDPY
YDPTSSPSEIGPGMPANQDTIYEGIGGPRGEKGQKGEPAIIEPGMLIEGPPGPEGPAGLPGPPGTMGPTG
QVGDPGERGPPGRPGLPGADGLPGPPGTMLMLPFRFGGGGDAGSKGPMVSAQESQAQAILQQARLALRGP
AGPMGLTGRPGPVGPPGSGGLKGEPGDVGPQGPRGVQGPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDR
GFDGLAGLPGEKGHRGDPGPSGPPGPPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGPPGPPGVTG
MDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGLPGMPGADGPPGHPGKEGP
PGEKGGQGPPGPQGPIGYPGPRGVKGADGIRGLKGTKGEKGEDGFPGFKGDMGIKGDRGEIGPPGPRGED
GPEGPKGRGGPNGDPGPLGPPGEKGKLGVPGLPGYPGRQGPKGSIGFPGFPGANGEKGGRGTPGKPGPRG
QRGPTGPRGERGPRGITGKPGPKGNSGGDGPAGPPGERGPNGPQGPTGFPGPKGPPGPPGKDGLPGHPGQ
RGETGFQGKTGPPGPPGVVGPQGPTGETGPMGERGHPGPPGPPGEQGLPGLAGKEGTKGDPGPAGLPGKD
GPPGLRGFPGDRGLPGPVGALGLKGNEGPPGPPGPAGSPGERGPAGAAGPIGIPGRPGPQGPPGPAGEKG
APGEKGPQGPAGRDGLQGPVGLPGPAGPVGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGPIGQ
PGPSGADGEPGPRGQQGLFGQKGDEGPRGFPGPPGPVGLQGLPGPPGEKGETGDVGQMGPPGPPGPRGPS
GAPGADGPQGPPGGIGNPGAVGEKGEPGEAGEPGLPGEGGPPGPKGERGEKGESGPSGAAGPPGPKGPPG
DDGPKGSPGPVGFPGDPGPPGEPGPAGQDGPPGDKGDDGEPGQTGSPGPTGEPGPSGPPGKRGPPGPAGP
EGRQGEKGAKGEAGLEGPPGKTGPIGPQGAPGKPGPDGLRGIPGPVGEQGLPGSPGPDGPPGPMGPPGLP
GLKGDSGPKGEKGHPGLIGLIGPPGEQGEKGDRGLPGPQGSSGPKGEQGITGPSGPIGPPGPPGLPGPPG
PKGAKGSSGPTGPKGEAGHPGPPGPPGPPGEVIQPLPIQASRTRRNIDASQLLDDGNGENYVDYADGMEE
IFGSLNSLKLEIEQMKRPLGTQQNPARTCKDLQLCHPDFPDGEYWVDPNQGCSRDSFKVYCNFTAGGSTC
VFPDKKSEGARITSWPKENPGSWFSEFKRGKLLSYVDAEGNPVGVVQMTFLRLLSASAHQNVTYHCYQSV
AWQDAATGSYDKALRFLGSNDEEMSYDNNPYIRALVDGCATKKGYQKTVLEIDTPKVEQVPIVDIMFNDF
GEASQKFGFEVGPACFMG

Chimpanzee:
>XP_016817528.1 collagen alpha-1(V) chain isoform X1 [Pan troglodytes]
MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRICVCDNGKVLC
DDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGL
PGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGE
PGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDA
GPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVG
AKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP
QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGP
GSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAG
QDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGE
RGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGAN
GAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIG
PPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP
PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGEPGKQGPSGASGERGPPGPM
GPPGLAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDRGETG
PAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGP
RGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSFLPQPPQEKAH
DGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYWIDPNQ
GCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVAI
QLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRAEGNSRFTYSVTVDGCTSHTG
AWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL

Mouse:
>NP_056549.2 collagen alpha-1(V) chain precursor [Mus musculus]
MDVHTRWKAARPGALLLSSPLLLFLLLLWAPPSSRAAQPADLLEMLDFHNLPSGVTKTTGFCATRRSSSE
PDVAYRVSKDAQLSMPTKQLYPESGFPEDFSILTTVKAKKGSQAFLVSIYNEQGIQQLGLELGRSPVFLY
EDHTGKPGPEEYPLFPGINLSDGKWHRIALSVYKKNVTLILDCKKKITKFLSRSDHPIIDTNGIVMFGSR
ILDDEIFEGDIQQLLFVSDNRAAYDYCEHYSPDCDTAVPDTPQSQDPNPDEYYPEGEGETYYYEYPYYED
PEDPGKEPAPTQKPVEAARETTEVPEEQTQPLPEAPTVPETSDTADKEDSLGIGDYDYVPPDDYYTPPPY
EDFGYGEGVENPDQPTNPDSGAEVPTSTTVTSNTSNPAPGEGKDDLGGEFTEETIKNLEENYYDPYFDPD
SDSSVSPSEIGPGMPANQDTIFEGIGGPRGEKGQKGEPAIIEPGMLIEGPPGPEGPAGLPGPPGTTGPTG
QMGDPGERGPPGRPGLPGADGLPGPPGTMLMLPFRFGGGGDAGSKGPMVSAQESQAQAILQQARLALRGP
AGPMGLTGRPGPMGPPGSGGLKGEPGDMGPQGPRGVQGPPGPTGKPGRRGRAGSDGARGMPGQTGPKGDR
GFDGLAGLPGEKGHRGDPGPSGPPGIPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGPPGPPGVTG
MDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGLPGMPGADGPPGHPGKEGP
PGEKGGQGPPGPQGPIGYPGPRGVKGADGIRGLKGTKGEKGEDGFPGFKGDMGIKGDRGEIGPPGPRGED
GPEGPKGRGGPNGDPGPLGPTGEKGKLGVPGLPGYPGRQGPKGSIGFPGFPGANGEKGGRGTPGKPGPRG
QRGPTGPRGERGPRGITGKPGPKGNSGGDGPAGPPGERGPNGPQGPTGFPGPKGPPGPPGKDGLPGHPGQ
RGETGFQGKTGPPGPPGVVGPQGPTGETGPMGERGHPGPPGPPGEQGLPGAAGKEGTKGDPGPAGLPGKD
GPPGLRGFPGDRGLPGPVGALGLKGSEGPPGPPGPAGSPGERGPAGAAGPIGIPGRPGPQGPPGPAGEKG
LPGEKGPQGPAGRDGLQGPVGLPGPAGPVGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGPIGQ
PGPSGADGEPGPRGQQGLFGQKGDEGSRGFPGPPGPVGLQGLPGPPGEKGETGDVGQMGPPGPPGPRGPS
GAPGADGPQGPPGGIGNPGAVGEKGEPGEAGDPGLPGEGGPLGPKGERGEKGEAGPSGAAGPPGPKGPPG
DDGPKGSPGPVGFPGDPGPPGEPGPAGQDGPPGDKGDDGEPGQTGSPGPTGEPGPSGPPGKRGPPGPAGP
EGRQGEKGAKGEAGLEGPPGKTGPIGPQGAPGKPGPDGLRGIPGPVGEQGLPGSPGPDGPPGPMGPPGLP
GLKGDSGPKGEKGHPGLIGLIGPPGEQGEKGDRGLPGPQGSSGPKGDQGITGPSGPLGPPGPPGLPGPPG
PKGAKGSSGPTGPKGEAGHPGLPGPPGPPGEVIQPLPIQASRTRRNIDASQLLDDGAGESYVDYADGMEE
IFGSLNSLKLEIEQMKRPLGTQQNPARTCKDLQLCHPDFPDGEYWVDPNQGCSRDSFKVYCNFTAGGSTC
VFPDKKSEGARITSWPKENPGSWFSEFKRGKLLSYVDAEGNPVGVVQMTFLRLLSASAHQNVTYNCYQSV
AWQDAATGSYDKAIRFLGSNDEEMSYDNNPYIRALVDGCATKKGYQKTVLEIDTPKVEQVPIVDIMFNDF
GEASQKFGFEVGPACFLG

Zebrafish:
>XP_021324673.1 collagen alpha-1(V) chain isoform X1 [Danio rerio]
MLCLFLCVCVFSAAEPADLLKILDFHSLPDGVTKTTGFCTHRKSSKGPDVAYRVTKDAQLSAPTKQLYPS
SMFPEDFSILATVRPKKGSQSFLLSVYNEQGIQQLGLEVGRSPVFLYEDHLGKPGPEDYPLFRGINLADG
KWHRVAISVHKQSITLILDCKKKVTRTLSRSPHPIIDTKGIVVFGTRILDEEVFEGDIQQLMIVSDHRAA
FDYCEHYSPDCEVSAPEQPQNQDPNTDQSNPEEDNYYYEYPYYEDMDSDKTEETITEETTGTETEVSVPD
SGRVITSVISSGSGGSSRSSSGSSSSSSSSSSSSSSSSSSSSSSSNTANTGEDRYDGYDGYDTGYETYYD
ESTSSPDGSRTITITSTGTGSAGELDLGLGEVDHQLGRVDLDRRIITSSSSSSGGEGGTVHSTTTIIRNG
TSGSGSSTRIISSSGSGSSSRISIGSGSSSGGSTGSSSSSSSSSSSSSSSTSVGGGGGYDDAQYGESYDL
SYGEGYGEGYGEGDYTVGGGSSSSSTSVSVGGGSVSVGGGSVSVGGGSVSVGGGSVSSSGSIGGGTSSAG
AAAGAESVATGGQGGIGGEAESIDIDKFKEESITDYTDAELEKMYDYDDLYTDGDKALPAETDEIYGQVD
GLRGEKGQKGEPAIIEPGMLVEGPSGPEGPTGLPGPPGSTGPPGSAGEPGERGLPGRAGLPGADGLPGPP
GTVLMLPFRMSTGGDGGQKGPAVSAQEAQMQAIMQQARLAMRGPTGPMGLTGRPGPLGPPGVSGLKGESG
ESGPQGPRGPMGSAGPTGKPGRRGRPGSDGARGMPGQTGPKGDRGFDGLAGLPGEKGHRGETGPQGPPGP
SGEDGERGDDGEVGPRGLPGEPGPRGVLGPKGPQGPPGPPGVTGMDGHPGPKGNIGPQGEPGPPGQQGNP
GAQGLNGPQGPIGPPGEKGPTGKPGLPGMPGADGPPGHPGKEGPSGEKGNLGPHGPQGPIGYPGPRGVKG
ADGVRGLKGNKGEKGEDGFPGFKGDMGVKGDKGELGAAGPRGEDGPEGPKGRSGLPGDPGPLGPLGEKGK
LGVPGLPGYPGRQGPKGSQGFQGFQGTSGEKGTRGTAGKPGPRGQRGPTGPRGERGPRGPTGKAGPKGNS
GSDGPPGPPGERGLLGPQGPAGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPPGVVGPQGPTG
ETGPMGERGHPGPPGPPGEQGLPGAAGKEGAKGDPGPAGPSGKDGPPGLRGFPGERGLPGPVGSAGLKGN
EGPPGPPGPAGSPGERGPAGPAGPTGIPGRPGPQGPPGPAGEKGGPGEKGPQGPAGRDGIQGPVGLPGPA
GPIGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGPAGQPGPSGADGEPGPRGQQGLFGQKGDEG
SRGFPGPPGPVGLQGLPGPPGEKGETGDVGQMGPPGPPGPRGPSGPPGADGPQGPPGGIGNPGAVGEKGD
AGEAGEPGPQGDIGPPGPRGERGEKGEAGPAGSGGPPGPKGPPGDDGPKGSPGPSGFPGDPGPPGEPGPS
GLDGSPGDKGDDGEPGQPGSPGPTGETGPPGPPGKRGPHGPAGPEGRQGEKGAKGESGLEGPPGKTGPVG
PQGSPGKPGPEGLRGVPGPVGEQGLPGAPGPDGPPGPMGPPGLPGLKGDLGIKGEKGHPGLIGLIGPPGE
QGEKGDRGLPGPQGTSGPKGDNGIAGPSGPIGPIGPPGLPGPPGPKGAKGSSGQTGPKGESGIPGPPGPP
GPPGDVIHPMPYQSSPKRAKRNIDASQVMDEATDANYKDYEDGMEEIFGSLNSLKLEIEQMKHPLGTESN
PARTCKDLQLCHPDFPDGEYWIDPNQGCSRDSFKVYCNFTAGGESCIFPDKKSEGARLTSWPKENPGTWF
SEYKRGKLLSYVDAEGNAIGVVQMTFLRLLSATARQNLTYNCYQSVAWHDQDQDSYDKAIRFLGSNDEEM
SYDNNPYIRAVVDGCALKKGYEKTILEINTPKVEQVPFVDIMFNDFGGATQKFGFEVGPACFIG

Chicken:
>NP_990121.2 collagen alpha-1(V) chain precursor [Gallus gallus]
MDTHTRWKRRSWIRNWQLHVALVLLGAAALGRAAEPADLLKVLDFHNLPDGITRTTGFCTSRRSSKEADV
AYRVTKDAQLSAPTKQLYPASPFPEDFSILTTVKAKKGGQSFLISIYNEQGIQQIGVEMGRSPVFLYEDH
TGKPGPEDYPLFRGINLADGKWHRVAISVQKKNVTLILDCKKKITKFLDRSDHPIIDVNGIIVFGTRILD
EEVFEGDIQQLLIVADPRAAHDYCEHYSPDCDTAVPDAPQSQDPNQDEYYTDGEGEGDTYYYEYPYYEDV
DEAVKPEAPTTKPAPPGVAAGERPETKQDYPSPTPSPEAGNPSRQTKGAAPVDDPLVDEYNYETINEEYF
TPLPYEDINYNEEVDPQGGLTENAVEAELPTSTVITYNETDAAQGGDDLDKDFTEETIKEYDGNYYYYDR
TVSPDIGPGMPANQDTIYEGIGGPRGEKGQKGEPAIIEPGMLVEGPPGPEGPAGLPGPPGPTGPVGLMGD
PGERGPPGRPGLPGADGLPGPPGTMLMLPFRFSGGGDAGSKGPMVSAQEAQAQAILQQARLALRGPAGPM
GLTGRPGPMGPPGSGGLKGEAGEMGPQGPRGIQGPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDRGFDG
LAGLPGEKGNRGEPGPHGPPGAPGEDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGPPGPPGVAGMDGQ
TGPKGNVGPQGEPGPPGQQGNPGAQGLPGPQGPIGPPGEKGPLGKPGLPGMPGADGPPGHPGKEGPPGEK
GSQGPPGPQGPIGYPGPRGVKGADGVRGLKGTKGEKGEDGFPGFKGDMGIKGDRGEIGPPGPRGEDGPEG
PKGRSGPNGDPGPLGPAGEKGKLGVPGLPGYPGRQGPKGSIGFPGFPGANGEKGTRGTPGKPGPRGQRGP
TGPRGERGPRGSTGKPGPKGNSGGDGPPGPPGERGPPGPQGPTGFPGPKGPPGPPGKDGLPGHPGQRGET
GFQGKTGPPGPPGVVGPQGPTGETGPMGERGHPGPPGPPGEQGLPGLTGKEGTKGDPGPAGLPGKDGPPG
LRGFPGERGLPGPIGSPGLKGNEGPPGPPGPAGSPGERGPAGSAGPIGLPGRPGPQGPPGPAGEKGAPGE
KGPQGPAGRDGIQGPVGLPGPAGPVGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGPIGQPGPA
GADGEPGPRGQQGLFGQKGDEGPRGFPGPPGPVGLQGLPGPPGEKGETGDVGQMGPPGPPGPRGPSGPPG
ADGPQGPAGGIGNPGAVGEKGEPGESGEPGLPGEVGLPGPKGERGEKGEAGPSGAAGPPGPKGPPGDDGP
KGSPGPVGFPGDPGPPGEPGPAGQDGPPGDKGDDGEPGQTGSPGPTGEPGPSGPPGKRGPPGPAGPEGRQ
GEKGAKGEAGLEGPPGKTGPIGPQGAPGKPGPDGLRGIPGPVGEQGLPGSPGPDGPPGPLGPPGLPGLKG
DSGPKGEKGHPGLIGLIGPPGEQGEKGDRGLPGPQGSAGPKGEQGITGPSGPIGPPGPPGLPGPPGPKGA
KGSSGPTGPKGESGLPGPPGPPGPPGEVIQPLPIQSSKRTRRNIDASQLVDDGNADNYMDYADGMEEIFG
SLNSLKLEIEQMKHPLGTQHNPARTCKDLQLCHPDFPDGEYWVDPNQGCSRDSFKVYCNFTAGGETCIFP
DKKSEGARITSWPKENPGSWFSEFKRGKLLSYVDSDGNPIGVVQMTFLRLLSASAHQNITYNCYQSVAWH
DATTDSYDKAIRFLGSNDEEMSYDNNPYIRAAFDGCAAKKGYQKTVLEINTPKVEQVPIVDIMFNDFGEA
SQKFGFEVGPACFMG

COL4A2

Human:
>NP_001837.2 collagen alpha-2(IV) chain preproprotein [Homo sapiens]
MGRDQRAVAGPALRRWLLLGTVTVGFLAQSVLAGVKKFDVPCGGRDCSGGCQCYPEKGGRGQPGPVGPQG
YNGPPGLQGFPGLQGRKGDKGERGAPGVTGPKGDVGARGVSGFPGADGIPGHPGQGGPRGRPGYDGCNGT
QGDSGPQGPPGSEGFTGPPGPQGPKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQMGPV
GAPGRPGPPGPPGPKGQQGNRGLGFYGVKGEKGDVGQPGPNGIPSDTLHPIIAPTGVTFHPDQYKGEKGS
EGEPGIRGISLKGEEGIMGFPGLRGYPGLSGEKGSPGQKGSRGLDGYQGPDGPRGPKGEAGDPGPPGLPA
YSPHPSLAKGARGDPGFPGAQGEPGSQGEPGDPGLPGPPGLSIGDGDQRRGLPGEMGPKGFIGDPGIPAL
YGGPPGPDGKRGPPGPPGLPGPPGPDGFLFGLKGAKGRAGFPGLPGSPGARGPKGWKGDAGECRCTEGDE
AIKGLPGLPGPKGFAGINGEPGRKGDRGDPGQHGLPGFPGLKGVPGNIGAPGPKGAKGDSRTITTKGERG
QPGVPGVPGMKGDDGSPGRDGLDGFPGLPGPPGDGIKGPPGDPGYPGIPGTKGTPGEMGPPGLGLPGLKG
QRGFPGDAGLPGPPGFLGPPGPAGTPGQIDCDTDVKRAVGGDRQEAIQPGCIGGPKGLPGLPGPPGPTGA
KGLRGIPGFAGADGGPGPRGLPGDAGREGFPGPPGFIGPRGSKGAVGLPGPDGSPGPIGLPGPDGPPGER
GLPGEVLGAQPGPRGDAGVPGQPGLKGLPGDRGPPGFRGSQGMPGMPGLKGQPGLPGPSGQPGLYGPPGL
HGFPGAPGQEGPLGLPGIPGREGLPGDRGDPGDTGAPGPVGMKGLSGDRGDAGFTGEQGHPGSPGFKGID
GMPGTPGLKGDRGSPGMDGFQGMPGLKGRPGFPGSKGEAGFFGIPGLKGLAGEPGFKGSRGDPGPPGPPP
VILPGMKDIKGEKGDEGPMGLKGYLGAKGIQGMPGIPGLSGIPGLPGRPGHIKGVKGDIGVPGIPGLPGF
PGVAGPPGITGFPGFIGSRGDKGAPGRAGLYGEIGATGDFGDIGDTINLPGRPGLKGERGTTGIPGLKGF
FGEKGTEGDIGFPGITGVTGVQGPPGLKGQTGFPGLTGPPGSQGELGRIGLPGGKGDDGWPGAPGLPGFP
GLRGIRGLHGLPGTKGFPGSPGSDIHGDPGFPGPPGERGDPGEANTLPGPVGVPGQKGDQGAPGERGPPG
SPGLQGFPGITPPSNISGAPGDKGAPGIFGLKGYRGPPGPPGSAALPGSKGDTGNPGAPGTPGTKGWAGD
SGPQGRPGVFGLPGEKGPRGEQGFMGNTGPTGAVGDRGPKGPKGDPGFPGAPGTVGAPGIAGIPQKIAVQ
PGTVGPQGRRGPPGAPGEMGPQGPPGEPGFRGAPGKAGPQGRGGVSAVPGFRGDEGPIGHQGPIGQEGAP
GRPGSPGLPGMPGRSVSIGYLLVKHSQTDQEPMCPVGMNKLWSGYSLLYFEGQEKAHNQDLGLAGSCLAR
FSTMPFLYCNPGDVCYYASRNDKSYWLSTTAPLPMMPVAEDEIKPYISRCSVCEAPAIAIAVHSQDVSIP
HCPAGWRSLWIGYSFLMHTAAGDEGGGQSLVSPGSCLEDFRATPFIECNGGRGTCHYYANKYSFWLTTIP
EQSFQGSPSADTLKAGLIRTHISRCQVCMKNL

Chimpanzee:
>XP_001136859.2 collagen alpha-2(IV) chain isoform X2 [Pan troglodytes]
MGRDQRAVAGPALRRWLLGTVTVGFLAQSVLAGVKKFDVPCGGRDCSGGCQCYPEKGGRGQPGPVGPQGY
NGPPGLQGFPGLQGRKGDKGERGAPGITGPKGDVGARGVSGFPGADGIPGHPGQGGPRGRPGYDGCNGTQ
GDSGPQGPPGSEGFTGPPGPQGPKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQMGPVG
APGRPGPPGPPGPKGQQGNRGLGFYGVKGEKGDVGQPGPNGIPSDLLHPIIAPTGVTFHPDQYKGEKGSE
GEPGIRGISLKGEEGIMGFPGLRGYPGLSGEKGSPGQKGSRGLDGYQGPDGPRGPKGEAGDPGPPGLPAY
SPHPSLAKGARGDPGFPGAQGEPGSQGEPGDPGLPGAPGLSIGDGDQRRGLPGEMGPKGFIGDPGIPALY
GGPPGPDGKRGPPGPPGLPGPPGPDGFLFGLKGAKGRAGFPGLPGSPGARGPKGWKGDAGECRCTEGDEA
IKGLPGLPGPKGFAGINGEPGRKGDKGDPGQHGLPGFPGLKGVPGNVGAPGPKGAKGDSRTITTKGERGQ
PGVPGVPGMKGDDGSPGRDGLDGFPGLPGPPGDGIKGPPGDPGYPGIPGTKGTPGEMGPPGLGLPGLKGQ
RGFPGDAGLPGPPGFLGPPGPAGTPGQIDCDTDVKRAIGGDRQEAIQPGCVGGPKGLPGLPGPPGPTGAK
GLRGIPGFSGADGGPGPKGLPGDAGREGFPGPPGFIGPRGSKGAVGLPGPDGSPGPIGLPGPDGPPGERG
LPGEVLGAQPGPRGDAGVPGQPGLKGLPGDRGPPGFRGSQGMPGMPGLKGQPGLPGPSGQPGLYGPPGLH
GFPGAPGQEGPLGLPGIPGREGLPGDRGDPGDTGAPGPVGMKGLSGDRGDAGFTGERGHPGSPGFKGIDG
MPGTPGLKGDRGSPGMDGFQGMPGLKGRPGFPGSKGEAGFFGIPGLKGLAGEPGFKGSRGDPGPPGPPPV
ILPGMKDIKGEKGDEGPMGLKGYLGAKGIQGMPGIPGLSGIPGLPGRPGHIKGVKGDIGAPGIPGLPGFP
GVAGPPGITGFPGFIGSRGDKGAPGRAGLYGEIGATGDFGDIGDTINLPGRPGLKGERGTTGIPGLKGFF
GEKGTEGDIGFPGITGVTGVQGPPGLKGQTGFPGLTGPPGSQGEPGRIGLPGGKGDDGWPGAPGLPGFPG
LRGIRGLHGLPGTKGFPGSPGSDIHGDPGFPGPPGERGDPGEANTLPGPVGVPGQKGDQGAPGERGPPGS
PGLQGFPGITPPSNISGAPGDKGAPGIFGLKGYRGPPGPPGSAALPGSKGDTGNPGAPGTPGTKGWAGDS
GPQGRPGVFGLPGEKGPRGEQGFMGNTGPTGAVGDRGPKGPKGDPGFPGAPGTVGAPGIAGIPQKIAVQP
GTVGPQGRRGPPGAPGEMGPQGPPGEPGFRGAPGKAGPQGRGGVSAVPGFRGDEGPIGHQGPIGQEGAPG
RPGSPGLPGMPGRSVSIGYLLVKHSQTDQEPMCPVGMNKLWSGYSLLYFEGQEKAHNQDLGLAGSCLARF
STMPFLYCNPGDVCYYASRNDKSYWLSTTAPLPMMPVAEDEIKPYISRCSVCEAPAVAIAVHSQDVSIPH
CPAGWRSLWIGYSFLMHTAAGDEGGGQSLVSPGSCLEDFRATPFIECNGGRGTCHYYANKYSFWLTTIPE
QSFQGSPSADTLKAGLIRTHISRCQVCMKNL

Mouse:
>NP_034062.3 collagen alpha-2(IV) chain precursor [Mus musculus]
MDRVRFKASGPPLRGWLLLATVTVGLLAQSVLGGVKKLDVPCGGRDCSGGCQCYPEKGARGQPGAVGPQG
YNGPPGLQGFPGLQGRKGDKGERGVPGPTGPKGDVGARGVSGFPGADGIPGHPGQGGPRGRPGYDGCNGT
RGDAGPQGPSGSGGFPGLPGPQGPKGQKGEPYALSKEDRDKYRGEPGEPGLVGYQGPPGRPGPIGQMGPM
GAPGRPGPPGPPGPKGQPGNRGLGFYGQKGEKGDIGQPGPNGIPSDITLVGPTTSTIHPDLYKGEKGDEG
EQGIPGVISKGEEGIMGFPGIRGFPGLDGEKGVVGQKGSRGLDGFQGPSGPRGPKGERGEQGPPGPSVYS
PHPSLAKGARGDPGFQGAHGEPGSRGEPGEPGTAGPPGPSVGDEDSMRGLPGEMGPKGFSGEPGSPARYL
GPPGADGRPGPQGVPGPAGPPGPDGFLFGLKGSEGRVGYPGPSGFPGTRGQKGWKGEAGDCQCGQVIGGL
PGLPGPKGFPGVNGELGKKGDQGDPGLHGIPGFPGFKGAPGVAGAPGPKGIKGDSRTITTKGERGQPGIP
GVHGMKGDDGVPGRDGLDGFPGLPGPPGDGIKGPPGDAGLPGVPGTKGFPGDIGPPGQGLPGPKGERGFP
GDAGLPGPPGFPGPPGPPGTPGQRDCDTGVKRPIGGGQQVVVQPGCIEGPTGSPGQPGPPGPTGAKGVRG
MPGFPGASGEQGLKGFPGDPGREGFPGPPGFMGPRGSKGTTGLPGPDGPPGPIGLPGPAGPPGDRGIPGE
VLGAQPGTRGDAGLPGQPGLKGLPGETGAPGFRGSQGMPGMPGLKGQPGFPGPSGQPGQSGPPGQHGFPG
TPGREGPLGQPGSPGLGGLPGDRGEPGDPGVPGPVGMKGLSGDRGDAGMSGERGHPGSPGFKGMAGMPGI
PGQKGDRGSPGMDGFQGMLGLKGRQGFPGTKGEAGFFGVPGLKGLPGEPGVKGNRGDRGPPGPPPLILPG
MKDIKGEKGDEGPMGLKGYLGLKGIQGMPGVPGVSGFPGLPGRPGFIKGVKGDIGVPGTPGLPGFPGVSG
PPGITGFPGFTGSRGEKGTPGVAGVFGETGPTGDFGDIGDTVDLPGSPGLKGERGITGIPGLKGFFGEKG
AAGDIGFPGITGMAGAQGSPGLKGQTGFPGLTGLQGPQGEPGRIGIPGDKGDFGWPGVPGLPGFPGIRGI
SGLHGLPGTKGFPGSPGVDAHGDPGFPGPTGDRGDRGEANTLPGPVGVPGQKGERGTPGERGPAGSPGLQ
GFPGISPPSNISGSPGDVGAPGIFGLQGYQGPPGPPGPNALPGIKGDEGSSGAAGFPGQKGWVGDPGPQG
QPGVLGLPGEKGPKGEQGFMGNTGPSGAVGDRGPKGPKGDQGFPGAPGSMGSPGIPGIPQKIAVQPGTLG
PQGRRGLPGALGEIGPQGPPGDPGFRGAPGKAGPQGRGGVSAVPGFRGDQGPMGHQGPVGQEGEPGRPGS
PGLPGMPGRSVSIGYLLVKHSQTDQEPMCPVGMNKLWSGYSLLYFEGQEKAHNQDLGLAGSCLARFSTMP
FLYCNPGDVCYYASRNDKSYWLSTTAPLPMMPVAEEEIKPYISRCSVCEAPAVAIAVHSQDTSIPHCPAG
WRSLWIGYSFLMHTAAGDEGGGQSLVSPGSCLEDFRATPFIECNGGRGTCHYFANKYSFWLTTIPEQNFQ
STPSADTLKAGLIRTHISRCQVCMKNL

Chicken:
>NP_001155862.1 collagen alpha-2(IV) chain precursor [Gallus gallus]
MDLSPLPGARVSVCQDVGLLLVLLEVVLSAGRVDAGGKSYTGPCGGRDCSGGCQCFPEKGARGQPGILGS
QGFPGPPGLMGIPGLQGPKGHKGERGHPGISGPKGETGQRGVTGFPGADGVPGHPGQPGSRGKPGHDGCN
GTVGDPGDPGTPGHSGFPGTIGVQGPKGQKGEPYVLPPDIASRHRGDPGDPGFTGFPGAPGTLGIQGPIG
PRGVPGRPGPPGSPGPQGPQGNRGLGFYGEKGEQGSPGPPGPPGLPTRELIGVPTDKHKGERGEPGQKGE
AGFPGVLLFAPEKGEEGVMGFPGQRGLPGNDGFPGLSGERGFPGFDGQPGQYGPRGGKGEQGEMGPPGPP
AYVPYRIPRKGVRGDPGSPGASGLRGEQGEQGDRGLPGIPGFSDGDADKPGLPGEIGPKGEKGEEGSPAY
QAGPPGFPGKHGEPGIRGPPGPPGTPGSLFGLKGQEGTPGRPGVQGFPGPRGQRGPKGEEGDCSKCLLSD
ELRRGSTGPRGPPGFPGTPGQPGRKGEPGDQGPHGIPGYAGAKGQSGQEGLPGPKGEKGDSIYITTKGTK
GIRGDPGLPGIRGEDGFPGRDGLDGLPGLPGLPGDGIRGLPGDPGYPGELGPKGFPGEIGLPGEGYPGPK
GYRGLPGDRGTDGHPGPPGLPGPPGEPGQLDCGQVIEDFSRGEATDPIWSGGGCVRPPKGSQGNPGLPGA
TGTKGARGFPGDPGPVGFPGLNGTRGDPGREGYPGPPGFIGPRGDRGPNGLPGLQGHPGLMGKSGAPGLA
GQKGAPGDVLGAAAGPRGDDGLPGFPGLKGAPGDQGIPGIRGADGNPGLPGPKGDPGLQGLPGLMGLPGT
PGTHGFPGPPGNRGPDGGPGSQGPLGPPGARGEDGEQGFPGPVGMKGLSGDKGETGFPGLQGIPGVTGPP
GISGMDGFPGDKGSRGSPGIDGFKGMPGLKGRPGIKGIKGEFGLLGTRGDKGAQGARGFKGDRGEQGPPG
EPPKLKPSMMMEVKGEKGDAGETGTKGFFGIKGSKGMPGLPGKTGIPGSPGHPSYVPGVKGDIGAKGLTG
LKGYPGPTGSPGIRGFPGSTGGRGDKGAPGISGHFGTPGSHGEIGEPGDTINLPGMPGLKGEVGVPGLTG
LRGGPGQKGEGGDPGLPGIEGLKGIQGVPGSLGQKGLPGLVGPPGQQGSPGTPGFQGEKGAPGWPGLPGQ
AGLPGLRGISGLHGLPGTKGLPGSPGPDGYGSAGFPGPVGDKGEAGEPSRVEGSQGPPGQKGDRGVPGVQ
GPFGIPGQDGLPGPPGISNISGYPGDTGSPGLDGVPGYPGLHGQPGIPAPPGSKGESGRAGVSGQAGPKG
TRGDPGLPGRPGIPGYPGPKGRKGEQGVIGFIGTVGFPGDLGPIGPKGDRGLTGFQGPPGSPGLPPIPPR
LVAEQGSPGPRGNAGPRGSPGDAGPQGPPGEPGLRGLPGEPGLQGRGGIPAPPGSRGEQGAMGFQGPVGF
EGQPGRPGSPGLPGMPGRSVSIGYLLVKHSQSDQEPMCPIGMNKLWSGYSLLYFEGQEKAHNQDLGLAGS
CLARFSTMPFLYCNPGDICYYANRNDKSYWLSTTAPLPMMPVAEEEIRPYISRCSVCEAPAVAIAVHSQE
ASIPRCPEGWRSLWIGYSFLMHTAAGDEGGGQSLVSPGSCLEDFRATPFIECNGARGTCHYFANKYSFWL
TTIDQPFQSKPSADTLKAGLIRSHISRCQVCMKNL

Zebrafish:
>XP_017213247.1 collagen alpha-2(IV) chain isoform X1 [Danio rerio]
MEGENHLQHWNTLRCFLLCLVVLILTTDVNAGVKKSTGPCGGRDCSGGCQCYPEKGARGLPGPLGPQGPT
GPKGRQGEPGLQGPKGMKGEHGEAGFVGPKGSVGMEGVPGFNGADGVPGHPGPSGARGKAGPDGCNGTRG
DSGMPGFPGLEGGQGTPGWPGMKGEKGDPLEVWVYMERFRGDPGPAGFPGIVGPTGDPGYRGLPGFQGPW
GPQGLKGAKGQKGERTKILKGVKGELGEIGEPGPPGVFSQTSPIPSDWQGEKGKMGSKGDKGDLGYLSTK
AESGVSGFAGARGNPGLDGWPGPRGDTGLPGFPGNNGRKGDVGEPGELVGSPFDDTGVGPPGPPGERGPV
GNYGRKGAKGQPGPPGPPPYGQQTVELWGPEGPRGPKGASGETGEPGNPATQPGPPGPDGSPGSVGPPGP
KGSLEEYFKGSPGHRGRPGSAGKKGPKGDHGLCECSVKPPPGPPGPPGDSGDPGMAGEWGQQGDQGDPGT
KGANGLPGFPGTEGLPGPKGRKGELMEAVQKGSAGDPGDPGHSGFPGEQGRPGIDGRDGEPGFRGPPGEG
PVGEPGAKGYSGPPGRSGLIGPKGNPGQVLGATKGLSGLPGDDGQTGPDGRTGPPGPPGDCSSRGGSQWS
IPVECIGDPGPPGESGLPGPRGLEGFPGVPGPKGTAGFSGDLGDKGERGEPGRGGPPGPKGFTGPRGDLG
YPGTKGVKGPPGLSGKPGLDGASGGKGEHGEVFGASSGAPGDPGLPGHRGDTGLTGDPGLPGYFGMEGMP
GMTGLKGESGPPGLQGEQGRPGPPGTFGFPGQTGYTGPPGPPGNIGEPGPSGRRGDQGEPGLIGARGVKG
AIGDPGNIGVTGNIGPPGDQGETGGPGFFGLPGLKGSKGGQGMTGFPGKTGERGLKGFSGTKGEIGLPGH
QGAKGAPGSPGQKGDRGLTGPPGDKPVIHITPHMKEVMKGSKGDHGNMGDPGFTGPRGTKGFPGIPGGEG
KDGHPGEPSVVKGVKGLPGEPGLEGPKGMPGPTGLPGIEGFPGMSGPKGNKGSSGAYGRPGDSGIKGFKG
DLGPTISLPGSTGLRGETGHPGSTGAKGLYGMPGERGSSGLDGIEGMKGYQGEPGTVGPPGSDGLQGFPG
SQGHKGRSGDPGQTGTTGIQGQPGQKGFPGLKGIFGLDGLKGQKGNQGVPGTDNWGQPGNPGTKGELGET
GIPSTTTGVPGSLGHKGTTGDSGNKGEIGQRGVPGPQGQQGFSDIKGTQGDYGFPGIKGIPGFPGTRGTP
GIPSPFGAKGDRGNVGDFGLFGEKGVQGDRGEKGSSGESGVSGRKGDKGEPGMMGFPGRRGFQGDRGPFG
PKGDTGPPGFPGLPGAKGYPPVPQKLPGEQGPPGQMGIQGPSGRNGNPGPAGPPGDSGFVGPSGHKGMPG
LPGIPGSPGFRGEPGSMGHSGLQGEQGTRGRPGMPGPLGMAGRSVNVGYLLVKHSQSEEIPMCPQGMSLL
WMGYSLLYFEGQEKAHNQDLGLAGSCLPRFNTMPFLYCNPGDICYYASRNDKSYWLSTTQPIPMMPVEES
EIKPYISRCSVCEAPSVAIAVHSQDTTIPNCPVGWRSLWIGYSFLMHTAAGDEGGGQSLASPGSCLEDFR
TTPFIECNGAKGTCHYFANKHSFWLTSIDESFQSSPSSETLKAGQLLSRISRCQVCMKNL

Scientific Names of Species:

H: Homo sapiens
Chimpanzee: Pan troglodytes
Mouse: Mus musculus
Chicken: Gallus gallus
Zebrafish: Danio rerio
Drosophila: Drosophila melanogaster

2. Is each collagen present in all species? If not, why do you think the gene is absent in a
given organism?

• Collagen 1A1 is absent in Drosophila.
• Collagen 3A1 is absent in Zebrafish and Drosophila.
• Collagen 5A1 is absent in Drosophila.
• Collagen 4A2 is absent in Drosophila.

A collagen may be absent from an organism because the gene was lost through mutation or
deletion, because of gaps in the available sequence data, or for evolutionary reasons, such as
the gene having arisen only after the lineages diverged.
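
The absence list above can be derived from the sequences collected in question 1. The following is a small sketch, assuming that "absent" simply means no protein sequence is listed for that species in this report; the names `SPECIES`, `FOUND` and `absent_species` are hypothetical.

```python
SPECIES = ["Human", "Chimpanzee", "Mouse", "Chicken", "Zebrafish", "Drosophila"]

# Species for which a protein sequence is listed in question 1 of this report.
FOUND = {
    "COL1A1": {"Human", "Chimpanzee", "Mouse", "Chicken", "Zebrafish"},
    "COL3A1": {"Human", "Chimpanzee", "Mouse", "Chicken"},
    "COL5A1": {"Human", "Chimpanzee", "Mouse", "Chicken", "Zebrafish"},
    "COL4A2": {"Human", "Chimpanzee", "Mouse", "Chicken", "Zebrafish"},
}


def absent_species(found):
    """Return, per gene, the species (in SPECIES order) with no sequence listed."""
    return {
        gene: [s for s in SPECIES if s not in present]
        for gene, present in found.items()
    }


if __name__ == "__main__":
    for gene, missing in absent_species(FOUND).items():
        print(f"{gene}: absent in {', '.join(missing)}")
```

A missing entry in such a table only shows that no sequence was retrieved; it does not by itself prove that the organism lacks the gene.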

Use CLUSTALW for multiple sequence alignment with default parameters.
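
To illustrate what the alignment output is used for, the sketch below marks fully conserved columns and computes percent identity to the human sequence. It does not run ClustalW itself. As an assumption, it takes the last 12 residues of each COL1A1 sequence from question 1 and treats them as already aligned, because these fragments are ungapped and equal in length; a real ClustalW alignment may contain gaps. The names `ALIGNED`, `conservation_line` and `percent_identity` are hypothetical.

```python
# C-terminal 12 residues of COL1A1, copied from the sequences in question 1.
# Assumption: ungapped fragments of equal length are treated as pre-aligned.
ALIGNED = {
    "Human": "EFGFDVGPVCFL",
    "Chimpanzee": "EFGFDVGPVCFL",
    "Mouse": "EFGLDIGPACFV",
    "Chicken": "EFGIDIGPVCFL",
    "Zebrafish": "EFGIEVGPVCFL",
}


def conservation_line(sequences):
    """Mark fully conserved columns with '*' and all other columns with ' '."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return "".join("*" if len(set(col)) == 1 else " " for col in zip(*sequences))


def percent_identity(a, b):
    """Percentage of columns where two equal-length aligned sequences match.

    Gap handling is deliberately left out of this sketch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    for name, seq in ALIGNED.items():
        print(f"{name:<11}{seq}")
    print(f"{'':<11}{conservation_line(list(ALIGNED.values()))}")
    for name, seq in ALIGNED.items():
        print(f"{name}: {percent_identity(ALIGNED['Human'], seq):.1f}% identity to Human")
```

The '*' line mirrors the conservation marks printed under a ClustalW alignment, although ClustalW also marks conservative and semi-conservative substitutions, which this sketch ignores.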

3. Analyze the multiple sequence alignment result.


Can you identify a conserved domain of collagen in these 5 species? If yes, what is its name?

COL1A1:
COL3A1:
COL5A1:
COL4A2:

4. What do the colors mean in protein alignment?
The colors in a protein alignment indicate the physicochemical nature of each amino acid
residue, for example whether it is hydrophobic, acidic or basic. In some coloring schemes a
residue is assigned a color only if the amino acid profile of the alignment at that position
meets minimum criteria specific to that residue type.
The multiple sequence alignment colors are as follows:
Red: small and hydrophobic amino acids, including aromatic residues except tyrosine (A, V, F, P, M, I, L, W)
Blue: acidic amino acids (D, E)
Magenta: basic amino acids except histidine (R, K)
Green: hydroxyl, sulfhydryl and amine amino acids, plus glycine (S, T, Y, H, C, N, G, Q)
Gray: unusual amino or imino acids
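
The color classes can be written as a simple lookup table. This is a sketch only: the one-letter residue groups are an assumption taken from the color table that usually accompanies ClustalW output, and the names `COLOR_GROUPS`, `color_of` and `color_sequence` are hypothetical.

```python
# Residue groups per color (assumed from the usual ClustalW color table).
COLOR_GROUPS = {
    "red": "AVFPMILW",      # small + hydrophobic (incl. aromatic, except Y)
    "blue": "DE",           # acidic
    "magenta": "RK",        # basic (except H)
    "green": "STYHCNGQ",    # hydroxyl + sulfhydryl + amine + G
}

# Invert the table so each residue maps to its color.
RESIDUE_COLOR = {
    residue: color
    for color, residues in COLOR_GROUPS.items()
    for residue in residues
}


def color_of(residue):
    """Return the color class of a one-letter residue; anything else is gray."""
    return RESIDUE_COLOR.get(residue.upper(), "gray")


def color_sequence(sequence):
    """Pair every residue of a sequence with its color class."""
    return [(residue, color_of(residue)) for residue in sequence]


if __name__ == "__main__":
    # Last eight residues of human COL1A1 from question 1.
    for residue, color in color_sequence("DVGPVCFL"):
        print(residue, color)
```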

5. Analyze the multiple sequence alignment result.


Can you identify a conserved domain of collagen in these 5 species? If yes, what is its name?

Yes. COLFI (the fibrillar collagen C-terminal domain) is the conserved domain superfamily,
with a high level of residue conservation.

Conserved domains on: collagen alpha-1(I) chain preproprotein [Homo sapiens]
VWC and COLFI domain-containing protein
