
"Exploring the Structural Insights, Evolutionary
Relationships, and Functional Significance of
PWL2 Gene in Pyricularia oryzae: A Bioinformatics
Approach"

Project report submitted for the fulfilment of

Submitted by
MD Ashfaque Molla
Department of Botany
Dr. A.P.J. Abdul Kalam Govt. College
Kolkata- 700135

CONTENTS
Abstract
Introduction
Materials and methods
Results & Discussion
Conclusion
References

Abstract
This bioinformatics project analyzes PWL2, an avirulence gene of the rice blast fungus Pyricularia
oryzae. The aim of the study is to gain a better understanding of the structure and evolution of the
PWL2 gene. Bioinformatics software, web servers, and databases were used to predict the 3D
structure of the encoded protein, to validate that structure with a Ramachandran plot, and to trace
the evolution of PWL2 across closely related organisms by building phylogenetic trees. Nucleotide
and protein sequences with maximum homology were retrieved from different organisms using
nucleotide and protein BLAST, and all retrieved sequences were aligned. The results provide
insight into the role of the PWL2 gene in biological processes and may serve as a basis for further
experimental investigations.

INTRODUCTION
Many fungi possess Avirulence (Avr) genes that establish a gene-for-gene relationship with their host
plants. These genes act as unique genetic determinants, preventing the fungi from causing disease in
plants that carry corresponding resistance (R) genes. Interaction between elicitors derived from Avr
genes, both primary and secondary products, and host receptors in resistant plants triggers various
defense responses, often involving a hypersensitive response. PWL1 and PWL2 are Avr genes of
Magnaporthe grisea. (Richard Laugé, Pierre J.G.M. De Wit, 1998)

PWL2 (for Pathogenicity toward Weeping Lovegrass (Sweigard, James A., et al., 1995)) belongs to a
gene family that includes three other putative effectors: PWL1, PWL3, and PWL4. PWL1 has also
been implicated in the incompatible reaction of M. oryzae against weeping lovegrass. The PWL2
gene encodes a protein of 145 amino acids with a molecular weight of 16.17 kDa. An allele of
PWL2, termed a divergent pwl2 allele, was unable to confer avirulence. This allele resulted from a
guanine-to-adenine substitution, causing an amino acid change from aspartic acid to asparagine at
residue 90. The normal PWL2 gene product has the amino acid sequence DKS, while the divergent
allele alters it to NKS, which is a putative signal sequence for glycosylation. Different alleles of
PWL2 associated with virulence towards weeping lovegrass exhibit high polymorphism depending
on the isolate and geographic origin. Although field isolates with spontaneous PWL2 deletions do not
show any known fitness issues, most of the field isolates possess one or two copies of the gene. The
exact role of PWL2 in rice blast disease is unclear, but its high prevalence in rice blast field isolates
suggests a potential function during plant infection. (Were, Vincent Mbashira, 2018)
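
The DKS to NKS change described above creates the N-X-S/T pattern that is commonly used to predict N-linked glycosylation sites. The following is a minimal Python sketch of that check, assuming the usual sequon rule (asparagine, then any residue except proline, then serine or threonine). The function name `find_sequons` is illustrative, and the two eight-residue windows are assumed to cover residues 87-94; they were obtained by translating the corresponding codons of the U26313.1 and MG787158.1 rows in the nucleotide alignment of this report.

```python
import re

# Sequon rule assumed here: N, then any residue except P, then S or T.
# A lookahead is used so that overlapping sequons would also be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


avirulent_window = "HQEDKSDR"  # residues 87-94 with aspartic acid at residue 90
divergent_window = "HQENKSDR"  # same window with the D90N substitution

WINDOW_OFFSET = 86  # residue 87 is position 1 of each window

for label, window in [("PWL2", avirulent_window),
                      ("divergent allele", divergent_window)]:
    sites = [pos + WINDOW_OFFSET for pos in find_sequons(window)]
    print(label, "predicted sequon start residues:", sites)
```

A pattern match of this kind only flags a candidate site; whether the asparagine is actually glycosylated has to be shown experimentally.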

The rice blast fungus (Magnaporthe grisea) was studied to understand its host specificity. Genetic
analysis identified a key gene, PWL2, which significantly affects the fungus’s ability to infect
weeping lovegrass (Eragrostis curvula). The non-pathogenic allele of PWL2 was genetically
unstable, frequently giving rise to spontaneous pathogenic mutants. PWL2 was cloned using its map
position, with guidance from large deletions found in pathogenic mutants. Transformants carrying
the cloned PWL2 gene lost pathogenicity toward weeping lovegrass but remained fully pathogenic
toward other host plants. Therefore, PWL2 functions similarly to classical avirulence genes,
preventing infection of specific cultivars of a host species. The PWL2 gene encodes a hydrophilic,
glycine-rich protein (16 kD) with a putative secretion signal sequence. In the mapping population,
the pathogenic allele PWL2-2 differed from PWL2 by a single base pair substitution that resulted in
loss of function. The PWL2 locus exhibits high polymorphism among rice pathogens in different
geographic locations. (Sweigard, James A., et al., 1995)

The ability of M. oryzae strains to infect weeping lovegrass is controlled by PWL2, initially
identified in the laboratory strain 4360. This strain is derived from a genetic cross between two rice
pathogenic laboratory strains. One parent strain, 4224-7-8, infects weeping lovegrass but lacks PWL2,
while the other strain, 6043, is non-pathogenic and possesses the PWL2 locus. In the genetic cross,
each of the five tetrads produced four ascospore progenies that were pathogenic on weeping lovegrass
and four that were non-pathogenic, indicating single-gene segregation for the ability to infect weeping
lovegrass. Spontaneous mutant strains lacking PWL2 were also capable of infecting weeping
lovegrass, suggesting that PWL2 determines the pathogenicity of M. oryzae strains towards this host.
When PWL2-deficient strains were transformed with the cloned PWL2 gene, their pathogenicity
towards weeping lovegrass was lost, while they retained pathogenicity towards barley and rice
cultivars. This suggests that M. oryzae strains did not have a general defect in their ability to infect
plants but were avirulent on weeping lovegrass due to the presence of PWL2. (Were, Vincent
Mbashira, 2018)

Bioinformatics analysis plays a crucial role in understanding the structure, function, and
evolutionary aspects of genes. In the case of the PWL2 gene, which is a species-specific Avirulence
(Avr) gene in the pathogenic fungus Pyricularia oryzae, bioinformatics tools and techniques are
instrumental in unravelling its complexities and exploring its significance. (Hernández-Domínguez,
Edna María, et al., 2020)

Firstly, bioinformatics aids in the identification and annotation of the PWL2 gene by analyzing the
genome sequence of Pyricularia oryzae. This involves using specialized algorithms to search for
sequences with homology to known Avr genes. By comparing the PWL2 gene sequence with other
related genes or proteins, bioinformatics tools can help determine its structural features. (Chen,
Chenxi, 2013)

Sequence analysis is another important aspect of bioinformatics in studying the PWL2 gene. Multiple
sequence alignment and phylogenetic analysis allow researchers to compare the PWL2 gene across
different strains or related species of Pyricularia. These analyses provide insights into the genetic
variations and evolutionary relationships of the PWL2 gene, shedding light on its origin,
diversification, and potential co-evolution with host resistance genes. (Zhong, Zhenhui, et al.,
2016; Peng, Zhao, et al., 2019)

Structural prediction and modelling are additional bioinformatics approaches applied to the study of
the PWL2 gene. These methods employ computational algorithms and databases to predict the three-
dimensional structure of the PWL2 protein based on its amino acid sequence. Structural models can
provide valuable information about the protein's functional sites, ligand binding regions, and
potential protein-protein interactions, aiding in the understanding of its role in pathogenicity and host
recognition. (Jambon, Martin, et al., 2003)

MATERIALS AND METHODS


Nucleotide Sequence:
• Retrieve Nucleotide Sequence: First, the complete CDS of the PWL2 gene of Pyricularia oryzae
was retrieved from the NCBI Nucleotide database and downloaded in FASTA format. The
nucleotide sequence is 438 bp long.
( https://www.ncbi.nlm.nih.gov/nuccore/MN072513.1 )

• Performing BLAST: Nucleotide BLAST was then performed on the NCBI website.

• Multiple Sequence Alignment: 24 nucleotide sequences showing maximum homology were
retrieved from the NCBI BLAST results. Multiple sequence alignment was then performed on all
the retrieved sequences using the CLUSTALW web server.

• Phylogenetic Tree Build: The retrieved nucleotide sequences were loaded into MEGA software
(v.11), and a phylogenetic tree was built with it.

The PWL2 nucleotide sequence of Pyricularia oryzae in FASTA format is given below:

Pyricularia oryzae isolate BJM-1 PWL2 gene, complete cds
GenBank: MN072513.1

>MN072513.1 Pyricularia oryzae isolate BJM-1 PWL2 gene, complete cds
ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGTTCTTTTTTTCGACCACTGTAACCGCCGGTGGCG
GGTGGACTAACAAACAGTTTTACAACGACAAAGGCGAAAGAGAGGGCTCAATTTCAATTAGAAAGGGCTC
GGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGGCCTGATAGGATGGTACGGGTTCATGAAAAC
AACGGCAACATCCGCGGGATGCCCCCGGGATATTCTCTAGGCCCTGATCATCAGCAAGATCAAACCGATC
GTCAATATTATAACAGGCACGGATATCATGTTGGTGATGGACCCGCCGAATACGGAAATCACGGAGGCGG
GCAATGGGGCGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGAG
GGCTGCAATATTATGTAA
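
A quick sanity check on the retrieved record can be scripted. The following is a minimal Python sketch, assuming the FASTA text above has been pasted into a string. The helper names `read_fasta` and `cds_summary` are illustrative and are not part of any NCBI tool.

```python
def read_fasta(text):
    """Parse FASTA text into a {header: sequence} dict (header without '>')."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()
    return records


def cds_summary(seq):
    """Basic checks that a nucleotide sequence looks like a complete CDS."""
    return {
        "length": len(seq),
        "multiple_of_three": len(seq) % 3 == 0,
        "starts_with_ATG": seq.startswith("ATG"),
        "ends_with_stop": seq[-3:] in {"TAA", "TAG", "TGA"},
        "gc_percent": round(100 * (seq.count("G") + seq.count("C")) / len(seq), 1),
    }


FASTA = """>MN072513.1 Pyricularia oryzae isolate BJM-1 PWL2 gene, complete cds
ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGTTCTTTTTTTCGACCACTGTAACCGCCGGTGGCG
GGTGGACTAACAAACAGTTTTACAACGACAAAGGCGAAAGAGAGGGCTCAATTTCAATTAGAAAGGGCTC
GGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGGCCTGATAGGATGGTACGGGTTCATGAAAAC
AACGGCAACATCCGCGGGATGCCCCCGGGATATTCTCTAGGCCCTGATCATCAGCAAGATCAAACCGATC
GTCAATATTATAACAGGCACGGATATCATGTTGGTGATGGACCCGCCGAATACGGAAATCACGGAGGCGG
GCAATGGGGCGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGAG
GGCTGCAATATTATGTAA
"""

records = read_fasta(FASTA)
pwl2_cds = next(iter(records.values()))
summary = cds_summary(pwl2_cds)
print(summary)
```

The length check corresponds to the 438 bp stated in the methods; the other fields simply confirm that the record begins with a start codon and ends with a stop codon.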

Protein Sequence:
• Retrieve Protein Sequence: First, the PWL2 protein sequence of Pyricularia oryzae was retrieved
from the NCBI Protein database and downloaded in FASTA format. The protein sequence is
145 aa long.
( https://www.ncbi.nlm.nih.gov/protein/1904992618 )
• Performing BLAST: Protein BLAST was then performed on the NCBI website.
• Multiple Sequence Alignment: 28 protein sequences showing maximum homology were
retrieved. Multiple sequence alignment was then performed on all the retrieved sequences
using the CLUSTALW web server.
• 3D Structure Building & Validation: The retrieved protein sequence in FASTA format was
uploaded to the SWISS-MODEL web server, and a 3D structure of the PWL2 protein was built.
The model was then downloaded from SWISS-MODEL in PDB format and submitted to
PROCHECK on the UCLA-DOE LAB SAVES server (v6.0) for validation with a
Ramachandran plot.
• Phylogenetic Tree Build: The retrieved protein sequences were loaded into MEGA software
(v.11), and a phylogenetic tree was built with it.
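
A Ramachandran plot is drawn from the backbone torsion angles phi and psi of each residue. The following is a minimal Python sketch of how such an angle can be computed from four atom positions, assuming the coordinates have already been read from the PDB file. The function names and the toy coordinates are illustrative; this is not PROCHECK's own code.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi_psi(prev_c, n, ca, c, next_n):
    """phi uses C(i-1), N, CA, C; psi uses N, CA, C, N(i+1)."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)


# Toy coordinates (not taken from the PWL2 model), chosen so the angle is easy to see.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

Each residue then contributes one (phi, psi) point to the plot, and validation tools report how many of those points fall in favoured, allowed, and disallowed regions.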

The PWL2 protein sequence of Pyricularia oryzae in FASTA format is given below:

PWL2 [Pyricularia oryzae]
GenBank: QNS36448.1

>QNS36448.1 PWL2 [Pyricularia oryzae]
MKCNNIILPFALFFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPDRMVRVHEN
NGNIRGMPPGYSLGPDHQQDQTDRQYYNRHGYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREE
GCNIM
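
The nucleotide and protein records can be cross-checked by translating the coding sequence and comparing the result with the protein sequence. The following is a minimal Python sketch, assuming the standard genetic code (NCBI translation table 1) applies. `translate` is an illustrative helper, not a library function.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Sequences copied from the two FASTA records given in this report.
CDS = (
    "ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGTTCTTTTTTTCGACCACTGTAACCGCCGGTGGCG"
    "GGTGGACTAACAAACAGTTTTACAACGACAAAGGCGAAAGAGAGGGCTCAATTTCAATTAGAAAGGGCTC"
    "GGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGGCCTGATAGGATGGTACGGGTTCATGAAAAC"
    "AACGGCAACATCCGCGGGATGCCCCCGGGATATTCTCTAGGCCCTGATCATCAGCAAGATCAAACCGATC"
    "GTCAATATTATAACAGGCACGGATATCATGTTGGTGATGGACCCGCCGAATACGGAAATCACGGAGGCGG"
    "GCAATGGGGCGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGAG"
    "GGCTGCAATATTATGTAA"
)
PROTEIN = (
    "MKCNNIILPFALFFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPDRMVRVHEN"
    "NGNIRGMPPGYSLGPDHQQDQTDRQYYNRHGYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREE"
    "GCNIM"
)

translated = translate(CDS)
print(len(translated), translated == PROTEIN)
```

Agreement between the two confirms that the 438 bp CDS (145 codons plus a stop codon) and the 145 aa protein describe the same gene product.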

RESULTS & DISCUSSION


• Multiple Sequence Alignment:

Multiple gene sequence alignment analysis is a powerful bioinformatics tool that allows researchers to
compare and analyze the similarities and differences among multiple gene sequences. This analysis
provides valuable insights into the evolutionary relationships, functional domains, and conserved regions of
genes, leading to a deeper understanding of their structure and function.

In this study, I performed multiple gene & protein sequence alignment analysis using 24 nucleotide & 28
protein homologous sequences retrieved from NCBI BLAST. The goal was to identify conserved regions
and patterns across these sequences, which can provide important clues about their functional significance.

The alignment was carried out using the widely used CLUSTALW web server, which implements a
progressive alignment algorithm: it first scores every pair of sequences, builds a guide tree from those
scores, and then aligns the sequences in the order given by the tree so as to maximize overall similarity.
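
The pairwise scores in a CLUSTALW log are commonly described as percent identity: the number of identical positions divided by the number of positions compared, with gap positions excluded. The following is a minimal Python sketch of that calculation on two already aligned rows, under that assumption. `percent_identity` is an illustrative name, and the two rows are the first 60 columns of MN072510.1 and MG787158.1 from the nucleotide alignment in this report.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent identity between two aligned rows, ignoring columns with a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = identical = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # gap columns are not counted as compared positions
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


mn072510 = "ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGTTCTTTTTTTCGACCACTGTAACC"
mg787158 = "ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACT"

print(round(percent_identity(mn072510, mg787158), 2))
```

A score computed on a 60-column window will differ from the full-length score reported by the server, because the full score is based on the complete pairwise alignment.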

Upon analyzing the aligned sequences, several interesting observations were made. First, I identified highly
conserved regions, indicated by residues that were identical or showed strong similarity across the
sequences. These conserved regions are likely to play crucial roles in the protein's structure or function, and
their identification can guide future experimental studies.
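
Fully conserved columns are the ones marked with an asterisk under each CLUSTAL alignment block. The following is a minimal Python sketch of how such a marker line can be derived from aligned rows, assuming all rows have equal length. `conservation_line` is an illustrative name and the three short rows are toy data, not PWL2 sequences.

```python
def conservation_line(rows, gap="-"):
    """Return a marker string with '*' where every row has the same non-gap residue."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        residues = set(column)
        fully_conserved = len(residues) == 1 and gap not in residues
        marks.append("*" if fully_conserved else " ")
    return "".join(marks)


toy_rows = ["ATG-CA", "ATGTCA", "ATCTCA"]
marker = conservation_line(toy_rows)
for row in toy_rows:
    print(row)
print(marker)
```

Runs of consecutive asterisks correspond to the conserved regions discussed above, while blank positions mark substitutions or gaps.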

Furthermore, I observed variations or gaps in certain regions, indicating sequence divergence among the
analyzed genes. These variations may indicate species-specific adaptations or functional differences among
the gene products. Exploring these variations can provide insights into the evolution and diversification of
the gene family.

The multiple gene sequence alignment analysis also provided a basis for phylogenetic inference. By
comparing the aligned sequences, I constructed a phylogenetic tree to depict the evolutionary relationships
among the genes and the species from which they were derived. This tree revealed patterns of divergence,
speciation events, and possible gene duplication events, shedding light on the evolutionary history of the
gene family.
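
The report does not state which tree-building method or substitution model was selected in MEGA. Purely as an illustration of how an alignment is turned into the distances that distance-based tree methods work from, the following Python sketch computes the proportion of differing sites (p-distance) and its Jukes-Cantor correction. Both function names are illustrative, and the choice of the Jukes-Cantor model is an assumption made for this example only.

```python
import math


def p_distance(row_a, row_b, gap="-"):
    """Proportion of differing sites between two aligned rows, ignoring gap columns."""
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3) for nucleotide data."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


p = p_distance("ACGTACGT", "ACGTACGA")  # toy rows: one difference in eight sites
print(p, round(jukes_cantor(p), 4))
```

The correction grows faster than the raw proportion because it accounts for multiple substitutions at the same site, which matters most for the more divergent sequence pairs.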

The multiple gene sequence alignment analysis provided valuable insights into the conserved regions,
variations, functional domains, and evolutionary relationships among the analyzed gene sequences. This
analysis serves as a foundation for further experimental investigations, such as functional studies and
comparative genomics, ultimately advancing our understanding of gene structure, function, and evolution.

• Multiple Nucleotide Sequence Alignment:


CLUSTAL 2.1 Multiple Sequence Alignments

Sequence type explicitly set to DNA
Sequence format is Pearson
Sequence 1: MN072510.1_1-438 438 bp
Sequence 2: XM_031122648.1_1-438 438 bp
Sequence 3: MT669815.1_726-1163 438 bp
Sequence 4: XM_003712998.1_1-438 438 bp
Sequence 5: U26313.1_590-1027 438 bp
Sequence 6: MG787166.1_1-438 438 bp
Sequence 7: MG787165.1_1-438 438 bp
Sequence 8: MG787153.1_1-438 438 bp
Sequence 9: MG787163.1_1-438 438 bp
Sequence 10: MG787162.1_1-438 438 bp
Sequence 11: MG787160.1_1-438 438 bp
Sequence 12: MG787159.1_1-438 438 bp
Sequence 13: MG787157.1_1-438 438 bp
Sequence 14: MG787156.1_1-438 438 bp
Sequence 15: MG787155.1_1-438 438 bp
Sequence 16: MG787154.1_1-438 438 bp
Sequence 17: MG787158.1_1-438 438 bp
Sequence 18: XM_031132118.1_12-447 436 bp
Sequence 19: XM_031127161.1_1-284 284 bp
Start of Pairwise alignments
Aligning...

Pairwise alignment scores (each row lists the scores of one sequence against the higher-numbered sequences, in order):
Sequence 1 vs. 2-19: 99 99 98 98 98 98 98 97 97 97 97 97 97 97 97 97 97 83
Sequence 2 vs. 3-19: 98 97 97 97 97 97 97 97 97 97 97 97 97 97 97 97 82
Sequence 3 vs. 4-19: 99 99 99 99 99 98 98 98 98 98 98 98 98 98 98 83
Sequence 4 vs. 5-19: 100 99 99 99 99 99 99 99 99 99 99 99 99 99 83
Sequence 5 vs. 6-19: 99 99 99 99 99 99 99 99 99 99 99 99 99 83
Sequence 6 vs. 7-19: 99 99 99 99 99 99 99 99 99 99 99 99 83
Sequence 7 vs. 8-19: 99 99 99 99 99 99 99 99 99 99 99 83
Sequence 8 vs. 9-19: 99 99 99 99 99 99 99 99 99 99 83
Sequence 9 vs. 10-19: 99 99 99 99 99 99 99 99 98 83
Sequence 10 vs. 11-19: 99 99 99 99 99 99 99 98 83
Sequence 11 vs. 12-19: 99 99 99 99 99 99 98 83
Sequence 12 vs. 13-19: 99 99 99 99 99 98 83
Sequence 13 vs. 14-19: 99 99 99 99 98 83
Sequence 14 vs. 15-19: 99 99 99 98 83
Sequence 15 vs. 16-19: 99 99 98 83
Sequence 16 vs. 17-19: 99 98 83
Sequence 17 vs. 18-19: 98 83
Sequence 18 vs. 19: 82

Guide tree file created: [clustalw.dnd]
There are 18 groups
Start of Multiple Alignment
Aligning...
Group 1: Sequences: 2 Score:8284
Group 2: Sequences: 3 Score:8217
Group 3: Sequences: 2 Score:8303
Group 4: Sequences: 3 Score:8303
Group 5: Sequences: 4 Score:8302
Group 6: Sequences: 5 Score:8302
Group 7: Sequences: 6 Score:8312
Group 8: Sequences: 7 Score:8295
Group 9: Sequences: 8 Score:8302
Group 10: Sequences: 9 Score:8302
Group 11: Sequences: 10 Score:8302
Group 12: Sequences: 2 Score:8246
Group 13: Sequences: 3 Score:8246
Group 14: Sequences: 13 Score:8229
Group 15: Sequences: 14 Score:8260
Group 16: Sequences: 15 Score:8261
Group 17: Sequences: 18 Score:8116
Group 18: Sequences: 19 Score:4627
Alignment Score 166233
CLUSTAL-Alignment file

clustalw.dnd
(((((((((((((((
MN072510.1_1-438:0.00000,
XM_031122648.1_1-438:0.00686):0.00885,
MT669815.1_726-1163:0.00028):0.00260,
XM_031127161.1_1-284:0.15837):0.00410,
XM_003712998.1_1-438:0.00018):0.00000,
U26313.1_590-1027:0.00000):0.00014,
((MG787166.1_1-438:0.00227,
XM_031132118.1_12-447:0.00691):0.00000,
MG787165.1_1-438:0.00227):0.00000):0.00205,
MG787162.1_1-438:0.00211):0.00000,
MG787156.1_1-438:0.00219):0.00000,
MG787155.1_1-438:0.00226):0.00000,
MG787158.1_1-438:0.00456):0.00000,
MG787153.1_1-438:0.00000):0.00000,
MG787163.1_1-438:0.00228):0.00000,
MG787160.1_1-438:0.00228):0.00000,
MG787159.1_1-438:0.00228):0.00000,
MG787157.1_1-438:0.00228,
MG787154.1_1-438:0.00228);
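
The guide tree file is written in Newick format, in which each leaf appears as `name:branch_length` directly after an opening parenthesis or a comma. The following is a minimal Python sketch that lists the leaves and their terminal branch lengths with a regular expression. `leaf_branch_lengths` is an illustrative name, and the three-leaf tree is a shortened example built from the first entries of the guide tree, not the full file. A dedicated tree library would be the more robust choice for real analyses.

```python
import re

# A leaf label follows "(" or "," and is itself followed by ":<length>".
# Internal branch lengths follow ")" and are therefore not matched.
LEAF = re.compile(r"(?<=[(,])\s*([^\s(),:;]+):([0-9.]+(?:[eE][-+]?[0-9]+)?)")


def leaf_branch_lengths(newick):
    """Return {leaf_name: terminal_branch_length} in order of appearance."""
    return {name: float(length) for name, length in LEAF.findall(newick)}


example_tree = (
    "((MN072510.1_1-438:0.00000,XM_031122648.1_1-438:0.00686):0.00885,"
    "MT669815.1_726-1163:0.00028);"
)

for name, length in leaf_branch_lengths(example_tree).items():
    print(f"{name}\t{length:.5f}")
```

Sorting the leaves by branch length is a quick way to spot the most divergent sequence in a guide tree before the tree is inspected graphically.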

CLUSTAL O(1.2.4) multiple sequence alignment:


XM_031127161.1:1-284 ------------------------------------------------------------ 0
MN072510.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGTTCTTTTTTTCGACCACTGTAACC 60
XM_031122648.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGTTCTTTTTTTCGACCACTGTAACC 60
XM_031132118.1:12-447 --GAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 58
MT669815.1:726-1163 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGTTCTTTTTTTCGACCACTGTAACC 60
MG787158.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACT 60
MG787154.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787155.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787156.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787157.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787159.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787160.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787162.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787163.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787153.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787165.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
XM_003712998.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
U26313.1:590-1027 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60
MG787166.1:1-438 ATGAAATGCAACAACATCATCCTCCCTTTTGCTTTGGTCTTTTTTTCGACCACTGTAACC 60

XM_031127161.1:1-284 ------------------------------------------------------------ 0
MN072510.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAGTTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
XM_031122648.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAGTTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
XM_031132118.1:12-447 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 118
MT669815.1:726-1163 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787158.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787154.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787155.1:1-438 GCCGGTGGCGGGTGGACTAACAAGCAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787156.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787157.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787159.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787160.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787162.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787163.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787153.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787165.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
XM_003712998.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
U26313.1:590-1027 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120
MG787166.1:1-438 GCCGGTGGCGGGTGGACTAACAAACAATTTTACAACGACAAAGGCGAAAGAGAGGGCTCA 120

XM_031127161.1:1-284 -------------------------------------ATGGCCCTGGTCATCCTGGAGGG 23
MN072510.1:1-438 ATTTCAATTAGAAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
XM_031122648.1:1-438 ATTTCAATTAAAAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
XM_031132118.1:12-447 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 178
MT669815.1:726-1163 ATTTCAATTAGAAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787158.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787154.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787155.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787156.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787157.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787159.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787160.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787162.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787163.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787153.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787165.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
XM_003712998.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
U26313.1:590-1027 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
MG787166.1:1-438 ATTTCAATTAGGAAGGGCTCGGAAGGCGATTTTAACTATGGCCCCAGTTATCCTGGAGGG 180
******* ** ***********

XM_031127161.1:1-284 CGCGATGGGATGGTGCGGGTTTATGCGAACAATGGCGACATCCGCGGGATGCCTCCGCGA 83
MN072510.1:1-438 CCTGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
XM_031122648.1:1-438 CCTAATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCACGGGA 240
XM_031132118.1:12-447 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGTAACGTCCGCGGGATGCCCCCGGGA 238
MT669815.1:726-1163 CCTGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787158.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787154.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACAGCAACATCCGCGGGATGCCCCCGGGA 240
MG787155.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787156.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787157.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787159.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCAGGATGCCCCCGGGA 240
MG787160.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787162.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787163.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787153.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787165.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
XM_003712998.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
U26313.1:590-1027 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
MG787166.1:1-438 CCCGATAGGATGGTACGGGTTCATGAAAACAACGGCAACATCCGCGGGATGCCCCCGGGA 240
* ** ******* ****** *** ***** * ** ***** ******* ** **

XM_031127161.1:1-284 TACCCTCTACACCGTGACCCTGCGGAAGATCAAAACGATCAGCAATACTATAACAGGAAC 143
MN072510.1:1-438 TATTCTCTAGGCCCTGATCATCAGCAAGATCAAACCGATCGTCAATATTATAACAGGCAC 300
XM_031122648.1:1-438 TATTCTCTAGGCCCTGATCATCAGCAAGATCAAACCGATCGTCAATATTATAACAGGCAC 300
XM_031132118.1:12-447 TATTCTCTAGGCCCTGATCATCAGGAAGATAAAAGCGATCGTCAGTATTATAACAGGCAC 298
MT669815.1:726-1163 TATTCTCTAGGCCCTGATCATCAGGAAGATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787158.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787154.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787155.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787156.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787157.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787159.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787160.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787162.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787163.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787153.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAAATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787165.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAGATAAAAGCGATCGTCAATATTATAACAGGCAC 300
XM_003712998.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAGATAAAAGCGATCGTCAATATTATAACAGGCAC 300
U26313.1:590-1027 TATTCTCTAGGCCCTGATCATCAGGAAGATAAAAGCGATCGTCAATATTATAACAGGCAC 300
MG787166.1:1-438 TATTCTCTAGGCCCTGATCATCAGGAAGATAAAAGCGATCGTCAATATTATAACAGGCAC 300
** ***** ** *** * * * ** ** *** ***** ** ** ********* **

XM_031127161.1:1-284 GGATATCATGTTGGTGATGGACCCGCCCGAATACGGAACGCATGGAGCCGGGCATTGGGG 203
MN072510.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
XM_031122648.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
XM_031132118.1:12-447 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 357
MT669815.1:726-1163 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787158.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787154.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787155.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787156.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787157.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCGCGGAGGCGGGCAATGGGG 359
MG787159.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787160.1:1-438 GGATATCACGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787162.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787163.1:1-438 GGATACCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787153.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787165.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCGATGGGG 359
XM_003712998.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
U26313.1:590-1027 GGATATCATGTTGGTGATGGACCCGCCG-AATACGGAAATCACGGAGGCGGGCAATGGGG 359
MG787166.1:1-438 GGATATCATGTTGGTGATGGACCCGCCG-AACACGGAAATCACGGAGGCGGGCAATGGGG 359
***** ** ****************** ** ****** * **** ***** *****

XM_031127161.1:1-284 CGAGGGATATTCTGGACCACCAGGCAAGTTTACACATGAGCACGGCGAACAGCGAGGAGA 263
MN072510.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
XM_031122648.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
XM_031132118.1:12-447 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 417
MT669815.1:726-1163 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787158.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGGGCACCGTGAACAGCGAGAAGA 419
MG787154.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787155.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787156.1:1-438 CGACGGATATTATGGACCACCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787157.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787159.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787160.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787162.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787163.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787153.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787165.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
XM_003712998.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
U26313.1:590-1027 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
MG787166.1:1-438 CGACGGATATTATGGACCGCCAGGCGAGTTTACACATGAGCACCGTGAACAGCGAGAAGA 419
*** ******* ****** ****** ************ **** * ********** ***

XM_031127161.1:1-284 AGATAGCTGCAACATTATGTA 284


MN072510.1:1-438 GGGCTGCAATATTATGTAA-- 438
XM_031122648.1:1-438 GGGCTGCAATATTATGTAA-- 438
XM_031132118.1:12-447 GGGCTGCAATATTATGTAA-- 436
MT669815.1:726-1163 GGGCTGCAATATTATGTAA-- 438
MG787158.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787154.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787155.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787156.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787157.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787159.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787160.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787162.1:1-438 GAGCTGCAATATTATGTAA-- 438
MG787163.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787153.1:1-438 GGGCTGCAATATTATGTAA-- 438
MG787165.1:1-438 GGGCTGCAATATTATGTAA-- 438
XM_003712998.1:1-438 GGGCTGCAATATTATGTAA-- 438
U26313.1:590-1027 GGGCTGCAATATTATGTAA-- 438
MG787166.1:1-438 GGGCTGCAATATTATGTAA-- 438
** * **
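In Clustal output, the line of asterisks under each block marks the columns in which every sequence carries the same base. The following is a minimal sketch of how such a line can be derived from aligned rows. The function name `conservation_line` is hypothetical, and only three rows from the final block above are used, so the result illustrates the rule rather than reproducing the 19-sequence line.

```python
def conservation_line(rows):
    """Return a string with '*' under every column where all rows share one non-gap symbol."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*rows):
        # A column counts as conserved only if it holds a single symbol and that symbol is not a gap.
        same = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if same else " ")
    return "".join(marks)


# Three rows taken from the last alignment block (illustrative subset only).
rows = [
    "GGGCTGCAATATTATGTAA--",  # MN072510.1
    "GAGCTGCAATATTATGTAA--",  # MG787162.1
    "AGATAGCTGCAACATTATGTA",  # XM_031127161.1
]
line = conservation_line(rows)
print(repr(line))
```

Columns containing a gap are never marked, which is why the two trailing gap columns stay blank.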

 Multiple Protein Sequences Alignment:


CLUSTALW Result

CLUSTAL 2.1 Multiple Sequence Alignments

Sequence type explicitly set to Protein
Sequence format is Pearson
Sequence 1: QNS36445.1_1-145 145 aa
Sequence 2: QMU24232.1_1-145 145 aa
Sequence 3: XP_030987775.1_1-145 145 aa
Sequence 4: XP_003713046.1_1-145 145 aa
Sequence 5: AYN79364.1_1-145 145 aa
Sequence 6: AYN79352.1_1-145 145 aa
Sequence 7: AYN79365.1_1-145 145 aa
Sequence 8: XP_030977270.1_4-148 145 aa
Sequence 9: AYN79361.1_1-145 145 aa
Sequence 10: AYN79353.1_1-145 145 aa
Sequence 11: AYN79356.1_1-145 145 aa
Sequence 12: AYN79358.1_1-145 145 aa
Sequence 13: AYN79357.1_1-145 145 aa
Sequence 14: QBZ53273.1_1-147 147 aa
Sequence 15: prf||2210377A_43-189 147 aa
Sequence 16: AAA80239.2_1-147 147 aa
Sequence 17: KAI6291589.1_1-131 131 aa
Sequence 18: AAA80240.1_1-130 130 aa
Sequence 19: AAA80241.1_1-131 131 aa
Sequence 20: KAI6353944.1_12-134 123 aa
Sequence 21: TLD19068.1_1-57 57 aa
Sequence 22: XP_030981816.1_4-131 128 aa
Sequence 23: KAI6344236.1_4-131 128 aa
Sequence 24: KAI7908610.1_1-102 102 aa
Sequence 25: XP_003711276.1_1-100 100 aa
Sequence 26: TLD14703.1_1-138 138 aa
Sequence 27: XP_029744010.1_1-69 69 aa
Sequence 28: XP_030981442.1_56-94 39 aa
Start of Pairwise alignments
Aligning...

Sequences (1:2) Aligned. Score: 97 Sequences (1:25) Aligned. Score: 39


Sequences (1:3) Aligned. Score: 97 Sequences (1:26) Aligned. Score: 36
Sequences (1:4) Aligned. Score: 97 Sequences (1:27) Aligned. Score: 50
Sequences (1:5) Aligned. Score: 96 Sequences (1:28) Aligned. Score: 71
Sequences (1:6) Aligned. Score: 96 Sequences (2:3) Aligned. Score: 95
Sequences (1:7) Aligned. Score: 96 Sequences (2:4) Aligned. Score: 99
Sequences (1:8) Aligned. Score: 95 Sequences (2:5) Aligned. Score: 98
Sequences (1:9) Aligned. Score: 95 Sequences (2:6) Aligned. Score: 98
Sequences (1:10) Aligned. Score: 95 Sequences (2:7) Aligned. Score: 98
Sequences (1:11) Aligned. Score: 95 Sequences (2:8) Aligned. Score: 97
Sequences (1:12) Aligned. Score: 95 Sequences (2:9) Aligned. Score: 97
Sequences (1:13) Aligned. Score: 95 Sequences (2:10) Aligned. Score: 97
Sequences (1:14) Aligned. Score: 74 Sequences (2:11) Aligned. Score: 97
Sequences (1:15) Aligned. Score: 73 Sequences (2:12) Aligned. Score: 97
Sequences (1:16) Aligned. Score: 73 Sequences (2:13) Aligned. Score: 97
Sequences (1:17) Aligned. Score: 57 Sequences (2:14) Aligned. Score: 75
Sequences (1:18) Aligned. Score: 50 Sequences (2:15) Aligned. Score: 74
Sequences (1:19) Aligned. Score: 58 Sequences (2:16) Aligned. Score: 74
Sequences (1:20) Aligned. Score: 44 Sequences (2:17) Aligned. Score: 57
Sequences (1:21) Aligned. Score: 70 Sequences (2:18) Aligned. Score: 51
Sequences (1:22) Aligned. Score: 43 Sequences (2:19) Aligned. Score: 58
Sequences (1:23) Aligned. Score: 43 Sequences (2:20) Aligned. Score: 45
Sequences (1:24) Aligned. Score: 45 Sequences (2:21) Aligned. Score: 70
Sequences (2:22) Aligned. Score: 44 Sequences (3:15) Aligned. Score: 71
Sequences (2:23) Aligned. Score: 44 Sequences (3:16) Aligned. Score: 71
Sequences (2:24) Aligned. Score: 45 Sequences (3:17) Aligned. Score: 55
Sequences (2:25) Aligned. Score: 40 Sequences (3:18) Aligned. Score: 50
Sequences (2:26) Aligned. Score: 37 Sequences (3:19) Aligned. Score: 56
Sequences (2:27) Aligned. Score: 50 Sequences (3:20) Aligned. Score: 44
Sequences (2:28) Aligned. Score: 71 Sequences (3:21) Aligned. Score: 68
Sequences (3:4) Aligned. Score: 95 Sequences (3:22) Aligned. Score: 43
Sequences (3:5) Aligned. Score: 94 Sequences (3:23) Aligned. Score: 43
Sequences (3:6) Aligned. Score: 94 Sequences (3:24) Aligned. Score: 45
Sequences (3:7) Aligned. Score: 94 Sequences (3:25) Aligned. Score: 38
Sequences (3:8) Aligned. Score: 93 Sequences (3:26) Aligned. Score: 35
Sequences (3:9) Aligned. Score: 93 Sequences (3:27) Aligned. Score: 49
Sequences (3:10) Aligned. Score: 93 Sequences (3:28) Aligned. Score: 71
Sequences (3:11) Aligned. Score: 93 Sequences (4:5) Aligned. Score: 99
Sequences (3:12) Aligned. Score: 93 Sequences (4:6) Aligned. Score: 99
Sequences (3:13) Aligned. Score: 93 Sequences (4:7) Aligned. Score: 99
Sequences (3:14) Aligned. Score: 72 Sequences (4:8) Aligned. Score: 98
Sequences (4:9) Aligned. Score: 98
Sequences (4:10) Aligned. Score: 98
Sequences (4:11) Aligned. Score: 98 Sequences (6:8) Aligned. Score: 97
Sequences (4:12) Aligned. Score: 98 Sequences (6:9) Aligned. Score: 99
Sequences (4:13) Aligned. Score: 98 Sequences (6:10) Aligned. Score: 99
Sequences (4:14) Aligned. Score: 75 Sequences (6:11) Aligned. Score: 99
Sequences (4:15) Aligned. Score: 74 Sequences (6:12) Aligned. Score: 99
Sequences (4:16) Aligned. Score: 74 Sequences (6:13) Aligned. Score: 99
Sequences (4:17) Aligned. Score: 57 Sequences (6:14) Aligned. Score: 75
Sequences (4:18) Aligned. Score: 51 Sequences (6:15) Aligned. Score: 73
Sequences (4:19) Aligned. Score: 58 Sequences (6:16) Aligned. Score: 73
Sequences (4:20) Aligned. Score: 45 Sequences (6:17) Aligned. Score: 56
Sequences (4:21) Aligned. Score: 70 Sequences (6:18) Aligned. Score: 50
Sequences (4:22) Aligned. Score: 44 Sequences (6:19) Aligned. Score: 57
Sequences (4:23) Aligned. Score: 44 Sequences (6:20) Aligned. Score: 44
Sequences (4:24) Aligned. Score: 45 Sequences (6:21) Aligned. Score: 68
Sequences (4:25) Aligned. Score: 40 Sequences (6:22) Aligned. Score: 43
Sequences (4:26) Aligned. Score: 37 Sequences (6:23) Aligned. Score: 43
Sequences (4:27) Aligned. Score: 50 Sequences (6:24) Aligned. Score: 45
Sequences (4:28) Aligned. Score: 71 Sequences (6:25) Aligned. Score: 39
Sequences (5:6) Aligned. Score: 98 Sequences (6:26) Aligned. Score: 36
Sequences (5:7) Aligned. Score: 98 Sequences (6:27) Aligned. Score: 50
Sequences (5:8) Aligned. Score: 97 Sequences (6:28) Aligned. Score: 71
Sequences (5:9) Aligned. Score: 97 Sequences (7:8) Aligned. Score: 97
Sequences (5:10) Aligned. Score: 97 Sequences (7:9) Aligned. Score: 97
Sequences (5:11) Aligned. Score: 97 Sequences (7:10) Aligned. Score: 97
Sequences (5:12) Aligned. Score: 97 Sequences (7:11) Aligned. Score: 97
Sequences (5:13) Aligned. Score: 97 Sequences (7:12) Aligned. Score: 97
Sequences (5:14) Aligned. Score: 75 Sequences (7:13) Aligned. Score: 97
Sequences (5:15) Aligned. Score: 73 Sequences (7:14) Aligned. Score: 75
Sequences (5:16) Aligned. Score: 73 Sequences (7:15) Aligned. Score: 73
Sequences (5:17) Aligned. Score: 57 Sequences (7:16) Aligned. Score: 73
Sequences (5:18) Aligned. Score: 51 Sequences (7:17) Aligned. Score: 56
Sequences (5:19) Aligned. Score: 58 Sequences (7:18) Aligned. Score: 50
Sequences (5:20) Aligned. Score: 46 Sequences (7:19) Aligned. Score: 57
Sequences (5:21) Aligned. Score: 68 Sequences (7:20) Aligned. Score: 44
Sequences (5:22) Aligned. Score: 44 Sequences (7:21) Aligned. Score: 68
Sequences (5:23) Aligned. Score: 44 Sequences (7:22) Aligned. Score: 43
Sequences (5:24) Aligned. Score: 45 Sequences (7:23) Aligned. Score: 43
Sequences (5:25) Aligned. Score: 40 Sequences (7:24) Aligned. Score: 44
Sequences (5:26) Aligned. Score: 37 Sequences (7:25) Aligned. Score: 40
Sequences (5:27) Aligned. Score: 50 Sequences (7:26) Aligned. Score: 36
Sequences (5:28) Aligned. Score: 71 Sequences (7:27) Aligned. Score: 50
Sequences (6:7) Aligned. Score: 98 Sequences (7:28) Aligned. Score: 69
Sequences (8:9) Aligned. Score: 97
Sequences (8:10) Aligned. Score: 97
Sequences (8:11) Aligned. Score: 97 Sequences (9:10) Aligned. Score: 98
Sequences (8:12) Aligned. Score: 97 Sequences (9:11) Aligned. Score: 98
Sequences (8:13) Aligned. Score: 97 Sequences (9:12) Aligned. Score: 98
Sequences (8:14) Aligned. Score: 74 Sequences (9:13) Aligned. Score: 98
Sequences (8:15) Aligned. Score: 73 Sequences (9:14) Aligned. Score: 72
Sequences (8:16) Aligned. Score: 73 Sequences (9:15) Aligned. Score: 71
Sequences (8:17) Aligned. Score: 55 Sequences (9:16) Aligned. Score: 71
Sequences (8:18) Aligned. Score: 50 Sequences (9:17) Aligned. Score: 56
Sequences (8:19) Aligned. Score: 56 Sequences (9:18) Aligned. Score: 50
Sequences (8:20) Aligned. Score: 44 Sequences (9:19) Aligned. Score: 57
Sequences (8:21) Aligned. Score: 68 Sequences (9:20) Aligned. Score: 44
Sequences (8:22) Aligned. Score: 43 Sequences (9:21) Aligned. Score: 68
Sequences (8:23) Aligned. Score: 43 Sequences (9:22) Aligned. Score: 43
Sequences (8:24) Aligned. Score: 44 Sequences (9:23) Aligned. Score: 43
Sequences (8:25) Aligned. Score: 38 Sequences (9:24) Aligned. Score: 45
Sequences (8:26) Aligned. Score: 36 Sequences (9:25) Aligned. Score: 39
Sequences (8:27) Aligned. Score: 49 Sequences (9:26) Aligned. Score: 36
Sequences (8:28) Aligned. Score: 71 Sequences (9:27) Aligned. Score: 50
Sequences (9:28) Aligned. Score: 74 Sequences (15:17) Aligned. Score: 53
Sequences (10:11) Aligned. Score: 98 Sequences (15:18) Aligned. Score: 49
Sequences (10:12) Aligned. Score: 98 Sequences (15:19) Aligned. Score: 54
Sequences (10:13) Aligned. Score: 98 Sequences (15:20) Aligned. Score: 48
Sequences (10:14) Aligned. Score: 74 Sequences (15:21) Aligned. Score: 80
Sequences (10:15) Aligned. Score: 73 Sequences (15:22) Aligned. Score: 37
Sequences (10:16) Aligned. Score: 73 Sequences (15:23) Aligned. Score: 37
Sequences (10:17) Aligned. Score: 55 Sequences (15:24) Aligned. Score: 46
Sequences (10:18) Aligned. Score: 50 Sequences (15:25) Aligned. Score: 40
Sequences (10:19) Aligned. Score: 56 Sequences (15:26) Aligned. Score: 32
Sequences (10:20) Aligned. Score: 44 Sequences (15:27) Aligned. Score: 47
Sequences (10:21) Aligned. Score: 68 Sequences (15:28) Aligned. Score: 53
Sequences (10:22) Aligned. Score: 43 Sequences (16:17) Aligned. Score: 53
Sequences (10:23) Aligned. Score: 43 Sequences (16:18) Aligned. Score: 49
Sequences (10:24) Aligned. Score: 45 Sequences (16:19) Aligned. Score: 54
Sequences (10:25) Aligned. Score: 38 Sequences (16:20) Aligned. Score: 48
Sequences (10:26) Aligned. Score: 36 Sequences (16:21) Aligned. Score: 80
Sequences (10:27) Aligned. Score: 50 Sequences (16:22) Aligned. Score: 37
Sequences (10:28) Aligned. Score: 71 Sequences (16:23) Aligned. Score: 37
Sequences (11:12) Aligned. Score: 98 Sequences (16:24) Aligned. Score: 46
Sequences (11:13) Aligned. Score: 98 Sequences (16:25) Aligned. Score: 40
Sequences (11:14) Aligned. Score: 74 Sequences (16:26) Aligned. Score: 32
Sequences (11:15) Aligned. Score: 73 Sequences (16:27) Aligned. Score: 47
Sequences (11:16) Aligned. Score: 73 Sequences (16:28) Aligned. Score: 53
Sequences (11:17) Aligned. Score: 55 Sequences (17:18) Aligned. Score: 71
Sequences (11:18) Aligned. Score: 50 Sequences (17:19) Aligned. Score: 99
Sequences (11:19) Aligned. Score: 56 Sequences (17:20) Aligned. Score: 42
Sequences (11:20) Aligned. Score: 43 Sequences (17:21) Aligned. Score: 54
Sequences (11:21) Aligned. Score: 66 Sequences (17:22) Aligned. Score: 33
Sequences (11:22) Aligned. Score: 43 Sequences (17:23) Aligned. Score: 33
Sequences (11:23) Aligned. Score: 43 Sequences (17:24) Aligned. Score: 69
Sequences (11:24) Aligned. Score: 45 Sequences (17:25) Aligned. Score: 66
Sequences (11:25) Aligned. Score: 39 Sequences (17:26) Aligned. Score: 37
Sequences (11:26) Aligned. Score: 36 Sequences (17:27) Aligned. Score: 47
Sequences (11:27) Aligned. Score: 50 Sequences (17:28) Aligned. Score: 46
Sequences (11:28) Aligned. Score: 69 Sequences (18:19) Aligned. Score: 72
Sequences (12:13) Aligned. Score: 98 Sequences (18:20) Aligned. Score: 38
Sequences (12:14) Aligned. Score: 74 Sequences (18:21) Aligned. Score: 50
Sequences (12:15) Aligned. Score: 73 Sequences (18:22) Aligned. Score: 36
Sequences (12:16) Aligned. Score: 73 Sequences (18:23) Aligned. Score: 36
Sequences (12:17) Aligned. Score: 55 Sequences (18:24) Aligned. Score: 99
Sequences (12:18) Aligned. Score: 50 Sequences (18:25) Aligned. Score: 100
Sequences (12:19) Aligned. Score: 56 Sequences (18:26) Aligned. Score: 36
Sequences (12:20) Aligned. Score: 44 Sequences (18:27) Aligned. Score: 39
Sequences (12:21) Aligned. Score: 66 Sequences (18:28) Aligned. Score: 41
Sequences (12:22) Aligned. Score: 43 Sequences (19:20) Aligned. Score: 43
Sequences (12:23) Aligned. Score: 43 Sequences (19:21) Aligned. Score: 56
Sequences (12:24) Aligned. Score: 45 Sequences (19:22) Aligned. Score: 34
Sequences (12:25) Aligned. Score: 38 Sequences (19:23) Aligned. Score: 34
Sequences (12:26) Aligned. Score: 36 Sequences (19:24) Aligned. Score: 70
Sequences (12:27) Aligned. Score: 50 Sequences (19:25) Aligned. Score: 66
Sequences (12:28) Aligned. Score: 71 Sequences (19:26) Aligned. Score: 37
Sequences (13:14) Aligned. Score: 75 Sequences (19:27) Aligned. Score: 47
Sequences (13:15) Aligned. Score: 73 Sequences (19:28) Aligned. Score: 43
Sequences (13:16) Aligned. Score: 73 Sequences (20:21) Aligned. Score: 49
Sequences (13:17) Aligned. Score: 56 Sequences (20:22) Aligned. Score: 81
Sequences (13:18) Aligned. Score: 50 Sequences (20:23) Aligned. Score: 81
Sequences (13:19) Aligned. Score: 57 Sequences (20:24) Aligned. Score: 37
Sequences (13:20) Aligned. Score: 44 Sequences (20:25) Aligned. Score: 22
Sequences (13:21) Aligned. Score: 68 Sequences (20:26) Aligned. Score: 31
Sequences (13:22) Aligned. Score: 41 Sequences (20:27) Aligned. Score: 23
Sequences (13:23) Aligned. Score: 41 Sequences (20:28) Aligned. Score: 43
Sequences (13:24) Aligned. Score: 45 Sequences (21:22) Aligned. Score: 40
Sequences (13:25) Aligned. Score: 39 Sequences (21:23) Aligned. Score: 40
Sequences (13:26) Aligned. Score: 34 Sequences (21:24) Aligned. Score: 31
Sequences (13:27) Aligned. Score: 50 Sequences (21:25) Aligned. Score: 24
Sequences (13:28) Aligned. Score: 69 Sequences (21:26) Aligned. Score: 14
Sequences (14:15) Aligned. Score: 97 Sequences (21:27) Aligned. Score: 10
Sequences (14:16) Aligned. Score: 97 Sequences (21:28) Aligned. Score: 17
Sequences (14:17) Aligned. Score: 55 Sequences (22:23) Aligned. Score: 100
Sequences (14:18) Aligned. Score: 50 Sequences (22:24) Aligned. Score: 28
Sequences (14:19) Aligned. Score: 56 Sequences (22:25) Aligned. Score: 27
Sequences (14:20) Aligned. Score: 48 Sequences (22:26) Aligned. Score: 27
Sequences (14:21) Aligned. Score: 80 Sequences (22:27) Aligned. Score: 5
Sequences (14:22) Aligned. Score: 37 Sequences (22:28) Aligned. Score: 33
Sequences (14:23) Aligned. Score: 37 Sequences (23:24) Aligned. Score: 28
Sequences (14:24) Aligned. Score: 46 Sequences (23:25) Aligned. Score: 27
Sequences (14:25) Aligned. Score: 40 Sequences (23:26) Aligned. Score: 27
Sequences (14:26) Aligned. Score: 33 Sequences (23:27) Aligned. Score: 5
Sequences (14:27) Aligned. Score: 49 Sequences (23:28) Aligned. Score: 33
Sequences (14:28) Aligned. Score: 53 Sequences (24:25) Aligned. Score: 78
Sequences (15:16) Aligned. Score: 100 Sequences (24:26) Aligned. Score: 43
Sequences (24:27) Aligned. Score: 39
Sequences (24:28) Aligned. Score: 41
Sequences (25:26) Aligned. Score: 39
Sequences (25:27) Aligned. Score: 39
Sequences (25:28) Aligned. Score: 17
Sequences (26:27) Aligned. Score: 43
Sequences (26:28) Aligned. Score: 12
Sequences (27:28) Aligned. Score: 5
Guide tree file created: [clustalw.dnd]
There are 27 groups
Start of Multiple Alignment
Aligning...
Group 1: Sequences: 2 Score:2209
Group 2: Sequences: 2 Score:1664
Group 3: Sequences: 3 Score:1391
Group 4: Sequences: 5 Score:1469
Group 5: Sequences: 2 Score:2531
Group 6: Sequences: 3 Score:2498
Group 7: Sequences: 4 Score:880
Group 8: Sequences: 2 Score:2477
Group 9: Sequences: 3 Score:619
Group 10: Sequences: 2 Score:2475
Group 11: Sequences: 3 Score:2487
Group 12: Sequences: 4 Score:2476
Group 13: Sequences: 5 Score:2478
Group 14: Sequences: 6 Score:2478
Group 15: Sequences: 2 Score:2492
Group 16: Sequences: 8 Score:2474
Group 17: Sequences: 9 Score:2476
Group 18: Sequences: 10 Score:2472
Group 19: Sequences: 11 Score:2476
Group 20: Sequences: 14 Score:713
Group 21: Sequences: 18 Score:863
Group 22: Sequences: 2 Score:2178
Group 23: Sequences: 3 Score:1823
Group 24: Sequences: 21 Score:545
Group 25: Sequences: 26 Score:483
Group 26: Sequences: 2 Score:476
Group 27: Sequences: 28 Score:373

Alignment Score 166233
CLUSTAL-Alignment file
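The pairwise scores in the ClustalW log above are easier to compare once they are collected into a lookup table. ClustalW's documentation describes these scores as percent identities between the two aligned sequences. The sketch below parses log lines, including lines that hold two entries side by side, into a dictionary keyed by sequence-number pairs. The function names `parse_pairwise_scores` and `closest_partner` are hypothetical, and the three sample lines are copied from the log.

```python
import re

# Matches one "Sequences (i:j) Aligned. Score: s" entry; a line may contain several.
PAIR_RE = re.compile(r"Sequences \((\d+):(\d+)\) Aligned\. Score:\s*(\d+)")


def parse_pairwise_scores(log_text):
    """Return {(i, j): score} with i < j for every pair found in the log text."""
    scores = {}
    for i, j, score in PAIR_RE.findall(log_text):
        a, b = sorted((int(i), int(j)))
        scores[(a, b)] = int(score)
    return scores


def closest_partner(scores, seq):
    """Return (partner, score) for the highest-scoring pair that involves sequence `seq`."""
    best = None
    for (a, b), s in scores.items():
        if seq in (a, b):
            partner = b if a == seq else a
            if best is None or s > best[1]:
                best = (partner, s)
    return best


log = """
Sequences (1:2) Aligned. Score: 97 Sequences (1:25) Aligned. Score: 39
Sequences (1:14) Aligned. Score: 74 Sequences (2:11) Aligned. Score: 97
Sequences (18:25) Aligned. Score: 100
"""
scores = parse_pairwise_scores(log)
print(scores[(1, 14)], closest_partner(scores, 25))
```

Applied to the full log, the same lookup makes it straightforward to list, for any sequence, which other sequences it resembles most.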

CLUSTAL O(1.2.4) multiple sequence alignment:

TLD14703.1:1-138 MKLSTVILPLALGLFSNTATAAPIGYNPFSRPKWTNLKIKNDKGESEGSMSIVKGKDGTI 60
KAI6353944.1:12-134 ----NIIFAGALALFSTTISAG--------RSKWINRQIYKGN-DRQGSLSVARGYEHHC 47
XP_030981816.1:4-131 ----NIIFAGALALFSTTESAG-----ILSLPKWINKQIFIGE-DRQGSLSVARGYEDHC 50
KAI6344236.1:4-131 ----NIIFAGALALFSTTESAG-----ILSLPKWINKQIFIGE-DRQGSLSVARGYEDHC 50
XP_029744010.1:1-69 MKFSTFILPFSLGLFSEPVTA------FWGRKKWTNKNVYNEFGERTDSLSVAKGATGYI 54
XP_030981442.1:56-94 ------------------------------------------------------------ 0
XP_030987775.1:1-145 MKCNNIILPFALFFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIKKGSEGDF 51
QNS36445.1:1-145 MKCNNIILPFALFFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
XP_030977270.1:4-148 TKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79357.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79358.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79356.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79353.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79361.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79365.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79352.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
AYN79364.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
QMU24232.1:1-145 MKCNNIILPFALFFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
XP_003713046.1:1-145 MKCNNIILPFALVFFSTTVT---------AGGGWTNKQFYNDKGEREGSISIRKGSEGDF 51
QBZ53273.1:1-147 MKFNKTIPLYILAFFSTAVIA--------GGRKWTNKVIYNDKGEREGSISIRKGAEGDF 52
prf||2210377A:43-189 MKFNKTIPLYILAFFSTAVIA--------GGRKWTNKVIYNDKGPREGSISIRKGAEGDF 52
AAA80239.2:1-147 MKFNKTIPLYILAFFSTAVIA--------GGRKWTNKVIYNDKGPREGSISIRKGAEGDF 52
TLD19068.1:1-57 ------------------------------------------------------------ 0
KAI6291589.1:1-131 MKINTSILALALALLPGM-AT--------AGRKWFNKKIYDENGESAGSLSVVKGGSGYI 51
AAA80241.1:1-131 MKINTSILALALALLPGM-AT--------AGRKWFNKKIYDENGESAGSLSVVKGGSGYI 51
KAI7908610.1:1-102 MKINTSILALTLALLPGM-AT--------AGRKWLNKKLWDANGQSAGSVSIVKGGQGSI 51
AAA80240.1:1-130 MKINTSILALTLALLPGM-AT--------AGRKWLNKKLWDANGQSAGSVSIVKGGQGSI 51
XP_003711276.1:1-100 MKINTSILALTLALLPGM-AT--------AGRKWLNKKLWDANGQSAGSVSIVKGGQGSI 51

TLD14703.1:1-138 VPGATEYDPNTDYYQWIHEENGKIKGAPSGWTFRRDWREDLSVI---------------- 104


KAI6353944.1:12-134 --------------------DEYINAADYGYDLRSDPVEDARDAAYYSRNGYHVGDGPAE 87
XP_030981816.1:4-131 --------------------DEYINAADYGYDLRDDPVEDAIDARYYNKHGYHIGDGPAE 90
KAI6344236.1:4-131 --------------------DEYINAADYGYDLRDDPVEDAIDARYYNKHGYHIGDGPAE 90
XP_029744010.1:1-69 NTGPTA-PGKPDAVAR-------------------------------------------- 69
XP_030981442.1:56-94 ---------------------------------------------------------PPE 3
XP_030987775.1:1-145 NYGPSY-PGGPNRMVRVHENNGNIRGMPTGYSLGPDHQQDQTDRQYYNRHGYHVGDGPAE 110
QNS36445.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQQDQTDRQYYNRHGYHVGDGPAE 110
XP_030977270.1:4-148 NYGPSY-PGGPDRMVRVHENNGNVRGMPPGYSLGPDHQEDKSDRQYYNRHGYHVGDGPAE 110
AYN79357.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQENKSDRQYYNRHGYHVGDGPAE 110
AYN79358.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRRMPPGYSLGPDHQENKSDRQYYNRHGYHVGDGPAE 110
AYN79356.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQENKSDRQYYNRHGYHVGDGPAE 110
AYN79353.1:1-145 NYGPSY-PGGPDRMVRVHENNSNIRGMPPGYSLGPDHQENKSDRQYYNRHGYHVGDGPAE 110
AYN79361.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQENKSDRQYYNRHGYHVGDGPAE 110
AYN79365.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQEDKSDRQYYNRHGYHVGDGPAE 110
AYN79352.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQENKSDRQYYNRHGYHVGDGPAE 110
AYN79364.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQEDKSDRQYYNRHGYHVGDGPAE 110
QMU24232.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQEDKSDRQYYNRHGYHVGDGPAE 110
XP_003713046.1:1-145 NYGPSY-PGGPDRMVRVHENNGNIRGMPPGYSLGPDHQEDKSDRQYYNRHGYHVGDGPAE 110
QBZ53273.1:1-147 NCGPGY-PGGPDRMVRVHEDNGNIRGMPPGYRLGPDDKEDKRDNQYYSRNGYHVGDGPAE 111
prf||2210377A:43-189 NCGPGY-PGGPDRMVRVHEDNGNIRGMPPGYRLGPDDKEDKGDNQYYSRNGYHVGDGPAE 111
AAA80239.2:1-147 NCGPGY-PGGPDRMVRVHEDNGNIRGMPPGYRLGPDDKEDKGDNQYYSRNGYHVGDGPAE 111
TLD19068.1:1-57 -------------MARVYEDSREIRGMPPGYRLGPDTEEDQLDNQYYSRNNYHVGDGSAE 47
KAI6291589.1:1-131 NIGPSA-PGQRDRLVEFRESGGKIQGGPPGYRYTSDHEEDQRDNRYYNTHGYHVGDGPAE 110
AAA80241.1:1-131 NIGPSA-PGQRDRLVEFRESGGKIQGGPPGYRYTSDHEEDQRDNRYYNTHGYHVGDGPAE 110
KAI7908610.1:1-102 NTDTGP--ITAEGSYDIYERNGKIEGGPP-----------------------------AE 80
AAA80240.1:1-130 NTDTGP--ITAEGSYDIYERNGKIEGGPPGYKYTEDRYEDRKDDRYYNTHGYHVGDGPAE 109
XP_003711276.1:1-100 NTDTGP--ITAEGSYDIYERNGKIEGGPPGYKYTEDRYEDRKDDRYYNTHG--------- 100

TLD14703.1:1-138 VSDPRQQKWGDGYTGAY----EDEIVGPAGEIVQHSDE---------- 138


KAI6353944.1:12-134 YGTHGGGRWGDGYTGSDNLFNDQAHYGPPGQIVQYH------------ 123
XP_030981816.1:4-131 YSTWGGGKWGDGYTGPDDLFNDQAYYGPPGDIVEYHED---------- 128
KAI6344236.1:4-131 YSTWGGGKWGDGYTGPDDLFNDQAYYGPPGDIVEYHED---------- 128
XP_029744010.1:1-69 ------------------------------------------------ 69
XP_030981442.1:56-94 YGTHGAGHWGEGYSG------------PPGKFTHEHGEQRGEDSCNIM 39
XP_030987775.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
QNS36445.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
XP_030977270.1:4-148 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
AYN79357.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHGHRE-QREEGCNIM 145
AYN79358.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
AYN79356.1:1-145 YGNRGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
AYN79353.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
AYN79361.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREESCNIM 145
AYN79365.1:1-145 HGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
AYN79352.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
AYN79364.1:1-145 YGNHGGGRWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
QMU24232.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
XP_003713046.1:1-145 YGNHGGGQWGDGYYG------------PPGEFTHEHRE-QREEGCNIM 145
QBZ53273.1:1-147 YQNHGGGQWGDGYYG------------PPGEITNQHGKRQGDQGCHIM 147
prf||2210377A:43-189 YQNHGGGQWGDGYYG------------PPGQITNQHGKRQGDQGCHIM 147
AAA80239.2:1-147 YQNHGGGQWGDGYYG------------PPGQITNQHGKRQGDQGCHIM 147
TLD19068.1:1-57 YQNHGGGQWG-------------------------------------- 57
KAI6291589.1:1-131 YGDHGAGHWGDGYYG------------PPGEFV--------------- 131
AAA80241.1:1-131 YGDHGGGHWGDGYYG------------PPGEFV--------------- 131
KAI7908610.1:1-102 YGNYGGGHWGDGYYG------------PPGEFVQ-------------- 102
AAA80240.1:1-130 YGNYGGGHWGDGYYG------------PPGEFV--------------- 130
XP_003711276.1:1-100 ------------------------------------------------ 100

 Phylogenetic Tree of Multiple aligned Nucleotide Sequence:

Phylogenetic tree analysis is a widely used tool in evolutionary biology and molecular genetics for
studying the evolutionary relationships among organisms or genes. The relationships are drawn as a
branching pattern in which closely related organisms or genes appear closer together on the tree.
Constructing a phylogenetic tree involves three steps: retrieval of DNA and protein sequences,
multiple sequence alignment, and tree building with MEGA software. The DNA or protein sequence is
retrieved from the NCBI database and used as a BLAST query. The sequences returned by BLAST are then
retrieved and aligned to identify regions of similarity and difference. Alignment places comparable
positions in the same column, which is a prerequisite for accurate analysis.
After the sequences are aligned, MEGA is used to construct the phylogenetic tree. MEGA uses the
patterns of sequence similarity and difference to estimate the most likely evolutionary
relationships.
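The report does not state which tree-building method was selected in MEGA, which offers several, including neighbor-joining, UPGMA and maximum likelihood. As an illustration only, the sketch below uses UPGMA, chosen because it is the simplest to write out: it computes the proportion of differing sites (p-distance) between aligned sequences and repeatedly merges the two closest clusters. The function names are hypothetical, and the four 15-residue fragments are short excerpts from the protein alignment above, so the resulting topology is a toy example, not the report's tree.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma_newick(names, seqs):
    """Cluster aligned sequences by UPGMA and return the topology as a Newick string."""
    # Each cluster id maps to (Newick label, number of leaves it contains).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {frozenset(p): p_distance(seqs[p[0]], seqs[p[1]])
            for p in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair; ties are broken by the smaller cluster ids.
        i, j = min((sorted(p) for p in dist),
                   key=lambda p: (dist[frozenset(p)], p))
        (label_i, n_i), (label_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the size-weighted mean.
        for k in clusters:
            dist[frozenset((next_id, k))] = (
                n_i * dist[frozenset((i, k))] + n_j * dist[frozenset((j, k))]
            ) / (n_i + n_j)
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        clusters[next_id] = (f"({label_i},{label_j})", n_i + n_j)
        next_id += 1
    (label, _size), = clusters.values()
    return label + ";"


# Illustrative fragments (one 15-residue window from four aligned sequences).
names = ["XP_030987775.1", "QBZ53273.1", "AAA80241.1", "AAA80240.1"]
seqs = ["YGNHGGGQWGDGYYG", "YQNHGGGQWGDGYYG", "YGDHGGGHWGDGYYG", "YGNYGGGHWGDGYYG"]
tree = upgma_newick(names, seqs)
print(tree)
```

The sketch returns topology only; branch lengths and statistical support such as bootstrap values, which MEGA can report, are left out to keep the example short.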
Phylogenetic tree analysis provides several insights. First, it reveals the evolutionary
relatedness between organisms or genes. From the branching pattern and the branch lengths,
researchers can infer the degree of divergence and, given a calibrated rate of change, estimate the
time since the common ancestor. The analysis can also help identify evolutionary events such as gene
duplication, horizontal gene transfer, or convergent evolution.
Furthermore, phylogenetic trees assist in the classification and identification of unknown organisms or
genes. By comparing their sequences with those on the tree, researchers can assign them to a specific
group or lineage and determine their evolutionary position (Brown, 2002).
Fig. 1: Phylogenetic tree of the aligned nucleotide sequences of the PWL2 gene, built with MEGA

 Phylogenetic Tree of Multiple aligned Protein Sequence:

Fig. 2: Phylogenetic tree of the aligned protein sequences of the PWL2 gene, built with MEGA

 3D Structure of PWL2 Protein:

Analysing the three-dimensional (3D) structure of a protein with bioinformatics tools is central to
understanding its function, interactions, and dynamics. Such analyses clarify protein
structure-function relationships and support drug discovery, protein engineering, and the
development of novel therapeutics. Molecular visualization tools are used to display and
explore protein structures in three-dimensional space. They let researchers manipulate
and interact with the structure, giving a clearer view of its overall architecture and the spatial
arrangement of functional residues (Schmidt et al., 2014).

Fig.3: 3D structure of PWL2 protein using SWISS-MODEL



SWISS-MODEL Homology Modelling Report

Model Building Report

This document lists the results for the homology modelling project "Untitled Project" submitted to
SWISS-MODEL workspace on May 14, 2023, 6:59 p.m. The submitted primary amino acid
sequence is given in Table T1.

If you use any results in your research, please cite the relevant publications:

 Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., Heer, F.T.,
de Beer, T.A.P., Rempfer, C., Bordoli, L., Lepore, R., Schwede, T. SWISS-MODEL:
homology modelling of protein structures and complexes. Nucleic Acids Res. 46(W1),
W296-W303 (2018).
 Bienert, S., Waterhouse, A., de Beer, T.A.P., Tauriello, G., Studer, G., Bordoli, L., Schwede,
T. The SWISS-MODEL Repository - new features and functionality. Nucleic Acids Res. 45,
D313-D319 (2017).
 Studer, G., Tauriello, G., Bienert, S., Biasini, M., Johner, N., Schwede, T. ProMod3 - A
versatile homology modelling toolbox. PLOS Comp. Biol. 17(1), e1008667 (2021).
 Studer, G., Rempfer, C., Waterhouse, A.M., Gumienny, R., Haas, J., Schwede, T.
QMEANDisCo - distance constraints applied on model quality estimation. Bioinformatics 36,
1765-1771 (2020).
 Bertoni, M., Kiefer, F., Biasini, M., Bordoli, L., Schwede, T. Modeling protein quaternary
structure of homo- and hetero-oligomers beyond binary interactions by homology. Scientific
Reports 7 (2017).

Results

The SWISS-MODEL template library (SMTL version 2023-05-10, PDB release 2023-05-05) was
searched for evolutionarily related structures matching the target sequence in Table T1. For
details on the template search, see Materials and Methods. Overall, 10 templates were found (Table
T2).

Models

The following model was built (see Materials and Methods "Model Building"):

Model #01

File   Built with      Oligo-State   Ligands   GMQE
PDB    ProMod3 3.3.0   monomer       None      0.58


Template        Seq Identity  Oligo-state  QSQE  Found by     Method        Resolution  Seq Similarity  Range  Coverage  Description
A0A3G2LZF2.1.A  96.55         monomer      -     AFDB search  AlphaFold v2  -           0.64            1-145  1.00      Pwl2 protein

The template contained no ligands.

Target          MKCNNIILPFALFFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPDRMVRVHENNGNIRGMPPG
A0A3G2LZF2.1.A  MKCNNIILPFALVFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPDRMVRVHENNGNIRGMPPG

Target          YSLGPDHQQDQTDRQYYNRHGYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREEGCNIM
A0A3G2LZF2.1.A  YSLGPDHQENKSDRQYYNRHGYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREEGCNIM
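The reported sequence identity of 96.55 can be checked directly from the target-template alignment above. The sketch below counts matching positions in the two ungapped 145-residue sequences. The function names are hypothetical, and the report does not say exactly how SWISS-MODEL computes the figure, so this simple ratio is an assumption that happens to reproduce it.

```python
def percent_identity(target, template):
    """Percentage of aligned positions at which both sequences carry the same residue."""
    if len(target) != len(template):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(target, template))
    return 100.0 * matches / len(target)


def substitutions(target, template):
    """List (position, target residue, template residue) for every mismatch, 1-based."""
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(target, template)) if a != b]


# Target sequence as given in Table T1 of the SWISS-MODEL report.
TARGET = (
    "MKCNNIILPFALFFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPD"
    "RMVRVHENNGNIRGMPPGYSLGPDHQQDQTDRQYYNRH"
    "GYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREEGCNIM"
)
# Template A0A3G2LZF2.1.A as shown in the target-template alignment.
TEMPLATE = (
    "MKCNNIILPFALVFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPD"
    "RMVRVHENNGNIRGMPPGYSLGPDHQENKSDRQYYNRH"
    "GYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREEGCNIM"
)

identity = percent_identity(TARGET, TEMPLATE)
diffs = substitutions(TARGET, TEMPLATE)
print(f"{identity:.2f}% identity, {len(diffs)} substitutions: {diffs}")
```

The two sequences differ at five positions (13 and 89 to 92), so 140 of 145 residues match, which rounds to 96.55%.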

Materials and Methods

Template Search

Template search has been performed against the SWISS-MODEL template library (SMTL, last
update: 2023-05-10, last included PDB release: 2023-05-05).

Template Selection

For each identified template, the template's quality has been predicted from features of the target-
template alignment. The templates with the highest quality have then been selected for model
building.

Model Building

Models are built based on the target-template alignment using ProMod3 (Studer et al.). Coordinates
which are conserved between the target and the template are copied from the template to the model.
Insertions and deletions are remodelled using a fragment library. Side chains are then rebuilt. Finally,
the geometry of the resulting model is regularized by using a force field.

Model Quality Estimation

The global and per-residue model quality has been assessed using the QMEAN scoring function
(Studer et al.).

Ligand Modelling

Ligands present in the template structure are transferred by homology to the model when the
following criteria are met: (a) The ligands are annotated as biologically relevant in the template
library, (b) the ligand is in contact with the model, (c) the ligand is not clashing with the protein, (d)
the residues in contact with the ligand are conserved between the target and the template. If any of
these four criteria is not satisfied, a certain ligand will not be included in the model. The model
summary includes information on why and which ligand has not been included.

Oligomeric State Conservation

The quaternary structure annotation of the template is used to model the target sequence in its
oligomeric form. The method (Bertoni et al.) is based on a supervised machine learning algorithm,
Support Vector Machines (SVM), which combines interface conservation, structural clustering, and
other template features to provide a quaternary structure quality estimate (QSQE). The QSQE score
is a number between 0 and 1, reflecting the expected accuracy of the interchain contacts for a model
built based on a given alignment and template. Higher numbers indicate higher reliability. This
complements the GMQE score which estimates the accuracy of the tertiary structure of the resulting
model.

References

• BLAST
Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., Madden,
T.L. BLAST+: architecture and applications. BMC Bioinformatics 10, 421-430 (2009).

• HHblits
Steinegger, M., Meier, M., Mirdita, M., Vöhringer, H., Haunsberger, S. J., Söding, J.
HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics
20, 473 (2019).

Table T1:

Primary amino acid sequence for which templates were searched and models were built.

MKCNNIILPFALFFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPD
RMVRVHENNGNIRGMPPGYSLGPDHQQDQTDRQYYNRH
GYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREEGCNIM
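
Because the target sequence in Table T1 is wrapped over several lines, a quick sanity check before submitting it to a modelling server can be useful. The sketch below joins the lines as printed, checks that only the 20 standard one-letter residue codes occur, and tallies the amino acid composition. The helper names are illustrative and are not part of any tool used in this report.

```python
from collections import Counter

# Sequence lines exactly as printed in Table T1 (wrapped across three lines).
TABLE_T1_LINES = [
    "MKCNNIILPFALFFFSTTVTAGGGWTNKQFYNDKGEREGSISIRKGSEGDFNYGPSYPGGPD",
    "RMVRVHENNGNIRGMPPGYSLGPDHQQDQTDRQYYNRH",
    "GYHVGDGPAEYGNHGGGQWGDGYYGPPGEFTHEHREQREEGCNIM",
]

# One-letter codes of the 20 standard amino acids.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def join_sequence(lines):
    """Concatenate wrapped sequence lines into one string."""
    return "".join(line.strip() for line in lines)


def invalid_residues(seq):
    """Return any characters that are not standard residue codes."""
    return sorted(set(seq) - STANDARD_RESIDUES)


def composition(seq):
    """Count how often each residue occurs."""
    return Counter(seq)


if __name__ == "__main__":
    seq = join_sequence(TABLE_T1_LINES)
    print("length:", len(seq), "invalid:", invalid_residues(seq))
    for residue, count in composition(seq).most_common(5):
        print(residue, count)
```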

Table T2:

Template        | Seq Identity | Oligo-state | QSQE | Found by    | Method       | Resolution | Seq Similarity | Coverage | Description
A0A3G2LZF2.1.A  | 96.55        | monomer     | -    | AFDB search | AlphaFold v2 | NA         | 0.64           | 1.00     | Pwl2 protein
2noc.1.A        | 27.27        | monomer     | -    | HHblits     | NMR          | NA         | 0.30           | 0.30     | Putative periplasmic protein
2jna.1.B        | 22.73        | homo-dimer  | -    | HHblits     | NMR          | NA         | 0.30           | 0.30     | Putative secreted protein
2jna.1.A        | 22.73        | homo-dimer  | -    | HHblits     | NMR          | NA         | 0.30           | 0.30     | Putative secreted protein
6e14.1.E        | 14.29        | monomer     | -    | HHblits     | EM           | 4.00Å      | 0.30           | 0.24     | Chaperone protein FimC
3q48.1.A        | 17.65        | monomer     | -    | HHblits     | X-ray        | 2.50Å      | 0.28           | 0.23     | Chaperone CupB2
6e15.1.A        | 14.29        | monomer     | -    | HHblits     | EM           | 6.20Å      | 0.30           | 0.24     | Chaperone protein FimC
2mxb.1.A        | 7.41         | monomer     | -    | HHblits     | NMR          | NA         | 0.26           | 0.19     | Erythropoietin receptor
7lhg.1.B        | 13.33        | monomer     | -    | HHblits     | EM           | NA         | 0.28           | 0.31     | Chaperone protein PapD
2mv6.1.A        | 11.54        | monomer     | -    | HHblits     | NMR          | NA         | 0.27           | 0.18     | Erythropoietin receptor

Swiss Institute of Bioinformatics
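
The template list in Table T2 can also be inspected programmatically. The sketch below ranks the templates by sequence identity and coverage and filters them with an identity cut-off. This is a deliberate simplification for illustration: SWISS-MODEL predicts template quality from features of the target-template alignment, not from these two columns alone, and the 30% default cut-off is an assumption chosen here, not a value taken from the report.

```python
from typing import NamedTuple


class Template(NamedTuple):
    name: str
    seq_identity: float  # percent sequence identity to the target
    found_by: str
    method: str
    coverage: float      # fraction of the target covered


# Values copied from Table T2.
TEMPLATES = [
    Template("A0A3G2LZF2.1.A", 96.55, "AFDB search", "AlphaFold v2", 1.00),
    Template("2noc.1.A", 27.27, "HHblits", "NMR", 0.30),
    Template("2jna.1.B", 22.73, "HHblits", "NMR", 0.30),
    Template("2jna.1.A", 22.73, "HHblits", "NMR", 0.30),
    Template("6e14.1.E", 14.29, "HHblits", "EM", 0.24),
    Template("3q48.1.A", 17.65, "HHblits", "X-ray", 0.23),
    Template("6e15.1.A", 14.29, "HHblits", "EM", 0.24),
    Template("2mxb.1.A", 7.41, "HHblits", "NMR", 0.19),
    Template("7lhg.1.B", 13.33, "HHblits", "EM", 0.31),
    Template("2mv6.1.A", 11.54, "HHblits", "NMR", 0.18),
]


def rank_templates(templates):
    """Sort by sequence identity, then coverage, highest first."""
    return sorted(templates, key=lambda t: (t.seq_identity, t.coverage), reverse=True)


def above_identity(templates, min_identity=30.0):
    """Keep templates at or above an (assumed) identity cut-off in percent."""
    return [t for t in templates if t.seq_identity >= min_identity]


if __name__ == "__main__":
    for t in rank_templates(TEMPLATES):
        print(f"{t.name:16s} {t.seq_identity:6.2f} {t.coverage:5.2f} {t.method}")
```

With these values, only the AlphaFold DB entry A0A3G2LZF2.1.A covers the full target at high identity; every HHblits hit covers less than a third of the sequence.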



• Validation Using Ramachandran Plot:

Ramachandran plot analysis is a widely used method in structural biology for assessing the quality
and reliability of protein structures. A Ramachandran plot is a graphical representation of the
backbone dihedral angles φ and ψ of the amino acid residues in a protein structure, and it shows
which backbone conformations are sterically allowed.
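
To make the two angles concrete, the sketch below computes a dihedral angle from four atom positions using the standard atan2-based formula, and derives φ (atoms C of residue i-1, N, Cα, C) and ψ (atoms N, Cα, C, N of residue i+1) from backbone coordinates. It is an illustration in plain Python, not the procedure PROCHECK uses internally; the function names are chosen here for the example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (range -180 to 180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone angles of one residue from five atom positions (x, y, z)."""
    phi = dihedral(c_prev, n, ca, c)
    psi = dihedral(n, ca, c, n_next)
    return phi, psi
```

Plotting ψ against φ for every residue of a model gives the Ramachandran plot; residues that fall outside the sterically allowed regions are the ones flagged during validation.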

In this study, I performed Ramachandran plot analysis to validate the predicted three-dimensional
structure of the PWL2 protein. However, the analysis does not provide strong evidence for the
validity and reliability of the predicted structure.

Fig. 4: Validation of the 3D structure of the PWL2 protein by Ramachandran plot, generated with PROCHECK

NOTE: The Ramachandran plot indicates that the 3D structure of the PWL2 protein needs further
structural refinement, particularly around residues HIS 123 (A), ARG 138 (A) and ASN 113 (A).

CONCLUSION:
The PWL2 gene of Pyricularia oryzae, the pathogenic fungus that causes rice blast disease, has been
the subject of extensive research. This gene, along with other avirulence (Avr) genes, plays a
crucial role in the gene-for-gene interaction between the fungus and its host plant, rice.
Understanding the function and mechanisms of the PWL2 gene is therefore important for developing
effective strategies to control rice blast disease and improve crop yield. Knowledge of how
resistant plants recognize and respond to Avr gene products is essential for breeding resistant
rice varieties and implementing disease management practices.

The PWL2 gene of Pyricularia oryzae is a key player in the gene-for-gene interaction between the
fungus and rice plants. Characterization of the gene and of its avirulence function has provided
valuable insights into the mechanisms of rice immunity and opened avenues for developing
resistant rice varieties. Further research in this area will continue to contribute to the development of
sustainable strategies for managing rice blast disease and ensuring food security.

In conclusion, bioinformatics is a powerful toolset for studying the PWL2 gene in Pyricularia
oryzae. Through genome analysis, sequence alignment, phylogenetic analysis, protein structure
prediction, and functional annotation, bioinformatics provides valuable insights into the structure,
function, and evolutionary aspects of the PWL2 gene. The knowledge gained from bioinformatics
analysis enhances our understanding of the molecular mechanisms underlying pathogenicity and host
specificity in Pyricularia oryzae, contributing to the development of strategies for disease
management and crop improvement.

REFERENCES:
• Laugé, Richard, and Pierre J.G.M. De Wit. "Fungal avirulence genes: structure and possible functions." Fungal Genetics and
Biology 24.3 (1998): 285-297. https://doi.org/10.1006/fgbi.1998.1076
(https://www.sciencedirect.com/science/article/pii/S1087184598910763)

• Sweigard, James A., et al. "Identification, cloning, and characterization of PWL2, a gene for host species specificity in the rice
blast fungus." The plant cell 7.8 (1995): 1221-1233.
https://www.researchgate.net/publication/15648597_Identification_Cloning_and_Characterization_of_PWL2_a_Gene_for_Host_Species_Specificity_in_the_Rice_Blast_Fungus

• Were, Vincent Mbashira. "Investigating the role of effector proteins in the rice blast fungus Magnaporthe oryzae." (2018).
https://ore.exeter.ac.uk/repository/handle/10871/33115

• Hernández-Domínguez, Edna María, et al. "Bioinformatics as a Tool for the Structural and Evolutionary Analysis of Proteins."
Computational Biology and Chemistry; IntechOpen: London, UK (2020): 37-64.
https://books.google.com/books?hl=en&lr=&id=EpYtEAAAQBAJ&oi=fnd&pg=PA37&dq=Hern%C3%A1ndez-Dom%C3%ADnguez,+Edna+Mar%C3%ADa,+et+al.+%22Bioinformatics+as+a+Tool+for+the+Structural+and+Evolutionary+Analysis+of+Proteins.%22+Computational+Biology+and+Chemistry%3B+IntechOpen:+London,+UK+(2020):+37-64.&ots=TZNJmaerFa&sig=_OSzx97Cu8En9OK9RUH5wYG1SIY

• Chen, Chenxi. Analysis of the molecular basis of virulence in pathogenic fungi. The Ohio State University, 2013.
https://search.proquest.com/openview/6d64b434112fe3defb5839df53b8740a/1?pq-origsite=gscholar&cbl=18750

• Zhong, Zhenhui, et al. "Directional selection from host plants is a major force driving host specificity in Magnaporthe species."
Scientific reports 6.1 (2016): 25591. https://www.nature.com/articles/srep25591

• Peng, Zhao, et al. "Effector gene reshuffling involves dispensable mini-chromosomes in the wheat blast fungus." PLoS genetics
15.9 (2019): e1008272. https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1008272

• Jambon, Martin, et al. "A new bioinformatic approach to detect common 3D sites in protein structures." Proteins: Structure,
Function, and Bioinformatics 52.2 (2003): 137-145. https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.10339

• Schmidt, Tobias, et al. "Modelling three-dimensional protein structures for applications in drug design." Drug discovery today
19.7 (2014): 890-897. doi:10.1016/j.drudis.2013.10.027 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4112578/

• Brown, T.A. Genomes. 2nd edition. Oxford: Wiley-Liss, 2002. https://www.ncbi.nlm.nih.gov/books/NBK21128/
