
Int. J. Biosci.

2012 International Journal of Biosciences (IJB)


ISSN: 2220-6655 (Print) 2222-5234 (Online) Vol. 2, No. 10(1), p. 36-52, 2012 http://www.innspub.net

RESEARCH PAPER

OPEN ACCESS

The putative synaptotagmin protein encoded by the SYT1 gene of the picoplanktonic alga Micromonas is a novel member of C2-domain containing proteins: evidence from in silico characterization and homology modeling
Ashutosh Mukherjee Department of Botany, Dinabandhu Mahavidyalaya, Bongaon, North 24 Parganas - 743235, West Bengal, India
Received: 14 September 2012 Revised: 21 September 2012 Accepted: 22 September 2012

Key words: Disorder, template, dendrogram, Ramachandran plot, flexibility, electrostatic potential.
Abstract
Synaptotagmins are a class of membrane trafficking proteins that control endocytosis and exocytosis of synaptic vesicles in animals. A growing body of plant nucleotide and protein data shows that they are also present in plants. Micromonas pusilla is a picophytoplanktonic alga belonging to the Prasinophyceae, which is believed to be an ancient member of the green plant lineage and is thus very useful in evolutionary studies. The SYT1 gene of this alga encodes a putative synaptotagmin which shows novel features. In this study, this protein has been characterized with several bioinformatic tools. The protein contains several novel motifs and domains besides the C2 domain. Its three-dimensional structure has been predicted in silico by homology modeling to gather knowledge about the structure of ancient forms of the plant synaptotagmin protein. The C2 domain of this protein is itself somewhat different from the known structures. The spatial distribution of the active site amino acids around the calcium ion showed that some amino acids outside the C2 domain are also involved in calcium binding, which is a novel feature of this protein.

Corresponding Author: Ashutosh Mukherjee ashutoshcaluniv@gmail.com

Introduction

Synaptotagmins are a group of membrane trafficking proteins characterized by the presence of an N-terminal transmembrane region (TMR), a linker of variable length and two tandemly arranged C-terminal C2 domains, called C2A and C2B (Craxton, 2004). The C2 domain is a Ca2+-binding protein domain, approximately 130-145 amino acids long, which is found in many membrane-associated signaling proteins in a large number of organisms (Nalefski and Falke, 1996). It is considered that Ca2+ neutralizes negatively charged residues in the loop regions of the C2 domain and permits its interaction with phospholipids in the membrane, which leads to trafficking (Rizo and Sudhof, 1998). In mammals, there are 15 members of the synaptotagmin family and many of these proteins act in the regulated synaptic vesicle exocytosis required for efficient neurotransmission (Craxton, 2004). They are calcium sensors and regulate exocytosis and endocytosis of synaptic vesicles. Although they were thought to be exclusive to animals, they have also been identified in plants (Lewis and Lazarowitz, 2010). From the sequenced plant genomes, many synaptotagmin genes have been identified by several computational procedures (Craxton, 2004).

The picoplanktonic alga Micromonas pusilla is an important model organism in developmental and evolutionary biology, as it belongs to the Prasinophyceae, which is thought to be the anciently diverged sister clade to land plants (Worden et al., 2009). Analyses of the genome of this small unicellular eukaryote offer valuable insights into the dynamic nature of early plant evolution. The genome of this picoplankton contains one SYT1 gene, which encodes one C2-domain containing protein annotated as putative synaptotagmin (Worden et al., 2009). The protein is 1053 amino acids long and the C2 domain spans 214 amino acids, which is much longer than the average length of a C2 domain (130-145 amino acids). Additionally, an initial BLAST (Altschul et al., 1990) search against the NCBI non-redundant protein database revealed several plant synaptotagmins with high sequence similarity in the C2-domain region, but outside the C2-domain no sequence similarity was found with any other protein. As the C2-domain spans only 214 of the 1053 amino acids, a large portion of the protein is uncharacterized. Thus, further characterization of this region, including the search for known or novel domains and motifs, is needed for a better understanding of the structural and functional properties of this ancient form of C2-domain containing putative synaptotagmin protein.

The biological function of a protein is also a manifestation of its tertiary structure, and knowledge of the structural organization of the protein is a prerequisite for understanding its functional aspects (Paital et al., 2011). However, no three-dimensional structure of this C2 domain containing protein from Micromonas is known. Thus, it would be useful to determine the 3D structure of this protein in order to understand its functional aspects. In the absence of a crystal structure, homology modeling, which is done in silico, provides a faster way to obtain structural insight into the protein (Dolan et al., 2012). Additionally, identification of the Ca2+ binding residues and knowledge about their interaction with the ligand are necessary for understanding its functional properties. This study was conducted with the help of several bioinformatics approaches, including homology modeling, to a) investigate the physicochemical, structural and functional properties of this protein, b) analyze the structure of the whole protein and the C2 domain and c) study the interaction of the active site amino acid residues with the Ca2+ ion.

Materials and methods

Sequence retrieval

The Micromonas pusilla Ca2+-lipid binding protein sequence containing the C2 domain, i.e. the putative synaptotagmin (GenBank accession XP_002504251; GI: 255082530; hereafter called SYT1), was downloaded from the NCBI RefSeq (Pruitt et al., 2007) database (http://www.ncbi.nlm.nih.gov/projects/RefSeq/).


Fig. 1. Dendrogram showing the phylogenetic relationship of the SYT1 from Micromonas with other C2-domain containing proteins. The SYT1 from Micromonas is shown in a grey box.

The protein sequence was predicted by conceptual translation from an mRNA sequence of Micromonas sp. RCC299 (Worden et al., 2009). The protein is 1053 amino acids long and the C2 domain (COG5038) spans residues 282-495. A three-dimensional crystal structure of this protein was not yet available in the Protein Data Bank. This sequence was further utilized for characterization and structure prediction.
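The domain span quoted above uses 1-based, inclusive residue numbering, which is easy to get wrong by one when slicing a sequence programmatically. The following is a minimal sketch, not part of the original study; the class name `ResidueRange` and its methods are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidueRange:
    """A 1-based, inclusive residue range, as used in domain annotations."""
    start: int
    end: int

    def length(self) -> int:
        # Inclusive on both ends, hence the +1.
        return self.end - self.start + 1

    def slice_from(self, sequence: str) -> str:
        # Convert 1-based inclusive coordinates to a 0-based Python slice.
        return sequence[self.start - 1:self.end]

    def coverage(self, protein_length: int) -> float:
        # Fraction of the full-length protein covered by this range.
        return self.length() / protein_length


# Values taken from the text: C2 domain at residues 282-495 of a 1053-residue protein.
c2_domain = ResidueRange(282, 495)
PROTEIN_LENGTH = 1053

print(c2_domain.length())                                  # 214
print(round(100 * c2_domain.coverage(PROTEIN_LENGTH), 1))  # 20.3
```

The range 282-495 gives the 214 residues stated in the text, which is roughly one fifth of the protein.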


Fig. 2. Multiple sequence alignment of the templates and the target protein as visualized with Jalview.

Phylogenetic analysis

Protein sequences related to SYT1 were searched using the NCBI BLASTP (Altschul et al., 1990) program. To evaluate the phylogenetic relationship, the resulting sequences (excluding hypothetical and predicted sequences) were aligned using the alignment explorer in MEGA 5.0 (Tamura et al., 2011) with default parameters. An unrooted phylogenetic tree of these sequences was constructed by the neighbor-joining (NJ) method in MEGA 5. The level of confidence was estimated using bootstrap analysis with 1000 replications.
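The two ingredients of a bootstrapped distance tree, a pairwise distance and column resampling, can be illustrated with a short sketch. This is not MEGA's implementation: the p-distance is an assumed stand-in for whatever substitution model the default settings apply, the toy alignment is hypothetical, and the tree-building step itself is omitted.

```python
import random
from typing import Dict


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_replicate(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Resample alignment columns with replacement to build one bootstrap replicate."""
    n_cols = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment (not data from the study).
toy_alignment = {
    "seq1": "MKV-LDAE",
    "seq2": "MKVALDGE",
    "seq3": "MRVALEGE",
}

rng = random.Random(0)
for _ in range(3):  # the study used 1000 replications
    replicate = bootstrap_replicate(toy_alignment, rng)
    print(round(p_distance(replicate["seq1"], replicate["seq2"]), 3))
```

In a full analysis, a tree would be built from the distance matrix of every replicate, and the bootstrap support of a clade is the fraction of replicate trees that contain it.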


Fig. 3. Ramachandran plot of the modeled SYT1 protein.


Fig. 4. Details of the modeled three-dimensional structure of SYT1 protein. A) Ribbon diagram of the protein as shown in Chimera. The alpha helices are shown in orange, beta sheets are shown in yellow and loops are coloured in cyan; B) Position of the C2-domain (orange) within the protein.

Physicochemical analysis

The computation of various physicochemical parameters, such as amino acid composition, isoelectric point (pI), total number of negatively and positively charged residues, instability index, aliphatic index and Grand Average of Hydropathy (GRAVY), was done using the ProtParam tool (Gasteiger et al., 2005) available at http://us.expasy.org/tools/protparam.html.
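Several of these parameters are simple functions of amino acid composition. The sketch below is an illustration rather than the ProtParam code itself: GRAVY is computed as the mean Kyte-Doolittle hydropathy, and the aliphatic index uses the standard formula X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)) with X in mole percent. The function names and the toy peptide are hypothetical.

```python
from typing import Tuple

# Kyte-Doolittle hydropathy values, the scale on which GRAVY is defined.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_residue_counts(sequence: str) -> Tuple[int, int]:
    """Return (Asp + Glu, Arg + Lys) counts."""
    negative = sequence.count("D") + sequence.count("E")
    positive = sequence.count("R") + sequence.count("K")
    return negative, positive


# Hypothetical toy peptide (not the SYT1 sequence).
peptide = "AAVILDEKR"
print(round(gravy(peptide), 3))
print(round(aliphatic_index(peptide), 2))
print(charged_residue_counts(peptide))
```

A negative GRAVY value indicates an overall hydrophilic sequence, and a higher aliphatic index reflects a larger relative volume of aliphatic side chains.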

Fig. 5. Topology of the modeled SYT1 protein as predicted by PDBsum. Helices and strands outside the C2-domain are shown in red and pink, respectively. The helices and strands of the C2-domain are shown in blue and green, respectively.


Fig. 6. Flexibility of modeled three-dimensional structure of SYT1. A) Flexibility to rigidity as shown in a gradient of red to white in the 3D model; B) Flexibility along the length of the protein as indicated by peaks; C) Flexibility as indicated in a red white gradient over the entire sequence.

Fig. 7. A) Protein disorder (disordered regions are indicated as blue regions); B) Interacting surface (shown as red regions) and C) Surface electrostatic potential of SYT1 (Red portions are electronegative and blue portions are electropositive. White portions are neutral).

Fig. 8. Interaction of Ca2+ ion with the SYT1 protein. a. Three-dimensional orientation of side chains of active site residues surrounding the Ca2+ ion (the cyan ribbon represents part of the C2-domain and the orange ribbon includes amino acids outside the C2-domain that are important for Ca2+ binding); b. LIGPLOT of SYT1 complexed with Ca2+.

Structural and functional characterization

Secondary structure prediction was carried out with SOPMA (Geourjon and Deléage, 1995). The CDD database (Marchler-Bauer et al., 2011) was searched for domains using CD search (Marchler-Bauer and Bryant, 2004). Motifs were predicted using the Multiple Em for Motif Elicitation (MEME) suite (Bailey et al., 2009) with default parameters to gain insight into the function of the protein. Motifs found with MEME were further searched with the MAST tool for known matches. The Motif Scan (Pagni et al., 2007; Sigrist et al., 2010) server (http://hits.isb-sib.ch/cgi-bin/PFSCAN) and the SMART (Schultz et al., 1998; Letunic et al., 2012) server (http://smart.embl-heidelberg.de/) were also used for scanning signature domains with the default parameters, including outlier homologs and homologs of known structures, Pfam domains, signal peptides and internal repeats. The SOSUI (Hirokawa et al., 1998) program (http://bp.nuap.nagoya-u.ac.jp/sosui/sosui_submit.html) was employed to predict the presence of any transmembrane region. Subcellular localization was predicted using the TargetP 1.1 (Emanuelsson et al., 2000) server (http://www.cbs.dtu.dk/services/TargetP/abstract.php). Protein disorder was predicted using the Disopred (Ward et al., 2004) server (http://bioinf.cs.ucl.ac.uk/disopred/).

Homology modeling

Initially, the HHpred (Söding et al., 2005) server (http://toolkit.tuebingen.mpg.de/hhpred) as well as the PSI-BLAST (Altschul et al., 1997) server (http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) were used to identify suitable templates from the PDB protein structure database (Berman et al., 2000). However, HHpred only identified some templates with coiled coil regions that aligned with a small region (approximately from the 400th to the 650th residue) of the target protein. PSI-BLAST, on the other hand, could not find any significant match. The Phyre2 web server (Kelley and Sternberg, 2009) (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) was also employed for modeling. However, only 31% of the protein could be modeled in normal mode. Intensive mode could not be employed on Phyre2 as it requires proteins less than 1000 amino acids long. Finally, I-TASSER (Zhang, 2007; Roy et al., 2010), the iterative threading assembly refinement server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/), was chosen to generate the homology models because it is automated and easy to use, its algorithm incorporates multiple templates, and it has a high degree of accuracy based on blind CASP experiments (Roy et al., 2010). Rather than specifying one single template for homology modeling, I-TASSER was allowed to incorporate multiple templates, since it is recommended that multiple templates be used in order to avoid biasing the model toward one protein or one set of side chain conformations (Ginalski, 2006; Rhodes, 2006). Sequence alignments of the target protein and the templates were performed using CLUSTALW (http://www.ch.embnet.org/software/ClustalW.html) (Larkin et al., 2007) and visualized with Jalview (Clamp et al., 2004; Waterhouse et al., 2009). I-TASSER generated five predicted structures for the protein, of which the model with the highest C-score was chosen for further analysis.

Validation and analysis of the 3D model

After modeling, the modeled structure was validated using the Protein Structure Validation Suite (PSVS) tool (Bhattacharya et al., 2007) available at http://psvs-1_4-dev.nesg.org/. Within PSVS, the model was analyzed by PROCHECK (Laskowski et al., 1993) and Molprobity (Lovell et al., 2003). 3D structures of the protein and the protein-calcium complex were visualized with Chimera (Pettersen et al., 2004). To obtain the secondary structure and an at-a-glance overview of the topology of the modeled protein, the 3D structure was submitted to the PDBsum (Laskowski, 2009) web server (http://www.ebi.ac.uk/pdbsum/). Molecular surface area and contact volume were calculated with the web-based tool Voss Volume Voxelator (http://www.molmovdb.org/cgi-bin/3v.cgi) (Voss, 2007; Voss et al., 2006). B-factor profiles of the modeled protein were investigated using FlexServ (http://mmb.pcb.ub.es/FlexServ/) (Camps et al., 2009), a web-based tool for the analysis of protein flexibility, with Normal Mode Analysis employed. This server incorporates protocols for the coarse-grained determination of protein dynamics using different algorithms. For further annotation and identification of protein interfaces, the structure was analysed with the Polyview (Porollo and Meller, 2007) server (http://polyview.cchmc.org/). To identify the likely biochemical function of the protein from its three-dimensional structure, the ProFunc (Laskowski et al., 2005a; Laskowski et al., 2005b) server (http://www.ebi.ac.uk/thornton-srv/databases/profunc/) was employed. Binding site prediction was performed with I-TASSER, which also generated a ligand-protein complex. The ligand (Ca2+) bound to the active site residues was plotted with LIGPLOT (Wallace et al., 1995) within PDBsum.

Protein structure accession numbers

The homology model of the protein was submitted to the Protein Model Data Base, i.e. PMDB (Castrignanò et al., 2006), at http://mi.caspur.it/PMDB/ and assigned the identifier PM0078184.

Results and discussion

Phylogenetic relationship of SYT1 with other members of C2 domain containing proteins

A BLAST search of SYT1 identified several C2 domain containing proteins, including some hypothetical and predicted proteins. These hypothetical and predicted proteins were excluded from dendrogram preparation. Finally, Micromonas SYT1 and 57 related proteins (supplementary material, table S1) were used for phylogenetic tree construction. All of them had either one or two C2 domains (table 3). Besides plant synaptotagmins, these proteins included several membrane proteins with a single C2 domain, calcium-dependent lipid-binding domain-containing proteins, CLB1 and other C2 domain containing proteins. The dendrogram showed that SYT1 of Micromonas is distinctly different from all the other 57 proteins (Fig. 1).

Table 1. ProtParam table showing different physicochemical properties of the C2 domain containing protein.

Parameters | Value | Explanation
pI | 5.10 | Indicates that the protein is acidic.
Total number of negatively charged residues (Asp + Glu) | 155 | Total number of negatively charged residues is greater than total number of positively charged residues. This indicates that the protein is intracellular.
Total number of positively charged residues (Arg + Lys) | 125 | See the row above.
The instability index (II) | 39.61 | This classifies the protein as stable.
Aliphatic index | 82.74 | Indicates that this globular protein is thermostable.
Grand average of hydropathicity (GRAVY) | -0.263 | A negative GRAVY score indicates that the protein is hydrophilic.

Table 2. Secondary structure of the C2 domain containing protein as predicted by SOPMA.

Parameters | Number of amino acids | Percentage of amino acids
Alpha helix (Hh) | 386 | 36.66
310 helix (Gg) | 0 | 0.00
Pi helix (Ii) | 0 | 0.00
Beta bridge (Bb) | 0 | 0.00
Extended strand (Ee) | 187 | 17.76
Beta turn (Tt) | 89 | 8.45
Bend region (Ss) | 0 | 0.00
Random coil (Cc) | 391 | 37.13
Ambiguous states | 0 | 0.00
Other states | 0 | 0.00
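The percentages in a secondary structure summary of this kind follow directly from the residue counts divided by the protein length. A minimal sketch, using the non-zero counts from table 2 (the variable names are illustrative only):

```python
# Residue counts per predicted state, taken from table 2 (zero-count states omitted).
state_counts = {
    "Alpha helix": 386,
    "Extended strand": 187,
    "Beta turn": 89,
    "Random coil": 391,
}

# The counts should add up to the full protein length (1053 residues).
total_residues = sum(state_counts.values())

# Percentage of residues in each state, rounded to two decimals.
state_percent = {
    state: round(100 * count / total_residues, 2)
    for state, count in state_counts.items()
}

print(total_residues)
for state, percent in state_percent.items():
    print(f"{state}: {percent:.2f}%")
```

Checking that the counts sum to the sequence length and that the percentages sum to roughly 100 is a quick sanity test for transcription errors in such tables.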

Table 3. Motifs predicted using MEME.

Motif | Width | Sites | E-value | Start position | p-value | Sequence
Motif 1 | 8 | 2 | 4.4e-001 | 234 | 7.72e-11 | FMGWQQSK
Motif 1 | 8 | 2 | 4.4e-001 | 453 | 1.16e-12 | WMVWPRCI
Motif 2 | 6 | 2 | 1.4e+001 | 504 | 3.10e-08 | LQVRWP
Motif 2 | 6 | 2 | 1.4e+001 | 550 | 4.16e-10 | LCVRWY
Motif 3 | 6 | 2 | 7.4e+001 | 387 | 1.10e-08 | EFECSF
Motif 3 | 6 | 2 | 7.4e+001 | 404 | 1.97e-08 | VFPCFG

Physicochemical properties

The physicochemical properties of the C2 domain containing protein from Micromonas were predicted from the protein sequence using the ExPASy ProtParam server (http://expasy.org/cgi-bin/protparam) and the results are shown in table 1. The most frequent amino acid in the sequence was alanine (157 residues, 14.9%) and the least frequent was cysteine (5 residues, 0.5%). The total number of negatively charged residues (Asp + Glu) was 155 and the total number of positively charged residues (Arg + Lys) was 125, which indicates that the protein is intracellular, as intracellular proteins have a higher fraction of negatively charged residues. The isoelectric point (pI) is the pH at which the surface of the protein is covered with charge but the net charge of the protein is zero; at the isoelectric point, solubility is lowest and mobility in an electric field is zero. The pI was computed to be 5.10, which indicates that the protein is acidic. The high aliphatic index (82.74) indicates that this protein is stable over a wide temperature range. This is important for withstanding various stressful environments, which is natural for a signaling protein. The instability index (39.61, below the threshold of 40) also provides evidence that the protein is stable. The Grand Average of Hydropathicity (GRAVY) value is negative (-0.263), which indicates better interaction of the protein with water. The SOSUI program also showed an average hydrophobicity of -0.263343, confirming that the protein is soluble. Prediction of its subcellular localization with TargetP showed that the protein is localized in the chloroplast, with a 63 amino acid long target peptide.

Structural and functional properties

Table 2 presents the results of the secondary structure prediction by SOPMA, from which it is clear that random coil is predominant (37.13%), followed by alpha helix (36.66%) and extended strand (17.76%). SOPMA also predicted the presence of beta turns (8.45%). The Conserved Domain Database showed only the presence of the C2-domain (COG5038). No other domains were found. The scan for motifs with MEME showed the presence of three motifs (table 3). The sequences of the motifs are [FW]M[GV]W[PQ][QR][CS][IK], L[CQ]VRW[PY] and [EV]F[EP]C[FS][FG]. All the motifs were present in two copies in the sequence. Of these, motifs 1 and 3 were part of the C2 domain. A search for these motifs in other proteins with MAST revealed some interesting results. For motif 1, MAST returned 26 proteins with E-values less than 10, which include several FAB fragments, FV fragments and a few indolicidins (antimicrobial cationic peptides), membrane glycoproteins and one cytochrome. For motif 2, 12 sequences were identified with E-values less than 10, including a few S-phase kinase-associated proteins and many dienelactone hydrolases. Only 5 proteins were identified with motif 3, which include bacterial toxins. SMART identified several low complexity regions as well as two coiled coil regions (table 4). No low-complexity region fell within the C2 domain. It also identified two SCOP domains (d1i19a1, i.e. FAD-linked oxidase, C-terminal domain, and d1hcia4, i.e. spectrin repeat). The Motif Scan tool identified one amidation site, two N-glycosylation sites, fourteen casein kinase II phosphorylation sites, eighteen N-myristoylation sites, sixteen protein kinase C phosphorylation sites, one cell attachment sequence, one each of alanine-rich, arginine-rich and glycine-rich regions as well as one octapeptide repeat (table 5).

The initial BLAST search against the NCBI non-redundant protein database showed many plant synaptotagmins and some other C2 domain containing proteins in the top BLAST hits. Also, ProFunc identified several plant synaptotagmin genes related to SYT1 of Micromonas. Surprisingly, these proteins only showed similarity in the C2 domain region. The C-terminal and N-terminal regions outside the C2 domain did not show any sequence similarity with any other proteins. The CD search showed that the C2 domain spans from Asp282 to Gly495. The results also showed the presence of another small domain of the superfamily cl01482. As shown in CDD, this superfamily represents bacterial proteins related to CpxP, a periplasmic protein that forms part of a two-component system which acts as a global modulator of cell-envelope stress in Gram-negative bacteria. In this protein, this domain spans from Gly816 to Arg874.

Disordered regions of a protein facilitate interactions of the protein and allow more modification sites in the protein (Paital et al., 2011). The total number of disordered amino acid residues was 378 (35.89%) as predicted by Disopred. They were spread over the protein in 14 regions. The longest disordered region extended from Glu47 to Thr181. However, the C2 domain was not disordered, as no amino acid within this region was found to be disordered. These disordered regions play significant roles in protein interaction (Paital et al., 2011). From these results, it seems that this protein interacts with other proteins with novel properties.

Table 4. Motifs identified with SMART.

Motif | No. of sites | Amino acid positions | E-value
Low complexity | 11 | 28-42, 56-77, 82-103, 117-134, 165-176, 241-252, 607-621, 636-655, 693-707, 742-760, 1018-1035 | ---
Coiled coil | 2 | 816-865, 940-980 | ---
SCOP: d1i19a1 (FAD-linked oxidases, C-terminal domain) | 1 | 198-321 | 2.20e+00
SCOP: d1hcia4 (Spectrin repeat) | 1 | 800-851 | 1.40e-01
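The three MEME motif patterns reported in table 3 use a bracket notation that is also valid regular-expression syntax, so candidate sites can be located with a plain regex scan. The sketch below is only an illustration of that idea, not the MEME or MAST algorithm (which score matches against position-specific matrices); the function `find_motif_sites` and the toy sequence are hypothetical.

```python
import re
from typing import List

# Motif patterns as given in the text; brackets list the residues allowed at a position.
MOTIFS = {
    "Motif 1": "[FW]M[GV]W[PQ][QR][CS][IK]",
    "Motif 2": "L[CQ]VRW[PY]",
    "Motif 3": "[EV]F[EP]C[FS][FG]",
}


def find_motif_sites(sequence: str, pattern: str) -> List[int]:
    """Return 1-based start positions of all matches, including overlapping ones."""
    # A lookahead lets the scan report matches that overlap each other.
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", sequence)]


# Hypothetical toy sequence built from one reported site of each motif (not SYT1 itself).
toy_sequence = "AA" + "FMGWQQSK" + "AA" + "LQVRWP" + "AA" + "VFPCFG"

for name, pattern in MOTIFS.items():
    print(name, find_motif_sites(toy_sequence, pattern))

# Each site sequence listed in table 3 should match its own motif pattern.
reported_sites = {
    "Motif 1": ["FMGWQQSK", "WMVWPRCI"],
    "Motif 2": ["LQVRWP", "LCVRWY"],
    "Motif 3": ["EFECSF", "VFPCFG"],
}
for name, sites in reported_sites.items():
    print(name, all(re.fullmatch(MOTIFS[name], site) for site in sites))
```

A regex scan treats every allowed residue at a position as equally likely, so it is best used as a quick consistency check on reported sites rather than as a replacement for the statistical scoring done by MEME and MAST.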

Table 5. Motifs identified with Motif Scan.

Motif information | No. of sites | Amino acid residues | E-value
Amidation site | 1 | 81-84 | ---
N-glycosylation site | 2 | 79-82, 519-522 | ---
Casein kinase II phosphorylation site | 14 | 44-47, 65-68, 162-165, 198-201, 261-264, 349-352, 385-388, 535-538, 711-714, 847-850, 950-953, 985-988, 999-1002, 1043-1046 | ---
N-myristoylation site | 18 | 88-93, 126-131, 227-232, 335-340, 372-377, 404-409, 430-435, 467-472, 531-536, 627-632, 644-649, 654-659, 663-668, 684-689, 758-763, 802-807, 826-831, 995-1000 | ---
Protein kinase C phosphorylation site | 16 | 23-25, 33-35, 37-39, 46-48, 81-83, 127-129, 241-243, 394-396, 535-537, 565-567, 670-672, 723-725, 795-797, 807-809, 844-846, 950-952 | ---
Cell attachment sequence | 1 | 74-175 | ---
Alanine-rich region | 1 | 123-145 | 0.07
Arginine-rich region | 1 | 25-69 | 5.7
Glycine-rich region | 1 | 627-706 | 0.00059
Octapeptide repeat | 1 | 473-480 | 3.6

Each of the top five models generated by I-TASSER was assigned a C-score. The C-score is a confidence score for estimating the quality of a predicted model: a high C-score signifies a model with high confidence and vice versa. Models with a C-score > -1.5 generally have a correct fold (Roy et al., 2010). The structure with the highest C-score (0.9) was used for further studies. The template-modeling score (TM-score) provides a sensitive measure of the overall topology difference between a predicted structure and a template, with a higher score indicating a better structural match. A TM-score > 0.5 indicates correct overall topology for a modeled structure. The TM-score for the modeled protein of this study was 0.60±0.14, which indicates that the model had correct overall topology. Additionally, the normalized Z-score for each threading alignment between the target and a given template indicates the significance of the alignment compared to the average. The I-TASSER documentation advises that a threading alignment with a normalized Z-score > 1 reflects a confident alignment. In this study, the normalized Z-scores for the top 10 templates used by I-TASSER ranged from 1.02 to 3.53, which reflects the confidence of the alignments.
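The three acceptance thresholds quoted above (C-score > -1.5, TM-score > 0.5, normalized Z-score > 1) can be collected into a small check. This is a hedged sketch for illustration: the class `ModelScores` and the function `assess_model` are hypothetical names, and because the ten individual Z-scores are not reported, only the two ends of the stated range are used.

```python
from dataclasses import dataclass
from typing import Dict, List

# Thresholds as quoted in the text.
C_SCORE_MIN = -1.5   # models above this generally have a correct fold
TM_SCORE_MIN = 0.5   # models above this have a correct overall topology
Z_SCORE_MIN = 1.0    # threading alignments above this are considered confident


@dataclass
class ModelScores:
    c_score: float
    tm_score: float
    z_scores: List[float]


def assess_model(scores: ModelScores) -> Dict[str, bool]:
    """Compare each reported score against its threshold."""
    return {
        "fold_likely_correct": scores.c_score > C_SCORE_MIN,
        "topology_likely_correct": scores.tm_score > TM_SCORE_MIN,
        "all_alignments_confident": all(z > Z_SCORE_MIN for z in scores.z_scores),
    }


# Values reported for the selected model; z_scores holds only the reported range limits.
reported = ModelScores(c_score=0.9, tm_score=0.60, z_scores=[1.02, 3.53])
for criterion, passed in assess_model(reported).items():
    print(f"{criterion}: {passed}")
```

Note that the check uses the TM-score point estimate only; with the reported uncertainty of 0.14, the lower end of the interval (0.46) falls slightly below the 0.5 threshold.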

The PSVS suite analyzed the protein structure with the help of several tools. In the Ramachandran plot generated by the PROCHECK program (figure 3), the shading represents the different regions of the plot: the darker the area, the more favorable the φ-ψ combination. Residues in most favored regions, additionally allowed regions and generously allowed regions were 79%, 14.7% and 4.9%, respectively. Only 1.3% of residues were in disallowed regions. Molprobity evaluates the stereochemical quality of a structure by calculating phi and psi torsion angles, backbone bond lengths and backbone bond angles. Molprobity provides a clashscore as a result of an all-atom contact analysis which is performed after adding hydrogen atoms to a structure. When non-donor-acceptor atoms overlap by more than 0.4 Å, at least one of the two atoms must be modeled incorrectly. A clash at this location is noted and incorporated into the clashscore, which is simply the number of clashes per 1000 atoms (Lovell et al., 2003). In this study, the clashscore was quite low (169.39). All these quality evaluation measures showed that the modeled structure was quite reliable.

Overall three-dimensional structure of the protein

The modeled protein belongs to the α/β structural class (Chou and Zhang, 1995), as evidenced from figure 4A. It is also notable that the protein formed a V-shaped structure. One part of this V-shaped structure is dominated by beta sheets and the other part by alpha helices. The volume of the protein was 149974 Å³. The C2 domain lies in the beta sheet-rich area (figure 4B). The modeled structure was submitted to PDBsum to show the secondary structures graphically. This showed the presence of 21 helices, 31 strands (which formed 10 sheets) and 10 beta hairpins. The topology (figure 5) showed that the N-terminal part consisted primarily of beta sheets, while the C-terminal portion was made primarily of alpha helices along with some small beta strands. Of the 31 beta strands, 23 were present in the N-terminal region.

The B-factor, which reflects spatial uncertainty, was calculated using the web-based tool for the analysis of protein flexibility, FlexServ. The minimum B-factor for a residue was 4.663 Å² and the maximum B-factor was 304.671 Å². The protein has six regions, in the form of six peaks, which have B-factor values of more than 100 Å² (figure 6A). In general, several loop regions showed more flexibility, as shown in figure 6B. Maximum flexibility was shown by Pro119, Leu120, Pro121, Thr482, Ala483, Pro718 and Leu719 (figure 6C). As loops do not form any rigid structure in the protein, these flexible regions seem to be vital for structural modifications of the protein.

The disordered regions were mainly situated in the loop regions of the protein (figure 7A). Nineteen beta strands contained disordered regions, in contrast to only 4 alpha helices. The longest disordered region was Glu47 to Thr181, which contained 6 beta strands and only 1 alpha helix. The Polyview 3D program estimated the interacting residues of the protein. In total, 275 residues were predicted as interacting, i.e. interfacial (figure 7B). Comparison of the data on disordered regions and interacting residues showed that 30 interacting residues were predicted to be disordered. Comparing the results of FlexServ and Polyview, it was evident that all of the amino acids which contribute to the flexibility of the protein, except Pro121, form the interacting surfaces of the protein. The distribution of electrostatic potentials (figure 7C) showed that the C2-domain is primarily neutral, with some negatively charged regions and a few positively charged regions. It is also notable that the highly flexible region of the protein has either positive or negative electrostatic potentials. The presence of charged residues in the loop regions of high flexibility suggests their participation in dynamic charge-mediated interactions with other molecules.

Structure of the C2 domain and Ca2+ binding residues

The C2 domain consisted of 4 sheets (9 strands). Of these 9 strands, one very small strand (Asp425-Arg427) was not shown as a strand in the I-TASSER generated model as viewed in Chimera, but was shown in the PDBsum topology (figure 5). Otherwise the topology generated by PDBsum matched the modeled structure. The C2 domain also contains three small alpha helices. However, the C2 domain is not fully formed of helices and strands: 125 of 214 residues (58.41%) did not form any helix or sheet. Usually, the C2 domain forms a beta-sheet scaffold with eight antiparallel strands connected by loops (Reddy and Reddy, 2004). Loops 1-3 are placed on top of the sheets and coordinate Ca2+ binding (Sutton et al., 1995). This binding of the C2 domain to the Ca2+ ion facilitates its interaction with negatively charged phospholipids. The protein studied here, however, interacts with the Ca2+ ion with the help of amino acids within the C2-domain as well as amino acids outside the C2 domain (Asp545, Pro546, Lys547, Ala548 and Gln549), as shown by I-TASSER. The Ca2+ ion is surrounded by nine amino acids (figure 8A). Surprisingly, a similar binding site was shown by an integrin alphaXbeta2 ectodomain from human (PDB ID: 3K6S) (Xie et al., 2010). The Ca2+ bound model was submitted to PDBsum and the LIGPLOT showed bonding of the Ca2+ ion with the backbone nitrogen of Phe424. The

Ca2+ ion formed hydrogen bonds with Leu422, Asp545 and Pro546 (figure 8B). Conclusion The putative synaptotagmin protein from the picoeukaryoic plankton Micromonas investigated in this study is a novel member of the C2-domain containing protein family as it did not show any sequence similarity with other members of the C2 domain family outside the C2-domain as shown by NCBI BLAST search. The NJ tree developed on the basis of sequence alignment also showed that the protein is distinct from other members of the C2domain containing proteins from the plant

2012
kingdom. Finally, this analysis provides insight into the unique structural properties of the protein as well as the novelty of its interaction with calcium. The predicted model of the protein is useful for different experimental purposes in relation to the different signaling mechanisms involving this protein. The interaction between the protein and the Ca2+ ion proposed in this study is useful for understanding the potential mechanism of action of this protein and also its evolutionary significance.

Acknowledgement
The facility situated at the Department of Botany, Dinabandhu Mahavidyalaya is gratefully acknowledged.

References
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. Journal of Molecular Biology 215(3), 403-410.
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25, 3389-3402.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. 2009. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Research 37, W202-W208.
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. 2000. The Protein Data Bank. Nucleic Acids Research 28, 235-242.
Bhattacharya A, Tejero R, Montelione GT. 2007. Evaluating protein structures determined by structural genomics consortia. Proteins 66, 778-795.
Camps J, Carrillo O, Emperador A, Orellana L, Hospital A, Rueda M, Cicin-Sain D, D'Abramo M, Gelpí JL, Orozco M. 2009. FlexServ: an integrated tool for the analysis of protein flexibility. Bioinformatics 25(13), 1709-1710.
Castrignanò T, De Meo PD, Cozzetto D, Talamo IG, Tramontano A. 2006. The PMDB Protein Model Database. Nucleic Acids Research 34, D306-D309.
Cedano J, Aloy P, Pérez-Pons JA, Querol E. 1997. Relation between amino acid composition and cellular location of proteins. Journal of Molecular Biology 266(3), 594-600.
Chou KC, Zhang CT. 1995. Prediction of protein structural classes. Critical Reviews in Biochemistry and Molecular Biology 30, 275-349.
Clamp M, Cuff J, Searle SM, Barton GJ. 2004. The Jalview Java Alignment Editor. Bioinformatics 20, 426-427.
Craxton M. 2004. Synaptotagmin gene content of the sequenced genomes. BMC Genomics 5, 43.
Dolan MA, Noah JW, Hurt D. 2012. Comparison of common homology modeling algorithms: application of user-defined alignments. In: Orry AJW, Abagyan R, eds. Homology
Modeling: Methods and Protocols, Methods in Molecular Biology, vol. 857, Humana Press, USA, 399-414.
Emanuelsson O, Nielsen H, Brunak S, von Heijne G. 2000. Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. Journal of Molecular Biology 300(4), 1005-1016.
Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, Bairoch A. 2005. Protein Identification and Analysis Tools on the ExPASy Server. In: Walker JM, ed. The Proteomics Protocols Handbook. Humana Press, Totowa, New Jersey, USA, 571-607.
Geourjon C, Deléage G. 1995. SOPMA: Significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments. Computer Applications in the Biosciences 11, 681-684.
Ginalski K. 2006. Comparative modeling for protein structure prediction. Current Opinion in Structural Biology 16(2), 172-177.
Hirokawa T, Boon-Chieng S, Mitaku S. 1998. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14, 378-379.
Kelley LA, Sternberg MJE. 2009. Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 4, 363-371.
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. 2007. ClustalW and ClustalX version 2. Bioinformatics 23(21), 2947-2948.
Laskowski RA. 2009. PDBsum new things. Nucleic Acids Research 37, D355-D359.
Laskowski RA, MacArthur MW, Moss DS, Thornton JM. 1993. PROCHECK: a program to check the stereochemistry of protein structures. Journal of Applied Crystallography 26, 283-291.
Laskowski RA, Watson JD, Thornton JM. 2005a. ProFunc: a server for predicting protein function from 3D structure. Nucleic Acids Research 33, W89-W93.
Laskowski RA, Watson JD, Thornton JM. 2005b. Protein function prediction using local 3D templates. Journal of Molecular Biology 351, 614-626.
Letunic I, Doerks T, Bork P. 2012. SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Research 40(D1), D302-D305.
Lewis JD, Lazarowitz SG. 2010. Arabidopsis synaptotagmin SYTA regulates endocytosis and virus movement protein cell-to-cell transport. Proceedings of the National Academy of Sciences USA 107(6), 2491-2496.
Lovell SC, Davis IW, Arendall WB, de Bakker PIW, Word JM, Prisant MG, Richardson JS, Richardson DC. 2003. Structure validation by Cα geometry: φ, ψ and Cβ deviation. Proteins 50, 437-450.
Marchler-Bauer A, Bryant SH. 2004. CD-Search: protein domain annotations on the fly. Nucleic Acids Research 32, W327-W331.
Marchler-Bauer A, Lu S, Anderson JB, Chitsaz F, Derbyshire MK, Deweese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Jackson JD, Ke Z, Lanczycki CJ, Lu F, Marchler GH, Mullokandov M, Omelchenko MV, Robertson CL, Song JS, Thanki N, Yamashita RA, Zhang D, Zhang N, Zheng C, Bryant SH. 2011. CDD: a Conserved Domain Database for the
functional annotation of proteins. Nucleic Acids Research 39, D225-D229.
Nalefski EA, Falke JJ. 1996. The C2 domain calcium-binding motif: structural and functional diversity. Protein Science 5, 2375-2390.
Pagni M, Ioannidis V, Cerutti L, Zahn-Zabal M, Jongeneel CV, Hau J, Martin O, Kuznetsov D, Falquet L. 2007. MyHits: improvements to an interactive resource for analyzing protein sequences. Nucleic Acids Research 35, W433-W437.
Paital B, Kumar S, Farmer R, Tripathy NK, Chainy GBN. 2011. In silico prediction and characterization of 3D structure and binding properties of catalase from the commercially important crab, Scylla serrata. Interdisciplinary Sciences: Computational Life Sciences 3, 110-120.
Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. 2004. UCSF Chimera - a visualization system for exploratory research and analysis. Journal of Computational Chemistry 25(13), 1605-1612.
Porollo A, Meller J. 2007. Versatile Annotation and Publication Quality Visualization of Protein Complexes Using POLYVIEW-3D. BMC Bioinformatics 8, 316.
Pruitt KD, Tatusova T, Maglott DR. 2007. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Research 35, D61-D65.
Reddy VS, Reddy ASN. 2004. Proteomics of calcium-signaling components in plants. Phytochemistry 65, 1745-1776.
Rhodes G. 2006. Crystallography Made Crystal Clear. 3rd ed., Academic Press, Burlington, MA.
Rizo J, Sudhof TC. 1998. C2-domains, structure and function of a universal Ca2+-binding domain. Journal of Biological Chemistry 273, 15879-15882.
Roy A, Kucukural A, Zhang Y. 2010. I-TASSER: a unified platform for automated protein structure and function prediction. Nature Protocols 5(4), 725-738.
Schultz J, Milpetz F, Bork P, Ponting CP. 1998. SMART, a simple modular architecture research tool: Identification of signaling domains. Proceedings of the National Academy of Sciences USA 95, 5857-5864.
Sigrist CJA, Cerutti L, de Castro E, Langendijk-Genevaux PS, Bulliard V, Bairoch A, Hulo N. 2010. PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Research 38, D161-D166.
Söding J, Biegert A, Lupas AN. 2005. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Research 33, W244-W248.
Sutton RB, Davletov BA, Berghuis AM, Sudhof TC, Sprang SR. 1995. Structure of the first C2 domain of synaptotagmin I: a novel Ca2+/phospholipid-binding fold. Cell 80, 929-938.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Molecular Biology and Evolution 28, 2731-2739.
Voss NR, Gerstein M, Steitz TA, Moore PB. 2006. The geometry of the ribosomal polypeptide exit tunnel. Journal of Molecular Biology 360(4), 893-906.

Voss NR. 2007. Geometric Studies of RNA and Ribosomes, and Ribosome Crystallization. PhD dissertation, Yale University.
Wallace AC, Laskowski RA, Thornton JM. 1995. LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions. Protein Engineering, Design & Selection 8(2), 127-134.
Ward JJ, McGuffin LJ, Bryson K, Buxton BF, Jones DT. 2004. The DISOPRED server for the prediction of protein disorder. Bioinformatics 20, 2138-2139.
Waterhouse AM, Procter JB, Martin DMA, Clamp M, Barton GJ. 2009. Jalview version 2: A Multiple Sequence Alignment and Analysis Workbench. Bioinformatics 25(9), 1189-1191.
Worden AZ, Lee J-H, Mock T, Rouzé P, Simmons MP, Aerts AL, Allen AE, Cuvelier ML, Derelle E, Everett MV, Foulon E, Grimwood J, Gundlach H, Henrissat B, Napoli C, McDonald SM, Parker MS, Rombauts S, Salamov A, Von Dassow P, Badger JH, Coutinho PM, Demir E, Dubchak I, Gentemann C, Eikrem W, Gready JE, John U, Lanier W, Lindquist EA, Lucas S, Mayer KF, Moreau H, Not F, Otillar R, Panaud O, Pangilinan J, Paulsen I, Piegu B, Poliakov A, Robbens S, Schmutz J, Toulza E, Wyss T, Zelensky A, Zhou K, Armbrust EV, Bhattacharya D, Goodenough UW, Van de Peer Y, Grigoriev IV. 2009. Green evolution and dynamic adaptations revealed by the genomes of the marine picoeukaryote Micromonas. Science 324, 268-272.
Xie C, Zhu J, Chen X, Mi L, Nishida N, Springer TA. 2010. EMBO Journal 29(3), 666-679.
Zhang Y. 2007. Template-based modeling and free modeling by I-TASSER in CASP7. Proteins 69(S8), 108-117.
