
Chiang Mai J. Sci.

2020; 47(6) : 1158-1171


http://epg.science.cmu.ac.th/ejournal/
Contributed Paper (Short Communication)

Mining Late Embryogenesis Abundant (LEA) Family Genes in Amaranthus hypochondriacus
Phi Bang Cao
Faculty of Natural Sciences, Hung Vuong University, Vietnam.
*Author for correspondence; e-mail: phibang.cao@hvu.edu.vn
Received: 24 December 2019
Revised: 6 May 2020
Accepted: 15 July 2020
ABSTRACT
Late embryogenesis abundant (LEA) proteins were first described as accumulating during plant seed dehydration, specifically at the late stages of embryogenesis. Since then they have also been found in plant vegetative tissues under water limitation and have thereby been associated with the acquisition of desiccation tolerance. This work provides the first comprehensive study of the LEA gene family in Amaranthus hypochondriacus, an important grain crop that is widely grown around the world. A total of 46 genes encoding LEAs (AhLEAs) were identified and classified into eight groups (LEA1, LEA2, LEA3, LEA4, LEA5, LEA6, DHN, and SMP) on the basis of their predicted amino acid sequences and their phylogenetic relationships with the Arabidopsis thaliana LEA proteins (AtLEAs). Similar gene structures and motif architectures were observed within each LEA group. Analysis of their chromosomal localizations revealed that the AhLEAs were non-randomly distributed across all 16 chromosomes and that 43% of all AhLEAs are tandemly or whole-genome duplicated genes. In silico expression analysis using RNA-seq data revealed that the AhLEA genes were widely expressed in various vegetative and reproductive tissues, including mature seeds. Tissue-specific expression was observed for most AhLEA genes. Additionally, most AhLEA genes were up-regulated by water deficit stress. These results provide a useful reference for further investigation of AhLEA functions and for their application in crop improvement.

Keywords: Amaranthus hypochondriacus, Late Embryogenesis Abundant (LEA), gene expression analysis,
gene ontology analysis, phylogeny analysis

1. INTRODUCTION
Late embryogenesis abundant (LEA) proteins were first identified in cotton seeds at the late stages of embryo development [1]. However, these proteins are not restricted to seeds. Many LEA proteins have also been found in vegetative tissues [2]. Moreover, LEA homologs have been found in non-viridiplantae species such as bacteria [3], mosses [4], fungi [5] and animals [6]. As a result of their particular amino acid composition, most LEA proteins display high hydrophilicity and heat stability in solution [7]. However, some of them can adopt particular three-dimensional structures under desiccation or extreme temperature conditions [8]. In plants, LEA proteins play an important physiological role in development and stress tolerance via many proposed mechanisms, such as protection of proteins and organelles, interactions with membranes, stabilization of sugar glasses, hydration buffering, and ion sequestration [9]. Many LEA proteins may function as chaperones and protect enzyme activities under in vitro conditions [10].
LEA proteins have been classified into different groups by various authors [11, 12]. First, Dure (1994) divided these proteins into eight groups, D-7, D-11, D-19, D-29, D-34, D-73, D-95, and D-113, depending on their molecular weight [12]. More recently, the availability of many plant genomes has allowed the LEA superfamily to be annotated and classified. In Arabidopsis thaliana, 51 LEA proteins were classified into nine groups using the Pfam nomenclature [11]. Deduced amino acid sequences of LEA proteins allowed the identification of eight groups using the Pfam nomenclature in sorghum (Sorghum bicolor) [2], upland cotton [13], and sweet orange (Citrus sinensis) [8]. In potato, 29 LEA proteins were divided into nine distinct groups [14]. Surprisingly, a total of 242, 136, and 142 LEA genes were identified in Gossypium hirsutum, G. arboreum, and G. raimondii, respectively [13]. Most recently, 68 LEA genes were identified in Sorghum bicolor [2].
The expression of LEA genes has been correlated with abiotic stress tolerance in many plants [2, 14]. LEA gene expression can be involved in ABA-dependent or ABA-independent signaling pathways. Many LEA genes are expressed in response to ABA, drought, and salt stress. ABA-responsive elements (ABRE) and/or dehydration-responsive elements (DRE/CRT) were found in their gene promoter regions [2, 8]. Overexpression of a LEA gene confers tolerance to water deficit in transgenic plants, including rice [15].
Amaranthus hypochondriacus belongs to the Amaranthus genus, which has gained much attention, particularly for its high economic and nutritional value [16]. A previous study on A. cruentus described a novel late embryogenesis abundant protein in seeds that belongs to the LEA5 group [17]. Another LEA was identified in seeds and many other plant tissues of different wild and domesticated amaranth species. This protein could play an important function during the seed drying process. Experimental analysis indicated that this LEA was an intrinsically disordered protein involved in protection against desiccation, oxidant conditions, and osmotic stress [17]. Recently, differential accumulation of late embryogenesis abundant proteins as storage proteins in seeds of wild and cultivated amaranth species was reported [18]. However, little is known about the LEA family in Amaranthus. In this work, we first identified the LEA genes in the A. hypochondriacus genome and then conducted detailed characterization and expression analyses using a bioinformatic approach.

2. MATERIALS AND METHODS
2.1 Identification of LEA from A. hypochondriacus Genomic Sequences and Construction of the Phylogenetic Tree
BLAST searches and sequence analyses were performed by BLASTp on the A. hypochondriacus genome v2.1 (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Ahypochondriacus_er) using known A. thaliana LEA proteins [11] as queries. The selected Amaranthus genes were used for a second TBLASTN round on the A. hypochondriacus genome; this additional step allowed the discovery of A. hypochondriacus paralogs that had been excluded because of their dissimilarity to the Arabidopsis orthologs.
All the identified A. hypochondriacus candidates were analyzed using the Pfam server (https://pfam.xfam.org/) [19] to confirm the presence of LEA conserved domains in their protein structure. Proteins without any appropriate motif were manually re-predicted and corrected with the Fgenesh software (http://linux1.softberry.com/berry.phtml) using dicotyledon plant models. Joint phylogenetic analysis with all Arabidopsis LEAs allowed the validation of sequences without LEA conserved domains. Proteins grouped with Arabidopsis LEAs were considered to belong to the LEA family. The multiple sequence alignment of
the A. hypochondriacus and Arabidopsis LEA proteins was performed using MAFFT (http://mafft.cbrc.jp/alignment/) [20] with default parameters and iterative refinement methods. The phylogenetic tree was constructed using Mega X [21].

2.2. Sequence Analysis
The MEME suite tool (http://meme.sdsc.edu/meme) was used to analyze protein motifs with the following parameters: distribution of motifs: any number of repetitions; maximum number of motifs: 5; minimum number of sites: 2; maximum number of sites: 5; minimum motif width: 6 and maximum motif width: 50.
The chromosomal localization of AhLEA genes was retrieved from Phytozome (https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Ahypochondriacus_er).
The characteristics of AhLEA proteins were calculated using the ProtParam online tool (http://www.expasy.org/tools/protparam.html) [22], including the number of amino acids, molecular weight, theoretical isoelectric point (pI), and grand average of hydropathicity (GRAVY). The prediction of the LEA subcellular location was performed using the YLoc program (http://abi.inf.uni-tuebingen.de/Services/YLoc/webloc.cgi) [23].

2.3. Gene Ontology (GO) and In Silico Expression Analysis
Functional analysis of LEA proteins was carried out using the OmicsBox program, version 1.1.164 [24], to predict biological functions, cellular content, and molecular functions. Amino acid sequences of predicted LEA proteins were loaded into OmicsBox. Functional analysis was done by the Blast2GO program in three steps: (i) matching was conducted with the loaded sequences in the program (BLASTp); (ii) mapping associated with the BLAST results was completed (Mapping); and (iii) a dump file related to the sequences was created (Annotation).
The gene expression values, in Fragments Per Kilobase of transcript per Million fragments mapped (FPKM), were obtained from RNA-seq libraries, including Green Cotyledon (SRX722062), Root tissue (SRX722060), Leaf tissue (SRX722059), Stem tissue (SRX722057), Floral tissue (SRX722058), Immature seeds (SRX722056), Mature seeds (SRX722063), and Water stressed tissue sample (SRX722061) [25]. The results are shown as log-transformed (base 10) FPKM values.

3. RESULTS
3.1. Identification of A. hypochondriacus LEA Genes
The members of the LEA gene family in the A. hypochondriacus genome were identified by a series of basic local alignment search tool (BLAST) searches using LEA proteins of Arabidopsis as query sequences, followed by a second round of TBLASTN against the A. hypochondriacus genome. Sequences that lacked start or stop codons or exon sequences in comparison with their Arabidopsis orthologs were re-predicted using the Fgenesh software (http://linux1.softberry.com/berry.phtml).
In total, 46 full-length genes encoding putative LEA proteins were selected (Table 1). Domain analysis using Pfam validated 42 of the 46 candidate genes. The remaining four genes, which did not contain LEA-conserved domains, were further validated by joint phylogenetic analysis with all Arabidopsis orthologs (Figure 1).

3.2. Phylogenetic Analysis and Classification of AhLEAs
The phylogenetic tree was constructed with LEA sequences from A. hypochondriacus and A. thaliana in Mega X [21] using the Maximum Likelihood method with 1,000 bootstrap replicates (Figure 1). Phylogenetic analysis revealed that AhLEA proteins can be divided into eight major groups (Figure 1): LEA1-6, SMP, and DHN, corresponding to the Pfam nomenclature. With 14 members, the LEA4 group was the largest. Next, LEA2 had 11 members. The SMP group was made up of seven genes. DHN contained five members. LEA1 and LEA3 comprised three genes each and LEA6 contained only one

Table 1. Identification and characteristics of the LEA genes in A. hypochondriacus.


Gene name Subgroup Locus ID PL (aa) MW (kDa) pI GRAVY IN SCL
AhDHN-1 DHN Ah001781 208 21.96 5.89 -0.96 1 Cyt
AhDHN-2 DHN Ah002326 174 19.58 5.84 -1.51 1 N
AhDHN-3 DHN Ah003804 265 27.67 6.69 -1.05 1 Cyt
AhDHN-4 DHN Ah005786 147 16.01 5.96 -1.17 2 Cyt
AhDHN-5 DHN Ah023683 225 25.48 5.62 -1.50 3 M
AhLEA1-1 LEA1 Ah009310 80 9.14 9.13 -1.16 1 N
AhLEA1-2 LEA1 Ah011491 159 16.86 9.05 -0.86 1 Cyt
AhLEA1-3 LEA1 Ah020326 83 9.06 8.93 -1.31 1 N
AhLEA2-1 LEA2 Ah002467 231 25.02 9.10 0.23 1 Cyt
AhLEA2-2 LEA2 Ah007830 258 28.80 10.04 -0.20 1 N
AhLEA2-3 LEA2 Ah009482 325 35.75 9.24 -0.53 0 Cyt
AhLEA2-4 LEA2 Ah009631 159 17.62 4.91 -0.23 0 Cyt
AhLEA2-5 LEA2 Ah010338 215 23.26 9.75 0.20 0 N
AhLEA2-6 LEA2 Ah011754 317 35.04 4.87 -0.33 0 Cyt
AhLEA2-7 LEA2 Ah013240 186 20.55 6.19 0.03 1 M
AhLEA2-8 LEA2 Ah016081 276 30.06 9.88 -0.23 1 M
AhLEA2-9 LEA2 Ah021219 229 26.09 9.51 -0.22 1 N
AhLEA2-10 LEA2 Ah021764 208 23.34 9.40 0.11 0 N
AhLEA2-11 LEA2 Ah022417 169 18.51 9.80 -0.13 1 Cyt
AhLEA3-1 LEA3 Ah012945 96 10.49 10.09 -0.27 0 M
AhLEA3-2 LEA3 Ah013427 122 14.11 9.08 -0.91 2 Cyt
AhLEA3-3 LEA3 Ah022093 88 9.77 5.43 -0.53 0 Cyt
AhLEA4-1 LEA4 Ah000279 648 68.44 9.00 -0.88 0 N
AhLEA4-2 LEA4 Ah002005  149 15.89 8.92 -1.20 1 Cyt
AhLEA4-3 LEA4 Ah012685 295 32.36 6.19 -1.07 2 Cyt
AhLEA4-4 LEA4 Ah012686 231 25.59 8.55 -1.15 1 Cyt
AhLEA4-5 LEA4 Ah015080  390 41.85 6.22 -0.90 0 Chp
AhLEA4-6 LEA4 Ah015243  175 19.87 9.44 -1.30 1 M
AhLEA4-7 LEA4 Ah016434 67 6.92 5.15 -0.80 1 ER
AhLEA4-8 LEA4 Ah016488 65 6.98 4.96 -1.10 3 M
AhLEA4-9 LEA4 Ah016489 72 7.52 4.76 -0.89  2  N
AhLEA4-10 LEA4 Ah021631  439 47.02 6.11 -0.99 1 N
AhLEA4-11 LEA4 Ah021661  283 31.92 6.70 -1.24 2 N
AhLEA4-12 LEA4 Ah021865  420 45.88 6.23 -1.18 1 N
AhLEA4-13 LEA4 Ah022926 70 7.48 4.81 -1.19 1 N
AhLEA4-14 LEA4 Ah022928 65 6.80 5.22 -1.02 1 N
AhLEA5-1 LEA5 Ah011466 81 8.53 5.85 -1.40 0 Cyt
AhLEA5-2 LEA5 Ah018817 86 9.67 5.88 -1.59 2 N
AhLEA6-1 LEA6 Ah006187 86 9.29 5.49 -1.37 1 Cyt
AhSMP-1 SMP Ah002544  144 14.85 4.44 -0.19 1 Cyt
AhSMP-2 SMP Ah007095 239 24.53 5.03 -0.23 0 N
AhSMP-3 SMP Ah010846  188 20.11 9.01 -0.53 0 Cyt
AhSMP-4 SMP Ah017845 272 28.66 4.98 -0.40 2 N
AhSMP-5 SMP Ah017846 240 25.12 4.70 -0.48 2 N
AhSMP-6 SMP Ah017848 105 10.69 4.31 -0.11 1 N
AhSMP-7 SMP Ah017849  243 25.38 4.36 -0.22 1 Cyt
PL: Protein full length, MW: Molecular weight, IN: Introns number, SCL: Sub cellular Location, Cyt: Cytoplasm, Chp:
Chloroplast, ER: Endoplasmic reticulum M: Mitochondrion, N: Nucleus.
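
The GRAVY column in Table 1 is the grand average of hydropathicity: the sum of per-residue hydropathy values divided by the sequence length, with negative scores indicating a hydrophilic protein. The Python sketch below illustrates that calculation, assuming the standard Kyte-Doolittle hydropathy scale. The function name `gravy` is hypothetical, and the example peptide is the K-segment consensus EKKGIMDKIKEKLPG quoted in the Discussion rather than a full AhLEA sequence, so the score is not a value from Table 1.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
# (assumed here to be the scale behind the GRAVY column).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Illustrative input: the K-segment consensus, not a full protein.
k_segment = "EKKGIMDKIKEKLPG"
score = gravy(k_segment)
label = "hydrophilic" if score < 0 else "hydrophobic"
print(f"GRAVY = {score:.2f} ({label})")  # prints: GRAVY = -1.18 (hydrophilic)
```

The lysine-rich K-segment alone scores well below zero, which is in line with the strongly negative GRAVY values listed for the AhDHN proteins.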

Figure 1. Phylogenetic tree of the LEA family from A. hypochondriacus (Ah) and A. thaliana (At). The
tree was generated using Mega X program by Neighbor-Joining method. Bootstrap values are indicated
at each branch.
member. The species-specific Arabidopsis group (AtM) of the LEA gene family was absent in the A. hypochondriacus genome.

3.3. Characteristics of Predicted LEA Genes in A. hypochondriacus
In silico analysis was used to determine the characteristics of the predicted LEA genes in A. hypochondriacus. The AhLEA genes contained few introns: only two of them contained three introns, eight had two introns, most of them (24/46) had only one intron, and 12 had no intron (Table 1). Most AhLEA deduced proteins (37/46) were smaller than 30 kDa (Table 1). Eleven of them were very small LEA proteins (<10 kDa). AhLEA4-1 was the only protein that showed a high molecular weight (~68 kDa). The theoretical pI values of the AhLEA proteins ranged from 4.31 to 10.09. Most AhLEAs (42/46) were hydrophilic. The GRAVY scores of all A. hypochondriacus LEA proteins were negative, except for four AhLEA2 proteins, AhLEA2-1, AhLEA2-5, AhLEA2-7, and AhLEA2-10 (Table 1).
LEA1 group
The LEA1 group of A. hypochondriacus was composed of three genes, AhLEA1-1, AhLEA1-2, and AhLEA1-3. Each of them contained one intron. The molecular weights of these proteins were 9.14, 16.86, and 9.06 kDa, respectively. The theoretical pI values of these proteins were 9.13, 9.05, and 8.93, respectively, showing that they were alkaline. All three AhLEA1 proteins were hydrophilic, with GRAVY values ranging from -1.31 to -0.86. AhLEA1-1 and AhLEA1-3 were predicted to be located in the nucleus whereas AhLEA1-2 was in the cytoplasm (Table 1). MEME analysis detected only one consensus motif, containing 13 amino acids (MQ[AS][AV]K[DQ]K[IV][KS][DE][IM][AS]S), in all three AhLEA1 proteins. Additionally, three other motifs, 2, 3, and 4, were present in only two out of three LEA1 members (Table 2).
LEA2 group
The LEA2 group contained 11 genes (AhLEA2-1 to AhLEA2-11), making it the second largest group in the A. hypochondriacus LEA family. Six of them (AhLEA2-1, AhLEA2-2, AhLEA2-7, AhLEA2-8, AhLEA2-9, and AhLEA2-11) had only one intron while the five others were intronless. Their deduced full-length protein sequences ranged from 159 to 325 amino acids while the molecular weights of these proteins varied from 17.62 to 35.75 kDa. The theoretical pI values of the eleven AhLEA2 proteins ranged from 4.87 to 10.04, with three acidic and eight alkaline proteins (Table 1). Four proteins, AhLEA2-1, AhLEA2-5, AhLEA2-7, and AhLEA2-10, were hydrophobic with positive GRAVY values. In contrast, the GRAVY values of the seven other proteins were negative, indicating that they were hydrophilic. Subcellular localization prediction showed that the AhLEA2 proteins were located in various subcellular compartments: four in the nucleus, five in the cytoplasm, and two in the mitochondrion (Table 1). MEME analysis revealed that the two most common motifs (motifs 5 and 6) were present in nine out of 11 AhLEA2s, the exceptions being AhLEA2-4 and AhLEA2-6. In addition, motif 7 was present in seven out of 11 AhLEA2s. Four genes, AhLEA2-2, AhLEA2-5, AhLEA2-8 and AhLEA2-9, shared the consensus motif [NK]R[NS]C[CS]CR[AKV][CIL]CC at the N-terminus (Table 2).
LEA3 group
The LEA3 group of A. hypochondriacus was made up of three members. The AhLEA3-2 gene contained two introns whereas AhLEA3-1 and AhLEA3-3 lacked introns. The theoretical pI values of AhLEA3-1 and AhLEA3-2 were 10.09 and 9.08, which indicated that they were alkaline. However, the AhLEA3-3 protein was acidic (pI 5.43). The GRAVY values of all AhLEA3 proteins were negative (ranging from -0.91 to -0.27), indicating that they were hydrophilic. Subcellular localization prediction showed that AhLEA3-1 was in the mitochondrion, whereas AhLEA3-2 and AhLEA3-3 were in the cytoplasm (Table 1). Motif 9 was discovered in all three

Table 2. Conserved motifs in different groups of A. hypochondriacus LEA proteins.


Group Motif Sites E-value Consensus sequence
LEA1 1 3 3.50E+02 MQ[AS][AV]K[DQ]K[IV][KS][DE][IM][AS]S
LEA1 2 2 9.90E-01 [NY][ET]TGY[PR]A[AP][GH][NY][IN]
LEA1 3 2 5.00E+01 IAKEVR
LEA1 4 2 7.70E+01 [GQ]H[DL]Q[HV]D
LEA2 5 9 1.60E-10 [FP][PV]SL[DS][AS]X[FL]QLT[IV][TH][AV][RK]NPN[KI][KV]I[GI]IYY
LEA2 6 9 1.10E-05 [LS][AIL]L[LIV][LGV][LAV][IT][GAL][AIL][AT][IAT][LT][IV][FIL][FLWY]L[LVA][FAY][KRH]P[KRS][KV]P[HKT][FY][ST][LV][DIQ]S
LEA2 7 7 3.00E-13 [LMF][ST][IVA][YMW]Y[KD][GDN][IST][PLQR]L[GC][QSR][GA][HRSVY][LVF][PE][ADGK][FG][YS]Q[PG][APH][RHLN][SN][CTV]
LEA2 8 4 5.60E-10 [NK]R[NS]C[CS]CR[AKV][CIL]CC
LEA3 9 3 2.90E-15 [SF]W[IM][PK]DP[VE]TG[YN][YW][RI]P[AE][NT][RH][AF][AG]E[IV]D[VI][AV]ELR[ER][IKM][LF][LI]
LEA3 10 3 5.90E+00 [FL]A[AM]ASQ
LEA3 11 3 2.70E+01 CN[RW][IQ][DT][FY]
LEA3 12 2 1.70E+02 [QR]KIRSQ
LEA4 13 32 7.70E-98 [DN]KA[KS][EQ][AI]KDX[TA][VM]EK[AT]GEAKD[YK][TVA]
LEA4 14 31 9.90E-37 E[AK]ASK[MA]KEKA[KS][DG][AT]A[ED]S[VA]K[ED][AS]T
LEA4 15 7 3.70E-07 [ECQ]N[AY]S[YI][QA][AS]G[QG][AGV][KR][AG][TAE]T
LEA5 16 3 5.00E-01 QTR[KR]EQ[LM]G[TH]EGY[QK][EQ][LM]
LEA5 17 2 4.40E+01 MA[ST][GR]Q[EQ][NS][KR][EK]ELD
LEA5 18 2 3.90E+01 [AI][EK][IQ][DG]E[ST][KV][FV][PR]
LEA5 19 2 1.10E+03 G[LR][ST]T[GM]D
LEA5 20 2 1.10E+03 [DL]EA[GQ][AQ][EH][DL][AV]
LEA6 21 1 1.20E-04 T[DE]APT
DHN 22 9 3.70E-15 [HKQ][ED]KKG[IF][ML][DE]KIK[ED]K[LI]PG[TFG][HG]
DHN 23 5 9.70E-10 [LI][HQ][RH][SDT][GNS][SY]S[ST]S[SG]S[SDK]E
DHN 24 6 3.00E-05 [LH][RK][DE][EDKS][YEHL][GAE][AEKNQ][PAK][VL][HRE][QEHL][TDN][DH][EDH][FLNTY][GN][NRV][PH][VHI][QEK]
SMP 25 17 1.70E-78 [LV][EF]A[VT]AKLA[GA]DK[PA][VI][TD][PR][SR]DA[AE]A[IVM]QAAEMR[AN]T[GP]
SMP 26 16 1.40E-23 PGG[VLP][AG][AES]X[AM]Q[AS]AA[TR]IN
SMP 27 7 6.30E-01 VTIGEA
SMP 28 4 2.10E-06 [HR][LR][MI]YD[EQ]DKT[KT]I[SA]EI[LI]
SMP 29 4 3.00E-04 [PA]IKYGD
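
The consensus sequences in Table 2 use a bracket notation in which `[NK]` means "N or K at this position" and `X` means any residue. That notation maps almost directly onto a regular expression, so a consensus can be used to scan a protein sequence for candidate motif sites. The Python sketch below illustrates the idea with motif 8 from the LEA2 group. The helper names and the toy sequence are hypothetical, and a regex scan is only an approximation of MEME, which scores sites against a position-specific matrix rather than requiring an exact consensus match.

```python
import re


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a Table 2 style consensus into a regular expression.

    Bracketed alternatives such as [NK] are already valid regex character
    classes; the wildcard X is translated to '.', i.e. any residue.
    """
    return re.compile(consensus.replace("X", "."))


def find_motif_sites(consensus: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each site found."""
    pattern = consensus_to_regex(consensus)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Motif 8 (LEA2 group) as listed in Table 2.
MOTIF_8 = "[NK]R[NS]C[CS]CR[AKV][CIL]CC"

# Hypothetical toy sequence built for illustration, not an AhLEA protein.
toy_sequence = "MAKRNCCCRACCCLLV"

print(find_motif_sites(MOTIF_8, toy_sequence))  # [(3, 'KRNCCCRACCC')]
```

Because the scan demands an exact match at every position, it will miss sites that MEME would still report as weaker instances of the same motif.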
AhLEA3 proteins by the MEME tool. In addition, three motifs, 10, 11, and 12, were found in two out of three AhLEA3 members (Table 2). Variation in motifs in this group suggested functional divergence.
LEA4 group
LEA4 was the largest LEA group of A. hypochondriacus, with 14 members. Most of them (12/14) contained one to three introns. However, AhLEA4-1 and AhLEA4-5 lacked introns. Their protein lengths varied from 65 to 648 amino acids, corresponding to molecular weights ranging from 6.80 to 68.44 kDa (Table 1). The theoretical pI values of the LEA4 proteins were variable, from 4.76 to 9.44, with four alkaline and ten acidic proteins. All AhLEA4 proteins were hydrophilic, with negative GRAVY scores ranging between -1.30 and -0.80 (Table 1). The results of subcellular prediction revealed that the AhLEA4 group was widely distributed in most of the subcellular compartments, including the endoplasmic reticulum (AhLEA4-7), chloroplast (AhLEA4-5), mitochondrion (AhLEA4-6 and AhLEA4-8), cytoplasm (AhLEA4-2, AhLEA4-3, and AhLEA4-4) and nucleus (the seven remaining members) (Table 1). MEME analysis showed the three most common motifs in this group, including motif 14, E[AK]ASK[MA]KEKA[KS][DG][AT]A[ED]S[VA]K[ED][AS]T, which was shared by all AhLEA4 proteins and repeated several times in some genes. Next, motif 13 was found in eight members, with several repetitions in some genes. Motif 15 was present at the N-terminus of seven AhLEA4 proteins (Table 2).
LEA5 group
The LEA5 group of A. hypochondriacus contained two members, AhLEA5-1 and AhLEA5-2. The former lacked introns whereas the latter contained two introns. The protein sequences were 81 and 86 amino acids in length, corresponding to molecular weights of 8.53 and 9.67 kDa. These two proteins were acidic since their theoretical pI values were 5.85 and 5.88. Both AhLEA5 proteins were hydrophilic, with negative GRAVY scores ranging from -1.59 to -1.40. The results of subcellular prediction showed that AhLEA5-1 was in the cytoplasm whereas AhLEA5-2 was in the nucleus (Table 1). MEME analysis indicated that the two AhLEA5 proteins shared four common motifs, 16, 17, 18, and 19. Of these, motif 16 was repeated twice in AhLEA5-2. Additionally, motif 20 was repeated twice only in AhLEA5-1 (Table 2).
LEA6 group
The LEA6 group of A. hypochondriacus contained only one gene, AhLEA6-1, which contained one intron. The protein sequence comprised 86 amino acids, corresponding to a molecular weight of 9.29 kDa. The theoretical pI value was 5.49. The AhLEA6-1 protein was hydrophilic, with a negative GRAVY score (-1.37). AhLEA6-1 was predicted to be located in the cytoplasm (Table 1). No shared motif was detected in LEA6 by MEME analysis since this group contained a single gene.
Dehydrin (DHN) group
DHN was the fourth largest group in the A. hypochondriacus LEA gene family, with five genes, AhDHN-1 to AhDHN-5. Introns were found in each of them. The first three contained only one intron whereas AhDHN-4 contained two introns and AhDHN-5 contained three. Their protein lengths ranged from 147 to 265 amino acids, corresponding to molecular weights ranging from 16.01 to 27.67 kDa. Their theoretical pI values ranged from 5.62 to 6.69, suggesting that these DHNs were weakly acidic. All five AhDHN proteins were hydrophilic, with negative GRAVY values ranging from -1.51 to -0.96. The results of subcellular prediction showed that AhDHN-1, AhDHN-3, and AhDHN-4 were in the cytoplasm whereas AhDHN-2 was in the nucleus and AhDHN-5 in the mitochondrion (Table 1). Using MEME, two motifs, 22 and 23, were discovered in all five members of the DHN group. In detail, motif 22 ([ER][KEH][EK]KKG[FLI][LM][DE]KIK[ED]KLPG[GT][HGK][HK]) was repeated two or three times. Additionally, motif 24 ([ADN][FHR]

LRD[ES][YH]G[KNQ]PV[RE][QL]TDE[FLY]G[NR]PV[QK][HL][TK][GD][TE][MH]G[AQ][YP]) was detected in only three genes, AhDHN-1, AhDHN-3 and AhDHN-4.
Seed Maturation Protein (SMP) group
SMP was the third-largest group within the A. hypochondriacus LEA gene family. This group contained seven genes. Among them, two genes, AhSMP-2 and AhSMP-3, lacked introns, while AhSMP-1, AhSMP-6, and AhSMP-7 contained one intron and the two others, AhSMP-4 and AhSMP-5, contained two introns. The full-length polypeptides of these SMPs ranged from 105 to 272 amino acids, with molecular weights from 10.69 to 28.66 kDa. The pI value of AhSMP-3 was 9.01, suggesting that it was alkaline. The six other proteins were acidic, with pI values ranging from 4.31 to 5.03. The negative GRAVY values indicated that all seven AhSMPs were hydrophilic. Three members, AhSMP-1, AhSMP-3, and AhSMP-7, were located in the cytoplasm and the four others were in the nucleus (Table 1). MEME analysis revealed three common motifs, 25, 26, and 27, which were shared by all seven AhSMP genes. Motifs 25 and 26 were repeated twice or thrice in six out of seven SMP members, the exception being AhSMP-6. Additionally, two other motifs, 28 and 29, were discovered in four out of seven AhSMPs (Table 2).

3.4. Gene Ontology (GO) Analysis
Gene ontology (GO) analysis of AhLEA genes was carried out using OmicsBox v1.1.164 (Figure 2). The biological processes, molecular functions, and cellular components of A. hypochondriacus LEA genes were investigated using the GO database. The results indicated that 15 out of 46 AhLEAs were putatively implicated in a variety of biological processes. Four AhLEA genes (AhLEA2-4, AhLEA2-6, AhLEA2-7, AhLEA2-11) were predicted to function in the response to desiccation, followed by response to water deprivation and response to abscisic acid (three genes each). Cold acclimation, protein stabilization, embryo development ending in seed dormancy, and response to water were associated with two genes each. Prediction of molecular function showed that AhDHN-3 was involved in protein-containing complex binding, AhDHN-4 was linked to binding and AhLEA2-1 was associated with translation initiation factor activity. Eighteen out of 46 AhLEA genes were putatively linked to cellular components. Prediction analysis showed that nine AhLEAs were associated with integral component of membrane, seven with the cytosol, and two with the mitochondrion. Four other cellular components, namely membrane, plasma membrane, nucleus, and chromosome, were associated with one gene each.
In all the LEA groups of A. hypochondriacus, molecular function, biological process, and cellular component terms were recorded, except in LEA1, in which only two GO categories, biological process and cellular component, were observed.

3.5. Expression Profiling of LEA Genes
The expression profile of the 46 AhLEA genes was analyzed bioinformatically using libraries from an open data archive (SRA, Sequence Read Archive) (Figure 3). Expression analysis was conducted on different tissues including green cotyledons (GC), roots (RT), leaves (LT), stems (ST), floral tissues (FT), immature seeds (IMS), mature seeds (MS) and a water-stressed tissue sample (mixed root, stem, and leaf) [25]. The heat map indicated that all of the AhLEA genes were expressed at least in green cotyledons, floral tissues, and seeds (Figure 3). All AhDHNs were expressed in all analyzed tissues. AhDHN-1 and AhDHN-3 were most strongly expressed in the green cotyledon. These two genes were also highly expressed in floral tissues and mature seeds. Notably, AhDHN-5 had high expression levels in all analyzed tissues. All DHN genes were more highly expressed in the water-stressed tissue sample (mixed root, stem, and leaf) than in the separate tissues under normal conditions.

Figure 2. GO analysis of LEA genes in A. hypochondriacus. (A) Biological process (BP) results of GO
functional enrichment analysis, (B) Molecular function (MF) results of GO functional enrichment
analysis, (C) Cellular component results of GO functional enrichment analysis.

LEA2 and LEA6 were two groups with expression profiles similar to that of the DHN group. Expression of all LEA2 and LEA6 genes was observed in all tissues at various levels. Two out of three AhLEA1 genes, AhLEA1-2 and AhLEA1-3, were expressed in all eight tissues whereas expression of AhLEA1-1 was not detected in leaves. Similarly, expression of two out of three AhLEA3 genes, AhLEA3-1 and AhLEA3-3, was found in all tissues whereas the transcript of AhLEA3-2 was not detected in the stem. The two LEA5 genes were expressed in most of the analyzed tissues, except that expression of AhLEA5-2 was not found in the stem. Seven out of 14 LEA4 genes and three out of seven SMP genes displayed expression in all tissues. However, other members of these two groups were not detected in one or two vegetative tissues. Higher expression levels in the floral tissues, seeds and green cotyledons in comparison to vegetative tissues were observed for the 46 LEA genes of A. hypochondriacus. Except for four LEA4 genes (AhLEA4-8, AhLEA4-9, AhLEA4-13, and AhLEA4-14) and one SMP gene (AhSMP-3), most AhLEAs were expressed more strongly in the water-stressed tissue sample than in the separate vegetative tissues.

Figure 3. Heatmap showing expression level of AhLEAs in eight tissues. Colour scale represents RPKM normalized log10 transformed counts. The green scale indicates low expression and red indicates high expression. GC: Green Cotyledon (no perisperm, no seed coat), RT: Root tissue, LT: Leaf tissue, ST: Stem tissue, FT: Floral tissue (tepals from both male and female flower), IMS: Immature seeds, MS: Mature seeds, WS: Water stressed tissue sample (mixed root, stem, and leaf).
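
The two operations behind this expression analysis, a base-10 log transform of FPKM values for the heat map and a comparison of the water-stressed sample against the separate vegetative tissues, can be sketched in Python as follows. This is an illustrative sketch only: the pseudocount of 1 (added so that undetected transcripts with FPKM = 0 can be plotted), the choice of RT, LT and ST as the vegetative tissues, the function names, and the FPKM values are all assumptions and are not taken from the study.

```python
import math

# Tissue codes as used in Figure 3; treating these three as the
# "separate vegetative tissues" is an assumption of this sketch.
VEGETATIVE_TISSUES = ("RT", "LT", "ST")


def log10_fpkm(fpkm: float, pseudocount: float = 1.0) -> float:
    """Base-10 log transform of an FPKM value.

    The pseudocount is an assumption so that FPKM = 0 maps to 0
    instead of being undefined.
    """
    return math.log10(fpkm + pseudocount)


def higher_under_water_stress(profile: dict[str, float]) -> bool:
    """True if the WS sample exceeds every separate vegetative tissue."""
    return profile["WS"] > max(profile[t] for t in VEGETATIVE_TISSUES)


# Hypothetical FPKM profile for a single gene, for illustration only.
example_profile = {"RT": 9.0, "LT": 0.0, "ST": 99.0, "WS": 999.0}

heatmap_row = {tissue: log10_fpkm(v) for tissue, v in example_profile.items()}
print(heatmap_row)
print(higher_under_water_stress(example_profile))
```

Comparing raw FPKM values or log-transformed values gives the same ranking, since the transform is monotonic; the log scale only matters for how the heat map colours are spread.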

4. D ISCUSSION duplication events. A tandem duplication event


Genome identification of LEA superfamily was detected in chromosome 11, originating four
proteins was done for many species such as SMP genes. Five whole-genome duplication events
A. thaliana [11], potato [14], and upland cottons allowed explicating the expansion of LEA4 group
(e.g. Gossypium hirsutum, G. arboreum and G. raimondii) in A. hypochondriacus genome. In addition, two
[13]. A. hypochondriacus is a C4 dicotyledonous paralogs, AhDHN-3 and AhDHN-4, were duplicates
seed-producing crop. Investigation of proteins made up of whole-genome duplication. Similarly,
encoded by LEA family genes in this C4 plant AhLEA3-1 and AhLEA3-3 were whole-genome
should benefit crop improvement. According to duplicated paralogs, too. Segmental duplication,
literature, the number of LEA genes in A. hypochondriacus was higher than that previously reported in the genome of potato (29 genes) [14] but smaller than in A. thaliana (51 genes) [11] and in cotton species (242, 136 and 142 genes in Gossypium hirsutum, G. arboreum and G. raimondii, respectively) [13]. It is notable that the copy number of LEA genes is high and variable between taxa. This abundance perhaps suggests a conserved role under abiotic stress conditions as well as during growth and development. Most AhLEAs shared common characteristics, including small molecular weights, richness in hydrophilic amino acids, and few introns [13]. Indeed, only two LEA genes, AhDHN5 and AhLEA4-8, contained three introns. The remaining AhLEA genes (96%) had one, two, or no introns. It has been indicated that genes involved in the stress response usually contain few introns [26]. One proposed explanation is that introns may have a detrimental effect on gene expression, since they can delay transcript production and impose an additional energetic cost through the increase in transcript length [26]. The variation in biochemical characteristics among the 46 AhLEA proteins suggested functional divergence.
Phylogenetic analysis categorized the AhLEA proteins into eight groups, namely LEA_1 to LEA_6, SMP, and DHN. These eight major groups of LEA have been reported in many plants, such as cotton and upland cotton [2, 13] and sweet orange [8]. In contrast, nine groups were found in Arabidopsis [11] and potato [14]. Phylogenetic analysis further allowed detecting

tandem duplication, and transposition events were the main drivers of LEA gene family expansion, as was also observed in the genomes of sweet orange [8], cotton and upland cotton [2, 13].
Motif analysis of the A. hypochondriacus LEA proteins using the MEME tool showed that members of each LEA group possessed several group-specific conserved motifs (Table 2), except for LEA6. Similar characteristics have been reported for LEA proteins in potato [14] and sweet orange [8]. For example, MEME analysis revealed three motifs (motifs 22–24) in the DHN group. All AhDHNs shared the K-segment and the S-segment, two typical motifs of DHNs. Three proteins, AhDHN-1, AhDHN-3, and AhDHN-4, contained two Y-segments, whereas the two others lacked this type of motif (Figure 3). The conserved motif of the DHN group is the 15-amino acid repeated motif EKKGIMDKIKEKLPG (K-segment, rich in lysine residues). In this study, the consensus motif [ER][KH][EK]KKG[FLI][LM][DE]KIK[ED]KLPG[GT][HK][HK] was identified in the DHN group. Additionally, the commonly known motif [LI][HQ][RH][SDT][GNS][SY]S[ST]S[SG] (S-segment) was also observed. The Y-segment was not detected by MEME as a separate motif. Instead, motif 24 ([ADN][FHR]LRD[ES][YH]G[KNQ]PV[RE][QL]TDE[FLY]G[NR]PV[QK][HL][TK][GD][TE][MH]G[AQ][YP]) was discovered in this study, and this motif contains two repeats of the Y-segment. The K-segment was abundant in the DHN group, indicating that the LEA proteins evolved through gene expansion within their specific gene groups. In addition, LEA4 genes were found to possess a repetition of
conserved motif 14. In some proteins, this motif was repeated seven times. Repeated motifs of this kind were also noted in sweet orange LEA proteins [8].
Expression analysis provides new insights into the function of AhLEA genes. RNA-seq data from the databases showed different expression of AhLEA genes in different tissues. This indicates that they have different roles during growth and development [2]. Generally, all AhLEA genes exhibited higher expression levels in mature seeds and cotyledons than in other tissues. The high expression level of AhLEA genes in cotyledons is consistent with LEA accumulation at the late stage of embryogenesis. Gene expression in the mature seed is similar to that of the LEAs of Pisum sativum, supporting a role for these genes during the loss and re-establishment of desiccation tolerance [27]. Abundant expression of the LEA1, LEA4, LEA5, SMP, and DHN groups was observed in flower tissue, suggesting roles in reproductive development [27]. The majority of the AhLEAs were expressed in leaf tissues, consistent with the observations in Sorghum bicolor L. [2] and in sweet orange [8]. The paralogous genes of the LEA4 group exhibited similar expression features in different tissues, indicating distinct divergence and evolution of duplicated genes for different functions during plant growth and development.
Most AhLEA genes were further expressed under water deficit conditions, indicating a role under water deficit stress. The up-regulation of some AhLEA genes (AhDHN1, AhDHN5, AhLEA2-6, AhLEA4-2) was consistent with the GO analysis result. The significant changes in expression levels of most AhLEA genes under water deficit are in agreement with reports on many LEA genes of other plants. CsLEA4, CsLEA55, and CsLEA60 were up-regulated by drought in leaves and/or roots of sweet orange [8]. More than half of the upland cotton LEA genes were increasingly expressed in roots and leaves on the 7th and 14th days of drought stress [13]. AhDHN5 was the most highly expressed gene in vegetative tissues under normal conditions and was up-regulated by water deficit. This gene is an ortholog of At1G76180 (ERD14), which is induced early in response to dehydration stress [28].

5. CONCLUSIONS
In this work, the whole repertoire of LEA-encoding genes in the A. hypochondriacus genome has been identified and characterized for the first time. The results indicated that LEA constitutes a multigene family of eight groups in A. hypochondriacus, displaying a diversity of sequences, motif composition, gene structure, gene ontology, and expression patterns. It appears that segmental and whole-genome duplication plays an important role in the expansion of some groups. Further, the diversified and tissue-specific expression profiles provide insight into the possible functional divergence in the AhLEA gene family. Future attempts to clarify their functional roles in A. hypochondriacus should benefit greatly from this comprehensive analysis.

ACKNOWLEDGEMENT
The author is grateful to MSc. Ngô Thị Thanh Huyền for checking the English language in the manuscript.

REFERENCES
[1] Dure L. and Galau G.A., Plant Physiol., 1981; 68(1): 187-194. DOI: 10.1104/pp.68.1.187.
[2] Nagaraju M., Kumar S.A., Reddy P.S., Kumar A., Rao D.M. and Kavi Kishor P.B., PLoS One, 2019; 14(1): e0209980. DOI: 10.1371/journal.pone.0209980.
[3] Stacy R.A.P. and Aalen R.B., Planta, 1998; 206(3): 476-478. DOI: 10.1007/s004250050424.
[4] Khraiwesh B., Qudeimat E., Thimma M., Chaiboonchoe A., Jijakli K., Alzahmi A., Arnoux M. and Salehi-Ashtiani K., Sci. Rep.-UK, 2015; 5: 17434. DOI: 10.1038/srep17434.
[5] van Leeuwen M.R., Wyatt T.T., van Doorn T.M., Lugones L.G., Wosten H.A. and Dijksterhuis J., Env. Microbiol. Rep., 2016; 8(1): 45-52. DOI: 10.1111/1758-2229.12349.
[6] Tyson T., O’Mahony Zamora G., Wong S., Skelton
M., Daly B., Jones J.T., Mulvihill E.D., Elsworth B., Phillips M., Blaxter M. and Burnell A.M., BMC Res. Notes, 2012; 5(1): 68. DOI: 10.1186/1756-0500-5-68.
[7] Jaspard E., Macherel D. and Hunault G., PLoS One, 2012; 7(5): e36968. DOI: 10.1371/journal.pone.0036968.
[8] Pedrosa A.M., Martins C.e.P., Gonçalves L.P. and Costa M.G., PLoS One, 2015; 10(12): e0145785. DOI: 10.1371/journal.pone.0145785.
[9] Hand S.C., Menze M.A., Toner M., Boswell L. and Moore D., Annu. Rev. Physiol., 2011; 73: 115-134. DOI: 10.1146/annurev-physiol-012110-142203.
[10] Hincha D.K. and Thalhammer A., Biochem. Soc. Trans., 2012; 40(5): 1000-1003. DOI: 10.1042/bst20120109.
[11] Hundertmark M. and Hincha D.K., BMC Genomics, 2008; 9(1): 118. DOI: 10.1186/1471-2164-9-118.
[12] Dure L., Structure/Function Studies of Lea Proteins, in Coruzzi G. and Puigdomènech P., eds., Plant Molecular Biology: Molecular Genetic Analysis of Plant Development and Metabolism, Springer Berlin Heidelberg: Berlin, Heidelberg. 1994: 245-255.
[13] Magwanga R.O., Lu P., Kirungu J.N., Lu H., Wang X., Cai X., Zhou Z., Zhang Z., Salih H., Wang K. and Liu F., BMC Genet., 2018; 19(1): 6. DOI: 10.1186/s12863-017-0596-1.
[14] Charfeddine S., Saidi M.N., Charfeddine M. and Gargouri-Bouzid R., Mol. Biol. Rep., 2015; 42(7): 1163-1174. DOI: 10.1007/s11033-015-3853-2.
[15] Kumar M., Lee S.C., Kim J.Y., Kim S.J., Aye S.S. and Kim S.R., J. Plant Biol., 2014; 57(6): 383-393. DOI: 10.1007/s12374-014-0487-1.
[16] D’Amico S. and Schoenlechner R., Chapter 6 - Amaranth: Its Unique Nutritional and Health-Promoting Attributes, in Taylor J.R.N. and Awika J.M., eds., Gluten-Free Ancient Grains, Woodhead Publishing, 2017: 131-159.
[17] Saucedo A.L., Hernández-Domínguez E.E., de Luna-Valdez L.A., Guevara-García A.A., Escobedo-Moratilla A., Bojorquéz-Velázquez E., Del Río-Portilla F., Fernández-Velasco D.A. and Barba de la Rosa A.P., Front. Plant Sci., 2017; 8: 497. DOI: 10.3389/fpls.2017.00497.
[18] Bojórquez-Velázquez E., Barrera-Pacheco A., Espitia-Rangel E., Herrera-Estrella A. and Barba de la Rosa A.P., BMC Plant Biol., 2019; 19(1): 59. DOI: 10.1186/s12870-019-1656-7.
[19] Finn R.D., Coggill P., Eberhardt R.Y., Eddy S.R., Mistry J., Mitchell A.L., Potter S.C., Punta M., Qureshi M., Sangrador-Vegas A., Salazar G.A., Tate J. and Bateman A., Nucleic Acids Res., 2016; 44(D1): D279-D285. DOI: 10.1093/nar/gkv1344.
[20] Katoh K. and Standley D.M., Mol. Biol. Evol., 2013; 30(4): 772-780. DOI: 10.1093/molbev/mst010.
[21] Kumar S., Stecher G., Li M., Knyaz C. and Tamura K., Mol. Biol. Evol., 2018; 35(6): 1547-1549. DOI: 10.1093/molbev/msy096.
[22] Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M.R., Appel R.D. and Bairoch A., Protein identification and analysis tools on the ExPASy server, in Walker J.M., eds., The Proteomics Protocols Handbook, Springer, 2005: 571-607.
[23] Briesemeister S., Rahnenfuhrer J. and Kohlbacher O., Nucleic Acids Res., 2010; 38: W497-502.
[24] Gotz S., García-Gómez J.M., Terol J., Williams T.D., Nagaraj S.H., Nueda M.J., Robles M., Talón M., Dopazo J. and Conesa A., Nucleic Acids Res., 2008; 36(10): 3420-3435. DOI: 10.1093/nar/gkq477.
[25] Clouse J.W., Adhikary D., Page J.T., Ramaraj T., Deyholos M.K., Udall J.A., Fairbanks D.J., Jellen E.N. and Maughan P.J., Plant Genome, 2016; 9(1). DOI: 10.3835/plantgenome2015.07.0062er.
[26] Jeffares D.C., Penkett C.J. and Bähler J., Trends Genet., 2008; 24(8): 375-378. DOI: 10.1016/j.tig.2008.05.006.
[27] Sahu B., Sahu A.K., Sahu A. and Naithani S.C., S. Afr. J. Bot., 2018; 119: 28-36. DOI: 10.1016/j.sajb.2018.08.004.
[28] Kovacs D., Kalmar E., Torok Z. and Tompa P., Plant Physiol., 2008; 147(1): 381-390. DOI: 10.1093/molbev/msy096.
