
Chapter 12

Molecular evolution, three-dimensional structural characteristics, mechanism of action, and functions of plant beta-galactosidases
Md. Anowar Hossain
Department of Biochemistry and Molecular Biology, University of Rajshahi, Rajshahi, Bangladesh

12.1 Introduction
Beta-galactosidase (BGAL, EC 3.2.1.23) is one of the oldest and most ubiquitous enzymes. It catalyzes the hydrolysis of terminal nonreducing β-D-galactosyl residues from β-D-galactoside polymers. Depending on the substrate, the source, and the mechanism of action, it is also known as lactase, β-lactosidase, maxilact, hydrolact, β-D-lactosidase, lactozym, trilactase, β-D-galactanase, oryzatym, sumiklat, β-galase, exo-β-(1-4)-D-galactanase, and exo-β-(1-3)-D-galactanase (Dwevedi & Kayastha, 2010). BGAL can hydrolyze the terminal galactosyl residues of carbohydrates, glycoproteins, and galactolipids (Ross, Wagrzyn, MacRae, & Redgwell, 1994; Smith, Abbott, & Gross, 2002; Smith & Gross, 2000). BGALs are widely distributed from lower to higher organisms, including bacteria, fungi, yeasts, plants, and animals (Husain, 2010). In the CAZy (carbohydrate-active enzymes, http://www.cazy.org/) database, these enzymes are classified into the glycoside hydrolase (GH) families GH1, GH2, GH35, and GH42. BGALs of microorganisms mostly belong to GH1, GH2, and GH42, whereas BGALs belonging to GH35 are found in a wide range of organisms including bacteria, fungi, animals, and plants (Dwevedi & Kayastha, 2010). On the basis of their substrate specificities, the plant BGALs (pBGALs) can be divided into two types: exo-β-(1-4)-galactanases, which act mainly on pectic β-(1-4)-D-galactan, and enzymes that act on the β-(1-3)- and β-(1-6)-galactosyl linkages of arabinogalactan proteins but have no hydrolytic activity against β-(1-4)-galactan (Kotake et al., 2005).
BGALs have various physiological roles in plants, including cell-wall expansion and degradation and the turnover of signaling molecules during ripening (Buckeridge & Reid, 1994; de Alcantara, Martim, Silva, Dietrich, & Buckeridge, 2006; Ross, Redgwell, & MacRae, 1993). Recently, the pBGALs have attracted much interest for their involvement in developmental stages and in pectin degradation during fruit ripening in various plants including tomato (Carey et al., 1995; Moctezuma, Smith, & Gross, 2003; Pressey, 1983), muskmelon (Ranwala, Suematsu, & Masuda, 1992), kiwifruit (Ross et al., 1993), mango (Ali, Armugam, & Lazan, 1995), peach (Lee, Kang, Suh, & Byun, 2003), radish (Kotake et al., 2005), papaya (Lazan, Ng, Goh, & Ali, 2004), and apple (Yang et al., 2018). β-Galactosidase activity increased significantly in tomato fruits during ripening, which suggested a role in the breakdown of the β-(1-4)-galactan side chains of pectin as part of the ripening process (Carey et al., 1995). Subsequently, it was reported that downregulation of a ripening-related BGAL mRNA decreased the enzyme activity and the free galactose content and significantly retained fruit firmness (Smith et al., 2002). Another report on pectin changes and pectin-modifying enzymes in Jonagold apples during postharvest softening showed that BGAL was the key player in softening during ripening (Gwanpua et al., 2014). Our previous study showed that mango ripening-related enzymes such as BGAL, α-mannosidase, and β-hexosaminidase changed significantly during postharvest storage at different temperatures (Hossain, Rana, Kimura, & Roslan, 2014). Recently, Yang’s group reported that BGAL activity and expression levels

Bioinformatics in Agriculture. DOI: https://doi.org/10.1016/B978-0-323-89778-5.00017-9


© 2022 Elsevier Inc. All rights reserved. 191
192 SECTION | I Bioinformatics and next generation sequencing technologies

of BGAL genes (Mdβ-Gal1, Mdβ-Gal2, and Mdβ-Gal5) increased significantly in “Fuji” and “Qinguan” apples during all stages of fruit development and were much higher in the mature fruits, indicating that pectin was degraded by BGALs (Yang et al., 2018).
Like other families, GH35 contains multiple BGAL genes in different plant species. At least 17 BGAL genes have been reported in tomato, 6 of which were expressed during fruit development and ripening (Smith & Gross, 2000; Chandrasekar & Hoorn, 2016). In Arabidopsis, 17 putative BGAL genes were found to be expressed and were divided into seven subgroups based on their sequence similarities. Subgroup III included seven members that are involved in the modification of pectic polysaccharides of cell-wall matrices (Ahn et al., 2007). Meanwhile, 15 BGAL genes were identified in rice: 1 encoded a protein similar to animal BGAL, and the remaining 14 were grouped into a plant-specific subfamily of BGALs. A few of these genes were located on different chromosomes as a result of segmental duplication (Tanthanuch, Chantarangsee, Maneesan, & Ketudat-Cairns, 2008). Rice BGAL enzymes might play important roles in the metabolism of cell-wall polysaccharides, glycoproteins, and glycolipids. At least two BGAL isoforms were identified and characterized from the Coffea arabica genome (Figueiredo, Lashermes, & Araga, 2011). Recently, a comprehensive genome-wide analysis of Brassica campestris ssp. chinensis identified 16 BGAL genes (Liu, Gao, Lv, & Cao, 2013). Based on their conserved motifs, the Brassica BGALs (BcBGALs) were classified into four groups; 7 of the 16 BcBGAL genes had two copies, whereas one BcBGAL gene had five copies. The exon-intron structures of different BcBGAL genes within the same group were very similar (Liu et al., 2013). Taken together, these observations suggest that pBGALs of the GH35 family are encoded by multiple gene copies that might have been generated through segmental gene duplications.
Determining the three-dimensional (3D) structure of an enzyme is a prerequisite for a better understanding of its functional mechanism. Numerous X-ray crystal structures of BGALs belonging to the GH35 family have been deposited in the Protein Data Bank (https://www.rcsb.org/). The first 3D structure of β-galactosidase from Escherichia coli (EcBGAL) was published in Nature in 1994 (Jacobson, Zhang, Dubose, & Matthews, 1994). EcBGAL is a tetramer of four identical polypeptide chains with a calculated molecular mass of 465 kDa. Each subunit contains five domains: a jelly-roll type barrel (Domain 1), fibronectin type III-like barrels (Domains 2 and 4), a β-sandwich (Domain 5), and a TIM (triose-phosphate isomerase)-type barrel (Domain 3). The central Domain 3 houses the active-site amino-acid residues. Like EcBGAL, the crystal structure of Penicillium sp. BGAL has five distinct domains, but its first domain is a distorted TIM barrel that contains the catalytic site (Rojas et al., 2004). Human BGAL, on the other hand, consists of a catalytic TIM-barrel domain, β-domain 1, and β-domain 2 (Ohto et al., 2012). The first and so far only X-ray crystal structure of a pBGAL, TBG4 from tomato fruit, was solved at 1.65 Å resolution (PDB ID: 3w5g) by a Japanese research group (Eda, Ishimaru, & Tada, 2015; Masahiro Eda, Matsumoto, Ishimaru, & Tada, 2016). Recently, the phylogenetic relationships, homology modeling, docking, and mechanism of action of Mangifera indica BGAL (MiBGAL) have been elucidated (Hossain, Roslan, Karim, & Kimura, 2016). This chapter summarizes the molecular evolution, structural features, mechanism of action, and physiological functions of pBGALs.

12.2 Protein sequence features of plant beta-galactosidases


Numerous BGALs have been characterized with respect to the number of amino acids in the polypeptide chain of the active enzyme, and this number varies among groups of organisms. The smallest BGALs were found in bacteria (586–613 aa). The largest BGALs were found in fungi (1002–1023 aa), followed by plants (715–857 aa) and animals (647–677 aa) (Table 12.1) (Hossain et al., 2016). The NCBI CD (conserved domain)-search tool (CDD v3.0, 44,354 PSSMs) was used to identify the CDs in the 67 BGAL protein sequences. BGALs usually contain the GH35, GH42, and LacA domains and a BGAL multidomain called “PLN03059” (Hossain et al., 2016). The pBGAL sequence possesses an additional domain of unknown function, “DUF4185”, spanning amino acids 244–324, and a galactose-binding lectin domain spanning amino acids 750–827 (Fig. 12.1). Moreover, pBGALs with more than 750 amino acids contain a unique galactose-binding lectin domain in the C-terminal region. Bacterial and animal BGALs share common domains such as GH35, GH42, PLN03059, BGal-dom4.5, and Gal-lectin, although some bacterial BGALs lack the additional BGal-dom4.5 (Hossain et al., 2016). The functional roles of these additional domains are not yet clear. However, it has been suggested that the Gal-lectin domain could play a role in the substrate specificity of BGAL (Chandrasekar & Hoorn, 2016). Meanwhile, MiBGAL contained all types of domains in a complete multidomain architecture (PLN03059) (Masahiro Eda et al., 2016). These domains were termed domains I, II, III, and IV because of their common presence in other proteins (e.g., domain I is a TIM-barrel domain). They receive different names when they occur in different protein families. As can be seen, what is called a GH42 domain is part of the GH35 domain (and both are parts of a TIM-barrel domain) (Hossain et al., 2016).
TABLE 12.1 The features of beta-galactosidase sequences used for phylogenetic analysis (Hossain et al., 2016).

Sl no. | GI number | Name used in the phylogenetic tree | Organism | Taxonomy | No. of amino acids | Domains identified by CD-search | Signal peptide cleavage site | N-glycosylation sites
1 | 1857333 | Arthrobacter-BGAL | Arthrobacter sp. | Prokaryota (bacteria) | 471 | GH42, 35, PLN03059, LacA | No | No
2 | 76097478 | X_campestris-GalD | Xanthomonas campestris | Prokaryota (bacteria) | 579 | GH 10, 42, 35, LacA | 35–36 | 2
3 | 1045034 | X_axonopodis-BgaX | Xanthomonas axonopodis | Prokaryota (bacteria) | 598 | GH42, 35, PLN03059, LacA | 22–23 | 3
4 | 32709094 | X_campestris-GalC | X. campestris | Prokaryota (bacteria) | 613 | GH42, 35, PLN03059, LacA | 23–24 | 2
5 | 21114096 | X_campestris-NixL | X. campestris | Prokaryota (bacteria) | 613 | GH42, 35, PLN03059, LacA | 23–24 | 2
6 | 2289790 | B_circulans-BgaC | Bacillus circulans | Prokaryota (bacteria) | 586 | GH42, 35, PLN03059, BGal-dom4.5, LacA | No | 2
7 | 145688909 | S_suis-BgaC | Streptococcus suis 05ZYH33 | Prokaryota (bacteria) | 590 | GH42, 35, PLN03059, LacA | No | 3
8 | 14971525 | S_pneumoniaeTIGR4-BgaC | Streptococcus pneumoniae TIGR4 | Prokaryota (bacteria) | 595 | GH42, 35, PLN03059, LacA | No | 2
9 | 116077789 | S_pneumoniaeD39-BgaC | S. pneumoniae D39 | Prokaryota (bacteria) | 595 | GH42, 35, PLN03059, LacA | No | 2
10 | 15457592 | S_pneumoniaeR6-BgaC | S. pneumoniae R6 | Prokaryota (bacteria) | 595 | GH42, 35, PLN03059, LacA | No | 2
11 | 16611713 | C_maltaromaticum-BgaC | Carnobacterium maltaromaticum | Prokaryota (bacteria) | 586 | GH42, 35, PLN03059, LacA | No | 3
12 | 257143787 | P_thiaminolyticus-Bga | Paenibacillus thiaminolyticus | Prokaryota (bacteria) | 583 | GH42, 35, PLN03059, BGal-dom4.5, LacA | No | No
13 | 669059 | B_oleracea-BgalA | Brassica oleracea | Eukaryota (planta) | 828 | GH35, 42, PLN03059, Gal-Lectin, LacA | 22–23 | 11
14 | 68161828 | M_indica-BGAL | Mangifera indica | Eukaryota (planta) | 827 | GH35, 42, PLN03059, Gal-lectin, LacA, DUF4185 | 22–23 | 7
15 | 6686884 | A_thaliana-BGAL6 | Arabidopsis thaliana | Eukaryota (planta) | 718 | GH35, 42, PLN03059, BGal 4.5, LacA | 28–29 | 3
16 | 6686878 | A_thaliana-BGAL3 | A. thaliana | Eukaryota (planta) | 856 | GH35, 42, PLN03059, Gal-Lectin, LacA | No | 2
17 | 20514290 | O_sativa-BGAL1 | Oryza sativa | Eukaryota (planta) | 843 | GH35, 42, PLN03059, BGal 4.5, Gal-lectin, LacA | No | 2
18 | 6686882 | A_thaliana-BGAL5 | A. thaliana | Eukaryota (planta) | 732 | GH35, 42, PLN03059, BGal 4.5, LacA | 28–29 | 1
19 | 3860321 | C_arietinum-BGAL5 | Cicer arietinum | Eukaryota (planta) | 745 | GH35, 42, PLN03059, LacA | 26–27 | 1
20 | 7682680 | V_radiata-BGAL1 | Vigna radiata | Eukaryota (planta) | 739 | GH35, 42, PLN03059, PRK13974, LacA | 26–27 | 1
21 | 56201401 | R_sativus-BGAL1 | Raphanus sativus | Eukaryota (planta) | 851 | GH35, 42, PLN03059, Gal_lectin, LacA | 30–31 | 3
22 | 14970841 | F_X_ananassa-BGAL2 | Fragaria ananassa | Eukaryota (planta) | 840 | GH35, 42, PLN03059, BGal 4.5, Gal_lectin, LacA | No | 3
23 | 7939623 | S_lycopersicum-Tbg5 | Solanum lycopersicum | Eukaryota (planta) | 852 | GH35, 42, PLN03059, Gal-lectin, LacA | No | 5
24 | 54291174 | O_sativa-BGAL2 | O. sativa | Eukaryota (planta) | 715 | GH35, 42, PLN03059, BGal 4.5, LacA | 20–21 | 1
25 | 20384648 | C_sinensis-BGAL | Citrus sinensis | Eukaryota (planta) | 737 | GH35, 42, PLN03059, LacA | No | No
26 | 452712 | A_officinalis-BGAL | Asparagus officinalis | Eukaryota (planta) | 832 | GH35, 42, PLN03059, Gal-Lectin, LacA | 25–26 | No
27 | 3641865 | C_arietinum-BGAL4 | C. arietinum | Eukaryota (planta) | 723 | GH35, 42, PLN03059, LacA | 23–24 | 4
28 | 3869280 | C_papaya-BGAL | Carica papaya | Eukaryota (planta) | 721 | GH35, 42, PLN03059, LacA | 21–22 | 1
29 | 18148449 | P_americana-BGAL1 | Persea americana | Eukaryota (planta) | 766 | GH35, 42, PLN03059, LacA | 35–36 | 1
30 | 13936236 | C_annuum-BGAL1 | Capsicum annuum | Eukaryota (planta) | 724 | GH35, 42, PLN03059, LacA | 23–24 | 3
31 | 3299896 | S_lycopersicum-Tbg4 | S. lycopersicum | Eukaryota (planta) | 724 | GH35, 42, PLN03059, LacA | 23–24 | 3
32 | 4138137 | S_lycopersicum-Tbg3 | S. lycopersicum | Eukaryota (planta) | 838 | GH35, 42, PLN03059, Gal-lectin, LacA | 25–26 | 1
33 | 6649906 | S_lycopersicum-Tbg1 | S. lycopersicum | Eukaryota (planta) | 835 | GH35, 42, PLN03059, Gal-lectin, LacA | 22–23 | 2
34 | 14970839 | F_X_ananassa-BGAL1 | F. ananassa | Eukaryota (planta) | 843 | GH35, 42, PLN03059, Gal-lectin, LacA | 28–29 | 2
35 | 33521214 | S_aurantiaca-BGAL | Sandersonia aurantiaca | Eukaryota (planta) | 826 | GH35, 42, PLN03059, Gal-lectin, LacA | 24–25 | 2
36 | 9294020 | A.thaliana-BGAL1 | A. thaliana | Eukaryota (planta) | 847 | GH35, 42, PLN03059, Gal-lectin, LacA | 32–33 | No
37 | 14970843 | F_X_ananassa-BGAL3 | F. ananassa | Eukaryota (planta) | 722 | GH35, 42, PLN03059, LacA | 25–26 | 1
38 | 7682677 | V_radiata-BGAL2 | V. radiata | Eukaryota (planta) | 721 | GH35, 42, PLN03059, LacA | 23–24 | 2
39 | 10059008 | D_caryopyllus-BGAL | Dianthus caryophyllus | Eukaryota (planta) | 731 | GH35, 42, PLN03059, LacA | No | 2
40 | 3860420 | L_angustifolius-BGAL | Lupinus angustifolius | Eukaryota (planta) | 730 | GH35, 42, PLN03059, LacA | 33–34 | 1
41 | 507278 | M_domestica-BGAL | Malus domestica | Eukaryota (planta) | 731 | GH35, 42, PLN03059, LacA | 24–25 | 1
42 | 12583687 | P_pyrifolia-BGAL1 | Pyrus pyrifolia | Eukaryota (planta) | 731 | GH35, 42, PLN03059, LacA | 24–25 | 1
43 | 8809655 | A_thaliana-BGAL4 | A. thaliana | Eukaryota (planta) | 724 | GH35, 42, PLN03059, LacA | 27–28 | 1
44 | 6686876 | A_thaliana-BGAL2 | A. thaliana | Eukaryota (planta) | 727 | GH35, 42, PLN03059, LacA | 27–28 | 1
45 | 334305536 | L_usitatissimum-BGAL | Linum usitatissimum | Eukaryota (planta) | 731 | GH35, 42, PLN03059, LacA | 29–30 | 1
46 | 7939621 | S_lycopersicum-Tbg7 | S. lycopersicum | Eukaryota (planta) | 870 | GH35, 42, PLN03059, Gal-lectin, LacA | 35–36 | 5
47 | 6686892 | A_thaliana-BGAL10 | A. thaliana | Eukaryota (planta) | 741 | GH35, 42, PLN03059, LacA | 29–30 | 4
48 | 219927064 | T_majus BGAL | Tropaeolum majus | Eukaryota (planta) | 857 | GH35, 42, PLN03059, Gal-lectin, LacA | No | 7
49 | 3641863 | C_arietinum-BGAL3 | C. arietinum | Eukaryota (planta) | 730 | GH35, 42, PLN03059, LacA | No | 1
50 | 18958133 | A_candidus-BGAL | Aspergillus candidus | (Fungi) | 1005 | GH35, PLN03059, BGal-dom 2, 3, 4.5, 4.5, LacA | 18–19 | 6
51 | 582890099 | A_oryzae-BGAL1 | Aspergillus oryzae-112 | (Fungi) | 1005 | GH35, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 18–19 | 6
52 | 83770489 | A_oryzae-BGAL2 | A. oryzae-RIB40 | (Fungi) | 1005 | GH35, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 18–19 | 6
53 | 34370136 | Trichoderma reesei | T. reesei | (Fungi) | 1023 | GH35, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 20–21 | 6
54 | 321150462 | P_aerugineus-bglA | Paecilomyces aerugineus | (Fungi) | 1011 | GH35, PLN03059, GH_2N, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 18–19 | 6
55 | 189092779 | P_expansum-BGAL | Penicillium expansum | (Fungi) | 1011 | GH35, PLN03059, GH_2N, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 18–19 | 6
56 | 56266627 | P_canescens-BGAL | Penicillium canescens | (Fungi) | 1011 | GH35, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 19–20 | 7
57 | 44844271 | P_sp.-BGAL | Penicillium sp. | (Fungi) | 1011 | GH35, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 19–20 | 6
58 | 238914608 | B_sp.MEY-1-bglA | Bispora sp. MEY-1 | (Fungi) | 1002 | GH35, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 21–22 | 10
59 | 32448796 | R_emersonii-BGAL | Rasamsonia emersonii | (Fungi) | 1008 | GH35, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 19–20 | 9
60 | 166513 | A_niger-BGALA | Aspergillus niger | (Fungi) | 1006 | GH35, PLN03059, GH_2N, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 18–19 | 11
61 | 62913951 | A_phoenicis-BGAL | Aspergillus phoenicis | (Fungi) | 1007 | GH35, PLN03059, GH_2N, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 18–19 | 11
62 | 383212688 | P_chrysogenum-BGAL | Penicillium chrysogenum | (Fungi) | 1013 | GH35, 42, PLN03059, BGal-dom 2, 2, 3, 4.5, 4.5, LacA | 21–22 | 7
63 | 14099962 | Cl_familiaris-BGAL | Canis lupus familiaris | (Animals) | 668 | GH35, 42, PLN03059, BGal-dom4.5, LacA | 24–25 | 6
64 | 192187 | M_musculus-BGAL | Mus musculus | (Animals) | 647 | GH35, 42, PLN03059, BGal dom4_5, LacA | 24–25 | 6
65 | 179401 | H_sapiens-BGAL1 | Homo sapiens | (Animals) | 677 | GH35, 42, PLN03059, BGal-dom4.5, LacA | 23–24 | 7
66 | 2547317 | F_catus-BGAL | Felis catus | (Animals) | 669 | GH35, 42, PLN03059, BGal-dom4.5, LacA | 24–25 | 6
67 | 34013388 | T_kodakarensis | Thermococcus kodakarensis | Archaea | 786 | GH42, 35, PLN03059, LacA, A4 galactosidase middle domain, GH42 trimerization domain | No | 1

BGal_dom 2, 2, 3, 4_5, 4_5, beta-galactosidase domains 2, 2, 3, 4_5, 4_5; GH_2N, glycosyl hydrolase 2N sugar-binding domain; GH35, glycosyl hydrolase 35; GH42, glycosyl hydrolase 42; LacA, beta-galactosidase; PLN03059, provisional multidomain; PRK13974, thymidylate kinase.
Molecular evolution, 3D structural characteristics, mechanism of action, and functions of plant β-galactosidases Chapter | 12 197

FIGURE 12.1 Conserved domains in mango BGAL were searched by Conserved Domain Database search in NCBI-BLAST. BGAL, Beta-
galactosidase.

The signal peptide is a short peptide at the N-terminus of a newly synthesized protein that determines whether the protein will be secreted. The online web server “SignalP 4.1” was used to predict the signal peptides in the 67 BGAL amino-acid sequences (Petersen, Brunak, Heijne, & Nielsen, 2011). Fifty-one of the 67 BGALs possessed signal sequences in their polypeptide chains. In plant and animal BGALs the signal peptide comprises the first 21–35 amino acids, whereas in fungal BGALs it comprises the first 18–22 amino acids. The MiBGAL signal peptide was found to be the first 23 amino acids (Table 12.1) (Hossain et al., 2016). In Arabidopsis, 13 of the 17 BGALs have potential N-terminal signal peptides that direct them to the endomembrane system for secretion. The rest of the BGALs are probably located in the cytoplasm or nucleus (Chandrasekar & Hoorn, 2016). With a few exceptions in some bacteria, most of the BGALs have signal peptides in their polypeptide chains, indicating that they are probably secreted proteins (Hossain et al., 2016).
Glycosylation is one of the most abundant posttranslational modifications in eukaryotes. The online web server “NetNGlyc 1.0” (http://www.cbs.dtu.dk/services/NetNGlyc/) is commonly used to predict the potential N-glycosylation sites in polypeptide sequences. The BGALs of animals and fungi have 6–11 N-glycosylation sites. Most of the pBGALs contain fewer N-glycosylation sites than the fungal BGALs. However, seven potential N-glycosylation sites were found in the MiBGAL protein sequence, at positions N24, N152, N252, N349, N378, N492, and N498 (Hossain et al., 2016). Most of the bacterial BGALs contained two predicted N-glycosylation sites but no signal peptide, indicating that the sites predicted in bacterial BGALs are probably not true glycosylation sites (Table 12.1). Some bacteria have multifunctional proteins that are glycosylated and secreted or surface-exposed and might have an important role in the interaction with their environment (Szymanski & Wren, 2005). On the other hand, Penicillium sp. BGAL contains seven N-linked oligosaccharide chains and was reported to be the first X-ray crystal structure of a glycosylated β-galactosidase (Rojas et al., 2004). Human BGAL also contains seven N-glycosylation sites, at positions N26, N247, N464, N498, N542, N545, and N555 (Ohto et al., 2012). Two N-glycosylation sites (N282 and N459) have been reported in Solanum lycopersicum β-galactosidase 4 (TBG4), and a signal-peptide cleavage site is found between amino-acid positions 23 and 24 of the polypeptide sequence, indicating a high probability that the protein is secreted (Hossain et al., 2016).
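As a rough illustration of where such predictions start, the following Python sketch lists the positions of canonical N-glycosylation sequons (N-X-S/T, where X is any residue except proline) in a sequence. This is only a pattern scan and an assumption of this example; it is not the NetNGlyc method, which scores each sequon with a trained predictor, and the toy peptide is hypothetical.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy peptide, not a real BGAL sequence.
toy_peptide = "MKNASLLNPTGGNNTSA"
print(find_sequons(toy_peptide))  # the N at position 8 is skipped (followed by Pro)
```

A sequon found this way is only a candidate site; whether it is actually glycosylated depends on secretion and on the local structural context, which is why the bacterial hits discussed above are treated with caution.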
The MEME online software is commonly used to identify conserved motifs in protein sequences (Bailey et al., 2009). Five conserved motifs are present in the 67 BGALs of plants, animals, fungi, and bacteria (Fig. 12.2). Motif-1, -2, -3, -4, and -5 are 50, 41, 41, 21, and 21 amino acids long, respectively (Hossain et al., 2016). Most of the BGAL sequences possess at least three common motifs: motif-1 (cyan), -4 (pink), and -5 (yellow) (Fig. 12.3) (Hossain et al., 2016). All pBGALs contain the five motifs in domain I (TIM barrel), and some have more than one copy of the same motif, which has been attributed to segmental gene duplication in the pBGALs (Hossain et al., 2016). Motif 3 (red) is found at the active site of all pBGALs belonging to the GH35 family. Motif 2 (blue) is also reported at the active site of pBGALs and of bacterial BGALs that belong to GH42. On the other hand, fungal BGALs do not possess motif 2 (blue), whereas animal and bacterial BGALs do not have motif 3 (red) because of the short length of their protein sequences. However, two animal BGALs have two copies of motif-1. No conserved motif was found at the C-terminal end of the BGAL polypeptide sequences, probably because of the lower similarity among the member sequences in that region (Hossain et al., 2016). Another online tool, “PROSITE” (http://www.prosite.expasy.org), identified the predicted GH35 active site in pBGAL polypeptide sequences, which has the consensus sequence G-G-P-[LIVM](2)-x(2)-Q-x-E-N-E-[FY]. It was postulated that the second E is the key residue in the active site of pBGALs. More than 50% of pBGALs contain the SUEL-type lectin domain (Hossain et al., 2016).
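To make the PROSITE consensus above concrete, the following Python sketch translates the pattern into a regular expression and scans a sequence for it. The helper names and the toy fragment are hypothetical, and the translator is an assumption of this example that covers only the pattern elements used here (single residues, bracketed residue sets, x, and repeat counts); it is not the PROSITE scanning software.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression.

    Supports single residues, [..] residue sets, x (any residue),
    and (n) or (n,m) repeat counts; anything else raises ValueError.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        residue, low, high = m.groups()
        token = "." if residue == "x" else residue
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


GH35_PATTERN = "G-G-P-[LIVM](2)-x(2)-Q-x-E-N-E-[FY]"
GH35_REGEX = re.compile(prosite_to_regex(GH35_PATTERN))


def find_gh35_site(sequence):
    """Return (1-based start, matched segment) of the first match, or None."""
    match = GH35_REGEX.search(sequence.upper())
    return (match.start() + 1, match.group()) if match else None


# Hypothetical fragment built to satisfy the pattern, not a real BGAL sequence.
toy_fragment = "AAAGGPIILSQIENEYGPE"
print(prosite_to_regex(GH35_PATTERN))
print(find_gh35_site(toy_fragment))
```

In a match, the second E of the E-N-E triplet sits 11 residues after the first G, which gives a quick way to locate the postulated key residue in a full-length sequence.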

FIGURE 12.2 Conserved motifs present in the 67 BGAL protein sequences. Five motifs were identified using the MEME (Multiple Em for Motif Elicitation) software. The symbol heights represent the relative frequency of each residue. The number of sites and the e-value for each motif are indicated. The widths of motif-1, -2, -3, -4, and -5 are 50, 41, 41, 21, and 21 amino acids, respectively. BGAL, Beta-galactosidase.

12.3 Molecular evolution of beta-galactosidases and their classification


To determine the evolutionary relationships of various BGALs, a phylogenetic tree was reconstructed using 67 BGAL protein sequences retrieved from the CAZy database. The MUSCLE program (Edgar, 2004) was used to align the protein sequences. The phylogenetic relationships were inferred with the PROTML program of PHYLIP version 3.6 using the maximum-likelihood method (Felsenstein, 2000). According to this tree, the BGAL genes evolved from an archaeal ancestor and are organized into four groups: bacteria, animals, fungi, and plants (Fig. 12.3) (Hossain et al., 2016). The plant BGALs are further subdivided into six subfamilies (D1–D6), with MiBGAL belonging to the D1 subfamily. Bacterial BGAL proteins have the highest similarity to those of animals, and pBGALs are suggested to have evolved from those of fungi (Hossain et al., 2016). The sequence identity between MiBGAL and the Brassica oleracea BGAL is approximately 65.57%. The D5 subfamily members had the highest sequence identities (65.77%–98.92%), whereas the D2 subfamily members had the lowest (47.16%–71.76%). Although pBGALs show a wide range of protein sequence variation, all of them possess the five conserved motifs, motif-1, -2, -3, -4, and -5 (Fig. 12.3). A few pBGALs have duplicated motifs, which may be due to segmental gene duplication events. It has been reported that all organisms except plants have a single copy of the BGAL gene in their chromosomes (Hossain et al., 2016). The plant species examined so far have multiple copies of BGAL genes, namely, 17 in Arabidopsis and tomato (Chandrasekar & Hoorn, 2016), 15 in rice (Tanthanuch et al., 2008), and 16 in Brassica (Liu et al., 2013). The pBGAL multigenes reside either on the same or on different chromosomal locations, and they possibly evolved through segmental or gene duplications. Gene duplication might have a critical role in evolving new functions of the multifunctional enzymes (Hossain & Roslan, 2014).
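The identity percentages quoted above are pairwise values read from aligned sequences. A minimal Python sketch of one common way to compute such a value is given below. The convention used here (identical columns divided by the columns in which neither sequence has a gap) and the toy alignment rows are assumptions of this example; the original analysis may have used a different denominator.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two rows of an alignment.

    Columns in which either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments (gaps written as '-'), not real BGAL sequences.
row_a = "GGPIIL-SQIENEY"
row_b = "GGPVILASQ-ENEF"
print(round(percent_identity(row_a, row_b), 2))
```

Because the choice of denominator changes the result, identity values from different studies are only comparable when the same convention is used.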
FIGURE 12.3 The phylogenetic tree (cladogram) based on beta-galactosidase (GH35) amino-acid sequences obtained by the maximum-likelihood method (left side). Archaeal GH35 sequences were used as an out-group to reconstruct the phylogenetic tree. All analyses were performed with the WAG amino-acid substitution model and 1 invariable and 4 gamma-distributed site rate categories. Detailed information about the sequences is shown in Table 12.1. The distribution of the conserved motifs in the BGAL sequences is shown on the right side.

Protein sequence analyses reveal that pBGALs can be divided into two subgroups based on the length of the polypeptide chain. Smaller BGALs (fewer than 750 aa) possess GH35, GH42, and β-galactosidase domains (Hossain et al., 2016). Larger BGALs (more than 750 aa) contain the conserved C-terminal lectin-like SUEL (sea urchin egg lectin)-type domain. Lectin-like SUEL domains usually contain 100 amino acids with 7 highly conserved cysteine residues. This C-terminal domain is common to many pBGALs and shows homology to animal lectin proteins (Ozeki, Yokota, Kato, Titani, & Matsui, 1995). Sea urchin eggs also contain SUEL lectins, which consist of L-rhamnose- and D-galactose-specific homodimers (Ozeki et al., 1995). Although there is no experimental evidence for the specific function of this domain in plants (Tanthanuch et al., 2008), it has been suggested that the lectin-like domain could enhance the affinity of the enzymes for their substrates, thereby increasing catalytic efficiency (Ahn et al., 2007) and possibly also enzyme stability (Trainotti, Spinello, Piovan, Spolaore, & Casadoro, 2001).
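The length-based grouping described in this section can be sketched in a few lines of Python. The function name and labels are hypothetical, the handling of a sequence of exactly 750 aa is an assumption (the text defines only "less than" and "greater than" 750 aa), and the example lengths are taken from Table 12.1.

```python
LENGTH_CUTOFF = 750  # amino acids, the cutoff stated in the text


def length_subgroup(n_residues):
    """Assign a pBGAL to the 'smaller' or 'larger' subgroup by chain length."""
    if n_residues > LENGTH_CUTOFF:
        return "larger"
    if n_residues < LENGTH_CUTOFF:
        return "smaller"
    return "unassigned"  # exactly 750 aa is not covered by the text


# Chain lengths from Table 12.1.
examples = {
    "S_lycopersicum-Tbg4": 724,
    "A_thaliana-BGAL6": 718,
    "M_indica-BGAL": 827,
    "S_lycopersicum-Tbg7": 870,
}
groups = {name: length_subgroup(n) for name, n in examples.items()}
print(groups)
```

Length is only a proxy for domain content. Table 12.1 lists P_americana-BGAL1 (766 aa) without a Gal-lectin domain, so a conserved-domain search remains the actual test for the presence of the lectin-like domain.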

12.4 Three-dimensional structural characteristics of plant beta-galactosidases


The 3D structure is very important for determining the structure-function relationship of proteins and enzymes. To date, only the structure of the tomato pBGAL (TBG4) has been solved by X-ray crystallography (Eda et al., 2015). The open reading frame of the TBG4 cDNA (amino acids 24–724) was cloned into the expression vector pPICZαA (Invitrogen) downstream of the α-factor signal sequence and expressed in Pichia pastoris to produce a secreted recombinant protein fused with a hexahistidine tag (Eda et al., 2015). BGALs isolated from different sources share a common catalytic TIM-barrel domain. The catalytic domain of TBG4 showed 27%–34% amino-acid sequence identity with other enzymes, and the remainder of TBG4 showed 19%–25% sequence identity (Eda et al., 2015). The Ramachandran plot shows that 95.9% of the residues in the crystal structure lie in the favored region and 3.9% in the allowed region, whereas only 0.2% fall in the disallowed region (Masahiro Eda et al., 2016). These results indicate that the structure is of high quality. TBG4 consists of four domains (Masahiro Eda et al., 2016): a central TIM-barrel domain followed by three β-sandwich domains (Fig. 12.4A). Domain I has 323 amino acids (24–346) with a distorted (β/α)8 TIM-barrel fold that houses the active site. Unlike an ideal TIM barrel of eight (β/α) repeats, the TBG4 TIM barrel lacks the fifth and sixth α-helices of the β/α barrel (Masahiro Eda et al., 2016). Domain II contains 66 amino-acid residues (347–412) that form an antiparallel β-sandwich structure of 7 β-strands (Masahiro Eda et al., 2016). Domain III is an antiparallel β-sandwich structure of 9 β-strands joined by loops. Residues 413–438 build up loop regions and 2 β-strands, and residues 586–724 constitute the rest of this C-terminal domain (Masahiro Eda et al., 2016). Domain IV contains 147 amino acids (439–585) that form an antiparallel β-sandwich structure with 8 β-strands (Masahiro Eda et al., 2016).
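The domain boundaries given above can be checked with simple arithmetic. The Python sketch below stores the residue ranges as stated in the text, recomputes the domain lengths, and maps a residue number to its domain. The dictionary and function names are hypothetical; the two query residues are the glycosylated asparagines of TBG4 (Asn282 and Asn459) mentioned elsewhere in the chapter.

```python
# Inclusive residue ranges of the four TBG4 domains, as given in the text.
TBG4_DOMAINS = {
    "I": [(24, 346)],
    "II": [(347, 412)],
    "III": [(413, 438), (586, 724)],  # domain III is built from two segments
    "IV": [(439, 585)],
}


def domain_length(segments):
    """Number of residues covered by a list of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in segments)


def domain_of(residue):
    """Return the domain label containing a residue number, or None."""
    for name, segments in TBG4_DOMAINS.items():
        if any(start <= residue <= end for start, end in segments):
            return name
    return None


for name, segments in TBG4_DOMAINS.items():
    print(name, domain_length(segments))
print(domain_of(282), domain_of(459))
```

The lengths recomputed this way agree with the 323, 66, and 147 residues quoted for domains I, II, and IV, and the four domains together account for all 701 residues of the expressed construct (24–724).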
FIGURE 12.4 Three-dimensional X-ray crystal structure of (A) TBG4 and (B) its complex with galactose. The four domains I–IV are colored blue, cyan, orange, and red, respectively. Glycosylated amino-acid residues (Asn282 and Asn459) and N-acetyl-D-glucosamine residues are depicted as stick models in (A). The β-D-galactose molecule in the active site is shown as a space-filling view in (B).

To gain more insight into the substrate specificities, mechanism of action, and physiological functions of pBGALs, a model structure of MiBGAL has also been built by homology modeling using TBG4 (PDB ID: 3w5g) as the template (Masahiro Eda et al., 2016). Homology modeling was carried out on the SWISS-MODEL web server (Waterhouse et al., 2018), followed by refinement with ModRefiner (Xu & Zhang, 2011). An important criterion for reliable homology modeling is a sequence identity of more than 30% between the template and the target. More importantly, a sequence identity of more than 50% between the template and the target usually gives an accurate model. The amino-acid sequence identity of MiBGAL with TBG4 is 52.80% (Hossain et al., 2016). The modeled structure was analyzed using the protein structure and model assessment tools of the SWISS-MODEL server, which use various local and global quality estimation parameters. Finally, the model was assessed and verified using PROCHECK (Laskowski, MacArthur, & Thornton, 2001), WHAT_CHECK (Hooft, Vriend, Sander, & Abola, 1996), VERIFY_3D (Luthy, Bowie, & Eisenberg, 1992), and the ModEval model evaluation server (Eramian, Eswar, Shen, & Sali, 2008). In the Ramachandran plot, 99.5% of the residues fell in the favorable regions and only 0.50% remained in the unfavorable region, indicating that the refined structure is of good quality (Hossain & Roslan, 2014). The overall G-factor and the main-chain, side-chain, and bond-angle parameters were within normal limits.
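The two identity thresholds mentioned above can be written as a small helper. The function name and the returned labels are hypothetical, and the cutoffs are rules of thumb taken from the text rather than strict limits.

```python
def template_suitability(identity_percent):
    """Classify a template by its sequence identity to the target (in percent)."""
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent > 50.0:
        return "usually accurate model"
    if identity_percent > 30.0:
        return "reliable modeling possible"
    return "below the usual cutoff"


# Identity of MiBGAL with the TBG4 template, as reported in the text.
print(template_suitability(52.80))
```

A template that passes this check still needs the stereochemical validation described above, since sequence identity says nothing about local errors in loops or side chains.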
Because the template structure, TBG4 (PDB ID: 3w5g), was cloned and expressed in P. pastoris without its signal sequence (Eda et al., 2015), the modeled structure of MiBGAL likewise starts at position 22 and includes 716 amino acids (positions 22–737) (Hossain et al., 2016). The modeled 3D structure of MiBGAL is presented in Fig. 12.5A. Like TBG4, the model contains four domains (Hossain et al., 2016): the first domain (GH35) comprises a triose-phosphate isomerase (TIM) barrel [also called a (β/α)8 barrel] in the central part that houses the catalytic residues, the second and third domains form small β-sandwiches, and the fourth domain is jelly-roll-like (Fig. 12.5A and B), as found in the template structure, TBG4 (Hossain et al., 2016). The second, third, and fourth domains are antiparallel β-sandwich structures of six, seven, and eight β-strands, respectively (Hossain et al., 2016). The secondary-structural elements of the MiBGAL model comprise 10 α-helices and 38 β-strands, with two additional disulfide bonds located at C230–C235 and C372–C405 (Hossain et al., 2016). PROMOTIF predicted α- and β-contents of 15.90% and 29.10%, respectively, for the modeled structure. Superimposition of the modeled MiBGAL on the template TBG4 shows a magic-fit overlapping conformation (Fig. 12.5E) (Hossain et al., 2016). By comparison, five distinct domains have been reported in Penicillium sp. BGAL (PspBGAL), which also belongs to GH35 (Rojas et al., 2004). PspBGAL has two disulfide bonds, at C205–C206 and C267–C316.

12.5 Structural comparison between MiBGAL and TBG4


Both the MiBGAL model and the TBG4 crystal structure possess four domains, with the TIM barrel at their centers (Fig. 12.5A–D) (Hossain et al., 2016). The active-site clefts of both structures are also very similar in conformation (Fig. 12.5B and D) (Hossain et al., 2016). The COACH program identified the residues responsible for ligand binding in the modeled structure of MiBGAL. The 13 interacting residues located at the catalytic site of the modeled MiBGAL structure are Tyr74, Val117, Cys118, Ala119, Glu120, Asn181, Glu182, Glu251, Trp253, Trp256, Phe257, Tyr290, and Tyr313, whereas those of TBG4 are Tyr74, Val117, Cys118, Ala119, Glu120, Asn180, Glu181, Glu250, Trp252, Trp255, Tyr256, Tyr289, and Tyr312 (Hossain et al., 2016). The interacting residues are very similar in both structures. Superimposition of the MiBGAL model on the TBG4 structure gives a magic fit between them (Fig. 12.5E); the only difference is that MiBGAL contains Phe257 in place of the Tyr256 located in the catalytic site of the TBG4 crystal structure (Fig. 12.5F) (Eda et al., 2015). MiBGAL also possesses two disulfide bonds, at positions C230–C235 and C372–C405, whereas the TBG4 structure forms four disulfide bonds, at positions C229–C234, C370–C405, C684–C682, and C67–C678 (Eda et al., 2016). Protein–protein interaction studies revealed that MiBGAL exists in a dimeric form, as found for TBG4. The galactose–TBG4 interaction (pdb id: 3w5g) is represented in Fig. 12.6A, where conserved amino-acid residues such as Tyr74, Cys118, Ala119, Glu120, Asn180, Glu181, Asn230, Glu250, Trp252, Tyr289, and Tyr312 of TBG4 interact with the galactose molecule to form a complex (Hossain et al., 2016). Superposition of the PspBGAL–galactose complex with other GH35 BGAL complexes identified residues E200 and E299 as the proton donor and the nucleophile, respectively (Rojas et al., 2004). Glu182 and Glu251 were identified as the catalytic residues in a deep well in the TIM-barrel domain of the modeled MiBGAL structure by triple superimposition (Hossain et al., 2016). Glu182 and Glu251 could act as the proton donor and the catalytic nucleophile of MiBGAL, respectively (Fig. 12.6B). Likewise, Glu181 and Glu250 were identified as the proton donor and catalytic nucleophile, respectively, in TBG4 (Fig. 12.6A and B) (Eda et al., 2015).
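The residue-by-residue comparison above can be checked with a short Python sketch. It pairs the 13 interacting residues of MiBGAL and TBG4 in the order in which they are listed in the text (the positional pairing is an assumption made here, not the output of a structural alignment) and reports the pairs whose residue type differs. The helper names are hypothetical.

```python
import re

# Interacting residues as listed in the text, in the same order for both proteins.
MIBGAL = ["Tyr74", "Val117", "Cys118", "Ala119", "Glu120", "Asn181", "Glu182",
          "Glu251", "Trp253", "Trp256", "Phe257", "Tyr290", "Tyr313"]
TBG4 = ["Tyr74", "Val117", "Cys118", "Ala119", "Glu120", "Asn180", "Glu181",
        "Glu250", "Trp252", "Trp255", "Tyr256", "Tyr289", "Tyr312"]


def residue_type(label: str) -> str:
    """Return the three-letter residue name from a label such as 'Glu182'."""
    match = re.fullmatch(r"([A-Za-z]{3})(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return match.group(1)


def substitutions(model: list[str], template: list[str]) -> list[tuple[str, str]]:
    """Pairs (model, template) whose residue types differ, assuming positional pairing."""
    if len(model) != len(template):
        raise ValueError("residue lists must have equal length")
    return [(m, t) for m, t in zip(model, template)
            if residue_type(m) != residue_type(t)]


print(substitutions(MIBGAL, TBG4))  # [('Phe257', 'Tyr256')]
```

Under this pairing the only substitution is Phe257 in MiBGAL for Tyr256 in TBG4, in agreement with the text.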

12.6 Substrate specificity of plant beta-galactosidases


pBGALs can hydrolyze various plant-derived (1,4)-linked polysaccharides and show a strong affinity for β-(1,4)-galactan molecules. Three aromatic amino-acid residues postulated to be important for substrate specificity are conserved in GH35 BGALs isolated from bacteria, fungi, and animals (Cheng et al., 2012). The crystal structure of the

FIGURE 12.5 The molecular 3D structural features of plant beta-galactosidases (BGALs). (A) The predicted 3D-modeled structure of mango BGAL (MiBGAL) shown as a ribbon diagram. The structure contains four domains (I, II, III, and IV) comprising α-helices (red), β-pleated sheets (purple), and coils (gray), as found in the template structure (pdb id: 3w5g). The catalytic domain I is a TIM barrel with the active site located at the N-terminus of the protein. (B) Surface view (20% transparent) of the modeled structure of MiBGAL. The four domains are depicted in cyan (domain I), green (domain II), yellow (domain III), and red (domain IV), and the active-site clefts are indicated by red arrows. (C) and (D) Ribbon form and surface view (20% transparent), respectively, of the X-ray crystal structure of tomato BGAL (TBG4). (E) Magic-fit superimposition of the modeled structure of MiBGAL (green) on the template structure TBG4 (pdb id: 3w5g, purple). (F) Magic-fit superimposition of the active-site residues of modeled MiBGAL (blue) on those of the template, 3w5g (magenta). Chimera and SPDBV 4.01 OSX were used to prepare the images.

FIGURE 12.6 (A) Representation of the galactose–protein interactions in the catalytic site of the X-ray crystal structure of the tomato beta-galactosidase–galactose complex (TBG4; pdb id: 3w5g). Bonds and bond lengths are indicated in purple (H-bonds) and green (hydrophobic interactions). Bond lengths are expressed in ångströms (Å). (B) Stereo view of the superposition of the catalytic sites in MiBGAL (blue), TBG4 (green), and PspBGAL (orange). The pictures were prepared with RCSB-PDB Ligand Explorer 4.2.0 and SWISS-PDB viewer.

complex of TBG4 with the β-D-galactose ligand provided structural insight into its substrate specificity (Eda et al., 2015). In the TBG4 structure, two of the three aromatic residues are conserved and the third is replaced by a valine residue. To confirm the role of this valine residue, kinetic studies of TBG4 and its V548W mutant were carried out using a synthetic substrate (4-nitrophenyl β-D-galactopyranoside) and natural substrates [β(1,4)-, β(1,3)-, and β(1,6)-galactobiose, chelator-soluble pectin, and alkali-soluble pectin] (Ohto et al., 2012). Interestingly, the V548W mutant showed fivefold higher activity (kcat value) than wild-type TBG4, whereas the Km values remained at the same level for both (Ohto et al., 2012).

TABLE 12.2 Molecular docking for substrates of plant β-galactosidases (Hossain et al., 2016).

Rosetta: interface energy (delta_X score). AutoDock: binding free energy (ΔG), kcal/mol.

Sl no.  Substrate                                       Mango BGAL modeled structure   TBG4 X-ray crystal structure
                                                        Rosetta       AutoDock         Rosetta       AutoDock
1.      p-Nitrophenyl-β-D-galactopyranoside (pNP-GAL)   −15.20        −5.18            −11.30        −4.50
2.      o-Nitrophenyl-β-D-galactopyranoside (oNP-GAL)   −12.29        −5.09            −8.79         −3.81
3.      (+)Lactose                                      −8.39         −3.94            −8.48         −3.26
4.      Galactobiose                                    −9.29         −4.38            −7.31         −3.45
5.      Galactotriose                                   Not possible  −2.88            Not possible  10.08
6.      Arabinogalactan                                 Not possible  −0.80            Not possible  −1.31
7.      Galactan                                        Not possible  13.71            Not possible  15.79

Compared with wild-type TBG4, the activity of the V548W mutant increased sixfold against β-(1,6)-galactobiose and was approximately 0.6-fold against β-(1,4)-galactobiose, while its activity against β-(1,3)-galactobiose was unchanged (Ohto et al., 2012). The V548W mutant hydrolyzed chelator-soluble pectin and alkali-soluble pectin and released approximately 0.6–0.8-fold as much galactose as wild-type TBG4, indicating that V548 has a critical role in substrate specificity and in the efficient degradation of pectic β(1,4)-galactan (Ohto et al., 2012). Another report showed that TBG5, which has a tyrosine residue at the position corresponding to V548 of TBG4, preferred to hydrolyze β(1,6)- and β(1,3)-linked galactooligosaccharides but did not show any activity against β(1,4)-galactan (Ishimaru, Smith, Mort, & Gross, 2009). Thus the residue at the position corresponding to V548 of TBG4 seems to determine the substrate specificity of plant galactosidases toward β(1,4)- and β(1,6)-linked polysaccharides (Ohto et al., 2012).
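The kinetic comparison reported for the V548W mutant can be summarized with a short worked relation. Assuming Michaelis–Menten kinetics (an assumption; the text gives only the kcat and Km trends), a fivefold higher kcat at an unchanged Km implies a fivefold higher catalytic efficiency kcat/Km for the mutant toward the substrate in question:

```latex
\[
\frac{\left(k_{\mathrm{cat}}/K_{\mathrm{m}}\right)_{\mathrm{V548W}}}
     {\left(k_{\mathrm{cat}}/K_{\mathrm{m}}\right)_{\mathrm{WT}}}
= \frac{5\,k_{\mathrm{cat}}^{\mathrm{WT}} / K_{\mathrm{m}}^{\mathrm{WT}}}
       {k_{\mathrm{cat}}^{\mathrm{WT}} / K_{\mathrm{m}}^{\mathrm{WT}}}
= 5
\]
```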
Molecular docking examines the interaction between a protein and its ligands in order to assess the stability of the contacts between the substrate and the amino-acid residues of the active site (Meng, Zhang, Mezei, & Cui, 2011; Sethi, Joshi, Sasikala, & Alvala, 2019). We performed molecular docking with 12 well-known synthetic and natural substrates and inhibitors (Tables 12.2 and 12.3), using the online servers RosettaLigand (http://rosettaserver.graylab.jhu.edu/) and DockingServer (http://www.dockingserver.com/). In Rosetta docking, p-nitrophenyl-β-D-galactopyranoside (pNP-GAL) gave the lowest interface energy (delta_X) scores among the ligands tested: −15.20 and −11.30 for the structures of MiBGAL and TBG4, respectively (Tables 12.2 and 12.3). Consistently, the AutoDock results showed binding free energies (ΔG) for pNP-GAL of −5.18 and −4.50 kcal/mol for MiBGAL and TBG4, respectively (Table 12.2). These results indicate that pNP-GAL could be a potential synthetic substrate for both the modeled structure of MiBGAL and the crystal structure of TBG4, which is consistent with experimental results showing that TBG4 has strong activity toward pNP-GAL (Smith and Gross, 2000). It could therefore be concluded that MiBGAL, like TBG4, might be able to hydrolyze the β-(1-4) linkage of its substrates (Ohto et al., 2012).
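To put the docking free energies on a more intuitive scale, the sketch below converts a binding free energy into an approximate dissociation constant through ΔG = RT ln Kd and ranks four substrates of the MiBGAL model. The temperature (298.15 K), the function name, and the reading of the tabulated AutoDock values as negative numbers are assumptions; docking scores are coarse estimates, so the resulting Kd values are indicative only.

```python
import math

R_KCAL = 1.98720425864083e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Estimate Kd (mol/L) from a binding free energy using dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# AutoDock binding free energies (kcal/mol) for the MiBGAL model, taken from
# Table 12.2 with negative signs assumed.
mibgal_dg = {
    "pNP-GAL": -5.18,
    "oNP-GAL": -5.09,
    "Galactobiose": -4.38,
    "Lactose": -3.94,
}

# A more negative dG means tighter predicted binding, so sort ascending.
ranked = sorted(mibgal_dg, key=mibgal_dg.get)
for ligand in ranked:
    kd_mm = dissociation_constant(mibgal_dg[ligand]) * 1e3  # mol/L -> mmol/L
    print(f"{ligand:13s} dG = {mibgal_dg[ligand]:6.2f} kcal/mol  Kd ~ {kd_mm:.2f} mM")
```

With these inputs pNP-GAL ranks first, which matches the observation in the text that it gives the most favorable scores among the substrates.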

12.7 Mechanism of action of plant beta-galactosidases


GH35 BGALs isolated from different sources usually retain in the product the stereochemical configuration of the initial substrate, which is called the “retaining mechanism” (Rojas et al., 2004). A double-displacement reaction occurs, in which two successive nucleophilic attacks on the anomeric carbon lead to overall retention of the anomeric configuration (Rojas et al., 2004). Two carboxylic acids are required for this reaction: the first carboxylic acid

TABLE 12.3 Molecular docking for inhibitors of plant β-galactosidases (Hossain et al., 2016).

Rosetta: interface energy (delta_X score). AutoDock: binding free energy (ΔG), kcal/mol.

Sl no.  Inhibitor                         Mango BGAL modeled structure   TBG4 X-ray crystal structure
                                          Rosetta       AutoDock         Rosetta       AutoDock
1.      1-Deoxymannojirimycin (DMJ)       −9.68         −6.46            −9.47         −6.06
2.      1-Deoxygalactonojirimycin (DGJ)   −7.91         −4.95            −9.28         −4.88
3.      Galactose                         −9.63         −4.62            −9.19         −4.12
4.      1-methyl-β-D-galactoside          −7.15         −4.40            −8.76         −3.76
5.      1-methylcyclopropene              −4.86         −3.02            −7.97         −2.51

functions as a catalytic nucleophile, and the second one works as an acid/base catalyst (Rojas et al., 2004; Ohto et al., 2012). In TBG4, Glu181 and Glu250, located in the TIM-barrel domain, were identified as candidates for the acid/base catalyst and the catalytic nucleophile, respectively (Ohto et al., 2012). A galactose–TBG4 complex is shown in Fig. 12.6A, where a galactose molecule is bound to each monomer in the chair conformation, with the OH group at position 1 in the β-anomeric form (Ohto et al., 2012). Nine hydroxyl groups of galactose form direct hydrogen bonds with the TBG4 protein. In addition, aromatic and hydrophobic amino-acid residues located at the catalytic site play an important role in ligand recognition through extensive van der Waals interactions (data not shown) (Hossain et al., 2016). TBG4 recognizes the terminal galactose molecule on the basis of these interactions.
In the MiBGAL modeled structure, the conformation of the catalytic site is nearly indistinguishable from that of TBG4 (Hossain et al., 2016). Two catalytically important residues, Glu182 and Glu251, reside in the TIM-barrel domain and overlap with the identical residues upon triple superimposition of MiBGAL, TBG4, and PspBGAL (Fig. 12.6B) (Hossain et al., 2016). Protein–ligand interaction (MiBGAL–galactose) studies showed that Glu182 and Glu251 could function as the proton donor and the catalytic nucleophile of MiBGAL, respectively (data not shown) (Hossain et al., 2016).
According to whether the configuration at the anomeric carbon is conserved through the reaction mechanism, glycoside hydrolases can be categorized into two types: retaining or inverting enzymes (Ohto et al., 2012). The basis of this classification is the distance between the oxygen atoms of the two catalytic carboxylates; the distance ranges for retaining and inverting enzymes are 4.5–6.5 and 9.0–9.5 Å, respectively (Ohto et al., 2012). A retaining enzyme reacts with its substrate to form a covalent intermediate, while an inverting enzyme hydrolyzes the substrate by activating a water molecule (Eda et al., 2015). The average distance between the two carboxylates was 5.99 Å in the modeled structure of MiBGAL and 5.41 Å in TBG4, indicating that pBGALs act on their substrates in a retaining manner (Hossain et al., 2016).
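The distance criterion described above can be expressed as a small Python sketch. It averages the distances between the carboxylate oxygen atoms of two catalytic glutamates and assigns a mechanism using the ranges quoted in the text. How the published averages were computed is not stated, so averaging over all four oxygen pairs is an assumption, and the coordinates in the usage lines are hypothetical placeholders rather than values from 3w5g.

```python
import math
from itertools import product

# Distance ranges (in angstroms) quoted in the text.
RETAINING_RANGE = (4.5, 6.5)
INVERTING_RANGE = (9.0, 9.5)

Point = tuple[float, float, float]


def mean_carboxylate_distance(oxygens_a: list[Point], oxygens_b: list[Point]) -> float:
    """Mean distance over all oxygen pairs between two carboxylate groups (assumed convention)."""
    distances = [math.dist(p, q) for p, q in product(oxygens_a, oxygens_b)]
    return sum(distances) / len(distances)


def classify_mechanism(distance: float) -> str:
    """Assign 'retaining' or 'inverting' from the carboxylate separation."""
    if RETAINING_RANGE[0] <= distance <= RETAINING_RANGE[1]:
        return "retaining"
    if INVERTING_RANGE[0] <= distance <= INVERTING_RANGE[1]:
        return "inverting"
    return "unclassified"


# Average distances reported in the text.
for name, distance in {"MiBGAL (model)": 5.99, "TBG4": 5.41}.items():
    print(f"{name}: {distance:.2f} A -> {classify_mechanism(distance)}")

# Hypothetical oxygen coordinates (OE1, OE2) for two glutamate side chains.
glu_acid_base = [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
glu_nucleophile = [(5.0, 0.0, 0.0), (5.0, 2.0, 0.0)]
example = mean_carboxylate_distance(glu_acid_base, glu_nucleophile)
print(f"example: {example:.2f} A -> {classify_mechanism(example)}")
```

Both reported averages fall within the 4.5–6.5 Å window and are therefore labeled as retaining, in line with the conclusion drawn in the text.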

12.8 Physiological function of plant beta-galactosidase


pBGALs can hydrolyze β-(1,4)-galactans and thereby serve various physiological functions, including cell-wall extension and the breakdown of signaling molecules during fruit ripening (Ross et al., 1993). BGAL activity in tomato fruit increased significantly during ripening, suggesting a role in the breakdown of the β(1,4)-galactan side chains of pectin as part of the ripening process (Carey et al., 1995). Another group reported that tomato β-galactosidase 4 (TBG4) hydrolyzes a wide variety of plant-derived (1,4)- or 4-linked polysaccharides and exhibits a strong affinity for β-(1,4)-galactan, thereby expanding the cell wall of the pericarp (Eda et al., 2015). Recently, Yang’s group reported that BGAL activity and the expression levels of BGAL genes (Mdβ-Gal1, Mdβ-Gal2, and Mdβ-Gal5) significantly increased in

“Fuji” and “Qinguan” apples during all stages of fruit development and were much higher in the mature fruits, indicating that pectin was degraded by BGALs (Yang et al., 2018). Another report on pectin changes and pectin-modifying enzymes in Jonagold apples during postharvest softening showed that BGAL was the key player in softening during ripening (Gwanpua et al., 2014).
Recently, pBGALs have attracted much interest, mostly for their involvement in fruit development and in pectin degradation during fruit ripening in various plants, including tomato (Carey et al., 1995; Moctezuma et al., 2003; Pressey, 1983), muskmelon (Ranwala et al., 1992), kiwifruit (Ross et al., 1993), mango (Ali et al., 1995), peach (Lee et al., 2003), papaya (Lazan et al., 2004), and apple (Yang et al., 2018). It has also been reported that downregulation of a ripening-related BGAL mRNA decreased the enzyme activity and the free galactose content and significantly retained fruit firmness (Smith et al., 2002). Our previous study showed that the activities of mango ripening-related enzymes such as BGAL, α-mannosidase, and beta-hexosaminidase changed significantly during postharvest storage at different temperatures (Hossain et al., 2014). BGAL is thought to accelerate fruit softening by increasing the porosity of the cell wall and enhancing the access of other cell-wall-degrading enzymes (Brummell, 2006; Ng et al., 2013, 2015). A β-galactosidase has been reported from chickpea (Cicer arietinum) seeds, indicating its involvement in seedling development (Kishore and Kayastha, 2012). Spinach leaf β-galactosidases also acted synergistically with α-L-arabinofuranosidases in the hydrolysis of arabinogalactan protein (Hirano, Tsumuraya, & Hashimoto, 1994). Recently, genome-wide identification and expression analysis revealed that sweet potato contains 17 BGAL genes that might be involved in plant development and stress responses by regulating the metabolism of cell-wall polysaccharides (Li et al., 2020). Although several reports have been published on BGALs found in various parts of plants, such as fruits (Lazan et al., 2004; Lee et al., 2003), seeds (Kishore and Kayastha, 2012), and leaves (Hirano et al., 1994; Li et al., 2020), their physiological functions in the plant kingdom remain obscure.

12.9 Conclusion
Evolutionary analyses revealed that all BGALs evolved from ancestral bacterial BGALs. All BGALs, including those of plants, share the TIM-barrel domain that houses their catalytic residues. However, dissimilarities in the C-terminal region of GH35 BGALs account for the diversified or new functions of these enzymes in different organisms. The pBGALs may function through a retaining mechanism, as found in animal BGAL. Docking results gave a clear picture of the ligand–protein interactions and substrate specificities of pBGALs. Although X-ray crystal structure analyses of GH35 BGALs have increased our understanding of the structure–function relationship, the exact roles of pBGALs in plant physiology remain elusive. To better understand the molecular functions of this enzyme in plant biology, it is advisable to characterize the properties, structures, and evolution of related BGALs from different plant species.

Conflict of interest
The authors declare no conflict of interest.

References
Ahn, Y. O., Zheng, M., Winkel, B., Bevan, D. R., Esen, A., Shin-Han, S., . . . Shih, M. (2007). Functional genomic analysis of Arabidopsis thaliana glycoside hydrolase family 35. Phytochemistry, 68, 1510–1520.
Ali, Z. M., Armugam, S., & Lazan, H. (1995). β-Galactosidase and its significance in ripening mango fruit. Phytochemistry, 38, 1109–1114.
Bailey, T. L., Bodén, M., Buske, F. A., Frith, M., Grant, C. E., Clementi, L., . . . Noble, W. S. (2009). MEME SUITE: Tools for motif discovery and searching. Nucleic Acids Research, 37, W202–W208.
Brummell, D. A. (2006). Cell wall disassembly in ripening fruit. Functional Plant Biology, 33, 103–119.
Buckeridge, M. S., & Reid, J. S. (1994). Purification and properties of a novel β-galactosidase or exo-(1→4)-β-D-galactanase from the cotyledons of germinated Lupinus angustifolius L. seeds. Planta, 192, 502–511.
Carey, A. T., Holt, K., Picard, S., Wilde, R., Tucker, G. A., Bird, C. R., . . . Seymour, G. B. (1995). Tomato exo-(1-4)-β-D-galactanase. Isolation, changes during ripening in normal and mutant tomato fruit, and characterization of a related clone. Plant Physiology, 108, 1099–1107.
Chandrasekar, B., & Hoorn, R. A. L. (2016). Beta galactosidases in Arabidopsis and tomato – a mini review. Biochemical Society Transactions, 44, 150–157.
Cheng, W., Wang, L., Jiang, Y. L., Bai, X. H., Chu, J., Li, Q., . . . Chen, Y. (2012). Structural insights into the substrate specificity of Streptococcus pneumoniae β(1,3)-galactosidase BgaC. Journal of Biological Chemistry, 287, 22910–22918.

de Alcantara, P. H., Martim, L., Silva, C. O., Dietrich, S. M., & Buckeridge, M. S. (2006). Purification of a β-galactosidase from cotyledons of Hymenaea courbaril L. (Leguminosae). Enzyme properties and biological function. Plant Physiology and Biochemistry, 44(11–12), 619–627.
Dwevedi, A., & Kayastha, A. M. (2010). Plant β-galactosidases: Physiological significance and recent advances in technological applications. Journal of Plant Biochemistry & Biotechnology, 19(1), 9–20.
Eda, M., Ishimaru, M., & Tada, T. (2015). Expression, purification, crystallization and preliminary X-ray crystallographic analysis of tomato β-galactosidase 4. Acta Crystallographica, F71, 153–156.
Eda, M., Matsumoto, T., Ishimaru, M., & Tada, T. (2016). Structural and functional analysis of tomato β-galactosidase 4: Insight into the substrate specificity of the fruit softening-related enzyme. The Plant Journal, 86, 300–307.
Edgar, R. C. (2004). MUSCLE: A multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics, 5, 113.
Eramian, D., Eswar, N., Shen, M. Y., & Sali, A. (2008). How well can the accuracy of comparative protein structure models be predicted? Protein Science, 17, 1881–1893.
Felsenstein, J. (2000). PHYLIP: Phylogeny inference package, version 3.6 (alpha). Seattle, WA: University of Washington.
Figueiredo, S. A., Lashermes, P., & Aragão, F. J. L. (2011). Molecular characterization and functional analysis of the β-galactosidase gene during Coffea arabica (L.) fruit development. Journal of Experimental Botany, 62(8), 2691–2703.
Gwanpua, S. G., Buggenhout, S. V., Verlinden, B. E., Christiaens, S., Shpigelman, A., Vicent, V., . . . Geeraerd, A. (2014). Pectin modifications and the role of pectin-degrading enzymes during postharvest softening of Jonagold apples. Food Chemistry, 158, 283–291.
Hirano, Y., Tsumuraya, Y., & Hashimoto, Y. (1994). Characterization of spinach leaf α-L-arabinofuranosidases and β-galactosidases and their synergistic action on an endogenous arabinogalactan protein. Physiologia Plantarum, 92(2), 286–296.
Hooft, R. W. W., Vriend, G., Sander, C., & Abola, E. E. (1996). Errors in protein structures. Nature, 381, 272.
Hossain, M. A., Rana, M. M., Kimura, Y., & Roslan, H. A. (2014). Changes in biochemical characteristics and activities of ripening associated enzymes in mango fruit during the storage at different temperatures. BioMed Research International, Article ID 232969, 11 pages.
Hossain, M. A., & Roslan, H. A. (2014). Molecular phylogeny and predicted 3D structure of plant beta-D-N-acetylhexosaminidase. The Scientific World Journal, Article ID 186029, 14 pages.
Hossain, M. A., Roslan, H. A., Karim, M. R., & Kimura, Y. (2016). Molecular phylogeny, 3D-structural insights, docking and mechanisms of action of plant beta-galactosidases. International Journal of Bioinformatics Research and Applications, 12(2), 149–179.
Husain, Q. (2010). Beta-galactosidases and their potential applications: A review. Critical Reviews in Biotechnology, 30, 41–62.
Ishimaru, M., Smith, D. L., Mort, A. J., & Gross, K. C. (2009). Enzymatic activity and substrate specificity of recombinant tomato β-galactosidases 4 and 5. Planta, 229, 447–456.
Jacobson, R. H., Zhang, X. J., Dubose, R. F., & Matthews, B. W. (1994). Three-dimensional structure of β-galactosidase from E. coli. Nature, 369(6483), 761–766.
Kishore, D., & Kayastha, A. M. (2012). A β-galactosidase from chickpea (Cicer arietinum) seeds: Its purification, biochemical properties and industrial applications. Food Chemistry, 134, 1113–1122.
Kotake, T., Dina, S., Konishi, T., Kaneko, S., Igarashi, K., Samejima, M., et al. (2005). Molecular cloning of a β-galactosidase from radish that specifically hydrolyzes β-(1-3)- and β-(1-6)-galactosyl residues of arabinogalactan protein. Journal of Plant Physiology, 138, 1563–1576.
Laskowski, R. A., MacArthur, M. W., & Thornton, J. M. (2001). PROCHECK: Validation of protein structure coordinates. In M. G. Rossmann, & E. Arnold (Eds.), International tables of crystallography, Volume F. Crystallography of biological macromolecules (pp. 722–725). The Netherlands: Dordrecht, Kluwer Academic Publishers.
Lazan, H., Ng, S.-Y., Goh, L.-Y., & Ali, Z. M. (2004). Papaya β-galactosidase/galactanase isoforms in differential cell wall hydrolysis and fruit softening during ripening. Plant Physiology and Biochemistry, 42, 847–853.
Lee, D. H., Kang, S.-G., Suh, S.-G., & Byun, J. K. (2003). Purification and characterization of a β-galactosidase from peach (Prunus persica). Molecules and Cells, 15(1), 68–74.
Li, Z., Hou, F., Du, T., Xu, T., Li, A., Dong, S., . . . Zhang, L. (2020). Genome-wide identification and expression analysis of beta-galactosidase family members in sweetpotato [Ipomoea batatas (L.) Lam.]. BMC Genomics. Available from https://doi.org/10.21203/rs.3.rs-32133/v1.
Liu, J., Gao, M., Lv, M., & Cao, J. (2013). Structure, evolution, and expression of the β-galactosidase gene family in Brassica campestris ssp. chinensis. Plant Molecular Biology Reporter, 31, 1249–1260.
Luthy, R., Bowie, J. U., & Eisenberg, D. (1992). Assessment of protein models with three-dimensional profiles. Nature, 356, 83–85.
Meng, X. Y., Zhang, H. X., Mezei, M., & Cui, M. (2011). Molecular docking: A powerful approach for structure-based drug discovery. Current Computer-Aided Drug Design, 7(2), 146–157.
Moctezuma, E., Smith, D. L., & Gross, K. C. (2003). Antisense suppression of a β-galactosidase gene (TBG6) in tomato increases fruit cracking. Journal of Experimental Botany, 54(390), 2025–2033.
Ng, J. K., Schroder, R., Brummell, D. A., Sutherland, P. W., Hallett, I. C., Smith, B. G., et al. (2015). Lower cell wall pectin solubilisation and galactose loss during early fruit development in apple (Malus domestica) cultivar ‘Scifresh’ are associated with slower softening rate. Journal of Plant Physiology, 176, 129–137.
Ng, J. K., Schroder, R., Sutherland, P. W., Hallett, I. C., Hall, M. I., Prakash, R., et al. (2013). Cell wall structures leading to cultivar differences in softening rates develop early during apple (Malus domestica) fruit growth. BMC Plant Biology, 13, 183.
Ohto, U., Usui, K., Ochi, T., Yuki, K., Satow, Y., & Shimizu, T. (2012). Crystal structure of human β-galactosidase: Structural basis of GM1 gangliosidosis and Morquio B diseases. Journal of Biological Chemistry, 287, 1801–1812.

Ozeki, Y., Yokota, Y., Kato, K. H., Titani, K., & Matsui, T. (1995). Developmental expression of D-galactoside-binding lectin in sea urchin (Anthocidaris crassispina) eggs. Experimental Cell Research, 216, 318–324.
Petersen, T. N., Brunak, S., Heijne, G. V., & Nielsen, H. (2011). SignalP 4.0: Discriminating signal peptides from transmembrane regions. Nature Methods, 8, 785–786.
Pressey, R. (1983). β-Galactosidases in ripening tomatoes. Plant Physiology, 71, 132–135.
Ranwala, A. P., Suematsu, C., & Masuda, H. (1992). The role of β-galactosidases in the modification of cell wall components during muskmelon fruit ripening. Plant Physiology, 100, 1318–1325.
Rojas, A. L., Nagem, R. A. P., Neustroev, K. N., Arand, M., Adamska, M., Eneyskaya, E. V., . . . Polikarpov, I. (2004). Crystal structures of β-galactosidase from Penicillium sp. and its complex with galactose. Journal of Molecular Biology, 343, 1281–1292.
Ross, G. S., Redgwell, R. J., & MacRae, E. A. (1993). Kiwifruit β-galactosidase: Isolation and activity against specific fruit cell-wall polysaccharides. Planta, 189, 499–506.
Ross, G. S., Wagrzyn, T., MacRae, E. A., & Redgwell, R. J. (1994). Apple β-galactosidase: Activity against cell wall polysaccharides and characterization of a related cDNA clone. Plant Physiology, 106, 521–528.
Sethi, A., Joshi, K., Sasikala, K., & Alvala, M. (2019). Molecular docking in modern drug discovery: Principles and recent applications. In V. Gaitonde, P. Karmakar, & A. Trivedi (Eds.), Drug discovery and development – New advances. IntechOpen. Available from https://doi.org/10.5772/intechopen.85991.
Smith, D. L., Abbott, J. A., & Gross, K. C. (2002). Down-regulation of tomato β-galactosidase 4 results in decreased fruit softening. Plant Physiology, 129, 1755–1762.
Smith, D. L., & Gross, K. C. (2000). A family of at least seven β-galactosidase genes is expressed during tomato fruit development. Plant Physiology, 123, 1173–1183.
Szymanski, C. M., & Wren, B. W. (2005). Protein glycosylation in bacterial mucosal pathogens. Nature Reviews Microbiology, 3, 225–237.
Tanthanuch, W., Chantarangsee, M., Maneesan, J., & Ketudat-Cairns, J. (2008). Genomic and expression analysis of glycosyl hydrolase family 35 genes from rice (Oryza sativa L.). BMC Plant Biology, 8, 84.
Trainotti, L., Spinello, R., Piovan, A., Spolaore, S., & Casadoro, G. (2001). β-Galactosidases with a lectin-like domain are expressed in strawberry. Journal of Experimental Botany, 52, 1635–1645.
Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., . . . Schwede, T. (2018). SWISS-MODEL: Homology modelling of protein structures and complexes. Nucleic Acids Research, 46, W296–W303.
Xu, D., & Zhang, Y. (2011). Improving the physical realism and structural accuracy of protein models by a two-step atomic-level energy minimization. Biophysical Journal, 101, 2525–2534.
Yang, H., Liu, J., Dang, M., Zhang, B., Li, H., Meng, R., . . . Zhao, Z. (2018). Analysis of β-galactosidase during fruit development and ripening in two different texture types of apple cultivars. Frontiers in Plant Science, 9, 539.
