
Plant Cell Rep (2004) 23:246–250

DOI 10.1007/s00299-004-0811-1

PHYSIOLOGY AND BIOCHEMISTRY

S. W. Kim · S. H. Ban · H. Chung · S. Cho · H. J. Chung · P. S. Choi · O. J. Yoo · J. R. Liu

Taxonomic discrimination of flowering plants by multivariate analysis of Fourier transform infrared spectroscopy data

Received: 17 October 2003 / Revised: 19 April 2004 / Accepted: 20 April 2004 / Published online: 10 July 2004
© Springer-Verlag 2004

Abstract Fourier transform infrared spectroscopy (FTIR) provides biochemical profiles containing overlapping signals from a majority of the compounds that are present when whole cells are analyzed. Leaf samples of seven higher plant species and varieties were subjected to FTIR to determine whether plants can be discriminated phylogenetically on the basis of biochemical profiles. A hierarchical dendrogram based on principal component analysis (PCA) of FTIR data showed relationships between plants that were in agreement with known plant taxonomy. Genetic programming (GP) analysis determined the top three to five biomarkers from FTIR data that discriminated plants at each hierarchical level of the dendrogram. Most biomarkers determined by GP analysis at each hierarchical level were specific to the carbohydrate fingerprint region (1,200–800 cm⁻¹) of the FTIR spectrum. Our results indicate that differences in cell-wall composition and structure can provide the basis for chemotaxonomy of flowering plants.

Keywords Dendrogram · Genetic programming analysis · Phylogenetic relationship · Principal component analysis

Abbreviations FTIR: Fourier transform infrared spectroscopy · GP: Genetic programming · PCA: Principal component analysis · PyMS: Pyrolysis mass spectrometry

Communicated by J.M. Widholm

S. W. Kim · S. H. Ban · H. J. Chung · J. R. Liu (✉)
Laboratories of Plant Genomic Services and Plant Cell Biotechnology, Korea Research Institute of Bioscience and Biotechnology (KRIBB), 52 Eoun-dong, Yuseong-gu, Daejeon, 305-333, South Korea
e-mail: jrliu@kribb.re.kr
Fax: +82-42-8604608

H. Chung · S. Cho
Department of Chemistry, Hanyang University, 17 Haengdang-dong, Seongdong-gu, Seoul, 133-791, South Korea

P. S. Choi
Laboratory of Functional Genomics for Plant Secondary Metabolism (National Research Laboratory), Eugentech, Inc., 52 Eoun-dong, Yuseong-gu, Daejeon, 305-333, South Korea

O. J. Yoo
Department of Biological Science, Korea Advanced Institute of Science and Technology, 373-1 Kusong-dong, Yuseong-gu, Daejeon, 305-701, South Korea

Introduction

Fourier transform infrared spectroscopy (FTIR) is a rapid, simple, high-resolution analytical method that is based on the vibrations of functional groups and highly polar bonds of the components analyzed. Thus, FTIR provides biochemical profiles containing overlapping signals from a majority of the compounds that are present in the cell when whole cells are analyzed. The biochemical profiles of FTIR from whole cell samples are extremely high-density data sets and, consequently, FTIR data must be analyzed by means of multivariate analysis when multiple whole cell samples are compared. Using multivariate analysis, FTIR has been used for discriminating closely related microbial strains (Freeman et al. 1994; Goodacre et al. 1998; Wenning et al. 2002).

FTIR has only been used in plant biology in a limited number of studies, including those on discriminating between cell-wall mutant plants (Stewart et al. 1997; Chen et al. 1998). These studies have demonstrated that FTIR is robust in identifying structural and architectural alterations in cell walls. However, it remains to be determined whether the results of this approach are valid representations of phylogenetic relationships between higher plant species. The purpose of the investigation reported here was to determine whether multivariate analysis of FTIR data from flowering plants can be used to discriminate plants phylogenetically.

Materials and methods

Plant materials

Leaf tissues of seven species and varieties (Table 1) were subjected to FTIR analysis. All plants were grown in a greenhouse at approximately 30/20°C (day/night) under a 14/10-h (day/night) photoperiod with light provided at an intensity of 500 µmol m⁻² s⁻¹. Fully expanded leaves were excised from plants at the flowering stage and immediately plunged into liquid nitrogen before grinding with a mortar and pestle. Ground samples were then freeze-dried and stored at −70°C until use.

FTIR spectroscopy

Five milligrams of freeze-dried, powdered leaves was mixed with 80 mg of KBr and the mixture ground to reduce the particle size to less than 100 µm in diameter. The finely ground mixture was then compressed in a pellet press in order to form a pellet through which the beam of the spectrometer was able to pass. Infrared analysis was performed using a BOMEM FTIR spectrophotometer, Model DA 3.2, equipped with a liquid nitrogen cooled mercury-cadmium-telluride detector and a KBr beam-splitter. To improve the signal-to-noise ratio, we co-added 32 interferograms and averaged these with the analytical results. Infrared spectra were obtained by subtraction of the plate spectra (background) used for deposition of the samples. Spectral resolution was 4 cm⁻¹, and spectra were collected over the wavenumber range from 8,000 cm⁻¹ to 400 cm⁻¹. Spectra were processed using the GRAMS/386 program (Galactic Industries, Salem, N.H.). Samples were run in triplicate using ground leaf discs from three different individual plants of each kind.

Data normalization and statistical analysis

Procedures were implemented to minimize problems arising from baseline shifts. Spectra were first subjected to path-length correction, and then the spectra were baseline corrected so that the smallest absorbance (2,000 cm⁻¹) was equal to 0. The smoothed second derivatives of these normalized spectra were calculated using the Savitzky-Golay algorithm (Savitzky and Golay 1964) with five-point smoothing. These data were then subjected to multivariate analysis.

A hierarchical dendrogram was constructed from principal component analysis (PCA) of FTIR data by the unweighted pair group method with arithmetic mean analysis using the multivariate statistical package (mvsp 3.13) of Kovach Computing Services. Genetic programming (GP) analysis was used to determine biomarker variables that discriminated plants at each hierarchical level of the dendrogram. gmax-bio software (Aber Genomic Computing, Aberystwyth, Wales, UK) was used with the default parameters of a population size of 1,000; a maximum program length of 44 nodes; fitness based on tournament selection/Gmax(v); a crossover operator used 80% of the time, and terminals of the mutations were selected 20% of the time. The operators used were the default numeric (0.1, 1, 3, 5, and rand) and arithmetic (+, −, /, and *) operators.

Results

PCA analysis

Quantitative FTIR data for each sample are given in Fig. 1a. PCA of FTIR data is displayed in a two-dimensional plot using the first two principal components (Fig. 1b). Three replicate samples of each species and variety were grouped in discrete clusters, indicating that PCA is able to discriminate different species and varieties. A hierarchical dendrogram was generated to display the relationships between plant species based on PCA of FTIR data (Fig. 1c). The dendrogram divided seven species and varieties into two groups. Lilium longiflorum and the dicotyledonous plants were placed at the top level, indicating that the monocot species were separate from the dicotyledonous plants. The dicotyledonous plants were subsequently divided into two groups. Two Catharanthus roseus varieties and the Rosales plants were placed at the second level. Sedum kamtschaticum was subsequently separated from three Rosa plants, which were further discriminated into Rosa rugosa and Rosa multiflora varieties at the lowest level. The separation of species in the dendrogram was in agreement with the known taxonomy of the plants, indicating that discrimination of higher plant species and varieties by FTIR reflects phylogenetic relationships between plants.

GP analysis

GP analysis revealed the top three to five biomarkers from FTIR data that contributed most to the discrimination of plants at each hierarchical level of the dendrogram (Fig. 2). We determined a total of 15 biomarkers that included three biomarkers [V116 (1,100 cm⁻¹), V76 (1,022 cm⁻¹), and V66 (1,003 cm⁻¹)] discriminating Lilium from the dicotyledonous plants, three biomarkers [V125 (1,117 cm⁻¹), V124 (1,115 cm⁻¹), V16 (906 cm⁻¹)] discriminating Catharanthus roseus cultivars from the Rosales plants, two biomarkers [V37 (948 cm⁻¹), V25 (924 cm⁻¹)] discriminating Sedum kamtschaticum from the Rosa plants, and three biomarkers [V139 (1,143 cm⁻¹), V85 (1,040 cm⁻¹), V82 (1,034 cm⁻¹)] discriminating Rosa rugosa from R. multiflora varieties (Fig. 2). These 11 biomarkers out of the 15 were deconvoluted to cell-wall components including cellulose (Chen et al. 1997), pectin

Table 1 Plant materials used for FTIR spectroscopy

Order        Family        Genus         Species        Variety      Cultivar           Identifierᵃ
Rosales      Rosaceae      Rosa          multiflora                                     a
                           Rosa          multiflora     platyphylla                     b
                           Rosa          rugosa                                         c
             Crassulaceae  Sedum         kamtschaticum                                  d
Gentianales  Apocynaceae   Catharanthus  roseus                      Cooler grape       e
                           Catharanthus  roseus                      Cooler peppermint  f
Liliales     Liliaceae     Lilium        longiflorum                 Casablanca         g

ᵃ Identifiers of plant materials for FTIR spectra and PCA plots in Fig. 1.
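
The normalization steps described under "Data normalization and statistical analysis" can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the GRAMS/386 procedure used in the study: the function name `preprocess`, the polynomial order of 2, and the reading of baseline correction as subtracting the absorbance at 2,000 cm⁻¹ are all assumptions, and path-length correction is omitted because its details are not given. The spectrum in the demonstration is synthetic.

```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess(wavenumbers, spectra, anchor_cm=2000.0, window=5, polyorder=2):
    """Baseline-correct spectra and return smoothed second derivatives.

    wavenumbers: 1-D array of evenly spaced wavenumbers (cm^-1).
    spectra: 2-D array, one spectrum per row.
    polyorder=2 is an assumption; the source states only five-point smoothing.
    """
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    # Shift each spectrum so the absorbance nearest anchor_cm becomes 0.
    anchor = int(np.argmin(np.abs(wavenumbers - anchor_cm)))
    corrected = spectra - spectra[:, [anchor]]
    # Savitzky-Golay second derivative with respect to wavenumber.
    step = abs(wavenumbers[1] - wavenumbers[0])
    second = savgol_filter(corrected, window_length=window, polyorder=polyorder,
                           deriv=2, delta=step, axis=1)
    return corrected, second


# Synthetic check: a parabola has a constant second derivative of 2 * a.
x = np.arange(800.0, 2002.0, 2.0)
a = 1e-6
y = a * (x - 1400.0) ** 2 + 0.3
corrected, second = preprocess(x, y)
```

Because a quadratic is reproduced exactly by a second-order local fit, the derivative of the synthetic parabola is constant, which gives a quick sanity check before applying the function to real spectra.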

Fig. 1a–c FTIR spectra, PCA of FTIR data, and a dendrogram based on PCA of FTIR data from seven plants. a Representative FTIR spectra of seven plants, b PCA of FTIR data from seven plants, c a dendrogram based on PCA of FTIR data from seven plants. Refer to Table 1 for the identifiers a, b, c, d, e, f, and g. The numbers next to the identifiers 1, 2, and 3 indicate three replicates
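
The route from PCA scores to a UPGMA dendrogram shown in Fig. 1b, c can be sketched in Python as follows. This is an illustration under stated assumptions rather than the mvsp 3.13 workflow used in the study: the helper names `pca_scores` and `upgma_tree` are hypothetical, the number of principal components fed into the clustering is not stated in the text (two are used here), and the data are synthetic stand-ins for three groups with three replicates each.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def pca_scores(X, n_components=2):
    """Project mean-centred rows of X onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # SVD of the centred matrix; U * S gives the sample scores.
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def upgma_tree(scores):
    """UPGMA is average-linkage agglomerative clustering (Euclidean here)."""
    return linkage(scores, method="average", metric="euclidean")


# Synthetic stand-in data: 3 hypothetical groups x 3 replicates, 50 variables.
rng = np.random.default_rng(0)
centers = rng.normal(0.0, 5.0, size=(3, 50))
X = np.repeat(centers, 3, axis=0) + rng.normal(0.0, 0.1, size=(9, 50))

scores = pca_scores(X, n_components=2)
tree = upgma_tree(scores)
# Cut the tree into three clusters and view labels as groups x replicates.
labels = fcluster(tree, t=3, criterion="maxclust")
groups = labels.reshape(3, 3)
```

With well-separated groups, each row of `groups` holds a single cluster label, mirroring the way replicate samples fall into discrete clusters in the PCA plot.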

(Wilson et al. 2000), polygalacturonic acid, and rhamnogalacturonan (Kačuráková et al. 2000). One biomarker, V390 (1,627 cm⁻¹), discriminating C. roseus cultivars from the Rosales plants, and another, V380 (1,608 cm⁻¹), discriminating R. rugosa from R. multiflora varieties, were deconvoluted to amides associated with proteins (Nelson 1991; Williams and Fleming 1996).

Discussion

Seven different plants including one variety and two cultivars were chosen in this study. L. longiflorum was chosen to determine whether this monocot plant could be discriminated from the dicot plants by FTIR. One variety and two cultivars within two different species were chosen to determine whether genotypes within the same species could be discriminated by FTIR. Plants that belong to two orders and three families were chosen to determine whether they could be systematically discriminated by FTIR. Therefore, these seven different plants represent a sample of the flowering plants that can be discriminated by FTIR.

FTIR is an excellent method for determining phylogenetic relationships between flowering plants based on its ease of use and quick results. This method has been widely used to discriminate closely related microbial strains (Freeman et al. 1994; Goodacre et al. 1998; Wenning et al. 2002). FTIR has been used in plant biology in a number of studies that include discrimination of cell-wall mutant plants (Stewart et al. 1997; Chen et al. 1998), cell-wall composition and architecture (Séné et al. 1994; McCann et al. 1997), mechanical properties and molecular dynamics of plant cell-wall polysaccharides (Wilson et al. 2000), and determination of the fruit content in processed foods (Wilson et al. 1993). Séné et

Fig. 2 Biomarkers (arrows) from FTIR data identified by GP analysis. a Three biomarkers [V116 (1,100 cm⁻¹), V76 (1,022 cm⁻¹), and V66 (1,003 cm⁻¹)] discriminating Lilium from the dicotyledonous plants and spectra averaged from the whole data set of L. longiflorum (broken line) and the dicotyledonous plants (solid line). b Four biomarkers [V390 (1,627 cm⁻¹), V125 (1,117 cm⁻¹), V124 (1,115 cm⁻¹), and V16 (906 cm⁻¹)] discriminating Catharanthus roseus cultivars from the Rosales plants and spectra averaged from the whole data set of C. roseus cultivars (broken line) and the Rosales plants (solid line). c Three biomarkers [V437 (1,720 cm⁻¹), V37 (948 cm⁻¹), and V25 (924 cm⁻¹)] discriminating Sedum kamtschaticum from the Rosa plants and spectra averaged from the whole data set of S. kamtschaticum (broken line) and the Rosa plants (solid line). d Five biomarkers [V380 (1,608 cm⁻¹), V175 (1,213 cm⁻¹), V139 (1,143 cm⁻¹), V85 (1,040 cm⁻¹), and V82 (1,034 cm⁻¹)] discriminating Rosa rugosa from R. multiflora varieties and spectra averaged from the whole data set of R. rugosa (broken line) and R. multiflora varieties (solid line)
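
As a small worked check of the biomarker counts, the wavenumbers listed in the Fig. 2 caption can be tallied against the carbohydrate fingerprint region (1,200–800 cm⁻¹). The sketch below is illustrative only: the dictionary keys and variable names are hypothetical labels, and the wavenumbers are transcribed from the caption.

```python
# Biomarker wavenumbers (cm^-1) per dendrogram level, from the Fig. 2 caption.
biomarkers = {
    "a: Lilium vs dicotyledonous plants": [1100, 1022, 1003],
    "b: C. roseus vs Rosales plants": [1627, 1117, 1115, 906],
    "c: S. kamtschaticum vs Rosa plants": [1720, 948, 924],
    "d: R. rugosa vs R. multiflora": [1608, 1213, 1143, 1040, 1034],
}

# Carbohydrate fingerprint region stated in the abstract: 1,200-800 cm^-1.
FINGERPRINT_LOW, FINGERPRINT_HIGH = 800, 1200


def in_fingerprint(wavenumber):
    """True if the wavenumber lies inside the carbohydrate fingerprint region."""
    return FINGERPRINT_LOW <= wavenumber <= FINGERPRINT_HIGH


# Count fingerprint-region biomarkers at each level and overall.
per_level = {level: sum(in_fingerprint(w) for w in values)
             for level, values in biomarkers.items()}
total = sum(len(values) for values in biomarkers.values())
in_region = sum(per_level.values())

for level, count in per_level.items():
    print(f"{level}: {count} of {len(biomarkers[level])} in fingerprint region")
print(f"overall: {in_region} of {total}")
```

The tally gives 11 of the 15 biomarkers inside the fingerprint region, matching the 11 biomarkers that the Results attribute to cell-wall components.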

al. (1994) showed differences in the plant cell walls of five angiosperms by infrared and Raman spectroscopies, thereby indicating that taxonomic classification was possible. However, they did not attempt to discriminate between the different flowering plants using multivariate analysis of FTIR data for taxonomic classification. Wilson et al. (1993) reported that multivariate analysis of FTIR data for fruit jam enables the discrimination between differing fruit content.

FTIR detects all compounds, including polymers and low-molecular weight compounds in whole cell samples, subsequently providing biochemical profiles of extremely high-density data sets. PCA, an unsupervised learning method requiring no prior knowledge of class structure within the data set, was used to display the natural relationship between plants in a two-dimensional plot, leading to construction of a dendrogram that provided hierarchical levels of plant groupings. GP analysis, a supervised learning method for the production of mathematical rules that enable the easy identification of data selected to perform the classification, led to the identification of FTIR biomarkers that discriminated between plants at each hierarchical level of the dendrogram. Biomarkers were commonly found in the carbohydrate fingerprint region where differences in cell-wall composition and structure are reflected. Thus, differences in cell-wall composition and structure can provide the basis for chemotaxonomy of flowering plants. We have previously reported that pyrolysis mass spectrometry (PyMS) discriminates plants phylogenetically in a similar manner

(Kim et al. 2004). However, biomarkers determined by GP of PyMS data cannot be deconvoluted to identify the chemical compounds.

Acknowledgements This work was supported by grants to JRL from the National Research Laboratory Program (M10104000234-01J000-10710), from the Strategic National R&D Program through the Genetic Resources and Information Network Center (no. BDM0100211), from the Korea Science and Engineering Foundation through the Plant Metabolism Research Center of the Kyung Hee University funded by the Korean Ministry of Science and Technology, and from the KRIBB Research Initiative Program.

References

Chen L, Wilson RH, McCann MC (1997) Infrared microspectroscopy of hydrated biological systems: design and construction of a new cell with atmospheric control for the study of plant cell walls. J Microsc 188:62–71
Chen L, Carpita NC, Reiter WD, Wilson RH, Jeffries C, McCann MC (1998) A rapid method to screen for cell-wall mutants using discriminant analysis of Fourier transformation infrared spectra. Plant J 16:385–392
Freeman R, Goodacre R, Sisson PR, Magee JG, Ward AC, Lightfoot NF (1994) Rapid identification of species within the Mycobacterium tuberculosis complex by artificial neural network analysis of NMR data. J Med Microbiol 40:170–173
Goodacre R, Timmins M, Burton R, Kaderbhai N, Woodward AM, Kell DB, Rooney PJ (1998) Rapid identification of urinary tract infection bacteria using hyperspectral whole-organism fingerprinting and artificial neural networks. Microbiology 144:1157–1170
Kačuráková M, Capek P, Sasinková V, Wellner N, Ebringerová A (2000) FT-IR study of plant cell wall model compounds: pectic polysaccharides and hemicelluloses. Carbohydr Polym 43:195–203
Kim SW, Ban SH, Chung HJ, Choi DW, Choi PS, Yoo OJ, Liu JR (2004) Taxonomic discrimination of higher plants by pyrolysis mass spectrometry. Plant Cell Rep 22:519–522
McCann MC, Chen L, Roberts K, Kemsley EK, Séné C, Carpita NC, Stacey NJ, Wilson RH (1997) Infrared microspectroscopy: sampling heterogeneity in plant cell-wall composition and architecture. Physiol Plant 100:729–738
Nelson WH (1991) Modern techniques for rapid microbiological analysis. VCH, New York
Savitzky A, Golay MJE (1964) Smoothing and differentiation of data by simplified least squares procedures. Anal Chem 36:1627–1633
Séné CFB, McCann MC, Wilson RH, Grinter R (1994) Fourier-transform Raman and Fourier-transform infrared spectroscopy: an investigation of five higher plant cell walls and their components. Plant Physiol 106:1623–1631
Stewart D, Yahiaoui N, McDougall GJ, Myton K, Marque C, Boudet AM, Haigh J (1997) Fourier-transform infrared and Raman spectroscopic evidence for the incorporation of cinnamaldehydes into the lignin of transgenic tobacco (Nicotiana tabacum L.) plants with reduced expression of cinnamyl alcohol dehydrogenase. Planta 201:311–318
Wenning M, Seiler H, Scherer S (2002) Fourier-transform infrared microspectroscopy, a novel and rapid tool for identification of yeast. Appl Environ Microbiol 68:4717–4721
Williams DH, Fleming I (1996) Spectroscopic methods in organic chemistry, 5th edn. McGraw-Hill, London
Wilson RH, Slack PT, Appleton GP, Sun L, Belton PS (1993) Determination of the fruit content of jam using Fourier transform infrared spectroscopy. Food Chem 47:303–308
Wilson RH, Smith AC, Kacurakova M, Saunders PK, Wellner N, Waldron KW (2000) The mechanical properties and molecular dynamics of the plant cell wall polysaccharides studied by Fourier-transform infrared spectroscopy. Plant Physiol 124:397–405
