Making Sense of the Metabolome Special Issue, pp. 219–243, January 2005
doi:10.1093/jxb/eri069 Advance Access publication 23 December, 2004
John M. Halket1,2,*, Daniel Waterman1, Anna M. Przyborowska2, Raj K. P. Patel1,2, Paul D. Fraser1 and
Peter M. Bramley1
1 Bourne Laboratory, Centre for Chemical and Bioanalytical Sciences, Royal Holloway, University of London, Egham, Surrey TW20 0EX, UK
Abstract
An overview is presented of gas chromatography/mass spectrometry (GC/MS) and liquid chromatography/mass spectrometry (LC/MS), the two major hyphenated techniques employed in metabolic profiling that complement direct ‘fingerprinting’ methods such as atmospheric pressure ionization (API) quadrupole time-of-flight MS, API Fourier transform MS, and NMR. In GC/MS, the analytes are normally derivatized prior to analysis in order to reduce their polarity and facilitate chromatographic separation. The electron ionization mass spectra obtained are reproducible and suitable for library matching, mass spectral collections being readily available. In LC/MS, derivatization and library matching are at an early stage of development and mini-reviews are provided. Chemical derivatization can dramatically increase the sensitivity and specificity of LC/MS methods for less polar compounds and provides additional structural information. The potential of derivatization for metabolic profiling in LC/MS is demonstrated by the enhanced analysis of plant extracts, including the potential to measure volatile acids such as formic acid, difficult to achieve by GC/MS. The important role of mass spectral library creation and usage in these techniques is discussed and illustrated by examples.

Key words: Derivatization, electrospray ionization, food analysis, gas-liquid chromatography-mass spectrometry, ion trap, liquid chromatography-mass spectrometry, urine analysis, mass spectral library, metabolic profiling, tandem mass spectrometry, time-of-flight.

Introduction
Metabolic profiling and ‘fingerprinting’
The emerging field of metabolomics requires profiling and fingerprinting methods (Fiehn et al., 2000a; Glassbrook and Ryals, 2001; Harrigan and Goodacre, 2003; Sumner et al., 2003) capable of measuring the absolute or relative amounts of all metabolites (the metabolome). The great diversity of chemical properties and wide concentration ranges of these compounds pose a significant challenge as the methods need to be robust and reproducible to enable samples to be reliably compared (Glassbrook et al., 2000). Profiling refers to the detailed analysis by hyphenated techniques such as gas chromatography-mass spectrometry (GC/MS), liquid chromatography-mass spectrometry (LC/MS) or capillary electrophoresis-mass spectrometry (CE/MS). Such techniques provide a detailed chromatographic profile of the sample and consequently measurements of the relative or absolute amounts of the components. The number of components measured will depend on the resolution of the chromatographic system and the specificity of the detection technique. A mass spectrometer can function as a highly specific chromatographic detector and a high resolution mass spectrometer even more so. ‘Fingerprinting’ refers to more rapid and general screening methods such as direct infusion atmospheric pressure ionization (API) MS (particularly at high mass resolution), NMR spectroscopy, and other methods such as Raman spectroscopy and Fourier transform infra-red spectroscopy (Fig. 1), all of which provide complementary information. Such techniques can be configured as ‘high-throughput’ and are suitable for determining differences and classifying
Journal of Experimental Botany, Vol. 56, No. 410, © Society for Experimental Biology 2004; all rights reserved
220 Halket et al.
[Figure residue: axis label Mr (tick at 1,000); region labels: peptides; LC/MS (APCI, +/-); GC/MS (EI, +); sterols, steroids; drugs.]
analysis in order to reduce their polarity and facilitate chromatographic separation on a column of low polarity as usually employed in metabolic profiling. For example, fatty acids are commonly esterified by methylation before GC separation. In the case of ESI, polar and pre-ionized components are favoured, making the technique highly complementary to GC/MS. By comparison, the related APCI technique covers lower polarity compounds and therefore has great potential in metabolic profiling by LC/MS. The harsher conditions employed in APCI prevent its use for larger molecules, but smaller analytes (generally <2000 Da) can be detected over a wider polarity range (Ardrey, 2003). The technique is also more tolerant to changes in experimental conditions and is commonly employed in quantitative analysis. Although APCI has a requirement for high flow rates, making it less compatible with microbore column technology, a post-column ‘make-up’ flow of mobile phase can be added. A newer ionization technique, atmospheric pressure photoionization (APPI; Robb et al., 2000) is not indicated, but has potential for the analysis of compounds of lower polarity and thus considerable potential for metabolic profiling by LC/MS. In conventional GC/MS, the electron ionization technique is employed and only the much more abundant positively charged ions are measured. In addition, the energy supplied to induce fragmentation of the parent ion in order to obtain a fragmentation pattern (mass spectrum, or plot of ion abundance versus mass-to-charge ratio) is kept constant (70 eV), thereby enabling reproducible mass spectra to be obtained. In this way, libraries of such spectra can be shared between investigators and several libraries are commercially available (see below). The LC/MS situation is more complex. The atmospheric pressure ionization techniques generally produce pseudo-molecular ions ([M+H]+ or [M-H]−) depending on a number of factors: the chemical properties of the analyte, the polarity of the electrospray voltage, the nature of the matrix and the solvent composition. It is not always a simple matter to predict whether positive or negative ions will be preferentially produced (Cech and Enke, 2001). For example, the production of negative ions from carboxylic acids can be enhanced by the addition of weak acid, rather than the expected base (Wu et al., 2004). Matrix effects (Matuszewski et al., 2003) can include ionization suppression (King et al., 2000; Choi et al., 2001) and ionization enhancement (Mallet et al., 2004; Liang et al., 2003) caused by the presence of salts and other components being ionized at the same time. The probability of such effects is greater where no chromatographic steps are employed, as in fingerprinting by direct infusion ESI (Goodacre et al., 2002), and the analyst must always be aware of the dangers.
Thus, there is a much more complex situation in LC/MS as many compounds in a plant extract will not ionize optimally, or at all, under the conditions employed in a single analytical run. Currently, metabolic profiling by LC/MS is more suited to groups of compounds, such as alkaloids, which ionize similarly under given conditions. In complex mixtures such as plant extracts, some compounds will preferentially form positive ions, others negative ions, and some will be difficult to ionize under fixed conditions. Chemical derivatization has the potential to alter the ionization properties of the analyte molecules favourably, as indicated by the arrows in
Fig. 2: the polarities of organic acids can be reduced by esterification facilitating their analysis by GC/MS. Similarly, organic acids can be derivatized in such a way that their polarity is increased (arrowed, Fig. 2) making them more amenable to analysis by positive electrospray and examples of such a transformation are described below.
In addition, energy supply to such parent ions in order to induce fragmentation and obtain the product ion mass spectra, which can be used for library searching, is not standardized between instrument types and complicates the analysis. This situation is covered in the LC/MS library section, below.

the data matrix is easily constructed. The situation with non-targeted analysis is more complex. Components appearing or disappearing in some samples, for example, following genetic modification, or in a disease state, will require manual adjustment of the matrix. Commercially available software to assist with this step is only just becoming available. A selection of programs, some of which can read a variety of file formats and carry out mass spectral deconvolution, are listed in Table 1.
In comprehensive metabolic profiling of plant extracts by GC/MS, the majority of compounds measured have not yet been formally identified. Over 400 components were detected in Cucurbita maxima phloem, using automated mass
Fig. 3. The data transformation required in profiling techniques such as GC/MS and LC/MS. Areas of mass chromatographic peaks corresponding to components (a, b, c ... n) are entered into a peak table for each sample chromatogram (1, 2, 3 ... z).
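The data transformation summarized in the Fig. 3 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_peak_table`, the sample names, and the peak areas are hypothetical and are not taken from the paper. Components missing from a chromatogram are given an area of zero, which is one simple way of handling components that appear or disappear between samples.

```python
def build_peak_table(samples):
    """Return (components, table) for a set of sample chromatograms.

    samples maps a sample name to {component name: integrated peak area}.
    table maps each sample name to a row of areas ordered like components;
    a component not detected in a sample gets an area of 0.0.
    """
    components = sorted({name for areas in samples.values() for name in areas})
    table = {
        sample: [areas.get(name, 0.0) for name in components]
        for sample, areas in samples.items()
    }
    return components, table


# Hypothetical integrated peak areas (arbitrary units) for three chromatograms.
samples = {
    "sample_1": {"a": 1200.0, "b": 340.0, "c": 56.0},
    "sample_2": {"a": 1150.0, "b": 410.0},
    "sample_3": {"a": 980.0, "c": 61.0, "d": 15.0},
}
components, table = build_peak_table(samples)
for sample, row in table.items():
    print(sample, dict(zip(components, row)))
```

In a non-targeted analysis the hard part is deciding that a peak in one chromatogram is the same component as a peak in another; the sketch assumes that assignment has already been made.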
Table 1. A selection of specialist software packages for peak detection and mass spectral deconvolution
Name Publisher Reference
Fig. 4. GC/MS analysis showing conversion of arginine to ornithine upon trimethylsilylation by treatment with MSTFA (37 °C, 20 min). The total ion current chromatograms obtained from arginine and ornithine are shown in (A) and (C), respectively, and both have peaks at retention time 23.9 min as well as identical mass spectra (B) and (D) corresponding to N,N′,O-tris(trimethylsilyl)ornithine. Instrumentation: Agilent 6890 gas chromatograph, injected 1 µl (split 1:20, 290 °C) sample in MSTFA (N-methyl-N-trimethylsilyltrifluoroacetamide), column: 30 m × 0.25 mm × 0.25 µm film thickness (DB5MS), temperature program: 70 °C (5 min) – 5 °C min−1 – 310 °C (7 min), detection: Agilent 5973 Mass Selective Detector, scanned m/z 10–800.
Metabolic profiling 225
Table 3. A selection of major commercially available EI
libraries
Database Class Spectra www
Qualifier ions: chromatographic peak integrity
In addition to a matching retention index between the analyte and the known compound, positive assignment will require a mass spectral library match, performed by some manufacturers’ data systems (see section below), or at least agreement with the ratios of one or two qualifier ions. Figure 5 shows the GC/MS analysis of cholesterol TMS with three reconstructed ion chromatograms; (a) m/z 458, (b) m/z 443, and (c) m/z 368, together with a part of the mass spectrum recorded at the peak apex (28.0 min). The ratios of the peak areas (a), (b) and (c) should correspond to the ratios of abundances of the ions m/z 458, 443, and 368, respectively, in the library spectrum. The software checks these ratios and reports deviations outside a preset range. The operation is analogous to the wavelength absorbance ratio method in HPLC to detect impure peaks (Sievert and Drouen, 1993).

GC/MS: automatic interpretation
Since the start of GC/MS for the diagnosis of metabolic disorders, software approaches to automated data interpretation have been made in individual laboratories (Jellum et al., 1975; Mizuno et al., 1981; Halket et al., 1999; Yamaguchi et al., 1999), but much remains to be achieved in this area.

GC/MS: mass spectral library searching
A useful introduction to library searching is given in a recent text (Smith and Busch, 1999). In addition, detailed studies of search algorithms have been carried out (Stein and Scott, 1994; McLafferty et al., 1998).
Table 3 lists some common commercially available libraries with an indication of the numbers of spectra available in each case. The mass spectra contained within the NIST library are studied in detail by professional evaluators before inclusion (Ausloos et al., 1999) ensuring high quality.
Of particular interest to metabolic profilers are the useful libraries downloadable from the Max-Planck Institute for Plant Physiology in Golm, Germany (http://www.mpimp-golm.mpg.de/mms-library/). These include TMS and TBDMS derivatives of known compounds and also an important library of unassigned spectra (TMS). Compound listings include retention times. Collaborations are invited for structure elucidation (mms-library@mpimp-golm.mpg.de).
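The qualifier-ion check described for cholesterol TMS (comparing peak-area ratios of m/z 458, 443 and 368 with the library abundance ratios) might be sketched as follows. This is a minimal illustration, not any manufacturer's implementation: the function name `check_qualifier_ions`, the choice of m/z 368 as the reference ion, the 20% relative tolerance, and all abundance values are assumptions made for the example.

```python
def check_qualifier_ions(peak_areas, library_abundances, target_mz, tolerance=0.2):
    """Compare qualifier/target ion ratios with the library spectrum.

    peak_areas and library_abundances map m/z to peak area or abundance.
    Returns {qualifier m/z: (observed ratio, expected ratio, within range)}.
    tolerance is the allowed relative deviation from the expected ratio.
    """
    report = {}
    for mz, area in peak_areas.items():
        if mz == target_mz:
            continue
        observed = area / peak_areas[target_mz]
        expected = library_abundances[mz] / library_abundances[target_mz]
        within_range = abs(observed - expected) <= tolerance * expected
        report[mz] = (observed, expected, within_range)
    return report


# Hypothetical library abundances and measured peak areas (not real data).
library = {458: 40.0, 443: 15.0, 368: 100.0}
clean_peak = {458: 41000.0, 443: 14000.0, 368: 100000.0}
impure_peak = {458: 90000.0, 443: 14000.0, 368: 100000.0}

for label, areas in (("clean", clean_peak), ("impure", impure_peak)):
    for mz, (obs, exp, ok) in check_qualifier_ions(areas, library, 368).items():
        print(f"{label} m/z {mz}: observed {obs:.2f}, expected {exp:.2f}, ok={ok}")
```

A ratio outside the preset range suggests a co-eluting component contributing to one of the ions, in the same spirit as the absorbance ratio test for impure HPLC peaks.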
The computerized matching of an unknown spectrum with a database is a very rapid and useful tool in metabolic profiling. Despite the advances made in algorithms and software, such automated matching cannot be entirely relied upon (Sparkman, 1996) and data must be checked by an experienced analyst in order to ensure the integrity of the results: a daunting task when one considers the amount of data produced in a short time by modern hyphenated techniques. The additional application of retention indices (see above) can improve the situation when trying to distinguish between compounds having similar mass spectra.

GC×GC: a revolutionary development
A very simple and revolutionary innovation in GC is so-called ‘GC×GC’ or ‘comprehensive 2-dimensional GC’ in which a non-polar column can be coupled to a shorter polar column. The temperature of the junction between the columns is modulated by a moving heater (Kinghorn and Marriott, 1998a) or jets of gas (Ledford and Billesbach, 2000; Beens et al., 2001) so that peaks from the first column are continually ‘frozen’ (modulated) and transferred to the second, faster-running column. If the first column can separate 300 peaks and the second column 15 peaks, the total resolving power of the system is 300 × 15 = 4500 peaks, representing a staggering increase
Fig. 6. Ultrafast GC/time-of-flight (TOF) MS showing metabolic profiles in approximately 2 min: (a) total ion current (TIC) chromatogram of a mixture of C8-C28 straight chain hydrocarbons used for retention index assignments via AMDIS software, (b) TIC chromatogram of a trimethylsilylated tomato extract (Waterman et al., 2003), (c) mass spectrum of GC peak with retention time 0.94 min (arrowed), (d) matched spectrum of phosphoric acid TMS from NIST 02 mass spectral library. GC/MS system: TRACE gas chromatograph and TEMPUS mass spectrometer (Thermo, San Jose), equipped with a split injector (1:300) and 10 m × 0.1 mm (0.1 µm film thickness RTX-5) column, temperature program 70 °C (0.1 min) – 120 °C min−1 – 350 °C (0.5 min), full scan mode (m/z 30–600, 40 scans s−1).
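Matching a measured spectrum against library spectra, as in panels (c) and (d) of Fig. 6, can be illustrated with a plain cosine similarity scaled to 0–999. This is a deliberately simplified sketch and is not the NIST search algorithm, which applies additional weighting; the function names, compound names and spectra below are all hypothetical.

```python
import math


def match_factor(unknown, reference):
    """Cosine similarity of two spectra ({m/z: abundance}), scaled to 0-999."""
    all_mz = set(unknown) | set(reference)
    dot = sum(unknown.get(mz, 0.0) * reference.get(mz, 0.0) for mz in all_mz)
    norm_u = math.sqrt(sum(v * v for v in unknown.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    if norm_u == 0 or norm_r == 0:
        return 0
    return round(999 * dot / (norm_u * norm_r))


def search_library(unknown, library):
    """Return (score, name) pairs for every library entry, best match first."""
    hits = [(match_factor(unknown, spectrum), name) for name, spectrum in library.items()]
    return sorted(hits, reverse=True)


# Hypothetical nominal-mass spectra; names and abundances are made up.
library = {
    "compound_A": {73: 100.0, 299: 80.0, 314: 20.0},
    "compound_B": {73: 100.0, 147: 60.0, 205: 30.0},
}
unknown = {73: 95.0, 299: 82.0, 314: 18.0}
hits = search_library(unknown, library)
print(hits)
```

A high score from such a comparison is only a ranking aid: compounds with similar spectra will score similarly, which is why retention indices and inspection by an analyst remain necessary.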
such as fast atom bombardment (FAB), thermospray, ESI, and matrix-assisted laser desorption-ionization (MALDI). The rationale behind such attempts to alter the chemical properties can be seen in Fig. 2. Analytes in the ESI region containing different functional groups can have diverse ionization properties, giving preferentially positive or negative ions depending on the mobile phase composition. For example, the ESI method in positive mode would work particularly well with positively charged (or easily chargeable) species (highly polar) so that a derivatization scheme which introduces such species to a wide variety of compounds would facilitate simultaneous measurement (profiling) of an increased number of compounds in one

(Quirke et al., 2000). Fatty acids have been analysed by ESI using alkyldimethylaminoethyl derivatives (Johnson, 2000). Polar derivatives have been applied to steroids (Nakagawa and Hashimoto, 2002; Griffiths et al., 2003), fatty alcohols and alcohol ethoxylate surfactants in wastewater (Dunphy et al., 2001), carbonyl compounds (Brombacher et al., 2002), and drug impurities (Barry et al., 2003b) with improved detection. In MALDI MS, charged tris(2,4,6-trimethoxyphenyl)phosphonium (TMPP) derivatives were used for the derivatization of peptide amino groups (Huang et al., 1997), yielding improved sequence information. Such derivatives have since been applied to amines and carboxylic acids (Leavens et al., 2002), alcohols, aldehydes,
[Figure 7 graphic: three mass spectra of curcumin with structure drawings, m/z axis 50–380, relative abundance axis 0–100. Panel titles: (A) CURCUMIN EI; (B) CURCUMIN ELECTROSPRAY; (C) CURCUMIN MS/MS 30 NCE.]
Fig. 7. (A) EI spectrum of the turmeric (Curcuma longa L.) pigment curcumin (Mr=368) taken from the NIST02 database, (B) the ESI mass spectrum showing [M+H]+=369, (C) the product ion mass spectrum obtained by collisionally induced dissociation (CID) using an LCQ DECA quadrupole ion trap mass spectrometer (Thermo, San Jose), ESI, positive, 4.5 kV, capillary 200 °C, normalized collision energy 30%, isolation width 1.5 Th. A 1 µg ml−1 standard solution of curcumin in acetonitrile-water (50:50, v/v) was infused at 5 µl min−1.
below) so that this so-called ‘source CID’ or ‘transport region CID’ is applied to all molecules present in the system. Other names for the technique include orifice or nozzle-skimmer voltage CID. As in EI, the mass spectra produced will be overlapped so that chromatographic separation is desirable (as in GC/MS).
Such transport region CID spectral patterns have been shown to vary greatly with instrumental conditions (Bogusz et al., 1999). Corresponding libraries have only become available following the introduction of tuning compounds, enabling fairly reproducible spectra to be obtained, albeit on instrument types from the same manufacturer (Marquet et al., 2003; Hough et al., 2000; Weinmann et al., 2001).

MS/MS: CID in a quadrupole ion trap mass spectrometer
In the quadrupole ion trap, different stages of MS are carried out in time (Bier and Schwartz, 1997). In addition to an inherent high sensitivity, multiple levels of MS (MSn) can be carried out by sequentially isolating and fragmenting selected ions in the trap. The ability to switch the triggering of MS/MS automatically according to parent ion intensities (data-dependent scanning; Tiller et al., 1997) and further automation facilities (Drexler et al., 1998) are powerful features. Longer residence times in the ion trap can lead to more rearrangement ions than with the other techniques described above. The instrumentation is very compact and
technique of normalized collision energy (Lopez et al., 1999), which automatically compensates for the mass-dependent energy deposition characteristics typical of ion trap instruments and makes the MS/MS spectra remarkably reproducible and relatively insensitive to instrumental settings.

Rapid confirmation of curcumin in turmeric (Curcuma longa L.) powder
A simple extract of turmeric (Curcuma longa L.) powder (approximately 0.5 mg mixed with 1 ml methanol–water, 1:1, v:v) was infused into an ion trap mass spectrometer at 5 µl min−1. A clear signal was obtained at m/z 369 (data not shown) which could correspond to the [M+H]+ ion of curcumin. The product ion mass spectrum obtained for m/z 369 is shown in Fig. 8A, together with the NIST library match for curcumin (B), a substance exhibiting beneficial antioxidant properties and approved as a colouring material in foodstuffs. The library spectrum had been made several years beforehand on a different LCQ instrument. The whole analysis took only a few minutes.

MSn spectra
A great advantage of ion trap mass spectrometers is the ability to carry out multiple levels of MS. In cases where the MS/MS spectrum is dominated by a single ion, this ion can be selected and fragmented further to give an MS/MS/MS (MS3) spectrum.
A simple example is shown in Fig. 9. A solution of the tomato alkaloid tomatine (5 µg ml−1 in methanol–water, 1:1, v:v, containing 0.1% formic acid, 5 µl min−1) was infused into the LCQ XP. The mass spectrum gave a single ion at m/z 1037.5 (data not shown). The MS/MS spectrum obtained by fragmentation of this ion (50% collision energy; Lopez et al., 1999) is shown in Fig. 9A in NIST user library format. In addition to the major fragment ion at m/z 1017, a smaller fragment ion at m/z 416, corresponding to the molecular mass of tomatidine, was selected for a further fragmentation step. The product ion mass spectrum (MS3) obtained is shown in Fig. 9B together with the best library match showing tomatidine contained in a 1000 compound ion trap library (Baumann et al., 2000). The spectra are not identical, giving a NIST match factor (Stein, 1994) of only 625, a consequence of the different energies used to obtain them. However, they are similar enough to indicate the usefulness of the procedure for rapid chemical analysis. An advantage of relatively low cost ion trap mass spectrometers is the capability of carrying out MSn experiments, useful in structure elucidation (Tolstikov and Fiehn, 2002). The m/z selection and ionization steps can be automated to give fragmentation maps and can be repeated up to 10 times, although the sensitivity of the technique reduces after each step. In many cases, MS4 or MS5 will be achievable with biological samples, depending on the analyte concentration.

Development of an ion trap library
Spectra are currently being acquired under a variety of conditions and contributions of spectra are being received from collaborators. The library now contains approximately 1200 spectra, about half of which are drug-related. The library will be distributed free of charge to contributors and other Thermo LCQ users. The spectra will also be included in a large MS/MS library to be distributed by NIST (j.halket@rhul.ac.uk).

LC/MS/MS: towards a universal library?
Further to early work on the reproducibility of triple quadrupole CID spectra (Martinez, 1991), the factors affecting the API product ion mass spectra originating from a number of instruments are being actively studied (SE Stein, personal communication). Mass spectra from different instrument types and manufacturers are being
compared. Although perfectly matchable spectra from different instruments seem unlikely, early indications are that useful universal libraries may be created (Bristow et al., 2002, 2004; Gergov et al., 2004).

LC/MS/MS of derivatized organic acids
As previously demonstrated, charged tris(2,4,6-trimethoxyphenyl)phosphonium (TMPP) derivatives of organic acids can be prepared using a TMPP propylamine reagent following activation of the carboxylic acid group with 2-chloro-1-methylpyridinium iodide (Leavens et al., 2002; J Halket et al., unpublished data).
In the first part of a comprehensive LC/MS/MS derivatization study, MS/MS spectra of a wide range of organic acid TMPP derivatives have been recorded and stored in a NIST format user library. The derivatization improves the detection characteristics of the carboxylic acids and femtogram sensitivity has been achieved with standards (J Halket et al., unpublished data). Utility of the derivative is illustrated by preliminary examples of metabolic profiling in extracts of tomato tissue. The work represents the first stage in the development of a sequential derivatization and multi-component analysis procedure applicable in the field of metabolic profiling.
The TMPP derivatization step is shown in Fig. 10A for gibberellic acid (Cat+=m/z 918) together with the corresponding MS/MS spectrum (NIST user library format) of the product in Fig. 10B. The mass spectrum has several high molecular mass fragment ions eminently suitable for the recognition and also for the quantitation of this compound.
Reconstructed mass chromatograms (C1–C6) for a mixture of volatile carboxylic acids following separation on a short HPLC column are shown in Fig. 11A together with the MS/MS spectrum of the formic acid TMPP derivative (NIST user library format) in Fig. 11B. Using this procedure, compounds covering a wide range of molecular mass can be determined in a single analysis. The determination of low molecular weight acids is difficult to carry out by GC/MS, particularly in the same separation as compounds having much greater molecular mass, such as oleic acid.
The degree of separation of the acid TMPP derivatives obtained using such a short HPLC column is remarkable.
Formic and acetic acids are nearly resolved and acetic and propionic acids are resolved completely within about 8 min. The new technique should prove to be useful for the qualitative and quantitative analysis of acidic compounds present at very low levels in plant tissues. The ion trap TMPP derivative library now contains nearly 100 product ion mass spectra, including some MS/MS/MS (MS3) spectra.

Metabolic profiling of tomato extracts
A selection of the ion chromatograms present in a total ion current profile of TMPP derivatized tomato extract is shown in Fig. 12 and illustrates how relative integrated peak areas could be utilized in statistical comparisons of biological samples, such as the comparison of GM versus non-GM foodstuffs (Gröger et al., 2003; Waterman et al., 2004). The boxed peak indicated in Fig. 12A for m/z 764 is shown in expanded form in Fig. 12B. The product ion mass spectrum recorded for the small peak with a retention time of 16.2 min is shown in Fig. 12C together with the best library match, indicating that the compound is derived from the TMPP derivative of quinic acid. The larger chromatographic peak at 16.81 min in Fig. 12B was found to correspond to citric acid TMPP (data not shown). The lack of chemical noise in the reconstructed ion chromatograms in Fig. 12A is notable and a result of the relatively high m/z ratios of the derivatives.
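How a reconstructed ion chromatogram and its integrated peak area are obtained from full-scan data can be sketched as below. The scan layout, the function names, the ±0.5 m/z window and the intensity values are assumptions chosen for illustration (the target m/z 764 and the retention time near 16.2 min merely echo the example in the text); vendor software performs the same steps with its own peak detection and baseline handling.

```python
def extract_ion_chromatogram(scans, target_mz, tolerance=0.5):
    """Sum the intensity found within +/- tolerance of target_mz in each scan.

    scans is a list of (retention_time_min, [(mz, intensity), ...]).
    Returns a list of (retention_time_min, summed_intensity).
    """
    return [
        (rt, sum(intensity for mz, intensity in peaks if abs(mz - target_mz) <= tolerance))
        for rt, peaks in scans
    ]


def peak_area(chromatogram, start, end):
    """Trapezoidal area under the chromatogram between two retention times."""
    points = [(rt, i) for rt, i in chromatogram if start <= rt <= end]
    return sum(
        (rt2 - rt1) * (i1 + i2) / 2.0
        for (rt1, i1), (rt2, i2) in zip(points, points[1:])
    )


# Hypothetical scans: (retention time in min, list of (m/z, intensity)).
scans = [
    (16.0, [(764.3, 0.0), (500.1, 20.0)]),
    (16.1, [(764.3, 50.0), (500.1, 25.0)]),
    (16.2, [(764.3, 100.0), (500.1, 22.0)]),
    (16.3, [(764.3, 50.0)]),
    (16.4, [(764.3, 0.0)]),
]
xic = extract_ion_chromatogram(scans, 764.0)
area = peak_area(xic, 16.0, 16.4)
print(xic, area)
```

Because only ions near the selected m/z contribute, signals from other components in the same scans (m/z 500.1 in this toy data) do not appear in the trace.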
[Figure 10 graphic: (A) reaction scheme with reagents CMPI / TEA, gibberellic acid (GA) and the TMPP product labelled C49H61NO14P, 918.382969; (B) MS/MS spectrum titled "GIBBERELIC ACID TMPP", m/z axis 280–920, with labelled ions including m/z 533, 590, 616, 828, 846 and 856.]
Fig. 10. (A) Reaction scheme for the TMPP derivatization of gibberellic acid (CMPI=2-chloro-1-methylpyridinium iodide, TEA=triethylamine) and (B)
resulting ion trap MS/MS spectrum (product ions of m/z 918) in NIST user library. Derivatization was carried out as previously described (Leavens et al.,
2002; J Halket et al., unpublished data).
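The mass printed with the product structure in Fig. 10 (C49H61NO14P, 918.382969) can be checked by summing monoisotopic atomic masses. The sketch below is an illustration only: the function names are hypothetical, the parser handles simple formulae without brackets, and the table contains just the five elements needed here. The small electron-mass correction for a cation (about 0.00055) is ignored, as the printed value appears to be the plain sum of atomic masses.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; only the elements
# needed for this example are included.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
    "P": 30.97376163,
}


def parse_formula(formula):
    """Parse a simple formula such as 'C49H61NO14P' into {element: count}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum of monoisotopic atomic masses for a simple molecular formula."""
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


mass = monoisotopic_mass("C49H61NO14P")
print(f"{mass:.5f}")
```

The result agrees with the figure to within a few parts in a billion, consistent with the nominal cation value Cat+ = m/z 918 quoted in the text.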
Statistical analysis
The principal components analysis (PCA) score plot of a set of electrospray LC/MS (ion trap) data obtained from derivatized tomato extracts (Waterman et al., 2004; J Halket et al., unpublished data) is shown in Fig. 13. A matrix (16 × 15, compound abundance versus sample) representing the relative amounts of 16 unidentified analytes corresponding to TMPP-derivatized components from five tomato types in triplicate was extracted from the data. Chromatograms of the m/z values (Cat+) for the selected components were automatically integrated from each chromatogram, using a quantitation method in the Xcalibur software and the peak areas (abundances) input into the table.
Of the ten principal components generated, the first two components accounted for 67% of the total variance in the data.
Proper treatment of data using PCA should involve more data points than shown in this example. However, the diagram does serve to illustrate the potential of the method for metabolic profiling. Although there is a degree of overlap for the five types of tomato samples, the data are seen to cluster relative to tomato type. In order to evaluate the route of the distribution along the two principal component axes, a weightings plot for each variable can be produced indicating components corresponding to each area of the PCA scores plot (data not shown). In this case components are not identified and the PCA methodology simply illustrates the potential of the derivatization technique in this field. An exhaustive study using a large number of samples is currently being carried out and includes the identification of key metabolites.

LC/MS: structure elucidation of unknowns
As in GC/MS, library searching and mass spectral interpretation steps (see GC/MS section) may assist in structure elucidation, although fragmentation mechanisms can be quite different. The application of ion trap or Fourier transform mass spectrometers with multiple levels of MS (MSn) to elucidate fragmentation pathways, together with NMR and high resolution MS (Tolstikov and Fiehn, 2002) is encouraging. The power of LC/UV/NMR/MS for the structural elucidation of plant metabolites has been well demonstrated (Wolfender et al., 2003). Recent developments in cryoprobe NMR on-line with MS (Spraul et al.,
2003; Exarchou et al., 2003) will surely assist with this important task.

Discussion
Physico-chemical limitations of GC/MS and LC/MS
An obvious limitation of these combined techniques is that the columns are also acting as filters and not all components injected will necessarily pass through, a consequence of the different physico-chemical properties of the analytes and matrix components (polarity, molecular mass, etc). Components will therefore remain in the injector, column and detector areas and the whole system will be inherently different after each injection. This is in stark contrast to a non-destructive spectroscopic technique such as NMR. The analyst must therefore be aware of subsequent degradation in both column and detector performance and appropriate controls must be incorporated in any such analytical system (regular injection of quality assurance samples and maintenance of control charts).
It is clear that the current speed limitations on the analytical power of GC/MS and LC/MS are quite severe considering the large numbers of samples to be analysed in metabolomics. However, the introduction of faster devices such as TOF/MS perhaps combined with multi-dimensional chromatographic separation techniques and more ‘intelligent’ software, both for peak recognition and data interpretation, will help to ensure the future of the profiling methods. The LC/MS technique particularly requires faster separations. Although important developments are being made in LC column technology (Tolstikov and Fiehn, 2002) and in ultra-high pressure gradient µ-HPLC (Tolley et al., 2001; Legido-Quigley et al., 2002) with corresponding increases in resolving power, such methods are not yet fast enough for high throughput profiling.

Data exchange limitations on GC/MS and LC/MS
Serious problems still exist with compatibilities between mass spectrometer manufacturers’ data files, so there is a strong need for universal data processing. AMDIS software can read and process most data files and although it has been designed for GC/MS applications, it should also find applications in LC/MS. Further specialized software is being developed with metabolic profiling in mind (Table 1) and the development should benefit from user feedback on requirements.

GC/MS and LC/MS: inter-laboratory comparison of data sets
Library matching and retention parameter comparisons are important for inter-laboratory comparisons of data sets, i.e. is the unknown ‘x’ in laboratory A the same ‘x’ measured in laboratory B and so on. A useful step would be the creation of standard data sets for individual species (Overy et al., 2004). A further aid to recognition of compounds between laboratories could be the universal adoption of an identifier
for chemical entities such as the CAS (Chemical Abstracts Service) Registry Number: The Chemical Abstracts Service (a Division of the American Chemical Society) administers a scheme whereby chemical entities can be assigned unique numerical identifier numbers (http://www.cas.org/). Considering the variety of naming schemes available for chemicals (common names, IUPAC names, etc), the use of such unique identifiers can enable workers throughout the world to avoid confusion.
The numbers themselves have no chemical significance and can contain up to nine digits divided by hyphens into three parts. Examples of CAS numbers taken from the NIST02
[Figure 13: ‘Scores Plot of TMPP Profiling Data’; PC2 axis visible; point groups labelled Tomato Line 1 to Tomato Line 5.]
Fig. 13. Principal components analysis (PCA) scores plot of electrospray LC/MS data from TMPP-derivatized tomato extracts (three samples of each of five tomato types and the selected ion abundances of 16 components) (J Halket et al., unpublished data). Plotting PC1 versus PC2 reveals a degree of clustering of the data by tomato type, showing the potential of the technique to distinguish between tomato types. Ion chromatograms of the m/z values (Cat+) were automatically extracted from each chromatogram and integrated using a quantitation method in the Xcalibur software, and the peak areas (abundances) were input into the table.
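The check digit that terminates each CAS Registry Number discussed in this section can be recomputed from the other digits. The following is a minimal sketch of the commonly published rule (each digit before the check digit is weighted by its position counted from the right, and the sum is taken modulo 10). The function names are illustrative and are not part of any CAS or NIST software.

```python
def cas_check_digit(body_digits: str) -> int:
    """Return the check digit for the digits that precede it (hyphens removed)."""
    # The digit nearest the check digit has weight 1, the next weight 2, and so on.
    total = sum(int(d) * weight
                for weight, d in enumerate(reversed(body_digits), start=1))
    return total % 10


def is_valid_cas(cas_number: str) -> bool:
    """Check the three-part layout and the check digit of a CAS Registry Number."""
    # Square brackets, as used around the numbers in the text, are tolerated.
    parts = cas_number.strip("[] ").split("-")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        return False
    first, second, check = parts
    if len(first) < 2 or len(second) != 2 or len(check) != 1:
        return False
    return cas_check_digit(first + second) == int(check)


if __name__ == "__main__":
    # Citric acid, L-serine, D-serine and DL-serine, as quoted in this section.
    for cas in ["77-92-9", "56-45-1", "312-84-5", "302-84-1"]:
        print(cas, is_valid_cas(cas))
```

For citric acid, [77-92-9], the weighted sum is 2×1 + 9×2 + 7×3 + 7×4 = 69, so the check digit is 9. A check of this kind catches most single-digit transcription errors when CAS numbers are exchanged between laboratories, but it says nothing about whether a number has actually been assigned.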
mass spectral library (http://www.nist.gov/srd/nist1a.htm/) are: citric acid: [77-92-9]; citric acid tetra-TMS derivative: [14330-97-3]; L-serine: [56-45-1]; D-serine: [312-84-5]; DL-serine: [302-84-1]; serine di-TMS derivative [70125-39-2] (O,O′-bis(trimethylsilyl)serine); serine tri-TMS derivative [64625-17-8] (N,O,O′-tris(trimethylsilyl)serine).
The rightmost digit in each case is a check digit used to verify the validity and uniqueness of the entire number (http://www.cas.org/EO/checkdig.html/).
Approximately 4000 new numbers are created every day and there are now more than 23 million organic and inorganic substances represented as well as over 43 million sequences. The adoption of such CAS Numbers in metabolic profiling would facilitate data set comparisons for compounds with assigned structures between laboratories. Considering that many analytes in metabolic profiling have still not been formally identified, a similar numerical system, perhaps based on mass spectral and GC retention parameters, could probably be developed for inter-laboratory comparison.
*note added in proof: a system for naming unknown compounds has just been proposed by the International Committee on Plant Metabolomics (http://www.metabolomics.nl) (Bino et al., 2004). Important plans to distribute lyophilized standard reference materials are also described.
LC/MS libraries, retention data and unknowns
LC/MS libraries: Metabolic profiling by LC/MS requires the creation and the application of mass spectral libraries, areas being actively investigated by a number of research groups. Until fairly recently, their creation has been ignored by mass spectrometer manufacturers and software incompatibilities between data files of major manufacturers still exist. These issues will hopefully be addressed as more users undertake profiling. NIST user libraries are being constructed in the authors’ laboratory by the addition of product-ion (MS/MS) and MSn mass spectra of metabolites and derivatives obtained from an ion trap mass spectrometer. These MS/MS spectra are reproducible and suitable for library searching. In cases where the MS2 spectrum is very simple, MS3 and further levels can be easily investigated. Such library searching in LC/MS is now a routine operation (Baumann et al., 2000). The libraries are being constantly updated with the spectra of unknowns from biological material as well as those of authentic standards. The NIST format libraries can be accessed directly by the Xcalibur data system to perform library matching including spectra acquired during data-dependent scanning experiments, i.e. where the software triggers MS/MS on a chosen set of precursor ions or those ions above a preset abundance threshold.
LC retention parameters: Development and standardization of LC retention parameters are lacking. Hopefully, powerful concepts such as the ‘hydrophobicity index’ (Valko et al., 1997) will provide a solution.
Derivatives for LC/MS: Although chemical derivatization is often not necessary in LC/MS, it is clear that derivatives
can provide advantages in many cases. This follows the widespread and highly successful application of derivatization in HPLC with UV and fluorescence detection (Lunn and Hellig, 1998; Toyo’oka, 1999). Increasing numbers of MS applications are being described.
Organic acids: The ESI mass spectra of the organic acid TMPP derivatives show strong Cat+ ions with little fragmentation, as previously observed in a quadrupole instrument (Leavens et al., 2002). The ion trap MS/MS spectra obtained at 30–50% normalized collision energy (LCQ) appear to be useful mass spectral ‘fingerprints’ and are not generally dominated by the expected m/z 533 ion resulting
GC/MS and LC/MS: the compromises on quality
GC/MS and LC/MS for metabolic profiling are by their nature compromises. The majority of measurements are relative rather than absolute and are of unknowns, although in some cases the compound class can be recognized from the mass spectral features. The techniques have to supply sufficient reproducibility to enable meaningful comparisons of samples (Glassbrook et al., 2000). Stable isotope-labelled internal standards can only be employed for a few representative members of the different analyte classes, but the relatively large numbers of analytes in a metabolic profile mean that optimal methods cannot be applied.
The analyst should be aware of deficiencies in the