Food Additives & Contaminants: Part A

ISSN: 1944-0049 (Print) 1944-0057 (Online) Journal homepage: http://www.tandfonline.com/loi/tfac20

The classification of almonds (Prunus dulcis) by country and variety using UHPLC-HRMS-based untargeted metabolomics

R. Gil Solsona, C. Boix, M. Ibáñez & J. V. Sancho

To cite this article: R. Gil Solsona, C. Boix, M. Ibáñez & J. V. Sancho (2018): The classification
of almonds (Prunus dulcis) by country and variety using UHPLC-HRMS-based untargeted
metabolomics, Food Additives & Contaminants: Part A, DOI: 10.1080/19440049.2017.1416679

To link to this article: https://doi.org/10.1080/19440049.2017.1416679

Accepted author version posted online: 26 Dec 2017.
Published online: 17 Jan 2018.

Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=tfac20

ARTICLE

R. Gil Solsona, C. Boix, M. Ibáñez and J. V. Sancho
Research Institute for Pesticides and Water (IUPA), University Jaume I, Castellón, Spain

ABSTRACT

The aim of this study was to use an untargeted UHPLC-HRMS-based metabolomics approach allowing discrimination between almonds based on their origin and variety. Samples were homogenised, extracted with ACN:H2O (80:20) containing 0.1% HCOOH and injected in a UHPLC-QTOF instrument in both positive and negative ionisation modes. Principal component analysis (PCA) was performed to ensure the absence of outliers. Partial least squares – discriminant analysis (PLS-DA) was employed to create and validate the models for country (with five different compounds) and variety (with 20 features), showing more than 95% accuracy. Additional samples were injected and the model was evaluated with blind samples, with more than 95% of samples being correctly classified using both models. MS/MS experiments were carried out to tentatively elucidate the highlighted marker compounds (pyranosides, peptides or amino acids, among others). This study has shown the potential of high-resolution mass spectrometry to perform and validate classification models, also providing information concerning the identification of the unexpected biomarkers which showed the highest discriminant power.

ARTICLE HISTORY
Received 14 September 2017
Accepted 15 November 2017

KEYWORDS
Almond; untargeted metabolomics; UHPLC; high-resolution mass spectrometry; PLS-DA

CONTACT J. V. Sancho sanchoj@uji.es Research Institute for Pesticides and Water (IUPA), University Jaume I, Castellón, Spain
Color versions of one or more of the figures in this article can be found online at www.tandfonline.com/TFAC
© 2018 Taylor & Francis Group, LLC

Introduction

Nuts and olive oil are considered fundamental in the Mediterranean diet, not only due to their healthy lipid profile (Kodad and Socias i Company 2008), but also the presence of antioxidants, phenols, flavonoids (Alasalvar and Bolling 2015), vitamins and/or phytosterols (Bullo et al. 2011). Their consumption has been related to several benefits for human health, proved by published clinical diet trials such as those showing lower levels of blood cholesterol (Hyson et al. 2002) as well as reduced coronary heart disease (Ros 2010), reduction of serum uric acid (Jamshed et al. 2015) and also linked with prebiotic effects on gut microbiota (Ukhanova et al. 2014), which makes almonds or nuts in general an important part of a balanced diet.

The almond (Prunus dulcis) is a nut largely consumed in the Mediterranean diet (Bullo et al. 2011), considered one of the healthiest diets in the world. However, almonds have different fatty acid profiles depending on their origin and variety, which affects their stability against rancidity during storage or transport (Kodad and Socias i Company 2008), directly affecting their flavour and also influencing their price. Apart from this problem, two typical sweet almond-derived products (turron and marzipan) require a minimum amount of almond, protein and fat to be considered “high quality” products (Romero 2014). In this sense, genotype or growing region is really important for nutrient composition in almonds (Yada et al. 2011). Spanish almonds provide different characteristics and flavour to these products making them highly appreciated by the turron and marzipan industries. Thus, it becomes important to develop methods for origin and variety control to give producers analytical tools to ensure the authenticity of origin and/or variety of almonds. In this sense, different experiments have been performed to obtain fingerprints which allow the classification of almonds based on their geographical origin. Most of them are based on lipid profiling, the most studied compounds being fatty acids (Amorello et al. 2016), triacylglycerols and phospholipids (Shen et al. 2013; Petroselli et al. 2015), or also in combination with tocopherol analysis (Barreira et al. 2012). The main problem with these studies is the lack of validation of the developed methodology and the fact that the selected
compounds may not be the best distinguishing markers, as a targeted approach was used. Untargeted metabolomics could be a good option to highlight the best compounds to discriminate samples in different scenarios like animal diets (Ruiz-Aracama et al. 2011) or food traceability, and almond classification by cultivar (Beltrán Sanahuja et al. 2011). The main drawbacks were, as previously noted for targeted approaches, the lack of extra validation steps to confirm that promising markers are robust.

The untargeted metabolomics approach has become very useful for food control (Gil-Solsona et al. 2016; Sales et al. 2017). In this sense, powerful chromatographic techniques coupled to high-resolution MS (HRMS) (Emwas 2015) provide the perfect tool to meet this goal, as demonstrated by their increased use in recent years in food authenticity and control (Castro-Puyana and Herrero 2013; Rubert et al. 2015).

The main aim of this research was to investigate the applicability of an untargeted metabolomics approach using ultra-high-performance liquid chromatography (UHPLC) coupled to HRMS to classify almond samples according to their country of origin as well as variety. For this purpose, almonds from Spain and the USA were employed. Sample extracts were injected and after multivariate analysis the most relevant compounds were highlighted using a Variable Importance in Projection (VIP) selection method. Partial least squares – discriminant analysis (PLS-DA) was employed to create both models, which were validated with samples from a second season employed as a system challenge (Riedl et al. 2015). MS/MS experiments were performed for highlighted markers which were tentatively elucidated with the help of online databases.

Materials and methods

Reagents and chemicals

HPLC-grade water was obtained from a Milli-Q water purification system (Millipore Ltd, Bedford, MA, USA). HPLC-grade methanol (MeOH), HPLC-supergradient ACN, sodium hydroxide (> 99%) and ammonium acetate (NH4Ac) reagent-grade were obtained from Scharlab (Barcelona, Spain). Leucine-enkephalin (mass-axis calibration) and formic acid (mobile phase modifier) were purchased from Sigma-Aldrich.

Sampling

Spanish almonds of different varieties (Bitter almond, Belona, Carrerona, Comuna, Ferranduel, Guara, Largueta and Marcona) were purchased from Frusema Company (Albocasser, Castellón, Spain). Almonds from the USA (Bute-padre, California and Non-Pareil) were obtained from FruSecs Company (Albocasser, Castellón, Spain). In a second season, samples were also obtained from Frusema and FruSecs. In this case, an additional Spanish variety, Soleta, was also sampled. A total of 62 sample packages, each containing 100 g of an individual variety, were employed.

Sample processing

Raw samples (100 g) were triturated and homogenised; 2.5 g of sample were weighed and mixed with 10 ml of ACN:H2O (80:20) 0.1% HCOOH. After mechanically shaking for 90 min, extracts were sonicated for 15 min and centrifuged for 10 min at 4500 g. The supernatant was diluted fourfold with Milli-Q water and stored at −24 °C until analysis. A pool of all the extracts was also prepared, named QC, to obtain an average extract of the sample set. This pool was used for column stabilisation (by injecting 10 QC samples at the beginning of each sample batch), and to control possible instrumental signal variation along the sequence.

UHPLC-HRMS

A Waters Acquity UPLC system (Waters, Milford, MA, USA) was coupled to a hybrid quadrupole-TOF mass spectrometer (Xevo G2 QTOF, Waters, Manchester, UK), using a Z-spray-ESI interface operating in both positive and negative ionisation modes. The UHPLC separation was performed using a CORTECS® C18 fused-core 2.7 μm particle size analytical column 100 × 2.1 mm (Waters) at 300 μl/min flow rate. The separation was performed using H2O 0.01% HCOOH as weak mobile phase (A) and MeOH 0.01% HCOOH as strong mobile phase (B). The percentage of B was changed from 10% at 0 min, to 90% at 14 min, 90% at 16 min and
10% at 16.01 min, with a total run time of 18 min. Injection volume was 10 μl. Nitrogen was used as both the desolvation gas and the nebulising gas. A capillary voltage of 0.7 and 1.5 kV for positive and negative ion modes, respectively, and cone voltage of 25 V were used. MS data were acquired over an m/z range of 50–1200. TOF-MS resolution was approximately 20,000 at full width half maximum at m/z 556.2771. Collision gas was argon 99.995% (Praxair, Valencia, Spain). The desolvation gas flow was set at 1000 l/h, and the cone gas was set at 80 l/h. The desolvation gas temperature was set to 600 °C, the source temperature to 130 °C and the column temperature to 40 °C.

For MSE experiments, two acquisition functions with different collision energies were created: the low-energy (LE) function, with a fixed collision energy of 4 eV, and the high-energy (HE) function, with a collision energy ramp ranging from 15 to 40 eV, in order to obtain the (de)protonated ion from the LE function and a wide range of fragment ions from the HE function. Both LE and HE functions used a scan time of 0.3 s with an inter-scan delay of 0.05 s and were applied simultaneously in the same injection.

MS/MS experiments were carried out in the same conditions with different collision energies depending on the fragmentation observed for each compound. Calibrations were conducted from m/z 50–1200 with a 1:1 mixture of 0.05 M NaOH:5% HCOOH diluted (1:25) with H2O:ACN (20:80), at a flow rate of 10 μl/min. For automated accurate mass measurement, a leucine-enkephalin solution (2 μg/ml) in ACN:H2O (50:50) at 0.1% HCOOH was pumped at 20 μl/min through the lock-spray needle and measured every 30 s, with a scan time of 0.3 s. The (de)protonated molecule of leucine-enkephalin, at m/z 556.2771 in positive mode and m/z 554.2615 in negative mode, was used for recalibrating the mass axis during the injection and to ensure a robust accurate mass over time.

Data processing

The untargeted metabolomics data workflow (Figure S2) starts by converting LC-MS raw data from proprietary (.raw, Waters Corp.) to generic (.cdf, NetCDF) format using the Databridge application (within MassLynx v 4.1; Waters Corporation). The converted data were then processed using the XCMS R package (https://xcmsonline.scripps.edu/) (Smith et al. 2006). The Centwave feature detection algorithm was employed for peak picking (peak width from 5 to 20 s, S/N ratio higher than 10 and mass tolerance of 15 ppm) to convert chromatograms into a list of detected features. It was followed by retention time alignment, to identify the same ion across different samples with slightly different retention times (around 10 seconds of difference). The aligned features were labelled as MxxxTyyy, where xxx corresponds to the nominal mass of the compound and yyy to the retention time in seconds. Mean centring was applied to normalise each data set, minimising instrumental drifts between samples. Finally, log2 transformation was applied to the area of each detected signal to avoid heteroscedasticity, followed by Pareto scaling, which gives each feature a “statistical weight” that reflects the differences between groups rather than its total area.

Multivariate analysis

Principal component analysis (PCA) and PLS-DA were performed by means of the EZ-Info software (Umetrics, Sweden). Firstly, PCA was used to ensure the absence of outliers and the correct grouping of QC samples after normalisation. PLS-DA was then applied to reduce dimensions in the data set. By means of VIP filtering, the minimum number of ions required to achieve a good classification model was obtained.

The model was created with 75–80% of the data set, with samples from both first and second season, ensuring that the selected compounds were independent from the harvest year, while the other 20–25% were not included in the model creation. With these two groups, the model was validated in two steps. Firstly, a cross-validation was applied to check the goodness of the model, and then an additional validation step was carried out with the 20–25% of the samples not included in the model. The statistical model gives two columns: the first one (Likely Classification), where the model assigns the sample unequivocally to the group obtained, and the second (Less Likely Classification), where the model can provide no result, only one result or more than one result. Samples with only one result in this column are given as correct by us while the rest are treated as unknown.
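
The scaling steps described under Data processing can be illustrated with a minimal Python sketch. It is not the XCMS or EZ-Info implementation used in this study. It assumes a small matrix of positive peak areas (rows are samples, columns are features), applies a log2 transformation and then Pareto-scales each feature by centring it and dividing by the square root of its standard deviation. The function names and the example values are hypothetical.

```python
import math


def log2_transform(areas):
    """Apply a log2 transformation to every peak area (areas must be > 0)."""
    return [[math.log2(value) for value in row] for row in areas]


def pareto_scale(matrix):
    """Pareto-scale each column (feature) of a samples x features matrix.

    Each feature is mean-centred and divided by the square root of its
    sample standard deviation. Constant features are only centred.
    """
    n_samples = len(matrix)
    if n_samples < 2:
        raise ValueError("at least two samples are needed")
    scaled_columns = []
    for column in zip(*matrix):
        mean = sum(column) / n_samples
        variance = sum((x - mean) ** 2 for x in column) / (n_samples - 1)
        std_dev = math.sqrt(variance)
        # Assumption: a constant feature keeps a divisor of 1 to avoid 0/0.
        divisor = math.sqrt(std_dev) if std_dev > 0 else 1.0
        scaled_columns.append([(x - mean) / divisor for x in column])
    # Transpose back to samples x features.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical peak areas: three samples, two features.
example_areas = [[2.0, 8.0], [8.0, 8.0], [32.0, 8.0]]
scaled = pareto_scale(log2_transform(example_areas))
```

Dividing by the square root of the standard deviation, rather than by the standard deviation itself, reduces the dominance of intense features while keeping part of the original scale.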
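
The hold-out step described under Multivariate analysis, in which 20–25% of the samples are kept out of model creation while every class stays represented, can be sketched as follows. This is an illustrative stratified split in plain Python, not the procedure implemented in EZ-Info. The function name, the fixed random seed and the rule for classes with a single sample (kept in the training set here) are assumptions.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.25, seed=0):
    """Split sample indices into training and test sets, class by class.

    Every class with at least two samples contributes at least one test
    sample and keeps at least one training sample.
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    train, test = [], []
    for label in sorted(indices_by_class):
        members = indices_by_class[label][:]
        rng.shuffle(members)
        if len(members) < 2:
            # Assumption: a class with one sample stays in the training set.
            train.extend(members)
            continue
        n_test = max(1, round(len(members) * test_fraction))
        n_test = min(n_test, len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


# Hypothetical class labels for 14 samples of three varieties.
example_labels = ["A"] * 8 + ["B"] * 4 + ["C"] * 2
train_idx, test_idx = stratified_holdout(example_labels, test_fraction=0.25)
```

Splitting within each class, instead of over the whole sample set, keeps small groups from disappearing from the validation set by chance.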

Marker identification

The MS/MS spectra of the most significant metabolites at 10, 20, 30 and 40 eV were acquired and searched in online databases such as METLIN (https://metlin.scripps.edu/landing_page.php?pgcontent=mainPage) or were tentatively elucidated in silico with MetFrag Software (https://msbi.ipb-halle.de/MetFragBeta/), employing ChemSpider as chemical structure library. When no hits were obtained, we attempted to elucidate them manually.

Results and discussion

Sample treatment

Almonds contain, regarding the polarity of the compounds, two fractions: a polar fraction (studied in this paper) and a less-polar fraction (mainly composed of lipids). The polar fraction requires polar solvents (water, methanol, acetonitrile), while other kinds of solvents should be employed to extract less-polar compounds (dichloromethane/methanol mixtures), as discussed in the literature (Cevallos-Cevallos et al. 2009), or even butanol and 2-propanol when dealing directly with oils (Gil-Solsona et al. 2016). For this reason, the polar compounds were extracted with ACN:H2O (80:20) 0.1% HCOOH, which proved a good extraction solvent for a wide range of food matrices (Beltrán et al. 2013).

Data treatment

Both data sets (from positive and negative ionisation modes) were joined in a single file, with a total of 1555 different ions. Data were then analysed with PCA. At this point, QC samples were employed as an external standard to control the correct normalisation. The QC sample, as explained in the Sample processing section, is a pool of all the samples employed to perform the model. This sample, which has an average composition, should appear after normalisation in the centre of the PCA (non-supervised method) and grouped, meaning that the normalisation steps (mean centring, log2 and Pareto scaling) have corrected possible instrumental drifts and differences along the batch. In this case, after observing this correct QC grouping and the absence of outliers (Figure S1), PLS-DA models were created for country and variety classification.

It is important to observe that both data sets were joined in a single file in order to extract the best ions for the discrimination step regardless of the ionisation mode. As we employed an untargeted strategy, if one of the two ionisation modes better explained the differences between groups, these ions would be preferably selected by VIP filtering. If not, it is necessary to ensure that the selected markers are the group that better explains the differences independently of their ionisation.

Country classification by PLS-DA

Figure 1. PLS-DA model for country classification. (a) Score plot of the first two components and (b) score plot with test samples.

Initially, a model to differentiate the origin of the almonds was created. Samples were divided into three different groups, Spanish almonds (Belona, Carrerona, Comuna, Ferranduel, Guara, Largueta, Marcona and Soleta), USA almonds (Bute-padre, California and Non-pareil) and Bitter almonds,
which showed a different behaviour (Figure 1). As has been previously explained, both sample sets (first and second season) were mixed and 80% of the samples were employed to create the model, while the remaining 20% were used for model validation.

The most important ions for the model were initially reduced down to 20 by means of their VIP value, ensuring that all the samples in the cross-validation were correctly classified. However, despite a soft ionisation source being employed, more than one ion could be obtained for each marker compound. Therefore, adducts and/or in-source fragments corresponding to the same marker were excluded based on mass accuracy as well as chromatographic profile. Finally, the total number of ions was heavily reduced to only five compounds/ions. The PLS-DA model was created with these five ions with goodness-of-fit (R2Y = 0.848) and goodness-of-prediction (Q2Y = 0.771) for the first two components. Then, it was validated in two steps, as recommended in the literature (Riedl et al. 2015). The first step was the cross-validation of the sample set employed to perform the model. Here, all 50 samples, which were a mix of both seasons, were correctly labelled, ensuring the model robustness. To finally validate the model, 12 samples not included initially in the model creation (two bitter almonds, five Spanish almonds and five USA almonds) were employed to test the model. All the samples were properly classified, showing the successful applicability of the model.

Marker identification (see Table 1) was tentatively performed after MS/MS experiments. The most important product ions can be observed in Table S1. Regarding the marker labelled as M318T239, after searching its accurate mass (m/z 318.2022) in the METLIN database, several hits were retrieved. However, the comparison of the empirical MS/MS spectrum with available spectra allowed us to tentatively elucidate it as a tripeptide (Val-Thr-Val). In a similar way, marker M298T178 was tentatively elucidated as 5ʹ-deoxy-5ʹ-(methylthio)adenosine.

M293T201 and M448T119 were tentatively identified as glucopyranosyl hydroxycaproic acid and diglucopyranosyl niacin after observing losses corresponding to hexose groups (C6H10O5) in their MS/MS spectra. The remaining product ions allowed us to identify the corresponding aglycone moiety using the METLIN database.

The METLIN database was used to search for compound M933T337, but as no results were obtained, it was further evaluated using the MetFrag in-silico fragmentation web tool, searching for possible structures in ChemSpider. Based on its fragmentation pattern, with consecutive neutral losses of amygdalin and hexose yielding a final product ion at m/z 297.0948, this marker was tentatively elucidated as de-hypoxanthine futalosine conjugated with amygdalin and one hexose.

Table 1. Markers selected for country discrimination.

Feature | Ionisation mode | Ion | Tentative elucidation | Molecular formula | Exact mass/accurate mass (error), [M+H]+ or [M−H]− | Retention time (min)
M448T119 | Positive | [M+H]+ | Diglucopyranosyl niacin | C18H25NO12 | 448.1455/448.1458 (+0.3 mDa) | 1.99
M298T178 | Positive | [M+H]+ | 5ʹ-Deoxy-5ʹ-(methylthio)adenosine | C11H15N5O3S | 298.0974/298.0959 (−1.5 mDa) | 2.98
M293T201 | Negative | [M−H]− | Glucopyranosyl hydroxycaproic acid | C12H22O8 | 293.1236/293.1231 (−0.5 mDa) | 3.35
M318T239 | Positive | [M+H]+ | Val-Thr-Val | C14H27N3O5 | 318.2029/318.2022 (−0.7 mDa) | 3.94
M933T337 | Positive | [M+NH4]+ | Amygdalin hexose de-hypoxanthine futalosine | C40H53NO23 | 916.3087/916.3058 (−2.8 mDa) | 4.98

Spanish varieties classification by PLS-DA

In a second step, a classification model was created in order to differentiate among the Spanish varieties included in the first model. Samples were obtained after mixing almonds of different Spanish regions, always ensuring the same variety. This fact guarantees that the highlighted markers are robust for any Spanish almond, independently of their cultivar.

For this variety classification model, Spanish almonds were divided into seven different groups (Belona, Carrerona, Comuna, Ferranduel, Guara, Largueta and Marcona), employing 75% (30 samples) of all the Spanish almonds to train the model and the remaining 25% (10 samples) to validate it, with at least one sample per variety. In the case of the Soleta variety, as only one sample was obtained, it was only employed to test the model, evaluating potential misclassifications.

VIP filtering was applied to the whole table and features were checked to include only one ion per
compound, typically the (de)protonated molecule or an adduct. Only 20 ions were necessary to build a model (see Table 2) with satisfactory goodness-of-fit (R2Y = 0.866) and goodness-of-prediction (Q2Y = 0.760) using eight components (Figure 2(a)). This classification model was again validated in two steps. The first was a cross-validation, where 29 samples were correctly labelled, with only one remaining as unknown. The final validation of the model was made with 10 samples not included in the initial model (1 Belona, 1 Carrerona, 1 Comuna, 1 Ferranduel, 1 Guara, 2 Largueta, 2 Marcona and 1 Soleta). Soleta was employed to test that samples from varieties not included in the model were not wrongly labelled. Eight samples were correctly classified, while two samples were classified as unknown. One of these was the Soleta sample, labelled as missing in Figure 2(b), showing the model goodness against misclassifications of new almond varieties.

Table 2. Markers selected for variety discrimination.

Feature | Ionisation mode | Ion | Tentative elucidation | Molecular formula | Exact mass/accurate mass (error), [M+H]+ or [M−H]− | Retention time (min)
M503T55 | Negative | [M−H]− | Gentianose | C18H32O16 | 503.1616/503.1612 (+0.4 mDa) | 1.00
M377T68 | Negative | [M+Cl]− | Inulobiose | C12H22O11 | 341.1070/341.1084 (−1.4 mDa) | 1.23
M268T85 | Positive | [M+H]+ | Adenosine | C10H13N5O4 | 268.1045/268.1056 (−1.1 mDa) | 1.46
M166T99 | Positive | [M+H]+ | L-phenylalanine | C9H11NO2 | 166.0867/166.0868 (−0.1 mDa) | 1.64
M476T114 | Positive | [M+H]+ | Dihexosyl 2-ethylisonicotinic acid | C20H29NO12 | 476.1786/476.1768 (+1.8 mDa) | 1.79
M577T127 | Negative | [M−H]− | Procyanidin B5 | C30H26O12 | 577.1334/577.1346 (−1.2 mDa) | 2.05
M450T190 | Positive | [M+NH4]+ | Benzyl gentiobioside | C19H28O11 | 433.1737/433.1710 (+2.7 mDa) | 3.14
M313T217 | Positive | [M+NH4]+ | Unknown | C14H17NO6 | 296.1129/296.1107 (+2.2 mDa) | 3.58
M289T214 | Negative | [M−H]− | Unknown | C15H14O6 | 289.0706/289.0712 (−0.6 mDa) | 3.6
M318T239 | Positive | [M+H]+ | Val-Thr-Val | C14H27N3O5 | 318.2029/318.2022 (−0.7 mDa) | 3.87
M464T246 | Positive | [M+NH4]+ | Hexosyl-2-phenylethyl glucopyranoside | C20H30O11 | 447.1857/447.1866 (−0.9 mDa) | 4.09
M548T340 | Positive | [M+NH4]+ | Dihexosyl-3-dimethylallyl-4-hydroxybenzoate | C24H34O13 | 531.2090/531.2078 (+1.2 mDa) | 5.65
M563T342 | Positive | [M+H]+ | Unknown | C26H30N2O12 | 563.1882/563.1877 (+0.5 mDa) | 5.7
M540T362 | Positive | [M+NH4]+ | Unknown | C26H34O11 | 523.2181/523.2179 (+0.2 mDa) | 6.05
M521T362 | Negative | [M−H]− | Unknown | C26H34O11 | 521.2013/521.1983 (+3.0 mDa) | 6.05
M287T340 | Negative | [M−H]− | 7-hydroxy-3-(3,4,5-trihydroxyphenyl)-2,3-dihydro-4H-chromen-4-one | C15H12O6 | 287.0542/287.0556 (−1.4 mDa) | 6.07
M593T393 | Negative | [M−H]− | 4,7-Dihydroxy-2-(4-hydroxyphenyl)-5-oxo-5H-chromen-3-yl-6-O-(6-deoxy-mannopyranosyl)-glucopyranoside | C27H30O15 | 593.1505/593.1506 (−0.1 mDa) | 6.56
M477T395 | Negative | [M−H]− | Unknown | C22H22O12 | 477.1026/477.1033 (−0.7 mDa) | 6.61
M647T403 | Positive | [M+Na]+ | Deoxyhexosyl hexosyl quercetin 3-methyl ester | C28H32O16 | 625.1769/625.1769 (0.0 mDa) | 6.72
M348T852 | Positive | [M+H]+ | Hexadecyl methyl glycerol | C20H42O3 | 331.3202/331.3212 (−1.0 mDa) | 14.06

Figure 2. PLS-DA model for variety classification. (a) Score plot of the first two components and (b) score plot with test samples. Soleta sample is labelled as unknown.

MS/MS experiments were acquired for these 20 ions and searched in online databases (METLIN or HMDB). When no results were obtained (12 out of 20), elucidation was attempted with the MetFrag in-silico fragmentation tool, annotating only two additional compounds and with the remaining 10 ions still to be elucidated manually, the most complex and lengthy step in the metabolomics workflow. In this case, we have tentatively elucidated four extra markers, leaving six markers only as chemical formulas.

M268T85, M450T190, M348T852, M166T99, M318T239, M503T55, M377T68 and M577T127 were found in METLIN and tentatively elucidated after comparing their experimental spectra with online records. M287T340 and M593T393 were tentatively elucidated with the MetFrag tool, selecting the highest scoring molecules. For the rest of the compounds, M464T246, M548T340, M647T403 and M476T114 were finally manually elucidated. An example of manual elucidation is shown in Figure 3. MS/MS experiments were carried out for the M464T246 marker, annotated as an ammonium adduct, based on accurate mass full scan spectra (Figure S3). A product ion at m/z 285.1343 (+1.1 mDa mass error) was observed at 10 eV, corresponding to the loss of NH3 plus a C6H10O5 group. This signified the presence of at least one hexose unit in the molecule. Then, two consecutive water losses were observed at m/z 267.1216 (+0.3 mDa) and 249.1114 (+0.3 mDa). Furthermore, the product ion at m/z 163.0602 was assigned to an additional hexose unit showing an elemental composition [C6H11O5]+, which is supported by two other consecutive water losses at m/z 145.0501 (−0.4 mDa) and 127.0387 (0.9 mDa). At this point, this compound was tentatively identified as hexosyl-2-phenylethyl glucopyranoside, as shown in Table 2. In the same way, an attempt was made to elucidate the remainder of the compounds. Six were not elucidated unambiguously as more than one compound fitted the experimental spectra. In any event, the two main product ions are reported in Table S2.

Figure 3. Structural elucidation for M464T246. MS/MS spectra at 10 eV (bottom) and 20 eV (top) of the ammonium adduct.

With these 20 compounds, 95% of the samples were correctly classified regarding their variety, as the Soleta sample was not misclassified, showing
the robustness of the model to assess the correct variety and to avoid false positive assignments.

Conclusions

This work has shown that untargeted metabolomics is a powerful technique to develop classification models to differentiate food, based not only on their origin but also on variety. The selected extraction procedure provides a fast and easy analysis, obtaining robust results. Additionally, the power of the UHPLC-HRMS technique allows the analysis of a wide range of compounds occurring at low concentrations, highlighting those providing most differentiation. The analysis of both positive and negative ionisation modes gives information on a wider range of acidities, which, supported by the appropriate HRMS sensitivity, highlights the best compounds to create classification models.

One of the advantages of QTOF instruments is the possibility to perform tandem mass spectrometry experiments with accurate mass information, which strongly helps in the elucidation process. The model has allowed us to discriminate the origins of the almonds with only five compounds, ensuring the differentiation between Spanish and American almonds, also avoiding the inclusion of bitter almonds. Furthermore, 20 markers have been selected to discriminate the almond variety, enabling 95% of the samples to be classified correctly. For the future, these promising results will be validated with a larger sample set, ensuring that the models continue being robust and accurate in following seasons.

Acknowledgments

The authors acknowledge the support from Generalitat Valenciana (Group of Excellence Prometeo II/2017/023). This work has also been developed with financial support from Universitat Jaume I (UJI-B2016-10).

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This work was supported by the Generalitat Valenciana [Group of Excellence Prometeo II/2017/023]; Universitat Jaume I [UJI-B2016-10].

ORCID

R. Gil Solsona http://orcid.org/0000-0003-0937-9072
J. V. Sancho http://orcid.org/0000-0002-6873-4778

References

Alasalvar C, Bolling BW. 2015. Review of nut phytochemicals, fat-soluble bioactives, antioxidant components and health effects. Br J Nutr. 113:S68–S78.
Amorello D, Orecchio S, Pace A, Barreca S. 2016. Discrimination of almonds (Prunus dulcis) geographical origin by minerals and fatty acids profiling. Nat Prod Res. 30:2107–2110.
Barreira JCM, Casal S, Ferreira ICFR, Peres AM, Pereira JA, Oliveira MBPP. 2012. Supervised chemical pattern recognition in almond (Prunus dulcis) Portuguese PDO cultivars: PCA- and LDA-based triennial study. J Agric Food Chem. 60:9697–9704.
Beltrán E, Ibáñez M, Portolés T, Ripollés C, Sancho JV, Yusà V, Marín S, Hernández F. 2013. Development of sensitive and rapid analytical methodology for food analysis of 18 mycotoxins included in a total diet study. Anal Chim Acta. 783:39–48.
Beltrán Sanahuja A, Ramos Santonja M, Grané Teruel N, Martín Carratalá ML, Garrigós Selva MC. 2011. Classification of almond cultivars using oil volatile compound determination by HS-SPME–GC–MS. J Am Oil Chem Soc. 88:329–336.
Bullo M, Lamuela-Raventos R, Salas-Salvado J. 2011. Mediterranean diet and oxidation: nuts and olive oil as important sources of fat and antioxidants. Curr Top Med Chem. 11:1797–1810.
Castro-Puyana M, Herrero M. 2013. Metabolomics approaches based on mass spectrometry for food safety, quality and traceability. TrAC Trends Anal Chem. 52:74–87.
Cevallos-Cevallos JM, Etxeberria E, Danyluk MD, Rodrick GE. 2009. Metabolomic analysis in food science: a review. Trends Food Sci Technol. 20:557–566.
Emwas A-HM. 2015. The strengths and weaknesses of NMR spectroscopy and mass spectrometry with particular focus on metabolomics research. Methods Mol Biol. 1277:161–193.
Gil-Solsona R, Raro M, Sales C, Lacalle L, Diaz R, Ibañez M, Beltran J, Sancho JV, Hernández FJ. 2016. Metabolomic approach for Extra virgin olive oil origin discrimination making use of ultra-high performance liquid chromatography - Quadrupole time-of-flight mass spectrometry. Food Control. 70:350–359.
Hyson DA, Schneeman BO, Davis PA. 2002. Almonds and almond oil have similar effects on plasma lipids and LDL oxidation in healthy men and women. J Nutr. 132:703–707.
Jamshed H, Gilani A-H, Sultan FAT, Amin F, Arslan J, Ghani S, Masroor M, Obermayr R, Temml C, Gutjahr G, et al. 2015. Almond supplementation reduces serum uric acid in coronary artery disease patients: a randomized controlled trial. Nutr J. 15:77.
Kodad O, Socias i Company R. 2008. Variability of oil content and of major fatty acid composition in almond (Prunus amygdalus Batsch) and its relationship with kernel quality. J Agric Food Chem. 56:4096–4101.
Petroselli G, Mandal MK, Chen LC, Hiraoka K, Nonami H, Erra-Balsells R. 2015. In situ analysis of soybeans and nuts by probe electrospray ionization mass spectrometry. J Mass Spectrom. 50:676–682.
Riedl J, Esslinger S, Fauhl-Hassek C. 2015. Review of validation and reporting of non-targeted fingerprinting approaches for food authentication. Anal Chim Acta. 885:17–32.
Romero A. 2014. Almond quality requirements for industrial purposes - its relevance for the future acceptance of new cultivars from breeding programs. Acta Hortic. 213–220.
Ros E. 2010. Health benefits of nut consumption. Nutrients. 2:652–682.
Rubert J, Zachariasova M, Hajslova J. 2015. Advances in high-resolution mass spectrometry based on metabolomics studies for food – a review. Food Addit Contam Part A. 32:1685–1708.
Ruiz-Aracama A, Lommen A, Huber M, Van De Vijver L, Hoogenboom R. 2011. Application of an untargeted metabolomics approach for the identification of compounds that may be responsible for observed differential effects in chickens fed an organic and a conventional diet. Food Addit Contam Part A. 1–10.
Sales C, Cervera MI, Gil R, Portolés T, Pitarch E, Beltran J. 2017. Quality classification of Spanish olive oils by untargeted gas chromatography coupled to hybrid quadrupole-time of flight mass spectrometry with atmospheric pressure chemical ionization and metabolomics-based statistical approach. Food Chem. 216:365–373.
Shen Q, Dong W, Yang M, Li L, Cheung H-Y, Zhang Z. 2013. Lipidomic fingerprint of almonds (Prunus dulcis L. cv Nonpareil) using TiO2 nanoparticle based matrix solid-phase dispersion and MALDI-TOF/MS and its potential in geographical origin verification. J Agric Food Chem. 61:7739–7748.
Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G. 2006. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem. 78:779–787.
Ukhanova M, Wang X, Baer DJ, Novotny JA, Fredborg M, Mai V, Coates AM, Howe PR, Bolling BW, Chen C-YO, et al. 2014. Effects of almond and pistachio consumption on gut microbiota composition in a randomised cross-over human feeding study. Br J Nutr. 111:2146–2152.
Yada S, Lapsley K, Huang G. 2011. A review of composition studies of cultivated almonds: macronutrients and micronutrients. J Food Compos Anal. 24:469–480.
