Food Control
journal homepage: www.elsevier.com/locate/foodcont
Keywords: Almonds; NIR; Sample preparation; Geographical origin; Support vector machine; Chemometrics

Abstract

The aim of the present study was to conduct a systematic and in-depth comparison of four different sample preparation techniques for near-infrared (NIR) spectroscopy analysis of almonds and evaluation of their suitability for the geographical origin determination. Although it is generally known that the sample preparation has an impact on the NIR screening, there is no scientific consensus on a commonly accepted procedure. In this work, 64 almond samples from six countries were analyzed as whole and bisected nuts as well as in a ground and freeze-dried state (after grinding). In order to assess the suitability for future applications, both the labor effort for sample preparation and the classification accuracy were evaluated. Using support vector machine (SVM) classification we obtained a classification accuracy of 80.2% (±1.9%) on the validation set for the determination of origin of freeze-dried almonds. The other three sample preparations result in at least 8.3 percentage points lower classification accuracies. Nonetheless, the analysis of whole and bisected almonds is more suitable for an initial rapid screening due to a lower overall required work effort. The results confirm the influence of the sample preparation techniques in NIR screening and pave the way for future widespread analytical applications.
∗ Corresponding author. E-mail address: markus.fischer@chemie.uni-hamburg.de (M. Fischer).
1 These authors contributed equally to this work.
https://doi.org/10.1016/j.foodcont.2020.107302
Received 2 February 2020; Received in revised form 5 April 2020; Accepted 6 April 2020; Available online 11 April 2020
0956-7135/ © 2020 Elsevier Ltd. All rights reserved.
M. Arndt, et al. Food Control 115 (2020) 107302

1. Introduction

Fourier transform near-infrared (FT-NIR) spectroscopy is a highly versatile and powerful method in verifying the authenticity of food raw materials. It has been used to address a wide range of food-related questions such as the determination of adulteration in olive oil or honey, of the geographical origin of fish, to distinguish varieties of wine and quality issues such as mycotoxin contamination (Cozzolino, 2016; Rodriguez-Saona, Giusti, & Shotts, 2016). In general, FT-NIR spectroscopy is a quick analytical method since no extraction is required prior to analysis. Due to its simple handling and rapid analysis, FT-NIR spectroscopy is a cost-efficient, non-polluting and thus green analytical method (Armenta, Garrigues, & de la Guardia, 2008; Gałuszka, Migaszewski, & Namieśnik, 2013). Therefore, various analytical methods based on NIR spectroscopy, e.g. the determination of water content in meat (Porep, Kammerer, & Carle, 2015), are already established in industrial routine laboratories.

Even though FT-NIR spectroscopy is generally a fast method, the sample preparation can differ significantly. Depending on the chosen sample preparation technique – i.e. for nuts simply using whole or bisected nuts or more elaborately prepared materials like ground or freeze-dried nuts – time requirement can vary from almost immediate spectra recording to around 72 h (Richter, Rurik, Gurk, Kohlbacher, & Fischer, 2019; Zhang, Jiang, Liu, Mei, & Huang, 2017). Currently, however, there is no universally established form of sample preparation of food materials for the determination of origin. While in recent studies Biancolillo et al. and Moscetti et al. analyzed whole hazelnuts, Vitale et al. bisected pistachios prior to analysis (Biancolillo et al., 2018; Moscetti, Radicetti, Monarca, Cecchini, & Massantini, 2015; Vitale et al., 2013). In other laboratories even ground walnuts (Gu et al.,
2018) or freeze-dried (after grinding) asparagus were utilized for FT-NIR measurements (Richter et al., 2019). NIR screening of freeze-dried (after grinding, hereinafter referred to as just freeze-dried) nuts has – to the best of our knowledge – never been applied, despite it being a promising sample preparation technique due to the removal of potentially superimposing water bands in the resulting spectra.

This study shall provide a systematic and in-depth comparison and evaluation of the above-mentioned forms of sample preparation (whole, bisected, ground, and freeze-dried) in terms of their suitability for the determination of origin of almonds (Prunus dulcis MILLER). The prediction of the geographical origin of almonds is of great economic interest as the country of harvest determines the market price. While almonds from the USA cost about 5 US$/kg, almonds from the Mediterranean region can cost up to 9 US$/kg (related to worldwide export 2018, UN Comtrade Database, 2018). Although a financial profit could be generated by misdeclaration of the origin of almonds (Manning, 2016), there is still no reliable and routinely applicable method of origin determination to counteract this type of food fraud. The potential of NIR analysis for this application has, however, been hinted at in a recent publication (Firmani, Bucci, Marini, & Biancolillo, 2019) presenting an FT-NIR-based method for differentiating “Avola Almonds” from other Italian almonds.

The aim of this study is therefore to investigate the influence of various sample preparations on the FT-NIR analysis of almonds. Whole, bisected, ground, and freeze-dried almonds are compared regarding their processing effort, analysis time, required sample volume, reproducibility, and, in particular, their ability to classify the geographical origin. To this end, almonds from various economically relevant production countries were acquired and analyzed in all four sample preparations via a non-targeted FT-NIR spectroscopy approach. The data were evaluated by multivariate analysis, especially principal component analysis (PCA) and support vector machine (SVM) classification.

2. Materials and methods

2.1. Almond sample acquisition

In order to assess the suitability for geographical origin determination, a total of 64 authentic almond samples from six different countries were analyzed. The samples were acquired directly from the producers and exporters. With the exception of the Australian samples, which stem from 2019 crops, all almonds were harvested in 2018. Due to the seasonal differences in the southern hemisphere, however, a six-month discrepancy of the harvest time is inevitable. In an attempt to cover a broad spectrum of almonds, 17 different varieties (e.g. Non Pareil, Padre, Butte, Tuono) and additionally even bitter almonds were chosen.

2.2. Sample preparation

100–1000 g of the acquired almond raw material, shelled and unshelled, were shock-frozen in liquid nitrogen for 5 min each and subsequently stored at −20 °C. If necessary, the almonds were cracked manually under dry ice cooling (−78.5 °C) to safely remove remaining endocarp. The samples were stored at −20 °C. Each of the 64 acquired authentic almond samples was eventually analyzed after undergoing the four sample preparation techniques which are listed in the following sections.

2.2.1. Preparations for analyzing whole almonds
Five almonds with intact tegument (brown skin) were selected from the available sample material. If the morphology of the seeds visually differed greatly within the sample, almonds were chosen which also differed in shape and/or color in order to cover the sample's variance as best as possible. The almonds were thawed at room temperature (22 °C ± 2 °C) before the FT-NIR analysis. After the analysis, the almonds were bisected to check for visible damage, because internal damage can be detected via NIR spectroscopy (Nakariyakul, 2014).

2.2.2. Preparations for analyzing bisected almonds
The selection of the five almonds was conducted as described in the previous section. After thawing (22 °C ± 2 °C), the almonds were manually bisected widthways resulting in a plane contact surface applicable for FT-NIR analysis.

2.2.3. Preparations for analysis of ground almonds
At least 100 g of whole frozen almonds were ground to homogeneous powder by using a knife mill (Grindomix GM 300, Retsch, Haan, Germany). In order to prevent frictional heat, the grinding was carried out using dry ice (at least twice the sample mass). The ground material was stored at −20 °C for a minimum of 24 h to ensure quantitative dry ice sublimation. Prior to FT-NIR analysis, 1.25 g (±0.1 g) of the ground material was thawed at 22 °C (±2 °C) in closed glass vials (52.0 mm × 22 mm × 1.2 mm, Nipro Diagnostics Germany GmbH, Ratingen, Germany).

2.2.4. Preparations for analysis of freeze-dried almonds
The ground almonds (obtained after the grinding process described in Section 2.2.3) were freeze-dried using a freeze dryer (Beta 1–8 LSCplus, Martin Christ Freeze Dryers GmbH, Osterode, Germany) to decrease the water content to about 1% by weight. About 60 g ground almonds (incl. dry ice) were freeze-dried for 24 h. Subsequently, the ground material was manually stirred and freeze-dried for an additional 24 h. Thereafter, 1.25 g (±0.1 g) of freeze-dried almond powder was thawed at 22 °C (±2 °C) in closed glass vials (52.0 mm × 22 mm × 1.2 mm, Nipro Diagnostics Germany GmbH, Ratingen, Germany).

2.3. NIR spectroscopy

The thawed samples were analyzed using an FT-NIR spectrometer with an integrating sphere (TANGO, Bruker Optics, Bremen, Germany). The spectra were acquired in reflectance mode with 50 scans per spectrum and a resolution of 2 cm⁻¹. A wavenumber range of 11,550 to 3950 cm⁻¹ was selected. Data acquisition was carried out via OPUS software (Bruker Optics, Bremen, Germany).

Spectra of the ground and freeze-dried samples were acquired inside glass vials (see Section 2.2.3), while the whole and bisected samples were put directly on the instrument's surface.

Depending on the form of sample preparation, different numbers of replicates are required for a sufficient coverage of the almonds’ basic populations. While the ground and freeze-dried samples required only five recorded spectra, analysis of whole and bisected almonds was conducted with six spectra per almond (amounting to a total of 30 spectra per sample). A more detailed description of the four different procedures can be taken from Table 1. All samples were measured at room temperature (22 °C ± 2 °C).

2.4. Spectra pre-processing

To avoid overfitting and enable comparability, multiplicative scatter correction (MSC), smoothing, first order derivative, reduction of variables, and median formation were applied as data pre-processing steps.

At first, MSC was used in order to equalize additive and multiplicative scattering effects using the mean spectrum of each sample preparation technique as reference. The mentioned scattering effects lead to differences in the spectra caused by inhomogeneous physical effects and not – as is desired – by geographical origin. To put it more precisely, the morphology of whole and bisected almonds can result in different scattering as the shape of the seeds or the surface structure of the tegument varies. Moreover, different morphologies result in a
different amount of reflected light causing baseline shifting. Scattering effects caused by inhomogeneous physical effects are also observed when analyzing ground or freeze-dried samples as the particle size is not perfectly uniform.

In addition, the first-order gap-segment derivatives were calculated to eliminate offset and baseline drifts (gapDer function implemented in the R package prospectr 0.1.3, using a derivative order m of 1, a smoothing window s of 11 and a window size w of 11; Stevens & Ramirez-Lopez, 2014). Gap-segment derivatives use a window-based approach that first averages the points in a smoothing window centered around the current measurement point, then calculates the derivative by taking the difference between two points separated by a given gap size (Norris & Williams, 1984; Rinnan, Van Den Berg, & Engelsen, 2009).

The high resolution of 2 cm⁻¹ used during spectra acquisition resulted in 3725 variables that often contain similar information for adjacent wavenumbers. For this reason, the average of five contiguous wavenumbers was taken (binning function implemented in the R package prospectr 0.1.3), reducing 3693 (after the first derivative) to 739 variables.

The final step of data pre-processing was to take the median of each wavenumber of all measured spectra for the same sample (30 spectra for whole and bisected samples and five spectra for ground and freeze-dried samples). The median was used instead of the arithmetic average in order to minimize the influence of potential outliers.

2.5. Multivariate data analysis and classification models

Multivariate data analysis was conducted separately for each almond sample preparation. Each preparation leads to one data set containing the results of all 64 acquired almond samples.

First, a principal component analysis (PCA) was applied to visualize the present data. The PCAs were performed on the pre-processed and centered data (mean = 0). Subsequently, the supervised learning algorithm support vector machine (SVM) was applied for classifying the acquired data regarding the almond samples' geographical origin. SVM is a state-of-the-art classification method that can be used to train both linear and non-linear classifiers (Cortes & Vapnik, 1995). In this study SVM was performed using LIBSVM (Chang & Lin, 2011), a publicly available library that implements methods for the training and classification of SVMs. LIBSVM was accessed using the e1071 interface (R Core Team, 2019, R version 3.6.0; Meyer et al., 2019, ‘e1071’ version 1.7–2). As SVM is a binary classifier, for multiclass classification the one-versus-one approach implemented in LIBSVM was used, i.e. a binary classifier is created for each pair of classes. A new data point is then classified by taking the most frequent class label.

In order to avoid overfitting and simultaneously optimize the model parameters, a nested cross-validation was conducted (Krstajic, Buturovic, Leahy, & Thomas, 2014; Meyer et al., 2019; Varma & Simon, 2006). This approach uses an inner cross-validation to optimize model

Fig. 1 compares the median spectra of the four processing forms after MSC (9000–3950 cm⁻¹, median value of all samples of each sample preparation method, see Fig. S1 for whole and unprocessed spectra). A precise peak assignment cannot be achieved, since peak overlaps are unavoidable when complex matrices like almonds are analyzed. Nevertheless, the spectra of the bisected, ground as well as the freeze-dried almonds show similar trends in absorbance which can mainly be attributed to the predominant lipids. The HC=CH band occurs at approximately 8550 cm⁻¹ (C–H, second overtone) and is most likely caused by the carbon–carbon double bonds of the almonds’ unsaturated fatty acids. The aliphatic hydrocarbon part of the lipids absorbs in various areas: at about 5800 cm⁻¹, the C–H stretch first overtone of methylene is located while the symmetric CH₂ bond vibrations (C–H, stretch first overtone) appear at around 5680 cm⁻¹. Another lipid-associated absorbance is located at around 4330 cm⁻¹ which is caused by the C–H bending (second overtone). Besides lipids, some other characteristic bands were observed. At around 4855 cm⁻¹, protein vibrations are located due to the amide combination band of CONH₂. The broad absorbance band in a range of 7100–6100 cm⁻¹ is composed of the N–H stretching (first overtone) of proteins and the water-associated absorbance (O–H symmetric and asymmetric stretching combination, first overtone). Furthermore, a combination band of water appears at a wavenumber of around 5155 cm⁻¹ due to the O–H stretching and H–O–H bending combination (Buijs & Choppin, 1963; Shenk, Workman, & Westerhaus, 2001; Weyer & Workman, 2012).

Examining the NIR spectra of Fig. 1, it becomes evident that the median spectrum of the whole almonds' analysis differs from the other spectra: apart from the overall lower absorbance values, the spectrum also exhibits a partially disparate absorbance pattern. While the slightly weaker lipid bands are still observed, the absorbance in the range of 4960–4510 cm⁻¹ is significantly increased. When visually comparing the whole almond samples with samples of the other preparations, the most apparent difference is the tegument. Since the sample's surface strongly influences the measurement in NIR spectroscopy, the tegument's contribution to the spectra is – for obvious reasons – at its maximum in the spectra recorded from whole almonds. Hence, it seems likely that the band from 4960 to 4510 cm⁻¹ derives from composition differences of the tegument. In order to verify this explanatory approach, FT-NIR analysis of different compartments was conducted. For this purpose, one whole almond was blanched and the removed tegument as well as the blanched almond were analyzed separately (see Fig. S2). In these spectra, the intensity increase of the band from 4960 to 4510 cm⁻¹ is only observable in the tegument and the whole almond. While the tegument has a fiber content of about 46 wt%, the whole almonds merely exhibit 11–14 wt% fiber. The fiber fraction consists mainly of cellulose, hemicellulose and lignin – biological macromolecules which all form complex spectra independently – and might result in marked absorbance deviations due to the O–H bending and C–O stretching
Fig. 2. Median almond spectra after MSC and first derivative for (a) whole, (b) bisected, (c) ground, and (d) freeze-dried almonds.
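The pre-processing chain behind these spectra (MSC, first-order gap-segment derivative, binning of five adjacent wavenumbers, and the per-sample median, as described in Section 2.4) can be illustrated with a minimal pure-Python sketch. It is not the prospectr implementation used in the study: the function names are hypothetical, and the scaling of the derivative (per index step) is an assumption that may differ from `gapDer`.

```python
from statistics import median


def _mean(values):
    return sum(values) / len(values)


def msc(spectra):
    """Multiplicative scatter correction using the mean spectrum as reference."""
    reference = [_mean(column) for column in zip(*spectra)]
    ref_mean = _mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for spectrum in spectra:
        spec_mean = _mean(spectrum)
        # Least-squares fit: spectrum ~ offset + slope * reference
        slope = sum((r - ref_mean) * (x - spec_mean)
                    for r, x in zip(reference, spectrum)) / ref_ss
        offset = spec_mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in spectrum])
    return corrected


def gap_segment_derivative(spectrum, segment=11, gap=11):
    """First derivative as the difference of two segment means separated by a gap.

    Assumption: the result is scaled per index step (distance between the
    segment centres), which may not match prospectr's gapDer scaling.
    """
    width = 2 * segment + gap
    derivative = []
    for start in range(len(spectrum) - width + 1):
        left = spectrum[start:start + segment]
        right = spectrum[start + segment + gap:start + width]
        derivative.append((_mean(right) - _mean(left)) / (segment + gap))
    return derivative


def bin_average(values, bin_size=5):
    """Average contiguous groups of bin_size points; the last bin may be shorter."""
    return [_mean(values[i:i + bin_size]) for i in range(0, len(values), bin_size)]


def median_spectrum(spectra):
    """Point-wise median over all replicate spectra of one sample."""
    return [median(column) for column in zip(*spectra)]
```

With a segment of 11 and a gap of 11 the filter spans 33 points, so a 3725-point spectrum yields 3693 derivative values, and averaging groups of five gives 739 variables, which matches the counts reported in Section 2.4.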
Fig. 3. PCA score plots showing differentiation for (a) whole, (b) bisected, (c) ground, and (d) freeze-dried almonds.
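How score plots of this kind and the explained variance per principal component are obtained from centered data can be sketched as follows. This is an illustrative pure-Python version based on power iteration with deflation, which is an assumption made for self-containedness; the study does not state which PCA routine was used, and all function names are hypothetical.

```python
import math


def center(data):
    """Subtract the column means so that every variable has mean 0."""
    n = len(data)
    means = [sum(column) / n for column in zip(*data)]
    return [[x - m for x, m in zip(row, means)] for row in data]


def covariance(centered):
    """Sample covariance matrix of already centered data."""
    n = len(centered)
    columns = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
            for ci in columns]


def leading_component(cov, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    p = len(cov)
    v = [1.0 + i for i in range(p)]  # arbitrary non-zero start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v


def pca(data, n_components=2):
    """Return the scores and the explained-variance ratio of each component."""
    centered = center(data)
    cov = covariance(centered)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    components, ratios = [], []
    for _ in range(n_components):
        eigenvalue, v = leading_component(cov)
        components.append(v)
        ratios.append(eigenvalue / total_variance)
        # Deflation: remove the variance already explained by this component
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, ratios
```

Plotting the first against the second column of `scores` gives a score plot, and the sum of the first two entries of `ratios` corresponds to the share of variance captured by the first two PCs.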
only about 70% of the variance in the first two PCs, cluster trends are observable slightly better in the whole almond PCA plot. However, the Mediterranean almonds (Spain and Italy) form a cluster in all shown PCAs and can thus be distinguished from other countries of origin. The separation of the Mediterranean almonds is a relevant concern – especially from an economic perspective – as these are the most expensive almonds compared to the almonds from the other origins (UN Comtrade Database, 2018). Higher-order PCs may also contain information that contributes to the determination of geographical origin (see Figs. S3–S6) since the explained variance increases up to 20 percentage points by adding the third and the fourth PC.

In order to compare the four sample preparation techniques quantitatively, a support vector machine (SVM) was used to classify the almonds regarding their geographical origin. A one-versus-one classifier with a Gaussian radial basis function (RBF) kernel was trained via LIBSVM (R package ‘e1071’) and validated using repeated nested cross-validation, which allows the model parameters to be optimized and an unbiased estimate of the generalization performance to be obtained. The hyperparameter optimization was performed using a grid search in the inner cross-validation loop (cost C from 10⁻⁵ to 10⁵, gamma γ from 10⁻⁶ to 10⁻¹).

Further comparison of the classification results is possible since the same almond samples were used for each sample preparation. As already presumed from the spectra, the validation accuracies (hereinafter referred to as just classification accuracy) differ depending on the chosen preparation (see Table 2). In comparison, the freeze-dried almonds show the highest classification accuracy of 80.2% (±1.9%). The superiority of this preparation is, most likely, due to an effective removal of water, which can cause unwanted signal overlay possibly resulting in information loss. While the analysis of ground and bisected almonds leads to accuracies of 71.9% (±3.5%) and 64.5% (±3.5%), respectively, the easiest and quickest analysis, that of whole almonds, still achieves an accuracy of 62.6% (±2.8%). The comparatively low accuracies of the whole and bisected almond analysis are most probably explained by the insufficient coverage of the light beam resulting in significant loss of information. Additionally, the classification accuracy of whole almonds could be decreased by the influence of the tegument. As the tegument is formed in an earlier growth phase, changes in external conditions have a greater impact on the inner kernel which grows in a later phase. For example, water deficiency in a crucial growth phase can lead to irregular kernel development (e.g. wrinkled almonds) where only the tegument is fully developed (Hawker & Buttrose, 1980).
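The nested cross-validation with an inner grid search and the one-versus-one vote described above can be outlined in a short sketch. Several details are assumptions because the text does not state them: the grid is stepped in decades, both loops use five folds, and `fit_and_score` is a hypothetical callable that would wrap SVM training on the training indices and return the accuracy on the test indices.

```python
import itertools
import random
from collections import Counter


def hyperparameter_grid():
    """Cost and gamma ranges from the text; the decade spacing is an assumption."""
    costs = [10.0 ** e for e in range(-5, 6)]    # 1e-5 ... 1e5
    gammas = [10.0 ** e for e in range(-6, 0)]   # 1e-6 ... 1e-1
    return list(itertools.product(costs, gammas))


def make_folds(n_items, k, rng):
    """Shuffle the positions 0..n_items-1 and deal them into k folds."""
    positions = list(range(n_items))
    rng.shuffle(positions)
    return [positions[i::k] for i in range(k)]


def nested_cv(n_samples, fit_and_score, grid, outer_k=5, inner_k=5, seed=0):
    """Mean outer-loop score; parameters are tuned on inner folds only."""
    rng = random.Random(seed)
    outer_scores = []
    for outer_test in make_folds(n_samples, outer_k, rng):
        held_out = set(outer_test)
        outer_train = [i for i in range(n_samples) if i not in held_out]
        inner_folds = make_folds(len(outer_train), inner_k, rng)

        def inner_score(params):
            scores = []
            for fold in inner_folds:
                fold_positions = set(fold)
                validation = [outer_train[p] for p in fold]
                train = [idx for p, idx in enumerate(outer_train)
                         if p not in fold_positions]
                scores.append(fit_and_score(params, train, validation))
            return sum(scores) / len(scores)

        # The outer test fold is never seen while selecting the parameters
        best_params = max(grid, key=inner_score)
        outer_scores.append(fit_and_score(best_params, outer_train, outer_test))
    return sum(outer_scores) / len(outer_scores)


def one_vs_one_vote(pairwise_predictions):
    """Most frequent label among the predictions of all pairwise classifiers."""
    return Counter(pairwise_predictions).most_common(1)[0][0]
```

For six countries of origin, the one-versus-one scheme needs one binary classifier per pair of classes, i.e. 15 classifiers, and `one_vs_one_vote` would receive their 15 predicted labels for each new spectrum. How ties are resolved here (first label encountered) is a property of this sketch, not a statement about LIBSVM.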
Table 3
Comprehensive comparison of the four sample preparations regarding work time (active and passive), sample quantity and classification accuracy via support vector machine (SVM).

| | whole | bisected | ground | freeze-dried |
| estimated active work time (measurement and preparation) | 10 min | 10 min | 30 min | 50 min (inclusive treatments during freeze-drying) |
| estimated passive work time (storage or freeze-drying) | – | – | 24 h (for sublimation of dry ice) and min. 4 h thawing | 48 h (freeze-drying) and min. 4 h thawing |
| sample quantity | 5 almonds | 5 almonds | min. 100 g | min. 100 g |
| classification accuracy via SVM | 62.6% | 64.5% | 71.9% | 80.2% |
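As a small worked check of the accuracy row, the gap between the best preparation and each of the others can be computed in percentage points. The snippet below only restates the values of Table 3; the variable names are illustrative.

```python
# Classification accuracies in percent, taken from Table 3
accuracy = {"whole": 62.6, "bisected": 64.5, "ground": 71.9, "freeze-dried": 80.2}

best = max(accuracy, key=accuracy.get)

# Difference to the best preparation in percentage points, rounded to one decimal
gaps = {name: round(accuracy[best] - value, 1)
        for name, value in accuracy.items() if name != best}

smallest_gap = min(gaps.values())
```

The smallest gap belongs to the ground almonds, which is where the statement that the other three preparations are at least 8.3 percentage points less accurate comes from.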
Influence of particle size. Cereal Chemistry, 61(2), 158–165.
Pannico, A., Schouten, R., Basile, B., Romano, R., Woltering, E., & Cirillo, C. (2015). Non-destructive detection of flawed hazelnut kernels and lipid oxidation assessment using NIR spectroscopy. Journal of Food Engineering, 160, 42–48.
Porep, J. U., Kammerer, D. R., & Carle, R. (2015). On-line application of near infrared (NIR) spectroscopy in food production. Trends in Food Science & Technology, 46(2), 211–230.
R Core Team (2019). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.r-project.org/, Accessed date: 25 November 2019.
Richter, B., Rurik, M., Gurk, S., Kohlbacher, O., & Fischer, M. (2019). Food monitoring: Screening of the geographical origin of white asparagus using FT-NIR and machine learning. Food Control, 104, 318–325.
Rinnan, Å., Van Den Berg, F., & Engelsen, S. B. (2009). Review of the most common pre-processing techniques for near-infrared spectra. TRAC Trends in Analytical Chemistry, 28(10), 1201–1222.
Rodriguez-Saona, L. E., Giusti, M. M., & Shotts, M. (2016). Advances in infrared spectroscopy for food authenticity testing. Advances in food authenticity testing. Woodhead Publishing.
Ruggeri, S., Cappelloni, M., Gambelli, L., Nicoli, S., & Carnovale, E. (1998). Chemical composition and nutritive value of nuts grown in Italy. Italian Journal of Food Science, 3, 243–252.
Shenk, J. S., Workman, J. J., & Westerhaus, M. O. (2001). Application of NIR spectroscopy to agricultural products. Practical Spectroscopy Series, 27, 419–474.
Socias, R., Kodad, O., Alonso, J., & Gradziel, T. (2007). Almond quality: A breeding perspective. Horticultural Reviews, 34, 197–238.
Stevens, A., & Ramirez-Lopez, L. (2014). An introduction to the prospectr package. R Package Vignette, Report No.: R Package Version 0.1.3.
Teye, E., Huang, X., Dai, H., & Chen, Q. (2013). Rapid differentiation of Ghana cocoa beans by FT-NIR spectroscopy coupled with multivariate classification. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 114, 183–189.
UN Comtrade Database (2018). Almond (shelled) export trade value and netweight. New York, USA: United Nations Publications Board. https://comtrade.un.org/data/, Accessed date: 2 November 2019.
Varma, S., & Simon, R. (2006). Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics, 7(1), 91.
Vitale, R., Bevilacqua, M., Bucci, R., Magrì, A. D., Magrì, A. L., & Marini, F. (2013). A rapid and non-invasive method for authenticating the origin of pistachio samples by NIR spectroscopy and chemometrics. Chemometrics and Intelligent Laboratory Systems, 121, 90–99.
Weyer, L., & Workman, J., Jr. (2012). Practical guide and spectral atlas for interpretive near-infrared spectroscopy. CRC Press.
Yada, S., Lapsley, K., & Huang, G. (2011). A review of composition studies of cultivated almonds: Macronutrients and micronutrients. Journal of Food Composition and Analysis, 24(4–5), 469–480.
Zhang, H., Jiang, H., Liu, G., Mei, C., & Huang, Y. (2017). Identification of Radix puerariae starch from different geographical origins by FT-NIR spectroscopy. International Journal of Food Properties, 20(sup2), 1567–1577.