
Vibrational Spectroscopy 114 (2021) 103244

Contents lists available at ScienceDirect

Vibrational Spectroscopy
journal homepage: www.elsevier.com/locate/vibspec

Discriminant analysis and quantitative study of antibiotics in infant milk powder based on hyperspectral detection
Jun Hu a, Zhen Xu a, Maopeng Li a, Yong He b, Yande Liu a, *, 1
a School of Mechatronics & Vehicle Engineering, East China Jiaotong University, Nanchang, Jiangxi, 330013, PR China
b School of Mechanical Engineering, Zhejiang University, Hangzhou, 310027, PR China

ARTICLE INFO

Keywords: Hyperspectral; Infant milk powder; Antibiotic; RF; LS-SVM

ABSTRACT

Objective: Food quality and safety have become a focus of public attention. Because antibiotic residues in food can cause serious harm to human health, rapid and non-destructive detection of these residues is needed. Antibiotic residues are among the most urgent quality problems of milk powder, so accurate qualitative identification and quantitative detection of antibiotics in milk powder are very important.

Method: Based on hyperspectral technology combined with chemometrics, this research took three common residual antibiotics in milk powder (doxycycline, chlortetracycline, also known as aureomycin, and oxytetracycline) as the research objects to monitor the quality of milk powder. Firstly, samples were prepared by grinding, drying, weighing and mixing in sequence according to the designed concentration gradient. Secondly, the spectral characteristics of the pure samples (infant milk powder and pure antibiotics) and of the samples containing the three types of antibiotic residues were acquired and compared. Thirdly, to build a qualitative discriminant model for different antibiotic residues in infant milk powder, Partial Least Squares Discriminant Analysis (PLS-DA) and Random Forest (RF) models were established to identify the antibiotic residues. Fourthly, to build a quantitative detection model while simplifying the models and reducing the computational complexity, three methods, namely the Successive Projections Algorithm (SPA), Uninformative Variable Elimination (UVE) and Competitive Adaptive Reweighted Sampling (CARS), were compared for wavelength selection. The Least Squares Support Vector Machine (LS-SVM) model was then established for quantitative detection of the residual antibiotics.

Result: In the qualitative analysis, the PLS-DA model could roughly identify the three antibiotics, with an accuracy of 96.2 %. The RF model performed better, with an identification accuracy of 100 %. In the quantitative analysis, after the wavelengths of the three types of milk powder samples were selected by the CARS algorithm, the CARS-LS-SVM model, which was established by using only 7 % of the data, performed well. The prediction set correlation coefficient (Rp) and Root Mean Square Error of Prediction (RMSEP) of milk powder samples containing aureomycin, doxycycline and oxytetracycline residues were 0.9990 and 0.08 %, 0.9996 and 0.05 %, and 0.9997 and 0.04 %, respectively. The Limits of Detection (LOD) of aureomycin, doxycycline and oxytetracycline were 2.44 × 10⁻³, 1.51 × 10⁻³ and 1.2 × 10⁻³, respectively.

Conclusion: The type of antibiotic residue in infant milk powder can be identified well by hyperspectral technology combined with the RF algorithm. LS-SVM models established by hyperspectral technology combined with the CARS algorithm can then be used to build better quantitative determination models of antibiotic residues in infant milk powder. This research can provide a theoretical basis for the detection of antibiotics in other types of food and can help to guarantee food safety to a certain extent.

* Corresponding author at: School of Mechatronics & Vehicle Engineering, East China Jiaotong University, 330013, PR China.
E-mail addresses: hujun_ecjtu@163.com (J. Hu), jxliuyd@163.com (Y. Liu).
1
Received: Oct 5, 2020.

https://doi.org/10.1016/j.vibspec.2021.103244
Received 26 November 2020; Received in revised form 15 February 2021; Accepted 10 March 2021
Available online 13 March 2021
0924-2031/© 2021 Published by Elsevier B.V.

1. Introduction

Infant milk powder is a necessity during the growth of infants and young children and is their main source of nutrition. Ensuring the safety of infant milk powder is therefore particularly important and bears on the future of the country [1]. Antibiotics are used not only as veterinary drugs to treat animal diseases, but also as feed additives to prevent animal diseases and promote animal growth. However, excessive use of antibiotics can lead to excessive antibiotic residues in the milk source, seriously threatening human health [2]. In pursuit of economic benefit, some milk powder manufacturers neglect quality supervision during production, resulting in excessive antibiotic content in infant milk powder that poses a serious threat to the safety of infants and young children. It is therefore necessary to strengthen the detection of antibiotic residues in milk powder to ensure its quality and safety.

Antibiotics in dairy products have long been a focus of academic attention. Currently, most analytical methods for antibiotics are based on physical and chemical analysis or on biological determination. Traditional physical and chemical analysis methods mainly include High Performance Liquid Chromatography (HPLC), Thin Layer Chromatography (TLC), Gas Chromatography (GC), Mass Spectrometry (MS), and combinations of these techniques [3]. These methods require complex pretreatment, and the detection process is tedious, time-consuming and expensive [4]. Traditional biological determination methods include immunoassay and biosensor methods [5]. They are all destructive, costly and complicated to operate, and cannot meet the speed and cost requirements for quality detection of infant milk powder [6]. Although pure antibiotics have obvious characteristic absorption peaks in the terahertz spectrum, terahertz technology still suffers from low laser source power and low detection sensitivity at this stage, and it is currently difficult to use in actual milk powder safety testing [7]. Although some scholars have studied the use of metamaterials to enhance detection sensitivity, the structural design of metamaterials is relatively complex, the processing is difficult, and the metamaterials are not universal [8,9]. It is therefore urgent to explore a rapid and non-destructive method for detecting antibiotic residues in milk powder.

As a non-destructive testing technology, hyperspectral detection has been widely used in various fields in recent years, and many researchers have applied it to the detection of food and chemical substances. Jie Dengfei et al. [10] used hyperspectral technology to study prediction models of soluble solid content in different parts of citrus, with a correlation coefficient of prediction (Rp) of 0.950 and a Root Mean Square Error of Prediction (RMSEP) of 0.636 %. Wang Guanghui et al. [11] used hyperspectral technology combined with chemometrics to model and analyze aflatoxin B1 and gibberellin in moldy corn; the prediction accuracy was 98.74 % for aflatoxin B1 content and 100 % for gibberellin content. Yin Yong et al. [12] used hyperspectral technology to classify the mildew degree of corn, and the identification accuracy for 6 grades of moldy corn was 98.6 %. Liu Zheng et al. [13] used hyperspectral technology to detect the nitrite content of sausages in different storage periods; the coefficient of determination and the root mean square error of the established partial least squares regression model were 0.9829 and 0.0592, respectively. The results showed that the spectral information at full wavelength was more suitable for establishing a hyperspectral detection model of nitrite content in sausage storage. Hyperspectral technology is particularly suitable for quality and safety research in food and agricultural products, but hyperspectral imaging has rarely been reported in antibiotics detection.

In this paper, infant milk powder and the widely used aureomycin, doxycycline and oxytetracycline were used as the research objects to detect antibiotic residues in milk powder. The PLS-DA and RF algorithms were used to find the superior discrimination model for the type of antibiotic residue. To optimize the model, the SPA, UVE and CARS algorithms were used to select the spectral wavelengths of the samples [14–16]. Finally, different quantitative models were established to detect the antibiotic residues in milk powder and to explore the optimal model. This study can provide a basis and theoretical reference for subsequent research on antibiotics in food.

2. Materials and methods

2.1. Experimental materials and devices

The infant milk powder used in this experiment was of the FIRMUS brand; the antibiotic samples were purchased from the Aladdin website (www.aladdin-e.com). The purity of aureomycin hydrochloride was ≥80.0 % (USP grade), the purity of doxycycline was ≥98.0 %, and the purity of oxytetracycline hydrochloride was ≥95.0 %. The samples were prepared according to the designed doping ratio (mass ratio) and concentration gradient (0.007 %, 0.008 %, 0.01 %, 0.02 %, 0.04 %, 0.06 %, 0.08 %, 0.1 %, …, 5 %), with a doping ratio of 0.007 % ~ 5.0 % for each type of antibiotic. A total of 21 groups were prepared, with 5 samples for each concentration group, together with 1 group of pure antibiotic samples and 1 group of pure milk powder samples. A total of 105 mixed samples were used to simulate milk powder samples containing antibiotic residues.

The samples in this experiment were prepared through steps such as weighing, mixing and sample loading. The specific preparation process is as follows: (1) all the samples involved were fully ground and passed through a 200 mesh sieve, and the samples that passed the sieve were used for subsequent experiments; (2) the antibiotics and milk powder were added into a centrifuge tube according to the sample mass ratio table, and the tube was labeled; (3) the prepared samples were put on a vortex mixer and shaken for 3 min for thorough mixing; (4) the samples were put into a plastic culture dish with a diameter of 30 mm and a height of about 8 mm, and were scraped flat; (5) labels of the corresponding concentration and antibiotic type were stuck on the outside of the culture dish. The preparation methods of the 3 types of residual antibiotic samples were all the same [17]. The samples were divided into a calibration set and a prediction set at a ratio of about 3:1 by using the K–S (Kennard-Stone) algorithm for subsequent modeling. Table 1 shows the distribution of true values in the calibration set and prediction set for the infant milk powder samples containing different antibiotic concentrations.

In this experiment, the Gaia Sorter system produced by Dualix Spectral Imaging Technology Co., Ltd. was used. Fig. 1 shows the schematic and structural diagrams of the hyperspectral imaging system. The hyperspectral imaging device is mainly composed of three parts: imaging lens, imaging spectrometer and detector. The spectral range of the spectral camera is 980 nm ~ 2500 nm, and the main acquisition parameters are set as follows: the exposure time of the imaging system is 0.1 ~ 20 ms (15 ms in this experiment), the spectral resolution is 10 nm, the forward speed of the motor platform is 5 mm/s, and the return speed is 20 mm/s.

2.2. Spectral acquisition

After the parameters were set, the image was focused to obtain a clear spectral image. In this experiment, SpecView software was used to acquire hyperspectral images of the mixed infant milk powder samples. During image acquisition, the sample acquisition surface should face the camera lens and the sample should be placed within the spectral image acquisition area.

In order to avoid the uneven distribution of the light source across wavelengths and the influence of dark current in the camera on image quality, the image should be calibrated after the acquisition of hyperspectral images. Before the calibration of the spectral image, the calibration reference shall be obtained. The main steps to obtain the reference are as


Table 1
The true value distribution of the calibration set and prediction set of infant milk powder with different antibiotic concentrations.

Type of Milk Powder   Sample Set        Amount   Minimum Value (%)   Maximum Value (%)   Average Value (%)   STDEV
Type A                Total             105      0.007               5.245               1.448               1.724
Type A                Calibration set   79       0.007               5.245               1.436               1.720
Type A                Prediction set    26       0.007               5.245               1.485               1.768
Type B                Total             105      0.006               5.028               1.400               1.649
Type B                Calibration set   79       0.006               5.028               1.387               1.646
Type B                Prediction set    26       0.006               5.028               1.436               1.689
Type C                Total             105      0.006               5.001               1.378               1.658
Type C                Calibration set   79       0.006               5.001               1.366               1.656
Type C                Prediction set    26       0.006               5.001               1.417               1.697

Note: Type A milk powder is the prepared sample of infant milk powder containing aureomycin residues; Type B milk powder is the prepared sample of infant milk powder containing doxycycline residues; Type C milk powder is the prepared sample of infant milk powder containing oxytetracycline residues.
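
The Kennard-Stone partition behind the 79/26 split in Table 1 is not spelled out in the text. The sketch below shows the usual max-min distance form of the rule, assuming Euclidean distance between spectra; the function name `kennard_stone` and the three-variable toy "spectra" are illustrative assumptions and not the authors' code or data. Note that the rule is deterministic: it depends only on the distances between samples.

```python
import math

def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: dist[p[0]][p[1]])
    selected = [i0, j0]
    remaining = set(range(n)) - {i0, j0}
    while len(selected) < n_select:
        # Add the candidate whose nearest already-selected sample is farthest away.
        nxt = max(sorted(remaining),
                  key=lambda c: min(dist[c][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

# Toy data (ASSUMED): 105 samples with 3 made-up spectral variables each.
spectra = [(0.01 * i, 0.1 * (i % 7), 0.05 * ((i * i) % 11)) for i in range(105)]
calibration = kennard_stone(spectra, 79)
chosen = set(calibration)
prediction = [i for i in range(len(spectra)) if i not in chosen]
print(len(calibration), len(prediction))
```

With 105 samples and 79 selected for calibration, the remaining 26 form the prediction set, which matches the set sizes listed in Table 1.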

Fig. 1. Hyperspectral Imaging System: (a) Schematic Diagram; (b) structural Diagram.

follows: first, the lens of the CCD camera was covered to collect an all-black image; then the lens cover was removed to collect the white frame image of the white reference plate, which was then used to correct the original image of the sample. The calibration Formula (1) is shown below.

$$R_\lambda = \frac{I_\lambda - H_\lambda}{B_\lambda - H_\lambda} \tag{1}$$

In the formula, $R_\lambda$ represents the calibrated data; $H_\lambda$ represents the all-black data; $B_\lambda$ represents the all-white data; and $I_\lambda$ represents the original data. After all spectral images had been corrected, subsequent analysis and processing were carried out.

After the original hyperspectral images of the experimental samples were corrected, the characteristics of the hyperspectral images could be extracted. The spectra of the hyperspectral images were extracted by using ENVI 4.5 software. Firstly, the file was imported and the hyperspectral image was opened. Then, the region of interest was selected. Afterwards, ENVI software was used to calculate the average spectrum of the region of interest for subsequent data modeling.

2.3. Algorithm introduction

2.3.1. PLS-DA algorithm

PLS-DA is a multivariate method based on PLS, which combines principal component analysis (PCA) and correlation analysis to perform linear regression on spectral data and classified variables. Meanwhile, the spectrum is decomposed and fitted by setting a factor matrix. The factors can then be substituted into the spectral data when the matrix is decomposed, thus avoiding the impact on the data of only conducting spectral matrix decomposition [18–20]. The discriminant analysis process is as follows: (1) the classified variables of the calibration set are established; (2) PLS analysis of the classified variables and spectral data is conducted to establish a PLS model between them; (3) using the PLS model of the classified variables and spectral characteristics established on the calibration set, the values of the classified variables of the verification set are calculated. The calculation method is shown in Formula (2).

$$Y = \sum_{i=1}^{n} \beta_i \lambda_i + b \tag{2}$$

In the formula, $Y$ is the predicted classified value; $n$ is the number of spectral variables involved in modeling; $\beta_i$ is the regression coefficient of the $i$-th spectral variable; $\lambda_i$ is the spectral intensity of the $i$-th spectral variable; $b$ is the intercept of the model. The classified predicted value of the PLS model is obtained by adding the intercept to the sum of the spectral variables weighted by their regression coefficients over the full spectrum.

2.3.2. RF algorithm

RF is an ensemble algorithm based on decision tree classification. Its basic idea is to select training samples to minimize the residual sum of squares until a complete decision tree is formed, so that multiple decision trees can be formed and the final prediction can be obtained by voting. The most important parameter of RF is the number of decision trees; in general, the prediction accuracy improves as the number of trees increases [21,22]. Compared with the neural network, RF not only reduces the computation but also improves the prediction accuracy of the model.
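
The voting step described above can be illustrated in a few lines. This is only a sketch under stated assumptions: the helper names `majority_vote` and `default_mtry` are hypothetical, the tree votes are made-up labels that use the class codes 1, 3 and 5 assigned to the three sample types in Section 3.2, and taking the floor of the square root of the number of spectral variables as the split-subset size follows the setting reported in Section 3.2.2.

```python
from collections import Counter
from math import isqrt

def default_mtry(n_variables):
    """Size of the random split-attribute subset: floor(sqrt(n_variables))."""
    return isqrt(n_variables)

def majority_vote(tree_votes):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_votes).most_common(1)[0][0]

# Made-up votes from five trees for one sample (ASSUMED values).
votes = [1, 3, 3, 5, 3]
predicted_class = majority_vote(votes)
mtry = default_mtry(288)  # 288 spectral variables, as stated in Section 3.1.1
print(predicted_class, mtry)
```

A full random forest also draws a bootstrap sample for each tree and evaluates the out-of-bag error on the samples left out; those steps are omitted here to keep the voting logic visible.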


2.3.3. LS-SVM algorithm

LS-SVM is a learning method based on kernel functions. It is optimized on the basis of the support vector machine to reduce the amount of computation and hence the calculation time, thus improving the efficiency of model establishment [23–25]. LS-SVM has good mathematical properties and can handle practical problems such as small samples, local minima, nonlinearity and high dimensionality. Its main kernel functions include the linear kernel function (Lin_kernel) and the radial basis kernel function (RBF_kernel). The RBF_kernel adopted in this paper is shown in Formula (3).

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right) \tag{3}$$

In the formula, $x_i$ represents the sample point; $x_j$ represents the center point of the kernel function; $\sigma^2$ represents the kernel parameter.

Eq. (4) was calculated according to the slope of the fitting curve between the true values and the values predicted from the hyperspectral data, and the variance of the prediction error. The limit of detection (LOD) calculated in this way corresponds to a confidence level of 99.86 % [26]. In Eq. (4), $\sigma$ refers to the standard error of the predicted concentration and $m$ refers to the slope of the fitting curve; RMSEP is used as the value of $\sigma$ in the model.

$$\mathrm{LOD} = \frac{3\sigma}{m} \tag{4}$$

2.4. Model evaluation criteria

The parameters used to evaluate the model mainly include the correlation coefficients (rc and rp) of the calibration set and prediction set and the RMSE (RMSEC and RMSEP) of those two sets. The closer the correlation coefficient of the model is to 1, the better the predictive effect; the smaller the RMSEP, the higher the accuracy of the model; and the closer the values of RMSEC and RMSEP are, the better the robustness of the model. Fig. 2 shows the flow chart of establishing the detection model of antibiotic residues in infant milk powder.

Fig. 2. The flow chart of establishing detection model of antibiotic residues in milk powder.

3. Experimental results and analysis

3.1. Hyperspectral response characteristics of pure infant milk powder and aureomycin samples

3.1.1. Hyperspectral contrast analysis of different antibiotic samples

Hyperspectral data are rich in information. In this experiment, the wavelength range of the hyperspectral data is 957.49~2567.67 nm, with a total of 288 band points. Fig. 3 shows the spectral comparison of the three pure antibiotics. The reflectance of the three antibiotics as a whole had a similar trend: as the wavelength increased, the reflectance gradually decreased, but there were certain differences at different wavelengths. Oxytetracycline has the highest reflectance among the three antibiotics at 957.49~1400 nm, 1700~1900 nm and 2050~2567.67 nm.

Fig. 3. Spectral comparison of different types of antibiotics.

3.1.2. Hyperspectral analysis of infant milk powder containing antibiotic residues

Figs. 4–6 show the spectral comparison of different contents of the three antibiotics in milk powder. The spectral reflection intensities of the three pure antibiotics were all higher than that of the milk powder. With the increase of spectral wavelength, the spectral reflection intensities of the infant milk powder containing antibiotic residues all presented a downward trend. Fig. 4 shows the samples of mixed milk powder containing aureomycin residues, and the spectral peaks were relatively obvious at 1154.54 nm, 1261.51 nm, 1368.48 nm, 1706.28 nm, 1908.96 nm, 2004.67 nm and 2252.39 nm. Fig. 5 shows the spectral peaks of the mixed milk powder containing doxycycline residues at 1154.54 nm, 1261.51 nm, 1368.48 nm, 1706.28 nm, 1908.96 nm, 2004.67 nm and 2252.39 nm. Fig. 6 shows the spectral peaks of the mixed milk powder containing oxytetracycline residues at 1154.54 nm, 1261.51 nm, 1368.48 nm, 1706.28 nm, 1908.96 nm, 2004.67 nm and 2252.39 nm. With the increase of antibiotic concentration, the intensity of the absorption peaks increased gradually. The reason might be that, as the concentration of antibiotics mixed in the infant milk powder increased, the hyperspectral reflection of the samples became more intense.

3.2. Establishment of a qualitative model for different antibiotic residues in infant milk powder

In this experiment, 105 spectra were collected for each type of sample, a total of 315 spectra, with the wavelength range of


957.49~2567.67 nm. The spectra of Type A, B and C samples were marked as “1”, “3” and “5” respectively, and then the model was constructed for qualitative discrimination.

Fig. 4. Hyperspectral comparison diagram of samples with different concentrations of aureomycin in infant milk powder.

Fig. 5. Hyperspectral comparison diagram of samples with different concentrations of doxycycline in infant milk powder.

Fig. 6. Hyperspectral comparison diagram of samples with different concentrations of oxytetracycline in infant milk powder.

3.2.1. Establishment of PLS-DA model for infant milk powder

PLS-DA is a qualitative discriminant method that applies the partial least squares method to discriminant analysis and establishes a regression model between the hyperspectral characteristics and the classified variables of the samples [27]. The main idea is to replace the concentration vector with the classified vector of the samples, and then establish the model by the quantitative PLS method. The K–S algorithm was used to divide the 315 collected groups of hyperspectral data into a calibration set and a prediction set at a ratio of 3:1, so that the two sets contained 236 and 79 spectra respectively. Based on the divided calibration set and prediction set, the PLS-DA model was built by using Unscrambler 8.0 software. In this experiment, the optimal number of principal component factors in the PLS-DA model was 19. Fig. 7 shows the regression coefficient curve of the PLS-DA discriminant model for antibiotic types in milk powder. The regression coefficients, namely the weights of the spectral variables at the different wavelength points, were obtained through the model regression calculation. By adding the intercept (b = −11.64) to the sum of the spectral variables weighted by their regression coefficients, the class value of the PLS-DA model could be obtained for qualitative determination of the type of residual antibiotic in milk powder.

Fig. 7. The regression coefficient curve chart of the PLS-DA discriminant model for antibiotic types in milk powder.

Fig. 8 shows the PLS-DA model fitting diagram for the classified variables of the prediction set (3 types of samples). The correlation coefficient of the prediction set (rp) was 0.9468 and RMSEP was 0.4875 %. The class labels are spaced at an interval of 2, and the intermediate values between adjacent labels were taken as the thresholds, giving T1 = 2.0 and T2 = 4.0 respectively.

Fig. 8. Fitting diagram of prediction set for PLS-DA model.
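
The decision rule implied by the labels 1, 3 and 5 and the thresholds T1 = 2.0 and T2 = 4.0 can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the function names, the two-variable toy spectrum, its regression coefficients and the list of predicted class values are all made up, and the treatment of a value lying exactly on a threshold (assigned to the higher class here) is a choice the text does not specify.

```python
def class_value(spectrum, coefficients, intercept):
    """Formula (2): weighted sum of the spectral variables plus the intercept."""
    return sum(b * x for b, x in zip(coefficients, spectrum)) + intercept

def classify(y_pred, t1=2.0, t2=4.0):
    """Map a continuous PLS-DA class value to a sample type."""
    if y_pred < t1:
        return "A"  # nearest label is 1 (aureomycin)
    if y_pred < t2:
        return "B"  # nearest label is 3 (doxycycline)
    return "C"      # nearest label is 5 (oxytetracycline)

def accuracy(true_types, predicted_values):
    """Fraction of samples whose thresholded class value matches the true type."""
    hits = sum(classify(y) == t for t, y in zip(true_types, predicted_values))
    return hits / len(true_types)

# ASSUMED toy inputs, for illustration only.
y_toy = class_value([0.5, 0.2], [10.0, 20.0], -6.0)
true_types = ["A", "A", "B", "B", "C", "C"]
predicted_values = [0.8, 1.4, 2.9, 4.2, 5.3, 4.6]
print(classify(y_toy), accuracy(true_types, predicted_values))
```

In the toy list, the fourth sample (a Type B sample with a class value of 4.2) falls above T2 and is therefore misjudged as Type C, which is the kind of error counted in the misjudgment statistics.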


There were three types of milk powder samples in total: Type A (containing residual aureomycin), Type B (containing residual doxycycline), and Type C (containing residual oxytetracycline). The judgment results are as follows: all Type A samples were judged correctly, with a misjudgment rate of 0 %; 1 Type B sample was misjudged; 2 Type C samples were misjudged. In total, there were 3 misjudgments in the 79 predicted samples. In conclusion, the comprehensive accuracy was 96.2 % (a misjudgment rate of 3.8 %).

3.2.2. Establishment of random forest discriminant model for infant milk powder

Essentially, the RF model is a classifier containing multiple decision trees. These decision trees are formed by a random method, so they are also called random decision trees, and the trees in RF are built independently of one another. The test data enter the random forest to be classified by each decision tree, and the category that receives the most votes among all decision trees is taken as the final result. RF builds its decision trees by randomly selecting split attribute sets [28]. Firstly, the K–S algorithm was used to divide the 315 samples of hyperspectral data into a calibration set and a prediction set at a ratio of about 3:1. The parameter settings are as follows: Ntree, the number of decision trees in RF, was set to 500; Mtry, the number of attributes in the split attribute set, was taken as the square root of the number of spectral variables (√288, rounded down), which is 16. Next, the RF classification model was constructed. Fig. 9 shows the relationship between the out-of-bag (OOB) error rate and the number of decision trees. When the number of decision trees was 27, the OOB error rate reached the minimum value of 0.0085. Fig. 10 shows the variation of the contribution rate of the variables. Variables near the wavelength ranges of 957.5~1064.5 nm and 2455~2567.7 nm could be better used to identify different types of antibiotics.

Fig. 9. OOB error rate of spectra.

Fig. 10. The variation diagram of variable contribution rate.

Table 2 shows the classification results of the RF discriminant model. The accuracy of classification for different types of antibiotics was 100 %, showing good classification effect.

Table 2
Classification results of RF discriminant model.

Samples   Calibration set   Prediction set   True (correctly classified)   Correct rate
315       236               79               79                            100.00 %

3.3. Quantitative model analysis of antibiotic residues in infant milk powder

3.3.1. Effective wavelength selection of antibiotic residues in infant milk powder

In order to simplify the model and reduce computation, SPA, UVE and CARS were used to select the spectral wavelengths and to find the better wavelength selection method.

Fig. 11. Selecting process diagram of wavelength variable based on SPA algorithm.

Fig. 12. The results of the mixed samples of milk powder variables selection by UVE.

3.3.1.1. Selection of hyperspectral wavelength variables based on SPA algorithm. SPA is a forward cyclic selection method, which uses vector projection analysis to select the effective wavelength with minimum


redundancy and minimum multicollinearity. This method starts with one wavelength. In each cycle, its projection on the unselected wavelengths is calculated and the wavelength with the maximum projection vector is introduced into the wavelength combination [29]. Fig. 11 shows the spectrogram after selecting the spectral variables of the mixed infant milk powder samples based on the SPA algorithm (taking Type A milk powder as an example). Using the SPA algorithm, 26 variables were selected out of the 288 variables in the original spectrum of Type A milk powder, 37 variables out of 288 for Type B milk powder, and 28 variables out of 288 for Type C milk powder. This algorithm eliminated redundant information and optimized the model to some extent.

3.3.1.2. Effective wavelength variable selection based on UVE algorithm. Uninformative Variable Elimination (UVE) is a wavelength selection method based on the partial least squares (PLS) regression coefficient. The basic idea is to use the regression coefficient as a measure of the importance of a wavelength. Fig. 12 shows the result of UVE variable selection for the spectrum of the mixed sample of infant milk powder. The black spectrum line on the left side of the variable dividing line represents the wavelength stability curve, and the right side is the wavelength dividing line. The two lines parallel to the X axis represent the thresholds of UVE variable selection. The upper and lower thresholds are ±12.1154. Variables whose stability lies outside the upper and lower thresholds can be used as input variables, and the variables whose stability is within the upper and lower thresholds are eliminated and not included in the built model. For Type A milk powder, the UVE algorithm selected 26 variables from the original spectrum of 288 variables. For Type B milk powder, it selected 37 variables from the original spectrum of 288 variables. For Type C milk powder, 28 variables were selected out of 288 variables. The algorithm eliminated the redundant information of the original spectrum and optimized the model.

3.3.1.3. Effective wavelength variable selection based on CARS algorithm. The Competitive Adaptive Reweighted Sampling (CARS) algorithm follows the principle of "survival of the fittest". In each sampling run, Monte Carlo sampling (MCS) is used to randomly select 80 % of the sample set as the calibration set, and the spectral array and concentration array of the selected samples are used to establish a PLS model. An exponentially decreasing function (EDF) and adaptive reweighted sampling (ARS) are then used to filter variables, eliminate redundant information, and identify variables with high absolute values of regression coefficients. Fig. 13 is a process diagram of variable selection for the mixed samples of infant milk powder by the CARS algorithm. It can be seen from Fig. 13(a) that the number of reserved wavelengths gradually decreases as the number of sampling runs increases. In the beginning, the number of reserved wavelengths decreases quickly, then decreases more slowly, and finally does not change, which shows that CARS wavelength selection is a process from rough selection to fine selection. Fig. 13(b) shows how the RMSECV value changes with the number of sampling runs during the selection of wavelength variables. From run 1 to run 26, the RMSECV value gradually becomes smaller. At run 26, RMSECV reaches its minimum value. After the number of sampling runs exceeds 26, the RMSECV value gradually increases. This process shows that when the number of sampling runs is less than 26, the CARS algorithm filters out spectral information that is not related to chlortetracycline, and that important information related to chlortetracycline is eliminated once the number of sampling runs is greater than 26. Fig. 13(c) shows how the regression coefficients of the wavelength variables change with the number of sampling runs during the wavelength selection process. The position marked by "*" in the figure is the sampling run at which the RMSECV value takes its minimum. It can be seen from Fig. 13 that when the number of sampling runs is 26, RMSECV takes the minimum value, and the corresponding number of wavelength variables is 23.

Fig. 13. The process of spectra variables selection by CARS.

3.3.2. Establishment of LS-SVM model for mixed infant milk powder

LS-SVM models mainly use two kernel functions: the RBF kernel function and the Lin (linear) kernel function. The established LS-SVM models of the mixed infant milk powder samples were compared and evaluated. In this paper, the RBF kernel function, which performed better than the Lin kernel function, was adopted. Table 3 compares the performance of the LS-SVM models established after wavelength selection based on the RBF kernel function. By comparing the spectral results of the three samples processed by different waveband selection methods, it was found that

Table 3
Performance comparison of LS-SVM models based on variable selection methods in different wavelengths.

Type    Treatment method  No. of variables  γ              σ²             rp      RMSEP
Type A  Full spectrum     288               5.7209 × 10^4  573.5550       0.9990  0.0008
Type A  SPA               26                7.2264 × 10^5  6.0030 × 10^3  0.9983  0.0010
Type A  UVE               95                2.8098 × 10^3  129.1420       0.9989  0.0008
Type A  CARS              23                9.7373 × 10^3  66.0053        0.9990  0.0008
Type B  Full spectrum     288               1.2456 × 10^3  6.452 × 10^3   0.9995  0.0006
Type B  SPA               37                1.1468 × 10^7  472.1220       0.9992  0.0006
Type B  UVE               113               3622.2995      295.9600       0.9976  0.0013
Type B  CARS              23                2.7910 × 10^5  655.5651       0.9996  0.0005
Type C  Full spectrum     288               1.0462 × 10^5  5362.3224      0.9995  0.0005
Type C  SPA               28                5.0517 × 10^5  288.0593       0.9996  0.0005
Type C  UVE               113               2.0917 × 10^5  1807.8376      0.9996  0.0005
Type C  CARS              23                2.5144 × 10^5  519.8977       0.9997  0.0004
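
The rp and RMSEP columns of Table 3 are the correlation coefficient and the root mean square error on the prediction set. The following is a minimal sketch of how these two figures are conventionally computed, not the authors' code; the function names and the sample values are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over the prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical reference and predicted antibiotic contents (mass fraction),
# invented for illustration; they are not data from this study.
reference = [0.010, 0.020, 0.030, 0.040, 0.050]
predicted = [0.011, 0.019, 0.031, 0.039, 0.050]

print(f"rp = {pearson_r(reference, predicted):.4f}")
print(f"RMSEP = {rmsep(reference, predicted):.4f}")
```

An rp close to 1 together with a small RMSEP indicates that the predicted contents track the reference contents closely, which is the basis on which the rows of Table 3 are compared.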


samples processed by CARS could be better modeled. After wavelength selection by CARS, only 23 of the 288 wavelength variables (about 8 %) of the Type A, B and C milk powder spectra were retained. In the prediction sets of the three types, the correlation coefficient rp and RMSEP were 0.9990 and 0.08 %, 0.9996 and 0.05 %, and 0.9997 and 0.04 %, respectively, indicating that the established CARS-LS-SVM model performed well. Therefore, CARS combined with LS-SVM could be used to quantitatively determine antibiotic residues in infant milk powder.

Fig. 14 shows the predicted values and true values of the three antibiotic residues in the mixed samples of milk powder. The predictive correlation coefficients (rp) of the CARS-LS-SVM models for the three samples were all at least 0.9990, and the detection accuracy and stability of the model were relatively ideal. Therefore, the CARS-LS-SVM model was more suitable for the detection of antibiotic residues in infant milk powder. The LODs of aureomycin (chlortetracycline), doxycycline, and oxytetracycline were 2.44 × 10^-3, 1.51 × 10^-3 and 1.2 × 10^-3, respectively.

Fig. 14. Predicted antibiotic residues contents of LS-SVM models in three types.

4. Conclusion

This paper verified the feasibility of qualitative and quantitative determination of antibiotics in infant milk powder by using hyperspectral detection technology. PLS-DA and RF qualitative discrimination models were established, and the accuracy of the RF qualitative discrimination model was up to 100 %. In the establishment of the quantitative model, wavelength selection was conducted on the original spectrum using the UVE, CARS and SPA algorithms, and then the LS-SVM model was established. The LODs of aureomycin (chlortetracycline), doxycycline, and oxytetracycline were 2.44 × 10^-3, 1.51 × 10^-3 and 1.2 × 10^-3, respectively. The predictive correlation coefficients of the established CARS-LS-SVM model for the three types of antibiotics were all at least 0.9990, indicating that this model gave the best detection and prediction performance among the models compared. This experiment indicates that hyperspectral detection technology could also be applied to the detection of other types of antibiotics in food, which is of great significance for ensuring food safety.

Author statement

Jun Hu: Conceptualization, Methodology, Software, Writing - Reviewing and Editing. Zhen Xu: Data curation, Software, Writing - Original draft preparation. Maopeng Li: Visualization, Investigation. Yong He: Supervision. Yande Liu: Validation, Funding acquisition, Project administration.

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgement

The authors gratefully acknowledge the financial support of the National 863 Program (SS2012AA101306), “the 12th Five-Year Plan”, the Jiangxi Advantageous Science and Technology Innovation Team Construction Plan (20153BCB24002), the Collaborative Innovation Center Project of Intelligent Management Technology and Equipment for Southern Mountain Orchards (G.J.G.Z. [2014] No.60), the National Natural Science Foundation of China (2002017018), the Science and Technology Research Youth Project of Jiangxi Education Department (GJJ190348), and the Innovation Fund Project for Doctoral Students of Jiangxi Province (YC2019-B106).

References

[1] E. Song, M. Yu, Y. Wang, et al., Multi-color quantum dot-based fluorescence immunoassay array for simultaneous visual detection of multiple antibiotic residues in milk, Biosens. Bioelectron. (2015) 320–325.
[2] GB 2760-2014, National Health and Family Planning Commission of the People’s Republic of China, 2014.
[3] Rong Li, Canbiao Zeng, Junni Li, et al., Characterization of the fruits and seeds of Alpinia oxyphylla miq by high-performance liquid chromatography (HPLC) and near-infrared spectroscopy (NIRS) with partial least-squares (PLS) regression, Anal. Lett. 53 (11) (2020) 1667–1682.
[4] R. Han, N. Zheng, Z. Yu, et al., Simultaneous determination of 38 veterinary antibiotic residues in raw milk by UPLC-MS/MS, Food Chem. (2015) 119–126.
[5] C. Wang, X. Li, T. Peng, et al., Latex bead and colloidal gold applied in a multiplex immunochromatographic assay for high-throughput detection of three classes of antibiotic residues in milk, Food Control (2017) 1–7.
[6] L. Afsah-Hejri, P. Hajeb, P. Ara, et al., A comprehensive review on food applications of terahertz spectroscopy and imaging, Compr. Rev. Food Sci. Food Saf. 18 (5) (2019) 1563–1621.
[7] X. Sun, K. Zhu, J. Hu, et al., Terahertz spectroscopy determination of benzoic acid additive in wheat flour by machine learning, J. Infrared Millim. Terahertz Waves (2019) 1–10.
[8] Wendao Xu, Lijuan Xie, Jianfei Zhu, et al., Terahertz biosensing with a graphene-metamaterial heterostructure platform, Carbon 141 (2019) 247–252.
[9] J. Liu, Terahertz spectroscopy and chemometric tools for rapid identification of adulterated dairy product, Opt. Quantum Electron. 49 (1) (2017) 1–8.
[10] Jie Deng-fei, Yang Jie, Peng Ya-xi, et al., Research on the detection model of sugar content in different position of citrus based on the hyperspectral technology, Food & Machinery 33 (03) (2017) 51–54.
[11] Wang Guang-hui, Yin Yong, Detection of moldy maize aflatoxin B1 and gibberellin by hyperspectral coupled with neural network, Food & Machinery 34 (11) (2018) 64–69.
[12] Yin Yong, Wang Guanghui, Hyperspectral characteristic wavelength selection method for moldy maize based on continuous projection algorithm fusion information entropy, J. Agric. Sci. 34 (02) (2020) 356–362.
[13] Liu Zheng, Yin Yong, Rapid detection method of sausage nitrite based on hyperspectral technology, Food Mach. 35 (05) (2019) 78–82.
[14] I. Baek, M.S. Kim, B. Cho, et al., Selection of optimal hyperspectral wavebands for detection of discolored, diseased rice seeds, Appl. Sci. 9 (5) (2019).
[15] Shengxiang Xu, Yongcun Zhao, Meiyan Wang, et al., Determination of rice root density from Vis-NIR spectroscopy by support vector machine regression and spectral variable selection techniques, Exp. Cell Res. 356 (2) (2017) 12–23.
[16] M. Huang, M.S. Kim, S.R. Delwiche, et al., Quantitative analysis of melamine in milk powders using near-infrared hyperspectral imaging and band ratio, J. Food Eng. 181 (2016) 10–19.
[17] Yating Xiong, Ziwen Li, Jian Wang, et al., The determination of the fatty acid content of sea buckthorn seed oil using near infrared spectroscopy and variable selection methods for multivariate calibration, Vib. Spectr. 84 (2016) 24–29.
[18] A. Folchfortuny, J.M. Pratsmontalban, S. Cubero, et al., VIS/NIR hyperspectral imaging and N-way PLS-DA models for detection of decay lesions in citrus fruits, Chemom. Intell. Lab. Syst. (2016) 241–248.
[19] Jianjun Liu, Lili Mao, Jingfeng Ku, et al., Using terahertz spectroscopy to identify transgenic cottonseed oil according to physicochemical quality parameters, Optik: Zeitschrift fur Licht- und Elektronenoptik: Journal for Light-and Electronoptic 142 (2017) 483–488.
[20] Jia Jin, Quan Wang, Evaluation of informative bands used in different PLS regressions for estimating leaf biochemical contents from hyperspectral reflectance, Remote Sens. (Basel) 11 (2) (2019) 197.
[21] N.K. Poona, A. Van Niekerk, R.L. Nadel, et al., Random Forest (RF) wrappers for waveband selection and classification of hyperspectral data, Appl. Spectrosc. 70 (2) (2016) 322–333.


[22] H. Yuan, G. Yang, C. Li, et al., Retrieving Soybean Leaf Area Index from Unmanned Aerial Vehicle Hyperspectral Remote Sensing: Analysis of RF, ANN, and SVM Regression Models, Remote Sens. (Basel) 9 (4) (2017).
[23] Wei Wang, Min Huang, Qibing Zhu, et al., Optical property inversion of biological materials using Fourier series expansion and LS-SVM for hyperspectral imaging, Inverse Probl. Sci. Eng. 26 (7) (2018) 1019–1036.
[24] Jin Xiaming, Sun Jun, Mao Hanping, et al., Discrimination of rice varieties using LS-SVM classification algorithms and hyperspectral data, Adv. J. Food Sci. Technol. 7 (9) (2015) 691–696.
[25] Yunhong Liu, Qingqing Wang, Gao Xiuwei, et al., Total phenolic content prediction in Flos Lonicerae using hyperspectral imaging combined with wavelengths selection methods, J. Food Process Eng. 42 (6) (2019).
[26] B. Liu, P. Zhou, X. Liu, et al., Detection of pesticides in fruits by surface-enhanced raman spectroscopy coupled with gold nanostructures, Food Bioproc. Tech. 6 (3) (2013) 710–718.
[27] Agata Walkowiak, Lukasz Ledzinski, Mariusz Zapadka, et al., Detection of adulterants in dietary supplements with Ginkgo biloba extract by attenuated total reflectance Fourier transform infrared spectroscopy and multivariate methods PLS-DA and PCA, Spectrochim. Acta A. Mol. Biomol. Spectrosc. 208 (2019) 222–228.
[28] Aili Wang, Ying Wang, Yushi Chen, Hyperspectral image classification based on convolutional neural network and random forest, Remote. Sens. Lett. 10 (11) (2019) 1086–1094.
[29] X. Yuan, B. Dai, D. Zhang, et al., Hyperspectral imaging and SPA-LDA Quantitative Analysis for Detection of Colon Cancer Tissue, J. Appl. Spectrosc. 85 (2) (2018) 307–312.

Jun Hu, male, born in 1992 in Xiaogan, Hubei Province, doctor; research direction: THz spectroscopy applications in agricultural product quality and safety testing.

Yande Liu, female, born in 1967 in Ji’an, Jiangxi Province, doctor, professor; research direction: spectroscopic diagnostic technology in agricultural product quality and safety.
