Biodiesel classification by base stock type (vegetable oil) using near infrared
spectroscopy data
Roman M. Balabin a,∗, Ravilya Z. Safieva b
a Department of Chemistry and Applied Biosciences, ETH Zurich, 8093 Zurich, Switzerland
b Gubkin Russian State University of Oil and Gas, 119991 Moscow, Russia
Article info

Article history:
Received 7 October 2010
Received in revised form 17 January 2011
Accepted 19 January 2011
Available online 26 January 2011

Keywords:
Petroleum (fossil) fuel
Vegetable (plant) oil
Biofuel (biodiesel, bioethanol, ethanol–gasoline fuel)
Vibrational spectroscopy (infrared, near-infrared, and Raman)
Artificial neural networks
Support vector machines

Abstract

The use of biofuels, such as bioethanol or biodiesel, has rapidly increased in the last few years. Near infrared (near-IR, NIR, or NIRS) spectroscopy (>4000 cm−1) has previously been reported as a cheap and fast alternative for biodiesel quality control when compared with infrared, Raman, or nuclear magnetic resonance (NMR) methods; in addition, NIR can easily be done in real time (on-line). In this proof-of-principle paper, we attempt to find a correlation between the near infrared spectrum of a biodiesel sample and its base stock. This correlation is used to classify fuel samples into 10 groups according to their origin (vegetable oil): sunflower, coconut, palm, soy/soya, cottonseed, castor, Jatropha, etc. Principal component analysis (PCA) is used for outlier detection and dimensionality reduction of the NIR spectral data. Four different multivariate data analysis techniques are used to solve the classification problem, including regularized discriminant analysis (RDA), partial least squares method/projection on latent structures (PLS-DA), K-nearest neighbors (KNN) technique, and support vector machines (SVMs). Classifying biodiesel by feedstock (base stock) type can be successfully solved with modern machine learning techniques and NIR spectroscopy data. KNN and SVM methods were found to be highly effective for biodiesel classification by feedstock oil type. A classification error (E) of less than 5% can be reached using an SVM-based approach. If computational time is an important consideration, the KNN technique (E = 6.2%) can be recommended for practical (industrial) implementation. Comparison with gasoline and motor oil data shows the relative simplicity of this methodology for biodiesel classification.

© 2011 Elsevier B.V. All rights reserved.
0003-2670/$ – see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.aca.2011.01.041
R.M. Balabin, R.Z. Safieva / Analytica Chimica Acta 689 (2011) 190–197 191
large amounts of data and aim to identify meaningful patterns in NIR spectra [45–47]. Both regression and classification tasks were successfully completed using IR/NIR spectral data [48–50].

Increasing energy costs and environmental concerns have emphasized the need to produce sustainable renewable fuels and chemicals [51–55]. Ideally, these alternatives should be economically competitive, technically achievable, environmentally acceptable and widely available [51]. The use of renewable energies, such as biofuels [53], biomass [55], wave, hydro, wind and solar energies, needs to become widespread in order to decrease the dependence on fossil fuels [51].

The demand for ethanol and biodiesel, called alternative fuels or biofuels, has increased over the last few years due to environmental, economical, political, and social issues [44,54]. Biodiesel is composed of fatty acid mono-alkyl esters, which are obtained through the base-catalyzed transesterification of vegetable oils or animal fats with a short chain alcohol, such as methanol (MeOH, CH3OH) or ethanol (EtOH, C2H5OH) [44]. This biofuel is the major substitute for petroleum-derived diesel. Because its physical properties are very similar to those of diesel, the use of pure or blended biodiesel does not require any modification of the diesel engine or of the existing fuel distribution and storage infrastructure [44]. Increased biodiesel use leads to the reduction of greenhouse gases, particulate matter, and sulphur emissions. In addition, waste frying oils (WFO) can be used as a raw material in biodiesel production, thus reducing and reusing industrial and household waste [51]. Recently, blends of biodiesel with mineral diesel have become commercially available world-wide [44,56].

Depending on geographic limitations and oil prices, biodiesel may be produced from a variety of feedstocks, and different technologies can be applied for biodiesel production [51–58]. Consequently, the final product can have different properties, so quality control for biodiesel is very important. The EN 14214 standard mandates twenty-five (25) parameters that must be analyzed to certify biodiesel quality [51].

Near infrared spectroscopy has already been applied to quality analysis of biodiesel fuel. In 2007, Correia and co-workers [57] published their analysis of the effect of commonly used pre-processing techniques that were applied prior to partial least squares (PLS) and principal components regression (PCR). They compared the quality of the calibration models developed to relate the near infrared spectral characteristics of a biodiesel sample to its methanol and water content. From the 50 samples studied, 38 were used for calibration and 12 for testing. Leave-one-out cross-validation (LOOCV) was applied [57].

Chemometric treatment of NIR spectra was assessed by Galtier et al. [59] for the quantification of fatty acids and triacylglycerols in virgin olive oil samples. Their classification into five registered designations of origin (RDOs) of French virgin olive oils, including “Aix-en-Provence”, “Haute-Provence”, “Nice”, “Nyons” and “Vallee des Baux”, was successfully conducted using PLS classifiers. It was concluded [59] that chemometric treatments of NIR spectra gave results similar to those obtained by time-consuming analytical techniques, such as GC and HPLC, and that these methods constitute a fast and robust means for authentication of the French virgin olive oils.

The goals of the study conducted by Yang et al. [60] in 2005 were to use Fourier transform mid-infrared (FTIR), near-infrared (FT-NIR) and Raman (FT-Raman) spectroscopy to discriminate among 10 different edible oils and fats and to compare the performance of these spectroscopic methods. The spectral features of edible oils and fats were studied, and the characteristic vibrations of the C=C double bond were identified and used for discriminant analysis (DA) [60].

In this paper, we attempt to find a correlation between the near infrared spectrum of a biodiesel sample and its feedstock. This correlation is used to correctly classify samples into 10 groups according to their origin (type of vegetable oil) [61]. Principal component analysis (PCA) is used for outlier detection and dimensionality reduction of NIR spectral data [28–30,45–50]. Four different multivariate data analysis techniques were used to solve the problem, including regularized discriminant analysis (RDA), partial least squares method/projection on latent structures (PLS) [also known as partial least squares-discriminant analysis (PLS-DA)], K-nearest neighbors (KNN) technique, and support vector machines (SVMs) [61]. Notably, near infrared spectroscopy in combination with MDA techniques has not been previously applied to classify biodiesel fuel by its original feedstock.

2. Experimental

2.1. Materials

2.1.1. Industrial samples

Sixty-five samples of rapeseed biodiesel fuel were supplied by three companies: Yugrostexport (Russia), Biosam (Russia), and Sokhna Biodiesel Co. (Egypt).

2.1.2. Vegetable oils and used frying oil

Nine types of vegetable oils were used for biodiesel preparation. Refined and crude sunflower oils, coconut oil, and palm oil were obtained from Kochmeister (Russia). Soy (soya) bean oil, cottonseed oil, and castor oil were obtained from MasloPromBaza (Russia). Jatropha (Jatropha curcas) oil was obtained from Purandhar Agro & Biofuels (India). Linseed (flax or common flax) oil was obtained from local commercial sources. The kitchen of a local restaurant provided the used frying oil. All oil samples were used without further purification.

2.1.3. Other chemicals

ACS spectroscopic grade methanol (≥99.9%; Sigma–Aldrich), analytical grade sodium chloride (Sigma–Aldrich), analytical grade phosphoric acid (Fluka), and reagent grade potassium hydroxide (≥90%; Sigma–Aldrich) were used throughout the study without further purification.

2.2. Methods

2.2.1. Biodiesel preparation

The reactor was initially charged with an amount of oil that achieved the desired methanol/vegetable oil molar ratio, which ranged from 6:1 to 9:1, and the reactor was then placed in a constant-temperature bath with its associated equipment and heated to a temperature of 25–45 °C. The catalyst consisted of 0.4–2.0% (w/w) of potassium hydroxide (KOH) by weight of vegetable oil and was completely dissolved in the methanol under stirring; the solution was then added to the reactor. The timing of the reaction began as soon as the potassium hydroxide/methanol solution was added and continued for 3–8 h. The impeller speed was 600 rpm. Following the reaction, the mixture was transferred to a separatory funnel, and glycerol was allowed to separate by gravity for 2–3 h. After the glycerol layer was removed, the excess methanol in the methyl ester phase was removed by rotary evaporation at 70 °C. The methyl ester was washed twice with a phosphoric acid water solution (5%, v/v) and brine until a clear phase (methyl ester) was obtained.

Outlier detection was done according to Ref. [71]. The final, “outlier-free” biodiesel sample set included 403 samples from 10 feedstocks: 65 industrial samples from rapeseed oil and 338 laboratory-prepared samples from different oils. The laboratory samples included sunflower oil (44 samples), coconut oil (40 samples), palm oil (40 samples), soy oil (40 samples), cottonseed oil (30
samples), castor oil (40 samples), Jatropha oil (29 samples), linseed oil (35 samples), and used frying oil (40 samples). Each sample was obtained with a separate synthesis.

2.2.2. Near infrared (NIR) spectroscopy

The near infrared (NIR) spectra were acquired with an MPA Multi Purpose FT-NIR Analyzer (Bruker). The spectra were acquired at room temperature (20–23 °C). The NIR spectrometer was calibrated with benzene (C6H6) and cyclohexane (c-C6H12) at least twice per day to minimize the influence of variable laboratory conditions. The spectral range between 9000 and 4500 cm−1 (1110–2500 nm) was scanned with a resolution of 8 cm−1. Sixty-four scans were averaged for each sample spectrum, and a background spectrum (64 scans) was measured every 45 min. A photometric accuracy of ∼0.05% was obtained [2,28–30]. A cylindrical glass cell was used throughout the study. Approximately 1 mL of biodiesel sample was needed for each NIR measurement. NIR spectrum collection was repeated five times with cell rotation inside the spectrometer to minimize interferences from the cell or glass defects. The measurement of one sample took less than 5 min. The averaged and background-corrected spectra were used for subsequent data pre-processing [28–30].

2.2.3. Near infrared (NIR) spectral pre-processing

Data pre-processing was done according to Refs. [51,57]. Briefly, prior to calibration model building, various widely used pre-processing techniques were applied to the data. Eight data pre-treatment methods were tested:

of M) of each sample is coded with a (binary) vector of length M containing zeros and a single one. In our case, M is equal to 10. The M scores predicted by the ordinary PLS-DA method are used to predict the class of each sample. If the (normalized) j-th score is greater than or equal to the parameter δ, the predicted class of the sample is j. Extra information about the idea of PLS-DA classification can be found in Ref. [63].

2.2.4.3. K-nearest neighbor (KNN). The K-nearest neighbor (KNN) method was first introduced by Fix and Hodges [64]. It has an extremely simple, intuitive theoretical background. In the KNN method, a distance (Euclidean or other, see below) is assigned between all points in a data set. The K closest neighbors of a data point (K being the number of neighbors) are then found by analyzing (sorting) the distance matrix. The K closest data points are then analyzed to determine which class label is the most common among the set (“voted” KNN). The most common class label is then assigned to the data point being analyzed. Other variants of KNN (“weighted” KNN) are also possible. In these methods, different neighbors have different “weights”, that is, different abilities to influence the class of a sample, according to their distances from the sample. Some remarks about the possible types of functions for weighting are given below (Table 1). For a KNN classifier, it is necessary to have a training set that is not too small and a good discriminating distance. KNN performs well in multi-class, simultaneous problem solving. There exists an optimal choice for the value of the parameter K, which gives the best performance of the classifier. For a detailed analysis, one can consult Ref. [64].
Table 1
Parameters of the K-nearest neighbor (KNN) classification model.

a Two types of distances D(xi, p) between vectors xi and p were used. Euclidean distance (Pythagorean metric):

$$D(x_i, p) = \sqrt{\sum_{l=1}^{L} (x_{i,l} - p_l)^2}$$

and the sum of absolute coordinate differences (Manhattan distance):

$$D(x_i, p) = \sum_{l=1}^{L} |x_{i,l} - p_l|.$$

b “Voting” KNN.
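The “voted” KNN rule and the two distances of Table 1 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`euclidean`, `manhattan`, `knn_vote`), the value K = 3 and the two-dimensional toy points are assumptions made only for illustration, and the toy points are not data from this study.

```python
import math
from collections import Counter


def euclidean(x, p):
    """Euclidean (Pythagorean) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, p)))


def manhattan(x, p):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(x, p))


def knn_vote(train_x, train_y, query, k=3, dist=euclidean):
    """Return the most common label among the k training points nearest to query.

    Ties between labels are broken in favour of the label met first
    among the sorted neighbours (an arbitrary choice for this sketch).
    """
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D points standing in for, e.g., two principal component scores.
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
train_y = ["palm", "palm", "palm", "castor", "castor", "castor"]

print(knn_vote(train_x, train_y, (0.3, 0.2), k=3))
print(knn_vote(train_x, train_y, (2.1, 2.0), k=3, dist=manhattan))
```

In practice K and the distance type would be chosen by cross-validation, as described in the model optimisation step of this paper.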
error of cross-validation (E) was used:

$$E = \frac{N_{\mathrm{wrong}}}{N_0} = \frac{N_0 - N_{\mathrm{right}}}{N_0} \qquad (1)$$

where Nwrong and Nright refer to the numbers of wrongly and rightly classified samples, and N0 is the total number of samples in the validation set. The value of E is a measure of model sensitivity (SENS = 100% − E), i.e., the ability to classify in a correct manner the objects belonging to the class (minimization of false negatives). However, there is another fundamental parameter, specificity (SPEC), that expresses the ability of a given class model to correctly reject the objects of other classes (minimization of false positives) [69,70]. In our case (Tables 3–6) almost no difference is observed if another error definition is used (e.g., the mean of SENS and SPEC) for comparison of the multivariate methods. So in this work we follow the same error function (E) as in Ref. [61] and references therein. Note that any other error type can be calculated from the data in Tables 3–6 if another functional is preferred for method comparison.

Fivefold (5-fold) cross-validation was used to evaluate model efficiency. It was checked that the validation set contains all biodiesel classes. The results of the cross-validation check were found to be almost identical to the test results on a fully independent sample set. So, the cross-validation (CV) error will be used as the default measure of model accuracy, unless otherwise specified.

2.2.5.3. Efficiency estimation with external test set. To check the validity of the CV procedure, tests with external test sets were done. The full sample set was randomly divided into three subsets: calibration (283), cross-validation test set/test set for parameter optimization (50), and external test set (70). It was checked that all classes are present in each set. The test set has 6–10 samples of every oil type. The procedure was repeated five times and average values are reported.

2.2.5.4. Model optimisation. To truly compare different multivariate classification methods, the efficiency of the best possible model should be found. Because of the dependence of the calibration model efficiency on its parameters, the following parameters were varied and optimised using cross-validation error (RMSE) [28–30]:

• RDA: one parameter (α = 0–1);
• PLS-DA: number of latent variables (LV = 1–35) and threshold parameter (δ = 0.40–0.95);
• KNN: different types of K-nearest neighbor (KNN) method (voted and weighted by different weighting functions) were used to get the best result (Table 1); K was optimised in each case;
• SVM: different types of kernel functions (and corresponding parameters) were checked for optimization of the SVM classification model (Table 2); the cost parameter C was optimised in each case.

3. Results and discussion

3.1. Biodiesel samples set and its classes

All biodiesel samples in the set can be divided into two broad categories: biodiesel fuel produced from vegetable oils (363 samples) and biodiesel fuel produced from used frying oil (40 samples). A real variety of biodiesel feedstocks (vegetable oils) is present in the set: nine types of vegetable oils that are very different in chemical composition and properties [51–60]. One can expect that this sample set makes the final model general enough to be used in real-world applications. Of course, additional samples are needed if samples from other feedstocks are to be classified. But the general trend is believed to stay the same as presented below.

At least 29 samples per class (40.3 ± 9.9 (±σ) on average) are present in the set. This value can be called acceptable [38–40]. It makes it possible to create a robust model, since approximately the same number of samples (±25%) is observed in all cases [40,65]. It also means that the classification error can be calculated in a direct way, see Eq. (1): no correction for sample number difference and/or making copies of samples is needed [38–40,65].

One would also expect that 403 samples is enough to build an accurate model at the “sample set limit” (compare with the “basis set limit” or “complete basis set” in quantum chemistry (QC) [35,67]). It
Table 2
Parameters of the support vector machine (SVM) classification model.
Table 3
Biodiesel classification by regularized discriminant analysis (RDA): confusion matrix. The rows indicate the true sample class and the columns refer to the observed class.
Biodiesel type  Total number of samples  Sunflower  Coconut  Palm  Rapeseed  Soy  Cottonseed  Castor  Jatropha  Linseed  Used frying oil
Sunflower 44 33 1 1 3 2 1 1 1 1
Coconut 40 1 33 3 1 1 1
Palm 40 1 2 34 2 1
Rapeseed 65 3 60 1 1
Soy 40 1 1 37 1
Cottonseed 30 1 1 28
Castor 40 3 37
Jatropha 29 3 25 1
Linseed 35 3 3 1 2 1 25
Used frying oil 40 1 1 38
Table 4
Biodiesel classification by partial least squares (PLS) classification model: confusion matrix. The rows indicate the true sample class and the columns refer to the observed class.
Biodiesel type  Total number of samples  Sunflower  Coconut  Palm  Rapeseed  Soy  Cottonseed  Castor  Jatropha  Linseed  Used frying oil
Sunflower 44 33 2 2 1 2 1 1 1 1
Coconut 40 1 35 3 1
Palm 40 3 34 1 1 1
Rapeseed 65 1 63 1
Soy 40 2 37 1
Cottonseed 30 1 1 28
Castor 40 2 38
Jatropha 29 2 27
Linseed 35 3 3 1 1 1 1 25
Used frying oil 40 40
Table 5
Biodiesel classification by K-nearest neighbor (KNN) classification model: confusion matrix. The rows indicate the true sample class and the columns refer to the observed class.
Biodiesel type  Total number of samples  Sunflower  Coconut  Palm  Rapeseed  Soy  Cottonseed  Castor  Jatropha  Linseed  Used frying oil
Sunflower 44 38 2 2 2
Coconut 40 40
Palm 40 3 33 2 1 1
Rapeseed 65 64 1
Soy 40 1 1 35 2 1
Cottonseed 30 2 28
Castor 40 40
Jatropha 29 29
Linseed 35 1 1 1 1 31
Used frying oil 40 40
Table 6
Biodiesel classification by support vector machine (SVM) classification model: confusion matrix. The rows indicate the true sample class and the columns refer to the observed class.
Biodiesel type  Total number of samples  Sunflower  Coconut  Palm  Rapeseed  Soy  Cottonseed  Castor  Jatropha  Linseed  Used frying oil
Sunflower 44 40 1 1 2
Coconut 40 39 1
Palm 40 1 2 35 1 1
Rapeseed 65 64 1
Soy 40 38 2
Cottonseed 30 30
Castor 40 40
Jatropha 29 29
Linseed 35 1 1 1 1 1 30
Used frying oil 40 40
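As a worked check of Eq. (1), the overall classification error of each method can be recomputed from Tables 3–6. The sketch below is illustrative only. It assumes that the number of correctly classified samples of each class is the diagonal entry of the corresponding row, read here as the largest count in that row, and all variable and function names are hypothetical.

```python
from statistics import mean, stdev

classes = ["sunflower", "coconut", "palm", "rapeseed", "soy",
           "cottonseed", "castor", "jatropha", "linseed", "used frying oil"]

# "Total number of samples" column, identical in Tables 3-6.
totals = [44, 40, 40, 65, 40, 30, 40, 29, 35, 40]

# Correctly classified samples per class (assumed diagonal of each table).
correct = {
    "RDA":    [33, 33, 34, 60, 37, 28, 37, 25, 25, 38],
    "PLS-DA": [33, 35, 34, 63, 37, 28, 38, 27, 25, 40],
    "KNN":    [38, 40, 33, 64, 35, 28, 40, 29, 31, 40],
    "SVM":    [40, 39, 35, 64, 38, 30, 40, 29, 30, 40],
}


def error_percent(n_right, n_total):
    """Eq. (1): E = (N0 - Nright) / N0, expressed in percent."""
    return 100.0 * (n_total - n_right) / n_total


n0 = sum(totals)

# Overall error of each method over the whole sample set.
overall = {m: error_percent(sum(diag), n0) for m, diag in correct.items()}

# Per-class error of each method, and the class with the largest error.
per_class = {m: [error_percent(r, t) for r, t in zip(diag, totals)]
             for m, diag in correct.items()}
worst = {m: classes[max(range(len(classes)), key=errs.__getitem__)]
         for m, errs in per_class.items()}

for m in correct:
    print(f"{m}: E = {overall[m]:.1f}%, largest per-class error for {worst[m]}")

# Class-size statistics (sample standard deviation).
print(f"samples per class: {mean(totals):.1f} ± {stdev(totals):.1f}")
```

The same per-class counts can be combined into other figures of merit, such as specificity, once the off-diagonal entries are assigned to their columns.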
is well-known that the accuracy of a multivariate model increases with the increasing size of the sample set and saturates at a limit value for very large sample sets. The same effect is observed in ab initio quantum chemistry for the basis set, i.e., the set of Gaussian atom-centered functions used to create the molecular orbitals and to approximate the electron density distribution in molecular systems.

3.2. Biodiesel classification by regularized discriminant analysis (RDA)

Table 3 shows the results of regularized discriminant analysis (RDA) of biodiesel fuels. One can see that the model is not accurate enough for practical use. A classification error of 13.2% was
Table 7
Classification of biodiesel samples by different multivariate methods: regularized discriminant analysis (RDA), partial least squares classification (PLS), K-nearest neighbor
(KNN), and support vector machines (SVMs).
E is the error of 5-fold cross-validation, see Eq. (1); Nright refers to the number of rightly classified samples; N0 is the total number of samples in the validation set.
Methods: RDA: regularized discriminant analysis; PLS: partial least squares regression; KNN: K-nearest neighbor; and SVMs: support vector machines.
reached based on 350 correctly classified biodiesel samples out of 403.

In the cases of biodiesel from sunflower oil and linseed oil, an error of at least 25% was observed (Table 3). The best results were observed for the samples from the used frying oil: E = 5% for 38 correctly classified biodiesel samples out of 40. The confusion matrix (Table 3) shows that there is a significant chemical difference in the feedstock (used frying oil vs. vegetable oil), which makes classification between these two species rather simple, and the problem can easily be solved by a simple RDA model. The same cannot be said about classification among vegetable oils; a more sophisticated classification method is needed in this case.

3.3. Biodiesel classification by the partial least squares (PLS-DA) classification model

The confusion matrix for the partial least squares or projection to latent structures (PLS-DA) classification model (Table 4) shows that the PLS-DA method results are close to those for the RDA method. A classification error of 10.7% is obtained (360 correctly classified biodiesel samples out of 403). This is better than the RDA results (see above), but these results are still inadequate to state that the model is highly accurate.

Interestingly, the two most problematic types of samples for classification are the same as in the RDA method: sunflower and linseed oils. In both cases, E is at least 25%. An accuracy of 100% (40/40) is reached in the case of the used frying oil classification. Thus, the problem of a vegetable oil vs. used frying oil distinction is completely solved at the PLS-DA level.

Though PLS-DA classification has been shown to be a more effective classification method than standard RDA, more sophisticated classification methods are needed to build highly accurate models for biodiesel classification by feedstock type.

3.4. Biodiesel classification by the K-nearest neighbors (KNN) classification model

The K-nearest neighbors (KNN) classification model is a simple but highly effective classification algorithm [64]; this fact is confirmed by this study using a biodiesel sample set (Table 5). A classification error of only 6.2% was reached by an application of the KNN model. The largest inaccuracy (E = 18%) is observed for palm oil. In the cases of coconut, castor, Jatropha, and used frying oils, 100% effectiveness in classification using this model was reached.

This final model based on K-nearest neighbors classification can be recommended for practical implementation, as 378 samples were successfully assigned to their correct classes. This level of accuracy is significantly better than the accuracy achieved using the RDA or PLS-DA methods (see above).

3.5. Biodiesel classification by support vector machines (SVMs)

Table 6 summarises the results of the SVM approach to biodiesel classification. The confusion matrix (Table 6) clearly shows the superiority of a support vector approach over all of the classification methods described above [29,68]. The classification error decreases from 6.2% (KNN) to 4.5% (385 correctly classified samples), at a much higher computational cost (see below).

The largest inaccuracy (E = 14%) is observed for linseed oil. In the cases of cottonseed, castor, Jatropha, and used frying oils, 100% effectiveness in classification using this model was reached, which is nearly the same result as that provided by the KNN method. Importantly, no more than two samples of a given class were assigned to the same incorrect class by the SVM-based approach (Table 6).

Support vector machines can be regarded as the most effective classification model for biodiesel fuels. Very accurate (E < 5%) and robust classification models can be built using an SVM approach.

3.6. General remarks. Comparison with gasoline and motor oil data

Table 7 summarises the data presented above for a general discussion. Table 7 clearly shows the non-equivalency of the classes in terms of classification accuracy. Three types of vegetable oil have an average classification error above 15%: sunflower, palm, and linseed oils. Thus, biodiesel samples from these three groups are difficult to label correctly. Tables 4–6 show that this inaccuracy is mostly based on the confusion between these three classes. The general accuracy of a classification model seems to be greatly dependent on its ability to correctly distinguish sunflower-, palm-, and linseed-based fuel samples. Only the KNN and SVM methods are rather successful in this task.

Three other types of biodiesel samples may be described as “easy-to-distinguish”. These are rapeseed, castor, and used frying oils. In these cases, the average error is below 4%. The difference in chemical composition [51–60] and, as a consequence, the difference in near infrared spectral features [58–60] may be responsible for this attribute.

Comparison of the effectiveness of RDA, PLS-DA, KNN, and SVM for different fuel and oil samples (gasoline, motor oil, and biodiesel) is presented in Fig. 1 [29,68]. Motor oils of 10 popular brands (Castrol, Mobil, Esso, TNK-BP, Shell, Total, Lukoil, BP, Mannol, Liqui Moly; SAE types: 0W-20; 5W-20; 0W-30; 5W-30; 10W-30; 0W-40; 5W-40; 10W-40; 15W-40; 5W-50; 10W-50) were used for classifi-
[2] J. Workman, A. Springsteen, Applied Spectroscopy: A Compact Reference for Practitioners, Academic Press, 1998.
[3] B. Osborne, T. Fearn, Near Infrared Spectroscopy in Food Analysis, Wiley, New York, 1986.
[4] F.J. Duarte, Tunable Laser Applications, CRC, New York, 2009.
[5] F.J. Duarte, Tunable Lasers Handbook, Academic Press, 1995.
[6] W. Demtröder, Laser Spectroscopy: Basic Principles, 4th ed., Springer, Berlin, 2008.
[7] J.R. Lakowicz, Principles of Fluorescence Spectroscopy, Kluwer Academic/Plenum Publishers, 1999.
[8] A. Sharma, S.G. Schulman, Introduction to Fluorescence Spectroscopy, Wiley, 1999.
[9] J. Logan, K. Edwards, N. Saunders, Real-Time PCR: Current Technology and Applications, Caister Academic Press, 2009.
[10] J. Eisinger, J. Flores, Anal. Biochem. 94 (1979) 15–21.
[11] J. Keeler, Understanding NMR Spectroscopy, John Wiley & Sons, 2005.
[12] J. Workman, M. Koch, B. Lavine, R. Chrisman, Anal. Chem. 81 (2009) 4623–4643.
[13] A. Burke, X. Ding, R. Singh, R.A. Kraft, N. Levi-Polyachenko, M.N. Rylander, C. Szot, C. Buchanan, J. Whitney, J. Fisher, H.C. Hatcher, R. D’Agostino, N.D. Kock, P.M. Ajayan, D.L. Carroll, S. Akman, F.M. Torti, S.V. Torti, Proc. Natl. Acad. Sci. U. S. A. 106 (2009) 12897–12902.
[14] E.R. Trivedi, A.S. Harney, M.B. Olive, I. Podgorski, K. Moin, B.F. Sloane, A.G.M. Barrett, T.J. Meade, B.M. Hoffman, Proc. Natl. Acad. Sci. U. S. A. 107 (2010) 1284–1288.
[15] X. Michalet, F.F. Pinaud, L.A. Bentolila, J.M. Tsay, S. Doose, J.J. Li, G. Sundaresan, A.M. Wu, S.S. Gambhir, S. Weiss, Science 307 (2005) 538–544.
[16] L.R. Hirsch, Proc. Natl. Acad. Sci. U. S. A. 100 (2003) 13549–13554.
[17] M.J. Therien, Nature 458 (2009) 716–717.
[18] T.F. Krauss, R.M. De La Rue, S. Brand, Nature 383 (1996) 699–702.
[19] Y. Zou, Proc. Natl. Acad. Sci. U. S. A. 106 (2009) 22135–22138.
[20] R. Adato, Proc. Natl. Acad. Sci. U. S. A. 106 (2009) 19227–19232.
[21] H. Keppler, L.S. Dubrovinsky, O. Narygina, I. Kantor, Science 322 (2008) 1529–1532.
[22] R.M. Balabin, R.Z. Safieva, J. Near Infrared Spectrosc. 15 (2007) 343–347.
[23] R.M. Balabin, R.Z. Syunyaev, J. Colloid Interface Sci. 318 (2008) 167–174.
[24] R.Z. Syunyaev, R.M. Balabin, I.S. Akhatov, J.O. Safieva, Energy Fuels 23 (2009) 1230–1238.
[25] O.C. Mullins, Anal. Chem. 62 (1990) 508–514.
[26] M.R. Swain, G. Vasisht, G. Tinetti, Nature 452 (2008) 329–331.
[27] S. Byrne, Science 325 (2009) 1674–1676.
[28] R.M. Balabin, R.Z. Safieva, E.I. Lomakina, Chemometr. Intell. Lab. Syst. 88 (2007) 183–188.
[29] R.M. Balabin, R.Z. Safieva, Fuel 87 (2008) 1096–1101.
[30] R.M. Balabin, R.Z. Safieva, E.I. Lomakina, Chemometr. Intell. Lab. Syst. 93 (2008) 58–64.
[31] L.R. Schimleck, R. Evans, A.C. Matheson, J. Wood Sci. 48 (2002) 132–137.
[32] P.D. Jones, L.R. Schimleck, G.F. Peter, R.F. Daniels, A. Clark, Wood Sci. Technol. 40 (2006) 709–720.
[33] E.W. Ciurczak, J.K. Drennen, Pharmaceutical and Medicinal Applications of Near-infrared Spectroscopy, 1st ed., CRC Press, 2002.
[34] J. Moros, N. Galipienso, R. Vilches, S. Garrigues, M. de la Guardia, Anal. Chem. 80 (2008) 7257–7264.
[35] P. Taddei, S. Affatato, C. Fagnano, B. Bordini, A. Tinti, A. Toni, J. Mol. Struct. 613 (2002) 121–129.
[36] M. Andersson, K.-G. Knuuttil, Vib. Spectrosc. 29 (2002) 133–138.
[37] L.M. Harwood, C.J. Moody, Experimental Organic Chemistry: Principles and Practice, Wiley-Blackwell, 1989.
[38] T. Næs, T. Isaksson, T. Fearn, T. Davies, A User-Friendly Guide to Multivariate Calibration and Classification, NIR Publications, Chichester, UK, 2002.
[39] R.M. Balabin, E.I. Lomakina, J. Chem. Phys. 131 (2009) 074104.
[40] B.F.J. Manly, Multivariate Statistical Methods: A Primer, 3rd ed., Chapman and Hall/CRC, 2004.
[41] R.M. Balabin, R.Z. Syunyaev, S.A. Karpov, Fuel 86 (2007) 323–327.
[42] R.M. Balabin, R.Z. Syunyaev, S.A. Karpov, Energy Fuels 21 (2007) 2460–2465.
[43] S.B. Kim, C. Temiyasathit, K. Bensalah, A. Tuncel, J. Cadeddu, W. Kabbani, A.V. Mathker, H. Liu, Expert Syst. Appl. 37 (2010) 3863–3871.
[44] M.R. Monteiro, A.R.P. Ambrozin, M.S. Santos, E.F. Boffo, E.R. Pereira-Filho, L.M. Lião, A.G. Ferreira, Talanta 78 (2009) 660–664.
[45] D.D. Lee, H.S. Seung, Nature 401 (1999) 788–790.
[46] J.B. Tenenbaum, V. de Silva, J.C. Langford, Science 290 (2000) 2319–2321.
[47] G.E. Hinton, R.R. Salakhutdinov, Science 313 (2006) 504–507.
[48] D.C. Malins, N.L. Polissar, S.J. Gunselman, Proc. Natl. Acad. Sci. U. S. A. 94 (1997) 3611–3615.
[49] K. Sakurai, Y. Goto, Proc. Natl. Acad. Sci. U. S. A. 104 (2007) 15346–15351.
[50] A.E. Cohen, W.E. Moerner, Proc. Natl. Acad. Sci. U. S. A. 104 (2007) 12622–12627.
[51] P. Baptista, P. Felizardo, J.C. Menezes, M.J.N. Correia, Talanta 77 (2008) 144–151.
[52] E.J. Steen, Nature 463 (2010) 559–562.
[53] A.J. Ragauskas, Science 311 (2006) 484–489.
[54] A.E. Farrell, R.J. Plevin, B.T. Turner, A.D. Jones, M. O’Hare, D.M. Kammen, Science 311 (2006) 506–508.
[55] S. Crossley, J. Faria, M. Shen, D.E. Resasco, Science 327 (2009) 68–72.
[56] G. Knothe, J. Am. Oil Chem. Soc. 78 (2001) 1025–1028.
[57] P. Felizardo, P. Baptista, J.C. Menezes, M.J.N. Correia, Anal. Chim. Acta 595 (2007) 107–113.
[58] F.C.C. Oliveira, C.R.R. Brandao, H.F. Ramalho, L.A.F. Costa, P.A.Z. Suarez, J.C. Rubim, Anal. Chim. Acta 587 (2007) 194–199.
[59] O. Galtier, N. Dupuy, Y. Le Dreau, D. Ollivier, C. Pinatel, J. Kister, J. Artaud, Anal. Chim. Acta 595 (2007) 136–144.
[60] H. Yang, J. Irudayaraj, M.M. Paradkar, Food Chem. 93 (2005) 25–32.
[61] R.M. Balabin, R.Z. Safieva, E.I. Lomakina, Anal. Chim. Acta 671 (2010) 27–35.
[62] J.H. Friedman, J. Am. Stat. Assoc. 84 (1989) 165–175.
[63] P. Geladi, B.R. Kowalski, Anal. Chim. Acta 185 (1986) 1–17.
[64] E. Fix, J.L. Hodges, Int. Stat. Rev. 57 (1989) 238.
[65] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[66] S.R. Amendolia, G. Cossu, M.L. Ganadu, B. Golosio, G.L. Masala, G.M. Mura, Chemometr. Intell. Lab. Syst. 69 (2003) 13–20.
[67] R.M. Balabin, J. Chem. Phys. 129 (2008) 164101.
[68] R.M. Balabin, R.Z. Safieva, Fuel 87 (2008) 2745–2752.
[69] L. Pigani, G. Foca, A. Ulrici, K. Ionescu, V. Martina, F. Terzi, M. Vignali, C. Zanardi, R. Seeber, Anal. Chim. Acta 643 (2009) 67–73.
[70] L. Pigani, G. Foca, K. Ionescu, V. Martina, A. Ulrici, F. Terzi, M. Vignali, C. Zanardi, R. Seeber, Anal. Chim. Acta 614 (2008) 213–222.
[71] T. Lillhonga, P. Geladi, Anal. Chim. Acta 544 (2005) 177–183.
[72] R.M. Balabin, R.Z. Safieva, E.I. Lomakina, Microchem. J., in press.
[73] R.M. Balabin, E.I. Lomakina, R.Z. Safieva, Fuel, in press.