Microchemical Journal 109 (2013) 170–177

Contents lists available at SciVerse ScienceDirect

Microchemical Journal
journal homepage: www.elsevier.com/locate/microc

Discrimination between authentic and counterfeit banknotes using Raman spectroscopy and PLS-DA with uncertainty estimation
Mariana R. de Almeida a, Deleon N. Correa a,b, Werickson F.C. Rocha c, Francisco J.O. Scafi b, Ronei J. Poppi a,⁎
a Institute of Chemistry, State University of Campinas, POB 6154, 13084-971 Campinas, SP, Brazil
b Criminalistic Institute Dr. Octávio Eduardo de Brito Alvarenga, Technical-Scientific Police Superintendency, 05507-060, São Paulo, SP, Brazil
c National Institute of Metrology, Quality and Technology, Chemical Metrology Division (Dquim), 25250-020, Duque de Caxias, RJ, Brazil

Article info

Article history:
Received 29 November 2011
Received in revised form 27 February 2012
Accepted 9 March 2012
Available online 16 March 2012

Keywords:
Banknotes
Counterfeits
Raman spectroscopy
Chemometric
Uncertainty estimation

Abstract

In the present work, Raman spectroscopy and chemometric tools were explored as an analytical method to discriminate between authentic and counterfeit Real banknotes. The analysis was based on the characterization of the inks used to produce the banknotes. Multivariate analysis was required, since the colors present in the banknotes are mixtures of pigments and the Raman spectra are complex and not fully resolved. Original and counterfeit R$ 50 banknotes were analyzed by Raman spectroscopy without any sample preparation, and three different areas were selected for study: chalcographic, orange and red inks. In this study, only the results for the chalcographic ink are presented. The classification method PLS-DA was employed to discriminate between authentic and counterfeit banknotes, as well as to identify the counterfeit type. The reliability of the results was calculated using the bootstrap re-sampling technique. The samples classified as counterfeit banknotes by the PLS-DA model had been apprehended by local authorities and classified as fake by classical forensic approaches, based on sensory tests and optical inspection by a specialist. PLS-DA was used to develop a procedure that could be used by non-specialist operators and can also analyze new samples of R$ 50 banknotes, classifying them reliably and estimating the uncertainty. With the proposed method, all fake and genuine banknotes used to validate the analysis were correctly classified. The procedure could be used as a complementary method to classical forensic inspection, offering fast, non-destructive, robust analyses with the possibility of in situ analysis using a portable instrument.
© 2012 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +55 19 35212134; fax: +55 19 35213126.
E-mail address: ronei@iqm.unicamp.br (R.J. Poppi).

0026-265X/$ – see front matter © 2012 Elsevier B.V. All rights reserved.
doi:10.1016/j.microc.2012.03.006

1. Introduction

Counterfeit banknotes are a constant concern that affects the economy of various countries. In 1994, the Real was introduced as the currency of Brazil, and falsification has steadily increased since its introduction. The banknotes have several security features that can be recognized by the ordinary person and are employed in the identification of authenticity. However, with the popularization of digital technologies, counterfeiters have arrived at a quality level that has made it practically impossible to distinguish counterfeit banknotes from original ones with the naked eye. Counterfeiting uses computational devices, such as scanners for image-capturing of the genuine banknotes, processing and printing. The R$ 50 banknote is the one most often chosen by counterfeiters, representing 70% of all counterfeit banknotes seized in Brazil [1]. Every time a false banknote is identified, it is removed from circulation. Currently the volume of fake bills has not impacted the economy, but it has brought considerable losses to ordinary citizens.

In a search of the scientific literature we can find analytical methods, such as infrared spectroscopy [2] and ambient mass spectrometry [3], that have been employed to identify counterfeit banknotes, including Dollars, Euros and Reals. However, most studies found in the literature discuss the contamination of banknotes with illicit drugs [4–6].

Raman spectroscopy has shown a large number of applications in samples of forensic interest, mainly the examination of pigments, such as printing material used for producing documents [7], pigments in works of art [8,9] and analysis of car paints [10,11], among others. The use of Raman spectroscopy for identification of drugs [12], explosives [13], and fibers [14] is also widely found in the literature. In the study of counterfeit banknotes, an understanding of the chemical composition of the inks used for making bills is important information, and Raman spectroscopy is presented as an appropriate technique for such a purpose.

Depending on the specific objectives, the large amount of data obtained by Raman spectroscopy makes it interesting to use methods of multivariate data analysis. Despite many publications describing analytical methods that combine Raman spectroscopy with chemometric methods that are able to give information on the composition of the inks, reports of successful field implementation of Raman spectroscopy for routine analyses are almost non-existent. This is due to a
lack of adaptations and validation of developed models. Robust estimates of the accuracy and precision of results are crucial for any method to be adopted in routine forensic analysis [15].

Metrological activities are fundamental to ensure the quality of scientific and forensic activities. Measurement results must be valid, comparable, and reproducible, and their uncertainties are the quantitative expression of their quality. In accordance with the ISO/IEC 17025:2005 standard [16], all calibration or testing laboratories must have and apply procedures to evaluate uncertainty in measurements. Establishing fitness-for-purpose is necessary before analytical results can be relied on for important legal decisions. This is particularly true in the forensic measurement of identifying counterfeit banknotes.

Due to the importance of uncertainty estimation in analytical data, multiple proposals have been published in the literature to estimate uncertainty in multivariate analysis: linearization-based methods, re-sampling methods, and U-deviation, among others [17]. However, there are few examples in the literature that have been applied to determine the uncertainty of multivariate calibration models, such as Partial Least Squares (PLS) [18–21]. Olivieri et al. [22] discuss in a review paper the principal methods for uncertainty estimation of multivariate calibration.

The number of applications of pattern recognition methods, such as Principal Component Analysis (PCA), Partial Least Squares for Discriminant Analysis (PLS-DA) and Soft Independent Modeling of Class Analogy (SIMCA), in the literature is vast. However, most studies only attribute object classifications; only a few papers evaluate the uncertainty estimation in these pattern recognition methods. Approaches to uncertainty estimation in unsupervised (PCA) and supervised (PLS-DA and SIMCA) techniques have been reported by Preisner et al. [23] for discrimination among pathogenic microorganisms. In this case, the authors implemented different re-sampling methodologies, jackknife and bootstrap, to assess the uncertainty of bacteria discrimination models using infrared data. The re-sampling methods generated new data sets from the available one by an artificial perturbation [20]. From this new data set, the unknown distribution of a parameter could be estimated by mimicking the random mechanism through re-sampling of the data set.

In this work we propose the use of Raman spectroscopy, an analytical technique with great potential for investigations of forensic cases, for the characterization of the inks used to produce authentic and counterfeit Real banknotes. Multivariate analysis was required, since the colors of the banknotes come from a mixture of pigments and the Raman spectra are complex and not resolved. The classification method PLS-DA was employed for this purpose, with evaluation of the reliability of the results using the bootstrap re-sampling technique. The main advantage of PLS-DA is that the sources of variability in the data are modeled by latent variables; the associated PLS scores are then calculated and plotted pairwise, allowing a visual assessment of group separation [24]. The model also calculates the probability of a sample belonging to the class being modeled. While the SIMCA classification method provides good results in the development of a classification model only when classes are well defined by the PCA, this is not necessary in the PLS-DA method; SIMCA also requires more development time to optimize designs for each class.

2. Theory

2.1. Partial Least Square Discriminant Analysis (PLS-DA)

PLS-DA is considered a supervised classification method, which requires initial knowledge of the classes of the sample set. The classes are defined based on a priori information of the system or by an exploratory analysis, for example, using Principal Component Analysis. Barker and Rayens [24] compared PLS-DA with LDA (Linear Discriminant Analysis) and presented some advantages of PLS-DA such as the selection of variables and noise reduction.

The PLS-DA model is developed from algorithms for Partial Least Squares (PLS) regression. PLS is an inverse multivariate calibration which seeks a direct relationship between the instrumental response and the property of interest (qualitative and quantitative). The two data matrices are: matrix X (N×J) representing the instrumental response, where N is the number of samples and J the number of variables; and matrix Y (N×M), which corresponds to the property of interest, with M being the number of properties. These matrices are decomposed by factors (or latent variables), in order to reduce the size of the data:

$$X = TP^{T} + E \qquad (1)$$

$$Y = UQ^{T} + F \qquad (2)$$

where T and U are the score matrices containing orthogonal rows; P is the loading of the X matrix; E is the residue of the X matrix; Q is the loading of the Y matrix and F is the error for the Y matrix.

In PLS there is a compromise between the explanation of the variance in X and its correlation with Y. Usually the number of latent variables is small. The T scores are orthogonal and estimated as a linear combination of the original variables with weighting coefficients.

$$T = XW \qquad (3)$$

The T scores are good predictors of Y, and it is assumed that Y and X are modeled by the same latent variables.

$$Y = TQ^{T} + F \qquad (4)$$

The Y residues, F, express the deviations between the observed and modeled response.

Eqs. (3) and (4) can be rewritten as:

$$Y = XWQ^{T} + F \qquad (5)$$

$$Y = X\beta + F \qquad (6)$$

The regression coefficient, β, can be written as:

$$\beta = WQ^{T} \qquad (7)$$

$$W = W\left(P^{T}W\right)^{-1} \qquad (8)$$

where W is defined in terms of a set of weighting loadings that maximize the covariance between X and Y [25]. More detail about PLS regression is given by Wold et al. [26].

As PLS-DA is a classification method, the matrix or vector Y (property of interest) is coded as 0 or 1 when there are two classes (C = 2). For more than two classes, one can build several models with 0 and 1 encoding, or use the PLS2 algorithm by constructing a matrix (N×C), where each column represents a class [27].

A fundamental step in building a PLS-DA model is the determination of the correct number of latent variables. This choice is commonly made by cross-validation of the calibration samples, where some samples are separated into a validation set and the models are built with the others. The prediction errors are calculated for the samples that were separated, using different numbers of latent variables. The process is repeated until all samples have been predicted.

The value obtained by the PLS-DA model is a number given by Eq. (5) that is not exactly 0 or 1. Thus it is necessary to establish threshold values to define the class limits. The threshold is estimated in many routines by the Bayesian theorem [27] or by establishing confidence limits for each object classified. These confidence intervals can be calculated by re-sampling techniques, such as bootstrap.
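
The decomposition in Eqs. (1)–(8) and the 0/1 class coding can be illustrated with a minimal two-class sketch. It is not the routine used in this work: it assumes the single-response NIPALS algorithm (PLS1), mean-centered data, invented four-variable toy "spectra", and a fixed illustrative threshold of 0.5 in place of a Bayesian threshold. The names pls1_fit and pls1_predict are hypothetical. Prediction applies the stored weights and loadings one latent variable at a time, which is algebraically equivalent to the regression form of Eq. (6).

```python
import math

def pls1_fit(X, y, n_lv):
    """Fit a single-response PLS model by NIPALS (illustrative sketch).

    X: list of samples, each a list of J intensities; y: list of 0/1 class codes.
    """
    n, j = len(X), len(X[0])
    x_mean = [sum(row[k] for row in X) / n for k in range(j)]
    y_mean = sum(y) / n
    E = [[row[k] - x_mean[k] for k in range(j)] for row in X]  # mean-centered X
    f = [v - y_mean for v in y]                                # mean-centered y
    W, P, Q = [], [], []
    for _ in range(n_lv):
        # Weight vector: direction in X with maximal covariance with y.
        w = [sum(E[i][k] * f[i] for i in range(n)) for k in range(j)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to model
            break
        w = [v / norm for v in w]
        t = [sum(E[i][k] * w[k] for k in range(j)) for i in range(n)]       # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][k] * t[i] for i in range(n)) / tt for k in range(j)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                         # y loading
        # Deflation: remove what this latent variable explains.
        E = [[E[i][k] - t[i] * p[k] for k in range(j)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        P.append(p)
        Q.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "P": P, "Q": Q}

def pls1_predict(model, x):
    """Predict the continuous class value for one sample x."""
    e = [xk - mk for xk, mk in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in zip(model["W"], model["P"], model["Q"]):
        t = sum(ek * wk for ek, wk in zip(e, w))
        y_hat += q * t
        e = [ek - t * pk for ek, pk in zip(e, p)]
    return y_hat

# Invented toy data: class 1 is intense in variable 0, class 0 in variable 2.
X_toy = [[1.0, 0.2, 0.1, 0.0], [0.9, 0.1, 0.2, 0.1], [1.1, 0.3, 0.0, 0.1],
         [0.1, 0.2, 1.0, 0.1], [0.2, 0.1, 0.9, 0.0], [0.0, 0.3, 1.1, 0.2]]
y_toy = [1, 1, 1, 0, 0, 0]

model = pls1_fit(X_toy, y_toy, n_lv=1)
y_hat = [pls1_predict(model, x) for x in X_toy]
labels = [1 if v > 0.5 else 0 for v in y_hat]  # 0.5 is an illustrative threshold
```

The predicted values are continuous and only close to 0 or 1, which is why a class threshold is needed before a label can be assigned.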
2.2. Bootstrap

Bootstrap [28,29] is a generalization of the ideas behind cross-validation, a simple and trustworthy method to estimate prediction errors. Again, the idea is to generate multiple data sets that, after analysis, shed light on the variability of the statistics of interest as a result of different training set compositions.

Bootstrap was introduced by Efron in 1979 [30] to estimate confidence intervals for certain parameters where this is not possible by other techniques, mainly when the number of samples is small. Although more than three decades have passed, there are only a few papers in the chemistry area that use the bootstrap method [17–20,23].

There are two types of bootstrap to estimate the uncertainty in regression models: objects bootstrap and residual bootstrap. Here, we are interested only in residual bootstrap, since examples from the literature present better results. In this paper, we use bootstrap to calculate the uncertainties of prediction of the PLS-DA models. The strategy was developed in accordance with the procedure presented by Pereira et al. [18], with modifications in the calculation of the residues.

The first step to estimate the uncertainties of the predictions is to calculate the residues:

$$F = Y - Y_{PLS} \qquad (9)$$

where Y is the reference value and Y_PLS is the value predicted by the model. According to Faber [20], the residues must be corrected by the degrees of freedom, since the differences between predicted and observed values are considerably smaller than those from the expected deviation values.

$$F = \frac{Y - Y_{PLS}}{\left(1 - DF/N\right)^{1/2}} \qquad (10)$$

where DF is the number of degrees of freedom and N is the number of calibration samples.

The number of degrees of freedom is also difficult to estimate for the calculation of the confidence intervals [22]. Various approaches have been made to estimate the degrees of freedom in multivariate regression models. The present study employed the pseudo-degrees of freedom (PDF) defined by van der Voet [31], which take into account the difference between the MSEC (Mean Squared Error of Calibration) and MSECV (Mean Squared Error of Cross-Validation), so that the higher this difference, the smaller the number of degrees of freedom, according to the following equations:

$$PDF = N\left(1 - \sqrt{\frac{MSEC}{MSECV}}\right) \qquad (11)$$

$$MSEC = \frac{\sum \left(Y - Y_{PLS}\right)^{2}}{N} \qquad (12)$$

$$MSECV = \frac{\sum \left(Y - Y_{PLS\text{-}CV}\right)^{2}}{N} \qquad (13)$$

where Y_PLS is the value predicted by the PLS model and Y_PLS-CV is the value predicted by internal validation.

The bootstrap samples are generated by randomly drawing, with replacement, from the corrected residues. The bootstrap residues (F*) are added to the Y_PLS values, generating a matrix Y*:

$$Y^{*} = Y_{PLS} + F^{*} \qquad (14)$$

A new PLS model can be calculated from Y*, giving the bootstrap regression coefficient (β*), which allows the calculation of the new values of Ŷ*, and then new residuals are calculated:

$$\hat{F}^{*} = Y_{PLS} - \hat{Y}^{*} \qquad (15)$$

The quantiles of the distribution of F̂* are used to estimate the confidence interval. For the confidence interval of each sample, the percentile method was used; the confidence intervals are asymmetric and sample-specific, and the upper and lower limits are defined by:

$$\hat{F}^{*}_{B\alpha/2} \le y_{a} \le \hat{F}^{*}_{B(1-\alpha/2)} \qquad (16)$$

where α is the degree of confidence and B is the number of bootstrap samples.

More detail about bootstrap can be found in the tutorial paper of Wehrens et al. [32].

3. Experimental

3.1. Samples

Original and counterfeit R$ 50 banknotes were analyzed by Raman spectroscopy without any sample preparation. The counterfeit banknotes were provided by the Technical–Scientific Police Superintendency of the State of São Paulo, Brazil, and homemade counterfeit samples were prepared by scanning authentic bills and printing copies with different printers (inkjet, laser jet, off-set) on white alkaline paper.

3.2. Instrument and data acquisition

The Raman spectra of banknotes were collected with a RamanStation 400F dispersive spectrometer equipped with a CCD detector using a Peltier system for cooling to −50 °C. A 785 nm near-infrared diode laser of 250 mW was used, and the spectra were obtained with 3 s of exposure and 12 accumulations in the region of 3200–400 cm⁻¹ with a spectral resolution of 4 cm⁻¹. Each spectrum was obtained in duplicate and a baseline correction was applied before the chemometric calculation. Three different areas on the R$ 50 banknotes were selected for study, as shown in Fig. 1. The areas of the bills analyzed are called: chalcographic, orange and red ink.

Fig. 1. Illustrative Brazilian R$ 50 banknotes with areas analyzed by Raman spectroscopy.

3.3. Data analysis

The Raman spectra were manipulated using MATLAB software Version 7.8 (R2009a) using routines developed in our laboratory. For all analyses, the data were preprocessed using mean centering
of the X (Raman intensities) and Y (class) blocks, and the Raman intensities were normalized with SNV (Standard Normal Variate). First, Principal Component Analysis was employed for exploratory analysis and the number of principal components was selected based on the captured variance. For classification analyses, PLS-DA was applied; the data set was randomly split into two subsets for training and validation. A dummy matrix Y was created with 0 for the street counterfeit samples and 1 for the authentic banknotes in the first model. Other models were developed for the homemade counterfeit banknotes, which were classified according to printer type. The number of latent variables for PLS-DA models was chosen by leave-one-out cross-validation. The threshold for the class was calculated by Bayes' Theorem employing the plsthres function present in PLS Toolbox software, version 4.2.1, from Eigenvector Technologies [27]. The confidence interval estimations for each sample were obtained with bootstrap residual [33], according to the flowchart shown in Fig. 2. For calculation of uncertainty, the number of pseudo degrees of freedom [31] was estimated using Eq. (11).
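
The residual bootstrap of Section 2.2 can be sketched in a few lines. This is an illustration under stated assumptions rather than the laboratory routine: fit_predict is a hypothetical callable standing in for "build a PLS-DA model and predict", the demonstration uses a straight-line least-squares model on invented data, dof=2 is a stand-in for the pseudo-degrees of freedom, and placing the residual quantiles around the fitted value is an assumption about how Eq. (16) is applied.

```python
import math
import random

def pseudo_dof(y, y_cal, y_cv):
    """Pseudo-degrees of freedom, N * (1 - sqrt(MSEC / MSECV)), cf. Eq. (11)."""
    n = len(y)
    msec = sum((a - b) ** 2 for a, b in zip(y, y_cal)) / n
    msecv = sum((a - b) ** 2 for a, b in zip(y, y_cv)) / n
    return n * (1.0 - math.sqrt(msec / msecv))

def residual_bootstrap(X, y, fit_predict, dof, n_boot=1000, alpha=0.05, seed=0):
    """Percentile intervals from the residual bootstrap (illustrative sketch).

    fit_predict(X_train, y_train, X_new) must return predictions for X_new.
    """
    rng = random.Random(seed)  # fixed seed makes the result reproducible
    n = len(y)
    y_fit = fit_predict(X, y, X)
    scale = math.sqrt(1.0 - dof / n)
    resid = [(a - b) / scale for a, b in zip(y, y_fit)]   # corrected residues, Eq. (10)
    boot = [[] for _ in range(n)]
    for _ in range(n_boot):
        y_star = [v + rng.choice(resid) for v in y_fit]   # Y* = Y_PLS + F*, Eq. (14)
        y_star_hat = fit_predict(X, y_star, X)            # model rebuilt on Y*
        for i in range(n):
            boot[i].append(y_fit[i] - y_star_hat[i])      # new residuals, Eq. (15)
    lo_idx = int(alpha / 2 * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    intervals = []
    for i in range(n):
        r = sorted(boot[i])
        # Assumption: the residual quantiles are placed around the fitted value.
        intervals.append((y_fit[i] + r[lo_idx], y_fit[i] + r[hi_idx]))
    return y_fit, intervals

def ols_fit_predict(x_train, y_train, x_new):
    """Hypothetical stand-in model: straight-line least squares on one variable."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    sxx = sum((x - mx) ** 2 for x in x_train)
    slope = sum((x - mx) * (v - my) for x, v in zip(x_train, y_train)) / sxx
    return [my + slope * (x - mx) for x in x_new]

x_demo = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]   # invented single-variable data
y_demo = [0, 0, 0, 1, 1, 1]               # 0/1 class codes
y_fit, intervals = residual_bootstrap(x_demo, y_demo, ols_fit_predict, dof=2, n_boot=200)
```

Because the draws are random, repeated runs give slightly different limits unless the seed is fixed, which is the reason a large number of bootstrap trials is used in practice.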
Fig. 2. Flowchart of the bootstrap residual method.

4. Results and discussion

4.1. Raman spectra

In this study, the application of Raman spectroscopy for the forensic analysis of authentic banknote inks was evaluated. The Raman spectra from three different areas of authentic banknotes are shown in Fig. 3. The colors obtained by printers are from RGB (red, green and blue) or CMYK (cyan, magenta, yellow and black) systems, where the combination of pigments results in a color image. The identification of the pigments responsible for each area was based on spectra presented in the literature.

Fig. 3. Raman spectra of the analyzed areas of R$ 50 banknotes: chalcographic ink (A); orange ink (B) and red ink (C).

The spectra from chalcographic and orange inks have characteristic bands of phthalocyanine and diarylide (diazo) pigments [34]; the profile of the spectra indicates that the chalcographic and orange ink colors are the result of mixtures of phthalocyanine and diarylide pigments.

Phthalocyanines are widely used as charge generation materials in solid-state devices such as electrophotographic copiers and printers [35]. The main Raman bands used to identify the phthalocyanine pigment were 1528, 1340, 748, 680 and 484 cm⁻¹. The band at 1528 cm⁻¹ corresponds to C–C and C–N stretching of the macrocycle; breathing and deformation of the macrocycle appear at 680 and 750 cm⁻¹. The diarylide pigment exhibits four characteristic bands at 1596, 1398, 1256 and 953 cm⁻¹. The prominent band at 1596 cm⁻¹ is attributed to aromatic ring vibrations. The Raman band at 1398 cm⁻¹ can be assigned to the C–H bending vibration, and the signal at 1256 cm⁻¹ was attributed to the amide III mode, having contributions from the N–H deformation and C–N stretching [8]. Finally, the Raman band at 953 cm⁻¹ is due to the symmetric stretching vibration of benzylamide [9].

The Raman spectrum collected from the orange ink indicated a major component of phthalocyanine with smaller amounts of a diarylide, compared with the Raman spectrum of the chalcographic ink, as suggested by more intense bands of the phthalocyanine pigment in the spectrum of orange ink. The proportion of each pigment in the mix is modified to obtain the desired color.

The Raman spectrum of red ink reveals characteristic bands of azo pigments; the signals at 1592 and 1364 cm⁻¹ can be assigned to C–C stretching and azo group stretching vibrations, respectively. The Raman bands of the naphthalene group vibration mode appear at 948 and 748 cm⁻¹. The weak band at 1156 cm⁻¹ was attributed to SO2 symmetric stretching and indicates a specimen of azo lake pigments [9].

In this work, only the results for chalcographic ink will be presented; the results for the other areas (orange and red inks) can be found in the Supplementary material. Fig. 4 shows the Raman spectra of chalcographic inks of the authentic banknotes and of the lab-made banknotes from laser jet and inkjet printers. The Raman spectrum of the authentic banknotes was discussed above. In the Raman spectrum of the laser jet printer it is possible to observe characteristic bands of the phthalocyanine pigment, which indicate a pigment mixture. The phthalocyanine pigment is blue and the Raman spectrum shows resonances or pre-resonances using a laser in the visible region, due to interaction with either one or both of these electronic absorption bands [35]. The chalcographic ink shows a brown color, provided by a mixture of the phthalocyanine and other pigments. However, the Raman spectrum of the chalcographic ink printed on the laser printer is dominated by characteristic bands of the phthalocyanine pigment. Since the Raman spectra were obtained with a laser at 785 nm, the absence of bands for the other pigments in the spectrum can be due to the great intensification of the resonance Raman effect of the phthalocyanine pigments [35], overlapping the other pigments.
Fig. 4. Raman spectra of chalcographic ink of (A) authentic banknotes, (B) banknotes made with a laser jet printer and (C) with an inkjet printer.

The Raman spectrum of fake banknotes printed on an inkjet printer shows a mixture of pigments, with an azo compound, such as diarylide, indicated by a weak band at 1596 cm⁻¹. Another azo compound may also be present in the mixture, as suggested by a band at 1380 cm⁻¹ attributed to N–N stretching, characteristic of this group.

4.2. Exploratory analysis: PCA

Exploratory analysis, based on Principal Component Analysis, was applied to the Raman spectra of the chalcographic ink of authentic and false banknotes. The PCA model was built with authentic banknote samples, counterfeit banknotes provided by the Technical–Scientific Police Superintendency, and samples of the homemade counterfeits printed on laser jet and inkjet printers. Four Principal Components (PCs) were selected, which explain 90.2% of the total variance of the samples. Fig. 5 shows the scores of the PCA model, with the projection of PC1 × PC2 × PC4, where the formation of three different clusters among the banknote samples can be visualized. In Fig. 5, the authentic bills form a single group (*). For the counterfeit banknotes (▼) the formation of a cluster was also observed, indicating that all samples presented Raman spectra with the same features. The homemade counterfeit samples (■) prepared by scanning and printing the authentic bill were separated according to the type of printer used. The samples printed on the laser printer were close to the apprehended counterfeit samples. This result suggests that the seized banknotes were printed on a laser printer. For samples printed on the inkjet printer, the formation of a third group is observed, in which two samples were separated from the others, suggesting that these samples have a different pigment composition. The possible presence of different pigment mixtures in the sample set studied justifies multivariate analysis in the discrimination between genuine and fake banknotes.

Fig. 5. Scores plot of PC1 × PC2 × PC4 of the PCA analysis of authentic, street counterfeit and lab-made counterfeit banknotes.

The variables that were mainly responsible for sample clustering could be observed by inspection of the loadings for each variable for the first, second and fourth principal components, as shown in Fig. 6. The loadings of PC1 show that the variable with the highest contribution is at 1596 cm⁻¹, and it can be assigned to the diarylide pigment present in the genuine bill, as discussed above. The variables that have the greatest contribution to PC2 correspond to regions of the Raman spectrum at 1521, 1340, 748 and 640 cm⁻¹, which correspond to the characteristic bands of the phthalocyanine pigment, present in the original banknote spectra and, especially, in the Raman spectra of the counterfeit banknotes produced by the laser printer. Finally, PC4 loadings are represented by Raman bands at 1542, 1520, 1084, 910 and 748 cm⁻¹. The graphic profile of PC4 indicates the contribution of the samples produced on the inkjet printer, where these samples showed greater variability of pigment mix according to the printer brand.

Fig. 6. Graph of the loadings of (A) PC 1, (B) PC 2 and (C) PC 4 versus wavenumbers (variables) for the PCA model.

The exploratory analysis employing PCA and Raman spectra of the chalcographic ink of the authentic and fake banknotes allowed the separation of samples according to the printer type used.

4.3. Classification: PLS-DA

After a preliminary stage of feature extraction to compress the relevant information using PCA, PLS-DA models were developed to predict class membership for new observations. Two classification models were built, where the samples were split into calibration and testing data sets.

The first model was constructed to classify Real banknotes as authentic or counterfeit. The calibration set was composed of 30 genuine banknote samples and 28 lab-made fake banknote samples printed on laser jet and inkjet printers. In order to support the selection of model dimensionality, leave-one-out cross-validation was employed; the number of latent variables was chosen using the plot of RMSECV (Root Mean Squared Error of Cross-Validation) versus the number of latent variables. RMSECV is an internal validation error and was calculated according to Eq. (17).

$$RMSECV = \sqrt{\frac{\sum_{i=1}^{I}\left(y_{ref} - y_{pred,cv}\right)^{2}}{N}} \qquad (17)$$

Four latent variables were selected, which explain 88.8% of the variance in X and 98.9% of the variance in Y and give the lowest RMSECV, equal to 0.0644. The predicted values from the model were 0 for counterfeit samples and 1 for original samples. The RMSEC (Root Mean Squared Error of Calibration) value for the calibration set was 0.0589, calculated according to Eq. (18):

$$RMSEC = \sqrt{\frac{\sum_{i=1}^{I}\left(y_{ref} - y_{cal}\right)^{2}}{N - DF}} \qquad (18)$$
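
The two error measures in Eqs. (17) and (18) can be illustrated with a short sketch. The helper names (rmse, loo_predictions, mean_fit_predict) and the data are hypothetical, and the stand-in model simply predicts the training mean so that the numbers can be checked by hand; in practice a PLS-DA model with a given number of latent variables would take its place.

```python
def rmse(y_ref, y_pred, dof=0):
    """Root mean squared error; dof > 0 gives the N - DF denominator of Eq. (18)."""
    n = len(y_ref)
    return (sum((a - b) ** 2 for a, b in zip(y_ref, y_pred)) / (n - dof)) ** 0.5

def loo_predictions(X, y, fit_predict):
    """Leave-one-out: predict each sample from a model built on all the others."""
    preds = []
    for i in range(len(y)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        preds.append(fit_predict(X_train, y_train, [X[i]])[0])
    return preds

def mean_fit_predict(X_train, y_train, X_new):
    """Hypothetical stand-in model that always predicts the training mean."""
    m = sum(y_train) / len(y_train)
    return [m for _ in X_new]

X_demo = [[0.0], [0.1], [0.9], [1.0]]   # invented data
y_demo = [0, 0, 1, 1]                   # 0/1 class codes

# Eq. (17): error of the cross-validated predictions.
rmsecv = rmse(y_demo, loo_predictions(X_demo, y_demo, mean_fit_predict))
# Eq. (18): error of the calibration fit, corrected by one degree of freedom here.
rmsec = rmse(y_demo, mean_fit_predict(X_demo, y_demo, X_demo), dof=1)
```

Repeating the leave-one-out loop for one, two, three and more latent variables and plotting the resulting RMSECV values gives the curve from which the model dimensionality is chosen.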

These values of RMSEC and RMSECV estimated during the calibration stage are referred to as accuracy estimations and cannot be translated into prediction uncertainties for future samples [17].

A threshold value to separate the classes was calculated based on the distribution of the calibration-sample predictions obtained from the PLS model; it is the value that best splits the classes with the least probability of false classifications in future predictions [27]. The threshold value was 0.4666. Thus, banknotes with predicted values above this threshold are classified as genuine and those below it as counterfeit.
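
One way to picture a threshold of this kind is sketched below. It is not the toolbox routine: it assumes that the calibration predictions of each class follow a Gaussian distribution, takes the class proportions as priors, and locates by bisection the point between the class means where the two prior-weighted densities are equal. The function name bayes_threshold and the data are hypothetical.

```python
import math

def _normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def _mean_std(values):
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / (n - 1)
    return mu, math.sqrt(var)

def bayes_threshold(pred_class0, pred_class1, tol=1e-10):
    """Threshold where both classes are equally probable (Gaussian assumption)."""
    mu0, s0 = _mean_std(pred_class0)
    mu1, s1 = _mean_std(pred_class1)
    n0, n1 = len(pred_class0), len(pred_class1)
    prior0, prior1 = n0 / (n0 + n1), n1 / (n0 + n1)  # assumed priors: class proportions

    def diff(x):
        return prior1 * _normal_pdf(x, mu1, s1) - prior0 * _normal_pdf(x, mu0, s0)

    lo, hi = mu0, mu1
    if not (lo < hi and diff(lo) < 0.0 < diff(hi)):
        raise ValueError("class means are not separated enough for bisection")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Invented calibration predictions for class 0 (counterfeit) and class 1 (genuine).
threshold = bayes_threshold([-0.1, 0.0, 0.1], [0.9, 1.0, 1.1])
labels = [1 if v > threshold else 0 for v in [0.93, 0.08, 0.61]]
```

With equally sized and equally spread classes the threshold falls midway between the class means; unequal spreads or class sizes shift it, which is consistent with a reported threshold that differs from 0.5.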
In general, accuracy alone is not enough to assess the performance of the model developed; it is also important to consider the reliability of the model. After optimization of the parameters of the calibration model, the sample-specific confidence interval was calculated. The calculation of the reliability of the classification model is an important issue: the model must not only assign the object to a class, but the uncertainty of this assignment also needs to be known. For this purpose, the bootstrap residual technique was employed according to the description in Section 2.2. Due to its stochastic nature, the bootstrap method yields (slightly) different results when applied repeatedly to the same problem (unless the same random numbers are reused). This property is certainly an issue for metrologists. One of the ways to compensate for this variability is related to the adjustment of the number of trials (the number of evaluations of the measurement model), which should be large enough to ensure the reliability of the bootstrap results. Each simulation was carried out using 10³ random trials for each calibration sample. This number was optimized to obtain the lowest standard deviations. The analysis was repeated 10 times. All of the calculation samples were considered as a new observation and the 95% bootstrap confidence intervals were computed. The results from PLS-DA calibration model 1 are shown in Fig. 7A. The results of the validation set with uncertainty estimations are shown in Fig. 7B. The samples classified as counterfeit banknotes (0) by the PLS-DA model were provided by the Technical–Scientific Police Superintendency and classified as fake by classical forensic approaches, based on sensory tests and optical inspection by a specialist. For the validation set, the RMSEP (Root Mean Squared Error of Prediction) was calculated by Eq. (19), resulting in a value equal to 0.3552.

$$RMSEP = \sqrt{\frac{\sum_{i=1}^{M}\left(y_{ref} - y_{pred}\right)^{2}}{K}} \qquad (19)$$

where K is the number of samples in the validation set.

Fig. 7. Results of the calibration (A) and validation (B) set of the PLS-DA model, showing authentic banknotes (above dashed line) and counterfeit banknotes (below dashed line) with prediction intervals for all samples analyzed by the bootstrap residual method. The validation samples are composed of street counterfeit samples apprehended by the Police.

The accuracy indicators (RMSEC and RMSEP) do not show a good concordance, implying that the RMSEC value is not a good estimate of the standard error of prediction observed in the validation set. The sample-specific uncertainty estimation, shown by the error bars in Fig. 7, allows better confidence in the classification of the validation set. The lower and upper confidence limits calculated for this set ranged from 0.0310 to 0.0300, and all samples were correctly classified. The results show the reliability of the model to discriminate authentic and counterfeit bills.

After identifying whether the banknotes are original or false, a second classification model was developed to classify the type of printer used in making the counterfeit banknotes. In this case, the calibration set was composed of 28 samples made by inkjet and laser jet printers. For this model four latent variables were selected, which provided a RMSEC equal to 0.0987. Fig. 8A displays results for calibration samples, where 1 is for the samples produced on the laser printer and 0 is for samples printed by the inkjet printer. The RMSEC calculated was 0.0835 and the threshold between the classes was 0.3234, represented by the dashed line in Fig. 8A. Thus, fake banknotes above this value are classified as laser printer and those below it as inkjet printer. The uncertainty estimation was calculated by the re-sampling technique, bootstrap residual. For the test set (Fig. 8B), the street counterfeit banknotes apprehended by the Police were employed, in order to identify the printer type used in the manufacture of these counterfeit Real banknotes. Most street counterfeit banknotes were classified as made by a laser printer; this type of printer provides better print quality and is a favorite of counterfeiters. Five samples were classified as originating from inkjet printers. Based on the calculation of the confidence intervals for each sample, according to the flowchart of Fig. 2, the lower and upper
limits ranged from 0.0290 to 0.0729. The uncertainty in the classification of two samples is noted, as outlined in Fig. 8B, where the confidence limits crossed the threshold line. This result suggests a mixture of inks not included in this study, or it could be due to noisy spectra, leading to large uncertainties in the classification.

Fig. 8. Results of the calibration (A) and validation (B) set of the PLS-DA model, showing lab made counterfeit samples prepared by laser jet printers (above dashed line) and inkjet printers (below dashed line) with prediction intervals for all samples, analyzed by the bootstrap residual method. The validation samples are composed of street counterfeit samples seized by the Police.

PLS-DA was used to design a system for non-specialist operators: the model developed can analyze new samples of R$ 50 banknotes from data obtained by Raman spectroscopy, classifying them reliably and estimating the uncertainty.
A similar procedure was applied to the Raman spectra for several other areas of the R$ 50 banknotes, for red and orange inks. The results are shown in the Supplementary material. The information provided by these other areas can be used for complementary analysis in the discrimination between genuine and counterfeit banknotes.

5. Conclusions

This work addresses the problem of counterfeit Real banknotes in Brazil. The proposed procedure, based on the characterization of inks used to produce authentic and counterfeit R$ 50 banknotes employing Raman spectroscopy, PCA and PLS-DA, allows us to distinguish between genuine and fake Real notes. This work takes into account the uncertainty in the predictions by chemometric methods using the re-sampling technique, and the bootstrap residual was used for this purpose considering the degrees of freedom of the model.
The classification model developed based on PLS-DA and Raman spectra could be used as a complementary method to classical forensic inspection. The procedure is fast, non-destructive, robust and presents the possibility of in situ analysis using a portable instrument.
Supplementary materials related to this article can be found online at doi:10.1016/j.microc.2012.03.006.

Acknowledgments

The authors thank CAPES and INCTBio for financial support and the Technical–Scientific Police Superintendency of the State of São Paulo, Brazil, for providing the street counterfeit banknotes.

References

[1] Banco Central do Brasil, Departamento do Meio Circulante, SISMECIR. http://www.bcb.gov.br/htms/mecir/seguranca/EstatisticaFalsificacao%20%20UF%20X%20Denominacao_internet_2011.pdf. Accessed on 28/09/2011.
[2] A. Villa, N. Ferrer, J. Mantecón, D. Bretón, J.F. García, Development of a fast and non-destructive procedure for characterizing and distinguishing original and fake euro notes, Anal. Chim. Acta 559 (2006) 257–263.
[3] L.S. Eberlin, R. Haddad, R.C. Sarabia Neto, R.G. Cosso, D.R.J. Maia, A.O. Maldaner, J.J. Zacca, G.B. Sanvido, W. Romão, B.G. Vaz, D.E. Ifa, A. Dill, R. Graham Cooks, M.N. Eberlin, Instantaneous chemical profiles of banknotes by ambient mass spectrometry, Analyst 135 (2010) 2533–2539.
[4] S. Armenta, M. de la Guardia, Analytical methods to determine cocaine contamination of banknotes from around the world, Trends Anal. Chem. 27 (2008) 344–351.
[5] K.A. Frederick, R. Pertaub, N.W.S. Kam, Identification of individual drug crystals on paper currency using Raman microspectroscopy, Spectrosc. Lett. 37 (2004) 301–310.
[6] S.J. Dixon, R.G. Brereton, J.F. Carter, R. Sleeman, Determination of cocaine contamination on banknotes using tandem mass spectrometry and pattern recognition, Anal. Chim. Acta 559 (2006) 54–63.
[7] J. Zieba-Palus, B.M. Trzcinska, Establishing of chemical composition of printing ink, J. Forensic Sci. 56 (2011) 819–821.
[8] F. Schulte, K.W. Brzezinka, K. Lutzenberger, H. Stege, U. Panne, Raman spectroscopy of synthetic organic pigments used in 20th century works of art, J. Raman Spectrosc. 39 (2008) 1455–1463.
[9] P. Vandenabeele, L. Moens, H.G.M. Edwards, R. Dams, Raman spectroscopic database of azo pigments and application to modern art studies, J. Raman Spectrosc. 31 (2000) 509–517.
[10] J. Zieba-Palus, J. Was-Gubala, An investigation into the use of micro-Raman spectroscopy for the analysis of car paints and single textile fibres, J. Mol. Struct. 993 (2011) 127–133.
[11] M. Skenderovska, B. Minčeva-Šukarova, L. Andreeva, Application of micro-Raman and FT-IR spectroscopy in forensic analysis of automotive topcoats in the Republic of Macedonia, Maced. J. Chem. Chem. Eng. 27 (2008) 9–17.
[12] E.M.A. Ali, H.G.M. Edwards, M.D. Hargreaves, I.J. Scowen, In situ detection of cocaine hydrochloride in clothing impregnated with the drug using benchtop and portable Raman spectroscopy, J. Raman Spectrosc. 41 (2010) 938–943.
[13] D.S. Moore, R.J. Scharff, Portable Raman explosives detection, Anal. Bioanal. Chem. 393 (2009) 1571–1578.
[14] L. Lepot, K. De Wael, F. Gason, B. Gilbert, Application of Raman spectroscopy to forensic fibre cases, Sci. Justice 48 (2008) 109–117.
[15] O. Preisner, J.A. Lopes, R. Guiomar, J. Machado, J.C. Machado, J.C. Menezes, Fourier transform infrared (FT-IR) spectroscopy in bacteriology: towards a reference method for bacteria discrimination, Anal. Bioanal. Chem. 387 (2007) 1739–1748.
[16] ABNT NBR ISO/IEC 17025, Requisitos Gerais para Competência de Laboratórios de Ensaio e Calibração, ABNT, Rio de Janeiro, 2005.
[17] L. Zhang, S. Garcia-Munoz, A comparison of different methods to estimate prediction uncertainty using Partial Least Square (PLS): a practitioner's perspective, Chemom. Intell. Lab. Syst. 97 (2009) 152–158.
[18] A.C. Pereira, M.S. Reis, P.M. Saraiva, J.C. Marques, Madeira wine ageing prediction based on different analytical technique: UV–vis, GC-MS, HPLC-DAD, Chemom. Intell. Lab. Syst. 105 (2011) 43–55.
[19] M. Boiret, L. Meunier, Y.M. Ginot, Tablet potency of Tianeptine in coated tablets by near infrared spectroscopy: model optimization, calibration transfer and confidence intervals, J. Pharm. Biomed. Anal. 54 (2011) 510–516.
[20] N.M. Faber, Uncertainty estimation for multivariate regression coefficients, Chemom. Intell. Lab. Syst. 64 (2002) 169–179.
[21] J.W.B. Braga, R.J. Poppi, Comparison of variance sources and confidence limits in two PLSR models for determination of the polymorphic purity of carbamazepine, Chemom. Intell. Lab. Syst. 80 (2006) 50–56.
[22] A.C. Olivieri, N.M. Faber, J. Ferré, R. Boqué, J.H. Kalivas, H. Mark, Uncertainty estimation and figures of merit for multivariate calibration, Pure Appl. Chem. 78 (2006) 633–661.
[23] O. Preisner, J.A. Lopes, J.C. Menezes, Uncertainty assessment in FT-IR spectroscopy based bacteria classification models, Chemom. Intell. Lab. Syst. 94 (2008) 33–42.
[24] M. Barker, W. Rayens, Partial least squares for discrimination, J. Chemom. 17 (2003) 166–173.
[25] H. Martens, M. Martens, Modified jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (PLSR), Food Qual. Prefer. 11 (2000) 5–16.
[26] S. Wold, M. Sjöström, L. Eriksson, PLS-regression: a basic tool of chemometrics, Chemom. Intell. Lab. Syst. 58 (2001) 109–130.
[27] B.M. Wise, N.B. Gallagher, R. Bro, J.M. Shaver, W. Windig, R.S. Koch, Chemometrics Tutorial for PLS_Toolbox and Solo, Eigenvector Research, Inc., 3905 West Eaglerock Drive, Wenatchee, WA 98801 USA, 2006.
[28] B. Efron, R.J. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, New York, 1993.
[29] A.C. Davison, D.V. Hinkley, Bootstrap Methods and Their Application, Cambridge University Press, Cambridge, 1997.
[30] B. Efron, Bootstrap methods: another look at the jackknife, Ann. Stat. 7 (1979) 1–26.
[31] H. van der Voet, Pseudo-degrees of freedom for complex predictive models: the example of partial least squares, J. Chemom. 13 (1999) 195–208.
[32] R. Wehrens, H. Putter, L.M.C. Buydens, The bootstrap: a tutorial, Chemom. Intell. Lab. Syst. 54 (2000) 35–52.
[33] A.M. Zoubir, B. Boashash, The bootstrap and its application in signal processing, IEEE Signal Process. Mag. 15 (1998) 55–76.
[34] K.W.C. Poon, I.R. Dadour, A.J. McKinley, In situ chemical analysis of modern organic tattooing inks and pigments by micro-Raman spectroscopy, J. Raman Spectrosc. 39 (2008) 1227–1237.
[35] D.R. Tackley, G. Dent, W.E. Smith, Phthalocyanines: structure and vibrations, Phys. Chem. Chem. Phys. 3 (2001) 1419–1426.
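
Illustrative sketch (not part of the original article). The RMSEP of Eq. (19) and the idea of comparing a bootstrap-residual interval with a class threshold can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: an ordinary least-squares line on a single variable stands in for the PLS-DA model, the residuals are resampled without any degrees-of-freedom correction, the interval is a simple 2.5th to 97.5th percentile interval, and the function names, the toy data and the default of 1000 trials are hypothetical. Only the threshold value 0.3234 is taken from the text.

```python
import math
import random


def rmsep(y_ref, y_pred):
    """Root mean squared error of prediction over K validation samples."""
    k = len(y_ref)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / k)


def fit_line(x, y):
    """Least-squares line y = a + b*x (assumed stand-in for the PLS model)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def percentile(sorted_vals, q):
    """Percentile (q in 0..100) of a sorted list, with linear interpolation."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def bootstrap_residual_interval(x_cal, y_cal, x_new, n_trials=1000, seed=0):
    """95% percentile interval for the prediction at x_new (residual bootstrap)."""
    rng = random.Random(seed)
    a, b = fit_line(x_cal, y_cal)
    fitted = [a + b * xi for xi in x_cal]
    residuals = [yi - fi for yi, fi in zip(y_cal, fitted)]
    preds = []
    for _ in range(n_trials):
        # Build a pseudo-response by adding resampled residuals to the fit,
        # refit the model, and predict the new sample again.
        y_star = [fi + rng.choice(residuals) for fi in fitted]
        a_s, b_s = fit_line(x_cal, y_star)
        preds.append(a_s + b_s * x_new)
    preds.sort()
    return percentile(preds, 2.5), percentile(preds, 97.5)


def classify(interval, threshold):
    """Return 'above', 'below', or 'uncertain' if the interval spans the threshold."""
    lower, upper = interval
    if lower > threshold:
        return "above"
    if upper < threshold:
        return "below"
    return "uncertain"


# Toy calibration data (invented): class 0 at low x, class 1 at high x.
x_cal = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
y_cal = [0, 0, 0, 1, 1, 1]
threshold = 0.3234  # class threshold quoted in the text for the printer model

for x_new in (0.15, 0.42, 0.95):
    interval = bootstrap_residual_interval(x_cal, y_cal, x_new)
    print(x_new, interval, classify(interval, threshold))
```

In this sketch a sample is assigned to a class only when its whole interval lies on one side of the threshold; an interval that spans the threshold is reported as uncertain, which mirrors how the two ambiguous samples in Fig. 8B are described.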
