
Analytical Letters, 38: 2189–2206, 2005

Copyright © Taylor & Francis, Inc.


ISSN 0003-2719 print/1532-236X online
DOI: 10.1080/00032710500259342

CHEMOMETRICS

Data Compression for a Voltammetric Electronic Tongue Modelled with
Artificial Neural Networks

Laura Moreno-Barón, Raúl Cartas, Arben Merkoçi, and Salvador Alegret
Sensors and Biosensors Group, Chemistry Department, Autonomous
University of Barcelona, Bellaterra, Catalonia, Spain

Juan M. Gutiérrez, Lorenzo Leija, Pablo R. Hernandez, and Roberto Muñoz
Bioelectronics Section, Department of Electrical Engineering, Cinvestav,
Mexico City, Mexico

Manuel del Valle
Sensors and Biosensors Group, Chemistry Department, Autonomous
University of Barcelona, Bellaterra, Catalonia, Spain

Abstract: In the study of voltammetric electronic tongues, a key point is the
preprocessing of the raw information, the voltammograms that form the response of
the sensor array, prior to classification or modeling with advanced chemometric
tools. This work demonstrates the use of the discrete wavelet transform (DWT) for
compacting these voltammograms prior to modeling. After compression, a system based
on artificial neural networks (ANNs) was used for the quantification of the
electroactive substances present, using the obtained wavelet decomposition
coefficients as inputs. The Daubechies wavelet of fourth order permitted an
effective compression down to 16 coefficients, reducing the original dimension by
ca. 10 times. The case studied is a mixture of three oxidizable amino acids:
tryptophan, cysteine, and tyrosine. With the reduced information, one ANN per
species was trained using the Bayesian regularization algorithm. The proposed
procedure was compared with the more conventional treatments of downsampling the
voltammogram, or its feature extraction employing principal component analysis
prior to ANNs.

Keywords: Voltammetric electronic tongue, discrete wavelet transform, artificial
neural networks, PCA, oxidizable amino acids

Received 17 June 2005; accepted 22 June 2005

Financial support for this work was provided by the MECD (Madrid, Spain) through
project CTQ2004-08134, by CONACYT (Mexico) through project 43553, and by the
Department of Universities and the Information Society (DURSI) from the Generalitat
de Catalunya.
Address correspondence to Manuel del Valle, Sensors and Biosensors Group,
Chemistry Department, Autonomous University of Barcelona, Bellaterra, Catalonia
E-08193, Spain. E-mail: manel.delvalle@uab.es

INTRODUCTION

In recent years, the concept of the electronic tongue has met with considerable
success in the field of chemical sensors. It is one of the clearest benefits of
combining chemometrics with electrochemical sensors (Pravdová 2002), a combination
that was foreseen as an excellent way to improve sensor performance (Lavine 2002).
An accepted definition of an electronic tongue (Holmberg 2004) entails an
analytical instrument comprising an array of nonspecific, poorly selective chemical
sensors with cross-sensitivity to different compounds in a solution, and an
appropriate chemometric tool for the data processing. For the analysis of liquid
samples, there are two main kinds of electronic tongues: those employing
potentiometric sensors (Gallardo 2003) and those employing voltammetric sensors
(Winquist 1997). The latter usually employ arrays of voltammetric electrodes, for
example, a number of different metallic electrodes or a number of modified
electrodes (Apetrei 2004). From the conceptual point of view, a voltammetric system
with a single electrode can also be considered an electronic tongue, as the
defining feature here is the high-order measurement information fed to the
computer-processing tool.
The complexity of this input information, however, makes any chemometric stage
enormously cumbersome and becomes a critical issue in voltammetric electronic
tongues. A crucial point in this field is therefore the reduction of the data prior
to classification or calibration. The use of artificial neural networks (ANNs) is
widely accepted for building the calibration model of electronic tongues
(Krantz-Rülcker 2001). When the input information is of the voltammetric type, it
becomes difficult to correctly build and adjust a network with the hundreds of
input nodes that these voltammograms require. Again, the way to solve this
bottleneck is to preprocess the original signal in order to reduce its dimensions.
In addition to size reduction, compression is intended to extract significant
features from the original information while eliminating irrelevant content, such
as noise or redundancies (Simons 1995; Despagne 1998). Further advantages of such
pretreatment can be an increased training speed, a reduction of memory needs,
better generalization ability of the model, enhanced robustness against noise, and
simpler model representations (Despagne 1998).
A widely used method for data compression is principal component analysis (PCA).
The method aims to summarize almost all the variance contained in the original
information in a smaller number of directions (the PCs), with new coordinates
called scores obtained after data transformation. These axes or directions are
mutually orthogonal, which facilitates the use of linear regression models
(principal components regression, PCR). ANNs do not benefit from the orthogonality
of input variables, but applications in quantitative analysis can still be
developed employing the PC scores as input data (Borggaard 1992; de Carvalho 2000).
In practice, PC scores have been successfully used as inputs, because all relevant
information from a large spectrum or voltammogram is reduced to a few PCs,
depending on the correlation of the original data. One limitation of the treatment
is that it can fail to preserve the nonlinearity of a data set, as it is a linear
projection technique. If the original information has nonlinear characteristics,
these will be treated as perturbations or noise and will not be described by the
first PCs as they would in a linear case.
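As a brief illustration of the compression just described, PCA can be written as a
truncated bilinear decomposition. The notation is assumed here for illustration and
does not come from the original text: X is the mean-centred data matrix (samples as
rows), P_A holds the first A loading vectors, T_A the corresponding scores, and E
the residuals left out of the model.

```latex
\mathbf{X} = \mathbf{T}_A \mathbf{P}_A^{\mathsf{T}} + \mathbf{E},
\qquad
\mathbf{P}_A^{\mathsf{T}} \mathbf{P}_A = \mathbf{I}_A,
\qquad
\mathbf{T}_A = \mathbf{X} \mathbf{P}_A
```

Under this notation, the A columns of T_A would be the compressed inputs passed to
the ANN, and any nonlinear structure not captured by the linear projection ends up
in E.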
As alternatives to PCA, it is possible to use Fourier analysis (Gemperline 1997),
the Hadamard transform (Dathe 1996), the discrete wavelet transform (DWT)
(Collantes 1997), or the most common technique of downsampling to preprocess input
data before ANN modeling. The DWT is a processing tool that yields a series of
coefficients as a result of relating an original signal to a family of functions
that are scaled and translated versions of a base function known as the mother
wavelet (Leung 1998). Its most attractive feature is its ability to describe
localized (temporal) information in the signal, whereas Fourier decomposition is
global (Shao 2003). This feature allows the DWT to describe nonstationary signals
better than the Fourier transform, which employs periodic sine and cosine functions
(Chau 2004).
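The "scaled and translated versions" of the mother wavelet can be made explicit
with the standard textbook form below. The symbols (a for scale, b for translation,
j and k for the dyadic indices, x[n] for the sampled signal) are assumed notation,
not taken from the original text.

```latex
\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{t-b}{a}\right),
\qquad
a = 2^{j},\; b = k\,2^{j}
\;\;\Longrightarrow\;\;
c_{j,k} = \sum_{n} x[n]\, \psi_{j,k}[n]
```

Each coefficient c_{j,k} thus reflects the signal around one position k at one
scale j, which is the localization property that a global Fourier basis lacks.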
When working with voltammetric electronic tongues, different options
have been attempted to reduce the complexity of the acquired information.
Seminal works of this research topic employed the fitting to different para-
metric models having more or less physical meaning (Simons 1995).
Different measurement compression techniques were evaluated with the vol-
tammetric electronic tongue developed by the group of Winquist in Sweden
(Winquist 1997; Krantz-Rülcker 2001) and used for classification; the
measurements of this electronic tongue are the responses of an array of
metallic working electrodes of different nature to a set of voltage pulses.
A first work (Holmin 2001) compared hierarchical principal component analysis, an
evolution of PCA, against the DWT for the classification of beverages and foods.
The same work evaluated a third procedure, which yielded the best results; it used
a parametric model based on a sum of exponential decays with some electrochemical
foundation. A second contribution by Artursson (2002a) studied a model based on the
sum of two exponential decays to compress the original voltammetric signal by a
factor of ca. 100. The reduced signal was then used for the classification,
employing PCA, of different aqueous samples and beverages. An additional work from
the same author (Artursson 2002b) evaluated the use of the DWT to reduce the raw
data of a voltammetric e-tongue applied at a bottling plant of drinking water.
These contributions are closely related to their equivalents in the field of gas
sensors (systems known as electronic noses), where the DWT has been employed for
feature extraction of dynamic or thermally modulated signals (Distante 2002;
Ionescu 2002).
Apart from these works related to classification, the literature also contains
several contributions that employ the DWT to resolve overlapped signals obtained
with electroanalytical procedures (Shao 2001). These works aim at compressing the
raw data while retaining the information needed to correlate them with the
concentrations present. In this way, they are more closely related to the computing
needed for the quantitative application of e-tongue data. Different alternatives
for the quantification of the electroactive components present have been attempted,
all aimed at extracting significant features from an overlapped voltammetric
signal. Almost all of these works focus on the resolution of metals mixed in a
solution, as is the case of the continuous WT applied to the determination of
mixtures of cadmium and indium (Nie 2001), or the derivative WT used for
determining mixtures of indium and cadmium or dopamine and ascorbic acid (Zhang
2004). Also remarkable is the development of an instrument that employs the DWT
online and displays the resolution of lead and thallium, or cadmium and indium
(Shao 2000), mixed in a solution.
The utility of the DWT was demonstrated in the extraction of significant
features and the subsequent use of them for the quantitative modeling of vol-
tammetric signals employing multivariate calibration procedures. These con-
tributions studied the optimum wavelet type to reduce the voltammogram and
used the compressed information to perform the modeling with partial least
squares (PLS) regression or with ANNs (Cocchi 2003a). In these different
works, the case under study resolved a two-component system with two
metallic ions: thallium and lead (Palacios-Santander 2003; Cocchi 2003b).
Recently in our laboratory, an equivalent system was developed, in
which the original information in the voltammogram is compressed
employing the DWT, and the reduced information is used for ANN
modeling (Moreno-Barón 2005).
The present work compares the performance of this alternative, the use of the DWT
as a compression tool prior to the quantitative modeling of voltammetric electronic
tongues employing ANNs, with the more classical PCA + ANN combination (Ensafi
2002), and with the more straightforward alternative of simply downsampling the raw
data to reduce their dimensionality before ANN modeling (Gutes 2005).

EXPERIMENTAL

Reagents and Materials

All chemicals for the electrolyte and the stock amino acid solutions, tryptophan
(Trp), cysteine (Cys), and tyrosine (Tyr), were purchased from Merck as
pro-analysis grade. The support electrolyte solution consisted of 0.1 M potassium
chloride + 0.1 M phosphate solution (pH adjusted to 7.5). Synthetic mixtures for
the evaluation of the voltammetric method were prepared from 0.1 M stock solutions
of each amino acid.

Apparatus

A PGSTAT 20 Autolab potentiostat with a Pt working electrode was used for
differential pulse voltammetric measurements. The resulting voltammetric data
consisted of current intensities recorded in the range of potentials from 0.4 to
1.0 V in steps of 0.00365 V. Hence, 164 data points per sample were recorded, which
formed the voltammograms used in further analysis. The modulation amplitude was
0.025 V, the modulation time was 70 ms, and the pulse interval was 300 ms. No
preconditioning was performed.
For the series of synthetic samples, microvolumes of each amino acid
mixture solution were added to 25 ml of the support electrolyte solution gen-
erating the different sample series. A magnetic stirrer was used to homogenize
the solution prior to measurement.

Procedure

Three series of synthetic solutions were prepared for Trp, Cys, and Tyr analysis.
For each analyte, six concentration levels were considered as follows: 5.0, 10, 20,
25, 30, and 35 µM for Cys and Tyr, and 2.0, 6.0, 10, 14, 17, and 21 µM for Trp.
Interferences were studied at two levels: 10 and 25 µM for Trp and Cys; 5.0 and
34 µM for Tyr. As a result, each analyte series was composed of 24 mixture
solutions, and the complete set comprised 72 voltammograms.
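The sample counts above (24 mixtures per series, 72 in total) are consistent with
crossing the six levels of each analyte with the two levels of each of the other
two amino acids. The sketch below reproduces that count; the full-factorial layout
and all names in the code are assumptions made for illustration, since the text
does not spell out how the levels were combined.

```python
from itertools import product

# Concentration levels quoted in the Procedure section (units as in the text).
ANALYTE_LEVELS = {
    "Trp": [2.0, 6.0, 10, 14, 17, 21],
    "Cys": [5.0, 10, 20, 25, 30, 35],
    "Tyr": [5.0, 10, 20, 25, 30, 35],
}
# Levels used when a species acts as an interferent in another series.
INTERFERENT_LEVELS = {
    "Trp": [10, 25],
    "Cys": [10, 25],
    "Tyr": [5.0, 34],
}


def build_series(analyte):
    """Assumed full-factorial series: 6 analyte levels x 2 x 2 interferent levels."""
    first, second = [name for name in ANALYTE_LEVELS if name != analyte]
    series = []
    for level, conc_first, conc_second in product(
        ANALYTE_LEVELS[analyte],
        INTERFERENT_LEVELS[first],
        INTERFERENT_LEVELS[second],
    ):
        series.append({analyte: level, first: conc_first, second: conc_second})
    return series


design = {name: build_series(name) for name in ANALYTE_LEVELS}
total_mixtures = sum(len(series) for series in design.values())
```

Each entry of `design` is one mixture, so the three series together give the 72
columns of the input and output matrices used later.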

Software

The DWT and the ANN modeling were implemented in Matlab version 6.1 (MathWorks,
Natick, MA) with the aid of its Neural Network (version 4.0) and Wavelet (version
2.0) toolboxes. The PCA treatment was done with the statistical program Minitab,
release 14 (Minitab Inc., State College, PA).

RESULTS AND DISCUSSION

The procedures for data reduction employing DWT, PCA, and downsampling
were set up and fine-tuned before the voltammograms were coupled to the
modeling system based on ANNs. The performances of the three alternatives
were compared after training, using the input training data and an external test
subset. The analytical case studied was the simultaneous direct determination
of oxidizable amino acids, a common application in animal feed analysis.
Figure 1 shows a typical voltammogram, corresponding to one of the
original input data. As can be observed, a degree of overlapping, together
with background oxidation, makes the estimation of the components an inter-
esting issue.

Conditioning of the Information

The initial data set used for building the calibration model consisted of an input
matrix, formed by 72 samples (columns) with 164 current values (rows) each, plus
the corresponding output matrix, formed by 72 samples with three concentration
values (rows) each.

Figure 1. Example of the overlapped-signal voltammogram of oxidizable amino acids
studied in this work. Concentrations on the curve are (Trp, Cys, Tyr) 5.2, 25, and
34 µM, respectively. Oxidation zones for the three considered amino acids are
indicated on the figure.

The first preprocessing was done with the DWT using the Daubechies
mother wavelet of fourth order and taking the decomposition to a fourth
level. The mother wavelet, order, and decomposition level were chosen
based on a compromise between the number of approximation coefficients
obtained at each decomposition level and the degree of similarity between
the original voltammogram and the one recovered with these coefficients, as
was done in a previous work (Moreno-Barón 2005). The goals were the smallest number
of coefficients and the highest similarity. With this processing, the size of the
input information was reduced from 164 points per voltammogram to only 16;
therefore, the information was compressed by a factor slightly larger than 10.
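The reduction from 164 points to 16 coefficients can be checked with simple
arithmetic. The sketch below assumes the usual length rule for a DWT with
non-periodic boundary extension, floor((n + L - 1) / 2) approximation coefficients
per level for a filter of L taps, and an 8-tap filter for the fourth-order
Daubechies wavelet. The boundary handling is not stated in the text, so it is an
assumption here, as are the function and variable names.

```python
def approximation_lengths(n_samples, filter_length, levels):
    """Number of approximation coefficients after each decomposition level.

    Assumes non-periodic signal extension, for which one filtering and
    downsampling step maps n samples to floor((n + filter_length - 1) / 2).
    """
    lengths = []
    n = n_samples
    for _ in range(levels):
        n = (n + filter_length - 1) // 2
        lengths.append(n)
    return lengths


DB4_FILTER_LENGTH = 8  # a Daubechies wavelet of order 4 has 2 * 4 = 8 filter taps

lengths = approximation_lengths(164, DB4_FILTER_LENGTH, 4)
compression_factor = 164 / lengths[-1]
```

Under these assumptions the four levels give 85, 46, 26, and 16 coefficients, and
164 / 16 = 10.25, in line with the factor "slightly larger than 10" quoted above.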
To reduce the input information employing PCA, the analysis showed that
more than 95% of input variance could be explained with just the first three
PCs; nevertheless, in order to compare the efficiency with the DWT com-
pression, the first 16 PCs were taken, including in this way more information
than strictly necessary.
Lastly, in the downsampling scheme, the input matrix was reduced by a factor of 10,
following the decimation procedure (Mitra 2001), in which a signal is resampled at
a lower rate after low-pass filtering. The filter has a cutoff frequency of π/D,
expressed as a normalized angular frequency, where D is the downsampling factor. In
this way, all three compression techniques yielded 16 rows. No change was needed to
the output matrix in any case.
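A minimal sketch of decimation, assuming a Hamming-windowed sinc low-pass filter
with cutoff π/D followed by keeping every D-th sample. The filter design, the edge
handling, and the 160-sample test signal are illustrative assumptions: the text
names only the cutoff and the factor, and does not state how 164 points were
aligned to give exactly 16.

```python
import math


def lowpass_taps(factor, half_width=20):
    """Hamming-windowed sinc FIR taps with cutoff pi/factor and unit DC gain."""
    taps = []
    for k in range(-half_width, half_width + 1):
        x = k / factor
        sinc = 1.0 if k == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.54 + 0.46 * math.cos(math.pi * k / half_width)
        taps.append(sinc * window / factor)
    gain = sum(taps)
    return [tap / gain for tap in taps]


def decimate(signal, factor, half_width=20):
    """Low-pass filter the signal, then keep one sample out of every `factor`."""
    taps = lowpass_taps(factor, half_width)
    n = len(signal)
    filtered = []
    for i in range(n):
        acc = 0.0
        for j, tap in enumerate(taps):
            # Repeat the edge samples so the filter is defined at the borders.
            idx = min(max(i + j - half_width, 0), n - 1)
            acc += tap * signal[idx]
        filtered.append(acc)
    return filtered[::factor]


flat_signal = [2.0] * 160          # hypothetical constant test signal
reduced = decimate(flat_signal, 10)
```

The constant signal is a convenient check: a low-pass filter with unit DC gain must
leave it unchanged, so only the length drops by the factor D.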
After the three compression alternatives were applied, the data were split into two
subsets for training and testing the neural networks. For the training process, 75%
of the total number of columns was taken, while the remaining columns were used for
testing. All data were normalized to the interval [−1, 1] to facilitate the
convergence of the learning algorithm. No internal validation subset was needed,
owing to the nature of the training algorithm used, as explained below.
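The split and the normalization can be sketched as follows. The text does not say
how the training columns were chosen or whether scaling was done per variable, so
the sequential split and the per-list min-max scaling below are assumptions, as are
the function names.

```python
def scale_to_unit_interval(values):
    """Min-max scale a list of numbers linearly onto [-1, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant variable carries no information; map it to the midpoint.
        return [0.0 for _ in values]
    return [2.0 * (v - lowest) / (highest - lowest) - 1.0 for v in values]


def split_columns(n_columns, train_fraction=0.75):
    """Return (training, testing) column indices; sequential split assumed."""
    n_train = round(n_columns * train_fraction)
    indices = list(range(n_columns))
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_columns(72)
scaled = scale_to_unit_interval([0.5, 2.0, 3.5])
```

With 72 voltammograms, a 75% share corresponds to 54 training columns and 18
testing columns.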

ANNs

All the trained networks were of the feedforward backpropagation type, identical in
structure and topology. Three single-output neural networks were used in parallel,
one for each modeled amino acid, as this scheme gives better results than a
triple-output network (Moreno-Barón 2005). Each ANN had a three-layer structure:
two hidden layers and one output layer. The first hidden layer had six neurons and
a tangent-sigmoidal transfer function, the second hidden layer had 24 neurons and a
logarithmic-sigmoidal transfer function, and the output layer had a single neuron
and a linear transfer function. The training algorithm used was Bayesian
regularization. This algorithm has the particularity that it avoids overfitting
without the need to monitor the degree of fit on an internal validation subset
(Demuth 2001).
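The topology described above (16 inputs, 6 and 24 hidden neurons, 1 output) can be
sketched as a plain forward pass. This is only an illustration of the structure:
the weights are random rather than trained, the initialization range is arbitrary,
and the function names are chosen here for readability.

```python
import math
import random

LAYER_SIZES = [16, 6, 24, 1]  # inputs, first hidden, second hidden, output


def tansig(x):
    """Tangent-sigmoidal transfer function, output in (-1, 1)."""
    return math.tanh(x)


def logsig(x):
    """Logarithmic-sigmoidal transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def purelin(x):
    """Linear transfer function."""
    return x


TRANSFER_FUNCTIONS = [tansig, logsig, purelin]


def init_network(sizes, seed=0):
    """Random (untrained) weights and biases, one (W, b) pair per layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one compressed voltammogram through the three layers."""
    activations = inputs
    for (weights, biases), transfer in zip(layers, TRANSFER_FUNCTIONS):
        activations = [
            transfer(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations[0]


def count_parameters(layers):
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


network = init_network(LAYER_SIZES)
n_parameters = count_parameters(network)
prediction = forward(network, [0.0] * 16)
```

Counting weights and biases gives 295 adjustable parameters per network, several
times the number of training columns, which helps explain why a regularized
training algorithm is attractive here.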

The convergence goal for training was a sum of squared errors (SSE) of 0.001 in 200
or fewer training epochs. SSE was calculated as follows:

$$\mathrm{SSE} = \sum_{j=1}^{N} \left( c_{\mathrm{expected},j} - c_{\mathrm{calculated},j} \right)^{2} \tag{1}$$

where c is the concentration value and N the number of training samples.

Training of the ANNs

The program for ANN training was devised to improve the generalization performance
of the network, i.e., its ability to correctly predict outputs for the testing
data. The Matlab program starts by loading the compressed information for training
and testing; then the network is initialized, and the training is performed with
the selected learning strategy. The network is trained until it reaches the
previously fixed SSE goal. After convergence is reached, the external test data are
fed to the network to check the modeling ability for the considered chemical
component. From the output values, a prediction error using the absolute values of
the residuals is calculated. This error is a percentage relative absolute error
(PRAE) defined by Eq. (2):

$$\mathrm{PRAE} = \frac{1}{M} \sum_{i=1}^{M} \frac{\left| c_{\mathrm{expected},i} - c_{\mathrm{calculated},i} \right|}{c_{\mathrm{expected},i}} \times 100 \tag{2}$$

where M is the number of testing outputs.
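As a small worked example of Eq. (2), the function below computes the PRAE for a
pair of expected and calculated concentration lists. The function name and the
sample numbers are hypothetical; only the formula follows the definition above.

```python
def prae(expected, calculated):
    """Mean of |expected - calculated| / expected over all outputs, in percent.

    Assumes every expected concentration is nonzero, as the definition
    divides by it.
    """
    if len(expected) != len(calculated):
        raise ValueError("expected and calculated must have the same length")
    relative_errors = [
        abs(e - c) / e for e, c in zip(expected, calculated)
    ]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Two hypothetical test outputs, each off by 10% of the expected value.
example_prae = prae([10.0, 20.0], [9.0, 22.0])
```

Because each residual is divided by its own expected value, errors at low
concentrations weigh as much as errors at high concentrations.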


At the beginning of the training process, it is assumed that the ANN does not
generalize correctly to the external test data. Therefore, the starting PRAE value
before the training process begins is arbitrarily set to 100%. If, at the end of
training, the calculated PRAE value is lower than the assumed value, the new value
is stored and taken as the figure of merit to improve in the next training process.
Figure 2 shows the flowchart of the strategy used. The training process is
iterative: if the new PRAE value does not improve on the previous one, the former
value is kept, and the training is reinitiated.
For the different trainings accomplished, PRAE values larger than 10% were
considered an indication of poor generalization capability of the network.
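The keep-the-best strategy described above can be sketched as a simple loop. The
callable `train_once` is a hypothetical stand-in for one complete initialization
and training run that returns a model together with its test PRAE, and the fixed
number of restarts is an assumption made so that the sketch terminates.

```python
def train_with_restarts(train_once, n_restarts, initial_prae=100.0):
    """Repeat training and keep the model with the lowest test PRAE so far.

    `train_once` is a hypothetical callable returning (model, prae) for one
    freshly initialized training run.
    """
    best_model, best_prae = None, initial_prae
    for _ in range(n_restarts):
        model, prae_value = train_once()
        if prae_value < best_prae:
            # The new run generalizes better: store it as the figure of merit.
            best_model, best_prae = model, prae_value
        # Otherwise the former value is kept and training is reinitiated.
    return best_model, best_prae


# Stub standing in for real training runs: four runs with preset PRAE values.
_stub_runs = iter([("run-1", 40.0), ("run-2", 12.5), ("run-3", 30.0), ("run-4", 8.0)])
best_model, best_prae = train_with_restarts(lambda: next(_stub_runs), 4)
```

In this stub the fourth run is retained because its PRAE of 8.0% is the lowest and
also falls under the 10% acceptance limit.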

Results Obtained with the DWT Compression

The alternative that showed the best generalization ability on the test data was
that of ANNs employing DWT-preprocessed voltammograms. Training for each network
lasted approximately 96 h to reach a minimum PRAE, and as an additional 48 h of
training did not improve this value, the best training achieved was taken as the
model, and the one then running was aborted.
The PRAE values obtained for the three oxidizable amino acids Trp, Cys, and Tyr
were 6.76%, 5.90%, and 8.34%, respectively, each one obtained with its own network.
To visualize the modeling ability of the trained networks, Fig. 3 shows comparison
graphs between obtained and expected concentration values, where a correct behavior
is clear.

Figure 2. Flowchart of the training process used for each artificial neural network
model.

Figure 3. Comparison of the obtained vs. expected results for the three considered
amino acids for DWT preprocessed input data. The dashed line corresponds to ideality
(y = x), and the solid line is the regression of the comparison data. Graphs on the
left correspond to training and those on the right to external testing.

Table 1. Linear regression parameters for the line (y = m·x + b) that best fits the
plots of obtained vs. expected results for the networks trained with the three data
sets obtained with the DWT, PCA, and downsampling processing techniques. The sets
were split into two subsets for training and testing. Uncertainty intervals were
calculated at the 95% confidence level.

              Training                               Testing
Amino acid    m                b                     m                b

DWT
Trp           0.998 ± 0.0023   1.1E-5 ± 2.6E-05      1.012 ± 0.0086   2.3E-4 ± 1.1E-03
Cys           1.002 ± 0.0018   8.9E-6 ± 3.7E-05      1.043 ± 0.0761   −6.4E-4 ± 1.7E-03
Tyr           0.994 ± 0.0061   3.6E-5 ± 1.2E-04      0.980 ± 0.126    1.7E-4 ± 2.7E-03
PCA
Trp           0.999 ± 0.0019   5.7E-6 ± 2.2E-05      0.897 ± 0.128    7.7E-4 ± 1.6E-03
Cys           0.987 ± 0.014    2.4E-4 ± 3.0E-04      0.865 ± 0.395    4.9E-3 ± 8.9E-03
Tyr           0.999 ± 0.0016   1.5E-5 ± 3.2E-05      0.970 ± 0.194    −4.2E-4 ± 4.4E-03
Downsampling
Trp           0.998 ± 0.0021   0.028 ± 2.5E-05       1.060 ± 0.0130   −0.96 ± 1.7E-03
Cys           0.999 ± 0.0012   0.020 ± 2.5E-05       1.112 ± 0.077    −1.4 ± 1.7E-03
Tyr           0.998 ± 0.0014   0.031 ± 2.8E-05       0.938 ± 0.121    −1.1 ± 2.8E-03

A measure of the modeling performance can be deduced from the linear regression of
these plots, the ideal case being a comparison slope of 1 and an intercept of 0. In
addition, correlation coefficients close to one indicate successful modeling.
The data of the linear regression analysis for training and testing subsets
corresponding to the three compression methods used in this work are presented in
Table 1. The table shows remarkable results of comparison slopes, intercepts, and
correlation coefficients for the training data, as expected for a correctly trained
network, and, more significantly, for the external test data, a good indicator of
the modeling ability.
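The comparison of obtained vs. expected values amounts to an ordinary least-squares
line fit. The sketch below computes the slope, intercept, and correlation
coefficient for such a comparison; the function name and the sample data are
hypothetical, and the 95% uncertainty intervals reported in Table 1 are not
reproduced here.

```python
import math


def fit_line(expected, obtained):
    """Least-squares slope m, intercept b, and correlation r for obtained = m*expected + b."""
    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(obtained) / n
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in obtained)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, obtained))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical predictions that scatter slightly around the ideal line y = x.
slope, intercept, r = fit_line([5.0, 10.0, 20.0, 30.0], [5.4, 9.7, 20.5, 29.6])
```

A model that predicts perfectly gives a slope of 1, an intercept of 0, and r = 1,
so the distance of the fitted values from these targets summarizes the bias of the
predictions.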

Results Obtained with the PCA Compression

Although some works in the literature achieved correct results (de Carvalho 2000;
Ensafi 2002), networks trained with PCA-compressed data did not respond well to the
test data, which must be due to the difficulty of the case studied. With the
exception of the network for Trp, which obtained a PRAE value of 9.33%, the
networks modeling Cys and Tyr yielded values larger than the 10% limit, namely
32.73% and 14.23%, respectively. Figure 4 shows the comparison graphs between
obtained and expected concentration values for both training and testing sets. The
corresponding data in Table 1 summarize the regression lines shown on the graphs,
which are clearly worse than those for the DWT case, especially regarding the
modeling ability for the external test.

Figure 4. Comparison of the obtained vs. expected results for the three considered
amino acids for PCA preprocessed input data. The dashed line corresponds to ideality
(y = x), and the solid line is the regression of the comparison data. Column at left
corresponds to training and column at right to testing.

Results Obtained with Downsampling

Networks employing simply downsampled information permitted better calibration
models to be built than those employing PCA. The PRAEs achieved with this
alternative were 6.6072%, 6.3986%, and 10.3958% for Trp, Cys, and Tyr,
respectively. Comparison graphs of obtained vs. expected concentration values for
each amino acid are shown in Fig. 5. Their closeness to ideality is again evaluated
from the linear regression lines, both for training and for testing subsets, which
are given in Table 1. Although surpassed by the DWT procedure, downsampling can be
taken as a very simple approach that yields acceptable models.
Table 2 summarizes the PRAEs obtained with the calibration models based on the
three different processing techniques used to compress the voltammograms. The
errors in the quantification of Trp and Cys were similar using either the
downsampling or the DWT method. However, for the quantification of Tyr, the DWT–ANN
combination performed better than the combination of downsampling and ANN. The DWT
and downsampling preprocessing techniques both retain from the observed signal the
information contained at low frequencies. When downsampling is applied, the
spectrum is reduced from −π/T ≤ ω ≤ π/T to −π/(DT) ≤ ω ≤ π/(DT). The parameter T is
the sampling interval, which takes a value of 0.00365 V, the size of the voltage
step used in our voltammetry tests, and D is the reduction factor. Although the
wavelet approximation coefficients and the downsampled voltammograms both have a
length of 16 data points, the spectrum of the downsampled voltammogram is only half
the size of the postprocessed voltammogram and has only eight frequency components.
Figure 5. Comparison of the obtained vs. expected results for the three considered
amino acids for downsampled preprocessed input data. The dashed line corresponds
to ideality (y = x), and the solid line is the regression of the comparison data.
Column at left corresponds to training and column at right to testing.

Table 2. PRAE values (%) obtained for the three alternatives of preprocessing and
ANN modeling

Preprocessing technique    PRAE1 (Trp)    PRAE2 (Cys)    PRAE3 (Tyr)

Wavelet                    6.76           5.90           8.34
PCA                        9.33           32.73          14.23
Downsampling               6.6            6.39           10.39

The DWT can be interpreted as a filter bank structured by levels. In the first
level, the original signal with length N and maximum spectral component frequency
(fmax) is low-pass filtered and high- or band-pass filtered and then downsampled by
a factor of two. The results of this process are subsignals halved in length and
bandwidth. The elements obtained from the low-pass filter after downsampling are
called approximation coefficients, and the ones obtained from the high-pass filter
after downsampling are called detail coefficients. By bisecting the bandwidth of
each approximation subsignal, the frequency resolution is doubled, i.e., it focuses
on a finer band of frequencies. Likewise, downsampling by a factor of two reduces
the number of time samples and hence decreases time resolution. This trade-off
between time and frequency resolution is the mark of the wavelet transform. Low
frequency components are more difficult to resolve in the frequency domain, and
thus finer frequency resolution is desirable for the voltammograms processed in
this work.
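The filter-bank view can be illustrated with the simplest wavelet. The sketch below
uses the two-tap Haar filters rather than the fourth-order Daubechies wavelet of
this work, purely to keep the code short; the signal and all names are
hypothetical.

```python
import math


def haar_step(signal):
    """One filter-bank level: low-pass and high-pass filtering, then downsampling by two."""
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approximation = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approximation, detail


def haar_decompose(signal, levels):
    """Apply the filter bank repeatedly to the approximation branch only."""
    approximation = list(signal)
    details = []
    for _ in range(levels):
        approximation, detail = haar_step(approximation)
        details.append(detail)
    return approximation, details


ramp = [float(v) for v in range(16)]      # hypothetical 16-sample signal
approx, details = haar_decompose(ramp, 2)

energy_in = sum(v * v for v in ramp)
energy_out = sum(v * v for v in approx) + sum(v * v for d in details for v in d)
```

Each level halves the length of the approximation branch (16, then 8, then 4 here),
while the total energy of the coefficients equals that of the signal because the
Haar filters are orthonormal. Keeping only the approximation coefficients, as done
in this work, discards the energy held by the detail branches.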
The determination of Tyr amino acid from voltammograms is better when
the compacted signal presented at the input of the ANN carries spectral and
temporal information about the original signal. The upper graph in Fig. 6
shows one raw voltammogram, and the lower graph shows the same voltam-
mogram reconstructed from the 16 approximation coefficients obtained by
DWT using Daubechies wavelet of fourth order. Notice the smoothing
effect on the reconstructed signal. Correlation between original and recon-
structed signals was 0.98. The downsampled voltammogram is also plotted
in the lower graph over the reconstructed signal, and some points of it lie
outside the recovered voltammogram, mainly at the end of the signal.

Figure 6. (Upper) One raw voltammogram of the oxidizable amino acids. (Lower)
The previous signal reconstructed after wavelet processing (solid line) and downsam-
pling (asterisks). Note that differences in postprocessed signals are more manifest for
the lower current values.

CONCLUSIONS

This work demonstrated the use of voltammetric electronic tongues for the
simultaneous quantitative determination of three electroactive substances, the
oxidizable amino acids Trp, Cys, and Tyr. With the proposed strategy, not only is a
very simple and direct measurement obtained, but correction of noise or baseline
effects is also achieved. The calibration model is built by first pretreating the
raw information, extracting significant features with the DWT, and then building
appropriate ANNs as calibration tools. The preprocessing of the voltammograms by
DWT has permitted the amount of information needed to represent their content to be
reduced by a factor of ca. 10, which is a large reduction considering the high
difficulty of the case studied: an overlapped combination of three compounds plus
noise and the oxidation of the medium containing them.
The presented work compared the performance of the proposed alternative with the
more classical use of PCA preprocessing plus ANNs, and with the simpler case that
downsamples the voltammogram to reduce the dimension of the information before
coupling it to ANNs. Calibration models built with PCA-pretreated information did
not perform well on testing. To ensure that an excess of PCs (16) was not the cause
of the poor generalization capability of the ANN, we trained networks with only the
first three PCs, which contained most of the input variance, and none of them
reached the SSE goal set for training. In the case of ANNs trained with wavelet
coefficients and downsampled voltammograms, the closeness of the PRAEs obtained
after training is explained by the energy retained from the original voltammograms
after processing. When DWT or downsampling is applied to a signal, the
low-frequency content is retained, and the original length of the signal is
reduced. In our case study, the lengths of the processed voltammograms were matched
for comparison purposes, but the energy contents were different. The bandwidth
retained by the DWT was wider than the one retained by downsampling. This is
explained if we consider that wavelet coefficients are obtained by comparing a
shifting base function against the signal under observation. This means that the
technique does not dismiss any segment of the original voltammogram that might
contain details closely related to the amino acids. In contrast, downsampling
removes points of the observed signal after low-pass filtering that might contain
amino-acid-related information.
Although the best calibration models are obtained when wavelet coeffi-
cients are used, downsampling can be taken as a very simple approach that
yields good models, probably due to the large ratio of measured data/
chemical diversity obtained in voltammograms and indirect advantages
supplied by ANN modeling.
Of the three alternatives evaluated, the DWT–ANN combination was the option that
performed best for the voltammetric electronic tongue. The experience presented in
this work is an interesting application of voltammetric electronic tongues for the
quantitative determination of chemical species.

REFERENCES

Apetrei, C., Rodriguez-Mendez, M.L., Parra, V., Gutierrez, F., and de Saja, J.A. 2004.
Array of voltammetric sensors for the discrimination of bitter solutions. Sens.
Actuators, B103: 145–152.
Artursson, T. and Holmberg, M. 2002a. Wavelet transform of electronic tongue data.
Sens. Actuators, B87: 379–391.
Artursson, T., Spångeus, P., and Holmberg, M. 2002b. Variable reduction on electronic
tongue data. Anal. Chim. Acta, 452: 255–264.
Borggaard, C. and Thodberg, H.H. 1992. Optimal minimal neural interpretation of
spectra. Anal. Chem., 64: 545–551.
Chau, F.-T., Liang, Y.-Z., Gao, J., and Shao, X.-G. 2004. Chemometrics. From Basics
to Wavelet Transform; Wiley-VCH: Weinheim.
Cocchi, M., Seeber, R., and Ulrici, A. 2003a. Multivariate calibration of signals by
WILMA. J. Chemom., 17: 512–527.
Cocchi, M., Hidalgo-Hidalgo-de-Cisneros, J.L., Naranjo-Rodríguez, I.,
Palacios-Santander, J.M., Seeber, R., and Ulrici, A. 2003b. Multicomponent analysis
of electrochemical signals in the wavelet domain. Talanta, 59: 735–749.
Collantes, E.R., Duta, R., Welsh, W.J., Zielinski, W.L., and Brower, J. 1997.
Preprocessing of HPLC trace impurity patterns by wavelet packets for pharmaceutical
fingerprinting using artificial neural networks. Anal. Chem., 69: 1392–1397.
Dathe, M. and Otto, M. 1996. Confidence intervals for calibration with neural
networks. Fresenius’ J. Anal. Chem., 356: 17–20.
de Carvalho, R.M., Mello, C., and Kubota, L.T. 2000. Simultaneous determination of
phenol isomers in binary mixtures by differential pulse voltammetry using carbon
fibre electrode and neural network with pruning as a multivariate calibration tool.
Anal. Chim. Acta, 420: 109–121.
Demuth, H. and Beale, M. 2001. Neural Network Toolbox. User’s Guide; The
Mathworks Inc: Natick, MA.
Despagne, F. and Massart, D.L. 1998. Neural networks in multivariate calibration.
Analyst, 123: 157R–178R.
Distante, C., Leo, M., Siciliano, P., and Persaud, K.C. 2002. On the study of feature
extraction methods for an electronic nose. Sens. Actuators, B87: 274–288.
Ensafi, A.A., Khayamian, T., and Atabati, M. 2002. Simultaneous voltammetric
determination of molybdenum and copper by adsorption cathodic differential pulse
stripping method using a principal component artificial neural network. Talanta,
57: 785–793.
Gallardo, J., Alegret, S., de Roman, M.A., Muñoz, R., Hernández, P.R., Leija, L., and
del Valle, M. 2003. Determination of ammonium ion employing an electronic tongue
based on potentiometric sensors. Anal. Lett., 36 (14): 2893–2908.
Gemperline, P.J. 1997. Rugged spectroscopic calibration for process control.
Chemometr. Intell. Lab. Syst., 39: 29 – 40.
Gutes, A., Cespedes, F., Alegret, S., and del Valle, M. 2005. Determination of phenolic
compounds by a polyphenol oxidase amperometric biosensor and artificial neural
network analysis. Biosens. Bioelectron., 20: 1668– 1673.
Holmberg, M., Eriksson, M., Krantz-Rülcker, C., Artursson, T., Winquist, F., Lloyd-
Spetz, A., and Lundström, I.. Second workshop of the second network on artificial
olfactory sensing (NOSE II). Sens. Actuators, B101: 213– 223.
Holmin, S., Spångeus, P., Krantz-Rülcker, C., and Winquist, F. 2001. Compression of
electronic tongue data based on voltammetry—a comparative study. Sens. Actuators,
B76: 455– 464.
Ionescu, R. and Llobet, E. 2002. Wavelet transform-based fast feature extraction from
temperature modulated semiconductor gas sensors. Sens. Actuators, B81: 289–295.
Krantz-Rülcker, C., Stenberg, M., Winquist, F., and Lundström, I. 2001. Electronic
tongues for environmental monitoring based on sensor arrays and pattern
recognition: a review. Anal. Chim. Acta, 426: 217–226.
Lavine, B.K. and Workman, J. 2002. Chemometrics. Anal. Chem., 74: 2763–2770.
Leung, A.K., Chau, F., and Gao, J. 1998. A review on applications of wavelet transform
techniques in chemical analysis: 1989–1997. Chemometr. Intell. Lab. Syst., 43:
165–184.
Mitra, S.K. 2001. Digital Signal Processing. A Computer-Based Approach, 2nd ed.;
McGraw-Hill: Boston, MA.
Moreno-Barón, L., Cartas, R., Merkoçi, A., Alegret, S., del Valle, M., Leija, L.,
Hernandez, P.R., and Muñoz, R. 2005. Application of the wavelet transform
coupled with artificial neural networks in a voltammetric electronic tongue. Sens.
Actuators B (in press), (DOI:10.1016/j.snb.2005.03.063).
Nie, L., Wu, S., and Wang, J. 2001. Continuous wavelet transform and its application to
resolving and quantifying the overlapped voltammetric peaks. Anal. Chim. Acta,
450: 185–192.
Palacios-Santander, J.M., Jimenez-Jimenez, A., Cubillana-Aguilera, L.M., Naranjo-
Rodriguez, I., and Hidalgo-Hidalgo-de-Cisneros, J.L. 2003. Use of artificial neural
networks, aided by methods to reduce dimensions, to resolve overlapped electrochemical
signals. A comparative study including other statistical methods. Mikrochim.
Acta, 142: 27–36.
Pravdová, V., Pravda, M., and Guilbault, G.G. 2002. Role of chemometrics for electrochemical
sensors. Anal. Lett., 35 (15): 2389–2419.
Shao, X. and Sun, L. 2001. An application of the continuous wavelet transform to
resolution of multicomponent overlapping analytical signals. Anal. Lett., 34:
267–280.
Shao, X., Leung, A.K., and Chau, F. 2003. Wavelet: a new trend in chemistry. Acc.
Chem. Res., 36: 276–283.
Shao, X., Pang, C., Wu, S., and Lin, X. 2000. Development of wavelet transform
voltammetric analyzer. Talanta, 50: 1175–1182.
Simons, J., Bos, M., and Van der Linden, W.E. 1995. Data processing for amperometric
signals. Analyst, 120: 1009–1012.
Winquist, F., Wide, P., and Lundström, I. 1997. An electronic tongue based on voltammetry.
Anal. Chim. Acta, 357: 21–31.
Zhang, X. and Jin, J. 2004. Wavelet derivative: application in multicomponent analysis
of electrochemical signals. Electroanalysis, 16: 1514–1520.
