
Available online at www.sciencedirect.com

ScienceDirect
APCBEE Procedia 10 (2014) 2 – 6

ICESD 2014: February 19-21, Singapore

A Polynomial Curve Fitting Method for Baseline Drift Correction in the
Chromatographic Analysis of Hydrocarbons in Environmental Samples
Mauro Mecozzi^a
^a Laboratory of Chemometrics and Environmental Application, Via di Castel Romano 100, 00128 Rome, Italy

Abstract

In this paper we describe an algorithm for removing the baseline drift often observed in GC and HPLC plots of complex
environmental samples, which reduces the analytical accuracy of peak area measurements. The proposed
method determines the real baseline (i.e. baseline modeling) directly on the chromatographic plot of each analyzed sample
and then subtracts it from the time series signal of the chromatogram, thus reducing instrumental noise and
improving analytical accuracy. The proposed algorithm, based on a polynomial fitting estimation of the baseline, has been
applied to the GC analysis of the hydrocarbon content in marine sediments.

© 2014 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/3.0/).
Selection and peer review under responsibility of Asia-Pacific Chemical, Biological & Environmental Engineering Society
Keywords: Baseline Drift Correction, Chromatographic Analysis, Environmental Monitoring, Signal Processing

1. Introduction

Chromatographic techniques are widely used in all applications of environmental monitoring because they
allow complex samples to be analyzed in an automated way. However, in some specific analytical
cases, such as the gas chromatographic determination of the hydrocarbon content in marine
sediments, an evident baseline drift is still present even after the blank (i.e. solvent) subtraction (Fig. 1). This
is because the baseline drift is a random effect depending on several incidental and non-reproducible variables,
which can produce k different baseline shapes in k sequential chromatographic analyses [1]. Blank
subtraction alone may therefore be unreliable for solving baseline drift problems and for allowing the accurate
quantification of the chromatographic peaks. For this reason, many papers propose different signal
processing methods for modeling and then correcting baseline drifts in chromatographic analysis [1-4]. In this
paper, we propose a curve fitting method for baseline correction. This method models the
baseline by means of a polynomial curve, where the optimal polynomial degree is identified
by the Fisher-Snedecor F-test [5]. The method has been tested on the GC analysis of total hydrocarbon
content in marine sediments, performed according to a previously published method [6].

Corresponding author. Tel.: +30-06-50073287;
E-mail address: mauro.mecozzi@isprambiente.it; mauromecozzi2004@libero.it.

2212-6708 © 2014 The Authors. Published by Elsevier B.V.
doi:10.1016/j.apcbee.2014.10.003
[Figure 1: chromatogram; x-axis Retention Time (min), 7 to 37; y-axis Signal Intensity (mV), 0 to 2.7]

Fig. 1. Example of an evident baseline drift in the GC analysis of hydrocarbons in a marine sediment sample, already submitted to blank
subtraction. Each pair of arrows marks a time range where the absence of significant peaks makes it suitable for baseline estimation.

2. Basic theory of the proposed algorithm

The chromatographic plot of a baseline consists of z pairs of values $(x_j, y_j)$, where $y$ and $x$ are the signal
intensity and the acquisition (i.e. retention) time respectively, with $1 \le j \le z$. The $y$ signals form a
continuous and differentiable series over the $x_1$ to $x_z$ time range; given these properties, the relationship
$y = f(x)$ can be described by the Taylor series [7] as

$$y = \sum_{k=0}^{n} \frac{(x - x_0)^k}{k!}\, f^{(k)}(x_0) + R_n(x) \qquad (1)$$

where $x_1 \le x_0 \le x_z$, $f^{(k)}$ is the k-th derivative of the function $f(x)$, $n$ is the polynomial degree and $R_n(x)$
are the residuals (i.e. differences) between the original $f(x)$ values and the $f(x)$ values determined according to
Eq. 1. On the basis of Eq. 1, we can therefore assume a polynomial relationship between $y$ and $x$

$$y = a_0 + a_1 x_j + a_2 x_j^2 + a_3 x_j^3 + \dots + a_n x_j^n, \qquad 1 \le j \le i \qquad (2)$$
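As a brief worked step that the text does not spell out, truncating Eq. 1 at n = 2 and collecting powers of x suggests how the Taylor terms map onto the coefficients of Eq. 2. The symbols are those defined above; the choice n = 2 is only for illustration.

```latex
\begin{align*}
y &\approx f(x_0) + f'(x_0)\,(x - x_0) + \tfrac{1}{2} f''(x_0)\,(x - x_0)^2 \\
  &= \underbrace{f(x_0) - f'(x_0)\,x_0 + \tfrac{1}{2} f''(x_0)\,x_0^2}_{a_0}
   + \underbrace{\bigl(f'(x_0) - f''(x_0)\,x_0\bigr)}_{a_1}\,x
   + \underbrace{\tfrac{1}{2} f''(x_0)}_{a_2}\,x^2
\end{align*}
```

In practice the derivatives of f are unknown, which is why the coefficients are estimated by regression rather than computed from these expressions.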

We applied Eq. 2 to describe the y vs. x relationship in those parts of the chromatographic plot where peaks are absent (Fig. 1).
The first step of the algorithm is the selection of the i pairs of y, x values to be used for baseline
estimation by means of Eq. 2. After collecting all these $y_i$, $x_i$ pairs in a new y, x
data file, we submit it to polynomial regression analysis according to Eq. 2. To select the
optimal polynomial order n, we use the Fisher-Snedecor F-test [5]. This test consists of the ratio between
the root mean square values of the polynomial fits of order k and k+1. When this ratio tends to a minimal value,
so that F ≈ 1, we have identified the optimal polynomial degree for the estimation of a reliable
baseline.
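The degree selection described above can be sketched in Python with NumPy. This is not the author's MATLAB routine. The sketch forms the ratio from residual mean squares (sum of squared residuals divided by the degrees of freedom), as an F ratio is conventionally built. The names `residual_mean_square`, `select_degree`, `max_degree` and the tolerance `f_tol` are illustrative assumptions; in particular, `f_tol` stands in for a tabulated critical F value.

```python
import numpy as np
from numpy.polynomial import Polynomial


def residual_mean_square(x, y, degree):
    """Fit a polynomial of the given degree; return (fit, residual mean square)."""
    fit = Polynomial.fit(x, y, degree)    # least squares on a rescaled x axis
    residuals = y - fit(x)
    dof = len(x) - (degree + 1)           # points minus fitted coefficients
    return fit, float(np.sum(residuals**2) / dof)


def select_degree(x, y, max_degree=8, f_tol=1.5):
    """Return the lowest degree k whose ratio s2(k) / s2(k+1) is close to 1.

    f_tol is an illustrative tolerance, not a value from the paper; a tabulated
    critical F value for the two degrees of freedom could be used instead.
    """
    s2 = {k: residual_mean_square(x, y, k)[1] for k in range(1, max_degree + 2)}
    for k in range(1, max_degree + 1):
        if s2[k] / s2[k + 1] < f_tol:     # going to k+1 no longer helps
            return k
    return max_degree
```

Here `x` and `y` are NumPy arrays holding only the i peak-free points. `Polynomial.fit` rescales the time axis internally, which keeps higher-order fits numerically stable. If the coefficients $a_0 \dots a_n$ in the original time units are needed, `fit.convert().coef` returns them in ascending order.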
As i < z, in the final step of the procedure we recalculate the baseline signal $y_b$ at all z acquisition times
according to

$$y_b = a_0 + a_1 x_j + a_2 x_j^2 + a_3 x_j^3 + \dots + a_n x_j^n, \qquad 1 \le j \le z \qquad (3)$$

where $a_0, a_1, a_2, \dots, a_n$ are the coefficients previously determined by Eq. 2. This equation is analogous to
Eq. 2, but here the x terms include all the z acquisition times, so as to model the full
baseline. Finally, the $y_b$ series determined by Eq. 3 is subtracted from the original y signal intensities of
the chromatogram to perform the baseline drift correction. This operation is called detrending [8, 9] because it
minimizes the non-constant, curvilinear trend present in the baseline drift (Figs. 2 and 3). Clearly,
the larger the number of $y_i$, $x_i$ pairs selected for the baseline estimation (according to Fig. 1 and Eq. 2),
the more accurate the baseline modelling and the resulting baseline drift correction.
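The fit-then-subtract procedure of Eqs. 2 and 3 might look as follows in Python with NumPy. This is a minimal sketch under stated assumptions, not the original MATLAB routine. The function name `correct_baseline` and the `peak_free_windows` argument (a list of start/stop retention times, chosen by the analyst as in Fig. 1) are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial


def correct_baseline(x, y, peak_free_windows, degree):
    """Fit the baseline on peak-free windows (Eq. 2), evaluate it at every
    acquisition time (Eq. 3) and subtract it from the signal (detrending)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Boolean mask of the i points lying inside any peak-free window.
    mask = np.zeros(x.shape, dtype=bool)
    for start, stop in peak_free_windows:
        mask |= (x >= start) & (x <= stop)

    fit = Polynomial.fit(x[mask], y[mask], degree)  # coefficients from i points
    baseline = fit(x)                               # y_b at all z points
    return y - baseline, baseline
```

A call such as `corrected, baseline = correct_baseline(time, signal, [(7, 15), (25, 37)], 4)` returns both series, so the modelled baseline can be overlaid on the raw chromatogram before the corrected signal is used; the window limits and the degree in this call are placeholders.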

3. Methods and materials

The determination of hydrocarbons in marine sediments, including hydrocarbon extraction, purification
and GC determination, was performed according to a previous paper [6]. Chromatograms were saved as ASCII
files and submitted to the above baseline correction procedure by means of an in-house MATLAB (MathWorks,
Inc., MA, USA) routine.
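The layout of the exported ASCII files is not described in the paper. Assuming a simple two-column export (retention time in minutes, signal in mV, whitespace separated, optional comment lines starting with `#`), reading a chromatogram in Python could be sketched as below; `load_chromatogram` and the demo data are hypothetical.

```python
import io

import numpy as np


def load_chromatogram(source):
    """Read a two-column ASCII export: retention time (min), signal (mV).

    The column layout is an assumption; lines starting with '#' are skipped.
    """
    data = np.loadtxt(source, comments="#", ndmin=2)
    return data[:, 0], data[:, 1]


# Demo with an in-memory file; a path to a real export works the same way.
demo = io.StringIO("# time_min signal_mV\n7.00 0.21\n7.05 0.22\n7.10 0.25\n")
time_min, signal_mv = load_chromatogram(demo)
```

Instrument software often adds header blocks or uses other delimiters, so the `comments` argument (or `skiprows` and `delimiter`) would need adjusting to the actual export.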

4. Results and discussion

Figs. 2 and 3 report an example of the proposed baseline correction method. In Fig. 2 we can see that
fourth (left plot) and sixth (right plot) order polynomial curves are reliable for baseline estimation because
they match quite well the zones of the chromatographic plot where significant peaks are absent. In any case,
the identification of the optimal order is not a visual or arbitrary operation, because it is supported by
the Fisher-Snedecor F-test.
Fig. 3 shows the final chromatographic plot after subtraction of the baseline determined according to Eqs.
2 and 3. In the final chromatographic plots, most peaks start from signals
close to zero.
The examples of Fig. 2 concerning the identification of the optimal
polynomial degree are not a general rule, because more complex baselines can require higher-order polynomial
functions. These cases are not reported here for the sake of brevity. This is further evidence of the random
character of baseline drift, which always depends on the applied experimental conditions [1-3].

[Figure 2: chromatogram with fitted baselines; x-axis Retention Time (min), 7 to 37; y-axis Signal Intensity (mV), 0 to 2.7]

Fig. 2. Example of baseline estimation by means of second (normal line) and fourth (bold line) order polynomial curve fitting.
[Figure 3: two panels; y-axes Signal intensity (mV)]

Fig. 3. Baseline-corrected (grey) chromatograms compared with the original (black) chromatograms after subtraction of a fourth (left) and a
sixth (right) order polynomial function for baseline modeling (see Eqs. 2 and 3).

The main advantage of the proposed method is the improved analytical accuracy of
quantitative peak area determinations. In fact, when we compare the signal-to-noise (S/N) ratios of
chromatograms submitted to the conventional blank subtraction treatment with those of chromatograms submitted to
the baseline drift correction, we observe a significant enhancement of this value. Table 1 reports some
examples of the enhanced signal-to-noise ratios after baseline drift correction.

Table 1. Comparison of S/N ratio values for the GC determination of hydrocarbons between conventional blank subtraction
chromatograms (i.e. uncorrected) and baseline corrected samples. S/N values are determined according to Ref. [10].

Sample      Uncorrected   Corrected
Sample A    1.65          51
Sample B    2.2           121
Sample C    3.7           250
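To put the enhancement in Table 1 into numbers, the fold gain in S/N can be computed directly from the tabulated values. The short Python sketch below reads the commas in the uncorrected column as decimal separators, which is an assumption about the original typesetting; `snr_gain` is an illustrative helper, not part of the paper.

```python
# S/N values transcribed from Table 1 as (uncorrected, corrected).
table_1 = {
    "Sample A": (1.65, 51.0),
    "Sample B": (2.2, 121.0),
    "Sample C": (3.7, 250.0),
}


def snr_gain(uncorrected, corrected):
    """Fold improvement of the signal-to-noise ratio after correction."""
    return corrected / uncorrected


gains = {name: round(snr_gain(*pair), 1) for name, pair in table_1.items()}
print(gains)  # {'Sample A': 30.9, 'Sample B': 55.0, 'Sample C': 67.6}
```

On these three samples the correction therefore raises S/N by roughly 30 to 70 times.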

5. Conclusion

The method presented here allows the baseline correction of strongly drifted chromatographic
signals, and hence a more accurate quantitative determination of the constituents of complex environmental
samples. Future developments of this study involve the preparation of analytical software for the full
automation of the proposed baseline method.

Acknowledgements

The author is grateful to Mr. Claudio Micocci of the GLM Scientifica S.r.l. in Guardea (Terni, Italy) for his
instrumental assistance.

References

[1] Gan F., Ruan G., Mo J. Baseline correction by improved iterative polynomial fitting with automatic threshold. Chemom. Intell. Lab.
Syst., 2006, 82, 59-65
[2] Baek S., Park A., Kim J., Shen A., Hu J. A simple background elimination method for Raman spectra. Chemom. Intell. Lab. Syst.,
2009, 98, 24-30
[3] de Rooi J.J., Eilers P.H.C. Mixture models for baseline estimation. Chemom. Intell. Lab. Syst., 2012, 117, 56-60
[4] Komsta Ł. Comparison of several methods of chromatographic baseline removal with a new approach based on quantile
regression. Chromatographia, 2011, 73, 721-731
[5] Miller J.M., Miller J.N. Statistics and Chemometrics for Analytical Chemistry, Ellis Horwood Series, fifth edition, Chichester,
UK, 2005
[6] Pietroletti M., Mattiello S., Moscato F., Oteri F., Mecozzi M. One step ultrasound extraction and purification method for the
analysis of hydrocarbons in marine sediments: application to the monitoring of Italian coasts. Chromatographia, 2012, 75, 961-971
[7] http://homepages.ulb.ac.be/~dgonze/TEACHING/taylor.pdf
[8] Engel J., Gerretzen J., Szymańska E., Jansen J.J., Downey G., Blanchet L., Buydens L.M.C. Breaking with trends in
pre-processing? Trends Anal. Chem., 2013, 50, 96-106
[9] Azzouz T., Puigdoménec A., Araguy M., Tauler R. Comparison between data pre-treatment methods in the analysis of forage
samples using near-infrared diffuse reflectance spectroscopy and partial least squares multivariate calibration method. Anal. Chim. Acta,
2003, 494, 121-134
[10] Coelho A.M., Estrela V.V. EM-based mixture models applied to video event detection, in "Principal Component Analysis -
Engineering Applications", P. Sanguansat Ed., InTech, Rijeka, Croatia, 2012
