www.fuelfirst.com
Abstract
This paper reports on the analysis of 45 gasoline samples of different qualities, namely octane number and chemical composition.
Data from gas chromatography and IR (FTIR) spectroscopy are used for gasoline quality prediction and classification. The
data were processed using principal component analysis (PCA) and the fuzzy C-means (FCM) algorithm. The data were then analyzed
using neural network paradigms: a hybrid neural network and a support vector machine (SVM) classifier. The IR spectra were compressed
and de-noised by discrete wavelet analysis. Using the hybrid neural network and the multi linear regression method (MLRM), excellent
correlation between the chemical composition of the gasoline samples and the predicted value of the octane number was obtained. About
100% correct classification was achieved for six categories of gasoline, each of different quality.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Gasoline classification; Octane number; Neural networks; Wavelet analysis; SVM classifier
1. Introduction

The antiknock performance of a gasoline is its ability to resist detonation, a form of abnormal combustion. Detonation occurs when the air–fuel mixture reaches a temperature and/or pressure at which it can no longer keep from self-igniting. Two types of abnormal combustion are common: the first is detonation, as previously mentioned, and the other is preignition.

Research octane number (RON) is determined in a standardized engine. This is a very expensive method but still the only accepted one. Very soon scientists began to look for a correlation between the tendency of hydrocarbon-based fuels to knock and the composition of these fuels. With the help of kinetic models, possible reaction mechanisms were established. Later on the calculation of octane numbers from IR and NIR spectral data was done. Octane number has also been correlated with carbon or hydrocarbon types [1,2] measured by gas chromatography, high performance liquid chromatography, or nuclear magnetic resonance [3]. The octane number prediction out of these models gave good and reproducible results, but only for fuels with a very similar composition. Most of the correlation models published were developed with multiple linear or nonlinear regression techniques, which require the user to specify a priori a mathematical model of the empirical correlation. The neural network approach is an alternative way of solving the problem. Unlike multiple linear or nonlinear regression techniques, which require a predefined empirical model, the neural network can identify and learn the correlative patterns between the input and corresponding output values once a training set is provided.

In this paper, the application of gas chromatography and IR (FTIR) data in combination with different pattern-recognition engines (PCA, FCM, neural networks) to predict the octane number of gasoline is reported. The self-organizing hybrid network and SVM network were used.

* Corresponding author. Tel./fax: +48 22 660 5358.
E-mail address: bruxz@ch.pw.edu.pl (K. Brudzewski).
0016-2361/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.fuel.2005.07.019
554 K. Brudzewski et al. / Fuel 85 (2006) 553–558
As the classifier, an artificial neural network of the SVM type was applied. The SVM solution of Vapnik [7] is known as a very good tool for classification problems, with excellent generalization ability. The SVM neural network structure is presented in Fig. 2. In contrast to classical neural networks, the SVM formulation of the learning problem leads to quadratic programming with linear constraints [8]. Basically, the SVM is a linear machine working in a high-dimensional feature space formed by the non-linear mapping of the n-dimensional input vector $x$ into a K-dimensional feature space ($K > n$) through the use of the function $\Psi(x)$. The equation of the hyperplane separating two different classes is given by

$$y(x) = w^T \Psi(x) = 0 \qquad (1)$$

where $\Psi(x) = [\Psi_0(x), \Psi_1(x), \ldots, \Psi_K(x)]^T$ with $\Psi_0(x) = 1$, and $w$ is the weight vector of the network. A data vector $x$ satisfying the condition $y(x) > 0$ belongs to one class, and a vector for which $y(x) < 0$ belongs to the opposite one. The most distinctive fact about SVM is that the learning task is simplified to quadratic programming by introducing so-called Lagrange multipliers. All operations in the learning and testing modes are done in SVM using so-called kernel functions. The kernel is defined as

$$K(x, x_i) = \Psi^T(x)\,\Psi(x_i) \qquad (2)$$

A polynomial kernel function was used in the calculations:

$$K(x, x_i) = (x^T x_i + \gamma)^p \qquad (3)$$

where $p = 5$ and $\gamma = 0.45$.

Although SVM separates the data into only two classes, the recognition of more classes is straightforward by applying either the 'one against one' or the 'one against all' method. The important advantage of the SVM approach is the transformation of the learning task into a quadratic programming problem. For this type of optimization, there exist many very effective learning algorithms, leading in almost all cases to the global minimum of the cost function and to the best possible choice of the parameter values of the neural network.

Forty-five unleaded gasoline samples were prepared in such a way that they covered a wide range of the gasoline properties (see Table 1). In total, 45 gasoline samples were available for the study: 35 gasoline samples were included in the training dataset and 10 in the test dataset. To cover the whole range of the gasoline fuels in this study, the training dataset included samples that contained at least the maximum or minimum values of inputs and outputs. The remaining 10 available gasoline samples were used as the test dataset.

Only the data from gas chromatography were used in the first experiment (see Table 1). The mass percentages of the five hydrocarbon types and ethanol identified by GC were used as neural network inputs and as independent variables in the linear regression equations. Octane numbers were used as the outputs for the neural network and as the dependent variables for the linear regression correlations. The prediction of the octane numbers was done using the hybrid neural network. What is particularly important in defining the hybrid network is the proper choice of the number of neurons in each layer. The size of the input layer is dictated by the number of input vector components. In the described case this number is equal to 6 (5 hydrocarbons + ethanol). The number of Kohonen neurons should reflect the complexity of the data distribution. After some experiments, 64 neurons in the Kohonen layer were found to be optimal. The input dimension of the MLP network is equal to the number of neurons in the Kohonen layer. The output dimension of the system is defined by the number of predicted parameters (only one parameter is used here: the octane number). The number of hidden neurons was adjusted experimentally to obtain the best accuracy of generalization. Experiments with learning different MLP structures have shown that in this case a hidden layer consisting of six neurons is optimal. So, the structure of the hybrid network (64–6–1) uses only one hidden layer of six neurons.

To estimate the qualitative and quantitative prediction of the octane number by the hybrid network, two kinds of

Table 1
Main properties of gasolines used in the research

Property                              Unit      Minimum value   Maximum value
Density at 20 °C                      g/cm3     0.7173          0.8077
Content of n-paraffin hydrocarbons    %, m/m    4.627           7.867
Content of i-paraffin hydrocarbons    %, m/m    18.339          52.151
Content of naphthenes                 %, m/m    1.938           26.275
Content of olefins                    %, m/m    0.029           22.713
Content of aromatic hydrocarbons      %, m/m    13.863          71.990
Content of EMTB                       %, m/m    0               0
Content of ethanol                    %, m/m    0               5.00
Content of ETBE                       %, m/m    0               0
Octane number                         –         81.4            99.7
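The layer sizes reported for the hybrid network (6 inputs, 64 Kohonen neurons, 6 hidden MLP neurons, 1 output) can be illustrated with a minimal forward-pass sketch. It only demonstrates how the dimensions fit together. The weights are random, and the Gaussian response of the Kohonen neurons, the tanh hidden units, the input scaling to [0, 1] and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random

N_INPUTS = 6    # 5 hydrocarbon types + ethanol (from the text)
N_KOHONEN = 64  # Kohonen neurons reported as optimal
N_HIDDEN = 6    # hidden MLP neurons reported as optimal
N_OUTPUTS = 1   # single predicted parameter: octane number


def random_matrix(rows, cols, rng):
    """Random placeholder weights; a trained network would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def kohonen_layer(x, centers, width=1.0):
    """Assumed activation: Gaussian of the distance to each neuron's center."""
    out = []
    for c in centers:
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        out.append(math.exp(-d2 / (2.0 * width ** 2)))
    return out


def dense(v, weights, biases, activation):
    """Fully connected layer: activation(W v + b)."""
    return [activation(sum(w * vi for w, vi in zip(row, v)) + b)
            for row, b in zip(weights, biases)]


def build_hybrid(seed=0):
    rng = random.Random(seed)
    return {
        "centers": random_matrix(N_KOHONEN, N_INPUTS, rng),
        "w_hidden": random_matrix(N_HIDDEN, N_KOHONEN, rng),
        "b_hidden": [0.0] * N_HIDDEN,
        "w_out": random_matrix(N_OUTPUTS, N_HIDDEN, rng),
        "b_out": [0.0] * N_OUTPUTS,
    }


def forward(net, x):
    """x: 6 composition values, assumed scaled to [0, 1]."""
    k = kohonen_layer(x, net["centers"])                         # 6 -> 64
    h = dense(k, net["w_hidden"], net["b_hidden"], math.tanh)    # 64 -> 6
    return dense(h, net["w_out"], net["b_out"], lambda s: s)     # 6 -> 1


def count_mlp_parameters():
    """Weights and biases of the 64-6-1 MLP part only."""
    return N_KOHONEN * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_OUTPUTS + N_OUTPUTS
```

Under these assumptions the 64–6–1 MLP part has 397 adjustable parameters, which is large relative to 35 training samples and may explain why the number of hidden neurons had to be tuned for generalization.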
Fig. 2. Structure of the SVM neural network.
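The polynomial kernel of Eq. (3), with the reported values p = 5 and γ = 0.45, and the sign rule for y(x) can be sketched as follows. The decision function uses the standard dual form of an SVM, a weighted sum of kernel values over the support vectors. The support vectors, coefficients and bias in the usage example are hypothetical toy values, not results from the paper.

```python
P = 5          # polynomial degree p from Eq. (3)
GAMMA = 0.45   # offset gamma from Eq. (3)


def poly_kernel(x, xi, gamma=GAMMA, p=P):
    """K(x, xi) = (x^T xi + gamma)^p."""
    dot = sum(a * b for a, b in zip(x, xi))
    return (dot + gamma) ** p


def decision_value(x, support_vectors, coeffs, bias=0.0):
    """Dual-form SVM output: sum_i coeff_i * K(x, x_i) + bias.

    coeff_i stands for (Lagrange multiplier * class label) of support vector i.
    """
    return sum(c * poly_kernel(x, sv)
               for c, sv in zip(coeffs, support_vectors)) + bias


def classify(x, support_vectors, coeffs, bias=0.0):
    """y(x) > 0 -> one class (+1), otherwise the opposite class (-1)."""
    return 1 if decision_value(x, support_vectors, coeffs, bias) > 0 else -1


# Hypothetical toy model with two support vectors of opposite class.
toy_svs = [[1.0, 0.0], [-1.0, 0.0]]
toy_coeffs = [1.0, -1.0]
label_a = classify([0.8, 0.1], toy_svs, toy_coeffs)
label_b = classify([-0.8, 0.1], toy_svs, toy_coeffs)
```

For six gasoline classes, a binary classifier of this kind would be trained for each pair of classes ('one against one') or for each class against the rest ('one against all').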
Table 3
Division of gasoline into six different categories of quality according to its octane number value

Table 4
Confusion matrix for SVM classification results, true vs. predicted (rows vs. columns) for the training dataset (35 samples)

Table 5
Confusion matrix for SVM classification results, true vs. predicted (rows vs. columns) for the test dataset (10 samples)

Octane number          Class number
and (class number)     1    2    3    4    5    6
78–81 (1)              0    0    0    0    0    0
82–86 (2)              0    1    0    0    0    0
87–91 (3)              0    0    4    0    0    0
92–94 (4)              0    0    0    2    0    0
95–97 (5)              0    0    0    0    1    0
98–100 (6)             0    0    0    0    0    2

Fig. 4. PCA (3D) plot of the IR spectra (axes PC1, PC2, PC3) and cluster centers 'C' from the fuzzy C-means method.
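The class boundaries and the test-set confusion matrix of Table 5 can be checked with a short sketch. The octane ranges and matrix entries are copied from the table. The rounding of non-integer octane numbers to the nearest integer before binning is an assumption, since the text does not state how values such as 81.4 are assigned, and the function names are illustrative.

```python
import math

# (lower bound, upper bound, class number), as listed in Table 5
CLASS_RANGES = [
    (78, 81, 1), (82, 86, 2), (87, 91, 3),
    (92, 94, 4), (95, 97, 5), (98, 100, 6),
]

# Test-set confusion matrix from Table 5 (rows: true class, columns: predicted)
TEST_CONFUSION = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 4, 0, 0, 0],
    [0, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 2],
]


def octane_class(octane_number):
    """Map an octane number to its class (assumes round-half-up binning)."""
    rounded = math.floor(octane_number + 0.5)
    for low, high, cls in CLASS_RANGES:
        if low <= rounded <= high:
            return cls
    raise ValueError(f"octane number {octane_number} is outside 78-100")


def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


test_accuracy = accuracy(TEST_CONFUSION)
```

All 10 test samples lie on the diagonal, which agrees with the classification rate of about 100% stated in the abstract.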