Keywords: Misfire; Knock; Engine condition monitoring; Decision tree; Feature selection

Abstract

Misfire detection in an internal combustion engine is crucial for maintaining optimum performance throughout its service life and for reducing emissions. The vibration of the engine block contains indirect information regarding the condition of the engine. Misfire detection can be achieved by processing the vibration signals acquired from the engine using a piezoelectric accelerometer. This hidden information can be decoded using statistical parameters such as kurtosis, standard deviation, mean and median. This paper illustrates the use of a decision tree as a tool for both feature selection and classification. The effects of dimension, minimum number of objects and confidence factor on classification accuracy are studied and reported in this work.
1. Introduction

The rapid growth of transportation systems, mostly using internal combustion (IC) engines, has led to a wide range of challenges demanding immediate attention. Maintenance and condition monitoring of an IC engine is a crucial activity, required to ensure optimum performance and minimum load on the environment by restricting emissions to bare minimum levels. Misfire in a spark ignition IC engine is a major factor leading to undetected emissions and performance reduction. According to the California Air Resources Board (CARB) regulations (California Air Resources Board, 1991), engine misfire means "lack of combustion in the cylinder due to absence of spark, poor fuel metering, poor compression, or any other cause".

The engine diagnostic system of the vehicle should be designed to monitor misfire continuously because, even with a small number of misfiring cycles, engine performance degrades, hydrocarbon emissions increase and drivability suffers (Lee & Rizzoni, 1995). A misfiring cylinder also results in a large quantity of unburned fuel being sent through the catalytic converter, which shortens its service life through high-temperature exposure (Klenk, Moser, Mueller, & Wimmer, 1993) and also contributes to significant air pollution.

Several methods of misfire detection have been proposed (Klenk et al., 1993):

a. Monitoring catalyst temperature at the exhaust. This method is unacceptable since the catalyst temperature at the exhaust does not rise significantly in the case of low-frequency misfire.
b. Monitoring the oxygen sensor signal in the exhaust. This method is not encouraging since the momentary increase in oxygen level for a single misfire might not evoke a good response from the sensor, and it is even more challenging at higher speeds.
c. In-cylinder pressure monitoring. This method is very reliable and accurate, as the instantaneous mean effective pressure of each cylinder can be calculated in real time. However, the cost of fitting each cylinder with a pressure transducer is prohibitively high.
d. Evaluation of crankshaft angular velocity fluctuations. Extensive studies have been carried out using measurement of instantaneous crank angle speed (Cavina & Ponti, 2003; Chen, Wu, Hsieh, & Tsai, 2005; Kuroda, Shimasaki, Igarashi, Maruyama, & Asaka, 1995; Osburn, Kostek, & Franchek, 2006; Tinaut, Melgar, Laget, & Dominguez, 2007) and diverse techniques have been developed to predict misfire. These methods call for a high-resolution crank angle encoder and associated infrastructure capable of identifying minor changes in angular velocity due to misfire. The application of these techniques becomes more challenging under continuously varying operating conditions involving random variation in acceleration, coupled with the effect of the flywheel, which tends to smooth out minor variations in angular velocity at higher speeds. Fluctuating load torque applied to the crankshaft through the drive train poses additional hurdles in decoding the misfire signals.
… stored in the files. Five cases were considered: normal running (without any fault) and the engine with any one cylinder misfiring individually (i.e. the first, second, third or fourth). All the misfire events were simulated at 1000 rpm, 1500 rpm and 2000 rpm. The rated speed of the engine electrical generator set is 1500 rpm. Time domain plots of the signals at 1500 rpm are shown in Fig. 3a–e.
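As a purely illustrative sketch, the acquired signals could be organised for classification as below. The file layout, file names and the load_dataset helper are hypothetical assumptions, not details taken from the experimental setup; only the five condition classes and three speeds follow the text.

    import numpy as np

    # Hypothetical organisation: one file per condition per speed, each holding
    # many fixed-length windows of accelerometer samples.
    CLASSES = ["good", "c1m", "c2m", "c3m", "c4m"]   # normal + misfire in cylinders 1-4
    SPEEDS = [1000, 1500, 2000]                      # engine speeds used (rpm)

    def load_dataset(speed_rpm):
        """Collect (windows, labels) for one engine speed."""
        windows, labels = [], []
        for label in CLASSES:
            data = np.load(f"data/{speed_rpm}/{label}.npy")  # shape (n_windows, n_samples)
            windows.append(data)
            labels.extend([label] * len(data))
        return np.vstack(windows), np.array(labels)

    X_raw, y = load_dataset(1500)   # e.g. the rated-speed data set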
Fig. 3b. Amplitude plot – cylinder two misfire.

4. Definition of statistical features used

(a) Standard deviation: This is a measure of the energy content of the vibration signal. It is the root mean square (RMS) deviation of the signal values from their arithmetic mean. The following formula was used for the computation of standard deviation:

\text{Standard deviation}\;(\sigma) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2} \qquad (1)
(b) Standard error: The standard error of a statistic is the standard deviation of the sampling distribution of that statistic. Standard errors are important because they reflect how much sampling fluctuation a statistic will show. The inferential statistics involved in the construction of confidence intervals and significance testing are based on standard errors. The standard error of a statistic depends on the sample size; in general, the larger the sample size, the smaller the standard error. It is a measure of the content of error …

Fig. 3c. Amplitude plot – cylinder three misfire.
(c) Sample variance: Sample variance is the square of the standard deviation:

\text{Sample variance} = \sigma^2 \qquad (3)

(d) Kurtosis: Kurtosis indicates the flatness or the spikiness of the signal. Its value is very low for good engine performance without knock and high for misfire:

\text{Kurtosis} = \left\{\frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^4\right\} - \frac{3(n-1)^2}{(n-2)(n-3)} \qquad (4)

where s is the sample standard deviation.

(e) Skewness: Skewness characterizes the degree of asymmetry of a distribution around its mean. Positive skewness indicates a distribution with an asymmetric tail extending towards more positive values; negative skewness indicates a distribution with an asymmetric tail extending towards more negative values. It is given by

\text{Skewness} = \frac{n}{(n-1)(n-2)}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^3 \qquad (5)

In the building phase, the training sample sets with discrete-valued attributes are recursively partitioned until all the records in a partition have the same class. The tree has a single root node for the entire training set. A new node is added to the decision tree for every partition. For a set of samples in a partition S, a test attribute X is selected for further partitioning the set into S1, S2, S3, …, SL. For each new set S1, S2, S3, …, SL new nodes are created and these are added to the decision tree as children of the node for S. Further, the node for S is labeled with test X, and partitions S1, S2, S3, …, SL are recursively partitioned. When all the records in a partition arrive with identical class labels, further partitioning is stopped and the corresponding leaf is labeled with that class. The construction of the decision tree strongly depends on how a test attribute X is selected. The C4.5 algorithm uses an information entropy evaluation function as the selection criterion.

The entropy evaluation function is arrived at through the following steps:

Step 1: Calculate Info(S) to identify the class in the training set S:

\text{Info}(S) = -\sum_{i=1}^{K}\frac{\text{freq}(C_i, S)}{|S|}\,\log_2\!\left(\frac{\text{freq}(C_i, S)}{|S|}\right) \qquad (6)
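As a concrete illustration, the features of Section 4 can be computed per signal window with the Python sketch below. Equations (1) and (3)–(5) are implemented directly; since Eq. (2) is not reproduced above, the standard error is taken in its usual form s/sqrt(n), which is an assumption rather than the paper's exact expression.

    import numpy as np

    def statistical_features(x):
        """The 11 statistical features of Section 4 for one vibration window."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        mean = x.mean()
        sigma = np.sqrt(np.sum((x - mean) ** 2) / n)           # Eq. (1)
        s = x.std(ddof=1)                                      # sample standard deviation
        z = (x - mean) / s
        kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) * np.sum(z ** 4) \
                   - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))    # Eq. (4)
        skewness = n / ((n - 1) * (n - 2)) * np.sum(z ** 3)    # Eq. (5)
        return {
            "mean": mean,
            "standard error": s / np.sqrt(n),                  # assumed form; Eq. (2) not shown
            "median": np.median(x),
            "standard deviation": sigma,
            "sample variance": sigma ** 2,                     # Eq. (3)
            "kurtosis": kurtosis,
            "skewness": skewness,
            "range": x.max() - x.min(),
            "minimum": x.min(),
            "maximum": x.max(),
            "sum": x.sum(),
        }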
The GainRatio(X) compensates for the weak point of Gain(X), which represents the quantity of information provided by X in the training set. Therefore, an attribute with the highest GainRatio(X) is taken as the root of the decision tree.
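A minimal sketch of this selection criterion follows. Info(S) implements Eq. (6); the intermediate quantities Gain(X) and SplitInfo(X) are not reproduced in the text above, so they are written here in their standard C4.5 textbook forms.

    import math
    from collections import Counter

    def info(labels):
        """Entropy Info(S) of a partition, Eq. (6)."""
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gain_ratio(parent_labels, partitions):
        """GainRatio(X) for a test X splitting S into subsets S1..SL
        (standard C4.5 definitions of Gain and SplitInfo)."""
        n = len(parent_labels)
        gain = info(parent_labels) - sum(len(p) / n * info(p) for p in partitions)
        split_info = -sum((len(p) / n) * math.log2(len(p) / n) for p in partitions)
        return gain / split_info if split_info else 0.0

    # e.g. a split that isolates a misfire class perfectly scores 1.0:
    # gain_ratio(["good"]*5 + ["c2m"]*5, [["good"]*5, ["c2m"]*5])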
6.2. Pruning the decision tree

It is observed that a training set in the sample space leads to a decision tree which may be too large to be an accurate model; this is due to over-training or over-fitting. Such a fully grown decision tree needs to be pruned by removing the less reliable branches to obtain better classification performance over the whole instance space, even though it may induce a higher error over the training set.

The C4.5 algorithm uses an error-based post-pruning strategy to deal with the issue of over-training. For each classification node the C4.5 algorithm calculates a predicted error rate based on the total aggregate of misclassifications at that particular node. The error-based pruning technique results in the replacement of vast sub-trees in the classification structure by singleton nodes or simple branch collections, if these actions contribute to a drop in the overall error rate of the root node.

6.3. Application of decision tree for misfire detection

The obtained samples are divided into two parts: a training set and a testing set. The training set is used to train the classifier and the testing set is used to validate its performance. Tenfold cross-validation is deployed to evaluate classification accuracy. The training process of the C4.5 algorithm using the training set with continuous-valued attributes is as follows:

(1) The tree starts as a single node representing the samples from the training set.
(2) If the samples are all of the same class, then the node becomes a leaf and is labeled with that class.
(3) Otherwise, the algorithm discretises every attribute to select the optimal threshold. It uses the entropy-based measure called information gain (discussed in Section 6.1) as the heuristic for selecting the attribute that will best separate the given samples into individual classes.
(4) For each best discrete interval of the test attribute, a branch is created and the samples are partitioned accordingly.
(5) The algorithm uses the same process recursively to build a decision tree for the samples at each partition.
(6) The recursive partitioning stops only when one of the following conditions is satisfied:
    (a) All the samples for a given node belong to the same class.
    (b) There are no remaining attributes on which the samples may further be partitioned.
    (c) There are no samples for the branch test attribute. In such a case, a leaf is created with the majority class in the available sample.
(7) A pessimistic error pruning method (discussed in Section 6.2) is used to prune the developed tree. This operation improves the robustness and accuracy of the tree.

In the present work, the confidence factor assigned is 0.2. The confidence factor is used for pruning the tree (smaller values incur more pruning). The results are discussed below. A typical decision tree generated by the C4.5 algorithm is shown in Fig. 4 under Section 7.1.
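The training and tenfold cross-validation procedure can be sketched as below. This uses scikit-learn's DecisionTreeClassifier, which implements CART rather than C4.5 (the paper's results correspond to a C4.5/J48-style implementation), so it is an analogous pipeline and not a reproduction of the authors' tool; statistical_features and (X_raw, y) come from the earlier sketches.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    # One row of 11 statistical features per vibration window.
    X = np.array([list(statistical_features(w).values()) for w in X_raw])

    # min_samples_leaf plays the role of the "minimum number of objects";
    # scikit-learn prunes via ccp_alpha instead of a C4.5 confidence factor.
    clf = DecisionTreeClassifier(min_samples_leaf=2, random_state=0)
    scores = cross_val_score(clf, X, y, cv=10)      # tenfold cross-validation
    print(f"mean accuracy: {scores.mean():.3f}")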
Statistical features defined in Section 4 form the input to the algorithm. Kurtosis, maximum, etc., in the ovals represent the features and the rectangles represent classes. The labels found within the rectangles, but outside the parentheses, represent the class labels, e.g. c2m, c3m, etc. Two additional numbers are present within the parentheses; the first of these indicates the number of data points that can be classified using that feature set, and the second represents the number of misfit data points. The absence of the second number indicates that there are no misfit data. Further, if the first number is insignificantly small compared to the total number of samples, the corresponding features can be considered outliers and ignored. The other features that appear at the nodes of the decision tree are in descending order of importance. Only features that contribute to the classification appear in the decision tree and others do not.
Features that have less discriminating capability can be consciously discarded by fixing the threshold. The algorithm inherently makes use of this concept in selecting good features: only features with good discriminating capability will appear in the decision tree.

The next role played by the decision tree is classification. With the selected features the classification was carried out. The outcome is presented in the next section.

Fig. 5a. Effect of dimension (1000 rpm).

7. Results and discussion

The study of misfire classification using the decision tree is discussed in the following phases:

1. Dimensionality reduction (feature selection).
2. Design of classifier (C4.5 algorithm).
3. Validation of the classifier.
… kept aside for testing purposes, and the remaining 100 signals were used for building the classifier.

7.1. Dimensionality reduction

Dimensionality reduction is the process of reducing the number of input features required for classification, so as to reduce the computational effort. From the signals obtained at 1500 rpm, 11 statistical features as explained in Section 4 were extracted. All 11 features were given as input to the C4.5 algorithm and the corresponding decision tree was generated. The resulting decision tree is shown in Fig. 4. The decision tree gives an estimate of the worth of each feature by representing them in order of importance. The topmost node (feature) appearing in the tree, the root, contains maximum information about the signal, while the remaining nodes in the branches give the same in decreasing order of importance. From Fig. 4 the order of importance has been noted down for all the features that appear in the decision tree. The missing features do not carry significant information for classification; however, they are included randomly to complete the list in the study relating the effect of dimension to classification accuracy.

The dimensionality reduction was carried out as explained below. Initially the root feature of the tree (standard error) alone was considered for classification and the classification accuracy was noted down. In the next step classification was performed using the root node along with the next most prominent feature (standard error and sample variance) and the classification accuracy was noted down. Next, the root node along with the next two prominent features was considered for classification. Similarly the top four, top five, etc., features were considered and the corresponding classification accuracies were noted down. Fig. 5b shows the plot of the number of features versus classification accuracy. From the graph (Fig. 5b) it is evident that the classification accuracy gradually increases as the number of features goes from one to three (79.14–94.1%) and then shows a minor reduction when the number of features increases beyond three. It again reaches the maximum classification accuracy of 94.14% when seven features are considered. Using a smaller number of features reduces the computational load considerably; hence in this work the first three features in their order of importance have been selected. A similar exercise was carried out for 1000 rpm and 2000 rpm and the results are plotted in Fig. 5a and c, respectively. One can observe a pattern similar to that found in Fig. 5b.

Fig. 5b. Effect of dimension (1500 rpm).
Fig. 5c. Effect of dimension (2000 rpm).

As explained above, the feature selection has been carried out using the decision tree (C4.5 algorithm). Since an algorithm does the feature selection, it is necessary to validate the result. The scatter plot is used as a tool for validating the selection; a scatter plot depicts the ability of a feature to classify. It can be observed from the scatter plots given in Fig. 6d–i that the points are very cluttered and cannot be a good selection for classification. Kurtosis, given in Fig. 6a, has steep variations and hence is not a good selection. Since variance is the square of the standard deviation, no additional information can be obtained by selecting both these parameters; doing so may only increase the computational effort.
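The incremental procedure described above can be sketched as follows. The ranking mirrors the text for the first two features (standard error, then sample variance); the remainder of the ordering is illustrative, since Fig. 4 is not reproduced here. X, y and statistical_features come from the earlier sketches.

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    feature_names = list(statistical_features(X_raw[0]).keys())
    ranked = ["standard error", "sample variance", "standard deviation",
              "kurtosis", "skewness", "median", "mean", "range",
              "minimum", "maximum", "sum"]   # order after the first two is illustrative

    accuracy_vs_k = []
    for k in range(1, len(ranked) + 1):
        cols = [feature_names.index(f) for f in ranked[:k]]
        clf = DecisionTreeClassifier(random_state=0)
        accuracy_vs_k.append(cross_val_score(clf, X[:, cols], y, cv=10).mean())
    # accuracy_vs_k traces the curves of Fig. 5a-c (accuracy vs. number of features).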
Fig. 6a. Scatter plot of kurtosis.
Fig. 6b. Scatter plot of skewness.
Fig. 6c. Scatter plot of standard error.
Fig. 6d. Scatter plot of median.
Fig. 6e. Scatter plot of sample variance.
Fig. 6f. Scatter plot of mean.

7.2. Design of classifier

In the present study, the C4.5 algorithm has been used as the classifier. The design of the C4.5 algorithm for a given data set means finding the optimum values of the algorithm parameters – the minimum number of objects required to form a leaf, and the confidence factor.
Fig. 6g. Scatter plot of maximum.
Fig. 6h. Scatter plot of minimum.
Fig. 6i. Scatter plot of sum.
Fig. 6j. Scatter plot of range.
Fig. 6k. Scatter plot of standard deviation.
7.2.1. Effect of minimum number of objects on classification accuracy

… the information in the features/signal. The physical reasoning of the relation between the minimum number of data points required to form a leaf and the classification accuracy is very complex. Therefore, one has to study the effect of the minimum number of data points required to form a leaf on the classification accuracy. The graph of signals extracted at 2000 rpm shown in Fig. 7c illustrates this point. At 2000 rpm, the classification accuracy is maximum when the minimum number of objects is maintained in the range one to three, but drops for a value of four and remains at that level with minor undulation. Fig. 7a and b can be interpreted as follows: as the minimum number of data points required to form a leaf increases, the classification accuracy continuously decreases with small undulation.

One can infer from Fig. 7a that for 1000 rpm the minimum number of data points required to form a leaf is one, on account of the high classification accuracy. This might give the wrong notion that the tree over-fits the data; however, testing results were also close to 95% (the training accuracy), as explained in a later section. A similar observation can also be made from Fig. 7b for 1500 rpm.

Fig. 7a. Minimum number of objects – 1000 rpm.
Fig. 7b. Minimum number of objects – 1500 rpm.

7.2.2. Effect of confidence factor on classification accuracy

The confidence factor is the second parameter to be tuned in classifier design. It can take any value between 0 and 1. For studying its effect on classification accuracy, the confidence factor was varied from 0 to 1 in steps of 0.1 and the corresponding classification accuracies were recorded for three speeds: 1000 rpm, 1500 rpm and 2000 rpm. Fig. 8a–c shows the plots of confidence factor versus classification accuracy, respectively. It is observed from these results that the overall classification accuracy for the given classifier is highest when the confidence factor is 0.2.

Fig. 8a. Classification accuracy 1000 rpm.
Fig. 8b. Classification accuracy 1500 rpm.
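The two tuning studies of Sections 7.2.1 and 7.2.2 can be sketched as a simple sweep. Only the minimum-objects sweep maps directly onto scikit-learn (min_samples_leaf); there is no C4.5 confidence factor there, so cost-complexity pruning (ccp_alpha) is swept as a rough analogue. In a C4.5/J48 implementation the confidence factor itself would be varied.

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    # Section 7.2.1: minimum number of objects required to form a leaf.
    for m in range(1, 31):
        acc = cross_val_score(DecisionTreeClassifier(min_samples_leaf=m, random_state=0),
                              X, y, cv=10).mean()
        print(f"min objects = {m:2d}   accuracy = {acc:.3f}")

    # Section 7.2.2 analogue: sweep the pruning strength instead of a confidence factor.
    for alpha in [0.0, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05]:
        acc = cross_val_score(DecisionTreeClassifier(ccp_alpha=alpha, random_state=0),
                              X, y, cv=10).mean()
        print(f"ccp_alpha = {alpha:.3f}   accuracy = {acc:.3f}")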
Fig. 8c. Classification accuracy 2000 rpm.
Fig. 10a. Confusion matrix (1000 rpm).
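Confusion matrices like those in Fig. 10a–c can be produced from the cross-validated predictions. A minimal sketch follows, using the class labels assumed in the earlier sketches:

    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import cross_val_predict
    from sklearn.tree import DecisionTreeClassifier

    y_pred = cross_val_predict(DecisionTreeClassifier(random_state=0), X, y, cv=10)
    cm = confusion_matrix(y, y_pred, labels=["good", "c1m", "c2m", "c3m", "c4m"])
    print(cm)   # rows: true class, columns: predicted class;
                # off-diagonal entries are the misclassifications discussed below.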
Referring to Fig. 10b (confusion matrix for operation at 1500 rpm), a similar performance is noticed. In the first pair (mis1 vs mis4) the misclassification is 8%. Here the misclassification between cylinders three and four is more pronounced, at 13%. This could be attributed to the different spring-mass properties of these cylinders producing identical signals at 1500 rpm. Referring to Fig. 10c (confusion matrix for operation at 2000 rpm), one can observe that the overall misclassification between the faulty conditions and the 'good' condition is present to the tune of 32% (see Table 1).

The engine-generator assembly used in the study is designed for operation at 1500 rpm and can go up to a maximum speed of 2000 rpm under extreme conditions. It is obvious that the level of vibration will be high when the speed goes beyond 1500 rpm. For the completeness of the study, investigation of the effective…

Fig. 10c. Confusion matrix (2000 rpm).

Table 1
Specification of the IC engine.

Make                          Hindustan Motors
Number of cylinders/stroke    Four cylinders/four stroke
Fuel                          Gasoline (petrol)
Rated power                   7.35 kW
Rated speed (alternator)      1500 rpm
Engine stroke length          73.02 mm
Engine bore diameter          88.9 mm
Cooling                       Water cooled

8. Conclusion

References

California Air Resources Board. (1991). Technical status update and proposed revisions to malfunction and diagnostic system requirements applicable to 1994 and subsequent California passenger cars, light-duty trucks, and medium-duty vehicles (OBD II). CARB Staff Report.
Cavina, N., & Ponti, F. (2003). Engine torque nonuniformity evaluation using instantaneous crankshaft speed signal. Journal of Engineering for Gas Turbines and Power, 125(4), 1050–1058.
Chang, J., Kim, M., & Min, K. (2002). Detection of misfire and knock in spark ignition engines by wavelet transform of engine block vibration signals. Measurement Science and Technology, 13(7), 1108–1114.
Chen, B.-C., Wu, Y.-Y., Hsieh, F.-C., & Tsai, G.-L. (2005). Crank angle estimation with Kalman filter and stroke identification for electronic fuel injection control of a scooter engine. SAE Document Number: 2005-01-0076.
Ilkivova, M. R., Ilkiv, B. R., & Neuschl, T. (2002). Comparison of a linear and nonlinear approach to engine misfires detection. Control Engineering Practice, 10(10), 1141–1146.
Kiencke, U. (1999). Engine misfire detection. Control Engineering Practice, 7(2), 203–208.
Klenk, M., Moser, W., Mueller, W., & Wimmer, W. (1993). Misfire detection by evaluating crankshaft speed – A means to comply with OBD II. SAE Paper 930399.
Kuroda, S., Shimasaki, Y., Igarashi, H., Maruyama, S., & Asaka, T. (1995). Development on digital filtering technology of crank speed measurement to detect misfire in internal combustion engine at high speed revolution. JSAE Review, 16(4), 387–390.
Lee, D., & Rizzoni, G. (1995). Detection of partial misfire in IC engines using measurement of crankshaft angular velocity. SAE Paper 951070.
Osburn, A. W., Kostek, T. M., & Franchek, M. A. (2006). Residual generation and statistical pattern recognition for engine misfire diagnostics. Mechanical Systems and Signal Processing, 20(8), 2232–2258.
Piotr, B., & Jerzy, M. (2005). Misfire detection of locomotive diesel engine by non-linear analysis. Mechanical Systems and Signal Processing, 19(4), 881–899.
Quinlan, J. R. (1996). Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77–90.
Shiao, Y., & Moskwa, J. J. (1995). Cylinder pressure and combustion heat release estimation for SI engine diagnostics using nonlinear sliding observers. IEEE Transactions on Control Systems Technology, 3.
Tinaut, F. V., Melgar, A., Laget, H., & Dominguez, J. I. (2007). Misfire and compression fault detection through the energy model. Mechanical Systems and Signal Processing, 21(3), 1521–1535.
Wang, Y., & Chu, F. (2005). Real-time misfire detection via sliding mode observer. Mechanical Systems and Signal Processing, 19(4), 900–912.
Ye, J. (2009). Application of extension theory in misfire fault diagnosis of gasoline engines. Expert Systems with Applications, 36(2), 1217–1221.