Keywords: Misfire; Knock; Engine condition monitoring; Decision tree; Feature selection

Abstract

Misfire detection in an internal combustion engine is crucial for maintaining optimum performance throughout its service life and for reducing emissions. The vibration of the engine block contains indirect information regarding the condition of the engine. Misfire detection can be achieved by processing the vibration signals acquired from the engine using a piezoelectric accelerometer. This hidden information can be decoded using statistical parameters such as kurtosis, standard deviation, mean and median. This paper illustrates the use of a decision tree as a tool for both feature selection and classification. The effects of dimension, minimum number of objects and confidence factor on classification accuracy are studied and reported in this work.
1. Introduction

The rapid growth of transportation systems, mostly using internal combustion (IC) engines, has led to a wide range of challenges demanding immediate attention. Maintenance and condition monitoring of an IC engine is a crucial activity, required to ensure optimum performance and minimum load on the environment by restricting emissions to bare minimum levels. Misfire in a spark ignition IC engine is a major factor leading to undetected emissions and performance reduction. According to the California Air Resources Board (CARB) regulations (California Air Resources Board, 1991), engine misfire means "lack of combustion in the cylinder due to absence of spark, poor fuel metering, poor compression, or any other cause".

The engine diagnostic system of the vehicle should be designed to monitor misfire continuously because, even with a small number of misfiring cycles, engine performance degrades, hydrocarbon emissions increase and drivability suffers (Lee & Rizzoni, 1995). A misfiring cylinder also results in a large quantity of unburned fuel being sent through the catalytic converter, which shortens its service life through high-temperature exposure (Klenk, Moser, Mueller, & Wimmer, 1993) and also contributes to significant air pollution.

Several methods of misfire detection have been proposed (Klenk et al., 1993):

a. Monitoring catalyst temperature at the exhaust. This method is unacceptable since the catalyst temperature at the exhaust does not rise significantly in the case of low-frequency misfire.
b. Monitoring the oxygen sensor signal in the exhaust. This method is not encouraging since the momentary increase in oxygen level for a single misfire might not evoke a good response from the sensor, and it is even more challenging at higher speeds.
c. In-cylinder pressure monitoring. This method is very reliable and accurate, as the instantaneous mean effective pressure of each cylinder can be calculated in real time. However, the cost of fitting each cylinder with a pressure transducer is prohibitively high.
d. Evaluation of crankshaft angular velocity fluctuations. Extensive studies have been carried out using measurement of instantaneous crank angle speed (Cavina & Ponti, 2003; Chen, Wu, Hsieh, & Tsai, 2005; Kuroda, Shimasaki, Igarashi, Maruyama, & Asaka, 1995; Osburn, Kostek, & Franchek, 2006; Tinaut, Melgar, Laget, & Dominguez, 2007) and diverse techniques have been developed to predict misfire. These methods call for a high-resolution crank angle encoder and associated infrastructure capable of identifying minor changes in angular velocity due to misfire. The application of these techniques becomes more challenging under continuously varying operating conditions involving random variation in acceleration, coupled with the effect of the flywheel, which tends to smooth out minor variations in angular velocity at higher speeds. Fluctuating load torque applied to the crankshaft through the drive train poses additional hurdles in decoding the misfire signals.
… stored in the files. Five cases were considered: normal running (without any fault) and the engine with any one cylinder misfiring individually (i.e. the first, second, third or fourth). All the misfire events were simulated at 1000 rpm, 1500 rpm and 2000 rpm. The rated speed of the engine electrical generator set is 1500 rpm. Time domain plots of the signals at 1500 rpm are shown in Fig. 3a–e.
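As a purely illustrative sketch, the acquired signals could be organised for classification as below. The file layout, file names and the load_dataset helper are hypothetical assumptions, not details taken from the experimental setup; only the five condition classes and three speeds follow the text.

    import numpy as np

    # Hypothetical organisation: one file per condition per speed, each holding
    # many fixed-length windows of accelerometer samples.
    CLASSES = ["good", "c1m", "c2m", "c3m", "c4m"]   # normal + misfire in cylinders 1-4
    SPEEDS = [1000, 1500, 2000]                      # engine speeds used (rpm)

    def load_dataset(speed_rpm):
        """Collect (windows, labels) for one engine speed."""
        windows, labels = [], []
        for label in CLASSES:
            data = np.load(f"data/{speed_rpm}/{label}.npy")  # shape (n_windows, n_samples)
            windows.append(data)
            labels.extend([label] * len(data))
        return np.vstack(windows), np.array(labels)

    X_raw, y = load_dataset(1500)   # e.g. the rated-speed data set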
Fig. 3b. Amplitude plot – cylinder two misfire.

4. Definition of statistical features used

(a) Standard deviation: This is a measure of the energy content of the vibration signal. It is the root mean square (RMS) deviation of the signal values from their arithmetic mean. The following formula was used for the computation of standard deviation:

\text{Standard deviation}\;(\sigma) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2} \qquad (1)
(b) Standard error: The standard error of a statistic is the standard deviation of the sampling distribution of that statistic. Standard errors are important because they reflect how much sampling fluctuation a statistic will show. The inferential statistics involved in the construction of confidence intervals and significance testing are based on standard errors. The standard error of a statistic depends on the sample size; in general, the larger the sample size, the smaller the standard error. It is a measure of the content of error …

Fig. 3c. Amplitude plot – cylinder three misfire.
(c) Sample variance: Sample variance is the square of the standard deviation:

\text{Sample variance} = \sigma^2 \qquad (3)

(d) Kurtosis: Kurtosis indicates the flatness or the spikiness of the signal. Its value is very low for good engine performance without knock and high for misfire:

\text{Kurtosis} = \left\{\frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^4\right\} - \frac{3(n-1)^2}{(n-2)(n-3)} \qquad (4)

where s is the sample standard deviation.

(e) Skewness: Skewness characterizes the degree of asymmetry of a distribution around its mean. Positive skewness indicates a distribution with an asymmetric tail extending towards more positive values; negative skewness indicates a distribution with an asymmetric tail extending towards more negative values. It is given by

\text{Skewness} = \frac{n}{(n-1)(n-2)}\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{s}\right)^3 \qquad (5)

In the building phase, the training sample sets with discrete-valued attributes are recursively partitioned until all the records in a partition have the same class. The tree has a single root node for the entire training set. A new node is added to the decision tree for every partition. For a set of samples in a partition S, a test attribute X is selected for further partitioning the set into S1, S2, S3, …, SL. For each new set S1, S2, S3, …, SL new nodes are created and these are added to the decision tree as children of the node for S. Further, the node for S is labeled with test X, and partitions S1, S2, S3, …, SL are recursively partitioned. When all the records in a partition arrive with identical class labels, further partitioning is stopped and the corresponding leaf is labeled with that class. The construction of the decision tree strongly depends on how a test attribute X is selected. The C4.5 algorithm uses an information entropy evaluation function as the selection criterion.

The entropy evaluation function is arrived at through the following steps:

Step 1: Calculate Info(S) to identify the class in the training set S:

\text{Info}(S) = -\sum_{i=1}^{K}\frac{\text{freq}(C_i, S)}{|S|}\,\log_2\!\left(\frac{\text{freq}(C_i, S)}{|S|}\right) \qquad (6)
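As a concrete illustration, the features of Section 4 can be computed per signal window with the Python sketch below. Equations (1) and (3)–(5) are implemented directly; since Eq. (2) is not reproduced above, the standard error is taken in its usual form s/sqrt(n), which is an assumption rather than the paper's exact expression.

    import numpy as np

    def statistical_features(x):
        """The 11 statistical features of Section 4 for one vibration window."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        mean = x.mean()
        sigma = np.sqrt(np.sum((x - mean) ** 2) / n)           # Eq. (1)
        s = x.std(ddof=1)                                      # sample standard deviation
        z = (x - mean) / s
        kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))) * np.sum(z ** 4) \
                   - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))    # Eq. (4)
        skewness = n / ((n - 1) * (n - 2)) * np.sum(z ** 3)    # Eq. (5)
        return {
            "mean": mean,
            "standard error": s / np.sqrt(n),                  # assumed form; Eq. (2) not shown
            "median": np.median(x),
            "standard deviation": sigma,
            "sample variance": sigma ** 2,                     # Eq. (3)
            "kurtosis": kurtosis,
            "skewness": skewness,
            "range": x.max() - x.min(),
            "minimum": x.min(),
            "maximum": x.max(),
            "sum": x.sum(),
        }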
The GainRatio(X) compensates for the weak point of Gain(X), which represents the quantity of information provided by X in the training set. Therefore, an attribute with the highest GainRatio(X) is taken as the root of the decision tree.
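A minimal sketch of this selection criterion follows. Info(S) implements Eq. (6); the intermediate quantities Gain(X) and SplitInfo(X) are not reproduced in the text above, so they are written here in their standard C4.5 textbook forms.

    import math
    from collections import Counter

    def info(labels):
        """Entropy Info(S) of a partition, Eq. (6)."""
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gain_ratio(parent_labels, partitions):
        """GainRatio(X) for a test X splitting S into subsets S1..SL
        (standard C4.5 definitions of Gain and SplitInfo)."""
        n = len(parent_labels)
        gain = info(parent_labels) - sum(len(p) / n * info(p) for p in partitions)
        split_info = -sum((len(p) / n) * math.log2(len(p) / n) for p in partitions)
        return gain / split_info if split_info else 0.0

    # e.g. a split that isolates a misfire class perfectly scores 1.0:
    # gain_ratio(["good"]*5 + ["c2m"]*5, [["good"]*5, ["c2m"]*5])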
6.2. Pruning the decision tree

It is observed that a training set in the sample space leads to a decision tree which may be too large to be an accurate model; this is due to over-training or over-fitting. Such a fully grown decision tree needs to be pruned by removing the less reliable branches to obtain better classification performance over the whole instance space, even though it may induce a higher error over the training set.

The C4.5 algorithm uses an error-based post-pruning strategy to deal with the issue of over-training. For each classification node the C4.5 algorithm calculates a predicted error rate based on the total aggregate of misclassifications at that particular node. The error-based pruning technique results in the replacement of vast sub-trees in the classification structure by singleton nodes or simple branch collections, if these actions contribute to a drop in the overall error rate of the root node.

6.3. Application of decision tree for misfire detection

The obtained samples are divided into two parts: a training set and a testing set. The training set is used to train the classifier and the testing set is used to validate its performance. Tenfold cross-validation is deployed to evaluate classification accuracy. The training process of the C4.5 algorithm using the training set with continuous-valued attributes is as follows:

(1) The tree starts as a single node representing the samples from the training set.
(2) If the samples are all of the same class, then the node becomes a leaf and is labeled with that class.
(3) Otherwise, the algorithm discretises every attribute to select the optimal threshold. It uses the entropy-based measure called information gain (discussed in Section 6.1) as the heuristic for selecting the attribute that will best separate the given samples into individual classes.
(4) For each best discrete interval of the test attribute, a branch is created and the samples are partitioned accordingly.
(5) The algorithm uses the same process recursively to build a decision tree for the samples at each partition.
(6) The recursive partitioning stops only when one of the following conditions is satisfied:
    (a) All the samples for a given node belong to the same class.
    (b) There are no remaining attributes on which the samples may further be partitioned.
    (c) There are no samples for the branch test attribute. In such a case, a leaf is created with the majority class in the available sample.
(7) A pessimistic error pruning method (discussed in Section 6.2) is used to prune the developed tree. This operation improves the robustness and accuracy of the tree.

In the present work, the confidence factor assigned is 0.2. The confidence factor is used for pruning the tree (smaller values incur more pruning). The results are discussed below. A typical decision tree generated by the C4.5 algorithm is shown in Fig. 4 under Section 7.1.
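The training and tenfold cross-validation procedure can be sketched as below. This uses scikit-learn's DecisionTreeClassifier, which implements CART rather than C4.5 (the paper's results correspond to a C4.5/J48-style implementation), so it is an analogous pipeline and not a reproduction of the authors' tool; statistical_features and (X_raw, y) come from the earlier sketches.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    # One row of 11 statistical features per vibration window.
    X = np.array([list(statistical_features(w).values()) for w in X_raw])

    # min_samples_leaf plays the role of the "minimum number of objects";
    # scikit-learn prunes via ccp_alpha instead of a C4.5 confidence factor.
    clf = DecisionTreeClassifier(min_samples_leaf=2, random_state=0)
    scores = cross_val_score(clf, X, y, cv=10)      # tenfold cross-validation
    print(f"mean accuracy: {scores.mean():.3f}")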
Statistical features defined in Section 4 form the input to the algorithm. Kurtosis, maximum, etc., in the ovals represent the features and the rectangles represent classes. The labels found within the rectangles, but outside the parentheses, represent the class labels, e.g. c2m, c3m, etc. Two additional numbers are present within the parentheses; the first of these indicates the number of data points that can be classified using that feature set, and the second represents the number of misfit data points. The absence of the second number indicates that there are no misfit data. Further, if the first number is insignificantly small compared to the total number of samples, the corresponding features can be considered outliers and ignored. The other features that appear at the nodes of the decision tree are in descending order of importance. Only features that contribute to the classification appear in the decision tree and others do not.
Features that have less discriminating capability can be consciously discarded by fixing the threshold. The algorithm inherently makes use of this concept in selecting good features: only features with good discriminating capability will appear in the decision tree.

The next role played by the decision tree is classification. With the selected features the classification was carried out. The outcome is presented in the next section.

Fig. 5a. Effect of dimension (1000 rpm).

7. Results and discussion

The study of misfire classification using the decision tree is discussed in the following phases:

1. Dimensionality reduction (feature selection).
2. Design of classifier (C4.5 algorithm).
3. Validation of the classifier.
… kept aside for testing purposes, and the remaining 100 signals were used for building the classifier.

7.1. Dimensionality reduction

Dimensionality reduction is the process of reducing the number of input features required for classification, so as to reduce the computational effort. From the signals obtained at 1500 rpm, 11 statistical features as explained in Section 4 were extracted. All 11 features were given as input to the C4.5 algorithm and the corresponding decision tree was generated. The resulting decision tree is shown in Fig. 4. The decision tree gives an estimate of the worth of each feature by representing them in order of importance. The topmost node (feature) appearing in the tree, the root, contains maximum information about the signal, while the remaining nodes in the branches give the same in decreasing order of importance. From Fig. 4 the order of importance has been noted down for all the features that appear in the decision tree. The missing features do not carry significant information for classification; however, they are included randomly to complete the list in the study relating the effect of dimension to classification accuracy.

The dimensionality reduction was carried out as explained below. Initially the root feature of the tree (standard error) alone was considered for classification and the classification accuracy was noted down. In the next step classification was performed using the root node along with the next most prominent feature (standard error and sample variance) and the classification accuracy was noted down. Next, the root node along with the next two prominent features was considered for classification. Similarly the top four, top five, etc., features were considered and the corresponding classification accuracies were noted down. Fig. 5b shows the plot of the number of features versus classification accuracy. From the graph (Fig. 5b) it is evident that the classification accuracy gradually increases as the number of features goes from one to three (79.14–94.1%) and then shows a minor reduction when the number of features increases beyond three. It again reaches the maximum classification accuracy of 94.14% when seven features are considered. Using a smaller number of features reduces the computational load considerably; hence in this work the first three features in their order of importance have been selected. A similar exercise was carried out for 1000 rpm and 2000 rpm and the results are plotted in Fig. 5a and c, respectively. One can observe a pattern similar to that found in Fig. 5b.

Fig. 5b. Effect of dimension (1500 rpm).
Fig. 5c. Effect of dimension (2000 rpm).

As explained above, the feature selection has been carried out using the decision tree (C4.5 algorithm). Since an algorithm does the feature selection, it is necessary to validate the result. The scatter plot is used as a tool for validating the selection; a scatter plot depicts the ability of a feature to classify. It can be observed from the scatter plots given in Fig. 6d–i that the points are very cluttered and cannot be a good selection for classification. Kurtosis, given in Fig. 6a, has steep variations and hence is not a good selection. Since variance is the square of the standard deviation, no additional information can be obtained by selecting both these parameters; doing so may only increase the computational effort.
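The incremental procedure described above can be sketched as follows. The ranking mirrors the text for the first two features (standard error, then sample variance); the remainder of the ordering is illustrative, since Fig. 4 is not reproduced here. X, y and statistical_features come from the earlier sketches.

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    feature_names = list(statistical_features(X_raw[0]).keys())
    ranked = ["standard error", "sample variance", "standard deviation",
              "kurtosis", "skewness", "median", "mean", "range",
              "minimum", "maximum", "sum"]   # order after the first two is illustrative

    accuracy_vs_k = []
    for k in range(1, len(ranked) + 1):
        cols = [feature_names.index(f) for f in ranked[:k]]
        clf = DecisionTreeClassifier(random_state=0)
        accuracy_vs_k.append(cross_val_score(clf, X[:, cols], y, cv=10).mean())
    # accuracy_vs_k traces the curves of Fig. 5a-c (accuracy vs. number of features).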
Fig. 6a. Scatter plot of kurtosis.
Fig. 6b. Scatter plot of skewness.
Fig. 6c. Scatter plot of standard error.
Fig. 6d. Scatter plot of median.
Fig. 6e. Scatter plot of sample variance.
Fig. 6f. Scatter plot of mean.

7.2. Design of classifier

In the present study, the C4.5 algorithm has been used as the classifier. The design of the C4.5 algorithm for a given data set means finding the optimum values of the algorithm parameters – the minimum number of objects required to form a leaf, and the confidence factor.
Fig. 6g. Scatter plot of maximum.
Fig. 6h. Scatter plot of minimum.
Fig. 6i. Scatter plot of sum.
Fig. 6j. Scatter plot of range.
Fig. 6k. Scatter plot of standard deviation.
7.2.1. Effect of minimum number of objects on classification accuracy

… the information in the features/signal. The physical reasoning of the relation between the minimum number of data points required to form a leaf and the classification accuracy is very complex. Therefore, one has to study the effect of the minimum number of data points required to form a leaf on the classification accuracy. The graph of signals extracted at 2000 rpm shown in Fig. 7c illustrates this point. At 2000 rpm, the classification accuracy is maximum when the minimum number of objects is maintained in the range one to three, but drops for a value of four and remains at that level with minor undulation. Fig. 7a and b can be interpreted as follows: as the minimum number of data points required to form a leaf increases, the classification accuracy continuously decreases with small undulation.

One can infer from Fig. 7a that for 1000 rpm the minimum number of data points required to form a leaf is one, on account of the high classification accuracy. This might give the wrong notion that the tree over-fits the data; however, testing results were also close to 95% (the training accuracy), as explained in a later section. A similar observation can also be made from Fig. 7b for 1500 rpm.

Fig. 7a. Minimum number of objects – 1000 rpm.
Fig. 7b. Minimum number of objects – 1500 rpm.

7.2.2. Effect of confidence factor on classification accuracy

The confidence factor is the second parameter to be tuned in classifier design. It can take any value between 0 and 1. For studying its effect on classification accuracy, the confidence factor was varied from 0 to 1 in steps of 0.1 and the corresponding classification accuracies were recorded for three speeds: 1000 rpm, 1500 rpm and 2000 rpm. Fig. 8a–c shows the plots of confidence factor versus classification accuracy, respectively. It is observed from these results that the overall classification accuracy for the given classifier is highest when the confidence factor is 0.2.

Fig. 8a. Classification accuracy 1000 rpm.
Fig. 8b. Classification accuracy 1500 rpm.
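The two tuning studies of Sections 7.2.1 and 7.2.2 can be sketched as a simple sweep. Only the minimum-objects sweep maps directly onto scikit-learn (min_samples_leaf); there is no C4.5 confidence factor there, so cost-complexity pruning (ccp_alpha) is swept as a rough analogue. In a C4.5/J48 implementation the confidence factor itself would be varied.

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    # Section 7.2.1: minimum number of objects required to form a leaf.
    for m in range(1, 31):
        acc = cross_val_score(DecisionTreeClassifier(min_samples_leaf=m, random_state=0),
                              X, y, cv=10).mean()
        print(f"min objects = {m:2d}   accuracy = {acc:.3f}")

    # Section 7.2.2 analogue: sweep the pruning strength instead of a confidence factor.
    for alpha in [0.0, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05]:
        acc = cross_val_score(DecisionTreeClassifier(ccp_alpha=alpha, random_state=0),
                              X, y, cv=10).mean()
        print(f"ccp_alpha = {alpha:.3f}   accuracy = {acc:.3f}")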
Fig. 8c. Classification accuracy 2000 rpm.
Fig. 10a. Confusion matrix (1000 rpm).
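Confusion matrices like those in Fig. 10a–c can be produced from the cross-validated predictions. A minimal sketch follows, using the class labels assumed in the earlier sketches:

    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import cross_val_predict
    from sklearn.tree import DecisionTreeClassifier

    y_pred = cross_val_predict(DecisionTreeClassifier(random_state=0), X, y, cv=10)
    cm = confusion_matrix(y, y_pred, labels=["good", "c1m", "c2m", "c3m", "c4m"])
    print(cm)   # rows: true class, columns: predicted class;
                # off-diagonal entries are the misclassifications discussed below.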
Referring to Fig. 10b (confusion matrix for operation at 1500 rpm), a similar performance is noticed. In the first pair (mis1 vs mis4) the misclassification is 8%. Here the misclassification between cylinders three and four is more pronounced, at 13%. This could be attributed to the different spring-mass properties of these cylinders producing identical signals at 1500 rpm. Referring to Fig. 10c (confusion matrix for operation at 2000 rpm), one can observe that the overall misclassification between the faulty conditions and the 'good' condition is present to the tune of 32% (see Table 1).

The engine-generator assembly used in the study is designed for operation at 1500 rpm and can go up to a maximum speed of 2000 rpm under extreme conditions. It is obvious that the level of vibration will be high when the speed goes beyond 1500 rpm. For the completeness of the study, investigation of the effective…

Fig. 10c. Confusion matrix (2000 rpm).

Table 1
Specification of the IC engine.

Make                          Hindustan Motors
Number of cylinders/stroke    Four cylinders/four stroke
Fuel                          Gasoline (petrol)
Rated power                   7.35 kW
Rated speed (alternator)      1500 rpm
Engine stroke length          73.02 mm
Engine bore diameter          88.9 mm
Cooling                       Water cooled

8. Conclusion

References

California Air Resources Board. (1991). Technical status update and proposed revisions to malfunction and diagnostic system requirements applicable to 1994 and subsequent California passenger cars, light-duty trucks, and medium-duty vehicles (OBD II). CARB Staff Report.
Cavina, N., & Ponti, F. (2003). Engine torque nonuniformity evaluation using instantaneous crankshaft speed signal. Journal of Engineering for Gas Turbines and Power, 125(4), 1050–1058.
Chang, J., Kim, M., & Min, K. (2002). Detection of misfire and knock in spark ignition engines by wavelet transform of engine block vibration signals. Measurement Science and Technology, 13(7), 1108–1114.
Chen, B.-C., Wu, Y.-Y., Hsieh, F.-C., & Tsai, G.-L. (2005). Crank angle estimation with Kalman filter and stroke identification for electronic fuel injection control of a scooter engine. SAE Document Number: 2005-01-0076.
Ilkivova, M. R., Ilkiv, B. R., & Neuschl, T. (2002). Comparison of a linear and nonlinear approach to engine misfires detection. Control Engineering Practice, 10(10), 1141–1146.
Kiencke, U. (1999). Engine misfire detection. Control Engineering Practice, 7(2), 203–208.
Klenk, M., Moser, W., Mueller, W., & Wimmer, W. (1993). Misfire detection by evaluating crankshaft speed – A means to comply with OBD II. SAE Paper 930399.
Kuroda, S., Shimasaki, Y., Igarashi, H., Maruyama, S., & Asaka, T. (1995). Development on digital filtering technology of crank speed measurement to detect misfire in internal combustion engine at high speed revolution. JSAE Review, 16(4), 387–390.
Lee, D., & Rizzoni, G. (1995). Detection of partial misfire in IC engines using measurement of crankshaft angular velocity. SAE Paper 951070.
Osburn, A. W., Kostek, T. M., & Franchek, M. A. (2006). Residual generation and statistical pattern recognition for engine misfire diagnostics. Mechanical Systems and Signal Processing, 20(8), 2232–2258.
Piotr, B., & Jerzy, M. (2005). Misfire detection of locomotive diesel engine by non-linear analysis. Mechanical Systems and Signal Processing, 19(4), 881–899.
Quinlan, J. R. (1996). Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77–90.
Shiao, Y., & Moskwa, J. J. (1995). Cylinder pressure and combustion heat release estimation for SI engine diagnostics using nonlinear sliding observers. IEEE Transactions on Control Systems Technology, 3.
Tinaut, F. V., Melgar, A., Laget, H., & Dominguez, J. I. (2007). Misfire and compression fault detection through the energy model. Mechanical Systems and Signal Processing, 21(3), 1521–1535.
Wang, Y., & Chu, F. (2005). Real-time misfire detection via sliding mode observer. Mechanical Systems and Signal Processing, 19(4), 900–912.
Ye, J. (2009). Application of extension theory in misfire fault diagnosis of gasoline engines. Expert Systems with Applications, 36(2), 1217–1221.