Keywords: Single-shot multiclass proximal support vector machine; Morlet wavelet; Statistical features; Fault detection; Bevel gear box

Abstract

This paper deals with the application of a fast single-shot multiclass proximal support vector machine to fault diagnosis of a gear box consisting of twenty-four classes. The condition of an inaccessible gear in an operating machine can be monitored using the vibration signal of the machine, measured at some convenient location and further processed to unravel the significance of these signals. Statistical feature vectors from Morlet wavelet coefficients are classified using the J48 algorithm, and the predominant features are fed as input for training and testing the multiclass proximal support vector machine. The efficiency and time consumption in classifying the twenty-four classes all-at-once are reported.

© 2009 Published by Elsevier Ltd.
[Table 1. Various classes used for classification — table body not recovered.]

[Table 3. Gear wheel and pinion details — table body not recovered.]
3. Wavelet-based feature extraction

After acquiring the vibration signals in the time domain, they are processed to obtain feature vectors. The continuous wavelet transform (CWT) of a signal x(t) is

\[
\mathrm{CWT}_{x}^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt,
\]

where \(\psi(t)\) is a window function called the mother wavelet, \(s\) is a scale and \(\tau\) is a translation.

The term translation is related to the location of the window, as the window is shifted through the signal. This corresponds to the time information in the transform domain. But instead of a frequency parameter, we have a scale. Scaling, as a mathematical operation, either dilates or compresses a signal: larger scales correspond to dilated (stretched out) signals and smaller scales correspond to compressed signals.

The wavelet series is simply a sampled version of the CWT, and the information it provides is highly redundant as far as the reconstruction of the signal is concerned. This redundancy, on the other hand, …

Table 2
Details of faults under investigation.

Gears  Fault description                Dimension (mm)
G1     Good                             –
G2     Gear tooth breakage (GTB)        8
G3     Gear with crack at root (GTC)    0.8 0.5 20
G4     Gear with face wear (TFW)        0.5
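As a concrete illustration, the single-scale Morlet CWT and a statistical feature vector of its coefficients can be sketched in plain NumPy. The window width, the centre frequency w0 = 5, and the particular feature set (mean, standard deviation, skewness, kurtosis, minimum, maximum) are assumptions for illustration; the paper's exact feature list is not reproduced here.

```python
import numpy as np

def morlet_cwt_single_scale(x, scale, w0=5.0):
    """CWT of a 1-D signal at one scale with a complex Morlet mother
    wavelet: a plain-NumPy sketch of
    CWT(tau, s) = 1/sqrt(s) * integral x(t) psi*((t - tau)/s) dt."""
    x = np.asarray(x, dtype=float)
    # Sample the scaled mother wavelet on a +/- 4*scale support
    t = np.arange(-4 * scale, 4 * scale + 1)
    psi = np.pi ** -0.25 * np.exp(1j * w0 * t / scale) * np.exp(-(t / scale) ** 2 / 2)
    # Correlate the signal with the conjugate wavelet, normalised by sqrt(scale)
    return np.convolve(x, np.conj(psi)[::-1], mode="same") / np.sqrt(scale)

def statistical_features(coeffs):
    """An assumed statistical feature vector over |coefficients|:
    mean, std, skewness, kurtosis, min, max."""
    c = np.abs(coeffs)
    m, s = c.mean(), c.std()
    skew = ((c - m) ** 3).mean() / s ** 3
    kurt = ((c - m) ** 4).mean() / s ** 4
    return np.array([m, s, skew, kurt, c.min(), c.max()])

rng = np.random.default_rng(0)
record = rng.normal(size=8192)   # stands in for one measured vibration record
feats = statistical_features(morlet_cwt_single_scale(record, scale=8.0))
print(feats.shape)
```

One such vector per vibration record, at each wavelet scale of interest, yields the feature matrices used in the sections that follow.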
N. Saravanan, K.I. Ramachandran / Expert Systems with Applications 36 (2009) 10854–10862 10857
[Fig. 5a. Vibration signal for good pinion wheel under different lubrication and loading conditions. Six panels: Good-Dry-Unload, Good-Dry-FullLoad, Good-HalfLub-Unload, Good-HalfLub-FullLoad, Good-FullLub-Unload, Good-FullLub-FullLoad; x-axis: Sample No. (0–8000); y-axis: Amplitude.]
[Fig. 5b. Vibration signal for pinion wheel with teeth breakage under different lubrication and loading conditions. Six panels: GTB-Dry-Unload, GTB-Dry-FullLoad, GTB-HalfLub-Unload, GTB-HalfLub-FullLoad, GTB-FullLub-Unload, GTB-FullLub-FullLoad; x-axis: Sample No. (0–8000); y-axis: Amplitude.]
[Fig. 5c. Vibration signal for pinion wheel with crack at root under different lubrication and loading conditions. Panels include GTC-Dry-Unload and GTC-Dry-FullLoad; x-axis: Sample No.; y-axis: Amplitude.]
[Fig. 5d. Vibration signals for pinion wheel with teeth face wear under different lubrication and loading conditions. Panels include TFW-Dry-Unload and TFW-Dry-FullLoad; x-axis: Sample No.; y-axis: Amplitude.]
as an input for training and classification using SVM. Fig. 7 gives the efficiencies of all scales.

4. Using the J48 algorithm in the present work

A standard tree induced with C5.0 (or possibly ID3 or C4.5) consists of a number of branches, one root, a number of nodes and a number of leaves. One branch is a chain of nodes from root to a leaf, and each node involves one attribute. The occurrence of an attribute in a tree provides information about the importance of the associated attribute, as explained by Peng, Flach, Brazdil, and Soares (2002). A decision tree is a tree-based knowledge representation methodology used to represent classification rules. The J48 algorithm (a WEKA implementation of the C4.5 algorithm) is widely used to construct decision trees, as explained by Sugumaran et al. (2007).

The decision tree algorithm has been applied to the problem under discussion. Input to the algorithm is the set of statistical features of the eighth-scale Morlet coefficients of the vibration signatures of all twenty-four classes. The top node is the best node for classification; the other features in the nodes of the decision tree appear in descending order of importance. It is to be stressed …
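The feature-ranking role played by J48 can be sketched with scikit-learn's DecisionTreeClassifier as a stand-in (it implements CART rather than C4.5, but its impurity-based importances play the same role as reading the J48 tree from the top node downwards). The synthetic data, class structure and feature count below are assumptions for illustration only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 24 classes x 100 records x 6 statistical features.
rng = np.random.default_rng(0)
n_classes, per_class, n_features = 24, 100, 6
X = rng.normal(size=(n_classes * per_class, n_features))
y = np.repeat(np.arange(n_classes), per_class)
# Make the first two features carry class information so the tree can rank them.
X[:, 0] += y * 0.5
X[:, 1] += (y % 4) * 0.8

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
# Features sorted by impurity-based importance -- the analogue of the
# attribute order from the top node of a J48 tree downwards.
ranking = np.argsort(tree.feature_importances_)[::-1]
print(ranking[:2])
```

The top-ranked features would then be the "predominant features" passed on to the multiclass proximal SVM.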
[Fig. 7. Classification efficiency (%) for different Morlet wavelet scales; efficiency ranges from about 91% to 96% over scales 0–64.]
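The scale-selection experiment summarised by Fig. 7 can be sketched as a loop: extract features at each candidate scale, score a classifier by cross-validation, and keep the scale with the highest efficiency. Everything below — the stand-in feature generator (rigged so that scale 8 carries the most class information, mirroring the paper's finding), the classifier and the candidate scales — is an illustrative assumption, not the authors' code.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
y = np.repeat(np.arange(24), 100)            # 24 classes x 100 records

def features_at_scale(scale):
    """Hypothetical stand-in for 'statistical features of the Morlet
    coefficients at this scale'; class separability peaks at scale 8."""
    X = rng.normal(size=(y.size, 6))
    X[:, 0] += y * (1.0 / (1.0 + abs(scale - 8)))
    return X

# Mean 10-fold cross-validated accuracy ("efficiency") per candidate scale
scores = {s: cross_val_score(DecisionTreeClassifier(random_state=0),
                             features_at_scale(s), y, cv=10).mean()
          for s in (4, 8, 16, 32)}
best_scale = max(scores, key=scores.get)
print(best_scale)
```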
On taking the Lagrangian we obtain:

\[
L = \tfrac{1}{2}\operatorname{tr}(W^{T}W) + \tfrac{C}{2}\,\xi^{T}\xi - \sum_{i=1}^{m}\alpha_{i}\bigl(y_{i}^{T}W\phi(x_{i}) - 1 + \xi_{i}\bigr)
\]

\[
\frac{\partial L}{\partial W} = W - \sum_{i=1}^{m}\alpha_{i}\,y_{i}\,\phi(x_{i})^{T} = 0 \;\Rightarrow\; W = \sum_{i=1}^{m}\alpha_{i}\,y_{i}\,\phi(x_{i})^{T} \quad (5)
\]

\[
\frac{\partial L}{\partial \xi_{i}} = C\xi_{i} - \alpha_{i} = 0 \;\Rightarrow\; \xi_{i} = \frac{\alpha_{i}}{C}, \qquad \xi = \frac{\alpha}{C}. \quad (6)
\]

Substituting \(W = \sum_{i=1}^{m}\alpha_{i} y_{i} \phi(x_{i})^{T}\) and \(\xi = \alpha/C\) in the constraint in (4), we obtain

\[
\bigl(K^{y}_{ij} \circ K^{\phi}_{ij}\bigr)\,\alpha = e - \frac{\alpha}{C},
\]

where \(\circ\) represents elementwise multiplication, i.e.

\[
K\alpha = e - \frac{\alpha}{C}, \qquad K = K^{y} \circ K^{\phi},
\]

\[
\Bigl(K + \frac{I}{C}\Bigr)\alpha = e,
\]

\[
Q\alpha = e, \quad \text{where } Q = K + \frac{I}{C}, \quad (7)
\]

\[
\alpha = Q^{-1}e.
\]

This leads to a closed-form solution for SVM training. Here \(\alpha\) is unrestricted in sign and unbounded. Our argument is that, in the formulation given in Szedmak and Shawe-Taylor (2005), the restriction put on the error variable is unwarranted, especially when we interpret \(W\) as mapping data/feature points into label space. The error that we allow while doing the mapping can have any sign. The impact and meaning of a restriction on the sign of the error variable are yet to be explored.

Table 4 (continued)

Trial No.  % Error    Time (s)   Nu    Sigma
77         27.58333   163.5      0.4   66
78         27.58333   159.75     0.4   67
79         27.66667   156.6406   0.4   68
80         27.66667   153.7813   0.4   69
81         27.66667   151.3906   0.4   70
82         27.625     150.9375   0.4   71
83         27.58333   151.2656   0.4   72
84         27.625     151.6563   0.4   73
85         27.625     152        0.4   74
86         27.66667   152.5625   0.4   75
87         27.66667   150.9063   0.4   76
88         27.75      150.1094   0.4   77
89         27.875     161.9531   0.35  66
90         27.91667   160.375    0.3   66
91         27.79167   160.7656   0.45  66
92         27.54167   163.4063   0.5   66
93         27.83333   162.6563   0.55  66
94         28.0000    161.0938   0.6   66
95         27.54167   160.9844   0.5   67
96         27.58333   156.6563   0.5   68
97         27.58333   153.125    0.5   69
98         27.54167   158.5938   0.5   65
99         27.58333   154.0938   0.5   64
100        27.66667   149.3594   0.5   63
101        27.75      147.0469   0.5   62
102        27.58333   158.1875   0.5   68
103        27.625     151.0938   0.5   70
104        27.625     150.7031   0.5   71

6. Application of MSVM to the problem at hand and results

The predominant statistical features selected from the eighth scale of the Morlet wavelet were given as input to the multiclass proximal support vector machine. As mentioned in Section 1, there are 24 classes in total, and all of them are classified all-at-once using the multiclass proximal support vector machine. The kernel parameters for classification consist of Nu, Sigma, the tolerance and the number of iterations. Each class consists of 100 data sets, and 10-fold cross-validation is used for testing. The number of iterations was fixed at 50 for the entire classification and the tolerance was set to 0.0001. Table 4 shows the results obtained using the multiclass proximal support vector machine for the different trials, together with the time taken for classification in each trial. Fig. 9 shows the % error for the different trials and Fig. 10 shows the time taken for classification in each trial.

7. Discussion

In this paper, we have shown that a multiclass proximal support vector machine simplifies the computation and sheds some light on the geometry of the multiclass formulation. It was clear from the results that, using the multiclass proximal support vector machine, it was also
[Fig. 9. Classification % error for different trials (trials 0–100; error roughly 27–39%).]
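The closed-form training step \(\alpha = Q^{-1}e\) with \(Q = K + I/C\) (Eq. (7)) amounts to solving a single linear system. A minimal NumPy sketch follows, assuming a Gaussian kernel for \(K^{\phi}\) (consistent with the Sigma parameter of Section 6) and the label kernel \(K^{y}_{ij} = y_i^{T} y_j\) for one-hot label vectors; this illustrates the algebra, not the authors' implementation.

```python
import numpy as np

def train_proximal_msvm(X, y_onehot, sigma=1.0, C=10.0):
    """Solve (K + I/C) alpha = e in closed form, where K = K_y * K_phi
    (elementwise), K_phi a Gaussian kernel on the inputs and
    K_y = Y Y^T the label kernel -- a sketch of Eq. (7)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K_phi = np.exp(-sq / (2.0 * sigma ** 2))    # Gaussian (RBF) kernel
    K_y = y_onehot @ y_onehot.T                 # label kernel
    K = K_y * K_phi                             # elementwise product
    Q = K + np.eye(len(X)) / C                  # Q = K + I/C is positive definite
    return np.linalg.solve(Q, np.ones(len(X)))  # alpha = Q^{-1} e

# Tiny 3-class example with one-hot label vectors
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
Y = np.eye(3)[rng.integers(0, 3, size=30)]
alpha = train_proximal_msvm(X, Y)
print(alpha.shape)
```

Since the Hadamard product of two positive semidefinite kernels is positive semidefinite, adding I/C makes Q positive definite, so the solve always succeeds; this is what makes the single-shot, all-at-once training cheap.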
[Fig. 10. Time elapsed for training (s) for different trials (trials 0–100; time roughly 130–160 s).]
depicted that the speed of classification is high compared to others, and that only a few kernel parameters (Nu and Sigma) were needed for classification. In this work, an error percentage of 27.541 was obtained for classifying a total of twenty-four classes.

8. Conclusion

Fault diagnosis of a gear box is one of the core research areas in the field of condition monitoring of rotating machines. A total of twenty-four classes are classified using statistical features of eighth-scale Morlet wavelet coefficients, and a multiclass proximal support vector machine is used for the further classification of the features. An error of 27.541% is obtained in classifying the twenty-four classes (given in Table 3) all-at-once. It is found that classification using the multiclass proximal support vector machine gives good results for large numbers of classes in less time. Better results than the one obtained may be possible if better kernel parameters can be found.

References

Boulahbal, D., Golnaraghi, M. F., & Ismail, F. (1997). In Proceedings of DETC'97, 1997 ASME design engineering technical conference, DETC97/VIB-4009.

Cameron, B. G., & Stuckey, M. J. (1994). A review of transmission vibration monitoring at Westland Helicopters Ltd. In Proceedings of the 20th European rotorcraft forum, Paper 116 (pp. 116/1–116/16).

Crammer, K., & Singer, Y. (2001). On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research, 2, 265–292.

Fung, G., & Mangasarian, O. L. (2001). Proximal support vector machine classifiers. In KDD 2001: Seventh ACM SIGKDD international conference on knowledge discovery and data mining, San Francisco, August 26–29.

Gadd, P., & Mitchell, P. J. (1984). Condition monitoring of helicopter gearboxes using automatic vibration analysis techniques. AGARD CP 369, Gears and power transmission systems for helicopters and turboprops (pp. 29/1–29/10).

Kecman, V., Huang, T.-M., & Vogt, M. (2005). Iterative single data algorithm for training kernel machines from huge data sets: Theory and performance. In Support vector machines: Theory and applications. Studies in fuzziness and soft computing (Vol. 177). Springer-Verlag.

Leblanc, J. F. A., Dube, J. R. F., & Devereux, B. (1990). Helicopter gearbox vibration analysis in the Canadian Forces – applications and lessons. In Proceedings of the first international conference on gearbox noise and vibration (pp. 173–177). Cambridge, UK: IMechE. C404/023.

Mallat, S. (1998). A wavelet tour of signal processing. Academic Press.

Micchelli, C. A., & Pontil, M. (2004). Kernels for multi-task learning. In Proceedings of the 18th conference on neural information processing systems (NIPS'04).

Micchelli, C. A., & Pontil, M. (2005). On learning vector-valued functions. Neural Computation, 17, 177–204.

Peng, Y. H., Flach, P. A., Brazdil, P., & Soares, C. (2002). Decision tree-based data characterization for meta-learning. In ECML/PKDD-2002 workshop IDDM, Helsinki, Finland.

Petrille, O., Paya, B., Esat, I. I., & Badi, M. N. M. (1995). In Proceedings of the energy-sources technology conference and exhibition: Structural dynamics and vibration, PD (Vol. 70, p. 97).

Soman, K. P., & Ramachandran, K. I. (2005). Insight into wavelets: From theory to practice. Prentice-Hall of India.

Sugumaran, V., Muralidharan, V., & Ramachandran, K. I. (2007). Feature selection using decision tree and classification through proximal support vector machine for fault diagnostics of roller bearing. Mechanical Systems and Signal Processing, 21, 930–942.

Szedmak, S., & Shawe-Taylor, J. (2005). Multiclass learning at one-class complexity. Technical report, ISIS Group, Electronics and Computer Science.

Vapnik, V. N. (1999). The nature of statistical learning theory (2nd ed.). Springer-Verlag (pp. 138–146).

Wang, W. J., & McFadden, P. D. (1993). Early detection of gear failure by vibration analysis II: Interpretation of the time-frequency distribution using image processing techniques. Mechanical Systems and Signal Processing, 7(3), 205–215.