ScienceDirect
Available online at www.sciencedirect.com
Procedia Manufacturing 00 (2019) 000–000
www.elsevier.com/locate/procedia
Abstract
Several methods have been proposed for fault detection in mechanical systems based on sensor signals. Ideally, a corresponding label
is provided for each sensor signal so that the signals can be analyzed with appropriate supervised classification methods. However, the label
information about a system's status often does not pair perfectly with the corresponding data. Therefore, we apply semi-supervised
classification to fault detection using pattern extraction from multivariate signals. This approach transforms continuous time series into a
set of contiguous bins via multivariate discretization. We then identify informative patterns in the system states by using a self-training
method with limited label information. To demonstrate the effectiveness of the proposed extraction method, five accelerometer signals were
collected from a bearing-shaft system. The proposed method successfully reveals informative fault patterns that can be applied as references
for fault detection. The method achieved higher detection performance regardless of the ratio of unlabeled inputs in the datasets.
© 2020 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Peer-review under responsibility of the scientific committee of the FAIM 2021.
transforming the original signals, the significant patterns are extracted from target states of a system as motifs [16, 17, 18]. For example, the transformation of the original signals to discretized state vectors (DSVs) is applied both in discrete wavelet transforms and in one of the time series discretization methods, namely, symbolic aggregate approximation (SAX) [19]. By counting the number of occurrences of each DSV and using the k-nearest neighbor classifier, George et al. extracted meaningful patterns about rotor faults in induction motors [20]. Duan et al. analyzed compressor faults by SAX transformation and bit map [21]. After transforming the original vibration signals, they generated a meaningful bit map from each different fault state (e.g., spring faults, valve fracture, and valve wear). In addition, many researchers have used time series classification with a supervised learning classifier for the extraction of significant markers of faults [22, 23, 24].

However, obtaining corresponding and correct labels for the entire time series data is very expensive and time consuming [25, 26, 27], and mechanical systems are usually rife with unknown faults [28]. In other words, in practice, the label information about a system's status often fails to pair perfectly with the corresponding sensor data. For example, fully labeled datasets are often collected from in-lab experiments or test runs, whereas unlabeled datasets are collected using the identical system in practical environments. Therefore, several semi-supervised learning approaches have been introduced in order to leverage the advantages of both supervised and unsupervised learning [29, 30]. Zhao et al. proposed a semi-supervised learning classifier for fault detection in solar photovoltaic arrays [31]. They normalized and filtered measurements and then distinguished fault data from other data by using supervised and unsupervised classifiers together in a semi-supervised manner. This method showed good performance without labeling costs in continuously updating models. For the extraction of meaningful vibration signals, a semi-supervised classifier was applied with kernel marginal Fisher analysis [32]. Those authors first reduced redundant information in order to highlight meaningful signal behaviors related to system status change. They then extracted optimal low-dimensional features in order to improve the classification performance of various bearing fault types.

In many studies, semi-supervised learning approaches improved fault detection with input datasets that were not fully labeled. However, it is not easy to apply a semi-supervised classifier to DSVs, because time series discretization reduces the number of datasets, making the preprocessed dataset smaller than the original dataset. Jun et al. obtained a symbolic representation of time series with a semi-supervised classification [33]. They transformed the time-series data into a series of granules by applying a hidden Markov model, and then used both the hidden Markov model and a shallow neural network together with symbolic and original real-valued data, respectively. Also, a convolutional neural network (CNN) was modified and used for semi-supervised learning in time-series classification [34]. First, the investigators artificially increased the amount of training data to handle partially labeled inputs in the supervised signal classification problem. In addition, they conducted the pre-training of each layer in an unsupervised manner to find appropriate parameter settings of the classifier. Then, in a supervised manner, they applied convolution filters to discretized time-series data. A combination of SAX and neural networks was also used to distinguish gestures and actions [35]. A 3D posed image was converted into a symbol matrix using SAX, and then a hierarchical-clustering method was used to assign symbol matrices to several 3D posed-image groups. After generating the groups in a semi-supervised manner, the investigators converted a symbol matrix into a feature vector using a CNN and predicted the current gesture by a long short-term memory model in a supervised condition.

However, during the time-series classification and fault detection, a decrease in the performance of semi-supervised learning approaches was observed compared with that of supervised learning. In addition, several detection approaches provided high detection performance either with few labeled and many unlabeled input datasets or with few labeled and many unlabeled inputs [27, 33, 34]. Because semi-supervised detection cannot be controlled in practical environments, it should not be susceptible to the number of unlabeled inputs in the entire dataset. Therefore, in this study, we propose adding the number of DSV occurrences in no-fault and fault patterns as input data in semi-supervised learning to re-weight the extracted patterns and make up for the information loss from the discretization. The proposed occurrence information should make it possible to treat severity as a fault marker. Subsequently, we analyzed the dependence of fault detection performance on the amount of data in training sets to compare the performance of our proposed method with the existing supervised or semi-supervised detection methods.

The remainder of the present study is organized as follows. Section 2 details the transformation of the original continuous signals into DSVs and describes the extraction of fault patterns using semi-supervised classification in consideration of the number of DSV occurrences. Section 3 describes the bearing-shaft system analyzed here, the collection of acceleration signals, and the control of abnormal states. The pattern extraction method is experimentally verified and validated for the detection of fault states in the bearing shaft. Finally, Section 4 presents concluding remarks and future directions following this work.

2. Semi-supervised classification of signal patterns

2.1. Pattern generation from multivariate signals

We used the multivariate discretization in a pattern extraction manner, to observe informative signal behaviors, as developed in our previous study [11] (illustrated in Fig. 1). This multivariate discretization-based pattern analysis was composed of (i) a digitizing and partitioning step and (ii) an abnormal pattern extraction step. The developed pattern extraction method showed higher fault detection and prediction performance in electromechanical systems, such as an automobile engine and a laser welding machine. In this study, we therefore continue to apply the method to extract accelerometer signal patterns to detect abnormal vibrations in
318 Sujeong Baek et al. / Procedia Manufacturing 51 (2020) 316–323
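As a rough illustration of the two ideas named in Section 2.1 (a digitizing and partitioning step, followed by counting DSV occurrences), the following minimal sketch turns a multivariate signal into one DSV per time segment. The digitizing rule is not specified in this text, so equal-width bins and per-segment averaging are assumptions, and the names `discretize_to_dsvs`, `n_bins`, and `segment_len` are hypothetical rather than taken from [11].

```python
from collections import Counter
import random


def discretize_to_dsvs(signals, n_bins=4, segment_len=10):
    """Map rows of multivariate samples to one DSV (tuple of bin indices) per segment."""
    n_sensors = len(signals[0])
    n_segments = len(signals) // segment_len
    # Partitioning: average each sensor over every fixed-length time segment.
    seg_means = []
    for s in range(n_segments):
        chunk = signals[s * segment_len:(s + 1) * segment_len]
        seg_means.append([sum(row[j] for row in chunk) / segment_len
                          for j in range(n_sensors)])
    # Digitizing: equal-width bins between each sensor's min and max (an assumption).
    lows = [min(m[j] for m in seg_means) for j in range(n_sensors)]
    highs = [max(m[j] for m in seg_means) for j in range(n_sensors)]
    dsvs = []
    for m in seg_means:
        bins = []
        for j in range(n_sensors):
            width = (highs[j] - lows[j]) / n_bins or 1.0  # guard against flat signals
            bins.append(min(int((m[j] - lows[j]) / width), n_bins - 1))
        dsvs.append(tuple(bins))
    return dsvs


# Synthetic stand-in for five accelerometer channels (not the paper's data).
random.seed(0)
demo = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(1000)]
demo_dsvs = discretize_to_dsvs(demo, n_bins=4, segment_len=10)
demo_counts = Counter(demo_dsvs)  # occurrences of each DSV
```

Because each segment of `segment_len` samples collapses into a single DSV, the discretized series is much shorter than the raw one, which is the data reduction the introduction refers to.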
| Specification | Value |
| --- | --- |
| Sensitivity | 1.0 mV/(m/s²) |
| Measurement range | ± 4,900 m/s² pk |
| Frequency range (± 5%) | [1.0 10,000] Hz |
| Spectral noise (100Hz) | 60 (µm/s²)/√Hz |

Table 5. Description of the collected vibration datasets during normal and abnormal states.

| Item | Description |
| --- | --- |
| Number of sensors in the system | Five accelerometer sensors |
| Number of abnormal states | 10 times for each fault type (type 1 / type 2 / type 3) |
| Total number of normal states | 30 states |
| Monitoring period for each state | About 83s (Standard deviation: 7.8s) |
| Sampling frequency | 100Hz |

Table 6. The structure of hidden layers used in abnormal vibration detection.

| Time | Sensor 1 | Sensor 2 | Sensor 3 | Sensor 4 | Sensor 5 |
| --- | --- | --- | --- | --- | --- |
| 00.000 | -0.0084 | 0.0766 | 0.0287 | -0.1182 | -0.0193 |
| 00.001 | 0.0947 | 0.0011 | -0.01125 | 0.1374 | -0.0047 |
| 00.002 | 0.0042 | 0.0195 | 0.0622 | 0.0637 | -0.0119 |
| 00.003 | 0.0011 | 0.0171 | -0.0052 | 0.1322 | 0.0353 |
| 00.004 | 0.0211 | -0.0484 | 0.0265 | 0.0611 | 0.0134 |
| … | | | | | |
| 65.623 | 0.0106 | 0.0105 | 0.0413 | -0.1881 | 0.0251 |

Five signals were collected at a sampling rate of 100Hz, which is relatively low but sufficient for monitoring the status of the rotating equipment. If the sampling rate is extremely low, the signal is susceptible to distortions such as aliasing or folding [37]. On the other hand, if it is too high, more computation cost is required. According to the previous research, any value over the Nyquist sampling rate (herein 0.5Hz) was adequate and appropriate to collect signals in fault pattern extraction through DSVs [38]. When multivariate discretization was used for the fault pattern extraction, 100Hz was an allowable sampling rate because of the partitioning step. As several measurements in the user-defined length of the time segment are transformed into one label, above the minimum sampling rate (i.e., the Nyquist sampling rate) the performance of the extracted fault pattern was not significantly different according to a Tukey honest significant-difference test. That is, time segmentation showed a similar effect to having a low sampling rate, and there was no significant difference for any sampling rate higher than the Nyquist minimum sampling rate. Therefore, in this study, we selected 100Hz, which was not only sufficiently high to capture the signal behavior over time but also efficient for collecting a huge amount of time series data.

We artificially simulated three different fault states by controlling the basic components (i.e., shaft and bearing) of the bearing-shaft system as follows. For the first fault type, two shafts were arranged in a non-parallel configuration. Thus, the rotational force was not fully transmitted to the shorter shaft due to an improper engagement between the teeth of two gears, as illustrated in Fig. 4a. Another fault was generated by loosening a bolt in a bearing cover that holds both a bearing and a shaft in place, as depicted in Fig. 4b. As the bearing cover becomes loose, the adjacent shafts, gears, and plates may tend to rotate unstably. To generate a third fault type,
non-parallel shafts and a loose bearing cover were applied simultaneously.

Although the physical experimental setup for any fault type depicted in Fig. 4 seemed to generate distinguishable sensor signals, enabling an abnormal state to be detected easily, it was not simple to classify the two health states of the system through a traditional statistical detection method. Because some acceleration signals from the normal state and any abnormal state are extremely non-linear and non-stationary, as Fig. 5 illustrates, it was not straightforward to find significant differences between normal and abnormal system states. Therefore, we employed the proposed fault-detection method described in Section 2.

Fig. 4. Controlling the bearing-shaft system to generate a fault: (a) fault type 1 (non-parallel shafts); (b) fault type 2 (a loose bearing cover).

Training datasets without labels, those with corresponding labels, and test datasets were divided as summarized in Table 7.

Table 7. Datasets divided into test datasets, training datasets without labels, and training datasets with corresponding labels for 5-fold cross-validation.

| The ratio of removed labels among the training datasets | Test datasets | Training datasets without labels | Training datasets with the corresponding labels |
| --- | --- | --- | --- |
| 25% | Group 1 | Group 2 | Group 3, 4, 5 |
| 50% | Group 1 | Group 2, 3 | Group 4, 5 |
| 75% | Group 1 | Group 2, 3, 4 | Group 5 |

Table 8. An example of the representative DSV for each system state when three-fifths of the total datasets had the correct corresponding labels and were used in the first training phase (i.e., the first row in Table 7).
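The group assignment in Table 7, together with a generic self-training loop over DSV occurrence counts, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the scoring rule (difference of relative DSV frequencies) and the names `split_groups`, `score_state`, and `self_train` are hypothetical, and the loop is plain self-training rather than the exact procedure of the proposed method, whose details are not given in this part of the text.

```python
from collections import Counter


def split_groups(groups, removed_ratio):
    """Table 7 layout: first group is the test set, the rest are training groups."""
    test, train = groups[0], groups[1:]
    n_unlabeled = round(removed_ratio * len(train))
    return test, train[:n_unlabeled], train[n_unlabeled:]


def score_state(dsvs, normal_counts, abnormal_counts):
    """Positive when the state's DSVs occur relatively more often in abnormal states."""
    n_total = sum(normal_counts.values()) or 1
    a_total = sum(abnormal_counts.values()) or 1
    return sum(abnormal_counts[d] / a_total - normal_counts[d] / n_total
               for d in dsvs)


def self_train(labeled, unlabeled):
    """labeled: list of (dsvs, label); unlabeled: list of dsvs. Returns class counters."""
    counts = {"normal": Counter(), "abnormal": Counter()}
    for dsvs, label in labeled:
        counts[label].update(dsvs)
    pool = list(unlabeled)
    while pool:
        scores = [score_state(s, counts["normal"], counts["abnormal"]) for s in pool]
        # Pseudo-label the most confident state first, then fold its DSV counts back in.
        best = max(range(len(pool)), key=lambda i: abs(scores[i]))
        label = "abnormal" if scores[best] > 0 else "normal"
        counts[label].update(pool.pop(best))
    return counts


# Toy DSVs (tuples of bin indices); these are not the paper's data.
splits = {r: split_groups([1, 2, 3, 4, 5], r) for r in (0.25, 0.50, 0.75)}
labeled = [([(0, 0), (0, 0), (0, 1)], "normal"),
           ([(3, 3), (3, 2), (3, 3)], "abnormal")]
unlabeled = [[(3, 3), (3, 3), (2, 2)], [(0, 0), (0, 1), (1, 1)]]
counts = self_train(labeled, unlabeled)
```

In this toy run, the DSV `(2, 2)` never appears in the labeled data but is absorbed into the abnormal counts through a pseudo-labeled state, which loosely mirrors how occurrence information can help with DSVs that are unseen in the labeled training data.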
DSV from fault type 3 against the representative DSV from normal state in Table 8. However, as several of the DSVs found in abnormal states showed only slight differences from the DSVs of the normal states, it was still not straightforward to detect abnormal vibrations using traditional statistical methods. Note that these representative DSVs depend on the setting of k-fold cross validation.

To examine the effectiveness of the proposed detection method, we implemented three methods for detecting abnormal vibrations: supervised fault detection [11] with the labeled training datasets; the original self-training-based, semi-supervised classification; and the proposed detection method (semi-supervised classification considering the number of DSV occurrences). We measured detection performance by calculating the number of discernible abnormal states in the test datasets. According to our previous supervised fault detection research [11], a discernible abnormal state was defined as a system state in which one or more abnormal patterns are detected.

Table 9 summarizes the abnormal vibration detection performance of each label removal regime. Regardless of the degree of label removal (i.e., of the prevalence of unlabeled datasets) among the training sets, semi-supervised classifications showed better detection. As the ratio of removed labels in training datasets increased, semi-supervised methods provided higher detection rates, whereas supervised detection worsened. Semi-supervised classification can also provide more accurate detection results for unknown DSVs that are never found in the training DSVs. In addition, we were able to identify the effect of considering the occurrence numbers of DSVs with an increasing proportion of unlabeled DSVs.

Table 9. The number of discernible abnormal states as a percentage of total abnormal states (=30 abnormal states) for supervised fault detection with the labeled training datasets, for the original self-training based semi-supervised classification, and the proposed detection method. A number beside the percentage denotes the exact number of discernible states among total abnormal states.

| The ratio of removed labels among the training datasets | Supervised fault detection with the labeled training datasets | The original self-training based semi-supervised classification | The proposed detection method (considering the DSV's occurrences) |
| --- | --- | --- | --- |
| 25% | 90% (27.0) | 93% (28.0) | 93% (28.0) |
| 50% | 89% (26.8) | 86% (25.8) | 97% (29.0) |
| 75% | 85% (25.4) | 91% (27.2) | 100% (30.0) |

4. Conclusion

In this study, we demonstrate semi-supervised abnormal vibration detection in a constructed bearing-shaft system that can exhibit three different fault states based on the placement of the shafts and the tightness of a bearing cover. These fault states are analyzed by identifying fault patterns based on the DSVs from five vibration sensor signals. Because time series discretization reduces the number of datasets, we consider as input variables both the transformed DSVs and the corresponding occurrence numbers of input DSVs in normal and abnormal states. As a result, the proposed method provides higher detection performance regardless of the ratio of unlabeled DSVs in the training datasets. In particular, the proposed method correctly determines novel unknown DSVs in the test phases by considering the occurrence of DSVs.

Although the primary objectives of this study were achieved, further work is needed to overcome the limitations of the present study. The detection performance of the proposed method was rarely affected by any alterations of the test conditions (such as a different motor speed, a different gear ratio, and different lengths of shafts), but it required a new training model with the identical architecture of the proposed method, because of the different signal characteristics of the collected data. Therefore, we need (i) to test the performance of the proposed detection method across different test conditions. In addition, it is also necessary (ii) to generalize the analysis results using more diverse types of faults, (iii) to validate and verify the proposed method in early detection of degradation before fault occurrence, and (iv) to discuss how to apply the method to unsupervised problems.

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1G1A1097478).

References

[1] Lu Y, Xie R, Liang SY. Detection of weak fault using sparse empirical wavelet transform for cyclic fault. The International Journal of Advanced Manufacturing Technology 2018;99(5-8):1195–1201.
[2] Peng Y, Dong M, Zuo MJ. Current status of machine prognostics in condition-based maintenance: a review. The International Journal of Advanced Manufacturing Technology 2010;50(1-4):297–313.
[3] Addo-Tenkorang R, Helo PT. Big data applications in operations/supply-chain management: A literature review. Computers & Industrial Engineering 2016;101:528–543.
[4] Khan S, Phillips P, Jennions I, Hockley C. No fault found events in maintenance engineering part 1: Current trends, implications and organizational practices. Reliability Engineering & System Safety 2014;123:183–195.
[5] Sydor P, Kavade R, Hockley CJ. Warranty impacts from no fault found (nff) and an impact avoidance benchmarking tool. Advances in Through-life Engineering Services, Cham: Springer; 2017. p 245–259.
[6] Tjahjono B, Teixeira ELS, Alfaro SCA. An on-line simulation to link asset condition monitoring and operations decisions in through-life engineering services. In Proceedings of 2013 Winter Simulation Conference. 2013.
[7] Erkoyuncu JA, Khan S, Hussain SMF, Roy R. A framework to estimate the cost of no-fault found events. International Journal of production economics 2016;173:207–222.
[8] Lee WJ, Wu H, Huang A, Sutherland JW. Learning via acceleration spectrograms of a dc motor system with application to condition monitoring. The International Journal of Advanced Manufacturing Technology 2020;106(3-4):803–816.
[9] Lu Y, Xie R, Liang SY. Detection of weak fault using sparse empirical wavelet transform for cyclic fault. The International Journal of Advanced Manufacturing Technology 2018;99(5-8):1195–1201.
[10] Dragomir OE, Gouriveau R, Dragomir F, Minca E, Zerhouni N. Review of prognostic problem in condition-based maintenance. In 2009 European Control Conference. 2009.
[11] Baek S, Kim DY. Empirical sensitivity analysis of discretization parameters for fault pattern extraction from multivariate time series data. IEEE transactions on cybernetics 2016;47(5):1198–1209.
[12] Chandola V, Banerjee A, Kumar V. Anomaly detection: A survey. ACM computing surveys 2009;41(3):1–58.
[13] Georgoulas G, Karvelis P, Loutas T, Stylios CD. Rolling element bearings diagnostics using the symbolic aggregate approximation. Mechanical Systems and Signal Processing 2015;60:229–242.
[14] Mattes A, Schöpka U, Schellenberger M, Scheibelhofer P, Leditzky G. Virtual equipment for benchmarking predictive maintenance algorithms. In Proceedings of the 2012 Winter Simulation Conference. 2012.
[15] Yiakopoulos C, Gryllias K, Chioua M, Hollender M, Antoniadis I. An on-line sax and hmm-based anomaly detection and visualization tool for early disturbance discovery in a dynamic industrial process. Journal of Process Control 2016;44:134–159.
[16] Liu B, Li J, Chen C, Tan W, Chen Q, Zhou M. Efficient motif discovery for large-scale time series in health care. IEEE Transactions on Industrial Informatics 2015;11(3):583–590.
[17] Keogh E, Lin J, Lee SH, Van Herle H. Finding the most unusual time series subsequence: Algorithms and applications. Knowledge and Information Systems 2007;11(1):1–27.
[18] Mitsa T. Temporal data mining. Temporal Pattern Discovery, Boca Raton: CRC Group; 2010. p.153-200.
[19] Karvelis P, Georgoulas G, Tsoumas IP, Antonino-Daviu JA, Climente-Alarcón V, Stylios CD. A symbolic representation approach for the diagnosis of broken rotor bars in induction motors. IEEE Transactions on Industrial Informatics 2015;11(5):1028–1037.
[20] Georgoulas G, Karvelis P, Stylios CD, Tsoumas IP, Antonino-Daviu JA, Corral-Hernández J, Climente-Alarcón V, Nikolakopoulos G. Automatizing the detection of rotor failures in induction motors operated via soft-starters. In Proceedings of IECON 2015-41st Annual Conference of the IEEE Industrial Electronics Society, 2015.
[21] Duan L, Zhang Y, Zhao J, Wang J, Wang X, Zhao F. A hybrid approach of sax and bitmap for machinery fault diagnosis. In Proceedings of 2016 International Symposium on Flexible Automation. 2016.
[22] Othman Z, Eshames HF. Abnormal patterns detection in control charts using classification techniques. Int J Adv Comput Technol 2012;4(10):61–70.
[23] Wang J, Balasubramanian A, Mojica de la Vega L, Green JR, Samal A, Prabhakaran B. Word recognition from continuous articulatory movement time-series data using symbolic representations. In Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies, Association for Computational Linguistics, 2013.
[24] Chen J. A predictive system for blast furnaces by integrating a neural network with qualitative analysis. Engineering Applications of Artificial Intelligence 2001;14(1):77–85.
[25] Huang G, Song S, Gupta JN, Wu C. Semi-supervised and unsupervised extreme learning machines. IEEE Transactions on cybernetics 2014;44(12):2405–2417.
[26] Seliya N, Khoshgoftaar TM. Software quality estimation with limited fault data: A semi-supervised learning perspective. Software Quality Journal 2007;15(3):327–344.
[27] Wei L, Keogh E. Semi-supervised time series classification. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2006.
[28] Theissler A. Detecting known and unknown faults in automotive systems using ensemble-based anomaly detection. Knowledge-Based Systems 2017;123:163–173.
[29] Schwenker F, Trentin E. Pattern classification and clustering: A review of partially supervised learning approaches. Pattern Recognition Letters 2014;37:4–14.
[30] Wu H, Yu Z, Wang Y. Real-time FDM machine condition monitoring and diagnosis based on acoustic emission and hidden semi-markov model. The International Journal of Advanced Manufacturing Technology 2017;90:2027–2036.
[31] Zhao Y, Ball R, Mosesian J, de Palma JF, Lehman B. Graph-based semi-supervised learning for fault detection and classification in solar photovoltaic arrays. IEEE Transactions on Power Electronics 2014;30(5):2848–2858.
[32] Jiang L, Xuan J, Shi T. Feature extraction based on semi-supervised kernel marginal fisher analysis and its application in bearing fault diagnosis. Mechanical Systems and Signal Processing 2013;41(1-2):113–126.
[33] Meng J, Wu L, Wang X, Lin T. Granulation-based symbolic representation of time series and semi-supervised classification. Computers and Mathematics with Applications 2011;62:3581–3590.
[34] Le Guennec A, Malinowski S, Tavenard R. Data augmentation for time series classification using convolutional neural networks. In Proceedings of ECML/PKDD Workshop on Advanced Analytics and Learning on Temporal Data, 2016.
[35] Batch A, Lee K, Maddali HT, Elmqvist N. Gesture and action discovery for evaluating virtual environments with semi-supervised segmentation of telemetry records. In Proceedings of 2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). 2018.
[36] Veselý K, Hannemann M, Burget L. Semi-supervised training of deep neural networks. In Proceedings of 2013 IEEE Workshop on Automatic Speech Recognition and Understanding. 2013.
[37] Landau HJ. Sampling, data transmission, and the nyquist rate. Proceedings of the IEEE 1967;55(10):1701–1706.
[38] Baek S, Kim DY. Effects of sampling rate on the performance of multidimensional discretization-based fault detection. In Proceedings of the 2015 Spring Conference of Korean Institute of Industrial Engineers, 2015.