
Imbalance classification in a scaled-down wind turbine using radial basis function kernel and support vector machines

Tiago de Oliveira Nogueira^a, Gilderlânio Barbosa Alves Palacio^a, Fabrício Damasceno Braga^b, Pedro Paulo Nunes Maia^b, Elineudo Pinho de Moura^b, Carla Freitas de Andrade^a, Paulo Alexandre Costa Rocha^a

^a Departamento de Engenharia Mecânica, Universidade Federal do Ceará, 60455-760, Fortaleza, CE, Brazil
^b Departamento de Engenharia Metalúrgica e de Materiais, Universidade Federal do Ceará, 60455-760, Fortaleza, CE, Brazil

Abstract

This work innovates by proposing the combination of DFA with the SVM and RBFK methods, two supervised algorithms that use the kernel method, for imbalance level classification in a scaled-down wind turbine. The results obtained were compared with those of other techniques proposed in previous works. The vibration signals analyzed were acquired under certain working conditions and can be grouped into 3 or 7 categories. It is worth mentioning that the dataset examined here is composed of the same signals used in previous works, allowing a direct comparison of results.
The aforecited kernel methods (Support Vector Machine and Radial Basis Function Kernel) classified, with a high success rate, the output of detrended fluctuation analysis (DFA) of vibration signals according to their respective working conditions.
In the classification of the three major classes, the performance achieved by both classifiers reduces with the increase in rotation speed. The best average success rates reached at 900 rpm, 1200 rpm and 1500 rpm were, respectively, 99.96% by RBFK, 99.24% by RBFK and 98.73% by SVM. For seven imbalance levels, both classifiers again showed their best performance at 900 rpm.

In this case, the best rate reached 98.83%, obtained by RBFK. At 1200 and 1500 rpm, the rates were slightly different.
Keywords: Machine Learning classification, Vibration analysis, Support Vector Machine, Radial Basis Function Kernel

1. Introduction

The Intergovernmental Panel on Climate Change (IPCC) report states that the average global temperature increased by 1.5 °C compared to the pre-industrial period. The report also recommends reverting this warming by the end of the century to avoid irreversible and catastrophic impacts. This implies reducing carbon dioxide (CO2) emissions by 45% by 2030 and reaching net-zero carbon emissions by 2050. The IPCC report states that unprecedented social and economic changes are required to achieve this goal [1].
The use of renewable energies presents high economic and environmen-
tal interest. Among the renewable sources of energy, wind energy is clean,
environment-friendly, and secure [2]. However, it is essential to ensure the
performance and reliability of wind turbines, especially in components such as
blades, where some of the most common failures happen [3, 4].
Approximately 20% of the total cost of the wind turbine corresponds to the
blades [5]. These components are subject to the imbalance caused by adverse
weather conditions [6, 7, 8, 9], which leads to a loss of performance.
Erosion and dust accumulation on the blades affect turbine efficiency [10].
Han et al. [11] studied the wear on wind blades and its impact on wind turbine
efficiency. They observed a drop in the lift coefficient value and a concomitant
increase in the drag coefficient, leading to a reduction of 2% to 3.7% in energy
production.
Signal processing techniques are widely used tools for fault diagnosis [12,
13]. However, the various vibration sources present in a wind turbine (such
as gearboxes, bearings, etc.), with different frequencies, and even wind gusts,

produce signals with transient or non-stationary characteristics, hindering their
analysis [14]. For this reason, techniques based on monitoring temporal changes
of statistical parameters, or the pattern identification of the natural frequencies
of faults, are not always suitable for the analysis of these complex signals.
On the other hand, the combination of detrended fluctuation analysis (DF Analysis) and classifiers with supervised learning algorithms has proved successful for bearing and gearbox fault diagnosis [15, 16], for the imbalance evaluation of scaled-down wind turbines under different rotational regimes [17, 18], and for retina image classification [19].
Just like several other pattern recognition techniques, the kernel-methods
can identify general relations in datasets and find discriminant functions. Unlike
many algorithms that explicitly map the input vectors into a feature space to
perform the separation between classes, kernel-methods emulate dot products
in the feature space by means of the kernel trick [20], i.e., without knowing the
nonlinear mapping that projects the input vector into the feature space. To
achieve this task, a kernel function must be chosen among many options that
are available in the literature, such as the radial basis function (RBF) and the
polynomial function. The feature space in this case is a special type of Hilbert
space, namely, the reproducing kernel Hilbert space. Support Vector Machines
(SVM) and Radial Basis Function Kernel (RBFK) are examples of algorithms
that use the kernel function.
Support vector machine (SVM) gets the class decision from support vectors,
avoiding assumptions about the shape of the underlying class distributions. By
minimizing the structural risk in a classification task, the SVM avoids overfitting
and achieves a good generalization [21, 22, 23].
The radial basis function kernel (RBFK) is a kind of artificial neural network
(ANN) commonly used for regression or classification. Likewise, the RBFK is
composed of layers of interconnected neurons. However, the neurons of its single
hidden layer use kernel functions (e.g., the Gaussian function), which enables
non-recursive solution techniques.
The innovative proposal of the work is to use detrended fluctuation analysis

with pattern recognition algorithms as an effective tool to diagnose the imbalance of wind turbines. In the literature, vibration signal classification methods have been proposed for incorporation into maintenance schedule management [24, 25], avoiding unnecessary downtime to identify the type of defect or to perform corrective maintenance. This work applies the vibration anal-
ysis method, which has been used in other areas of study, as a method to help
ensure the integrity of the wind turbine and optimize energy generation.
The next sections are organized as follows: Section 2 is a description of the dataset analyzed; brief explanations of DF Analysis, Support Vector Machine, Radial Basis Function Network, and hyperparameter optimization are presented in Sections 3, 4, 5, and 6, respectively; Section 7 describes the results obtained by applying a combination of DF Analysis of vibration signals and kernel methods (SVM and RBFK) and compares them with the results from a similar approach involving other pattern recognition techniques; Section 8
contains the conclusions.

2. Experimental setup and dataset

The vibration signals evaluated here are the same examined in previous
studies. The aim of using this dataset is to compare the performance of different
classifiers. For this reason, this section describes the experimental setup and
dataset briefly. Detailed explanations are available in references [17] and [18].
The signal recording system consists of an accelerometer (Bruel Kjaer,
model 4381V), an amplifier (Bruel Kjaer, model 2692), and an oscilloscope
(Tektronix, model 1062 TBS). The accelerometer was positioned on the shaft
bearing of a scaled-down wind turbine.
The blades were designed with the NREL S809 airfoil profile and a tip speed ratio λ equal to seven, and were produced by additive manufacturing. Every blade is 0.20 m in length and weighs 15 grams.
The dataset is arranged into three main classes that reproduce some distinct
operation conditions of a wind turbine. The conditions are defined as follows:

first, imbalance by an excess of mass in one blade; second, imbalance by lack of mass in one blade; and, third, balanced turbine, as seen in Figure 1. Heavier blades are depicted in red and lighter blades in blue. The black color in the tip of the blades shows where the masses were added.

Figure 1: Representation of the three main conditions. For all cases, the dissimilar blade is always named blade 1: (a) imbalance by an excess of mass in one blade, (b) imbalance by lack of mass in one blade, (c) balanced turbine.
To obtain the first condition, three different quantities of mass (0.5, 1.0, or 1.5 g), one at a time, were added at the tip of one blade, yielding three subclasses.
In the second case, a certain quantity of mass was added, simultaneously, at
the tip of two blades. The mass quantities used to produce the second situation
are identical to those used in the first condition, leading to three additional
subclasses.
The third group simulates a balanced wind turbine under normal operating
conditions.
The identification of the main classes and subclasses, and their respective
descriptions, is presented in Table 1. Figure 2 shows representative normalized
vibration signals.
For each one of the seven imbalance levels, fifty vibration signals were acquired, yielding 350 signals. Furthermore, for all aforecited conditions, the turbine worked at three rotation frequencies (900, 1200, and 1500 rpm), resulting in a dataset composed of 1050 signals.

Figure 2: Representative signal plots obtained under the three main conditions, at a rotation frequency of 1200 rpm: (a) signal acquired from a turbine imbalanced by an excess of mass in one blade, (b) signal acquired from a turbine imbalanced by lack of mass in one blade, (c) signal acquired from a balanced turbine, (d) normalized signal depicted in Figure 2a, (e) normalized signal depicted in Figure 2b, (f) normalized signal depicted in Figure 2c.
Each signal has 500 data points acquired at a sampling rate of 250 Hz (250 samples/s). A low-pass filter with a cut-off frequency of 100 kHz and a high-pass filter with a cut-off frequency of 1 kHz were applied.

Table 1: Nomenclature of each unbalance class and subclass, for the sets with 3 and 7 different classes, respectively.

Class  Subclass  Description                 Blade's mass [gram]            System's mass [gram]
                                             blade #1  blade #2  blade #3
C1     C1-0.5g   0.5 g added to one blade    15.5      15.0      15.0       45.5
       C1-1.0g   1.0 g added to one blade    16.0      15.0      15.0       46.0
       C1-1.5g   1.5 g added to one blade    16.5      15.0      15.0       46.5
C2     C2-0.5g   0.5 g added to two blades   15.0      15.5      15.5       46.0
       C2-1.0g   1.0 g added to two blades   15.0      16.0      16.0       47.0
       C2-1.5g   1.5 g added to two blades   15.0      16.5      16.5       48.0
C3     C3        Balanced system             15.0      15.0      15.0       45.0

3. Detrended fluctuation analysis

In summary, the DFA method consists of four steps [26]. Firstly, from a time series $u_i$ of length $N$, an integrated time series $y_j$ is created, as shown by Equation 1,

$$ y_j = \sum_{i=1}^{j} \left( u_i - \langle u \rangle \right) \qquad (1) $$

where $\langle u \rangle$ is the overall average of the original series,

$$ \langle u \rangle = \frac{1}{N} \sum_{i=1}^{N} u_i \qquad (2) $$

Secondly, this new series is divided into $N/\tau$ intervals of size $\tau$ and the local trend $\tilde{y}_j$ is removed inside each one. Thirdly, the fluctuation (root mean square) of the detrended data inside each interval and the average fluctuation over all intervals are calculated,

$$ f_k(\tau) = \sqrt{\frac{1}{\tau} \sum_{j \in I_k} \left( y_j - \tilde{y}_j \right)^2 } \qquad (3) $$

Finally, repeating this process for different interval sizes builds a curve of the average fluctuation as a function of the interval size. Figure 3 shows $\log_{10}(F(\tau))$ against $\log_{10}(\tau)$.
It is noteworthy that some points at the end of the time series are not
analyzed if the series length is not a multiple of τ .
In the same way as in Moura et al. [17] and Melo Junior et al. [18], to
avoid this possibility, here we used a DFA algorithm slightly different from the
original proposed by Peng [26] but based on Podobnik and Stanley [27]. Thus,
for each interval Ik , the variance of the residuals and the covariance F (τ ) were
calculated over all overlapping N − τ + 1 intervals of size τ as

$$ f_k^2(\tau) = \frac{1}{\tau - 1} \sum_{i \in I_k} \left( y_i - \tilde{y}_i \right)^2 \qquad (4) $$

$$ F(\tau) = \frac{1}{N - \tau + 1} \sum_{k} f_k(\tau) \qquad (5) $$

Figure 3: Representative DFA plots obtained under the three main conditions, at a rotation frequency of 1200 rpm: (a) DFA from the signal depicted in Figure 2d, (b) DFA from the signal depicted in Figure 2e, (c) DFA from the signal depicted in Figure 2f.
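To make the procedure above concrete, the Python sketch below implements the overlapping-window DFA variant of Eqs. 1, 2, 4 and 5. It is only an illustration under stated assumptions: the local trend is taken as a linear (first-order) fit, and the scale range and file name in the usage comment are hypothetical, since the text does not specify them.

import numpy as np

def dfa_overlapping(u, scales):
    """Detrended fluctuation analysis with overlapping windows (sketch).

    Follows Section 3: integrate the series (Eq. 1), detrend each window
    with an assumed linear fit, compute the windowed fluctuation (Eq. 4)
    and average it over all N - tau + 1 overlapping windows (Eq. 5).
    """
    u = np.asarray(u, dtype=float)
    n = len(u)
    y = np.cumsum(u - u.mean())                 # Eq. 1 (mean from Eq. 2)
    log_tau, log_f = [], []
    for tau in scales:
        t = np.arange(tau)
        fk = np.empty(n - tau + 1)
        for k in range(n - tau + 1):            # all overlapping windows
            seg = y[k:k + tau]
            trend = np.polyval(np.polyfit(t, seg, 1), t)
            fk[k] = np.sqrt(np.sum((seg - trend) ** 2) / (tau - 1))   # Eq. 4
        F = fk.mean()                           # Eq. 5
        log_tau.append(np.log10(tau))
        log_f.append(np.log10(F))
    return np.array(log_tau), np.array(log_f)

# Hypothetical usage with one 500-point signal sampled at 250 Hz:
# signal = np.loadtxt("signal.txt")             # file name is illustrative
# lt, lf = dfa_overlapping(signal, scales=range(4, 126))

The resulting log10(τ) versus log10(F(τ)) curve (as in Figure 3) is the DFA output that serves as the feature vector for the classifiers discussed in the following sections.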

4. Support vector machine

Support vector machines (SVM) [28] have been widely used to solve classification problems. These machines approach the concepts of supervised learning through a well-founded mathematical theory. The SVM learning process is
based on a structural risk minimization principle. This principle is related to the
discriminant function complexity associated with the classification task of two
distinct labels (d=-1 and d=+1) [29]. The discriminant function is an optimal
classification surface estimated through the training samples during the learning
process. The formulation of an SVM model consists of solving two optimization
problems. The first one, called primal optimization problem, is given by:

$$ \min_{w,\,\xi} \ \tau(w, \xi) = \frac{1}{2} w^T w + C \sum_{i=1}^{n} \xi_i, \quad \text{s.t.} \quad d_i \left( w^T x_i + b \right) \geq 1 - \xi_i, \ \ \xi_i \geq 0, \ \forall i \qquad (6) $$
where w is the normal vector to the classification surface, b is the threshold
of the discriminant function and {xi , di } corresponds to the i-th sample of the
n
training set {xi , di }i=1 ⊂ RN × {−1, +1}. The minimization of the first term of
τ (w, ξ) represents the maximization of the margin of separation, which is the
distance between the hyperplanes that intersect the support-vectors of each of
the two classes. Its maximization is related to structural risk minimization [29].
The ξ_i are slack variables that permit margin failures, and C is a hyperparameter of the model that trades off a wide margin against a small number of margin failures, being responsible for the relaxation of the margin.
Lagrange multiplier methods are used to deal with the primal problem presented in Eq. 6. The solution is determined by the saddle point of the Lagrangian function, which must be maximized with respect to the Lagrange multipliers α_i. Through the saddle-point condition, it is possible to formulate the following optimization problem, called the dual optimization problem:

$$ \max_{\alpha} \ G(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j d_i d_j K(x_i, x_j), \quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i d_i = 0, \quad 0 \leq \alpha_i \leq C, \ \forall i \qquad (7) $$
The function K(x_i, x_j) represents the kernel function, which gives the convolution of the dot-product of vectors x_i and x_j in feature space [29]. The initial
formulation of SVM classifiers was only able to solve linear classification prob-
lems. By using kernel functions, it is possible to transform the input features
nonlinearly to a feature space of higher dimension in which linear methods may
be applied [30]. The kernel function used in the present work was the Gaussian
kernel, which is given by:
$$ K(x_i, x_j) = \exp\left( \frac{-\left\| x_i - x_j \right\|^2}{\sigma^2} \right) \qquad (8) $$

where ‖x_i − x_j‖ is the Euclidean distance between x_i and x_j, and the parameter σ is another hyperparameter of the model (along with C).
Solving the dual problem (Eq. 7 ) will provide the optimal Lagrange mul-
tipliers. With them, it is possible to calculate the discriminant function of the
SVM model, which is given by:

n
X
f (x) = αio di K (xi , x) + b (9)
i=1

where α_i^o represents the optimal values of the Lagrange multipliers. Those values are obtained during the training phase, by solving the dual optimization problem for all the training samples.
The dual optimization problem presented in Eq. 7 is a Quadratic Program-
ming (QP) problem, whose solution can be obtained by using various libraries,
software toolboxes, and specific algorithms. The current paper used the Se-
quential Minimal Optimization algorithm [31] to train the SVM classifiers. The
formulation of SVM classifiers is intended to solve binary classification prob-
lems. However, these classifiers are widely used to solve multiclass problems
through different classification strategies that combine the outputs of multiple
binary classifiers [32]. This work applied the One-Against-One approach
for solving a multiclass problem.
SVM is the most widely used kernel-learning algorithm due to its great gen-
eralization capacity, relative ease of use, and elegant theoretical foundations,
achieving a robust performance in classification and regression problems. Fur-
thermore, when compared with classical learning algorithms, SVM scales relatively well with high-dimensional data. It may even perform accurately in cases with small training sets [33].
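As an illustration of how such a classifier could be assembled, the sketch below uses scikit-learn's SVC, whose libsvm back end implements an SMO-type solver and a one-against-one multiclass scheme. This is not the authors' original code: the placeholder data, the feature standardization and the C and gamma values are assumptions for demonstration only.

import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

# X: DFA feature vectors (one row per signal), y: imbalance class labels.
# Random placeholder data stands in for the real dataset here.
rng = np.random.default_rng(0)
X = rng.normal(size=(350, 30))
y = rng.integers(0, 7, size=350)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y)

# RBF (Gaussian) kernel SVC; gamma plays the role of 1/sigma^2 in Eq. 8 and
# C is the margin-relaxation hyperparameter of Eq. 6.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))

In practice, C and gamma would be chosen by the hyperparameter search strategies described in Section 6 rather than fixed as above.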

5. Radial Basis Function Kernel

An artificial neural network (ANN) consists of a set of neurons, i.e. mathematical operators, organized in interconnected layers. Its primary purpose is to
assimilate, analyze and generalize complicated patterns and information [33].
It is usual to divide the network into three distinct layers: an input layer,
whose essential role is to transmit external information into neurons to start the
information flow; hidden layers that process the data and transfer it to a final
layer, the output layer, which provides a response calculated by the network
and serves as a comparison metric with expected values [34].
Radial basis function neural networks (RBFN) can be understood as a peculiar ANN class. They have characteristics distinct from the others: a fixed structure with only one intermediate layer containing several radial basis neurons and an output layer of linear neurons. What sets them apart the most, however, is the kernel method, which applies a transformation from the initial input space into a high-dimensional space to maximize class separability, as SVM does, following Cover's Theorem [35].
A typical radial basis function is the Gaussian used in the SVM's Eq. 8, modified here into Eq. 10, where x represents an input attribute vector, µ the center of each neuron, and σ the spread of the function. The resulting network can then be called RBFK. The main difference between the two applications lies in how the Euclidean distance is determined: in the SVM's equation it is calculated between pairs of training patterns, whereas in the RBFK's form it is determined between a training pattern and a neuron centroid.

$$ \varphi(x) = \exp\left( -\frac{\left\| x - \mu \right\|^2}{\sigma^2} \right) \qquad (10) $$
It is common practice to normalize each hidden layer neuron's output [6]. Hence, we can write Eq. 11 as the neuron's output equation and summarize the network's response in Eq. 12.

$$ \Phi_i(x) = \frac{\varphi_i(x, \mu_i)}{\sum_{j=1}^{k} \varphi_j(x, \mu_j)} \qquad (11) $$

$$ f(x) = \sum_{j=1}^{k} \omega_j \Phi_j(x, \mu) \qquad (12) $$
There are several RBFK solution methods, recursive or analytical. The
recursive solution is, in general, used in most neural networks, but it implies a
greater need for training data and longer processing time. Thus, the selected
procedure was ordinary least squares, an analytical one [35, 36]. The work methodology used this technique together with a hyperparameter optimization approach, which, in the RBFK's case, means optimizing the number/centers of the neurons and the spread of the activation function.
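A minimal sketch of such a network is given below. It assumes k-means for the choice of the neuron centers and a single spread value shared by all neurons; since the text does not state how the centers are selected, this is only an illustration of the normalized-Gaussian hidden layer (Eqs. 10 and 11) and the ordinary-least-squares output layer (Eq. 12), not the authors' implementation.

import numpy as np
from sklearn.cluster import KMeans

class RBFNetwork:
    """Sketch of an RBF network solved analytically by least squares."""

    def __init__(self, n_centers=20, sigma=1.0):
        self.n_centers = n_centers      # number of hidden neurons (assumed)
        self.sigma = sigma              # shared spread of the Gaussians (assumed)

    def _hidden(self, X):
        # Gaussian activations (Eq. 10) followed by normalization (Eq. 11)
        d2 = ((X[:, None, :] - self.centers_[None, :, :]) ** 2).sum(axis=2)
        phi = np.exp(-d2 / self.sigma ** 2)
        return phi / phi.sum(axis=1, keepdims=True)

    def fit(self, X, y):
        # centers chosen by k-means (an assumption; other choices are possible)
        self.centers_ = KMeans(n_clusters=self.n_centers, n_init=10).fit(X).cluster_centers_
        self.classes_, y_idx = np.unique(y, return_inverse=True)
        T = np.eye(len(self.classes_))[y_idx]          # one-hot class targets
        H = self._hidden(X)
        # output weights from ordinary least squares (analytical solution, Eq. 12)
        self.w_, *_ = np.linalg.lstsq(H, T, rcond=None)
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._hidden(X) @ self.w_, axis=1)]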

6. Hyperparameter optimisation

The solution algorithms are composed of parameters and hyperparameters.


We can calculate the former through algorithm training, which allows an analytical solution for their correct definition, for example, the synaptic weights of an ANN. The latter, however, must be defined before training starts, as they are associated with the algorithm's performance and its computational cost [36]. Over the years, automatic optimization methods have emerged for hyperparameter search, including Grid Search [37], Random Search [38], Gradient-based Optimization [36], and others [39]. Two methods were used in the present work: Grid Search and Random Search. Both techniques apply an established interval and resolution of the search for each hyperparameter to define the number of possible analysis points, so, in general, they are very similar. However, the Grid Search method evaluates the candidate points in a pre-established order, while Random Search merely randomizes them [36].
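The sketch below illustrates the two strategies with scikit-learn, taking the SVM hyperparameters C and gamma as an example; the intervals, resolutions and number of random draws shown are illustrative assumptions, not the values used in this work.

from scipy.stats import loguniform
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

# Grid Search: exhaustive evaluation over a pre-established grid of points.
grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1, 1]},
    cv=5,
)

# Random Search: candidates drawn at random from the same intervals.
rand = RandomizedSearchCV(
    SVC(kernel="rbf"),
    param_distributions={"C": loguniform(1e-1, 1e2), "gamma": loguniform(1e-3, 1)},
    n_iter=16,
    cv=5,
    random_state=0,
)

# Hypothetical usage, given training data X_train and y_train:
# grid.fit(X_train, y_train); rand.fit(X_train, y_train)
# print(grid.best_params_, rand.best_params_)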

7. Results and discussion

This section shows the results obtained by the SVM and RBFK algorithms
used to classify vibration signals according to imbalance levels of a scaled-down
wind turbine at different rotational regimes. The final purpose is a performance
evaluation of classifiers based on kernel methods against non-kernel methods in
this classification task.
For all classifiers and rotation speeds presented, 80% of the vectors generated
by the DF Analysis of the vibration signals are randomly selected to define the
training subset. After the learning process, the remaining 20% of the vectors are
used to evaluate the generalization of the classifier, i.e., its ability to distinguish
data not presented during the training process. In all cases, this process was
repeated 100 times to ensure good statistical significance.
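A possible implementation of this evaluation protocol is sketched below. It assumes integer-coded class labels and stratified random splits, neither of which is stated explicitly in the text, and it simply averages the row-normalized confusion matrices over the 100 repetitions.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

def average_confusion(clf, X, y, n_runs=100, test_size=0.2, n_classes=7):
    """Average per-class confusion matrix (in %) over repeated 80/20 splits."""
    acc = np.zeros((n_classes, n_classes))
    for run in range(n_runs):
        # stratified split is an assumption; 20% of the vectors form the test set
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=run)
        clf.fit(X_tr, y_tr)
        cm = confusion_matrix(y_te, clf.predict(X_te), labels=list(range(n_classes)))
        acc += cm / cm.sum(axis=1, keepdims=True) * 100   # row-normalize to %
    return acc / n_runs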
Table 2 summarizes the success and error rates obtained by the SVM and RBFK algorithms used for the three-class classification case. Tables 3 and 4 show, respectively, the average confusion matrices produced by the RBFK and SVM algorithms applied for the seven-class discrimination. The values in Tables 2, 3, and 4 correspond to averages calculated over 100 distinct test sets, used exclusively for testing.
In these three confusion matrices, columns represent the classifier output, while rows indicate the actual class. In this sense, every row sums to 100% for each of the tested models. Figures on the main diagonal correspond to the correct classification rate, while off-diagonal figures correspond to misclassifications.

Table 2: Average confusion matrices obtained by the SVM (left side in the results cells) and RBFK (right side in the results cells) classifiers used for the three-class classification, for three rotation frequencies. Averages are calculated over 100 test datasets.

                          Classifiers output
                    C1               C2               C3
 900 RPM   C1   99.89 | 99.97     0.11 | 0.03      0    | 0
           C2    0    | 0.03    100    | 99.97     0    | 0
           C3    0    | 0         0.71 | 0.05     99.29 | 99.95
1200 RPM   C1   98.60 | 99.03     0.62 | 0.40      0.78 | 0.57
           C2    0    | 0.07    100    | 99.93     0    | 0
           C3    1.07 | 0.41      0    | 0.81     98.93 | 98.78
1500 RPM   C1   98.95 | 98.13     0.27 | 0.88      0.78 | 0.99
           C2    2.12 | 2.27     97.54 | 97.46     0.34 | 0.27
           C3    0    | 0         0.31 | 0        99.69 | 100
It can be seen in Table 2 that the performance of the classifiers used for the
ternary classification is slightly different between methods and reduces with the
increase in the rotation speed of the turbine. The best average success rates
reached by SVM and RBFK were, respectively, 99.73% and 99.96% at 900 rpm.
The second-best average rates were 99.17% and 99.24% at 1200 rpm, followed
by 98.73% and 98.53% at 1500 rpm.
As already observed in the ternary classification scheme, Tables 3 and 4
show reductions in the average success rate of the RBFK and SVM classifiers
employed for discrimination of the seven imbalance levels at higher rotations.
Table 3 shows that the best result was observed for 900 rpm, whereas the
higher misclassification occurred at 1500 rpm. In this case, 4.06% of data from subclass C1-0.5g (i.e., 0.5 gram added to the tip of one blade) were assigned
incorrectly to the subclass C2-1.0g (i.e., 1.0 gram added to the tip of two blades).
Moreover, 8.6% of data from subclass C1-0.5g were classified incorrectly among
all other subclasses.
As Table 4 shows, the worst misclassification happened at 1200 rpm when
22.73% of data from subclass C1-1.0g (i.e., one blade with 1.0 gram additional)
were classified mistakenly as belonging to subclass C1-1.5g (one blade with
1.5 gram extra). Likewise, 7.92% of data belonging to subclass C1-1.5g were
wrongly considered as subclass C1-1.0g. Again, the best results were observed
for 900 rpm.
Tables 3 and 4 also show that the subclasses C1-1.0g and C1-1.5g and the
subclasses C2-0.5g and C2-1.0g are the most frequently confused imbalance levels, showing the largest misclassification rates.
These results are coherent with those presented by Moura et al. [17] and
Melo Junior et al. [18]. The authors used three classifiers implemented by a
multilayer perceptron (MLP), a Gaussian discriminator (GC), and a classifier
based on Karhunen-Loève (KL) transformation to classify the same dataset
under study here. According to these previous works, the imbalance levels of
the wind turbine evaluated were most easily separable at 900 rpm, regardless
of the classifier. Papers that used the same methods in other fields of study achieved similar success rates [40].
On the other hand, all classifiers reached their largest misclassifications dur-
ing the classification of data derived from vibration signals acquired at 1200
rpm.
The authors [17, 18] understood that 1200 rpm could be closer to the charac-
teristic vibration of the system. The procedure required to check this hypothesis
is beyond the scope of this work.
They reported that the multilayer perceptron misclassified 24.5% of data
belonging to subclass C1-1.0g as belonging to subclass C1-1.5g. This error
exceeds the largest misclassification performed by the SVM classifier (22.73%).
Meanwhile, the Gaussian discriminator misclassified 17.1% of data from sub-

Table 3: Average confusion matrices obtained by the RBFK classifier used for the seven-class classification, at three rotation frequencies. Averages calculated over 100 test datasets.

                          Classifiers output
                C1-0.5g  C1-1.0g  C1-1.5g  C2-0.5g  C2-1.0g  C2-1.5g  C3
 900   C1-0.5g   99.30     0.10     0.50     0        0        0.10     0
 RPM   C1-1.0g    0.60    97.61     1.29     0        0        0.50     0
       C1-1.5g    0        0      100        0        0        0        0
       C2-0.5g    0.19     0.10     0.10    97.91     1.30     0.10     0.30
       C2-1.0g    0        0        0        2.34    97.66     0        0
       C2-1.5g    0.10     0.11     0.32     0.11     0       99.36     0
       C3         0        0        0        0        0        0      100
1200   C1-0.5g   96.8      0.83     0.72     0.31     0.31     0        1.03
 RPM   C1-1.0g    0.1     96.75     2.46     0        0.1      0.2      0.39
       C1-1.5g    0.33     1.49    96.48     0.32     0.21     0.53     0.64
       C2-0.5g    0.2      0        0       99.03     0.48     0.19     0.19
       C2-1.0g    0        0        0        0      100        0        0
       C2-1.5g    0        0        0        0        0      100        0
       C3         0.26     0.34     0.15     0.59     0.34     0.29    98.03
1500   C1-0.5g   91.4      2.03     0.48     0.1      4.06     0        1.93
 RPM   C1-1.0g    0.32    99.15     0.21     0        0.21     0        0.11
       C1-1.5g    0        0.1     99.1      0.1      0.2      0        0.5
       C2-0.5g    0.41     0.68     0.58    92.59     3.7      0.19     1.85
       C2-1.0g    0.4      0.41     1.52     0.41    96.25     0.1      0.91
       C2-1.5g    0        0        0        0        0      100        0
       C3         0        0        0        0        0        0      100
Table 4: Average confusion matrices obtained by the SVM classifier applied for the seven-class classification, at three rotation frequencies. Averages calculated over 100 test datasets.

                          Classifiers output
                C1-0.5g  C1-1.0g  C1-1.5g  C2-0.5g  C2-1.0g  C2-1.5g  C3
 900   C1-0.5g  100        0        0        0        0        0        0
 RPM   C1-1.0g    0.92    99.08     0        0        0        0        0
       C1-1.5g    0        0       99.92     0        0        0.08     0
       C2-0.5g    0        0        0       93.66     6.34     0        0
       C2-1.0g    0        0        0        3.25    96.75     0        0
       C2-1.5g    0        0        0        0        0      100        0
       C3         0        0        0        0.4      0.84     0       98.76
1200   C1-0.5g   97.83     0        0        0        0        0        2.17
 RPM   C1-1.0g    0       77.27    22.73     0        0        0        0
       C1-1.5g    0.33     7.92    89.92     0.22     0        0.16     1.45
       C2-0.5g    0        0        0       99.03     0        0.88     0.09
       C2-1.0g    0        0        0        0.47    99.28     0        0.25
       C2-1.5g    0        0        0        1.87     0       98.13     0
       C3         0.6      0        0        0        0        0       99.4
1500   C1-0.5g   92.54     4.92     0        0        0        0        2.54
 RPM   C1-1.0g    1.88    98.12     0        0        0        0        0
       C1-1.5g    0        0       99.89     0        0.11     0        0
       C2-0.5g    0        0.19     0       98.93     0        0        0.88
       C2-1.0g    4.15     0        3.45     0       92.32     0        0.08
       C2-1.5g    0        0        0        0        0      100        0
       C3         0.1      0        0        0        0        0       99.9

class C1-1.5g as belonging to subclass C1-1.0g, while the KLT classifier mistakenly associated 22.8% of data from subclass C1-1.5g with several other subclasses.
Additionally, the imbalance levels most frequently misclassified by the kernel
methods under study (C1-1.0g/C1-1.5g and C2-0.5g/C2-1.0g) were also consid-
ered similar and grouped by k-means clustering, an automatic classifier with an
unsupervised learning algorithm.
Despite all the misclassifications found among the imbalance subclasses, the
RBFK classifier reached 98.83%, 98.16%, and 96.93% of success against 98.31%,
94.40%, and 97.39% of the SVM classifier (respectively at 900 rpm, 1200 rpm,
and 1500 rpm).
It is important to highlight that the first group represents the most un-
balanced scenarios where one blade is heavier than the others. The heaviest
blade of the subclasses C1-1.0g and C1-1.5g comprises, respectively, 34.78%
and 35.48% of the system’s mass, resulting in a higher vibration. Consequently,
a low signal-to-noise ratio may be a possible reason for these mistakes.
The second group presents one blade lighter than the others. The lightest
blade of the subclasses C2-0.5g and C2-1.0g comprises, respectively, 32.61% and 31.91% of the system's mass.
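These percentages follow directly from the blade and system masses in Table 1, e.g.:

$$ \frac{16.0}{46.0} = 34.78\%, \qquad \frac{16.5}{46.5} = 35.48\%, \qquad \frac{15.0}{46.0} = 32.61\%, \qquad \frac{15.0}{47.0} = 31.91\% $$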
Furthermore, within the same category, the difference between the mass of
the dissimilar blade (always named blade #1) and the mass of any other in
relation to the system's mass, is more pronounced in the first group than in the second.
Figure 4 summarizes and compares the best results obtained with the kernel-
methods and non-kernel-methods classifiers used for the septenary classification.
An important point to note is related to the superior performance of the RBFK
classifier at 1200 rpm, i.e., at the rotation in which all classifiers presented their
worst performances.
Yu et al. [41] used radial basis function networks for dynamic system
design. According to them, good generalization and strong tolerance to input
noise are some of the advantages of radial basis function networks over traditional neural networks and fuzzy inference systems.

Figure 4: Performance comparison of the classifiers that apply kernel functions (SVM and RBFK) against those which do not use kernel functions (GC, KLT and MLP) for the test stage of the dataset with 7 classes and three rotation speeds.


Another possible explanation for the better performance of the radial func-
tion network may rely on the No Free Lunch (NFL) Theorem [42]. It states
that if trials were performed over all possible scenarios of a given problem, on average, no method would be superior to the others. The practical implication
of this theorem is that some techniques will do better than others for some
scenarios, i.e., the effectiveness of a model depends directly on how well the
assumptions made by the model fit the nature of the data.
In this sense, for the scenario under study here, the RBFK method found
conditions for an improved performance when compared to the others.

8. Conclusion

Two kernel-methods (SVM and RBFK classifiers) were successfully used for the imbalance classification of a scaled-down wind turbine from vibration signals.

The imbalance levels can be arranged into three main classes or seven sub-
classes without any significant loss of performance.
Hence, for each method, the data classification was carried out for two different numbers of categories: ternary and septenary classification schemes.
In general, the success rate of classifiers decreased with the increase of the
rotation speed. Nevertheless, there was no misclassification rate superior to 2%
for the three-class classification scheme.
For the seven-class classification, the classification success rates achieved
by the two classifiers with kernel functions (SVM and RBFK) are similar to those obtained by the three other classifiers without kernel functions (multilayer perceptron, Gaussian classifier, and the classifier based on the Karhunen-Loève transformation).
Their average performances were never inferior to 94% for the testing sets.
The results suggest that all these tools can be useful to the development of
an automatic system for imbalance diagnosis in wind turbines.
The RBFK classifier presented superior performance when compared to the other methods (one proposed here and three from the literature) for the 1200 rpm rotation speed. Furthermore, it performed similarly at the other rotation speeds studied. In this sense, this method is strongly recommended for the task proposed here.

Acknowledgement

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. We also
gratefully acknowledge the financial support of the Brazilian agencies CNPq and FUNCAP.

References

[1] Report of the secretary-general on the 2019 climate action summit the
way forward in 2020.

URL https://www.un.org/en/climatechange/assets/pdf/cas_
report_11_dec.pdf

[2] F. M. Carneiro, H. B. Rocha, P. C. Rocha, Investigation of possible soci-


etal risk associated with wind power generation systems, Renewable and
Sustainable Energy Reviews 19 (2013) 30–36. doi:10.1016/j.rser.2012.
11.006.
URL https://doi.org/10.1016/j.rser.2012.11.006

[3] M. I. Blanco, The economics of wind energy, Renewable and Sustainable


Energy Reviews 13 (6-7) (2009) 1372–1382. doi:10.1016/j.rser.2008.
09.004.
URL https://doi.org/10.1016/j.rser.2008.09.004

[4] N. Dalili, A. Edrisy, R. Carriveau, A review of surface engineering issues


critical to wind turbine performance, Renewable and Sustainable Energy
Reviews 13 (2) (2009) 428–438. doi:10.1016/j.rser.2007.11.009.
URL https://doi.org/10.1016/j.rser.2007.11.009

[5] B. Yang, D. Sun, Testing, inspecting and monitoring technologies for wind
turbine blades: A survey, Renewable and Sustainable Energy Reviews 22
(2013) 515–526. doi:10.1016/j.rser.2012.12.056.
URL https://doi.org/10.1016/j.rser.2012.12.056

[6] E. Sagol, M. Reggio, A. Ilinca, Issues concerning roughness on wind turbine


blades, Renewable and Sustainable Energy Reviews 23 (2013) 514–525.
doi:10.1016/j.rser.2013.02.034.
URL https://doi.org/10.1016/j.rser.2013.02.034

[7] M. Soltani, A. Birjandi, M. S. Moorani, Effect of surface contamination


on the performance of a section of a wind turbine blade, Scientia Iranica
18 (3) (2011) 349–357. doi:10.1016/j.scient.2011.05.024.
URL https://doi.org/10.1016/j.scient.2011.05.024

[8] D. Daniher, L. Briens, A. Tallevi, End-point detection in high-shear granu-
lation using sound and vibration signal analysis, Powder Technology 181 (2)
(2008) 130–136. doi:10.1016/j.powtec.2006.12.003.
URL https://doi.org/10.1016/j.powtec.2006.12.003

[9] S. Gong, H. Cao, J. Zhang, H. Lu, Experimental study on the effect of blade
surface roughness on aerodynamic performance, IOP Conference Series:
Earth and Environmental Science 675 (1) (2021) 012090. doi:10.1088/
1755-1315/675/1/012090.
URL https://doi.org/10.1088/1755-1315/675/1/012090

[10] K. Papadopoulou, C. Alasis, G. A. Xydis, On the wind blade's surface


roughness due to dust accumulation and its impact on the wind tur-
bine's performance: A heuristic QBlade-based modeling assessment, En-
vironmental Progress & Sustainable Energy 39 (1) (2019) 13296. doi:
10.1002/ep.13296.
URL https://doi.org/10.1002/ep.13296

[11] W. Han, J. Kim, B. Kim, Effects of contamination and erosion at the


leading edge of blade tip airfoils on the annual energy production of wind
turbines, Renewable Energy 115 (2018) 817–823. doi:10.1016/j.renene.
2017.09.002.
URL https://doi.org/10.1016/j.renene.2017.09.002

[12] A. Abouhnik, A. Albarbar, Wind turbine blades condition assessment based


on vibration measurements and the level of an empirically decomposed
feature, Energy Conversion and Management 64 (2012) 606–613. doi:
10.1016/j.enconman.2012.06.008.
URL https://doi.org/10.1016/j.enconman.2012.06.008

[13] X. Dong, J. Lian, H. Wang, Vibration source identification of offshore


wind turbine structure based on optimized spectral kurtosis and ensem-
ble empirical mode decomposition, Ocean Engineering 172 (2019) 199–212.

doi:10.1016/j.oceaneng.2018.11.030.
URL https://doi.org/10.1016/j.oceaneng.2018.11.030

[14] X. Dong, G. Li, Y. Jia, B. Li, K. He, Non-iterative denoising algorithm


for mechanical vibration signal using spectral graph wavelet transform and
detrended fluctuation analysis, Mechanical Systems and Signal Processing
149 (2021) 107202. doi:10.1016/j.ymssp.2020.107202.
URL https://doi.org/10.1016/j.ymssp.2020.107202

[15] E. de Moura, A. Vieira, M. Irmão, A. Silva, Applications of detrended-


fluctuation analysis to gearbox fault diagnosis, Mechanical Systems and
Signal Processing 23 (3) (2009) 682–689. doi:10.1016/j.ymssp.2008.
06.001.
URL https://doi.org/10.1016/j.ymssp.2008.06.001

[16] E. de Moura, C. Souto, A. Silva, M. Irmão, Evaluation of principal com-


ponent analysis and neural network performance for bearing fault di-
agnosis from vibration signal processed by RS and DF analyses, Me-
chanical Systems and Signal Processing 25 (5) (2011) 1765–1772. doi:
10.1016/j.ymssp.2010.11.021.
URL https://doi.org/10.1016/j.ymssp.2010.11.021

[17] E. P. de Moura, F. E. de Abreu Melo Junior, F. F. R. Damasceno, L. C. C.


Figueiredo, C. F. de Andrade, M. S. de Almeida, P. A. C. Rocha, Clas-
sification of imbalance levels in a scaled wind turbine through detrended
fluctuation analysis of vibration signals, Renewable Energy 96 (2016) 993–
1002. doi:10.1016/j.renene.2016.05.005.
URL https://doi.org/10.1016/j.renene.2016.05.005

[18] F. E. de Abreu Melo Junior, E. P. de Moura, P. A. C. Rocha, C. F.


de Andrade, Unbalance evaluation of a scaled wind turbine under different
rotational regimes via detrended fluctuation analysis of vibration signals
combined with pattern recognition techniques, Energy 171 (2019) 556–565.

doi:10.1016/j.energy.2019.01.042.
URL https://doi.org/10.1016/j.energy.2019.01.042

[19] J. Wang, Y. Liang, Y. Zheng, R. X. Gao, F. Zhang, An integrated fault


diagnosis and prognosis approach for predictive maintenance of wind tur-
bine bearing with limited samples, Renewable Energy 145 (2020) 642–650.
doi:10.1016/j.renene.2019.06.103.
URL https://doi.org/10.1016/j.renene.2019.06.103

[20] S. Theodoridis, Pattern Recognition, Academic Press, 2008.


URL https://www.xarg.org/ref/a/1597492728/

[21] J. Zhou, J. Shi, G. Li, Fine tuning support vector machines for short-term
wind speed forecasting, Energy Conversion and Management 52 (4) (2011)
1990–1998. doi:10.1016/j.enconman.2010.11.007.
URL https://doi.org/10.1016/j.enconman.2010.11.007

[22] M. Pal, G. M. Foody, Feature selection for classification of hyperspectral


data by SVM, IEEE Transactions on Geoscience and Remote Sensing 48 (5)
(2010) 2297–2307. doi:10.1109/tgrs.2009.2039484.
URL https://doi.org/10.1109/tgrs.2009.2039484

[23] A. Belousov, S. Verzakov, J. von Frese, A flexible classification approach


with optimal generalisation performance: support vector machines, Chemo-
metrics and Intelligent Laboratory Systems 64 (1) (2002) 15–25. doi:
10.1016/s0169-7439(02)00046-1.
URL https://doi.org/10.1016/s0169-7439(02)00046-1

[24] S. Baek, H. S. Yoon, D. Y. Kim, Abnormal vibration detection in the


bearing-shaft system via semi-supervised classification of accelerometer sig-
nal patterns, Procedia Manufacturing 51 (2020) 316–323. doi:10.1016/
j.promfg.2020.10.045.
URL https://doi.org/10.1016/j.promfg.2020.10.045

[25] A. Glaeser, V. Selvaraj, K. Lee, N. Lee, Y. Hwang, S. Lee, S. Lee, S. Min,
Remote machine mode detection in cold forging using vibration signal, Pro-
cedia Manufacturing 48 (2020) 908–914. doi:10.1016/j.promfg.2020.
05.129.
URL https://doi.org/10.1016/j.promfg.2020.05.129

[26] C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley, A. L.


Goldberger, Mosaic organisation of DNA nucleotides, Physical Review E
49 (2) (1994) 1685–1689. doi:10.1103/physreve.49.1685.
URL https://doi.org/10.1103/physreve.49.1685

[27] B. Podobnik, H. E. Stanley, Detrended cross-correlation analysis: A new


method for analyzing two nonstationary time series, Physical Review Let-
ters 100 (8). doi:10.1103/physrevlett.100.084102.
URL https://doi.org/10.1103/physrevlett.100.084102

[28] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (3)


(1995) 273–297. doi:10.1007/bf00994018.
URL https://doi.org/10.1007/bf00994018

[29] V. Vapnik, Principles of risk minimization for learning theory, in: Proceed-
ings of the 4th International Conference on Neural Information Processing
Systems, NIPS’91, Morgan Kaufmann Publishers Inc., San Francisco, CA,
USA, 1991, p. 831–838.

[30] A. R. Webb, K. D. Copsey, Statistical Pattern Recognition, John Wiley &


Sons, Ltd, 2011. doi:10.1002/9781119952954.
URL https://doi.org/10.1002/9781119952954

[31] J. Platt, Sequential minimal optimization: A fast algorithm for training


support vector machines, Tech. Rep. MSR-TR-98-14, Microsoft (April
1998).
URL https://www.microsoft.com/en-us/research/publication/
sequential-minimal-optimization-a-fast-algorithm-for-training-support-vector-machines/

[32] C. J. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2 (2) (1998) 121–167. doi:10.1023/a:1009715923555.
URL https://doi.org/10.1023/a:1009715923555

[33] S. Haykin, Neural networks and learning machines, no. 3, Prentice


Hall/Pearson, New York, 2009.

[34] C. M. Bishop, Pattern Recognition and Machine Learning (Information


Science and Statistics), Springer, 2011.
URL https://www.xarg.org/ref/a/0387310738/

[35] D. Kriesel, A Brief Introduction to Neural Networks, 2007.
URL http://www.dkriesel.com

[36] Y. Bengio, Gradient-based optimisation of hyperparameters, Neural Com-


putation 12 (8) (2000) 1889–1900. doi:10.1162/089976600300015187.
URL https://doi.org/10.1162/089976600300015187

[37] P. M. Lerman, Fitting segmented regression models by grid search, Applied


Statistics 29 (1) (1980) 77. doi:10.2307/2346413.
URL https://doi.org/10.2307/2346413

[38] F. J. Solis, R. J.-B. Wets, Minimisation by random search techniques,


Mathematics of Operations Research 6 (1) (1981) 19–30. doi:10.1287/
moor.6.1.19.
URL https://doi.org/10.1287/moor.6.1.19

[39] G. Luo, A review of automatic selection methods for machine learning algo-
rithms and hyper-parameter values, Network Modeling Analysis in Health
Informatics and Bioinformatics 5 (1). doi:10.1007/s13721-016-0125-6.
URL https://doi.org/10.1007/s13721-016-0125-6

[40] E. de Moura, C. Souto, A. Silva, M. Irmão, Evaluation of principal com-


ponent analysis and neural network performance for bearing fault di-
agnosis from vibration signal processed by RS and DF analyses, Me-
chanical Systems and Signal Processing 25 (5) (2011) 1765–1772. doi:

10.1016/j.ymssp.2010.11.021.
URL https://doi.org/10.1016/j.ymssp.2010.11.021

[41] H. Yu, T. Xie, S. Paszczynski, B. M. Wilamowski, Advantages of radial


basis function networks for dynamic system design, IEEE Transactions on
Industrial Electronics 58 (12) (2011) 5438–5450. doi:10.1109/tie.2011.
2164773.
URL https://doi.org/10.1109/tie.2011.2164773

[42] D. Wolpert, W. Macready, No free lunch theorems for optimization, IEEE


Transactions on Evolutionary Computation 1 (1) (1997) 67–82. doi:10.
1109/4235.585893.
URL https://doi.org/10.1109/4235.585893

