Energies 14 02970

Accuracy Improvement of Transformer Faults Diagnostic Based on DGA Data

Using SVM-BA Classiﬁer
Article in Energies · May 2021

DOI: 10.3390/en14102970
Article
Accuracy Improvement of Transformer Faults Diagnostic Based
on DGA Data Using SVM‐BA Classifier
Youcef Benmahamed 1, Omar Kherif 1, Madjid Teguar 1, Ahmed Boubakeur 1 and Sherif S. M. Ghoneim 2,*
1 Research Laboratories, National Polytechnic School (ENP), B.P 182, El‐Harrach, 16200 Algiers, Algeria;
youcef.benmahamed@g.enp.edu.dz (Y.B.); omar.kherif@g.enp.edu.dz (O.K.); madjid.teguar@g.enp.edu.dz (M.T.);
ahmed.boubakeur@g.enp.edu.dz (A.B.)
2 Electrical Engineering Department, College of Engineering, Taif University, Taif 21944, Saudi Arabia
* Correspondence: s.ghoneim@tu.edu.sa
Abstract: The main objective of the current work was to enhance the transformer fault diagnostic accuracy based on dissolved gas analysis (DGA) data with a proposed coupled system of support vector machine (SVM)‐bat algorithm (BA) and Gaussian classifiers. Six electrical and thermal fault classes were categorized based on the IEC and IEEE standard rules. The concentrations of the five main combustible gases (hydrogen, methane, ethane, ethylene, and acetylene) were utilized as an input vector of the two classifiers. Two types of input vectors have been tested; the first considered the five gases in ppm, and the second considered each gas as a percentage of the sum of the five gases. An extensive database of 481 samples was used for the training and testing phases (321 data samples for training and 160 data samples for testing). The SVM model conditioning parameter “λ” and penalty margin parameter “C” were adjusted through the bat algorithm to achieve a maximum accuracy rate. The accuracy of the SVM‐BA and Gaussian classifiers was evaluated and compared with several DGA techniques in the literature.
Keywords: transformer faults; SVM‐BA classifier; DGA; DGALab

Citation: Benmahamed, Y.; Kherif, O.; Teguar, M.; Boubakeur, A.; Ghoneim, S.S.M. Accuracy Improvement of Transformer Faults Diagnostic Based on DGA Data Using SVM‐BA Classifier. Energies 2021, 14, 2970. https://doi.org/10.3390/en14102970
Academic Editor: Ayman El‐Hag
Received: 15 April 2021
Accepted: 19 May 2021
Published: 20 May 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction
The state of the insulation system of a power transformer largely determines the transformer’s lifetime. The insulation is generally exposed to several defects arising from overheating, paper carbonization, arcing, and discharges of low or high energy [1–3]. These faults might accelerate the insulation degradation, affecting the transformer reliability and lifetime [4]. Early detection of these faults can avoid undesired abnormal operating conditions or transformer outages [5,6].
Several DGA techniques in the literature were proposed to detect the faults in transformers, but in some cases, the diagnostic accuracy of these DGA techniques is inadequate. The dissolved gas analysis (DGA) technique is considered one of the fastest and most economical techniques widely used to diagnose the transformer fault types of the insulation system [7]. The insulating oil decomposes into hydrocarbon products, which are categorized as combustible and incombustible gases. The five main combustible gases are Hydrogen (H2), Methane (CH4), Acetylene (C2H2), Ethylene (C2H4), and Ethane (C2H6), which might be generated within the oil during a faulty mode [1]. The concentrations of these gases were used as an input vector to interpret the DGA results in transformer oil, associated with six basic electrical and thermal faults [4,8]. Different DGA techniques have been developed to diagnose the transformer faults, including graphical DGA methods (e.g., [1,9–11]) and artificial intelligence techniques (e.g., [12,13]). Improved coupled techniques have also been developed to diagnose multiple transformer faults and quantitatively indicate each fault’s likelihood (e.g., [14]).

Energies 2021, 14, 2970. https://doi.org/10.3390/en14102970 www.mdpi.com/journal/energies

Artificial intelligence techniques such as artificial neural networks (ANN) can be combined with the traditional DGA techniques to enhance the diagnostic accuracy of the transformer faults, such as the California State University Sacramento artificial neural network method (CSUS‐ANN) [13]. The CSUS‐ANN DGA technique used the gas concentration percentages of the five main combustible gases as inputs to a backpropagation neural network, which determines the transformer faults after training on DGA samples with known transformer fault types. Ghoneim and Taha [15] proposed a new approach (clustering) to enhance the diagnosis of transformer faults by developing new gas ratios alongside the IEC ratios and defining their limits to improve diagnostic accuracy. The traditional IEC code 60599 and Rogers’ four ratios gave a poor diagnostic accuracy of the transformer faults. An enhancement of the diagnostic accuracy, obtained by modifying the ratio limits of these two DGA methods using particle swarm optimization with fuzzy logic, is presented in [6]. The conditional probability in [16] introduced a new concept using the likelihood of the fault’s occurrence and the likelihood of its non‐occurrence via the mean and standard deviation of the two events’ DGA samples. The conditional probability of the fault occurrence is identified using the multivariate normal probability density function. Three scenarios were developed depending on how the different faults are separated. All these techniques are merged into one software package (DGALab), which is described in [17], to facilitate the comparison between them and any newly proposed DGA technique, with the advantage of using an extensive database of DGA samples [17,18].
In this paper, SVM‐BA and Gaussian classifiers have been used to detect faults within an oil‐immersed power transformer. The concentrations of the five main combustible gases, in ppm and as percentages of their sum, have been used as input vectors for the Gaussian and SVM classifiers. The kernel parameter λ and penalty margin C of the SVM model have been optimized by a bat algorithm (SVM‐BA) to tune the model and obtain a high diagnostic accuracy. The output of each classifier is one of six electrical and thermal transformer faults: partial discharge (PD), low energy discharges (D1), high energy discharges (D2), thermal faults < 300 °C (T1), thermal faults of 300 °C to 700 °C (T2), and thermal faults > 700 °C (T3) [1]. The performance of each classifier has been investigated in terms of accuracy rate. A total of 481 samples have been considered, where two‐thirds were used for the training process (321 samples) while the rest were used for the testing process (160 samples). A comparative study was carried out with the other DGA techniques in the literature to identify the proposed DGA technique’s diagnostic improvement.
The current work presents a classification technique (SVM‐BA and Gaussian classifiers) to enhance the transformer faults’ diagnostic accuracy, which is considered one of the new trends in condition monitoring and diagnostics of power system assets.
2. Problem Formulation
Highly reliable transformers are mainly made of iron core and windings; both are
placed in the oil tank filled with insulating oil, as shown in Figure 1.
Figure 1. Oil‐immersed power transformer cross‐section.
Mineral insulating oil is the most common type of oil used in outdoor transformers [19]. This insulating oil has significant dielectric strength so that it can withstand a fairly high voltage. It also removes the heat generated by the transformer windings by means of the cooling system (radiators, air fans, …). The heat generated in the transformer nonetheless results in a temperature rise in the internal transformer structures. Under electrical and thermal stresses, different hydrocarbon gases are liberated due to the insulating oil decomposition. Particular gases characterize each type of fault. For instance, hydrogen concentration, produced by ionic bombardment, increases with partial discharges within a transformer oil. In this context, a general review of the gases produced during the deterioration of mineral oil and their interpretation is given in [10].
Early‐stage detection of these faults should be carried out to avoid the undesired ab‐
normal operating conditions or transformer outages. For this purpose, periodic monitor‐
ing of the oil should be conducted during transformer service, whether in‐situ or at the
laboratory, using a multi‐stage gas‐extractor (a device for sampling transformer oil) [10].
In general, the most important gases are Hydrogen (H2), Methane (CH4), Acetylene (C2H2),
Ethylene (C2H4), and Ethane (C2H6). The distribution of these gases is related to the type
of transformer fault, and the rate of gas generation can indicate the severity of the fault
[5,20].
In [6], the authors collected 481 samples associated with the six different faults indicated in the Introduction (i.e., PD, D1, D2, T1, T2, and T3). The number of samples associated with each fault is given in Table 1.
Table 1. Database distribution.
Defect | Interpretation | Number of Samples
PD | Partial discharge | 48
D1 | Low energy discharges | 79
D2 | High energy discharges | 126
T1 | Thermal faults of < 300 °C | 95
T2 | Thermal faults of 300 °C to 700 °C | 48
T3 | Thermal faults of > 700 °C | 85
All | | 481
The database has been exploited in the present investigation to detect and identify faults. As shown in Table 1, only separate faults (no combined faults) have been considered. The fault detection has been examined using the concentration of each dissolved gas. Since the weight percent of the aforementioned gases would result in an inconveniently small number, the concentration in parts per million (ppm) has been considered for each gas. Furthermore, the percent concentration of the total sum was also used, where each sample X = [x1, x2, …, x5] is scaled into X_% as follows:
$$X_{\%} = \frac{X}{\sum_{i=1}^{5} x_i} \times 100\% \qquad (1)$$
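As a minimal sketch of the scaling in Equation (1), the Python snippet below converts a sample given in ppm into percentages of the total. The function name `to_percent` and the guard against a zero sum are illustrative assumptions, not part of the original work; the example values are those of the T3 case listed in Table 3, taken in the gas order H2, CH4, C2H2, C2H4, C2H6.

```python
def to_percent(sample_ppm):
    """Scale a gas sample so that its entries are percentages of their sum (Eq. 1)."""
    total = sum(sample_ppm)
    if total <= 0:
        # Assumed guard: a sample with no detected gas cannot be scaled.
        raise ValueError("the sum of the gas concentrations must be positive")
    return [100.0 * x / total for x in sample_ppm]


# T3 case from Table 3: H2, CH4, C2H2, C2H4, C2H6 in ppm.
sample_ppm = [1374, 2648, 298, 5376, 628]
sample_pct = to_percent(sample_ppm)
print([round(p, 2) for p in sample_pct])
```

Because the scaled entries always sum to 100, this representation describes the relative distribution of the gases and discards the absolute gas level.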
The fault diagnosis has been carried out using two different classifiers, namely Gaussian and SVM‐BA. The flowchart given in Figure 2 summarizes the various stages of the diagnostic approach.

Figure 2. Flowchart of the problem formulation.
3. Classification Approach
For both the Gaussian and SVM‐BA classifiers, the concentrations in percentages and in ppm of the five dissolved gases have been used as an input vector, denoted by X = [x1, x2, …, x5], associated with a particular class of fault (denoted by y) representing the classifier decision (classifier output).
3.1. Gaussian Classifier
In this part, the Gaussian classification is used as a probabilistic learning method for constructing a classifier by applying Bayes’ theorem. It concerns the conditional and marginal probabilities of two random events. The classifier is based on the comparison of the posterior probability P(w_i|x):
$$P(w_i|x) = \frac{P(x|w_i)\,P(w_i)}{P(x)}, \quad i = 1, 2, \ldots, 6 \qquad (2)$$
where P(x|w_i) is the conditional probability (likelihood) given by:
$$P(x|w_i) = \prod_{k} P(x_k|w_i) \qquad (3)$$
and P(x) is the unconditional density that normalizes the posteriors, computed as follows:
$$P(x) = \sum_{i=1}^{6} P(x|w_i)\,P(w_i) \qquad (4)$$
in which P(w_i) is the prior probability of each class.
Firstly, the training phase has been carried out to estimate the parameters of the Gaussian model. In this phase, 321 samples of the data set have been reserved to determine the Gaussian distributions, each defined by the mean value (μ) and the covariance matrix (σ) of the gas concentrations for each defect class. Since the number of samples differs from one fault to another, every distribution is multiplied by a weight equal to the ratio of its number of samples to the total size of the database.
In the next step, the Gaussian model has been employed to compute the conditional probability P(x|w_i) as indicated in Equation (3), where each term P(x_k|w_i) is calculated using the probability density function of a univariate normal distribution as follows:
$$P(x_k|w_i) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_k - \mu)^2}{2\sigma^2}} \qquad (5)$$
Since it is required to know the likelihood of observing the k‐th sample while considering all the different distributions, one can sum the likelihood of observing the given sample from each possible Gaussian, using:
$$P(x_k|w_i) = \frac{\exp\left(-\frac{1}{2}(X - \mu)^T \sigma^{-1} (X - \mu)\right)}{(2\pi)^{n/2}\,|\sigma|^{1/2}} \qquad (6)$$
in which |σ| and σ^{−1} denote the determinant and inverse of the covariance matrix σ, and n is the dimension of X.
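The classification rule of Equations (2) to (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the index k is taken to run over the gas concentrations (a per‐gas factorization of the likelihood), the priors are the class proportions of the training set, log‐probabilities are used for numerical stability, and a small variance floor avoids division by zero. The function names and the tiny two‐feature data set are hypothetical and are not taken from the database of the paper.

```python
import math
from collections import defaultdict


def fit_gaussian(samples, labels):
    """Estimate, per class, the prior and the per-feature mean and variance (MLE)."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Assumed variance floor (1e-9) to keep the density finite.
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-9)
                     for col, m in zip(zip(*rows), means)]
        model[y] = (n / len(samples), means, variances)  # prior = class weight
    return model


def log_scores(model, x):
    """Return log P(x|w_i) + log P(w_i) for every class (Eqs. 2, 3 and 5)."""
    scores = {}
    for y, (prior, means, variances) in model.items():
        log_like = sum(-0.5 * math.log(2.0 * math.pi * var) - (xk - m) ** 2 / (2.0 * var)
                       for xk, m, var in zip(x, means, variances))
        scores[y] = math.log(prior) + log_like
    return scores


def classify(model, x):
    """Pick the class with the largest posterior; P(x) is common to all classes."""
    scores = log_scores(model, x)
    return max(scores, key=scores.get)


# Hypothetical two-feature training set with two fault labels.
train_x = [[90.0, 5.0], [88.0, 6.0], [92.0, 4.0], [10.0, 60.0], [12.0, 58.0]]
train_y = ["PD", "PD", "PD", "T3", "T3"]
model = fit_gaussian(train_x, train_y)
print(classify(model, [89.0, 5.0]), classify(model, [11.0, 59.0]))
```

The normalizing term P(x) of Equation (4) is omitted on purpose: it is identical for all classes, so it does not change which posterior is the largest.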
The parameters of each Gaussian model (i.e., variance, mean, and weight) have been estimated to cluster the data and group the samples sharing the same parameters. Moreover, maximum likelihood estimation (MLE) was used to find the optimal mean and variance, maximizing the likelihood of the data. After training the model, the classifier output ideally ends up with six distributions on the same axis. Depending on its location on this axis, each testing sample (160 testing samples in total) is assigned to one of the defect classes. Figure 3 illustrates the different steps of the Gaussian classifier.
Figure 3. Flowchart formulating the problem using Gaussian classifier.
3.2. SVM Classifier Coupled with BA
SVM techniques are used for classification, regression, and prediction problems [21]. For classification problems, hyperplanes are required in a multidimensional space to separate the data points of two fault classes. These hyperplanes are used to distinguish between every two classes (y_i and y_j) of faults associated with two different input vectors (X_i and X_j) [22–24]. Among these hyperplanes, it is suggested to find the one that has the maximum margin (denoted by M). In this light, the classification becomes an optimization problem where hyperplanes represent the decision boundaries that help classify the data points. Usually, a vector (denoted by ω) orthogonal to the hyperplane is defined by:
$$\omega = (\omega_1, \omega_2, \ldots, \omega_n)^T \qquad (7)$$
which is used in combination with an input vector (X_i) to define the hyperplane function, h, as follows [22]:
$$h(X_i) = \omega^T \cdot X_i + \omega_0 = \omega_0 + \sum_{j=1}^{n} \omega_j\, x_j \qquad (8)$$
The term ω_0 is the bias required to determine the position of the separating hyperplane (i.e., h(X) = 0).
A one‐to‐one learning strategy is selected. It is assumed that X_i is of class “1” if h(X_i) ≥ 0 and of class “−1” otherwise. Assuming that X_i and X_j are the two closest points on each side of the hyperplane (different classes), the equations for the hyperplanes h(X_i) and h(X_j) become:
$$h(X_i) = \omega^T \cdot X_i + \omega_0 = 1 \qquad (9)$$
and
$$h(X_j) = \omega^T \cdot X_j + \omega_0 = -1 \qquad (10)$$
Taking the difference of these equations and dividing both sides by the magnitude of ω, we obtain:
$$\frac{\omega^T}{\|\omega\|} \cdot (X_i - X_j) = \frac{2}{\|\omega\|} \qquad (11)$$
where the left‐hand side, the projection of X_i − X_j onto the unit normal of the hyperplane, is the distance between the two hyperplanes.
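A short numerical check may help to see why the distance between the two hyperplanes equals 2/||ω||. The sketch below uses a hypothetical two‐dimensional weight vector, bias, and pair of points chosen so that Equations (9) and (10) hold exactly; none of these values come from the paper.

```python
import math


def h(omega, omega0, x):
    """Hyperplane function of Eq. (8): h(X) = omega . X + omega_0."""
    return sum(w * xi for w, xi in zip(omega, x)) + omega0


def margin(omega):
    """Distance 2 / ||omega|| between the hyperplanes h = 1 and h = -1 (Eq. 11)."""
    return 2.0 / math.sqrt(sum(w * w for w in omega))


# Hypothetical hyperplane and one point on each margin boundary.
omega, omega0 = [3.0, 4.0], -5.0
x_i, x_j = [2.0, 0.0], [0.0, 1.0]

norm = math.sqrt(sum(w * w for w in omega))
# Projection of (x_i - x_j) onto the unit normal omega / ||omega||.
projection = sum(w * (a - b) for w, a, b in zip(omega, x_i, x_j)) / norm

print(h(omega, omega0, x_i), h(omega, omega0, x_j), projection, margin(omega))
```

Both routes give the same value here, which illustrates that a smaller ||ω|| corresponds to a wider margin.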
From expression (11), it is clear that maximizing the margin implies minimizing the norm of the weight vector ω used to define the hyperplane. A soft‐margin SVM is utilized for classes that are not linearly separable; it allows the model to misclassify some data points while minimizing the number of such samples [23]. For this purpose, non‐negative slack variables ζ_i are introduced in the hyperplane equation. Consequently, the optimization problem becomes:
$$\begin{cases} \text{Min: } \dfrac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{k} \zeta_i \\ \text{Such that: } y_i(\omega \cdot x_i + \omega_0) \ge 1 - \zeta_i \\ \forall i,\; 0 \le \zeta_i \le 1, \quad i = 1, 2, \ldots, k \end{cases} \qquad (12)$$
C represents the margin parameter, which can be seen as a regularization parameter. The corresponding Lagrangian is given by:
$$L(\omega, \omega_0, \zeta, \alpha) = \frac{1}{2}\|\omega\|^2 + C \sum_{i} \zeta_i - \sum_{i} \alpha_i \left[ y_i(\omega \cdot x_i + \omega_0) - 1 + \zeta_i \right] \qquad (13)$$
where α_i are the Lagrange coefficients (multipliers).
In such circumstances, the Karush–Kuhn–Tucker conditions are [14]:
$$L(\omega, \omega_0, \zeta, \alpha) = \frac{1}{2}\|\omega\|^2 + C \sum_{i} \zeta_i - \sum_{i} \alpha_i \left[ y_i(\omega \cdot x_i + \omega_0) - 1 + \zeta_i \right] \qquad (14)$$
Setting the derivatives of the Lagrangian with respect to ω and ω_0 individually to 0, it follows that the Lagrangian expression should be maximized under the constraint:
$$\sum_{i} \alpha_i\, y_i = 0 \qquad (15)$$
and it also yields
𝐶 𝛼 𝜁 (16)
Since the data are assumed to be non‐separable, the feature space has been enlarged by a mapping function Φ associated with a kernel function. Every data point has been mapped into a high‐dimensional space through a particular transformation Φ: X ↦ ϕ(X). A polynomial kernel function of degree d has been used in this investigation as follows [23]:
$$K(x_i, x_j) = (C + x_i \cdot x_j)^d \qquad (17)$$
which verifies the following condition:
$$\phi(x_i) \cdot \phi(x_j) = K(x_i, x_j) \qquad (18)$$
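The condition in Equation (18) can be verified numerically for a small case. The sketch below assumes the inhomogeneous polynomial form (c + x·x′)^d for the kernel, with c denoting the additive constant, and uses the well‐known explicit feature map for d = 2 and two‐dimensional inputs. The function names and test vectors are hypothetical.

```python
import math


def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel (c + x . y)^d; the form is an assumption of this sketch."""
    return (c + sum(a * b for a, b in zip(x, y))) ** d


def phi_degree2(x, c=1.0):
    """Explicit feature map for d = 2 and 2-D inputs, so that phi(x).phi(y) = K(x, y)."""
    x1, x2 = x
    r = math.sqrt(2.0 * c)
    return [c, r * x1, r * x2, x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


x, y = [1.0, 2.0], [3.0, 0.5]
k_value = poly_kernel(x, y)
dot_value = sum(a * b for a, b in zip(phi_degree2(x), phi_degree2(y)))
print(k_value, dot_value)
```

The kernel evaluates the inner product in the six‐dimensional feature space without ever building the mapped vectors, which is what makes higher degrees affordable.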
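In this case, the optimization problem after rearrangement becomes as follows:
$$\begin{cases} \text{Max: } \sum_{i} \alpha_i - \dfrac{1}{2} \sum_{i} \sum_{j} \alpha_i\, \alpha_j\, y_i\, y_j\, \Phi(x_i) \cdot \Phi(x_j) \\ \text{Subject to: } \forall i \in \{1, 2, \ldots, k\},\; C \ge \alpha_i \ge 0 \\ \sum_{i} \alpha_i\, y_i = 0 \end{cases} \qquad (19)$$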
The SVM parameters, consisting of:
• the kernel parameter λ (conditioning parameter, equivalent to σ in the RBF kernel [24]);
• the penalty parameter C (margin parameter); and
• the degree d of the polynomial kernel,
significantly affect the accuracy of the prediction model. To further improve the accuracy rate, the bat algorithm (BA) has been employed in this investigation to optimize the SVM parameters. BA is a meta‐heuristic algorithm for global optimization, introduced by Xin‐She Yang in 2010, that mimics the echolocation behavior microbats use to sense the distance to prey and avoid obstacles [23]. In BA, the aim is reached by determining the optimum parameters C and λ that give the best accuracy rate of the SVM classifier. The degree of the polynomial kernel has been fixed in turn at d = 1, 2, and 3. Figure 4 illustrates the different steps of the coupled SVM‐BA classifier.
In the beginning, BA parameters have been initialized. Table 2 lists the detailed set‐
tings for the BA values used to optimize the SVM model.
Table 2. Parameters of Bat algorithm.
Parameter | Value
Population size | 50
Loudness | 0.5
Frequency (fmin and fmax) | 0 and 20
Number of iterations | 600
Pulse rate | 0.5
BA then generates a population of SVM parameter couples (C, λ). For each couple, the initial position p and velocity v have been randomly selected. The fitness of each couple has been evaluated to extract the best global position (denoted by p*). This means that the training dataset is used to train the SVM classifier for each position, while the testing dataset is used to calculate the accuracy rate. The latter is the ratio of the number of correctly classified samples (N_c) to the total number of test samples (N). After this step, the position and velocity of each individual are updated using the following expressions:
$$\begin{cases} p_i^{t+1} = p_i^{t} + v_i^{t} \\ v_i^{t+1} = v_i^{t} + (p_i^{t} - p^{*})\, f_i \end{cases} \qquad (20)$$
where v_i^t and v_i^{t+1} are the current and next velocities, corresponding to the current position p_i^t and the next position p_i^{t+1}, respectively.
Figure 4. Flowchart formulating SVM‐BA Classifier.
The parameter f_i denotes the frequency, which is computed as follows:
$$f_i = f_{min} + (f_{max} - f_{min})\,\beta \qquad (21)$$
in which β is a random number ranging between 0 and 1.
It is worth noting that the search space has been bounded by C_min = 10^−6 and C_max = 0.1 for the parameter C, and by λ_min = 10^−7 and λ_max = 0.7 for the second parameter λ. Moreover, the pulse rate increases with the iteration number as follows:
$$r_i^{t+1} = r_i^{0} \left[ 1 - \exp(-\gamma t) \right] \qquad (22)$$
while the loudness decreases according to:
$$A_i^{t+1} = \sigma A_i^{t}, \quad 0 < \sigma < 1 \qquad (23)$$
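The update rules of Equations (20) to (23) can be put together in a compact sketch. The code below is not the authors’ implementation: it replaces the SVM training and testing step with a hypothetical smooth fitness function, and the local random walk around the best position, the values of σ and γ, the clipping to the search box, and the rule for tracking the global best are all assumptions made for illustration. The population size, frequency range, initial loudness, and initial pulse rate follow Table 2, and the search intervals for C and λ follow the bounds quoted in the text.

```python
import math
import random


def bat_search(fitness, bounds, n_bats=50, n_iter=600, f_min=0.0, f_max=20.0,
               loudness0=0.5, pulse0=0.5, sigma=0.9, gamma=0.9, seed=0):
    """Maximize `fitness` over the box `bounds` with a basic bat algorithm."""
    rng = random.Random(seed)

    def clip(p):
        # Keep every coordinate inside its [low, high] search interval.
        return [min(max(x, lo), hi) for x, (lo, hi) in zip(p, bounds)]

    # Random initial positions and velocities, one per bat.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_bats)]
    vel = [[rng.uniform(lo - hi, hi - lo) for lo, hi in bounds] for _ in range(n_bats)]
    fit = [fitness(p) for p in pos]
    loud = [loudness0] * n_bats
    pulse = [0.0] * n_bats

    best = max(range(n_bats), key=lambda i: fit[i])
    best_pos, best_fit = list(pos[best]), fit[best]

    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            f_i = f_min + (f_max - f_min) * rng.random()           # Eq. (21)
            cand = clip([p + v for p, v in zip(pos[i], vel[i])])     # Eq. (20), position
            vel[i] = [v + (p - b) * f_i                              # Eq. (20), velocity
                      for v, p, b in zip(vel[i], pos[i], best_pos)]
            if rng.random() > pulse[i]:
                # Assumed local step: random walk around the best position,
                # scaled by the mean loudness and the width of each interval.
                mean_loud = sum(loud) / n_bats
                cand = clip([b + rng.uniform(-1.0, 1.0) * mean_loud * (hi - lo)
                             for b, (lo, hi) in zip(best_pos, bounds)])
            cand_fit = fitness(cand)
            if cand_fit > fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                pulse[i] = pulse0 * (1.0 - math.exp(-gamma * t))      # Eq. (22)
                loud[i] *= sigma                                     # Eq. (23)
            if cand_fit > best_fit:
                best_pos, best_fit = list(cand), cand_fit
    return best_pos, best_fit


def toy_fitness(p):
    # Hypothetical stand-in for the SVM accuracy, peaked at C = 0.05, lambda = 0.3.
    c, lam = p
    return -((c - 0.05) ** 2 + (lam - 0.3) ** 2)


bounds = [(1e-6, 0.1), (1e-7, 0.7)]  # search intervals for C and lambda
best_pos, best_fit = bat_search(toy_fitness, bounds, n_iter=100)
print(best_pos, best_fit)
```

In the coupled classifier described above, `toy_fitness` would be replaced by a function that trains the SVM with the candidate (C, λ) and returns the resulting accuracy rate, which makes each fitness evaluation far more expensive than in this toy example.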
4. Experimental Work
During transformer operation, the insulation of the transformer coils is subjected to high electrical and thermal stresses, which cause corrosion of some insulating material particles and decomposition of some insulating oil particles, producing different types of gases. These gases dissolve in the transformer oil. At the onset of a slight fault, the amount of gas released is too small to operate the gas protection device; such a fault does not cause an instantaneous breakdown, but the transformer efficiency is reduced.
The gases that were used to diagnose the transformers’ state include Hydrogen (H2), Methane (CH4), Acetylene (C2H2), Ethylene (C2H4), Ethane (C2H6), carbon monoxide (CO), and carbon dioxide (CO2). Chromatographic analysis (CA) of dissolved gases in transformer oils is an analysis method that reveals small percentages of dissolved gases in the oil. The CA of gases indicates the transformer’s condition at an early stage of the fault occurrence. Thus, the transformer can be protected, and failures can be reduced before a complete breakdown occurs.
The accuracy of the CA results depends on how the oil sample is drawn from the transformer, how the dissolved gases are extracted from the oil samples, and how the analyzer device is adjusted. The CA must be carried out at the start of the transformer’s operation, and its results are considered a reference when analyzing this transformer later.
American Society for Testing and Materials (ASTM) D3612‐2 [24] describes the procedures for extracting the dissolved gases from transformer oil samples using gas chromatography (GC). The GC consists of the mobile phase (including three types of gases: the carrier gas, the fuel gas, and zero air), the sample injector, the column, the column oven, the detector, and the data system.
Oil samples were prepared and filled into glass vials by a sampling device. Then, they were placed into the autosampler unit. One by one, the samples were inserted into the oven at 80 °C and analyzed. The dissolved gases are extracted by raising the temperature and agitating the oil sample. The extracted gases are then injected into the GC to carry out the gas analysis [25].
Figure 5 illustrates the process of drawing the oil samples from the transformer and the GC device (8890 Gas Chromatograph (GC) System and 7697A Headspace Sampler, Agilent, USA). The GC analysis results are shown in Figure 6, illustrating the time required to extract each gas and its concentration in ppm. The chromatograph provides a signal over time, which produces the familiar chromatogram. The chromatogram signal can be converted into a list of peak times and sizes by either manual or electronic means [26].

Figure 5. Drawing the oil sample from the transformer and the 8890 Gas Chromatograph (GC)
System and 7697A Headspace Sampler, Agilent, USA.
Figure 6. Gas Chromatography result.
5. Results and Discussions
A database of 481 samples has been exploited to evaluate the accuracy rate of each classifier. As stated in Section 2, 321 samples of the data set were used in the training phase, while the rest were used for testing (160 samples). The data distribution was based on the holdout method, in which more than 60% of the database must be reserved for the training phase (here, 2/3 for the training set and the remaining 1/3 as the test set) [27]. Both data parts were randomly selected, and they were used in all simulations. DGA results in percentages (i.e., percentages of the total sum) and in ppm have been considered as input vectors for both classifiers. As mentioned previously for SVM, the polynomial kernel and a one‐to‐one learning strategy were selected. The classifiers’ results were compared with the inspection results (the real fault in the transformer). Table 3 lists some cases to illustrate the comparison between the SVM‐BA and Gaussian classifiers for gases in ppm and in percentages.
Table 3. Diagnosis results of some cases.
H2 | CH4 | C2H2 | C2H4 | C2H6 | Inspection | SVM‐BA (Input Vector in ppm) | Gaussian (Input Vector in ppm) | SVM‐BA (Input Vector in Percentage) | Gaussian (Input Vector in Percentage)
2587.2 | 112.25 | 0 | 1.4 | 4.704 | PD | PD | D2 * | PD | PD
6870 | 1028 | 5500 | 900 | 79 | D1 | D1 | D1 | D1 | D2 *
84 | 6 | 86 | 14 | 1 | D2 | D2 | T3 * | D1 * | D1 *
92 | 27 | 0 | 7 | 67 | T1 | T1 | T3 * | T1 | T1
960 | 4000 | 6 | 1560 | 1290 | T2 | T3 * | T3 * | T2 | T3 *
1374 | 2648 | 298 | 5376 | 628 | T3 | T3 | T3 | T3 | T3
(*) denotes that the diagnosis is wrong based on the inspection.
Not only does the type of input vector influence the accuracy rate of the SVM algorithm but, according to previous investigations, the degree of the polynomial kernel can also affect the diagnostic accuracy [28]. Figure 7 shows the impact of the input vector type on the evolution of the SVM‐BA classifier’s accuracy during the optimization process. This evolution is illustrated in Figure 7a,b for input vectors expressed in percentages and in ppm, respectively, with different polynomial kernel degrees.
Figure 7. SVM‐BA classifier evolution during the optimization process (accuracy rate (%) versus iteration for d = 1, 2, and 3): (a) DGA in percentage; (b) DGA in ppm.
For a given degree of the polynomial kernel, it is clear that 300 iterations are largely sufficient for the convergence of the SVM‐BA algorithm. The results showed that the SVM‐BA classifier’s accuracy rate is quite sensitive to the degree of the polynomial kernel. For an input vector taken in percentages, as shown in Figure 7a, the maximal accuracy rate is 93.13% with d = 2 and 3 against 91.88% with d = 1. On the other hand, notably lower results have been found in Figure 7b for an input vector in ppm, where the highest accuracy is 87.5%, obtained for d = 1, against 82.5% and 78.13% for d = 2 and 3, respectively. However, the convergence for d = 3 is very fast compared to that found for d = 2 and d = 1.
The previous simulation, related to the SVM‐BA classifier and shown in Figure 7, is repeated 50 times to find the best accuracy rate and lend more credibility to the obtained results. Figure 8 illustrates an example of the accuracy rate versus the number of runs (i.e., executions) when using DGA in percentages as an input vector for the SVM‐BA classifier. These results have been computed for polynomial kernels of degree d = 1, 2, and 3.
Figure 8. The accuracy rate of the SVM‐BA classifier over the running number (accuracy rate (%) versus number of runs for d = 1, 2, and 3).
In Figure 8, the best accuracy rate obtained over the different executions lies between 91% and 94%. The global best accuracy rates obtained over the runs are presented in Table 4, for input vectors in both ppm and percentages.
Table 4. The accuracy rate for both classifiers after 50 executions.
Input vector | Gaussian | SVM‐BA (d = 1) | SVM‐BA (d = 2) | SVM‐BA (d = 3)
DGA in percentages | 69.37% | 93.13% | 93.75% | 93.75%
DGA in ppm | 32.75% | 87.50% | 82.50% | 78.13%
For SVM‐BA, the results are given for three degrees of the polynomial kernel. After 50 executions, it was found that the maximal accuracy rate was 93.75% with d = 2 and 3 against 93.13% with d = 1, obtained when employing an input vector in percentages. When using the dissolved gases in ppm as the input vector for SVM‐BA, the computed results decreased to 87.50% for d = 1 against 82.50% and 78.13% for d = 2 and 3, respectively (Table 4). This implies that the SVM‐BA classifier gives a better accuracy rate for an input vector given in percentages. On the other hand, the Gaussian classifier gives the lowest accuracy rate of 32.75% when the input vector is in ppm, compared to an accuracy rate of 66.25% when the input vector is in percentages. This finding demonstrates that gas concentrations expressed in percentages better differentiate a particular defect from the others.
For the Gaussian classifier, it should be noted that the results improve when the real part of the posterior probability given by expression (6) is employed. In this case, the accuracy rate increases to 70% for an input vector in percentages, while it remains the same (i.e., 32.75%) for an input vector in ppm. Such findings suggest using the real part of the posterior probability in the Gaussian classifier with an input vector in percentages. Compared to other results in the literature, the SVM‐BA classifier has good accuracy and a high ability to diagnose the transformer fault classes with simple code. The overall accuracy obtained in [6] for the same database in ppm is a good example of this.
6. Validation and Overall Accuracy of the Proposed SVM‐BA Classifier
The SVM‐BA and Gaussian classifiers are compared with various classification algorithms available in the DGALab interface to evaluate the proposed method’s accuracy [17]. The free DGALab software package is available in [18]. DGALab includes the Duval triangle method, IEC code 60599, Rogers’ four ratios, the modified IEC code and modified Rogers’ four ratios, the clustering method, conditional probability, and the California State University Sacramento artificial neural network method (CSUS‐ANN). Details of all these algorithms are given in [15–17]. A comparison among all the mentioned methods was carried out based on the individual fault accuracy and the overall accuracy rate (Table 5).
Table 5. Accuracy rate table of different techniques.
Fault | ACT | Duval Triangle | IEC Code‐60599 | Rogers’ 4 Ratios | Modified IEC Code | Modified Rogers’ 4 Ratios | Clustering | Conditional Probability | CSUS ANN | SVM‐BA | Gaussian
PD | 16 | 93.75 | 43.75 | 37.5 | 8.5 | 75 | 93.75 | 100 | 8.5 | 87.5 | 81.25
D1 | 26 | 76.92 | 26.92 | 0 | 76.92 | 84.61 | 84.61 | 76.92 | 69.23 | 92.31 | 34.61
D2 | 42 | 85.71 | 66.66 | 73.80 | 90.47 | 88.09 | 85.71 | 97.61 | 92.85 | 95.24 | 78.57
T1 | 32 | 65.62 | 53.125 | 90.62 | 93.75 | 93.75 | 93.75 | 87.5 | 93.75 | 93.75 | 90.62
T2 | 16 | 50 | 68.75 | 25 | 93.75 | 93.75 | 31.25 | 81.25 | 43.75 | 87.5 | 0
T3 | 28 | 100 | 82.14 | 57.14 | 89.28 | 89.28 | 85.7 | 82.14 | 71.42 | 96.48 | 100
All | 160 | 80 | 58.12 | 53.75 | 88.75 | 88.12 | 82.5 | 88.12 | 80 | 93.75 | 69.37
The last row in Table 5 gives the total number of samples used for testing and the overall accuracy of each DGA method for comparison purposes. From the evaluation presented in Table 5, the SVM‐BA provides the best overall accuracy rate (93.75%). This superiority comes from the ability of SVM to classify complex and extensive data sets. Moreover, coupling SVM with the bat algorithm enabled the right choice of parameters, which gave the highest possible accuracy rate. The method closest to the proposed one in overall accuracy is the modified IEC code, which showed an 88.75% overall accuracy rate. The lowest overall accuracy rate is obtained with the Rogers’ four ratios DGA method, for which the overall accuracy is 53.75%. The results in Table 5 are recapitulated in Figure 9 in histogram form.
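To make the relation between the per‐fault rows and the “All” row of Table 5 concrete, the sketch below recomputes an overall accuracy from the per‐fault accuracies and the number of test samples per fault (column ACT). It assumes that each per‐fault percentage corresponds to a whole number of correctly classified samples, recovered here by rounding; the function name is hypothetical. The example uses the Duval triangle column.

```python
def overall_accuracy(per_fault_pct, counts):
    """Overall accuracy (%) from per-fault accuracies (%) and per-fault sample counts."""
    # Assumption: each percentage maps back to an integer number of correct samples.
    correct = sum(round(p / 100.0 * n) for p, n in zip(per_fault_pct, counts))
    return 100.0 * correct / sum(counts)


# Column ACT and Duval triangle column of Table 5, in the order PD, D1, D2, T1, T2, T3.
counts = [16, 26, 42, 32, 16, 28]
duval_pct = [93.75, 76.92, 85.71, 65.62, 50, 100]
duval_overall = overall_accuracy(duval_pct, counts)
print(duval_overall)
```

The overall rate is therefore a sample‐weighted average: faults with more test samples, such as D2, contribute more to it than faults with few samples, such as PD or T2.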
Figure 9. Histogram of accuracy rates (accuracy rate (%) per fault type PD, D1, D2, T1, T2, T3, and All, for the Duval, IEC, Rogers’ 4 ratios, modified IEC, modified Rogers’ 4 ratios, clustering, probability, CSUS, SVM‐BA, and Gaussian methods).
7. Conclusions
This paper proposed a new DGA technique using an SVM‐BA classifier to enhance the diagnostic accuracy of transformer faults. The concentrations of the five main combustible dissolved gases (H2, CH4, C2H6, C2H4, and C2H2) were used as the input vector to the SVM‐BA classifier to identify the transformer fault type. The concentrations were expressed both in ppm and in percentages. A total of 481 samples were collected from the chemical laboratory and the literature, and divided into 321 samples for training and 160 samples for testing. The SVM‐BA classifier results indicated the following:
• An accuracy rate of 93.75% was achieved when the input vector was expressed in percentages, with degrees d = 2 and d = 3.
• The test results of the coupled SVM‐BA classifier showed that it improves the diagnostic accuracy of transformer faults compared with the other DGA techniques in the literature.
• The overall accuracy of SVM‐BA was 93.75%, which is higher than that of the modified IEC code (88.75%).
• It is recommended to determine the expected remaining life of the transformer based on the state of the insulation system.
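The percentage form of the input vector mentioned above can be sketched as follows. This is a minimal illustration, assuming that each percentage is the gas's share of the total of the five combustible gases; the function name `to_percentages` and the sample concentrations are hypothetical and are not taken from the paper's data set.

```python
GASES = ("H2", "CH4", "C2H6", "C2H4", "C2H2")


def to_percentages(ppm):
    """Convert gas concentrations in ppm to percentages of their total.

    Assumption: the percentage of a gas is its share of the sum of the five
    combustible gases; the paper's exact normalization may differ.
    """
    total = sum(ppm[g] for g in GASES)
    if total <= 0:
        raise ValueError("total gas concentration must be positive")
    return [100.0 * ppm[g] / total for g in GASES]


if __name__ == "__main__":
    # Hypothetical sample, for illustration only.
    sample = {"H2": 200.0, "CH4": 100.0, "C2H6": 50.0,
              "C2H4": 100.0, "C2H2": 50.0}
    print(to_percentages(sample))
```

With this normalization the five components always sum to 100, so the input vector reflects the relative gas composition rather than the absolute gas level.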
Author Contributions: Conceptualization: Y.B.; methodology: O.K.; software: Y.B.; validation: M.T.; formal analysis: O.K.; investigation, resources, data curation: S.S.M.G.; writing (original draft preparation): Y.B.; writing (review and editing): all authors; visualization, supervision: A.B.; project administration: S.S.M.G.; funding acquisition: S.S.M.G. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Taif University Researchers Supporting Project, grant number TURSP‐2020/34. The APC was funded by Sherif Ghoneim.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Acknowledgments: The authors would like to acknowledge the financial support received from
Taif University Researchers Supporting Project Number (TURSP‐2020/34), Taif University, Taif,
Saudi Arabia.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the
study’s design; in the collection, analyses, or interpretation of data; in the writing of the manuscript,
or in the decision to publish the results.
References
1. Benmahamed, Y.; Teguar, M.; Boubakeur, A. Application of SVM and KNN to Duval Pentagon 1 Transformer Oil Diagnosis. IEEE Trans. Dielectr. Electr. Insul. 2017, 24, 3443–3451.
2. Ji, X.; Zhang, Y.; Liu, Q. Insulation Condition Assessment of Power Transformers Employing Fused Information in Time and Space Dimensions. Electr. Power Compon. Syst. 2020, 48, 213–223.
3. Malik, H.; Mishra, S. Selection of Most Relevant Input Parameters Using Principal Component Analysis for Extreme Learning Machine Based Power Transformer Fault Diagnosis Model. Electr. Power Compon. Syst. 2017, 45, 1339–1352.
4. IEEE Standard C57‐104. Guide for the Interpretation of Gases Generated in Oil‐Immersed Transformers; IEEE: New York, NY, USA, 2008; p. 104.
5. Jiang, J.; Chen, R.; Zhang, C.; Chen, M.; Li, X.; Ma, G. Dynamic Fault Prediction of Power Transformers Based on Lasso Regression and Change Point Detection by Dissolved Gas Analysis. IEEE Trans. Dielectr. Electr. Insul. 2020, 27, 2130–2137.
6. Taha, I.B.M.; Hoballah, A.; Ghoneim, S.S.M. Optimal ratio limits of Rogers’ four‐ratios and IEC 60599 code methods using particle swarm optimization fuzzy‐logic approach. IEEE Trans. Dielectr. Electr. Insul. 2020, 27, 222–230.
7. Gouda, O.E.; El‐Hoshy, S.H.; ELTamaly, H.H. Condition assessment of power transformers based on dissolved gas analysis. IET Gener. Transm. Distrib. 2019, 13, 2299–2310.
8. IEC. Mineral Oil‐Impregnated Electrical Equipment in Service–Guide to the Interpretation of Dissolved and Free Gases Analysis; IEC Publication 60599; British Standards Institution: London, UK, 2007.
9. Duval, M.; Lamarre, L. The Duval Pentagon—A new Complementary Tool for the Interpretation of Dissolved Gas Analysis in
Transformers. IEEE Electr. Insul. Mag. 2014, 30, 9–12.
10. Cheim, L.; Duval, M.; Haider, S. Combined Duval Pentagons: A Simplified Approach. Energies 2020, 13, 2859.
11. Mansour, D.A. Development of a new graphical technique for dissolved gas analysis in power transformers based on the five combustible gases. IEEE Trans. Dielectr. Electr. Insul. 2015, 22, 2507–2512.
12. Benmahamed, Y.; Kemari, Y.; Teguar, M.; Boubakeur, A. Diagnosis of Power Transformer Oil Using KNN and Naive Bayes Classifiers. In Proceedings of the 2018 IEEE 2nd International Conference on Dielectrics (ICD), Budapest, Hungary, 1–5 July 2018.
13. Poonnoy, N.; Suwanasri, C.; Suwanasri, T. Fuzzy Logic Approach to Dissolved Gas Analysis for Power Transformer Failure Index and Fault Identification. Energies 2021, 14, 36.
14. Benmahamed, Y.; Teguar, M.; Boubakeur, A. Diagnosis of Power Transformer Oil Using PSO‐SVM and KNN Classifiers. In Proceedings of the 2018 International Conference on Electrical Sciences and Technologies in Maghreb (CISTEM), Algiers, Algeria, 28–31 October 2018.
15. Ghoneim, S.S.M.; Taha, I.B.M. A new approach of DGA interpretation technique for transformer fault diagnosis. Int. J. Electr. Power Energy Syst. 2016, 81, 265–274.
16. Taha, I.B.M.; Mansour, D.A.; Ghoneim, S.S.M.; Elkalashy, N. Conditional probability‐based interpretation of dissolved gas analysis for transformer incipient faults. IET Gener. Transm. Distrib. 2017, 11, 943–951.
17. Ibrahim, S.; Ghoneim, S.S.M.; Taha, I.B.M. DGALab: an extensible software implementation for DGA. IET Gener. Transm. Distrib. 2018, 18, 4117–4124.
18. Ibrahim, S.; Taha, I.B.M.; Ghoneim, S.S.M. DGA Tool GitHub Repository. 2018. Available online: https://github.com/Saleh860/DGA (accessed on 19 May 2021).
19. IEEE Std C57‐104. IEEE Guide for the Reclamation of Mineral Insulating Oil and Criteria for Its Use; British Standards Institution: London, UK, 2015; p. 637.
20. Wani, S.A.; Khan, S.A.; Prashal, G.; Gupta, D. Smart Diagnosis of Incipient Faults Using Dissolved Gas Analysis‐Based Fault Interpretation Matrix (FIM). Arab. J. Sci. Eng. 2019, 44, 6977–6985.
21. Yazdani‐Asrami, M.; Taghipour‐Gorjikolaie, M.; Song, M.; Zhang, W.; Yuan, W. Prediction of Nonsinusoidal AC Loss of Superconducting Tapes Using Artificial Intelligence‐Based Models. IEEE Access 2020, 8, 207287–207297.
22. Wei, H.; Wang, Y.; Yang, L.; Yan, C.; Zhang, Y.; Liao, R. A new support vector machine model based on improved imperialist competitive algorithm for fault diagnosis of oil‐immersed transformers. J. Elect. Eng. Technol. 2017, 12, 830–839.
23. Tharwat, A.; Hassanien, A.E.; Elnaghi, B. A BA‐based algorithm for parameter optimization of Support Vector Machine. Pattern Recognit. Lett. 2017, 93, 13–22.
24. Liu, T.; Zhu, X.; Pedrycz, W.; Li, Z. A design of information granule‐based under‐sampling method in imbalanced data classification. Soft Comput. 2020, 24, 17333–17347.
25. ASTM. Standard Test Method for Analysis of Gases Dissolved in Electrical Insulating Oil by Gas Chromatography; D3612‐2; ASTM: West Conshohocken, PA, USA, 2017.
26. Available online: https://www.agilent.com/cs/library/usermanuals/public/G1176‐90000_034327.pdf (accessed on 19 May 2021).
27. Foysal, K.H.; Chang, H.J.; Bruess, F.; Chong, J.W. SmartFit. Smartphone Application for Garment Fit Detection. Electronics 2021, 10, 97.
28. Ali, S.; Smith, K.A. Automatic parameter selection for polynomial kernel. In Proceedings of the Fifth IEEE Workshop on Mobile Computing Systems and Applications, Las Vegas, NV, USA, 27–29 October 2003; pp. 243–249.