
Engineering Failure Analysis 101 (2019) 215–229


Converting data into knowledge for preventing failures in power transformers

Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Andres Melgar
Pontificia Universidad Católica del Perú, Peru

ARTICLE INFO

Keywords: Health; Knowledge; Machine learning; Transformers

ABSTRACT

This research estimates the overall condition of power transformers. It considers 60 power transformers and 10,198 electrical tests: factory and site acceptance tests, oil quality tests and insulation tests, among others. The Health Index (HI) is useful for planning maintenance strategy. The aim of this research is to present a new method for assessing the condition of transformers by applying a newly developed HI method; it improves on the traditional method by using an orthogonal wavelet network for the long-term degradation factors that cumulatively determine the transformer life span, and it is validated by a data mining process. It has been tested with machine learning algorithms with high accuracy. Rich metadata is required to find and understand the measurements from modern experiments with their immense and complex data stores; the new method allows a more practical categorization with various components, supporting better decisions with a multi-planning strategy. The means to store and manage these metadata have improved over time, but in most cases they are ad hoc collections of data relationships, often represented in domain- or site-specific application code.
A case study with 60 power transformers is carried out to evaluate the performance of the newly developed HI. The proposed method can therefore support efficient preventive maintenance plans for power transformers; it is an important contribution to knowledge, because the data are converted into knowledge.

1. Introduction

The health index is a technique widely used to plan maintenance strategy in power systems; with this approach to transformer maintenance it is possible to prevent early failures [1], which remain a serious problem for engineering failure analysis. The life span of power transformers ranges from 32 to 55 years with a standard deviation of 8 years, depending on design, loading, degradation of the paper and oil insulation, faults, high temperature and humidity [2]; these factors can reduce the transformer life span and cause a failure. Therefore, identifying and evaluating the condition of the transformers is an important activity in the maintenance plan.
There are many methods for evaluating the condition of equipment in transmission systems [3], including failure modes and effects analysis (FMEA) [4] and reliability-centered maintenance (RCM) [5,6], among others.
Many efforts have also been made in the representation of the health index [7]; however, the research can be improved, because it is necessary to determine the number of variables required and appropriate for the evaluation. A current limitation is that only the laboratory results (DGA analysis) are used [1–3,8]. Moreover, these studies did not consider electrical or quality tests in the current


Corresponding author.
E-mail addresses: ricardoariasvelasquez@hotmail.com, ricardo.ariasv@pucp.pe (R.M. Arias Velásquez).

https://doi.org/10.1016/j.engfailanal.2019.03.027
Received 15 January 2019; Accepted 24 March 2019
Available online 26 March 2019
1350-6307/ © 2019 Elsevier Ltd. All rights reserved.
R.M. Arias Velásquez, et al. Engineering Failure Analysis 101 (2019) 215–229

studies, and validation of the method was not possible in previous research [3–7].
Health index (HI) is a common and effective method to prioritize critical equipment and propose a maintenance plan [8]. This method can use the results of inspections and of field and laboratory tests. Many methods use the condition of the transformer: dissolved gas analysis (DGA), furan composition, degree of polymerization (DP), oil quality, power factor, visual inspections, and condition inspections of the bushings and on-load tap changer (OLTC).
This research has been developed to estimate the general condition of the power transformer through the Health Index (HI), considering the following test methods: dissolved gas analysis, power factor, oil quality, furans, visual inspections, state of the art, OLTC and bushings. The general condition of the transformers is evaluated by analysing the tests with known methods, IEEE [13,20] and IEC [26,27], described in Section 2, "Method". Each analysis is based on wavelet networks and ends with a score that determines the condition and category of the equipment; the score is validated with a data mining analysis that builds the decision tree showing the greatest influences and the possible differences and interactions with failure modes [35].
Building on concepts developed in these projects, we are developing a general system that could be used to represent all of these kinds of data relationships as mathematical graphs. Just as MDSplus and MPO were generalizations of data management needs for a collection of users, this new system will generalize the storage, location and retrieval of the relationships between data.
The system will store data relationships as data, not encoded in a set of application-specific programs or ad hoc data structures. Stored data would be referred to by URIs, allowing the system to be agnostic to the underlying data representations. Users can then traverse these graphs. The system will allow users to construct a collection of graphs describing any or all of the relationships between data items, locate interesting data, see what other graphs these data are members of, and navigate into and through them.
In addition to the general condition, the health index will determine the medium- and long-term renewal plan for the transformers.

2. Method

2.1. Wavelet networks and multi-resolution analysis

Wavelet networks combine the wavelet transform with the basic ideas of neural networks. The key property is excellent time-frequency localization. To explain the structure, we start by introducing the continuous and discrete wavelet transforms and multi-resolution analysis (MRA).
The continuous wavelet transform (CWT) of a function f(t) is detailed in Eq. 1.
$$W(a, b) = \int_{-\infty}^{\infty} f(t)\, \frac{1}{\sqrt{|a|}}\, \varphi^{*}\!\left(\frac{t - b}{a}\right) dt \qquad (1)$$

where a, b are real and * denotes the complex conjugation.


The wavelet term, written in compact form, is shown in Eq. 2.

$$\frac{1}{\sqrt{|a|}}\, \varphi\!\left(\frac{t - b}{a}\right) \qquad (2)$$

From Eq. 2, the function can be synthesized as in Eq. 3 [9], which is a non-redundant representation.

$$f(t) = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} d(k, l)\, 2^{-k/2}\, \varphi\!\left(2^{-k} t - l\right) \qquad (3)$$

The main point is that the dilation and translation parameters are not continuous but take discrete values.
Dilation takes values of the form 2^k, where k is an integer. Eq. 3 corresponds to sampling the coordinates (a, b) on a grid in which consecutive values of the discrete scales differ by a factor of 2. The two-dimensional sequence d(k, l) is referred to as the discrete wavelet transform (DWT). The DWT is still the transform of a continuous signal; the discretization applies only to the parameters a and b. In this sense it is analogous to the Fourier series. The dilated and translated functions are detailed in Eq. 4.

$$\delta_{j,n} = \varphi\!\left(2^{-j} x - n\right) \qquad (4)$$

where j, n ∈ Z; n is the translation parameter and j is the dilation parameter.
In particular, the representation means that a function f(t) in the space L²(R) can be approximated with different precisions, depending on the resolution of the space in which the approximation is made, as in Eq. (5).

$$f(t) = \sum_{l=-\infty}^{\infty} \mu(j, l)\, \varphi\!\left(2^{-j} t - l\right) \qquad (5)$$

Here f(t) denotes the approximation of the function at resolution j, and μ(j, l) are the coordinates of the scaling function in the subspace. Details are added to refine the approximations. The new subspaces W_i, which contain the details, are orthonormal and have orthonormal bases.
In addition, if the function f(t) is defined over a small region, the expansion is detailed in Eq. 6.


Fig. 1. Representation of wavelet networks [9].

$$f_k(t, \mu) = \sum_{l=L}^{U} \mu(j_0, l)\, \varphi\!\left(2^{-j_0} t - l\right) \qquad (6)$$

where l ∈ Z.
Eq. 6 represents the wavelet network, which can be used to approximate an unknown function using the wavelet functions as a basis. A graphic representation is shown in Fig. 1, which shows the input data and the contributions of the different variables in the model; this allows prediction by applying neural networks for time series and data mining analysis [22].
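As a rough illustration of the truncated expansion in Eq. 6, the sketch below evaluates a sum of translated scaling functions at a fixed resolution j0. It assumes a Haar (box) scaling function, which the paper does not specify, and it sets each coefficient μ(j0, l) from a midpoint sample of the target function rather than by training; all function names are hypothetical.

```python
def haar_scaling(x):
    """Haar (box) scaling function: 1 on [0, 1), 0 elsewhere (an assumed choice of phi)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0


def fit_coefficients(f, j0, lower, upper):
    """Set mu(j0, l) to f sampled at the midpoint of the support of phi(2**-j0 * t - l)."""
    width = 2.0 ** j0  # each translated box covers [l * width, (l + 1) * width)
    return {l: f((l + 0.5) * width) for l in range(lower, upper + 1)}


def wavelet_network(t, mu, j0):
    """Evaluate the form of Eq. 6: sum over l of mu[l] * phi(2**-j0 * t - l)."""
    return sum(c * haar_scaling(2.0 ** (-j0) * t - l) for l, c in mu.items())


# Approximate f(t) = t**2 on [0, 4) at resolution j0 = -2 (boxes of width 0.25).
mu = fit_coefficients(lambda t: t * t, j0=-2, lower=0, upper=15)
approx = wavelet_network(1.3, mu, j0=-2)   # piecewise-constant estimate of 1.3**2
outside = wavelet_network(10.0, mu, j0=-2) # no basis function covers t = 10
```

With the box function, the network output is piecewise constant, so lowering j0 (narrower boxes) tightens the approximation. A smoother scaling function or trained coefficients would replace the two assumed pieces without changing the summation itself.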

2.2. Tests for power transformers

This section describes the parameters that influence the general condition of the transformer and the origin of the data. Three types of tests are required: electrical tests, oil insulation tests and oil quality tests.

2.2.1. Electrical tests of the transformer


The electrical test determines the condition of the winding through the power factor of the insulation. This method checks the integrity of the insulation of the windings and bushings.
The condition of the power transformer is evaluated with the health index, which is obtained from the different test methods, subject to the availability of data and the experience of the operating personnel. The HI is classified into five levels: very good, good, regular, poor and very poor. This method is used to identify how good the condition of the transformer is.
The electrical test is used to inspect the condition of the winding through the power factor of the insulation [31]; this test checks the integrity of the winding insulation and determines the power factor of the total insulation, including winding and bushings [30], which can be associated with the electric field and pollution [29]. The percentage power factor is calculated from the measured voltage, current and power, as given in Eq. 7.
$$\%FP = P \times \frac{100}{V I} \qquad (7)$$

The limit should not exceed 0.5% at 20 °C [30]; the power factor limits are shown in Tables 1 and 2.
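A minimal sketch of Eq. 7 and the winding bands of Table 1 follows. The measurement values are hypothetical, and because Table 1 leaves small gaps between bands (for example between 1.0 and 1.1), the sketch assumes each band extends up to and including its printed upper limit.

```python
def percent_power_factor(power_w, voltage_v, current_a):
    """Eq. 7: %FP = P * 100 / (V * I)."""
    if voltage_v <= 0 or current_a <= 0:
        raise ValueError("voltage and current must be positive")
    return power_w * 100.0 / (voltage_v * current_a)


def winding_condition(pf_percent):
    """Map a winding power factor (% at 20 degC) to the bands of Table 1.

    Assumption: values falling in the gaps between printed bands are
    assigned to the band whose upper limit they do not exceed.
    """
    if pf_percent < 0.5:
        return "Very good"
    if pf_percent <= 1.0:
        return "Good"
    if pf_percent <= 1.5:
        return "Regular"
    if pf_percent <= 2.0:
        return "Poor"
    return "Very poor"


# Hypothetical insulation test: 10 kV applied, 20 mA drawn, 0.8 W of losses.
pf = percent_power_factor(power_w=0.8, voltage_v=10_000, current_a=0.02)
label = winding_condition(pf)
```

For these illustrative inputs the power factor works out to about 0.4%, which is under the 0.5% limit quoted above.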

2.2.2. Oil insulation analysis tests

The oil insulation test is performed to inspect the condition of the oil insulation. The integrity assessment consists of:

2.2.2.1. Dissolved gas analysis (DGA). The dissolved gas analysis method is used to analyse the concentrations and the decomposition process of the following gases: H2, CH4, C2H6 and C2H4 [10], and, for the insulation paper, CO and CO2 [11]. The results are interpreted according to the standards IEC 60599 [12] and IEEE C57.104 [13].

Table 1
Winding power factor limits (% FP at 20 °C).
HI          CH        CHL       CL        Capacitance
Very good   < 0.5     < 0.5     < 0.5     < 2%
Good        0.5–1.0   0.5–1.0   0.5–1.0   2.1–5%
Regular     1.1–1.5   1.1–1.5   1.1–1.5   5–8%
Poor        1.6–2.0   1.6–2.0   1.6–2.0   8–10%
Very poor   > 2       > 2       > 2       > 10%


Table 2
Limits of the power factor in bushings [30].
HI          C1 (%)            C1 capacitance
Very good   0–0.5             0%
Good        0.5–0.7           0–5%
Regular     0.7–1.0           5–8%
Poor        1.0–2.0           8–10%
Very poor   > 2 or negative   > 10%

The DGA test is used as a tool to determine the condition of the transformer. It indicates many problems and can identify the deterioration of the oil insulation, for example sulphur effects [28]. The diagnosis uses the total and individual gas concentrations and the daily trend, according to the IEEE C57.104 standard; the diagnosis is also made using gas ratios according to IEC 60599 [12]. Tables 3 and 4 detail the total and individual concentrations and the gas ratios.
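The band lookup behind Table 3 can be sketched as a simple threshold search. The example below covers only the H2 column, assumes the concentrations are in ppm, and assumes the printed "10" for the very good band means "up to and including 10"; the function name is hypothetical.

```python
from bisect import bisect_left

# Upper edge of each H2 band in Table 3 (assumed to be ppm, inclusive).
H2_UPPER_LIMITS = [10, 100, 700, 1800]
# HI factor for each band, from very good down to very poor.
FACTORS = [100, 30, 20, 10, 0]


def h2_factor(ppm):
    """Return the HI factor for a hydrogen concentration using the Table 3 bands."""
    if ppm < 0:
        raise ValueError("concentration cannot be negative")
    # bisect_left keeps a value equal to a limit inside the lower (better) band.
    return FACTORS[bisect_left(H2_UPPER_LIMITS, ppm)]


scores = {ppm: h2_factor(ppm) for ppm in (5, 100, 101, 1800, 2500)}
```

The same pattern applies to the other gases by swapping in their column of limits; combining the per-gas factors into one DGA score is not specified in this section and is left out.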

2.2.2.2. Furan. Cellulose deteriorates, producing furan compounds, due to heating, oxidation, acidity and humidity. Therefore, it is possible to approximate the remaining life from the degree of polymerization according to the IEC 61198 standard [23]. The furan studies consider the correlation between the content of 2-furaldehyde (2FAL) and the degree of polymerization according to the investigations of Heisler, Bazer [14] and Arias, Mejía [33], as in Eq. 8.

$$DP = 325 \times \left(\frac{19}{13} - \log_{10}(2FAL)\right) \qquad (8)$$

In Eq. 8, DP is set between 100 and 900 [33].

Chendong [15] gives Eq. 9.

$$DP = \left(\frac{1}{0.0035}\right) \times \left(1.51 - \log_{10}(\text{furans})\right) \qquad (9)$$

In Eq. 9, DP varies between 150 and 1000.

Pablo [16] gives Eq. 10.

Table 3
Diagnosis using total and individual concentration of gases.

Factors 1 to 3
Factor HI   HI          H2         C2H2     C2H4
100         Very good   10         5        5
30          Good        11–100     6–35     6–50
20          Regular     101–700    36–50    51–100
10          Poor        701–1800   51–80    101–200
0           Very poor   > 1800     > 81     > 200

Factors 4 to 6
Factor HI   HI          CO         CO2           CH4
100         Very good   < 30       < 500         0–50
30          Good        31–350     500–2500      51–120
20          Regular     351–570    2501–6000     121–400
10          Poor        571–1400   6001–10,000   401–1000
0           Very poor   > 1400     > 10,000      > 1000

Factors 7 to 9
Factor HI   HI          C2H6      TGC         TGC (ppm/day)
100         Very good   0–10      –           < 1
30          Good        11–65     720         1–5
20          Regular     66–100    721–1920    6–10
10          Poor        101–150   1921–4630   11–30
0           Very poor   > 150     > 4630      > 30


Table 4
Diagnosis using gas ratio according to IEC 60599.
HI    Description   C2H2/C2H4   CH4/C2H4   C2H4/C2H6
100   Very good     ND          < 0.1      < 0.2
30    Good          > 1         0.1–0.5    > 1
20    Regular       0.6–2.5     0.1–1      > 2
10    Poor          < 0.1       > 1        1–4
0     Very poor     < 0.2       > 1        > 4

$$DP = \frac{1850}{2FAL + 2.3} \qquad (10)$$

In Eq. 10, DP varies between 150 and 600.

The analysis of 2-furaldehyde (2FAL) diagnoses the decomposition of the paper in the transformer oil [33,34]. The limits are indicated in Table 5.
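To compare the correlations numerically, the sketch below implements Eq. 9 and Eq. 10 as printed. It assumes the 2FAL content is expressed in ppm, which this section does not state explicitly, and the function names are hypothetical; no clamping to the quoted DP ranges is applied.

```python
import math


def dp_chendong(fal_ppm):
    """Eq. 9: DP = (1.51 - log10(2FAL)) / 0.0035."""
    if fal_ppm <= 0:
        raise ValueError("2FAL content must be positive for the logarithm")
    return (1.51 - math.log10(fal_ppm)) / 0.0035


def dp_pablo(fal_ppm):
    """Eq. 10: DP = 1850 / (2FAL + 2.3)."""
    if fal_ppm < 0:
        raise ValueError("2FAL content cannot be negative")
    return 1850.0 / (fal_ppm + 2.3)


# Illustrative furan content at the Good/Regular boundary of Table 5.
fal = 1.0
estimates = {"Chendong": dp_chendong(fal), "Pablo": dp_pablo(fal)}
```

At 1.0 ppm the two correlations differ by more than 100 DP units, which is one reason to treat any single DP estimate as indicative rather than exact.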

2.2.2.3. Oil quality. The oil quality tests are represented by the electrical tests recommended by the standards IEC 60505-2011 [18] and IEEE C57.106-2006 [19].
The oil quality test is performed to check the general condition of the oil insulation. The quality is evaluated with 5 tests: dielectric strength, interfacial tension (IFT), neutralization number (NN) or acidity, water content or water saturation of the oil, and color [34].
The general state of the transformer, known as the health index, is obtained by evaluating each criterion according to the tests; see Table 6. The HI values are grouped into very good, good, regular, poor and very poor, and the factors are determined according to a logarithmic scale of 100, 30, 20, 10 and 0, as can be seen in Table 7.

2.2.2.4. Visual inspections. The visual inspection mainly provides a monthly external inspection. It checks the load history, general condition, bushings, conservator tank, main tank, radiator and cooling system, according to Table 7, with 10 criteria for the inspection parameters [37].

2.2.2.5. Health index. Finally, Table 8 gives the health index according to the methodologies [20].
A factor of 100 corresponds to a very good condition with more than 15 years of remaining life; a factor of 30 to a good condition, with the equipment under observation for a period of 5 to 15 years; 20 to a regular condition, requiring attention within 2 to 5 years; 10 to a poor condition, with 1 to 2 years; and 0 to a very poor condition requiring urgent attention, with less than 1 year.
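The mapping from factor to condition and remaining-life band can be captured in a small lookup, sketched below with the values of Table 8. The three equipment identifiers and their factors are taken from Table 9; the helper names are hypothetical.

```python
# Factor -> (condition, remaining-life band), transcribed from Table 8.
HEALTH_INDEX = {
    100: ("Very good", "> 15 years"),
    30: ("Good", "5-15 years"),
    20: ("Regular", "2-5 years"),
    10: ("Poor", "1-2 years"),
    0: ("Very poor", "< 1 year"),
}


def describe(factor):
    """Return a readable label for one of the five HI factors."""
    condition, life = HEALTH_INDEX[factor]
    return f"{condition} ({life})"


def rank_by_urgency(fleet):
    """Sort (equipment id, factor) pairs so the lowest factor comes first."""
    return sorted(fleet, key=lambda item: item[1])


fleet = [
    ("CHIA500TRF72-AUTOT-R", 100),
    ("CARA500TRF74-AUTOT-R", 0),
    ("CHMB500REA20-REACT-T", 10),
]
ranked = rank_by_urgency(fleet)
report = [(name, describe(factor)) for name, factor in ranked]
```

Sorting by ascending factor puts the units with the shortest remaining-life band at the top of the maintenance list.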

2.2.2.6. Data mining techniques and machine learning applications. Data mining has many definitions. The most common are: “Data mining is a process of discovering interesting patterns and knowledge from large amounts of data” [24]. “Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner” [25].
In the case of clustering using the K-means algorithm, the goal is to find the extremum of the objective function. The K-means objective function is defined as [26]:
$$J(U; M) = \sum_{i=1}^{m} \sum_{j=1}^{k} u_{ij} \cdot dis \qquad (11)$$

where J is the objective function; U is the matrix of object membership in the clusters; M is the matrix in which each row vector represents the centroid of a cluster; i = 1, 2, 3, …, m indexes the objects; j = 1, 2, 3, …, k indexes the classes (clusters); u_ij indicates the assignment of the i-th object to the j-th class (cluster); and dis is the measure of distance.
The results achieved using the K-means algorithm depend on the choice of distance [36]. The most common measures of distance are [13]:

Table 5
Diagnosis using 2-FAL [33].
Factor HI   HI          ppm                  ppm per year
100         Very good   2FAL < 0.5           < 0.020
30          Good        0.5 ≤ 2FAL < 1.0     > 0.020
20          Regular     1.0 ≤ 2FAL < 1.5     > 0.035
10          Poor        1.5 ≤ 2FAL < 2.0     > 0.040
0           Very poor   2FAL ≥ 2.0           > 0.050


Table 6
Diagnosis using oil quality [34].
Test                             < 69 kV      69 kV < V < 220 kV   > 220 kV     Factor HI   HI
Dielectric strength ASTM D1816   > 46         > 53                 > 56         100         Very good
                                 44–46        51–53                54–56        30          Good
                                 42–44        49–51                52–54        20          Regular
                                 < 42         < 49                 < 52         10          Poor
Interfacial tension              > 32         > 40                 > 40         100         Very good
                                 27–32        32–40                32–40        30          Good
                                 22–27        25–32                25–32        20          Regular
                                 < 22         < 25                 < 25         10          Poor
Colour                           < 1.5                                          100         Very good
                                 1.5–2.0                                        30          Good
                                 2.0–2.5                                        20          Regular
                                 ≥ 2.5                                          10          Poor
Neutralization number            ≤ 0.05       ≤ 0.005              ≤ 0.005      100         Very good
                                 0.051–0.07   0.006–0.01           0.005–0.01   30          Good
                                 0.07–0.10    0.01–0.05            0.01–0.05    20          Regular
                                 0.1–0.2      0.05–0.1             0.05–0.1     10          Poor
Water content                    < 25         < 15                 < 10         100         Very good
                                 25–30        15–20                10–15        30          Good
                                 30–35        20–25                15–20        20          Regular
                                 > 35         > 25                 > 20         10          Poor

Table 7
Condition parameters of the power transformer.
N°   Criteria description
1    Dissolved gas analysis
2    Load history
3    Power factor
4    Oil quality
5    Furan analysis
6    General condition
7    Bushing
8    Conservator tank
9    Main tank
10   Radiator cooling system

Table 8
Transformer health index.
Factor   Condition   Requirement                                                                  Color   HI (years)
100      Very good   Normal condition                                                             100     > 15
30       Good        Preventive maintenance is a good option for your assets                      30      5–15
20       Regular     You should consider diagnosis and tests                                      20      2–5
10       Poor        You should start planning for major maintenance or replacement (“overhaul”)   10      1–2
0        Very poor   Risk of failure                                                              0       < 1

- Euclidean distance [16],
- Manhattan distance [17],
- Minkowski distance [17],
- Chebyshev distance [18].

Euclidean distance is defined as:

$$dis(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2} \qquad (12)$$

where dis is the distance between vectors x and y; x_i are the components of the observation vector belonging to cluster x; and y_i are the components of the observation vector belonging to cluster y.
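The objective of Eq. 11 with the Euclidean distance can be sketched in a few lines. The example assumes hard assignments (u_ij is 1 when object i belongs to cluster j and 0 otherwise) and follows Eq. 11 as printed, summing plain distances; many K-means formulations sum squared distances instead. The data points are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length vectors (square-root form)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def assign(points, centroids):
    """Hard assignment: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]


def kmeans_objective(points, centroids, assignment):
    """Eq. 11 with u_ij in {0, 1}: sum of distances from each point to its centroid."""
    return sum(euclidean(p, centroids[j]) for p, j in zip(points, assignment))


# Hypothetical two-feature observations forming two obvious groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]
labels = assign(points, centroids)
objective = kmeans_objective(points, centroids, labels)
```

Swapping `euclidean` for a Manhattan or Chebyshev function changes both the assignments and the objective, which is the dependence on the distance measure noted above.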


Fig. 2. Wavelet networks used for health indices in transformers and reactors.

The “learning” process takes place while the algorithm creates internal selection criteria, so that when a new element is provided to the system for classification (say, a new circle of larger diameter that was never seen by the algorithm during training) it will be classified approximately correctly. The word “approximately” reflects the fact that there is always some classification error, accounted for by the accuracy of the individual algorithms [32].


Table 9
Model developed.
Qualification HI Localization Voltage Age

1 0 CARA500TRF74-AUTOT-R 500 6
2 10 CHMB500REA20-REACT-T 500 6
3 20 CARA500TRF73-AUTOT-R 500 6
4 20 CARA500TRF73-AUTOT-S 500 6
5 20 CARA500TRF73-AUTOT-T 500 6
6 20 CARA500TRF74-AUTOT-S 500 6
7 20 CARA500REA17-REACT-T 500 6
8 20 CHMB500REA19-REACT-R 500 6
9 20 TRJL500REA24-REACT-T 500 4
10 30 TRJL500REA21-REACT-R 500 6
11 30 TRJL500REA21-REACT-S 500 6
12 30 TRJL500REA21-REACT-T 500 6
13 30 CHMB500REA18-REACT-R 500 6
14 30 CHMB500REA19-REACT-S 500 6
15 30 CHMB500REA19-REACT-T 500 6
16 30 CHMB500REA20-REACT-R 500 6
17 30 LNIN500REA25-REACT-T 500 4
18 30 LNIN500REA25-REACT-RES 500 4
19 30 TRJL500REA24-REACT-R 500 4
20 30 TRJL500REA24-REACT-S 500 4
21 100 CARA500TRF73-AUTOT-RES 500 6
22 100 CHIA500TRF72-AUTOT-R 500 6
23 100 CHIA500TRF72-AUTOT-S 500 6
24 100 CHIA500TRF72-AUTOT-T 500 6
25 100 CARA500TRF74-AUTOT-T 500 6
26 100 CHIA500TRF72-AUTOT-RES 500 6
27 100 CARA500REA17-REACT-R 500 6
28 100 CARA500REA17-REACT-S 500 6
29 100 CARA500REA17-REAC-RES 500 6
30 100 TRJL500REA21-REACT-RES 500 6
31 100 TRJL500TRF85-AUTOT-R 500 6
32 100 TRJL500TRF85-AUTOT-S 500 6
33 100 TRJL500TRF85-AUTOT-T 500 6
34 100 TRJL500TRF85-AUTOT-RES 500 6
35 100 CHMB500REA18-REACT-S 500 6
36 100 CHMB500REA18-REACT-T 500 6
37 100 CHMB500REA18-REAC-RES 500 6
38 100 CHMB500REA20-REACT-S 500 6
39 100 CHMB500TRF84-AUTOT-R 500 6
40 100 CHMB500TRF84-AUTOT-S 500 6
41 100 CHMB500TRF84-AUTOT-T 500 6
42 100 CHMB500TRF84-AUTO-RES 500 6
43 100 LNIN500REA25-REACT-R 500 4
44 100 LNIN500REA25-REACT-S 500 4
45 100 LNIN500TRF91-AUTOT-R 500 4
46 100 LNIN500TRF91-AUTOT-S 500 4
47 100 LNIN500TRF91-AUTOT-T 500 4
48 100 LNIN500TRF91-AUTOT-RES 500 4

First, think of the entropy of a given system as the level of confusion that must be minimized in order to best separate the elements of that system into distinct classes. Classification and Regression Trees (CART) employ the concept of entropy and search for the minimum entropy in order to select the variables that best reduce the entropy of a given system and, for each of those variables, to find the level that best “separates” the data into at least two classes.
Entropy with two classes x and y is defined in Eq. 13.
$$\text{Entropy} = -P(x) \log_2 P(x) - (1 - P(x)) \log_2 (1 - P(x)) \qquad (13)$$
where P(x) is the probability of class x.
$$\text{Entropy} > 0 \qquad (14)$$
Entropy equals 1 at the top, where the two classes are equally mixed, and falls to zero at the bottom, where the two classes are perfectly separated. Imperfect separation of the classes, on the other hand, satisfies Eq. (14).
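A short sketch of Eq. 13 shows how a split is scored. The function names and the class counts are hypothetical; the weighted average of child entropies is the usual way a tree compares candidate splits and is assumed here rather than taken from the paper.

```python
import math


def binary_entropy(p):
    """Eq. 13 for a two-class node, where p is the probability of class x."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if p in (0.0, 1.0):
        return 0.0  # a pure node: the limit of p * log2(p) as p -> 0 is 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def split_entropy(children):
    """Size-weighted entropy of child nodes given (count_x, count_y) pairs."""
    total = sum(cx + cy for cx, cy in children)
    return sum(
        (cx + cy) / total * binary_entropy(cx / (cx + cy)) for cx, cy in children
    )


parent = binary_entropy(10 / 20)            # 10 of class x, 10 of class y
after = split_entropy([(9, 1), (1, 9)])     # an imperfect but useful split
perfect = split_entropy([(10, 0), (0, 10)]) # complete separation
```

The parent node sits at the maximum entropy of 1, the imperfect split lowers it while staying above zero, and only complete separation reaches zero.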

3. Case study implemented

The capacity of the wavelet networks has been verified by estimating the health index of the transformers. A


Fig. 3. Estimated wave networks vs. health index.

Fig. 4. Decision making tree.

simulation was carried out with a family of 500 kV transformers and reactors; for the corresponding health index tests, it covers 60 pieces of equipment with 1,520,290 technical data points from the test results.
To validate the capacity of the proposal to estimate health indices, the diagnosis and data are detailed in Fig. 2, with the grouped factors, the corresponding groupings and the consolidated evaluation, considering the wavelet networks.
The wavelet network is shown in Fig. 2. It consists of 19 nodes in the inner layer and was trained with standard algorithms.
The diagnostic tests were taken from the transmission systems in Peru. The data were: winding power factor, bushings, H2, C2H2, C2H4, CO, CO2, CH4, C2H6, TGC, C2H2/C2H4, CH4/C2H4, C2H4/C2H6, furans, dielectric strength, interfacial tension, color, neutralization number and water-in-oil content.
The method applied to engineering is developed in reference [21]. The values of the health index are between 0 and 100%. A health index of one (100%) represents an optimum operating band of the transformer [22]. A health index of 0 represents a transformer with a high risk and a high probability of failure.
Fig. 4 shows the decision tree with the components that contribute most. For the family with accelerated degradation, the furans determine the HI factor; the next greatest influence is methane, which is in turn influenced by ethane. There are two types of failure mode: one caused by methane and ethylene (low and high energy, respectively), and one related to total furans from improper assembly, caused specifically by the 2-FAL content, which is also studied. This analysis can be performed by family and by the contribution of each component, and it shows the influence on the diagnosis of functional failures in the associated families and the root cause.
The statistical perspective frames data in the context of a hypothetical function (f). For the data mining, we implement a


Fig. 5. Histograms and derived probability density functions of all 24 variables used.

Fig. 6. Map of missing values in the 1000 cases.

machine learning (ML) algorithm, which tries to learn a given pattern and thereby generalize that pattern to unseen data. To achieve such learning, it is necessary to “map” input to output with some sort of statistical function, as illustrated in Fig. 5.

The ML algorithms learn the statistical mapping between input and output, typically from a large number of examples provided in the training phase, in which each example generally contains multiple features.
Supervised learning takes place through a comparison between the output of each individual machine learning algorithm and the one provided by the human expert. An error function is defined and a suitable statistical process is employed to minimize it, so that each algorithm provides the best possible accuracy for its model. After learning, the algorithms are tested against the 60 unseen cases, and another accuracy is calculated. One interesting way of showing such test accuracy is through the so-called confusion matrix (Figs. 6 and 8).
There are several possible approaches to the problem of missing data, but one can say for sure that missing data is like a medical issue: it will not go away just because it is overlooked. A human expert will intuitively handle missing data by, for example, assuming that a missing parameter is normal and as such will not influence the decision about the condition of the transformer.
Statistically speaking, a single-value imputation (such as an educated guess or the replacement of missing data by the mean) may work well in certain applications but suffers from a significant change in the distribution of the data, as illustrated in Fig. 7.
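The distortion caused by mean imputation can be seen with a few numbers. The sketch below uses hypothetical readings with `None` marking a missing value; it is an illustration of the general effect, not a reproduction of the data behind Fig. 7.

```python
import statistics


def mean_impute(values):
    """Replace every missing entry (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical test readings with two missing measurements.
readings = [10.0, 12.0, None, 18.0, None, 20.0]
observed = [v for v in readings if v is not None]
imputed = mean_impute(readings)

mean_before = statistics.fmean(observed)
mean_after = statistics.fmean(imputed)
spread_before = statistics.pstdev(observed)
spread_after = statistics.pstdev(imputed)
```

The mean is preserved, but the spread shrinks because every filled value sits exactly at the centre of the distribution, which is the change in distribution described above.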


Fig. 7. Illustration of mean imputation.

Fig. 8. Illustration of entropy variation with class separation.

In practice, however, there is always some “misclassification” so that a non-circular shape may be classified as circular or a
circular shape as non-circular as illustrated in Fig. 7(b). Systems will typically have multiple features, and the process continues to the
next variable until no more separation is possible or until a given accuracy is found. The rights and wrongs in the classification
process will then determine the accuracy of the procedure.

4. Discussion

The performance of the model is evaluated with the wavelet methodology; the results are shown in Table 9.
From Table 9, the equipment can be prioritized according to its condition in a very practical way. A set of 10,198 data points, with the results of the tests on the inductive equipment, was analysed with data mining; the diagnosis is shown in Fig. 9.
By checking the serial numbers of the transformers, the health indexes were improved to determine the factors that define an equipment criticality index, as can be seen in Fig. 3 and Fig. 4.


Fig. 9. Contribution and influence of the health index and equipment rating by component.

Fig. 10. Training accuracy of machine learning algorithms.

Finally, data mining establishes the influences of the health index and the rating of the equipment, with the contribution of each element shown in Fig. 5, confirming a direct and solid influence in the verification, with a correlation coefficient of 95.6% in the family of equipment.
Fig. 10 shows boxplots comparing the training accuracy of the 12 machine learning models employed in the present work. Notice that the top 5 models are all variations and ensembles of CART; their major differences lie in the process of building the multiple trees that best separate the data after learning from the training dataset [38,39]. The test accuracy is obtained by comparing the output of the system when classifying data not used during training (200 new cases) against the human experts' opinion for those cases [40,41].
Fig. 10 compares the accuracy of the machine learning algorithms after training 12 models with 80% of the available data, using 10-fold cross validation (CV) with 3 repeats. The ML algorithms were Naïve Bayes, Linear Discriminant Analysis (LDA), Classification and Regression Trees (CART), General Linear Model (GLM), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Artificial Neural Networks (ANN), Tree Bagging, Extreme Gradient Boosting Machine (xGBM1 and xGBM2), Random Forest (RF) and C5.0.
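The resampling scheme described above (an 80% training portion, then 10-fold cross validation repeated 3 times) can be sketched with index bookkeeping alone. The total of 1000 cases is an assumption consistent with the 1000 cases of Fig. 6 and the 200 held-out cases; the function names and seeds are hypothetical, and no model is trained here.

```python
import random


def holdout_split(n_samples, train_fraction=0.8, seed=0):
    """Shuffle case indices and split them into training and test sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * n_samples)
    return idx[:cut], idx[cut:]


def repeated_kfold(indices, k=10, repeats=3, seed=0):
    """Yield (repeat, fold, train, validation) index lists for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)  # a fresh shuffle for every repeat
        folds = [shuffled[i::k] for i in range(k)]
        for i, validation in enumerate(folds):
            train = [s for j, fold in enumerate(folds) if j != i for s in fold]
            yield r, i, train, validation


train_idx, test_idx = holdout_split(1000)  # assumed total of 1000 cases
splits = list(repeated_kfold(train_idx))
```

Each of the 30 resulting splits would be used to fit and score one model, giving the distribution of training accuracies that a boxplot such as Fig. 10 summarizes; the held-out indices stay untouched until the final test.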
In Fig. 11, each node represents a major transformer component or operational data, plus essential test results such as DGA and electrical tests; the monitors show the belief in that particular node or functionality given no evidence and based on prior probabilities (prior knowledge). This is called “instantiation” of the BN.


Fig. 11. Simplified Bayesian Network (BN) or Belief Propagation Network (BPN).

Given evidence of a bushing issue and an oil leak (see node clicks), notice the influence on the belief in the accessories, which falls from 87.7% Good in (a) to 39% in (b); the belief in the main tank falls from 79.3% Good in (a) to 57.8% in (b); and finally the belief in good health falls from 76% in (a) to 64.4% in (b).

5. Conclusion

The proposed health index estimation method, called HI, uses orthogonal wavelet networks and has shown high efficiency in representing a complex asset such as a power transformer. HI quantifies the condition of the equipment based on numerous condition criteria related to the long-term degradation factors that cumulatively determine the transformer life span. The new method structures, in a more practical way, a categorization with various components, to achieve better prioritization in decision making with a variable planning strategy.
The method links the wavelet transform with the multi-resolution analysis (MRA) approach and combines the various factors into a single condition, in addition to the usual test data that have been used in the past to evaluate the

physical health status of transformers.


Finally, this work provides a fundamental tool for asset management: it allows priorities to be established on the basis of diagnostics and
information, and it streamlines the interpretation of knowledge about the most important physical asset of the energy
transmission system, the transformer.
The machine learning (ML) algorithms achieved high accuracy when analyzing complex power transformer data without the
use of any engineering model. In other words, the algorithms were not provided with reference levels or flags to indicate
that a given parameter was within the acceptable range or outside “normal” levels.
In practical terms, the significant misses are the 3 yellow cases classified as green out of 200 total cases, giving a real miss rate of
3/200 = 1.5%; the other misses were conservative and would not lead to an unfavorable situation such as a possible failure. The paper
also demonstrated the importance of human expert judgment in the training and learning process of the ML algorithms, particularly
with respect to power transformer diagnostics.
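The distinction between real and conservative misses can be sketched as follows. The severity ordering (green, yellow, red) and the counts of conservative misses and correct cases are assumptions made for the example; only the 3 yellow cases classified as green out of 200 come from the text above.

```python
# Assumed severity ordering of the condition classes (hypothetical).
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def miss_rates(cases):
    """Return (real_miss_rate, conservative_miss_rate) for (actual, predicted) pairs.

    A real miss predicts a healthier class than the actual one; a conservative
    miss predicts a more severe class than the actual one.
    """
    cases = list(cases)
    real = sum(1 for actual, predicted in cases
               if SEVERITY[predicted] < SEVERITY[actual])
    conservative = sum(1 for actual, predicted in cases
                       if SEVERITY[predicted] > SEVERITY[actual])
    return real / len(cases), conservative / len(cases)


# 3 yellow cases classified as green (from the text); the remaining split
# between conservative misses and correct cases is invented for illustration.
cases = ([("yellow", "green")] * 3
         + [("green", "yellow")] * 5
         + [("green", "green")] * 192)

real_rate, conservative_rate = miss_rates(cases)
print(f"Real miss rate: {real_rate:.1%}")
```

Reporting the two rates separately reflects the asymmetry of the consequences: a conservative miss triggers an unnecessary inspection, whereas a real miss may leave a degrading transformer unattended.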

Acknowledgement

The authors thank the Pontificia Universidad Católica del Perú for its support in the development of this research.

References

[1] R. Murugan, R. Ramasamy, Understanding the power transformer component failures for health index-based maintenance planning in utilities, Eng. Fail. Anal.
96 (2019) 274–288.
[2] CIGRE Working Group 37-27, Ageing of the System Impact on Planning, (2000), pp. 1–41.
[3] R.M.A. Velásquez, J.V.M. Lara, Electrical assessment by lightning phenomenon in power lines of double circuit, IEEE Lat. Am. Trans. 14 (5) (2016) 2217–2225.
[4] Mohsen Akbari, P. Khazaee, I. Sabetghadam, P. Karimifard, Failure Modes and Effects Analysis (FMEA) for Power Transformers, 28th Power System Conf. Iran,
(2013).
[5] H.A. Aldhubaib, M.A. Salama, A novel approach to investigate the effect of maintenance on the replacement time for transformers, IEEE Trans. on Power
Delivery 29 (2014) 1603–1612.
[6] G. Wolf, T. Loczi, A. Vilimi, Risk and reliability - focused maintenance Paks NPP-s for the maintenance strategy technical meeting on maintenance optimization
to improve nuclear power plant performance, Technical Meeting on Maintenance Optimization to Improve Nuclear Power Plant Performance (2014) 1–5.
[7] A.N. Jahromi, R. Piercy, S. Cress, W. Fan, An Approach to Power Transformer Asset Management Using Health Index, IEEE Insul. Mag, (2009), pp. 25–1-5.
[8] A. Naderian, S. Cress, R. Percy, An approach to determine the health index of power transformers, IEEE Int. Symp. Electrical Insulation, Canada (2008) 192–196.
[9] M. Ahmed, M. Elkhatib, M. Salama, Transformer health index estimation using orthogonal wavelet network, 2015 IEEE Electrical Power and Energy Conference
(EPEC), 2015, pp. 210–224.
[10] A. Siada, S. Islam, A new approach to identify power transformer criticality and asset management decision based on dissolved gas-in-oil analysis, IEEE Trans.
Dielectrics and Electrical Insulation 19 (3) (2012) 1007–1012.
[11] N.A. Baker, A. Abu-Siada, S. Islam, A review of dissolved gas analysis measurement and interpretation techniques, IEEE Electr. Insul. Mag. 30 (3) (2014) 39–49.
[12] International Electrotechnical Commission, IEC 60599 Edition 2.1. 2007–05. Mineral Oil-Impregnated Electrical Equipment in Service, (2007), pp. 1–69.
[13] IEEE-SA Standards Board IEEE Std, C57.104–2008, IEEE Guide for the interpretation of Gases Generated in Oil – Immersed Transformers, (2008), pp. 1–90.
[14] A. Heisler, A. Banzer, Zustandsbeurteilung von Transformatoren mit Furfurol-Bestimmung, Frankfurt, Germany, ew, Heft. 16, vol. 102, (2003), pp. 58–59.
[15] X. Chendong, Monitoring paper insulation ageing by measuring furfural contents in oil, 7th Int. Symp. High Voltage Eng., Dresden, Germany, vol. 74, 1991, pp.
26–30 vol. 6.
[16] A. De Pablo, Interpretation of Furanic Compounds Analysis Degradation Models, CIGRE WG D1.01.03, Former WG. 15–01, Task Force 03, Paris, France, (1997),
pp. 1–87.
[17] A.B. Shkolnik, R.T. Rasor, Statistical insights into furan interpretation using a large dielectric fluid testing database, IEEE Power Energy Soc. Transm. Distrib
(2012) 1–5.
[18] International Standard IEC 61198 Edition 1.0. 1993–09, Mineral Insulating Oils. Methods for the Determination of 2-Furfural and Related Compounds,
(1993), pp. 1–28.
[19] International Standard IEC 60505:2011 Edition 4.0, Evaluation and Qualification of Electrical Insulation Systems, (2011), pp. 1–151.
[20] IEEE Transformers Committee, IEEE Std. C57.106–2006, IEEE Guide for Acceptance and Maintenance of Insulating Oil in Equipment, (2007).
[21] A. Jahromi, R. Piercy, S. Cress, J. Service, W. Fan, An approach to power transformer asset management using health index, electrical insulation magazine, IEEE
25 (2009) 20–34.
[22] Y. Chen, B. Yang, J. Dong, Time-series prediction using a local linear wavelet neural network, Neurocomputing 69 (2006) 449–465.
[23] International Standard 61198:1993–9 Edition 1, Methods for the Determination of 2-Furfural and Related Compounds, vol. 9, (1993), pp. 1–28.
[24] J. Han, M. Kamber, Data Mining: Concepts and Techniques, (2011), https://doi.org/10.1007/978-3-642-19721-5.
[25] D. Hand, H. Mannila, P. Smyth, Principles of Data Mining, (2001), https://doi.org/10.2165/00002018-200730070-00010.
[26] International Electrotechnical Commission, IEC 60422:2013: Mineral insulating oils in electrical equipment - Supervision and maintenance guidance, (2013), pp.
1–88.
[27] International Electrotechnical Commission, IEC 60567:2011 Oil-filled electrical equipment - Sampling of gases and analysis of free and dissolved gases –
Guidance, (2011), pp. 1–54.
[28] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejia Lara, Corrosive Sulphur effect in power and distribution transformers failures and treatments, Eng. Fail.
Anal. 92 (2018) 240–267.
[29] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejia Lara, Current transformer failure caused by electric field associated to circuit breaker and pollution in
500 kV substations, Eng. Fail. Anal. 92 (2018) 163–181.
[30] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejia Lara, Bushing failure in power transformers and the influence of moisture with spectroscopy test, Eng.
Fail. Anal. 94 (2018) 300–312.
[31] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, The need of creating a new nominal creepage distance in accordance with heaviest pollution
500kV overhead line insulators, Eng. Fail. Anal. 86 (2018) 21–32.
[32] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Reliability, availability and maintainability study for failure analysis in series capacitor bank, Eng.
Fail. Anal. 86 (2018) 158–167.
[33] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Life estimation of shunt power reactors considering a failure core heating by floating potentials,
Eng. Fail. Anal. 86 (2018) 142–157.
[34] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Principal components analysis and adaptive decision system based on fuzzy logic for power
transformer, Fuzzy Information and Engineering. 9 (4) (2017) 493–514.
[35] Max Kuhn, Kjell Johnson, Applied Predictive Modelling, Springer, 2013.
[36] Ian Witten, Eibe Frank, Mark Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd edition, Elsevier, 2011.

[37] W. Wattakapaiboon, N. Pattanadech, The new developed health index for transformer condition assessment, International conference on condition monitoring
and diagnosis – Xi'an – China 2016 (2016) 32–35.
[38] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Improvement in the design of power oil-filled reactors to avoid faults of seismic origin, Eng. Fail.
Anal. 97 (2019) 416–433.
[39] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Model for failure analysis for overhead lines with distributed parameters associated to
atmospheric discharges, Eng. Fail. Anal. 100 (2019) 406–427.
[40] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Failures in overhead lines grounding system and a new improve in the IEEE and national
standards, Eng. Fail. Anal. 100 (2019) 103–118.
[41] Ricardo Manuel Arias Velásquez, Jennifer Vanessa Mejía Lara, Reliability model for switchgear failure analysis applied to ageing, Eng. Fail. Anal. 101 (2019)
36–60.
