Artificial Intelligence-Based Power Transformer He-1

This article has been accepted for publication in a future issue of this journal, but has not been
fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.Doi Number
Artificial Intelligence-based Power Transformer

Health Index for Handling Data Uncertainty
Dhanu Rediansyah1,2, Rahman Azis Prasojo1,4, (Graduate Student Member, IEEE)
Suwarno1, (Senior Member, IEEE), and A. Abu-Siada3, (Senior Member, IEEE)
1
School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung 40132, Indonesia;
2
PT. PLN (Persero) Head Office, Jakarta 12160, Indonesia
3
Electrical and Computer Engineering Discipline, Curtin University, Perth, W A 6102, Australia
4
Electrical Engineering Department, Politeknik Negeri Malang, Malang 65141, Indonesia
Corresponding author: Rahman Azis Prasojo (rahmanazisp@polinema.ac.id).

This work was supported by PT. PLN (Persero) and Institut Teknologi Bandung.
ABSTRACT Power transformer is a critical and expensive asset in electric transmission and distribution
networks. It is essential to monitor the health condition of all power transformer fleet in such networks to
avoid unwanted outages. The health index (HI) is a quick and efficient way to assess the condition of power
transformers based on multi-criteria. While Power transformer HI method has been well presented in the
literature, not much attention was given to handle the uncertainty and reliability of this method due to
unavailability of used data. Therefore, this paper aims to tackle this issue through employing Artificial
Intelligence (AI)-based techniques to reveal the health condition of power transformers with high accuracy
and at the same time handling data uncertainty. The proposed HI approach assesses the power transformer
insulation system based on oil quality, dissolved gas analysis (DGA), and paper condition. In this regard,
collected data from 504, 150-kV transformers are used to establish the proposed AI-models. Seven AI
algorithms including k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Random Forest (RF),
Naïve Bayes (NB), Artificial Neural Network (ANN), Adaptive Boosting (AdaBoost), and Decision Tree are
investigated. A performance comparison of the proposed AI-based HI models is carried out using the scoring-
weighting-based HI method as the reference. Results show that RF model provides the best performance in
predicting power transformer HI with an accuracy of 97.3%.
INDEX TERMS Power transformer, health index, insulation system, condition monitoring, artificial
intelligence.
I. INTRODUCTION of power transformers have been assessed through various

Power transformers are among the most critical and parameters of the insulation system [8], [9].
expensive assets within the electrical transmission and Reliable and cost effective condition monitoring and asset
distribution networks. With the continous increase in the load management techniques are essential for electric utilities in
demand, power transformers are operating close to their preparing an appropriate financial plans to estimate the future
nominal ratings and becoming more prone to failures. cost of maintenance, repair, and replacement of their power
Therefore, power transformers need to be continousely transformer fleet. In this regard, much research effort has
monitored during their entire operational life to avoid sudden been conducted to help utility companies optimize their asset
and catstrophic failures. Over the past two decades, several maintenance costs. Transformer asset management (TAM)
condition monitoring techniques have been developed for practice has been explained in [10]–[12] in which strategic
power transformers [1]. Among these techniques, insulation plans for future maintenance and replacement activities
system has been a common key component to identify based on various diagnostic testing methods are presented.
transformer health state and estimate its useful rmnant life The diagnostic testing methods can be generally categorized
[2]. Several papers to explain the aging mechanism of the into condition monitoring (CM) and condition assessment
transformer oil and paper insulation have been published in (CA) techniques. CM techniques employ electrical,
the literatures [3]–[7]. The health condition and remnant life
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
chemical, and physical tests to be collectively used by the CA (GRNN) was introduced in [28] to calculate the HI based on
tools to determine the health condition of the transformer. four-class transformer conditions; very poor, poor, fair, and
The transformer health index (HI), as primarily explained good. In [29], AI-based HI using Bayesian network was
in [13], is a single factor that utilizes the information from proposed to quantify the parameters contribution through
operating observations, field inspections, and laboratory tests score-probability and population failure statistics. In [30],
to provide a reliable TAM decision. Majority of the data used principal component analysis (PCA) and analytical hierarchy
in the HI model are based on the insulation system testing. process (AHP) were proposed to determine the transformer
Oil insulation tests include dissolved gases analysis (DGA), health condition based on an expert empirical formula.
oil quality analysis (OQA), and furan analysis (FFA). Due to The use of AI methods to define transformer health index
the high electric and thermal stress within operating has been presented in the literatures. For instance, decision
transformers, oil and paper insulation decomposes and tree, random forest (RF), static vector machine (SVM),
releases some gases that dissolve in the oil and decrease its artificial neural network (ANN), and k-nearest neighbor
dielectric strength. These gases include hydrogen (H2), (kNN) methods were used in [27] to automate the assessment
methane (CH4), ethylene (C2H4), acetylene (C2H2), ethane process. A study in [31] compared several machine learning
(C2H6), carbon monoxide (CO), and carbon dioxide (CO2). algorithms such as ANN, SVM, and Gaussian bayesian
DGA test is conducted to quantify these gases from which networks (GBN), to evaluate the health condition of power
internal transformer electrical and thermal faults can be transformers through probabilistic HI framework. An
identified [14]. Considering the guideline in [15], the OQA approach is presented in [32] that compared the performance
is determined by analyzing the oil breakdown voltage of various AI methods such as naïve Bayes (NB),
(BDV), acidity, water content, interfacial tension (IFT), multinomial logistic regression (MLR), ANN, SVM, kNN,
dielectric dissipation factor (DDF), and color. FFA is one rule (OneR), decision tree (J48), and RF in identifying
conducted to measure furan compunds that are generated due transformer health index. However, regardless the various AI
to cellulose degradation and dissolve in the transformer oil methods proposed in the literatures to calculate the
[16]. Among the 5 furan compunds, furfuraldehyde (also transformer HI, more thorough studies are still required to
known as furfural/2FAL) dominates the measurements and is improve the accuracy of such methods, in particular with the
correlated to the degree of polymerization of the paper uncertainaty or unavailability of the used data.
insulation [17]. As discussed in [9], data uncertainty is the main
A scoring-weighting (also called weighted-sum) is the shortcoming of the HI method which affects its reliability and
most commonly used method to calculate the HI of power accuracy. Data unavailability is the main reason of
transformers [13], [18]–[23]. The approach starts by uncertainty. The study in [33] reported the effect of data
comparing each parameter to a scoring table, then weighting unavailability and proposed a remidial approach based on RF
each parameter based on its importance. The weights are to predict the missing data and improve the scoring-
usually identified by expert personnel. The individual scores weighting HI. However, the development and investigation
are combined into a single index that reveals the overall of AI-based HI model in handling data uncertainty has not
health condition of the transformer. yet discussed thoroughly with proven implementation
As the HI is a linear combination of different scores and feasibility.
weighted measurement data, it is a challenging task to deal From the above discussion, the gaps in this area of research
with the uncertainty of the used data throgh the current utility can be summarized as below:
practice to identify the HI. • despite the several AI-based methods used to identify the
The rapid development of computer science and data HI of power transformers, accuracy of such methods is still
processing has resulted in new HI approaches based on relatively low.
machine learning algorithms for big data analysis [9]. • there is no reliable and widely accepted technique to deal
Artificial Intelligence (AI)-based HI approach was presented with data uncertainity which affects the reliability of the
in [24] using DGA, furan, and oil test data to assess the health current HI practice.
condition of several in-service transformers. However, due
to the limited available data, the prediction accuracy of the As such, the main contribution of this paper is to:
developed neuro-fuzzy model has only been 56.3%. The • construct an AI-based approach to enrich the current
study in [25] conducted a transformer HI sensitivity analysis conventional HI scoring-weighting method for power
using self adaptive neuro-fuzzy inference system (ANFIS) transformers.
whose parameters were tuned using particle swarm optimizer • improve the accuracy of the HI calculation using multi-AI
(PSO). In [26], probabilistic Markov chain models were used models to process most common routine testing data for power
to predict the future condition of the transformer based on HI transformer insulation system.
calculation using a non-linear optimization technique. AI • develop an effective correlation for all used parameters in
with a fuzzy-based support vector machine (SVM) was calculating the transformer HI.
employed in transformer HI-based condition assessment • identify the best AI-based combination for dealing with data
using several factors, industry standards, and utility expert uncertainty and missing values.
judgment [27]. AI-based general regression neural network
II. METHODOLOGY content, interfacial tension (IFT), acidity, and color. In

The study in this paper is conducted in accordance to the addition, 2-FAL measurement are provided for some units.
workflow of Figure 1. Diagnostic data of 504, 150kV-power Normally, the oil testing is conducted once a year or more
transformers including oil characteristics, furan content, and frequent, if needed. Data used for the analysis in this paper are
DGA were collected from in-service units of various life spans the same year data when HI is calculated, therefore omitting
and health conditions. The assessment data are used to the uncertainty caused by outdated data. The age classification
calculate the power transformer insulation system HI using the of the observed population sample is shown in Figure 2.
conventional Scoring-Weighting (SW) method. At the same
time, data are used to establish the AI-based HI method. The 140
performance of various AI algorithms is compared to identify
Operating Age of Transformer

120
the best performing AI algorithm through comparison with the
SW method as a reference. The effect of unavailable data on 100
the accuracy of both conventional SW and the identified best 80
performing AI-based HI is evaluated by adjusting the complete
data into 50 missing data scenarios. A preprocessing approach 60
is then proposed to enhance the ability of the AI-based HI 40
model to deal with such unavailable data and provide an
20
accurate HI value.
0
Start <5 6-10 10-15 16-20 21-25 26-30 30-35 >36
Qty 84 80 53 127 98 43 14 5
504 Power Transformer

Figure 2. The age classification of the transformer population sample
Assessment Data (150 kV)
B. TRANSFORMER HEALTH INDEX
There are many methods to calculate the transformer HI
Scoring-Weighting Development of AI- Adjust to 50 Missing score that reflects the overall health condition of the power
HI Method based HI Method Data Scenarios transformer. Generally, there are two main approaches to
determine the HI:
Various Scoring-Weighting: This approach starts with a scoring and
Algorithms weighting process conducted by expert personnel. As shown
Comparison in Figure 3, condition data are processed into scores by
comparing them to the scoring tables. Then, individual scores
Evaluate are weighted and aggregated into a single index value
revealing the overall transformer health condition. A final
AI-Based HI health index is a linear combination of different scored and
Best Performing with
AI-Based HI
weighted measurement data. Therefore it is a challenging
Preprocessing
task to conduct a consistent and reliable HI process in case
of some of the data are unavailable. In this regard, this paper
highlights the drawback of the conventional scoring-
Evaluate the effect of Unavailable
Data to HI Accuracy
weighting models in handling unavailable data using the
certainty level and output accuracy as proposed in [33].
Finish S1 S1xW1
Condition Assessment Weight
(Scoring (W1)
Data 1
Figure 1. Flowchart of the proposed methodology Function)
S2xW2 Health
Index
A. TRANSFORMER POPULATION Condition S2
(HI)
Assessment
The data used in this study were collected from PT. PLN Data 2
(Scoring
Weight HI =
(W2) ∑ ∑Sf xWf
PERSERO, an Indonesian state electricity company. Data are Function)
collected from 504 units, all with a specific high voltage of 150
kV with a low voltage of either 70 kV or 20 kV. Majority of Sn
these transformers use Kraft paper insulation and they are Condition Assessment
Weight
Data n (Scoring (Wn)
periodically inspected in a condition-based maintenance Function) SnxWn
scheme by checking their insulation properties through DGA
and dielectric characteristics (oil quality analysis). Dielectric
Figure 3. The principle of conventional HI scoring-weighting method
characteristics include oil breakdown voltage (BDV), water
Artificial Intelligence: various AI methods have been

employed to estimate the transformers HI. The typical AI-
based HI model development is as shown in Figure 4. In this Data Layer H2 C2H6 C2H2
method, training database including various diagnostic CO-
CO2
measurements conducted on several transformers at different Water IFT
CH4 C2H4 2FAL
health condition levels must be carefully prepared. The Age
training dataset is used to establish the AI model to estimate BDV Color Level Rate DPM
the overall transformer health condition. As discussed above,
Acid
Various artificial intelligence algorithms have been used to
establish such model including ANN, ANFIS, SVM, GRNN,
decision trees [34]. The use of AI in power transformer Oil Paper
Factor Quality Faults Condition
condition assessment is useful especially for analysing large Layer Factor (FF) Factor
Factor
transformer datasets [35]. (OQF) (PCF)
Condition Data Index Transformer

1 Layer Health Index
(HI)
Artificial
Condition Data
2 Intelligence Health Index (HI)
(AI) HI = f(V1, V2,...Vn)
Figure 5. Layers of the proposed AI Method
algorithms
Condition Data
D. SCORING-WEIGHTING-BASED HI
n All of the data used to calculate the transformer HI are
determined by international standards limits [37], [38]. The
Figure 4. The principle of AI-based HI Method scoring-weighting based HI for each factor is calculated as in
(1).
C. POWER TRANSFORMER INSULATION SYSTEM HI
The HI is developed based on the transformer reliability ∑𝑛
𝑖=1 𝑆𝑖 𝑊𝑖
𝐻𝐼𝑒𝑎𝑐ℎ 𝑓𝑎𝑐𝑡𝑜𝑟 = ∑𝑛
(1)
survey published by CIGRE TB 642 [36] that is discussing 1 𝑊𝑖
the failure modes of high voltage power transformers. The where n is the number of parameters used in every factor. The
survey results show that the most common fault location for value of Si is based on the scoring for each parameter based
power transformer is the winding, with a dominant fault on standard scoring tables, and Wi is the weighting factor that
mode within the dielectric system. The main cause of failures describes the importance of every parameter.
includes aging and external short circuit. Therefore, this The assessment methods are based on Tables I through VI,
study focuses mainly on the integrity of the power along with Figure 6. The final HI value is calculated using (2).
∑𝑛
𝑗=1 𝑆𝐹𝑗 𝑊𝑗
transformer insulation system. The structure of the proposed 𝐻𝐼𝑓𝑖𝑛𝑎𝑙 = 𝑥 100% (2)
HI method is divided into three-layers as shown in Figure 5 ∑𝑛
𝑗=1 4𝑊𝑗
which is based on previously proposed method in [20]. The where SFj is the parameter’s scoring factor and Wj is the
HI is equipped with scoring tables to score each weighting factor.
measurement, as well as weighting factors based on the The weighting parameters and factors are obtained by using
reliability and criticality of each parameter. These tables and AHP. This technique is built based on the judgment of five
weighting factors can be continuously adapted based on new experts with in-depth experience in transformer condition
information, experience, and recent international guidelines. monitoring, fault diagnosis and asset management [39], [40].
The layers in Figure 5 include:
TABLE I
Data layer: this layer consists of frequently measured THE SCORING-WEIGHTING FOR OIL QUALITY
diagnostic data such as key gases of DGA, BDV, water
Score
content, IFT, acidity, oil color, 2FAL, and transformer age. Oil Quality Weight
Factor layer: This layer has three categories that are derived 1 2 3 4
from data layer/ measurement: BDV (kV) >50 50-45 45-40 <40 0.169
1. Oil Quality Factor (OQF) that consists of oil parameters;
BDV, water content, IFT, acidity, and color. Water Content
<20 20-25 25-30 >30 0.108
2. Faults factor (FF), which consists of gas evolution, gas (ppm)
concentration level, and Duval’s pentagon DGA Acidity 0.1- 0.15-
interpretation method (DPM), which is shown in Figure 6. <0.1 >0.2 0.139
(MgKOH/mg) 0.15 0.2
3. Paper condition Factor (PCF): This factor consists of the
IFT
operating age, CO/CO2 ratio, and 2FAL concentration. (Dyne/cm)
>35 35-25 25-20 <20 0.124
Health Index layer: in this layer the HI calculation is
conducted to provide a single value that reveals the overall Color Scale <1.5 1.5-2 2-2.5 >2.5 0.114
health condition of the investigated transformer.
TABLE II The health index category and description is classified as

THE LEVEL SCORING FOR DGA shown in Table VII. A more detailed explanation of the
H2 CH4 C 2 H6 C 2 H2 C 2 H4 scoring-weighting method can be found in [20].
L1 <80 <90 <90 <1 <50 TABLE VI
THE SCORING FOR HEALTH INDEX FACTOR
L2 80-200 90-150 90-170 1-2 50-100
Health Index Rating
Score Factor
L3 200-320 150-210 170-250 2-3 100-150 Factor Code
L4 >320 >210 >250 >3 >150 <1.2 A 4

1.2-1.5 B 3
1.5-2.0 C 2
TABLE III
THE RATE SCORING FOR DGA 2.0-3.0 D 1
H2 CH4 C 2 H6 C 2 H2 C 2 H4 >3.0 E 0
R1 <20 <20 <29 0 <7
R2 20-31 20-37 29-58 - 7-16
TABLE VII
R3 31-59 37-72 58-145 0 16-48 THE HEALTH INDEX CATEGORIES
R4 >59 >72 >145 >0 >48 Health Index Health
Description
Score Index Code
TABLE IV
85-100 VG Very Good
THE MATRIX OF FAULT FACTOR CATEGORY
GAS LEVEL AND RATE CONDITION 85-100 G Good
C.1 C.2 C.3 C.4 50-70 C Caution
C.1 A FALSE FALSE FALSE 30-50 P Poor

C.2 FALSE B B C 0-30 VP Very Poor
DPM
C.3 FALSE C C D
C.4 FALSE D D E E. AI-BASED HI DEVELOPMENT
To establish the AI-based HI model, the collected
TABLE V
transformer data have been divided into 354 training and 150
THE SCORING-WEIGHTING PAPER CONDITION testing data sets. Each set comprises data of transformers of
Paper Score
different aging, operating and health conditions.
Condition Weight
Factor 1 2 3 4 5
Assessment S1 S1xW1
Cond. Weight
CO / CO2 A B C D E 0.171 (Scoring ∑
Data (W1)
Scoring- Function)
AGE 20- Weighting
<20 30-40 40-60 >60 0.234
(Years) 30 Method
100- 500- 1000- Health Index (HI)
2FAL (ppb) <100 >5000 0.214 HI = ∑Sf xWf
500 1000 5000
Training Prediction of Health

Data Index (HI)
AI HI = f (V1, V2,...Vn)
40%
Methods
C2H2 PD : Corona partial discharges
D1 : Low-energy discharges
D2 : High-energy discharges Artificial
T3-H : Themal faults in oil Intelligence (AI)
above 700˚C Training
NO Measuring
C : Themal faults above 300˚C of Error
and below 700˚C with Testing
carbonization of paper Testing Data
O : Overheating below 250˚C YES
S : Stray gassing
Validation Final The Accuracy
Health Index (HI) Acceptance
Figure 6. Duval’s pentagon DGA interpretation method Figure 7. A flowchart for the AI-based HI model’s development
average/most frequent value is used to handle the issue of

START unavailable data as shown in Figure 9.
Step 1: Data Data Step 2: Input: Input:

Data Training Data Testing
Input 1: Input 2: Parameters Pre- Parameters
Parameters Machine 1. OQF; 2. processed 1. OQF; 2. FF;
Factor 3. PCF
(1. OQF; 2. Learning: FF; 3.PCF
FF; 3. PCF) Categories
- kNN
- SVM
Machine
- Naive Bayes
Score-Weight Learning:
Data - Neural Random Forest
Network Assessment
- AdaBoost
- Tree
Output 1: AI-SW-based Output:
- Random Forest
Factors HI prediction Category Factors
categories Learner Prediction
prediction 1. OQF; 2. FF; 3. PCF
Pre-
AI-Full-based Data processed
HI prediction
Input:
Data Training Machine
Category Learning:
Evaluation Random Forest
Factors
FINISH
Output:
Figure 8. Flowchart of the proposed multi-AI approaches to calculate HI Health Index
Prediction
The output of the AI model is the overall HI which is
validated using the corresponding HI calculated by the
scoring-weighting method. If the AI model does not provide Figure 9. Dealing the uncertainty with AI-based HI
satisfactorily accuracy, additional tuning process is conducted Three methods are used to implement the proposed
through more training. On the other hand, if the accuracy has approach namely; scoring weighting (SW), random forest
met the criteria, the final HI is calculated as per the process (RF), and Random forest with preprocessing-based (RFwP).
shown in Figure 7. Various AI algorithms are employed to The certainty level (CL) is calculated as below [33]:
establish the proposed HI model; this includes kNN, SVM,
RF, NB, ANN, Adaptive Boosting (AdaBoost), and Decision 𝐶𝐿𝑖 = 𝑊𝑃𝑖 𝑥 𝑊𝑓𝑖 (3)
Tree (Tree). The proposed method to calculate the ∑(𝐴𝑣𝑎𝑖𝑙𝑎𝑏𝑙𝑒 𝐶𝐿𝑖)
transformer HI using multi-AI approaches is shown in Figure 𝐶𝐿 = 𝑥 100 (4)
𝑀𝑎𝑥.𝐶𝐿
8. The performance of these approaches is compared based
on: CLi is the certainty level for each parameter (i), WPi is the
AI-SW: which is based on factors category stage (OQF, FF, weighting parameter, and WFi is the weighting factor.
and PCF), then scoring-weighting-based in HI category stage.
AI-Ful: which is based on all classification stages (OQF, FF, G. PROPOSED RANDOM FOREST
PCF, and HI category) of the AI model. Random forest algorithm is a method consisting of a
collection of tree-structured classifiers {h(x,Θk ), k=1, ...}
F. DEALING WITH DATA UNCERTAINTY where {Θk} represent independent identically distributed
All assessments of transformers include a non-avoidable random vectors. Each tree casts a unit vote for choosing the
level of uncertainty. Unavailability of the data, erroneous, or most popular class at input x [41]. An ensemble of the
obsolete data will adversely affect the assessment results and classifiers h1(x), h2(x)...hK (x), is given, and the training data
increase the uncertainty of the obtained outcomes. The are drawn randomly from the distribution of the random
uncertainty within available data is due to different reasons vector X, Y and the margin function is defined as:
such as incorrect data entry, erroneous or questionable test mg(X,Y) = avk I(hk (X)=Y)− maxj ≠Y avk I(hk (X)= j ) (5)
results, uncertainty in the condition assessment, and obsolete The generalization error is given by:
data [9].
A solution to manage the unavailability of the data is by PE* = PX, Y (mg(X, Y) < 0) (6)
employing machine learning to estimate the missing In RF, hk (X) = h(X, Θk ). Almost all sequences Θ1… PE*
parameters based on historical data and well training process. converge to:
Following the characteristics of the data, several methods can PX, Y (PΘ (h(X, Θ)=Y)−max j≠Y PΘ(h(X,Θ)= j) < 0) (7)
be used to preprocess the missing values. In this study, the
This explains why random forests do not overfit as more

trees are added and produce a limiting value for the
generalization error [41].
H. EVALUATION
The evaluation of the model is based on the accuracy of the
prediction of the transformer HI category. The classification
accuracy (CA) is the proportion of correctly classified data
as given by (8).
(𝑇𝑃+𝑇𝑁)
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = (𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁) (8)
where TP is true positive, TN is true negative, FP is false
positive, and FN is false negative. This calculation is based
on the confusion matrix of binary classification in which the
evaluation is considered as a multi-class classification
problem.
III. RESULTS AND DISCUSSION

A. TRANSFORMER HEALTH INDEX
The HI is calculated for the investigated 504 units using the
scoring-weighting method. As shown in Figure 10, the most Figure 11. Histogram of data collected parameters
frequent category of HI transformer is very good condition 1.0

VBD
(VG) that represents 174 units or 35% of the population. The Water 0.8
other categories are good condition (G) for 104 units (21%), Acidity
caution condition (C) for 134 units (27%), poor condition (P) IFT 0.6
for 82 units (16%), and very poor condition (VP) for 10 units Color
(2%). This HI assessment is used to highlight faulty 0.4
H2
transformers and to construct strategies for maintenance
CH4
plans. 0.2
C2H6
B. DATA MANAGEMENT
C2H4 0.0
C2H2
VP 2FAL -0.2
Age
P
CO -0.4
C CO2
LvMax
G
Water
Color
Age
LvMax
HI_Sc
Acidity
2FAL
H2
CH4
C2H6
C2H4
C2H2
CO2
IFT
VBD
CO
VG
0 50 100 150 200 Figure 12. The correlation matrix of observed parameters
VG G C P VP BDV
Units 174 104 134 82 10
Water
Figure 10. HI categories by scoring-weighting method Acidity
IFT
The data used to develop the AI-based HI models are the
same data classified in the above section to ensure the Color
correctness and availability of all data. The distributions of H2
the used parameters are shown in Figure 11 and a matrix
CH4
correlating these parameters is shown in Figure 12 . As
shown in Figure 12, the IFT and color have the highest C2H6
correlation value of -0.69. Sequentially, the significant C2H4
correlation is found between C2H4 and CH4 (0.65), 2FAL and -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00
color (0.65), IFT and acidity (-0.63), color and acidity (0.55),
and age and color (0.53). These results reveal that several
Figure 13. Plot of the correlation between observed parameters
parameters have a dependency relationship and influences
each other.
Furthermore, the correlation of parameters with HI score, results of 98%. Figure 15 shows a comparison of the
as shown in Figure 13, can be described from the highest to performance of all developed models in terms of the HI
the lowest by applying a threshold limit above 0.2, which are calculation accuracy. Results attest that AI-based tree-
as follows: color (-0.778), IFT (+0.638), 2FAL (-0.568), age structured classifiers produces better HI calculation accuracy
(-0.51), acidity (-0.45), CO (-0.413), CO2 (-0.335), water than other algorithms.
content (-0.278), BDV (+0.268), and C2H2 (-0.218). The HI TABLE VIII
score has a limit value of 1, which means HI category has an THE CONFUSION MATRIX OF RANDOM FOREST AI-FULL-BASED
equal relationship with the HI score. Therefore HI scores and Random Predict
maximum level are derived from the value of the parameter Forest VG G C P VP ∑
(data layer), then in the subsequent discussion, HI maximum VG 54 0 0 0 0 54
score and LvMax will be ignored. The correlation of G 0 31 0 0 0 31
Actual
parameters helps to reduce the number of features without C 0 1 36 1 0 38
P 0 0 0 24 0 24
reducing the accuracy significantly as will be elaborated
VP 0 0 0 2 1 3
below. ∑ 54 32 36 27 1 150
C. EVALUATION OF DEVELOPED AI-BASED HI MODELS

To evaluate the performance of the developed AI-based HI
models, the data were divided into training and testing sets. kNN
Using the training data, random forest algorithm was found Naïve Bayes
to produce the highest classification accuracy (CA) of 91.2% SVM
in OQF and 92.1% in PCF. The tree decision excels at Neural Network
predicting FF categories with about 100% of CA. In the HI- AdaBoost
layer model, the AdaBoost algorithm produces the highest Tree
CA of 97.2%. When evaluated using the testing data, as
Random Forest
shown in Figure 14, random forest excels with the PCF
classification accuracy of 97%. The decision tree excels with 0.0% 20.0% 40.0% 60.0% 80.0% 100.0%
Random AdaBoo Neural Naïve
the OQF classification accuracy of 96%, and the FF is 100% Forest
Tree
st Network
SVM
Bayes
kNN
accurate. The AdaBoost excels at predicting HI categories HI AI Full 97.3% 96.0% 94.0% 91.3% 89.3% 70.7% 70.0%
with about 100% of CA. HI AI-SW 98.0% 97.3% 94.0% 92.7% 90.7% 85.3% 79.3%
OQF FF PCF HI
Class. Accuracy (x100 %)
0.99 0.98 Figure 15. Accuracy of the AI method to predict HI

1.00 0.99 0.99
0.77 0.88 D. EFFECTS OF UNAVAILABLE DATA
0.97 0.92 0.91 0.88 0.85 0.80 0.77
In this study, a comparison is done to evaluate the effect of
0.99 1.00 0.99
uncertainty as a result of unavailable data to the accuracy of
0.99 0.97 0.93 0.81 the HI calculation. Three main methods were compared: SW,
0.95 0.96 0.92 0.87 0.87 0.78 0.77 RF, and RFwP. The missing data scenarios in Table IX were
assumed to conduct this analysis.
Random Tree AdaBoost Neural SVM Naïve kNN
Forest Network Bayes
TABLE IX
Figure 14. Performance comparison of data trainning multi-AI-based MISSING VALUE SCENARIOS OF TRANSFORMER HEALTH INDEX
Para HI Scenarios
The results show that the best prediction models are the meters 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
tree-based classifiers such as random forest, decision tree and
AdaBoost. Water 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
VBD 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1
The performance of the RF classifier for the HI category is Acid. 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1
shown in the confusion matrix. The comparison between IFT 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1
actual and predicted values is used to evaluate the Color 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1
classification accuracy of AI-based HI. The classification CO/CO2 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1
accuracy for RF is 97.3%. Furthermore, the random forest 2FAL 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1
Age 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1
model only has four incorrect classifications, as shown in
H2 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1
Table VIII. The accuracy of other models are: decision tree CH4 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
(96%), AdaBoost (94.8%), neural network (91.3%), SVM C 2 H6 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1
(89.3%), Naïve Bayes (70.7%), and kNN (70%). In addition, C 2 H4 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1
an AI-SW-based model, which uses AI in the factor level, C 2 H2 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
Lv.max 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1
and weight-sum calculation in the HI level has been also
DPM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
developed. The model provides a classification accuracy Note: 1 = Available data; 0 = Unavailable data
Figure 16 shows the accuracy with the certainty level of After evaluating the effect of missing each parameter and
parameters. The results show that the RF is significantly factor on the HI accuracy, the next analysis is conducted to
influenced by a low certainty level due to particular missing investigate various scenario of unavailable data. The
parameters. With a low certainty level, the accuracy of RF correlation of the certainty level and the classification
also decreased. In contrast, RF with preprocessing (RFwP), accuracy of multi-parameters is constructed, as shown in
due to missing value using the average or most frequent Appendix 1. The aim is to measure the causal effect between
value, can maintain the classification accuracy at an certainty level and resulting HI accuracy for each of the three
acceptable level of at least 95%. developed model: SW-based HI, RF-based HI, and RFwP-
based HI. This analysis is helping to define efficient and
100%
precise ways to deal with the data uncertainty.
95%
1) SCORING-WEIGHTING (SW)-BASED HEALTH INDEX
90%
ACCURACY (%)
When evaluated using the testing data in Appendix 1, the

85% SW-based HI is affected by the uncertainty due to
80% unavailable data as shown in Figure 18. This is indicated by
75% the coefficient of determination (R2) of 0.917. From Figure
70% 18, The correlation of the accuracy (y) and certainty level (x)
65% can be approximated using (10).
60% HI0 HI1 HI2 HI3 HI4 HI5 HI6 HI7 HI8 HI9 HI10 HI11 HI12 HI13 HI14 HI15 y = 0.6554x+0.2573 (10)
Certainty 100 96% 94% 95% 96% 96% 88% 84% 85% 95% 97% 95% 95% 95% 95% 95%
RF 98% 98% 95% 93% 96% 82% 85% 75% 90% 99% 97% 97% 97% 98% 79% 95% The challenges of this method is that if there are many
RFwP 100 100 99% 99% 100 99% 97% 95% 100 100 100 100 100 100 99% 99%
missing values, the classification accuracy of the HI category
Figure 16. The accuracy with the certainty level of parameters will be decreased significantly. The range of the
classification accuracy in HI categories will have a value
Then, the effect of missing factor on the HI calculation above 80% if the certainty level is above 85%.
accuracy is examined according to the scenarios shown in The color categories in terms of the accuracy indicator are
Table X. The uncertainty due to missing factor significantly green for high (H) accuracy level, yellow for medium (M)
affects the HI classification accuracy as shown in Figure 17. accuracy level, and red for low (L) accuracy level.
In this case, the observed factor conditions were divided into
OQF, FF, and PCF. The most significant effect of missing 2) RANDOM FOREST-BASED
value is observed in the RF-based paper condition factor The AI-based RF algorithm also suffers when predicting
(PCF). The lowest accuracy value is 60.7% using RF based transformers HI with many unavailable data as confirmed by
on the missing of the paper condition factor. Using RFwP, Figure 19 in which R2 is 0.7614. The correlation of the
the classification accuracy can be increased to 82%. accuracy and certainty level for this model can be
Therefore, it can be concluded that the RFwP model can approximated as below.
maintain the classification accuracy of HI categories with y=0.7542x+0.1232 (11)
missing factor.
In some cases, RF model can predict the HI with an
TABLE X accuracy above 80% even though the certainty level is only
SCENARIOS OF THE FACTOR AVAILABILITY
Factor HI HI HI HI HI HI
70%. Nevertheless, the accuracy will be better if the certainty
F1 F2 F3 F4 F5 F6 level is above 80%.
Oil Quality Factor (OQF) 0 1 1 0 0 1 3) RANDOM FOREST WITH PREPROCESSING
Fault Factor (FF) 1 0 1 0 1 0
The RFwP is slightly affected by unavailable data. In this
Paper Condition Factor (PCF) 1 1 0 1 0 0
case R2 is 0.8079 and the accuracy-certainty level correlation
can be approximated from Figure 19 as in (12).
100.0%
y = 0.5407x + 0.4904 (12)
ACCURACY (%)
80.0%
60.0% The RFwP model has performed better than SW and RF,
40.0%
resulting in the highest accuracy value in classifying the HI
category even at a low certainty level. When the certainty
20.0%
level is, in the worst-case below 10%, the accuracy is still
0.0% above 50%. In some instances, the accuracy can reach a value
HI_F1 HI_F2 HI_F3 HI_F4 HI_F5 HI_F6
Certainty 78.4% 65.2% 56.4% 43.6% 34.8% 21.6% above 80% even with a certainty level of only 50%. The best
RF 84.7% 66.7% 60.7% 39.3% 30.0% 28.0% condition is when the certainty level is above 70% which
RFwP 86.0% 70.7% 82.0% 58.0% 57.3% 60.7% results in an accuracy reaches 90%.
Figure 17. The accuracy with the certainty level of factor
For instance for scenario HI-13 in the Appendix, furan and

100% IFT parameters are unavailable. Based on the data
90% H observation, furan and IFT have the highest frequency of data
80% unavailability. The assessment value of the certainty level for
70%
HI-13 scenario is 79%, and when evaluated based on 150
M transformers data, the classification accuracy using the
% Accuracy
60%
scoring-weighting method is 78%. It can be observed that the
50% RF model can achieve 80.7% accuracy while the RFwP
40% y = 0.6554x + 0.2573 classification accuracy is 96%.
R² = 0.9176
30% As shown in Figure 20, the RFwP model achieves medium
L to high accuracy levels in almost all instances. This reveals
20%
the ability of RFwP to solve the issue of data uncertainties in
10% estimating the transformer HI.
0%
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% IV. CONCLUSION
% Certainty Level
The development and implementation of artificial
Figure 18. Effect of certainty level due to unavailable data on the
intelligence-based health index model to assess the power
accuracy of SW-based HI transformer insulation system conditions has been presented.
The proposed method is aimed to simplify, speed up, and
reduce error due to data uncertainty. A comparison of various
100%
AI algorithms is carried out and evaluated using the
90% H classification accuracy in reference to the scoring-weighting-
80% based HI calculation that is widely used by worldwide
70% utilities. Based on this comparison, the random forest-based
M model is chosen as the proposed method with the highest
% Accuracy
60%
accuracy to predict the transformer HI. Results show that
50% random forest does not overfit as it produces a limiting value
40% of generalization error. As such, random forest with
30%
preprocessing has been proposed to handle the data
L uncertainty by considering the missing values with the
20% y = 0.7542x + 0.1232 average or the most frequent value. The impact of missing
R² = 0.7614
10% data on the model has been evaluated, and the proposed
0% RFwP performed better than other investigated methods and
0% 10% 20% 30% 40% 50% 60% 70% 80% 90%100% resulted in a satisfactory classification accuracy even with
% Certainty Level low certainity levels. This solves the current common issue
of calculating the transformer HI with unavailable data.
Figure 19. Effect of certainty level due to unavailable data on the Results in this paper pave the way to adopt AI-based methods
accuracy of RF-based HI in calculating the transformer HI. However, implementing
such techniques calls for more validation and feasibility
100.0% studies on large transformer population with several
scenarios of unavailable data.
90.0% H
80.0%
70.0% APPENDIX
M Scenarios of uncertainty multi-parameter
% Accuracy
60.0%
Parameters
50.0% Oil Quality Fault Factor
HI PCF
Level (%)
Factor (OQF) (FF) - DGA

Certainty
40.0% y = 0.5407x + 0.4904 Scena-

CO/CO2
R² = 0.8079
Water
2FAL
Color
VBD
C 2 H6
C 2 H4
C 2 H2
rios
Acid
CH4
30.0%
Age
IFT
H2
L
20.0%
10.0% HI-0 1 1 1 1 1 1 1 1 1 1 1 1 1 100
HI-1 0 1 1 1 1 1 1 1 1 1 1 1 1 96
0.0% HI-2 1 0 1 1 1 1 1 1 1 1 1 1 1 94
0% 10% 20% 30% 40% 50% 60% 70% 80% 90%100% HI-3 1 1 0 1 1 1 1 1 1 1 1 1 1 95
% Certainty Level HI-4 1 1 1 0 1 1 1 1 1 1 1 1 1 96
HI-5 1 1 1 1 0 1 1 1 1 1 1 1 1 96
Figure 20. Effect of certainty level due to unavailable data on the HI-6 1 1 1 1 1 0 1 1 1 1 1 1 1 88
accuracy of RFwP-based HI
HI-7 1 1 1 1 1 1 0 1 1 1 1 1 1 84 transformer condition assessment,” IEEE Trans. Dielectr. Electr.

HI-8 1 1 1 1 1 1 1 0 1 1 1 1 1 85 Insul., vol. 18, no. 5, pp. 1591–1598, 2011.
HI-9 1 1 1 1 1 1 1 1 0 0 0 0 0 65 [8] S. Forouhari and A. Abu-Siada, “Remnant life estimation of power
HI-10 0 1 1 1 1 1 0 1 1 1 1 1 1 80 transformer based on IFT and acidity number of transformer oil,”
Proc. IEEE Int. Conf. Prop. Appl. Dielectr. Mater., vol. 2015-Octob,
HI-11 1 0 1 1 1 1 0 1 1 1 1 1 1 78
pp. 552–555, 2015.
HI-12 1 1 0 1 1 1 0 1 1 1 1 1 1 79 [9] CIGRE 761: Condition assessment of power transformers, no. March.
HI-13 1 1 1 0 1 1 0 1 1 1 1 1 1 79 2019.
HI-14 1 1 1 1 0 1 0 1 1 1 1 1 1 80 [10] X. Zhang and E. Gockenbach, “Asset-management of transformers
HI-15 1 1 1 1 1 0 0 1 1 1 1 1 1 71 based on condition monitoring and standard diagnosis,” IEEE Electr.
HI-16 1 1 1 1 1 1 0 0 1 1 1 1 1 68 Insul. Mag., vol. 24, no. 4, pp. 26–40, 2008.
HI-17 1 1 1 1 1 1 0 1 0 0 0 0 0 49 [11] A. E. B. Abu-Elanien and M. M. A. Salama, “Asset management
HI-18 0 1 1 0 1 1 0 1 1 1 1 1 1 76 techniques for transformers,” Electr. Power Syst. Res., vol. 80, no. 4,
HI-19 1 0 1 0 1 1 0 1 1 1 1 1 1 74 pp. 456–464, 2010.
[12] A. Azmi, J. Jasni, N. Azis, and M. Z. A. A. Kadir, “Evolution of
HI-20 1 1 0 0 1 1 0 1 1 1 1 1 1 75
transformer health index in the form of mathematical equation,”
HI-21 1 1 1 0 0 1 0 1 1 1 1 1 1 76 Renewable and Sustainable Energy Reviews, vol. 76. pp. 687–700,
HI-22 1 1 1 0 1 0 0 1 1 1 1 1 1 67 2017.
HI-23 1 1 1 0 1 1 0 0 1 1 1 1 1 64 [13] A. Jahromi, R. Piercy, S. Cress, J. Service, and W. Fan, “An approach
HI-24 1 1 1 0 1 1 0 1 0 0 0 0 0 45 to power transformer asset management using health index,” IEEE
HI-25 0 1 1 0 0 1 0 1 1 1 1 1 1 72 Electr. Insul. Mag., vol. 25, no. 2, pp. 20–34, 2009.
HI-26 1 0 1 0 0 1 0 1 1 1 1 1 1 70 [14] N. Bakar, A. Abu-Siada, and S. Islam, “A review of dissolved gas
HI-27 1 1 0 0 0 1 0 1 1 1 1 1 1 71 analysis measurement and interpretation techniques,” IEEE Electr.
HI-28 1 1 1 0 0 0 0 1 1 1 1 1 1 64 Insul. Mag., vol. 30, no. 3, pp. 39–49, 2014.
[15] IEEE Std C57.106-2015, “IEEE Guide for Acceptance and
HI-29 1 1 1 0 0 1 0 0 1 1 1 1 1 61
Maintenance of Insulating Mineral Oil in Electrical Equipment,” vol.
HI-30 1 1 1 0 0 1 0 1 0 0 0 0 0 41
2015, no. June. 2015.
HI-31 0 1 0 0 0 1 0 1 1 1 1 1 1 67 [16] A. J. Kachler and I. Höhlein, “Aging of cellulose at transformer
HI-32 1 0 0 0 0 1 0 1 1 1 1 1 1 65 service temperatures. Part 1: Influence of type of oil and air on the
HI-33 1 1 0 0 0 0 0 1 1 1 1 1 1 59 degree of polymerization of pressboard, dissolved gases, and furanic
HI-34 1 1 0 0 0 1 0 0 1 1 1 1 1 56 compounds in oil,” IEEE Electr. Insul. Mag., vol. 21, no. 2, pp. 15–
HI-35 1 1 0 0 0 1 0 1 0 0 0 0 0 36 21, 2005.
HI-36 0 0 0 0 0 1 0 1 1 1 1 1 1 62 [17] A. Abu-Siada, S. P. Lai, and S. M. Islam, “A novel fuzzy-logic
HI-37 1 0 0 0 0 0 0 1 1 1 1 1 1 53 approach for furan estimation in transformer oil,” IEEE Trans. Power
HI-38 0 0 1 1 1 1 1 50 Deliv., vol. 27, no. 2, pp. 469–474, 2012.
1 0 0 0 0 1
[18] A. Naderian, S. Cress, R. Piercy, F. Wang, and J. Service, “An
HI-39 1 0 0 0 0 1 0 1 0 0 0 0 0 31
Approach to Determine the Health Index of Power Transformers,” in
HI-40 0 0 0 0 0 0 0 1 1 1 1 1 1 50 Conference Record of the 2008 IEEE International Symposium on
HI-41 0 0 0 0 0 1 0 0 1 1 1 1 1 47 Electrical Insulation, 2008, pp. 192–196.
HI-42 1 0 0 0 0 0 0 1 0 0 0 0 0 19 [19] I. G. N. S. Hernanda, A. C. Mulyana, D. A. Asfani, I. M. Y. Negara,
HI-43 1 0 0 0 0 0 0 0 0 0 0 0 0 4 and D. Fahmi, “Application of health index method for transformer
HI-44 0 1 0 0 0 0 0 0 0 0 0 0 0 6 condition assessment,” in TENCON 2014 - 2014 IEEE Region 10
HI-45 0 0 1 0 0 0 0 0 0 0 0 0 0 5 Conference, 2014, pp. 1–6.
HI-46 0 0 0 1 0 0 0 0 0 0 0 0 0 4 [20] W. R. Tamma, R. A. Prasojo, and Suwarno, “High voltage power
HI-47 0 0 0 0 1 0 0 0 0 0 0 0 0 4 transformer condition assessment considering the health index value
and its decreasing rate,” High Volt., no. 1–14, 2021.
HI-48 0 0 0 0 0 1 0 0 0 0 0 0 0 12
[21] W. R. Tamma, R. Azis Prasojo, and Suwarno, “Assessment of High
HI-49 0 0 0 0 0 0 1 0 0 0 0 0 0 16 Voltage Power Transformer Aging Condition Based on Health Index
HI-50 0 0 0 0 0 0 0 1 0 0 0 0 0 15 Value Considering Its Apparent and Actual Age,” in 12th
International Conference on Information Technology and Electrical
Engineering (ICITEE), 2020, pp. 292–296.
REFERENCES [22] L. En-Wen and S. Bin, “Transformer health status evaluation model
[1] A. Abu-Siada, Power Transformer Condition Monitoring and based on multi-feature factors,” in Power System Technology
Diagnosis: Concepts and Challenges. The IET, UK, 2018. (POWERCON), 2014 International Conference on, 2014, pp. 1417–
[2] N. A. Bakar, “Fuzzy Logic Approach for Transformer Remnant Life 1422.
Prediction and Asset Management Decision,” IEEE Trans. Dielectr. [23] K. Taengko and P. Damrongkulkamjorn, “Risk assessment for power
Electr. Insul., vol. 23, no. 5, pp. 3199–3208, 2016. transformers in PEA substations using health index,” 2013 10th Int.
[3] L. E. Lundgaard, W. Hansen, D. Linhjell, and T. J. Painter, “Aging of Conf. Electr. Eng. Comput. Telecommun. Inf. Technol. ECTI-CON
Oil-Impregnated Paper in Power Transformers,” IEEE Trans. Power 2013, pp. 1–6, 2013.
Deliv., vol. 19, no. 1, pp. 230–239, 2004. [24] E. J. Kadim, N. Azis, J. Jasni, S. A. Ahmad, and M. A. Talib,
[4] M. Wang, A. J. Vandermaar, and K. D. Srivastava, “Review of “Transformers health index assessment based on neural-fuzzy
condition assessment of power transformers in service,” IEEE Electr. network,” Energies, vol. 11, no. 4, 2018.
Insul. Mag., vol. 18, no. 6, pp. 12–25, 2002. [25] K. Ibrahim, R. M. Sharkawy, H. K. Temraz, and M. M. A. Salama,
[5] D. Martin, Y. Cui, C. Ekanayake, M. Hui, and T. Saha, “An updated “Transformer Health Index Sensitivity Analysis using Neuro-Fuzzy
model to determine the life remaining of transformer insulation,” Modelling,” in 2nd International Conference on Advanced
IEEE Trans. Power Deliv., vol. 30, no. 1, pp. 395–402, 2015. Technology and Applied Science (ICaTAS), 2017, pp. 1–7.
[6] M. Ali, C. Eley, A. M. Emsley, R. Heywood, and X. Xaio, [26] M. Yahaya, N. Azis, M. Ab Kadir, J. Jasni, M. Hairi, and M. Talib,
“Measuring and understanding the ageing of kraft insulating paper in “Estimation of Transformers Health Index Based on the Markov
power transformers,” IEEE Electr. Insul. Mag., vol. 12, no. 3, pp. 28– Chain,” Energies, vol. 10, no. 11, p. 1824, 2017.
34, 1996. [27] A. D. Ashkezari, H. Ma, T. Saha, and C. Ekanayake, “Application of
[7] M. Arshad and S. M. Islam, “Significance of cellulose power fuzzy support vector machine for determining the health index of the
insulation system of in-service power transformers,” IEEE Trans.
Dielectr. Electr. Insul., vol. 20, no. 3, pp. 965–973, 2013.

[28] M. M. Islam, G. Lee, and S. N. Hettiwatte, “Application of a general
regression neural network for health index calculation of power
transformers,” Int. J. Electr. Power Energy Syst., vol. 93, pp. 308–
315, 2017.
[29] S. Li, G. Wu, H. Dong, L. Yang, and X. Zhen, “Probabilistic health
index-based apparent age estimation for power transformers,” IEEE
Access, vol. 8, pp. 9692–9701, 2020.
[30] S. Tee, Q. Liu, and Z. Wang, “Insulation condition ranking of
transformers through principal component analysis and analytic
hierarchy process,” IET Gener. Transm. Distrib., vol. 11, no. 1, pp.
110–117, 2017.
[31] J. I. Aizpurua, B. G. Stewart, S. D. J. McArthur, B. Lambert, J. G.
Cross, and V. M. Catterson, “Improved power transformer condition
monitoring under uncertainty through soft computing and
probabilistic health index,” Appl. Soft Comput. J., vol. 85, no. 105530,
2019.
[32] A. Alqudsi and A. El-Hag, “Application of machine learning in
transformer health index prediction,” Energies, vol. 12, no. 14, pp. 1–
13, 2019.
[33] R. A. Prasojo, Suwarno, and A. Abu-Siada, “Dealing with Data
Uncertainty for Transformer Insulation System Health Index,” IEEE
Access, vol. 9, pp. 1–1, 2021.
[34] Z. Li et al., “Fault diagnosis of transformer windings based on
decision tree and fully connected neural network,” Energies, vol. 14,
no. 6, 2021.
[35] A. Abu-Siada, M. Arshad, and S. Islam, “Fuzzy logic approach to
identify transformer criticality using dissolved gas analysis,” in IEEE
PES General Meeting, PES 2010, 2010, pp. 1–5.
[36] CIGRE WG A2.37, TB642 - Transformer Reliability Survey, no.
December. 2015.
[37] IEEE Std C57.104-2019, “IEEE Guide for the Interpretation of Gases
Generated in Mineral Oil-Immersed Transformers.” 2019.
[38] IEC 60422:2013, “Mineral insulating oils in electrical equipment –
Supervision and maintenance guidance,” 2013.
[39] R. A. Prasojo, A. Setiawan, Suwarno, N. U. Maulidevi, and B. A.
Soedjarno, “Development of Analytic Hierarchy Process Technique
in Determining Weighting Factor for Power Transformer Health
Index,” in The 2nd International Conference on High Voltage and
Power Systems ICHVEPS, 2019.
[40] R. A. Prasojo, Suwarno, N. U. Maulidevi, and B. A. Soedjarno, “A
Multiple Expert Consensus Model for Transformer Assessment Index
Weighting Factor Determination,” in 8th International Conference on
Condition Monitoring and Diagnosis (CMD), 2020, pp. 234–237.
[41] L. Breiman, “Random Forests - Random Features, Tecnical Report
567, Statistic Department, University of California, Berkeley,
(https://www.stat.berkeley.edu/~breiman/random-forests.pdf,
08.10.2018’de erişildi),” pp. 1–29, 1999.
DHANU REDIANSYAH (Graduate Student SUWARNO (Senior Member, IEEE)

Member, IEEE) received the B.Sc. degree received the B.Sc. and M.Sc. degrees from the
from the Department of Mechanical Department of Electrical Engineering, Institut
Engineering, Institut Teknologi Indonesia, in Teknologi Bandung, Indonesia, in 1988 and
2009, and the M.Sc. degree from the School of 1991, respectively, and the Ph.D. degree from
Electrical Engineering and Informatics, Nagoya University, Japan, in 1996. He is
currently a Professor and the Emeritus Dean of
Institut Teknologi Bandung, Indonesia, in
the School of Electrical Engineering and
2021. Since 2011, he has been worked at PT.
Informatics, Institut Teknologi Bandung. Over
PLN (Persero), the Indonesian state of 200 international journal articles or
Electricity Company. He has experienced at a conference papers have been published. His
generation power plant, distribution, and research interests include high voltage
transmission. His research interests include High Voltage equipment insulating materials and technology and diagnostics of HV equipment. He
condition monitoring and diagnostic, renewable energy power plant, and was the General Chairman of several international conferences, such as
power quality. ICPADM 2006, ICEEI 2007, CMD 2012, ICHVEPS 2017, and ICHVEPS
2019. He also serves as the Head of the Electrical Power Engineering
Research Group and the Editor-in-Chief for the International Journal on
Electrical Engineering and Informatics.
RAHMAN AZIS PRASOJO (Graduate A. ABU-SIADA (Senior Member, IEEE)

Student Member, IEEE), received the B.Sc. received the B.Sc. and M.Sc. degrees from Ain
degree from the Department of Electrical Shams University, Egypt, in 1998, and the
Engineering, Politeknik Negeri Malang, Ph.D. degree from Curtin University,
Malang, Indonesia, in 2015, and the M.Sc. Australia, in 2004, all in electrical
degree from the School of Electrical engineering. He is currently the Head of the
Engineering and Informatics, Institut Electrical and Computer Engineering, Curtin
Teknologi Bandung, Bandung, Indonesia, in University. His research interests include
2017, where he is currently pursuing the Ph.D. power electronics, power system stability,
degree. Since 2019, he has been an Assistant condition monitoring, and power quality. He
Professor with the Department of Electrical is the Vice-Chair of the IEEE Computation
Engineering, Politeknik Negeri Malang. He Intelligence Society, WA chapter. In addition,
has published several conference papers and journal articles following high he is also the Editor-in-Chief of the International Journal of Electrical and
voltage in power transformer condition monitoring and diagnostics. Electronic Engineering and a regular reviewer for various IEEE
transactions.

Artificial Intelligence-Based Power Transformer He-1

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Artificial Intelligence-Based Power Transformer He-1

Uploaded by

Copyright:

Available Formats

This article has been accepted for publication in a future issue of this journal, but has not been

Artificial Intelligence-based Power Transformer

Corresponding author: Rahman Azis Prasojo (rahmanazisp@polinema.ac.id).

I. INTRODUCTION of power transformers have been assessed through various

II. METHODOLOGY content, interfacial tension (IFT), acidity, and color. In

Operating Age of Transformer

504 Power Transformer

Artificial Intelligence: various AI methods have been

Condition Data Index Transformer

TABLE II The health index category and description is classified as

L4 >320 >210 >250 >3 >150 <1.2 A 4

C.1 C.2 C.3 C.4 50-70 C Caution

C.1 A FALSE FALSE FALSE 30-50 P Poor

Training Prediction of Health

average/most frequent value is used to handle the issue of

Step 1: Data Data Step 2: Input: Input:

This explains why random forests do not overfit as more

III. RESULTS AND DISCUSSION

frequent category of HI transformer is very good condition 1.0

Figure 10. HI categories by scoring-weighting method Acidity

C. EVALUATION OF DEVELOPED AI-BASED HI MODELS

0.99 0.98 Figure 15. Accuracy of the AI method to predict HI

When evaluated using the testing data in Appendix 1, the

Figure 17. The accuracy with the certainty level of factor

For instance for scenario HI-13 in the Appendix, furan and

Factor (OQF) (FF) - DGA

40.0% y = 0.5407x + 0.4904 Scena-

HI-7 1 1 1 1 1 1 0 1 1 1 1 1 1 84 transformer condition assessment,” IEEE Trans. Dielectr. Electr.

Dielectr. Electr. Insul., vol. 20, no. 3, pp. 965–973, 2013.

DHANU REDIANSYAH (Graduate Student SUWARNO (Senior Member, IEEE)

RAHMAN AZIS PRASOJO (Graduate A. ABU-SIADA (Senior Member, IEEE)

You might also like