Professional Documents
Culture Documents
fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.Doi Number
ABSTRACT Power transformer is a critical and expensive asset in electric transmission and distribution
networks. It is essential to monitor the health condition of all power transformer fleet in such networks to
avoid unwanted outages. The health index (HI) is a quick and efficient way to assess the condition of power
transformers based on multi-criteria. While Power transformer HI method has been well presented in the
literature, not much attention was given to handle the uncertainty and reliability of this method due to
unavailability of used data. Therefore, this paper aims to tackle this issue through employing Artificial
Intelligence (AI)-based techniques to reveal the health condition of power transformers with high accuracy
and at the same time handling data uncertainty. The proposed HI approach assesses the power transformer
insulation system based on oil quality, dissolved gas analysis (DGA), and paper condition. In this regard,
collected data from 504, 150-kV transformers are used to establish the proposed AI-models. Seven AI
algorithms including k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Random Forest (RF),
Naïve Bayes (NB), Artificial Neural Network (ANN), Adaptive Boosting (AdaBoost), and Decision Tree are
investigated. A performance comparison of the proposed AI-based HI models is carried out using the scoring-
weighting-based HI method as the reference. Results show that RF model provides the best performance in
predicting power transformer HI with an accuracy of 97.3%.
INDEX TERMS Power transformer, health index, insulation system, condition monitoring, artificial
intelligence.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
chemical, and physical tests to be collectively used by the CA (GRNN) was introduced in [28] to calculate the HI based on
tools to determine the health condition of the transformer. four-class transformer conditions; very poor, poor, fair, and
The transformer health index (HI), as primarily explained good. In [29], AI-based HI using Bayesian network was
in [13], is a single factor that utilizes the information from proposed to quantify the parameters contribution through
operating observations, field inspections, and laboratory tests score-probability and population failure statistics. In [30],
to provide a reliable TAM decision. Majority of the data used principal component analysis (PCA) and analytical hierarchy
in the HI model are based on the insulation system testing. process (AHP) were proposed to determine the transformer
Oil insulation tests include dissolved gases analysis (DGA), health condition based on an expert empirical formula.
oil quality analysis (OQA), and furan analysis (FFA). Due to The use of AI methods to define transformer health index
the high electric and thermal stress within operating has been presented in the literatures. For instance, decision
transformers, oil and paper insulation decomposes and tree, random forest (RF), static vector machine (SVM),
releases some gases that dissolve in the oil and decrease its artificial neural network (ANN), and k-nearest neighbor
dielectric strength. These gases include hydrogen (H2), (kNN) methods were used in [27] to automate the assessment
methane (CH4), ethylene (C2H4), acetylene (C2H2), ethane process. A study in [31] compared several machine learning
(C2H6), carbon monoxide (CO), and carbon dioxide (CO2). algorithms such as ANN, SVM, and Gaussian bayesian
DGA test is conducted to quantify these gases from which networks (GBN), to evaluate the health condition of power
internal transformer electrical and thermal faults can be transformers through probabilistic HI framework. An
identified [14]. Considering the guideline in [15], the OQA approach is presented in [32] that compared the performance
is determined by analyzing the oil breakdown voltage of various AI methods such as naïve Bayes (NB),
(BDV), acidity, water content, interfacial tension (IFT), multinomial logistic regression (MLR), ANN, SVM, kNN,
dielectric dissipation factor (DDF), and color. FFA is one rule (OneR), decision tree (J48), and RF in identifying
conducted to measure furan compunds that are generated due transformer health index. However, regardless the various AI
to cellulose degradation and dissolve in the transformer oil methods proposed in the literatures to calculate the
[16]. Among the 5 furan compunds, furfuraldehyde (also transformer HI, more thorough studies are still required to
known as furfural/2FAL) dominates the measurements and is improve the accuracy of such methods, in particular with the
correlated to the degree of polymerization of the paper uncertainaty or unavailability of the used data.
insulation [17]. As discussed in [9], data uncertainty is the main
A scoring-weighting (also called weighted-sum) is the shortcoming of the HI method which affects its reliability and
most commonly used method to calculate the HI of power accuracy. Data unavailability is the main reason of
transformers [13], [18]–[23]. The approach starts by uncertainty. The study in [33] reported the effect of data
comparing each parameter to a scoring table, then weighting unavailability and proposed a remidial approach based on RF
each parameter based on its importance. The weights are to predict the missing data and improve the scoring-
usually identified by expert personnel. The individual scores weighting HI. However, the development and investigation
are combined into a single index that reveals the overall of AI-based HI model in handling data uncertainty has not
health condition of the transformer. yet discussed thoroughly with proven implementation
As the HI is a linear combination of different scores and feasibility.
weighted measurement data, it is a challenging task to deal From the above discussion, the gaps in this area of research
with the uncertainty of the used data throgh the current utility can be summarized as below:
practice to identify the HI. • despite the several AI-based methods used to identify the
The rapid development of computer science and data HI of power transformers, accuracy of such methods is still
processing has resulted in new HI approaches based on relatively low.
machine learning algorithms for big data analysis [9]. • there is no reliable and widely accepted technique to deal
Artificial Intelligence (AI)-based HI approach was presented with data uncertainity which affects the reliability of the
in [24] using DGA, furan, and oil test data to assess the health current HI practice.
condition of several in-service transformers. However, due
to the limited available data, the prediction accuracy of the As such, the main contribution of this paper is to:
developed neuro-fuzzy model has only been 56.3%. The • construct an AI-based approach to enrich the current
study in [25] conducted a transformer HI sensitivity analysis conventional HI scoring-weighting method for power
using self adaptive neuro-fuzzy inference system (ANFIS) transformers.
whose parameters were tuned using particle swarm optimizer • improve the accuracy of the HI calculation using multi-AI
(PSO). In [26], probabilistic Markov chain models were used models to process most common routine testing data for power
to predict the future condition of the transformer based on HI transformer insulation system.
calculation using a non-linear optimization technique. AI • develop an effective correlation for all used parameters in
with a fuzzy-based support vector machine (SVM) was calculating the transformer HI.
employed in transformer HI-based condition assessment • identify the best AI-based combination for dealing with data
using several factors, industry standards, and utility expert uncertainty and missing values.
judgment [27]. AI-based general regression neural network
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
Finish S1 S1xW1
Condition Assessment Weight
(Scoring (W1)
Data 1
Figure 1. Flowchart of the proposed methodology Function)
S2xW2 Health
Index
A. TRANSFORMER POPULATION Condition S2
(HI)
Assessment
The data used in this study were collected from PT. PLN Data 2
(Scoring
Weight HI =
(W2) ∑ ∑Sf xWf
PERSERO, an Indonesian state electricity company. Data are Function)
collected from 504 units, all with a specific high voltage of 150
kV with a low voltage of either 70 kV or 20 kV. Majority of Sn
these transformers use Kraft paper insulation and they are Condition Assessment
Weight
Data n (Scoring (Wn)
periodically inspected in a condition-based maintenance Function) SnxWn
scheme by checking their insulation properties through DGA
and dielectric characteristics (oil quality analysis). Dielectric
Figure 3. The principle of conventional HI scoring-weighting method
characteristics include oil breakdown voltage (BDV), water
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
the failure modes of high voltage power transformers. The where n is the number of parameters used in every factor. The
survey results show that the most common fault location for value of Si is based on the scoring for each parameter based
power transformer is the winding, with a dominant fault on standard scoring tables, and Wi is the weighting factor that
mode within the dielectric system. The main cause of failures describes the importance of every parameter.
includes aging and external short circuit. Therefore, this The assessment methods are based on Tables I through VI,
study focuses mainly on the integrity of the power along with Figure 6. The final HI value is calculated using (2).
∑𝑛
𝑗=1 𝑆𝐹𝑗 𝑊𝑗
transformer insulation system. The structure of the proposed 𝐻𝐼𝑓𝑖𝑛𝑎𝑙 = 𝑥 100% (2)
HI method is divided into three-layers as shown in Figure 5 ∑𝑛
𝑗=1 4𝑊𝑗
which is based on previously proposed method in [20]. The where SFj is the parameter’s scoring factor and Wj is the
HI is equipped with scoring tables to score each weighting factor.
measurement, as well as weighting factors based on the The weighting parameters and factors are obtained by using
reliability and criticality of each parameter. These tables and AHP. This technique is built based on the judgment of five
weighting factors can be continuously adapted based on new experts with in-depth experience in transformer condition
information, experience, and recent international guidelines. monitoring, fault diagnosis and asset management [39], [40].
The layers in Figure 5 include:
TABLE I
Data layer: this layer consists of frequently measured THE SCORING-WEIGHTING FOR OIL QUALITY
diagnostic data such as key gases of DGA, BDV, water
Score
content, IFT, acidity, oil color, 2FAL, and transformer age. Oil Quality Weight
Factor layer: This layer has three categories that are derived 1 2 3 4
from data layer/ measurement: BDV (kV) >50 50-45 45-40 <40 0.169
1. Oil Quality Factor (OQF) that consists of oil parameters;
BDV, water content, IFT, acidity, and color. Water Content
<20 20-25 25-30 >30 0.108
2. Faults factor (FF), which consists of gas evolution, gas (ppm)
concentration level, and Duval’s pentagon DGA Acidity 0.1- 0.15-
interpretation method (DPM), which is shown in Figure 6. <0.1 >0.2 0.139
(MgKOH/mg) 0.15 0.2
3. Paper condition Factor (PCF): This factor consists of the
IFT
operating age, CO/CO2 ratio, and 2FAL concentration. (Dyne/cm)
>35 35-25 25-20 <20 0.124
Health Index layer: in this layer the HI calculation is
conducted to provide a single value that reveals the overall Color Scale <1.5 1.5-2 2-2.5 >2.5 0.114
health condition of the investigated transformer.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
Figure 6. Duval’s pentagon DGA interpretation method Figure 7. A flowchart for the AI-based HI model’s development
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
FINISH
Output:
Figure 8. Flowchart of the proposed multi-AI approaches to calculate HI Health Index
Prediction
The output of the AI model is the overall HI which is
validated using the corresponding HI calculated by the
scoring-weighting method. If the AI model does not provide Figure 9. Dealing the uncertainty with AI-based HI
satisfactorily accuracy, additional tuning process is conducted Three methods are used to implement the proposed
through more training. On the other hand, if the accuracy has approach namely; scoring weighting (SW), random forest
met the criteria, the final HI is calculated as per the process (RF), and Random forest with preprocessing-based (RFwP).
shown in Figure 7. Various AI algorithms are employed to The certainty level (CL) is calculated as below [33]:
establish the proposed HI model; this includes kNN, SVM,
RF, NB, ANN, Adaptive Boosting (AdaBoost), and Decision 𝐶𝐿𝑖 = 𝑊𝑃𝑖 𝑥 𝑊𝑓𝑖 (3)
Tree (Tree). The proposed method to calculate the ∑(𝐴𝑣𝑎𝑖𝑙𝑎𝑏𝑙𝑒 𝐶𝐿𝑖)
transformer HI using multi-AI approaches is shown in Figure 𝐶𝐿 = 𝑥 100 (4)
𝑀𝑎𝑥.𝐶𝐿
8. The performance of these approaches is compared based
on: CLi is the certainty level for each parameter (i), WPi is the
AI-SW: which is based on factors category stage (OQF, FF, weighting parameter, and WFi is the weighting factor.
and PCF), then scoring-weighting-based in HI category stage.
AI-Ful: which is based on all classification stages (OQF, FF, G. PROPOSED RANDOM FOREST
PCF, and HI category) of the AI model. Random forest algorithm is a method consisting of a
collection of tree-structured classifiers {h(x,Θk ), k=1, ...}
F. DEALING WITH DATA UNCERTAINTY where {Θk} represent independent identically distributed
All assessments of transformers include a non-avoidable random vectors. Each tree casts a unit vote for choosing the
level of uncertainty. Unavailability of the data, erroneous, or most popular class at input x [41]. An ensemble of the
obsolete data will adversely affect the assessment results and classifiers h1(x), h2(x)...hK (x), is given, and the training data
increase the uncertainty of the obtained outcomes. The are drawn randomly from the distribution of the random
uncertainty within available data is due to different reasons vector X, Y and the margin function is defined as:
such as incorrect data entry, erroneous or questionable test mg(X,Y) = avk I(hk (X)=Y)− maxj ≠Y avk I(hk (X)= j ) (5)
results, uncertainty in the condition assessment, and obsolete The generalization error is given by:
data [9].
A solution to manage the unavailability of the data is by PE* = PX, Y (mg(X, Y) < 0) (6)
employing machine learning to estimate the missing In RF, hk (X) = h(X, Θk ). Almost all sequences Θ1… PE*
parameters based on historical data and well training process. converge to:
Following the characteristics of the data, several methods can PX, Y (PΘ (h(X, Θ)=Y)−max j≠Y PΘ(h(X,Θ)= j) < 0) (7)
be used to preprocess the missing values. In this study, the
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
H. EVALUATION
The evaluation of the model is based on the accuracy of the
prediction of the transformer HI category. The classification
accuracy (CA) is the proportion of correctly classified data
as given by (8).
(𝑇𝑃+𝑇𝑁)
𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = (𝑇𝑃+𝑇𝑁+𝐹𝑃+𝐹𝑁) (8)
where TP is true positive, TN is true negative, FP is false
positive, and FN is false negative. This calculation is based
on the confusion matrix of binary classification in which the
evaluation is considered as a multi-class classification
problem.
Color
Age
LvMax
HI_Sc
Acidity
2FAL
H2
CH4
C2H6
C2H4
C2H2
CO2
IFT
VBD
CO
VG
0 50 100 150 200 Figure 12. The correlation matrix of observed parameters
VG G C P VP BDV
Units 174 104 134 82 10
Water
IFT
The data used to develop the AI-based HI models are the
same data classified in the above section to ensure the Color
correctness and availability of all data. The distributions of H2
the used parameters are shown in Figure 11 and a matrix
CH4
correlating these parameters is shown in Figure 12 . As
shown in Figure 12, the IFT and color have the highest C2H6
correlation value of -0.69. Sequentially, the significant C2H4
correlation is found between C2H4 and CH4 (0.65), 2FAL and -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00
color (0.65), IFT and acidity (-0.63), color and acidity (0.55),
and age and color (0.53). These results reveal that several
Figure 13. Plot of the correlation between observed parameters
parameters have a dependency relationship and influences
each other.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
Furthermore, the correlation of parameters with HI score, results of 98%. Figure 15 shows a comparison of the
as shown in Figure 13, can be described from the highest to performance of all developed models in terms of the HI
the lowest by applying a threshold limit above 0.2, which are calculation accuracy. Results attest that AI-based tree-
as follows: color (-0.778), IFT (+0.638), 2FAL (-0.568), age structured classifiers produces better HI calculation accuracy
(-0.51), acidity (-0.45), CO (-0.413), CO2 (-0.335), water than other algorithms.
content (-0.278), BDV (+0.268), and C2H2 (-0.218). The HI TABLE VIII
score has a limit value of 1, which means HI category has an THE CONFUSION MATRIX OF RANDOM FOREST AI-FULL-BASED
equal relationship with the HI score. Therefore HI scores and Random Predict
maximum level are derived from the value of the parameter Forest VG G C P VP ∑
(data layer), then in the subsequent discussion, HI maximum VG 54 0 0 0 0 54
score and LvMax will be ignored. The correlation of G 0 31 0 0 0 31
Actual
parameters helps to reduce the number of features without C 0 1 36 1 0 38
P 0 0 0 24 0 24
reducing the accuracy significantly as will be elaborated
VP 0 0 0 2 1 3
below. ∑ 54 32 36 27 1 150
TABLE IX
Figure 14. Performance comparison of data trainning multi-AI-based MISSING VALUE SCENARIOS OF TRANSFORMER HEALTH INDEX
Para HI Scenarios
The results show that the best prediction models are the meters 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
tree-based classifiers such as random forest, decision tree and
AdaBoost. Water 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
VBD 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1
The performance of the RF classifier for the HI category is Acid. 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1
shown in the confusion matrix. The comparison between IFT 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1
actual and predicted values is used to evaluate the Color 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1
classification accuracy of AI-based HI. The classification CO/CO2 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1
accuracy for RF is 97.3%. Furthermore, the random forest 2FAL 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1
Age 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1
model only has four incorrect classifications, as shown in
H2 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1
Table VIII. The accuracy of other models are: decision tree CH4 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
(96%), AdaBoost (94.8%), neural network (91.3%), SVM C 2 H6 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1
(89.3%), Naïve Bayes (70.7%), and kNN (70%). In addition, C 2 H4 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1
an AI-SW-based model, which uses AI in the factor level, C 2 H2 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
Lv.max 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1
and weight-sum calculation in the HI level has been also
DPM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
developed. The model provides a classification accuracy Note: 1 = Available data; 0 = Unavailable data
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
Figure 16 shows the accuracy with the certainty level of After evaluating the effect of missing each parameter and
parameters. The results show that the RF is significantly factor on the HI accuracy, the next analysis is conducted to
influenced by a low certainty level due to particular missing investigate various scenario of unavailable data. The
parameters. With a low certainty level, the accuracy of RF correlation of the certainty level and the classification
also decreased. In contrast, RF with preprocessing (RFwP), accuracy of multi-parameters is constructed, as shown in
due to missing value using the average or most frequent Appendix 1. The aim is to measure the causal effect between
value, can maintain the classification accuracy at an certainty level and resulting HI accuracy for each of the three
acceptable level of at least 95%. developed model: SW-based HI, RF-based HI, and RFwP-
based HI. This analysis is helping to define efficient and
100%
precise ways to deal with the data uncertainty.
95%
1) SCORING-WEIGHTING (SW)-BASED HEALTH INDEX
90%
ACCURACY (%)
80.0%
60.0% The RFwP model has performed better than SW and RF,
40.0%
resulting in the highest accuracy value in classifying the HI
category even at a low certainty level. When the certainty
20.0%
level is, in the worst-case below 10%, the accuracy is still
0.0% above 50%. In some instances, the accuracy can reach a value
HI_F1 HI_F2 HI_F3 HI_F4 HI_F5 HI_F6
Certainty 78.4% 65.2% 56.4% 43.6% 34.8% 21.6% above 80% even with a certainty level of only 50%. The best
RF 84.7% 66.7% 60.7% 39.3% 30.0% 28.0% condition is when the certainty level is above 70% which
RFwP 86.0% 70.7% 82.0% 58.0% 57.3% 60.7% results in an accuracy reaches 90%.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
60%
scoring-weighting method is 78%. It can be observed that the
50% RF model can achieve 80.7% accuracy while the RFwP
40% y = 0.6554x + 0.2573 classification accuracy is 96%.
R² = 0.9176
30% As shown in Figure 20, the RFwP model achieves medium
L to high accuracy levels in almost all instances. This reveals
20%
the ability of RFwP to solve the issue of data uncertainties in
10% estimating the transformer HI.
0%
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% IV. CONCLUSION
% Certainty Level
The development and implementation of artificial
Figure 18. Effect of certainty level due to unavailable data on the
intelligence-based health index model to assess the power
accuracy of SW-based HI transformer insulation system conditions has been presented.
The proposed method is aimed to simplify, speed up, and
reduce error due to data uncertainty. A comparison of various
100%
AI algorithms is carried out and evaluated using the
90% H classification accuracy in reference to the scoring-weighting-
80% based HI calculation that is widely used by worldwide
70% utilities. Based on this comparison, the random forest-based
M model is chosen as the proposed method with the highest
% Accuracy
60%
accuracy to predict the transformer HI. Results show that
50% random forest does not overfit as it produces a limiting value
40% of generalization error. As such, random forest with
30%
preprocessing has been proposed to handle the data
L uncertainty by considering the missing values with the
20% y = 0.7542x + 0.1232 average or the most frequent value. The impact of missing
R² = 0.7614
10% data on the model has been evaluated, and the proposed
0% RFwP performed better than other investigated methods and
0% 10% 20% 30% 40% 50% 60% 70% 80% 90%100% resulted in a satisfactory classification accuracy even with
% Certainty Level low certainity levels. This solves the current common issue
of calculating the transformer HI with unavailable data.
Figure 19. Effect of certainty level due to unavailable data on the Results in this paper pave the way to adopt AI-based methods
accuracy of RF-based HI in calculating the transformer HI. However, implementing
such techniques calls for more validation and feasibility
100.0% studies on large transformer population with several
scenarios of unavailable data.
90.0% H
80.0%
70.0% APPENDIX
M Scenarios of uncertainty multi-parameter
% Accuracy
60.0%
Parameters
50.0% Oil Quality Fault Factor
HI PCF
Level (%)
R² = 0.8079
Water
2FAL
Color
VBD
C 2 H6
C 2 H4
C 2 H2
rios
Acid
CH4
30.0%
Age
IFT
H2
L
20.0%
10.0% HI-0 1 1 1 1 1 1 1 1 1 1 1 1 1 100
HI-1 0 1 1 1 1 1 1 1 1 1 1 1 1 96
0.0% HI-2 1 0 1 1 1 1 1 1 1 1 1 1 1 94
0% 10% 20% 30% 40% 50% 60% 70% 80% 90%100% HI-3 1 1 0 1 1 1 1 1 1 1 1 1 1 95
% Certainty Level HI-4 1 1 1 0 1 1 1 1 1 1 1 1 1 96
HI-5 1 1 1 1 0 1 1 1 1 1 1 1 1 96
Figure 20. Effect of certainty level due to unavailable data on the HI-6 1 1 1 1 1 0 1 1 1 1 1 1 1 88
accuracy of RFwP-based HI
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3125379, IEEE Access
D. Rediansyah et al.: Artificial Intelligence-based Power Transformer Health Index for Handling Data Uncertainty
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/