
Failure Mode and Effect Analysis Applied to Power Transformers
Suelen C. Freitag, UFSM, Brazil, suelenfreitag@gmail.com
Mauricio Sperandio, UFSM, Brazil, mauricio.sperandio@ufsm.br
Tiago B. Marchesan, UFSM, Brazil, tiago@ufsm.br
Rodinei Carraro, CEEE, Brazil, RodineiC@ceee.com.br

978-1-5386-2344-2/17/$31.00 ©2017 IEEE

Abstract: This work presents the Failure Modes and Effects Analysis (FMEA) technique applied to a utility's power transformers to obtain a risk value that enables the optimization of the maintenance schedule. The application of the study will lead to more assertive investments, providing technical and scientific support and agility in decision making.

Index Terms: Reliability, Failure Modes and Effects Analysis (FMEA), Risk Priority Number (RPN), Power Transformer.

I. INTRODUCTION

The electric power transmission network plays a fundamental role in the electric system, being the link between generation and load, and is planned and operated for the highest possible reliability. However, these networks are still subject to interruptions, defects and malfunctions of the various devices present in the system.

Power transformers are among the main equipment within the substations of the transmission system, and their loss of function generates major operational disturbances, mainly overloads in other transformers. When a fault or fault trip occurs in this equipment, a large amount of energy is interrupted. In addition, their replacement or maintenance is difficult and can take several weeks, and leaks can cause serious environmental impacts; failures therefore carry high financial costs.

According to [1], power transformers are reliable equipment with a low failure rate, but because they are very expensive and difficult to transport, it is always necessary to improve the maintenance of their components in order to increase their life and decrease the risk.

It is known that a large number of the power transformers currently in operation in the world are approaching the end of their useful life. In Brazil the situation is no different, with transformer fleets averaging 30 years in operation. The operating state of a particular transformer (close to the end of life) increases the probability of failure of the equipment. A transformer can be overloaded because of the outage of other equipment, so a contingency analysis is necessary to evaluate these situations. In addition, information on the configuration of the substation in which the transformer under review operates is essential, since the configuration can reduce the degree of overload or even eliminate it.

In a competitive electricity market, high maintenance costs have led to the need for methods that increase the efficiency of the maintenance process and extend equipment life. Reliability Centred Maintenance (RCM) has been continuously improved to meet these needs.

According to [2], RCM is the assignment of the optimal maintenance strategy to each component in a system, taking into account component importance or priority as well as failure modes and their effects on system reliability. RCM studies for power transformers have been of great interest, as in [3]–[9].

The growing number of transformers that are aging and operating close to their capacity, or even above it for short periods, highlights the need for better strategies to manage these assets [10].

Hence, this work presents a transformer management study, based on historical maintenance data, that detects potential failures through RCM in order to support the choice of appropriate preventive or corrective actions.

II. RELIABILITY CENTRED MAINTENANCE

Transmission agents are pursuing strategies to increase the reliability of their systems. One of the areas of greatest interest and contribution to this increase is the maintenance department, because good maintenance performance results in cost reduction. Thus, maintenance plays a strategic role for the companies.

One of the practices adopted by world-class companies as a way of guaranteeing their competitiveness and consequent continuity in the market is the RCM methodology, which is the application of a structured method to establish the best maintenance strategy for a given system or equipment.

The RCM process and its support tools require a clear understanding of a series of definitions associated with the failures and performance of physical items. Therefore, some fundamental definitions for the development of RCM are presented, according to [11].
• Function: what the user wants the item or system to do within a specified performance standard;
• Failure: the interruption of, or a change in, the ability of an item to perform a required or expected function;
• Failure Mode: any event that causes a functional failure, leading to a partial or total reduction of the equipment function and of its associated performance goals;
• Cause of Failure: represents the events that generate the
failure mode appearance and can be detailed at different levels for different situations;
• Failure Effects: what happens when a failure mode occurs;
• Severity: a criterion that quantifies the consequence of the failure and its impact on the system;
• Occurrence: the frequency with which a cause is likely to occur;
• Detection: the probability of detecting the cause before the failure occurs.

A. Failure Mode and Effects Analysis

One of the techniques of RCM is Failure Mode and Effects Analysis (FMEA), which was developed to be applied mainly to components. Its main objective is to examine each component of a system in order to identify all the ways in which the component may fail and to assess what effects these failures will have on other components and on the system.

To conduct an FMEA efficiently, it is recommended to follow the systematic steps of [12]: define the purpose or expectations of the evaluation, make a process flow chart, prioritize, collect data, analyze, present results, and confirm and evaluate the results.

The general procedure to perform the FMEA is to collect the functional information of the equipment and of the process under analysis and, from this information, to estimate the severity of the effects of the failures, the probability of occurrence of their causes and the chance of detecting them. The Risk Priority Number (RPN) is then calculated in view of the planned validation, verification and prevention activities [12], [13].

B. Risk Matrix

The risk matrix combines the frequency levels of the failure modes with the levels of severity and acceptability of risks. The effect of a failure mode can be measured by a risk assessment. In general, the risk of a failure mode can be defined as:

\[ \mathrm{RPN} = S \cdot O \cdot D \tag{1} \]

where:
S – Severity;
O – Occurrence;
D – Detectability.

The risk matrix uses the levels of detectability, occurrence (or frequency) and severity of risk, classified according to Table I, Table II and Table III, respectively, adapted from [11], [14], [15].

TABLE I
LEVELS OF RISK DETECTABILITY

Level  Detectability   Description                                   Ranking
I      Easy            Failure detectable by operating procedure     0-1
II     Reasonable      Failure detectable by operational inspection  2-3-4
III    Difficult       Failure detectable by functional test         5-6-7
IV     Very Difficult  Failure detectable only by shutdown           8-9
V      Impossible      Failure totally hidden                        10

TABLE II
LEVELS OF RISK OCCURRENCE

Level  Occurrence      Description                                   Ranking
I      Unbelievable    Failure will practically not occur            0-1
II     Improbable      Failure will occur exceptionally              2-3-4
III    Remote          Failure reasonably expected                   5-6-7
IV     Probable        Failure will occur frequently                 8-9
V      Frequent        Failure will occur continuously               10

TABLE III
LEVELS OF RISK SEVERITY

Level  Severity        Description                                   Ranking
I      Insignificant   Failure practically causes no damage          0-1
II     Minimum         Failure causes some minor damage              2-3-4
III    Moderate        Significant damages                           5-6-7
IV     Critical        High failure rate causing system loss         8-9
V      Catastrophic    Extreme damage that affects the system        10

Once the levels of occurrence, severity and acceptability of risks have been defined, the Risk Matrix shown in Fig. 1 can be constructed.

Fig. 1. Risk Matrix. (Vertical axis: Occurrence, levels I to V; horizontal axis: Severity, levels I to V; regions: High Risk Range, Critical Risk Range, Tolerable Risk Range, Low Risk Range.)

The Risk Matrix makes it possible to visualize the degree of risk of each piece of equipment and thus to take actions that reduce or eliminate the risk associated with the potential cause of the failure. Issues can then be prioritized on the basis of this risk, with high-risk equipment addressed first.

III. METHODOLOGY

This section presents the method developed to calculate the risk of each transformer in the system and then rank the equipment, in order to direct maintenance and justify policy changes that improve system reliability.

Fig. 2 presents the flowchart of the method for determining the risk that each piece of equipment poses to the system. This includes the use of
FMEA, which aims to define and identify potential system failures; an evaluation of the electrical system to analyze the impact of the loss of the equipment on the system (contingency analysis); and an overload analysis to verify whether the equipment is operating above its nominal capacity.

Fig. 2. Proposed methodology. (Flowchart boxes, in the order they appear: Collect functional information from equipment; Determine potential failure modes; Overload analysis and Contingency analysis; Determine the causes of each failure; Determine failure rates; Contingency analysis; Check the effects of each failure; Find occurrence ranking, Find severity ranking, Find detection ranking; Calculate RPN; Identification of critical equipment.)

All these analyses contribute to the determination of a new failure rate model, since the transformer failure models found in the literature do not contemplate the overload and maintenance states; that is, only the failure and operation states are considered. Thus, as shown in Fig. 3, this new failure model is used for the evaluation of transformers.

Fig. 3. Reliability model of a transformer. (State diagram with the states Normal Operation, Failure, Overload, Off and Maintenance; transition labels: λF, μ, λS, μD, μSN, λD, λM, μM, λSD.)

According to [16], the value of the failure rate λ(t) at a given instant t is associated with the number of failures that occurred in the sample in the observed period, as in (2).

\[ \lambda(t) = \frac{\text{number of unit failures}}{\text{time the unit is running}} \tag{2} \]

The rates consist of the probability of occurrence of the event in a given time interval. Thus, the rate of a component corresponds to the sum of the probability indices.

In this way, the analysis is usually performed with state-space models, which can be solved using the Markov diagram. This technique represents dependent events and allows the calculation of the time evolution of the states of a system, provided that the transition probabilities between these states remain constant.

The state of a component can be understood as the set of possible values that its parameters can assume. These parameters are called state variables and describe the condition of the component. The state space is the set of all states that a component can present.

To better understand this technique, consider the example shown in Fig. 4, in which the equipment has two states, ON and OFF [16].

From the number of states and the transition rates between them, we can calculate the probability of the system being in each state. In the same way, knowing the current state, we can determine the chance of reaching a certain state or estimate the time (number of transitions) needed to reach it. The sum of all the transition rates $P_{kj}$ out of a state, including the rate of remaining in it, must be equal to one.

\[ \sum_{j=1}^{n} P_{kj} = 1 \tag{3} \]

where:
k – current state;
j – next state;
n – number of states;
$P_{kj}$ – transition rate from k to j.

The transition matrix (P) relates the distribution of transition probabilities between the states of the Markov chain; the sum of each row is always equal to one. Taking the state diagram shown in Fig. 4 as an example, the transition matrix P is presented as (4).

Fig. 4. Two-state Transition Diagram. (States ON and OFF; λ is the transition from ON to OFF and μ the transition from OFF to ON.)
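As a rough numerical illustration of (2) and (3), the sketch below estimates a failure rate from counts, builds the transition matrix of the two-state diagram of Fig. 4 (the matrix the text gives in (4)), checks that each row sums to one, and propagates an initial state through a number of transitions. The function names and the numbers used (3 failures in 15 unit-years, a one-year step, a return probability of 0.5 per step) are hypothetical and are not taken from the study data.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def failure_rate(n_failures: int, running_time: float) -> float:
    """Eq. (2): number of failures divided by the time the unit was running."""
    if running_time <= 0:
        raise ValueError("running_time must be positive")
    return n_failures / running_time


def two_state_matrix(lam: float, mu: float) -> Matrix:
    """Transition matrix for Fig. 4; rows are the current state (ON, OFF)."""
    return [[1.0 - lam, lam],
            [mu, 1.0 - mu]]


def rows_sum_to_one(P: Matrix, tol: float = 1e-9) -> bool:
    """Eq. (3): every row holds probabilities that add up to one."""
    return all(
        abs(sum(row) - 1.0) <= tol and all(0.0 <= x <= 1.0 for x in row)
        for row in P
    )


def propagate(p0: Sequence[float], P: Matrix, steps: int) -> List[float]:
    """Apply p <- p . P repeatedly (row-vector convention)."""
    p = list(p0)
    for _ in range(steps):
        p = [sum(p[k] * P[k][j] for k in range(len(p)))
             for j in range(len(P[0]))]
    return p


# Hypothetical numbers, not taken from the study data.
lam = failure_rate(3, 15.0)  # assumed: 3 failures in 15 unit-years, one-year step
mu = 0.5                     # assumed per-step probability of returning to ON
P = two_state_matrix(lam, mu)

print(rows_sum_to_one(P))
print(propagate([1.0, 0.0], P, 50))  # start in ON, apply 50 transitions
```

With these assumed values the propagated vector settles close to μ/(λ+μ) for ON and λ/(λ+μ) for OFF, which is the closed-form result the text reaches in (10) and (11).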
\[ P = \begin{bmatrix} 1-\lambda & \lambda \\ \mu & 1-\mu \end{bmatrix} \tag{4} \]

The probability vector is then calculated as:

\[ p_i = p_0 \cdot P^{i} \tag{5} \]

where:
$p_i$ – probability vector of state "i";
$p_0$ – probability vector of the initial state;
P – transition matrix;
i – number of transitions.

Therefore, in the long run, we have the following relation:

\[ p_i = p_{i-1} \cdot P \tag{6} \]

This results in an indeterminate linear system, because the equations are linearly dependent. To solve it, a new equation must be added:

\[ \sum_{k=1}^{n} p_k = 1 \tag{7} \]

Solving the Markov model:

\[ \begin{bmatrix} ON & OFF \end{bmatrix} = \begin{bmatrix} ON & OFF \end{bmatrix} \cdot \begin{bmatrix} 1-\lambda & \lambda \\ \mu & 1-\mu \end{bmatrix} \tag{8} \]

hence,

\[ \begin{aligned} -\lambda \cdot ON + \mu \cdot OFF &= 0 \\ \lambda \cdot ON - \mu \cdot OFF &= 0 \end{aligned} \tag{9} \]

Replacing one of the equations with ON + OFF = 1 and solving the system, we obtain the probabilities of being in each state.

\[ P_{ON} = \frac{\mu}{\lambda + \mu} \tag{10} \]

\[ P_{OFF} = \frac{\lambda}{\lambda + \mu} \tag{11} \]

To put the proposed method in context, a sample of 50 transformers from the transmission grid of the utility CEEE-GT (230 kV bulk network), distributed over 13 substations, was studied.

IV. RESULTS

In order to analyze the use of the FMEA technique and of the risk calculation in transformers for decision making and prioritization of maintenance actions, the operational and maintenance history records of the bulk-network power transformers (230 kV) of CEEE-GT over a period of 15 years (2000 to 2015) were used.

Fig. 5 shows a typical distribution of the failed component for transmission transformers, based on the fault data in the reports. It is noticeable that the failures originating from the protection of the transformer are the largest contributors, followed by the bushings and the tap changer.

Fig. 5. Distribution of faults in 230 kV power transformer components. (Pie chart; components: Bushings, Tap Changer, Insulation, Cooling System, Windings, Core, Protection, Tank, Other.)

The fault risk factors are classified as presented in Section III for Severity, Occurrence and Detection for power transformers.

To determine the severity index, we analyzed the effects caused by the failures recorded in the maintenance reports together with the impact of the loss of the equipment on the system, that is, a contingency analysis in which changes in the topology of the network are observed, such as a transformer outage that can cause violations of the network constraints and also the unavailability of the bus. This study used the Brazilian software ANAREDE, developed by the Centro de Pesquisa de Energia Elétrica (CEPEL), which is recommended by the ISO.

To determine the occurrence index, the causes of failures recorded in the maintenance reports were complemented with a contingency analysis, an overload analysis and the definition of fault rates for the application of the Markov model.

Transformers have a maximum operating temperature, and when the equipment is exposed to overloads, of any magnitude, this limit is exceeded, causing gradual or abrupt degradation of the insulation and contributing to the reduction of the useful life of the equipment. The most important factors in relation to overloads are the load conditions before the occurrence, its duration and its magnitude. Thus, in this study, the overload analysis is carried out through the study of contingencies, that is, the impact of the loss of a piece of equipment and how it influences the other equipment of the system.

The failure rate consists of the probability of a failure occurring in a given time interval. For the FMEA analysis, the probability of occurrence is obtained from historical data. Fig. 3 shows the probability of the transformer being in each state. The transition matrix for the study is presented in Table IV, considering the states Normal Operation (N), Failure (F), Overload (O), Shutdown (S) and Maintenance (M).

To determine the detection index, the effects and causes of the failures recorded in the maintenance reports were analyzed to define the probability that a given cause of failure is detected before the failure occurs. Detection of transformer failures uses very similar parameters for all substations, distinguishing only between remotely controlled, locally controlled and assisted substations.
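Once the three indices have been assigned, the RPN in (1) is their product and the ranking is a sort by RPN. The sketch below is a minimal illustration using a handful of (S, O, D) triples copied from Table V. The function and variable names are hypothetical, and the mapping from a 0 to 10 score to the levels I to V follows the bands of Tables I to III; it is not the management software mentioned in the paper.

```python
from typing import List, Tuple


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Eq. (1): RPN = S * O * D, with each index on the 0-10 scale."""
    for value in (severity, occurrence, detection):
        if not 0 <= value <= 10:
            raise ValueError("indices must be between 0 and 10")
    return severity * occurrence * detection


def level(score: int) -> str:
    """Map a 0-10 score to levels I-V using the bands of Tables I-III."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 1:
        return "I"
    if score <= 4:
        return "II"
    if score <= 7:
        return "III"
    if score <= 9:
        return "IV"
    return "V"


# (substation, transformer, S, O, D) rows copied from Table V, unsorted.
rows: List[Tuple[str, str, int, int, int]] = [
    ("PPE", "TR7", 3, 3, 7),
    ("PAL6", "TR1", 5, 5, 6),
    ("PAL10", "TR2", 6, 2, 6),
    ("PAL10", "TR1", 6, 3, 6),
    ("PAL6", "TR2", 5, 3, 6),
]

# Highest RPN first, as in the Ranking column of Table V.
ranked = sorted(
    ((se, tr, rpn(s, o, d)) for se, tr, s, o, d in rows),
    key=lambda item: item[2],
    reverse=True,
)

for position, (se, tr, value) in enumerate(ranked, start=1):
    print(position, se, tr, value)
```

For these five rows the sort reproduces the first five positions of Table V, and `level(5)` and `level(6)` both return level III, matching the discussion of PAL6 TR1.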
TABLE IV
TRANSITION MATRIX OF THE STATE DIAGRAM OF FIG. 3

     N                  F      O      S      M
N    1-(λS+λF+λD+λM)    λF     λS     λD     λM
F    μF                 1-μF   0      0      0
O    μS                 0      1-μS   λSD    0
S    μD                 0      0      1-μD   0
M    μM                 0      0      0      1-μM

Table V presents the failure risk factors classified according to Severity, Occurrence and Detection for the power transformers and how the transformers are ranked using the RPN values.

TABLE V
TRANSFORMER FAULT RISK ANALYSIS

Substation (SE)  Transformer (TR)  S  O  D  RPN  Ranking
PAL6             TR1               5  5  6  150  1
PAL10            TR1               6  3  6  108  2
PAL6             TR2               5  3  6  90   3
PAL10            TR2               6  2  6  72   4
PPE              TR7               3  3  7  63   5
CIN              AT1               1  6  7  42   6
PPE              TR8               3  2  7  42   7
CBO              TR3               1  6  6  36   8
CIN              AT2               1  5  7  35   9
ELD              TR1               2  3  5  30   10
GAR1             TR1               2  3  5  30   11
GAR1             TR2               2  3  5  30   12
PAL9             TR2               1  5  6  30   13
CIN              TR1               1  4  7  28   14
LIV2             TR1               1  4  7  28   15
LAJ2             TR1               1  4  6  24   16
LAJ2             TR2               1  4  6  24   17
PAL4             TR3               1  4  6  24   18
PAL6             TR4               1  4  6  24   19
CIN              TR2               1  3  7  21   20
PPE              TR3               1  3  7  21   21
CNA1             TR1               1  3  6  18   22
CBO              TR1               1  3  6  18   23
CBO              TR2               1  3  6  18   24
GUA2             TR1               1  3  6  18   25
PAL10            TR3               1  3  6  18   26
PAL4             TR2               1  3  6  18   27
PAL4             TR4               1  3  6  18   28
PAL4             TR6               1  3  6  18   29
PAL6             TR6               1  3  6  18   30
PAL9             TR1               1  3  6  18   31
CAX2             TR1               1  4  4  16   32
PAL13            TR1               1  3  5  15   33
PAL13            TR2               1  3  6  15   34
LIV2             TR4               1  2  7  14   35
PPE              TR2               1  2  7  14   36
CBO              TR6               1  2  6  12   37
CBO              TR7               1  2  6  12   38
LAJ2             TR3               1  2  6  12   39
PAL4             TR1               1  2  6  12   40
PAL9             TR5               1  2  6  12   41
GRA2             AT1               1  3  4  12   42
GRA2             AT2               1  3  4  12   43
GRA2             AT3               1  3  4  12   44
GRA2             TR1               1  3  4  12   45
PAL8             TR1               1  2  5  10   46
PAL8             TR2               1  2  5  10   47
SCH              TR1               1  2  5  10   48
GRA2             TR2               1  2  4  8    49
GRA2             TR3               1  2  4  8    50

The case study is applied to a sample of 50 power transformer units of the transmission area of CEEE-GT, 230 kV, distributed over 13 substations.

Analyzing transformer number 1 (TR1) of the PAL6 substation on the basis of Table V, the fault severity has a degree III risk, that is, moderate severity that could cause significant damage; the failure occurrence has a degree III risk, that is, a remote occurrence or a reasonably expected failure; and the fault detection has a degree III risk, that is, difficult detectability. This equipment therefore exhibits the highest RPN, 150, and can be located in the risk matrix shown in Fig. 6. The transformer is in the critical risk range, which means that, according to this method, the maintenance strategy should be prioritized for this substation equipment. Transformers TR1 and TR2 of the PAL10 substation and TR2 of PAL6 also have high RPN values and fall in the tolerable risk range of the risk matrix, which serves as a warning signal to the utility about this equipment.

Fig. 6. Example Risk Matrix for the Substation PAL6 TR1. (Vertical axis: Occurrence, levels I to V; horizontal axis: Severity, levels I to V; regions: High Risk Range, Critical Risk Range, Tolerable Risk Range, Low Risk Range.)

This technique is being implemented in management software that will have access to the company database and will generate these reports automatically after the parameters are set by a specialist.

V. CONCLUSIONS

This paper illustrates the use of the Failure Mode and Effect Analysis technique to determine Occurrence (O), Severity (S) and Detection (D) in order to analyze power transformers and to enable the optimization of a maintenance schedule. The study scores
and prioritizes the risk through the calculation of the RPN indicator.

One of the main challenges for the implementation of this methodology is the need to search several company records, which presents difficulties such as data interpretation, the filtering process for locating interruption information, and the lack of sufficient information in the records, among others. This shows the need to standardize the maintenance data used for management and for the preparation of maintenance plans, in order to facilitate data mining, providing greater agility and a consequent improvement in the performance and operational availability of the equipment.

The application of the study will lead to more assertive investments by the utilities, providing technical and scientific support and agility in decision making.

ACKNOWLEDGEMENTS

The authors are grateful to CEEE-GT for supporting this work and making the data available.

REFERENCES

[1] D. Benetti, “Análise de padrões de desligamentos de transformadores da rede básica.” Curitiba, p. 85, 2012.
[2] W. Li, Risk Assessment of Power Systems. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2004.
[3] A. Jahromi, R. Piercy, S. Cress, J. Service, and W. Fan, “An approach to power transformer asset management using health index,” IEEE Electr. Insul. Mag., vol. 25, no. 2, pp. 20–34, Mar. 2009.
[4] A. E. B. Abu-Elanien and M. M. A. Salama, “Asset management techniques for transformers,” Electr. Power Syst. Res., vol. 80, no. 4, pp. 456–464, Apr. 2010.
[5] A. Abiri-Jahromi, M. Parvania, F. Bouffard, and M. Fotuhi-Firuzabad, “A Two-Stage Framework for Power Transformer Asset Maintenance Management - Part II: Validation Results,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1395–1403, May 2013.
[6] A. Abiri-Jahromi, M. Parvania, F. Bouffard, and M. Fotuhi-Firuzabad, “A Two-Stage Framework for Power Transformer Asset Maintenance Management - Part I: Models and Formulations,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1404–1414, 2013.
[7] T. Suwnansri, “Asset management of power transformer: Optimization of operation and maintenance costs,” in 2014 International Electrical Engineering Congress (iEECON), 2014, pp. 1–4.
[8] A. J. C. Trappey, C. V. Trappey, L. Ma, and J. C. M. Chang, “Intelligent engineering asset management system for power transformer maintenance decision supports under various operating conditions,” Comput. Ind. Eng., vol. 84, pp. 3–11, Jun. 2015.
[9] A. Koksal and A. Ozdemir, “Improved transformer maintenance plan for reliability centred asset management of power transmission system,” IET Gener. Transm. Distrib., vol. 10, no. 8, pp. 1976–1983, May 2016.
[10] W. Fu, J. D. McCalley, and V. Vittal, “Risk assessment for transformer loading,” IEEE Trans. Power Syst., vol. 16, no. 3, pp. 346–353, 2001.
[11] I. P. Siqueira, Manutenção Centrada na Confiabilidade - Manual de Implementação, 1st ed. Rio de Janeiro: Editora QualityMark, 2009.
[12] D. H. Stamatis, Failure Mode and Effect Analysis: FMEA from Theory to Execution, 2nd ed. 1995.
[13] S.-H. Teng and S.-Y. Ho, “Failure mode and effects analysis: An integrated approach for product design and process control,” Int. J. Qual. Reliab. Manag., vol. 13, no. 5, pp. 8–26, Jul. 1996.
[14] M. Akbari, P. Khazaee, I. Sabetghadam, and P. Karimifard, “Failure Modes and Effects Analysis (FMEA) for Power Transformers,” in 28th International Power System Conference, 2013, pp. 1–7.
[15] J. Bian, X. Sun, and J. Yang, “Failure Mode and Effect Analysis of Power Transformer Based on Cloud Model of Weight,” TELKOMNIKA (Telecommunication Comput. Electron. Control.), vol. 13, no. 3, p. 776, Sep. 2015.
[16] R. Billinton and R. N. Allan, Reliability Evaluation of Power Systems, 2nd ed. New York: Plenum Press, 1996.