
Asset Management in

Power Systems
A Ph.D. course in co-operation between KTH and TKK by
Prof. Matti Lehtonen & Dr. Lina Bertling 2005

Kungliga Tekniska Högskolan


School of Electrical Engineering
100 44 Stockholm
A-ETS/EEK-0505

Helsinki University of Technology


Power Systems Laboratory
02015 Helsinki

List of Contents

1. Course program
2. Start-up meeting
3. Course schedule with list of participants
4. Project reports

Course group, August 2005, TKK

Thanks to all for the great work!

Lina & Matti
August 2005

Course program

2004-03-29

Asset Management in Power Systems (6 or 4 credits)


Helsinki University of Technology (TKK), Power systems and high voltage engineering
and the Department of Electrical Engineering at KTH in Stockholm, will jointly organize a
post-graduate course about asset management in power systems in 2005.
Course subjects: The overall topic of asset management has been divided into four
themes for the course content: (1) reliability data assessment, reliability modelling and
assessment, (2) reliability centred maintenance for maintenance optimization,
(3) condition monitoring and diagnostics methods, and (4) computer tools
supporting techniques for maintenance planning.
Course description: The course includes specialist lectures and the students' own work
on an individual task. A list of proposed topics for the task is included in the Appendix. The
students shall prepare a 10-15 page summary report, and a presentation (about 6-8 slides
for an approximately 20-minute presentation) about their individual task. The working
language is English.
Course examination: The examination for the course includes the following two
activities:
1. The individual task, which shall be presented in a report and as an oral presentation.
An approved task results in 2 course points.
2. The written examination, which will be based on all course material, covering the four
themes of the course content. There will be an option to take a full examination of
4 course points covering all themes, or an examination of a selected part of the
themes for 2 course points. Course participants should announce at registration
which option they have selected for the written examination.
Course material: Lecture handouts, summary reports, conference and journal papers. A
list of recommended literature related to the identified tasks will be handed out on the
first course day.
Course schedule:
- Registration: by 18th April to matti.lehtonen@hut.fi or lina.bertling@ets.kth.se. The number of participants is limited to 45.
- Start-up meetings to deliver material / homework: 25th April at KTH and 27th April at TKK.
- Students have to prepare their project reports and presentation slides by 10th August.
- Lecture days: 17th to 19th August 2005, 10:00-16:00, at TKK, Power systems and high voltage engineering, room I345.
- Examination: 5th September. Examinations at the same time in Sweden and Finland.

Course responsible and lecturers: Prof. Matti Lehtonen, TKK and Dr. Lina Bertling,
KTH.
Matti Lehtonen (1959) was with VTT Energy, Espoo, Finland from 1987 to 2003, and since 1999 has
been a professor at the Helsinki University of Technology, where he is now head of Power Systems and
High Voltage Engineering. Matti Lehtonen received both his Master's and Licentiate degrees in Electrical
Engineering from Helsinki University of Technology, in 1984 and 1989 respectively, and the Doctor of
Technology degree from Tampere University of Technology in 1992. The main activities of Dr.
Lehtonen include power system planning and asset management, power system protection including
earth fault problems, harmonic related issues and applications of information technology in distribution
systems. (Helsinki University of Technology, Power Systems and High Voltage Engineering, P.O.Box
3000, FIN-02015 HUT, Finland, Tel. +358 9 4515484, Fax +358 9 460224, E-mail:
Matti.Lehtonen@hut.fi)

Lina Bertling was born in Stockholm in 1973. She completed her Ph.D. in 2002 and her Licentiate
degree in 1999, both in electric power systems at the Department of Electrical Engineering, and her
M.Sc. in 1997 in the Vehicle Engineering program, specialized in systems engineering, at the
Department of Mathematics, all at the Royal Institute of Technology (KTH), Stockholm, Sweden.
She was a visiting postdoctoral researcher at the University of Toronto, associated with
Kinectrics Inc., during 2002-2003. She is currently engaged at KTH as a research associate and
project leader of the research program on asset management in power systems. Her research
interests are in reliability evaluation of power systems and the development of methods for
maintenance optimization, with special interest in reliability-centered maintenance methods.
Current involvements include organizing the 9th International Conference on Probabilistic Methods
Applied to Power Systems (PMAPS) in June 2006, www.pmaps2006.org. (KTH Dept. of
Electrical Engineering, 100 44 Stockholm, SWEDEN, Phone: +46 8 790 6508 E-mail:
lina.bertling@ets.kth.se, www.ets.kth.se/eek/lina.)

Course fee: For PhD students the course is free; for others the fee is 500 euros.

Appendix: Proposed topics for individual tasks

Theme 1: Reliability data assessment, reliability modelling and assessment

1. Reliability assessment of surge arresters
2. Failure causes in medium voltage networks
3. Reliability analysis methods for distribution systems
4. Reliability engineering in distribution system planning
5. Reliability modelling of HV switchgear
6. Monte Carlo simulation methods for reliability modelling and assessment

Theme 2: Reliability centred maintenance for maintenance optimization

7. Different maintenance strategies and their impact on power system reliability
8. Prioritization of maintenance methods (RCM, CBM, TBM, CM)
9. RCM applications for overhead lines
10. RCM applications in underground power systems
11. RCM applications for power transformers
12. RCM applications for switchgear
13. RCM applications for secondary substations
14. RCM applications for generator systems
15. Maintenance optimization techniques for distribution systems

Theme 3: Condition monitoring and diagnostics methods


16. Condition monitoring of wooden poles
17. Maintenance and condition monitoring on large power transformers
18. Aging phenomena of paper-oil insulation in power transformers
19. On-line monitoring applications for power transformers
20. Tests and diagnostics of insulating oil
21. Diagnostics of dissolved gas in insulating oil of transformers
22. Dielectric diagnostics measurements of transformers and their interpretation
23. PD-measurements for transformers and their interpretation
24. Diagnostics and condition monitoring of power cables
25. Aging phenomena of cable insulation materials
26. Dielectric diagnostics measurements of power cables and their interpretation
27. Condition monitoring of cable accessories (joints, terminations)
28. PD-measurements of power cables and their accessories
29. On-line monitoring applications for power cables
30. Condition monitoring of circuit breakers and switchgear
31. Condition monitoring and diagnostics of GIS
32. Condition monitoring of high voltage circuit breakers
33. Condition monitoring of surge arresters
34. On-line monitoring applications for secondary substations

Theme 4: Computer tools supporting techniques for maintenance planning


35. Evaluation of different tools for reliability assessment, e.g. RADPOW, NEPLAN, NetBas etc.
36. Evaluation of using the tool VeFoNet for the purpose of maintenance planning (RCM).
37. Evaluation of using the tool NEPLAN for the purpose of maintenance planning.
38. Compilation of existing functions for RCM in commercial tools like NetBas, NEPLAN, Meldis, Maximo etc.

In addition, the students are free to propose a subject of their own, as long as the
subject is in line with the themes above. It is recommended that, if at all possible, the
students select a subject close to their PhD project.

Welcome to the course on

Asset management in
power systems
KTH and TKK 2005
Prof. Matti Lehtonen (Course responsible at TKK)
Dr. Lina Bertling (Course responsible at KTH)

Contents Start-up meeting


1. Introduction to the course
   - Concepts*
   - Challenges*
   - Objectives and approach
2. Presentation of the course program
3. Selection of topics
4. Closure

*Material based on the keynote presentation: Lina Bertling, "Asset management using reliability-centered maintenance methods", NORDAC, Espoo, 23-24 August 2004.

1 AM course Concepts

- Asset management (AM) is a concept used today for planning and operation of the electrical power system.
- The aim of AM is to handle physical assets in an optimal way in order to fulfil an organisation's goal whilst considering risk, where:
  - the goal could be maximum asset value, maximum benefit or minimal life cycle cost
  - the risk could be defined by the probability of failure occurrence and its consequence, e.g. unavailability of power supply to customers (a numerical sketch follows below)
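As a minimal illustration of the risk definition above (a sketch only, not part of the original course material; all component names, failure probabilities and outage costs are hypothetical), assets can be ranked by expected annual consequence, risk = probability of failure x consequence:

```python
# Minimal sketch: rank assets by risk = failure probability x consequence.
# All component names, probabilities and outage costs are hypothetical.

assets = {
    # name: (annual failure probability, consequence of a failure [euro])
    "transformer T1": (0.02, 500_000),
    "cable C7":       (0.05, 80_000),
    "breaker B3":     (0.01, 120_000),
}

def risk(probability: float, consequence: float) -> float:
    """Expected annual consequence of failure."""
    return probability * consequence

# The highest-risk assets are the first candidates for maintenance actions.
for name, (p, c) in sorted(assets.items(), key=lambda kv: -risk(*kv[1])):
    print(f"{name}: risk = {risk(p, c):,.0f} euro/year")
```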


1 AM course Concepts

- There are different possible actions for handling these assets, e.g. acquire, maintain, replace or redesign.
- AM consequently implies making the right decisions on:
  - which assets to apply actions to
  - what actions to apply
  - how to apply the actions
  - when to apply the actions

1 AM course Concepts

- To make the right decisions there is a need for:
  - condition data
  - failure statistics
  - reliability modelling techniques
  - reliability assessment tools
  - maintenance planning tools
  - systematic techniques for maintenance planning, e.g. the reliability-centred maintenance (RCM) method
- These different types of needs are covered by the subjects of this course.

1 AM course Concepts

- Equipment maintenance provides a tool for handling reliability, either by preventive maintenance (PM) or corrective maintenance (CM) of assets.
- PM can be separated into either:
  - time-based maintenance (TBM), the traditional approach, or
  - condition-based maintenance (CBM).
- RCM provides a tool for AM by balancing CM against PM to reach cost-effective maintenance plans (a numerical sketch of this balancing follows below).
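The balancing in the last bullet can be made concrete with the classic age-replacement model; this is a minimal sketch under stated assumptions rather than a method prescribed by the course, and the Weibull parameters and cost ratio are hypothetical. The model picks the preventive replacement age T minimizing the long-run cost rate C(T) = [c_p R(T) + c_f (1 - R(T))] / integral_0^T R(t) dt, where R(t) is the survival function:

```python
# Sketch: age-replacement optimization balancing preventive (PM) against
# corrective (CM) maintenance costs. All parameters are hypothetical.
import math

BETA, ETA = 2.5, 20.0   # Weibull shape and scale [years]; BETA > 1 means wear-out
C_PM, C_CM = 1.0, 8.0   # relative cost of a planned replacement vs. a failure

def reliability(t: float) -> float:
    """Weibull survival function R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / ETA) ** BETA))

def cost_rate(T: float, steps: int = 1000) -> float:
    """Long-run cost per year with preventive replacement at age T:
    C(T) = (C_PM*R(T) + C_CM*(1 - R(T))) / integral_0^T R(t) dt."""
    dt = T / steps
    expected_cycle_length = sum(reliability(i * dt) * dt for i in range(steps))
    return (C_PM * reliability(T) + C_CM * (1.0 - reliability(T))) / expected_cycle_length

best_T = min((t / 10.0 for t in range(10, 400)), key=cost_rate)
print(f"Cost-minimizing preventive replacement age: {best_T:.1f} years")
```

With a wear-out pattern (Weibull shape greater than 1) a finite optimal age exists; with a constant failure rate the optimum degenerates to pure corrective maintenance, which matches the RCM premise that time-based tasks pay off only for wear-out failure modes.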


1 AM course Concepts
Background for RCM

- Originated in the civil aircraft industry in the 1960s.
- The US Department of Commerce defined the concept of RCM in 1975 and declared that all major systems should apply RCM.
- First full description of RCM (Nowlan 1978).
- Introduced in the nuclear power industry in the 1980s by EPRI.
- Introduced for hydro power plants in Norway in the 1990s, and at Vattenfall from 1997 to 2004.
- Ongoing introduction for transmission and distribution system operators, e.g. pilot studies by Swedenergy and research at KTH (the quantitative method Reliability-Centered Asset Maintenance, RCAM).

1 AM course Concepts

- The aim of RCM is to optimize the maintenance achievements (efforts, performance) in a systematic way.
- The following features define and characterize RCM:
  - preservation of system function,
  - identification of failure modes,
  - prioritizing of function needs, and
  - selection of applicable and effective maintenance tasks.
- RCM does not add anything new in a technical sense; the method is generally qualitative.

1 AM course Challenges

- Gain support and confidence in incorporating new systematic techniques into the AM decision process (not a new technique but a new working process).
- Develop, implement and integrate information systems to support and handle the required data in detail and range.
- Develop general models of component reliability and its relationship to changes in component condition.
- Develop decision support methods and optimization algorithms that are applicable and can be supported with real data.
- Develop measurement techniques to support CBM and RCM.
- Further develop RCM methods like RCAM.

2 AM course Course objective

- The objective of this course is to investigate solutions to these challenges by studying the different needs for AM.
- The different needs for AM have been divided into four themes for the course content:
  1. reliability data assessment and reliability modelling,
  2. RCM for maintenance optimization,
  3. condition monitoring and diagnostics methods, and
  4. computer tools supporting techniques for maintenance planning.
- The objective is that the course participants gain knowledge in all these four themes as tools for AM.

2 AM course Course approach

- There is expert knowledge in AM at both TKK and KTH, including several research activities, e.g.:
  - development of techniques and methods for CBM at TKK and KTH
  - development of methods for RCM at KTH
- The approach of this course is to use this knowledge as a platform for investigating AM and to communicate the different experiences in the area.

2 AM course Course program

- The course program has been handed out and is available at http://www.ets.kth.se/seminarium.htm
- It summarizes relevant information about the course, such as:
  - course subjects
  - course description
  - course examination
  - course material
  - course schedule
  - course responsible and lecturers

2 AM course Course subject

- For this course the overall topic of AM has been divided into four themes for the course content:
  1. reliability data assessment and reliability modelling,
  2. RCM for maintenance optimization,
  3. condition monitoring and diagnostics methods, and
  4. computer tools supporting techniques for maintenance planning.

2 AM course Course description

- The course includes lectures and the students' own work on an individual task.
- A list of proposed topics for the tasks is included in the course program.
- The students shall prepare a 10-15 page summary report, and a presentation (about 8-10 slides for an approximately 20-minute presentation) about their individual task.
- Guidelines are provided for writing the report; see "Guidelines for the preparation of reports for the course Asset Management in Power Systems".
- The working language is English.

2 AM course Course examination

- The examination for the course includes two activities:
  1. An individual task that shall be presented in a report and as an oral presentation. An approved task results in 2 course points.
  2. A written examination based on all course subjects. There will be an option to take a full examination of 4 course points covering all themes, or an examination of a selected part of the themes for 2 course points.
- The course participants should announce at the start-up meeting which option they have selected for the written examination. More information about the exam will be provided after the start-up meeting.

2 AM course Course material

- The course material includes:
  - lecture handouts
  - summary reports and presentation slides for the tasks
  - recommended literature
- A summary presentation of the course material will be published as course material at KTH/TKK and will be available in pdf format.
- The following slides provide some recommended literature with links.

2 AM course Course material

- The main source of literature is IEEE Xplore (with approximately 1 million research papers): http://www.ieeexplore.ieee.org/Xplore/dynhome.jsp
- A list of references resulting from a general literature search in IEEE Xplore for the AM course is handed out.

2 AM course Course material

Links with reference literature in the power industry:
- EPRI http://www.epri.com/
- Cigré http://www.cigre.be/
- SINTEF http://www.sintef.no/
- Elforsk http://www.elforsk.se/
- Swedenergy http://www.svenskenergi.se/

2 AM course Course material

Links with reference literature at KTH:
- PhD theses at KTH (e.g. Bertling 2002 on RCM): http://media.lib.kth.se/kthdiss.asp
- Research at KTH, see the annual report: http://www.ets.kth.se/annual_report.htm
- Research at KTH/EKC, see the list of publications at http://www.ets.kth.se/comp
- Course material and reference material: http://www.ets.kth.se/eek/lina/2C4030.html

2 AM course Course material

Links with references to different standards:
- IEEE http://standards.ieee.org/
- Military standards, e.g. MIL-STD
- ISO/IEC http://www.iso.org/iso/en/ISOOnline.frontpage
- SIS http://www.sis.se/
- Utek http://www.utek.se/

2 AM course Course material


Examples of useful standards for reliability assessment:
- IEC 300, Dependability management (SEK SS 441 05 05, Tillförlitlighet - Ordlista [Dependability - Vocabulary])
- British Standard BS 5760, Reliability of systems, equipments and components
- MIL-STD 785, Reliability program for systems and equipment - development and production
- MIL-STD 882, System safety program requirements

2 AM course Course material


Examples of useful standards for reliability assessment:
- MIL-STD 756, Reliability prediction
- IEEE Std 352, General principles of reliability analysis of nuclear power generating station protection systems
- ISO 9000, Quality system standards, identical with the European CEN/CENELEC EN 29000 and the American ANSI/ASQC Q90

2 AM course Course material


Examples of books on reliability assessment and RCM:
- Høyland, A. and Rausand, M., System Reliability Theory - Models and Statistical Methods, Wiley, 2004
- Patrick D. T. O'Connor, Practical Reliability Engineering, Wiley & Sons Ltd, 2002
- Roy Billinton and Ron Allan, Reliability Evaluation of Power Systems
- Lina Bertling, RCM for Electric Power Distribution Systems, Doctoral thesis, KTH, 2002 (includes a reference list with fundamental books on distribution systems and RCM)

2 AM course Course schedule

The course schedule involves the following activities and deadlines:
- Start-up meetings with selection of student tasks and hand-out of recommended literature: 25th April (KTH) and 27th April (TKK).
- Individual project work presented in a report, with submission deadline 10th August.
- Lecture days with introductory lectures by the course leaders, followed by presentations of the tasks by the course participants: 17th-19th August.
- Examination: 5th September.

2 AM course Course responsible

The course responsible and lecturers in the course are:
- Prof. Matti Lehtonen at TKK (the main co-ordinator for the course)
- Dr. Lina Bertling at KTH (who co-ordinates participants from Sweden)

2 AM course Course responsible

Contact information for Prof. Matti Lehtonen, TKK:
Helsinki University of Technology
Power Systems and High Voltage Engineering
P.O. Box 3000, FIN-02015 HUT, Finland
Tel. +358 9 4515484, Fax +358 9 460224
E-mail: Matti.Lehtonen@hut.fi

2 AM course Course responsible

Contact information for Dr. Lina Bertling, KTH:
KTH Dept. of Electrical Engineering
100 44 Stockholm, SWEDEN
Phone: +46 8 790 6508
E-mail: lina.bertling@ets.kth.se
WWW: www.ets.kth.se/eek/lina

3 AM course Selection of tasks

Theme 1: Reliability data assessment, reliability modelling and assessment
1. Reliability assessment of surge arresters
2. Failure causes in medium voltage networks
3. Reliability analysis methods for distribution systems
4. Reliability engineering in distribution system planning
5. Reliability modelling of HV switchgear
6. Monte Carlo simulation methods for reliability modelling and assessment

3 AM course Selection of tasks

Theme 2: Reliability centred maintenance for maintenance optimization
7. Different maintenance strategies and their impact on power system reliability
8. Prioritization of maintenance methods (RCM, CBM, TBM, CM)
9. RCM applications for overhead lines
10. RCM applications in underground power systems
11. RCM applications for power transformers
12. RCM applications for switchgear
13. RCM applications for secondary substations
14. RCM applications for generator systems
15. Maintenance optimization techniques for distribution systems

3 AM course Selection of tasks

Theme 3: Condition monitoring and diagnostics methods
16. Condition monitoring of wooden poles
17. Maintenance and condition monitoring on large power transformers
18. Aging phenomena of paper-oil insulation in power transformers
19. On-line monitoring applications for power transformers
20. Tests and diagnostics of insulating oil
21. Diagnostics of dissolved gas in insulating oil of transformers
22. Dielectric diagnostics measurements of transformers and their interpretation
23. PD-measurements for transformers and their interpretation
24. Diagnostics and condition monitoring of power cables
25. Aging phenomena of cable insulation materials

3 AM course Selection of tasks

Theme 3: Condition monitoring and diagnostics methods (continued)
26. Dielectric diagnostics measurements of power cables and their interpretation
27. Condition monitoring of cable accessories (joints, terminations)
28. PD-measurements of power cables and their accessories
29. On-line monitoring applications for power cables
30. Condition monitoring of circuit breakers and switchgear
31. Condition monitoring and diagnostics of GIS
32. Condition monitoring of high voltage circuit breakers
33. Condition monitoring of surge arresters
34. On-line monitoring applications for secondary substations

3 AM course Selection of tasks

Theme 4: Computer tools supporting techniques for maintenance planning
35. Evaluation of different tools for reliability assessment, e.g. RADPOW, NEPLAN, NetBas etc.
36. Evaluation of using the tool VeFoNet for the purpose of maintenance planning (RCM).
37. Evaluation of using the tool NEPLAN for the purpose of maintenance planning.
38. Compilation of existing functions for RCM in commercial tools like NetBas, NEPLAN, Meldis, Maximo etc.

4 AM course Closure

- As a result of the two start-up meetings at KTH and TKK, the following material will be distributed:
  - a list of participants and their selected tasks
  - details about the course exam
- During the following period of work on the individual tasks, the lecturers will be available to support with material and discussions.
- Time for questions and discussions!

Asset Management in Power Systems
Course schedule for student presentations

Wednesday, the 17th August:
Theme 1: Reliability data assessment, reliability modelling and assessment
10.00-12.00
Reliability assessment of surge arresters, Vesa Latva-Pukkila
Failure causes in medium voltage networks, Kimmo Kivikko
Investigation of the impact of non-uniformly distributed failures on customer interruption costs for electrical distribution systems, Patrik Hilber
Optimal strategy for variance reduction based on component importance measures in electrical distribution systems, Torbjörn Solver
Reliability engineering in distribution system planning, Markku Hyvärinen
12.00-13.00 Lunch break
13.00-14.30
A Bayesian method for reliability modelling applied to HV circuit breakers, Tommie Lindqvist
Monte Carlo simulation methods for reliability modelling and assessment, Jussi Palola

Theme 2: Reliability centred maintenance for maintenance optimization
Different maintenance strategies and their impact on power system reliability, Anna Brdd
Maintenance optimization techniques for distribution systems, Sirpa Repo
14.30-15.00 Coffee break
15.00-16.30
Prioritization of maintenance methods (RCM, CBM, TBM, CM), Sanna Uski
RCM applications for overhead lines, Sauli Antila
RCM applications in underground power systems, Samuli Honkapuro
RCM applications for switchgear and high voltage breakers, Richard Thomas

Thursday, the 18th August:
10.00-12.00
RCM applications for generator systems, Nathaniel Taylor
RCM application for LV process electrification and control systems, Helmuth Veiler

Theme 3: Condition monitoring and diagnostics methods
Condition monitoring of wooden poles, Osmo Auvinen
Maintenance and condition monitoring on large power transformers, Pramod Bhusal
Aging phenomena of paper-oil insulation in power transformers, Henry Lågland
12.00-13.00 Lunch break
13.00-14.30
On-line monitoring applications for power transformers, Pekka Nevalainen
Tests and diagnostics of insulating oil, Kaisa Tahvanainen
Gas diagnostics in transformer condition monitoring, Pauliina Salovaara
Dielectric diagnostics measurements of transformers and their interpretation, Xiaolei Wang
14.30-15.00 Coffee break
15.00-16.30
Condition monitoring of generator systems, Matti Heikkilä
On-line monitoring applications for secondary substations, Petri Trygg
PD-measurements for diagnosis of covered conductor overhead lines, Murtaza Hashmi

Theme 4: Computer tools supporting techniques for maintenance planning
Evaluation of the representation of power system component maintenance data in IEC standards 61850/61968/61970, Lars Nordström
17.00-21.00 Sauna and dinner in Micronova

Friday, the 19th August:
10.00-12.00
Theme 3: Condition monitoring and diagnostics methods, continued
Condition monitoring of high voltage circuit breakers, Tuomas Laitinen
On-line condition monitoring of high voltage circuit-breakers, Shui-cehong Kam
Theme 1: Reliability assessment of protection systems, Gabriel Olguin
12.00 Closing of the course

RELIABILITY ASSESSMENT OF SURGE ARRESTERS

Vesa Latva-Pukkila
Tampere University of Technology
vesa.latva-pukkila@tut.fi

INTRODUCTION
Surge arresters are used in electricity networks mainly to protect sensitive network apparatus
against overvoltages. The purpose of lightning arresters is to provide a path through which the surge can
pass to the ground before it has a chance to seriously damage the insulation of the transformer or
other electrical equipment. The major change from gapped silicon carbide arresters to gapless
metal oxide (MO) surge arresters started some 25 years ago. The shift from porcelain to polymers
as the housing material started a little later, 15-20 years ago. These two technology changes have
changed the behavior and properties of surge arresters considerably. Today, polymer housed metal
oxide arrester designs are used in most new arrester applications at the distribution level and
increasingly also at higher voltages. [Lah03]

SURGE ARRESTER TYPES


In general, there are two types of lightning arresters: gapped silicon carbide (SiC) and gapped or
gapless metal oxide (MO). There are also two types of housing materials: porcelain and
polymeric. Porcelain housed gapped SiC arresters are the earlier form of lightning arrester,
but they are still used in power systems. Today most of the new arresters are polymer housed
gapless metal oxide arresters. [Grz99, Lah03, Kan05]
The elementary SiC arrester consists of air gaps in series with a SiC resistive element. These are
hermetically sealed in porcelain housing. Multiple series of gaps are utilized to improve the gap
reliability versus a single gap. [Grz99] When an overvoltage occurs, a SiC arrester will operate
and let through a large current caused by the overvoltage. After the overvoltage impulse
disappears, the SiC arrester will continue to carry a power-frequency follow current of several
hundred amperes for a few milliseconds until a current zero or until the arrester gaps deionize.
Energy dissipated in the SiC arrester is mainly due to power-frequency follow currents and not
due to surge impulses. [Goe00]
The gapless metal oxide arrester is composed of metal oxide varistor blocks in series. MO
arresters have extremely non-linear characteristics and when they operate, they let through less
power-frequency current than SiC arresters. An MO arrester will let through only the current
impulse caused by the overvoltage and does not have power-frequency follow-current. This
allows MO arresters with less energy capability to be used when replacing SiC arresters. [Goe00,
Kan00]

In the year 2000, there were approximately 100,000 surge arresters in medium voltage power lines in
Finland. Approximately 55 % of these were gapped SiC arresters and 45 % gapless MO
arresters. Of the MO arresters, approximately 40 % were porcelain housed and 60 % polymer housed.
The age of the SiC arresters varied between 15 and 50 years, while the oldest MO arresters were
18 years old. The share of MO arresters is estimated to rise to 90 % of all surge arresters before
the year 2010. [Kan00]

Structure of MO arrester
The mechanical structure and design of surge arresters naturally varies between arrester types and
manufacturers. Structures of porcelain and polymer housed arresters differ significantly. The
main principles of the mechanical structures used in polymer and porcelain housed metal oxide
surge arresters are illustrated in Fig. 1. The figures were prepared for distribution class arresters,
but the same kind of basic structures can also be found in arresters of different voltage levels.
[Lah03]

Fig. 1 Basic structures of metal oxide surge arresters (a: porcelain housed MO arrester, b and
c: polymer housed MO arresters) [Lah03]
The main differences between the structures of porcelain housed (Fig. 1 a) and polymer housed (Fig. 1 b
and c) arresters are the need for a separate mechanical support in polymeric arresters, and the
unavoidable internal gas space and pressure relief system needed in porcelain arresters. In
porcelain housed arresters the housing also serves as a mechanical support. In polymeric MO
arresters, due to the elasticity of polymeric housings, a separate support system, formed by a
glass fiber reinforced epoxy tube or rods inside an arrester, is needed to take care of mechanical
loads. In all, the mechanical structure of porcelain housed MO arresters is more complex and
includes more separate parts than that of polymer housed MO arresters. [Lah03]
The structures of polymer housed arresters can be separated into two main categories according
to their manufacturing technique: polymeric housing molded directly onto the internal parts (Fig. 1
c) and separately molded housing (Fig. 1 b). In the latter case, the arrester is assembled after
molding and it needs separate end caps (7) to seal the housing to the end electrodes. [Lah03]
In designs with housing molded directly onto the internal parts, the interfaces between the
housing and other parts are tightened using primers during the molding process to achieve a
chemical bonding between the materials. In separately molded designs, this interface is often
filled with an elastic sealing paste to ensure a totally void-free structure. In some designs, voids
and even considerable gas spaces are found in this interfacial region. [Lah03]

Polymer housing versus porcelain housing


The change from porcelain as a housing material towards polymeric housings has been the most
recent major change in the history of metal oxide surge arresters. The reason for this change is
that polymeric housings have several advantages over porcelain housings: [Lah03]
better resistance to harmful effects of pollution
fewer components, lower weight and compact size
faster and simpler manufacturing process, allowing more complex housing geometry
better thermal properties (higher conductivity)
reduced risk of shattering and explosion (increased safety)
better resistance to moisture ingress
less vulnerable to vandalism or mishandling during installation and transportation
Although polymeric materials have many superior properties over porcelain, there are also
negative aspects. Generally, polymers are much more complex to use because they may degrade
under service stresses when not properly used. Several aspects have to be considered, solved and
tested when designing polymer housed high voltage apparatus. Some of these aspects are given in
the following: [Lah03]

- weathering degradation is possible
  o polymers have weaker bonds than porcelain (they can be aged and degraded), so careful material development and product design are needed
  o polymer material formulations are organic, and their foundation element, carbon, is conductive in most of its uncombined forms; in case of material degradation, conductive paths or tracks may be formed (tracking)
- lower mechanical strength, so a separate mechanical support is needed
- higher raw material costs
- complex material formulation and manufacturing process; material compatibility is not easy to handle
  o polymers may suffer from stress corrosion or brittle fracture if not properly formulated for the specific application

Further, the housing of polymer housed arresters that fail in networks frequently remains
unbroken, and hence the damaged arresters are very difficult to detect. This delays fault location,
isolation, and restoration measures in the network. At the moment, there is no simple and reliable
means to determine the condition of an in-service medium voltage surge arrester on site. [Kan05]

FAILURE MODES OF SURGE ARRESTERS


Studies have shown that the vast majority of porcelain distribution arrester failures are moisture
ingress related. To prevent moisture ingress, an arrester must have a robust sealing system which
completely seals the housing-varistor assembly interface as well as the terminal ends. In addition,
the arrester must be essentially void-free to prevent moisture vapor transmission. If not void-free,
moisture vapor permeates the housing, condenses and collects in internal air spaces. This means
also that polymer arresters with free internal air space, including gapped arresters or designs with
an unfilled interface, are prone to premature moisture-induced failure. [Mac94]
If one can protect against moisture ingress, then the next likely cause of arrester failure will be
excessive energy input. Gapless distribution arresters absorb energy from power frequency
temporary overvoltages, which occur during ground faults, switching surges or lightning
discharges. [Mac94]
Stresses often act simultaneously and sometimes sequentially, e.g., formation of salts followed by
absorption of moisture into the arrester. Additionally, the varistor elements of a MO arrester are
continuously subject to considerable voltage stress, and a small leakage current continuously
passes through a MO arrester giving rise to new kinds of phenomena unknown in arresters with
gaps. [Kan05]
This chapter briefly presents the operating environment of surge arresters, typical failure modes
of silicon carbide and metal oxide arresters, as well as moisture penetration phenomena.

Operating environment of surge arresters


Arresters encounter a large range of different stresses in their operating environment. Resistance
to the harmful effects of these ambient stresses is thus one of the fundamental properties of any
arrester. The basic ambient stresses are illustrated in Fig. 2.
In the case of metal oxide arresters, ambient stresses may degrade an arrester either directly (e.g.
mechanical forces) or, more often, indirectly by causing changes in the materials (e.g. UV
radiation, acids) or in the electrical behavior (e.g. leakage current formation by humidity ingress
or pollution). All these changes often interact with the other degradation processes. Even in the
case of an environmentally degraded arrester, the final failure of an MO arrester is often caused
by electrical stresses, when the weakened arrester can no longer withstand the field strengths
involved. [Lah03]

Fig. 2 Basic ambient stresses affecting MO arresters in different service conditions [Lah03]
Surface contamination, pollution and other kinds of layers accumulating on the surfaces may
cause both internal and external problems in arresters. A conductive layer may change the
potential distribution and increase the internal leakage current of an arrester. This may lead to
uneven heat distribution and overheating of the metal oxide discs. Conductive layers may also
cause internal partial discharge activity in arresters with internal gas space. External problems
include surface leakage current and partial discharge formation that may lead to material erosion,
puncture of the housing and flashovers. [Lah03]
Humidity in all forms is more or less conductive. The humidity stress on arresters is caused by
different forms of ambient humidity: precipitation and the moisture content of the air. Factors like
the length and recurrence of moist periods also affect the stress, as do the properties of periods
with lower air humidity content and precipitation. If humidity penetrates the inside of an
electrical insulation it may form conductive paths leading to internal leakage currents and
changes in electrical field distribution. Internal leakage currents may also cause tracking and
erosion of materials. In silicon carbide type arresters internal humidity has been found to be the
most important reason for failures in the field. [Lah03]

Failure modes of silicon carbide arresters


The most important cause of failures of gapped SiC arresters is internal degradation due to
moisture penetration. Other reasons for failures include severe surges, external contamination,
air gaps damaged by repeated spark-over events, and deterioration of the SiC material due to
moisture.
According to many sources, the dominant cause of failure of gapped silicon carbide surge
arresters is internal degradation due to moisture ingress resulting from inadequate seals. Some
studies suggest that nearly 86% of all SiC arrester failures could be associated with moisture
ingress. Another study showed that moisture was present in 10% of about 3000 arresters
examined after about 12 years of service life. [Dar96]
Some SiC arrester failures (about 5%) are caused by very severe lightning surges, e.g. strokes of
very high current magnitude or long continuing current durations, and/or multiple-stroke flashes
with high multiplicity of stroke currents and power-follow currents. [Dar96]
Further, some arrester failures are caused by external contamination on the housing. Such failures
are more likely for arresters without non-linear resistance grading of the internal gaps. [Dar96]
Another reason for failures is repeated spark-over events caused by lightning surges during
service. Inside the arresters, the surfaces of the air gaps are slightly damaged by the arcs due to
energy dissipation, and the originally smooth surfaces accumulate small particles emitted during
arcing. Consequently, the withstand voltage level is reduced, as breakdown across the air gaps
becomes easier. [Grz99]
Furthermore, the properties of the silicon carbide itself may change. During energy dissipation
the silicon carbide sustains a very large current, and the material structure of the SiC undergoes
some changes. Moisture may also deteriorate the SiC material, even when it is hermetically sealed
in the porcelain housing. [Grz99]

Failure modes of metal oxide arresters


Metal oxide arresters in service are still relatively new, and it seems that there is no clear
information on their typical aging and failure mechanisms. Naturally, severe surges cause failures,
but energy-related failures account for considerably less than 1 % of all failures. The majority are
caused by high temporary overvoltages, which exceed arrester limits, and by animals. [Goe00]
For gapped MO arresters, it is clear that the principal cause of failure is moisture ingress through
faulty or inadequate seals on both porcelain and polymer housings. The moisture ingress
degrades the series gap structure by either destroying the properties of the insulation associated
with the gaps or by damaging the non-linear grading resistors. Eventually all the system voltage
is applied to the varistor blocks, leading to their failure and so failure of the arrester. It is clear
that the presence of lightning is not a part of this process, but there is evidence that the likelihood
of this mode of failure is increased by the occurrence of high temporary over-voltages. [Dar00]
Formation of acids and nitrates inside porcelain housed MO arresters with internal gas space may
also cause problems. [Kan97]

For gapless MO arresters, moisture ingress does not appear to be a significant factor. While the
cause of block failure in such arresters is often masked by the damaging effects of fault currents,
most of the blocks failed in service by flashover of the surfaces or perhaps near-surface
flashovers. This may occur due to high temporary over-voltages or lightning. [Dar00]

Moisture penetration
The ambient conditions naturally define the external "humidity stress" tending to penetrate an
MO arrester. The surface properties of the housing material, like hydrophobicity and roughness,
affect humidity ingress by affecting the moisture ingress processes. For example, high water
repellence due to hydrophobicity limits water permeation. The housing material itself greatly
affects moisture permeation. Cracks and other flaws cause direct moisture ingress by capillary
effect, but these problems are rather rare in modern arrester types. Instead, moisture permeation
by diffusion takes place in all polymers if the stresses are high enough. The 'recipe' of a housing
material defines its permeability to water. [Lah03]
In some cases moisture can also be formed internally as a product of chemical reactions initiated
by internal discharge activity, for example when a conductive layer covers the external
surface of an arrester. Internal partial discharge activity is normally possible only in arrester types
having an internal gas space. These possible moisture penetration and formation processes in
polymer housed arresters are illustrated in Fig. 3. [Lah03]

Fig. 3 Moisture penetration and formation possibilities in polymer housed arrester (1 = end
electrode, 2 = MO disc, 3 = mechanical support, 4 = polymeric housing) [Lah03]
The internal structure of an MO arrester affects possible moisture problems remarkably. A good
design should not leave any places for moisture to collect (e.g. voids, gas spaces), because
moisture diffusion through the polymeric housing is possible. In the newest designs, where the
housing is typically molded directly onto the internal parts of the arrester, the quality of the
bondings over the internal interfaces is of great importance. The bonding areas should not be
porous, and the bondings
should withstand the stresses and conditions present inside an arrester throughout its lifetime.
These stresses may be very severe; for example, the ambient temperature in Finnish Lapland may
reach -50 °C in winter, while the temperature during summertime may reach +30 °C. The operation
of an arrester naturally heats it up to a higher temperature. [Lah03]
In cases where moisture has penetrated an arrester the internal leakage current will start to flow
when the arrester is energized and a leakage current path is available. The leakage current may
cause tracking of the materials depending on the flowing current and the dissipated power.
Degradation caused by the internal leakage current and the humidity can increase the humidity
ingress into the arrester interior by weakening the material properties and the internal structure.
This degradation may lead to loosening of materials from each other. [Lah03]

FAULT RATES AND EXPERIMENTS


In general, lightning arresters are designed for a service life of 30 or 40 years in power systems.
However, the actual service life depends greatly on the properties of the varistor element, the
sealing of the arresters, and the working conditions, such as surge magnitudes and the frequency
of surge or fault occurrences. [Grz99] Depending on the source, the annual rate of surge arrester
failures varies between 0.1 and 3 %. In some cases, failed arresters have also included relatively
new metal oxide arresters. [Kan00]
It is a common experience that metal oxide arresters are more reliable than the technologically
outdated gapped SiC arresters. This is to be expected, as the MO arrester (particularly the gapless
type) is much simpler than the gapped SiC arrester. Also, MO arresters have not been in service
much in excess of 10 years, so if any degradation due to age is to occur, it should not be evident
at this early stage of their life. Even so, during the 1990s, and only on distribution systems, some
electricity companies experienced significant failure rates for a few makes, both porcelain and
polymer housed, and gapped and gapless arresters. [Dar00] Still, properly designed and
constructed gapless polymer arresters, with no internal air space, can provide a significant
improvement in overvoltage protection and arrester reliability over porcelain arresters. [Mac94]
According to a study made by Detroit Edison, the normal failure rate for metal oxide lightning
arresters is roughly 2.99 per 1000 arresters each year. [Boh93]
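To put such figures in perspective (a sketch only, not from any of the cited studies; the constant-rate assumption and the 30-year horizon are illustrative simplifications), the fleet size quoted above for Finland can be combined with the reported failure-rate range:

```python
# Sketch: implications of a 0.1-3 % annual failure rate for a fleet of
# ~100,000 medium voltage arresters (the Finnish figure for the year 2000).
# A constant annual rate is assumed purely for illustration.

FLEET = 100_000

for annual_rate in (0.001, 0.00299, 0.03):  # 0.1 %, Detroit Edison figure, 3 %
    failures_per_year = FLEET * annual_rate
    surviving_30y = (1.0 - annual_rate) ** 30  # fraction left after 30 years
    print(f"rate {annual_rate:.2%}: ~{failures_per_year:,.0f} failures/year, "
          f"{surviving_30y:.1%} surviving a 30-year design life")
```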
This chapter presents the results of some case studies focused on reliability and failure modes of
both silicon carbide and metal oxide surge arresters.

Tests of SiC and MO arresters gathered from various 24 kV networks in Finland


The devices studied were gapped silicon carbide (SiC) arresters and gapless metal oxide (MO)
surge arresters gathered from various 24 kV networks in Finland during the years 1999-2000.
A total of 246 silicon carbide arresters (aged from 14 to 38 years) and 164 MO arresters
(aged from 4 to 15 years) were studied. The results of the measurements on the MO arresters were
compared to measurements conducted earlier (in 1994) on 16 new, unused MO arresters of
the same types as those taken from the field.
According to the results presented in Table 1, the type (not just the age) of the arrester seems to
have a considerable influence on its reliability. For example, type S1 (manufactured
1969-72) performed well in the tests while type S5 (1980-85) did not.
Table 1   Results of Finnish tests on silicon carbide arresters [Kan05]

Arrester  Manu-      Housing      AC Voltage Test           Impulse Current Test
Type      facturing               Number   Failed   [%]     Number   Failed   [%]
S1        1969-72    Porcelain    30       1        3.3     30       0        0.0
S2        1980-85    Porcelain    31       3        9.7     31       0        0.0
S3        1973-76    Porcelain    54       4        7.4     54       10       18.5
S4        1970-79    Porcelain    56       29       51.8    11       10       90.9
S5        1980-85    Porcelain    19       1        5.3     17       11       64.7
S6        1967-69    Porcelain    22       3        13.6    5        5        100.0
S7        1962-70    Porcelain    17       0        0.0     13       9        69.2
S8        1971-74    Porcelain    5        0        0.0     2        2        100.0
S9        1976       Porcelain    3        0        0.0     3        0        0.0
S10       1962-74    Porcelain    9        0        0.0     9        4        44.4

A total of 41 gapped SiC arresters out of 246 specimens (Table 1) failed in the ac voltage
withstand test. Of the gapped SiC arresters tested, 34.5% (85 out of 246, not presented in Table 1)
did not pass the lightning impulse (LI) sparkover test, although for most of the failed arresters
the LI sparkover voltages were only a few kV higher than the required value and, hence, the
protective margins were still rather good. [Kan05]
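The counts in Table 1 can be cross-checked against the totals quoted above; the sketch below simply re-enters the count columns of the table and recomputes the percentages:

```python
# Sketch: recompute the Table 1 totals and failure percentages from the counts.
# Tuples: (type, AC tested, AC failed, impulse tested, impulse failed)
table1 = [
    ("S1", 30, 1, 30, 0), ("S2", 31, 3, 31, 0), ("S3", 54, 4, 54, 10),
    ("S4", 56, 29, 11, 10), ("S5", 19, 1, 17, 11), ("S6", 22, 3, 5, 5),
    ("S7", 17, 0, 13, 9), ("S8", 5, 0, 2, 2), ("S9", 3, 0, 3, 0),
    ("S10", 9, 0, 9, 4),
]

ac_tested = sum(row[1] for row in table1)
ac_failed = sum(row[2] for row in table1)
print(f"AC voltage test: {ac_failed} of {ac_tested} failed")  # 41 of 246, as in the text

for name, n_ac, f_ac, n_imp, f_imp in table1:
    print(f"{name}: AC {100 * f_ac / n_ac:.1f} %, impulse {100 * f_imp / n_imp:.1f} %")
```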
The current impulse test of the gapped silicon carbide arresters caused extensive damage. For
many types, nearly half or more of the arresters were damaged (Table 1). In many cases the
internal failure also caused disintegration of the arrester's porcelain housing. Due to the damage
and failures in the electrical tests, the authors recommend that electricity companies in Finland
exchange their gapped silicon carbide arresters of types S3, S4, S5, S6, S7, S8, and S10 for new
metal oxide surge arrester types, in order to reduce the risk of arrester failures and to improve the
protection levels in the MV networks. [Kan05]
All the MO arrester types studied were generally in good condition after being used for 4-15
years in Finnish 24 kV networks. Only three specimens out of a total of 164 MO arresters tested
could be evaluated as faulty or near failure (Table 2). Comparisons between the MO arresters
gathered from the networks and the unused MO arresters showed that the protection levels of the
MO arrester types studied had remained stable after 4-15 years of use in the networks. In
addition, the ac durability of these MO arrester types had not weakened remarkably during their
use in the networks. The main information on the arresters and the test results is presented in
Table 2. [Kan05]
It was also discovered that some MO arrester types were under-rated compared to the rather high
temporary overvoltages typical of Finnish 24 kV networks. [Kan05]

Table 2   Results of Finnish tests on metal oxide arresters [Kan05]

Arrester  Manu-      Housing      AC Voltage Test           Impulse Current Test
Type      facturing               Number   Failed   [%]     Number   Failed   [%]
M1        1984-91    Porcelain    64       1        1.6     64       1        1.6
M2        1986-91    Porcelain    12       0        0.0     12       0        0.0
M3        1987-90    Porcelain    21       0        0.0     21       1        4.8
M4        1990       EPDM         3        0        0.0     3        0        0.0
M5        1991       EPDM         6        0        0.0     6        0        0.0
M6        1990       EPDM         3        0        0.0     3        0        0.0
M7        1991-94    EPDM         9        0        0.0     9        0        0.0
M8        1991-94    EPDM         25       0        0.0     25       0        0.0
M9        1991-95    EPDM         8        0        0.0     8        0        0.0
M10       1992-95    Silicone     4        0        0.0     4        0        0.0
M11       1991-92    Silicone     9        0        0.0     9        0        0.0

According to these measurements, there is no clear correlation between the insulation resistance
value (measured with a dc voltage of 1000 V) and the ac withstand voltage of gapped silicon
carbide arresters. Hence, a simple insulation resistance measurement (with a rather low test
voltage) cannot be used as a reliable method to evaluate the ac voltage withstand level of a
gapped silicon carbide arrester. [Kan05]
The silicon carbide arresters with high PD levels did not show any worse performance in the
other tests compared to arresters of the same type with low PD levels. In conclusion, there is no
correlation between PD levels and the durability or operational condition of gapped silicon
carbide arresters. [Kan05]

Tests of SiC arresters gathered from networks in Australia


There is clearly an upturn in the incidence of unsatisfactory insulation resistance after about 8 to
10 years of service, and this is followed by a more marked upturn in failure on power frequency
tests at about 13 to 15 years of service. The incidence of visible deterioration of internal
components, as revealed by inspection, begins to appear significant after only a few years of
service, and rises steadily thereafter. [Dar96]
There are clear differences in the results for four manufacturers' arresters. Some of the
differences are attributable to different ages of the arresters, e.g. in respect of insulation
resistance, arresters of one make, having a median age of 6 years, had a low insulation resistance
failure rate (4%) compared to 32 to 41% for three other makes, with median ages of 11 to 22
years. The same type of comment can be made (but less clearly so) in respect of the results of the
power frequency tests. But most of the arresters with high lightning impulse sparkover failure
rates (31%) were of one make with a median age of 6 years (the newest).
arresters subjected to seal tests were uniformly high among nearly all the makes. The inspection
results of the four makes were mostly similar - high percentages (63 to 100%) showing evidence
of moisture ingress and significant percentages (11 to 31 %) exhibiting electrical damage.
[Dar96]


Comparisons were made of test results for arresters from electricity authorities in different
climates, but with somewhat similar median lengths of service. These showed that, although there
appeared to be a small effect on seal failure rates for service periods of about 7 years, for longer
periods of service the effects due to length of service were much greater than any effect due to
climate. [Dar96]
Of the failed arresters opened for inspection, 74 (about 48% of the 155) showed evidence of
moisture-related degradation. As percentages of the whole sample of 365 arresters included in the
project, the moisture-related degradation and moisture-evident numbers were 20% and 7%
respectively. Block damage such as surface flashover and block puncture or rupture was also
evident in 5.7% of the 365 arresters (13.5% of the inspected arresters). Nine of the 365 arresters
showed external flashover marks; 8 were associated with one particular make of arrester. [Dar96]
Most gapped silicon carbide distribution lightning arresters older than 13 to 15 years have
characteristics which render them no longer suitable for service on a distribution system, and the
authors recommended that all silicon carbide distribution arresters older than 13 years be
progressively replaced by modern metal oxide arresters. [Dar96]

Inspection of failed arresters gathered from networks in Australia


The inspection of gapped MO arresters that failed in service or in laboratory tests showed that
most failures were clearly caused by moisture ingress through ineffective seals, which had
degraded the internal series gap structures and other components. Even gapped MO arresters with
satisfactory test results showed signs of internal degradation due to moisture ingress; clearly the
seals were not adequate. The in-service failure rate of such arresters on one 11 kV system was as
high as 14%. [Dar00]
Internal components of failed polymer housed gapless MO arresters generally showed the effects
of power frequency fault currents which damaged the MO blocks and their protective casings to
varying degrees and which often ruptured the arrester housings. Internal components only
occasionally indicated degradation attributable to moisture ingress. In most cases, it was not
obvious why the arresters had failed. The majority showed surface or near-surface damage to the
MO blocks, which is common with very high temporary over-voltages (WTOV) and with
multipulse (MP) lightning impulse currents. [Dar00]
Apart from inadequate seals on gapped MO arresters, the factor that seems to have most
influence on the performance of metal oxide arresters in the field is the quality of the surface
coatings which surround or encapsulate the MO blocks. [Dar00]


Laboratory moisture ingress tests


The arresters tested can be divided into three groups according to their internal structure. The
arresters with the housing molded directly onto the MO active part had the best performance
under very humid conditions. In fact, arresters of this type of design were the only ones that
passed the long-term tests without failures and that could still be used in the system after the test
cycle. This correlates with the good service experience with approximately 50 000 arresters of
this design installed in tropical countries since 1994. [Lah99]
Other arrester designs seem to have problems with sealing and/or the interfaces between the
different polymeric materials. The types with separately manufactured housings typically showed
no or very little increase of the internal leakage current before failure; therefore, detecting such
arresters before failure seems to be a problem. In the case of the arresters with separately
manufactured housings and with a clear internal gas space, the leakage currents started to rise
soon after the beginning of the humidity stress. [Lah99]

Comparison of SiC and MO arresters


There is little doubt that the frequency of failures for the reasons given varies from system to
system because of differences in the frequency of lightning storms, application practices, and a
number of other factors including the quality of arrester designs in service on the system;
however, the results of most studies appear to fit a similar pattern. The failure causes can be
grouped roughly into three categories: 1) moisture leakage and contamination; 2) overvoltages
including resonance and switching overvoltages; 3) surges of excessive magnitude and duration.
[Sak89]
According to an inspection of 250 failed silicon carbide arresters from a large system, the causes
of failure were: [Sak89]
- Moisture ingress: 85.6%
- Surges: 5.9%
- Contamination: 4.5%
- Misapplication: 2.5%
- Unknown: 1.5%

Moisture leakage and severe contamination should have several times less effect on metal oxide
distribution arresters than on silicon carbide arresters. Failures for these two reasons evidently
account for about 80% or 90% of total failures of silicon carbide arresters; therefore, metal oxide
arresters should be more reliable than silicon carbide arresters from the very important point of
view of moisture leakage and contamination. It should be understood, however, that design and
rigorous quality control practices may result in very high reliability, and less rigorous practices
might result in lower reliability, for either kind of arrester. Both metal oxide and gapped silicon
carbide arresters in conventional porcelain housing will fail, once a leak has been established in
their sealing system. [Sak89]


Metal oxide distribution arresters should be highly reliable in most applications because the
arresters are far less likely than silicon carbide arresters to fail as a result of moisture ingress and
contamination. Metal oxide arresters are more likely to fail as a result of system overvoltages
because they conduct current in response to the overvoltages, and for this reason, somewhat more
care must be exercised in application to match the magnitude and time duration of system
overvoltages to the temporary overvoltage capability of the arresters. [Sak89]

SUMMARY
In the last 25 years, the type of installed surge arresters has changed from porcelain housed
gapped silicon carbide (SiC) arresters to polymer housed gapless metal oxide (MO) arresters. In
the year 2000, approximately 55 % of surge arresters in medium voltage networks in Finland were
porcelain housed gapped SiC arresters, approximately 18 % were porcelain housed gapless MO
arresters and approximately 27 % were polymer housed gapless MO arresters. The age of the SiC
arresters varied between 15 and 50 years, while the oldest MO arresters were 18 years old. Today
most new arresters are polymer housed gapless MO arresters, and older designs will be
phased out.
Depending on the source, the annual rate of surge arrester failures varies between 0.1 and 3 %.
The main cause of failures of SiC arresters is deterioration of internal parts caused by moisture
ingress. The gapless MO arresters are clearly more reliable than SiC arresters, but a possible
internal gas space in porcelain and some separately molded polymeric housings increases the
possibility of moisture penetration and decreases reliability. For the same reason, gapless designs are
in general more reliable than gapped (SiC or MO) designs. At the moment, there is no simple and
reliable means to determine the condition of an in-service surge arrester on site.
The reliability of surge arresters depends on the properties of the varistor elements, the sealing
of the arresters and the working conditions, such as surge magnitudes and the frequency of surge or
fault occurrences. Many studies have shown that the manufacturing quality has a large influence
on the reliability of the arresters, and in some cases the type and quality of the arrester seem to be
more important than its age.


REFERENCES
[Boh93] Bohmann, L.J., McDaniel, J. and Stanek, E.K. Lightning arrester failure and ferroresonance on a distribution system. IEEE Transactions on Industry Applications, Vol. 29, No. 6, November 1993.
[Dar96] Darveniza, M., Mercer, D.R. and Watson, R.M. An assessment of the reliability of in-service gapped silicon-carbide distribution surge arresters. IEEE Transactions on Power Delivery, Vol. 11, No. 4, October 1996.
[Dar00] Darveniza, M., Saha, T.K. and Wright, S. Comparisons of in-service and laboratory failure modes of metal-oxide distribution surge arresters. IEEE Power Engineering Society Winter Meeting, 23-27 January 2000, pp. 2093-2100.
[Goe00] Goedde, G.L., Kojovic, Lj.A. and Woodworth, J.J. Surge arrester characteristics that provide reliable overvoltage protection in distribution and low-voltage systems. IEEE Power Engineering Society Summer Meeting, 16-20 July 2000.
[Grz99] Grzybowski, S. and Gao, G. Evaluation of 15-420 kV substation lightning arresters after 25 years of service. Proceedings of Southeastcon '99, 25-28 March 1999.
[Kan97] Kannus, K., Lahti, K. and Nousiainen, K. Aspects of the performance of metal oxide surge arresters in different environmental conditions. CIRED 97, 2-5 June 1997.
[Kan00] Kannus, K. Keskijänniteverkkojen ukkossuojaus Suomessa: taustaa ja nykytilanne (Lightning protection of medium voltage networks in Finland: background and present state, in Finnish). Lecture notes of Lightning and lightning protection of electric networks, Tampere, Finland, 16 November 2000.
[Kan05] Kannus, K. and Lahti, K. Evaluation of the operational condition and reliability of surge arresters used on medium voltage networks. IEEE Transactions on Power Delivery, Vol. 20, No. 2, April 2005.
[Lah99] Lahti, K., Richter, B., Kannus, K. and Nousiainen, K. Internal degradation of polymer housed metal oxide surge arresters in very humid ambient conditions. IEE High Voltage Engineering Symposium, 22-27 August 1999.
[Lah03] Lahti, K. Effects of internal moisture on the durability and electrical behaviour of polymer housed metal oxide surge arresters. Tampere University of Technology Publications 437, Tampere, 2003.
[Mac94] Mackevich, J.P. Evaluation of polymer-housed distribution arresters for use on rural electric power systems. IEEE Transactions on Industry Applications, Vol. 30, No. 2, March 1994.
[Sak89] Sakshaug, E.C., Burke, J.J. and Kresge, J.S. Metal oxide arresters on distribution systems: fundamental considerations. IEEE Transactions on Power Delivery, Vol. 4, No. 4, October 1989.


FAILURE CAUSES IN MEDIUM VOLTAGE NETWORKS


Kimmo Kivikko
Tampere University of Technology
kimmo.kivikko@tut.fi

INTRODUCTION
Failures of hardware may be defined perhaps most commonly in abstract functional terms such
as: the loss of ability to perform a required function. Failures of equipment or materials may be
defined technically in terms of definitive specification parameters which are not met during
established tests or inspections. Failures of complex systems may be defined in terms of system
functions which are disabled or degraded by the failure. These failures may also be defined in
terms of the degradation or loss of specific functions by the subsystem in which the failure
occurred. [5]
Some types of failures are classified on the basis of the statistical distributions representing their
frequency of occurrence. These include the exponentially distributed random failures having a
constant hazard rate, and the Gaussian or Weibull distributed wearout failures having an
increasing hazard rate. In tracing system failures, often a primary or root-cause failure (which
initiated a series of events leading to the failure of the system), and one or more secondary or
contributing failures (failures which are caused directly or indirectly by the root-cause failure),
are detected. [5]
Failure mechanism defines the physics of the failure. For example, electrical or mechanical
overloads which involve a level of stress which exceeds the rated electrical or mechanical
strength of an item will cause physical damage resulting in loss of functional capability. The
failure mechanism is the process which occurred to change the physical or functional
characteristics, or both, of the materials in the failed item.
Failure mode is defined as the effect by which a failure is observed to occur and is usually
characterized by description of the manner in which failure occurs. It is the description of the
failure itself. A failure mode provides a descriptive characterization of the failure event in generic
terms, not in terms of the failure mechanism, and not in terms of failure effect. The effects of a
failure within a system may be propagated to higher and lower levels of assembly, or the system
design may prevent such propagation.

POWER SYSTEM FAILURES


A power system is composed of a number of components, such as lines, cables, transformers,
circuit breakers, disconnectors etc. Every component in a power system has an inherent risk of
failure. In addition, outside factors influence the possibility of component failure, e.g. the current
loading of the component, damage by third parties (human or animal), trees, and atmospheric
conditions: temperature, humidity, pollution, wind, rain, snow, ice, lightning and solar effects [1].
On the one hand, adding a component can increase reliability; on the other hand, every
component can fail and thus cause a supply interruption.
One important aspect in clarifying the different failure causes in medium voltage networks is
collecting sufficient fault statistics at the company level and at the national or international level.
Failure causes can also be investigated with, e.g., questionnaires. Fault statistics have many uses;
they can be used, for example, in network planning to decide which components and
network structures are most suitable in a certain case.
In general, the bathtub curve is an adequate representation of component failure behaviour over
the lifetime. Early failures occur during the first years and are usually caused by
inherent defects due to poor materials, workmanship, processing procedures or the manufacturer's
quality control, besides installation problems. Random failures are produced by chance or by
operating conditions, such as failures from switching surges or lightning; this failure mode is
usually present at very low percentages. Wearout failures usually result from dielectric or
mechanical material wearout. Normally, the wearout mode becomes predominant after tens of
years of operation, and it can be seen in the curve as an increasing failure rate (see Figure 1).
Figure 1: Bathtub curve (failure rate as a function of time in years).


A questionnaire survey about failure causes was carried out in the USA and Canada by IEEE in
1972, and it offers quite a good overview of different failure causes in power systems. The network
components discussed here are transformers, circuit breakers, disconnect switches, cables, cable
joints and cable terminations. The failures were predominantly flashovers involving ground,
caused by lightning during severe weather or by dig-ins or vehicular accidents. Below is a short
summary of the research from [2]:

- For transformers, damaged insulation in windings or bushings accounted for the majority of the transformer damage. The insulation can be damaged by, e.g., lightning or switching surges, or due to manufacturing and material defects.
- In the case of circuit breakers, the bulk of failures involved flashovers to ground, with damage primarily to the protective device components and the device insulation. Many cases also reported circuit breaker misoperation, i.e. the breaker opens when it should not open.
- Electrical defects, mechanical defects and flashovers to ground resulted in damage to mechanical components and insulation in disconnect switches. Some form of mechanical breakage, contact with foreign objects, or exposure to dust and contaminants accounted for the majority of cases. There were also many unclassified cases.
- For cables, the majority of failures involved flashovers to ground, resulting in insulation damage. The flashovers can be due to, e.g., transient overvoltages, mechanical failures or normal deterioration.
- In the case of cable joints and terminations, the primary damage was to the insulation, involving either a flashover to ground or another electrical defect. The most common failure causes were abnormal moisture, exposure to aggressive chemicals, inadequate installation, severe weather and normal deterioration.

FAILURE CAUSES
Failures can originate from several different causes. Below is a list of different failure causes
and how they affect the network components. Figure 2 presents the failure causes and their
percentages according to the Finnish interruption statistics; the figure relates to the number of
faults causing customer interruptions.

Figure 2: Failure causes and their percentages in Finland in 2003 [4]: wind and storm 35 %, unknown 17 %, snow and ice 13 %, thunder and lightning 12 %, faulty operation or improper materials 11 %, external 6 %, animals 4 %, other weather 2 %.

Lightning
Lightning strikes can be either direct, striking a phase conductor of the network, or indirect,
striking near the network and causing an induced overvoltage. These overvoltages can reach
thousands of kilovolts in the case of direct lightning strikes and some 500 kV in the case of
indirect strikes, and thus they cause much higher voltage stresses on the network components
than the normal operating voltages. Even when surge arresters are used, these overvoltages
can cause failures due to insulation breakdowns in transformers, insulators, cables etc. Back
flashovers, i.e. flashovers due to a lightning strike to a grounded part of the network, can also
cause failures [3]. According to the Finnish interruption statistics from the year 2003, about 12 %
of faults were caused by thunder or lightning [4].

Wind and storm


Winds and storms usually cause faults that originate from trees. Trees falling onto the lines can
cause conductor breakage or poles falling down, and short circuits or earth faults, which lead to
customer interruptions. Tree branches falling onto lines or transformers can also cause short
circuits. According to the Finnish interruption statistics from the year 2003, about 35 % of faults
were caused by wind and storm [4].

Snow and ice


Snow and ice cause failures somewhat similar to those caused by wind and storm. A snow burden
can cause trees to fall onto the lines, causing conductor breakage, short circuits or earth faults. In
certain weather, conductors collect frost and ice, which can break the conductors when they
cannot bear the weight of the ice. According to the Finnish interruption statistics from the
year 2003, about 13 % of faults were caused by snow and ice [4].

Other weather dependent cause


This category includes those weather dependent failures that cannot be assigned to any other
class, for example extremely heavy rain that causes floods in basements etc. 2 % of the faults
were classified into this category in Finland in 2003 [4].

Animals
Animals can cause short circuits and earth faults. Squirrels may climb on top of transformers or
between line insulators and cause faults that way. Birds also sometimes fly into the lines and
cause faults. 4 % of faults in Finland in 2003 were caused by animals [4].

Network owner's actions


About 11 % of faults in 2003 in Finland were due to the network owner's actions [4].
- Neglected maintenance: Proper maintenance can increase the lifetime of network components. Neglected maintenance can lead to premature failures that could have been avoided with maintenance. This also includes line faults due to neglected tree trimming.
- Planning or installation error: This category includes failures caused by planning or installation errors. Usually these are cases where a wrong type of component is planned to be installed in the network.
- Faulty operation: Faulty operation can cause failures, for example if a disconnector is used to break high currents.
- Overload: If overload causes component failures, it is usually a consequence of a planning error or faulty operation.

External parties' actions


According to the Finnish interruption statistics from the year 2003, about 6 % of faults were
caused by the actions of external people [4].
- Digging: Even though distribution network companies usually have cable location services, digging sometimes causes cable failures.
- Vandalism

FAILURE CAUSES BY COMPONENT


Above, the failure causes were classified by the cause of the failure. Below, the classification is
made by component, and more details about the failure mechanisms and causes are presented for
each component. In Finland, failure causes are not collected at the national level, but the FASIT
statistics [7] from Norway give an indication of the significance of different failure causes for
different components.

Transformers
Transformers can fail in a variety of ways and for a variety of reasons. Important factors are
design weaknesses, abnormal system conditions, aged condition, pre-existing faults and the
timescales for fault development [8]. Most transformers in medium voltage networks are paper/oil
insulated. In addition to the stresses caused by the persistent operating voltage, the insulation
ages due to high temperatures caused by load and fault currents, humidity, and mechanical
stresses caused by e.g. fault currents. Small discharges in the insulation weaken the paper
insulation and dissolve gases in the oil. The condition of the oil can be assessed from the gases
dissolved in it, and the condition of the paper insulation can be assessed with e.g. furfural
analysis. There are also other methods, like partial discharge measurements, infrared emission
tests and acoustic emission tests.

Bus bars
In the case of bus bars, the leading contributor was exposure to moisture (30 %) in the insulated
bus category and exposure to dust and other contaminants (19 %) in the bare bus category. Other
major causes were normal deterioration with age, shorting by tools and metal objects (bare
buses), mechanical damage and obstruction of ventilation [6]. Moisture can cause corrosion of
the components, which may damage the insulation so that, e.g., SF6 gas leaks from the bus bar
enclosure. Dust and other contaminants may decrease the breakdown voltage, causing unwanted
breakdowns in the insulation.

Circuit breakers and disconnectors


According to [6], the most common failure contributors in the case of circuit breakers were
lubricant loss, lack of preventive maintenance, overload and normal deterioration with age.
Lubricant loss can damage the breaker mechanically when mechanisms become stuck and do not
work properly. In the case of oil filled circuit breakers, oil loss can damage the live parts of the
breaker when the arc keeps burning after the breaker operation. Overload is usually caused by a
planning error, where the circuit breaker has too small a rated current compared to the fault or
load currents; inadequate circuit breaker sizing can also be due to load growth, network
development or an unusual switching state. Normal deterioration also causes failures: circuit
breakers withstand a limited number of operations even when adequately maintained, and this
number depends on, e.g., the breaking currents and many other factors. Such failures may also be
due to minimizing the assets in the network, when old components are not replaced before
failure. The same failure causes usually apply to disconnectors as well. Failure causes according
to the FASIT statistics are shown in figures 3 and 4.

Figure 3: Circuit breaker failure causes (2003) according to FASIT statistics [7]: unknown 87.2 %, mechanics 5.1 %, live part 5.1 %, isolation to ground 2.6 %, grounding 0.0 %, several components 0.0 %, base 0.0 %, stand/rack 0.0 %.

Figure 4: Disconnector failure causes (2003) according to FASIT statistics [7]: unknown 45.6 %, isolation to ground 32.5 %, live part 17.6 %, grounding 1.4 %, stand/rack 1.4 %, mechanics 0.8 %, several components 0.6 %, base 0.0 %.

Overhead lines
Most overhead line failures are weather or environment dependent. Examples are trees falling
onto the lines, which cause conductor and pole damage, lightning, snow and ice, and animals.

Figure 5: Overhead line failure causes (2003) according to FASIT statistics [7]: phase lines 50.6 %, unknown 32.4 %, isolator 9.6 %, jumper 2.1 %, pole 1.5 %, binding 1.5 %, clamp 0.8 %, grounding 0.4 %, crossarm 0.4 %, spark gap 0.3 %, joint 0.2 %, anchoring 0.1 %, base 0.1 %, stay wire 0.1 %, shield line 0.1 %.

Cables
Most cable failures are due to natural aging of the insulation, cable imperfections or water
treeing. Other failure modes involve corrosion of or damage to the concentric neutral and
metallic ground shields, and the loss of good contact between the metallic shield and the
semiconducting shield. Under certain conditions, corrosion or breaks in the metallic shield, or
poor contact between the metallic shield and the semiconducting shield, will lead to arcing
damage at the metallic shield/semiconducting shield interface, which progresses into the
insulation until failure occurs [9]. Digging also causes cable failures. Even if the cable is not
completely severed, the insulation or the jacket may be damaged; this may lead to ingress of
moisture and chemicals, which leads to water treeing or insulation breakdown.

Figure 6: PEX cable failure causes (2003) according to FASIT statistics [7]: cable 41.9 %, unknown 30.4 %, cable terminal 16.8 %, joint 5.2 %, cable shoe 3.7 %, trifurcating joint 1.6 %, several components 0.5 %.

Figure 7: Paper/oil cable failure causes (2003) according to FASIT statistics [7]: unknown 38.4 %, cable 35.7 %, cable terminal 13.2 %, joint 8.5 %, trifurcating joint 3.5 %, cable shoe 0.8 %, several components 0.0 %.

NORDIC GUIDELINES FOR FAULT STATISTICS


The Nordic countries have long traditions of co-operation in the field of electricity transmission
and distribution, and research on pan-Nordic regulation models is currently ongoing. In the
future there will probably be even more co-operation between the Nordic countries. One
possibility for this is the development of common Nordic fault and interruption statistics.

OPAL
In order to give the network companies (and others) a better decision basis for investments,
operation, maintenance and renewal of the networks, a common Scandinavian project called
OPAL (Optimization of reliability in power networks) was launched in 2002. The objectives of
the project are:

- development of common Scandinavian guidelines for collecting and reporting fault data,
- contributing to a better utilization of fault and interruption data,
- development of methods for the calculation and analysis of reliability of supply in power networks.

One of the activities in the project has been to specify a common database for faults and
interruptions. The content of the future database is presented in Table 1.

Table 1: The content of a future Scandinavian fault database [10].

Identification of the disturbance
  Event identification: Id number
  Time of event: Date and time

Information about the fault(s)
  Fault number: Within the disturbance (1, 2, 3, etc.)
  Reference to disturbance: Reference to event id number
  Network company: Chosen from predefined list of companies
  Network area: Location of the fault, picked from predefined list of areas
  Component id: Unambiguous component id, picked from company specific list of components
  Geographical location: Chosen from list defined by each company
  Network type: Cable network (> 90 % cable), overhead network (> 90 % overhead line), mixed network (remaining)
  Faulty component: Breaker, transformer, overhead line, cable, protection equipment, etc.
  Misc. information about component: Placing, function, type, manufacturer, capacity, year of installation, etc.
  Faulty sub-component: Specific choices for each component
  System voltage: kV
  Earthing system: Impedance, direct, isolated
  Fault type: Earth fault, short circuit, open circuit, missing operation, unwanted operation, etc.
  Primary or secondary fault: Secondary fault includes succeeding and latent fault
  Fault character: Permanent, temporary, intermittent
  Main fault cause: Specific choices grouped under lightning, other environmental causes, external influence, operation and maintenance, technical equipment, other, unknown
  Underlying/contributing fault cause: Same choices as for the main fault cause
  Repair time: Including fault localization
  Repair type: Component replaced, permanently repaired, provisionally repaired, no repair

Information about the consequences
  Interrupted power: kW
  Energy not supplied: kWh
  Number of affected customers: Integer
  Number of affected delivery points: Integer
  Customer interruption duration: Aggregated time for all affected customers
  Delivery point interruption duration: Aggregated time for all affected delivery points
  Total interruption duration: Time from when the first customer is interrupted until the supply is restored to the last customer
  Disconnection type: Automatic, automatic with unsuccessful automatic reconnection, manual, none
  Reconnection type: Automatic, manual, none
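To make the table concrete, the sketch below expresses one fault record as a data structure. The field names follow Table 1, but the class and all identifiers are hypothetical illustrations, not part of the OPAL specification:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class FaultRecord:
    """One fault row in a common fault database (sketch based on Table 1)."""
    # Identification of the disturbance
    event_id: int
    time_of_event: datetime
    # Information about the fault
    fault_number: int            # within the disturbance (1, 2, 3, ...)
    network_company: str         # from a predefined list of companies
    network_type: str            # "cable" (>90 % cable), "overhead", "mixed"
    faulty_component: str        # breaker, transformer, overhead line, cable, ...
    system_voltage_kv: float
    fault_type: str              # earth fault, short circuit, open circuit, ...
    fault_character: str         # permanent, temporary, intermittent
    main_fault_cause: str        # grouped under lightning, environment, etc.
    repair_time_h: float         # including fault localization
    # Information about the consequences
    interrupted_power_kw: float
    energy_not_supplied_kwh: float
    affected_customers: int

record = FaultRecord(
    event_id=1, time_of_event=datetime(2003, 11, 20, 14, 30),
    fault_number=1, network_company="ExampleCo", network_type="overhead",
    faulty_component="overhead line", system_voltage_kv=20.0,
    fault_type="earth fault", fault_character="permanent",
    main_fault_cause="wind and storm", repair_time_h=3.5,
    interrupted_power_kw=800.0, energy_not_supplied_kwh=2800.0,
    affected_customers=350,
)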

NorStat
The NorStat working group was founded in April 2005 to continue the development of the common
Nordic guidelines for fault and interruption statistics and to take care of the implementation of
these statistics. The main assignment is the further development and maintenance of the guidelines
for the registration of faults in networks between 1 and 100 kV. The group is responsible for
collecting fault data and composing an annual publication of Nordic fault and interruption
statistics. The long-term goal is to collect all data in a common database, from which the Nordic
network companies could retrieve the statistics they want through a web application.

REFERENCES
[1] Lakervi, E. and Holmes, E. J. Electricity distribution network design. IEE Power Series 21, Peter Peregrinus Ltd, 1998.
[2] IEEE Recommended Practice for the Design of Reliable Industrial and Commercial Power Systems. IEEE Std 493-1997. ISBN 1-55937-969-3.
[3] Aro, M., Elovaara, J., Karttunen, M., Nousiainen, K. and Palva, V. Suurjännitetekniikka (High voltage engineering, in Finnish). Espoo, 1996. 483 p.
[4] Keskeytystilasto 2003 (Interruption statistics 2003, in Finnish). Sähköenergialiitto Sener ry. 21 p.
[5] IEEE Guide to the Collection and Presentation of Electrical, Electronic, Sensing Component, and Mechanical Equipment Reliability Data for Nuclear-Power Generating Stations. IEEE Std 500-1984.
[6] Paoletti, G. and Baier, M. Failure Contributors of MV Electrical Equipment and Condition Assessment Program Development. IEEE Transactions on Industry Applications, Vol. 38, No. 6, November/December 2002.
[7] www.fasit.no
[8] Lapworth, J. and McGrail, T. Transformer Failure Modes and Planned Replacement. IEE Colloquium on Transformer Life Management (Ref. No. 1998/510).
[9] Abdolall, K., Halldorson, G. and Green, D. Condition Assessment and Failure Modes of Solid Dielectric Cables in Perspective. IEEE Transactions on Power Delivery, Vol. 17, No. 1, January 2002.
[10] Heggset, J., Christensen, J., Jansson, S., Kivikko, K., Heieren, A. and Mork, R. Common Guidelines for Reliability Data Collection in Scandinavia. Proceedings of CIRED 2005.

EFFECTS OF CORRELATION BETWEEN FAILURES AND POWER CONSUMPTION ON CUSTOMER INTERRUPTION COST

A paper for the course Asset Management in Power Systems
Patrik Hilber
Royal Institute of Technology
hilber@ets.kth.se

Abstract
The concept of this paper is to study the correlation between failures and power
consumption in the Kristinehamn network. This is done in order to scrutinize whether
the commonly used assumptions of constant failure rate and constant power consumption
are reasonable for reliability calculations. The studied entity is energy not delivered,
which is assumed to be a good estimate of how customer interruption costs are affected
by the assumptions.

The effect of the seasonal variations is that the constant case underestimates the
energy not delivered by approximately 2 %.

The effect of the daily variations is that the constant case underestimates the
energy not delivered by approximately 6 %.

The effect of repair time variations is that the constant case overestimates the
energy not delivered by approximately 1 %.

The conclusion of this paper is that by using constant failure rates, repair rates and power
consumption, the approximation of customer costs becomes somewhat low, i.e. by 7 % for
the studied case. This result indicates that the assumptions of constant failure rates, repair
rates and power consumption are quite sufficient, at least for the actual case study, since
this error is probably significantly smaller than other types of errors, for example in
customer outage cost estimates. Nevertheless, having performed these calculations, the
current results should be applied in further modeling of the Kristinehamn network.

1 Introduction
Customer interruption cost due to loss of supply is of crucial importance for electrical
network owners, since this cost can be used as a measure of the reliability worth
of a network [1]. One example of this focus on customer interruption cost is the fact that
the Swedish Energy Agency, the government body that regulates network tariffs, will
apply a newly developed network performance assessment model for determining the
maximum tariffs. In this model, one of the most important factors is customer
interruption cost.
The concept of this paper is to study the correlation between failures and power
consumption for the Kristinehamn network. This is done in order to scrutinize whether
the commonly used assumption of constant failure rate and average power consumption
is reasonable for the calculation of customer interruption costs.
Similar studies have been performed, presented for example in [2] and [3]. In [2], one
major conclusion is that by using time varying interruption costs, the expected
interruption cost for the industrial sector becomes significantly lower. However, the
generally used customer damage function [2] undervalues the interruption cost compared
to a method based on normally distributed cost factors. In [3], the interruption cost for
customers is studied with respect to varying repair rates, failure rates and loads. It is
noted that such data can be used, for example, for deciding when maintenance shall be
performed. It is concluded that using constant interruption cost rates significantly
underestimates the average interruption cost.
This paper performs a similar study, but on an additional network (Kristinehamn), with
different data and with a somewhat different approach, and hence complements the earlier
work within this area. This study also validates and scrutinizes the assumption of average
power and constant failure and repair rates for the calculation of customer interruption
cost, an assumption made in [4].

2 The network
The Kristinehamn 11 kV network, for which this study has been performed, is located in
the western part of Sweden. The network is depicted in Fig. 1. The urban network
consists of approximately 170 km of underground cables, and the rural network comprises
seven separate overhead line systems with a total line length of 110 km [5]. Although the
overhead line systems are operated radially, it is possible to close cross connections
between some of them during disturbances in order to shorten customer outage times.
The urban part of the network allows many opportunities for reconnection in the case
of failures of the normal feeding. Some additional data for the network can be found in
Table I.

Figure 1. The Kristinehamn network [5].


Table I. Data for the Kristinehamn network [5].

Number of customers: 10 900
Energy, GWh/yr: 230
Investments 2004, kkr: 6 364
Operation and maintenance 2004, kkr: 6 810
Interruptions per customer and year¹: 0.82
Average interruption time per customer and year¹, min: 42
Underground cable, km (incl. 0.4 kV): 684
Insulated overhead line, km: 108
Uninsulated overhead line, km: 106
Total line length, km: 898
Uninsulated to insulated conductor for year 2004, m: 7 200

¹ Planned and unplanned interruptions for 2004.

3 Approach/Method
A common assumption in reliability calculations is that failure rate and consumption are
evenly distributed over time. This has an effect on the estimates of customer interruption
cost. The approach of this paper is to study the effect on customer interruption cost of
the fact that consumption and failure rate to a certain degree coincide. This is performed
in order to evaluate the above-mentioned assumption.
The data used in this paper are based on an extensive study of the Kristinehamn network
presented in [5]. The data for failure and repair rates have been collected over a period of
10 years, while data for 3 years of power consumption have been used. Failure rate, repair
rate and consumption have been divided into time intervals. Based on these intervals, the
expected undelivered energy is calculated and compared with the corresponding values
calculated from the assumed constant values of load, failure rate and repair rate.
Three different aspects are studied:
1. Effects of seasonal variations.
2. Effects of daily variations with respect to load and failure rate.
3. Effects of daily variations with respect to load and repair rate.
For each aspect, a percentage is established which indicates by how much the constant case
ought to be modified in order to reach a better approximation of the actual customer
interruption cost (or, more precisely, to accommodate the specific aspect). The percentage
is established by identifying the expected averages for the constant case and for the case
based on distributions of load, failure rate and repair rate.
The studied entity in this report is energy not delivered. It is assumed that the relative
difference for this entity is a good approximation of the differences for customer
interruption costs.
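The comparison can be summarised as follows: with profiles for the failure rate, the repair duration and the load over the chosen time intervals, the expected energy not delivered is the sum over the intervals of failure rate times repair time times load, which is compared with the product of the averages. The sketch below illustrates this with made-up 24-hour profiles; it does not use the Kristinehamn data:

import numpy as np

def ens_relative_difference(failure_rate, repair_time, load):
    """Compare expected energy not delivered (ENS) computed from
    time-interval profiles with the constant-average case.
    All three arrays must cover the same time intervals."""
    # Interval-based ENS: failures in interval * repair duration * interrupted load.
    ens_profile = np.sum(failure_rate * repair_time * load)
    # Constant case: all three quantities replaced by their averages.
    n = len(load)
    ens_constant = np.mean(failure_rate) * np.mean(repair_time) * np.mean(load) * n
    return (ens_profile - ens_constant) / ens_constant

# Hypothetical hourly profiles (illustrative values only).
hours = np.arange(24)
lam = 0.01 + 0.005 * np.sin((hours - 9) / 24 * 2 * np.pi)    # failures per hour
rep = 2.0 + 1.5 * np.cos(hours / 24 * 2 * np.pi)             # hours per failure
load = 10000 + 4000 * np.sin((hours - 10) / 24 * 2 * np.pi)  # kW

print(f"relative difference: {ens_relative_difference(lam, rep, load):+.1%}")

A positive result means the constant case underestimates the energy not delivered, which is the situation found for the daily and seasonal variations in this study.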

4 Results
4.1 Effects of seasonally correlated power consumption and failure rate
By dividing the power consumption and the failures per month, the effect of seasonal
variations can be studied. In Fig. 2, the total number of failures over the studied 10-year
period is allocated to the month in which the failures occurred. In Fig. 3, the average energy
consumption per month is shown.
The effect of the seasonal variations is that the constant case underestimates the energy
not delivered by approximately 2 %.

Figure 2. Number of failures per month (total over the studied 10-year period).

Figure 3. Average energy consumption per month in GWh (2000-2002).

4.2 Effects of daily correlated power consumption and failure rate
By dividing the power consumption and the failures per hour of day, the effect of daily
variations can be studied. In Fig. 4, the total number of failures over the studied 10-year
period is allocated to the hour of day at which the failures occurred. In Fig. 5, the average
power per hour of day is shown. The power data excludes the power drawn by larger
industries.

The effect of the daily variations is that the constant case underestimates the energy not
delivered by 5 %. If larger industries are incorporated, the percentage becomes
approximately 6 %.
Figure 4. Failure rate per hour of day [f/h].


Figure 5. Average hourly power in kW for Kristinehamn over three years (2000-2002) [4]. Note that the power consumption of larger industries is not included in the plot.

4.3 Effects of daily power consumption and repair time
In this section, the implications of the repair rate varying with the hour of day are
studied. For the Kristinehamn network, the repair rate can in general be said to be higher
when most failures occur and when demand is high. In Fig. 6, the average repair time per
hour of day can be seen. It is interesting to note that the longest repair times occur close to
midnight (this might be explained by Kristinehamn being infested with ghosts that make
the repair work harder to perform).
The correction caused by the repair time variation becomes 1 %, i.e. the constant case
overestimates the costs slightly. Hence, this factor to some extent cancels the two
previously presented factors.

This result might be somewhat surprising. At a first glance, the reader might be led to
believe that the percentage ought to be lower, since the duration of interruption has a
shape nearly opposite to that of the power consumption (i.e. low power consumption
corresponds to long interruption durations and vice versa), as seen by comparing Fig. 5
and Fig. 6. However, since few failures occur during nighttime, as seen in Fig. 4, the
actual effect of the variation of repair time becomes relatively small (1 %).

Figure 6. Average duration of interruption in hours as a function of hour of day.

4.4 Total effect of studied correlations
When the effects presented in 4.1, 4.2 and 4.3 are weighed together and considered
simultaneously, the total correction becomes 7 %. It is interesting to note that this result
concurs with the results in [2] and [3], i.e. by taking distributions into account the
expected result becomes higher.

5 Conclusion
The conclusion of this paper is that by using constant failure rates, repair rates and power
consumption, the approximation of customer costs becomes somewhat low, i.e. by 7 % for
the studied case. This result indicates that the assumptions of constant failure rates, repair
rates and power consumption are quite sufficient, at least for the actual case study, since
the actual error is probably significantly smaller than other types of errors, for example in
customer outage cost estimates. Nevertheless, having performed these calculations, the
current results should be applied in further modeling of the Kristinehamn network.
Furthermore, this paper points out that it is important to scrutinize assumptions generally
made in reliability modeling.

6 Discussion and future work


Today, maintenance is performed mainly during daytime, with correspondingly high
interruption costs; the benefits of performing maintenance during other hours of the day
might be interesting to analyze.
In this study, three factors have been studied (hour of day for the repair rate and for the
failure rate, and seasonal variations). There might be other factors, not analyzed in this
paper, that could prove important.
This study has gone deeper into the mechanisms causing interruption costs than what is
usually done in reliability calculations. However, it is still possible to investigate the
topic further, e.g. by studying all individual failures and calculating their individual
costs. With such an approach the constant case could be evaluated even more thoroughly.
Another approach for more detailed studies is to use Monte Carlo simulation techniques.
For example, different types of customers and their specific load patterns could be
combined with varying failure and repair rates, in order to establish, for example,
distribution curves for customer interruption costs and a more accurate estimate of the
total expected interruption cost.

References
[1] Billinton, R. and Allan, R. N. (1996). Reliability evaluation of power systems, 2nd ed. Plenum Press, New York.
[2] Jonnavithula, A. and Billinton, R. Features that influence composite power system reliability worth assessment. IEEE Trans. Power Systems, Vol. 12, No. 4, 1997.
[3] Kjølle, G. H., Holen, A. T., Samdal, K. and Solum, G. Adequate interruption cost assessment in a quality based regulation regime. Porto Power Tech Conference 2001, Portugal.
[4] Hilber, P. (2005). Component reliability importance indices for maintenance optimization of electrical networks. Licentiate thesis, ETS, KTH. ISBN 91-7178-055-6.
[5] Hällgren, B. and Hilber, P. (2004). An optimal plan for overhead line replacement in Kristinehamn: background material (in Swedish). Department of Electrical Engineering, KTH, Sweden.
[6] Tapper, M. et al. (2003). Electric power interruption costs 2003 (in Swedish). Swedenergy.

OPTIMAL STRATEGY FOR VARIANCE REDUCTION BASED ON COMPONENT IMPORTANCE MEASURES IN ELECTRICAL DISTRIBUTION SYSTEMS

Torbjörn Solver
School of Electrical Engineering, KTH
torbjorn.solver@ets.kth.se
INTRODUCTION
Reliability analysis conducted using Monte Carlo simulation (MCS) techniques can be effective
within most areas where reliability is of the essence. One such area is RCAM, Reliability
Centred Asset Management, where the objective is to manage the assets of a system based on the
system components' lifecycle, i.e. their tendency to fail, the ability to repair and maintain them,
and the system's function. The assets of the system can basically be anything: technical, financial
or human assets. However, the most straightforward application is within technical systems, such
as power systems, where reliability analysis is an important and powerful tool to plan investments,
reinvestments and maintenance, i.e. to manage the assets. This assignment focuses on how to make
the reliability analyses faster and more accurate.
The data processing capability of modern computers and its current development rate enable
advanced systems and techniques to be evaluated and analysed using MCS. However, even
though computers are becoming faster and more effective in handling large quantities of data and
calculations, one should always strive to decrease simulation time by minimizing the
number of samples while keeping the estimation variance low. This can be achieved first of all by
making the evaluated systems as simple as possible, and also by applying some kind of variance
reduction technique.
In this assignment, importance sampling is used as the variance reduction technique. As the name
states, the key to successful importance sampling is to know which components are
important for the system's function. If the technique is performed poorly, the effect can be the
opposite of what was intended, i.e. the variance increases and more samples are needed for a
sufficiently accurate result than would have been needed with simple sampling. Further,
there are several indices for component importance with different focuses on the importance
aspects. The goal of this assignment is to investigate the connection between some of these
indicators and the optimal weighting of system components, and thereby be able to identify a
strategy for importance sampling.
MONTE CARLO SIMULATIONS
The general concept of MCS methods is to use random numbers to generate the system's possible
states; this procedure is then repeated for a sufficient number of samples in order to capture
the system's stochastic behaviour [1]. To shed light on the fundamental
differences between stochastic methods such as MCS and the standard analytical method, the
following example can be considered.

Example 1: Determine the bull's eye's area percentage of a dart board.


The analytical approach would be to measure the diameters of the dartboard and
the bull's eye, calculate their areas and thereby the bull's eye percentage.
The MC approach, on the other hand, would be to let a computer throw a sufficient number
of darts; if we assume the darts are evenly distributed over the board, and hit only the board, we
would be able to estimate the bull's eye percentage simply by calculating how many out of the
total number of darts hit the bull's eye.
The MC solving method for a simple example such as this seems quite complicated, and it is. But
the main advantage of MC methods is their usefulness in situations where the
analytical methods become complicated, either due to the sheer size of the system or because
more than the mean output value is of interest, e.g. the variance and the distribution.
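A minimal sketch of the MC approach to example 1, with assumed radii; the analytical answer is simply the squared ratio of the radii:

import random

def bulls_eye_fraction(n_darts, board_radius=1.0, bull_radius=0.07):
    """Monte Carlo estimate of the bull's eye area fraction.
    Darts are assumed uniformly distributed over the board only."""
    hits = 0
    for _ in range(n_darts):
        # Sample a uniform point on the disc by rejection from the bounding square.
        while True:
            x = random.uniform(-board_radius, board_radius)
            y = random.uniform(-board_radius, board_radius)
            if x * x + y * y <= board_radius ** 2:
                break
        if x * x + y * y <= bull_radius ** 2:
            hits += 1
    return hits / n_darts

print(bulls_eye_fraction(100000))  # analytical value: (0.07 / 1.0)**2 = 0.0049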
In the previously mentioned dart example, it is quite obvious for those familiar with darts that it
would take a significant number of darts to achieve an accurate result. The variance of the
estimate m_Y of the expectation of a stochastic variable Y can be calculated as [2]:

\mathrm{Var}[m_Y] = \frac{\mathrm{Var}[Y]}{n} \qquad (1)

where m_Y is the mean value of Y, and n is the number of randomly selected samples of Y, i.e.
y_1, y_2, ..., y_n. The variance of the estimate decreases as the number of samples increases, which
seems quite natural. The following example illustrates how the estimated value can depend on
the number of samples.
Example 2: Consider the systems illustrated in figure 1 and determine each system's unavailability.
Figure 1. Systems with components in series and in parallel, systems 1 and 2 respectively (P1 = P2 = 0.9).


Both systems contain two components, each with an availability of 0.9. Each system can
consequently be in four states. The probability of each system state and the system's function
are presented in Table 1, where 1 indicates that the component/system is working and 0 the opposite.

Table 1. States and probabilities of the systems illustrated in figure 1.

Component 1   Component 2   Probability   System 1   System 2
1             1             0.81          1          1
1             0             0.09          0          1
0             1             0.09          0          1
0             0             0.01          0          0

Analytical solution: the system unavailabilities can be calculated as:

U_{sys1} = 1 - (1 \cdot 0.81 + 0 \cdot 0.09 + 0 \cdot 0.09 + 0 \cdot 0.01) = 0.19
U_{sys2} = 1 - (1 \cdot 0.81 + 1 \cdot 0.09 + 1 \cdot 0.09 + 0 \cdot 0.01) = 0.01
If we perform MCS on the same systems for different numbers of samples, we can see how the
estimate of the unavailability changes. Examples of such results are illustrated in figures 2 and
3. The expected value of the unavailability fluctuates for low numbers of samples, but seems to
converge towards the analytically calculated values of 0.19 and 0.01 for systems 1 and 2
respectively. The simulation of system 2 seems to require more samples in order to converge
properly.
Figure 2. System 1 of example 2, simple sampling with the number of samples ranging from 1 to 5000.

Figure 3. System 2 of example 2, simple sampling with the number of samples ranging from 1 to 5000.
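A minimal simple-sampling sketch of example 2 (not the author's original code): each sample draws the states of the two components, and the unavailability estimate is the fraction of samples in which the system is failed:

import random

def simulate_unavailability(n_samples, p=0.9, parallel=False):
    """Simple-sampling MC estimate of system unavailability for two
    components with availability p, in series or in parallel."""
    failed = 0
    for _ in range(n_samples):
        c1 = random.random() < p  # True = component 1 working
        c2 = random.random() < p  # True = component 2 working
        system_up = (c1 or c2) if parallel else (c1 and c2)
        if not system_up:
            failed += 1
    return failed / n_samples

print(simulate_unavailability(5000, parallel=False))  # converges towards 0.19
print(simulate_unavailability(5000, parallel=True))   # converges towards 0.01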
Generally, the more complex the system, the more samples are needed; also, safe systems with
high availability require large numbers of samples in order to get a sufficiently good
representation of the system. In other words, when performing MCS on large power systems,
there is a need to keep the number of samples low, and as mentioned earlier one way to do this is
to use variance reduction techniques. The variance that is referred to is the variance of the
estimate presented in equation 1, not the variance of the variable. The variance reduction
technique used in this assignment is importance sampling.
Importance Sampling
The aim of importance sampling is to improve the estimation by concentrating the samples on the
interesting part of the population. This is achieved by using another probability distribution than
the real one.
Principle [2]: Let Y be a random variable with a known density function f_Y defined on the sample
space \Omega. We seek E[X], where X = g(Y), but instead of sampling X we introduce Z, with density
function f_Z, which is called the importance sampling function. f_Z should be such that f_Z(\omega) > 0
for all \omega \in \Omega. For each outcome Y = \omega we have

X = g(\omega), \qquad Z = g(\omega) \frac{f_Y(\omega)}{f_Z(\omega)}

We can verify that E[Z] equals E[X] [1]:

E[X] = \int_{\Omega} g(\omega) f_Y(\omega) \, d\omega

E[Z] = \int_{\Omega} g(\omega) \frac{f_Y(\omega)}{f_Z(\omega)} f_Z(\omega) \, d\omega = \int_{\Omega} g(\omega) f_Y(\omega) \, d\omega

If we return to the dart board example, importance sampling applied to that problem would
mean that we simply enlarge the bull's eye. By doing this, more darts would hit the
bull's eye, and if the darts were weighted by the ratio between the real and the enlarged hit
probabilities, the result would be as good as the simple sampling solution, but fewer darts would
be needed. Now, by applying importance sampling to example 2, illustrated in figure 1, we can
observe how importance sampling improves the result in such a way that fewer samples are
needed; see figures 4 and 5. The component unavailabilities used in the importance sampling are
presented in table 2.
Figure 4. System unavailability of system 1 in example 2, two components in series.

Figure 5. System unavailability of system 2 in example 2, two components in parallel.


Table 2. Real and sampled unavailability for example 2.

                                          System 1   System 2
Component unavailability (real)           0.1        0.1
Component unavailability (imp. sampled)   0.5        0.9

The components' weighted unavailabilities used for the importance sampling of example 2 were
chosen arbitrarily, not at random and not optimally. What can be seen in figures 4 and 5 is that
importance sampling seems to be very effective for parallel structures.
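A sketch of the importance-sampled version of example 2, under the weights of Table 2: components are sampled with an inflated unavailability, and each sample is weighted by the likelihood ratio between the real and the sampling distributions (not the author's original code):

import random

def is_unavailability(n_samples, q_real=0.1, q_samp=0.9, parallel=True):
    """Importance-sampling MC estimate of system unavailability for two
    components. q_real is the true component unavailability, q_samp the
    inflated unavailability used for sampling (0.9 as in Table 2)."""
    total = 0.0
    for _ in range(n_samples):
        weight = 1.0
        states = []
        for _ in range(2):
            comp_failed = random.random() < q_samp
            states.append(comp_failed)
            # Likelihood ratio f_Y / f_Z for this component state.
            weight *= (q_real / q_samp) if comp_failed else ((1 - q_real) / (1 - q_samp))
        system_failed = all(states) if parallel else any(states)
        if system_failed:
            total += weight
    return total / n_samples

print(is_unavailability(1000, parallel=True))  # converges towards 0.01

Because the rare event (both components failed) is now sampled frequently and down-weighted, far fewer samples are needed than with simple sampling, which is exactly the effect visible in figure 5.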
COMPONENT IMPORTANCE
In any system containing more than one component, some components are more important for
the system's function than the others. For instance, a component in series with the rest of the
system, i.e. in a cut set of order 1, is generally more important than a component in parallel with
others, i.e. part of a cut set of higher order. There are several component importance measures;
some will be used in this assignment.¹ Component importance measures are primarily used to rank
and classify the components within a system. Using importance measures to indicate how to
weight the components when performing MCS with importance sampling as variance reduction
is not a common application.

¹ These component importance measures were included due to their suitability to systems structured as in this assignment and because the author is familiar with these measures.

In this section we will consider n independent components with probabilities p_i(t), i = 1, ..., n,
at time t. The system reliability is h(p(t)).
Birnbaum's measure
Definition [3]: Birnbaum's measure of importance of component i at time t is:

I_B(i|t) = \frac{\partial h(p(t))}{\partial p_i(t)}, \quad i = 1, 2, ..., n \qquad (6)

Birnbaum's measure is basically a sensitivity analysis of changes in system reliability due to
component i. If I_B is large, a rather small change in component i's reliability will have large
consequences for the system reliability at time t. By using tree notation and pivotal
decomposition on the definition of I_B, Birnbaum's importance measure can be written as [3]:

I_B(i|t) = \frac{\partial h(p(t))}{\partial p_i(t)} = h(1_i, p(t)) - h(0_i, p(t)) \qquad (7)

Note in equation 7 that Birnbaum's importance measure of component i only depends on the
structure of the system and the reliability of the other components.
Critical importance
Definition [3]: The critical importance measure I_CR(i|t) of component i at time t is the
probability that component i is critical for the system and is failed at time t, when we know
that the system is failed at time t:

I_{CR}(i|t) = \frac{I_B(i|t) \, (1 - p_i(t))}{1 - h(p(t))} \qquad (8)

The critical importance measure is suitable for prioritising maintenance measures; it is, in other
words, the probability that component i has caused the system failure.
Fussel-Vesely's measure
Definition [3]: Fussel-Vesely's measure of importance, I_FV(i|t), is the probability that at least
one minimal cut set that contains component i is failed at time t, given that the system is
failed at time t. A cut set is failed when all components in it are failed.

I_{FV}(i|t) = \frac{\Pr(D_i(t))}{\Pr(C(t))} \approx \frac{\sum_{j=1}^{m} Q_j^i(t)}{Q_0(t)} \qquad (9)

where D_i(t) states that at least one of the minimal cut sets containing component i has failed at
the time t, and C(t) states that the system is failed at time t. Q_j^i(t) denotes the probability that the
minimal cut set j, containing component i, is failed at time t. Q_0(t) is the system unavailability,
i.e. 1 - h(p(t)). How to derive equation 9 can be found in [3].
TEST SYSTEM

The test system used to analyse the optimal weighting of components based on component
importance measures is shown in figure 6. It is a 40/10 kV substation, containing primary as
well as secondary side breakers (B), bus bars (BB) and two parallel transformers (T).
Figure 6. Test system (40/10 kV substation) for examples 3, 4 and 5.


The system is analysed in three variants: one with single bus bars but double transformers, one
with double transformers and double bus bars, and one with a single transformer and double bus
bars. All variants are assumed to be systems of independent components. The systems are fully
automated, and operations such as switching between bus bars do not cause any interruption.
When there are two transformers in the system, the substation is operated as a meshed grid. Only
passive faults are assumed to occur in the breakers, which means that faults in breakers and
transformers only affect the downstream network and parallel routes are not affected. This is of
course a simplified way to represent a substation; more thorough descriptions of how to represent
the different substation components for system reliability studies can be found in, among others,
[4]. Figures 7, 8 and 9 show the block diagrams of examples 3, 4 and 5 of the test system,
henceforth simply referred to as systems 3, 4 and 5 respectively.
Figure 7. Block diagram of system 3.

Figure 8. Block diagram of system 4.

Figure 9. Block diagram of system 5.


Table 3 shows the component types and availabilities; the availability is assumed to be constant,
i.e. independent of time.
Table 3. Component specification and reliability data for examples 3, 4 and 5.

System 3                           System 4                           System 5
1  Bus bar 40 kV         0.95      1  Bus bar 40 kV          0.95     1  Bus bar 40 kV         0.95
2  Breaker 40 kV         0.93      2  Bus bar 40 kV          0.95     2  Bus bar 40 kV         0.95
3  Transformer 40/10 kV  0.99      3  Breaker 40 kV          0.93     3  Breaker 40 kV         0.93
4  Breaker 10 kV         0.93      4  Transformer 40/10 kV   0.99     4  Transformer 40/10 kV  0.99
5  Breaker 40 kV         0.93      5  Breaker 10 kV          0.93     5  Breaker 10 kV         0.93
6  Transformer 40/10 kV  0.99      6  Breaker 40 kV          0.93     6  Bus bar 10 kV         0.95
7  Breaker 10 kV         0.93      7  Transformer 40/10 kV   0.99     7  Bus bar 10 kV         0.95
8  Bus bar 10 kV         0.95      8  Breaker 10 kV          0.93
                                   9  Bus bar 10 kV          0.95
                                   10 Bus bar 10 kV          0.95

In order to be able to analytically calculate the availability of systems 3, 4 and 5, each example's
structure function \phi(X(t)) must be determined. The structure function is a function of the state
vector X(t) = (X_1(t), X_2(t), ..., X_n(t)), which contains the states of each component
in the system. Further details on how to derive structure functions can be found in [3]. The
structure functions for systems 3, 4 and 5 are:

\phi_{S3}(X(t)) = X_1 X_8 (1 - (1 - X_2 X_3 X_4)(1 - X_5 X_6 X_7)) \qquad (10)

\phi_{S4}(X(t)) = (1 - (1 - X_1)(1 - X_2))(1 - (1 - X_9)(1 - X_{10}))(1 - (1 - X_3 X_4 X_5)(1 - X_6 X_7 X_8)) \qquad (11)

\phi_{S5}(X(t)) = (1 - (1 - X_1)(1 - X_2)) X_3 X_4 X_5 (1 - (1 - X_6)(1 - X_7)) \qquad (12)

Since all components appear only once in each equation, the structure functions are in their basic
form; hence there is at this point no need to expand the functions [3]. By replacing the component
state variables X_i(t) with the component probabilities p_i(t) we can analytically calculate the
system probabilities:

h_{S3}(p) = \Pr(X_{S3}(t) = 1) = p_1 p_8 (1 - (1 - p_2 p_3 p_4)(1 - p_5 p_6 p_7)) = 0.8839

h_{S4}(p) = (1 - (1 - p_1)(1 - p_2))(1 - (1 - p_9)(1 - p_{10}))(1 - (1 - p_3 p_4 p_5)(1 - p_6 p_7 p_8)) = 0.9744

h_{S5}(p) = (1 - (1 - p_1)(1 - p_2))(1 - (1 - p_6)(1 - p_7)) p_3 p_4 p_5 = 0.8520
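The three availabilities can be verified numerically by evaluating the structure functions (equations 10 to 12) with the component probabilities of Table 3, since every component appears only once. A small verification sketch:

def h_s3(p):
    # Equation 10: bus bars 1 and 8 in series with two parallel branches (2-4, 5-7).
    branch1 = p[1] * p[2] * p[3]
    branch2 = p[4] * p[5] * p[6]
    return p[0] * p[7] * (1 - (1 - branch1) * (1 - branch2))

def h_s4(p):
    # Equation 11: parallel 40 kV bus bars (1,2), parallel 10 kV bus bars (9,10),
    # and two parallel branches (3-5, 6-8).
    bb40 = 1 - (1 - p[0]) * (1 - p[1])
    bb10 = 1 - (1 - p[8]) * (1 - p[9])
    branches = 1 - (1 - p[2] * p[3] * p[4]) * (1 - p[5] * p[6] * p[7])
    return bb40 * bb10 * branches

def h_s5(p):
    # Equation 12: parallel bus bars (1,2) and (6,7) around a series branch (3-5).
    bb40 = 1 - (1 - p[0]) * (1 - p[1])
    bb10 = 1 - (1 - p[5]) * (1 - p[6])
    return bb40 * p[2] * p[3] * p[4] * bb10

p3 = [0.95, 0.93, 0.99, 0.93, 0.93, 0.99, 0.93, 0.95]               # Table 3, system 3
p4 = [0.95, 0.95, 0.93, 0.99, 0.93, 0.93, 0.99, 0.93, 0.95, 0.95]  # Table 3, system 4
p5 = [0.95, 0.95, 0.93, 0.99, 0.93, 0.95, 0.95]                    # Table 3, system 5
print(round(h_s3(p3), 4), round(h_s4(p4), 4), round(h_s5(p5), 4))  # 0.8839 0.9744 0.852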

RESULTS

The results from the main analysis are presented for component importance and MCS separately.
This is followed by some conclusions and a short discussion.
Component importance

Using the previously described component importance measures, we can calculate each
component's importance for systems 3, 4 and 5.
In order to use Birnbaum's component importance measure, one can either expand the structure
function (equation 6) or differentiate the structure function for each component (equation 7),
whichever is simpler. For example 3, we will expand the structure function and differentiate it:

h_{S3}(p) = p_1 p_8 (1 - (1 - p_2 p_3 p_4)(1 - p_5 p_6 p_7)) = p_1 p_5 p_6 p_7 p_8 + p_1 p_2 p_3 p_4 p_8 - p_1 p_2 p_3 p_4 p_5 p_6 p_7 p_8

If we now differentiate h_{S3}(p) with respect to each component in the system, we can calculate
Birnbaum's importance measure for each component. In order to save space, only the calculation
of I_B(1) is shown here; the rest are presented in Table 4.

I_3^B(1) = \frac{\partial h_{S3}(p)}{\partial p_1} = p_5 p_6 p_7 p_8 + p_2 p_3 p_4 p_8 - p_2 p_3 p_4 p_5 p_6 p_7 p_8 = 0.9304

For example 4, expanding the structure function of system 4 is a whole other ball game. We will
therefore use the other method, equation 7. Birnbaum's importance measure for component 1 in
system 4 is:

I_4^B(1) = \frac{\partial h_{S4}(p)}{\partial p_1} = h(1_1, p(t)) - h(0_1, p(t)) = 0.9769 - 0.9280 = 0.0489

Birnbaum's importance measure for component 1 in system 5, using the same method as for
example 4:

I_5^B(1) = \frac{\partial h_{S5}(p)}{\partial p_1} = h(1_1, p(t)) - h(0_1, p(t)) = 0.8541 - 0.8114 = 0.0427

The critical importance measure can now be calculated using equation 8, which is based on
Birnbaum's measure. Again, only I_CR(1) for systems 3, 4 and 5 is calculated here; the rest are
listed in Table 4.

I_3^{CR}(1) = \frac{I_3^B(1)(1 - p_1)}{1 - h_{S3}(p)} = \frac{0.9304 \cdot (1 - 0.95)}{1 - 0.8839} = 0.4005

I_4^{CR}(1) = \frac{I_4^B(1)(1 - p_1)}{1 - h_{S4}(p)} = \frac{0.0489 \cdot (1 - 0.95)}{1 - 0.9744} = 0.0956

I_5^{CR}(1) = \frac{I_5^B(1)(1 - p_1)}{1 - h_{S5}(p)} = \frac{0.0427 \cdot (1 - 0.95)}{1 - 0.8520} = 0.0144

In order to determine the Fussell-Vesely importance measure, the minimal cut sets of systems 3, 4 and 5 must be determined. The minimal cut sets are:
System 3: {1},{2,5},{2,6},{2,7},{3,5},{3,6},{3,7},{4,5},{4,6},{4,7},{8}
System 4: {1,2},{3,6},{3,7},{3,8},{4,6},{4,7},{4,8},{5,6},{5,7},{5,8},{9,10}
System 5: {1,2},{3},{4},{5},{6,7}
Using equation 9, the Fussell-Vesely importance measure can now be calculated for component 1 in systems 3, 4 and 5.

I_3^FV(1) = q1 / Q0 = (1 - p1) / (1 - h_S3(p)) = 0.4305

I_4^FV(1) = q1 q2 / Q0 = (1 - p1)(1 - p2) / (1 - h_S4(p)) = 0.0978

I_5^FV(1) = q1 q2 / Q0 = (1 - p1)(1 - p2) / (1 - h_S5(p)) = 0.0169
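The same numbers can be checked with a small sketch for component 1 of system 3, under the same assumed availabilities as before; Birnbaum's measure is evaluated by the substitution method of equation 7 rather than symbolically.

```python
# Sketch: the three importance measures for component 1 of system 3.
# Availabilities are the assumed mapping that reproduces the report's values.

def h_s3(p):
    # p[0..7] are the availabilities of components 1..8
    branch_a = p[1] * p[2] * p[3]
    branch_b = p[4] * p[5] * p[6]
    return p[0] * p[7] * (1 - (1 - branch_a) * (1 - branch_b))

p = [0.95, 0.93, 0.99, 0.93, 0.93, 0.99, 0.93, 0.95]

# Birnbaum: I_B(1) = h(1_1, p) - h(0_1, p)
hi, lo = p.copy(), p.copy()
hi[0], lo[0] = 1.0, 0.0
i_b = h_s3(hi) - h_s3(lo)

# Critical importance: I_CR(1) = I_B(1) * (1 - p1) / (1 - h(p))
i_cr = i_b * (1 - p[0]) / (1 - h_s3(p))

# Fussell-Vesely: component 1 forms the minimal cut set {1} on its own,
# so I_FV(1) = q1 / Q0 = (1 - p1) / (1 - h(p))
i_fv = (1 - p[0]) / (1 - h_s3(p))

print(f"{i_b:.4f} {i_cr:.4f} {i_fv:.4f}")  # 0.9304 0.4005 0.4305
```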


Table 4. Component importance measures and rating of the components for systems 3, 4 and 5

Component   System 3                        System 4                        System 5
            IB(i)   ICR(i)  IFV(i)          IB(i)   ICR(i)  IFV(i)          IB(i)   ICR(i)  IFV(i)
1           0.9304  0.4005  0.4305          0.0489  0.0956  0.0978          0.0427  0.0144  0.0169
2           0.1194  0.0720  0.0904          0.0489  0.0956  0.0978          0.0427  0.0144  0.0169
3           0.1122  0.0097  0.0129          0.1317  0.3607  0.4109          0.9161  0.4332  0.4729
4           0.1194  0.0720  0.0904          0.1237  0.0484  0.0587          0.8606  0.0581  0.0676
5           0.1194  0.0720  0.0904          0.1317  0.3607  0.4109          0.9161  0.4332  0.4729
6           0.1122  0.0097  0.0129          0.1317  0.3607  0.4109          0.0427  0.0144  0.0169
7           0.1194  0.0720  0.0904          0.1237  0.0484  0.0587          0.0427  0.0144  0.0169
8           0.9304  0.4005  0.4304          0.1317  0.3607  0.4109          -       -       -
9           -       -       -               0.0489  0.0956  0.0978          -       -       -
10          -       -       -               0.0489  0.0956  0.0978          -       -       -

Optimal component availability in the importance sampling

In order to determine the optimal availability in the importance sampling, systems 3-5 were simulated with all configurations of availability for each component class (i.e. the availability was varied per class of components, not per individual component). These simulations were conducted with 100 sets of 1000 samples each. Different seeds were used for the random number generator for each set, but the same seeds for the different systems². Since the components' actual availability only influences the weight and not the outcome of each sample, it is sufficient to determine the optimal availability for each class of components³. How the components were classified is shown in Table 5, which also shows the outcome of the search for the optimal availability. The constraints for the optimisation of the availability configuration were a combination of the variance of the estimated value (equation 1) and an error margin towards the analytically calculated value.
Table 5. Component classes and optimal availability in the importance sampling

                        System 3                    System 4                         System 5
                        Comp. 1,8   Comp. 2-7       Comp. 1,2,9,10   Comp. 3-8       Comp. 1,2,6,7   Comp. 3-5
Optimal availability    0.55        0.85            0.83             0.69            0.93            0.69

² Seeds 1-100 were used for random number generator #4 in Matlab.
³ If all availability configurations were to be tested, it would require 99^8 = 9.2·10^15 sets for an 8-component system, which would take an enormously long time.

It is interesting to note that the more important the component is, the lower its optimal availability, which is opposite to the indication of the importance measures. To illustrate how the importance sampling improves the simulations, one set of 1000 samples was simulated and the cumulative results were logged after each sample, using the optimal availabilities. Figures 10-12 show these results; a minimal sketch of the weighted estimator follows the figures.
[Figure: system availability (y-axis, 0.75-1) versus number of samples (x-axis, 100-1000); curves for importance sampling, simple sampling and the analytical value.]
Figure 10. System availability as a function of the number of samples, system 3


[Figure: system availability (y-axis, 0.95-1) versus number of samples (x-axis, 100-1000); curves for importance sampling, simple sampling and the analytical value.]
Figure 11. System availability as a function of the number of samples, system 4


[Figure: system availability (y-axis, 0.75-1) versus number of samples (x-axis, 100-1000); curves for importance sampling, simple sampling and the analytical value.]
Figure 12. System availability as a function of the number of samples, system 5
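The following is a minimal sketch of the weighted estimator behind Figures 10-12, shown for system 3. It is not the author's Matlab code (which is not included in the report): the true availabilities are the assumed mapping used earlier, and the sampling availabilities are the class optima from Table 5.

```python
# Minimal sketch of importance sampling for system 3. Each component state is
# drawn with the (biased) sampling availability p_sim; the likelihood-ratio
# weight corrects the estimate back to the true availabilities p_true.
import random

def phi_s3(x):
    """Structure function of system 3 for a binary state vector x[0..7]."""
    branch_a = x[1] * x[2] * x[3]
    branch_b = x[4] * x[5] * x[6]
    return x[0] * x[7] * (1 - (1 - branch_a) * (1 - branch_b))

p_true = [0.95, 0.93, 0.99, 0.93, 0.93, 0.99, 0.93, 0.95]   # assumed
p_sim  = [0.55, 0.85, 0.85, 0.85, 0.85, 0.85, 0.85, 0.55]   # Table 5 optima

random.seed(1)
n, total = 1000, 0.0
for _ in range(n):
    x = [1 if random.random() < ps else 0 for ps in p_sim]
    # Weight = true probability of the sampled state / sampling probability.
    w = 1.0
    for xi, pt, ps in zip(x, p_true, p_sim):
        w *= (pt / ps) if xi else ((1 - pt) / (1 - ps))
    total += w * phi_s3(x)

print(total / n)  # unbiased estimate; should lie close to the analytical 0.8839
```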


CONCLUSIONS AND DISCUSSION

Based on the results presented in this assignment, the following conclusions can be made:

i.   As the name reveals, there is a connection between how the system components are weighted when applying importance sampling and the component importance measures tested in this assignment.

ii.  High importance in the importance sampling refers to those components whose reliability characteristics should be emphasized, i.e. whose availability should be lowered when sampling them. This is the opposite of how important components are indicated by the component importance measures included in this assignment.

iii. It is important to analyse the system using some kind of indicator of component importance before simulating it, since a wrongly configured importance weighting of the system components can increase the variance, thereby making the results worse than if simple sampling had been used.

iv.  Of the importance measures tested in this assignment, Birnbaum's measure seems to be the best at indicating how to weight the system components.

v.   The importance measures do not indicate what the optimal availability should be, but rather the direction in which the availability should be adjusted, relative to the other components.

vi.  Looking at the system configuration: components in series with the rest of the system are important and should be given high weight, and components in series with other components but in parallel with another set of components are more important than single components in parallel with another component.
First of all, since the importance sampling is unique for each system it is applied to, the conclusions presented in this assignment are strictly valid only for these specific test systems. The indication, however, is that the conclusions are somewhat general.
Second, the luxury of using a small system is that everything is known, since it can be calculated analytically. MCS, however, is normally used when the analytical value is too complicated to calculate, and then the importance measures used in this assignment are also too complicated to calculate. This means that other means of importance indication would have to be used, such as the system configuration, historical reliability data, etc.
Finally, the measures used in this assignment are such that, as with most component importance measures, the importance has to be calculated separately for each output. In distribution systems there is usually one input but several outputs, and using importance measures like those in this assignment could be very time consuming. To overcome this, one can differentiate the structure function with respect to performance indicators, which are not tied to one single output. However, differentiating with respect to performance indicators gives no indication of how to weight the system components, since the indicators depend on what is at the end of a line, not on what is in its path, i.e. on the system's component configuration.
REFERENCES

[1] R. Billinton, W. Li, Reliability Assessment of Electrical Power Systems Using Monte Carlo Methods, Plenum Press, New York, NY, USA, 1994.
[2] M. Amelin, On Monte Carlo Simulation and Analysis of Electricity Markets, Doctoral thesis, Royal Institute of Technology (KTH), Stockholm, Sweden, 2004.
[3] M. Rausand, A. Høyland, System Reliability Theory: Models, Statistical Methods, and Applications, 2nd edition, Wiley & Sons, Inc., Hoboken, NJ, USA, 2004.
[4] R. Billinton, R. N. Allan, Reliability Evaluation of Power Systems, Plenum Press, New York, NY, USA, 1986.


RELIABILITY ENGINEERING IN DISTRIBUTION SYSTEM PLANNING

Markku Hyvärinen
Helsinki Energy
markku.hyvarinen@helsinginenergia.fi

1 INTRODUCTION
The planning of electricity distribution systems is a technical and economic optimization problem in which the long-term total cost must be minimized while taking into account technical, environmental, safety-related and other constraints. The two metrics that most customers notice about power are its quality and its cost. The expectation is that there will be continual improvement in one or both.
In the past, distribution system reliability was a by-product of standard design practices and of reactive solutions to historical problems. Today, and even more so in the future, reliability is a measured area of performance, reported to regulators and customers and having an impact on utilities' economics. Reliability must be planned, designed, and optimized with regard to cost. Furthermore, reliability-related risks must be managed.
Reliability engineering is a staff function whose prime responsibility is to ensure that equipment is designed and modified to improve reliability and maintainability, that maintenance techniques are effective, that ongoing reliability-related technical problems are investigated, and that appropriate corrective and improvement actions are taken.
Reliability engineering uses reliability assessment (analysis, evaluation) in order to obtain failure
characteristics of a distribution system. These characteristics are normally frequency and
duration of interruptions and voltage sags (dips). System reliability can be improved by reducing
the frequency of occurrence of faults and by reducing the repair or restoration time by means of
various design and maintenance strategies.
Reliability can only be objectively assessed by quantifying the behaviour of the individual items of equipment, and thus by monitoring, collecting and collating the associated reliability data. This data can then be used both to assess past performance and to predict likely future performance. Just as importantly, it can be used to judge the technical and economic merits and benefits of alternative planning and operational strategies and decisions.
Reliability levels are interdependent with economics, since increased reliability is obtained through increased investments but also allows the consumers to reduce their outage costs. In order to carry out objective cost-benefit studies, both of these economic aspects are important: the optimum reliability level is determined by minimizing the total cost.
Finally, it must be recognized that management decisions of network reinforcement or expansion
schemes cannot be based only on the knowledge of the reliability indices of each scheme. It is
also necessary to know the economic implications of each of these schemes.

2 THE BASIC CONCEPTS

2.1 Quantification of failure consequences and mitigation actions
Reliability can be engineered in the same way that other performance aspects, such as voltage, loading, and power factor, are engineered. This requires reliability-focused analytical methods.
The basic concept is the following: the probabilistic risk of an event is the product of its likelihood and its severity (measured by the damage in monetary terms):
Risk [€/year] = Probability [1/year] · Damage [€]    (1)

In simplified form, the damage caused by a power outage increases with the duration and the extent (amount of interrupted load) of the outage:

Damage [€] = Duration [h] · Interrupted Load [kW] · Outage Cost [€/kWh]    (2)

where the outage cost includes both the utility's cost and the cost to the customer.
Thus:

Risk [€/year] = Probability [1/year] · Duration [h] · Interrupted Load [kW] · Outage Cost [€/kWh]    (3)

Although very simplified, this equation shows where the engineering potential lies: in reducing the number of events leading to an interruption, or in mitigating the consequences by reducing the duration or the extent of the interruption. Section 6 below gives an overview of the various means used in distribution system planning to mitigate interruptions and voltage sags.
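As an illustration with invented figures: a fault type occurring 0.2 times per year and causing a 2-hour interruption of 500 kW of load, at an outage cost of 5 €/kWh, carries a risk of 0.2 · 2 · 500 · 5 = 1000 €/year. Halving either the fault frequency or the restoration time halves this risk.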
The total annual cost Ct is given by [7][30]:

C_t = Σ_{i=1..n} Σ_{j∈m(i)} λ_j P_i CIC(r_j) + c_eu Σ_{i=1..n} Σ_{j∈m(i)} λ_j r_j P_i + C_r + C_m + C_s    (4)

where
i        = i-th customer or load point
m(i)     = set of elements upstream of load point i, from the load to the feeding point
λ_j      = failure rate of element j
r_j      = average outage time of element j
P_i      = average load at load point i
CIC(r_j) = customer interruption cost due to a failure of duration r_j
c_eu     = loss of revenue per unit of energy not supplied (ENS)
C_r      = annualized investment costs (CAPEX)
C_m      = increase/decrease in annualized maintenance cost (MAINTEX)
C_s      = increase/decrease in annualized cost of system losses (OPEX)

The values of Cm and Cs may be negative (a decrease), and these, together with Cr, should be evaluated as annualized values using present-worth valuation or discounted cash flow.
Thus this value, which combines the network utility's unavailability data with the customers' view of the unavailability of supply, can be used as a reliability criterion in planning tasks.
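As a sketch of how equation (4) can be evaluated, the following uses entirely invented data (two load points, their upstream elements, a toy customer damage function); only the structure of the formula comes from the text above.

```python
# Hypothetical evaluation of the total annual cost criterion, eq. (4).
# All input data are invented for illustration.

def cic(r_hours):
    """Toy customer interruption cost, EUR per kW of interrupted load."""
    return 1.0 + 5.0 * r_hours

# Upstream elements per load point: (failure rate lambda [1/yr], outage time r [h]).
upstream = {
    "lp1": [(0.10, 2.0), (0.05, 4.0)],
    "lp2": [(0.20, 1.5)],
}
load = {"lp1": 500.0, "lp2": 300.0}      # average load P_i [kW]
c_eu = 0.05                              # lost revenue per kWh not supplied
C_r, C_m, C_s = 12000.0, -800.0, 300.0   # annualized CAPEX / MAINTEX / OPEX deltas

interruption_cost = sum(lam * load[i] * cic(r)
                        for i, elems in upstream.items() for lam, r in elems)
lost_revenue = c_eu * sum(lam * r * load[i]
                          for i, elems in upstream.items() for lam, r in elems)
C_t = interruption_cost + lost_revenue + C_r + C_m + C_s
print(round(C_t, 1))
```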
The same approach can be used in analogous operational planning processes to quantify the risk (or the reduction of it) of various actions. For instance, an RCM maintenance policy is oriented towards the prevention of critical failures, evaluated as such on the basis of the combined severity and frequency of the events in question. The purpose of preventive maintenance is to reduce the probability of faults. This approach can also be used in scheduling planned outages.

2.2 Power system hierarchy


The power system can be divided into three (or four, including loads) levels, each with its own
specific design and operational problems and solutions:

- generation
- transmission
- distribution
- (loads)

Reliability evaluation of a complete electric power system including all the levels is normally not
conducted due to the enormity of the problem. Instead, reliability evaluation of generating
facilities, of composite generation and transmission systems (bulk power systems, including
interconnections with neighbouring systems, generally operated at voltages of 100 kV or higher)
and of distribution systems are conducted separately. [30]
In bulk power systems, the following aspects are usually considered: adequacy and security (including transient and thermal stability). Adequacy is the ability of the electric system to supply the aggregate electrical demand and energy requirements of its customers at all times, taking into account scheduled and reasonably expected unscheduled outages of system elements. Security is the ability of the electric system to withstand sudden disturbances such as electric short circuits or the unanticipated loss of system elements [31].
Distribution design, in many systems, is almost entirely decoupled from the transmission system
development process. Coupling between these two systems in reliability evaluation can be
accommodated by using the load point indices evaluated for the bulk transmission system as the
input reliability indices of the distribution system. [7]
The distribution network is the final link between the power system and the customer; distribution systems largely determine the service quality profile seen by end-customers and dominate the overall reliability indices. The distribution system is also gradually accounting for an increasing share of the overall power system cost. The influence of the transmission system (in the case of a developed system) is much smaller because of its high redundancy. [35]

3 MEASURING RELIABILITY
To provide a quantitative evaluation of the reliability of an electrical system, it is necessary to have indices which can express system failure events on a frequency and probability basis. Three basic reliability indices, expected failure rate (λ), average outage duration (r) and average annual outage time (U), are evaluated for each load point of any meshed or parallel system. These three basic indices permit a measure of reliability at each load point to be quantified and allow subsidiary indices, such as the customer interruption indices, to be found. They have three major deficiencies, however [7]:
- they cannot differentiate between the interruption of large and small loads;
- they do not recognize the effect of load growth by existing customers or of additional new loads;
- they cannot be used to compare the cost-benefit ratios of alternative reinforcement schemes, nor to indicate the most suitable timing of such reinforcements.
These deficiencies can be overcome by the evaluation of two additional indices [7]:
- the average load disconnected due to a system failure, measured in kW or MW
- the average energy not supplied due to a system failure, measured in kWh or MWh
Two sets of reliability indices, customer load point indices and system indices, have been
established to assess the reliability performance of distribution systems. Load point indices
measure the expected number of outages and their duration for individual customers. System
indices such as SAIDI and SAIFI measure the overall reliability of the system. The third popular
index most utilities have been benchmarking is CAIDI. These indices can be used to compare the
effects of various design and maintenance strategies on system reliability.
SAIFI is improved by reducing the frequency of outages (for example, by tree trimming and equipment maintenance). SAIFI is also improved by reducing the number of customers interrupted when outages do occur (for example, by adding reclosers and fuses).
Strategies that reduce SAIFI also improve SAIDI, because if an outage does not happen it does not add to the duration. SAIDI is further improved by improving CAIDI through faster customer restoration. However, system improvements can make CAIDI go up or down, depending on whether the improvements have a greater effect on outage frequency (customer interruptions) or on outage duration (customer minutes of interruption).
Usually, the three reliability indices are graphed independently of each other. It is, however, useful to remember the relationship between the indices: SAIDI = SAIFI · CAIDI. In this equation, SAIFI can be considered the independent variable: there can be no outage duration (SAIDI) without an outage frequency (SAIFI). This representation of the indices suggests graphing them together, with SAIFI on the x-axis and SAIDI on the y-axis; lines of constant CAIDI then pass through the origin. Thus, all three indices are shown at once. The graph is referred to as the reliability triangle, because the benchmark quartile lines form a triangle. The closer to the origin a point lies, the better the reliability. This method is also useful for tracking the indices over the course of a year. [34]
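A small sketch with invented annual outage records illustrates the identity; only the index definitions themselves are taken from the text.

```python
# Sketch of the index relationship SAIDI = SAIFI * CAIDI on invented data.
customers_served = 10000
# annual outages as (customers interrupted, duration in minutes) - hypothetical
outages = [(1200, 90), (300, 45), (2500, 30)]

saifi = sum(n for n, _ in outages) / customers_served       # interruptions/customer
saidi = sum(n * d for n, d in outages) / customers_served   # minutes/customer
caidi = saidi / saifi                                       # minutes/interruption

print(f"{saifi:.2f} {saidi:.2f} {caidi:.2f}")
assert abs(saidi - saifi * caidi) < 1e-9  # the reliability-triangle identity
```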

Displaying the indices in a reliability triangle has several advantages:
- data reporting is condensed
- the fact that the indices are related is emphasized
- reliability performance is clarified
- cumulative tracking shows that SAIFI and SAIDI can only increase over the year, while CAIDI can increase or decrease.
Reliability indices can be calculated using historical outage data, or predicted using stochastic
methods.

4 RELIABILITY ENGINEERING PROCESS


4.1 An overview of the process
Picture 1 shows an overview of the functional blocks relating to reliability engineering. With input data from data analysis, staff knowledge, R&D, etc., a reliability model is composed based on the new design or the existing network. After assessing the failure consequences, potential mitigation actions are developed. These are analysed with predictive analysis methods. An economic evaluation is needed before the final decision-making.

Picture 1. An overview of reliability engineering functional blocks (modified from [25] and [33]).

4.2 Data collection and processing


Consistent collection of data is essential, as it forms the input to the relevant reliability models, techniques and equations. Processing of the data occurs in two distinct stages. First, field data is obtained by documenting the details of failures as they occur and the various outage durations associated with these failures. Second, this field data is analysed to create statistical indices.
These indices are updated by the entry of subsequent new data.
The quality of the data, and thus the confidence that can be placed in it, clearly depends on the accuracy and completeness of the compiled information. It is therefore essential to stress the future use to which the data will be put and the role it will play in later developments. The quality of the statistical indices also depends on how the data is processed, how much pooling is done and the age of the data currently stored. These factors affect the relevance of the indices for the use to which they are subsequently put. [1]
The reliability data which must be estimated from the collected data are outage rates and
durations for forced and scheduled outages for each component type. Restoration can be
perceived as being of two types: restoring supply to the customers and restoring a failed
component to its working state. These are not necessarily the same. Care is therefore required to
correctly identify restoration times. In order to perform predictive reliability assessments, the
component restoration times are required. The service restoration times are important only to
measure the quality of service to customers. For more detailed analysis, restoration may be
categorized by a number of subevents such as repair, replacement, reclosure, switching. In most
cases, a particular restoration process is coupled with a specific type of failure event. [1]

4.3 Reliability modelling


Reliability modelling techniques are described in a more comprehensive manner in other presentations. These models must reflect all the relevant phenomena of real networks: the operating and failure states of system components, the asset ageing process, etc.
In more advanced studies, especially for transmission and subtransmission systems, network reliability has to be estimated in such a way that both post-fault substation events, i.e. the protection system and circuit breaker operations, and the power system dynamics are included [33]. This means that the protection system must be modelled as a relevant subsystem. The failures suffered by protection systems are different from those experienced by other types of electrical system components. Due to the different failure modes (failure to operate and incorrect operation), the fault-clearing chain must be characterized by both dependability and security. The reliability measure dependability indicates the ability to perform the requested fault clearing, while security indicates immunity against incorrect operations. Dependability and security are contradictory requirements, and a gain in one of the quantities can result in a loss in the other.
In the modelling of the protection system, the event tree method is of particular interest because it can recognize the sequential logic of protection system operations, and the analysis can be extended to study the power system as well as the protection system to a pre-selected depth. After the occurrence of a fault in the power system, the normal operation or failure to operate of the protection system components can be described in a detailed tree structure. The end events of the tree identify the resulting states of the circuit breakers, and hence the effect of failures in the protection system. [33]

4.4 Reliability assessment


Reliability analysis consists of two steps, independent of one another in both function and application. Some utilities perform only one or the other.
- Historical reliability assessment, whose goal is to analyze system and historical operating data to assess the reliability health of the system, to identify problems and to determine what caused them.
- Predictive reliability assessment, whose goal is to predict expected future reliability levels on the system, both in general and at specific locations, by analyzing a specific candidate design for a part of the distribution system and determining its expected reliability.
Willis has also added a third step [37]:
- Calibration, which is required to adjust a predictive model so that it correctly predicts past events on the system.
Predictive analysis is the core of reliability planning. It must produce an estimate of:
- the future performance of the plant and system
- the benefits of alternative designs, refurbishments and expansion plans
- the effects of alternative operational and maintenance policies
- the related reliability cost/benefit/worth of the alternatives in the two previous items

4.5 Cost evaluation


A major attribute of planning is the optimization or reduction of cost. An assessment of various planning alternatives may be based on the capital investment cost alone if the additional network capacity provided by each option is comparable and if the system maintenance costs are effectively the same. If they are not, the differences in these costs must be taken into account. A change in maintenance costs is usually associated with the addition of new objects (for example, a new substation).
Besides these, there are the costs of equipment, land, labor, etc. These values can be estimated by the utility for each possible planning option.
The investment criterion can be calculated as the sum of the annuities of the investments and the corresponding supplements to the maintenance cost (see 2.1).

5 METHODS
5.1 An overview
The techniques used in power system reliability evaluation can be divided into two basic
categories: analytical and simulation methods.
The analytical techniques are highly developed and have been used in practical applications for
several decades. Analytical techniques represent the system by mathematical models and evaluate

the reliability indices from these models using mathematical solutions. A range of approximate
techniques has been developed to simplify the required calculations. Analytical techniques for
distribution system reliability assessment can be effectively used to evaluate the mean values of a
wide range of system reliability indices. This approach is usually used when teaching the basic
concepts of distribution system reliability evaluation. The mean or expected value, however, does
not provide any information on the inherent variability of an index. [9]
The principles of probabilistic design that have been applied in the aerospace and power generation industries for some time are finding new applications in distribution system planning and engineering. Probabilistic design methodology provides a way to objectively quantify the relative value of different design arrangements and operational practices.
Also, the probabilistic approach makes it possible to understand the sensitivity of the design to
system variables, thereby giving the system operator knowledge about where system
reinforcements are required to improve performance. This is in contrast to the deterministic
approach that incorporates factors of uncertainty or safety factors to produce a design that may
be overly conservative. A good reliability model and a firm grasp on the capital and O&M cost
activities are essential to this process. [17]
Qualitative methods may be required in several stages of the process, e.g. in pre-studies and in
managing the risks.

5.2 Contingency-based planning methods


Traditionally, electric utilities assured the reliability of their power delivery indirectly, by setting and then engineering to criteria that called for margin, switching, and other characteristics that would assure secure operation. The electric utilities developed a method of engineering a system that would provide reliable performance by adhering to a basic rule: design the system so that it can do without any major element, so that it can still do its job even if any one unit fails. These methods were often referred to as N-1 methods. So designed, a system would tolerate failures. The likelihood of two nearby units failing simultaneously was usually remote enough not to be a consideration. In general, contingency-based methods came to be called N-X methods, where X is the number of units they were designed to do without on a routine basis. [37]
Probabilities of different faults are traditionally not taken into account; instead, all faults that may limit the transmission capacity are treated equally. This method can lead to conservative utilization of the grid. [33]

5.3 Planning standards


From the contingency-based concept, utilities evolved rules (guidelines, design criteria, standards) and engineering methods that applied these rules with precision and procedure. These methods were simple to apply and understand, and effective in traditional situations.
The so-called planning standards usually make a distinction between probable (likely) and extreme (unlikely) faults. In that sense they are semi-risk-based approaches. They are based on the assumption that the rarer an event is, the more severe the consequences that are accepted. [20]

According to these standards, electric systems must be planned to withstand the more probable
forced and maintenance outage system contingencies at projected customer demand and
anticipated electricity transfer levels. Extreme but less probable contingencies measure the
robustness of the electric systems and should be evaluated for risks and consequences [31].
Although it is not practical (and in some cases not possible) to construct a system to withstand all
possible extreme contingencies without cascading, it is desirable to control or limit the scope of
such cascading or system instability events and the significant economic and social impacts that
can result.
Planning standards have been used successfully especially in bulk power systems, which involve multiple parties. Since all electric systems within an integrated network are electrically connected, whatever one system does can affect the reliability of the other systems. Therefore, to maintain reliability, the bulk electric systems or interconnected transmission networks must comply with common planning standards (e.g. NERC, NORDEL). [31]

5.4 Predictive reliability analysis of distribution systems


Contingency-based methods have several disadvantages. Because they achieve reliability indirectly, it is essentially impossible to use them as engineering tools in a process aimed at driving cost to a minimum. For that, one needs methods that directly address reliability, so-called reliability-based engineering methods. Planners need reliability-based analysis tools: methods that can evaluate a particular power system layout against a particular customer demand profile. These tools are used in almost the same manner in which traditional planners used a load flow. Working with a "reliability load flow", planners can engineer the frequency and duration of outages (SAIDI, SAIFI) and the incidence of voltage sags throughout the system in exactly the same manner.
Reliability assessment on a routine computational basis is a relatively new feature of distribution planning.

5.5 Planning constraints


Minimum levels of reliability or maximum durations of interruptions may be defined by authorities or by utilities' recommendations. An example of such recommendations for the maximum duration of interruptions is as follows [36]:
- 24 hours with a maximum load of < 2 MW
- 18 hours with a maximum load of 2-5 MW
- 12 hours with a maximum load of 5-20 MW
- 2 hours with a maximum load of 20-50 MW
- 1 hour with a maximum load of > 50 MW
Critical values for interruption durations can also be defined for different types of loads [36]:
- 1 second: hospitals or process industry
- 10-15 minutes: steelworks
- 15-30 minutes: animal feeding
- 30 minutes: elevators
- 2 hours: water supply
- 6 hours: greenhouses
- 8 hours: caring industry
- 10 hours: telecommunication
- 12-24 hours: houses, water treatment plants and freezing plants

These constraints, which can be considered deterministic, are used as limitations in the probabilistic reliability analysis. [13]
More often, authorities define limits for interruption durations, and if these limits are exceeded, incentives follow.

5.6 Disaster mitigation


While it is often impossible to prevent disasters such as fires, tornadoes, hurricanes, floods, industrial accidents, and even bombings, there are measures that communities can take to mitigate their impact. Decisions on how to prepare for, respond to, and recover from disasters are not based on engineering data; instead, they are political by nature. These preventive actions can, however, have an impact on the distribution system reliability level also during normal conditions.

6 IMPROVEMENT OF SYSTEM RELIABILITY BY VARIOUS MEANS: MITIGATION OF INTERRUPTIONS AND VOLTAGE SAGS

6.1 Overview of mitigation or elimination methods
To understand the various means of mitigation, the whole mechanism, from the cause of a fault to an end-customer equipment or process outage and the subsequent restoration process, needs to be understood.
Long interruptions are always due to component outages. Component outages have three different causes [10]:
- A fault occurs in the power system, which leads to an intervention by the power system protection.
- A protection relay intervenes incorrectly, thus causing a component outage.
- Operator actions cause a component outage. These can also be scheduled or planned interruptions.
Whether a component outage leads to an interruption depends upon what sort of redundancy the component in question has.
For different types of events, different mitigation methods are most suitable: improving the equipment for short-duration events, improving the network design for long-duration events, etc. [10]


The basic categories of mitigation methods are:
- reducing the number of faults
- improving the fault-clearing process
- changing the system design so that faults result in less severe events at the equipment terminals or at the customer interface
- connecting mitigation equipment between the sensitive equipment and the supply
- improving the immunity of the end-customer's equipment

6.2 The number of faults


Reducing the number of faults in a system reduces not only the sag frequency but also the frequency of sustained interruptions. Some examples of fault mitigation are:
- better shielding of components and plants (using cables or covered wires instead of overhead lines with bare conductors, additional shielding or earth wires reducing the risk of lightning faults, placing equipment inside buildings, fences, etc.)
- increasing the insulation level (insulation coordination) and related earthing practices
- increasing maintenance and inspection frequencies; preventive maintenance activities can reduce the frequency of faults by preventing the actual cause of failure
- optimal neutral earthing practice
- preventing maloperations by proper design of the protection and control systems
- preventing human errors by designing easy-to-use systems (primary, secondary, control centers), by training, etc.
One has to keep in mind, however, that these measures may be very expensive and that their costs have to be weighed against the consequences of the equipment trips. [10]
The impact of component reliability on system reliability, and the value of identifying those components which have the greatest impact on system reliability, is clear. For planning engineers, one important field of improvement is customer specifications. Customer specifications define how the component will be used, the loading regime and the system conditions envisaged. In some cases the specification fails to fully describe the operating conditions in terms of loading, switching operations, system transients or environmental conditions. The importance of specifications is often overlooked.

6.3 The fault-clearing


Protection systems are designed to automatically disconnect components from the network to isolate electrical faults, or to protect equipment from damage due to voltage, current, or frequency excursions outside the design capability of these facilities. Coordination of the protection systems is vital to the reliability of the networks. If protection and control systems are not properly coordinated, a system disturbance or contingency event could result in the unexpected loss of multiple facilities, and through cascading events the extent of the interruption is increased. [31]
Reducing the fault-clearing time does not reduce the number of events, only their severity. The duration of an interruption is determined by the speed with which the supply is restored. Faster fault-clearing also does not affect the number of voltage sags, but it can significantly limit the sag duration. [10]

6.4 The power system design and supply restoration


The structure of the distribution system has a large influence on both the number and the duration of the interruptions and voltage sags experienced by the customer.
When a power system component fails, it needs to be repaired, or its function must be taken over by another component, before the supply can be restored. The repair or replacement process can take several hours or, especially with power transformers and GIS plants, even days up to weeks. In most cases the supply is restored not through repair or replacement but by switching from the faulted supply to a backup supply. The speed with which this takes place depends on the type of switching used. A smooth transition without any interruption takes place when two components are operated in parallel. This will, however, not mitigate the voltage sag due to the fault, which often precedes the interruption. [10]
When a single customer is considered, additional redundancy on the distribution system side is only justified for large industrial or commercial customers. For a larger group of customers, especially in densely populated areas, feeder-level redundancy easily becomes feasible. At the substation and subtransmission levels, full redundancy is almost self-evident, due to the long repair times of the main components relative to the large amount of interrupted load. At these levels, optimization is needed to balance restoration times against system complexity and dimensioning (parallel operation, i.e. meshed systems, versus redundancy through switching).
Table 1. Examples of various types of redundancy in power system design [10]

Redundancy                          Duration of interruption    Typical applications
No redundancy                       hours - days                low voltage in rural areas
Redundancy through switching:
  - local manual switching          1 hour and more             low voltage and distribution
  - remote manual switching         5-20 minutes                industrial systems, future public distribution
  - automatic switching             1-60 seconds                industrial systems
  - solid state switching           1 cycle and less            future industrial systems
Redundancy through parallel
operation                           voltage sag only            transmission systems, industrial systems

(a combination is also possible: remote manual switching plus local manual switching)

There are two types of parallel operation: two feeders in parallel, and a loop system. In both cases there is single redundancy. The design of parallel and loop systems is based on the so-called (n-1) criterion. It enables high reliability without the need for stochastic assessment. A thorough assessment of all common-mode failures is needed before one can confidently use such a high-redundancy design criterion. [10]


In switching schemes, the additional costs to the system are not only the switching, signalling and communication equipment. The feeder has to be dimensioned such that it can handle the extra load, and the voltage drop over the now potentially twice as long feeder should not exceed the margins. Roughly speaking, the feeder can then only feed half as much load. This will increase the number of substations and thus the cost. Reliability engineering should therefore always be conducted on a system basis.
In switching schemes, a system outage or failure event may not lead to a long-term total loss of continuity, but it may cause the violation of a network constraint (loss of quality) and/or an event defined as a partial loss of continuity. Many systems have interconnections which allow the transfer of some or all of the load of a failed load point to neighbouring load points through normally open points. [7]
While the amount of distributed generation (DG) is likely to grow, it is not self-evident that this has an advantageous impact on reliability. The adaptation of traditionally designed distribution systems to DG takes a while, and the impact of distributed generation on reliability is by nature a case-by-case issue.

6.5 The interface between the network and the customer


Mitigation equipment at the system-equipment interface is the only place where the customer has control over the situation [10]. Both changes in the supply system and improvement of the equipment are often completely outside the control of the end-user.
On-site generators are used for two distinctly different reasons:
- Generating electricity locally can be cheaper than buying it from the utility.
- Having an on-site generator available increases the reliability of the supply, as it can serve as a backup in case the supply is interrupted.
Standby generation is often used in combination with a small amount of energy storage (e.g. a flywheel) supplying the essential load during the first few seconds of an interruption.
All modern mitigation techniques are based on power electronic devices, with the voltage-source converter being the main building block [10]. Terminology is still very confusing in this area; terms like "compensators", "conditioners", "controllers" and "active filters" are in use, all referring to similar kinds of devices. These devices can be series or shunt connected. One of the main disadvantages of a series controller is that it cannot operate during an interruption. A shunt controller can operate during an interruption, but its storage requirements are much higher.
Some examples of mitigation equipment are:
- The main device used to mitigate voltage sags and interruptions at the interface is the so-called uninterruptible power supply (UPS).
- Voltage source converters (VSC) generate a sinusoidal voltage with the required magnitude and phase by switching a dc voltage in a particular way over the three phases. This voltage source can be used to mitigate voltage sags and interruptions.


The interface between the supply system and the end-customer's equipment, including protection coordination, is controlled by the facility connection requirements. These requirements shall be documented, maintained, and published by voltage class, capacity, and other characteristics applicable to the generation, transmission, and electricity end-user facilities which are connected to, or planned to be connected to, the system.

6.6 The equipment


Improvement of equipment immunity is probably the most effective solution against equipment
trips due to voltage sags. For interruptions, especially the longer ones, improving the equipment
immunity is no longer feasible.

7 ECONOMIC CONSIDERATIONS
7.1 Decision process requirements
The decision process can be seen as built up of three separate levels. The first level deals with the technical information (network and reliability data); the second level uses the results of the first level together with the economic information on assets (CAPEX, OPEX, MAINTEX); and the third level combines this with economic information on the business and with societal information. [18]
While the reliability of the equipment and system itself is important, what matters to regulators and consumers alike is the quality of customer service.
The reliability worth of a network may be defined as the benefit to society ascribed to the
reliability level of a network. The best measure of reliability worth is therefore given by customer
outage cost. While the load point and performance indices indicate the frequency, duration,
severity and significance of outage situations, reliability worth attaches an economic value to
such situations. This is a particularly attractive aspect since this means that their incremental
values can be easily included in cost-benefit analyses of alternative network configurations. [22]

7.2 Customer interruption cost


Outages in the electricity supply can cause extensive economic damage to the customers due to lost production, spoilt raw materials, broken equipment, and various other reasons. Outage frequency and duration can be reduced by technical means, but this usually requires capital-intensive investments in the power distribution and generation system. Hence, when planning the power system, we have to find a compromise between reliability and outage costs. In this planning process, accurate and up-to-date knowledge of the customers' outage costs is of high importance. [27]
Outage costs in general have two parts: those seen by the utility and those seen by society or the customer [10]. The utility's outage costs include the loss of revenue from customers not served and the increased expenditure due to maintenance and repair. These costs, however, usually form only a very small part of the total outage costs. The greater part comprises the costs seen by the customer, and most of these are extremely difficult to quantify. [7]

The worth assessment is based on customer surveys. The survey data obtained can be compiled into a Sector Customer Damage Function (SCDF), which presents the sector interruption cost as a function of the interruption duration. Real distribution systems supply different mixtures of commercial, industrial and residential customers that impose different load demands and service quality requirements. A Composite Customer Damage Function (CCDF) must therefore be defined as the estimate of the costs associated with power interruptions, as a function of the interruption duration, for the customer mix in a particular service area [22]. Reliability worth can then be evaluated in terms of the expected customer interruption cost by appropriately combining the CCDF with the calculated indices, i.e. expected energy not supplied, expected load loss, load point failure rate, etc.

7.3 Asset management


The utility must manage, in general, three inter-related activities [24]:
- prioritization of O&M resources,
- prioritization of capital spending, to meet new expansion needs and to replace equipment that is too old or costs too much to repair, and
- optimum utilization of its existing and any new equipment.
These three activities should be part of the overall utility managerial process during budget allocation.
Asset management takes a holistic view of all these activities: it is a paradigm which includes a much closer integration of capital and O&M planning and spending prioritization than was traditionally the case, along with tighter controls on spending, all aimed at achieving a lifetime-optimal business case for the acquisition, use, maintenance and disposal of equipment and facilities. [37]

8 CONCLUSION
Reliability planning of a distribution system seldom involves determining how to provide the highest possible reliability of service; instead, it involves determining how to meet service reliability targets while keeping the cost as low as possible.
The modern method is to engineer reliability directly, i.e. to engineer the frequency and duration of outages (SAIDI, SAIFI) and the incidence of voltage sags. The quantitative reliability analysis forms the input to the value-based planning process, in which the total cost, including the customer outage cost, is minimized.


9 REFERENCES
[1] R.N.Allan, R.Billinton: Concepts of Data for Assessing The Reliability of Transmission and
Distribution Equipment, The Reliability of Transmission and Distribution Equipment, 29-31
March 1995, Conference Publication No. 406, IEE, 1995, pp. 1-6
[2] D.J.Allan, A.White: Transformer Design for High Reliability, The Reliability of
Transmission and Distribution Equipment, 29-31 March 1995, Conference Publication No. 406,
IEE, 1995, pp. 66-72
[3] APM Task Force Report on Protection System Reliability: Effect of Protection Systems on
Bulk Power Reliability Evaluation, Transactions on Power Systems, Vol 9., No. 1., February
1994, pp. 198-205
[4] N.Balijepalli, S.S.Venkata, R.D.Christie: Modelling and Analysis of Distribution Reliability
Indices, IEEE Transactions on Power Delivery, Vol. 19, No. 4, October 2004, pp. 1950-1955
[5] C.Basille, J.Aupied, G.Sanchis: Application of RCM to High Voltage Substations, The
Reliability of Transmission and Distribution Equipment, 29-31 March 1995, Conference
Publication No. 406, IEE, 1995, pp. 186-190
[6] L.Bertling, R.Allan, R.Eriksson: A Reliability-Centered Asset Maintenance Method for
Assessing the Impact of Maintenance in Power Systems, IEEE Transactions on Power Systems,
Vol. 20, No. 1, February 2005, pp. 75-82
[7] R.Billinton, R.N.Allan: Reliability Evaluation of Power Systems, Pitman Publishing Limited
1984, 432 pages
[8] R.Billinton, R.Ghajar, F.Filippelli, R. Del Bianco: The Canadian Electrical Association
Approach to Transmission and Distribution Equipment Reliability Assessment, The Reliability
of Transmission and Distribution Equipment, 29-31 March 1995, Conference Publication No.
406, IEE, 1995, pp. 7-12
[9] R.Billinton, P.Wang: Teaching Distribution System Reliability Evaluation Using Monte Carlo
Simulation, IEEE Transactions on Power Systems, Vol. 14, No. 2, May 1999, pp. 397-403
[10] M.H.J. Bollen: Understanding power quality problems: voltage sags and interruptions, IEEE
Press 2000, 543 pages
[11] R.E. Brown, J.J. Burke: Managing the Risk of Performance Based Rates, IEEE Transactions on Power Systems, Vol. 15, No. 2, May 2000, pp. 893-898
[12] Mo-yuen Chow, Leroy S. Taylor, Mo-Suk Chow: Time of Outage Restoration Analysis in Distribution Systems, IEEE Transactions on Power Systems, Vol. 11, No. 3, August 1996, pp. 1652-1658


[13] CIGRE Meetings, Montreal Symposium 16-18 September 1991: Electric power systems
reliability, Electra No 140, February 1992, pp. 5-35
[14] D.M.Dalabeih, Y.A.Jebril: Determination of Data for Reliability Analysis of a Transmission
System, The Reliability of Transmission and Distribution Equipment, 29-31 March 1995,
Conference Publication No. 406, IEE, 1995, pp. 19-23
[15] J.G.Dalton, D.L.Garrison, C.M.Fallon: Value-Based Reliability Transmission Planning,
IEEE Transactions on Power Systems, Vol. 11, No. 3., August 1996, pp. 1400-1408
[16] A.O.Eggen, B.I.Langdal: Large-Scale Utility Asset Management, CIRED Conference 2005,
paper 446
[17] L.A.Freeman, D.T.Van Zandt, L.J.Powell: Using a Probabilistic Design Process to
Maximize Reliability and Minimize Cost in Urban Central Business Districts, CIRED
Conference 2005, paper 681
[18] E.Gulski, J.J.Smit, B.Quak, E.R.S.Groot: Decision Support for Life Time Management of
HV Infrastructures, CIRED Conference 2005, paper 183
[19] P.Haase: Charting Power System Security, EPRI Journal, September/October 1998, pp. 27-31
[20] D.Holmberg, T.Ostrup, M.Amorouayeche, A.Invernizzi (CIGRE Advisory Group 38.03): Reliability Standards Versus Development of Electric Power Industry, Electra No. 177, April 1998, pp. 95-104
[21] IEEE Task Force: Reporting Bulk Power System Delivery Point Reliability, IEEE
Transactions on Power Systems, Vol 11., No. 3., August 1996, pp. 1262-1268
[22] K.K.Kariuki, R.N.Allan: Reliability Worth in Distribution Plant Replacement Programmes,
The Reliability of Transmission and Distribution Equipment, 29-31 March 1995, Conference
Publication No. 406, IEE, 1995, pp. 162-167
[23] G.H. Kjølle, H.Seljeseth, J.Heggset, F.Trengereid: Quality of Supply Management by Means of Interruption Statistics and Voltage Quality Measurement, ETEP Vol. 13, No. 6, November/December 2003, pp. 373-379
[24] T.Kostic: Asset Management in Electrical Utilities: How Many Facets It Actually Has, IEEE
2003
[25] M.Kruithof, J.Hodemaekers, R.Van Dijk: Quantitative Risk Assessment: A Key to Cost-Effective SAIFI and SAIDI Reduction, CIRED Conference 2005, paper 185
[26] L.Lamarre: When disaster strikes, EPRI Journal, September/October 1998, pp. 9-17


[27] B.Lemström, M.Lehtonen: Cost of Electricity Supply Outages, Report 2-94, VTT Energy, 1994
[28] V.Miranda, L.M.Proenca: Why Risk Analysis Outperforms Probabilistic Choice as the
Effective Decision Support Paradigm for Power System Planning, IEEE Transactions on Power
Systems, Vol. 13, No. 2, May 1998, pp. 643-648
[29] A.Mäkinen, J.Partanen, E.Lakervi: A Practical Approach for Estimating Future Outage Costs in Power Distribution Networks, IEEE Transactions on Power Delivery, Vol. 5, No. 1, January 1990, pp. 311-316
[30] V.Neimane: On Development Planning of Electricity Distribution Networks, Doctoral
Dissertation, Royal Institute of Technology, Department of Electrical Engineering, Electric
Power Systems, Stockholm 2001, 208 pages
[31] North American Electric Reliability Council (NERC): Planning Standards, September 1997
[32] P.Pohjanheimo: A Probabilistic Method for Comprehensive Voltage Sag Management in Power Distribution Systems, Helsinki University of Technology Publications in Power Systems 7, Espoo 2003, 87+19 pages
[33] L.Pottonen: A Method for the Probabilistic Security Analysis of Transmission Grids,
Doctoral Dissertation, Helsinki University of Technology, Power Systems and High Voltage
Engineering, 2005, 207 pages
[34] J.Rothwell: The Reliability Triangle, Transmission & Distribution World, November 2004,
pp. 54-56
[35] G.Strbac, J.Nahman: Reliability Aspects in Operational Structuring of Large-scale Urban
Distribution Systems, The Reliability of Transmission and Distribution Equipment, 29-31
March 1995, Conference Publication No. 406, IEE, 1995, pp. 151-156
[36] Svenska Elverksföreningen: Leveranskvalitet
[37] H.L.Willis: Power Distribution Planning Reference Book Second Edition, Revised and
Expanded, Marcel-Dekker Inc 2004, 1217 pages


A BAYESIAN METHOD FOR RELIABILITY MODELLING APPLIED TO HIGH-VOLTAGE CIRCUIT BREAKERS

Tommie Lindquist
Royal Institute of Technology (KTH)
tommie.lindquist@ets.kth.se

ABSTRACT
This report presents a Bayesian method using rejection sampling to model the reliability of power equipment. The method makes use of the different calculations and tests carried out during the development process as prior information about new components that have not yet failed. This information is combined with data from failures and inspections.
The aim of this report is to demonstrate the proposed method with an example applied to high-voltage circuit breakers (CB), using randomly generated test data.
The results from the CB example show that it is possible to model the reliability of CBs with limited access to failure statistics. They also show that the proposed method makes it possible to model the effect of maintenance. Finally, the method can be applied when modelling sub-components with different ageing factors.

TABLE OF CONTENTS

1 INTRODUCTION
  1.1 Background
  1.2 Theory
      1.2.1 Censored data
      1.2.2 Bayes' theorem
2 METHOD
  2.1 Component modelling
  2.2 Bayesian method
3 CIRCUIT BREAKER EXAMPLE
  3.1 Model
  3.2 Data
      3.2.1 Prior information
      3.2.2 Updating information
4 RESULTS
5 CONCLUSIONS
REFERENCES

INTRODUCTION

1.1 Background
The aim of modelling the reliability of power system components is to be able to predict
failures and thus, by applying the appropriate maintenance tasks, prevent or delay these
failures. Probabilistic reliability models allow the user to predict the likely future behaviour of
the component.
Modelling the reliability of power system components is difficult due to the lack of failure data, which results from the high reliability of the components and the high cost of life tests [1]. This problem can be overcome by the use of Bayesian methods. Bayesian methods allow any previous knowledge one may have about the process to be combined with empirical data. This knowledge may come from past experience with similar equipment, such as failure data or maintenance records, or it may come from tests carried out during the product development process.
In this report a Bayesian method using rejection sampling is presented. The method makes use of the calculations and tests carried out during the development process as prior information about new components that have not yet failed. This information is combined with data from failures and inspections.
The aim of this report is to demonstrate the proposed method with an example applied to high-voltage circuit breakers (CBs), using randomly generated test data.
1.2 Theory
Throughout this report x will be an observation of lifetime, which is represented by the random variable X. Analogously, θ will be considered to be an observation of the parameter of interest, represented by the random variable Θ.
1.2.1 Censored data
Censoring occurs when it is not possible to observe a component's time* to failure exactly, and is very common in reliability data analysis. Basically there are three ways in which reliability data can be censored [2], [3]:
1. Right-censored data; component i has not failed at time xi, giving X > xi. This is very often the result of inspections where no fault has been discovered; nevertheless this information is very important when modelling reliability.
2. Left-censored data; component i has failed before time xi, giving X ≤ xi. This situation may occur when a component breaks down before its first inspection and the failure time is not known exactly.
3. Interval-censored data; component i has failed between times xi−1 and xi, giving xi−1 < X ≤ xi. This is the case when a unit is found to have failed between two inspections.

* Note that time, as it is used in this report, may be something different from calendar time (e.g. number of operations or number of cycles).
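To make the role of the three censoring types concrete, the following sketch (mine, not the report's) shows how each kind of observation could contribute to a log-likelihood, assuming the Weibull lifetime model used later in eq. (12); the function names and example values are hypothetical illustrations only.

    import math

    def weibull_cdf(x, b, c):
        # F(x) = 1 - exp(-(x/b)^c) with scale b and shape c, cf. eq. (12)
        return 1.0 - math.exp(-((x / b) ** c))

    def weibull_logpdf(x, b, c):
        return math.log(c / b) + (c - 1.0) * math.log(x / b) - (x / b) ** c

    def log_likelihood(observations, b, c):
        """observations: list of tuples, e.g. ('failure', x), ('right', x),
        ('left', x) or ('interval', x_lo, x_hi)."""
        ll = 0.0
        for obs in observations:
            kind = obs[0]
            if kind == 'failure':    # exact lifetime observed
                ll += weibull_logpdf(obs[1], b, c)
            elif kind == 'right':    # still working at x: X > x
                ll += math.log(1.0 - weibull_cdf(obs[1], b, c))
            elif kind == 'left':     # failed before x: X <= x
                ll += math.log(weibull_cdf(obs[1], b, c))
            else:                    # failed between inspections: x_lo < X <= x_hi
                ll += math.log(weibull_cdf(obs[2], b, c) - weibull_cdf(obs[1], b, c))
        return ll

    # One observed failure at 14 yrs and one unit still healthy at 20 yrs:
    print(log_likelihood([('failure', 14.0), ('right', 20.0)], b=25.0, c=2.0))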

1.2.2 Bayes' theorem
Bayes' theorem was first formulated by the Reverend Thomas Bayes and was presented posthumously in 1763 [5]. Bayes' theorem provides a mechanism for combining prior information with sample data to make inferences on model parameters [2], [3].
Let B1, B2, ..., Bn be mutually exclusive and exhaustive events contained in a sample space S, such that:

P\left( \bigcup_{i=1}^{n} B_i \right) = 1, \quad B_i \cap B_j = \emptyset \ \text{for } i \neq j, \quad P(B_i) > 0 \ \text{for each } i

and let A be an event such that P(A) > 0. Then for each k:

P(B_k \mid A) = \frac{P(A \mid B_k)\,P(B_k)}{\sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)}    (1)

The basic concept of the Bayesian point of view is that, in the continuous case, θ is interpreted as a realisation of the random variable Θ with some density f(θ). This density represents the prior belief about the value of θ, before any observations have been made; f(θ) is called the prior density of θ. The conditional distribution of θ, given X = x, is then:
f(\theta \mid x) = \frac{f(x, \theta)}{f(x)}    (2)

where f(x, θ) is the joint distribution of X and Θ and is given by:

f(x, \theta) = f(x \mid \theta)\,f(\theta)    (3)

In (2) the marginal distribution of X, f(x), is:

f(x) = \int f(x \mid \theta)\,f(\theta)\,d\theta    (4)

In (2) the denominator, as described in (4), is used only as a normalising constant, due to the fact that once a value for X has been observed, (4) is constant [3]. Hence f(θ | x) is always proportional to f(x | θ) f(θ), which can be written as:

f(\theta \mid x) \propto f(x \mid \theta)\,f(\theta)    (5)

Furthermore, it is possible to predict future events, such as the failure of a component from a specified population, using Bayesian methods. Future events can be predicted by using the Bayesian posterior predictive distribution [2].

If X0 represents a random variable for a new observation, then the posterior predictive probability density function (p.d.f.) of X0 is [3]:

f(x_0 \mid x) = \int f(x_0 \mid \theta)\,f(\theta \mid x)\,d\theta    (6)

2 METHOD

2.1 Component modelling
In this report a sub-component is defined as the smallest replaceable item of a power system component. All sub-components in the component reliability model are considered to be non-repairable, and their times to failure are statistically independent. Non-repairable means that the only maintenance action that can bring a failed sub-component back into a functioning state is replacement.

Consider a component comprising k non-repairable sub-components. Each sub-component i has a lifetime Xi, where Xi is an independent random variable with a p.d.f. fi(x) and x is an observation of Xi.
Using the proposed method, a power system component is modelled as a serial system comprising k sub-components, each with a lifetime Xi, see Figure 1.
[Figure: series reliability block diagram of a power system component made up of sub-components 1, 2, ..., k]
Figure 1. A reliability model representing a power system component, from [8].


The reliability of sub-component i is measured using its failure rate λi(x), defined as:

\lambda_i(x) = \frac{f_i(x)}{1 - \int_0^x f_i(\xi)\,d\xi}    (7)

If the repair time can be assumed to be zero, then the failure rate of the entire power system component, modelled as a series structure as in Figure 1, is defined as:

\lambda_{\mathrm{total}}(x) = \sum_{i=1}^{k} \lambda_i(x)    (8)

Many power system components are very complex and are subject to many different failure modes. The different sub-components often have different failure mechanisms and factors that affect their condition. Because not all sub-components age with calendar time, it is necessary, in order to use equation (8), to put the different sub-components on a common age scale with respect to the different factors affecting their age. This concept of relative age is a way to put the ages of the different sub-components on a common base. The relative age is typically a value between zero and one, where zero means that the sub-component is new and one means that it has reached the accumulated stress for which it was designed and built. Note that a sub-component may have a relative age, Ai(x), larger than one:

A_i(x) = \frac{x_i}{v_i}, \quad v_i \neq 0    (9)

where xi is the accumulated stress and vi is the accumulated stress limit the sub-component was designed to withstand. The advantage of using relative age is that it makes it possible to compare different sub-components with respect to their relative age, even though they might have different failure mechanisms.
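As a small illustration (mine, not the report's), the relative-age scaling of eq. (9) can be computed directly; the design limits below are the ones quoted later for the CB example (30 years, 10000 operations), while the breaker state is an invented input.

    # Design limits v_i for the CB example sub-components (Section 3.1)
    DESIGN_LIMITS = {'A': 30.0, 'B': 30.0,          # years in operation
                     'C': 10000.0, 'D': 10000.0}    # number of operations

    def relative_age(accumulated_stress, design_limit):
        """A_i(x) = x_i / v_i, eq. (9): dimensionless, so time-ageing and
        operation-ageing sub-components share one common age scale."""
        return accumulated_stress / design_limit

    # An invented breaker that is 15 years old and has performed 5000 operations:
    print(relative_age(15.0, DESIGN_LIMITS['A']))    # 0.5 for sub-component A
    print(relative_age(5000.0, DESIGN_LIMITS['C']))  # 0.5 for sub-component C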

2.2 Bayesian method
When applying Bayesian methods to reliability data analysis, the integration needed to calculate the normalising constant in (4) plays a very important role. This integral is rarely possible to evaluate using analytical methods, except in simple cases [6]. A way to overcome this difficulty is to use numerical techniques such as Monte Carlo simulation.
This section describes a Monte Carlo based Bayesian method to model the reliability of a power system apparatus, as presented in [2]. The method makes use of the concept of relative age and is based on a rejection sampling Monte Carlo technique first introduced in [6].
The general Bayesian method for making inferences or predictions is described in Figure 2, where DATA is the data set x = x1, x2, ..., xn of observations of X.
[Figure: flow from the model for DATA and the DATA itself, combined with the prior f(θ), to the posterior f(θ|DATA) and the predictive posterior f(x0|DATA)]
Figure 2. The Bayesian method for making predictions, from [10].


The prior information about the parameter vector θ is expressed by a p.d.f. denoted f(θ). The likelihood for the available updating data and specified model is given by L(DATA | θ). Then, according to (2), the posterior distribution, representing the updated knowledge about θ, can be expressed as:

f(\theta \mid \mathrm{DATA}) = \frac{L(\mathrm{DATA} \mid \theta)\,f(\theta)}{\int L(\mathrm{DATA} \mid \theta)\,f(\theta)\,d\theta} = \frac{R(\theta)\,f(\theta)}{\int R(\theta)\,f(\theta)\,d\theta}    (10)

where the integral is computed over the region f(θ) > 0 and R(θ) is the relative likelihood, as introduced in [2], such that:

R(\theta) = \frac{L(\theta)}{L(\hat{\theta})}    (11)

As more empirical data becomes available, the model is updated using the previous posterior
data as prior information. In this way the model is improved as more data becomes available.
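A minimal sketch of the rejection sampling step, as I read the sampling-resampling idea of [6]: parameter draws from the prior are accepted with probability proportional to the relative likelihood R(θ) of eq. (11). The helper names and the toy gamma priors are assumptions for illustration, and the example reuses the log_likelihood function sketched in Section 1.2.1.

    import math, random

    def rejection_sample_posterior(prior_sampler, loglik, n_draws=20000):
        """Draw theta from the prior and accept with probability
        exp(loglik(theta) - max loglik), i.e. R(theta) of eq. (11).
        Accepted draws are approximately distributed as f(theta | DATA)."""
        draws = [prior_sampler() for _ in range(n_draws)]
        logls = [loglik(theta) for theta in draws]
        log_lmax = max(logls)                  # stands in for L(theta-hat)
        return [theta for theta, ll in zip(draws, logls)
                if random.random() < math.exp(ll - log_lmax)]

    # Toy usage with gamma priors on the Weibull parameters (b, c):
    prior = lambda: (random.gammavariate(2.0, 0.5), random.gammavariate(2.0, 1.0))
    data = [('failure', 0.5), ('right', 0.9)]  # lifetimes on the relative-age scale
    posterior = rejection_sample_posterior(prior, lambda t: log_likelihood(data, *t))
    print(len(posterior), 'accepted draws')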
3 CIRCUIT BREAKER EXAMPLE

3.1 Model
The CB is modelled as several sub-components in series, as shown in Figure 1. In this example the series system is made up of five different sub-components A, B, C, D and E, where E is a fictive sub-component with a failure rate that is constant in time. It may be thought of as collecting all failures caused by sub-components other than A, B, C or D; i.e. a sub-component that has failed only once is not allotted a box of its own in the model. The sub-components all have different ageing factors. Failures of sub-components A and B depend on time, and failures of C and D depend on the number of operations the CB has performed, as can be seen in Table 1.
Table 1. Ageing factors for CB sub-components.

Sub-component   Ageing factor
A               Time
B               Time
C               No. of operations
D               No. of operations
E               Time
The lifetimes of the CB sub-components are assumed to be Weibull distributed random variables Xi ~ W(bi, ci), and the Weibull parameters are in turn considered to be gamma distributed random variables Bi ~ Γ(λb, rb) and Ci ~ Γ(λc, rc).
The Weibull distribution with scale parameter b and shape parameter c is defined such that the p.d.f. is:

f(x \mid b, c) = \frac{c}{b}\left(\frac{x}{b}\right)^{c-1} e^{-(x/b)^c}    (12)

The Weibull parameters are gamma distributed with the p.d.f.:

f(\theta \mid \lambda, r) = \frac{\lambda^r \theta^{r-1}}{\Gamma(r)}\, e^{-\lambda\theta}    (13)

where the gamma function is:

\Gamma(r) = \int_0^{\infty} x^{r-1} e^{-x}\,dx    (14)

The relationship between the different parameters in the model is described in Figure 3 as a
directed acyclic graph (DAG) model. Each quantity is represented as a node, the circles
represent random variables, and the rectangles represent deterministic variables.

[Figure: DAG with deterministic nodes λb, rb, λc, rc pointing to random nodes Bi and Ci, which in turn point to the lifetime Xi]
Figure 3. A DAG model of the relationships between the variables, from [8].
In this CB example vA = vB = 30 years and vC = vD = 10000 operations are used for the relative age scaling. These limits are, in the CB example, found in the IEC standards [11].
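The hierarchy of Figure 3 can be simulated directly; the sketch below (an illustration with invented parameter values, not the report's fitted numbers) draws a Weibull lifetime whose parameters are themselves gamma distributed, matching eqs. (12) and (13).

    import random

    def sample_lifetime(lam_b, r_b, lam_c, r_c):
        """One draw from the Figure 3 hierarchy: B ~ Gamma(lam_b, r_b),
        C ~ Gamma(lam_c, r_c), then X | b, c ~ Weibull(scale=b, shape=c).
        Python's gammavariate takes (shape, scale), so scale = 1/lambda;
        weibullvariate takes (scale, shape) = (b, c)."""
        b = random.gammavariate(r_b, 1.0 / lam_b)
        c = random.gammavariate(r_c, 1.0 / lam_c)
        return random.weibullvariate(b, c)

    # Five simulated lifetimes (on the relative-age scale) for invented priors:
    print([round(sample_lifetime(2.0, 2.0, 1.0, 2.0), 3) for _ in range(5)])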
3.2 Data
Due to the difficulties in obtaining high quality failure data for power system components this
CB example is presented using randomly generated test data. This data was generated so that
it was within reasonable limits based on previous CB failure studies, such as [8] and [9].
The Weibull parameters for sub-component E are b=5 and c=1 and are not updated.
3.2.1 Prior information
The prior information can be obtained from the tests and calculations carried out by the
manufacturer during the design and construction process. In this example the prior
information is a set of Weibull parameters, see Table 3.
3.2.2 Updating information
The updating information is empirical data obtained from maintenance records and failure
statistics. In this example each sub-component has failed five times giving 25 observed
failures during the studied time period. Hence, every sub-component model is updated using
five observed lifetimes and 20 right-censored observations. It is assumed that no previous
failures have occurred. The test data used in this example can be found in table 2. No further
information about the total CB population is assumed to be known.
In column one of Table 2 the failing sub-component is the component that failed; the rest are considered right-censored observations of X.

Table 2. Updating data.

Failing          Time in           No. of
sub-component    operation [yrs]   operations
A                14                83
A                19                121
A                18                99
A                5                 50
A                13                253
B                20                83
B                25                145
B                26                121
B                29                97
B                3                 345
C                10                2501
C                7                 1890
C                12                3251
C                9                 988
C                15                5478
D                8                 6265
D                11                3888
D                12                5550
D                2                 1600
D                4                 2101
E                3                 10
E                8                 130
E                21                145
E                22                6120
E                30                901
4 RESULTS

The prior Weibull p.d.f. for each sub-component supplied by the manufacturer is combined
with a Weibull p.d.f. fitted to the empirical data in Table 2 by using equation (2), as described
in Figure 2. The predictive posterior distribution for the sub-component lifetime is then
obtained using equation (6). The sub-component failure rates are subsequently found using
equation (7) and for the complete CB failure rate equation (8) is used.
Figure 4 shows the prior, updating and predictive posterior distributions of the lifetimes of sub-components A, B, C and D. Common to all sub-components is the large variance of the predicted lifetimes. This variance represents the uncertainty present when making predictions into the future. Table 3 shows a summary of all Weibull parameters.
Table 3. Weibull parameters for the prior, updating and posterior distributions.

            Sub-component A    Sub-component B    Sub-component C    Sub-component D
            b        c         b        c         b        c         b        c
Prior       1.0000   2.0000    0.9000   2.7000    0.7000   2.1000    0.3500   1.3000
Updating    1.1936   2.0170    1.0245   2.9157    0.6417   1.6035    0.5846   2.4165
Posterior   1.3394   1.4084    1.1523   1.8807    0.7407   1.1827    0.6140   1.5504

The expected value of the prior p.d.f. for sub-component A is 0.8862, while that of the updating p.d.f. is 1.2196. This is the case even though all failures of sub-component A occurred before the units reached the relative age Ai(x) = 1. The reason is that there are many right-censored observations of sub-component A lifetimes exceeding those of the sub-component A failures, see Table 2.
The left graph of Figure 5 shows the failure rate of all CB sub-components as a function of
their relative age and the right graph shows the total predicted failure rate for the CB. It is
clear that the total failure rate is dominated by sub-component D.

Figure 4. Prior, updating and posterior distributions for four CB sub-components.

Figure 5. Failure rates for all sub-components and the CB total failure rate.
Figure 6 shows the result if sub-component D, which dominates the total failure rate, is replaced when it reaches the relative age AD(x) = 0.5. In this example it is assumed that the CB operates approximately 333 times/year; hence the relative age of 0.5 for sub-component D equals 5000 CB operations or 15 years in operation. It is assumed that the replaced sub-component is as good as new, i.e. its relative age is zero after replacement. Note that the x-axis is time in Figure 6.

Figure 6. Failure rates for all sub-components and the CB total failure rate after replacement
of sub-component D at AD(x)=0.5.
5 CONCLUSIONS

This report proposes a method to model the reliability of power system components, using
development test data from the manufacturer along with empirical data such as maintenance
records and failure statistics. The two data types are combined using a Bayesian method by
employing rejection sampling Monte Carlo.
Using the proposed method it is possible to model the reliability of power system components
with limited access to failure statistics. The level of detail of the data is what limits the
accuracy of the model.


Furthermore, it is possible to model the reliability of sub-components that are affected by different ageing factors by using the relative age concept.
The proposed method can be used for assessing the effect of component maintenance in the form of sub-component replacement.
6 REFERENCES

[1] Atella, F., Chiodo, E. and Pagano, M. (2001). Dynamic discriminant analysis for predictive maintenance of electrical components subjected to stochastic wear, COMPEL, 21(1), pp. 98-115.
[2] Meeker, W.Q. and Escobar, L.A. Statistical methods for reliability data. John Wiley & Sons Inc., 2001, ISBN 0-471-14328-6.
[3] Rausand, M. and Hoyland, A. System reliability theory: Models and statistical methods. John Wiley & Sons Inc., 2nd ed., 2004, ISBN 0-471-47133-X.
[4] Blom, G. Probability theory with applications (Sannolikhetsteori med tillämpningar). Studentlitteratur, Lund, Sweden, 4th ed., 1989, ISBN 91-44-03594-2. In Swedish.
[5] Bayes, T. (1763). An essay towards solving a Problem in the Doctrine of Chances. [WWW] http://www.stat.ucla.edu/history/essay.pdf (Bayes's essay in the original notation), January 20, 2005.
[6] Smith, A.F.M. and Gelfand, A.E. (1992). Bayesian statistics without tears: A sampling-resampling perspective, The American Statistician, 46(2), pp. 84-88.
[7] Englund, G. Computer intensive methods for applied statistics (Datorintensiva metoder för tillämpad matematisk statistik). Course material, dept. of Mathematics, KTH, Stockholm, Sweden. In Swedish.
[8] Lindquist, T., Bertling, L. and Eriksson, R. (2005). A method for age modeling of power system components based on experiences from the design process with the purpose of maintenance optimization. In Proc. of the 51st Reliability and Maintainability Symposium (RAMS), Alexandria, Virginia, USA.
[9] Cigré (1994). Final report of the second international enquiry on high voltage circuit-breaker failures and defects in service. Working Group 06.13. Report 83.
[10] Lindquist, T. (2005). On reliability modelling of ageing equipment in electric power systems with regard to the effect of maintenance. Licentiate thesis, dept. of Electrical Engineering, KTH, Stockholm, ISBN 91-7178-054-8.
[11] IEC 62271-100. High-voltage switchgear and controlgear, Part 100: High-voltage alternating-current circuit-breakers. Ed. 1.1, 2003-05.


MONTE CARLO SIMULATION METHODS FOR RELIABILITY MODELLING AND ASSESSMENT

Jussi Palola
Helsinki Energy - Network Investments
jussi.palola@helsinginenergia.fi

1 INTRODUCTION

System behaviour can be seen as stochastic by nature. This has been acknowledged since the 1930s, and there is a huge amount of publications dealing with the development of models, techniques and applications for reliability assessment of power systems. [13] Complex systems, such as large transmission and distribution networks, can be simulated and observed through the stochastic behaviour of a reliability model. Monte Carlo simulation becomes useful where analytical equations describing the model give too simplified a solution. [1] By itself, deterministic analytical reliability assessment can lead to under- or overinvestment if the severity of an occurrence is observed without taking its stochastic likelihood into account. Appropriate evaluation of likelihood and severity together creates indices that can represent system risk for investment planning. [10]
Power system reliability assessment can be divided into the two basic aspects of system adequacy (sufficiency) and system security. System adequacy is the existence of sufficient facilities to satisfy consumer load demand and to meet the system operational constraints in the static condition. System security relates to the ability to respond to dynamic or transient disturbances arising within the system. [10]
It is necessary to know that most of the probabilistic techniques presently available for reliability evaluation concentrate on adequacy assessment. The ability to assess security is therefore limited. Past system performance indices include both the effects of inadequacy and those of insecurity; thus it is important to link purely theoretical evaluations to historical data when evaluating results. [10]
2 RELIABILITY MODELING

The electrical power system is highly integrated and complex. That is why reliability assessment usually concentrates on selected functional zones such as generating stations, transmission systems and distribution systems. [5] Still, a majority of interruptions in developed nations result from problems occurring between customer meters and distribution substations. [6]
2.1 Failure State Modeling

Failures are events where a device suddenly does not operate as intended, and failures occur in all parts of the power system: cables are damaged, for example by digging activities, lightning strikes, breakers suddenly open, transformers burn out, etc. [11]

Figure 1. Basic reliability assessment scheme [11]


Every reliability analysis must start by recognizing and modeling all relevant failures which may affect the system's reliability. [11] In addition to equipment, animals, vegetation and weather, humans are directly responsible for many customer interruptions. There are too many ways for humans to cause interruptions to analyze them exhaustively, but digging activity is one good example. The most common mistakes by utility workers can generally be classified into switching errors, direct faults and indirect faults. [6]
2.1.1 Analytical Simulation
Analytical simulation models each system contingency, computes the impact of each contingency, and weights this impact based on the expected frequency of the contingency. At first glance it appears to be very close to a form of contingency analysis, as with the N-1 criterion, except that probabilities are assigned to each contingency case. [15] [3]
[Figure: reliability block diagram of components 1, 2 and 3]
Figure 2. Reliability block diagram: three components in a series-parallel configuration. [3]
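For a block diagram like Figure 2, the analytical reliability follows from two composition rules; the sketch below is a generic illustration (the reading of the figure as component 1 in series with the redundant pair 2 and 3, and the numerical values, are my assumptions).

    def series(*rs):
        # Series blocks: all must work, so reliabilities multiply.
        r = 1.0
        for x in rs:
            r *= x
        return r

    def parallel(*rs):
        # Redundant blocks: the system fails only if all blocks fail.
        q = 1.0
        for x in rs:
            q *= (1.0 - x)
        return 1.0 - q

    # Component 1 in series with the redundant pair (2, 3), illustrative values:
    print(series(0.99, parallel(0.95, 0.95)))   # 0.987525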

Analytic calculations range from matrix manipulations to state enumeration techniques. State enumeration means that all relevant failures are analyzed one by one, as with deterministic fault effect analysis. [11] Analytic methods generally provide expectation indices in a relatively short computing time, but assumptions are frequently required in order to simplify the problem. This is especially the case when complex systems and complex operating procedures have to be modeled. [13]
2.1.2 Markov Modeling
The Markov equation approach is sometimes called the state space method since it is based on a state space diagram. The main advantage of this technique is the clear picture of all states and the transitions between them. [7] Markov modeling is based on analyzing the states that a system could be in: states can be such things as everything operating normally, or a component being derated. It focuses, though, on analyzing the transitions between these states: it analyzes the probability that the system moves from an operational condition to failure, and also how long the restoration time from the failure would be. [9]
Markov modeling is especially appropriate when the details of the transitions between system states are known or important, for example if planners want to study how different repair and replacement part policies would impact the availability of the system. [9]

[Figure: three-state Markov diagram with states Up (1), Out (2) and Down (3), connected by the transition rates listed below]
Figure 3. Example of a three-state breaker Markov model. [5] [13]
(1) State before fault
(2) State after the fault but before component isolation
(3) State after fault isolation but before repair is completed
where λa = active failure transition rate, λp = passive failure transition rate, λsw = switching rate, and μ = repair rate.
Passive event: a component failure mode that does not have an impact on the remaining healthy components. Active event: a component failure mode that can cause the removal of other healthy components and branches from service. [13]

[Figure: three-state operating cycle (Up, Derated, Down) over time, and the corresponding Markov model with transition rates λ12, λ21, λ13, λ31, λ23, λ32 between the states Unit Up (1), Unit Derated (2) and Unit Down (3), with partial and complete repair transitions]
Figure 4. Three-state operating cycle and Markov model for the base load unit. [10]
Analytic calculations on the basis of the Markov model are normally much faster than Monte Carlo simulations, and they produce exact results. Variances can also be produced, but probability distributions are normally not obtainable. The disadvantage of the analytic Markov methods is that there are severe limitations to the calculation of interruption cost indices. [11]
2.2 Monte Carlo Approach

The name of the Monte Carlo method and its systematic use date back to the Second World War, when nuclear physicists at Los Alamos were working on the difficult problem of determining how far neutrons would travel through various types of materials. [3] [10]
Monte Carlo simulation is similar to analytical simulation, but it models random contingencies based on probabilities of occurrence rather than on expected events. This allows component parameters to be modeled with probability distribution functions instead of expected values. For applications requiring the determination of expected values of reliability, analytical simulation is the best method for distribution system assessment. Monte Carlo simulation becomes necessary if statistical results other than expected values are required, for example analysis of the distribution of expected SAIDI from year to year. [9] [15]
Monte Carlo simulation is the repeated chronological simulation of the power system.
During each simulation, faults will occur randomly, as in real-life, and the reactions of the
system to these faults are simulated chronologically. The performance of the system is
then monitored during the simulations. [11]
The advantage of Monte Carlo simulation is that all aspects of the power system can be
addressed, and there are no limits to the stochastic models or to the possibilities for
measuring performance. The disadvantages are the often very high computational
demands. Monte Carlo simulation produces stochastic results only, which are inexact,
but for which it is possible to calculate variances and even probability distributions. [11]

2.3 Random-Number Generation

The name of the method points to uncertainty, and it is implemented with random numbers. A simple physical random-number generator would be a die, a coin or a roulette table [8]. Monte Carlo random numbers are usually generated by computer in the following form, where the output is a row of integers with values between 1 and (N − 1): [1]

U_{i+1} = (a\,U_i) \bmod N, \quad U_0 = 1, \ \text{where } a \text{ and } N \text{ have to be chosen, and}

A \bmod B = A - B \cdot \mathrm{int}(A/B)

Usually the resulting integer is divided by N to get a random number between 0 and 1. A popular value for N is 2^31 − 1, and the result would be a random draw from the uniform distribution between 0 and 1. [1] Random values generated with a mathematical method are pseudo random numbers, and they should be tested statistically to ensure randomness. There are three basic requirements for random number generation (a minimal generator along these lines is sketched after the list):
1. Uniformity: The random numbers should be uniformly distributed between [0,1].
2. Independence: There should be minimal correlation between random numbers.
3. Long period: The repeat period should be sufficiently long. [10]
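A minimal multiplicative congruential generator of the above form follows; the multiplier a = 16807 with N = 2^31 − 1 is one classic textbook choice (an assumption here, since the text only requires that a and N be chosen).

    def lcg(a=16807, n=2**31 - 1, u0=1):
        """U_{i+1} = (a * U_i) mod N; dividing by N maps the integers
        into (0, 1), approximating the uniform distribution."""
        u = u0
        while True:
            u = (a * u) % n
            yield u / n

    gen = lcg()
    print([round(next(gen), 6) for _ in range(5)])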
[Figure: scatter plot of 5000 generated random number values, uniformly distributed between 0 and 1]
Figure 5. Five thousand random numbers generated with Excel's Rand() function; the correlation with another random number generation round was -0.000319.
Random numbers should be able to represent various reliability indices, and that is why distributions of different shapes are needed. [10] Probability distribution functions are mathematical equations allowing a large amount of information, characteristics and behavior to be described by a small number of parameters. A deep understanding of each equation is not mandatory, but the minimum suggestion is to have a conceptual understanding of probability distribution functions and their relation to reality. [6]

Studies of the distributions associated with the basic reliability indices indicate that the load point failure rate is approximately Poisson distributed. It has been noted that if the restoration times of components can be assumed to be exponentially distributed, the load point outage duration is approximately gamma distributed. There are, however, many distribution systems for which the gamma distribution does not describe the real indices. Probability distributions for the annual load point outage time and the SAIDI, SAIFI and CAIDI indices also cannot be represented by common distributions. [10]
It is also possible to construct empirical distributions for the random numbers in order to obtain more case-sensitive results. Empirical distributions are especially used in financial market analysis, where, for example, an oil stock investment has a characteristic risk profile for the return on investment, generated from tens of years of experience. [12]
It should therefore be expected that SAIDI, SAIFI and CAIDI statistics obtained from actual circuit or system operating behavior will vary over time, as each year is simply a snapshot of a continuum in time. The ability to generate the index distribution by Monte Carlo provides the opportunity to appreciate and quantify the deviations associated with these important customer parameters. [10]
2.4 Sequential Modeling

The sequential Monte Carlo method refers to a simulation process over a chronological time span, although the length of the time segment can be changed depending on the dynamic characteristics of the system. There are different approaches to creating an artificial system state transition cycle. The most popular one is the so-called state duration sampling. [6]
2.4.1 State Duration Sampling Approach
The state duration sampling approach is based on sampling the probability distribution of the component state duration. In this approach, chronological component state transition processes are first simulated by sampling for all components. The chronological system state transition process is then created by combining the chronological component state transition processes. The advantages of the state duration sampling approach are:
+ It can be easily used to calculate the actual frequency index.
+ Any state duration distribution can be easily considered.
+ The statistical probability distributions of the reliability indices can be calculated in addition to their expected values. [10]

The disadvantages of this approach are (a chronological sketch of the approach for a single component follows after the list):
- Compared to the state sampling approach, it requires more computing time and storage, because it is necessary to generate random variates following a given distribution for each component and to store information on the chronological component state transition processes of all components over a long time span.
- This approach requires parameters associated with all component state duration distributions. Even under a simple exponential assumption, these are all the transition rates between the states of each component. In some cases, especially for a multistate component representation, it might be quite difficult to provide all these data in an actual system application. [10]
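As referenced above, a minimal sketch of state duration sampling for a single repairable component; exponential up and down durations are my simplifying assumption, and any other distribution could be substituted.

    import random

    def simulate_component(failure_rate, repair_rate, horizon):
        """Build one chronological up/down history and return the
        unavailability and the failure frequency per unit time."""
        t, downtime, failures = 0.0, 0.0, 0
        while t < horizon:
            t += random.expovariate(failure_rate)     # sampled up-time
            if t >= horizon:
                break
            failures += 1
            repair = random.expovariate(repair_rate)  # sampled down-time
            downtime += min(repair, horizon - t)
            t += repair
        return downtime / horizon, failures / horizon

    # e.g. 0.5 failures/yr and repairs averaging about a week (rate 50/yr):
    print(simulate_component(0.5, 50.0, horizon=100000.0))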

2.5 Continuous Monte Carlo Simulation

Continuous Monte Carlo simulation is sometimes called the state sampling approach. It is widely used in power system risk evaluation. The concept is based on the fact that a system state is a combination of all component states, and each component state can be determined by sampling the probability of the component appearing in that state. [7]
Let us take a simple example of a hybrid analytical Monte Carlo state sampling approach: the figure below shows a simplified substation model. The substation is connected to two high-voltage transmission lines and has two main power transformers and medium-voltage busbars. The aim is to simulate the state reliability of medium-voltage feeder A with the continuous Monte Carlo method. Reliability data for the components is combined from Helen Network, Nordel and Reliability Evaluation of Power Systems [13] statistics.

Table 1. Reliability statistics for the substation state sampling simulation
(reliability data from Helen Network and NORDEL statistics).

Element                  Faults/year = λ    Reliability R = e^(-λ)
110 kV overhead line     0.0218             0.97844
110 kV breaker           0.00238            0.99762
Power transformer        0.0038             0.99621
Medium voltage breaker   0.000412           0.99959
20 kV feeder             0.0536             0.94781

[Figure: substation single-line diagram with two HV infeeds, transformers T1 and T2, MV busbars and reliability blocks R1-R4, analytically simplified for the Monte Carlo simulation]
Figure 6. Simplified reliability model for substation feeder A.


For each component, a random number between zero and one is generated. If this random number is less than Rx, no failure occurs in that simulation round. If the number is greater, the component fails, and the system state is obtained by collecting all component states into a summary conclusion. The essential requirement of a reliability assessment in this case is to identify whether the failure of a component, or a combination of components, causes a failure at load point A. [13] [16] [15] [10]
Table 2. Result from the Monte Carlo state sampling simulation with a macro program. [16]
(Success in an iteration: (random number) < (R-value).)

                                R1            R2            R3            R4            SYSTEM
Reliability                     0.999429223   0.993429208   0.94742119    0.993020473
Random number this iteration    0.3802        0.7190        0.3476        0.2337
Success = (1) this iteration    1             1             1             1             1
Failure = (0) this iteration    0             0             0             0             0
Cumulative successes            19983         19878         18905         19851         18769
Cumulative failures             18            123           1096          150           1232
Total iterations                20001         20001         20001         20001         20001
Simulated reliability           0.9991        0.9939        0.9452        0.9925        0.9384
Theoretical reliability         0.9994        0.9934        0.9474        0.9930        0.9407
% error                         -0.03 %       0.04 %        -0.23 %       -0.05 %       -0.24 %

The Monte Carlo simulated reliability for the whole system is 0.9384, which differs by -0.24 percent from the analytical result. Out of 20001 simulation rounds altogether, 1232 led to system failure and 18769 to success. A minimal sketch of this kind of state sampling run is given below.
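The sketch below is my reconstruction of the run behind Table 2, not the macro of [16]: each iteration draws one uniform number per block, and the system is assumed to succeed only when all four blocks succeed, which matches the single iteration shown in the table.

    import random

    # Block reliabilities R1..R4 from Table 2
    R = [0.999429223, 0.993429208, 0.94742119, 0.993020473]

    def state_sampling(reliabilities, iterations=20001):
        """Continuous Monte Carlo (state sampling): a block is up in a
        round when its uniform draw falls below its R-value; count the
        rounds in which every block is up."""
        ok = 0
        for _ in range(iterations):
            if all(random.random() < r for r in reliabilities):
                ok += 1
        return ok / iterations

    print(state_sampling(R))   # compare with the simulated 0.9384 in Table 2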
In general, the advantages of the state sampling approach are:
+ Sampling is relatively simple. It is only necessary to generate uniformly distributed random numbers between [0, 1], without a need to sample a distribution function.
+ The required basic reliability data are relatively few. Only the component-state probabilities are required.
+ The idea of state sampling applies not only to component failure events but can also be easily generalized to sample the states of other parameters in power system reliability evaluation, such as load, hydrological and weather states. [10]

The disadvantage of this approach is that it cannot be used by itself to calculate the
actual frequency index. [10]
2.6 Weather Simulations

In general, weather conditions are one of the main causes of equipment failures: a worldwide survey over several years in the 1980s indicated that about 20 percent of failures arose from weather conditions. [14] The characteristics of the electrical network affect what kind of impact different environmental conditions have on system functionality. For example, an urban underground network is more vulnerable to floods than an overhead-line delivery network; on the other hand, bare overhead-line conductors are more exposed to thunderstorms and snowstorms.
During normal weather, equipment failures are considered to be independent events, but in severe weather conditions many equipment failures can occur at the same time, and we are then dealing with common-cause failures [17]. This strains utility resources and can lead to long restoration times for many interrupted customers, and also to penalty payments. [6] This has recently been seen in rural networks in Sweden and Finland, where overhead-line distribution networks suffered interruptions due to storm conditions.

Figure 7. Failure rate as a function of time - normal and adverse weather. [1]
Weather conditions can be modeled in a long-term view with Monte Carlo. For example, a component which is sensitive to certain weather effects can have a changing failure rate, modeled on the basis of a two-state weather function. [1]
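A minimal sketch of such a two-state weather model (all rates and spell lengths below are invented for illustration): the weather alternates between normal and adverse spells, and failures arrive at whichever rate applies to the current state.

    import random

    def failures_in_year(lam_normal, lam_adverse,
                         mean_normal_h=720.0, mean_adverse_h=24.0):
        """Alternate exponentially distributed normal/adverse weather
        spells over one year (8760 h) and count failure arrivals at the
        rate of the current weather state."""
        t, failures, adverse = 0.0, 0, False
        while t < 8760.0:
            mean = mean_adverse_h if adverse else mean_normal_h
            spell = min(random.expovariate(1.0 / mean), 8760.0 - t)
            lam = lam_adverse if adverse else lam_normal
            gap = random.expovariate(lam)
            while gap < spell:               # Poisson arrivals in this spell
                failures += 1
                gap += random.expovariate(lam)
            t += spell
            adverse = not adverse
        return failures

    print(failures_in_year(lam_normal=1e-5, lam_adverse=5e-3))  # per hour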
2.6.1 System State Transition Sampling Approach
This approach focuses on the state transitions of the whole system instead of component states or component state transition processes [7] [10]. The advantages of this approach are:
+ It can be easily used to calculate the exact frequency index, without the need to sample the distribution function and store chronological information as in the state duration sampling approach.
+ In the state sampling approach, m random numbers are required to obtain a system state for an m-component system. This approach requires only one random number to produce a system state. [10]

The disadvantage of this approach is that it only applies to exponentially distributed component state durations. It should be noted, however, that the exponential distribution is the most commonly used distribution in reliability evaluation. [10]
2.7 Monte Carlo Simulation Error Evaluation

The selection of a stopping rule is an important factor in Monte Carlo simulation. The stopping criterion can be a fixed number of samples, a coefficient of variation tolerance, or a combination of both. A large number of samples or a small tolerance coefficient can provide relatively high accuracy in the reliability indices, of course at the cost of increased calculation time. [10] [1]
[Figure: average results of five Monte Carlo simulations converging as the number of samples grows from 0 to 30000]
Figure 8. Monte Carlo reliability simulations for a two-component parallel system. [16]
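A minimal sketch of a coefficient-of-variation stopping rule (the tolerance, batch size and Bernoulli test index are illustrative assumptions): sampling continues until the relative standard error of the estimate falls below the tolerance, or a sample cap is reached.

    import math, random

    def run_until_converged(trial, tol=0.02, batch=1000, max_samples=10**6):
        n, s, s2 = 0, 0.0, 0.0
        mean, cov = 0.0, float('inf')
        while n < max_samples:
            for _ in range(batch):
                x = trial()
                n += 1
                s += x
                s2 += x * x
            mean = s / n
            var = max(s2 / n - mean * mean, 0.0)
            # coefficient of variation of the estimator: std(mean) / mean
            cov = math.sqrt(var / n) / mean if mean > 0 else float('inf')
            if cov < tol:
                break
        return mean, cov, n

    # Estimating a failure probability of about 0.06 per sampled year:
    print(run_until_converged(lambda: 1.0 if random.random() < 0.06 else 0.0))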
Variance reduction techniques can be used to improve the effectiveness of Monte Carlo simulation. The variance cannot be reduced to zero, and therefore it is always necessary to use a reasonably and sufficiently large number of samples. [10] [1]
2.7.1 Sensitivity Analysis
Sensitivity analyses are useful for many aspects of system analysis. First, they are useful when the aim is to mitigate uncertainty concerning reliability data. Second, sensitivity results can be used to calibrate systems to historical reliability data. Third, sensitivity analyses can be used to find the most effective actions for achieving a significant reliability impact on the system. For example, if system SAIDI is highly sensitive to feeder failure rates, reducing cable failures will probably reduce SAIDI. By itself this analysis does not necessarily give a cost-effective result, but rather a view of where it is effective to invest, if anywhere. [6]


[Figure: bar chart of the sensitivities of the SAIFI and SAIDI reliability indices (in % per failure rate change) to the failure rates of padmount switches, substation transformers and underground cables]
Figure 9. Example from a southern US military base distribution system. [6]
Here SAIFI is least sensitive, and SAIDI most sensitive, to the failure rate of substation transformers. Depending on the reliability improvement goals, addressing substation transformer failures could be very effective for reducing SAIDI and very ineffective for reducing SAIFI. [6] In principle, system reliability becomes more predictable as system size increases. [10]
3 RELIABILITY ASSESSMENT

The main aims of reliability assessment are to design new systems to meet explicit reliability targets, and to identify reliability problems in existing systems and those arising from system expansion. Assessing the effectiveness of reliability improvement projects, and designing systems that are best suited to performance-based rates from the customer's point of view, are also important. [9]
Assessing the reliability of a power system means calculating a set of performance indicators. Two different sets of indicators can be distinguished: local indicators and system indicators. Local indicators are calculated for a specific point in the system; an example of a local indicator could be the interruption costs per year for a specific customer. System indicators, like the average number of interruptions per year per customer, express the overall system performance. [11]
3.1 Worth Assessment Using the Cost of Interruption Data

One approach which can be used to start a reliability assessment is to relate it to the cost to the customer when power delivery fails. Standard load classifications, in which consumer sectors are identified, make data processing reasonable. The next step is to create individual Customer Damage Functions (CDFs) relating the outage consequences to the associated cost. A CDF can be updated with cost-of-interruption customer questionnaires; another option is to update the values in relation to generally known indices such as Gross Domestic Product, the Consumer Price Index and energy consumption [4]. [2]
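A minimal sketch of how a customer damage function could be stored and evaluated (the sector names and all cost figures are invented placeholders, not data from [2] or [4]): the cost of an outage is interpolated from a per-sector table of (duration, cost) points.

    # Hypothetical CDFs: interruption cost (EUR/kW) vs. outage duration (h)
    CDF = {
        'residential': [(0.0, 0.1), (1.0, 0.5), (4.0, 2.0), (8.0, 4.0)],
        'industrial':  [(0.0, 1.5), (1.0, 6.0), (4.0, 20.0), (8.0, 45.0)],
    }

    def interruption_cost(sector, hours):
        """Piecewise-linear interpolation on the sector's (duration, cost)
        points; durations beyond the last point use the last cost value."""
        pts = CDF[sector]
        if hours >= pts[-1][0]:
            return pts[-1][1]
        for (t0, c0), (t1, c1) in zip(pts, pts[1:]):
            if t0 <= hours <= t1:
                return c0 + (c1 - c0) * (hours - t0) / (t1 - t0)

    print(interruption_cost('industrial', 2.5))   # cost of a 2.5 h outage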


3.2 Using Monte Carlo Simulations for Analyzing Quality Costs

Stochastic methods can produce system performance rates, and with the customer damage functions the reliability level can be transformed into a cost of interruptions. By also taking reliability investments into account, we can start working on the challenge of optimizing reliability investments. The operation, quality and maintenance cost functions are dependent on the investments, and therefore they should all be optimized in the same examination in order to have the best investment reference. This approach is needed especially to make investment planning more comprehensive and transparent.

Figure 10. Simple model for the network investment cost function structure.

Monte Carlo simulations can produce, for example, SAIDI distributions for a certain network in order to avoid interruption penalties or the like. [9] [15] The next example figure is the result of a 1000-random-year simulation for a real U.S. utility distribution system with three voltage levels, nine substations, more than 770 kilometers of feeder and approximately 8000 system components altogether. [15]

Figure 11. Performance-based rate SAIDI for the quality cost evaluation. [15]
The Monte Carlo method can be used to specify characteristics for optimized reliability assessment and management of the power system, together with other reliability modeling methods. The Monte Carlo method's specialty is modeling events with stochastic input in order to find the uncertainty distribution bound to the analytical results.
4 SOURCES
1. Math H. J. Bollen, Understanding Power Quality Problems: Voltage Sags and Interruptions, IEEE Press, New York, 2000.
2. A. Sankarakrishnan, R. Billinton, Effective Techniques for Reliability Worth Assessment in Composite Power System Networks Using Monte Carlo Simulation, IEEE Transactions on Power Systems, Vol. 11, No. 3, August 1996.
3. K. E. Forry, A General Monte Carlo Simulation Model for Estimating Large Scale System Reliability and Availability, Clemson University, Ph.D., Electrical Engineering, U.S.A., 1972.
4. K. Kivikko, Extrapolating Customer Damage Functions, Presentation 16.6.2005 in Espoo, Technical Research Centre of Finland (VTT).
5. R. Billinton, G. Lian, Station Reliability Evaluation Using a Monte Carlo Approach, IEEE Transactions on Power Delivery, Vol. 8, No. 3, July 1993.
6. Richard E. Brown, Electrical Power Distribution Reliability, ABB Inc., Raleigh, North Carolina, Marcel Dekker, New York, 2002.
7. Wenyuan Li, Risk Assessment of Power Systems: Models, Methods, and Applications, IEEE Press Series on Power Engineering, Wiley & Sons, 2005.
8. I. M. Sobol, The Monte Carlo Method, Little Mathematics Library, Mir Publishers, Moscow, 1975.
9. H. Lee Willis, Gregory V. Welch, Randall R. Schrieber, Aging Power Delivery Infrastructures, ABB Power T&D Company Inc., Raleigh, North Carolina, Marcel Dekker, New York, 2001.
10. Roy Billinton, Wenyuan Li, Reliability Assessment of Electrical Power Systems Using Monte Carlo Methods, Plenum Press, New York, 1994.
11. Jasper van Casteren, Assessment of Interruption Costs in Electric Power Systems using the Weibull-Markov Model, PhD thesis, Department of Electric Power Engineering, Chalmers University of Technology, Gothenburg, Sweden, 2003.
12. VAR: Understanding and Applying Value-at-Risk, Risk Publications, Financial Engineering Ltd, London, 1997; and Monte Carlo: Methodologies and Applications for Pricing and Risk Management, Risk Publications, Financial Engineering Ltd, London, 1998. Also conversation with Helen Trading portfolio manager Jaakko Kontio.
13. Roy Billinton, Ronald N. Allan, Reliability Evaluation of Power Systems, second edition, Plenum Press, New York, 1996.
14. U. G. Knight, Power Systems in Emergencies: From Contingency Planning to Crisis Management, John Wiley & Sons Ltd, 2001.
15. H. Lee Willis, Power Distribution Planning Reference Book, second edition, revised and expanded, Marcel Dekker Inc., New York and Basel, 2004.
16. Modified H. Paul Barringer MS Excel macros (MCSample.xls) by Jussi Palola and Petteri Haveri.
17. Liisa Pottonen, A Method for the Probabilistic Security Analysis of Transmission Grids, Doctoral Dissertation, Helsinki University of Technology, Espoo, 2005.

DIFFERENT MAINTENANCE STRATEGIES AND THEIR IMPACT ON POWER SYSTEM RELIABILITY

Anna Brädd
Researcher, Lappeenranta University of Technology, Department of Electrical Engineering
Anna.bradd@lut.fi

1 Introduction

These days utilities are forced to rethink their maintenance strategies, and new investments must be carried out as effectively as possible, due to the hardening competition in the electricity market. In order to survive in a deregulated power market, utilities simply have to be as cost-effective as possible. Cost-effectiveness today means minimizing costs while at the same time living up to the reliability demands of customers and the regulator. The costs may be divided into the cost of failure, the cost of preventive maintenance and capital costs. [1] This paper deals with the management of different maintenance strategies and their impact on power system reliability.
Taking a look at preventive maintenance means facing extensive renewals: buying new components or refurbishing a component into new condition. When planning major renewals it thus becomes important to find the point in time when such a renewal must be carried out. By postponing renewal, capital costs are saved; but on the other hand, the risk of a costly failure increases. In maintenance management it becomes very important to be as effective as possible when deciding on the point in time for a renewal. [1]
Obviously, every manager's intention is to run the equipment as much as possible without costly breakdowns. Moreover, the fastest way to increase earnings in a short-term perspective is to cut down on maintenance and postpone renewals. In many cases this kind of approach has proven successful due to the very long operative lifetime of many components and their inherent reliability. However, in a long-term perspective this might still not be so cost-efficient. In every case, the network manager has to manage the assets he is responsible for, which means that he can make investments just to meet the needs of today or the needs in a long-term perspective, say 25 years. The question here really is: what will the total costs become, and what condition will the technical system be in? [1]
Power system reliability is affected by strategic planning of the network structure and layout, choices made regarding the use of network components, increases in system quality, the electrical quality requirements of the end customer, geographical constraints in the network, as well as maintenance activities. Currently, when the economic situation of electrical utilities is quite constrained, utilities are forced to get the most out of the devices they already own through more effective operating policies, including improved maintenance programs. All in all, it is fair to say that maintenance is becoming an important part of network-related asset management. [2]

2 Presentation of maintenance strategies

Two fundamental maintenance strategies to keep in mind when talking about maintenance are i) replacement by a new component (or "good as new") and ii) replacement with a less costly component, giving a limited improvement of the component's condition. Moreover, methods are divided into categories where maintenance is performed at fixed intervals and, alternatively, where it is carried out as needed. These methods are further divided into heuristic methods and other mathematical models, where the models can be deterministic or probabilistic. [2] Later in the paper we will take a closer look especially at a heuristics-based method called reliability centered maintenance (RCM), as well as probabilistic mathematical models.
In order to determine an optimal inspection and replacement policy, a particular strategy is usually pre-specified by maintenance planners using practical considerations such as the limitations of technology, the cost of equipment and simplicity of implementation. Basic maintenance strategies in this sense are: failure replacement, age replacement, sequential replacement, periodic inspection and continuous inspection. [4]
2.1 Short history on the development of maintenance

Through the years, electric utilities have always relied on maintenance programs to keep their equipment in good working condition for as long as it is feasible. Maintenance routines consisted in many cases of pre-defined activities carried out at regular intervals, which is also known as scheduled maintenance. However, such a maintenance policy may be quite inefficient in the long run, being perhaps too costly while still not extending component lifetime as much as possible. After this period, and during the last ten years, utilities have in many cases replaced their fixed-interval maintenance schedules with more flexible programs based on an analysis of current needs and priorities, or on information obtained from continuous condition monitoring. [2]
2.2 Definition of network maintenance and its central characteristics

The purpose of maintenance is to extend equipment lifetime, or at least the mean time to the next failure, whose repair may be costly. Effective maintenance policies can reduce the frequency of service interruptions and the other undesirable consequences related to such interruptions. [2]
Maintenance activities naturally affect network component and system reliability: if too little is done, the result can be an excessive number of costly failures and poor overall system performance, which degrades the reliability of the system. On the other hand, if maintenance activities are done too often, reliability may improve but the cost of maintenance will sharply increase. Therefore, in a cost-effective scheme both variables, maintenance and reliability, must be balanced. We should still keep in mind that maintenance is, in the end, only one of many factors that affect component and system reliability. Maintenance is one of the operating activities in the network business, and it is selected to satisfy both technical and financial constraints. [2]

2.3 Maintenance and its relation to reliability-centered maintenance (RCM)

In order to compare different maintenance strategies, one can use a so-called RCM approach, which selects the most cost-effective strategy for sustaining equipment reliability. RCM programs have been taken into use in several electric utilities as a useful management tool. The implementation of RCM programs started a new era in the development of maintenance, in the direction of "getting the most out" of the equipment installed. The RCM approach is, however, heuristic, meaning that its application requires experience and constant judgment. Another fact is that it can take quite a long time before enough data are collected for making such judgments. Keeping this difficulty in mind, several mathematical methods have been proposed to aid the scheduling of maintenance, and the literature on maintenance models has become extensive. [2]
2.4 Maintenance approaches

Starting from the strategic level, it is safe to conclude that maintenance is today a central part of asset management in electric utilities. To a vast extent the maintenance literature still concerns itself basically with replacements only, both after failures and during maintenance, and does not take into account the kind of maintenance where less improvement is achieved at smaller cost. The oldest replacement schemes are the age replacement and bulk replacement policies. In the age replacement scheme, a component is replaced at a certain age or when it fails, whichever comes first. With bulk replacement policies, on the other hand, all devices in a given class are replaced at predetermined intervals, or when they fail. Because of the large scope of this last-mentioned policy, it is easily understood that it can be more economical than a policy based on individual replacement, especially if the ages of the components are unknown. [2]
When talking about newer replacement schemes, they are often based on probabilistic models and can thus be quite complex. The result can be seen in electric utilities, where maintenance resulting in limited improvement is an established practice and replacement models have only a secondary role. [2]
2.4.1 A basic maintenance program

Generally we can say that maintenance programs range from very simple programs to quite sophisticated ones. An example of a very simple maintenance program is a case where pre-defined activities are carried out at pre-defined intervals; whenever a component fails, it is repaired or replaced. However, both repair and replacement are considered much more costly than a single maintenance job. The maintenance intervals are chosen on the basis of long-time experience, which is not necessarily an inferior alternative to different mathematical models. In fact, to this day this is the approach most frequently used among utilities. [2]
2.4.2 The RCM approach

Another approach that deserves to be looked at is the RCM approach, which is based on regular assessment of equipment condition and is thus not directly in the business of applying tight maintenance schedules. Reliability centered maintenance is a strategy where the maintenance of system components is related to the improvement in system reliability. The approach is not always based on condition monitoring, but also on other features like failure modes and effects analysis and an investigation of operating needs and priorities. Normally the approach is empirical. A general RCM process could include the following stages:
1. Listing of critical components and their functions
2. Failure mode and effects analysis for each chosen component, determination of
failure history and calculation of mean time between failures.
3. Categorization of failure effects and determination of possible maintenance tasks.
4. Maintenance task assignment.
5. Program evaluation, including cost analysis. [2]
A slightly different way of presenting the RCM process has been used in Sweden [3], where the steps are:
1. Choosing and defining a system to analyze
2. Defining and collecting necessary input data
3. Performing a reliability analysis of the chosen system
4. Identifying critical components
5. Identifying the benefit of maintenance of the critical components from knowledge of either their functions (for example transfer of energy), their failure modes (for example short circuit), their failure events (for example insulation failure), their failure causes (for example material and method), the percentage contribution of each failure cause to total failures using statistics, or the knowledge of which causes can be affected by preventive maintenance. [3]

In itself, the RCM procedure does not actually bring anything new to the complexity of the question of cost-efficient maintenance. However, it provides a formal framework for addressing this multi-dimensional question. One concrete output we get from the RCM procedures is RCM plans. Regarding the RCM method, we can identify three main procedures: sensitivity analysis, deduction of the relationship between system reliability and maintenance, and finally cost optimization. In the following table, the procedures mentioned above, the required input data, and the need for interaction between system and component level are presented. [3]
Table 1: Presentation of the RCM methodology and its main procedures

Procedure                              Level       Required data                        Result
Reliability analysis                   System      Component data                       Reliability indices
Sensitivity analysis                   System      Component data                       Critical components
Analysis of critical components        Component   Failure modes                        Critical components affected by maintenance
Analysis of failure models             Component   Failure modes, causes of failures    Frequency of maintenance
Estimation of composite failure rate   Component   Maintenance frequency                Composite failure rate
Sensitivity analysis                   System      Maintenance frequency                Relation between reliability indices and PM schedules
Cost/benefit analysis                  System      Costs                                RCM plan

2.4.3 The PREMO approach

A third suggested maintenance approach, claimed to be more efficient than the RCM approach, is called Preventive Maintenance Optimization (PREMO). PREMO is based on extensive task analysis rather than system analysis, with a capability of radically reducing the required number of maintenance tasks in a plant. Both RCM and PREMO programs have proven themselves useful in ensuring the economic operation of power stations. Still, neither of them will provide the full benefits and flexibility of programs based on mathematical models. [2]
In order to get a thorough evaluation of the effects of a maintenance policy, one has to know by how much its application would extend the lifetime of a selected component, for example measured in mean time to failure. This kind of answer is only obtained using a mathematical model of the component deterioration process, which is then combined with a model describing the effects of maintenance. Several such models have been proposed during the last ten years. Finally, they provide the missing link from the earlier mentioned approaches, that is, a quantitative connection between reliability and maintenance. [2]
2.4.4 The use of mathematical approaches

Starting from the simpler mathematical models, it is safe to say that they are essentially still based on fixed maintenance intervals, and optimization will result in identifying the least costly maintenance frequency. More complex models include the idea of condition monitoring, where decisions related to the timing and amount of maintenance depend on the actual condition, or stage of deterioration, of the device. From this we can conclude that some kind of monitoring must be part of the model. Examples of other desirable features which should be built into maintenance models are i) inspection states inserted before proceeding to maintenance (in the case of predictive maintenance, where maintenance is performed only when needed, decisions on the necessity of carrying out maintenance are made during inspections) and ii) the effect of the break-in period at the beginning of a device's life (when the failure rate may be quite high). [2]

Once a mathematical model is constructed, the process itself can be optimized with changes in one or more of the variables. [2]
2.5 Linking power system reliability and maintenance - use of mathematical models

The simplest maintenance policies amount to i) a set of instructions taken from equipment manuals or ii) long-standing experience. In both cases no quantitative relationships are involved, and the possibilities for predicting the effectiveness of the policy, or for carrying out any type of optimization, are quite limited. It is here that we come concretely into touch with mathematical models: models that make numerical predictions and allow optimization, and thus show us the effects of maintenance on power system reliability. [2]

Mathematical models are either deterministic or probabilistic. Both are useful in appropriate maintenance studies. In this paper, however, the probabilistic model is examined more closely, since it is the only one that can properly handle the future uncertainties associated with quantities whose values vary randomly. [2]
2.5.1 Failure models

Generally speaking, component failures can be divided into two categories: random failures and those arising as a consequence of deterioration. The corresponding failure-repair processes are shown in figure 1.

[State diagrams: i) Work <-> Failure; ii) deterioration stages D1 -> D2 -> ... -> Dk -> Failure]
Figure 1: State diagrams for i) random failure and ii) deterioration process

The deterioration process is represented by a sequence of stages of increasing wear, finally leading to equipment failure. Naturally, the number of deterioration stages may vary, and we should keep in mind that in reality deterioration is a continuous process in time, not the sequence of discrete steps used in this model. Deterioration stages can be defined either by duration (second stage normally at three years, then the third at six, etc.) or by physical signs (corrosion, wear). The second approach is often used in practical applications, which makes periodic inspections necessary to determine the stage of deterioration the device has reached. In this case, the mean times between the stages are usually uneven, and are selected from performance data or by judgment based on experience. [2]

The process presented in figure 1 can also be represented by a mathematical model, called the Markov model, assuming that the transitions between the states occur at constant rates. Furthermore, well-known techniques exist for the solution of such models.
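As an illustration, the following is a minimal sketch (all transition rates are assumed, not taken from [2]) of how such a Markov deterioration model can be solved numerically for the mean time to failure:

```python
# A minimal sketch: three deterioration stages D1 -> D2 -> D3 -> Failure,
# with assumed constant transition rates, solved for the mean time to
# first failure using the standard first-passage-time equations.
import numpy as np

rates = [0.5, 0.4, 0.25]  # assumed rates D1->D2, D2->D3, D3->Failure [1/yr]

# Generator matrix restricted to the transient states D1..D3
Q = np.array([
    [-rates[0], rates[0],  0.0      ],
    [ 0.0,     -rates[1],  rates[1] ],
    [ 0.0,      0.0,      -rates[2] ],
])

# Mean time to absorption m from each stage solves Q @ m = -1
m = np.linalg.solve(Q, -np.ones(3))
print(f"Mean time to failure from new (D1): {m[0]:.1f} years")  # 8.5 years
```

Adding maintenance transitions back to earlier deterioration stages, as in figure 2 below, would lengthen this mean time to failure, which is exactly the effect the maintenance models aim to quantify.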
2.5.2 The effect of maintenance on the failure models

The purpose of maintenance is to increase the mean time to failure. With this in mind, one way to add maintenance states to the models is shown in figure 2. In this figure it is assumed that maintenance brings about an improvement to the conditions of the previous stage of deterioration. This contrasts with many strategies described in the literature, where maintenance involves replacement, meaning a return to the "new" condition. [2]

[State diagrams of figure 1 extended with maintenance states]
Figure 2: Maintenance states for random failure and deterioration process

This model represents only one of several ways of accounting for the effect of maintenance on reliability. Many other approaches have been developed, and of these at least two have been specifically concerned with power system applications. One of the two maintenance models is derived from parallel branches of components in series, for example in transformer stations, and the other is concerned with the maintenance and reliability of standby devices. [2]
2.6 Present maintenance policies - results from the Task Force's questionnaire to utilities

According to the report "The present status of maintenance strategies and the impact of maintenance on reliability", maintenance at fixed intervals is the most often used approach, though often augmented by additional corrections. Newer, so-called "as needed" methods such as reliability-centered maintenance (RCM) are gaining favor in North America, but methods based on mathematical models are hardly ever used or even considered. This is unfortunate, since mathematical models offer the only way to quantitatively link component deterioration and the condition improvement achieved by maintenance. Regarding these mathematical models, it can be stated that even though probabilistic models are more complex, they have clear advantages over the deterministic ones: they describe actual processes more realistically, and at the same time they facilitate optimization for maximal reliability or minimal costs. [2]
The task force had prepared a questionnaire to utilities in order to form an overview of present maintenance practices. Replies were received from six countries: Austria, Canada, Germany, Italy, Saudi Arabia and the United States. Generally, the answers to many questions displayed a considerable spread. Many utilities do scheduled maintenance only, or a modified form of it where additional corrective actions are taken if required by inspection results. Only a few utilities reported exclusive use of predictive maintenance. [2]

2.6.1 Scheduled and predictive (as needed) maintenance

The intervals and durations reported for scheduled maintenance show considerable spread; a fixed cyclic routine, for example, is rare among the answers to the questionnaire. The most frequently reported maintenance intervals and durations for critical components are given in table 2. [2]
Table 2: The reported most frequent maintenance intervals and durations

                    | Generators              | Transformers           | Breakers
                    | Interval   | Duration   | Interval | Duration    | Interval   | Duration
Minor maintenance   | 1 year     | 1-2 weeks  | 1 year   | 1 day       | 1 year     | 1 day
Minor overhaul      | 5 years    | 4-5 weeks  | 5 years  | 3 days      | 5 years    | 3 days
Major overhaul      | 8-10 years | 6-8 weeks  | 7 years  | 4-8 weeks   | 8-10 years | 2 weeks

Regarding predictive maintenance, the most popular means of detecting maintenance needs were found to be periodic inspection and continuous monitoring. For transformers, for example, a wide variety of periodic inspection intervals was found, ranging from 1 week to 5 years. Continuous monitoring was widely used especially for generators, covering for example oil leakage, vibration and bearing temperature, but also for smaller equipment regarding corrosion and discharge voltage. [2]

2.6.2 RCM maintenance

From the answers to the task force's questionnaire it can be seen that the RCM procedure is not yet generally used, and even more rarely outside North America. About half of the respondents were, however, considering the introduction of RCM at that time. The RCM procedure was reported to be used mostly for transformers. The benefits expected by those considering the use of RCM in the future were longer up-times, lower costs, better use of labor, and better control and decisions. [2]

2.6.3 Probabilistic models

Generally, the answers show that probabilistic approaches are hardly used in maintenance planning at all. However, some utilities reported on pilot applications, while others had hired external consultants who may use such probabilistic models. Finally, many of the respondents wished to compute indices such as failure frequency and duration, as well as unavailability. [2]
2.6.4 Maintenance data requirements

Maintenance data requirements will now be reviewed briefly, separately for generators, transformers and breakers. For generators, current maintenance strategies are primarily based on historical records such as performance indices, inspection records and maintenance data. In addition, generator manuals and accumulated experience were listed as important data sources. [2]

For transformers, the most frequently mentioned maintenance-related data were test reports, data on windings, failure data, maintenance history and maintenance protocols. For breakers, often used data included maintenance history, operation logs, failure statistics and faulty operation counts versus total number of operations. [2]

Conclusions

The relationship between power system reliability and maintenance strategies is not unambiguous. One accepted way of understanding the relationship between maintenance and network reliability is to study the causes of network failures. Striving for cost-effective planning of a network system in the changing competitive electricity market environment means maintaining the right components, at the right time, with the right maintenance activity. It is therefore safe to say that maintenance should be focused on the critical components, those that have a significant impact on system reliability, and furthermore on reducing the dominant failure causes. [3]
Mathematical models are useful tools to represent the effects of maintenance on power system
reliability. Algorithms are used for deriving optimal maintenance policies to minimize the
mean long-run costs for continuously deteriorating systems (Markov). [4]
In an RCM application to an urban distribution system in the Stockholm city area, the following conclusions were drawn:
- There is a need to identify the critical network components
- A comprehensive understanding of failure causes has to be obtained
- Preventive maintenance programs should not be created by considering components in isolation, but always as a part of the whole system. [3]
According to a recent study of the relationship between preventive maintenance and the reliability of a distribution system, benefits are obtained by focusing maintenance both on critical network components and on the dominant causes of failures that maintenance can affect. It can thus be concluded that preventive maintenance can have a significant impact on network system reliability, and that the relationship between failure rate and maintenance should be analyzed on a case-by-case basis. [3]

References

[1] CIRED, 18th International Conference and Exhibition on Electricity Distribution, Special Reports, Session 6, Maintenance.

[2] J. Endrenyi et al., "The Present Status of Maintenance Strategies and the Impact of Maintenance on Reliability", a report of the IEEE/PES Task Force on Impact of Maintenance Strategy on Reliability of the Risk and Probability Applications Subcommittee, IEEE Transactions on Power Systems, Vol. 16, No. 4, November 2001.

[3] L. Bertling, R. Eriksson, R. N. Allan, "Relation between preventive maintenance and reliability for a cost-effective distribution system", IEEE Porto PowerTech Conference, Portugal, 2001.

[4] C. Teresa Lam, R. H. Yeh, "Optimal Maintenance Policies for Deteriorating Systems Under Various Maintenance Strategies", IEEE Transactions on Reliability, Vol. 43, No. 3, 1994.

MAINTENANCE OPTIMIZATION TECHNIQUES FOR DISTRIBUTION SYSTEMS
Sirpa Repo
Vattenfall Distribution Finland
sirpa.repo@vattenfall.com

INTRODUCTION
Maintenance, in general, can be defined as the combination of technical and associated administrative actions intended to retain an item or a system in, or restore it to, a state in which it can perform its required function. Its objectives can be classified into four categories: ensuring system function (availability, efficiency and product quality), ensuring system life (asset management), ensuring safety and ensuring human well-being. [1]

The target of maintenance is to extend equipment lifetime and/or reduce the probability of failure, that is, to increase reliability. Other objectives could be to reduce risks or to preserve the appearance and worth of existing equipment. The aim is to find the right level for the maintenance targets; this is usually not the maximum.

Maintenance is especially important in areas where a large amount of capital is invested in equipment and technical systems, and this is the case in distribution systems.

MAINTENANCE OPTIMIZATION
Maintenance optimization is basically a mathematical model in which the costs and benefits of maintenance are quantified and an optimal balance between the two is obtained. The model is an objective and quantitative way of decision-making. With maintenance optimization models, different policies can be evaluated and compared, and as a result maintenance plans or timetables can be created. [1]

Maintenance optimization consists of four parts: first, a description of the system, its function and importance; secondly, a description of the deterioration and the kinds of effects it has on the system; thirdly, information about the system and the actions that can be implemented; and finally, an objective function and an optimization technique to find the balance between costs and reliability. [1]

There are many problems in creating maintenance optimization methods in general. First of all, models for maintenance optimization are not simple and are not easily transformed into mathematical calculations, so software programs are needed. Developing these programs is not easy, for example because of a lack of specialized staff or an unclear formulation of the problem. Programs may not be user-friendly and are then left unused. [1]

One of the biggest problems in maintenance optimization is the lack of data. The key point of maintenance optimization is to know how deterioration can be modeled and how frequently equipment failures occur. The mechanisms behind failures have to be known before making maintenance decisions; relying on average failure rates and the like is not cost-effective. Collecting this data requires a lot of effort. [1]

Besides technical information about components and their condition, cost data is also needed in order to create an optimization model. The costs of maintenance are usually easily calculated, but the benefits of maintenance are more difficult to quantify. In the optimization one also has to account for the costs that are realized after an equipment failure. [1]
In electricity distribution systems, an equipment failure usually leads to a power failure. How should the costs resulting from a power failure be valued? Distribution system operators can have many ways of defining these costs, and customers and society probably have a different opinion of the value of electricity. From the distribution system operator's point of view, the costs of a power failure include at least the electricity not delivered and the cost of restoration, and in addition reduced power quality and customer dissatisfaction.
Dekker [1] has analyzed the use of maintenance optimization models in general. There are some gaps between theory and practice in this field. Most of the models are difficult to understand and interpret; maintenance optimization in many cases uses stochastic processes, while most people are used to a deterministic approach. Also, many of the research papers in the field are written for purely mathematical purposes, and companies in industry are not enthusiastic about publishing their optimization methods.

MAINTENANCE IN DISTRIBUTION SYSTEMS

The performance of a distribution system is related to the performance of its components. Equipment type, age, condition and exposure to events affect equipment performance and have to be taken into account in maintenance management. The objective is to optimize maintenance costs and efforts in order to ensure a competitive electricity price and quality of service, while fulfilling the power quality and safety requirements placed on the distribution system operator.

Maintenance is one of the tools that can be used for ensuring system reliability; others are, for example, increasing system capacity, reinforcing redundancy and using more reliable components. Usually utilities do not have many possibilities for these actions and have to concentrate on improving the status of existing equipment.

The objective of maintenance is to find a balance between maintenance effort and the condition of the system. If too many resources are spent, costs will be too high; if not enough, the result will be failures, and outage and restoration costs. Deregulation of the electricity market forces companies to work more efficiently in order to ensure competitive energy and distribution prices. The electricity system has some features that make its maintenance a challenging task. First of all, the electricity system is highly important to society, and breakdowns can have high indirect costs. The operative lifetime of equipment used in electrical systems is long, often even 50-100 years. Also, the equipment is highly dispersed geographically. [3]

Maintenance methods can be divided into two categories: preventive and corrective maintenance. Preventive maintenance is carried out at selected intervals to ensure that equipment works; the intervals can be decided according to time or condition, and the objective is to prevent equipment failures. Corrective maintenance is carried out after an equipment failure, and only in locations where equipment failure does not cause significant problems. [2]

The methods used in preventive maintenance can be divided into three categories: time-based maintenance, condition-based maintenance and reliability-centered maintenance. These are shortly presented here.
Time-based maintenance (TBM)
In TBM, maintenance is carried out at predefined time intervals. This approach has been especially popular in the past because it is considered reliable. The problem of this approach is maintenance done too late or too early, which leads to inefficiencies. [2]
Condition-based maintenance (CBM)
Condition-based maintenance optimizes the intervals at which maintenance is done. The method is based on the actual condition of the equipment. The most critical components of a distribution system can be monitored on-line and in real time; monitoring other components would be too expensive. The collected data about the equipment's condition has to be processed so that maintenance decisions can be made. This method gives up-to-date information and greater confidence about the condition of the equipment, so maintenance can be planned more carefully than in time-based maintenance. If the maintenance intervals defined according to condition are longer than in time-based maintenance, savings can be achieved; if they are shorter, equipment failures are likely prevented. [2]
Reliability-centered maintenance (RCM)
Reliability-centered maintenance differs from the other two approaches because it distinguishes between failure cause and failure effect analysis. RCM consists of reliability assessment, life expectancy and risk calculations. In order to make RCM work, expertise is needed for inspections, failure analysis and decision-making. [2]

MAINTENANCE OPTIMIZATION TECHNIQUES

The literature shows only a few applications of maintenance optimization techniques for distribution systems. It is more common to develop an optimization model for the maintenance of an individual component or substation. Optimization models are usually made for important and expensive components, such as generators or transformers.

The basis of maintenance optimization is defining an objective. It could be maximizing reliability under given constraints, minimizing costs under given constraints, or minimizing the total costs of interruptions and maintenance. Other factors that have to be taken into account in maintenance optimization are, for example, optimal maintenance and inspection intervals, the allocation and number of spare parts, manpower, and redundancy. Besides a clear objective, data on the components and their relationships is needed; for system analysis a model of the network is also required. Lastly, some optimization method or approach is needed in order to reach the objective. [5]
In this chapter a few methods are explained in more detail. There is a variety of other approaches to maintenance optimization, such as Total Test Time and Weibull methodologies, Petri nets, multi-state systems and Life Cycle Costs, but these are mainly designed for component-level maintenance optimization [5].
Variety reduction model
What managers of distribution companies want from tools or methods that assist in asset management decisions is that the tool presents results also in monetary terms, that the model can cope with uncertainty, that decisions are based on the actual condition of the system, that the results show the times when renewals should be carried out, that the tool is easy and cheap to use, and that the model can also describe the long-term condition of the system. To deal with this problem, a variety reduction model is presented in [3].

The target of this model is to provide support for decision making in maintenance and renewal strategies. The model contains four steps: a condition based index, dynamic lifetime analysis, economical analysis and maintenance analysis.

The method starts with the condition based index (CBI), in which the condition of the system and its equipment is defined. The method uses a weighted sum of measured values as inputs; these can be temperatures, vibrations, hours spent on maintenance, etc. The values are transformed via a linear function, and the condition is presented as a value between 0 and 100. If the system under study consists of several components, as a distribution system does, the input values of the different components are weighted so that the sum of the weight factors is 1, i.e. each component gets a weight factor between 0 and 1. A small sketch of such an index calculation is given below.
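The following is a minimal sketch of the index calculation, assuming hypothetical components, weights and measurement ranges (none of these numbers come from [3]):

```python
# Condition based index sketch: measured values are mapped linearly onto
# a 0-100 condition scale and combined with weight factors summing to 1.

def linear_score(value, worst, best):
    """Map a raw measurement linearly onto a 0-100 condition scale."""
    frac = (value - worst) / (best - worst)
    return 100.0 * min(max(frac, 0.0), 1.0)

# Hypothetical components: (name, weight, measured value, worst, best)
components = [
    ("transformer",  0.5, 72.0, 95.0, 40.0),  # top-oil temperature [C]
    ("breaker",      0.3, 18.0, 30.0,  0.0),  # maintenance hours per year
    ("cable feeder", 0.2,  4.0, 10.0,  0.0),  # faults per 100 km-year
]

cbi = sum(w * linear_score(v, worst, best)
          for _, w, v, worst, best in components)
print(f"System condition based index: {cbi:.0f} / 100")
```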
Dynamic lifetime analysis (DLA) needs as input the residual lifetime, the economic consequence of equipment failure, and the costs of planned maintenance. In this way the method tries to estimate the risks involved. Residual lifetime is difficult to estimate because of the stochastic nature of equipment failures, but statistical methods, various methods dealing with distributions, or expert opinion can be used. By comparing the costs of planned maintenance with the risk and consequences of equipment failure, the financial risk is described.

The economical analysis gives an understanding of the dependencies between costs and maintenance. The results would be the total cost of maintenance and renewals, and the residual value. Costs can be calculated by determining the point of renewal, based either on condition or on risk. If the analysis shows that too many resources are spent on maintenance, renewals are postponed; it has to be considered whether to spend more money on maintenance or to accept a higher risk. The method would give a better understanding of future maintenance, of when renewals should be carried out with regard to risk and condition, and of the financial consequences.

Maintenance analysis formulates a maintenance strategy based on the previous three steps. It shows how different maintenance strategies affect the costs, the financial risk and the condition of the system. For the analysis, the time span and the extent of the analysis should be decided. As a result, the maintenance timing can be decided, the condition of the network calculated, and different strategies compared.

The method should be used in an iterative way, because maintenance, costs, risks and reliability affect each other. The workflow is presented in figure 1.

[Workflow: CBI (basic data for decision making) -> DLA -> financial analysis -> maintenance analysis]
Figure 1. Workflow of the model [3]


The advantage of this method is that it helps the manager to make maintenance decisions without taking too much information into account; in decision-making situations, not all technical or financial information can be considered in detail. For example, condition monitoring gives detailed information about the condition of one component only, but the system perspective has to be considered, for example how increased maintenance of one component affects the reliability of the entire system. The proposed methodology requires relatively few inputs; on the other hand, it requires expert opinions. Of course, when an entire distribution system is considered, the amount of data is significant.
WASRI
Maintenance optimization can also be based on parameters or key figures. One goal of the optimization could be minimizing the weighted average system reliability index (WASRI) subject to cost constraints, as in [4]. This approach is based on component and system reliability: the method ranks maintenance tasks based on their impact on system reliability, and it is designed to find the most cost-effective maintenance tasks.
WASRI is defined as

    WASRI = w1 * SAIFI + w2 * SAIDI + w3 * MAIFI_E            (1)

where
SAIFI is the system average interruption frequency index;
SAIDI is the system average interruption duration index;
MAIFI_E is the momentary average interruption event frequency index;
w1, w2, w3 are the weights assigned to the different indices.

Cost-efficiency of maintenance action E can be defined as a change of WASRI divided by the
cost associated with the maintenance task C. SAIFI, SAIDI and MAIFI indexes are sum of
contributions of each components corresponding indexes.
The contributions of individual components can be written as

    SAIFI_i = λ_i * S_i / N                                   (2)
    SAIDI_i = λ_i * D_i / N = λ_i * (Σ_j d_ij) / N            (3)
    MAIFI_E,i = λ_i * T_i / N                                 (4)

where
λ_i is the failure rate of component i;
S_i is the number of customers experiencing sustained interruptions due to a failure of component i;
D_i is the sum of sustained interruption durations for all customers due to a failure of component i;
d_ij is the sustained interruption duration for customer j due to a failure of component i, j = 1, 2, ..., S_i;
T_i is the number of customers experiencing a momentary interruption event due to a failure of component i;
N is the total number of customers.
The model considers only first-order contingencies, meaning that only one component fails at a time, and assumes that a maintenance action changes the failure rate λ but not, for example, repair or switching times. The factors S_i, D_i and T_i thus remain constant before and after maintenance. The approach assumes that the change of failure rate after maintenance is known.
Maintenance reduces the failure rate by Δλ_i, and thus

    SAIFI_i = (λ_i - Δλ_i) * S_i / N                          (5)
    SAIDI_i = (λ_i - Δλ_i) * D_i / N                          (6)
    MAIFI_E,i = (λ_i - Δλ_i) * T_i / N                        (7)
The changes in these indices are directly related to the change in the failure rate of the associated component. From equations (1) and (5)-(7), the cost-effectiveness E of a maintenance action can be written as

    E = ΔWASRI / ΔC = Δλ_i * (w1*S_i + w2*D_i + w3*T_i) / (N * ΔC)    (8)
The ranking procedure of this approach is as follows (a small sketch of the ranking is given after the list):
1. Evaluate the condition of each component and the cost of maintenance
2. List all possible maintenance tasks
3. Run a basic reliability assessment to get S_i, D_i and T_i for each component
4. Rank the cost-effectiveness of each maintenance task
5. For maintenance tasks operating on the same component, select the one with the highest ranking and eliminate the other tasks from the list
6. Select maintenance tasks from the top of the ranking list until the cost limit or the reliability target is reached.
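A minimal sketch of steps 4 and 6 of this ranking, with all task data, weights and customer counts assumed for illustration (they are not taken from [4]):

```python
# Rank maintenance tasks by cost-effectiveness E from equation (8) and
# select from the top until the budget is exhausted. Step 5 (keeping only
# the best task per component) is omitted for brevity.
W1, W2, W3 = 1.0, 0.5, 0.2       # assumed index weights w1, w2, w3
N = 10_000                       # total number of customers (assumed)

# Candidate tasks: (name, d_lambda [1/yr], S_i, D_i [cust-h], T_i, cost)
tasks = [
    ("feeder-A tree trimming", 0.10, 800, 1600.0, 1200,  5_000.0),
    ("breaker-B overhaul",     0.05, 300,  900.0,  400,  8_000.0),
    ("cable-C rehabilitation", 0.02, 500, 2500.0,    0, 12_000.0),
]

def cost_effectiveness(d_lam, S, D, T, cost):
    """Equation (8): E = d_lambda * (w1*S + w2*D + w3*T) / (N * cost)."""
    return d_lam * (W1 * S + W2 * D + W3 * T) / (N * cost)

ranked = sorted(tasks, key=lambda t: cost_effectiveness(*t[1:]), reverse=True)
budget, spent = 15_000.0, 0.0
for name, *params in ranked:
    cost = params[-1]
    if spent + cost <= budget:   # take tasks from the top of the list
        spent += cost
        print(f"selected: {name} (E = {cost_effectiveness(*params):.2e})")
```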
The approach was applied to a test utility system with 4000 components and five substations, and maintenance was recommended for substation components. With the given budget limits and maintenance actions, the approach reduced WASRI by 28 %. As a comparison, another approach was tested in which maintenance was based on component-level cost-efficiency rather than system-level reliability; this CBM-based approach reduced WASRI by 16 %.
Reliability importance indices

Another approach, presented in [5], is based on measuring interruption costs against network performance. Component reliability and failure costs are expressed with the following three indices.
Interruption cost index:

    I_i^H = C_S / λ_i    [SEK/f]                              (9)

where C_S [SEK/year] is the total yearly interruption cost and λ_i [f/year] is component i's failure rate. With this index the most critical components in the system can be identified, because it studies interruption costs in relation to component reliability.

Maintenance potential:

    I_i^MP = I_i^H * λ_i    [SEK/yr]                          (10)

where I_i^H is the interruption cost index [SEK/f] defined in equation (9). The maintenance potential expresses the total expected yearly cost reduction if no failures would occur or, put another way, the expected costs that the studied component's faults are going to cause.

Simulation based index:

    I_i^M = K_i / T    [SEK/yr]                               (11)

where K_i is the total accumulated interruption cost over the simulation time T for component i. The customer interruption cost K_i is simulated so that all the costs incurred by the component's faults are taken into account. This index shows the components that are likely to cause the most interruption costs, and hence which components should be prioritized for maintenance actions.

Even though these three indices may look similar, they are different:
I^H = expected cost if the studied component fails
I^MP = total expected yearly cost reduction that would occur in the case of a perfect component
I^M = total expected yearly interruption cost (finally) caused by the component
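A minimal sketch of computing the three indices for a toy component set (all figures assumed, loosely following the definitions above rather than any case study in [5]):

```python
# Compute the interruption cost index (9), maintenance potential (10)
# and simulation based index (11) for hypothetical components.
T_SIM = 1000.0   # simulated time [years] (assumed)

# component: failure rate [f/yr], yearly interruption cost C_S [SEK/yr],
# accumulated simulated interruption cost K_i [SEK] over T_SIM years
data = {
    "cable-1":   (0.05, 40_000.0, 41e6),
    "breaker-2": (0.20, 25_000.0, 24e6),
    "trafo-3":   (0.02, 90_000.0, 88e6),
}

for name, (lam, c_s, k_i) in data.items():
    i_h = c_s / lam      # (9)  interruption cost index [SEK/failure]
    i_mp = i_h * lam     # (10) maintenance potential [SEK/yr]
    i_m = k_i / T_SIM    # (11) simulation based index [SEK/yr]
    print(f"{name}: I_H={i_h:,.0f} SEK/f, I_MP={i_mp:,.0f} SEK/yr, "
          f"I_M={i_m:,.0f} SEK/yr")
```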
Hilber [5] has used these indices in two actual case studies, one of an urban and one of a rural system in Sweden. The case studies showed that the indices could be applied and gave results that could be further analyzed for maintenance actions and economic effects. The results also showed that the maintenance potential and the simulation based index give approximately the same value for many components, even though they are calculated differently.

The results of the case studies identify the most important components of the distribution system. Issues that affect a component's importance for the system are, e.g., how far away, of what kind and how important the loads behind it are, how long the repair times are, what the fault isolation possibilities are and how long it takes to isolate a fault, whether there is redundancy, etc.
The interruption cost index I^H and the maintenance potential I^MP can be combined as in figure 2, from which the components that are important to the system both in reliability and in maintenance potential can be identified.

Figure 2. Example of component importance index and maintenance potential in the rural case, radial system [5]
Profitable actions can be determined by

    P_ij = Δλ_ij * I_i^H - q_j                                (12)

where
P_ij is the profit of action j on component i;
I_i^H is the importance index of the actual component;
Δλ_ij is the change of failure rate;
q_j is the cost of action j.

After every maintenance decision, the reliability calculations would have to be redone to maintain accuracy, because changing the status of one piece of equipment changes the state of the system. A small numerical illustration of equation (12) follows.
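A minimal sketch of equation (12), with all component and action figures assumed:

```python
# Equation (12): an action is profitable when the avoided interruption
# cost (reduction in failure rate times the importance index) exceeds
# the action cost. All numbers are hypothetical.
actions = [
    # (component, action, d_lambda [1/yr], I_H [SEK/f], cost q_j [SEK])
    ("cable-1",   "rehabilitation", 0.020, 800_000.0, 12_000.0),
    ("breaker-2", "overhaul",       0.050, 125_000.0,  8_000.0),
]

for comp, act, d_lam, i_h, q in actions:
    profit = d_lam * i_h - q           # P_ij = d_lambda * I_H - q_j
    verdict = "profitable" if profit > 0 else "not profitable"
    print(f"{act} on {comp}: P = {profit:,.0f} SEK ({verdict})")
```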
As an evaluation of the method, the indices demonstrated above could be applied to existing networks. They enable the evaluation of components from a system perspective by assigning a monetary value to interruptions, and thus give a starting point for maintenance optimization. For a larger system, however, the method requires a lot of data and calculation.
Linear programming

The method described in [6] optimizes maintenance resources for reliability by minimizing the System Average Interruption Frequency Index (SAIFI). The first framework of the work assumes constant failure rates; the second applies when no accurate information about failure rates and their impact on reliability is available, which is the usual situation. Linear programming and approximate reasoning using fuzzy sets solve this dilemma.

The problem is defined by expressing SAIFI in terms of the number of components, failure rates, levels of maintenance and maintenance costs. The limiting factors of the optimization are maintenance resources, maintenance levels and crew constraints. Expected failure rates and the impact of maintenance are modeled by allowing a range of values for each statistic (a fuzzy number); statistical distributions are not used, because suitable distributions are not available and the computation would be more complex. The optimization model is evaluated by arithmetic operations on fuzzy numbers, and the failure rates and failure rate multipliers for the maintenance levels are given as ranges of expected values expressed by fuzzy sets. A small sketch of the crisp (complete-information) variant of such an optimization is given below.
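As an illustration only, the following sketch sets up a crisp (non-fuzzy) linear program of this flavour with scipy, minimizing a SAIFI expression under a budget constraint; the formulation and every number are assumed and far simpler than in [6]:

```python
# Choose a tree-trimming effort x_i in [0, 1] per feeder section to
# minimize SAIFI under a budget, assuming maintenance reduces failure
# rates linearly with effort. All numbers are hypothetical.
from scipy.optimize import linprog

N = 5_000                           # total customers
cust = [2_000, 1_500, 1_500]        # customers affected per section
lam0 = [0.8, 0.5, 0.6]              # base failure rates [f/yr]
reduction = [0.4, 0.3, 0.35]        # failure-rate cut at full effort
cost = [6_000.0, 4_000.0, 5_000.0]  # cost of full effort per section
budget = 9_000.0

# SAIFI(x) = sum_i cust_i*(lam0_i - reduction_i*x_i)/N; the constant part
# does not affect the argmin, so minimize -sum_i (cust_i*reduction_i/N)*x_i
c = [-(n * r) / N for n, r in zip(cust, reduction)]
res = linprog(c, A_ub=[cost], b_ub=[budget], bounds=[(0, 1)] * 3)

x = res.x
saifi = sum(n * (l - r * xi)
            for n, l, r, xi in zip(cust, lam0, reduction, x)) / N
print("effort per section:", [f"{xi:.2f}" for xi in x])
print(f"resulting SAIFI: {saifi:.3f} interruptions/customer-yr")
```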
The method was demonstrated with two examples of the same radial distribution system, one with complete information and the other with incomplete information. In the incomplete-data case, the failure rates, the fixing times for different types of failures (fixed, tree or recloser failure) and the extent of maintenance (extensive, minimal, no maintenance) were given as trapezoidal and triangular fuzzy numbers instead of the fixed values used in the first case. Maintenance actions were considered only for reclosers and sections (tree trimming). With incomplete information, the maintenance optimization was calculated both with expected values and with fuzzy numbers. The expected-value solution recommends minimal maintenance for almost all equipment, and the fuzzy-number calculation gives almost the same result. The example with complete information (and a larger maintenance budget) recommends extensive maintenance for almost all reclosers and some sections.

In addition, it was studied how extra information about failure rates, in this case visual inspection of tree growth, would affect the results; this would reduce the range of variation in the outage statistics. More accurate field information led to lower maintenance costs at about the same level of SAIFI, so in decision making it was worthwhile.

CONCLUSIONS

Maintenance optimization in distribution systems is a new subject in research and has not been studied much, even though maintenance is an important and money-consuming part of the electricity distribution business. This paper pointed out some aspects of the studies in this area.
A distribution system consists of many components, and optimizing maintenance for an individual component does not amount to optimizing maintenance for the system. The idea of maintenance optimization in distribution systems is to evaluate the effect of a component's maintenance on system performance.

For effective maintenance optimization, however, data about individual equipment is needed. Providing this data can be problematic, because failure mechanisms are not fully known and long-term failure data does not exist in all cases. For a larger system, collecting this information is a demanding task if it does not already exist. That is why all aspects cannot be taken into account and models have to be simplified. For maintenance optimization, a software program is needed in order to transfer the models into practice.

The methods introduced in this paper may seem simple in light of their equations, but supplying data for them is more complex. For example, defining failure rates for equipment requires a lot of background information (deterioration models, historical data, etc.), which is outside the scope of this paper. Considering every component and its impact on system reliability makes maintenance optimization a difficult and time-consuming task.

The four models presented in this paper give a starting point for optimization work. The variety reduction model mainly presented a comprehensive way of thinking without considering the mathematical implementation; the other three were examples of different approaches to formulating maintenance optimization as a mathematical problem.

The main point of maintenance optimization is to find an objective, i.e. what needs to be optimized. In these models it has been minimizing a weighted system reliability index, minimizing interruption costs, and minimizing the interruption frequency index.

REFERENCES

1. Dekker, R. "Applications of maintenance optimization models: a review and analysis." Reliability Engineering and System Safety, Vol. 51, 1996, pp. 229-240.
2. Polimac, V. & Polimac, J. "Assessment of present maintenance practices and future trends." Transmission and Distribution Conference and Exposition, 2001 IEEE/PES, Vol. 2, 28 Oct.-2 Nov. 2001, pp. 891-894.
3. Strömberg, M. & Björkqvist, O. "Variety reduction model for maintenance and renewal strategies." Nordic Distribution and Asset Management Conference 2004, 24.8.2004.
4. Li, F. & Brown, R. E. "A cost-effective approach of prioritizing distribution maintenance based on system reliability." IEEE Transactions on Power Delivery, Vol. 19, No. 1, 2004.
5. Hilber, P. "Component reliability importance indices for maintenance optimization of electrical networks." Licentiate thesis, Royal Institute of Technology, Stockholm, Sweden, 2005.
6. Sittithumwat, A., Soudi, F. & Tomsovic, K. "Optimal allocation of distribution maintenance resources with limited information." Electric Power Systems Research, Vol. 68, 2004, pp. 208-220.

PRIORITIZATION OF MAINTENANCE METHODS (RCM, CBM, TBM, CM)
Sanna Uski
sanna.uski@vtt.fi

ABBREVIATIONS
RCM Reliability Centered Maintenance
CBM Condition Based Maintenance
TBM Time Based Maintenance
CM Corrective Maintenance
HBS Hardware Breakdown Structure
MSI Maintenance Significant Item
MTBF Mean Time Between Failures
MTTR Mean Time To Repair
OSI Operational Significant Item

INTRODUCTION
Traditionally, electric utility maintenance strategies have been based on fixed-interval maintenance, and even nowadays it is quite a common maintenance policy around the world. The motivation for the emergence of new maintenance methods has been to find an optimum between the costs of maintenance and the ability to maintain sufficient reliability of the system. The fixed-interval maintenance policy, i.e. the time-based maintenance (TBM) policy, may often be inefficient with respect to costs as well as to lengthening the lifetime of components. For these reasons, a move towards flexible maintenance policies has been made, in order to take advantage of the information obtained through condition monitoring and to carry out maintenance based on needs and priorities. This is called predictive maintenance, of which reliability-centered maintenance (RCM) is a part. RCM is not really a single strict maintenance method; rather, it allows the comparison of different maintenance methods, from which the most cost-effective can be chosen without compromising reliability. Predictive maintenance can enable better outage scheduling, flexibility of operation, better fuel use and more efficient spare part management, and improve efficiency. [1]
The most primitive maintenance strategy is corrective maintenance (CM), which is based on restoring operation by fixing or replacing the component in case of failure, and does not include any additional maintenance inspections. TBM is based on maintenance tasks performed after a certain period of time, with maintenance actions executed either regardless of, or based on, the age of the component. Condition based maintenance (CBM) takes into consideration the condition of the component, on which the maintenance tasks then depend. Figure 1 shows an overview diagram of the maintenance methods.

[Diagram: asset management comprises purchasing, maintenance, replacement and disposal; maintenance divides into scheduled maintenance (by age or in bulk, per manufacturers' specifications) and predictive maintenance; predictive maintenance builds on condition monitoring and on analysis of needs and priorities, the latter through RCM using mathematical models or empirical approaches]
Figure 1 Overview of maintenance approaches. [1]


The use of the reliability centered maintenance (RCM) method is increasing nowadays. The approach in RCM is commonly empirical, and the RCM-based maintenance policies used in recent years rarely employ mathematical models. Mathematical models are needed in maintenance schemes in order to represent the effects of maintenance on reliability; they can be deterministic or probabilistic. [1]

Reliability, and therefore condition management, has a significant economic role in industry: a rough estimate is that on average 50 % of the life-cycle costs of components are allocated to maintenance activities. [2] There is, on the one hand, the risk of outages due to failures, and on the other hand the costs of investments in (over)improved reliability in the form of system upgrading and maintenance. Figure 2 shows the cost-investment relationship schematically.

[Plot: as investments on reliability increase, maintenance costs rise and interruption costs fall; the total costs curve has a minimum]
Figure 2 Relation between costs and investments on reliability. [3]

In optimizing maintenance strategies, it is essential to know which system parts are
maintenance critical. After this is known, one can define appropriate maintenance strategies
for each subsystem and component. The most critical parts need to be working with higher
reliability than the less critical and non-critical parts, and the maintenance strategies must be
selected accordingly. The maintenance should always be optimized with respect to
availability and reliability of the system. [4]

FAILURE TYPES
There are two failure categories: random failures, which can occur at any instant of time, and failures caused by deterioration, meaning that the older the component is with respect to its lifetime, the closer to failure it is. As random failures can occur at any time, preventive maintenance brings no improvement against them. On the other hand, maintenance actions addressing deterioration can significantly prolong the lifetime of the component.

There are several different inspection-replacement strategies. The simplest is replacement in case of failure, in which no inspections are performed and the component is replaced only when it fails; this strategy is called corrective maintenance (CM). The age replacement strategy adds some reliability with respect to failures due to aging: the component is replaced after a certain amount of time in case it has not failed before then. [5]
It is essential to know the impact that the failure of each component in the system has on the total system functionality. The so-called maintenance significant item (MSI) logic, as well as the operational significant item (OSI) logic, can help to narrow down the components to which corrective maintenance (CM) should be applied, and on the other hand the components for which preventive maintenance should be done. MSI and OSI logic are based on yes/no answers to questions on safety, production losses, cost of repair, spare part availability and the need to capture information/documentation. This phase has a significant influence on the maintenance effort for the whole system, by e.g. determining the spare part reserve, the maintenance work on preventing and predicting failures, and the work to be put into documentation. RCM is applied to all the MSI items (to which all the OSI items also belong). First, an item is selected and a hardware breakdown structure (HBS, the database into which all information will be stored) is created for the item. Then the failure modes are investigated, and finally RCM is applied. The result is the set of possible actions to be taken in order to prevent failure, and the causes of failure. The RCM method also reveals whether condition based information is needed (instead of scheduled maintenance tasks) and whether it will be cost effective. Generally, complex systems require more condition based information for failure prediction and prevention. If needed, the condition monitoring strategy supports the top-level system maintenance strategy. [4]
Assessment of deterioration
Deterioration can be assessed based on time or on inspection of the physical state. In order to determine the physical state, periodic inspections are needed; despite this, the latter assessment method is the more commonly used of the two. A common way of modeling deterioration is with Markov models, with which the optimal inspection-replacement policy, enabling long-term cost-effectiveness, is defined. [1] In determining the optimal policy, several features are considered, e.g. limitations of technology, equipment costs and features of implementation. [5]
Besides periodic inspections, which occur at equal intervals of time, there is the sequential inspection strategy, in which the inspections are spaced by intervals of time not necessarily equal to each other. Continuous inspection (condition monitoring) can also be used for determining the condition of the system. In the periodic, sequential and continuous inspection strategies, the state of the system is determined and, based on predefined criteria, the system is either replaced or allowed to continue functioning (until the following inspection). In cases where periodic inspections do not predict wear-out trends accurately enough, condition monitoring is preferred, provided that the costs remain reasonable. Other arguments for using condition monitoring may be, for example, the need for online inspections, or that a failure of the device is notably critical.

Inspection methods may consist of e.g. visual inspection, optical inspection, neutron analysis, radiography, eddy current testing, ultrasonic testing, vibration analysis, lubricant analysis, temperature analysis, magnetic flux leakage analysis and acoustic emission monitoring. [1]
The mathematical models of deterioration can be either deterministic or probabilistic. Deterministic models can be straightforward and simpler than probabilistic ones, but may sometimes lead to false conclusions, such as an infinite component lifetime before failure, because of their fixed values and the assumption that maintenance restores the component's condition to the state it had at the previous inspection, or to an even better state. Probabilistic methods are more realistic than deterministic ones, but are more complex in structure.

OPERATIONAL RELIABILITY AND MAINTENANCE COSTS


The optimal cost-rates of the different maintenance policies, compared with each other, satisfy

    g_f ≥ g_a ≥ g_p ≥ g_s ≥ g_c,

where g is the cost-rate and the subscripts f, a, p, s and c refer to failure replacement, age replacement, periodic inspection (inspections spaced by a fixed time), sequential inspection (inspection times predetermined, but not necessarily equally spaced) and continuous inspection, respectively. [5] Failure replacement is a CM strategy; age, sequential and periodic inspection belong to TBM and continuous inspection to CBM; all may be used in RCM.
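A minimal sketch illustrating the first step of this ordering, g_f ≥ g_a, using the standard renewal-theory cost-rate formulas for failure and age replacement with an assumed Weibull lifetime (not a reproduction of the models in [5]):

```python
# Compare the optimal cost-rates of failure replacement (g_f) and age
# replacement (g_a) for a Weibull lifetime with wear-out (beta > 1).
import numpy as np

beta, eta = 2.5, 10.0          # assumed Weibull shape / scale [years]
c_f, c_p = 10_000.0, 2_000.0   # assumed failure vs. preventive costs [SEK]

t = np.linspace(0.01, 30.0, 3000)
R = np.exp(-(t / eta) ** beta)                    # survival function R(t)

# Cumulative integral of R via the trapezoidal rule; last entry is MTTF
cum_R = np.concatenate(([0.0],
                        np.cumsum((R[1:] + R[:-1]) / 2 * np.diff(t))))

g_f = c_f / cum_R[-1]                             # failure replacement
# Age replacement at age T: g(T) = (c_f*(1-R(T)) + c_p*R(T)) / int_0^T R dt
g_T = (c_f * (1.0 - R) + c_p * R) / np.maximum(cum_R, 1e-9)
g_a = g_T.min()

print(f"g_f = {g_f:.0f} SEK/yr >= optimal g_a = {g_a:.0f} SEK/yr")
```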
The relation between reliability and maintenance effort is presented in figure 3 for three different scenarios. X represents the normal reliability of the equipment, with the number of maintenance actions equal to the number of failures. Y represents the additional maintenance effort due to unnecessary maintenance actions caused by false alarms. Z represents the additional maintenance effort due to human errors and vandalism, or non-standard maintenance actions. The number of maintenance actions in each case is as follows:

X: # maintenance actions = # failures
Y: # maintenance actions = # failures + # false alarms
Z: # maintenance actions = # failures + # false alarms + # non-standard maintenance actions

[Plot: maintenance effort versus reliability, with curve Z above Y above X]
Figure 3 Relation between reliability and maintenance effort. [4]


From figure 3 it can be noticed that maintenance effort decreases as reliability increases. It can also be seen that at low reliability, the maintenance effort due to Y and Z remains almost constant or even decreases slightly. Therefore, as the normal reliability X increases, the elimination of Y and Z improves reliability a great deal. However, every component has a built-in reliability defined at the design stage, which cannot be surpassed even with the best quality control scheme.
The availability of a system is defined based on maintainability (the ability to repair a component in a certain time), operability (the man-machine interface), supportability (availability of spare parts, human resources etc.) and reliability (the probability of adequate performance of the component for a certain time). Availability can be calculated as

    A = MTBF / (MTBF + MTTR).

MTBF, the mean time between failures, is a function of reliability and operability,

    MTBF = f(R, O),

and MTTR, the mean time to repair, is a function of maintainability and supportability,

    MTTR = f(M, S),

so availability is a function of all four: A = f(R, O, M, S). Reliability can be calculated as

    R = e^(-λt),

where λ = failure rate = 1/MTBF. If availability can be increased, the optimum maintenance effort is achieved and the operating costs of the system are reduced. [4]
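A minimal numerical illustration of these relations, with assumed MTBF and MTTR figures:

```python
# Availability A = MTBF / (MTBF + MTTR) and reliability R(t) = e^(-t/MTBF),
# evaluated for hypothetical figures.
import math

mtbf = 8760.0      # mean time between failures [h] (assumed: one year)
mttr = 24.0        # mean time to repair [h] (assumed)

availability = mtbf / (mtbf + mttr)
lam = 1.0 / mtbf                       # failure rate [1/h]
r_one_month = math.exp(-lam * 730.0)   # survival probability over ~1 month

print(f"availability A = {availability:.4f}")
print(f"reliability over one month R = {r_one_month:.3f}")
```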

MAINTENANCE TASKS
For the task analysis, when defining the maintenance tasks to be executed, questions such as "Is a system shut-down necessary when executing the maintenance task?" and "Can tasks be grouped to optimise the maintenance effort?" need to be answered. The task analysis will finally define:
- which parameters need to be condition monitored
- the optimised set of maintenance tasks
- the optimum level and range of spare parts to be stored
- the personnel and skills required for executing the maintenance tasks.
The inputs of a condition monitoring system are condition data regarding:
- performance
- mechanical state
- wear
- physical state
- other information.
In managing and optimizing predictive maintenance activities based on information received through condition monitoring, it is important that all the correct parameters are being monitored. To achieve the optimal level of maintenance, documentation is required, as it is important that the maintenance strategy has been agreed upon.

CONCLUSION
It has been stated that maintenance costs are a very significant part of system costs. Despite this, only in recent years have more sophisticated maintenance strategies started to be implemented. CM is the simplest maintenance method, and TBM has been the prevailing method in the world. CBM is a somewhat more advanced method than TBM, and the most advanced method is RCM, which is actually a procedure for identifying the optimal maintenance strategy for each item of a system. The whole maintenance task can therefore be implemented by using all three other maintenance methods, CM, TBM and CBM, optimally on different system components.

REFERENCES
[1] A Report of the IEEE/PES Task Force on Impact of Maintenance Strategy on Reliability of the Reliability, Risk and Probability Applications Subcommittee: J. Endrenyi, S. Aboresheid, R. N. Allan, G. J. Anders, S. Asgarpoor, R. Billinton, N. Chowdhury, E. N. Dialynas, M. Fipper, R. H. Fletcher, C. Grigg, J. McCalley, S. Meliopoulos, T. C. Mielnik, P. Nitu, N. Rau, N. D. Reppen, L. Salvaderi, A. Schneider, and Ch. Singh, "The Present Status of Maintenance Strategies and the Impact of Maintenance on Reliability", IEEE Transactions on Power Systems, Vol. 16, No. 4, pp. 638-646, Nov. 2001.
[2] Thorbjörn Andersson, Inger Eriksson, Donald L. Amoroso, "Steering the Maintenance Costs: An Exploration of the Maintenance Construct", System Sciences, 1992, Proceedings of the Twenty-Fifth Hawaii International Conference, Vol. 4, 7-10 Jan. 1992, pp. 348-358.
[3] Olli Lehtonen, "Development of reliability management of electricity distribution in industrial plants" (in Finnish), Master's thesis, Tampere University of Technology, 2001.
[4] J. H. Wichers, "Optimising Maintenance Functions by Ensuring Effective Management of your Computerised Maintenance Management System", IEEE Africon '96 Conference, pp. 788-794, Sept. 1996, Stellenbosch, RSA.
[5] C. Teresa Lam, R. H. Yeh, "Optimal Maintenance Policies for Deteriorating Systems Under Various Maintenance Strategies", IEEE Transactions on Reliability, Vol. 43, No. 3, pp. 423-430, Sept. 1994.

RCM APPLICATIONS IN UNDERGROUND POWER SYSTEMS

Samuli Honkapuro
Lappeenranta University of Technology
Samuli.Honkapuro@lut.fi

INTRODUCTION
The maintenance work on network equipment can be divided into preventive maintenance (PM) and corrective maintenance (CM). Traditionally, most of the maintenance work on underground cables has been corrective maintenance, where the cable is repaired after the fault has occurred. However, diagnostic methods for underground cables have developed, and there is therefore a better potential to implement preventive maintenance for underground cables. Preventive maintenance can, however, demand a considerable amount of resources, such as manpower and money. Preventive maintenance should therefore be focused on those parts of the network where it provides the highest impact on reliability at the lowest cost. For optimising reliability and costs, the maintenance method called reliability-centred maintenance (RCM) has been developed.
In this seminar paper, an RCM process for underground power systems is presented. The fundamentals of RCM and the basic characteristics of the cable network needed for creating the RCM process, such as common failure causes and diagnostic methods, are presented. Three methods to implement RCM for underground power systems are presented: first the general framework and then two case examples.
RELIABILITY CENTRED MAINTENANCE (RCM)
Reliability centred maintenance was developed in the 1960s in the aircraft industry. At that time the largest civil aircraft, the Boeing 747 (the Jumbo), was created. Because of the complexity of the aircraft, preventive maintenance was expected to be expensive, and the development of a new maintenance strategy was needed. The aim of RCM is to achieve cost effectiveness by optimising the maintenance activities in a systematic way. RCM provides a formal framework for handling the complexity of the maintenance process; from a technical point of view, it does not add anything new. (Bertling 2002)
General RCM framework
The RCM framework can be divided into three main steps, as shown in figure 1. The RCM process begins with a system reliability analysis, where the system is defined and the components that are critical for system reliability are identified. The second step is to define the relation between preventive maintenance and the reliability of the component; this is usually seen as the most challenging task of the process. In the third step, the system reliability and cost/benefit analysis puts the understanding of component behaviour gained in the previous step into the system perspective. (Bertling 2002)

Figure 1. Three main steps in RCM analysis. (Bertling 2002)


The RCM process can also be formulated as seven questions, applied after the system items to be analysed have been identified. These questions are: (Bertling 2002)
1. What are the functions and performances required?
2. In what ways can each function fail?
3. What causes each functional failure?
4. What are the effects of each failure?
5. What are the consequences of each failure?
6. How can each failure be prevented?
7. How does one proceed if no preventive activity is possible?
Data requirements
In order to create an RCM strategy successfully, comprehensive knowledge about the system and its components is needed. (Bertling 2002)

At the system level, the following data is required: (Bertling 2002)
- System descriptions and drawings
- PM and control programs
- Commitments and requirements of existing maintenance programs

At the component level, the following data is required: (Bertling 2002)
- The list of components
- Component maintenance history

The following data is required for the cost/benefit analysis: (Bertling 2002)
- The costs of reliability (investment costs and outage costs)
- The costs of undertaking maintenance (costs of manpower, materials etc.)
- The costs of not undertaking maintenance (outage costs for utility and customers)

In addition to the above, knowledge about the relationship between maintenance and reliability is needed. This includes e.g. the failure causes of the components, the effect of maintenance on these failures, and the relationship between failures and lifetimes. An important issue is also which cost factors need to be balanced. (Bertling 2002)
METHOD 1: GENERAL RCM FRAMEWORK FOR UNDERGROUND POWER SYSTEMS
The increasing need for reliability and the decreasing amount of available land in urban areas are the main reasons for the growing use of underground power systems. Although underground cables are more reliable than overhead lines, there are some failures that are typical for cables and that can be partially prevented by appropriate maintenance of the cable system.
Failures in underground power systems
Figure 2 shows the failure causes of cable systems, based on a survey of failures at one substation of the Stockholm City power system. The failures can be divided into short circuit failures, which are caused by insulation failure, and open circuit failures, which are caused by conductor failure. (Bertling 2002)
Figure 2. Failures in the cable system based on the survey. Boxes shown as dashed lines are those that could be considered to be affected by preventive maintenance. (Bertling 2002)
The failure causes shown as dashed boxes in figure 2 are those that could be considered to be affected by preventive maintenance. Based on these results, it can be concluded that about 35 % of the failures in the cable system could possibly be avoided by appropriate maintenance.
The insulation failure of a polymeric-insulated cable is usually caused by deterioration of the insulation, which can be caused, for example, by the water tree phenomenon (Bertling 2002). Water treeing is quite common for polymeric-insulated cables that are in contact with water: water penetrates the polymeric insulation in a tree-like pattern. Water trees can be divided into vented trees and bow-tie trees, as shown in figure 3.
Figure 3. Water trees. (Anon 2005)

Diagnostics in cable networks
The predictive diagnostic techniques for underground cables have been developed over recent decades, and they have become powerful tools for increasing the reliability of cable networks. When cables in poor condition can be detected and repaired or replaced before the fault occurs, the number of customer interruptions can be decreased.
Diagnostic methods for underground cables can be divided into destructive and non-destructive methods. In destructive methods the measured object has to be destroyed. Electrical breakdown testing and water tree testing are two commonly used destructive test methods. Electrical breakdown testing is done using AC, DC or impulse voltage. In water tree testing, thin slices are cut from the cable and analysed with a microscope. Non-destructive methods are based on detecting changes in the insulation of the cable by dielectric diagnostics, which are based on measuring the current in the time and frequency domains. Figure 4 illustrates the water treeing process and the non-destructive diagnostics of water trees. (Bertling 2002)
Figure 4. Water treeing process and the diagnostics of water trees. (Bertling 2002)
Preventive maintenance of the underground cables
Traditionally, underground cables have been replaced only after a fault has occurred, and preventive maintenance of cables has been minimal. With modern diagnostic methods, the condition of a cable can be assessed, and the cable can be replaced or repaired before the fault occurs.
A cable deteriorated by water trees can be repaired with a rehabilitation method. In rehabilitation, a silicone-based liquid is injected between the wires of the conductor. When the rehabilitation liquid comes into contact with water, chemical polymerisation takes place and consumes the moisture. (Bertling 2002)
According to a study by Birkner (Birkner 2004), the costs of the rehabilitation method are roughly 50 % of the costs of replacing the cable. On the other hand, the additional lifetime of a rehabilitated cable is about 20 years, compared to a 40-year lifetime expectancy for a new cable. (Birkner 2004) Based on these figures, rehabilitation and replacement are roughly equivalent from an economic point of view. However, in some other studies rehabilitation has been shown to be an economically and technically beneficial method (Bertling 2002).
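Using the figures cited from Birkner (2004), a simple annualised comparison shows why the two options come out roughly equal; the straight-line annualisation with no discounting is a simplifying assumption of this sketch:

```python
# Annualised cost comparison of cable rehabilitation vs replacement,
# using the cost and lifetime figures cited from Birkner (2004).

replacement_cost = 100.0   # normalised cost of a new cable
replacement_life = 40      # years, expected lifetime of a new cable

rehab_cost = 0.5 * replacement_cost  # roughly 50 % of replacement cost
rehab_life = 20                      # years of additional lifetime

# Straight-line annualisation (ignoring interest/discounting for simplicity).
print(f"Replacement:    {replacement_cost / replacement_life:.2f} per year")
print(f"Rehabilitation: {rehab_cost / rehab_life:.2f} per year")
# Both come out at 2.50 per year, i.e. roughly economically equivalent.
```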
RCM framework
Figure 5 presents the optimal focus of RCM for underground distribution systems. First, the entire cable system is considered for replacement or repair. A fundamental maintenance program is then created based on historic performance and customer load criticality. Predictive diagnostics can then be applied to the areas with the most severe needs. The results of the diagnostics identify the cables that need to be repaired and the cables that need to be replaced. These tasks can then be prioritised by considering the severity of the defects and the number of customers affected. Project implementation can then be performed within the limits of the available human and financial resources. (Reder & Flaten 2000)
[Figure 5 depicts a funnel of successively narrowing steps: consider entire cable and component system; identify failure-prone cables and components; determine dominant failure-prone components; understand critical customer load risk; perform value-added diagnostics; analyse total repair/replace; prioritise repair/replace; implement to resource limits.]

Figure 5. The optimal focus of RCM for underground distribution systems. (Reder & Flaten 2000)
The RCM process for underground distribution systems can also be divided into the following six steps (a toy prioritisation sketch follows the list): (Reder & Flaten 2000)
1. Establish the scope: The work limits are defined by establishing the initial boundaries. The scope of the maintenance activities could include e.g. certain geographical areas, certain feeders, or critical customers.
2. Identify what is not in the scope: The elements that do not affect the goals of the program are defined. Items not included could be, for example, dig-ins, animal-related outages and ground settling.
3. Specify performance goals: The performance level needed to meet company reliability objectives is defined.
4. Identify the problem: The historic problems and their causes are defined.
5. Identify resources available: Resource constraints, such as field resources, computer systems, data gathering tasks and financial limitations, are defined.
6. Create necessary procedures: Necessary procedures for predictive diagnostics and corresponding repairs or replacements are created.
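The prioritisation referred to in figure 5 and in the steps above, by severity of defect and number of customers affected, can be illustrated with a toy scoring scheme; the scoring rule and the data are hypothetical:

```python
# Toy prioritisation of repair/replace tasks by defect severity and
# number of customers affected (hypothetical scoring rule and data).

tasks = [
    {"cable": "F12", "severity": 3, "customers": 1200},  # severity 1..5
    {"cable": "F07", "severity": 5, "customers": 300},
    {"cable": "F31", "severity": 2, "customers": 4000},
]

for t in tasks:
    t["score"] = t["severity"] * t["customers"]

# Highest score first: these tasks get resources first (step 6 procedures).
for t in sorted(tasks, key=lambda t: t["score"], reverse=True):
    print(f'{t["cable"]}: score {t["score"]}')
```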
METHOD 2; CASE EXAMPLE LECHWERKE AG
Lechwerke AG is a German electricity distribution company which operates a radial 20 kV grid. Lechwerke AG has introduced an RCM process for its underground cable network. The total length of the cable network is 3156 km, but only 8,3 % (262 km) of all cables are responsible for 74 % of all outages. These cables were categorised, and the condition of the important cables was tested with IRC analysis (Isothermal Relaxation Current), a non-destructive method that allows the classification of XLPE and PE cables into the ageing classes Perfect, Mid Life, Old and Critical. These classes correlate to the residual dielectric strength of the cable insulation. (Birkner 2004)
In the RCM process, underground cables are categorised into required and not required ones, depending on whether or not they are needed for system operation. Required cables are further categorised into important and not important ones. The evaluation process is shown in figure 6 and is based on practical experience. (Birkner 2004)
Figure 6. The categorisation of the underground cables. (Birkner 2004)
The RCM process for the cable network is shown in figure 7. After the cables have been categorised as shown in figure 6, the diagnostic and maintenance process can be created. If a cable is not required for system operation and has an increased outage rate, it can be decommissioned or dismantled. If a cable is required but unimportant, no activities are performed and the risk of outage is accepted. Only cables that are both required and important are tested with IRC analysis. (Birkner 2004)
Figure 7. RCM program for the underground cable network. (Birkner 2004)
If the result of the IRC analysis is Perfect or Mid Life, the cable is submitted for further testing and is repaired if the testing indicates deficiencies. If the result of the IRC testing is Old or Critical, the cable is replaced or repaired. (Birkner 2004)

The analysis of the cables with increased outage rate shows that 45 % (120 km) of them are considered required and important; these are the cables that are IRC tested. 79 % (95 km) of the IRC-tested cables have been classified as Old or Critical. After the analysis it has been possible to concentrate the replacements on these 95 km of cable. The investment costs for that operation are 5,7 million Euros. The alternative plan, replacement of all the cables with increased outage rate (262 km of cable), would have cost 15,8 million Euros. The savings in investment costs due to the successful RCM process were thereby 10,1 million Euros. The number of outages has also been reduced since RCM was implemented systematically. (Birkner 2004)
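The reported savings can be reproduced from the quoted figures. The per-kilometre cost below is inferred from the 5,7 million Euro / 95 km targeted replacement and is not a separately reported quantity; the small gaps against the reported totals are rounding in the quoted figures:

```python
# Reproducing the Lechwerke AG investment comparison (Birkner 2004).

targeted_km, targeted_cost = 95, 5.7e6        # RCM-targeted replacement
all_poor_km = 262                             # all cables with increased outage rate

cost_per_km = targeted_cost / targeted_km     # inferred unit cost, ~60 kEUR/km
replace_all_cost = all_poor_km * cost_per_km  # ~15.7e6, reported as 15.8e6

print(f"Unit cost:        {cost_per_km / 1e3:.0f} kEUR/km")
print(f"Replace all:      {replace_all_cost / 1e6:.1f} MEUR (reported 15.8 MEUR)")
print(f"Savings from RCM: {(replace_all_cost - targeted_cost) / 1e6:.1f} MEUR (reported 10.1 MEUR)")
```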
METHOD 3; CASE EXAMPLE NORTHERN STATES POWER COMPANY
Northern States Power Company (NSP) implemented RCM for the underground distribution network on feeders in the Minneapolis/St. Paul area. Predictive cable testing was performed on NSP's 241 worst-performing feeders; the methodology is shown in the upper part of figure 8. These feeders represented one-third of the total metro area feeders. The control group, which was not tested, contained 554 feeders. (Reder & Flaten 2000)
Figure 8. Methodology utilised to show the benefits of predictive cable testing and repair. (Reder & Flaten 2000)
After repairs were performed on 40 % of the recommended locations, clear reliability benefits were achieved. The System Average Interruption Frequency Index (SAIFI) of the tested feeders was compared with that of the untested control group. The results of this comparison are illustrated in figure 9.
Figure 9. The reliability improvement of the tested cables. (Reder & Flaten 2000)
As can be seen from figure 9, the SAIFI of the tested group improved by 40 %. In practice this means that approximately 45 000 customer outages were avoided. The SAIFI of the remaining, untested feeders deteriorated, partly due to the hot summer during the testing period. (Reder & Flaten 2000)
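The link between a SAIFI improvement and avoided customer interruptions can be sketched as follows; the customer count and baseline SAIFI are hypothetical values chosen only so that the arithmetic roughly matches the reported figure of 45 000 avoided outages:

```python
# Illustrative link between a SAIFI improvement and avoided customer
# interruptions (hypothetical customer and interruption counts).

customers = 250_000        # customers on the tested feeders (assumed)
baseline_saifi = 0.45      # interruptions per customer per year (assumed)

improved_saifi = baseline_saifi * (1 - 0.40)   # reported 40 % improvement
avoided = (baseline_saifi - improved_saifi) * customers

print(f"Avoided customer interruptions: {avoided:.0f} per year")
# ~45 000, in line with the figure reported by Reder & Flaten (2000).
```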
CONCLUSIONS
Reliability Centred Maintenance has been used in many industrial sectors for several decades. More recently, RCM applications for electricity distribution systems have been created. There are some special challenges in creating an RCM application for underground power systems: condition monitoring and repair of cables need special methods. With new diagnostic methods, improvements in both reliability and cost-effectiveness can be achieved. It has been reported that predictive diagnostics can reveal that two-thirds of the cables scheduled for replacement can actually be kept in service reliably in the near future (Reder & Flaten 2000).
Developed diagnostic methods have generated the need for new methods to optimise the preventive maintenance process of underground power systems. RCM is an appropriate framework for planning the optimal maintenance program for underground cables. Theoretical studies have been made to create an appropriate RCM process for underground power systems. Case studies of RCM applications have shown that an increase in reliability and a decrease in costs can both be achieved by applying the RCM process to underground power systems.
REFERENCES

Anon 2005: http://www.eupen.com/cable/power/medium_voltage/power0212.html

Bertling 2002: Bertling, Lina. Reliability Centred Maintenance for Electric Power Distribution Systems. Doctoral Dissertation, Royal Institute of Technology (KTH), 2002.

Birkner 2004: Birkner, Peter. Field Experience With a Condition-Based Maintenance Program of 20-kV XLPE Distribution System Using IRC-Analysis. IEEE Transactions on Power Delivery, Vol. 19, No. 1, January 2004.

Reder & Flaten 2000: Reder, Wanda & Flaten, Dave. Reliability Centred Maintenance for Distribution Underground Systems. IEEE Power Engineering Society Summer Meeting, 2000.
RCM applications for switchgear and high voltage breakers

Richard THOMAS
Chalmers Tekniska Högskolan, Göteborg & ABB Power Technologies AB¹, Ludvika, SWEDEN
richard.thomas@se.abb.com

¹ Please note that the views expressed in this report do not necessarily reflect the official policies of ABB.
1. AUTHOR NOTE: REFERENCES, DEFINITIONS & ABBREVIATIONS
This report refers to specific terms for which formal or normative definitions exist (IEC / ANSI / IEEE / CIGRÉ), and these have been used wherever possible. They have been collated into a table (including abbreviations and/or symbols) contained in Section 9 of the report and are referenced by a superscripted 9.# reference corresponding to the table numbering where terms are introduced. Definitions may be repeated in the body text where considered relevant to the flow of the discussion. Other references to cited texts are given in square brackets [#], according to the numbering in the Bibliography in Section 10 at the end of the report.
2. INTRODUCTION
Power system operation involves maintaining a balance between supply and demand, usually with a demand-driven focus. As large scale storage of electrical energy has practical limitations, the power system operator needs to maintain a suitably robust and flexible system in order to respond to demand fluctuations instantaneously, allowing for the probability of faults and disturbances on the system that can reduce the supply capability.
There are many sources of possible disturbance to a power system that can result in the loss of a component. These can be categorized according to their cause and frequency as indicated in Table R1 below.
Externally Caused Faults. Primary causes: lightning, falling trees, weather extremes, human error etc. Occurrence frequency: sporadic or random; weather-related faults can show seasonal tendencies.

Equipment Failure. Primary causes: ageing, operational wear, inherent defects, incorrect settings etc. Occurrence frequency: ideally rare; varies according to equipment, insulation, wear and failure mechanism types. Studies of failure statistics can reveal specific failure trends for specific equipment.

Equipment Maintenance. Primary causes: prevention and/or rectification of defects in equipment due to operational wear or age. Occurrence frequency: preventative maintenance is planned; planning criteria can vary, usually either on a time, operations history or condition basis.

Table R1: Categorization of Major Power System Operational Disturbances
External faults refers to faults caused by events external to the power system itself, i.e. not caused by the power system's own equipment or operation. Equipment faults are those caused by the failure of equipment within the power system. Equipment maintenance refers to the need to take equipment out of service for safety and practical reasons in order to conduct maintenance. Equipment failures might also be considered random, but for well understood failure mechanisms, the expected frequency and probability of such failures can to some extent be quantified, including allowance for changes to the risk of failure due to specific equipment age or condition. Maintenance and diagnosis of equipment, on the other hand, can be planned and therefore has additional possibilities for optimization.
The focus of this report is on the application of Reliability Centred Maintenance (RCM)9.13 to high voltage (HV) breakers. The report is structured in the following main parts:

- General principles of RCM as applied to HV breakers in power systems
- Functional description of an HV breaker
- Failure modes of HV breakers
- Application of RCM to HV breakers, including reported utility experience
- Impact of RCM on breaker standards and designs
High voltage breakers are among the most critical primary elements of a power system. They are the main devices relied upon for interruption of load and fault currents in a controlled manner. Failure of a breaker to fulfill its primary functions will normally have a direct impact on the reliability9.11,9.12 and availability9.2 of the power system. Even if such a failure does not result in an immediate loss of supply to a consumer, it will alter the operational state of the power system so as to increase the risk that any further failure in the network results in loss of supply.
The relationship between one of a breaker's primary functions, maintenance, and the above listed power system disturbance categories is illustrated in Figure R1 below:
[Figure R1 links the disturbance categories (external faults, equipment faults, equipment maintenance) to the primary breaker function of interrupting faults and to the two maintenance categories, corrective and preventative; maintenance optimization through RCM spans the focus areas of this report.]

Figure R1: Relationships between Breaker Function, Maintenance and Power System Disturbances
The primary functions of a breaker will be presented in more detail later, though in Figure R1,
interruption of faults is indicated as the most well recognized of these functions. Figure R1
indicates two main categories of maintenance activity: preventative and corrective. There are
important differences between these activities. Preventative maintenance9.10 is aimed at restoring
the condition of equipment so as to achieve a minimum acceptable performance and reliability.
Such maintenance is normally planned. Corrective maintenance9.17 is undertaken as repair or
replacement of failed equipment and is typically unplanned. Both activities are linked to
equipment reliability. Preventative maintenance seeks to maintain or improve reliability.
Corrective maintenance occurs as a consequence of imperfect reliability; meaning that the risk of
failures can only be minimized but never eradicated.
Too little (effective) preventative maintenance can lead to an increase in corrective maintenance. However, there is also a level of diminishing returns in attempting too much preventative maintenance. First, the activity itself normally requires removing equipment from service, thus making it a disturbance to overall power system availability, as indicated above. Second, there is a realistic limit to the level of reliability that can be achieved through maintenance; it cannot eliminate all failure risks, especially so-called hidden failures. This can be expressed in Pareto terms: e.g. 80% of failures can be attributed to 20% of components; alternatively, 80% of failures can be eliminated with 20% of the effort, while eliminating the remaining 20% will require dramatically increasing effort.
RCM is a process to optimize the maintenance activities. The mix of objectives and constraints
governing the optimization process can be complex, but in general RCM aims to achieve an
acceptable level of reliability for a minimum of effort.
3. RELIABILITY CENTRED MAINTENANCE APPLIED TO POWER SYSTEMS AND HV BREAKERS: GENERAL PRINCIPLES
RCM was first developed in the 1960s within the aircraft industry in response to the advent of larger and more complex aircraft designs [3], for which traditional maintenance methods were considered impractical. In the power system context RCM is defined in IEC 60300-3-11 [3] as follows:

Reliability Centred Maintenance9.13: Systematic approach for identifying effective and efficient preventative maintenance tasks for items in accordance with a set of specific procedures and for establishing intervals between maintenance tasks
In other words RCM can be considered as the application of a structured optimization process to
existing preventative maintenance schemes, aimed at meeting the availability requirements of the
power system for the lowest total maintenance cost within a given time frame.
Optimization is generally understood as seeking to maximize output while minimizing input. In this respect, optimization of power system maintenance can be considered as seeking to maximize system availability while minimizing maintenance effort (or cost). It is important to recognize that both system availability and maintenance effort can be measured in monetary terms. Availability can be measured in monetary terms such as the costs of loss of supply or of providing alternative supply, in addition to penalties that might be imposed either by a network regulator or by a customer supply contract due to loss of supply. Maintenance measurement in monetary terms is made most typically in terms of the budgeted costs of personnel, parts and equipment required to carry out the maintenance, but may also include penalty costs from a network regulator associated with taking an item of plant out of service.

RCM also adds reliability considerations to the optimization task, i.e. achieving a specified reliability through optimum maintenance effort. In the power system context, reliability can be considered not only in respect of the probability of a failure, but also of the probability of a failure impacting the direct instantaneous availability or supply capacity of the system, as well as the impact on the redundancy level of the system.
Factors in RCM                                     RCM Goals
Failure rates, λ                                   Minimize
Mean time between failures, MTBF9.7 = 1/λ          Maximize
Mean time to maintain / repair, MTTM/R9.8          Minimize
Cost of maintenance                                Minimize
Cost of unavailability                             Minimize

One must distinguish between system reliability and component reliability. In applying RCM to HV breakers it will become clear that the breaker itself can be considered as a system, due to its complexity and the different functions that involve the correct interaction of many parts. RCM at

- system level means selecting which breakers to maintain
- component level means determining the maintenance tasks providing the maximum return on effort
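The quantities in the table above are linked by standard reliability relationships; a minimal sketch with hypothetical numbers might look as follows (the two-state availability formula MTBF/(MTBF+MTTR) is standard reliability practice, not taken from this report):

```python
# Relating failure rate, MTBF, MTTR and steady-state availability.

failure_rate = 0.5          # lambda, failures per year (hypothetical)
mttr_hours = 48.0           # mean time to repair (hypothetical)

mtbf_hours = 8760.0 / failure_rate                     # MTBF = 1 / lambda
availability = mtbf_hours / (mtbf_hours + mttr_hours)  # classic two-state model

print(f"MTBF: {mtbf_hours:.0f} h, availability: {availability:.5f}")
```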
Figure R2 below illustrates the above considerations in the context of HV breakers on a (fictitious) power system. Several important points are made in Figure R2:

- Different types of circuit breakers according to interrupter type, air-blast (A), oil (O) and SF6 (S), are present on this example system. This reflects the use of the different breaker technologies and designs available at the time each part of the system was built. These different breaker types will have different failure modes, failure rates and maintenance requirements.
- Breakers have different impacts on overall system availability depending on their location. Breakers near the system generators have a greater overall impact than breakers near the radial ends of the system (in this example).
It is important to recognise the influence of the overall power system HV circuit design and the associated breaker reliability / availability on the overall system reliability and availability. Breaker availability and system availability are separate quantities. Normally transmission systems are designed with the N-1 redundancy criterion in mind, i.e. the system's performance should not be placed in immediate jeopardy by the loss of a single line (or component) [7]. On more critical power circuits an N-2 or higher redundancy might be applied.
[Figure R2 sketches a fictitious power system with air-blast (A), oil (O) and SF6 (S) breakers at locations of critical, high and moderate system impact, together with the maintenance costs and failure costs to be balanced over time against the maintenance investment budget and failure risk. The goals indicated are to maximize PBM, PBI and MTBF and to minimize maintenance cost, inspection cost, failure cost, MTTI and MTTM. Legend: PBI = period between inspections; MTTI = mean time to inspect; PBM = period between maintenance; MTTM(/R) = mean time to maintain (/repair); MTBF = mean time between failures.]
Figure R2: Illustration of Breaker Maintenance and System Risk Mitigation Factors
There are two important aspects of breaker availability to be considered with respect to overall power system level N-1 redundancy:
1. Failure of one breaker can shift the system from the N-1 secure state to a merely N secure state. Failure of another breaker may then place the system in a potentially insecure or unstable state.
2. Maintenance that requires removal of a breaker from service can affect the N-1 system status. The longer such maintenance takes, the longer the system can be required to operate without its N-1 margin and thus at higher risk.
Considering the above two aspects of the impact of breaker availability on system performance, RCM on breakers will be aimed not only at reducing direct maintenance costs, but also at reducing total maintenance time (i.e. breaker out-of-service time) while aiming at an acceptable (if not as low as possible) breaker failure probability.
ANSI/IEEE standard 1366-2003 [2] defines quantities to be considered in setting the availability requirements of a power system. Figure A.3 in this standard reports that the following four indices were the most commonly used according to a survey conducted between 1995 and 1997:

- System Average Interruption Frequency Index, SAIFI9.15
- System Average Interruption Duration Index, SAIDI9.14
- Customer Average Interruption Frequency Index, CAIFI9.3
- Average System Availability Index, ASAI9.1

These indices provide various ways to quantify the level of interruption experienced by electricity consumers and thus provide reference indicators for utilities and regulators to monitor and maintain minimum levels of service or performance.
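For reference, the definitions behind SAIFI, SAIDI and ASAI can be written out in the usual IEEE 1366 form, with N_i the number of customers interrupted by event i, r_i the restoration time of event i in hours, and N_T the total number of customers served; this is a standard formulation from general reliability practice, sketched here rather than quoted from the standard:

```latex
\mathrm{SAIFI} = \frac{\sum_i N_i}{N_T}, \qquad
\mathrm{SAIDI} = \frac{\sum_i r_i\,N_i}{N_T}, \qquad
\mathrm{ASAI}  = \frac{8760\,N_T - \sum_i r_i\,N_i}{8760\,N_T}
```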
The application of RCM to HV breakers is a relatively new concept for many utilities; CIGRÉ Brochure 165 [5]2 cites the introduction of RCM in transmission and distribution companies as starting from 1991. Traditionally, HV breaker preventative maintenance has been dictated by manufacturers' recommendations, which are still normally based on a combination of Time Based Maintenance (TBM9.16) together with some Condition Based Maintenance (CBM9.17) guidelines relating to the usage of the breakers. CIGRÉ Brochure 165 (Parts II & III) refers to a survey of 45 utilities conducted in 1998 indicating a planned shift from TBM and CBM dominated strategies towards more CBM + RCM dominated strategies (CIGRÉ Figs III.2 & III.4).
TBM is most commonly applied where the item to be maintained has a well-defined and dominant time-dependent degradation of key functions. An example of such a dependence could be the elasticity of a rubber seal or the drying out of a lubricant.
CIGRÉ Brochure 167 on monitoring and diagnostics for HV switchgear [6] clarifies the difference between CBM and PM. While both methods aim to predict when maintenance will be required, PM normally bases the prediction on the most recent inspection or maintenance in combination with knowledge of the equipment's operating duty and environment. In contrast, CBM is normally based on continuous or routine monitoring of the equipment to predict maintenance based on the present needs of the equipment condition. In short, CBM is viewed as PM applied with more up-to-date information on the equipment condition.
There are a number of normative and guidance publications to support utilities in devising and implementing an RCM program for HV breakers. Some relevant examples include:

- IEC: IEC 60300-3-11 Dependability management, Part 3-11: Application guide, Reliability centred maintenance [3]
- IEEE: IEEE Std C37.10-1995 IEEE Guide for Diagnostics and Failure Investigation of Power Circuit Breakers [4]
- CIGRÉ: CIGRÉ Brochure 165 Life Management of Circuit-Breakers [5]

2 Numerous references are made to CIGRÉ Brochure 165 in the remainder of this report, all traced to ref [5]. Further references will focus on the specific parts of this CIGRÉ document.
IEC 60300-3-11 [3] provides a general process description for RCM implementation, as illustrated in Figure R3.

Figure R3: IEC flowchart for a general RCM programme.
HV breakers differ from most other HV power system equipment in that they are subject to both high electrical and high mechanical stresses during service. In addition, their primary function with regard to circuit switching involves high dynamic stresses. The CIGRÉ definitions for HV breaker failure modes are dominated by cases relating to the failure of a breaker to perform its dynamic operations as intended.
The next sections of this report address the functional description of an HV breaker as a prelude to reviewing types of functional failures. These two subjects provide prerequisite information for then assessing the application of RCM specifically to HV breakers in more detail.
4. FUNCTIONAL HIGH VOLTAGE BREAKER DESCRIPTION
A functional device description is essential when applying RCM since, as stated by Bergman [8], "RCM is a method of preserving functional integrity" and, later, "Defining function is perhaps the most difficult RCM analysis task". From a black box perspective, a breaker has several main functions:

- Conduct rated currents when closed
- Maintain rated insulation withstand across its contacts when open
- Close its circuit always and only on command
- Open or interrupt its circuit always and only on command, according to its ratings
There is a wide variety of HV circuit breaker designs and configurations in existence. They can be classified in various ways, but the most common classifications (for 72-800 kV breakers) are described in Figure R4 below, based on information from [9], [10], [11].
HV Breakers (72-800 kV) are classified as Metal Enclosed (Dead Tank) or Non-Metal Enclosed (Live Tank), and further by:

- Interrupter type: air-blast, oil, SF6
- Operating mechanism: pneumatic, hydraulic, spring
- Operational configuration: three pole operated, single pole operated

Figure R4: Summary of General HV Breaker Classification by Design
There are even more detailed sub-classifications and design differences within those shown above; however, those shown here are adequate for a general review of RCM application to HV breakers. From a historical perspective, oil and air-blast interrupters are the oldest designs and have essentially been superseded by SF6 interrupters since the mid-1980s. There has also been an increasing trend towards the use of spring operating mechanisms from the early 1990s, following the results of the second CIGRÉ enquiry into the reliability of HV breakers [12].
At the most basic level, a breaker can be considered as an interrupter, or set of contacts, that is switched to a closed or open state on command. Control of the circuit breaker is normally via a combination of local controls at the breaker, the substation protection system and the network control system.
The main functions of a circuit breaker described above can be divided into static functions (conducting current when closed; maintaining insulation across the open gap when open) and dynamic functions (closing a circuit; opening or interrupting a circuit). The dynamic functions must be carefully controlled, and certain conditions must be fulfilled for closing or opening to be made successfully.
For a breaker to close successfully, the following minimum conditions must be met:

- The breaker should be in the fully open position
- Sufficient stored energy for a closing (or close-open) operation should be available
- Sufficient dielectric strength in the breaker (to manage immediate opening)
- A valid closing command issued (with the necessary operating latch release energy available)

For a successful opening or trip operation, the required pre-conditions are similar, but with some additional measures:

- The breaker shall be in the fully closed position
- Sufficient stored energy for an opening operation (with current interruption) should be available
- Sufficient dielectric strength in the breaker (to manage current interruption and transient recovery voltage withstand)
- Sufficient commutation margin between arcing and main contacts
- Sufficient dielectric flow control into the arc interruption region (between arcing contacts)
- A valid opening command issued (with the necessary operating latch release energy available)
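These pre-conditions lend themselves to a simple interlock-style check. The sketch below is purely illustrative; the class, attribute names and the 10 kJ energy threshold are invented for the example and do not come from any standard or product:

```python
# Illustrative interlock check of the breaker closing pre-conditions
# listed above (all names and thresholds invented for this sketch).

from dataclasses import dataclass

@dataclass
class BreakerState:
    fully_open: bool
    stored_energy_kj: float
    dielectric_ok: bool
    close_command: bool

def can_close(b: BreakerState, energy_needed_kj: float = 10.0) -> bool:
    """All minimum closing conditions listed above must hold."""
    return (b.fully_open
            and b.stored_energy_kj >= energy_needed_kj
            and b.dielectric_ok
            and b.close_command)

print(can_close(BreakerState(True, 12.0, True, True)))   # True
print(can_close(BreakerState(True, 4.0, True, True)))    # False: low energy
```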
The current interruption process for breakers becomes considerably more complex when considering the broad range of interruption cases a breaker must perform, ranging from near resistive load or highly reactive (capacitive or inductive) load to fault currents. It is easiest to explain the interruption process of an HV breaker by considering an example of a specific design. Figure R5 below (based on Garzon Fig. 5.39 [9] and van der Sluis Fig. 4.5 [10]) illustrates a generic SF6 puffer interrupter. SF6 interrupters are the most common design in production today for breakers above 36 kV.
[Figure R5 labels the main parts of the interrupter: the HV terminals, the single-pressure SF6 volume, the nozzle, fixed and moving main contacts, fixed and moving arcing contacts with the arcing contact gap, the moving contact cylinder, the puffer volume, the fixed piston and the operating rod.]

Figure R5: Illustration of a Generic SF6 Puffer Interrupter
Because of the different stresses on breaker contacts when conducting load current in the closed position compared to interrupting arc currents, breakers are normally equipped with arcing and main contacts of different materials and with specially co-ordinated operating times. Arcing contacts are typically made of a tungsten-based alloy that is highly resistant to erosion from the higher current densities and temperatures associated with arcs (especially fault arcs, which may reach 20 000 K). Main contacts are typically silver-plated copper, designed to conduct load currents with minimal contact resistance and therefore low losses and contact temperatures.
During opening, the arcing and main contact opening times are normally co-ordinated so that the main contacts open first and the current is commutated to the arcing contacts. Once the arcing contacts open, an arc is formed, which should then be interrupted at a current zero. During closing, the contact operating sequence is reversed. On HV breakers a small pre-arc normally occurs as the gap between the arcing contacts becomes small; as the arcing contacts close they carry the current until the main contacts close and largely take over the current conduction duty.
Normally HV breakers are also equipped with some form of arc control and dielectric flow control device, built around and between the arcing contacts (the nozzle in an SF6 breaker). These devices help restrict the arc to between the arcing contacts and away from the main contacts, while at the same time directing the dielectric medium into the arc region to cool the arc and establish a very rapid dielectric withstand at arc current zero.
Figures R6a and R6b below (copied from the author's Licentiate Thesis [11]) illustrate the basic operating principle of an HV (SF6 puffer) interrupter for fault interruption, described in terms of certain time intervals or instants (T1 to T7):
[Figure R6a plots the contact travel of the arcing and main contacts, the fault current, and the voltage across the circuit breaker (TRV) against time (sec), with the trip signal and the instants T1 to T7 marked.]

Figure R6a: Fault Interruption Process of an HV (SF6 Puffer) Breaker
T1: Fault starts.
T2: The protection system has detected the fault and sends a trip / open command to the breaker.
T3: The breaker has responded to the trip command and its contacts have moved to the point where the main contacts separate. The current is now commutated to the arcing contacts.
T4: The contacts have moved to the point where the arcing contacts separate. An arc is formed between the arcing contacts.
T5: The first fault current zero is encountered, but the contact gap and SF6 gas flow are too small for interruption to succeed, and the current re-ignites after current zero.
T6: The current is interrupted at the next current zero.
T7: The peak Transient Recovery Voltage (TRV) is withstood by the opening contacts.
It must be stressed that the above figure is only an example. Different (fault) currents will exhibit
different current zero times with respect to contact parting and so the breaker may interrupt at the
first or second current zero after contact parting, depending on the arcing time and the current
type and magnitude.
[Figure R6b summarizes the interruption stages: T1-T2 breaker closed; T3 main contacts open, commutation to arcing contacts; T4 arcing contacts open, arc forms, pressure rises; T5 first current zero, too short arcing time leads to re-ignition; T6 second current zero, successful thermal interruption; T7 TRV peak, successful dielectric withstand.]

Figure R6b: Fault Interruption Process of an HV (SF6 Puffer) Breaker
This description can be enhanced by considering additional details of the basic open and close functions. For example, HV breakers are required by international standards (e.g. IEC 62271-100 Cl 4.104 [15]) to be able to perform auto-reclosing duties, i.e. operating sequences like:

Open - 0.3 s - Close-Open - 3 min - Close-Open, or Close-Open - 15 s - Close-Open
In order to perform the above operating sequences, the breaker is normally required to have some form of operating energy storage. Due to the strenuous duties required of HV breakers, they are typically very large items of equipment requiring large operating energies (in the order of 2 to 30 kJ per opening operation, depending on interrupter rating and design). There must be a system for transferring the mechanical operating energy to the moving contact system in the interrupter.
In addition, it is required by international standards (e.g. IEC 62271-100, Cl 5.4 [15]) that circuit-breakers be equipped with so-called anti-pumping control. This requires that if the breaker receives a close command, the close circuit shall respond and then be disabled until the same close command is removed. This is to avoid the case where a breaker closes onto a fault and receives a near instantaneous trip signal on closing. If the original close command is still present after the trip has occurred, the breaker should not attempt to re-close until a second or new close command is sent. This stops a breaker performing repeated close-open operations (referred to as pumping) in the presence of a long close command with a persistent trip condition.
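The anti-pumping behaviour described above amounts to an edge-triggered latch on the close command. The following sketch models that logic only; it is an illustrative model, not an actual control implementation:

```python
# Illustrative anti-pumping latch: the close circuit responds once per
# close command and stays blocked until that command is removed.

class AntiPumping:
    def __init__(self):
        self.blocked = False

    def close_request(self, close_command: bool) -> bool:
        """Return True only on a fresh close command."""
        if not close_command:
            self.blocked = False          # command removed: re-arm
            return False
        if self.blocked:
            return False                  # same command still present
        self.blocked = True               # respond once, then block
        return True

ap = AntiPumping()
# A long (held) close command with a persistent trip yields one close only:
print([ap.close_request(cmd) for cmd in [True, True, True, False, True]])
# -> [True, False, False, False, True]
```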
For single pole operated breakers, there can be the additional control requirement that if not all poles of the circuit breaker close within a set time window (usually set between 0,5 and 3 s), a three phase trip command is sent to the breaker to bring all three phases back to the open position. This is to avoid single phasing of the three phase system, which can cause major imbalances in load flow and potentially major damage to large motors or generators.
The main functional modules of a generic HV breaker and its main interfaces to the remainder of the power system are described in Figure R7 below.
[Figure R7 shows the high voltage circuit-breaker as functional modules: HV insulators / housing, the dielectric (interrupting medium) with a dielectric monitor, the interrupter with main and arcing contacts and the arc control device, energy storage with energy charging, energy release and energy transmission, and control & signaling; the external interfaces are the protection relays, network control, auxiliary supplies and the HV primary connections.]

Figure R7: Functional Description of HV Breaker and its System Interfaces
5. FAILURE MODES OF HIGH VOLTAGE BREAKERS
The functional view of a breaker is reinforced by the methods applied in assessing breaker reliability. CIGRÉ has conducted two major international studies on circuit breaker reliability to date (a third is now in progress) [12]. In these studies, failures have been classified in terms of their functional impact. The two main CIGRÉ [12] breaker failure classifications are:

Major Failure (MF): Complete failure of a circuit-breaker which causes the lack of one or more of its fundamental functions. Note: a major failure will result in an immediate change in the system operating condition (intervention required within 30 minutes).

minor failure (mf): Failure of a circuit-breaker other than a major failure; or any failure, even complete, of a constructional element or a sub-assembly which does not cause a major failure of the circuit-breaker.
It should be noted that even though a minor failure does not result in the immediate loss of a breaker's primary function(s), it may either develop into a major failure or, in any event, require user intervention / maintenance, which usually requires the breaker to be taken out of service, i.e. a forced outage.
The CIGRÉ breaker reliability surveys have been an important source of information to the industry (both utilities and manufacturers), and provide a useful reference for RCM planning and assessment on HV breakers. However, some important points should be noted in respect of the first and second surveys:

- Both survey results are now somewhat old, the first covering HV breakers of all technologies from 1974 to 1977 and the second focussing on 18,000 single pressure SF6 breakers from 1988 to 1991. CIGRÉ Working Group A3.06 is now conducting a third and even more comprehensive survey. (The figures quoted from the surveys in this report are from CIGRÉ paper 13-202, published in 1994 [12].)
- While reasonably detailed, considering the scope and complexity of such an international survey task, there is an acknowledged risk that the data is subject to variance in interpretation by the respondents in terms of classifying causes of failures.
- Some differences in definitions between the surveys make comparison of results difficult, in addition to the fact that the second enquiry focussed on SF6 single pressure interrupter breakers while the first enquiry covered all types of interrupter.
The types of circuit breaker failure investigated by the surveys and their failure rates are summarised in CIGRÉ Table 6, reproduced below:
Table 6 - CIGRÉ Paper 13-202, August 1994 - 2nd International CB Reliability Study
Major Failure rate by characteristic for the first and second enquiry

                                           First enquiry         Second enquiry
Characteristic                             MF per 100  (% of     MF per 100  (% of
                                           CB-years    total)    CB-years    total)
Does not close on command                  0,33        33,3%     0,164       24,6%
Does not open on command                   0,14        14,1%     0,055        8,2%
Closes without command                     0,02         2,0%     0,007        1,0%
Opens without command                      0,05         5,1%     0,047        7,0%
Does not make the current                  0,02         2,0%     0,011        1,6%
Does not break the current                 0,02         2,0%     0,020        3,0%
Fails to carry current                     0,02         2,0%     0,010        1,5%
Breakdown to earth                         0,03         3,0%     0,021        3,1%
Breakdown between poles                    0,00         0,0%     0,010        1,5%
Breakdown across open pole (internal)      0,04         4,0%     0,024        3,6%
Breakdown across open pole (external)      0,01         1,0%     0,010        1,5%
Locking in open or closed position         0,00         0,0%     0,190       28,5%
Others                                     0,31        31,3%     0,098       14,7%
No answer                                  0,59        -         0,006       -
Totals                                     1,58        100,0%    0,673       100,0%
CIGRÉ Table 4, reproduced below, summarizes the reported failure types, causes and frequencies from the two surveys on HV breaker reliability:

Table 4 - CIGRÉ Paper 13-202, August 1994 - 2nd International CB Reliability Study
Failures per 100 CB-years

Subassembly                     Major Failure (MF) Rate        minor failure (mf) Rate
responsible                     1st Survey     2nd Survey      1st Survey     2nd Survey
CB-years                        77892          70708           77892          70708
Components at service voltage   0,76   (48%)   0,14   (21%)    0,92   (26%)   1,44   (31%)
El. control and aux. circuits   0,30   (19%)   0,19   (29%)    0,57   (16%)   0,92   (20%)
Operating mechanism             0,52   (33%)   0,29   (43%)    2,06   (58%)   2,05   (44%)
Others                          -              0,05   (7%)     -              0,25   (5%)
Totals                          1,58   (100%)  0,67   (100%)   3,55   (100%)  4,66   (100%)
The above two tables provide some insight into the areas of most concern regarding breaker reliability:

- Failures to operate on command or operation without command accounted for between 40-50% of reported failures in both surveys.
- The high percentage of locked-in-position faults in the second survey is in part attributed to the second survey's focus on SF6 breakers, where the monitoring of SF6 in the breaker is often arranged to block breaker operations once the density level falls to a specified critical level.
- There was a distinct reduction in the number of failures arising from components at service voltage, i.e. HV components, between the first and second survey. This could be attributed to improvements in interrupter design, often also attributed to the reduction in the number of series interrupters required per phase for SF6 versus air-blast or oil breakers.
- Breaker operating mechanisms consistently accounted for over 40% of failures in both surveys. It should be noted that while the second survey focussed on SF6 interrupter breakers, it reported the use of three main types of operating mechanism: pneumatic, hydraulic and spring. It is inferred in this CIGRÉ report that spring operating mechanisms exhibited the fewest failures, and this has been a significant factor affecting further SF6 breaker development (see later).
More specific insight into the attributed relationships between faults and components for SF6 breakers is provided in CIGRÉ report Table 5, reproduced below:
Table 5 - CIGRÉ Paper 13-202, August 1994 - 2nd International CB Reliability Study
Failures per subassembly or component

Subassembly responsible                        Major Failures        minor failures
                                               No.    (% total)      No.    (% total)
1. Components at service voltage               99     21,0%          1019   30,9%
   1.1 interrupting unit                       66     14,0%          310     9,4%
   1.2 auxiliary interrupter / resistor        6       1,3%          20      0,6%
   1.3 main insulation to earth                27      5,7%          689    20,9%
2. Electrical control and auxiliary circuits   137    29,0%          650    19,7%
   2.1 trip / close circuits                   47     10,0%          49      1,5%
   2.2 auxiliary switches                      35      7,4%          69      2,1%
   2.3 contactors, heaters etc                 36      7,6%          178     5,4%
   2.4 gas density monitor                     19      4,0%          354    10,7%
3. Operating mechanism                         204    43,2%          1449   44,0%
   3.1 compressors, pumps etc                  64     13,6%          615    18,7%
   3.2 energy storage                          36      7,6%          238     7,2%
   3.3 control elements                        44      9,3%          383    11,6%
   3.4 actuators, dampers                      42      8,9%          168     5,1%
   3.5 mechanical transmission                 18      3,8%          45      1,4%
4. Others                                      32      6,8%          178     5,4%
Totals                                         472    100,0%         3296   100,0%
CIGRÉ report 13-202 draws some important general conclusions, including:

- Operating mechanisms are the sub-assembly with the most failures; further work was suggested to simplify the electrical and mechanical control process.
- While only 7% of major failures in SF6 breakers were attributed to loss of gas, approximately 40% of minor failures were due to this problem, leading to the conclusion that additional effort was needed (in the early 1990s) to address SF6 gas pole sealing and monitoring.
- Additional type testing for mechanical endurance, climate testing and life cycle assessment was recommended (see the later discussion on impacts on revisions to IEC breaker standards).
- The use of common definitions for reliability studies and maintenance techniques was encouraged.
One further aspect of the CIGRÉ reliability surveys is relevant to the consideration of RCM on HV breakers. Table 8 from the CIGRÉ 13-202 report is shown below:
Table 8 - CIGRÉ Paper 13-202, August 1994 - 2nd International CB Reliability Study
Cause of the Major and minor failures

                                Major Failure              minor failure
Cause                           First       Second         First       Second
                                enquiry     enquiry        enquiry     enquiry
Design                          45,3% *     25,4%          52,5% *     24,7%
Manufacture                                 28,7%                      39,1%
Inadequate instructions         0,7%        1,1%           0,3%        1,7%
Incorrect erection              9,3%        8,2%           10,7%       7,1%
Incorrect operation             1,2%        6,0%           0,2%        4,5%
Incorrect maintenance           8,1%        2,8%           4,5%        2,6%
Stresses beyond specification   4,8%        3,4%           0,7%        1,8%
Other external causes           2,3%        5,4%           1,7%        6,6%
Other                           28,3%       19,0%          29,4%       11,9%
Totals                          100,0%      100,0%         100,0%      100,0%

* In the first enquiry, Design and Manufacture were reported as a single combined figure.
It is of interest to note the change in the percentage of failures attributed to Incorrect maintenance from the first to the second survey. The CIGRÉ report comments that only a small percentage of failures (6-13%) were detected during maintenance, which suggests that maintenance practices (at least at the time of these surveys) were not very effective in detecting potential problems in a breaker that may lead to failure.
While the CIGRÉ surveys were heavily focussed on the breaker as a system in itself, they were also conducted with the intention of providing input for system reliability studies. There are other surveys, e.g. Nordel [13] (see translated extract below), that provide more insight into the potential impact of breaker reliability on overall system reliability.
Extract from Table 4.3, Nordel Fault Statistics Report 2003

Apparatus attributed to fault     Percent impact on non-delivered
                                  energy (average 2000-2003)
Overhead line                     15,6%
Power cable                       6,6%
Power transformer                 5,8%
Circuit breaker                   2,1%
Disconnector                      25,3%
Substation control equipment      12,1%
It should be noted that the Nordel statistics only relate to reported information from the Scandinavian countries and that special local conditions (e.g. climatic extremes) may affect these statistics. Nevertheless, such data can be useful in providing guidance to utilities on which power system apparatus requires the most focus for RCM at the overall power system level.
6. RCM APPLIED TO HIGH VOLTAGE BREAKERS
As described earlier, RCM is a process that can be applied at the overall power system level and
at the power system component (i.e. breaker) level. The first part of this section will concentrate
on the breaker as a component; specifically the discussion will centre on SF6 breakers. Later,
some references to the application of breaker RCM applied at the power system level will be
presented, with reference to a broader range of breaker technologies.
CIGRÉ Brochure 165 [5] provides a good summary of the reported experience of utilities and manufacturers in the formulation of breaker maintenance principles, leading in part to recommendations on the application of RCM. CIGRÉ paper 500-06 within Brochure 165 describes a seven step RCM process, summarized as follows:

1. System selection
2. System boundary definition
3. System description
4. System function and functional failures (as opposed to equipment failures)
5. Failure Modes, Effects and Criticality Analysis (FMECA), including development of appropriate preventative maintenance tasks
6. Logic tree analysis (leading to prioritization of failure modes and related maintenance tasks)
7. Maintenance task selection
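Step 5 (FMECA) is often supported by a simple criticality ranking. The risk-priority-number scheme below is borrowed from generic FMEA practice rather than from CIGRÉ paper 500-06, and the failure-mode scores are hypothetical:

```python
# Toy FMECA-style ranking of breaker failure modes by risk priority
# number (RPN = severity * occurrence * detectability); data hypothetical.

modes = [
    ("Does not open on command", 9, 3, 6),   # (name, S, O, D), each 1..10
    ("SF6 leakage",              4, 6, 2),
    ("Heater failure",           2, 4, 3),
]

ranked = sorted(modes, key=lambda m: m[1] * m[2] * m[3], reverse=True)
for name, s, o, d in ranked:
    print(f"{name}: RPN = {s * o * d}")
```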
This CIGRÉ paper 500-06 is at pains to point out that the RCM process depends heavily on considerable detailed knowledge of the system or equipment to be addressed, including not only statistical performance data but also the proper analysis work to develop appropriate maintenance activities directed towards meeting reliability goals. In short, RCM implementation requires considerable effort and is a continuous process, with feedback from ongoing service and maintenance experience leading (ideally) to continuous improvement and optimization.
As can be seen from the previous section on HV breaker failures, potential problems can arise in every subsystem of a breaker, and maintenance cannot necessarily prevent all such failures. Nevertheless, it is possible to plan maintenance of a breaker that can at least account for the reduction in performance security that occurs due to wear and tear of the breaker arising from its normal operational duties. This relates to the key issues for RCM on HV breakers as components:

- What items to maintain?
- When should they (best) be maintained?
- How should they (best) be maintained?
List I.1 on page 8 of CIGRÉ Brochure 165 provides a convenient summary relating wear factors to specific subsystems in an HV breaker. The basic relationships described in this list are presented in Table R2 below, with some additional inference on the degree of wear influence per case.
The main message of this table is that the wear effects on breakers can be a combination of accumulated mechanical operation stress, current interruptions and, in some cases, time. The dominant wear-inducing factor is accumulated mechanical operations, bearing in mind that this is a number greater than the number of current interruptions, since when a breaker is operationally tested, either in a factory or at site, it is normally disconnected from the power system and thus not interrupting current. Considering the dynamic nature of a breaker and the high mechanical energies required for HV breaker operation, it is not surprising that accumulated operations, and thus accumulated mechanical stress, play a dominant role in determining breaker wear and thereby impact breaker reliability.
[Table R2 relates each breaker component or subsystem subject to operational wear (arcing contacts, main contacts, nozzles (SF6 breakers), operating mechanism, auxiliary contacts, gas sealing system) to five wear-inducing factors (number of mechanical operations, number of current interruptions, number of energy storage recharging cycles, hours of exposure to environmental extremes, and time), rating each influence as Low (L), Moderate (M) or High (H).]

Table R2: Summary of Breaker Subsystem Wear Factors (based on CIGRÉ Brochure 165 List I.1)
CIGRÉ Brochure 167 [6], describing monitoring and diagnostic techniques for HV switchgear, also includes much valuable guidance on the relationships between breaker functions and the component subsystems normally providing those functions within the breaker, e.g. see Tables 2.1-2.4 in the same reference.
The most time-affected parts of a circuit-breaker are normally seals (static and dynamic) and open lubricated parts (e.g. latches or bearings in the mechanical transmission system). NGC (UK) reports in CIGRÉ paper 13-302 [22] on the time-under-pressure exposure life limits seen with air-blast breakers, compounded by the exposure to high oxygen content in such breakers.
It is also important to note that Table R2 above is only a qualitative guide to breaker wear behavior. The individual subsystem dependencies on number of operations or time are not necessarily linear; a complex combination of relationships can exist, differing substantially between breaker designs and technologies. Classical, generic wear behaviors (not necessarily breaker specific) are described in CIGRÉ Brochure 165 Figures I.3 and I.4, as reproduced below:

[CIGRÉ Brochure 165 Figures I.3 and I.4: generic wear behavior curves for predictable and non-predictable failures.]

These figures highlight the important difference between the wear patterns that can exist for predictable and non-predictable failures. Non-predictable failures can only be rectified by corrective maintenance or possibly design modification. In the case that some generic problem or weakness is discovered on a specific breaker type or technology, the utility may be able to implement planned corrective maintenance on the other breakers of this type to minimize the risk of the problem developing into failure.
More typically with maintenance, one thinks of the application of preventative maintenance, addressing those components that have predictable wear behaviors. A specific example of predictive wear maintenance concerns the interrupters of a breaker. As implied in Table R2, there is usually a well known relationship between accumulated current interruptions and interrupter wear. In SF6 breakers such wear is normally concentrated on the arcing contacts and the nozzle. Figures R8 and R9 below (copied from the author's Licentiate Thesis [11]) illustrate this wear relationship:

20

ifault(t)
Input Energy:
ugap(t) . ifault(t) dt

Nozzle

Arc Energy released


ugap (t)

Radiation &
convection

Dissociation
& ionization

Erosion

Arc contact erosion


Nozzle erosion

Figure R8 Influence of Current Arc Raditiation on Arc Contact & Nozzle Erosion
[Figure R9 compares a new interrupter (no arcing wear) with a worn interrupter (arcing wear at the maintenance limit), showing erosion of the arcing contacts and the nozzle.]

Figure R9: Illustration of Typical Arcing Contact and Nozzle Erosion Effects
If the commutation margin between the arcing and main contacts becomes too small, there is a risk that proper commutation may not occur and an arc might be formed between the main contacts, outside the arc control device. Alternatively, if the arc and dielectric flow control devices become too worn with successive interruptions, the effectiveness of interruption at arc current zero can be reduced. In either of these cases there is the risk of the breaker failing to interrupt once its contacts have fully opened, which in the worst case could lead to a catastrophic (i.e. explosive) breaker failure.

Three important questions exist on interrupter wear:

- What are the wear limits?
- How can the wear be measured, monitored or assessed for in-service breakers?
- How often will the wear limits be approached?
Guidance on wear limits is most often sought from, or given by, the breaker manufacturer. This information will normally be derived from the experience of type tests of the breaker design, together with some extrapolation (bearing in mind the high cost of type tests and the range of possible current switching combinations to which a breaker can be subjected). Figures I.1 and I.2 from CIGRÉ Brochure 165 provide examples of interrupter wear behaviors for different current switching duties:

[CIGRÉ Brochure 165 Figures I.1 and I.2: example interrupter wear curves for different current switching duties.]
These examples are important to consider in the context of breaker application. Daily-switched
reactor or capacitor bank breakers may reach their interrupter wear limits much faster than a
breaker of the same type installed for line or transformer switching duty. Such a difference is
underscored by other studies. For example, CIGRÉ paper 13-304 [18] reports on a survey
conducted by German utilities on the current interruption stresses seen by breakers on the 123,
220 and 420 kV German networks. This study found that within a ten-year period nearly 70% of
the surveyed breakers (2000 in total) did not interrupt a fault. Furthermore, it was predicted that
92-98% of the breakers would not wear out their interrupters within a
service life of 35 years. Though this is the experience of only one specific network survey, similar
results are cited in CIGRÉ Brochure 165, which suggests that most HV breakers can
be expected to survive a reasonably long service life (e.g. >30 years) without need of intrusive
interrupter maintenance.
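
To make the idea of tracking accumulated interruption duty concrete, the minimal Python sketch below shows one common style of book-keeping: each interruption contributes a wear increment that grows with the interrupted current (here a simple power law, a frequent curve-fit form for manufacturer wear data), and the running total is compared against an assumed wear limit. The exponent, the limit and the breaker record are hypothetical illustrations, not values from the documents cited above.

    # Minimal sketch of interrupter wear book-keeping, assuming a power-law
    # wear model w = k * I**n per interruption (a common curve-fit form for
    # manufacturer wear data). All numbers below are hypothetical.

    def wear_increment(current_ka: float, k: float = 1.0, n: float = 1.8) -> float:
        """Wear contributed by one interruption of the given current (kA)."""
        return k * current_ka ** n

    def accumulated_wear(interrupted_currents_ka: list[float]) -> float:
        """Total wear over a breaker's recorded interruptions."""
        return sum(wear_increment(i) for i in interrupted_currents_ka)

    WEAR_LIMIT = 5000.0  # hypothetical manufacturer limit, same units as above

    # Example record: many small load-current switchings and a few faults.
    record = [1.2] * 200 + [12.5, 20.0, 31.5]
    total = accumulated_wear(record)
    print(f"accumulated wear: {total:.0f} ({100 * total / WEAR_LIMIT:.1f}% of limit)")

With this kind of record per breaker, a line or transformer breaker accumulates wear very slowly, while a daily-switched capacitor or reactor bank breaker approaches the limit far sooner, which is exactly the application difference discussed above.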
A further consideration in SF6 interrupter wear is that the wear rates of the nozzle and the arcing
contacts may not necessarily follow the same trend with respect to accumulated interruptions.
Lehmann et al. [17] describe work undertaken to show that the wear rates of arcing contacts and
nozzles in SF6 breakers can vary, due to a number of factors including differences in geometries,
materials and relative arc exposure times. This is important because non-intrusive
assessment of nozzle condition is much more difficult than assessment of arcing contact wear.
It is important to understand that such conclusions on interrupter wear must be related to the
specific operational behavior of the network in question and to the performance of the breaker types
on that network. However, information such as the interrupter wear curves of CIGRÉ
Figures I.1 and I.2 above is an important guide for maintenance planners, permitting
breaker maintenance activities to be focussed on the parts of the breakers most relevant to
system reliability and availability. Avoiding intrusive interrupter maintenance has become even
more critical with the advent of SF6 breakers. Increased environmental requirements, strictly
limiting the release of SF6 to the atmosphere, mean that strict procedures for the recovery of SF6 must
be followed (as would be required for interrupter maintenance). In addition, arcing in SF6
generates a number of by-products (e.g. H2S, HF, CF4 and other compounds; see Jones et al.
[14]), which require strict health and safety procedures for the cleaning and disposal of arc-exposed
SF6 interrupter components.
As indicated in Table R2 and by the earlier-cited CIGRÉ breaker reliability survey figures,
operating mechanism failures are a major potential source of problems and thus a target for
preventative maintenance. Addressing operating mechanism failure mitigation through
maintenance is difficult, as the more critical failures can occur suddenly, with little prior
indication, as inferred from the CIGRÉ finding that only 6-13% of failures were
detected during maintenance. It is further complicated, from the utility perspective, in that the
detailed critical knowledge of the mechanism design resides with the manufacturer, and so the
manufacturer must provide adequate recommendations on which items of the mechanical system
need to be maintained, when and how.
Conversely, there is the problem for the manufacturers that they must develop competitive
breakers, and even within a single rating class (e.g. 145 kV, 40 kA, 50/60 Hz) the breakers can end up
in a variety of operational duties, even within a single power system, from seldom-switched line,
transformer or busbar breakers to frequently switched capacitor or reactor bank breakers. The
CIGRÉ breaker reliability surveys and summary reports like Brochure 165 highlight the need for
exchange of information between utilities and manufacturers in order to achieve a practical basis
for implementing RCM. Some results of such feedback are evident in the trends in breaker designs
and are reflected in the changes to international breaker standards (as summarized later in this
report).
There are general guidelines, cited both in CIGRÉ Brochure 165 and by manufacturers
(e.g. Bosma, Thomas [20]), on the types of maintenance activity that can be undertaken with
respect to number of operations and/or time. Such guidelines describe three
levels or categories of breaker maintenance activity, summarized in Table R3 below.
It should be noted that the typical frequency and service-time indications for each maintenance
category can vary considerably depending on both the circuit breaker type/design and its
operational duty (e.g. switching frequency, environmental conditions). It should also be borne in
mind that even where a utility has breakers of similar technology (e.g. SF6 interrupter with spring
operating mechanism), there can be a mix of breakers from different manufacturers with slightly
different maintenance wear limits or activity recommendations.
Category A - Visual inspection: record the operations count, check for obvious signs of damage or leakage, check insulating medium levels (e.g. SF6 density monitor reading).
  Typical frequency: every 1-2 years.
  CB service status: in-service (mostly); alternatively out-of-service, 0.5 day.

Category B - Non-intrusive condition assessment (e.g. CB contact times, speeds, damping curves, contact resistance), cleaning of insulators, minor lubrication (latches/gears).
  Typical frequency: every 5-10 years, depending on CB type/design.
  CB service status: out-of-service; 1 day.

Category C - Fully detailed (intrusive) overhaul: more dismantling of the mechanism and/or poles, replacement of worn contacts, seals and linkages.
  Typical frequency: every 10-20 years, at 2000-10000 operations, or at the interrupter wear limit, depending on CB type & design.
  CB service status: out-of-service; 3-5 days.

Table R3: Summary of Typical HV Breaker Maintenance Task Categories


An interesting example of the implementation of such categorized maintenance, directed in
general RCM process terms, is described in CIGRÉ paper 13-303 by ESB International and ESB
National Grid [23]. This paper outlines a process for evaluating when specific categories of
maintenance should be performed on breakers, including the prioritization of breakers for
maintenance according to system availability risk. The ESB maintenance task categories are not
identical to the A, B, C types described above, but are similar:

- Operational Testing (OPT), made annually (comparable to part of Category B above)
- Operational Servicing (OS), made every 3-4 years, includes an OPT (comparable to Category B above)
- Detailed Servicing (DS), made every 12-24 years, depending on the design of the breaker (comparable to Category C above)

Interestingly, ESB state that in their experience developing faults in breakers are detected
in roughly 1 in 100 OPT or OS activities, and in 10-20 in 100 DS activities, which seems consistent with the
overall CIGRÉ-reported experience of the difficulty of detecting faults during maintenance.
ESB describe two quantities, called "Expected Regret" and "Prevention Ratio", which are used for
this RCM-oriented approach.
Expected Regret (ER) is a measure of the risk incurred by deferring a planned (preventative)
maintenance task. ESB evaluate this by the formula:

ER = S x C

where:
ER = expected regret incurred from deferring a maintenance task
S = severity of consequences of breaker failure in service
C = change expected in the risk of breaker failure due to maintenance task deferral

Prevention Ratio (PR) is ESB's measure of the effectiveness of planned maintenance tasks and
is defined by a simple ratio formula:

PR = (No. of prevented failures) / (No. of prevented failures + No. of failures) = e^(-t/τ)

where:
PR = prevention ratio
t = maintenance task interval (years)
τ = process time constant (years)

ESB summarize the calculated prevention ratios expected from the different tasks, according to the
planned time schedule for each task, in Table II of the paper (reproduced below):
Planned Maintenance Task    Prevention Ratio    Process Time Constant (years)
OPT                         0.90                9.5
OS                          0.82                15.0
DS                          0.81                85.4
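
As a check on the reproduced figures, the PR = e^(-t/τ) expression can be evaluated directly. The Python sketch below assumes task intervals of 1 year (OPT), 3 years (OS) and 18 years (DS); the latter two are assumptions picked from within the ranges quoted above, not ESB data.

    import math

    # Prevention ratio PR = exp(-t / tau) for a task interval t (years) and
    # process time constant tau (years). Task intervals are assumed values
    # within the ranges quoted in the text.
    tasks = {
        "OPT": (1.0, 9.5),    # annual
        "OS":  (3.0, 15.0),   # every 3-4 years
        "DS":  (18.0, 85.4),  # every 12-24 years
    }

    for name, (t, tau) in tasks.items():
        pr = math.exp(-t / tau)
        print(f"{name}: PR = {pr:.2f}")
    # Prints approximately 0.90, 0.82 and 0.81, matching Table II above.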

The paper also describes a ranking system applied to types of functional failure and their impact
at a system (substation) level, e.g. whether a specific CB failure causes tripping of neighboring breakers
or not.
The paper describes how the "Expected Grief" is calculated for doing no maintenance at all for
one year. This figure is then used together with the PR figures to assess the impact on Expected
Grief of deferring different maintenance activities per breaker for a defined period, e.g. 1 year.
The Expected Regret for maintenance deferral is then evaluated as the change in Expected
Grief times the deferral period. This is done for a set population of breakers to derive a priority
list of breakers for the different maintenance activities.
The proposed ESB system seems to have the advantage of being fairly straightforward, and it
provides a means both of assessing the impact (for their system) of re-scheduling maintenance
and of prioritizing maintenance activities for the coming year.
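
The ranking step itself can be sketched in a few lines of Python, using the ER = S x C form given above: compute ER for each (breaker, task) pair and sort, so the deferrals that would incur the most regret identify the breakers to maintain first. The severity and risk-change numbers below are invented for illustration and are not ESB data.

    # Sketch of ESB-style maintenance prioritization: compute ER = S * C for
    # each (breaker, task) combination and rank. S and C values are invented.
    candidates = [
        # (breaker id, task, S: failure-consequence severity, C: change in
        #  failure risk expected from deferring the task)
        ("CB-101", "DS",  9.0, 0.020),
        ("CB-102", "OS",  4.0, 0.010),
        ("CB-103", "DS",  7.0, 0.015),
        ("CB-104", "OPT", 2.0, 0.005),
    ]

    ranked = sorted(candidates, key=lambda c: c[2] * c[3], reverse=True)
    for cb, task, s, c in ranked:
        print(f"{cb} {task}: ER = {s * c:.3f}")
    # The top entries are the deferrals that would incur the most regret,
    # i.e. the breakers to prioritize for maintenance in the coming year.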
EdF (France) describe a more detailed application of RCM to SF6 breakers in Appendix D1
of CIGRÉ Brochure 165 [5], which contains steps similar in principle to the ESB approach. EdF
formally apply an FMECA to set rankings for failure consequences (seriousness) and frequency,
using these to calculate an overall criticality number per fault type. However, the EdF
approach goes into further equipment detail, also assessing the applicability of specific
maintenance tasks in terms of their effectiveness (the degree to which a task can detect or rectify a
fault) and facility (in short, how easy or practical the task is). EdF state that the following relations
were applied:
CRITICALITY = FREQUENCY x SERIOUSNESS (of fault / failure)
APPLICABILITY = EFFECTIVENESS x FACILITY (of maintenance task)
EdF go on to describe the use of a maintenance task selection process using a simple decision tree,
as shown below (Figure 1.6.1.1 from Appendix D1 of CIGRÉ Brochure 165):
[Figure: decision tree. From the FMECA & task analysis, the first test is "CRITICALITY >= 6 of the failure?": if NO, nominal maintenance applies. If YES, the test "APPLICABILITY >= 6 of the task?" follows: if YES, the task is adopted as preventive maintenance; if NO, the question "FAILURE ACCEPTED?" leads to corrective maintenance (YES) or to the search for another task (NO).]
Figure R10: EdF Task Selection Decision Tree
(Based on Fig 1.6.1.1, Appendix D1, CIGRÉ Brochure 165)
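
The tree reduces to a couple of threshold tests, which can be written out directly. The Python sketch below assumes ranking scales compatible with the threshold of 6 shown in the figure, and follows the branch logic as reconstructed above; it is an illustration of the selection logic, not EdF's implementation.

    def edf_task_selection(frequency: int, seriousness: int,
                           effectiveness: int, facility: int,
                           failure_accepted: bool) -> str:
        """Select a maintenance policy per the EdF-style decision tree.

        CRITICALITY = FREQUENCY x SERIOUSNESS (of the fault/failure)
        APPLICABILITY = EFFECTIVENESS x FACILITY (of the candidate task)
        The threshold of 6 follows the figure reconstructed above.
        """
        criticality = frequency * seriousness
        if criticality < 6:
            return "nominal maintenance"
        applicability = effectiveness * facility
        if applicability >= 6:
            return "preventive maintenance"
        return "corrective maintenance" if failure_accepted else "search for other task"

    # Example: a frequent, fairly serious failure with an effective, easy task.
    print(edf_task_selection(frequency=3, seriousness=4,
                             effectiveness=3, facility=3,
                             failure_accepted=False))
    # -> "preventive maintenance"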
The level of effort and complexity required for a detailed equipment-based RCM approach can be
somewhat of a deterrent to many utilities, especially with regard to populations of breakers of older
technologies, where detailed knowledge of the equipment has diminished or possibly even been lost.
National Grid (UK) describe such problems in their CIGRÉ paper 13-302 [22]. They give the
example of an old 420 kV air-blast breaker that has 36 interrupters containing over 20000
components, approximately 1500 of which are moving parts and 2500 of which are various types of seals.
The paper describes a higher, power-system-level approach to breaker RCM, directed towards
determining a reasonable program for replacement of older-technology breakers for which both
the risk of failure and the cost of preventative maintenance are deemed too high.
The CIGRÉ surveys on HV breaker reliability also contain interesting data on HV breaker
maintenance and its relation to reliability. The table below, summarized from Tables 10a and 10b in the CIGRÉ
13-202 1994 survey report [12], shows the reported average maintenance intervals and effort per
breaker voltage level:
From Tables 10a and 10b, CIGRÉ Paper 13-202, August 1994, 2nd International CB Reliability Study
(first enquiry: all CB types; second enquiry: SF6 CBs)

Voltage class (kV)    Average interval between     Average labour effort      Average spare parts "cost"
                      scheduled servicing (years)  (manhours per CB-year)     (manhours per CB-year)
                      1st enquiry   2nd enquiry    1st enquiry  2nd enquiry   1st enquiry  2nd enquiry
63 <= kV < 100        2.30          7.60           19.60        15.30         55.00        25.40
100 <= kV < 200       2.00          8.80           34.00        17.40         38.20        20.70
200 <= kV < 300       2.00          8.20           47.40        24.80         87.50        31.60
300 <= kV < 500       1.40          8.20           48.50        31.00         72.70        17.70

The above table shows a distinct trend towards increased maintenance intervals for SF6 breakers
compared to the overall averages for the mix of technologies in the first survey. In addition, the
cost of maintenance (in manhours per year) is reported to be lower overall for the SF6
population. This provides some indication of the improved reliability and lower
maintenance cost achieved through the introduction of new technologies, which can lead to less
complex devices offering the same or even better function/performance. This aspect is discussed
further in the next section.
7. IMPACT OF RCM ON BREAKER DESIGNS AND STANDARDS
Even if the application of RCM at the equipment level on breakers has possibly been limited to date,
there are indications that the reliability-focussed concerns associated with RCM have had an impact on
circuit breaker design and on international breaker standards.
Both the CIGRÉ breaker reliability surveys and Brochure 165 highlight the desire for more
reliable breakers, or at least breakers requiring as little intrusive detailed maintenance as possible
in order to maintain a minimum acceptable system reliability. This desire is reflected in
recommendations for more extensive mechanical and electrical endurance testing of new
breakers, both of which are now included in the most recent (2003) edition of the main IEC
breaker standard, IEC 62271-100 [15]. This standard includes new classes for breakers relating
to overall mechanical endurance:
Class M1: mechanical operations type test to 2000 CO operations (without maintenance)
Class M2: mechanical operations type test to 10000 CO operations (without maintenance)

The previous IEC 60056 required only the 2000 CO operation endurance test. The addition of the
10000 CO operation test is mainly aimed at frequently operated breaker applications, such as
shunt capacitor or reactor banks. However, given that most manufacturers develop breaker
product lines on the basis of general application, most major suppliers now aim to provide class
M2 for all their breakers, thus providing a very high-level indication of a breaker's capability
to need little or no maintenance based on the number of mechanical operations (given that daily-switched
capacitor and reactor banks probably represent less than 5% of the breaker
population overall in most systems).
In addition, the standard has requirements for electrical endurance testing, initially aimed at
distribution breakers up to 52 kV. This reflects the utility desire to avoid intrusive interrupter
maintenance and at the same time to have a quantitative reference for the cumulative current
switching capability of a new breaker, which would assist future power-system-level RCM
decision making on overhaul or replacement of breakers.
A further important requirement in IEC 62271-100 [15] and IEC 60694 [19] standard type
testing is the measurement of SF6 leak rates during high- and low-temperature tests. The results of
such tests (conducted without maintenance) provide important guidelines for utility maintenance
planning and reporting in regard to SF6 loss control.
One direct example of how RCM concepts can feed back into breaker design is provided in Figure
R11 below:

Figure R11: Example of Optimization in HV Breaker Interrupter Design
(from CIGRÉ paper 13-101, Paris 2000, Bosma, Schreurs (ABB) [16])
The figure shows two variants of a fixed contact design from an HV SF6 breaker. Contact set 2,
on the right, is the later design, based on a much higher integration of the contact design into far
fewer parts. Such a design step can be seen to contribute both to higher reliability and to easier
maintenance, by reducing the number of individual components performing the same basic
functions (i.e. conducting and commutating currents). Reducing the number of components,
without lowering individual component reliability, will directly improve the overall
reliability of the subsystem, and thus the reliability of the breaker as a whole. Integrated
modular parts also help reduce the time and complexity of the maintenance
process.
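
The reliability argument here is the standard series-system product: if a subsystem needs all of its components to work, its reliability is the product of the component reliabilities, so fewer components at the same per-component reliability give a higher subsystem reliability. A small Python sketch with illustrative (invented) numbers:

    def series_reliability(component_reliability: float, n_components: int) -> float:
        """Reliability of a subsystem in which all n components must work."""
        return component_reliability ** n_components

    # Illustrative comparison: an integrated contact set with far fewer parts.
    r = 0.9995  # assumed per-component reliability over some mission time
    for n in (40, 8):
        print(f"{n:2d} components: R = {series_reliability(r, n):.4f}")
    # 40 components: R = 0.9802;  8 components: R = 0.9960
    # -> the integrated design is markedly more reliable for the same part quality.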
8. CONCLUSIONS
Reliability Centred Maintenance is a process that permits the user to structure the collection,
analysis and use of information to optimize preventative maintenance. RCM can be applied at
different levels, e.g. at the power system network level or at the component (HV breaker) level.
The use of RCM appears to be growing, based on the results of surveys and on publications by
utilities and international standards organizations. The RCM drivers of increased
reliability and lower or more effective maintenance are also reflected in the amendments to
equipment standards and in new breaker designs.
It is also clear that the application of RCM can require considerable investment in time and
resources. It requires detailed knowledge and understanding of the system (network or
component) for which maintenance and reliability are to be optimized. Such investment and
knowledge demands can limit a utility's approach to a basic breaker replacement decision
management scheme, though there are also cases of its application at a detailed component
level.
There are a number of comprehensive industry reference documents (e.g. CIGRÉ Brochure 165
and IEC 60300-3-11) that provide guidance on the detailed implementation of RCM for HV
breakers.
Important aspects of RCM for breaker applications, given the complexity of the equipment itself,
include:
- RCM is a structured process
- RCM promotes quantification of function and reliability performance
- RCM should be executed with continuous feedback, allowing ongoing knowledge and experience to be used in a structured way to maintain optimal product and system performance


9. DEFINITIONS AND SYMBOLS


The following table summarizes the definitions and symbols used in this report, with reference to
the source of each definition (where relevant). The majority of the definitions have been taken from
IEC and ANSI/IEEE standards or CIGRÉ guidelines. For reference details, please refer to the
corresponding numbered reference in the bibliography at the end of this report.
9.1 ASAI - Average service availability index [ANSI/IEEE 1366-2003 [2]]:
    ASAI = (Customer hours of service availability) / (Customer hours of service demanded)
         = (NT x (No. of hours/year) - Σ ri·Ni) / (NT x (No. of hours/year))

9.2 Availability [IEC 60050-191 [21]]: The ability of an item to be in a state to perform a required function under given conditions at a given instant in time or over a given time interval, assuming that the required external resources are provided. Notes: (1) this ability depends on the combined aspects of the reliability performance, the maintainability performance and the maintenance support performance; (2) required external resources, other than maintenance resources, do not affect the availability performance of an item.

9.3 CAIDI - Customer average interruption duration index [ANSI/IEEE 1366-2003 [2]]:
    CAIDI = (Σ Customer interruption durations) / (Σ Customers interrupted)
          = (Σ ri·Ni) / (Σ Ni) = SAIDI / SAIFI

9.4 CBM - Condition Based Maintenance [CIGRÉ Brochure 167 [6]]: Similar to preventative maintenance, but the maintenance done is based on information from monitoring equipment, which predicts when maintenance is necessary, rather than on the time in service or number of operations.

9.5 MF - Major Failure [CIGRÉ [12]]: Complete failure of a circuit-breaker which causes the lack of one or more of its fundamental functions. Note: a major failure will result in an immediate change in the system operating condition (intervention required within 30 minutes).

9.6 mf - minor failure [CIGRÉ [12]]: Failure of a circuit-breaker other than a major failure; or any failure, even complete, of a constructional element or a sub-assembly which does not cause a major failure of the circuit-breaker.

9.7 MTBF - Mean time between failures [IEC 60050-191 [21]]: The expectation of the operating time between failures.

9.8 MTTR - Mean time to repair [IEC 60050-191 [21]]: The expectation of the time to restoration.

9.9 PBM - Period between maintenance [CIGRÉ Brochure 165 [5]]: Time period between maintenance on any specific breaker.

9.10 PM - Preventative Maintenance [IEC 60300-3-14 [1]]: Maintenance carried out at predetermined intervals or according to prescribed criteria, intended to reduce the probability of failure or the degradation of the functioning of an item.

9.11 Reliability(1) [IEC 60050-191 [21]]: The ability of an item to perform a required function under given conditions for a given time interval.

9.12 R(t1,t2) - Reliability(2) [IEC 60050-191 [21]]: The probability that an item can perform a given function under given conditions for a given time interval (t1, t2).

9.13 RCM - Reliability Centred Maintenance [IEC 60300-3-11 [3]]: Systematic approach for identifying effective and efficient preventative maintenance tasks for items, in accordance with a set of specific procedures, and for establishing intervals between maintenance tasks.

9.14 SAIDI - System average interruption duration index [ANSI/IEEE 1366-2003 [2]]:
    SAIDI = (Σ Customer interruption durations) / (Total no. of customers) = (Σ ri·Ni) / NT

9.15 SAIFI - System average interruption frequency index [ANSI/IEEE 1366-2003 [2]]:
    SAIFI = (Σ Customers interrupted) / (Total no. of customers) = (Σ Ni) / NT

9.16 TBM - Time Based Maintenance [CIGRÉ Brochure 167 [6]]: The preventative maintenance carried out in accordance with an established time schedule.

9.17 CM - Corrective Maintenance [IEC 60300-3-14 [1]]: Maintenance carried out after fault recognition, intended to put an item into a state in which it can perform a required function.
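
The index definitions above translate directly into code. The Python sketch below computes SAIFI, SAIDI, CAIDI and ASAI from a small invented outage record, with NT the total customers served, and Ni and ri the customers interrupted and restoration time (hours) of each interruption i; 8760 hours per year is assumed.

    # Sketch: compute the ANSI/IEEE 1366 indices from a toy outage record.
    # NT = total customers served; each event = (Ni customers, ri hours).
    NT = 10_000
    events = [(500, 1.5), (1200, 0.5), (300, 4.0)]  # invented data

    sum_ni = sum(ni for ni, _ in events)
    sum_ri_ni = sum(ri * ni for ni, ri in events)

    saifi = sum_ni / NT        # interruptions per customer and year
    saidi = sum_ri_ni / NT     # interrupted hours per customer and year
    caidi = saidi / saifi      # hours per interruption (= SAIDI / SAIFI)
    asai = (NT * 8760 - sum_ri_ni) / (NT * 8760)

    print(f"SAIFI = {saifi:.3f}  SAIDI = {saidi:.3f} h")
    print(f"CAIDI = {caidi:.3f} h  ASAI = {asai:.6f}")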

10. REFERENCES AND BIBLIOGRAPHY


[1] IEC 60300-3-14, Dependability management - Part 3-14: Application guide - Maintenance and maintenance support, First Edition, IEC, Geneva, March 2004.

[2] ANSI/IEEE 1366-2003, IEEE Guide for Electric Power Distribution Reliability Indices, IEEE, New York, 2004.

[3] IEC 60300-3-11, Dependability management - Part 3-11: Application guide - Reliability centred maintenance, First Edition, IEC, Geneva, March 1999.

[4] IEEE Std C37.10-1995, IEEE Guide for Diagnostics and Failure Investigation of Power Circuit Breakers, IEEE, New York, 1996.

[5] Life Management of Circuit-Breakers, CIGRÉ Brochure No. 165, CIGRÉ Working Group 13.08, CIGRÉ, Paris, August 2000.

[6] User Guide for the Application of Monitoring and Diagnostic Techniques for Switching Equipment for Rated Voltages of 72.5 kV and Above, CIGRÉ Brochure No. 167, CIGRÉ Working Group 13.09, CIGRÉ, Paris, August 2000.

[7] Weedy B.M., Cory B.J., Electric Power Systems, 4th Edition, John Wiley & Sons Ltd, Chichester, 1998.

[8] Bergman W.J., Reliability centred maintenance (RCM) applied to electrical switchgear, Proceedings of the IEEE Power Engineering Society Summer Meeting, vol. 2, pp. 1164-1167, 1999.

[9] Garzon R.D., High Voltage Circuit Breakers - Designs and Applications, Marcel Dekker, Inc., New York, 1997.

[10] van der Sluis L., Transients in Power Systems, John Wiley & Sons Ltd, Chichester, 2001.

[11] Thomas R., Controlled Switching of High Voltage SF6 Circuit Breakers for Fault Interruption, Technical Report No. 514L, Chalmers University of Technology, Göteborg, 2004.

[12] Heising C.R., Colombo E., Janssen A.L.J., Maaskola J.E., Dialynas E., Final Report on High-Voltage Circuit Breaker Reliability Data for Use in Substation and System Studies, Paper 13-201, CIGRÉ, Paris, Aug 28 - Sept 3, 1994.

[13] Driftstörningsstatistik - Fault Statistics 2003, Nordel (from www.nordel.org), 2003 (original in Swedish).

[14] Ryan H.M., Jones G.R., SF6 Switchgear, ISBN 0 86341 123 1, Peter Peregrinus Ltd., London, 1989.

[15] IEC 62271-100, High-voltage switchgear and controlgear - Part 100: High-voltage alternating-current circuit-breakers, IEC, Geneva, May 2001.

[16] Bosma A., Schreurs E., Cost Optimisation Versus Function and Reliability of HV AC Circuit-Breakers, Paper 13-101, CIGRÉ, Paris, Aug 2000.

[17] Lehmann K., Zehnder L., Chapman M., A Novel Arcing Monitoring System for SF6 Circuit Breakers, Paper 13-301, CIGRÉ, Paris, Aug 2002.

[18] Neumann C., Balzer G., Becker J., Meister R., Rees V., Sölver C-E., Stress of HV Circuit-Breakers During Operation in the Networks - German Utilities Experience, Paper 13-304, CIGRÉ, Paris, Aug 2002.

[19] IEC 60694, Common specifications for high-voltage switchgear and controlgear standards, Edition 2.2, IEC, Geneva, January 2002.

[20] Bosma A., Thomas R., Condition Monitoring and Maintenance Strategies for High-Voltage Circuit-Breakers, Proceedings of the 6th International Conference on Advances in Power System Control, Operation and Management (APSCOM 2003), Hong Kong, November 2003.

[21] IEC 60050-191, International Electrotechnical Vocabulary - Part 191, IEC, website: std.iec.ch.

[22] Reid J., Bryan U., Measurement of life and life extension: A Utility View by National Grid UK, CIGRÉ Preferential Subject 3, Paper 13-302, CIGRÉ, Paris, 2002.

[23] Higgins A., Corbett A., Kelleher C. (ESB Ireland), A procedure for allocating limited resources to circuit breaker planned maintenance, CIGRÉ Paper 13-303, CIGRÉ, Paris, 2002.

End of Report.


RCM Applications for Generator Systems


Nathaniel Taylor
KTH, Stockholm
nathaniel@ets.kth.se
2005 08 22

Introduction
This report considers high-voltage (HV) rotating machines, i.e. large generators and motors, and in particular their insulation systems. In relation to the application of Asset Management (AM), a description is given of the past and modern construction of such machines and of current methods of condition assessment.
Most electricity is, at present, supplied from large generators of tens, hundreds and even over a thousand megawatts (MW) each. When dealing with such large powers from single machines, failure of a generator is clearly highly undesirable to the power system operators, as well as being very expensive, in outage time and in repairs, to the generator's owner. Exactly how important the reliability is does of course vary by more than just machine power rating, since countries and contracts vary in their penalisation of lost supply.
Motors of power ratings as high as the large generators are not used, but motors rated in the tens of MW up to rather more than 100 MW exist in some industries. Such high-power motors are generally machines of the same type as generators (synchronous machines), while lower power (several MW and less) motors are usually of the induction type. Motors can be claimed to have a greater variation of importance than generators, from those whose failure has no worse effect than the need, eventually, to repair or replace the motor, to those whose failure implies the halting of a far larger process system or the creation of a dangerous state. Motors also have a wide range of duties, from near-constant applications with only occasional starting and stopping, to applications with frequent hard stopping and starting and possibly rapid changes in load. The justification for on-line and off-line condition monitoring and maintenance therefore varies vastly, and not in a way that depends only on the power rating of the machine.
In order to limit the scope of this short report to the area of interest for the author's work, only synchronous machines with power ratings of a megawatt and more are considered here, and the stator insulation is the main focus.

HV machine construction
Overview of construction
Large synchronous AC machines consist of a cylindrical iron rotor moving within the bore
of an iron stator. Both of these parts have electrical windings of insulated conductors
arranged to produce a magnetic field linking the rotor and stator.
The stator winding is formed by conductors within slots in the surface that faces the
rotor. In these windings the working power of the machine is generated or dissipated.
The rotor winding is of similar slotted construction in high-speed machines, or has
windings around protruding pole-pieces on lower speed machines. The rotor conductors
are excited by a DC supply, and have a maximum voltage to ground of only a few kilovolts (kV) even on large machines.
The number of magnetic pole-pairs around the stator or rotor is an important feature of a machine: it determines the ratio of the (fixed) electrical supply frequency to the frequency of mechanical rotation. Gearing the huge powers involved in electrical generation is quite impractical compared to adapting the electrical machine, by adjusting the number of pole-pairs, to interface the different angular speeds of the electrical system and the mechanical power-source.
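
The fixed relation is n = f/p revolutions per second, or 60·f/p in rpm, for supply frequency f and p pole-pairs. A one-line Python check with two common cases:

    def sync_speed_rpm(f_hz: float, pole_pairs: int) -> float:
        """Synchronous mechanical speed in rpm for frequency f and p pole-pairs."""
        return 60.0 * f_hz / pole_pairs

    print(sync_speed_rpm(50, 1))   # 3000.0 rpm: 2-pole turbo-generator
    print(sync_speed_rpm(50, 28))  # ~107 rpm: many-pole hydro-generator (28 pairs assumed)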
Hydro-turbines move at quite low speeds, and their attached generators therefore have many pairs of poles and rotate at low speeds of a few revolutions per second. To accommodate such a large number of pole-pairs the diameter of the machine is very large. The rotors of such machines are of the salient-pole type, with mushroom-shaped iron pole-pieces protruding from a central rotor core. The rotor windings are then around the sides of the pole-pieces.
Steam-turbines and gas-turbines operate at higher speeds, often sufficiently fast (3000 rpm on a 50 Hz system) for their attached generators to have just one pair of poles. The rotors are then of modest diameter and considerable length, and are of slotted cylindrical construction, often called "round rotors".
Through the power range of interest here, a few MW to over 1000 MW, there are various relevant differences in the design of windings, depending on power rating, age and the mechanical power source or load; the winding construction and cooling methods are of particular interest.
Cooling
To remove heat from the windings and iron, several methods may be used. These fall into two main groups: indirect cooling, which removes heat from the conductors after it has passed through the insulation, and direct cooling, which removes heat by circulation of a fluid within the windings.
Indirect and direct cooling. Indirect cooling is used in three variants: open-ventilated indirect-cooled machines have air from the environment passed through the machine; recirculated-air or recirculated-hydrogen cooling uses enclosed air or hydrogen circulated within the machine and cooled by a heat-exchanger with air or water on the other side. When using recirculated gas, it is possible to pressurise the machine's containment, improving the thermal (volumetric) capacity of the gas. Hydrogen has a particularly good thermal capacity.
Direct cooling uses either purified water or hydrogen, circulated either through hollow conductors or through separate ducts alongside the conductors. It is used as a supplement rather than as an alternative to indirect cooling, for cases where the advantages of better cooling outweigh the considerable cost and complexity of including the ducts, connecting them in the end-winding region, and providing external apparatus for treating and cooling the fluid.
Electrical consequences of cooling systems. Use of hydrogen, or of greater-than-atmospheric pressure, has further positive effects on the electrical insulation. The breakdown strength of the higher-pressure gas is greater than for atmospheric air, resulting in reduced risk of PD in the slots and end-windings. The absence of oxygen means that oxidative thermal deterioration of the insulation is reduced, which is relevant both to deterioration due to operating temperature and to that due to PD. Ozone detection systems, mentioned later, are of no use in hydrogen-cooled machines, since no significant oxygen is present to form the ozone.
Applications of cooling. Only small machines use open-ventilated indirect cooling. Modern turbo-generators up to a few hundred MW are indirectly cooled by recirculated air, and larger ones by recirculated hydrogen, while before the 1990s cooling by recirculated hydrogen was sometimes used even at ratings as low as 50 MW. Direct water-cooling is used in hydro-generators at ratings above about 500 MW. Modern turbo-generators have direct hydrogen or water cooling of stator windings at ratings above about 200 MW, though older designs (1950s) use direct cooling on machines as small as 100 MW.
Stator insulation requirements
A high-power machine must have a large product of stator-terminal current and voltage. Making either of these quantities large is, however, undesirable, since high voltage puts greater demands on the insulation, while high currents necessitate thick conductors (bending problems, skin effect, eddy current losses) or many parallel connections (circulating currents, complex connections in the end-windings) and make harder the matter of transferring current from the terminals (losses, magnetic forces). The compromise chosen as optimal, changing little over the years, is a stator voltage of up to around 30 kV line-voltage on the largest ratings, and about half this value even for much lower powers of tens of MW.
A particularly strong constraint imposed by machine design is that every bit of space around the conductor area is very valuable: all electromagnetic machines are a compromise between "iron" and "copper" (magnetic and electrical conductors), and to the electromagnetic designer any insulation is just an unfortunate necessity that must be kept to as small a cross-sectional area as possible.
The windings operate at high temperature, due to the need to use space efficiently and therefore to economise on conductor area. The conductor temperature of indirectly-cooled windings is determined by the insulation's thermal properties, with thinner and more thermally conductive insulation being desirable. The temperature also affects the insulation's deterioration, since for an air-cooled machine much of the aging of the organic insulation material is by reaction with oxygen.
Coil or bar stator windings
Machines of more than about 1 kV (which implies a maximum practical power of hundreds of kW) always have "form-wound" stators: the windings are carefully prepared with conductors and insulation before being inserted in the stator.
Up to about 50 MW a coil-type form winding is used, in which a multi-turn loop of conductors with insulation is prefabricated, ready to be inserted so that the two long parallel sections ("legs") fit into stator slots and the remainder protrudes from the ends of the stator. This is quite simple to construct, as one end of the stator then has no explicit connections needing to be made; the connections are just continuations of the conductors.
For larger ratings it is impracticable to fit so thick and rigid a coil into the stator, and the windings are fabricated as single bars for insertion in the stator: this is a bar-type or "Roebel bar" construction. It is then necessary at each end to make connections between individual conductors, which is complicated still further if channels for direct cooling are present.
Stator conductors
Within a coil or bar there are separate conductors (ten, as a very rough idea). A high-power machine has high currents, and if a single copper wire were used for the winding, its area would be so great as to cause unacceptable skin-effect and eddy-current losses and to make it insufficiently flexible. Several smaller conductors, the "strands", are therefore connected in parallel to form a larger conductor.
The strands in a bar-type winding are arranged to change position regularly ("transposition") in order to minimise differences in the induced voltages in spite of different magnetic conditions in different parts of the slot. Strands in a coil-type winding usually do not need this transposition, as the smaller machine size results in less distance between different strands; an inverted turn at the end-winding can be used to swap inner and outer parts between the two slots that a coil occupies. Turns of parallel strands are then connected in series to form whole coils and ultimately whole windings consisting of many coils.
The set of conductors and insulation that is put in each stator slot is conventionally of rectangular cross-section with rounded edges, constrained by the cross-section demands of the iron between the slots and by the practicalities of making the winding. Some recent development has been made of fundamentally different insulation systems, based on circular-section conductors of cable-insulation design, but the conventional construction is still far more common. The corners of conventional bars result in a non-uniform electric field in and around the insulation, and therefore in some over- or under-stressed regions.
Stator insulation geometry
A three-part insulation system is used for form-wound stators: a thin strand insulation able to withstand the high temperatures and low voltages (around 10 V) between strands, then a turn insulation able to withstand the induced voltage of a loop through the stator. At points where the strands have a transposition, extra insulation is often needed to fill the gaps. Since the 1970s many manufacturers have used a strengthened strand insulation to obviate the need for turn insulation.
To insulate the turn insulation from the stator core's ground potential, a further layer, the groundwall insulation, is used. Although the groundwall thickness could be varied along a winding, from a small amount at the neutral point to the necessarily greater thickness near the phase terminal, this is avoided for simplicity of geometry; a trick facilitated by this is the reversal of the connections of a winding so that the previously higher-stressed parts of the groundwall are stressed less than the other end, so prolonging the insulation's life. For machines operating at more than 6 kV the groundwall insulation is covered with a semiconductive (low-conductivity, 0.1-10 kΩ/sq) compound, usually with carbon black as the conductive part. This is sufficiently conductive to ensure that cavities between the bar and the stator will not be exposed to electric fields high enough to cause partial discharges (PDs), but sufficiently resistive that the laminations of the stator iron will not be shorted out.
The end-winding region is the part of the winding outside the stator core, where the windings through the stator are connected to each other to form loops. With the zero-potential surrounding of the stator core removed, the outer surface of the insulation has a potential due to the conductors inside. In the region near the stator, the electric field in the surrounding medium (air or hydrogen) is particularly high, due to the field from the closest part of the end-winding concentrating at the ground potential. This may lead to PDs on the surface, with damaging effect.
As the end-winding of large machines is a mass of connections and is therefore sensitive from an insulation perspective, it is preferred that the end-windings be allowed to remain remote from ground potential; coating the end-windings with a well-conducting material is therefore not desired (and induced voltage would also need to be allowed for in that case). Instead, a semi-conducting coating of high resistivity, usually including silicon carbide (SiC) to give it a non-linear current/voltage-gradient relation, is applied for several centimetres from the end of the winding's outer semiconducting coating of relatively low resistivity. The surface potential then falls off quite smoothly as the stator is approached from the end-windings, and surface discharges are avoided. The grading and semiconductor materials are applied as paint or, in more modern designs, as bands.

Insulation materials
From early beginnings of organic cloths with oils or resins on them, stator insulation now relies on mica flakes surrounded by a polymeric binding. Mica is a mineral that is very inert and is favoured for many high-temperature applications. It is also very enduring of partial discharges. Windings are coated with a paper of mica flakes and then, outside the stator or else all in situ, an epoxy resin filler is added to impregnate the flakes. In older machines, still widely used, bitumen was used instead of epoxy, giving higher conductivity (due to water absorption), the risk of softening or even running out of the windings at high temperature, and easier formation of voids that allow PD.
Sources of faults
The parts that are most susceptible to aging and consequent failure are the insulation systems of the two windings, and the bearings that support the rotor. The bearings of large or critical machines are monitored on-line for vibration and temperature, and even quite small machines of a few MW often have regular scheduled measurements. Bearing problems can even arise from winding problems, if an asymmetry in the magnetic field is caused by, for example, turn-shorts.
Which insulation system (stator or rotor) is more prone to problems depends on the type of machine. A survey of North American hydro-generators cited in [1] suggests that around 40% of outage-causing faults are due to stator insulation, and fewer to the rotor

(the remainder are mainly mechanical). In [1] it is claimed, as the (illustrious) authors' opinion, that turbo-generators generally tend to the contrary, with rotor insulation being the greater source of outages, but it is stated that there is no widespread detailed survey of such machines to the extent performed for hydro-generators. [2] gives some figures for North American turbo-generators of air- or H2-cooled designs: the generator is seen as the least troublesome of the main power-plant components, causing around 0.5 occurrences per year; on further analysis within the generator, the generator's HV insulation gives more faults than the other components.
The rotor insulation has lower voltages to insulate, since about 1 kV (DC) is the maximum normal excitation voltage, but it has less direct cooling in the larger machines and has to operate with large rotational forces on it. The thermal conditions under highly excited operation can also be very severe.
On-line electrical monitoring of rotor insulation is impractical with many designs of excitation system (rotating exciter or DC machine), since there is no direct electrical connection from the stationary part of the machine to the rotor winding. A short-circuited turn in the rotor only results in a less effective field, in contrast to the case of the stator, where the alternating magnetic field would cause a huge current to flow in that turn; the rotor turn insulation is therefore not as critical a candidate for continual monitoring. A rotor ground-fault can also be mitigated by systems that use an excitation source that is not solidly grounded, i.e. a single ground fault can be tolerated, although usually this is taken as a severe state requiring prompt action.
From here on, just the stator insulation will be considered.

Stator insulation failure mechanisms


Thermal deterioration of the organic component of the insulation, particularly in air-cooled machines, is a continual process, often simply modelled with an Arrhenius-rate reaction, i.e. an exponential relation of rate to temperature above a minimum value. The deterioration leads to more brittle insulation material, which in turn can increase the effect of vibrational aging.
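
A minimal Python sketch of the Arrhenius-rate picture: the aging rate is proportional to exp(-Ea/(kB·T)), so the expected thermal life at temperature T scales as the inverse. The activation energy and reference temperature below are invented for illustration; with Ea around 1 eV this reproduces the familiar rule of thumb that roughly every 10 °C of extra temperature halves insulation life.

    import math

    KB_EV = 8.617e-5  # Boltzmann constant, eV/K

    def relative_life(temp_c: float, ref_temp_c: float = 120.0,
                      ea_ev: float = 1.05) -> float:
        """Life at temp_c relative to life at ref_temp_c, Arrhenius model.

        ea_ev is an assumed activation energy; real values are material-
        and machine-specific.
        """
        t = temp_c + 273.15
        t_ref = ref_temp_c + 273.15
        return math.exp(ea_ev / KB_EV * (1.0 / t - 1.0 / t_ref))

    for temp in (110, 120, 130, 140):
        print(f"{temp} degC: {relative_life(temp):.2f} x reference life")
    # Roughly a factor of ~2 per 10 degC with this activation energy.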
Thermal cycling may cause stresses and movement of whole bars (complete with groundwall insulation) axially relative to their slot, or, for quick changes in load on epoxy-mica bars, the forces of bar against groundwall insulation may cause internal movement in the bar. The windings are the main heat source at high load, so they warm up faster than the surrounding iron and reach a much higher temperature, besides having different coefficients of thermal expansion. Mechanical stress and/or movement occurs during changes of load, which should be made slow and infrequent in order to minimise the effect of thermal cycling.
There are large forces between conductors even with the currents of normal operation; separate conductors within a bar, separate bars within a slot, and nearby connections in the end-winding region all experience forces that alternate at twice the power frequency and that can damage the insulation in regions where there is looseness that allows movement. During a short circuit the forces can be many times greater, possibly causing internal damage that initiates longer-term degradation; the bars must be very firmly held in place, particularly in the end-winding regions.
The semi-conducting coating of the stator bars may wear out, due to chafing from vibration in the slot, to thermal expansion, or to arcing caused by lamination short-circuits. The end-winding stress-grading materials may also become less effective with time, leading to surface PD that will only worsen the state of the end-windings. Band-applied grading is found to be more durable than paints. Stress between windings in the end-winding region, due to bad design or to movement, can also cause PDs. In all these cases of PDs against the outside of the stator insulation, wear will be caused, possibly leading to a complete breakdown.

Condition Assessment
Off-line tests and measurements
Off-line tests were used even with the earliest machines, before the technology and the demands for availability made on-line monitoring feasible. Off-line tests are still used at maintenance times, and allow measurements that give information which is better than, or complementary to, that of on-line measurements. Many popular off-line tests are very simple; more sophisticated tests, while of great interest if able to improve diagnoses, have the problem of how their more detailed information should be interpreted to determine the relevant condition details.
Manual and visual inspection of the machine, possibly requiring disassembly of some parts, can detect looseness of bars in slots, wear and residues on surfaces due to repeated PD, signs of overheating, and more. The need for internal access, rather than, for example, electrical measurements at the terminals or from the end-windings, does make direct-inspection methods less convenient in cases where the maintenance does not otherwise require disassembly. The risk of worsening the state of a working machine by movement of parts, or by an oversight during disassembly or reassembly, should be borne in mind. It is claimed in [1] that, when conducted by an expert, such inspection is the best form; but this surely neglects voids causing PD well within the insulation layers.
Smooth (non-pulsed) currents. When a voltage is applied across insulation, current may flow for various reasons. Some current flows during changes of voltage, charging the fundamental free-space capacitance of the insulation; a material insulation will then undergo some polarisation of molecules, or alignment of already polar molecules, resulting in a weakened field and consequently more current flowing while the polarisation is happening. This polarisation current is of interest for showing changes in the chemical structure of the insulation, or the ingress of moisture (water is highly polar).
Some current may flow due to pure conduction in the insulation material, mainly in wet, old bituminous insulation. Modern epoxy-mica insulation that has not been abused by long direct exposure to water has negligible conductivity. This is a current that continues even at constant voltage, rather than falling away as the capacitive current does. Surfaces of insulation often have enough contamination to make them far more conducting than the insulation's bulk; this adds to the conduction current.
Even PDs can be measured by smooth-current methods, which are particularly suitable for the case of many small PDs, as often found in a stator winding.
Methods of measuring smooth-current effects cover a wide range of complexity. Insulation Resistance (IR) tests simply measure the DC current flowing after the application of a voltage for a time that allows fast polarisation to be completed. The Polarisation Index (PI) is similar, but uses a ratio of currents at two different times, to prevent the very high sensitivity of measured resistance to temperature from having so powerful an effect on results intended for comparison between machines or over a machine's lifetime.
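
A sketch of the IR/PI arithmetic in Python: PI is conventionally the ratio of the insulation resistance at 10 minutes to that at 1 minute (equivalently, the 1-minute current over the 10-minute current), so a clean, dry winding whose absorption current keeps decaying gives PI well above 1, while a wet or contaminated winding dominated by steady conduction current gives PI near 1. The readings below are invented.

    # Polarisation Index from two insulation-resistance readings:
    # PI = R(10 min) / R(1 min) = I(1 min) / I(10 min).

    def polarisation_index(r_1min_mohm: float, r_10min_mohm: float) -> float:
        return r_10min_mohm / r_1min_mohm

    # Invented readings, in megaohm, for two windings:
    print(polarisation_index(r_1min_mohm=800, r_10min_mohm=2400))  # 3.0: dry, healthy
    print(polarisation_index(r_1min_mohm=150, r_10min_mohm=165))   # 1.1: suspect (wet/dirty)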
Measurements, e.g. by a Schering bridge, of the effective resistance and capacitance of a winding at a particular voltage and frequency are widely used, with the ratio of dielectric loss to capacitance (the loss tangent, or tan δ) being an important value. Taking such measurements with varied voltage and observing any trend in loss or capacitance can be useful, particularly for the detection of PD inception (the "tan δ tip-up" test).
More detailed information about insulation can be found from Dielectric Spectroscopy (DS): the change in current over time is measured for a stepped voltage (time-domain method), or the magnitude and phase of the fundamental current, and possibly some harmonics too, are measured at several different frequencies and amplitudes of the applied voltage (frequency-domain method).
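
The loss tangent itself is simple arithmetic once the parallel-equivalent R and C of the winding are known (e.g. from a Schering bridge): tan δ = 1/(ωRC), and the tip-up is the difference in tan δ between a high and a low test voltage. The Python sketch below uses invented bridge results:

    import math

    def tan_delta(r_parallel_ohm: float, c_farad: float, f_hz: float = 50.0) -> float:
        """Loss tangent of a parallel R-C equivalent: tan(delta) = 1 / (w R C)."""
        return 1.0 / (2 * math.pi * f_hz * r_parallel_ohm * c_farad)

    # Invented bridge results for one phase winding at two test voltages:
    c = 0.3e-6  # winding capacitance, F
    td_low = tan_delta(r_parallel_ohm=1.2e6, c_farad=c)   # at low test voltage
    td_high = tan_delta(r_parallel_ohm=0.8e6, c_farad=c)  # at rated voltage

    print(f"tan d (low V)  = {td_low:.4f}")
    print(f"tan d (high V) = {td_high:.4f}")
    print(f"tip-up         = {td_high - td_low:.4f}")  # a large tip-up suggests PD inception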
As opposed to the above methods, all of which can be applied at modest voltages (even though use of voltages around the rated value may be desirable in some cases), the "hi-pot" (high potential) tests are endurance tests for the insulation rather than diagnostic methods. A DC or AC source drives the coils with a potential much higher than the rated one, often increased in steps. If a certain over-voltage level (2V + 1 kV, with V being the rated voltage in kV; e.g. about 28.6 kV for a 13.8 kV winding) is withstood, then the winding passes the test. If the voltage is not withstood, the winding may need to be repaired, even though it might have survived for a long time in normal operation.
Pulses (PD). Generator insulation is much more tolerant of PD than is the insulation of most other HV apparatus. Some PD activity is acceptable in long-term service, but in tests it may be useful to have some measure of the PD in order to distinguish more and less harmful or expected locations and concentrations of PD activity.
There are several ways of detecting PD, some of them very crude. Sometimes PD activity is perceptible by sight (the "blackout test", very useful for localisation) or by sound. Radio-frequency detectors may be used with a probe to locate regions of PD by the signal transmitted through the air. A machine that can only be accessed at its terminals is more suited to electrical methods, with the current pulses into the whole winding being measured.
On-line PD measurement has the advantage of measuring PD in the real operating conditions of voltage and temperature. If wider knowledge of the winding condition is desired than just whether PD is present at working conditions, then modern phase-resolved PD detection systems, possibly together with varied amplitude or frequency of the applied voltage, may be used.

On-line monitoring
On-line monitoring is clearly highly desirable if its results can be used to keep a machine running longer without shut-down while still detecting problems before they cause severe damage. Some widely used methods particularly relevant to stator insulation are described below.
Thermal monitoring of the stator windings is used for practically all machines of the sizes under discussion here. Temperature sensors are included between bars, or measure the temperature of a cooling fluid. A large machine may have many sensors, and its maintenance personnel may make more use of trends over short and long periods than in the case of a less important machine, where temperature sensing is mainly a short-term warning of severe malfunction.
Condition monitoring, or Generator Condition Monitoring (GCM), is the detection of chemical products of hot insulation material, i.e. a sophisticated smoke-detection system relying on the removal of ions from a chamber by their binding to smoke particles. In some machines a tracer paint may be used on critical areas, to release easily detectable chemicals at a well-defined temperature. Use of different paints in different parts of the machine allows still better location of a problem. An extreme GCM reading can be used as a warning to operations staff, but in most cases the GCM is of interest to maintenance staff; long-term analysis is useful to spot problems such as sporadic burning of insulation in small areas due to shorted laminations.
Ozone (O3) monitoring of enclosed (recirculated-cooling) machines may be achieved by a sensor inside the machine enclosure, for continuous monitoring. An open-ventilated machine may instead have ozone measurement by simple manual exposure of a test chemical that reacts with ozone. Ozone is produced by electrical discharges, so ozone detection methods are sensitive only to PDs that are not internal to the insulation: these may be slot discharges between the groundwall insulation and the stator core, due to deterioration of the semiconducting layer or to loose bars, or discharges in the end-windings due to wearing of the stress-grading semiconductive layer or to excessive proximity in the end region. Single PD sites are unlikely to produce measurable quantities of ozone, and general wear in poorly represented areas, e.g. the stress-grading of the end-winding rather than the semiconductive layer all along the slots, will show only weakly even if the localised activity is strong. Trends, even over a long time, are important, as with GCM, if the measurements are to be usefully interpreted for phenomena other than severe declines in condition.
Electrical monitoring of partial discharges on-line has the PD measurements made at realistic (actual!) operating conditions. PD is very sensitive to gas pressure, temperature and cavity size, so comparing PD measurements throughout a machine's working life requires matching these conditions quite closely if trending is to be used.
Although not a means of measuring the current state of the insulation, on-line voltage-surge monitoring may be useful in recording times when the external network has introduced a transient voltage at the machine terminals. Since the inductive coils present a high impedance to high-frequency signals, there is a large voltage drop over the first turns of the winding, which may even cause a breakdown of the turn insulation, weakening it.

Condition Assessment Practices


Some on-line monitoring is used on almost all machines under consideration here, and on all those over a few tens of MW; temperature monitoring is one example. On-line PD monitoring has been popular, with inductive, capacitive or RF pickups being used in various systems to obtain more or less precise location of PD.
It is credible that deregulation, with its concomitant high demands on reliability and optimisation of assets, has encouraged further use of condition monitoring, but it should also be noted that modern developments in data acquisition (sensitive automatic electrometers) and processing (modern computers and software) have much improved the extraction of meaningful reliability information from the machine.
The choice and application of formalised AM methods such as RCM have grown markedly among the power-industry owners of power-station generators since the 1990s. Although the assessment varies considerably depending on the analyst, generators have been singled out as significant sources of failures of the whole power-station system.
The Swedish generating utility Vattenfall has implemented RCM methods in its maintenance programme. Reference [3] gives a description of variations of RCM applied to generating plant. [4] describes briefly some RCM approaches and considers the RCM application by Vattenfall to generator systems.


References

[1] G. C. Stone, A. Boulter, I. Culbert, and H. Dhirani. Electrical Insulation for Rotating Machines. IEEE Press Series on Power Engineering. IEEE Press, 2004.

[2] H. J. van Breen, E. Gulski, and J. J. Smit. Several aspects of stator insulation condition based maintenance. In Conference Record of the 2004 IEEE International Symposium on Electrical Insulation, Indianapolis, IN, USA, pages 446-449.

[3] Fredrik Backlund. Managing the Introduction of Reliability-Centred Maintenance, RCM: RCM as a Method of Working within Hydropower Organisations. PhD thesis, Luleå University of Technology, Sweden, 2003.

[4] Otto Wilhelmsson. Evaluation of the introduction of RCM for hydro power generator at Vattenfall Vattenkraft. Master's thesis, Electrical Engineering, KTH, Stockholm, 2005.

[5] G. C. Stone. Advancements during the past quarter century in on-line monitoring of motor and generator winding insulation. IEEE Transactions on Dielectrics and Electrical Insulation, 9(5):746-751, October 2002.

[6] G. C. Stone. Recent important changes in IEEE motor and generator winding insulation diagnostic testing standards. IEEE Transactions on Industry Applications, 41(1):91-100, January-February 2005.

[7] T. Bertheau, M. Hoof, and T. Laird. Permanent on-line partial discharge monitoring as strategic concept to condition based diagnosis and maintenance. In Electrical Insulation Conference and Electrical Manufacturing and Coil Winding Conference, 1999. Proceedings, pages 201-203, October 1999.

[8] A. Kheirmand, M. Leijon, and S. M. Gubanski. Advances in online monitoring and localization of partial discharges in large rotating machines. IEEE Transactions on Energy Conversion, 19(1):53-59, March 2004.

[9] V. Warren. Using on-line insulation testing to implement generator predictive maintenance. IEEE Electrical Insulation Magazine, 13(2):17-21, March-April 1997.

[10] IEEE Std 1434, IEEE Trial-use guide to the measurement of partial discharges in rotating machinery, August 2000.

[11] G. C. Stone and J. Kapler. Condition-based maintenance for the electrical windings of large motors and generators. In Pulp and Paper Industry Technical Conference, 1997, pages 57-63, June 1997.

[12] D. G. Edwards. Planned maintenance of high voltage rotating machine insulation based upon information derived from on-line discharge measurements. In International Conference on Life Management of Power Plants, 1994, pages 12-14, Dec 1994.

[13] T. Laird. An economic strategy for turbine generator condition based maintenance. In IEEE International Symposium on Electrical Insulation, 2004, pages 440-445, 2004.

[14] J. L. Kohler, J. Sottile, and F. C. Trutt. Condition-based maintenance of electrical machines. In Industry Applications Conference, 1999, Thirty-Fourth IAS Annual Meeting, volume 1, pages 205-211, October 1999.

12

CONDITION MONITORING OF WOODEN POLES


Osmo Auvinen
VTT Processes
email: osmo.auvinen@vtt.fi

1. INTRODUCTION
Wood poles are among the most significant assets in electricity networks. The number of wood poles in Finland is about 8 million, including transmission, distribution and telecommunication network poles. A great share of these poles is approaching the technical end of its lifetime. This creates a challenge: because the number of poles to be replaced is so great, what is the optimal way to handle the problem? The first task in this optimisation is to survey the remaining lifetime and the present condition of the poles.
The age distribution of installed poles in a Finnish utility is shown in Figure 1.

Figure 1. Age distribution of installed poles in a Finnish utility (number of installed poles per construction year, 1950-2000)


Several methods have been developed for wood pole condition monitoring, ranging from very simple tools to high-tech ultrasonic detectors. The main problem in assessing the condition of wood poles is that every single pole decays at an individual rate, so to survey the whole population every pole has to be monitored, and this takes time and money.
Damage to wood poles is caused mainly by decay fungi, insects and woodpeckers. Structural damage decreases the residual strength of a wood pole quite quickly. In the Nordic countries the most serious problem in using wood poles is decay, while insects are usually not even considered a problem. In addition, some woodpecker damage may occur.

2. WOOD POLE PRESERVATIVES


In Finland, mainly two preservatives are used: creosote and CCA (chromated copper arsenate). Creosote is the oldest preservative in general use, and it is highly effective, providing a long service life. It has also been improved to satisfy the latest demands on effectiveness and bleeding. Thus, the quality of treated poles is currently quite even, and bleeding has been reduced without any loss of service life.
Characteristic of creosoted poles is that they decay internally. This leads to a tube structure in which the pole appears to be in shape but has lost most of its strength. There may still be enough strength left for climbing, but this is very hard to measure from outside. Usually this kind of pole breaks down with noise, which may warn a lineman in time to climb down before the final breakdown.
Most poles used in Finland are preserved with CCA (chromated copper arsenate) type C, which has considerably reduced leachability compared to type B. CCA is a highly effective wood preservative used to treat pine poles. Nowadays about 95% of the poles in use are CCA-treated.
External decay (in CCA poles) leads to a situation where the diameter of the pole is reduced. This phenomenon is in most cases clearly visible, and climbing can be forbidden if the diameter becomes too small. Machine tooling (such as drilling) of the poles should be done before impregnation whenever possible, so that the treatment material fills all the machined surfaces. Holes drilled afterwards become weak points in the protection of the pole and can accelerate its decay. A pole top cover should also always be used, because it provides extra protection for the topmost parts of the pole. Decay of the pole top can be dangerous if it causes the cross-arm to fall from the pole, with resulting damage and possibly hazardous situations.
Since preservation has been improved greatly since the 1950s, the expected average service life of a CCA type C treated pole is 35 to 50 years, and 40 to 55 years for creosoted poles.

3. DECAYING OF WOOD POLE


Fungi consist of long, microscopic, tube-like plant cells that feed off plant materials by external digestion (i.e. outside the fungal cells). The fungus penetrates and colonizes its food, digests and absorbs the soluble fragments, and metabolizes them internally as an energy source. Fungi reproduce by cell fragmentation or the formation of spores. When the favourable conditions listed below occur, fungi rapidly germinate and decay begins.
In order to develop, fungi have certain requirements:
- Oxygen
- Water (moisture content (MC) of wood exceeding 20%)
- Temperature (+5 to +30 °C)
- Food
Lack of air, over-dry wood (MC below 20%) and freezing temperatures only limit or stop fungal growth, which continues when conditions become suitable again. The most favourable conditions for fungi are a wood MC between 25 and 50% and a temperature between +20 and +30 °C, i.e. the ideal location for decay is 50 mm to 450 mm below the groundline. On the other hand, most fungi are killed at temperatures exceeding +65 °C /Ng/.
One of the most important procedures in the pre-treatment handling of poles is drying. The idea of drying is that fungi need moisture in the wood cells; if the moisture is below the fibre-saturation point, fungi cannot develop. When the moisture content increases greatly, the amount of oxygen becomes the limiting parameter: water holds only a few ppm of free oxygen, which is not enough for fungi. That is why wood material at the bottom of a lake, for example, can survive for centuries. The optimal moisture content for fungi has been found to be around 40-80%. Fungi demand food that fulfils three requirements: energy from carbon compounds, metabolic storage for development, and suitable vitamins, CO2 and nitrogen. The pH level should also be around 3-6 for active fungal development /Zab/.
A great number of parameters affect the development of fungi. The main parameters are the location of the pole (i.e. its immediate surroundings, such as forest or field), the type of soil (dry or wet), wood quality, the preservative and the preservation quality. The surroundings of a wood pole have a significant impact on the rate of decay: in forest, decay has been found to develop more slowly than in open field. The type of soil also affects how rapidly decay develops. The other parameters have been found to be very significant as well, but they are more or less individual to each pole, which makes exact forecasting of the remaining lifetime of a single pole impossible. /Gray/
There are three types of wood decay: white rot, brown rot and soft rot, and they affect the wood differently. White rot is caused by fungi that consume all of the wood cell walls, leaving behind residual wood with a white, punky character. Fungi causing brown rot attack only a portion of the wood cell walls, leaving soft, brown residual wood characterized by shrinkage cracks in a cubical pattern. Fungi causing soft rot attack the inner portion of the wood cell wall, forming diamond-shaped holes. Soft rot fungi require less oxygen than the other fungi types and therefore develop especially in wet conditions; by starting to extend from the shell of a pole, soft rot substantially reduces the pole's strength. Brown rot is the most rapid and damaging to wood, soft rot is almost as rapid as brown rot, and white rot is intermediate in its rate of decay /Ng/.
To summarize, the ideal conditions for fungi are a wood moisture content of 25-50% and a temperature of +20 to +30 °C; the ideal location for decay is 50-450 mm below the groundline, where the combination of moisture and oxygen is optimal.

4. WOOD POLE INSPECTION METHODS


Most of the inspection methods rely on the experience of the inspector, except for some modern technologies. The methods can be divided into two groups: methods for decay inspection and methods for strength assessment. The first group consists of at least the following techniques:
- visual inspection
- sounding / hammering
- drilling
- excavation
- sonic devices
- x-ray and NMR techniques
- decay detecting drill
- electronic resistance instruments
- Pilodyn wood tester
Visual inspection is done first on the above-ground zone of the pole to detect possible damage. If the damage is considerable and located in areas critical to the strength of the pole (e.g. at the root of the pole in straight-line poles and in the middle of the pole in A-type and stayed poles), the inspector should note the damage in the record so that proper maintenance action can be taken in time. Visual inspection is adequate if the pole has been in service less than 20 years; otherwise the next phases are carried out.

Figure 2. Visual inspection of the wood pole


Sounding (or hammering) detects in-pole damage, such as internal decay. The location of the decay can also be indicated quite precisely with this technique. Groundline zone inspection starts with removing the soil around the pole to a depth of 20 to 40 cm. The pole is then hammered to detect the type of decay. A soft surface is an indication of external decay. Correspondingly, internal decay can be noticed by listening to the sound of the hammering, i.e. a characteristic sound indicates internal decay. After detecting the type of defect and its location, the depth of the decayed wood is measured with a pick test in any case, and also with an increment borer if internal decay is detected.

Figure 3. Hammering test

Drilling gives an even more exact picture of possible internal damage to the pole. Drilling is done by boring radially into the pole to take a test sample. Increment cores are taken from the part regarded as most severely decayed. The thickness of the pole's sound shell is determined from the extracted core, or with a shell-thickness indicator. Proper actions are then defined based on the sound diameter and the rate of decay.

Figure 4. Increment borer and a sample


Excavation is one of the most utilised techniques. It is used to detect decay below the groundline. A test sample is taken from the pole by a pick test: the pick is pushed into the wood perpendicular to the pole, and a small sliver of wood is lifted. If the sliver splinters, the wood is sound, whereas a sliver that breaks abruptly indicates rot. The test is continued deeper until sound wood is indicated. The same test is also done on the other side of the pole or, if needed, around the pole. After removing the decayed wood, the diameter of the sound pole is measured, and proper actions are defined depending on the evaluated rate of decay and the sound diameter of the inspected pole.

Figure 5. Pick tests of the wood pole.
Left: An abruptly breaking sliver indicates decay.
Right: The sliver splinters when the wood is sound.

X-ray and NMR techniques are based on radiation: x-rays and nuclear magnetic resonance (NMR). These techniques give a 2D or 3D mapping of the pole and its strength. The disadvantage of these techniques is that they are suitable for laboratory tests only, not for field service. Gamma radiation has also been utilised in assessing deterioration (the density of the wood decreases when it decays).
The decay-detecting drill is based on the same idea as conventional drilling, but utilises special equipment for the purpose: the drill penetrates the wood and measures the drilling resistance, which depends on whether the wood is soft or hard.
Electronic resistance instruments measure the resistance over a test section of the pole. Decayed wood releases negative ions, which lowers the electrical resistance; a resistance less than 25% of that of sound wood indicates decay.
Sonic devices are based on the speed of sound in the material, utilising the stress wave velocity: in soft, decayed wood the stress wave propagates at a lower speed. The reliability of this technique for detecting the exact amount of decay is quite low. Several instruments for sonic analysis are available nowadays, and two of them are introduced briefly below.
De-K-Tector is a measurement device in which the pole is struck with a ball-headed hammer. A sensor on the opposite side of the pole registers the wave generated by the hammer blow. If the sensor registers more low-frequency (about 160-600 Hz) waves than high-frequency (1-5.2 kHz) waves, the device interprets this part of the pole as weakened. Vice versa, if there are more high-frequency than low-frequency waves, the pole is judged to be sound /Far/.
PURL (Portable Ultrasound Rot Locator) consists of an ultrasonic transmitter and a sensor. The equipment measures the sound level across the pole diameter, and this value is compared to a set value. If the measured level is higher than the set value the pole is sound; if it is lower, the pole is decayed. The PURL technique is quite laborious; only about 20 poles can be measured in a day /Far/.
With strength assessment methods there are great variations in the fibre stress value (a spread of about a factor of three between species and classes). The ANSI O5.1 standard defines the required pole strength, but only a minor percentage of poles actually meet this standard. There are different techniques for pole strength assessment (a first-order strength estimate is sketched after this list), such as:
- a sonic technique, where digital waveforms collected from the groundline of the pole are compared to a strength model
- a pendulum impactor, where a stress wave propagates along the pole
- the Polux method, where wood density and moisture are measured by driving two electrodes into the pole
- calculating the stiffness property (MOE) against breaking strength (MOR): the MOE-MOR relationship decreases as the pole becomes more decayed.
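As a first-order illustration of why the measured sound diameter matters so much: if the pole is assumed to fail in bending, its strength scales with the section modulus of a solid circular cross-section, so external decay reduces strength with the cube of the diameter ratio. This sketch is not one of the proprietary methods above, just the underlying geometry.

```python
import math


def residual_strength_ratio(d_sound_mm: float, d_original_mm: float) -> float:
    """First-order estimate: for a solid circular pole failing in
    bending, strength scales with the section modulus Z = pi*d^3/32,
    so the strength ratio is the cube of the diameter ratio."""
    z_sound = math.pi * d_sound_mm ** 3 / 32.0
    z_original = math.pi * d_original_mm ** 3 / 32.0
    return z_sound / z_original


# Example: 20 mm of external decay all around a 250 mm pole
print(round(residual_strength_ratio(210.0, 250.0), 2))  # ~0.59
```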
Another mechanical inspection device is the so-called Pilodyn wood tester, in which the device is loaded with a ramrod and then pressed firmly onto the test surface. The impact pin is shot into the wood by pressing the trigger cover, and the depth of penetration can be read immediately on a scale mounted on the tester.
The wood pole inspection process can also be presented as a flow chart. An example is shown in Figure 6:
1. Pole inspection starts with visual inspection.
2. Is the pole more than 20 years old? If NO: no further inspection.
3. If YES: dig the soil down to 20-40 cm and measure the groundline diameter.
4. Hammer the pole up to a height of 2 m.
5. Dull thud sound? If YES: take a core with the increment borer. If NO: locate the softest spot of the pole by hammering.
6. Pick test.
7. Estimate the rate of decay, fill in the inspection record and end the inspection.

Figure 6. Flow chart of wood pole inspection
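The same flow can be encoded directly. The following Python sketch is a hypothetical rendering of Figure 6 for illustration, not a field procedure; the function name and inputs are invented here.

```python
def inspect_pole(age_years: float, dull_thud: bool) -> list:
    """Walk through the inspection flow of Figure 6 and return the
    ordered list of steps that apply to this pole."""
    steps = ["visual inspection"]
    if age_years <= 20:
        steps.append("no further inspection")
        return steps
    steps.append("dig soil down to 20-40 cm, measure groundline diameter")
    steps.append("hammer the pole up to 2 m height")
    if dull_thud:
        steps.append("take a core with the increment borer")
    else:
        steps.append("locate the softest spot by hammering")
    steps.append("pick test")
    steps.append("estimate rate of decay, fill in the inspection record")
    return steps


print(inspect_pole(age_years=32, dull_thud=True))
```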

5. OVERHEAD LINE INSPECTION


Normally, when wood poles are inspected, the whole overhead power line is inspected as well. Wood poles are the main target, but cross-arms (bars), insulators and conductors are also inspected, in most cases visually, for signs of damage. Damaged insulators have been found to cause interruptions for customers and even pole-top fires in the UK. Techniques for locating damaged or failed insulators have been considered in the past, relying on parameters such as partial discharge, impedance, leakage current and voltage imbalance in addition to visual inspection. Conductors are generally inspected for signs of damaged stranding, evidence of clashing, bird-caging or fatigue during the inspection process /Hor/. Cross-arms also have to be examined for possible breakage or other damage, which can endanger the surroundings or cause an outage.

SOURCES
/Zab/ Zabel, Robert; Morrell, Jeffrey. Wood microbiology: Decay and its prevention. Academic Press Inc., 1992. ISBN 0-12-775210-2. 16 p.
/Hor/ Horsman, Steve. Condition and design assessment of wood pole overhead lines. Improved Reliability of Woodpole Overhead Lines (Ref. No. 2000/031), IEE Seminar, 8 March 2000. 8 p.
/Gray/ Gray, Scarlett. Effect of soil type and moisture content on soft rot testing. IRG/WP/2270. The International Research Group on Wood Preservation, 20 April 1986. 26 p.
/Ng/ Ng, Harry W. et al. Wood pole technology. Electric Power Research Institute, Electrical Systems Division, Publication BR-102525, Palo Alto, USA, 1993. 42 p.
/Far/ Farin, Juho et al. Puupylväiden kunnonarviointi ja arvioinnin luotettavuus (Condition assessment of wooden poles and the reliability of the assessment). Valtion teknillinen tutkimuskeskus (VTT). Raportti SH 7/93. 40 p.

MAINTENANCE AND CONDITION MONITORING ON LARGE POWER TRANSFORMERS

Pramod Bhusal (56980W)


Lighting Laboratory (HUT)
pramod.bhusal@hut.fi

CONTENTS
INTRODUCTION
TRANSFORMER CONDITION MONITORING ARCHITECTURE AND TECHNIQUES
MONITORING BY OIL ANALYSIS
  WATER CONTENT OF PAPER/OIL SYSTEM
  ROUTINE TEST OF OIL QUALITY
  DISSOLVED GAS ANALYSIS
PARTIAL DISCHARGE MONITORING
TEMPERATURE MONITORING
VIBRATIONAL TECHNIQUE FOR MONITORING
LOAD TAPCHANGER MONITORING
CONCLUSION
REFERENCES

INTRODUCTION
Large power transformers are among the most valuable and important assets in electrical power systems. These devices are very expensive, and diagnosis and monitoring systems are therefore valuable for preventing damage to them. An outage caused by such a transformer also affects the stability of the network, and the associated financial penalties for the power utility can be considerable. Ways therefore have to be found to avoid sudden breakdowns, minimize downtime, reduce maintenance costs and extend the lifetime of the transformer. Condition monitoring helps avoid these circumstances and can provide useful information for utilizing the transformer in an optimal fashion.
Condition monitoring can be defined as a technique or a process of monitoring the operating characteristics of a machine in such a way that changes and trends in the monitored characteristics can be used to predict the need for maintenance before serious deterioration or breakdown occurs, and/or to estimate the machine's health. It embraces the life mechanisms of individual parts or of the whole equipment, the application and development of special-purpose equipment, the means of acquiring the data, and the analysis of that data to predict trends. [1]
Before the wide use of condition monitoring, time-based maintenance was the main maintenance strategy for a long time. A time-based maintenance strategy involves examining and repairing the machine offline according to either a calendar schedule or running hours. This strategy may prevent many failures, but it can also involve many unnecessary shutdowns, while unexpected failures may still occur in the intervals. This wastes money and time, because maintenance is carried out blind, without much information about the condition of the machine. Condition monitoring, on the other hand, lets the operators know more about the state of the machine and indicates clearly when and what maintenance is needed, so it can reduce manpower consumption and greatly reduce the risk of unexpected stoppages. The benefits of condition monitoring can be summarized as:
- Reduced maintenance costs
- The results provide a quality control feature
- Limiting the probability of destructive failures, which leads to improvements in operator safety and quality of supply
- Limiting the severity of any damage incurred, eliminating consequential repair activities, and identifying the root causes of failures
- Information is provided on the transformer's operating life, enabling business decisions to be made either on plant refurbishment or on asset replacement
To be successful, condition monitoring must be self-sufficient and not require manual
intervention or detailed analysis. It must be capable of detecting gradual or sudden
deterioration and trends and have predictive capabilities to permit alarming in sufficient time
to allow appropriate action to be taken and avoid major failure. It must be reliable and not
reduce the integrity of the system, it must not require undue maintenance itself and must be a
cost effective solution.

TRANSFORMER CONDITION MONITORING ARCHITECTURE AND TECHNIQUES
The integrity of a power transformer depends upon the condition of its major components, and a weakness in any of them can ultimately lead to a major breakdown. The main components are the windings, insulation oil, core, bushings and on-load tap changers [2]. The operating temperature of the transformer has a major influence on the ageing of the insulation and the lifetime of the unit. Thermal impact not only leads to long-term oil/paper insulation degradation, it is also a limiting factor for transformer operation [3]. Knowledge of the temperature, especially the hot-spot temperature, is therefore of high interest. The degradation of the insulation system is accompanied by changes in physical parameters and in the behaviour of the insulation system. It is a complex physical process in which many parameters act at the same time, making interpretation extremely difficult. Monitoring and assessing these components is vital for achieving better system reliability.

Fig.1 Transformer condition-monitoring techniques [4]


Condition monitoring techniques can be off-line or on-line. Off-line techniques can only be carried out during outages, and some require complete isolation of the transformer. Frequency response analysis, power factor and capacitance testing, and measurement of winding and insulation resistance, magnetizing currents and turns ratios are applicable to large or strategically important transformers. Post-fault forensic tests also include paper analysis and metallurgical tests.
On-line techniques may be discrete tests or may be applied continuously, and they avoid the need for outages. On-line, computer-based, integrated, multi-sensor monitoring systems are now commercially available and in development. These on-line systems monitor important aspects of transformer performance, including partial discharge, water-in-oil and thermal performance.
Figure 1 shows the various techniques for transformer condition monitoring. The systems use a combination of on-line sensors, computer-generated analysis and predictive data to continuously determine the transformer's operating condition and to identify problems at an early stage [5].
MONITORING BY OIL ANALYSIS
Insulating oils suffer from deterioration, which can become fatal for transformers. Discharges in the oil can also cause serious damage to the other insulating materials, making the monitoring of power transformer insulation an important task. The traditional way to monitor the insulation condition of a transformer is by oil analysis, and this method is fully covered in international standards [5]. Chemical and physical analysis gives information on the serviceability of the oil as both an insulator and a coolant, and of the transformer with respect to its thermal and electrical properties. Decomposition products from the breakdown of the oil, paper or insulating boards, glues etc. are transported through the transformer by the coolant oil. Some are low-molecular-weight gases dissolved in the oil and can be identified by gas chromatography. Others, indicating solid degradation, include furans, cresols and phenols, and are detected by liquid chromatography.
Water content of paper/oil system
The electric breakdown strength of clean oil is little affected by water content until the oil is nearly saturated. Where contamination is present, the relative amounts of water and contaminant have a significant, detrimental effect. Few catastrophic failures from arcing in oil occur without free water being present in the oil. However, an increase in the water content of the cellulose (present as paper winding insulation and pressboard mechanical parts) not only increases the chance of a disastrous flashover, but also increases the rate of degradation of the cellulose, with reduction of its mechanical strength and potential failure of this weakest link in a transformer.
Routine test of oil quality
A minimum requirement for any size of oil-filled transformer, to provide a degree of confidence in its continued operation, is analysis of the water content together with the electrical breakdown strength and acidity. The electrical breakdown strength is measured with modern automated test cells: breakdown voltages are measured with a rising AC voltage under prescribed conditions, and the mean of six tests is calculated for a single oil sample. While the scatter between the tests can be high, the mean value is reasonably repeatable for duplicate samples.
To test fibre and particulate content, a simple count of visible fibres is made using a crossed-Polaroid viewing system. Large (>5 mm), medium (2-5 mm) and small (<2 mm) fibres are usually reported. The medium-size fibres are mostly cellulose fibres derived from the paper insulation, whereas the large fibres are usually contaminants introduced either during maintenance or during sampling. These fibres, in conjunction with the water content, can give an indication of the cause of poor electrical strength.
Other routine tests include acidity, resistivity, odour and colour tests. Measurement of the acidity can be done manually or automatically, and the level is detected either colorimetrically or potentiometrically. Water content, acidity and dissolved or suspended contaminants may all individually affect resistivity, so the resistivity test is useful on site as a general indicator of oil condition; appropriate further tests should be carried out to discover the cause of low values. Odour and colour tests can give an indication of thermal ageing of the oil but are only ancillary to quantitative measurement. After a suspected fault, however, assessing the smell of oil from different parts of a transformer can be a valuable first indicator of the source and type of fault.
Dissolved gas analysis
By analyzing an oil sample for dissolved gas content it is possible to assess the condition of the equipment and detect faults at an early stage. If a fault is indicated, the type of fault can be predicted using various analysis methods. Several dissolved gas analysis (DGA) tests should be taken over a period of time in order to determine the rate of increase of the fault gases, and therefore the rate of advancement of the fault. The gases involved are generally CO, CO2, H2, CH4, C2H4 and C2H6. Further analysis of the concentrations and ratios of the component gases can identify the reason for the gas formation and indicate the necessity for corrective action.
KEY GAS        CHARACTERISTIC FAULT
H2             Partial discharge
C2H6           Thermal fault < 300 °C
C2H4           Thermal fault 300 °C - 700 °C
C2H2, C2H4     Thermal fault > 700 °C
C2H2, H2       Discharge of energy

Table 1. Key Gas Interpretation Method [6]
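As an illustration of how Table 1 might be applied programmatically, the sketch below encodes the key-gas rows in Python. The ordering of the checks and the bare "gas present" thresholds are assumptions made here; real DGA interpretation (e.g. IEC 60599, IEEE C57.104) uses concentration limits, gas ratios and rates of change.

```python
def key_gas_diagnosis(ppm: dict) -> str:
    """Toy application of the key-gas rows of Table 1: the dominant
    fault-indicating gas points at a fault type."""
    h2 = ppm.get("H2", 0.0)
    c2h2 = ppm.get("C2H2", 0.0)
    c2h4 = ppm.get("C2H4", 0.0)
    c2h6 = ppm.get("C2H6", 0.0)
    if c2h2 > 0.0:
        # Acetylene present: high-energy phenomena (last two table rows).
        return "discharge of energy" if h2 >= c2h4 else "thermal fault > 700 C"
    names = {"H2": "partial discharge",
             "C2H4": "thermal fault 300-700 C",
             "C2H6": "thermal fault < 300 C"}
    gas, value = max(("H2", h2), ("C2H4", c2h4), ("C2H6", c2h6),
                     key=lambda g: g[1])
    return names[gas] if value > 0 else "no key gases detected"


print(key_gas_diagnosis({"H2": 120.0, "CH4": 30.0}))  # -> partial discharge
```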

On-line gas-in-oil monitors became available soon after the introduction of DGA technology. On-line gas analysis offers the potential for a much more revealing assessment of the dynamic conditions inside important transformers than is possible through laboratory DGA. An advantage of on-line monitors is the continuous measurement of one or more gases, so that any gassing trend, which is critical information for incipient fault screening, can be easily obtained. Originally only a hydrogen monitor was available, but instruments detecting several gases are now commercially available along with a total-oil monitor. The monitors use a combination of on-line sensors, computer-generated analysis and predictive data to determine the transformer's operating condition on a continuing basis and to identify problems at the incipient stage. A computer is used for data acquisition and to generate adaptive, model-based transformer performance monitoring information. A database is established on important elements of transformer performance, and long-term trend analysis is carried out to tune adaptive mathematical models to a particular unit.

PARTIAL DISCHARGE MONITORING


Dielectric breakdowns in transformers are most frequently preceded by partial discharges (PD). PD occurs within a transformer where the electric field exceeds the local dielectric strength of the insulation. Possible causes include insulation damage caused by overvoltages and lightning strikes, incipient weakness caused by manufacturing defects, or deterioration caused by natural ageing processes. Although PD may initially be quite small, it is by nature a damaging process, causing chemical decomposition and erosion of materials. Left unchecked, the damaged area can grow, eventually risking electrical breakdown.
Dissolved gas analysis (DGA) is routinely employed to detect internal electrical discharging
in power transformers. DGA can provide some information about the nature and severity of
the PD [7]. However, knowledge of the PD location (which cannot be obtained from DGA
results) would be a great help to the engineering specialist who must make decisions about
remedial action. Various techniques have been developed to address the problem of PD
monitoring in electrical plants.
Measuring PD electrically is a very difficult task because the PD signals are extremely small (in the microvolt range), so electrical interference can limit the sensitivity of the system. Recent years have seen the successful development and application of ultra-high-frequency (UHF) PD monitoring technology. UHF techniques can be applied not only to detect PD phenomena but also to locate the PD source. A typical UHF monitoring system is shown in Fig. 2. Signals from one or more sensors are filtered and amplified before they are detected and digitized. Analog-to-digital conversion is increasingly taking place at an earlier stage, as the bandwidth of affordable data acquisition hardware increases. The main reason for this trend is that adaptive digital signal processing can be used to condition the signals dynamically [8].

Fig. 2 Principles of a typical UHF PD monitoring system

A clock and a phase reference signal derived from the power-frequency waveform provide additional information that is logged with the digitized PD data. Each recorded PD pulse can then be associated with a particular time and point-on-wave. The amplitude of the displayed pulses is proportional to the energy of the UHF signal. Neural networks or intelligent software agents can be used to recognize patterns in this data and provide meaningful information concerning the nature and characteristics of the PD source.
The key issues in the UHF method are the sensor and its sensitivity. The sensor is usually a capacitive sensor or a UHF antenna. Because of the broad frequency content of the actual discharge, capacitive coupling in the UHF region has been shown to be effective under certain conditions [9]. Scottish Power and Strathclyde University have developed a diagnostic tool for transformers which uses UHF couplers operating in the 300-1500 MHz band [10]. The approach taken was to adapt technologies that were developed for continuous partial discharge monitoring in gas-insulated substations (GIS). Principles such as pattern recognition and time-of-flight measurement are well known in relation to GIS, but involve greater challenges when applied to transformers.
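To illustrate the time-of-flight principle in its simplest form, the sketch below estimates a PD position on the straight line between two UHF couplers from the difference in signal arrival times. The assumed propagation speed in oil and the one-dimensional geometry are simplifications; real tank geometry forces more elaborate multilateration.

```python
V_OIL = 2.0e8  # assumed UHF propagation speed in oil, m/s (illustrative)


def pd_position_1d(t1_ns: float, t2_ns: float, sensor_spacing_m: float) -> float:
    """Estimate PD position along the line between two couplers from
    the arrival-time difference. Returns distance from sensor 1 in m,
    assuming direct straight-line propagation to both sensors."""
    dt = (t1_ns - t2_ns) * 1e-9        # arrival-time difference, s
    path_diff = V_OIL * dt             # d1 - d2, m
    d1 = (sensor_spacing_m + path_diff) / 2.0
    return min(max(d1, 0.0), sensor_spacing_m)  # clamp to the baseline


# Example: the signal reaches sensor 2 five nanoseconds before sensor 1
print(round(pd_position_1d(5.0, 0.0, 4.0), 2))  # ~2.5 m from sensor 1
```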

Acoustic PD monitoring is another interesting technique, and it has attracted attention from both academia and industry for many years. Partial discharges occurring under oil produce a pressure wave that is transmitted throughout the transformer via the oil. A technique is available in which piezoelectric sensors are connected to the outside of the tank to measure the acoustic wave impinging on the tank, either directly or via waveguides. The advantages of the acoustic method are, firstly, that it offers the possibility of PD location, which is of considerable value for power equipment maintenance, and secondly, that it can recognize the PD acoustic signal regardless of the electromagnetic noise in the substation. Sometimes, however, it is difficult to discriminate the PD acoustic signals from interference caused by electrical or mechanical sounds in the substation; this is the main obstacle to wide application of the method.
A new fibre optic acoustic sensor for detecting discharges from within the transformer has been developed by Virginia Polytechnic Institute and State University. The basic principle of the sensor is illustrated in Figure 3. The system involves a sensor probe, optoelectronic signal processing and an optical fibre linking the sensor head and the signal-processing unit. The light from a laser diode is launched into a two-by-two fibre coupler and propagates along the optical fibre to the sensor head. As shown in the enlarged view of the sensor head, the lead-in fibre and a silica glass diaphragm are bonded to form a cylindrical sensor-housing element.

Fig. 3 Illustration of the principle of the fibre optic acoustic sensor

The incident light is first partially reflected (about 4%) at the end face of the lead-in fibre. The remainder of the light propagates across the air gap to the inner surface of the diaphragm. The inner surface of the diaphragm is coated with gold, which reflects essentially all of the incident light (96%), preventing any reflection from the outer surface; the fibre sensor is thus optically self-contained in any environment. This means that the optical signal is only a function of the length of the sealed cavity, and it is immune to contamination of the diaphragm's outer surface resulting from contact with transformer oil. As indicated in the enlarged view of the sensor head, the diaphragm is tilted at an angle with respect to the lead-in fibre end face so that the fibre captures only about 4% of the second reflection. The two reflections travel back along the same lead-in fibre, through the same fibre coupler, to the photo-detection end. The interference of these two reflections produces sinusoidal intensity variations, referred to as interference fringes, as the air gap changes. The development of the diaphragm pressure sensor concentrated on using an epoxy to bond the silica hollow-core tube to the ferrule and the hollow core to the silica diaphragm. Using an on-line monitoring process, the air gap between the fibre and the inner surface of the silica diaphragm was adjusted to give the highest interference fringe visibility.
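The fringe behaviour follows from standard two-beam interference; as a sketch not taken from the source, with reflected intensities $I_1$ and $I_2$ and laser wavelength $\lambda$, the detected intensity as a function of air-gap length $L$ is approximately

$$ I(L) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos\!\left(\frac{4\pi L}{\lambda} + \varphi_0\right) $$

where $\varphi_0$ is a fixed phase offset and the factor $4\pi/\lambda$ arises because the light crosses the gap twice. An acoustic pressure wave deflecting the diaphragm modulates $L$ and hence $I$, which is what the fringe-detection electronics register.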

TEMPERATURE MONITORING
Monitoring a transformer through temperature sensors is regarded as one of the simplest and most effective monitoring techniques. Abnormal temperature readings almost always indicate some type of failure in a transformer. For this reason it has become common practice to monitor the hot-spot, main tank and bottom tank temperatures on the shell of a transformer. As a transformer begins to heat up, the winding insulation begins to deteriorate and the dielectric properties of the mineral oil begin to degrade; the hotter the transformer runs, the faster the insulation deteriorates. Monitoring the temperature of the load tap changer (LTC) is critical in determining whether the LTC is about to fail. In addition to the LTC, abnormal temperatures in the bushings, pumps and fans can all be signs of impending failures.
Recently, thermography has been used more widely for detecting temperature abnormalities
in transformers. In this technique, an infrared gun is taken to the field and used to detect
temperature gradients on external surfaces of the transformer. Infrared guns make it easy to
detect whether a bushing or fan bank is overheating and needs to be replaced. The method is
also useful in determining whether a load tap changer (LTC) is operating properly.
Thermography is effective for checking many different transformers quickly to see if there is
any outstanding problem [12].
However, thermography is not conducive to on-line measurement and is therefore prone to miss failures that develop between the periods when the transformers are checked. To make on-line monitoring possible, thermocouples are placed externally on the transformer and provide real-time data on the temperature at various locations. In many applications, temperature sensors have been placed externally on transformers in order to estimate the internal state of the transformer. These temperature readings can be used to determine whether the transformer windings and oil are overheating or running at abnormally high temperatures. High main tank temperatures have been known to indicate oil deterioration, insulation degradation and water formation [12].
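A minimal sketch of how such external temperature readings could be screened automatically follows. The k-sigma rule and the noise floor are illustrative assumptions; practical monitors use load-corrected thermal models rather than a plain statistical threshold.

```python
import statistics


def temperature_alarm(history_c: list, latest_c: float, k: float = 3.0) -> bool:
    """Flag an abnormal reading from an externally mounted sensor:
    alarm when the latest value deviates from the recent baseline by
    more than k standard deviations (with a small noise floor so a
    perfectly flat history does not trip on measurement noise)."""
    mean = statistics.fmean(history_c)
    sd = statistics.pstdev(history_c)
    return abs(latest_c - mean) > k * max(sd, 0.5)


print(temperature_alarm([62.0, 63.0, 61.0, 62.0, 63.0], 71.0))  # -> True
```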
VIBRATIONAL TECHNIQUE FOR MONITORING
The diagnostic methods described so far have all dealt with detecting a failure in the electrical subsystem of the transformer, namely the electrical insulation around the coils. Research has also been carried out on mechanical malfunctions in a transformer. According to the study, the factors generating vibration in the transformer are of two types: core vibration and winding vibration [13]. Core vibration consists of excitation by magnetostriction and excitation generated at air gaps. Winding vibration is generated by the Lorentz force arising from the interaction of the leakage flux and the winding current. These vibrations from the winding and core penetrate the transformer oil, travel through it and reach the tank walls, exciting oscillations there. Based on this analysis, tank vibration signals have a strong relation to the condition of the transformer's core and windings and can provide useful diagnostic information.
The vibrations from the windings and core can be measured at the tank wall by piezoelectric accelerometers. The accelerometer is positioned at different locations on the tank, and measurements are taken in both no-load and loaded modes, which is necessary to separate the vibrations of the core and windings. In no-load mode the electrodynamic forces in the windings are practically absent, so the vibrations can be attributed to the magnetic core only. Measurements taken under loaded conditions include both core and coil vibrations. It is therefore possible to find the spectrum related to winding vibration by subtracting the no-load (core) results from the loaded (core and coil) results. This approach is justified because the magnetic flux in the core is almost independent of the load. The vibration spectra contain harmonics in addition to the fundamental frequency (twice the power frequency).
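The separation described above is essentially a spectral subtraction. A minimal NumPy sketch under the stated assumptions (identical sensor placement, load-independent core flux, and equal record lengths) could look like this:

```python
import numpy as np


def winding_spectrum(loaded: np.ndarray, no_load: np.ndarray, fs: float):
    """Estimate the winding-related vibration spectrum: the no-load
    amplitude spectrum (core only) is subtracted from the loaded
    spectrum (core + windings). Both records must be the same length
    and from the same sensor position."""
    freqs = np.fft.rfftfreq(len(loaded), d=1.0 / fs)
    s_loaded = np.abs(np.fft.rfft(loaded))
    s_core = np.abs(np.fft.rfft(no_load))
    # Clip negative residuals: amplitudes cannot be below zero.
    return freqs, np.clip(s_loaded - s_core, 0.0, None)
```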

Fig 4. Example of the measuring system


The core vibration worsens when the core-clamping force loosens. If the high-voltage or low-voltage winding suffers displacement, distortion or a lack of clamping force, the difference in height between the windings increases, which leads to ampere-turn imbalance and axial force deviation, resulting in intensified vibration.
Acceleration sensors for vibration measurement can be divided into piezoelectric, strain and servo types. The low-frequency response of a servo-type acceleration sensor is excellent, but its bandwidth is narrow (<500 Hz), making it unsuitable for tank vibration measurement. Compared with the strain type, the piezoelectric type has wider application: its installation resonance frequency is beyond 100 kHz, giving an ample bandwidth margin.
LOAD TAPCHANGER MONITORING
On-load tapchangers (OLTCs) are among the most problematic components of power transformers. The majority of transformer failures are directly or indirectly caused by tapchanger failures, because a tapchanger contains the only moving components associated with the transformer. The cost of a tapchanger is very low compared to that of the transformer, but the failure of the tapchanger can be responsible for the destruction of the complete unit.
In earlier days the only method used by tapchanger manufacturers was to fit a temperature probe to monitor the temperature of the diverter switch oil. This proved totally inadequate, because the probe took time to register any significant change in temperature. More recently, surge relays have been fitted to both the selector and diverter switch compartments, but they are still slow to respond, and the diverter switches have to be carefully set up to avoid spurious operation of the surge relays. On-line monitoring systems for on-load tapchangers are now available.
In one on-load tapchanger diverter switch protection scheme, a current transformer (CT) is mounted in the diverter switch compartment to monitor the current passing through the transition resistors of the tapchanger during operation. The CT is placed so that it is energized whenever current passes through either or both of the resistors during operation of the tapchanger. Its output energizes a current transducer, which in turn operates a time-delayed relay that trips if the duration of the current flow exceeds a preset limit.
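The timing logic of this scheme can be illustrated with a short sketch; the pickup level and permitted duration below are placeholders for illustration, not values from the source.

```python
def diverter_alarm(ct_current_a: list, dt_ms: float,
                   pickup_a: float = 1.0,
                   max_duration_ms: float = 60.0) -> bool:
    """Timed-relay logic of the protection scheme above: alarm if the
    current through the transition resistors (seen by the CT) persists
    longer than a preset limit, indicating a stuck diverter switch.
    ct_current_a is a list of samples spaced dt_ms apart."""
    duration = 0.0
    for i in ct_current_a:
        duration = duration + dt_ms if i > pickup_a else 0.0
        if duration > max_duration_ms:
            return True
    return False
```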
CONCLUSION
The justification for condition monitoring of power transformers is driven by the need of electrical utilities to reduce operating costs and enhance the availability and reliability of their equipment. Condition monitoring produces reliable information on plant condition, which allows maintenance resources to be optimized and assists with economically optimal replacement of the asset. Many monitoring techniques are available, and new techniques are constantly being developed. Research is concentrated on computer-based techniques for on-line monitoring of the transformer and its components.


REFERENCES
1. Y. Han and Y. H. Song, Condition monitoring techniques for electrical equipment: a literature survey. IEEE Transactions on Power Delivery, vol. 18, no. 1, January 2003.
2. Muhammad Arshad and Syed M. Islam, Power transformer condition monitoring and assessment for strategic benefits. Australian Universities Power Engineering Conference AUPEC2003, 28 September - 1 October 2003.
3. IEC, Loading guide for oil-immersed transformers. IEC Standard 60354, September 1991.
4. J. C. Steed, Condition monitoring applied to power transformers: an REC view. IEE Conference on The Reliability of Transmission and Distribution Equipment, Conference Publication No. 406, pp. 109-111, March 1995.
5. D. Harris and M. P. Saravolac, Condition monitoring in power transformers. IEE Colloquium on Condition Monitoring of Large Machines and Power Transformers, 1997, pp. 7/1-7/3.
6. B. Pahlavanpour and A. Wilson, Analysis of transformer oil for transformer condition monitoring. IEE Colloquium on An Engineering Review of Liquid Insulation, 7 January 1997, pp. 1/1-1/5.
7. M. Wang, A. J. Vandermaar, and K. D. Srivastava, Review of condition assessment of power transformers in service. IEEE Electrical Insulation Magazine, vol. 18, no. 6, pp. 12-25, Nov/Dec 2002.
8. M. D. Judd, L. Yang, and I. B. B. Hunter, Partial discharge monitoring for power transformers using UHF sensors. Part 1: Sensors and signal interpretation. IEEE Electrical Insulation Magazine, vol. 21, pp. 5-14, 2005.
9. M. D. Judd, B. M. Pryor, S. C. Kelly, and B. F. Hampton, Transformer monitoring using the UHF technique. Proc. 11th International Symposium on High Voltage Engineering (London), vol. 5, pp. 362-365, August 1999.
10. M. D. Judd, B. M. Pryor, O. Farish, J. S. Pearson, and T. Breakenridge, Power transformer monitoring using UHF sensors. IEEE International Symposium on Electrical Insulation, April 2000.
11. B. H. Ward and S. Lindgren, A survey of developments in insulation monitoring of power transformers. Conference Record of the 2000 IEEE International Symposium on Electrical Insulation, 2000, pp. 141-147.
12. J. L. Kirtley Jr., W. H. Hagman, B. C. Lesieutre, M. J. Boyd, E. P. Warren, H. P. Chou, and R. D. Tabors, Monitoring the health of transformers. IEEE Computer Applications in Power, 1996, pp. 18-23.
13. Ji Shengchang, Shan Ping, Li Yanming, Xu Dake, and Cao Junling, The vibration measuring system for monitoring core and winding condition of power transformer. Proceedings of the 2001 International Symposium on Electrical Insulating Materials, 19-22 November 2001, pp. 849-852.

AGEING PHENOMENA OF PAPER-OIL INSULATION IN POWER TRANSFORMERS

Henry Lågland
University of Vaasa
Henry.lagland@uwasa.fi

Contents
1. INTRODUCTION
2. MECHANISM OF DEGRADATION OF THE PAPER-OIL INSULATION IN POWER TRANSFORMERS
3. THERMAL AGEING PROCESSES
   3.1 Thermal ageing of paper-oil insulation
       3.1.1 The ageing phenomena
       3.1.2 Degree of polymerisation (DP)
       3.1.3 Mathematical models for thermal ageing
       3.1.4 Thermal ageing of oil/transformerboard insulation systems
       3.1.5 Thermal ageing of solid and liquid insulating materials under the influence of water and oxygen
       3.1.6 Comparison between open and closed expansion systems
       3.1.7 Comparison between constant ageing temperature and cyclic temperature changes
4. ELECTRICAL AND COMBINED ELECTRICAL AND THERMAL AGEING
   4.1 Electrical ageing
   4.2 Combined electrical and thermal ageing
5. CONCLUSION
Literature

1. INTRODUCTION

Power transformers are used in power generation units and in transmission and distribution networks to step the voltage of the power system up or down (figure 1). The capacity is usually between a few MVA and about 100 MVA.
To be able to use the real capacity of power transformers, it is important to know the duration and level of thermal stress that they can withstand. Increasing demands are thus being imposed on the liquid and solid insulating materials with regard to operating reliability and overloading capability.
This paper describes the mechanism of degradation of the paper-oil insulation in power transformers. Mathematical models for thermal ageing are briefly presented, as well as findings on thermal and electrical ageing phenomena for liquid-cooled transformer insulation systems.

Figure 1. Power transformer (ABB).


The paper is mainly based on material chosen by Dr. Hasse Nordman, who is chairman of the working group for the loading guide for oil-immersed power transformers [3]. Most of the material in this paper is based on the chapter Transformerboard II, Properties and application of transformerboard of different fibres by H. P. Moser and V. Dahinden [1].

2.

MECHANISM OF DEGRADATION OF THE PAPER-OIL INSULATION IN


POWER TRANSFORMERS

Paper is a sheet of material made from vegetable cellulose fibres dispersed in water. The fibres are drained to form a mat. Cellulose is a linear polysaccharide consisting of anhydro-D-glucopyranose units held together by β-linkages. A single cellulose fibre consists of many of these long chains [2].

Figure 2. Cellulose structure [2]


The condition and strength of the fibres themselves and the physiochemical bonding, known as
hydrogen bonding, between the cellulose molecules are the most significant factors that
influence the strength of a dried sheet of paper (figure 2). The ageing performance of a sheet of
paper is influenced by the degradation of the cellulose. The mechanism of degradation is rather
complicated and depends on the environmental conditions. According to [2] there are three types of degradation:

1. Hydrolytic degradation, which refers to cleavage at the glycosidic linkage, giving the sugar glucose.
2. Oxidative degradation: because cellulose is highly susceptible to oxidation, the hydroxyl groups are the weak areas where carbonyl and carboxyl groups are formed, eventually causing secondary reactions that give chain scission.
3. Thermal degradation below 200 °C is similar to, but faster than, the normal ageing of cellulose. Oxidative and hydrolytic degradation occur, giving:
- severance of the chain, reducing the degree of polymerization and the strength
- opening of the glucose rings
The decomposition products are mostly water and carbon oxides. Processes have been developed to improve the resistance of paper to degradation, i.e. to upgrade it. This is done either chemically, by converting some of the OH radicals to more stable groups, or by adding stabilizers, such as nitrogen-containing chemicals like dicyandiamide.
The ageing of oil/solid insulation systems can be influenced by the treatment of the components
of the insulation system, addition of inhibitors and sealing of the insulation system.

3.

THERMAL AGEING PROCESSES

The ageing behaviour of oil/solid insulation systems depends on the thermal, mechanical,
electrical and combined electro-thermal stresses of the power transformers.

3.1

Thermal ageing of paper-oil insulation

3.1.1

The ageing phenomena [1]

The ageing processes in the oil/cellulose insulation system under thermal stress, and their measurable effects, are due to chemical reactions in the dielectric. Cellulose is a linear macromolecule which in the unaged state consists of 1000-3000 glucose rings (figure 3).

Figure 3. Basic structure of the cellulose macromolecule [1].

The periodically repeating structural units of the macromolecule, the β-glucose rings, are bonded to one another via oxygen bridges between the first and fourth carbon atoms. Via the hydroxyl groups, cross-links are formed to crystalline regions, the micelles. Between the micelles, individual cellulose molecules accumulate, forming a cavity system with a capillary diameter of 10 nm to a few µm within the fibre. The fibre length of pine pulp varies between 0.25 mm and 4 mm.
The insulating oil consists principally of paraffins, naphthenes and a small portion of aromatics.
The chemical composition of the mineral oil can only be stated approximately since the oil
consists of a mixture of hydrocarbon compounds with different molecular structures.
The most important parameters affecting the ageing of the solid and liquid insulation are:
1. The temperature of the oil/cellulose dielectric
2. The presence of water
3. The presence of oxygen
The temperature of the oil/cellulose dielectric is the critical ageing parameter for the change in
the mechanical and electric properties of the material. Thermally supplied vibration energy of
many atoms and groups of atoms is temporarily concentrated on individual C-H, C-O and C-C
bonds and cleaves these bonds. This results in cleavage products such as carbon dioxide, carbon
monoxide, water, hydrogen and scarcely measurable amounts of methane. By interacting with the
oil components the entire final molecule can be separated off at the end of a chain and converted
to other substances (e.g. sludge, acid). Also the ageing of oil under high thermal stress is
characterized by chemical reactions. For the stability of the oil molecules the bond energy of the
C-H and C-C single and double bonds is critical.
Water forms as a reaction product, both during the thermo kinetic degradation of the cellulose
and during the ageing of the oil. Apart from the thermo-kinetic degradation, the moisture present
at the beginning of the ageing process, as well as the water formed by the reactions of the
cellulose and of the oil, causes additional decomposition of the chain molecules. Because of the
hygroscopic nature of the cellulose and its fibre structure (capillaries), the water molecules accumulate between the cellulose chains and thus promote their thermo-hydrolytic degradation.
The water continuously causes fresh molecular cleavage thus having the negative property of
constantly and retroactively accelerating the ageing process of the cellulose. In contrast to ageing
of the cellulose, ageing of the oil is scarcely affected by water.
Oxygen is predominantly present in the oil and thus noticeably accelerates the ageing of the oil
while the effect of the oxygen on the ageing of the cellulose tends to be more moderate. On the
other hand, in thoroughly dried insulation systems and in particular in the presence of fairly large
amounts of oxygen, oxidation is the dominant process in the ageing of oil, at least in the initial
stage. In the presence of reactive substances, in particular oxygen, the cleavage of oil molecule
bonds is followed by a reaction sequence, the principal oxidation products of which are acids,
solid constituents (sludge), water, carbon dioxide and carbon monoxide. Dissolved metals, such
as iron and copper have a catalytic effect on the degradation process of the oil molecules. Since
no free oxygen is formed during the ageing process of either the oil or the cellulose, oxygen
enters the system only from the outside. Since the degradation of oil molecules produces other
reactive substances, the decomposition of the oil molecules continues even where there is a
deficiency of oxygen once the decomposition mechanism has been initiated.

3.1.2

Degree of polymerisation (DP) [1]

The progress of the ageing of the oil/cellulose dielectric, the presence of water and the oxidation
by oxygen, can be determined from the change in the material properties and from the formation
and precipitation of reaction products. The degree of polymerization is the connection between
the deterioration in the material properties and the formation of ageing products. It is a direct
indication of the decomposition of the cellulose macromolecule and proves to be the most
informative parameter for assessing the ageing or the progress of ageing of the cellulose (figure
4).

Figure 4. Degree of polymerisation (DP) [1].


3.1.3 Mathematical models for thermal ageing [1]
In 1930 Montsinger stated a law describing the interrelation between the life expectancy and the operating temperature of the transformer, based on measurements on transformer insulating materials. Montsinger's law states that an increase or decrease of the operating temperature by 6-10 K, depending on the insulation raw material, results in a doubling or halving of the ageing rate. Using the Arrhenius relations, Bässig and Dakin formulated a life law derived from the reaction equations of chemical kinetics. By inserting the material characteristics in place of the molecule concentrations in the chemical reaction equations, the thermo-kinetic deterioration of the characteristics is described.

The differential equation for first-order reactions is:

$$ \frac{d}{dt}\,(x_0 - x) = C(T)\,x \tag{1} $$

where
x_0 = material characteristic at time t = 0 (initial value)
t = time in days
x = material characteristic after ageing to time t
C = ageing rate constant [1/days]
T = absolute temperature in K

As x_0 is a constant, the differential equation (1) can be reduced to:

$$ \frac{dx}{dt} = -C(T)\,x \tag{2} $$

Solving differential equation (2) with the boundary condition x(t=0) = x_0 gives the following reduction of the physical characteristic x subjected to thermal load T:

$$ x(T) = x_0 \exp\!\left[-C(T)\,t\right] \tag{3} $$

The differential equation (2) was modified by Dakin by introducing a general order of reaction α. The differential equation (4) is normalized with the initial value x(t=0) = x_0:

$$ X(t) = \frac{x(t)}{x_0}, \qquad \frac{dX}{dt} = -C(T)\,X^{\alpha} \tag{4} $$

where α = order of reaction (α > 0).

Taking the order of reaction into account, with the boundary condition X(t=0) = 1, the solution functions of differential equation (4) are:

$$ X(t) = 1 - C(T)\,t \quad \text{for } \alpha = 0 $$
$$ X(t) = \exp\!\left(-C(T)\,t\right) \quad \text{for } \alpha = 1 \; \text{[see (2)]} \tag{5} $$
$$ \left(\frac{1}{X(t)}\right)^{\alpha-1} - 1 = (\alpha - 1)\,C(T)\,t \quad \text{for } \alpha > 1 $$

The temperature dependence of the ageing rate constant C(T) can be expressed with the Arrhenius equation:

$$ C(T) = C_A \exp\!\left(-\frac{E}{R\,T}\right) \tag{6} $$

where
C_A = constant [1/time]
E = activation energy [J/mol]
R = gas constant [J/(K mol)]

or with Montsinger's empirical formula:

$$ C(\vartheta) = C_M \exp(m\,\vartheta) \tag{7} $$

where
C_M = constant [1/time]
m = constant [1/°C]
ϑ = temperature in °C

The so-called Montsinger step is:

$$ \Delta\vartheta_M = \frac{\ln 2}{m} \tag{8} $$

The deterioration of the physical characteristics of thermally stressed oil/solid insulation systems as a function of time can be described by means of relations (3) and (5). The temperature influence (the ageing parameter) is expressed mathematically by equations (6) and (7).
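Equations (6)-(8) translate directly into a few lines of Python. The sketch below is illustrative: the 6 K doubling step is at the lower end of the 6-10 K range quoted above, and the 98 °C reference hot-spot temperature is an assumption borrowed from common loading-guide practice, not taken from this text.

```python
import math


def ageing_rate_montsinger(theta_c: float, theta_ref_c: float = 98.0,
                           step_k: float = 6.0) -> float:
    """Relative ageing rate per Montsinger's law, eqs. (7)/(8): the
    rate doubles for every `step_k` kelvin above the reference
    temperature (both defaults are assumptions, see lead-in)."""
    return 2.0 ** ((theta_c - theta_ref_c) / step_k)


def ageing_rate_arrhenius(t_k: float, e_j_mol: float,
                          c_a: float = 1.0) -> float:
    """Ageing rate constant C(T) per eq. (6)."""
    R = 8.314  # gas constant, J/(K mol)
    return c_a * math.exp(-e_j_mol / (R * t_k))


print(ageing_rate_montsinger(110.0))  # ~4x the rate at the 98 C reference
```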
3.1.4 Thermal ageing of oil/transformerboard insulation systems [1]

In the following, the ageing behaviour of TRANSFORMERBOARD T III (density 0.84 g/cm3) and T IV (density 1.18 g/cm3) is described, taking into account interactions with mineral oil as the liquid medium. The results are from measurements on test rigs fitted with an open expansion system, which were operated at temperatures of 90 °C, 105 °C, 120 °C and 135 °C with temperature cycles. T III and T IV are made from the same raw material but produced by different processes: T IV is very highly compressed by hot pressing, T III rather less so by calendering.

Figure 5. Degree of polymerization and normalized tensile strength (σ/σ0) as a function of time [1].
The almost identical reduction in degree of polymerization for the two materials indicates that the production process has little influence on the ageing, i.e. on the breakdown of the cellulose macromolecules (Fig. S.12).
The superior ageing behaviour of T IV compared with T III is revealed when examining the mechanical characteristics. The higher density resulting from hot pressing and the greater degree of cross linking, with its associated reduction of free fibre surface, provide T IV with improved thermal stability with respect to the mechanical characteristics (Fig. S.14).
No deterioration of the dielectric strength of the solid samples occurred under thermal stressing
(Fig. S.18). The conductivity or specific resistance of the oil/solid dielectric is not a characteristic
of the molecular structure of either cellulose or oil, but is due to ionic by-products. Ions are
present even in unaged, freshly prepared samples, both in the oil and the solid insulation, in the
form of residues from the production process. Additional ionic decomposition products are
produced during oil and cellulose ageing by the high thermal stressing of the insulation system.

Figure 6. Impulse withstand field strength as a function of ageing time [1].


Fig. S.20 shows that the ageing temperature plays an important part in lowering the specific resistance. The alternating voltage losses are also principally due to the ion condition. Hence the increase in the loss factor tan δ of the solid samples over time, with temperature as a parameter, is caused by an increase in the ion concentration (Fig. S.19). This is caused on the one hand by dissociated ageing products from the chemical reactions of the oil and cellulose and their interaction, and on the other hand by the impurities and decomposition products absorbed from the fluid medium, which increase tan δ.

Figure 7. Electric characteristics (impulse withstand field strength, loss factor and specific
resistivity) of oil impregnated solid samples as a function of ageing time [1].
The gases produced in the system by the thermo-kinetic breakdown of the cellulose macromolecules, by the decomposition of the oil molecules due to ageing, and also by the interaction between the reaction products of the solid and liquid insulation, are carbon dioxide (CO2), carbon monoxide (CO), hydrogen (H2) and scarcely measurable amounts of aliphatic hydrocarbon gases. Water (H2O) is also produced, and entire sections can be split off the end of the cellulose macromolecules. These separated molecules convert to other substances, which are observable in the oil in the form of acids or low molecular sludge. Fig. S.21 shows the increase in water content in the solid samples as a function of time at different temperatures. Fig. S.22 shows the increase in water content in the oil which forms the insulation system together with T IV or T III. In the oil/solid insulation system the law applies that the relative moisture content in the oil must be identical to the relative moisture content in the board if the diffusion processes are complete.
Since the absolute water content in the sheet material is far greater than in the oil, the relative moisture content of the system is mainly determined by the solid sample. The saturation moisture of the oil and that of the board are both highly temperature dependent, and moreover in opposite directions. Cyclic operating temperatures create a non-equilibrium of the moisture content and promote equalizing processes between the solid samples, the oil and the air. With a reduction of the operating temperature the solid sample absorbs water, and releases it again during heating up.
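A minimal sketch of this equilibrium rule: once the diffusion processes are complete, the relative water saturation of the oil equals the relative moisture of the board, so the board state can be estimated from an oil sample and the oil temperature. The exponential saturation law and its constants A and B below are illustrative assumptions, not values from [1].

    import math

    def oil_saturation_ppm(T_kelvin, A=1.0e7, B=3850.0):
        # Water saturation limit of mineral oil [ppm]; A and B are assumed fit constants
        return A * math.exp(-B / T_kelvin)

    def relative_saturation(w_oil_ppm, T_kelvin):
        # Relative water saturation of the oil (0..1); equals the board's relative moisture at equilibrium
        return w_oil_ppm / oil_saturation_ppm(T_kelvin)

    # Example: 20 ppm of water measured in the oil at 60 degC
    print(f"relative saturation: {relative_saturation(20.0, 273.15 + 60.0):.1%}")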

Figure 8. Water content of the solid sample and oil as a function of time [1].
The most important electrical characteristics of the liquid insulation which in the transformer
forms an inseparable insulation system with the solid samples are the breakdown voltage, loss
factor, specific resistance and dielectric constant of the oil. The breakdown voltage of the oil,
which has been aged together with T IV or T III, is principally dependent on the water content of
the liquid medium (Fig. S.9). If the water is dissolved in the oil, i.e. the solubility limit of water
in oil at the relevant temperature is not yet reached, then the breakdown voltage is approximately
65 kV, regardless of the ageing temperature, the ageing time and the composition of the
insulation system. If the water is emulsified in the oil, then the breakdown voltage falls to 25 kV,
the undissolved quantity of water having no further influence. Both the increase in loss factor (Fig. S.25) and the reduction in specific resistance (Fig. S.26) are relatively small in comparison with the loss factor (Fig. S.6) and specific resistance changes (Fig. S.7) of oil aged under the same conditions without the addition of the solid samples (pure oil ageing). This is because T IV and T III absorb dissociated particles from the oil together with ionic decomposition products from the ageing process, thus exerting a cleansing effect. The TRANSFORMERBOARD reduces the direct and alternating current losses of the oil by approximately a factor of 10.

Figure 9. The loss factor and the specific resistivity of the oil as a function of ageing time and
breakdown voltage of the oil as a function of the water content [1].
3.1.5 Thermal ageing of solid and liquid insulating materials under the influence of water and oxygen [1]

Fig. S.29 and Fig. S.30 show the effects of intentionally added water, and the influence of continuous extraction of water from the insulation system, on the ageing process of T IV and the oil. Fig. S.29 plots the water and carbon dioxide production as a function of time; both are relevant values for assessing the ageing process. Fig. S.30 shows the degree of polymerization, representative of the cellulose ageing behaviour, and the tensile strength σz, which indicates the deterioration of the mechanical characteristics. Both the added water and the water formed by the ageing reactions of the oil and cellulose in turn accelerate the ageing of T IV. This means that the life expectancy of poorly dried transformers can fall sharply compared to correctly prepared transformers. This is also confirmed by the results of experiments in which the oil was constantly dried throughout the ageing period by means of molecular sieves. A life expectancy of double or even more is possible if the oil of a normally operated transformer is permanently dried and degassed during its total operating time.

Figure 10. Production of water and carbon dioxide (Fig. S.29) and DP and tensile strength (Fig.
S.30) as a function of time [1].

As can be seen in Fig. S.33a, the tensile strength, elongation and DP are only slightly influenced during the experiments with a molecular sieve as acceptor in the system, which constantly degasses and dries the oil and thus, by virtue of the equilibrium, also dehydrates the solid insulation. Adding oxygen to the oil/cellulose insulation system significantly increases the water content in both the solid sample (Fig. S.33c) and the liquid medium (Fig. S.35c). The large rise in water content of the system is caused by the reactions of the oil molecules with oxygen. The loss factor reduction in the molecular sieve experiments shows the cleansing effect of the zeolites on both the solid samples (Fig. S.33a) and the oil (Fig. S.35a). An extremely large increase in loss factor, measured at 90 °C, occurs when a particular board water content is exceeded (Fig. S.33d).
The mechanical characteristics of NBC remain largely unaffected by a thermal stress at 135 °C over an ageing period of 100 days (Fig. S.34). The increase in loss factor in the experiments with oxygen is due to the filtering effect of the NBC on the oil (Fig. S.34c). NBC proves to be an extremely ageing resistant material.

a) Addition of molecular sieve 5 [+MS 5]
b) Without molecular sieve [-]
c) Oxygen addition [+O2]
d) Water addition (50 ml) [+H2O]
e) Water (50 ml) and oxygen addition [+H2O+O2]
Figure 11. Change in solid properties of Transformerboard T IV/oil insulation (Fig. S.33),
Nomex Board NBC/oil insulation (Fig. S.34) and change in oil properties (Fig. S.35) as a
function of time [1].

3.1.6 Comparison between open and closed expansion system [1]

Fig. S.38 shows the effects of open and closed expansion vessels on the change in solid and oil
characteristics. The test chambers were operated at either a cyclically changing or a constant
temperature.
The tensile strength of T IV, and the degree of polymerization (not plotted, but running qualitatively parallel), which characterize the ageing state of the cellulose, fall less sharply in the closed than in the open system. On the other hand, the water content of the board increases slightly more in the closed system than in the open one. In the open vessels, with advanced ageing, the extraction of water from the system can be achieved via the air cushion between the oil in the expansion vessel and the silica gel drier, supported by the temperature cycles, which generate a regular air exchange. In long term tests at constant temperature, the tensile strength of the T IV decreases more sharply than in the experiments with cyclic temperature changes. The greater loss is mainly caused by the higher water concentration in the board, which accelerates the thermo-hydrolytic decomposition of the cellulose. In the open system the oxygen produces accelerated oil ageing, in which, together with many other reaction products, ions occur, increasing the loss factor.

Figure 12. Change in Transformerboard T IV properties as a function of time [1].

3.1.7 Comparison between constant ageing temperature and cyclic temperature changes [1]

The tests also examined the influence of temperature cycles on the ageing behaviour of the mixed dielectric. The load cycles of a realistically loaded transformer were simulated by the daily recurring two hour cooling and subsequent three hour heating periods of the test vessel to the nominal temperature. In the long term tests at constant temperature, the tensile strength of the T IV decreases more sharply than in the experiments with cyclic temperature changes (Fig. S.38). The greater loss is mainly caused by the higher water concentration in the board, which accelerates the thermo-hydrolytic decomposition of the cellulose. The temperature cycles cause a regular exchange between the solid samples and the liquid insulation, and also between the oil in the expansion vessel and the air cushion dehumidified by the silica gel drier. Hence the temperature cycles facilitate the extraction of water from the system, if only in small quantities.

4. ELECTRICAL AND COMBINED ELECTRICAL AND THERMAL AGEING

4.1 Electrical ageing [1]

The aim of these tests was to determine the effect of AC electric fields (50 Hz) with strengths of 3.33 kV/mm, 6.66 kV/mm and 10 kV/mm on the ageing characteristics of the oil/solid dielectric. Transformerboard T IV together with inhibited oil was tested in the test vessels at room temperature, corresponding to purely electrical ageing.
During the tests no changes were observed which would indicate ageing of the samples. The constancy of the loss factor indicates that, during the 6000 hour test time, no ionic ageing products which increase tan δ were formed. Consequently, continuous electric field strengths of 10 kV/mm are incapable of cleaving either oil or cellulose molecules. In the transformer, where the insulation is exposed to continuous electric stresses with a maximum value of 3-4 kV/mm, the electric field has little direct effect on ageing.

4.2 Combined electrical and thermal ageing [1]

Transformerboard T IV and Nomex Board NBC were investigated at constant operating temperatures of 135 °C (120 °C) and a continuous electric stress with a field strength of 5 kV/mm. The two solid materials were tested with mineral oil, with or without a molecular sieve (Fig. S.52). NBC is a highly compacted pressboard consisting of 100 % aramid synthetic fibre.
The loss factor of the oil/NBC insulation system remains unaffected during the entire test period under a stress of 5 kV/mm and 135 °C. In contrast to the tests with NBC, the oil/T IV insulation shows a marked increase in the loss factor. During the experiment on the oil/T IV insulation without a molecular sieve, partial discharges were observed after 2500 hours of operation and the test was discontinued. The increases found during ageing are a direct consequence of the reaction products which are formed by thermo-kinetic cleavage (135 °C) of the cellulose and oil molecular chains and their interactions. After the oil had been cleaned and degassed, the experiment was continued with the same samples at a temperature of 120 °C. The abrupt decrease in the loss factor of the oil/T IV dielectric from 130 to 40 can be explained mainly by the experimental arrangements and the measurement technique. The loss factor measured at 120 °C after this operation decreased from 40 to 30 during the remaining 3500 hours. The reason for this decrease in the loss factor is the reduced rate of formation of ionic and gaseous reaction products. In the investigations it was found that there is a change in the reaction mechanism in the ageing process of the T IV in the temperature interval between 120 °C and 135 °C.
The molecular sieve degasses the oil permanently and thus promotes the diffusion of the ageing products out of the solid sample. A precondition for an increase in the loss factor is a high temperature. This is also confirmed by the electrical tests at room temperature, where no increase in loss factor was observed even at field strengths of 10 kV/mm over a test period of 6000 hours. The electric field strength therefore has only an indirect effect on the ageing of the cellulose/oil insulation system, by separating the ions formed by the supply of thermal energy and thus preventing them from recombining.

Figure 13. Variation of the loss factor of oil/T IV and oil/NBC dielectric as a function of time [1].

5. CONCLUSION

For the hot pressed Transformerboard T IV and the calendered pressboard T III, both solid insulations with Kraft cellulose as the basic material, a similar decomposition process of the chain molecules was observed during thermal stressing, in spite of their different production methods. The higher ageing resistance of T IV compared to T III became apparent only on comparison of the relative reduction in mechanical strength. At temperatures over 120 °C, the ageing rate of T IV is almost twice as high as that below 120 °C. The ageing rate of T IV is highly susceptible to the presence of water in the oil/solid insulation system at high thermal stresses, whilst the oil ageing is hardly changed by moisture. The presence of oxygen in the oil/cellulose insulation system produces severe ageing of the oil, but only slight ageing of the T IV. In the open expansion system, the ageing of the oil, and of the cellulose solid insulation in interaction with the oil, is accelerated by the admission of oxygen, supported by the cyclic temperature changes. An excellent age stabilizing effect on Transformerboard T IV and the oil was achieved by the use of a molecular sieve in the oil/cellulose insulation system, principally by the adsorption of water and gas in connection with a hermetically sealed system. Nomex Board is very resistant to high thermal stresses, even under the additional influence of water and oxygen, which had a significant effect on the ageing of the oil/cellulose insulation system [1].
The ageing is influenced not only by the temperature: the humidity, acid and oxygen content also have a dramatic impact on the ageing. To determine the influence of these parameters on the loading capability of power transformers, Dr. Hasse Nordman has initiated a Cigre working group with a task list according to [8]. Another task for the working group is to define the content of humidity, acid and oxygen near the hot spot of the windings of the power transformer. If the working group establishes these base values, the article by Lundgaard, Hansen, Linhjell and Painter [7] can give the relevant factors for the ageing rate.

Literature

[1] H.P. Moser, V. Dahinden. Transformerboard II: Properties and application of transformerboard of different fibres. Weidmann, 1987.
[2] D.H. Shroff, A.W. Stannett. A review of paper ageing in power transformers. 1985.
[3] IEC 60076-7. Loading guide for oil-immersed power transformers. Proposal, 2005.
[4] IEEE C57.91-1995, Annex D. Philosophy of guide applicable to transformers with 55 °C average winding rise (65 °C hottest-spot rise) insulation systems.
[5] IEEE C57.91-1995, Annex I. Transformer insulation life.
[6] Cigre WG. Relative ageing rate and life of transformer insulation.
[7] L.E. Lundgaard, W. Hansen, D. Linhjell, T.J. Painter. Ageing of oil-impregnated paper in power transformers. IEEE Transactions on Power Delivery, vol. 19, no. 1, January 2004.
[8] Cigre WG. Relative ageing rate and life of transformer insulation. Scope, 2005.

ON-LINE MONITORING APPLICATIONS FOR POWER TRANSFORMERS


Pekka Nevalainen
Tampere University of Technology
pekka.nevalainen@tut.fi

ABSTRACT
Power transformers are very critical components in an electrical network. To provide a continuous power supply, transformers need to operate without failures. Condition monitoring of power transformers is a good tool to reduce power outages caused by transformer failures. Many different methods for condition monitoring exist. This paper briefly describes several different types of applications for on-line monitoring.

INTRODUCTION
Modern power transformers are expected to operate for a very long time; a lifetime of even 40 years may be expected. To provide a long lifetime and to avoid power outages, condition monitoring of the transformers should be implemented. Furthermore, condition monitoring can reduce maintenance costs, may be used for identifying the reasons for failure, and can extend the lifetime even further [1], [2].
Condition monitoring can be carried out using different methods. Generally the methods can be divided into two groups: off-line and on-line. Off-line methods usually demand disconnecting the transformer from the electric network and may use intrusive actions to perform the needed measurements. These methods include, for example, return voltage measurements (RVM), dielectric frequency response (tan δ(f)) and gas analysis of a transformer oil sample [2]. These methods and measurements are usually very accurate and provide good information on the transformer's condition [3].
On-line methods use different types of sensors attached to the transformer. These sensors do not affect the normal operation of the transformer, thus making it possible to perform continuous measurements over long periods of time. The on-line methods are based on sensors, data acquisition and analysis. Some of the methods are still experimental, but good commercial solutions exist [3].
On-line monitoring applications for power transformers include, for example, measurements of different temperatures, gases in oil, humidity of the oil, partial discharges, winding movement, furfuraldehyde, tan δ and load tap changers. Measurements are carried out using various sensors, including optical fibers, vibration detectors, UHF antennas, gas sensors, thin film coils and capacitive sensors [4], [2], [3].

TEMPERATURE MEASUREMENTS
A power transformer can overheat during too heavy loading or because of a cooling system malfunction. Too high temperatures in the transformer cause aging of the insulation. A simple model of thermal aging is the Montsinger equation, which states that if the temperature is above 98 °C, every 6 °C rise of the hot-spot temperature doubles the aging rate of the insulation. On-line temperature measurements, along with thermal modeling, are key factors in temperature based condition monitoring [5].
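As a minimal sketch of this rule applied to monitoring data, the relative aging rate V = 2^((T_hs - 98)/6) can be accumulated over hourly hot-spot readings to give equivalent aging hours; the readings below are invented for illustration, and the formula is the common loading-guide form of the doubling rule, not code from [5].

    def relative_aging_rate(hot_spot_c: float) -> float:
        # Relative aging rate: 1.0 at 98 degC, doubling for every 6 degC rise
        return 2.0 ** ((hot_spot_c - 98.0) / 6.0)

    def consumed_life_hours(hot_spot_samples_c, sample_interval_h=1.0):
        # Equivalent aging hours accumulated over a series of hot-spot readings
        return sum(relative_aging_rate(t) * sample_interval_h for t in hot_spot_samples_c)

    # Invented example: 24 hourly readings cycling between 92 degC and 110 degC
    readings = [92.0] * 12 + [110.0] * 12
    print(f"equivalent aging: {consumed_life_hours(readings):.1f} h in 24 h")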
Temperature has traditionally been measured using Pt100 devices (platinum resistance thermometers). Generally these devices are used to measure transformer oil temperatures, but they are also used to measure temperatures in very difficult locations, such as in the magnetic core itself [6]. However, one drawback of the Pt100 sensors is induced electrical noise [5].
Thin film sensors
Thin film sensors are an advanced technique for measuring temperatures. Thin film sensors are constructed on metallic or non-metallic bases using several techniques. The sensor described here is constructed using vacuum evaporation to form a miniature thin film thermocouple (type K). Thin film sensors are more reliable and accurate than traditional Pt100 devices. The thickness of thin film sensors ranges from 12 to 50 nm, while Pt100 devices have a thickness of 100 µm. Therefore the thin film sensors do not create an air gap or change the flux inside the core, hence they are harmless to use in critical installation locations. Thin film sensors can be used in on-line temperature measurements [6].
Enhanced fiber optic temperature sensor
Fiber optics can also be used in temperature measurements. Two traditional fiber optic measurement techniques exist: point sensors and distributed measuring systems. They provide sufficient accuracy for on-line monitoring: ±1 °C for point sensors and ±1 °C/m for distributed measuring systems. Unfortunately these kinds of systems are very expensive and the long fibers are not robust enough. An enhanced fiber optic temperature sensor was constructed to overcome these drawbacks [5].
The new system uses a sensing probe integrated with a plastic cladding large-core (200 µm) optical fiber. The sensing element is formed by peeling off a small portion of the cladding and using a reference liquid as a replacement. When the temperature of the reference liquid changes, the refractive index is modulated and therefore the propagation regime of the fiber is modified. The temperature-refractive index characteristic of the reference liquid is known, making it possible to determine the temperature of the fluid in which the probe is immersed. The temperature detection is based on analog to digital processing hardware that monitors the power output of the optical fiber. The resolution is 0.2 °C and the accuracy is 0.5 °C for the measurement system [5].

GAS ANALYSIS
Power transformer gas-in-oil analysis (DGA) can be used for effective diagnostics and condition monitoring. Electrical and thermal stresses such as arcing, partial discharges and overheating cause degradation of the dielectric oil and the solid dielectric cellulose materials. The degradation of the insulation produces different gases. Important gases for fault detection include H2, CO, CO2, CH4, C2H2, C2H4 and C2H6 [7]. Different degradation mechanisms generate different gases, thus making it possible to determine the degrading part of the transformer [2].
On-line DGA
Traditionally DGA is carried out using off-line measurements. However, there is an increasing availability of methods and sensors for on-line monitoring of dissolved gases. For example, the Syprotec Hydran, developed in the late 1970s, is widely used mainly for CO and H2 detection. More recently, better techniques for gas analysis using membrane or vacuum extraction have appeared [8].
On-line measurements benefit from the wide experience gained in laboratories over many decades. Semiconductor sensors, infrared sensors, combustible gas detectors and gas chromatography are commercially available. Progress in process automation and microelectronics makes it possible to use more sophisticated measuring equipment, which can be used together with artificial neural networks (ANN) and fuzzy logic systems. These kinds of methods applied to DGA can be used to reveal apparent fault conditions as well as hidden relationships between different fault types [7].
On-line gas phase monitoring
On-line gas analysis can be carried out using infrared spectroscopy measurements. One application is to use a Clemet Fourier transform infrared spectrometer (FTIR) to measure the free gases in the inert gas blanket above the transformer oil. FTIR can detect all hydrocarbons and even furans. At temperatures above 120 °C the furans are in a gaseous state, making it possible to detect them as gases. It has been argued that measuring the free gas quantities will provide helpful information on insulation status more readily than DGA. If the continuous gas phase monitoring shows abnormal behaviour, more accurate diagnostic tests should be conducted [9].

FURFURALDEHYDE MEASUREMENTS
The furfuraldehyde (FFA) concentration in power transformer oil is usually measured during periodic inspections. The most commonly used technique to measure FFA is high performance liquid chromatography (HPLC). A statistical survey shows that the FFA concentration can vary from 0.1 to 10 ppm [10]. High performance liquid chromatography can be used for on-line FFA measurements, but the systems are rather expensive [11].

On-line FFA measurements using an optical sensor
An alternative method for determining the FFA concentration is introduced; the author reports it as experimental. The method uses toxic chemicals which react with FFA to produce a colored complex in solution. A linear correlation between the optical absorbance of the complex in solution and the concentration of FFA in the oil is established [10].
The method is based on solid porous 2 cm thick glass-like discs coated with a 1 mm aniline acetate layer. Several discs are immersed in the oil and, when FFA is present, the discs turn pink in a few minutes. The sensor is made of a light source, a monochromator, a lens, a two-branched light pipe (fiber), a mirror and a detector. The light travels along the fiber and through the discs, reflects back from the mirror, goes through the discs again and finally arrives at the detector. Several discs are used to improve sensitivity and to shorten the response time. The amount of absorbed light at a wavelength of ~530 nm is measured with the detector. The system can detect 0.1 ppm of FFA [10]. Figure 1 describes the behaviour of the normalized transmission as a function of wavelength and FFA concentration in ppm.

Figure 1. The figure represents measurement of different FFA concentrations [10].


The minor drawbacks of this application are that the time taken for a disc to change color is temperature dependent, and that the current system components are a little bulky. However, an on-line application is possible using, for example, a few LEDs as the light source and a more compact design [10].
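A minimal sketch of the linear calibration idea: absorbance readings at ~530 nm from reference oils of known FFA content are fitted with a straight line, which is then inverted to estimate an unknown concentration. The calibration numbers below are invented for illustration, not measurement data from [10].

    import numpy as np

    # Invented calibration data: FFA concentration [ppm] vs. optical absorbance at ~530 nm
    ffa_ppm = np.array([0.1, 0.5, 1.0, 2.0, 5.0])
    absorbance = np.array([0.02, 0.09, 0.18, 0.37, 0.91])

    # Least-squares straight line: absorbance = slope * ppm + intercept
    slope, intercept = np.polyfit(ffa_ppm, absorbance, 1)

    def ffa_from_absorbance(a: float) -> float:
        # Invert the linear calibration to estimate the FFA concentration [ppm]
        return (a - intercept) / slope

    print(f"measured absorbance 0.25 -> {ffa_from_absorbance(0.25):.2f} ppm FFA")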

MECHANICAL FAULT DETECTION USING FREQUENCY RESPONSE ANALYSIS
The internal condition of a power transformer is difficult to determine without intrusive tests. The internal condition may deteriorate due to winding movement. Usually winding movement is caused by short circuits and loss of winding clamping pressure. Internal damage may also occur in transit or during the initial installation. Typically these types of faults alter the distributed capacitance or inductance of the windings. Frequency response analysis (FRA) can be used to detect the internal mechanical faults, but the results are affected by many factors, leading to uncertain conclusions [12].
On-line low voltage impulse test method
The low voltage impulse (LVI) method, also known as FRA, is used to monitor possible winding movement on-line. The application is based on a short low voltage impulse applied to one transformer winding, with the response then measured on another winding. A high voltage bushing capacitance tap was used for impulse injection and also for measurement. The impulse used in the measurements was a lightning impulse. Switching pulses were also considered, but they did not contain enough power at high frequencies [13]. Figure 2 shows a measured on-line voltage impulse.

Figure 2. The measured voltage impulse used in on-line FRA tests [13].
By taking the Fourier transform of the original input impulse and of the measured output impulse, a transfer function can be calculated. By comparing the results for the transformer before the initial installation with results obtained later on, possible winding movement can be detected [13]. This can be called a difference technique. It is also possible to take signatures from each phase, which can be called a signature technique. This technique shows the similarity of the windings at the testing time and provides a reference to evaluate faults or abnormalities in the future [12].
An additional approach for on-line FRA measurements is to use different frequency ranges to detect different types of faults. For example, major faults such as winding displacement and grounding faults can be detected in the low frequency spectrum range of 2 kHz, while minor faults like interturn faults and bulging of conductors can be identified in the high frequency range of 2 MHz [12]. A minimal transfer function sketch is given below.
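A minimal sketch of the transfer function calculation, assuming the injected and measured impulses have been digitized at a common sampling rate; the record names, the sampling rate and the simple deviation indicator are assumptions for illustration, not the processing used in [13].

    import numpy as np

    def transfer_function(u_in, u_out, fs):
        # Magnitude of H(f) = U_out(f) / U_in(f) from two sampled impulse records
        n = len(u_in)
        U_in = np.fft.rfft(u_in)
        U_out = np.fft.rfft(u_out)
        f = np.fft.rfftfreq(n, d=1.0 / fs)
        eps = 1e-12 * np.max(np.abs(U_in))  # guard against division by near-zero spectral bins
        return f, np.abs(U_out) / (np.abs(U_in) + eps)

    def signature_deviation(h_ref, h_now):
        # Difference-technique indicator: worst-case relative deviation from the reference signature
        return np.max(np.abs(h_now - h_ref) / (np.abs(h_ref) + 1e-12))

    # Usage (hypothetical records): f, h0 = transfer_function(u_in_ref, u_out_ref, fs=10e6)
    #                               f, h1 = transfer_function(u_in_now, u_out_now, fs=10e6)
    # Winding movement is suspected if signature_deviation(h0, h1) exceeds a chosen threshold.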

TAN δ ON-LINE MONITORING
Measurement of the tan δ of power transformer insulation can be used to determine the quality of the insulation as a standard test before the initial installation. However, in-service conditions such as interference, limited time (periodic measurements) and difficult access to the equipment make tan δ measurements difficult to perform [14]. Field measurements required the transformer to be disconnected because of the low testing voltage (10-12 kV) compared to the operating voltage of the transformer. Low levels of tan δ are usually a sign of healthy insulation, whereas sudden increases in the value of tan δ over time are taken as a sign of insulation deterioration [2]. On-line continuous monitoring can overcome some of these drawbacks.
The on-line method is based on a capacitor connected to a tap of the HV equipment (e.g. a bushing). Together they form a voltage divider, which is used along with a reference voltage to calculate the tan δ. The author reports applications for HV current transformers (CT) and bushings [14]; a direct application for measuring the tan δ of power transformer insulation is therefore not reported. However, it is possible to monitor the tan δ of a power transformer's bushing if the proper voltage divider tap is present. It is also possible to use a non-invasive capacitive divider as a sensor if the voltage tap is not present. The capacitive sensor is installed on the porcelain surface of the bushing next to the ground potential [15].
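A minimal sketch of the underlying calculation, assuming synchronously sampled reference voltage and divider output waveforms: tan δ is estimated from the departure of their fundamental phase difference from 90°. The FFT-bin extraction of the 50 Hz fundamental and the signal names are assumptions for illustration, not the method of [14] or [15].

    import numpy as np

    def tan_delta(v_ref, v_div, fs, f0=50.0):
        # Estimate tan(delta) from the phase shift between the fundamentals of two sampled signals
        n = len(v_ref)
        k = int(round(f0 * n / fs))  # FFT bin of the fundamental
        Vr = np.fft.rfft(v_ref)[k]
        Vd = np.fft.rfft(v_div)[k]
        # Ideal capacitive insulation gives a 90 degree shift; the residual angle is the loss angle
        delta = np.pi / 2.0 - abs(np.angle(Vd) - np.angle(Vr))
        return np.tan(delta)

    # Invented example: a 0.005 rad loss angle injected into synthetic waveforms
    fs, n = 10_000.0, 10_000
    t = np.arange(n) / fs
    v_ref = np.sin(2 * np.pi * 50 * t)
    v_div = np.sin(2 * np.pi * 50 * t + np.pi / 2 - 0.005)
    print(f"estimated tan delta: {tan_delta(v_ref, v_div, fs):.4f}")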

PARTIAL DISCHARGE MONITORING
Power transformer breakdowns are most frequently preceded by partial discharges (PD). Monitoring the PD activity of a transformer can give valuable information about a possible breakdown [11]. Every partial discharge occurring inside the transformer generates an electrical pulse, a mechanical pulse and electromagnetic waves. These different types of pulses can be detected using various sensors and techniques. Very high frequency (VHF) PD detection uses narrowband measurements tuned to the frequency with the best sensitivity [2]. Acoustic sensors can also be used, in the frequency range of 100-300 kHz [11]. Acoustic sensors include, for example, microphones installed on the transformer tank. Fiberglass rods immersed in the transformer oil can also be used for acoustic measurements [23]. Ultra high frequencies (UHF) can be used to detect PD occurring inside power transformers [16].
UHF PD detection
Partial discharges occurring inside the transformer excite electromagnetic waves with resonances at frequencies of 500-1500 MHz. Possible causes of PD include a temporary overvoltage and weaknesses in the insulation introduced during manufacturing or due to various aging effects. Even if the PD pulses of an evolving fault are weak at first, PD will cause chemical decomposition and erosion of materials. The UHF technique may use a UHF disc coupler that fits the unused oil ports of the power transformer. Another approach is to use a dielectric window that is installed in the transformer tank [16], [17].
Different types of data handling layers are used in this on-line application. The system needed to be scalable and to support new sensors, data sets and interpretation techniques as they become available. Therefore four data layers were constructed: a data monitoring layer, an interpretation layer, a corroboration layer and an information layer. The system will also integrate other monitoring technologies and data sources to provide more accurate and diverse analysis. PD sensor data can be analysed using the following methods: time-energy mapping and clustering of PD data, time-frequency analysis combined with feature extraction and clustering, and phase-resolved data representation [17].
New techniques for on-line PD measurements
On-line PD measurements of HV equipment suffer from heavy noise of various origins. Severe enough noise can reduce the sensitivity and accuracy of PD measurements. Both software and hardware approaches can be used to reduce the noise. New sensors, such as fiber optic and directional sensors, together with multiple terminal measurements and better differential and balanced circuits, are the main points of hardware development. Software development focuses on different types of modern filters, noise gating technologies and advanced digital signal processing [18].
A couple of new sensors are introduced, including a PD coupler board and a multi-channel PD detector. The new PD coupler is not directly connected to the HV conductor or components. It consists of a sensing board, a high frequency transformer and an amplifier. The sensing board is installed near the HV conductor and the stray capacitance acts as a coupling capacitor. A new multi-channel PD detector also uses advanced digital signal processing, directional sensing and noise gating techniques [18].

VIBRATION MEASUREMENTS
There are two internal factors that can cause vibration in power transformers. First there is core vibration, which is caused by magnetostriction and by excitation generated in the air gaps. Second there is winding vibration, which is generated by the Lorentz force due to the interaction of the leakage flux and the winding current. These vibrations are carried through the transformer oil to the transformer tank walls. Vibration signals appear to have a strong relation with the condition of the transformer core and windings, which is why vibration measurements can provide helpful information in transformer condition monitoring [19].
A piezoelectric accelerometer is positioned at different locations on the transformer tank wall. The accelerometer was isolated from the tank wall using insulation, to overcome heavy 50 Hz noise in the measured signal. Additionally the signal needed amplification, so a charge amplifier input was connected to the accelerometer output. The measurements showed that the vibration frequencies vary from 10 Hz to 2 kHz with amplitudes of 0.5 µm to 50 µm. On-line measurements are conducted under load and no-load conditions, and the spectrum of the signal is calculated for analysis. It is possible to find the spectrum of the winding vibration by subtracting the no-load results from the loaded results; this is possible because the magnetic flux in the core is almost independent of the load [19] (a minimal sketch of this subtraction is given below). Figures 3 and 4 represent the measuring system and some results.
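A minimal sketch of the spectrum subtraction described above, assuming two accelerometer records sampled at the same rate under load and no-load conditions; the windowing and clipping choices are assumptions for illustration, not the processing used in [19].

    import numpy as np

    def amplitude_spectrum(signal, fs):
        # One-sided amplitude spectrum of a vibration record
        n = len(signal)
        window = np.hanning(n)  # reduce leakage from non-integer cycle counts
        spec = np.abs(np.fft.rfft(signal * window)) / n
        return np.fft.rfftfreq(n, d=1.0 / fs), spec

    def winding_spectrum(loaded, no_load, fs):
        # Winding vibration estimate: loaded spectrum minus no-load (core) spectrum
        f, s_load = amplitude_spectrum(loaded, fs)
        _, s_core = amplitude_spectrum(no_load, fs)
        return f, np.clip(s_load - s_core, 0.0, None)  # negative residuals are noise, clipped to zero

    # Usage (hypothetical records): f, s_w = winding_spectrum(acc_loaded, acc_no_load, fs=8192.0)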
Different characteristics of the spectrum of the measured vibration of a power transformer in good condition can be used as a reference during on-line monitoring. However, more study with a database, vibration modeling and extraction of the failure characteristic vector should be performed for proper condition analysis [19].

Figure 3. The diagram describes the measurement system [19].

Figure 4. The waveform on the right is the spectrum of the measured vibration [19].

MONITORING OF ON LOAD TAP CHANGERS
Tap changers are the mechanical switching devices of the power transformer. The cost of an on load tap changer (OLTC) is low compared to the power transformer, but its malfunction can destroy the complete transformer. Traditionally some temperature sensors were installed on OLTCs, but these sensors were not fast enough to detect the temperature changes [1].
On-line monitoring of OLTCs
An international survey shows that OLTCs cause more failures and outages than any other component of a power transformer. Possible OLTC failures include motor malfunction, loss of power and flaws in the control circuits, any of which may cause the tap change to stop before completion. The mechanical parts can also wear out, causing loss of synchronization between the selector and the diverter, or jammed moving parts. In addition, dielectric breakdown can occur due to failure of the inter-phase insulation or aging of the oil [20].
The tap changer needs 5-6 seconds to finish its operation. During this time, different sensors are used for on-line monitoring. The vibrations caused by contact movements are at very high frequencies, so a high sampling rate is needed. The system consists of a 30 kHz accelerometer, a clamp-on current transformer and a pair of thermocouples. Monitoring of OLTCs is carried out using measurements of the vibration of the contact movements, the temperature of the insulation oil, and the voltage and current of the drive motor. The system is triggered by operation of the OLTC, and the above-mentioned measurements are conducted together with detection of the tap position. Figure 5 shows two different vibration waveforms. These measurements are collected into a database for analysis. The database consists of signatures of a vibration fingerprint, the drive motor current waveform envelope and the deviation between the temperatures of the transformer main tank and the OLTC tank [20].

Figure 5. This figure describes two different vibration waveforms of OLTCs. The above
waveform represents a faulty condition and the bottom waveform a normal condition [21].

Figure 6. On the right there is a diagram of a mean value of the condition indicator after 2400
tap change operations [21].
The on-line condition monitoring of OLTCs can be used to detect both gradual deterioration and sudden abnormalities. For example, long-term measurements over 3 years showed a gradual drift in the mean value of the condition indicator of a power transformer OLTC. The measurements included over 2400 tap change operations. Probabilities of the rate of degradation can thus be determined using long-term tests together with short-term tests [21]. Figure 6 shows the waveform of the condition indicator value during the 3-year test.

CONCLUSION
The electrical equipment of the power network, particularly a power transformer, should function correctly without failures for many years. Various techniques for determining the condition of the transformer exist. Sufficiently accurate long term continuous on-line monitoring, together with more accurate and sophisticated off-line measurements, can form a good basis for condition monitoring of power transformers. Good condition monitoring makes it possible to optimize the maintenance program, thus minimizing the costs and maximizing the reliability.
In this paper, various techniques were introduced. Some of them are still experimental and need more work before they can be used for reliable monitoring, while some are already commercially available. All the techniques are promising, and by combining various measurement results, using for example fuzzy logic, neural network systems and different transformer models, intelligent condition monitoring of power transformers can be established [3], [22]. However, the availability of many different types of measuring systems, combined with insufficient knowledge of many aging mechanisms and especially their interactions, makes accurate and reliable condition monitoring a very challenging task to accomplish.

REFERENCES
[1] A. Basak, "Condition monitoring of power transformers", Engineering Science Journal, pp. 41-46, 1999.
[2] J.P. van Bolhuis, E. Gulski, J.J. Smit, "Monitoring and diagnostic of transformer solid insulation", IEEE Transactions on Power Delivery, vol. 17, no. 2, 2002.
[3] T. Krieg, M. Napolitano, "Techniques and experience in on-line transformer condition monitoring and fault diagnosis in ElectraNet SA", Proceedings of PowerCon, vol. 2, pp. 1019-1024, 2002.
[4] T. Stirl, R. Skrzypek, C.Q.H. Ma, "Practical experiences and benefits with on-line monitoring systems for power transformers", Sixth International Conference on Electrical Machines and Systems (ICEMS 2003), vol. 1, pp. 9-11, 2003.
[5] G. Betta, A. Pietrosanto, A. Scaglione, "An enhanced fiber-optic temperature sensor system for power transformer monitoring", IEEE Transactions on Instrumentation and Measurement, vol. 50, no. 5, pp. 1138-1143, 2001.
[6] F.J. Anayi, A. Basak, D.M. Rowe, "Thin film sensors for flux and loss measurements", IEE Colloquium on Condition Monitoring of Large Machines and Power Transformers, Digest No. 086, pp. 3/1-3/4, 1997.
[7] X.Q. Ding, H. Cai, "On-line transformer winding's fault monitoring and condition assessment", Proceedings ISEIM (Electrical Insulating Materials), pp. 801-804, 2001.
[8] T. McGrail, A. Wilson, "On-line gas sensors", IEE Colloquium on Condition Monitoring of Large Machines and Power Transformers, Digest No. 086, pp. 1/1-1/4, 1997.
[9] M.K. Pradhan, T.S. Ramu, "Criteria for estimation of end of life of power and station transformers in service", Annual Report Conference on Electrical Insulation and Dielectric Phenomena (CEIDP), pp. 220-223, 2004.
[10] R. Blue, D.G. Uttamchandani, "A novel optical sensor for the measurement of furfuraldehyde in transformer oil", IEEE Transactions on Instrumentation and Measurement, vol. 47, no. 4, pp. 964-966, 1998.
[11] A. White, "A transformer manufacturer's perspective of condition monitoring systems", IEE Colloquium on HV Measurements, Condition Monitoring and Associated Database Handling Strategies, Ref. No. 448, pp. 4/1-4/4, 1998.
[12] S. Birlasekaran, F. Fetherston, "Off/on-line condition monitoring technique for power transformers", IEEE Power Engineering Review, vol. 19, no. 8, pp. 54-56, 1999.
[13] M. Wang, A.J. Vandermaar, K.D. Srivastava, "Condition monitoring of transformers in service by the low voltage impulse test method", Eleventh International Symposium on High Voltage Engineering, Conf. Publ. No. 467, vol. 1, pp. 45-48, 1999.
[14] P. Vujović, R.K. Fricker, "Development of an on-line continuous tan(δ) monitoring system", IEEE International Symposium on Electrical Insulation, pp. 50-53, 1994.
[15] A. Setayeshmehr, A. Akbari, H. Borsi, E. Gockenbach, "New sensors for on-line monitoring of power transformer bushings", Nordic Insulation Symposium, pp. 151-158, 2005.
[16] J. Pearson, B.F. Hampton, M.D. Judd, B. Pryor, P.F. Coventry, "Experience with advanced in-service condition monitoring techniques for GIS and transformers", IEE Colloquium on HV Measurements, Condition Monitoring and Associated Database Handling Strategies, Ref. No. 448, pp. 8/1-8/10, 1998.
[17] M.D. Judd, S.D.J. McArthur, J.R. McDonald, O. Farish, "Intelligent condition monitoring and asset management. Partial discharge monitoring for power transformers", Power Engineering Journal, vol. 16, no. 6, pp. 297-304, 2002.
[18] Q. Su, K. Sack, "New techniques for on-line partial discharge measurements", Proceedings IEEE INMIC (Technology for the 21st Century), pp. 49-53, 2001.
[19] J. Schengchang, S. Ping, L. Yanming, X. Dake, C. Junling, "The vibration measuring system for monitoring core and winding condition of power transformer", Proceedings ISEIM (International Symposium on Electrical Insulating Materials), pp. 849-852, 2001.
[20] P. Kang, D. Birtwhistle, J. Daley, D. McCulloch, "Noninvasive on-line condition monitoring of on load tap changers", IEEE Power Engineering Society Winter Meeting, vol. 3, pp. 2223-2228, 2000.
[21] P. Kang, D. Birtwhistle, "On-line condition monitoring of tap changers - field experience", 16th International Conference and Exhibition on Electricity Distribution (CIRED), Part 1, IEE Conf. Publ. No. 482, vol. 1, p. 5, 2001.
[22] O. Roizman, V. Davydov, "Neuro-fuzzy computing for large power transformers monitoring and diagnostics", 18th International Conference of the North American Fuzzy Information Processing Society (NAFIPS), pp. 248-252, 1999.
[23] D. Harris, M. Saravolac, "Condition monitoring in power transformers", IEE Colloquium on Condition Monitoring of Large Machines and Power Transformers, Digest No. 086, pp. 7/1-7/3, 1997.

TESTS AND DIAGNOSTICS OF INSULATING OIL


Kaisa Tahvanainen
Lappeenranta University of Technology
Kaisa.Tahvanainen@lut.fi
INTRODUCTION
The importance of oil as an insulation material is significant in constructions where heat must be transported away or where it is important to impregnate laminated insulation material. Oil is commonly used as an insulating material in power and instrument transformers, switchgear installations, capacitors, bushings and cables. Experience worldwide has shown that lack of attention to oil condition can lead to shorter operational lives of equipment. In addition to having good electrical, thermal and mechanical characteristics, insulation oil should endure the stress of service without deteriorating. Long term stress at high temperatures, in particular, can alter the characteristics of insulation oil. This more or less natural ageing of insulation oil can be accelerated by electrical and chemical stress, leading to poorer cooling and insulation properties of the oil and, in the worst case, to unplanned outages and failure of equipment. Changes in the insulation oil can be analyzed in order to determine the oil condition and hence the functioning of the equipment.
In this paper, some of the most common insulating oil testing methods and principles are presented. Some of the oil testing techniques are expensive and require expert knowledge. Such testing is therefore done at the most important sites, where the cost is small in comparison with that associated with insulation failure; this usually means on-line testing. Oil tests performed on site can also be representative and can be performed by relatively unskilled staff. Many electricity distribution companies use laboratory tests for important sites as well as on-line monitoring of oil condition.
INSULATING OILS
Insulating oils have been used in oil filled electrical equipment as coolant and insulation since the 1900s. The purpose of insulation oil is to protect the solid parts of the insulation structures (used in the construction of the equipment) from electric discharges, to assist in quenching arcs, and to dissipate the heat generated in the equipment during use. The characteristics required of an insulation oil depend on the equipment and the circumstances in which the insulation is used. In transformers the liquid insulation requires high dielectric strength and good conveyance of heat in order to ensure cooling. The insulation liquid should also have high resistivity, a low loss factor and good tolerance of discharges. In cables and capacitors the insulation liquid should possess low viscosity (for impregnation) in addition to good tolerance of discharges. Besides the technical features, the use of insulation liquids is influenced by environmental and life span considerations. Lately, research has focused on e.g. biodegradable vegetable-based oils, such as turnip rape oil. (Aro et al. 1996)

Several different types of insulating oils have been produced and introduced onto the market. These range from mineral to synthetic oils, with different chemical and electrical properties for different applications.
Mineral oil is the most common insulation liquid in use due to its availability and affordability. Mineral oils consist of hydrocarbon compositions, of which the most common are paraffinic, naphthenic and aromatic oils. The main hydrocarbon composition of the mineral oil and its various impurities determine the features that the mineral oil possesses. Paraffinic oil is the least likely to oxidize (antioxidants may be added to improve this further); it is mainly used in breakers. In transformers paraffinic oil precipitates easily, so naphthenic oil is mainly used. Aromatic oils are best suited for cases where good discharge tolerance is needed. The most common mineral oil in use is transformer oil (the required characteristics are defined e.g. in standard IEC 60296). Its electrical features and low viscosity make it a good insulating and cooling medium. The boiling temperature of transformer oil is 250-300 °C and it is very fluid. It also oxidizes easily and is flammable (in liquid form the flash point is, however, over 130 °C). The features of transformer oil in service are also influenced by moisture and impurities, in addition to oxidation. (Aro et al. 1996)
Synthetic insulation liquids include synthetic aromatic hydrocarbons, alkylbenzenes, esters, silicone oils and polybutenes. Esters are used for refilling transformer oil or in combination with transformer oil. Silicone oil is a synthetic oil which is more environmentally friendly than transformer oil. It is also non-combustible. Synthetic oils are more expensive than mineral oils, and they have poorer thermal conductivity and discharge tolerance. Silicone oils cannot be used in breakers, because arcs generate flammable gases. In addition, silicone oils must be protected from moisture, because they absorb water. (Aro et al. 1996)
In insulation constructions a combination of solid and liquid insulation is often used. By impregnating the solid insulation with liquid insulation, the insulation characteristics can be improved. By combining mineral oil and paper, an insulation is created that has better electrical strength than either material alone. This kind of insulation is used in transformers, cables and capacitors. The weakness of this insulation combination is that both materials are sensitive to impurities, especially to moisture and oxidation. (Aro et al. 1996)
INSULATION OIL TESTS
In addition to having good electrical, thermal and mechanical characteristics, insulation structures should endure the stress of service without deteriorating (Aro et al. 1996). Prolonged bulk oil temperatures of greater than about 75 °C usually involve spot temperatures of over 98 °C. Further increases rapidly increase the degradation of oil impregnated paper and pressboards. Ageing of these cellulose-based materials is also increased by higher moisture and oxygen contents. Additionally, such high temperatures increase the rate of oxidation of the oil, producing acids, moisture and sludge, which impair both the cooling properties and the dielectric strength. Water increases the rate of oil oxidation and effectively self-catalyses the reaction. All of these changes increase the possibility of electrical breakdown. Chemical and physical analysis gives information on the serviceability of the oil as both an insulator and a coolant. (Myers 1998)
Oil testing can be split into two categories: tests that are performed on site and tests that are performed in an oil laboratory (Table 1). On-site tests are intended to determine the oil quality and the acceptance of new oil at the point of delivery. There is a limited set of tests that can be performed on site; they are relatively simple and low cost. Furthermore, non-specialist personnel can usually carry out these tests. Laboratory tests are carried out in a controlled environment in the laboratory to verify the quality of the supplied unused oil, the condition of in-service oil, and for plant condition monitoring. (Pahlavanpour et al. 1999)
Table 1. Location and type of the tests conducted on insulation oil. (Pahlavanpour et al. 1999)

Location of test     Type of test
Field test           Moisture; breakdown voltage; color and appearance; acidity
Laboratory test      Resistivity; breakdown voltage; moisture; dielectric dissipation factor;
                     interfacial tension; flash point; dissolved gas analysis; furan analysis

Off-line techniques can only be carried out during outages, and they are only applicable to large or strategically important sites. Post-fault forensic tests also include paper analysis and metallurgical tests. To use the information from these tests for fault diagnosis or an ongoing monitoring programme, it is necessary to compare the results with previous datum levels. These tests are generally sensitive and specific, and are not expensive in the context of fault detection for such plant. A multi-parameter approach is essential. On-line techniques may consist of discrete tests or can be applied continuously, and they avoid the need for outages. Temperature is usually recorded for large plant, but for routine detection of abnormal conditions while maintaining output, analysis of the insulating oil provides a cheap but powerful tool to evaluate the condition of any oil-insulated plant, be it power or instrument transformers, bushings, cables or switches. (Myers 1998)
In general, the optimum interval for sampling and testing of oil will depend on the type of equipment, its power rating, and the duty and service conditions of the equipment. The duty experienced by insulating oil in transformers and selectors is different from that experienced in circuit breakers and diverters. This may lead to different changes in the chemical characteristics of the oil and a different rate of oil deterioration. A typical check interval is every 1-4 years; economic factors and reliability requirements have to be balanced. Details of the oil sampling technique are outlined in IEC 60475. The frequency of oil sampling and testing is given in IEC 60422. (Pahlavanpour et al. 1999)

Dissolved Gas in Oil Analysis
Dissolved gas analysis (DGA) is widely accepted as the most reliable tool for the earliest detection of incipient faults in electrical equipment using insulating oil. Gas-in-oil analysis by gas chromatography (for details, see Annex 1) has proven to be predictive and valuable for problems such as arcing, corona, overheated oil and cellulose degradation. When insulating oils and cellulose materials in electrical equipment are subjected to higher than normal electrical or thermal stresses, they decompose to produce certain combustible gases, referred to as fault gases. For incipient fault conditions (i.e. a slowly evolving fault), the gases generated will be dissolved in the oil long before any free gas accumulates in the gas relay. Thus, by analyzing an oil sample for dissolved gas content, it is possible to assess the condition of the equipment and to detect faults at an early stage. If a fault is indicated, the type of fault can be predicted using various analysis methods. Table 2 shows the concentrations of the gases used to determine whether there is any problem and whether there is sufficient generation of each gas for the ratio analysis to be applicable. (Ward 2003)
Table 2. Typical concentrations of dissolved gases (Saha 2003).

Key gas    Concentration (ppm)
H2         100
CH4        120
CO         350
C2H2       35
C2H4       50
C2H6       65

Several DGA tests should be taken over a period of time in order to determine the rate of increase of the fault gases, and therefore the rate of advancement of the fault. The assumption is made that the rate of change should not exceed 10 percent per month for gases whose concentration exceeds the typical concentration value. (Arakelian 2002)
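A minimal screening sketch combining the typical concentrations of Table 2 with the 10 percent per month rate-of-rise rule quoted above; the threshold dictionary follows Table 2, while the function structure and the sample readings are invented for illustration.

    # Typical dissolved gas concentrations [ppm] per Table 2 (Saha 2003)
    TYPICAL_PPM = {"H2": 100, "CH4": 120, "CO": 350, "C2H2": 35, "C2H4": 50, "C2H6": 65}

    def screen_dga(previous, current, months_between):
        # Flag gases above their typical level whose monthly rate of rise exceeds 10 %
        flagged = []
        for gas, limit in TYPICAL_PPM.items():
            now, before = current.get(gas, 0.0), previous.get(gas, 0.0)
            if now > limit and before > 0.0:
                monthly_rise = (now / before) ** (1.0 / months_between) - 1.0
                if monthly_rise > 0.10:
                    flagged.append((gas, now, monthly_rise))
        return flagged

    # Invented example: two samples taken three months apart
    prev = {"H2": 90, "CH4": 100, "C2H2": 10}
    curr = {"H2": 160, "CH4": 115, "C2H2": 12}
    for gas, ppm, rate in screen_dga(prev, curr, 3.0):
        print(f"{gas}: {ppm} ppm, ~{rate:.0%}/month rise")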
The most important part of DGA is the interpretation of the results, which can vary from the simple use of key gases to suggest a type of fault, to sophisticated computerized calculations of gas ratios, rates of increase, equilibrium between free and dissolved gases, and predicted times to Buchholz alarm operation. Such systems, used by experienced analysts, can decrease the effort spent on routine samples where no significant changes occur, allowing more time to examine the subtle changes that can give early warning of incipient faults. Trend analysis is of paramount importance, as residual effects from previous faults and multiple types of overheating can seriously distort the interpretation of results from a one-off sample. (Myers 1998)
Interpretation of DGA results is often complex and should always be done with care. The Doernenberg and Rogers ratio methods and Duval's triangle method are the most commonly used in gas-in-oil diagnostics, in addition to IEC 60599.
The key gas method identifies the key gas for each type of fault and uses the percentage of this gas to diagnose the fault. The key gases formed by degradation of oil and paper insulation are hydrogen (H2), methane (CH4), ethane (C2H6), ethylene (C2H4), acetylene (C2H2), carbon monoxide (CO) and oxygen (O2). Except for carbon monoxide and oxygen, all these gases are formed from the degradation of the oil itself. Acetylene is mainly associated with arcing, where temperatures reach several thousand degrees, ethylene with hot spots between 150 °C and 1000 °C, and hydrogen with partial discharges. The gas types and amounts are determined by where the fault occurs in the transformer and by the severity and energy of the event. The IEC standard 60599 is a guide describing how the concentrations of dissolved or free gases may be interpreted to diagnose the condition of oil-filled electrical equipment in service, and it suggests further actions. In addition to the threshold values given in the standard, it is suggested that empirical values characteristic of the device type could be used. According to the recommendations of IEC 60599, the existing method for the interpretation of gas analysis is based on the ratios of the concentrations CH4/H2, C2H2/C2H4 and C2H4/C2H6 to evaluate the defect. These ratios are to be used when the concentration of at least one of the gases exceeds the limiting concentration for normal equipment (a minimal sketch implementing the ratio bands of Table 3 is given after the table). (Ward 2003)
Table 3. Examples of the interpretation of dissolved gas analysis (Aro et al. 1996).

Fault type                           C2H2/C2H4      CH4/H2         C2H4/C2H6
Partial discharges                   insignificant  <0,1           <0,2
Discharges of low energy density     >1             0,1-0,5        >1
Discharges of high energy density    0,6-2,5        0,1-1          >2
Hot spots T < 300 °C                 insignificant  insignificant  <1
Hot spots 300 °C < T < 700 °C        <0,1           >1             1-4
Hot spots T > 700 °C                 <0,2           >1             >4
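A minimal sketch of how these three ratios map onto the fault classes of Table 3 might look as follows. The numeric boundaries are transcribed from the table above, the function name is illustrative, and the result is only meaningful when at least one gas exceeds its typical value:

# Sketch of a three-ratio interpretation per Table 3 / IEC 60599.
# Boundaries transcribed from Table 3; naming is illustrative.

def iec_ratio_diagnosis(c2h2, c2h4, ch4, h2, c2h6):
    """Return a tentative fault class from gas concentrations in ppm."""
    r1 = c2h2 / c2h4   # C2H2/C2H4
    r2 = ch4 / h2      # CH4/H2
    r3 = c2h4 / c2h6   # C2H4/C2H6
    if r2 < 0.1 and r3 < 0.2:
        return "partial discharges"
    if r1 > 1 and 0.1 <= r2 <= 0.5 and r3 > 1:
        return "low-energy discharges"
    if 0.6 <= r1 <= 2.5 and 0.1 <= r2 <= 1 and r3 > 2:
        return "high-energy discharges"
    if r3 < 1:
        return "hot spot, T < 300 °C"
    if r1 < 0.1 and r2 > 1 and 1 <= r3 <= 4:
        return "hot spot, 300 °C < T < 700 °C"
    if r1 < 0.2 and r2 > 1 and r3 > 4:
        return "hot spot, T > 700 °C"
    return "unidentified combination"

print(iec_ratio_diagnosis(c2h2=1, c2h4=120, ch4=110, h2=50, c2h6=40))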

Furan analysis
Electrical aging of paper, which is an integral part of oil insulation (oil-barrier insulation, paper-oil insulation), can lead to the formation of light gases. The thermal influence on paper initiates dehydration processes, resulting in the formation of water and compounds related to furans. The presence of oxygen and increased temperature initiate oxidizing reactions in the cellulose insulation. (Arakelian 2002)
The five most prevalent derivatives of furan that arise from the degradation of the cellulose and that are soluble in the oil are 2-furaldehyde, furfuryl alcohol, 2-acetylfuran, 5-methyl-2-furaldehyde and 5-hydroxymethyl-2-furaldehyde. A sample of the oil is extracted either with another liquid, such as acetonitrile, or with a solid-phase extraction device. The extract is then analyzed using liquid chromatography. The five compounds mentioned above are separated on an appropriate column, and each is detected by an ultraviolet detector that is adjusted automatically to the appropriate wavelength for each component. Calibration solutions are made up for each of the components to be analyzed and are used to standardize the instrument responses. From the data on the standard solutions, the extraction efficiencies for each component can be calculated and corrections made accordingly. The results are usually reported in parts per billion (ppb). (NNT)

Moisture/Water content
During the service life of oil-filled equipment, the moisture content may increase through the breathing of damp air, natural ageing of the cellulose insulation, oil oxidation, condensation or accidental means, and the water content of the paper may rise to five or even six percent. The presence of moisture increases the ageing rate of both the oil and the paper: insulating paper with a one percent moisture content ages ten times faster than paper with only 0.1 %. Water is a polar liquid and is attracted to areas of strong electric field. Water-soluble acids produced by oxidation of the oil act as a catalyst for almost all reactions and will combine with water or oil to assist or promote corrosion of exposed metal parts in the equipment. Cellulose has a greater affinity for water than oil has, so water will displace the oil in oil-impregnated cellulose. The presence of water in the oil reduces the electrical strength of the oil and may shorten the life of the insulation system and lead to early transformer failure. Like other oil properties, the moisture content should be monitored regularly. (Pahlavanpour & Roberts 1998)
A number of techniques have been investigated over the years to measure the quantity of moisture in a dielectric fluid, but the only method which has stood the test of time is that developed by Karl Fischer in the early 1930s. The method is outlined in IEC 60814, Insulating liquids - Oil-impregnated paper and pressboard - Determination of water by automatic coulometric Karl Fischer titration. Titration is a chemical analysis that determines the content of a substance, such as water, by adding a reagent of known concentration in carefully measured amounts until a chemical reaction is complete. There are two types of Karl Fischer titrators: volumetric and coulometric. The main difference between the two is that with the volumetric method the titrant is added directly to the sample by a burette, whereas with the coulometric method the titrant is generated electrochemically in the titration cell. The coulometric method measures much lower water levels than the volumetric method. Measurement by coulometric Karl Fischer moisture meters is quick and can be carried out by relatively unskilled staff. It is capable of measuring moisture contents down to 1 ppm (0,0001 %) in the oil and can be used for field analysis. The most obvious direct benefit of a portable moisture meter is the elimination of the possibility of further contamination that might occur while a sample is being transferred to a laboratory for analysis. (Pahlavanpour & Roberts 1998, Poynter & Barrios 1994)
Estimates of the moisture content of the cellulose can be made by relating the water content of the oil in ppm to the percentage concentration of water in the cellulosic insulation. However, this requires knowledge of the moisture equilibrium data for the oil in question, together with its normal temperature and moisture content. Water content can also be determined by gas chromatography; the water is extracted from the transformer oil simultaneously with the dissolved gases. The maximum allowable water content of oil in service depends on the transformer voltage; for new oil, a value of 30 ppm at delivery is recommended in IEC 60296. (Pahlavanpour & Roberts 1998, Poynter & Barrios 1994)
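Because the significance of a ppm reading depends strongly on temperature, moisture results are often expressed as relative saturation. A minimal sketch follows, assuming one commonly quoted empirical fit for the water solubility of mineral oil, log10(Ws) = 7.0895 - 1567/T, with Ws in ppm and T in kelvin; the coefficients vary between oils, so treat them, and all names, as illustrative:

# Sketch of converting a measured ppm value into relative saturation.
# The solubility coefficients are an illustrative empirical fit.

def water_solubility_ppm(temp_c):
    """Approximate water-saturation limit of mineral oil, in ppm."""
    t_kelvin = temp_c + 273.15
    return 10 ** (7.0895 - 1567.0 / t_kelvin)

def relative_saturation(ppm_measured, temp_c):
    """Measured water content as a percentage of the saturation limit
    at the sampling temperature."""
    return 100.0 * ppm_measured / water_solubility_ppm(temp_c)

# Example: 30 ppm measured at 60 °C
print(round(water_solubility_ppm(60.0), 1), "ppm saturation limit")
print(round(relative_saturation(30.0, 60.0), 1), "% relative saturation")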

Acidity/Neutralization Number (NN)
The acidity of an oil sample is related to the deterioration of the oil; new oils contain practically no acids if properly refined. The acidity test measures the content of acids formed through oxidation. The oxidation products polymerize to form sludge, which then precipitates out. Acids also react with metals on the surfaces inside the tank and form metallic soaps, another form of sludge. The amount of these acidic materials can be determined quantitatively by titration: the amount of a standardized base needed to neutralize the acidic materials present in a known quantity of an oil sample is measured. The result, referred to as the acid number (formerly the neutralization number), equals the milligrams of KOH (potassium hydroxide) required to neutralize the acid contained in 1 gram of oil. The titration can be done either volumetrically or gravimetrically, and the end point can be determined either colorimetrically or potentiometrically (IEC 62021-1). With old oils, colorimetric determination is sometimes difficult because of the dark colour of the oil (Myers 1998). The maximum neutralization value given by IEC 60296 is 0,03 mg KOH/g. (NNT)
Interfacial Tension
The interfacial tension (IFT) test is employed as an indication of the sludging characteristics of the oil (soluble polar contaminants and products of deterioration). In this procedure the surface tension of the oil is measured against that of water, which is highly polar. The more alike the two liquids are in polarity, the lower the surface tension between them. Thus, the higher the concentration of hydrophilic materials in the insulating fluid, the lower the interfacial tension of the oil measured against water. The attraction between the water molecules at the interface is influenced by polar molecules in the oil in such a way that more polar compounds cause a lower IFT. The test measures the concentration of polar molecules in suspension and in solution in the oil, and thus gives an accurate measure of dissolved sludge precursors in the oil long before any sludge is precipitated.
There are several methods that can be used to measure the interfacial tension of oil against water. One method measures the size of a drop of water formed below the surface of the oil; however, if more accurate values are needed, the du Noüy ring method is recommended. This method uses a tensiometer and a clean platinum wire ring. The ring is lowered into a beaker containing water and oil and is then brought up to the water-oil interface, where the actual measurement takes place. The force required to pull the ring through the interface is measured by the tensiometer and taken as the interfacial tension of the oil. The value for mineral oil varies from 24 to 40 dynes/cm (IEEE C57.106-1991). (NNT)

Dielectric dissipation factor (tan δ)
The power factor of insulating oil is the cosine of the phase angle between a sinusoidal potential applied to the oil and the resulting current; it can be measured, for example, using a Schering bridge. For insulating oils this characteristic is called the power factor, loss tangent or dissipation factor and is expressed at a specified temperature. The power factor indicates the dielectric loss of an oil, and thus the dielectric heating. Oxidation and contamination of the oil can cause its dissipation factor to rise, so determination of this property may provide useful information about used electrical insulating oil. Since these values vary with temperature, comparisons must always be made at the same temperature. Test methods are outlined in IEC 61620 (Insulating liquids - Determination of the dielectric dissipation factor by measurement of the conductance and capacitance - Test method) and IEC 60247 (Measurement of relative permittivity, dielectric dissipation factor and d.c. resistivity of insulating liquids). The maximum value for the dissipation factor given in IEC 60296 is 0,005 at 90 °C (50 Hz). (Aro et al. 1996)
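In the conductance-and-capacitance approach of IEC 61620, the dissipation factor of a parallel-equivalent oil cell follows as tan δ = G/(ωC). A minimal sketch, with illustrative names and example values:

import math

def dissipation_factor(conductance_s, capacitance_f, frequency_hz=50.0):
    """tan(delta) = G / (omega * C) for a parallel-equivalent oil cell;
    inputs in siemens, farads and hertz."""
    omega = 2.0 * math.pi * frequency_hz
    return conductance_s / (omega * capacitance_f)

# Example: G = 150 pS, C = 100 pF at 50 Hz -> about 0.0048,
# just under the 0,005 limit quoted above.
print(f"{dissipation_factor(150e-12, 100e-12):.4f}")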
Electrical Breakdown Strength
The breakdown voltage is indicative of the amount of contaminant (usually moisture) in the oil. The effect of moisture on insulating oil increases when impurities are present in the oil, and the oil temperature also affects the breakdown strength. Testing electrical breakdown strength begins by immersing two electrodes in a sample of the oil and applying an AC voltage across the electrodes. The voltage is then increased in a specified manner until electrical breakdown occurs. The various tests used differ in electrode spacing and shape and in the rate of increase of the voltage (and the duration of the test for DC voltage), and thus give different breakdown values for the same oil. The electrical breakdown strength of new, clean transformer oil in AC measurements is over 60 kV/2,5 mm (effective value), and the breakdown strength is independent of the oil brand. Testing is standardized in IEC 60156. Oil is not necessarily in good condition even when the dielectric strength is adequate, because this tells nothing about the presence of acids and sludge. IEC 60296 specifies a minimum AC breakdown voltage of 30 kV at delivery and 50 kV after treatment. (NNT, Aro et al. 1996)
Resistivity
The resistivity of electrical insulating oil is a measure of its resistance to DC current flow between conductors. The resistivity of mineral insulating oil is naturally high but, as with the loss tangent, is very sensitive to the presence of even minute amounts of suspended water, free ions or ion-forming materials such as acidic oxidation products or polar contaminants. (Aro et al. 1996)

Flash point
The flash point is an indication of the combustibility of the vapors of a mineral oil and is defined as the lowest temperature at which the vapor of the oil can be ignited under specified conditions, without sustaining a flame. Impurities in the oil lower the flash point. The usual method for flash point determination is the Pensky-Martens closed-cup test (ISO 2719). IEC 60296 specifies a minimum flash point of 140 °C for higher-viscosity mineral oil and 130 °C for lower-viscosity mineral oil.
Analysis of antioxidant
Mineral oils readily react with oxygen at elevated temperatures, first forming hydroperoxides and then organic acids. These compounds lead to viscosity increase, formation of sludge, discoloration, acidic odor and corrosion of metal parts. Oxidation resistance may be due to natural inhibitors or commercial additives. Four types of oxidation-inhibitor additives are zinc dithiophosphates, aromatic amines, alkyl sulfides and hindered phenols. Metal surfaces and soluble metal salts, especially copper, usually promote oxidation; therefore, another approach to inhibiting oxidation is to reduce the catalysis by deactivating the metal surfaces.
The effectiveness of the antioxidants in delaying oil oxidation can be measured by laboratory tests known generally as oxidation stability tests. Oxidation stability is measured in accelerated tests at high temperature, in the presence of excess oxygen, catalysts and possibly water. Results are expressed as the time required to reach a predetermined level of oxidation. Criteria can be a darkening color, the amount of sludge, gum and acids, the amount of oxygen consumed and, in some cases, the depletion of the antioxidant compound itself. The maximum value given by IEC 60296 is 0,10 % by mass for sludge or 0,40 mg KOH/g for the neutralization value. The method used is IEC 61125 (formerly IEC 1125). (Godfrey & Herguth 1995)
Viscosity
Viscosity is the resistance of oil to flow under specified conditions. The viscosity of oil used as a coolant influences heat transfer rates and consequently the temperature rise of an apparatus. Low viscosity ensures that the oil flows well, particularly at low temperatures, and helps to quench arcs. The viscosity of the oil also influences the speed of moving parts in tap changers and circuit breakers.
IEC 61868 specifies a procedure for the determination of the kinematic viscosity of mineral insulating oils, both transparent and opaque, at very low temperatures, after a cold-soaking period of at least 20 h, by measuring the time for a volume of liquid to flow under gravity through a calibrated glass capillary viscometer. The number of seconds the oil takes to flow through the calibrated region is measured; the oil's viscosity in cSt is the flow time in seconds multiplied by the apparatus constant. The procedure is particularly suitable for measuring the kinematic viscosity of liquids for use in cold climates, at very low temperatures (-40 °C) or at temperatures between the cloud-point and pour-point temperatures (typically -20 °C), where some liquids may develop unexpectedly high viscosities under cold-soak conditions. IEC 60296 limits the viscosity to 16,5 cSt at 20 °C and 800 cSt at -15 °C for higher-viscosity mineral oil; for lower-viscosity mineral oil the corresponding values are 11,0 cSt and 1800 cSt. (Godfrey & Herguth 1995)
Color and appearance
Mineral oil fresh from the refinery is essentially colorless. As the oil ages over time or is subjected to severe conditions such as local hot spots or arcing, the sample becomes darker in color. A fresh virgin sample of oil should be sparkling clear with no indication of cloudiness, sludge or particulate matter. The clarity of an oil sample is determined by observing the sample when illuminated by a narrow, focused beam of light. The color of a sample is determined by direct comparison with a set of color standards. The smell of oil from different parts of a transformer can also be a valuable first indicator of the source and type of a fault (Myers 1998). It should be pointed out that the color of the oil by itself should never be used to indicate the quality of the oil; rather, it can be used to determine whether more definitive tests should be done. (NNT)
The insulating oil condition can also be characterized by its fibre and particulate content. A simple count of visible fibres can be made using a crossed-polaroid viewing system, with results usually reported in terms of small (<2 mm), medium (2-5 mm) and large fibres. The range of 2-5 mm includes most cellulose fibres derived from the paper insulation, whereas larger fibres are usually contaminants introduced either during maintenance or sampling. These results, in conjunction with the water content, can give an indication of the cause of poor electrical strength. Accurate fibre and particulate counts require carefully controlled filtration in clean conditions followed by microscopic examination, and are only needed where very high quality insulation is required. (Myers 1998)
CONCLUSIONS
Practical diagnostics of oil-filled equipment are executed in a standard manner, see Picture 1. A sample of oil is taken regularly from the equipment to enable the early detection of developing defects. If all values remain below the limiting values, the condition of the insulation is considered satisfactory. For abnormal values, the analysis is repeated to confirm the results and to calculate the rate at which the defect is developing. If the abnormal results are not confirmed, or the defect shows no dynamics of development, normal operation can be assumed, especially if additional tests of the electrical, physical, physico-chemical and chemical characteristics of the oil are normal. On confirmation of a problem, the type of defect, thermal or electrical, and its severity are determined, and a decision is made on further checking by means of monitoring or frequent gas chromatographic analysis. The recommendation for refurbishment, repair or replacement is made on the basis of the accumulated data (IEC 60422, Supervision and maintenance guide for mineral insulating oils in electrical equipment). (Arakelian 2002)

[Picture 1 is a flowchart. Scheduled periodic inspection of oil-filled electrical equipment (OFEE) in service combines electrical measurements with GC analysis of the oil for the concentrations of dissolved gases and water. If no deviations are present, the equipment continues in service. If a deviation from normal is found, a defect is assumed and checked by additional selective measurements (tan δ, breakdown voltage and viscosity; furfural and antioxidant content; density (d20) and refractive index (n20); acidity), by repeated GC analysis of gases and water, and by thermovision control. Once a defect is established or confirmed and its rate of development determined, the outcome is either continued service with periodic GC control, a decision on the type of monitoring, or a decision on the scale, type and expediency of repair.]
Picture 1. Ideology and tactics of the diagnostic check of oil-filled electrical equipment (OFEE) (Arakelian 2002).
One can speak of oil diagnostics proper only when information on the owner, the oil sampling, the oil testing results, technical and operational data, and the history of oil maintenance actions is gathered periodically, together with an expert opinion on the suitability of the tested oil filling and guidelines for any needed corrective actions (filtering, adding inhibitor, exchange or reclaiming). For optimal oil diagnostics, the expert opinion should also take into account the owner's strategic lifetime of the equipment and the owner's maintenance and investment strategy, since the critical values of specific test results (degree of degradation) at which actions should be taken depend on these parameters. Having all the cited information, the diagnostic expert should also advise on the techno-economically optimal frequency and type of oil testing. A highly experienced expert with sufficient information can reduce the expenses for equipment supervision, maintenance and refurbishment. (Gradnik 2002)


REFERENCES
(Arakelian 2002) Arakelian, V.G., 2002. Effective Diagnostics for Oil-Filled Equipment. IEEE Electrical Insulation Magazine, Vol. 18, No. 6, November/December 2002.
(Aro et al. 1996) Aro, M., Elovaara, J., Karttunen, M., Nousiainen, K. & Palva, V., 1996. Suurjännitetekniikka. Otatieto 568. Jyväskylä 2003. ISBN 951-672-320-9.
(Barnes 2002) Barnes, M., 2002. Gas Chromatography: The Modern Analytical Tool. Practicing Oil Analysis Magazine. Available: www.practicingoilanalysis.com/article_detail.asp?articleid=352&relatedbookgroup=OilAnalysis
(Gradnik 2002) Gradnik, M.K., 2002. Physical-Chemical Oil Tests, Monitoring and Diagnostics of Oil-Filled Transformers. Proceedings of the 14th International Conference on Dielectric Liquids, Austria.
(Godfrey & Herguth 1995) Godfrey, D. & Herguth, W.R., 1995. Physical and Chemical Properties of Mineral Oil That Affect Lubrication. Herguth Laboratories. Available: www.herguth.com/technical/PHYSICAL.HTM
(Myers 1998) Myers, C., 1998. Transformers Conditioning Monitoring by Oil Analysis: Large or Small; Contentment or Catastrophe. Power Station Maintenance: Profitability Through Reliability, Conference Publication No. 452.
(NNT) Northern Technology & Testing. Available: www.nttworldwide.com
(Pahlavanpour et al. 1999) Pahlavanpour, B., Wilson, G. & Heywood, R., 1999. Insulating Oil in Service: Is It Fit for Purpose? The Institution of Electrical Engineers.
(Pahlavanpour & Roberts 1998) Pahlavanpour, B. & Roberts, I.A., 1998. Transformer Oil Condition Monitoring. The Institution of Electrical Engineers.
(Poynter & Barrios 1994) Poynter, W.G. & Barrios, R.J., 1994. Coulometric Karl Fischer titration simplifies water content testing. Oil & Gas Journal. Available: www.kam.com/techcenter-karlfischer.htm
(Saha 2003) Saha, T., 2003. Review of Modern Diagnostic Techniques for Assessing Insulation Condition in Aged Transformers. IEEE Transactions on Dielectrics and Electrical Insulation.
(Ward 2003) Ward, S.A., 2003. Evaluating Transformer Condition Using DGA Oil Analysis. 2003 Annual Report Conference on Electrical Insulation and Dielectric Phenomena.

Annex 1
Gas-Liquid Chromatography
In gas-liquid chromatography it is the interaction between the gaseous sample (the mobile phase) and a standard liquid (the stationary phase) that causes the separation of the different molecular constituents. The stationary phase is either a polar or a nonpolar liquid which, in the case of a capillary column, coats the inside of the column, or is impregnated onto an inert solid that is then packed into the GC column.

Figure 2. Gas Chromatography Instrument.


A schematic layout of a GC instrument is shown in Figure 2. The basic components are an inert carrier gas, most commonly helium, nitrogen or hydrogen; a GC column packed or coated with an appropriate stationary phase; an oven that allows precise temperature control of the column; and some type of detector capable of detecting the sample as it exits, or elutes, from the column. Gas-liquid chromatography works because the molecules in the sample are carried along the column in the carrier gas but partition between the gas phase and the liquid phase. Because this partitioning depends critically on the solubility of the sample in the liquid phase, different molecular species travel along the column and elute at different times: molecules with greater solubility in the liquid phase take longer to elute. Solubility depends on the physical and chemical properties of the solute; therefore, separation between different components of the sample occurs on the basis of molecular properties such as relative polarity (e.g. ethylene glycol versus base oil) and boiling point (e.g. fuel versus diesel engine base oil). For example, using a polar stationary phase with a mixture of polar and nonpolar compounds will generally result in longer elution times for the polar compounds, because they have greater solubility in the polar stationary phase. (Barnes 2002)


GAS DIAGNOSTICS IN TRANSFORMER CONDITION MONITORING


Pauliina Salovaara
Tampere University of Technology, Institute of Power Engineering
pauliina.salovaara@tut.fi

ABSTRACT
Transformers are vital components in both the transmission and distribution of electrical power. The early detection of incipient faults in transformers reduces costly unplanned outages. The most sensitive and reliable technique for evaluating the health of oil-filled electrical equipment is dissolved gas analysis (DGA). Insulating oils under abnormal electrical or thermal stresses break down and liberate small quantities of gases. The qualitative composition of the breakdown gases depends on the type of fault. By means of DGA it is possible to distinguish faults such as partial discharge (corona), overheating (pyrolysis) and arcing in a great variety of oil-filled equipment. Information from the analysis of gases dissolved in insulating oils is valuable in a preventative maintenance program. A number of samples must be taken over a period of time to develop trends. Data from DGA can provide advance warning of developing faults, provide a means for conveniently scheduling repairs, and monitor the rate of fault development.

INTRODUCTION
Monitoring and maintenance of mineral-oil-filled power transformers are of critical importance in power systems. Failure of a power transformer may interrupt the power supply and result in loss of profits. Therefore, it is of great importance to detect incipient failures in power transformers as early as possible, so that they can be switched off safely, improving the reliability of power systems. If a transformer long in service is subjected to higher than normal electrical and thermal stresses, it may generate by-product gases due to incipient failures. Dissolved gas analysis (DGA) is a common practice for incipient fault diagnosis of power transformers and is widely accepted as the most reliable tool for the earliest detection of incipient faults in transformers and other electrical equipment using insulating oil. [7,8]
The utility periodically samples and tests the insulating oil of transformers to determine the constituent gases in the oil, which are formed by the breakdown of the insulating materials inside. Gas-in-oil analysis by gas chromatography has proven to be predictive and valuable for some of the problems which could progress to catastrophic failures in transformers. Problems that can be detected are arcing, corona, overheated oil and cellulose degradation. These problems produce gas as they start to develop, and gas production increases with the increasing severity of the problem. The energy dissipation is least in corona, medium in overheating, and highest in arcing. According to the IEC Standard (Publication 567), nine dissolved gases can be determined from a DGA test: hydrogen (H2), oxygen (O2), nitrogen (N2), methane (CH4), ethane (C2H6), ethylene (C2H4), acetylene (C2H2), carbon monoxide (CO) and carbon dioxide (CO2). Therefore, if the future gas content of a transformer can be related to faults, forecasting fault conditions for power transformers becomes straightforward. Prediction of future fault conditions is the most important information for a maintenance engineering group seeking to avoid system outages. In the past, various fault diagnosis techniques have been proposed, including the conventional key gas method, gas ratio methods, expert systems, neural networks (NN) and fuzzy logic approaches. Recently, combinations of fuzzy logic and artificial intelligence (AI) have given promising results in fault analysis. [7,8]

METHODS OF GAS DETECTION

Three different methods of gas detection will be discussed, and their advantages and disadvantages will be compared. The first method determines the total combustible gases (TCG) present in the gas above the oil. The major advantage of the TCG method compared to the others covered here is that it is fast and applicable for use in the field; in fact, it can be used to monitor a unit continuously. However, the TCG method has a number of disadvantages. Although it detects the combustible fault gases (hydrogen, carbon monoxide, methane, ethane, ethylene and acetylene), it does not detect the non-combustible ones (carbon dioxide, nitrogen and oxygen). The method is only applicable to units that have a gas blanket, not to the completely oil-filled units of the conservator type. Since most faults occur under the surface of the oil, the gases must first saturate the oil and diffuse to the surface before accumulating in the gas blanket above the oil. These processes take time, which delays the early detection of the fault. The major disadvantage of the TCG method is that it gives only a single value for the percentage of combustible gases and does not identify which gases are actually present, and it is this latter information that is most useful in determining the type of fault that has occurred. [1]
The second method for the detection of fault gases is gas blanket analysis, in which a sample of the gas in the space above the oil is analyzed for its composition. This method detects all of the individual components; however, it is likewise not applicable to oil-filled conservator-type units, and it also suffers from the disadvantage that the gases must first diffuse into the gas blanket. In addition, this method is at present not well suited to field use: a properly equipped laboratory is preferred for the required separation, identification and quantitative determination of these gases at the part-per-million level. [1]
The third and most informative method for the detection of fault gases is the dissolved gas analysis (DGA) technique. In this method a sample of the oil is taken from the unit and the dissolved gases are extracted; the extracted gases are then separated, identified and quantitatively determined. At present this technique is best performed in the laboratory, since it requires precision operations. Since the method uses an oil sample, it is applicable to all types of units, and like the gas blanket method it detects all the individual components. The main advantage of the DGA technique is that it detects the gases in the oil phase, giving the earliest possible detection of an incipient fault. This advantage alone outweighs any disadvantages of the technique. [1]

FAULT GASES
Insulating materials within transformers and related equipment break down and liberate gases within the unit. The distribution of these gases can be related to the type of electrical fault, and the rate of gas generation can indicate the severity of the fault. The identity of the gases being generated by a particular unit can be very useful information in any preventative maintenance program. The causes of fault gases can be divided into three categories: corona or partial discharge, pyrolysis or thermal heating, and arcing. These three categories differ mainly in the intensity of energy that is dissipated per unit time per unit volume by the fault. The most severe intensity of energy dissipation occurs with arcing, less with heating, and least with corona. [1]
Mineral insulating oils are made of a blend of different hydrocarbon molecules containing CH3, CH2 and CH chemical groups linked together by carbon-carbon molecular bonds. Some of the C-H and C-C bonds may be broken as a result of electrical and thermal faults, with the formation of small unstable fragments, in radical or ionic form, which recombine rapidly through complex reactions into gas molecules such as hydrogen, methane, ethane, ethylene and acetylene. C3 and C4 hydrocarbon gases, as well as solid particles of carbon and hydrocarbon polymers (X-wax), are other possible recombination products. [9]
Fault gases that can be found within a unit fall into the following three groups:
1. Hydrocarbons and hydrogen: methane (CH4), ethane (C2H6), ethylene (C2H4), acetylene (C2H2), hydrogen (H2)
2. Carbon oxides: carbon monoxide (CO), carbon dioxide (CO2)
3. Non-fault gases: nitrogen (N2), oxygen (O2)
Except for the carbon oxides, nitrogen and oxygen, all these gases are formed from the degradation of the oil itself; carbon monoxide, carbon dioxide and oxygen are formed from the degradation of the cellulose (paper) insulation. [8]
Faults can be identified from the fault gases formed. Below is a list of fault types and their main fault gases [10]; a small illustrative lookup follows the list.
o Arcing: Large amounts of hydrogen and acetylene are produced, with minor quantities
of methane and ethylene. Key Gas: Acetylene
o Corona: Low-energy electrical discharges produce hydrogen and methane, with small
quantities of ethane and ethylene. Key Gas: Hydrogen
o Overheated Oil: Decomposition products include ethylene and methane, together with
smaller quantities of hydrogen and ethane. Key Gas: Ethylene
o Overheated Cellulose: Large quantities of carbon dioxide and carbon monoxide are
evolved. Key Gas: Carbon monoxide
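A minimal sketch of the key gas idea, assuming the fault/key-gas pairings listed above; the dictionary, function names and the simple "highest concentration wins" rule are illustrative, not taken from the cited references:

# Sketch of the key gas method: the dominant fault gas suggests a
# general fault type. Pairings follow the list above.

KEY_GAS_FAULTS = {
    "C2H2": "arcing",
    "H2": "corona (partial discharge)",
    "C2H4": "overheated oil",
    "CO": "overheated cellulose",
}

def key_gas_diagnosis(ppm):
    """Suggest a fault type from the key gas with the highest
    concentration among the four key gases."""
    key_gases = {g: ppm.get(g, 0.0) for g in KEY_GAS_FAULTS}
    dominant = max(key_gases, key=key_gases.get)
    return KEY_GAS_FAULTS[dominant]

# Example: acetylene dominates -> arcing suspected
print(key_gas_diagnosis({"H2": 120, "C2H2": 400, "C2H4": 60, "CO": 90}))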
Figures 1-4 show typical fault gas distributions for corona in oil, pyrolysis in oil, arcing in oil, and pyrolysis of cellulose, respectively; the distributions are summarized below. [1]

Figures 1-4. Typical fault gas distributions for corona in oil (Figure 1), pyrolysis in oil (Figure 2), arcing in oil (Figure 3) and pyrolysis of cellulose (Figure 4).

Gas    Corona in oil  Pyrolysis in oil  Arcing in oil  Pyrolysis of cellulose
H2     88 %           16 %              39 %           9 %
CO2    1 %            trace             2 %            25 %
CO     1 %            trace             4 %            50 %
CH4    6 %            16 %              10 %           8 %
C2H6   1 %            6 %               -              -
C2H4   0.1 %          41 %              6 %            4 %
C2H2   0.2 %          trace             35 %           0.3 %

IEC PUBLICATION
The new IEC Publication 60599 Ed. 2.0 b: Mineral oil-impregnated electrical equipment in service - Guide to the interpretation of dissolved and free gases analysis, concerning the interpretation of dissolved gas-in-oil analysis, was issued in 1999 as a result of the revision by IEC TC 10 of the previous IEC Publication 599, issued in 1978. It describes how the concentrations of dissolved gases or free gases may be interpreted to diagnose the condition of oil-filled electrical equipment in service and suggests future action. It is applicable to electrical equipment filled with mineral insulating oil and insulated with cellulosic-paper- or pressboard-based solid insulation. Information about specific types of equipment, such as transformers (power, instrument, industrial, railway, distribution), reactors, bushings, switchgear and oil-filled cables, is given only as an indication in the application notes. The publication may be applied only with caution to other liquid-solid insulating systems. In any case, the indications obtained should be viewed only as guidance, and any resulting action should be undertaken only with proper engineering judgment. [3]
The main body of IEC Publication 60599 contains an in-depth description of the five main types of faults usually found in electrical equipment in service. The familiar gas ratios have been retained for the diagnoses, but with new code limits, while additional gas ratios are suggested for specific fault cases. More precise definitions of normal and alarm gas concentrations in service are given than in the old publication. In the annexes, examples of typical (normal) gas concentration values observed in service are given for six different types of equipment. Extensive databases of faulty equipment inspected in service, and of typical normal values related to various types and ages of equipment and types of faults, have been used for the revision of Publication 60599. [3]
The classification of faults in IEC Publication 60599 follows the main types of faults that can be reliably identified by visual inspection of the equipment after the fault has occurred in service [3]:
o partial discharges (PD) of the cold plasma (corona) type, with possible X-wax formation, and of the sparking type, inducing small carbonized punctures in paper;
o discharges of low energy (D1), evidenced by larger punctures in paper, tracking, or carbon particles in oil;
o discharges of high energy (D2), with power follow-through, evidenced by extensive carbonization, metal fusion and possible tripping of the equipment;
o a thermal fault below 300 °C if paper has turned brownish (T1), and above 300 °C if paper has carbonized (T2);
o thermal faults above 700 °C (T3), evidenced by oil carbonization, metal coloration or fusion.
The number of characteristic faults is thus reduced from nine in the previous IEC Publication 599 to five in the new IEC Publication 60599. Table 1 shows the old IEC Publication 599 gas ratio codes and the nine fault types that can be detected according to those codes.

Identification of Faults in Service Using New IEC Publication 60599

The three basic gas ratios of IEC 599 (C2H2/C2H4, CH4/H2 and C2H4/C2H6) are also used in IEC 60599 for the identification of the characteristic faults. The ratio limits have been made more precise, using the data of Annex 1 in IEC 60599, in order to reduce the number of unidentified cases from around 30 % in IEC 599 to practically 0 %. Unidentified cases occurred in IEC 599 when ratio codes calculated from actual DGA results did not correspond to any of the codes associated with a characteristic fault. Graphical methods that allow these cases, as well as the evolution of faults with time, to be followed more easily and more precisely are also described. A more detailed version of the Triangle method, updated from a previously published version, is also included in the publication. [3]

Table 1. Old IEC Publication 599 gas ratio codes and fault types according to the codes. [7]

In addition to the three basic interpretation ratios, two new gas ratios have been introduced in IEC 60599 for specific diagnoses: the C2H2/H2 ratio, to detect possible contamination from the on-load tap changer (OLTC) compartment (when > 3), and the O2/N2 ratio, to detect abnormal oil heating/oxidation (when < 0.3). The limit below which the CO2/CO ratio indicates a possible involvement of paper in the fault has been made more precise (< 3). Finally, other sources of gas not related to a fault in service (mainly H2) are also indicated. [3]

Triangle Method
The Triangle graphical method of representation is used to visualize the different fault cases and to facilitate their comparison. The coordinates and limits of the discharge and thermal fault zones of the Triangle are indicated in Figure 5; zone DT corresponds to mixtures of thermal and electrical faults. Readers interested in visualizing their own DGA results using the Triangle representation should preferably use triangular coordinate drawing paper for better precision. The Triangle coordinates corresponding to DGA results in ppm are calculated as follows: %C2H2 = 100x/(x+y+z), %C2H4 = 100y/(x+y+z) and %CH4 = 100z/(x+y+z), with x = (C2H2), y = (C2H4) and z = (CH4) in ppm. [4]

Figure 5. Coordinates and fault zones of the Triangle. [4]
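The coordinate formula given above translates directly into code; a minimal sketch follows, with an illustrative function name. Only the coordinates are computed, since assigning a fault zone requires the zone boundaries of Figure 5, which are not reproduced here:

# Sketch of the Triangle coordinate computation described above.

def duval_coordinates(c2h2_ppm, c2h4_ppm, ch4_ppm):
    """Return (%C2H2, %C2H4, %CH4) triangle coordinates from ppm values."""
    total = c2h2_ppm + c2h4_ppm + ch4_ppm
    if total == 0:
        raise ValueError("at least one gas concentration must be non-zero")
    return (100.0 * c2h2_ppm / total,
            100.0 * c2h4_ppm / total,
            100.0 * ch4_ppm / total)

# Example: a sample with C2H2 = 5 ppm, C2H4 = 120 ppm, CH4 = 75 ppm
print(duval_coordinates(5.0, 120.0, 75.0))  # (2.5, 60.0, 37.5)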

ON-LINE MONITORING
Gas chromatographic diagnostics of power transformers has for some time been recognised worldwide as the most efficient physical-chemical on-line diagnostic method for determining potential thermal or electrical faults in transformers. Today the method is increasingly complemented by sensor-based on-line monitoring of transformers. The analysis is described as an on-line diagnostic testing method because the sampling of the oil, as well as the performance of the analysis, can be carried out during normal operation of the power transformer, in other words without temporary disconnection. [9]
Monitoring of the state of transformers with various on-line sensors built directly into transformers or their insulation has gradually been put forward for the detection of disturbances and sudden faults. The operation of these sensors is based on the measurement of different signals. The development of on-line sensors for monitoring the state of transformers proceeds in two main directions.
- Sensors for a certain typical gas (hydrogen, acetylene), which enable the development trend of the gas in question to be monitored. Certain sensors draw attention to specific disturbances: hydrogen sensors draw attention to partial discharges and discharges of low energy, whereas acetylene sensors, for example, draw attention to discharges of high energy. Neither of the two sensor types mentioned draws attention to a thermal disturbance or thermal damage of the paper insulation.
- More complex sensors, analytical instruments in their own right, detect all gases typical of
certain faults in transformers. These draw attention to disturbances in transformers to the
same extent as results of a DGA and should allow for on-line diagnostics. However, due to
their complexity these instruments belong to an entirely different price class and their
application makes sense only in special cases. Examples of such analysers include
Transformer Nursing Unit (TNU) - a mobile unit for temporary on-line monitoring of typical
gases in critical situations, and Transformer Monitoring & Management System (TMMS)
developed for permanent installation in more important transformers. [9]
Gas chromatographic diagnostics of power transformers based on DGA has been improved
since it was introduced, and it still holds the most important place in the monitoring of
transformers in on-line diagnostics. Experience has led to the conclusion that, in order to
make a proper diagnosis in interpretation of results of a DGA, it is necessary first for each
individual transformer to establish normal concentrations and, where these are exceeded, to
use the method of ratios to determine the type of fault. In addition to these typical ratios, it is
necessary to consider the speed of development of gases, the possibility of contamination and
numerous data from the operation and maintenance of a transformer. On-line gas monitoring
is being increasingly used to complement this. [9]

METHODS OF INTERPRETATION
Dissolved gas analysis (DGA) has long been the standard on-line tool used by engineers to determine the condition of power transformers. The popularity of DGA stems from the fact that testing is performed without disrupting the transformer's operation. When DGA was conceived back in the 1960s, it was heralded as a huge success, with millions of pounds of losses being avoided through the early detection of faults. However, its failures were almost as spectacular as its successes, and it soon became apparent that DGA was by no means a complete solution. Many attempts have since been made to refine the decision process used to guide DGA engineers in their evaluations; such attempts include expert systems and analysis of the data using, e.g., artificial neural networks. Typically, when DGA engineers examine DGA data, they compare the values they have with the decision rules of several traditional analysis methods. The engineers then make subjective decisions and allowances based on what they see: does the data fit any of the decision criteria, and if not, how close is the data to those criteria? [5]
Existing approaches for the interpretation of dissolved gas data relate the gaseous composition to the condition of power transformers via ratio-based schemes or artificial intelligence (AI) techniques. However, these approaches still have some limitations. [6]
o Conventional Approaches
Several renowned DGA interpretation schemes are, for example, the Dörnenburg Ratios, the Rogers Ratios, the Duval Triangle and the IEC Ratios. These schemes have been implemented, either in modified or improvised form, by various power utilities throughout the world. Their implementation requires the computation of several key gas ratios, and fault diagnosis is accomplished by associating the values of these ratios with several predefined conditions of the power transformer. Before subjecting the dissolved gas data to interpretation, a decision has to be made on whether fault diagnosis is necessary, based on the comparison of the dissolved gas concentrations with a set of benchmark concentration values, also referred to as the typical values of gas concentration. If all dissolved gas concentrations are below these typical values, the power transformer concerned can be regarded as operating in a faultless manner. These benchmark values should be calculated from a large historical DGA database, if available, based on the 90 % or 95 % percentiles. Although well received by power utilities, the foregoing ratio-based approaches have several limitations. [6]
Table 2. Key gas ratios of conventional DGA interpretation schemes. [6]

o AI Approaches
Attempts have been made to utilize artificial intelligence (AI) techniques to diagnose transformer condition based on dissolved gas information. The intention of these approaches is to resolve some inherent limitations of the conventional interpretation schemes and to improve the accuracy of diagnosis. Single-AI approaches involve the utilization of only one AI technique; the most common technique in this category is the supervised neural network (NN). Other AI approaches applied to DGA interpretation are of a hybrid nature. Hybrid approaches such as the fuzzy expert system (FES) and the combined expert system (ES) and NN are more promising, because fuzzy logic (FL) or the NN is used to tackle the ambiguity of the conventional DGA interpretation schemes integrated into these approaches, while expert experience is incorporated to improve the credibility of the diagnosis. [6]

Conventional DGA theory

The most important aspect of fault gas analysis is taking the data that has been generated and correctly diagnosing the fault that is generating the detected gases. The three traditional DGA methods are (i) the Rogers Ratio Method (RRM), (ii) Dörnenburg's Ratio Method (DRM) and (iii) the Key Gas Method (KGM). All three methods have their theory rooted in organic chemistry and base their diagnoses on matching the temperature generated by a fault to a general fault type. Put simply, each fault type typically generates a fault temperature within a prescribed range; the more severe the fault, the higher the temperature. Because the insulating oil used in power transformers is organic (i.e., composed primarily of hydrocarbons), certain fingerprint gases are generated in specific temperature ranges, allowing the traditional methods to identify a possible fault temperature range and therefore the possible fault type. The KGM uses four characteristic charts that represent typical relative gas concentrations for four general fault types: overheating of cellulose (OHC), overheating of oil (OHO), partial discharge (PD) and arcing. The other two methods use ratios of the same fingerprint gases to try to pinpoint specific temperature ranges. The fingerprint gases are again carbon monoxide (CO), hydrogen (H2), methane (CH4), ethane (C2H6), ethylene (C2H4) and acetylene (C2H2).
One of the earliest methods is that of Dörnenburg, in which two ratios of gases are plotted on log-log axes (Figure 6). The area in which the plotted point falls is indicative of the type of fault that has developed. [1]

Figure 6. Dörnenburg's Ratio Method [1]

Grey model
In the past, forecasting approaches have mostly used time series techniques such as least-squares regression, or neural network models such as back-propagation neural networks. Generally, these traditional forecasting models need a large amount of input data. However, the DGA of a power transformer is usually performed only once a year by power companies, for reasons of inspection cost, so the historical database is very limited and the traditional forecasting methods are not appropriate in this field. The grey dynamic model (GM model) is particularly designed for handling situations in which only limited data are available for forecasting, while the system environment is not well defined and fully understood. The GM model has proved successful in many forecasting fields. For example, because historical dissolved gas records are few (only one test value per year), a modified grey model (MGM) has been proposed to predict the trend of dissolved gases. Future faults of power transformers can then be identified directly by fault diagnosis techniques, so that the transformers can be switched out safely, improving the reliability of power systems. [7]
The tested results show that the proposed method is simple and efficient. Grey theory describes random variables as changeable interval numbers that vary with time, and uses colour to represent the degree of uncertainty in a dynamic system: a system whose information is completely known is called a white system, a system whose information is completely unknown is called a black system, and a system whose information is partly known and partly unknown is called a grey system. The grey forecasting model is one of the applications of grey theory. Instead of analyzing the characteristics of a grey system directly, the grey model exploits the accumulated generating operation (AGO) technique to outline the system behaviour. The AGO practice may reduce the white noise embedded in the input data. [7]
Due to the lack of sampling data, the grey model is very useful for setting up an accurate forecasting model, since it can work with very little data. According to the field test results, the proposed method can not only provide a highly accurate model of the transformer oil-dissolved gases; it can also be combined with other fault diagnosis methods to extract useful information for future fault analysis. In addition, the calculation of the proposed method is fast and very simple and can easily be implemented in PC software. [7]
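As an illustration of the AGO idea, a minimal sketch of the classical GM(1,1) model is given below. The modified grey model of [7] differs in detail; this is the textbook form, and all names and example values are illustrative:

import numpy as np

def gm11_forecast(series, steps_ahead=1):
    """Classical GM(1,1) grey forecast. Fits the model
    x0(k) + a*z1(k) = b on the accumulated (AGO) sequence and
    extrapolates the original series `steps_ahead` points forward."""
    x0 = np.asarray(series, dtype=float)
    x1 = np.cumsum(x0)                        # AGO sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])             # background mean sequence
    B = np.column_stack((-z1, np.ones_like(z1)))
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    k = np.arange(len(x0) + steps_ahead)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x1_hat[0]
    x0_hat[1:] = np.diff(x1_hat)              # inverse AGO
    return x0_hat[len(x0):]

# Example: five yearly H2 readings (ppm), forecast the next two years
print(gm11_forecast([52, 58, 66, 76, 88], steps_ahead=2))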

Fuzzy model
The criteria used in dissolved gas analysis are based on crisp norm values. Due to the dichotomous nature of crisp criteria, transformers with similar gas-in-oil conditions may lead to very different diagnostic conclusions, especially when the gas concentrations are close to the crisp norms. To deal with this problem, gas-in-oil data of failed transformers have been collected and processed in order to obtain the membership functions of fault patterns using a fuzzy clustering method. All crisp norms are fuzzified into linguistic variables and the diagnostic rules are transformed into fuzzy rules. A fuzzy system is then used to combine the rules and the fuzzy conditions of the transformers to obtain the final diagnostic results. It has been shown that the diagnostic results from the combination of several simple fuzzy approaches are much better than those of traditional methods, especially for transformers whose gas-in-oil conditions are close to the crisp norms. [2]

ER (Evidential Reasoning) Algorithm

A novel approach to the analysis and handling of dissolved gas analysis (DGA) data from several traditional methods, namely the Rogers Ratio Method, Dörnenburg's Ratio Method and the Key Gas Method, is presented in the paper by Spurgeon et al. [5]. Ideas taken from fuzzy set theory are applied to soften the fault decision boundaries employed by each of the three methods. This has the effect of replacing the traditional crisp Fault/No Fault diagnoses with a set of possible fault types (i.e., those fault types distinguishable by one particular method) and an associated probability of fault for each. These diagnoses are then considered as pieces of evidence pertaining to the condition of the transformer and are aggregated using an evidential reasoning (ER) algorithm. The results are presented as probabilities of four possible general fault types: overheating of cellulose (cellulose degradation), thermal faults, partial discharge (corona) and arcing; the remaining belief is assigned to the possibility that no fault exists. The results show that the pseudo-fuzzy representations of the traditional methods perform adequately over a wide range of test values taken from actual failed transformers, and that the overall system can effectively combine the evidence to produce a more meaningful and accurate diagnosis. [5]
The new approach generates subjective judgments such as these by using fuzzy membership functions to soften the decision boundaries currently utilized by the traditional methods. More precisely, crisp decision boundaries imply that the probability of fault can be only zero or one; softening such boundaries using appropriate functions means that the probability of fault can take on any value in the closed interval [0, 1]. By changing the boundaries in this way, the belief of an engineer that a transformer is faulty can be represented by a single value. [5]
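A minimal sketch of this boundary-softening idea, assuming a logistic membership function (the paper's exact membership functions are not reproduced here; the function name and the steepness parameter are illustrative):

import math

def soft_threshold(value, boundary, steepness=5.0):
    """Logistic membership function: returns a belief in [0, 1] that
    `value` exceeds `boundary`, instead of a crisp 0/1 decision."""
    return 1.0 / (1.0 + math.exp(-steepness * (value - boundary) / boundary))

# Crisp rule "C2H4/C2H6 > 1 indicates a thermal fault" becomes a
# graded belief near the boundary:
for ratio in (0.8, 1.0, 1.2):
    print(ratio, round(soft_threshold(ratio, boundary=1.0), 2))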
In the case of DGA there is uncertainty in the accuracy of the diagnoses provided by the
traditional methods. What ER does is provide a mathematical framework for combining such
uncertain information (and subjective judgments). By considering each piece of information
as evidence either supporting or denying a hypothesis, the validity of all possible hypotheses
can be calculated. In the case of DGA, each hypothesis corresponds to a possible fault
condition and the validity to the chance that this may be the condition of the transformer, e.g.
20 % chance the transformer has suffered from or is currently suffering an arcing fault. [5]
Tests clearly show the power of the ER algorithm to combine effectively all of the available evidence from the three diagnosis methods and provide an array of possible faults, mimicking the natural reasoning process of a DGA engineer. They also demonstrate the practicality of using fuzzy membership functions for generating subjective beliefs in a simple manner, based on only two mathematical functions. The potential of this system lies in the fact that, whereas other systems treat the problem as one of classification, ER treats it as one of reasoning based on the DGA data. The flexibility of the tree structure used to make the decision and of the algorithm for combining the evidence means that the system can easily be extended to encompass new diagnosis techniques by simply adding extra branches parallel to the ones currently used. [5]

CONCLUSIONS
The technology presently exists, and is being used, to detect and determine fault gases below the part-per-million level. However, there is still much room for improvement in the technique, especially in developing the methods of interpreting the results and correlating them with incipient faults. It is also important to realize that, even though there is further need for improvement, the analysis of dissolved fault gases represents a practical and effective method for the detection of incipient faults and the determination of their severity. In addition to utility companies, many industries and installations that have on-site transformers are recognizing that dissolved fault gas analysis is an extremely useful, if not essential, part of a well developed preventative maintenance program. [1]
The obvious advantages that fault gas analyses can provide are summarized below [1]:
1. Advance warning of developing faults
2. Determining the improper use of units
3. Status checks on new and repaired units
4. Convenient scheduling of repairs
5. Monitoring of units under overload


REFERENCES
[1] DiGiorgio, J.B., Dissolved Gas Analysis of Mineral Oil Insulating Fluids, NTT Technical Bulletin. 28.6.2005, available at www.nttworldwide.com/tech2102.htm
[2] An-Pin Chen & Chang-Chun Lin, Fuzzy approaches for fault diagnosis of transformers, Fuzzy Sets and Systems, 2001, Vol 118, No 1, pp. 139-151.
[3] Duval, M. & dePabla, A., Interpretation of gas-in-oil analysis using new IEC publication 60599 and IEC TC 10 databases, IEEE Electrical Insulation Magazine, 2001, Vol 17, No 2, pp. 31-41.
[4] Duval, M., A review of faults detectable by gas-in-oil analysis in transformers, IEEE Electrical Insulation Magazine, 2002, Vol 18, No 3, pp. 8-17.
[5] Spurgeon, K., Tang, W.H., Wu, Q.H., Richardson, Z.J. & Moss, G., Dissolved gas analysis using evidential reasoning, IEE Proceedings on Science, Measurement and Technology, 2005, Vol 152, No 3, pp. 110-117.
[6] Thang, K.F., Aggarwal, R.K., McGrail, A.J. & Esp, D.G., Analysis of power transformer dissolved gas data using the self-organizing map, IEEE Transactions on Power Delivery, 2003, Vol 18, No 4, pp. 1241-1248.
[7] Wang, M.H. & Hung, C.P., Novel grey model for the prediction of trend of dissolved gases in oil-filled power apparatus, Electric Power Systems Research, 2003, Vol 67, No 1, pp. 53-58.
[8] Ward, S.A., Evaluating transformer condition using DGA oil analysis, Conference on Electrical Insulation and Dielectric Phenomena, 2003, Annual Report, pp. 463-468.
[9] Varl, A., On-line diagnostics of oil-filled transformers, Proceedings of the 14th IEEE International Conference on Dielectric Liquids, ICDL 2002, 7-12 July 2002, pp. 253-257.
[10] http://www.powertech.bc.ca/cfm quoted 30.6.2005



DIELECTRIC DIAGNOSTICS MEASUREMENTS OF TRANSFORMERS


Xiaolei Wang
Institute of Intelligent Power Electronics
Helsinki University of Technology
Otakaari 5 A, FIN-02150 Espoo, Finland
Phone: +358 9 451 4965, Fax: +358 9 451 2432
E-mail: xiaolei@cc.hut.fi
TABLE OF CONTENTS
1. Introduction
2. Transformer insulation structure
3. Dielectric diagnostics measurement method
   3.1 Return Voltage Measurement (RVM)
   3.2 Polarization and Depolarization Current method (PDC)
   3.3 Frequency Domain Spectroscopy (FDS)
   3.4 Comparison and conclusion on three dielectric measurement approaches
   3.5 Influences on dielectric measurement
4. Conclusions
5. References

1. INTRODUCTION

During recent years, the diagnostics of power system equipment, for example transformers, has gained great research attention. Diagnostics refers to the interpretation of data and off-line measurements on transformers. It is normally used either for determining the actual condition of a transformer or as a response to a received warning signal [1]. Transformer diagnostics approaches are becoming more and more advanced, and they can be classified by fault type, e.g., thermal, dielectric, mechanical, and degradation-related faults. Several methods can be employed to detect ageing phenomena of transformers, such as partial discharge measurement and gas-in-oil analysis. The Dielectric Diagnostics Method (DDM) is one distinguished approach to assessing the ageing condition of transformers; it represents a family of methods used for the characterization of dielectric materials as well as practical insulation systems [2].
The focus of this report is on the basic principle of DDM as well as its three important measurement methods. Firstly, the general insulation structure of transformers is described in the following section. The three dominant DDM-based measurement approaches,
Return Voltage Measurement (RVM), Polarization and Depolarization Current (PDC), and
Frequency Domain Spectroscopy (FDS), will be introduced in section 3. Finally, there will
be a short conclusion and discussion.
2. TRANSFORMER INSULATION STRUCTURE

The life span of a transformer highly depends on its insulation system, an amalgam of different insulating materials, processes, and interactions. Generally, the insulation system of power transformers consists mostly of mineral oil and cellulose paper. As the transformer ages, the oil/paper insulation degrades due to thermal, oxidative and hydrolytic factors. One ageing indicator is the water content in the solid part of the insulation. Increased water content accelerates the deterioration of cellulose through depolymerisation, and can also cause bubble formation resulting in electrical breakdown [3].
To estimate the humidity of transformer insulation, one needs a data library of the dielectric properties (ε, σ_DC, and f(t)) of oils and pressboard at different humidity contents. This data library is necessary to calculate the dielectric response of the composite duct insulation and for comparison with the measurement results. In the frequency domain, the transformer material is characterized by a complex, frequency- and temperature-dependent permittivity, as shown in (1).

ε(ω, T) = ε′(ω, T) - i ε″(ω, T)    (1)

In the frequency range of interest, the real part for transformer oil is constant (ε_r = 2.2), and the imaginary part is dominated by the contribution from the DC conductivity. In the time domain, however, the material is characterized by the power frequency permittivity (ε_r), the DC conductivity (σ_DC), and the dielectric response function f(t) [4].
In addition, the dielectric response is influenced by the insulation structure, as shown in
Fig.1 [5]. In the winding configuration (Fig.1 (a)), the section of insulation duct consists of
cylindrical shells of pressboard barriers separated by axial spacers. In the modeling of the
combination of oil and cellulose in the duct (Fig.1 (b)), parameters X and Y are defined as
the relative amount of solid insulation (barriers) and spacers respectively. Generally, the
barriers fill 20-50% of the main duct, and the spacers fill 15-25% of the circumference.
Based on the abovementioned material and geometric properties of the composite system, the dielectric response can be derived. In the time domain (PDC and RVM methods), the calculation is based on the known response function f(t), which also depends on temperature and humidity. In the frequency domain (FDS method), the composite dielectric permittivity of the insulation duct, ε_duct, is calculated as the following function [5]:

ε_duct(ω, T) = Y / [ (1 - X)/ε_spacer + X/ε_barrier ] + (1 - Y) / [ (1 - X)/ε_oil + X/ε_barrier ]    (2)

Fig. 1. (a) Section of an insulation duct of a power transformer with barriers and spacers,
(b) schematic representation of the barrier content and the spacer coverage in the insulation duct.
On the right side of (2), the permittivities of the oil, spacers, and barriers, deduced from measurements on the insulation model, are complex quantities dependent on frequency, temperature, and humidity. The material properties are varied until a good fit with the measured values is achieved [6].
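The following is a minimal Python sketch of the duct model in (2), assuming the component permittivities are already known at one frequency and temperature; the complex values and the X and Y fractions below are hypothetical.

def eps_duct(eps_oil, eps_barrier, eps_spacer, X, Y):
    """Composite duct permittivity following (2); all inputs may be complex."""
    spacer_path = Y / ((1 - X) / eps_spacer + X / eps_barrier)
    oil_path = (1 - Y) / ((1 - X) / eps_oil + X / eps_barrier)
    return spacer_path + oil_path

# Example: barriers fill 30 % of the duct (X), spacers 20 % of the circumference (Y).
print(eps_duct(eps_oil=2.2 - 0.5j, eps_barrier=4.4 - 0.2j,
               eps_spacer=4.4 - 0.2j, X=0.30, Y=0.20))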
3. DIELECTRIC DIAGNOSTICS MEASUREMENT METHOD

It is well known that one important application of the Dielectric Diagnostics Method (DDM) is to assess the humidity in the insulation system of a transformer. Due to its oil/paper insulation structure, a transformer shows the characteristics of both polarization and conduction. DDM works in a field dominated by interfacial polarization at the borders between cellulose and oil, and by the conductivities of cellulose and oil. Fig. 2 illustrates a basic diagram for dielectric measurements [7]. In Fig. 2, the Low Voltage (LV) and High Voltage (HV) winding terminals of the transformer are connected together as a two-terminal test object to the DDM instrument.
DC voltage, DC current, AC voltage, and AC current can be measured to evaluate dielectric response phenomena, from which three widely employed dielectric diagnostics measurement methods have been developed:
- the Recovery (Return) Voltage Method (RVM),
- dielectric spectroscopy in the time domain, i.e., the Polarization and Depolarization Currents method (PDC),
- Frequency Domain Spectroscopy (FDS), i.e., measurement of electric capacitance and loss factor as functions of frequency.

Fig. 2. Basic circuit diagram for dielectric measurements.


DC voltage measurements are applied as recovery voltage measurements after charging the insulation with a DC voltage. The derived diagnostic method is called the RVM. A series of recovery voltage measurements with increasing charging time leads to the so-called polarization spectrum, which is commonly used to evaluate the moisture content of cellulose. DC current measurements record the charging and discharging currents of the insulation; these are known as the PDC. AC voltage and current measurements are derived from the well-known tangent delta measurements, but the frequency range is much extended, especially towards low frequencies, e.g. 0.1 mHz. The derived measurement method is the FDS [8].
In this section, we will discuss these three methods in detail.
3.1 Return Voltage Measurement (RVM)

Dielectric measurements can be performed in both the frequency and time domains. The characteristic feature of time domain measurement methods is the application of a step voltage across the sample. Figure 3 shows a simplified diagram of a time domain dielectric response measurement for a power transformer. For the RVM, s3 is closed and s1, s2 are open [9]. In this case, the RVM instrument is coupled to the low voltage terminal while the high voltage windings and the tank are grounded. The principle of the measurement is that the test object is first charged for a given time, then discharged for half the charging time, and after that the return voltage is measured under open circuit conditions.
Assume a DC step voltage is applied to the initially completely discharged system:

U(t) = 0       for t < 0
U(t) = U0      for 0 ≤ t ≤ t1        (3)
U(t) = 0       for t > t1
Fig. 3. Simplified Diagram of Dielectric Response Measurement.
When the step voltage is applied during the period 0 ≤ t ≤ t1, the charging (polarization) current is generated:

i_pol(t) = C0 U0 [ σ/ε0 + f(t) ]    (4)

where ε0 = 8.854 × 10^-12 F/m is the permittivity of free space, C0 is the geometric capacitance of the test object, and σ is the average conductivity of the composite insulation system. The response function of the composite insulation, f(t), describes the fundamental memory property of the dielectric system and can provide significant information about the insulation material. Afterwards the step voltage is disconnected from the insulation (grounded), and the discharging (depolarization) current is generated, as shown in (5) [10].

i_depol(t) = -C0 U0 [ f(t) - f(t + t1) ]    (5)

The source of the recovery voltage is the relaxation of the remaining polarization in the insulation system, giving rise to an induced charge on the electrodes. The polarization spectrum is established by performing a series of recovery voltage measurements with stepwise increasing charging and discharging times. For each sequence in the spectrum, the peak of the recovery voltage, Vr, as well as the initial rate of rise of the recovery voltage, dVr/dt, are recorded and plotted versus the charging time used [5].
Figure 4 demonstrates such a polarization spectrum. In this RVM, the transformer is charged initially for 0.5 seconds; in every subsequent cycle the charging time is doubled, up to 1024 seconds. The ratio of charging to discharging time is a constant two. Charging and discharging currents and return voltage data are recorded for every test cycle; the resulting schedule is sketched below.
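As an illustration, the following Python sketch generates the charge/discharge schedule described above; no instrument interface is implied.

# Each cycle: charge, discharge for half the time, then measure return voltage.
t_charge = 0.5                           # first charging time in seconds
while t_charge <= 1024:
    t_discharge = t_charge / 2           # constant charge/discharge ratio of two
    print(f"charge {t_charge:7.1f} s, discharge {t_discharge:6.2f} s, "
          f"record peak Vr and initial slope dVr/dt")
    t_charge *= 2                        # charging time doubles every cycle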

Fig. 4. Dielectric response measurement of return voltage.


A return voltage spectrum, indicating the insulation condition, can be drawn from the return voltage and the central time constant of each test cycle. Figure 5 illustrates such return voltage spectra; the operation times of the corresponding transformers are listed in Table 1. The peak in the spectrum provides an indication of the insulation condition. The spectrum also provides the range of the response in the time domain [9].

Fig. 5. Typical return voltage spectra.


Table 1. Transformer operation time in Fig. 5.

Transformer    T4    T5    T6
Age (year)     38    33    -

3.2 Polarization and Depolarization Current method (PDC)

The polarization current is also obtained by applying a step voltage between the HV and LV windings for a certain time. The charging current of the transformer capacitance, i.e. the insulation material, called the polarization current, is generated. The depolarization current is measured with the power supply removed. The connection for the PDC is the same as that of the RVM in the time domain measurement illustrated in Fig. 3. In this case, the polarization current measurement is performed with s1 closed and s2 and s3 open. For the depolarization current measurement, s2 is closed, and s1 as well as s3 are open. Figure 6 demonstrates the current waveform from the instant of voltage application, which decreases during the polarization to a certain value given by the conductivity of the insulation system [9].

Fig. 6. Polarization and depolarization current waveform.


From (4) and (5) we can conclude that both the polarization and depolarization currents contain the dielectric response function. The DC conductivity of the test object can be estimated from the measurements of the polarization and depolarization currents; a numerical sketch of this estimate follows Fig. 7. However, it is easier to employ the depolarization current, because no DC current is involved. Figure 7 shows the measured polarization currents from some moisture-conditioned samples. Clearly, the amplitude of the long-term DC polarization current is quite sensitive to the moisture content in the paper insulation, which demonstrates that polarization current measurement is capable of assessing the insulation moisture level [9].

Fig. 7. Measured polarization current from moisture conditioned samples.
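The following Python sketch illustrates the conductivity estimate implied by (4) and (5): for a sufficiently long charging time t1, f(t + t1) is negligible, so σ is roughly ε0 (i_pol - |i_depol|)/(C0 U0). The capacitance, voltage, and current values are hypothetical.

EPS0 = 8.854e-12                         # permittivity of free space, F/m

def dc_conductivity(i_pol, i_depol, C0, U0):
    """sigma ~ EPS0 * (i_pol - |i_depol|) / (C0 * U0) for long charging times."""
    return EPS0 * (i_pol - abs(i_depol)) / (C0 * U0)

# Hypothetical readings taken at the same time t after switching:
# 3 nF geometric capacitance, 200 V step, currents in amperes.
sigma = dc_conductivity(i_pol=1.2e-9, i_depol=-0.4e-9, C0=3e-9, U0=200.0)
print(f"estimated DC conductivity: {sigma:.2e} S/m")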




3.3 Frequency Domain Spectroscopy (FDS)

In the frequency domain measurement, a sinusoidal voltage is applied, and the complex dielectric constant is determined from the amplitude and phase of the measured current flowing through the test object. The dielectric susceptibility is related to the time domain response function through the following Fourier transform [11]:

χ(ω) = χ′(ω) - i χ″(ω) = ∫₀^∞ f(t) e^(-iωt) dt    (6)

The susceptibility is a complex function of frequency and is related to the relative permittivity as shown in (7)-(9), where ε_∞ is the high-frequency limit of the real permittivity:

ε_r(ω) = ε_r′(ω) - i ε_r″(ω) = ε_∞ + χ′(ω) - i [ χ″(ω) + σ/(ε0 ω) ]    (7)

ε_r′(ω) = ε_∞ + χ′(ω)    (8)

ε_r″(ω) = χ″(ω) + σ/(ε0 ω)    (9)

In (7), the imaginary part of the complex relative permittivity (the loss part) contains both the resistive (DC conduction) losses and the dielectric (polarization) losses; at a given frequency it is impossible to distinguish between the two. However, the resistive part is dominant at low frequencies. In that region, the imaginary part of the complex relative permittivity has a slope of -1 (on a log-log scale) and the real part is constant. Based on this, the conductivity of the test object can be calculated. Another way of presenting the measured information of an FDS is to use the loss factor [5].
The loss factor is the frequency-dependent ratio of the imaginary and real parts of the complex permittivity, as shown in (10) [11]. Its independence of the test object geometry makes the loss factor important to study when the object geometry is unknown.
tan δ = ε_r″(ω) / ε_r′(ω)    (10)

Figure 8 shows the loss factor of four units as a function of frequency. Clearly, T11 (1973) is similar to T12 (1973), and T41 (1977) is similar to T42 (1977). The main reason for the difference between the pairs is that the oil conductivity of T11/T12 is higher than that of T41/T42. With the oil conductivity fixed, the influence of the moisture content of the cellulose is illustrated in Fig. 9. The moisture content is around one per cent in T11/T12, and lower in T41/T42 [6]. A numerical sketch of this low-frequency behaviour follows.
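The following Python sketch reproduces the low-frequency behaviour discussed above: assuming a constant real permittivity and a purely conductivity-dominated loss, tan δ grows towards low frequencies with a slope of -1 on a log-log scale. Both parameter values are assumptions.

import math

EPS0 = 8.854e-12                         # permittivity of free space, F/m
eps_real = 3.5                           # assumed effective real permittivity
sigma = 1e-12                            # assumed DC conductivity, S/m

for f in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
    w = 2 * math.pi * f
    eps_imag = sigma / (EPS0 * w)        # resistive (DC conduction) losses only
    print(f"f = {f:8.3f} Hz   tan(delta) = {eps_imag / eps_real:.3e}")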

Fig. 8. Loss factor as a function of frequency for different transformers.

Fig. 9. Loss factor for different moisture contents of the cellulose.


3.4 Comparison and conclusion on three dielectric measurement approaches

Nowadays all three methods are widely employed for the diagnostics of power transformer insulation, using commercially available instruments or laboratory test set-ups. The RVM method is useful but more sensitive to systematic errors than the others. It is a high impedance input method, and leakage currents on the bushings can easily corrupt the measurements. The PDC is basically a non-destructive method, similar to the insulation resistance measurement. If the currents are low, the method can be sensitive to offset currents and interference in the field. The oil conductivity can be deduced from the initial current, and the pressboard conductivity, which is affected by moisture, from the final current. The FDS method has the advantage of including the loss factor and capacitance measurements, so it is easy to detect faults in the test set-up. Its inherently small bandwidth makes the method relatively insensitive to interference, and there is no need for a high voltage power supply [6].



3.5 Influences on dielectric measurement

Although dielectric diagnostics techniques have improved in recent years, there are still factors that affect the reliability of the measurements and the stability of the analysis results. These influences are also the main sources of error. The most essential of them are listed below [7]:
- Insulation temperature
- Migration processes
- Decreasing oil conductivity
- Parallel current paths
- Temperature compensation in analysis software
- Interpretation of measurement data
- Comparison to moisture equilibrium charts
- Measuring time
4. CONCLUSIONS

Due to their remarkable characteristics, the RVM, PDC, and FDS are the three dominant dielectric diagnostics approaches for assessing power transformer insulation. In this report, their basic principles and analysis results have been discussed. The analysis demonstrates that these diagnostics methods are useful for the off-line assessment of power transformer insulation. However, the approaches are still influenced by various conditions and error sources, and recent research increasingly addresses the effect of these conditions on the measurements.
5. REFERENCES

[1] C. Bengtsson, "Status and trends in transformer monitoring", IEEE Transactions on Power Delivery, vol. 11, no. 3, pp. 1379-1384, 1996.
[2] U. Gäfvert, "Dielectric response analysis of real insulation systems", in Proceedings of the IEEE International Conference on Solid Dielectrics, Toulouse, France, July 2004, pp. 1028-1037.
[3] Y. M. Du, B. C. Zahn, A. V. Lesieutre, and S. R. Lindgren, "A review of moisture equilibrium in transformer paper-oil systems", IEEE Electrical Insulation Magazine, vol. 15, no. 1, pp. 11-20, 1999.
[4] A. K. Jonscher, Dielectric Relaxation in Solids. London, UK: Chelsea Dielectrics Press, 1984.
[5] CIGRE, "Dielectric response methods for diagnostics of power transformers", IEEE Electrical Insulation Magazine, vol. 19, no. 3, pp. 12-18, 2003.
[6] U. Gäfvert, L. Adeen, M. Tapper, P. Ghasemi, and B. Jonsson, "Dielectric spectroscopy in time and frequency domain applied to diagnostics of power transformers", in Proceedings of the International Conference on Properties and Applications of Dielectric Materials, Xi'an, China, June 2000, pp. 825-830.
[7] K. Maik, and F. Kurt, "Reliability and influences on dielectric diagnostic methods to evaluate the ageing state of oil-paper insulations", in Proceedings of the International Conference on Advances in Processing, Testing, and Application of Dielectric Materials, Wrocław, Poland, September 2004.
[8] W. S. Zaengl, "Dielectric spectroscopy in time and frequency domain for HV power equipment, part I: theoretical considerations", IEEE Electrical Insulation Magazine, vol. 19, no. 5, pp. 9-22, 2003.
[9] T. Y. Zheng, and T. K. Saha, "Analysis and modeling of dielectric responses of power transformer insulation", in Proceedings of the IEEE Power Engineering Society Summer Meeting, Chicago, IL, July 2002, pp. 417-421.
[10] T. K. Saha, and P. Purkait, "Effects of temperature on time-domain dielectric diagnostics of transformer", Australian Journal of Electrical & Electronics Engineering, vol. 1, no. 3, pp. 157-162, 2004.
[11] T. K. Saha, "Review of modern diagnostic techniques for assessing insulation condition in aged transformers", IEEE Transactions on Dielectrics and Electrical Insulation, vol. 10, no. 5, pp. 903-917, 2003.


ASSET MANAGEMENT IN POWER SYSTEMS

CONDITION MONITORING OF GENERATOR SYSTEMS

Matti Heikkilä
Vaasa
matti.heikkila@pvo.fi

CONTENTS
1. INTRODUCTION
2. ON-LINE CONTINUOUS MEASUREMENTS
   2.1 Vibration Measurements
   2.2 Temperature Measurements
   2.3 Protection Relays
   2.4 Measurement of Shaft Movement
3. CONDITION MONITORING
   3.1 On-line Partial Discharge (PD) Measurements
   3.2 Measuring of Insulation Resistance
   3.3 Mechanical Inspection
   3.4 Tan Delta Monitoring System
4. REFERENCES

1. INTRODUCTION
The generator is one of the prime devices in a power plant. Arranging the condition monitoring of the generator is extremely important for avoiding failures, because repairing a generator takes a very long time and the costs of the lost production are significant. With condition monitoring systems it is possible to follow the generator condition and to plan the revision at the right time.
Normally there are many measuring devices in a generator which follow the condition of the generator during production. Such measurements are e.g.:
- Vibration measurements
- Temperature measurements of generator windings
- Protection relays
- Measurement of generator shaft movement
In addition to the previous measurements, it is normal that during the revision the resistance of the generator insulation is measured and mechanical inspections are made.
The modern condition monitoring system is an on-line partial discharge (PD) monitoring system. The aim of partial discharge measurement on the windings of rotating electrical machines is the assessment of the insulating condition of the windings. The measurement and subsequent analysis of PD generate important information, which draws attention to the types of defects that may occur due to local over-stressing within an insulating system. These measurements are therefore well suited for the non-destructive diagnosis of various dielectric materials, e.g. the insulation of generator windings. With regard to risk assessment and early planning of preventive maintenance for rotating electrical machines, it is necessary to get reliable information on the type and intensity of the partial discharges and therefore on the actual condition of the winding insulation. Without meaningful diagnostic measurements, a critical PD activity may, over a short or medium time range, lead to an unexpected machine failure, which causes considerable expense due to unscheduled downtime and the replacement and/or reconditioning of damaged components.
Figure 1 shows an example of an air-cooled generator on which all the measurements mentioned in this seminar report have been made.

Figure 1. Air-cooled generator, 250 MW, 277 MVA, 15 kV.

2. ON-LINE CONTINUOUS MEASUREMENTS


2.1 Vibration Measurements
If the generator vibration is very high, it may damage the shaft bearings and the mechanical construction of the generator. It is therefore normal to install vibration measurement sensors (axial and radial x/y) in the generator bearings. Limits for the permitted vibration level are set in the automatic control system: the automation system gives an alarm if the limit is exceeded, and a shutdown command if the vibration is too high, as sketched below.
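A minimal Python sketch of this two-level supervision logic, with hypothetical limit values (in practice they come from the machine supplier and the applicable standards):

ALARM_LIMIT = 7.1                        # mm/s rms, hypothetical alarm level
TRIP_LIMIT = 11.0                        # mm/s rms, hypothetical shutdown level

def supervise(vibration_mm_s):
    if vibration_mm_s >= TRIP_LIMIT:
        return "SHUTDOWN"                # automation system trips the unit
    if vibration_mm_s >= ALARM_LIMIT:
        return "ALARM"                   # operator is notified
    return "OK"

print(supervise(8.3))                    # -> ALARM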
2.2 Temperature Measurements
The temperature of the generator windings is followed by temperature measurement sensors (normally Pt100 sensors), which are connected to the automatic control system, where the limits for alarm and shutdown commands are set.

2.3 Protection Relays


The generator and its electrical systems are protected by protection relays. A modern generator is equipped with the following protection relays:
Generator over current
Generator over voltage
Generator under voltage
Generator differential protection
Generator overload
Generator reverse power
Generator pole slipping
Generator under excitation
Generator minimum impedance
Generator negative phase sequence
Generator under frequency
Generator over frequency
Generator stator earth fault 90 % and 100 %
Generator circuit breaker failure
Transformer bus earth fault
Unit transformer over current
Unit transformer differential protection
Unit transformer oil temperature
Unit transformer winding temperature
Unit transformer tap changer failure
Unit transformer gas relay
Unit transformer over pressure
Block transformer differential protection
Block transformer gas relay
Block transformer oil temperature
Block transformer winding temperature
Overall differential protection
2.4 Measurement of Shaft Movement
The generator and turbine are equipped with a shaft movement measurement system. High temperature, vibration and other factors can cause the shaft to move slightly or to expand with heat, in which case the bearings do not function well. The shaft movement measurement gives an alarm or a shutdown command if needed.
3. CONDITION MONITORING
The generator condition must be followed and measured both during operation and during revision. Chapter 2 explained the normal on-line measuring systems. The modern condition monitoring system, the partial discharge (PD) monitoring system, is explained in Chapter 3.1. Chapter 3.2 explains the insulation resistance measurement, Chapter 3.3 the mechanical inspections, and Chapter 3.4 the tan delta measurements.
3.1 On-line Partial Discharge (PD) Measurements
Partial discharges are small sparks which occur in high voltage insulation systems. Partial discharges (PD) in electrical machines arise
- in cavities in insulation systems when subjected to high electric fields,
- in overstressed gas-filled gaps in the vicinity of windings,
- in the area of faulty anti-corona systems.
Considered individually, the partial discharges in electrical machines are harmless. However, when high energy partial discharges occur permanently, they can destroy any insulation system. The result is dielectric breakdown leading to machine failure.
Partial discharge (PD) monitoring is the continuous acquisition and analysis of electrical discharges in an object under voltage stress. The aim of partial discharge monitoring on rotating electrical machines in operation is the early recognition of discharge phenomena which are dangerous for operation.
The partial discharge monitoring system very sensitively detects and analyses the location and cause of partial discharges arising in the machine while in operation. Partial discharge phenomena cannot be recognized with any of the conventional protection systems.
Partial discharge monitoring measurements are particularly recommended
- for key machines such as generators,
- for planning condition-based maintenance,
- where enhanced operational reliability is called for,
- for fully automatic plant operation,
- in cases of severe operating conditions.
The partial discharge system requires couplers connected to the machine to be monitored, a PD signal processor, and a data acquisition and analyzing system, as shown in Figure 2.

Figure 2. Partial Discharge Monitoring System. [16]


The partial discharge signals are decoupled by a special high-frequency current transformer or a capacitive coupler on the low-voltage side (star point) of the stator winding and, depending on the machine, also by capacitive couplers on the high-voltage side. The measured signals are transmitted through special screened cables to the PD signal processor. The data acquisition and analyzing system is permanently connected to the PD signal processor. The system is also equipped to record the voltages of the objects being monitored.
The individual components of the PD monitoring system are [16]:
- Couplers: capacitive couplers and high-frequency current transformers
- PD signal processor: a multiplexer for many high-frequency channels and voltage channels, including amplifier and filter
- Data acquisition and analyzing system: components for digitizing and processing the measured data, and for controlling all system components and functions. In this unit, disturbances are automatically eliminated; characteristic values and trends are calculated and indicated.
- Screened high-frequency signal cable with TNC/BNC connectors
- Matching unit: only needed if the couplers are only periodically connected to the measurement system
- PD calibrator: needed during installation only

How the PD measuring system works [16]:


The signals originating in the insulation system are decoupled by the couplers and transmitted to the PD signal processor, where the analogue signals are prepared for fast scanning. The filters and amplifiers are set automatically by the computer of the data acquisition and analyzing system. Characteristic values, in conformity with IEC Publication 60270, are acquired by the data acquisition and analyzing system.
After this processing, the signals are digitized at a scanning rate of up to 40 million measurements per second (40 MHz), tuned to the filter characteristic of the PD signal processor, in the data acquisition and analyzing unit. After a complex noise recognition there follows automatic suppression of time- and frequency-stable and generator-specific interference. Analytic processes then statistically evaluate the numerous partial discharges and reduce the immense amount of data. Besides the usual internationally known evaluation criteria, additional quantities describing the distributions are also determined and stored.
The processing result uses three dimensions, as can be seen in Figure 4:
- Phase angle (deg)
- Number of pulses n
- Apparent charge q (nC)
Long-time trends of these characteristic values permit a continuous comparison with reference values. Should the measured values exceed the reference values, an alarm is immediately initiated. The results of the analysis supply information on the kind, origin, intensity, and frequency of occurrence of the discharges. This information is then evaluated to determine its influence on the operational reliability of the machine. A minimal sketch of how such a pattern is assembled follows.
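The following Python sketch shows, under simplifying assumptions, how such a pattern can be assembled: each PD pulse record is binned by phase angle and apparent charge, and the pulse count per bin gives the third dimension. The bin counts and pulse data are hypothetical.

from collections import Counter

def phi_q_n(pulses, phase_bins=36, q_bins=20, q_max_nc=10.0):
    """Bin (phase angle, apparent charge) pulse records into a phi-q-n pattern."""
    pattern = Counter()
    for phase_deg, q_nc in pulses:
        i = int((phase_deg % 360.0) // (360.0 / phase_bins))
        j = min(int(q_nc / q_max_nc * q_bins), q_bins - 1)
        pattern[(i, j)] += 1             # n = number of pulses in this cell
    return pattern

# Hypothetical pulse records: (phase angle in degrees, apparent charge in nC)
pulses = [(45.0, 0.8), (47.5, 0.9), (225.0, 1.1), (46.0, 0.7)]
print(phi_q_n(pulses))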
The benefits of a PD measuring system:
- Partial discharge monitoring provides high operational reliability for the machine.
- The PD monitoring system supplies important information on the actual state of the machine in operation and on any critical changes arising. This information cannot be detected by any other protective or monitoring system.
- When the results are positive, the time until the next maintenance inspection can be extended. When the results are critical, the cause of the fault can be recognized and eliminated at an early stage before an actual failure occurs in the machine.
- Failure costs can be avoided and maintenance costs minimized, so that the investment in a PD monitoring system pays for itself in a short time.
Examples of PD measuring:
Figure 3 shows the measuring arrangements for the air-cooled generator of 250 MW, 277 MVA, 15 kV:
- measuring with the generator at half load without inductive load,
- measuring with the generator at half load with 80 MVAr inductive load,
- measuring with the generator at full load without inductive load,
- measuring with the generator at full load with 80 MVAr inductive load.
[Figure content: VL2 planned PAMOS measurement time schedule, 21.4.2002; generator output (MW) versus time of day for the test, normal, and inductive output periods.]

Figure 3. Measuring arrangements for PD measurement.


Figure 4 shows the measuring results.
The normally visible inner PDs are superimposed by the base noise of the machine. This is a sign of very low PD activity, especially of the inner PDs. Inner discharges are typical for all mica-based and resin-impregnated insulation systems. The sources are small gaps and voids within the ground wall insulation. Since mica-based insulation is highly PD resistant, such inner PDs are regarded as rather harmless. They are classified as uncritical because of their low level, which is within the same range as the base noise band. PD sources coming from the insulation system of the machine cannot be detected. Therefore the measured generator seems to be in good condition at the time of the measurement.
Needle-shaped disturbances from the excitation device superimpose all PD readings. These disturbances are narrow and appear at a low level, so they do not mask any real PD phenomena from the machine. These disturbances are marked in the measuring results of Figure 4.

Figure 4. PD Measuring results of air cooled generator with full load 235 MW and minimum
reactive load 4 MVAr.

3.2 Measuring of Insulation Resistance


At least once per year, for example during the revision or during a long shutdown, it is good to measure the insulation resistance of the generator windings. Figure 5 shows one example of an insulation resistance measurement.
[Figure content: VL3 insulation resistance (MΩ) as a function of time (min) over a one-hour measurement.]

Figure 5. Generator windings insulation resistance measurement.


The insulation resistance measurement was made with a Metriso 5000 measuring device at a voltage of 2500 V, and the measuring time was one hour. The measuring time must be long enough for the insulation resistance to reach its final value; a simple stabilization check is sketched below.
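A minimal Python sketch of such a stabilization check; the readings and the 2 % tolerance are hypothetical, and a real procedure would follow the relevant measurement standard.

def is_stabilized(readings_mohm, tol=0.02):
    """True once the last two readings differ by less than tol (relative)."""
    if len(readings_mohm) < 2:
        return False
    prev, last = readings_mohm[-2], readings_mohm[-1]
    return abs(last - prev) / prev < tol

# Hypothetical one-per-minute readings in megaohms during the measurement:
readings = [2000, 4500, 6800, 8600, 9700, 10200, 10350, 10400]
print(is_stabilized(readings))           # -> True, final value reached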
3.3 Mechanical Inspection
During a generator revision it is necessary to make mechanical inspections of the generator. Which of the following inspections are made depends on the running hours of the generator:
- Mechanical inspection for breaks and cracks in the different mechanical constructions.
- Retaining rings must be checked with ultrasonic devices after about 70 000 running hours if they are made of the old material. New generators use a new material (18% Mn / 18% Cr), and such retaining rings no longer need checking, because the new material does not split or break.
- Generator stator winding slots and end-windings must be checked after about 50 000 running hours. Stator windings may get loose, and they must then be tightened again.
3.4 Tan Delta Monitoring System
Tan delta measurements are most suitable for the oil-paper insulated high voltage (HV) cables used in transformers, generators, bushings and other electrical devices. For a new generator, which uses mica insulation and other new insulation materials, it is better to use partial discharge measurements.


The tan delta measurement is a comparative method in which the reference measurement
device is normally constructed from high voltage capacitors. The measured values are also
compared to previous measurements of examined cables or devices.


Figure 6. Basic arrangement for the tan delta measurement of C. [14]


HV = Operating Voltage
C = Capacitance of Insulation (tan delta) (corresponding to dielectric losses of insulation)
Ca = Additional Capacitance for Signal Conditioning
Ur = Reference Voltage
Um = Measurement Signal Voltage
MS = Monitoring System


4. REFERENCES
1. Miomir U. Kotlica, On-Line Monitoring of Power Generator Systems, KES International Ltd, Toronto, Canada, 1998.
2. Yunsok Lim, Jayoon Koo, Joenseon Lee, Wonjong Kang, Chaotic Analysis of Partial Discharge (CAPD): A novel approach to identify the nature of PD source, SMDT Lab., Dept. of Electrical Engineering, Hanyang University, Korea, 2001.
3. Y. Han and Y. H. Song, Condition Monitoring Techniques for Electrical Equipment: A Literature Survey, IEEE Transactions on Power Delivery, Vol. 18, No. 1, January 2003.
4. P. H. F. Morshuis, R. Bodega, M. Lazzaroni, F. J. Wester, Partial Discharge Detection Using Oscillating Voltage at Different Frequencies, Delft University of Technology, The Netherlands, 2002.
5. X. Ma, C. Zhou and I. J. Kemp, Wavelet for the Analysis and Compression of Partial Discharge Data, School of Engineering, Science and Design, Glasgow Caledonian University, UK, 2001.
6. Ümmühan Basaran, Mehmet Kurran, The Strategy for the Maintenance Scheduling of the Generating Unit, Anadolu University, Eskisehir, Turkey, 2003.
7. A. M. Leite da Silva, G. J. Anders, L. A. F. Manso, Generator Maintenance Scheduling to Maximize Reliability and Revenue, IEEE Porto Power Tech Conference, 2001.
8. S. Cherukupalli, R. A. Huber, C. Tan, G. L. Halldorson, Application of Some Novel Non-Destructive Diagnostic Tests for Condition Assessment of Stator Coils and Bars Following Voltage Endurance Tests, Conference Record of the 2002 IEEE International Symposium on Electrical Insulation, Boston, USA, 2002.
9. Q. Su, New Techniques for On-Line Partial Discharge Measurements, Monash University, Australia, 2001.
10. Zhan Wang, JiWei Guo, JingDong Xie, GuoQing Tang, An Introduction of a Condition Monitoring System of Electrical Equipment, Southeast University, Nanjing, China.
11. H. M. Banford, R. A. Fourache, Nuclear Technology and Ageing, Scottish Universities Research and Reactor Center, 1999.
12. Z. Berler, I. Blokhintsev, A. Golubev, Partial Discharge On-Line Monitoring in Switchgears and Bus Ducts, Cutler-Hammer Predictive Diagnostics, Minnetonka, USA, 2001.
13. Karim Younsi, Paul Menard, Jean C. Pellerin, Seasonal Changes in Partial Discharge Activity on Hydraulic Generators, IEEE, 2001.
14. P. Vujovic, R. K. Fricker, Development of an On-Line Continuous Tan(δ) Monitoring System, IEEE, 1994.
15. Jan Franlund, The Four Sides of Asset Management, Swedish Maintenance Society, UTEK.
16. ABB Partial Discharge Monitoring System, PAMOS.


Asset management in power systems post graduate course

On-line Monitoring Applications for Secondary Substations

Petri Trygg
Institute of Power Engineering
Tampere University of Technology
P.O. Box 692
Finland
Email: petri.trygg@tut.fi

Table of contents
INTRODUCTION
SECONDARY SUBSTATIONS
METERS FOR SECONDARY SUBSTATION
ADDITIONAL SENSORS
COMMUNICATIONS
INFORMATION SYSTEMS
ANALYSIS
CONCLUSION
REFERENCES

Introduction
This study has been prepared for the Asset Management in Power Systems post-graduate course, held at Helsinki University of Technology in August 2005.

The subject of the study is on-line monitoring applications for secondary substations. The theoretical background of the subject forms the foundation, but more emphasis is placed on actual solutions on the market, with some views of the future to indicate a path for development.

On-line monitoring applications were earlier introduced for primary substations. The whole application concept consists of multiple levels. The first level is the measurement location. The second level is formed by the meter and sometimes also additional sensors. The third part is telecommunications; recent development has focused on making the communication more efficient, which means that speed has increased while the costs have stayed at the same level or even decreased in some cases. Information systems form the heart of the application. They are also the most rapidly developing part of the whole concept. The types of applications are manufacturers' own systems, separate independent systems, and existing monitoring systems in the grid company.

The levels of the concept are:
- Location / site
- Meter and additional sensors
- Telecommunications
- Information system / application: database, SCADA / remote control system, analysing tools, user interface, and interfaces to other information systems

Secondary substations
The secondary substation consists of a transformer, breakers and bus bars, similarly to a primary substation. The main difference is the lower voltage level. Voltage levels vary according to national practices; in Finland the typical transformation is from 20 kV to 0.4 kV.
Secondary substations form the last link before the customer. Traditionally, any information from this level has been difficult to get, and real-time information has only been under discussion. Technical development has now made it possible to broaden distribution grid monitoring to cover secondary substations as well.

Picture 1. Outdoor substation. [El05]

Secondary substations are of three different types. The first one is the type usually seen in rural areas: a separate building of its own that contains only the necessary devices and premises. The substation building protects the devices from the weather and from unauthorized access. Another type is the indoor substation, which is typical in cities. It is built as a separate room forming part of a building. The facility serves the same purpose as in rural areas, but due to lack of space it is built this way. The thing to remember with an indoor secondary substation is the risk of fire.

Picture 2. Indoor secondary substation. [WI05]

Meters for secondary substation
Meters for secondary substations have existed for some time now. In Finland, Elkamo produced meters called PIHI already around ten years ago. The latest meters have more advanced metering capabilities and communication methods. Picture 3 shows the layout of the WIMOTEC meter. The meter consists of a measurement module and a communications module. This meter type measures the current of three phases. It also includes a temperature measurement, which can also be utilized for condition monitoring. The measurement capabilities are extended by analog and binary inputs. These inputs connect to separate additional sensors, which are described more closely in the next chapter.

[Figure content: meter layout with operating power inputs, current measurement, GSM module and antenna, temperature input, and binary and analog inputs.]

Picture 3. Example of a secondary substation on-line monitoring meter, WIMOTEC. [WI05]

Additional sensors
In addition to a meter, the on-line monitoring of secondary substations is also based on external sensors inside the substation and its devices. Possible sensor types are:
- Door indication
- Temperature measurement
- Smoke detection
- Humidity detection
- Short-circuit detection
- Fault indicator
- Switch / breaker status
- Oil temperature
With this additional information, much can be derived directly or by calculation from the combined measurement and sensor data. Because secondary substations are usually not attached to information systems, this information provides much more detailed data for operating the network.

Pic 4. Typical layout of a secondary substation. [NoLe05]

Communications
The main reason for the lack of on-line monitoring in secondary substations has been the lack of reasonable communication technologies. So far, most monitoring applications are based on manual data acquisition from the secondary substation, done by downloading the measurement data during routine visits to secondary substations. Some use can be made of this information, but only with on-line communication is it reasonable to talk about monitoring of secondary substations.

Radio links have been used to communicate with primary substations, and to some extent they are one possibility for providing communication. The development of GSM technology has provided solutions also for secondary substation monitoring. Wireless communication is a flexible way to create connections. GSM can be applied by two separate methods: data calls can be used to make the connection, or SMS short messages can be used. Secured TETRA communication is also being studied to increase the safety of the communication [NoLe01]. Currently GSM is also a reasonably priced communication method if all the costs are considered. The problem is that GSM is quite slow for some solutions; further development in wireless communication is provided e.g. in the form of GPRS.

Some distribution companies have installed WLAN or other wireless communication devices in their distribution areas. These solutions can also be used to replace GSM/GPRS. The speed is far greater, and if the technology already exists and is used for other purposes as well, the cost of usage is low.

The new technology called Power Line Communication (PLC) is also one possibility to adopt for secondary substation monitoring. So far in Finland at least Turku Energia is using the PLC technology.

Nowadays the most developed technology for continuous monitoring is GSM. There are ready and available products for on-line monitoring of secondary substations on the market already. Other communication methods require more adaptation for this purpose. A sketch of SMS-based data acquisition is given below.
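As an illustration of SMS-based data acquisition, the following Python sketch parses one measurement message. The message format is entirely hypothetical, since each meter manufacturer defines its own payload.

def parse_sms(payload):
    """Parse a hypothetical 'ID=ss123;I1=45.2;I2=44.8;I3=46.1;T=62.5' message."""
    fields = dict(item.split("=") for item in payload.split(";"))
    return {
        "station": fields["ID"],
        "phase_currents_A": [float(fields[k]) for k in ("I1", "I2", "I3")],
        "temperature_C": float(fields["T"]),
    }

print(parse_sms("ID=ss123;I1=45.2;I2=44.8;I3=46.1;T=62.5"))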

Information systems
New technology in secondary substation monitoring has also boosted new technology in software design. For example, Wimotec provides the Wimo DTMS system also in an Application Service Provisioning (ASP) model. This means that the software is delivered through the Internet, and the user interface of the application is in the web browser. The advantage of this model is the ease of delivering the information to any party willing to have it. Also, the upkeep of the whole system can be outsourced, which allows flexibility in cost management.

An interface with SCADA or a remote control system has also already been tested in Finland: at Kuopion Energia, WIMOTEC devices are currently attached to the SCADA system [WI05]. This makes it possible to have much more detailed information on the entire network. According to Wimotec, this seems to be the future trend.

Interfaces with other information systems, such as a Network Information System (NIS), are necessary. Today's demand on information systems is that they can truly interact with and be flexible towards other systems. Open interfaces and databases are part of the policy to guarantee the best usability also in the future.

Analysis
1. On-line monitoring
On-line monitoring includes the use of the meter and communications to gather information in real time from the secondary substations. Outage information and other real-time events are the first and most critical part of on-line monitoring. In such cases, the faster the information about an event reaches the control room, the better for all. The secondary substation level has earlier been a black spot from which no information whatsoever has arrived in real time. Many companies have wanted to remove this weakness of the electricity network. Real-time information decreases the outage times crucially. Manufacturers and research institutes have been working on this issue, and finally some results can be seen. More advanced analyzing methods are also currently being developed [Wu98].

2. Condition monitoring
Temperature and current information can be used to evaluate the life cycle of the secondary substation's transformer. Harmonics are used to make the calculation more precise. The standards to apply are IEC 354 for oil-immersed transformers, IEC 905 for dry-type transformers, and CENELEC HD 428.4 for current harmonics [NoVe02]. A sketch of the IEC 354 ageing-rate idea follows Picture 5.

Picture 5. Distribution transformer photographed with a heat camera.
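The following Python sketch illustrates the thermal ageing idea of IEC 354 for oil-immersed transformers, where the relative ageing rate doubles for every 6 K of hot-spot temperature above 98 °C. The hourly hot-spot values are hypothetical, and a full calculation would also derive the hot-spot temperature from the load current and harmonics as the standards describe.

def relative_ageing_rate(hotspot_c):
    """IEC 354 relative ageing rate: doubles per 6 K above a 98 degC hot spot."""
    return 2 ** ((hotspot_c - 98.0) / 6.0)

# Hypothetical hourly hot-spot temperature estimates:
hourly_hotspots = [85.0, 92.0, 104.0, 110.0, 98.0]
loss_of_life_h = sum(relative_ageing_rate(t) for t in hourly_hotspots)
print(f"equivalent ageing over {len(hourly_hotspots)} h: {loss_of_life_h:.2f} h")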

3. Power Quality analysis
Power quality analysis is based on voltage measurement. The standard EN 50160 defines the limits for voltages. In Finland, power quality analyses are also made with national tools, such as quality classes and quality reports, which are based on the same standard. The quality report is used especially for dealing with customer complaints. [M01]

Picture 6. Power Quality classes. [PQ05]

Voltage sags and swells can also be analyzed. The typical presentation is a table based on depth and duration, which is suitable for analyzing a limited number of sags and swells. For longer time periods and larger numbers of sags and swells, the ITIC curve is more efficient. It separates sags and swells according to how much they interfere with the customer, classifying them by length and depth according to the ITIC standard. This is a very useful tool especially in rural areas; a simplified classification against the ITIC envelope is sketched after Picture 7.

Picture 7. ITIC curve for voltage sags and swells. [PQ05]
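The following Python sketch classifies a sag against a simplified lower ITIC envelope; the breakpoints are an approximation of the published curve and should be treated as assumptions.

def outside_itic_envelope(voltage_pct, duration_s):
    """True if a sag falls below the (approximate) ITIC no-interruption boundary."""
    if duration_s < 0.02:
        return False                     # very short events are tolerated
    if duration_s < 0.5:
        return voltage_pct < 70.0        # approx. limit from 20 ms to 0.5 s
    if duration_s < 10.0:
        return voltage_pct < 80.0        # approx. limit from 0.5 s to 10 s
    return voltage_pct < 90.0            # steady-state tolerance band

print(outside_itic_envelope(65.0, 0.2))  # -> True, interferes with the customer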

Conclusion
In the past, secondary substations formed a black spot for distribution companies. Even though some secondary substations had meters, the information they provided was limited and by no means offered on-line monitoring features. Today, and especially in the future, new communication technologies make it possible to attach secondary substations to the on-line monitoring of the electricity network.

Several manufacturers already provide the meter, the software, or both. Attaching the applications to existing information systems is critical for achieving the best possible usability. More analyzing tools and methods are required to help the network operator by providing only the most essential information.

References:
[Wu98] Wu et al., "Application of Regression Models to Predict Harmonic Voltage and Current Growth Trend from Measurement Data at Secondary Substations", IEEE Transactions on Power Delivery, Vol. 13, No. 3, July 1998, pp. 793-799.
[NoLe01] Nordman et al., "TETRA Radio in Monitoring and Control of Secondary Substations", Developments in Power System Protection, Conference Publication No. 479, pp. 283-286, IEE, 2001.
[NoLe05] Nordman et al., "An Agent Concept for Managing Electrical Distribution Networks", IEEE Transactions on Power Delivery, Vol. 20, No. 2, April 2005, pp. 696-703.
[M01] Mäkinen et al., "Power Quality Monitoring Integrated with Distribution Automation", CIRED 2001, June 18-21, conference publication, IEE.
[WI05] Wimotec Ltd homepage. Visited 4.8.2005. www.wimotec.com
[PQ05] PowerQ Ltd homepage. Visited 5.8.2005. www.powerq.net
[El05] Elkamo Ltd homepage. Visited 4.8.2005. www.elkamo.fi
[NoVe02] Nousiainen, K. & Verho, P., "Monitoring the Temperature and Ageing of Distribution Transformers with Distribution Automation", Tampere University of Technology, Insucon 2002, Berlin.

Partial Discharge Detection by Wireless Sensors in Covered Conductors Overhead Lines

Report submitted for the course work
ASSET MANAGEMENT IN POWER SYSTEMS

Submitted by
G. Murtaza Hashmi


PARTIAL DISCHARGE DETECTION BY WIRELESS SENSORS IN


COVERED CONDUCTORS OVERHEAD LINES
G. Murtaza Hashmi
Researcher, Power Systems and High Voltage Engineering Lab,
Helsinki University of Technology (HUT), ESPOO.
Murtaza.Hashmi@hut.fi
WHAT IS PARTIAL DISCHARGE (PD)?
The term "PD" is defined by IEC 60270 (Partial Discharge Measurements) as a localized
electrical discharge that only partially bridges the insulation between conductors and which may
or may not occur adjacent to a conductor. A PD is confined in some way that does not permit
complete failure of the system, i.e., collapse of the voltage between the energized electrodes such
as the cable conductor and neutral wires. PD can result from breakdown of gas in a cavity,
breakdown of gas in an electrical tree channel, breakdown along an interface, breakdown
between an energized electrode and a floating conductor, etc. [1].
PD is a small electrical avalanche caused by locally disrupted electric fields in dielectric materials; it is a symptom of insulation weakness and at the same time can lead to severe deterioration of the insulating material. It is therefore known as one of the major factors accelerating the degradation of electrical insulation.

SIGNIFICANCE OF PD MONITORING
PDs are small discharges caused by strong and inhomogeneous electrical fields. The reason for
such fields could be voids, bubbles, or defects in an insulation material. Detection of PD is
performed in order to ascertain the condition of the insulating material in high voltage elements,
e.g. cables and covered conductors. Since PD usually occurs before complete breakdown, PD
monitoring provides a warning to remove the power system component from service before
catastrophic failure occurs [2]. Therefore, the area of PD measurement and diagnosis is accepted
as one of the most valuable non-destructive means for assessing the quality and technical
integrity of high voltage (HV) power apparatus and cables.
PD monitoring involves an analysis of materials, electric fields, arcing characteristics, pulse wave
propagation and attenuation, sensor spatial sensitivity, frequency response, calibration, noise, and
data interpretation.


TYPES OF PDs
PDs can be classified into the following four types:
i. Internal discharges
ii. Surface discharges
iii. Corona discharges
iv. Discharges in electrical trees, which may be considered as internal discharges of specific origin.

UNDERSTANDING THE INITIATION OF PD SIGNALS
The generation of PD signals can be analyzed by considering a cavity in the dielectric material of the covered conductor. The cavity is generally filled with air or some gas. The equivalent circuit of a PD cavity in the dielectric is shown in Fig. 1. The capacity of the cavity is represented by a capacitance c, which is shunted by a breakdown path. The capacity of the dielectric in series with the cavity is represented by a capacitance b. The sound part of the dielectric material is represented by a capacitance a.

Fig. 1: Equivalent circuit of a dielectric with cavity


The electric fields inside the conductor insulation (E_i) and in the cavity (E_c) are given as:

E_i = D/ε = D/(ε0 ε_r)    (1)

E_c = D/ε0    (2)

where D is the electric flux density in the conductor, measured in coulombs/m², and is directly proportional to the applied voltage V on the conductor. ε is the permittivity of the insulating material, ε = ε0 ε_r, where ε_r is the dielectric constant (relative permittivity) of the conductor insulating material (always greater than unity), and ε0 is the permittivity of free space (or air). The electric field (or dielectric breakdown strength) is measured in kV/mm; for air this value is 3 kV/mm at 1 atmosphere pressure. Its magnitude depends upon the shape and location of the cavity and the pressure of the air or gas inside it.
As the applied voltage V is increased, the electric field in the cavity becomes greater than the field in the surrounding dielectric as a result of the lower permittivity of the air or gas in the cavity, i.e. since ε > ε0, E_c > E_i.
When the field in the cavity becomes sufficiently high, the air can break down, going from
non-conducting to conducting, and the field in the cavity drops from very high to nearly zero
immediately after the discharge. The measured PD signal is the result of the change in the image
charge on the electrodes, caused by the transient change in the electric field distribution produced
by the discharge. Such a discharge generates a voltage PD signal between the system conductors
as a result of the change in the electric field configuration that takes place when the discharge
occurs.
This phenomenon can be understood by considering the transient change in capacitance between
the conductor and its ground shield when the cavity goes from non-conducting to conducting. The
capacitance obviously increases when the cavity is conducting, which means that a current must
flow down the conductor to charge the additional capacitance and maintain a constant voltage on
the conductor. This current flows through the impedance of the cable and generates a voltage
pulse, which propagates down the conductor [1].

QUANTITIES RELATED TO THE MAGNITUDE OF PD


Charge transfer
The charge q1 transferred in the cavity could be taken as a measure. If the sample is large
compared to the cavity, as will usually be the case, this charge transfer is approximately [3]:

q1 ≈ (b + c)·ΔV    (3)

where ΔV = Vi − Ve, Vi being the breakdown voltage at which the discharge occurs in the cavity
and Ve the voltage at which the discharge extinguishes.
As the deterioration of the dielectric certainly is related to the charge transfer in the defect, q1
would be an attractive choice. However, q1 cannot be measured with a discharge detector and is
therefore not a practical choice.
Apparent charge transfer in the sample
The displacement of charge q in the leads of the sample can be taken. This quantity is equal to:

q = b·ΔV    (4)

It causes a voltage drop b·ΔV/(b + c) in the sample. Most discharge detectors respond to this
voltage drop and are thus capable of determining q.
The charge q1 delivered at the cavity by the PD is not identical to that recouped from the
power supply network, measurable at the terminals as the "apparent" pulse charge q. Thus:

q = (b/(b + c))·q1    (5)

The only disadvantage in using q is that it is inversely proportional to the insulation thickness.
Thicker insulations are thus measured with less sensitivity than thinner ones.
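As a quick numerical illustration of Eqs. (3)-(5), the sketch below (in Python, with arbitrary example capacitances and voltages, not values taken from this study) computes the cavity charge q1 and the corresponding apparent charge q:

```python
# Illustration of Eqs. (3)-(5); all numbers are assumed example values.
b = 0.5e-12    # capacitance of the dielectric in series with the cavity (F)
c = 5.0e-12    # capacitance of the cavity (F)
V_i = 500.0    # inception (breakdown) voltage of the cavity (V)
V_e = 400.0    # extinction voltage (V)

dV = V_i - V_e             # voltage collapse in the cavity
q1 = (b + c) * dV          # charge transferred in the cavity, Eq. (3)
q = b / (b + c) * q1       # apparent charge at the terminals, Eq. (5)

print(f"q1 = {q1 * 1e12:.0f} pC, apparent q = {q * 1e12:.0f} pC")
# q is smaller than q1 by the factor b/(b + c), illustrating the reduced
# sensitivity for thick insulation noted above.
```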

PD SIGNAL CHARACTERISTICS
The voltage in the cavity collapses within at most a few ns, so that the resulting voltage pulse,
which propagates in both directions away from the PD source, has an initial pulse width in the ns
range. However, all shielded power cables and covered conductors have substantial high frequency
attenuation that increases the pulse width and decreases the pulse amplitude as a function of the
distance propagated, which also limits the optimum signal detection bandwidth [4].
Recent studies have shown that radiation from PDs is impulsive in nature and consists of
individual high-energy, wide-band impulses a few ns in length. PD signals thus have a very wide
frequency range. Digital storage oscilloscopes and advanced digitizers enable closer study of the
PD signals using window processing, zooming and auto-advance features. Such techniques promise
to be superior to the conventional PD detector methods currently in use.
The transferred PD data needs to be processed to obtain PD characteristics such as peak value,
apparent charge, phase position, repetition rate and PD energy. The PD signal attenuates during
its propagation, which gives rise to the following critical detection issues:
- Sensor locations and sensitivity
- Measurement system response to attenuated signals
- Noise detection and elimination

CONVENTIONAL PD DETECTORS
When a PD occurs, a current pulse is produced; this pulse interacts with the insulation
capacitance as well as with the external elements in the test circuit. Consequently, a voltage pulse
is superimposed onto the HV supply voltage. Conventional detection methods generally employ a
matching impedance, consisting of resistors, inductors and capacitors, in the PD current path.
There are two different methods of investigation [5]:
(a) Straight detection, including direct and indirect methods
(b) Balance detection method (bridge circuit)

LIMITATIONS OF CONVENTIONAL PD DETECTING SYSTEMS


The conventional PD measurement techniques have been in practice for several decades. Most
conventional PD detectors use a single-input detection method to measure a voltage or a current
signal at a terminal of the test object, based on processing of analogue signals derived from a
coupling impedance in the PD current path. These conventional techniques suffer severe
limitations when it comes to on-line monitoring, due to the influence of background noise and the
absence of non-intrusive sensors and processing facilities.
The conventional PD detection technique can identify discharges in short, isolated cable lengths
only. Unfortunately, it has insufficient sensitivity for a long circuit because of the large
capacitance involved. The cable must also be isolated from the circuit. PD testing requires,
besides the PD detector, additional HV components such as a test-voltage supply and a coupling
capacitor, which are heavy and expensive and not suitable for on-site tests [6]. These techniques
are, therefore, restricted to testing in a laboratory environment.
Conventional PD detection has also been limited in detection frequency range, especially for
covered conductor (CC) overhead lines, due to the attenuation of high frequency partial discharge
signals. A low detection frequency imposes a fundamental limitation on locating the partial
discharge, which is one of the major concerns of insulation monitoring. Recently, the detection
frequency range has been extended up to the radio frequency band with the development of new
sensors, e.g. Rogowski coils.

INTRODUCTION TO ROGOWSKI COIL


Rogowski coils have been used for the detection and measurement of electric currents for decades.
They have generally been used where other methods are unsuitable [7]. They have become an
increasingly popular method of measuring current within power electronics equipment due to
their low insertion loss and reduced size compared to an equivalent current transformer (CT) [8].
In many applications they are the preferred method of current measurement, having more suitable
features than CTs and other iron-cored devices.

The coils have also been used in conjunction with protection systems, particularly high accuracy
systems or systems where DC offsets would degrade the performance of CTs. This is useful when
measuring ripple current, e.g. on a DC line. The features of Rogowski coils which make them
particularly useful for transient measurements stem from their inherent linearity and wide
dynamic range.

CONSTRUCTION OF ROGOWSKI COIL


A Rogowski coil is basically a low-noise toroidal winding on a non-magnetic (generally air) core,
placed round the conductor to be measured, which makes it lighter and smaller than iron-core
devices. The end of the winding is usually wound back through the tube in order to reduce the
coupling of any radiated noise signals. The coil is effectively a mutual inductor coupled to the
conductor being measured, and the output from the winding is an electromotive force proportional
to the rate of change of current in the conductor. This voltage is proportional to the current even
when measuring complex waveforms, so these transducers are good for measuring transients and
for applications where asymmetrical current flows must be measured accurately.
The self inductance of the coil is fixed, and its mutual inductance with the HV test circuit varies
depending on the position of the coil in relation to the conductor; but once the coil is clamped
and held stationary, the mutual inductance remains constant. Usually the coil is not loaded and
the voltage appearing across it is used as the PD measurement signal. Under this condition the
coil voltage is directly proportional to the derivative of the current in the conductor. Owing to the
small value of the mutual inductance, the coil acts as a high-pass filter; the sensor therefore
attenuates the power frequency component of the current in the conductor and minimizes
electromagnetic interference at lower frequencies. The PD signals, however, are generally in the
MHz range and induce a sufficiently large voltage, in the mV range.
The coils are designed to give a high degree of rejection of external magnetic fields, for example
from nearby conductors. The coils are wound either on a flexible former, which can be
conveniently wrapped round the conductor to be measured, or on a rigid former, which is less
convenient but more accurate. The rigid coils have a cross-sectional area of about 6 cm² and the
flexible coils of about 0.4 cm² [7]. Both transducer types exhibit a wide dynamic range, so the
same Rogowski coil can often be used to measure currents from mA to kA. They also exhibit
wide-band characteristics, working well at frequencies as low as 0.1 Hz but typically useful out
to hundreds of kilohertz, all with low phase error and without the danger of an open-circuited
secondary (as can happen with a CT).

TYPES OF ROGOWSKI COILS


Rigid Rogowski coils
Rigid Rogowski coils have a greater accuracy and stability than flexible coils and excellent
rejection of interference caused by external magnetic fields. They are more suitable for low
current and low frequency measurements. All rigid coils are provided with electrostatic screening
as standard to reduce noise and to minimize the effect of capacitive pick-up from the conductor
voltage.
For a particular coil size, the typical value of mutual inductance is 3–5 µH, in the current range
of 100 mA to 100 kA, up to frequencies of tens of kHz. The actual specification will depend on
the wire gauge used for the winding.
Rigid coils have a very stable output and can be calibrated to an accuracy of better than 0.1%
using traceable standards. Their output is not affected significantly by the position of the
conductor threading the coil.
Flexible Rogowski coils
Flexible coils are generally more convenient to use than rigid coils but are less accurate (1%
compared with 0.1% for a rigid coil). They are better than rigid coils for high frequency
measurements.
The coils are fitted by wrapping them round the conductor to be measured and bringing the ends
together. It is important to align the ends correctly; otherwise accuracy will be impaired and the
coil will become sensitive to interference from adjacent conductors or other sources of magnetic
fields. The standard locating method is a simple push-together system.
A typical mutual inductance for a flexible coil is 200–300 nH. These coils can be used to measure
currents from less than 1 A to more than 1 MA and at frequencies of up to several hundred kHz.
An electrostatic screen is sometimes useful to reduce noise in very low current measurements or
to minimize capacitive pick-up in high frequency measurements.

WORKING PRINCIPLE OF ROGOWSKI COIL


A Rogowski coil can be modeled as a transformer consisting of a primary circuit with current I_P
(the current to be measured in the conductor) coupling to a secondary (measuring) circuit with
inductance L_S loaded with resistance R_S. The coupling strength is given by the mutual
inductance M, as depicted in Fig. 2 [9].
The secondary voltage V_S across the terminals of the Rogowski coil is given as:

V_S = jωM·R_S·I_P / (R_S + jωL_S)    (6)

Usually the air-cored coil is used in combination with a large resistance, i.e. ωL_S ≪ R_S, and
the measured voltage becomes:

Fig. 2: Equivalent circuit of the Rogowski coil


V_S = jωM·I_P    (7)

So the output voltage at the terminals of the winding wound around the toroidal core is
proportional to the time derivative of the current flowing in a conductor passing through the coil,
and is given by Eq. (8):

v_S(t) = M·di_P(t)/dt = (μ₀AN/(2πR))·di_P(t)/dt    (8)

where v_S(t) and i_P(t) are the instantaneous values of the voltage measured by the Rogowski
coil and of the current flowing in the conductor. μ₀ is the permeability of free space, equal to
4π×10⁻⁷ H/m, A is the cross-sectional area, N is the number of turns, and R is the radius
of the Rogowski coil.
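Eq. (8) also gives a direct way to estimate M from the coil geometry, M = μ₀AN/(2πR). A minimal sketch follows; the turn count and radius are illustrative assumptions, not the geometry of the probe used later in this report. It shows that a flexible coil of plausible dimensions lands in the 200–300 nH range quoted above:

```python
import math

# Estimate the mutual inductance of an air-cored toroidal coil, Eq. (8):
#   M = mu0 * A * N / (2 * pi * R)
# A is taken from the ~0.4 cm^2 flexible-coil cross section mentioned above;
# N and R are assumed example values, not the actual probe geometry.

mu0 = 4 * math.pi * 1e-7   # permeability of free space (H/m)
A = 0.4e-4                 # winding cross-sectional area (m^2)
N = 1000                   # number of turns (assumed)
R = 0.032                  # major radius of the toroid (m) (assumed)

M = mu0 * A * N / (2 * math.pi * R)
print(f"M = {M * 1e9:.0f} nH")   # ~250 nH for these values
```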
An integrator is incorporated with the coil, which integrates the output voltage v_S(t) according
to the following equation to convert it into the current flowing through the conductor:

i_P(t) = (1/M) ∫ v_S(t) dt    (9)
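In discrete time, the integral of Eq. (9) becomes a cumulative sum of the sampled coil voltage. A minimal sketch, using a synthetic damped-oscillation pulse in place of a recorded waveform and the assumed M = 250 nH used later in this report:

```python
import numpy as np

# Discrete-time form of Eq. (9): i_P(t) = (1/M) * integral of v_S(t) dt.
fs = 100e6                                   # sampling rate (Hz), assumed
t = np.arange(0, 20e-6, 1 / fs)              # 20 us record
v_s = 0.01 * np.exp(-t / 2e-6) * np.sin(2 * np.pi * 3e6 * t)  # synthetic pulse (V)

M = 250e-9                                   # mutual inductance (H), assumed
i_p = np.cumsum(v_s) / fs / M                # rectangle-rule running integral / M

print(f"peak current estimate: {i_p.max() * 1e3:.2f} mA")
```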
The components of the flexible Rogowski coil, i.e. the toroidal coil and the integrator, are shown in Fig. 3.

Fig. 3: Components of flexible Rogowski coil.

ADVANTAGES OF ROGOWSKI COIL PD SENSORS


With a conventional sensor, the test circuit capacitance determines the frequency bandwidth,
which is usually not very wide compared with this sensor. The frequency bandwidth of the
Rogowski coil is not influenced by the capacitance of the test circuit; it is determined largely by
the self inductance and the capacitance of the coil and signal cables. The coil has the following
advantages:
i. The frequency response of the Rogowski-coil sensor is very wide.
ii. There is no conductive coupling between the coil sensors and the high voltage test circuits. Furthermore, the coil installation does not necessitate disconnection of the grounding leads of the test objects; the sensor is therefore non-intrusive, which is a very important aspect for on-site, on-line monitoring.
iii. It possesses a high signal-to-noise ratio with a wide frequency bandwidth.
iv. A Rogowski coil based PD measurement system is a low-cost solution and can be easily implemented on-site.

These advantages are essential for on-line PD measurements. The challenge for on-line PD
measurements is to find the optimal locations for these sensors with respect to their sensitivity,
interference level, signal distinction, and universal applicability.

SIGNIFICANCE OF COVERED CONDUCTORS


The use of covered conductors in distribution systems started with the need to decrease the
number of faults caused by trees, as well as to reduce the expenses of tree clearance and
maintenance. The advantages of using covered conductors are [10]:

- No faults or troubles with snow or hoar frost.
- No interruptions from dropping branches.
- No faults from conductors touching due to ice-shedding.
- Smaller lanes through forests; reimbursement for owners is smaller.
- Clearing of the lane from growing trees is needed less often, and the big trees at the border of the forest can remain in place. Considerable savings are therefore possible.
- Covered conductors are a cheaper alternative to underground cable, especially in difficult terrain.
- Lines near areas that the public visit are less dangerous with respect to accidental touching, as it is possible to touch the conductor without receiving an electric shock.

However, these kinds of conductors have some problems, such as:
- PDs produced by trees falling on the conductors, which leave knife traces on the surface of the conductors.
- Sensitivity to ultraviolet radiation.
- Results produced at different locations may differ due to climatic differences.
- The different dielectric constants of the materials employed, generating electric field concentrations and consequently the possibility of corona effects.
- Susceptibility to thermo-mechanical effects, causing cracks.

MEASURING TEST SET-UP


As the PD signals have frequencies in the MHz range, the flexible Rogowski coil is used in this
study. The measuring test set-up is arranged in the HV Engineering Lab at HUT and consists of
the following:
i. Covered conductor of approximately 25 meters length.
ii. Rogowski coil (without integrator), model i2000 FLEX (flexible AC current probe), made by FLUKE.
iii. Tree leaning against the conductor to simulate real-world conditions.
iv. Digitizing signal analyzer (DSA) or oscilloscope.
v. Computing system (laptop) for data acquisition from the DSA.
vi. Calibrator for measuring system calibration.
vii. HV power supply system, 20 kV AC.

The following photographs give an overview of the arrangement of the measuring test set-up:

Fig. 4: Placement of the Rogowski coil (without integrator) on the conductor

Fig. 5: Measuring test set-up

Fig. 6: Application of 20 kV AC voltage for energizing the conductor

Fig. 7: Leaning of the tree on the conductor

Fig. 8: Knife traces made on the conductor for producing PDs

METHODOLOGY
Steps involved for PD detection from Rogowski coil measurements
i. The charge entering the system due to the additional capacitance produced by PDs is:

q(t) = ∫ i_P(t) dt = (1/M) ∫∫ v_S(t) dt dt    (10)

ii. Acquisition of data from the Rogowski coil (v_S) in the form of a MAT file. This voltage waveform is a high frequency (MHz) oscillatory transient.
iii. The magnitude of M lies between 200 and 300 nH. The average value of 250 nH is selected for this study.
iv. Using the above steps and mathematical expressions, the following model is used for simulation in Simulink.

[Simulink model: coil voltage v(t) → Integrator-1 → Integrator-2 → division by the mutual inductance M = 250e-9 → PD measurement in coulombs]

Fig. 9: Simulink model for PD measurements by using Rogowski coil voltage pulse data
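An offline equivalent of the Simulink chain in Fig. 9 takes only a few lines; the MAT file name and its variable names below are hypothetical placeholders for the data acquired in step ii:

```python
import numpy as np
from scipy.io import loadmat

# Two cascaded integrations of the coil voltage followed by division by M,
# i.e. Eq. (10). 'pd_pulse.mat', 't' and 'v_s' are hypothetical placeholders.
data = loadmat("pd_pulse.mat")
t = data["t"].ravel()
v_s = data["v_s"].ravel()
dt = t[1] - t[0]                    # sampling interval (s)

M = 250e-9                          # assumed average mutual inductance (H)
i_p = np.cumsum(v_s) * dt           # Integrator-1
q = np.cumsum(i_p) * dt / M         # Integrator-2 and division by M

print(f"charge estimate: {q[-1] * 1e12:.1f} pC")
```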

Placement of Rogowski coil for PD measurements


At long distance
The Rogowski coil is placed far (at the open-end side of the conductor) from the point where the
insulation of the conductor is scratched to produce PD.
At medium distance
The Rogowski coil is placed midway between the point where the insulation of the conductor is
scratched and the open-end side of the conductor.
At short distance
The Rogowski coil is placed at a short distance from the point where the insulation of the
conductor is scratched to produce PD. The measurement results are shown only for the
long-distance placement of the Rogowski coil.

MEASUREMENTS AND CALCULATIONS


Calibration of the measuring system
Measurements are made for different calibrator charges (10 pC, 100 pC, 1 nC, and 10 nC). The
calibrator sends pulses of known charge into the system, and the Rogowski coil measures those
pulses as voltage signals (without integrator). The measured voltage signals are then processed in
the Simulink model for the calculation of the charge.
Rogowski coil response for 10 pC calibrator charge

Fig. 10: Rogowski coil voltage response for 10 pC calibrator charge.

Rogowski coil response for 100 pC calibrator charge

Fig. 11: Rogowski coil voltage response for 100 pC calibrator charge.

Fig. 12: Rogowski coil voltage FFT response for 100 pC calibrator charge.

Fig. 13: PD measurement by Rogowski coil for 100 pC calibrator charge.

Rogowski coil response for 1 nC calibrator charge

Fig. 14: Rogowski coil voltage response for 1 nC calibrator charge.

Fig. 15: Rogowski coil voltage FFT response for 1 nC calibrator charge.

Fig. 16: PD measurement by Rogowski coil for 1 nC calibrator charge.


Rogowski coil response for 10 nC calibrator charge

Fig. 17: Rogowski coil voltage response for 10 nC calibrator charge.

Fig. 18: Rogowski coil voltage FFT response for 10 nC calibrator charge.

Fig. 19: PD measurement by Rogowski coil for 10 nC calibrator charge.

Discussion
The sensitivity of the coil is not good enough for measuring PDs of 10 pC or less, so it is not
possible to use this data for simulation. The coil gives only the same response (noise) when placed
at a short distance from the calibrator.
For the 100 pC calibrator pulse, the Rogowski coil gives a PD measurement of 93 pC when placed
at the long distance. For the 1 nC calibrator pulse, it gives a PD measurement of 0.22 nC, and for
the 10 nC calibrator pulse, 9 nC. The frequency content of the acquired signals is in the range of
2.5–5 MHz in all the above cases. These values are shown in Tab. 1 below.
Calibrator charge    Charge measured by Rogowski coil at long distance    Percentage error
10 pC                Could not be detected                                -
100 pC               93 pC                                                7%
1 nC                 0.22 nC                                              78%
10 nC                9 nC                                                 10%

Tab. 1: Comparison of calibrator charge and PD measurements by Rogowski coil.
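The 2.5–5 MHz figure above comes from inspecting the spectra of the captured pulses (cf. Figs. 12, 15 and 18). A minimal sketch of such a spectral check, again with a synthetic pulse standing in for measured data:

```python
import numpy as np

# Locate the dominant frequency component of a coil pulse via the FFT.
fs = 100e6                                   # sampling rate (Hz), assumed
t = np.arange(0, 20e-6, 1 / fs)
v_s = 0.01 * np.exp(-t / 2e-6) * np.sin(2 * np.pi * 3e6 * t)  # synthetic pulse

spectrum = np.abs(np.fft.rfft(v_s))
freqs = np.fft.rfftfreq(len(v_s), d=1 / fs)
print(f"dominant component: {freqs[spectrum.argmax()] / 1e6:.1f} MHz")
```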


PD measurements from different tests
Different tests are made by making knife traces on the surface of the insulation of the conductor
(as can be seen in Fig. 8). The PDs thus produced by deteriorating the insulation are measured by
the Rogowski coil. In this measuring test set-up, the transmission line (conductor) is energized
with 20 kV AC, and a tree is leaned against the conductor as shown in Fig. 7. The Rogowski coil
is placed at the long distance from the point of application of HV. The results from one of the
tests are given below.
Rogowski coil response without making knife traces, with tree leaning on the conductor

Fig. 20: Rogowski coil voltage response without knife scratches, with the tree leaning on the conductor
Fig. 21: Rogowski coil voltage FFT response without knife scratches, with the tree leaning on the conductor

Fig. 22: PD measurement by Rogowski coil without knife scratches, with the tree leaning on the conductor
Rogowski coil response for a test after making knife traces

Fig. 23: Rogowski coil voltage response when making knife scratches, with the tree leaning on the conductor
Fig. 24: PD measurement by Rogowski coil when making knife scratches, with the tree leaning on the conductor

Discussion
This first measurement is taken without scratching the conductor. The conductor is energized
with 20 kV voltage. The Rogowski coil response gives the PD measurement of 7.8 nC .This PD
could be produced due to corona effect by leaning of tree on the conductor, as coil does not give
any response when tree is not leaning against the conductor.
A test is made by making knife traces on the surface of the conductor. The conductor is energized
with 20KV voltage. The Rogowski coil response gives the PD measurement 0f 10 nC .

CONCLUSIONS
i. It is possible to measure PD by using a Rogowski coil.
ii. The sensitivity of the coil is low, as it is unable to measure PDs of 10 pC or less.
iii. Although the coil does not give the exact value of the PD pulse sent by the calibrator, it can be used for approximation. The results could be improved by determining the exact value of the mutual inductance M.
iv. The frequency of the PD pulses is in the range of 2.5–5 MHz for measurements taken at the long distance.
v. Corona could be produced by trees leaning on the conductors.

RECOMMENDATIONS FOR FUTURE WORK


i. Filters and wavelet transforms could be applied to reduce the noise in the measured voltage signals obtained from the Rogowski coil, thus improving the results.
ii. The covered conductor line could be modeled, and the results obtained from the measurements compared with those obtained from simulations of the models for verification.
iii. The attenuation of the PD signals is an important factor in the analysis of PD detection and in the placement of PD sensors over the length of the conductor. The attenuation could be determined by taking measurements at different known lengths along the conductor.

REFERENCES
[1] S.A. Boggs, J. Densley, "Fundamentals of Partial Discharge in the Context of Field Cable Testing," IEEE Electrical Insulation Magazine, Vol. 16, No. 5, September-October 2000, pp. 13-18.
[2] T. Babnik, R.K. Aggarwal, P.J. Moore, and Z.D. Wang, "Radio Frequency Measurement of Different Discharges," 2003 IEEE Bologna PowerTech Conference, June 23-26, 2003, Bologna, Italy.
[3] F.H. Kreuger, "Partial Discharge Detection in High Voltage Equipment".
[4] S.A. Boggs and G.C. Stone, "Fundamental Limitations to the Measurement of Corona and Partial Discharge," 1981 Annual Report of the Conference on Electrical Insulation and Dielectric Phenomena; reprinted in IEEE Trans. Electrical Insulation, Vol. EI-17, April 1982, p. 143.
[5] IEC, "Partial discharge measurement," IEC 270, 1981.
[6] N.H. Ahmed, N.N. Srinivas, "On-line Partial Discharge Detection in Cables," IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 5, No. 2, April 1998, pp. 181-188.
[7] D.A. Ward, J.L.T. Exon, "Experience with using Rogowski coils for transient measurements," IEE Colloquium on Pulsed Power Technology, 20 Feb 1992, pp. 61-64.
[8] Z. Mingjuan, D.J. Perreault, V. Caliskan, "Design and Evaluation of an Active Ripple Filter with Rogowski Coil Current Sensing," Power Electronics Specialists Conference (PESC 99), 30th Annual IEEE, Vol. 2, pp. 874-880, 1999.
[9] P.C.J.M. van der Wielen, J. Veen, P.A.A.F. Wouters, and E.F. Steennis, "Sensors for On-line PD Detection in MV Power Cables and their Locations in Substations," Proceedings of the 7th International Conference on Properties and Applications of Dielectric Materials, June 1-5, 2003, Nagoya, pp. 215-219.
[10] W. Panosch, K. Schongrundner, K. Kominek, "20 kV Overhead Lines with Covered Conductors," CIRED 2001, IEE Conference Publication No. 482, 18-21 June 2001.


EVALUATION OF THE REPRESENTATION OF POWER SYSTEM COMPONENT MAINTENANCE DATA IN IEC STANDARDS 61850/61968/61970

Lars Nordström
KTH Royal Institute of Technology
Department of Electrical Engineering
lars.nordstrom@ics.kth.se
1 INTRODUCTION

On the re-regulated market, a utility is put under pressure both from its owners, who expect
return on investments, and from regulatory agencies keeping a watchful eye on the level of tariffs.
In addition to these new business constraints, a number of factors are imposing new challenges.
In the industrialized part of the world, the power grids were built during the first half of the 20th
century. Although replacements and expansions are made continuously, the bulk of the equipment
is reaching the age for which it was originally designed. This aging is apparent in primary as well
as in secondary equipment. Additionally, the workforce set to maintain the power grid is aging,
resulting in utilities losing key competence as people move on to retirement or are forced to leave
as part of cost-cutting.
It is natural that under these circumstances a utility will focus on improving its power system
maintenance processes, the goal being to prolong the lifetime of equipment, thereby providing a
better return on employed capital. Equally obvious is that utilities will strive to make their
maintenance processes cost efficient and to a large degree automated, so as not to become
dependent on single individuals. The answer to both these requirements is increased use and
integration of Information Systems (IS) in support of the maintenance processes. This
development is reported on, in part, in [1]. At the same time, the power industry, especially power
distribution, is still only in the early stages of implementing modern methods and tools for asset
management, such as condition based maintenance and reliability-centered maintenance (RCM),
see for example [2].
1.1 Utility Information Systems
The Information Systems currently in use at most utilities are traditionally not tailored for direct
support of power system maintenance. At a high level of abstraction, the systems in use can be
divided into four groups, see [3] or, for a similar taxonomy, [4]:
- SCADA: Supervision, Control and Data Acquisition systems used for monitoring and control of the power network and its components. These typically contain data such as Line, Breaker, Station, etc.
- Business Support Systems: similar to those found in other industries, including support for payroll, accounts receivable, general ledger, as well as purchasing and inventory control. This group of systems normally manages data such as Project, Time-sheet, Cost center, etc.
- Customer Management Systems: consisting of support systems for sales, marketing and billing, the most basic support being a database of existing customers together with functionality for creating invoices. This group of systems typically contains data such as Customer, Contract and Meter.
- Geographic Information Systems: the central repository for information about the network and its components, with a focus on geographic location. This group of systems typically manages data like Breaker, Line and Station.
Unfortunately, the data and functionality needed to support condition based maintenance are
spread across several different systems and groups of systems. Information Systems used for
equipment diagnostics are also highly specialized and built by the diagnostic equipment vendors
in conjunction with the diagnostic equipment. This means that results from diagnostics may not
be readily available to integrate with, for example, an equipment inventory.
Additionally, integration and implementation of new Information Systems is very often plagued
by budget or schedule overruns as well as performance and/or functionality problems [5]. In the
power industry there exists a series of initiatives to remedy these problems. Among the notable
ones is EPRI's CCAPI initiative [6], now included in the IEC's standardized CIM model for
information exchange, see [7]. The IEC 61968 series goes a step further and strives to define
information exchange specifically for Distribution Management Systems [4]. For an overview of
the relation between these and other initiatives, please see [8]. These initiatives are focused on
modeling data and information in order to facilitate the integration of existing and new IS into an
enterprise-wide utility IS architecture.
Additionally, the IEC has recently released an encompassing standard for communication and
control of substations, IEC 61850; see [9] for an introduction and overview. The purpose of this
standard is to facilitate the integration of Intelligent Electronic Devices (IEDs) from different
vendors for substation automation. The data models contained in these standards are further
described in section 3 below.
1.2 Research Approach
The scope of the work presented herein is to investigate how maintenance data is represented and
considered, if at all, in the major IEC standards related to information systems, i.e. 61968/70 and
61850. The focus has been on power system component data needed for condition based
maintenance. The underlying idea is to investigate to which extent the data needed for condition
based maintenance is represented in the various standards.
The approach of the work has been to study literature on condition based maintenance in order to
identify which data is stored about conducted inspections and measurements. This has then been
cross-checked against the IEC standards to verify whether the condition based data is represented,
and if so where, in the standards.
1.3 Scope of the work
The scope of the study has been limited to condition based maintenance of power cables and
specifically to the data storage aspects of partial discharge (PD) and dielectric response
measurements to determine insulation quality in such cables. This scope has been selected in
order to allow a sufficiently detailed treatment of the subject.
2 CONDITION BASED MAINTENANCE

Condition based maintenance is a wide, encompassing term which implies that maintenance, in
terms of repairs or replacement, is done based on the condition of the component. In [10] the term
is defined as: "An equipment maintenance strategy based on measuring the condition of equipment
in order to assess whether it will fail during some future period, and then taking appropriate
action to avoid the consequences of that failure."
From the above definition we understand that information needs to be stored about the condition
of the equipment and that this information needs to be accessible for future analysis. Additionally,
from [11] we have that "The full potential of condition and maintenance information cannot
always be realized by using traditional techniques of data handling and analysis. There are often
underlying trends or features of the data that are not evident from the usual analysis techniques.
Such detail and trends can be important for the assessment of the equipment operation and there
are increasing demands from operators and asset managers to fully exploit the capabilities of this
data in order to optimize the utilization of high voltage electrical plant."
2.1 Required Information Structure
To keep complete records of the condition of high voltage cables, an information system needs to
consider three types of data. First, Equipment data, such as the identity and location of the cable
section and related information. Second, Activity data needs to be stored on the measurements
done, when they were done and which methods were used. Third, Condition data, such as the
actual results of the measurements and the conclusions drawn on equipment status. This
taxonomy is in line with [11] and [12], with the exception of a fourth category of data, Norms,
which we here include in the equipment data.
- Equipment data consists of static data about the component in question, e.g. a medium voltage cable: its location, where it is connected, when it was commissioned, the manufacturer, year of manufacture, technical specifications and so on. An information model must be able to store these data but also be able to represent relations between cables, joints and terminations as well as other cables. In other words, the data model must allow run-time updates.
- Activity data consists of information about inspections and measurements made. The data stored includes the type of measurement, when it was done, who did it, which sensors were used and so on. Of course, the information model must allow the activities to be related to equipment and vice versa.
- Condition data is the actual measurement results, both in terms of measured quantities and the resulting measured value(s), together with any conclusions drawn. The condition data is related both to the equipment and to the activity.
In the following presentation of the standards and the ensuing analysis, these three terms will be
used to describe the types of data needed.
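To make the three categories concrete, the sketch below outlines one possible record structure. It is a minimal illustration in Python; the class and field names are illustrative assumptions and are not taken from any of the IEC models discussed in section 3.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class Equipment:
    """Static data, including norms/thresholds, e.g. for a MV cable section."""
    asset_id: str
    kind: str                       # e.g. "MV XLPE cable section"
    commissioned: int               # year of commissioning
    max_pd_level_pC: float          # design threshold ("norm"), assumed field
    connected_to: List[str] = field(default_factory=list)  # joints, adjacent sections

@dataclass
class Activity:
    """What was measured, when, by whom, with which method and calibration."""
    activity_id: str
    asset_id: str
    performed: datetime
    method: str                     # e.g. "PD measurement, capacitive sensor"
    performed_by: str
    calibration: Optional[str] = None

@dataclass
class Condition:
    """Measured results and conclusions, linked to equipment and activity."""
    activity_id: str
    asset_id: str
    quantity: str                   # e.g. "PD magnitude"
    value: float
    unit: str                       # e.g. "pC"
    conclusion: str = ""
```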
2.2 Diagnosing Power cables
For high voltage cables, the condition of the equipment is best determined by checking the status
of the insulation, see [11]. For insulation diagnostics, two types of methods are most common:
partial discharge measurement and dielectric response measurement in the time and frequency
domains [13].
2.2.1 Data for Partial Discharge measurements
Partial discharge measurement is the most common diagnostic method for power cables,
especially since techniques for measuring PD with the cable on-site are available. There exist
several different PD measurement techniques, which can be electrical, optical or acoustic. Of
these, the electrical methods are the most common. The most commonly used sensors are of
capacitive, inductive and capacitive-inductive types. There exists a large variety of electrical PD
measurement methods, differing in the voltages and frequencies used and in how these are varied
or kept fixed during the measurements.
From [11] and [14] we have that for partial discharge measurements the following parameters are
of interest: inception and extinction voltages, PD magnitude, PD intensity, PD pattern (i.e.
harmful or less harmful), PD location (i.e. in cable or in accessories) and mapping (i.e.
concentrated or scattered).
2.2.2 Data for Dielectric Response Measurements
Just as for PD measurements, there exist several methods for dielectric response measurements.
Dielectric response (DR) is an advanced tool and still requires more research and development to
verify its applicability to different types of cables and degradation phenomena. The dielectric
response changes when the insulation is degraded, and by measuring with different voltages and
frequencies the status of the insulation can be investigated. It has been shown that use of Very
Low Frequencies is an appropriate measurement technique for identification of water trees in
XLPE insulated cables especially when combined with loss factor measurements.
2.2.3 Further considerations
A complicating factor from an asset management perspective is that a circuit connecting two
points may very well consist of several sections of cable connected with joints. These cable
sections may also be of varying age and type. This implies that a complete record of the condition
of a series of cable sections connecting two points will require data to be stored about several
cable instances and joints.
2.3 Summary
Summarizing the above measurement data, we have the following for the three data categories
defined earlier.
2.3.1 Equipment Data
In addition to the obviously necessary information such as manufacturer, year of commissioning,
location etc., two groups of data are important: first, the thresholds and limits that the cable has
been designed for and that define permissible operating conditions; second, data about how the
cable is connected to other cable segments.
2.3.2 Activity Data
The activity data that must be stored, in addition to obvious data such as the time of measurement
and who performed it, is data about which measurement type was used, which system was used
and how it was calibrated.
2.3.3 Measurement Data

Which measurement data needs to be stored depends on which method is used for the diagnostic.
For example, for electric PD measurements using capacitive sensors the data differs from that of
dielectric response measurements at VLF combined with loss factor measurements.
3 STANDARDISED INFORMATION MODELS

This section opens with an overview of the standards included in this study. Following this
introduction is an in-depth presentation of how power system equipment is represented in the
data models of IEC 61968/70 and 61850, respectively. The focus of these presentations is on data
for condition based maintenance.
3.1 Information Modeling
Information models and the modeling methods are at the core of the IEC standards. The models
provide an image of the analogue world (power system process, switchgear) which is to be
represented in the Information System. By standardizing the model, information exchange
mechanisms can be specified which in turn are what the system vendors need to implement in
order to be compliant with the standard.
3.1.1 The IEC 61850 standard
The 61850 standard covers communication and automation of substations. As a consequence,
switchgear and power transformers have been extensively modelled, while the standard contains
fewer details regarding equipment outside substations, i.e. cables, overhead lines and generating
equipment. Still, the standard does include some details about the data related to conducting
equipment, although not as complete as the model contained in IEC 61968.
A core component of the 61850 standard is the logical node, which represents data and
functionality, similar to an object in object-oriented analysis. The logical nodes are implemented
in the IEDs and represent parts of, or complete, real-world objects. These real-world objects need
not necessarily be power system components but can also be abstract entities such as
measurements. A logical node may be implemented across several IEDs or be contained within
one IED. In total, the 61850 standard defines 98 logical nodes, organized in 13 groups, see Table
1 below. Of the 98 logical nodes, some 50% have been extensively modeled.
Group Indicator    Logical node groups
A                  Automatic Control
C                  Supervisory control
G                  Generic Function References
I                  Interfacing and Archiving
L                  System Logical Nodes
M                  Metering and Measurement
P                  Protection Functions
R                  Protection Related Functions
S a)               Sensors, Monitoring
T a)               Instrument Transformer
X a)               Switchgear
Y a)               Power Transformer and Related Functions
Z a)               Further (power system) Equipment

a) LNs of this group exist in dedicated IEDs if a process bus is used. Without a process bus, LNs of this group are the I/Os in the hardwired IED one level higher (for example in a bay unit), representing the external device by its inputs and outputs (process image, see Figure B.5 for example).

Table 1 The 13 groups of logical nodes defined in the IEC 61850 standard [15]

3.1.2 The IEC 61968 & 61970 standards.


The Common Information Model (CIM) standardized by the IEC [7] is built upon the CCAPI
project initiated by EPRI. The objective of the CCAPI was to facilitate the integration of IT
applications developed independently by different vendors. The initial focus was on EMS
applications, but the scope has grown and, through the development of the IEC 61968 series of
standards [4], now covers most aspects of IS integration at utilities. Both these standards are built
around the CIM, which in turn provides a model of a complete power system using UML
notation, mainly in the form of class diagrams.
The CIM is a model that contains all major objects at a utility and in its grid. The model includes
classes and attributes for the objects and, perhaps most importantly, the relationships between the
objects. For convenience the model is partitioned into several packages, but it is actually one
single connected class diagram. In essence the CIM is data-driven: by modeling the utility and its
grid using class diagrams, the CIM provides a comprehensive view of the attributes and relations
of the data that will be managed in the utility's IS.
Technically, the CIM and its extensions for distribution management are developed by separate
working groups in IEC TC 57. The work is, however, well coordinated and can be considered
one collective modeling effort. In addition to the work within the IEC, a number of extensions to
address shortcomings of the CIM are being proposed by researchers. For example, in [16] some
more detailed requirements on equipment information for maintenance are proposed in terms of
extensions to the CIM.
3.2 Representation of High Voltage cables
The following two sections describe how power cables are represented in the two standards.
3.2.1 Power Cables in IEC 61850
In IEC 61850 information about power cables are contained in the Logical Node Group Z for
further power system equipment. Power cables are represented by one single logical node with
the acronym ZCAB [17]. The attributes of the ZCAB logical node are defined in [15], see Table 2
below for easy reference

Attribute Name                     Attr. Type    Explanation
LNName                             -             Shall be inherited from Logical-Node Class (see IEC 61850-7-2)
Common Logical Node Information    -             LN shall inherit all Mandatory Data from Common Logical Node Class
EEHealth                           INS           External equipment health
EEName                             DPL           External equipment name plate
OpTmh                              INS           Operation time

Table 2 ZCAB logical node attributes as defined in IEC 61850 [15]

As shown in Table 2 above, very little information about power cables can be represented by the
attributes of the logical node. This is perhaps not surprising given the standard's focus on
substation equipment. For a description of what the data attributes contain, please see Annex A.
The mandatory data inherited from the Common Logical Node can be found in Annex B.
The 61850 standard contains some other logical nodes of interest. These are, however, not
specifically related to power cables, but can potentially be used by an IED manufacturer to store
condition based data. For example, the RFLO logical node, see Table 3 below, could very well be
used to store information about a line segment. It does, however, not contain all the information
needed.
Attribute Name                     Attr. Type    Explanation
LNName                             -             Shall be inherited from Logical-Node Class (see IEC 61850-7-2)
Common Logical Node Information    -             LN shall inherit all Mandatory Data from Common Logical Node Class
OpCntRs                            INC           Resetable operation counter
Measured values:
FltZ                               CMV           Fault impedance
FltDiskm                           MV            Fault distance in km
Status information:
FltLoop                            INS           Fault loop
Settings:
LinLenKm                           ASG           Line length in km
R1                                 ASG           Positive-sequence line resistance
X1                                 ASG           Positive-sequence line reactance
R0                                 ASG           Zero-sequence line resistance
X0                                 ASG           Zero-sequence line reactance
Z1Mod                              ASG           Positive-sequence line impedance value
Z1Ang                              ASG           Positive-sequence line impedance angle
Z0Mod                              ASG           Zero-sequence line impedance value
Z0Ang                              ASG           Zero-sequence line impedance angle
Rm0                                ASG           Mutual resistance
Xm0                                ASG           Mutual reactance
Zm0Mod                             ASG           Mutual impedance value
Zm0Ang                             ASG           Mutual impedance angle

Table 3 The RFLO logical node, which contains data related to faults in a circuit

Similarly, the logical node MMXU for measurements, see Table 4 below, can be used to store
data from measurements. However, this node is not specifically intended for diagnostic
measurements either, but rather holds general information for operative purposes.
Attribute Name                     Attr. Type    Explanation
LNName                             -             Shall be inherited from Logical-Node Class (see IEC 61850-7-2)
Common Logical Node Information    -             LN shall inherit all Mandatory Data from Common Logical Node Class
EEHealth                           INS           External equipment health (external sensor)
Measured values:
TotW                               MV            Total active power (total P)
TotVAr                             MV            Total reactive power (total Q)
TotVA                              MV            Total apparent power (total S)
TotPF                              MV            Average power factor (total PF)
Hz                                 MV            Frequency
PPV                                DEL           Phase-to-phase voltages (VL1VL2, ...)
PhV                                WYE           Phase-to-ground voltages (VL1ER, ...)
A                                  WYE           Phase currents (IL1, IL2, IL3)
W                                  WYE           Phase active power (P)
VAr                                WYE           Phase reactive power (Q)
VA                                 WYE           Phase apparent power (S)
PF                                 WYE           Phase power factor
Z                                  WYE           Phase impedance

Table 4 The MMXU logical node for measurements

The standard also contains a set of logical nodes related to diagnostics; however, these are
specialized for switchgear (three logical nodes) and transformers (two logical nodes). These
logical nodes do contain attributes such as, for example, the acoustic level of a partial discharge
in a breaker.
3.2.2 Power Cables in IEC 61968/970
The 61968 and 61970 standards are extensive and cover an even larger scope than the 61850
standard. Since 61970 and 61968 have different scopes, the same physical resource, such as a
power cable, is modeled as several separate instances. For instance, a connection between two
points is modeled both as a Line for electric purposes and as a cable or overhead line for asset
management purposes. For this study, only the asset management perspective is of interest. In
IEC 61968, power cables are contained within the LinearAssetHierarchy [18], shown in Figure 1
below.

Figure 1 The CableAsset Model in the Linear Asset hierarchy

The 61968 standard also contains classes to represent information about inspection and
maintenance activities. These are contained in the WorkInspectionMaintenance package; this
model is shown in Figure 2 below.

Figure 2 The WorkInspectionMaintenance model.

The 61970 standard provides a Measurement package, which contains models with data that can
be used to represent the results of measurements such as partial discharge diagnostics. The model
is, however, very generic and intended for operational purposes, i.e. SCADA/EMS applications,
similar to the measurement logical node in IEC 61850, see 3.2.1 above.
4 ANALYSIS

The analysis has been done by verifying to which degree, and in which format, the different types
of data defined in section 2.1 are available in the two standards. Please note that the purpose is
not to rank one of the standards above the other, but rather to identify potential overlaps or areas
of condition based data that have not received proper attention in the standards. Also note that it
is not implied that the standards should contain the data specified in section 2.1.
4.1 Equipment Data


For static equipment data, both standards contain a basic subset of attributes that can be used to
store information about the cable section. The 61968 model does contain slightly more detailed
information about cable characteristics. Both, however, lack attributes to store information about
the values for which the cable has been designed, i.e. limits or thresholds which shall not be
exceeded.
In the 61968 standard, the relation between cable sections can be modeled by using the relation
with CircuitSections and Lines, see Figure 1 above. This is, however, not possible with the logical
nodes in 61850. The relation can of course be created in the system anyway, but this would
require the addition of non-standardized relations between the logical nodes.
4.2 Activity Data
Activity data is not available in the 61850 logical nodes, which is unsurprising since such
information is not suitable to store at the IED level. The 61968 models provide sufficient entities
and attributes to store information about performed inspections/diagnostics and the results and
conclusions drawn from them.
4.3 Measurement data
The weakest point of both standards is the storage of measurement data. Both standards provide
a generic measurement object that contains some attributes related to measurements. However,
none of these are specifically tailored for PD or DR measurements. Very likely this is because
many different methods exist, and a standard that included all necessary parameters would be too
detailed.
4.4 Concluding Remarks
Clearly, the standards investigated do not include sufficient attributes to store detailed
information about PD and DR measurements. This is especially true for measurement data,
which, depending on the measurement method, can be very extensive.
The IEC 61968 standard does, however, cater for a large share of the CBM data needs. The
CableAsset model combined with the WorkInspectionMaintenance model can be used to store the
conclusions drawn from PD or DR measurements. Since specialized condition assessment tools,
such as PD measurement equipment, are delivered as stand-alone software packages, it can be
argued that it is not necessary for these applications to be standards compliant as long as they
provide interfaces to external systems. These interfaces may very well be made compliant with
the information exchange mechanisms that these standards also specify.
Additionally, the fact that a specific data point is or is not included in a standard does not
preclude vendors from providing that data in their systems. As long as additional data is marked
as an extension to the standard and an interface is available to query it, the data will be
accessible and can be made available.
4.5 Further Work
This study does not include the activities carried out within Cigre working group D1.17 on HV
Asset Condition Assessment Tools, Data Quality - Expert Systems. A more detailed study of the
relation between the work done in this working group and the results gained in this study is a
suitable next step.

REFERENCES
[1] P. Verho, K. Nousiainen, "Advanced use of information systems provides new possibilities for asset management of high voltage devices," in Proc. NORD-IS 03 Nordic Insulation Symposium.
[2] J. Endrenyi et al., "The present status of maintenance strategies and the impact of maintenance on reliability," IEEE Trans. Power Systems, vol. 16, no. 4, pp. 638-646, Nov. 2001.
[3] C-G. Lundqvist, J. Jacobsson, "Enhancing customer services by efficient integration of modern IT-systems for asset management and network operation," in Proc. of Powercon 2002, Kunming, China, October 14-16, 2002.
[4] IEC 61968-1, Application integration at electric utilities - System interfaces for distribution management - Part 1: Interface architecture and general requirements, 1st ed., 2003-10.
[5] Standish Group, The CHAOS Report, The Standish Group International, Inc., London, UK, 2003.
[6] Common Information Model CIM 10 Version, Report no. 10019765, Final report, EPRI, Nov. 2001.
[7] IEC 61970-301, Energy management system application program interface (EMS-API) - Part 301: Common Information Model (CIM) Base, 1st ed., 2003-11.
[8] IEC TR 62357, Power system control and associated communications - Reference architecture for object models, services and protocols, 1st ed., 2003-07.
[9] IEC 61850-1, Communication networks and systems in substations - Part 1: Introduction and overview, International standard, IEC, Geneva, Switzerland, 2003.
[10] http://www.maintenanceresources.com/ReferenceLibrary/MaintenanceManagement/KeyTerms.htm, accessed 6 August 2005.
[11] B. Quak, E. Gulski, J.J. Smit, F.J. Wester, E.R.S. Groot, "Database support for condition assessment of distribution power cables," in Conference Record of the 2002 IEEE International Symposium on Electrical Insulation, Boston, MA, USA, April 7-10, 2002.
[12] E. Gulski, F.J. Wester, B. Quak, J.J. Smit, F. de Vries, M.B. Mayoral, "Datamining for decision support of CBM of power cables," in Proceedings of CIRED 17th International Conference on Electricity Distribution, Barcelona, 12-15 May 2003.
[13] P. Hyvonen, B. Oyegoke, M. Aro, "Advanced diagnostic test and measurement methods for power cable systems on-site," Technical report TKK-SJT-49, Helsinki University of Technology, High Voltage Institute, Espoo, Finland, 2001.
[14] B. Oyegoke, P. Hyvonen, M. Aro, "Partial discharge measurement as diagnostic tool for power cable systems," Technical report TKK-SJT-45, Helsinki University of Technology, High Voltage Institute, Espoo, Finland, 2001, ISBN 951-22-5394-1.
[15] IEC 61850-7-4, Communication networks and systems in substations - Part 7-4: Basic communication structure for substation and feeder equipment - Compatible logical node classes and data classes, International standard, IEC, Geneva, Switzerland, 2003.
[16] X. Dong, Y. Liu, F.A. LoPinto, K.P. Scheibe, S.D. Sheetz, "Information model for power equipment diagnosis and maintenance," in Proceedings of Power Engineering Society Winter Meeting, 2002, IEEE, 2002.
[17] IEC 61850-5, Communication networks and systems in substations - Part 5: Communication requirements for functions and device models, International standard, IEC, Geneva, Switzerland, 2003.
[18] http://www.cimuser.org, revision CIM10r6, accessed August 6th, 2005.

ANNEX A
This annex lists the Common Data Attributes used in the IEC 61850 standard. N.B. Please ignore
references to tables and documents within these tables.
DPL class

Attribute Name    Attribute Type       FC    TrgOp    Value/Value Range    M/O/C

DataName          Inherited from Data Class (see IEC 61850-7-2)

DataAttribute - configuration, description and extension
vendor            VISIBLE STRING255    DC
hwRev             VISIBLE STRING255    DC
swRev             VISIBLE STRING255    DC
serNum            VISIBLE STRING255    DC
model             VISIBLE STRING255    DC
location          VISIBLE STRING255    DC
cdcNs             VISIBLE STRING255    EX                                 AC_DLNDA_M
cdcName           VISIBLE STRING255    EX                                 AC_DLNDA_M
dataNs            VISIBLE STRING255    EX                                 AC_DLN_M

Services          As defined in Table 45

Table A.1 The Device Name Plate (DPL) data attribute class


INS class

Attribute Name    Attribute Type       FC    TrgOp    Value/Value Range    M/O/C

DataName          Inherited from Data Class (see IEC 61850-7-2)

DataAttribute - status
stVal             INT32                ST    dchg
q                 Quality              ST    qchg
t                 TimeStamp            ST                                 M

DataAttribute - substitution
subEna            BOOLEAN              SV                                 PICS_SUBST
subVal            INT32                SV                                 PICS_SUBST
subQ              Quality              SV                                 PICS_SUBST
subID             VISIBLE STRING64     SV                                 PICS_SUBST

DataAttribute - configuration, description and extension
d                 VISIBLE STRING255    DC
dU                UNICODE STRING255    DC
cdcNs             VISIBLE STRING255    EX                                 AC_DLNDA_M
cdcName           VISIBLE STRING255    EX                                 AC_DLNDA_M
dataNs            VISIBLE STRING255    EX                                 AC_DLN_M

Services          As defined in Table 13

Table A.2 The Integer status (INS) class

ANNEX B
Common logical node class from which the logical nodes inherit generic information.

Attribute Name    Attr. Type    Explanation                        M/O

LNName            Shall be inherited from Logical-Node Class (see IEC 61850-7-2)

Data
Mandatory Logical Node Information (shall be inherited by ALL LN but LPHD)
Mod               INC           Mode                               M
Beh               INS           Behaviour                          M
Health            INS           Health                             M
NamPlt            LPL           Name plate                         M

Optional Logical Node Information
Loc               SPS           Local operation                    O
EEHealth          INS           External equipment health          O
EEName            DPL           External equipment name plate      O
OpCntRs           INC           Operation counter resettable       O
OpCnt             INS           Operation counter                  O
OpTmh             INS           Operation time                     O

Data Sets (see IEC 61850-7-2): inherited and specialised from Logical-Node class (see IEC 61850-7-2)
Control Blocks (see IEC 61850-7-2): inherited and specialised from Logical-Node class (see IEC 61850-7-2)
Services (see IEC 61850-7-2): inherited and specialised from Logical-Node class (see IEC 61850-7-2)

_____________________________________________________________________________

ON-LINE CONDITION MONITORING OF HIGH VOLTAGE CIRCUIT-BREAKERS

Mr. Shui-cheong Kam
Queensland University of Technology, Australia
s2.kam@student.qut.edu.au
1. Introduction
Transients occur in power systems due to a large variety of causes. Circuit-breaker operation,
faults, harmonic resonances and lightning all produce transients with different characteristics.
Precursors of many different types of power system equipment failures produce fast transients
on power systems, for example:
a) degradation and impending failure of power system equipment can be identified from fast
transient overvoltages;
b) intermittent faults between phase and ground, such as trees blown by the wind briefly
touching power lines, may not cause circuit breakers to operate but may cause transient
voltages;
c) polluted insulators, when wet, may cause transient discharges;
d) circuit-breakers originally considered to be restrike free are now known to restrike
during opening;
e) occasionally, switching may cause ferro-resonance, producing characteristically
deformed system overvoltages.
There has been considerable interest in transients occurring in power systems [1, 2] because
transients have caused operational problems with computers and other electronic equipment [3].
Previous research has aimed at detecting the magnitude and duration of transients on power
systems with a view to estimating the quality of the electricity supply [4, 5]. Many researchers
have used computers as condition monitoring equipment and applied modern techniques such as
wavelet analysis to the detection of transient phenomena [6, 7] and artificial neural networks
(NNs) to the classification of the type of events causing the disturbance [8, 9]. Much work has
been done on transient disturbances over many years. Nowadays there is an interest in
quantifying the distortions and avoiding their effect on power supply waveforms; this field is
referred to as power quality. There has been considerable research in this area, but little
attempt to interpret transient waveforms to provide information about the condition of power
system equipment. Prognostic condition assessment, which would allow most power equipment
maintenance to be performed during regularly scheduled service rather than on an emergency
basis after failure, would greatly reduce the total cost. The needs in the power industry are
similar to those for monitoring the main engines on Navy ships.
2. Methodology
This project examines the characteristics of transient phenomena occurring in power systems
and explores the possibility of automatically processing real waveforms to obtain information
about power system equipment condition. Examples considered include circuit-breaker
restriking when switching shunt capacitors or reactors. To achieve this goal, we first create a
database of three-phase capacitor bank transient waveforms, with and without restrike, and
relate this to real-life data to build up a database of typical waveforms. This database will then
be used in the design of new methods of restrike detection and classification. The database will
initially be populated with data from ATP (Alternative Transients Program) simulations of
switching with and without restrikes. Distinctive features of transient waveforms from the
database will be determined for various conditions and pre-failure conditions using advanced
computational techniques such as wavelet analysis, artificial neural network (NN) modeling,
parallel NNs and statistical methods. It is also intended to test the algorithms on data covering a
wide range of transients produced by phenomena other than switching. A more realistic
situation that can then be studied is extending the work to the detection and classification of
power system equipment deterioration caused by transients. Once confidence has been
developed in the effectiveness of the methodology, it is intended to explore the possibility of
applying the procedures to other power system phenomena: ferroresonance, arcing, conductor
clashing, etc.
The writer has investigated capacitor switching and restriking features, using the wavelet
transform for feature extraction and a self-organising map (SOM) neural model for data mining
and visualization. The results from the high performance computing environment and the
desktop computing environment are compared. This PhD programme aims initially at
developing diagnostic and prognostic algorithms for specific equipment operations to improve
asset management, and then at developing a parametric estimation (sensitivity analysis) model
and a power system equipment monitoring system that would locate and categorize impending
failures by on-line analysis of transients.
3. Objectives of the Program of Research and Investigation
The overall aim of the next three years (full-time or equivalent) of the doctoral research program
is to investigate techniques for improving asset management by estimating power system
equipment condition from transient disturbance waveforms.
As an example of estimating power system equipment condition from transient disturbances,
waveforms from transient phenomena carry information about circuit breaker delayed closing
time, arcing, energy level and other quantities, which indicate whether the circuit breaker is
healthy or deteriorating. This is the motivation of this research project. The other motivation is
to improve asset management of power systems by on-line detection and analysis of power
system transient phenomena. The intended outcome of the research is a set of techniques,
including an artificial NN expert system with a parametric estimation (sensitivity analysis)
model and a power system equipment monitoring system, for improving asset management by
detecting transient phenomena caused by power switching equipment deterioration.
The specific aims for the next three years (or equivalent) of the doctoral research program are to:
- develop, initially, diagnostic and prognostic algorithms for a specific operation such as
capacitor switching;
- develop a parametric estimation (sensitivity analysis) model and a power equipment
monitoring system that would locate and categorize impending failures by on-line analysis of
transients.
4. Literature Review
This literature review covers the knowledge discovery approach, electrical faults, transient
analysis, and detection and classification problems for power quality disturbances, with the
objective of developing an innovative research approach to the problem and appropriate
methods drawing on the ideas of other researchers. Special attention is given to assumptions,
problems encountered, and the methods applied to those problems. The parameters and various
neural models will be investigated to determine the research direction for developing diagnostic
and prognostic algorithms for power system equipment with a parametric estimation (sensitivity
analysis) model for improving asset management. The algorithms will be used to detect and
analyse the waveforms, and the artificial NN models will provide a quantitative trend index
indicating power system equipment deterioration caused by transient phenomena, for power
system equipment failure prediction.
It is important to distinguish faults from other switching events, because switching operations
may cause transients similar to fault-induced transients. If the fault type can be identified, a
specific fault can be located and cleared. A transient is initiated whenever there is a sudden
change of circuit conditions; this most frequently occurs when a switching operation or a fault
takes place.
A variety of methods have been considered and used for reducing transient inrush currents on
capacitor bank switching. The surge impedance can be increased by the intentional addition of
inductance in the form of a reactor. Switching resistors can be added to damp the oscillation.
One approach is to use synchronous switching, which means closing the switching device at a
selected point in the cycle, ideally when there is no voltage between the contacts. This requires
careful sensing, precise and consistent mechanical operation of the switch, and independent
operation of the poles of the three phases. However, there is always the problem of restrike: the
tendency for the inter-contact gap to break down and establish current before the contacts
physically touch. If the natural frequency is high (if L or C or both are very small), the voltage
across the switch contacts rises very quickly; if this rate of rise of the system recovery voltage
exceeds the rate of build-up of dielectric strength in the medium between the contacts, the
breaker will be unable to hold off the voltage and a so-called restrike will occur. This usually
results in the switch carrying fault current for at least another half cycle; the voltage involved is
described as the restriking voltage [10].
Artificial neural networks (NNs) have been applied successfully to the detection and
classification of power quality disturbances and to electrical fault location. Part of the analysis
in this research requires artificial NNs to classify power system events: the data obtained from
modeling is fed to an artificial NN in order to classify the power system events that occur under
any circumstances.
The review covers the following topics:
- the knowledge discovery approach;
- transient disturbances on power systems;
- transients associated with system failures and power system equipment deterioration;
- diagnostic (classifier) techniques for classifying transient disturbances;
- prognostic (projection and trending) techniques for prediction of power system equipment
failure.
5. Conclusion
There has been extensive previous work on the detection of power system transients using
wavelets and neural networks, but it is based on the analysis of transients produced by very
simple models, and there has been almost no attempt (except for Van Rensburg's work on arcing
faults [1]) to investigate arcing faults for disturbance classification and detection. Much of the
research has focused on educating customers about the ramifications of power quality; new
developments in instrumentation and network analysis are generally acknowledged as promising
factors towards solutions for power quality problems [81]. Most power quality research has been
concerned with detecting and classifying transient disturbances in order to ascertain the power
quality before appropriate mitigation action can be taken. In order to assure consumers that their
electricity supply is free of disturbances, and for equipment sensitivity studies, there has been
extensive previous work on the detection of power system transient disturbances and fault
location using wavelets and neural networks. There is, however, very limited research on the
detection of transient phenomena caused by power system equipment deterioration. Extending
the work to the detection and classification of power system equipment deterioration from
transients, using advanced computational techniques such as wavelet analysis and artificial NN
modeling, addresses a more realistic situation.
Transients will be calculated for a number of well-known types of power system equipment
maloperation using a powerful transient simulation program [3] that has been developed by the
power systems research community over more than thirty years. This program has been
demonstrated to give realistic and accurate solutions for most types of complex transients in
three-phase systems. It is proposed to simulate those events occurring in power systems that are
associated with power system equipment deterioration or failure and that cause transients in the
network. Events to be investigated may include: restriking in power system equipment,
disconnection of capacitor banks and three-phase reactors, temporary short-term transients
caused by brief contacts with trees and other objects, short-duration voltage sags due to
temporary arcing, voltage depression followed by fuse or recloser operation, and maloperation
of motors and distributed generators if found necessary.
The literature review has shown that, among the parameters used for detection and classification
of transient disturbances, most of the work on fault location and classification algorithms
concentrates on the time and frequency domains, while less attention is paid to restriking
voltage. Previous research has aimed at detecting the magnitude and duration of transients on
power systems with a view to estimating the quality of the electricity supply. Therefore, there
exists a possibility of developing diagnostic and prognostic algorithms for power system
equipment, including artificial NN models with wavelet analysis.
Since wavelet analysis and artificial intelligence systems have been identified as tools for power
quality applications [24], the following research tasks are proposed:
a) Initially, determine whether the approach can be used to identify the cause of capacitor
and shunt reactor restriking when the circuit breaker disconnects. This phenomenon
occurs infrequently but can cause degradation of circuit breakers.
b) In the initial part of the research plan, run a wide range of capacitor switching
simulations, with and without restriking, to build up a suitable three-phase system
database.

c) Explore the possibility of automatic detection and classification of equipment
deterioration from the characteristics of transient phenomena appearing on power
networks. It is proposed to research the use of suitable analysis tools, such as wavelets,
to extract features related to the characteristic time between events.
d) Develop techniques for automatic recognition of events occurring on power
distribution systems, such as the occurrence of faults or the switching of lines or other
equipment.
e) Detect abnormal conditions such as circuit breaker failures due to restriking voltage.
f) Develop fast algorithms for event detection and new techniques for classifying types
of disturbances.
g) Apply these techniques to network asset management, for example by building up a
history of abnormal events.
h) Feature extraction.
i) Develop a parametric estimation (sensitivity analysis) model to indicate power system
equipment deterioration caused by transients, using wavelet transforms. Different NNs
will be explored to select an appropriate error measure for different types of power
system equipment.
6. Summary of Research and Results
The literature review of recent research on the power quality problem is directed towards
investigating techniques for estimating power system equipment condition from transient
waveforms. The power quality problem is defined as any power problem manifested in voltage,
current or frequency deviations that results in failure or misoperation of customer equipment.
The following problem is identified from the literature review:
- Is it possible to improve asset management of power systems by on-line detection and
analysis of power system transient phenomena?
Advanced computational techniques such as the wavelet transform, artificial NNs, parallel NNs
and statistical methods are used to estimate power system equipment condition from transient
disturbances. The following sections explore the investigation path undertaken within the
research field, with justifications and results to date.
6.1 Modeling of Power System Transient Disturbances
Many researchers have used different algorithms and models of distribution lines to study
transients for fault location and classification problems. Their results can be used to determine
the parameters and simulate data for artificial NN training and model development. The
efficiency and accuracy of each of the NN models will be assessed, and the most suitable models
for data simulation will be selected.
For parameter investigation and simulation data production, each model will be analysed using
ATP-EMTP. ATPDraw is a graphical preprocessor with which the user can build up an electric
circuit; based on the graphical drawing of the circuit, ATPDraw generates the ATP-EMTP file in
the appropriate format. This output file is then executed by the ATP-EMTP software and the
results are saved to a file, which is used for analysis.

6.2 Investigating Feature Extraction Schemes for Power System Disturbances

Different feature extraction schemes will be explored, and an appropriate one selected, such as
the following method (a sketch is given below):
- obtain the original transient signals using Teager's operator [96], [97];
- down-sample the signatures;
- obtain the normalised auto-correlation functions of the primary signals;
- prepare feature patterns in a format suitable for input to a neural network classifier.
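
As an illustration only, the following Python sketch implements one plausible reading of this scheme: the Teager-Kaiser energy operator, down-sampling, a normalised auto-correlation, and truncation to a fixed-length feature vector. The signal, the down-sampling factor and the feature length are illustrative assumptions, not values from the report.

    import numpy as np

    def teager(x: np.ndarray) -> np.ndarray:
        """Teager-Kaiser energy operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
        return x[1:-1] ** 2 - x[:-2] * x[2:]

    def feature_vector(signal: np.ndarray, decimate: int = 4, length: int = 32) -> np.ndarray:
        """Build a fixed-length feature pattern for a NN classifier."""
        e = teager(signal)
        e = e[::decimate]                       # down-sample the signature
        e = e - e.mean()
        acf = np.correlate(e, e, mode="full")[e.size - 1:]
        acf = acf / acf[0]                      # normalised auto-correlation
        return acf[:length]                     # truncate to the NN input length

    # Example: a decaying 1 kHz oscillation sampled at 20 kHz (synthetic stand-in).
    t = np.arange(0, 0.02, 1 / 20000)
    x = np.exp(-200 * t) * np.sin(2 * np.pi * 1000 * t)
    print(feature_vector(x).shape)              # (32,)
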

6.3 Design of New Methods of Restrike Detection and Classification with Diagnostic and
Prognostic Algorithms
Diagnostic and prognostic algorithms for power system equipment, including an artificial NN
expert system, will be developed using the available current transformers. Hardware will be
developed and installed in a local substation, and software will be installed using the method
stated in Section 6.6.
Quantification of severe restriking is accomplished by comparing the minimum quantisation
error (MQE) of a newly acquired waveform with a predetermined threshold. The alarm
threshold is determined using the uniformly most powerful test, without prior statistical
information on the restrike signature. Numerical results show that a properly selected threshold
for the MQE ensures a high rate of severe restrike detection, following procedures developed by
Kang [18, 98]. Alternatively, severe restriking is modeled using ATP, and the digital data from
the simulations are processed in MATLAB. Results from different conditions are presented for
a range of high voltage network conditions for the SOM training data. An equipment
deterioration index k is defined as the ratio between the discrepancy/errors for severe restriking
and the discrepancy/errors for normal restriking; it is used as a quantitative trend index for the
circuit breaker condition:
    k = E(severe) / E(normal)                                    (1)

where E(severe) and E(normal) are the errors/discrepancies for severe and normal restriking,
respectively.
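
A minimal sketch of this MQE-based scheme is given below, using the third-party minisom package as a stand-in for the SOM implementation; the synthetic feature data, map size and training parameters are illustrative assumptions, not the report's actual configuration.

    import numpy as np
    from minisom import MiniSom  # third-party SOM package, used here as a stand-in

    rng = np.random.default_rng(0)
    normal_feats = rng.normal(0.0, 1.0, size=(150, 32))   # stand-in training vectors
    severe_feats = rng.normal(0.8, 1.2, size=(150, 32))   # stand-in "severe" vectors

    # Train the SOM on normal-restrike features only.
    som = MiniSom(6, 6, 32, sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(normal_feats, 1000)

    def mqe(som: MiniSom, v: np.ndarray) -> float:
        """Minimum quantisation error: distance to the best-matching unit."""
        w = som.get_weights().reshape(-1, v.size)
        return float(np.min(np.linalg.norm(w - v, axis=1)))

    # Equation (1): deterioration index as the ratio of mean MQEs.
    E_severe = np.mean([mqe(som, v) for v in severe_feats])
    E_normal = np.mean([mqe(som, v) for v in normal_feats])
    k = E_severe / E_normal
    print(f"k = {k:.3f}")   # k > 1 suggests deterioration relative to normal switching
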


[Figure: flowchart. Inputs: transient waveforms from power systems (simulation or
measurement) -> data format conversion with MATLAB -> signal processing techniques such as
wavelet-transform-based feature extraction for detection -> diagnostic and prognosis algorithms
for classification and power system equipment failure -> outputs: detection and selection of the
disturbance with a power system equipment deterioration index.]

Figure 1. Detection, classification and estimation of a power system equipment deterioration
index: flowchart
Capacitor restrike modeling can be done in many different ways; the problem is the data
required for the model. Very complex restrike models exist, but manufacturers normally do not
have the data needed to use them. A simple model can be simulated by using a voltage-
controlled (flashover) switch, or simply by putting a voltage-controlled switch in parallel with a
switch representing the circuit breaker. However, the ATP rules do not allow switches to be
directly paralleled; a small resistance (e.g. 1.0e-6 Ohm) must be inserted between the switch
nodes. A switch is added in parallel with the original switch (using low-value resistors), set up
such that the first switch is closed at the start of the simulation and opens at 20 ms, de-energising
the capacitor. The second switch is set up as a flashover (voltage-controlled) switch arranged to
flash over when the voltage across it reaches 15,000 V. Restriking occurs because the trapped
charge results in an elevated voltage across the switch. Usually the transient recovery voltage is
also involved, and the peak voltage occurs at the first voltage peak after the initial current
interruption, i.e. after 10 ms. If the restrike voltage is lower, the restrike can of course occur
earlier, some time after 5 ms.
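
The trapped-charge mechanism can be sketched numerically as follows. This is a simplified single-phase illustration, not the ATP model itself: the 15 kV flashover level and the 20 ms opening time follow the description above, while the source amplitude and the lossless trapped charge are simplifying assumptions.

    import numpy as np

    F = 50.0                 # system frequency, Hz
    V_PK = 9000.0            # source peak voltage, V (illustrative value)
    V_FLASH = 15000.0        # flashover level of the voltage-controlled switch, V
    T_OPEN = 0.020           # breaker opens at 20 ms, at a current zero

    # In a capacitive circuit the current zero coincides with a voltage peak,
    # so the capacitor is left holding the trapped-charge voltage +V_PK.
    t = np.arange(T_OPEN, T_OPEN + 0.02, 1e-5)
    v_source = V_PK * np.cos(2 * np.pi * F * t)    # voltage peak at t = 20 ms
    v_switch = v_source - V_PK                      # recovery voltage across the switch

    # Restrike when the recovery voltage magnitude reaches the flashover level.
    idx = np.argmax(np.abs(v_switch) >= V_FLASH)
    if np.abs(v_switch[idx]) >= V_FLASH:
        print(f"restrike at {(t[idx] - T_OPEN) * 1e3:.2f} ms after interruption")
    else:
        print("no restrike: recovery voltage stays below the flashover level")

With these values the recovery voltage peaks at twice the source peak half a cycle (10 ms) after interruption, consistent with the description above, and the 15 kV level is reached somewhat earlier.
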
The wavelet transform is utilized to extract additional features for differentiating normal
capacitor switching from capacitor switching with restrike. The voltage change at switching is
used as an additional feature for capacitor switching, beyond those stated in [77]. As shown in
[99], Daubechies wavelets have been found to produce good results; Db5 is used here. The
scale/level is application dependent; in this case two scales/levels are sufficient for this
application.
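
A minimal sketch of such a feature extraction with the PyWavelets package is shown below; using the energy of each coefficient band as the feature is an assumption for illustration, not necessarily the exact features used in the project.

    import numpy as np
    import pywt  # PyWavelets

    def wavelet_features(signal: np.ndarray) -> np.ndarray:
        """Two-level Db5 decomposition; energy of each coefficient band as features."""
        coeffs = pywt.wavedec(signal, "db5", level=2)   # [cA2, cD2, cD1]
        return np.array([np.sum(c ** 2) for c in coeffs])

    # Example on a synthetic switching-like waveform with a superimposed burst.
    t = np.linspace(0, 0.04, 800)
    v = np.sin(2 * np.pi * 50 * t)
    v[400:420] += 0.5 * np.sin(2 * np.pi * 5000 * t[400:420])  # restrike-like burst
    print(wavelet_features(v))   # burst energy shows up in the detail bands
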

The software method uses a SOM as a classifier, which classifies the operating state of a circuit
breaker into a capacitor switching state and a restrike voltage state. The artificial NN classifier is
trained using a data set generated by the Alternative Transients Program (ATP). The trained
SOM is then used for on-line condition monitoring. The complete description of the parallel NN
methodology is not within the scope of this project report (please refer to [100]). Nevertheless,
it is important to discuss the basic design procedure, which involves the following steps:
- feature selection;
- architecture of the SOM and its training algorithm;
- performance evaluation.
After the feature vectors have been obtained, feature vector normalization is applied to separate
the vectors further and improve recognition performance. These vectors are used to train an
artificial NN. Artificial NNs are slow when handling large input vectors; this is why feature
extraction techniques are used to extract a limited number of features from the original signals as
the inputs to the artificial NN.
There were two sets of 150 training vectors, one for each test developed in the training set. The
training results are listed in Tables 1 and 2.
Table 1. Desktop Computer Results

                                                          Rectangular    Hexagon
Final quantization error, capacitor switching
(normal restriking)                                       0.294          0.504
Final quantization error (severe restriking)              0.326          0.612

From equation (1), the equipment index for the Rectangular type is
k = 0.326 / 0.294 = 1.109, and for the Hexagon type k = 0.612 / 0.504 = 1.191.

Table 2. High Performance Computer Results

                                                          Single CPU                   Four CPUs
Starting-epoch discrepancy after 60 iterations,
capacitor switching (normal restriking)                   from 644.8617 to 48.57735    from 612.7516 to 19.41228
Starting-epoch discrepancy after 60 iterations
(severe capacitor switching)                              from 644.8617 to 52.36634    from 612.7516 to 23.56772

From equation (1), the equipment index (single CPU) is k = 52.36634 / 48.57735 = 1.078.

Comparing the results from the desktop computer and the HPC, we can conclude that the HPC
results are more consistent than the desktop computer results, even when different random
numbers are generated, with both a single CPU and four CPUs, whereas the desktop computer
generates different results each time. This is because the HPC completes the job faster and more
accurately within the defined number of iterations (e.g. the number of loops); in other words, the
HPC converges to the result faster and more accurately. It should be pointed out that, due to the
diversity of cases in power disturbances, the neural model could be further trained and tested
with field-recorded data. MATLAB codes for NNs (MLP, RBF, SOM, SVM, etc.) will be
tested, and parallel SOM Fortran codes will be tested on high performance computers, for neural
network and parallel NN learning and for selecting the appropriate error measure, since the
measures may perform differently over different ranges [87]. Modeling and simulation, pattern
generation and design of the neural-network-based algorithm for a circuit breaker, as well as
simulation results with and without parallel NNs, will be the next immediate tasks.
6.4 Controlled Experiments with Signatures and Field Data
Using established models, an investigation will be conducted into the restrictions that the model
parameters of the current diagnostic and prognostic techniques exhibit. From this investigation,
opportunities to overcome these restrictions with artificial NN software will be established.
Based on these possibilities, the latest neural model software developments for electrical
engineering applications will be explored. Once the model is validated in simulations with
on-line data, application to other on-line data will commence. The other on-line data will contain
unknown parameters, so the success of the new methods on this new data will not be
straightforward to evaluate. The most promising means of evaluation is cross-validation of the
model with the derived techniques in field evaluation. The performance criteria outlined in
Section 6.5 will form the basis of the evaluation in the controlled experiments.
6.5 Prototype Algorithm Development and Testing with Different Methods
Simulated data will be used to verify the model. Regression analysis of desired output against
actual output will be performed to check the accuracy of the model. Once the final validation of
the new model is established, implementation considerations will become the research focus.
The computational time required by portable/notebook computers is one consideration; another
is the amount of data necessary to give the model enough information to be reliable. Both will
be considered. Pre-processing and post-processing of the data, as well as the handling of noisy
and missing data, will also be taken into account in the implementation. Different methods will
be evaluated, including neural network and statistical methods, and different NNs will be
explored to select an appropriate error measure for each type of power system equipment.
The parametric estimation (sensitivity) model will be based on the optimization principle or on
global optimization sensitivity analysis, as derived from the book Simulation-Based
Optimization.
The following variables will be used by the diagnostic and prognostic algorithms to determine
the time-to-failure estimate of power system equipment (e.g. severe restrikes for circuit
breakers), correlating measured transient waveforms with the database of known waveforms:
1) feature extraction methods;
2) a degree of severity defined for the word "severe";
3) error measures from neural network (NN) software computation;
4) ranges of the error measures for each type of NN with different performance measures,
such as the SOM;
5) statistical quality control algorithms such as CUSUM (see the sketch after this list);
6) different software and computer hardware.
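
As an illustration of item 5, the following sketch applies a one-sided CUSUM chart to a sequence of MQE values; the reference value, allowance and decision threshold are illustrative assumptions, not parameters from the report.

    import numpy as np

    def cusum_alarm(x: np.ndarray, target: float, k: float, h: float) -> int:
        """One-sided CUSUM: return the first index where the upper sum exceeds h,
        or -1 if no alarm. k is the allowance, h the decision threshold."""
        s = 0.0
        for i, xi in enumerate(x):
            s = max(0.0, s + (xi - target - k))
            if s > h:
                return i
        return -1

    # Example: MQE sequence drifting upward after sample 50 (synthetic stand-in).
    rng = np.random.default_rng(1)
    mqe = np.concatenate([rng.normal(0.30, 0.02, 50), rng.normal(0.36, 0.02, 50)])
    print(cusum_alarm(mqe, target=0.30, k=0.01, h=0.1))   # alarms shortly after 50
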
6.6 Field Trials and Effectiveness Assessment

It is required to design a quantitative trend index to indicate the deterioration of power system
equipment condition caused by transient phenomena. Parameters such as the amount of data
required, appropriate feature selection methods and an appropriate NN model will be determined
from the waveforms. It is likely that changes in these parameters will provide information about
the deterioration of power system equipment condition. Wavelet transforms, as a time-scale
technique, will be explored as one of the most appropriate methods for effective processing and
analysis of transient waveforms. A number of wavelet-based algorithms will be developed to
determine the index on the basis of power system equipment deterioration, which may cause
changes in the amplitudes and time delays between transient phenomena, from field evaluation
[54].
6.7 Develop a Condition Monitoring System for On-line Analysis of Circuit-breakers
Once the model is validated in simulations with on-line data, application to other on-line data
will commence. The other on-line data will contain unknown parameters, so the success of the
new methods on this new data will not be straightforward to evaluate. The most promising
means of evaluation is cross-validation of the model with the derived techniques in field
evaluation. The performance criteria outlined in Section 6.5 will form the basis of the
evaluation. A condition monitoring system will then be developed that will locate and categorise
impending failures by on-line analysis of transients.
6.8 Validation of the Circuit Breaker Model with Diagnostic and Prognostic Algorithms
For modelling air and SF6 circuit breakers, the Mayr and Cassie equations are normally used.
The problem for a particular circuit breaker (CB) is to determine the time constants in these
equations. Modelling of a vacuum circuit breaker depends on what is needed: a model to
simulate the switching of small inductive/capacitive currents, or the switching of high currents;
the models are completely different. For the first case, the arc voltage is about +/- 20 V, so the
model can be implemented with IF statements in MODELS to observe when the arc reignites
and when it extinguishes. The criterion is to compare the TRV with the dielectric recovery of
the VCB. Arc quenching is simulated by the critical di/dt at the current zero. Moreover, the
chopping current, which determines the first reignition, is needed.
For high current modeling, the problem is to determine the post-zero current. There are also a
few models derived from the so-called Andrew-Vary equation.
Normally, for the determination of the TRV caused by circuit breakers, the Mayr and Cassie
models are preferred to physical models; however, as the behaviour of the arc conductivity
differs under different conditions, the Mayr model is used for small currents (reactor and
capacitor bank interruption) and the Cassie model for large currents (fault interruption). The
characteristics for SF6, air and oil are given by the time constant, the arc voltage and the power
loss constant, depending on the model. Data obtained from real CB operations could be useful
for model validation. There are modifications of the latter models, and different ones such as the
Kopplin and Urbanek models; unfortunately, there is no arc model included in the ATPDraw
library, but it is possible to create a user model in TACS or MODELS.
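
As an illustration of the Mayr model mentioned above, the sketch below integrates the Mayr arc conductance equation, dg/dt = (g/tau)(u*i/P0 - 1), with a simple Euler step for a prescribed sinusoidal current; the time constant and cooling power are illustrative values, not parameters of any particular breaker.

    import numpy as np

    TAU = 0.3e-6      # arc time constant, s (illustrative)
    P0 = 30e3         # cooling power, W (illustrative)
    F = 50.0          # system frequency, Hz
    I_PK = 10e3       # prescribed arc current amplitude, A

    dt = 1e-8
    t = np.arange(0.0095, 0.0101, dt)           # window around a current zero
    i = I_PK * np.sin(2 * np.pi * F * t)

    g = np.empty_like(t)
    g[0] = 1e4                                   # initial arc conductance, S
    for n in range(len(t) - 1):
        u = i[n] / g[n]                          # arc voltage from prescribed current
        # Mayr equation: dg/dt = (g / tau) * (u * i / P0 - 1)
        g[n + 1] = max(g[n] + dt * (g[n] / TAU) * (u * i[n] / P0 - 1.0), 1e-8)

    print(f"conductance at current zero: {g[np.argmin(np.abs(i))]:.3e} S")

With a prescribed current the equation reduces to dg/dt = (i^2/P0 - g)/tau, so the conductance tracks i^2/P0 and collapses toward zero at the current zero, which is the interruption behaviour the model is meant to capture.
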

6.9 Conclusion of Results
At this point in the research program, a valid approach to potentially improving asset
management has been established: developing diagnostic and prognostic algorithms for power
system equipment. Based on the literature review, previous researchers have addressed feature
extraction with wavelet analysis, auto-correlation and normalisation before input to an NN for
training and testing, with trending and projection as the last stage. Emerging technologies such
as ATP modelling and parallel NN learning are introduced for NN performance improvement, as
the distinguishing contribution of this work, and these diagnostic and prognostic algorithms are
applied to power system equipment condition monitoring.
The problem addressed by this research project is waveform recognition to identify power
system equipment deterioration for asset management condition monitoring. This is a pattern
recognition, or condition diagnosis, problem. In solving this problem there are two approaches:
mathematical and non-mathematical modelling. It is impossible to derive all equations for the
power system equipment; an NN is more appropriate for this problem because of its learning
ability and its non-mathematical modelling that nevertheless captures the underlying relations.
Although some researchers use fuzzy logic to handle the uncertainty or to obtain exact rules
from the power system, the NN is the ideal candidate for this problem. However, NNs require
long computation times for training; therefore, parallel NN learning algorithms are implemented,
following Markov decision processes.
A quantitative trend index has been developed to indicate power system equipment deterioration
caused by transient phenomena and will be validated with power equipment in site evaluation.
The immediate task involves the evaluation of the neural model with simulated data and
field-recorded data. Field evaluation will then be the last part of developing the diagnostic and
prognostic algorithms for power system equipment with on-line data.
7. References
[1] K. J. Van Rensburg, Analysis of Arcing Faults on Distribution Lines for Protection and Monitoring, School of Electrical & Electronic Systems Engineering, Queensland University of Technology, Brisbane, Australia, 2003.
[2] A. Hussain, M. H. Sukairi, A. Mohamed, and R. Mohamed, "Automatic detection of power quality disturbances and identification of transient signals," presented at Signal Processing and its Applications, Sixth International Symposium, 2001.
[3] W. Jun and T. K. Saha, "Simulation of power quality problems on a university distribution system," presented at Power Engineering Society Summer Meeting, 2000, IEEE, 2000.
[4] D. O. Koval, R. A. Bocancea, K. Yao, and M. B. Hughes, "Frequency and duration of voltage sags and surges at industrial sites - Canadian National Power Quality Survey," presented at Industry Applications Conference, 1997, Thirty-Second IAS Annual Meeting, IAS '97, Conference Record of the 1997 IEEE, 1997.
[5] E. Styvaktakis, M. H. J. Bollen, and I. Y. H. Gu, "Expert system for voltage dip classification and analysis," presented at Power Engineering Society Summer Meeting, 2001, IEEE, 2001.
[6] R. Flores, Signal Processing Tools for Power Quality Event Classification, Chalmers University of Technology, Goteborg, 2003.
[7] J. Chung, E. J. Powers, W. M. Grady, and S. C. Bhatt, "An automatic voltage sag detector using a discrete wavelet transform and a CFAR detector," presented at Power Engineering Society Summer Meeting, 2001, IEEE, 2001.
[8] E. Styvaktakis, M. H. J. Bollen, and I. Y. H. Gu, "Classification of power system transients: synchronised switching," Power Engineering Society Winter Meeting, 2000, IEEE, vol. 4, pp. 2681-2686, 2000.
[9] L. Reznik and M. Negnevitsky, "A neuro-fuzzy method of power disturbances recognition and reduction," presented at Fuzzy Systems, 2002, FUZZ-IEEE'02, Proceedings of the 2002 IEEE International Conference on, 2002.
[10] A. Greenwood, Electrical Transients in Power Systems, 2nd ed., John Wiley & Sons, Inc., 1991.
[11] S. Santoso and J. D. Lamoree, "Power quality data analysis: from raw data to knowledge using knowledge discovery approach," presented at Power Engineering Society Summer Meeting, 2000, IEEE, 2000.
[12] M. M. Begovic and P. M. Djuric, "Power system disturbance monitoring using spectrum analysis," presented at Circuits and Systems, 1996, ISCAS '96, Connecting the World, 1996 IEEE International Symposium on, 1996.
[13] S. J. Huang, C. L. Huang, and C. T. Hsieh, "Application of Gabor transform technique to supervise power system transient harmonics," Generation, Transmission and Distribution, IEE Proceedings, vol. 143, pp. 461-466, 1996.
[14] U. A. Khan, S. B. Leeb, and M. C. Lee, "A multiprocessor for transient event detection," Power Delivery, IEEE Transactions on, vol. 12, pp. 51-60, 1997.
[15] S. T. Mak, "Application of a differential technique for characterization of waveform distortions," presented at Power Engineering Society Winter Meeting, 2000, IEEE, 2000.
[16] O. C. Montero-Hernandez and P. N. Enjeti, "A fast detection algorithm suitable for mitigation of numerous power quality disturbances," presented at Industry Applications Conference, 2001, Thirty-Sixth IAS Annual Meeting, Conference Record of the 2001 IEEE, 2001.
[17] I. Y. H. Gu, M. H. J. Bollen, and E. Styvaktakis, "The use of time-varying AR models for the characterization of voltage disturbances," presented at Power Engineering Society Winter Meeting, 2000, IEEE, 2000.
[18] P. Kang, On-line Condition Assessment of Power Transformer On-load Tap-Changers: Transient Vibration Analysis Using Wavelet Transform and Self Organising Map, School of Electrical and Electronic Systems Engineering, Queensland University of Technology, Brisbane, Australia, 2000, 247 pp.
[19] G. T. Heydt and K. J. Olejniczak, "The Hartley series and its application to power quality assessment," presented at Industry Applications Society Annual Meeting, 1991, Conference Record of the 1991 IEEE, 1991.
[20] M. Moechtar, T. C. Cheng, and L. Hu, "Transient stability of power system - a survey," presented at WESCON/95 Conference Record: Microelectronics, Communications Technology, Producing Quality Products, Mobile and Portable Power, Emerging Technologies, 1995.
[21] A. Poeltl and K. Frohlich, "Two new methods for very fast fault type detection by means of parameter fitting and artificial neural networks," presented at Power Engineering Society 1999 Winter Meeting, IEEE, 1999.
[22] Z. Ye and B. Wu, "Simulation of electrical faults of three phase induction motor drive system," presented at IEEE 32nd Annual Power Electronics Specialists Conference, 2001.

[23] Z. Ye and B. Wu, "Simulation of electrical faults of three phase induction motor drive system," presented at Power Electronics Specialists Conference, 2001, PESC 2001 IEEE 32nd Annual, 2001.
[24] W. R. Anis Ibrahim and M. M. Morcos, "Artificial intelligence and advanced mathematical tools for power quality applications: a survey," Power Delivery, IEEE Transactions on, vol. 17, pp. 668-673, 2002.
[25] S. M. Halpin, "An improved simulation algorithm for the determination of motor starting transients," presented at Industry Applications Society Annual Meeting, 1994, Conference Record of the 1994 IEEE, 1994.
[26] E. S. Lee and A. S. Herbert, Jr., "Fault isolation within power distribution systems," presented at Telecommunications Energy Conference, 1991, INTELEC '91, 13th International, 1991.
[27] S. Ertem and Y. Baghzouz, "A fast recursive solution for induction motor transients," Industry Applications, IEEE Transactions on, vol. 24, pp. 758-764, 1988.
[28] A. D. Stokes and W. T. Oppenlander, "Electric arcs in open air," Journal of Physics D: Applied Physics, vol. 24, pp. 26-35, 1991.
[29] M. Akke and J. S. Thorp, "Improved estimates from the differential equation algorithm by median post-filtering," presented at Developments in Power System Protection, Sixth International Conference on (Conf. Publ. No. 434), 1997.
[30] T. Segui, P. Bertrand, H. Guillot, P. Hanchin, and P. Bastard, "Fundamental basis for distance relaying with parametrical estimation," Power Delivery, IEEE Transactions on, vol. 16, pp. 99-104, 2001.
[31] Z. M. Radojevic, V. V. Terzija, and N. B. Djuric, "Numerical algorithm for overhead lines arcing faults detection and distance and directional protection," Power Delivery, IEEE Transactions on, vol. 15, pp. 31-37, 2000.
[32] M. Fikri and M. A. H. El-Sayed, "New algorithm for distance protection of high voltage transmission lines," Generation, Transmission and Distribution, IEE Proceedings C, vol. 135, pp. 436-440, 1988.
[33] A. C. Parsons, W. M. Grady, E. J. Powers, and J. C. Soward, "Rules for locating the sources of capacitor switching disturbances," presented at Power Engineering Society Summer Meeting, 1999, IEEE, 1999.
[34] D. D. Sabin, T. E. Grebe, D. L. Brooks, and A. Sundaram, "Rules-based algorithm for detecting transient overvoltages due to capacitor switching and statistical analysis of capacitor switching in distribution systems," presented at Transmission and Distribution Conference, 1999, IEEE, 1999.
[35] E. Styvaktakis, M. H. J. Bollen, and I. Y. H. Gu, "Classification of power system transients: synchronised switching," presented at Power Engineering Society Winter Meeting, 2000, IEEE, 2000.
[36] P. P. Pericolo and D. Niebur, "Discrimination of capacitor transients for position identification," presented at Power Engineering Society Winter Meeting, 2001, IEEE, 2001.
[37] M. Kezunovic and Y. Liao, "Fault location estimation based on matching the simulated and recorded waveforms using genetic algorithms," presented at Developments in Power System Protection, 2001, Seventh International Conference on (IEE), 2001.
[38] S. Santoso, J. D. Lamoree, and M. F. McGranaghan, "Signature analysis to track capacitor switching performance," presented at Transmission and Distribution Conference and Exposition, 2001, IEEE/PES, 2001.

[39] H. D. Hagerty and W. G. Barie, "Switched capacitors control feeder regulation with thyristor motor starter," presented at Petroleum and Chemical Industry Conference, 2002, Industry Applications Society 49th Annual, 2002.
[40] D. F. Peelo and E. M. Ruoss, "A new IEEE application guide for shunt reactor switching," Power Delivery, IEEE Transactions on, vol. 11, pp. 881-887, 1996.
[41] Z. Ma, C. A. Bliss, A. R. Penfold, A. F. W. Harris, and S. B. Tennakoon, "An investigation of transient overvoltage generation when switching high voltage shunt reactors by SF6 circuit breaker," Power Delivery, IEEE Transactions on, vol. 13, pp. 472-479, 1998.
[42] H. Ito, "Controlled switching technologies, state-of-the-art," presented at Transmission and Distribution Conference and Exhibition 2002: Asia Pacific, IEEE/PES, 2002.
[43] Takemoto, "Surge propagation characteristics on submarine repeatered line," IEEE Transactions on Communications, vol. 26, pp. 1421-1425, 1978.
[44] H.-S. Song, H.-G. Park, and K. Nam, "An instantaneous phase angle detection algorithm under unbalanced line voltage condition," presented at Power Electronics Specialists Conference, 1999, PESC 99, 30th Annual IEEE, 1999.
[45] J. Pienaar and P. H. Swart, "Ferro-resonance in a series capacitor system supplying a submerged-arc furnace," presented at Africon Conference in Africa, 2002, IEEE AFRICON, 6th, 2002.
[46] P. J. Moore, E. J. Bartlett, and M. Vaughan, "Fault location using radio frequency emissions," presented at Developments in Power System Protection, 2001, Seventh International Conference on (IEE), 2001.
[47] G. Brauner and C. Hennerbichler, "Voltage dips and sensitivity of consumers in low voltage networks," presented at Electricity Distribution, 2001, Part 1: Contributions, CIRED, 16th International Conference and Exhibition on (IEE Conf. Publ. No. 482), 2001.
[48] D. C. Robertson, O. I. Camps, J. S. Mayer, and W. B. Gish, "Wavelets and electromagnetic power system transients," IEEE Transactions on Power Delivery, vol. 11, pp. 1050-1056, 1995.
[49] O. Poisson, P. Rioual, and M. Meunier, "Detection and measurement of power quality disturbances using wavelet transform," presented at Harmonics and Quality of Power, 1998, Proceedings, 8th International Conference on, 1998.
[50] C. Xiangxun, "Wavelet-based measurement and classification of power quality disturbances," IEEE Transactions on Power Delivery, vol. 17, pp. 38-39, 2002.
[51] C. H. Lee and S. W. Nam, "Efficient feature vector extraction for automatic classification of power quality disturbances," Electronics Letters, vol. 34, pp. 1059-1061, 1998.
[52] C.-M. Chen and K. A. Loparo, "Electric fault detection for vector-controlled induction motors using the discrete wavelet transform," presented at American Control Conference, 1998, Proceedings of the 1998, 1998.
[53] G. T. Heydt, P. S. Fjeld, C. C. Liu, D. Pierce, L. Tu, and G. Hensley, "Applications of the windowed FFT to electric power quality assessment," Power Delivery, IEEE Transactions on, vol. 14, pp. 1411-1416, 1999.
[54] P. Kang and D. Birtwhistle, "Condition assessment of power transformer onload tap changers using wavelet analysis and self-organizing map: field evaluation," Power Delivery, IEEE Transactions on, vol. 18, pp. 78-84, 2003.
[55] M. Kezunovic and Y. Liao, "A novel software implementation concept for power quality study," Power Delivery, IEEE Transactions on, vol. 17, pp. 544-549, 2002.
[56] M. Kezunovic, I. Rikalo, C. W. Fromen, and D. R. Sevcik, "Expert system reasoning streamlines disturbance analysis," IEEE Computer Applications in Power, pp. 15-19, 1994.

[57] E. Styvaktakis, Automatic Power Quality Analysis, Department of Electric Power Engineering and Department of Signals and Systems, Chalmers University of Technology, Goteborg, 2002, 218 pp.
[58] M. Kezunovic, X. Xu, and Y. Liao, "Advanced software developments for automated power quality assessment using DFR," IEEE Computer Applications in Power, pp. 1-6, 2002.
[59] M. Wang, G. I. Rowe, and A. V. Mamishev, "Real-time power quality waveform recognition with a programmable digital signal processor," presented at Power Engineering Society General Meeting, 2003, IEEE, 2003.
[60] S. Santoso, W. M. Grady, E. J. Powers, J. Lamoree, and S. C. Bhatt, "Characterization of distribution power quality events with Fourier and wavelet transforms," IEEE Transactions on Power Delivery, vol. 15, 1998.
[61] S. Santoso, W. M. Grady, E. J. Powers, J. Lamoree, and S. C. Bhatt, "Characterization of distribution power quality events with Fourier and wavelet transforms," IEEE Transactions on Power Delivery, vol. 15, 1998.
[62] J. Chung, E. J. Powers, W. M. Grady, and S. C. Bhatt, "Power disturbance classifier using a rule-based method and wavelet packet-based hidden Markov model," Power Delivery, IEEE Transactions on, vol. 17, pp. 233-241, 2002.
[63] M. Kezunovic and Y. Liao, "The use of genetic algorithms in validating the system model and determining worst-case transients in capacitor switching simulation studies," presented at Harmonics and Quality of Power, 2000, Proceedings, Ninth International Conference on, 2000.
[64] S. Vasilic and M. Kezunovic, "Fuzzy ART neural network algorithms for classifying the power systems faults," IEEE Computer Applications in Power, pp. 1-9, 2002.
[66] P. K. Dash, S. K. Panda, A. C. Liew, and K. S. Lock, "Tracking power quality disturbance waveforms via adaptive linear combiners and fuzzy decision support systems," presented at Power Electronics and Drive Systems, 1997, Proceedings, 1997 International Conference on, 1997.
[67] J. S. Lee, C. H. Lee, J. O. Kim, and S. W. Nam, "Classification of power quality disturbances using orthogonal polynomial approximation and bispectra," Electronics Letters, vol. 33, pp. 1522-1524, 1997.
[68] M. H. J. Bollen, M. R. Qader, and R. N. Allan, "Stochastical and statistical assessment of voltage dips," presented at Tools and Techniques for Dealing with Uncertainty (Digest No. 1998/200), IEE Colloquium on, 1998.
[69] L. D. Zhang and M. H. J. Bollen, "A method for characterizing unbalanced voltage dips (sags) with symmetrical components," Power Engineering Review, IEEE, vol. 18, pp. 50-52, 1998.
[70] M. Wang, P. Ochenkowski, and A. Mamishev, "Classification of power quality disturbances using time-frequency ambiguity plane and neural networks," presented at Power Engineering Society Summer Meeting, 2001, IEEE, 2001.
[71] W. Kexing, S. Zhengxiang, C. Degui, W. Jianhua, and G. Yingsan, "Digital identification of voltage sag in distribution system," presented at Power System Technology, 2002, Proceedings, PowerCon 2002, International Conference on, 2002.
[72] E. J. Bartlett and P. J. Moore, "Analysis of power system transient induced radiation for substation plant condition monitoring," Generation, Transmission and Distribution, IEE Proceedings, vol. 148, pp. 215-221, 2001.
[73] E. J. Bartlett and P. J. Moore, "Analysis of power system transient induced radiation for substation plant condition monitoring," Generation, Transmission and Distribution, IEE Proceedings, vol. 148, pp. 215-221, 2001.

__________________________________________________________________________
AM Course Report Shui-cheong Kam Page 15

_____________________________________________________________________________
[74]

[75]

[76]
[77]

[78]

[79]

[80]
[82]
[83]

[84]

[85]

[86]

[87]

[88]

[89]

[90]

H. N. Nagamani, S. N. Moorching, Channakeshava, and T. Basavaraju, On-line


diagnostic technique for monitoring partial discharges in capacitor banks, presented at
Solid Dielectrics, 2001. ICSD 01. Proceedings of the 2001 IEEE 7th International
Conference on, 2001.
P. Kang, D. Birtwhistle, and K. Khouzam, Transient signal analysis and classification
for condition monitoring of power switching equipment using wavelet transform and
artificial neural networks, presented at Knowledge-Based Intelligent Electronic
Systems, 1998. Proceedings KES 98. 1998 Second International Conference on, 1998.
C. H. L. S. W. Nam, Efficient feature vector extraction for automatic classification of
power quality disturbances, Electronics Letters, vol. 34, pp. 1059-1061, 1998.
S. Santoso, W. M. Grady, E. J. Powers, J. Lamoree, and S. C. Bhatt, Characterization of
distribution power quality events with Fourier and wavelet transforms, Power Delivery,
IEEE Transactions on, vol. 15, pp. 247-254, 2000.
J. Huang, A Simulated Power Quality Disturbance Recognition System, School of
Computing and Information Technology, University of Western Sydney, Las Vegas TR
No. CIT/18/2003, June 22-26 2003.
A. J. V. Miller and M. B. Dewe, The application of multi-rate digital signal processing
techniques to the measurement of power system harmonic levels, Power Delivery, IEEE
Transactions on, vol. 8, pp. 531-539, 1993.
A. Chandrasekaran and A. Sundaram, Unified software approach to power quality
assessment and evaluation, IEEE Computer Applications in Power, pp. 404-408, 1995.
X. X. M. Kezunovic, Y. Liao, Advanced Software Developments for Automated Power
Quality Assessment Using DFR, IEEE Computer Applications in Power, pp. 1-6, 2002.
Y. H. Song, A. T. Johns, and Q. Y. Xuan, Neural network based techniques for
distribution line condition monitoring, presented at Reliability of Transmission and
Distribution Equipment, 1995., Second International Conference on the, 1995.
B. Perunicic, M. Mallini, Z. Wang, and Y. Liu, Power quality disturbance detection and
classification using wavelets and artificial neural networks, presented at Harmonics And
Quality of Power, 1998. Proceedings. 8th International Conference on, 1998.
F. Mo and W. Kinsner, Probabilistic neural networks for power line fault classification,
presented at Electrical and Computer Engineering, 1998. IEEE Canadian Conference on,
1998.
T. Funabashi, A. Poeltl, and K. Frohlich, Two new methods for very fast fault type
detection by means of parameter fitting and artificial neural networks&rdquo, Power
Delivery, IEEE Transactions on, vol. 15, pp. 1344-1345, 2000.
G. Vachtsevanos and P. Wang, Fault prognosis using dynamic wavelet neural
networks, presented at AUTOTESTCON Proceedings, 2001. IEEE Systems Readiness
Technology Conference, 2001.
S.-L. Hung, C. S. Huang, C. M. Wen, and Y. C. Hsu, Nonparameetric identification of a
Building structure from experimental Data Using Wavelet Neural Network, ComputerAided Civil and iNfrastructure Engineering, vol. 18, pp. 356-368, 2003.
C. R. Parikh, M. J. Pont, N. B. Jones, and F. S. Schlindwein, Improving the performance
of CMFD applications using multiple classifiers and a fusion framework, Transactions
of the Institute of Measurement and Control, vol. 25, pp. 123-144, 2003.
P. C. Pendharkar and J. A. Rodger, Technical efficiency-based selection of learning
cases to improve forecasting accuracy of neural networks under monotonicity
assumption, Decision Support Systems, vol. 36, pp. 117-136, 2003.

__________________________________________________________________________
AM Course Report Shui-cheong Kam Page 16

_____________________________________________________________________________
[91]

U. Seiffert, Artificial Neural Networks on Massively Parallel Computer hardware,


ESANN2002 proceedings - European Symposium on Artificial Neural Networks Bruges
(Belgium), 24-26 April, pp. 319-330, 2002.
[92] D. Ernst, M. Glavic, and L. Wehenkel, Power Systems Stability Control: Reinforcement
Learning Framework, IEEE Transactions on Power Systems, vol. 19, pp. 427-435, 2004.
[93] M. Stace and S. M. Islam, Condition monitoring of power transformers in the Australian
State of New South Wales using transfer function measurements, presented at Properties
and Applications of Dielectric Materials, 1997., Proceedings of the 5th International
Conference on, 1997.
[94] S. F. Hiebert and R. B. Chinnam, Role of artifcial neural networks and wavelets in online reliability monitoring of physical systems, IEEE Transactions on Neural Networks,
pp. 369-374, 2000.
[95] E.-W. Lee, J.-G. Kim, T.-S. Kim, and S.-C. Lee, A study on the reduction method and
the analysis of VCB switching surge for high voltage induction motor, Transactions of
the Korean Institute of Electrical Engineers, vol. 43, pp. 761-769, 1994.
[96] R. B. Dunn, T. F. Quatieri, and J. F. Kaiser, Detection of transient signals using the
energy operator, presented at Acoustics, Speech, and Signal Processing, 1993. ICASSP93., 1993 IEEE International Conference on, 1993.
[97] J. F. Kaiser, On a simple algorithm to calculate the energy of a signal, presented at
Acoustics, Speech, and Signal Processing, 1990. ICASSP-90., 1990 International
Conference on, 1990.
[98] P. Kang and D. Birtwhistle, Self-organsing map for fault detection, ANNIE 99 Smart
Engineering Design, St Louise, 1999.
[99] X. Xu and M. Kezunovic, Automated feature extraction from power system transients
using Wavelet Transform, IEEE Computer Applications in Power, pp. 1994-1998, 2002.
[100] J. E. Vitela, U. R. Hanebutte, J. L. Gordillo, and L. M. Cortina, Comparative
performance study of parallel programming models in a neural network training code,
International Journal of Modern Physics, vol. 13, pp. 429-452, 2002.
28 July 2005

__________________________________________________________________________
AM Course Report Shui-cheong Kam Page 17

RELIABILITY OF PROTECTIVE SYSTEMS: RELAYING PHILOSOPHY AND
COMPONENT RELIABILITY

Gabriel Olguin
ABB Corporate Research
gabriel.olguin@ieee.org

Introduction
Reliability evaluation is an important and integral part of the planning, design and operation of all engineering systems, from the smallest and simplest to the largest and most complex. There are many definitions of reliability and the term itself means different things to different people. A broad and frequently used definition says: "Reliability is the probability of a device performing its purpose adequately for the period of time intended under the operating conditions encountered" [1]. In the context of power systems, reliability is the knowledge field within power engineering that treats the ability of the power system to perform its intended function [2]. Even though restricted to power systems, this is still a rather wide definition, since the intended function can be interpreted in many ways. In particular, when referring to the reliability of a protective system (a power system component), the definition of reliability needs to be more precise. Since the main objective of a protective system is to isolate the faulted area, there are two contrasting ways for the protective system to fail to perform its intended function: the relay may trip when it should not, or may not trip when it should. We will review a number of relevant concepts regarding protective relaying that will help us to conceptualize the reliability of protective systems.
The assessment of reliability as such is not new, since engineers have always tried to operate systems free from failures. In the past, however, reliability was mainly achieved through qualitative and subjective judgment. Engineering judgment will always be necessary, but formal numerical techniques are nowadays available to help in the task of reliability assessment.
In practice, system reliability concerns the development of methods to predict the long term expected performance of the system; in power system reliability in particular, in terms of the number and duration of interruptions. Prediction should be understood here as a long term estimate of the expected performance of the system. No reliability analysis will come to a conclusion like "an interruption of service will occur at 20:00 in feeder number 12 of substation 10". Instead, a reliability analysis may conclude that there is a high probability that the number of interruptions affecting feeder 12 of substation 10 exceeds 4 interruptions per year. The simplest prediction method is based on extrapolating the historical behavior of the system. This approach basically assumes that what has happened will happen again. Other, more elaborate techniques model each power system component as a stochastic component and interconnect these models to represent the functional behavior of the system. The system under study is thus split into stochastic components. The choice of components is rather arbitrary and depends essentially on the purpose of the study. A complete HV/MV substation could be a single component for the purpose of a reliability analysis of the power supply to a particular consumer. On the other hand, one single protective relay could be modeled by several stochastic components.

Before presenting some analytic and numerical techniques aimed at performing reliability
assessment, some basic but important concepts regarding protective relaying will be introduced.
Protective Relaying
Relaying is the field of electric power engineering concerned with the principles of design and operation of equipment that detects abnormal power system conditions and initiates corrective action as quickly as possible in order to return the power system to its normal state [3]. These intelligent devices are called relays or protective relays, and the quickness of their response is an essential element of protective relaying systems. The function of protective relaying is to cause the prompt removal from service of any element of a power system when it suffers a short circuit or starts to operate in any abnormal manner that might cause damage or interfere with the effective operation of the rest of the system [5]. The relaying equipment is aided in this task by circuit breakers, which are capable of disconnecting the faulty element when called upon to do so by the relaying equipment. Fusing is employed when relays and circuit breakers are not economically justifiable.
Circuit breakers are usually located so that each power system component (generator, transformer, bus, transmission line, etc.) can be completely disconnected from the rest of the system. Circuit breakers must be able to momentarily carry the maximum short circuit current available at the point of installation. They must even be able to withstand closing on such a short circuit condition.
Relaying must operate not only to protect the system from short circuit faults but also from other abnormal conditions that may result in damage to the system. In particular, the protection of rotating machines considers a number of relays (or a single multifunction relay) aimed at protecting the machine against several abnormal conditions such as over- and under-frequency, over- and under-excitation, vibrations, loss of prime mover, non-synchronized connection, unbalanced currents, sub-synchronous oscillations, etc. In some cases the relay's task is to signal the existence of a failure or abnormal condition by activating an alarm; in other cases the relay commands a circuit breaker action. In all these protective functions, relays, signal transducers and circuit breakers are interconnected to provide a quick alarm and/or complete removal of the protected equipment.
Another function of protective relaying is to provide an indication of the location and type of failure. Such data assist in expediting the repair. To be effective, the relaying practice must observe fundamental principles such as protection zones, primary and backup protection, sensitivity, selectivity, dependability and security. All these concepts and principles, and other relevant protective relaying concepts, are defined and discussed in the following paragraphs.
Protective relaying definitions
Failure versus fault: a failure is the termination of the ability of an item to perform its required function, whereas faults are short circuits caused by dielectric breakdown of the insulation system. A failure does not need to be a fault, but a fault usually leads to a failure. Faults can be categorized as self-clearing, temporary, and permanent. A self-clearing fault extinguishes itself without any external intervention. A temporary fault is a short circuit that will clear after the faulted component (typically an overhead line) is de-energized and reenergized. A permanent fault is a short circuit that will persist until repaired by human intervention [4].

Protective relaying is the term used to signify the science as well as the operation of protective devices, within a controlled strategy, to maximize service continuity and minimize damage to property and personnel due to abnormal system behavior. The strategy is not so much that of protecting the equipment from faults, as this is a design function, but rather to protect the normal system and environment from the effect of a system component which has become faulted [6]. Note that this definition explicitly states that the operation of the relaying system must follow a controlled and predefined strategy.
Relays are devices designed to respond to input conditions in a prescribed manner and, after
specified conditions are met, to cause the contact operation or similar abrupt change in associated
electric control circuits. Inputs are usually electrical but may be mechanical, thermal, or other
quantities. Limit switches and similar simple devices are not relays. A relay may consist of
several units, each responsive to a specific input, with the combination of units providing the
desired overall performance characteristic of the relay [7], [8].
An electromechanical relay is a relay that operates by physical movement of parts resulting from electromagnetic, electrostatic, or electrothermic forces created by the input quantities. In contrast, a static relay is a relay or relay unit in which the designed response is developed by electronic, solid-state, magnetic or other components without mechanical motion [8].
Digital relaying (also called microcomputer relaying or numerical relaying) is a relaying system in which a processor or microcomputer is used to implement the protection functions. A digital relay consists of the following main parts: processor, analog input system, digital input system, and independent power supply. The main difference in principle between digital relays and electromechanical and static relays is in the way the input signals are processed. The input signals (currents and voltages from measurement transformers) are analog signals. In non-numerical relays they are applied directly to electromagnetic windings or electronic circuits. In digital or numerical relays, the analog currents and/or voltages are converted into digital signals by analog-to-digital converters (A/D) before being processed by the processor. Analog and digital filters are used to condition the signals and protect the digital components [9].
The reach of a relay is the extent of the protection afforded by the relay in terms of the impedance or circuit length as measured from the relay location. The measure is usually to a point of fault, but excessive loading or system swings may also come within the reach or operating range of the relay [8].
A zone of protection is that segment of a power system in which the occurrence of an assigned abnormal condition should cause the protective relay system to operate [8]. The primary protection zone is a region of primary sensitivity where primary relays should operate for prescribed abnormalities within that zone. Backup relays are relays outside a given primary protection zone, located in an adjacent zone, which are set to operate for prescribed abnormalities within the primary protection zone and independently of the primary relays [6].
Local backup protection is a form of backup protection in which the backup protective relays are at the same station as the primary protective relays. Remote backup protection is a form of backup protection in which the protection is at a station other than the one that has the primary protection [8]. Backup should not be confused with redundancy.
Sensitivity in protective systems is the ability of the system to identify an abnormal condition that
exceeds a nominal pickup or detection threshold value and which initiates protective action when
the sensed quantities exceed that threshold [6].
Selectivity of a protective system is a general term describing the interrelated performance of
relays, breakers, and other protective devices. Complete selectivity is obtained when a minimum
amount of equipment is removed from service for isolation of a fault or other abnormality [8].

The reliability of a relay or relay system is a measure of the degree of certainty that the relay or relay system will perform correctly. It denotes certainty of correct operation together with assurance against incorrect operation from all extraneous causes. Reliability of protective systems therefore involves two aspects: first, the system must operate in the presence of a fault that is within its zone of protection and, second, it must refrain from operating unnecessarily for faults outside its protective zone or in the absence of a fault [8].
Dependability of a relay or relay system is the facet of reliability that relates to the degree of
certainty that the relay will operate correctly [8].
Security of a relay or relay system is the facet of reliability that relates to the degree of certainty
that a relay or relay system will not operate incorrectly [8].
Most protection systems are designed for high dependability, meaning that a fault is always cleared by some relay. As a general rule it can be said that a relaying system becomes less secure as it is made more dependable. In other words, the more dependable the protective relaying, the less secure its operation. In present-day designs there is a bias towards making relays more dependable at the expense of some degree of security. Consequently, a majority of relay system mal-operations are found to be the result of unwanted trips caused by insecure relay operations [3].
Redundancy of a relaying system is the quality of a relaying system that allows a function to be performed correctly, without degradation, irrespective of the failure or state of one portion, since another portion performs the same function [8]. Redundancy should not be confused with backup.
Incorrect relay operation refers to any output response or lack of output response by the relay
that, for the applied input quantities, is not correct.
A failure mode is a process of failure of equipment that causes a loss of its proper function [8]. Relays present basically two failure modes. Failure to trip, in the performance of a relay or relay system, is the lack of tripping that should have occurred considering the objectives of the relaying system design. False tripping, in the performance of a relay system, is tripping that should not have occurred considering the objectives of the relaying system design [8].

Introduction to Reliability Analysis


Reliability is the probability of a device performing its purpose adequately for the period of time
intended under the operating conditions encountered. The reliability assessment of a system
means the calculation of a set of performance indicators. An example of a reliability indicator is
the average number of times per year that certain equipment is not available due to failures.
System reliability analysis is principally the analysis of a large set of unwanted system states that might occur in the future. The result of this state analysis is then used to calculate the various performance indicators. The reliability assessment starts with the creation of relevant system events. Not every event is equally likely to happen. The reliability assessment creates events that bring the system into an unhealthy state, which is then analyzed.
A particular engineering system, for example a relay, may be simple or complex. A complex
system consists of many different functional subsystems in which each subsystem can be divided
into elements, components or even sub-subsystems. In any case the reliability of the system
depends on the reliability of the elements, components, and subsystems.
This section introduces a few basic methodologies for reliability assessment of engineering
systems. The treatment is brief since most of the material presented here is readily available in
the referred literature [1], [6], [10].

Stochastic Models for Reliability Assessment


In order to assess the reliability of an engineering system, the system has to be represented by a number of stochastic components. Each component of the system is treated as one single object in the reliability analysis. Examples of components are a motor, a line, a diode, a protection device, a voltage or current transformer, a digital-to-analog converter, etc. A component may exhibit different component states, such as being available, being under repair, being open, being closed, being on stand-by, etc. For some reliability assessments all these possible states may have to be accounted for. Normally a reduction is made to a few of these states. For instance, a simple two-state model could consider the operative and inoperative states. A three-state model could further distinguish between repairs (forced outages) and maintenance (planned outages). The model would then consider the following three states: operative, planned maintenance and forced outage.
In the assessment of the reliability of protective relaying in particular, the multiple failure modes need to be properly modeled. A single relay exhibits more than two failure modes, which lead to more than three states: it can fail shorted or fail open. Whether a relay failure causes the failure of the relaying system in which the relay is installed depends on the circuit configuration and the design of the relaying system. Complex protective systems have many failure modes. Even a simple circuit breaker can fail to open on command, to close on command, to break a current, to make a current, etc.
A stochastic component is a component with two or more states which have a random state duration and for which the next state is selected randomly from the possible next states [11],[12]. A stochastic component changes abruptly from one state to another at unforeseen moments in time. The state transitions occur instantaneously and unpredictably. If we were to monitor a four-state stochastic component over a long period of time, we could get a graph like the one depicted in Figure 1. Because the behavior of the component is stochastic, a different graph would appear even if we were to monitor an exact copy of the component under exactly the same conditions. For stochastic component models, the state duration (the time in a state) and the next state are stochastic quantities.

X3

X2
X1
Xo
to

t1 t2

t3

t4

t5

tn

Figure 1: Example of monitored states of a stochastic component

The basic quantity in reliability assessment is the duration D (time) for which a component stays in a given state. This duration is a stochastic quantity, as its precise value is unknown. The word "precise" should be understood in the sense that, although we do not know the value of the stochastic quantity, we know the possible values it might have. For example, the time until the next unplanned trip of a generator is unknown (otherwise it would be a planned trip), but nobody would expect a trip every day, or that the generator shows no failure at all in ten years. This is a wide range for practical purposes, but for actual generators a much smaller range of expected times to failure can be justified by historical data. The basic question about a stochastic quantity is the range of its expected values (or outcomes): which outcome can be expected, and with what probability? Both the outcome probability Pr(D ≤ τ) and its range can be described by a single function, the cumulative distribution function, denoted F_D(τ):
F_D(τ) = Pr(D ≤ τ)    (1)

The probability of a negative duration is zero and the probability of a duration less than infinity is one. That is:

F_D(0) = Pr(D ≤ 0) = 0    (2)
F_D(∞) = Pr(D ≤ ∞) = 1    (3)
The probability density function (PDF), denoted f_D(τ), is the derivative of the cumulative distribution function F_D(τ). The PDF gives a first insight into the possible values for D, and the likelihood of it taking a value in a certain range:

f_D(τ) = (d/dτ) F_D(τ)    (4)
f_D(τ) = lim[Δτ→0] Pr(τ ≤ D < τ + Δτ) / Δτ    (5)
∫[0,∞] f_D(τ) dτ = 1    (6)

The survival function (SF), R_D(τ), is defined as the probability that the duration D will be longer than τ. It is the complement of F_D(τ):

R_D(τ) = Pr(D > τ) = 1 − F_D(τ)    (7)

For a specific component, D is the lifetime, and the survival function gives the probability of the component functioning for at least a certain amount of time without failures.
The hazard rate function (HRF), h_D(τ), is defined as the probability density for a component to fail at a certain time τ, given the fact that the component is still functioning at τ. That is:

h_D(τ) = lim[Δτ→0] Pr(τ ≤ D < τ + Δτ | D > τ) / Δτ = f_D(τ) / R_D(τ)    (8)

The hazard rate function is an estimate of the unreliability of the components that are still functioning without failures after a certain amount of time. An increasing HRF signals a decreasing reliability.
The expected value of a function g of a stochastic quantity D is defined as

E(g(D)) = ∫[0,∞] g(τ) f_D(τ) dτ    (9)

The expected value of D itself is its mean E_D, which is defined as

E_D = E(D) = ∫[0,∞] τ f_D(τ) dτ    (10)

The variance V_D is defined as the second central moment:

V_D = E([D − E(D)]²) = E(D²) − (E(D))²    (11)
The exponential distribution plays a special role in reliability evaluation. The negative exponential distribution is probably the most widely known and used distribution in the reliability evaluation of systems. It is defined by

f_D(τ) = λ e^(−λτ)    (12)

which makes

F_D(τ) = 1 − e^(−λτ)    (13)
h_D(τ) = λ    (14)
E_D = 1/λ    (15)
V_D = 1/λ²    (16)
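
As a quick numerical illustration of Eqs. (12)-(16), the following Python sketch evaluates these quantities for an assumed failure rate; the rate and time values are hypothetical, chosen only for illustration.

    import math

    lam = 0.02    # assumed failure rate (failures per year)
    tau = 10.0    # time of interest (years)

    f = lam * math.exp(-lam * tau)    # PDF, Eq. (12)
    F = 1 - math.exp(-lam * tau)      # CDF, Eq. (13)
    R = 1 - F                         # survival function, Eq. (7)
    h = f / R                         # hazard rate, Eq. (8); constant and equal to lam, Eq. (14)
    mean, var = 1 / lam, 1 / lam**2   # Eqs. (15) and (16)

    print(f"F({tau}) = {F:.4f}, R({tau}) = {R:.4f}, h = {h:.4f}")
    print(f"mean = {mean:.1f} years, variance = {var:.1f}")

Note that the computed hazard rate is independent of tau; this is the memoryless property that makes the exponential distribution compatible with Markov modeling.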

Figure 2: Typical hazard rate for an electric component as a function of time, showing the wear-in, normal operating (useful life) and wear-out regions


The most important condition for the exponential distribution to be applicable is that the hazard rate function be constant. It is frequently used in system reliability evaluation, and there are a number of reasons for that:
- Using non-exponential distributions means that most reliability evaluation techniques currently available can no longer be used.
- Even studies that are able to use non-exponential distributions (e.g. Monte Carlo simulation) often still use the exponential distribution, because of the lack of data.
- The lack of experience with non-exponential distributions makes the results of such studies hard to interpret.
- In actual systems there is a mixture of components with different ages. The system is not built at once but grows over time, incorporating new lines, equipment, etc. Preventive maintenance is performed on components at different times and some components are replaced after faults. Most of the components in use are therefore in their so-called useful operating time, where the failure rate is reasonably constant (see Figure 2).

Calculation Methods
When the component models and their functional interrelation have been decided, it is time to choose a calculation method. Several calculation methods are available in books and technical papers [1],[4],[10],[11],[12],[13],[14]. They can be divided into two basic categories: 1) analytical and 2) simulation methods. The analytical methods are well developed and have been used for years. These methods represent the system by mathematical models (blocks) and evaluate the reliability using mathematical solutions. The exact mathematical equations can become so complicated that approximations are needed when the system is too complex or involves too many components. A range of approximate techniques has been developed to simplify the required calculations. These techniques aim at calculating the expected value of an index and not its statistical distribution. The expected or mean value, however, does not provide any information on the variability of the indicators. The probability distribution, on the other hand, provides both a pictorial representation of the way the indicators vary and important information on significant outcomes which, although they occur infrequently, can have very serious effects on the system [13].
In order to obtain an appreciation of the variability of the outcomes, the Monte Carlo simulation technique can be applied. This technique does not perform any analytical calculations, but instead simulates the stochastic behavior of the system. The main advantage of the Monte Carlo method is that any component model, with any distribution, and any system detail can be included in a Monte Carlo simulation. However, it may be questionable whether the probability distributions associated with component failure, component restoration, scheduled maintenance, etc. are known in sufficient detail to apply them in a reliability study. If not, this advantage disappears. On the other hand, the main disadvantage of the simulation approach is that the more reliable the components of the system are, the greater the effort required for assessing the reliability of the system. In the following sections a brief review of the basic methods for reliability evaluation is presented.
Network Modeling or Reliability Block Diagrams
The reliability of any physical system depends on its configuration, in other words on the arrangement and connection of the various components that make up the system design. The physical arrangement leads to a particular functional logic that describes a given function and its associated failure modes. The logic can be described and analyzed in several ways, and equations can be written to compute the reliability of that logic. It should be emphasized that it is the logic that is being analyzed rather than the physical arrangement. Network modeling, also referred to as reliability block diagramming, is the simplest method to represent the particular logic of a system. The method can only be used for basic component arrangements, but it is still powerful enough to describe and analyze the reliability of simple systems. Blocks representing device functions are connected in parallel or in series to represent the stochastic behavior of the system's functionality. Components are said to be in series if they all must work for system success or, alternatively, if only one needs to fail for system failure. On the other hand, components are said to be in parallel if only one needs to be working for system success or, alternatively, if all must fail to cause system failure. According to the functional connection of the components, the reliability of the whole system is found by applying two rules taken from probability principles:
- The success probability of a system consisting of several components in series is equal to the product of the individual success probabilities.
- The failure probability (the complement of the success probability) of a system consisting of several components in parallel is equal to the product of the individual failure probabilities.
In the modeling process the block diagram is normally not identical to the physical arrangement of components in the system. The operational or functional reliability logic must be translated into series or parallel connections so that the model represents the reliability behavior of the network.

Figure 3: Reliability evaluation by a network model; a simple example (two parallel transformers 1 and 2 connected in series between buses 3 and 4)


As a simple example, consider the system shown in Figure 3. Two transformers (1 and 2) are connected in parallel to transport energy from one extreme (3) to the other (4). The extreme points represent buses of the substation, which are also subject to failures. The probability of failure of a bus is Qb, while each transformer has a probability of failure equal to Qt. The probability of success Ps of this simple system is then

Ps = (1 − Qb) · [1 − Qt·Qt] · (1 − Qb)    (17)

In this case it has been supposed that either of the two transformers can supply the total load. The same model can be applied to another system as long as it represents the functionality of that system. The model could, for example, be used to analyze the reliability of a relay where blocks 3 and 4 would be the relay's contacts and 1 and 2 fully redundant coils aimed at actuating a power circuit breaker. Protective relaying, however, presents some complexities that make other techniques more suitable for analyzing it.
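
A minimal Python sketch of the series/parallel evaluation behind Eq. (17); the probability values are assumed and serve only to illustrate the two rules stated above.

    Qb = 0.001   # assumed bus failure probability
    Qt = 0.02    # assumed transformer failure probability

    parallel_success = 1 - Qt * Qt                 # parallel pair: both transformers must fail
    Ps = (1 - Qb) * parallel_success * (1 - Qb)    # series connection with the two buses, Eq. (17)
    print(f"system success probability Ps = {Ps:.6f}")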
To deal with systems where the parallel and series rules are not enough, additional techniques have been developed. A commonly used method is the minimum cut set. The technique aims at describing the reliability of systems with complex functional networks. A minimum cut set is a set of components of the system whose failure leads to the failure of the entire system. The failure of block 3 in Figure 3 leads to the failure of the entire system, and the same can be said of block 4. Components in a given cut set are in parallel, because they all must fail to cause the system to fail. It is not trivial to identify minimum cut sets in stochastic networks. The following minimum cut sets can be identified from Figure 3: {3}, {4}, {1,2}. After identifying the minimum cut sets, these are connected in series, because the system fails if any one of the cut sets fails. Evaluation of the whole system is then performed by combining the series and parallel rules.
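
As a quick check on the example of Figure 3, the cut sets {3}, {4} and {1,2} connected in series give a system failure probability Qs = 1 − (1 − Qb)(1 − Qb)(1 − Qt·Qt), which for small probabilities is approximately Qb + Qb + Qt·Qt; this agrees with the complement of Eq. (17).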

Discrete Markov Chains


The Markov approach can be applied to the random behavior of systems that vary discretely (chains) or continuously (processes) with respect to time and space. This random variation is known as a stochastic process. Not all stochastic processes can be modeled by Markov chains. In order for the Markov approach to be applicable, the behavior of the system must be characterized by a lack of memory [13],[15]. That is, the future state of the system is independent of all past states except the immediately preceding one. In addition, the process must be homogeneous, which means that the probability of making a transition from one state to another is the same at all times [15]. In simple words, the Markov chain approach is applicable to those systems whose behavior can be described by a probability distribution characterized by a constant hazard rate, i.e., the exponential distribution. Only if the hazard rate is constant does the probability of making a transition between two states remain constant at all points of time [1]. If the probability is a function of time or of the number of discrete steps, then the process is non-homogeneous and non-Markovian [15]. In the particular case of reliability evaluation, the space is normally represented as a discrete function, since this represents the discrete and identifiable states in which the system and its components can reside, whereas time may be either discrete (chain) or continuous (process).

Figure 4: A two-state Markov chain model (from state 1 the chain remains with probability 1/2 and moves to state 2 with probability 1/2; from state 2 it remains with probability 3/4 and moves to state 1 with probability 1/4)


The basic concepts of Markov modeling can be illustrated by considering a simple two-state system. Each state represents a particular condition of the system under study. Figure 4 shows a two-state system, where the probabilities of remaining in or leaving a particular state in a finite time are also shown. These probabilities are assumed to be constant for all times. This is a homogeneous Markov chain, since the next state only depends on the present state and the movement between states occurs in discrete steps. Since the system can only be in state 1 or 2, the probabilities of remaining in and leaving a state must add to one. For the simple two-state system shown in Figure 4, the probability of remaining in state 1 is 0.5 and is equal to the probability of leaving that state. On the other hand, the probability of leaving state 2 is 0.25, whereas the probability of staying in state 2 is 0.75.


The formal definition of a Markov chain is given in [15]. Let P be a (k×k)-matrix with elements P_i,j. A random process (X_0, X_1, ...) with finite state space S = (s_1, ..., s_k) is said to be a Markov chain with transition matrix P if, for all n, all i, j in (1, ..., k) and all i_0, ..., i_(n−1) in (1, ..., k), we have

Pr(X_(n+1) = s_j | X_0 = s_(i_0), ..., X_n = s_i) = Pr(X_(n+1) = s_j | X_n = s_i) = P_i,j    (18)

For our two-state system, the matrix P is given by

P = | 1/2  1/2 |
    | 1/4  3/4 |    (19)
The matrix P is known as the stochastic transitional matrix. It represents the transition probabilities between states for one step of the Markov chain. In most reliability evaluation problems the initial condition or state is known. The initial condition of the Markov chain is known as the initial distribution and is represented by π(0). The initial distribution is a vector of k elements, each one representing the probability of the chain being in state i = 1, ..., k. In our simple two-state system, supposing the initial state is 1, we have π(0) = (1 0).
From Markov theory it is known that a future distribution can be calculated from the initial distribution and the transitional matrix:

π(n) = π(0) P^n    (20)

The aim of a reliability study is to find the reliability as time extends into the future. In mathematical terms we are interested in the stationary distribution of the Markov chain, that is, π(n) for n → ∞. A stationary distribution exists for any (finite) Markov chain and is unique if the chain is irreducible. However, the convergence of the Markov chain to the stationary distribution is not guaranteed unless the Markov chain is irreducible and aperiodic. A Markov chain is said to be irreducible if all states can (directly or indirectly) be reached from all others. A Markov chain is said to be aperiodic if all its states are aperiodic, which is the case, for example, when it is possible to remain in a state over one transition. In practical reliability studies the Markov chains used are irreducible and frequently aperiodic.
An efficient way to obtain the stationary distribution is based on the fact that, once the stationary distribution has been reached, any further multiplication by the stochastic transitional matrix does not change the values of the limiting state probabilities. So, if π(*) is the stationary distribution and P is the transitional matrix, then:

π(*) = π(*) P    (21)

where π(*) is the vector of k unknown variables. The equation system described by (21) is redundant, and to solve it we need to replace one of the k equations by an additional expression. Because the stationary probabilities sum to 1, the additional expression is:

π_1(*) + π_2(*) + ... + π_k(*) = 1    (22)
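
A minimal Python sketch of solving Eqs. (21)-(22) for the two-state chain of Figure 4 with the transition matrix of Eq. (19); for two states the linear system reduces to a single equation.

    P = [[0.50, 0.50],
         [0.25, 0.75]]

    # From pi1 = pi1*P[0][0] + pi2*P[1][0] together with pi1 + pi2 = 1:
    pi1 = P[1][0] / (1.0 - P[0][0] + P[1][0])
    pi2 = 1.0 - pi1
    print(f"stationary distribution: ({pi1:.4f}, {pi2:.4f})")   # (1/3, 2/3)

    # Sanity check against Eq. (20): repeated multiplication converges
    pi = [1.0, 0.0]                      # initial distribution pi(0) = (1 0)
    for _ in range(50):
        pi = [pi[0]*P[0][0] + pi[1]*P[1][0],
              pi[0]*P[0][1] + pi[1]*P[1][1]]
    print(f"pi(50) = ({pi[0]:.4f}, {pi[1]:.4f})")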
The transient behavior of the Markov chain is very dependent on the starting state, but the limiting values of the state probabilities are totally independent of it. The rate of convergence to the limiting probabilities, however, can depend on the initial conditions and is very dependent on the probabilities of making transitions between states.
From a reliability point of view, the major drawback of this technique is the large number of states needed to model a complex system. The number of states increases exponentially with the number of studied factors, and many simplifying assumptions are needed to make the model manageable.


Continuous Markov Chain Processes


In the field of reliability assessment of power systems, most references to Markov modeling refer to the continuous Markov process. Like the Markov chain, the Markov process is described by a set of states and the transition characteristics between these states. However, state transitions in a Markov process occur continuously rather than at discrete time intervals. In continuous Markov processes, state transitions are represented by transition rates and not by conditional probabilities as in Markov chains.
Markov processes are easily applied to power system reliability, since failure rates are equivalent to state transition rates. As long as failure and repair rates are constant, continuous Markov models are applicable. The approach, however, becomes impracticable for large systems due to the large number of states involved.
Markov processes are solved in a manner similar to Markov chains, except that differential equations are utilized.

Figure 5: A two-state Markov process model (state 0: operative; state 1: failed)


Figure 5 shows a Markov model for a two-state component, where λ is the failure rate and μ is the repair rate. For the Markov model to be applicable, both λ and μ must be constant. The failure rate is the reciprocal of the mean time to failure, with the times to failure counted from the moment the component begins to operate to the moment the component fails. Similarly, the repair rate is the reciprocal of the mean time to repair.
It can be shown that the probabilities of the component being found in states 0 and 1, given that the system started at time t = 0 in the operative state, are given by the following equations:
P_0(t) = μ/(λ+μ) + [λ/(λ+μ)] e^(−(λ+μ)t)    (23)
P_1(t) = λ/(λ+μ) − [λ/(λ+μ)] e^(−(λ+μ)t)    (24)

In reliability terms, P_0(t) is the availability, while P_1(t) is the unavailability.
The existence and uniqueness of the stationary distribution is subject to the same conditions as in the case of Markov chains. In the simple case of a repairable two-state component, the limiting probabilities can be evaluated from (23) and (24) by letting t → ∞:

P_0(∞) = μ/(λ+μ)    (25)
P_1(∞) = λ/(λ+μ)    (26)
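
A minimal Python sketch of Eqs. (23)-(26); the failure and repair rates are assumed values chosen only for illustration.

    import math

    lam = 0.02    # assumed failure rate (per year)
    mu = 50.0     # assumed repair rate (per year, i.e. about one week mean repair time)

    def availability(t):
        # P0(t), Eq. (23)
        return mu/(lam + mu) + (lam/(lam + mu)) * math.exp(-(lam + mu)*t)

    for t in (0.0, 0.05, 1.0):
        print(f"P0({t}) = {availability(t):.6f}")
    print(f"limiting availability, Eq. (25): {mu/(lam + mu):.6f}")
    print(f"limiting unavailability, Eq. (26): {lam/(lam + mu):.6f}")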


The number of states increases as the number of components and the number of states in which each component can reside increase. For a system with n components, each with two possible states, the total number of states is 2^n. The total size can therefore become unmanageable for large systems. One possible solution involves state truncation, neglecting states with a very low probability of occurrence. Another involves approximate mathematical expressions that define the equivalent failure and repair rates of parallel and series components.
An interesting application of the Markov process to the reliability of protective systems is described in [6], Chapter 28, page 1230. According to that reference, the concept of unreadiness, referring to the condition that the protective system fails to respond when called upon to operate in the presence of a fault, was first introduced by Singh and Patton in 1980. This is a conditional probability, because it depends on the occurrence of a fault to be detected by the protective system. The authors suggest that the unreadiness probability can be estimated from field data:
C = (number of times the protection has failed to respond) / (number of times the protection has been called upon to respond)    (27)

A Markov transition diagram representing the protection of a particular plant would be as shown
in Figure 6.
Figure 6: A five-state Markov process model for a protective system


The following states are depicted in Figure 6.
State 1: the plant and the protective system are both operative (UP).
State 2: the plant is inoperative (DOWN) and the protective system UP.
State 3: the plant is UP and the protection DOWN.
State 4: both the plant and the protective system are DOWN.
State 5: the plant is UP and the protective system is being inspected.

Transitions between states occur according to the following rates. The protective system leaves the operative condition (UP state) at a failure rate λp and at an inspection rate μp. The plant leaves the UP state at a power system failure rate λs. Plant repair occurs at a rate μs.
In addition, the following assumptions are considered appropriate:
- Testing detects a failure of the protective system with certainty.
- Testing of the protective system does not cause plant failure.
- Failures of the protective system are statistically independent of failures of the plant.
The limiting probabilities p_i corresponding to each state can be determined for the exponential distribution. Expressed relative to p_1 they are:

p_2 = (λs/μs) p_1    (28b)
p_3 = [λp/(λs + μp)] p_1    (28c)
p_4 = [λs λp/(μs (λs + μp))] p_1    (28d)

while p_1 and p_5 follow from the balance of the inspection state and the normalization condition p_1 + p_2 + p_3 + p_4 + p_5 = 1; the full closed-form expressions, labeled (28a) and (28e), are given in [6].
The unreadiness probability can now be expressed in terms of the probabilities of staying in states 1 and 3:

C = p_3 / (p_1 + p_3)    (29)

In terms of the given transition rates this results in:

C = λp / (λp + λs + μp)    (30)
The only variable in this result over which the engineer has much control is the inspection rate μp. The other two parameters are statistical failure rates of the plant and the protective system and are not easily changed. The inspection rate is the inverse of the mean time between inspections. A greater inspection rate μp (more frequent inspections) will reduce the probability of unreadiness. Note that the equation shows that for a non-inspected protective system (μp = 0), the probability of unreadiness is just a function of the failure rates of the protective system and the plant. The plant failure rate is greater than the failure rate of the protective system; therefore the unreadiness probability decreases with an increasing plant failure rate.
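
A minimal Python sketch of Eq. (30), showing how the unreadiness probability falls as inspections become more frequent; both failure rates are assumed values used only for illustration.

    lam_p = 0.02   # assumed protective system failure rate (per year)
    lam_s = 0.5    # assumed plant failure rate (per year)

    def unreadiness(mu_p):
        # Eq. (30); mu_p is the inspection rate, i.e. 1 / (mean time between inspections)
        return lam_p / (lam_p + lam_s + mu_p)

    for interval in (0.25, 0.5, 1.0):          # inspections every 3, 6 and 12 months
        print(f"inspection interval {interval} yr: C = {unreadiness(1.0/interval):.5f}")
    print(f"no inspection (mu_p = 0): C = {unreadiness(0.0):.5f}")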
Monte Carlo Simulation
A sequential Monte Carlo simulation attempts to model the system behavior precisely as it occurs in reality. Monte Carlo computer simulation utilizes pseudorandom numbers to model stochastic event occurrences. These pseudorandom numbers constitute a sequence of values which, although deterministically generated, have all the appearance of being independent uniform (0,1) random variables.
One method for generating pseudorandom numbers is the multiplicative congruential method [11]. An initial value x_0, called the seed, is used to start the sequence, and successive values x_n are then computed recursively:

x_n = a x_(n−1) mod m    (31)

In (31), a and m are given positive integers. The value of x_n is the remainder of the division of a·x_(n−1) by m; thus x_n is one of 0, 1, ..., m−1. The quantity x_n/m is taken as an approximation to the value of a uniform (0,1) random variable. After some finite number of generated values, at most m, a value must repeat itself; once this happens the whole sequence begins to repeat. The constants a and m therefore have to be chosen so that this repetition occurs only after a large number of simulations. For a 32-bit word computer it has been shown that m = 2^31 − 1 and a = 7^5 give desirable properties.
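
A minimal Python sketch of the generator of Eq. (31) with the constants quoted above; the seed value is arbitrary.

    M = 2**31 - 1   # m = 2^31 - 1
    A = 7**5        # a = 7^5 = 16807

    def uniforms(seed, n):
        """Return n pseudorandom uniform (0,1) values starting from the seed x0."""
        values, x = [], seed
        for _ in range(n):
            x = (A * x) % M        # Eq. (31)
            values.append(x / M)   # scale the integer state into (0,1)
        return values

    print(uniforms(seed=12345, n=3))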
Random numbers generated by the computer are uniformly distributed in (0,1). To obtain the stochastic behavior of the reliability parameters (repair and failure times, for example), these random numbers must be converted into the appropriate distribution. This can be done by the inverse transform method. This method, however, presents some limitations for the normal distribution, due to the difficulty of inverting its cumulative distribution function. In this case the central limit theorem can be used to give the following method. Suppose we are interested in simulating a sequence of numbers with a normal distribution of mean μ and variance σ². Generate twelve uniformly distributed numbers U_1, U_2, ..., U_12, sum them and subtract six, multiply the result by σ and add the mean value μ; the resulting sequence presents a normal distribution:

x_i = μ + σ (Σ_(l=1..12) U_l − 6)    (32)
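
A minimal Python sketch of the two conversions just described: the inverse transform for an exponential time to failure (inverting Eq. (13)) and the sum-of-twelve approximation of Eq. (32) for a normal sample; the parameter values are assumed.

    import math, random

    def exponential_ttf(lam, u):
        # inverse transform of Eq. (13): solve u = 1 - exp(-lam*t) for t
        return -math.log(1.0 - u) / lam

    def normal_sample(mu, sigma):
        # Eq. (32): sum twelve uniforms, subtract 6, scale by sigma, shift by mu
        return mu + sigma * (sum(random.random() for _ in range(12)) - 6.0)

    print(exponential_ttf(lam=0.02, u=random.random()))   # a sampled time to failure
    print(normal_sample(mu=8.0, sigma=2.0))               # e.g. a sampled repair time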

When applied for the purpose of reliability assessment, the Monte Carlo simulation typically analyzes system behavior over a specific period of time. In each simulation run an artificial history showing the changes in component states is generated in chronological order (sequential simulation). The system reliability indicators are obtained from this artificial history. Because each run produces different results, many simulations are needed (and wished for) to average the system indicators and study their distributions.
An algorithm that could be applied to evaluate the reliability of a general protection system consisting of transducers, relays and circuit breakers consists of the following steps (a sketch in code is given after the list):
1. Set a simulation time.
2. Generate a random number for each component in the protection system (transducers, relay, circuit breakers) and convert it into a time to failure according to the probability distribution of the component parameter.
3. Determine the component with the minimum time to failure. This is the faulted component.
4. Accumulate the simulated time by summing the times to failure of the faulted components. If the cumulative time is less than the simulation time, continue; otherwise, output the results.
5. Determine whether the protection function can still be performed given the existence of the faulted component. If the protection function cannot be performed, record a protection system failure. If the protection function can be performed with the faulted component, modify the logic function of the protection system to represent the new condition.
6. Return to step 2.
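
A minimal Python sketch of the algorithm above for the simple series system of Figure 7 (FD, R, TS, B); the failure rates are assumed, each faulted component is treated as restored when the loop continues and, because all four blocks are in series, every component failure defeats the protection function.

    import math, random

    RATES = {"FD": 0.02, "R": 0.02, "TS": 0.02, "B": 0.02}  # assumed failure rates (per year)
    HORIZON = 1000.0    # step 1: simulation time (years)
    N_RUNS = 500

    def time_to_failure(lam):
        # inverse transform of Eq. (13): exponential time to failure
        return -math.log(1.0 - random.random()) / lam

    failures = 0
    for _ in range(N_RUNS):
        t = 0.0
        while True:
            ttf = {name: time_to_failure(lam) for name, lam in RATES.items()}  # step 2
            faulted = min(ttf, key=ttf.get)    # step 3: first component to fail
            t += ttf[faulted]                  # step 4: accumulate simulated time
            if t >= HORIZON:
                break
            failures += 1                      # step 5: series logic, so the system fails
            # step 6: return to step 2 (the faulted component is assumed restored)

    print(f"estimated failure rate: {failures/(N_RUNS*HORIZON):.4f} per year")
    # expected around 0.08/yr, the sum of the four series failure rates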


Sequential simulation is not needed if contingencies are mutually exclusive and the system behavior does not depend upon past events. If so, contingencies can be probabilistically selected and simulated in any arbitrary order. This kind of simulation is referred to as nonsequential Monte Carlo simulation. Generally speaking, nonsequential simulation is less computationally intensive than sequential simulation, because the simulation rules are simpler, contingencies being assumed not to interact with each other. In nonsequential simulation, contingencies are randomly selected from a pool of possible events based on contingency probabilities. A given contingency can be selected more than once. The selected contingencies are then simulated in any order, assuming they are all mutually exclusive. Like sequential simulation, a nonsequential Monte Carlo simulation is repeated many times to produce a distribution of results.
Events and Fault Trees
One method for evaluating the probability of failure of a system or component is the use of event trees and/or fault trees. Event trees are particularly suitable for the reliability assessment of stand-by and mission-oriented systems [1].
A fault/event tree is a pictorial representation of all possible causes leading to a given failure/event. Such trees provide a convenient method of graphically visualizing the causal relations between events. This is done by constructing building blocks that are displayed as either gate symbols or event symbols. It is named a tree because the graphical representation gradually fans out like the branches of a tree.
In the fault tree method a particular failure condition is considered and a tree is constructed that identifies the various combinations and sequences of other failures that lead to the failure being considered [1]. The method can be used for both qualitative and quantitative evaluation of system reliability. When used for quantitative evaluation, the causes of the system failure are gradually broken down into an increasing number of hierarchical levels until a level is reached at which the reliability data are sufficient for a quantitative assessment to be made. The appropriate data are then inserted into the tree and combined using the logic of the tree to give the reliability assessment of the complete system being studied.
Some of the logical gate symbols used in fault and event trees are shown in Table 1 [6].

Table 1: Fault tree gate symbols

Gate name       Causal relation
AND             Output event occurs if all input events occur simultaneously
OR              Output event occurs if any of the input events occurs
INHIBIT         Input produces output when the conditional event occurs
PRIORITY AND    Output event occurs if all input events occur in order from left to right
EXCLUSIVE OR    Output event occurs if one, but not both, of the input events occurs
m OUT OF n      Output event occurs if m out of n input events occur
The other graphical aid used in fault and event trees is the event symbol. Some common event
symbols are shown in Table 2.


Table 2: Event symbols

Event name    Meaning
Circle        Basic event with sufficient data
Diamond       Undeveloped event
Rectangle     Event represented by a gate
Oval          Conditional event used with the inhibit gate
House         House event, either occurring or not occurring
Triangles     Transfer symbol
Used together, the gate symbols and event symbols permit the engineer to describe how a particular failure mode or event of a given system can occur.
Event trees (also known as success trees) are analogous to fault trees (the mathematical dual). In an event tree, any event can be considered as the top event.
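
A minimal Python sketch of quantitative fault tree evaluation with the AND and OR gates of Table 1, assuming statistically independent basic events; the tree structure and all probabilities are hypothetical, for illustration only.

    def gate_and(*probs):
        # AND gate: the output event occurs only if all input events occur
        p = 1.0
        for q in probs:
            p *= q
        return p

    def gate_or(*probs):
        # OR gate: the output occurs if any input occurs, i.e. 1 - prod(1 - qi)
        p = 1.0
        for q in probs:
            p *= (1.0 - q)
        return 1.0 - p

    # Hypothetical top event: "fault not cleared" = relay fails OR both
    # redundant trip supplies fail (all basic event probabilities assumed).
    q_top = gate_or(0.01, gate_and(0.05, 0.05))
    print(f"top event probability: {q_top:.6f}")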
There are many types of protection systems, each with its own particularities. It is therefore not possible to develop a unique general model for all possible protection systems, and each protection system must be modeled following its structure and functional design. However, there are common elements that can be included in a simple model from which more elaborate models can be built. Consequently, the following discussion relates to the simple protection system shown in Figure 7. The figure shows a protection system represented by functional blocks. These blocks can be related to most protection systems: the fault detector (of power system faults) includes appropriate transducers (current and/or voltage); the relay includes the operating and restraint coils; the trip signal block contains the trip signal device and associated power supply; and the breaker is the actual device which isolates the faulted power system component [10]. It should be noted that, although simple, the block diagram could represent an overcurrent protection as well as a complex generator differential protection.

Figure 7: Block diagram of a general protection system (fault detector FD, relay R, trip signal TS, breaker B)


Evaluation of failure to operate
Consider a particular power system component (for example a power transformer) that is protected by two breakers B1 and B2. Assume that both breakers are operated by the same fault detector FD, relay R and trip signal device TS. Consider the event "the power system component has failed"; the event tree is then as shown in Figure 8. The figure shows the sequence of events together with the outcomes of each event path. Note that, since the success of the system is given by the opening of both breakers, only outcome 1 leads to the successful operation of the protection system. The evaluation of the outcome probabilities is a simple exercise once the event tree has been deduced:
- Identify the paths leading to the required outcome.
- Calculate the probability of occurrence of each path by multiplying the corresponding event probabilities in each path (event probabilities are discussed below).
- Calculate the outcome probability by summing the probabilities of the paths leading to the outcome.

Figure 8: Event tree of the protection system; O means operates and F fails
To illustrate, consider the data contained in Table 3, which gives the probability of each path of the event tree shown in Figure 8.

Table 3: Path probabilities for the event tree in Figure 8

Path    Probability, based on a 1-year inspection interval [10]
1       0.950990
2       0.009606
3       0.009606
4       0.000097
5       0.009801
6       0.009900
7       0.010000

The outcome probabilities are therefore:

Probability of B1 not opening = sum of paths 3 to 7 = 0.03940
Probability of B2 not opening = sum of paths 2 and 4 to 7 = 0.03940
Probability of B1 and B2 not opening = sum of paths 4 to 7 = 0.02980
The event probabilities needed to evaluate the path probabilities in Table 3 are the probabilities that each device (FD, R, TS, B) will not operate when required. These are time-dependent probabilities and, as shown before, they are affected by the period between the time the device was last checked and the time it is required to operate. The time at which a device is required to operate is a random variable. The probability of failing to respond when required can be represented by the average unavailability of the device between consecutive inspections or tests.


For a single device whose time to failure is exponentially distributed with failure rate \lambda,
and which is inspected on average every T_c time units, the average unavailability can be
calculated using expression (33):

U = \frac{1}{T_c} \int_0^{T_c} \left( 1 - e^{-\lambda t} \right) dt = 1 - \frac{1 - e^{-\lambda T_c}}{\lambda T_c}    (33)

which, for the condition \lambda T_c \ll 1, can be approximated by

U \approx \frac{\lambda T_c}{2}    (34)
For example, if the failure rate of every device in Figure 7 (FD, R, TS, B) is 0.02 f/yr, then the
average unavailability as a function of the inspection interval T_c is as shown in Table 3.
Table 3: Device unavailability based on a constant failure rate of 0.02 f/yr

Inspection interval    Unavailability, exact (33)    Unavailability, approximate (34)
3 months               0.002495838                   0.0025
6 months               0.004983374                   0.005
12 months              0.0099336653                  0.01
36 months              0.0294088930                  0.03
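For readers who want to reproduce Table 3, a minimal Python sketch of expressions (33) and (34); the function names are ours, not from the report:

```python
import math

def unavailability_exact(lam, tc):
    # Expression (33): average unavailability of a periodically inspected
    # device whose time to failure is exponentially distributed
    return 1.0 - (1.0 - math.exp(-lam * tc)) / (lam * tc)

def unavailability_approx(lam, tc):
    # Expression (34): first-order approximation, valid for lam*tc << 1
    return lam * tc / 2.0

lam = 0.02  # failure rate in failures per year
for months in (3, 6, 12, 36):
    tc = months / 12.0  # inspection interval in years
    print(f"{months:>2} months: exact = {unavailability_exact(lam, tc):.10f}, "
          f"approx = {unavailability_approx(lam, tc):.4f}")
```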
The path probabilities in Table 2 are then easily calculated by multiplying the corresponding
event probabilities. Consider, for example, an inspection period of 12 months: the probability
of path 7, which corresponds to the failure of the fault detector FD, is 0.01. Similarly, the
probability of path 6 (success of FD and failure of relay R) is (1-0.01)*0.01 = 0.0099. The
remaining paths follow the same pattern, as sketched below.
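Continuing the example, a sketch that rebuilds the full probability column of Table 2 from the rounded 12-month device unavailability q = 0.01; the branch structure assumed here follows the event ordering FD, R, TS, B1, B2 of Figure 8:

```python
q = 0.01   # device unavailability at a 12-month inspection interval
p = 1 - q  # probability that a device operates when required

# One product term per path of Figure 8; once an upstream device (FD, R
# or TS) fails, the breakers never receive a trip, so no further branching.
paths = {
    1: p * p * p * p * p,  # all devices operate: both breakers open
    2: p * p * p * p * q,  # B2 fails to open
    3: p * p * p * q * p,  # B1 fails to open
    4: p * p * p * q * q,  # both breakers fail to open
    5: p * p * q,          # trip signal TS fails
    6: p * q,              # relay R fails
    7: q,                  # fault detector FD fails
}
for k, prob in paths.items():
    print(f"path {k}: {prob:.6f}")  # matches Table 2 to six decimals
```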

Conclusions
The reliability of protective systems is extremely important for the secure operation of a power
system. The protection of electrical systems is based on the principle of measuring field values
(currents, voltages, etc.) and disconnecting the faulted circuit or component when certain
conditions are met. The reliability of a protective system rests on two attributes: 1) dependability
and 2) security. Dependability is a measure of the confidence that the protective system will
operate when required; security is a measure of the confidence that the relaying system will not
operate unnecessarily. Most protective systems in power systems are highly dependable;
consequently, most malfunctions of protective relaying are operations when not required.
This report presented a brief review of techniques and concepts applicable to the reliability
assessment of protective systems. It has been shown that several techniques are available for
assessing the reliability of protective systems. The main limitation may lie in the availability of
the appropriate data required by such methods.

References
[1] Billinton, R.; Allan, R.N. Reliability Evaluation of Engineering Systems: Concepts and
Techniques. Pitman Advanced Publishing Program, London, 1985.
[2] Bhattacharya, K.; Bollen, M.H.J.; Daalder, J.E. Operation of Restructured Power
Systems. Kluwer Power Electronics and Power Systems Series, Series Editor: M.A.
Pai, 2001.
[3] Horowitz, S.H.; Phadke, A.G. Power System Relaying, Second edition. Research
Studies Press Ltd, England, 2003.
[4] Brown, R. Electric Power Distribution Reliability. Marcel Dekker Inc., Power
Engineering Series, New York, 2002.
[5] Mason, R.C. The Art and Science of Protective Relaying. John Wiley & Sons Inc., New
York, 1956.
[6] Anderson, P.M. Power System Protection. IEEE Press Series on Power Engineering,
1998.
[7] ANSI/IEEE C37.90-1989. IEEE Standard for Relays and Relay Systems Associated with
Electric Power Apparatus. IEEE, 1989.
[8] IEEE Std C37.100-1992. IEEE Standard Definitions for Power Switchgear. IEEE, 1992.
[9] Wang, F. On Power Quality and Protection. Licentiate thesis, Department of Electric
Power Engineering, Chalmers University of Technology, 2001.
[10] Billinton, R.; Allan, R.N. Reliability Evaluation of Power Systems. Pitman
Advanced Publishing Program, 1984.
[11] Bollen, M. Understanding Power Quality Problems: Voltage Sags and
Interruptions. IEEE Press Series on Power Engineering, New York, 2000.
[12] Van Casteren, J. Power System Reliability Assessment Using the Weibull-Markov
Model. Licentiate thesis, Electric Power Engineering Department, Chalmers
University of Technology, Gothenburg, Sweden, 2001.
[13] Billinton, R.; Wang, P. Teaching Distribution System Reliability Using Monte
Carlo Simulation. IEEE Transactions on Power Systems, June 1998.
[14] Meeuwsen, J. Reliability Evaluation of Electric Transmission and Distribution
Systems. PhD thesis, Delft University of Technology, the Netherlands, 1998.
[15] Häggström, O. Finite Markov Chains and Algorithmic Applications.
Department of Mathematical Statistics, Chalmers University of Technology, 2001.
