
Identification of PD Defect Typologies

Using a Support Vector Machine


P. L. Lewin, J. A. Hunter
The Tony Davis High Voltage Lab., Electronics and Computer Science
University of Southampton
Southampton, SO17 1BJ, UK
e-mail: pll@ecs.soton.ac.uk

L. Hao
GE Global Research
Niskayuna, NY, USA
liwei.hao@ge.com

A. Contin
D.I.A. University of Trieste
Via A.Valerio, 10
34127 Trieste, Italy
e-mail: contin@units.it

Abstract: The Support Vector Machine (SVM) has been adopted here to identify four different
Partial Discharge (PD) sources that can affect the insulation system of AC rotating machines. A
number of Roebel bars were prepared to generate bar-to-finger, corona and slot PD in addition to
the distributed micro-voids that are typical of this insulation type. PD measurements were
performed using different set-up conditions, defect locations and voltage levels in order to
produce examples of PD activity that represent the same source under a range of conditions. The
SVM was trained to differentiate between the inherent features (global and derived parameters)
of the phase resolved PD (PRPD) distributions produced by each discharge source. In order to
achieve the optimum source classification accuracy, different combinations of distribution
features were used to produce a range of SVM models to identify which parameters were
influenced by the measurement conditions. A cross validation technique has been used to obtain
the highest testing accuracy. Moreover, results obtained using raw data and normalized
parameters were also compared to obtain the best identification performance of the given defect
typologies.

Keywords: Partial Discharges; Insulation Systems; Diagnostics; Support Vector Machine.

I. INTRODUCTION
Many approaches have been published that aim to identify defects within insulation systems
through the analysis of Partial Discharge (PD) activity. Some of them are suitable for
implementation by experienced engineers, [1], whereas others are designed for computer-based
automatic identification, [2, 3]. Even if a number of statistical and artificial intelligence
techniques have been proposed, [4-6], significant concerns still remain regarding the
performance of automatic data interpretation systems. The reliable performance of these methods
depends on the choice of PD activity features, in particular those whose values do not depend
on the applied voltage, defect location and measurement set-up (robust parameters) [7].
The use of the Support Vector Machine (SVM) for the identification of defects generating PD in
components such as bars or coils, as well as complete machines, has been investigated.
Initially, four different PD generating defect typologies, that is, distributed microvoids,
bar-to-finger, end-arm corona and slot discharges, have been considered. PD measurements were
performed on Roebel bars containing these defects at different voltage levels, using different
set-up configurations (coupling sensors and bandwidths), different defect locations and PD
sources. The features considered were the global parameters defined in IEC 60270 and IEEE Guide
1434, [8, 9], as well as those derived from the stochastic analysis of PD-pulse phase and
height distributions. Initially, a comprehensive selection of 35 parameters was considered as
useful classification features but, during the training process, parameters whose values are
affected by the test conditions were progressively removed and the classification accuracy
re-evaluated. A cross-validation process was applied to the model in order to verify whether
the optimum values correspond with the highest testing accuracy. A grid-search algorithm was
applied to test the different combinations of model parameters and find the pair of parameters
that provided the optimum value for the SVM training process. Finally, a blind test was
undertaken to validate the procedure; its results are discussed in the paper.

II. EXPERIMENTAL SETUP
For a PD source classification algorithm to claim general validity, it must show robustness to
the variation of sensor configuration, applied voltage magnitude, defect location and
measurement setup. An experiment was designed to produce representative PD data from a series
of well understood model defects whilst varying the previously highlighted experiment/sample
characteristics.
The test samples comprised Roebel bars completed with end-arm and slot stress grading paint. A
number of them were tested independently, whilst the others were inserted into the slots of a
stator mock-up. A sub-set of the bars had no defects except for the unavoidable presence of
distributed microvoids. Others had their semi-conductive paint abraded before introducing the
bar inside the slot, thus producing a poor contact between the bar and the magnetic core of the
stator, hence generating “slot” PD. Three of them were predominantly affected by corona
discharges, that is, PD due to the degradation of stress grading paint at the end arm. Two bars
with no major defects (except for microvoids) were used together to create bar-to-bar PD
between the top and bottom bar in the end-arms [10]. Different bars were tested alone and
whilst connected in series (with up to five bars) to evaluate the influence of the transmission
path on the identification system. As expected, when the number of connected bars increases, PD
magnitudes decrease and the signal waveforms appear more distorted due to signal attenuation,
dispersion and reflection.
PD measurements were performed using coupling capacitors of 80 pF and 1 nF connected to a
resistor of 50 Ω as well as an HFCT (bandwidth of 1-80 MHz). An innovative measurement system
was employed to record and process the PD pulses [11]. It is based on sequence mode
acquisition, which records the wave-shape of every detected pulse and stores a large number of
individual pulses, enough to statistically process PD amplitude and phase distributions. Three
different bandwidths were used in the experiments, 25, 200 and 500 MHz, while the low-frequency
cut-off was 150 kHz for all of the PD measurements. PD measurements were performed at different
voltage levels, from PD inception up to 1.2U0 in steps of 2 kV.
A suitable algorithm, based on the assumption that different PD sources or noise can present
different signal shapes, allows the separation of the recorded data into groups of signals
(classes) having similar waveform shapes [10]. Thus, the phase-resolved PD (PRPD) patterns
presented here were generated by a single source at a time. Typical PRPD patterns of internal,
corona, bar-to-bar and slot discharges are reported in Fig. 1 (Ck=80 pF and BW=25 MHz).

height, as well as indexes that are frequently mentioned in the literature but are not yet
incorporated within standards.
Among these indexes, the most promising ones for identification purposes appear to be those
that are most suited to quantifying the distribution characteristics. Skewness and kurtosis
factors are frequently used to describe pulse-phase distributions [3]. The skewness, Sk,
describes the asymmetry of the phase distributions (Sk=0 for a symmetric function, Sk>0 and
Sk<0 for left-side and right-side skewed distributions, respectively). The kurtosis, Ku, is
related to the sharpness of phase distributions with respect to the normal distribution (Ku=0
for a normal function, Ku>0 and Ku<0 for sharp and flat distributions, respectively).
The height distribution, F(q), can be processed according to the two-parameter Weibull function
having characteristic parameters α and β (scale and shape parameters) [7]. Since α is
proportional to the mean integrated signal height, it does not provide any information about
distribution shape. On the contrary, β indicates the data scatter, with low and high values
indicating high and low data dispersion, respectively.
In addition, the ratios of other parameters evaluated for positive and negative PD can be used
to single out asymmetries in the PD pattern. One index commonly used for this purpose is the
NQN factor, which is derived from the F(q) distribution according to IEEE-1434 Std [9].
In total, 35 statistical operators, basic or deduced quantities, were calculated for each
PD-pulse sequence. A technique using SVMs has been applied to investigate the performance of
each distribution descriptor with regard to robust PD source classification.
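As an illustration of the descriptors discussed in Section III, the following is a minimal sketch of how the skewness Sk, the kurtosis Ku and the Weibull parameters α and β might be computed from binned PD data. It is not the implementation used in this work. The function names are hypothetical, the bin values in the example are invented, and the Weibull estimate uses a least-squares fit on the Weibull plot with Bernard's median-rank approximation, which is an assumption since the estimation method is not stated here. With the usual moment definition, Sk > 0 corresponds to a distribution whose bulk lies on the left. The NQN factor is not sketched because its definition is given in [9] rather than in this paper.

```python
import math


def weighted_shape_factors(bin_centres, counts):
    """Return (skewness, excess kurtosis) of a binned distribution.

    bin_centres: phase (or height) value at the centre of each bin.
    counts: number of PD pulses observed in each bin.
    """
    total = float(sum(counts))
    probs = [c / total for c in counts]
    mean = sum(x * p for x, p in zip(bin_centres, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(bin_centres, probs))
    sigma = math.sqrt(var)
    sk = sum((x - mean) ** 3 * p for x, p in zip(bin_centres, probs)) / sigma ** 3
    # Subtracting 3 makes Ku = 0 for a normal distribution.
    ku = sum((x - mean) ** 4 * p for x, p in zip(bin_centres, probs)) / sigma ** 4 - 3.0
    return sk, ku


def weibull_fit(heights):
    """Estimate Weibull scale (alpha) and shape (beta) from positive pulse heights.

    Uses F(q) = 1 - exp(-(q / alpha) ** beta), which is linear on the Weibull plot:
    ln(-ln(1 - F)) = beta * ln(q) - beta * ln(alpha).
    """
    q = sorted(heights)
    n = len(q)
    xs, ys = [], []
    for i, qi in enumerate(q, start=1):
        f = (i - 0.3) / (n + 0.4)  # Bernard's median-rank estimate of F(q_i)
        xs.append(math.log(qi))
        ys.append(math.log(-math.log(1.0 - f)))
    mx = sum(xs) / n
    my = sum(ys) / n
    # Ordinary least-squares slope of y on x gives the shape parameter.
    beta = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
    # The intercept equals -beta * ln(alpha).
    alpha = math.exp(mx - my / beta)
    return alpha, beta


if __name__ == "__main__":
    phase_bins = [0, 1, 2, 3, 4]  # illustrative bin centres, arbitrary units
    print(weighted_shape_factors(phase_bins, [1, 2, 3, 2, 1]))  # symmetric pattern
    print(weighted_shape_factors(phase_bins, [5, 3, 1, 1, 1]))  # bulk on the left
    print(weibull_fit([0.4, 0.9, 1.3, 1.8, 2.6, 3.5]))  # invented pulse heights
```

Computing the moments from bin counts rather than from individual pulses matches the way PRPD distributions are usually stored, and keeping the fit in closed form avoids any dependence on an external optimisation library.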
Fig. 1. Examples of PRPD patterns detected on single bars provided by a single defect typology.
(A) internal, (B) corona, (C) bar-to-bar and (D) slot discharges (Ck=80 pF and BW=25 MHz).

III. STOCHASTIC ANALYSIS OF PD SIGNALS
Once a series of PD pulses generated by a single PD source has been separated from other
signals and from noise, [12], the PRPD pattern and the relevant pulse-phase and pulse-height
distributions can be evaluated. Statistical operators can be derived from the stochastic
analysis of these distributions in order to extract numerical indexes that can be used for
identification purposes.
It is worthwhile recalling that a number of indexes are suggested by the relevant standards,
[8, 9], such as inception and extinction voltage, repetition rate, mean values of discharge

IV. THE SUPPORT VECTOR MACHINE
Among the different pattern recognition techniques, the SVM has been selected for its
advantages in the statistical treatment of small quantities of non-linear and high
dimensionality data [13]. According to SVM theory, the appropriate kernel selection,
normalization of the training data and the optimization of the model generation parameters have
been investigated to obtain a reliable defect identification method.

A. Kernel Selection

A number of kernel functions are available for modern SVM toolboxes. They act to increase the
separation of different input classes and improve classification results. The definitive
suitability of one kernel over another has not yet been agreed for the application of PD
classification. However, for general application, the Gaussian Radial Basis Function
(Gaussian-RBF) kernel is highly recommended and is defined as:

$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right), \quad \gamma > 0 \qquad (1)$$

There are three reasons for this recommendation. First, the RBF kernel nonlinearly maps data
points into a higher dimensional space. Second, the number of kernel parameters, which
influences the complexity of SVM model selection, is minimized. Finally, the RBF kernel has
fewer numerical problems than other kernels where kernel values
may tend to infinity [14]. The application of various kernels to PD data has been assessed
previously, concluding that the Gaussian-RBF kernel is one of the most effective for PD data
classification [15].

B. Data Normalization

Normalizing or scaling data is a very important pre-processing stage for PD source
classification, not only in the application of SVM but also in many other pattern recognition
tools such as neural networks. The purpose of normalization is to avoid features that exhibit
greater numeric ranges dominating those in smaller numeric ranges. Another advantage is to
avoid numerical difficulties during the application of kernel functions [16]. In this
investigation, the parameter feature vector consisted of 34 positive and 34 negative components
plus the NQN ratio, which were normalized in the range of -1 to +1. The data used to test the
performance of the SVM was also normalized using the same technique as the training data. The
generation of the SVM models using the raw data has also been evaluated and the results of both
approaches compared.

C. Grid-Search and Cross-Validation

Two parameters, C and γ, require optimisation while generating SVM models from training data. C
is referred to as the regularisation parameter and γ as the flexibility parameter. They act to
control the trade-off between increasing the separation margin and the misclassification rate,
and to define the nonlinear embedding of data during the kernel application, respectively. It
is not known beforehand which C and γ are optimal for a specific data-set and there is no
standardised method for selecting these two parameters. Therefore, a parameter search algorithm
must be applied to find the optimized C and γ. This process is called a grid-search. The
objective of a grid-search is to identify the best pair of C and γ to accurately classify the
training data. However, in practice, a high training accuracy does not guarantee a high
identification accuracy for the testing set, since the characteristics of training and testing
data can be markedly different; a practical classification system would require regular
optimisation of these parameters over time.

D. SVM Identification

In order to investigate how the number of folds affects the parameter optimisation process, a
data-set was used to generate a number of SVM models. The experimental Robinson data obtained
at two applied voltages were used as the training data set. By applying the grid-search and
different numbers of folds (2 folds, 3 folds, 5 folds and 10 folds) for cross-validation, the
optimised training results are shown in Table 2. 5 folds and 10 folds reveal perfect training
accuracies of 100% for the same C and γ values.

Table 2. SVM training accuracies and optimized parameters

Cross-validation folds    2         3         5         10
Training accuracy         99.97%    99.97%    100%      100%
C                         2^5       2^5       2^11      2^11
γ                         2^-3      2^-2      2^-7      2^-7

The RFCT data are used to test the performance of the optimised SVM with C and γ values of 2^11
and 2^-7 respectively. The overall classification accuracy achieved is 98.33% (59/60) and
details can be found in Table 3. The only misclassified cycle is an internal discharge, which
may have been caused by the slight phase shift introduced by the RFCT sensor.

V. APPLICATION EXAMPLES

In order to test the suitability of different PD distribution characteristics as effective
source classification features, a number of combinations of them were used to optimize several
SVM models. The reason that the optimization process was used to judge the performance of
features relates to the size of the available data sets. Each set of data was produced from a
range of samples, data acquisition systems and test voltages. Consequently, the amount of time
required to generate one data-set is considerable. For the purposes of this investigation, ten
different data-sets were tested, each set having been comprehensively described using the
various statistical techniques explained in Section III. The results of the optimization
process are shown in Table 3. A 5 fold cross-validation process was applied to the data to
produce the results, and both the scaled and unscaled results are provided.
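The grid-search and cross-validation procedure of Section IV (subsection C) can be illustrated with a minimal sketch: exponentially growing candidate values of C and γ are enumerated, each pair is scored by its mean accuracy over k folds, and the best pair is kept. This is not the code used in this work. All function names are hypothetical, and `toy_train_and_score` is an artificial stand-in for "train an SVM on the training folds and score it on the held-out fold", included only so that the sketch runs without an SVM toolbox. Its peak location and the sample count of 60 in the example are arbitrary.

```python
import itertools
import math


def exponential_grid(low=-12, high=12, base=2.0):
    """Candidate values base**low, ..., base**high (25 values by default)."""
    return [base ** e for e in range(low, high + 1)]


def k_fold_indices(n_samples, k):
    """Return k (train_indices, validation_indices) pairs covering range(n_samples)."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    pairs = []
    for i, validation in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        pairs.append((train, validation))
    return pairs


def make_cv_scorer(train_and_score, n_samples, k):
    """Wrap a per-fold scoring function into a mean cross-validation accuracy."""
    pairs = k_fold_indices(n_samples, k)

    def cv_accuracy(c, gamma):
        scores = [train_and_score(c, gamma, tr, va) for tr, va in pairs]
        return sum(scores) / len(scores)

    return cv_accuracy


def grid_search(cv_accuracy, c_values, gamma_values):
    """Return (best_accuracy, C, gamma) over every pair in the grid."""
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        score = cv_accuracy(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best


def toy_train_and_score(c, gamma, train_idx, val_idx):
    """Stand-in scorer, NOT an SVM.

    Returns an artificial accuracy surface that peaks at C = 2**3 and
    gamma = 2**-5; the fold indices are accepted but ignored.
    """
    return 1.0 / (1.0 + abs(math.log2(c) - 3) + abs(math.log2(gamma) + 5))


if __name__ == "__main__":
    grid = exponential_grid()
    scorer = make_cv_scorer(toy_train_and_score, n_samples=60, k=5)
    print(grid_search(scorer, grid, grid))
```

In a real application the stand-in would be replaced by a call to the chosen SVM toolbox. Separating the fold construction from the scoring function keeps the search loop independent of any particular library.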
A common way to optimise these parameters is to separate the training data into multiple
sub-sets, of which one or more are treated as unknown to the classifier during the training
process and used as testing data, while the remaining sub-sets are used for training. The
prediction accuracy on this held-out data then reflects more precisely the performance when
classifying unknown data. This improved procedure is known as cross-validation. In this
investigation, C and γ were optimised using both a grid-search and cross-validation. Pairs of C
and γ are tested and the one with the best cross-validation accuracy is recorded. A practical
method used by researchers is to populate the grid search with exponentially growing sequences
of C and γ; in this case, C = 2^-12, 2^-11, ..., 2^12 and γ = 2^-12, 2^-11, ..., 2^12 are used
to investigate the optimal performance.

VI. DISCUSSION AND CONCLUSIONS
A 5 fold cross-validation technique has been applied to a comprehensive experimental dataset
produced from a number of PD sources using a range of data acquisition systems and applied
voltages. The data was produced in order to investigate which commonly used phase-resolved PD
distribution descriptors are most robust for source classification. Several SVM models were
optimized and tested, with their performance linked to the independence of the distribution
features from experiment and applied voltage variations. Different combinations of distribution
features were tested using both the raw data and the scaled data. The results show that the
scaled data returned a classification accuracy at least as high as the raw data in all cases
(Table 3). The classification result for the entire set of distribution descriptors was
considerably lower than
other combinations. This implies that sufficient “noise” (features that have a strong
dependence on the experiment setup or applied voltage) is introduced to the data to have a
detrimental effect on SVM performance. This result also suggests a weakness in the SVM system
and that its robustness to less correlated information could be an area for improvement. The
highest classification result of 100% was produced by the following features: Weibull shape
factor Beta, Weibull skewness factor, NQN, skewness and kurtosis of the pulse number
distribution. By producing the highest classification results, these descriptors are shown to
possess a high source dependency and to be robust to changes in experiment setup and applied
voltage. This suggests that they could be used for automated classification tools going
forward. Many PD distribution based classification systems work very effectively on experiment
based data but struggle to perform well in realistic field conditions. A possible solution to
this issue would be to combine the method presented here with a suitable source discrimination
algorithm such as the one described in [3]. By pre-processing the data and identifying
different active sources before mathematically describing the produced distributions, the
effects of noise on classification accuracy could be minimized.

Table 3. SVM optimization results

Distribution descriptors                               Classification accuracy   Classification accuracy
                                                       (un-scaled data)          (scaled data)
35 features: entire dataset                            84.84%                    92.42%
26 features: phase relationship, Weibull description
  and statistical description of charge, and the
  pulse number distributions                           86.36%                    93.93%
16 features: Weibull description, skewness and
  kurtosis of the pulse number and charge
  distributions                                        84.84%                    96.96%
8 features: Weibull shape factor Beta, NQN and
  skewness and kurtosis of the pulse number
  distribution                                         98.48%                    98.48%
10 features: Weibull shape factor Beta, Weibull
  skewness factor, NQN, skewness and kurtosis of
  pulse number distribution                            90.90%                    100%

REFERENCES
[1] B. Fruth, J. Fuhr, “Partial Discharge Pattern Recognition – A Tool for Diagnosis and
    Monitoring of Ageing”, CIGRE Report, Paper 15/33-12, Paris (France), 1990.
[2] N. C. Sahoo, M. M. Salama, R. Bartnikas, “Trends in PD Pattern Classification: a Survey”,
    IEEE Trans. on Dielectrics and Electrical Insulation, Vol.1, pp.248-264, April 2005.
[3] A. Contin, A. Tessarolo, “Identification of Defects Generating PD in ac Rotating Machines by
    Means of Fuzzy Tools”, Proc. of IEEE International Symposium on Electrical Insulation,
    pp.558-562, June 2008.
[4] L. Ruihua, X. Hengkun, G. Naikui and S. Weixiang, “Genetic programming for partial discharge
    feature construction in large generator diagnosis”, Proc. of the 7th International
    Conference in Properties and Applications of Dielectric Materials, Vol.1, pp.258-261, June
    2003.
[5] D. Wenzel, H. Borsi and E. Gockenbach, “A new approach for partial discharge recognition on
    transformer on-site by means of genetic algorithms”, Proc. of IEEE International Symposium
    on Electrical Insulation, pp.57-60, June 1996.
[6] G. Wu, X. Jiang and H. Xie, “A Neural Network used for PD pattern recognition with genetic
    algorithm”, Proc. of the 6th International Conference in Properties and Applications of
    Dielectric Materials, Vol.1, pp.451-454, June 2000.
[7] A. Contin, A. Cavallini, G. C. Montanari, C. Hudon, M. Belec, D. N. Nguyen, “Searching for
    Indexes Suitable for Rotating Machine Diagnosis”, Proc. of IEEE International Symposium on
    Electrical Insulation, pp.101-105, June 2006.
[8] High-voltage test techniques. Partial discharge measurement. IEC Standard 60270 fourth
    edition, March 2001.
[9] IEEE Trial-Use Guide to the Measurements of Partial Discharges in Rotating Machinery, IEEE
    Std. 1434-2000, April 2000.
[10] C. Hudon, M. Belec, “Partial Discharge Signal Interpretation for Generator Diagnostics”,
    IEEE Trans. on Dielectrics and Electrical Insulation, Vol.12, pp.297-319, April 2005.
[11] A. Contin, A. Cavallini, G. C. Montanari, G. Pasini, F. Puletti, “Digital Detection and
    Fuzzy Classification of Partial Discharge Signals”, IEEE Trans. on Dielectrics and
    Electrical Insulation, Vol.9, pp.335-348, April 2002.
[12] L. Hao, P. L. Lewin, J. A. Hunter, D. J. Swaffield, A. Contin, C. Walton, and M. Michel,
    “Discrimination of Multiple PD Sources Using Wavelet Decomposition and Principal Component
    Analysis”, IEEE Trans. on Dielectrics and Electrical Insulation, Vol.18, pp.1702-1711,
    October 2011.
[13] L. Hao, P. L. Lewin, Y. Tian and S. J. Dodd, “Partial Discharge Identification Using a
    Support Vector Machine”, Annual Report of Conference on Electrical Insulation and Dielectric
    Phenomena, pp.414-417, October 2005.
[14] C. Hsu, C. Chang and C. Lin, “A Practical Guide to Support Vector Classification”, December
    2007, http://www.csie.ntu.edu.tw/cjlin/libsvm.
[15] L. Hao, P. L. Lewin and Y. Tian, “Partial Discharge Discrimination Using a Support Vector
    Machine”, Proc. of the XIVth International Symposium on High Voltage Engineering, Paper
    P-35, CD-ROM. IEEE, August 2005.
[16] C. Chang and C. Lin, “LIBSVM: a Library for Support Vector Machines”, April 2005,
    http://www.csie.ntu.edu.tw/cjlin/libsvm.
