IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, VOL. 6, NO. 5, SEPTEMBER 1998 623

A Modular Methodology for Fast Fault Detection and Classification in Power Systems
Fahmida N. Chowdhury, Member, IEEE, and Jorge L. Aravena, Member, IEEE

Abstract— This paper presents a modular yet integrated approach to the problem of
fast fault detection and classification. Although the specific application example
studied here is a power system, the method would be applicable to arbitrary dynamic
systems. The approach is quite flexible in the sense that it can be model-based or
model-free. In the model-free case, we emphasize the use of concepts from signal
processing and wavelet theory to create fast and sensitive fault indicators. If a
model is available then conventionally generated residuals can serve as fault
indicators. The indicators can then be analyzed by standard statistical hypothesis
testing or by artificial neural networks to create intelligent decision rules. After
a detection, the fault indicator is processed by a Kohonen network to classify the
fault. The approach described here is expected to be of wide applicability. Results
of computer experiments with simulated faulty transmission lines are included.

Index Terms— Fault classification, fault detection, Kohonen networks, neural
networks, power systems, real-time, wavelet transformation.

Fig. 1. Overall scheme.

Fig. 2. Residual generating with existing model.


Manuscript received September 27, 1996. Recommended by Associate Editor, G. J. Rogers.
The work of F. Chowdhury was supported in part by NSF Grant ECS-9526341.
F. N. Chowdhury is with the Electrical and Computer Engineering, University of
Southwestern Louisiana, Lafayette, LA 70504-3890 USA.
J. L. Aravena is with the Electrical and Computer Engineering, Louisiana State
University, Baton Rouge, LA 70803 USA.
Publisher Item Identifier S 1063-6536(98)06251-4.
1063–6536/98$10.00 © 1998 IEEE

I. INTRODUCTION

A. Overview of the Modular Methodology

THE task of fast fault detection includes two major parts: 1) creation of a measure
to serve as the indicator of normal–abnormal behavior and 2) design of a decision
rule, based on that measure, to detect the fault. After detection, an additional
phase of classification may be required. These three modules are described below.
Our main focus is to develop a methodology that does not necessarily rely on the use
of mathematical models. If models are available they can be used to advantage, but
the technique can be implemented without an explicit model. The three-module scheme
is shown in Fig. 1.

1) Generation of fault indicators (Module I): This can be done in two major ways.

a) Model-based: This is the most widely used method. In this, a residual is
generated, which is typically the difference between the actual system’s output and
the output predicted by a model. This we shall call the model-based method, which is
fully compatible with our modular methodology. There are two possibilities here.
• An accurate mathematical or input–output (I/O) model of the nominal (fault-free)
system is available and we can use the residual directly in the second module of our
scheme. This is the best possible case; however, in many practical situations this
does not hold.
• A mathematical model is not available, but an I/O model (such as the ARMA model if
the system is linear, or a neural-network model if the system is nonlinear) can be
built on-line. This can lead to the generation of a residual which can be used in
Module II. However, I/O model building can be a very hard task if the system in
question is nonlinear. Moreover, currently available I/O modeling techniques,
including neural network methods, suffer from many restrictions. For example, the
order of the system must be known or must be discovered by trial and error; one must
assume that the system will operate fault-free for a long enough time so that a
nominal I/O model can be developed, etc.
Figs. 2 and 3 show the residual-generating techniques.

Fig. 3. Residual generating with I/O model.

b) Model-Free: There are many situations when an accurate mathematical model is
either unavailable or is too complex, and the task of building an I/O model is not
practical. For such cases we propose a model-free method of generating the fault
indicator. The principle behind this approach is: if the fault is detectable it must
produce changes in the monitored variables, which may be small but can be enhanced
with signal processing techniques. Here we describe the creation of an orthogonal
decomposition based on multirate filter banks. Incoming data (for example measured
voltages and currents) are processed with the multirate filter bank to generate a
set of sensitive fault indicators, without requiring a model of the system from
which the data are coming. Generation of fault indicators without a mathematical
model of the system is one of the main contributions of this paper, and is described
in detail in later sections.

2) Fault detection (Module II): The fault indicators (regardless of how they have
been generated) can be tested by conventional hypothesis testing methods, but since
in general the indicators are vectors, the design of multiple hypothesis testing
becomes very complicated. In this paper we describe an alternate technique where the
fault indicators are processed by a three-layer feedforward neural network. This
neural net works as a hypothesis tester to answer the question: does a fault exist,
or is this a normal situation? The details of this neural network depend upon
whether its inputs are residuals or model-free indicators. The neural network for
use with residual-based methods is described briefly in the paper. The necessary
changes for adapting this network to the wavelet-based fault-indicators are also
developed and presented.

3) Fault classification (Module III): Only if a fault is detected are the indicators
entered into the classification network. The classification network is a
self-organizing neural net that works as a pattern classifier and produces
information on fault type and location. The actual operation of this module is
dependent on whether or not a system model is available.

a) In the absence of a model, this module can be incorporated with an expert system
which is based on historical data of typical faults for the particular system. If
such a knowledge-base is available, then the classification net would obtain
exemplars for specific fault classes from Module I, and there would be no need to
train Module III. In this case, Module I would provide exemplars by processing the
actual fault data with the filter bank and thus generating fault-indicators for
known types of faults.

b) If an accurate and convenient model is available, then we can introduce simulated
faults into the model and generate specific types of residuals for specific types of
faults, and use them as exemplars in the neural net of Module III. In this case,
Module I is operating as a residual-based technique. Alternately, system or model
responses during specific simulated faults can be processed with the filter-bank and
used as exemplars for Module III, in which case Module I would remain model free.

B. Current Research in the Field

In the context of fault detection in general, the dynamical systems community is
actively researching various approaches to fast detection and isolation. Most
visible efforts are residual-based. Figs. 2 and 3 show two variations of the
residual-based method of generating the fault indicator. For a good discussion of
general-purpose failure-detection methods, see [1]. In [1] the authors summarize the
available approaches and develop a general methodology for the task. However, they
do not mention any method which can be model-free, and can be implemented only using
measured system outputs. Our survey of fault-detection methods in the specific
context of power systems also shows that model/residual based techniques dominate
the field. A recent report¹ available on the World Wide Web [2] indicates that the
problem of detecting high-impedance faults is far from solved. In general,
fault-detection has remained an active area of research, and many different methods
are proposed. For example, fixed-gain filters [3], Kalman filters [4]–[7], fuzzy
logic [8] and travelling waves [9] based approaches have been explored. The common
technique in the residual-based methods is to generate estimated outputs (either
from a “known” nominal model or by using system identification methods), and take
the error (difference between estimated and actual values) as the indicator of
normal/abnormal behavior. Whenever this indicator deviates from its theoretical
value, one can assume that a fault has happened. The differences between the various
fault-detection methods mainly lie in two major areas: how to generate the
residuals, and how to test them. However, all these methods rely on the availability
(or estimation) of accurate mathematical models, and are therefore subject to all
the limitations of modeling uncertainties, unknown nonlinearities in the actual
system, etc.

¹ This report is from the IEEE Power Systems Relaying Committee Working Group.

Neural nets have been well studied for power system applications. One can easily
compile dozens of references on the subject [10]–[16]. However, the use of wavelets
is very recent, and interest in this tool is now growing. An indication of such
interest is the work by Robertson et al. [17]. A novel feature of our paper is the
integration of the two tools (wavelets and neural networks) for the development of
fast detectors which are not dependent on the availability of accurate models.
Despite the large number of neural-net applications to power
systems, we believe that our use of them as hypothesis testers is unique. Their
usefulness in this mode has been verified by one of the authors in conjunction with
a Kalman filter based estimation process. The application was power system state
estimation and fault detection [7]. Also, available literature suggests that the
main focus in power systems neural net applications has been on supervised learning.
We believe that unsupervised learning, with its recognized capability for extracting
relationships present in the data, is a better method for the classification task.
The feasibility of unsupervised learning for power system fault classification was
studied by Lubkeman et al. [15]. Our method differs from the Lubkeman approach in
the following.
• Model-free option: While our overall methodology is flexible enough to include the
use of system models, we describe a model-free option. Techniques from signal
processing provide a set of enhanced, sensitive, and definitive fault-indicator
patterns, without requiring a system model (while in [15], the data need to be
processed with a Kalman filter, which presupposes that a system model is available).
• On-line: Our method is intended for on-line application, while clearly in [15] the
application would have to be off-line because the construction of the input patterns
requires the availability of optimal estimates of pre- and postfault magnitudes and
angles of voltage and current phasors, etc.

For the development of indicators, it is desirable to use techniques which are, as
much as possible, based only on general cause/effect phenomena and not dependent on
the availability of an explicit mathematical model. Then the technique will be
applicable to a large class of situations. Although it is possible to use neural
nets to generate fault indicators, at the present stage of development it appears
that the feasibility of most of the approaches is based on simulation studies.
Lacking a theoretical guarantee of workability, these methods usually cannot be
generalized. Hence our decision to try a fresh approach based on a theoretically
well-established tool from the field of applied mathematics and signal processing.
However, the demonstrated success of neural nets as decision-makers and pattern
classifiers either equals or surpasses that of the commonly used statistical
methods. Our wavelet-based approach is meant to be used together with neural
networks for decision-making and classification.

We have performed simulation tests on a transmission line model to verify the
workability of our approach. The working assumption is that a detectable fault must
affect the available instrumentation/data acquisition system. The effect may be very
small in conventional instruments but can be enhanced with appropriate signal
processing tools. The approach does not presume any a priori knowledge about the
system being analyzed.

II. FILTER BANK FOR THE DEVELOPMENT OF MODEL-FREE FAULT INDICATORS

Any detectable fault must introduce transients in the observed data. These
“irregularities” carry important information about the fault. It is generally
accepted that a description in a time-frequency space is well suited to the study of
nonstationary phenomena. We quote from [18]:

    Until recently, the Fourier transform was the main mathematical tool for
    analyzing singularities. The Fourier transform is global and provides a
    description of the overall regularity of signals, but it is not well adapted
    for finding the location and the spatial [temporal] distribution of
    singularities. This was a major motivation for studying the wavelet transform
    in mathematics and applied domains.

It is reasonable to apply a similar argument to fault-induced transients.

If we assume that the signals are monitored continuously, the continuous wavelet
transform yields a very useful time-scale (frequency) representation of a signal.
Its determination, however, can be very time consuming. A more realistic approach is
to assume a computer-based data acquisition system producing discrete time signals
which can be decomposed in wavelet packets [19]. In this case, one uses multirate
filter banks to create representations of the discrete time signal over different
regions of the time/frequency domain. By a suitable selection of the filter banks,
one can create very general time-frequency representations.

In the conventional applications of multirate filter banks, one has an analysis bank
which uses downsampling to reduce redundancy and to increase efficiency (e.g., by
reducing the number of samples that must be sent over a communication channel) and a
synthesis bank which, in the case of perfect reconstruction, uses the output of the
analysis bank to recreate the original signal. In the application described here, we
use the outputs of the synthesis bank to decompose the signal into orthogonal
components which have essentially no overlap in the frequency domain (zero overlap
is theoretically impossible for real signals and filters). This last characteristic
can be used to increase selectivity and sensitivity in the detection process.
Selectivity is increased because one can differentiate transients whose frequency
characteristics are very similar. Moreover, if one has information about the system,
or a model, one can define a specialized bank to define frequency bands that one
should monitor; reducing (ideally, eliminating) frequency overlap would concentrate
the energy of the fault induced transient in a small number of bands, thereby
increasing sensitivity of the detector.

Given our purpose of using minimal information about the system, we chose a filter
bank which approximately partitions the discrete frequency range in bands of equal
width. In order to create efficient indicators we require that the signals created
by the filter bank should be mutually orthogonal in the time domain. In the approach
presented here, we use outputs from the filter bank to create instantaneous
information vectors, which are not necessarily orthogonal but show very quickly the
effect of a fault. Here we expect a distinct advantage in using these signal
processing tools. Since the monitored variables change in a continuous manner, the
onset of a fault is very difficult to detect and introduces a delay in the detection
process. If one can enhance the detection of the fault induced transient then one
can start taking corrective actions faster
than conventional detectors would allow; it is not difficult to conceive a scenario
where early (or earlier) warning is advantageous.

The filter bank described below is derived from Daubechies’ orthogonal wavelets with
compact support. However, it has been enhanced by the introduction of a bilateral
decomposition to create a uniform wavelet packet. Fig. 4 illustrates the principle
of the bilateral decomposition. The filters are both FIR; the down arrow denotes a
downsampling by two (taking every other sample) and the up arrow denotes upsampling
by two (inserting a zero in between two values). For the example described in the
case study, both have 20 taps and have been taken from Daubechies’ multiresolution
work. The symbol , is used to denote the paraconjugate, .²

² If the filter, H, is causal then its paraconjugate becomes anticausal. For real
time use, all filters must be causal. This is a common problem in DSP and it is
solved by introducing a time delay. Thus, instead of perfect, instantaneous
reconstruction, the filter bank is designed to be a perfect delay line.

Fig. 4. Filter bank. Analysis filters create the compressed components. Synthesis
filters create the orthogonal components.

A formal proof of the properties of the decomposition is beyond the scope of this
paper. However, we emphasize the fact that the actual implementation of the process
is well established in the signal processing field and produces efficient numerical
schemes.

In Fig. 4, we show a scheme to generate four orthogonal components. Each additional
level doubles the number of components generated and (approximately) reduces their
bandwidth by a factor of two. This uniform distribution of the bands appears as the
best choice in the absence of any information about the process. If additional
information, or a model, is available, one can design nonuniform bands retaining the
orthogonality of the signals. As opposed to a conventional Fourier analysis, the
orthogonal components retain the time information of the original signal. Hence one
can use them to localize the fault. On the other hand, the compressed components
have undergone levels of decimation. This fact is important for the generation of
compact signatures of the various faults.

A. The Fault Indicator: Instantaneous Information Vector

As mentioned before, our goal is to generate a fault indicator that does not depend
on the availability of a model of the dynamic system. Here, we use the instantaneous
values of the resolved components to construct instantaneous vectors. Let us call
them instantaneous information vectors (IIV’s). Basically, we can treat this similar
to the residual vector generated in the model-dependent cases. During the course of
an on-line operation, we can realistically assume that most of the time the
situation would be fault-free. Thus the IIV’s will have components with small values
resulting from the numerical steps involved in the orthogonal decomposition process.
These values we will call “noise”; they are not random noise in the sense that they
do not come from the measured data. Rather, they are an artifact of the wavelet
decomposition. As such, our detection net must be taught to ignore them. However,
when a set of faulty data is processed by the filter bank, the resulting IIV will
contain one or more components with large nonzero values. Each different type of
fault will produce a different transient “signature,” which will be reflected in
that particular IIV. The idea is for the detection net to produce a “yes” output as
soon as it encounters an IIV that represents a faulty situation.
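
As an illustration of how IIV’s can be formed, the following is a minimal sketch of a
two-level uniform packet decomposition into four orthogonal components. It is not
the filter bank used in the paper: it assumes the two-tap Haar pair in place of the
20-tap Daubechies filters, a record length that is a multiple of four, and block
(off-line) processing without the delay-line arrangement needed for causal
operation. The function names are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar normalization (assumed stand-in for the 20-tap filters)

def analyze(x):
    """One analysis stage: filter, then downsample by two."""
    low = [(x[i] + x[i + 1]) * S for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * S for i in range(0, len(x) - 1, 2)]
    return low, high

def synthesize(low, high):
    """One synthesis stage: upsample by two, then filter (inverse of analyze)."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out

def orthogonal_components(x):
    """Return four full-rate components of x that are mutually orthogonal."""
    low, high = analyze(x)
    # Splitting both branches again gives the four compressed components.
    leaves = list(analyze(low)) + list(analyze(high))
    comps = []
    for k in range(4):
        # Keep one compressed component, zero the others, and resynthesize.
        kept = [leaf if j == k else [0.0] * len(leaf) for j, leaf in enumerate(leaves)]
        lo = synthesize(kept[0], kept[1])
        hi = synthesize(kept[2], kept[3])
        comps.append(synthesize(lo, hi))
    return comps

def iivs(x):
    """Instantaneous information vectors: one 4-vector per time sample."""
    comps = orthogonal_components(x)
    return [[c[n] for c in comps] for n in range(len(x))]
```

Because each component is resynthesized to the original rate, the four components
sum to the input record and every sample index has its own vector, which is what
allows the indicator to be examined sample by sample.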

III. DECISION MAKING

The second part of the task of fault detection involves decision-making, which is
usually done by statistical tests. The two basic hypotheses are: H0, the null
hypothesis, which means a fault does not exist, and H1, the alternate hypothesis,
which means a fault exists. If a satisfactory and efficient hypothesis tester is
available, it can be used as Module II. If a fault indicator has only one component
(as in the residual-based method applied to a single-output system), a complicated
Module II (such as the neural-net-based decision-maker) will be unnecessary; in such
cases Module II should contain a simple hypothesis tester.

In this paper we develop a general methodology which can be used for multioutput
systems as well as single output ones. Even for single output systems, if Module I
is a multirate filter-bank (that is, when a system model is not being used to
generate residuals), its output will be a vector, thereby requiring a multiple
hypothesis tester as Module II. Hence the development of the neural-net-based
decision-maker. These are the reasons for using neural networks (instead of the
conventional statistical hypothesis testing) to carry out the decision-making task:
• For a multioutput system, multiple hypothesis testing becomes difficult to design.
• The usual approach of using the joint probability density function (to decide
whether a random vector has deviated from its expected mean value) may mask
deviations of the individual components, which may actually carry important
information.
• Biological neurons are natural hypothesis-testers, in fact that IS their basic
function: thus a neural network should be ideally suitable for the task of multiple
hypothesis testing.
• Biological neurons carry out hypothesis testing without explicit knowledge of the
statistical distributions and mathematical models involved: they store implicit
models by way of synaptic weights, which depend on the past experience of the
neurons. Artificial neural networks function on the same principle.

Preliminary successful tests of a neural network as decision maker were reported by
one of the present authors in [7]. In [7], a three-layer neural network was designed
to work as a robust decision-maker in conjunction with a Kalman filter (which gave
fast, on-line estimates of three-phase power system quantities). However, in the
present paper the idea is to replace the Kalman filter (which requires a system
model and generates a residual vector as the fault indicator) by a multiresolution
filter-bank (which does not require a system model, but can generate fault indicator
vectors) as described in the previous section. The detection net described in [7]
can be utilized for our purpose by replacing its inputs (modified residuals from a
Kalman filter) by the IIV’s. For the sake of completeness, we first outline the
basic operation of the Kalman-filter-based detection network. Then we will introduce
a modification to use the IIV’s as the inputs to this network. Actually, since each
of the three modules of our proposed methodology is self-contained, the detection
module can be used successfully with conventional residuals, or with IIV’s as
defined by us, or with any other suitable fault indicators. Fig. 5 is a conceptual
representation of the scheme. This network can be used for both model-free and
model-dependent versions, as shown in the next two sections. The model-dependent
case is illustrated through an example of a three-phase power system.

Fig. 5. The detection network.

A. Model-Dependent Case

Here, the three-phase voltage measurements are modeled as sinusoids with a known
frequency (which is assumed to be the fundamental power frequency), but with unknown
amplitude and unknown phase angle. Gaussian observation noise is assumed to be
present, and the amplitude and phase angle are assumed to be constants with small
random walk components. The three-phase voltage system is modeled as a multioutput
system, with a state vector consisting of six components. Suppose each phase voltage
has the form which can be rewritten for each time-step in the standard
system-theoretic format

(1)
where the observation matrix³ is time-varying. During unfaulted operation, the
amplitude and phase angle remain constant, except for small fluctuations due to
random disturbances; therefore, we have the state equation as

(2)

³ For details of the observation matrix see [7].

Both and are zero-mean, uncorrelated Gaussian noise processes. While generating the
simulated data sequence, the amplitude is taken to be 1 p.u., and the angle is zero,
and , respectively, for phases and assuming a balanced system under nominal
operating conditions.

A Kalman filter is designed for estimating the states of the above system. The
filter uses the state model and the observation model, but does not have any
knowledge of the actual values of the amplitude and phase angle. The Kalman filter
equations are standard; they are not repeated here, except the equation for the
residual, which is known to be a Gaussian random variable with mean zero and a known
(actually, computed in the Kalman filter algorithm) covariance matrix. The residual
is defined as

(3)

where is the Kalman estimate of the state vector at step , given measurements. The
residual is normalized to yield a zero-mean, unit-variance random vector. The
elements of this normalized residual vector are then squared , and used as inputs to
the fault-detecting neural network. Under normal operating conditions (since the
normalized residual has unit variance), the expected value of each input to the
neural network is unity . This property is utilized (through a convenient constraint
on the weights) in the design of the firing thresholds [see (4) and (5)]. Each
consultant neuron (these are the hidden-layer neurons) receives the full set of
squared normalized residuals. Squaring is important because 1) we are concerned with
the magnitudes of these random numbers and 2) if the weights associated with each
input are equal, then the sum of these squared Gaussian random variables would have
the Chi-squared distribution, which would provide us with a baseline for comparing
this technique with the conventional techniques. Each consultant neuron is a hybrid
between a linear combiner and a hardlimiter. Their firing thresholds vary according
to the “strictness” assigned to each one. The idea is to implement this network as a
team of experts making decisions based on prior experience with a given system.

In our simulations, the training was done using the batch-mode delta rule, in the
MATLAB environment. The training of this network is subject to the following
constraints:

(4)

for each consultant neuron , where is the dimension of the input vector, and

for all (5)

With the above constraints, the expected value of the internal potential of each
consultant neuron , under no-fault condition, can be found

(6)

This allows us to normalize the firing thresholds of the neurons so that they have a
lower limit of unity. The constraints are implemented at each step of iteration
during training. After the training converges, the weights are kept fixed. From then
onwards, the decision-making network is ready to be used. It should be noted that we
trained the network online, assuming that during the beginning of the process
(training phase), the system would run without fault for a long enough time to
permit the training to be completed. This detection network was tested extensively
for various amounts of jumps in the voltage amplitudes. The detailed results can be
found in [7]; it is noted here that even though the successful detection rate was
high, for small faults (15% of prefault value) the successful detection rate was
about 50%. The response of the output neuron, when a fault (sudden drop in voltage
amplitude) occurs, is shown in Fig. 6. This is an example of immediate detection,
when the voltage amplitude of phase was suddenly dropped to 0.8.

B. Model-Free Case

The development of the model-free fault indicators (the IIV’s) requires us to modify
the hypothesis testing scheme so that we replace the residuals by IIV’s. Since the
IIV’s are not, strictly speaking, Gaussian random variables, we cannot use the
squared values and arrive at a weighted Chi-squared distribution to assist us in the
design of the firing thresholds for the neurons, as we did in the model-based case.
However, like many other neural-net applications, we can use an experience-based
method to decide what threshold ranges we need to use for distinguishing between
faulted and unfaulted IIV’s. It is known that during a transient in the original
system, the IIV will contain some large components. These components are oscillatory
in nature, therefore we choose to concentrate on their magnitude only. The following
steps are needed to prepare the detection net for operation.
• Use the absolute values of the IIV’s as inputs.
• By experimentation with normal (no-fault) data records, choose the firing
thresholds of the neurons so that they do not fire during normal operation.
• Train the network with two classes of examples: IIV’s generated by no-fault cases,
and IIV’s generated by faulty cases. Any type of fault should result in the on
response of the final decision-making neuron.
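
The first two preparation steps for the model-free case can be sketched as follows.
This is a simplified illustration under stated assumptions, not the trained network
of [7]: the weights are fixed and equal (no delta-rule training is performed), each
consultant neuron is a linear combiner followed by a hard limiter, the thresholds
are set as multiples ("strictness" factors, values chosen arbitrarily here) of the
largest potential seen in no-fault records, and the final neuron is assumed to fire
when a chosen number of consultants fire. All names are hypothetical.

```python
def potential(weights, iiv):
    """Internal potential: weighted sum of the absolute values of the IIV entries."""
    return sum(w * abs(v) for w, v in zip(weights, iiv))

def calibrate(weights, normal_iivs, strictness):
    """Choose one firing threshold per consultant neuron from no-fault records only.

    Every threshold lies above the largest no-fault potential (strictness > 1),
    so no consultant fires on the calibration records.
    """
    worst = max(potential(weights, v) for v in normal_iivs)
    return [s * worst for s in strictness]

def detect(weights, thresholds, iiv, votes_needed=2):
    """Final decision neuron: 'yes' when enough consultants exceed their thresholds."""
    p = potential(weights, iiv)
    fired = sum(1 for t in thresholds if p > t)
    return fired >= votes_needed

# Example with made-up numbers: small decomposition "noise" versus a transient.
weights = [0.25, 0.25, 0.25, 0.25]            # equal weights that sum to one (assumed)
normal = [[0.01, -0.02, 0.0, 0.01], [0.02, 0.01, -0.01, 0.0]]
thresholds = calibrate(weights, normal, strictness=[1.5, 2.0, 3.0])
print(detect(weights, thresholds, [0.0, 0.4, -0.3, 0.1]))   # faulty-looking IIV
print(detect(weights, thresholds, normal[0]))               # no-fault IIV
```

Using several thresholds of differing strictness mirrors the "team of experts" idea:
a lenient consultant reacts to small deviations while a strict one fires only on
large transients, and the voting rule sets how much agreement is required.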
Fig. 6. Output of the final decision-making neuron.

Fig. 7. Load voltage: no fault case.

IV. CLASSIFICATION OF FAULTS

The task of the classification network is to cluster detected faults into separate
classes. As such, it is a pattern recognition problem. Self-organizing neural
networks have been successfully used as pattern classifiers [20], [21] in various
contexts. Self-organization refers to the specific learning method without external
examples. This is also called unsupervised learning. Given a set of input patterns,
neighboring processing units (neurons or cells) in a self-organizing neural network
develop into detectors of specific categories of patterns. In that sense, each local
cell-group acts like a decoder for the inputs.

For using a self-organizing neural net, it is necessary to collect information about
various types of power system faults. Since each type of fault would have its own
unique signature on the wavelet-transformed coefficients, it should be possible to
cluster the cases emerging from same types of faults,
Fig. 8. Load voltage: fault in segment three.

Fig. 9. Load voltage: fault in segment two.

and thus differentiate between different cases. Obviously, the classification net
could be given a “no-fault” class, thereby eliminating the need for the detection
net. However, for an on-line methodology, it is better to keep the detection net in
the loop. The vast majority of data are expected to be of the no-fault type, and it
would be inefficient (and computationally costly) to process all the data through
the classification net, which is expected to be slower than the detection net. Only
in those instances when we already know there is a fault, it would be cost-effective
to switch on the classification net.

Two types of self-organizing nets are commonly used, the Kohonen map [21] and the
ART (adaptive resonance theory) network [20]. Because of the computational burden of
the ART net, we use the Kohonen net, with a few modifications, as the chosen
classifier. In the following section we describe
Fig. 10. Generator current: fault in segment two.

how we build the Kohonen network for the model-free case. However, the discussion is
also valid (in principle) for cases when residuals (or any other suitable indicators
that contain signatures of the fault type) are being used.

A. The Kohonen Network: Some Choices and Modifications

The neurons in a Kohonen network initially have a collection of random weights. The
training vectors, one by one, are presented to the neurons. In the original form of
the Kohonen net, the “winning” neuron , for the th input sample , is selected by the
process of similarity matching, i.e.,

(7)

where is the number of neurons, and is the distance between the vectors and . Common
practice is to use the Euclidean distance (that is, the Euclidean norm of the
difference vector ). Once the winning neuron is found, it, and a selected
neighborhood of it, is updated using the following rule:

It has been found in many studies [22] that for best results, the topological
neighborhood should be large in the beginning of the training process (the “ordering
phase”), and then shrink with time so that toward the end of the process (the
“convergence phase”), should include only the closest neighbors of the winning
neuron . The usual practice is to let the radius of the neighborhood shrink linearly
with each update. Besides this time-shrinking nature, the neighborhood can also have
lateral shrinkage. It has been demonstrated [22] that in biological neurons, there
is lateral interaction: this means that when a neuron is firing, it excites other
neurons in its closest neighborhood more than those farther away from it. To
incorporate this feature in the algorithm, usually the neighborhood around the
winning neuron is made to decay gradually [23], [24]. One of the typical choices is
to let the amplitude of the topological neighborhood (centered on the winning
neuron) decay according to a unimodal Gaussian rule. This means that the
weight-update is the strongest for the winning neuron, and becomes weaker with
increasing lateral distance.
1) Choice of the Distance Measure and Some Other Imple-
if mentation Issues: The idea of similarity matching between
otherwise vectors can be quite complex. While it is common practice to
(8) use the Euclidean distance (that is, norm of the difference of
where is the learning rate at time step , and is the two vectors) as a measure of closeness, it can be argued that
chosen topological neighborhood around the winning neuron it is really a measure of magnitudinal similarity of vectors.
, at time step . The learning process is stopped when the Another measure of similarity, that of directional similarity,
shift of position of any of the output neurons measured by is the inner-product of two vectors. Take, for example, three
the change in the weight vector associated with it falls below vectors: and . According to
a preset value. It is important to note that the neighborhood magnitudinal similarity, the pair is closer to each other
function is assumed to be time-varying in the above than the pair , but according to directional similarity
description, even though in the original form of self-organizing the pair is the winner. In the context of using wavelet-
feature maps it was considered to be fixed. generated IIV’s (as defined in the previous section) as power
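As a minimal sketch of the similarity matching in (7) and an update in the spirit of (8), the Python fragment below performs one training step on a one-dimensional map. It is not the authors' implementation: the function names, the Gaussian neighborhood width `sigma`, the replacement of the hard neighborhood set by a Gaussian amplitude, and all numerical values are assumptions chosen for illustration. The three vectors at the end are likewise hypothetical and only show how the Euclidean distance and the inner product can disagree about which pair is "closest".

```python
import math


def euclidean(x, w):
    """Euclidean norm of the difference vector x - w."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, w)))


def inner(x, w):
    """Inner product, used as a measure of directional similarity."""
    return sum(a * b for a, b in zip(x, w))


def winner(x, weights):
    """Similarity matching as in (7): index of the closest weight vector."""
    return min(range(len(weights)), key=lambda j: euclidean(x, weights[j]))


def som_step(x, weights, eta, sigma):
    """One update on a 1-D map. The hard neighborhood of (8) is replaced
    (assumption) by a Gaussian amplitude decaying with lateral distance."""
    i_win = winner(x, weights)
    for j, w in enumerate(weights):
        h = math.exp(-((j - i_win) ** 2) / (2.0 * sigma ** 2))
        weights[j] = [wj + eta * h * (xj - wj) for wj, xj in zip(w, x)]
    return i_win


# Hypothetical three-neuron map and a single input sample.
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
i_win = som_step([0.9, 1.1], weights, eta=0.5, sigma=0.5)

# Hypothetical vectors: a and c are closest in the Euclidean sense,
# while a and b have the largest inner product.
a, b, c = [1.0, 0.0], [3.0, 0.0], [0.5, 1.0]
```

In a full training loop, `eta` and `sigma` would both be reduced over time, in line with the shrinking neighborhood described in the text.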

system fault indicators, the magnitude of the vector is not as important a feature as the information “which component of the vector shows the large nonzero value”; it is the alignment of the vectors that is more important. Thus, in our specific application we use the inner-product as a measure of closeness between vectors.
We have introduced some other modifications to the Kohonen net for the specific application that we are interested in. These are described below.
• Use of prior knowledge: The training of the Kohonen net results in the construction of a knowledge base derived from historical information regarding the system in question. In many practical situations, even though the system may be difficult to model, real information is available (voltage and/or current records of faulty cases) where the fault type and location were known, or were later found out. If we can construct fault classes by processing these data by the multiresolution filter bank and produce IIV’s corresponding to each known fault class, we will have some examples of each fault type for the Kohonen net. Then the Kohonen net will be trained to cluster similar faults and produce exemplars for each type. Once trained, the net can be used as a ready-made classifier.
• Experience-based learning rate: The common practice in Kohonen nets is to slowly reduce the learning rate for the updating of the exemplar vectors. This means that with each epoch of input-pattern presentation, the learning rate becomes smaller. In the so-called “conscience learning,” there is an added feature so that the more times a neuron wins, the less capable it becomes to capture future inputs. In our modification, the learning rate is a decreasing function of the number of patterns captured by the particular neuron. In simple terms, this means that while the neurons are not penalized for winning, the more patterns they win, the slower they adjust their existing exemplar. If we assume that capturing a new pattern is like gaining a new experience, then our conjecture is that a new experience should make less of an impact on an already-experienced neuron than on an inexperienced one. This particular modification is a novel feature of this work.
• Normalizing the exemplar vectors: Usually, when the inner-product is used as the measure of closeness between vectors, the input vectors are normalized by their own lengths. However, instead of normalizing the input pattern lengths, we choose to normalize the exemplars for the clusters. The reason behind this shift is that in our specific context, the magnitudes of the oscillations in the IIV’s actually contain useful information regarding the fault-type, so these magnitudes cannot be ignored. However, the clusters should not be able to win a new pattern due to their own large magnitude: we are concerned with “directional” similarity between vectors. Therefore, after each update, the length of the exemplar is readjusted to unity. We have not encountered this modification in the literature.
2) Training of the Kohonen Net: Training of the Kohonen net would consist of processing a large number of IIV’s which are generated by the filter bank in Module I. The filter bank, of course, would need real (or simulated) data records.4 From these records, we gather a set of examples of different types of faults. It is known that each fault class can be associated with a known fault type. Once these IIV’s are classified, the Kohonen net in effect functions as a knowledge base. During the subsequent on-line operation of the net, any new fault that cannot be identified by the existing classes could be stored and later added to the knowledge base.

V. SUMMARY OF SIMULATION STUDIES

The goal of the computer experiments was to investigate and illustrate the model-free method of generation of fault indicators. (We assume that the model-based methods are well-established, although they are less widely applicable than the proposed model-free technique.) Simulation experiments were performed on a transmission line modeled using SIMULINK. The following experiment is a representative case among many runs of the experiments. Note that the system model is used solely to generate the simulated system response: the operation of Module I (generation of fault indicators) requires only the response, not the model.
The system in our study is a three segment model of a transmission line. Each segment is represented by a line impedance (resistance plus inductance) and a line-to-ground admittance (conductance and capacitance). In the experiment described here, all three segments have identical parameters. The faults considered are partial short circuits to ground, emulated by increasing the line-to-ground admittance from a prefault value of 0.001 to a post fault value of 0.1. For comparison, the load has a resistance of 0.5. Hence, the fault is relatively minor in magnitude, but it is a sustained fault.
Four cases are considered: the normal unfaulted performance and three faulty lines. The same type of fault was applied one by one to each of the three segments. In all cases, the variables monitored were the current at the generation point and the voltage at the load. The data were processed using a bilateral filter bank based on Daubechies’ wavelet of order ten. The results included here have separated the signal into eight components. This eight component vector is identified as the IIV and is displayed in the next figures. For brevity, we present only decompositions for the following cases: 1) no fault; 2) load voltage for fault in segment three; 3) load voltage for fault in segment two; and 4) generator current for fault in segment two. Also, for more clarity, only the behavior in the neighborhood of the fault time is shown.
With this case study, we want to highlight the following points:
1) The decomposition did not use any knowledge about the system and was based entirely on processing the data with a fixed digital filter bank based on Daubechies’ compact support wavelets. Hence, it supports the validity of the model-free approach to fault detection. It is

4 This does not mean that one would need to introduce faults deliberately into the real system. Recorded fault data from real faults that periodically occur in the system would be sufficient.
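The modifications to the Kohonen net described in this section (inner-product matching, an experience-based learning rate, and exemplars readjusted to unit length) can be sketched as follows. This is a hedged illustration, not the authors' code: the class name `ModifiedKohonen`, the specific rate form `eta0 / (1 + wins)`, the winner-only update (no topological neighborhood), and the sample patterns are all assumptions. The paper states only that the learning rate is a decreasing function of the number of patterns a neuron has captured.

```python
import math


def unit(v):
    """Rescale v to unit Euclidean length; a zero vector is returned unchanged."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v] if n > 0 else list(v)


class ModifiedKohonen:
    def __init__(self, exemplars, eta0=0.5):
        self.w = [unit(e) for e in exemplars]  # exemplars kept at unit length
        self.wins = [0] * len(exemplars)       # patterns captured per neuron
        self.eta0 = eta0

    def winner(self, x):
        """Winner = largest inner product with the (unnormalized) input."""
        scores = [sum(a * b for a, b in zip(x, w)) for w in self.w]
        return max(range(len(scores)), key=scores.__getitem__)

    def train_one(self, x):
        j = self.winner(x)
        # Experience-based rate (assumed form): more wins, smaller adjustment.
        eta = self.eta0 / (1 + self.wins[j])
        moved = [wj + eta * (xj - wj) for wj, xj in zip(self.w[j], x)]
        self.w[j] = unit(moved)                # readjust exemplar length to unity
        self.wins[j] += 1
        return j


# Hypothetical three-component patterns; input magnitudes are left untouched.
net = ModifiedKohonen([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
first = net.train_one([4.0, 0.5, 0.0])
second = net.train_one([0.2, 3.0, 0.0])
```

Because only the exemplars are normalized, a large input still pulls its winning exemplar strongly in direction, but no exemplar can capture patterns merely by having grown large.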

expected that an optimized design will yield significantly better performance.
2) Fig. 7 shows that, for the unfaulted case, only one component of the IIV is nonzero. This is the lowest frequency component. It confirms our observation about smooth signals yielding components in the low-frequency range. From the point of view of detection, it is a very convenient situation since only one band reacts to the unfaulted data. We do not expect that situation to occur with uniform frequency partitioning in a more general case. However, we expect that one can design a sufficiently fine filter bank in such a way that the effect of the fault will appear in only a few selected bands which would not be significantly affected by the normal signals. This separation would favor the detection of faults.
3) Fig. 8 shows strong variations in the IIV components. Each component has a distinctive behavior according to its localization in the frequency domain. One can make use of this distinctive behavior to classify faults.
4) Using a common dimensionless scale for the high frequency components, Figs. 9 and 10 illustrate the fact that some variables are more sensitive to a given fault. In this case it is apparent that the variation in the load voltage IIV is more significant than the variation in the generator current. They also show that some bands may be insensitive to some types of faults.
5) All figures show that the IIV’s created by the filter bank show a clearly different performance according to the location of the fault. Hence they should be able to create indicators sensitive to the type of the fault, and provide indication about their spatial location.
6) The fault simulated in this example is minor but sustained, which means it is an important detection problem. Large faults are easier to detect with high accuracy with the help of residual-based methods. The probability of correct detection falls drastically as the fault size gets smaller. In [7] it was reported that the neural-network based detector which operates on the magnitude of residuals (generated by a Kalman filter) becomes inaccurate when the change in the amplitude of the voltage waveform is less than about 15%. By comparison, the wavelet-based detector is much more sensitive. Besides, this method does not require the availability of an explicit mathematical model of the system.

VI. OPEN RESEARCH QUESTIONS

One of the hypotheses still to be researched is that with a better choice of decomposition one can increase the detection sensitivity and make the system robust to measurement noise. The other issue is on-line implementation. More research needs to be done in the following categories:
1) The use of continuous wavelet transforms and transforms with possible hardware implementation for maximal speed.
2) The selectivity of the wavelet-based sensors. This issue is application dependent with a strong experimental component. Our ongoing research is to investigate various types of faults in the system under study and determine the components that would allow us to distinguish among the faults.
3) The effect of measurement noise. In the preliminary experiments, the orthogonal components that carry information about the fault are very small and could easily be masked by larger measurement noise. We plan to investigate the use of signal enhancing techniques to improve the quality of the detectors. In particular, the use of wavelet-based enhancing techniques is being researched [25].
4) Developing the entire methodology as an on-line system. Currently Module I operates off-line, so that the filters are noncausal. A major goal of future work would be to implement a “moving window” type filter bank, where old data points are discarded as new data points become available. Modules II and III are ready for on-line use.

VII. CONCLUSIONS

A modular scheme for fault detection and classification is developed in this paper. Each module can be designed in two different ways, model-based and model-free, depending on the intended application and available information. We present the model-free method for the generation of fault indicators in detail. The method utilizes multirate filter banks based on wavelet decomposition of actual data. Extensive simulation studies illustrate the use of the proposed method. The testing of these indicators is done by a decision-making neural network, which can also be adapted to both the model-based and model-free situations. A modified Kohonen-type neural network is proposed for the classification task. It is anticipated that the integrated and modular approach presented in this paper can eventually be developed into a widely applicable tool in detection and classification of faults in dynamic systems.

ACKNOWLEDGMENT

The authors are grateful for many helpful questions and suggestions from the anonymous reviewers.

REFERENCES

[1] M. M. Polycarpou and A. T. Vemuri, “Learning methodology for failure detection and accommodation,” IEEE Contr. Syst. Mag., June 1995, pp. 16–24.
[2] J. Tengdin, R. Westfall, and K. Stephan, “High Impedance Fault Detection Technology,” Rep. of PSRC Working Group D15, Mar. 1997. Available http://www.rt66.com/ w5sr/psrc.html
[3] E. O. Schweitzer and Daquing Hou, “Filtering requirements for distance relays,” in Proc. Amer. Power Conf., vol. 55-I, 1993, pp. 296–301.
[4] A. Girgis and E. B. Makram, “Application of adaptive Kalman filtering in fault classification, distance protection, and fault location using microprocessors,” IEEE Trans. Power Syst., vol. 3, pp. 301–309, Feb. 1988.
[5] F. N. Chowdhury, J. P. Christensen, and J. L. Aravena, “Power system fault detection and state estimation using Kalman filter with hypothesis testing,” IEEE Trans. Power Delivery, vol. 6, pp. 1025–1029, July 1991.
[6] J. L. Pinto de Sa and L. Pedro, “Modal Kalman filter-based impedance relaying,” IEEE Trans. Power Delivery, vol. 6, pp. 78–84, Jan. 1991.
[7] F. Chowdhury, “On-line fault detection in multioutput systems using Kalman filter and neural network,” in Proc. Amer. Contr. Conf., vol. 2, June 1994, pp. 1729–1731.
[8] A. Ferrero, S. Sangiovanni, and E. Zappitelli, “A fuzzy-set approach to fault-type identification in digital relaying,” in Proc. IEEE Conf. Transmission and Distribution, Apr. 1994, pp. 269–275.

[9] P. A. Crossley and P. G. McLaren, “Distance protection based on travelling waves,” IEEE Trans. PAS, vol. PAS-102, no. 9, pp. 2971–2983, Sept. 1983.
[10] A. Sharaf, “ANN-based pattern classification of synchronous generator stability and loss of excitation,” IEEE Trans. Energy Conv., vol. 9, no. 4, pp. 753–759, Dec. 1994.
[11] A. Mazroua, “Neural-network system using the multilayer perceptron technique for the recognition of PD pulse shapes due to cavities and electrical trees,” IEEE Trans. Power Delivery, vol. 10, pp. 92–96, Jan. 1995.
[12] H. Yang, W. Chang, and C. Huang, “Online fault diagnosis of power substation using connectionist expert system,” IEEE Trans. Power Syst., vol. 10, pp. 323–331, Feb. 1995.
[13] S. Eborn, D. L. Lubkeman, and M. White, “A neural-network approach to the detection of incipient faults on power distribution feeders,” IEEE Trans. Power Delivery, vol. 5, pp. 905–912, Apr. 1990.
[14] K. Nishimura and M. Arai, “Power system state evaluation by structured neural network,” in Proc. IJCNN’90, vol. 1, pp. 271–277, June 1990.
[15] D. Lubkeman, C. Fallon, and A. Girgis, “Unsupervised learning strategies for detection and classification of transient phenomena on electric power distribution systems,” in Proc. 1st Int. Forum Applicat. Neural Networks to Power Syst., Seattle, WA, June 1991, pp. 107–111.
[16] K. S. Swarup and H. S. Chandrasekharaiah, “Fault detection and diagnosis of power system using artificial neural networks,” in Proc. 1st Int. Forum Applicat. Neural Networks to Power Syst., Seattle, WA, June 1991, pp. 102–106.
[17] D. Robertson, O. I. Camps, and J. S. Mayer, “Wavelets and power system transients: Feature detection and classification,” in SPIE Int. Symp. Opt. Eng. Aerospace Sensing, vol. 2242, pp. 474–87, Apr. 1994.
[18] S. Mallat and W. L. Hwang, “Singularity detection and processing with wavelets,” IEEE Trans. Inform. Theory, vol. 38, pp. 617–643, Mar. 1992.
[19] A. K. Soman and P. P. Vaidyanathan, “On orthonormal wavelets and paraunitary filter banks,” IEEE Trans. Signal Processing, vol. 41, pp. 1170–1183, Mar. 1993.
[20] G. Carpenter and S. Grossberg, “A massively parallel architecture for a self-organizing neural pattern recognition machine,” in Neural Networks, Theoretical Foundation and Analysis, C. Lau, Ed. Piscataway, NJ: IEEE Press, 1992.
[21] T. Kohonen, “The self-organizing map,” in Neural Networks, Theoretical Foundation and Analysis, C. Lau, Ed. Piscataway, NJ: IEEE Press, 1992.
[22] S. Haykin, Neural Networks: A Comprehensive Foundation. New York: Macmillan, 1994.
[23] H. Ritter, Neural Computation and Self-Organizing Maps: An Introduction. Reading, MA: Addison-Wesley, 1992.
[24] Z. Lo, M. Fujita, and B. Bavarian, “Analysis of neighborhood interaction in Kohonen neural networks,” in 6th Int. Parallel Processing Symp. Proc., Los Alamitos, CA, 1991, pp. 247–249.
[25] W. K. Awadzi, Feature Enhancement via the Wavelet Transform and Quadrature Mirror Filters, M.S. thesis, Louisiana State Univ., Baton Rouge, 1994.

Fahmida N. Chowdhury (S’86–M’87) received the combined B.Sc. and M.Sc. degree in electromechanical engineering from Moscow Power Engineering Institute, Moscow, Russia, and the Ph.D. degree in electrical engineering from Louisiana State University, Baton Rouge.
She is currently an Assistant Professor of Electrical and Computer Engineering at the University of Southwestern Louisiana, Lafayette. Her research interests include neural networks, modeling, estimation, and detection problems in stochastic systems, applications of systems theory approaches to power systems, and probabilistic interpretations of robustness issues. Her educational interests focus on developing interdisciplinary courses.
Dr. Chowdhury is a reviewer for the IEEE TRANSACTIONS ON SIGNAL PROCESSING and the IEEE TRANSACTIONS ON NEURAL NETWORKS, and a member of the Conference Editorial Board of the IEEE Control Systems Society. She also reviews proposals for the National Science Foundation.

Jorge L. Aravena (M’89) received the degree of Civil Electrical Engineer from the University of Chile at Santiago, and the Ph.D. degree in computer, information, and control engineering from the University of Michigan, Ann Arbor.
Currently he is the Graduate Studies Coordinator for the Department of Electrical and Computer Engineering at Louisiana State University. He has published more than 30 refereed journal papers and 100 conference papers. His current areas of research include digital signal and image processing, m-D system theory, computer-based control, and parallel algorithms and computing structure. His research in nonplanar computing structures and fast parallel representation of filtering algorithms has been supported by the State of Louisiana.
Dr. Aravena is a frequent reviewer for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, Signal Processing, and Parallel and Distributed Processing. He also reviews proposals for the National Science Foundation and has been invited as national panel member to review Research Initiation Awards in Microelectronics Information Processing.
