Preprints, 9th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes
September 2-4, 2015. Arts et Métiers ParisTech, Paris, France

Design of Robust Fault Detection Scheme for Penicillin Fermentation Process

Abdul Rehman Khan, Abdul Qayyum Khan, Muhammad Taskeen Raza, Muhammad Abid, Ghulam Mustafa

Pakistan Institute of Engineering and Applied Sciences, Nilore, Islamabad, Pakistan

Abstract: Penicillin fermentation is a batch process used for the industrial production of penicillin. In this paper, a fault detection scheme is designed for this process to ensure reliable operation. Since practical systems are subject to disturbances and noise, a scheme designed without regard to them might be unreliable. To cope with this problem, optimization is included in the design: an optimized parity vector is identified directly from the process data using a subspace aided approach, without going into the details of the system model.
Keywords: fault detection, robust filter, data driven techniques, penicillin fermentation process.

Due to the increasing level of automation in technical processes, design complexity has also increased. In physical systems, components may become faulty over time because of abnormalities or aging. Faults can occur in components, actuators and sensors. A faulty component may cause severe damage or, at the least, a loss in efficiency or economy by shifting the operating point away from its optimal value. For reliable and safe operation of the plant, some form of fault detection technique must be implemented to detect the fault promptly. One step further, it is better to isolate and identify the fault and then devise a fault tolerant scheme that prevents an unplanned shutdown of the plant by implementing a suitable control strategy while maintaining the desired performance or at least stability.
In the early 1970s, fault detection techniques started developing based on the assumption that the system matrices A, B, C, D are known a priori. To this end, model based fault detection schemes for linear time invariant systems have been the main focus among the researchers and designers of the fault diagnosis and control community, see Ding (2008). A large number of standard methods have been proposed for the design of fault detection and isolation schemes. These methods can be classified into observer based and parity space based fault detection and diagnosis methods. For all these methods the basic requirement is a precise model of the process.
Later on, fault detection techniques were directed towards large industrial setups such as large power and chemical plants. However, these techniques could not be applied there because no process model was available. For complex plants such as large chemical and industrial setups, physical modeling is a difficult task and a lot of effort is required
to derive a model for the system. In such cases the only information available is the input and output data of the plant. A detailed survey of different data driven approaches is given in Ding et al. (2011) and Yin et al. (2014). System identification techniques (Favoreel et al. (2000), Van Overschee and De Moor (1996)) can be used to derive a state space approximation of the system. The computational effort involved in system identification was considerably reduced once the parity vector was identified directly from the process data, Ding et al. (2009). In the data driven parity vector based fault detection approach, the parity space is computed directly from the process history of the plant using linear algebra concepts and tools. This approach has been successfully implemented for complex industrial and chemical processes such as the Tennessee Eastman process, Ding et al. (2009), and the fed-batch penicillin production process, Yin et al. (2013).
Yin et al. (2013) successfully detected faults in the penicillin production process using a data driven fault detection technique. However, that work did not consider robustness against process and sensor noise, which is inevitable in complex industrial processes. To cope with the noise effect, a framework has recently been proposed in Hussain et al. (2015), wherein a parity vector is identified which provides sensitivity to faults and robustness against noise and disturbances. In this work we demonstrate the successful application of the results proposed by Hussain et al. (2015) to the penicillin production process, which involves sixteen process variables with a wide operating range and composite dynamics. This paper also studies two post residual evaluation schemes, wavelet transformation and low pass filtering, for removing high frequency content from the residual. A comparison between the two schemes is given and a recommendation is presented.
This paper is organized as follows. Section 2 introduces the preliminaries of the subspace aided data driven technique, Section 3 describes the optimization problem, Section 4 introduces the penicillin fermentation process, and Sections 5 and 6 present the results obtained from the application of the robust fault detection scheme and the post evaluation of residuals.

Parity space based fault detection schemes are well established methods in the domain of model based residual generation, Ding (2008). The parity space based method is computationally less complex than other methods and relies on linear algebra concepts. The left null space of the system observability matrix provides a basis for the parity space, and any row of this basis, or any linear combination of its rows, is a parity vector. The following text explains the basics of the parity space and how it is used in residual generation. Consider the system represented by the following state space model

$$
\begin{aligned}
x(k+1) &= A x(k) + B\,(u(k) + f_a(k)) + w(k) \\
y(k) &= C x(k) + D\,(u(k) + f_a(k)) + f_s(k) + v(k)
\end{aligned}
\tag{1}
$$

where $x(k) \in \mathbb{R}^n$ represents the state vector, and $u(k) \in \mathbb{R}^l$, $y(k) \in \mathbb{R}^m$, $w(k) \in \mathbb{R}^n$ and $v(k) \in \mathbb{R}^m$ represent the input, output, process noise and sensor noise vectors respectively. $f_a(k)$ and $f_s(k)$ are the actuator and sensor fault signals respectively. The system matrices $A$, $B$, $C$ and $D$ are constant and of appropriate dimensions. In what follows, a parity vector is derived assuming that faults and disturbances are zero.
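
$$
\begin{aligned}
y(k-s) &= C x(k-s) + D u(k-s) \\
y(k-s+1) &= C x(k-s+1) + D u(k-s+1) \\
&= C A x(k-s) + C B u(k-s) + D u(k-s+1) \\
&\;\;\vdots \\
y(k) &= C A^{s} x(k-s) + C A^{s-1} B u(k-s) + \cdots + C B u(k-1) + D u(k)
\end{aligned}
$$

where $s > 0$ is an integer. The above equations can be written in compact form as

$$
y_s(k) = \Gamma_s x(k-s) + H_{u,s} u_s(k) \tag{2}
$$

Equation (2) is known as the parity relation, in which

$$
y_s(k) = \begin{bmatrix} y(k-s) \\ y(k-s+1) \\ \vdots \\ y(k) \end{bmatrix}, \quad
u_s(k) = \begin{bmatrix} u(k-s) \\ u(k-s+1) \\ \vdots \\ u(k) \end{bmatrix}, \quad
\Gamma_s = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{s} \end{bmatrix}
$$

$$
H_{u,s} = \begin{bmatrix}
D & 0 & \cdots & 0 \\
CB & D & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
CA^{s-1}B & CA^{s-2}B & \cdots & D
\end{bmatrix}
$$
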
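In a parity relation based residual generator, we are interested in finding a parity vector $\alpha_s$ such that

$$
\alpha_s \Gamma_s = 0 \tag{3}
$$

and define

$$
r(k) = \alpha_s y_s(k) - \alpha_s H_{u,s} u_s(k) = 0 \tag{4}
$$

where $r$ is the residual signal and $\alpha_s$ is known as the parity vector. The set of all possible $\alpha_s$ constitutes the parity space, defined as

$$
P_s = \{\alpha_s \mid \alpha_s \Gamma_s = 0\} \tag{5}
$$

Hence, the left null space of $\Gamma_s$, denoted $\Gamma_s^{\perp}$, is known as the parity space. A nonzero solution of (3) exists only if the design parameter $s$ is selected such that

$$
\operatorname{rank}(\Gamma_s) < \text{number of rows of } \Gamma_s \tag{6}
$$

i.e., $\Gamma_s$ is row rank deficient. The residual is zero when $w(k) = v(k) = f_a(k) = f_s(k) = 0$ for all time instants $k$, but practical systems are influenced by noise and unknown inputs, so the residual is not zero in practice. In this case

$$
r(k) = \alpha_s \left( H_{d,s} w_s(k) + v_s(k) \right) \tag{7}
$$

where $H_{d,s}$ is given as

$$
H_{d,s} = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
C & 0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
CA^{s-1} & CA^{s-2} & \cdots & 0
\end{bmatrix}
$$

For a faulty system, the residual signal is given by

$$
r(k) = \alpha_s \left( H_{u,s} f_{a,s}(k) + H_{d,s} w_s(k) + f_{s,s}(k) + v_s(k) \right) \tag{8}
$$

with

$$
f_{a,s}(k) = \begin{bmatrix} f_a(k-s) \\ f_a(k-s+1) \\ \vdots \\ f_a(k) \end{bmatrix}, \quad
f_{s,s}(k) = \begin{bmatrix} f_s(k-s) \\ f_s(k-s+1) \\ \vdots \\ f_s(k) \end{bmatrix}
$$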
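
As a rough illustration of the model based parity relation, the following minimal sketch builds $\Gamma_s$ and $H_{u,s}$ for a small system, takes one vector from the left null space of $\Gamma_s$ as the parity vector, and evaluates the residual with and without an actuator fault. The system matrices, the window length `s = 3`, the fault size and all function names are assumptions chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def build_parity_matrices(A, B, C, D, s):
    """Return Gamma_s and H_us so that y_s(k) = Gamma_s x(k-s) + H_us u_s(k)."""
    m, l = C.shape[0], B.shape[1]
    # Gamma_s = [C; CA; ...; CA^s]
    Gamma = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(s + 1)])
    # H_us: block lower-triangular Toeplitz, D on the diagonal, C A^(i-j-1) B below it
    H = np.zeros(((s + 1) * m, (s + 1) * l))
    for i in range(s + 1):
        for j in range(i + 1):
            blk = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
            H[i * m:(i + 1) * m, j * l:(j + 1) * l] = blk
    return Gamma, H

def left_null_space(M, tol=1e-10):
    """Rows spanning {alpha : alpha @ M = 0}, obtained from the SVD of M."""
    U, sv, _ = np.linalg.svd(M)
    rank = int(np.sum(sv > tol))
    return U[:, rank:].T

# Hypothetical 2-state, single-input, single-output system (illustration only)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
s = 3

Gamma, H_us = build_parity_matrices(A, B, C, D, s)
alpha = left_null_space(Gamma)[0]          # one parity vector

def simulate(u, fault_from=None, fault_size=0.0):
    """Noise-free simulation of model (1) with an optional additive actuator fault."""
    x = np.zeros(2)
    y = []
    for k, uk in enumerate(u):
        fa = fault_size if fault_from is not None and k >= fault_from else 0.0
        y.append((C @ x + D @ np.array([uk + fa])).item())
        x = A @ x + B @ np.array([uk + fa])
    return np.array(y)

def residual(y, u, k):
    """r(k) = alpha (y_s(k) - H_us u_s(k)) over the window k-s..k."""
    return float(alpha @ (y[k - s:k + 1] - H_us @ u[k - s:k + 1]))

rng = np.random.default_rng(0)
u = rng.standard_normal(40)
y_ok = simulate(u)
y_faulty = simulate(u, fault_from=20, fault_size=1.0)
r_ok = [residual(y_ok, u, k) for k in range(s, 40)]
r_faulty = [residual(y_faulty, u, k) for k in range(s, 40)]
```

In this noise-free setting `r_ok` stays at numerical zero, while `r_faulty` departs from zero shortly after the fault enters at sample 20, which is the behaviour that (4) and (8) describe.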

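In the data driven parity vector based fault detection approach, the parity space is computed directly from the process history of the plant using linear algebra concepts. Assuming the model given in (1), with the system matrices A, B, C, D and the system order n unknown, we can write

$$
Z = \begin{bmatrix} Y \\ U \end{bmatrix}
= \begin{bmatrix} \Gamma_s & H_{u,s} \\ 0 & I \end{bmatrix}
\begin{bmatrix} X(i) \\ U \end{bmatrix}
+ \begin{bmatrix} H_{d,s} W + V \\ 0 \end{bmatrix}
\in \mathbb{R}^{(m+l)(s+1) \times N} \tag{9}
$$

with the data structures defined as

$$
\begin{aligned}
X &= [\,x(i) \;\; x(i+1) \;\cdots\; x(i+N-1)\,] \in \mathbb{R}^{n \times N} \\
Y &= [\,y_s(k) \;\; y_s(k+1) \;\cdots\; y_s(k+N-1)\,] \in \mathbb{R}^{m(s+1) \times N} \\
U &= [\,u_s(k) \;\; u_s(k+1) \;\cdots\; u_s(k+N-1)\,] \in \mathbb{R}^{l(s+1) \times N} \\
W &= [\,w_s(k) \;\; w_s(k+1) \;\cdots\; w_s(k+N-1)\,] \in \mathbb{R}^{n(s+1) \times N} \\
V &= [\,v_s(k) \;\; v_s(k+1) \;\cdots\; v_s(k+N-1)\,] \in \mathbb{R}^{m(s+1) \times N}
\end{aligned}
$$

$$
w_s(k) = \begin{bmatrix} w(k-s) \\ w(k-s+1) \\ \vdots \\ w(k) \end{bmatrix}, \quad
v_s(k) = \begin{bmatrix} v(k-s) \\ v(k-s+1) \\ \vdots \\ v(k) \end{bmatrix}
$$

where $s$ and $N$ ($\gg s$) are integers, and $\Gamma_s$, $H_{u,s}$ and $H_{d,s}$ are as defined earlier. For large $N$, the covariance matrix $\frac{1}{N} Z Z^T$ can be written as

$$
\frac{1}{N} Z Z^T \approx \frac{1}{N}
\begin{bmatrix} \Gamma_s & H_{u,s} \\ 0 & I \end{bmatrix}
\begin{bmatrix} X(i) \\ U \end{bmatrix}
\begin{bmatrix} X(i) \\ U \end{bmatrix}^T
\begin{bmatrix} \Gamma_s & H_{u,s} \\ 0 & I \end{bmatrix}^T
+ \begin{bmatrix} \frac{1}{N}(H_{d,s} W + V)(H_{d,s} W + V)^T & 0 \\ 0 & 0 \end{bmatrix}
$$

Since the second term represents noise and carries small variations, it can be separated by a singular value decomposition (SVD) of $\frac{1}{N} Z Z^T$, which can be written as

$$
\frac{1}{N} Z Z^T = U_z \begin{bmatrix} \Sigma_{z1} & 0 \\ 0 & \Sigma_{z2} \end{bmatrix} U_z^T,
\quad \text{with} \quad
U_z = \begin{bmatrix} U_{z11} & U_{z12} \\ U_{z21} & U_{z22} \end{bmatrix}
$$

$$
\begin{aligned}
\Sigma_{z1} &\in \mathbb{R}^{((s+1)l+n) \times ((s+1)l+n)}, & \Sigma_{z2} &\in \mathbb{R}^{((s+1)m-n) \times ((s+1)m-n)} \\
U_{z11} &\in \mathbb{R}^{(s+1)m \times ((s+1)l+n)}, & U_{z12} &\in \mathbb{R}^{(s+1)m \times ((s+1)m-n)} \\
U_{z21} &\in \mathbb{R}^{(s+1)l \times ((s+1)l+n)}, & U_{z22} &\in \mathbb{R}^{(s+1)l \times ((s+1)m-n)}
\end{aligned}
$$

$\Sigma_{z1}$ includes the singular values that reflect the effect of the input data structure $U$ on the process variables and hence are significantly larger than those of $\Sigma_{z2}$, which are caused by the noise. As shown in Wang and Qin (2002), $U_{z12}^T$ forms a basis for the parity space such that $U_{z12}^T \Gamma_s = 0$ and $U_{z12}^T H_{u,s} = -U_{z22}^T$. So, any row or linear combination of rows of $U_{z12}^T$ is a parity vector, i.e.

$$
\alpha_s = l_s U_{z12}^T, \qquad \beta_s = -\alpha_s H_{u,s} = l_s U_{z22}^T
$$

The parity space based residual generator can be reformulated as

$$
r(k) = \alpha_s y_s(k) + \beta_s u_s(k)
$$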
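
The identification of the parity space from data can be sketched as follows. This is only a minimal illustration under stated assumptions: the data are generated from a hypothetical two-state system, the order `n` is taken as known when splitting the singular vectors (in practice it would be read off the gap in the singular values), and the names `stack_windows`, `Uz12`, `Uz22` and `l_s` are illustrative.

```python
import numpy as np

def stack_windows(sig, s, N):
    """Columns are the stacked windows [sig(k-s); ...; sig(k)] for N consecutive k."""
    return np.column_stack([sig[k - s:k + 1].reshape(-1) for k in range(s, s + N)])

# Hypothetical system, used here only to generate input/output data
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
n, m, l, s, N = 2, 1, 1, 3, 500

rng = np.random.default_rng(1)
u = rng.standard_normal((s + N, l))
x = np.zeros(n)
y = np.zeros((s + N, m))
for k in range(s + N):
    y[k] = C @ x + 1e-6 * rng.standard_normal(m)   # small sensor noise
    x = A @ x + B @ u[k]

Y = stack_windows(y, s, N)           # m(s+1) x N
U = stack_windows(u, s, N)           # l(s+1) x N
Z = np.vstack([Y, U])                # (m+l)(s+1) x N

# SVD of the sample covariance; singular values are returned in descending order
Uz, sv, _ = np.linalg.svd(Z @ Z.T / N)
n_parity = (s + 1) * m - n           # number of noise-level singular values
Uz12 = Uz[:(s + 1) * m, -n_parity:]
Uz22 = Uz[(s + 1) * m:, -n_parity:]

l_s = np.ones(n_parity)              # any row combination gives a parity vector
alpha = l_s @ Uz12.T
beta = l_s @ Uz22.T
r = alpha @ Y + beta @ U             # residual for every window in the data set
```

On fault-free data the residual `r` stays at the noise level, and the last `n_parity` singular values in `sv` are orders of magnitude smaller than the rest.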

We have considered optimized residual generation in the presence of process noise and measurement noise. Optimal selection of a parity vector from the extracted parity space is very important in order to achieve robustness against noise and sensitivity to faults. An optimal fault detection system is designed by identifying a parity vector with increased sensitivity to actuator faults and reduced sensitivity to process noise. Recall that the residual is given by

$$
r(k) = \alpha_s \left( H_{u,s} f_{a,s}(k) + H_{d,s} w_s(k) + f_{s,s}(k) + v_s(k) \right)
$$

Since $\alpha_s = l_s \Gamma_s^{\perp}$, we have $l_s$ as a design freedom that can be used to maximize a performance index $J$ for parity space residual generation. To this end, the following index is proposed:

$$
J = \max_{\alpha_s} \frac{\alpha_s H_{u,s} H_{u,s}^T \alpha_s^T}{\alpha_s H_{d,s} H_{d,s}^T \alpha_s^T} \tag{12}
$$

This index is the ratio of the squared norms of the vectors $\alpha_s H_{u,s}$ and $\alpha_s H_{d,s}$, which are the factors that relate the actuator faults and the process noise, respectively, to the residual. Maximizing this index makes the norm of $\alpha_s H_{u,s}$ large relative to the norm of $\alpha_s H_{d,s}$, hence increasing sensitivity to actuator faults and robustness against process noise. Let

$$
\lambda = \frac{\alpha_s H_{u,s} H_{u,s}^T \alpha_s^T}{\alpha_s H_{d,s} H_{d,s}^T \alpha_s^T}
\;\;\Rightarrow\;\;
\alpha_s H_{u,s} H_{u,s}^T \alpha_s^T - \lambda\, \alpha_s H_{d,s} H_{d,s}^T \alpha_s^T = 0
$$

Substituting $\alpha_s = l_s \Gamma_s^{\perp}$ gives

$$
l_s \left( \Gamma_s^{\perp} H_{u,s} H_{u,s}^T \Gamma_s^{\perp,T} - \lambda\, \Gamma_s^{\perp} H_{d,s} H_{d,s}^T \Gamma_s^{\perp,T} \right) = 0 \tag{13}
$$

So, maximizing the index $J$ is equivalent to solving the generalized eigenvalue problem (13). The eigenvector $l_{max}$ corresponding to the maximum eigenvalue $\lambda_{max}$ is selected to compute the optimized parity vector $\alpha_s$ from the parity space $\Gamma_s^{\perp}$. The solution of the eigenvalue problem requires knowledge of $\Gamma_s^{\perp}$, $H_{u,s}$ and $H_{d,s}$. For data driven fault detection methods, it is assumed that the system matrices are not available and hence these factors cannot be computed directly. $\Gamma_s^{\perp}$ and $\Gamma_s^{\perp} H_{u,s}$ can, however, be extracted directly from the process input output data without knowledge of the system matrices (as explained in Section 2). Computing $H_{d,s}$ requires the matrices $A$ and $C$, so $H_{d,s}$ can be constructed once $A$ and $C$ have been computed from $\Gamma_s = \operatorname{null}(\Gamma_s^{\perp})$. To construct the robust residual generator, the following algorithm of Hussain et al. (2015) is employed.

3.1 Algorithm
(1) Identify the terms $\Gamma_s^{\perp} = U_{z12}^T$ and $\Gamma_s^{\perp} H_{u,s} = -U_{z22}^T$ from process data using Algorithm 2.1.
(2) Compute $\Gamma_s$ from the $\Gamma_s^{\perp}$ identified in step 1.
(3) Extract the matrices $A$ and $C$.
(4) Construct $H_{d,s}$ using $A$ and $C$.
(5) Solve equation (13) to obtain the eigenvector $l_{max}$ corresponding to the maximum eigenvalue $\lambda_{max}$.
(6) Compute the robust parity vector as $\alpha_{sr} = l_{max} \Gamma_s^{\perp}$.
(7) Construct the robust residual generator using the following relation

$$
r_R(k) = \alpha_{sr}\, y_s(k) - \alpha_{sr} H_{u,s}\, u_s(k)
$$

The proposed technique is optimized in the sense that it maximizes the effect of actuator faults and is robust against process noise.
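
Steps 5 and 6 of Algorithm 3.1 can be sketched as below. To keep the example short, a hypothetical known model stands in for the identified quantities of steps 1 to 4, so $\Gamma_s^{\perp}$, $H_{u,s}$ and $H_{d,s}$ are built from assumed matrices rather than from data. The generalized eigenvalue problem (13) is solved through a Cholesky factor of the noise term, which is one common way to do it and is not necessarily the route taken in Hussain et al. (2015).

```python
import numpy as np

def toeplitz_blocks(first, markov, s):
    """Block lower-triangular Toeplitz matrix with `first` on the diagonal and
    markov[i] on the (i+1)-th block sub-diagonal."""
    p, q = first.shape
    H = np.zeros(((s + 1) * p, (s + 1) * q))
    for i in range(s + 1):
        for j in range(i + 1):
            H[i * p:(i + 1) * p, j * q:(j + 1) * q] = first if i == j else markov[i - j - 1]
    return H

# Hypothetical model standing in for the identified quantities (steps 1 to 4)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
s = 4

powers = [np.linalg.matrix_power(A, i) for i in range(s + 1)]
Gamma = np.vstack([C @ P for P in powers])
H_us = toeplitz_blocks(D, [C @ P @ B for P in powers], s)
H_ds = toeplitz_blocks(np.zeros((1, 2)), [C @ P for P in powers], s)

# Parity space basis: left null space of Gamma_s
Uf, sv, _ = np.linalg.svd(Gamma)
Gamma_perp = Uf[:, int(np.sum(sv > 1e-10)):].T

# Step 5: generalized eigenvalue problem l (M_f - lambda M_d) = 0, eq. (13)
M_f = Gamma_perp @ H_us @ H_us.T @ Gamma_perp.T
M_d = Gamma_perp @ H_ds @ H_ds.T @ Gamma_perp.T
Lc = np.linalg.cholesky(M_d)                          # M_d = Lc Lc^T (M_d assumed positive definite)
S = np.linalg.solve(Lc, np.linalg.solve(Lc, M_f).T)   # Lc^{-1} M_f Lc^{-T}
S = (S + S.T) / 2                                     # remove round-off asymmetry
w, V = np.linalg.eigh(S)                              # eigenvalues in ascending order
lam_max = w[-1]
l_max = np.linalg.solve(Lc.T, V[:, -1])

# Step 6: robust parity vector
alpha_r = l_max @ Gamma_perp

def J(alpha):
    """Index (12) evaluated for a given parity vector."""
    return (alpha @ H_us @ H_us.T @ alpha) / (alpha @ H_ds @ H_ds.T @ alpha)
```

Evaluating `J(alpha_r)` returns `lam_max`, and any other vector taken from the parity space gives an index that is no larger.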



Penicillin is a group of antibiotics obtained from Penicillium fungi, discovered by Alexander Fleming in 1928. Penicillin is a secondary metabolite of the fungus: it is produced when growth is restrained and not during active growth. The cells are grown using fed-batch culture, a biotechnological process in which a substrate is fed to the bioreactor during the cultivation phase and the product remains in the bioreactor until the end of the process. In penicillin production, the presence of glucose inhibits penicillin production. Oxygen concentration, temperature and pH must be carefully controlled for the production of penicillin. Penicillin production can be divided into two phases, the pre-culture phase and the fed-batch phase. In the first phase, the initial amount of substrate (glucose) is consumed by the cells, and its depletion forces the production of penicillin. In the second phase, substrate is fed continuously in open loop. The details of the underlying chemical reactions are beyond the scope of this work. The summarized model structure given in Birol et al. (2002) is as follows:

$$
\begin{aligned}
X &= f(X, S, C_L, H, T) \\
S &= f(X, S, C_L, H, T) \\
C_L &= f(X, S, C_L, H, T) \\
P &= f(X, S, C_L, H, T, P) \\
CO_2 &= f(X, H, T) \\
H &= f(X, H, T)
\end{aligned}
$$

where $f$ denotes a generic functional dependence and $X$ is the biomass concentration; $S$, the substrate concentration; $C_L$, the dissolved oxygen concentration; $P$, the penicillin concentration; $CO_2$, the carbon dioxide concentration; $H$, the pH; and $T$, the temperature. Birol et al. (2002) introduced a modular simulation package for fed-batch fermentation of penicillin. They extended the model introduced by Bajpai and Reuss (1980) by including the effects of environmental variables such as pH and temperature, and of input variables such as aeration rate, agitator power and feed flow rate of glucose, on the biomass and penicillin concentrations, which were not considered in the earlier model. Two Proportional Integral Derivative (PID) controllers are installed to control the pH and the temperature, using the acid/base flow rates and the cooling water flow rate as control variables. In the simulation package, these controllers can be turned on/off and their parameters can be changed to study controller design. Step and ramp faults can also be introduced into three variables: aeration rate, agitator power and substrate feed flow rate. The simulation package gives the user the flexibility of changing the total simulation time and the sampling time. In bio-processes, variables like biomass and penicillin concentrations are measured off-line by a quality analysis laboratory, which introduces a lag in the measurements. The sampling time is therefore adjusted accordingly. The initial values and set points of the process variables can be changed in the simulation package. A healthy normal batch with the default initial values and set points mentioned in Birol et al. (2002) is shown in Fig. 2.

Fig. 1. Fed-batch penicillin fermentation process

Fig. 2. Normal process variable trajectories, Birol et al. (2002)

For simulation purposes the PenSim v2.0 package is used. Using this simulator, normal and faulty batches were produced for fault detection purposes. To train the fault detection scheme, 20 healthy batches were obtained with a sampling time of 0.5 h and a total simulation time of 400 h for each batch. These 16,000 samples were used for the off-line design procedure. s was chosen to be 10 to build the data structures U, Y and Z. Biomass and penicillin concentrations served as output variables, whereas six other variables (dissolved oxygen concentration, culture volume, carbon dioxide concentration, pH, temperature and cooling water flow rate) served as input variables. After the parity space was obtained from the process data, the robust parity vector was obtained by maximizing the index in (12) as proposed in Algorithm 3.1. The following three faults were studied in three faulty batches.

Table 1. Description of Faults

Fault   Description of fault                        Occurrence time (h)
1       20% step decrease in agitator power         180
2       30% step decrease in substrate feed rate    180
3       20% step decrease in aeration rate          180

All faults were introduced at 180 h and were retained until the end of the batch. Due to the page limitation, we focus here on fault 1. In case of fault 1, as the agitator power decreased by 20%, it affected the dissolved oxygen in the culture volume. Consequently, it lowered the biomass and penicillin concentrations. Such an abnormal change in the variables is reflected in the residual signal, as shown in Fig. 4 along with the evaluated $T^2$ characteristic.

Fig. 3. Residual and $T^2$ characteristic in fault free case

Fig. 4. Residual and $T^2$ characteristic in case of fault 1

To remove high frequency content from the residual signals, they are approximated using wavelet transformation. Wavelet transformation has been used by many designers for fault diagnosis purposes in different ways, see Paya et al. (1997), Zarei and Poshtan (2007), Baydar and Ball (2003). Wavelet transformation decomposes a signal into approximations and details using a series of low pass and high pass filters. All residual signals are approximated at the third level here. An introduction to wavelet transformation and its formulation is given in the following subsection, and then the results are shown.

6.1 Wavelet Transformation formulation

Fourier analysis uses sines and cosines as basis functions to extract average features of a given signal and transforms a signal from the time domain to the frequency domain. Wavelet analysis uses certain wave-like, energy-limited functions known as wavelets. These wavelets serve as basis functions and are used to study the localized characteristics of the signal. A wavelet can take different shapes. Mathematically, the wavelet transformation may be defined as the convolution of a given signal with a wavelet. The wavelet transformation of a continuous time signal $x(t)$ is given as

$$
T(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left( \frac{t - b}{a} \right) dt
$$

$T(a, b)$ defines the transformation plane. $a$ is known as the scaling (dilation) parameter while $b$ is known as the translation parameter. $\psi(t)$ is called the mother wavelet. The choice of $\psi(t)$ depends both on the nature of the signal $x(t)$ and on the purpose of the analysis. The type of wavelet defines the family of the transformation, for example Haar.

Since a continuous range of $a$ and $b$ adds to computational complexity, discrete forms are usually used. A useful selection is $a = 2^j$ and $b = k 2^j$. Any signal $x(t)$, $t \in [0, T]$, can be decomposed into a summation of wavelets as

$$
x(t) = w_0 + \sum_{j=0}^{\infty} \sum_{k=0}^{2^j - 1} w_{2^j + k}\, \psi(2^j t - kT)
$$

with the coefficients

$$
w_{2^j + k} = \int_0^T x(t)\, \psi(2^j t - kT)\, dt
$$

and $w_0$ the coefficient of the constant term. $w_0$ and $w_{2^j+k}$ are called the wavelet transform of the given signal. The parameter $j$ is called the level or scale. The parameter $k$ determines the position of the wavelet. A rich text on wavelet transformation can be found in Addison (2010).

Fast wavelet transform algorithm. The approximation and detail coefficients of wavelet analysis for a given signal $s$ at level $j$ are given as

$$
\begin{aligned}
a_0 &= s \\
a_j &= (g * a_{j-1}) \downarrow 2 \\
d_j &= (h * a_{j-1}) \downarrow 2
\end{aligned}
$$

where $*$ represents convolution and $(\downarrow)$ represents down sampling, i.e. selecting the even indexed elements of a vector. $g$ and $h$ are the low pass and high pass filter coefficients as defined by the type of wavelet. These filter coefficients can be obtained using the MATLAB function wfilters(). There are many families of wavelets, each specified by its own coefficients. For the current analysis, the db4 wavelet defined by Ingrid Daubechies is used. The results after applying the wavelet transformation are given in the subsequent figures.

Fig. 5. 3rd level wavelet approximation in case of fault 1

Problems using wavelet transform. Wavelet transformation is a good analysis tool for fault detection and has been used by many designers. Here it is used for post-processing of the residual signal. There are, however, problems in its on-line implementation. For example, for a third level approximation there is a transient of 37 samples when the original residual signal is convolved with the low pass filters. Also, 8 samples of the original residual must be accumulated for the overall down-sampling. In our case one sample corresponds to 0.5 h, and a delay and transient of so many samples might be unacceptable. Although the wavelet transform appears to be a good analysis tool, the design may be more conservative. So some other solution should be considered, for example a single low pass filter with lower order and shorter transient time.

6.2 Low pass filtering

For low pass filtering, a common type of filter, the Butterworth filter, Butterworth (1930), may be used. The gain of the Butterworth filter as a function of frequency is given as:

$$
G(\omega) = \frac{1}{\sqrt{1 + \omega^{2n}}}
$$

where $n$ is the order of the filter and $\omega$ is the normalized frequency. A discrete time version of the Butterworth filter can be implemented as an Infinite Impulse Response (IIR) filter defined by the following difference equation:

$$
a_0 y(k) + a_1 y(k-1) + \cdots + a_n y(k-n) = b_0 x(k) + b_1 x(k-1) + \cdots + b_n x(k-n)
$$
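
The IIR difference equation used for low pass filtering in Section 6.2 can be sketched in a few lines. The example below is a minimal illustration, not the filter used in the paper: it uses a first-order Butterworth low pass, whose coefficients follow in closed form from the bilinear transform, whereas the paper reports a 3rd order design. The function names, the cutoff `0.1` and the test signal are assumptions.

```python
import math

def iir_filter(b, a, x):
    """Filter x with a0 y(k) + a1 y(k-1) + ... = b0 x(k) + b1 x(k-1) + ...,
    assuming zero initial conditions."""
    y = []
    for k in range(len(x)):
        acc = sum(b[i] * x[k - i] for i in range(len(b)) if k - i >= 0)
        acc -= sum(a[i] * y[k - i] for i in range(1, len(a)) if k - i >= 0)
        y.append(acc / a[0])
    return y

def butter1_lowpass(wn):
    """First-order digital Butterworth low pass via the bilinear transform.
    wn is the cutoff as a fraction of the Nyquist frequency (0 < wn < 1)."""
    K = math.tan(math.pi * wn / 2.0)
    b = [K / (1.0 + K), K / (1.0 + K)]
    a = [1.0, (K - 1.0) / (K + 1.0)]
    return b, a

b, a = butter1_lowpass(0.1)
# Hypothetical residual: constant level 1.0 plus a ripple at the Nyquist frequency
noisy = [1.0 + (0.5 if k % 2 else -0.5) for k in range(200)]
smooth = iir_filter(b, a, noisy)
```

After the initial transient, `smooth` settles at the constant level while the fast ripple is suppressed. Because the filter is causal and of low order, it introduces only a short transient, which is the property the text contrasts with the multi-level wavelet approximation.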

Filter coefficients can be obtained using the MATLAB function butter(). After filtering through a 3rd order Butterworth filter, the following results were obtained.

Fig. 6. Low pass filtering using Butterworth filter in fault free case

Fig. 7. Low pass filtering using Butterworth filter in case of fault 1
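
For comparison with the low pass filter, the fast wavelet transform recursion $a_j = (g * a_{j-1}) \downarrow 2$ of Section 6.1 can be sketched as follows. Haar filters are assumed here because their coefficients are known in closed form; the paper itself uses db4 coefficients obtained from wfilters(). The sign convention of the high pass filter and the test signal are likewise assumptions.

```python
import math

def convolve(f, x):
    """Full linear convolution of filter f with signal x."""
    out = [0.0] * (len(f) + len(x) - 1)
    for i, fi in enumerate(f):
        for j, xj in enumerate(x):
            out[i + j] += fi * xj
    return out

def dwt_level(a_prev, g, h):
    """One level of the recursion: convolve with g and h, then down-sample by 2.
    0-based indices 1, 3, 5, ... are the even-indexed elements in 1-based numbering."""
    return convolve(g, a_prev)[1::2], convolve(h, a_prev)[1::2]

def approximation(signal, g, h, level):
    """Approximation coefficients a_level; the detail coefficients are discarded."""
    a = list(signal)
    for _ in range(level):
        a, _details = dwt_level(a, g, h)
    return a

# Haar filters (assumed for simplicity; the paper uses db4)
r2 = 1.0 / math.sqrt(2.0)
g = [r2, r2]        # low pass
h = [r2, -r2]       # high pass, sign convention varies between references

# Hypothetical residual: 16 samples around 0, then 16 samples around 1, both with ripple
residual = [0.1, -0.1] * 8 + [1.1, 0.9] * 8
a3 = approximation(residual, g, h, 3)   # one coefficient per 8 original samples
```

Each third-level coefficient summarizes 8 consecutive residual samples, which illustrates why 8 samples must be accumulated before one evaluated value is available on-line.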
In these evaluated residuals, faults can be seen easily although the system is subjected to noise. The threshold value is set equal to the variance of the residual signal in the fault free case. The overall performance of the above techniques is summarized in the following two tables.

Table 2. Details of False Alarm Rates (FARs)

                                     Fault Free    Fault 1
Original Residual
Evaluated Residual (Wavelet)
Evaluated $T^2$ (Wavelet)
Evaluated Residual (Butterworth)
Evaluated $T^2$ (Butterworth)

It can be seen that the FAR values are high for the original residual. The best case among all the above techniques appears to be the Butterworth approximated residual. All the above techniques deliver FDRs of almost 95%.

Table 3. Details of Fault Detection Rates (FDRs)

                                     Fault Free    Fault 1
Original Residual
Evaluated Residual (Wavelet)
Evaluated $T^2$ (Wavelet)
Evaluated Residual (Butterworth)
Evaluated $T^2$ (Butterworth)
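
The false alarm and fault detection rates can be computed along the following lines. This is a sketch under assumptions that the text does not spell out: the residual is evaluated as $r(k)^2$ and compared with the fault-free variance, the FAR is the fraction of pre-fault samples above the threshold, and the FDR is the fraction of post-fault samples above it. The function names and the numbers are illustrative only.

```python
def variance(x):
    """Population variance of a sequence."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)

def alarm_rate(evaluated, threshold):
    """Fraction of samples whose evaluated residual exceeds the threshold."""
    return sum(1 for v in evaluated if v > threshold) / len(evaluated)

def far_fdr(residual, fault_start, threshold):
    """FAR over the samples before fault_start, FDR over the samples from it on.
    The residual is evaluated as r(k)^2 (an assumption of this sketch)."""
    evaluated = [v * v for v in residual]
    return (alarm_rate(evaluated[:fault_start], threshold),
            alarm_rate(evaluated[fault_start:], threshold))

# Hypothetical data: threshold from a fault-free residual, then a batch with a fault
fault_free = [0.1, -0.1, 0.1, -0.1]
threshold = variance(fault_free)
batch = [0.05, -0.05, 0.2, 0.05, 0.5, 0.4, 0.05, 0.6]
far, fdr = far_fdr(batch, fault_start=4, threshold=threshold)
```

With a sampling time of 0.5 h and a fault introduced at 180 h, `fault_start` would correspond to sample 360 of a batch.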

In this article, an optimal parity vector was computed from the identified parity space to deal with the practical problem of disturbances and noise. Faults were successfully detected with good fault detection rates. Post filtering was also applied to improve the results.

This research was supported by the IT & Telecom Endowment Fund-Pakistan, Pakistan Institute of Engineering & Applied Sciences (PIEAS), and the Higher Education Commission (HEC), Pakistan. We would also like to express our gratitude to Professor Ali Cinar, Department of Chemical and Environmental Engineering, Illinois Institute of Technology, Chicago, USA, for providing us with the PenSim simulator of the penicillin fermentation process, on which we have tested our proposed techniques/algorithms.

Addison, P.S. (2010). The illustrated wavelet transform handbook: introductory theory and applications in science, engineering, medicine and finance. CRC Press.
Bajpai, R. and Reuss, M. (1980). A mechanistic model for penicillin production. Journal of Chemical Technology and Biotechnology, 30(1), 332-344.
Baydar, N. and Ball, A. (2003). Detection of gear failures via vibration and acoustic signals using wavelet transform. Mechanical Systems and Signal Processing, 17(4).
Birol, G., Ündey, C., and Cinar, A. (2002). A modular simulation package for fed-batch fermentation: penicillin production. Computers & Chemical Engineering, 26(11), 1553-1565.
Butterworth, S. (1930). On the theory of filter amplifiers. Wireless Engineer, 7, 536-541.
Ding, S., Zhang, P., Ding, E., Engel, P., and Gui, W. (2011). A survey of the application of basic data-driven and model-based methods in process monitoring and fault diagnosis. In Proceedings of the 18th IFAC World Congress.
Ding, S.X. (2008). Model-based fault diagnosis techniques, volume 2013. Springer.
Ding, S., Zhang, P., Naik, A., Ding, E., and Huang, B. (2009). Subspace method aided data-driven design of fault detection and isolation systems. Journal of Process Control, 19(9), 1496-1510.
Favoreel, W., De Moor, B., and Van Overschee, P. (2000). Subspace state space system identification for industrial processes. Journal of Process Control, 10(2), 149-155.
Hussain, A., Khan, A.Q., and Abid, M. (2015). Robust fault detection using subspace aided data driven design. Asian Journal of Control, accepted.
Paya, B., Esat, I., and Badi, M. (1997). Artificial neural network based fault diagnostics of rotating machinery using wavelet transforms as a preprocessor. Mechanical Systems and Signal Processing, 11(5), 751-765.
Van Overschee, P. and De Moor, B. (1996). Subspace identification for linear systems: Theory, implementation. Methods.
Wang, J. and Qin, S.J. (2002). A new subspace identification approach based on principal component analysis. Journal of Process Control, 12(8), 841-855.
Yin, S., Ding, S.X., Abandan Sari, A.H., and Hao, H. (2013). Data-driven monitoring for stochastic systems and its application on batch process. International Journal of Systems Science, 44(7), 1366-1376.
Yin, S., Ding, S.X., Xie, X., and Luo, H. (2014). A review on basic data-driven approaches for industrial process monitoring. IEEE Transactions on Industrial Electronics, 61(11), 6418-6428.
Zarei, J. and Poshtan, J. (2007). Bearing fault detection using wavelet packet transform of induction motor stator current. Tribology International, 40(5), 763-769.