



This paper summarizes a study, driven by industrial needs, of the application of neural
networks in the area of process monitoring and control. Descriptions are provided of the
major activities undertaken in this programme, which included the application of neural
networks to fault detection in a vitrification process and to the model-based predictive
control of a gasoline engine.
The paper also describes some of the practical difficulties that were experienced
while applying neural networks and lists the important lessons that were learned through
the completion of this project. The main conclusion from the work was that neural
networks are capable of improving industrial process monitoring and control systems.
However, the level of improvement must be analyzed on a problem-specific basis, and in
many applications the use of neural networks is essential.
1.1 Process monitoring and control systems applications
The pressure on the Process Industries to improve yield, reduce wastage, eliminate toxins
and above all increase profits makes it essential to increase the efficiency of process
operations. One possible approach for achieving this is through the improvement of
existing process monitoring and control systems.
Many process monitoring and control schemes are based upon a representation of
the dynamic relationship between cause and effect variables. In such schemes, this
representation is typically approximated using some form of linear dynamic model, such
as finite impulse response (FIR), autoregressive with exogenous variable (ARX) and
autoregressive moving average with exogenous variable (ARMAX) models. Once
determined, the dynamic process model of the system can be integrated within a variety
of process monitoring and control algorithms. In process control, for example, the model
can be incorporated within a model based predictive control (MBPC) algorithm, such as
Generalised Predictive Control.
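As an illustration of the linear dynamic models described above, the following minimal sketch fits a first-order ARX model, y[k] = a*y[k-1] + b*u[k-1], by ordinary least squares. The model order, the function names and the test signal are assumptions made for this example and are not taken from the study.

```python
def fit_arx_first_order(u, y):
    """Least-squares fit of y[k] = a*y[k-1] + b*u[k-1]."""
    # Accumulate the entries of the 2x2 normal equations.
    syy = syu = suu = sy_t = su_t = 0.0
    for k in range(1, len(y)):
        y_prev, u_prev, target = y[k - 1], u[k - 1], y[k]
        syy += y_prev * y_prev
        syu += y_prev * u_prev
        suu += u_prev * u_prev
        sy_t += y_prev * target
        su_t += u_prev * target
    det = syy * suu - syu * syu
    if abs(det) < 1e-12:
        raise ValueError("data are not sufficiently exciting to identify a and b")
    # Solve the normal equations by Cramer's rule.
    a = (sy_t * suu - su_t * syu) / det
    b = (su_t * syy - sy_t * syu) / det
    return a, b


def simulate_arx(a, b, u, y0=0.0):
    """Generate noise-free output data from a known first-order ARX model."""
    y = [y0]
    for k in range(1, len(u)):
        y.append(a * y[k - 1] + b * u[k - 1])
    return y


# Illustrative input sequence (assumed, not process data).
u = [1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0]
y = simulate_arx(0.8, 0.5, u)
a_hat, b_hat = fit_arx_first_order(u, y)
```

With noise-free data the fit recovers the parameters used in the simulation; with real process data the estimates would only approximate the underlying dynamics.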
Alternatively, for process monitoring, the residuals (prediction errors) from such
models can be analyzed to detect abnormal operation. Such monitoring and control
schemes have found widespread application in industry and have led to significant
improvements in process operations. Unfortunately, the models employed within the
schemes tend to be linear in form. Although linear models can provide acceptable
performance for many systems, they may be unsuitable in the presence of significant non-
linearities. For such systems it may be beneficial to employ a model that reflects the non-
linear relationship between cause and effect variables.
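The residual-based monitoring idea described above can be sketched as follows. The three-standard-deviation alarm limit and all names are assumptions chosen for illustration; the source does not state which test was applied to the residuals.

```python
import statistics


def residual_alarms(measured, predicted, reference_residuals, n_sigma=3.0):
    """Return the sample indices whose residual lies outside the normal band.

    reference_residuals are prediction errors recorded during normal
    operation; they define the mean and spread of the acceptable band.
    """
    mean = statistics.mean(reference_residuals)
    sigma = statistics.stdev(reference_residuals)
    alarms = []
    for k, (m, p) in enumerate(zip(measured, predicted)):
        residual = m - p
        if abs(residual - mean) > n_sigma * sigma:
            alarms.append(k)
    return alarms


# Illustrative numbers only (assumed, not plant data).
reference = [0.1, -0.1, 0.05, -0.05, 0.0, 0.08, -0.08]
measured = [1.0, 1.05, 2.0, 0.98]
predicted = [1.0, 1.0, 1.0, 1.0]
alarms = residual_alarms(measured, predicted, reference)
```

Here only the third sample, whose residual is far larger than anything seen in the reference data, is flagged.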
Preliminary studies have indicated that artificial neural networks (ANNs) may
provide a generic, non-linear solution for such systems. As with standard linear modelling
techniques, ANNs are capable of approximating the dynamic relationships between cause
and effect variables. In contrast to linear techniques however, ANNs offer the benefit of
being able to capture non-linear relationships. Since the performance of process
monitoring and control algorithms is dependent upon the precision of the model
embedded within them, ANN models have the potential to provide benefits to these
algorithms when applied to nonlinear systems.
Figure-1 Basic Perceptron Model.


A mechanistic model derived from first principles is theoretically the most accurate
model that can be developed for any system. Unfortunately, the resources required to
develop such a model for even the simplest of systems tend to prohibit its use.
Consequently engineers tend to rely on system identification techniques to establish
process models. The most common approaches to system identification include dynamic
process models such as ARX and ARMAX, which are linear in form. The majority of
process systems however contain varying degrees of non-linearity that can reduce the
accuracy of such models. To recover this loss in prediction accuracy, many research
projects in recent years have focused on the use of neural networks as a tool for system
identification.
As with linear models, ANNs provide a description of the relationship between
cause and effect variables. The benefit ANNs offer over linear models is that they are
capable of modelling non-linear relationships. In fact, studies have shown them to be
capable of modelling any continuous non-linear function to arbitrary accuracy. Although
there exist many different ANN structures, they do possess some common features. They are
generally composed of numerous processing elements, termed nodes or neurons, which are
arranged together to form a network. The most commonly used processing element is one that
weights the input signals and then sums them together with a bias term. The neuron
output is then obtained by passing the summed, weighted inputs through a non-linear
activation function, such as the hyperbolic tangent.
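The processing element described above can be written compactly as follows. The symbols are assumptions introduced here for illustration: x_i are the n input signals, w_i the corresponding weights, b the bias term and y the neuron output.

```latex
y = \tanh\!\left( b + \sum_{i=1}^{n} w_i x_i \right)
```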
A common type of ANN model used in many applications is the feed forward
network. This type of network comprises an input layer where input information is
presented to the network, one or more hidden layers where neuron processing takes place
and an output layer from which the network outputs are obtained. It is termed a feed
forward network because the outputs from one layer are fed forward as inputs to the
subsequent layer. The topology of such layered networks is usually described according
to the number of nodes in each layer. For example, a network with 2 inputs, 1 hidden
layer with 4 nodes and 1 output is referred to as a 2-4-1 network. This basic feedforward
network is useful for many applications; however, a number of modifications have been
proposed to improve its suitability for application to process systems.
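The forward pass through the 2-4-1 network used as an example above might be sketched as follows. The weight values are arbitrary placeholders, and the use of a linear output node is an assumption; the source does not specify the output activation.

```python
import math


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a single-hidden-layer feedforward network."""
    # Each hidden node weights the inputs, adds its bias and applies tanh.
    hidden = [
        math.tanh(b + sum(w * xi for w, xi in zip(ws, x)))
        for ws, b in zip(hidden_weights, hidden_biases)
    ]
    # Assumed linear output node: weighted sum of hidden outputs plus a bias.
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))


# A 2-4-1 topology: 2 inputs, 4 hidden nodes, 1 output (placeholder weights).
hidden_weights = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2], [0.4, 0.4]]
hidden_biases = [0.0, 0.0, 0.0, 0.0]
output_weights = [1.0, -1.0, 0.5, 0.25]
output_bias = 0.5

y_zero = forward([0.0, 0.0], hidden_weights, hidden_biases,
                 output_weights, output_bias)
y_example = forward([1.0, 2.0], hidden_weights, hidden_biases,
                    output_weights, output_bias)
```

Because tanh is bounded between -1 and 1, the output of this sketch can never move further from the output bias than the sum of the absolute output weights.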

Figure-2 Non-linear optimization and transform neural network model.

In particular, the incorporation of a dynamic element into the network is
important for the modeling of dynamic data. One approach for introducing dynamics is to
adopt the philosophy of ARMAX models and use time lagged data in the model. Another
possibility is to introduce dynamics at a localized level by passing the outputs of the
network nodes through first order low pass filters in what is referred to as a filter based
network. Alternative structures, known as recurrent networks, incorporate feedback
elements within the network. A simple form of recurrent structure is the globally
recurrent network, which feeds back the output of the model to the input layer. In this
study filter based, globally recurrent and time lagged neural networks were all applied.
Fully recurrent networks were not used due to the excessive time required to develop them.
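Two of the ways of introducing dynamics described above can be sketched briefly. The first function builds time-lagged input vectors in the manner of an ARX or ARMAX regressor; the second applies a discrete first-order low-pass filter of the kind that a filter based network would place on a node output. The function names, the lag orders and the filter form y_f[k] = alpha*y_f[k-1] + (1 - alpha)*x[k] are assumptions made for illustration.

```python
def lagged_inputs(u, y, n_u, n_y):
    """Build rows [y[k-1..k-n_y], u[k-1..k-n_u]] with targets y[k]."""
    start = max(n_u, n_y)
    rows, targets = [], []
    for k in range(start, len(y)):
        past_y = [y[k - i] for i in range(1, n_y + 1)]
        past_u = [u[k - i] for i in range(1, n_u + 1)]
        rows.append(past_y + past_u)
        targets.append(y[k])
    return rows, targets


def low_pass(signal, alpha, initial=0.0):
    """First-order low-pass filter; alpha in [0, 1) sets the time constant."""
    state = initial
    filtered = []
    for x in signal:
        state = alpha * state + (1.0 - alpha) * x
        filtered.append(state)
    return filtered


rows, targets = lagged_inputs([1, 2, 3, 4], [10, 20, 30, 40], n_u=2, n_y=1)
smoothed = low_pass([1.0, 1.0], alpha=0.5)
```

Each row of `rows` would be presented to a static network as its input vector, so the dynamics are carried entirely by the lagged data rather than by the network structure.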
All the neural network structures studied in this work contained a direct linear
link between the input and output nodes. The reason for this is that it has been
demonstrated that ANNs are often unable to accurately represent linear relationships.
Whilst it could be argued that if the relationship between cause and effect variables is
linear then a linear model should be used, it can be expected that in a multi-input, multi-
output system there will exist both linear and non-linear relationships between cause and
effect variables. Noticeable improvements in modelling accuracy were observed in many
of the applications detailed in this paper through the inclusion of such a link.
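A minimal sketch of a network with a direct linear link between the input and output nodes is given below. The structure shown, a single tanh hidden layer whose output is added to a weighted sum of the raw inputs, is an assumption about how such a link could be arranged; the names and numbers are illustrative.

```python
import math


def forward_with_linear_link(x, hidden_weights, hidden_biases,
                             output_weights, output_bias, link_weights):
    """Single-hidden-layer network plus a direct input-to-output linear path."""
    hidden = [
        math.tanh(b + sum(w * xi for w, xi in zip(ws, x)))
        for ws, b in zip(hidden_weights, hidden_biases)
    ]
    nonlinear_part = sum(v * h for v, h in zip(output_weights, hidden))
    # The direct link bypasses the hidden layer entirely.
    linear_part = sum(c * xi for c, xi in zip(link_weights, x))
    return output_bias + nonlinear_part + linear_part


# With the hidden-to-output weights set to zero the model is exactly linear.
y_linear = forward_with_linear_link(
    x=[1.0, 2.0],
    hidden_weights=[[0.5, -0.3], [0.1, 0.8]],
    hidden_biases=[0.0, 0.0],
    output_weights=[0.0, 0.0],
    output_bias=1.0,
    link_weights=[2.0, -3.0],
)
```

The example shows why the link helps: a purely linear relationship can be represented exactly by the link weights alone, leaving the hidden layer free to model only the non-linear part.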
The architecture of a neural network refers to the particular type of neural network
that is being used, for example feedforward, recurrent or filter based network. Once this
architecture has been specified the network must be trained. The issue of training is a
non-linear optimisation problem, the aim being to adjust the weights in the network to
minimise a cost function based on the squared prediction error over a set of process data,
termed training data. Literature abounds with various algorithms for the training of neural
networks. In this study both backpropagation and the second order Levenberg-Marquardt
search direction method were successfully employed. It is worth noting that, besides
speed of training (the Levenberg-Marquardt technique was noticeably faster than
backpropagation), no difference was observed in the prediction accuracy of models
resulting from either of these training techniques.
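The training problem described above can be illustrated with a deliberately small sketch: batch gradient descent, with the gradients obtained by the chain rule as in backpropagation, on half the sum of squared prediction errors for the one-node model y_hat = v*tanh(w*x + b). The model, learning rate, number of epochs and training data are all assumptions chosen for this example; the Levenberg-Marquardt method used in the study is not shown.

```python
import math


def train_single_node(xs, ys, w=0.1, b=0.0, v=1.0, rate=0.01, epochs=2000):
    """Minimise 0.5 * sum of squared errors for y_hat = v * tanh(w*x + b)."""

    def cost(w, b, v):
        return 0.5 * sum((v * math.tanh(w * x + b) - y) ** 2
                         for x, y in zip(xs, ys))

    initial_cost = cost(w, b, v)
    for _ in range(epochs):
        grad_w = grad_b = grad_v = 0.0
        for x, y in zip(xs, ys):
            h = math.tanh(w * x + b)
            error = v * h - y
            # Chain rule: propagate the output error back through tanh.
            back = error * v * (1.0 - h * h)
            grad_v += error * h
            grad_w += back * x
            grad_b += back
        # Step every weight against its accumulated gradient.
        w -= rate * grad_w
        b -= rate * grad_b
        v -= rate * grad_v
    return (w, b, v), initial_cost, cost(w, b, v)


# Synthetic training data (assumed): samples of 2*tanh(0.5*x).
xs = [-2.0 + 0.5 * i for i in range(9)]
ys = [2.0 * math.tanh(0.5 * x) for x in xs]
params, initial_cost, final_cost = train_single_node(xs, ys)
```

A second order method differs only in how the weight update is computed from the error; the cost function being minimised is the same.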

Figure-3 Basic Model of Perceptron.

The applications studied were typically chosen because studies involving traditional control
and monitoring techniques had proved unsuccessful. Brief descriptions of the main
applications are provided below.
3.1 Process Descriptions
Vitrification Process
Vitrification is the process whereby highly active liquid waste, obtained in the
reprocessing of spent nuclear fuel elements, is encapsulated in glass to form solid blocks of
waste for safe and convenient storage. The liquid waste, along with glass frit, flows into a
vessel, known as a melter, which is heated by four induction coils. When the level of
waste reaches a certain point in the melter the contents are discharged to a storage
container and sent to product store in an operation known as pouring. Heat transfer
mechanisms in the melter are complex and during pouring there is an increase in the
transfer of heat from the vessel walls to the melt.
The control system regulating the temperature in the vessel is relatively crude and
is slow to respond to this effect; consequently, the temperature tends to vary
considerably during operation. The large thermal disturbances experienced during pouring
exert significant thermal stresses on the walls of the melter vessel and have resulted in a
number of the vessels fracturing before their full life expectancy has been reached. Such
fractures result in increased downtime costs as well as extra costs incurred in the disposal
of the radioactive vessel itself. The objective of this study was first to develop an
accurate model of the process and then to integrate this model into a process monitoring
scheme capable of detecting signs of imminent vessel failure.

Figure-4 Activated neuron is fired.

Polymer Extrusion Process
The polymer extrusion process is responsible for coating a layer of polymer onto
copper wire. The system comprises an induction-heated barrel containing an Archimedean
screw. A base polymer is introduced into the barrel, along with a cocktail of chemical
additives, known as monosil. The monosil and polymer are forced along the length of the
barrel by the Archimedean screw, reacting to produce a polymer with specific properties,
which is coated onto a wire cable. The quality of the polymer product, measured
principally in terms of its tensile strength, is influenced by a variety of variables such as
the speed at which the screw rotates (screw speed), operating temperature and monosil
dose rate. Presently, the only reliable method for determining the quality of the polymer
coating is through destructive testing 24 hours after production. This means that it is
possible for large quantities of substandard material to be produced before it is ever
detected. To improve quality management on this process, BICC have invested heavily
in finding alternative methods for measuring the polymer quality.
They have recently determined that the viscosity of the polymer as it exits the
extruder provides an approximate measure of the ultimate tensile strength of the polymer.
Consequently, through the use of a rheometer, the viscosity and, through inference, the
tensile strength of the polymer can be measured online. Unfortunately, the expense
associated with installing rheometers on all the extruders that BICC Cables operate
worldwide means that this approach is impractical. The objective of this project was
therefore to develop an ANN inferential estimator capable of predicting the viscosity of
the polymer product and to then incorporate this model within an automatic control
system for the process. This project was undertaken on a pilot scale extrusion process,
with a rheometer attached. The pilot scale extruder was approximately half the size of a
typical industrial extruder.
3.2 Gasoline Engine
To meet ever more stringent legislation regarding emission levels from gasoline engines,
it is necessary for the air-fuel ratio (AFR) in the engine to be kept as close as possible to
the stoichiometric mix. The control of the AFR is complicated significantly by the
dynamics in gasoline engines, which contain large non-linearities and varying time delays
and time constants.
Current state-of-the-art techniques for controlling AFR employ adaptive linear
model-based controllers. Although such controllers have proved relatively successful,
problems are encountered during large transients, when it takes time for the linear model
to adapt to the new operating conditions. Since an ANN is capable of identifying the
non-linearities in the system, it should not need to adapt during transients and should
therefore be better able to control the AFR. The aim of this project was therefore to
develop an ANN model of the gasoline engine and then incorporate this model within an
automated control system to regulate the AFR.
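The idea of embedding a process model within a predictive controller can be sketched in its simplest form: at each sample, the model predicts the next output for a set of candidate inputs, and the input giving the smallest predicted setpoint error (plus an optional penalty on control moves) is applied. This one-step-ahead grid search is a simplification adopted for illustration and is not the algorithm used in the study; `stand_in_model` is a hypothetical placeholder for a trained ANN.

```python
def one_step_predictive_move(model, y_now, u_prev, setpoint, candidates,
                             move_penalty=0.0):
    """Choose the candidate input with the lowest one-step-ahead cost."""

    def objective(u):
        tracking_error = (setpoint - model(y_now, u)) ** 2
        return tracking_error + move_penalty * (u - u_prev) ** 2

    return min(candidates, key=objective)


def stand_in_model(y, u):
    # Hypothetical placeholder for a trained process model (assumed dynamics).
    return 0.8 * y + 0.5 * u


candidates = [0.5 * i for i in range(9)]  # 0.0, 0.5, ..., 4.0
u_next = one_step_predictive_move(stand_in_model, y_now=0.0, u_prev=0.0,
                                  setpoint=1.0, candidates=candidates)
```

In this example the input 2.0 is selected because the stand-in model predicts that it drives the output exactly to the setpoint. A practical controller would optimise over a longer prediction horizon and handle constraints.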


Thames Water utilize a rapid gravity filtration process during the treatment of drinking
water. The process involves passing water through a filter bed of sand that traps
suspended particles. The filtration process is enhanced through the addition of coagulants
and the pre-injection of ozone gas. Precise dosing of the coagulant is critical to the
performance of the filters. Too little causes trends in the turbidity of the incoming water
to pass through into the filtrate and too much increases the iron residual, resulting in
reduced filter run times and unnecessary expense.
The aim of this work was to optimise the operation of the filtration process.
Specifically plant operations required a system that could maximize output water quality
and filter run length and minimize coagulant and ozone dosage. As a first step an ANN
model was to be developed to predict the output water quality and filter run length, based
upon process measurements. This model could then be incorporated into an optimization
routine that could be used to minimize a cost function. This cost would be a function of
output water quality, filter run length and the coagulant and ozone dosing rates. Other
problems tackled during the initiative were the forecasting of drinking water
demand for Severn Trent Water, the prediction of chemical dosing levels for drinking
water for Northwest Water and the optimization of lift gas flow rates for Texaco. In
addition, the ability of ANNs to model a series of generic problems encountered in
process systems, such as variable time delays and time constants, was investigated.
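The cost function proposed for the filtration process, a function of output water quality, filter run length and the coagulant and ozone dosing rates, might take a weighted-sum form such as the sketch below. The weighted-sum structure, the weights, the use of turbidity as the quality measure and every number shown are assumptions for illustration only; in practice the predictions would come from the ANN model.

```python
def filtration_cost(prediction, doses, weights):
    """Weighted cost: penalise poor quality and dosing, reward long runs."""
    return (weights["turbidity"] * prediction["turbidity"]
            - weights["run_length"] * prediction["run_length_h"]
            + weights["coagulant"] * doses["coagulant"]
            + weights["ozone"] * doses["ozone"])


# Assumed weights and candidate operating points (not plant data).
weights = {"turbidity": 10.0, "run_length": 0.1, "coagulant": 1.0, "ozone": 1.0}
candidates = [
    ({"coagulant": 2.0, "ozone": 1.0}, {"turbidity": 0.3, "run_length_h": 40.0}),
    ({"coagulant": 3.0, "ozone": 1.0}, {"turbidity": 0.2, "run_length_h": 30.0}),
    ({"coagulant": 1.0, "ozone": 0.5}, {"turbidity": 0.5, "run_length_h": 48.0}),
]
costs = [filtration_cost(pred, doses, weights) for doses, pred in candidates]
best_index = min(range(len(candidates)), key=lambda i: costs[i])
```

An optimization routine would search over the dosing rates rather than a fixed list, calling the model at each trial point to obtain the predictions.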
The first phase for each of the projects was to acquire process data suitable for model
development. As with linear models it is important that the data used to develop the
process models is sufficiently exciting to extract accurate cause and effect relationships.
For some of the systems investigated it was found that historical data was suitably
exciting for identification purposes.
However, in other cases, such as the extrusion process, historical process data was
unavailable. For these systems trial runs were organised so that process data could be
collected. The trial runs took the form of either multi-level pseudo random signals (PRS)
or multi-level step tests. A point worth noting is that, unlike in linear modelling, data
collected during pseudo random binary signal (PRBS) testing is unsuitable for non-linear
identification purposes, because a binary signal excites the process at only two amplitude
levels. It is important to consider the effects of feedback controllers
when collecting data. If the model is to be used for control purposes then using data
collected under closed loop operation may introduce problems.
If however the model is to be used for monitoring purposes then the process data
should be collected with the system in its standard configuration. For example if the
system typically operates in closed loop then the data should be collected in closed loop
operation. In this study the data collected for the control applications were obtained
during open loop multi-level step tests. As with conventional linear modelling, the
performance of the developed neural network is very much dependent upon the amount
of process data collected and used during training. For each application investigated here
all available data was utilised. This ranged from data collected during a trial run
involving 10 step changes per variable for the BICC extrusion process to over two years
of data for the Thames Water filtration process.
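A multi-level pseudo random test signal of the kind used in the trial runs can be generated along the following lines. The amplitude levels, hold times and the rule that consecutive segments must differ are assumptions for this sketch; the source does not give the signal design used.

```python
import random


def multilevel_prs(levels, n_steps, min_hold, max_hold, seed=0):
    """Piecewise-constant signal that switches randomly between levels."""
    if len(levels) < 2:
        raise ValueError("at least two amplitude levels are required")
    rng = random.Random(seed)  # seeded so that a trial can be repeated
    signal = []
    current = None
    while len(signal) < n_steps:
        # Always move to a different level so every segment is a real step.
        current = rng.choice([lv for lv in levels if lv != current])
        hold = rng.randint(min_hold, max_hold)
        signal.extend([current] * hold)
    return signal[:n_steps]


levels = [0.0, 0.25, 0.5, 0.75, 1.0]
signal = multilevel_prs(levels, n_steps=200, min_hold=5, max_hold=10)
```

Using several amplitude levels, rather than the two of a PRBS, exposes how the process gain and dynamics change across the operating range.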
Whilst two years of data may seem exceptional, the filtration application is
seasonally affected and therefore two years of data was considered to be the minimum
requirement. Once the data was collected it was divided into three sets: the training data
set comprised half the available data and the remaining data was split evenly between
testing and validating data sets. Once the process data was collected, it was cleaned and
analyzed. Noise levels on each of the data sets were not found to be excessive; however,
there were spurious measurements in most data sets. Spurious data were commonly
caused by erroneous signals from measuring devices and were removed by linear
interpolation using reliable process measurements taken before and after the erroneous
signals. Following the cleaning of the process data, it was analyzed to determine the
cause and effect variables and the major time constants and time delays present in the
process.
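The data preparation steps described above, replacing spurious measurements by linear interpolation between reliable neighbours and dividing the data into training, testing and validation sets in the proportions one half, one quarter and one quarter, can be sketched as follows. How spurious points were identified and whether the split was chronological are not stated in the source; the flag list and the chronological split below are assumptions.

```python
def interpolate_spurious(values, is_spurious):
    """Replace flagged runs by linear interpolation between reliable points."""
    cleaned = list(values)
    n = len(cleaned)
    k = 0
    while k < n:
        if not is_spurious[k]:
            k += 1
            continue
        start = k
        while k < n and is_spurious[k]:
            k += 1
        end = k  # index of the first reliable point after the run, or n
        if start == 0 or end == n:
            continue  # no reliable point on one side: leave the run untouched
        left, right = cleaned[start - 1], cleaned[end]
        span = end - (start - 1)
        for j in range(start, end):
            cleaned[j] = left + (right - left) * (j - (start - 1)) / span
    return cleaned


def split_half_quarter_quarter(samples):
    """Chronological split: half training, then equal test and validation."""
    n_train = len(samples) // 2
    n_test = (len(samples) - n_train) // 2
    return (samples[:n_train],
            samples[n_train:n_train + n_test],
            samples[n_train + n_test:])


cleaned = interpolate_spurious([1.0, 99.0, 99.0, 4.0],
                               [False, True, True, False])
train, test, validate = split_half_quarter_quarter(list(range(8)))
```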
The tools used for this analysis included cross-correlation and multivariable
techniques such as principal component analysis and principal component regression. The
results from these analyses were subsequently discussed and validated with process
operators and engineers. Such discussions proved extremely useful and provided greater
insight into the operation of the process than was possible with the data analysis
techniques alone.
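As an example of the cross-correlation analysis mentioned above, the time delay between a cause variable and an effect variable can be estimated as the lag at which the magnitude of their cross-covariance is largest. The function name, the test signal and the use of an unnormalised cross-covariance are assumptions made for this sketch.

```python
def estimate_delay(u, y, max_lag):
    """Return the lag (in samples) maximising |cross-covariance| of u and y."""
    mean_u = sum(u) / len(u)
    mean_y = sum(y) / len(y)
    best_lag, best_score = 0, -1.0
    for lag in range(max_lag + 1):
        # Pair each output sample with the input applied `lag` samples earlier.
        score = abs(sum((u[k - lag] - mean_u) * (y[k] - mean_y)
                        for k in range(lag, len(y))))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Illustrative data: the output is the input delayed by three samples.
u = [0, 0, 1, 0, 0, 0, 2, 0, 0, 0, -1, 0, 0, 0, 0]
y = [0, 0, 0] + u[:-3]
delay = estimate_delay(u, y, max_lag=6)
```

On real process data the peak is broadened by the process dynamics and by noise, which is one reason why the results of such analyses benefit from validation with process operators and engineers.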


The investigation detailed in this paper met its objectives in analysing in detail the major
issues involved in applying neural network technology within the process industries. It
may be considered that the effort during this project has been too focused on neural
networks. While brief comparisons with linear techniques have been made, little attention
was given to the issue of whether neural networks were an acceptable or preferred
solution overall. Whilst neural networks may provide greater modeling accuracy than
their linear counterparts, stability is an important issue and until it is resolved some
resistance to process control solutions based upon neural networks is likely to remain.
Adaptive control applications are seen as one area where neural networks have
much to offer. The implications of allowing neural network models to adapt on-line were
covered briefly during this project. In particular, the ability of a neural network to adapt to
the continuous dynamic changes in a gasoline engine was studied [23]. Again, further
research is required in this area to increase the credibility of adaptive neural networks.
The future of neural networks lies not only in their explicit use but also in their use in
conjunction with other advanced technologies. The fusion of neural networks and fuzzy
logic in the form of neurofuzzy techniques is seen by many as the most promising way
ahead for advanced process monitoring and control applications [32]. An alternative field
that also offers great potential is hybrid modelling, an identification methodology that
complements simplified mechanistic models with either linear or nonlinear data based
models [33]. The potential for the application of neural network technology in the process
industries is vast. The ability of neural networks to capture and model process dynamics
and severe process nonlinearities makes them powerful tools in model based control and
monitoring. This investigation has looked in detail at the practical and theoretical issues
associated with using neural networks for such applications and this paper should provide
readers with an insight into the problems and benefits encountered when exploiting them.


1. Ljung, L., System Identification - Theory for the User, Prentice Hall, Englewood
Cliffs, New Jersey, 1999.

2. Clarke, D.W., Mohtadi, C. and Tuffs, P.S., Generalised predictive control. Part 1: the
basic algorithm and Part 2: extensions and interpretations, Automatica,
1987, 23 (2), 137-160

3. Frank, P.M., Analytical and qualitative model-based fault diagnosis - a survey and
some new results, Europ. J. Contr., 1996, 2, 6-28.

4. Garcia, C.E., Prett, D.M., Morari, M., Model predictive control: theory and practice - a
survey, Automatica, 1989, 25 (3), 335-348

5. Willis, M.J., Montague, G.A., Di Massimo, C., Tham, M.T. and Morris, A.J., Artificial
neural networks in process estimation and control, Automatica, 1992,
28 (6), 1181-1187

6. Chen, S., Billings, S.A. and Grant, P.M., Non-linear system identification using neural
networks, International Journal of Control, 1990, 51 (6), 1191-1214

7. Hunt, K.J., Sbarbaro, D., Zbikowski, R. and Gawthrop, P.J., Neural networks for
control systems - a survey, Automatica, 1992, 28 (6), 1083-1112

8. Lightbody, G., Irwin, G.W., Taylor, A., Kelly, K. and McCormick, J., Neural network
modeling of a polymerisation reactor, Proceedings IEE Control ’94, 1994, 1, 237-242

9. Cybenko, G., Approximation by superpositions of a sigmoidal function, Mathematics of
Control, Signals and Systems, 1989, 2, 303-314