
A Multilayered Neural Network Controller

Demetri Psaltis, Athanasios Sideris, and Alan A. Yamamura

ABSTRACT: A multilayered neural network processor is used to control a given plant. Several learning architectures are proposed for training the neural controller to provide the appropriate inputs to the plant so that a desired response is obtained. A modified error-back-propagation algorithm, based on propagation of the output error through the plant, is introduced. The properties of the proposed architectures are studied through a simulation example.

Presented at the 1987 IEEE International Conference on Neural Networks, San Diego, California, June 21-24, 1987. Demetri Psaltis, Athanasios Sideris, and Alan A. Yamamura are with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125. This research was funded by DARPA and AFOSR and also in part by the Caltech President's Fund. Alan A. Yamamura was supported with a Fellowship from the Hertz Foundation.

Introduction

The neural approach to computation has emerged in recent years [1]-[4] to tackle problems for which more conventional computational approaches have proven ineffective. To a large extent, such problems arise when a computer is asked to interface with the real world, which is difficult because the real world cannot be modeled with concise mathematical expressions. Problems of this type are machine vision, speech and pattern recognition, and motor control. Not coincidentally perhaps, these are precisely the types of problems that humans execute seemingly without effort. Therefore, people interested in building computers to tackle these problems have tried to adopt the existing understanding of how the brain computes. There are three basic features that identify a "neural computer": it consists of a very large number of simple processing elements (the neurons); each neuron is connected to a large number of others; and the functionality of the network is determined by modifying the strengths of the connections during a learning phase. These general characteristics are certainly similar to those evident in biological neural networks; however, the precise details of the operation of neural networks in a brain can be quite different from those in the abstract models used in the design of neural computers. Nevertheless, the general morphological features and the basic approach to computation are common in both natural and artificial neural networks. This can lead to computers that can effectively tackle problems for which present approaches are relatively ineffective but at which natural intelligence is very good.

In this paper, we explore the application of the neural approach to control [5]-[15]. Both humans and machines perform control functions; however, there are sharp distinctions between machine and human control systems. For instance, humans make use of a much greater amount of sensory information in planning and executing a control action compared to industrial controllers. The reason for this difference is not so much sensor constraints but rather the inability of man-made controllers to efficiently absorb and usefully process such a wealth of information. The collective manner in which information is processed by neural networks is another distinction between human and machine control; it is this collective processing capability that provides neural networks with the ability to respond quickly to complex sensory inputs. Sophisticated control algorithms, on the other hand, are severely limited by the time it takes to execute them in electronic hardware. The third distinction, and perhaps the most important, is that human control is largely acquired through learning, whereas the operation of man-made controllers is specified by an algorithm that is written a priori. Therefore, in order to implement an effective algorithmic controller, we must have a thorough understanding of the plant that is to be controlled, something that is generally very difficult to achieve in practice. A neural controller performs a specific form of adaptive control, with the controller taking the form of a nonlinear multilayer network and the adaptable parameters being the strengths of the interconnections between the neurons. In summary, a controller that is designed as a neural network architecture should exhibit three important characteristics: the utilization of large amounts of sensory information, collective processing capability, and adaptation.

A general diagram for a control system is shown in Fig. 1. The feedback and feedforward controllers and the prefilter can all be implemented as multilayered neural networks. The learning process gradually tunes the weights of the neural network so that the error signal between the desired and actual plant responses is minimized. Since the error signal is the input to the feedback controller, the training of the network will lead to a gradual switching from feedback to feedforward action as the error signal becomes small. In this paper, we consider implementing only the feedforward controller as a neural network. During training, features of the plant that are initially unknown and not taken into account by the control algorithm are learned. In this manner, some of the model uncertainty is eliminated and, thus, improved control results. In other words, the feedforward controller is modified to compensate for the characteristics of the plant that are discovered during learning. When the error becomes small, training has been accomplished, and only a relatively small feedback signal is necessary to compensate for random uncertainties that are unpredictable and, thus, cannot be learned. An immediate consequence of the increased use of feedforward control action is to speed up the response of the system.

In the remainder of this paper, we consider specific methods for training neural networks to minimize the error signal. In the following sections, we propose three control learning methods, describe the error back propagation algorithm [16], which is the method used here to adapt the weights in the neural networks we use as controllers, and introduce a modification of the error back propagation algorithm that extends its utility to problems where the error signal used to train the network is not the error measured at the output of the network. We also present simulations for a very simple plant to demonstrate the operation of the proposed architectures and training methods.

Learning Control Architectures

Figure 2 shows the feedforward controller, implemented as a neural network architecture, with its output u driving the plant. The desired response of the plant is denoted by d and its actual output by y.

0272-1708/88/0400-0017 $01.00 © 1988 IEEE


April 1988 17

Fig. 1. General control system diagram.

Fig. 2. Indirect learning architecture.

Fig. 3. General learning architecture.

Fig. 4. Specialized learning architecture.

The neural controller should act as the inverse of the plant, producing from the desired response d a signal u that drives the output of the plant to y = d. The goal of the learning procedure, therefore, is to select the values of the weights of the network so that it produces the correct d → u mappings, at least over the range of d's in which we plan to operate the plant. We will consider three different learning methods. One possibility is suggested in Fig. 2. Suppose that the feedforward controller is successfully trained so that the plant output y = d. Then the network used as the feedforward controller will approximately reproduce the plant input from y (i.e., t = u). Thus, we might consider training the network by adapting its weights to minimize the error ε₁ = u − t using the architecture shown in Fig. 2, because, if the overall error ε = d − y goes to zero, so does ε₁. The positive feature of this arrangement is that the network can be trained only in the region of interest, since we start with the desired response d and all other signals are generated from it. In addition, as we will see in the next section, it is advantageous to adapt the weights in order to minimize the error directly at the output of the network. Unfortunately, this method, as described, is not a valid training procedure, because minimizing ε₁ does not necessarily minimize ε. For instance, simulations with a simple plant showed that the network tends to settle to a solution that maps all d's to a single u = u₀, which, in turn, is mapped by the plant to y = y₀, for which ε₁ is zero but obviously ε is not. This training method remains interesting, however, because it could be used in conjunction with one of the procedures described below that minimize ε.

General Learning Architecture

The architecture shown in Fig. 3 provides a method for training the neural controller that does minimize the overall error ε. The training sequence is as follows. A plant input u is selected and applied to the plant to obtain a corresponding y, and the network is trained to reproduce u at its output from y. The trained network should then be able to take a desired response d and produce the appropriate u, making the actual plant output y approach d. This will clearly work if the input d happens to be sufficiently close to one of the y's that were used during the training phase. Thus, the success of this method is intimately tied to the ability of the neural network to generalize, or learn to respond correctly to inputs it has not specifically been trained for. Notice that, in this architecture, we cannot selectively train the system to respond correctly in regions of interest, because we normally do not know which plant inputs u correspond to the desired outputs d. Thus, we typically attempt to uniformly populate the input space of the plant with training samples so that the network can interpolate the intermediate points. In this case, the general procedure may not be efficient, since the network may have to learn the responses of the plant over a larger operational range than is actually necessary. One possible solution to this problem is to combine the general method with the specialized procedure described below.

Specialized Learning Architecture

Figure 4 shows an architecture for training the neural controller to operate properly in regions of specialization only. Training involves using the desired response, d, as input to the network. The network is trained to find the plant input, u, that drives the system output, y, to the desired d. This is accomplished by using the error between the desired and actual responses of the plant to adjust the weights of the network using a steepest descent procedure; during each iteration, the weights are adjusted to maximally decrease the error. This procedure requires knowledge of the Jacobian of the plant. In the following section, we describe an iterative procedure for specialized training in which the plant derivatives are estimated continuously. This architecture can specifically learn in the region of interest, and it may be trained on-line, fine-tuning itself while actually performing useful work. The general learning architecture, on the other hand, must be trained off-line. Feedforward neural networks are nondynamical systems and, therefore, input-output stable. Consequently, off-line training of the neural network presents no stability problem for the control system. Intuitively, we expect no stability problems in the case of on-line training either, as long as the learning rate is sufficiently slower than the time constants of the other components of the control system.

Neural Net Training

The training method chosen may greatly affect the final operation of the system, its ability to adapt, and the time it takes to learn. Although neural networks come in many shapes and sizes, we consider here only networks that fit the form of Fig. 5. In this layered architecture, neurons in the same layer, m, perform the same function, f^m, on the quantities p_i^m, where the p_i^m's are the weighted sums of the outputs, the q_j^(m−1)'s, of the previous layer. Training adjusts the connection strengths between the neurons, thus modifying the functional character of the network with the objective of minimizing the training error, ε = t − o, between the actual output, o, and target output, t, of the network.
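The layered form of Fig. 5 can be sketched in code. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: a sigmoid transfer function is assumed for the hidden layer and a linear one for the output, as in the paper's simulation example, and all function names are ours.

```python
import math

def sigmoid(x):
    """Hidden-layer transfer function f(x) = 1/[1 + exp(-x)]."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(x):
    """Output-layer transfer function f(x) = x, giving unlimited range."""
    return x

def forward(weights, x, transfers):
    """Forward pass through the layered net of Fig. 5:
    p_i^m = sum_j w_ij^m q_j^(m-1) and q_i^m = f^m(p_i^m).
    A fixed-unity unit is appended to the input and to each hidden
    layer so that every neuron can learn its own threshold."""
    q = list(x) + [1.0]                       # inputs plus fixed-unity input
    for m, (W, f) in enumerate(zip(weights, transfers)):
        p = [sum(w * qj for w, qj in zip(row, q)) for row in W]
        q = [f(v) for v in p]
        if m < len(weights) - 1:
            q.append(1.0)                     # fixed-unity hidden neuron
    return q
```

Here weights[m] holds one row of weights per neuron of layer m; for the network of the simulation example (two inputs plus unity, 10 hidden neurons plus unity, two linear outputs) the two weight matrices would have shapes 10 × 3 and 2 × 11, with transfers = [sigmoid, linear].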

18 IEEE Control Systems Magazine

Fig. 5. Layered neural net. q_i^m = f^m(p_i^m), p_i^m = Σ_j w_ij^m q_j^(m−1).

Indirect Learning and General Learning Network Training

Training the first two proposed architectures is accomplished by error back propagation, a widely used algorithm for training feedforward neural networks. Error back propagation attempts to minimize the squared error for training sample input-output pairs; for every training sample, it modifies each weight using the partial derivative of the squared error with respect to that weight. Since error back propagation adjusts weights in the local direction of greatest error reduction, it is a gradient descent algorithm. For the network in Fig. 5, error back propagation would modify the weight, w_ab^m, between neurons a and b of the mth and (m − 1)st layers, respectively, as follows:

Δw_ab^m = μ δ_a^m q_b^(m−1)

μ is an acceleration constant that relates to the step size of the simulation; larger acceleration constants lead to lower accuracy but faster training. In the preceding equation, the output of the kth neuron is q_k, its input is p_k, and δ_k^m is the back-propagated error, given for the final layer, n, by

δ_i^n = f^n'(p_i^n)(t_i − o_i)

and for all other layers by

δ_i^m = f^m'(p_i^m) Σ_j δ_j^(m+1) w_ji^(m+1)

For the general learning network, we identify o with the output of the network to input y, and t with the plant input u which, when fed to the plant, produces y (see Figs. 2 and 3).

Specialized Learning Network Training

We cannot apply error back propagation directly to the specialized learning architecture because of the location of the plant. Referring to Fig. 4, the plant can be thought of as an additional, although unmodifiable, layer. Then the total error, ε = d − y, is propagated back through the plant using the partial derivatives of the plant at its operating point:

δ_i = Σ_j (d_j − y_j) ∂P_j(u)/∂u_i

where P_i(u) denotes the ith element of the plant output for plant input u. The previously described error back propagation algorithm can then be applied. Therefore, error back propagation through the plant again amounts to a gradient descent search for weight combinations that minimize the true total error.

If the plant is a function of unknown form, we can approximate its partial derivatives as

∂P_i/∂u_j ≈ [P_i(u + δu_j e_j) − P_i(u)] / δu_j

where e_j is the unit vector along the jth input. This approximate derivative can be determined either by changing each input to the plant slightly at the operating point and measuring the change at the output or by comparing changes with previous iterations. The latter can be likened to a person comparing past and present experiences to determine how a system's behavior changes as its parameters change.

Generalized and Specialized Learning

A possible method for combining the two methods is to first perform general training to learn the approximate behavior of the plant, followed by specialized training to fine-tune the network in the operating regime of the system. General training will have a tendency to create better initial weights for specialized training. Thus, starting with general training can speed the learning process by reducing the number of iterations of the ensuing specialized training. Another advantage of preliminary general learning is that it may result in networks that can adapt more easily if the operating points of the system change or new ones are added.

The distinction between general and specialized learning arises from the fact that different error functions are minimized. As a result, the general and specialized learning procedures will follow different paths to minima. Intuitively, we expect the general learning procedure to produce a network that approximates the inverse of the plant better over the entire state space, but not as well in the regions of specialization as one produced by the specialized procedure. By adopting the strategy of switching back and forth between the two training methods, we can sometimes get out of local minima of one method by training with the other. Specifically, by performing general learning prior to specialized learning, we generally provide a better initial condition to the specialized procedure, in that it has lower initial error. Switching from one training method to the other can also result in sudden changes of direction in weight space. This is clearly evident in the simulations presented in the next section.

Simulation Example

We considered as a simple example a plant that converts polar coordinates (r, θ) to Cartesian coordinates (x, y). The control network should convert Cartesian to polar coordinates. Desirable characteristics of this system include a well-behaved plant and a simple mathematical form for the desired network, so that the performance of the neural network may be checked easily.

We chose a two-layer architecture with two inputs plus a fixed-unity input, 10 hidden neurons plus a fixed-unity hidden neuron, and two output neurons (see Fig. 5). The fixed units allow each neuron to find its own threshold value through the training procedure. The hidden neurons have a sigmoid transfer function, f(x) = 1/[1 + exp(−x)]. We chose linear neurons, f(x) = x, for the output so that they have unlimited range. Initial weights were selected randomly.

General Learning

The general learning proceeded under the assumption that the region of specialization had unknown (r, θ) values, except that the input magnitude r is between 0 and 10 and the input angle θ is between 0 and 90 deg, so that the input values lie in a circular wedge in the first quadrant. The training samples were chosen as a 10-point grid spanning the known input space. Every training sample was presented once in each iteration of generalized learning. We used an acceleration constant of 0.1.


The diagrams in Fig. 6 are contour plots of e² = ||t − o||², plotted as a function of the inputs to the network, x and y. The points marked by a circled "g" are the training samples used for generalized training, whereas the three rectangular regions were selected for specialized learning. Figure 6(a) shows the error contour before training, with the weights chosen randomly. Figure 6(b) is the error contour after 1000 iterations of general learning. The error has been suppressed at and around the training points. Increasing the number of iterations reduces the error at the training points, but, in general, we observe regions in which the error increases as the iterations increase. Since, with this training method, we cannot place the training samples in the regions of interest, we cannot guarantee what the error will be in these regions.

Specialized Learning

For each of the three specialization regions, we chose nine points spanning the domain of specialization. We then trained the network to specialize in each region alone, starting with the weight matrices after a varying number of iterations of general learning. We used an acceleration constant of 0.01.

Figure 7 is a plot of e² = ||t − o||², averaged over the second region of specialization, versus the number of total iterations. Three separate curves are superimposed on the same diagram. The solid curve is for general learning only, the dashed curve for specialized learning only, and the dash-dot curve for 10 iterations of general learning followed by specialized training. In this example, there seems to be a definite advantage to the hybrid learning method. However, in our simulations with this simple plant, we were unable to determine conditions under which we could consistently observe improvement by performing general learning prior to specialized learning. An important topic for future research is to find ways to determine the properties of the plants that will allow us to specify the appropriate sequence of general and specialized learning.

Conclusions

We have begun to explore the idea of using neural networks for controlling physical systems. Specifically, we proposed three different methods for using error back propagation to train a feedforward neural network controller to act as the inverse of the plant. The general learning method attempts to produce the inverse of the plant over the entire state space, but it can be very difficult to use it alone to provide adequate performance in a practical control application. In order to circumvent this problem, we introduced the method of error propagation backwards through the plant, which allows us to train the network exactly on the operational range of the plant. Finally, we proposed using generalized training in conjunction with specialized training to gain their advantages and to avoid their potential disadvantages.

Fig. 6. Squared total error maps during general learning.

Fig. 7. Squared error versus total iterations.

References

[1] J. Denker, ed., AIP Conf. Proc. Neural Networks for Computing, American Institute of Physics, New York, 1986.
[2] J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," Proc. Nat. Acad. Sci. U.S., vol. 79, pp. 2554-2558, 1982.
[3] T. Kohonen, Self-Organization and Associative Memory, New York: Springer-Verlag, 1984.
[4] D. Rumelhart and J. McClelland, Parallel Distributed Processing, Cambridge, MA: MIT Press, 1986.
[5] A. G. Barto, R. S. Sutton, and C. W. Anderson, "Synthesis of Nonlinear Control Surfaces by a Layered Associative Search Network," Biol. Cybern., vol. 43, pp. 175-185, 1982.
[6] D. Bullock and S. Grossberg, "A Neural Network Architecture for Automatic Trajectory Formation and Coordination of Multiple Effectors During Variable-Speed Arm Movements," presented at the 1987 IEEE International Conference on Neural Networks.
[7] S. Grossberg and M. Kuperstein, Neural Dynamics of Adaptive Sensory-Motor Control: Ballistic Eye Movements, Amsterdam: Elsevier/North Holland, 1986.


[8] A. Guez, J. Eilbert, and M. Kam, "Neuromorphic Architecture for Fast Adaptive Robot Control," presented at the 1987 IEEE International Conference on Neural Networks.
[9] M. Kawato, K. Furukawa, and R. Suzuki, "A Hierarchical Neural-Network Model for Control and Learning of Voluntary Movement," Biol. Cybern., vol. 57, pp. 169-185, Feb. 1987.
[10] A. J. Pellionisz, "Tensor Network Theory and Its Applications in Computer Modeling of the Metaorganization of Sensorimotor Hierarchies of Gaze," in AIP Conf. Proc. Neural Networks for Computing, American Institute of Physics, New York, pp. 339-344, 1986.
[11] A. Pellionisz, "Sensorimotor Operations: A Ground for the Co-Evolution of Brain Theory with Neurobotics and Neurocomputers," presented at the 1987 IEEE International Conference on Neural Networks.
[12] D. Psaltis, A. Sideris, and A. Yamamura, "Neural Controllers," presented at the 1987 IEEE International Conference on Neural Networks.
[13] D. E. Rumelhart, presentation at the California Institute of Technology, Spring 1987.
[14] A. Sideris, A. Yamamura, and D. Psaltis, "Dynamical Neural Networks and Their Application to Robot Control," presented at the IEEE Conference on Neural Information Processing Systems, Natural and Synthetic, 1987.
[15] K. Tsutsumi and H. Matsumoto, "Neural Computation and Learning Strategy for Manipulator Position Control," presented at the 1987 IEEE International Conference on Neural Networks.
[16] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal Representations by Error Propagation," Chapter 8 in Parallel Distributed Processing, vol. 1, Cambridge, MA: MIT Press, 1986.

Demetri Psaltis received the B.Sc. degree in electrical engineering and economics in 1974 and the M.Sc. and Ph.D. degrees in electrical engineering in 1975 and 1977, respectively, all from Carnegie-Mellon University, Pittsburgh, Pennsylvania. After the completion of the Ph.D., he remained at Carnegie-Mellon for a period of three years as a Research Associate and later as a Visiting Assistant Professor. In 1980, he joined the faculty of the Electrical Engineering Department at the California Institute of Technology, Pasadena, where he is now Associate Professor and consultant to industry. His research interests are in the areas of optical information processing, acousto-optics, image processing, pattern recognition, neural network models of computation, and optical devices. He has over 130 technical publications in these areas. Dr. Psaltis is a Fellow of the Optical Society of America and Vice President of the International Neural Networks Society.

Athanasios Sideris received the diploma in electrical engineering from the National Technical University of Athens, Athens, Greece, in 1980, the M.S. degree in mathematics in 1986, and the M.S. and Ph.D. degrees in electrical engineering in 1981 and 1985, respectively, from the University of Southern California. From September 1980 until May 1986, he was a Teaching/Research Assistant and, from September 1985 until December 1986, a Lecturer with the Electrical Engineering Department at the University of Southern California. Since June 1986, he has been at the Department of Electrical Engineering at the California Institute of Technology, where he is an Assistant Professor. His general area of interest is control and systems theory; his current research interests include robust control theory, system optimization, and, in particular, L1 optimal control, nonlinear control systems, and applications of neural networks in control.

Alan A. Yamamura was born in Hampton, Virginia, in 1965. He received the S.B. degree in electrical engineering, the S.M. degree in electrical engineering and computer science, and the S.B. degree in physics from the Massachusetts Institute of Technology, Cambridge, Massachusetts, in 1986. He is currently a graduate student in the Department of Electrical Engineering at the California Institute of Technology, Pasadena, California.



