A Multilayered Neural Network Controller
Demetri Psaltis, Athanasios Sideris, and Alan A. Yamamura
ABSTRACT: A multilayered neural network processor is used to control a given plant. Several learning architectures are proposed for training the neural controller to provide the appropriate inputs to the plant so that a desired response is obtained. A modified error-back propagation algorithm, based on propagation of the output error through the plant, is introduced. The properties of the proposed architectures are studied through a simulation example.

Introduction

The neural approach to computation has emerged in recent years [1]-[4] to tackle problems for which more conventional computational approaches have proven ineffective. To a large extent, such problems arise when a computer is asked to interface with the real world, which is difficult because the real world cannot be modeled with concise mathematical expressions. Problems of this type are machine vision, speech and pattern recognition, and motor control. Not coincidentally perhaps, these are precisely the types of problems that humans execute seemingly without effort. Therefore, people interested in building computers to tackle these problems have tried to adopt the existing understanding of how the brain computes. There are three basic features that identify a "neural computer": it consists of a very large number of simple processing elements (the neurons), each neuron is connected to a large number of others, and the functionality of the network is determined by modifying the strengths of the connections during a learning phase. These general characteristics are certainly similar to those evident in biological neural networks; however, the precise details of the operation of neural networks in a brain can be quite different from those in the abstract models used in the design of neural computers. Nevertheless, the general morphological features and the basic approach to computation are common in both natural and artificial neural networks. This can lead to computers that can tackle problems very effectively for which present approaches are relatively ineffective but natural intelligence is very good.

In this paper, we explore the application of the neural approach to control [5]-[15]. Both humans and machines perform control functions; however, there are sharp distinctions between machine and human control systems. For instance, humans make use of a much greater amount of sensory information in planning and executing a control action compared to industrial controllers. The reason for this difference is not so much sensor constraints but rather the inability of man-made controllers to efficiently absorb and usefully process such a wealth of information. The collective manner in which information is processed by neural networks is another distinction between human and machine control; it is this collective processing capability that provides neural networks with the ability to respond quickly to complex sensory inputs. Sophisticated control algorithms, on the other hand, are severely limited by the time it takes to execute them in electronic hardware. The third distinction, and perhaps the most important, is that human control is largely acquired through learning, whereas the operation of man-made controllers is specified by an algorithm that is written a priori. Therefore, in order to implement an effective algorithmic controller, we must have a thorough understanding of the plant that is to be controlled, something that is generally very difficult to achieve in practice. A neural controller performs a specific form of adaptive control, with the controller taking the form of a nonlinear multilayer network and the adaptable parameters being the strengths of the interconnections between the neurons. In summary, a controller that is designed as a neural network architecture should exhibit three important characteristics: the utilization of large amounts of sensory information, collective processing capability, and adaptation.

A general diagram for a control system is shown in Fig. 1. The feedback and feedforward controllers and the prefilter can all be implemented as multilayered neural networks. The learning process gradually tunes the weights of the neural network so that the error signal between the desired and actual plant responses is minimized. Since the error signal is the input to the feedback controller, the training of the network will lead to a gradual switching from feedback to feedforward action as the error signal becomes small. In this paper, we consider implementing only the feedforward controller as a neural network. During training, features of the plant that are initially unknown and not taken into account by the control algorithm are learned. In this manner, some of the model uncertainty is eliminated and, thus, improved control results. In other words, the feedforward controller is modified to compensate for the characteristics of the plant that are discovered during learning. When the error becomes small, training has been accomplished and only a relatively small feedback signal is necessary to compensate for random uncertainties that are unpredictable and, thus, cannot be learned. An immediate consequence of the increased use of feedforward control action is to speed up the response of the system.

In the remainder of this paper, we consider specific methods for training neural networks to minimize the error signal. In following sections, we propose three control learning methods, describe the error back propagation algorithm [16], which is the method used here to adapt the weights in the neural networks we use as controllers, and introduce a modification of the error back propagation algorithm that extends its utility to problems where the error signal used to train the network is not the error measured at the output of the network. We also present simulations for a very simple plant to demonstrate the operation of the proposed architectures and training methods.

Presented at the 1987 IEEE International Conference on Neural Networks, San Diego, California, June 21-24, 1987. Demetri Psaltis, Athanasios Sideris, and Alan A. Yamamura are with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125. This research was funded by DARPA and AFOSR and also in part by the Caltech President's Fund. Alan A. Yamamura was supported with a Fellowship from the Hertz Foundation.

Learning Control Architectures

Figure 2 shows the feedforward controller, implemented as a neural network architecture, with its output $u$ driving the plant. The desired response of the plant is denoted …
Authorized licensed use limited to: Birla Institute of Technology and Science. Downloaded on April 07,2022 at 12:54:23 UTC from IEEE Xplore. Restrictions apply.
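The gradual hand-off from feedback to feedforward action described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: a scalar plant of unknown gain, a single learnable weight `w` standing in for the whole feedforward network, and a fixed proportional feedback gain `kp`. As `w` is tuned, the residual error, and with it the feedback contribution `kp * e`, shrinks toward zero.

```python
# Hypothetical scalar plant y = P(u); the true gain (2.0) plays the role
# of plant characteristics discovered only through learning.
def plant(u):
    return 2.0 * u

w = 0.1    # feedforward weight (the ideal value here is 0.5)
kp = 0.4   # fixed proportional feedback gain
lr = 0.05  # learning rate for tuning w
d = 1.0    # desired plant response
e = d      # last measured error (start with no output at all)

for _ in range(300):
    u = w * d + kp * e     # total control = feedforward + feedback correction
    y = plant(u)
    e = d - y              # the error drives both feedback and learning
    w += lr * e * 2.0 * d  # gradient step on e^2 (true dP/du = 2 used for simplicity)

print(abs(e))  # residual error, and hence the feedback action, is tiny
```

After training, almost all of the control action flows through the feedforward path, which is the speed-up the text attributes to increased feedforward action.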
[Figure: learning architecture with an additional network]
Fig. 5. Layered neural net: $o_q^m = f(p_q^m)$, $p_q^m = \sum_i w_{qi}^m o_i^{m-1}$.

… objective of minimizing the training error, $e = t - o$, between the actual output, $o$, and the target output, $t$, of the network.

Indirect Learning and General Learning Network Training

Training the first two proposed architectures is accomplished by error back propagation, a widely used algorithm for training feedforward neural networks. Error back propagation attempts to minimize the squared error for training sample input-output pairs; for every training sample, it modifies each weight using the partial derivative of the squared error with respect to that weight. Since error back propagation adjusts weights in the local direction of greatest error reduction, it is a gradient descent algorithm. For the network in Fig. 5, error back propagation would modify the weight, $w_{ab}^m$, between neurons $a$ and $b$ of the $m$th and $(m-1)$st layers, respectively, as follows: …

Specialized Learning Network Training

We cannot apply error back propagation directly to the specialized learning architecture because of the location of the plant. Referring to Fig. 4, the plant can be thought of as an additional, although unmodifiable, layer. Then the total error, $e = d - y$, is propagated back through the plant using the partial derivatives of the plant at its operating point:

$\delta_j = \sum_i (d_i - y_i) \frac{\partial P_i(u)}{\partial u_j}$

where $P_i(u)$ denotes the $i$th element of the plant output for plant input $u$. The previously described error back propagation algorithm can then be applied. Therefore, error back propagation through the plant again amounts to a gradient descent search for weight combinations that minimize the true total error.

If the plant is a function of unknown form, we can approximate its partial derivatives as

$\frac{\partial P_i}{\partial u_j} \approx \frac{P_i(u + \delta u_j \, e_j) - P_i(u)}{\delta u_j}$

where $e_j$ is the $j$th unit vector.

… tendency to create better initial weights for specialized training. Thus, starting with general training can speed the learning process by reducing the number of iterations of the ensuing specialized training. Another advantage of preliminary general learning is that it may result in networks that can adapt more easily if the operating points of the system change or new ones are added.

The distinction between general and specialized learning arises from the fact that different error functions are minimized. As a result, the general and specialized learning procedures will follow different paths to minima. Intuitively, we expect the general learning procedure to produce a network that approximates the inverse of the plant better over the entire state space but not as well in the regions of specialization as one produced by the specialized procedure. By adopting the strategy of switching back and forth between the two training methods, we can sometimes get out of local minima of one method by training with the other. Specifically, by performing general learning prior to specialized learning, we generally provide a better initial condition to the specialized procedure in that it has lower initial error. Switching from one training method to the other can also result in sudden changes of direction in weight space. This is clearly evident in the simulations presented in the next section.

Simulation Example

We considered as a simple example a plant that converts polar coordinates $(r, \theta)$ to Cartesian coordinates $(x, y)$. The control network should convert Cartesian to polar coordinates. Desirable characteristics of this system include a well-behaved plant and a simple mathematical form for the desired network so that the performance of the neural …
April 1988
Authorized licensed use limited to: Birla Institute of Technology and Science. Downloaded on April 07,2022 at 12:54:23 UTC from IEEE Xplore. Restrictions apply.
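Propagating the error back through a plant of unknown form needs only the finite-difference estimate of the plant's partial derivatives. A minimal sketch, using the polar-to-Cartesian plant of the simulation example as a stand-in; the perturbation size `du` is an assumption.

```python
import numpy as np

def plant(u):
    # Example plant: polar (r, theta) -> Cartesian (x, y).
    r, th = u
    return np.array([r * np.cos(th), r * np.sin(th)])

def plant_jacobian(P, u, du=1e-5):
    """Estimate dP_i/du_j by forward differences, one input at a time:
    dP_i/du_j ~ (P_i(u + du*e_j) - P_i(u)) / du."""
    y0 = P(u)
    J = np.zeros((y0.size, u.size))
    for j in range(u.size):
        up = u.copy()
        up[j] += du
        J[:, j] = (P(up) - y0) / du
    return J

u = np.array([1.0, 0.5])     # current plant input (operating point)
d = np.array([0.6, 0.7])     # desired plant output
e = d - plant(u)             # error measured after the plant
J = plant_jacobian(plant, u)
delta_u = J.T @ e            # delta_j = sum_i (d_i - y_i) dP_i/du_j
# delta_u is then back-propagated through the controller network
# exactly as an ordinary output-layer error would be.
print(delta_u)
```

Treating the plant as an extra, unmodifiable layer means the Jacobian-transpose product above plays the same role as `W.T @ delta` does inside the network.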
… the input angle $\theta$ is between 0 and 90 deg so that the input values lie in a circular wedge in the first quadrant. The training samples were chosen as a 10-point grid spanning the known input space. Every training sample was presented once in each iteration of generalized learning. We used an acceleration constant of 0.1.

The diagrams in Fig. 6 are contour plots of $e^2 = \|t - o\|^2$ plotted as a function of the inputs to the network, $x$ and $y$. The points marked by a circled "g" are the training samples used for generalized training, whereas the three rectangular regions were selected for specialized learning. Figure 6(a) shows the error contour before training with the weights chosen randomly. Figure 6(b) is the error contour after 1000 iterations of general learning. The error has been suppressed at and around the training points. Increasing the number of iterations reduces the error at the training points, but, in general, we observe regions in which the error increases as the iterations increase. Since, with this training method, we cannot place the training samples in the regions of interest, we cannot guarantee what the error will be in these regions.

Specialized Learning

For each of the three specialization regions, we chose nine points spanning the domain of specialization. We then trained the network to specialize in each region alone, starting with the weight matrices after a varying number of iterations of general learning. We used an acceleration constant of 0.01.

Figure 7 is a plot of $e^2 = \|t - o\|^2$, averaged over the second region of specialization, versus the number of total iterations. Three separate curves are superimposed on the same diagram. The solid curve is for general learning only, the dashed curve for specialized only, and the dash-dot curve for 10 iterations of general followed by specialized training. In this example, there seems to be a definite advantage to the hybrid learning method. However, in our simulations, with this simple plant, we were unable to determine conditions under which we could consistently observe improvement by performing general learning prior to specialized learning. An important topic for future research is to find ways to determine the properties of the plants that will allow us to specify the appropriate sequence of general and specialized learning.

Conclusions

We have begun to explore the idea of using neural networks for controlling physical systems. Specifically, we proposed three different methods for using error back propagation to train a feedforward neural network controller to act as the inverse of the plant. The general learning method attempts to produce the inverse of the plant over the entire state space, but it can be very difficult to use it alone to provide adequate performance in a practical control application. In order to circumvent this problem, we introduced the method of error propagation backwards through the plant, which allows us to train the network exactly on the operational range of the plant. Finally, we proposed using generalized training in conjunction with specialized training to gain their advantages and to avoid their potential disadvantages.
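As a concrete companion to the simulation example above, here is a minimal general-learning sketch: sample the plant $P(r, \theta) = (r\cos\theta, r\sin\theta)$ on a grid over the wedge and train a small network on the inverted pairs, so the network learns the Cartesian-to-polar map. The grid, network size, learning rate, and iteration count are our assumptions, not the exact values behind the paper's figures.

```python
import numpy as np

rng = np.random.default_rng(1)

def plant(r, th):
    # Plant: polar (r, theta) -> Cartesian (x, y).
    return r * np.cos(th), r * np.sin(th)

# Training grid over a wedge r in [0.1, 1], theta in [0, pi/2] (assumed).
rs = np.linspace(0.1, 1.0, 10)
ths = np.linspace(0.0, np.pi / 2, 10)
samples = [(r, th) for r in rs for th in ths]

f, df = np.tanh, lambda p: 1.0 - np.tanh(p) ** 2
W1 = rng.normal(0, 0.5, (16, 2))   # input (x, y) -> hidden
W2 = rng.normal(0, 0.5, (2, 16))   # hidden -> output (r, theta), linear
lr = 0.05

for it in range(500):
    for r, th in samples:
        x, y = plant(r, th)            # plant output becomes the net input
        inp = np.array([x, y])
        tgt = np.array([r, th])        # general learning: recover (r, theta)
        p1 = W1 @ inp
        o1 = f(p1)
        out = W2 @ o1                  # linear output layer
        d2 = tgt - out
        d1 = (W2.T @ d2) * df(p1)
        W2 += lr * np.outer(d2, o1)
        W1 += lr * np.outer(d1, inp)

# Check the learned inverse at one interior point of the wedge.
x, y = plant(0.5, 0.7)
r_hat, th_hat = W2 @ f(W1 @ np.array([x, y]))
print(float(r_hat), float(th_hat))
```

Because the samples only cover the grid, the error away from the training points is not controlled, which is exactly the limitation of general learning that motivates the specialized procedure.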
[8] A. Guez, J. Eilbert, and M. Kam, "Neuromorphic Architecture for Fast Adaptive Robot Control," presented at 1987 IEEE International Conference on Neural Networks.
[9] M. Kawato, K. Furukawa, and R. Suzuki, "A Hierarchical Neural-Network Model for Control and Learning of Voluntary Movement," Biol. Cybern., vol. 57, pp. 169-185, Feb. 1987.
[10] A. J. Pellionisz, "Tensor Network Theory and Its Applications in Computer Modeling of the Metaorganization of Sensorimotor Hierarchies of Gaze," in AIP Conf. Proc. Neural Networks for Computing, American Institute of Physics, New York, pp. 339-344, 1986.
[11] A. Pellionisz, "Sensorimotor Operations: A Ground for the Co-Evolution of Brain Theory with Neurobotics and Neurocomputers," presented at 1987 IEEE International Conference on Neural Networks.
[12] D. Psaltis, A. Sideris, and A. Yamamura, "Neural Controllers," presented at 1987 IEEE International Conference on Neural Networks.
[13] D. E. Rumelhart, presentation at the California Institute of Technology, Spring 1987.
[14] A. Sideris, A. Yamamura, and D. Psaltis, "Dynamical Neural Networks and Their Application to Robot Control," presented at the IEEE Conference on Neural Information Processing Systems, Natural and Synthetic, 1987.
[15] K. Tsutsumi and H. Matsumoto, "Neural Computation and Learning Strategy for Manipulator Position Control," presented at 1987 IEEE International Conference on Neural Networks.
[16] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal Representations by Error Propagation," Chapter 8 in Parallel Distributed Processing, vol. 1, Cambridge, MA: MIT Press, 1986.

Demetri Psaltis received the B.Sc. degree in electrical engineering and economics in 1974 and the M.Sc. and Ph.D. degrees in electrical engineering in 1975 and 1977, respectively, all from Carnegie-Mellon University, Pittsburgh, Pennsylvania. After the completion of the Ph.D., he remained at Carnegie-Mellon for a period of three years as a Research Associate and later as a Visiting Assistant Professor. In 1980, he joined the faculty of the Electrical Engineering Department at the California Institute of Technology, Pasadena, where he is now Associate Professor and consultant to industry. His research interests are in the areas of optical information processing, acousto-optics, image processing, pattern recognition, neural network models of computation, and optical devices. He has over 130 technical publications in these areas. Dr. Psaltis is a Fellow of the Optical Society of America and Vice President of the International Neural Networks Society.

Athanasios Sideris received the diploma in electrical engineering from the National Technical University of Athens, Athens, Greece, in 1980, the M.S. degree in mathematics in 1986, and the M.S. and Ph.D. degrees in electrical engineering in 1981 and 1985, respectively, from the University of Southern California. From September 1980 until May 1986, he was a Teaching/Research Assistant and, from September 1985 until December 1986, a Lecturer with the Electrical Engineering Department at the University of Southern California. Since June 1986, he has been at the Department of Electrical Engineering at the California Institute of Technology, where he is an Assistant Professor. His general area of interest is control and systems theory; his current research interests include robust control theory, system optimization, and, in particular, $L_1$ optimal control, nonlinear control systems, and applications of neural networks in control.

Alan A. Yamamura was born in Hampton, Virginia, in 1965. He received the S.B. degree in electrical engineering, the S.M. degree in electrical engineering and computer science, and the S.B. degree in physics from the Massachusetts Institute of Technology, Cambridge, Massachusetts, in 1986. He is currently a graduate student in the Department of Electrical Engineering at the California Institute of Technology, Pasadena, California.