You are on page 1of 10




ac-28, NO. 2.



Adaptive Linear Controller for Robotic


Absbuct-This paper presentsanewapproach

to thepositionand
optimum control theory to the manipulator control. For
velocitycontrol of amanipulatorbyusing
an adaptivecontrollerofthe
example, [4] and [ 5 ] investigate the minimum time solution
manipulator system is
self-tuning type foreachjoint.Thecomplicated
along a specified path.The minimumtimeproblem
modeled by a set of time series difference equations. The
parameters of the
solved in [4] using a linearized continuous-time two-dimenmodels are determined by on-line recursive algorithms, which result from
minimizing the sum of the squared equation errors. The adaptive controller sional model which represents a gross simplification. The
of each joint is designed on the basis of the difference equation model and same problem is studied in [ 5 ] using linear programming.
a chosen performance criterion. The controller gains are calculated on-line Both reports [4], [5] present simulation studies. but the
using the model with the estimated values of system parameters. Simulapractical use of the solution algorithms is not discussed in
tion results are presented to demonstrate the applicabilityof the approach.
[4]. and [SI. The use of variable structure theory in controlSome aspects of the implementation arealso discussed.

ling a manipulator has been proposed in [6]. It suffers from

the typicalwell-knowndrawback
of the minimumtime
solution: the exact determination of the switchmg instances
for the control input is difficult. The practical implementation of the solution is not presented in [6].
A table look-up for the inputs is constructed in [7] to
control a manipulator. The realization of this control
schemerequires tremendous memory, particularly when
the path andthe object size change. Moreover.the optimality of the performance is not evaluated. A different approach is presented in [SI by solving the eigenvalue assignment problem for the linearized model. This scheme is not
based on the minimization of a performance functional.
However. it represents a solution which can be realized if
the linearized model is sufficiently accurate.
The optimal solution to the LPQ problem is applied to
the manipulator control in [9] usingpiecewise constant
controls and a second-order linear model for a single joint.
The solution to a nonlinear optimum control problem is
approximated in [ 101 by forming a sequence of control laws
on the basis of the dynamicprogramming
Another interesting control scheme is proposed in [ 111 by
introducing a nonlinear feedback loop into thesystem such
that the nonlinear termsin the manipulator model are
cancelled; then a controller is constructed for the linear
model. The schemeassumes that the cancellation of the
nonlinear terms in the model is exact, whch hardly can be
achieved in practice.
An adaptive control scheme using a model reference is
proposed in [12]. The step response of a second-order
underdamped model is used as a reference for the output
of each joint. The gains in the position and velocity feedManuscript received April 2, 1981: revised April 29. 1982 and June 24, back loops are adjusted so that the difference between the
1982. Paperrecommended
by A. K. Bejng. Past Chairman of the
output of the actual system and that of the referenceis
Automation and Robotics Committee. This work was supported in part
by the National Science Foundation under Grant ENG 78-22192.
minimized. The gains are adjusted by the method of steep
The authors are withtheSchool
of ElectricalEngineering. Purdue
descent of the performance criterion in the parametric
University, West Lafayette, N 47907.

N industrial manipulator may be controlled remotely

by a human-computer interaction in applications
such as in nuclear reactors, in space, and in underwater
explorations. It can also be directly controlled by a computer, whch can be used to automate the motion control
fast and accurately. The dynamic control of an industrial
manipulator involves the determination of the inputs
(torques or voltages) for the actuators whch operate at the
joints so that a set of desired values for the positions and
velocities of the manipulator is achieved. One of the problems in the control of a manipulatoris that the operating
conditions may change; for example, the effective inertia of
an object being moved usually changes along the trajectory
of the positions of the joints and time. T h s complicates the
operation of a controller which has been designed in advance.
During the past decade, manyschemes for thedirect
digital control of a manipulator have been proposed. Most
of the approaches are based on a mathematical model
derived on the basis of Newtonian mechanics. The control
schemes suggested include classicalcontrollers (e.g.. [ 11-[3])
as well as optimum controllers (e&. [4], [6]. [IO]). Among
the classicaldesign approaches, thecalculation
of the
torques for a nominal trajectory is presented in 111. This
control scheme requires a considerable amount of calculations and memory storage: it has been tested in practice.
Another of the early proposed techniques is the resolved
motion rate control [3] in which the joint angle rates are
computed so as to cause the endpoint of the manipulator
to move in a definite direction. More recent studies apply

0018-9286/83/0200-0162$01.00 61983 IEEE




where 8, and 8 are six-dimensional vectors signifying the

joint position, velocity, and acceleration, respectively. D( e),
a (6 X 6) symmetric matrix, includes the accelerationrelated coefficients of the joints and the effects of link
inertia. Q ( 8 ,B), a six-dimensional vector, signifies Coriolis
and centrifugal torques. G ( 8 ) , a six-dimensionalvector,
represents the torques due to the gravity. The six-dimensional vector u ( t ) is the system input, and F is a diagonal
(scaling) matrix. The expressions of D ( 8 ) , Q ( 8 , e ) , and
G ( 8 ) contain trigonometric functions. A digital simulation
due to a large number of
of (1) isverytimeconsuming
mathematical operations. Moreover, the parameter values
vary from task to task. These aspects are more pronounced
when the solution to an optimumcontrol problem (i.e., to a
Fig. 1. A six-joint manipulator.
two-point boundary value problem) needsto be calculated.
When the optimum control theory has been applied to
(gain) space. The couplingtermsbetween the joints are
the construction of controllers, a common simplification of
neglected; thus, the dynamic model of the manipulator is the model (1) is to assume that the coupling terms due to
considerably simplified. Simulation studies of [ 121 demon- the other joints can be neglected [4], [6], [8],[ 121, [ 131; such
strate the feasibility of the approach. In the model-refer- a control on a joint is called in [8] independent joint
ence control scheme, the complicated model of a manipula- control. In addition, the model is usually linearized. It is
tor based on Newtonian mechanics is not directly being
accomplished in [6] and [ l l ] by introducing nonlinear
used in the design, which makes the approach drastically terms in the feedback controller, which render the model
different from the other aforementioned methods.
linear. The linearization in [4]isperformed
about the
In this paper, we propose autoregressivetimeseries
desired final state, and not about a nominal trajectory;
models for the motion of the joints of a manipulator. The such a linearization is likely to restrict the validity of the
model parameters are obtained by using the data of the linearized model.
measured input-output pairs. Then, the autoregressive
The modelsused in the controller design[I]-[12] are
model is used to determine the controller parameters on
continuous-time (analog) models such as (1) or are derived
line so that a chosen performance criterion is minimized. from (1) by various simplifying assumptions. An alternaThus, the parameters areadjusted using the explicit method tive approach is to discretize the model given in (l), e.g., by
[ 141, [ 181.
Eulers method,and then to design a controller for the
This paper first presents a commonly used mathematical discrete-time model describing the (gross) dynamics of the
model for a manipulator motion. Difficulties encountered manipulator. Interestingly, a continuous-time PID conin its use and simplifications of the model are discussed. A troller is realized in [ 131 using difference equations.
general autoregressive model is then presented. The design
If the model in (1) is first properly linearized about a
of an adaptive controller for each joint of a manipulator is nominal trajectory and then discretized by Eulers method,
outlined. The proposed control scheme is then applied to a multivariable discrete-time model is obtained in the form
the control of a robotic manipulator. Digital simulations of
the system are presented. Some aspects of the implementation are discussed.


A common industrial manipulator consists of six joints,

seven links, and a gripper (end piece).Fig. 1 shows a
Stanford arm with five rotational (revolute) joints and a
translational (prismatic) joint [2]. Each joint is driven by a
permanent magnet motor; the motor shafts are connected
to encoders/potentiometers for sensing the positions, and
to tachometers for sensing the velocities.
The equations for the motion of the manipulator may be
developed by the direct application of the classical (Newtonian) mechanics [l], [2]. For a manipulator with six joints,
the mathematical model may be written in the joint coordinate system as follows:



where A i and Bi,i = 1,2 are (6x6) matrices, uo is the

six-dimensional vector, and the vector e ( represents modeling errors. Experiments seem to indicate that the choice
of the sampling rate should be no less than 60 Hz ( T G 16
a sufficiently smooth control formost
motions. The direct use of (2) is difficult because the
calculation of the model parameters in the coefficient
matrices is a tedious task.
Motivated by the form of (2), an autoregressive model is
proposed here to model the motion of a manipulator and
to design a controller for the system. Such a model will be
determinedon the basis of the measured input-output
data of a manipulator system. The parametersof the model
will be determined so that the best fit of the model to the
a )




input-output data will be obtained in the least-squares

error sense. The approach is particularly suited to control
the motion of a manipulator in repetitive tasks.

A multivariable difference equation model is suggested
by (2) for the motion of a robotic manipulator. As is well
known, the parameters in such an assumed model can be
estimated on line using recursive equations. They can be
obtained by minimizing the sum of the squared (equation)
errors. The resulting equations for the algorithm are reviewed here [15], [16]. Then the algorithm for an adaptive
model with the estimated parameters.
The autoregressive model is assumed to have the same
number of inputsand outputs. The multivariabledifference equation will be written in the following general




The algorithm for the parameter estimation is constructed so that one vector at a time is estimated in 8 of (6).
The error criterion ischosen for eachvector ei of 8 as

where ( N + n ) represents the number of the measurements

and e,(k) is the ith component of the e vector. The
problem is to minimize E(8,) relative to the parameter
vector 8,.
The solution to the least-squares problem is furnished by
the following recursive equations [ 151, [ 171 where the caret
refers to estimated values:


where the sampling period has been omitted in the arguments; the argument k refers to the sampling instant.
k = 0,1,2,. . .,; d is a positive integer specifying the time
delay as an integer multiple of the sampling period. The
constant m-dimensional vector h is a forcing term. which
includes the effects of the gravitationalforces. The rndimensional output vector y and the input vector u have as
the ith component the output y I and the input u, of joint i,
respectively, and i = 1,. . .,m.The equation error vector
e( - ) represents a random, zero-mean, Gaussian white noise
with the covariance R . The argument q-l is a backward
shift operator, i.e., q-'y( k ) = y(k - 1). The ( m X m)
matrices A and B are polynomials defined by

+ . . . + A,q-"
Bo + B1q-' + . . . + B,plq-n+l

Equation (3) can be rewritten as follows:

A ( q - 1 ) = A&




where n is a positiveintegerspecifying the order of the

model. Bo isnonsingular, and det B ( q - ' ) has allzeros
strictly outside the unit circle.
To estimate the parameters in (3), the matrix 8 and the
vector 9 are defied:


where 0 p 5 1 is a forgetting factor, which can be used to

account for an exponential decaying of past data in tracking a slow drift in the parameters. P ( . ) is a (2n + 1)x(2n
+ 1) symmetric matrix.
The parameter estimates will be calculated on line using
(1 1) and (12). The estimated parameters are then used in
the model (9) to determine the feedback gains andthe
control u ( k ) .

The performance criterion for the system is chosen as

where l l - l l R indicates the norm with weight R I i.e., Ilulli =

u'Ru. and R is a positive semidefinite symmetric matrix; Q
e = [ AI , . . . , A , ; Bo,...,B, - l ; h ] F = [ e l , . . . . e , , ] (6) is a positive definite symmetric weightingmatrix; and yd( .)
where the superscript denotes the transposition. Then for describes the desired path vector as a sequence of discrete
points. The expectation operation isconditioned onthe
i = 1,. . .,m,
available measurements up to and including time k - 1.
The admissible controls u(k) which can be considered in
e,= [ aI, , : ~ ~ , a , ! , . a , : ; ~ ~ , a ~ * ; ~ ~ , ;
[ are the ones that are functions of
the minimization of I ku]
bp,;..,bp,,;...b:,-' ,....b:,-';h,]'
(7) the measurements y( r - 7 ) for 7 2 1 and the past controls.
The problemis to minimizetheperformance criterion
@(k-l)= [yF(k-l) ;..,yr(k-n);
(13) relative to the admissible controls while satisfying the
u F ( k - d ) ; . . , ~ ' ( k - d - n + 1 ) ; 1 ] . following constraint equation:





considerably simpler than in the general case presented in

the previous section.
An autoregressivemodel for the single input-output
relation of joint i can be obtained from (3)-(7) by setting
m = 1 (independent joint control); it assumes the following

1 ' 1








+ h i + ei(k).


Fig. 2. Block diagram of the adaptively controlled manipulator system.

Equation (1 8) may be rewritten more concisely as follows:

The problem is solved by expressing the predicted value
d ) in terms of the available information up to time
k - 1, substituting it into equation (13), and performingthe
minimization of I k [ u ] with respect to u ( k ) . The control
which minimizes (13) is determined by the following equation [ 151, [ 161:


y i ( k )=8:+,(k - l ) + e i ( k )


8,= [ai,;..,aYl; b ~ , - - - 7 b ~ - ' ; h j ] T


+ , ( k - l ) = [yi(k-l);..,yi(k-n);


ui(k - d ) ; . ,,u,(k - d - n

where P(k d / k - 1) denotes the optimal predicted value

(in the sense of least-squares error) of y ( k + d ) given the
past measurements and controls up to and including the
time ( k - 1). The r-step ahead predictedvaluemaybe
the prediction equations obtained from (14) for r = 0,l; - , d :


+ 1); 117-.

The algorithm for calculating the least-squares estimate

the parameter vector 8, for the joint i model is based
on (1 1) and (12). Equation (14) with the estimated parameter valueswill then be used to determinean adaptive
controller for joint i. For each joint i, performance crite);(k+rlk-1)=8T(k-l)c$(k-1+rlk-1)
(16) rion I,k(u) of the form shown in (13) is minimized with
respect to admissible controls. The resulting equation (15)
is the basis of the control algorithm. A numerical example
is next presented to illustrate the approach.
NumericalExample I : The set of nonlinear coupled
differential equations in (1)wasused
in simulating the
dynamics of a manipulator. The numerical values of the
parameters in these equations were those of the JPL arm
[2]. The maximum speed is about 98 in/s for the gripper of
the manipulator along the trajectory. The measurements
The algorithms for the adaptive control scheme consist of
were generated by digital simulation of (1) using 0.0025 s
(1 1) and (12) for the parameter estimation and (14)-(17)
for the subinterval of integration. These values were superfor computing the controller gains. The block diagram of
imposedby sample values from a white Gaussian noise
the manipulator system with the controller is shown in Fig.
process with zero mean and variance u2. In this example, u
2. Two case studies will next be presented to illustrate the
was chosen as 0.005 rad for the rotational movement and
applicability of the control scheme.
0.05 in for the translational motion.
An adaptive controller is designed such that the average
Case Study 1
of the mean-squared errors and the control energy appropriately weighted is minimized, wherethe errors refer to
Separate Joint Control (SISO): The problem is to design the deviations of the actual values of the motion from the
an adaptive controller for each joint so as to make the desired values. Then, the resulting controller is to be tested
angular motion of each joint follow a desired path speci- by digital simulation.
fiedby discrete points. It is assumed that interactions
For the system, the input is the voltage ui applied to the
between the joints are small. Thus, coupling terms in the joint motor, and the output is the velocity ui of the joint.
autoregressive modelof the manipulator system are omitted, On the basis of the simulation studies, the autoregressive
and each joint will becontrolled independently of the model was chosen for n = 2, d = 1, and a,' = 0, i.e.,
The problemwillbesolvedby
applying the explicit u , ( k )= a?u,(k -2)+bOui(k - 1)
method of designing an adaptive controller for each joint.
+ b f u i ( k - 2 ) + h , + e i ( k ) (22)
In this case, the equations for the control algorithms are

di of



where the parameter 0, = [a:, by,b:; hi]= istobedetermined. The performance index is chosen as follows:



+ ~ i [ u i ( k ) ] ' / ~ , ( k - 1 ) ) (23)
where E{ * /+i( k - 1)) denotes the expected value given the
measurement Gi(k - 1). The desired trajectory is specified
at discrete points by uf*(k + 1) = of(k + 1)+ ci[y,d(k- 1)
- y , ( k - l)]/T where u f ( k 1) is the desired velocity at
step ( k + 1) and the last term with constant ci accounts for
the correction in the error of the position y,( .) in the
sampling interval k - 1.
The input u , ( k ) is selected so as to minimize the performance index in (23):

u i ( k )=

( 6:)2 + E i

- i ; ~ ~ ( k - l ) - i , ] . (24)
The control algorithm can then be written on the basis
of (24).
The described controller was tested by digital simulation
studies. The desired path specified by discrete points was
1) a straight line, and 2) a circle in the
Cartesian coordinate system. The control scheme was programmed on CDC 6500. Due to the D/A conversion, a
constraint lui(k)l < 40 V was imposed. The constant ci is
-1.m I
chosen as 0.5. The weighting factor ci isset equal to
.am .1m .m .w
[by(k - l)] ', which varies with time; it provides a relative
weighting between the control u , ( k ) and the error [ ui(k +
1)- uf*(k + l)]. The forgetting factor p in (12) is set equA Fig. 3. (a) Trajectory of joint 3 in joint coordinates: Example I , 1). (b)
Trajectory of joint 4 injoint coordinates: Example1. 1).
to 0.95 for every joint. The initial values of the parameters
are 8 =1, &'= h: = 0, hi= 0, and Pi(0)= lo4 I . The
parameters are estimated using ( 1 1) and (12), and the
control is computed as in (24).
I ) The Desired Path a Straight Line: The task is to move
the end effector (hand) from point P1 to point P 2 in 0.6 s.
The path of the gripper is a straight line in the Cartesian
(space) coordinate system. It is desired that the speed of
the end effector follows an isosceles triangle at a constant
acceleration (deceleration) starting from the zero initial
speed until it reaches a maximum speed in 0.33 s, and then
decelerates to zerospeed at the terminal state P2. The
numerical values chosen for P1 and P 2 were P 1 = (42.5,
2.4,23.2) and P 2 = (- 12.3, - 10.5, 20.0)expressedin
inches; thedesired rotational motion is 84" for the end
5.m I
The simulation results for the system with the
Fig. 4. Gripperpositioninspacecoordinates:Example I , 1).
self-tuning type controller are shown in Fig. 3(a), (b) in the
joint coordinates, and the trajectory for the gripper in the
space coordinates in Fig. 4. The estimate of by is graphed
in Fig. 5.
2) TheDesired Path a Circle: The secondtask of the
gripper is to follow a circlewith a diameter of25
centered at point (-30, 8.5, 20 in).whilekeeping
desired angular speed constant. The plane of the circle is
specified in the Cartesian coordinate system by x =
Fig. 5. Estimatedparameter by for joint I ; Example 1, 1).
-30 in. The hand starts at the point (-30. 8.5. 32.5 in)






















Fig. 6. (a) Trajectory of joint 3 in joint coordinates; Example 1, 2). @)

Trajectory of joint 4 in joint coordinates;Example 1,2).


a .m

ments. Our other simulations have shown similar responses

for the manipulator system with the controller when different weights (0-4 lb) are moved along the same desired
path. Simulation results for different loads and repetitive
motions are not shown here because they looked similar to
the ones in Figs. 3 and 6 .
Our simulations indicate that the system performs well
when the separate joint control is applied, even at high
speeds. The main advantage of a separate joint control is
that it is simple. Moreover, the controller can be implemented using microprocessors or a small computer.




Case Study 2







Adaptive Controller for Interacting Joints (MIMO): The

task here is to determine an adaptive control scheme for
Fig. 7. Gripper position in space coordinates; Example 1 2).
the manipulator jointssuch that the motion of the manipulator jointsfollow a desired path asclosely as possible. The
desired trajectory to be followed is described at discrete
points by a six-dimensional vector y d ( k ) ,the components
of which are determined as described in Example 1.
The problem is solved using a multivariable autoregressive model of (3) for the motion of the joints. In order to
.lcm 1'
restrict the complexity of the multivariable model, the
variables of the model may be decomposedinto two groups
Fig. 8. Estimated parameter by for joint I ; Example I , 2).
on the basis of the mechanical structure of the manipulator. The first three joints with appropriate links are mainly
with zero speed. The trajectory for the orientation rotates associated with the positional movement, and the last three
joints representing the skeleton for the hand with the
90" during the 0.8 s period.
The graphs describing the behavior of the system with orientation. The coupling terms between the variables of
the adaptive controller are computed; the simulation re- the two groups are neglected. The termdescribing the
sults are shown in Fig. 6(a), (b) in the joint coordinates, interactions between the variables of each group are inand in Fig. 7 in the space coordinates. The estimate of by is cluded. Thus, the matrices Aj in (4) are written as
graphed in Fig. 8.
In both cases, the adaptability of the controller can be
observed, particularly at the high speed of the manipulator
system. Initially, the system response oscillates whiletrying
to follow the desired value, and then settles down. More- wherej = 1; * , nand A,, and A j , are both 3 x3 matrices
over, the graphs demonstrate how the system learns the with nine unknown entries to be estimated. For example, if
values of the parameters along the path by on-line adjust- the output is the velocity v ( -), the vector is decomposed as
Y - M I S (INCH)




2, FEBRUARY 1983

u ( k ) = [ u , ( k ) ,u2(k),u,(k) j u4(k), u , ( k ) , u6(k)iT. The

matrices Bj in ( 5 ) can be chosen to be diagonal in order to
avoid the redundancy [17].
The autoregressive model was chosenso that n = 2, d = 1
on the basis of simulation studies. The model was assumed
in the following form:


where the constant vector h = [ h , , h,; .,,%,IT accounts for

the gravitational forces; u ( k - 1) = [ u l ( k - l), u,(k l), .,u6(k - I)]= represents the voltages applied to the
motors of the joints; e ( k )= [ e , ( k ) ,e 2 ( k ) ; .,e6(k)lTis
the equation error vector; and each ei(k ) , i = 1,. . .,6 represents a white Gaussian zero-mean noise process which is
independent of u ( k - m ) , u(k - rn) for rn 2 1 and e j ( k )
for j * i. The matrices A , , Bo, and B , are described in
conjunction with expression (25). The matrix A , in (4) was
selected as a null matrix to simplify calculations.
Equation (26) may be written as (9) where 8 and @( k - 1)
are defined as follows:
p 2 1


I b:,




h l p


+(k-1)= [u,(k-2);..,u,(k-2),
u l ( k - 1);


. , U 6 ( k- 1);





Fig. 9. (a) Trajectory of joint 3 in joint coordinates; Example 2, 1) (b)

Trajectory of joint 4 in joint coordinates; Example2, 1).

u , ( k - 2 ) ; . . , u g ( k - 2 )(;218I)T .

The dimension of 8 is (19 X 6), which contains 36 parameters to be estimated.

Theperformance criterion ischosen in the following
I k ( U ) = E{llu(k


+ l)-ud*(k + 1)11;




+llu(k)ll2R/+@ - 1))


E' am

where d*(k 1) = [ u;'*( k l), . . .,ut*( k l)] and each

up* is computed as described in example 1; R is chosen
~ ~ 2is, a diagonal matrix; and Q = I .
equal to ~ \ ~ owhich
By minimizing (29), the controller algorithm will become





Equations (11) and (12) can the be programmed for on-line

computations of the parameter estimate b and (30) for the
on-line computation of the controls.
Numerical Example 2: The simulation of the dynamics
of a manipulator was performed with the same numerical
values as in Example 1. The problem was also posed as in
Example 1.







Fig. 10. Gripper position in spacecoordinates;Example 2, 1).

The solution to the control problem was obtained using

the multivariable model given by (26)-(28). In the parameter vector 8, the six unknowns in each rowof (27) were
estimated as a group. The initial conditions for the estimation are A^,= I , Bo = h, = 0, h= 0, and P(0) = lo4 I. The
values of the controller were obtained from (30). The




d 23.03N







6 . ~17.50




Fig. 12. Gripper position in space coordinates; Example 2, 2).











Fig. 11. (a) Trajectory of joint 3 in joint coordinates;Example 2, 2). (b)

Trajectory of joint 4 in joint coordinates;Example 2, 2).

desired trajectory wasspecified at discrete points in the

Cartesian coordinate system.
1) Desired Path a Straight Line: For a straight line, the
simulations resulted in the trajectories which are graphed
in Fig. 9 for joints 3 and 4 in the joint coordinate system.
The trajectory of the gripper is shown in Fig.10 in the
Cartesian coordinate system.
2) The DesiredPatha Circle: The desired path is the
same circleas in part 2) of Example 1. The simulation
results are shown in Fig. 11 for the joint coordinates, and
in Fig. 12 for the Cartesian space coordinates.
Autoregressive difference equations are introduced here
to model the motion of the joints in a robotic manipulator.
The inputs in the model are the motor voltages, and the
outputs are selected as the velocities or the positions of the
manipulator joints. It is also possible to select the output
variables in the Cartesian coordinate system; thus, the
model can include the transformation between the Cartesian and jointcoordinates.

The stochastic modeluseddoes

not require a priori
knowledge of the manipulator system such as the configuration and the numerical values of the physical parameters. The sampling period for the time series model can be
selected relatively long without sacrificing the accuracy of
the model. In the experimental work on the Stanford arm,
which is still going on, the sampling period of 56 ms was
selectedwhen the control schemedescribedwas
implemented using on-line calculations in LSI-11 (modified Unimation controller) with floating-point arithmetic. The same
CPU was used to control three joints, and three model
parameters per joint were estimated. Thenumberof
mathematical operations imposed this lower bound on the
sampling period.
For the design of a controller, the desired trajectory for
the manipulator is specified in the performance index, (13),
at discrete points. It maybe obtained, for example,by
moving the arm manually, while registering the output of
the encoders. Alternatively, the prespecified discrete points
describing the desired path in the Cartesian coordinates
can be transformed into the joint coordinate system.
It is known that a self-tuning controller designed using
the model with estimated parameters converges, under
certain conditions, to the optimal controller obtained for
the modelwith the true parameter values [ 181. In the
manipulator control, however, the adaptive controller of
the self-tuning type is introduced to control the transient
behavior, instead of the steady-state values. The convergence of the parameter estimates and controller gains may
not beachieved during the finite timeoverwhich
motion takes place (adaptive controller). In repetitive tasks,
the last estimates of the parameters from the previous run
can be used as the initial estimates. Thus, the controller of
the robotic manipulator is trainable in repetitive tasks,
resulting in a more accurate performance.
In the numerical examples, the estimates of the parameters converge faster in the single variable model than in the
multivariable model. When the trajectories are compared
in the cases in whichmost estimated parameters have
reached the steady-state values, the simulation results for



the multivariable case in the examples show no clear improvements relative to the results obtained for the single
variable case. It is possible that adaptive feature of the
closed-loop single variable model accounts, at least partly,
for interacting effects.
The recursive algorithms used in the proposed control
scheme require only a small amount of memory space. The
mathematical operations are simple and fast to compute;
however, the time for the on-line calculations of the controller gains is relatively long, for example,as compared to
the case of constant controller gains.
The controller equations (15) and (24) describe a feedback controller with time-varymg gains, and as such, it is a
proportional controller with time-varying gain. The timevarying gain for the variable of joint 1 is the inverse of the
estimate 6: shown in Figs. 5 and 8. If an integrating effect
is desired in the controller, the performance index should
be chosen so as to include a term ~ l l u ( k ) -u ( k - 1)11*.
Indeed, different features in the controller may be obtained
by properly selecting the performance criterion. Because of
the time-varying gains of the controller (15), a direct comparison to the existing control schemes, many of which use
PID controllers with constant gains,is difficult, and is
hardly objectively possible.
The adaptive controller proposed represents a new approach to control a robotic manipulator. Its adaptive feature makes the manipulator controller more versatile and
trainable. It will perform well under diversifiedcircumstances, particularly in repetitive tasks. Since the control
problem of a robotic manipulator is usually of the finite
time type, the proposed adaptive controller may not always
exhibit the self-tuning property of the asymptotic optimality. The adaptive self-tuning type controller is here applied
successfully to a finite time control problem, which is
different from the usual applications of such controllers.

An adaptive controller of the self-tuning typeis proposed here for each joint of a robotic manipulator. The
controller is determined by minimizing the expected value
of a quadratic criterion subject to the constraint of an
autoregressivedifference equation (model)and available
measurements. Thus, this controller makes the output of
each manipulator jointfollow a desired trajectory specified
at discrete points. The proposed control scheme does not
require a priori knowledge of the physical parameters of
the manipulator or its configuration.
The algorithms for the adaptive controller are designed
by the explicit method. Theestimates of the model parameters and the controller gains are computed recursively on
line. Hence, the requirement on computer memory is relativelysmall, although many mathematical operations are
necessary for the on-line operation.
Simulation studies on the manipulator system have been
presented to demonstrate the applicability of the adaptive
control scheme. Numerical examples demonstrate that the

operation of the manipulator systemwith independent

joint control is comparable to the performance of the
system which was obtained on the basis of a multivariable
modelwith coupling terms.Some aspects of the implementation of the control algorithm have been discussed.

R. C. Paul, Modelling. trajectory calculation and servoing of a

computer controlled artn, Ph.D. dissertation. Stanford Univ., Stanford. CA, Aug. 1972.
A. K. Bejczy, Robot arm dynamics and control, JPL Tech.
Memo. 33-669, Feb. 1974.
D. E. Whitney. Revolved motion rate control of manipulators and
human prostheses, IEEE Trans. Man-Machine S p . , vol. MMS-IO,
pp. 47-53, June 1969.
M. E. Kahn and B. Roth. The near-minimum time control of
open-loop articulated kinematic chains. J . Dynamic Syst., Meas.,
Contr. ( A S M E ) . pp. 164-172, Sept. 1971.
J. Y. Luh and M. Walker, Minimum-time along the path for a
mechanical arm. in Proc. 1977 IEEE Con/. Decision Contr.. New
Orleans, LA. Dec. 1977, pp. 755-759.
K. K. D. Young, Controller design for a manipulator using theory
of variable structure svstems, IEEE Trans. S p t . , Man, Cpbern.,
vol. SMC-8. pp. 101-109, Feb. 1978.
J. S . Albus. A new approach to manipulator control: The cerebellar model articulation controller (CMAC), J. Dynamic Syst., Meas.,
Contr. ( A S I M E ) .pp. 220-227, Sept. 1975.
W. J. Book. 0. Maizzc-Neto, andD. E. Whitney. Feedback
control of two beam, two joint systems aith distributed flexibility,
J . &nomic Syst. Mens., Contr., pp. 424-431, Dec. 1975.
W. E. Snyder and W . A. Gruver. Microprocessor implementation
of optimal control for a robotic manipulator system. in Proc. 18th
IEEE Con/. Decision Contr., vol. 2, Fort Lauderdale, FL. Dec. 1979.
G. S. Saridis and C-S. Lee, An approximation theory of optimal
Syst.. Man,
control for trainable manipulators, IEEETrans.
Crbern.. vol. SMC-9. pp. 152-159. Mar. 1979.
H. Hemami and P. C. Camana, Nonlinear feedback in simple
locomotion systems. IEEE Tram. Automat. Contr., pp. 855-860.
Dec. 1979.
S . Dubowsky and D. T. DesForges, The application of model-referenced adaptive controlto robotic manipulators, J. b n a m i c
SW., Mens.. Contr. ( A S M E ) , vol. 101, pp. 193-200, Sept. 1979.
D. E. Whitney. Force feedback control of manipulator fine motions. J. Dynamic Sysr., Meas.. Conrr. ( A S M E ) . pp. 91-97. June
K. J. Astrom, U. Borinson, L. Ljung, and B. Wittenmark. Theory
and application of self-tuning regulators, Automarica. pp. 457-476,
U. Borinson, Self-tuning regulators fo a class of multivariable
systems, Automatica, vol. 15, pp. 209-2fj. 1979.
H. N. Koivo, A multivariable self-tuning controller, Dep. Elec.
Eng., Tampere Univ. Technol., Tampere, Finland, Rep. 31! 1979.
R. L. Kashyap and R. E. Nasburg, Parameter estimation in
multivariate stochastic difference equations, IEEE Trans. Automar.
Contr.. vol. AC-!?, pp. 784-791, 1974.
P. J. Gawthrop. On the stability and convergence of a self-tuning
controller, Int. J. Conrr., vol. 31, no. 5, pp. 973-998, 1980.
M. Vukobratovic, D. Hristic, and D. Stokic, Dynamic control of
industrial manipulators. Industr. Robot, pp. 104-109, June 1981.
M. Takegark and S . Arimoto, An adaptive trajectory control of
manipulators, lnt. J . Confr.. vol. 34, no. 2, pp. 219-230, 1981.

Antti J. Koivo (S61-M63-SW80)

was born in
Ilmajoki, F d a n d . He received the diploma in
electrical engineering from the Finland Technical
University, Helsinki. the MS. degree from Indiana University, Bloomington. and the Ph.D. degree in electrical engineering in 1963 from Cornell
University. Ithaca, N Y .
He worked at Stroemberg Oy in Finland from
1957 to 1960. He was a Rotary Fellow in 1958 to
1959. and had a Woodrow Uilson FelloLvship in
1960 to 1961. He has been associated with Purdue


AC-28, NO. 2,


University, West Lafayette, IN, since 1965, where heis currently a

Professor in the School of Electrical Engineering. He held the position
(grant) of Senior Researcher at the Finnish Academy of Technical Sciences from 1973 to 1974. He has been involved in research on min-max
problems, optimal control and estimation of time delay equations, and
stochastic modeling of water pollution in rivers. He coauthored a paper
which showed for the first time that DO measurements and the knowledge
of the DO and BOD models suffice to estimate BOD and DO without the
need for measuring BOD in a lab, which takes several days. His recent
work includes the design and testing of a microprocessor-based feedback
system to automatically control mean arterial blood pressure by means of
vasodilator drugs. His research interests are centered on computer (microprocessor) control, robotic manipulators, and applications of system
theory to biomedical problems such as blood pressure control.
Dr. Koivo received the D. Ewing Award in 1979 from the School of
Electrical Engineering, Purdue University, for excellence in teaching. He
is a member of Eta Kappa Nu and Sigma Xi.



Ten-Huei Guo (S177-M83) was born

Taichung, Taiwan, Republic of China,on December 28, 1951. He received the B.S. degree in
electrical engineering from the National Taiwan
University, Taiwan, in 1973, the M.S. degree in
electrical engineering from the University of
Hawaii, Honolulu, in 1978. He is presently a
candidate for the Ph.D. degree in electrical engineering at Purdue University, West Lafayette,

From 1975 to 1976 he served as an Instrument
Engineer at Chinese Technical Consultant Inc., Taipei, Taiwan. From
1976 to 1978 he was a Teaching Assistant in the Department of Electrical
Engineering, University of Hawaii. Since 1978he has served as a Research
and Teaching Assistant in the School of Electrical Engineering, Purdue
University. His current research interests include modeling and control of
robotic manipulators, system identification, adaptive control, and computer control.

The Adaptive LQG Problem-Part I


Abstract --This paper is concerned with a rigorous study of the dual

control problem of Feldbaum, i.e., the LQG optimal control problem in
the presence of Bayesianparameteruncertainty.Thesolutionof
probleminvolves two parts, one relating to filtering and one to control.
Although tve establish our fittering result in complete generalityin the last
section,most of the paper concentrateson the finite parameter case to ease
the exposition. Thecontrolresultthat we establish is incomplete in the
sense that the smoothness of the optimal cost function is assumed rather
than proved. Nevertheless, our results and methods are such that we arrive
of the classicalseparationtheorem
showing thatthe
well-known LQG feedback law is optimal within the widest possibleclass of
admissible controls.As this new proof avoids all talk of dependence of the
sigma algebra on the control, weak ~~Iutions,
measure transformation
techniques, etc., we feel that this result will help to clarify what is involved
in the classical separation theorem.

where xj is normally distributed with mean my and variance ?, j = 1; . , N , and E ( . ) and r](*) are standard
For notational simplicity, we take
u, x , y all to be scalars. All the results that appear here and
their proofs remain valid, mutatis mutandis, in the general
vector case. Here, i is a random variable taking values in
{ 1,. . ., N } and aj,.bj,cj, g j , j = 1, * . N are scalar continuous functions of time.
To each control u associate the cost

e ,

J = E ( ~ T ~ p i x 2 +21- q , u 2 d l +21- r j x ( T ) 2 )(1.3)


where p, > 0, qj > 0, j = 1, . * , N , are scalar functions of

time and 5 L 0, j = 1,. . N , are scalars. Throughout, i will
denote the above random variable, while j and k w
i be
integers. In this paper we derive an expression for the control
u*( that minimizes J among all controls that depend on the
observations y( -). If N = 1, we are reduced to the usual
LQG situation and hence, the results described here may
be viewed as an extension or generalization of the separation principle to this multimode1 situation.
The above problemis important for at least two reasons.
First, it is a special caseof the problem of adaptive control,
when the unknown parameter is restricted to a finite set.
Second, there are many systems (such as helicopters) that
display multiregime behavior in an essential way. For these
situations, the above formulation of the problem is a
natural optimal regulator design philosophy.
a ,

ONSIDER the linear stochastic system

e )

d x = a , x d t + b i u d t +xg( iOd )[ = x ?


dy = cixdt + dr]

( 14


Manuscript received February 23, 1981;revised August 17,1981 and

March 29, 1982. Paper recommended by S. I. Marcus, Past Chairman of
the Stochastic Control Committee. This work was supported in part by
the U.S. Office of Naval Research under the Joint SeMces Electronics
Program Contract N00014-75-C-0648while the author was a Research
Fellow at the Division of Applied Sciences, Harvard University, Cambridge, MA 02138.
The author was at the Division of Applied Sciences, Harvard University, Cambridge, MA 02138. He is now with the Department of Mathematics and Statistics, Case Western Reserve University, Cleveland, OH


01983 IEEE