
Preprints of the Fourth IFAC Symposium on Robot Control
September 19-21, 1994, Capri, Italy

Swing Up Control of the Acrobot Using Partial Feedback Linearization *
Mark W. Spong
Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
1308 W. Main St., Urbana, Ill. 61801

* This research is partially supported by the National Science Foundation under grants MSS-9100618
and IRI-9216428.

Abstract. In this paper we study the swing up control problem for the Acrobot using partial
feedback linearization. We give conditions under which the response of either degree of freedom
may be globally decoupled from the response of the other and linearized. This result can be used
as a starting point to design swing up control algorithms. Analysis of the resulting zero dynamics
shows interesting and rich behavior. Simulation results are presented showing the swing up motion
resulting from the partial feedback linearization design.

Key Words. control theory, robotics, intelligent machines, partial feedback linearization,
underactuated systems, mechatronics.

1. INTRODUCTION

In this paper we study the swing up control problem for the Acrobot, a two-link, underactuated
robot that we are using to study problems in nonlinear control and robotics (refer to Figure (1)).
The swing up control problem is to move the Acrobot from its stable downward position to its
unstable inverted position and balance it about the vertical. We derive two distinct algorithms for
the swing up control. Both of our algorithms are based on the notion of partial feedback
linearization (Isidori, 1989), but also share a common design philosophy with the recent method of
integrator backstepping (Kokotovic et al., 1992). As we shall see, our first algorithm is useful in
the case that there are no limits on the rotation of the second link, while our second algorithm, a
preliminary version of which appears in (Spong, 1994b), can be used in cases where the second link
is restricted to less than a full 360° rotation.

The Acrobot model that we use is a two-link planar robot arm with an actuator at the elbow
(joint 2) but no actuator at the shoulder (joint 1). The equations of motion of the system are
(Spong and Vidyasagar, 1989)

$$d_{11}\ddot{q}_1 + d_{12}\ddot{q}_2 + h_1 + \phi_1 = 0 \tag{1}$$
$$d_{21}\ddot{q}_1 + d_{22}\ddot{q}_2 + h_2 + \phi_2 = \tau \tag{2}$$

where

$$d_{11} = m_1\ell_{c1}^2 + m_2(\ell_1^2 + \ell_{c2}^2 + 2\ell_1\ell_{c2}\cos(q_2)) + I_1 + I_2$$
$$d_{22} = m_2\ell_{c2}^2 + I_2$$
$$d_{12} = m_2(\ell_{c2}^2 + \ell_1\ell_{c2}\cos(q_2)) + I_2$$
$$d_{21} = m_2(\ell_{c2}^2 + \ell_1\ell_{c2}\cos(q_2)) + I_2$$
$$h_1 = -m_2\ell_1\ell_{c2}\sin(q_2)\dot{q}_2^2 - 2m_2\ell_1\ell_{c2}\sin(q_2)\dot{q}_2\dot{q}_1$$
$$h_2 = m_2\ell_1\ell_{c2}\sin(q_2)\dot{q}_1^2$$
$$\phi_1 = (m_1\ell_{c1} + m_2\ell_1)g\cos(q_1) + m_2\ell_{c2}g\cos(q_1 + q_2)$$
$$\phi_2 = m_2\ell_{c2}g\cos(q_1 + q_2)$$

Fig. 1. The Acrobot
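As an illustration, the terms defined for (1)-(2) can be evaluated numerically and the two equations solved for the joint accelerations. The following Python sketch is not taken from the paper: the function names and the parameter values in `P` are hypothetical and chosen only so that the example runs.

```python
import math

# Hypothetical parameter values, chosen only so the sketch runs
# (they are NOT the entries of the paper's Table 1).
P = {"m1": 1.0, "m2": 1.0, "l1": 1.0, "lc1": 0.5, "lc2": 1.0,
     "I1": 0.083, "I2": 0.33, "g": 9.8}


def acrobot_terms(q1, q2, dq1, dq2, p=P):
    """Evaluate the terms d_ij, h_i, phi_i defined after (1)-(2)."""
    c2, s2 = math.cos(q2), math.sin(q2)
    k = p["m2"] * p["l1"] * p["lc2"]  # common factor m2*l1*lc2
    d11 = (p["m1"] * p["lc1"] ** 2
           + p["m2"] * (p["l1"] ** 2 + p["lc2"] ** 2 + 2 * p["l1"] * p["lc2"] * c2)
           + p["I1"] + p["I2"])
    d22 = p["m2"] * p["lc2"] ** 2 + p["I2"]
    d12 = p["m2"] * (p["lc2"] ** 2 + p["l1"] * p["lc2"] * c2) + p["I2"]
    h1 = -k * s2 * dq2 ** 2 - 2 * k * s2 * dq2 * dq1
    h2 = k * s2 * dq1 ** 2
    phi2 = p["m2"] * p["lc2"] * p["g"] * math.cos(q1 + q2)
    phi1 = (p["m1"] * p["lc1"] + p["m2"] * p["l1"]) * p["g"] * math.cos(q1) + phi2
    return {"d11": d11, "d12": d12, "d21": d12, "d22": d22,
            "h1": h1, "h2": h2, "phi1": phi1, "phi2": phi2}


def accelerations(q1, q2, dq1, dq2, tau, p=P):
    """Solve the 2x2 linear system (1)-(2) for (ddq1, ddq2) by Cramer's rule."""
    t = acrobot_terms(q1, q2, dq1, dq2, p)
    r1 = -(t["h1"] + t["phi1"])        # right-hand side of (1): no input torque
    r2 = tau - (t["h2"] + t["phi2"])   # right-hand side of (2): elbow torque tau
    det = t["d11"] * t["d22"] - t["d12"] * t["d21"]
    ddq1 = (t["d22"] * r1 - t["d12"] * r2) / det
    ddq2 = (-t["d21"] * r1 + t["d11"] * r2) / det
    return ddq1, ddq2
```

With zero velocities and zero torque, `accelerations(math.pi / 2, 0.0, 0.0, 0.0, 0.0)` returns values that are zero up to rounding, consistent with the inverted configuration being an equilibrium of (1)-(2).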

The difference between the system (1)-(2) and the standard model of a two-link planar robot (Spong
and Vidyasagar, 1989) is, of course, the absence of an input torque to the first equation (1).
Previous studies have investigated the problem of balancing the Acrobot in its inverted position and
controlling its motion along its unstable equilibrium manifold (Bortoff, 1992; Bortoff and Spong,
1992; Murray and Hauser, 1990). The problem studied in this paper is to swing the Acrobot from
its stable manifold to its unstable manifold and balance it. Initial results on the swing up control
problem were reported in (Spong, 1994b). In this paper we extend the results from (Spong, 1994b)
by showing that the input-output or partial feedback linearization approach can be used to
linearize either degree of freedom. In other words, we may achieve a double integrator system

$$\ddot{q}_i = v_i \quad \text{for } i = 1 \text{ or } 2 \tag{3}$$

This result is quite simple yet surprising in its implications. We shall see that a careful analysis
and interpretation of the resulting zero dynamics is important to understand both the behavior
of the system under partial feedback linearization control and the limitations of the approach.

2. PARTIAL FEEDBACK LINEARIZATION

It has been shown (Bortoff, 1992; Murray and Hauser, 1990) that the Acrobot dynamics are not
feedback linearizable with static state feedback and nonlinear coordinate transformation. This
is typical of a large class of underactuated mechanical systems. However, as we will show, we
may achieve a linear response from either degree of freedom by suitable nonlinear feedback. In this
section, we derive and analyze two distinct nonlinear controllers to achieve two distinct closed loop
systems, which we call $\Sigma_1$ and $\Sigma_2$, and which represent the linearization of the
response of link 1 and link 2, respectively. We will use these two systems to generate two distinct
approaches for the swing up control problem.

The easiest way to see how the partial feedback linearization is accomplished is as follows. In
equation (1) suppose that we solve for either $\ddot{q}_2$ or $\ddot{q}_1$ and use the resulting
expression in the second equation (2). In this way the second equation will be a feedback
linearizable equation involving only $\ddot{q}_1$ in the first case or only $\ddot{q}_2$ in the
second case. Upon choosing $\tau$ to linearize the resulting equation (2) we achieve either the
system $\Sigma_1$

$$\ddot{q}_1 = v_1 \tag{4}$$
$$d_{12}\ddot{q}_2 + h_1 + \phi_1 = -d_{11}v_1 \tag{5}$$

or the system $\Sigma_2$

$$d_{11}\ddot{q}_1 + h_1 + \phi_1 = -d_{12}v_2 \tag{6}$$
$$\ddot{q}_2 = v_2 \tag{7}$$

Thus (under conditions that we will state below) the systems $\Sigma_1$ and $\Sigma_2$ are both
feedback equivalents of the Acrobot dynamics. Either of these systems, $\Sigma_1$ or $\Sigma_2$,
may be used to generate a swing up control strategy as we will show below, after first giving the
details of the derivations of $\Sigma_1$ and $\Sigma_2$.

2.1. DERIVATION OF THE SYSTEM $\Sigma_1$: LINEARIZATION OF $q_1$

Consider the first equation (1)

$$d_{11}\ddot{q}_1 + d_{12}\ddot{q}_2 + h_1 + \phi_1 = 0 \tag{8}$$

and assume that the term

$$d_{12} = m_2(\ell_{c2}^2 + \ell_1\ell_{c2}\cos(q_2)) + I_2$$

is nonzero for all values of $q_2$. Note that this imposes some restrictions on the inertia
parameters of the robot, namely that $I_2 > m_2\ell_{c2}(\ell_1 - \ell_{c2})$. This condition is
termed Strong Inertial Coupling in (Spong, 1994a). Under this assumption we can solve for
$\ddot{q}_2$ from (8) as

$$\ddot{q}_2 = -\frac{1}{d_{12}}\left(d_{11}\ddot{q}_1 + h_1 + \phi_1\right) \tag{9}$$

and substitute the resulting expression (9) into (2) to obtain

$$\bar{d}_1\ddot{q}_1 + \bar{h}_1 + \bar{\phi}_1 = \tau \tag{10}$$

where the terms $\bar{d}_1$, $\bar{h}_1$, $\bar{\phi}_1$ are given by

$$\bar{d}_1 = d_{21} - d_{22}d_{11}/d_{12}$$
$$\bar{h}_1 = h_2 - d_{22}h_1/d_{12}$$
$$\bar{\phi}_1 = \phi_2 - d_{22}\phi_1/d_{12}$$

A feedback linearizing controller can now be defined for equation (10) according to

$$\tau = \bar{d}_1 v_1 + \bar{h}_1 + \bar{\phi}_1 \tag{11}$$

The complete system $\Sigma_1$ to this point is now given by

$$\ddot{q}_1 = v_1 \tag{12}$$
$$d_{12}\ddot{q}_2 + h_1 + \phi_1 = -d_{11}v_1 \tag{13}$$

If $q_1^d(t)$ is a given reference trajectory for $q_1$ we may choose the input term $v_1$ as

$$v_1 = \ddot{q}_1^d + k_d(\dot{q}_1^d - \dot{q}_1) + k_p(q_1^d - q_1) \tag{14}$$
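The controller (11) can be checked numerically: for any state, applying the torque computed from the barred terms to the full dynamics (1)-(2) should make the acceleration of link 1 equal to the commanded $v_1$. The Python sketch below is illustrative only. The parameter values are hypothetical (picked to satisfy the Strong Inertial Coupling condition), the function names are not from the paper, and the outer loop is a proportional-derivative law for a constant reference, which is an assumed form of (14).

```python
import math

# Hypothetical parameters satisfying I2 > m2*lc2*(l1 - lc2); not the paper's Table 1.
P = {"m1": 1.0, "m2": 1.0, "l1": 1.0, "lc1": 0.5, "lc2": 1.0,
     "I1": 0.083, "I2": 0.33, "g": 9.8}


def terms(q1, q2, dq1, dq2, p=P):
    """Return (d11, d12, d22, h1, h2, phi1, phi2); note d21 = d12."""
    c2, s2 = math.cos(q2), math.sin(q2)
    k = p["m2"] * p["l1"] * p["lc2"]
    d11 = (p["m1"] * p["lc1"] ** 2
           + p["m2"] * (p["l1"] ** 2 + p["lc2"] ** 2 + 2 * p["l1"] * p["lc2"] * c2)
           + p["I1"] + p["I2"])
    d22 = p["m2"] * p["lc2"] ** 2 + p["I2"]
    d12 = p["m2"] * (p["lc2"] ** 2 + p["l1"] * p["lc2"] * c2) + p["I2"]
    h1 = -k * s2 * dq2 ** 2 - 2 * k * s2 * dq2 * dq1
    h2 = k * s2 * dq1 ** 2
    phi2 = p["m2"] * p["lc2"] * p["g"] * math.cos(q1 + q2)
    phi1 = (p["m1"] * p["lc1"] + p["m2"] * p["l1"]) * p["g"] * math.cos(q1) + phi2
    return d11, d12, d22, h1, h2, phi1, phi2


def pfl_torque_link1(q1, q2, dq1, dq2, v1, p=P):
    """Torque (11): tau = d1_bar*v1 + h1_bar + phi1_bar."""
    d11, d12, d22, h1, h2, phi1, phi2 = terms(q1, q2, dq1, dq2, p)
    d1_bar = d12 - d22 * d11 / d12      # d21 - d22*d11/d12 with d21 = d12
    h1_bar = h2 - d22 * h1 / d12
    phi1_bar = phi2 - d22 * phi1 / d12
    return d1_bar * v1 + h1_bar + phi1_bar


def link1_acceleration(q1, q2, dq1, dq2, tau, p=P):
    """Solve (1)-(2) for ddq1 given the elbow torque tau."""
    d11, d12, d22, h1, h2, phi1, phi2 = terms(q1, q2, dq1, dq2, p)
    det = d11 * d22 - d12 * d12
    return (d22 * (-(h1 + phi1)) - d12 * (tau - h2 - phi2)) / det


def outer_loop_v1(q1, dq1, q1_ref, kp, kd):
    """Assumed PD form of (14) for a constant reference q1_ref."""
    return kp * (q1_ref - q1) - kd * dq1


state = (0.3, -1.2, 0.5, -0.7)                       # arbitrary test state
v1 = outer_loop_v1(state[0], state[2], math.pi / 2, kp=9.0, kd=3.0)
tau = pfl_torque_link1(*state, v1)
print(abs(link1_acceleration(*state, tau) - v1))     # difference at rounding level
```

The division by `d12` is the place where the Strong Inertial Coupling assumption is used; with parameters that violate it, `d12` can pass through zero and the torque becomes unbounded.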

Here, $k_p$ and $k_d$ in (14) are positive gains. With state variables

$$z_1 = q_1 - q_1^d, \quad z_2 = \dot{q}_1 - \dot{q}_1^d, \quad \eta_1 = q_2, \quad \eta_2 = \dot{q}_2 \tag{15}$$

the closed loop system may be written as

$$\dot{z}_1 = z_2 \tag{16}$$
$$\dot{z}_2 = -k_p z_1 - k_d z_2 \tag{17}$$
$$\dot{\eta}_1 = \eta_2 \tag{18}$$
$$\dot{\eta}_2 = -\frac{1}{d_{12}}(h_1 + \phi_1) - \frac{d_{11}}{d_{12}}(\ddot{q}_1^d - k_p z_1 - k_d z_2) \tag{19}$$

It is interesting to note that the same result can be obtained by simply choosing an output equation

$$y = q_1 \tag{20}$$

for the original system (1)-(2), differentiating the output $y$ until the input appears, and then
choosing the control input to linearize the resulting equation. The system therefore has relative
degree 2 with respect to the output $q_1$. The manner in which we have arrived at the system
$\Sigma_1$ has the advantage that the computation and analysis of the resulting zero dynamics is
simple.

It is, at first glance, surprising that we can achieve a linear response from the first degree of
freedom even though it is not directly actuated but is instead driven only by the coupling forces
arising from motion of the second link. This result says, in effect, that the second link may be
driven in such a way as to achieve essentially an arbitrary response from link 1. The motion of
link 2 necessary to achieve this may be complex and precisely defines the zero dynamics of the
system. For this reason the analysis of the zero dynamics (Isidori, 1989) is crucial to the
understanding of the behavior of the complete system. The zero dynamics with respect to the output
$y = q_1$ are computed by specifying that the output $y$ identically track the reference trajectory
$q_1^d$. We will analyze the zero dynamics for the case of a constant reference command in the next
section.

2.2. ANALYSIS OF THE ZERO DYNAMICS: THE AUTONOMOUS CASE

If the reference input $q_1^d$ is a constant, then the system is autonomous and we may write
(16)-(19) as

$$\dot{z} = Az \tag{21}$$
$$\dot{\eta} = w(z, \eta) \tag{22}$$

with suitable definitions of the matrix $A$ and the function $w(z, \eta)$ (Spong, 1994b). We see
from the above that the surface $z = 0$ in state space defines an invariant manifold for the system.
Since $A$ is Hurwitz for positive values of $k_p$ and $k_d$ this invariant manifold is globally
attractive. The dynamics on this manifold are given by

$$\dot{\eta} = w(0, \eta) \tag{23}$$

Since we are interested in the swing up control problem, we consider the case $q_1^d = \pi/2$.
Substituting $q_1^d = \pi/2$, $\dot{q}_1^d = \ddot{q}_1^d = 0$ into the equation (19) and using the
original description of the system (1) yields the following expression for the zero dynamics of the
system with respect to the output $y = q_1$:

$$(m_2\ell_{c2}^2 + m_2\ell_1\ell_{c2}\cos(q_2) + I_2)\ddot{q}_2 - m_2\ell_1\ell_{c2}\sin(q_2)\dot{q}_2^2 - m_2\ell_{c2}g\sin(q_2) = 0 \tag{24}$$

The system (24), considered as a dynamical system on the cylinder, has two equilibrium points
$p_1 = (0, 0)^T$, which is a saddle, and $p_2 = (\pi, 0)^T$, which is a center. A typical phase
portrait of the system (24) is shown in Figure (2). The dynamic parameters used to generate this
phase portrait coincide with those used in the simulations of the next section.

Fig. 2. Phase Portrait of the Zero Dynamics

It follows that, for initial conditions $z(0) = z_0$, $\eta(0) = \eta_0$, the state $z(t)$ converges
exponentially to zero, while the state $\eta(t)$ converges to a trajectory of the system (24). Since
almost all trajectories of the system (24) are periodic, the typical steady state behavior is for
the first link to converge exponentially to $q_1 = \pi/2$ and the second link to oscillate about the
equilibrium $(\pi/2, 0)$. The strategy for the swing up control is then to switch from the above
partial feedback linearization controller to a linear, quadratic regulator designed to balance the
Acrobot about the vertical, when the Acrobot trajectory approaches the near vertical position. This
will be illustrated in the next section by simulation results.
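The equilibrium structure of the zero dynamics (24) can be examined with a few lines of code. The Python sketch below solves (24) for the acceleration of link 2 and classifies an equilibrium from the sign of the linearized restoring term. It is illustrative only: the parameter values are hypothetical (chosen to satisfy the Strong Inertial Coupling condition) and the function names are not from the paper.

```python
import math

# Hypothetical parameters (not the paper's Table 1); I2 > m2*lc2*(l1 - lc2) holds.
M2, L1, LC2, I2, G = 1.0, 1.0, 1.0, 0.33, 9.8


def zero_dynamics_accel(q2, dq2):
    """Solve (24) for ddq2, with link 1 held at q1 = pi/2."""
    d12 = M2 * LC2 ** 2 + M2 * L1 * LC2 * math.cos(q2) + I2
    return (M2 * L1 * LC2 * math.sin(q2) * dq2 ** 2
            + M2 * LC2 * G * math.sin(q2)) / d12


def classify_equilibrium(q2_eq, eps=1e-6):
    """Classify (q2_eq, 0) from the linearization x'' = slope * x.

    slope > 0 gives real eigenvalues +/- sqrt(slope) (a saddle);
    slope < 0 gives purely imaginary eigenvalues (a center of the linearization).
    The slope is estimated by a central finite difference at zero velocity.
    """
    slope = (zero_dynamics_accel(q2_eq + eps, 0.0)
             - zero_dynamics_accel(q2_eq - eps, 0.0)) / (2 * eps)
    return "saddle" if slope > 0 else "center"


print(classify_equilibrium(0.0))       # equilibrium p1 = (0, 0)
print(classify_equilibrium(math.pi))   # equilibrium p2 = (pi, 0)
```

For these parameter values the classification agrees with the text: the point with $q_2 = 0$ is a saddle and the point with $q_2 = \pi$ is a center.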

2.3. SIMULATION RESULTS

We have simulated the Acrobot in Simnon (Elmquist, 1975), using the parameters in Table 1 below.
Figure (3) shows the response of the partial feedback linearization controller with gains $k_p = 9$,
$k_d = 3$. The Acrobot is held at the equilibrium for about 1 second before the control is applied.
This is merely to make the animation (not shown here) look better. Also, the angle $q_2$ is plotted
modulo $2\pi$, which is the reason for the apparent jump in the joint angle at about 1.8 seconds.

Table 1. Parameters of the Simulated Acrobot (surviving entries: 0.5, 0.2, 1.0, and g = 9.8; the
remaining column headers are illegible)

Fig. 3. Partial Feedback Linearization Response with gains $k_p = 9$, $k_d = 3$

Figure (4) shows the response of the partial feedback linearization controller for larger gain
values $k_p = 9$ and $k_d = 6$. In this case link 2 rotates 360° in the steady state. Using the
large gains, we can then switch to a "balancing" controller to capture and balance the second link
about the vertical whenever $q_2$ passes close to zero (mod $2\pi$). We illustrate this below using
a Linear, Quadratic Regulator to balance the Acrobot about the vertical.

Fig. 4. Partial Feedback Linearization Response with gains $k_p = 9$, $k_d = 6$

2.4. THE BALANCING CONTROLLER

Linearizing the Acrobot dynamics about the vertical equilibrium $q_1 = \pi/2$, $q_2 = 0$ using the
parameters in Table 1 results in the controllable linear system

$$\dot{x} = Ax + Bu \tag{25}$$

where the state vector $x = (q_1 - \pi/2, q_2, \dot{q}_1, \dot{q}_2)$, the control input $u = \tau$,
and the matrices $A$ and $B$ are given by

$$A = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 10.19 & -1.57 & 0 & 0 \\ -10.35 & 6.12 & 0 & 0 \end{bmatrix} \tag{26}$$

$$B = \begin{bmatrix} 0 \\ 0 \\ \text{[illegible]} \\ 2.37 \end{bmatrix} \tag{27}$$

Using Matlab, an LQR controller was designed with weighting matrices

$$Q = \begin{bmatrix} 1000 & -500 & 0 & 0 \\ -500 & 1000 & 0 & 0 \\ 0 & 0 & 1000 & -500 \\ 0 & 0 & -500 & 1000 \end{bmatrix} \tag{28}$$

and $R = 0.5$, yielding the state feedback controller $u = -Kx$, where

$$K = [-1650.0, -460.2, -716.1, -278.2] \tag{29}$$

The above control law (29) is switched on whenever the Acrobot reaches the near vertical
configuration. The actual switching time is determined by trial and error. Figure (5) shows a plot
of a successful swing up and balance using the partial feedback linearization followed by the
linear, quadratic regulator.

Fig. 5. Swing Up Motion of the Acrobot using $\Sigma_1$

2.5. DERIVATION OF THE SYSTEM $\Sigma_2$: LINEARIZATION OF $q_2$

In this section we derive an alternative swing up control algorithm which can be used in the case
that the second link is constrained to rotate less than a full revolution, as for example, with the
experimental Acrobot considered in (Bortoff, 1992). The alternative algorithm derived here is based
on linearizing the system with respect to the output $y = q_2$ instead of $y = q_1$. Consider the
equation (2),

$$d_{21}\ddot{q}_1 + d_{22}\ddot{q}_2 + h_2 + \phi_2 = \tau \tag{30}$$

This time we solve for $\ddot{q}_1$ from equation (1) and substitute the resulting expression into
(30) to obtain

$$\bar{d}_2\ddot{q}_2 + \bar{h}_2 + \bar{\phi}_2 = \tau \tag{31}$$

where the terms $\bar{d}_2$, $\bar{h}_2$, $\bar{\phi}_2$ are given by

$$\bar{d}_2 = d_{22} - d_{21}d_{12}/d_{11}$$
$$\bar{h}_2 = h_2 - d_{21}h_1/d_{11}$$
$$\bar{\phi}_2 = \phi_2 - d_{21}\phi_1/d_{11}$$

Note that this requires that the term $d_{11}$ be nonzero over the configuration manifold of the
robot. This, however, involves no restrictions on the inertia parameters since $d_{11}$ is always
bounded away from zero as a consequence of the uniform positive definiteness of the robot inertia
matrix.

A feedback linearizing controller can be defined for equation (31) according to

$$\tau = \bar{d}_2 v_2 + \bar{h}_2 + \bar{\phi}_2 \tag{32}$$

Substituting the control (32) into (31) yields the system $\Sigma_2$

$$d_{11}\ddot{q}_1 + h_1 + \phi_1 = -d_{12}v_2 \tag{33}$$
$$\ddot{q}_2 = v_2 \tag{34}$$

The input term $v_2$ can now be chosen so that $q_2$ tracks any given reference trajectory $q_2^d$.
The important problem now is to choose the reference signal $q_2^d$ to execute the swing up maneuver.
In (Spong, 1994b) an energy pumping strategy, based on integrator backstepping (Kokotovic et al.,
1992), was used to solve the swing up control problem. The result in (Spong, 1994b) contains an
analysis of the resulting zero dynamics for $\Sigma_2$ similar to that contained here for
$\Sigma_1$. For reasons of space we will not include the analysis of the zero dynamics in this
paper. Instead we will discuss the original energy pumping interpretation of our algorithm which was
the original motivation for its derivation.

2.6. ENERGY BASED SWING UP ALGORITHM

If the second link angle $q_2$ is constrained to lie in an interval $q_2 \in [-\beta, \beta]$ then we
choose an $\alpha$ less than $\beta$ and swing the second link between the values $\pm\alpha$ as
follows: Let the reference for link 2 be given as

$$q_2^d = \alpha\,\mathrm{sgn}(\dot{q}_1) = \begin{cases} +\alpha & \text{if } \dot{q}_1 > 0 \\ -\alpha & \text{if } \dot{q}_1 < 0 \end{cases} \tag{35}$$

where sgn() is the usual signum function, and choose the outer loop control term $v_2$ as

(36)

with $k_p$ and $k_d$ positive gains. The idea behind this choice of reference position for $q_2$ is
to "pump energy" into the system by swinging link 2 "in phase" with the motion of link 1 so that
energy is transferred from link 2 to link 1. In this way, the amplitude of link 1 may be increased
with each swing. To see how this might be expected to work consider the motion of a single link with
a force $F$ acting at the end of the link. Assume that the force $F$ is directed perpendicular to the
link for simplicity. Then the torque acting at the joint is equal to $\ell F$ and the equation of
motion is

(37)

The total energy of the system is given by

(38)

and the derivative of $V$ along trajectories of the system is given by

$$\dot{V} = \ell F \dot{q}_1 \tag{39}$$

Therefore, the change in total energy over a time interval $[T - 1, T]$ is

$$V(T) - V(T - 1) = \ell \int_{T-1}^{T} F(t)\dot{q}_1(t)\,dt \tag{40}$$

Suppose that the force $F$ satisfies

$$F = |f(t)|\,\mathrm{sgn}(\dot{q}_1(t)) \tag{41}$$
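The single-link energy argument can be illustrated numerically. The Python sketch below is not from the paper. It assumes a single link with inertia `I` about the joint, mass `M`, center of mass at distance `LC`, and angle measured from the horizontal, so that the assumed equation of motion is `I*ddq = L*F - M*G*LC*cos(q)` and the assumed total energy is `0.5*I*dq**2 + M*G*LC*sin(q)`. These forms, all parameter values, and the constant force magnitude are hypothetical stand-ins for the quantities in (37)-(41).

```python
import math

# Hypothetical single-link parameters (not taken from the paper).
I, M, G, LC, L = 1.0, 1.0, 9.8, 0.5, 1.0


def sgn(x):
    """Signum function, with sgn(0) = 0."""
    return (x > 0) - (x < 0)


def force(dq, magnitude=0.5):
    """Force of the form (41): fixed magnitude, sign following the link velocity."""
    return magnitude * sgn(dq)


def deriv(q, dq):
    """Assumed single-link dynamics: I*ddq = L*F - M*G*LC*cos(q)."""
    ddq = (L * force(dq) - M * G * LC * math.cos(q)) / I
    return dq, ddq


def energy(q, dq):
    """Assumed total energy: kinetic plus gravitational potential."""
    return 0.5 * I * dq * dq + M * G * LC * math.sin(q)


def simulate(q, dq, t_end=2.0, dt=1e-3):
    """Integrate with classical fourth-order Runge-Kutta and record the energy."""
    energies = [energy(q, dq)]
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(q, dq)
        k2 = deriv(q + 0.5 * dt * k1[0], dq + 0.5 * dt * k1[1])
        k3 = deriv(q + 0.5 * dt * k2[0], dq + 0.5 * dt * k2[1])
        k4 = deriv(q + dt * k3[0], dq + dt * k3[1])
        q += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        dq += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        energies.append(energy(q, dq))
    return energies


# Start near the hanging position with a small initial velocity.
history = simulate(-math.pi / 2, 0.1)
print(history[0], history[-1])
```

Under these assumptions the energy at the end of the run is larger than at the start, because the applied torque always acts in the direction of the link velocity.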

Then we see from (40) that

$$V(T) - V(T - 1) = \ell \int_{T-1}^{T} |f(t)| \cdot |\dot{q}_1|\,dt \geq 0 \tag{42}$$

i.e., the change in energy during the time interval $[T - 1, T]$ is nonnegative. Our strategy for
swinging link 2 rapidly in the direction of motion of $q_1$ is designed to produce a net force
during the time $[T - 1, T]$ of each swing with the "correct sign" as above. Although the above
simplified analysis only approximately describes the true Acrobot, we will see below that the total
energy is indeed increased with each swing as we might expect from the above considerations.

It turns out that a better response can be obtained by smoothing out the reference command for
joint 2 by using

$$q_2^d = \alpha\,\mathrm{sat}(\dot{q}_1) \tag{43}$$

where sat() is the saturation function. This has the effect of straightening out the Acrobot at the
top of each swing, which actually increases the amplitude of link 1 with each swing and facilitates
the capturing of the Acrobot at the vertical position. Other choices for $q_2^d$ are possible. The
essential feature is that the reference function be a so-called "first and third quadrant" function
of $\dot{q}_1$. See (Spong, 1994b) for additional details and an analysis of the resulting zero
dynamics.

Figure (6) shows a swing up motion using the reference for $q_2$ given by (43). Again the LQR
controller is switched on at the top of the swing. Figure (7) shows a plot of the total energy
during the swingup motion.

Fig. 6. Plot of the Swingup Motion

Fig. 7. Total Energy During the Swingup Motion

3. REFERENCES

Bortoff, S.A. (1992). Pseudolinearization using Spline Functions with Application to the Acrobot.
Ph.D. Thesis, Dept. of Electrical and Computer Engineering, University of Illinois at
Urbana-Champaign.
Bortoff, S.A. and M.W. Spong (1992). Observer-based pseudo-linearization using splines: The
rolling acrobot example. ASME Winter Annual Meeting. Anaheim, CA.
Elmquist, H. (1975). Simnon - User's Guide. Dept. of Automatic Control, Lund Inst. of Tech.
Isidori, A. (1989). Nonlinear Control Systems. Second ed. Springer-Verlag. Berlin.
Kokotovic, P.V., M. Krstic and I. Kanellakopoulos (1992). Backstepping to passivity: Recursive
design of adaptive systems. IEEE Conf. on Decision and Control. Tucson, AZ. pp. 3276-3280.
Murray, R.M. and J. Hauser (1990). A case study in approximate linearization: The acrobot
example. Proc. American Control Conference.
Spong, M.W. (1994a). The control of underactuated mechanical systems. First International
Symposium on Mechatronics. Mexico City.
Spong, M.W. (1994b). Swing up control of the acrobot. IEEE Int. Conf. on Robotics and
Automation. San Diego, CA.
Spong, M.W. and M. Vidyasagar (1989). Robot Dynamics and Control. John Wiley & Sons, Inc.
New York, NY.
