300 Years of Optimal Control: From the Brachystochrone to the Maximum Principle

Hector J. Sussmann and Jan C. Willems

Optimal control was born in 1697, 300 years ago, in Groningen, a university town in the north of The Netherlands, when Johann Bernoulli, professor of mathematics at the local university from 1695 to 1705, published his solution of the brachystochrone problem. The year before he had challenged his contemporaries to solve this problem. We will tell the story of some of the events of 1696 and 1697, when solutions were submitted by Johann Bernoulli and such other giants as Newton, Leibniz, Tschirnhaus, l'Hôpital, and Johann's brother, Jakob Bernoulli, and then sketch the evolution of this field until it reached maturity in our century. Since the birth of optimal control, like all births, did not take place in a vacuum, the historical context will first be described, by outlining briefly some of the main ideas and discoveries on curve minimization problems from classical Greece up to Bernoulli's time. We will then state the brachystochrone problem, present Bernoulli's solution, and also provide a short nontechnical interlude, dealing with Bernoulli's personality and with his exceptionally gifted family. Subsequently we will follow the intricate path that has led to the modern versions of the necessary conditions for a minimum, from the Euler-Lagrange equations to the work of Legendre and Weierstrass and, eventually, the maximum principle of optimal control theory. Finally, we will "close the loop" by returning to the brachystochrone from the perspective of modern optimal control.

Our thesis, that the brachystochrone marks the birth of optimal control, is undoubtedly somewhat controversial, and some readers (especially those who espouse views currently in vogue about the social construction of reality) might suspect that it is merely a reflection of the professional and nationalistic biases of the authors. We gladly plead guilty to most of this charge (and state for the record that we are both control theorists, and one of us is a professor at Groningen), asking only that the word "merely" be stricken out. Our biases may of course explain how we became interested in this issue, but are not at all relevant to the merit and validity of our conclusion.

Sussmann (sussmann@hamilton.rutgers.edu) is with the Department of Mathematics, Rutgers University, New Brunswick, NJ 08903. His work was supported in part by NSF Grant DMS95-00798 and AFOSR Grant 0923. He is especially grateful to Anaye K Seman for her careful reading of the manuscript and her invaluable suggestions. Willems (J.C.Willems@math.rug.nl) is with the Department of Mathematics, University of Groningen, P.O. Box 800, 9700 AV Groningen, The Netherlands. This article was presented in the history session of the 35th Conference on Decision and Control in Kobe, Japan, on Dec. 11, 1996.

IEEE Control Systems

Fig. 2. The Brachystochrone Problem (Acta Eruditorum, June 1696, p. 269).

In this article, we will focus on point-to-point optimal control problems, where the objective is to transfer the state of a dynamical system with minimum cost from one point to another. This means that we are leaving out the whole area of transversality conditions, which arise when one considers "set-to-set" problems. Furthermore, we will not discuss at all the very important related question of sufficient conditions (and "Hamilton-Jacobi theory"), as well as the problem of finding optimal controllers, for example in the form of feedback laws, which is of course also a central concern of optimal control theory.

Bernoulli's Challenge

In the June 1696 issue of Acta Eruditorum, Bernoulli posed the following challenge (see Fig.
2):

"Invitation to all mathematicians to solve a new problem. If in a vertical plane two points A and B are given, then it is required to specify the orbit AMB of the movable point M, along which it, starting from A, and under the influence of its own weight, arrives at B in the shortest possible time. So that those who are keen of such matters will be tempted to solve this problem, it is good to know that it is not, as it may seem, purely speculative and without practical use. Rather it even appears, and this may be hard to believe, that it is very useful also for other branches of science than mechanics. In order to avoid a hasty conclusion, it should be remarked that the straight line is certainly the line of shortest distance between A and B, but it is not the one which is traveled in the shortest time. However, the curve AMB, which I shall divulge if by the end of this year nobody else has found it, is very well known among geometers."

Later, at the suggestion of Leibniz, Bernoulli extended the […]

[…] that satisfy endpoint constraints as in (2) and are solutions of (3) for some control $t \mapsto u(t)$, are called minimum time problems. It is in these problems that the difference between optimal control and the calculus of variations is most clearly seen, and it is no accident that these were the problems that propelled the development of optimal control in the early 1960s, and that time-optimal control is prominently represented in today's research and in modern optimal control textbooks.

Fig. 4. The Brachystochrone cycloid (Acta Eruditorum, May 1697).

Within this framework, we can state the first of our reasons for claiming that the brachystochrone problem marks the birth of optimal control: Bernoulli's problem, as posed in the Acta Eruditorum, is a true minimum time problem of the kind that is studied today in optimal control theory. Bernoulli called the fastest path the brachystochrone (from the Greek words βράχιστος: shortest, and χρόνος: time).
Moreover, the brachystochrone problem is the first one ever to deal with a dynamical behavior and explicitly ask for the optimal selection of a path. In both the isoperimetric problem and Newton's minimal drag problem the curves to be computed are not thought of as paths of a moving body or particle. Finally, and most importantly, a large part of the subsequent history of the calculus of variations can be best understood as the search for the simplest and most general statement of the necessary conditions for optimality, and this statement is provided by the maximum principle of optimal control theory.

The above reasons are, in our view, compelling arguments in favor of our claim that 1696 deserves to be called the year of the birth of optimal control.

Bernoulli's Solution of the Brachystochrone Problem

We start by describing Johann Bernoulli's solution.* Let us first formulate the brachystochrone problem in modern mathematical language. Choose x and y axes in the plane with the y axis pointing downwards. Use $(0,0)$ and $(a,b)$ to denote, respectively, the coordinates of the endpoints A and B. A path $f : [0,T] \to \mathbb{R}^2$, defined on an interval $[0,T]$, and having components $f_1$, $f_2$, is said to be a feasible trajectory (or feasible path) if

(i) $f(0) = (0,0)$, $f(T) = (a,b)$, and $f$ is Lipschitz continuous,

(ii) $\dot f_1(t)^2 + \dot f_2(t)^2 = 2g\, f_2(t)$ for almost all $t \in [0,T]$.

Here g is the gravitational constant. Condition (i) states that the path f must start at A and end at B. Condition (ii) reflects conservation of energy: at each instant the kinetic energy of the body must equal the decrease of potential energy due to its loss of height. (The law that a body which has fallen from a height $h$ has velocity proportional to $\sqrt{h}$ was due to Galileo, and was well known in Bernoulli's time.) A feasible path $f^* : [0,T^*] \to \mathbb{R}^2$ is said to be optimal if there exists no feasible path $f : [0,T] \to \mathbb{R}^2$ for which $T < T^*$.

* […] [7] gives an excellent account.

June 1997
A brachystochrone is a curve in $\mathbb{R}^2$ traversed by an optimal feasible path, i.e., a subset $B$ of $\mathbb{R}^2$ of the form $B = \{(x,y) \in \mathbb{R}^2 : \text{there exists } t \in [0,T^*] \text{ such that } (x,y) = f^*(t)\}$, where $f^* : [0,T^*] \to \mathbb{R}^2$ is an optimal feasible path.

One obvious fact is that the solution cannot always be a straight line, a possibility that Bernoulli rightly warns against. For example, consider the extreme case when $b = 0$. It is easy to see that it is possible in finite time to roll from A to B on a half circle, since it will take finite time to roll from A to the bottom of the circle, and the same time to climb back to B. Since, however, the straight-line segment from A to B is horizontal, the speed of motion along it vanishes. So the straight line segment cannot be an optimal path, because the motion along it takes infinite time.

It turns out that the brachystochrone is a cycloid, i.e., the curve described by a point P in a circle that rolls without slipping on the x axis, in such a way that P passes through A and then through B, without hitting the x axis in between. It is easy to see that this defines the cycloid uniquely (see Fig. 2).

Bernoulli's ingenious derivation of the brachystochrone has been the subject of numerous accounts, but since this event plays a crucial role in our own story, we will outline the argument. Bernoulli based his derivation on Fermat's minimum-time principle. If we imagine for a moment that instead of dealing with the motion of a moving body we are dealing with a light ray, condition (ii) above gives us a formula for the "speed of light" $c$ as a function of position: $c = \sqrt{2gy}$. Let us rescale (or, if the reader so prefers, "change our choice of physical units") so that $2g = 1$. Then our problem is exactly equivalent to that of determining the light rays, i.e., the minimum-time paths, in a plane medium where the speed of light $c$ varies continuously as a function of position according to the formula $c = \sqrt{y}$. It is at least intuitively clear that, if we discretize by dividing the half-plane into horizontal strips $S_k$ of height $\delta$, for $k = 0, 1, \ldots$, with $S_k$ lying between the lines $y = y_k$ and $y = y_{k+1}$, where $y_k = k\delta$, and treating $c$ in each strip $S_k$ as a constant $c_k$ (by, say, setting $c_k = \sqrt{y_k}$),
then the light rays for the discretized problem should approach those for the original problem as $\delta \downarrow 0$. The light rays of the discretized problem can be studied using the law of refraction of light: the light paths will be straight line segments within each individual strip, and all that needs to be done is to determine how these rays bend as they cross the boundary between two strips. The answer is provided by the laws of optics as developed by Snellius, Fermat, and Huygens.

Snellius had observed that if two media are separated by a straight line, and a light ray is refracted at the boundary between them, then the ratio of the sines of the incidence angles between the light rays and the normal to the boundary is constant. Fermat subsequently showed that this is precisely what happens when light is assumed to follow a minimum-time path. Applying this to the situation of two media separated by a horizontal line leads to the following optimization problem. Assume that we have two points, the first, $P_1$, lying above, and the second, $P_2$, lying below the boundary. Suppose a light ray travels with speed $v_1$ in the medium above the horizontal line and with speed $v_2$ in the medium below. Within a single medium the fastest path between two points is the straight line joining them. This implies that the fastest path to travel from $P_1$ to $P_2$, when $v_1 \neq v_2$, is a broken line consisting of a straight line from $P_1$ to some point $P'$ on the boundary, and another straight line from $P'$ to $P_2$. The problem is thus reduced to finding the point $P'$. This is, however, a simple calculus question, and it turns out that the point is determined by the equation

$$\frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2}, \quad \text{or, equivalently,} \quad \frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2},$$

where $\theta_1$ and $\theta_2$ are the angles between the two segments and the normal to the boundary. This law relating the incidence angles to the velocities of propagation is due to Huygens, and implies the law of Snellius.

Bernoulli used Huygens' law to conclude that the quantity $\sin\theta_k / c_k$ will be a constant, since in each strip $S_k$ the speed of our light ray is $c_k$. Passing to the limit as $\delta \downarrow 0$, we conclude that the sine of the angle between the tangent to the brachystochrone and the vertical axis must be proportional to $\sqrt{y}$.
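Fermat's argument for a single boundary can be checked numerically. The sketch below is only an illustration under assumed data: the heights `h1`, `h2`, the horizontal separation `d`, and the speeds `v1`, `v2` are hypothetical values, and the crossing point is located by a plain ternary search rather than by the calculus computation mentioned in the text. It minimizes the travel time over the crossing point and then compares the two ratios of sine to speed.

```python
import math

def travel_time(x, h1, h2, d, v1, v2):
    """Time from P1 = (0, h1) to P2 = (d, -h2) via the boundary point P' = (x, 0)."""
    return math.hypot(x, h1) / v1 + math.hypot(d - x, h2) / v2

def fastest_crossing(h1, h2, d, v1, v2, iters=200):
    """Ternary search for the crossing point; valid because travel_time is convex in x."""
    lo, hi = 0.0, d
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, h1, h2, d, v1, v2) < travel_time(m2, h1, h2, d, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def incidence_sines(x, h1, h2, d):
    """Sines of the angles between each segment and the normal to the boundary."""
    return x / math.hypot(x, h1), (d - x) / math.hypot(d - x, h2)

# Hypothetical data: slow medium above the boundary, fast medium below.
h1, h2, d, v1, v2 = 1.0, 1.0, 2.0, 1.0, 2.0
x_opt = fastest_crossing(h1, h2, d, v1, v2)
s1, s2 = incidence_sines(x_opt, h1, h2, d)
print(s1 / v1, s2 / v2)  # the two ratios should agree closely
```

With these values the minimizing point lies closer to `P1` horizontally, so the ray spends less of its path in the slower medium, which is the qualitative behavior the refraction law predicts.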
Since $\sin\theta = \dfrac{dx}{\sqrt{dx^2 + dy^2}}$, we find that $\dfrac{dx^2}{dx^2 + dy^2} = Ky$, where $K$ is a constant. Then $\dfrac{1}{1 + y'(x)^2} = K y(x)$, i.e., $y(x)\,(1 + y'(x)^2) = C$, where $C = 1/K$. So the curve described by expressing the y-coordinate of the brachystochrone as a function of its x-coordinate will satisfy the differential equation

$$y' = \sqrt{\frac{C - y}{y}} \tag{4}$$

with $C$ a constant. The curves given by the parametric equations

$$x(\varphi) = x_0 + \frac{C}{2}(\varphi - \sin\varphi), \qquad y(\varphi) = \frac{C}{2}(1 - \cos\varphi), \qquad 0 \le \varphi \le 2\pi \tag{5}$$

satisfy (4). It is easily seen that these equations specify the cycloid generated by a point P on a circle of diameter $C$ that rolls without slipping on the horizontal axis, in such a way that P is at $(x_0, 0)$ when $\varphi = 0$.

The argument that we have presented is Bernoulli's, and Equation (4) appears in his paper, followed by the statement "from which I conclude that the Brachystochrone is the ordinary Cycloid." (He actually wrote $dy = dx\sqrt{\dfrac{x}{a - x}}$, but he was using x for the vertical coordinate and y for the horizontal one. Cf. [19].)

In contemporary mathematics the symbol $\sqrt{u}$ stands for the nonnegative square root of $u$. It is obvious that this is not what Bernoulli could have had in mind. What he meant was, clearly, what we would write as

$$y'(x) = \pm\sqrt{\frac{C - y(x)}{y(x)}} \tag{6}$$

or, equivalently,

$$y(x)\,(1 + y'(x)^2) = C. \tag{7}$$

In particular, the solution curves should be allowed to have a negative slope. But $y'$ should stay continuous, so that a switching from a $+$ to a $-$ solution of (6) is not permitted.

Even with the more accurate rewriting (7), the differential equation derived by Bernoulli also has spurious solutions, not given by (5)!
Indeed, for any $\bar y > 0$, the constant function $y(x) = \bar y$ is a solution, corresponding to $C = \bar y$. More generally, one can take an ordinary cycloid given by (5), follow it up to $\varphi = \pi$, so that $dy/dx = 0$, then follow the constant solution $y = C$ for an arbitrary time $T$, and then continue with a cycloid given by (5). Such paths are, indeed, compatible with Huygens' law of refraction.

It is easily understood that the laws of Snellius and Huygens cannot explain why a light ray has to bend upward or downward once it is horizontal. As such, Bernoulli's argument is certainly incomplete when the brachystochrone cycloid connecting A and B first bottoms out before climbing back up to the point B. There is no reason why it should not proceed horizontally once it has reached the lowest point. This shortcoming in Bernoulli's argument seems to have escaped historians. We shall later see that the maximum principle does exclude these horizontal motions.

The spurious solutions, and all the other problems, such as the apparent arbitrariness of the requirement that $y'$ be continuous, can be eliminated in a number of ways. For example, one can prove directly that the spurious trajectories are not optimal, or one can use, as an alternative to Bernoulli's method, the calculus of variations approach, based on the Euler-Lagrange equation (10) below.

It is easy to see that the brachystochrone problem can be put in the "standard" form (1), provided we postulate that it suffices to consider curves in the x,y plane that are graphs of functions $y(x)$ defined on $[0,a]$. Then the dynamical constraint (ii) (with $2g = 1$, as before) becomes $dx^2 + dy^2 = y\,dt^2$, which gives

$$dt = \frac{\sqrt{1 + y'(x)^2}}{\sqrt{y(x)}}\,dx = L(y(x), y'(x))\,dx, \quad \text{where} \quad L(y,u) = \frac{\sqrt{1 + u^2}}{\sqrt{y}}, \tag{8}$$

and we are using $x$ rather than $t$ for the time variable, and writing $y'$ for $dy/dx$.
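The relations (5), (7) and (8) lend themselves to a quick numerical sanity check. The sketch below works in the units $2g = 1$ used in the text and rests on two closed forms that are not stated in the article and are derived here as assumptions: integrating $dt = L\,dx$ along the cycloid (5) with $x_0 = 0$ gives the travel time $\sqrt{C}\,\varphi_{\mathrm{end}}$, and along the straight line from $(0,0)$ to $(a,b)$ it gives $2\sqrt{a^2+b^2}/\sqrt{b}$. The function names and the endpoint $(\pi, 2)$ are hypothetical choices for illustration.

```python
import math

def cycloid_through(a, b):
    """Return (C, phi_end) for the cycloid (5) with x0 = 0 passing through (a, b).

    Bisection on (phi - sin phi) / (1 - cos phi) = a / b, which is increasing on (0, 2*pi).
    """
    ratio = a / b
    lo, hi = 1e-6, 2.0 * math.pi - 1e-6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (mid - math.sin(mid)) / (1.0 - math.cos(mid)) < ratio:
            lo = mid
        else:
            hi = mid
    phi = 0.5 * (lo + hi)
    return 2.0 * b / (1.0 - math.cos(phi)), phi

def cycloid_time(a, b):
    """Travel time along the cycloid, using the assumed closed form sqrt(C) * phi_end."""
    C, phi = cycloid_through(a, b)
    return math.sqrt(C) * phi

def line_time(a, b):
    """Travel time along the straight line, using the assumed closed form 2*sqrt(a^2+b^2)/sqrt(b)."""
    return 2.0 * math.hypot(a, b) / math.sqrt(b)

def bernoulli_invariant(C, phi):
    """Evaluate y * (1 + y'^2) on the cycloid; by (7) this should equal C."""
    slope = math.sin(phi) / (1.0 - math.cos(phi))   # dy/dx = (dy/dphi) / (dx/dphi)
    y = 0.5 * C * (1.0 - math.cos(phi))
    return y * (1.0 + slope ** 2)

a, b = math.pi, 2.0                     # hypothetical endpoint B
C, phi_end = cycloid_through(a, b)
print(cycloid_time(a, b), line_time(a, b))   # roughly 4.44 versus 5.27
print(bernoulli_invariant(C, 1.0), C)        # both should be close to 2
```

For this endpoint the cycloid reaches B exactly at its lowest point ($\varphi_{\mathrm{end}} = \pi$, $C = 2$), and the straight line is slower by about 19 percent.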
So Bernoulli's problem becomes that of minimizing the integral $\int_0^a L(y(x), y'(x))\,dx$ subject to $y(0) = 0$ and $y(a) = b$. This gives the Euler-Lagrange equation

$$1 + y'(x)^2 + 2\,y(x)\,y''(x) = 0, \tag{9}$$

which is stronger than (7), since (7) is equivalent to $y'\,(1 + y'^2 + 2yy'') = 0$, whose solutions are those of (9) plus the spurious solutions found earlier. It is easy to see that the solutions of the Euler-Lagrange equation (9) are exactly the curves given by (5), without any extra spurious solutions, showing that, for the brachystochrone problem, the Euler-Lagrange method gives better results than Bernoulli's approach. (We will see in a later section that optimal control is even better.)

Bernoulli was originally under the mistaken impression that the brachystochrone problem was new. However, Leibniz knew better: in 1638 Galileo, in his book on the Two New Sciences, had formulated the brachystochrone problem and even suggested a solution: he seems to have thought it was a circle. Galileo had actually shown, correctly, that an arc of a circle always did better than a straight line, except, of course, when […]

Bernoulli considered the fact that Galileo had been mistaken on two counts, by thinking that the catenary was a parabola, and that the brachystochrone was a circle, as conclusive evidence of the superiority of differential calculus (or the Nova Methodus, as they called it).

He was thrilled by his discovery that the brachystochrone was a cycloid. This curve had been introduced by Galileo, who had given it its name: related to the circle. Huygens had discovered a remarkable property of the cycloid: it is the only curve such that a body falling under its own weight is guided by this curve so as to oscillate with a period that is independent of the initial point where the body is released. Contrary to what Galileo thought, the circle has this property only approximately: the period of oscillation of a pendulum is a function of its amplitude.
Therefore, Huygens called this curve, the cycloid, the tautochrone (from ταὐτός: equal, and χρόνος: time). Bernoulli was amazed and somewhat puzzled, it seems, by the coincidence that the cycloid turns out to be both the brachystochrone and the tautochrone, so that rather different properties related to the time traveled on it by a body falling under its own weight led, in the end, to the same curve. He concluded that nature always arranges things in the simplest manner, as here, by giving the same curve two different properties.

Johann Bernoulli and his Family

We now sketch some of the historical context surrounding the life and work of Bernoulli. The Bernoullis were a Protestant family originally from Antwerp in Flanders. They fled Antwerp in 1583 to escape the religious oppression of the Spanish rulers and, after spending some time in Frankfurt, finally settled in Basel, Switzerland, early in the 17th century. Among its members there were eight mathematicians in three consecutive generations. Most of them ended up as professors in Basel, but many spent extensive periods in other universities in Europe. The most prominent of the Bernoullis were Jakob (1654-1705), his younger brother Johann (1667-1748), the protagonist of our story, and Johann's son, Daniel (1700-1782), born in Groningen while his father was a professor there. Jakob Bernoulli made important contributions, in particular, to probability theory. (Bernoulli distributions are named after him.) Daniel is the discoverer of Bernoulli's law in hydrodynamics, one of the great laws in physics.
At the time that Bernoulli came of age, mathematics was going through a revolution. In 1684, Leibniz published his first article about differential calculus in the Acta Eruditorum. This article was entitled Nova methodus pro maximis et minimis, itemque tangentibus, quae nec fractas, nec irrationales quantitates moratur, & singulare pro illis calculi genus. He showed the power of the Nova Methodus by finding maxima and minima for a number of examples much more effectively than had been possible before. Johann and Jakob Bernoulli were among the first to master Leibniz' technique, and, in 1691, Johann achieved his first success by using the differential calculus to determine the catenary, the shape of a hanging chain.

In his mid-20s, Johann was hired by the Marquis de l'Hôpital, a French nobleman and one of the leading mathematicians of his time, to teach him the differential calculus. While he received a handsome payment for his services, he was bound by contract to let the Marquis take credit for the discoveries made by Johann during this teaching. Johann always claimed that he was the true discoverer of l'Hôpital's rule about the limit of 0/0, which appeared in the Marquis' book, Analyse des Infiniment Petits. His contemporaries tended to ignore this claim, since Johann was not known to be particularly generous to others or objective about his own achievements. However, in 1922, the original notes of these lectures were discovered, which brought positive evidence for Johann's claim.

Fig. 5. Johann and Daniel Bernoulli.

Johann Bernoulli was not an easy person. He often quarreled with his colleagues, and complained about his salary and his work. In 1695, shortly after taking up the chair in Groningen that had been offered to him on the recommendation of Huygens, he vented his disenchantment in a letter to Leibniz, who had encouraged him to accept the offer: "I have not met any of the practitioners of Algebra, which you consider present in Holland. To the contrary, I have not had the honor of meeting a single person who would even deserve to be called a 'mediocre mathematician.'" In the same letter he complained that his teaching took too much of his time, and that "the more progress the students make, the less progress I make." Bernoulli expressed such politically incorrect views not only in private letters, but also publicly. While in Groningen he got into serious difficulties with the local protestant theologians and clergy, who disapproved of the way new discoveries in the physical sciences cast doubt on the validity of revealed truth.

In his disputes with his mathematical colleagues he was unrelenting. He was perhaps the most abrasive contender in the bitter controversy between the English, Newtonian, and the continental, Leibnizian, schools, regarding the originality and rigor of the differential calculus. "He was a man of violent likes and dislikes: Leibniz and Euler were his gods; Newton he positively hated and greatly underestimated" ([1], p. 135). His rivalry with his brother Jakob became an embarrassment to the scientific community, and when in 1699 they were both elected to the Paris Academy, it was on the explicit condition that they promise to cease arguing, a promise that of course was not kept. Even more peculiar was Johann's rivalry with his own son Daniel, whom he criticized (for being a Newtonian) and plagiarized (on the law of hydrodynamics) and of whose success he was allegedly very jealous. Johann once threw Daniel out of the house for having won a French Academy of Sciences prize for which Johann had also been a candidate, cf. [1], p. 134. Daniel, however, remained dutifully respectful towards his father, but frequently expressed his misgivings to his friend Euler (a student of Johann in Basel and a colleague of Daniel in Saint Petersburg).

Fig. 5 is a photograph of a stained glass window of the Academy Building (the main venue of the university) in Groningen.
It shows Daniel Bernoulli sweetly clutching his father's robe, while Johann shows off his brachystochrone.

At the occasion of the 300th anniversary of the appointment of Bernoulli and the discovery of the brachystochrone, the University of Groningen erected the monument shown in Fig. 6. It consists of an artist's rendering of the brachystochrone, with the circle that generates the cycloid. In the background, one can see the building of the mathematics department, where the second author of this article has his office.

Euler, Lagrange, Legendre

With the work of Johann and Jakob Bernoulli, Leibniz, Tschirnhaus, Newton, and l'Hôpital on the brachystochrone, optimal control got off to a spectacular start. Let us now look at some critical events in its later evolution.

The next chapter of our tale is the work of Euler (1707-1783) and Lagrange (1736-1813). Leonhard Euler entered the University of Basel at the age of 13, and became a student of Bernoulli, who gave him private lessons once a week. In Basel, he worked on isoperimetric problems in 1732 and 1736. In 1744 he published his book The Method of Finding Plane Curves that Show Some Property of Maximum or Minimum, where he gave a general procedure for writing down what became known as Euler's equation.

And then Lagrange entered the stage. In H. Goldstine's words ([7], p. 110):

On 12 August 1755 a 19-year-old Ludovico de la Grange Tournier of Turin wrote Euler a brief letter to which was attached an appendix containing mathematical details of a very beautiful and revolutionary idea. He saw how to eliminate from Euler's methods the tedium and need for geometrical insight and to reduce the entire process to a quite analytic machine or apparatus, which could turn out the necessary condition of Euler and more, almost automatically. This basic idea of Lagrange ushered in a new epoch in the calculus of variations.
Indeed, after seeing Lagrange's work, Euler dropped his own method, espoused that of Lagrange, and renamed the subject the calculus of variations. In the summary to his first paper using variations, Euler says: "Even though the author of this (Euler) had meditated a long time and had revealed to friends his desire, yet the glory of first discovery was reserved to the very penetrating geometer of Turin, LA GRANGE, who, having used analysis alone, has clearly attained the same solution which the author had deduced by geometrical considerations."

Lagrange derived the necessary condition

$$\frac{d}{dt}\,\frac{\partial L}{\partial \dot q} = \frac{\partial L}{\partial q}, \tag{10}$$

known today as the Euler-Lagrange equation. (This was not his notation. The symbol $\partial$ for partial derivative was first used by Legendre in 1786.)

Equation (10) makes perfect sense and is a necessary condition for optimality for a vector-valued variable $q$ as well as for a scalar one. It can be written as a system:

$$\frac{d}{dt}\,\frac{\partial L}{\partial \dot q^i} = \frac{\partial L}{\partial q^i}, \qquad i = 1, \ldots, n. \tag{11}$$

Alternatively we can regard Equation (10) as a vector identity, in which $q = (q^1, \ldots, q^n)$ is an n-dimensional vector, and $\dfrac{\partial L}{\partial q}$, $\dfrac{\partial L}{\partial \dot q}$ stand for the n-tuples $\left(\dfrac{\partial L}{\partial q^1}, \ldots, \dfrac{\partial L}{\partial q^n}\right)$, $\left(\dfrac{\partial L}{\partial \dot q^1}, \ldots, \dfrac{\partial L}{\partial \dot q^n}\right)$. A modern mathematician might be troubled by the use of $\dot q$ both as an "independent variable" and as a function of time evaluated along a trajectory, and might prefer to write (10) as

$$\frac{d}{dt}\left[\frac{\partial L}{\partial u}\big(q(t), \dot q(t), t\big)\right] = \frac{\partial L}{\partial q}\big(q(t), \dot q(t), t\big), \tag{12}$$

where the Lagrangian $L(q,u,t)$ is a function on $\mathbb{R}^{2n+1}$, i.e., a function of $q \in \mathbb{R}^n$, $u \in \mathbb{R}^n$, $t \in \mathbb{R}$. This makes it clear that to compute the left-hand side of (10) one first evaluates $\dfrac{\partial L}{\partial \dot q}$ treating $\dot q$ as an "independent variable," then plugs in $q(t)$ and $\dot q(t)$ for $q$, $\dot q$, and finally differentiates with respect to $t$.

The Euler-Lagrange system (10) (or (12)) only gave conditions for stationarity, i.e., for the first variation of $I$ to be zero. The next natural step was to look at the second variation, and this was done by Legendre (1752-1833), who found an additional necessary condition for a minimum. His condition, derived for the scalar case, is

$$\frac{\partial^2 L}{\partial u^2}\big(q(t), \dot q(t), t\big) \ge 0. \tag{13}$$

Fig. 6.
The Brachystochrone Monument.

With an appropriate reinterpretation, Legendre's condition (13) is also necessary in the vector case: all we have to do is read (13) as asserting that the Hessian matrix $\left\{\dfrac{\partial^2 L}{\partial u^i\,\partial u^j}\big(q(t), \dot q(t), t\big)\right\}_{1 \le i,j \le n}$ has to be nonnegative definite.

The First Fork in the Road: Hamilton

At this point, we are close to the first and most critical fork in the road, involving the work of W.R. Hamilton (1805-1865). In a sense, the issue at stake will seem rather trivial, just a matter of rewriting the Euler-Lagrange system in a different formalism. However, sometimes formalisms can make a tremendous difference. To understand what happened and what could have happened but did not, let us try to make sense of the two necessary conditions for a minimum that have been presented so far. We have the Euler-Lagrange equation (10) and the Legendre condition (13). The Legendre condition is clearly the second-order necessary condition for a minimum of a function, namely, $L(q(t), u, t)$ as a function of $u$. But (10) does not look at all like the first-order condition for a minimum of that same function. It is natural to ask whether there might be a way to relate the two conditions. Is it possible that both can be expressed as necessary conditions for a minimum of one and the same function? The answer is yes, and understanding how this is done leads straight to optimal control theory, the maximum principle and far-reaching generalizations of the classical theory. But before we get there, let us tell the story of how Hamilton almost got there himself but missed, and Weierstrass got even closer, but missed as well.

Let us look at another way of writing (10). Suppose a curve $t \mapsto q(t)$ is a solution of (10). Define a function $H(q,u,p,t)$ of three vector variables $q$, $u$, $p$ in $\mathbb{R}^n$, and of $t \in \mathbb{R}$, by letting

$$H(q,u,p,t) = \langle p, u\rangle - L(q,u,t). \tag{14}$$

Then define

$$p(t) = \frac{\partial L}{\partial u}\big(q(t), \dot q(t), t\big). \tag{15}$$

It is then clear that $\dfrac{\partial H}{\partial p} = u$, so along our curve:

$$\dot q(t) = \frac{\partial H}{\partial p}\big(q(t), \dot q(t), p(t), t\big). \tag{16}$$

Also, (12), with $p(t)$ defined by (15), says that

$$\dot p(t) = -\frac{\partial H}{\partial q}\big(q(t), \dot q(t), p(t), t\big). \tag{17}$$

Finally, $\dfrac{\partial H}{\partial u} = p - \dfrac{\partial L}{\partial u}$, so (15) says:

$$\frac{\partial H}{\partial u}\big(q(t), \dot q(t), p(t), t\big) = 0. \tag{18}$$

The system of equations (16), (17), (18), usually written more concisely as

$$\frac{dq}{dt} = \frac{\partial H}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{\partial H}{\partial u} = 0, \tag{19}$$

is exactly equivalent to (10), provided that $H$ is defined as in (14). We will call the function $H$ the "control Hamiltonian," and refer to (19) as the control Hamiltonian form of the Euler-Lagrange equations. In our view, Formula (14) is the definition that Hamilton should have given for the Hamiltonian, and Equations (19) are "Hamilton's equations as he should have written them."

What Hamilton actually wrote was (in our notation, not his)

$$\frac{dq}{dt} = \frac{\partial \mathcal{H}}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial \mathcal{H}}{\partial q}, \tag{20}$$

where $\mathcal{H}(q,p,t)$ is a function of $q$, $p$, and $t$ alone, defined by the formula $\mathcal{H}(q,p,t) = \langle p, \dot q\rangle - L(q, \dot q, t)$, which resembles (14), but is not at all the same. The difference is that in Hamilton's definition, $\dot q$ is supposed to be treated not as an independent variable, but as a function of $q$, $p$, $t$, defined implicitly by the equation

$$p = \frac{\partial L}{\partial \dot q}(q, \dot q, t). \tag{21}$$

It is easy to see that, if the map $(q, \dot q, t) \mapsto (q, p, t)$ defined by (21) can be inverted, i.e., if we can solve (21) for $\dot q$ as a function of $q$, $p$, $t$, then (20) is equivalent to (19). Indeed, it is clear that $\mathcal{H}(q,p,t) = H(q, u(q,p,t), p, t)$, where $u = u(q,p,t)$ satisfies $\dfrac{\partial H}{\partial u}(q, u(q,p,t), p, t) = 0$, so

$$\frac{\partial \mathcal{H}}{\partial p} = \frac{\partial H}{\partial p} + \frac{\partial H}{\partial u}\,\frac{\partial u}{\partial p} = \frac{\partial H}{\partial p}. \tag{22}$$

Along solutions of (19) we have $u = \dot q = u(q,p,t)$, so $\dfrac{\partial \mathcal{H}}{\partial p} = \dfrac{\partial H}{\partial p} = \dot q$, and then the first equation of (20) holds as well. Similarly, the second equation of (20) also holds. The converse is also easily proved.

It should be clear from the above discussion that the Hamiltonian reformulation of the Euler-Lagrange equations in terms of the "control Hamiltonian" is at least as natural as the classical one, and perhaps even simpler.
Moreover, the control formulation has at least one obvious advantage, namely,

(A1) the control version of the Hamilton equations is equivalent to the Euler-Lagrange system under completely general conditions, whereas the classical version only makes sense when the transformation (21) can be inverted, at least locally, to solve for $\dot q$ as a function of $q$, $p$, $t$.

We now show that (A1) is not the only advantage of the control view over the classical one. To see this, we must take another look at Legendre's condition (13). Since $H(q,u,p,t)$ is equal to $-L(q,u,t)$ plus a linear function of $u$, (13) is completely equivalent to

$$\frac{\partial^2 H}{\partial u^2}\big(q(t), \dot q(t), p(t), t\big) \le 0. \tag{23}$$

Now let us write (23) side by side with the third equation of (19):

$$\frac{\partial H}{\partial u} = 0 \quad \text{and} \quad \frac{\partial^2 H}{\partial u^2} \le 0, \tag{24}$$

and let us stare at the result for a few seconds.

These equations unmistakably suggest something! Clearly, what has to be going on here is that $H$ must have a maximum as a function of $u$. So we state this as a conjecture.

CONJECTURE M: besides (19) (or the equivalent form (10)), an additional necessary condition for optimality is that $H(q(t), u, p(t), t)$, as a function of $u$, have a maximum at $\dot q(t)$ for each $t$.

Notice that Conjecture M is a natural consequence of rewriting Hamilton's equations "as Hamilton should have done it," and it is reasonable to guess that, if Hamilton had actually done it, then he himself, or some other 19th century mathematician, would have written (24) and been led by it to the conjecture. On the other hand, it is only by using the Hamiltonian of (14), as opposed to Hamilton's own form of the Hamiltonian, that one can see that the Legendre condition has to do with the sign of the second u-derivative of a function of $u$ whose first derivative has to vanish. This function cannot be $L$ itself, because the first order conditions do not say that $\dfrac{\partial L}{\partial u} = 0$. Nor can it be Hamilton's Hamiltonian $\mathcal{H}$, which isn't even a function of $u$. Only the use of the "control" Hamiltonian leads naturally to Conjecture M.
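Conjecture M can be probed numerically on the brachystochrone Lagrangian (8). The following sketch is an illustration only, not part of the original argument: the sample point `y0` and slope are hypothetical, the derivative $\partial L/\partial u$ is computed by hand for this particular $L$, and the maximizer is found by a simple ternary search that relies on $H$ being concave in $u$ here. It fixes $p$ by (15) and then checks that the chosen slope both maximizes $H$ in $u$ and gives a negative second difference, as in (24).

```python
import math

def lagrangian(y, u):
    """Brachystochrone Lagrangian (8): L(y, u) = sqrt(1 + u^2) / sqrt(y)."""
    return math.sqrt(1.0 + u * u) / math.sqrt(y)

def control_hamiltonian(y, u, p):
    """Control Hamiltonian (14) for a scalar state: H = p*u - L(y, u)."""
    return p * u - lagrangian(y, u)

def momentum(y, u):
    """p = dL/du as in (15), differentiated by hand for this L."""
    return u / (math.sqrt(y) * math.sqrt(1.0 + u * u))

def maximizer(y, p, lo=-50.0, hi=50.0, iters=200):
    """Ternary search for the u maximizing H; valid because H is concave in u."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if control_hamiltonian(y, m1, p) < control_hamiltonian(y, m2, p):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

y0, slope = 1.0, 0.75          # hypothetical sample point and slope
p0 = momentum(y0, slope)       # makes dH/du vanish at u = slope
u_star = maximizer(y0, p0)

h = 1e-4                       # step for a central second difference
d2H = (control_hamiltonian(y0, slope + h, p0)
       - 2.0 * control_hamiltonian(y0, slope, p0)
       + control_hamiltonian(y0, slope - h, p0)) / h ** 2
print(u_star, d2H)             # u_star should be near 0.75, d2H should be negative
```

The search interval `[-50, 50]` is an arbitrary assumption; for this Lagrangian a maximum in $u$ exists only when $|p| < 1/\sqrt{y}$, which holds for any $p$ produced by `momentum`.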
It turns out that Conjecture M is true, and that once its truth is known then vast generalizations are possible. But before we get there, we must move to the next chapter in our tale, and discuss the work of Weierstrass, who essentially discovered and proved Conjecture M, but did so using a language that obscured the simplicity of the result, and for that reason missed some profound implications of his discovery.

The Second Fork in the Road: Weierstrass

Weierstrass (1815-1897) considered the problem of minimizing an integral of the form $I = \int_a^b L(q(s), \dot q(s))\,ds$ for Lagrangians $L$ such that $L(q, \dot q)$ is positively homogeneous with respect to the velocity (that is, $L(q, \alpha\dot q) = \alpha L(q, \dot q)$ for all $q$, $\dot q$ and $\alpha \ge 0$) and does not depend on time. (As will become clear soon, we have a good reason for using $s$ rather than $t$ as the "time" variable in the expression for $I$.)

In a sense, one can always make this assumption on $L$ "without loss of generality" by defining a new function $\Lambda(q, t, u, \tau) = \tau\,L(q, u/\tau, t)$, and thinking of $t$ as a new $q$ variable, say $q^0$, and of $\tau$ as $\dfrac{dt}{ds}$, where $s$ is a new time variable, or "pseudotime," not to be confused with the true time variable $t$. However, "without loss of generality" is a dangerous phrase, and does not at all entail "without loss of insight." We shall argue below that this restriction, in conjunction with the dominant view that Hamilton's equations had to be written in the form (20), may have served to conceal from Weierstrass the true meaning and the far-reaching implications of the new condition he discovered.

Weierstrass introduced the "excess function"

$$\mathcal{E}(q, u, \bar u) = L(q, \bar u) - L(q, u) - \left\langle \frac{\partial L}{\partial u}(q, u),\, \bar u - u \right\rangle, \tag{25}$$
aa depending on thes es of independent variables gx and, He then prowedhisside condition: Foracarvesm> qis}a bea sie tion ofthe minimization problem he fnction € has be » when evaluated for q= as, t= is) and a completely arb tray ‘Weierstrass derived his side conan comparing thet erence curse gs with other ures q¢) that are “val pot tons of inte sense that) islose toes) oral sbut 9) need notte clos os) Since Weirstan'contonivoles comparing L(g-(s) 1) for uclose to .(s), with L(qe(s),ie) near an anrary vl # posi ver fa om) sts bios that variaons q wi large” vals of ate needed ‘Novice that for Lagrangians with he homogeneity propery of Weiertes, 14,1) = 2(g.)-m, So Weiss could ‘equally well have written his excess function as (a3) = Ha.2)- (qa) 06) Using » = (4,1) 8 (1, we see dat 8(g. mit) =(U(4.%) (0.2) -(H.4)-(9.4)), apy Which the reader will immediately recognize as (4,93) = HCG. p)-H(4% Py 2s) \where His our “control Hamiltonian," So Weierstrass’ condition, expressed in ermsof the control Hamiltonian, simply says that (MAX) along ar optimal curve 1 got, if we define pit) via (13), then for every the value u = 54 ¢)must maximize the (con: rol) Hamiltonian H(q-(0), pf), 1) a8. a function oft In Weierstrass’ formulation, the condition was stated in terms of the excess function, forthe special Lagrangians satisfying his bo- ‘mogeneity assumption. In that case the resulting His independent ‘of time, a in our equation (28) But, if one rewrites Weierstrass condition as we have done, in terms of H, then one can take 2 gen- fal Lagrangian, transform the minimization problem into one in Weierstrass form, write the Weierstrass condition in the form (MAX) (soi particular His independent of time) and then undo the transformation and go back othe original problem. The result is (MAX), as written, with the contol Hamiltonian ofthe original problem. So the Weierstrass condition, if reformulated as in (OMAX), is Valid fora problems, with exactly the same statement. 
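The identity between the excess function and a difference of control Hamiltonians can also be tested numerically. The Python sketch below is an illustration under stated assumptions, not the authors' construction: it takes a positively homogeneous Lagrangian $L(q,u) = w(q)\,|u|$ in the plane with a hypothetical positive weight $w$, samples random triples $(q, u, \bar u)$, and reports the largest deviation between $\mathcal{E}(q,u,\bar u)$ and $H(q,u,p) - H(q,\bar u,p)$ with $p = \partial L/\partial u(q,u)$, together with the smallest sampled value of the excess function. All names, the weight, and the sampling ranges are illustrative choices.

```python
import math
import random


def weight(q):
    # Hypothetical positive weight depending on the position q = (x, y).
    return 1.0 + q[1] ** 2


def L(q, u):
    # Positively homogeneous (degree 1) in the velocity u = (u1, u2).
    return weight(q) * math.hypot(u[0], u[1])


def grad_u_L(q, u):
    # Gradient of L with respect to u; defined for u != 0.
    norm = math.hypot(u[0], u[1])
    return (weight(q) * u[0] / norm, weight(q) * u[1] / norm)


def H(q, u, p):
    # Time-independent control Hamiltonian <p, u> - L(q, u).
    return p[0] * u[0] + p[1] * u[1] - L(q, u)


def excess(q, u, u_bar):
    # Weierstrass excess function: L(q, u_bar) - dL/du(q, u) . u_bar.
    g = grad_u_L(q, u)
    return L(q, u_bar) - (g[0] * u_bar[0] + g[1] * u_bar[1])


def excess_identity_report(samples=1000, seed=0):
    """Return (largest |E - (H(u) - H(u_bar))|, smallest sampled E)."""
    rng = random.Random(seed)
    worst_gap = 0.0
    smallest_excess = float("inf")
    for _ in range(samples):
        q = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
        u = (rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
        u_bar = (rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
        if math.hypot(u[0], u[1]) < 1e-6:
            continue  # the gradient is undefined at u = 0
        p = grad_u_L(q, u)
        e = excess(q, u, u_bar)
        worst_gap = max(worst_gap, abs(e - (H(q, u, p) - H(q, u_bar, p))))
        smallest_excess = min(smallest_excess, e)
    return worst_gap, smallest_excess


if __name__ == "__main__":
    print(excess_identity_report())
```

For this particular Lagrangian the nonnegativity of the excess function is just the Cauchy-Schwarz inequality, so the check says nothing about optimality of any curve; it only illustrates that the two ways of writing the condition agree.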
Moreover, (MAX) can be simplified considerably. Indeed, the requirement that $p(t)$ be defined via (15) is now redundant: if $H(q_*(t), u, p(t), t)$, regarded as a function of $u$, has a maximum at $u = \dot q_*(t)$, then $\frac{\partial H}{\partial u}(q_*(t), \dot q_*(t), p(t), t)$ has to vanish, so $p$ has to be given by (15). Moreover, the vanishing of $\frac{\partial H}{\partial u}(q_*(t), \dot q_*(t), p(t), t)$ is also one of the conditions of (19). So we can state (19) and (MAX) together:

(NCO) If a curve $t \mapsto q_*(t)$ is a solution of the minimization problem (1), then there has to exist a function $t \mapsto p(t)$ such that the following three conditions hold for all $t$:

$$\dot q_*(t) = \frac{\partial H}{\partial p}\big(q_*(t), \dot q_*(t), p(t), t\big), \qquad \dot p(t) = -\frac{\partial H}{\partial q}\big(q_*(t), \dot q_*(t), p(t), t\big), \tag{29}$$

$$H\big(q_*(t), \dot q_*(t), p(t), t\big) = \max_u H\big(q_*(t), u, p(t), t\big). \tag{30}$$

As a version of the necessary conditions for optimality, (NCO) encapsulates in one single statement the combined power of the Euler-Lagrange necessary conditions and the Weierstrass side condition as well, of course, as the Legendre condition, which obviously follows from (MAX). Notice the elegance and economy of language achieved by this unified statement: there is no need to bring in an extra entity called the "excess function." Nor does one need to include a formula specifying how $p(t)$ is defined, since (30) does this automatically.
So the addition of the new Weierstrass condition to the three equations of (19) results in a new set of three, rather than four, conditions, a set much "simpler than the sum of its parts." Notice moreover that (MAX), or, more precisely, the Weierstrass side condition part of (MAX), is exactly Conjecture M. So we can surmise at this point that (MAX), as stated, probably could have been discovered soon after the work of Hamilton, since it is strongly suggested by (24), and almost certainly by Weierstrass, if only Hamilton's equations had been written in the form (14), (19).

So, we can now add two new items to our list of advantages of the "control formulation" of Hamilton's equations over the classical one:

(A2) Using the control Hamiltonian, it would have been an obvious next step to write Legendre's condition in "Hamiltonian [...]

[...] that occurs in (14) should of course be replaced by $\langle p, f(q,u,t)\rangle$. This leads us to

CONJECTURE M3: (NCO) should still be a necessary condition for optimality even for problems where $q$ is restricted to satisfy a differential equation $\dot q = f(q,u,t)$, with the "control function" $t \mapsto u(t)$ taking values in some set $U$ and allowed to be a "completely arbitrary" $U$-valued function of $t$, and the Hamiltonian $H$ now being defined by

$$H(q,u,p,t) = \langle p, f(q,u,t)\rangle - L(q,u,t). \tag{31}$$

Those readers who are familiar with optimal control theory will, of course, have recognized Conjecture M3 as being essentially the same thing as the celebrated "Pontryagin maximum principle."

And we hope to have convinced all readers, even those who are not control theorists, that (NCO) is a very natural conclusion. It should be clear from our discussion that (NCO) could have been guessed almost immediately from "Hamilton's equations as Hamilton should have written them," together with the Legendre condition, and would have been an almost obvious conjecture to make once the Weierstrass side condition is known, if only the "correct" Hamiltonian formalism, as in (14) and (19), had been used all along.
The Maximum Principle

So far, we have shown that Conjecture M3 is almost forced on us if one looks at the classical conditions from the right perspective and with the right formalism, but we have not yet said whether it is actually true, nor have we given any indication as to how one might go about proving it.

It turns out, however, that Conjecture M3, as stated, is not true, as can be seen from simple examples, but only a minor modification is needed to make it true. All we have to do is introduce a new $p$-variable $p_0$ (the "abnormal multiplier") and write the Hamiltonian as

$$H(q,u,p,p_0,t) = \langle p, f(q,u,t)\rangle - p_0 L(q,u,t). \tag{32}$$

(MP) A necessary condition for a trajectory-control pair $(q_*(\cdot), u_*(\cdot))$ of (3) to solve the minimization problem is that there exist a function $t \mapsto p(t) \in \mathbb{R}^n$ and a constant $p_0 \ge 0$ such that

$$(p(t), p_0) \neq (0, 0) \quad \text{for all } t \in [a,b], \tag{NT}$$

$$\dot q_*(t) = \frac{\partial H}{\partial p}\big(q_*(t), u_*(t), p(t), p_0, t\big), \qquad \dot p(t) = -\frac{\partial H}{\partial q}\big(q_*(t), u_*(t), p(t), p_0, t\big), \tag{HS}$$

$$H\big(q_*(t), u_*(t), p(t), p_0, t\big) = \max_{u \in U} H\big(q_*(t), u, p(t), p_0, t\big), \tag{MC}$$

and the Hamiltonian $H(q,u,p,p_0,t)$ is given by (32).

Conditions (NT), (HS) and (MC) are known, respectively, as the nontriviality condition, the Hamiltonian system, and the maximization condition. Notice that (HS) is just a restatement of (29), with the new $H$, and (MC) is a restatement of (30). The second equation of (HS) is called the adjoint equation. A trajectory-control pair $(q_*, u_*)$ for which there exist $p$, $p_0$ with the properties of (MP) is called an extremal.

Finally, we remark that for classical calculus of variations problems (MP) yields exactly the same conclusion as (NCO).
Indeed, in this case it is possible to exclude the possibility that $p_0 = 0$, and (MP) reduces to (NCO). So (MP), as stated, is a true generalization of the necessary conditions (NCO), which covers many cases that cannot be handled by means of the classical calculus of variations.

We conclude by presenting the analogue of (MP) for problems with variable time interval.

(MP') For a minimization problem of the kind discussed in (MP), but with the time interval $[a,b]$ not fixed in advance, assuming that $f$ and $L$ do not depend on $t$, the necessary conditions are exactly the same as those of (MP), plus the extra requirement that $H(q_*(t), u_*(t), p(t), p_0) \equiv 0$.

Statement (MP') applies in particular to minimum time problems, i.e., problems where $L \equiv 1$.

From Principle to Theorem

Our discussion so far has dealt only with the formal aspect of the necessary conditions for optimality. In order to get real mathematical theorems, we have to be accurate as to the technical assumptions on $f$, $L$ and $U$, the exact statement of the problem, and the precise meaning of the conclusions.

The results of the previous sections, from the Euler-Lagrange equation to the maximum principle, should be regarded as principles rather than theorems. For us, a principle is a generator of theorems, a not yet completely precise statement that can be made into a theorem by filling in the technical details and making all the definitions and conditions completely precise. The resulting theorems are versions of the principle. Usually, the choices of technical conditions can be made in more than one way, so a "principle" has more than one version. In some cases, a "principle" becomes identified in the minds of mathematicians with its first published rigorous version.
This has happened to some extent in the case of the maximum principle, because the book [8], where the result was first presented, already contains a rigorous version. We contend, however, that this version does not exhaust the full power of the principle, and the work of stating and proving stronger and more general versions is still very much in progress.

Regarding the necessary conditions for optimality, while the discovery of new and more general formal conditions progressed, rigorous versions of the formal results were derived at various stages of the process, using in each case the mathematical tools available at the time. The first rigorous version of the maximum principle appears in the book [8]. This "classical" version was then improved by other authors. We choose to quote a version appearing in L.D. Berkovitz's 1974 book [2].

"Let $f^1, \ldots, f^n$ be the components of $f$, and write $f^0$ for $L$. It is assumed that the $f^i$, for $i = 0, \ldots, n$, are defined on $Q \times U_0 \times [a,b]$, where $Q$, $U_0$ are open subsets of $\mathbb{R}^n$, $\mathbb{R}^m$, respectively. Moreover, each function $q \mapsto f^i(q,u,t)$ is required to be of class $C^1$ with respect to $q$ for each $(u,t) \in U_0 \times [a,b]$, and each map $(u,t) \mapsto f^i(q,u,t)$ has to be Borel measurable for each fixed $q$. The set $U$ is a subset of $U_0$. An admissible control is a map $[a,b] \ni t \mapsto u(t) \in U$ such that for every compact subset $K$ of $Q$ there is an integrable function $t \mapsto \varphi_K(t)$ such that the bound

$$\big|f^i(q, u(t), t)\big| + \Big\|\frac{\partial f^i}{\partial q}(q, u(t), t)\Big\| \le \varphi_K(t)$$

holds for all $(q,t) \in K \times [a,b]$ and all $i = 0, \ldots, n$. For a general class $\mathcal{U}$ of $U$-valued functions on $[a,b]$, and $\bar q, \hat q \in Q$, let us use $C(\mathcal{U}; \bar q, \hat q)$ to denote the set of all pairs $(q(\cdot), u(\cdot))$ such that $u(\cdot) \in \mathcal{U}$, $q(\cdot)$ is a solution of (3) (i.e., $q(\cdot)$ is an absolutely continuous curve $[a,b] \to Q$ such that (3) holds for almost every $t$), $q(a) = \bar q$ and $q(b) = \hat q$. Use $\mathcal{U}_{adm}$ to denote the class of all admissible controls. Then the optimization problem is that of minimizing the integral $\int_a^b L(q(t), u(t), t)\, dt$ in the class $C(\mathcal{U}_{adm}; \bar q, \hat q)$.
The conclusion of the theorem is that of (MP), with $p(\cdot)$ absolutely continuous, and the adjoint equation and the maximization condition holding almost everywhere."

The proof of this first version of the maximum principle is rather long, and we will not even sketch it here. Since then, stronger versions have been obtained by weakening the hypotheses of the first version, or strengthening the conclusions, or both.

One important improvement of the classical version resulted from the use of the nonsmooth analysis of Clarke. While these "nonsmooth" generalizations were being developed, other authors pursued a different direction, for very smooth systems. They observed that one could get stronger results by allowing a class of variations richer than that used in the classical proof. One can then obtain "high-order necessary conditions for optimality." In addition, a third direction developed in which (MP) is formulated not for controlled differential equations $\dot q = f(q,u,t)$, but for differential inclusions $\dot q \in F(q,t)$, where $F$ is a set-valued map (see, for example, [5]).

The results referred to are proved by different methods and cannot be combined into a single theorem. We will not attempt to explain why this is so, because to do it we would have to discuss in detail the proofs of these theorems, showing that in each case one uses a different construction, and these constructions cannot be combined into a single one valid on the whole interval. But it is a fact that, due to this incompatibility of the various proofs, a single theorem covering all cases and combining them (that is, applying to "hybrid" problems as above) appeared, until a few years ago, to be beyond reach. Recently, however, one of us (Sussmann [10-12]) has obtained a general version of (MP) that contains all the above results, applies to some new cases as well, and actually covers the "hybrid" case.

Finale for Brachystochrone and Control

We conclude by returning to the brachystochrone problem, this time from the perspective of optimal control theory.
We can formulate Bernoulli's question as an optimal control problem in the $x$, $y$ plane, whose dynamics are given by

$$\dot x = u\sqrt{|y|}, \qquad \dot y = v\sqrt{|y|}, \tag{33}$$

with the control a 2-dimensional vector $(u,v)$ taking values in the set $U = \{(u,v) : u^2 + v^2 = 1\}$.

The Hamiltonian $H$ is then given (using $L \equiv 1$) by the formula $H = (p_1 u + p_2 v)\sqrt{|y|} - p_0$, and the application of (MP') gives the conditions

$$u = \frac{p_1}{\|p\|}, \qquad v = \frac{p_2}{\|p\|}, \tag{34}$$

where $\|p\| = \sqrt{p_1^2 + p_2^2}$, as well as the differential equations

$$\dot p_1 = 0, \qquad \dot p_2 = -\frac{(p_1 u + p_2 v)\,\operatorname{sgn}(y)}{2\sqrt{|y|}}. \tag{35}$$

Notice that $\|p\| \neq 0$. Indeed, (MP') tells us that $H = 0$. So $p = 0$ would imply $p_0 = 0$, contradicting (NT). If the constant $p_1$ vanishes, then $\dot x \equiv 0$, so we get a vertical line. Otherwise, $\dot x$ is continuous and always $\neq 0$, showing that we can use $x$ to parametrize our solution. Since $y'(x) = \dot y/\dot x = p_2/p_1$, we have

$$y''(x) = \frac{1}{\dot x}\,\frac{d}{dt}\Big(\frac{p_2}{p_1}\Big) = \frac{\dot p_2}{p_1 \dot x}. \tag{36}$$

But (33) and (34) imply that $\dot x = p_1\sqrt{|y|}/\|p\|$ and $p_1 u + p_2 v = \|p\|$, and then Equations (35) and (36) yield

$$y''(x) = -\frac{\|p\|^2}{2\, y\, p_1^2} = -\frac{1 + y'(x)^2}{2\, y(x)},$$

and then $1 + y'^2 + 2yy'' = 0$, which is exactly Equation (9). As we explained before, this leads to the cycloids, without any "spurious solutions." Notice that this argument does not involve any discretization or any use of refraction of light across boundaries.

Notice also that in our control argument we have not postulated that the solution curves could be represented as graphs of functions $y(x)$. We have proved it! (In the calculus of variations case this was an extra assumption, cf. "Bernoulli's Solution of the Brachystochrone Problem" above.)

This is one example showing that, for the brachystochrone problem, the optimal control method gives better results than the classical calculus of variations.

All the above considerations apply to the computation of optimal trajectories that are entirely above the $x$ axis, as in Bernoulli's brachystochrone problem. However, the natural mathematical setting for the minimum time control problem corresponding to (33) is the whole plane, which is why we wrote $|y|$ rather than $y$ in (33). It is natural, therefore, to try to solve this more general problem, i.e., to try to find the light rays when the medium is the whole plane, and the speed of light is $\sqrt{|y|}$.

Notice that this problem is "completely controllable" in the sense that any two points $A$, $B$ of $\mathbb{R}^2$, even if they lie on opposite sides of the $x$ axis, can be joined by a feasible path. The right-hand side of (33) vanishes along the $x$ axis, but this does not prevent the existence of feasible paths crossing the $x$ axis, because the function $\sqrt{|y|}$ is not Lipschitz near the $x$ axis. (If the function were Lipschitz, then by the usual uniqueness theorem of ordinary differential equations, every solution going through a point on the $x$ axis would have to be a constant curve.) However, the same non-Lipschitz feature that makes the system controllable also renders the maximum principle inapplicable, in its classical and nonsmooth versions, including the Łojasiewicz version, since all these require a Lipschitz reference vector field.

Suppose, for example, that we want to find an optimal trajectory from $A$ to $B$, where $A$ lies in the upper half-plane and $B$ is in the lower half-plane. Then one can show, first of all, that an optimal trajectory exists, using Ascoli's theorem. Next, using the usual necessary conditions for optimality, e.g., the Euler-Lagrange equation or the classical version of the maximum principle, one shows that any portion of an optimal curve which is entirely contained in the closed upper half-plane or in the closed lower half-plane is a cycloid given by (5), or a reflection of such a cycloid with respect to the $x$ axis. Next, one sees that the optimal trajectory cannot traverse the $x$ axis more than once. (This requires an elementary qualitative lemma that we leave as an exercise.) So we know that it consists of a
