OPTIMUM SYSTEMS CONTROL

Andrew P. Sage
Professor and Director
Information and Control Sciences Center
Institute of Technology
Southern Methodist University

PRENTICE-HALL, INC., Englewood Cliffs, N.J.

to LaVerne, Terri, Karen, Philip

© 1968 by PRENTICE-HALL, INC., Englewood Cliffs, N.J. All rights reserved. No part of this book may be reproduced in any form or by any means without permission in writing from the publisher.

Current Printing (last digit): 10 9 8 7 6 5 4 3 2

Library of Congress Catalog Card Number 68-20862

Printed in the United States of America

PREFACE

In the last several years, much interest has centered around the study of optimization theory as applied to the control of systems. The purpose of this text is to provide a reasonably comprehensive treatment of this optimum systems control field at a level comparable to that of a beginning graduate student. In this regard, the book does not require prior background in state space techniques, calculus of variations, or probability theory, although some exposure, particularly to the first topic, would be of considerable value.

The text has been written strictly from the point of view of an engineer with interest in the study of systems. Consequently, we emphasize the basic concepts of various techniques and the relations, similarities, and limitations of these basic concepts at the expense of mathematical rigor. As befits an introductory text, the level of presentation is generally monotone increasing from chapter to chapter.

Structurally, the text is divided into three areas, although overlap certainly exists. These are:

1. Optimal control with deterministic inputs (Chapters 2, 3, 4, 5, 6, 7).
2. State estimation and combined estimation and control (Chapters 8, 9, 10, 11, 15).
3. Sensitivity and computational techniques in systems control (Chapters 12, 13, 14, 15).

Each division corresponds roughly to a 3-unit, one-quarter graduate course as taught by the author to on-campus students, and also to many industrial students via closed-circuit TV, at the University of Florida and more recently at the Institute of Technology at SMU. All students taking the course have had, as a minimum, one course in linear control system design. Many have, of course, had background in sampled data and nonlinear systems, as well as numerical analysis, variational calculus, modern analysis, stochastic processes, and state space techniques. Although prior knowledge of these latter areas is not required for the courses upon which this text is based, they are certainly fundamental areas for anyone desiring to become truly proficient in modern systems theory.

It is hoped that first and intermediate year graduate students, as well as systems engineers in industry and government in electrical, mechanical, aerospace, nuclear, or any engineering discipline with serious interest in modern system theory, will find this text interesting and challenging.

A. P. S.

ACKNOWLEDGEMENTS

Many people have been responsible in one way or another for the appearance of this book. The inspiring leadership of Dean Thomas L. Martin, Jr. of the Institute of Technology at S.M.U. has been a constant source of encouragement. Dr. Wayne H. Chen, chairman of the electrical engineering department, University of Florida, provided constant support, as well as the physical facilities to conduct much of the research upon which portions of this text are based.
The author wishes to express his heartfelt gratitude for this assistance. A large number of examples solved in this text are the results of joint efforts of the author and his students. The author wishes to acknowledge this help and also permission to use material which has derived from efforts of the author and A. Wayne Bennett, Syma P. Chaudhuri, Barry R. Eisenberg, Thomas W. Ellis, George W. Masters, and Stanley L. Smith. Much helpful discussion has been provided by Dr. James L. Melsa of the Information and Control Sciences Center faculty and students in the optimization, estimation, and computational techniques courses at the University of Florida and Southern Methodist University.

CONTENTS

1 Introduction

2 Calculus of Extrema and Single Stage Decision Processes
2.1 Maxima and minima (scalar process)
2.2 Extrema of functions of two or more variables
2.3 Constrained extrema problems—Lagrange multipliers
2.4 Vector formulation of extrema problems—single stage decision processes
2.5 Linear and nonlinear programming
References
Problems

3 Variational Calculus and Continuous Optimal Control
3.1 Dynamic optimization without constraints
3.2 Remarks on transversality conditions
3.3 The second variation: sufficient conditions for (weak) extrema
3.4 Unspecified terminal time problems
3.5 Euler-Lagrange equations and transversality conditions—vector formulation
3.6 Variational notation
3.7 Dynamic optimization with equality constraints—Lagrange multipliers
3.8 Dynamic optimization with inequality constraints
References
Problems

4 The Maximum Principle and Hamilton-Jacobi Theory
4.1 Variation of functions with terminal times not fixed—the Weierstrass-Erdmann conditions
4.2 The Bolza problem and its solution
4.2-1 Continuous optimal control problems—fixed beginning and terminal times, no inequality constraints
4.2-2 Continuous optimal control problems—fixed beginning and unspecified terminal times, no inequality constraints
4.3 The Bolza problem with control and state variable inequality constraints—the Pontryagin maximum principle
4.3-1 The maximum principle with control variable inequality constraints
4.3-2 Summary of the maximum principle
4.3-3 The maximum principle with state (and control) variable inequality constraints
4.4 Hamilton-Jacobi equation and continuous dynamic programming
References
Problems

5 Optimum Systems Control Examples
5.1 The linear regulator
5.2 The linear servomechanism
5.3 Bang bang control and minimum time problems
5.4 Singular solutions
5.5 Minimum fuel systems
5.6 Minimum time, minimum fuel and minimum energy control of self-adjoint systems
References
Problems

6 Discrete Variational Calculus and the Discrete Maximum Principle
6.1 Derivation of the discrete Euler-Lagrange equation
6.2 The discrete maximum principle
6.2-1 The discrete maximum principle (summary)
6.3 Comparison between the discrete and continuous maximum principle
References
Problems

7 Optimum Control of Distributed Parameter Systems
7.1 Formulation of distributed system problems and a distributed maximum principle
7.1-1 Summary of the distributed maximum principle
7.2 The optimal linear regulator problem
7.3 Spatial and temporal discretization schemes for distributed systems
7.3-1 Spatial discretization
7.3-2 Time discretization
7.3-3 Space-time discretization
References
Problems

8 Optimum State Estimation in Linear Stationary Systems
8.1 Mathematical preliminaries for state estimation
8.1-1 Averages and behavior of nonstatistical variables
8.1-2 Random processes—behavior of statistical variables
8.1-3 Evaluation of transforms
8.1-4 Autocorrelation and spectral density—discrete case
8.1-5 Ensemble statistics
8.1-6 Ergodicity and stationarity
8.1-7 The central limit theorem
8.1-8 Random processes in linear systems
8.2 Errors in state estimation and related topics
8.2-1 Mean-square error formulation
8.2-2 Parseval's theorem
8.3 Optimum state estimation in linear systems
8.4 Semifree system configuration constraints
8.4-1 G_f(s) stable and minimum phase
8.4-2 G_f(s) unstable; poles in RHP
8.4-3 G_f(s) nonminimum phase
8.4-4 Fixed plants with saturation
8.5 Multivariable systems
8.5-1 Multiple disturbance
8.5-2 Multiple-input optimization
References
Problems

9 Optimum Filtering for Nonstationary Continuous Systems
9.1 Nonstationary optimization—impulse response method
9.2 State space formulation for systems with random inputs
9.3 Optimum state estimation in nonstationary systems
9.4 Duality and an alternate approach to the optimum smoothing and filtering problem
9.5 Connecting the classical and the state transition approach
References
Problems

10 Least-Squares Curve Fitting and State Estimation in Discrete Linear Systems
10.1 Discrete random processes
10.2 State space formulation for discrete systems with random inputs—the Markov assumption
10.3 The discrete Wiener-Kalman filter—Bayesian approach
10.4 Least-squares curve fitting—single and multistage estimation problems
10.4-1 Proof of the matrix inversion lemma
10.4-2 Single-stage estimation processes
10.4-3 Multistage least-squares estimation
10.5 Duality and a variational derivation of the discrete Wiener-Kalman filter
10.6 Relationships between least-squares, minimum-variance, and minimum mean-squared-error estimates
10.6-1 Minimum mean-squared-error estimates
10.6-2 Classical least-squares estimates
References
Problems
11 Controllability and Observability—The Separation Theorem
11.1 Observability in linear dynamic systems
11.1-1 Observability in time-varying discrete systems
11.1-2 Observability in continuous systems
11.2 Controllability in linear systems
11.3 Reconstruction of state variables from output variables—observers
11.3-1 Reconstruction of all system state vectors
11.3-2 State reconstruction with observers of low order
11.4 The separation theorem for linear systems
11.4-1 The separation theorem for continuous linear systems
11.4-2 The separation theorem for discrete linear systems
References
Problems

12 Sensitivity in Optimum Systems Control
12.1 Parameter sensitivity
12.1-1 Parameter sensitivity in continuous systems
12.1-2 Parameter sensitivity in discrete systems
12.2 Sensitivity in optimal control
12.2-1 Performance index sensitivity
12.2-2 Comparative sensitivity
12.2-3 The Hamilton-Jacobi-Bellman equation formulation of performance index sensitivity
12.2-4 Sensitivity in the performance index
12.2-5 Example sensitivity calculations for various methods
12.2-6 Sensitivity to control variations
12.3 Sampling interval sensitivity of discrete systems
12.3-1 Global sampling interval sensitivity
12.3-2 Local sampling interval sensitivity
12.4 Variable increment sampling
References
Problems

13 Direct Computational Methods in Optimum Systems Control
13.1 The Ritz method and the method of finite differences
13.2 Discrete dynamic programming
13.3 Gradient techniques
13.3-1 Gradient techniques for single-stage decision processes
13.3-2 The gradient technique for continuous decision processes—the gradient in function space
13.3-3 The gradient method for multiple-stage decision processes
13.4 Optimization based on the second variation
References
Problems

14 Quasilinearization
14.1 Continuous quasilinearization
14.2 Discrete quasilinearization
14.3 Solution of two-point boundary-value problems of optimal control by quasilinearization
14.3-1 Continuous optimal-control problems with fixed terminal time
14.3-2 Solution of specific optimal-control problems by quasilinearization
14.3-3 Solution of optimal control problems with unspecified terminal time
14.4 Difference and differential approximation
14.5 Continuous system identification and modeling using quasilinearization
14.6 Discrete system identification and modeling using quasilinearization
References
Problems

15 Invariant Imbedding
15.1 Derivation of the continuous invariant imbedding equations
15.2 State and parameter estimation in continuous nonlinear systems
15.3 Suboptimal adaptive control—estimation and identification via invariant imbedding
15.3-1 Suboptimal adaptive regulator control
15.3-2 Suboptimal adaptive trajectory control
15.3-3 Examples of suboptimal adaptive control using invariant imbedding
15.4 Solution of nonlinear optimal control problems using invariant imbedding
15.5 Discrete invariant imbedding
References
Problems
Appendix A  The Algebra, Calculus, and Differential Equations of Vectors and Matrices
1 Matrix algebra
2 Differentiation of matrices and vectors
3 Linear vector differential equations
4 Linear vector difference equations
References

Index

1 INTRODUCTION

In recent years much attention has been focused upon optimizing the behavior of systems. A particular problem may concern maximizing the range of a rocket, maximizing the profit of a business, minimizing the error in estimation of position of an object, minimizing the energy or cost required to achieve some required terminal state, or any of a vast variety of similar statements. The search for the control which attains the desired objective while minimizing (or maximizing) a defined system criterion constitutes the fundamental problem of optimization theory.

The fundamental problem of optimization theory may be subdivided into four interrelated parts [Reference 3, Chapter 2]:

1. Definition of a goal.
2. Knowledge of our current position with respect to the goal.
3. Knowledge of all environmental factors influencing the past, present, and future.
4. Determination of the best policy from the goal definition (1) and knowledge of our current state (2) and environment (3).

To solve an optimization problem, we must first define a goal or a cost function for the process we are attempting to optimize. This requires an adequate definition of the problem in physical terms and a translation of this physical description into mathematical terms. To effectively control a process, we must know the current state of the process. This we will call the problem of state estimation. Also, we must be able to characterize the process by an effective model which will depend upon various environmental factors. This we will call system identification. With a knowledge of the cost function, and the system states and parameters, we then determine the best control which minimizes (or maximizes) the cost function. Thus we may define five problems, which are again interrelated, and which we must solve in order to determine the best, or optimum, system:

I. The Control Problem

We are given a known system with a relation between system states and input control. We desire to find the control u(t) which changes the state x(t) so as to accomplish some desirable objective. Figure 1.1 illustrates the salient features of the control problem. This may be an open- or closed-loop problem, depending upon whether or not the control is a function of the state.

Fig. 1.1. Deterministic optimum control problem.

II. The State Estimation Problem

We are given a known system with a random input and measurement noise such that we measure an output z(t) which is a corrupted version of x(t), as indicated in Fig. 1.2. We know the statistics of the plant noise w(t) and the measurement noise v(t), and we desire to determine a "best" estimate x̂(t) of the true system state x(t) from a knowledge of z(t).

Fig. 1.2. State estimation problem.
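The state estimation setup just described is easy to make concrete with a few lines of simulation. The sketch below (Python, not part of the original text; the scalar plant, its coefficient a, the noise intensities, and the Euler discretization are all illustrative assumptions) generates a state x(t) driven by plant noise w(t) and the corrupted observation z(t) from which a "best" estimate of x(t) would then have to be formed.

    import numpy as np

    # Illustrative scalar plant: dx/dt = -a*x + w(t), measurement z = x + v.
    # The values of a, dt, q, and r below are assumed for illustration only.
    rng = np.random.default_rng(0)
    a, dt, n_steps = 1.0, 0.01, 1000
    q, r = 0.5, 0.1                      # plant and measurement noise variances

    x = np.zeros(n_steps)                # true system state x(t)
    z = np.zeros(n_steps)                # observed state z(t)
    for k in range(1, n_steps):
        w = rng.normal(0.0, np.sqrt(q * dt))        # integrated plant noise
        x[k] = x[k-1] + (-a * x[k-1]) * dt + w      # Euler step of the known plant
        z[k] = x[k] + rng.normal(0.0, np.sqrt(r))   # measurement corrupted by v

    # The state estimation problem: recover a "best" estimate of x from z alone.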
III. The Stochastic Control Problem

We may combine problems I and II to form a stochastic control problem as depicted in Fig. 1.3. We desire to determine a control u(t) such that the output state x(t) is changed in accordance with some desired objective. Plant noise w(t) and measurement noise v(t) are present. We know the statistics of these noises and must of course determine a best estimate, x̂(t), of x(t) from a knowledge of the output z(t) before we may discern the "best" control, which may be open- or closed-loop.

Fig. 1.3. Stochastic control problem.

IV. The Parameter Estimation Problem

In many systems we must incorporate some method of identification of system parameters which may vary as a function of the environment. We are given a system such as that shown in Fig. 1.4, where we again know the statistical characteristics of the plant and the measurement noise, and we wish to determine the best estimate of certain plant parameters based upon a knowledge of the deterministic input u(t), the measured output z(t), and possibly some a priori knowledge of the system plant structure. As we shall see, we often must accomplish state estimation in order to obtain parameter estimation.

Fig. 1.4. Parameter estimation problem.

V. The Adaptive Control Problem

We may combine problems I through IV to form an adaptive control problem. We are given the statistical characteristics of w(t) and v(t) or some method of determining these characteristics. Plant parameters are random. We desire to determine a control u(t) to best accomplish some desired objective in terms of the measurement noise and plant noise as well as the uncertainty in system dynamics. If the control u(t) is determined as a function of the measured output z(t), we have a closed-loop adaptive system.

We will divide our efforts in these five problems into fourteen chapters. These chapters and their respective purposes and contents will now be described briefly. Each chapter will contain several examples to illustrate our developed theory. Many problems, of varying complexity, will be posed at the end of each chapter for the interested reader.

Chapter 2  Calculus of extrema and single-stage decision processes

This chapter will examine ordinary scalar maxima and minima and extrema of functions of two or more variables. Constrained extrema and the vector formulation of extrema problems will be presented for single-stage decision processes.

Chapter 3  Variational calculus and continuous optimal control

In this chapter we will introduce the subject of the variational calculus for continuous decision processes through a derivation of the Euler-Lagrange equations and associated transversality conditions. We will discuss the use of Lagrange multipliers to treat equality constraints and briefly mention the inequality constraint problem. Several very simple optimal control problems will be considered.

Chapter 4  The maximum principle and Hamilton-Jacobi theory

The Bolza formulation of the variational calculus will be presented. This will lead into a proof of the Pontryagin maximum principle and the development of the Hamilton canonic equations and the associated transversality conditions. We will discuss at some length problems involving control and state variable inequality constraints. The Hamilton-Jacobi equations will then be developed and modified to produce Bellman's equations of continuous dynamic programming.
Chapter 5  Optimum systems control examples

This chapter will formulate and solve numerous optimal control problems of interest; among those solved are:

1. Minimum time problem.
2. Linear regulator problem.
3. Servomechanism problem.
4. Minimum fuel problem.
5. Minimum energy problem.
6. Singular solution problems.

Chapter 6  The discrete maximum principle

In this chapter we will develop a simplified discrete maximum principle for cases in which control and state variable inequality constraints are absent. We will give a meaningful comparison of the discrete maximum principle and the discretized results of application of the continuous maximum principle for a rather general optimization problem.

Chapter 7  The maximum principle for distributed systems

A formulation of the maximum principle and the Euler-Lagrange equations for distributed systems will be presented. Alternate methods of approach, which discretize spatial and possibly temporal coordinates, thus allowing the use of the continuous or discrete maximum principle, will be presented. Several examples will be formulated and the canonic equations solved for a distributed regulator problem.

Chapter 8  Optimal filtering for stationary continuous systems

This chapter will cover statistical optimization for stationary random signals with infinite observation time. Also included will be several constrained statistical optimization problems for estimation and control. This chapter will present the frequency-domain analysis of filter and control problems.

Chapter 9  Optimum filtering for nonstationary continuous systems

We will extend the work of the previous chapter to include nonstationary optimization problems. The state transition approach will be used and will allow us to develop the celebrated Kalman-Wiener computational algorithms for nonstationary filtering. The dual relations between the filter and the regulator problems will be observed, and the difference between optimum smoothing and optimum filtering will be discussed.

Chapter 10  Least-squares curve fitting and discrete state estimation in linear systems

This chapter will present least squares smoothing and sequential estimation theory for linear systems and its connection to optimum discrete filtering. We will derive the Kalman computational algorithms using Bayesian decision theory. A proof of the relationships between least squares, minimum variance, and minimum mean square error estimators for linear discrete systems will be presented.

Chapter 11  Controllability, observability, the separation theorem, and combined estimation and control

After having established and solved many state estimation problems and optimal control problems, we now inquire into the conditions which must be established in order for many of these problems to have meaningful solutions. First we examine the manner in which the output of a system is constrained with respect to the ability to observe system states. Then we examine the dual requirement and find the characterization of the manner in which a system is constrained with respect to control of system states or system outputs. Then we turn our attention to the construction of observers of the entire system state vector from partial observations or observations of less than all of the system state vectors.
Finally, the linear stochastic control problem is posed and a separation theorem is developed which, in linear combined estimation and control problems, allows us to separate the problem of estimation from that of control.

Chapter 12  Sensitivity analysis in optimum systems

This chapter will present various methods for studying the parameter sensitivity problem in continuous systems, and parameter and sampling interval sensitivity problems in discrete systems. The use of sensitivity concepts in optimal and optimal adaptive systems will be presented.

Chapter 13  Direct computational methods in optimum systems control

This chapter, the first of three concerning computational techniques, will discuss several direct methods for solving optimal control problems. We will discuss the classical Ritz method and the method of finite differences and then proceed with a discussion of discrete dynamic programming. The gradient method or method of steepest descent will be presented along with several modifications to allow us to consider terminal manifold equality constraints and inequality constraints. Finally, the second variation method will be developed and applied to several illustrative examples to indicate the improved speed of convergence compared with the gradient method.

Chapter 14  Quasilinearization

In this chapter we will consider a modification of the Newton-Raphson method for the indirect solution of two-point boundary value problems. Problems of optimal control and those of parameter estimation will be considered. A discrete version of the quasilinearization method will be developed, as will a starting method for computational solution by quasilinearization. We will call this method difference or differential approximation.

Chapter 15  Invariant imbedding

The invariant imbedding method for solution of optimal control problems will be presented, and sequential estimation of states and parameters in nonlinear systems will be emphasized. The study of a suboptimal adaptive control of a nuclear system and a low thrust orbit transfer problem using this technique, in part, will be presented. A discrete version of the invariant imbedding equations will be described and applied to several examples. Interconnections between estimation via invariant imbedding, conditional mean, and maximum likelihood estimation will be discussed.

Appendix A  Calculus and differential equations of vectors and matrices

In this appendix we will summarize many of the matrix and vector calculus operations used throughout the text.

2 CALCULUS OF EXTREMA AND SINGLE-STAGE DECISION PROCESSES

Many problems in modern system theory may be simply stated as extreme value problems. These can be resolved via the calculus of extrema, which is the natural solution method whenever one desires to find parameter values which minimize or maximize a quantity dependent upon them. In this chapter we will consider several such problems, starting with simple scalar problems and concluding with a discussion of the vector case. The method of Lagrange multipliers will be introduced and used to solve constrained extrema problems for single-stage decision processes. A brief discussion of linear and nonlinear programming will be presented. Multistage decision processes, which can be treated by the calculus of extrema, will be reserved for a variational treatment which will result in a discrete maximum principle.
Much of the work in this chapter is very basic, and a selection of only references [1] through [5] of direct interest to the systems control area is given.

2.1 Maxima and minima (scalar process)

A real function f(x), defined for a scalar x = a, has a relative maximum or a relative minimum f(a) for x = a if and only if there exists a positive real number δ such that, respectively,

Δf = f(a + Δx) − f(a) < 0     (2.1-1)

or

Δf = f(a + Δx) − f(a) > 0     (2.1-2)

for all Δx = x − a such that f(a + Δx) exists and 0 < |Δx| < δ. Further, if df(x)/dx exists and is also continuous at x = a, then f(a) can be an interior maximum or minimum only if

df(x)/dx = 0 at x = a     (2.1-3)

If f(x) has a continuous second derivative for x = a, the nature of the extremum at x = a can be determined. The following well-known procedure can be used for the determination of the extrema of a given scalar function y = f(x):

1. Differentiate y with respect to x.
2. For each value of x, determine the specific values of a which satisfy the equation dy/dx = 0.
3. Test to see what kind of extrema the function has for each value of a thus obtained. This we can easily accomplish by the second-derivative test, in which we substitute each value of a into the second derivative of y with respect to x and apply the following rule:

If d²y/dx² at x = a is: > 0, then y has a relative minimum; < 0, then y has a relative maximum; = 0, then y has a stationary point.     (2.1-4)

4. Evaluate the actual value of the extrema by substituting each value of a obtained into f(x).

There are three different types of extrema possible. If a value of a can be found such that f(a) is an extremum for all x throughout its domain of definition, f(x) is said to have an absolute extremum. If a value of a can be found such that f(a) has an extremum throughout a bounded neighborhood of x, f(x) has a relative extremum at x = a. If f(x) is defined only for a limited range of values of x, and if f(x) has an extremum at either boundary of x (with respect to all the values f(x) has for all values of x contained within the limited range of x), then f(x) has an extremum at its boundary. These different types of extrema are illustrated in Fig. 2.1-1. We will have opportunity to apply these concepts to parameter optimization of control systems in Sections 8.2 and 13.3-1.

Fig. 2.1-1. Illustrations of extrema. [Four panels, including f(x) = e^(−x)[u(x) − u(x − a)], f(x) = e^(−x)u(x) on [0, ∞) with an absolute maximum at x = 0, and f(x) = x²(2 − x), which has a relative minimum at x = 0 and a relative maximum at x = 4/3.]
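As a quick check of the four-step procedure, the following sketch (Python with the sympy library; the code itself is not part of the original text) applies it to f(x) = x²(2 − x) from Fig. 2.1-1 and recovers the relative minimum at x = 0 and the relative maximum at x = 4/3.

    import sympy as sp

    x = sp.symbols('x', real=True)
    f = x**2 * (2 - x)                    # example function from Fig. 2.1-1

    fp = sp.diff(f, x)                    # step 1: dy/dx
    stationary = sp.solve(fp, x)          # step 2: solve dy/dx = 0 -> [0, 4/3]
    f2 = sp.diff(f, x, 2)                 # step 3: second-derivative test
    for a in stationary:
        c = f2.subs(x, a)
        if c > 0:
            kind = "relative minimum"
        elif c < 0:
            kind = "relative maximum"
        else:
            kind = "stationary point"
        # step 4: evaluate the extremum itself
        print(f"x = {a}: {kind}, f(a) = {f.subs(x, a)}")

Running this prints a relative minimum f = 0 at x = 0 and a relative maximum f = 32/27 at x = 4/3, in agreement with rule (2.1-4).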
2.2 Extrema of functions of two or more variables

The extrema-finding technique can be extended to include functions of more than one variable. Suppose y = f(x₁, x₂, ..., xₙ) = f(x). A procedure similar to the previous one is used, using partial derivatives instead of total derivatives. A simple example will illustrate the procedure to be followed.

Example 2.2-1

Let us consider the maximization of a function y = f(x) of the two-component vector x, where xᵀ is used to indicate the transpose of the column vector x. Following an extended version of the foregoing scalar procedure, we take the partial derivatives of y with respect to x₁ and x₂ and set them equal to zero. Thus, since a₁ = a₂ = 1 is the only solution of these equations, and since a simple computation shows that the second derivatives are nonpositive at this extremum, we see that we have a maximum at the point xᵀ = [1, 1].

Example 2.2-2

Let us now suppose that the allowable range of x is constrained such that |x₁| ≤ ½ and |x₂| ≤ ½. It is desired to find the value of x which yields a maximum for the y = f(x) of Example 2.2-1 in the allowable or admissible range of x. This region of state space is also shown in Fig. 2.2-1. From this figure, it is apparent that, for this simple problem, y = f(x) has an extremum (maximum) somewhere on the boundary of the admissible range for x, in fact precisely at xᵀ = [½, ½]. This is a very simple example of optimization with an inequality constraint. We will have considerably more to say about this very important type of constraint when we consider dynamic systems and the calculus of variations.

Example 2.2-3

A slightly more difficult problem arises if the allowable range of x is constrained such that the Euclidean norm of x equals one. Symbolically, this means that

‖x‖² = xᵀx = x₁² + x₂² + ... + xₙ² = 1

2.4 Vector formulation of extrema problems—single stage decision processes

A positive definite matrix, P, is one which has the property that δzᵀP δz > 0 for all nonzero δz. A positive semidefinite matrix, P, is defined as one which has the property that δzᵀP δz ≥ 0 for all nonzero δz. In a similar fashion, negative definite and negative semidefinite quadratic forms and matrices are defined. Section 1.23 of Appendix A delineates a method which we can use to discern positive definiteness of a square matrix. Thus we can state the two necessary conditions [4] for J(x, u) to have an extremum in a given interval of x for convex or concave J(x, u). If J(x, u) is not convex or concave, the second condition is only sufficient, and a quantity known as the bordered Hessian must be used to obtain the second necessary condition. Eq. (2.4-20) reduces to:

I. The following vectors are zero:

∂H/∂x = 0,  ∂H/∂u = 0

II. The following matrix

[ ∂²H/∂x∂x   ∂²H/∂x∂u ]
[ ∂²H/∂u∂x   ∂²H/∂u∂u ]

is positive semidefinite for a minimum along f(x, u) = 0 and negative semidefinite for a maximum along f(x, u) = 0.

A sufficient condition for a function to have a minimum (maximum), given that the first variation vanishes, is that the second variation be positive (negative) where the first variation vanishes [4]. These conditions are general and need be modified only if the possibility of a singular solution exists.

Example 2.4-1

Suppose that we have a linear system represented by

f(x, u) = Ax + Bu + c = 0

and wish to find the m vector u which minimizes

J(x, u) = ½uᵀRu + ½xᵀQx

where A is an n × n matrix, B is an n × m matrix, x and c are n vectors, and R and Q are positive definite symmetric matrices of dimensionality m × m and n × n. The Hamiltonian function is formed by adjoining the cost function to the given constraint via the Lagrange multiplier technique, which gives us

H = ½uᵀRu + ½xᵀQx + λᵀ[Ax + Bu + c]

In order to minimize J, it is necessary that

∂H/∂u = Ru + Bᵀλ = 0,  ∂H/∂x = Qx + Aᵀλ = 0

where λ is to be adjusted so that the given equality constraint Ax + Bu + c = 0 is satisfied. Thus we find that

u = −(R + BᵀA⁻ᵀQA⁻¹B)⁻¹BᵀA⁻ᵀQA⁻¹c

is the optimum u vector. We notice that it is necessary that the inverse of A exist in order for the u vector to exist.
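The closed-form u above is easy to sanity-check numerically. The sketch below (Python/numpy; the random problem data are illustrative assumptions, not from the text) builds a small problem, evaluates the formula, and confirms that the gradient of the reduced cost, obtained by eliminating x through x = −A⁻¹(Bu + c), vanishes at the computed u.

    import numpy as np

    rng = np.random.default_rng(1)
    n, m = 4, 2
    A = rng.normal(size=(n, n)) + n * np.eye(n)   # assumed invertible plant matrix
    B = rng.normal(size=(n, m))
    c = rng.normal(size=n)
    Q = np.eye(n)                                  # positive definite weights (assumed)
    R = np.eye(m)

    Ainv = np.linalg.inv(A)
    M = Ainv.T @ Q @ Ainv                          # shorthand for A^{-T} Q A^{-1}

    # Closed-form optimum from the example:
    u = -np.linalg.solve(R + B.T @ M @ B, B.T @ M @ c)

    # With x = -A^{-1}(Bu + c), the reduced cost J(u) = (1/2)u'Ru + (1/2)x'Qx
    # has gradient Ru + B'M(Bu + c); it should vanish at the optimum u.
    x = -Ainv @ (B @ u + c)
    grad = R @ u + B.T @ M @ (B @ u + c)
    print(np.allclose(grad, np.zeros(m)))          # True: u is a stationary point
    print(0.5 * u @ R @ u + 0.5 * x @ Q @ x)       # the minimum cost J(x, u)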
To check if this solution does in fact cause J(x, u) to have a minimum, we find the second variation and check the necessary condition II given earlier. From Eq. (2.4-19) and the specifications for this problem, we have

δ²J = ½ δxᵀQ δx + ½ δuᵀR δu

For J(x, u) to have a minimum, δ²J > 0; therefore Q and R must be non-negative definite. Since this is given in the statement of the problem, the solution, if it exists, does minimize J(x, u).

Example 2.4-2

Suppose that we wish to minimize the cost function

J = ½xᵀQx

subject to the constraint

x + bu + c = 0

where the scalar control is bounded such that |u| ≤ 1. This problem can be solved without the magnitude constraint on the control with the result (from the last example)

u = −(bᵀQb)⁻¹bᵀQc

If the |u| obtained from the foregoing problem is less than 1, we obtain what is called a singular solution. This is so because the H function is linear in the control variable, and ∂H/∂u = λᵀb = 0 is the equation for a stationary point which may well be a minimum. If bᵀQb is positive definite, it is at least a local minimum. If the value of u obtained is within the boundary, that value solves our problem. If the value obtained is greater in magnitude than 1, the true solution for u must be on the boundary. This type of problem is of concern in optimal control theory and will be considered in some detail for dynamic processes.

Example 2.4-3 [2]

Suppose that observations of a constant vector are taken after being corrupted with noise. Symbolically, we express this as

z = Hx + v

where z, which is composed of observed numbers, is an m vector, H is an m × n matrix, x is an n vector, and v is an m vector representing measurement noise. It is desired to obtain the best estimate of x, denoted x̂, such that

J = ½(z − Hx̂)ᵀR⁻¹(z − Hx̂)

is minimum, where R is a symmetric positive definite matrix. We accomplish this by setting

∂J/∂x̂ = −HᵀR⁻¹(z − Hx̂) = 0

Thus, to obtain the best least-square error estimate of x, we have

x̂ = (HᵀR⁻¹H)⁻¹HᵀR⁻¹z

One of the simplest cases of interest occurs when we take m estimates of a scalar. In that case it is reasonable to take H as a unit vector of dimension m or, in other words, a column vector of 1's, and R as the identity matrix. For this simplest case, we have for the "best" estimate of x

x̂ = (1/m) Σᵢ₌₁ᵐ zᵢ

which is simply the average of the m observations.
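A short numerical sketch of this estimator (Python/numpy; the data sizes, true value, and noise level are illustrative assumptions) computes the weighted least-squares estimate and confirms that, with H a column of 1's and R the identity, it reduces to the sample mean of the observations.

    import numpy as np

    rng = np.random.default_rng(2)
    m, x_true = 50, 3.0
    H = np.ones((m, 1))                    # m noisy observations of one scalar constant
    R = np.eye(m)                          # identity measurement-noise covariance
    v = rng.normal(0.0, 1.0, size=m)       # measurement noise
    z = H[:, 0] * x_true + v               # z = Hx + v

    # Weighted least-squares estimate: x_hat = (H' R^-1 H)^-1 H' R^-1 z
    Rinv = np.linalg.inv(R)
    x_hat = np.linalg.solve(H.T @ Rinv @ H, H.T @ Rinv @ z)[0]

    print(x_hat, z.mean())                 # identical: the estimate is the sample mean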
