THE OXFORD SERIES IN ELECTRICAL AND COMPUTER ENGINEERING

Adel S. Sedra, Series Editor, Electrical Engineering
Michael R. Lightner, Series Editor, Computer Engineering

Allen and Holberg, CMOS Analog Circuit Design
Bobrow, Elementary Linear Circuit Analysis, 2nd Ed.
Bobrow, Fundamentals of Electrical Engineering, 2nd Ed.
Campbell, The Science and Engineering of Microelectronic Fabrication
Chen, Analog and Digital Control System Design
Chen, Linear System Theory and Design, 3rd Ed.
Chen, System and Signal Analysis, 2nd Ed.
Comer, Digital Logic and State Machine Design, 3rd Ed.
Cooper and McGillem, Probabilistic Methods of Signal and System Analysis, 3rd Ed.
Fortney, Principles of Electronics: Analog & Digital
Franco, Electric Circuits Fundamentals
Granzow, Digital Transmission Lines
Guru and Hiziroglu, Electric Machinery & Transformers, 2nd Ed.
Hoole and Hoole, A Modern Short Course in Engineering Electromagnetics
Jones, Introduction to Optical Fiber Communication Systems
Krein, Elements of Power Electronics
Kuo, Digital Control Systems, 3rd Ed.
Lathi, Modern Digital and Analog Communications Systems, 3rd Ed.
McGillem and Cooper, Continuous and Discrete Signal and System Analysis, 3rd Ed.
Miner, Lines and Electromagnetic Fields for Engineers
Roberts and Sedra, SPICE, 2nd Ed.
Roulston, An Introduction to the Physics of Semiconductor Devices
Sadiku, Elements of Electromagnetics, 2nd Ed.
Santina, Stubberud, and Hostetter, Digital Control System Design, 2nd Ed.
Schwarz, Electromagnetics for Engineers
Schwarz and Oldham, Electrical Engineering: An Introduction, 2nd Ed.
Sedra and Smith, Microelectronic Circuits, 4th Ed.
Stefani, Savant, Shahian, and Hostetter, Design of Feedback Control Systems, 3rd Ed.
Van Valkenburg, Analog Filter Design
Warner and Grung, Semiconductor Device Electronics
Wolovich, Automatic Control Systems
Yariv, Optical Electronics in Modern Communications, 5th Ed.

LINEAR SYSTEM THEORY AND DESIGN
Third Edition

Chi-Tsong Chen
State University of New York at Stony Brook
New York   Oxford
OXFORD UNIVERSITY PRESS
1999

Oxford   New York
Athens Auckland Bangkok Bogota Buenos Aires Calcutta
Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul
Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai
Nairobi Paris Sao Paulo Singapore Taipei Tokyo Toronto Warsaw
and associated companies in
Berlin Ibadan
To Bih-Jau

Copyright © 1999 by Oxford University Press, Inc.

Published by Oxford University Press, Inc.,
198 Madison Avenue, New York, New York 10016

Oxford is a registered trademark of Oxford University Press

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press.
Library of Congress Cataloging-in-Publication Data
Chen, Chi-Tsong
Linear system theory and design / by Chi-Tsong Chen. - 3rd ed.
p. cm. - (The Oxford series in electrical and computer engineering)
Includes bibliographical references and index.
ISBN 0-19-511777-8 (cloth)
1. Linear systems. 2. System design. I. Title. II. Series.
QA402.C44 1998   629.8'32-dc21   97-35535 CIP

Printing (last digit): 9 8 7 6 5 4 3 2
Printed in the United States of America on acid-free paper
Contents
Preface xi
Chapter 1: Introduction 1

1.1 Introduction 1
1.2 Overview 2

Chapter 2: Mathematical Descriptions of Systems 5

2.1 Introduction 5
    2.1.1 Causality and Lumpedness 6
2.2 Linear Systems 7
2.3 Linear Time-Invariant (LTI) Systems 11
    2.3.1 Op-Amp Circuit Implementation 16
2.4 Linearization 17
2.5 Examples 18
    2.5.1 RLC Networks 26
2.6 Discrete-Time Systems 31
2.7 Concluding Remarks 37
Problems 38

Chapter 3: Linear Algebra 44

3.1 Introduction 44
3.2 Basis, Representation, and Orthonormalization 45
3.3 Linear Algebraic Equations 48
3.4 Similarity Transformation 53
3.5 Diagonal Form and Jordan Form 55
3.6 Functions of a Square Matrix 61
3.7 Lyapunov Equation 70
3.8 Some Useful Formulas 71
3.9 Quadratic Form and Positive Definiteness 73
3.10 Singular-Value Decomposition 76
3.11 Norms of Matrices 78
Problems 78

Chapter 4: State-Space Solutions and Realizations 86

4.1 Introduction 86
4.2 Solution of LTI State Equations 87
    4.2.1 Discretization 90
    4.2.2 Solution of Discrete-Time Equations 92
4.3 Equivalent State Equations 93
    4.3.1 Canonical Forms 97
    4.3.2 Magnitude Scaling in Op-Amp Circuits 98
4.4 Realizations 100
4.5 Solution of Linear Time-Varying (LTV) Equations 106
    4.5.1 Discrete-Time Case 110
4.6 Equivalent Time-Varying Equations 111
4.7 Time-Varying Realizations 115
Problems 117

Chapter 5: Stability 121

5.1 Introduction 121
5.2 Input-Output Stability of LTI Systems 121
    5.2.1 Discrete-Time Case 126
5.3 Internal Stability 129
    5.3.1 Discrete-Time Case 131
5.4 Lyapunov Theorem 132
    5.4.1 Discrete-Time Case 135
5.5 Stability of LTV Systems 137
Problems 140

Chapter 6: Controllability and Observability 143

6.1 Introduction 143
6.2 Controllability 144
    6.2.1 Controllability Indices 150
6.3 Observability 153
    6.3.1 Observability Indices 157
6.4 Canonical Decomposition 158
6.5 Conditions in Jordan-Form Equations 164
6.6 Discrete-Time State Equations 169
    6.6.1 Controllability to the Origin and Reachability 171
6.7 Controllability After Sampling 172
6.8 LTV State Equations 176
Problems 180

Chapter 7: Minimal Realizations and Coprime Fractions 184

7.1 Introduction 184
7.2 Implications of Coprimeness 185
    7.2.1 Minimal Realizations 189
7.3 Computing Coprime Fractions 192
    7.3.1 QR Decomposition 195
7.4 Balanced Realization 197
7.5 Realizations from Markov Parameters 200
7.6 Degree of Transfer Matrices 205
7.7 Minimal Realizations: Matrix Case 207
7.8 Matrix Polynomial Fractions 209
    7.8.1 Column and Row Reducedness 212
    7.8.2 Computing Matrix Coprime Fractions 214
7.9 Realizations from Matrix Coprime Fractions 220
7.10 Realizations from Matrix Markov Parameters 225
7.11 Concluding Remarks 227
Problems 228

Chapter 8: State Feedback and State Estimators 231

8.1 Introduction 231
8.2 State Feedback 232
    8.2.1 Solving the Lyapunov Equation 239
8.3 Regulation and Tracking 242
    8.3.1 Robust Tracking and Disturbance Rejection 243
    8.3.2 Stabilization 247
8.4 State Estimator 247
    8.4.1 Reduced-Dimensional State Estimator 251
8.5 Feedback from Estimated States 253
8.6 State Feedback: Multivariable Case 255
    8.6.1 Cyclic Design 256
    8.6.2 Lyapunov-Equation Method 259
    8.6.3 Canonical-Form Method 260
    8.6.4 Effect on Transfer Matrices 262
8.7 State Estimators: Multivariable Case 263
8.8 Feedback from Estimated States: Multivariable Case 265
Problems 266

Chapter 9: Pole Placement and Model Matching 269

9.1 Introduction 269
    9.1.1 Compensator Equations: Classical Method 271
9.2 Unity-Feedback Configuration: Pole Placement 273
    9.2.1 Regulation and Tracking 275
    9.2.2 Robust Tracking and Disturbance Rejection 277
    9.2.3 Embedding Internal Models 280
9.3 Implementable Transfer Functions 283
    9.3.1 Model Matching: Two-Parameter Configuration 286
    9.3.2 Implementation of Two-Parameter Compensators 291
9.4 Multivariable Unity-Feedback Systems 292
    9.4.1 Regulation and Tracking 302
    9.4.2 Robust Tracking and Disturbance Rejection 303
9.5 Multivariable Model Matching: Two-Parameter Configuration 306
    9.5.1 Decoupling 311
9.6 Concluding Remarks 314
Problems 315

References 319

Answers to Selected Problems 321

Index 331

Preface

This text is intended for use in senior/first-year graduate courses on linear systems and multivariable system design in electrical, mechanical, chemical, and aeronautical departments. It may also be useful to practicing engineers because it contains many design procedures. The mathematical background assumed is a working knowledge of linear algebra and the Laplace transform and an elementary knowledge of differential equations.

Linear system theory is a vast field. In this text, we limit our discussion to the conventional approaches of state-space equations and the polynomial fraction method of transfer matrices. The geometric approach, the abstract algebraic approach, rational fractions, and optimization are not discussed.

We aim to achieve two objectives with this text. The first objective is to use simple and efficient methods to develop results and design procedures. Thus the presentation is not exhaustive. For example, in introducing polynomial fractions, some polynomial theory such as the Smith-McMillan form and Bezout identities is not discussed. The second objective of this text is to enable the reader to employ the results to carry out design. Thus most results are discussed with an eye toward numerical computation. All design procedures in the text can be carried out using any software package that includes singular-value decomposition, QR decomposition, and the solution of linear algebraic equations and the Lyapunov equation. We give examples using MATLAB®, as the package(1) seems to be the most widely available.

(1) MATLAB is a registered trademark of The MathWorks, Inc., 24 Prime Park Way, Natick, MA 01760-1500. Phone: 508-647-7000, fax: 508-647-7001. E-mail: info@mathworks.com, http://www.mathworks.com.

This edition is a complete rewriting of the book Linear System Theory and Design, which was the expanded edition of Introduction to Linear System Theory published in 1970. Aside from, hopefully, a clearer presentation and a more logical development, this edition differs from the book in many ways:

- The order of Chapters 2 and 3 is reversed. In this edition, we develop mathematical descriptions of systems before reviewing linear algebra. The chapter on stability is moved earlier.
- This edition deals only with real numbers and foregoes the concept of fields. Thus it is mathematically less abstract than the original book. However, all results are still stated as theorems for easy reference.
- In Chapters 4 through 6, we discuss first the time-invariant case and then extend it to the time-varying case, instead of the other way around.
- The discussion of discrete-time systems is expanded.
- In state-space design, Lyapunov equations are employed extensively and multivariable canonical forms are downplayed. This approach is not only easier for classroom presentation but also provides an attractive method for numerical computation.
- The presentation of the polynomial fraction method is streamlined. The method is equated with solving linear algebraic equations. We then discuss pole placement using a one-degree-of-freedom configuration, and model matching using a two-degree-of-freedom configuration.
- Examples using MATLAB are given throughout this new edition.

This edition is geared more for classroom use and engineering applications; therefore, many topics in the original book are deleted, including strict system equivalence, deterministic identification, computational issues, some multivariable canonical forms, and decoupling by state feedback. The polynomial fraction design in the input/output feedback (controller/estimator) configuration is deleted. Instead we discuss design in the two-parameter configuration. This configuration seems to be more suitable for practical application. The eight appendices in the original book are either incorporated into the text or deleted.

The logical sequence of all chapters is as follows:

    Chap. 1-5 => Chap. 6 => Chap. 8
    Chap. 1-5 => Chap. 7 (Sec. 7.1-7.3) => Sec. 9.1-9.3
    Chap. 1-5 => Chap. 7 (Sec. 7.6-7.8) => Sec. 9.4-9.5

In addition, the material in Section 7.9 is needed to study Section 8.6.4. However, Section 8.6.4 may be skipped without loss of continuity. Furthermore, the concepts of controllability and observability in Chapter 6 are useful, but not essential, for the material in Chapter 7. All minimal realizations in Chapter 7 can be checked using the concept of degrees, instead of checking controllability and observability. Therefore Chapters 6 and 7 are essentially independent.

This text provides more than enough material for a one-semester course. A one-semester course at Stony Brook covers Chapters 1 through 6, Sections 8.1-8.5, 7.1-7.2, and 9.1-9.3. Time-varying systems are not covered. Clearly, other arrangements are also possible for a one-semester course. A solutions manual is available from the publisher.

I am indebted to many people in revising this text. Professor Imin Kao and Mr. Juan Ochoa helped me with MATLAB. Professor Zongli Lin and Mr. T. Anantakrishnan read the whole manuscript and made many valuable suggestions. I am grateful to Dean Yacov Shamash, College of Engineering and Applied Sciences, SUNY at Stony Brook, for his encouragement. The revised manuscript was reviewed by Professor Harold Broberg, EET Department, Indiana Purdue University; Professor Peyman Givi, Department of Mechanical and Aerospace Engineering, State University of New York at Buffalo; Professor Mustafa Khammash, Department of Electrical and Computer Engineering, Iowa State University; and Professor B. Ross Barmish, Department of Electrical and Computer Engineering, University of Wisconsin. Their detailed and critical comments prompted me to restructure some sections and to include a number of mechanical vibration problems. I thank them all.

I am indebted to Mr. Bill Zobrist of Oxford University Press, who persuaded me to undertake this revision. The people at Oxford University Press, including Krysia Bebick, Jasmine Urmeneta, Terri O'Prey, and Kristina Della Bartolomea, were most helpful in this undertaking. Finally, I thank my wife, Bih-Jau, for her support during this revision.

Chi-Tsong Chen
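The preface notes that every design procedure in the text can be carried out with any package providing singular-value decomposition, QR decomposition, and solvers for linear algebraic equations and the Lyapunov equation. The book's own examples use MATLAB; as a rough sketch only, the same four kernels can be exercised with NumPy/SciPy (the matrices below are illustrative assumptions, not examples from the text):

```python
# Sketch of the four computational kernels the preface lists, using
# NumPy/SciPy as a stand-in for the book's MATLAB examples. The matrices
# are illustrative only.
import numpy as np
from scipy import linalg

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
Q = -np.eye(2)                      # right-hand side of A P + P A' = Q

# Lyapunov equation: solve A P + P A' = Q for P
P = linalg.solve_continuous_lyapunov(A, Q)

# Singular-value decomposition: A = U diag(s) Vh
U, s, Vh = np.linalg.svd(A)

# QR decomposition: A = Qm R
Qm, R = np.linalg.qr(A)

# Linear algebraic equations: solve A x = b
b = np.array([1.0, 0.0])
x = np.linalg.solve(A, b)
```

Here `solve_continuous_lyapunov(A, Q)` returns the solution of A P + P A' = Q, the continuous-time Lyapunov equation used throughout the state-space design chapters.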
LINEAR SYSTEM THEORY AND DESIGN
Chapter 1
Introduction

1.1 Introduction
The study and design of physical systems can be carried out using empirical methods. We can apply various signals to a physical system and measure its responses. If the performance is not satisfactory, we can adjust some of its parameters or connect to it a compensator to improve its performance. This approach relies heavily on past experience, is carried out by trial and error, and has succeeded in designing many physical systems.

Empirical methods may become unworkable if physical systems are complex or too expensive or too dangerous to be experimented on. In these cases, analytical methods become indispensable. The analytical study of physical systems consists of four parts: modeling, development of mathematical descriptions, analysis, and design. We briefly introduce each of these tasks.

The distinction between physical systems and models is basic in engineering. For example, circuits or control systems studied in any textbook are models of physical systems. A resistor with a constant resistance is a model; it will burn out if the applied voltage is over a limit. This power limitation is often disregarded in its analytical study. An inductor with a constant inductance is again a model; in reality, the inductance may vary with the amount of current flowing through it. Modeling is a very important problem, for the success of the design depends on whether the physical system is modeled properly.

A physical system may have different models depending on the questions asked. It may also be modeled differently in different operational ranges. For example, an electronic amplifier is modeled differently at high and low frequencies. A spaceship can be modeled as a particle in investigating its trajectory; however, it must be modeled as a rigid body in maneuvering. A spaceship may even be modeled as a flexible body when it is connected to a space station. In order to develop a suitable model for a physical system, a thorough understanding of the physical system and its operational range is essential. In this text, we will call a model of a physical system simply a system. Thus a physical system is a device or a collection of devices existing in the real world; a system is a model of a physical system.

Once a system (or model) is selected for a physical system, the next step is to apply various physical laws to develop mathematical equations to describe the system. For example, we apply Kirchhoff's voltage and current laws to electrical systems and Newton's law to mechanical systems. The equations that describe systems may assume many forms:
they may be linear equations, nonlinear equations, integral equations, difference equations, differential equations, or others. Depending on the problem under study, one form of equation may be preferable to another in describing the same system. In conclusion, a system may have different mathematical-equation descriptions just as a physical system may have many different models.

After a mathematical description is obtained, we then carry out analyses, quantitative and/or qualitative. In quantitative analysis, we are interested in the responses of systems excited by certain inputs. In qualitative analysis, we are interested in the general properties of systems, such as stability, controllability, and observability. Qualitative analysis is very important, because design techniques may often evolve from this study.

If the response of a system is unsatisfactory, the system must be modified. In some cases, this can be achieved by adjusting some parameters of the system; in other cases, compensators must be introduced. Note that the design is carried out on the model of the physical system. If the model is properly chosen, then the performance of the physical system should be improved by introducing the required adjustments or compensators. If the model is poor, then the performance of the physical system may not improve and the design is useless. Selecting a model that is close enough to a physical system and yet simple enough to be studied analytically is the most difficult and important problem in system design.
1.2 Overview

The study of systems consists of four parts: modeling, setting up mathematical equations, analysis, and design. Developing models for physical systems requires knowledge of the particular field and some measuring devices. For example, to develop models for transistors requires a knowledge of quantum physics and some laboratory setup. Developing models for automobile suspension systems requires actual testing and measurements; it cannot be achieved by use of pencil and paper. Computer simulation certainly helps but cannot replace actual measurements. Thus the modeling problem should be studied in connection with the specific field and cannot be properly covered in this text. In this text, we shall assume that models of physical systems are available to us.

The systems to be studied in this text are limited to linear systems. Using the concept of linearity, we develop in Chapter 2 the fact that every linear system can be described by

    y(t) = ∫_{t0}^{t} G(t, τ)u(τ) dτ        (1.1)

This equation describes the relationship between the input u and output y and is called the input-output or external description. If a linear system is lumped as well, then it can also be described by

    ẋ(t) = A(t)x(t) + B(t)u(t)              (1.2)
    y(t) = C(t)x(t) + D(t)u(t)              (1.3)

Equation (1.2) is a set of first-order differential equations and Equation (1.3) is a set of algebraic equations. They are called the internal description of linear systems. Because the vector x is called the state, the set of two equations is called the state-space or, simply, the state equation.

If a linear system has, in addition, the property of time invariance, then Equations (1.1) through (1.3) reduce to

    y(t) = ∫_{0}^{t} G(t − τ)u(τ) dτ        (1.4)

and

    ẋ(t) = Ax(t) + Bu(t)                    (1.5)
    y(t) = Cx(t) + Du(t)                    (1.6)

For this class of linear time-invariant systems, the Laplace transform is an important tool in analysis and design. Applying the Laplace transform to (1.4) yields

    ŷ(s) = Ĝ(s)û(s)                         (1.7)

where a variable with a circumflex denotes the Laplace transform of the variable. The function Ĝ(s) is called the transfer matrix. Both (1.4) and (1.7) are input-output or external descriptions. The former is said to be in the time domain and the latter in the frequency domain.

Equations (1.1) through (1.6) are called continuous-time equations because their time variable t is a continuum defined at every time instant in (−∞, ∞). If the time is defined only at discrete instants, then the corresponding equations are called discrete-time equations. This text is devoted to the analysis and design centered around (1.1) through (1.7) and their discrete-time counterparts.

We briefly discuss the contents of each chapter. In Chapter 2, after introducing the aforementioned equations from the concepts of lumpedness, linearity, and time invariance, we show how these equations can be developed to describe systems. Chapter 3 reviews linear algebraic equations, the Lyapunov equation, and other pertinent topics that are essential for this text. We also introduce the Jordan form because it will be used to establish a number of results. We study in Chapter 4 solutions of the state-space equations in (1.2) and (1.5). Different analyses may lead to different state equations that describe the same system. Thus we introduce the concept of equivalent state equations. The basic relationship between state-space equations and transfer matrices is also established.

Chapter 5 introduces the concepts of bounded-input bounded-output (BIBO) stability, marginal stability, and asymptotic stability. Every system must be designed to be stable; otherwise, it may burn out or disintegrate. Therefore stability is a basic system concept. We also introduce the Lyapunov theorem to check asymptotic stability. Chapter 6 introduces the concepts of controllability and observability. They are essential in studying the internal structure of systems. A fundamental result is that the transfer matrix describes only the controllable and observable part of a state equation. Chapter 7 studies minimal realizations and introduces coprime polynomial fractions. We show how to obtain coprime fractions by solving sets of linear algebraic equations. The equivalence of controllable and observable state equations and coprime polynomial fractions is established.

The last two chapters discuss the design of time-invariant systems. We use controllable and observable state equations to carry out design in Chapter 8 and use coprime polynomial fractions in Chapter 9. We show that, under the controllability condition, all eigenvalues of a system can be arbitrarily assigned by introducing state feedback. If a state equation is observable, full-dimensional and reduced-dimensional state estimators, with any desired
eigenvalues, can be constructed to generate estimates of the state. We also establish the separation property. In Chapter 9, we discuss pole placement, model matching, and their applications in tracking, disturbance rejection, and decoupling. We use the unity-feedback configuration in pole placement and the two-parameter configuration in model matching. In our design, no control performances such as rise time, settling time, and overshoot are considered; neither are constraints on control signals and on the degree of compensators. Therefore this is not a control text per se. However, all results are basic and useful in designing linear time-invariant control systems.
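The relationship between the time-invariant state equation (1.5)-(1.6) and the transfer matrix in (1.7) can be made concrete numerically via Ĝ(s) = C(sI − A)⁻¹B + D, the formula established in Chapter 4. The sketch below, in Python rather than the MATLAB used by the text, evaluates this formula for an illustrative two-dimensional example (the matrices are assumptions, not taken from the book):

```python
# Numerical check, not from the text: the transfer matrix in (1.7) computed
# from an assumed time-invariant state equation (1.5)-(1.6) as
# G(s) = C (sI - A)^{-1} B + D.
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def transfer_matrix(s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D at the complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

# For these particular matrices, G(s) = 1 / (s^2 + 3s + 2) = 1 / ((s+1)(s+2)),
# so G(1) should equal 1/6.
G1 = transfer_matrix(1.0)
```

Using `np.linalg.solve` instead of forming the inverse of sI − A explicitly is the usual numerically preferable choice, in the spirit of the text's emphasis on computation.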
Chapter 2
Mathematical Descriptions of Systems
2.1 Introduction
The class of systems studied in this text is assumed to have some input terminals and output terminals as shown in Fig. 2.1. We assume that if an excitation or input is applied to the input terminals, a unique response or output signal can be measured at the output terminals. This unique relationship between the excitation and response, input and output, or cause and effect is essential in defining a system. A system with only one input terminal and only one output terminal is called a single-variable system or a single-input single-output (SISO) system. A system with two or more input terminals and/or two or more output terminals is called a multivariable system. More specifically, we can call a system a multi-input multi-output (MIMO) system if it has two or more input terminals and output terminals, a single-input multi-output (SIMO) system if it has one input terminal and two or more output terminals.
Figure 2.1 System.
Figure 2.2 Network with 3 state variables.
A system is called a continuous-time system if it accepts continuous-time signals as its input and generates continuous-time signals as its output. The input will be denoted by lowercase italic u(t) for single input or by boldface u(t) for multiple inputs. If the system has p input terminals, then u(t) is a p × 1 vector or u = [u1 u2 ··· up]', where the prime denotes the transpose. Similarly, the output will be denoted by y(t) or y(t). The time t is assumed to range from −∞ to ∞. A system is called a discrete-time system if it accepts discrete-time signals as its input and generates discrete-time signals as its output. All discrete-time signals in a system will be assumed to have the same sampling period T. The input and output will be denoted by u[k] := u(kT) and y[k] := y(kT), where k denotes discrete-time instant and is an integer ranging from −∞ to ∞. They become boldface for multiple inputs and multiple outputs.
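The sampling convention u[k] := u(kT) above can be sketched in a few lines (an illustration only; the signal and the value of T are assumptions, not from the text):

```python
# Illustration, not from the text: a discrete-time signal u[k] := u(kT)
# obtained by evaluating a continuous-time signal at the instants kT.
import numpy as np

T = 0.5                              # assumed sampling period
u = lambda t: np.sin(t)              # an assumed continuous-time input
k = np.arange(0, 8)                  # discrete-time instants k = 0, 1, ..., 7
u_k = u(k * T)                       # the discrete-time signal u[k] = u(kT)
```

For example, u[2] is simply the continuous-time signal evaluated at t = 2T.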
2.1.1 Causality and Lumpedness
A system is called a memoryless system if its output y(t0) depends only on the input applied at t0; it is independent of the input applied before or after t0. This will be stated succinctly as follows: current output of a memoryless system depends only on current input; it is independent of past and future inputs. A network that consists of only resistors is a memoryless system. Most systems, however, have memory. By this we mean that the output at t0 depends on u(t) for t < t0, t = t0, and t > t0. That is, current output of a system with memory may depend on past, current, and future inputs.

A system is called a causal or nonanticipatory system if its current output depends on past and current inputs but not on future input. If a system is not causal, then its current output will depend on future input. In other words, a noncausal system can predict or anticipate what will be applied in the future. No physical system has such capability. Therefore every physical system is causal and causality is a necessary condition for a system to be built or implemented in the real world. This text studies only causal systems.

Current output of a causal system is affected by past input. How far back in time will the past input affect the current output? Generally, the time should go all the way back to minus infinity. In other words, the input from −∞ to time t has an effect on y(t). Tracking u(t) from t = −∞ is, if not impossible, very inconvenient. The concept of state can deal with this problem.
Definition 2.1  The state x(t0) of a system at time t0 is the information at t0 that, together with the input u(t), for t ≥ t0, determines uniquely the output y(t) for all t ≥ t0.

By definition, if we know the state at t0, there is no more need to know the input u(t) applied before t0 in determining the output y(t) after t0. Thus in some sense, the state summarizes the effect of past input on future output. For the network shown in Fig. 2.2, if we know the voltages x1(t0) and x2(t0) across the two capacitors and the current x3(t0) passing through the inductor, then for any input applied on and after t0, we can determine uniquely the output for t ≥ t0. Thus the state of the network at time t0 is

    x(t0) = [x1(t0) x2(t0) x3(t0)]'

It is a 3 × 1 vector. The entries of x are called state variables. Thus, in general, we may consider the initial state simply as a set of initial conditions. Using the state at t0, we can express the input and output of a system as

    {x(t0), u(t), t ≥ t0} → y(t), t ≥ t0        (2.1)

It means that the output is partly excited by the initial state at t0 and partly by the input applied at and after t0. In using (2.1), there is no more need to know the input applied before t0 all the way back to −∞. Thus (2.1) is easier to track and will be called a state-input-output pair. A system is said to be lumped if its number of state variables is finite or its state is a finite vector. The network in Fig. 2.2 is clearly a lumped system; its state consists of three numbers. A system is called a distributed system if its state has infinitely many state variables. The transmission line is the most well known distributed system. We give one more example.

EXAMPLE 2.1  Consider the unit-time delay system defined by

    y(t) = u(t − 1)

The output is simply the input delayed by one second. In order to determine {y(t), t ≥ t0} from {u(t), t ≥ t0}, we need the information {u(t), t0 − 1 ≤ t < t0}. Therefore the initial state of the system is {u(t), t0 − 1 ≤ t < t0}. There are infinitely many points in {t0 − 1 ≤ t < t0}. Thus the unit-time delay system is a distributed system.

2.2 Linear Systems

A system is called a linear system if for every t0 and any two state-input-output pairs

    {xi(t0), ui(t), t ≥ t0} → yi(t), t ≥ t0

for i = 1, 2, we have
    {x1(t0) + x2(t0), u1(t) + u2(t), t ≥ t0} → y1(t) + y2(t), t ≥ t0        (additivity)

and

    {αx1(t0), αu1(t), t ≥ t0} → αy1(t), t ≥ t0        (homogeneity)

for any real constant α. The first property is called the additivity property; the second, the homogeneity property. These two properties can be combined as

    {α1x1(t0) + α2x2(t0), α1u1(t) + α2u2(t), t ≥ t0} → α1y1(t) + α2y2(t), t ≥ t0

for any real constants α1 and α2, and is called the superposition property. A system is called a nonlinear system if the superposition property does not hold.

If the input u(t) is identically zero for t ≥ t0, then the output will be excited exclusively by the initial state x(t0). This output is called the zero-input response and will be denoted by yzi or

    {x(t0), u(t) ≡ 0, t ≥ t0} → yzi(t), t ≥ t0

If the initial state x(t0) is zero, then the output will be excited exclusively by the input. This output is called the zero-state response and will be denoted by yzs or

    {x(t0) = 0, u(t), t ≥ t0} → yzs(t), t ≥ t0

The additivity property implies

    Output due to {x(t0), u(t)} = output due to {x(t0), u(t) ≡ 0} + output due to {x(t0) = 0, u(t)}

or

    Response = zero-input response + zero-state response

Thus the response of every linear system can be decomposed into the zero-state response and the zero-input response. Furthermore, the two responses can be studied separately and their sum yields the complete response. For nonlinear systems, the complete response can be very different from the sum of the zero-input response and zero-state response. Therefore we cannot separate the zero-input and zero-state responses in studying nonlinear systems.

If a system is linear, then the additivity and homogeneity properties apply to zero-state responses. To be more specific, if x(t0) = 0, then the output will be excited exclusively by the input and the state-input-output equation can be simplified as {ui → yi}. If the system is linear, then we have {u1 + u2 → y1 + y2} and {αui → αyi} for all α and all ui. A similar remark applies to zero-input responses of any linear system.

Input-output description  We develop a mathematical equation to describe the zero-state response of linear systems. In this study, the initial state is assumed implicitly to be zero and the output is excited exclusively by the input. We consider first SISO linear systems. Let δΔ(t − t1) be the pulse shown in Fig. 2.3. It has width Δ and height 1/Δ and is located at time t1. Then every input u(t) can be approximated by a sequence of pulses as shown in Fig. 2.4. The pulse in Fig. 2.3 has height 1/Δ; thus δΔ(t − ti)Δ has height 1 and the leftmost pulse in Fig. 2.4 with height u(ti) can be expressed as u(ti)δΔ(t − ti)Δ. Consequently, the input u(t) can be expressed symbolically as

    u(t) ≈ Σi u(ti)δΔ(t − ti)Δ

Figure 2.3 Pulse at t1.

Let gΔ(t, ti) be the output at time t excited by the pulse u(t) = δΔ(t − ti) applied at time ti. Then we have

    δΔ(t − ti) → gΔ(t, ti)
    δΔ(t − ti)u(ti)Δ → gΔ(t, ti)u(ti)Δ        (homogeneity)
    Σi δΔ(t − ti)u(ti)Δ → Σi gΔ(t, ti)u(ti)Δ        (additivity)

Thus the output y(t) excited by the input u(t) can be approximated by

    y(t) ≈ Σi gΔ(t, ti)u(ti)Δ        (2.2)

Figure 2.4 Approximation of input signal.
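The approximation in (2.2) can be checked numerically. The sketch below is an illustration, not an example from the text: it assumes a causal LTI system whose impulse response is g(t, τ) = e^(−(t−τ)) for t ≥ τ and a unit-step input, for which the exact zero-state response is 1 − e^(−t). The pulse sum approaches the exact value as Δ shrinks:

```python
# Illustration of (2.2), not from the text: approximate the zero-state
# response by a sum of (narrow-pulse) responses g(t, t_i) u(t_i) Delta.
# Assumed system: impulse response g(t, tau) = exp(-(t - tau)), t >= tau.
import numpy as np

def approx_response(t, u, delta):
    """Approximate y(t) by the sum in (2.2), replacing each narrow-pulse
    response g_Delta(t, t_i) with the impulse response g(t, t_i)."""
    t_i = np.arange(0.0, t, delta)       # pulse locations in [0, t)
    g = np.exp(-(t - t_i))               # causal: only t_i <= t contribute
    return float(np.sum(g * u(t_i) * delta))

u = lambda t: np.ones_like(t)            # unit-step input
exact = 1.0 - np.exp(-2.0)               # closed-form y(2) for this system

coarse = approx_response(2.0, u, delta=0.1)
fine = approx_response(2.0, u, delta=0.001)
# The error of the sum shrinks roughly in proportion to delta.
```

This is exactly the limiting process described next in the text: as Δ → 0 the sum becomes the convolution integral in (2.3).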
10
MATHEMATICAL
DESCRIPTIONS
OF SYSTEMS
2.3
Linear TimeInvariant
(LTI) Systems
11
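As a quick numerical aside (not part of the text, which uses MATLAB), the pulse approximation in (2.2) can be checked in Python for a hypothetical LTI system with impulse response g(t) = e^{−t} and input u(t) = sin(t); the pulse sum should agree with the convolution integral it limits to.

```python
import numpy as np

# Sketch of the pulse approximation (2.2): y(t) ~ sum_i g(t - t_i) u(t_i) Delta.
# The system (g = exp(-t)) and input (u = sin t) are made up for illustration.

delta = 0.001                          # pulse width Delta
t = np.arange(0.0, 5.0, delta)
u = np.sin(t)                          # arbitrary input
g = np.exp(-t)                         # causal impulse response

# y(t_k) ~ sum_{i<=k} g(t_k - t_i) u(t_i) Delta  (discretized convolution)
y_pulse = np.convolve(g, u)[:len(t)] * delta

# Evaluate the convolution integral at t = 3 by a Riemann sum for comparison
k = int(3.0 / delta)
y_integral = np.sum(np.exp(-(t[k] - t[:k])) * u[:k]) * delta
```

As Δ shrinks, the two numbers coincide, which is exactly the limiting argument that turns (2.2) into the integral description (2.3).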
Now if Δ approaches zero, the pulse δ_Δ(t − ti) becomes an impulse at ti, denoted by δ(t − ti), and the corresponding output will be denoted by g(t, ti). As Δ approaches zero, the approximation in (2.2) becomes an equality, the summation becomes an integration, the discrete ti becomes a continuum and can be replaced by τ, and Δ can be written as dτ. Thus (2.2) becomes

    y(t) = ∫_{−∞}^{∞} g(t, τ) u(τ) dτ        (2.3)

Note that g(t, τ) is a function of two variables. The second variable denotes the time at which the impulse input is applied; the first variable denotes the time at which the output is observed. Because g(t, τ) is the response excited by an impulse, it is called the impulse response. If a system is causal, the output will not appear before an input is applied. Thus we have

    causal  ⟺  g(t, τ) = 0 for t < τ

A system is said to be relaxed at t0 if its initial state at t0 is 0. In this case, the output y(t), for t ≥ t0, is excited exclusively by the input u(t) for t ≥ t0; thus the lower limit of the integration in (2.3) can be replaced by t0. If the system is causal as well, then g(t, τ) = 0 for t < τ and the upper limit of the integration in (2.3) can be replaced by t. In conclusion, every linear system that is causal and relaxed at t0 can be described by

    y(t) = ∫_{t0}^{t} g(t, τ) u(τ) dτ        (2.4)

In this derivation, the condition of lumpedness is not used. Therefore any lumped or distributed linear system has such an input-output description. This description is developed using only the additivity and homogeneity properties; therefore every linear system, be it an electrical system, a mechanical system, a chemical process, or any other system, has such a description.

If a linear system has p input terminals and q output terminals, then (2.4) can be extended to

    y(t) = ∫_{t0}^{t} G(t, τ) u(τ) dτ        (2.5)

where

    G(t, τ) = [ g11(t, τ)  g12(t, τ)  ...  g1p(t, τ) ]
              [ g21(t, τ)  g22(t, τ)  ...  g2p(t, τ) ]
              [    ...        ...              ...   ]
              [ gq1(t, τ)  gq2(t, τ)  ...  gqp(t, τ) ]

and gij(t, τ) is the response at time t at the ith output terminal due to an impulse applied at time τ at the jth input terminal, the inputs at all other terminals being identically zero. That is, gij(t, τ) is the impulse response between the jth input terminal and the ith output terminal. Thus G is called the impulse response matrix of the system. We stress once again that if a system is described by (2.5), the system is linear, relaxed at t0, and causal.

State-space description   Every linear lumped system can be described by a set of equations of the form

    ẋ(t) = A(t)x(t) + B(t)u(t)        (2.6)
    y(t) = C(t)x(t) + D(t)u(t)        (2.7)

where ẋ := dx/dt.¹ For a p-input q-output system, u is a p × 1 vector and y is a q × 1 vector. If the system has n state variables, then x is an n × 1 vector. In order for the matrices in (2.6) and (2.7) to be compatible, A, B, C, and D must be n × n, n × p, q × n, and q × p matrices. The four matrices are all functions of time, or time-varying matrices. Equation (2.6) actually consists of a set of n first-order differential equations; Equation (2.7) consists of q algebraic equations. The set of two equations will be called an n-dimensional state-space equation or, simply, state equation. For distributed systems, the dimension is infinity and the two equations in (2.6) and (2.7) are not used.

The input-output description in (2.5) was developed from the linearity condition. The development of the state-space equation from the linearity condition, however, is not as simple and will not be attempted. We will simply accept it as a fact.

1. We use A := B to denote that A, by definition, equals B. We use A =: B to denote that B, by definition, equals A.

2.3 Linear Time-Invariant (LTI) Systems

A system is said to be time invariant if for every state-input-output pair

    {x(t0), u(t) for t ≥ t0}  →  y(t),  t ≥ t0

and any T, we have

    {x(t0 + T), u(t − T) for t ≥ t0 + T}  →  y(t − T),  t ≥ t0 + T        (time shifting)

It means that if the initial state is shifted to time t0 + T and the same input waveform is applied from t0 + T instead of from t0, then the output waveform will be the same except that it starts to appear from time t0 + T. In other words, if the initial state and the input are the same, no matter at what time they are applied, the output waveform will always be the same. Therefore, for time-invariant systems, we can always assume, without loss of generality, that t0 = 0. If a system is not time invariant, it is said to be time varying.

Time invariance is defined for systems, not for signals. Signals are mostly time varying. If a signal is time invariant, such as u(t) = 1 for all t, then it is a very simple or trivial signal. The characteristics of time-invariant systems must be independent of time. For example, the network in Fig. 2.2 is time invariant if Ri, Ci, and Li are constants. Some physical systems must be modeled as time-varying systems. For example, a burning rocket is a time-varying system, because its mass decreases rapidly with time. Although the performance of an automobile or a TV set may deteriorate over a long period of time, its characteristics do not change appreciably in the first couple of years. Thus a large number of physical systems can be modeled as time-invariant systems over a limited time period.
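As an aside (not in the text), the time-varying state equation (2.6)-(2.7) can be integrated numerically by a forward-Euler sketch; the particular A(t), B(t), C(t), D(t) below are made up for illustration.

```python
import numpy as np

# Minimal forward-Euler integration of xdot = A(t)x + B(t)u(t),
# y = C(t)x + D(t)u(t) -- the state equation (2.6)-(2.7).

def simulate(A, B, C, D, u, x0, t0, t1, dt=1e-3):
    ts = np.arange(t0, t1, dt)
    x = np.array(x0, dtype=float)
    ys = []
    for t in ts:
        ys.append(C(t) @ x + D(t) @ u(t))     # output equation (2.7)
        x = x + dt * (A(t) @ x + B(t) @ u(t)) # Euler step on (2.6)
    return ts, np.array(ys)

# A scalar time-varying example: xdot = -(1 + 0.5 sin t) x + u, y = x
ts, ys = simulate(
    A=lambda t: np.array([[-(1.0 + 0.5*np.sin(t))]]),
    B=lambda t: np.array([[1.0]]),
    C=lambda t: np.array([[1.0]]),
    D=lambda t: np.array([[0.0]]),
    u=lambda t: np.array([1.0]),
    x0=[0.0], t0=0.0, t1=5.0)
```

Because A depends on t, no transfer function exists for this system; time-domain integration of the state equation is the natural tool, which foreshadows the remark later in the chapter that the Laplace transform offers no advantage for time-varying systems.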
Input-output description   The zero-state response of a linear system can be described by (2.4). Now if the system is time invariant as well, then we have²

    g(t, τ) = g(t + T, τ + T) = g(t − τ, 0) =: g(t − τ)

for any T. Thus (2.4) reduces to

    y(t) = ∫_{0}^{t} g(t − τ) u(τ) dτ = ∫_{0}^{t} g(τ) u(t − τ) dτ        (2.8)

where we have replaced t0 by 0. The second equality can easily be verified by changing the variable. The integration in (2.8) is called a convolution integral. Unlike the time-varying case, where g is a function of two variables, g is a function of a single variable in the time-invariant case. By definition, g(t) = g(t − 0) is the output at time t due to an impulse input applied at time 0. The condition for a linear time-invariant system to be causal is g(t) = 0 for t < 0.

2. Note that g(t, τ) and g(t − τ) are two different functions. However, for convenience, the same symbol g is used.

EXAMPLE 2.2   The unit-time delay system studied in Example 2.1 is a device whose output equals the input delayed by 1 second. If we apply the impulse δ(t) at the input terminal, the output is δ(t − 1). Thus the impulse response of the system is δ(t − 1).

EXAMPLE 2.3   Consider the unity-feedback system shown in Fig. 2.5(a). It consists of a multiplier with gain a and a unit-time delay element. It is a SISO system. Let r(t) be the input of the feedback system. If r(t) = δ(t), then the output is the impulse response of the feedback system and equals

    gf(t) = aδ(t − 1) + a²δ(t − 2) + a³δ(t − 3) + ⋯ = Σ_{i=1}^{∞} aⁱ δ(t − i)        (2.9)

Let r(t) be any input with r(t) ≡ 0 for t < 0; then the output is given by

    y(t) = ∫_{0}^{t} gf(t − τ) r(τ) dτ = Σ_{i=1}^{∞} aⁱ ∫_{0}^{t} δ(t − τ − i) r(τ) dτ = Σ_{i=1}^{∞} aⁱ r(t − i)

Because the unit-time delay system is distributed, so is the feedback system.

(Figure 2.5: Positive and negative feedback systems, each consisting of a gain and a unit-time delay element.)

Transfer-function matrix   The Laplace transform is an important tool in the study of linear time-invariant (LTI) systems. Let ŷ(s) be the Laplace transform of y(t); that is,

    ŷ(s) = ∫_{0}^{∞} y(t) e^{−st} dt

Throughout this text, we use a variable with a circumflex to denote the Laplace transform of the variable. For causal systems, we have g(t) = 0 for t < 0, or g(t − τ) = 0 for τ > t; thus the upper integration limit in (2.8) can be replaced by ∞. Substituting (2.8) and interchanging the order of integrations, we obtain

    ŷ(s) = ∫_{t=0}^{∞} ( ∫_{τ=0}^{∞} g(t − τ) u(τ) dτ ) e^{−st} dt
         = ∫_{τ=0}^{∞} ( ∫_{t=0}^{∞} g(t − τ) e^{−s(t−τ)} dt ) u(τ) e^{−sτ} dτ

which becomes, after introducing the new variable v = t − τ,

    ŷ(s) = ∫_{τ=0}^{∞} ( ∫_{v=−τ}^{∞} g(v) e^{−sv} dv ) u(τ) e^{−sτ} dτ

Again using the causality condition to change the lower integration limit inside the parentheses from v = −τ to v = 0, the inner integration becomes independent of τ and the double integration becomes

    ŷ(s) = ( ∫_{0}^{∞} g(v) e^{−sv} dv ) ( ∫_{0}^{∞} u(τ) e^{−sτ} dτ )

or

    ŷ(s) = ĝ(s) û(s)        (2.10)

where

    ĝ(s) = ∫_{0}^{∞} g(t) e^{−st} dt

is called the transfer function of the system. Thus the transfer function is the Laplace transform of the impulse response and, conversely, the impulse response is the inverse Laplace transform of the transfer function. We see that the Laplace transform transforms the convolution integral in (2.8) into the algebraic equation in (2.10). In analysis and design, it is simpler to use algebraic equations than convolutions; thus the convolution in (2.8) will rarely be used in the remainder of this text.

For a p-input q-output system, (2.10) can be extended as

    [ ŷ1(s) ]   [ ĝ11(s)  ĝ12(s)  ...  ĝ1p(s) ] [ û1(s) ]
    [ ŷ2(s) ] = [ ĝ21(s)  ĝ22(s)  ...  ĝ2p(s) ] [ û2(s) ]
    [  ...  ]   [   ...      ...          ... ] [  ...  ]
    [ ŷq(s) ]   [ ĝq1(s)  ĝq2(s)  ...  ĝqp(s) ] [ ûp(s) ]

or

    ŷ(s) = Ĝ(s) û(s)        (2.11)
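The response y(t) = Σ aⁱ r(t − i) of Example 2.3 can be checked numerically. The sketch below (an aside, not in the text) simulates the loop of Fig. 2.5(a) at integer times, where the unit delay is exactly one sample; with an impulse input the sampled output should reproduce the weights aⁱ of (2.9).

```python
import numpy as np

# Feedback loop of Example 2.3: gain a in series with a unit-time delay,
# positive feedback. At integer times t = 0, 1, 2, ... the loop gives
# y(t) = a * (r(t-1) + y(t-1)).

a = 0.5
n = 10

def feedback_response(r):
    y = np.zeros(len(r))
    for t in range(1, len(r)):
        y[t] = a * (r[t-1] + y[t-1])
    return y

r = np.zeros(n)
r[0] = 1.0                     # unit impulse applied at t = 0
y = feedback_response(r)

# (2.9) predicts the weight a^i at time t = i, for i >= 1
expected = np.array([0.0] + [a**i for i in range(1, n)])
```

For |a| < 1 the weights decay geometrically, which is the same geometric series that is summed in closed form in Example 2.5.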
where ĝij(s) is the transfer function from the jth input to the ith output. The q × p matrix Ĝ(s) is called the transfer-function matrix or, simply, transfer matrix of the system.

EXAMPLE 2.4   Consider the unit-time delay system studied in Example 2.2. Its impulse response is δ(t − 1). Therefore its transfer function is

    ĝ(s) = L[δ(t − 1)] = ∫_{0}^{∞} δ(t − 1) e^{−st} dt = e^{−st}|_{t=1} = e^{−s}

This transfer function is an irrational function of s.

EXAMPLE 2.5   Consider the feedback system shown in Fig. 2.5(a). The transfer function of the unit-time delay element is e^{−s}. The transfer function from r to y can be computed directly from the block diagram as

    ĝf(s) = a e^{−s} / (1 − a e^{−s})        (2.12)

This can also be obtained by taking the Laplace transform of the impulse response, which was computed in (2.9) as

    gf(t) = Σ_{i=1}^{∞} aⁱ δ(t − i)

Because L[δ(t − i)] = e^{−is}, the Laplace transform of gf(t) is

    ĝf(s) = L[gf(t)] = Σ_{i=1}^{∞} aⁱ e^{−is} = a e^{−s} Σ_{i=0}^{∞} (a e^{−s})ⁱ

Using

    Σ_{i=0}^{∞} rⁱ = 1/(1 − r)   for |r| < 1

we can express the infinite series in closed form, which is the same as (2.12).

The transfer function in (2.12) is an irrational function of s. This is so because the feedback system is a distributed system. If a linear time-invariant system is lumped, then its transfer function will be a rational function of s. We study mostly lumped systems; thus the transfer functions we will encounter are mostly rational functions of s. Every rational transfer function can be expressed as ĝ(s) = N(s)/D(s), where N(s) and D(s) are polynomials of s. Let us use deg to denote the degree of a polynomial. Then ĝ(s) can be classified as follows:

    ĝ(s) proper           ⟺  deg D(s) ≥ deg N(s)  ⟺  ĝ(∞) = zero or nonzero constant.
    ĝ(s) strictly proper  ⟺  deg D(s) > deg N(s)  ⟺  ĝ(∞) = 0.
    ĝ(s) biproper         ⟺  deg D(s) = deg N(s)  ⟺  ĝ(∞) = nonzero constant.
    ĝ(s) improper         ⟺  deg D(s) < deg N(s)  ⟺  |ĝ(∞)| = ∞.

Improper rational transfer functions will amplify high-frequency noise, which often exists in the real world; therefore improper rational transfer functions rarely arise in practice.

A real or complex number λ is called a pole of the proper transfer function ĝ(s) = N(s)/D(s) if |ĝ(λ)| = ∞; a zero if ĝ(λ) = 0. If N(s) and D(s) are coprime, that is, have no common factors of degree 1 or higher, then all roots of N(s) are the zeros of ĝ(s), and all roots of D(s) are the poles of ĝ(s). In terms of poles and zeros, the transfer function can be expressed as

    ĝ(s) = k (s − z1)(s − z2) ⋯ (s − zm) / [(s − p1)(s − p2) ⋯ (s − pn)]

This is called the zero-pole-gain form. In MATLAB, this form can be obtained from the transfer function by calling [z, p, k] = tf2zp(num, den).

A rational matrix Ĝ(s) is said to be proper if its every entry is proper or if Ĝ(∞) is a zero or nonzero constant matrix; it is strictly proper if its every entry is strictly proper or if Ĝ(∞) is a zero matrix. If a rational matrix Ĝ(s) is square and if both Ĝ(s) and Ĝ⁻¹(s) are proper, then Ĝ(s) is said to be biproper. We call λ a pole of Ĝ(s) if it is a pole of some entry of Ĝ(s). Thus every pole of every entry of Ĝ(s) is a pole of Ĝ(s). There are a number of ways of defining zeros for Ĝ(s). We call λ a blocking zero if it is a zero of every nonzero entry of Ĝ(s). A more useful definition is the transmission zero, which will be introduced in Chapter 9.

State-space equation   Every linear time-invariant lumped system can be described by a set of equations of the form

    ẋ(t) = Ax(t) + Bu(t)
    y(t) = Cx(t) + Du(t)        (2.13)

For a system with p inputs, q outputs, and n state variables, A, B, C, and D are, respectively, n × n, n × p, q × n, and q × p constant matrices. Applying the Laplace transform to (2.13) yields

    s x̂(s) − x(0) = A x̂(s) + B û(s)
    ŷ(s) = C x̂(s) + D û(s)

which implies

    x̂(s) = (sI − A)⁻¹ x(0) + (sI − A)⁻¹ B û(s)        (2.14)
    ŷ(s) = C(sI − A)⁻¹ x(0) + C(sI − A)⁻¹ B û(s) + D û(s)        (2.15)

They are algebraic equations. Given x(0) and û(s), x̂(s) and ŷ(s) can be computed algebraically from (2.14) and (2.15). Their inverse Laplace transforms yield the time responses x(t) and y(t). The equations also reveal the fact that the response of a linear system can be decomposed as the zero-state response and the zero-input response. If the initial state x(0) is zero, then (2.15) reduces to

    ŷ(s) = [C(sI − A)⁻¹ B + D] û(s)

Comparing this with (2.11) yields

    Ĝ(s) = C(sI − A)⁻¹ B + D        (2.16)

This relates the input-output (or transfer matrix) and state-space descriptions. The functions tf2ss and ss2tf in MATLAB compute one description from the other. They compute only the SISO and SIMO cases. For example, [num, den] = ss2tf(a, b, c, d, 1) computes the transfer matrix from the first input to all outputs or, equivalently, the first column of Ĝ(s). If the last argument 1 in ss2tf(a, b, c, d, 1) is replaced by 3, then the function generates the third column of Ĝ(s).

To conclude this section, we mention that the Laplace transform is not used in studying linear time-varying systems. The Laplace transform of g(t, τ) is a function of two variables and L[A(t)x(t)] ≠ L[A(t)] L[x(t)]; thus the Laplace transform does not offer any advantage and is not used in studying time-varying systems.
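Outside MATLAB, the same computation is available in SciPy; the sketch below (an aside with a made-up second-order system, not one of the book's examples) evaluates (2.16) via `scipy.signal.ss2tf` and confirms that the denominator is the characteristic polynomial det(sI − A).

```python
import numpy as np
from scipy import signal

# Ghat(s) = C (sI - A)^{-1} B + D, computed for an illustrative system.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# scipy's ss2tf plays the role of MATLAB's ss2tf(a, b, c, d, 1);
# input=0 selects the first input column.
num, den = signal.ss2tf(A, B, C, D, input=0)

# Here ghat(s) = 1/(s^2 + 3s + 2), so den should be [1, 3, 2].
```

Evaluating num/den at a test point and comparing with a direct evaluation of C(sI − A)⁻¹B + D is a cheap sanity check when transcribing state matrices by hand.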
2.3.1 Op-Amp Circuit Implementation

Every linear time-invariant (LTI) state-space equation can be implemented using an operational amplifier (op-amp) circuit. Figure 2.6 shows two standard op-amp circuit elements. All inputs are connected, through resistors, to the inverting terminal; not shown are the grounded noninverting terminal and the power supply. If the feedback branch is a resistor, as shown in Fig. 2.6(a), then the output of the element is −(a x1 + b x2 + c x3). If the feedback branch is a capacitor with capacitance C and RC = 1, as shown in Fig. 2.6(b), and if the output is assigned as x, then ẋ = −(a v1 + b v2 + c v3). We call the first element an adder and the second element an integrator. Actually, the adder functions also as a multiplier, and the integrator functions also as a multiplier and adder. If we use only one input, say x1, in Fig. 2.6(a), then the output equals −a x1, and the element can be used as an inverter with gain a.

(Figure 2.6: Two op-amp circuit elements: (a) adder, with output −(ax1 + bx2 + cx3); (b) integrator, with RC = 1 and output x satisfying ẋ = −(av1 + bv2 + cv3).)

Now we use an example to show that every LTI state-space equation can be implemented using the two types of elements in Fig. 2.6. Consider the state equation

    ẋ(t) = [ −2   0.3 ] x(t) + [ −2 ] u(t)        (2.17)
            [  1   −8  ]        [  0 ]

    y(t) = [ −2   3 ] x(t) − 5 u(t)        (2.18)

It has dimension 2, and we need two integrators to implement it. We have the freedom of choosing the output of each integrator as +xi or −xi. Suppose we assign the output of the left-hand-side (LHS) integrator as x1 and the output of the right-hand-side (RHS) integrator as x2, as shown in Fig. 2.7. Then the input of the LHS integrator should be, from the first equation of (2.17), ẋ1 = −2x1 + 0.3x2 − 2u, and it is connected as shown. The input of the RHS integrator should be ẋ2 = x1 − 8x2 and is connected as shown. If the output of the adder is chosen as y, then its input should equal −y = 2x1 − 3x2 + 5u, and it is connected as shown. Thus the state equation in (2.17) and (2.18) can be implemented as shown in Fig. 2.7. Note that there are many ways to implement the same equation. For example, if we assign the outputs of the two integrators in Fig. 2.7 as −x1 and x2, instead of x1 and x2, then we will obtain a different implementation.

In actual operational amplifier circuits, the range of signals is limited, usually 1 or 2 volts below the supplied voltage. If any signal grows outside the range, the circuit will saturate or burn out, and the circuit will not behave as the equation dictates. There is, however, a way to deal with this problem, as we will discuss in Section 4.3.1.

(Figure 2.7: Op-amp implementation of (2.17) and (2.18).)

2.4 Linearization

Most physical systems are nonlinear and time varying. Some of them can be described by nonlinear differential equations of the form
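Whether an op-amp realization stays inside its voltage range depends on the system being stable for the intended inputs. As a quick aside in Python (the signs of the A matrix here follow the reconstruction of (2.17) above from a damaged scan, so treat them as an assumption), one can check that A is a stability matrix by inspecting its eigenvalues.

```python
import numpy as np

# A matrix of the op-amp example (2.17), as reconstructed here.
A = np.array([[-2.0, 0.3], [1.0, -8.0]])

eigvals = np.linalg.eigvals(A)
# Both eigenvalues should have negative real parts, so bounded inputs of
# moderate size produce bounded integrator outputs.
```

With trace −10 and determinant 15.7, both eigenvalues are real and negative, so the circuit's signals decay rather than grow in the absence of input.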
    ẋ(t) = h(x(t), u(t), t)
    y(t) = f(x(t), u(t), t)        (2.19)

where h and f are nonlinear functions. The behavior of such equations can be very complicated and its study is beyond the scope of this text. Some nonlinear equations, however, can be approximated by linear equations under certain conditions.

Suppose for some input function u0(t) and some initial state, x0(t) is the solution of (2.19); that is,

    ẋ0(t) = h(x0(t), u0(t), t)        (2.20)

Now suppose the input is perturbed slightly to become u0(t) + ū(t) and the initial state is also perturbed only slightly. For some nonlinear equations, the corresponding solution may differ from x0(t) only slightly. In this case, the solution can be expressed as x0(t) + x̄(t) with x̄(t) small for all t.³ Under this assumption, we can expand (2.19) as

    ẋ0(t) + dx̄(t)/dt = h(x0(t) + x̄(t), u0(t) + ū(t), t)
                      = h(x0(t), u0(t), t) + (∂h/∂x) x̄ + (∂h/∂u) ū + ⋯        (2.21)

where, for h = [h1 h2 h3]', x = [x1 x2 x3]', and u = [u1 u2]',

    A(t) := ∂h/∂x = [ ∂h1/∂x1  ∂h1/∂x2  ∂h1/∂x3 ]
                    [ ∂h2/∂x1  ∂h2/∂x2  ∂h2/∂x3 ]
                    [ ∂h3/∂x1  ∂h3/∂x2  ∂h3/∂x3 ]

and B(t) := ∂h/∂u is defined similarly. They are called Jacobians. Because A and B are computed along the two time functions x0(t) and u0(t), they are, in general, functions of t. Using (2.20) and neglecting higher powers of x̄ and ū, we can reduce (2.21) to

    dx̄(t)/dt = A(t) x̄(t) + B(t) ū(t)

This is a linear state-space equation. The equation y(t) = f(x(t), u(t), t) can be similarly linearized. This linearization technique is often used in practice to obtain linear equations.

3. This is not true in general. For some nonlinear equations, a very small difference in initial states will generate completely different solutions, yielding the phenomenon of chaos.

2.5 Examples

In this section we use examples to illustrate how to develop transfer functions and state-space equations for physical systems.

EXAMPLE 2.6   Consider the mechanical system shown in Fig. 2.8. It consists of a block with mass m connected to a wall through a spring. We consider the applied force u to be the input and the displacement y from the equilibrium position to be the output. The friction between the floor and the block generally consists of three distinct parts: static friction, Coulomb friction, and viscous friction, as shown in Fig. 2.9. Note that the friction is clearly not a linear function of the velocity ẏ = dy/dt. To simplify the analysis, we disregard the static and Coulomb frictions and consider only the viscous friction. Then the friction becomes linear and can be expressed as k1 ẏ(t), where k1 is the viscous friction coefficient. The characteristic of the spring is shown in Fig. 2.10; it is not linear either. However, if the displacement is limited to (y1, y2) as shown, then the spring can be considered to be linear and the spring force equals k2 y, where k2 is the spring constant. Thus the mechanical system can be modeled as a linear system under linearization and simplification.

(Figure 2.8: Mechanical system. Figure 2.9: (a) Static and Coulomb frictions. (b) Viscous friction. Figure 2.10: Characteristic of spring, with break points bounding the linear range of displacement.)
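In practice the Jacobians in (2.21) are often approximated numerically rather than derived by hand. The sketch below (an aside; the pendulum-like h is made up for illustration) estimates A = ∂h/∂x and B = ∂h/∂u by finite differences about an equilibrium.

```python
import numpy as np

# Finite-difference Jacobians for a hypothetical h(x, u) with
# x = [angle, angular velocity] and scalar input u.

def h(x, u):
    return np.array([x[1], -np.sin(x[0]) + u[0]])

def jacobians(h, x0, u0, eps=1e-6):
    f0 = h(x0, u0)
    A = np.zeros((len(f0), len(x0)))
    B = np.zeros((len(f0), len(u0)))
    for j in range(len(x0)):
        dx = np.zeros(len(x0)); dx[j] = eps
        A[:, j] = (h(x0 + dx, u0) - f0) / eps   # column of dh/dx
    for j in range(len(u0)):
        du = np.zeros(len(u0)); du[j] = eps
        B[:, j] = (h(x0, u0 + du) - f0) / eps   # column of dh/du
    return A, B

A, B = jacobians(h, np.zeros(2), np.zeros(1))
# About x0 = 0, u0 = 0 the exact Jacobians are A = [[0, 1], [-1, 0]],
# B = [[0], [1]].
```

Because the Jacobians are evaluated along (x0(t), u0(t)), repeating this computation along a trajectory yields the time-varying A(t) and B(t) of the linearized equation.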
We apply Newton's law to develop an equation to describe the system. The applied force u must overcome the friction and the spring force; the remainder is used to accelerate the block. Thus we have

    m ÿ(t) = u(t) − k1 ẏ(t) − k2 y(t)        (2.22)

where ÿ = d²y(t)/dt² and ẏ = dy(t)/dt. Applying the Laplace transform and assuming zero initial conditions, we obtain

    m s² ŷ(s) = û(s) − k1 s ŷ(s) − k2 ŷ(s)

which implies

    ŷ(s) = [ 1 / (m s² + k1 s + k2) ] û(s)

This is the input-output description of the system. Its transfer function is 1/(m s² + k1 s + k2). If m = 1, k1 = 3, k2 = 2, then the impulse response of the system is

    g(t) = L⁻¹[ 1/(s² + 3s + 2) ] = L⁻¹[ 1/(s + 1) − 1/(s + 2) ] = e^{−t} − e^{−2t}

and the convolution description of the system is

    y(t) = ∫_{0}^{t} g(t − τ) u(τ) dτ = ∫_{0}^{t} ( e^{−(t−τ)} − e^{−2(t−τ)} ) u(τ) dτ

Next we develop a state-space equation to describe the system. Let us select the displacement and velocity of the block as state variables; that is, x1 = y, x2 = ẏ. Then we have, using (2.22),

    ẋ1 = x2
    ẋ2 = −(k2/m) x1 − (k1/m) x2 + (1/m) u

They can be expressed in matrix form as

    [ ẋ1(t) ] = [    0        1    ] [ x1(t) ] + [  0  ] u(t)
    [ ẋ2(t) ]   [ −k2/m    −k1/m  ] [ x2(t) ]   [ 1/m ]

    y(t) = [ 1  0 ] [ x1(t) ]
                    [ x2(t) ]

This state-space equation describes the system.

EXAMPLE 2.7   Consider the system shown in Fig. 2.11. It consists of two blocks, with masses m1 and m2, connected by three springs with spring constants ki, i = 1, 2, 3, as shown; the equations below correspond to the case where the two outer springs have the same constant k1 and the middle spring has constant k2. To simplify the discussion, we assume that there is no friction between the blocks and the floor. The applied force u1 must overcome the spring forces, and the remainder is used to accelerate the block; thus we have

    m1 ÿ1 + (k1 + k2) y1 − k2 y2 = u1        (2.23)

For the second block, we have

    m2 ÿ2 + (k1 + k2) y2 − k2 y1 = u2        (2.24)

They can be combined as

    [ m1   0 ] [ ÿ1 ] + [ k1 + k2    −k2    ] [ y1 ] = [ u1 ]
    [ 0   m2 ] [ ÿ2 ]   [  −k2     k1 + k2  ] [ y2 ]   [ u2 ]

This is a standard equation in studying vibration and is said to be in normal form. See Reference [18].

Let us define

    x1 := y1,   x2 := ẏ1,   x3 := y2,   x4 := ẏ2

Then we can readily obtain

    [ ẋ1 ]   [      0         1      0           0 ] [ x1 ]   [  0      0   ]
    [ ẋ2 ] = [ −(k1+k2)/m1    0    k2/m1         0 ] [ x2 ] + [ 1/m1    0   ] [ u1 ]
    [ ẋ3 ]   [      0         0      0           1 ] [ x3 ]   [  0      0   ] [ u2 ]
    [ ẋ4 ]   [   k2/m2        0  −(k1+k2)/m2    0 ] [ x4 ]   [  0    1/m2  ]

    y = [ y1 ] = [ 1  0  0  0 ] x
        [ y2 ]   [ 0  0  1  0 ]

This two-input two-output state equation describes the system in Fig. 2.11.

To develop its input-output description, we apply the Laplace transform to (2.23) and (2.24) and assume zero initial conditions to yield

    m1 s² ŷ1(s) + (k1 + k2) ŷ1(s) − k2 ŷ2(s) = û1(s)
    m2 s² ŷ2(s) + (k1 + k2) ŷ2(s) − k2 ŷ1(s) = û2(s)

From these two equations, we can readily obtain

    [ ŷ1(s) ] = [ (m2 s² + k1 + k2)/d(s)        k2/d(s)          ] [ û1(s) ]
    [ ŷ2(s) ]   [      k2/d(s)          (m1 s² + k1 + k2)/d(s)  ] [ û2(s) ]

where

    d(s) := (m1 s² + k1 + k2)(m2 s² + k1 + k2) − k2²

This is the transfer-matrix description of the system. Thus what we will discuss in this text can be applied directly to study vibration.

(Figure 2.11: Spring-mass system.)
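The impulse response computed in Example 2.6 can be cross-checked numerically; the sketch below (an aside, in Python rather than the MATLAB used in the text) builds the transfer function 1/(s² + 3s + 2) for m = 1, k1 = 3, k2 = 2 and compares SciPy's impulse response with the closed form e^{−t} − e^{−2t}.

```python
import numpy as np
from scipy import signal

# Transfer function of Example 2.6 with m = 1, k1 = 3, k2 = 2.
sys = signal.TransferFunction([1.0], [1.0, 3.0, 2.0])

t = np.linspace(0.0, 5.0, 501)
t, g = signal.impulse(sys, T=t)

g_closed_form = np.exp(-t) - np.exp(-2.0*t)
```

Agreement of the two curves confirms both the partial-fraction step 1/(s+1) − 1/(s+2) and the inverse Laplace transform.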
EXAMPLE 2.8   Consider a cart with an inverted pendulum hinged on top of it, as shown in Fig. 2.12. For simplicity, the cart and the pendulum are assumed to move in only one plane, and the friction, the mass of the stick, and the gust of wind are disregarded. The problem is to maintain the pendulum at the vertical position. For example, if the inverted pendulum is falling in the direction shown, the cart moves to the right and exerts a force, through the hinge, to push the pendulum back to the vertical position. This simple mechanism can be used as a model of a space vehicle on takeoff.

Let H and V be, respectively, the horizontal and vertical forces exerted by the cart on the pendulum as shown. The application of Newton's law to the linear movements yields

    M d²y/dt² = u − H
    H = m d²(y + l sin θ)/dt² = m ÿ + m l [ θ̈ cos θ − (θ̇)² sin θ ]
    mg − V = m d²(l cos θ)/dt² = −m l [ θ̈ sin θ + (θ̇)² cos θ ]

The application of Newton's law to the rotational movement of the pendulum around the hinge yields

    m g l sin θ = m l θ̈ · l + m ÿ l cos θ

They are nonlinear equations. Because the design objective is to maintain the pendulum at the vertical position, it is reasonable to assume θ and θ̇ to be small. Under this assumption, we can use the approximations sin θ ≈ θ and cos θ ≈ 1. By retaining only the linear terms in θ and θ̇ or, equivalently, dropping the terms with θ², (θ̇)², θθ̈, and θ̇θ, we obtain V = mg and

    (M + m) ÿ + m l θ̈ = u
    g θ = l θ̈ + ÿ

which imply

    M ÿ = u − m g θ        (2.25)
    M l θ̈ = (M + m) g θ − u        (2.26)

(Figure 2.12: Cart with inverted pendulum.)

Using these linearized equations, we now can develop the input-output and state-space descriptions. Applying the Laplace transform to (2.25) and (2.26) and assuming zero initial conditions, we obtain

    M s² ŷ(s) = û(s) − m g θ̂(s)
    M l s² θ̂(s) = (M + m) g θ̂(s) − û(s)

From these equations, we can readily compute the transfer function ĝ_yu(s) from u to y and the transfer function ĝ_θu(s) from u to θ as

    ĝ_yu(s) = (l s² − g) / ( s² [ M l s² − (M + m) g ] )
    ĝ_θu(s) = −1 / ( M l s² − (M + m) g )

To develop a state-space equation, we select state variables as x1 = y, x2 = ẏ, x3 = θ, x4 = θ̇. Then from this selection, (2.25), and (2.26) we can readily obtain

    [ ẋ1 ]   [ 0  1      0          0 ] [ x1 ]   [    0    ]
    [ ẋ2 ] = [ 0  0   −mg/M         0 ] [ x2 ] + [   1/M   ] u        (2.27)
    [ ẋ3 ]   [ 0  0      0          1 ] [ x3 ]   [    0    ]
    [ ẋ4 ]   [ 0  0  (M+m)g/(Ml)   0 ] [ x4 ]   [ −1/(Ml) ]

    y = [ 1  0  0  0 ] x

This state equation has dimension 4 and describes the system when θ and θ̇ are very small.
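A short numerical aside (not in the text): the A matrix of (2.27) should be unstable, with an eigenvalue at +√((M+m)g/(Ml)) coming from the decoupled θ subsystem. The numbers below are made up for illustration.

```python
import numpy as np

# Linearized inverted-pendulum A matrix of (2.27); illustrative values.
M, m, l, g = 1.0, 0.1, 1.0, 9.8

A = np.array([
    [0.0, 1.0, 0.0,               0.0],
    [0.0, 0.0, -m*g/M,            0.0],
    [0.0, 0.0, 0.0,               1.0],
    [0.0, 0.0, (M + m)*g/(M*l),   0.0],
])

eigs = np.linalg.eigvals(A)
unstable = max(eigs.real)
# Expect eigenvalues {0, 0, +sqrt((M+m)g/(Ml)), -sqrt((M+m)g/(Ml))}.
```

The positive eigenvalue is why the pendulum falls over without feedback and why this example reappears later as a control-design problem.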
EXAMPLE 2.9 A communication satellite of mass m orbiting around the earth is shown in Fig. 2.13. The altitude of the satellite is specified by r(t), 8(t), and ¢(t) as shown. The orbit can be controlled by three orthogonal thrusts u,(t), uo(t), and u<l>(t). The state, input. and
output of the system are chosen as
r(t) ;(t)
Mle = (M +m)g8 Figure 2.12
x(t)=
8(t)
B(t)
u(t)
=
[Ur(t)] uo(t) u<l>(t)
y(t)
=
[r(t) 8(t)
]
Cart with inverted pendulum.
¢(t)
¢(t)
¢(t)
Then the system can be shown to be described r
re2 cos?
by
¢ + r¢2
 k/rz
+ u.Jm + uo/mr
(2.28) cos ¢
x = h(x,
B u) =
2;B /r
¢
+ 2B¢
sin ¢/ cos ¢  2r¢/r
ezcos¢sin¢
+ u<l>/mr
24
MATHEMATICAL DESCRIPTIONS
OF SYSTEMS
yU)~k
e,
analysis and design.
0 0 0
0
0
0 0
0 0 0
:J
2.5
Examples
25
"n
(2.29)
This sixdimensional state equation describes the system. In this equation, A, B, and C happen to be constant. If the orbit is an elliptic one. then they will be time varying. We note that the three matrices are all block diagonal. Thus the equation can be decomposed into two uncoupled parts, one involving rand the other </>. Studying these two parts independently can simplify
Figure 2.13
Satellite in orbit. to a circular equatorial orbit, is given by
One solution, which corresponds
In chemical plants, it is often necessary to maintain the levels of liquids. A simplified model of a connection of two tanks is shown in Fig. 2.1..\. It is assumed that under normal operation, the inflows and outflows of both tanks all equal Q and their liquid levels equal HI and H2. Let 1I be inflow perturbation of the first tank, which will cause variations in liquid level XI and outflow YI as shown. These variations will cause level variation X2 and outflow variation Y in the second tank. It is assumed that
YI
EXAMPLE 2.10
u,
== 0
with r;w; = k, a known physical constant. Once the satellite reaches the orbit, it will remain in the orbit as long as there are no disturbances. If the satellite deviates from the orbit, thrusts must be applied to push it back to the orbit. Define x(t) = xo(t)
= RI
XI X2
and
Y=
=
X,
R2
where R, are the flow resistances and depend on the normal height HI and H2. They can also be controlled by the valves. Changes of liquid levels are governed by
AI dXI
+ x(t)
0 3w; 0 0 2wo r; 0 0 0 0 0
I
u(t) = uo(t)
+ ii(t)
y(t) = Yo
+ yet)
=
(II 
yll
dt
and
A2 dX2
=
(YI

y) dt
If the perturbation is very small, then (2.28) can be linearized as 0 0 0 0 0 0
0 0 0
where Aj are the cross sections of the tanks. From these equations,
we can readily obtain
0 2woro 0 0 0
0 0 0 0 0
(J)~
0 0 0 0
x(t) Thus the statespace [::~] = description
. XI
•
=Al
II
XI AIRI
X2
Xi
x(t) =
0 0 0 0
0 1
m

= 
Xl
x.,
A2RI
 
X.,
A2R2
of the system is given by
[~~~~~~2
Q +11
,
0
Figure 2.14
tanks.
Y = [0 I/R21x
Hydraulic

+
0 0 0 0
mr ;
0 0
ii(t)
  T .H, +X, Q
Q
A, I
0 0
+)'1
+y
mr.,
26
MATHEMATICAL
DESCRIPTIONS
OF SYSTEMS
Its transfer function can be computed as
. g(s)
1
I
2.5 Examples
27
==
AIA2RIR2S
,
+ (AIRI + AIR2 + A2R2)s + I
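The two-tank transfer function can be verified against the state-space description via (2.16). The sketch below (an aside; the numerical values of A1, A2, R1, R2 are made up for illustration) evaluates C(sI − A)⁻¹B at a test point s and compares it with the closed-form ĝ(s).

```python
import numpy as np

# State-space matrices of Example 2.10 for illustrative parameter values.
A1, A2, R1, R2 = 2.0, 3.0, 0.5, 0.25

A = np.array([[-1/(A1*R1),              1/(A1*R1)],
              [ 1/(A2*R1), -1/(A2*R1) - 1/(A2*R2)]])
B = np.array([[1/A1], [0.0]])
C = np.array([[0.0, 1/R2]])

s = 2.0
g_ss = (C @ np.linalg.inv(s*np.eye(2) - A) @ B)[0, 0]   # C (sI-A)^{-1} B
g_tf = 1.0 / (A1*A2*R1*R2*s**2 + (A1*R1 + A1*R2 + A2*R2)*s + 1.0)
```

A match at a handful of test points is strong evidence that the hand-derived denominator coefficients are correct, since both expressions are rational functions of the same degree.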
Il
They can be combined in matrix form as
o
IlL
o
y == [I  I O]x
I/CI] I/C2
o
x
+
[IIRCI]
0
u
0
2.5.1
RLC networks
+ O· u
This threedimensional state equation describes the network shown in Fig. 2.15. In RLC networks, capacitors and inductors can store energy and are associated with state variables. If a capacitor voltage is assigned as a state variable x, then its current is CX, where C is its capacitance. If an inductor current is assigned as a state variable x, then its voltage is Lx ; where L is its inductance. Note that resistors are memoryless elements, and their currents or voltages should not be assigned as state variables. For most simple RLC networks, once state variables are assigned, their state eq~ations can be developed by applying Kirchhoff's current and voltage laws, as the next example illustrates. 2.11 Consider the network shown in Fig. 2.15. We assign the Cicapacitor voltages as Xi, i == 1,2 and the inductor current as X3. It is important to specify their polarities. Then their currents and voltage are, respectively, CIXIo C2X2, and LX3 with the polarities shown. From the figure, we see that the voltage across the resistor is u  XI with the polarity shown. Thus its current is (u  xdl R. Applying Kirchhoff's current law at node A yields C2X2 == X3; at node B it yields
EXAMPLE
Thus we have
. XI X2
== ==
XI RCI

X3
CI
+ 
U
RCI
I X3
C2
. X3
Appling Kirchhoff's voltage law to the righthandside loop yields LX3 == XI  X2 or
The procedure used in the preceding example can be employed to develop state equations to describe simple RLC networks. The procedure is fairly simple: assign state variables and then use branch characteristics and Kirchhoff's laws to develop state equations. The procedure can be stated more systematically by using graph concepts, as we will introduce next. The procedure and subsequent Example 2.12. however, can be skipped without loss of continuity. First we introduce briefly the concepts of tree, link, and cutset of a network. We consider only connected networks. Every capacitor, inductor, 'resistor, voltage source, and current source will be considered as a branch. Branches are connected at·nodes. Thus a network can be considered to consist of only branches and nodes. A Ioop is a connection of branches starting from one point and coming back to the same point without passing any point twice. The algebraic sum of all voltages along every loop is zero (Kirchhoff's voltage law). The set of all branches connect to a node is called a cutset. More generally, a cutset of a connected network is any minimal set of branches so that the removal ofthe set causes the remaining network to be unconnected. For example, removing all branches connected to a node leaves the node unconnected to the remaining network. The algebraic sum of all branch currents ill every cutset is zero (Kirchhoff's current law). A tree of a network is defined as any connection of branches connecting all the nodes but containing no loops. A branch is called a tree branch if it is in the tree, a link if it is not. With respect to a chosen tree, every link has a unique loop, called the fundamental loop, in which the remaining loop branches are all tree branches. Every tree branch has a unique cutset, called the fundamental cutset, in which the remaining cutset branches are all links. In other words, a fundamental loop contains only one link and a fundamental cutset contains only one tree branch.
MATHEMATICAL DESCRIPTIONS OF SYSTEMS

2.5 Examples

ẋ3 = (x1 − x2)/L

The output y is given by

y = Lẋ3 = x1 − x2

Figure 2.15 Network.

Procedure for developing state-space equations

1. Consider an RLC network. We first choose a normal tree. The branches of the normal tree are chosen in the order of voltage sources, capacitors, resistors, inductors, and current sources.
2. Assign the capacitor voltages in the normal tree and the inductor currents in the links as state variables. Capacitor voltages in the links and inductor currents in the normal tree are not assigned.
3. Express the voltage and current of every branch in terms of the state variables and, if necessary, the inputs by applying Kirchhoff's voltage law to fundamental loops and Kirchhoff's current law to fundamental cutsets.
4. Apply Kirchhoff's voltage or current law to the fundamental loop or cutset of every branch that is assigned as a state variable.

The reader may skip this procedure and go directly to Example 2.13.
EXAMPLE 2.12 Consider the network shown in Fig. 2.16. The normal tree is chosen as shown with heavy lines; it consists of the voltage source, two capacitors, and the 1-ohm resistor. The capacitor voltages in the normal tree and the inductor current in the link will be assigned as state variables. If the voltage across the 3-F capacitor is assigned as x1, then its current is 3ẋ1. The voltage across the 1-F capacitor is assigned as x2 and its current is ẋ2. The current through the 2-H inductor is assigned as x3 and its voltage is 2ẋ3. Because the 2-ohm resistor is a link, we use its fundamental loop to find its voltage as u1 − x1. Thus its current is (u1 − x1)/2. The 1-ohm resistor is a tree branch. We use its fundamental cutset to find its current as x3. Thus its voltage is 1 · x3 = x3. This completes Step 3.

The 3-F capacitor is a tree branch and its fundamental cutset is as shown. The algebraic sum of the cutset currents is 0 or

3ẋ1 − (u1 − x1)/2 + x3 − u2 = 0

which implies

ẋ1 = −(1/6)x1 − (1/3)x3 + (1/6)u1 + (1/3)u2

The 1-F capacitor is a tree branch, and from its fundamental cutset we have

ẋ2 = x3

The 2-H inductor is a link. The voltage along its fundamental loop is

2ẋ3 + x3 − x1 + x2 = 0

or

ẋ3 = 0.5x1 − 0.5x2 − 0.5x3

They can be expressed in matrix form as

ẋ = [ −1/6    0   −1/3 ] x + [ 1/6  1/3 ] u        (2.30)
    [   0     0     1  ]     [  0    0  ]
    [  0.5  −0.5  −0.5 ]     [  0    0  ]

If we consider as outputs the voltage across the 2-H inductor and the current through the 2-ohm resistor, then we have

y1 = 2ẋ3 = x1 − x2 − x3 = [1  −1  −1] x
y2 = 0.5(u1 − x1) = [−0.5  0  0] x + [0.5  0] u

They can be written in matrix form as

y = [   1   −1  −1 ] x + [  0   0 ] u        (2.31)
    [ −0.5   0   0 ]     [ 0.5  0 ]

Equations (2.30) and (2.31) are the state-space description of the network. The transfer matrix of the network can be computed directly from the network or by using the formula in (2.16). We will use MATLAB to compute this equation. We type

a=[-1/6 0 -1/3;0 0 1;0.5 -0.5 -0.5]; b=[1/6 1/3;0 0;0 0];
c=[1 -1 -1;-0.5 0 0]; d=[0 0;0.5 0];
[N1,d1]=ss2tf(a,b,c,d,1)

which yields

N1 = 0.0000  0.1667  0.0000  0.0000
     0.5000  0.2500  0.3333  0.0000
d1 = 1.0000  0.6667  0.7500  0.0833

This is the first column of the transfer matrix. We repeat the computation for the second input. Thus the transfer matrix of the network is

Ĝ(s) = [           0.1667s^2 / (s^3 + 0.6667s^2 + 0.75s + 0.0833)                        0.3333s^2 / (s^3 + 0.6667s^2 + 0.75s + 0.0833)           ]
       [ (0.5s^3 + 0.25s^2 + 0.3333s) / (s^3 + 0.6667s^2 + 0.75s + 0.0833)   (−0.1667s^2 − 0.0833s − 0.0833) / (s^3 + 0.6667s^2 + 0.75s + 0.0833) ]

Figure 2.16 Network with two inputs. (The fundamental cutsets of the two capacitors are indicated on the figure.)
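The MATLAB computation above can be cross-checked in any environment. The following is a sketch in Python with SciPy's ss2tf (an assumed substitute for MATLAB's ss2tf; the text itself uses MATLAB), recovering the first column of the transfer matrix from (2.30) and (2.31).

```python
import numpy as np
from scipy.signal import ss2tf

# State-space matrices from (2.30) and (2.31)
A = np.array([[-1/6,  0.0, -1/3],
              [ 0.0,  0.0,  1.0],
              [ 0.5, -0.5, -0.5]])
B = np.array([[1/6, 1/3], [0.0, 0.0], [0.0, 0.0]])
C = np.array([[1.0, -1.0, -1.0], [-0.5, 0.0, 0.0]])
D = np.array([[0.0, 0.0], [0.5, 0.0]])

# Transfer functions from the first input u1 to the outputs y1 and y2
num, den = ss2tf(A, B, C, D, input=0)
print(np.round(den, 4))  # [1. 0.6667 0.75 0.0833]
print(np.round(num, 4))  # rows: numerators of y1/u1 and y2/u1
```

Repeating the call with input=1 produces the second column of the transfer matrix, exactly as the text repeats the MATLAB computation for the second input.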
EXAMPLE 2.13 Consider the network shown in Fig. 2.17(a), where T is a tunnel diode with the characteristic i = h(v) shown in Fig. 2.17(b). Let x1 be the voltage across the capacitor and x2 be the current through the inductor. Then we have v = x1 and

x2(t) = Cẋ1(t) + i(t) = Cẋ1(t) + h(x1(t))
Lẋ2(t) = E − Rx2(t) − x1(t)

They can be arranged as

ẋ1(t) = −h(x1(t))/C + x2(t)/C
ẋ2(t) = −x1(t)/L − (R/L)x2(t) + E/L        (2.32)

This set of nonlinear equations describes the network. Now if x1(t) is known to lie only inside the range (a, b) shown in Fig. 2.17(b), then h(x1(t)) can be approximated by h(x1(t)) = x1(t)/R1. In this case, the network can be reduced to the one in Fig. 2.17(c) and can be described by

[ ẋ1 ] = [ −1/(C R1)   1/C  ] [ x1 ] + [  0  ] E
[ ẋ2 ]   [   −1/L     −R/L  ] [ x2 ]   [ 1/L ]

This is an LTI state-space equation. Now if x1(t) is known to lie only inside the range (c, d) shown in Fig. 2.17(b), we may introduce the variables x̄1(t) = x1(t) − v0 and x̄2(t) = x2(t) − i0 and approximate h(x1(t)) as i0 − x̄1(t)/R2. Substituting these into (2.32) yields

[ dx̄1/dt ] = [ 1/(C R2)   1/C  ] [ x̄1 ] + [  0  ] Ē
[ dx̄2/dt ]   [  −1/L     −R/L  ] [ x̄2 ]   [ 1/L ]

where Ē = E − v0 − R i0. This equation is obtained by shifting the operating point from (0, 0) to (v0, i0) and by linearization at (v0, i0). Because the two linearized equations are identical if R2 is replaced by −R1 and E by Ē, we can readily obtain the equivalent network shown in Fig. 2.17(d). Note that it is not obvious how to obtain the equivalent network from the original network without first developing the state equation.

Figure 2.17 Network with a tunnel diode: (a) the network, (b) the characteristic i = h(v), (c) the equivalent network when x1 lies in (a, b), (d) the equivalent network when x1 lies in (c, d).

2.6 Discrete-Time Systems

This section develops the discrete counterpart of continuous-time systems. Because most concepts in continuous-time systems can be applied directly to discrete-time systems, the discussion will be brief. The input and output of every discrete-time system will be assumed to have the same sampling period T and will be denoted by u[k] := u(kT), y[k] := y(kT), where k is an integer ranging from −∞ to +∞. A discrete-time system is causal if its current output depends only on current and past inputs. The state at time k0, denoted by x[k0], is the information at time instant k0 which, together with u[k] for k ≥ k0, determines uniquely the output y[k] for k ≥ k0. The entries of x are called state variables. If the number of state variables is finite, the discrete-time system is lumped; otherwise, it is distributed. Every continuous-time system involving time delay, such as the ones in Examples 2.1 and 2.3, is a distributed system. In a discrete-time system, if the time delay is an integer multiple of the sampling period T, then the discrete-time system is a lumped system.

A discrete-time system is linear if the additivity and homogeneity properties hold. The response of every linear discrete-time system can be decomposed as

Response = zero-state response + zero-input response

and the zero-state responses satisfy the superposition property. So do the zero-input responses.

Input-output description Let δ[k] be the impulse sequence defined as

δ[k − m] = 1  if k = m
           0  if k ≠ m

where both k and m are integers, denoting sampling instants. It is the discrete counterpart of the impulse δ(t − t1). The impulse δ(t − t1) has zero width and infinite height and cannot be generated in practice, whereas the impulse sequence δ[k − m] can easily be generated. Let u[k] be any input sequence. Then it can be expressed as

u[k] = Σ (from m = −∞ to ∞) u[m] δ[k − m]

Let g[k, m] be the output at time instant k excited by the impulse sequence applied at time instant m. Then we have
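The linearization in Example 2.13 can be checked numerically. The sketch below (Python/NumPy) uses a hypothetical cubic diode characteristic h(v) and hypothetical element values (none of these numbers come from the text) and verifies that the Jacobian of the nonlinear equation (2.32) at an operating point in the negative-resistance region matches the matrix [1/(C R2) 1/C; −1/L −R/L] with R2 = −1/h'(v0).

```python
import numpy as np

# Hypothetical values (not from the text)
Cc, L, R, E = 1.0, 1.0, 2.0, 1.0
h = lambda v: v**3 - 3*v**2 + 2.5*v    # N-shaped characteristic i = h(v)
dh = lambda v: 3*v**2 - 6*v + 2.5      # its derivative

def f(x):
    """Right-hand side of the nonlinear state equation (2.32)."""
    return np.array([(-h(x[0]) + x[1]) / Cc,
                     (-x[0] - R * x[1] + E) / L])

x0 = np.array([1.0, 0.5])   # operating point; h'(1) = -0.5 < 0 (negative resistance)
R2 = -1.0 / dh(x0[0])       # local negative resistance, here R2 = 2

# Numerical Jacobian of f at x0 by central differences
eps = 1e-6
J = np.column_stack([(f(x0 + eps*np.eye(2)[:, i]) - f(x0 - eps*np.eye(2)[:, i]))
                     / (2*eps) for i in range(2)])

A_lin = np.array([[1/(Cc*R2), 1/Cc], [-1/L, -R/L]])
print(np.round(J, 4))       # matches A_lin = [[0.5, 1.], [-1., -2.]]
```

The agreement confirms that the shifted-variable equation in the example is simply the Jacobian linearization of (2.32) about (v0, i0).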
δ[k − m]            →  g[k, m]
δ[k − m] u[m]       →  g[k, m] u[m]           (homogeneity)
Σ δ[k − m] u[m]     →  Σ g[k, m] u[m]         (additivity)

Thus the output y[k] excited by the input u[k] equals

y[k] = Σ (from m = −∞ to ∞) g[k, m] u[m]        (2.33)

This is the discrete counterpart of (2.3) and its derivation is considerably simpler. The sequence g[k, m] is called the impulse response sequence. If a discrete-time system is causal, no output will appear before an input is applied. Thus we have

Causal  ⟺  g[k, m] = 0 for k < m

If a system is relaxed at k0 and causal, then (2.33) can be reduced to

y[k] = Σ (from m = k0 to k) g[k, m] u[m]        (2.34)

as in (2.4). If a linear discrete-time system is time invariant as well, then the time shifting property holds. In this case, the initial time instant can always be chosen as k0 = 0 and (2.34) becomes

y[k] = Σ (from m = 0 to k) g[k − m] u[m] = Σ (from m = 0 to k) g[m] u[k − m]        (2.35)

This is the discrete counterpart of (2.8) and is called a discrete convolution.

The z-transform is an important tool in the study of LTI discrete-time systems. Let ŷ(z) be the z-transform of y[k] defined as

ŷ(z) := Z[y[k]] := Σ (from k = 0 to ∞) y[k] z^−k        (2.36)

We first replace the upper limit of the summation in (2.35) by ∞ (see the footnote) and then substitute it into (2.36) to yield

ŷ(z) = Σ_k ( Σ_m g[k − m] u[m] ) z^−k
     = Σ_m ( Σ_k g[k − m] z^−(k−m) ) u[m] z^−m
     = ( Σ_l g[l] z^−l ) ( Σ_m u[m] z^−m ) =: ĝ(z) û(z)

where we have interchanged the order of summations, introduced the new variable l = k − m, and then used the fact that g[l] = 0 for l < 0 to make the inner summation independent of m. The equation

ŷ(z) = ĝ(z) û(z)        (2.37)

is the discrete counterpart of (2.10). The function ĝ(z) is the z-transform of the impulse response sequence g[k] and is called the discrete transfer function. Both the discrete convolution and the transfer function describe only zero-state responses.

5. This is permitted under the causality assumption.

EXAMPLE 2.14 Consider the unit-sampling-time delay system defined by

y[k] = u[k − 1]

The output equals the input delayed by one sampling period. Its impulse response sequence is g[k] = δ[k − 1] and its discrete transfer function is

ĝ(z) = Z[δ[k − 1]] = z^−1 = 1/z

It is a rational function of z. Note that every continuous-time system involving time delay is a distributed system. This is not so in discrete-time systems.

EXAMPLE 2.15 Consider the discrete-time feedback system shown in Fig. 2.18(a). It is the discrete counterpart of Fig. 2.5(a). If the unit-sampling-time delay element is replaced by its transfer function z^−1, then the block diagram becomes the one in Fig. 2.18(b) and the transfer function from r to y can be computed as

ĝ(z) = a z^−1 / (1 − a z^−1) = a / (z − a)

This is a rational function of z and is similar to (2.12). The transfer function can also be obtained by applying the z-transform to the impulse response sequence of the feedback system. As in (2.9), the impulse response sequence is

g_f[k] = a δ[k − 1] + a^2 δ[k − 2] + ··· = Σ (from m = 1 to ∞) a^m δ[k − m]

The z-transform of δ[k − m] is z^−m. Thus the transfer function of the feedback system is

ĝ_f(z) = Σ (from m = 1 to ∞) a^m z^−m = a z^−1 / (1 − a z^−1) = a / (z − a)

which yields the same result.

The discrete transfer functions in the preceding two examples are all rational functions of z. This may not be so in general. For example, if g[k] = 1/k for k = 1, 2, ...,
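Equation (2.35) can be checked numerically for the feedback system of Example 2.15: its impulse response g[0] = 0, g[k] = a^k for k ≥ 1, convolved with any input, must reproduce the loop simulated sample by sample. A sketch in Python/NumPy follows (the recursion y[k] = a·y[k−1] + a·r[k−1] is an assumption of this note, consistent with ĝ(z) = a z^−1/(1 − a z^−1)).

```python
import numpy as np

a, N = 0.5, 20
# Impulse response of the feedback loop: g[0] = 0, g[k] = a**k for k >= 1
g = np.array([0.0] + [a**k for k in range(1, N)])

def feedback_loop(r):
    # one unit-delay element with gain a in a positive-feedback loop, zero initial state
    y = np.zeros(len(r))
    for k in range(1, len(r)):
        y[k] = a * y[k - 1] + a * r[k - 1]
    return y

r = np.random.default_rng(0).standard_normal(N)  # arbitrary input sequence
y_loop = feedback_loop(r)
y_conv = np.convolve(g, r)[:N]                   # discrete convolution (2.35)
print(np.max(np.abs(y_loop - y_conv)))           # essentially zero
```

Because the loop starts relaxed (zero initial state), the convolution describes its response exactly, illustrating that (2.35) captures only zero-state responses.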
then we have

ĝ(z) = z^−1 + (1/2) z^−2 + (1/3) z^−3 + ··· = −ln(1 − z^−1)

It is an irrational function of z. Such a system is a distributed system. We study in this text only lumped discrete-time systems; their discrete transfer functions are all rational functions of z.

Discrete rational transfer functions can be proper or improper. If a transfer function is improper, such as ĝ(z) = (z^2 + 2z − 1)/(z − 0.5), then

ŷ(z)/û(z) = (z^2 + 2z − 1)/(z − 0.5)

which implies

y[k + 1] − 0.5 y[k] = u[k + 2] + 2 u[k + 1] − u[k]

or

y[k + 1] = 0.5 y[k] + u[k + 2] + 2 u[k + 1] − u[k]

It means that the output at time instant k + 1 depends on the input at time instant k + 2, a future input. Thus a discrete-time system described by an improper transfer function is not causal. We study only causal systems. Thus all discrete rational transfer functions will be proper.

We mentioned earlier that we also study only proper rational transfer functions of s in the continuous-time case. The reason, however, is different. Consider ĝ(s) = s or y(t) = du(t)/dt. It is a pure differentiator. If we define the differentiation as

y(t) = du(t)/dt = lim (Δ → 0) [u(t + Δ) − u(t)] / Δ

where Δ > 0, then the output y(t) depends on the future input u(t + Δ) and the differentiator is not causal. However, if we define the differentiation as

y(t) = du(t)/dt = lim (Δ → 0) [u(t) − u(t − Δ)] / Δ

then the output y(t) does not depend on future input and the differentiator is causal. Therefore in continuous-time systems, it is open to argument whether an improper transfer function represents a noncausal system. However, improper transfer functions of s will amplify high-frequency noise, which often exists in the real world. Therefore improper transfer functions are avoided in practice.

Figure 2.18 Discrete-time feedback system: (a) with a unit-time delay element, (b) with the delay replaced by its transfer function z^−1.

State-space equations Every linear lumped discrete-time system can be described by

x[k + 1] = A[k] x[k] + B[k] u[k]
y[k] = C[k] x[k] + D[k] u[k]        (2.38)

where A, B, C, and D are functions of k. If the system is time invariant as well, then (2.38) becomes

x[k + 1] = A x[k] + B u[k]
y[k] = C x[k] + D u[k]        (2.39)

where A, B, C, and D are constant matrices. Let x̂(z) be the z-transform of x[k] or

x̂(z) := Z[x[k]] := Σ (from k = 0 to ∞) x[k] z^−k

Then we have

Z[x[k + 1]] = Σ (from k = 0 to ∞) x[k + 1] z^−k = z Σ (from k = 0 to ∞) x[k + 1] z^−(k+1)
            = z [ Σ (from l = 1 to ∞) x[l] z^−l + x[0] − x[0] ] = z (x̂(z) − x[0])

Applying the z-transform to (2.39) yields

z x̂(z) − z x[0] = A x̂(z) + B û(z)
ŷ(z) = C x̂(z) + D û(z)

which implies

x̂(z) = (zI − A)^−1 z x[0] + (zI − A)^−1 B û(z)        (2.40)
ŷ(z) = C (zI − A)^−1 z x[0] + C (zI − A)^−1 B û(z) + D û(z)        (2.41)

They are the discrete counterparts of (2.14) and (2.15). Note that there is an extra z in front of x[0]. If x[0] = 0, then (2.41) reduces to

ŷ(z) = [C (zI − A)^−1 B + D] û(z)        (2.42)

Comparing this with the MIMO case of (2.37) yields

Ĝ(z) = C (zI − A)^−1 B + D        (2.43)

This is the discrete counterpart of (2.16). If the Laplace transform variable s is replaced by the z-transform variable z, then the two equations are identical.
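Formula (2.43) can be cross-checked against the definition of the transfer function as the z-transform of the impulse response: with x[0] = 0 the impulse response of (2.39) is g[0] = D and g[k] = C A^(k−1) B for k ≥ 1, so C(zI − A)^−1 B + D should equal the (truncated) series Σ g[k] z^−k. A sketch with hypothetical numbers follows (Python/NumPy; the matrices below are not from the text).

```python
import numpy as np

# Hypothetical stable discrete-time system (not from the text)
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, -1.0]])
D = np.array([[0.2]])

z = 2.0  # evaluation point outside the unit circle, so the series converges

# Closed form (2.43)
G_closed = C @ np.linalg.inv(z * np.eye(2) - A) @ B + D

# Truncated series sum_{k>=0} g[k] z^{-k} with g[0] = D, g[k] = C A^{k-1} B
G_series = D.astype(float).copy()
Ak = np.eye(2)
for k in range(1, 200):
    G_series = G_series + (C @ Ak @ B) * z**(-k)
    Ak = Ak @ A

print(G_closed.item(), G_series.item())  # the two values agree to high accuracy
```

The same check works for any A with eigenvalues inside the circle |z0|; outside that region the series diverges even though the closed form (2.43) remains valid by analytic continuation.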
EXAMPLE 2.16 Consider a money market account in a brokerage firm. If the interest rate depends on the amount of money in the account, it is a nonlinear system. If the interest rate is the same no matter how much money is in the account, then it is a linear system. The account is a time-varying system if the interest rate changes with time, a time-invariant system if the interest rate is fixed. We consider here only the LTI case with interest rate r = 0.015% per day and compounded daily. The input u[k] is the amount of money deposited into the account on the kth day and the output y[k] is the total amount of money in the account at the end of the kth day. If we withdraw money, then u[k] is negative.

If we deposit one dollar on the first day (that is, u[0] = 1) and nothing thereafter (u[k] = 0 for k = 1, 2, ...), then y[0] = u[0] = 1 and y[1] = 1 + 0.00015 = 1.00015. Because the money is compounded daily, we have

y[2] = y[1] + y[1] · 0.00015 = y[1] · 1.00015 = (1.00015)^2

and, in general,

y[k] = (1.00015)^k

Because the input {1, 0, 0, ...} is an impulse sequence, the output is, by definition, the impulse response sequence or

g[k] = (1.00015)^k

and the input-output description of the account is

y[k] = Σ (from m = 0 to k) g[k − m] u[m] = Σ (from m = 0 to k) (1.00015)^(k−m) u[m]        (2.44)

The discrete transfer function is the z-transform of the impulse response sequence or

ĝ(z) = Z[g[k]] = Σ (from k = 0 to ∞) (1.00015)^k z^−k = Σ (from k = 0 to ∞) (1.00015 z^−1)^k
     = 1 / (1 − 1.00015 z^−1) = z / (z − 1.00015)        (2.45)

Whenever we use (2.44) or (2.45), the initial state must be zero, or there is initially no money in the account.

Next we develop a state-space equation to describe the account. Suppose y[k] is the total amount of money at the end of the kth day. Then we have

y[k + 1] = y[k] + 0.00015 y[k] + u[k + 1] = 1.00015 y[k] + u[k + 1]        (2.46)

If we define the state variable as x[k] := y[k], then

x[k + 1] = 1.00015 x[k] + u[k + 1]
y[k] = x[k]        (2.47)

Because of u[k + 1], (2.47) is not in the standard form of (2.39). Thus we cannot select x[k] := y[k] as a state variable. Next we select a different state variable as

x[k] := y[k] − u[k]

Substituting y[k + 1] = x[k + 1] + u[k + 1] and y[k] = x[k] + u[k] into (2.46) yields

x[k + 1] = 1.00015 x[k] + 1.00015 u[k]
y[k] = x[k] + u[k]        (2.48)

This is in the standard form and describes the money market account.

The linearization discussed for the continuous-time case can also be applied to the discrete-time case with only slight modification. Therefore its discussion will not be repeated.

2.7 Concluding Remarks

We introduced in this chapter the concepts of causality, lumpedness, linearity, and time invariance. Mathematical equations were then developed to describe causal systems, as summarized in the following.

System type: Distributed, linear
  Internal description: none
  External description: y(t) = ∫ (from t0 to t) G(t, τ) u(τ) dτ

System type: Lumped, linear
  Internal description: ẋ = A(t)x + B(t)u, y = C(t)x + D(t)u
  External description: y(t) = ∫ (from t0 to t) G(t, τ) u(τ) dτ

System type: Distributed, linear, time-invariant
  Internal description: none
  External description: y(t) = ∫ (from 0 to t) G(t − τ) u(τ) dτ; ŷ(s) = Ĝ(s) û(s), Ĝ(s) irrational

System type: Lumped, linear, time-invariant
  Internal description: ẋ = Ax + Bu, y = Cx + Du
  External description: y(t) = ∫ (from 0 to t) G(t − τ) u(τ) dτ; ŷ(s) = Ĝ(s) û(s), Ĝ(s) rational

Distributed systems cannot be described by finite-dimensional state-space equations. The external description describes only zero-state responses; thus whenever we use the description, systems are implicitly assumed to be relaxed or their initial conditions are assumed to be zero.

We study in this text mainly lumped linear time-invariant systems. For this class of systems, we use mostly the time-domain description (A, B, C, D) in the internal description and the frequency-domain (Laplace-domain) description Ĝ(s) in the external description. Furthermore, we will express every rational transfer matrix as a fraction of two polynomial matrices, as we will develop in the text. By so doing, all designs in the SISO case can be extended to the multivariable case.
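The state equation (2.48) can be checked directly against the compound-interest formula: with a one-dollar deposit on day 0 and nothing thereafter, its output must equal (1.00015)^k. A quick sketch in Python (the simulation itself is an addition of this note, not something the text performs):

```python
r = 0.00015                   # daily interest rate from the text
N = 365
u = [1.0] + [0.0] * (N - 1)   # deposit one dollar on day 0, nothing thereafter

# Simulate (2.48): x[k+1] = 1.00015 x[k] + 1.00015 u[k],  y[k] = x[k] + u[k]
x, y = 0.0, []
for k in range(N):
    y.append(x + u[k])
    x = (1 + r) * x + (1 + r) * u[k]

print(y[0], y[1], abs(y[364] - (1 + r) ** 364))  # 1.0, 1.00015, ~0
```

The recursion reproduces y[k] = (1.00015)^k exactly (up to floating-point rounding), confirming that the change of state variable x[k] := y[k] − u[k] did not alter the input-output behavior.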
The class of lumped linear time-invariant systems constitutes only a very small part of nonlinear and linear systems. For this small class of systems, we are able to give a complete treatment of analyses and syntheses. This study will form a foundation for studying more general systems.

Problems

2.1 Consider the memoryless systems with characteristics shown in Fig. 2.19, in which u denotes the input and y the output. Which of them is a linear system? Is it possible to introduce a new output so that the system in Fig. 2.19(b) is linear?

Figure 2.19

2.2 The impulse response of an ideal lowpass filter is given by

g(t) = 2ω sin 2ω(t − t0) / (2ω(t − t0))

for all t, where ω and t0 are constants. Is the ideal lowpass filter causal? Is it possible to build the filter in the real world?

2.3 Consider a system whose input u and output y are related by

y(t) = (P_α u)(t) := u(t) for t ≤ α;  0 for t > α

where α is a fixed constant. The system is called a truncation operator, which chops off the input after time α. Is the system linear? Is it time-invariant? Is it causal?

2.4 The input and output of an initially relaxed system can be denoted by y = Hu, where H is some mathematical operator. Show that if the system is causal, then

P_α y = P_α H u = P_α H P_α u

where P_α is the truncation operator defined in Problem 2.3. Is it true that P_α H u = H P_α u?

2.5 Consider a system with input u and output y. Three experiments are performed on the system using the inputs u1(t), u2(t), and u3(t) for t ≥ 0. In each case, the initial state x(0) at time t = 0 is the same. The corresponding outputs are denoted by y1, y2, and y3. Which of the following statements are correct if x(0) ≠ 0?

1. If u3 = u1 + u2, then y3 = y1 + y2.
2. If u3 = 0.5(u1 + u2), then y3 = 0.5(y1 + y2).
3. If u3 = u1 − u2, then y3 = y1 − y2.

Which are correct if x(0) = 0?

2.6 Consider a system whose input and output are related by

y(t) = u^2(t)/u(t − 1) if u(t − 1) ≠ 0;  0 if u(t − 1) = 0

for all t. Show that the system satisfies the homogeneity property but not the additivity property.

2.7 Show that if the additivity property holds, then the homogeneity property holds for all rational numbers α. Thus if a system has some "continuity" property, then additivity implies homogeneity.

2.8 Let g(t, τ) = g(t + α, τ + α) for all t, τ, and α. Show that g(t, τ) depends only on t − τ. [Hint: Define x = t + τ and y = t − τ and show that ∂g(t, τ)/∂x = 0.]

2.9 Consider a system with impulse response as shown in Fig. 2.20(a). What is the zero-state response excited by the input u(t) shown in Fig. 2.20(b)?

Figure 2.20

2.10 Consider a system described by

ÿ + 2ẏ − 3y = u̇ − u

What are the transfer function and the impulse response of the system?

2.11 Let y(t) be the unit-step response of a linear time-invariant system. Show that the impulse response of the system equals dy(t)/dt.

2.12 Consider a two-input and two-output system described by

D11(p) y1(t) + D12(p) y2(t) = N11(p) u1(t) + N12(p) u2(t)
D21(p) y1(t) + D22(p) y2(t) = N21(p) u1(t) + N22(p) u2(t)

where Nij and Dij are polynomials of p := d/dt. What is the transfer matrix of the system?

2.13 Consider the feedback systems shown in Fig. 2.5. Show that the unit-step responses of the positive-feedback system are as shown in Fig. 2.21(a) for a = 1 and in Fig. 2.21(b) for a = 0.5. Show also that the unit-step responses of the negative-feedback system are as shown in Figs. 2.21(c) and 2.21(d), respectively, for a = 1 and a = 0.5.
Figure 2.21 Unit-step responses of the feedback systems.

2.14 Draw an op-amp circuit diagram for

y = [3 10] x − 2u

2.15 Find state equations to describe the pendulum systems in Fig. 2.22. The systems are useful to model one- or two-link robotic manipulators. If θ, θ1, and θ2 are very small, can you consider the two systems as linear?

Figure 2.22 Pendulum systems.

2.16 Consider the simplified model of an aircraft shown in Fig. 2.23. It is assumed that the aircraft is in an equilibrium state at the pitched angle θ0, elevator angle u0, altitude h0, and cruising speed v0. It is assumed that small deviations of θ and u from θ0 and u0 generate forces f1 = k1 θ and f2 = k2 u as shown in the figure. Let m be the mass of the aircraft, I the moment of inertia about the center of gravity P, bθ̇ the aerodynamic damping, and h the deviation of the altitude from h0. Find a state equation to describe the system. Show also that the transfer function from u to h, by neglecting the effect of I, is

Figure 2.23 Simplified model of an aircraft.

2.17 The soft landing phase of a lunar module descending on the moon can be modeled as shown in Fig. 2.24. The thrust generated is assumed to be proportional to ṁ, where m is the mass of the module. Then the system can be described by m ÿ = −k ṁ − m g, where g is the gravity constant on the lunar surface. Define state variables of the system as x1 = y, x2 = ẏ, x3 = m, and u = ṁ. Find a state-space equation to describe the system.

2.18 Find the transfer functions from u to y1 and from y1 to y of the hydraulic tank system shown in Fig. 2.25. Does the transfer function from u to y equal the product of the two transfer functions? Is this also true for the system shown in Fig. 2.14? [Answer: No, because of the loading problem in the two tanks in Fig. 2.14. The loading problem is an important issue in developing mathematical equations to describe composite systems. See Reference [7].]
Figure 2.24 Lunar module (thrust = kṁ) above the lunar surface.

Figure 2.25 Hydraulic tank system.

2.19 Find a state equation to describe the network shown in Fig. 2.26. Find also its transfer function.

Figure 2.26 Network with a current source u.

2.20 Find a state equation to describe the network shown in Fig. 2.2. Compute also its transfer matrix.

2.21 Consider the mechanical system shown in Fig. 2.27. Let I denote the moment of inertia of the bar and block about the hinge. It is assumed that the angular displacement θ is very small. An external force u is applied to the bar as shown. Let y be the displacement of the block, with mass m, from equilibrium. Find a state-space equation to describe the system. Find also the transfer function from u to y.

Figure 2.27 Mechanical system.
Chapter 3

Linear Algebra

3.1 Introduction

This chapter reviews a number of concepts and results in linear algebra that are essential in the study of this text. The topics are carefully selected, and only those that will be used subsequently are introduced. Most results are developed intuitively in order for the reader to better grasp the ideas. They are stated as theorems for easy reference in later chapters. However, no formal proofs are given.

As we saw in the preceding chapter, all parameters that arise in the real world are real numbers. Therefore we deal only with real numbers, unless stated otherwise, throughout this text. Let A, B, C, and D be, respectively, n x m, m x r, l x n, and r x p real matrices. Let ai be the ith column of A, and bj the jth row of B. Then we have

AB = [a1 a2 ··· am] [b1; b2; ...; bm] = a1 b1 + a2 b2 + ··· + am bm        (3.1)

CA = C [a1 a2 ··· am] = [Ca1 Ca2 ··· Cam]        (3.2)

BD = [b1; b2; ...; bm] D = [b1 D; b2 D; ...; bm D]        (3.3)

These identities can easily be verified. Note that ai bi is an n x r matrix; it is the product of an n x 1 column vector and a 1 x r row vector. The product bi ai is not defined unless n = r; it becomes a scalar if n = r.

3.2 Basis, Representation, and Orthonormalization

Consider an n-dimensional real linear space, denoted by R^n. Every vector in R^n is an n-tuple of real numbers such as

x = [x1; x2; ...; xn]

To save space, we write it as x = [x1 x2 ··· xn]', where the prime denotes the transpose. The set of vectors {x1, x2, ..., xm} in R^n is said to be linearly dependent if there exist real numbers α1, α2, ..., αm, not all zero, such that

α1 x1 + α2 x2 + ··· + αm xm = 0        (3.4)

If the only set of αi for which (3.4) holds is α1 = 0, α2 = 0, ..., αm = 0, then the set of vectors {x1, x2, ..., xm} is said to be linearly independent.

If the set of vectors in (3.4) is linearly dependent, then there exists at least one αi, say α1, that is different from zero. Then (3.4) implies

x1 = −(1/α1) [α2 x2 + α3 x3 + ··· + αm xm] = β2 x2 + β3 x3 + ··· + βm xm

where βi = −αi/α1. Such an expression is called a linear combination.

The dimension of a linear space can be defined as the maximum number of linearly independent vectors in the space. Thus in R^n, we can find at most n linearly independent vectors.

Basis and representation A set of linearly independent vectors in R^n is called a basis if every vector in R^n can be expressed as a unique linear combination of the set. In R^n, any set of n linearly independent vectors can be used as a basis. Let {q1, q2, ..., qn} be such a set. Then every vector x can be expressed uniquely as

x = α1 q1 + α2 q2 + ··· + αn qn        (3.5)

Define the n x n square matrix

Q := [q1 q2 ··· qn]        (3.6)

Then (3.5) can be written as

x = Q [α1; α2; ...; αn] =: Q x̂        (3.7)

We call x̂ = [α1 α2 ··· αn]' the representation of the vector x with respect to the basis {q1, q2, ..., qn}. We will associate with every R^n the following orthonormal basis:

i1 = [1; 0; ...; 0],  i2 = [0; 1; 0; ...; 0],  ...,  i_{n-1} = [0; ...; 0; 1; 0],  i_n = [0; ...; 0; 1]        (3.8)

With respect to this basis, we have

x = [x1; x2; ...; xn] = I_n x

where I_n is the n x n unit matrix. In other words, the representation of any vector x with respect to the orthonormal basis in (3.8) equals itself.

EXAMPLE 3.1 Consider the vector x = [1 3]' in R^2 as shown in Fig. 3.1. The two vectors q1 = [3 1]' and q2 = [2 2]' are clearly linearly independent and can be used as a basis. If we draw from x two lines in parallel with q2 and q1, they intersect at −q1 and 2q2 as shown. Thus the representation of x with respect to {q1, q2} is [−1 2]'. This can also be verified from

x̂ = Q^−1 x = [3 2; 1 2]^−1 [1; 3] = [−1; 2]

To find the representation of x with respect to the basis {q2, i2}, we draw from x two lines in parallel with i2 and q2. They intersect at 0.5q2 and 2i2. Thus the representation of x with respect to {q2, i2} is [0.5 2]'. (Verify.)

Figure 3.1 Different representations of vector x.

Norms of vectors The concept of norm is a generalization of length or magnitude. Any real-valued function of x, denoted by ||x||, can be defined as a norm if it has the following properties:

1. ||x|| ≥ 0 for every x, and ||x|| = 0 if and only if x = 0.
2. ||αx|| = |α| ||x|| for any real α.
3. ||x1 + x2|| ≤ ||x1|| + ||x2|| for every x1 and x2.

The last inequality is called the triangular inequality.

Let x = [x1 x2 ··· xn]'. Then the norm of x can be chosen as any one of the following:

||x||_1 := Σ (from i = 1 to n) |xi|
||x||_2 := sqrt(x'x) = ( Σ (from i = 1 to n) |xi|^2 )^(1/2)
||x||_∞ := max_i |xi|

They are called, respectively, the 1-norm, 2- or Euclidean norm, and infinite-norm. The 2-norm is the length of the vector from the origin. We use exclusively, unless stated otherwise, the Euclidean norm, and the subscript 2 will be dropped. In MATLAB, the norms just introduced can be obtained by using the functions norm(x,1), norm(x,2) = norm(x), and norm(x,inf).
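Example 3.1 and the three norms can be verified numerically. The text uses MATLAB's norm; the sketch below uses Python/NumPy instead (an assumption of this note), where numpy.linalg.norm takes the same order arguments 1, 2, and inf.

```python
import numpy as np

x = np.array([1.0, 3.0])
Q = np.array([[3.0, 2.0], [1.0, 2.0]])  # columns q1 = [3 1]', q2 = [2 2]'

# Representation of x with respect to {q1, q2}: solve Q xhat = x  (Example 3.1)
xhat = np.linalg.solve(Q, x)
print(xhat)                              # [-1.  2.]

# The three norms of x
print(np.linalg.norm(x, 1),              # 4.0
      np.linalg.norm(x, 2),              # sqrt(10), about 3.1623
      np.linalg.norm(x, np.inf))         # 3.0
```

Solving Q x̂ = x rather than forming Q^−1 explicitly is the numerically preferred way to compute a representation with respect to a basis.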
Orthonormalization A vector x is said to be normalized if its Euclidean norm is 1 or x'x = 1. Note that x'x is scalar and xx' is n x n. Two vectors x1 and x2 are said to be orthogonal if x1'x2 = x2'x1 = 0. A set of vectors xi, i = 1, 2, ..., m, is said to be orthonormal if

xi'xj = 0 if i ≠ j;  1 if i = j

Given a set of linearly independent vectors e1, e2, ..., em, we can obtain an orthonormal set using the procedure that follows:

u1 := e1                                            q1 := u1/||u1||
u2 := e2 − (q1'e2) q1                               q2 := u2/||u2||
...
um := em − Σ (from k = 1 to m−1) (qk'em) qk         qm := um/||um||

The first equation normalizes the vector e1 to have norm 1. The vector (q1'e2)q1 is the projection of the vector e2 along q1. Its subtraction from e2 yields the vertical part u2. It is then normalized to 1 as shown in Fig. 3.2. Using this procedure, we can obtain an orthonormal set. This is called the Schmidt orthonormalization procedure.

Figure 3.2 Schmidt orthonormalization procedure.

Let A = [a1 a2 ··· am] be an n x m matrix with m ≤ n. If all columns of A or {ai, i = 1, 2, ..., m} are orthonormal, then

A'A = [a1'; a2'; ...; am'] [a1 a2 ··· am] = I_m

where I_m is the unit matrix of order m. Note that, in general, AA' ≠ I_n. See Problem 3.4.

3.3 Linear Algebraic Equations

Consider the set of linear algebraic equations

Ax = y        (3.9)

where A and y are, respectively, m x n and m x 1 real matrices and x is an n x 1 vector. The matrices A and y are given and x is the unknown to be solved. Thus the set actually consists of m equations and n unknowns. The number of equations can be larger than, equal to, or smaller than the number of unknowns. We discuss the existence condition and general form of solutions of (3.9).

The range space of A is defined as all possible linear combinations of all columns of A. The rank of A is defined as the dimension of the range space or, equivalently, the number of linearly independent columns in A. The rank also equals the number of linearly independent rows. Because of this fact, if A is m x n, then

rank(A) ≤ min(m, n)

A vector x is called a null vector of A if Ax = 0. The null space of A consists of all its null vectors. The nullity is defined as the maximum number of linearly independent null vectors of A and is related to the rank by

Nullity(A) = number of columns of A − rank(A)        (3.10)

EXAMPLE 3.2 Consider the matrix

A = [0 1 1 2; 1 2 3 4; 2 0 2 0] =: [a1 a2 a3 a4]        (3.11)

where ai denotes the ith column of A. Clearly a1 and a2 are linearly independent. The third column is the sum of the first two columns, or a1 + a2 − a3 = 0. The last column is twice the second column, or 2a2 − a4 = 0. Thus A has two linearly independent columns and has rank 2. The set {a1, a2} can be used as a basis of the range space of A. Equation (3.10) implies that the nullity of A is 2. It can readily be verified that the two vectors

n1 = [1; 1; −1; 0]    n2 = [0; 2; 0; −1]        (3.12)

meet the condition Ani = 0. Because the two vectors are linearly independent, they form a basis of the null space.

In MATLAB, the range space, null space, and rank can be obtained by calling the functions orth, null, and rank. For example, for the matrix in (3.11), we type

a=[0 1 1 2;1 2 3 4;2 0 2 0];
rank(a)

which yields 2. Note that MATLAB computes ranks by using singular-value decomposition (svd), which will be introduced later. The svd algorithm also yields the range and null spaces of the matrix. The MATLAB function R=orth(a) yields (footnote 1)

R = 0.3782  −0.3084
    0.8877  −0.1468
    0.2627   0.9399        (3.13)

1. This is obtained using MATLAB Version 5. Earlier versions may yield different results.
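The Schmidt procedure above translates directly into code. Here is a sketch in Python/NumPy applying it to the columns a1, a2 of (3.11); as the text notes for orth, the resulting orthonormal basis will generally differ from the svd-based one, but it spans the same range space.

```python
import numpy as np

def schmidt(E):
    """Orthonormalize the linearly independent columns of E (Schmidt procedure)."""
    qcols = []
    for e in E.T:                        # iterate over the columns of E
        u = e.astype(float).copy()
        for q in qcols:
            u -= (q @ e) * q             # subtract the projection of e along q
        qcols.append(u / np.linalg.norm(u))
    return np.column_stack(qcols)

a1 = np.array([0.0, 1.0, 2.0])
a2 = np.array([1.0, 2.0, 0.0])
Q = schmidt(np.column_stack([a1, a2]))

print(np.round(Q.T @ Q, 10))  # identity of order 2: the columns are orthonormal
```

Stacking Q next to a1 and a2 and computing the rank gives 2, the same spanning check the text performs with rank([a1 a2 R]).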
50
LINEAR ALGEBRA The two columns of R form an orthonormal basis of the range space. To check the orthonormality, we type R' *R, which yields the unity matrix of order 2. The two columns in R are not obtained from the basis [ai, a2l in (3.11) by using the Schmidt orthonormalization procedure; they are a byproduct of svd. However, the two bases should span the same range space. This can be verified by typing
rank( [al a2
I
I
I ,
I
!
3.3 Linear Algebraic Equations

which yields 2. This confirms that {a1, a2} span the same space as the two columns of R. We mention that the rank of a matrix can be very sensitive to roundoff errors and imprecise data. For example, if we use the five-digit display of R in (3.13), the rank of [a1 a2 R] is 3. The rank is 2 if we use the R stored in MATLAB, which uses 16 digits plus exponent.

The null space of (3.11) can be obtained by typing null(a), which yields

    N = [−0.3434 −0.5802; −0.8384 0.3395; 0.3434 0.5802; 0.2475 −0.4598]   (3.14)

The two columns are an orthonormal basis of the null space spanned by the two vectors {n1, n2} in (3.12). All discussion for the range space applies here. That is, rank([n1 n2 N]) yields 3 if we use the five-digit display in (3.14). The rank is 2 if we use the N stored in MATLAB.

With this background, we are ready to discuss solutions of (3.9). We use ρ to denote the rank of a matrix.

Theorem 3.1

1. Given an m × n matrix A and an m × 1 vector y, an n × 1 solution x exists in Ax = y if and only if y lies in the range space of A or, equivalently,

    ρ(A) = ρ([A y])

where [A y] is an m × (n + 1) matrix with y appended to A as an additional column.

2. Given A, a solution x exists in Ax = y for every y, if and only if A has rank m (full row rank).

The first statement follows directly from the definition of the range space. If A has full row rank, then the rank condition in (1) is always satisfied for every y. This establishes the second statement.

Theorem 3.2 (Parameterization of all solutions)

Given an m × n matrix A and an m × 1 vector y, let xp be a solution of Ax = y and let k := n − ρ(A) be the nullity of A. If A has rank n (full column rank) or k = 0, then the solution xp is unique. If k > 0, then for every real ai, i = 1, 2, ..., k, the vector

    x = xp + a1 n1 + ⋯ + ak nk                                           (3.15)

is a solution of Ax = y, where {n1, ..., nk} is a basis of the null space of A.

Substituting (3.15) into Ax = y yields

    A xp + Σ_{i=1}^{k} ai A ni = A xp + 0 = y

Thus, for every ai, (3.15) is a solution. Let x be a solution of Ax = y. Subtracting this from A xp = y yields

    A(x − xp) = 0

which implies that x − xp is in the null space. Thus x can be expressed as in (3.15). This establishes Theorem 3.2.

EXAMPLE 3.3  Consider the equation

    Ax = [0 1 1 2; 1 2 3 4; 2 0 2 0] x = [−4; −8; 0] = y                 (3.16)

This y clearly lies in the range space of A and xp = [0 −4 0 0]' is a solution. A basis of the null space of A was shown in (3.12). Thus the general solution of (3.16) can be expressed as

    x = [0 −4 0 0]' + a1 n1 + a2 n2                                      (3.17)

for any real a1 and a2.

In application, we will also encounter xA = y, where the m × n matrix A and the 1 × n vector y are given, and the 1 × m vector x is to be solved. Applying Theorems 3.1 and 3.2 to the transpose of the equation, we can readily obtain the following result.

Corollary 3.2

1. Given an m × n matrix A, a solution x exists in xA = y, for every y, if and only if A has full column rank.

2. Given an m × n matrix A and a 1 × n vector y, let xp be a solution of xA = y and let k = m − ρ(A). If k = 0, the solution xp is unique. If k > 0, then for any ai, i = 1, 2, ..., k, the vector

    x = xp + a1 n1 + ⋯ + ak nk

is a solution of xA = y, where ni A = 0 and the set {n1, ..., nk} is linearly independent.

In MATLAB, the solution of Ax = y can be obtained by typing A\y. Note the use of backslash, which denotes matrix left division. For example, for the equation in (3.16), typing

    a=[0 1 1 2;1 2 3 4;2 0 2 0]; y=[-4;-8;0]; a\y
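The parameterization of all solutions can be checked numerically. In this Python sketch, numpy.linalg.lstsq stands in for MATLAB's backslash (an assumption): any particular solution plus any combination of null-space basis vectors again solves Ax = y.

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[0., 1., 1., 2.],
              [1., 2., 3., 4.],
              [2., 0., 2., 0.]])
y = np.array([-4., -8., 0.])

xp = np.linalg.lstsq(A, y, rcond=None)[0]   # one particular solution
N = null_space(A)                           # basis {n1, n2}, nullity k = 2

# general solution: x = xp + a1*n1 + a2*n2 for any real a1, a2
a1, a2 = 3.0, -7.0
x = xp + a1 * N[:, 0] + a2 * N[:, 1]
print(np.allclose(A @ xp, y), np.allclose(A @ x, y))
```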
yields [0 −4 0 0]'. The solution of xA = y can be obtained by typing y/A. Here we use slash, which denotes matrix right division.

Determinant and inverse of square matrices  The rank of a matrix is defined as the number of linearly independent columns or rows. It can also be defined using the determinant. The determinant of a 1 × 1 matrix is defined as itself. For n = 2, 3, ..., the determinant of the n × n square matrix A = [aij] is defined recursively as, for any chosen i,

    det A = Σ_{j=1}^{n} aij cij                                          (3.18)

where aij denotes the entry at the ith row and jth column of A. Equation (3.18) is called the Laplace expansion. The number cij is the cofactor corresponding to aij and equals (−1)^{i+j} det Mij, where Mij is the (n − 1) × (n − 1) submatrix of A obtained by deleting its ith row and jth column. If A is diagonal or triangular, then det A equals the product of all diagonal entries.

The determinant of any r × r submatrix of A is called a minor of order r. Then the rank can be defined as the largest order of all nonzero minors of A. In other words, if A has rank r, then there is at least one nonzero minor of order r, and every minor of order larger than r is zero. A square matrix is said to be nonsingular if its determinant is nonzero. Thus a nonsingular square matrix has full rank and all its columns (rows) are linearly independent.

The inverse of a nonsingular square matrix A = [aij] is denoted by A⁻¹. The inverse has the property AA⁻¹ = A⁻¹A = I and can be computed as

    A⁻¹ = Adj A / det A = (1/det A) [cij]'                               (3.19)

where cij is the cofactor. If a matrix is singular, its inverse does not exist. If A is 2 × 2, then we have

    A⁻¹ = (1/det A) [a22 −a12; −a21 a11]                                 (3.20)

Thus the inverse of a 2 × 2 matrix is very simple: interchanging the diagonal entries, changing the sign of the off-diagonal entries (without changing their positions), and dividing the resulting matrix by the determinant of A. In general, using (3.19) to compute the inverse is complicated. If A is triangular, it is simpler to compute its inverse by solving AA⁻¹ = I. Note that the inverse of a triangular matrix is again triangular. The MATLAB function inv computes the inverse of A.

Theorem 3.3

Consider Ax = y with A square.

1. If A is nonsingular, then the equation has a unique solution for every y and the solution equals A⁻¹y. In particular, the only solution of Ax = 0 is x = 0.

2. The homogeneous equation Ax = 0 has nonzero solutions if and only if A is singular. The number of linearly independent solutions equals the nullity of A.

3.4 Similarity Transformation

Consider an n × n matrix A. It maps ℝⁿ into itself. If we associate with ℝⁿ the orthonormal basis {i1, i2, ..., in} in (3.8), then the ith column of A is the representation of A ii with respect to the orthonormal basis. Now if we select a different set of basis {q1, q2, ..., qn}, then the matrix A has a different representation Ā. It turns out that the ith column of Ā is the representation of A qi with respect to the basis {q1, q2, ..., qn}. This is illustrated by the example that follows.

EXAMPLE 3.4  Consider the matrix

    A = [3 2 −1; −2 1 0; 4 3 1]                                          (3.21)

Let b = [0 0 1]'. Then we have

    Ab = [−1 0 1]'    A²b = A(Ab) = [−4 2 −3]'

It can be verified that the following relation holds:

    A³b = 17b − 15(Ab) + 5(A²b)                                          (3.22)

Because the three vectors b, Ab, and A²b are linearly independent, they can be used as a basis. We now compute the representation of A with respect to this basis. It is clear that

    A(b) = [b Ab A²b] [0 1 0]'
    A(Ab) = [b Ab A²b] [0 0 1]'
    A(A²b) = [b Ab A²b] [17 −15 5]'

where the last equation is obtained from (3.22). Thus the representation of A with respect to the basis {b, Ab, A²b} is

    Ā = [0 0 17; 1 0 −15; 0 1 5]                                         (3.23)
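The change of basis can be carried out numerically. The matrix A and vector b below are assumptions chosen for illustration; for any choice with {b, Ab, A²b} linearly independent, Q = [b Ab A²b] gives a companion-form representation Q⁻¹AQ.

```python
import numpy as np

# hypothetical A and b for illustration
A = np.array([[3., 2., -1.],
              [-2., 1., 0.],
              [4., 3., 1.]])
b = np.array([0., 0., 1.])

Q = np.column_stack([b, A @ b, A @ A @ b])   # basis {b, Ab, A^2 b}
Abar = np.linalg.inv(Q) @ A @ Q              # representation of A in that basis

# companion structure: first two columns are [0 1 0]' and [0 0 1]';
# the last column holds the coefficients of A^3 b in the basis
print(np.round(Abar, 8))
```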
The preceding discussion can be extended to the general case. Let A be an n × n matrix. If there exists an n × 1 vector b such that the n vectors b, Ab, ..., A^{n−1}b are linearly independent and if

    Aⁿb = βn b + βn−1 (Ab) + ⋯ + β1 (A^{n−1}b)

then the representation of A with respect to the basis {b, Ab, ..., A^{n−1}b} is

    Ā = [0 0 ⋯ 0 βn; 1 0 ⋯ 0 βn−1; 0 1 ⋯ 0 βn−2; ⋯; 0 0 ⋯ 1 β1]        (3.24)

This matrix is said to be in a companion form.

Consider the equation

    Ax = y                                                               (3.25)

The square matrix A maps x in ℝⁿ into y in ℝⁿ. With respect to the basis {q1, q2, ..., qn}, the equation becomes

    Ā x̄ = ȳ                                                             (3.26)

where x̄ and ȳ are the representations of x and y with respect to the basis {q1, q2, ..., qn}. As discussed in (3.7), they are related by

    x = Q x̄  and  y = Q ȳ                                               (3.27)

with Q = [q1 q2 ⋯ qn], an n × n nonsingular matrix. Substituting these into (3.25) yields

    Q ȳ = A Q x̄   or   ȳ = Q⁻¹AQ x̄                                     (3.28)

Comparing this with (3.26) yields

    Ā = Q⁻¹AQ   or   A = QĀQ⁻¹                                           (3.29)

This is called the similarity transformation, and A and Ā are said to be similar. We write (3.29) as AQ = QĀ, or

    A[q1 q2 ⋯ qn] = [Aq1 Aq2 ⋯ Aqn] = [q1 q2 ⋯ qn]Ā

This shows that the ith column of Ā is indeed the representation of Aqi with respect to the basis {q1, q2, ..., qn}.

3.5 Diagonal Form and Jordan Form

A square matrix A has different representations with respect to different sets of basis. In this section, we introduce a set of basis so that the representation will be diagonal or block diagonal.

A real or complex number λ is called an eigenvalue of the n × n real matrix A if there exists a nonzero vector x such that Ax = λx. Any nonzero vector x satisfying Ax = λx is called a (right) eigenvector of A associated with the eigenvalue λ. In order to find the eigenvalues of A, we write Ax = λx = λIx as

    (A − λI)x = 0                                                        (3.30)

where I is the unit matrix of order n. This is a homogeneous equation. If the matrix (A − λI) is nonsingular, then the only solution of (3.30) is x = 0 (Theorem 3.3). Thus in order for (3.30) to have a nonzero solution x, the matrix (A − λI) must be singular or have determinant 0. We define

    Δ(λ) = det(λI − A)

It is a monic polynomial of degree n with real coefficients and is called the characteristic polynomial of A. A polynomial is called monic if its leading coefficient is 1. If λ is a root of the characteristic polynomial, then the determinant of (A − λI) is 0 and (3.30) has at least one nonzero solution. Thus every root of Δ(λ) is an eigenvalue of A. Because Δ(λ) has degree n, the n × n matrix A has n eigenvalues (not necessarily all distinct).

We mention that the matrices

    [−α1 −α2 −α3 −α4; 1 0 0 0; 0 1 0 0; 0 0 1 0]
    [0 0 0 −α4; 1 0 0 −α3; 0 1 0 −α2; 0 0 1 −α1]

and their transposes all have the following characteristic polynomial:

    Δ(λ) = λ⁴ + α1λ³ + α2λ² + α3λ + α4

These matrices can easily be formed from the coefficients of Δ(λ) and are called companion-form matrices. The companion-form matrices will arise repeatedly later. The matrix in (3.24) is in such a form.
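The companion-form claim is easy to check numerically; the coefficient values below are arbitrary assumptions, and numpy.poly plays the role of MATLAB's poly.

```python
import numpy as np

# coefficients of d(l) = l^4 + a1*l^3 + a2*l^2 + a3*l + a4 (arbitrary values)
a1, a2, a3, a4 = 2.0, -3.0, 0.5, 1.0

C = np.zeros((4, 4))
C[0, :] = [-a1, -a2, -a3, -a4]   # first row carries the negated coefficients
C[1:, :3] = np.eye(3)            # ones on the subdiagonal

coeffs = np.poly(C)              # monic characteristic polynomial of C
print(coeffs)
```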
Eigenvalues of A are all distinct  Let λi, i = 1, 2, ..., n, be the eigenvalues of A and be all distinct. Let qi be an eigenvector of A associated with λi; that is, Aqi = λiqi. Then the set of eigenvectors {q1, q2, ..., qn} is linearly independent and can be used as a basis. Let Ā be the representation of A with respect to this basis. Then the first column of Ā is the representation of Aq1 = λ1q1 with respect to {q1, q2, ..., qn}. From

    Aq1 = λ1q1 = [q1 q2 ⋯ qn] [λ1 0 ⋯ 0]'

we conclude that the first column of Ā is [λ1 0 ⋯ 0]'. The second column of Ā is the representation of Aq2 = λ2q2 with respect to {q1, q2, ..., qn}, that is, [0 λ2 0 ⋯ 0]'. Proceeding forward, we can establish

    Ā = [λ1 0 ⋯ 0; 0 λ2 ⋯ 0; ⋯; 0 0 ⋯ λn]                               (3.31)

This is a diagonal matrix with the eigenvalues on the diagonal. Thus we conclude that every matrix with distinct eigenvalues has a diagonal matrix representation by using its eigenvectors as a basis. Different orderings of the eigenvectors will yield different diagonal matrices for the same A.

If we define

    Q = [q1 q2 ⋯ qn]                                                     (3.32)

then the matrix Ā equals

    Ā = Q⁻¹AQ                                                            (3.33)

as derived in (3.29). Computing (3.33) by hand is not simple because of the need to compute the inverse of Q. However, if we know Ā, then we can verify (3.33) by checking QĀ = AQ.

EXAMPLE 3.5  Consider the matrix

    A = [0 0 0; 1 0 2; 0 1 1]

Its characteristic polynomial is

    Δ(λ) = det(λI − A) = λ[λ(λ − 1) − 2] = (λ − 2)(λ + 1)λ

Thus A has eigenvalues 2, −1, and 0. The eigenvector associated with λ = 2 is any nonzero solution of

    (A − 2I)q1 = [−2 0 0; 1 −2 2; 0 1 −1] q1 = 0

Thus q1 = [0 1 1]' is an eigenvector associated with λ = 2. Note that the eigenvector is not unique; [0 α α]' for any nonzero real α can also be chosen as an eigenvector. The eigenvector associated with λ = −1 is any nonzero solution of

    (A − (−1)I)q2 = [1 0 0; 1 1 2; 0 1 2] q2 = 0

which yields q2 = [0 −2 1]'. Similarly, the eigenvector associated with λ = 0 can be computed as q3 = [2 1 −1]'. Thus the representation of A with respect to {q1, q2, q3} is

    Ā = [2 0 0; 0 −1 0; 0 0 0]                                           (3.34)

If we define

    Q = [q1 q2 q3] = [0 0 2; 1 −2 1; 1 1 −1]                             (3.35)

then Ā = Q⁻¹AQ, as derived in (3.29).

The result in this example can easily be obtained using MATLAB. Typing

    a=[0 0 0;1 0 2;0 1 1]; [q,d]=eig(a)

yields

    q = [0 0 0.8165; 0.7071 −0.8944 0.4082; 0.7071 0.4472 −0.4082]

    d = [2 0 0; 0 −1 0; 0 0 0]

where d is the diagonal matrix in (3.34). The matrix q is different from the Q in (3.35), but their corresponding columns differ only by a constant. This is due to the nonuniqueness of eigenvectors; every column of q is normalized to have norm 1 in MATLAB. If we type eig(a) without the left-hand-side argument, then MATLAB generates only the three eigenvalues 2, −1, 0.
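The same computation can be reproduced with numpy.linalg.eig, which, like MATLAB's eig, returns unit-norm eigenvectors (Python in place of MATLAB is an assumption of this sketch).

```python
import numpy as np

a = np.array([[0., 0., 0.],
              [1., 0., 2.],
              [0., 1., 1.]])

evals, q = np.linalg.eig(a)        # columns of q are unit-norm eigenvectors
Abar = np.linalg.inv(q) @ a @ q    # similarity transformation Q^{-1} A Q

print(np.sort(evals))              # the three eigenvalues in sorted order
print(np.allclose(Abar, np.diag(evals), atol=1e-8))
```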
We mention that eigenvalues in MATLAB are not computed from the characteristic polynomial. Computing the characteristic polynomial using the Laplace expansion and then computing its roots are not numerically reliable, especially when there are repeated roots. Eigenvalues are computed in MATLAB directly from the matrix by using similarity transformations. Once all eigenvalues are computed, the characteristic polynomial equals Π(λ − λi). In MATLAB, typing r=eig(a); poly(r) yields the characteristic polynomial.

Complex eigenvalues  Even though the data we encounter in practice are all real numbers, complex numbers may arise when we compute eigenvalues and eigenvectors. To deal with this problem, we must extend real linear spaces into complex linear spaces and permit all scalars such as αi in (3.4) to assume complex numbers. To see the reason, we consider

    Av = [1+j 2; 1 1−j] v = 0                                            (3.37)

If we restrict v to real vectors, then (3.37) has no nonzero solution and the two columns of A are linearly independent. However, if v is permitted to assume complex numbers, then v = [2 −1−j]' is a nonzero solution of (3.37). Thus the two columns of A are linearly dependent and A has rank 1. This is the rank obtained in MATLAB. Therefore, whenever complex eigenvalues arise, we consider complex linear spaces and complex scalars, and the transpose is replaced by the complex-conjugate transpose. By so doing, all concepts and results developed for real vectors and matrices can be applied to complex vectors and matrices. Incidentally, the diagonal matrix with complex eigenvalues in (3.36) can be transformed into a very useful real matrix, as we will discuss in Section 4.3.1.

EXAMPLE 3.6  Consider the matrix

    A = [−1 −1 5; 0 0 13; 0 −1 4]

Its characteristic polynomial is (λ + 1)(λ² − 4λ + 13). Thus A has eigenvalues −1 and 2 ± 3j. Note that complex-conjugate eigenvalues must appear in pairs because A has only real coefficients. The eigenvectors associated with −1 and 2 + 3j are, respectively, [1 0 0]' and [j 3+2j j]'. The eigenvector associated with λ = 2 − 3j is [−j 3−2j −j]', the complex conjugate of the eigenvector associated with λ = 2 + 3j. Thus we have

    Q = [1 j −j; 0 3+2j 3−2j; 0 j −j]   and   Ā = [−1 0 0; 0 2+3j 0; 0 0 2−3j]   (3.36)

The MATLAB function [q,d]=eig(a) yields

    q = [1 0.2582j −0.2582j; 0 0.7746+0.5164j 0.7746−0.5164j; 0 0.2582j −0.2582j]

and

    d = [−1 0 0; 0 2+3j 0; 0 0 2−3j]

All discussion in the preceding example applies here.

Eigenvalues of A are not all distinct  An eigenvalue with multiplicity 2 or higher is called a repeated eigenvalue. In contrast, an eigenvalue with multiplicity 1 is called a simple eigenvalue. If A has only simple eigenvalues, it always has a diagonal-form representation. If A has repeated eigenvalues, then it may not have a diagonal-form representation. However, it has a block-diagonal and triangular-form representation, as we will discuss next.

Consider an n × n matrix A with eigenvalue λ and multiplicity n. In other words, A has only one distinct eigenvalue. To simplify the discussion, we assume n = 4. Suppose the matrix (A − λI) has rank n − 1 = 3 or, equivalently, nullity 1; then the equation (A − λI)q = 0 has only one independent solution. Thus A has only one eigenvector associated with λ. We need n − 1 = 3 more linearly independent vectors to form a basis for ℝ⁴. The three vectors q2, q3, q4 will be chosen to have the properties (A − λI)²q2 = 0, (A − λI)³q3 = 0, and (A − λI)⁴q4 = 0.

A vector v is called a generalized eigenvector of grade n if

    (A − λI)ⁿ v = 0   and   (A − λI)ⁿ⁻¹ v ≠ 0

If n = 1, they reduce to (A − λI)v = 0 and v ≠ 0, and v is an ordinary eigenvector. For n = 4, we define

    v4 := v
    v3 := (A − λI)v4 = (A − λI)v
    v2 := (A − λI)v3 = (A − λI)²v
    v1 := (A − λI)v2 = (A − λI)³v

They are called a chain of generalized eigenvectors of length n = 4 and have the properties (A − λI)v1 = 0, (A − λI)²v2 = 0, (A − λI)³v3 = 0, and (A − λI)⁴v4 = 0. These vectors, as generated, are automatically linearly independent and can be used as a basis. From these equations, we can readily obtain

    Av1 = λv1
    Av2 = v1 + λv2
    Av3 = v2 + λv3
    Av4 = v3 + λv4
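A chain of generalized eigenvectors can be exhibited concretely. The 4 × 4 matrix below, a Jordan block itself, is an assumption chosen so that the chain relations are easy to verify.

```python
import numpy as np

lam = 2.0
A = lam * np.eye(4) + np.diag(np.ones(3), k=1)   # Jordan block of order 4

v4 = np.array([0., 0., 0., 1.])                  # generalized eigenvector of grade 4
v3 = (A - lam * np.eye(4)) @ v4
v2 = (A - lam * np.eye(4)) @ v3
v1 = (A - lam * np.eye(4)) @ v2

# chain relations: A v1 = lam v1 and A v_k = v_{k-1} + lam v_k
print(np.allclose(A @ v1, lam * v1),
      np.allclose(A @ v2, v1 + lam * v2),
      np.allclose(A @ v3, v2 + lam * v3),
      np.allclose(A @ v4, v3 + lam * v4))
```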
Then the representation of A with respect to the basis {v1, v2, v3, v4} is

    J := [λ 1 0 0; 0 λ 1 0; 0 0 λ 1; 0 0 0 λ]                            (3.38)

We verify this for the first and last columns. The first column of J is the representation of Av1 = λv1 with respect to {v1, v2, v3, v4}, which is [λ 0 0 0]'. The last column of J is the representation of Av4 = v3 + λv4 with respect to {v1, v2, v3, v4}, which is [0 0 1 λ]'. This verifies the representation in (3.38). The matrix J has eigenvalues on the diagonal and 1 on the superdiagonal. If we reverse the order of the basis, then the 1's will appear on the subdiagonal. The matrix is called a Jordan block of order n = 4.

If (A − λI) has rank n − 2 or, equivalently, nullity 2, then the equation

    (A − λI)q = 0

has two linearly independent solutions. Thus A has two linearly independent eigenvectors and we need (n − 2) generalized eigenvectors. In this case, there exist two chains of generalized eigenvectors {v1, v2, ..., vk} and {u1, u2, ..., ul} with k + l = n. If v1 and u1 are linearly independent, then the set of n vectors {v1, ..., vk, u1, ..., ul} is linearly independent and can be used as a basis. With respect to this basis, the representation of A is a block-diagonal matrix of the form

    Ā = diag{J1, J2}

where J1 and J2 are, respectively, Jordan blocks of order k and l.

Now we discuss a specific example. Consider a 5 × 5 matrix A with repeated eigenvalue λ1 with multiplicity 4 and simple eigenvalue λ2. Then there exists a nonsingular matrix Q such that Ā = Q⁻¹AQ assumes one of the following forms:

    Ā1 = [λ1 1 0 0 0; 0 λ1 1 0 0; 0 0 λ1 1 0; 0 0 0 λ1 0; 0 0 0 0 λ2]
    Ā2 = [λ1 1 0 0 0; 0 λ1 1 0 0; 0 0 λ1 0 0; 0 0 0 λ1 0; 0 0 0 0 λ2]
    Ā3 = [λ1 1 0 0 0; 0 λ1 0 0 0; 0 0 λ1 1 0; 0 0 0 λ1 0; 0 0 0 0 λ2]
    Ā4 = [λ1 1 0 0 0; 0 λ1 0 0 0; 0 0 λ1 0 0; 0 0 0 λ1 0; 0 0 0 0 λ2]
    Ā5 = [λ1 0 0 0 0; 0 λ1 0 0 0; 0 0 λ1 0 0; 0 0 0 λ1 0; 0 0 0 0 λ2]   (3.39)

The first matrix occurs when the nullity of (A − λ1I) is 1. If the nullity is 2, then Ā has two Jordan blocks associated with λ1; it may assume the form in Ā2 or in Ā3. If (A − λ1I) has nullity 3, then Ā has three Jordan blocks associated with λ1, as shown in Ā4. Certainly, the positions of the Jordan blocks can be changed by changing the order of the basis. If the nullity is 4, then Ā is a diagonal matrix, as shown in Ā5. All these matrices are triangular and block diagonal with Jordan blocks on the diagonal; they are said to be in Jordan form. A diagonal matrix is a degenerated Jordan form; its Jordan blocks all have order 1.

If A can be diagonalized, we can use [q,d]=eig(a) to generate Q and Ā, as shown in Examples 3.5 and 3.6. If A cannot be diagonalized, A is said to be defective and [q,d]=eig(a) will yield an incorrect solution. In this case, we may use the MATLAB function [q,d]=jordan(a). However, jordan will yield a correct result only if A has integers or ratios of small integers as its entries.

Jordan-form matrices are triangular and block diagonal and can be used to establish many general properties of matrices. For example, because det(CD) = det C det D and det Q det Q⁻¹ = det I = 1, from A = QĀQ⁻¹ we have

    det A = det Q det Ā det Q⁻¹ = det Ā

The determinant of Ā is the product of all its diagonal entries or, equivalently, all eigenvalues of A. Thus we have

    det A = product of all eigenvalues of A

which implies that A is nonsingular if and only if it has no zero eigenvalue.

We discuss a useful property of Jordan blocks to conclude this section. Consider the Jordan block in (3.38) with order 4. Then we have

    (J − λI) = [0 1 0 0; 0 0 1 0; 0 0 0 1; 0 0 0 0]
    (J − λI)² = [0 0 1 0; 0 0 0 1; 0 0 0 0; 0 0 0 0]
    (J − λI)³ = [0 0 0 1; 0 0 0 0; 0 0 0 0; 0 0 0 0]                     (3.40)

and (J − λI)^k = 0 for k ≥ 4. Such a matrix is called nilpotent.
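The nilpotent property can be checked directly; each power of (J − λI) pushes the superdiagonal of ones up one position, and the fourth power vanishes. The value of lam is an arbitrary assumption.

```python
import numpy as np

lam = -1.5
J = lam * np.eye(4) + np.diag(np.ones(3), k=1)   # Jordan block of order 4

N = J - lam * np.eye(4)
powers = [np.linalg.matrix_power(N, k) for k in range(1, 5)]
print([int(np.count_nonzero(P)) for P in powers])   # nonzero entries per power
```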
3.6 Functions of a Square Matrix

This section studies functions of a square matrix. We use Jordan form extensively because many properties of functions can almost be visualized in terms of Jordan form. We study first polynomials and then general functions of a square matrix.

Polynomials of a square matrix  Let A be a square matrix. If k is a positive integer, we define

    A^k := AA⋯A   (k terms)
and A⁰ = I. Let f(λ) be a polynomial such as f(λ) = λ³ + 2λ² − 6 or (λ + 2)(4λ − 3). Then f(A) is defined as

    f(A) = A³ + 2A² − 6I   or   f(A) = (A + 2I)(4A − 3I)

If A is block diagonal, such as

    A = [A1 0; 0 A2]

where A1 and A2 are square matrices of any order, then it is straightforward to verify

    A^k = [A1^k 0; 0 A2^k]   and   f(A) = [f(A1) 0; 0 f(A2)]             (3.41)

Consider the similarity transformation Ā = Q⁻¹AQ or A = QĀQ⁻¹. Because

    A^k = (QĀQ⁻¹)(QĀQ⁻¹)⋯(QĀQ⁻¹) = QĀ^kQ⁻¹

we have

    f(A) = Q f(Ā) Q⁻¹   or   f(Ā) = Q⁻¹ f(A) Q                           (3.42)

A monic polynomial is a polynomial with 1 as its leading coefficient. The minimal polynomial of A is defined as the monic polynomial ψ(λ) of least degree such that ψ(A) = 0. Note that the 0 is a zero matrix of the same order as A. A direct consequence of (3.42) is that f(A) = 0 if and only if f(Ā) = 0. Thus A and Ā or, more generally, all similar matrices have the same minimal polynomial. Computing the minimal polynomial directly from A is not simple (see Problem 3.25); however, if the Jordan-form representation of A is available, the minimal polynomial can be read out by inspection.

Let λi be an eigenvalue of A with multiplicity ni. That is, the characteristic polynomial of A is

    Δ(λ) = det(λI − A) = Π_i (λ − λi)^{ni}

Suppose the Jordan form of A is known. Associated with each eigenvalue, there may be one or more Jordan blocks. The index of λi, denoted by n̄i, is defined as the largest order of all Jordan blocks associated with λi. Clearly we have n̄i ≤ ni. For example, the multiplicities of λ1 in all five matrices in (3.39) are 4; their indices are, respectively, 4, 3, 2, 2, and 1. The multiplicities and indices of λ2 in all five matrices in (3.39) are all 1. Using the indices of all eigenvalues, the minimal polynomial can be expressed as

    ψ(λ) = Π_i (λ − λi)^{n̄i}

with degree n̄ = Σ n̄i ≤ Σ ni = n = dimension of A. For example, the minimal polynomials of the five matrices in (3.39) are

    ψ1 = (λ − λ1)⁴(λ − λ2)
    ψ2 = (λ − λ1)³(λ − λ2)
    ψ3 = (λ − λ1)²(λ − λ2)
    ψ4 = (λ − λ1)²(λ − λ2)
    ψ5 = (λ − λ1)(λ − λ2)

Their characteristic polynomials, however, all equal

    Δ(λ) = (λ − λ1)⁴(λ − λ2)

We see that the minimal polynomial is a factor of the characteristic polynomial and has a degree less than or equal to the degree of the characteristic polynomial. Clearly, if all eigenvalues of A are distinct, then the minimal polynomial equals the characteristic polynomial. Using the nilpotent property in (3.40), we can verify that ψ(A) = 0 and that no polynomial of lesser degree meets the condition. Thus ψ(λ) as defined is the minimal polynomial.

Theorem 3.4 (Cayley-Hamilton theorem)

Let

    Δ(λ) = det(λI − A) = λⁿ + α1λⁿ⁻¹ + ⋯ + αn−1λ + αn

be the characteristic polynomial of A. Then

    Δ(A) = Aⁿ + α1Aⁿ⁻¹ + ⋯ + αn−1A + αnI = 0                             (3.43)

In words, a matrix satisfies its own characteristic polynomial. Because ni ≥ n̄i, the characteristic polynomial contains the minimal polynomial as a factor, or Δ(λ) = ψ(λ)h(λ) for some polynomial h(λ). Because ψ(A) = 0, we have Δ(A) = ψ(A)h(A) = 0 · h(A) = 0. This establishes the theorem.

The Cayley-Hamilton theorem implies that Aⁿ can be written as a linear combination of {I, A, ..., Aⁿ⁻¹}. Multiplying (3.43) by A yields

    Aⁿ⁺¹ + α1Aⁿ + ⋯ + αn−1A² + αnA = 0 · A = 0

which implies that Aⁿ⁺¹ can be written as a linear combination of {A, A², ..., Aⁿ}, which, in turn, can be written as a linear combination of {I, A, ..., Aⁿ⁻¹}. Proceeding forward, we conclude that, for any polynomial f(λ), no matter how large its degree is, f(A) can always be expressed as

    f(A) = β0I + β1A + ⋯ + βn−1Aⁿ⁻¹                                      (3.44)

for some βi. In other words, every polynomial of an n × n matrix A can be expressed as a linear combination of {I, A, ..., Aⁿ⁻¹}. If the minimal polynomial of A with degree n̄ is available, then every polynomial of A can be expressed as a linear combination of {I, A, ..., A^{n̄−1}}. This is a better result. However, because n̄ may not be available, we discuss in the following only (3.44), with the understanding that all discussion applies to n̄.

One way to compute (3.44) is to use long division to express f(λ) as

    f(λ) = q(λ)Δ(λ) + h(λ)                                               (3.45)

where q(λ) is the quotient and h(λ) is the remainder with degree less than n. Then we have

    f(A) = q(A)Δ(A) + h(A) = q(A) · 0 + h(A) = h(A)
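The Cayley-Hamilton theorem is easy to verify numerically; the 2 × 2 matrix below is an arbitrary assumption, and numpy.poly supplies the characteristic-polynomial coefficients.

```python
import numpy as np

A = np.array([[0., 1.],
              [-2., -3.]])

c = np.poly(A)        # [1, a1, a2] of d(l) = l^2 + a1*l + a2
n = A.shape[0]

# evaluate d(A) = A^2 + a1*A + a2*I; it must be the zero matrix
dA = sum(c[i] * np.linalg.matrix_power(A, n - i) for i in range(n + 1))
print(np.allclose(dA, 0))
```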
Long division is not convenient to carry out if the degree of f(λ) is much larger than the degree of Δ(λ). In this case, we may solve h(λ) directly from (3.45). Let

    h(λ) := β0 + β1λ + ⋯ + βn−1λⁿ⁻¹

where the n unknowns βi are to be solved. If all n eigenvalues of A are distinct, these βi can be solved from the n equations

    f(λi) = q(λi)Δ(λi) + h(λi) = h(λi)

for i = 1, 2, ..., n. If A has repeated eigenvalues, then (3.45) must be differentiated to yield additional equations. This is stated as a theorem.

Theorem 3.5

We are given f(λ) and an n × n matrix A with characteristic polynomial

    Δ(λ) = Π_{i=1}^{m} (λ − λi)^{ni}

where n = Σ_{i=1}^{m} ni. Define

    h(λ) := β0 + β1λ + ⋯ + βn−1λⁿ⁻¹

It is a polynomial of degree n − 1 with n unknown coefficients. These n unknowns are to be solved from the following set of n equations:

    f^{(l)}(λi) = h^{(l)}(λi)   for l = 0, 1, ..., ni − 1 and i = 1, 2, ..., m

where

    f^{(l)}(λi) := (d^l f(λ)/dλ^l) evaluated at λ = λi

and h^{(l)}(λi) is similarly defined. Then we have

    f(A) = h(A)

and h(λ) is said to equal f(λ) on the spectrum of A.

EXAMPLE 3.7  Compute A^100 with

    A = [0 1; −1 −2]

In other words, given f(λ) = λ^100, compute f(A). The characteristic polynomial of A is

    Δ(λ) = λ² + 2λ + 1 = (λ + 1)²

Let h(λ) = β0 + β1λ. On the spectrum of A, we have

    f(−1) = h(−1):   (−1)^100 = β0 − β1
    f′(−1) = h′(−1):  100·(−1)^99 = β1

Thus we have β1 = −100, β0 = 1 + β1 = −99, h(λ) = −99 − 100λ, and

    A^100 = β0I + β1A = −99[1 0; 0 1] − 100[0 1; −1 −2] = [−99 −100; 100 101]

Clearly A^100 can also be obtained by multiplying A 100 times. However, it is simpler to use Theorem 3.5.

Functions of a square matrix  Let f(λ) be any function, not necessarily a polynomial. One way to define f(A) is to use Theorem 3.5. Let h(λ) be a polynomial of degree n − 1, where n is the order of A. We solve the coefficients of h(λ) by equating f(λ) = h(λ) on the spectrum of A. Then f(A) is defined as h(A).

EXAMPLE 3.8  Let

    A1 = [0 0 −2; 0 1 0; 1 0 3]

Compute e^{A1t}. Or, equivalently, if f(λ) = e^{λt}, what is f(A1)? The characteristic polynomial of A1 is (λ − 1)²(λ − 2). Let h(λ) = β0 + β1λ + β2λ². Then

    f(1) = h(1):    e^t = β0 + β1 + β2
    f′(1) = h′(1):  t e^t = β1 + 2β2
    f(2) = h(2):    e^{2t} = β0 + 2β1 + 4β2

Note that, in the second equation, the differentiation is with respect to λ, not t. Solving these equations yields β0 = −2te^t + e^{2t}, β1 = 3te^t + 2e^t − 2e^{2t}, and β2 = e^{2t} − e^t − te^t. Thus we have

    e^{A1t} = h(A1) = (−2te^t + e^{2t})I + (3te^t + 2e^t − 2e^{2t})A1 + (e^{2t} − e^t − te^t)A1²

            = [2e^t − e^{2t}  0  2e^t − 2e^{2t};  0  e^t  0;  e^{2t} − e^t  0  2e^{2t} − e^t]

EXAMPLE 3.9  Let

    A2 = [0 2 −2; 0 1 0; 1 −1 3]

Compute e^{A2t}. The characteristic polynomial of A2 is (λ − 1)²(λ − 2), which is the same as for A1. Hence we have the same h(λ) as in Example 3.8. Consequently, we have

    e^{A2t} = h(A2) = [2e^t − e^{2t}  2te^t  2e^t − 2e^{2t};  0  e^t  0;  e^{2t} − e^t  −te^t  2e^{2t} − e^t]
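The spectrum method of Example 3.7 can be replayed numerically and compared against brute-force powering (Python in place of hand computation is an assumption of this sketch; the matrix is the 2 × 2 one used above).

```python
import numpy as np

A = np.array([[0., 1.],
              [-1., -2.]])

# f(l) = l^100 on the spectrum {-1, -1}:
#   f(-1)  = b0 - b1 = 1
#   f'(-1) = b1 = 100*(-1)**99 = -100
b1 = -100.0
b0 = 1.0 + b1
A100 = b0 * np.eye(2) + b1 * A

print(A100)
print(np.allclose(A100, np.linalg.matrix_power(A, 100)))
```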
EXAMPLE 3.10  Consider the Jordan block of order 4:

    Ā = [λ1 1 0 0; 0 λ1 1 0; 0 0 λ1 1; 0 0 0 λ1]                         (3.46)

Its characteristic polynomial is (λ − λ1)⁴. Although we can select h(λ) as β0 + β1λ + β2λ² + β3λ³, it is computationally simpler to select h(λ) as

    h(λ) = β0 + β1(λ − λ1) + β2(λ − λ1)² + β3(λ − λ1)³

This selection is permitted because h(λ) has degree (n − 1) = 3 and n = 4 independent unknowns. The condition f(λ) = h(λ) on the spectrum of Ā yields immediately

    β0 = f(λ1),   β1 = f′(λ1),   β2 = f″(λ1)/2!,   β3 = f⁽³⁾(λ1)/3!

Thus we have

    f(Ā) = f(λ1)I + f′(λ1)(Ā − λ1I) + (f″(λ1)/2!)(Ā − λ1I)² + (f⁽³⁾(λ1)/3!)(Ā − λ1I)³

Using the special forms of (Ā − λ1I)^k as discussed in (3.40), we can readily obtain

    f(Ā) = [f(λ1)  f′(λ1)/1!  f″(λ1)/2!  f⁽³⁾(λ1)/3!
            0      f(λ1)      f′(λ1)/1!  f″(λ1)/2!
            0      0          f(λ1)      f′(λ1)/1!
            0      0          0          f(λ1)]                          (3.47)

If f(λ) = e^{λt}, then

    e^{Āt} = [e^{λ1t}  te^{λ1t}  t²e^{λ1t}/2!  t³e^{λ1t}/3!
              0        e^{λ1t}   te^{λ1t}      t²e^{λ1t}/2!
              0        0         e^{λ1t}       te^{λ1t}
              0        0         0             e^{λ1t}]                  (3.48)

EXAMPLE 3.11  Consider

    Ā = [λ1 1 0 0 0; 0 λ1 1 0 0; 0 0 λ1 0 0; 0 0 0 λ2 1; 0 0 0 0 λ2]

It is block diagonal and contains two Jordan blocks. If f(λ) = e^{λt}, then (3.41) and (3.48) imply

    e^{Āt} = [e^{λ1t}  te^{λ1t}  t²e^{λ1t}/2!  0        0
              0        e^{λ1t}   te^{λ1t}      0        0
              0        0         e^{λ1t}       0        0
              0        0         0             e^{λ2t}  te^{λ2t}
              0        0         0             0        e^{λ2t}]         (3.49)

If f(λ) = (s − λ)⁻¹, then (3.41) and (3.47) imply

    (sI − Ā)⁻¹ = [1/(s−λ1)  1/(s−λ1)²  1/(s−λ1)³  0          0
                  0          1/(s−λ1)  1/(s−λ1)²  0          0
                  0          0         1/(s−λ1)   0          0
                  0          0         0          1/(s−λ2)   1/(s−λ2)²
                  0          0         0          0          1/(s−λ2)]

Using power series  The function of A was defined using a polynomial of finite degree. We now give an alternative definition by using an infinite power series. Suppose f(λ) can be expressed as the power series

    f(λ) = Σ_{i=0}^{∞} βi λ^i

with the radius of convergence ρ. If all eigenvalues of A have magnitudes less than ρ, then f(A) can be defined as

    f(A) = Σ_{i=0}^{∞} βi A^i                                            (3.50)

Because functions of A are defined through polynomials of A, Equations (3.41) and (3.42) are applicable to functions. Instead of proving the equivalence of this definition and the definition based on Theorem 3.5, we use (3.50) to derive (3.47).

EXAMPLE 3.12  Consider the Jordan-form matrix Ā in (3.46). Let

    f(λ) = f(λ1) + f′(λ1)(λ − λ1) + (f″(λ1)/2!)(λ − λ1)² + ⋯

Then

    f(Ā) = f(λ1)I + f′(λ1)(Ā − λ1I) + (f″(λ1)/2!)(Ā − λ1I)² + ⋯
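The closed form for the exponential of a Jordan block can be compared against a general-purpose matrix exponential; scipy.linalg.expm and the values of lam and t below are assumptions of this sketch.

```python
import math
import numpy as np
from scipy.linalg import expm

# Jordan block J of order 4 with eigenvalue lam; exp(J t) has
# t^k e^{lam t}/k! on the kth superdiagonal.
lam, t = 0.5, 1.3
J = lam * np.eye(4) + np.diag(np.ones(3), k=1)

E = expm(J * t)                      # Pade-based matrix exponential

closed = np.zeros((4, 4))
for k in range(4):
    val = t**k * math.exp(lam * t) / math.factorial(k)
    closed += val * np.diag(np.ones(4 - k), k=k)

print(np.allclose(E, closed))
```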
Because (Ā − λ1I)^k = 0 for k ≥ n = 4, as discussed in (3.40), the infinite series reduces immediately to (3.47). Thus the two definitions lead to the same function of a matrix.

The most important function of A is the exponential function e^{At}. Because the Taylor series

    e^{λt} = 1 + λt + λ²t²/2! + ⋯ + λⁿtⁿ/n! + ⋯

converges for all finite λ and t, we have

    e^{At} = I + tA + (t²/2!)A² + ⋯ + (tⁿ/n!)Aⁿ + ⋯                      (3.51)

This series involves only multiplications and additions and may converge rapidly; therefore it is suitable for computer computation. We list in the following the program in MATLAB that computes (3.51) for t = 1:

    function E=expm2(A)
    E=zeros(size(A));
    F=eye(size(A));
    k=1;
    while norm(E+F-E,1)>0
      E=E+F;
      F=A*F/k;
      k=k+1;
    end

In the program, E denotes the partial sum and F is the next term to be added to E. The first line defines the function. The next two lines initialize E and F. Let c_k denote the kth term of (3.51) with t = 1. Then we have c_{k+1} = (A/k)c_k for k = 1, 2, .... Thus we have F=A*F/k. The computation stops if the 1-norm of E + F − E, denoted by norm(E+F-E,1), is rounded to 0 in computers. Because the algorithm compares F and E, not F and 0, the algorithm uses norm(E+F-E,1) instead of norm(F,1). Note that norm(a,1) is the 1-norm discussed in Section 3.2 and will be discussed again in Section 3.9. We see that the series can indeed be programmed easily. To improve the computed result, the techniques of scaling and squaring can be used. In MATLAB, the function expm2 uses (3.51). The function expm, or expm1, however, uses the so-called Padé approximation. It yields comparable results as expm2 but requires only about half the computing time. Thus expm is preferred to expm2. The function expm3 uses Jordan form, but it will yield an incorrect solution if a matrix is not diagonalizable. If a closed-form solution of e^{At} is needed, we must use Theorem 3.5 or Jordan form to compute e^{At}.

We derive some important properties of e^{At} to conclude this section. Using (3.51), we can readily verify the next two equalities:

    e^{0} = I                                                            (3.52)

    e^{A(t1+t2)} = e^{At1} e^{At2}                                       (3.53)

    [e^{At}]⁻¹ = e^{−At}                                                 (3.54)

To show (3.54), we set t2 = −t1. Then (3.53) and (3.52) imply

    e^{At1} e^{−At1} = e^{A·0} = I

which implies (3.54). Thus the inverse of e^{At} can be obtained by simply changing the sign of t. Differentiating term by term of (3.51) yields

    (d/dt) e^{At} = A + tA² + (t²/2!)A³ + ⋯ = A(I + tA + (t²/2!)A² + ⋯) = (I + tA + (t²/2!)A² + ⋯)A

Thus we have

    (d/dt) e^{At} = A e^{At} = e^{At} A                                  (3.55)

This is an important equation. We mention that

    e^{(A+B)t} ≠ e^{At} e^{Bt}                                           (3.56)

The equality holds only if A and B commute, or AB = BA. This can be verified by direct substitution of (3.51).

The Laplace transform of a function f(t) is defined as

    f̂(s) := L[f(t)] = ∫₀^∞ f(t) e^{−st} dt

It can be shown that

    L[t^k/k!] = s^{−(k+1)}

Taking the Laplace transform of (3.51) yields

    L[e^{At}] = Σ_{k=0}^{∞} s^{−(k+1)} A^k = s⁻¹ Σ_{k=0}^{∞} (s⁻¹A)^k

Because the infinite series

    Σ_{k=0}^{∞} (s⁻¹A)^k = I + s⁻¹A + s⁻²A² + ⋯ = (I − s⁻¹A)⁻¹

converges for |s⁻¹λ| < 1 for every eigenvalue λ of A, we have

    L[e^{At}] = s⁻¹(I − s⁻¹A)⁻¹ = s⁻¹I + s⁻²A + s⁻³A² + ⋯                (3.57)

and
70
LINEAR ALGEBRA
Although in the derivation of (3.57) we require s to be sufficiently large so that all eigenvalues of s^{-1}A have magnitudes less than 1, Equation (3.58) actually holds for all s except at the eigenvalues of A. Equation (3.58) can also be established from (3.55). Because L[df(t)/dt] = sL[f(t)] - f(0), applying the Laplace transform to (3.55) yields

    s L[e^{At}] - e^0 = A L[e^{At}]

which implies (3.58).

3.7 Lyapunov Equation

Consider the equation

    AM + MB = C      (3.59)

where A and B are, respectively, n x n and m x m constant matrices. In order for the equation to be meaningful, the matrices M and C must be of order n x m. The equation is called the Lyapunov equation.

The equation can be written as a set of standard linear algebraic equations. To see this, we assume n = 3 and m = 2 and write (3.59) explicitly as

    [a11 a12 a13; a21 a22 a23; a31 a32 a33][m11 m12; m21 m22; m31 m32]
      + [m11 m12; m21 m22; m31 m32][b11 b12; b21 b22] = [c11 c12; c21 c22; c31 c32]

Multiplying them out and then equating the corresponding entries on both sides of the equality, we obtain

    [a11+b11  a12      a13      b21      0        0
     a21      a22+b11  a23      0        b21      0
     a31      a32      a33+b11  0        0        b21
     b12      0        0        a11+b22  a12      a13
     0        b12      0        a21      a22+b22  a23
     0        0        b12      a31      a32      a33+b22]
      [m11; m21; m31; m12; m22; m32] = [c11; c21; c31; c12; c22; c32]      (3.60)

This is indeed a standard linear algebraic equation. The matrix in (3.60) is a square matrix of order nm = 3 x 2 = 6.

Let us define A(M) := AM + MB. Then the Lyapunov equation can be written as A(M) = C. It maps an nm-dimensional linear space into itself. A scalar η is called an eigenvalue of A if there exists a nonzero M such that A(M) = ηM. Because A can be considered as a square matrix of order nm, it has nm eigenvalues η_k, for k = 1, 2, ..., nm. It turns out that

    η_k = λ_i + μ_j    for i = 1, 2, ..., n; j = 1, 2, ..., m

where λ_i, i = 1, 2, ..., n, and μ_j, j = 1, 2, ..., m, are, respectively, the eigenvalues of A and B. In other words, the eigenvalues of A are all possible sums of the eigenvalues of A and B.

We show intuitively why this is the case. Let u be an n x 1 right eigenvector of A associated with λ_i; that is, Au = λ_i u. Let v be a 1 x m left eigenvector of B associated with μ_j; that is, vB = vμ_j. Applying A to the n x m matrix uv yields

    A(uv) = Auv + uvB = λ_i uv + uv μ_j = (λ_i + μ_j) uv

Because both u and v are nonzero, so is the matrix uv. Thus (λ_i + μ_j) is an eigenvalue of A.

The determinant of a square matrix is the product of all its eigenvalues. Thus a matrix is nonsingular if and only if it has no zero eigenvalue. If there are no i and j such that λ_i + μ_j = 0, then the square matrix in (3.60) is nonsingular and, for every C, there exists a unique M satisfying the equation. In this case, the Lyapunov equation is said to be nonsingular. If λ_i + μ_j = 0 for some i and j, then for a given C, solutions may or may not exist. If C lies in the range space of A, then solutions exist and are not unique. See Problem 3.32. The MATLAB function m=lyap(a,b,c) computes the solution of the Lyapunov equation in (3.59).
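The vectorized construction of (3.60) can be sketched with Kronecker products in Python/NumPy (the text's lyap is a MATLAB function; the A, B, C below are arbitrary illustrative data, not from the text).

```python
import numpy as np

# Solve the Lyapunov equation AM + MB = C by rewriting it as the
# standard linear system (I_m ⊗ A + B' ⊗ I_n) vec(M) = vec(C),
# which is the n = 3, m = 2 construction of (3.60) in general form.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 1.0],
              [1.0, 0.0, -2.0]])
B = np.array([[3.0, 1.0],
              [0.0, 4.0]])
C = np.ones((3, 2))

n, m = A.shape[0], B.shape[0]
K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))   # nm x nm matrix

# The eigenvalues of the map M -> AM + MB are all sums lambda_i + mu_j.
sums = sorted((li + mj).real for li in np.linalg.eigvals(A)
                             for mj in np.linalg.eigvals(B))
print(np.allclose(sorted(np.linalg.eigvals(K).real), sums))

# Since no sum is zero here, K is nonsingular and M is unique.
M = np.linalg.solve(K, C.flatten(order="F")).reshape((n, m), order="F")
print(np.allclose(A @ M + M @ B, C))
```

Both checks print True: the nm eigenvalues of the stacked matrix are exactly the pairwise sums, and the recovered M satisfies (3.59).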
3.8 Some Useful Formulas

This section discusses some formulas that will be needed later. Let A and B be m x n and n x p constant matrices. Then we have

    ρ(AB) ≤ min(ρ(A), ρ(B))      (3.61)
where ρ denotes the rank. This can be argued as follows. Let ρ(B) = α. Then B has α linearly independent rows. In AB, A operates on the rows of B. Thus the rows of AB are
linear combinations of the rows of B. Thus AB has at most α linearly independent rows. In AB, B operates on the columns of A. Thus if A has β linearly independent columns, then AB has at most β linearly independent columns. This establishes (3.61). Consequently, if A = B1 B2 B3 ..., then the rank of A is equal to or smaller than the smallest rank of Bi.

Let A be m x n and let C and D be any n x n and m x m nonsingular matrices. Then we have

    ρ(AC) = ρ(A) = ρ(DA)      (3.62)

In words, the rank of a matrix will not change after pre- or postmultiplying by a nonsingular matrix. To show (3.62), we define

    P := AC      (3.63)

Because ρ(A) ≤ min(m, n) and ρ(C) = n, we have ρ(A) ≤ ρ(C). Thus (3.61) implies

    ρ(P) ≤ min(ρ(A), ρ(C)) ≤ ρ(A)

Next we write (3.63) as A = PC^{-1}. Using the same argument, we have ρ(A) ≤ ρ(P). Thus we conclude ρ(P) = ρ(A). A consequence of (3.62) is that the rank of a matrix will not change by elementary operations. Elementary operations are (1) multiplying a row or a column by a nonzero number, (2) interchanging two rows or two columns, and (3) adding the product of one row (column) and a number to another row (column). These operations are the same as multiplying nonsingular matrices. See Reference [6, p. 542].

Let A be m x n and B be n x m. Then we have

    det(I_m + AB) = det(I_n + BA)      (3.64)

where I_m is the unit matrix of order m. To show (3.64), let us define

    N = [I_m  A; 0  I_n]    Q = [I_m  0; -B  I_n]    P = [I_m  -A; B  I_n]

We compute

    NP = [I_m + AB  0; B  I_n]    and    QP = [I_m  -A; 0  I_n + BA]

Because N and Q are block triangular, their determinants equal the products of the determinants of their block-diagonal matrices or

    det N = det I_m · det I_n = 1 = det Q

Likewise, we have det(NP) = det(I_m + AB) and det(QP) = det(I_n + BA). Because

    det(NP) = det N · det P = det P    and    det(QP) = det Q · det P = det P

we conclude det(I_m + AB) = det(I_n + BA). In N, Q, and P, if I_m, I_n, and B are replaced, respectively, by √s I_m, √s I_n, and -B, then we can readily obtain

    det(sI_m - AB) = s^{m-n} det(sI_n - BA)      (3.65)

which implies, for n = m or for n x n square matrices A and B,

    det(sI_n - AB) = det(sI_n - BA)      (3.66)

They are useful formulas.

3.9 Quadratic Form and Positive Definiteness

An n x n real matrix M is said to be symmetric if its transpose equals itself. The scalar function x'Mx, where x is an n x 1 real vector and M' = M, is called a quadratic form. We show that all eigenvalues of symmetric M are real. The eigenvalues and eigenvectors of real matrices can be complex as shown in Example 3.6. Therefore we must allow x to assume complex numbers for the time being and consider the scalar function x*Mx, where x* is the complex conjugate transpose of x. Taking the complex conjugate transpose of x*Mx yields

    (x*Mx)* = x*M*x = x*M'x = x*Mx

where we have used the fact that the complex conjugate transpose of a real M reduces to simply the transpose. Thus x*Mx is real for any complex x. This assertion is not true if M is not symmetric. Let λ be an eigenvalue of M and v be its eigenvector; that is, Mv = λv. Because

    v*Mv = v*λv = λ(v*v)

and because both v*Mv and v*v are real, the eigenvalue λ must be real. This shows that all eigenvalues of symmetric M are real. After establishing this fact, we can return our study to exclusively real vector x.

We claim that every symmetric matrix can be diagonalized using a similarity transformation even if it has repeated eigenvalue λ. To show this, we show that there is no generalized eigenvector of grade 2 or higher. Suppose x is a generalized eigenvector of grade 2; that is,

    (M - λI)^2 x = 0      (3.67)
    (M - λI)x ≠ 0         (3.68)

Consider
    [(M - λI)x]'(M - λI)x = x'(M' - λI')(M - λI)x = x'(M - λI)^2 x

which is nonzero according to (3.68) but is zero according to (3.67). This is a contradiction. Therefore the Jordan form of M has no Jordan block of order 2. Similarly, we can show that the Jordan form of M has no Jordan block of order 3 or higher. Thus we conclude that there exists a nonsingular Q such that

    M = QDQ^{-1}      (3.69)

where D is a diagonal matrix with the real eigenvalues of M on the diagonal.

A square matrix A is called an orthogonal matrix if all columns of A are orthonormal. Clearly A is nonsingular and we have

    A'A = I    and    A^{-1} = A'

which imply AA' = AA^{-1} = I = A'A. Thus the inverse of an orthogonal matrix equals its transpose. Consider (3.69). Because D' = D and M' = M, (3.69) equals its own transpose or

    QDQ^{-1} = [Q^{-1}]'DQ'

which implies Q^{-1} = Q' and Q'Q = QQ' = I. Thus Q is an orthogonal matrix; its columns are orthonormalized eigenvectors of M. This is summarized as a theorem.

Theorem 3.6
For every real symmetric matrix M, there exists an orthogonal matrix Q such that

    M = QDQ'    or    D = Q'MQ

where D is a diagonal matrix with the eigenvalues of M, which are all real, on the diagonal.

A symmetric matrix M is said to be positive definite, denoted by M > 0, if x'Mx > 0 for every nonzero x. It is positive semidefinite, denoted by M ≥ 0, if x'Mx ≥ 0 for every nonzero x. If M > 0, then x'Mx = 0 if and only if x = 0. If M is positive semidefinite, then there exists a nonzero x such that x'Mx = 0. This property will be used repeatedly later.

Theorem 3.7
A symmetric n x n matrix M is positive definite (positive semidefinite) if and only if any one of the following conditions holds.

1. Every eigenvalue of M is positive (zero or positive).
2. All the leading principal minors of M are positive (all the principal minors of M are zero or positive).
3. There exists an n x n nonsingular matrix N (an n x n singular matrix N or an m x n matrix N with m < n) such that M = N'N.

Condition (1) can readily be proved by using Theorem 3.6. Next we consider Condition (3). If M = N'N, then

    x'Mx = x'N'Nx = (Nx)'(Nx) = ||Nx||_2^2 ≥ 0

for any x. If N is nonsingular, the only x to make Nx = 0 is x = 0. Thus M is positive definite. If N is singular, there exists a nonzero x to make Nx = 0. Thus M is positive semidefinite. For a proof of Condition (2), see Reference [10].

We use an example to illustrate the principal minors and leading principal minors. Consider

    M = [m11 m12 m13; m21 m22 m23; m31 m32 m33]

Its principal minors are m11, m22, m33,

    det [m11 m12; m21 m22],  det [m11 m13; m31 m33],  det [m22 m23; m32 m33]

and det M. Thus the principal minors are the determinants of all submatrices of M whose diagonals coincide with the diagonal of M. The leading principal minors of M are

    m11,  det [m11 m12; m21 m22],  and  det M

Thus the leading principal minors of M are the determinants of the submatrices of M obtained by deleting the last k columns and last k rows for k = 2, 1, 0.

Theorem 3.8
1. An m x n matrix H, with m ≥ n, has rank n, if and only if the n x n matrix H'H has rank n or det(H'H) ≠ 0.
2. An m x n matrix H, with m ≤ n, has rank m, if and only if the m x m matrix HH' has rank m or det(HH') ≠ 0.

The symmetric matrix H'H is always positive semidefinite. It becomes positive definite if H'H is nonsingular. We give a proof of this theorem. The argument in the proof will be used to establish the main results in Chapter 6; therefore the proof is spelled out in detail.

Proof: Necessity: The condition ρ(H'H) = n implies ρ(H) = n. We show this by contradiction. Suppose ρ(H'H) = n but ρ(H) < n. Then there exists a nonzero vector v such that Hv = 0, which implies H'Hv = 0. This contradicts ρ(H'H) = n. Thus ρ(H'H) = n implies ρ(H) = n. Sufficiency: The condition ρ(H) = n implies ρ(H'H) = n. Suppose not, or ρ(H'H) < n; then there exists a nonzero vector v such that H'Hv = 0, which implies v'H'Hv = 0 or

    0 = v'H'Hv = (Hv)'(Hv) = ||Hv||_2^2

Thus we have Hv = 0. This contradicts the hypotheses that v ≠ 0 and ρ(H) = n. Thus ρ(H) = n implies ρ(H'H) = n. This establishes the first part of Theorem 3.8. The second part can be established similarly. Q.E.D.

We discuss the relationship between the eigenvalues of H'H and those of HH'. Because both H'H and HH' are symmetric and positive semidefinite, their eigenvalues are real and nonnegative (zero or positive).
If H is m x n, then H'H has n eigenvalues and HH' has m eigenvalues. Let A = H and B = H'. Then (3.65) becomes

    det(sI_m - HH') = s^{m-n} det(sI_n - H'H)      (3.70)

This implies that the characteristic polynomials of HH' and H'H differ only by s^{m-n}. Thus we conclude that HH' and H'H have the same nonzero eigenvalues but may have different numbers of zero eigenvalues. Furthermore, they have at most n̄ := min(m, n) number of nonzero eigenvalues.

3.10 Singular-Value Decomposition

Let H be an m x n real matrix. Define M := H'H. Clearly M is n x n, symmetric, and semidefinite. Thus all eigenvalues of M are real and nonnegative (zero or positive). Let r be the number of its positive eigenvalues. Then the eigenvalues of M = H'H can be arranged as

    λ1 ≥ λ2 ≥ ... ≥ λr > 0 = λ_{r+1} = ... = λn

Let n̄ := min(m, n) and λ̄_i := λ_i^{1/2}. Then the set

    λ̄1 ≥ λ̄2 ≥ ... ≥ λ̄r > 0 = λ̄_{r+1} = ... = λ̄_{n̄}

is called the singular values of H. The singular values are usually arranged in descending order in magnitude.

EXAMPLE 3.13  Consider the 2 x 3 matrix

    H = [-4 -1 2; 2 0.5 -1]

We compute

    M = H'H = [20 5 -10; 5 1.25 -2.5; -10 -2.5 5]

and compute its characteristic polynomial as

    det(λI - M) = λ^3 - 26.25λ^2 = λ^2(λ - 26.25) = 0

Thus the eigenvalues of H'H are 26.25, 0, and 0, and the singular values of H are √26.25 = 5.1235 and 0. Note that the number of singular values equals min(n, m). In view of (3.70), we can also compute the singular values of H from the eigenvalues of HH'. Indeed, we have

    M̄ := HH' = [21 -10.5; -10.5 5.25]

and

    det(λI - M̄) = λ^2 - 26.25λ = λ(λ - 26.25)

Thus the eigenvalues of HH' are 26.25 and 0 and the singular values of H' are 5.1235 and 0. We see that the eigenvalues of H'H differ from those of HH' only in the number of zero eigenvalues, and the singular values of H equal the singular values of H'.

For M = H'H, there exists, following Theorem 3.6, an orthogonal matrix Q such that

    Q'H'HQ = D =: S'S      (3.71)

where D is an n x n diagonal matrix with λ_i on the diagonal. The matrix S is m x n with the singular values λ̄_i on the diagonal. Manipulation on (3.71) will lead eventually to the theorem that follows.

Theorem 3.9 (Singular-value decomposition)
Every m x n matrix H can be transformed into the form

    H = RSQ'

with R'R = RR' = I_m, Q'Q = QQ' = I_n, and S being m x n with the singular values of H on the diagonal.

The columns of Q are orthonormalized eigenvectors of H'H and the columns of R are orthonormalized eigenvectors of HH'. Once R, S, and Q are computed, the rank of H equals the number of nonzero singular values. If the rank of H is r, the first r columns of R are an orthonormal basis of the range space of H. The last (n - r) columns of Q are an orthonormal basis of the null space of H. Although computing singular-value decomposition is time consuming, it is very reliable and gives a quantitative measure of the rank. Thus it is used in MATLAB to compute the rank, range space, and null space. In MATLAB, the singular values of H can be obtained by typing s=svd(H). Typing [R,S,Q]=svd(H) yields the three matrices in the theorem. Typing orth(H) and null(H) yields, respectively, orthonormal bases of the range space and null space of H. The function null will be used repeatedly in Chapter 7.

EXAMPLE 3.14  Consider the matrix in (3.11). We type

    a=[0 1 1 2; 1 2 3 4; 2 0 2 0];
    [r,s,q]=svd(a)

which yield

    r = [0.3782  -0.3084   0.8729
         0.8877  -0.1468  -0.4364
         0.2627   0.9399   0.2182]

    s = [6.1568  0       0  0
         0       2.4686  0  0
         0       0       0  0]

    q = [0.2295   0.7020  -0.5802   0.3434
         0.3498  -0.2439   0.3395   0.8384
         0.5793   0.4581   0.5802  -0.3434
         0.6996  -0.4877  -0.4598  -0.2475]

Thus the singular values of the matrix A in (3.11) are 6.1568, 2.4686, and 0. The matrix has two nonzero singular values, thus its rank is 2 and, consequently, its nullity is 4 - ρ(A) = 2. The first two columns of r are the orthonormal basis in (3.13) and the last two columns of q are the orthonormal basis in (3.14).
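The same computation can be sketched in Python/NumPy (the text uses MATLAB's svd, orth, and null; the rank tolerance below is a common choice, not from the text).

```python
import numpy as np

# NumPy analog of Example 3.14: the rank of a is the number of
# nonzero singular values; the first r columns of R span the range
# space and the last n - r columns of Q span the null space.
a = np.array([[0., 1., 1., 2.],
              [1., 2., 3., 4.],
              [2., 0., 2., 0.]])

R, sv, QT = np.linalg.svd(a)       # a = R S Q', QT holds Q'
print(np.round(sv, 4))             # singular values 6.1568, 2.4686, ~0

tol = max(a.shape) * np.finfo(float).eps * sv[0]
r = int(np.sum(sv > tol))
print(r)                           # rank 2, so the nullity is 4 - 2 = 2

range_basis = R[:, :r]             # orthonormal basis of the range space
null_basis = QT[r:, :].T           # orthonormal basis of the null space
print(np.allclose(a @ null_basis, 0))
```

The last line prints True: every null-basis column is annihilated by a, confirming the splitting of Q claimed in the theorem.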
3.11 Norms of Matrices

The concept of norms for vectors can be extended to matrices. This concept is needed in Chapter 5. Let A be an m x n matrix. The norm of A can be defined as

    ||A|| = sup_{x≠0} ||Ax||/||x|| = sup_{||x||=1} ||Ax||      (3.72)

where sup stands for supremum or the least upper bound. This norm is defined through the norm of x and is therefore called an induced norm. For different ||x||, we have different ||A||. For example, if the 1-norm ||x||_1 is used, then

    ||A||_1 = max_j ( Σ_{i=1}^{m} |a_ij| ) = largest column absolute sum

where a_ij is the ijth element of A. If the Euclidean norm ||x||_2 is used, then

    ||A||_2 = largest singular value of A = (largest eigenvalue of A'A)^{1/2}

If the infinite-norm ||x||_∞ is used, then

    ||A||_∞ = max_i ( Σ_{j=1}^{n} |a_ij| ) = largest row absolute sum

These norms are all different for the same A. For example, if

    A = [3 2; -1 0]

then ||A||_1 = 3 + |-1| = 4, ||A||_2 = 3.7, and ||A||_∞ = 3 + 2 = 5, as shown in Fig. 3.3. The MATLAB functions norm(a,1), norm(a,2)=norm(a), and norm(a,inf) compute the three norms.

[Figure 3.3  Different norms of A.]

The norm of matrices has the following properties:

    ||Ax|| ≤ ||A|| ||x||
    ||A + B|| ≤ ||A|| + ||B||
    ||AB|| ≤ ||A|| ||B||
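The three induced norms of the example matrix can be checked in Python/NumPy (an analog of the MATLAB norm calls mentioned above).

```python
import numpy as np

# The three induced norms of A = [3 2; -1 0], computed from their
# column-sum / singular-value / row-sum characterizations and checked
# against numpy.linalg.norm.
A = np.array([[3.0, 2.0], [-1.0, 0.0]])

one_norm = max(np.sum(np.abs(A), axis=0))             # largest column absolute sum
inf_norm = max(np.sum(np.abs(A), axis=1))             # largest row absolute sum
two_norm = np.sqrt(max(np.linalg.eigvalsh(A.T @ A)))  # largest singular value

print(one_norm, round(two_norm, 1), inf_norm)  # 4.0 3.7 5.0
assert one_norm == np.linalg.norm(A, 1)
assert inf_norm == np.linalg.norm(A, np.inf)
assert abs(two_norm - np.linalg.norm(A, 2)) < 1e-12
```

The values 4, 3.7, and 5 match the text, illustrating that the three induced norms of the same matrix generally differ.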
Problems

The reader should try first to solve all problems involving numerical numbers by hand and then verify the results using MATLAB or any software.

3.1 Consider Fig. 3.1. What is the representation of the vector x with respect to the basis {q1, i2}? What is the representation of q1 with respect to {i2, q2}?

3.2 What are the 1-norm, 2-norm, and infinite-norm of the vectors

3.3 Find two orthonormal vectors that span the same space as the two vectors in Problem 3.2.

3.4 Consider an n x m matrix A with n ≥ m. If all columns of A are orthonormal, then A'A = I_m. What can you say about AA'?

3.5 Find the ranks and nullities of the following matrices:
2
1
3
2
AI=
o
3.6 3.7
o
AJ=
3.6 Find bases of the range spaces and null spaces of the matrices in Problem 3.5.

3.7 Consider the linear algebraic equation
[~ [~
4 2 0 0
I
~O]
Az =
0
n
matrix
(Xl
[~ IJ
0 4 4 20 25
16
A4 = [:
3]
20
Note that all except A4 can be diagonalized.

It has three equations and two unknowns. Does a solution x exist in the equation? Is the solution unique? Does a solution exist if y = [1 1 1]'?

3.8 Find the general solution of

3.14 Consider the companion-form matrix
    A = [-a1 -a2 -a3 -a4; 1 0 0 0; 0 1 0 0; 0 0 1 0]

Show that its characteristic polynomial is given by

    Δ(λ) = λ^4 + a1 λ^3 + a2 λ^2 + a3 λ + a4
How many parameters do you have?

3.9 Find the solution in Example 3.3 that has the smallest Euclidean norm.

For the matrix in Problem 3.14, show also that if λ_i is an eigenvalue of A, that is, a solution of Δ(λ) = 0, then [λ_i^3  λ_i^2  λ_i  1]' is an eigenvector of A associated with λ_i.

3.10 Find the solution in Problem 3.8 that has the smallest Euclidean norm.
3.11 Consider the equation

    x[n] = A^n x[0] + A^{n-1}b u[0] + A^{n-2}b u[1] + ... + Ab u[n-2] + b u[n-1]

where A is an n x n matrix and b is an n x 1 column vector. Under what conditions on A and b will there exist u[0], u[1], ..., u[n-1] to meet the equation for any x[n] and x[0]? [Hint: Write the equation in the form

    x[n] - A^n x[0] = [b  Ab  ...  A^{n-1}b] [u[n-1]; u[n-2]; ...; u[0]] ]

3.15 Show that the Vandermonde determinant

    [λ1^3 λ2^3 λ3^3 λ4^3; λ1^2 λ2^2 λ3^2 λ4^2; λ1 λ2 λ3 λ4; 1 1 1 1]

equals Π_{1≤i<j≤4}(λj - λi). Thus we conclude that the matrix is nonsingular or, equivalently, the eigenvectors are linearly independent if all eigenvalues are distinct.
3.16 Show that the companion-form matrix in Problem 3.14 is nonsingular if and only if a4 ≠ 0. Under this assumption, show that its inverse equals

    A^{-1} = [0 1 0 0; 0 0 1 0; 0 0 0 1; -1/a4  -a1/a4  -a2/a4  -a3/a4]
3.12 What are the representations of A with respect to the basis {b, Ab, A^2 b, A^3 b} and the basis {b̄, Ab̄, A^2 b̄, A^3 b̄}, respectively? (Note that the representations are the same.)

3.13 Find Jordan-form representations of the following matrices:

3.17 Consider a matrix A with λ ≠ 0 and T > 0. Show that [0 0 1]' is a generalized eigenvector of grade 3 and the three columns of Q
constitute a chain of generalized eigenvectors of length 3. Verify

    Q^{-1} A Q = [λ 1 0; 0 λ 1; 0 0 λ]

3.24 Let C be given. Find a matrix B such that e^B = C. Show that if λ_i = 0 for some i, then B does not exist. For a second given C, find a B such that e^B = C. Is it true that, for any nonsingular C, there exists a matrix B such that e^B = C?

3.25 Let

    (sI - A)^{-1} = Adj(sI - A) / Δ(s)
where Δ(s) := det(sI - A) := s^n + a1 s^{n-1} + a2 s^{n-2} + ... + an, and let m(s) be the monic greatest common divisor of all entries of Adj(sI - A). Verify for the matrix A3 in Problem 3.13 that the minimal polynomial of A equals Δ(s)/m(s).

3.18 Find the characteristic polynomials and the minimal polynomials of the following matrices:

3.19 Show that if λ is an eigenvalue of A with eigenvector x, then f(λ) is an eigenvalue of f(A) with the same eigenvector x.

3.20 Show that an n x n matrix has the property A^k = 0 for k ≥ m if and only if A has eigenvalue 0 with multiplicity n and index m or less. Such a matrix is called a nilpotent matrix.

3.21 Given A, find A^10, A^103, and e^{At}.

3.22 Use two different methods to compute e^{At} for the matrices in Problem 3.13.

3.23 Show that functions of the same matrix commute; that is, f(A)g(A) = g(A)f(A). Consequently we have Ae^{At} = e^{At}A.

3.26 Define

    (sI - A)^{-1} = (1/Δ(s)) [R0 s^{n-1} + R1 s^{n-2} + ... + R_{n-2} s + R_{n-1}]

where Δ(s) := det(sI - A) := s^n + a1 s^{n-1} + a2 s^{n-2} + ... + an and the Ri are constant matrices. This definition is valid because the degree in s of the adjoint of (sI - A) is at most n - 1. Verify

    a1 = -tr(AR0)            R0 = I
    a2 = -tr(AR1)/2          R1 = AR0 + a1 I = A + a1 I
    a3 = -tr(AR2)/3          R2 = AR1 + a2 I
      ...                      ...
    a_{n-1} = -tr(AR_{n-2})/(n-1)    R_{n-2} = AR_{n-3} + a_{n-2} I
    a_n = -tr(AR_{n-1})/n            R_{n-1} = AR_{n-2} + a_{n-1} I

where tr stands for the trace of a matrix and is defined as the sum of all its diagonal entries. This process of computing a_i and R_i is called the Leverrier algorithm.

3.27 Use Problem 3.26 to prove the Cayley-Hamilton theorem.

3.28 Use Problem 3.26 to show

    (sI - A)^{-1} = (1/Δ(s)) [A^{n-1} + (s + a1)A^{n-2} + (s^2 + a1 s + a2)A^{n-3} + ... + (s^{n-1} + a1 s^{n-2} + ... + a_{n-1}) I]
3.29 Let all eigenvalues of A be distinct and let qi be a right eigenvector of A associated with λ_i; that is, Aqi = λ_i qi. Define Q = [q1 q2 ... qn] and define

    P := Q^{-1} = [p1; p2; ...; pn]

where pi is the ith row of P. Show that pi is a left eigenvector of A associated with λ_i; that is, pi A = λ_i pi.

3.30 Show that if all eigenvalues of A are distinct, then (sI - A)^{-1} can be expressed as

    (sI - A)^{-1} = Σ_i (1/(s - λ_i)) qi pi

where qi and pi are right and left eigenvectors of A associated with λ_i.

3.31 Find the M to meet the Lyapunov equation in (3.59). What are the eigenvalues of the Lyapunov equation? Is the Lyapunov equation singular? Is the solution unique?

3.32 Repeat Problem 3.31 for

    A = [0 1; -1 -2]    B = 1

with two different C.

3.33 Check to see if the following matrices are positive definite or semidefinite:

3.34 Compute the singular values of the following matrices:

3.35 If A is symmetric, what is the relationship between its eigenvalues and singular values?

3.36 Show

3.37 Show (3.65).

3.38 Consider Ax = y, where A is m x n and has rank m. Is (A'A)^{-1}A'y a solution? If not, under what condition will it be a solution? Is A'(AA')^{-1}y a solution?
Chapter 4

State-Space Solutions and Realizations

4.1 Introduction

We showed in Chapter 2 that linear systems can be described by convolutions and, if lumped, by state-space equations. This chapter discusses how to find their solutions. First we discuss briefly how to compute solutions of the input-output description. There is no simple analytical way of computing the convolution

    y(t) = ∫_{t0}^{t} g(t, τ)u(τ) dτ

The easiest way is to compute it numerically on a digital computer. Before doing so, the equation must be discretized. One way is to discretize it as

    y(kΔ) = Σ_{m=k0}^{k} g(kΔ, mΔ)u(mΔ)Δ      (4.1)

where Δ is called the integration step size. This is basically the discrete convolution discussed in (2.34). This discretization is the easiest but yields the least accurate result for the same integration step size. For other integration methods, see, for example, Reference [17].

For the linear time-invariant (LTI) case, we can also use ŷ(s) = ĝ(s)û(s) to compute the solution. If a system is distributed, ĝ(s) will not be a rational function of s. Except for some special cases, it is simpler to compute the solution directly in the time domain as in (4.1). If the system is lumped, ĝ(s) will be a rational function of s. In this case, if the Laplace transform of u(t) is also a rational function of s, then the solution can be obtained by taking the inverse Laplace transform of ĝ(s)û(s). This method requires computing poles, carrying out partial fraction expansion, and then using a Laplace transform table. These can be carried out using the MATLAB functions roots and residue. However, when there are repeated poles, the computation may become very sensitive to small changes in the data, including roundoff errors; therefore computing solutions using the Laplace transform is not a viable method on digital computers. A better method is to transform transfer functions into state-space equations and then compute the solutions. This chapter discusses solutions of state equations, how to transform transfer functions into state equations, and other related topics. We discuss first the time-invariant case and then the time-varying case.

4.2 Solution of LTI State Equations

Consider the linear time-invariant (LTI) state-space equation

    ẋ(t) = Ax(t) + Bu(t)      (4.2)
    y(t) = Cx(t) + Du(t)      (4.3)

where A, B, C, and D are, respectively, n x n, n x p, q x n, and q x p constant matrices. The problem is to find the solution excited by the initial state x(0) and the input u(t). The solution hinges on the exponential function of A studied in Section 3.6. In particular, we need the property in (3.55) or

    (d/dt) e^{At} = Ae^{At} = e^{At}A

to develop the solution. Premultiplying e^{-At} on both sides of (4.2) yields

    e^{-At}ẋ(t) - e^{-At}Ax(t) = e^{-At}Bu(t)

which implies

    (d/dt)[e^{-At}x(t)] = e^{-At}Bu(t)

Its integration from 0 to t yields

    e^{-Aτ}x(τ)|_{τ=0}^{t} = ∫_0^t e^{-Aτ}Bu(τ) dτ

Thus we have

    e^{-At}x(t) - e^0 x(0) = ∫_0^t e^{-Aτ}Bu(τ) dτ      (4.4)

Because the inverse of e^{-At} is e^{At} and e^0 = I as discussed in (3.54) and (3.52), (4.4) implies

    x(t) = e^{At}x(0) + ∫_0^t e^{A(t-τ)}Bu(τ) dτ      (4.5)

This is the solution of (4.2). It is instructive to verify that (4.5) is the solution of (4.2). To verify this, we must show that (4.5) satisfies (4.2) and the initial condition x(t) = x(0) at t = 0. Indeed, at t = 0, (4.5) reduces to

    x(0) = e^{A·0}x(0) = x(0)
Thus (4.5) satisfies the initial condition. We need the equation

    (d/dt) ∫_{t0}^{t} f(t, τ) dτ = ∫_{t0}^{t} (∂f(t, τ)/∂t) dτ + f(t, τ)|_{τ=t}      (4.6)

to show that (4.5) satisfies (4.2). Differentiating (4.5) and using (4.6), we obtain

    ẋ(t) = (d/dt) [e^{At}x(0) + ∫_0^t e^{A(t-τ)}Bu(τ) dτ]
         = Ae^{At}x(0) + ∫_0^t Ae^{A(t-τ)}Bu(τ) dτ + e^{A(t-τ)}Bu(τ)|_{τ=t}
         = A (e^{At}x(0) + ∫_0^t e^{A(t-τ)}Bu(τ) dτ) + e^{A·0}Bu(t)

which becomes, after substituting (4.5),

    ẋ(t) = Ax(t) + Bu(t)

Thus (4.5) meets (4.2) and the initial condition x(0) and is the solution of (4.2). Substituting (4.5) into (4.3) yields the solution of (4.3) as

    y(t) = Ce^{At}x(0) + C ∫_0^t e^{A(t-τ)}Bu(τ) dτ + Du(t)      (4.7)

This solution and (4.5) are computed directly in the time domain. We can also compute the solutions by using the Laplace transform. Applying the Laplace transform to (4.2) and (4.3) yields, as derived in (2.14) and (2.15),

    x̂(s) = (sI - A)^{-1}[x(0) + Bû(s)]
    ŷ(s) = C(sI - A)^{-1}[x(0) + Bû(s)] + Dû(s)

Once x̂(s) and ŷ(s) are computed algebraically, their inverse Laplace transforms yield the time-domain solutions.

We now give some remarks concerning the computation of e^{At}. We discussed in Section 3.6 three methods of computing functions of a matrix. They can all be used to compute e^{At}:

1. Using Theorem 3.5: First, compute the eigenvalues of A; next, find a polynomial h(λ) of degree n - 1 that equals e^{λt} on the spectrum of A; then e^{At} = h(A).
2. Using Jordan form of A: Let A = QÂQ^{-1}, where Â is in Jordan form; then e^{At} = Qe^{Ât}Q^{-1}, and e^{Ât} can readily be obtained by using (3.48).
3. Using the infinite power series in (3.51): Although the series will not, in general, yield a closed-form solution, it is suitable for computer computation, as discussed following (3.51).

In addition, we can use (3.58) to compute e^{At}, that is,

    e^{At} = L^{-1}(sI - A)^{-1}      (4.8)

The inverse of (sI - A) is a function of A; therefore, again, we have many methods to compute it:

1. Taking the inverse of (sI - A).
2. Using Theorem 3.5.
3. Using (sI - A)^{-1} = Q(sI - Â)^{-1}Q^{-1} and (3.49).
4. Using the infinite power series in (3.57).
5. Using the Leverrier algorithm discussed in Problem 3.26.

EXAMPLE 4.1  We use Methods 1 and 2 to compute (sI - A)^{-1}, where

    A = [0 -1; 1 -2]

Method 1: We use (3.20) to compute

    (sI - A)^{-1} = [s 1; -1 s+2]^{-1} = (1/(s^2 + 2s + 1)) [s+2 -1; 1 s]
                  = [(s+2)/(s+1)^2  -1/(s+1)^2; 1/(s+1)^2  s/(s+1)^2]

Method 2: The eigenvalues of A are -1, -1. Let h(λ) = β0 + β1 λ. If h(λ) equals f(λ) := (s - λ)^{-1} on the spectrum of A, then

    f(-1) = h(-1):    (s+1)^{-1} = β0 - β1
    f'(-1) = h'(-1):  (s+1)^{-2} = β1

Thus we have β0 = (s+1)^{-1} + (s+1)^{-2}, β1 = (s+1)^{-2}, and

    (sI - A)^{-1} = h(A) = [(s+1)^{-1} + (s+1)^{-2}]I + (s+1)^{-2}A
                  = [(s+2)/(s+1)^2  -1/(s+1)^2; 1/(s+1)^2  s/(s+1)^2]

EXAMPLE 4.2  Consider the equation

    ẋ(t) = [0 -1; 1 -2] x(t) + [0; 1] u(t)

Its solution is

    x(t) = e^{At}x(0) + ∫_0^t e^{A(t-τ)}Bu(τ) dτ

The matrix function e^{At} is the inverse Laplace transform of (sI - A)^{-1}, which was computed in the preceding example. Thus we have
    e^{At} = [(1+t)e^{-t}  -te^{-t}; te^{-t}  (1-t)e^{-t}]

and

    x(t) = [(1+t)e^{-t}x1(0) - te^{-t}x2(0) - ∫_0^t (t-τ)e^{-(t-τ)}u(τ) dτ;
            te^{-t}x1(0) + (1-t)e^{-t}x2(0) + ∫_0^t [1-(t-τ)]e^{-(t-τ)}u(τ) dτ]

We discuss a general property of the zero-input response e^{At}x(0). Suppose A has eigenvalue λ1 with multiplicity 4 and eigenvalue λ2 with multiplicity 1, and suppose its Jordan form is the second matrix in (3.39). Then we have

    e^{At} = Q [e^{λ1 t}  t e^{λ1 t}  t^2 e^{λ1 t}/2  0  0;
                0  e^{λ1 t}  t e^{λ1 t}  0  0;
                0  0  e^{λ1 t}  0  0;
                0  0  0  e^{λ1 t}  0;
                0  0  0  0  e^{λ2 t}] Q^{-1}

Every entry of e^{At} and, consequently, of the zero-input response is a linear combination of the terms {e^{λ1 t}, t e^{λ1 t}, t^2 e^{λ1 t}, e^{λ2 t}}. These terms are dictated by the eigenvalues and their indices. In general, if A has eigenvalue λ1 with index n̄1, then every entry of e^{At} is a linear combination of

    {e^{λ1 t}, t e^{λ1 t}, ..., t^{n̄1 - 1} e^{λ1 t}}

Every such term is analytic in the sense that it is infinitely differentiable and can be expanded in a Taylor series at every t. This is a nice property and will be used in Chapter 6.

If every eigenvalue, simple or repeated, of A has a negative real part, then every zero-input response will approach zero as t → ∞. If A has an eigenvalue, simple or repeated, with a positive real part, then most zero-input responses will grow unbounded as t → ∞. If A has some eigenvalues with zero real part and all with index 1 and the remaining eigenvalues all have negative real parts, then no zero-input response will grow unbounded. However, if the index is 2 or higher, then some zero-input response may become unbounded. For example, if A has eigenvalue 0 with index 2, then e^{At} contains the terms {1, t}. If a zero-input response contains the term t, then it will grow unbounded.

4.2.1 Discretization

Consider the continuous-time state equation

    ẋ(t) = Ax(t) + Bu(t)      (4.9)
    y(t) = Cx(t) + Du(t)      (4.10)

If the set of equations is to be computed on a digital computer, it must be discretized. Because

    ẋ(t) = lim_{T→0} (x(t+T) - x(t))/T

we can approximate (4.9) as

    x(t+T) = x(t) + Ax(t)T + Bu(t)T      (4.11)

If we compute x(t) and y(t) only at t = kT for k = 0, 1, ..., then (4.11) and (4.10) become

    x((k+1)T) = (I + TA)x(kT) + TBu(kT)
    y(kT) = Cx(kT) + Du(kT)

This is a discrete-time state-space equation and can easily be computed on a digital computer. This discretization is the easiest to carry out but yields the least accurate results for the same T. We discuss next a different discretization.

If an input u(t) is generated by a digital-to-analog converter, then u(t) will be piecewise constant. This situation often arises in computer control of control systems. Let

    u(t) = u(kT) =: u[k]    for kT ≤ t < (k+1)T      (4.12)

for k = 0, 1, 2, .... This input changes values only at discrete-time instants. For this input, the solution of (4.9) still equals (4.5). Computing (4.5) at t = kT and t = (k+1)T yields

    x[k] := x(kT) = e^{AkT}x(0) + ∫_0^{kT} e^{A(kT-τ)}Bu(τ) dτ      (4.13)

and

    x[k+1] := x((k+1)T) = e^{A(k+1)T}x(0) + ∫_0^{(k+1)T} e^{A((k+1)T-τ)}Bu(τ) dτ      (4.14)

Equation (4.14) can be written as

    x[k+1] = e^{AT} [e^{AkT}x(0) + ∫_0^{kT} e^{A(kT-τ)}Bu(τ) dτ] + ∫_{kT}^{(k+1)T} e^{A(kT+T-τ)}Bu(τ) dτ

which becomes, after substituting (4.12) and (4.13) and introducing the new variable α := kT + T - τ,

    x[k+1] = e^{AT}x[k] + (∫_0^{T} e^{Aα} dα) B u[k]

Thus, if an input changes value only at discrete-time instants kT and if we compute only the responses at t = kT, then (4.9) and (4.10) become

    x[k+1] = A_d x[k] + B_d u[k]      (4.15)
    y[k] = C_d x[k] + D_d u[k]        (4.16)
with

    A_d = e^{AT}    B_d = (∫_0^T e^{Aτ} dτ) B    C_d = C    D_d = D      (4.17)

This is a discrete-time state-space equation. Note that there is no approximation involved in this derivation and (4.15) yields the exact solution of (4.9) at t = kT if the input is piecewise constant.

We discuss the computation of B_d. Using (3.51), we have

    ∫_0^T e^{Aτ} dτ = TI + (T^2/2!)A + (T^3/3!)A^2 + (T^4/4!)A^3 + ...      (4.18)

This power series can be computed recursively as in computing (3.51). If A is nonsingular, then the series can be written as, using (3.51),

    A^{-1} (TA + (T^2/2!)A^2 + (T^3/3!)A^3 + ... + I - I) = A^{-1}(e^{AT} - I)

Thus we have

    B_d = A^{-1}(e^{AT} - I)B    (if A is nonsingular)

Using this formula, we can avoid computing an infinite series. The MATLAB function [ad,bd]=c2d(a,b,T) transforms the continuous-time state equation in (4.9) into the discrete-time state equation in (4.15).

4.2.2 Solution of Discrete-Time Equations

Consider the discrete-time state-space equation

    x[k+1] = Ax[k] + Bu[k]
    y[k] = Cx[k] + Du[k]      (4.19)

where the subscript d has been dropped. It is understood that if the equation is obtained from a continuous-time equation, then the four matrices must be computed from (4.17). The two equations in (4.19) are algebraic equations. Once x[0] and u[k], k = 0, 1, ..., are given, the response can be computed recursively from the equations.

The MATLAB function dstep computes unit-step responses of discrete-time state-space equations. It also computes unit-step responses of discrete transfer functions; internally, it first transforms the transfer function into a discrete-time state-space equation by calling tf2ss, which will be discussed later, and then uses dstep. The function dlsim, an acronym for discrete linear simulation, computes responses excited by any input. The function step computes unit-step responses of continuous-time state-space equations. Internally, it first uses the function c2d to transform a continuous-time state equation into a discrete-time equation and then carries out the computation. If the function step is applied to a continuous-time transfer function, then it first uses tf2ss to transform the transfer function into a continuous-time state equation, discretizes it by using c2d, and then uses dstep to compute the response. Similar remarks apply to lsim, which computes responses of continuous-time state equations or transfer functions excited by any input.

In order to discuss the general behavior of discrete-time state equations, we will develop a general form of solutions. We compute

    x[1] = Ax[0] + Bu[0]
    x[2] = Ax[1] + Bu[1] = A^2 x[0] + ABu[0] + Bu[1]

Proceeding forward, we can readily obtain, for k > 0,

    x[k] = A^k x[0] + Σ_{m=0}^{k-1} A^{k-1-m} B u[m]      (4.20)
    y[k] = C A^k x[0] + Σ_{m=0}^{k-1} C A^{k-1-m} B u[m] + Du[k]      (4.21)

They are the discrete counterparts of (4.5) and (4.7). Their derivations are considerably simpler than the continuous-time case.

We discuss a general property of the zero-input response A^k x[0]. Suppose A has eigenvalue λ1 with multiplicity 4 and eigenvalue λ2 with multiplicity 1 and suppose its Jordan form is as shown in the second matrix in (3.39). In other words, λ1 has index 3 and λ2 has index 1. Then we have

    A^k = Q [λ1^k  kλ1^{k-1}  k(k-1)λ1^{k-2}/2  0  0;
             0  λ1^k  kλ1^{k-1}  0  0;
             0  0  λ1^k  0  0;
             0  0  0  λ1^k  0;
             0  0  0  0  λ2^k] Q^{-1}
which implies that every entry of the zeroinput response is a linear combination of (A~. k2 A~2. A;). These terms are dictated by the eigenvalues and their indices. If every eigenvalue, simple or repeated, of A has magnitude less than I, then every zeroinput response will approach zero as k > 00. If A has an eigenvalue, simple or repeated, with magnitude larger than I, then most zeroinput responses will grow unbounded as k > 00. If A has some eigenvalues with magnitude I and all with index I and the remaining eigenvalues all have magnitudes less than I, then no zeroinput response will grow unbounded. However, if the index is 2 or higher, then some zerostate response may become unbounded. For example, if A has eigenvalue I with index 2, then A k contains the terms (I, k). If a zeroinput response contains the term k, then it will grow unbounded as k > 00.
kA7',
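The exact discretization (4.17) and the recursion (4.19) can be sketched in a few lines of plain Python (an illustrative stand-in for the MATLAB calls c2d and dstep; the helper names `discretize` and `simulate` are ours, not the book's). The series for A_d and B_d is accumulated recursively as described above; for the nilpotent A used in the demonstration the series terminates, so the truncated sums are exact:

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def discretize(A, B, T, terms=30):
    """A_d = exp(A T) and B_d = (integral_0^T exp(A tau) dtau) B via the series (4.17)."""
    n = len(A)
    I = [[float(i == j) for j in range(n)] for i in range(n)]
    Ad = I
    S = [[T * I[i][j] for j in range(n)] for i in range(n)]   # running sum for the integral
    term = I                                                   # (A T)^k / k!
    for k in range(1, terms):
        term = [[v * T / k for v in row] for row in mat_mul(term, A)]
        Ad = mat_add(Ad, term)
        # next integral term: A^k T^(k+1) / (k+1)!
        S = mat_add(S, [[v * T / (k + 1) for v in row] for row in term])
    return Ad, mat_mul(S, B)

def simulate(Ad, Bd, x0, us):
    """Recursion (4.19): x[k+1] = Ad x[k] + Bd u[k]; returns the state trajectory."""
    xs = [list(x0)]
    for u in us:
        x = xs[-1]
        xs.append([sum(Ad[i][j] * x[j] for j in range(len(x))) + Bd[i][0] * u
                   for i in range(len(x))])
    return xs

# Double integrator: A is nilpotent, so the series is exact and
# Ad = I + A T = [[1, 0.1], [0, 1]],  Bd = (T I + (T^2/2) A) B = [[0.005], [0.1]].
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
T = 0.1
Ad, Bd = discretize(A, B, T)
xs = simulate(Ad, Bd, [0.0, 0.0], [1.0, 1.0])   # two unit-input steps
```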
4.3 Equivalent State Equations

The example that follows provides a motivation for studying equivalent state equations.

EXAMPLE 4.3 Consider the network shown in Fig. 4.1. It consists of one capacitor, one inductor, one resistor, and one voltage source. First we select the inductor current x₁ and capacitor voltage x₂ as state variables as shown. The voltage across the inductor is ẋ₁ and the current through the capacitor is ẋ₂. The voltage across the resistor is x₂; thus its current is x₂/1 = x₂. Clearly we have x₁ = x₂ + ẋ₂ and ẋ₁ + x₂ − u = 0. Thus the network is described by the following state equation:

$$\dot x=\begin{bmatrix}0&-1\\1&-1\end{bmatrix}x+\begin{bmatrix}1\\0\end{bmatrix}u\qquad y=\begin{bmatrix}0&1\end{bmatrix}x \qquad (4.22)$$

If, instead, the loop currents x̄₁ and x̄₂ are chosen as state variables as shown, then the voltage across the inductor is ẋ̄₁ and the voltage across the resistor is (x̄₁ − x̄₂)·1. From the left-hand-side loop, we have

$$u=\dot{\bar x}_1+(\bar x_1-\bar x_2)\qquad\text{or}\qquad \dot{\bar x}_1=-\bar x_1+\bar x_2+u$$

The voltage across the capacitor is the same as the one across the resistor, which is x̄₁ − x̄₂. Thus the current through the capacitor is ẋ̄₁ − ẋ̄₂, which equals x̄₂, or

$$\dot{\bar x}_2=\dot{\bar x}_1-\bar x_2=-\bar x_1+u$$

Thus the network is also described by the state equation

$$\dot{\bar x}=\begin{bmatrix}-1&1\\-1&0\end{bmatrix}\bar x+\begin{bmatrix}1\\1\end{bmatrix}u\qquad y=\begin{bmatrix}1&-1\end{bmatrix}\bar x \qquad (4.23)$$

Figure 4.1 Network with two different sets of state variables.

The state equations in (4.22) and (4.23) describe the same network; therefore they must be closely related. In fact, they are equivalent, as will be established shortly.

Consider the n-dimensional state equation

$$\dot x(t)=Ax(t)+Bu(t)$$
$$y(t)=Cx(t)+Du(t) \qquad (4.24)$$

where A is an n × n constant matrix mapping an n-dimensional real space Rⁿ into itself. The state x is a vector in Rⁿ for all t; thus the real space is also called the state space. The state equation in (4.24) can be considered to be associated with the orthonormal basis in (3.8). Now we study the effect on the equation of choosing a different basis.

Definition 4.1 Let P be an n × n real nonsingular matrix and let x̄ = Px. Then the state equation

$$\dot{\bar x}(t)=\bar A\bar x(t)+\bar Bu(t)$$
$$\bar y(t)=\bar C\bar x(t)+\bar Du(t) \qquad (4.25)$$

where

$$\bar A=PAP^{-1}\qquad \bar B=PB\qquad \bar C=CP^{-1}\qquad \bar D=D \qquad (4.26)$$

is said to be (algebraically) equivalent to (4.24) and x̄ = Px is called an equivalence transformation.

Equation (4.26) is obtained from (4.24) by substituting x(t) = P⁻¹x̄(t) and ẋ(t) = P⁻¹ẋ̄(t). In this substitution, we have changed, as in Equation (3.7), the basis vectors of the state space from the orthonormal basis to the columns of P⁻¹ =: Q. Clearly A and Ā are similar and Ā is simply a different representation of A. To be precise, let

$$Q=P^{-1}=[q_1\;q_2\;\cdots\;q_n]$$

Then the ith column of Ā is, as discussed in Section 3.4, the representation of Aqᵢ with respect to the basis {q₁, q₂, ..., qₙ}. From the equation B̄ = PB or B = P⁻¹B̄ = [q₁ q₂ ··· qₙ]B̄, we see that the ith column of B̄ is the representation of the ith column of B with respect to the basis {q₁, q₂, ..., qₙ}. The matrix C̄ is to be computed from CP⁻¹. The matrix D, called the direct transmission part between the input and output, has nothing to do with the state space and is not affected by the equivalence transformation.

We show that (4.24) and (4.25) have the same set of eigenvalues and the same transfer matrix. Indeed, we have, using det(P) det(P⁻¹) = 1,

$$\bar\Delta(\lambda)=\det(\lambda I-\bar A)=\det(\lambda PP^{-1}-PAP^{-1})=\det[P(\lambda I-A)P^{-1}]=\det(P)\det(\lambda I-A)\det(P^{-1})=\Delta(\lambda)$$

and

$$\bar G(s)=\bar C(sI-\bar A)^{-1}\bar B+\bar D=CP^{-1}[P(sI-A)P^{-1}]^{-1}PB+D=CP^{-1}P(sI-A)^{-1}P^{-1}PB+D=C(sI-A)^{-1}B+D=G(s)$$

Thus equivalent state equations have the same characteristic polynomial and, consequently, the same set of eigenvalues and same transfer matrix. In fact, all properties of (4.24) are preserved or invariant under any equivalence transformation.

Consider again the network shown in Fig. 4.1, which can be described by (4.22) and (4.23). We show that the two equations are equivalent. From Fig. 4.1, we have x̄₁ = x₁. Because the voltage across the resistor is x₂, its current is x₂/1 and equals x̄₁ − x̄₂. Thus we have

$$\begin{bmatrix}\bar x_1\\\bar x_2\end{bmatrix}=\begin{bmatrix}1&0\\1&-1\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}=:P\begin{bmatrix}x_1\\x_2\end{bmatrix} \qquad (4.27)$$

Note that, for this P, its inverse happens to equal itself. It is straightforward to verify that (4.22) and (4.23) are related by the equivalence transformation in (4.26).

The MATLAB function [ab,bb,cb,db]=ss2ss(a,b,c,d,p) carries out equivalence transformations.

Two state equations are said to be zero-state equivalent if they have the same transfer matrix or

$$D+C(sI-A)^{-1}B=\bar D+\bar C(sI-\bar A)^{-1}\bar B$$

This becomes, after substituting (3.57),

$$D+CBs^{-1}+CABs^{-2}+CA^2Bs^{-3}+\cdots=\bar D+\bar C\bar Bs^{-1}+\bar C\bar A\bar Bs^{-2}+\bar C\bar A^2\bar Bs^{-3}+\cdots$$

Thus we have the theorem that follows.

Theorem 4.1 Two linear time-invariant state equations (A, B, C, D) and (Ā, B̄, C̄, D̄) are zero-state equivalent or have the same transfer matrix if and only if D = D̄ and

$$CA^mB=\bar C\bar A^m\bar B\qquad m=0,1,2,\ldots$$

It is clear that (algebraic) equivalence implies zero-state equivalence. In order for two state equations to be equivalent, they must have the same dimension. This is, however, not the case for zero-state equivalence, as the next example shows.

EXAMPLE 4.4 Consider the two networks shown in Fig. 4.2. The capacitor is assumed to have capacitance −1 F. Such a negative capacitance can be realized using an op-amp circuit. For the circuit in Fig. 4.2(a), we have y(t) = 0.5·u(t) or ŷ(s) = 0.5û(s). Thus its transfer function is 0.5. To compute the transfer function of the network in Fig. 4.2(b), we may assume the initial voltage across the capacitor to be zero. Because of the symmetry of the four resistors, half of the current will go through each resistor or i(t) = 0.5u(t), where i(t) denotes the right upper resistor's current. Consequently, y(t) = i(t)·1 = 0.5u(t) and the transfer function also equals 0.5. Thus the two networks, or more precisely their state equations, are zero-state equivalent.

This fact can also be verified by using Theorem 4.1. The network in Fig. 4.2(a) is described by the zero-dimensional state equation y(t) = 0.5u(t) or A = B = C = 0 and D = 0.5. To develop a state equation for the network in Fig. 4.2(b), we assign the capacitor voltage as state variable x with polarity shown. Its current is ẋ, flowing from the negative to positive polarity because of the negative capacitance. If we assign the right upper resistor's current as i(t), then the right lower resistor's current is i − ẋ, the left upper resistor's current is u − i, and the left lower resistor's current is u − i + ẋ. The total voltage around the upper right-hand loop is 0:

$$x-i+(u-i)=0\qquad\text{or}\qquad i=0.5(x+u)$$

which implies

$$y=1\cdot i=i=0.5(x+u)$$

The total voltage around the lower right-hand loop is 0:

$$x+(i-\dot x)-(u-i+\dot x)=0$$

which implies

$$2\dot x=2i+x-u=x+u+x-u=2x$$

Thus the network in Fig. 4.2(b) is described by the one-dimensional state equation

$$\dot x(t)=x(t)$$
$$y(t)=0.5x(t)+0.5u(t)$$

with Ā = 1, B̄ = 0, C̄ = 0.5, and D̄ = 0.5. We see that D = D̄ = 0.5 and CA^mB = C̄Ā^mB̄ = 0 for m = 0, 1, .... Thus the two equations are zero-state equivalent.

Figure 4.2 Two zero-state equivalent networks.
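The claimed equivalence of (4.22) and (4.23) can be verified numerically. The sketch below (plain Python, not from the text; the helper names are ours) applies the formulas in (4.26) with the P of (4.27), which is its own inverse, and also compares the Markov parameters CA^mB of Theorem 4.1:

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# (4.22) and the transformation (4.27); this P satisfies P*P = I.
P = [[1.0, 0.0], [1.0, -1.0]]
Pinv = P
A = [[0.0, -1.0], [1.0, -1.0]]
B = [[1.0], [0.0]]
C = [[0.0, 1.0]]

Abar = mat_mul(mat_mul(P, A), Pinv)   # expect the A-matrix of (4.23)
Bbar = mat_mul(P, B)                  # expect [[1], [1]]
Cbar = mat_mul(C, Pinv)               # expect [[1, -1]]

def markov(A_, B_, C_, m):
    """Markov parameter C A^m B (scalar here, single input / single output)."""
    M = B_
    for _ in range(m):
        M = mat_mul(A_, M)
    return mat_mul(C_, M)[0][0]

pairs = [(markov(A, B, C, m), markov(Abar, Bbar, Cbar, m)) for m in range(4)]
```

Algebraic equivalence implies the Markov parameters agree for every m, consistent with Theorem 4.1.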
4.3.1 Canonical Forms

MATLAB contains the function [ab,bb,cb,db,P]=canon(a,b,c,d,'type'). If type=companion, the function will generate an equivalent state equation with Ā in the companion form in (3.24). This function works only if Q̃ := [b₁ Ab₁ ··· A^{n−1}b₁] is nonsingular, where b₁ is the first column of B. This condition is the same as (A, b₁) controllable, as we will discuss in Chapter 6. The P that the function canon generates equals Q̃⁻¹. See the discussion in Section 3.4.

We discuss a different canonical form. Suppose A has two real eigenvalues and two complex eigenvalues. Because A has only real coefficients, the two complex eigenvalues must be complex conjugate. Let λ₁, λ₂, α + jβ, and α − jβ be the eigenvalues and q₁, q₂, q₃, and q₄ be the corresponding eigenvectors, where λ₁, λ₂, α, β, q₁, and q₂ are all real and q₄ equals the complex conjugate of q₃. Define Q = [q₁ q₂ q₃ q₄]. Then we have

$$J=Q^{-1}AQ=\begin{bmatrix}\lambda_1&0&0&0\\0&\lambda_2&0&0\\0&0&\alpha+j\beta&0\\0&0&0&\alpha-j\beta\end{bmatrix}$$

Note that Q and J can be obtained from [q,j]=eig(a) in MATLAB as shown in Examples 3.5 and 3.6. This form is useless in practice but can be transformed into a real matrix by the following similarity transformation:

$$\bar Q^{-1}J\bar Q:=\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&1&1\\0&0&j&-j\end{bmatrix}J\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&0.5&-0.5j\\0&0&0.5&0.5j\end{bmatrix}=\begin{bmatrix}\lambda_1&0&0&0\\0&\lambda_2&0&0\\0&0&\alpha&\beta\\0&0&-\beta&\alpha\end{bmatrix}$$

We see that this transformation transforms the complex eigenvalues on the diagonal into a block with the real part of the eigenvalues on the diagonal and the imaginary part on the off-diagonal. This new A-matrix is said to be in modal form. The MATLAB function [ab,bb,cb,db,P]=canon(a,b,c,d,'modal'), or canon(a,b,c,d) with no type specified, will yield an equivalent state equation with Ā in modal form.

Note that there is no need to transform A into a diagonal form and then to a modal form. The two transformations can be combined into one as

$$P^{-1}=Q\bar Q=[q_1\;q_2\;q_3\;q_4]\begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&0.5&-0.5j\\0&0&0.5&0.5j\end{bmatrix}=[q_1\;q_2\;\mathrm{Re}(q_3)\;\mathrm{Im}(q_3)]$$

where Re and Im stand, respectively, for the real part and imaginary part and we have used in the last equality the fact that q₄ is the complex conjugate of q₃.

We give one more example. The modal form of a matrix with real eigenvalue λ₁ and two pairs of distinct complex conjugate eigenvalues αᵢ ± jβᵢ, for i = 1, 2, is

$$\bar A=\begin{bmatrix}\lambda_1&0&0&0&0\\0&\alpha_1&\beta_1&0&0\\0&-\beta_1&\alpha_1&0&0\\0&0&0&\alpha_2&\beta_2\\0&0&0&-\beta_2&\alpha_2\end{bmatrix} \qquad (4.28)$$

It is block diagonal and can be obtained by the similarity transformation

$$P^{-1}=[q_1\;\mathrm{Re}(q_2)\;\mathrm{Im}(q_2)\;\mathrm{Re}(q_4)\;\mathrm{Im}(q_4)]$$

where q₁, q₂, and q₄ are, respectively, eigenvectors associated with λ₁, α₁ + jβ₁, and α₂ + jβ₂. This form is useful in state-space design.

4.3.2 Magnitude Scaling in Op-Amp Circuits¹

As discussed in Section 2.3.1, every LTI state equation can be implemented using an op-amp circuit. In actual op-amp circuits, all signals are limited by power supplies. If we use ±15-volt power supplies, then all signals are roughly limited to ±13 volts. If any signal goes outside the range, the circuit will saturate and will not behave as the state equation dictates. Therefore saturation is an important issue in actual op-amp circuit implementation.

Consider an LTI state equation and suppose all signals must be limited to ±M. For linear systems, if the input magnitude increases by α, so do the magnitudes of all state variables and the output. Thus there must be a limit on input magnitude. Clearly it is desirable to have the admissible input magnitude as large as possible. One way to achieve this is to use an equivalence transformation so that

$$|\bar x_i(t)|_{\max}\approx|y(t)|_{\max}$$

for all i and for all t. The equivalence transformation, however, will not alter the relationship between the input and output; therefore we can use the original state equation to find the input range to achieve |y(t)| ≤ M. In addition, we can use the same transformation to amplify some state variables to increase visibility or accuracy. This is illustrated in the next example.

1. This subsection may be skipped without loss of continuity.

EXAMPLE 4.5 Consider the state equation

$$\dot x=\begin{bmatrix}-0.1&2\\0&-1\end{bmatrix}x+\begin{bmatrix}10\\0.1\end{bmatrix}u\qquad y=\begin{bmatrix}0.2&-1\end{bmatrix}x$$

Suppose the input is a step function of various magnitude and the equation is to be implemented using an op-amp circuit in which all signals must be limited to ±10. First we use MATLAB to find its unit-step response. We type

    a=[-0.1 2;0 -1];b=[10;0.1];c=[0.2 -1];d=0;
    [y,x,t]=step(a,b,c,d);
    plot(t,y,t,x)

which yields the plot in Fig. 4.3(a). We see that |x₁|max = 100 > |y|max = 20 and |x₂| ≪ |y|max. The state variable x₂ is hardly visible and its largest magnitude is found to be 0.1 by plotting it separately (not shown). From the plot, we see that if |u(t)| ≤ 0.5, then the output will not saturate but x₁(t) will. Let us introduce new state variables as

$$\bar x_1=\frac{20}{100}x_1=0.2x_1\qquad\text{and}\qquad \bar x_2=\frac{20}{0.1}x_2=200x_2$$

With this transformation, the maximum magnitudes of x̄₁(t) and x̄₂(t) will equal |y|max. Thus if y(t) does not saturate, neither will the state variables x̄ᵢ. The transformation can be expressed as x̄ = Px with

$$P=\begin{bmatrix}0.2&0\\0&200\end{bmatrix}\qquad P^{-1}=\begin{bmatrix}5&0\\0&0.005\end{bmatrix}$$

Then its equivalent state equation can readily be computed from (4.26) as

$$\dot{\bar x}=\begin{bmatrix}-0.1&0.002\\0&-1\end{bmatrix}\bar x+\begin{bmatrix}2\\20\end{bmatrix}u\qquad y=\begin{bmatrix}1&-0.005\end{bmatrix}\bar x$$
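The scaling computation of this example reduces to multiplying by diagonal matrices. A minimal sketch in plain Python (ours, not the book's; an illustrative stand-in for the MATLAB call ss2ss) applies (4.26) with P = diag(0.2, 200) and recovers the scaled matrices quoted above:

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Example 4.5 data and the diagonal scaling xbar1 = 0.2 x1, xbar2 = 200 x2.
A = [[-0.1, 2.0], [0.0, -1.0]]
b = [[10.0], [0.1]]
c = [[0.2, -1.0]]
P    = [[0.2, 0.0], [0.0, 200.0]]
Pinv = [[5.0, 0.0], [0.0, 0.005]]

Abar = mat_mul(mat_mul(P, A), Pinv)   # expect [[-0.1, 0.002], [0, -1]]
bbar = mat_mul(P, b)                  # expect [[2], [20]]
cbar = mat_mul(c, Pinv)               # expect [[1, -0.005]]
```

Because P is diagonal, each entry aᵢⱼ is simply rescaled by pᵢ/pⱼ, which is why the well-conditioned choice of scale factors leaves the diagonal of A untouched.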
Figure 4.3 Time responses: (a) of the original state equation; (b) after magnitude scaling.

Its step responses due to u(t) = 0.5 are plotted in Fig. 4.3(b). We see that all signals lie inside the range ±10 and occupy the full scale. Thus the equivalent state equation is better for op-amp circuit implementation or simulation.

The magnitude scaling is important in using op-amp circuits to implement or simulate continuous-time systems. Although we discuss only step inputs, the idea is applicable to any input. We mention that analog computers are essentially op-amp circuits. Before the advent of digital computers, magnitude scaling in analog computer simulation was carried out by trial and error. With the help of digital computer simulation, the magnitude scaling can now be carried out easily.

4.4 Realizations

Every linear time-invariant (LTI) system can be described by the input–output description

$$\hat y(s)=\hat G(s)\hat u(s)$$

and, if the system is lumped as well, by the state-space equation description

$$\dot x(t)=Ax(t)+Bu(t)$$
$$y(t)=Cx(t)+Du(t) \qquad (4.29)$$

If the state equation is known, the transfer matrix can be computed as Ĝ(s) = C(sI − A)⁻¹B + D. The computed transfer matrix is unique. Now we study the converse problem, that is, to find a state-space equation from a given transfer matrix. This is called the realization problem. This terminology is justified by the fact that, by using the state equation, we can build an op-amp circuit for the transfer matrix.

A transfer matrix Ĝ(s) is said to be realizable if there exists a finite-dimensional state equation (4.29) or, simply, (A, B, C, D) such that

$$\hat G(s)=C(sI-A)^{-1}B+D$$

and (A, B, C, D) is called a realization of Ĝ(s). An LTI distributed system can be described by a transfer matrix, but not by a finite-dimensional state equation. Thus not every Ĝ(s) is realizable. If Ĝ(s) is realizable, then it has infinitely many realizations, not necessarily of the same dimension. Thus the realization problem is fairly complex. We study here only the realizability condition. The other issues will be studied in later chapters.

Theorem 4.2 A transfer matrix Ĝ(s) is realizable if and only if Ĝ(s) is a proper rational matrix.

Proof: We use (3.19) to write

$$\hat G_{sp}(s):=C(sI-A)^{-1}B=\frac{1}{\det(sI-A)}\,C[\mathrm{Adj}(sI-A)]B \qquad (4.30)$$

If A is n × n, then det(sI − A) has degree n. Every entry of Adj(sI − A) is the determinant of an (n − 1) × (n − 1) submatrix of (sI − A); thus it has at most degree n − 1. Their linear combinations again have at most degree n − 1. Thus we conclude that C(sI − A)⁻¹B is a strictly proper rational matrix. If D is a nonzero matrix, then C(sI − A)⁻¹B + D is proper. This shows that if Ĝ(s) is realizable, then it is a proper rational matrix. Note that we have Ĝ(∞) = D.

Next we show the converse; that is, if Ĝ(s) is a q × p proper rational matrix, then there exists a realization. First we decompose Ĝ(s) as

$$\hat G(s)=\hat G(\infty)+\hat G_{sp}(s) \qquad (4.31)$$

where Ĝ_sp is the strictly proper part of Ĝ(s). Let

$$d(s)=s^r+\alpha_1s^{r-1}+\cdots+\alpha_{r-1}s+\alpha_r \qquad (4.32)$$

be the least common denominator of all entries of Ĝ_sp(s). Here we require d(s) to be monic; that is, its leading coefficient is 1. Then Ĝ_sp(s) can be expressed as

$$\hat G_{sp}(s)=\frac{1}{d(s)}[N(s)]=\frac{1}{d(s)}\bigl[N_1s^{r-1}+N_2s^{r-2}+\cdots+N_{r-1}s+N_r\bigr] \qquad (4.33)$$

where the Nᵢ are q × p constant matrices. Now we claim that the set of equations

$$\dot x=\begin{bmatrix}-\alpha_1I_p&-\alpha_2I_p&\cdots&-\alpha_{r-1}I_p&-\alpha_rI_p\\ I_p&0&\cdots&0&0\\ 0&I_p&\cdots&0&0\\ \vdots&&\ddots&&\vdots\\ 0&0&\cdots&I_p&0\end{bmatrix}x+\begin{bmatrix}I_p\\0\\0\\\vdots\\0\end{bmatrix}u$$
$$y=[N_1\;N_2\;\cdots\;N_r]\,x+\hat G(\infty)u \qquad (4.34)$$
is a realization of Ĝ(s). The matrix I_p is the p × p unit matrix and every 0 is a p × p zero matrix. The A-matrix is said to be in block companion form; it consists of r rows and r columns of p × p matrices; thus the A-matrix has order rp × rp. The B-matrix has order rp × p. Because the C-matrix consists of r number of Nᵢ, each of order q × p, the C-matrix has order q × rp. The realization has dimension rp and is said to be in controllable canonical form.

We show that (4.34) is a realization of Ĝ(s) in (4.31) and (4.33). Let us define

$$Z:=\begin{bmatrix}Z_1\\Z_2\\\vdots\\Z_r\end{bmatrix}:=(sI-A)^{-1}B \qquad (4.35)$$

where Zᵢ is p × p and Z is rp × p. Then the transfer matrix of (4.34) equals

$$C(sI-A)^{-1}B+\hat G(\infty)=N_1Z_1+N_2Z_2+\cdots+N_rZ_r+\hat G(\infty) \qquad (4.36)$$

We write (4.35) as (sI − A)Z = B or

$$sZ=AZ+B \qquad (4.37)$$

Using the shifting property of the companion form of A, from the second to the last block of equations in (4.37), we can readily obtain

$$sZ_2=Z_1,\quad sZ_3=Z_2,\quad\ldots,\quad sZ_r=Z_{r-1}$$

which implies

$$Z_2=\frac{Z_1}{s},\quad Z_3=\frac{Z_1}{s^2},\quad\ldots,\quad Z_r=\frac{Z_1}{s^{r-1}}$$

Substituting these into the first block of equations in (4.37) yields

$$sZ_1=-\alpha_1Z_1-\alpha_2Z_2-\cdots-\alpha_rZ_r+I_p=-\Bigl(\alpha_1+\frac{\alpha_2}{s}+\cdots+\frac{\alpha_r}{s^{r-1}}\Bigr)Z_1+I_p$$

or, using (4.32),

$$\Bigl(s+\alpha_1+\frac{\alpha_2}{s}+\cdots+\frac{\alpha_r}{s^{r-1}}\Bigr)Z_1=\frac{d(s)}{s^{r-1}}Z_1=I_p$$

Thus we have Z₁ = (s^{r−1}/d(s)) I_p and, more generally, Zᵢ = (s^{r−i}/d(s)) I_p. Substituting these into (4.36) yields

$$C(sI-A)^{-1}B+\hat G(\infty)=\frac{1}{d(s)}\bigl[N_1s^{r-1}+N_2s^{r-2}+\cdots+N_r\bigr]+\hat G(\infty)$$

This equals Ĝ(s) in (4.31) and (4.33). This shows that (4.34) is a realization of Ĝ(s).

EXAMPLE 4.6 Consider the proper rational matrix

$$\hat G(s)=\begin{bmatrix}\dfrac{4s-10}{2s+1}&\dfrac{3}{s+2}\\[2mm]\dfrac{1}{(2s+1)(s+2)}&\dfrac{s+1}{(s+2)^2}\end{bmatrix}=\begin{bmatrix}2&0\\0&0\end{bmatrix}+\begin{bmatrix}\dfrac{-12}{2s+1}&\dfrac{3}{s+2}\\[2mm]\dfrac{1}{(2s+1)(s+2)}&\dfrac{s+1}{(s+2)^2}\end{bmatrix} \qquad (4.38)$$

where we have decomposed Ĝ(s) into the sum of a constant matrix and a strictly proper rational matrix Ĝ_sp(s). The monic least common denominator of Ĝ_sp(s) is d(s) = (s + 0.5)(s + 2)² = s³ + 4.5s² + 6s + 2. Thus we have

$$\hat G_{sp}(s)=\frac{1}{d(s)}\begin{bmatrix}-6(s+2)^2&3(s+2)(s+0.5)\\0.5(s+2)&(s+1)(s+0.5)\end{bmatrix}=\frac{1}{d(s)}\left(\begin{bmatrix}-6&3\\0&1\end{bmatrix}s^2+\begin{bmatrix}-24&7.5\\0.5&1.5\end{bmatrix}s+\begin{bmatrix}-24&3\\1&0.5\end{bmatrix}\right)$$

and a realization of (4.38) is

$$\dot x=\begin{bmatrix}-4.5&0&-6&0&-2&0\\0&-4.5&0&-6&0&-2\\1&0&0&0&0&0\\0&1&0&0&0&0\\0&0&1&0&0&0\\0&0&0&1&0&0\end{bmatrix}x+\begin{bmatrix}1&0\\0&1\\0&0\\0&0\\0&0\\0&0\end{bmatrix}u$$
$$y=\begin{bmatrix}-6&3&-24&7.5&-24&3\\0&1&0.5&1.5&1&0.5\end{bmatrix}x+\begin{bmatrix}2&0\\0&0\end{bmatrix}u \qquad (4.39)$$

This is a six-dimensional realization.

We discuss a special case of (4.31) and (4.34) in which p = 1. To save space, we assume r = 4 and q = 2. However, the discussion applies to any positive integers r and q. Consider the 2 × 1 proper rational matrix

$$\hat G(s)=\begin{bmatrix}d_1\\d_2\end{bmatrix}+\frac{1}{s^4+\alpha_1s^3+\alpha_2s^2+\alpha_3s+\alpha_4}\begin{bmatrix}\beta_{11}s^3+\beta_{12}s^2+\beta_{13}s+\beta_{14}\\\beta_{21}s^3+\beta_{22}s^2+\beta_{23}s+\beta_{24}\end{bmatrix} \qquad (4.40)$$
Then its realization can be obtained directly from (4.34) as

$$\dot x=\begin{bmatrix}-\alpha_1&-\alpha_2&-\alpha_3&-\alpha_4\\1&0&0&0\\0&1&0&0\\0&0&1&0\end{bmatrix}x+\begin{bmatrix}1\\0\\0\\0\end{bmatrix}u$$
$$y=\begin{bmatrix}\beta_{11}&\beta_{12}&\beta_{13}&\beta_{14}\\\beta_{21}&\beta_{22}&\beta_{23}&\beta_{24}\end{bmatrix}x+\begin{bmatrix}d_1\\d_2\end{bmatrix}u \qquad (4.41)$$

This controllable-canonical-form realization can be read out from the coefficients of Ĝ(s) in (4.40).

There are many ways to realize a proper transfer matrix. For example, Problem 4.9 gives a different realization of (4.33) with dimension rq. Let Ĝ_ci(s) be the ith column of Ĝ(s) and let uᵢ be the ith component of the input vector u. Then ŷ(s) = Ĝ(s)û(s) can be expressed as

$$\hat y(s)=\hat G_{c1}(s)\hat u_1(s)+\hat G_{c2}(s)\hat u_2(s)+\cdots=:\hat y_{c1}(s)+\hat y_{c2}(s)+\cdots$$

as shown in Fig. 4.4(a). Thus we can realize each column of Ĝ(s) and then combine them to yield a realization of Ĝ(s). Let Ĝ_ri(s) be the ith row of Ĝ(s) and let yᵢ be the ith component of the output vector y. Then ŷ(s) = Ĝ(s)û(s) can be expressed as

$$\hat y_i(s)=\hat G_{ri}(s)\hat u(s)$$

as shown in Fig. 4.4(b). Thus we can realize each row of Ĝ(s) and then combine them to obtain a realization of Ĝ(s). Clearly we can also realize each entry of Ĝ(s) and then combine them to obtain a realization of Ĝ(s). See Reference [6, pp. 158–160].

Figure 4.4 Realizations of Ĝ(s) by columns and by rows.

The MATLAB function [a,b,c,d]=tf2ss(num,den) generates the controllable-canonical-form realization shown in (4.41) for any single-input multiple-output transfer matrix Ĝ(s). In its employment, there is no need to decompose Ĝ(s) as in (4.31). But we must compute its least common denominator, not necessarily monic. The next example will apply tf2ss to each column of Ĝ(s) in (4.38) and then combine them to form a realization of Ĝ(s).

EXAMPLE 4.7 Consider the transfer matrix in (4.38). Over the least common denominator 2s² + 5s + 2 of its first column, the two numerators are 4s² − 2s − 20 and 1. Typing

    n1=[4 -2 -20;0 0 1];d1=[2 5 2];
    [a,b,c,d]=tf2ss(n1,d1)

yields the following realization for the first column of Ĝ(s):

$$\dot x_1=A_1x_1+b_1u_1=\begin{bmatrix}-2.5&-1\\1&0\end{bmatrix}x_1+\begin{bmatrix}1\\0\end{bmatrix}u_1$$
$$y_{c1}=C_1x_1+d_1u_1=\begin{bmatrix}-6&-12\\0&0.5\end{bmatrix}x_1+\begin{bmatrix}2\\0\end{bmatrix}u_1 \qquad (4.42)$$

Similarly, the function tf2ss can generate the following realization for the second column of Ĝ(s):

$$\dot x_2=A_2x_2+b_2u_2=\begin{bmatrix}-4&-4\\1&0\end{bmatrix}x_2+\begin{bmatrix}1\\0\end{bmatrix}u_2$$
$$y_{c2}=C_2x_2+d_2u_2=\begin{bmatrix}3&6\\1&1\end{bmatrix}x_2+\begin{bmatrix}0\\0\end{bmatrix}u_2 \qquad (4.43)$$

These two realizations can be combined as

$$\begin{bmatrix}\dot x_1\\\dot x_2\end{bmatrix}=\begin{bmatrix}A_1&0\\0&A_2\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix}+\begin{bmatrix}b_1&0\\0&b_2\end{bmatrix}\begin{bmatrix}u_1\\u_2\end{bmatrix}$$
$$y=y_{c1}+y_{c2}=[C_1\;C_2]\,x+[d_1\;d_2]\,u$$

or

$$\dot x=\begin{bmatrix}-2.5&-1&0&0\\1&0&0&0\\0&0&-4&-4\\0&0&1&0\end{bmatrix}x+\begin{bmatrix}1&0\\0&0\\0&1\\0&0\end{bmatrix}u$$
$$y=\begin{bmatrix}-6&-12&3&6\\0&0.5&1&1\end{bmatrix}x+\begin{bmatrix}2&0\\0&0\end{bmatrix}u \qquad (4.44)$$

This is a different realization of the Ĝ(s) in (4.38). This realization has dimension 4, two less than the one in (4.39).

The two state equations in (4.39) and (4.44) are zero-state equivalent because they have the same transfer matrix. They are, however, not algebraically equivalent. More will be said
in Chapter 7 regarding realizations. We mention that all discussion in this section, including the function tf2ss, applies without any modification to the discrete-time case.

4.5 Solution of Linear Time-Varying (LTV) Equations

Consider the linear time-varying (LTV) state equation

$$\dot x(t)=A(t)x(t)+B(t)u(t) \qquad (4.45)$$
$$y(t)=C(t)x(t)+D(t)u(t) \qquad (4.46)$$

It is assumed that, for every initial state x(t₀) and any input u(t), the state equation has a unique solution. A sufficient condition for such an assumption is that every entry of A(t) is a continuous function of t. Before considering the general case, we first discuss the solutions of ẋ = A(t)x and the reasons why the approach taken in the time-invariant case cannot be used here.

The solution of the time-invariant equation ẋ = Ax can be extended from the scalar equation ẋ = ax. The solution of ẋ = ax is x(t) = e^{at}x(0) with d(e^{at})/dt = ae^{at} = e^{at}a. Similarly, the solution of ẋ = Ax is x(t) = e^{At}x(0) with

$$\frac{d}{dt}e^{At}=Ae^{At}=e^{At}A$$

where the commutative property is crucial. Note that, in general, we have AB ≠ BA and e^{(A+B)t} ≠ e^{At}e^{Bt}.

The solution of the scalar time-varying equation ẋ = a(t)x due to x(0) is

$$x(t)=e^{\int_0^t a(\tau)d\tau}x(0)\qquad\text{with}\qquad\frac{d}{dt}e^{\int_0^t a(\tau)d\tau}=a(t)\,e^{\int_0^t a(\tau)d\tau}$$

Extending this to the matrix case becomes

$$x(t)=e^{\int_0^t A(\tau)d\tau}x(0) \qquad (4.47)$$

with, using (3.51),

$$e^{\int_0^t A(\tau)d\tau}=I+\int_0^t A(\tau)d\tau+\frac12\Bigl(\int_0^t A(\tau)d\tau\Bigr)\Bigl(\int_0^t A(s)ds\Bigr)+\cdots$$

This extension, however, is not valid because

$$\frac{d}{dt}e^{\int_0^t A(\tau)d\tau}=A(t)+\frac12A(t)\Bigl(\int_0^t A(s)ds\Bigr)+\frac12\Bigl(\int_0^t A(\tau)d\tau\Bigr)A(t)+\cdots\;\ne\;A(t)\,e^{\int_0^t A(\tau)d\tau} \qquad (4.48)$$

Thus, in general, (4.47) is not a solution of ẋ = A(t)x. In conclusion, we cannot extend the scalar time-varying solution to the matrix case and must use a different approach to develop the solution.

Consider

$$\dot x=A(t)x \qquad (4.49)$$

where A is n × n with continuous functions of t as its entries. Then for every initial state xᵢ(t₀), there exists a unique solution xᵢ(t), for i = 1, 2, ..., n. We can arrange these n solutions as X = [x₁ x₂ ··· xₙ], a square matrix of order n. Because every xᵢ satisfies (4.49), we have

$$\dot X(t)=A(t)X(t) \qquad (4.50)$$

If X(t₀) is nonsingular or the n initial states are linearly independent, then X(t) is called a fundamental matrix of (4.49). Because the initial states can arbitrarily be chosen, as long as they are linearly independent, the fundamental matrix is not unique.

EXAMPLE 4.8 Consider the homogeneous equation

$$\dot x(t)=\begin{bmatrix}0&0\\t&0\end{bmatrix}x(t) \qquad (4.51)$$

or ẋ₁(t) = 0 and ẋ₂(t) = t x₁(t). The solution of ẋ₁(t) = 0 for t₀ = 0 is x₁(t) = x₁(0); the solution of ẋ₂(t) = t x₁(t) = t x₁(0) is

$$x_2(t)=\int_0^t\tau x_1(0)\,d\tau+x_2(0)=0.5t^2x_1(0)+x_2(0)$$

Thus we have

$$x(0)=\begin{bmatrix}x_1(0)\\x_2(0)\end{bmatrix}=\begin{bmatrix}1\\0\end{bmatrix}\;\Rightarrow\;x(t)=\begin{bmatrix}1\\0.5t^2\end{bmatrix}$$

and

$$x(0)=\begin{bmatrix}x_1(0)\\x_2(0)\end{bmatrix}=\begin{bmatrix}1\\2\end{bmatrix}\;\Rightarrow\;x(t)=\begin{bmatrix}1\\0.5t^2+2\end{bmatrix} \qquad (4.52)$$

The two initial states are linearly independent; thus

$$X(t)=\begin{bmatrix}1&1\\0.5t^2&0.5t^2+2\end{bmatrix}$$

is a fundamental matrix.

A very important property of the fundamental matrix is that X(t) is nonsingular for all t. For example, X(t) in (4.52) has determinant 0.5t² + 2 − 0.5t² = 2; thus it is nonsingular for all t. We argue intuitively why this is the case. If X(t) is singular at some t₁, then there exists a nonzero vector v such that x(t₁) := X(t₁)v = 0, which, in turn, implies x(t) := X(t)v ≡ 0 for all t, in particular, at t = t₀. This contradicts the linear independence of the initial states. Thus X(t) is nonsingular for all t.
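Example 4.8 can be reproduced by numerical integration. The sketch below (plain Python, ours, not from the text) integrates (4.51) by the classical fourth-order Runge–Kutta method from the two initial states in (4.52) and checks both the closed-form solutions and the constant determinant det X(t) = 2:

```python
def f(t, x):
    # Right-hand side of (4.51): x1' = 0, x2' = t * x1.
    return [0.0, t * x[0]]

def rk4(x0, t_end, n_steps=2000):
    """Classical RK4 integration of x' = f(t, x) from t = 0 to t = t_end."""
    h = t_end / n_steps
    t, x = 0.0, list(x0)
    for _ in range(n_steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, [x[i] + h / 2 * k1[i] for i in range(2)])
        k3 = f(t + h / 2, [x[i] + h / 2 * k2[i] for i in range(2)])
        k4 = f(t + h,     [x[i] + h * k3[i] for i in range(2)])
        x = [x[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2)]
        t += h
    return x

t1 = 2.0
c1 = rk4([1.0, 0.0], t1)     # closed form: [1, 0.5 t^2]     -> [1, 2]
c2 = rk4([1.0, 2.0], t1)     # closed form: [1, 0.5 t^2 + 2] -> [1, 4]
detX = c1[0] * c2[1] - c2[0] * c1[1]   # det of the fundamental matrix, expect 2
```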
Definition 4.2 Let X(t) be any fundamental matrix of ẋ = A(t)x. Then

$$\Phi(t,t_0):=X(t)X^{-1}(t_0)$$

is called the state transition matrix of ẋ = A(t)x. The state transition matrix is also the unique solution of

$$\frac{\partial}{\partial t}\Phi(t,t_0)=A(t)\Phi(t,t_0) \qquad (4.53)$$

with the initial condition Φ(t₀, t₀) = I.

Because X(t) is nonsingular for all t, its inverse is well defined. Equation (4.53) follows directly from (4.50). From the definition, we have the following important properties of the state transition matrix:

$$\Phi(t,t)=I \qquad (4.54)$$
$$\Phi^{-1}(t,t_0)=[X(t)X^{-1}(t_0)]^{-1}=X(t_0)X^{-1}(t)=\Phi(t_0,t) \qquad (4.55)$$
$$\Phi(t,t_0)=\Phi(t,t_1)\Phi(t_1,t_0) \qquad (4.56)$$

for every t, t₀, and t₁.

EXAMPLE 4.9 Consider the homogeneous equation in Example 4.8. Its fundamental matrix was computed as

$$X(t)=\begin{bmatrix}1&1\\0.5t^2&0.5t^2+2\end{bmatrix}$$

Its inverse is, using (3.20),

$$X^{-1}(t)=\begin{bmatrix}0.25t^2+1&-0.5\\-0.25t^2&0.5\end{bmatrix}$$

Thus the state transition matrix is given by

$$\Phi(t,t_0)=\begin{bmatrix}1&1\\0.5t^2&0.5t^2+2\end{bmatrix}\begin{bmatrix}0.25t_0^2+1&-0.5\\-0.25t_0^2&0.5\end{bmatrix}=\begin{bmatrix}1&0\\0.5(t^2-t_0^2)&1\end{bmatrix}$$

It is straightforward to verify that this transition matrix satisfies (4.53) and has the three properties listed in (4.54) through (4.56).

Now we claim that the solution of (4.45) excited by the initial state x(t₀) = x₀ and the input u(t) is given by

$$x(t)=\Phi(t,t_0)x_0+\int_{t_0}^t\Phi(t,\tau)B(\tau)u(\tau)\,d\tau \qquad (4.57)$$
$$\phantom{x(t)}=\Phi(t,t_0)\Bigl[x_0+\int_{t_0}^t\Phi(t_0,\tau)B(\tau)u(\tau)\,d\tau\Bigr] \qquad (4.58)$$

where Φ(t, τ) is the state transition matrix of ẋ = A(t)x. Equation (4.58) follows from (4.57) by using Φ(t, τ) = Φ(t, t₀)Φ(t₀, τ). We show that (4.57) satisfies the initial condition and the state equation. At t = t₀, we have

$$x(t_0)=\Phi(t_0,t_0)x_0+\int_{t_0}^{t_0}\Phi(t_0,\tau)B(\tau)u(\tau)\,d\tau=Ix_0+0=x_0$$

Thus (4.57) satisfies the initial condition. Using (4.53) and (4.6), we have

$$\frac{d}{dt}x(t)=\frac{\partial}{\partial t}\Phi(t,t_0)x_0+\frac{\partial}{\partial t}\int_{t_0}^t\Phi(t,\tau)B(\tau)u(\tau)\,d\tau$$
$$=A(t)\Phi(t,t_0)x_0+\int_{t_0}^t\Bigl(\frac{\partial}{\partial t}\Phi(t,\tau)\Bigr)B(\tau)u(\tau)\,d\tau+\Phi(t,t)B(t)u(t)$$
$$=A(t)\Phi(t,t_0)x_0+\int_{t_0}^tA(t)\Phi(t,\tau)B(\tau)u(\tau)\,d\tau+B(t)u(t)$$
$$=A(t)\Bigl[\Phi(t,t_0)x_0+\int_{t_0}^t\Phi(t,\tau)B(\tau)u(\tau)\,d\tau\Bigr]+B(t)u(t)=A(t)x(t)+B(t)u(t)$$

Thus (4.57) is the solution. Substituting (4.57) into (4.46) yields

$$y(t)=C(t)\Phi(t,t_0)x_0+C(t)\int_{t_0}^t\Phi(t,\tau)B(\tau)u(\tau)\,d\tau+D(t)u(t) \qquad (4.59)$$

If the input is identically zero, then Equation (4.57) reduces to

$$x(t)=\Phi(t,t_0)x_0$$

This is the zero-input response. Thus the state transition matrix governs the unforced propagation of the state vector. If the initial state is zero, then (4.59) reduces to

$$y(t)=C(t)\int_{t_0}^t\Phi(t,\tau)B(\tau)u(\tau)\,d\tau+D(t)u(t)=\int_{t_0}^t\bigl[C(t)\Phi(t,\tau)B(\tau)+D(t)\delta(t-\tau)\bigr]u(\tau)\,d\tau \qquad (4.60)$$

This is the zero-state response. As discussed in (2.5), the zero-state response can be described by

$$y(t)=\int_{t_0}^tG(t,\tau)u(\tau)\,d\tau \qquad (4.61)$$

where G(t, τ) is the impulse response matrix and is the output at time t excited by an impulse input applied at time τ. Comparing (4.60) and (4.61) yields

$$G(t,\tau)=C(t)\Phi(t,\tau)B(\tau)+D(t)\delta(t-\tau)=C(t)X(t)X^{-1}(\tau)B(\tau)+D(t)\delta(t-\tau) \qquad (4.62)$$

This relates the input–output and state-space descriptions.
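The three properties (4.54)–(4.56) are easy to spot-check for Example 4.9. The sketch below (plain Python, ours) builds Φ(t, t₀) from the closed form just derived and verifies the semigroup property and the inverse relation at a few sample times:

```python
def phi(t, t0):
    # State transition matrix of Example 4.9: [[1, 0], [0.5 (t^2 - t0^2), 1]].
    return [[1.0, 0.0], [0.5 * (t * t - t0 * t0), 1.0]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t0, t1, t2 = 1.0, 2.5, 4.0
left = phi(t2, t0)
right = mat_mul(phi(t2, t1), phi(t1, t0))      # (4.56): should equal left
inv_check = mat_mul(phi(t1, t0), phi(t0, t1))  # (4.55): should be the identity
diag_check = phi(t1, t1)                       # (4.54): should be the identity
```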
STATE-SPACE SOLUTIONS AND REALIZATIONS

The solutions in (4.57) and (4.59) hinge on solving (4.49) or (4.53). If A(t) is triangular, such as

    [ẋ1(t)]   [a11(t)    0   ] [x1(t)]
    [ẋ2(t)] = [a21(t)  a22(t)] [x2(t)]

we can solve the scalar equation ẋ1(t) = a11(t)x1(t) and then substitute it into

    ẋ2(t) = a21(t)x1(t) + a22(t)x2(t)

Because x1(t) has been solved, the preceding scalar equation can be solved for x2(t). This is what we did in Example 4.8. If A(t), such as A(t) diagonal or constant, has the commutative property

    A(t) (∫_{t0}^{t} A(τ) dτ) = (∫_{t0}^{t} A(τ) dτ) A(t)

for all t0 and t, then the solution of (4.53) can be shown to be

    Φ(t, t0) = exp(∫_{t0}^{t} A(τ) dτ) = Σ_{k=0}^{∞} (1/k!) (∫_{t0}^{t} A(τ) dτ)^k   (4.63)

For A(t) constant, (4.63) reduces to Φ(t, τ) = e^{A(t−τ)} = Φ(t − τ) and X(t) = e^{At}. Other than the preceding special cases, computing state transition matrices is generally difficult.

4.5.1 Discrete-Time Case

Consider the discrete-time state equation

    x[k + 1] = A[k]x[k] + B[k]u[k]   (4.64)
    y[k] = C[k]x[k] + D[k]u[k]   (4.65)

The set consists of algebraic equations and their solutions can be computed recursively once the initial state x[k0] and the input u[k], for k ≥ k0, are given. The situation here is much simpler than in the continuous-time case. As in the continuous-time case, we can define the discrete state transition matrix as the solution of

    Φ[k + 1, k0] = A[k]Φ[k, k0]  with Φ[k0, k0] = I   (4.66)

for k = k0, k0 + 1, .... This is the discrete counterpart of (4.53) and its solution can be obtained directly as

    Φ[k, k0] = A[k − 1]A[k − 2] ··· A[k0]

for k > k0 and Φ[k0, k0] = I. We discuss a significant difference between the continuous- and discrete-time cases. Because the fundamental matrix in the continuous-time case is nonsingular for all t, the state transition matrix Φ(t, t0) is defined for t ≥ t0 and t < t0 and can govern the propagation of the state vector in the positive-time and negative-time directions. In the discrete-time case, the A-matrix can be singular; thus the inverse of Φ[k, k0] may not be defined. Thus Φ[k, k0] is defined only for k ≥ k0 and governs the propagation of the state vector in only the positive-time direction. Therefore the discrete counterpart of (4.56), or

    Φ[k, k0] = Φ[k, k1]Φ[k1, k0]

holds only for k ≥ k1 ≥ k0. Using the discrete state transition matrix, we can express the solutions of (4.64) and (4.65) as, for k > k0,

    x[k] = Φ[k, k0]x0 + Σ_{m=k0}^{k−1} Φ[k, m + 1]B[m]u[m]   (4.67)

    y[k] = C[k]Φ[k, k0]x0 + C[k] Σ_{m=k0}^{k−1} Φ[k, m + 1]B[m]u[m] + D[k]u[k]

Their derivations are similar to those of (4.20) and (4.21) and will not be repeated. If the initial state is zero, Equation (4.67) reduces to

    y[k] = C[k] Σ_{m=k0}^{k−1} Φ[k, m + 1]B[m]u[m] + D[k]u[k]   (4.68)

for k > k0. This describes the zero-state response. If we define Φ[k, m] = 0 for k < m, then (4.68) can be written as

    y[k] = Σ_{m=k0}^{k} (C[k]Φ[k, m + 1]B[m] + D[m]δ[k − m]) u[m]

where the impulse sequence δ[k − m] equals 1 if k = m and 0 if k ≠ m. Comparing this with the multivariable version of (2.34), we have

    G[k, m] = C[k]Φ[k, m + 1]B[m] + D[m]δ[k − m]

for k ≥ m. This relates the impulse response sequence and the state equation and is the discrete counterpart of (4.62).
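The closed-form solution (4.67) can be checked against the raw recursion. The sketch below uses an illustrative time-varying pair A[k], B[k] (these matrices are my own choices, not from the text) and confirms that recursing x[k+1] = A[k]x[k] + B[k]u[k] agrees with Φ[k, k0]x0 + Σ Φ[k, m+1]B[m]u[m], where Φ[k, k0] = A[k−1]···A[k0].

```python
def mat_mul(A, B):
    return [[sum(A[i][r] * B[r][j] for r in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_vec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def A_of(k):                     # illustrative time-varying A[k]
    return [[0.5, 0.1 * k], [0.0, 0.3]]

def B_of(k):                     # illustrative B[k], a 2x1 column
    return [[1.0], [0.2 * k]]

def u_of(k):                     # unit-step input
    return 1.0

def phi(k, k0):                  # Phi[k,k0] = A[k-1]...A[k0], Phi[k0,k0] = I
    P = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(k0, k):
        P = mat_mul(A_of(m), P)
    return P

k0, kf = 0, 6
x0 = [1.0, -1.0]

# direct recursion of x[k+1] = A[k]x[k] + B[k]u[k]
x = x0[:]
for k in range(k0, kf):
    Ax = mat_vec(A_of(k), x)
    x = [Ax[i] + B_of(k)[i][0] * u_of(k) for i in range(2)]

# closed-form solution (4.67)
xf = mat_vec(phi(kf, k0), x0)
for m in range(k0, kf):
    PB = mat_mul(phi(kf, m + 1), B_of(m))
    xf = [xf[i] + PB[i][0] * u_of(m) for i in range(2)]

print(all(abs(x[i] - xf[i]) < 1e-9 for i in range(2)))  # True
```

The two computations are algebraically identical; the check only confirms the bookkeeping of the product formula for Φ[k, k0].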
4.6 Equivalent Time-Varying Equations

This section extends the equivalent state equations discussed in Section 4.3 to the time-varying case. Consider the n-dimensional linear time-varying state equation

    ẋ = A(t)x + B(t)u
    y = C(t)x + D(t)u   (4.69)
Let P(t) be an n × n matrix. It is assumed that P(t) is nonsingular and both P(t) and Ṗ(t) are continuous for all t. Let x̄ = P(t)x. Then the state equation

    x̄̇ = Ā(t)x̄ + B̄(t)u
    y = C̄(t)x̄ + D̄(t)u   (4.70)

where

    Ā(t) = [P(t)A(t) + Ṗ(t)]P⁻¹(t)
    B̄(t) = P(t)B(t)
    C̄(t) = C(t)P⁻¹(t)
    D̄(t) = D(t)

is said to be (algebraically) equivalent to (4.69), and P(t) is called an (algebraic) equivalence transformation.

Equation (4.70) is obtained from (4.69) by substituting x̄ = P(t)x and x̄̇ = P(t)ẋ + Ṗ(t)x. Let X be a fundamental matrix of (4.69). Then we claim that

    X̄(t) := P(t)X(t)   (4.71)

is a fundamental matrix of (4.70). By definition, Ẋ(t) = A(t)X(t) and X(t) is nonsingular for all t. Because the rank of a matrix will not change under multiplication by a nonsingular matrix, the matrix P(t)X(t) is also nonsingular for all t. Now we show that P(t)X(t) satisfies the equation x̄̇ = Ā(t)x̄. Indeed, we have

    (d/dt)[P(t)X(t)] = Ṗ(t)X(t) + P(t)Ẋ(t) = Ṗ(t)X(t) + P(t)A(t)X(t)
                     = [Ṗ(t) + P(t)A(t)][P⁻¹(t)P(t)]X(t) = Ā(t)[P(t)X(t)]

Thus P(t)X(t) is a fundamental matrix of x̄̇(t) = Ā(t)x̄(t).

Theorem 4.3
Let A0 be an arbitrary constant matrix. Then there exists an equivalence transformation that transforms (4.69) into (4.70) with Ā(t) = A0.

Proof: Let X(t) be a fundamental matrix of ẋ = A(t)x. The differentiation of X⁻¹(t)X(t) = I yields

    (d/dt)X⁻¹(t) · X(t) + X⁻¹(t)Ẋ(t) = 0

which implies

    (d/dt)X⁻¹(t) = −X⁻¹(t)A(t)   (4.72)

Because Ā(t) = A0 is a constant matrix, X̄(t) = e^{A0 t} is a fundamental matrix of x̄̇ = Ā(t)x̄ = A0 x̄. Following (4.71), we define

    P(t) := X̄(t)X⁻¹(t) = e^{A0 t}X⁻¹(t)   (4.73)

and compute

    Ā(t) = [P(t)A(t) + Ṗ(t)]P⁻¹(t)
         = [e^{A0 t}X⁻¹(t)A(t) + A0 e^{A0 t}X⁻¹(t) + e^{A0 t}(d/dt)X⁻¹(t)] X(t)e^{−A0 t}

which becomes, after substituting (4.72),

    Ā(t) = A0 e^{A0 t}X⁻¹(t)X(t)e^{−A0 t} = A0

This establishes the theorem. Q.E.D.

If A0 is chosen as a zero matrix, then P(t) = X⁻¹(t) and (4.70) reduces to

    Ā(t) = 0    B̄(t) = X⁻¹(t)B(t)    C̄(t) = C(t)X(t)    D̄(t) = D(t)   (4.74)

The block diagrams of (4.69) with Ā(t) ≠ 0 and Ā(t) = 0 are plotted in Fig. 4.5. The block diagram with Ā(t) = 0 has no feedback and is considerably simpler. Every time-varying state equation can be transformed into such a block diagram. However, in order to do so, we must know its fundamental matrix.

Figure 4.5  Block diagrams with feedback (a) and without feedback (b).

The impulse response matrix of (4.69) is given in (4.62). The impulse response matrix of (4.70) is, using (4.71) and (4.72),

    Ḡ(t, τ) = C̄(t)X̄(t)X̄⁻¹(τ)B̄(τ) + D̄(t)δ(t − τ)
            = C(t)P⁻¹(t)P(t)X(t)X⁻¹(τ)P⁻¹(τ)P(τ)B(τ) + D(t)δ(t − τ)
            = C(t)X(t)X⁻¹(τ)B(τ) + D(t)δ(t − τ) = G(t, τ)
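The construction in the proof of Theorem 4.3 can be spot-checked numerically. For a constant A the fundamental matrix is X(t) = e^{At}, so P(t) = e^{A0 t}e^{−At}, and [P(t)A + Ṗ(t)]P⁻¹(t) should collapse to A0. The matrices below are illustrative choices (not from the text); Ṗ is approximated by a central difference, and the matrix exponential by a plain Taylor series (adequate for small arguments).

```python
def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scale(s, A):
    return [[s * A[i][j] for j in range(2)] for i in range(2)]

def expm2(A, terms=40):              # Taylor series e^A; fine for small ||A||
    E = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        T = [[sum(T[i][k] * A[k][j] for k in range(2)) / n for j in range(2)] for i in range(2)]
        E = [[E[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return E

def inv2(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

A  = [[0.0, 1.0], [-2.0, -3.0]]      # original constant A, so X(t) = e^{At}
A0 = [[-1.0, 0.0], [0.0, -4.0]]      # arbitrary target constant matrix

def P(t):                            # P(t) = e^{A0 t} X^{-1}(t) = e^{A0 t} e^{-A t}
    return mm(expm2(scale(t, A0)), expm2(scale(-t, A)))

t, h = 0.7, 1e-5
Pt, Pp, Pm = P(t), P(t + h), P(t - h)
Pdot = [[(Pp[i][j] - Pm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
PA = mm(Pt, A)
Abar = mm([[PA[i][j] + Pdot[i][j] for j in range(2)] for i in range(2)], inv2(Pt))

err = max(abs(Abar[i][j] - A0[i][j]) for i in range(2) for j in range(2))
print(err < 1e-6)  # True
```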
Thus the impulse response matrix is invariant under any equivalence transformation. The properties of the A-matrix, however, may not be preserved under equivalence transformations. For example, every A-matrix can be transformed, as shown in Theorem 4.3, into a constant or even a zero matrix. Clearly the zero matrix does not retain any property of A(t). In the time-invariant case, equivalence transformations preserve all properties of the original state equation. Thus the equivalence transformation in the time-invariant case is not a special case of the time-varying one.

Definition 4.3  A matrix P(t) is called a Lyapunov transformation if P(t) is nonsingular, P(t) and Ṗ(t) are continuous, and P(t) and P⁻¹(t) are bounded for all t. Equations (4.69) and (4.70) are said to be Lyapunov equivalent if P(t) is a Lyapunov transformation.

It is clear that if P(t) = P is a constant matrix, then it is a Lyapunov transformation. Thus the (algebraic) transformation in the time-invariant case is a special case of the Lyapunov transformation. If P(t) is required to be a Lyapunov transformation, then Theorem 4.3 does not hold in general. In other words, not every time-varying state equation is Lyapunov equivalent to a state equation with a constant A-matrix. However, this is true if A(t) is periodic.
Periodic state equations. Consider the linear time-varying state equation in (4.69). It is assumed that

    A(t + T) = A(t)

for all t and for some positive constant T. That is, A(t) is periodic with period T. Let X(t) be a fundamental matrix of ẋ = A(t)x or Ẋ(t) = A(t)X(t) with X(0) nonsingular. Then we have

    Ẋ(t + T) = A(t + T)X(t + T) = A(t)X(t + T)

Thus X(t + T) is also a fundamental matrix. Furthermore, it can be expressed as

    X(t + T) = X(t)X⁻¹(0)X(T)   (4.75)

This can be verified by direct substitution. Let us define Q = X⁻¹(0)X(T). It is a constant nonsingular matrix. For this Q there exists a constant matrix Ā such that e^{ĀT} = Q (Problem 3.24). Thus (4.75) can be written as

    X(t + T) = X(t)e^{ĀT}   (4.76)

Define

    P(t) := e^{Āt}X⁻¹(t)   (4.77)

We show that P(t) is periodic with period T:

    P(t + T) = e^{Ā(t+T)}X⁻¹(t + T) = e^{Āt}e^{ĀT}[e^{−ĀT}X⁻¹(t)] = e^{Āt}X⁻¹(t) = P(t)

Theorem 4.4
Consider (4.69) with A(t) = A(t + T) for all t and some T > 0. Let X(t) be a fundamental matrix of ẋ = A(t)x. Let Ā be the constant matrix computed from e^{ĀT} = X⁻¹(0)X(T). Then (4.69) is Lyapunov equivalent to

    x̄̇(t) = Āx̄(t) + P(t)B(t)u(t)
    y(t) = C(t)P⁻¹(t)x̄(t) + D(t)u(t)

where P(t) = e^{Āt}X⁻¹(t).

The matrix P(t) in (4.77) satisfies all conditions in Definition 4.3; thus it is a Lyapunov transformation. The rest of the theorem follows directly from Theorem 4.3. The homogeneous part of Theorem 4.4 is the theory of Floquet. It states that if ẋ = A(t)x and if A(t + T) = A(t) for all t, then its fundamental matrix is of the form P⁻¹(t)e^{Āt}, where P⁻¹(t) is a periodic function. Furthermore, ẋ = A(t)x is Lyapunov equivalent to x̄̇ = Āx̄.

4.7 Time-Varying Realizations

We studied in Section 4.4 the realization problem for linear time-invariant systems. In this section, we study the corresponding problem for linear time-varying systems. The Laplace transform cannot be used here; therefore we study the problem directly in the time domain. Every linear time-varying system can be described by the input-output description

    y(t) = ∫_{t0}^{t} G(t, τ)u(τ) dτ

and, if the system is lumped as well, by the state equation

    ẋ(t) = A(t)x(t) + B(t)u(t)
    y(t) = C(t)x(t) + D(t)u(t)   (4.78)

If the state equation is available, the impulse response matrix can be computed from

    G(t, τ) = C(t)X(t)X⁻¹(τ)B(τ) + D(t)δ(t − τ)  for t ≥ τ   (4.79)

where X(t) is a fundamental matrix of ẋ = A(t)x. The converse problem is to find a state equation from a given impulse response matrix. An impulse response matrix G(t, τ) is said to be realizable if there exists {A(t), B(t), C(t), D(t)} to meet (4.79).

Theorem 4.5
A q × p impulse response matrix G(t, τ) is realizable if and only if G(t, τ) can be decomposed as

    G(t, τ) = M(t)N(τ) + D(t)δ(t − τ)   (4.80)

for all t ≥ τ, where M, N, and D are, respectively, q × n, n × p, and q × p matrices for some integer n.
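A numerical sketch of the sufficiency direction of the decomposition condition (4.80): if G(t, τ) = M(t)N(τ), then the state equation ẋ = N(t)u, y = M(t)x with zero initial state reproduces y(t) = ∫ M(t)N(τ)u(τ) dτ. The M, N, and input below are illustrative choices of mine, not from the text; the state equation is integrated with forward Euler and compared with direct quadrature of the integral.

```python
import math

M = lambda t: [math.sin(t), math.cos(t)]   # 1 x 2, illustrative
N = lambda t: [math.exp(-t), t]            # 2 x 1, illustrative
u = lambda t: 1.0                          # unit-step input

T, steps = 2.0, 20000
h = T / steps

# simulate xdot = N(t)u(t), x(0) = 0, by forward Euler
x = [0.0, 0.0]
for j in range(steps):
    t = j * h
    x = [x[i] + h * N(t)[i] * u(t) for i in range(2)]
y_state = sum(M(T)[i] * x[i] for i in range(2))

# evaluate y(T) = int_0^T M(T) N(tau) u(tau) dtau by the midpoint rule
y_int = sum(h * sum(M(T)[i] * N((j + 0.5) * h)[i] for i in range(2)) * u((j + 0.5) * h)
            for j in range(steps))

print(abs(y_state - y_int) < 1e-3)  # True
```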
Proof: If G(t, τ) is realizable, there exists a realization that meets (4.79). Identifying M(t) = C(t)X(t) and N(τ) = X⁻¹(τ)B(τ) establishes the necessity part of the theorem. If G(t, τ) can be decomposed as in (4.80), then the n-dimensional state equation

    ẋ(t) = N(t)u(t)
    y(t) = M(t)x(t) + D(t)u(t)   (4.81)

is a realization. Indeed, a fundamental matrix of ẋ = 0 · x is X(t) = I. Thus the impulse response matrix of (4.81) is

    M(t) · I · I⁻¹ · N(τ) + D(t)δ(t − τ)

which equals G(t, τ). This shows the sufficiency of the theorem. Q.E.D.

Although Theorem 4.5 can also be applied to time-invariant systems, the result is not useful in practical implementation, as the next example illustrates.

EXAMPLE 4.10  Consider g(t) = te^{λt} or

    g(t, τ) = g(t − τ) = (t − τ)e^{λ(t−τ)}

It is straightforward to verify

    g(t − τ) = [e^{λt}  te^{λt}] [−τe^{−λτ} ; e^{−λτ}]

Thus the two-dimensional time-varying state equation

    ẋ(t) = [0 0; 0 0] x(t) + [−te^{−λt} ; e^{−λt}] u(t)
    y(t) = [e^{λt}  te^{λt}] x(t)   (4.82)

is a realization of the impulse response g(t) = te^{λt}.

The Laplace transform of the impulse response is

    ĝ(s) = L[te^{λt}] = 1/(s − λ)² = 1/(s² − 2λs + λ²)

Using (4.41), we can readily obtain

    ẋ(t) = [2λ  −λ² ; 1  0] x(t) + [1 ; 0] u(t)
    y(t) = [0  1] x(t)   (4.83)

This LTI state equation is a different realization of the same impulse response. This realization is clearly more desirable because it can readily be implemented using an op-amp circuit. The implementation of (4.82) is much more difficult in practice.

PROBLEMS

4.1  An oscillation can be generated by

    ẋ = [0  1 ; −1  0] x

Show that its solution is

    x(t) = [cos t  sin t ; −sin t  cos t] x(0)

4.2  Use two different methods to find the unit-step response of

    ẋ = [0  1 ; −2  −2] x + [1 ; 1] u
    y = [2  3] x

4.3  Discretize the state equation in Problem 4.2 for T = 1 and T = π.

4.4  Find the companion-form and modal-form equivalent equations of

    ẋ = [−2  0  0 ; 1  0  1 ; 0  −2  −2] x + [1 ; 0 ; 1] u
    y = [1  −1  0] x

4.5  Find an equivalent state equation of the equation in Problem 4.4 so that all state variables have their largest magnitudes roughly equal to the largest magnitude of the output. If all signals are required to lie inside ±10 volts and if the input is a step function with magnitude a, what is the permissible largest a?

4.6  Consider

    ẋ = [λ  0 ; 0  λ̄] x + [b1 ; b̄1] u
    y = [c1  c̄1] x

where the overbar denotes complex conjugate. Verify that the equation can be transformed into

    x̄̇ = Āx̄ + b̄u    y = c̄x̄

with Ā real, by using the transformation x = Qx̄ with Q = [b  Ab].
4.7  Verify that the Jordan-form equation

    ẋ = [λ 1 0 0 0 0 ; 0 λ 1 0 0 0 ; 0 0 λ 0 0 0 ; 0 0 0 λ̄ 1 0 ; 0 0 0 0 λ̄ 1 ; 0 0 0 0 0 λ̄] x + [b1 ; b2 ; b3 ; b̄1 ; b̄2 ; b̄3] u
    y = [c1  c2  c3  c̄1  c̄2  c̄3] x

can be transformed into

    x̄̇ = [Ā  I2  0 ; 0  Ā  I2 ; 0  0  Ā] x̄ + [b̄1 ; b̄2 ; b̄3] u    y = [c̄1  c̄2  c̄3] x̄

where Ā, b̄i, and c̄i are defined in Problem 4.6 and I2 is the unit matrix of order 2. [Hint: Change the order of the state variables from [x1 x2 x3 x4 x5 x6]′ to [x1 x4 x2 x5 x3 x6]′ and then apply the equivalence transformation x̄ = Qx with Q = diag(Q1, Q2, Q3).]

4.8  Are the two sets of state equations, both with output equation y = [1  1  0]x, equivalent? Are they zero-state equivalent?

4.9  Verify that the transfer matrix in (4.33) has the following realization:

    ẋ = [−α1 Iq  Iq  0  ···  0 ; −α2 Iq  0  Iq  ···  0 ; ⋮ ; −α_{r−1} Iq  0  0  ···  Iq ; −αr Iq  0  0  ···  0] x + [N1 ; N2 ; ⋮ ; Nr] u
    y = [Iq  0  0  ···  0] x

This is called the observable canonical form realization and has dimension rq. It is dual to (4.34).

4.10  Consider the 1 × 2 proper rational matrix

    Ĝ(s) = [d1  d2] + (1/(s⁴ + α1 s³ + α2 s² + α3 s + α4)) [β11 s³ + β21 s² + β31 s + β41    β12 s³ + β22 s² + β32 s + β42]

Show that its observable canonical form realization can be reduced from Problem 4.9 as

    ẋ = [−α1 1 0 0 ; −α2 0 1 0 ; −α3 0 0 1 ; −α4 0 0 0] x + [β11 β12 ; β21 β22 ; β31 β32 ; β41 β42] u
    y = [1  0  0  0] x + [d1  d2] u

4.11  Find a realization for the proper rational matrix

    Ĝ(s) = [(s − 2)/(s + 1)    (s − 3)/((s + 1)(s + 2)) ; 1/(s + 1)    s/(s + 2)]

4.12  Find a realization for each column of Ĝ(s) in Problem 4.11 and then connect them, as shown in Fig. 4.4(a), to obtain a realization of Ĝ(s). What is the dimension of this realization? Compare this dimension with the one in Problem 4.11.

4.13  Find a realization for each row of Ĝ(s) in Problem 4.11 and then connect them, as shown in Fig. 4.4(b), to obtain a realization of Ĝ(s). What is the dimension of this realization? Compare this dimension with the ones in Problems 4.11 and 4.12.

4.14  Find a realization for

    Ĝ(s) = [(−2s + 6)/(3s + 34)    (22s + 23)/(3s + 34)]

4.15  Consider the n-dimensional state equation

    ẋ = Ax + bu    y = cx

Let ĝ(s) be its transfer function. Show that ĝ(s) has m zeros or, equivalently, the numerator of ĝ(s) has degree m if and only if

    cAⁱb = 0  for i = 0, 1, 2, ..., n − m − 2    and    cA^{n−m−1}b ≠ 0

Or, equivalently, the difference between the degrees of the denominator and numerator of ĝ(s) is α = n − m if and only if cA^{α−1}b ≠ 0 and cAⁱb = 0 for i = 0, 1, 2, ..., α − 2.

4.16  Find fundamental matrices and state transition matrices for

    ẋ = [0  0 ; 0  t] x
and

    ẋ = [−1  e^{2t} ; 0  −1] x

4.17  Show that ∂Φ(t0, t)/∂t = −Φ(t0, t)A(t).

4.19  Let

    Φ(t, t0) = [Φ11(t, t0)  Φ12(t, t0) ; Φ21(t, t0)  Φ22(t, t0)]

be the state transition matrix of

    ẋ(t) = [A11(t)  A12(t) ; 0  A22(t)] x(t)

Show that Φ21(t, t0) = 0 for all t and t0 and that (∂/∂t)Φii(t, t0) = Aii(t)Φii(t, t0) for i = 1, 2.

4.20  Find the state transition matrix of

    ẋ = [−sin t  0 ; 0  −cos t] x

4.21  Verify that X(t) = e^{At}Ce^{Bt} is the solution of

    Ẋ = AX + XB    X(0) = C

4.22  Show that if Ȧ(t) = A1A(t) − A(t)A1, then A(t) = e^{A1 t}A(0)e^{−A1 t}. Show also that the eigenvalues of A(t) are independent of t.

4.23  Find an equivalent time-invariant state equation of the equation in Problem 4.20.

4.24  Transform a time-invariant (A, B, C) into (0, B̄(t), C̄(t)) by a time-varying equivalence transformation.

4.25  Find a time-varying realization and a time-invariant realization of the impulse response g(t) = t²e^{λt}.

4.26  Find a realization of g(t, τ) = sin t (e^{−(t−τ)}) cos τ. Is it possible to find a time-invariant state equation realization?

Chapter 5

Stability

5.1 Introduction

Systems are designed to perform some tasks or to process signals. If a system is not stable, the system may burn out, disintegrate, or saturate when a signal, no matter how small, is applied. Therefore an unstable system is useless in practice and stability is a basic requirement for all systems. In addition to stability, systems must meet other requirements, such as to track desired signals and to suppress noise, to be really useful in practice. The response of linear systems can always be decomposed as the zero-state response and the zero-input response. It is customary to study the stabilities of these two responses separately. We will introduce the BIBO (bounded-input bounded-output) stability for the zero-state response and marginal and asymptotic stabilities for the zero-input response. We study first the time-invariant case and then the time-varying case.

5.2 Input-Output Stability of LTI Systems

Consider a SISO linear time-invariant (LTI) system described by

    y(t) = ∫₀ᵗ g(t − τ)u(τ) dτ = ∫₀ᵗ g(τ)u(t − τ) dτ   (5.1)

where g(t) is the impulse response or the output excited by an impulse input applied at t = 0. Recall that in order to be describable by (5.1), the system must be linear, time-invariant, and causal. In addition, the system must be initially relaxed at t = 0.
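The convolution (5.1) is easy to evaluate numerically. As a sketch (the impulse response below is an assumed example, not from the text): for g(t) = e^{−t} and a unit-step input, y(t) = ∫₀ᵗ e^{−τ} dτ = 1 − e^{−t}, which a trapezoidal approximation of (5.1) recovers.

```python
import math

def y_conv(t, g, u, n=4000):
    # trapezoidal approximation of the convolution integral (5.1)
    h = t / n
    s = 0.5 * (g(0.0) * u(t) + g(t) * u(0.0))
    for i in range(1, n):
        s += g(i * h) * u(t - i * h)
    return h * s

g = lambda t: math.exp(-t)      # assumed impulse response
u = lambda t: 1.0               # unit-step input

y_val = y_conv(2.0, g, u)
print(abs(y_val - (1.0 - math.exp(-2.0))) < 1e-5)  # True
```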
An input u(t) is said to be bounded if u(t) does not grow to positive or negative infinity or, equivalently, there exists a constant um such that

    |u(t)| ≤ um < ∞  for all t ≥ 0

A system is said to be BIBO stable (bounded-input bounded-output stable) if every bounded input excites a bounded output. This stability is defined for the zero-state response and is applicable only if the system is initially relaxed.

Theorem 5.1
A SISO system described by (5.1) is BIBO stable if and only if g(t) is absolutely integrable in [0, ∞), or

    ∫₀^∞ |g(t)| dt ≤ M < ∞

for some constant M.

Proof: First we show that if g(t) is absolutely integrable, then every bounded input excites a bounded output. Let u(t) be an arbitrary input with |u(t)| ≤ um < ∞ for all t ≥ 0. Then

    |y(t)| = |∫₀ᵗ g(τ)u(t − τ) dτ| ≤ ∫₀ᵗ |g(τ)||u(t − τ)| dτ ≤ um ∫₀^∞ |g(τ)| dτ ≤ um M

Thus the output is bounded. Next we show intuitively that if g(t) is not absolutely integrable, then the system is not BIBO stable. If g(t) is not absolutely integrable, then there exists a t1 such that ∫₀^{t1} |g(τ)| dτ can be made as large as desired. Let us choose

    u(t1 − τ) = 1 if g(τ) ≥ 0,  −1 if g(τ) < 0

Clearly u is bounded. However, the output excited by this input equals

    y(t1) = ∫₀^{t1} g(τ)u(t1 − τ) dτ = ∫₀^{t1} |g(τ)| dτ

which is not bounded. Thus the system is not BIBO stable. This completes the proof of Theorem 5.1. Q.E.D.

A function that is absolutely integrable may not be bounded or may not approach zero as t → ∞. Indeed, consider the function defined by

    f(t) = n + (t − n)n⁴  for n − 1/n³ ≤ t ≤ n
    f(t) = n − (t − n)n⁴  for n < t ≤ n + 1/n³

for n = 2, 3, ..., and plotted in Fig. 5.1. The area under each triangle is 1/n²; thus the absolute integration of the function equals Σ_{n=2}^{∞} (1/n²) < ∞. This function is absolutely integrable but is not bounded and does not approach zero as t → ∞.

Figure 5.1  The function f(t).

Theorem 5.2
If a system with impulse response g(t) is BIBO stable, then, as t → ∞:

1. The output excited by u(t) = a, for t ≥ 0, approaches ĝ(0) · a.
2. The output excited by u(t) = sin ω0t, for t ≥ 0, approaches

    |ĝ(jω0)| sin(ω0t + ∠ĝ(jω0))

where ĝ(s) is the Laplace transform of g(t) or

    ĝ(s) = ∫₀^∞ g(τ)e^{−sτ} dτ   (5.2)

Proof: If u(t) = a for all t ≥ 0, then (5.1) becomes

    y(t) = ∫₀ᵗ g(τ)u(t − τ) dτ = a ∫₀ᵗ g(τ) dτ

which implies

    y(t) → a ∫₀^∞ g(τ) dτ = a ĝ(0)  as t → ∞

where we have used (5.2) with s = 0. This establishes the first part of Theorem 5.2. If u(t) = sin ω0t, then (5.1) becomes

    y(t) = ∫₀ᵗ g(τ) sin ω0(t − τ) dτ
         = ∫₀ᵗ g(τ)[sin ω0t cos ω0τ − cos ω0t sin ω0τ] dτ
         = sin ω0t ∫₀ᵗ g(τ) cos ω0τ dτ − cos ω0t ∫₀ᵗ g(τ) sin ω0τ dτ
Thus we have, as t → ∞,

    y(t) → sin ω0t ∫₀^∞ g(τ) cos ω0τ dτ − cos ω0t ∫₀^∞ g(τ) sin ω0τ dτ   (5.3)

If g(t) is absolutely integrable, we can replace s by jω in (5.2) to yield

    ĝ(jω) = ∫₀^∞ g(τ)[cos ωτ − j sin ωτ] dτ

The impulse response g(t) is implicitly assumed to be real; thus we have

    Re[ĝ(jω)] = ∫₀^∞ g(τ) cos ωτ dτ    Im[ĝ(jω)] = −∫₀^∞ g(τ) sin ωτ dτ

where Re and Im denote, respectively, the real part and the imaginary part. Substituting these into (5.3) yields

    y(t) → sin ω0t (Re[ĝ(jω0)]) + cos ω0t (Im[ĝ(jω0)]) = |ĝ(jω0)| sin(ω0t + ∠ĝ(jω0))

This completes the proof of Theorem 5.2. Q.E.D.

Theorem 5.2 is a basic result; filtering of signals is based essentially on this theorem. Next we state the BIBO stability condition in terms of proper rational transfer functions.

Theorem 5.3
A SISO system with proper rational transfer function ĝ(s) is BIBO stable if and only if every pole of ĝ(s) has a negative real part or, equivalently, lies inside the left-half s-plane.

If ĝ(s) has pole pi with multiplicity mi, then its partial fraction expansion contains the factors

    1/(s − pi),  1/(s − pi)², ...,  1/(s − pi)^{mi}

Thus the inverse Laplace transform of ĝ(s) or the impulse response contains the factors

    e^{pi t},  te^{pi t}, ...,  t^{mi−1}e^{pi t}

It is straightforward to verify that every such term is absolutely integrable if and only if pi has a negative real part. Using this fact, we can establish Theorem 5.3.

EXAMPLE 5.1  Consider the positive feedback system shown in Fig. 2.5(a). Its impulse response was computed in (2.9) as

    g(t) = Σ_{i=1}^{∞} aⁱ δ(t − i)

where the gain a can be positive or negative. The impulse is defined as the limit of the pulse in Fig. 2.3 and can be considered to be positive. Thus we have

    ∫₀^∞ |g(t)| dt = Σ_{i=1}^{∞} |a|ⁱ

which is finite if |a| < 1 and infinite if |a| ≥ 1. Thus we conclude that the positive feedback system in Fig. 2.5(a) is BIBO stable if and only if the gain a has a magnitude less than 1. The transfer function of the system was computed in (2.12) as

    ĝ(s) = ae^{−s}/(1 − ae^{−s})

It is an irrational function of s and Theorem 5.3 is not applicable. In this case, it is simpler to use Theorem 5.1 to check its stability.

For multivariable systems, we have the following results.

Theorem 5.M1
A multivariable system with impulse response matrix G(t) = [gij(t)] is BIBO stable if and only if every gij(t) is absolutely integrable in [0, ∞).

Theorem 5.M3
A multivariable system with proper rational transfer matrix Ĝ(s) = [ĝij(s)] is BIBO stable if and only if every pole of every ĝij(s) has a negative real part.

We now discuss the BIBO stability of state equations. Consider

    ẋ(t) = Ax(t) + Bu(t)
    y(t) = Cx(t) + Du(t)   (5.4)

Its transfer matrix is

    Ĝ(s) = C(sI − A)⁻¹B + D

Thus Equation (5.4) or, to be more precise, the zero-state response of (5.4) is BIBO stable if and only if every pole of Ĝ(s) has a negative real part. Recall that every pole of every entry of Ĝ(s) is called a pole of Ĝ(s). We discuss the relationship between the poles of Ĝ(s) and the eigenvalues of A. Because

    Ĝ(s) = (1/det(sI − A)) C[Adj(sI − A)]B + D   (5.5)

every pole of Ĝ(s) is an eigenvalue of A.
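Theorem 5.3 reduces BIBO stability to pole locations because each pole p contributes terms t^k e^{pt} to g(t), and ∫₀^∞ |t^k e^{pt}| dt is finite exactly when Re p < 0. A quick numeric illustration for a real pole (the specific values are my own choices):

```python
import math

def integral_abs(k, p, T, n=100000):
    # midpoint-rule approximation of int_0^T |t^k e^{p t}| dt for real p
    h = T / n
    return h * sum(((i + 0.5) * h) ** k * math.exp(p * (i + 0.5) * h) for i in range(n))

# stable pole p = -1: int_0^inf t e^{-t} dt = 1, so the truncated integral settles near 1
I_stable = integral_abs(1, -1.0, 50.0)
# pole with zero real part (p = 0): int_0^T t dt = T^2/2 grows without bound
I_short, I_long = integral_abs(1, 0.0, 100.0), integral_abs(1, 0.0, 200.0)

print(abs(I_stable - 1.0) < 1e-3)   # True
print(I_long > I_short > 1.0)       # True
```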
Thus if every eigenvalue of A has a negative real part, then every pole has a negative real part and (5.4) is BIBO stable. On the other hand, because of possible cancellation in (5.5), not every eigenvalue is a pole. Thus, even if A has some eigenvalues with zero or positive real part, (5.4) may still be BIBO stable, as the next example shows.

EXAMPLE 5.2  Consider the network shown in Fig. 4.2(b). Its state equation was derived in Example 4.4 as

    ẋ(t) = x(t) + 0 · u(t)
    y(t) = 0.5x(t) + 0.5u(t)   (5.6)

The A-matrix is 1 and its eigenvalue is 1. It has a positive real part. The transfer function of the equation is

    ĝ(s) = 0.5(s − 1)⁻¹ · 0 + 0.5 = 0.5

The transfer function equals 0.5. It has no pole and no condition to meet. Thus (5.6) is BIBO stable even though it has an eigenvalue with a positive real part. We mention that BIBO stability does not say anything about the zero-input response, which will be discussed later.

5.2.1 Discrete-Time Case

Consider a discrete-time SISO system described by

    y[k] = Σ_{m=0}^{k} g[k − m]u[m] = Σ_{m=0}^{k} g[m]u[k − m]   (5.7)

where g[k] is the impulse response sequence or the output sequence excited by an impulse sequence applied at k = 0. Recall that in order to be describable by (5.7), the discrete-time system must be linear, time-invariant, and causal. In addition, the system must be initially relaxed at k = 0.

An input sequence u[k] is said to be bounded if u[k] does not grow to positive or negative infinity or there exists a constant um such that

    |u[k]| ≤ um < ∞  for k = 0, 1, 2, ...

A system is said to be BIBO stable (bounded-input bounded-output stable) if every bounded-input sequence excites a bounded-output sequence. This stability is defined for the zero-state response and is applicable only if the system is initially relaxed.

Theorem 5.D1
A discrete-time SISO system described by (5.7) is BIBO stable if and only if g[k] is absolutely summable in [0, ∞), or

    Σ_{k=0}^{∞} |g[k]| ≤ M < ∞

for some constant M.

Its proof is similar to the proof of Theorem 5.1 and will not be repeated. We give a discrete counterpart of Theorem 5.2 in the following.

Theorem 5.D2
If a discrete-time system with impulse response sequence g[k] is BIBO stable, then, as k → ∞:

1. The output excited by u[k] = a, for k ≥ 0, approaches ĝ(1) · a.
2. The output excited by u[k] = sin ω0k, for k ≥ 0, approaches

    |ĝ(e^{jω0})| sin(ω0k + ∠ĝ(e^{jω0}))

where ĝ(z) is the z-transform of g[k] or

    ĝ(z) = Σ_{m=0}^{∞} g[m]z^{−m}   (5.8)

Proof: If u[k] = a for all k ≥ 0, then (5.7) becomes

    y[k] = Σ_{m=0}^{k} g[m]u[k − m] = a Σ_{m=0}^{k} g[m]

which implies

    y[k] → a Σ_{m=0}^{∞} g[m] = a ĝ(1)  as k → ∞

where we have used (5.8) with z = 1. This establishes the first part of Theorem 5.D2. If u[k] = sin ω0k, then (5.7) becomes

    y[k] = Σ_{m=0}^{k} g[m] sin ω0(k − m)
         = Σ_{m=0}^{k} g[m](sin ω0k cos ω0m − cos ω0k sin ω0m)
         = sin ω0k Σ_{m=0}^{k} g[m] cos ω0m − cos ω0k Σ_{m=0}^{k} g[m] sin ω0m

Thus we have, as k → ∞,

    y[k] → sin ω0k Σ_{m=0}^{∞} g[m] cos ω0m − cos ω0k Σ_{m=0}^{∞} g[m] sin ω0m   (5.9)

If g[k] is absolutely summable, we can replace z by e^{jω} in (5.8) to yield

    ĝ(e^{jω}) = Σ_{m=0}^{∞} g[m]e^{−jωm} = Σ_{m=0}^{∞} g[m][cos ωm − j sin ωm]

Thus (5.9) becomes

    y[k] → sin ω0k (Re[ĝ(e^{jω0})]) + cos ω0k (Im[ĝ(e^{jω0})]) = |ĝ(e^{jω0})| sin(ω0k + ∠ĝ(e^{jω0}))

This completes the proof of Theorem 5.D2. Q.E.D.

Theorem 5.D2 is a basic result in digital signal processing. Next we state the BIBO stability condition in terms of discrete proper rational transfer functions.

Theorem 5.D3
A discrete-time SISO system with proper rational transfer function ĝ(z) is BIBO stable if and only if every pole of ĝ(z) has a magnitude less than 1 or, equivalently, lies inside the unit circle on the z-plane.

If ĝ(z) has pole pi with multiplicity mi, then its partial fraction expansion contains the factors

    1/(z − pi),  1/(z − pi)², ...,  1/(z − pi)^{mi}

Thus the inverse z-transform of ĝ(z) or the impulse response sequence contains the factors

    piᵏ,  k piᵏ, ...,  k^{mi−1} piᵏ

It is straightforward to verify that every such term is absolutely summable if and only if pi has a magnitude less than 1. Using this fact, we can establish Theorem 5.D3.

For multivariable discrete-time systems, we have the following results.

Theorem 5.MD1
A MIMO discrete-time system with impulse response sequence matrix G[k] = [gij[k]] is BIBO stable if and only if every gij[k] is absolutely summable.

Theorem 5.MD3
A MIMO discrete-time system with discrete proper rational transfer matrix Ĝ(z) = [ĝij(z)] is BIBO stable if and only if every pole of every ĝij(z) has a magnitude less than 1.

We now discuss the BIBO stability of discrete-time state equations. Consider

    x[k + 1] = Ax[k] + Bu[k]
    y[k] = Cx[k] + Du[k]   (5.10)

Its discrete transfer matrix is

    Ĝ(z) = C(zI − A)⁻¹B + D

Thus Equation (5.10) or, to be more precise, the zero-state response of (5.10) is BIBO stable if and only if every pole of Ĝ(z) has a magnitude less than 1. As in the continuous-time case, every pole of Ĝ(z) is an eigenvalue of A; thus if every eigenvalue of A has a magnitude less than 1, then (5.10) is BIBO stable. Because of possible cancellations, even if A has some eigenvalues with magnitude 1 or larger, (5.10) may still be BIBO stable.

In the continuous-time case, an absolutely integrable function f(t), as shown in Fig. 5.1, may not be bounded and may not approach zero as t → ∞. In the discrete-time case, if g[k] is absolutely summable, then it must be bounded and approach zero as k → ∞. However, the converse is not true, as the next example shows.
EXAMPLE 5.3  Consider a discrete-time LTI system with impulse response sequence g[k] = 1/k for k = 1, 2, ..., and g[0] = 0. We compute

    S := Σ_{k=0}^{∞} |g[k]| = Σ_{k=1}^{∞} 1/k = 1 + 1/2 + 1/3 + 1/4 + ···
       = 1 + 1/2 + (1/3 + 1/4) + (1/5 + ··· + 1/8) + (1/9 + ··· + 1/16) + ···

There are two terms, each 1/4 or larger, in the first pair of parentheses; therefore their sum is larger than 1/2. There are four terms, each 1/8 or larger, in the second pair of parentheses; therefore their sum is larger than 1/2. Proceeding forward, we conclude

    S > 1 + 1/2 + 1/2 + 1/2 + ··· = ∞

This impulse response sequence is bounded and approaches 0 as k → ∞ but is not absolutely summable. Thus the discrete-time system is not BIBO stable according to Theorem 5.D1. The transfer function of the system can be shown to equal

    ĝ(z) = Σ_{k=1}^{∞} z^{−k}/k = −ln(1 − z⁻¹)

It is not a rational function of z and Theorem 5.D3 is not applicable.

5.3 Internal Stability

The BIBO stability is defined for the zero-state response. Now we study the stability of the zero-input response or the response of

    ẋ(t) = Ax(t)   (5.11)

excited by a nonzero initial state x0. Clearly, the solution of (5.11) is

    x(t) = e^{At}x0   (5.12)
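The zero-input response (5.12) can be computed directly from the matrix exponential. For a stable A every entry of e^{At} decays, so any initial state excites a response that approaches zero. The 2 × 2 matrix and initial state below are illustrative choices (not from the text), with a Taylor-series matrix exponential:

```python
def expm2(A, terms=80):                    # Taylor series e^A for a 2x2 matrix
    E = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        T = [[sum(T[i][k] * A[k][j] for k in range(2)) / n for j in range(2)] for i in range(2)]
        E = [[E[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return E

A = [[0.0, 1.0], [-2.0, -3.0]]             # eigenvalues -1 and -2: stable
x0 = [3.0, -1.0]

def x(t):                                   # x(t) = e^{At} x0
    M = expm2([[A[i][j] * t for j in range(2)] for i in range(2)])
    return [M[i][0] * x0[0] + M[i][1] * x0[1] for i in range(2)]

n0 = max(abs(v) for v in x(0.0))
n1 = max(abs(v) for v in x(1.0))
n5 = max(abs(v) for v in x(5.0))
print(n5 < n1 < n0)  # True: the response decays toward zero
```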
Definition 5.1
The zero-input response of (5.4) or the equation ẋ = Ax is marginally stable or stable in the sense of Lyapunov if every finite initial state x0 excites a bounded response. It is asymptotically stable if every finite initial state excites a bounded response, which, in addition, approaches 0 as t → ∞.

We mention that this definition is applicable only to linear systems. The definition that is applicable to both linear and nonlinear systems must be defined using the concept of equivalence states and can be found, for example, in Reference [6, pp. 401–403]. This text studies only linear systems; therefore we use the simplified Definition 5.1.

Theorem 5.4
1. The equation ẋ = Ax is marginally stable if and only if all eigenvalues of A have zero or negative real parts and those with zero real parts are simple roots of the minimal polynomial of A.
2. The equation ẋ = Ax is asymptotically stable if and only if all eigenvalues of A have negative real parts.

We first mention that any (algebraic) equivalence transformation will not alter the stability of a state equation. Consider x̄ = Px, where P is a nonsingular matrix. Then ẋ = Ax is equivalent to x̄̇ = Āx̄ = PAP⁻¹x̄. Because P is nonsingular, if x is bounded, so is x̄; if x approaches 0 as t → ∞, so does x̄. Thus we may study the stability of A by using Ā. Note that the eigenvalues of A and of Ā are the same, as discussed in Section 4.3.

The response of ẋ = Ax excited by x(0) equals x(t) = e^{At}x(0). It is clear that the response is bounded if and only if every entry of e^{At} is bounded for all t ≥ 0. If Ā is in Jordan form, then e^{Āt} is of the form shown in (3.48). Using (3.48), we can show that if an eigenvalue has a negative real part, then every entry of (3.48) is bounded and approaches 0 as t → ∞. If an eigenvalue has zero real part and has no Jordan block of order 2 or higher, then the corresponding entry in (3.48) is a constant or is sinusoidal for all t and is, therefore, bounded. This establishes the sufficiency of the first part of Theorem 5.4. If A has an eigenvalue with a positive real part, then every entry in (3.48) will grow without bound. If A has an eigenvalue with zero real part and its Jordan block has order 2 or higher, then (3.48) has at least one entry that grows without bound. This completes the proof of the first part. To be asymptotically stable, every entry of (3.48) must approach zero as t → ∞. Thus no eigenvalue with zero real part is permitted. This establishes the second part of the theorem.

EXAMPLE 5.4  Consider

    ẋ = [0 0 0 ; 0 0 0 ; 0 0 −1] x

Its characteristic polynomial is Δ(λ) = λ²(λ + 1) and its minimal polynomial is ψ(λ) = λ(λ + 1). The matrix has eigenvalues 0, 0, and −1. The eigenvalue 0 is a simple root of the minimal polynomial. Thus the equation is marginally stable. The equation

    ẋ = [0 1 0 ; 0 0 0 ; 0 0 −1] x

is not marginally stable, however, because its minimal polynomial is λ²(λ + 1) and λ = 0 is not a simple root of the minimal polynomial.

As discussed earlier, every pole of the transfer matrix

    Ĝ(s) = C(sI − A)⁻¹B + D

is an eigenvalue of A. Thus asymptotic stability implies BIBO stability. Note that asymptotic stability is defined for the zero-input response, whereas BIBO stability is defined for the zero-state response. The system in Example 5.2 has eigenvalue 1 and is not asymptotically stable; however, it is BIBO stable. Thus BIBO stability, in general, does not imply asymptotic stability. We mention that marginal stability is useful only in the design of oscillators. Other than oscillators, every physical system is designed to be asymptotically stable or BIBO stable with some additional conditions, as we will discuss in Chapter 7.

5.3.1 Discrete-Time Case

This subsection studies the internal stability of discrete-time systems or the stability of

    x[k + 1] = Ax[k]   (5.13)

excited by a nonzero initial state x0. The solution of (5.13) is, as derived in (4.20),

    x[k] = Aᵏx0   (5.14)

Equation (5.13) is said to be marginally stable or stable in the sense of Lyapunov if every finite initial state x0 excites a bounded response. It is asymptotically stable if every finite initial state excites a bounded response, which, in addition, approaches 0 as k → ∞. These definitions are identical to the continuous-time case.

Theorem 5.D4
1. The equation x[k + 1] = Ax[k] is marginally stable if and only if all eigenvalues of A have magnitudes less than or equal to 1 and those equal to 1 are simple roots of the minimal polynomial of A.
2. The equation x[k + 1] = Ax[k] is asymptotically stable if and only if all eigenvalues of A have magnitudes less than 1.

As in the continuous-time case, any (algebraic) equivalence transformation will not alter the stability of a state equation. Thus we can use the Jordan form to establish the theorem. The proof is similar to the continuous-time case and will not be repeated. Asymptotic stability
STABILITY implies BIBO stability but not the converse. We mention that marginal stability is useful only in the design of discretetime oscillators. Other than oscillators, every discretetime physical system is designed to be asymptotically stable or BlBO stable with some additional conditions, as we will discuss in Chapter 7.
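Part 2 of Theorem 5.D4 lends itself to a quick numerical illustration (the matrix below is an assumed example, not from the text): for a triangular A the eigenvalues are the diagonal entries, and when their magnitudes are below 1 the response x[k] = Aᵏx₀ dies out.

```python
# Illustrative matrix (not from the text): triangular, so its eigenvalues
# are the diagonal entries 0.5 and 0.8, both with magnitude < 1.
A = [[0.5, 1.0], [0.0, 0.8]]

def step(A, x):
    # one step of x[k+1] = A x[k]
    return [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

eigs = [A[0][0], A[1][1]]
assert all(abs(lam) < 1 for lam in eigs)  # hypothesis of Theorem 5.D4, part 2

x = [10.0, -3.0]
for k in range(200):
    x = step(A, x)

norm = max(abs(v) for v in x)
print(norm < 1e-6)  # True: the response approaches 0, i.e., asymptotic stability
```

Replacing 0.8 by 1.0 on the diagonal would leave the response bounded but not decaying, matching the marginal-stability case of part 1.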
5.4 Lyapunov Theorem

This section introduces a different method of checking asymptotic stability of ẋ = Ax. For convenience, we call A stable if every eigenvalue of A has a negative real part.

Theorem 5.5
All eigenvalues of A have negative real parts if and only if for any given positive definite symmetric matrix N, the Lyapunov equation

    A'M + MA = -N    (5.15)

has a unique symmetric solution M and M is positive definite.

Corollary 5.5
All eigenvalues of an n x n matrix A have negative real parts if and only if for any given m x n matrix N̄ with m < n and with the property

    rank O := rank [N̄; N̄A; ...; N̄A^{n-1}] = n  (full column rank)    (5.16)

where O is an nm x n matrix, the Lyapunov equation

    A'M + MA = -N̄'N̄ =: -N    (5.17)

has a unique symmetric solution M and M is positive definite.

For any N̄, the matrix N in (5.17) is positive semidefinite (Theorem 3.7). Theorem 5.5 and its corollary are valid for any given N; therefore we shall use the simplest possible N. Even so, using them to check the stability of A is not simple. It is much simpler to compute, using MATLAB, the eigenvalues of A and then check their real parts. Thus the importance of Theorem 5.5 and its corollary is not in checking the stability of A but rather in studying the stability of nonlinear systems. They are essential in using the so-called second method of Lyapunov. We mention that Corollary 5.5 can be used to prove the Routh-Hurwitz test. See Reference [6, pp. 417-419].

Proof of Theorem 5.5  Necessity: Equation (5.15) is a special case of (3.59) with A = A' and B = A. Because A and A' have the same set of eigenvalues, if A is stable, A has no two eigenvalues such that λᵢ + λⱼ = 0. Thus the Lyapunov equation is nonsingular and has a unique solution M for any N. We claim that the solution can be expressed as

    M = ∫₀^∞ e^{A't} N e^{At} dt    (5.18)

Indeed, substituting (5.18) into (5.15) yields

    A'M + MA = ∫₀^∞ A'e^{A't} N e^{At} dt + ∫₀^∞ e^{A't} N e^{At} A dt
             = ∫₀^∞ (d/dt)(e^{A't} N e^{At}) dt = e^{A't} N e^{At} |₀^∞ = 0 - N = -N    (5.19)

where we have used the fact e^{At} = 0 at t = ∞ for stable A. This shows that the M in (5.18) is the solution. It is clear that if N is symmetric, so is M. Let us decompose N as N = N̄'N̄, where N̄ is nonsingular (Theorem 3.7), and consider

    x'Mx = ∫₀^∞ x'e^{A't} N̄'N̄ e^{At} x dt = ∫₀^∞ ||N̄ e^{At} x||₂² dt    (5.20)

Because both N̄ and e^{At} are nonsingular, for any nonzero x, the integrand of (5.20) is positive for every t. Thus x'Mx is positive for any x ≠ 0. This shows the positive definiteness of M.

Sufficiency: We show that if N and M are positive definite, then A is stable. Let λ be an eigenvalue of A and v ≠ 0 be a corresponding eigenvector; that is, Av = λv. Even though A is a real matrix, its eigenvalue and eigenvector can be complex, as shown in Example 3.6. Taking the complex-conjugate transpose of Av = λv yields v*A* = v*A' = λ*v*, where the asterisk denotes complex-conjugate transpose. Premultiplying v* and postmultiplying v to (5.15) yields

    -v*Nv = v*A'Mv + v*MAv = (λ* + λ)v*Mv = 2Re(λ)v*Mv    (5.21)

Because v*Mv and v*Nv are, as discussed in Section 3.9, both real and positive, (5.21) implies Re(λ) < 0. This shows that every eigenvalue of A has a negative real part. Q.E.D.

The proof of Corollary 5.5 follows the proof of Theorem 5.5 with some modification. We discuss only where the proof of Theorem 5.5 is not applicable. Consider (5.20). Now N̄ is m x n with m < n and N = N̄'N̄ is positive semidefinite. Even so, M in (5.18) can still be positive definite if the integrand of (5.20) is not identically zero for all t. Suppose the integrand of (5.20) is identically zero or N̄e^{At}x ≡ 0. Then its derivative with respect to t yields N̄Ae^{At}x = 0. Proceeding forward, we can obtain

    [N̄; N̄A; ...; N̄A^{n-1}] e^{At} x = 0    (5.22)

This equation implies that, because of (5.16) and the nonsingularity of e^{At}, the only x meeting (5.22) is 0. Thus the integrand of (5.20) cannot be identically zero for any x ≠ 0. Thus M is positive definite under the condition in (5.16). This shows the necessity of Corollary 5.5. Next we consider (5.21) with N = N̄'N̄ or¹

    -2Re(λ)v*Mv = v*N̄'N̄v = ||N̄v||₂²    (5.23)

We show that N̄v is nonzero under (5.16). Because of Av = λv, we have A²v = λAv = λ²v, ..., A^{n-1}v = λ^{n-1}v. Consider

    [N̄; N̄A; ...; N̄A^{n-1}] v = [N̄v; λN̄v; ...; λ^{n-1}N̄v]

If N̄v = 0, the rightmost matrix is zero; the leftmost matrix, however, is nonzero under the conditions of (5.16) and v ≠ 0. This is a contradiction. Thus N̄v is nonzero and (5.23) implies Re(λ) < 0. This completes the proof of Corollary 5.5.

1. Note that if x is a complex vector, then the Euclidean norm defined in Section 3.2 must be modified as ||x||₂² = x*x, where x* is the complex-conjugate transpose of x.

In the proof of Theorem 5.5, we have established the following result. For easy reference, we state it as a theorem.

Theorem 5.6
If all eigenvalues of A have negative real parts, then the Lyapunov equation

    A'M + MA = -N

has a unique solution for every N, and the solution can be expressed as

    M = ∫₀^∞ e^{A't} N e^{At} dt    (5.24)

Because of the importance of this theorem, we give a different proof of the uniqueness of the solution. Suppose there are two solutions M₁ and M₂. Then we have

    A'(M₁ - M₂) + (M₁ - M₂)A = 0

which implies

    e^{A't}[A'(M₁ - M₂) + (M₁ - M₂)A]e^{At} = (d/dt)[e^{A't}(M₁ - M₂)e^{At}] = 0

Its integration from 0 to ∞ yields

    [e^{A't}(M₁ - M₂)e^{At}]|₀^∞ = 0

or, using e^{At} → 0 as t → ∞,

    0 - (M₁ - M₂) = 0

This shows the uniqueness of M. Although the solution can be expressed as in (5.24), the integration is not used in computing the solution. It is simpler to arrange the Lyapunov equation, after some transformations, into a standard linear algebraic equation as in (3.60) and then solve the equation. Note that even if A is not stable, a unique solution still exists if A has no two eigenvalues such that λᵢ + λⱼ = 0. The solution, however, cannot be expressed as in (5.24); the integration will diverge and is meaningless. If A is singular or, equivalently, has at least one zero eigenvalue, then the Lyapunov equation is always singular and solutions may or may not exist depending on whether or not N lies in the range space of the equation.

5.4.1 Discrete-Time Case

Before discussing the discrete counterpart of Theorems 5.5 and 5.6, we discuss the discrete counterpart of the Lyapunov equation in (3.59). Consider

    M - AMB = C    (5.25)

where A and B are, respectively, n x n and m x m matrices, and M and C are n x m matrices. As in (3.60), Equation (5.25) can be expressed as Ym = c, where Y is an nm x nm matrix; m and c are nm x 1 column vectors with the m columns of M and C stacked in order. Thus (5.25) is essentially a set of linear algebraic equations. Let ηₖ be an eigenvalue of Y or of (5.25). Then we have

    ηₖ = 1 - λᵢμⱼ    for i = 1, 2, ..., n;  j = 1, 2, ..., m

where λᵢ and μⱼ are, respectively, the eigenvalues of A and B. This can be established intuitively as follows. Let us define A(M) := M - AMB. Then (5.25) can be written as A(M) = C. A scalar η is an eigenvalue of A if there exists a nonzero M such that A(M) = ηM. Let u be an n x 1 right eigenvector of A associated with λᵢ; that is, Au = λᵢu. Let v be a 1 x m left eigenvector of B associated with μⱼ; that is, vB = μⱼv. Applying A to the n x m nonzero matrix uv yields

    A(uv) = uv - AuvB = (1 - λᵢμⱼ)uv

Thus the eigenvalues of (5.25) are 1 - λᵢμⱼ, for all i and j. If there are no i and j such that λᵢμⱼ = 1, then (5.25) is nonsingular and, for any C, a unique solution M exists in (5.25). If λᵢμⱼ = 1 for some i and j, then (5.25) is singular and, for a given C, solutions may or may not exist. The situation here is similar to what was discussed in Section 3.7.

Theorem 5.D5
All eigenvalues of an n x n matrix A have magnitudes less than 1 if and only if for any given positive definite symmetric matrix N, or for N = N̄'N̄, where N̄ is any given m x n matrix with m < n and with the property in (5.16), the discrete Lyapunov equation

    M - A'MA = N    (5.26)

has a unique symmetric solution M and M is positive definite.
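As remarked above, in practice the Lyapunov equation is not integrated; it is rearranged into a standard linear algebraic equation in the entries of M, as in (3.60), and solved. A pure-Python sketch of that route for an assumed 2 x 2 stable A (illustrative values, not from the text), using the column-major identity vec(A'M + MA) = (I (x) A' + A' (x) I) vec(M):

```python
# Rewrite A'M + MA = -N as a 4x4 linear system K vec(M) = vec(-N),
# where K = I (x) A' + A' (x) I (Kronecker products, column-major vec),
# then solve by Gauss-Jordan elimination.
def kron(X, Y):
    n, m = len(X), len(Y)
    return [[X[i // m][j // m] * Y[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def solve(K, rhs):
    # plain Gauss-Jordan elimination with partial pivoting
    n = len(K)
    M = [row[:] + [rhs[i]] for i, row in enumerate(K)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

A = [[0.0, 1.0], [-2.0, -3.0]]   # illustrative stable A (eigenvalues -1 and -2)
At = [[0.0, -2.0], [1.0, -3.0]]  # A'
I = [[1.0, 0.0], [0.0, 1.0]]

# vec(A'M) = (I (x) A') vec M ; vec(MA) = (A' (x) I) vec M  (column-major vec)
K = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(kron(I, At), kron(At, I))]
m = solve(K, [-1.0, 0.0, 0.0, -1.0])  # vec(-N) with N = I (positive definite)
M11, M21, M12, M22 = m
print(round(M11, 6), round(M12, 6), round(M22, 6))  # 1.25 0.25 0.25
assert M11 > 0 and M11 * M22 - M12 * M21 > 0  # leading principal minors > 0
```

The solution is symmetric and positive definite, as Theorem 5.5 predicts for a stable A and N = I; the same vectorization idea underlies the discrete equation (5.26).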
We sketch briefly its proof for N > 0. If all eigenvalues of A and, consequently, of A' have magnitudes less than 1, then we have |λᵢλⱼ| < 1 for all i and j. Thus λᵢλⱼ ≠ 1 and (5.26) is nonsingular. Therefore, for any N, a unique solution exists in (5.26). We claim that the solution can be expressed as

    M = Σ_{m=0}^∞ (A')ᵐ N Aᵐ    (5.27)

Because |λᵢ| < 1 for all i, this infinite series converges and is well defined. Substituting (5.27) into (5.26) yields

    M - A'MA = N + Σ_{m=1}^∞ (A')ᵐ N Aᵐ - Σ_{m=1}^∞ (A')ᵐ N Aᵐ = N

Thus (5.27) is the solution. If N is symmetric, so is M. If N is positive definite, so is M. This establishes the necessity. To show sufficiency, let λ be an eigenvalue of A and v ≠ 0 be a corresponding eigenvector; that is, Av = λv. Then we have

    v*Nv = v*Mv - v*A'MAv = v*Mv - λ*λv*Mv = (1 - |λ|²)v*Mv

Because both v*Nv and v*Mv are real and positive, we conclude (1 - |λ|²) > 0 or |λ|² < 1. This establishes the theorem for N > 0. The case N ≥ 0 can similarly be established.

Theorem 5.D6
If all eigenvalues of A have magnitudes less than 1, then the discrete Lyapunov equation

    M - A'MA = N

has a unique solution for every N, and the solution can be expressed as

    M = Σ_{m=0}^∞ (A')ᵐ N Aᵐ

It is important to mention that even if A has one or more eigenvalues with magnitudes larger than 1, a unique solution still exists in the discrete Lyapunov equation if λᵢλⱼ ≠ 1 for all i and j. In this case, the solution cannot be expressed as in (5.27) but can be computed from a set of linear algebraic equations.

Let us discuss the relationships between the continuous-time and discrete-time Lyapunov equations. The stability condition for continuous-time systems is that all eigenvalues lie inside the open left-half s-plane. The stability condition for discrete-time systems is that all eigenvalues lie inside the unit circle on the z-plane. These conditions can be related by the bilinear transformation

    s = (z - 1)/(z + 1)    z = (1 + s)/(1 - s)    (5.28)

which maps the left-half s-plane into the interior of the unit circle on the z-plane and vice versa. To differentiate the continuous-time and discrete-time cases, we write

    A'M + MA = -N    (5.29)

and

    M_d - A_d'M_dA_d = N_d    (5.30)

Following (5.28), these two equations can be related by

    A = (A_d + I)⁻¹(A_d - I)    N = 2(A_d' + I)⁻¹N_d(A_d + I)⁻¹    (5.31)

Substituting the right-hand-side equations into (5.29) and performing a simple manipulation, we recover (5.30) with M = M_d. These relate (5.29) and (5.30). The MATLAB function lyap computes the Lyapunov equation in (5.29) and dlyap computes the discrete Lyapunov equation in (5.30). The function dlyap transforms (5.30) into (5.29) by using (5.31) and then calls lyap. The result yields M = M_d.

5.5 Stability of LTV Systems

Consider a SISO linear time-varying (LTV) system described by

    y(t) = ∫_{t₀}^t g(t, τ)u(τ) dτ    (5.32)

The system is said to be BIBO stable if every bounded input excites a bounded output. The condition for (5.32) to be BIBO stable is that there exists a finite constant M such that

    ∫_{t₀}^t |g(t, τ)| dτ ≤ M < ∞    (5.33)

for all t and t₀ with t ≥ t₀. The proof in the time-invariant case applies here with only minor modification. For the multivariable case, (5.32) becomes

    y(t) = ∫_{t₀}^t G(t, τ)u(τ) dτ    (5.34)

The condition for (5.34) to be BIBO stable is that every entry of G(t, τ) meets the condition in (5.33). For multivariable systems, we can also express the condition in terms of norms. Any norm discussed in Section 3.11 can be used. However, the infinite-norm

    ||u||_∞ = max_i |u_i|    ||G||_∞ = largest row absolute sum
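The matrix version of the bilinear map (5.28) can be checked numerically. In the sketch below the discrete-time matrix A_d is an assumed example (not from the text): it is stable in the discrete sense, and A = (A_d + I)⁻¹(A_d − I) comes out stable in the continuous sense.

```python
# Illustrative lower-triangular A_d with eigenvalues 0.5 and -0.3 (magnitudes < 1).
Ad = [[0.5, 0.0], [0.2, -0.3]]

def add_eye(X, s):
    # X + s*I for a 2x2 matrix
    return [[X[0][0] + s, X[0][1]], [X[1][0], X[1][1] + s]]

def inv2(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]

def mul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A = (Ad + I)^{-1}(Ad - I), the matrix version of s = (z - 1)/(z + 1)
A = mul2(inv2(add_eye(Ad, 1.0)), add_eye(Ad, -1.0))

# Both factors are lower triangular, so A is too: its eigenvalues sit on the
# diagonal and equal (z - 1)/(z + 1) evaluated at z = 0.5 and z = -0.3.
print(round(A[0][0], 4), round(A[1][1], 4))  # -0.3333 -1.8571
assert A[0][0] < 0 and A[1][1] < 0  # negative real parts: continuous-time stable
```

Each discrete eigenvalue inside the unit circle is mapped into the open left-half plane, which is exactly why dlyap can delegate to lyap through (5.31).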
is probably the most convenient to use in stability study. For convenience, no subscript will be attached to any norm. The necessary and sufficient condition for (5.34) to be BIBO stable is that there exists a finite constant M such that

    ∫_{t₀}^t ||G(t, τ)|| dτ ≤ M < ∞

for all t and t₀ with t ≥ t₀.

The impulse response matrix of

    ẋ = A(t)x + B(t)u
    y = C(t)x + D(t)u

is

    G(t, τ) = C(t)Φ(t, τ)B(τ) + D(t)δ(t - τ)    (5.35)

and the zero-state response is

    y(t) = ∫_{t₀}^t C(t)Φ(t, τ)B(τ)u(τ) dτ + D(t)u(t)

Thus (5.35) or, more precisely, the zero-state response of (5.35) is BIBO stable if and only if there exist constants M₁ and M₂ such that ||D(t)|| ≤ M₁ < ∞ and

    ∫_{t₀}^t ||G(t, τ)|| dτ ≤ M₂ < ∞

for all t and t₀ with t ≥ t₀.

Next we study the stability of the zero-input response of (5.35). As in the time-invariant case, we define the zero-input response of (5.35) or the equation ẋ = A(t)x to be marginally stable if every finite initial state excites a bounded response. Because the response is governed by

    x(t) = Φ(t, t₀)x(t₀)    (5.36)

we conclude that the response is marginally stable if and only if there exists a finite constant M such that

    ||Φ(t, t₀)|| ≤ M < ∞    (5.37)

for all t₀ and for all t ≥ t₀. The equation ẋ = A(t)x is asymptotically stable if the response excited by every finite initial state is bounded and approaches zero as t → ∞. The asymptotic stability conditions are the boundedness condition in (5.37) and

    ||Φ(t, t₀)|| → 0 as t → ∞    (5.38)

A great deal can be said regarding these definitions and conditions. Does the constant M in (5.37) depend on t₀? What is the rate for the state transition matrix to approach 0 in (5.38)? The interested reader is referred to References [4, 15]. A time-invariant equation ẋ = Ax is asymptotically stable if all eigenvalues of A have negative real parts. Is this also true for the time-varying case? The answer is negative as the next example shows.

EXAMPLE 5.5  Consider the linear time-varying equation

    ẋ = A(t)x = [-1 e^{2t}; 0 -1] x    (5.39)

The characteristic polynomial of A(t) is

    det(λI - A(t)) = det [λ+1  -e^{2t}; 0  λ+1] = (λ + 1)²

Thus A(t) has eigenvalues -1 and -1 for all t. It can be verified directly that

    Φ(t, 0) = [e^{-t}  0.5(e^t - e^{-t}); 0  e^{-t}]

meets (4.53) and is therefore the state transition matrix of (5.39). See also Problem 4.16. Because the (1,2)th entry of Φ grows without bound, the equation is neither asymptotically stable nor marginally stable. This example shows that even though the eigenvalues can be defined for A(t) at every t, the concept of eigenvalues is not useful in the time-varying case.

All stability properties in the time-invariant case are invariant under any equivalence transformation. In the time-varying case, this is so only for BIBO stability, because the impulse response matrix is preserved. An equivalence transformation can transform, as shown in Theorem 4.3, any ẋ = A(t)x into d x̄/dt = A_o x̄, where A_o is any constant matrix; therefore, in the time-varying case, marginal and asymptotic stabilities are not invariant under any equivalence transformation.

Theorem 5.7
Marginal and asymptotic stabilities of ẋ = A(t)x are invariant under any Lyapunov transformation.

As discussed in Section 4.6, if P(t) and Ṗ(t) are continuous, and P(t) is nonsingular for all t, then x̄ = P(t)x is an algebraic transformation. If, in addition, P(t) and P⁻¹(t) are bounded for all t, then x̄ = P(t)x is a Lyapunov transformation. The fundamental matrix X(t) of ẋ = A(t)x and the fundamental matrix X̄(t) of the transformed equation are related by, as derived in (4.71),

    X̄(t) = P(t)X(t)

which implies

    Φ̄(t, τ) = X̄(t)X̄⁻¹(τ) = P(t)X(t)X⁻¹(τ)P⁻¹(τ) = P(t)Φ(t, τ)P⁻¹(τ)    (5.40)
Because both P(t) and P⁻¹(t) are bounded, if ||Φ(t, τ)|| is bounded, so is ||Φ̄(t, τ)||; if ||Φ(t, τ)|| → 0 as t → ∞, so is ||Φ̄(t, τ)||. This establishes Theorem 5.7.

In the time-invariant case, asymptotic stability of zero-input responses always implies BIBO stability of zero-state responses. This is not necessarily so in the time-varying case. A time-varying equation is asymptotically stable if

    ||Φ(t, t₀)|| → 0 as t → ∞    (5.41)

for all t, t₀ with t ≥ t₀. It is BIBO stable if

    ∫_{t₀}^t ||C(t)Φ(t, τ)B(τ)|| dτ < ∞    (5.42)

for all t, t₀ with t ≥ t₀. A function that approaches 0 as t → ∞ may not be absolutely integrable. Thus asymptotic stability may not imply BIBO stability in the time-varying case. However, if ||Φ(t, τ)|| decreases to zero rapidly, in particular, exponentially, and if C(t) and B(t) are bounded for all t, then asymptotic stability does imply BIBO stability. See References [4, 6, 15].

PROBLEMS

5.1  Is the network shown in Fig. 5.2 BIBO stable? If not, find a bounded input that will excite an unbounded output.

Figure 5.2

5.2  Consider a system with an irrational transfer function ĝ(s). Show that a necessary condition for the system to be BIBO stable is that |ĝ(s)| is finite for all Re s ≥ 0.

5.3  Is a system with impulse response g(t) = 1/(t + 1) BIBO stable? How about g(t) = te⁻ᵗ for t ≥ 0?

5.4  Is a system with transfer function ĝ(s) = e⁻²ˢ/(s + 1) BIBO stable?

5.5  Show that the negative-feedback system shown in Fig. 2.5(b) is BIBO stable if and only if the gain a has a magnitude less than 1. For a = 1, find a bounded input r(t) that will excite an unbounded output.

5.6  Consider a system with transfer function ĝ(s) = (s - 2)/(s + 1). What are the steady-state responses excited by u(t) = 3, for t ≥ 0, and by u(t) = sin 2t, for t ≥ 0?

5.7  Consider

    ẋ = [-1 10; 0 1] x + [-2; 0] u
    y = [-2 3] x - 2u

Is it BIBO stable?

5.8  Consider a discrete-time system with impulse response sequence

    g[k] = k(0.5)ᵏ    for k ≥ 0

Is the system BIBO stable?

5.9  Is the state equation in Problem 5.7 marginally stable? Asymptotically stable?

5.10  Is the homogeneous state equation

    ẋ = [-1 0 1; 0 0 0; 0 0 0] x

marginally stable? Asymptotically stable?

5.11  Is the homogeneous state equation

    ẋ = [-1 0 1; 0 0 1; 0 0 0] x

marginally stable? Asymptotically stable?

5.12  Is the discrete-time homogeneous state equation

    x[k + 1] = [0.9 0 1; 0 1 0; 0 0 1] x[k]

marginally stable? Asymptotically stable?

5.13  Is the discrete-time homogeneous state equation

    x[k + 1] = [0.9 0 1; 0 1 1; 0 0 1] x[k]

marginally stable? Asymptotically stable?

5.14  Use Theorem 5.5 to show that all eigenvalues of

    A = [0 1; -0.5 -1]

have negative real parts.

5.15  Use Theorem 5.D5 to show that all eigenvalues of the A in Problem 5.14 have magnitudes less than 1.

5.16  For any distinct negative real λᵢ and any nonzero real aᵢ, show that the matrix

    M = [ -a₁²/(2λ₁)       -a₁a₂/(λ₁+λ₂)    -a₁a₃/(λ₁+λ₃) ;
          -a₂a₁/(λ₂+λ₁)    -a₂²/(2λ₂)       -a₂a₃/(λ₂+λ₃) ;
          -a₃a₁/(λ₃+λ₁)    -a₃a₂/(λ₃+λ₂)    -a₃²/(2λ₃) ]

is positive definite. [Hint: Use Corollary 5.5 and A = diag(λ₁, λ₂, λ₃).]

5.17  A real matrix M (not necessarily symmetric) is defined to be positive definite if x'Mx > 0 for any nonzero x. Is it true that the matrix M is positive definite if all eigenvalues of M are real and positive or if all its leading principal minors are positive? If not, how do you check its positive definiteness? [Hint: Try M = [0 1; -2 3].]

5.18  Show that all eigenvalues of A have real parts less than -μ < 0 if and only if, for any given positive definite symmetric matrix N, the equation

    A'M + MA + 2μM = -N

has a unique symmetric solution M and M is positive definite.

5.19  Show that all eigenvalues of A have magnitudes less than ρ if and only if, for any given positive definite symmetric matrix N, the equation

    ρ²M - A'MA = ρ²N

has a unique symmetric solution M and M is positive definite.

5.20  Is a system with impulse response g(t, τ) = e^{-2|t|-|τ|}, for t ≥ τ, BIBO stable? How about g(t, τ) = sin t (e^{-(t-τ)}) cos τ?

5.21  Consider the time-varying equation

    ẋ = 2tx + u    y = x

Is the equation BIBO stable? Marginally stable? Asymptotically stable?

5.22  Show that the equation in Problem 5.21 can be transformed by using x̄ = P(t)x, with P(t) = e^{-t²}, into

    d x̄/dt = e^{-t²}u    y = e^{t²}x̄

Is the equation BIBO stable? Marginally stable? Asymptotically stable? Is the transformation a Lyapunov transformation?

5.23  Is the homogeneous equation

    ẋ = [-1 0; e^{-3t} 0] x

for t₀ ≥ 0, marginally stable? Asymptotically stable?

Chapter 6
Controllability and Observability

6.1 Introduction

This chapter introduces the concepts of controllability and observability. Controllability deals with whether or not the state of a state-space equation can be controlled from the input, and observability deals with whether or not the initial state can be observed from the output. These concepts can be illustrated using the network shown in Fig. 6.1. The network has two state variables. Let xᵢ be the voltage across the capacitor with capacitance Cᵢ, for i = 1, 2. The input u is a current source and the output y is the voltage shown. From the network, we see that, because of the open circuit across y, the input has no effect on x₂ or cannot control x₂. The current passing through the 2-Ω resistor always equals the current source u; therefore the response excited by the initial state x₁ will not appear in y. Thus the initial state x₁ cannot be observed from the output. Thus the equation describing the network cannot be controllable and observable. These concepts are essential in discussing the internal structure of linear systems. They are also needed in studying control and filtering problems. We study first continuous-time linear time-invariant (LTI) state equations and then discrete-time LTI state equations. Finally, we study the time-varying case.

Figure 6.1  Network (current source).
144
CONTROLLABILITY
AND OBSERVABILITY
6.2 Controllability and then discretetime LTI state equations. Finally, Theorem 6.1
The following statements are equivalent.
145
linear timeinvariant (LTI) state equations we study the timevarying case.
6.2 Controllability
Consider the ndimensional pinput state equation
1. The IIdimensional
2. The
II
pair (A, B) is controllable.
x n matrix 'Vert)
x = Ax
+Bu
(6.1)
is nonsingular
=
11
O.
eA'BB'eA'r
dt
=
l'
eA(tr)BB'eA,,r)dr
(6.2)
where A and B are, respectively, n x nand n x p real constant matrices. Because the output does not play any role in controllability, we will disregard the output equation in this study. Definition 6.1 The state equation (6.1) or the pair (A, B) is said to be controllable iffor any initial state x(O) = Xo and allY final state XI, there exists all input that transfers Xo to XI ill a finite time. Otherwise (6.1) or (A, B) is said to be uncontrollable. This definition requires only that the input be capable of moving any state in the state space to any other state in a finite time: what trajectory the state should take is not specified. Furthermore, there is no constraint imposed on the input; its magnitude can be as large as desired. We give an example to illustrate the concept.
EXAMPLE
for any f >
3. The n x np controllability matrix
(63)
has rank n (full row rank). 4. The
Il
x tn
+ p)
matrix [A  AI BJ has full row rank at every eigenvalue, of A have negative
A. of A.'
5. If. in addition.
all eigenvalues
real parts. then the unique solution of
AWe
is positive definite.
+ WcA'
= BB'
Gramian and can be expressed as
(6.4)
The solution is called the controllabilitv
6.1 Consider the network shown in Fig. 6.2(a). Its state variable x is the voltage
We =
1
X
eArBB' eA'r d t
(6.S)
across the capacitor. If x(O) = 0, then x(t) = 0 for all t ::: 0 no matter what input is applied. This is due to the symmetry of the network, and the input has no effect on the voltage across the capacitor. Thus the system or, more precisely, the state equation that describes the system is not controllable. Next we consider the network shown in Fig. 6.2(b). It has two state variables and X2 as shown. The input can transfer or X2 to any value; but it cannot transfer and X2 to any values. For example, if (0) = X2 (0) = 0, then no matter what input is applied, x, (I) always equals x2(1) for all t ::: O. Thus the equation that describes the network is not controllable.
Proof: (l ) (2): First we show the equivalence i := t  r . Then we have
of the two forms in (6.2). Define
x,
x,
x,
x,
l'
r=U
eAItr)BB'
eA'(tr) d t =
iT='
~o eATBB' eA'T (df)
=
Jf=O
I'
eATBB' eA'T di
1+
111 111
.t
\
It becomes the first form of (6.2) after replacing i by r , Because of the form of the integrand, Wert) is always positive semidefinite; it is positive definite if and on lv if it is nonsingular. See Section 3.9. '
of (6.1) at time
First we show that if Wert) is nonsingular, f, was derived in (4.5) as
X(I,)
then (6. I) is controllable.
The response
+
u "\.,
_I
= e"'lxW)
+
1'1
eAltlr'Bu(r)dr
(6.6)
We claim that for any x(O) 111 111
= x.,
and any X(tl)
= x"
the input
 x.]
u(t) = B'e\'illI)W;I(t,)[eAllxo
(6.7)
will transfer Xo to x, at time fl· Indeed. substituting
(a)
(6.7) into (6.6) yields
(b)
Figure
6.2
Uncontrollable
networks.
1. If A i" complex, then we must use complex numbers as scalars in checking the rank. See the discussion regarding (3.37!.
146
CONTROLLABILITY
AND
OBSERVABILITY
X(tl) =
eArl
Xo _ (1
11
eA(tlt)BB' eA'(t,t)dr
) W;:l (ll)[eAII Xo  x.]
1
6.2 Controllability This contradicts the hypothesis that C has full row rank. This shows the equivalence
147 of
= eAI1Xo We(t))w;:l(I))(eAIlxo
 xd = Xl
(2) and (3). (3) ++ (4): If C has full row rank, then the matrix (A  AI B] has full row rank at every eigenvalue of A. If not, there exists an eigenvalue Al and a I x n vector q 0 such
t=
This shows that if We is nonsingular, then the pair (A, B) is controllable. We show the converse by contradiction. Suppose the pair is controllable but We(tl) is not positive definite for some 11. Then there exists an n X I nonzero vector v such that
that
which implies qA = Alq and qB = O. Thus q is a left eigenvector qA Proceeding
2
of A. We compute
= (qA)A = (Alq)A = Aiq
forward, we have qAk = A1q. Thus we have q(B AB ... An1B] = (qB AlqB ... A71qB] =0
which implies (6.8) for all r in (0, Id. If (6.1) is controllable, there exists an input that transfers the initial state
x(o) = eAII v to X(I)) = 0 and (6.6) becomes O=v+ Its premultiplication 0= by v' yields 111eA(IItlBU(r)dr
This contradicts the hypothesis that C has full row rank. In order to show that p(C) < n implies p«(A  AI B]) < n at some eigenvalue )'1 of A, we need Theorems 6.2 and 6.6, which will be established later. Theorem 6.2 states that controllability is invaraint under any equivalence transformation. Therefore we may show p«(A  AI B]) < n at some eigenvalue of A, where (A, B) is equivalent to (A, B). Theorem 6.6 states that if the rank of C is less than n or p (C) = n  m, for some integer mo::l, then there exists a nonsingular
A
v'eA(,,t)Bu(r)dr = IIvl12 +0
= PAP
v'v } 111
1 [Ae ~12] o
= Ai'
matrix P such that B = PB =

[Be]
0
which contradicts v O. This establishes the equivalence of (I) and (2). (2) ++ (3): Because every entry of eA'B is, as discussed at the end of Section 4.2, an analytical function of t, if We (I) is nonsingular for some t, then it is nonsingular for all I in (00, 00). See Reference (6, p. 554]. Because of the equivalence of the two forms in (6.2), (6.8) implies that We(t) is nonsingular if and only if there exists no n x I nonzero vector v such that for all
I
t=
where Ai' is m x m. Let Al be an eigenvalue of Ai' and ql be a corresponding Ixm nonzero left eigenvector or qlAi' = Alql. Then we have ql (Ai'  All) = O. Now we form the 1 x n vector q := (0 qd. We compute (6.10) which implies p«(A  AI B]) < n and, consequently, p«(A  AI B]) < n at some eigenvalue of A. This establishes the equivalence of (3) and (4). (2) ++ (5): If A is stable, then the unique solution of (6.4) can be expressed as in (6.5) (Theorem 5.6). The Grarnian We is always positive semidefinite. It is positive definite if and only if We is nonsingular. This establishes the equivalence of (2) and (5). Q.E.D. EXAMPLE 6.2 Consider the inverted pendulum developed studied in Example 2.8. Its state equation was the equation becomes
(6.9)
Now we show that if We(t) is nonsingular. then the controllability matrix C has full row rank. Suppose C does not have full row rank, then there exists an n X I nonzero vector v such that v' C = 0 or for k = 0, I, 2, ... , n  I Because eA'B can be expressed as a linear combination of (B. AB .... , AnI B) (Theorem 3.5), we conclude v' eAIB = O. This contradicts the nonsingularity assumption of Wc(t)· Thus Condition (2) implies Condition (3). To show the converse, suppose C has full row rank but Welt) is singular. Then there exists a nonzero v such that (6.9) holds. Setting t = 0, we have v'B = O. Differentiating (6.9) and then setting I = 0, we have v' AB = O. Proceeding forward yields v' AkB = 0 for k = 0. 1.2, .... They can be arranged as v'[B AB ... An1BJ = v'C = 0
in (2.27). Suppose for a given pendulum,
;"[~~}+[jJ !
y = (I
(6.11)
°°
O)x
148
CONTROLLABILITY
AND
OBSERVABILITY
6.2 Controllability
149
We compute
For Equation (6.12), we compute
2
This matrix can be shown to have rank 4; thus the system is controllable. Therefore. if X3 = deviates from zero slightly, we can find a control u to push it back to zero. In fact, a control exists to bring XI = y, X3, and their derivatives back to zero. This is consistent with our experience of balancing a broom on our palm.
° ° ° °
I
2
10
l~l
p([B
AB]) = p [
0.5
1
0.25] 1
,

Thus the equation is controllable and, for any x(O), there exists an input that transfers x(O) to in 2 seconds or in any finite time. We compute (6.2) and (6.7) for this system at tl = 2:
e
o
Wc(2) =
= [0.2162 0.3167 and
111
1° 2([eO.5r °
I]
e::
°]
[0.5]
I
[0.5 1]
[eo;r
°
0]) e~r
dt
0.3167] 0.4908
The MATLAB functions c trb and gram will generate the controllability matrix and controllability Gramian. Note that the controllability Gramian is not computed from (6.5); it is obtained by solving a set of linear algebraic equations. Whether a state equation is controllable can then be determined by computing the rank of the controllability matrix or Gramian by using rank in MATLAB.
CONTROLLABILITY AND OBSERVABILITY

EXAMPLE 6.3 Consider the platform system shown in Fig. 6.3; it can be used to study suspension systems of automobiles. The system consists of one platform; both ends of the platform are supported on the ground by means of springs and dashpots, which provide viscous friction. The mass of the platform is assumed to be zero; thus the movements of the two spring systems are independent and half of the force is applied to each spring system. The spring constants of both springs are assumed to be 1 and the viscous friction coefficients are assumed to be 2 and 1 as shown. If the displacements of the two spring systems from equilibrium are chosen as state variables x1 and x2, then we have

x1 + 2*x1' = u    and    x2 + x2' = u

or

x' = [-0.5 0; 0 -1] x + [0.5; 1] u    (6.12)

This state equation describes the system.

Figure 6.3 Platform system.

Now if the initial displacements are different from zero, and if no force is applied, the platform will return to zero exponentially. In theory, it will take an infinite time for x1 and x2 to equal 0 exactly. Now we pose the problem: if x1(0) = 10 and x2(0) = -1, can we apply a force to bring the platform to equilibrium in 2 seconds? The answer does not seem to be obvious because the same force is applied to the two spring systems. The input computed from (6.7) is

u1(t) = -[0.5 1] [e^{-0.5(2-t)} 0; 0 e^{-(2-t)}] Wc^{-1}(2) e^{2A} [10; -1] = -58.82 e^{0.5t} + 27.96 e^{t}

for t in [0, 2]. This input force will transfer x(0) = [10 -1]' to [0 0]' in 2 seconds, as shown in Fig. 6.4(a), in which the input is also plotted. It is obtained by using the MATLAB function lsim, an acronym for linear simulation. The largest magnitude of the input is about 45. Figure 6.4(b) plots the input u2 that transfers x(0) = [10 -1]' to 0 in 4 seconds. We see that the smaller the time interval, the larger the input magnitude. If no restriction is imposed on the input, we can transfer x(0) to zero in an arbitrarily small time interval; however, the input magnitude may become very large. If some restriction is imposed on the input magnitude, then we cannot achieve the transfer as fast as desired. For example, if we require |u(t)| <= 9 for all t in Example 6.3, then we cannot transfer x(0) to 0 in less than 4 seconds.

Figure 6.4 Transfer of x(0) = [10 -1]' to [0 0]': (a) in 2 seconds; (b) in 4 seconds.

We remark that the input u(t) in (6.7) is called the minimal energy control in the sense that for any other input v(t) that achieves the same transfer, we have

integral from 0 to t1 of v'(t) v(t) dt  >=  integral from 0 to t1 of u'(t) u(t) dt

Its proof can be found in Reference [6, pp. 556-558].
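Example 6.3's computation can be reproduced numerically. The sketch below is illustrative (the grid size, tolerances, and helper names are arbitrary choices, not from the text): it approximates Wc(2) by the trapezoidal rule, forms the minimal-energy input, and replays it to confirm the transfer to the origin.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import trapezoid

A = np.array([[-0.5, 0.0], [0.0, -1.0]])
B = np.array([[0.5], [1.0]])
x0 = np.array([10.0, -1.0])
t1 = 2.0
ts = np.linspace(0.0, t1, 2001)

# Wc(2) = int_0^2 e^{At} B B' e^{A't} dt, by the trapezoidal rule
Wc = trapezoid(np.array([expm(A*t) @ B @ B.T @ expm(A.T*t) for t in ts]), ts, axis=0)

# minimal-energy input steering x0 to the origin at t1
w = np.linalg.solve(Wc, expm(A*t1) @ x0)
def u1(t):
    return -(B.T @ expm(A.T*(t1 - t)) @ w).item()

# replay: x(t1) = e^{A t1} x0 + int_0^{t1} e^{A(t1-s)} B u1(s) ds
forced = trapezoid(np.array([expm(A*(t1-s)) @ B[:, 0] * u1(s) for s in ts]), ts, axis=0)
xf = expm(A*t1) @ x0 + forced
assert np.allclose(xf, 0.0, atol=1e-2)  # platform brought to equilibrium in 2 s
```

On this grid the largest magnitude of u1 comes out near 46, consistent with the "about 45" read off Fig. 6.4(a).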
EXAMPLE 6.4 Consider again the platform system shown in Fig. 6.3. We now assume that the viscous friction coefficients and spring constants of both spring systems all equal 1. Then the state equation that describes the system becomes

x' = [-1 0; 0 -1] x + [1; 1] u

Its controllability matrix

[B AB] = [1 -1; 1 -1]

has rank 1 and the state equation is not controllable. If x1(0) is not equal to x2(0), we conclude that no input can transfer x(0) to zero in a finite time.

6.2.1 Controllability Indices

Let A and B be n x n and n x p constant matrices. We assume that B has rank p or full column rank. If B does not have full column rank, there is a redundancy in inputs. For example, if the second column of B equals the first column of B, then the effect of the second input on the system can be generated from the first input. Thus the second input is redundant. In conclusion, deleting linearly dependent columns of B and the corresponding inputs will not affect the control of the system. Thus it is reasonable to assume that B has full column rank.

If (A, B) is controllable, its controllability matrix C has rank n and, consequently, n linearly independent columns. Note that there are np columns in C; therefore it is possible to find many sets of n linearly independent columns in C. We discuss in the following the most important way of searching these columns; the search also happens to be most natural. Let bi be the ith column of B. Then C can be written explicitly as

C = [b1 ... bp  Ab1 ... Abp  ...  A^{n-1}b1 ... A^{n-1}bp]    (6.13)

Let us search linearly independent columns of C from left to right. Because of the pattern of C, if A^i bm depends on its left-hand-side (LHS) columns, then A^{i+1} bm will also depend on its LHS columns. It means that once a column associated with bm becomes linearly dependent, then all columns associated with bm thereafter are linearly dependent. Let mu_m be the number of the linearly independent columns associated with bm in C. That is, the columns

bm, Abm, ..., A^{mu_m - 1} bm

are linearly independent in C and A^{mu_m + i} bm are linearly dependent for i = 0, 1, .... It is clear that if C has rank n, then

mu_1 + mu_2 + ... + mu_p = n    (6.14)

The set {mu_1, mu_2, ..., mu_p} is called the controllability indices and

mu = max(mu_1, mu_2, ..., mu_p)

is called the controllability index of (A, B). Or, equivalently, if (A, B) is controllable, the controllability index mu is the least integer such that

rho(C_mu) = rho([B AB ... A^{mu-1}B]) = n    (6.15)

Now we give a range of mu. If mu_1 = mu_2 = ... = mu_p, then n/p <= mu. If all mu_m, except one, equal 1, then mu = n - (p - 1); this is the largest possible mu. Let nbar be the degree of the minimal polynomial of A. Then, by definition, there exist alpha_i such that

A^{nbar} = alpha_1 A^{nbar-1} + alpha_2 A^{nbar-2} + ... + alpha_nbar I

which implies that A^{nbar} B can be written as a linear combination of {B, AB, ..., A^{nbar-1}B}. Clearly we have mu <= nbar. Thus we conclude

n/p <= mu <= min(nbar, n - p + 1)    (6.16)

where rho(B) = p. Because of (6.16), in checking controllability, it is unnecessary to check the n x np matrix C. It is sufficient to check a matrix with fewer columns. Because the degree of the minimal polynomial is generally not available (whereas the rank of B can readily be computed), we can use the following corollary to check controllability. The second part of the corollary follows Theorem 3.8.

Corollary 6.1
The n-dimensional pair (A, B) is controllable if and only if the matrix

C_{n-p+1} := [B AB ... A^{n-p}B]    (6.17)

where rho(B) = p, has rank n or the n x n matrix C_{n-p+1} C'_{n-p+1} is nonsingular.

EXAMPLE 6.5 Consider the satellite system studied in Fig. 2.13. Its linearized state equation was developed in (2.29). From the equation, we can see that the control of the first four state variables by the first two inputs and the control of the last two state variables by the last input are decoupled; therefore we can consider only the following subequation of (2.29):

x' = [0 1 0 0; 3 0 0 2; 0 0 0 1; 0 -2 0 0] x + [0 0; 1 0; 0 0; 0 1] u
y = [1 0 0 0; 0 0 1 0] x    (6.18)

where we have assumed, for simplicity, omega0 = m = r0 = 1. The controllability matrix of (6.18) is of order 4 x 8. If we use Corollary 6.1, then we can check its controllability by using the following 4 x 6 matrix:

[B AB A^2 B] =
[0  0  1  0   0   2
 1  0  0  2  -1   0
 0  0  0  1  -2   0
 0  1 -2  0   0  -4]    (6.19)
It has rank 4. Thus (6.18) is controllable. From (6.19), we can readily verify that the controllability indices are 2 and 2, and the controllability index is 2.
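The rank computation for the satellite subequation can be checked numerically; this sketch (the function names are illustrative, not from the text) forms exactly the 4 x 6 matrix of Corollary 6.1:

```python
import numpy as np

# Subequation (6.18) of the linearized satellite model, with omega0 = m = r0 = 1
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [3.0, 0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, -2.0, 0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])

# Corollary 6.1: with rank(B) = p = 2 and n = 4, the 4 x 6 matrix [B AB A^2 B]
# suffices in place of the full 4 x 8 controllability matrix.
C3 = np.hstack([B, A @ B, A @ A @ B])
print(np.linalg.matrix_rank(C3))  # 4: (6.18) is controllable
```

Searching the columns of C3 from left to right confirms the controllability indices 2 and 2 quoted above.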
Theorem 6.2
The controllability property is invariant under any equivalence transformation.

Proof: Consider the pair (A, B) with controllability matrix C and its equivalent pair (A1, B1) with A1 = P A P^{-1} and B1 = P B, where P is a nonsingular matrix. The controllability matrix of (A1, B1) is

C1 = [B1  A1 B1  ...  A1^{n-1} B1]
   = [PB  (P A P^{-1}) PB  ...  (P A P^{-1})^{n-1} PB]
   = [PB  PAB  ...  P A^{n-1} B]
   = P [B  AB  ...  A^{n-1} B] = PC    (6.20)

Because P is nonsingular, we have rho(C1) = rho(C) (see Equation (3.62)). This establishes Theorem 6.2. Q.E.D.

Theorem 6.3
The set of the controllability indices of (A, B) is invariant under any equivalence transformation and any reordering of the columns of B.

Proof: Let us define

C_k := [B  AB  ...  A^k B]    (6.21)

Then we have, following the proof of Theorem 6.2,

C1_k = P C_k

for k = 0, 1, 2, .... Thus the set of controllability indices is invariant under any equivalence transformation.

The rearrangement of the columns of B can be achieved by

B2 = B M

where M is a p x p nonsingular permutation matrix. It is straightforward to verify

C2_k := [B2  A B2  ...  A^k B2] = C_k diag(M, M, ..., M)

Because diag(M, M, ..., M) is nonsingular, we have rho(C2_k) = rho(C_k) for k = 0, 1, .... Thus the set of controllability indices is invariant under any reordering of the columns of B. Q.E.D.

Because the set of the controllability indices is invariant under any equivalence transformation and any rearrangement of the inputs, it is an intrinsic property of the system that the state equation describes. The physical significance of the controllability index is not transparent here; but it becomes obvious in the discrete-time case. As we will discuss in later chapters, the controllability index can also be computed from transfer matrices and dictates the minimum degree required to achieve pole placement and model matching.

6.3 Observability

The concept of observability is dual to that of controllability. Roughly speaking, controllability studies the possibility of steering the state from the input; observability studies the possibility of estimating the state from the output. These two concepts are defined under the assumption that the state equation or, equivalently, all A, B, C, and D are known. Thus the problem of observability is different from the problem of realization or identification, which is to determine or estimate A, B, C, and D from the information collected at the input and output terminals. Consider the n-dimensional p-input q-output state equation

x' = Ax + Bu
y = Cx + Du    (6.22)

where A, B, C, and D are, respectively, n x n, n x p, q x n, and q x p constant matrices.

Definition 6.O1 The state equation (6.22) is said to be observable if for any unknown initial state x(0), there exists a finite t1 > 0 such that the knowledge of the input u and the output y over [0, t1] suffices to determine uniquely the initial state x(0). Otherwise, the equation is said to be unobservable.

EXAMPLE 6.6 Consider the network shown in Fig. 6.5. If the input is zero, no matter what the initial voltage across the capacitor is, the output is identically zero because of the symmetry of the four resistors. We know the input and output (both are identically zero), but we cannot determine uniquely the initial state. Thus the network or, more precisely, the state equation that describes the network is not observable.

Figure 6.5 Unobservable network.

EXAMPLE 6.7 Consider the network shown in Fig. 6.6(a). The network has two state variables: the current x1 through the inductor and the voltage x2 across the capacitor.
The input u is a current source. If u = 0, the network reduces to the one shown in Fig. 6.6(b). If x1(0) = a, a nonzero constant, and x2(0) = 0, then the output is identically zero. Any x(0) = [a 0]' and u(t) identically zero yield the same output y(t) identically zero. Thus there is no way to determine the initial state [a 0]' uniquely and the equation that describes the network is not observable.

Figure 6.6 Unobservable network.

The response of (6.22) excited by the initial state x(0) and the input u(t) was derived in (4.7) as

y(t) = C e^{At} x(0) + C integral from 0 to t of e^{A(t-tau)} B u(tau) dtau + D u(t)    (6.23)

In the study of observability, the output y and the input u are assumed to be known; the initial state x(0) is the only unknown. Thus we can write (6.23) as

ybar(t) := y(t) - C integral from 0 to t of e^{A(t-tau)} B u(tau) dtau - D u(t) = C e^{At} x(0)    (6.24)

where ybar(t) is a known function. Thus the observability problem reduces to solving x(0) from (6.24). If u is identically zero, then ybar(t) reduces to the zero-input response C e^{At} x(0). Thus Definition 6.O1 can be modified as follows: Equation (6.22) is observable if and only if the initial state x(0) can be determined uniquely from its zero-input response over a finite time interval.

Next we discuss how to solve x(0) from (6.24). For a fixed t, C e^{At} is a q x n constant matrix, and ybar(t) is a q x 1 constant vector. Thus (6.24) is a set of linear algebraic equations with n unknowns. Because of the way it is developed, for every fixed t, ybar(t) is in the range space of C e^{At} and solutions always exist in (6.24). The only question is whether the solution is unique. If q < n, as is the case in general, the q x n matrix C e^{At} has rank at most q and, consequently, has nullity n - q or larger. Thus solutions are not unique (Theorem 3.2). In conclusion, we cannot find a unique x(0) from (6.24) at an isolated t. In order to determine x(0) uniquely from (6.24), we must use the knowledge of u(t) and y(t) over a nonzero time interval, as stated in the next theorem.

Theorem 6.4
The state equation (6.22) is observable if and only if the n x n matrix

Wo(t) = integral from 0 to t of e^{A'tau} C'C e^{A tau} dtau    (6.25)

is nonsingular for any t > 0.

Proof: We premultiply (6.24) by e^{A't} C' and then integrate it over [0, t1] to yield

(integral from 0 to t1 of e^{A't} C'C e^{At} dt) x(0) = integral from 0 to t1 of e^{A't} C' ybar(t) dt

If Wo(t1) is nonsingular, then

x(0) = Wo^{-1}(t1) integral from 0 to t1 of e^{A't} C' ybar(t) dt    (6.26)

This yields a unique x(0). This shows that if Wo(t), for any t > 0, is nonsingular, then (6.22) is observable. Next we show that if Wo(t1) is singular or, equivalently, positive semidefinite for some t1, then (6.22) is not observable. If Wo(t1) is positive semidefinite, there exists an n x 1 nonzero constant vector v such that

v' Wo(t1) v = integral from 0 to t1 of v' e^{A't} C'C e^{At} v dt = 0

which implies

C e^{At} v = 0    (6.27)

for all t in [0, t1]. If u is identically zero, then x1(0) = v, a nonzero vector, and x2(0) = 0 both yield the same

y(t) = C e^{At} xi(0) = 0

Two different initial states yield the same zero-input response; therefore we cannot uniquely determine x(0). Thus (6.22) is not observable. This completes the proof of Theorem 6.4. Q.E.D.

We see from this theorem that observability depends only on A and C. This can also be deduced from Definition 6.O1 by choosing u(t) identically zero. Thus observability is a property of the pair (A, C) and is independent of B and D. As in the controllability part, if Wo(t) is nonsingular for some t, then it is nonsingular for every t and the initial state can be computed from (6.26) by using any nonzero time interval.
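The construction in the proof can be carried out numerically. The sketch below is illustrative (the system matrices, grid, and tolerances are arbitrary choices, not from the text): it recovers the "unknown" initial state from the zero-input response via (6.26).

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import trapezoid

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
C = np.array([[1.0, 0.0]])
x0 = np.array([1.0, -1.0])      # the "unknown" initial state
t1 = 2.0
ts = np.linspace(0.0, t1, 2001)

# measured zero-input response, u = 0
y = np.array([(C @ expm(A*t) @ x0).item() for t in ts])

# Wo(t1) and the right-hand side of (6.26), by the trapezoidal rule
Wo = trapezoid(np.array([expm(A.T*t) @ C.T @ C @ expm(A*t) for t in ts]), ts, axis=0)
rhs = trapezoid(np.array([expm(A.T*t) @ C.T[:, 0] * yt for t, yt in zip(ts, y)]), ts, axis=0)

x0_hat = np.linalg.solve(Wo, rhs)
assert np.allclose(x0_hat, x0, atol=1e-3)   # initial state determined uniquely
```

Because this pair (A, C) is observable, Wo(t1) is nonsingular and the recovered state matches x0 to within the quadrature error.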
Theorem 6.5 (Theorem of duality)
The pair (A, B) is controllable if and only if the pair (A', B') is observable.

Proof: The pair (A, B) is controllable if and only if

Wc(t) = integral from 0 to t of e^{A tau} B B' e^{A'tau} dtau

is nonsingular for any t. The pair (A', B') is observable if and only if, by replacing A by A' and C by B' in (6.25),

Wo(t) = integral from 0 to t of e^{A tau} B B' e^{A'tau} dtau

is nonsingular for any t. The two conditions are identical and the theorem follows. Q.E.D.

We list in the following the observability counterpart of Theorem 6.1. It can be proved either directly or by applying the theorem of duality.

Theorem 6.O1
The following statements are equivalent.
1. The n-dimensional pair (A, C) is observable.
2. The n x n matrix

Wo(t) = integral from 0 to t of e^{A'tau} C'C e^{A tau} dtau    (6.28)

is nonsingular for any t > 0.
3. The nq x n observability matrix

O = [C; CA; CA^2; ...; CA^{n-1}]    (6.29)

has rank n (full column rank). This matrix can be generated by calling obsv in MATLAB.
4. The (n + q) x n matrix

[A - lambda I; C]

has full column rank at every eigenvalue lambda of A.
5. If, in addition, all eigenvalues of A have negative real parts, then the unique solution of

A'Wo + Wo A = -C'C    (6.30)

is positive definite. The solution is called the observability Gramian and can be expressed as

Wo = integral from 0 to infinity of e^{A'tau} C'C e^{A tau} dtau    (6.31)

6.3.1 Observability Indices

Let A and C be n x n and q x n constant matrices. We assume that C has rank q (full row rank). If C does not have full row rank, then the output at some output terminal can be expressed as a linear combination of other outputs. Thus the output does not offer any new information regarding the system and the terminal can be eliminated. By deleting the corresponding row, the reduced C will then have full row rank.

If (A, C) is observable, its observability matrix O has rank n and, consequently, n linearly independent rows. Let ci be the ith row of C. Let us search linearly independent rows of O in order from top to bottom. Dual to the controllability part, if a row associated with cm becomes linearly dependent on its upper rows, then all rows associated with cm thereafter will also be dependent. Let nu_m be the number of the linearly independent rows associated with cm. It is clear that if O has rank n, then

nu_1 + nu_2 + ... + nu_q = n    (6.32)

The set {nu_1, nu_2, ..., nu_q} is called the observability indices and

nu = max(nu_1, nu_2, ..., nu_q)    (6.33)

is called the observability index of (A, C). If (A, C) is observable, it is the least integer such that rho(O_nu) = n. Dual to the controllability part, we have

n/q <= nu <= min(nbar, n - q + 1)    (6.34)

where rho(C) = q and nbar is the degree of the minimal polynomial of A.

Corollary 6.O1
The n-dimensional pair (A, C) is observable if and only if the matrix

O_{n-q+1} := [C; CA; ...; CA^{n-q}]    (6.35)

where rho(C) = q, has rank n or the n x n matrix O'_{n-q+1} O_{n-q+1} is nonsingular.

Theorem 6.O2
The observability property is invariant under any equivalence transformation.
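Both the duality theorem and statement 5 of Theorem 6.O1 can be checked numerically; this sketch is illustrative (the helper obsv and the example pair are not from the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def obsv(A, C):
    """Stack [C; CA; ...; CA^(n-1)] rowwise."""
    rows = [C]
    for _ in range(A.shape[0] - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

A = np.array([[-1.0, 1.0], [0.0, -2.0]])
C = np.array([[1.0, 0.0]])

# Theorem 6.5: (A, C) observable iff (A', C') controllable
ctrb_dual = np.hstack([C.T, A.T @ C.T])
print(np.linalg.matrix_rank(obsv(A, C)), np.linalg.matrix_rank(ctrb_dual))  # 2 2

# Statement 5: A'Wo + Wo A = -C'C has a positive definite solution
# because the eigenvalues of A (-1 and -2) have negative real parts.
Wo = solve_continuous_lyapunov(A.T, -C.T @ C)
print(bool(np.all(np.linalg.eigvalsh(Wo) > 0)))  # True
```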
Theorem 6.O3
The set of the observability indices of (A, C) is invariant under any equivalence transformation and any reordering of the rows of C.

Before concluding this section, we discuss a different way of solving (6.24). Differentiating (6.24) repeatedly and setting t = 0, we can obtain

O_nu x(0) = ytilde(0)    (6.36)

where ytilde(0) := [ybar'(0)  (ybar^{(1)}(0))'  ...  (ybar^{(nu-1)}(0))']' and ybar^{(i)}(t) is the ith derivative of ybar(t). Equation (6.36) is a set of linear algebraic equations. Because of the way it is developed, ytilde(0) must lie in the range space of O_nu. Thus a solution x(0) exists in (6.36). If (A, C) is observable, then O_nu has full column rank and, following Theorem 3.2, the solution is unique. Premultiplying (6.36) by O'_nu and then using Theorem 3.8, we can obtain the solution as

x(0) = [O'_nu O_nu]^{-1} O'_nu ytilde(0)    (6.37)

We mention that in order to obtain ybar(0), ybar^{(1)}(0), ..., we need knowledge of ybar(t) in the neighborhood of t = 0. This is consistent with the earlier assertion that we need knowledge of ybar(t) over a nonzero time interval in order to determine x(0) uniquely from (6.24). The output y(t) measured in practice is often corrupted by high-frequency noise. Because

- differentiation will amplify high-frequency noise and
- integration will suppress or smooth high-frequency noise,

the result obtained from (6.36) or (6.37) may differ greatly from the actual initial state. Thus (6.26) is preferable to (6.37) in computing initial states. The physical significance of the observability index can be seen from (6.36). It is the smallest integer such that x(0) can be determined uniquely from (6.36) or (6.37). It also dictates the minimum degree required to achieve pole placement and model matching, as we will discuss in Chapter 9.

6.4 Canonical Decomposition

This section discusses canonical decomposition of state equations. This fundamental result will be used to establish the relationship between the state-space description and the transfer-matrix description. Consider

x' = Ax + Bu
y = Cx + Du    (6.38)

Let xbar = Px, where P is a nonsingular matrix. Then the state equation

xbar' = A1 xbar + B1 u
y = C1 xbar + D1 u    (6.39)

with A1 = P A P^{-1}, B1 = P B, C1 = C P^{-1}, and D1 = D is equivalent to (6.38). All properties of (6.38), including stability, controllability, and observability, are preserved in (6.39). We also have, following (6.20), C1 = PC for the corresponding controllability matrices.

Theorem 6.6
Consider the n-dimensional state equation in (6.38) with

rho(C) = rho([B  AB  ...  A^{n-1}B]) = n1 < n

We form the n x n matrix

P^{-1} := Q = [q1  ...  q_{n1}  ...  q_n]

where the first n1 columns are any n1 linearly independent columns of C, and the remaining columns can arbitrarily be chosen as long as Q is nonsingular. Then the equivalence transformation xbar = Px or x = P^{-1} xbar will transform (6.38) into

d/dt [xbar_c; xbar_cc] = [Ac  A12; 0  Acc] [xbar_c; xbar_cc] + [Bc; 0] u
y = [Cc  Ccc] xbar + Du    (6.40)

where Ac is n1 x n1 and Acc is (n - n1) x (n - n1), and the n1-dimensional subequation of (6.40),

d/dt xbar_c = Ac xbar_c + Bc u
y = Cc xbar_c + Du    (6.41)

is controllable and has the same transfer matrix as (6.38).

Proof: As discussed in Section 4.3, the transformation x = P^{-1} xbar changes the basis of the state space from the orthonormal basis in (3.8) to the columns of Q := P^{-1}, that is, {q1, ..., q_{n1}, ..., q_n}. The ith column of A1 = PAP^{-1} is the representation of A qi with respect to {q1, ..., q_{n1}, ..., q_n}. Now the vectors A qi, for i = 1, 2, ..., n1, are linearly dependent on the set {q1, ..., q_{n1}}; they are linearly independent of {q_{n1+1}, ..., q_n}. Thus the matrix A1 has the form shown in (6.40). The columns of B1 = PB are the representation of the columns of B with respect to {q1, ..., q_{n1}, ..., q_n}. Now the columns of B depend only on {q1, ..., q_{n1}}; thus B1 has the form shown in (6.40). We mention that if the n x p
matrix B has rank p and if its columns are chosen as the first p columns of P^{-1}, then the upper part of B1 is the unit matrix of order p.

Let C1 be the controllability matrix of (6.40). Then we have rho(C1) = rho(C) = n1. It is straightforward to verify

C1 = [Bc  Ac Bc  ...  Ac^{n1-1} Bc  ...  Ac^{n-1} Bc;
      0   0      ...  0            ...  0          ]
   = [Cc  Ac^{n1} Bc  ...  Ac^{n-1} Bc;
      0   0           ...  0          ]

where Cc is the controllability matrix of (Ac, Bc). Because the columns of Ac^k Bc, for k >= n1, are linearly dependent on the columns of Cc, the condition rho(C1) = n1 implies rho(Cc) = n1. Thus the n1-dimensional state equation in (6.41) is controllable.

Next we show that (6.41) has the same transfer matrix as (6.38). Because (6.38) and (6.40) have the same transfer matrix, we need to show only that (6.40) and (6.41) have the same transfer matrix. By direct verification, we can show

[sI - Ac  -A12; 0  sI - Acc]^{-1} = [(sI - Ac)^{-1}  M; 0  (sI - Acc)^{-1}]    (6.42)

where M = (sI - Ac)^{-1} A12 (sI - Acc)^{-1}. Thus the transfer matrix of (6.40) is

[Cc  Ccc] [(sI - Ac)^{-1}  M; 0  (sI - Acc)^{-1}] [Bc; 0] + D = Cc (sI - Ac)^{-1} Bc + D

which is the transfer matrix of (6.41). This completes the proof of Theorem 6.6. Q.E.D.

In the equivalence transformation xbar = Px, the n-dimensional state space is divided into two subspaces. One is the n1-dimensional subspace that consists of all vectors of the form [xbar_c' 0']'; the other is the (n - n1)-dimensional subspace that consists of all vectors of the form [0' xbar_cc']'. Because (6.41) is controllable, the input u can transfer xbar_c from any state to any other state. However, the input u cannot control xbar_cc because, as we can see from (6.40), u does not affect xbar_cc directly, nor indirectly through the state xbar_c. By dropping the uncontrollable state vector, we obtain a controllable state equation of lesser dimension that is zero-state equivalent to the original equation.

EXAMPLE 6.8 Consider the three-dimensional state equation

x' = [1 1 0; 0 1 0; 0 1 1] x + [0 1; 1 0; 0 1] u
y = [1 1 1] x    (6.43)

The rank of B is 2; therefore we can use C2 = [B AB], instead of C3 = [B AB A^2 B], to check the controllability of (6.43) (Corollary 6.1). Because

rho(C2) = rho([B AB]) = rho([0 1 1 1; 1 0 1 0; 0 1 1 1]) = 2 < 3

the state equation in (6.43) is not controllable. Let us choose

P^{-1} = Q := [0 1 0; 1 0 0; 0 1 1]

The first two columns of Q are the first two linearly independent columns of C2; the last column is chosen arbitrarily to make Q nonsingular. Let xbar = Px. We compute

A1 = P A P^{-1} = [1 0 0; 1 1 0; 0 0 1]    B1 = P B = [1 0; 0 1; 0 0]

Note that the 1 x 2 submatrix A21 of A1 and the lower part of B1 are zero as expected. The 2 x 1 submatrix A12 happens to be zero; it could be nonzero. The upper part of B1 is a unit matrix because the columns of B are the first two columns of Q. With C1 = C P^{-1} = [1 2 1], (6.43) can be reduced to

d/dt xbar_c = [1 0; 1 1] xbar_c + [1 0; 0 1] u
y = [1 2] xbar_c

This equation is controllable and has the same transfer matrix as (6.43).

The MATLAB function ctrbf transforms (6.38) into (6.40) except that the order of the columns in P^{-1} is reversed. Thus the resulting equation has the form

d/dt [xbar_cc; xbar_c] = [Acc  0; A21  Ac] [xbar_cc; xbar_c] + [0; Bc] u
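The reduction in Example 6.8 can be verified numerically; the sketch below uses the matrices as printed above and compares the transfer matrices of (6.43) and of the reduced equation at a few test frequencies (the frequencies themselves are arbitrary choices):

```python
import numpy as np

A = np.array([[1., 1., 0.], [0., 1., 0.], [0., 1., 1.]])
B = np.array([[0., 1.], [1., 0.], [0., 1.]])
C = np.array([[1., 1., 1.]])
Q = np.array([[0., 1., 0.], [1., 0., 0.], [0., 1., 1.]])  # P^{-1} as chosen above

Abar = np.linalg.solve(Q, A @ Q)   # P A P^{-1}, with P = Q^{-1}
Bbar = np.linalg.solve(Q, B)       # P B
Cbar = C @ Q                       # C P^{-1}
Ac, Bc, Cc = Abar[:2, :2], Bbar[:2, :], Cbar[:, :2]

def G(s, A, B, C):
    """Transfer matrix C (sI - A)^{-1} B evaluated at s."""
    return C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)

for s in (2.0, 3.0 + 1.0j):
    assert np.allclose(G(s, A, B, C), G(s, Ac, Bc, Cc))
print("reduced equation matches (6.43)")
```

Because A12 and A21 are both zero here, the match is exact rather than merely numerical.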
Theorem 6.6 is established from the controllability matrix. In actual computation, it is unnecessary to form the controllability matrix. The result can be obtained by carrying out a sequence of similarity transformations to transform [B A] into a Hessenberg form. See Reference [6, pp. 220-222]. This procedure is efficient and numerically stable and should be used in actual computation.

Dual to Theorem 6.6, we have the following theorem for unobservable state equations.

Theorem 6.O6
Consider the n-dimensional state equation in (6.38) with

rho(O) = rho([C; CA; ...; CA^{n-1}]) = n2 < n

We form the n x n matrix

P = [p1; ...; p_{n2}; ...; p_n]

where the first n2 rows are any n2 linearly independent rows of O, and the remaining rows can be chosen arbitrarily as long as P is nonsingular. Then the equivalence transformation xbar = Px will transform (6.38) into

d/dt [xbar_o; xbar_oo] = [Ao  0; A21  Aoo] [xbar_o; xbar_oo] + [Bo; Boo] u
y = [Co  0] [xbar_o; xbar_oo] + Du    (6.44)

where Ao is n2 x n2 and Aoo is (n - n2) x (n - n2), and the n2-dimensional subequation of (6.44),

d/dt xbar_o = Ao xbar_o + Bo u
y = Co xbar_o + Du

is observable and has the same transfer matrix as (6.38).

In the equivalence transformation xbar = Px, the n-dimensional state space is divided into two subspaces. One is the n2-dimensional subspace that consists of all vectors of the form [xbar_o' 0']'; the other is the (n - n2)-dimensional subspace consisting of all vectors of the form [0' xbar_oo']'. The state xbar_o can be detected from the output. However, xbar_oo cannot be detected from the output because, as we can see from (6.44), it is not connected to the output either directly, or indirectly through the state xbar_o. By dropping the unobservable state vector, we obtain an observable state equation of lesser dimension that is zero-state equivalent to the original equation. The MATLAB function obsvf is the counterpart of ctrbf.

Combining Theorems 6.6 and 6.O6, we have the following Kalman decomposition theorem.

Theorem 6.7
Every state-space equation can be transformed, by an equivalence transformation, into the following canonical form

d/dt [x_co; x_cno; x_nco; x_ncno] =
[A_co   0      A13    0
 A21    A_cno  A23    A24
 0      0      A_nco  0
 0      0      A43    A_ncno] [x_co; x_cno; x_nco; x_ncno] + [B_co; B_cno; 0; 0] u

y = [C_co  0  C_nco  0] xbar + Du    (6.45)

where the vector x_co is controllable and observable, x_cno is controllable but not observable, x_nco is observable but not controllable, and x_ncno is neither controllable nor observable. Furthermore, the state equation is zero-state equivalent to the controllable and observable state equation

d/dt x_co = A_co x_co + B_co u
y = C_co x_co + Du    (6.46)

and has the transfer matrix

G(s) = C_co (sI - A_co)^{-1} B_co + D

Figure 6.7 Kalman decomposition.
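The dimensions of the four parts in (6.45) can be computed from ranks alone, without constructing the staircase forms. This sketch is illustrative (the function names and the diagonal example are arbitrary choices, not from the text):

```python
import numpy as np
from scipy.linalg import null_space

def ctrb(A, B):
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def obsv(A, C):
    rows = [C]
    for _ in range(A.shape[0] - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

def kalman_dims(A, B, C):
    """Dimensions (n_co, n_cno, n_nco, n_ncno) of the four Kalman parts."""
    n = A.shape[0]
    Xc = ctrb(A, B)                  # controllable subspace = range(Xc)
    Xno = null_space(obsv(A, C))     # unobservable subspace = null(O)
    nc, nno = np.linalg.matrix_rank(Xc), Xno.shape[1]
    # dim of (controllable AND unobservable) from the rank of the joined bases
    n_cno = nc + nno - np.linalg.matrix_rank(np.hstack([Xc, Xno]))
    n_co = nc - n_cno
    n_ncno = nno - n_cno
    return int(n_co), int(n_cno), int(n - n_co - n_cno - n_ncno), int(n_ncno)

# a diagonal example in which each mode is one of the four kinds
A = np.diag([-1., -2., -3., -4.])
B = np.array([[1.], [1.], [0.], [0.]])   # modes 1, 2 controllable
C = np.array([[1., 0., 1., 0.]])         # modes 1, 3 observable
print(kalman_dims(A, B, C))  # (1, 1, 1, 1)
```

Only the one-dimensional controllable-and-observable part appears in the transfer function of this example, in line with the discussion that follows.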
This theorem can be illustrated symbolically as shown in Fig. 6.7. The equation is first decomposed, using Theorem 6.6, into controllable and uncontrollable subequations. We then decompose each subequation, using Theorem 6.O6, into observable and unobservable parts. From the figure, we see that only the controllable and observable part is connected to both the input and output terminals. Thus the transfer matrix describes only this part of the system. This is the reason that the transfer-function description and the state-space description are not necessarily equivalent. For example, if any A-matrix other than A_co has an eigenvalue with a positive real part, then some state variable may grow without bound and the system may burn out. This phenomenon, however, cannot be detected from the transfer matrix. The MATLAB function minreal, an acronym for minimal realization, can reduce any state equation to (6.46). The reason for calling it minimal realization will be given in the next chapter.

EXAMPLE 6.9 Consider the network shown in Fig. 6.8(a). Because the input is a current source, responses due to the initial conditions in C1 and L1 will not appear at the output. Thus the state variables associated with C1 and L1 are not observable; whether or not they are controllable is immaterial in the subsequent discussion. Similarly, the state variable associated with L2 is not controllable. Because of the symmetry of the four 1-ohm resistors, the state variable associated with C2 is neither controllable nor observable. By dropping the state variables that are either uncontrollable or unobservable, the network in Fig. 6.8(a) can be reduced to the one in Fig. 6.8(b). The current in each branch is u/2; thus the output y equals 2 * (u/2) or y = u. Thus the transfer function of the network in Fig. 6.8(a) is g(s) = 1.

Figure 6.8 Networks.
If we assign state variables as shown, then the network can be described by

x' = [-0.5 0 0 0; 0 -1 0 0; 0 0 -0.5 0; 0 0 0 -1] x + [0.5; 0.5; 0; 0] u
y = [0 0 0 1] x + u

Because the equation is already of the form shown in (6.40), it can be reduced to the following controllable state equation

d/dt xbar_c = [-0.5 0; 0 -1] xbar_c + [0.5; 0.5] u
y = [0 0] xbar_c + u

The output is independent of xbar_c; thus the equation can be further reduced to y = u. This is what we will obtain by using the MATLAB function minreal.

6.5 Conditions in Jordan-Form Equations

Controllability and observability are invariant under any equivalence transformation. If a state equation is transformed into Jordan form, then the controllability and observability conditions become very simple and can often be checked by inspection. Consider the state equation

x' = Jx + Bu
y = Cx    (6.47)

where J is in Jordan form. To simplify discussion, we assume that J has only two distinct eigenvalues lambda1 and lambda2 and can be written as

J = diag(J1, J2)

where J1 consists of all Jordan blocks associated with lambda1 and J2 consists of all Jordan blocks associated with lambda2. Again to simplify discussion, we assume that J1 has three Jordan blocks and J2 has two Jordan blocks or

J1 = diag(J11, J12, J13)    J2 = diag(J21, J22)

The row of B corresponding to the last row of Jij is denoted by b_lij. The column of C corresponding to the first column of Jij is denoted by c_fij.

Theorem 6.8
1. The state equation in (6.47) is controllable if and only if the three row vectors {b_l11, b_l12, b_l13} are linearly independent and the two row vectors {b_l21, b_l22} are linearly independent.
2. The state equation in (6.47) is observable if and only if the three column vectors {c_f11, c_f12, c_f13} are linearly independent and the two column vectors {c_f21, c_f22} are linearly independent.

We discuss first the implications of this theorem. If a state equation is in Jordan form, then the controllability of the state variables associated with one eigenvalue can be checked independently from those associated with different eigenvalues. The controllability of the state variables associated with the same eigenvalue depends only on the rows of B corresponding to the last row of all Jordan blocks associated with the eigenvalue. All other rows of B play no role in determining the controllability. Similar remarks apply to the observability part except that the columns of C corresponding to the first column of all Jordan blocks determine the observability. We use an example to illustrate the use of Theorem 6.8.
EXAMPLE 6.10 Consider the Jordan-form state equation

x' = [lambda1  1        0        0        0        0        0
      0        lambda1  0        0        0        0        0
      0        0        lambda1  0        0        0        0
      0        0        0        lambda1  0        0        0
      0        0        0        0        lambda2  1        0
      0        0        0        0        0        lambda2  1
      0        0        0        0        0        0        lambda2] x
   + [0 0 0; 1 0 0; 0 1 0; 1 1 1; 1 2 3; 0 0 0; 1 1 1] u

y = [1 0 2 0 0 2 1
     1 0 1 2 0 0 2
     1 0 2 3 0 1 0] x    (6.48)

The matrix J has two distinct eigenvalues lambda1 and lambda2. There are three Jordan blocks, with order 2, 1, and 1, associated with lambda1. The rows of B corresponding to the last row of the three Jordan blocks are [1 0 0], [0 1 0], and [1 1 1]. The three rows are linearly independent. There is only one Jordan block, with order 3, associated with lambda2. The row of B corresponding to the last row of the Jordan block is [1 1 1], which is nonzero and is therefore linearly independent. Thus we conclude that the state equation in (6.48) is controllable.

The conditions for (6.48) to be observable are that the three columns [1 1 1]', [2 1 2]', and [0 2 3]' are linearly independent (they are) and the one column [0 0 0]' is linearly independent (it is not). Therefore the state equation is not observable.

Before proving Theorem 6.8, we draw a block diagram to show how the conditions in the theorem arise. The inverse of (sI - J) is of the form shown in (3.49), whose entries consist of only 1/(s - lambda_i)^k. Using (3.49), we can draw a block diagram for (6.48) as shown in Fig. 6.9. Each chain of blocks corresponds to one Jordan block in the equation. Because (6.48) has four Jordan blocks, the figure has four chains. The output of each block can be assigned as a state variable as shown in Fig. 6.10. Let us consider the last chain in Fig. 6.9. If b_l21 = 0, the state variable x_l21 is not connected to the input and is not controllable no matter what values the other gains of the chain assume. On the other hand, if b_l21 is nonzero, then all state variables in the chain are controllable. If there are two or more chains associated with the same eigenvalue, then we require the linear independence of the first gain vectors of those chains. The chains associated with different eigenvalues can be checked separately. All discussion applies to the observability part except that the column vector c_fij plays the role of the row vector b_lij.

Figure 6.9 Block diagram of (6.48).

Figure 6.10 Internal structure of 1/(s - lambda_i).

Proof of Theorem 6.8 We prove the theorem by using the condition that the matrix [A - sI B] or [sI - A B] has full row rank at every eigenvalue of A. In order not to be overwhelmed by notation, we assume [sI - J B] to be of the form

[s-lambda1  -1          0          0          0          0          0         | b111
 0          s-lambda1  -1          0          0          0          0         | b211
 0          0          s-lambda1   0          0          0          0         | b_l11
 0          0          0          s-lambda1  -1          0          0         | b112
 0          0          0          0          s-lambda1   0          0         | b_l12
 0          0          0          0          0          s-lambda2  -1         | b121
 0          0          0          0          0          0          s-lambda2  | b_l21]    (6.49)

The Jordan-form matrix J has two distinct eigenvalues lambda1 and lambda2. There are two Jordan blocks associated with lambda1 and one associated with lambda2. If s = lambda1, (6.49) becomes

[0  -1   0   0   0   0                  0                 | b111
 0   0  -1   0   0   0                  0                 | b211
 0   0   0   0   0   0                  0                 | b_l11
 0   0   0   0  -1   0                  0                 | b112
 0   0   0   0   0   0                  0                 | b_l12
 0   0   0   0   0   lambda1-lambda2   -1                 | b121
 0   0   0   0   0   0                  lambda1-lambda2   | b_l21]    (6.50)
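Example 6.10's conclusions can be checked numerically for any concrete choice of the eigenvalues and of the entries that the theorem says are immaterial; the values below are one such arbitrary choice, with the decisive rows of B and columns of C taken as stated above:

```python
import numpy as np

l1, l2 = -1.0, -2.0              # any two distinct eigenvalues
J = np.diag([l1, l1, l1, l1, l2, l2, l2])
J[0, 1] = 1.0                    # the order-2 block at l1
J[4, 5] = 1.0
J[5, 6] = 1.0                    # the order-3 block at l2
B = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [1., 1., 1.],
              [1., 2., 3.], [0., 0., 0.], [1., 1., 1.]])
C = np.array([[1., 0., 2., 0., 0., 2., 1.],
              [1., 0., 1., 2., 0., 0., 2.],
              [1., 0., 2., 3., 0., 1., 0.]])

def ctrb(A, B):
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

r_ctrb = int(np.linalg.matrix_rank(ctrb(J, B)))
r_obsv = int(np.linalg.matrix_rank(ctrb(J.T, C.T)))  # duality: rank O(J,C) = rank C(J',C')
print(r_ctrb == 7, r_obsv < 7)   # True True
```

The zero column of C at the first position of the lambda2 block is exactly what makes the observability rank drop, as Theorem 6.8 predicts.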
168 CONTROLLABILITY AND OBSERVABILITY

The rank of the matrix will not change by elementary column operations. We add the product of the second column of (6.50) and b_{111} to the last block column. Repeating the process for the third and fifth columns, we can obtain

    [ 0  −1   0   0   0     0       0    |  0       ]
    [ 0   0  −1   0   0     0       0    |  0       ]
    [ 0   0   0   0   0     0       0    |  b_{l11} ]
    [ 0   0   0   0  −1     0       0    |  0       ]
    [ 0   0   0   0   0     0       0    |  b_{l12} ]
    [ 0   0   0   0   0   λ1−λ2    −1    |  b_{121} ]
    [ 0   0   0   0   0     0     λ1−λ2  |  b_{l21} ]

Because λ1 and λ2 are distinct, λ1 − λ2 is nonzero. We add the product of the seventh column and b_{l21}/(λ1 − λ2) to the last block column and then use the sixth column to eliminate its remaining right-hand-side entries to yield

    [ 0  −1   0   0   0     0       0    |  0       ]
    [ 0   0  −1   0   0     0       0    |  0       ]
    [ 0   0   0   0   0     0       0    |  b_{l11} ]
    [ 0   0   0   0  −1     0       0    |  0       ]        (6.51)
    [ 0   0   0   0   0     0       0    |  b_{l12} ]
    [ 0   0   0   0   0   λ1−λ2    −1    |  0       ]
    [ 0   0   0   0   0     0     λ1−λ2  |  0       ]

It is clear that the matrix in (6.51) has full row rank if and only if b_{l11} and b_{l12} are linearly independent. Proceeding similarly for each eigenvalue, we can establish Theorem 6.8. Q.E.D.

Consider an n-dimensional Jordan-form state equation with p inputs and q outputs. If there are m, with m > p, Jordan blocks associated with the same eigenvalue, then m 1×p row vectors can never be linearly independent and the state equation can never be controllable. Thus a necessary condition for the state equation to be controllable is m ≤ p. Similarly, a necessary condition for the state equation to be observable is m ≤ q. For the single-input or single-output case, we then have the following corollaries.

Corollary 6.8
A single-input Jordan-form state equation is controllable if and only if there is only one Jordan block associated with each distinct eigenvalue and every entry of B corresponding to the last row of each Jordan block is different from zero.

Corollary 6.O8
A single-output Jordan-form state equation is observable if and only if there is only one Jordan block associated with each distinct eigenvalue and every entry of C corresponding to the first column of each Jordan block is different from zero.

EXAMPLE 6.11 Consider the state equation

    ẋ = [ 0 1 0 0 ]     [ 1 ]
        [ 0 0 1 0 ] x + [ 0 ] u                              (6.52)
        [ 0 0 0 0 ]     [ 0 ]
        [ 0 0 0 2 ]     [ 1 ]

    y = [1 0 0 2] x

There are two Jordan blocks, one with order 3 and associated with eigenvalue 0, the other with order 1 and associated with eigenvalue 2. The entry of B corresponding to the last row of the first Jordan block is zero; thus the state equation is not controllable. The two entries of C corresponding to the first column of both Jordan blocks are different from zero; thus the state equation is observable.

6.6 Discrete-Time State Equations

Consider the n-dimensional p-input q-output state equation

    x[k+1] = A x[k] + B u[k]
      y[k] = C x[k]                                           (6.53)

where A, B, and C are, respectively, n×n, n×p, and q×n real constant matrices.

Definition 6.D1 The discrete-time state equation (6.53) or the pair (A, B) is said to be controllable if for any initial state x[0] = x0 and any final state x1, there exists an input sequence of finite length that transfers x0 to x1. Otherwise the equation or (A, B) is said to be uncontrollable.

Theorem 6.D1
The following statements are equivalent:

1. The n-dimensional pair (A, B) is controllable.
2. The n×n matrix

       W_dc[n−1] = Σ_{m=0}^{n−1} A^m B B' (A')^m              (6.54)

   is nonsingular.
3. The n×np controllability matrix

       C = [B  AB  A²B  ⋯  A^{n−1}B]                          (6.55)

   has rank n (full row rank). The matrix can be generated by calling ctrb in MATLAB.
4. The n×(n+p) matrix [A − λI  B] has full row rank at every eigenvalue λ of A.
5. If, in addition, all eigenvalues of A have magnitudes less than 1, then the unique solution of
       W_dc − A W_dc A' = B B'                                (6.56)

   is positive definite. The solution is called the discrete controllability Gramian; it can be obtained by using the MATLAB function dgram and can be expressed as

       W_dc = Σ_{m=0}^{∞} A^m B B' (A')^m                     (6.57)

The solution of (6.53) at k = n was derived in (4.20) as

    x[n] = A^n x[0] + Σ_{m=0}^{n−1} A^{n−1−m} B u[m]

which can be written as

    x[n] − A^n x[0] = [B  AB  ⋯  A^{n−1}B] [u[n−1]; u[n−2]; ⋯; u[0]]        (6.58)

It follows from Theorem 3.1 that for any x[0] and x[n], an input sequence exists if and only if the controllability matrix has full row rank. This shows the equivalence of (1) and (3). The matrix W_dc[n−1] can be written as

    W_dc[n−1] = [B  AB  ⋯  A^{n−1}B] [B'; B'A'; ⋯; B'(A')^{n−1}]            (6.60)

The equivalence of (2) and (3) then follows Theorem 3.8. Note that W_dc[m] is always positive semidefinite. If it is nonsingular or, equivalently, positive definite, then (6.53) is controllable. The proof of the equivalence of (3) and (4) is identical to the continuous-time case. Condition (5) follows Condition (2) and Theorem 5.D6. We see that establishing Theorem 6.D1 is considerably simpler than establishing Theorem 6.1.

There is one important difference between the continuous- and discrete-time cases. If a continuous-time state equation is controllable, the input can transfer any state to any other state in any nonzero time interval, no matter how small. If a discrete-time state equation is controllable, an input sequence of length n can transfer any state to any other state. If we compute the controllability index μ as defined in (6.15), then the transfer can be achieved using an input sequence of length μ. If an input sequence is shorter than μ, it is not possible to transfer any state to any other state.

Definition 6.D2 The discrete-time state equation (6.53) or the pair (A, C) is said to be observable if for any unknown initial state x[0], there exists a finite integer k1 > 0 such that the knowledge of the input sequence u[k] and output sequence y[k] from k = 0 to k1 suffices to determine uniquely the initial state x[0]. Otherwise, the equation is said to be unobservable.

Theorem 6.DO1
The following statements are equivalent:

1. The n-dimensional pair (A, C) is observable.
2. The n×n matrix

       W_do[n−1] = Σ_{m=0}^{n−1} (A')^m C' C A^m              (6.59)

   is nonsingular or, equivalently, positive definite.
3. The nq×n observability matrix has rank n (full column rank). The matrix can be generated by calling obsv in MATLAB.
4. The (n+q)×n matrix [A − λI; C] has full column rank at every eigenvalue λ of A.
5. If, in addition, all eigenvalues of A have magnitudes less than 1, then the unique solution of

       W_do − A' W_do A = C' C                                (6.61)

   is positive definite. The solution is called the discrete observability Gramian and can be expressed as

       W_do = Σ_{m=0}^{∞} (A')^m C' C A^m                     (6.62)

This can be proved directly or indirectly using the duality theorem. We mention that all other properties, such as controllability and observability indices, Kalman decomposition, and Jordan-form controllability and observability conditions, discussed for the continuous-time case apply to the discrete-time case without any modification. The controllability index and observability index, however, have simple interpretations in the discrete-time case. The controllability index is the length of the shortest input sequence that can transfer any state to any other state. The observability index determines the shortest input and output sequences needed to determine the initial state uniquely.

6.6.1 Controllability
to the Origin and Reachability
In the literature there are three different controllability definitions:

1. Transfer any state to any other state, as adopted in Definition 6.D1.
2. Transfer any state to the zero state, called controllability to the origin.
3. Transfer the zero state to any state, called controllability from the origin or, more often, reachability.
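Both the rank and Lyapunov-equation tests of Theorem 6.D1, and the gap between definitions (2) and (3) when A is singular, are easy to check numerically. A Python sketch, with hypothetical matrices chosen only for illustration; scipy's solve_discrete_lyapunov plays the role of MATLAB's dgram:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# --- A hypothetical controllable, stable pair (all |eigenvalues| < 1). ---
A = np.array([[0.5, 1.0],
              [0.0, 0.3]])
B = np.array([[0.0],
              [1.0]])
n = A.shape[0]

# Theorem 6.D1, condition (3): [B AB ... A^(n-1)B] has full row rank.
ctrb_ok = np.hstack([np.linalg.matrix_power(A, m) @ B for m in range(n)])
assert np.linalg.matrix_rank(ctrb_ok) == n

# Theorem 6.D1, condition (5): Wdc - A Wdc A' = B B' has a positive
# definite solution, the discrete controllability Gramian.
Wdc = solve_discrete_lyapunov(A, B @ B.T)
assert np.all(np.linalg.eigvalsh(Wdc) > 0)

# --- Controllability to the origin vs. reachability (singular A). ---
# With a nilpotent A and B = 0, nothing is reachable from the origin,
# yet every state is driven to the origin in n steps with no input.
An = np.array([[0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0],
               [0.0, 0.0, 0.0]])
Bn = np.zeros((3, 1))
ctrb_nil = np.hstack([np.linalg.matrix_power(An, m) @ Bn for m in range(3)])
assert np.linalg.matrix_rank(ctrb_nil) == 0   # not reachable

x = np.array([1.0, -2.0, 3.0])   # arbitrary initial state
for _ in range(3):
    x = An @ x                   # An**3 = 0, so x reaches the origin
assert np.all(x == 0)
```

The second half of the sketch is in the spirit of the zero-input example discussed next in the text.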
In the continuous-time case, because e^{At} is nonsingular, the three definitions are equivalent. In the discrete-time case, if A is nonsingular, the three definitions are again equivalent. But if A is singular, then (1) and (3) are equivalent, but not (2) and (3). The equivalence of (1) and (3) can easily be seen from (6.58). We use examples to discuss the difference between (2) and (3). Consider

    x[k+1] = [ 0 1 0 ]        [ 0 ]
             [ 0 0 1 ] x[k] + [ 0 ] u[k]                      (6.63)
             [ 0 0 0 ]        [ 0 ]

Its controllability matrix has rank 0 and the equation is not controllable as defined in (1), or not reachable as defined in (3). The matrix A has the form shown in (3.40) and has the property A^k = 0 for k ≥ 3. Thus we have x[k] = A^k x[0] = 0, k ≥ 3, for any initial state x[0]. Thus every state propagates to the zero state whether or not an input sequence is applied, and the equation is controllable to the origin. A different example follows. Consider the equation in (6.64). Its controllability matrix has rank 1 and the equation is not reachable. However, for any x1[0] = α and x2[0] = β, the input u[0] = 2α + β transfers x[0] to x[1] = 0. Thus the equation is controllable to the origin. Note that the A-matrices in (6.63) and (6.64) are both singular. The definition adopted in Definition 6.D1 encompasses the other two definitions and makes the discussion simple. For a thorough discussion of the three definitions, see Reference [4].

6.7 Controllability After Sampling

Consider a continuous-time state equation

    ẋ(t) = A x(t) + B u(t)                                    (6.65)

If the input is piecewise constant, that is, u(t) = u[k] := u(kT) for kT ≤ t < (k+1)T, then the equation can be described, as developed in (4.17), by

    x[k+1] = Ā x[k] + B̄ u[k]                                  (6.66)

with

    Ā = e^{AT}    B̄ = ( ∫₀ᵀ e^{Aτ} dτ ) B =: M B             (6.67)

The question is: if (6.65) is controllable, will its sampled equation in (6.66) be controllable? This problem is important in designing so-called dead-beat sampled-data systems and in computer control of continuous-time systems. The answer depends on the sampling period T and the location of the eigenvalues of A. Let λi and λ̄i be, respectively, the eigenvalues of A and Ā. We use Re and Im to denote the real part and imaginary part. Then we have the following theorem.

Theorem 6.9
Suppose (6.65) is controllable. A sufficient condition for its discretized equation in (6.66), with sampling period T, to be controllable is that |Im[λi − λj]| ≠ 2πm/T for m = 1, 2, ..., whenever Re[λi − λj] = 0. For the single-input case, the condition is necessary as well.

First we remark on the conditions. If A has only real eigenvalues, then the discretized equation with any sampling period T > 0 is always controllable. Suppose A has complex conjugate eigenvalues α ± jβ. If the sampling period T does not equal any integer multiple of π/β, then the discretized state equation is controllable. If T = mπ/β for some integer m, then the discretized equation may not be controllable. The reason is as follows. Because Ā = e^{AT}, if λi is an eigenvalue of A, then λ̄i := e^{λi T} is an eigenvalue of Ā (Problem 3.19). If T = mπ/β, the two distinct eigenvalues λ1 = α + jβ and λ2 = α − jβ of A become a repeated eigenvalue −e^{αT} or e^{αT} of Ā. This will cause the discretized equation to be uncontrollable, as we will see in the proof. We show Theorem 6.9 by assuming A to be in Jordan form. This is permitted because controllability is invariant under any equivalence transformation.

Proof of Theorem 6.9 To simplify the discussion, we assume A to be of the form

    A = diag(A11, A12, A21) = [ λ1  1   0  |  0  |  0   0  ]
                              [ 0   λ1  1  |  0  |  0   0  ]
                              [ 0   0   λ1 |  0  |  0   0  ]        (6.68)
                              [ 0   0   0  |  λ1 |  0   0  ]
                              [ 0   0   0  |  0  |  λ2  1  ]
                              [ 0   0   0  |  0  |  0   λ2 ]

In other words, A has two distinct eigenvalues λ1 and λ2. There are two Jordan blocks, one with order 3 and one with order 1, associated with λ1, and only one Jordan block of order 2 associated with λ2. Using (3.48), we have

    Ā = diag(Ā11, Ā12, Ā21)
      = [ e^{λ1T}  Te^{λ1T}  T²e^{λ1T}/2 |  0       |  0        0        ]
        [ 0        e^{λ1T}   Te^{λ1T}    |  0       |  0        0        ]
        [ 0        0         e^{λ1T}     |  0       |  0        0        ]        (6.69)
        [ 0        0         0           |  e^{λ1T} |  0        0        ]
        [ 0        0         0           |  0       |  e^{λ2T}  Te^{λ2T} ]
        [ 0        0         0           |  0       |  0        e^{λ2T}  ]
This is not in Jordan form. Because we will use Theorem 6.8, which is also applicable to the discrete-time case without any modification, to prove Theorem 6.9, we must transform Ā in (6.69) into Jordan form. It turns out that the Jordan form of Ā equals the one in (6.68) if λi is replaced by λ̄i := e^{λi T} (Problem 3.17). In other words, there exists a nonsingular triangular matrix P such that the transformation x̄ = P x will transform (6.66) into

    x̄[k+1] = P Ā P⁻¹ x̄[k] + P M B u[k]                       (6.70)

with P Ā P⁻¹ in the Jordan form in (6.68) with λi replaced by λ̄i. Now we are ready to establish Theorem 6.9. First we show that M in (6.67) is nonsingular. If A is of the form shown in (6.68), then M is block diagonal and triangular. Its diagonal entries are of the form

    m_ii := ∫₀ᵀ e^{λi t} dt = (e^{λi T} − 1)/λi  if λi ≠ 0,   = T  if λi = 0        (6.71)

Let λi = αi + jβi. The only way for m_ii = 0 is αi = 0 and βi T = 2πm. In this case, −jβi is also an eigenvalue and the theorem requires that 2βi T ≠ 2πm. Thus we conclude m_ii ≠ 0 and M is nonsingular and triangular. If Ā is of the form shown in (6.68), then it is controllable if and only if the third and fourth rows of B̄ are linearly independent and the last row of B̄ is nonzero (Theorem 6.8). Under the condition in Theorem 6.9, the two eigenvalues λ̄1 = e^{λ1 T} and λ̄2 = e^{λ2 T} of Ā are distinct. Thus (6.70) is controllable if and only if the third and fourth rows of P M B are linearly independent and the last row of P M B is nonzero. Because P and M are both triangular and nonsingular, P M B and B have the same properties on the linear independence of their rows. This shows the sufficiency of the theorem. If the condition in Theorem 6.9 is not met, then λ̄1 = λ̄2. In this case, (6.70) is controllable if the third, fourth, and last rows of P M B are linearly independent. This is still possible if B has three or more columns. Thus the condition is not necessary. In the single-input case, if λ̄1 = λ̄2, then (6.70) has two or more Jordan blocks associated with the same eigenvalue and (6.70) is, following Corollary 6.8, not controllable. This establishes the theorem. Q.E.D.

In the proof of Theorem 6.9, we have essentially established the theorem that follows.

Theorem 6.10
If a continuous-time linear time-invariant state equation is not controllable, then its discretized state equation, with any sampling period, is not controllable.

This theorem is intuitively obvious. If a state equation is not controllable using any input, it is certainly not controllable using only piecewise-constant inputs.

EXAMPLE 6.12 Consider the system shown in Fig. 6.11. Its input is sampled every T seconds and then kept constant using a hold circuit. The transfer function of the system is

    ĝ(s) = (s + 2)/(s³ + 3s² + 7s + 5) = (s + 2)/[(s + 1)(s + 1 + j2)(s + 1 − j2)]        (6.72)

Figure 6.11 System with piecewise constant input.

Using (4.41), we can readily obtain the state equation

    ẋ = [ −3 −7 −5 ]     [ 1 ]
        [  1  0  0 ] x + [ 0 ] u                              (6.73)
        [  0  1  0 ]     [ 0 ]

    y = [0 1 2] x

to describe the system. It is a controllable-form realization and is clearly controllable. The eigenvalues of A are −1, −1 ± j2 and are plotted in Fig. 6.11. The three eigenvalues have the same real part; their differences in imaginary parts are 2 and 4. Thus the discretized state equation is controllable if and only if

    T ≠ 2πm/2 = πm   and   T ≠ 2πm/4 = 0.5πm

for m = 1, 2, .... The second condition includes the first condition. Thus we conclude that the discretized equation of (6.73) is controllable if and only if T ≠ 0.5πm for any positive integer m. We use MATLAB to check the result for m = 1 or T = 0.5π. Typing

    a=[-3 -7 -5;1 0 0;0 1 0];b=[1;0;0];
    [ad,bd]=c2d(a,b,pi/2)

yields the discretized state equation as

    x[k+1] = [ −0.1039   0.2079   0.5197 ]        [ −0.1039 ]
             [ −0.1039  −0.4158  −0.5197 ] x[k] + [  0.1039 ] u[k]        (6.74)
             [  0.1039   0.2079   0.3118 ]        [  0.1376 ]

Its controllability matrix can be obtained by typing ctrb(ad,bd), which yields

    Cd = [ −0.1039   0.1039  −0.0045 ]
         [  0.1039  −0.1039   0.0045 ]
         [  0.1376   0.0537   0.0059 ]

Its first two rows are clearly linearly dependent. Thus Cd does not have full row rank and (6.74) is not controllable, as predicted by Theorem 6.9. We mention that if we type rank(ctrb(ad,bd)), the result is 3 and (6.74) is declared controllable. This is incorrect and is due to roundoff errors. We see once again that the rank is very sensitive to roundoff errors. What has been discussed is also applicable to the observability part. In other words, under the conditions in Theorem 6.9, if a continuous-time state equation is observable, its discretized equation is also observable.
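The MATLAB computation in Example 6.12 can be replicated in Python; the sketch below (scipy assumed available) discretizes (6.73) with T = π/2 via the standard augmented-matrix trick and shows both the exact rank deficiency and why a naive rank call is fragile:

```python
import numpy as np
from scipy.linalg import expm

# Continuous-time controllable-form realization (6.73).
A = np.array([[-3.0, -7.0, -5.0],
              [ 1.0,  0.0,  0.0],
              [ 0.0,  1.0,  0.0]])
B = np.array([[1.0], [0.0], [0.0]])
T = np.pi / 2                     # the critical sampling period T = 0.5*pi

# Zero-order-hold discretization: expm of the augmented matrix gives
# [[Ad, Bd], [0, I]] in one shot.
M = np.zeros((4, 4))
M[:3, :3] = A
M[:3, 3:] = B
E = expm(M * T)
Ad, Bd = E[:3, :3], E[:3, 3:]

# Controllability matrix [Bd, Ad Bd, Ad^2 Bd].
Cd = np.hstack([np.linalg.matrix_power(Ad, m) @ Bd for m in range(3)])

# The first two rows of Cd are (up to roundoff) negatives of each other,
# so the exact rank is 2 even though a naive rank call may report 3.
assert np.allclose(Cd[0], -Cd[1], atol=1e-6)
assert np.linalg.matrix_rank(Cd, tol=1e-8) == 2
```

Passing an explicit tolerance to matrix_rank is one way to avoid the roundoff pitfall the text warns about.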
6.8 LTV State Equations

Consider the n-dimensional p-input q-output state equation

    ẋ = A(t) x + B(t) u
    y = C(t) x                                                (6.75)

The state equation is said to be controllable at t0 if there exists a finite t1 > t0 such that for any x(t0) = x0 and any x1, there exists an input that transfers x0 to x1 at time t1. Otherwise the state equation is uncontrollable at t0. In the time-invariant case, if a state equation is controllable, then it is controllable at every t0 and for every t1 > t0; thus there is no need to specify t0 and t1. In the time-varying case, the specification of t0 and t1 is crucial.

Theorem 6.11
The n-dimensional pair (A(t), B(t)) is controllable at time t0 if and only if there exists a finite t1 > t0 such that the n×n matrix

    W_c(t0, t1) = ∫_{t0}^{t1} Φ(t1, τ) B(τ) B'(τ) Φ'(t1, τ) dτ        (6.76)

where Φ(t, τ) is the state transition matrix of ẋ = A(t)x, is nonsingular.

Proof: We first show that if W_c(t0, t1) is nonsingular, then (6.75) is controllable. The response of (6.75) at t1 was computed in (4.57) as

    x(t1) = Φ(t1, t0) x0 + ∫_{t0}^{t1} Φ(t1, τ) B(τ) u(τ) dτ          (6.77)

We claim that the input

    u(t) = −B'(t) Φ'(t1, t) W_c^{−1}(t0, t1) [Φ(t1, t0) x0 − x1]      (6.78)

will transfer x0 at time t0 to x1 at time t1. Indeed, substituting (6.78) into (6.77) yields

    x(t1) = Φ(t1, t0) x0 − ∫_{t0}^{t1} Φ(t1, τ) B(τ) B'(τ) Φ'(t1, τ) dτ · W_c^{−1}(t0, t1) [Φ(t1, t0) x0 − x1]
          = Φ(t1, t0) x0 − W_c(t0, t1) W_c^{−1}(t0, t1) [Φ(t1, t0) x0 − x1] = x1

Thus the equation is controllable at t0. We show the converse by contradiction. Suppose (6.75) is controllable at t0 but W_c(t0, t1) is singular or, equivalently, positive semidefinite for all t1 > t0. Then there exists an n×1 nonzero constant vector v such that

    v' W_c(t0, t1) v = ∫_{t0}^{t1} v' Φ(t1, τ) B(τ) B'(τ) Φ'(t1, τ) v dτ
                     = ∫_{t0}^{t1} ||B'(τ) Φ'(t1, τ) v||² dτ = 0

which implies

    B'(τ) Φ'(t1, τ) v ≡ 0   or   v' Φ(t1, τ) B(τ) ≡ 0                 (6.79)

for all τ in [t0, t1]. If (6.75) is controllable, there exists an input that transfers the initial state x0 = Φ(t0, t1) v at t0 to x(t1) = 0. Then (6.77) becomes

    0 = Φ(t1, t0) Φ(t0, t1) v + ∫_{t0}^{t1} Φ(t1, τ) B(τ) u(τ) dτ      (6.80)

Its premultiplication by v' yields

    0 = v' v + v' ∫_{t0}^{t1} Φ(t1, τ) B(τ) u(τ) dτ = ||v||² + 0

This contradicts the hypothesis v ≠ 0. Thus if (A(t), B(t)) is controllable at t0, W_c(t0, t1) must be nonsingular for some finite t1 > t0. This establishes Theorem 6.11. Q.E.D.

In order to apply Theorem 6.11, we need knowledge of the state transition matrix, which, however, may not be available. Therefore it is desirable to develop a controllability condition without involving Φ(t, τ). This is possible if we have additional conditions on A(t) and B(t). Recall that we have assumed A(t) and B(t) to be continuous. Now we require them to be (n − 1) times continuously differentiable. Define M0(t) = B(t). We then define recursively a sequence of n×p matrices M_m(t) as

    M_{m+1}(t) := −A(t) M_m(t) + (d/dt) M_m(t)                        (6.81)

for m = 0, 1, .... Clearly, we have

    Φ(t2, t) B(t) = Φ(t2, t) M0(t)

for any fixed t2. Using (∂/∂t) Φ(t2, t) = −Φ(t2, t) A(t) (Problem 4.17), we compute

    (∂/∂t) [Φ(t2, t) B(t)] = [(∂/∂t) Φ(t2, t)] B(t) + Φ(t2, t) (d/dt) B(t)
                           = Φ(t2, t) [−A(t) M0(t) + (d/dt) M0(t)] = Φ(t2, t) M1(t)

Proceeding forward, we have

    (∂^m/∂t^m) [Φ(t2, t) B(t)] = Φ(t2, t) M_m(t)                      (6.82)

for m = 0, 1, 2, .... The following theorem is sufficient but not necessary for (6.75) to be controllable.

Theorem 6.12
Let A(t) and B(t) be n − 1 times continuously differentiable. Then the n-dimensional pair (A(t), B(t)) is controllable at t0 if there exists a finite t1 > t0 such that

    rank [M0(t1)  M1(t1)  ⋯  M_{n−1}(t1)] = n                         (6.83)
Proof: We show that if (6.83) holds, then W_c(t0, t) is nonsingular for all t ≥ t1. Suppose not; that is, suppose W_c(t0, t2) is singular or, equivalently, positive semidefinite for some t2 ≥ t1. Then there exists an n×1 nonzero constant vector v such that

    v' W_c(t0, t2) v = ∫_{t0}^{t2} v' Φ(t2, τ) B(τ) B'(τ) Φ'(t2, τ) v dτ
                     = ∫_{t0}^{t2} ||B'(τ) Φ'(t2, τ) v||² dτ = 0      (6.84)

which implies

    B'(τ) Φ'(t2, τ) v ≡ 0   or   v' Φ(t2, τ) B(τ) ≡ 0

for all τ in [t0, t2]. Its differentiations with respect to τ yield, as derived in (6.82),

    v' Φ(t2, τ) M_m(τ) = 0

for m = 0, 1, 2, ..., n − 1, and all τ in [t0, t2], in particular at t1. They can be arranged as

    v' Φ(t2, t1) [M0(t1)  M1(t1)  ⋯  M_{n−1}(t1)] = 0                 (6.85)

Because Φ(t2, t1) is nonsingular, v' Φ(t2, t1) is nonzero. Thus (6.85) contradicts (6.83). Therefore, under the condition in (6.83), W_c(t0, t2), for any t2 ≥ t1, is nonsingular and (A(t), B(t)) is, following Theorem 6.11, controllable at t0. Q.E.D.

EXAMPLE 6.13 Consider the three-dimensional state equation in (6.86) with b(t) = [0 1 1]'. We have M0 = b(t) = [0 1 1]' and compute, from (6.81), M1 = −A(t)M0 + (d/dt)M0 and M2 = −A(t)M1 + (d/dt)M1. The determinant of the matrix [M0(t)  M1(t)  M2(t)] is t² + 1, which is nonzero for all t. Thus the state equation in (6.86) is controllable at every t.

EXAMPLE 6.14 Consider the time-invariant state equation in (6.87) and the time-varying state equation

    ẋ = [ −1  0 ]     [ e^{−t}  ]
        [  0 −2 ] x + [ e^{−2t} ] u                           (6.88)

Equation (6.87) is a time-invariant equation and is controllable according to Corollary 6.8. Equation (6.88) is a time-varying equation; the two entries of its B-matrix are nonzero for all t, and one might be tempted to conclude that (6.88) is controllable. Let us check this by using Theorem 6.11. Its state transition matrix is

    Φ(t, t0) = [ e^{−(t−t0)}      0         ]
               [ 0             e^{−2(t−t0)} ]

We compute

    W_c(t0, t) = ∫_{t0}^{t} [ e^{−t}  ] [ e^{−t}  e^{−2t} ] dτ = (t − t0) [ e^{−2t}  e^{−3t} ]
                            [ e^{−2t} ]                                   [ e^{−3t}  e^{−4t} ]

Its determinant is identically zero for all t0 and t. Thus (6.88) is not controllable at any t0. From this example we see that, in applying a theorem, every condition should be checked carefully; otherwise we might obtain an erroneous conclusion.

We now discuss the observability part. The linear time-varying state equation in (6.75) is observable at t0 if there exists a finite t1 such that for any state x(t0) = x0, the knowledge of the input and output over the time interval [t0, t1] suffices to determine uniquely the initial state x0. Otherwise, the state equation is said to be unobservable at t0.
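Theorem 6.12's matrices can also expose the defect in (6.88) without computing Φ: for A = diag(−1, −2) and B(t) = [e^{−t}, e^{−2t}]', the matrix M1 vanishes identically, so [M0 M1] has rank 1 for every t. Since Theorem 6.12 is only sufficient, this alone does not prove uncontrollability (the Gramian computation in Example 6.14 does), but it is a useful warning sign. A small numeric spot check, with the derivative of B written out analytically:

```python
import numpy as np

A = np.array([[-1.0,  0.0],
              [ 0.0, -2.0]])

def B(t):
    return np.array([[np.exp(-t)], [np.exp(-2.0 * t)]])

def dB(t):
    # analytic derivative of B(t)
    return np.array([[-np.exp(-t)], [-2.0 * np.exp(-2.0 * t)]])

for t in (0.0, 1.0, 5.0):
    M0 = B(t)
    M1 = -A @ M0 + dB(t)        # recursion (6.81)
    # Each entry of B decays exactly like the corresponding mode of A,
    # so differentiation cancels -A*M0 and M1 vanishes.
    assert np.allclose(M1, 0.0)
    assert np.linalg.matrix_rank(np.hstack([M0, M1])) == 1
```

The rank never reaches n = 2 at any t, consistent with the conclusion that (6.88) is not controllable at any t0.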
Theorem 6.O11
The pair (A(t), C(t)) is observable at time t0 if and only if there exists a finite t1 > t0 such that the n×n matrix

    W_o(t0, t1) = ∫_{t0}^{t1} Φ'(τ, t0) C'(τ) C(τ) Φ(τ, t0) dτ        (6.89)

where Φ(t, τ) is the state transition matrix of ẋ = A(t)x, is nonsingular.

Theorem 6.O12
Let A(t) and C(t) be n − 1 times continuously differentiable. Then the n-dimensional pair (A(t), C(t)) is observable at t0 if there exists a finite t1 > t0 such that

    rank [ N0(t1); N1(t1); ⋯; N_{n−1}(t1) ] = n                       (6.90)

where N0(t) = C(t) and

    N_{m+1}(t) := N_m(t) A(t) + (d/dt) N_m(t),   m = 0, 1, ..., n − 1

We mention that the duality theorem in Theorem 6.5 for time-invariant systems is not applicable to time-varying systems. It must be modified. See Problems 6.22 and 6.23.

Problems

6.1 Is the state equation

    ẋ = [ ⋯ ] x + [ ⋯ ] u      y = [1 2 1] x

controllable? Observable?

6.2 Is the state equation

    ẋ = [ ⋯ ] x + [ ⋯ ] u      y = [0 1 1 1 0 1] x

controllable? Observable?

6.3 Is it true that the rank of [B  AB  ⋯  A^{n−1}B] equals the rank of [AB  A²B  ⋯  A^{n}B]? If not, under what condition will it be true?

6.4 Show that the state equation

    ẋ = [ A11  A12 ] x + [ B1 ] u
        [ A21  A22 ]     [ 0  ]

is controllable if and only if the pair (A22, A21) is controllable.

6.5 Find a state equation to describe the network shown in Fig. 6.1, and then check its controllability and observability.

6.6 Find the controllability index and observability index of the state equations in Problems 6.1 and 6.2.

6.7 What is the controllability index of the state equation ẋ = Ax + Iu, where I is the unit matrix?

6.8 Reduce the state equation

    ẋ = [ ⋯ ] x + [ ⋯ ] u      y = [1 1] x

to a controllable one. Is the reduced equation observable?

6.9 Reduce the state equation in Problem 6.5 to a controllable and observable equation.

6.10 Reduce the state equation

    ẋ = [ ⋯ ] x + [ ⋯ ] u

to a controllable equation.

6.11 Consider the n-dimensional state equation

    ẋ = Ax + Bu
    y = Cx + Du

The rank of its controllability matrix is assumed to be n1 < n. Let Q1 be an n×n1 matrix whose columns are any n1 linearly independent columns of the controllability matrix. Let P1 be an n1×n matrix such that P1Q1 = I_{n1}, where I_{n1} is the unit matrix of order n1. Show that the following n1-dimensional state equation

    ẋ1 = P1 A Q1 x1 + P1 B u
    y = C Q1 x1 + D u

is controllable and has the same transfer matrix as the original state equation.

6.12 In Problem 6.11, the reduction procedure reduces to solving for P1 in P1Q1 = I_{n1}. How do you solve P1?
6.13 Develop a similar statement as in Problem 6.11 for an unobservable state equation.

6.14 Is the Jordan-form state equation

    ẋ = [ ⋯ ] x + [ ⋯ ] u      y = [ ⋯ ] x

controllable and observable?

6.15 Is it possible to find a set of bij and a set of cij such that the state equation

    ẋ = [ 0 1 0 0 0 ]     [ b11 b12 ]
        [ 0 0 1 0 0 ]     [ b21 b22 ]
        [ 0 0 0 0 0 ] x + [ b31 b32 ] u
        [ 0 0 0 0 1 ]     [ b41 b42 ]
        [ 0 0 0 0 0 ]     [ b51 b52 ]

    y = [ c11 c12 c13 c14 c15 ]
        [ c21 c22 c23 c24 c25 ] x
        [ c31 c32 c33 c34 c35 ]

is controllable? Observable?

6.16 Consider the state equation

    ẋ = [ λ1   0    0    0    0  ]     [ b1  ]
        [ 0    α1   β1   0    0  ]     [ b11 ]
        [ 0   −β1   α1   0    0  ] x + [ b12 ] u
        [ 0    0    0    α2   β2 ]     [ b21 ]
        [ 0    0    0   −β2   α2 ]     [ b22 ]

    y = [ c1  c11  c12  c21  c22 ] x

It is the modal form discussed in (4.28). It has one real eigenvalue and two pairs of complex conjugate eigenvalues; it is assumed that they are distinct. Show that the state equation is controllable if and only if b1 ≠ 0 and bi1 ≠ 0 or bi2 ≠ 0 for i = 1, 2. It is observable if and only if c1 ≠ 0 and ci1 ≠ 0 or ci2 ≠ 0 for i = 1, 2.

6.17 Find two- and three-dimensional state equations to describe the network shown in Fig. 6.12. Discuss their controllability and observability.

Figure 6.12

6.18 Check controllability and observability of the state equation obtained in Problem 2.19. Can you give a physical interpretation directly from the network?

6.19 Consider the continuous-time state equation in Problem 4.2 and its discretized equations in Problem 4.3 with sampling period T = 1 and T = π. Discuss controllability and observability of the discretized equations.

6.20 Check controllability and observability of

    ẋ = [ ⋯ ] x + [ ⋯ ] u      y = [0 1] x

6.21 Check controllability and observability of

    ẋ = [ ⋯ ] x + [ ⋯ ] u

6.22 Show that (A(t), B(t)) is controllable at t0 if and only if (−A'(t), B'(t)) is observable at t0.

6.23 For time-invariant systems, show that (A, B) is controllable if and only if (−A, B) is controllable. Is this true for time-varying systems?