SYSTEM IDENTIFICATION: Theory for the User

Lennart Ljung
University of Linköping, Sweden

P T R Prentice Hall, Englewood Cliffs, New Jersey 07632

Library of Congress Cataloging-in-Publication Data
Ljung, Lennart. (date)
System identification.
Bibliography: p.
Includes index.
1. System identification. I. Title.
QA402.L59 1987 003 86-2712
ISBN 0-13-881640-9

Editorial/production supervision and interior design: Gloria Jordan
Cover design: Ben Santora
Manufacturing buyer: S. Gordon Osbourne

© 1987 by P T R Prentice Hall
Prentice-Hall, Inc.
A Division of Simon & Schuster
Englewood Cliffs, New Jersey 07632

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America
10 9 8 7

ISBN 0-13-881640-9

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Prentice-Hall of Southeast Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

CONTENTS

PREFACE
ACKNOWLEDGMENTS
OPERATORS AND NOTATIONAL CONVENTIONS

1. INTRODUCTION
1.1 Dynamical Systems
1.2 Models
1.3 The System Identification Procedure
1.4 Organization of the Book
1.5 Bibliography

PART I: SYSTEMS AND MODELS

2. TIME-INVARIANT LINEAR SYSTEMS
2.1 Impulse Responses, Disturbances and Transfer Functions
2.2 Frequency-domain Expressions
2.3 Signal Spectra
2.4 Single Realization Behavior and Ergodicity Results (*)
2.5 Multivariable Systems (*)
2.6 Summary
2.7 Bibliography
2.8 Problems
Appendix 2.A: Proof of Theorem 2.2
Appendix 2.B: Proof of Theorem 2.3
Appendix 2.C: Covariance Formulas

3. SIMULATION, PREDICTION, AND CONTROL
3.1 Simulation
3.2 Prediction
3.3 Observers
3.4 Control (*)
3.5 Summary
3.6 Bibliography
3.7 Problems

4. MODELS OF LINEAR TIME-INVARIANT SYSTEMS
4.1 Linear Models and Sets of Linear Models
4.2 A Family of Transfer-function Models
4.3 State-space Models
4.4 Distributed-Parameter Models (*)
4.5 Model Sets, Model Structures, and Identifiability: Some Formal Aspects (*)
4.6 Identifiability of Some Model Structures
4.7 Summary
4.8 Bibliography
4.9 Problems
Appendix 4.A: Identifiability of Black-box Multivariable Model Structures

5. MODELS FOR TIME-VARYING AND NONLINEAR SYSTEMS
5.1 Linear Time-varying Models
5.2 Nonlinear Models as Linear Regressions
5.3 Nonlinear State-space Models
5.4 Formal Characterization of Models (*)
5.5 Summary
5.6 Bibliography
5.7 Problems

PART II: METHODS

6. NONPARAMETRIC TIME- AND FREQUENCY-DOMAIN METHODS
6.1 Transient Response Analysis and Correlation Analysis
6.2 Frequency-response Analysis
6.3 Fourier Analysis
6.4 Spectral Analysis
6.5 Estimating the Disturbance Spectrum (*)
6.6 Summary
6.7 Bibliography
6.8 Problems
Appendix 6.A: Derivation of the Asymptotic Properties of the Spectral Analysis Estimate

7. PARAMETER ESTIMATION METHODS
7.1 Guiding Principles behind Parameter Estimation Methods
7.2 Minimizing Prediction Errors
7.3 Linear Regressions and the Least-squares Method
7.4 A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
7.5 Correlating Prediction Errors with Past Data
7.6 Instrumental-variable Methods
7.7 Summary
7.8 Bibliography
7.9 Problems
Appendix 7.A: Proof of the Cramér-Rao Inequality

8. CONVERGENCE AND CONSISTENCY
8.1 Introduction
8.2 Conditions on the Data Set
8.3 Prediction-error Approach
8.4 Consistency and Identifiability
8.5 Linear Time-invariant Models: A Frequency-domain Description of the Limit Model
8.6 The Correlation Approach
8.7 Summary
8.8 Bibliography
8.9 Problems

9. ASYMPTOTIC DISTRIBUTION OF PARAMETER ESTIMATES
9.1 Introduction
9.2 The Prediction-error Approach: Basic Theorem
9.3 Expressions for the Asymptotic Variance
9.4 Frequency-domain Expressions for the Asymptotic Variance
9.5 The Correlation Approach
9.6 Use and Relevance of Asymptotic Variance Expressions
9.7 Summary
9.8 Bibliography
9.9 Problems
Appendix 9.A: Proof of Theorem 9.1
Appendix 9.B: The Asymptotic Parameter Variance

10. COMPUTING THE ESTIMATE
10.1 Linear Regressions and Least Squares
10.2 Numerical Solution by Iterative Search Methods
10.3 Computing Gradients
10.4 Two-stage and Multistage Methods
10.5 Local Solutions and Initial Values
10.6 Summary
10.7 Bibliography
10.8 Problems

11. RECURSIVE ESTIMATION METHODS
11.1 Introduction
11.2 The Recursive Least-squares Algorithm
11.3 The Recursive IV Method
11.4 Recursive Prediction-Error Methods
11.5 Recursive Pseudolinear Regressions
11.6 The Choice of Updating Step
11.7 Implementation
11.8 Summary
11.9 Bibliography
11.10 Problems
Appendix 11.A: Techniques for Asymptotic Analysis of Recursive Algorithms

PART III: USER'S CHOICES

12. OPTIONS AND OBJECTIVES
12.1 Options
12.2 Objectives
12.3 Bias and Variance
12.4 Summary
12.5 Bibliography
12.6 Problems

13. AFFECTING THE BIAS DISTRIBUTION OF TRANSFER-FUNCTION ESTIMATES
13.1 Some Basic Expressions
13.2 Heuristic Discussion of Transfer-function Fit in Open-loop Operation
13.3 Some Solutions to Formal Design Problems
13.4 Summary
13.5 Bibliography
13.6 Problems

14. EXPERIMENT DESIGN
14.1 Some General Considerations
14.2 Informative Experiments
14.3 Optimal Input Design (*)
14.4 Optimal Experiment Design for High-order Black-box Models (*)
14.5 Choice of Sampling Interval and Presampling Filters
14.6 Pretreatment of Data
14.7 Summary
14.8 Bibliography
14.9 Problems

15. CHOICE OF IDENTIFICATION CRITERION
15.1 General Aspects
15.2 Choice of Norm: Robustness
15.3 Variance: Optimal Instruments
15.4 Summary
15.5 Bibliography
15.6 Problems

16. MODEL STRUCTURE SELECTION AND MODEL VALIDATION
16.1 General Aspects of the Choice of Model Structure
16.2 A Priori Considerations
16.3 Model Structure Selection Based on Preliminary Data Analysis
16.4 Comparing Model Structures
16.5 Model Validation
16.6 Summary
16.7 Bibliography
16.8 Problems

17. SYSTEM IDENTIFICATION IN PRACTICE
17.1 The Tool: Interactive Software
17.2 A Laboratory-scale Application
17.3 Identification of Ship-steering Dynamics
17.4 What Does System Identification Have to Offer?
17.5 Bibliography

APPENDIX I: Some Concepts from Probability Theory
APPENDIX II: Some Statistical Techniques for Linear Regressions
REFERENCES
AUTHOR INDEX
SUBJECT INDEX

PREFACE

System identification is a diverse field that can be presented in many different ways. The subtitle, Theory for the User, reflects the attitude of the present treatment. Yes, the book is about theory, but the focus is on theory that has direct consequences for the understanding and practical use of available techniques. My goal has been to give the reader a firm grip on basic principles so that he or she can confidently approach a practical problem, as well as the rich and sometimes confusing literature on the subject.

Stressing the utilitarian aspect of theory should not, I believe, be taken as an excuse for sloppy mathematics. Therefore, I have tried to develop the theory without cheating. The more technical parts have, however, been placed in appendixes or in asterisk-marked sections, so that the reluctant reader does not have to stumble through them.
In fact, it is a redeeming feature of life that we are able to use many things without understanding every detail of them. This is true also of the theory of system identification. The practitioner who is looking for some quick advice should thus be able to proceed rapidly to Part III (User's Choices) by hopping through the summary sections of the earlier chapters.

The core material of the book should be suitable for a graduate-level course in system identification. As a prerequisite for such a course, it is natural, although not absolutely necessary, to require that the student be somewhat familiar with dynamical systems and stochastic signals. The manuscript has been used as a text for system identification courses at Stanford University, the Massachusetts Institute of Technology, Yale University, the Australian National University, and the Universities of Lund and Linköping. Course outlines, as well as a solutions manual for the problems, are available from the publisher.

For a course on system identification, the role of computer-based exercises should be stressed. Simulation sessions demonstrating how hidden properties of data are readily recovered by the techniques discussed in the book enhance the understanding and motivation of the material. In the problems labeled S in Chapters 2 through 16, a basic interactive software package is outlined that should be possible to implement rather painlessly in a high-level environment. A PC-MATLAB version of this package is commercially available (see Ljung, 1986b). With such a package all basic techniques of this book can be illustrated and tested on real and simulated data.

The existing literature on system identification is indeed extensive and virtually impossible to cover in a bibliography.
In this book I have tried to concentrate on recent and easily available references that I think are suitable for further study, as well as on some earlier works that reflect the roots of various techniques and results. Clearly, many other relevant references have been omitted.

Finally, some words about the structure of this book: The dependence among the different chapters is illustrated in Figure 1.13, which shows that some chapters are not necessary prerequisites for the following ones. Also, some portions contain material that is directed more toward the serious student of identification theory than to the user. These portions are put either in appendixes or in sections and subsections marked with an asterisk (*). While occasional references to this material may be encountered, it is safe to regard it as optional reading; the continuity will not be impaired if it is skipped.

The problem sections for each chapter have been organized into six groups of different problem types:

- G problems: These could be of General interest, and it may be worthwhile to browse through them, even without intending to solve them.
- E problems: These are regular pencil-and-paper Exercises to check the basic techniques of the chapter.
- T problems: These are Theoretically oriented problems, typically more difficult than the E problems.
- D problems: In these problems the reader is asked to fill in technical Details that were glossed over in the text (a way to dump straightforward technicalities from the book into the solutions manual!).
- S problems: These develop the basic identification Software package mentioned earlier.
- C problems: These require a Computer.

Clearly, with the software package at hand, the C problems can be complemented with a myriad of problems experimenting with identification methods and data. Such problems are not specifically listed, but the reader is encouraged to apply those techniques in an exploratory fashion.
ACKNOWLEDGMENTS

Any author of a technical book is indebted to the people who taught him the subject and to the people who made the writing possible. My interest in system identification goes back to my years as a graduate student at the Automatic Control Department in Lund. Professor Karl Johan Åström introduced me to the subject, and his serious attitude to research has always been a reference model for me.

Since then I have worked with many other people who added to my knowledge of the subject. I thank, therefore, my previous coauthors (in alphabetical order) Anders Ahlén, Peter Caines, David Falconer, Farhat Fnaiech, Ben Friedlander, Michel Gevers, Keith Glover, Ivar Gustavsson, Tom Kailath, Stefan Ljung, Martin Morf, Ton van Overbeek, Jorma Rissanen, Torsten Söderström, Göte Solbrand, Eva Trulsson, Bo Wahlberg, Don Wiberg, and Zhen-Dong Yuan.

The book has developed from numerous seminars and several short courses that I have given on the subject worldwide. Comments from the seminar participants have been instrumental in my search for a suitable structure and framework for presenting the topic.

Several persons have read and used the manuscript in its various versions and given me new insights. First, I would like to mention: Michel Gevers, who taught from an early version and gave me invaluable help in revising the text; Robert Kosut and Arye Nehorai, who taught from the manuscript at Stanford and Yale, respectively; and Jan Holst, who led a discussion group with it at Denmark's Technical University and also gathered helpful remarks. I co-taught the course at MIT with Fred Schweppe, and his lectures, as well as his comments, led to many clarifying changes in the manuscript. Students in various courses also provided many useful comments. I mention in particular George Hart, Juan Lavalle, Ivan Mareels, Brett Ridgely, and Bo Wahlberg. Several colleagues were also kind enough to critique the manuscript.
I am especially grateful to Hiro Akaike, Chris Byrnes, Peter Falb, Meir Feder, Gene Franklin, Claes Källström, David Ruppert, Torsten Söderström, Petre Stoica, and Peter Whittle. Svante Gunnarsson and Stan Granath made the experiments described in Section 17.2, Bo Wahlberg contributed to the frequency-domain interpretations, and Alf Isaksson prepared Figure 14.4.

The preparation of the manuscript's many versions was impeccably coordinated and, to a large extent, also carried out by Ingegerd Stenlund. She had useful help from Ulla Salaneck and Karin Lönn. Marianne Anse-Lundberg expertly prepared all the illustrations. I deeply appreciate all their efforts.

Writing a book takes time, and I probably would not have been able to finish this one had I not had the privilege of sabbatical semesters. The first outline of this book was written during a sabbatical leave at Stanford University in 1980-1981. I wrote a first version of what turned out to be the last edition during a mini-sabbatical visit to the Australian National University in Canberra in 1984. The writing was completed during 1985-1986, the year I spent at MIT. I thank Tom Kailath, Brian Anderson, and Sanjoy Mitter (and the U.S. Army Research Office under contract DAAG-29-84-K-005) for making these visits possible and for providing inspiring working conditions. My support from the Swedish National Board for Technical Development (STU) has also been important.

Sabbatical or not, it was unavoidable that a lot of the writing (not to mention the thinking!) of the book had to be done on overtime. I thank my family, Ann-Kristin, Johan, and Arvid, for letting me use their time.
Lennart Ljung
Linköping, Sweden

OPERATORS AND NOTATIONAL CONVENTIONS

arg(z) = argument of the complex number z
arg min f(x) = value of x that minimizes f(x)
x_N ∈ AsF(n, m): the sequence of random variables x_N converges in distribution to the F-distribution with n and m degrees of freedom
x_N ∈ AsN(m, P): the sequence of random variables x_N converges in distribution to the normal distribution with mean m and covariance matrix P; see (1.17)
x_N ∈ Asχ²(n): the sequence of random variables x_N converges in distribution to the χ² distribution with n degrees of freedom
Cov(x) = covariance matrix of the random vector x; see (1.4)
det A = determinant of the matrix A
dim θ = dimension (number of rows) of the column vector θ
E x = mathematical expectation of the random vector x; see (1.3)
Ē x(t) = lim_{N→∞} (1/N) Σ_{t=1}^{N} E x(t); see (2.60)
O(x) = ordo x: function tending to zero at the same rate as x
o(x) = small ordo x: function tending to zero faster than x
x ∈ N(m, P): the random variable x is normally distributed with mean m and covariance matrix P; see (1.6)
Re z = real part of the complex number z
R(f) = range of the function f = the set of values that f(x) may assume
R^d = Euclidean d-dimensional space
x = sol{f(x) = 0}: x is the solution (or set of solutions) to the equation f(x) = 0
tr(A) = trace (the sum of the diagonal elements) of the matrix A
Var(x) = variance of the random variable x
A⁻¹ = inverse of the matrix A
Aᵀ = transpose of the matrix A
A⁻ᵀ = transpose of the inverse of the matrix A
z̄ = complex conjugate of the complex number z (superscript * is not used to denote transpose and complex conjugate; it is used only as a distinguishing superscript)
yₛᵗ = {y(s + 1), ..., y(t)}
yᵗ = {y(1), y(2), ..., y(t)}
U_N(ω) = Fourier transform of u; see (2.37)
R_v(τ) = Ē v(t) v(t − τ); see (2.61)
R_sw(τ) = Ē s(t) wᵀ(t − τ); see (2.62)
Φ_v(ω) = spectrum of v = Fourier transform of R_v(τ); see (2.63)
Φ_sw(ω) = cross spectrum between s and w = Fourier transform of R_sw(τ); see (2.64)
R̂ₛᴺ(τ) = (1/N) Σ_{t=1}^{N} s(t) sᵀ(t − τ); see (6.10)
Φ̂ᵤᴺ(ω) = estimate of the spectrum of u based on uᴺ; see (6.48)
v̂(t | t − 1) = prediction of v(t) based on v^{t−1}
(d/dθ) V(θ) = gradient of V(θ) with respect to θ: a column vector of dimension dim θ if V is scalar valued
V′(θ) = gradient of V with respect to its argument
ℓ′_ε(ε, θ) = partial derivative of ℓ with respect to ε
δ_ij = Kronecker's delta: zero unless i = j
δ(k) = δ_{k,0}
B(θ*, ε) = ε-neighborhood of θ*: {θ : |θ − θ*| < ε}
≜ : the left side is defined by the right side
|·| = (Euclidean) norm of a vector
‖·‖ = (Frobenius) norm of a matrix; see (2.89)
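As a brief illustration (not part of the book's text), two of the quantities defined above, the sample covariance R̂ₛᴺ(τ) of (6.10) and the N-point Fourier transform U_N(ω) of (2.37), can be sketched in Python for a scalar signal. The function names and the 1/√N normalization of the transform are assumptions chosen to match the definitions listed here, evaluated at the grid frequencies ω_k = 2πk/N.

```python
import numpy as np

def sample_covariance(s, tau):
    """R^N_s(tau) = (1/N) * sum_t s(t) s(t - tau) for a scalar signal s,
    cf. (6.10); terms with t - tau outside the record are taken as zero."""
    s = np.asarray(s, dtype=float)
    N = len(s)
    tau = abs(tau)  # covariance of a scalar signal is symmetric in tau
    return np.dot(s[tau:], s[:N - tau]) / N

def fourier_transform(s):
    """U_N(omega_k) = (1/sqrt(N)) * sum_t s(t) exp(-i omega_k t),
    cf. (2.37), evaluated at omega_k = 2*pi*k/N via the FFT."""
    s = np.asarray(s, dtype=float)
    return np.fft.fft(s) / np.sqrt(len(s))

# With this normalization, sum_k |U_N(omega_k)|^2 equals sum_t s(t)^2,
# which ties the periodogram to the covariance at lag zero.
s = [1.0, 2.0, 3.0, 4.0]
print(sample_covariance(s, 0))                         # (1+4+9+16)/4 = 7.5
print(np.sum(np.abs(fourier_transform(s)) ** 2))       # 30.0, equal to sum of squares
```

The hat quantities in the notation list (e.g. Φ̂ᵤᴺ) are smoothed versions of such raw estimates; the smoothing itself is developed in Chapter 6.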
