by Jan Brinkhuis and Vladimir Tikhomirov

Length: 680 pages, 10 hours

This self-contained textbook is an informal introduction to optimization through the use of numerous illustrations and applications. The focus is on analytically solving optimization problems with a finite number of continuous variables. In addition, the authors provide introductions to classical and modern numerical methods of optimization and to dynamic optimization.

The book's overarching point is that most problems may be solved by the direct application of the theorems of Fermat, Lagrange, and Weierstrass. The authors show how the intuition for each of the theoretical results can be supported by simple geometric figures. They include numerous applications through the use of varied classical and practical problems. Even experts may find some of these applications truly surprising.

A basic mathematical knowledge is sufficient to understand the topics covered in this book. More advanced readers, even experts, will be surprised to see how all main results can be grounded on the Fermat-Lagrange theorem. The book can be used for courses on continuous optimization, from introductory to advanced, for any field for which optimization is relevant.

Publisher: Princeton University Press

Released: Feb 11, 2011

ISBN: 9781400829361

Format: book

EDITORS

Ingrid Daubechies, *Princeton University *

Weinan E, *Princeton University *

Jan Karel Lenstra, *Eindhoven University *

Endre Süli, *University of Oxford *

*Chaotic Transitions in Deterministic and Stochastic Dynamical Systems: Applications of Melnikov Processes in Engineering, Physics, and Neuroscience *by Emil Simiu

*Selfsimilar Processes *by Paul Embrechts and Makoto Maejima

*Self-Regularity: A New Paradigm for Primal-Dual Interior-Point Algorithms *by Jiming Peng, Cornelis Roos, and Tamás Terlaky

*Analytic Theory of Global Bifurcation: An Introduction *by Boris Buffoni and John Toland

*Entropy *by Andreas Greven, Gerhard Keller, and Gerald Warnecke

*Auxiliary Signal Design for Failure Detection *by Stephen L. Campbell and Ramine Nikoukhah

*Thermodynamics: A Dynamical Systems Approach *by Wassim M. Haddad, VijaySekhar Chellaboina, and Sergey G. Nersesov

*Optimization: Insights and Applications *by Jan Brinkhuis and Vladimir Tikhomirov

The Princeton Series in Applied Mathematics publishes high quality advanced texts and monographs in all areas of applied mathematics. Books include those of a theoretical and general nature as well as those dealing with the mathematics of specific applications areas and real-world situations.

Optimization: Insights and Applications

Jan Brinkhuis

Vladimir Tikhomirov

PRINCETON UNIVERSITY PRESS

PRINCETON AND OXFORD

Copyright © 2005 by Princeton University Press

Published by Princeton University Press,

41 William Street, Princeton, New Jersey 08540

In the United Kingdom: Princeton University Press,

3 Market Place, Woodstock, Oxfordshire OX20 1SY

All Rights Reserved

Library of Congress Cataloging-in-Publication Data

Brinkhuis, Jan.

Optimization : insights and applications / Jan Brinkhuis, Vladimir Tikhomirov.

p. cm. — (Princeton series in applied mathematics)

Includes bibliographical references and index.

ISBN-13 : 978-0-691-10287-0 (alk. paper)

ISBN-10 : 0-691-10287-2 (alk. paper)

1. Econometrics — Mathematical models. I. Tikhomirov, V. M. (Vladimir Mikhailovich), 1934- II. Title. III. Series.

HB141.B743 2005

330′.01′5195—dc22 2005046598

British Library Cataloging-in-Publication Data is available

The publisher would like to acknowledge the authors of this volume for providing the camera-ready copy from which this book was printed.

Printed on acid-free paper.

pup.princeton.edu

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

Je veux parler du rêve. Du rêve, et des visions qu’il nous souffle - impalpables comme lui d’abord, et réticentes souvent à prendre forme. […] Quand le travail est achevé, ou telle partie de travail, nous en présentons le résultat tangible sous la lumière la plus vive que nous pouvons trouver, nous nous en réjouissons, […] Une vision, sans nom et sans contours d’abord, tenue comme un lambeau de brumes, a guidé notre main et nous a maintenus penchés sur l’ouvrage, sans sentir passer les heures.

I want to talk about the dream. About the dream, and the visions that it whispers to us, intangible like itself at first, and often reluctant to take shape. […] When the work is finished, or a certain part of the work, then we present the tangible result of it under the most lively light that we can find, we enjoy ourselves about it, […] A vision, without name and without shape at first, tenuous as a shred of mist, has guided our hand and has kept us bent over the work, without feeling the hours.

**Preface **

**0.1 Optimization: insights and applications **

**0.2 Lunch, dinner, and dessert **

**0.3 For whom is this book meant? **

**0.4 What is in this book? **

**0.5 Special features **

**Necessary Conditions: What Is the Point? **

**Chapter 1. Fermat: One Variable without Constraints **

**1.0 Summary **

**1.1 Introduction **

**1.2 The derivative for one variable **

**1.3 Main result: Fermat theorem for one variable **

**1.4 Applications to concrete problems **

**1.5 Discussion and comments **

**1.6 Exercises **

**Chapter 2. Fermat: Two or More Variables without Constraints **

**2.0 Summary **

**2.1 Introduction **

**2.2 The derivative for two or more variables **

**2.3 Main result: Fermat theorem for two or more variables **

**2.4 Applications to concrete problems **

**2.5 Discussion and comments **

**2.6 Exercises **

**Chapter 3. Lagrange: Equality Constraints **

**3.0 Summary **

**3.1 Introduction **

**3.2 Main result: Lagrange multiplier rule **

**3.3 Applications to concrete problems **

**3.4 Proof of the Lagrange multiplier rule **

**3.5 Discussion and comments **

**3.6 Exercises **

**Chapter 4. Inequality Constraints and Convexity **

**4.0 Summary **

**4.1 Introduction **

**4.2 Main result: Karush-Kuhn-Tucker theorem **

**4.3 Applications to concrete problems **

**4.4 Proof of the Karush-Kuhn-Tucker theorem **

**4.5 Discussion and comments **

**4.6 Exercises **

**Chapter 5. Second Order Conditions **

**5.0 Summary **

**5.1 Introduction **

**5.2 Main result: second order conditions **

**5.3 Applications to concrete problems **

**5.4 Discussion and comments **

**5.5 Exercises **

**Chapter 6. Basic Algorithms **

**6.0 Summary **

**6.1 Introduction **

**6.2 Nonlinear optimization is difficult **

**6.3 Main methods of linear optimization **

**6.4 Line search **

**6.5 Direction of descent **

**6.6 Quality of approximation **

**6.7 Center of gravity method **

**6.8 Ellipsoid method **

**6.9 Interior point methods **

**Chapter 7. Advanced Algorithms **

**7.1 Introduction **

**7.2 Conjugate gradient method **

**7.3 Self-concordant barrier methods **

**Chapter 8. Economic Applications **

**8.1 Why you should not sell your house to the highest bidder **

**8.2 Optimal speed of ships and the cube law **

**8.3 Optimal discounts on airline tickets with a Saturday stayover **

**8.4 Prediction of flows of cargo **

**8.5 Nash bargaining **

**8.6 Arbitrage-free bounds for prices **

**8.7 Fair price for options: formula of Black and Scholes **

**8.8 Absence of arbitrage and existence of a martingale **

**8.9 How to take a penalty kick, and the minimax theorem **

**8.10 The best lunch and the second welfare theorem **

**Chapter 9. Mathematical Applications **

**9.1 Fun and the quest for the essence **

**9.2 Optimization approach to matrices **

**9.3 How to prove results on linear inequalities **

**9.4 The problem of Apollonius **

**9.5 Minimization of a quadratic function: Sylvester’s criterion and Gram’s formula **

**9.6 Polynomials of least deviation **

**9.7 Bernstein inequality **

**Chapter 10. Mixed Smooth-Convex Problems **

**10.1 Introduction **

**10.2 Constraints given by inclusion in a cone **

**10.3 Main result: necessary conditions for mixed smooth-convex problems **

**10.4 Proof of the necessary conditions **

**10.5 Discussion and comments **

**Chapter 11. Dynamic Programming in Discrete Time **

**11.0 Summary **

**11.1 Introduction **

**11.2 Main result: Hamilton-Jacobi-Bellman equation **

**11.3 Applications to concrete problems **

**11.4 Exercises **

**Chapter 12. Dynamic Optimization in Continuous Time **

**12.1 Introduction **

**12.2 Main results: necessary conditions of Euler, Lagrange, Pontryagin, and Bellman **

**12.3 Applications to concrete problems **

**12.4 Discussion and comments **

**Appendix A. On Linear Algebra: Vector and Matrix Calculus **

**A.1 Introduction **

**A.2 Zero-sweeping or Gaussian elimination, and a formula for the dimension of the solution set **

**A.3 Cramer’s rule **

**A.4 Solution using the inverse matrix **

**A.5 Symmetric matrices **

**A.6 Matrices of maximal rank **

**A.7 Vector notation **

**A.8 Coordinate free approach to vectors and matrices **

**Appendix B. On Real Analysis **

**B.1 Completeness of the real numbers **

**B.2 Calculus of differentiation **

**B.3 Convexity **

**B.4 Differentiation and integration **

**Appendix C. The Weierstrass Theorem on Existence of Global Solutions **

**C.1 On the use of the Weierstrass theorem **

**C.2 Derivation of the Weierstrass theorem **

**Appendix D. Crash Course on Problem Solving **

**D.1 One variable without constraints **

**D.2 Several variables without constraints **

**D.3 Several variables under equality constraints **

**D.4 Inequality constraints and convexity **

**Appendix E. Crash Course on Optimization Theory: Geometrical Style **

**E.1 The main points **

**E.2 Unconstrained problems **

**E.3 Convex problems **

**E.4 Equality constraints **

**E.5 Inequality constraints **

**E.6 Transition to infinitely many variables **

**Appendix F. Crash Course on Optimization Theory: Analytical Style **

**F.1 Problem types **

**F.2 Definitions of differentiability **

**F.3 Main theorems of differential and convex calculus **

**F.4 Conditions that are necessary and/or sufficient **

**F.5 Proofs **

**Appendix G. Conditions of Extremum from Fermat to Pontryagin **

**G.1 Necessary first order conditions from Fermat to Pontryagin **

**G.2 Conditions of extremum of the second order **

**Appendix H. Solutions of Exercises of Chapters 1–4 **

**Bibliography **

**Index **

Our first aim has been to write an interesting book, […] we can hardly have failed completely, the subject-matter being so attractive that only extravagant incompetence could make it dull. […] it does not demand any great mathematical knowledge or technique.

Das vorliegende Buch, aus Vorlesungen entstanden, die ich mehrfach in […] gehalten habe, setzt sich zum Ziel, den Leser, ohne irgendwelche […] Kenntnisse vorauszusetzen, in das Verständnis der Fragen einzuführen, welche gegenwärtig den Gipfel der Theorie […] bilden. […] Für den Kenner der Theorie werden immerhin vielleicht einige Einzelheiten von Interesse sein.

The present book, arisen from lectures, which I have given several times in […], aims to introduce the reader, without assuming any […] knowledge, into the understanding of the questions that form at present the summit of the theory. […] For the experts of the theory, some details will perhaps be of interest.

Wonder en is gheen wonder.

A miracle is not a miracle.

Geometry draws the soul toward truth.

Solange ein Wissenszweig Ueberfluß an Problemen bietet, ist er lebenskräftig; […]. Wie überhaupt jedes menschliche Unternehmen Ziele verfolgt, so braucht der mathematische Forscher Probleme. […] Eine mathematische Theorie ist nicht eher als vollkommen anzusehen, als du sie dem ersten Manne erklären könntest, den du auf der Straße triffst.

As long as a branch of science offers an abundance of problems, so long is it alive; […] Just as every human undertaking pursues certain objectives, so also mathematical research requires its problems. […] A mathematical theory is not to be considered complete until you can explain it to the first person whom you meet on the street.

Quand une situation, de la plus humble à la plus vaste, a été comprise dans les aspects essentiels, la démonstration de ce qui est compris (et du reste) tombe comme un fruit mûr à point. Alors que la démonstration arrachée comme un fruit encore vert à l’arbre de la connaissance laisse un arrière-goût d’insatisfaction, une frustration de notre soif, nullement apaisée.

When a situation, from the most humble to the most immense, has been understood in the essential aspects, the proof of what is understood (and of the remainder) falls like a fruit that is just ripe. Whereas the proof snatched like an unripe fruit from the tree of knowledge leaves an after-taste of dissatisfaction, a frustration of our thirst, not at all silenced.

**0.1 OPTIMIZATION: INSIGHTS AND APPLICATIONS **

We begin by explaining the title.

*Optimization*. It is our great wish to explain in one book all aspects of continuous optimization to a wide circle of readers. We think that there is some truth in the words: *if you want to understand something, then you should try to understand everything.* We intend our self-contained text to stimulate readers to solve optimization problems and to study the ideas behind the methods of solution. Therefore, we have made it as entertaining and informal as our wish to be clear and precise allowed. The emphasis on insights (by means of pictures) and applications (of a wide variety) makes this book suitable as a textbook (for various types of courses ranging from introductory to advanced). There are many books on this subject, but our book offers a novel and consistent approach. The contribution to the field lies in the collection of examples, but also in the presentation of the theory.

The focus is on solving optimization problems with a finite number of continuous variables analytically (that is, by a formula). Here our ambition is to tell the whole story.

The main message here is that the art of solving has almost become a craft, by virtue of a universal strategy to solve all optimization problems that can be solved at all.

In addition, we give two brief introductions, to classical and modern numerical methods of optimization, and to dynamic optimization. Discrete and stochastic optimization fall outside the scope of our book.

*Insights*. The overarching point of this book is that most problems, when viewed correctly, may be solved by the direct application of the theorems of Fermat, Lagrange, and Weierstrass. We have pursued an intensive quest to reach the essence of all theoretical methods. This has led to the surprising outcome that the proofs of all methods can be fully understood by means of simple geometric figures. Writing down the rigorous analytical proofs with the aid of these figures is a routine job. Advanced readers and experts might be interested in the simple alternative proofs of the Lagrange principle and of the Pontryagin maximum principle, given at the end of the book. They might also be surprised to see that all methods of optimization can be unified into one result, *the principle of Fermat-Lagrange*.

*Applications*. We introduce the reader to mathematical optimization with continuous variables and apply the methods to a substantial and highly varied list of classical and practical problems. Many of these applications are really surprising, even for experts.

Concerning the painting on the front cover, its choice is inspired by the organization of our book, to be explained below.

**0.2 LUNCH, DINNER, AND DESSERT **

This book can be viewed as consisting of three parts: lunch, dinner, and dessert. We assume that you already have had some breakfast. We offer some snacks for those who did not have enough for breakfast.

**Lunch. **Lunch takes no effort to prepare, at least not in Moscow and Rotterdam, where we wrote this book. It is a light, simple and enjoyable meal. There is a rich choice of things to put on your sandwiches. You leave the table refreshed and often with a taste for more. This is all right, as it was not the purpose to silence all hunger and thirst.

**Dinner. **Dinner requires careful preparation, a visit to the shops, and some concentrated work in the kitchen. It is a substantial, refined, and tasty meal. Once at the table, you have not much choice, but have to "eten wat de pot schaft" (≈ eat what is being served). You leave the table satisfied, that is, with the feeling that you have eaten enough. Although dinner and the accompanying wine are of excellent quality, you know that you have to learn to appreciate it. This and the idea that it will restore your energy will help you to finish your dinner.

**Dessert. **For dessert there is a wealth of things in the fridge and in the fruit basket to choose from. To eat dessert you need no ulterior motive. It is just fun, pure pleasure, and delicious. Of course you can combine as many of the choices as you want at home; we felt very much at home both in Moscow and in Rotterdam!

**Food for thought in plain prose **

**Breakfast. **Course on vectors, matrices, differentiation and continuity (we assume that you have taken such a course).

**Snacks. **Three short refreshment courses, appendices A (on vectors and matrices), B (on differentiation), C (on continuity), and the introductory chapter Necessary Conditions: What Is the Point?

**Lunch. **Introductory course on optimization methods for those who are mainly interested in the applications, but who also want to learn something about optimization: Chapters 1, 2, 3, 4, 6, 11. All proofs are optional, as well as chapter 5 and appendix D, Crash Course on Problem Solving.

**Dinner. **Advanced course on optimization for those who want full insights into all aspects of the subject. Chapters 5, 7, 10, 12, appendix G, Conditions of Extremum from Fermat to Pontryagin, and all proofs in the lunch sections and in appendices E and F, two crash courses on optimization theory (one in geometrical style and one in analytical style).

**Dessert. **Applications of optimization methods: Chapters 8, 9, all concrete problems and exercises throughout the book, and the use of software based on the numerical methods from Chapters 6, 7.

**Appetizer. **Most chapters begin with a summary, motivating the contents of the chapter.

**Royal road **

The pharaoh of Egypt once asked the Greek geometer Euclid whether there was no easier way to learn geometry than by studying all the volumes of Euclid’s textbook, *the Elements*. His legendary answer was:

there is no royal road to geometry

Fortunately, there exists a shortcut to the dessert. This route is indicated in each section under the heading royal road.

**Three sorts of desserts **

Each concrete application of substance belongs to at least one of the following three classes.

1) **Pragmatic applications. **Pragmatic applications usually represent a trade-off between two opposite effects. Moreover, there are countless examples of economic problems, for example, problems of minimizing cost, or maximizing profit or social welfare.

2) **Optimality of the world. **Most—or all—laws of physics can be viewed as though nature would be optimizing. For example, light behaves essentially as though it chooses the fastest route.

In economics there are similar examples. One of these concerns the basic principle governing markets: the price for which supply equals demand is the price for which total social welfare is maximal.

3) **Mathematical applications. **The justification for mathematical applications is less straightforward. There is—and always has been—a tendency to reach the essence of things: scientific curiosity compels some people to go to the bottom of the matter. For example, if you have obtained an upper or lower bound that suffices for practical purposes, then nevertheless you can go on to search for the sharpest one.
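The market example in item 2 can be written as a one-line application of the Fermat theorem. The following is a minimal sketch under assumptions of our own: the symbols $D$ (willingness to pay for the $s$-th unit), $S$ (marginal cost of the $s$-th unit), $W$ (total social welfare) and $q$ (quantity traded) are introduced here for illustration and are not taken from the text.

```latex
\begin{align*}
W(q) &= \int_0^q \bigl(D(s) - S(s)\bigr)\,ds \;\to\; \max, \qquad q \ge 0, \\
W'(q^*) &= D(q^*) - S(q^*) = 0
  \quad\Longrightarrow\quad p^* = D(q^*) = S(q^*).
\end{align*}
```

If $D$ is decreasing and $S$ is increasing, then $W' = D - S$ is decreasing, so $W$ is concave and the stationary quantity $q^*$ is a global maximum: the welfare-maximizing quantity is the one at which the price clears the market.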

**0.3 FOR WHOM IS THIS BOOK MEANT? **

**Short answer. **A basic mathematical knowledge is sufficient to understand the topics covered in the book and to master the methods. This makes the book very useful for nonspecialists. On the other hand, more advanced readers and even experts might be surprised to see how all main results can be grounded on the so-called Fermat-Lagrange theorem. The book can be used for a wide range of courses on continuous optimization, from introductory to advanced, for any field for which optimization is relevant. The minimum goal would be to master the main tricks of finding analytical solutions of optimization problems. These are explained in a few pages in appendix D, Crash Course on Problem Solving. The maximal goal would be to get to the bottom of the subject, to get a complete understanding of all insights and to study all applications.

**Detailed answer. **This book is based on our research—resulting in many novel details—and on our teaching experience. Parts of this book have been tested in courses given at the departments of Mathematics and Economics of the Universities of Moscow and Rotterdam. The participants of these courses were students in Economics, Econometrics and Mathematics, Ph.D. students in Economics and Management Science, and Master’s students in Maritime Economics. The aim is to introduce the reader, without requiring any previous knowledge of optimization, to the state of the art of solving concrete optimization problems. It is meant for beginners as well as for advanced readers and even for experts. The prerequisites are as follows:

**Minimal. **An interest in learning about optimization. A (vague) memory of the following concepts: linear equations, vectors, matrices, limits, differentiation, continuity, and partial derivatives. This memory can be refreshed quickly by means of the appendices.

**Recommended. **Interest in learning about optimization, more than just the tricks.

A good working knowledge of linear algebra and differential calculus.

Moreover, we think that experts will find something of interest in this book as well, if only in the collection of concrete problems, the simple proofs (here we recommend appendix G, Conditions of Extremum from Fermat to Pontryagin), and various finer points; moreover, there is the chapter about unification of all methods of optimization by the principle of Fermat-Lagrange (Chapter 10).

We have two types of reader in mind.

**Beginners. **Anyone with an interest in optimization who has ever taken a course on linear equations and differential calculus is invited to lunch. There we explain and apply all successful methods of optimization, using some simple pictures. We provide detailed proofs of all statements, using only elementary arguments. Readers who wish to understand all proofs and discover that they cannot follow some, because their grasp of matrices, limits, continuity, or differentiation is not sufficient, can refresh their memory by reading the appendices.

Having invested some time in familiarizing themselves with the methods of optimization, they can collect the fruits of this by amusing themselves with some of the applications to their favorite subject. For dessert, they will find the full analysis of some optimization problems from economics, mathematics, engineering, physics, and medicine. All are practical, beautiful, and/or challenging. Another interesting possibility is to try one’s hand on some of the many exercises, ranging from routine drills to problems that are challenging but within reach of the tools provided at lunch.

**Advanced readers. **The other type of reader we hope to interest in our book has followed a mathematics course at the university level. They can come straight to dinner and make their choices of the desserts. Each item of the dessert is wonderful in its own way and needs no recommendation. We try to offer them a fuller and deeper insight into the state of the art of optimization. They might be surprised to see that the entire body of knowledge can be based on a small number of fundamental ideas, and how geometrically intuitive these ideas are.

The main dinner course is the principle of Fermat-Lagrange. This unifying principle is the basis of all methods to find the solution of an optimization problem. Finding the solution of an optimization problem might appear at first sight like finding a needle in a haystack. The principle of Fermat-Lagrange makes it possible to carry out this seemingly impossible task.

**Beginners, advanced readers, and experts. **Both types of reader can also profit from the part that is not specially written for them. On the one hand, we like to think that advanced readers, and even experts in optimization theory, will find something new in the lunch and in the short proofs of all necessary conditions of optimization theory (including the formidable Pontryagin maximum principle of optimal control) in appendix G, Conditions of Extremum from Fermat to Pontryagin. It is worth pointing out as well that all the ideas of dinner are already contained in lunch, sometimes in embryonic form. On the other hand, readers of our lunch who have a taste for more are invited for dinner.

**0.4 WHAT IS IN THIS BOOK? **

**Four-step method. **Our starting point is a collection of concrete optimization problems. We stick to strict rules of admission to this collection. The problem should belong to at least one of three types: it should serve a *pragmatic aim*, represent a *law of optimality*, or be an *attempt to reach the essence *of some matter. We offer a universal four-step method that allows us to solve all problems that can be solved at all analytically, that is, by a formula.

**Simple proofs based on pictures. **We give insights into all the ingredients of this four-step method. The intuition for each of the theoretical results can be supported by simple geometric figures. After this it is a routine job to write down precise proofs in analytical style. We do more: we clarify all related matters, such as second order conditions, duality, the envelope theorem, sensitivity, shadow prices, and the unification of necessary conditions. These provide additional insights.

**Simple structure theory. **The structure itself of the whole building is also simple: everything follows from the *tangent space theorem* and the *supporting hyperplane theorem*. The first result allows us to profit from the *smoothness*, that is, the differentiability, of the given problem; the second one from the convexity.

**Numerical methods. **However, for many pragmatic optimization problems the best you can hope for is a *numerical *solution. Here the main issue is: what can you expect from algorithms and what not? We make clear why there will never be one algorithm that can solve all nonlinear optimization problems efficiently. The state of the art is that you should try to model your problem as a *convex *optimization problem; then you have a fighting chance to find a numerical solution of guaranteed quality, using *self-concordant barrier methods*—also called *interior point methods*—or using *cutting plane methods *such as the *ellipsoid method*.
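To give a feel for what "a numerical solution of guaranteed quality" can mean for a convex problem, here is a minimal one-variable sketch. It is an illustration of our own, not an algorithm quoted from the book: in one variable, the cutting-plane idea reduces to bisection on the sign of the derivative, and the interval known to contain the minimizer halves at every step. The function name `minimize_convex_1d` and the test problem are assumptions chosen for the example.

```python
import math
from typing import Callable


def minimize_convex_1d(df: Callable[[float], float], lo: float, hi: float,
                       tol: float = 1e-10) -> float:
    """Localize the minimizer of a differentiable convex function on [lo, hi].

    df is the derivative of the function. Each step queries df at the
    midpoint and discards the half of the interval that cannot contain
    the minimizer, so the remaining interval halves at every iteration.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if df(mid) > 0:
            # The function is increasing at mid: the minimizer lies to the left.
            hi = mid
        else:
            # The function is non-increasing at mid: the minimizer lies to the right.
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical test problem: f(x) = exp(x) - 2x is convex and minimized at x = ln 2.
x_min = minimize_convex_1d(lambda x: math.exp(x) - 2.0, 0.0, 2.0)
print(x_min, math.log(2.0))
```

The guarantee here comes entirely from convexity: the sign of the derivative at one point tells us on which side the minimizer lies. For a general nonlinear function no such conclusion can be drawn from a single query.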

**Dynamic optimization. **Finally, we will give a glimpse of *dynamic* optimization, that is, of the calculus of variations, optimal control, and dynamic programming. Here we enter the domain of *infinite-dimensional* optimization. We emphasize that we give in appendix G, Conditions of Extremum from Fermat to Pontryagin, short novel proofs from an advanced point of view of all necessary conditions of optimization from the Fermat theorem to the Pontryagin maximum principle.

**0.5 SPECIAL FEATURES **

**Four-step method. **We recommend writing all solutions of concrete optimization problems in the same brief and transparent way, in four steps:

1. model the problem and establish existence of global solutions,

2. write down the equation(s) of the first-order necessary conditions,

3. investigate these equations,

4. write down the conclusion.

Thus, you do not have to be a Newton to solve optimization problems. The original solutions of optimization problems often represent brilliant achievements of some of the greatest scientists, like Newton. The four-step method turns their high art into a craft. The main aim of this book is to teach this craft, giving many examples.
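As a minimal illustration of the four steps, here is a small example of our own, chosen for simplicity and not taken from the book: find the rectangle of largest area among all rectangles with perimeter $2p$, where $p > 0$. The symbols $x$ (one side of the rectangle), $p$ and $f$ are assumptions introduced for this sketch.

```latex
\begin{enumerate}
  \item \textbf{Model and existence.}
        $f(x) = x(p - x) \to \max$, $0 \le x \le p$.
        The function $f$ is continuous on the closed bounded interval $[0, p]$,
        so a global maximum exists by the Weierstrass theorem.
  \item \textbf{Fermat.}
        At an interior solution, $f'(x) = p - 2x = 0$.
  \item \textbf{Investigation.}
        The only stationary point is $x = p/2$, with $f(p/2) = p^2/4$.
        At the endpoints, $f(0) = f(p) = 0 < p^2/4$.
  \item \textbf{Conclusion.}
        The square with side $p/2$ has the largest area, namely $p^2/4$.
\end{enumerate}
```

Note that no second derivative is needed: once existence is established in step 1, comparing the few candidates from step 3 settles the question.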

**Existence of solutions. **An essential moment in the four-step method is the verification of the existence of a global solution. This is always done in the same way: by using the Weierstrass theorem. Even when the most delicate assumption of this theorem—boundedness of the set of admissible points—does not hold, we can use this theorem to establish the required existence, thanks to the concept of coercivity.
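The coercivity argument can be sketched on a small example of our own (the function below is an assumption chosen for illustration, not one from the book). The objective is defined on the whole real line, which is not bounded, but because it grows without bound as |x| grows, every global minimizer must lie in a bounded interval, and there the Weierstrass theorem applies.

```python
def f(x: float) -> float:
    """Illustrative coercive objective (our choice, not from the book)."""
    return x**4 - 3.0 * x


# Coercivity: for |x| >= 2 we have f(x) >= |x| * (|x|**3 - 3) >= 10 > f(0) = 0,
# so every global minimizer lies in the closed bounded interval [-2, 2],
# where the Weierstrass theorem guarantees that a minimum exists.
R = 2.0

# Fermat: f'(x) = 4x^3 - 3 = 0 has exactly one real root.
x_star = (3.0 / 4.0) ** (1.0 / 3.0)

# Sanity check: compare with the best point on a fine grid over [-R, R].
grid = [-R + 2.0 * R * k / 100_000 for k in range(100_001)]
x_grid = min(grid, key=f)

print(f"stationary point: {x_star:.6f}, f = {f(x_star):.6f}")
print(f"best grid point:  {x_grid:.6f}, f = {f(x_grid):.6f}")
```

The grid search is only a check; the argument itself is the comparison with f(0) in the comment, which reduces the unbounded problem to a bounded one.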

**Applications. **An abundant and highly varied collection of completely solved optimization problems is offered. This corresponds to our view that the development of optimization methods arises from the challenges provided by concrete optimization problems. This is expressed by the words of Hilbert in one of the epigraphs of this preface.

The methods of optimization are universal and can be applied to many different fields, as we illustrate by our choice of applications. In particular, we offer an abundance of economic applications, ranging from classical ones, such as Nash bargaining, to a model on time (in)consistency of Kydland and Prescott, the winners of the Nobel Prize in Economics in 2004. A reason for this emphasis is that one of the authors works at a School of Economics. We recommend that instructors who adopt this book for courses outside economics, econometrics, management science, and applied mathematics add applications from their own fields to the present collection. The full solutions of the exercises from the first four chapters are given at the end of the book.

**User-friendliness. **Our main point is to make clear that solving optimization problems is becoming more and more a convenient *craft *for a wide circle of practitioners, whereas originally it was an unattainable *art *of a small circle of experts. For some it will be a pleasant surprise that the role of all formal aspects, such as definitions, is very modest. For example, the definitions of the central concepts continuity and differentiability are never used in the actual problem solving. Instead, one uses for each of these a user-friendly calculus.

**Finer points and tricks. **Many finer points and tricks, about problem solving as well as about the underlying methods, some of them novel, are offered throughout the book. For example, we clarify the source of the secret power of the Lagrange multiplier method: it is the idea of reversing the order of the two tasks, elimination and differentiation. This turns the hard task, elimination, from a nonlinear problem into a linear one. In particular, we emphasize that the power of this method does not come from the use of multipliers.
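To make the reversed order of elimination and differentiation concrete, here is a small worked example of our own (not taken from the book): maximize $x + y$ subject to $x^4 + y^4 = 1$. A global maximum exists by the Weierstrass theorem, since the constraint set is closed and bounded.

```latex
\begin{align*}
\text{Eliminate first (nonlinear):}\quad
  & y = (1 - x^4)^{1/4}, \qquad \varphi(x) = x + (1 - x^4)^{1/4} \to \max. \\
\text{Differentiate first:}\quad
  & dx + dy = 0 \quad\text{whenever}\quad 4x^3\,dx + 4y^3\,dy = 0. \\
\text{Then eliminate (linear in } dx,\, dy\text{):}\quad
  & dy = -\frac{x^3}{y^3}\,dx
    \;\Longrightarrow\; 1 - \frac{x^3}{y^3} = 0
    \;\Longrightarrow\; x = y. \\
\text{Conclusion:}\quad
  & x = y = 2^{-1/4}, \qquad x + y = 2^{3/4}.
\end{align*}
```

In the first route the elimination requires solving the nonlinear constraint for $y$ (and covers only the branch $y \ge 0$). In the second route the constraint is differentiated before anything is eliminated, so the elimination step only involves a linear relation between $dx$ and $dy$. The points with $y = 0$, where the division is not allowed, are $(\pm 1, 0)$; they are not stationary, and the other solution of $x = y$, namely $x = y = -2^{-1/4}$, gives the minimum.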

**Historical perspective. **We emphasize background information and anecdotes, in order to make the book more readable and enjoyable for everyone. Thus one can read about how some of the methods of optimization were discovered. For this we have made use of the MacTutor History of Mathematics archive (www-history.mcs.st-andrews.ac.uk/history/index.html).

**Comprehensiveness. **Simple proofs from scratch are given for all results that can be used in the analysis of finite-dimensional optimization problems. Thus anyone who might initially be surprised at the strength of these miraculous results will be led to agree with the words of Simon Stevin in one of the epigraphs of this preface: *Wonder en is gheen wonder*.

**Pictures. **The idea of the proof of each result in this book has a simple geometric sense. That is, all ideas can be fully understood by means of simple pictures. This holds in particular for the proof of the central result, the Lagrange multiplier rule. We recall that this result is usually derived from the implicit function theorem, by means of a calculation, and that the proofs given in textbooks for this latter theorem are relatively technical and not very intuitive. The role of pictures in this book is in the spirit of the words of Plato chosen as one of the epigraphs of this preface: *Geometry draws the soul toward truth*.

**Unification. **A new unification of all known necessary conditions that are used to solve concrete problems is given, to be called the *principle of Fermat-Lagrange*. This appears to mark, in our opinion, the natural limit of how far conditions of multiplier type—that can be used to solve concrete problems—can be pushed.

**Structure of theory. **The structure itself of the theory underlying the four-step method for smooth-convex problems turns out to be simple as well. This theory is based in a straightforward way on three basic results, the Weierstrass theorem, the tangent space theorem, and the supporting hyperplane theorem.

Concerning the proofs of these three results, the situation is as follows. The Weierstrass theorem is so plausible that it hardly needs a proof, unless one wants to delve into the foundations of the real number system. For the other two results there are universal proofs, which extend to the infinite-dimensional case and therefore make it possible to extend the four-step method to the calculus of variations and to optimal control.

However, the main topic of the present book is finite-dimensional optimization and in the finite-dimensional case each of these two results can be derived in a straightforward way from the Weierstrass theorem and the Fermat theorem, using the four-step method. We have chosen to present these last proofs in the main text.

The sense of the three basic results is also clear. The Weierstrass theorem establishes the existence of global solutions of optimization problems; it is one of the many ways to express the completeness property of the real numbers. The tangent space theorem (resp. supporting hyperplane theorem) makes it possible to profit from the smoothness (resp. convexity) properties of optimization problems; it is a statement about the approximation of nonlinear smooth (resp. convex) objects by linear objects.

**Role of second order conditions. **For each concrete problem, we begin by establishing the *existence *of a solution. This always can be done easily, and once this is done, it only remains to find the solution(s) using the first order conditions. This has the great advantage that there is no need to use the relatively heavy-handed technique of second order conditions to complete the analysis. The role of second order conditions is just that these give some *insight*, as will be shown.

**Role of constraint qualifications.** For concrete problems with constraints, it is usually easy to show that *λ*₀, the Lagrange multiplier of the objective function, cannot be zero. This turns out to be more convenient than using a constraint qualification in order to prevent *λ*₀ from being zero. Again, the role of constraint qualifications is to give some insight.

**Proofs for advanced readers.** We want to draw the attention of advanced readers and experts once again to appendix G, *Conditions of Extremum from Fermat to Pontryagin*. In this appendix we offer novel proofs for all necessary first and second order conditions of finite- as well as infinite-dimensional optimization problems. These proofs are simpler than the proofs in the main text, but they are given from a more advanced point of view. For experts it might be interesting to compare these proofs to the proofs in the literature. For example, a short proof of the Lagrange multiplier rule is given, using Brouwer’s fixed point theorem; this proof requires weaker smoothness assumptions on the problem than the usual one. Another example is a very short transparent proof of the Pontryagin maximum principle from optimal control, using standard results on ordinary differential equations.

**Optimality of classical algorithms. **We show that the classical optimization algorithms are all optimal in some sense. That is, we show that these algorithms are essentially solutions of some optimization problem.

**Simplified presentation of an advanced algorithm. **We simplify the presentation of the technical analysis of the celebrated self-concordant barrier method of Nesterov and Nemirovskii by means of the device of restriction to a line.

We offer as well a novel, coordinate-free presentation of the v-space approach to interior point methods for LP.

**Bellman equation. **We tell a simple tale about a boat that represents the proof of the Bellman equation, the central result of continuous time dynamic programming.

**Main text and extras. **The main text consists of the explanation of the methods, and their application to concrete problems. In addition there are examples, exercises, proofs, and texts giving insights and background information.

**Easy access to the material. **We have tried to facilitate access to the material in various ways:

1. The royal road.

2. An introduction to necessary conditions.

3. A summary at the beginning of most chapters.

4. A plan at the beginning and a conclusion at the end of each section.

5. A crash course on problem solving.

6. Two crash courses on optimization theory, one in analytical style and one in geometrical style.

7. Appendices on linear algebra, on real analysis, and on existence of solutions.

8. A brief sketch of all aspects of—continuous variable—optimization methods, at the end of **Chapter 1. **

9. Short proofs for advanced readers of necessary and sufficient first and second order conditions in **appendix G, Conditions of Extremum from Fermat to Pontryagin. **

10. Our websites contain material related to this book such as a list of any corrections that will be discovered and references to implementations of optimization algorithms.

11. The index points to all statements of interest.

**Uniform structure.** All basic chapters have a uniform structure: summary, introduction, main result, applications to concrete problems, proof of the main result, discussion and comments, exercises.

**Acknowledgments.** We would like to thank Vladimir Protasov for his contributions and for sharing his insights. We extend our thanks to Jan Boone, who provided a number of convincing economic applications. We are especially grateful to Jan van de Craats, who produced the almost one hundred figures, which take a central position.

We were fortunate that friends, colleagues, and students were willing to read early drafts of our book and to comment upon it: Joaquim Gromicho, Bernd Heidergott, Wilfred Hulsbergen, Marieke de Koning, Charles van Marrewijk, Mariëlle Non, Ben Tims, Albert Veenstra, and Shuzhong Zhang.

We also thank the Rijksmuseum Amsterdam for permission to use the painting on the cover: *Stilleven met kaassoorten*, (Still life with cheeses) by the Dutch painter Floris van Dijck (1575-1651).

The aim of this introduction to optimization is to give an informal first impression of *necessary conditions*, our main topic.

**Six reasons to optimize **

**Exercises. **Suppose you have just read for the first time about a new optimization method, say, the Lagrange multiplier method, or the method of putting the derivative equal to zero. At first sight, it might not make a great impression on you. So why not take an example: find positive numbers *x *and *y *with product 10 for which 3*x*+4*y *is as small as possible. You try your luck with the multiplier rule, and after a minor struggle and maybe some unsuccessful attempts, you succeed in finding the solution. That is how the multiplier rule comes to life! You have conquered a new method.

**Puzzles. **After a few of these numerical examples, you start to lose interest in these exercises. You want more than applying the method to problems that come from nowhere and are leading nowhere. Then it is time for a puzzle. Here is one.

To save the world from destruction, agent 007 has to reach a skiff 50 meters off-shore from a point 100 meters farther along a straight beach and then disarm a timing device. The agent can run along the shore at 5 meters per second, swim at 2 meters per second, and disarm a timing device in 30 seconds. Can 007 save the world if the device is set to trigger destruction in 73 seconds?

The satisfaction of solving such a puzzle—by putting a derivative equal to zero in this case—is already much greater.
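For readers who want to check their answer, here is a minimal sketch of the puzzle in Python. It assumes one particular model that the puzzle does not spell out: 007 runs along the beach, enters the water `u` meters before the point opposite the skiff, and then swims in a straight line. The names `travel_time` and `u_star` are ours, chosen for illustration.

```python
import math

RUN_SPEED = 5.0    # m/s along the beach (from the puzzle)
SWIM_SPEED = 2.0   # m/s in the water (from the puzzle)
OFFSHORE = 50.0    # m, distance of the skiff from the shore
ALONG = 100.0      # m along the beach to the point opposite the skiff
DISARM = 30.0      # s needed to disarm the device
DEADLINE = 73.0    # s until the device triggers


def travel_time(u: float) -> float:
    """Time to run ALONG - u meters, then swim straight to the skiff."""
    return (ALONG - u) / RUN_SPEED + math.hypot(OFFSHORE, u) / SWIM_SPEED


# Putting the derivative -1/RUN + u / (SWIM * sqrt(OFFSHORE^2 + u^2)) equal
# to zero gives u = SWIM * OFFSHORE / sqrt(RUN^2 - SWIM^2).
u_star = SWIM_SPEED * OFFSHORE / math.sqrt(RUN_SPEED**2 - SWIM_SPEED**2)
total = travel_time(u_star) + DISARM

# Coarse grid search as an independent sanity check of the stationary point.
grid_best = min(travel_time(ALONG * i / 10000) for i in range(10001))

print(f"enter the water {u_star:.2f} m before the point opposite the skiff")
print(f"total time {total:.3f} s against a deadline of {DEADLINE} s")
print("007 saves the world" if total <= DEADLINE else "007 is too late")
```

Under this model the total comes to 50 + 5√21 seconds, about 72.91, so the margin is less than a tenth of a second.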

**Test of strength. **Now, someone comes with shocking news. The multiplier rule is worth nothing, according to him. For example, if, in the example above, you express *y *in terms of *x *by rewriting *xy *= 10 as *y *= 10/*x*, and substitute this into 3*x*+4*y*, then you get 3*x*+40/*x*. Putting the derivative equal to zero, you find the optimal *x *and so, using *y *= 10/*x*, the optimal *y*. You see that you do not need the multiplier method at all, and this solution is even simpler!? Does the multiplier rule have the right to exist?
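The two routes can be written out side by side. The following is a sketch in our own notation, taking the multiplier of the objective function to be 1 and writing λ for the multiplier of the constraint; both are assumptions made for illustration. Substitution gives

$$g(x) = 3x + \frac{40}{x}, \qquad g'(x) = 3 - \frac{40}{x^2} = 0 \;\Longrightarrow\; x = \sqrt{40/3} \approx 3.65, \quad y = \frac{10}{x} = \sqrt{15/2} \approx 2.74.$$

The multiplier rule, applied to $L(x, y) = 3x + 4y + \lambda(xy - 10)$, gives

$$3 + \lambda y = 0, \qquad 4 + \lambda x = 0 \;\Longrightarrow\; 3x = -\lambda xy = 4y,$$

and together with $xy = 10$ this yields $x^2 = 40/3$ again. In both cases the minimal value is $3x + 4y = 4\sqrt{30} \approx 21.91$.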

Time for a test of strength: how can we put down four sticks to form a quadrangle of maximal area? The sticks may have different sizes. This is not just a run-of-the-mill problem. In ancient times, the Greek geometers knew this problem already. They tried very hard to solve it, but without success. With the multiplier rule, you can solve it without any difficulty. We do not know of any other way to solve it.

**Insight.** After a while you have played enough. You would like to do something useful with the optimization tools. Here is a suggestion. Some fruitful principles of economics supported by much empirical evidence tell us that certain things are automatically optimal. An example is the "invisible hand" of Adam Smith: individual self-interest can lead to common interest. Another example is the law of comparative advantage, leading to the conclusion that free trade is always optimal, even for weak trading partners. In physics, all laws can be understood by optimization. The best-known example is that the law of Snellius on the refraction of light can be derived from the principle that light always chooses the path that takes the shortest time. Here we quote a centuries-old phrase of Euler: "Nothing happens in the world without a law of optimization having a hand in it."

**Practical applications. **Many things are not optimal automatically. Fortunately there are many experts (consultants, economic advisors, econometricians, engineers) who can give advice. Here are some possible questions. What is the optimal tax system? What is the best way to stimulate the creation of new jobs? What is the best way to stimulate research and development? How to achieve optimal stability of a bridge? How to achieve the highest possible return without unhealthy risks? What is the best way to organize an auction?

**Turn art into craft. **Many optimization problems of interest have been solved by some of the greatest minds, such as Newton. Their ingenious solutions can be admired, as beautiful *art *can be. The optimization methods that are now available allow you to learn the *craft *to solve all these problems—and more—by yourself. You don’t have to be a Newton to do this! We hope that our book will help you to discover this, as well as the many attractions of optimization problems and their analysis.

Fermat: One Variable without Constraints

When a quantity is the greatest or the smallest, at that moment its flow is neither forward nor backward.

*I. Newton *

• How to find the maxima and minima of a function *f *of one variable *x *without constraints?

**1.0 SUMMARY **

You can never be too rich or too thin.

*W. Simpson, wife of Edward VIII *

**One variable of optimization. **The epigraph to this summary describes a view in upper-class circles in England at the beginning of the previous century. It is meant to surprise, going against the usual view that somewhere between too small and too large is the optimum, the golden mean.

Many pragmatic problems lead to the search for *the golden mean* (or *the optimal trade-off* or *the optimal compromise*). For example, suppose you want to play a computer game and your video card does not allow you to have optimal quality (high resolution screen) as well as optimal performance (flowing movements); then you have to make an optimal compromise. This chapter considers many examples where this golden mean is sought. For example, we will be confronted with the problem that a certain type of vase with one long-stemmed rose in it is unstable if there is too little water in it, but as well if there is too much water in it. How much water will give optimal stability? Usually, the reason for such optimization problems is that a trade-off has to be made between two effects. For example, the height of houses in cities like New York or Hong Kong is determined as the result of the following trade-off. On the one hand, you need many people to share the high cost of the land on which the house is built. On the other hand, if you build the house very high, then the specialized costs are forbidding.

**Derivative equal to zero.** All searches for the golden mean can be modeled as problems of optimizing a function *f* of one variable *x*, minimization (maximization) if *f*(*x*) represents some sort of cost (profit). The following method, due to Fermat, usually gives the correct answer: "put the derivative of *f* equal to zero," solve the equation, and, if the optimal *x* has to be an integer, round off to the nearest integer. This is well known from high school, but we try to take a fresh look at this method. For example, we raise the question why this method is so successful. The technical reason for this is of course the great strength of the available calculus for determining derivatives of given functions. We will see that in economic applications a conceptual reason for this success is the *equimarginal rule*. That is, rational decision makers take a marginal action only if the marginal benefit of the action exceeds the marginal cost; they will continue to take action till marginal benefit equals marginal cost.

**Snellius’s law. **The most striking application of the method of Fermat is perhaps the derivation of the law of Snellius on the refraction of light on the boundary between two media—for example, water and air. This law was discovered empirically. The method of Fermat throws a striking light on this technical rule, showing that it is a consequence of the simple principle that light always takes the fastest path (at least for small distances).
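The derivation can be sketched in a few lines. The notation below is ours and is meant only as an illustration: light travels from a point at height $a$ above the boundary to a point at depth $b$ below it, the horizontal distance between the two points is $d$, the speeds of light in the two media are $v_1$ and $v_2$, and the light crosses the boundary at horizontal distance $x$ from the first point. The travel time is

$$T(x) = \frac{\sqrt{a^2 + x^2}}{v_1} + \frac{\sqrt{b^2 + (d - x)^2}}{v_2}.$$

Putting the derivative equal to zero gives

$$T'(x) = \frac{x}{v_1\sqrt{a^2 + x^2}} - \frac{d - x}{v_2\sqrt{b^2 + (d - x)^2}} = \frac{\sin\alpha}{v_1} - \frac{\sin\beta}{v_2} = 0,$$

where $\alpha$ and $\beta$ are the angles that the two parts of the path make with the normal to the boundary. This is the law of Snellius, $\sin\alpha / \sin\beta = v_1 / v_2$.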

**1.1 INTRODUCTION **

**Optimization and the differential calculus.** The first general method of solution of extremal problems is due to Pierre de Fermat (1608–1665). In 1638 he presented his idea in a letter to the prominent mathematicians Gilles Persone de Roberval (1602–1675) and Marin Mersenne (1588–1648). Scientific journals did not yet exist, and writing a letter to learned correspondents was a usual way to communicate a new discovery. Intuitively, the idea is that the tangent line at the highest or lowest point of a graph of a function is horizontal. Of course, this tangent line is only defined if the graph has no "kink" at this point.

The exact meaning became clear later when Isaac Newton (1642/43–1727) and Gottfried von Leibniz (1646–1716) invented the elements of classical analysis. One of the motivations for creating analysis was the desire of Newton and Leibniz to find general approaches to the solution of problems of maximum and minimum. This was reflected, in particular, in the title of the first published work devoted to the differential calculus (written by Leibniz, published in 1684). It begins with the words Nova methodus pro maximis et minimis . . . .

**The Fermat theorem.** If $\hat{x}$ is a point of local minimum (or maximum) of *f*, then *the main linear part of the increment is equal to zero*. The following example illustrates how this idea works.

**Example 1.1.1 ***Verify the idea of Fermat for the function f *(*x*) = *x*².

**Solution. **To begin with, we note that the graph of f is a parabola that has its lowest point at *x *= 0.

Let $\hat{x}$ be a point of local minimum of *f*. Let *x* be a point close to $\hat{x}$; write *h* for the increment of the argument, $x - \hat{x}$, that is, $x = \hat{x} + h$. The increment of the function,

$$f(\hat{x} + h) - f(\hat{x}) = (\hat{x} + h)^2 - \hat{x}^2 = 2\hat{x}h + h^2,$$

is the sum of the *main linear part* $2\hat{x}h$ and the *remainder* $h^2$.

This terminology is reasonable: the graph of the function $h \mapsto 2\hat{x}h$ is a straight line through the origin, and the term $h^2$, which remains, is negligible in comparison to *h* for *h* small enough. That $h^2$ is negligible can be illustrated using decimal notation: if $h \approx 10^{-k}$ ("*k*-th decimal behind point"), then $h^2 \approx 10^{-2k}$ ("2*k*-th decimal behind point"). For example, if *k* = 2, then *h* = 1/100 = .01, and then the remainder $h^2$ = 1/10000 = .0001 is negligible in comparison to *h*.

By the idea of Fermat, the main linear part is equal to zero, $2\hat{x}h = 0$ for all *h*, and so $\hat{x} = 0$.

**Royal road.** If you want a shortcut to the applications in this chapter, then you can read the statements of the Fermat theorem 1.4, the Weierstrass theorem 1.6, and its corollary 1.7, as well as the solutions of examples 1.3.4, 1.3.5, and 1.3.7 (solution 3); thus prepared, you are ready to enjoy as many of the applications in sections 1.4 and 1.6 as you like. After this, you can turn to the next chapter.

**1.2 THE DERIVATIVE FOR ONE VARIABLE **

To solve problems of interest, one has to combine the idea of Fermat with the differential calculus. We begin by recalling the basic notion of the differential calculus for functions of one variable. It is the notion of *derivative*. The following experiment illustrates the geometrical idea of the derivative.

**"Zooming in" approach to the derivative.** Choose a point on the graph of a function drawn on a computer screen. Zoom in a couple of times. Then the graph looks like a straight line. Its slope is the derivative. Note that a straight line through a given point is determined by its slope.
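The zooming experiment can be imitated numerically. The sketch below is only an illustration under our own assumptions: the function `square`, the point `x0 = 1.5`, and the helper name `secant_slope` are hypothetical choices, and "zooming in" is modeled by shrinking the step `h` between two points on the graph.

```python
def secant_slope(f, x0: float, h: float) -> float:
    """Slope of the line through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h


def square(x: float) -> float:
    return x * x


x0 = 1.5  # hypothetical point on the graph, chosen for illustration
slopes = [(h, secant_slope(square, x0, h)) for h in (1.0, 0.1, 0.01, 0.001)]
for h, s in slopes:
    print(f"h = {h:<6} slope = {s:.4f}")
```

For this function the slope over a step *h* is 2·1.5 + *h* up to rounding, so the printed values settle down to 3 as *h* shrinks.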

**Analytical definition of the derivative.** For the moment, we restrict our attention to the analytical definition, due to Auguste Cauchy (1789–1857). This is known from high school: it views the derivative $f'(\hat{x})$ of a function *f* at a number $\hat{x}$ as the limit of the quotient of the increments of the function, $f(\hat{x} + h) - f(\hat{x})$, and the argument, $h = (\hat{x} + h) - \hat{x}$, if *h* tends to zero, *h* → 0 (Fig. 1.1).

Now we will give a more precise formulation of this analytical definition. To begin with, note that the derivative $f'(\hat{x})$ depends only on the behavior of *f* in a neighborhood of $\hat{x}$. This means that for a sufficiently small number *ε* > 0 it suffices to consider *f*(*x*) only for *x* with $|x - \hat{x}| < \varepsilon$.

Figure 1.1 Illustrating definition of differentiability.

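We call the set of all numbers *x* for which

$$|x - \hat{x}| < \varepsilon$$

*the ε-neighborhood of* $\hat{x}$. This can also be defined by the inequalities

$$\hat{x} - \varepsilon < x < \hat{x} + \varepsilon$$

or by the inclusion $x \in (\hat{x} - \varepsilon, \hat{x} + \varepsilon)$. The notation $f : (\hat{x} - \varepsilon, \hat{x} + \varepsilon) \to \mathbb{R}$ denotes that *f* is defined on the *ε*-neighborhood of $\hat{x}$ and takes its values in $\mathbb{R}$, the real numbers.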

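**Definition 1.1 Analytical definition of the derivative.** *Let* $\hat{x}$ *be a real number and* $f : (\hat{x} - \varepsilon, \hat{x} + \varepsilon) \to \mathbb{R}$ *a function of one variable x. The function f is called differentiable at* $\hat{x}$ *if the limit*

$$\lim_{h \to 0} \frac{f(\hat{x} + h) - f(\hat{x})}{h}$$

*exists. Then this limit is called the (first) derivative of f at* $\hat{x}$ *and it is denoted by* $f'(\hat{x})$.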

For a linear function *f*(*x*) = *ax* one has *f*′(*x*) = *a* for all $x \in \mathbb{R}$, that is, for all numbers *x*. The following example is more interesting.

**Example 1.2.1** *Compute the derivative of the quadratic function f*(*x*) = *x*² *at a given number* $\hat{x}$.

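**Solution.** For each number *h* one has

$$f(\hat{x} + h) - f(\hat{x}) = (\hat{x} + h)^2 - \hat{x}^2 = 2\hat{x}h + h^2$$

and so

$$\frac{f(\hat{x} + h) - f(\hat{x})}{h} = 2\hat{x} + h.$$

Taking the limit *h* → 0, we get $f'(\hat{x}) = 2\hat{x}$.

Analogously, one can show that for the function $f(x) = x^n$ the derivative is $f'(x) = nx^{n-1}$. It is time for a more challenging example.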

**Example 1.2.2 ***Compute the derivative of the absolute value function f*(*x*) = |*x*|.

**Solution.** The answer is obvious if you look at the graph of *f* and observe that it has a kink at *x* = 0 (Fig. 1.2). To begin with, you see that *f* is not differentiable at *x* = 0. For *x* > 0 one has *f*(*x*) = *x* and so *f*′(*x*) = 1, and for *x* < 0 one has *f*(*x*) = –*x* and so *f*′(*x*) = –1. These results can be expressed by one formula,

$$f'(x) = \frac{x}{|x|} \quad \text{for } x \neq 0.$$

Figure 1.2 The absolute value function.

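Without seeing the graph, we could also compute the derivative using the definition. For each nonzero number *h* one has

$$\frac{f(\hat{x} + h) - f(\hat{x})}{h} = \frac{|\hat{x} + h| - |\hat{x}|}{h}.$$

For $\hat{x} < 0$ this equals, provided |*h*| is small enough,

$$\frac{-(\hat{x} + h) - (-\hat{x})}{h} = -1,$$

and this leads to $f'(\hat{x}) = -1$.

For $\hat{x} > 0$ a similar calculation gives $f'(\hat{x}) = 1$.

For $\hat{x} = 0$ this equals |*h*|/*h*, which is –1 for all *h* < 0 and 1 for all *h* > 0. This shows that *f*′(0) does not exist.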
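The behavior at the kink can also be observed numerically. This is a small sketch, with the helper name `quotient` and the step sizes chosen by us for illustration; it evaluates the quotient of increments of the absolute value function at 0 for steps on either side.

```python
def quotient(f, x0: float, h: float) -> float:
    """Quotient of increments (f(x0 + h) - f(x0)) / h for nonzero h."""
    return (f(x0 + h) - f(x0)) / h


steps = (0.1, 0.01, 0.001)
right = [quotient(abs, 0.0, h) for h in steps]   # increments h > 0
left = [quotient(abs, 0.0, -h) for h in steps]   # increments h < 0
print("h > 0:", right)
print("h < 0:", left)
```

The quotients stay at 1 on one side and at –1 on the other, however small the step, so they have no common limit.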

The geometrical sense of the differentiability of *f* at $\hat{x}$ is that the graph of *f* makes no jump and has no kink at $\hat{x}$. The following example illustrates this.

Figure 1.3 Illustrating differentiability.

**Example 1.2.3 ***In Figure 1.3, the graphs are drawn of three functions f, g, and h. Determine for each of these functions the points of differentiability. *

**Solution.** The functions *g* and *h* are not differentiable at all integers, as the graph of *g* has "kinks" at these points and the graph of *h* makes "jumps" at these points. The functions are differentiable at all other points. Finally, *f* is differentiable at all points.

One writes $D(\hat{x})$ for the set of functions of one variable *x* for which the derivative at $\hat{x}$ exists.

Newton and Leibniz computed derivatives by using the definition of the derivative. This required great skill, and such computations were beyond the capabilities of their contemporaries. Fortunately, now everyone can learn how to calculate derivatives. *The differential calculus* makes it possible to determine derivatives of functions in a routine way. This calculus consists of a list of basic examples such as

$$(x^r)' = r x^{r-1}, \qquad (\sin x)' = \cos x, \qquad (\ln x)' = \frac{1}{x},$$

and rules such as the *product rule*

$$(fg)'(x) = f'(x)g(x) + f(x)g'(x)$$

and the *chain rule*

$$(g \circ f)'(x) = g'(f(x))\,f'(x).$$

In the $\frac{d}{dx}$ notation for the derivative this takes on a form that is easy to memorize,

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx},$$

where *z* is a function of *y* and *y* a function of *x*.

**Modest role of definitions. **Thus the *definition *of the derivative plays a very modest role: to compute a derivative, you use a user-friendly *calculus*, but not the definition. *Users might be pleasantly surprised to find that a similar observation can be made for most definitions*.

The use of the differential calculus will be illustrated in the analysis of all concrete optimization problems.

If you want an immediate impression of the power of the differential calculus, then you can read about the following holiday experience. This example will provide insight, but the details will not be used in the solution of concrete optimization problems.

Once a colleague of ours met in the mountains a group of geologists. They got excited when they heard he was a mathematician, and what was even stranger is the sort of questions they started asking him. They needed the first four or five decimals of ln 2 and of sin 1°. They even wanted him to explain how they could find these decimals by themselves. "Next time, we might need sin 2° or some cosine and then we won’t have you."

Well, not wanting to be unfriendly, our friend agreed to demonstrate this to them on a laptop. However, the best computing power that could be found was a calculator, and this did not even have a sine or logarithm button. Excuse enough not to be able to help, but for some reason he took pity on them and explained how to get these decimals even if you have no computing power. The method is illustrated below for sin 1°.

The geologists explained why they needed these decimals. They possessed some logarithmic lists about certain materials, but these were logarithms to base 2 and the geologists needed to convert these into natural logarithms; therefore, they had to multiply by the constant ln 2. This explains ln 2. They also told a long, enthusiastic story about sin 1°. It involved their wish to make an accurate map of the site of their research, and for this they had to measure angles and do all sorts of calculations.

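**Example 1.2.4** *How to compute the first five decimals of* sin 1°?

**Solution.** Often the following observation can be used to compare two functions *f* and *g* on an interval [*a, b*] (cf. exercise 1.6.38): if $f(a) = g(a)$ and $f'(x) \le g'(x)$ for all *x* in [*a, b*], then $f(x) \le g(x)$ for all *x* in [*a, b*].

We apply this observation after going over to radians: 1° equals $\pi/180$ radians.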

**• First step.** To begin with, we establish that

$$0 \le \sin x \le x,$$

that is, we "squash" sin *x* in between 0 and *x*, for all *x* of interest to the geologists. To this end, we apply the observation above. The three functions 0, sin *x*, and *x* have the same value at *x* = 0, and the inequality between their derivatives,

$$0 \le \cos x \le 1,$$

holds for $0 \le x \le \pi/2$.

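**• Second step.** In the same way one can establish that

$$1 - \tfrac{1}{2}x^2 \le \cos x \le 1.$$

Indeed, the three functions $1 - \tfrac{1}{2}x^2$, cos *x*, and 1 have the same value at *x* = 0, and the inequality between their derivatives,

$$-x \le -\sin x \le 0,$$

follows from the first step. The function $1 - \tfrac{1}{2}x^2$ is chosen in such a way that its derivative is –*x* and that its value at *x* = 0 equals cos 0 = 1.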

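**• Third step.** The next step gives

$$x - \tfrac{1}{6}x^3 \le \sin x \le x.$$

We do not display the verification, but we note that the function $x - \tfrac{1}{6}x^3$ is chosen in such a way that its derivative is $1 - \tfrac{1}{2}x^2$ and that its value at *x* = 0 equals sin 0 = 0.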
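As a rough check of where these steps lead, the two bounds sin *x* ≤ *x* and *x* – *x*³/6 ≤ sin *x* can be evaluated at *x* = π/180 using nothing but multiplication, division, and subtraction. The sketch below is our own illustration, not the computation from the story; it takes π from Python's standard library, whereas on a bare calculator one would key in its decimals by hand.

```python
import math

x = math.pi / 180.0        # one degree expressed in radians

upper = x                  # bound from the first step: sin x <= x
lower = x - x**3 / 6.0     # bound from the third step: x - x^3/6 <= sin x

print(f"{lower:.7f} <= sin 1 degree <= {upper:.7f}")

# If both bounds agree in their first five decimals, those decimals are
# also the first five decimals of sin 1 degree.
same_five = int(lower * 10**5) == int(upper * 10**5)
print("first five decimals settled:", same_five)
```

Both bounds begin with 0.01745, so these are the first five decimals of sin 1°.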
