Kenneth Kuttler April 29, 2004

2

Contents

1 Introduction 11

I

Preliminaries

Real Numbers The Number Line And Algebra Of The Real Numbers . . . Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.1 Set Notation . . . . . . . . . . . . . . . . . . . . . . Exercises With Answers . . . . . . . . . . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . The Absolute Value . . . . . . . . . . . . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . Well Ordering Principle And Archimedian Property . . . . Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . Divisibility And The Fundamental Theorem Of Arithmetic Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . Systems Of Equations . . . . . . . . . . . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . Completeness of R . . . . . . . . . . . . . . . . . . . . . . . Review Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

13

15 15 19 20 21 22 23 23 26 27 30 33 35 36 40 42 43 47 47 49 52 52 55 60 63 65 65 66 67 69 69 70 71

2 The 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11 2.12 2.13 2.14 2.15

3 Basic Geometry And Trigonometry 3.1 Similar Triangles And Pythagorean Theorem . 3.2 Cartesian Coordinates And Straight Lines . . . 3.3 Exercises . . . . . . . . . . . . . . . . . . . . . 3.4 Distance Formula And Trigonometric Functions 3.5 The Circular Arc Subtended By An Angle . . . 3.6 The Trigonometric Functions . . . . . . . . . . 3.7 Exercises . . . . . . . . . . . . . . . . . . . . . 3.8 Some Basic Area Formulas . . . . . . . . . . . . 3.8.1 Areas Of Triangles And Parallelograms 3.8.2 The Area Of A Circular Sector . . . . . 3.9 Exercises . . . . . . . . . . . . . . . . . . . . . 3.10 Parabolas, Ellipses, and Hyperbolas . . . . . . 3.10.1 The Parabola . . . . . . . . . . . . . . . 3.10.2 The Ellipse . . . . . . . . . . . . . . . . 3.10.3 The Hyperbola . . . . . . . . . . . . . . 3

4 3.11 Exercises

CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

4 The Complex Numbers 73 4.1 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

II

**Functions Of One Variable
**

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

79

81 81 85 87 91 92 93 93 97 97 102 103 106 109 110 110 113 114 114 119 119 120 125 126 126 128 130 132 134 135 136

5 Functions 5.1 General Considerations . . . . . . . . . . . . . . . 5.2 Exercises . . . . . . . . . . . . . . . . . . . . . . 5.3 Continuous Functions . . . . . . . . . . . . . . . 5.4 Suﬃcient Conditions For Continuity . . . . . . . 5.5 Continuity Of Circular Functions . . . . . . . . . 5.6 Exercises . . . . . . . . . . . . . . . . . . . . . . 5.7 Properties Of Continuous Functions . . . . . . . 5.8 Exercises . . . . . . . . . . . . . . . . . . . . . . 5.9 Limits Of A Function . . . . . . . . . . . . . . . 5.10 Exercises . . . . . . . . . . . . . . . . . . . . . . 5.11 The Limit Of A Sequence . . . . . . . . . . . . . 5.11.1 Sequences And Completeness . . . . . . . 5.11.2 Decimals . . . . . . . . . . . . . . . . . . 5.11.3 Continuity And The Limit Of A Sequence 5.12 Exercises . . . . . . . . . . . . . . . . . . . . . . 5.13 Uniform Continuity . . . . . . . . . . . . . . . . . 5.14 Exercises . . . . . . . . . . . . . . . . . . . . . . 5.15 Theorems About Continuous Functions . . . . . 6 Derivatives 6.1 Velocity . . . . . . . . . 6.2 The Derivative . . . . . 6.3 Exercises With Answers 6.4 Exercises . . . . . . . . 6.5 Local Extrema . . . . . 6.6 Exercises With Answers 6.7 Exercises . . . . . . . . 6.8 Mean Value Theorem . . 6.9 Exercises . . . . . . . . 6.10 Curve Sketching . . . . 6.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

7 Some Important Special Functions 7.1 The Circular Functions . . . . . . . . . . . . . . . . . . . 7.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3 The Exponential And Log Functions . . . . . . . . . . . 7.3.1 The Rules Of Exponents . . . . . . . . . . . . . . 7.3.2 The Exponential Functions, A Wild Assumption 7.3.3 The Special Number, e . . . . . . . . . . . . . . . 7.3.4 The Function ln |x| . . . . . . . . . . . . . . . . . 7.3.5 Logarithm Functions . . . . . . . . . . . . . . . . 7.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . .

139 . 139 . 141 . 142 . 142 . 143 . 145 . 146 . 146 . 147

CONTENTS 8 Properties And Applications Of Derivatives 8.1 The Chain Rule And Derivatives Of Inverse Functions . . . . . . . . 8.1.1 The Chain Rule . . . . . . . . . . . . . . . . . . . . . . . . . 8.1.2 Implicit Diﬀerentiation And Derivatives Of Inverse Functions 8.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.3 The Function xr For r A Real Number . . . . . . . . . . . . . . . . . 8.3.1 Logarithmic Diﬀerentiation . . . . . . . . . . . . . . . . . . . 8.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.5 The Inverse Trigonometric Functions . . . . . . . . . . . . . . . . . . 8.6 The Hyperbolic And Inverse Hyperbolic Functions . . . . . . . . . . 8.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.8 L’Hˆpital’s Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . o 8.8.1 Interest Compounded Continuously . . . . . . . . . . . . . . 8.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.10 Related Rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.12 The Derivative And Optimization . . . . . . . . . . . . . . . . . . . . 8.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.14 The Newton Raphson Method . . . . . . . . . . . . . . . . . . . . . . 8.15 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Antiderivatives And Diﬀerential Equations 9.1 Initial Value Problems . . . . . . . . . . . . . . 9.2 Areas . . . . . . . . . . . . . . . . . . . . . . . 9.3 Area Between Graphs . . . . . . . . . . . . . . 9.4 Exercises . . . . . . . . . . . . . . . . . . . . . 9.5 The Method Of Substitution . . . . . . . . . . 9.6 Exercises . . . . . . . . . . . . . . . . . . . . . 9.7 Integration By Parts . . . . . . . . . . . . . . . 9.8 Exercises . . . . . . . . . . . . . . . . . . . . . 9.9 Trig. Substitutions . . . . . . . . . . . . . . . . 9.10 Exercises . . . . . . . . . . . . . . . . . . . . . 9.11 Partial Fractions . . . . . . . . . . . . . . . . . 9.12 Rational Functions Of Trig. Functions . . . . . 9.13 Exercises . . . . . . . . . . . . . . . . . . . . . 9.14 Practice Problems For Antiderivatives . . . . . 9.15 Volumes . . . . . . . . . . . . . . . . . . . . . . 9.16 Exercises . . . . . . . . . . . . . . . . . . . . . 9.17 Lengths And Areas Of Surfaces Of Revolution . 9.18 Exercises . . . . . . . . . . . . . . . . . . . . . 9.19 The Equation y + ay = 0 . . . . . . . . . . . . 9.20 Exercises . . . . . . . . . . . . . . . . . . . . . 9.21 Force On A Dam And Work Of A Pump . . . . 9.22 Exercises . . . . . . . . . . . . . . . . . . . . . 10 The 10.1 10.2 10.3 10.4 10.5 Integral Upper And Lower Sums . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . Functions Of Riemann Integrable Functions Properties Of The Integral . . . . . . . . . . Fundamental Theorem Of Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5 149 149 149 150 152 153 154 155 156 159 160 161 165 166 167 168 170 173 175 177 179 179 181 182 184 186 188 190 191 192 196 197 202 202 203 210 213 214 219 220 221 222 224 227 227 230 231 233 237

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6 10.6 10.7 10.8 10.9 Exercises . . . . . . . . . . . . . . . Return Of The Wild Assumption . . Exercises . . . . . . . . . . . . . . . Techniques Of Integration . . . . . . 10.9.1 The Method Of Substitution 10.9.2 Integration By Parts . . . . . 10.10 Exercises . . . . . . . . . . . . . . . 10.11 Improper Integrals . . . . . . . . . . 10.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240 241 245 247 247 248 249 252 255 257 257 259 261 261 266 269 274 275 277 284 286

11 Inﬁnite Series 11.1 Approximation By Taylor Polynomials 11.2 Exercises . . . . . . . . . . . . . . . . . 11.3 Inﬁnite Series Of Numbers . . . . . . . . 11.3.1 Basic Considerations . . . . . . . 11.3.2 More Tests For Convergence . . 11.3.3 Double Series . . . . . . . . . . . 11.4 Exercises . . . . . . . . . . . . . . . . . 11.5 Taylor Series . . . . . . . . . . . . . . . 11.5.1 Operations On Power Series . . . 11.6 Exercises . . . . . . . . . . . . . . . . . 11.7 Some Other Theorems . . . . . . . . . .

III

**Vector Valued Functions
**

. . . . . . . . . . . . . . . . . . Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

291

. . . . . . . . . . . . . . . . . . . . . . 293 294 295 296 299 300 302 302 305 305 309 311 311 313 313 315 317 319 320 321 324 326 328 329

12 Rn 12.1 Algebra in Rn . . 12.2 Exercises . . . . 12.3 Distance in Rn . 12.4 Exercises . . . . 12.5 Lines in Rn . . . 12.6 Exercises . . . . 12.7 Open And Closed 12.8 Exercises . . . . 12.9 Vectors . . . . . 12.10 Exercises . . . .

13 Vector Products 13.1 The Dot Product . . . . . . . . . . . . . . . . . . . . 13.2 The Geometric Signiﬁcance Of The Dot Product . . . 13.2.1 The Angle Between Two Vectors . . . . . . . . 13.2.2 Work And Projections . . . . . . . . . . . . . . 13.2.3 The Parabolic Mirror . . . . . . . . . . . . . . 13.2.4 The Equation Of A Plane . . . . . . . . . . . . 13.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 13.4 The Cross Product . . . . . . . . . . . . . . . . . . . 13.4.1 The Distributive Law For The Cross Product 13.4.2 Torque . . . . . . . . . . . . . . . . . . . . . . 13.4.3 The Box Product . . . . . . . . . . . . . . . . 13.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS

7

13.6 Vector Identities And Notation . . . . . . . . . . . . . . . . . . . . . . . . . . 330 13.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332 14 Functions 14.1 Exercises . . . . . . . . . . . . . . . . . . . . . . 14.2 Continuous Functions . . . . . . . . . . . . . . . 14.2.1 Suﬃcient Conditions For Continuity . . . 14.3 Exercises . . . . . . . . . . . . . . . . . . . . . . 14.4 Limits Of A Function . . . . . . . . . . . . . . . 14.5 Exercises . . . . . . . . . . . . . . . . . . . . . . 14.6 The Limit Of A Sequence . . . . . . . . . . . . . 14.6.1 Sequences And Completeness . . . . . . . 14.6.2 Continuity And The Limit Of A Sequence 14.7 Properties Of Continuous Functions . . . . . . . 14.8 Exercises . . . . . . . . . . . . . . . . . . . . . . 14.9 Some Advanced Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333 334 334 335 335 336 339 339 341 342 343 343 344 351 351 352 354 355 357 358 359 363 364 365 366 367 371 373 375 375 378 380 381 384 385

15 Limits And Derivatives 15.1 Limits Of A Vector Valued Function . . . . . . . . . . . . . . . 15.2 The Derivative And Integral . . . . . . . . . . . . . . . . . . . . 15.2.1 Geometric And Physical Signiﬁcance Of The Derivative 15.2.2 Diﬀerentiation Rules . . . . . . . . . . . . . . . . . . . . 15.3 Leibniz’s Notation . . . . . . . . . . . . . . . . . . . . . . . . . 15.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.5 Newton’s Laws Of Motion . . . . . . . . . . . . . . . . . . . . 15.5.1 Kinetic Energy . . . . . . . . . . . . . . . . . . . . . . . 15.5.2 Impulse And Momentum . . . . . . . . . . . . . . . . . 15.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.7 Systems Of Ordinary Diﬀerential Equations . . . . . . . . . . . 15.7.1 Picard Iteration . . . . . . . . . . . . . . . . . . . . . . 15.7.2 Numerical Methods For Diﬀerential Equations . . . . . 15.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 Line Integrals 16.1 Arc Length And Orientations . . . 16.2 Line Integrals And Work . . . . . . 16.3 Exercises . . . . . . . . . . . . . . 16.4 Motion On A Space Curve . . . . . 16.5 Exercises . . . . . . . . . . . . . . 16.6 Independence Of Parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

17 The Circular Functions Again 387 17.1 The Equations Of Undamped And Damped Oscillation . . . . . . . . . . . . . 392 17.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395 18 Curvilinear Coordinate Systems 18.1 Polar Cylindrical And Spherical Coordinates 18.2 The Acceleration In Polar Coordinates . . . . 18.3 Planetary Motion . . . . . . . . . . . . . . . . 18.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397 397 399 401 406

8

CONTENTS

IV

**Functions Of More Than One Variable
**

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

409

. . . . . . . . . . . . . . . . . . . . 411 411 417 419 421 423 425 426 427 428 428 432 437 438 445 447 458 458 462 462 466

19 Linear Algebra 19.1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.1.1 Finding The Inverse Of A Matrix . . . . . . . . . . 19.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.3 Linear Transformations . . . . . . . . . . . . . . . . . . . 19.3.1 Least Squares Problems . . . . . . . . . . . . . . . 19.3.2 The Least Squares Regression Line . . . . . . . . . 19.3.3 The Fredholm Alternative . . . . . . . . . . . . . . 19.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.5 Moving Coordinate Systems . . . . . . . . . . . . . . . . 19.5.1 The Coriolis Acceleration . . . . . . . . . . . . . . 19.5.2 The Coriolis Acceleration On The Rotating Earth 19.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.7 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . 19.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.9 The Mathematical Theory Of Determinants . . . . . . . . 19.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.11 The Determinant And Volume . . . . . . . . . . . . . . . 19.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . 19.13 Linear Systems Of Ordinary Diﬀerential Equations . . . 19.14 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . .

20 Functions Of Many Variables 20.1 The Graph Of A Function Of Two Variables . . . . . . . . . 20.2 The Directional Derivative . . . . . . . . . . . . . . . . . . . 20.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.4 Mixed Partial Derivatives . . . . . . . . . . . . . . . . . . . 20.5 The Limit Of A Function Of Many Variables . . . . . . . . 20.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.7 Approximation With A Tangent Plane . . . . . . . . . . . . 20.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.9 Diﬀerentiation And The Chain Rule . . . . . . . . . . . . . 20.9.1 The Chain Rule . . . . . . . . . . . . . . . . . . . . 20.9.2 Diﬀerentiation And The Derivative . . . . . . . . . . 20.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.11 The Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . 20.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.13 Nonlinear Ordinary Diﬀerential Equations Local Existence 20.14 Lagrange Multipliers . . . . . . . . . . . . . . . . . . . . . 20.15 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20.16 Taylor’s Formula For Functions Of Many Variables . . . . 20.16.1 Some Linear Algebra . . . . . . . . . . . . . . . . . 20.16.2 The Second Derivative Test . . . . . . . . . . . . . . 20.17 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . .

469 . 469 . 470 . 473 . 474 . 475 . 478 . 478 . 484 . 484 . 484 . 489 . 492 . 493 . 495 . 496 . 499 . 504 . 506 . 508 . 509 . 513

CONTENTS 21 The 21.1 21.2 21.3 21.4 21.5 21.6 21.7 21.8 21.9 Riemann Integral On Rn Methods For Double Integrals . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . Methods For Triple Integrals . . . . . . . . Exercises With Answers . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . Diﬀerent Coordinates . . . . . . . . . . . . . Exercises With Answers . . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . . . The Moment Of Inertia . . . . . . . . . . . 21.9.1 The Spinning Top . . . . . . . . . . 21.9.2 Kinetic Energy . . . . . . . . . . . . 21.9.3 Finding The Moment Of Inertia And 21.10 Exercises . . . . . . . . . . . . . . . . . . . 21.11 Theory Of The Riemann Integral . . . . . 21.11.1 Basic Properties . . . . . . . . . . . 21.11.2 Iterated Integrals . . . . . . . . . . 21.11.3 Some Observations . . . . . . . . .

9 519 519 527 528 533 537 538 543 549 551 551 554 556 557 558 560 573 577

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Center Of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Mass . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . .

22 The Integral On Other Sets 579 22.1 Exercises With Answers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587 22.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591 23 Vector Calculus 23.1 Divergence And Curl Of A Vector Field . . . . . . . . 23.2 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 23.3 The Divergence Theorem . . . . . . . . . . . . . . . . 23.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 23.5 Some Applications Of The Divergence Theorem . . . . 23.5.1 Hydrostatic Pressure . . . . . . . . . . . . . . . 23.5.2 Archimedes Law Of Buoyancy . . . . . . . . . 23.5.3 Equations Of Heat And Diﬀusion . . . . . . . . 23.5.4 Balance Of Mass . . . . . . . . . . . . . . . . . 23.5.5 Balance Of Momentum . . . . . . . . . . . . . 23.5.6 The Wave Equation . . . . . . . . . . . . . . . 23.5.7 A Negative Observation . . . . . . . . . . . . . 23.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . 23.7 Stokes Theorem . . . . . . . . . . . . . . . . . . . . . . 23.7.1 Green’s Theorem . . . . . . . . . . . . . . . . . 23.8 Green’s Theorem Again . . . . . . . . . . . . . . . . . 23.9 Stoke’s Theorem From Green’s Theorem . . . . . . . . 23.9.1 Conservative Vector Fields . . . . . . . . . . . 23.9.2 Maxwell’s Equations And The Wave Equation 23.10Exercises . . . . . . . . . . . . . . . . . . . . . . . . . A The Fundamental Theorem Of Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593 593 597 598 602 603 603 603 604 605 606 610 611 611 612 616 617 619 622 625 626 629

10

CONTENTS

Introduction

Calculus consists of the study of limits of various sorts and the systematic exploitation of the completeness axiom. It was developed by physicists and engineers over a period of several hundred years in order to solve problems from the physical sciences. It is the language by which precision and quantitative predictions for many complicated problems are obtained. It is used to ﬁnd lengths of curves, areas and volumes of regions which are not bounded by straight lines. It is used to predict and account for the motion of satellites. It is essential in order to solve many maximization problems and it is prerequisite material in order to understand models based on diﬀerential equations. These and other applications are discussed to some extent in this book. It is assumed the reader has a good understanding of algebra on the level of college algebra or what used to be called algebra II along with some exposure to geometry and trigonometry although the book does contain an extensive review of these things. I have tried to keep the book a manageable length in order to focus more on the important ideas. I have also tried to give complete proofs of all theorems in one variable calculus and to at least give plausibility arguments for those in multiple dimensions. Physical models are derived in the usual way through the use of diﬀerentials leading to diﬀerential equations which are introduced early and used throughout the book as the basis for physical models. I expect the reader to be able to use a calculator whenever it would be helpful to do so. Many of the exercises will be very troublesome without one. Having said this, calculus is not about using calculators or any other form of technology. I believe that when the syntax and arcane notation associated with technology are presented, these things become the topic of study rather than the concepts of calculus. This is a book on calculus and should not be considered an instruction manual for the use of technology. Pictures are often helpful in seeing what is going on and there are many pictures in this book for this reason. However, calculus is not about drawing pictures and ultimately rests on logic and deﬁnitions. Algebra plays a central role in gaining the sort of understanding which generalizes to higher dimensions where pictures are not available. Therefore, I have emphasized the algebraic aspects of this subject far more than is usual, especially linear algebra which is absolutely essential to understand in order to do multivariable calculus. I have also featured the repeated index summation convention and the usual reduction identities which allow one to discover vector identities.

11

12

INTRODUCTION

Part I Preliminaries 13 .

.

−4 −3 −2 −1 ' 0 1 1/2 2 3 4 E As shown in the picture. By 2 analogy. 1. Axiom 2. This section contains a review of the algebraic properties of real numbers. 0. denoted by R.1. Thus the integers are deﬁned as Z ≡ {· · · − 1. 1 is half way between the number 0 and the number. These properties will not be proved which is why they are called axioms rather than theorems. (existence of additive inverse). listed here as a collection of assertions called axioms. 15 . deﬁned as the numbers which are the quotient of two integers. the natural numbers. the notation. 2.3 For each x ∈ R.1 The Number Line And Algebra Of The Real Numbers To begin with. Axiom 2. In general. · · ·} and the rational numbers.1. there exists −x ∈ R such that x + (−x) = 0. In this book. 1. 2. as a line extending inﬁnitely far in both directions. Q≡ m such that m. (additive identity). you can see where to place all the other rational numbers. N ≡ {1. It is assumed that R has the following algebra properties. (commutative law for addition) Axiom 2. ≡ indicates something is being deﬁned. n ∈ Z.The Real Numbers An understanding of the properties of the real numbers is essential in order to understand calculus. · · ·} . n = 0 n are each subsets of R as indicated in the following picture. axioms are statements which are regarded as true. Often these are things which are “self evident” either from experience or from some sort of intuition but this does not have to be the case. consider the real numbers.1 x + y = y + x.1.2 x + 0 = x.

Axiom 2.1. the claim is made that the additive inverse and the multiplicative inverse are unique. Thus. In many cases the source of confusion will disappear almost magically. (commutative law for multiplication).(distributive law). you would normally interpret it as the ﬁrst of the above alternatives. The multiplicative inverse and additive inverses are unique.(existence of multiplicative inverse). feelings and emotions have nothing to do with math. think of division as multiplication by the multiplicative inverse as just indicated and think of subtraction as addition of the additive inverse.4 (x + y) + z = x + (y + z) . Proofs and deﬁnitions are very important in mathematics because they are the means by which “truth” is determined. These axioms are known as the ﬁeld axioms and any set (there are many others besides R) which has two such operations satisfying the above axioms is called a ﬁeld. (multiplicative identity). only one number has the property that it is a multiplicative inverse. if you see 6 − 3 − 4.1. think x + (−y) . Axiom 2. It is assumed that the reader is completely familiar with these axioms in the sense that he or she can do the usual algebraic manipulations taught in high school and junior high algebra courses. The signiﬁcance of this is that if you are wondering if a given number is the additive inverse of a given number. The axioms listed above are just a careful statement of exactly what is necessary to make the usual algebraic manipulations valid. It is conventional to simply do the operations in order of appearance reading from left to right. 1 There are certainly real and important things which should not be described using mathematics because it has nothing to do with these things. all you have to do is to check and see if it acts like one. Axiom 2. there exists x−1 such that xx−1 = 1.1 It is also the deﬁnitions and proofs which make the subject of mathematics intellectually worth while. (associative law for addition). the following theorem is important and follows from the above axioms.10 The above axioms imply the following. Truth is not determined on the basis of experiment or opinions and it is this which makes mathematics useful as a language for describing certain kinds of reality in a precise manner. only one number has the property that it is an additive inverse and that. something is “true” if it follows from axioms using a correct logical argument.7 1x = x. In the ﬁrst part of the following theorem.1. Theorem 2. Take these away and it becomes a gray wasteland ﬁlled with endless tedium and meaningless manipulations. (associative law for multiplication). The reason for this is that subtraction and division do not satisfy the associative law.9 x (y + z) = xy + xz.1. Whenever you feel a little confused about an algebraic expression which involves division or subtraction. the so called binary operations of addition and multiplication are associative and so no such confusion will occur. Axiom 2. For example.5 xy = yx.1. This means that for a given number. A word of advice regarding division and subtraction is in order here.6 (xy) z = x (yz) . think x y −1 and when you see x − y. The reasoning which demonstrates this assertion is called a proof.8 For each x = 0. In doing algebra. 1. Axiom 2.16 THE REAL NUMBERS Axiom 2. . In mathematics. given a nonzero number. Division and subtraction are deﬁned in the usual way by x−y ≡ x+(−y) and x/y ≡ x y −1 .1.1. This means there is a natural ambiguity in an expression like 6 − 3 − 4. when you see x/y. Thus. Do you mean (6 − 3) − 4 = −1 or 6 − (3 − 4) = 6 − (−1) = 7? It makes a diﬀerence doesn’t it? However.

− (−x) = x. −x for x. From the deﬁnition of the additive identity and the distributive law it follows that 0x = (0 + 0) x = 0x + 0x. Therefore. Consider 2. use 2. If xy = 0 then either x = 0 or y = 0. (−1) (−1) = 1. You should prove the multiplicative inverse is unique. To verify (−1) x = −x. It follows from 1. it suﬃces to show x acts like the additive inverse of −x in order to conclude that − (−x) = x. it follows (−1) x = −x as claimed. This is because it has just been shown that additive inverses are unique. To demonstrate 3. (−1) x = −x 4. 0x = 0. there exists an additive inverse. From the existence of the additive inverse and the associative law it follows 0 = (−0x) + 0x = (−0x) + (0x + 0x) = ((−0x) + 0x) + 0x = 0 + 0x = 0x To verify the second claim in 2.2. (−1) + (−1) (−1) = 0. ((−x) + x) + y = ((−x) + x) + z which implies 0 + y = 0 + z. and 2. 17 Proof: Suppose then that x is a real number and that x + y = 0 = x + z. Now by the deﬁnition of the additive identity. by 2. by the uniqueness of the additive inverse proved in 1. It is necessary to verify y = z.. and the distributive law to write x + (−1) x = x (1 + (−1)) = x0 = 0. Then x−1 exists by the axiom about the existence of multiplicative inverses. From the above axioms. By the deﬁnition of additive inverse.. It is desired to verify 0x = 0. and the distributive law. this implies y = z. . suppose x = 0. x + (−x) = 0 and so x = − (−x) as claimed. and the associative law for multiplication.. Therefore. −x + 0 = (−x) + (x + y) = (−x) + (x + z) and so by the associative law for addition. To verify 4. THE NUMBER LINE AND ALGEBRA OF THE REAL NUMBERS 2. 3. that 1 = − (−1) = (−1) (−1) . Therefore.1. y = x−1 x y = x−1 (xy) = x−1 0 = 0. (−1) (1 + (−1)) = (−1) 0 = 0 and so using the deﬁnition of the multiplicative identity..

Example 2. Note the use of the j as a generic variable which takes values from 1 up to m. the notation requires m+r xj−r = x1 + x2 + · · · + xm j=1+r and so j=1 xj = j=1+r xj−r . x1 . When you have doubts. insert parentheses to resolve the ambiguities. you manipulate expressions involving letters in the same manner as you would if they were speciﬁc numbers. j=k It is also possible to change the variable of summation. j=1 Thus this symbol. Then m xj ≡ x1 + x2 + · · · + xm .1. not just numbers.12 6 k=1 m (2k + 1) = 48. Exponents are always done before multiplication. Thus y 2 = y × y and b−3 = b1 3 etc. Thus xy 2 = x y 2 and is not equal 2 to (xy) . there are a few conventions related to the order in which operations are performed. x2 .11 Let x1 . Parentheses are done before anything else.18 THE REAL NUMBERS This proves 4. Since they represent numbers. Deﬁnition 2. Also. Recall the notion of something raised to an integer power. m xj = x1 + x2 + · · · + xm j=1 while if r is an integer. you should verify the following. Division or multiplication is always done before addition or subtraction. and completes the proof of this theorem. If you have not seen this. Be very careful of such things since they are a source of mistakes. Thus x − y (z + w) = x − [y (z + w)] and is not equal to (x − y) (z + w) . · · ·. Also recall summation notation. m m+r . · · ·. xm and add them all up. the following is a short review of this topic.1. As an example of the use of this notation. Be sure you understand why m+1 m xk = k=1 k=1 xk + xm+1 . As a slight generalization of this notation. This notation will be used whenever there are things which can be added. Summation notation will be used throughout the book whenever it is convenient to do so. Another thing to keep in mind is that you often use letters to represent numbers. x2 . m xj ≡ xk + · · · + xm . xm be numbers. j=1 xj means to take all the numbers.

You add these just like they were numbers. 6. Therefore. Let x = 3 and y = 1. (a) (b) (c) x2 y+xy 2 +x x x2 y+xy 2 +x xy x3 +2x2 −x−2 x+1 14. Let x = 1 and y = 2. x and y. 5 = 3. EXERCISES Example 2. 5 = 9. and (x + y) . √ Find the error in the following argument. Dividing both sides by (y − x) yields x = x+y. 3. 2 12. Show on the number line the eﬀect of subtracting a positive number from another positive number. Write the ﬁrst expression as (x2x(x−1) and +y)(x−1) y (x2 +y ) the second as (x−1)(x2 +y) . Find the error in the following argument. x2 + 1 = x + 1 and so letting x = 2. Now substituting in what these variables equal yields 1 = 1+1. When is it true that (x + y) = xn + y n ? 9.2. y x + 2+y x x−1 = = y x2 + y x (x − 1) + (x2 + y) (x − 1) (x − 1) (x2 + y) x2 − x + yx2 + y 2 . Find a formula for (x + y) . Show on the number line the eﬀect of adding two positive numbers. 8. Simplify x2 y 4 z −6 x−2 y −1 z . y) .13 Add the fractions. 2) . Therefore. Add the fractions x x2 −1 + x−1 x+1 . x x2 +y 19 + y x−1 . yields 2 = 9. 1 3 n = 1 x+y = 1 x + 1 y = 13. 11. √ 10. Then cross multiplying. (x + y) .2. 2. 2 3 4 7. Find the error in the following. Then since these have the same common denominator. 5. Show − (ab) = (−a) b. Then 3 1 + 1 = 2 . In these expressions the variables represent real numbers. Based on what you observe for 8 these. Then xy = y 2 and so xy − x2 = y 2 − x2 . Simplify the following expressions using correct algebra. you add them as follows. Find f (−1. 4.2 Exercises 1. Consider the expression x + y (x + y) − x (y − x) ≡ f (x. Let x = y = 1. x (y − x) = (y − x) (y + x) . Find the error in the following argument. give a formula for (x + y) . (x2 + y) (x − 1) 2. Then 1 = 3 − 2 = 3 − (3 − 1) = x − y (x − y) = (x − y) (x − y) = 22 = 4. . Show on the number line the eﬀect of multiplying a number by −1.1.

Axiom 2. (x is less than y) means y lies to the right of x on the number line.5 If x < y and y < z then x < z (Transitive law).3 Order The real numbers also have an order deﬁned on them. Instead. things assumed to be true. one and only one of the following alternatives holds. the following should be fairly reasonable and are listed as axioms.3. You may assume the associative. This order can be deﬁned very precisely in terms of a short list of axioms but this will not be done here.7 If x ≤ 0 and y ≤ 0. Show the rational numbers satisfy the ﬁeld axioms.3.8 If x > 0 then x−1 > 0.3.3.3.20 15.9 If x < 0 then x−1 < 0. commutative. Axiom 2.10 If x < y then xz < yz if z > 0. Verify the following formulas. is positive if x > 0. x.11 If x < y and z < 0. properties which should be familiar are listed here as axioms. Deﬁnition 2. Axiom 2.3. Find the error in the following. Axiom 2.2 The sum of two positive real numbers is positive.3. Axiom 2.3. 2. Axiom 2. x < y.3 The product of two positive real numbers is positive. xy + y = y + y = 2y.3. and distributive laws hold for the integers. x ≤ y if either x = y or x < y.4 For a given real number x. then xy ≥ 0. x Now let x = 2 and y = 2 to obtain 3=4 THE REAL NUMBERS 17. I suggest you plug in some numbers to reassure yourself about these axioms. Axiom 2. Either x is positive. . Axiom 2. x = 0.3. x ≥ y if either x > y or x = y.3. then xz > zy (multiplication of an inequality by a negative number). in words (x is greater than y) means x is to the right of y on the number line.1 The expression. Axiom 2. or −x is positive. If you examine the number line. A number. The expression x > y.6 If x < y then x + z < y + z (addition to an inequality). Axiom 2. (a) (x − y) (x + y) = x2 − y 2 (b) (x − y) x2 + xy + y 2 = x3 − y 3 (c) (x + y) x2 − xy + y 2 = x3 + y 3 16. (multiplication of an inequality by a positive number). in words.

3. 3. x2 + 2x + 4 = 2 (x + 1) + 3 ≥ 0 for all x. subtract x from both sides to get x + 4 ≤ −8. 8} = {1.3.2. In general A ∪ B ≡ {x : x ∈ A or x ∈ B} . To indicate that 3 is an element of {1. The union of two sets is the set consisting of everything which is contained in at least one of the sets. 8}. A or B. 8} . This notation says: the set of all integers. 3. . 4. 8} . x. 4. 3. 2. As an example of the union of two sets. 2. 7.12 Each of the above holds with > and < replaced by ≥ and ≤ respectively except for 2. 8} because these numbers are those which are in at least one of the two sets. In general.8 and 2. If A and B are sets with the property that every element of A is an element of B. or x > y (trichotomy). Therefore. The second case yields x + 1 ≤ 0 and 2x − 3 ≤ 0 which implies x ≤ −1 and x ≤ 3 . 8} would be a set consisting of the elements 1. {1. {1. The intersection of two sets. 3. Then multiply both sides by (−1) to obtain x ≤ −12. either both of the factors. 3. 3. 2. 8} means 9 is not an element of {1.13 For any x and y. 4. 8} ⊆ {1. the solution to this problem is all x ∈ R. 8} . {1. 8} ⊇ {1. 2.3. Therefore. To simplify the way such things are written. 5. 2.3. For example. 9 ∈ {1. 3. 7. / Sometimes a rule speciﬁes a set. 4. Example 2. 2. This is described next. it is customary to write 3 ∈ {1. 2. 3. 8} because 3 and 8 are those elements the two sets have in common. then A is a subset of B. involves set notation. such that x > 2. and 8. 3. 2. For example you could specify a set as all integers larger than 2.16 Solve the inequality (x) (x + 2) ≥ −4 Here the problem is to ﬁnd x such that x2 + 2x + 4 ≥ 0. 3. the solution to this inequality is x ≤ −1 or x ≥ 3 . 8} . 8} = {3.3. 8} . 2. 2. 2 2 Example 2. Example 2. 2. 8} ∪ {3.3.3. 5. Either x = y. 7. 2. in symbols. However.3.1 Set Notation A set is just a collection of things called elements. 8} ∩ {3. The ﬁrst case yields x + 1 ≥ 0 and 2x − 3 ≥ 0 so x ≥ −1 and x ≥ 2 3 yielding x ≥ 2 . 2. It is not an exclusive or. 3. 4. Axiom 2.14 Solve the inequality 2x + 4 ≤ x − 8 Subtract 2x from both sides to yield 4 ≤ −x−8.3. x < y. If this is to hold. Next add 8 to both sides to get 12 ≤ −x. ORDER 21 Axiom 2. A ∩ B ≡ {x : x ∈ A and x ∈ B} . This would be written as S = {x ∈ Z : x > 2} . 3. 3.15 Solve the inequality (x + 1) (2x − 3) ≥ 0. Note that trichotomy could be stated by saying x ≤ y or y ≤ x. 5. 4. x + 1 and 2x − 3 are nonnegative or they 3 are both nonpositive. Alternatively.3. 2. The same statement about the two sets may also be written as {1.2. 2.9 in which it is also necessary to stipulate that x = 0. Be sure you understand that something which is in both A and B is in the union.3. exactly one of the following must hold. Then subtract 4 from both sides to obtain x ≤ −12. For example {1. A and B consists of everything which is in both of the sets. Thus {1. 8} is a subset of {1.

/ Set notation is used whenever convenient. Mathematicians like to say the empty set is a subset of every set. To illustrate the use of this notation consider the same three examples of inequalities. Thus the answer would be −∞. 1) . b) are deﬁned by analogy to what was just explained. [a. Other intervals such as (−∞. 2 3 2 was the answer. such that ∅ has something in it which is not in A.18 Solve the inequality (x + 1) (2x − 3) ≥ 0. ∞) means the set of all numbers. These sorts of sets of real numbers are called intervals. b) denotes the set of real numbers such that a ≤ x < b. a situation which never occurs. Solve (3x + 1) (x − 2) ≤ 0. ∞). Written as −1 . To identify the interesting intervals. . The reason that there will always be a curved parenthesis next to ∞ or −∞ is that these are not real numbers. denoted by ∅. −1 ∪ (2. Thus ∅ is deﬁned as the set which has no elements in it. ∞) . This was worked earlier and x ≤ −1 or x ≥ this is denoted by (−∞. and nonnegative if x ≤ −1. a] means the set of all real numbers which are less than or equal to a. b] denotes the set of real numbers. [a. x such that a < x ≤ b. (x + 1) and (2x − 2) and determine where these equal zero. The reason they say this is that if it were not so. A. This is just everything not included in the above problem. the curved parenthesis indicates the end point it sits next to is not included while the square parenthesis indicates this end point is included. they cannot be included in any set of real numbers. However. x such that a < x < b and (a.17 Solve the inequality 2x + 4 ≤ x − 8 This was worked earlier and x ≤ −12 was the answer. Example 2.3. all that was necessary to do was to look at the two factors. ∞) . x+1 Note that 2x−2 is positive if x > 1.22 THE REAL NUMBERS When with real numbers.3. This is written as (−∞. ∅ has nothing in it and so the least intellectual discomfort is achieved by saying ∅ ⊆ A. In terms of set notation Example 2.19 Solve the inequality (x) (x + 2) ≥ −4 Recall this inequality was true for any value of x. 2. x such that x ≥ a and (−∞. Thus A \ B ≡ {x ∈ A : x ∈ B} . In general. 3 3 1. such that a ≤ x ≤ b and [a. Thus either 3x + 1 ≤ 0 and x − 2 ≥ 0 in which case x ≤ −1 and x ≥ 2. Therefore. −12]. there would have to exist a set. A special set which needs to be given a name is the empty set also called the null set. b] indicates the set of numbers. 2. Therefore. b) consists of the set of real numbers. Example 2. x. 1) . negative if x ∈ (−1. Solve (3x + 1) (x − 2) > 0. It is written as R or (−∞. If A and B are two sets. a and b are called endpoints of the interval. the answer is (−1. (a. The two points. 3 3. 2 . or else 3 3x + 1 ≥ 0 and x − 2 ≤ 0 so x ≥ −1 and x ≤ 2. A \ B denotes the set of things which are in A but not in B. −1] ∪ [ 3 . Solve x+1 2x−2 < 0.3.4 Exercises With Answers This happens when the two factors have diﬀerent signs.

Theorem 2. 2.6. 8. the answer is [−2. How far away from 0 is the number 3? How about the number −3? Look at the number line and observe they are both 3 units away from 0. 2. Solve x+1 < 1. Deﬁnition 2. Solve 11. Solve 3x−4 x2 +2x+2 3x+9 x2 +2x+1 x2 +2x+1 3x+7 2 ≥ 0. −x if x < 0. Solve (x + 2) (x − 2) ≤ 0. Therefore.2 |xy| = |x| |y| .1 |x| ≡ x if x ≥ 0. 9. On something like this.5 Exercises 1. x+3 5. 7. < 1. Solve x+2 3x−2 < 0.6 The Absolute Value A fundamental idea is the absolute value of a number. This is important because the absolute value deﬁnes distance on R. To describe this algebraically. Solve x2 − 2x ≤ 0. It may be useful to think of this function in terms of its graph if you recall the notion of the graph of a function. Solve (3x + 2) (x − 3) > 0. Solve (x − 1) (2x + 1) ≤ 2. ≥ 1. subtract 1 from both sides to get 6 + x − x2 (3 − x) (2 + x) = . Solve 10. 3].2. For x ∈ (−2. Solve 3x+7 x2 +2x+1 23 ≥ 1. this equals zero. . Solve (3x + 2) (x − 3) ≤ 0. 2.5. 3.6. Solve (x − 1) (2x + 1) > 2. y d d d d d x The following is a fundamental theorem about the absolute value. Thus |x| can be thought of as the distance between x and 0. 6. 4. EXERCISES 4. 3) the expression is positive and it is negative if x > 3 or if x < −2. 2 x2 + 2x + 1 (x + 1) When x = 3 or x = −2.

2) (2. Theorem 2. Therefore.) Applying this observation to the above inequality.6.6.1) while if |x| − |y| ≤ 0. Therefore. y ≤ 0.2. This proves the theorem. You observe that 5 > 1 and so it is important to remember that the triangle inequality is an inequality. b ≥ 0 then if a2 ≤ b2 it follows that b2 ≥ ba ≥ a2 and so b ≥ a (see the above axioms. Multiply by a−1 if a = 0. Example 2. ||x| − |y|| ≤ |x − y| .2). |x + y| ≤ |x| + |y| . ||x| − |y|| ≤ |x − y| because if |x| − |y| ≥ 0. then the conclusion follows from (2. y ≥ 0 and x ≤ 0 while y ≥ 0. Note there is an inequality involved. the conclusion follows from (2. Now note that if 0 ≤ a ≤ b then 0 ≤ a2 ≤ ab ≤ b2 and that if a. Consider the following. |3 + (−2)| = |1| = 1 while |3| + |(−2)| = 3 + 2 = 5. This veriﬁes the ﬁrst of these inequalities.4 Solve the equation |x − 1| = 2 (2. You should verify the other cases. Proof: By Theorem 2.3 The following inequalities hold. both x. |x + y| ≤ |x| + |y| .24 THE REAL NUMBERS Proof: If both x. then |xy| = xy because in this case xy ≥ 0 while |x| |y| = (−x) (−y) = (−1) x (−1) y = (−1) (−1) xy = xy. in this case the result of the theorem is veriﬁed. |x + y| = (x + y) 2 2 2 = (x + y) 2 2 = x2 + y 2 + 2xy ≤ x2 + y 2 + 2 |x| |y| = |x| + |y| + 2 |x| |y| = (|x| + |y|) . To obtain the second one.1) 2 .6. note |x| = |x − y + y| ≤ |x − y| + |y| and so |x| − |y| ≤ |x − y| Now switch the letters to obtain |y| − |x| ≤ |y − x| = |x − y| . Either of these inequalities may be called the triangle inequality. This theorem is the basis for the following fundamental result which is of major importance in calculus.

1 ] ∪ [3. 1 . This happens if 2x − 1 > 2 or if 2x − 1 < −2. the desired inequality will hold. The values of x larger than 3 do 3 not produce equality so either |x + 1| < |2x − 2| for these points or |2x − 2| < |x + 1| for these points. Therefore. Note that some of this is arbitrary. |x + 1| = x + 1 when x > −1 and |x + 1| = −1 − x when x ≤ −1.6 Solve the inequality |2x − 1| > 2. you see the ﬁrst of the two cases is the one which holds.6.8 Solve |x + 1| ≤ |2x − 2| In order to keep track of what is happening.6. Similar reasoning obtains (−∞. Example 2. Therefore. if |x − 2| < 1. Consider x 3 between 1 and 3. [3. it is not hard to draw its graph. −1/2 < x and x < 3/2. ∞ ∪ −∞.6. if |x − 2| < 50 . 1 Therefore. 3 must be excluded. y = |x + 1| and y = |2x − 2| on the same set of coordinate axes.2. δ. Therefore. In other words. It could be the case that x + 1 = 2x − 2 in which case x = 3 or alternatively. You can see these values of x do not solve the inequality. then ||x| − 2| < 3 and so |x| < 5. If |x − 2| < 1. The result is 10 8 6 4 2 -4 -2 0 2 4 Equality holds exactly when x = 3 or x = 1 as in the preceding example. then ||x| − |2|| < 1 and so |x| < 3. x2 − 4 = |x + 2| |x − 2| ≤ (|x| + 2) |x − 2| ≤ 5 |x − 2| . Example 2. Similar considerations apply to the other relation. Example 2. This is not a hard job. there are two solutions to this problem. 1 ]. x + 1 = 2 − 2x in which case x = 1/3. Thus the solution is x > 3/2 or x < 3 −1/2.9 Obtain a number. −2 < 2x − 1 < 2 and so −1/2 < x < 3/2. For example. 2 . it is necessary to have 2x − 1 between −2 and 2 because the inequality says that the distance from 2x − 1 to 0 is less than 2. ∞) is included. 3 Example 2. THE ABSOLUTE VALUE 25 This will be true when x − 1 = 2 or when x − 1 = −2. if |x − 2| < 3. Checking examples. 2 Example 2. Therefore. it is a very good idea to graph the two relations. Therefore.5 Solve the inequality |2x − 1| < 2 From the number line.7 Solve |x + 1| = |2x − 2| There are two ways this can happen. for such x. It follows the solution set 3 to this inequality is (−∞. − 1 . x = 3 or x = −1. x2 − 4 = |x + 2| |x − 2| ≤ (|x| + 2) |x − 2| ≤ 7 |x − 2| .6. then x2 − 4 < 1/10. Therefore. ∞).6. For example 3 x = 1 does not work.6. Therefore. such that if |x − 2| < δ.

then | x − 1| < ε.7 Exercises 1. Show a−1 > b−1 . ε Now let δ = min 1. 5. 3 2.26 THE REAL NUMBERS 1 and so it would also suﬃce to take |x − 2| < 70 . not the question of ﬁnding a particular such number. Show that if |x − 6| < 1. 8. Solve x+3 2x+1 x+2 3x+1 > 1. Obtain a number. 4. This notation means to take the minimum of the two numbers. then |x| < 7. First of all. 6. Obtain a number. Example 2. it follows |x| < 2 and so for |x − 1| < 1. Give your answer in terms of intervals on the real line. then x2 − 1 < ε. δ > 0. such that if |x − 1| < δ. Solve (x + 1) (2x − 2) x ≥ 0. Solve |x + 1| = |2x − 3| . such that if |x − 1| < δ. > 2. then | x − 2| < 1/10. Solve |x + 2| < |3x − 3| . √ 15. note x2 − 1 = |x − 1| |x + 1| ≤ (|x| + 1) |x − 1| . 11. How large can |x − 5| be? 14. You need to tell how. show that if x ≤ z and y < 0 then xy ≥ yz. δ > 0. 2. Solve 9. a 16. x2 − 1 < 3 |x − 1| . Suppose 0 < a < b. 13. Solve |3x + 1| < 8. .10 Suppose ε > 0 is a given positive number. such that if |x − 4| < δ. 7. then any which is smaller will also work. Then if |x − 1| < δ. The example is about the existence of a number which has a certain property. 3 . Now if |x − 1| < 1. Hint: This δ will depend in some way on ε. Obtain a number. Solve |x + 2| ≤ 8 + |2x − 4| . Obtain a number. such that if |x − 1| < δ. 10. Suppose ε > 0 is√ given positive number. δ > 0. 3. Describe the set of numbers. a such that there is no solution to |x + 1| = 4 − |x + a| . δ > 0. 12. Verify the axioms for order listed above are reasonable by consideration of the number line. x2 − 1 < 3 |x − 1| < 3 ε = ε. Suppose |x − 8| < 2.6. then x2 − 1 < 1/10. Tell when equality holds in the triangle inequality. There are inﬁnitely many which will work because if you have found one. In particular. 1 ε and 3 .

WELL ORDERING PRINCIPLE AND ARCHIMEDIAN PROPERTY 27 2. Suppose this formula is valid for some n ≥ 1 where n is an integer. It must be the case that b > a since by deﬁnition.8 Well Ordering Principle And Archimedian Property Deﬁnition 2. 2.3 (Mathematical induction) A set S ⊆ Z. Then the integer. 6 By inspection. / b − 1 ∈ ([a. Mathematical induction is a very useful devise for proving theorems about the integers.8.2. Axiom 2. ∞) ∩ Z) \ S = T which contradicts the choice of b as the smallest element of T. Theorem 2.8.1 A set is well ordered if every nonempty subset S.4 Prove by induction that n k=1 k2 = n(n+1)(2n+1) . If T = ∅ then by the well ordering principle. Thus T consists of all integers larger than or equal to a which are not in S. having the property that a ∈ S and n + 1 ∈ S whenever n ∈ S contains all integers x ∈ Z such that x ≥ a. contains a smallest element z having the property that z ≤ x for all x ∈ S. denoted as b. there would have to exist a smallest element of T. The theorem will be proved if T = ∅. then b − 1 + 1 = b ∈ S by the assumed property of S. ∞) ∩ Z is also in S. a ∈ T. Proof: Let T ≡ ([a. b − 1 ≥ a and / b − 1 ∈ S because if b ∈ S. This is called the induction hypothesis. ∞) ∩ Z) \ S. n (n + 1) (2n + 1) 2 + (n + 1) . Then n+1 n k2 = k=1 k=1 k 2 + (n + 1) 2 = n (n + 1) (2n + 1) 2 + (n + 1) . 6 The step going from the ﬁrst to the second line is based on the assumption that the formula is true for n. The sum yields 1 and so does the formula on the right.8. Example 2. Now simplify the expression in the second line.2 Any set of integers larger than a given number is well ordered. it must be the case that T = ∅ and this says that everything in [a. 6 This equals n (2n + 1) (n + 1) + (n + 1) 6 and n (2n + 1) 6 (n + 1) + 2n2 + n + (n + 1) = 6 6 (n + 2) (2n + 3) = 6 . the natural numbers deﬁned as N ≡ {1. Therefore.) Since a contradiction is obtained by assuming T = ∅. The above axiom implies the principle of mathematical induction. if n = 1 then the formula is true. (b − 1 is smaller. In particular.8. · · ·} is well ordered.8.

such that x < l < y. Theorem 2.8. Proof: Let x be the smallest positive integer. the ﬁrst step was to show 1 ∈ S and then that whenever n ∈ S. 1 is the smallest natural number. S is normally not mentioned.8. and a > 0. Just look at the number line. Not surprisingly. 6 = showing the formula holds for n + 1 whenever it holds for n. When this has been done. This proves the formula by mathematical induction. 2 . Therefore. It also can be used to verify a very important property of the rational numbers. This proves the inequality. S contains [1.7 R has the Archimedian property. all positive integers.8. If x is an integer. Example 2. One just veriﬁes the steps above. it follows n + 1 ∈ S. First show the thing is true for some a ∈ Z and then verify that whenever it is true for m it follows it is also true for m + 1.8 Suppose x < y and y − x > 1. Deﬁnition 2. In doing an inductive proof of this sort. the theorem has been proved for all m ≥ a. 1 2 · 3 4 ··· 1 2 2n−1 2n < √ 1 .28 Therefore. This is not hard to believe. satisfying x < y < x + 1 since otherwise.8. Suppose < = 1 2n + 1 √ 2n + 1 2n + 2 √ 2n + 1 . Then 1 3 2n − 1 2n + 1 · ··· · 2 4 2n 2n + 2 < 1 √ 3 which is obviously true. 2n+1 If n = 1 this reduces to the statement that then that the inequality holds for n. This happens if and only if 2 1 2n + 1 1 √ = > 2 2n + 3 2n + 3 (2n + 2) which occurs if and only if (2n + 2) > (2n + 3) (2n + 1) and this is clearly true which may be seen from expanding both sides. 2n + 2 1 The theorem will be proved if this last expression is less than √2n+3 . If x < 1 then x2 < x contradicting the assertion that x is the smallest natural number. Therefore. there is no integer y satisfying x < y < x + 1. Lets review the process just used. the set. by the principle of mathematical induction. n+1 THE REAL NUMBERS k2 = k=1 (n + 1) (n + 2) (2n + 3) 6 (n + 1) ((n + 1) + 1) (2 (n + 1) + 1) . there exists n ∈ N such that na > x. Axiom 2. Then there exists an integer. If S is the set of integers at least as large as 1 for which the formula holds. you could subtract x and conclude 0 < y − x < 1 for some integer y − x.5 Show that for all n ∈ N. l ∈ Z. ∞) ∩ Z. y. This shows there is no integer. x = 1 but this can be proved.6 The Archimedian property states that whenever x ∈ R. This Archimedian property is quite important because it shows every real number is smaller than some integer.

Suppose you want to do the following problem 79 . Even though you saw it at a young age it is very profound and quite diﬃcult to understand.8 there exists m ∈ Z such that nx < m < ny and so take r = m/n.8.8. It follows from Theorem 2. You probably saw the process of division in elementary school.8. Thus the above theorem says Q is “dense” in R. It is the next theorem which gives the density of the rational numbers. b) = ∅. Therefore. . This means that for any real number. x < k − 1 < y and this proves the theorem with l = k − 1. If k − 1 ≤ x. then ≤0 y − x ≤ y − (k − 1) = y − k + 1 ≤ 1 contrary to the assumption that y − x > 1. Can this always be done? The answer is in the next theorem and depends here on the Archimedian property of the real numbers. there exists a rational number arbitrarily close to it. 22 This meant 79 = 3 (22) + 13.8. S ∩ (a. 79 and 22 and you wrote the ﬁrst as some multiple of the second added to a third number which was smaller than the second number. n (y − x) = ny + n (−x) = ny − nx > 1. WELL ORDERING PRINCIPLE AND ARCHIMEDIAN PROPERTY Now suppose y − x > 1 and let S ≡ {w ∈ N : w ≥ y} .8. Either k − 1 ≤ x or k − 1 > x.2. Proof: Let n ∈ N be large enough that n (y − x) > 1. Deﬁnition 2. S ⊆ R is dense in R if whenever a < b. Thus. You were given two numbers. Let k be the smallest element of S. k − 1 < y. Theorem 2.11 Suppose 0 < a and let b ≥ 0. 79 = 3 with remainder 13. What did you do? You likely did a process of long division 22 which gave the following result. Thus (y − x) added to itself n times is larger than 1.10 A set.9 If x < y then there exists a rational number r such that x < r < y. Then there exists a unique integer p and real number r such that 0 ≤ r < a and b = pa + r. 29 The set S is nonempty by the Archimedian property. Theorem 2. Therefore.

2 n k=1 10. 8. Show there is no smallest number in Q ∩ (0. contradicting Theorem 2. show by induction that 9. Therefore. By the Archimedian property this set is nonempty. Prove by induction that n k=1 1 k 3 = 4 n4 + 1 n3 + 1 n2 . and F such that things work out right.30 THE REAL NUMBERS Proof: Let S ≡ {n ∈ N : an > b} . D. 2 4 6. Now consider the numbers which are of the form 2k where k ∈ Z and m ∈ N. E. B. r < a as desired. C. If r ≥ a then b − pa ≥ a and so b ≥ (p + 1) a contradicting p + 1 ∈ S. i = 1. 1) means the real numbers which are strictly larger than 0 and smaller than 1. Thus r1 = r2 and it follows that p1 = p2 . Show that if S ⊆ R and S is well ordered with respect to the usual order on R then S cannot be dense in R. 1) . suppose pi and ri .9 Exercises 1. The case that r1 > r2 cannot occur either by similar reasoning. Therefore. 4. 3. 5. The Archimedian property implies the rational numbers are dense in R.8. r ≡ b − pa ≥ 0. 2. Look for a formula in the form An + Bn + Cn + 2 Dn + En + F. Then try to ﬁnd the constants A. n+1 4 5 4 3 2 k= n(n+1) . To verify uniqueness of p and r. Derive a formula for n 4 5 4 3 k=1 k in the following way. When you get your answer. Using the number m line. Prove by induction that whenever n ≥ 2. Then pa ≤ b because p + 1 is the smallest in S. (a + kd) and then prove your . prove it is valid by induction. 1) . Prove by induction that n k=1 n k=1 r ark = a r r−1 − a r−1 . Recall (0. Then a little algebra shows r2 − r1 p1 − p2 = ∈ (0. show (n + 1) = A (n + 1) + B (n + 1) + C (n + 1) + D (n + 1) + E (n + 1) + F −An5 + Bn4 + Cn3 + Dn2 + En + F and so some progress can be made by matching the coeﬃcients. It is a ﬁne thing to be able to prove a theorem by induction but it is even better to be able to come up with a theorem to prove in the ﬁrst place. 2. Find a formula for result by induction. k=1 √k > n. Let a and d be real numbers. This theorem is called the Euclidean algorithm when a and b are integers. Let p + 1 be the smallest element of S. a Thus p1 − p2 is an integer between 0 and 1. demonstrate that the numbers of this form are also dense in R. If r = 0. In doing this. both work and r2 > r1 . 2. √ n 1 7.8. Show there is no smallest number in (0. 1) .

letting A be the amount of the loan. what should the monthly payments be? Hint: The rate per payment period is . Thus there are n payments in all. This problem is a continuation of Problem 11. n n 2 A= k=1 P (1 + r) −k . the rate per payment period equals . Typically. It follows the amount of the loan should equal the sum of these “discounted payments”. So what is a payment made at the end of k payment periods worth today assuming money is worth r per payment period? Shouldn’t it be the amount.) In general. If the payment period is one month. you make payments. You put money in the bank and it accrues interest at the rate of r per payment period. the ﬁrst payment at the end of the ﬁrst payment period. In an ordinary annuity. you n would have P (1 + r) in the bank. etc. . The second sits in the bank for n − 2 payment n−2 periods so it grows to P (1 + r) . and you started with $100 then the amount at the end of one month would equal 100 (1 + r) = 100 + 100r. This is because the one made today could be invested by the bank and having accrued interest for 10 years would be far larger.000 with the ﬁrst monthly payment at the end of the ﬁrst month. In this the second term is the interest and the ﬁrst is called the principal. 1−r ark−1 = k=1 12. then a − arn . n is pretty large. Suppose the available interest rate is 7% per year and you want to take a loan for $100. 360 for a thirty year loan. n k=1 n 31 ark−1 . this means r.06/12. See the formula you got in Problem 13 and solve for P. ﬁnd a formula for the right side of the above formula. Using Problem 11. is worth −k P (1 + r) to the bank right now. In general. Each accrue interest at the rate of r per payment period. Q which when invested at a rate of r per payment period would yield k P at the end of k payment periods? Thus from Problem 12 Q (1 + r) = P and so −k Q = P (1 + r) . Clearly a payment made 10 years from now can’t be considered as valuable to the bank as one made today. the amount you would have at the end of n months would be 100 (1 + r) . Now suppose you want to buy a house by making n equal monthly payments. 14. suppose you start with P and it sits in the bank for n payment periods. P at the end of each payment period. Using Problem 11. Now you have 100 (1 + r) in the bank. Consider the geometric series. This is called the present value of an ordinary annuity.07/12. (When a bank says they oﬀer 6% compounded monthly. Prove by induction that if r = 1. Then at the end of the nth payment period. ﬁnd a formula for the amount you will have in the bank at the end of n payment periods? This is called the future value of an ordinary annuity. If you want to pay oﬀ the loan in 20 years. That is.9.2. 13. Hint: The ﬁrst payment sits in the bank for n − 1 payment periods and so n−1 this payment becomes P (1 + r) . How much will you have at the end of the second month? By analogy to what was just done it would equal 100 (1 + r) + 100 (1 + r) r = 100 (1 + r) . Thus this payment of P at the end of n payment periods. These terms need a little explanation. EXERCISES 11.

Using the binomial theorem prove that for all n ∈ N. Prove that n k the 18.32 15. Argue the term got bigger and then note that in the binomial expansion for 1 + 1 n+1 n+1 n·(n−1)···(n−k+1) and k!nk n+1 1 1 + n+1 except note that a similar term occurs in the . k k same as those obtained in Pascal’s triangle? Prove your assertion. If I jump from a height of one inch. k!nk Now consider the term binomial expansion for that n is replaced with n + 1 whereever this occurs. and (x + y) = x3 + 3x2 y + 3xy 2 + y 3 . 21. n n k=0 k kpk (1 − p) n−k = np. (x + y) = 3 x2 + 2xy + y 2 . Are these numbers. I am unharmed. . there is a relation between the numbers st n on the (n + 1) row and those on the nth row. Use the Problems 21 and 20 to verify for all n ∈ N. if I am unharmed from jumping from a height of n inches. Hint: Letting the numbers of the nth row of Pascal’s triangle be denoted by n n n 0 . Consider the ﬁrst ﬁve rows of Pascal’s2 triangle 1 11 121 1331 14641 THE REAL NUMBERS What would the sixth row be? Now consider that (x + y) = 1x + 1y . The binomial theorem states (a + b) = k=0 n an−k bk . Hint: You might try using the preceding problem. 2k ≤ k! 22. 1) . I can jump oﬀ the top of the empire state building without suﬀering any ill eﬀects. k k This is used in the inductive step. 16. k factors 1+ = k=0 n k 1 n k = k=0 n · (n − 1) · · · (n − k + 1) . Let n whenever k ≥ 1 and k ≤ n. 17. 24. Give a conjecture about that 5 (x + y) would be. Prove by induction that 1 + n i=1 1 n n ≤ 3. = n n·(n−1)···(n−k+1) . the relation being n+1 = n + k−1 . 19. 1 + 23. 1 n n 20. i (i!) = (n + 1)!. k! n By the binomial theorem. This is self evident and provides the induction step. 1 . What is the matter with this reasoning? 2 Blaise Pascal lived in the 1600’s and is responsible for the beginnings of the study of probability. then n+1 = n + k−1 . n n n k n 1 2 ≡ n! (n−k)!k! where 0! ≡ 1 and (n + 1)! ≡ (n + 1) n! for all n ≥ 0. Here is the proof by induction. I can jump from a height of n inches for any n. · · ·. Based on Problem 15 conjecture a formula for (x + y) and prove your conjecture by induction. there are more terms. Furthermore. Prove the binomial theorem k by induction. Show that for p ∈ (0. n in reading from left to right. 1 + Hint: Show ﬁrst that 1 n n n k ≤ 1+ 1 n+1 n+1 . then jumping from a height of n + 1 inches will also not harm me. Therefore. Prove by induction that for all k ≥ 4.

Either p divides m or it does not. Is there something wrong with this argument? 2. there is a smallest element of S. Therefore. . n be two positive integers and deﬁne S ≡ {xm + yn ∈ N : x. then by Theorem 2. Therefore. Then m = qx and n = qy for integers. b if in Theorem 2. Deﬁnition 2.2 Let m. Then the smallest number in S is the greatest common divisor. Theorem 2. If p does not divide m. r = m (1 − x0 ) + (−y0 q) n ∈ S. Proof: Suppose p does not divide a. Therefore. y ∈ Z } .10. The following deﬁnition describes what is meant by a prime number and also what is meant by the word “divides”. all n + 1 horses are the same color as the n − 1 horses which didn’t get moved. The notation for this is a|b. then q divides p.8. The greatest common divisor of two positive integers. Thus m = (x0 m + y0 n) q + r and so.11.10. this is a contradiction because p was the smallest element of S. m = pq + r where 0 < r < p. Put the horse which was removed back in and take out another horse.10. Here is the proof by induction. x and y such that 1 = ax + yp.3 If p is a prime and p|ab then either p|a or p|b. Similarly p|n. Proof: First note that both m and n are in S so it is a nonempty set of positive integers. a divides the number. Two integers are relatively prime if their greatest common divisor is one. p which has the property that p divides both m and n and also if q divides both m and n. solving for r. All horses are the same color. read a divides b and a is called a factor of b.10. Theorem 2. However. Thus p|m. DIVISIBILITY AND THE FUNDAMENTAL THEOREM OF ARITHMETIC 33 25. A prime number is one which has the property that the only numbers which divide it are itself and 1. Now suppose the theorem that all horses are the same color is true for n horses and consider n + 1 horses. r = 0. A single horse is the same color as himself. n) . m. it is good general knowledge and so is included. n is that number. a) = 1 and therefore.10 Divisibility And The Fundamental Theorem Of Arithmetic It is not necessary to read this section in order to do calculus. called p = x0 m + y0 n. Then since the only factors of p are 1 and p it follows(p. there exists integers. Now suppose q divides both m and n. However. The remaining n horses are the same color by the induction hypothesis. By well ordering.11. p = (m. n) . This proves the theorem. denoted by (m. That is there is zero remainder. p = mx0 + ny0 = x0 qx + y0 qy = q (x0 x + y0 y) showing q|p.8. Remove one of the horses and use the induction hypothesis to conclude the remaining n horses are all the same color.2.1 The number. x and y.

It remains to argue the prime factorization is unique except for order of the factors. n. Then by Theorem 2.10. q) = 1. Since these are prime numbers this requires p1 = q1 . · · ·. q2 .4 (Fundamental theorem of arithmetic) Let a ∈ N\ {1}. there is no way to reorder the qk such that m = n and pi = qi for all i. Furthermore. b = abx + ybp = pzx + ybp = p (xz + yb) and this shows p divides b. The next theorem is a very nice high school theorem which characterizes all possible rational roots for polynomials having integer coeﬃcients. then it has a prime factorization. ab = pz for some integer z. if n is not a prime. ± factor of an Proof: Let p be a rational solution. Proof: If a equals a prime number. these are of the form factor of a0 .10. Theorem 2. the fraction q may be reduced to lowest terms such that (p. Dividing p and q by (p. Therefore. Then if the equation has any rational solutions. proving the theorem. Substituting into the equation. Since p|ab. Therefore. and this results in a contradiction. n−1 m−1 pi+1 = i=1 j=1 qj+1 .34 Multiplying this equation by b yields b = abx + ybp. Since n + m was as small as possible for the theorem to fail.3. this prime factorization is unique except for the order of the factors. In particular the prime factorization exists for the prime number 2. qm can be reordered in such a way that pk = qk for all k = 2. each has a prime factorization. Assume this theorem is true for all a ≤ n − 1. Then dividing both sides by p1 = q1 . Suppose n m n pi = i=1 j=1 qj where the pi and qj are all prime. Reordering if necessary it can be assumed that qj = q1 . Thus so does n. each of these is no larger than n − 1 and consequently. the prime factorization clearly exists. it follows that n − 1 = m − 1 and the prime numbers. Hence pi = qi for all i because it was already argued that p1 = q1 . q) if necessary. On the other hand. an pn + an−1 pn−1 q + · · · + a1 pq n−1 + a0 q n = 0. THE REAL NUMBERS Theorem 2. and n + m is the smallest positive integer such that this happens. If n is a prime.5 (rational root theorem) Let an xn + · · · + a1 x + a0 = 0 where each ai is an integer and an = 0. · · ·.10. Then a = i=1 pi where pi are all prime numbers. then there exist two integers k and m such that n = km where each of k and m are less than n. p1 |qj for some j. .

b] is the least common multiple. Explain why this shows there must be inﬁnitely many primes. 2. Example 2. b] q + r where 0 < r < [a. Using the fact that 2 is irrational. c} are three positive integers. a0 q n = − an pn + · · · + a1 pq n−1 and so p|a0 q n but (p. Here is how you might proceed. EXERCISES Hence an pn = − an−1 pn−1 q + · · · + a0 q n 35 and q divides the right side of the equation and therefore. √ 2 must be irrational.10. Now obtain a terrible contradiction. [a. b] will denote their least common multiple. · · ·. pk from the above list must divide it. This argument about inﬁnitely many primes is due to Polya. Show [a. b] must divide ab. Therefore. p|a0 and this proves the theorem. b] = ab/q for some q an integer. However. By Theorem 2. then an and am are relatively prime. Similarly. 2(2 ) + 1 are 3 He n lived about 300 B. some prime number. then all the numbers dividing it other than 1 fail to divide am for all m < n. 3 7. Hence r = 0.10. pn ) = 1 and so by Theorem 2. from Theorem 2. n It gives more information than the argument of Euclid. 3.2. √ √ 2 is irrational 2 is the solution of the equation x2 − 2 = 0. b) .3 q|an because it does not divide pn due to the fact that pn and q have no prime factors in common. Then verify these numbers are irrational.C. Show that if n = m.11. Show if it exists.10. (not rational) show that numbers of the form r 2 where r ∈ Q are dense in R. You ﬁll in the details. Hint: Show [a. show 7 6. y. · · ·. If not. (q. Show that if {a.5. b] is a common multiple of a and b. b are integers. Then this number can’t be prime because it is larger than every prime number. ab = [a. If it is not. Then verify r is a common multiple of a and b contradicting that [a. they have a greatest common divisor which may be written as ax + by + cz for some integers x. 2. q must divide the left side also. . He assumed there were only ﬁnitely many. pn } and then considered the number p1 · · · pn + 1 consisting of the product of all the primes plus 1. Using Theorem 2. However.10. 5 5 are all irrational numbers. The numbers. Therefore.6 An irrational number is one which is not rational. Since [a. If a. 6. Let an = 22 + 1 for n = 1. argue that q must divide both a and b. Either an is prime or it is not. b] . Euclid3 showed there were inﬁnitely many prime numbers using a very simple argument. Now what is the largest such q? This would yield the smallest ab/q. {p1 .10. [a. This means they are not rational. This is the smallest number which has both a and b as factors.11 Exercises √ √ √ 1. 4. the only possible rational roots to this equation are ±2 and ±1 and none of these work. q n ) = 1.3 again. b] = ab/ (a. 5. Therefore. z.5. b. √ √ 2.

it satisﬁes aE1 = af1 and so. 297. it follows from the other equation that x + 2 = 7 and so x = 5. To solve this. (x. 6 Fermat lived from 1601 to 1665. E2 + aE1 = f2 + af1 . y) 4 Leonhard Euler. 967 . He and Lagrange invented the branch of mathematics known as calculus of variations. For more information on these matters. He was the most proliﬁc mathematician ever to live. He had 13 children. algebra. you should see the book by Chahal.6). For example the problem could be to ﬁnd x and y such that x + y = 7 and 2x − y = 8.4) The second equation was replaced by −2 times the ﬁrst equation added to the second. 5 The number in this case is 4. you can see that (5. Hint: To verify an and am are relatively prime for m > n. (2. analysis. p (integer) = 2. His memory was prodigious and he could do unbelievable feats of computation in his head. E2 = f2 (2. then (2. His most famous conjecture was that there is no solution to the equation xn + y n = z n if n ≥ 3. x and y solve the above system if and only if x and y solve the system −3y=−6 x + y = 7. note that the solution set does not change if any equation is replaced by a non zero multiple of itself. Then letting m = n + r. That is there is no analog to pythagorean triples with higher exponents than 2. He is generally regarded as the founder of number theory. 2x − y + (−2) (x + y) = 8 + (−2) (7).12 Systems Of Equations Sometimes it is necessary to solve systems of equations. knowing y = 2.6) Why is this? If (x. lived from 1707 to 1783. mechanics. At this time it is unknown whether there are inﬁnitely many Fermat primes. (2. This was ﬁnally proved in the 1990’s by Andrew Wiles. [4]. from −3y = −6 and now. For example. Why exactly does the replacement of one equation with a multiple of another added to it not change the solution set? The two equations of (2. born in Switzerland. What does this say about p? How does pk1 = 22 + 1 yield a contradiction? n 2. If (x. He made major contributions to number theory. and diﬀerential equations.5) has the same solution set as E1 = f1 . Also. 2) = (x. His collected papers take up more shelf space than a typical encyclopedia.3) The set of ordered pairs. 294 .36 THE REAL NUMBERS prime numbers for several values of n but Euler4 showed that when n = 5. since it also solves E2 = f2 it must solve the second equation in (2. Thus the solution is y = 2. (2. Consequently.5) then it solves the ﬁrst equation in (2.6).explain why pk2 = am = 22 n 2r + 1 = (pk1 − 1) 2r +1 = p (integer) + 2. .3) are of the form E1 = f1 . For example.5) where E1 and E2 are expressions involving the variables. y) which solve both equations is called the solution set. y) is a solution to the above system. y) solves (2.suppose they are not and that for some number. the number is not prime5 . an = pk1 while am = pk2 . p = 1. The claim is that if a is a number. It also does not change if one equation is replaced by itself added to a multiple of the other equation. they are called Fermat6 primes. When numbers of this form are prime.

y) is a solution of the second equation in (2. Add (−2) times the bottom equation to the middle and then add (−6) times the bottom to the top. This yields x + 3y = 19 y=6 z=3 Now add (−3) times the second to the top. In (2. Of course the same reasoning applies with no change if there are many more variables than two and many more equations than two. This yields x=1 y=6 .9) you could have continued as follows. The other thing which does not change the solution set of a system of equations consists of listing the equations in a diﬀerent order. This process is not really much diﬀerent from what you have always done in solving a single equation. you can tell what the solution is.12. x + 3y + 6z = 25 2x + 7y + 14z = 58 2y + 5z = 19 (2. This yields.5). This shows the solutions to (2. Example 2.12. More precisely.7) To solve this system replace the second equation by (−2) times the ﬁrst equation added to the second. Therefore.8) Now take (−2) times the second and add to the third. SYSTEMS OF EQUATIONS 37 solves (2.6) then it solves the ﬁrst equation of (2. the solution set of the whole system does not change. z = 3. This yields x = 11. it follows y + 6 = 8 and so y = 2. Also aE1 = af1 and it is given that the second equation of (2. You did the same thing to both sides of the equation thus preserving the solution set until you obtained an equation which was simple enough to give the answer. Then using this in the second equation.9) At this point. Now using this in the top equation yields x + 6 + 18 = 25 and so x = 1. replace the third equation with (−2) times the second added to the third. . It is still the case that when one equation is replaced with a multiple of another one added to itself. suppose you wanted to solve 2x + 5 = 3x − 6. Here is another example. z=3 a system which has the same solution set as the original system. the system x + 3y + 6z = 25 y + 2z = 8 2y + 5z = 19 (2. This system has the same solution as the original system and in the above. you would add −2x to both sides and then add 6 to both sides. For example.5).2. E2 = f2 and it follows (x.1 Find the solutions to the system. In this case.6) are exactly the same which means they have the same solution set. This yields the system x + 3y + 6z = 25 y + 2z = 8 z=3 (2.5) and (2.6) is veriﬁed.

7) would be to take (−2) times the ﬁrst row of the augmented matrix above and add it to the second row.2 Give the complete solution to the system of equations. 1 3 6 25 0 1 2 8 0 0 1 3 which is the same as (2. multiplying a row by a non zero number. The augmented matrix for this system is 2 4 −3 −1 5 10 −7 −2 3 6 5 9 Multiply the second row by 2. take (−3) times the ﬁrst row and add this to 2 times the last row and replace the last row with this. 2 4 −3 −1 0 0 1 1 . Now when you replace an equation with a multiple of another equation added to itself. 0 0 1 21 . 0 2 5 19 Note how this corresponds to (2. 2x + 4y − 3z = −1. You get the idea I hope. This yields 2 4 −3 −1 0 0 1 1 3 6 5 9 Now. the ﬁrst row by 5. x + 3y + 6z = 25.8). 2 . Thus the ﬁrst step in solving (2. It is easier to write the system (2. combining some row operations. 14 .7) as the following “augmented matrix” 1 3 6 25 2 7 14 58 . 5x+10y−7z = −2. Next take (−2) times the second row and add to the third. 7 and a z column. 1 3 6 25 0 1 2 8 . a y column. Write the system as an augmented matrix and follow the procedure of either switching rows. This yields. Example 2. 0 2 5 19 It has exactly same information the original system but here is understood there is the as it 3 6 1 an x column. Each of these operations leaves the solution set unchanged. and then take (−1) times the ﬁrst row and add to the second. These operations are called row operations.9). or replacing a row by a multiple of another row added to it. Then multiply the ﬁrst row by 1/5.12. Thus the top row in the augmented matrix corresponds to the equation.38 THE REAL NUMBERS It is foolish to write the variables every time you do these operations. and 3x + 6y + 5z = 9. you are just taking a row of this augmented matrix and replacing it with a multiple of another row added to it. The rows correspond 5 2 0 to the equations in the system.

Example 2. −1 −5 9 1 −10 0 0 0 0 and then divide the top row which results by 3. It doesn’t matter what x equals. The phenomenon of an inﬁnite solution set occurs in equations having only one variable also. such a system of linear equations may have a unique solution. These things are all the same.2. This is impossible so the last system of equations determined by the above augmented matrix has no solution. or inﬁnitely many solutions. For example. and z = t where t is completely arbitrary. y − 10z = 0. the last two rows say z = 1 and z = 21. The system has an inﬁnite set of solutions and this is a good description of the solutions. This is what it is all about. This gives 3 −1 −5 9 0 1 −10 0 0 1 −10 0 Next take −1 times the middle row and 3 0 0 Take the middle row and add to the top 1 0 0 add to the bottom. There is clearly no solution to this. 3. Consider for example. xn ) solving each of the equations listed. x = x+1. It turns out these are the only three cases which can occur for linear systems. Therefore. i = 1.12. You can have more equations than variables.12. y = 10t. fj is a number. and −2x + y = −6. This shows there is no solution to the three given equations. SYSTEMS OF EQUATIONS 39 Putting in the variables. no solution.3 Give the complete solution to the system of equations. and it is desired to ﬁnd (x1 . Deﬁnition 2. However. When this happens. you do exactly the same things to solve any linear system. Apparently z can equal any number. 0 0 0 This says y = 10z and x = 3 + 5z. The augmented matrix of this system is 3 −1 0 1 −2 1 9 0 −6 −5 −10 0 Replace the last row with 2 times the top row added to 3 times the bottom row. the solution set of this system is x = 3 + 5t. fewer equations than variables. . 0 −5 3 1 −10 0 . consider the equation x = x. You write the augmented matrix and do row operations until you get a simpler system in which it is possible to see the solution. ﬁnding the solutions to the system. the system is called inconsistent.12. As illustrated above. It can even happen for one equation in one variable. All is based on the observation that the row operations do not change the solution set. It doesn’t matter. This should not be surprising that something like this can take place. 2. Furthermore. · · ·. m j=1 where aij are numbers. n aij xj = fj . You always set up the augmented matrix and go to work on it. · · ·. etc. 3x − y − 5z = 9.4 A system of linear equations is a list of equations. it has the same solution set as the ﬁrst system of equations.

5 Give the complete solution to the system of equations. This yields 492 0 0 −1476 0 0 0 0 0 4 0 12 0 0 3 15 It follows x = −1476 = −3. Give the complete solution to the system of equations. 2. Finally take 41 times the third row and add to 4 times the top row. −3 1 0 12 2 0 1 −1 Now take 2 times the third row and replace fourth row. 0 12 1 −1 To solve this multiply the top row by 109. y + 8z = 0. and 2x + z = −1. Then take 5 times the third row and add to four times the second.12. Give the complete solution to the system of equations. x + 6z = 27. −41 15 0 −5 −3 1 0 2 the fourth row by this added to 3 times the 0 168 0 −15 . 123 −41 0 −492 0 −5 0 −15 . Here are some exercises. and −2x + y = −4. y = 3 and z = 5. 2x + z = 511. This yields −41 15 0 168 0 −5 0 −15 . Then switch the third and the ﬁrst rows. and multiply the top row by 1/109. −3x + y = 12. the second row by 41. 492 You should practice solving systems of equations. and y = 1.40 THE REAL NUMBERS Example 2. 0 12 3 21 Take (−41) times the third row and replace the ﬁrst row by this added to 3 times the ﬁrst row.13 Exercises 1. 2. 0 4 0 12 0 2 3 21 Take −1/2 times the third row and add to the bottom row. −41x + 15y = 168. The augmented matrix is −41 109 −3 2 15 −40 1 0 0 168 0 −447 . 3x − y + 4z = 6. 109x − 40y = −447. . add the top row to the second row.

Give the complete solution to the system of equations. −9x+15y = 66. y − 4z = 0. and −2x + y = −4. 9x − 2y + 4z = −17.13. −2x + y = −12. 15. and −4x − 8y + 13z = 8. and −2x + y = −2. Find the weights of the four girls. 3x − y − 2z = 3. −7x − 14y − 10z = −17. −71x+30y = −404. Give the complete solution to the system of equations. and x + y − 5z = 21. Give the complete solution to the system of equations. and −18x + 40y − 9z = 179. 2x + 4y + 3z = 5. Give the complete solution to the system of equations. Four times the weight of Gaston plus the weight of Siegfried equals 290 pounds. 8x+3y+3z = −1. and 4x + y + 3z = −9. y − 4z = 0. Consider the system −5x + 2y − z = 0 and −5x − 2y − z = 0. 81x + 105y + 20z = 682. z = −4. it doesn’t work. Thus x and z can equal anything. 9x − 18y + 4z = −83. 16. 4x + z = 14. 12. Give the complete solution to the system of equations. and y = 0 are plugged in to the equations. . 14. But when x = 1. 10. Both equations equal zero and so −5x + 2y − z = −5x − 2y − z which is equivalent to y = 0. −8x + 2y + 5z = 18. Why? 4. Give the complete solution to the system of equations. and −2x − z = 3. 13. 2x+2y −5z = 27. 5. 6. 3x − y + 4z = −9. 13x − 3y + 6z = −25. 7x + 14y + 15z = 22. Give the complete solution to the system of equations. and 3x + 6y + 10z = 13. Four times the weight of Ichabod is 660 pounds less than seventeen times the weight of Gaston. Give the complete solution to the system of equations. 2x+3y −5z = 31.2. Give the complete solution to the system of equations. 3x − y − 2z = 6. −8x + 3y + 5z = 13. 8x+2y+3z = −3. Give the complete solution to the system of equations. and z = 3. Brunhilde would balance all three of the others. 2x+4y−4z = −2. and −4x + y + 5z = 19. Four times the weight of Gaston is 150 pounds more than the weight of Ichabod. Give the complete solution to the system of equations. −19x+8y = −108. Give the complete solution to the system of equations. EXERCISES 41 3. 65x + 84y + 16z = 546. −11x+18y = 79 .−x + y = 4. Give the complete solution to the system of equations. 7. 9. y + 8z = 0. 2x + 4y + 2z = 4. and −2x + y = 6. −32x + 63y − 14z = 292. and 84x + 110y + 21z = 713. 11. −5x−10y+5z = 0. 8. 17. and 2x + 4y − 7z = −6. 18. Give the complete solution to the system of equations.

Proof: Consider the ﬁrst claim. g.14 Completeness of R By Theorem 2. between any two real numbers.C. This shows that if it is desired to consider all points on the number line. as the quotient of integers.. but this is not the approach to be followed here because it will eﬀectively eliminate every major theorem of calculus. contrary to the deﬁnition of sup (S) as the least upper bound. Proposition 2. It is this axiom which distinguishes Calculus from Algebra. A fundamental result about sup and inf is the following.b. If inf (S) exists. If the indicated set equals ∅. but it is not clear from this Theorem whether the entire real line consists of only rational numbers. it is necessary to abandon the attempt to describe arbitrary real numbers in a purely algebraic manner using only the integers. S ∩ [inf (S) . If S is a nonempty set in R which is bounded above.14. conﬁne their attention to algebra.1 A non empty set. (S) or inf (S) similarly. they were also able to show 2 could not be written as the quotient of two integers. inf (S) = −∞. sup (S)] = ∅.2 (completeness) Every nonempty set of real numbers which is bounded above has a least upper bound and every nonempty set of real numbers which is bounded below has a greatest lower bound.8. Every existence theorem in calculus depends on some form of the completeness axiom. thus they could √ regard 2 as a number. This discovery that the rational numbers could not even account for the results of geometric constructions was very upsetting to the Pythagoreans. This suggests there are a lot of rational numbers. l which has the property that l is an upper bound and that every other upper bound is no smaller than l is called a least upper bound. Some people might wish this were the case because then each real number could be described. Then for every δ > 0. (S) means g is a lower bound for S and it is the largest of all lower bounds. Axiom 2. S ∩ (sup (S) − δ.b.42 THE REAL NUMBERS 2. Unfortunately. if the indicated set equals ∅. then a number. l.u. S ⊆ R is bounded above (below) if there exists x ∈ R such that x ≥ (≤) s for all s ∈ S. but they discovered their beliefs were false. points on the number line.l. led by Pythagoras believed in this. If S is a nonempty subset of R which is not bounded above. This lack of holes is more precisely described in the following way. (S) or often sup (S) . especially when it became clear there were an endless supply of such “irrational” numbers. In the second claim.14.14. Before 500 B. Thus g is the g. a line which has no holes. not just as a point on a line but also algebraically. and considering only the rational numbers. inf (S) + δ) = ∅. deﬁne the greatest lower bound. It happened roughly like this. Some might desire to throw out all the irrational numbers.9.b. Deﬁnition 2.3 Let S be a nonempty set and suppose sup (S) exists. then sup (S) − δ is an upper bound for S which is smaller than sup (S) . this information is expressed by saying sup (S) = +∞ and if S is not bounded below. then for every δ > 0.l. then inf (S) + δ would be a lower bound which is larger than inf (S) contrary to the deﬁnition of inf (S) . In this book real numbers will continue to be the points on the number line. If S is a nonempty set bounded below. a group of mathematicians. . there exists a rational number. They knew they could construct the square root of two as the diagonal of a right triangle in which the two sides have unit √ length.

is deﬁned to be the number of diﬀerent ways to form an ordered list of k of the numbers. Let k ≤ n where k and n are natural numbers. Then 0 = x2 − y 2 = (x − y) (x + y) and so 0 = (x − y) (x + y) . ak xn−k y k and you need to determine n what ak should be. Now consider what happens when you multiply. That is. there is only one least upper bound and only one greatest lower bound. Find sup S. Let S be a set which is bounded above and let −S denote the set {−x : x ∈ S} . Imagine writing (x + y) = (x + y) (x + y) · · · (x + y) where there are n factors in the product. Using the preceding problem. Prove by induction that n < 2n for all natural numbers. Let S = [2. (a) |x + 1| = |2x + 3| (b) |x + 1| − |x + 4| = 6 5. (a) |2x − 6| < 4 (b) |x − 2| < |2x + 2| 6. Prove the binomial theorem from Problem 9. 5). k) = n · (n − 1) · · · (n − k + 1) = n! . Find sup S. REVIEW EXERCISES 43 2. {1.15 Review Exercises 1. n .15.2. n} . 7. show the number of ways of selecting a set of k things from a set of n things is n . Which of the ﬁeld axioms is being abused in the following argument that 0 = 2? Let x = y = 1. 5] . Now divide both sides by x − y to obtain 0 = x + y = 1 + 1 = 2. Hint: When you take (x + y) . · · ·. Now let S = [2. Give conditions under which equality holds in the triangle inequality. Show that if S = ∅ and is bounded above (below) then sup S (inf S) is unique. k) . 11. 2. 8. note that the result will be a sum of terms of the form. n ≥ 1.5? What about any other number? 3. How are inf (−S) and sup (S) related? Hint: Draw some pictures on a number line. Solve the following equations which involve absolute values. What about sup (−S) and inf S where S is a set which is bounded below? 4. Each factor contributes either an x or a y to a typical term. P (n. permutations of n things taken k at a time. Is sup S always a number in S? Give conditions under which sup S ∈ S and then give conditions under which inf S ∈ S. Solve the following inequalities which involve absolute values. (n − k)! 9. If S = ∅ can you conclude that 7 is an upper bound? Can you conclude 7 is a lower bound? What about 13. 2. Show P (n. k 10.

= 3 − 49 17 − 36 3 7 The ancient Babylonians knew how to solve these quadratic equations sometime before 1700 B. · · ·. 13. ax2 + bx + c = 0. there is exactly one real root and when it equals a negative number there are no real roots. The process by which this was accomplished in adding in the term b2 /4a2 is referred to as completing the square. Is it ever the case that (a + b) = an + bn for a and b positive real numbers? √ 15. n} which contain k1 . Suppose a > 0 and that x is a real number which satisﬁes the quadratic equation. Is it ever the case that a2 + b2 = a + b for a and b positive real numbers? 16. Verify that b2 b x2 + x + 2 = a 4a x+ b 2a 2 . Let n be a natural number and let k1 + k2 + · · ·kr = n where ki is a non negative integer. When it is positive there are two diﬀerent real roots. Also determine f (R) and sketch the graph of f. When it is zero. k2 ···kr elements in them. Hint: First divide by a to get c b x2 + x + = 0. . Find a formula for x in terms of a and b and square roots of expressions involving these numbers. Is it ever the case that 1 x+y n = 1 x + 1 y for x and y positive real numbers? p k=1 17.44 THE REAL NUMBERS 12. √ −b ± b2 − 4ac x= . Derive a formula for the multinomial expansion. You should obtain the quadratic formula7 . The symbol n k1 k2 · · · kr denotes the number of ways of selecting r subsets of {1. Hint: f (x) = 17 7 3 x2 + x − 3 3 x+ 7 6 2 7 49 49 17 = 3 x2 + x + − − 3 36 36 3 . Hint: See Problem 10. Find a formula for this number. ak ) which is analogous to the n 18. Prove by the binomial theorem and Problem 9 that the number of subsets of a given ﬁnite set containing n elements is 2n . ( binomial expansion. Find the value of x at which f (x) is smallest by completing the square. Now solve the result for x. 19. a a Then add and subtract the quantity b2 /4a2 . 2a The expression b2 − 4ac is called the discriminant. It seems they used pretty much the same process outlined in this exercise. Suppose f (x) = 3x2 + 7x − 17.C. 14.

REVIEW EXERCISES 45 20. ﬁnd the largest value of f (x) and the value of x at which it occurs. Let l be the least upper bound and argue there exists n ∈ N ∩ [l − 1/4. Hint: Suppose completeness and let a > 0.2. Now what about n + 1? . Can you conjecture and prove a result about y = ax2 + bx + c in terms of the sign of a based on these last two problems? 21. Suppose f (x) = −5x2 + 8x − 7. l] . then x/a is an upper bound for N. Find f (R) . In particular. If there exists x ∈ R such that na ≤ x for all n ∈ N. Show that if it is assumed R is complete.15. then the Archimedian property can be proved.

46 THE REAL NUMBERS .

1.3 Two lines in the plane are said to be parallel if no matter how far they are extended. they never intersect. Axiom 3.1.4 If two lines l1 and l2 are parallel and if they are intersected by a line.C.1. For example in the above picture. the alternate interior angles are shown in the following picture labeled as α. l3 . the two triangles are similar because the angles are the same.1.2 If two triangles are similar then the ratios of corresponding parts are the same. [6] 3. The purpose here is not to give a complete treatment of plane geometry.Basic Geometry And Trigonometry This section is a review some basic geometry which is especially useful in the study of calculus. B A B c a b C A c∗ a∗ b∗ C The fundamental axiom for similar triangles is the following. this says that a∗ a = ∗ b b Deﬁnition 3. To do this right. 47 . in the following picture. Deﬁnition 3. just a suitable introduction.1 Two triangles are similar if they have the same angles. you should consult the books of Euclid written about 300 B. For example.1 Similar Triangles And Pythagorean Theorem Deﬁnition 3.

It is a line perpendicular to the line determined by the line segment joining B and C which passes through the point.1.11 When an angle α is placed next to an angle β as shown above. Theorem 3. Proof: Consider the following picture. the following axiom will be used. and γ be the angles of a right triangle with γ the right angle. β. The similar triangles axiom can be used to prove the Pythagorean theorem. Axiom 3. then the resulting angle is denoted by α + β.1.9 Given a straight line and a point.1. B α β γ α C A In the picture the top horizontal line is obtained from Axiom 3.7 Suppose l1 and l2 both intersect a third line.1. Deﬁnition 3. Theorem 3. Deﬁnition 3. With this deﬁnition. Axiom 3.8 A right triangle is one in which one of the angles is a right angle. there exists a straight line which contains the point and intersects the given line in two right angles.12 (Pythagoras) In a right triangle the square of the length of the hypotenuse equals the sum of the squares of the lengths of the other two sides. Then c is deﬁned to be the length of the line from A to B. Then as shown in the picture. This line is called perpendicular to the given line.1. In a right triangle the long side is called the hypotenuse. the resulting angle is a right angle. Axiom 3.1.6 An angle is a right angle if when either side is extended. Deﬁnition 3. a is the . Then if the angles.48 BASIC GEOMETRY AND TRIGONOMETRY l2 l1 ¡ l3 ¡ α¡ ¡ ¡ ¡α ¡ ¡ As suggested by the above picture. α and β are placed next to each other.9. Theorem 3. A right angle is said to have 90◦ or to be a 90◦ angle.5 If l1 and l2 are parallel lines intersected by l3 . B.1.1. Then l1 and l2 are parallel.1. Proof: Consider the following picture in which the large triangle is a right triangle and D is the point where the line through C perpendicular to the line from A to B intersects the line from A to B. l3 in a right angle. the angle formed by placing α and β together is a right angle as claimed. then alternate interior angles are equal.10 says the sum of the two non 90◦ angles in a right triangle is 90◦ .10 Let α. the new angle formed by the extension equals the original angle.1.

3.C. This was during the Babylonian captivity of the Jews. and b is the length of the line from A to C. 624-547 B. B D c γ A α b δ C β a Then from Theorem 3. two lines which intersect each other at right angles and one identiﬁes a point by specifying a pair of numbers. Also from this same theorem. 3) involves going 2 units to the right on the x axis and then 3 units directly up on a line perpendicular to the x axis. For example. Therefore. a y axis. α + δ = 90◦ and so α = γ. and = . . who also did fundamental work in geometry.3. 1 This theorem is due to Pythagoras who lived about 572-497 B.2 Cartesian Coordinates And Straight Lines Recall the notion of the Cartesian coordinate system.C. the number (2. consider the following picture. an even earlier Greek mathematician named Thales. a b DB c − DB Therefore. Therefore.1. 1 √ This theorem implies there should exist some such number which deserves to be called a2 + b2 as mentioned earlier in the discussion on completeness of R.C. Alexander the great would not come along for more than 100 years. δ + γ = 90◦ and β + γ = 90◦ . It involved an x axis.10. however.2. For example. CARTESIAN COORDINATES AND STRAIGHT LINES 49 length of the line from B to C. δ = β. This proves the Pythagorean theorem.1. Denote by DB the length of the line from D to B. cDB = a2 and c c − DB = b2 so c2 = cDB + b2 = a2 + b2 . By Axiom 3.2. There was. Thus Pythagoras was probably a contemporary of the prophet Daniel. Greek geometry was organized and published by Euclid about 300 B. a b c c = . sometime before Ezra and Nehemiah. the three triangles shown in the picture are all similar.

I will often indulge in this sloppiness. Recall that you ﬁrst found lots of ordered pairs which satisﬁed the relation. 3) x Because of the simple correspondence between points in the plane and the coordinates of a point in the plane. 5 4 3 2 1 0 –1 –2 –1 1 2 Now here is the graph of the relation y = 2x + 1 which is a straight line.(1. it is common to see (x.50 y • BASIC GEOMETRY AND TRIGONOMETRY (2. First here is the graph of y = x2 + 1. Here are some simple examples which you should see that you understand. Thus. 1) all satisfy the ﬁrst relation which describes a straight line. For example (0. and (−1. . it is often the case that people are a little sloppy in referring to these things. The reader has likely encountered the notion of graphing relations of the form y = 2x + 3 or y = x2 + 5. y) referred to as a point in the plane. 5). 3).

Sometimes this 2 4 . y2 ) which satisfy this relation. consider x ≤ −2 6 + x if x2 if −2 < x < 3 y= 1 − x if x≥3 Then the graph of this relation is sketched below. then (y1 − y0 ) − (y2 − y0 ) m (x1 − x0 ) − m (x2 − x0 ) y1 − y2 = = x1 − x2 x1 − x2 x1 − x2 m (x1 − x2 ) = m. = x1 − x2 Remember the slope of the line segment through two points is always the diﬀerence in the y values divided by the diﬀerence in the x values. x0 . –2 –1 1 2 8 6 4 2 –4 –2 0 –2 –4 A very important type of relation is one of the form y − y0 = m (x − x0 ) . taken in the same order. CARTESIAN COORDINATES AND STRAIGHT LINES 51 5 4 3 2 1 0 –1 Sometimes a relation is deﬁned using diﬀerent formulas depending on the location of one of the variables. and y0 are numbers. The reason this is important is that if there are two points.3. y1 ) . and (x2 . For example. (x1 . where m.2.

3. Find the equation of the line through the point (2. 3) and ﬁnd the relation which describes this line. 7. There are actually many ways used to measure this distance but the best way. Sketch the graph of y = 4.4 Distance Formula And Trigonometric Functions As just explained. From the above discussion. This shows that there is a constant slope. Suppose a. 8.1 Find the relation for a straight line which contains the point (1. 3) which is parallel to the line whose equation is 2x + 3y = 8.52 BASIC GEOMETRY AND TRIGONOMETRY is referred to as the rise divided by the run. 5. a) . the point (x0 . Suppose there are two points in the plane and it is desired to ﬁnd the distance between them. b = 0.3 Exercises 1. between any pair of points satisfying this relation. Example 3. This is more often called the equation of the straight line. 2) and has constant slope equal to 3. Two lines are parallel if they have the same slope. y0 ) satisﬁes the relation. 6. Sketch the graph of the relation deﬁned as y= 1 if x ≤ −2 1 − x if −2 < x < 3 1 + x if x≥3 3.points in the plane may be identiﬁed by giving a pair of numbers. the slope of the line. 3. . (y − 2) = 3 (x − 1) . m. Also. Sketch the graph of the straight line which goes through the points (1. and the only way used in this book is determined by the Pythagorean theorem. Sketch the graph of y = x3 + 1. Such a relation is called a straight line. 1 1+x2 . Sketch the graph of y = x2 − 2x + 1. 0) and (2. and (b. 0) .2. Sketch the graph of x x2 +1 . Consider the following picture. Find the equation of the line which goes through the points (0. 2.

The length of the side on the bottom is |x0 − x1 | while the length of the side on the right is |y0 − y1 | . Consider the following picture in which the small circle has radius 1.1. and the right side of each of the two triangles is perpendicular to the bottom side which lies on the x axis. y1 ) • 53 x • (x0 . Therefore. y0 ) − (x1 . Therefore. y0 ) and P1 is the point determined by (x1 . sin(θ)) ¡ ¡ ¡θ ¡ By Theorem 3. by the Pythagorean theorem the distance between the two indicated points is (x0 − x1 ) + (y0 − y1 ) . Rsin(θ)) (cos(θ).10 on Page 48 the two triangles have the same angles and so they are similar. it follows the coordinates of the top vertex of the larger triangle are as shown. y1 ) should be the square root of the sum of the squares of the lengths of the two sides. y0 ) and (x1 . Note you could write (x1 − x0 ) + (y1 − y0 ) or even (x0 − x1 ) + (y1 − y0 ) 2 2 2 2 2 2 and it would make no diﬀerence in the resulting number. the distance between the points denoted by (x0 .4. This shows the following deﬁnition is well deﬁned. ¡ ¡ ' ' (Rcos(θ). sin θ) the coordinates of the top vertex of the smaller triangle. y1 ) . The trigonometric functions cos and sin are deﬁned next. as d (P0 . The distance between the two points is written as |(x0 . y0 ) In this picture. P1 ) or |P0 P | . Now deﬁne by (cos θ. DISTANCE FORMULA AND TRIGONOMETRIC FUNCTIONS y (x1 . . y1 )| or sometimes when P0 is the point determined by (x0 . the large circle has radius R.3. y0 ) • (x1 .

Then c2 = a2 + b2 − 2ab cos C Proof: Situate the triangle so the vertex of the angle.54 BASIC GEOMETRY AND TRIGONOMETRY Deﬁnition 3. Proposition 3. Then the point of intersection. deﬁne cos θ and sin θ as follows. B c a A b C You should verify sin A ≡ a/c.) at the point whose coordinates are (0. tan θ ≡ sin θ cos θ 1 1 . Consider the following picture of a triangle in which a.4. cot θ ≡ . cos2 θ + sin2 θ = 1. In particular. . B a c C The law of cosines is the following. The other trigonometric functions are deﬁned as follows. sec θ ≡ .4. Then from the deﬁnition of the cos C. θ. this speciﬁes cos θ and sin θ by simply letting R = 1. cos θ sin θ cos θ sin θ (3. and C denote the angles indicated. is given as (R cos θ.2 For any angle. the coordinates of the vertex. tan A ≡ a/b. Extend this other side until it intersects a circle of radius R. b and c are the lengths of the sides and A. C. B. Consider the following picture of a right triangle. is on the point whose coordinates are (0. Thus the above identity holds. Having deﬁned the cos and sin there is a very important generalization of the Pythagorean theorem known as the law of cosines. 0) and so the side opposite the vertex. csc θ ≡ . sec A ≡ c/b.1) It is also important to understand these functions in terms of a right triangle. R sin θ) . 0) equals 1.4. and csc A ≡ c/a. sin θ) is a point on the circle which has radius 1. Proof: This follows immediately from the above deﬁnition and the distance formula. Since (cos θ. B is on the positive x axis as shown in the above picture. cos A ≡ b/c. the distance of this point to (0. b A Theorem 3. Place the vertex of the angle (The vertex is the point. 0) in such a way that one side of the angle lies on the positive x axis and the other side extends upward.1 For θ an angle.3 Let ABC be a triangle as shown above.

5 The Circular Arc Subtended By An Angle How can angles be measured? This will be done by considering arcs on a circle. Corollary 3.4. a sin C) while it is clear the coordinates of A are (b. a . Then the corresponding sides are proportional. Therefore. This proves the corollary. Similarly.2. Thus the angle at A is equal to the angle at A . The following picture is descriptive of the situation. c and c the sides indicated in the picture. c2 = (a cos C − b) + a2 sin2 C = a2 cos2 C − 2ab cos C + b2 + a2 sin2 C = a2 + b2 − 2ab cos C as claimed. Corollary 3. let θ denote an angle and place the vertex of this angle at the center of . Similar reasoning shows a /b = a/b = a /b .5 Suppose T and T are two triangles such that one angle is the same in the two triangles and in each triangle.5.3. Proof: Let T = ABC with the two equal sides being AC and AB. To see how this will be done.4. Let T be labeled in the same way but with primes on the letters.4 Let ABC be any triangle as shown above. Such triangles in which two sides are equal are called isoceles. 2 (1 − cos A) and so 3.4. from the distance formula. a/c = a /c . b. a2 = b2 + c2 − 2bc cos A = 2b2 − 2b2 cos A and so a/b = 2 (1 − cos A). 2 |cos θ| ≤ 1 and so c2 = a2 + b2 − 2ab cos C ≤ a2 + b2 + 2ab = (a + b) . By assumption c/b = 1 = c /b . Proof: This follows immediately from the law of cosines. From Proposition 3. the sides forming that angle are equal. THE CIRCULAR ARC SUBTENDED BY AN ANGLE 55 B are (a cos C. and Proposition 3. b . C ¨ ¨ ¨ a B A¨ ¨ ¨ 2 A ¨¨ ¨ ¨¨ b ¨ ¨¨ c C ¨ ¨ b ¨ a ¨¨ c B Denote by a.2. Then by the law of cosines. Then the length of any side is no longer than the sum of the lengths of the other two sides. 0) .4.

Stretching this string out and measuring it would then give you the length of the arc. This is a little sloppy right now because no precise deﬁnition of the length of an arc of a circle has been given. Think. 10 across and 30 around. extend its two sides till they intersect the circle. Then. it is the same angle regardless of how far its sides are extended. It is reasonable to suppose. Of course it is also necessary to deﬁne what is meant by the length of a circular arc in order to do any of this. Take an angle and place its vertex (the point) at the center of a circle of radius r. If r were changed to R. read on. This is not too far oﬀ. Thus. Thus the picture looks exactly the same. methods will be given which allow one to calculate π more precisely. based on this reasoning that the way to measure the angle is to take the length of the arc subtended in whatever units being used and divide this length by the radius measured in the same units. Such intuitive discussions involving string may or may not be enough to convey understanding. Thus the Bible. consider the following picture. gives the value of π as 3. the length of the arc subtended by the angle on a circle of radius r is rθ.56 BASIC GEOMETRY AND TRIGONOMETRY the circle. If you need to see more discussion. the ratio between the circumference (length) of a circle and its radius is a constant which is independent of the radius of the circle2 . Thus this procedure could yield any of inﬁnitely many diﬀerent circular arcs. A1 θ A2 In this picture. a big one having radius. θ is situated in two diﬀerent ways subtending the arcs A1 and A2 as shown. First I will describe an intuitive way of thinking about this and then give a rigorous deﬁnition and proof. Next. only larger. this really amounts to a change of units of length. Each of these arcs is said to subtend the angle. Note the angle could be opening in any of inﬁnitely many diﬀerent directions. The angle. was round. To give a precise description of what is meant by the length of an arc. Since the time of Euler in the 1700’s. In summary. for example. placing one end of it on one end of the circular arc and then wrapping the string till you reach the other end of the arc. in particular. taken literally. It was proved by Lindeman in the 1880’s that π is transcendental which is the worst sort of irrational number. if θ is the radian measure of an angle. If the intuitive way of thinking about this satisﬁes you. Later. skip to the next section. . A subset of 2 In 2 Chronicles 4:2 the “molten sea” used for “washing” by the priests and found in Solomon’s temple is described. there are two circles. this constant has been denoted by 2π. imagine taking a string. of keeping the numbers the same but changing centimeters to meters in order to produce an enlarged version of the same picture. In fact each of these arcs has the same length. it will be easy to measure angles. It sat on 12 oxen. After all. thus obtaining a number which is independent of the units of length used just as the angle itself is independent of units of length. R and a little one having radius r.1415926535 and presently this number is known to thousands of decimal places. like those shown in the above picture. This is in fact how to deﬁne the radian measure of an angle and the deﬁnition is well deﬁned. For now. A better value is 3. no harm will be done by skipping the more technical discussion which follows. Otherwise. extending the sides of the angle if necessary till they intersect the circle. 5 cubits high. Angles will be measured in terms of lengths of arcs subtended by the angle. this determines an arc on the circle. When this has been shown. Letting A be an arc of a circle.

{|P | : P ∈ P (A)} is a set of numbers which is bounded above by M and by completeness of R it is possible to deﬁne the length of A. THE CIRCULAR ARC SUBTENDED BY AN ANGLE 57 A. then |P | ≤ |Q| . For P = {p0 . · · ·. By Corollary 3. see the following picture. for any P ∈ P (A) .4. To see this. {p0 . For example. To be a little sloppy. and the points are encountered in the indicated order as one moves in the counter clockwise direction along the arc. The only purpose for doing this is to obtain the existence of an upper bound. A. see the following picture. pn } is a partition of A if p0 is one endpoint. Then for P ∈ P (A) . by l (A) ≡ sup {|P | : P ∈ P (A)}. Thus |P | consists of the sum of the lengths of the little lines joining successive points of P and appears to be an approximation to the length of the circular arc. . denote by |pi − pi−1 | the distance between pi and pi−1 .5. Q ∈ P (A) and P ⊆ Q.4 is that if P. p2 p3 p1 p0 Also. Therefore. pn } . l (A) .3. denote by P (A) the set of all such partitions.4 the length of any of the straight line segments joining successive points in a partition is smaller than the sum of the two sides of a right triangle having the given straight line segment as its hypotenuse. simply pick M to be the perimeter of a rectangle containing the whole circle of which A is a part. pn is the other end point. deﬁne |P | ≡ n i=1 |pi − pi−1 | . C D A B The sum of the lengths of the straight line segments in the part of the picture found in the right rectangle above is less than A + B and the sum of the lengths of the straight line segments in the part of the picture found in the left rectangle above is less than C + D and this would be so for any partition. A fundamental observation following from Corollary 3. To illustrate. · · ·. This eﬀect of adding in one point is illustrated in the following picture.4. |P | ≤ M where M is the perimeter of a rectangle containing the arc. A. Therefore. add in one point at a time to P .

1 Let θ be an angle which subtends two arcs. R 1 R Since ε is arbitrary. θi−1 have produced points. AR on a circle of radius R and Ar on a circle of radius r. But now reverse the argument and write l (A2 ) < |P2 | + ε = l (A1 ) < |P1 | + ε = R R |P | + ε ≤ l (A2 ) + ε r 2 r which implies. .58 BASIC GEOMETRY AND TRIGONOMETRY Also. Then determining the angles as just described. If |P2 | + ε > l (A2 ) .5. Theorem 3. pn } be a partition of A. and both P1 and P2 determine the same sequence of angles. · · ·. letting {p0 . When the angles. The angle θi is formed by the two lines. ¤¤ ¤ ¤ ¤ θ i ¤ ¤ θi−12 2 222 ¤ Now let ε > 0 be given and pick P1 ∈ P (A1 ) such that |P1 | + ε > l (A1 ) . specify angles. · · ·. A in the counterclockwise direction as shown below. a speciﬁcation of these angles yields the partition of A in the following way. letting one side lie on the line from the center of the circle to p0 and the other side extended resulting in a point further along the arc in the counter clockwise direction. Then |P1 | + ε > l (A1 ) . P2 . since ε is arbitrary that Rl (A2 ) ≥ rl (A1 ) and this has proved the following theorem. Place the vertex of θ1 on the center of the circle. Furthermore. one from the center of the circle to pi and the other line from the center of the circle to pi−1 . Then denoting by l (A) the length of a circular arc as described above. θ1 . this shows Rl (A2 ) ≤ rl (A1 ) . resulting in a point further along the arc. then stop.4. θi as follows.5 |P1 | R = |P2 | r and so r r |P | + ε ≤ l (A1 ) + ε. place the vertex of θi on the center of the circle and let one side of θi coincide with the side of the angle θi−1 which is most counter clockwise. p0 . Rl (Ar ) = rl (AR ) . use these angles to produce a corresponding partition of A2 . · · ·. Using Corollary 3. pick Q ∈ P (A2 ) such that |Q| + ε > l (A2 ) and let P2 = P2 ∪ Q. Otherwise. Then use the angles determined by P2 to obtain P1 ∈ P (A1 ) . the other side of θi when extended. pi−1 on the arc. |P2 | + ε > l (A2 ) .

this seems abundantly clear but in case it is hard to believe. The measure of θ is deﬁned to be the length of the circular arc subtended by θ on a circle of radius r divided by r. here is the deﬁnition of the measure of an angle. if A is an arc subtended by the angle θ on a circle of radius r then the length of A. You should note the measure of θ is independent of dimension.5.3. Here is a picture which illustrates the conclusion of this corollary. you could take ε = a−b and 2 conclude 0 < a − b < a−b . THE CIRCULAR ARC SUBTENDED BY AN ANGLE 59 Before proceeding further.4 Let 0 < radian measure of θ < π/4.2) Proof: Situate the angle.1 and letting R = 1. sin(θ)) ¨¨ % ¨¨ (1 − cos(θ)) θ The following corollary states that the length of the subtended arc shown in the picture is longer than the vertical side of the triangle and smaller than the sum of the vertical side with the segment having length 1 − cos θ. It follows that (1 − cos θ) + sin θ is an upper bound for all the |P | where P ∈ P (A) and so (1 − cos θ) + sin θ is at least as large as the sup or least upper bound of the . 2 With this preparation. the conclusion that l (A1 ) ≤ R l (A2 ). Then denoting the resulting arc on the circle by A. Proof: That the deﬁnition is well deﬁned follows from Theorem 3. Then letting A be the arc on the unit circle resulting from situating the angle with one side on the positive x axis and the other side pointing up from the positive x axis. sin θ) . This implies a − b < ε for all positive ε and so it must be the case that a − b ≤ 0 since otherwise. the following corollary gives a proof.5. To show a ≤ b ﬁrst show that for every ε > 0 it follows that a < b + ε. it follows that for all P ∈ P (A) the inequality (1 − cos θ) + sin θ ≥ |P | ≥ sin θ.5.5. The formula also follows from Theorem 3. a contradiction. Corollary 3. θ such that one side is on the positive x axis and extend the other side till it intersects the unit circle at the point. Now is a good time to present a useful inequality which may or may not be self evident. This is because the units of length cancel when the division takes place.1. note the proof of the above theorem involved showing l (A1 ) < + ε where ε > 0 was arbitrary and from this. r This is a very typical way of showing one number is no larger than another.2 Let θ be an angle. denoted by l (A) is given by l (A) = rθ. (1 − cos θ) + sin θ ≥ l (A) ≥ sin θ (3. Proposition 3. (cos θ. To me. (cos(θ).3 The above deﬁnition is well deﬁned and also. This is also called the radian measure of the angle.5. R r l (A2 ) Deﬁnition 3.5.

Then there exists a unique integer.2 Let b ∈ R.8.6. r ≡r Otherwise. Then from Theorem 2. 0) due to the deﬁnition of l (A) . From the deﬁnition of the trig functions. From the observation that the x and y axes intersect at right angles the four arcs on the unit circle subtended by these axes are all of equal length. cos b = cos (−b) . then b = (−p) 2π. . and r ∈ [0. y ∈ R.6 The Trigonometric Functions Now the Trigonometric functions will be deﬁned as functions of an arbitrary real variable.3 Let k ∈ Z. Then sin b ≡ sin r and cos b ≡ cos r where b = 2πp + r for p an integer.11 on Page 29 there exists a unique integer. Proof: By deﬁnition. 2π).6. denote by rb the element of [0. sin (b + 2kπ) = sin b. The measure of the angle which is determined by the arc from (1. Then there exists a unique integer p and real number r such that 0 ≤ r < 2π and b = p2π + r. x = 2πp1 + rx . 2π) . The next topic is the important formulas for the trig. You can easily ﬁnd other values for cos and sin at all the other multiples of π/2. then the following formulas hold. The following theorem will make possible the deﬁnition. Now suppose b < 0. For b ∈ R. 2π). L ≥ sin θ because L is the length of the hypotenuse of a right triangle having sin θ as one of the sides.3) (3. 0) to (−1. p such that −b = 2πp + r1 where 1 ∈ [0. p such that b = 2πp + r where 0 ≤ r < 2π.4) (3. sin b = − sin (−b) . The bottom half follows because l (A) ≥ L where L is the length of the line segment joining (cos θ. functions of sums and diﬀerences of numbers. cos (π/2) = 0 and sin (π/2) = 1. Several observations are now obvious from this.5) The other trigonometric functions are deﬁned in the usual way as in (3. 3.1 Let b ∈ R. If r1 = 0. the measure of a right angle must be 2π/4 = π/2. y = 2πp2 + ry . b = (−p) 2π + (−r1 ) = (−p − 1) 2π + 2π − r1 and r ≡ 2π − r1 ∈ (0. 0) is seen to equal π by the same reasoning. Then rx+y = rx + ry + 2kπ for some k ∈ Z. This proves the top half of the inequality. Proof: First suppose b ≥ 0.4 Let x. However. The following deﬁnition is for sin b and cos b for b ∈ R.60 BASIC GEOMETRY AND TRIGONOMETRY |P | .1) provided they make sense. x + y = 2πp + rx+y . 2π) having the property that b = 2πp+rb for p an integer.6. Lemma 3. Observation 3.6. cos (b + 2kπ) = cos b cos2 b + sin2 b = 1 (3. sin θ) and (1. Deﬁnition 3. Theorem 3. Therefore. Up till now these have been deﬁned as functions of pointy things called angles.

Proof: The length of the arc between p (x + y) and p (x) is |rx+y − rx | . 0) is the same as the length of the arc joining p (ry ) = p (y) to (1. by Corollary 3. 0) . Then |rx+y − rx | = rx+y − rx = ry + 2kπ. y ∈ R . Lemma 3.4. 0) . Therefore.6. 0) p(y) It follows from the deﬁnition of the radian measure of an angle that the two angles determined by these arcs are equal and so.3. 2 2 2 . there is a unique point. their diﬀerence is also in [0. Let z ∈ R and let p (z) denote the point on the unit circle determined by the length rz whose coordinates are cos z and sin z. p(x) p(x + y) (1. Theorem 3. starting at (1. 2π) and so k = 0. y ∈ R. p (z) on the unit circle and the coordinates of this point are cos z and sin z. (3. Since rx and rx+y are both in [0.6. 2π) their diﬀerence is also in [0.5 the distance between the points p (x + y) and p (x) must be the same as the distance from p (y) to p (0) . in this case. the length of the arc joining p (2π − ry ) to (1.6) Proof: Recall that for a real number. rx+y < rx and in this case |rx+y − rx | = rx − rx+y = −ry − 2kπ. 2π) and so in this case k = −1.5 Let x.6 Let x. |rx+y − rx | = 2π − ry . The following theorem is the fundamental identity from which all the major trig. Therefore. As an illustration see the following picture. This proves the lemma. Writing this in terms of the deﬁnition of the trig functions and the distance formula. the length of the arc between p (x + y) and p (x) has the same length as the arc between p (y) and p (0) . the arc joining p (x) and p (x + y) is of the same length as the arc joining p (y) and (1. identities involving sums and diﬀerences of angles are derived. z. Since both rx+y and rx are in [0. 0) . In the other case. Then the length of the arc between p (x + y) and p (x) is equal to the length of the arc between p (y) and (1. (cos (x + y) − cos x) + (sin (x + y) − sin x) = (cos y − 1) + sin2 x. Note also that p (z) = p (rz ) . Now from the above lemma. Then cos (x + y) cos y + sin (x + y) sin y = cos x.6. First assume rx+y ≥ rx . There are two cases to consider here. 2π). Thus. Now since the circumference of the unit circle is 2π. THE TRIGONOMETRIC FUNCTIONS From this the result follows because ≡−k 61 0 = ((x + y) − x) − y = 2π ((p − p1 ) − p2 ) + rx+y − (rx + ry ) . 0) and moving counter clockwise a distance of rz on the unit circle yields p (z) .

6) implies cos u cos v + sin u sin v = cos (u − v) Also.13) (3.9) (3. In addition to this.16) With these fundamental identities.12) (3.6.10) (3. cos (u + v) = cos (u − (−v)) = cos u cos (−v) + sin u sin (−v) = cos u cos v − sin u sin v Thus. this implies sin (x − y) = sin x cos y − cos x sin y. (3. Then (3.11) Then using Observation 3.15) √ √ and so cos π = 2/2.6.14) (3. it is easy to obtain the cosine and sine of many special angles. letting v = π/2. from this and (3. (Why?) Here is another one. Letting y = π/2. cos (3x) = cos 2x cos x − sin 2x sin x = 2 cos2 x − 1 cos x − 2 cos x sin2 x = 4 cos3 x − 3 cos x (3.6.7) (3.16). and Observation 3. This proves the theorem. Now let u = x + y and v = y. cos π . Observation 3.3 again. this shows that sin (x + π/2) = cos x.) Thus √4 sin π = 2/2 also. First. cos u + It follows sin (x + y) = − cos x + π +y 2 π π = − cos x + cos y − sin x + sin y 2 2 = sin x cos y + sin y cos x π 2 = − sin u. 6 6 .3.62 BASIC GEOMETRY AND TRIGONOMETRY cos2 (x + y) + cos2 x − 2 cos (x + y) cos x + sin2 (x + y) + sin2 x − 2 sin (x + y) sin x = cos2 y − 2 cos y + 1 + sin2 y From Observation 3.3).3 this implies (3. called reference angles.6). 4 0 = cos π 2 = cos 3 π 6 = 4 cos3 π π − 3 cos . making use of the above identities. From (3.8) (3. (Why do isn’t it equal to − 2/2? Hint: Draw a picture.3 implies cos 2x = cos2 x − sin2 x = 2 cos x − 1 = 1 − 2 sin x Therefore.6. 4 0 = cos π 2 = cos π π + 4 4 = 2 cos2 π −1 4 2 2 (3.

2 √ 1. from Problem 2. Find a formula for tan (x + y) in terms of trig.7. 3 . 4 . Find cos θ and sin θ for θ ∈ 2. Here is a short table including 6 6 2 these and a few others. π/12 = π/3 − π/4. Suppose sin x = a where 0 < a < 1. 2 . Prove cos2 θ = 1+cos 2θ 2 . Prove that sin x sin y = 7. angles are considered as real numbers. 3 . Solve the equations and give all solutions. 8. Prove that sin x cos y = 6. θ cos θ sin θ 0 1 0 π 6 √ 3 2 1 2 π 4 √ 2 2 √ 2 2 π 3 1 2 √ 3 2 π 2 0 1 3. 2π 1−cos 2θ . 6 . not as pointy things. Is there a problem here? Please explain. Prove 1 + tan2 θ = sec2 θ and 1 + cot2 θ = csc2 θ. In the table. θ refers to the radian measure of the angle. From now on. Find all possible values for (a) tan x (b) cot x (c) sec x (d) csc x (e) cos x 10. and sin2 θ = 1+( 3/2) 3. cos (π/12) = cos (π/3 − π/4) = cos π/3 cos π/4 + sin π/3 sin π/4 √ √ and so cos (π/12) = 2/4 + 6/4. π. 9. 5. You should make sure you can obtain all these entries. 6 . sin π = 1 . 4 . On the other 2 hand. Prove that cos x cos y = 1 2 1 2 1 2 (sin (x + y) + sin (x − y)) . cos (π/12) = . (cos (x + y) + cos (x − y)) . EXERCISES √ 63 Therefore. Therefore. 4. functions of x and y. Using Problem 5. . cos π = 23 and consequently. (a) sin (3x) = 1 2 √ 3 2 (b) cos (5x) = √ (c) tan (x) = 3 (d) sec (x) = 2 (e) sin (x + 7) = (f) cos2 (x) = 1 2 √ 2 2 (g) sin4 (x) = 4 11. 4 .7 Exercises 7π 5π 4π 3π 5π 7π 11π 2π 3π 5π 3 . ﬁnd an identity for sin x − sin y. (cos (x − y) − cos (x + y)) . 6 .3.

4 3 d c αd b d d βd θ a . 17. 16. θ = 1 π. a = 5. 20.64 BASIC GEOMETRY AND TRIGONOMETRY 12. Prove the law of sines from the deﬁnition of the trigonometric functions. show there exists φ such that f (x) = A2 + B 2 sin (αx + φ) . √ Show there also exists ψ such that f (x) = A2 + b2 cos (αx + ψ) . functions of x. This is a very √ important result. C. Hint: Use Problem 20. Find a formula for tan (2x) in terms of trig. The law of sines says sin (A) /a = sin (B) /b = sin (C) /c. 18. Find a formula for tan x 2 in terms of trig. 3 d c d b d d θ d a 24. √ 3. Sketch a graph of y = sin x. 3 cos x = 22. Find c. Give all solutions to sin x + √ √ 3 cos x. In the picture. 23. A2 + B 2 is called the amplitude and φ or ψ are called phase shifts. enough that some of these quantities are given names. Sketch a graph of y = cos x. Sketch a graph of y = sin 2x. and θ = 2 π. Using Problem 2 graph y = cos2 x. 15. 19. Find a. 13. If ABC is a triangle where the capitol letters denote vertices of the triangle and the angle at the vertex. Sketch a graph of y = tan x. functions of x. Using Problem 19 graph y = sin x + 21. If f (x) = A cos αx + B sin αx. Let a be the length of the side opposite A and b is the length of the side opposite B and c is the length of the side opposite the vertex. In the picture. 14. α = 2 π and c = 3. b = 3.

Thus this area would be 1 BD AD + CD = 1 BD b. consider a right triangle as indicated in the following picture. c b a The area of this triangle shown above must equal ab/2 because it is half of a rectangle having sides a and b. In words.1 Some Basic Area Formulas Areas Of Triangles And Parallelograms This section is a review of how to ﬁnd areas of some simple ﬁgures. D C A B Note the height of triangle ABC taken with respect to side AB is the same as the height of the parallelogram taken with respect to this same side. For example. the area of the triangle equals 2 2 one half the base times the height. see the picture in which the two triangles are ABC and CDA. B c A D a b C The area of this triangle would be the sum of the two right triangles formed. . A parallelogram is a four sided ﬁgure which is formed when two identical triangles are joined along a corresponding side with the corresponding angles not adjacent. First of all.8. as just described in the case where AB is the side. SOME BASIC AREA FORMULAS 65 3. This also holds if the height and base are chosen with respect to any other side of the triangle. Now consider a general triangle in which a line perpendicular to the line from A to C has been drawn through B. Therefore. the area of this parallelogram equals twice the area of one of these triangles which equals 2AB (height of parallelogram) 1 = 2 AB (height of parallelogram) . The discussion will be somewhat informal since it is assumed the reader has seen this sort of thing already.8 3. Similarly the area equals height times base where the base is any side of the parallelogram and the height is taken with respect to that side.8.3.

cos α ≤ α ≤ (1 + ε) cos α tan α Taking α smaller if necessary and noting that for all α small enough. sin α sin α Now using the properties of the trig functions. 1 − cos α α +1≥ ≥ 1.2 Let S (θ) denote the sector of a circle of radius r having central angle. The sector. 1 − cos α 1 − cos2 α = sin α sin α (1 + cos α) = sin2 α sin α = .18) Proof: This follows from Corollary 3. In this corollary. let α be small enough that (3. α 1≤ ≤1+ε (3. To obtain (3. Proof: Let the angle which A subtends be denoted by θ and divide this sector into n equal sectors each of which has a central angle equal to θ/n. Then whenever the positive number. l (A) = α and so 1 − cos α + sin α ≥ α ≥ sin α. α. sin α (1 + cos α) 1 + cos α (3. Therefore.2 The Area Of A Circular Sector Consider an arc. yields (3.66 BASIC GEOMETRY AND TRIGONOMETRY 3. The angle between the two lines is called the central angle of the sector. is small enough. sin α <ε 1 + cos α and so (3. First a fundamental inequality must be obtained.19) From the deﬁnition of the sin and cos. θ. dividing by sin α.5.1 Let 1 > ε > 0 be given. A. This lemma is very important in another context.8. This proves the lemma. Then for such α.18).17) holds. (3. Theorem 3. Lemma 3. θ. The following is a picture of one of these. to the center of the circle.18). The circular sector determined by A is obtained by joining the ends of the arc.19) implies that for such α.8. whenever α is small enough. cos α is very close to 1. 2 Then the area of S (θ) equals r2 θ. A. A and the two lines just mentioned. The problem is to deﬁne the area of this shape.17) holds and multiply by cos α.4 on Page 59. of a circle of radius r which subtends an angle. S (θ) denotes the points which lie between the arc. .8.17) sin α and 1+ε≥ α ≥1−ε tan α (3.

for all n.D. it follows the area of the sector equals claimed. Give another argument which veriﬁes the Pythagorean theorem by supplying the details for the following argument3 . r2 sin (θ/n) r2 tan (θ/n) θ . Take the given right triangle and situate copies of it as shown below.3. 1+ε (θ/n) 1−ε r2 θ 2 1 1+ε 1 1−ε r2 θ. a small positive number less than 1. EXERCISES 67 ¨¨ ¨ ¨ ¨¨ ¨ ¨¨ ¨ ¨¨ ¨ ¨¨ ¨¨ ¨ S(θ/n) r In the picture. the area of S (θ) is trapped between the two numbers. Thus any reasonable deﬁnition of area would require r2 r2 sin (θ/n) ≤ area of S (θ/n) ≤ tan (θ/n) . 2 r2 2 θ Therefore.9 Exercises 1. Now from the picture. the area of each triangle equals ab/2. θ . 2 2 Therefore. 2 2 It follows the area of the whole sector having central angle θ must satisfy the following inequality. the area of the big square equals c2 . S (θ/n) and inside this circular sector is a triangle while outside the circular sector is another triangle. there is a circular sector. since it is half of a rectangle of area ab. The big four sided ﬁgure which results is a rectangle because all the angles are equal.9. 2 (θ/n) 2 (θ/n) Now let ε > 0 be given. nr2 nr2 sin (θ/n) ≤ area of S (θ) ≤ tan (θ/n) . ≤ Area of S (θ) ≤ Since ε is an arbitrary small positive number. . and the area 3 This argument is old and was known to the Indian mathematician Bhaskar who lived 1114-1185 A. (Why?) as 3. and let n be large enough that 1≥ and sin (θ/n) 1 ≥ (θ/n) 1+ε 1 tan (θ/n) 1 ≤ ≤ .

Hint: Think of painting the side of the cone and while the paint is still wet.68 2 BASIC GEOMETRY AND TRIGONOMETRY of the inside square equals (b − a) . c2 = 4 (ab/2) + (b − a) = a2 + b2 . Here a. and c are the lengths of the respective sides. Find the area of the side of this cone in terms of h and r. Therefore. Sum the areas of three triangles in the following picture or write the area of the trapezoid as (a + b) a+ 1 (a + b) (b − a) 2 which is the sum of a triangle and a rectangle as shown. Another very simple and convincing proof of the Pythagorean theorem4 is based on writing the area of the following trapezoid two ways. rolling it on the ﬂoor yielding a circular sector. . A right circular cone has radius r and height h. Do it both ways and see the pythagorean theorem appear. 2 = 2ab + b2 + a2 − 2ab c b a 2. 4 This argument involving the area of a trapezoid is due to James Garﬁeld who was one of the presidents of the United States. b. a b a c c b 3. This is like an ice cream cone.

it reduces to dx2 + ex + f e d x2 + x + d e d x2 + x + d d x− −e 2d f d e2 4d2 2 − e2 d 4d2 − e2 d . 2 2 2 (3. P0 . 5. one can obtain an equation which will describe a parabola.10. Now join the ends of these lines to obtain a four sided ﬁgure. Then in this case. y) such that the equation holds.10 3.10. Such a set of points always is a parabola. Ellipses. Now consider an arbitrary equation of the form. and Hyperbolas The Parabola A parabola is a collection of points. AND HYPERBOLAS 69 4. An equilateral triangle is one in which all sides are of equal length. y) to the line is |c − y| . 4d2 e2 1 2 y = (x − a) − 2 d 4d .3. Therefore. letting a = −e 2d . y = dx2 + ex + f where d = 0. By this is meant the set of points (x. Draw two parallel lines one having length a and the other having length b suppose also these lines are at a distance of h from each other. From this deﬁnition. Squaring both sides. b) The distance from the point. the description of the parabola requires that (x − a) + (y − b) = |c − y| . ELLIPSES. y=c · P0 = (a. P = (x. b) where b = c as shown in the picture. Find the area of an equilateral triangle whose sides have length l. Suppose then that the line is y = c and the point is (a. x2 − 2xa + a2 + y 2 − 2yb + b2 = c2 − 2cy + y 2 and so (x − a) + b2 − c2 = (2b − 2c) y. To see this.1 Parabolas. complete the square on the right as follows: y = = = = Therefore. PARABOLAS.20) The simplest case is when a = 0 and b = −c. P in the plane such that the distance from P to a ﬁxed line is the same as the distance from P to a given point. What is the area of this four sided ﬁgure? 3. x2 = 4cy.

b) and (a. (x − a) + (y − b) + Subtracting 2 2 2 2 (x − a) + (y − b − h) = c.70 BASIC GEOMETRY AND TRIGONOMETRY Now you can show that there exists numbers. y = c is called the directrix and the point. b + h) . Let a generic point on the ellipse be (x.21) Then the above equation reduces to (3. P1 and P2 which are ﬁxed and the ellipse consists of the set of points. Each is called a focus point by itself. 2 4d d (3. = 2b − 2c. P2 ) = c. where c is a ﬁxed positive number. the set of points in the plane satisfying ay 2 + by + c = x is also a parabola. Exactly similar results occur if the directrix is of the form x = c and by similar arguments to those above. 2 2 (3. there are two points. The line.20). (y − b) − 2h (y − b) + h2 2 = c2 − 2 (x − a) + (y − b) c + (y − b) 2 2 2 and so −2h (y − b) + h2 = c2 − 2 (x − a) + (y − b) c Therefore. 3.10. Simplifying this yields c2 − h2 (y − b) + h (y − b) h2 − c2 + c2 (x − a) = which simpliﬁes further to (y − b) − h (y − b) + 2 2 2 1 2 c − h2 4 1 + h2 4 2 h2 + 4 c2 c2 − h2 (x − a) = 2 . These two points are called the foci of the ellipse. P1 ) + d (P. y) .22) (x − a) + (y − b) from both sides. P0 in the above is called the focus. 2 2 Now squaring both sides yields (x − a) + (y − b − h) 2 2 = c2 − 2 (x − a) + (y − b) c + (x − a) + (y − b) . 2 2 2 2 Therefore. Then according to the description of an ellipse and the distance formula. (x − a) + (y − b − h) = c − 2 2 (x − a) + (y − b) ≥ 0. b and c such that − e2 1 = b2 − c2 . Let the two given points be (a. Now one can obtain an equation which will describe an ellipse much as was done with the parabola. −2h (y − b) + h2 − c2 = −2 Now square both sides again to obtain h2 − c2 2 2 2 (x − a) + (y − b) c 2 2 − 4h (y − b) h2 − c2 + 4h2 (y − b) = 4c2 (x − a) + (y − b) 2 2 2 .2 The Ellipse With an ellipse. P such that d (P.

(Why?) Suppose (x1 . P1 ) − d (P. Then according to the description of a hyperbola and the distance formula. AND HYPERBOLAS which is equivalent to y−b− 1+h2 4 71 h 2 2 + (x − a) 2 1+h2 c2 4 / c2 −h2 = 1. where c is a ﬁxed positive number. Thus. Let the two given points be (a. y1 ) − (x2 . ELLIPSES. the ellipse would be long in the y direction rather than the x direction. y2 ) are two points on the above ellipse in which b > a. b + h) . |(x1 . This greatest distance between any two points is called the diameter and this shows the diameter of an ellipse is twice the larger of the two numbers appearing in the denominators on the left in (3. there are two points. P such that d (P. redeﬁning the constants. P1 and P2 which are ﬁxed and the hyperbola consists of the set of points.10. b + β) and (α. Now one can obtain an equation which will describe a hyperbola.3. β + b z α−a d d X $ $ y (α. β) α+a © x © β − b$ In this ellipse. If b > a. These two points are called the foci of the hyperbola.24) . 2 b a2 2 2 (3. (x − a) + (y − b) − 2 2 (x − a) + (y − b − h) = c.23) Note that if a = b = r. y2 )| ≤ 4a2 + 4b2 ≤ 2 a2 + b2 ≤ 2b Thus the greatest distance between two points on the ellipse equals 2b and occurs when the two points are (α. Here is the graph of a typical ellipse. b) and (a.3 The Hyperbola With a hyperbola. y) . Therefore. 2 2 You can now show that the equation of a hyperbola is of the form (y − β) (x − α) − =1 a2 b2 2 2 (3. it follows that |xi − α| ≤ a for i = 1. 3. y1 ) and (x2 . Each is called a focus point by itself. This last expression is the generic equation for an ellipse. 2 implying that |x1 − x2 | ≤ |x1 − α| + |α − x2 | ≤ a + a = 2a and similarly. then |x1 − x2 | ≤ 2a because from the above equation. PARABOLAS. an ellipse has the form (y − β) (x − α) + = 1. β − b) . |y1 − y2 | ≤ 2b. b < a. P2 ) = c. this reduces to the equation for a circle of radius r centered at the point (α. Let a generic point on the hyperbola be (x.10. β) .23).

(If n is any positive number there exist points (x. Sketch a graph of the hyperbola. + (y−2)2 4 8. (3. − − y2 9 x2 9 = 1. 7. Sketch a graph of the ellipse whose equation is 6. b. Sketch a graph of the ellipse whose equation is 5. What is the diameter of the ellipse. x2 4 y2 4 (x−1)2 4 (x−1)2 9 + + (y−2)2 9 (y−2)2 4 = 1. = 1. you can simply take these last two equations as the deﬁnition of a hyperbola.25) is unbounded.20) in the case that the directrix is of the form x = c. 2.21) hold. Show that the set of points which satisﬁes either (3.25) holds for a hyperbola. Find the numbers. 4. Derive a similar formula to (3.24) or (3. = 1. Verify that either (3. Consider y = 2x2 + 3x + 7.72 or BASIC GEOMETRY AND TRIGONOMETRY (y − β) (x − α) − = 1. 10. y)| > n.24) or (3. y) satisfying the equations given such that |(x. Find the focus and the directrix of this parabola. 9. Sketch a graph of the hyperbola.25) 2 b a2 If you like. 2 2 3. 3.) .11 Exercises 1. c which make (3. (x−1)2 9 = 1.

Theorem 4. in the following picture. In dealing with complex numbers. with a2 +b2 = 0. You see it corresponds to the point in the plane whose coordinates are (3. (a + ib) + (c + id) = (a + c) + i (b + d) and (a + ib) (c + id) = = ac + iad + ibc + i2 bd (ac − bd) + i (bc + ad) . a + ib a + b2 a + b2 a + b2 You should prove the following theorem.1 The complex numbers with multiplication and addition deﬁned as above form a ﬁeld satisfying all the ﬁeld axioms listed on Page 15. has a unique multiplicative inverse. However. a+ib. I have graphed the point 3 + 2i. b) identiﬁes a point whose x coordinate is a and whose y coordinate is b.The Complex Numbers This chapter gives a brief treatment of the complex numbers. Every non zero complex number. q 3 + 2i Multiplication and addition are deﬁned in the most obvious way subject to the convention that i2 = −1. The ﬁeld of complex numbers is denoted as C. Just as a real number should be considered as a point on the line. An important construction regarding complex numbers is the complex conjugate denoted by a horizontal line above the number. 2) . a − ib a b 1 = 2 = 2 −i 2 . For example.0. you can skip this short chapter. This will not be needed in Calculus but you will need it when you take diﬀerential equations and various other subjects so it is a good idea to consider the subject. 73 . if you are in a hurry to get to calculus. Thus (a. Thus. a complex number is considered a point in the plane which can be identiﬁed in the usual way using the Cartesian coordinates of the point. These things used to be taught in precalculus classes and people were expected to know them before taking calculus. such a point is written as a + ib.

a + ib (a + ib) = a2 + b2 .4 Let r > 0 be given. b) and the ordered pair (c. Thus. |z| = (zz) 1/2 . 8) . consider the distance between (2. From the distance formula √ 2 2 this distance equals (2 − 1) + (5 − 8) = 10. the following formula is easy to obtain. y x2 + y2 y x2 + y2 x2 + y 2 x x2 2 2 2 + y2 y +i y x2 2 + y2 . Suppose x + iy is a complex number.3 : Let z = a + ib and w = c + id. Theorem 4. Remark 4. a + ib ≡ a − ib. Deﬁnition 4. [r (cos t + i sin t)] = rn (cos nt + i sin nt) . Then if n is a positive integer. |a + ib| ≡ a2 + b2 . On the other hand. n . sin θ = . (a. there exists a unique angle. THE COMPLEX NUMBERS What it does is reﬂect a given complex number across the x axis. z − w = 1 − i3 and so (z − w) (z − w) = (1 − i3) (1 + i3) = 10 so |z − w| = 10. denoting by z the complex number. Be sure to verify this. the same thing obtained with the distance formula.0. For example.2 Deﬁne the absolute value of a complex number as follows. With this deﬁnition. Complex numbers. The polar form of the complex number is then r (cos θ + i sin θ) where θ is this angle just described and r = x2 + y 2 . A fundamental identity is the formula of De Moivre which follows. it is important to note the following. + x2 + y 2 =1 is a point on the unit circle. θ ∈ [0. Therefore. Thus the distance between the point in the plane determined by the ordered pair. letting z = 2 + i5 and √ w = 1 + i8.74 It is deﬁned as follows. z = a + ib. Then |z − w| = (a − c) + (b − d) . It is not too hard but you need to do it.0. Then x + iy = Now note that x x2 + y 2 and so x x2 + y2 . 2π) such that cos θ = x x2 + y2 . 5) and (1.0. Algebraically. d) equals |z − w| where z and w are as just described. are often written in the so called polar form which is described next.

Example 4. Suppose it is true for n.5 Let z be a non zero complex number. Thus α= and so the k th roots of z are of the form |z| 1/k 1/k n+1 = [r (cos t + i sin t)] [r (cos t + i sin t)] n and also both cos (kα) = cos t and sin (kα) = sin t. This requires rk = |z| and so r = |z| This can only happen if for l an integer. 2. Corollary 4. 2 2 The ability to ﬁnd k th roots can also be used to factor some polynomials.0.75 Proof: It is clear the formula holds if n = 1. and −i. [r (cos t + i sin t)] which by induction equals = rn+1 (cos nt + i sin nt) (cos t + i sin t) = rn+1 ((cos nt cos t − sin nt sin t) + i (sin nt cos t + cos nt sin t)) = rn+1 (cos (n + 1) t + i sin (n + 1) t) by the formulas for the cosine and sine of the sum of two angles. Therefore.l ∈ Z k t + 2lπ k cos t + 2lπ k + i sin . First note that i = 1 cos π + i sin 2 corollary. the cube roots of i are 1 cos π 2 . r (cos α + i sin α) .0. the roots are cos and cos √ π π + i sin . Using the formula in the proof of the above (π/2) + 2lπ 3 (π/2) + 2lπ 3 + i sin where l = 0. By De Moivre’s theorem. 6 3 π + i sin 2 √ Thus the cube roots of i are 23 + i 1 . kα = t + 2lπ t + 2lπ . Since the cosine and sine are periodic of period 2π. a complex number. cos 6 6 5 π + i sin 6 3 π . Proof: Let z = x + iy and let z = |z| (cos t + i sin t) be the polar form of the complex number. 2 5 π . l ∈ Z. Then there are always exactly k k th roots of z in C. −2 3 + i 1 . . there are exactly k distinct numbers which result from this formula. 1.6 Find the three cube roots of i. is a k th root of z if and only if rk (cos kα + i sin kα) = |z| (cos t + i sin t) .

7 Factor the polynomial x3 − 27. x3 + 27 = 2 2 (x − 3) x − 3 −1 2 √ 3 2 √ −1 3 +i 2 2 x−3 −1 2 x−3 √ 3 2 √ −1 3 −i 2 2 . you might use Pascal’s triangle to construct the binomial coeﬃcients. Do the same for the four fourth roots of 16. By the above procedure using De Moivre’s theorem. 10. Also. Factor x3 + 8 as a product of linear factors. n. Hint: Use Problem 18 on Page 32 and if you like. derive a formula for sin (5x) and one for cos (5x). n .76 Example 4. 12. If z is a complex number. You already know formulas for cos (x + y) and sin (x + y) and these were used to prove De Moivre’s theorem. and w/z. De Moivre’s theorem says [r (cos t + i sin t)] = rn (cos nt + i sin nt) for n a positive integer. 2. Find z −1 . √ √ these cube roots are 3. r = |z| .1 Exercises 1. Note also x − 3 +i −i = x2 + 3x + 9 and so x3 − 27 = (x − 3) x2 + 3x + 9 where the quadratic polynomial. show that in the polar form for zw the angle involved is θ + φ. x2 + 3x + 9 cannot be factored without using complex numbers. 3. 4. Graph the complex cube roots of 8 in the complex plane. Factor x4 + 16 as the product of two quadratic polynomials each of which cannot be factored further without using complex numbers. Give the complete solution to x4 + 16 = 0. 11. If z and w are two complex numbers and the polar form of z involves the angle θ while the polar form of w involves the angle φ. 9. Completely factor x4 + 16 as a product of linear factors. and 3 −1 − i 23 . Therefore.0. even negative integers? Explain. Find zw. 7. 3 −1 + i 23 . show that in the polar form of a complex number. 6. show there exists ω a complex number with |ω| = 1 and ωz = |z| . z + w. 4. Let z = 5 + i9. Now using De Moivre’s theorem. THE COMPLEX NUMBERS First ﬁnd the cube roots of 27. 5. 8. Let z = 2 + i7 and let w = 3 − i8. Does this formula continue to hold for all integers. z 2 . z. Write x3 + 27 in the form (x + 3) x2 + ax + b where x2 + ax + b cannot be factored any more using only real numbers.

Suppose p (x) = an xn + an−1 xn−1 + · · · + a1 x + a0 where all the ak are real numbers.4. what is wrong? 16. squaring both sides it follows 1 = −1 as in the previous problem. 14. Suppose also that p (z) = 0 for some z ∈ C. 1 = 1(1/4) = (cos 2π + i sin 2π) 1/4 = cos (π/2) + i sin (π/2) = i.1. I plan to use it now for rational exponents. What does this tell you about De Moivre’s theorem? Is there a profound diﬀerence between raising numbers to integer powers and raising numbers to non integer powers? . In words this says the conjugate of a product equals the product of the conjugates and the conjugate of a sum equals the sum of the conjugates. This is clearly a remarkable result but is there something wrong with it? If so. Here is why. I claim that 1 = −1. 15. De Moivre’s theorem is really a grand thing. not just integers. −1 = i2 = √ √ −1 −1 = (−1) = 2 √ 1 = 1. EXERCISES 77 13. If z. Show it follows that p (z) = 0 also. Therefore. w are complex numbers prove zw = zw and then show by induction that z1 · · · zm = m m z1 · · · zm . Also verify that k=1 zk = k=1 zk .

78 THE COMPLEX NUMBERS .

Part II Functions Of One Variable 79 .

.

Note that it is not deﬁned in a simple way from a formula. also written R (f ) .1. f (2) = −2. 7} . Example 5.1) This is a very interesting function called the Dirichlet function. 8} by f (x) ≡ x + 1 for each x ∈ D. This rule is called a function and it is denoted by a letter such as f. {1. (2. the function. 2. 5. is called the range of f.1. and f (7) = 6.1.4 For x ∈ R deﬁne f (x) = m 1 n if x = n in lowest terms for m. For example.1 General Considerations The concept of a function is that of something which gives a unique output for a given input.1. However. Sometimes functions are not given in terms of a formula. (8. Deﬁne a function which assigns an element of D to R ≡ {2. let f (d) ≡ the biological father of d. 81 . 3. −2) . In this example there was a clearly deﬁned procedure which determined the function. The set R. −2. Example 5. Deﬁnition 5.1 Consider two sets.1. The set of all elements of R which are of the form f (x) for some x ∈ D is often denoted by f (D) .2 Consider the list of numbers. is said to be onto. 7} ≡ D. D and R along with a rule which assigns a unique element of R to every element of D. sometimes there is no discernible procedure which yields a particular function. and let f (1) = 2. Example 5. 3) .Functions 5. 7. When R = f (D). 6. 8. the set of second entries. 4. 3. 6. the set of ﬁrst entries in the given set of ordered pairs. D (f ) = D is called the domain of f. 2) . It is common notation to write f : D (f ) → R to denote the situation just described in this deﬁnition where f is a function deﬁned on D having values in R. consider the following function deﬁned on the positive real numbers having the following deﬁnition. 5. (7. (1. 6} . n ∈ Z 0 if x is not rational (5. 4. 2. f (8) = 3. The symbol. Example 5.3 Consider the ordered pairs. f . Then f is a function. 3. R ≡ {2. 6) and let D ≡ {1.5 Let D consist of the set of people who have lived on the earth except for Adam and for d ∈ D.

x. These last two examples show how physical problems can result in functions. Examples 5. The following deﬁnition gives several ways to make new functions from old ones. x. When D (f ) is not speciﬁed. f /g is the name of a function whose domain is D (f ) ∩ {x ∈ D (g) : g (x) = 0} deﬁned as (f /g) (x) = f (x) /g (x) . then g ◦ f is the name of a function whose domain is {x ∈ D (f ) : f (x) ∈ D (g)} which is deﬁned as g ◦ f (x) ≡ g (f (x)) .1. When this is done. If f : D (f ) → X and g : D (g) → Y. from this point with the positive direction being up.1. This is called the composition of the two functions. Then you could let A (t) denote the amount of the chemical at time t.6 and 5.7 Certain chemicals decay with time. Nevertheless. given by f (x) = x2 + 4. . Suppose the weight has extended the spring so that the force exerted by the spring exactly balances the force resulting from the weight on the spring. Thus.1. The function. Deﬁnition 5.7 are considered later in the book and techniques for ﬁnding x (t) and A (t) from the given conditions are presented. Let a.82 FUNCTIONS Example 5. Measure the displacement of the mass. Then af + bg is the name of a function whose domain is D (f ) ∩ D (g) which is deﬁned as (af + bg) (x) = af (x) + bg (x) . b be elements of R. Similarly for k an integer.8 Let f. Later this will be generalized.6 Consider a weight which is suspended at one end of a spring which is attached at the other end to the ceiling. and deﬁne a function as follows: x (t) will equal the displacement of the spring at time t given knowledge of the velocity of the weight and the displacement of the weight at some particular time. f. In this chapter the functions are deﬁned on some subset of R having values in R. The name of the function is f. It is the value of the function at the point. g be functions with values in R. people often write f (x) to denote a function and it doesn’t cause too many problems in beginning courses. the variable.1. You should note that f (x) is not a function. Example 5. f k is the name of a function deﬁned as f k (x) = (f (x)) k The function. it is understood to consist of every number for which f makes sense. Suppose A0 is the amount of chemical at some given time.1. f g is the name of a function which is deﬁned on D (f ) ∩ D (g) given by (f g) (x) = f (x) g (x) . x should be considered as a generic variable free to be anything in D (f ) . I will use this slightly sloppy abuse of notation whenever convenient. x2 + 4 may mean the function.

.5.12 A function f is a polynomial if f (x) = an xn + an−1 xn−1 + · · · + a1 x + a0 where the ai are real numbers and n is a nonnegative integer. In this case the degree of the polynomial. Note that in the case of a rational function.1. f −1 (f (x)) = x for all x ∈ D (f ) and f f −1 (y) = y for all y ∈ f (D (f )) . f (x) is n. Deﬁning id by id (z) ≡ z this says f ◦ f −1 = id and f −1 ◦ f = id . Polynomials and rational functions are particularly easy functions to understand. Thus from the deﬁnition. the function f is one to one sometimes written. if whenever f (x) = f (x1 ) . For example. f (x) = 3x5 + 9x2 + 7x + 5 is a polynomial of degree 5 and 3x5 + 9x2 + 7x + 5 x4 + 3x + x + 1 is a rational function. g ◦ f : {t ∈ R : t ≥ −1} → R. If t < −1 the inside of the square root sign is negative so makes no sense.1. Then f g : R → R is given by f g (t) = t (1 + t) = t + t2 . This is discussed in the following deﬁnition. if x2 + 8 .11 For any function. Thus f is one to one. Then √ g ◦ f (t) = 1 + (2t + 1) = 2t + 2 83 for t ≥ −1.9 Let f (t) = t and g (t) = 1 + t. the domain of the function might not be all of R. Note that in this last example. If f is one to one. f : D (f ) ⊆ X → Y. f is a rational function if h (x) f (x) = g (x) where h and g are polynomials. The concept of a one to one function is very important. Thus the degree of a polynomial is the largest exponent appearing on the variable. √ Example 5. Deﬁnition 5. deﬁne the following set known as the inverse image of y. f −1 (y) ≡ {x ∈ D (f ) : f (x) = y} . 1 − 1. GENERAL CONSIDERATIONS Example 5. For example. Closely related to the deﬁnition of a function is the concept of the graph of a function. then x = x1 . Therefore. the inverse function. f −1 is deﬁned on f (D (f )) and f −1 (y) = x where f (x) = y. f (x) = x+1 the domain of f would be all real numbers not equal to −1.1. 1 − 1.10 Let f (t) = 2t + 1 and g (t) = 1 + t.1. There may be many elements in this set. but when there is always only one element in this set for all y ∈ f (D (f )) .1. Deﬁnition 5. it was necessary to fuss about the domain of g ◦ f because g is only deﬁned for certain values of t.

{0. Usually the domain of the sequence is either N. · · ·} . 3. ∞ . Thus you can consider f (k) . is assumed to be a set described as follows. {(x. the natural numbers consisting of {1. X and Y. f (x) = x2 − 2 y 2 1 x −2 −1 1 2 Deﬁnition 5.13 Given two sets. Example 5. consider the picture of a part of the graph of the function f (x) = 2x − 1. The graph of f can be represented by drawing a picture as mentioned earlier in the section on Cartesian coordinates beginning on Page 49.1.16 Let {ak }k=1 be deﬁned by ak ≡ k 2 + 1. f (k + 1) . · · ·} for k an integer is known as a sequence. k + 2. · · ·} or the nonnegative integers. it is traditional to write f1 .1. The graph of f consists of the set. For example. the Cartesian product of the two sets. when referring to sequences. Note that knowledge of the graph of a function is equivalent to knowledge of the function. 3. etc. y) : x ∈ X and y ∈ Y } .15 A function whose domain is deﬁned as a set of the form {k. y ¡ ¡ ¡ 2 1 x −2 −1 ¡ ¡ ¡ −2 ¡ ¡ 1 ¡ ¡ 2 Here is the graph of the function. R2 denotes the Cartesian product of R with R.14 Let f : D (f ) → R (f ) be a function. ∞ not as f but as {fi }i=k or just {fi } for short. It is also common to write the sequence. 2. The notion of Cartesian product is just an abstraction of the concept of identifying a point in the plane with an ordered pair of numbers.84 FUNCTIONS Deﬁnition 5. k + 1. To ﬁnd f (x) . instead of f (1) . Deﬁnition 5. fk is called the ﬁrst term. Also. f (2) . 2. simply observe the ordered pair which has x as its ﬁrst element and the value of y equals f (x) . etc. f (3) etc.1. f (k + 2) . written as X × Y. 1. y) : y = f (x) for x ∈ D (f )} . fk+1 the second and so forth. X × Y = {(x.1. f2 . In the above context.

Deﬁnition 5. etc. this is often not the case. are 1. EXERCISES 85 This gives a sequence. Let f : {t ∈ R : t = −1} → R be deﬁned by f (t) ≡ t t+1 . listed in order. Suppose also that for n ≥ 4 it is known that an = an−1 + 2an−2 + 3an−3 . Thus the ﬁrst several terms of this sequence. In fact. Thus a1 = 2. · · ·. Let g (t) ≡ 2.· · ·. Are you able to guess a formula for the k th term of this sequence? 5. This particular sequence is called the Fibonacci sequence and is important in the study of reproducing rabbits. 1. then letting bk = ank . 8. f : R → R is a strictly increasing function if whenever x < y. n2 = 3. suppose an = n2 + 1 . Example 5. an .18 Let {an } be a sequence and let n1 < n2 < n3 . and a3 = −1. a7 = a (7) = 72 + 1 = 50 just from using the formula for the k th term of the sequence. 2. it follows bk = (2k − 1) + 1 = 4k 2 − 4k + 2.17 Let a1 = 1 and a2 = 1. · · ·. For example. Let f : R → R be deﬁned by f (t) ≡ t3 + 1. 2 5. Suppose a1 = 1. This happens. This rule which speciﬁes an+1 from knowledge of ak for k ≤ n is known as a recurrence relation.1. an+1 are known. t 1. However. an+2 ≡ an + an+1 . If n1 = 1. ··· be any strictly increasing list of integers such that n1 is at least as large as the ﬁrst index used to deﬁne the sequence {an } . {bk } is called a subsequence of {an } . . A function. Is f one to one? Can you ﬁnd a formula for f −1 ? 4. Find g ◦ f. Then if bk ≡ ank .2. Find a7 . Give the domains of the following functions. Sometimes sequences are deﬁned recursively. a3 = 10. · · ·. Find f −1 if possible. 6.1. a2 = 3. Assuming a1 . does f −1 always exist? Explain your answer. nk = 2k − 1. 3. If f is a strictly increasing function. when the ﬁrst several terms of the sequence are given and then a rule is speciﬁed which determines an+1 from knowledge of a1 . Include the domain of g ◦ f.5. it follows that f (x) < f (y) .2 Exercises √ 2 − t and let f (t) = 1 . It is nice when sequences come to us in this way from a formula for the k th term. n3 = 5. 5. (a) f (x) = (b) f (x) = (c) f (x) = (d) f (x) = (e) f (x) = √ √ x+3 3x−2 x2 − 4 4 − x2 x 4 3x+5 x2 −4 x+1 3.

n i=1 14. an+2 ≡ an + an+1 of the form rn . which of the following are necessarily one to one on their domains? Explain why or why not by giving a proof or an example. one of which is r = 1+2 5 . and let nk = 2k . 11. Then you try to pick c and d such that the conditions. In the situation of the Fibonacci sequence show that the formula for the nth term can be found and is given by 5 an = 5 √ √ 1+ 5 2 n 5 − 5 √ √ 1− 5 2 n . Suppose f : D (f ) → R (f ) is one to one. Draw the graph of the function f (x) = 13. satisfying f (x) − f in the domain of f ? 1 x = 3x which has both x and 1 x 17. (a) f + g (b) f g (c) f 3 (d) f /g 10. Does there exist a function f . Does it follow that g ◦ f is one to one? 9. the empty set. 2t + 1 if t ≤ 1 . If Xi are sets and for some j. Does there exist any such function? 16. Draw the graph of the function f (x) = x2 + 2x + 2. Suppose f (x) + f x = 7x and f is a function deﬁned on R\ {0} . n n Next you can observe that if r1 and r2 both satisfy the recurrence relation then so n n does cr1 + dr2 for any choice of constants c. You will √ be able to show that there are two values of r which work. Xj = ∅. 12. and g : D (g) → R (g) is one to one. Verify carefully that Xi = ∅. R (f ) ⊆ D (g) . Suppose an = 1 n x 1+x . a1 = 1 and a2 = 1 both hold. If f : R → R and g : R → R are two one to one functions.86 7. 1 15. t if t > 1 FUNCTIONS 8. Find all values of x where f (x) = 1 if there are any. Find bk where bk = ank . Let f (t) be deﬁned by f (t) = Find f −1 if possible. Draw the graph of the function f (x) = x3 + 1. d. Hint: You might be able to do this by induction but a better way would be to look for a solution to the recurrence relation. . the nonzero real numbers.

For example. CONTINUOUS FUNCTIONS 87 5. In Calculus. Now that pictures have been given of what it is desired to eliminate. it is not possible for a car to jump from one point to another instantly. Making this restriction precise turns out to be surprisingly diﬃcult although many of the most important theorems about continuous functions seem intuitively clear. They rule out things which could never happen physically. 2 1 x −2 −1 y • c x0 1 2 You see. it is time to give the precise deﬁnition. . Continuous functions are those in which a suﬃciently small change in x results in a small change in f (x) . the most fundamental restriction made is to assume the functions are continuous. here are examples of graphs of functions which are not continuous at the point x0 . This is called a removable discontinuity because the problem can be ﬁxed by redeﬁning the function at the point x0 . 2 1 x −2 −1 y r b x0 1 2 You see from this picture that there is no way to get rid of the jump in the graph of this function by simply redeﬁning the value of the function at x0 . there is a hole in the picture of the graph of this function and instead of ﬁlling in the hole with the appropriate value. Here is another example. That is why it is called a nonremovable discontinuity or jump discontinuity. There are various ways to restrict the concept in order to study something interesting and the types of restrictions considered depend very much on what you ﬁnd interesting. f (x0 ) is too large.3 Continuous Functions The concept of function is far too general to be useful by itself.5.3. Before giving the careful mathematical deﬁnitions.

1 A function f : D (f ) ⊆ R → R is continuous at x ∈ D (f ) if for each ε > 0 there exists δ > 0 such that whenever y ∈ D (f ) and |y − x| < δ it follows that |f (x) − f (y)| < ε. . he wrote a paper which was so profound that he was granted a doctor’s degree. Cauchy gave the deﬁnition in words and Weierstrass. He wrote several hundred papers which ﬁll 24 volumes. Cauchy refused to take the oath of allegience to the new ruler and ended up leaving his family and going into exile for 8 years.3. In sloppy English this deﬁnition says roughly the following: A function. The completely rigorous deﬁnition above is due to Weierstrass. He married in 1818 and lived for 12 years with his wife and two daughters in Paris till the revolution of 1830. Consider the second nonremovable discontinuity. complex analysis. A function. Deﬁnition 5. Cauchy invented the subject of complex analysis. Notwithstanding his great achievments he was not known as a popular teacher. The need for rigor in the subject of calculus was only realized over a long period of time. He was also a good Catholic. 2 Wilhelm Theodor Weierstrass 1815-1897 brought calculus to essentially the state it is in now. Cauchy was educated at ﬁrst by his father who taught him Greek and Latin. He also discovered some pathological examples such as space ﬁlling curves. In fact this statement in words is pretty much the way Cauchy described it. the family returned to Paris and Cauchy studied at the university to be an engineer but became a mathematician although he made fundamental contributions to physics and engineering. somewhat later produced the totally rigorous ε δ deﬁnition presented here. He made fundamental contributions to partial diﬀerential equations. More than anyone else. optics and astronomy. Cauchy was one of the most proliﬁc mathematicians who ever lived. He also did research on many topics in mechanics and physics including elasticity. f is continuous if it is continuous at every point of D (f ) . 1 Augustin Louis Cauchy 1789-1857 was the son of a lawyer who was married to an aristocrat. Eventually Cauchy learned many languages. He is also credited with giving the ﬁrst rigorous deﬁnition of continuity. and many other topics. After the reign of terror. He was born in France just after the fall of the Bastille and his family ﬂed the reign of terror and hid in the countryside till it was over. This deﬁnition does indeed rule out the sorts of graphs drawn above. When he was a secondary school teacher. calculus of variations. f is continuous at x when it is possible to make f (y) as close as desired to f (x) provided y is taken close enough to x.88 FUNCTIONS The deﬁnition which follows. due to Cauchy1 and Weierstrass2 is the precise way to exclude the sort of behavior described above and all statements about continuous functions must ultimately rest on this deﬁnition from now on.

89 2+ 2 2− 1 f (x) c x0 − δ • x −2 −1 x0 d d1 d 2 d d d x0 + δ For the ε shown you can see from the picture that no matter how small you take δ. To verify the assertion about the above function. |f (x) − f (x0 )| > ε.2 Let f (x) = x if x is rational . Similar reasoning shows it excludes the removable discontinuity case as well. y ∈ (x − δ. between x0 − δ and x0 where f (x) < 2 + ε. x + δ) there are rational numbers. In particular. 0 if x is irrational This function is continuous at x = 0 and nowhere else. (x − δ. There are many ways a function can fail to be continuous and it is impossible to list them all by drawing pictures. the deﬁnition of continuity given above excludes the situation in which there is a jump in the function.3. In the interval. |f (y) − f (x)| < ε. But then. for these values of x. letting y1 . If f were continuous at x. Thus |f (y1 ) − f (y2 )| = |y1 | > |x| . Therefore. y2 . x + δ). This is why it is so important to use the deﬁnition. x. Example 5.5. Now let δ > 0 be completely arbitrary.3. there would exist δ > 0 such that for every point. CONTINUOUS FUNCTIONS The removable discontinuity case is similar. there will be points. Here is an example. That is to say it is a property which a function may or may not have at a single point. y1 such that |y1 | > |x| and irrational numbers. ﬁrst show it is not continuous at x if x = 0. The other thing to notice is that the concept of continuity as described in the deﬁnition is a point property. Take such an x and let ε = |x| /2.

note ﬁrst that f (−3) = 25 and it is desired to verify the conditions for continuity. 22 ≤ √ |x − 5| < √ 22 22 2 Exercise 5. f (x) = First note f (5) = √ 22. 5 Sometimes the determination of δ in the veriﬁcation of continuity can be a little more involved. it follows |f (y) − f (0)| < ε and so f is continuous at x = 0. Here is another example. How did I know to let δ = ε? That is a very good question. Then |f (y) − f (0)| = 0 if y is irrational |f (y) − f (0)| = |y| < ε if y is rational. Now let ε > 0 be given. it must be concluded that f is not continuous there. 1√ 2 √ |x − 5| ≤ 22 |x − 5| 11 2x + 12 + 22 √ whenever |x − 5| < 1 because for such x. Choose δ √ such that 0 < δ ≤ min 1. f (x) = −5x + 10 is continuous at x = −3. Consider the following.4 Show the function. The choice of δ for a particular ε is usually arrived at by using intuition. Now consider: √ 2x + 12 − √ 22 = √ 2x + 12 − 22 √ 2x + 12 + 22 √ 2x + 12 is continuous at x = 5. This allows one to ﬁnd a suitable δ. Then if 0 < |x − (−3)| < 5 δ. all the inequalities above hold and =√ √ 2x + 12 − √ √ 2 ε 22 2 = ε.5 Show f (x) = −3x2 + 7 is continuous at x = 7. . To do this. whenever |y − 0| < δ.3. Since a contradiction is obtained by assuming that f is continuous at x.3 Show the function. Example 5. 2x + 12 > 0. the actual ε δ argument reduces to a veriﬁcation that the intuition was correct. Example 5. Then if |y − 0| < δ = ε.90 and y2 be as just described.3. Here is another example.3. ε 222 . either way. To see f is continuous at 0. it follows from this inequality that 1 |−5x + 10 − (25)| = 5 |x − (−3)| < 5 ε = ε. FUNCTIONS which is a contradiction. let 0 < δ ≤ 1 ε. |x| < |y1 | = |f (y1 ) − f (y2 )| ≤ |f (y1 ) − f (x)| + |f (x) − f (y2 )| < 2ε = |x| . Then if |x − 5| < δ. |−5x + 10 − (25)| = 5 |x − (−3)| . If ε > 0 is given. let ε > 0 be given and let δ = ε.

5. 84 . it follows that −3x2 + 7 − (−140) ≤ 3 ((1 + 7) + 7) |x − 7| = 3 (1 + 27) |x − 7| = 84 |x − 7| . af + bg is continuous at x when f . and g is continuous at f (x) .1 The following assertions are valid for f. 3 Lagrange and Laplace were two great physicists of the 1700’s. it follows ε = ε. The function. If f and g are each real valued functions continuous at x. −3x2 + 7 − (−140) ≤ 84 |x − 7| < 84 5.4. g functions and a. before the need for rigor was realized. then f g is continuous at x. in addition to this. 2. you will not really understand calculus as it has been understood for approximately 150 years. it follows from the version of the triangle inequality which states ||s| − |t|| ≤ |s − t| that |x| < 1 + 7. 4. Don’t be discouraged by these historical observations. For example the ﬁrst claim says that (af + bg) (y) is close to (af + bg) (x) when y is close to x provided the same can be said about f and g. 3. g (f (y)) is close to g (f (x)) . b ∈ R. SUFFICIENT CONDITIONS FOR CONTINUITY First observe f (7) = −140. If you are able to master calculus as understood by Lagrange or Laplace3 . They made fundamental contributions to the calculus of variations and to mechanics and astronomy. then f /g is continuous at x. If f is continuous at x. 84 These ε δ proofs will not be emphasized any more than necessary. g are continuous at x ∈ D (f ) ∩ D (g) and a. ε Now let ε > 0 be given. If. you should try a few of them because until you master this concept. Choose δ such that 0 < δ ≤ min 1. The proof of this theorem is in the last section of this chapter but its conclusions are not surprising. The fourth claim is veriﬁed as follows. 1. The best you can do without this deﬁnition is to gain an understanding of the subject as it was understood by people in the 1700’s. you will have learned some very profound ideas even if they did originate in the eighteenth century. The function f : R → R.4.then g ◦ f is continuous at x. |x| = |x − y + y| ≤ |x − y| + |y| and so |x| − |y| ≤ |x − y| . 91 If |x − 7| < 1. Now −3x2 + 7 − (−140) = 3 |x + 7| |x − 7| ≤ 3 (|x| + 7) |x − 7| . Then for |x − 7| < δ. g (x) = 0. continuity of f indicates that if y is close enough to x then f (x) is close to f (y) and so by continuity of g at f (x) . Theorem 5. Therefore. b numbers. However. .4 Suﬃcient Conditions For Continuity The next theorem is a fundamental result which will allow us to worry less about the ε δ deﬁnition of continuity. if |x − 7| < 1. f (x) ∈ D (g) ⊆ R. For the third claim. given by f (x) = |x| is continuous.

By Corollary 3. The proof that sin is continuous is left for you to verify. From the ﬁrst part of this theorem.2) From the ﬁrst part of√ this argument for sin. Theorem 5.5. f (x) = |x| . Proof: First it will be shown that cos and sin are continuous at 0. It follows that for θ small and positive. cos y − cos x = cos (y − x) cos x − sin (x − y) sin x − cos x and so. Now let ε > 0 be given and take δ = ε. whenever |θ| is small enough. This is because for each x ∈ R. showing sin is continuous at 0. This proves these functions are continuous at 0. |sin x| ≤ 1.5.92 Similarly. Next. . ||x| − |y|| ≤ |x − y| . sin2 θ ≥ sin2 θ 1 − cos2 θ = = 1 − cos θ ≥ 0. Then if |θ| < δ. |θ| ≥ |sin θ| . then −θ = |θ| > 0 and −θ ≥ sin (−θ) . since |cos x| . It follows from (5.4 on Page 59 the following inequality is valid for small positive values of θ. FUNCTIONS It follows that if ε > 0 is given. But then this means |sin θ| = − sin θ = sin (−θ) ≤ −θ = |θ| in this case also. one can take δ = ε and obtain that for |x − y| < δ = ε. cos and sin are continuous. 1 − cos θ + sin θ ≥ θ ≥ sin θ. |θ| ≥ |sin θ| = sin θ. Therefore. cos θ ≥ 0 and so for such θ.2) that if |θ| < δ. given ε > 0 there exists δ > 0 such that if |θ| < δ. ||x| − |y|| ≤ |x − y| < δ = ε which shows continuity of the function. Now y = (y − x) + x and so cos y = cos (y − x) cos x − sin (x − y) sin x. note that for |θ| < π/2. it follows |sin θ − 0| = |sin θ − sin 0| = |sin θ| ≤ |θ| < δ = ε.5 Continuity Of Circular Functions The functions sin x and cos x are often called the circular functions. |y| − |x| ≤ |x − y| . (cos x. Therefore. 1 + cos θ 1 + cos θ (5. 5.1 The functions. sin x) is a point on the unit circle. If θ < 0. if |y − x| is suﬃciently small. then |sin θ| < ε. Therefore. then ε > 1 − cos θ ≥ 0. |cos y − cos x| ≤ ≤ |cos x (cos (y − x) − 1)| + |sin x| |sin (y − x)| |cos (y − x) − 1| + |sin (y − x)| . both of these last two terms are less than ε/2 and this proves cos is continuous at x.

Let f (x) = x2 + 1. Show f is continuous at x = 3. 6. 4.5. 5. Using Theorem 5.6 Exercises 1. √ 9. 2. Let f (x) = x2 + 1. Thus if |x − 3| < 1. √ x− √ y=√ x−y √ x+ y at some point in your argument. Show f is continuous at x = 2. Hint: You might want to make use of the identity. Show f is continuous at x = 1.6. In this case. Hint: You need to let ε > 0 be given. Determine where g is 0 if x ∈ Q / continuous and explain your answer. it follows from the triangle inequality. Let f (x) = 1 if x ∈ Q and consider g (x) = f (x) sin x. Show sin is continuous. Proofs of these theorems are in the last section at the end of this chapter. The symbol. min means to take the minimum of the two numbers in the parenthesis. Now try to complete the argument by letting δ ≤ min (1. 10. you should try δ ≤ ε/2. 3. 5. 11. Hint: Review the two versions of the triangle inequality for absolute values. Let f (x) = x2 + 2x. Let f (x) = 2x2 + 1. Show f is continuous at x = 4. Note that if one δ works in the deﬁnition. Let f (x) = |2x + 3|. |x| < 1 + 3 = 4 and so |f (x) − f (3)| < 4 |x − 3| . Let f (x) = x show f is continuous at every value of x in its domain.1. . Hint: |f (x) − f (3)| = x2 + 1 − (9 + 1) = |x + 3| |x − 3| . 7. ε/4) . Hint: First show the function given as f (x) = x is continuous and then use the theorem.4. EXERCISES 93 5.7 Properties Of Continuous Functions Continuous functions have many important properties which are consequences of the completeness axiom. 8. then so does any smaller δ. Show f is continuous at every point. Then show it is continuous at every point. Let f (x) = 2x + 7. Let f (x) = 1 x2 +1 . Show f is continuous at every value of x. Show f is continuous at every point x. show all polynomials are continuous and that a rational function is continuous at every point of its domain.

5π such that f (x) = 0. given √ by f (x) = x4 + 7 − x3 sin x is continuous. Then there exists x ∈ (a. Therefore. b) such that f (x) = c.7.) Now if your function is continuous (having no jumps) and is one to one. It gives the existence of a certain point.4.4 Let φ : [a.7. f. f. . by the intermediate value theorem there must exist x ∈ 0. It says something exists but it does not tell how to ﬁnd it. Also. 2 This example illustrates the use of this major theorem very well.2 Does there exist a solution to the equation x4 + 7 − x3 sin x = 0? By Theorem 5. √ Example 5. The function is strictly decreasing if whenever x < y. This lemma is not real easy to prove but it is one of those things which seems obvious. for it to fail would be incredible. (This is called the horizontal line test. You see this taking place in the above picture where the line and the graph of the function cross. Deﬁnition 5. The proof of this lemma is in the last section of this chapter in case you are interested. b] → R be a one to one continuous function.94 FUNCTIONS The next theorem is called the intermediate value theorem and the following picture illustrates its conclusion.1 Suppose f : [a. You should draw a picture of the graph of a strictly increasing or decreasing function from the deﬁnition. The x value at this point is the one whose existence is guaranteed by the theorem.7. 731 331 831 631 6 < 0. y = c and a continuous function which starts oﬀ less than c at the point a and ends up greater than c at point b.1 and Problem 9 on Page 93 it follows √ easily that the function. try to imagine how this could happen without it being either strictly increasing or decreasing and you will soon see this is highly believable and in fact. the above picture makes this theorem very easy to believe.3 A function. The intermediate value theorem says there is some point between a and b such that the value of the function at this point equals c. It may seem this is obvious but without completeness the conclusion of the theorem cannot be drawn. Lemma 5. deﬁned on some interval is strictly increasing if whenever x < y. it follows f (x) > f (y) . y c x a b You see in the picture there is a horizontal line. To say a function is one to one is to say that every horizontal line intersects the graph of the function in no more than one point. f (0) = 7 > 0 while f 5π 2 = 5π 2 4 +7− 5π 2 3 sin 5π 2 which is approximately equal to −422. Nevertheless.7. Then φ is either strictly increasing or strictly decreasing. Theorem 5. b] → R is continuous and suppose f (a) < c < f (b) . it follows f (x) < f (y) .

Consider the intervals Ik ≡ (0. the reason for the last inequality in (5. d] → [a. d) → (a. This corollary is not too surprising either. Of course. The next corollary follows from the above. the concept of continuity is tied to a rigorous deﬁnition. Separating the two halves you ﬁnd an identical doll inside. bk and suppose that for all k = 1. you ﬁnd yet another identical doll inside. Thus c ∈ Ik for every k and this proves the lemma. Then there is no point which lies in all these intervals because no negative number can be in all the intervals and 1/k is smaller than a given positive number whenever k is large enough. Then you notice this inside doll also comes apart in the center. Therefore. d] and f −1 : [c. [c. To view the graph of the inverse function. d) and f −1 : (c. With the nested interval lemma.6 Let f : (a. this implies ak ≤ ak+1 . This goes on quite a while until the ﬁnal doll is in one piece. Lemma 5. · · · ≤ bk (5. b) is continuous. c ∈ R which is an element of every Ik . (c. In Russia there is a kind of doll called a matrushka doll. 1/k) . The nested interval lemma is like a matrushka doll except the process never stops. Also. then f ([a. b) → R be one to one and continuous.4) (5. . · · · ak ≤ c = sup al : l = k.7. Ik ⊇ Ik+1 . The proof of this corollary is the same as the proof of the lemma.7.3). 2 · · · . · · ·. Then there exists a point. simply turn things on the side and switch x and y.7.5) is that from (5. Then f (a. PROPERTIES OF CONTINUOUS FUNCTIONS 95 Corollary 5. the second containing the third. b) is an open interval. if f : [a. · · · . If this went too fast. and (5. Proof: Since Ik ⊇ Ik+1 . it is at least as large as the least upper bound. This is really quite a remarkable result and may not seem so obvious. it becomes possible to prove the following lemma which shows a function continuous on a closed interval in R is bounded. Corollary 5. You pick it up and notice it comes apart in the center.7. bk is an upper bound to al : l = k. 2. bk ≥ bk+1 .4). The problem here is that the endpoints of the intervals were not included contrary to the hypotheses of the above lemma in which all the intervals included the endpoints. It involves a sequence of intervals.7 Let Ik = ak . the third containing the fourth and so on. b) → R be a one to one continuous function. the ﬁrst containing the second. k + 1.5. b]) is a closed interval.3) (5. neither will the new graph. Now deﬁne al ≤ al ≤ bl ≤ bk . Then φ is either strictly increasing or strictly decreasing. k + 1.5 Let φ : (a. If the original graph has no jumps in it. if k ≤ l. Consequently. b] → R is one to one and continuous. 2. b] is continuous.5) for each k = 1. This is why there is a proof in the last section of this chapter.4) By the ﬁrst inequality in (5. The fundamental question is whether there exists a point in all the intervals. Thus the only candidate for being in all the intervals is 0 and 0 has been left out of them all. c ≡ sup al : l = 1. Separating the two halves. not to the drawing of pictures.

there exists a point. (5. This contradiction proves the lemma. b] .7.6) holds for all such y. Alternatively. This proves the theorem. Example 5. m and M such that for all x ∈ [a. by continuity. b] and let f : I → R be continuous. Does this violate the conclusion of the nested interval lemma? It does not because the end points of the interval involved are not in the interval.7. 1) .8 Let I = [a. f is not bounded. If for −1 1 all x ∈ I. 1) would have been bounded although in this case the boundedness of the function would not follow from the above lemma because it fails to include the right endpoint. But this implies that for all y ∈ Ik .000001. Thus h (x1 ) ≥ h (x) for all x ∈ I and so M − f (x1 ) ≥ M − f (x) for all x ∈ I which implies f (x1 ) ≤ f (x) for all x ∈ I. Therefore. x2 ∈ I such that for all x ∈ I. This means there exist. 2 Continue in this way obtaining sets. Then f is bounded.8 f (I) has a least upper bound.7. Ik such that Ik ⊇ Ik+1 and for any two points in Ik . That is there exist numbers.10 Let I = [a. The next theorem is known as the max min theorem or extreme value theorem. g (x) ≡ (M − f (x)) = M −f (x) is continuous on I. a. Clearly. . f (x) = M.6) Let k be so large that 2−k (b − a) < δ. Now do to I1 what was done to I0 to obtain I2 ⊆ I1 and for any two points. there exists a δ > 0 such that if |c − y| < δ. |c − y| < δ and so (5. x1 . M. Proof: By completeness of R and Lemma 5. By the nested interval lemma.96 FUNCTIONS Lemma 5. Let I1 be one of these on which f is not bounded.4. a+b 2 and a+b . Theorem 5.8. x. Then for every y ∈ Ik . |x − y| ≤ 2−k (b − a). c which is contained in each Ik . Then use what was just proved to conclude h achieves its maximum at some point.1. you could consider the function h (x) = M − f (x) . The argument for the minimum is similar. The same function deﬁned on [. m ≤ f (x) ≤ M. there must exist some x ∈ I such that f (x) = M. This proves f achieves its maximum. Consequently. x ∈ I such that (M − f (x)) is as small as desired. x1 .7. b . y. the function. Since M is the least upper bound of f (I) there exist points. then |f (c) − f (y)| < 1. x. Consider the two sets.7. g is not bounded above. f (x1 ) ≤ f (x) ≤ f (x2 ) . Proof: Let I ≡ I0 and suppose f is not bounded on I0 . |f (y)| ≤ |f (c)| + 1 which shows that f is bounded on Ik contrary to the way Ik was chosen.9 Let f (x) = 1/x for x ∈ (0. contrary to Lemma 5. b] and let f : I → R be continuous. Also. y ∈ I2 |x − y| ≤ 2−1 b−a ≤ 2−2 (b − a) . then by Theorem 5. Then f achieves its maximum and its minimum on I. Since f is not bounded on I0 . it follows that f must fail to be bounded on at 2 least one of these sets.

Does this contradict the intermediate value theorem? Explain. For all ε > 0 there exists δ > 0 such that if 0 < |y − x| < δ. Do you believe in 7 8? That is. In fact. f (x) = x7 − 8. Give an example of a continuous function deﬁned on [0. That is.5. show that the conclusion of the intermediate value theorem could no longer be obtained. the number you get on your √ calculator is wrong. 1) which is bounded but which does not achieve either its maximum or its minimum. Give an example of a discontinuous function deﬁned on [0. Give an example of a continuous function deﬁned on (0. If you throw out all the irrational numbers. 5. then the notation is y→x+ lim f (y) = L. the rational numbers. It has been known since the time of Pythagoras that 2 is irrational. EXERCISES 97 5. Let f (x) = x − 2 for x ∈ Q. 1) which does not achieve its maximum on (0. 2. 4. Hint: Try f (x) = x2 − 2. 1) ∪ (1.9. 1) . Give an example of a function which is one to one but neither strictly increasing nor strictly decreasing.8 Exercises 1. 5. Why does it exist? Hint: Use the intermediate value theorem on the function. Give an example of a continuous function deﬁned on (0. x + r) for some r > 0. If everything is the same as the above. your calculator does not even know about 7 8. Hint: Look for discontinuous functions satisfying the horizontal line test. negative at 0 but is not equal to zero for any value of x. except y is required to be larger than x and f is only required to be deﬁned on (x. √ 6. show there exists a function which starts oﬀ less than zero and ends up larger than zero and yet there is no number where the function equals zero. there is no point in Q where f (x) = 0. 3. x + r) .9 Limits Of A Function A concept closely related to continuity is that of the limit of a function. then. √ 8. |L − f (y)| < ε. . Deﬁnition 5. Then y→x lim f (y) = L if and only if the following condition holds. Note that f is not necessarily deﬁned at x. 2] which is positive at 2. √ 7.8. x) ∪ (x. does there exist a number which multiplied by itself seven times yields 8? Before you jump to any conclusions.1 Let f : D (f ) ⊆ R → R be a function where D (f ) ⊇ (x − r. 1] which is bounded but does not achieve either its maximum or its minimum. Show that even though f (0) < 0 and f (2) > 0. All it can do is try to approximate it and what it gives you is this approximation. You supply the details.

Note that the value of the function at the point x has nothing to do with the limit of the function in any of these cases. You do not evaluate interesting limits by computing f (x)! In the above picture. limy→x f (y) = b. y→x− Limits are also taken as a variable “approaches” inﬁnity.7) holds. The following pictures illustrate some of these deﬁnitions. (5. Of course nothing is “close” to inﬁnity and so this requires a slightly diﬀerent deﬁnition.98 FUNCTIONS If f is only required to be deﬁned on (x − r. In the second picture.2 If limy→x f (y) = L and limy→x f (y) = L1 . this shows L = L1 . Theorem 5. x) and y is required to be less than x. this is equivalent to f being continuous at x. The value of a function at x is irrelevant to the value of the limit at x! This must always be kept in mind. |f (x) − L| < ε and x→−∞ (5. f (x) is always wrong! It may be the case that f (x) is right but this is merely a happy coincidence when it occurs and as explained below in Theorem 5. Note the value of the function at x equals c while limy→x+ f (y) = b and limy→x− f (y) = a. |f (y) − L1 | < ε. The above theorem holds for any of the kinds of limits presented in the above deﬁnition.6. More precisely: . In this case.9. Roughly speaking. Therefore. for such y. Another concept is that of a function having either ∞ or −∞ as a limit. then L = L1 . Proof: Let ε > 0 be given. we write lim f (y) = L. the limit of the function equals ∞ if the values of the function are ultimately larger than any given number. There exists δ > 0 such that if 0 < |y − x| < δ.7) lim f (x) = L if for every ε > 0 there exists l such that whenever x < l.with the same conditions above. x→∞ lim f (x) = L if for every ε > 0 there exists l such that whenever x > l. the values of the function do not ever get close to their target because nothing can be close to ±∞.9. |L − L1 | ≤ |L − f (y)| + |f (y) − L1 | < ε + ε = 2ε. then |f (y) − L| < ε. c b a s c c c b s c x x In the left picture is shown the graph of a function. Since ε > 0 was arbitrary.

lim (af (y) + bg (y)) = aL + bK. |f g (y) − LK| ≤ (1 + |K| + |L|) [|g (y) − K| + |f (y) − L|] . |g (y) − K| < .4 In this theorem. y→x lim (5. Let ε > 0 be given.11) Suppose limy→x f (y) = L. (5. and the triangle inequality. |f (y) − L| < ε ε . for 0 < |y − x| < δ1 . It may seem there is a lot to memorize here. then f (x) > l.9).5.8) is left for you. δ 2 ). There exists δ 1 such that if 0 < |y − x| < δ 1 .10) is left to you. the symbol. there exists δ > 0 such that whenever |y − x| < δ. 2 (1 + |K| + |L|) 2 (1 + |K| + |L|) (5. It is just like the theorem about the quotient of continuous functions being continuous provided the function in the denominator is non zero at the point of interest. b ∈ R. One sided limits and limits as the variable approaches −∞. g (y) K (5. |f (y)| < 1 + |L| . |f g (y) − LK| ≤ |f g (y) − f (y) K| + |f (y) K − LK| ≤ |f (y)| |g (y) − K| + |K| |f (y) − L| . are deﬁned similarly.10) Also.9. .9). Limits as x → ±∞ and one sided limits are handled similarly. Therefore. Then by the triangle inequality. limx→∞ f (x) = ∞ if for all k. then |f (y) − L| < 1.9. and so for such y. Suppose limy→x f (y) = L and limy→x g (y) = K where K and L are real numbers in R.13) that |f g (y) − LK| < ε and this proves (5. (5. LIMITS OF A FUNCTION 99 Deﬁnition 5. then y→x lim h ◦ f (y) = h (L) . there exists l such that f (x) > k whenever x > l.12) Then letting 0 < δ ≤ min (δ 1 . It is like a corresponding theorem for continuous functions. this is not so because all the deﬁnitions are intuitive when you understand them. In fact. Then if a.13) Now let 0 < δ 2 be such that for 0 < |x − y| < δ 2 . Proof: The proof of (5. Theorem 5.9) and if K = 0. If f (y) ≤ a all y of interest. limy→x denotes any of the limits described above. The proof of (5. Next consider (5. if h is a continuous function deﬁned near L. it follows from (5.8) y→x y→x lim f g (y) = LK f (y) L = .3 If f (x) ∈ R. then limy→x f (x) = ∞ if for every number l.9. (5. then L ≤ a and if f (y) ≥ a then L ≥ a.

A very useful theorem for ﬁnding limits is called the squeezing theorem. Then x→a lim h (x) = L.9. |f (x) − L| < ε/2 and there exists δ 2 such that if 0 < |x − a| < δ 2 . then |h (y) − h (L)| < ε Now since limy→x f (y) = L. if 0 < |y−x| < δ.11). Proof: You ﬁll in the details. then f is continuous at x ∈ I if and only if limy→x f (y) = f (x) . and I is an interval of the form (a. for y close enough to x. It only remains to verify the last assertion. b). |h (x) − L| ≤ |f (x) − L| + |g (x) − L| . if 0 < |x − a| < δ. |h (f (y)) − h (L)| < ε. then |f (y) − L| < η.100 FUNCTIONS Consider (5. Theorem 5. there exists η > 0 such that if |y−L| < η. δ 2 ). [a. They are minor modiﬁcations of the above. Therefore. f (y) ∈ (L − ε. b]. there exists δ > 0 such that if 0 < |y−x| < δ. f (x) ≤ h (x) ≤ g (x) . Compare the deﬁnition of continuous and the deﬁnition of the limit just given. then L > a.6 For f : I → R. If L < h (x) . If this is not true. then |h (x) − L| ≤ |f (x) − L| + |g (x) − L| < ε/2 + ε/2 = ε. Since h is continuous near L. Letting ε be small enough that a < L − ε. There exists δ 1 such that if 0 < |x − a| < δ 1 . then |h (x) − L| ≤ |f (x) − L| . b] . Now let ε > 0 be given. it follows that ultimately. Letting 0 < δ ≤ min (δ 1 . then |g (x) − L| < ε/2. Proof: If L ≥ h (x) . Theorem 5.5 Suppose limx→a f (x) = L = limx→a g (x) and for all x near a. Therefore. or [a. The proofs are left to you. L + ε) which requires f (y) > a contrary to assumption. it follows that for ε > 0 given. This proves the theorem. The same theorem holds for one sided limits and limits as the variable moves toward ±∞. (a. Assume f (y) ≤ a. . b) . It is required to show that L ≤ a. then |h (x) − L| ≤ |g (x) − L| .9.

**5.9. LIMITS OF A FUNCTION Example 5.9.7 Find limx→3 Note that
**

x2 −9 x−3 x2 −9 x−3 .

101

= x + 3 whenever x = 3. Therefore, if 0 < |x − 3| < ε, x2 − 9 − 6 = |x + 3 − 6| = |x − 3| < ε. x−3

It follows from the deﬁnition that this limit equals 6. You should be careful to note that in the deﬁnition of limit, the variable never equals the thing it is getting close to. In this example, x is never equal to 3. This is very signiﬁcant because, in interesting limits, the function whose limit is being taken will usually not be deﬁned at the point of interest. x2 − 9 if x = 3. x−3 How should f be deﬁned at x = 3 so that the resulting function will be continuous there? f (x) = In the previous example, the limit of this function equals 6. Therefore, by Theorem 5.9.6 it is necessary to deﬁne f (3) ≡ 6. Example 5.9.9 Find limx→∞

x 1+x .

Example 5.9.8 Let

x 1 Write 1+x = 1+(1/x) . Now it seems clear that limx→∞ 1 + (1/x) = 1 = 0. Therefore, Theorem 5.9.4 implies

x 1 1 = lim = = 1. 1 + x x→∞ 1 + (1/x) 1 √ √ Example 5.9.10 Show limx→a x = a whenever a ≥ 0. In the case that a = 0, take the limit from the right.

x→∞

lim

There are two cases. First consider the case when a > 0. Let ε > 0 be given. Multiply √ √ and divide by x + a. This yields √ x− √ a = √ x−a √ . x+ a

Now let 0 < δ 1 < a/2. Then if |x − a| < δ 1 , x > a/2 and so √ x− √ a = √ |x − a| x−a √ ≤ √ √ √ x+ a a/ 2 + a √ 2 2 ≤ √ |x − a| . a

Now let 0 < δ ≤ min δ 1 , ε√a . Then for 0 < |x − a| < δ, 2 2 √ x− √ √ √ √ 2 2ε a 2 2 √ = ε. a ≤ √ |x − a| < √ a a 2 2

√

Next consider the case where a = 0. In this case, let ε > 0 and let δ = ε2 . Then if √ 1/2 0 < x − 0 < δ = ε2 , it follows that 0 ≤ x < ε2 = ε.

102

FUNCTIONS

5.10

Exercises

(a) limx→0+

|x| x x limx→0+ |x| |x| x

1. Find the following limits if possible

(b)

(c) limx→0− (d) limx→4 (e) limx→3 (g) (h)

x2 −16 x+4 x2 −9 x+3

(f) limx→−2

**x2 −4 x−2 x limx→∞ 1+x2 x limx→∞ −2 1+x2
**

1 (x+h)3 1 − x3

2. Find limh→0 3. Find

h

.

√ √ 4 x− limx→4 √x−22 . √ √ √ 5 3x+ 4 x+7 x √ . 3x+1 (x−3)20 (2x+1)30 . (2x2 +7)25

4. Find limx→∞ 5. Find limx→∞ 6. Find limx→2

x2 −4 x3 +3x2 −9x−2 .

7. Find limx→∞

√

1 − 7x + x2 −

√

1 + 7x + x2 .

8. Prove Theorem 5.9.2 for right, left and limits as y → ∞. √ √ 9. Prove from the deﬁnition that limx→a 3 x = 3 a for all a ∈ R. Hint: You might want to use the formula for the diﬀerence of two cubes, a3 − b3 = (a − b) a2 + ab + b2 . 10. Find limh→0

(x+h)2 −x2 . h

**11. Prove Theorem 5.9.6 from the deﬁnitions of limit and continuity. 12. Find limh→0 13. Find limh→0
**

(x+h)3 −x3 h

1 1 x+h − x

h

3

+27 14. Find limx→−3 xx+3 √ (3+h)2 −3 15. Find limh→0 if it exists. h

√ (x+h)2 −x exists and ﬁnd the limit. 16. Find the values of x for which limh→0 h √ √ 3 (x+h)− 3 x 17. Find limh→0 if it exists. Here x = 0. h

5.11. THE LIMIT OF A SEQUENCE

103

18. Suppose limy→x+ f (y) = L1 = L2 = limy→x− f (y) . Show limy→x f (x) does not exist. Hint: Roughly, the argument goes as follows: For |y1 − x| small and y1 > x, |f (y1 ) − L1 | is small. Also, for |y2 − x| small and y2 < x, |f (y2 ) − L2 | is small. However, if a limit existed, then f (y2 ) and f (y1 ) would both need to be close to some number and so both L1 and L2 would need to be close to some number. However, this is impossible because they are diﬀerent. 19. Show limx→0 sin x = 1. Hint: You might consider Theorem 5.5.1 on Page 92 to write x the inequality |sin x| + 1 − cos x ≥ |x| ≥ |sin x| whenever |x| is small. Then divide both |x| sin2 x sides by |sin x| and use some trig. identities to write |sin x|(1+cos x) + 1 ≥ |sin x| ≥ 1 and then use squeezing theorem.

5.11

The Limit Of A Sequence

A closely related concept is the limit of a sequence. This was deﬁned precisely a little before the deﬁnition of the limit by Bolzano4 . The following is the precise deﬁnition of what is meant by the limit of a sequence. Deﬁnition 5.11.1 A sequence {an }n=1 converges to a,

n→∞ ∞

lim an = a or an → a

if and only if for every ε > 0 there exists nε such that whenever n ≥ nε , |an − a| < ε. In words the deﬁnition says that given any measure of closeness, ε, the terms of the sequence are eventually all this close to a. Note the similarity with the concept of limit. Here, the word “eventually” refers to n being suﬃciently large. Earlier, it referred to y being suﬃciently close to x on one side or another or else x being suﬃciently large in either the positive or negative directions. Theorem 5.11.2 If limn→∞ an = a and limn→∞ an = a1 then a1 = a. Proof: Suppose a1 = a. Then let 0 < ε < |a1 − a| /2 in the deﬁnition of the limit. It follows there exists nε such that if n ≥ nε , then |an − a| < ε and |an − a1 | < ε. Therefore, for such n, |a1 − a| ≤ |a1 − an | + |an − a| < ε + ε < |a1 − a| /2 + |a1 − a| /2 = |a1 − a| ,

**a contradiction. Example 5.11.3 Let an = Then it seems clear that
**

n→∞ 1 n2 +1 .

lim

1 = 0. n2 + 1

4 Bernhard Bolzano lived from 1781 to 1848. He was a Catholic priest and held a position in philosophy at the University of Prague. He had strong views about the absurdity of war, educational reform, and the need for individual concience. His convictions got him in trouble with Emporer Franz I of Austria and when he refused to recant, was forced out of the university. He understood the need for absolute rigor in mathematics. He also did work on physics.

104 In fact, this is true from the deﬁnition. Let ε > 0 be given. Let nε ≥ √ n > nε ≥ ε−1 , it follows that n2 + 1 > ε−1 and so 0< n2 1 = an < ε.. +1 √

FUNCTIONS

ε−1 . Then if

Thus |an − 0| < ε whenever n is this large. Note the deﬁnition was of no use in ﬁnding a candidate for the limit. This had to be produced based on other considerations. The deﬁnition is for verifying beyond any doubt that something is the limit. It is also what must be referred to in establishing theorems which are good for ﬁnding limits. Example 5.11.4 Let an = n2 Then in this case limn→∞ an does not exist. Sometimes this situation is also referred to by saying limn→∞ an = ∞. Example 5.11.5 Let an = (−1) . In this case, limn→∞ (−1) does not exist. This follows from the deﬁnition. Let ε = 1/2. If there exists a limit, l, then eventually, for all n large enough, |an − l| < 1/2. However, |an − an+1 | = 2 and so, 2 = |an − an+1 | ≤ |an − l| + |l − an+1 | < 1/2 + 1/2 = 1 which cannot hold. Therefore, there can be no limit for this sequence. Theorem 5.11.6 Suppose {an } and {bn } are sequences and that

n→∞ n n

**lim an = a and lim bn = b.
**

n→∞

**Also suppose x and y are real numbers. Then
**

n→∞

**lim xan + ybn = xa + yb
**

n→∞

(5.14) (5.15)

lim an bn = ab lim an a = . bn b

If b = 0,

n→∞

(5.16)

Proof: The ﬁrst of these claims is left for you to do. To do the second, let ε > 0 be given and choose n1 such that if n ≥ n1 then |an − a| < 1. Then for such n, the triangle inequality implies |an bn − ab| ≤ |an bn − an b| + |an b − ab|

≤ |an | |bn − b| + |b| |an − a| ≤ (|a| + 1) |bn − b| + |b| |an − a| .

5.11. THE LIMIT OF A SEQUENCE Now let n2 be large enough that for n ≥ n2 , |bn − b| < ε ε , and |an − a| < . 2 (|a| + 1) 2 (|b| + 1)

105

Such a number exists because of the deﬁnition of limit. Therefore, let nε > max (n1 , n2 ) . For n ≥ nε , |an bn − ab| ≤ (|a| + 1) |bn − b| + |b| |an − a| ε ε < (|a| + 1) + |b| ≤ ε. 2 (|a| + 1) 2 (|b| + 1)

This proves (5.15). Next consider (5.16). Let ε > 0 be given and let n1 be so large that whenever n ≥ n1 , |bn − b| < Thus for such n, an a 2 an b − abn − = ≤ 2 [|an b − ab| + |ab − abn |] bn b bbn |b| ≤ 2 |a| 2 |an − a| + 2 |bn − b| . |b| |b| |b| . 2

**Now choose n2 so large that if n ≥ n2 , then |an − a| < ε |b| ε |b| , and |bn − b| < . 4 4 (|a| + 1)
**

2

**Letting nε > max (n1 , n2 ) , it follows that for n ≥ nε , an a − bn b ≤ < 2 |a| 2 |an − a| + 2 |bn − b| |b| |b| 2 ε |b| 2 |a| ε |b| + 2 < ε. |b| 4 |b| 4 (|a| + 1)
**

2

Another very useful theorem for ﬁnding limits is the squeezing theorem. Theorem 5.11.7 Suppose limn→∞ an = a = limn→∞ bn and an ≤ cn ≤ bn for all n large enough. Then limn→∞ cn = a. Proof: Let ε > 0 be given and let n1 be large enough that if n ≥ n1 , |an − a| < ε/2 and |bn − b| < ε/2. Then for such n, |cn − a| ≤ |an − a| + |bn − a| < ε. This proves the theorem. As an example, consider the following.

**106 Example 5.11.8 Let cn ≡ (−1) and let bn =
**

1 n, n

FUNCTIONS

1 n

1 and an = − n . Then you may easily show that

n→∞

lim an = lim bn = 0.

n→∞

Since an ≤ cn ≤ bn , it follows limn→∞ cn = 0 also. Theorem 5.11.9 limn→∞ rn = 0. Whenever |r| < 1. Proof:If 0 < r < 1 if follows r−1 > 1. Why? Letting α = r= Therefore, by the binomial theorem, 0 < rn = 1 1 . n ≤ 1 + αn (1 + α)

n 1 r

− 1, it follows

1 . 1+α

Therefore, limn→∞ rn = 0 if 0 < r < 1. Now in general, if |r| < 1, |rn | = |r| → 0 by the ﬁrst part. This proves the theorem. An important theorem is the one which states that if a sequence converges, so does every subsequence. You should review Deﬁnition 5.1.18 on Page 85 at this point. Theorem 5.11.10 Let {xn } be a sequence with limn→∞ xn = x and let {xnk } be a subsequence. Then limk→∞ xnk = x. Proof: Let ε > 0 be given. Then there exists nε such that if n > nε , then |xn − x| < ε. Suppose k > nε . Then nk ≥ k > nε and so |xnk − x| < ε showing limk→∞ xnk = x as claimed.

5.11.1

Sequences And Completeness

You recall the deﬁnition of completeness which stated that every nonempty set of real numbers which is bounded above has a least upper bound and that every nonempty set of real numbers which is bounded below has a greatest lower bound and this is a property of the real line known as the completeness axiom. Geometrically, this involved ﬁlling in the holes. There is another way of describing completeness in terms of sequences which I believe is more useful than the least upper bound and greatest lower bound property. Deﬁnition 5.11.11 {an } is a Cauchy sequence if for all ε > 0, there exists nε such that whenever n, m ≥ nε , |an − am | < ε. A sequence is Cauchy means the terms are “bunching up to each other” as m, n get large. Theorem 5.11.12 The set of terms in a Cauchy sequence in R is bounded above and below.

5.11. THE LIMIT OF A SEQUENCE

107

Proof: Let ε = 1 in the deﬁnition of a Cauchy sequence and let n > n1 . Then from the deﬁnition, |an − an1 | < 1. It follows that for all n > n1 , |an | < 1 + |an1 | . Therefore, for all n, |an | ≤ 1 + |an1 | +

k=1 n1

|ak | .

This proves the theorem. Theorem 5.11.13 If a sequence {an } in R converges, then the sequence is a Cauchy sequence. Proof: Let ε > 0 be given and suppose an → a. Then from the deﬁnition of convergence, there exists nε such that if n > nε , it follows that |an − a| < Therefore, if m, n ≥ nε + 1, it follows that |an − am | ≤ |an − a| + |a − am | < ε ε + =ε 2 2 ε 2

showing that, since ε > 0 is arbitrary, {an } is a Cauchy sequence. Deﬁnition 5.11.14 The sequence, {an } , is monotone increasing if for all n, an ≤ an+1 . The sequence is monotone decreasing if for all n, an ≥ an+1 . If someone says a sequence is monotone, it usually means monotone increasing. There exists diﬀerent descriptions of the completeness axiom. If you like you can simply add the three new criteria in the following theorem to the list of things which you mean when you say R is complete and skip the proof. All versions of completeness involve the notion of ﬁlling in holes and they are really just diﬀerent ways of expressing this idea. In practice, it is often more convenient to use the ﬁrst of the three equivalent versions of completeness in the following theorem which states that every Cauchy sequence converges. In fact, this version of completeness, although it is equivalent to the completeness axiom for the real line, also makes sense in many situations where Deﬁnition 2.14.1 on Page 42 does not make sense. For example, the concept of completeness is often needed in settings where there is no order. This happens as soon as one does multivariable calculus. From now on completeness will mean any of the three conditions in the following theorem. It is the concept of completeness and the notion of limits which sets analysis apart from algebra. You will ﬁnd that every existence theorem, a theorem which asserts the existence of something, in analysis depends on the assumption that some space is complete. Theorem 5.11.15 The following conditions are equivalent to completeness. 1. Every Cauchy sequence converges 2. Every monotone increasing sequence which is bounded above converges. 3. Every monotone decreasing sequence which is bounded below converges.

108

FUNCTIONS

Proof: Suppose every Cauchy sequence converges and let S be a non empty set which is bounded above. In what follows, sn ∈ S and bn will be an upper bound of S. If, in the process about to be described, sn = bn , this will have shown the existence of a least upper bound to S. Therefore, assume sn < bn for all n. Let b1 be an upper bound of S and let s1 be an element of S. Suppose s1 , · · ·, sn and b1 , · · ·, bn have been chosen such that sk ≤ sk+1 and bk ≥ bk+1 . Consider sn +bn , the point on R which is mid way between sn and bn . If this 2 point is an upper bound, let sn + bn bn+1 ≡ 2 and sn+1 = sn . If the point is not an upper bound, let sn+1 ∈ sn + b n , bn 2

and let bn+1 = bn . It follows this speciﬁes an increasing sequence {sn } and a decreasing sequence {bn } such that 0 ≤ bn − sn ≤ 2−n (b1 − s1 ) . Now if n > m, 0 ≤ bm − bn = |bm − bn |

n−1 n−1 n−1

=

k=m

bk − bk+1 ≤

k=m

bk − sk ≤

k=m

2−k (b1 − s1 )

=

2

−m

−2 2−1

−n

(b1 − s1 ) ≤ 2−m+1 (b1 − s1 )

and limm→∞ 2−m = 0 by Theorem 5.11.9. Therefore, {bn } is a Cauchy sequence. Similarly, {sn } is a Cauchy sequence. Let l ≡ limn→∞ sn and let l1 ≡ limn→∞ bn . If n is large enough, |l − sn | < ε/3, |l1 − bn | < ε/3, and |bn − sn | < ε/3. Then |l − l1 | ≤ |l − sn | + |sn − bn | + |bn − l1 | < ε/3 + ε/3 + ε/3 = ε. Since ε > 0 is arbitrary, l = l1 . Why? Then l must be the least upper bound of S. It is an upper bound because if there were s > l where s ∈ S, then by the deﬁnition of limit, bn < s for some n, violating the assumption that each bn is an upper bound for S. On the other hand, if l0 < l , then for all n large enough, sn > l0 , which implies l0 is not an upper bound. This shows 1 implies completeness. First note that 2 and 3 are equivalent. Why? Suppose 2 and consequently 3. Then the same construction yields the two monotone sequences, one increasing and the other decreasing. The sequence {bn } is bounded below by sm for all m and the sequence {sn } is bounded above by bm for all m. Why? Therefore, the two sequences converge. The rest of the argument is the same as the above. Thus 2 and 3 imply completeness. Now suppose completeness and let {an } be an increasing sequence which is bounded above. Let a be the least upper bound of the set of points in the sequence. If ε > 0 is given, there exists nε such that a − ε < anε . Since {an } is a monotone sequence, it follows that whenever n > nε , a−ε < an ≤ a. This proves limn→∞ an = a and proves convergence. Since 3 is equivalent to 2, this is also established. If follows 3 and 2 are equivalent to completeness. It remains to show that completeness implies every Cauchy sequence converges.

5.11. THE LIMIT OF A SEQUENCE Suppose completeness and let {an } be a Cauchy sequence. Let inf {ak : k ≥ n} ≡ An , sup {ak : k ≥ n} ≡ Bn

109

**Then An is an increasing sequence while Bn is a decreasing sequence and Bn ≥ An . Furthermore, lim Bn − An = 0.
**

n→∞

The details of these assertions are easy and are left to the reader. Also, {An } is bounded below by any lower bound for the original Cauchy sequence while {Bn } is bounded above by any upper bound for the original Cauchy sequence. By the equivalence of completeness with 3 and 2, it follows there exists a such that a = limn→∞ An = limn→∞ Bn . Since Bn ≥ an ≥ An , the squeezing theorem implies limn→∞ an = a and this proves the equivalence of these characterizations of completeness. Theorem 5.11.16 Let {an } be a monotone increasing sequence which is bounded above. Then limn→∞ an = sup {an : n ≥ 1} Proof: Let a = sup {an : n ≥ 1} and let ε > 0 be given. Then from Proposition 2.14.3 on Page 42 there exists m such that a − ε < am ≤ a. Since the sequence is increasing, it follows that for all n ≥ m, a − ε < an ≤ a. Thus a = limn→∞ an .

5.11.2

Decimals

You are all familiar with decimals. In the United States these are written in the form .a1 a2 a3 ··· where the ai are integers between 0 and 9.5 Thus .23417432 is a number written as a decimal. You also recall the meaning of such notation in the case of a terminating decimal. 2 3 4 For example, .234 is deﬁned as 10 + 102 + 103 . Now what is meant by a nonterminating decimal? Deﬁnition 5.11.17 Let .a1 a2 · ·· be a decimal. Deﬁne

n

.a1 a2 · ·· ≡ lim

n→∞

k=1

ak . 10k

**Proposition 5.11.18 The above deﬁnition makes sense.
**

ak Proof: Note the sequence k=1 10k n=1 is an increasing sequence. Therefore, if there exists an upper bound, it follows from Theorem 5.11.16 that this sequence converges and so the deﬁnition is well deﬁned. n k=1 n ∞

ak ≤ 10k

n k=1

9 =9 10k

n k=1

1 . 10k

Now 9 10

n k=1

1 10k

n

=

k=1

1 1 − 10k 10

n k=1

1 = 10k

n k=1

1 − 10k

n+1 k=2

1 10k

=

1 1 − n+1 10 10

5 In France and Russia they use a comma instead of a period. This looks very strange but that is just the way they do it.

110 and so

FUNCTIONS

n k=1

1 10 ≤ k 10 9

1 1 − n+1 10 10

≤

10 9

1 10

=

1 . 9

Therefore, since this holds for all n, it follows the above sequence is bounded above. It follows the limit exists.

5.11.3

Continuity And The Limit Of A Sequence

There is a very useful way of thinking of continuity in terms of limits of sequences found in the following theorem. In words, it says a function is continuous if it takes convergent sequences to convergent sequences whenever possible. Theorem 5.11.19 A function f : D (f ) → R is continuous at x ∈ D (f ) if and only if, whenever xn → x with xn ∈ D (f ) , it follows f (xn ) → f (x) . Proof: Suppose ﬁrst that f is continuous at x and let xn → x. Let ε > 0 be given. By continuity, there exists δ > 0 such that if |y − x| < δ, then |f (x) − f (y)| < ε. However, there exists nδ such that if n ≥ nδ , then |xn − x| < δ and so for all n this large, |f (x) − f (xn )| < ε which shows f (xn ) → f (x) . Now suppose the condition about taking convergent sequences to convergent sequences holds at x. Suppose f fails to be continuous at x. Then there exists ε > 0 and xn ∈ D (f ) 1 such that |x − xn | < n , yet |f (x) − f (xn )| ≥ ε. But this is clearly a contradiction because, although xn → x, f (xn ) fails to converge to f (x) . It follows f must be continuous after all. This proves the theorem.

5.12

Exercises

n 3n+4 . 3n4 +7n+1000 . n4 +1 2n +7(5n ) 4n +2(5n ) .

1. Find limn→∞ 2. Find limn→∞ 3. Find limn→∞

1 4. Find limn→∞ n tan n . Hint: See Problem 19 on Page 103. 2 5. Find limn→∞ n sin n . Hint: See Problem 19 on Page 103.

6. Find limn→∞ 7. Find limn→∞ 8. Find limn→∞

9 n sin n . Hint: See Problem 19 on Page 103.

**(n2 + 6n) − n. Hint: Multiply and divide by
**

n 1 k=1 10k . n k k=0 r .

(n2 + 6n) + n.

9. For |r| < 1, ﬁnd limn→∞ recall Theorem 5.11.9.

Hint: First show

n k k=0 r

=

r n+1 r−1

−

1 r−1 .

Then

5.12. EXERCISES

111

10. Suppose x = .3434343434 where the bar over the last 34 signiﬁes that this repeats forever. In elementary school you were probably given the following procedure for ﬁnding the number x as a quotient of integers. First multiply by 100 to get 100x = 34.34343434 and then subtract to get 99x = 34. From this you conclude that x = n 1 k 34/99. Fully justify this procedure. Hint: .34343434 = limn→∞ 34 k=1 100 now use Problem 9. 11. Suppose D (f ) = [0, 1] ∪ {9} and f (x) = x on [0, 1] while f (9) = 5. Is f continuous at the point, 9? Use whichever deﬁnition of continuity you like. 12. Suppose xn → x and xn ≤ c. Show that x ≤ c. Also show that if xn → x and xn ≥ c, then x ≥ c. Hint: If this is not true, argue that for all n large enough xn > c. 13. Let a ∈ [0, 1]. Show a = .a1 a2 a3 · ·· for a unique choice of integers, a1 , a2 , · · · if it is possible to do this. Otherwise, give an example. 14. Show every rational number between 0 and 1 has a decimal expansion which either repeats or terminates. 15. Consider the number whose decimal expansion is .010010001000010000010000001· · ·. Show this is an irrational number. Now using this, show that between any two integers there exists an irrational number. Next show that between any two numbers there exists an irrational number. 16. Using the binomial theorem prove that for all n ∈ N, 1 + Hint: Show ﬁrst that

n n k 1 n n

≤

1+

1 n+1

n+1

.

=

n

n·(n−1)···(n−k+1) . k!

**By the binomial theorem,
**

k factors

1+

1 n

=

k=0

n k

1 n

k

n

=

k=0

n · (n − 1) · · · (n − k + 1) . k!nk

Now consider the term

binomial expansion for you replace n with n + 1 whereever this occurs. Argue the term got bigger and then note that in the binomial expansion for 1+

1 n+1 n+1

n·(n−1)···(n−k+1) and k!nk n+1 1 1 + n+1 except

note that a similar term occurs in the

, there are more terms.

17. Prove by induction that for all k ≥ 4, 2k ≤ k! 18. Use the Problems 21 and 16 to verify for all n ∈ N, 1 + 19. Prove limn→∞ 1 +

1 n n 1 n n

≤ 3.

**exists and equals a number less than 3.
**

n

20. Using Problem 18, prove nn+1 ≥ (n + 1) for all integers, n ≥ 3. 21. Find limn→∞ n sin n if it exists. If it does not exist, explain why it does not. 22. Recall the axiom of completeness states that a set which is bounded above has a least upper bound and a set which is bounded below has a greatest lower bound. Show that a monotone decreasing sequence which is bounded below converges to its greatest lower bound. Hint: Let a denote the greatest lower bound and recall that because of this, it follows that for all ε > 0 there exist points of {an } in [a, a + ε] .

112

n

FUNCTIONS

1 23. Let An = k=2 k(k−1) for n ≥ 2. Show limn→∞ An exists. Hint: Show there exists an upper bound to the An as follows. n k=2

1 k (k − 1)

n

=

k=2

1 1 − k−1 k

=

n

1 1 1 − ≤ . 2 n−1 2

1 24. Let Hn = k=1 k2 for n ≥ 2. Show limn→∞ Hn exists. Hint: Use the above problem to obtain the existence of an upper bound.

25. Let a be a positive number and let x1 = b > 0 where b2 > a. Explain why there exists such a number, b. Now having deﬁned xn , deﬁne xn+1 ≡ 1 xn + xa . Verify that 2 n {xn } is a decreasing sequence and that it satisﬁes x2 ≥ a for all n and is therefore, n bounded below. Explain why limn→∞ xn exists. If x is this limit, show that x2 = a. Explain how this shows that every positive real number has a square root. This is an example of a recursively deﬁned sequence. Note this does not give a formula for xn , just a rule which tells us how to deﬁne xn+1 if xn is known.

9 26. Let a1 = 0 and suppose that an+1 = 9−an . Write a2 , a3 , a4 . Now prove that for all √ n, it follows that an ≤ 9 + 3 5 (By Problem 6 on Page 97 there is no problem with 2 2 the existence of various roots of positive numbers.) and so the sequence is bounded above. Next show that the sequence is increasing and so it converges. Find the limit of the sequence. Hint: You should prove these things by induction. Finally, to ﬁnd 9 the limit, let n → ∞ in both sides and argue that the limit, a, must satisfy a = 9−a .

27. If x ∈ R, show there exists a sequence of rational numbers, {xn } such that xn → x and a sequence of irrational numbers, {xn } such that xn → x. Now consider the following function. 1 if x is rational f (x) = . 0 if x is irrational Show using the sequential version of continuity in Theorem 5.11.19 that f is discontinuous at every point. 28. If x ∈ R, show there exists a sequence of rational numbers, {xn } such that xn → x and a sequence of irrational numbers, {xn } such that xn → x. Now consider the following function. x if x is rational f (x) = . 0 if x is irrational Show using the sequential version of continuity in Theorem 5.11.19 that f is continuous at 0 and nowhere else. 29. The nested interval lemma and Theorem 5.11.19 can be used to give an easy proof of the intermediate value theorem. Suppose f (a) > 0 and f (b) < 0 for f a continuous function deﬁned on [a, b] . The intermediate value theorem states that under these conditions, there exists x ∈ (a, b) such that f (x) = 0. Prove this theorem as follows: Let c = a+b and consider the intervals [a, c] and [c, b] . Show that on one of these 2 intervals, f is nonnegative at one end and nonpositive at the other. Now consider that interval, divide it in half as was done for the original interval and argue that on one of these smaller intervals, the function has diﬀerent signs at the two endpoints. Continue in this way. Next apply the nested interval lemma to get x in all these intervals and

Now pick n1 such that xn1 ∈ I1 . It is an amazing fact that under certain conditions continuity implies uniform continuity. . 1) . [a. Consider the two intervals a. does it follow that limn→∞ |an | = |a|? Prove or else give a counter example. Proof: Let {xn } ⊆ [a.1 Let f : D ⊆ R → R be a function. Show the following converge to 0. The following theorem is part of the Heine Borel theorem. Split it in half and let I2 be the interval which contains xn for inﬁnitely many values of n. This is a continuous function because.13. there exists a subsequence.01n 10n n! √ √ n 32.2 A set. This n 2 nice approach to establishing this limit using only elementary algebra is in Rudin [14]. The notion of uniform continuity involves being able to choose a single δ which works on the whole domain of f. This proves the theorem. Continue this way obtaining a sequence of nested intervals I0 ⊇ I1 ⊇ I2 ⊇ I3 · ·· where the length of In is (b − a) /2n .13.) By the nested interval lemma there exists a point. δ deﬁnition of continuity becomes very small as x gets close to 0. n2 such that n2 > n1 and xn2 ∈ I2 . n3 such that n3 > n2 and xn3 ∈ I3 .4. Theorem 5. b] ≡ I0 . the δ needed in the ε. At least one of these intervals contains xn for inﬁnitely many values of n. UNIFORM CONTINUITY 113 argue there exist sequences. K ⊆ R is sequentially compact if whenever {an } ⊆ K is a sequence. by Theorem 5. (a) (b) n5 1. However. Now observe that en > 0 and use the binomial theorem to conclude 1 + nen + n(n−1) e2 ≤ n. Consider the function f (x) = x for x ∈ (0. there exists a δ depending only on ε such that if |x − y| < δ then |f (x) − f (y)| < ε. you can assume f (xn ) → f (x) and f (yn ) → f (x). Deﬁnition 5. b each of 2 2 which has length (b − a) /2. By continuity. This is discussed in this section. Then f is uniformly continuous if for every ε > 0. {ank } such that this subsequence converges to a point of K. (This can be done because in each case the intervals contained xn for inﬁnitely many values of n. 5.1. Here is the deﬁnition.13. b] is sequentially compact. 1) . Call this interval I1 . Now do for I1 what was done for I0 . etc. |xnk − c| < (b − a) 2−k and so limk→∞ xnk = c ∈ [a. Show this requires that f (x) = 0. it is continuous at every point of (0. a+b and a+b . for a given ε > 0. 30. Furthermore. c contained in all these intervals.5. Prove limn→∞ n n = 1. xn → x and yn → x such that f (xn ) < 0 and f (yn ) > 0. 31. Deﬁnition 5. b] . Hint: Let en ≡ n n − 1 so that (1 + en ) = n.3 Every closed interval.13 Uniform Continuity There is a theorem about the integral of a continuous function which requires the notion of 1 uniform continuity. If limn→∞ an = a.13.

proofs of some theorems which have not been proved yet are given.3. Obviously ynk must also move toward the point also. |f (xδ ) − f (yδ )| ≥ ε. Now nk ≥ k and so 1 |xnk − ynk | < . the theorem must be true.15. Corollary 5. xδ and yδ such that even though |xδ − yδ | < δ. 0 = |f (z) − f (z)| = lim |f (xnk ) − f (ynk )| ≥ ε. |f (xn ) − f (yn )| ≥ ε. The following corollary follows from this theorem and Theorem 5. 5. b] and f : I → R is continuous. k Consequently. Therefore.1 The following assertions are valid . 1 3. f : D ⊆ R → R is Lipschitz continuous or just Lipschitz for short if there exists a constant. ∞) → R given by f (x) = x . there exists a subsequence.4 Let f : K → R be continuous where K is a sequentially compact set in R.114 FUNCTIONS Theorem 5.15 Theorems About Continuous Functions In this section.13. If |xn − yn | → 0 and xn → z. n Now since K is sequentially compact.) By continuity of f and Problem 12 on Page 111. Then f is uniformly continuous.14 Exercises 1. Theorem 5. there exists ε > 0 such that for every δ > 0 there exists a pair of points. and letting the exceptional pair of points for δ = 1/n be denoted by xn and yn . 5. Proof: If this is not true. A function. 2. ( xnk is like a person walking toward a certain point and ynk is like a dog on a leash which is constantly getting shorter. I = [a. Then f is uniformly continuous on K.13. Show f is uniformly continuous even though the set on which f is deﬁned is not sequentially compact. y ∈ D. · · ·. Taking a succession of values for δ equal to 1. k→∞ an obvious contradiction. Consider f : (1. 1/2. does it follow that |f | is also uniformly continuous? If |f | is uniformly continuous does it follow that f is uniformly continuous? Answer the same questions with “uniformly continuous” replaced with “continuous”. ynk → z also. show that yn → z also. If f is uniformly continuous. You should give a precise proof of what is needed here. K such that |f (x) − f (y)| ≤ K |x − y| for all x. 1/3. |xn − yn | < 1 .13. 4. {xnk } such that xnk → z ∈ K.5 Suppose I is a closed interval. Explain why. Show every Lipschitz function is uniformly continuous.

b ∈ R. Now let ε > 0 be given. Therefore. |g (x) − g (y)| < |g (x)| /2 . 3. then everything happens at once. g (x) = 0. If. then f /g is continuous at x. in addition to this.15. Then let 0 < δ ≤ min (δ 1 . The function f : R → R. It follows that for such y.) Let ε > 0 be given. then |f (x) − f (y)| < 1.) There exists δ 1 > 0 such that if |y − x| < δ 1 . Therefore. |f (y)| < 1 + |f (x)| . By assumption. If f is continuous at x. af + bg is continuous at x when f .5. Now consider 2. and g is continuous at f (x) . f (x) ∈ D (g) ⊆ R. 2.then g ◦ f is continuous at x. The function. δ 2 . 2 (1 + |g (x)| + |f (y)|) ε 2 (1 + |g (x)| + |f (y)|) and there exists δ 3 such that if |x−y| < δ 3 . THEOREMS ABOUT CONTINUOUS FUNCTIONS 115 1. There exists δ 2 such that if |x − y| < δ 2 . Then if |x−y| < δ. δ 2 ) .) To obtain the second part. |f g (x) − f g (y)| ≤ |f (x) g (x) − g (x) f (y)| + |g (x) f (y) − f (y) g (y)| ≤ |g (x)| |f (x) − f (y)| + |f (y)| |g (x) − g (y)| ≤ (1 + |g (x)| + |f (y)|) [|g (x) − g (y)| + |f (x) − f (y)|] . If |x − y| < δ. g are continuous at x ∈ D (f ) ∩ D (g) and a. it follows |f (x) − f (y)| < 2(|a|+|b|+1) and there exists ε δ 2 > 0 such that whenever |x − y| < δ 2 . If and f and g are each real valued functions continuous at x. let δ 1 be as described above and let δ 0 > 0 be such that for |x−y| < δ 0 . then |f (x) − f (y)| < Now let 0 < δ ≤ min (δ 1 . This proves the ﬁrst part of 2. for such y. then |g (x) − g (y)| < ε . there exist δ 1 > 0 ε such that whenever |x − y| < δ 1 . all the above hold at once and so |f g (x) − f g (y)| ≤ (1 + |g (x)| + |f (y)|) [|g (x) − g (y)| + |f (x) − f (y)|] < (1 + |g (x)| + |f (y)|) ε ε + 2 (1 + |g (x)| + |f (y)|) 2 (1 + |g (x)| + |f (y)|) = ε. using the triangle inequality |af (x) + bf (x) − (ag (y) + bg (y))| ≤ |a| |f (x) − f (y)| + |b| |g (x) − g (y)| < |a| ε 2 (|a| + |b| + 1) + |b| ε 2 (|a| + |b| + 1) < ε. then f g is continuous at x. δ 3 ) . 4. given by f (x) = |x| is continuous. it follows that |g (x) − g (y)| < 2(|a|+|b|+1) . Proof: First consider 1.

and |x−y| < δ. δ 3 ) . then |g (y) − g (x)| < ε −1 M . f (x) ∈ D (g) ⊆ Rp . δ 1 ) . and g is continuous at f (x) . δ 1 . Either is it or it is not continuous. f (x) f (y) f (x) g (y) − f (y) g (x) − = g (x) g (y) g (x) g (y) |f (x) g (y) − f (y) g (x)| ≤ 2 |g(x)| 2 FUNCTIONS = 2 |f (x) g (y) − f (y) g (x)| |g (x)| 2 ≤ ≤ ≤ ≤ 2 |g (x)| 2 |g (x)| 2 |g (x)| 2 2 [|f (x) g (y) − f (y) g (y) + f (y) g (y) − f (y) g (x)|] [|g (y)| |f (x) − f (y)| + |f (y)| |g (y) − g (x)|] 3 |g (x)| |f (x) − f (y)| + (1 + |f (x)|) |g (y) − g (x)| 2 2 2 2 (1 + 2 |f (x)| + 2 |g (x)|) [|f (x) − f (y)| + |g (y) − g (x)|] |g (x)| ≡ M [|f (x) − f (y)| + |g (y) − g (x)|] where M is deﬁned by M≡ 2 |g (x)| 2 (1 + 2 |f (x)| + 2 |g (x)|) Now let δ 2 be such that if |x−y| < δ 2 . δ 2 . − |g (x)| /2 ≤ |g (y)| − |g (x)| ≤ |g (x)| /2 which implies |g (y)| ≥ |g (x)| /2. Now consider 3.116 and so by the triangle inequality. If f is continuous at x. Then there exists η > 0 such that if <M . everything holds and f (x) f (y) − ≤ M [|f (x) − f (y)| + |g (y) − g (x)|] g (x) g (y) ε −1 ε −1 = ε. then |f (x) − f (y)| < and let δ 3 be such that if |x−y| < δ 3 .then g ◦ f is continuous at x. Then if |x−y| < min (δ 0 . 2 ε −1 M 2 Then if 0 < δ ≤ min (δ 0 . Let ε > 0 be given.) Note that in these proofs no eﬀort is made to ﬁnd some sort of “best” δ. and |g (y)| < 3 |g (x)| /2.). The problem is one which has a yes or a no answer. M + M 2 2 This completes the proof of the second part of 2.

Then there exists x ∈ (a. then on [d. Next here is a proof of the intermediate value theorem. Now consider that interval. if f (d) ≤ c. b] . Therefore.15. y2 ∈ (a. the triangle inequality implies |f (x) − f (y)| = ||x| − |y|| ≤ |x−y| < δ = ε. the above inequality does not hold. x2 < y2 with x2 . b] . b] → R be a continuous function and suppose φ is 1 − 1 on (a. b) . then |f (z) − f (x)| < η. y1 ∈ (a. Continue in this way. Theorem 5. n→∞ Consequently f (x) = c and this proves the theorem. b) such that f (x) = c. b).3 Let φ : [a. Proof: Let d = a+b and consider the intervals [a. Next apply the nested interval lemma to get x in all these intervals. then since φ is 1 − 1. (For the last step.). 2 the function is ≤ c at one end point and ≥ c at the other. all the above hold and so |g (f (z)) − g (f (x))| < ε. Lemma 5. b).15.2 Suppose f : [a. b] f ≥ 0 at one end point and ≤ 0 at the other. f (x) − c = lim (f (xn ) − c) ≤ 0 n→∞ while f (x) − c = lim (f (yn ) − c) ≥ 0. Now |xn − x| ≤ (b − a) 2−n and |yn − x| ≤ (b − a) 2−n and so xn → x and yn → x. Then if |x−y| < δ. x1 . Proof: First it is shown that φ is either strictly increasing or strictly decreasing on (a. let xn . Let xt ≡ tx1 + (1 − t) x2 and yt ≡ ty1 + (1 − t) y2 .) and completes the proof of the theorem. f (yn ) ≥ c. Then xt < yt for all t ∈ [0. Then if |x−z| < δ and z ∈ D (g ◦ f ) ⊆ D (f ) . b] → R is continuous and suppose f (a) < c < f (b) .) To verify part 4. then there exists x1 < y1 .5. b) such that (φ (y1 ) − φ (x1 )) (y1 − x1 ) > 0. Then φ is either strictly increasing or strictly decreasing on [a. This proves part 3. it follows that |g (y) − g (f (x))| < ε. THEOREMS ABOUT CONTINUOUS FUNCTIONS 117 |y−f (x)| < η and y ∈ D (g) . 1] because tx1 ≤ ty1 and (1 − t) x2 ≤ (1 − t) y2 . let ε > 0 be given and let δ = ε. the function has values at least as large as c and values no larger than c. If for some other pair of points.15. yn be elements of this interval such that f (xn ) ≤ c. then on [a. divide it in half as was done for the original interval and argue that on one of these smaller intervals. If f (d) ≥ c. In the nth interval. see Problem 12 on Page 111). This proves part 4. From continuity of f at x. d] . there exists δ > 0 such that if |x−z| < δ and z ∈ D (f ) . d] and [d. (φ (y2 ) − φ (x2 )) (y2 − x2 ) < 0. If φ is not strictly decreasing on (a. b) . Pick the interval on which f has values which are at least as large as c and values no larger than c. On the other hand.

while h (1) > 0. it follows so z ≡ f −1 (f (z)) ∈ (x − η. 1) such that h (t) = 0. This proves the lemma. d) and f −1 : (c. then y ≡ f −1 (f (y)) < x ≡ f −1 (f (x)) and so f −1 is also decreasing. Why is f −1 is continuous at f (x)? Since f is decreasing. a similar argument holding for φ strictly decreasing on (a. Now let x ∈ (a. Let ε > η > 0 and (x − η. f (x − η)) . b). b) is an open interval. Now by continuity of φ at a. x + η) ⊆ (x − ε. Then f (x) ∈ (f (x + η) .118 FUNCTIONS with strict inequality holding for at least one of these inequalities since not both t and (1 − t) can equal zero. if f (x) < f (y) . b) is an open interval. (c. b) . Now deﬁne h (t) ≡ (φ (yt ) − φ (xt )) (yt − xt ) . This proves the theorem in the case where f is strictly decreasing. It follows φ is either strictly increasing or strictly decreasing on (a. it follows that f (a. (c. d) . f (x − η) − f (x)) . φ (y) < φ (x) . there exists t ∈ (0. The case where f is increasing is similar. Let ε > 0 be given. Suppose φ is strictly increasing on (a. φ (a) < φ (x) whenever x ∈ (a. Then if |f (z) − f (x)| < δ. x + η) ⊆ (a. b) is continuous. Proof: Since f is either strictly increasing or strictly decreasing. b] by the continuity of φ. Let δ = min (f (x) − f (x + η) . b) . x + ε) f −1 (f (z)) − x = f −1 (f (z)) − f −1 (f (x)) < ε. then pick y ∈ (a. d) → (a. Corollary 5. If x > a.15. Therefore.4 Let f : (a. This property of being either strictly increasing or strictly decreasing on (a. x→a+ Therefore. . Assume f is decreasing. both xt and yt are points of (a. b) and φ (yt ) − φ (xt ) = 0 contradicting the assumption that φ is one to one. Then f (a. b). b) → R be one to one and continuous. Similarly φ (b) > φ (x) for all x ∈ (a. x) and from the above. Since h is continuous and h (0) < 0. b) . φ (a) = lim φ (z) ≤ φ (y) < φ (x) . b) . b) carries over to [a. b) .

What number does this average get close to as h gets smaller and smaller? Clearly it gets close to 31 and for this reason.5 + h) − 30 (. the object is at 21 kilometers. from −10 to 21. Thus at t = 0. this average velocity would be pretty close to the thing which deserves to be called the instantaneous velocity at t = .5 + h] to get 30 (.5 + .0001) − r (. the position of the object is r (t) = −10 + 30t + t2 where distance is measured in kilometers and t in hours.5) − (. you would expect to be even closer using a time interval of length .000001 instead of just . the number would be negative.1 Velocity Imagine an object which is moving along the real line in the positive direction and that at time t > 0. the object is at the point −10 kilometers and when t = 1.0001 = 31. Thus the velocity at t = .5.5 + h) + (. It is positive because the object is moving in the positive direction.0001) − 30 (. The average velocity during this time is the distance traveled divided by the elapsed time. . Thus in this case form the average velocity on the interval.5) + h.0001. The notion just described of ﬁnding an instantaneous velocity has a geometrical application to ﬁnding the slope of a line tangent 119 . . In general. For example. Suppose it was desired to ﬁnd something which deserves to be referred to as the instantaneous velocity when t = 1/2? If the object were a car. [. It came out positive because the object moved in the positive direction along the real line. Thus the average velocity would be 21−(−10) = 31 1 kilometers per hour.0001] .5 would be close to (r (.5) 2 2 /. consider a time interval of length h and then deﬁne the instantaneous velocity to be the number which all these average velocities get close to as h gets smaller and smaller.5 + .5) + (.0001 = 30 (. if considering the average velocity of the object on the interval [. 000 1 Of course.Derivatives 6.5 + .5 + .5 is deﬁned as 31.5) 2 2 /h = 30 + 2 (.5)) /. the velocity at time .5. it is reasonable to suppose that the magnitude of the average velocity of the object over a very small interval of time would be very close to the number that would appear on the speedometer.01) + (.5 hours. If the object were moving in the negative direction.

Also it is clear from setting y = x + h that f (x) = lim f (y) − f (x) .2) (6. in the deﬁnition of limit. r(t0 + h)) t0 + h t 0 + h1 d d d d d d ¡ t0 d ¡ d ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ t (t0 + h1 .2. is deﬁned as the following limit whenever the limit exists. f (x) . you see the slope of the line joining the two points (t0 . r (t0 )) and (t0 + h. |h| > 0. r (t0 + h)) is given by r (t0 + h) − r (t0 ) h which equals the average velocity on the time interval. The slope of the resulting line segment appears to get closer and closer to what ought to be considered the slope of the line tangent to the curve at the point (t0 .120 to a curve. t0 + h] . You can also see the eﬀect of making h closer and closer to zero as illustrated by changing h to the smaller h1 in the picture. This is why. [t0 . then neither does f (x) . 6.1 The derivative of a function. y−x (6.2 The Derivative The derivative of a function of one variable is a function given by the following deﬁnition. r(t0 + h1 )) (t0 + h. Deﬁnition 6. r(t0 )) In the above picture. The distinction between the limit of a function and its value is very important and must be kept in mind. f (x + h) − f (x) ≡ f (x) h→0 h lim The function of h on the left is called the diﬀerence quotient. It is time to make this heuristic material much more precise. Note that the diﬀerence quotient on the left of the equation is a function of h which is not deﬁned at h = 0. If the limit does not exist. It is not necessary to have the function deﬁned at the point in order to consider its limit.1) y→x . r (t0 )) . DERIVATIVES r(t) (t0 + h.

one from the right and one from the left do not agree as they would have to do if the function had a derivative at x = 0. It is very important to remember that just because f is continuous.6.2. .2.3 Let f (x) = c where c is a constant. Proof: Suppose ε > 0 is given and choose δ 1 > 0 such that if |h| < δ 1 . this shows that if |y − x| < δ. h→0− h→0− h h lim Thus the two limits. h→0 lim f (x + h) − f (x) = lim 0 = 0 h→0 h Example 6. the pointy places don’t have derivatives. See Problem 18 on Page 103. then |f (x + h) − f (x)| < ε.2. Find f (x) . this lack of diﬀerentiability is manifested by there being a pointy place in the graph of y = |x| at x = 0. then f is continuous at x. Set up the diﬀerence quotient. h then for such h. In short. Geometrically.4 Let f (x) = cx where c is a constant.2.2 If f (x) exists. f is continuous at x f (x)exists f (x) = |x| As indicated in the above picture the function f (x) = |x| does not have a derivative at x = 0. To see this. f (x + h) − f (x) c−c = =0 h h Therefore. 1+|fε (x)| it follows if |h| < δ. |f (y) − f (x)| < ε 121 which proves f is continuous at x. Example 6. Find f (x) . The following picture describes the situation. THE DERIVATIVE Theorem 6. Now letting δ < min δ 1 . f (x + h) − f (x) − f (x) < 1. does not mean f has a derivative. the triangle inequality implies |f (x + h) − f (x)| < |h| + |f (x)| |h| . f (h) − f (0) h lim = lim =1 h→0+ h→0+ h h while f (h) − f (0) −h = lim = −1. Letting y = h + x.

(f (ct)) = cf (ct) .7) (6. The formula. (af + bg) (t) = af (t) + bg (t) . (6. h→0 DERIVATIVES lim √ f (x + h) − f (x) = lim c = c. then g is diﬀerentiable on (a − c.8) Written with a slight abuse of notation.5) is referred to as the quotient rule. Theorem 6. Find f (x) . (6.5 Let f (x) = x for x > 0. f f (t) g (t) − g (t) f (t) (t) = . g g 2 (t) Formula (6. b − c) and g (t) = f (t + c) . f (x + h) − f (x) c (x + h) − cx ch = = = c. an easy to remember version of (6.3) (f g) (t) = f (t) g (t) + f (t) g (t) . assume f (t) = 0. (6. b) and if g (t) ≡ f (t + c) . let gp (t) ≡ f (t) .6 Let a. h h h Therefore.4) is referred to as the product rule. f (x + h) − f (x) = h and so h→0 lim f (x + h) − f (x) 1 1 = lim √ √ = √ . Then letting g (t) ≡ f (ct) .6) (6.122 Set up the diﬀerence quotient.9) f (t) . Written with a slight abuse of notation. g (t) = cf (ct) . (f (t + c)) = f (t + c) For p an integer and f (t) exists. √ x+h− x x+h−x √ = √ h h x+h+ x 1 =√ √ x+h+ x √ Set up the diﬀerence quotient. (6. (6. h→0 h 2 x x+h+ x There are rules of derivatives which make ﬁnding the derivative very easy.) Written with a slight abuse of notation. Then the following formulas are obtained.4) (6. If g (t) = 0.10) (In the case where p < 0. b ∈ R and suppose f (t) and g (t) exist.2. . Then (gp ) (t) = pf (t) p−1 p (6. If f is diﬀerentiable at ct where c = 0. h→0 h Example 6.2.10) says (f (t) ) = pf (t) p p−1 f (t) .5) If f is diﬀerentiable on (a.

If p = 0.6. THE DERIVATIVE Proof: The ﬁrst formula is left for you to prove. g−p (t) = gp (t) −1 p p p−1 −1 (t) = g (t) f (t) − g (t) f (t) . (g (t + h) − g (t)) h−1 = h−1 (f (ct + ch) − f (ct)) f (ct + ch) − f (ct) ch f (ct + h1 ) − f (ct) =c h1 =c where h1 = ch.4) follows. it follows from Theorem 5. If the formula holds for some integer.10) holds for p an integer.4 on Page 99. (6. Here is why.2. First consider (6.7) and (6. Then (gp+1 (t)) = f (t) gp (t) and so by the product rule. f (t) . f g Now consider Formula (6.9.10) in the case where p equals a nonnegative integer. f g (t + h) − f g (t) f (t + h) g (t + h) − f (t + h) g (t) f (t + h) g (t) − f (t) g (t) = + h h h (g (t + h) − g (t)) (f (t + h) − f (t)) = f (t + h) + g (t) h h 123 Taking the limit as h → 0 and using Theorem 6.9.4 that (6. h−1 = = f f (t + h) − (t) g g = f (t + h) g (t) − g (t + h) f (t) hg (t) g (t + h) f (t + h) (g (t) − g (t + h)) + g (t + h) (f (t + h) − f (t)) hg (t) g (t + h) −f (t + h) (g (t + h) − g (t)) g (t + h) (f (t + h) − f (t)) + g (t) g (t + h) h g (t) g (t + h) h and from Theorem 5.10) holds because g0 (t) = 1 and so by Example 6.2. g0 (t) = 0 = 0 (f (t)) Next suppose (6.2.3.4). Next consider the quotient rule.2 to conclude limh→0 f (t + h) = f (t) . Consider the second. Formulas (6.6). p then it holds for −p. gp+1 (t) = f (t) gp (t) + f (t) gp (t) = f (t) (f (t)) + f (t) pf (t) = (p + 1) f (t) f (t) .8) are left as an exercise. Then h1 → 0 if and only if h → 0 and so taking the limit as h → 0 yields g (t) = cf (ct) as claimed. g 2 (t) f (t) . (6.

124 and so

DERIVATIVES

g−p (t + h) − g−p (t) = h

gp (t) − gp (t + h) h

1 gp (t) gp (t + h)

−2p

.

**Taking the limit as h → 0 and using the formula for p, g−p (t) = −pf (t)
**

p−1

f (t) (f (t)) f (t) .

= −p (f (t)) This proves the theorem.

−p−1

Example 6.2.7 Let p (x) = 3 + 5x + 6x2 − 7x3 . Find p (x) . From the above theorem, and abusing the notation, p (x) = 3 + 5x + 6x2 − 7x3 = 3 + (5x) + 6x2 = 5 + 12x − 21x2 . Note the process is to take the exponent and multiply by the coeﬃcient and then make the new exponent one less in each term of the polynomial in order to arrive at the answer. This is the general procedure for diﬀerentiating a polynomial as shown in the next example. Example 6.2.8 Let ak be a number for k = 0, 1, · · ·, n and let p (x) = p (x) . Use Theorem 6.2.6

n n n k=0

+ −7x3

= 0 + 5 + (6) (2) (x) (x) + (−7) (3) x2 (x)

ak xk . Find

p (x) =

k=0 n

ak xk

=

k=0

ak xk

n

=

k=0

ak kxk−1 (x) =

k=0

ak kxk−1

x2 +1 x3 .

Example 6.2.9 Find the derivative of the function f (x) = Use the quotient rule f (x) =

2x x3 − 3x2 x2 + 1 1 = − 4 x2 + 3 6 x x

4

Example 6.2.10 Let f (x) = x2 + 1

x3 . Find f (x) .

**Use the product rule and (6.10). Abusing the notation for the sake of convenience, x2 + 1
**

4

x3

=

x2 + 1

4 3

x3 + x3

x2 + 1

4 4

= 4 x2 + 1

(2x) x3 + 3x2 x2 + 1

3

**= 4x4 x2 + 1 Example 6.2.11 Let f (x) = x3 / x2 + 1
**

2

+ 3x2 x2 + 1

4

. Find f (x) .

**6.3. EXERCISES WITH ANSWERS Use the quotient rule to obtain f (x) = 3x2 x2 + 1
**

2

125

− 2 x2 + 1 (2x) x3

4

(x2 + 1)

=

3x2 x2 + 1

2

− 4x4 x2 + 1

4

(x2 + 1)

Obviously, one could consider taking the derivative of the derivative and then the derivative of that and so forth. The main thing to consider about this is the notation. The second derivative is denoted with two primes. Example 6.2.12 Let f (x) = x3 + 2x2 + 1. Find f (x) and f (x) .

To ﬁnd f (x) take the derivative of the derivative. Thus f (x) = 3x2 + 4x and so f (x) = 6x + 4. Then f (x) = 6. When high derivatives are taken, say the 5th derivative, it is customary to write f (5) (t) putting the number of derivatives in parentheses.

6.3

**Exercises With Answers
**

x3 +6x+2 x2 +2 ,

**1. For f (x) = Answer: − −x −12+4x (x2 +2)2
**

4

ﬁnd f (x) .

2. For f (x) = 3x3 + 6x + 3 Answer: 9x2 + 6

3x2 + 3x + 9 , ﬁnd f (x) .

**3x2 + 3x + 9 + 3x3 + 6x + 3 (6x + 3) √ 3. For f (x) = 5x2 + 1, ﬁnd f (x) from the deﬁnition of the derivative. Answer: √
**

5x (5x2 +1)

4. For f (x) = Answer: √ 3

4x (6x2 +1)

2

√ 3

6x2 + 1, ﬁnd f (x) from the deﬁnition of the derivative.

5. For f (x) = (−5x + 2) , ﬁnd f (x) from the deﬁnition of the derivative. Hint: You might use the formula bn − an = (b − a) bn−1 + bn−2 a + · · · + an−2 b + bn−1 Answer: −45 (−5x + 2)

8 2

9

6. Let f (x) = (x + 5) sin (1/ (x + 5)) + 6 (x − 2) (x + 5) for x = −5 and deﬁne f (−5) ≡ 0. Find f (−5) from the deﬁnition of the derivative if this is possible. Hint: Note that |sin (z)| ≤ 1 for any real value of z. Answer: −42

126

DERIVATIVES

6.4

Exercises

1. For f (x) = −3x7 + 4x5 + 2x3 + x2 − 5x, ﬁnd f (3) (x) . 2. For f (x) = −x7 + x5 + x3 − 2x, ﬁnd f (3) (x) . 3. For f (x) = − 3x −x−1 , ﬁnd f (x) . 3x2 +1 4. For f (x) = −2x3 + x + 3 −2x2 + 3x − 1 , ﬁnd f (x) . √ 5. For f (x) = 4x2 + 1, ﬁnd f (x) from the deﬁnition of the derivative. √ 6. For f (x) = 3 3x2 + 1, ﬁnd f (x) from the deﬁnition of the derivative. Hint: You might use √ 2 3 b3 − a3 = b2 + ab + a2 (b − a) for b = 3 (x + h) + 1 and a = 3 3x2 + 1. 7. For f (x) = (−3x + 5) , ﬁnd f (x) from the deﬁnition of the derivative. Hint: You might use the formula bn − an = (b − a) bn−1 + bn−2 a + · · · + an−2 b + bn−1 8. Let f (x) = (x + 2) sin (1/ (x + 2)) + 6 (x − 5) (x + 2) for x = −2 and deﬁne f (−2) ≡ 0. Find f (−2) from the deﬁnition of the derivative if this is possible. Hint: Note that |sin (z)| ≤ 1 for any real value of z. 9. Let f (x) = (x + 5) sin (1/ (x + 5)) for x = −5 and deﬁne f (−5) ≡ 0. Show f (−5) does not exist. Hint: Verify that limh→0 sin (1/h) does not exist and then explain why this shows f (−5) does not exist.

2 5

3

6.5

Local Extrema

When you are on top of a hill, you are at a local maximum although there may be other hills higher than the one on which you are standing. Similarly, when you are at the bottom of a valley, you are at a local minimum even though there may be other valleys deeper than the one you are in. The word, “local” is applied to the situation because if you conﬁne your attention only to points close to your location, you are indeed at either the top or the bottom. Deﬁnition 6.5.1 Let f : D (f ) → R where here D (f ) is only assumed to be some subset of R. Then x ∈ D (f ) is a local minimum (maximum) if there exists δ > 0 such that whenever y ∈ (x − δ, x + δ) ∩ D (f ), it follows f (y) ≥ (≤) f (x) . Derivatives can be used to locate local maximums and local minimums. Theorem 6.5.2 Suppose f : (a, b) → R and suppose x ∈ (a, b) is a local maximum or minimum. Then f (x) = 0. Proof: Suppose x is a local maximum. If h > 0 and is suﬃciently small, then f (x + h) ≤ f (x) and so from Theorem 5.9.4 on Page 99, f (x) = lim Similarly,

h→0+

f (x + h) − f (x) ≤ 0. h

f (x + h) − f (x) ≥ 0. h→0− h The case when x is local minimum is similar. This proves the theorem. f (x) = lim

6.5. LOCAL EXTREMA

127

Deﬁnition 6.5.3 Points where the derivative of a function equals zero are called critical points. It is also customary to refer to points where the derivative of a function does not exist as a critical points. Example 6.5.4 It is desired to ﬁnd two positive numbers whose sum equals 16 and whose product is to be a large as possible. The numbers are x and 16 − x and f (x) = x (16 − x) is to be made as large as possible. The value of x which will do this would be a local maximum so by Theorem 6.5.2 the procedure is to take the derivative of f and ﬁnd values of x where it equals zero. Thus 16 − 2x = 0 and the only place this occurs is when x = 8. Therefore, the two numbers are 8 and 8. Example 6.5.5 A farmer wants to fence a rectangular piece of land next to a straight river. What are the dimensions of the largest rectangle if there are exactly 600 meters of fencing available. The two sides perpendicular to the river have length x and the third side has length y = (600 − 2x) . Thus the function to be maximized is f (x) = 2x (600 − x) = 1200x − 2x2 . Taking the derivative and setting it equal to zero gives f (x) = 1200 − 4x = 0 and so x = 300. Therefore, the desired dimensions are 300 × 600. Example 6.5.6 A rectangular playground is to be enclosed by a fence and divided in 5 pieces by 4 fences parallel to one side of the playground. 1704 feet of fencing is used. Find dimensions of the playground which will have the largest total area. Let x denote the length of one of these dividing fences and let y denote the length of the playground as shown in the following picture. y x

Thus 6x + 2y = 1704 so y =

1704−6x 2

and the function to maximize is = x (852 − 3x) .

f (x) = x

1704 − 6x 2

Therefore, to locate the value of x which will make f (x) as large as possible, take f (x) and set it equal to zero. 852 − 6x = 0 and so x = 142 feet and y = 1704−6×142 = 426 feet. 2 Revenue is deﬁned to be the amount of money obtained in some transaction. Proﬁt is deﬁned as the revenue minus the costs. Example 6.5.7 Francine, the manager of Francine’s Fancy Shakes ﬁnds that at $4, demand for her milk shakes is 900 per day. For each $. 30 increase in price, the demand decreases by 50. Find the price and the quantity sold which maximizes revenue.

128

DERIVATIVES

Let x be the number of $. 30 increases. Then the total number sold is (900 − 50x) . The revenue is R (x) = (900 − 50x) (4 + . 3x) . Then to maximize it, R (x) = 70 − 30x = 0. The solution is x = 2. 33 and so the optimum price is at $4. 70 and the number sold will be 783. Example 6.5.8 Sam, the owner of Spider Sam’s Tarantulas and Creepy Critters ﬁnds he can sell 6 tarantulas every day at the regular price of $30 each. At his last spider celebration sale he reduced the price to $24 and was able to sell 12 tarantulas every day. He has to pay $.0 5 per day to exhibit a tarantula and his ﬁxed costs are $30 per day, mainly to maintain the thousands of tarantulas he keeps on his tarantula breeding farm. What price should he charge to maximize his proﬁt. He assumes the demand for tarantulas is a linear function of price. Thus if y is the number of tarantulas demanded at price x, it follows y = 36 − x. Therefore, the revenue for price x equals R (x) = (36 − x) x. Now you have to subtract oﬀ the costs to get the proﬁt. Thus P (x) = (36 − x) x − (36 − x) (.05) − 30. It follows the proﬁt is maximized when P (x) = 0 so −2.0x + 36. 05 = 0 which occurs when x =$18. 025. Thus Sam should charge about $18 per tarantula. Exercise 6.5.9 Lisa, the owner of Lisa’s gags and gadgets sells 500 whoopee cushions per year. It costs $. 25 per year to store a whoopee cushion. To order whoopee cushions it costs $8 plus $. 90 per cushion. How many times a year and in what lot size should whoopee cushions be ordered to minimize inventory costs? Let x be the times per year an order is sent for a lot size of 500 . If the demand is x constant, it is reasonable to suppose there are about 500 whoopee cushions which have to 2x be stored. Thus the cost to store whoopee cushions is 125.0 = .25 500 . Each time an order 2x 2x is made for a lot size of 500 it costs 8 + 450.0 = 8 + .9 500 and this is done x times a x x x year. Therefore, the total inventory cost is x 8 + 450.0 + 125.0 = C (x) . The problem is to x 2x minimize C (x) = x 8 + 450.0 + 125.0 . Taking the derivative yields x 2x 16x2 − 125 2x2 √ 500 and so the value of x which will minimize C (x) is 5 5 = 2. 79 · ·· and the lot size is 5 √5 = 4 4 178. .88. Of course you would round these numbers oﬀ. Order 179 whoopee cushions 3 times a year. C (x) =

6.6

**Exercises With Answers
**

Answer:

2 3, 0

1. Find the x values of the critical points of the function f (x) = x2 − x3 .

2. Find the x values of the critical points of the function f (x) = Answer: 1

√

3x2 − 6x + 6.

6.6. EXERCISES WITH ANSWERS

129

3. Find the extreme points of the function, f (x) = 2x2 − 20x + 55 and tell whether the extreme point is a maximum or a minimum. Answer: The extremum is at x = 5. It is a maximum.

9 4. Find the extreme points of the function, f (x) = x + x and tell whether the extreme point is a local maximum or a local minimum or neither.

Answer: The extrema are at x = ±3. The one at 3 is a local minimum and the one at −3 is a local maximum. 5. A rectangular pasture is to be fenced oﬀ beside a river with no need of fencing along the river. If there is900 yards of fencing material, what are the dimensions of the largest possible pasture that can be enclosed? Answer: 225 × 45 6. A piece of property is to be fenced on the front and two sides. Fencing for the sides costs $3. 50 per foot and fencing for the front costs $5. 60 per foot. What are the dimensions of the largest such rectangular lot if the available money is $840.0? Answer: 60 × 75. 7. In a particular apartment complex of 120 units, it is found that all units remain occupied when the rent is $300 per month. For each $30 increase in the rent, one unit becomes vacant, on the average. Occupied units require $60 per month for maintenance, while vacant units require none. Fixed costs for the buildings are $30 000 per month. What rent should be charged for maximum proﬁt and what is the maximum proﬁt? Answer: Need to maximize f (x) = (300 + 30x) (120 − x) − 37 200 + 60x for x ∈ [0, 120] where x is the number of $30 increases in rent. $92 880 when the rent is $1980. 8. A picture is 5 feet high and the eye level of an observer is 2 feet below the bottom edge of the picture. How far from the picture should the observer stand if he wants to maximize the angle subtended by the picture? Answer: Let the angle subtended by the picture be θ and let α denote the angle between a horizontal line from the observer’s eye to the wall and the line between the observer’s eye and the base of the picture. Then letting x denote the distance between the wall and the observer’s eye, 7 = x tan (θ + α) = x x

x = 7 and so z = 5 14+x2 . It follows √ to zero, x = 14.

2 z+ x 2 1−z ( x )

tan θ+tan α 1−tan θ tan α

=x

2

2 tan θ+ x 2 1−tan θ ( x )

. The

**problem is equivalent to maximizing tan θ so denote this by z and solve for it. Thus
**

dz dx 14−x = 5 (14+x2 )2 and setting this equal

130 9. Find the point on the curve, y = Answer: √ 3, 63 √ 81 − 6x which is closest to (0, 0) .

DERIVATIVES

10. A street is 200 feet long and there are two lights located at the ends of the street. One of the lights is 27 times as bright as the other. Assuming the brightness of light 8 from one of these street lights is proportional to the brightness of the light and the reciprocal of the square of the distance from the light, locate the darkest point on the street. Answer: 80 feet from one light and 120 feet from the other. 11. Two cities are located on the same side of a straight river. One city is at a distance of 3 miles from the river and the other city is at a distance of 8 miles from the river. The distance between the two points on the river which are closest to the respective cities is 40 miles. Find the location of a pumping station which is to pump water to the two cities which will minimize the length of pipe used. Answer: miles from the point on the river closest to the city which is at a distance of 3 miles from the river.

120 11

6.7

Exercises

1. If f (x) = 0, is it necessary that x is either a local minimum or local maximum? Hint: Consider f (x) = x3 . 2. Two positive numbers add to 32. Find the numbers if their product is to be as large as possible. 3. The product of two positive numbers equals 16. Find the numbers if their sum is to be as small as possible. 4. The product of two positive numbers equals 16. Find the numbers if twice the ﬁrst plus three times the second is to be as small as possible. 5. Theodore, the owner of Theodore’s tarantulas ﬁnds he can sell 6 tarantulas at the regular price of $20 each. At his last spider celebration day sale he reduced the price to $14 and was able to sell 14 tarantulas. He has to pay $.0 5 per day to maintain a tarantula and his ﬁxed costs are $30 per day. What price should he charge to maximize his proﬁt. 6. Lisa, the owner of Lisa’s gags and gadgets sells 500 whoopee cushions per year. It costs $. 25 per year to store a whoopee cushion. To order whoopee cushions it costs $2 plus $. 25 per cushion. How many times a year and in what lot size should whoopee cushions be ordered to minimize inventory costs? 7. A continuous function, f deﬁned on [a, b] is to be maximized. It was shown above in Theorem 6.5.2 that if the maximum value of f occurs at x ∈ (a, b) , and if f is diﬀerentiable there, then f (x) = 0. However, this theorem does not say anything about the case where the maximum of f occurs at either a or b. Describe how to ﬁnd the point of [a, b] where f achieves its maximum. Does f have a maximum? Explain.

6.7. EXERCISES

131

8. Find the maximum and minimum values and the values of x where these are achieved √ for the function, f (x) = x + 25 − x2 . 9. A piece of wire of length L is to be cut in two pieces. One piece is bent into the shape of an equilateral triangle and the other piece is bent to form a square. How should the wire be cut to maximize the sum of the areas of the two shapes? How should the wire be bent to minimize the sum of the areas of the two shapes? Hint: Be sure to consider the case where all the wire is devoted to one of the shapes separately. This is a possible solution even though the derivative is not zero there. 10. A cylindrical can is to be constructed of material which costs 3 cents per square inch for the top and bottom and only 2 cents per square inch for the sides. The can needs to hold 90π cubic inches. Find the dimensions of the cheapest can. Hint: The volume of a cylinder is πr2 h where r is the radius of the base and h is the height. The area of the cylinder is 2πr2 + 2πrh. 11. A rectangular sheet of tin has dimensions 10 cm. by 20 cm. It is desired to make a topless box by cutting out squares from each corner of the rectangular sheet and then folding the rectangular tabs which remain. Find the volume of the largest box which can be made in this way. 12. Let f (x) = 1 x3 − x2 − 8x on the interval [−1, 10] . Find the point of [−1, 10] at which 3 f achieves its minimum. 13. A rectangular garden 200 square feet in area is to be fenced oﬀ against rabbits. Find the least possible length of fencing if one side of the garden is already protected by a barn. 14. A feed lot is to be enclosed by a fence and divided in 5 pieces by 4 fences parallel to one side. 1272 feet of fencing is used. Find dimensions of the feed lot which will have the largest total area. 15. Find the dimensions of the largest rectangle that can be inscribed in a semicircle of radius r where r = 8. 16. A smuggler wants to ﬁt a small cylindrical vial inside a hollow rubber ball with a eight inch diameter. Find the volume of the largest vial that can ﬁt inside the ball. The volume of a cylinder equals πr2 h where h is the height and r is the radius. 17. A function, f, is said to be odd if f (−x) = f (x) and a function is said to be even if f (−x) = f (x) . Show that if f is even, then f is odd and if f is odd, then f is even. Sketch the graph of a typical odd function and a typical even function. 18. Recall sin is an odd function and cos is an even function. Determine whether each of the trig functions is odd, even or neither. 19. Find the x values of the critical points of the function f (x) = 3x2 − 5x3 . √ 20. Find the x values of the critical points of the function f (x) = 3x2 − 6x + 8. 21. Find the extreme points of the function, f (x) = x + 25 and tell whether the extreme x point is a local maximum or a local minimum or neither. 22. A piece of property is to be fenced on the front and two sides. Fencing for the sides costs $3. 50 per foot and fencing for the front costs $5. 60 per foot. What are the dimensions of the largest such rectangular lot if the available money is $1400?

132

DERIVATIVES

23. In a particular apartment complex of 200 units, it is found that all units remain occupied when the rent is $400 per month. For each $40 increase in the rent, one unit becomes vacant, on the average. Occupied units require $80 per month for maintenance, while vacant units require none. Fixed costs for the buildings are $20 000 per month. What rent should be charged for maximum proﬁt and what is the maximum proﬁt? 24. A picture is 9 feet high and the eye level of an observer is 2 feet below the bottom edge of the picture. How far from the picture should the observer stand if he wants to maximize the angle subtended by the picture? √ 25. Find the point on the curve, y = 25 − 2x which is closest to (0, 0) . 26. A street is 200 feet long and there are two lights located at the ends of the street. One of the lights is 1 times as bright as the other. Assuming the brightness of light 8 from one of these street lights is proportional to the brightness of the light and the reciprocal of the square of the distance from the light, locate the darkest point on the street. 27. Two cities are located on the same side of a straight river. One city is at a distance of 3 miles from the river and the other city is at a distance of 8 miles from the river. The distance between the two points on the river which are closest to the respective cities is 40 miles. Find the location of a pumping station which is to pump water to the two cities which will minimize the length of pipe used.

6.8

Mean Value Theorem

The mean value theorem is one of the most important theorems about the derivative. The best versions of many other theorems depend on this fundamental result. The mean value theorem says that under suitable conditions, there exists a point in (a, b), x, such that f (x) equals the slope of the secant line, f (b) − f (a) . b−a The following picture is descriptive of this situation. ¨¨ ¨¨¨ ¨¨ ¨ ¨¨ ¨¨ ¨¨ ¨¨ ¨ a b This theorem is an existence theorem and like the other existence theorems in analysis, it depends on the completeness axiom. The following is known as Rolle’s1 theorem. Theorem 6.8.1 Suppose f : [a, b] → R is continuous, f (a) = f (b) ,

1 Rolle is remembered for Rolle’s theorem and not for anything else he did. Ironically, he did not like calculus.

6.8. MEAN VALUE THEOREM and f : (a, b) → R

133

has a derivative at every point of (a, b) . Then there exists x ∈ (a, b) such that f (x) = 0. Proof: Suppose ﬁrst that f (x) = f (a) for all x ∈ [a, b] . Then any x ∈ (a, b) is a point such that f (x) = 0. If f is not constant, either there exists y ∈ (a, b) such that f (y) > f (a) or there exists y ∈ (a, b) such that f (y) < f (b) . In the ﬁrst case, the maximum of f is achieved at some x ∈ (a, b) and in the second case, the minimum of f is achieved at some x ∈ (a, b). Either way, Theorem 6.5.2 on Page 126 implies f (x) = 0. This proves Rolle’s theorem. The next theorem is known as the Cauchy mean value theorem. Theorem 6.8.2 Suppose f, g are continuous on [a, b] and diﬀerentiable on (a, b) . Then there exists x ∈ (a, b) such that f (x) (g (b) − g (a)) = g (x) (f (b) − f (a)) . Proof: Let h (x) ≡ f (x) (g (b) − g (a)) − g (x) (f (b) − f (a)) . Then letting x = a and then letting x = b, a short computation shows h (a) = h (b) . Also, h is continuous on [a, b] and diﬀerentiable on (a, b) . Therefore Rolle’s theorem applies and there exists x ∈ (a, b) such that h (x) = f (x) (g (b) − g (a)) − g (x) (f (b) − f (a)) = 0. This proves the theorem. The usual mean value theorem, sometimes called the Lagrange mean value theorem, illustrated by the above picture is obtained by letting g (x) = x. Corollary 6.8.3 Let f be continuous on [a, b] and diﬀerentiable on (a, b) . Then there exists x ∈ (a, b) such that f (b) − f (a) = f (x) (b − a) . Corollary 6.8.4 Suppose f (x) = 0 for all x ∈ (a, b) where a ≥ −∞ and b ≤ ∞. Then f (x) = f (y) for all x, y ∈ (a, b) . Thus f is a constant. Proof: If this is not true, there exists x1 and x2 such that f (x1 ) = f (x2 ) . Then by the mean value theorem, f (x1 ) − f (x2 ) 0= = f (z) x1 − x2 for some z between x1 and x2 . This contradicts the hypothesis that f (x) = 0 for all x. This proves the theorem. Corollary 6.8.5 Suppose f (x) > 0 for all x ∈ (a, b) where a ≥ −∞ and b ≤ ∞. Then f is strictly increasing on (a, b) . That is, if x < y, then f (x) < f (y) . If f (x) ≥ 0, then f is increasing in the sense that whenever x < y it follows that f (x) ≤ f (y) . Proof: Let x < y. Then by the mean value theorem, there exists z ∈ (x, y) such that 0 < f (z) = f (y) − f (x) . y−x

Since y > x, it follows f (y) > f (x) as claimed. Replacing < by ≤ in the above equation and repeating the argument gives the second claim.

134

DERIVATIVES

Corollary 6.8.6 Suppose f (x) < 0 for all x ∈ (a, b) where a ≥ −∞ and b ≤ ∞. Then f is strictly decreasing on (a, b) . That is, if x < y, then f (x) > f (y) . If f (x) ≤ 0, then f is decreasing in the sense that for x < y, it follows that f (x) ≥ f (y) Proof: Let x < y. Then by the mean value theorem, there exists z ∈ (x, y) such that 0 > f (z) = f (y) − f (x) . y−x

Since y > x, it follows f (y) < f (x) as claimed. The second claim is similar except instead of a strict inequality in the above formula, you put ≥ .

6.9

Exercises

1. Sally drives her Saturn over the 110 mile toll road in exactly 1.3 hours. The speed limit on this toll road is 70 miles per hour and the ﬁne for speeding is 10 dollars per mile per hour over the speed limit. How much should Sally pay? 2. Two cars are careening down a freeway weaving in and out of traﬃc. Car A passes car B and then car B passes car A as the driver makes obscene gestures. This infuriates the driver of car A who passes car B while ﬁring his handgun at the driver of car B. Show there are at least two times when both cars have the same speed. Then show there exists at least one time when they have the same acceleration. The acceleration is the derivative of the velocity. 3. Show the cubic function, f (x) = 5x3 + 7x − 18 has only one real zero. 4. Suppose f (x) = x7 + |x| + x − 12. How many solutions are there to the equation, f (x) = 0? 5. Let f (x) = |x − 7| + (x − 7) − 2 on the interval [6, 8] . Then f (6) = 0 = f (8) . Does it follow from Rolle’s theorem that there exists c ∈ (6, 8) such that f (c) = 0? Explain your answer. 6. Suppose f and g are diﬀerentiable functions deﬁned on R. Suppose also that it is known that |f (x)| > |g (x)| for all x and that |f (t)| > 0 for all t. Show that whenever x = y, it follows |f (x) − f (y)| > |g (x) − g (y)| . Hint: Use the Cauchy mean value theorem, Theorem 6.8.2. 7. Show that, like continuous functions, functions which are derivatives have the intermediate value property. This means that if f (a) < 0 < f (b) then there exists x ∈ (a, b) such that f (x) = 0. Hint: Argue the minimum value of f occurs at an interior point of [a, b] . 8. Consider the function f (x) ≡ 1 if x ≥ 0 . −1 if x < 0

2

Is it possible that this function could be the derivative of some function? Why?

6.10. CURVE SKETCHING

135

6.10

Curve Sketching

The theorems and corollaries given above can be used to aid in sketching the graphs of functions. The second derivative will also help in determining the shape of the function. Deﬁnition 6.10.1 A diﬀerentiable function, f , deﬁned on an interval, (a, b) , is concave up if f is an increasing function. A diﬀerentiable function deﬁned on an interval, (a, b) , is concave down if f is a decreasing function. From the geometric description of the derivative as the slope of a tangent line to the graph of the function, to say derivative is an increasing function means that as you move from left to right, the slopes of the lines tangent to the graph of f become larger. Thus the graph of the function is bent up in the shape of a smile. It may also help to think of it as a cave when you view it from above, hence the term concave up. If the derivative is decreasing, it follows that as you move from left to right the slopes of the lines tangent to the graph of f become smaller. Thus the graph of the function is bent down in the form of a frown. It is concave down because it is like a cave when viewed from beneath. The following theorem will give a convenient criterion in terms of the second derivative for ﬁnding whether a function is concave up or concave down. The term, concavity, is used to refer to this property. Thus you determine the concavity of a function when you ﬁnd whether it is concave up or concave down. Theorem 6.10.2 Suppose f (x) > 0 for x ∈ (a, b) . Then f is concave up on (a, b) . Suppose f (x) < 0 on (a, b) . Then f is concave down. Proof: This follows immediately from Corollaries 6.8.6 and 6.8.5 applied to the ﬁrst derivative. The following picture may help in remembering this.

In this picture, the plus signs and the smile on the left correspond to the second derivative being positive. The smile gives the way in which the graph of the function is bent. In the second face, the minus signs correspond to the second derivative being negative. The frown gives the way in which the graph of the function is bent. Example 6.10.3 Sketch the graph of the function, f (x) = x2 − 1

2

= x4 − 2x2 + 1

Take the derivative of this function, f (x) = 4x3 − 4x = 4x (x − 1) (x + 1) which equals zero at −1, 0, and 1. It is positive on (−1, 0) , and (1, ∞) and negative on (0, 1) and (−∞, −1) . Therefore, x = 0 corresponds to a local maximum and x = −1 and x = 1 correspond to local minimums. The second derivative is f (x) = 12x2 − 4 and this equals √ √ zero only at the points −1/ 3 and 1/ 3. The second derivative is positive on the intervals √ √ 1/ 3, ∞ and −∞, −1/ 3 so the function, f is smiling on these intervals. The second √ √ derivative is negative on the interval −1/ 3, 1/ 3 and so the original function is frowning on this interval. This describes in words the qualitative shape of the function. It only remains to draw a picture which incorporates this description. The following is such a sketch. It is not intended to be an accurate drawing made to scale, only to be a qualitative picture of what was just determined.

There are also easy to use versions of Maple available which involve essentially pointing and clicking. if you are interested in getting a nice graph of a function. What do these graphs tell you about the case when the second derivative equals zero? 4. Sketch the graph of f (x) = 1/ 1 + x2 showing the intervals on which the function is increasing or decreasing and the intervals on which the graph is concave up and . you also need to place an asterisk between quantities which are multiplied since otherwise it will not know you are multiplying and won’t work.136 DERIVATIVES −1 −1 √ 3 1 √ 3 1 A better graph of this function is the following. An eﬀective way to accomplish your graphing is to go to the help menu and copy and paste an example from this menu changing it as needed. √ 2. done by a computer algebra system. 1 -1 0 x 1 In general. Keep in mind there are certain conventions which must be followed. 3. Find intervals on which the function. y = x3 . You won’t learn any calculus from playing with a computer algebra system but you might have a lot of fun. Sketch the graph of the function.11 Exercises 1. 6. and y = −x4 . However the computer worked a lot harder. For example to write x raised to the second power you enter x ˆ2. you should use a computer algebra system. Sketch the graphs of y = x4 . Both mathematica and Maple have good help menus. f (x) = x3 −3x+1 showing the intervals on which the function is concave up and down and identifying the intervals on which the function is increasing. f (x) = 1 − x2 is increasing and intervals on which it is concave up and concave down. In Maple. Sketch a graph of the function.

11. Hint: For the last part consider y = x3 and y = x4 . EXERCISES concave down. 7. 6. Inﬂection points are points where the graph of a function changes from being concave up to concave down or from being concave down to being concave up. Show that inﬂection points can be identiﬁed by looking at those points where the second derivative equals zero but that not every point where the second derivative equals zero is an inﬂection point. f (x) = x2 / 1 + x2 . Find all inﬂection points for the function. 137 5. Sketch the graph of f (x) = x/ 1 + x2 showing the intervals on which the function is increasing or decreasing and the intervals on which the graph is concave up and concave down. .6.

138 DERIVATIVES .

Some Important Special Functions 7. cos. sin x =1 x→0 x 1 − cos x =0 lim x→0 x lim (7. sin x + (1 − cos x) ≥ x ≥ sin x. (7. There are several approaches to this lemma. [17] and Rose. The ﬁrst thing to do is to give an important lemma. [2].2) Proof: First consider (7. |sin x| sin x sin x 139 . csc. and cot. However. see Apostol.1).3) (cos(x).1 The Circular Functions The Trigonometric functions are also called the circular functions. or almost any other calculus book. sin. sin(x)) ¨ ¨¨ ¨ % x (1 − cos(x)) Now divide by sin x to get 1+ 1 − cos x 1 − cos x x =1+ ≥ ≥ 1. Lemma 7. To see it done in terms of areas of a circular sector. [13] and is based on arc length. In the following picture. Thus this section will be on the functions. tan. it follows from Corollary 3.1.1) (7.5.4 on Page 59 that for small positive x. The proof given here is a modiﬁcation of that found in Tierney.1 The following limits hold. the book by Apostol has no loose ends in the presentation unlike most other books which use this approach. sec.

1 on Page 92 which says limx→0 sin (x) = 0. it is easy to ﬁnd the derivative of sin. 1 − cos x 1 − cos2 x sin x 1 = = sin x .9. sin (x + h) − sin x h→0 h lim sin (x) cos (h) + cos (x) sin (h) − sin x h→0 h (sin x) (cos (h) − 1) sin (h) = lim + cos x h→0 h h = cos x. sin (x) cos (x) tan (x) cot (x) sec (x) csc (x) = cos x = − sin x = sec2 (x) = − csc2 (x) = sec x tan x = − csc x cot x . Theorem 5. = lim Finally. cos (x) = sin (x + π/2) and so cos (x) = sin (x + π/2) = cos (x + π/2) = cos x cos (π/2) − sin x sin (π/2) = − sin x. x→0 x This proves the Lemma. Alternatively. functions are as follows. |sin x| sin x (Why?) From the trig. x x (1 + cos x) x 1 + cos x Therefore. Using Lemma 7. Theorem 7. from the limit theorems. 1+ sin2 x |sin x| x =1+ ≥ ≥1 |sin x| (1 + cos x) (1 + cos x) sin x and so from the squeezing theorem.1.140 SOME IMPORTANT SPECIAL FUNCTIONS For small negative values of x.1. The derivative of cos can be found the same way. With this.2 The derivatives of the trig. x lim =1 x→0 sin x and consequently. identities.5 on Page 100. x→0 lim sin x = lim x→0 x 1 x sin x = 1. it follows that for all small values of x. and the limit theorems. The following theorem is now obvious and the proofs of the remaining parts are left for you. 1 − cos x lim = 0. it is also true that 1+ 1 − cos x x ≥ ≥ 1.5. from Theorem 5.1.

EXERCISES 141 Here are some examples of extremum problems which involve the use of the trig. Therefore. √ √ 2 1 and 2 cos3 θ − 5 sin3 θ = 0 and so tan θ = 5 3 2 3 5 . Therefore. Prove tan (x) = 1 + tan2 (x) . you see that √ 2 √ 2 √ 2 √ 2 ( 3 5) +( 3 2) ( 3 5) +( 3 2) √ √ at this value of θ. Example 7. Find the derivative of the function. functions. using the rules of diﬀerentiation. you have sec θ = and csc θ = .7. 3 3+ √ 3 √ 3 2 2 3 . Find and prove a formula for the derivative of sinm (x) for m an integer. What is the length of the shortest ladder which will lean against the top of the fence and touch the building? Let θ be the angle of the ladder with the ground.1. Prove all parts of Theorem 7. f (θ) = 2 cos3 θ − 5 sin θ + 5 sin θ cos2 θ =0 (cos2 θ) (−1 + cos2 θ) should be solved to get the angle where this length is as small as possible. 2. 4. 5.3 Two hallways intersect at a right angle. 3 3 5 2 the minimum is obtained by substituting these values in to the equation for f (θ) yielding √ 2 √ 2 3 3 5 + 32 . Letting θ be the angle between this rod and the outside wall for the hall having width 2. 3. sin6 (5x) . sin θ cos θ Then the ﬁnal answer is the two hallways.1. .4 A fence 9 feet high is 2 feet from a building. The details are similar to the problem of 7.2. Thus 2 cos3 θ − 5 sin θ + 5 sin θ cos2 θ = 0.2. Then the length of this ladder making this angle with the ground and leaning on the top of the fence while touching the building is 9 2 f (θ) = + . Find the derivative of the function. minimize length of rod f (θ) = 2 csc θ + 5 sec θ. One is 5 feet wide and the other is 2 feet wide.1. Drawing a triangle. What is the length of the longest thin rod which can be carried horizontally from one hallway to the other? You must minimize the length of the rod which touches the inside corner of the two halls and extends to the outside walls.2 Exercises 1. Example 7. tan7 (4x) .

1 The Exponential And Log Functions The Rules Of Exponents As mentioned earlier. what would you do with 2 2 ? As mentioned earlier. √ 12. 1 bm is deﬁned as b−m . . First ﬁnd the number 21234567812345 and then · · ·? Can you ﬁnd this number? It is just 1234567812345 1234567812345 too big. A fence 5 feet high is 2 feet from a building. The amplitude gives the height of the periodic function. The problem is not one of theory but 1234567812345 of practicality.4) (7. When x and y are not integers. Then the following algebraic properties are obtained. SOME IMPORTANT SPECIAL FUNCTIONS sec3 (2x) tan3 (3x) . Clearly something else must be going on. Find the intervals where cos (3x) is increasing. That is its deﬁnition and it is a useful exercise for you to verify (7. Write f (x) in the form A2 + B 2 and note that √ A2 A B cos ωx + √ sin ωx 2 2 + B2 +B A is a point on the unit circle. The number.3 7. However. √A2 +B 2 A2 +B 2 7. (ab) = ax bx bxy = (bx ) . Show there exists an angle. m/n and b > 0 the symbol bm/n √ means n bm .5) hold with this deﬁnition.3. This is very important because it allows us to understand what is going on. 1/2 suppose b = −1 and x = 1/2. In the case where b = 0 the symbol is undeﬁned. Be sure you understand these properties for x and y integers. Repeat Problem 11 but this time show f (x) = A2 + B 2 cos (ωx + φ) . Hint: Remember a point on the unit circle determines an angle. A2 + B 2 gives the “amplitude” and φ is called the “phase shift” while ω is called the “frequency”.142 6. f. b0 ≡ 1 provided b = 0. What exactly is meant by (−1) ? Even in the case where b > 0 there are diﬃculties. 7. Find the derivative of the function.4) and (7. What is the length of the shortest ladder which will lean against the top of the fence and touch the building? Suppose f (x) = A cos ωx + B sin ωx. Two hallways intersect at a right angle. To make √ √ matters even worse. If x is a rational number. If m < 0. √ √ A2 + B 2 sin (ωx + φ) . What is the length of the longest thin rod which can be carried horizontally from one hallway to the other? 10.5) 1 b These properties are called the rules of exponents. bx+y = bx by . Could you use this deﬁnition to ﬁnd 2 1234567812344 ? Consider what you would do. For example. 9. There are no mathematical questions about the existence of this number. a calculator can ﬁnd 2 1234567812344 . 000 000 000 001 123 as an approximate answer. the meaning of bx is no longer clear. These are serious diﬃculties and must be dealt with. One is 3 feet wide and the other is 4 feet wide. 8. consider Problem 6 on Page 97. 2 is irrational and so cannot be written as the quotient of two integers. To see this. How could you ﬁnd φ? A √ B . It yields 2 1234567812344 = 2. φ such that f (x) = 11. bm means to multiply b by itself m times assuming m is a positive integer. Find all intervals where sin (2x) is concave down. b−1 = y x (7.

1 For every b > 0 there exists a unique diﬀerentiable function expb (x) ≡ bx valid for all real values of x such that (7. 3x 8 6 (1/2)x 2x 4 2 –2 –1 0 1 x 2 These graphs suggest that if b < 1 the function. √ expb (m/n) = n bm whenever m.4) and (7. I want it to be completely clear that the Wild Assumption is just that. Based on Wild Assumption 7. do the laws of exponents continue to hold for all real values of x? The short answer is that they do and this is shown later but for now here is a wild assumption which glosses over these issues.8) (7. No reason for believing in such an assumption has been given notwithstanding the pretty pictures drawn by the calculator. Furthermore. denoted by ln b for b > 0 satisfying expb (x) = ln b expb (x) . This is done to conform with usual usage. and for all y ∈ R.7. Wild Assumption 7. then expb (h) = bh = 1. a given function deﬁned at x. y = bx is decreasing while if b > 1. A Wild Assumption Using your calculator or a computer you can obtain graphs of the functions. Then there exists a unique number. (7.1 one can easily ﬁnd out all about bx . the last claim in Wild Assumption 7.2 Let expb be deﬁned in Wild Assumption 7. The following picture gives a few of these graphs.3.7) (7. Theorem 7.6) .2 The Exponential Functions. n are integers. Furthermore. ln (by ) = y ln b. Instead of writing expb (x) I will often write bx and I will also be somewhat sloppy and regard bx as the name of a function and not just as expb (x). ln (ab) = ln (a) + ln (b) .3.5) both hold for all x. y ∈ R.3. the wild assumption will be completely justiﬁed. y = bx for various choices of b. if b = 1 and h = 0. ln 1 = 0. and bx > 0 for all x ∈ R. THE EXPONENTIAL AND LOG FUNCTIONS 143 7.3. Also.3.1 for b > 0. Later in the book.1 follows from the ﬁrst part of this assumption.3. See Problem 1. the function is increasing but just how was the calculator or computer able to draw those graphs? Also.3.

144 ln SOME IMPORTANT SPECIAL FUNCTIONS a = ln (a) − ln (b) b The function. Now by (7. Next.8). ln (ab) = ln a + ln b as claimed.1 and this is denoted by ln b.8). dividing both sides by 1x .1. Now dividing both sides by (by ) veriﬁes (7. ln maps (0. x → ln x is one to one on (0. ∞) onto (−∞.9) from this. x Therefore. 1x = 1 for all x ∈ R.6). To obtain (7. x → ln x is diﬀerentiable and deﬁned for all x > 0 and ln (x) = 1 .3.10) The function.4) and (7. ln (by ) (by ) = ((by ) ) = g (x) where g (x) ≡ bxy . ∞) . using (6. Therefore. Keeping y ﬁxed.9) (7. by the product rule and Wild Assumption 7. x (7. ∞) . consider the function x → bxy = (by ) . note ln a = ln ab−1 = ln (a) + ln b−1 = ln (a) − ln (b) . From (6. if b = 1 then bx = 1x for all x ∈ R. b x x x x x It remains to verify (7. expb (x + h) − expb (x) h→0 h lim bx+h − bx h→0 h bh − 1 x = lim b . an operation justiﬁed by Wild Assumption 7. Thus ln 1 = 0 as claimed. x Next consider (7. h 1x 1x = 1x+x = 12 x = 1x and so. exp1 (x) = 1 for all x and so exp1 (x) = ln 1 exp1 (x) = 0. Then.1. Also.10).6). To verify (7.3. Proof: First consider (7. This proves (7.5). ln (1) = lim ln 2h − ln 1 h ln 2 ln 2 = lim h = = 1.7).2) and the continuity of 2x . ln (ab) (ab) x = ((ab) ) = (ax bx ) = (ax ) bx + ax (bx ) = (ln a) ax bx + (ln b) bx ax x = [ln a + ln b] (ab) . h→0 2 − 1 h→0 2h − 1 ln 2 . Therefore. g (x) = y expb (xy) and so ln (by ) (by ) = y expb (xy) = y (ln b) (bxy ) = y (ln b) (by ) .3. h→0 h = lim −1 The expression.7) on Page 122. limh→0 b h is assumed to exist thanks to Wild Assumption 7.

Now the ﬁrst part of this lemma implies ln (x) = = ln y − ln x 1 ln = lim y→x y→x x y−x 1 1 ln (1) = . from (7. Therefore. ∞) onto (−∞. By Wild Assumption 7.11) and (7. this would imply 2x is a constant function which it is not.3 The Special Number. expe (x) ≡ (ex ) = ln (e) ex ≡ ln (e) expe (x) = expe (x) showing that expe has the remarkable property that it equals its own derivative. by Corollary 6. x = y and this shows ln is one to one as claimed.8. Suppose ln x = ln y.3. let y ∈ R. Therefore. Then choose n large enough that n > y and − n < y. 2n such that ln x = y. Corollary 6. It only remains to verify that ln maps (0. .3. This 7.12) ln 1 2 n ≤ −n n < y < < ln (2n ) . Then 2 2 from (7.7. ln achieves all values. 2 2 1 n 2 By the intermediate value theorem. Therefore. then (2x ) = (ln 2) 2x = 0 and by Corollary 6.8. (7. there exists x ∈ proves the theorem. e such that ln (e) = 1.3 21 − 1 = ln (2) 2y ≤ ln (2) 21 1 and it follows that 1 ≤ ln 2. Then by the mean value theorem. 1) such that 0< 21 − 1 = ln (2) 2y 1 Since 2y > 0 it follows ln (2) > 0 and so x → 2x is strictly increasing. e Since ln is one to one onto R.12) that ln achieves values which are arbitrarily large and arbitrarily large in the negative direction.3.1 and the mean value theorem.8. THE EXPONENTIAL AND LOG FUNCTIONS 145 ln 2 = 0 because if it were. 718 3. there exists t between x and y such that (1/t) (x − y) = ln x − ln y = 0. it follows there exists a unique number.3. .11) and (7.12) 1 ≤− .4.11) 2 Also. there exists y ∈ (0. This wonderful number is called Euler’s number and it can be shown to equal approximately 2.7) 0 = ln (2) + ln which shows that ln 1 2 1 2 ≥ 1 + ln 2 1 2 (7. ∞) . 2 It follows from (7. More precisely. by the intermediate value theorem. Therefore. x x lim y x y x − ln 1 −1 It remains to verify ln is one to one.

This proposition shows this new function is logb you may have studied in high school.3. logb by ≡ ln (by ) y ln b = =y ln b ln b 1 1 .5 Logarithm Functions Next a new function called logb will be deﬁned. bx and logb x is in the following proposition. Formula (7.5 Let b > 0 and b = 1.3.15) (7. Also.3.146 SOME IMPORTANT SPECIAL FUNCTIONS 7. ln b (7.14) holds.(7. However.8). 7. Proposition 7. The fundamental relationship between the exponential function. What is the derivative of this function? Corollary 7. x Proof: If x > 0 the formula is just (7. it is possible to write ln |x| whenever x = 0. What is the derivative of these functions? .3.10).3. Then f (x) = 1 .7) . Deﬁnition 7. Then for all x > 0.14) follows from (7. The functions.7) on Page 122. ln is only deﬁned on positive numbers.4 For all b > 0 and b = 1 logb (x) ≡ ln x .13) Notice this deﬁnition implies (7. ln b x (7. it is possible to write logb |x| whenever x = 0. Then ln |x| = ln (−x) so by (6. and for all y ∈ R. blogb x = x.15). since ln is one to one. However. Suppose then that x < 0.14) and this veriﬁes (7. logb (x) = Proof: Formula (7. logb by = y. ln blogb x = logb x ln b = ln x and so.16) is obvious from (7.13).16) (7. logb are only deﬁned on positive numbers.4 The Function ln |x| The function.3 Let f (x) = ln |x| for x = 0. −x x This proves the corollary. (ln |x|) = (ln (−x)) = (ln ((−1) x)) 1 1 = (−1) = . it follows (7.9) all hold with ln replaced with logb .

x = 1+ 2(81) = 162 + quadratic formula.3. Example 7. log3 1 x 9 = log3 1 9 + log3 (x) 1 9x (logb (−x)) = (logb ((−1) x)) 1 1 (−1) = . show bx = 1 for all x ∈ R. −x ln b x ln b . (Why?) 1 √ (x+3) 1 162 √ 973.7 Using properties of logarithms.9). These logarithms are called natural logarithms.7.7) . (ln b) x 147 Proof: If x > 0 the formula is just (7. (logb |x|) = = This proves the corollary.9 Solve log3 (x) + 2 = log9 (x + 3) . show that if b = 1. (b) log3 27x3 .3. That is. loge (x) ≡ Recall that 7. Then f (x) = 1 . solve 5x−1 = 32x+2 . Hint: If bh = 1 for h = 0. = log3 3−2 + log3 (x) = −2 + log3 (x) .8 Using properties of logarithms.4.10 Compare ln (x) and loge (x) ln x . it follows loge (x) = ln x. Prove the last part of the Wild Assumption follows from the ﬁrst part of this assumption. In the use of the Example 7. Example 7. only one solution was possible.3.4 Exercises 1. Therefore. 2.6 Let f (x) = logb |x| for x = 0. log3 From (7. simplify the expression. From the given equation.3. Take ln of both sides.(7. ln 3 Example 7. Simplify (a) log4 (16x) . Thus (x − 1) ln 5 = (2x + 2) ln 3.3. 3log3 (x)+2 = 3log9 (x+3) = 9 2 (log9 (x+3)) = 9log9 √ √ 1+12×81 1 and so 9x = x + 3. EXERCISES Corollary 7. ln e Since ln e = 1 from the deﬁnition of e. Then solving this for x yields ln 5+2 x = ln 5−2 ln 3 .16).7) on Page 122. Suppose then that x < 0. Then logb |x| = logb (−x) so by (6. then expb (h) = 1 if h = 0 follows from the ﬁrst part.

Exploit this and the assumed continuity of expb to obtain uniqueness. 4. simplify the expression. Hint: Recall the rational numbers were dense in R and so one can obtain a rational number arbitrarily close to a given real number. . Solve the equation 52x+9 = 7x in terms of logarithms. Show that f (x) ≥ f (a) for all x ≥ a. Solve log4 (x) + 3 = log2 (x + 8) . Find a relationship between xy and y x .) Show (f (|x|)) = g (x) . Prove the function. ∞) → R is onto. solve 3 + ln (−3x) = 4 + ln 3x2 . 11.148 SOME IMPORTANT SPECIAL FUNCTIONS (c) (logb a) (loga b) for a. Using properties of logarithms. log4 8. Suppose f is any function deﬁned on the positive real numbers and f (x) = g (x) where g is an odd function. 6. Explain why the function 6x (1 − x ln 6) is never larger than 1. 10. Using properties of logarithms. 9. Hint: Use Problem 13 and at some point consider the function h (x) = ln x . Then use the intermediate value theorem to ﬁll in the gaps. Hint: Use the mean value theorem. Let e be deﬁned in Problem ?? and suppose e < x < y. 5. Now show from (7. 14. 12. 7.8) that it assumes values which are large and negative and values which are large and positive. Hint: Consider f (x) = 6x (1 − x ln 6) and ﬁnd its maximum value. 1 64 x . x 15. b positive real numbers not equal to 1. The Wild Assumption gave the existence of a function. Hint: You know it is diﬀerentiable so it is continuous (Why?). 13. Let f be a diﬀerentiable function and suppose f (a) ≥ 0 and that f (x) ≥ 0 for x ≥ a. Solve log2 (x) + 3 = log2 (3x + 8) . bx is concave up and logb (x) is concave down. Show there can be no more than one such function. 3. bx satisfying certain properties. Using properties of logarithms and exponentials. Prove that logb : (0. (g (−x) = −g (x) . solve 4x−1 = 32x+2 .

Find f (x) .1 8. Theorem 8.2.2 on Page 121. g (f (x)) if f (x + h) − f (x) = 0 g(f (x+h))−g(f (x)) f (x+h)−f (x) lim g (f (x + h)) − g (f (x)) = g (f (x)) f (x) .9. ln (x4 + 1) x4 + 1 149 . From the chain rule. b) → (c.1. Now it is time to consider the theorem in full generality.6 on Page 100 and 6.1 Suppose f : (a. taking the limit and using Theorem 5. Therefore.1. h This proves the chain rule. h→0 if f (x + h) − f (x) = 0 .Properties And Applications Of Derivatives 8.9.6.4. Proof: Deﬁne H (h) ≡ Then for h = 0. Also suppose that f (x) exists and that g (f (x)) exists.2. f (x) = ln ln x4 + 1 ln x4 + 1 1 1 = x4 ln (x4 + 1) x4 + 1 1 1 = 4x3 . Example 8. d) → R.1.1 The Chain Rule And Derivatives Of Inverse Functions The Chain Rule The chain rule is one of the most important of diﬀerentiation rules. Special cases of it are in Theorem 6. d) and g : (c.2 Let f (x) = ln ln x4 + 1 . h h Note that limh→0 H (h) = g (f (x)) due to Theorems 5. Then (g ◦ f ) (x) exists and (g ◦ f ) (x) = g (f (x)) f (x) . g (f (x + h)) − g (f (x)) f (x + h) − f (x) = H (h) .

1. Near the point. [a. 0) . The procedure by which this is accomplished is nothing more than the chain rule and other rules of diﬀerentiation.3 Let f (x) = (2 + ln |x|) .1. 3y 2 y + 2xy + 2y = 3x2 + 2 y . Recall that f −1 is deﬁned according to the formula f −1 (f (x)) = x.1. 0) . then you can diﬀerentiate both sides with respect to x and write. The theorems which give this justiﬁcation are called the implicit and inverse function theorems. Thus near (1. 1) . The interested reader should consult the book by Rudin. This illustrates the technique of implicit diﬀerentiation. (1. using the chain rule and product rule. you have y = − 4 − x2 . This relation deﬁnes y as a function of x near a given point such as (0. √ √ Near this point. −1) . even though it is impossible to ﬁnd y (x) you can still ﬁnd the derivative of y. For example. and f (x) exists and is non zero then the inverse function. 3 +2xy−1 Of course there are signiﬁcant mathematical considerations which are being ignored when it is assumed y is a diﬀerentiable function of x. b] . Deﬁne f (a) ≡ lim f (x) − f (a) f (x) − f (b) . Here is an example in which. y = 4 − x2 .5 Let f : [a. One case is of special interest in which y = f (x) and it is desired to ﬁnd dx or in other words. (0. x→b− x−a x−b x→a+ . x 2 8. It turns out that for problems like this. Use the chain rule again.1. If you believe y is some diﬀerentiable function of x. Example 8.4 Suppose y is a diﬀerentiable function of x and y 3 + 2yx = x3 + 7 + ln |y| .150 PROPERTIES AND APPLICATIONS OF DERIVATIVES 3 Example 8. Find f (x) . f −1 has a derivative at the point f (x) . Thus f (x) = 3 (2 + ln |x|) (2 + ln |x|) 3 2 = (2 + ln |x|) . b] → R be a continuous function. dy It happens that if f is a diﬀerentiable one to one function deﬁned on an interval. x = 4 − y 2 . Find y (x) . you might have x2 + y 2 = 4. the derivative of the inverse function. They are some of the most profound theorems in mathematics and are topics for advanced calculus. y Now you can solve for y and obtain y = − 3y2y−3x y.2 Implicit Diﬀerentiation And Derivatives Of Inverse Functions Sometimes a function is not given explicitly in terms of a formula. you can’t use algebra to solve for one of the variables in terms of the others even if the relation deﬁnes that variable as a function of the others. the equation relating x and y actually does deﬁne y as a diﬀerentiable function of x near points where it makes sense to formally solve for y as just done. [14] for this and generalizations of all the hard theorems given in this book. Deﬁnition 8. This was a simple example but in general. f (b) ≡ lim . Near the point. you can’t solve for y in terms of x but you can solve for x in terms of y.

7 Let f : (a.1. Thus.7. f (x) Of course this gives the conclusion of the above theorem rather eﬀortlessly and it is formal manipulations like this which aid many of us in remembering formulas such as the one given in the theorem. b] and f (x1 ) = 0. Suppose f (x1 ) exists for some x1 ∈ (a. b) → R be continuous and one to one. b] → R be continuous and one to one. Then f −1 (f (x1 )) exists and is given by the formula. b] . Proof: By Lemma 5.4.7. Therefore there exists η > 0 such that if 0 < |f (x1 ) − f (x)| < η. f −1 (y) − f −1 (f (x1 )) 1 = y − f (x1 ) f (x1 ) y→f (x1 ) lim and this proves the theorem. 1 1 f −1 (f (x)) − f −1 (f (x1 )) x − x1 − − = <ε f (x) − f (x1 ) f (x1 ) f (x) − f (x1 ) f (x1 ) Therefore. 1 f −1 (f (x1 )) = f (x1 ) . Theorem 8. Then f −1 (f (x1 )) exists and is given by the formula. This is one of those theorems which is very easy to remember if you neglect the diﬃcult questions and simply focus on formal manipulations.1. f (x1 ) ≡ lim x→x1 f (x) − f (x1 ) x − x1 where it is understood that x is always in [a. b) and f (x1 ) = 0. . Consider the following. The following obvious corollary comes from the above by not bothering with end points.8. this deﬁnition includes the derivative of f at the endpoints of the interval and to save notation. x − x1 1 − < ε.6 on Page 95 f is either strictly increasing or strictly decreasing and f −1 is continuous. and then divide both sides by f (x) to obtain f −1 (f (x)) = 1 .1. f −1 (f (x)) = x. The notation x → b− is deﬁned similarly. f (x) − f (x1 ) f (x1 ) It follows that if 0 < |f (x1 ) − f (x)| < η. and Corollary 5. Now use the chain rule on both sides to write f −1 (f (x)) f (x) = 1. Suppose f (x1 ) exists for some x1 ∈ [a.6 Let f : [a. then 0 < |x1 − x| = f −1 (f (x1 )) − f −1 (f (x)) < δ where δ is small enough that for 0 < |x1 − x| < δ. THE CHAIN RULE AND DERIVATIVES OF INVERSE FUNCTIONS 151 Recall the notation x → a+ means that only x > a are considered in the deﬁnition of limit. 1 f −1 (f (x1 )) = f (x1 ) . Corollary 8. since ε > 0 is arbitrary.

As in the last example. f is one to one because it is strictly increasing due to the fact that its derivative is positive for all x. 1. This particular form is a very useful crutch and is used extensively in applications. 8. dy dy du = . By Corollary 6. I have no idea how to ﬁnd a formula for f −1 but I do see that f (a) = 0 and so f −1 (0) = f −1 (f (a)) = 1 = f (a) 1 1 + a4 + ln (1 + a2 ) . To do this. Thus y = g ◦ f (x) . Suppose y = g (u) and u = f (x) .5 on Page 133. dx du dx Notice how the du s cancel. this function is strictly increasing on R and so it has an inverse function although I have no idea how to ﬁnd an explicit formula for this inverse function. f (x) = 2x + 3x2 + 7 1 + x2 > −1 + 3x2 + 7 > 6 > 0.2 Exercises dy dx . I can see that f (0) = 7 and so by the formula for the derivative of an inverse function.8. f −1 (7) = f −1 (f (0)) = = 1 . However. Then from the above theorem (g ◦ f ) (x) = g (f (x)) f (x) = g (u) f (x) or in other words. In each of the following. Show that f has an inverse and ﬁnd f −1 (7) . Find f −1 (0) . I am not able to ﬁnd a formula for the inverse function. The question is to determine whether f has an inverse. ﬁnd (a) y = esin x (b) y = ln sin x2 + 7 (c) y = tan cos x2 . 7 1 + x4 + ln (1 + x2 ). The methods of algebra are insuﬃcient to solve hard problems in analysis.9 Suppose f (a) = 0 and f (x) = The function.1.8 Let f (x) = ln 1 + x2 + x3 + 7. This is typical in useful applications so you need to get used to this idea.152 PROPERTIES AND APPLICATIONS OF DERIVATIVES Example 8. 1 f (0) Example 8. The chain rule has a particularly attractive form in Leibniz’s notation.1. You need something more.

assume the relation deﬁnes y as a function of x for values of x and y of interest and use the process of implicit diﬀerentiation to ﬁnd y (x) . ∞) exists. Then by Corollary 8. This function is so important it is given a special symbol. Using Problem 3 show that the following functions are one to one and ﬁnd the derivative of the inverse function at the indicated point.5 +1 8.3. . (a) xy 2 + sin (y) = x3 + 1 (b) y 3 + x cos y 2 = x4 (c) y cos (x) = tan (y) cos x2 + 2 (d) x2 + y 2 (e) (f) xy 2 +y y 5 +x 6 = x3 y + 3 + cos (y) = 7 2 x2 + y 4 sin (y) = 3x (g) y 3 sin (x) + y 2 x2 = 2x y + ln |y| (h) y 2 sin (y) x + log3 (xy) = y 2 + 11 (i) sin x2 + y 2 + sec (xy) = ex+y + y2y + 2 (j) sin tan xy 2 + y 3 = 16 (k) cos (sec (tan (y))) + ln (5 + sin (xy)) = x2 y + 3 3. it also follows that ln−1 : R → (0. and for r a real number. exp .0 . By this theorem.2 on Page 143 says that for x > 0. In each of the following. e2 3 (b) y = x3 + 7x + 1 (c) y = tan x3 + 5 π 4 .3. Thus exp (ln x) = x. ln (exp (y)) = y.3 The Function xr For r A Real Number ln (xr ) = r ln (x) (8. Show that if D (g) ⊆ U ⊆ D (f ) . 4. then f ◦ g is also one to one.1) Theorem 7. THE FUNCTION X R FOR R A REAL NUMBER (d) y = log2 (sin (x) + 6) (e) y = sin log3 x2 + 1 (f) y = √ √ x3 +7 sin(x)+4 153 (g) y = 3tan(sin(x)) (h) y = x2 +2x tan(x2 +1) 6 2. and if f and g are both one to one.1.8.1 .1 π 4 π 2 (d) y = tan −x + (e) y = 2 5x+sin(x) . (a) y = ex 3 +1 .7 ln−1 is diﬀerentiable.

x x x (8.3. it follows exp (x) =1 exp (x) which gives the desired result. r f (x) r−2 = r |f (x)| f (x) f (x) .3.1 Logarithmic Diﬀerentiation x Example 8.3. Proof: From Theorem 7. the part about the validity of the laws of exponents.3 Suppose f (x) is a non zero diﬀerentiable function. note ln (exp x) = x and so using the chain rule and Corollary 8. With this understanding. Then exp (x) = ex . f (x) 1 + x2 .3. e such that ln e = 1. from the above.1) can be written as xr = exp (r ln x) Theorem 8. (|f (x)| ) = exp (r ln |f (x)|) (r ln |f (x)|) = |f (x)| r r r |f (x)| = exp (r ln |f (x)|) . Find the derivative of r |f (x)| . Example 8. One way to do this is to take ln of both sides and use the chain rule to diﬀerentiate both sides with respect to x. note that (8.3.154 PROPERTIES AND APPLICATIONS OF DERIVATIVES Proposition 8.3. x = x ln (e) = ln (ex ) = ln (expe (x)) and so.3) that (xr ) = rxr−1 as claimed. it follows ex = exp x as claimed. exp (x) = exp (x) .4 Let f (x) = 1 + x2 . since ln is one to one. From the Wild Assumption on Page 143. Find f (x) .2 For x > 0. (xr ) = exp (r ln x) r r r = exp (r ln x) = xr = rxr−1 .2 on Page 143 there exists a unique number. Also. f (x) 8.1. it follows exp (x) = expe (x) = ex . First. using the chain and product rules. Note that x = ln(exp (x)). Thus ln (f (x)) = x ln 1 + x2 and so. in particular.3) (8. f (x) 2x2 = + ln 1 + x2 . Therefore.2) using the chain rule. Now by this theorem again. taking the derivative of both sides. ln (ex ) = x ln e = x.1 There exists a unique number e such that ln e = 1. it becomes possible to ﬁnd derivatives of functions raised to arbitrary real powers.7. Proof: Diﬀerentiate both sides of (8. From (8. To establish the last claim. Furthermore. ln (exp x) = x and so since ln is one to one.2).2) and this shows from (8. (xr ) = rxr−1 .

Is the derivative of a function always continuous? Hint: Consider a diﬀerentiable function which is periodic of period 1 and non constant. the answer is 3 3x2 + cos x x3 + sin x − 1 4x3 + 2 .3. 4.5 Let f (x) = √x4 +2x . 1 1 ln (f (x)) = ln x3 + sin (x) − ln x4 + 2x . Find f (x) . Find f −1 (3) . Periodic of period 1 means f (x + 1) = f (x) for all x ∈ R. Let f (x) ≡ x3 + 7x + 3.4 Exercises 1.8. This process is called logarithmic diﬀerentiation. Find f −1 (y) . Now ﬁnd f −1 (1) . 8. f. 0 if x = 0 Show h (0) = 0. 6 You could use the quotient and chain rules but it is easier to use logarithmic diﬀerentiation. 3 6 Diﬀerentiating both sides. What is wrong with the following “proof” of the chain rule? Here g (f (x)) exists and f (x) exists. I think you can see the advantage of doing it this way over using the quotient rule. 1 f (x) = f (x) 3 Therefore. It follows from two rules which you cannot survive without. 2. Explain why f has an inverse. What is h (x) for x = 0? Is h also periodic of period 1? . = lim 5. g (f (x + h)) − g (f (x)) lim h→0 h g (f (x + h)) − g (f (x)) f (x + h) − f (x) h→0 f (x + h) − f (x) h = g (f (x)) f (x) . This shows you don’t need to remember the wretched quotient rule if you don’t want to. EXERCISES Then solve for f (x) to obtain f (x) = 1 + x2 x 155 2x2 + ln 1 + x2 1 + x2 .4. Derive the quotient rule from the product rule and the chain rule. √ 3 3 x +sin(x) Example 8. 6 x4 + 2x f (x) = x3 + sin (x) √ 6 x4 + 2x 1 3 3x2 + cos x x3 + sin x − 1 4x3 + 2 6 x4 + 2x . Now consider h (x) = 1 x2 f x if x = 0 . 3. Let f (x) ≡ x3 + 1.

π . arcsin (x) is deﬁned 2 2 to be the angle whose sine is x which lies in − π . Let f (x) = x3 + 1. (a) sin x2 ln x2 + 1 (b) ln 1 + x2 (c) x3 + 1 (d) ln 6 sin x2 + 7 6 x3 + 1 sin x2 + 7 √ 7 (e) tan sec sin x2 + 1 (f) sin2 x2 + 5 8. cos (y) ≥ 0.1. the derivative of x → arcsin (x) exists for all x ∈ [−1. Graphing the function y = sin x. Find the derivatives of the following functions. From Theorem 8. The formula in this theorem could be used to ﬁnd the derivative of arcsin but it is more useful to simply use that theorem to resolve the existence question and apply . π .156 PROPERTIES AND APPLICATIONS OF DERIVATIVES 6.5 -4 -2 0 2 4 -0. 1] . is one to one on the interval − π . In words. Use (8. Now the arcsin function is deﬁned as the inverse of the sin when its domain is restricted to − π . 7. is clear sin is not one to one on R and so it is not possible to deﬁne an inverse function. sin. 1 0. Also note that for y on this interval. (a) 2 + sin x2 + 6 (b) xx (c) (xx ) x cos x tan x (d) tan2 x4 + 4 + 1 (e) sin2 (x) tan x 8. functions. a little thing like this will not prevent the deﬁnition of useful inverse trig.5 The Inverse Trigonometric Functions It is desired to consider the inverse trigonometric functions.6 on Page 151 2 2 about the derivative of the inverse function. Find an explicit formula for f −1 and use it to compute f −1 (9) . π shown in the 2 2 above picture as the interval containing zero on which the function climbs from −1 to 1.2) or logarithmic diﬀerentiation to diﬀerentiate the following functions. Then use the formula given in the theorem of this section to see you get the same answer.5 -1 However. Observe the function.

You are aware that tan is periodic of period π because tan (x + π) = sin (x + π) sin (x) cos (π) + cos (x) sin (π) = cos (x + π) cos (x) cos (π) − sin (x) sin (π) − sin (x) = = tan (x) .4) Next consider the inverse tangent function. However. y = and so 1 1 1 = . tan (y) = x and by the chain rule. 1 − x2 The positive value of the square root is used because for y ∈ − π . π . Therefore. tan is one to one on − π . I will follow the way of doing it which is used in the book . π 2 2 and lim tan (x) = +∞.1.8. cos (y) y = 1 and so y = 1 = cos y 1 1 − sin2 (y) =√ 1 . lim tan (x) = −∞ π π x→ 2 − x→− 2 + as shown in the following graph of y = tan (x) in which the vertical lines represent vertical asymptotes. cos (y) ≥ 0. THE INVERSE TRIGONOMETRIC FUNCTIONS 157 the chain rule to ﬁnd the formula. sec2 (y) y = 1. −π 2 π 2 Therefore. Let y = arcsin (x) so sin (y) = x. 1 − x2 (8. By Theorem 8. (8.5) 1 + x2 The inverse secant function can be deﬁned similarly.5. letting y = arctan (x) . it is impossible to take the inverse of tan. ∞) is deﬁned according to the rule: arctan (x) is the angle whose tangent is x which is in − π . arctan (x) for x ∈ (−∞. = sec2 (y) 1 + x2 1 + tan2 (y) 1 = arctan (x) . Thus 2 2 √ 1 = arcsin (x) . π . − cos (x) Therefore. Now taking the derivative of both sides. There is no agreement on the best way to restrict the domain of sec. 2 2 Therefore. It is a mistake to memorize too many formulas.6 arctan has a derivative.

158 PROPERTIES AND APPLICATIONS OF DERIVATIVES by Salas and Hille [16] recognizing that there are good reasons for doing it other ways also. There is a vertical asymptote at 2 2 x = π . there is an interesting and useful formula involving arcsec (|x|) . then x = sec (y) < 0 and tan (y) ≤ 0 so 2 y = Thus either way. For x < 0. x2 − 1 As in the case of ln. this function equals arcsec (−x) and so by the chain rule. Now from the trig. x→ π + 2 lim sec (x) = −∞ 6 4 2 0 –2 –4 –6 1 2 3 As in the case of arcsin and arctan. x2 − 1 (8. its derivative equals 1 1 √ (−1) = √ 2−1 |−x| x x x2 − 1 . 1 1 √ y = √ = . both x = sec (y) and tan (y) 2 are nonnegative and so in this case. identity. 1 + tan2 (y) = sec2 (y) . y = 1 sec (y) tan (y) 1 √ = x ± x2 − 1 and it is necessary to consider what to do with ±. π ) ∪ ( π . y = This yields the formula |x| √ |x| √ x (−1) 1 √ x2 − 1 = 1 1 √ √ = . Let y = arcsec (x) so x = sec (y) and using the chain rule. 2 2 1 = sec (y) tan (y) y .6) 1 = arcsec (x) . If y ∈ [0. arcsec (x) is the angle whose secant is x which lies in [0. 2−1 x x |x| x2 − 1 If y ∈ ( π . π]. Thus 2 x→ π − 2 lim sec (x) = +∞. The graph of sec is represented below on [0. π ) ∪ ( π . π]. (−x) x2 − 1 |x| x2 − 1 1 . π ). π].

it is concave up for all x. THE HYPERBOLIC AND INVERSE HYPERBOLIC FUNCTIONS If x > 0. concave down for x < 0 and concave up for x > 0 because sinh (x) = sinh (x) while cosh is decreasing for x < 0 and increasing for x > 0. The reason these are called hyperbolic functions is that cosh2 t − sinh2 t = 1 (8. cosh (x) ≡ . I imagine you can guess what the third is called. sinh t) is a point on the hyperbola whose equation is x2 − y 2 = 1. 1 .8. you see that sinh (0) = 0. Using the chain rule. If you guessed “hyperbolic tangent” you got it right. you can use the graph of the function x → exp (x) = ex to verify that the graph of tanh (x) is as given below .7) 8. Also. this function equals arcsec (x) and its derivative equals 1 1 = √ 2−1 |x| x x x2 − 1 √ and so either way. The other hyperbolic functions are deﬁned by analogy to the circular functions. cosh (x) = sinh (x) . cosh (x) The hyperbolic functions are given by sinh (x) ≡ and The ﬁrst of these is called the hyperbolic sine and the second the hyperbolic cosine. sinh (x) = cosh x. 2 2 tanh (x) ≡ sinh (x) . Therefore.6 The Hyperbolic And Inverse Hyperbolic Functions ex + e−x ex − e−x .6. but cosh (x) > 0 for all x. sinh is an increasing function. This is not important but is the source for the term hyperbolic. (cosh t. cosh (0) = 1 and that sinh (x) < 0 if x < 0 while sinh (x) > 0 for x > 0. Thus the graphs of these functions are as follows. (arcsec (|x|)) = √ x x2 − 1 159 (8. y = sinh(x) y = cosh(x) 4 4 2 3 –2 –1 0 –2 1 2 2 1 –4 –2 –1 0 1 2 Also. Since cosh (x) = cosh (x) .8) and so the point.

computers began to be used and currently π is “known” to millions of decimal places.9) 1 + x2 The derivative of the hyperbolic tangent is also easy to ﬁnd. π = 4 arctan 4 1 5 − arctan 1 239 . Simplify arcsin x + arccos x.160 PROPERTIES AND APPLICATIONS OF DERIVATIVES 1 –3 –2 –1 0 –1 1 2 3 Since x → sinh (x) is strictly increasing. Verify (8.8) cosh (y) = √ 1 + sinh2 (y) = 1 + x2 . A wonderful identity which was used to compute π for over 200 years1 is the following. y =√ which gives the formula 1 1 + x2 1 = sinh−1 (x) . 1 John Machin computed π to 100 decimal places in 1706 through the use of this identity. Therefore. 4. Later in 1873 William Shanks did it to over 700 places using this identity. 808 decimal places. If y = sinh−1 (x) . . From the identity (8. (8. After this.10). sinh−1 (x) . it has an inverse function.10) √ Another notation for the inverse hyperbolic functions which is sometimes used is arcsinh or arccosh or arctanh. Many other schemes have been used besides this identity for computing π. Find derivatives of the following functions. then sinh (y) = x and so using the chain rule and the theorem about the existence of the derivative of the inverse function.7 Exercises 1. 8. y cosh (y) = 1. 2. (8. This yields after a short computation (tanh x) = 1 − tanh2 x. The next advance was in 1948. (a) sinh−1 x2 + 7 (b) (c) (d) (e) tanh−1 (x) sin sinh−1 x2 + 2 sin (tanh (x)) x2 sinh (sin (cos (x))) √ 6 sin x (f) cosh x3 (g) 1 + x4 3.

By the Cauchy mean value theorem. y) such that g (t) (f (x) − f (y)) = f (t) (g (x) − g (y)) . there exists t ∈ (x. o The rule was actually due to Bernoulli who had been L’Hˆpital’s teacher.1 Let [a. Theorem 8. y such that c < x < y < b.12) x→b− Then x→b− lim (8. L’Hˆpital did not claim the rule o o as his own but Bernoulli accused him of plagarism. g (x) f (x) = L.8. 2 L’Hˆpital published the ﬁrst calculus book in 1696. b).2 on Page 133.8 L’Hˆpital’s Rule o There is an interesting rule which is often useful for evaluating diﬃcult limits called L’Hˆpital’s2 o rule. Nevertheless.8. x→b− (8. 9. It is possible to start with the arctan function and obtain all the other trig functions in terms of this one. If you knew the function.11) and f and g exist on (a. Show arcsin x = arctan √1−x2 . L’HOPITAL’S RULE 161 Establish this identity by taking the tangent of both sides and using an appropriate formula for the tangent of the diﬀerence of two angles. The best versions of this rule are based on the Cauchy Mean value theorem. 8. This is interesting because there is a simple way to deﬁne arctan directly as a function of a real variable[9]. This rule.8. appeared in this book. Approaches like these avoid all reference to plane geometry. Prove coth2 x − 1 = csch2 x. Use De Moivre’s theorem to get some help in ﬁnding a formula for tan (4θ) . then ε f (t) −L < . g (t) 2 Now pick x. Find a formula for sinh−1 in terms of ln . Prove 1 − tanh2 x = sech2 x. this rule has become known as L’Hˆpital’s o rule ever since. 8. ∞] and suppose f.ˆ 8. Suppose also that lim f (x) = L. 5. 7. 6. x→b− lim f (x) = lim g (x) = 0. Theorem 6. Find a formula for tanh−1 in terms of ln. named after him. The version of the rule presented here is superior to what was discovered by Bernoulli and depends on the Cauchy mean value theorem which was found over 100 years after the time of L’Hˆpital. What about cosh−1 ? Deﬁne it by restricting the domain of cosh to be nonnegative numbers? What is cosh−1 in terms of ln? x 10. b) with g (x) = 0 on (a. g are functions which satisfy. g (x) (8.13) Proof: By the deﬁnition of limit and (8.12) there exists c < b such that if t > c. b] ⊆ [−∞. arctan explain how to deﬁne sin and cos. o .

But from Lemma 7. Therefore. limx→0 − 6x x = −1 .16) Here is a simple example which illustrates the use of this rule. o 6 6 o Warning 8. since t > c. From L’Hˆpital’s rule. x→0 7 sec2 (7x) tan 7x 7 Sometimes you have to use L’Hˆpital’s rule more than once. limx→0 sinx3 = −1 . .8. if limx→0 cos x−1 exists and equals L. the derivative of the denominator is nonzero for x close to 0.2 Let [a. b] ⊆ [−∞. f (x) ε − L ≤ < ε.3 Find limx→0 5x+sin 3x tan 7x . g (x) (8. b). Therefore.1. The conditions of L’Hˆpital’s rule are satisﬁed because the numerator and denominao tor both converge to 0 and the derivative of the denominator is nonzero for x close to 0.4 Find limx→0 sin x−x . o Example 8. 3x2 it will follow from L’Hˆpital’s rule that the original limit exists and equals L. The following corollary is proved in the same way.8. Example 8. o limx→0 (cos x − 1) = 0 and limx→0 3x2 = 0 so L’Hˆpital’s rule can be applied again to cono sin sider limx→0 − 6x x .15) x→a+ Then x→a+ lim (8. g (x) f (x) = L. Thus. x3 Note that limx→0 (sin x − x) = 0 and limx→0 x3 = 0.13).14) and f and g exist on (a. by L’Hˆpital’s rule. b) with g (x) = 0 on (a. Therefore. f (t) f (x) − f (y) = g (t) g (x) − g (y) and so. x→0 lim 5x + sin 3x 5 + 3 cos 3x 8 = lim = . if this limit exists and equals L. g (x) 2 Since ε > 0 is arbitrary. ∞] and suppose f. However. Therefore.1 on 3x2 sin x−x Page 139. x→a+ lim f (x) = lim g (x) = 0. Suppose also that lim f (x) = L. f (x) − f (y) ε −L < . g are functions which satisfy. x→a+ (8.8. it will follow o x−x that limx→0 cos x−1 = L and consequently limx→0 sinx3 = L. if the limit of the quotient of the derivatives exists. this shows (8.5 Be sure to check the assumptions of L’Hˆpital’s rule before using it.162 PROPERTIES AND APPLICATIONS OF DERIVATIVES Since g (s) = 0 for all s ∈ (a. Corollary 8. Also.8. b) it follows g (x) − g (y) = 0. it will equal the limit of the original function. g (x) − g (y) 2 Now letting y → b−.

In fact there is no limit o unless you deﬁne the limit to equal +∞. Since g (s) = 0 on (a. If the limit is taken as x gets close to 0.001 in to the function. Therefore. x→b− lim f (x) = ±∞ and lim g (x) = ±∞. Therefore. g are functions which satisfy. ∞] and suppose f. this should work although you have no way of knowing how small you need to take x to get a good estimate of the limit. f (t) f (x) − f (y) = g (t) g (x) − g (y) and so. y) such that g (t) (f (x) − f (y)) = f (t) (g (x) − g (y)) . b] ⊆ [−∞. Example 8. Theoretically. an expression whose limit as 1 x → 0+ equals 0. these people think one can ﬁnd the limit by evaluating the function at values of x which are closer and closer to 0. f (x) − f (y) ε −L < . g (t) 2 Now pick x. Some people get the unfortunate idea that one can ﬁnd limits by doing experiments with a calculator.8.19) Proof: By the deﬁnition of limit and (8.6 Find limx→0+ The numerator becomes close to 1 and the denominator gets close to 0.8 Let [a. This illustrates the folly of trying to compute limits through calculator or computer experiments. 1 ln|1+y| y = limy→0 1 ( 1+y ) Theorem 8.001. since t > c. This is a good illustration of the above warning. it will give the answer. You should plug . 0 and will keep on returning the answer of 0 for smaller numbers than .18) there exists c < b such that if t > c. b) . Example 8. the assumptions of L’Hˆpital’s rule do not hold and so it does not apply. then f (t) ε −L < .8. g (x) (8. Take the derivative of the numerator and the denominator which yields −2 sin 2x .17) and f and g exist on (a. Now lets try to use the conclusion of L’Hˆpital’s o rule even though the conditions for using this rule are not veriﬁed. b) with g (x) = 0 on (a. x→b− g (x) lim (8. x→b− (8. By the Cauchy mean value theorem. Suppose also lim f (x) = L. b). g (x) − g (y) 2 . x10 = 1 where L’Hˆpital’s rule has been o ln|1+x10 | used.ˆ 8. there exists t ∈ (x. and x10 see what your calculator or computer gives you.18) x→b− Then f (x) = L. it follows g (x) − g (y) = 0.8. In practice. There is another form of L’Hˆpital’s rule in which limx→b− f (x) = ±∞ and limx→b− g (x) = o ±∞.7 Find limx→0 This limit equals limy→0 ln|1+x10 | . If it is like mine. the procedure may fail miserably. y such that c < x < y < b.8. This is an amusing example. L’HOPITAL’S RULE 163 cos 2x x .

−1 −1 ε + 2 g(x) g(y) f (x) f (y) −1 −1 <ε due to the assumption (8. g (x) (8.22) Theorems 8.8.8.164 Now this implies PROPERTIES AND APPLICATIONS OF DERIVATIVES f (y) g (y) f (x) f (y) g(x) g(y) −1 −1 −L < g(x) g(y) ε 2 where for all y large enough. Continuing g(x) g(y) f (x) f (y) −1 −1 ε < 2 −1 −1 . Theorem 8. Don’t ever try to assign meaning to such symbols. b).2.17) which implies lim g(x) g(y) f (x) f (y) −1 −1 y→b− = 1. Therefore.9 will be referred to as L’Hˆpital’s o rule from now on.9 deal with indeterminate forms which are of the form ±∞ . Please do not think any meaning is being assigned to the nonsense 0 expression 0 .8.21) lim f (x) = L.9 Let [a. b] ⊆ [−∞. . for y large enough.8.2 and 8. There are other indeterminate forms which can be reduced to these forms just discussed.19). x→a+ g (x) lim Then x→a+ (8. It is just a symbol to help remember the sort of thing described by Theorem 0 8. Theorem 8.8. whenever y is large enough. there is no essential diﬀerence between the proof in the case where x → b− and the proof when x → a+. This observation is stated as the next corollary. As before. x→a+ lim f (x) = ±∞ and lim g (x) = ±∞. b) with g (x) = 0 on (a. both f (x) − 1 and f (y) to rewrite the above inequality yields f (y) −L g (y) Therefore.1 8.8.1 and Corollary 8.8.8 and Corollary 8.2 involve the notion of indeterminate forms of the form 0 . This proves the theorem. Suppose also that f (x) = L.8.1 and Corollary 8. Corollary 8. g are functions which satisfy.8. Again. f (y) −L <ε g (y) and this is what is meant by (8. this is just a symbol which is helpful in remembering the ∞ sort of thing being considered. ∞] and suppose f. f (y) −L ≤ L−L g (y) g(x) g(y) f (x) f (y) g(x) g(y) f (x) f (y) − 1 are not equal to zero.8. x→a+ (8.8 and Corollaries 8.20) and f and g exist on (a.8.

10 that limy→∞ 1 + x y written as r P 1+ n y n t . it follows x/y → 0 and so 1 + x → 1. If the payment period is one month. It is an indeterminate form which can be described as 1∞ . Now you have 100 (1 + r) in the bank.ˆ 8. if x > 0. This is what is meant by compounding continuously. 1+ Now using L’Hˆpital’s rule. This becomes the new principal. When a bank says they oﬀer 6% compounded monthly. Now 1 raised to anything is 1 and so it would y seem this limit should equal 1. 1 + x > 1 and a number raised y to higher and higher powers should approach ∞. One might think that as y → ∞. The expression in (8.1 Interest Compounded Continuously Suppose you put money in the bank and it accrues interest at the rate of r per payment period. In this the second term is the interest and the ﬁrst is called the principal. 8. On the other hand. It is good to ﬁrst see why this is called an indeterminate form. and you started with $100 then the amount at the end of one month would equal 100 (1 + r) = 100 + 100r.10 Find limy→∞ 1 + .8. Consider the problem of a rate of r per year and compounding the interest n times a year and letting n increase without bound. These terms need a little explanation. y→∞ x y y = exp y ln 1 + x y .8.23) can be Recall from Example 8.06/12.8. L’HOPITAL’S RULE 165 x y y Example 8. By deﬁnition. this means r. From the above the amount in the account after t years is P 1+ r n nt n 2 (8. It really isn’t clear what this limit should be. the rate per payment period equals .23) = ex . the amount you would have at the end of n months is 100 (1 + r) . How much will you have at the end of the second month? By analogy to what was just done it would equal 100 (1 + r) + 100 (1 + r) r = 100 (1 + r) .8. In general. o lim y ln 1 + x y = = = Therefore. y→∞ y→∞ lim ln 1 + 1/y 1 1+(x/y) x y y→∞ (−1/y 2 ) x lim =x y→∞ 1 + (x/y) x y =x lim −x/y 2 lim y ln 1 + Since exp is continuous. it follows y→∞ lim 1+ x y y = lim exp y ln 1 + y→∞ x y = ex . The interest rate per payment period is then r/n and the number of payment periods after time t years is approximately tn.

in 4 years.166 PROPERTIES AND APPLICATIONS OF DERIVATIVES and so. How much will you have at the end of 4 years? From the above discussion. Find the following limits. Thus.06)4 = 127.9 Exercises (a) limx→0 (b) 3x−4 sin 3x tan 3x x−(π/2) π limx→ 2 − (tan x) arctan(4x−4) arcsin(4x−4) arctan 3x−3x x3 9sec x−1 −1 3sec x−1 −1 1. Find the limits.8.01)x 1/x2 ) (s) limx→0 (cos 4x)( 2. 2 sin x −sin(3x) 5 . (c) limx→1 (d) limx→0 (e) limx→0+ (f) limx→0 (g) (i) 3x+sin 4x tan 2x ln(sin x) limx→π/2 x−(π/2) cosh 2x−1 x2 − arctan x+x limx→0 x3 1 x8 sin x sin 3x 2 (h) limx→0 (j) limx→0 (l) limx→0 (m) (k) limx→∞ (1 + 5x ) x −2x+3 sin x x ln(cos(x−1)) limx→1 (x−1)2 1 (n) limx→0+ sin x x (o) limx→0 (csc 5x − cot 5x) (p) limx→0+ 3sin x −1 2sin x −1 x2 (q) limx→0+ (4x) (r) limx→∞ x10 (1. you get P ert = A. this would be 100e(. you would gain interest of about $27. This shows how to compound interest continuously. (a) limx→0+ (b) limx→0 √ 1− cos 2x √ . 12. taking the limit as n → ∞. sin4 (4 x) 2 2x −25x . 8.11 Suppose you have $100 and you put it in a savings account which pays 6% compounded continuously. Example 8.

.10. 9 .1 A cube of ice is melting such that dV = −4cm3 / sec where V is the dt volume. Find the following limits. The related rates problem asks for how fast the remaining variable is changing. x2 167 3x+2 5x−9 3x+2 5x−9 . (a) limx→0+ (1 + 3x) (b) limx→0 (c) limx→0 (d) limx→0 (e) limx→0 (f) cot 2x tan(sin x)−sin(tan x) x7 sin(x2 )−sin2 (x) x4 e − 1/x2 ( ) x 1 x − cot (x) limx→0 cos(sin2x)−1 x 1/2 (g) limx→∞ x2 4x4 + 7 (h) limx→0 (i) limx→0 cos(x)−cos(4x) tan(x2 ) arctan(3x) x − 2x4 (j) limx→∞ x9 + 5x6 1/3 − x3 8.10 Related Rates Sometimes some variables are related by a formula and it is known how fast all are changing but one. 3. √ 2 x +13n (i) limn→∞ cos π 4n n . How fast are the sides changing when they are equal to 5 centimeters in length? . . (l) limx→∞ (m) limx→∞ (n) limx→0+ (o) 5x2 +7 2x2 −11 5x2 +7 2x2 −11 x 1−x . Example 8.8. x ln x 1−x √ 2 ln e2x +7 x √ sinh( x) √ √ 7 x− 5 x √ limx→0+ √x− 11 x . 1/x 2x (f) limn→∞ cos √n 2x (g) limn→∞ cos √5n n n (h) limx→3 x −27 x−3 .10. . . RELATED RATES (c) limn→∞ n (d) limx→∞ (e) limx→∞ √ n 7−1 . √ √ (j) limx→∞ 3 x3 + 7x2 − x2 − 11x . √ √ (k) limx→∞ 5 x5 + 7x4 − 3 x3 − 11x2 .

If dy dt = 2. l2 = x2 + y 2 . 8. How fast is the distance between the two cars changing when this distance equals ﬁve miles and the car heading north is at a distance of three miles from the intersection? Let l denote the distance between the cars. Note there is no way of knowing the volume or the sides as a functions of t.10. dx dt at the point (2.2 One car travels north at 70 miles per hour and the other travels east at 60 miles per hour toward an intersection. How fast is the distance between the two cars changing when this distance equals ﬁve miles and the car heading north is at a distance of three miles from the intersection? 2. Imagine such a triangle in which the two legs have length 8 inches and denote the included angle by θ and the area by √ A. it follows that x = 4. Example 8. and will have a very sore neck when he wakes up the next morning. y) moves over the ellipse. He watches the ball go back and forth. How fast is the angle between his line of sight and the line determined by the net changing when the ball crosses over the net at a point 12 feet from the end assuming the ball travels at a speed of 60 miles per hour? . One car travels north at 70 miles per hour toward an intersection and the other travels east at 60 miles per hour away from the intersection. A point having coordinates (x. 3). A spectator at a tennis tournament sits 10 feet from the end of the net and on the line determined by the net. Suppose dA = 3 square inches per minute. Suppose each side of the base is changing at the rate of −3 inches per second.168 PROPERTIES AND APPLICATIONS OF DERIVATIVES The volume is V = x3 where x is the length of a side of the cube. Therefore. Therefore. ﬁnd 5. A trash compactor compacts some trash which is in the shape of a box having a square base and a height equal to twice the length of a side of the base. Therefore.11 Exercises 1. How fast is θ changing when θ = π/6 dt radians? 4. When y = 3. x2 4 + y2 9 = 1. the chain rule implies dV dx −4 = = 3x2 dt dt and the problem is to ﬁnd dx dt when x = 5. How fast is the volume changing when the side of the base equals 10 inches? 3. at the instant described 2ll = 2xx + 2yy 10l = 8 (−60) + 6 (−70) and so l = −90 miles per hour at this instant. An isosceles triangle has two sides of equal length. Thus if x is the distance from the intersection of the car traveling east and y is the distance from the intersection of the car traveling north. dx −4 −4 = = cm/second dt 3 (25) 75 at this time.

) Also. At the instant described. It is also known that the rate at which the grain falls oﬀ the conveyor belt is 100π cubic feet per minute. 11. A hemispherical dish of radius 5 inches is sitting on a table. When the radius of the cone is 10 feet what is the height of the cone? 10. l2 = (x − z) + y 2 . then dV = A (y) where A (y) is the surface area of the top surface of the water at this dy height. A revolving search light at a prison makes one revolution per minute. A mother cheetah attempts to ﬁx dinner. for her hungry children. Soup is being poured in at the constant rate of 4 cubic inches per second.) 1 9.8. area times height. It is observed that r (t) = . . How fast is the surface area changing when the volume of the ball equals 20π cubic inches? 7. How fast is the distance between her and dinner decreasing when she is located at the point (0. A vase of water is sitting on a table. (To see this is very reasonable. It will be shown later that if V (y) is the total volume of the vase up to height y.5 feet per minute and h (t) = . Thus dV = −kA (y). Then if l is the desired 2 distance. The surface area of a sphere of radius r equals 4πr2 and the volume of the ball of radius r equals (4/3) πr3 . 0) denote the coordinates of the gazelle. a Thompson gazelle. She moves at 100 feet per second while dinner travels at 80 feet per second. 40) feet and dinner is moving in the direction of the positive x axis at the point (30. assuming the cheetah moves toward the gazelle at all times. y /x = −4/3 (why?) and also (x ) + (y ) = 100. the rate at which the water evaporates is proportional to the surface area of the exposed water. A balloon in the shape of a ball is being inﬂated at the rate of 6 cubic inches per minute. EXERCISES 169 6. note that a little chunk of volume of the vase between heights y and y + dy would be dV = A (y) dy.3 feet per minute. How fast is the end of his shadow moving when he is at a distance of 10 feet from the base of the light pole. 12. How fast is the light travelling along the nearest point on a wall 1/4 mile away? Give your answer in miles per hour. Show dy is a constant even though the surface area of dt dt the exposed water is constantly changing in a typical vase. Grain comes oﬀ a conveyor belt and falls to the ground making a right circular cone. y) denote the coordinates of the cheetah and let (z. How fast is the end of the shadow moving when he is 5 feet from the pole? (Assume he does not walk normally but instead oozes along like a giant amoeba so that his head is always exactly 6 feet above the ground. The volume of a right circular cone is 3 πr2 h. How fast is the level of soup rising when the radius of the top surface of the soup equals 3 inches? The volume of soup 3 at depth y will be shown later to equal V (y) = π 5y 2 − y3 . 2 2 8.11. A six foot high man walks at a speed of 5 feet per second away from a light post which is 12 feet high that has the light right on the top. 0) feet? Is your answer a little surprising? Hint: Let (x.

The minimum or maximum could occur at either end point or it could occur at a point in the open interval. dt 8. 2]. . and the smallest must be the minimum value of f on the interval. Suppose f is continuous on [a. [a. x. If it occurs at a point of the open interval.1 Find the maximum and minimum values of the function. A disposable cup is made in the shape of a right circular cone with height 5 inches and radius 2 inches. b]. if the function is diﬀerentiable and if a maximum or minimum exists. A painter is on top of a 13 foot ladder which leans against a house. V and their derivatives. and if f (x0 ) exists. Example 8. The largest must be the maximum value of f on the interval. in (a.2 on Page 126 f (x0 ) = 0. f . (a. The two equal sides of an isosceles triangle have length x inches and the third leg has length y inches. it can still be found by looking at the points where the derivative equals zero. Then consider these points along with the end points of the interval. say at x0 . A rope fastened to the bow of a row boat has the other end wound around a windlass which is 4 feet above the level of the bow of the boat. Typically. b] .5. then from Theorem 6. Find a formula for dV in terms dt of k. x. Sometimes there are no end points. f (x) = x3 − 3x + 1 on the interval [−2. b) where f (x) does not exist. b) where f (x) = 0 and all points. 3 18. ﬁnd dθ when x = 5 inches. How fast is the rope unwinding when the distance between the bow of the boat and the windlass is 10 feet? 15.12. P is the pressure.170 PROPERTIES AND APPLICATIONS OF DERIVATIVES 13. in (a. How fast is the water level rising when the water in the cone is three inches deep? The volume of a cone is 1 πr2 h. the following simple procedure can be used to locate the maximum or minimum of a function. For θ the angle between the two equal sides. Therefore. b). b].10 on Page 96 which ensure a maximum or minimum of a function exists but in this section the goal is to give ways to ﬁnd the maximum or minimum values of a function. How fast is the painter descending when the base of the ladder is at a distance of 5 feet from the house? 14. (a. A certain volume of an ideal gas satisﬁes P V = kT where T is the absolute temperature. V is the volume and k is a constant which depends on the amount of the gas and the sort of gas in the sample.7. Evaluate f at the end points and at these points where the derivative is zero or does not exist. The base of the ladder is moving away from the house at the rate of 2 feet per second causing the top of the ladder to move down the house.12 The Derivative And Optimization There are existence theorems such as Theorem 5. A kite 100 feet above the ground is being blown away from the person holding its string in a direction parallel to the ground at the rate of 10 feet per second. this involves checking only ﬁnitely many points. The current pulls the boat away at the rate of 2 feet per second. However. Suppose dx = 2 inches per minute and that the length of the other dt side changes in such a way that the area of the triangle is always 10 square inches. [a. P. In this case. Water ﬂows in to this conical cup at the rate of 4 cubic inches per minute. At what rate must the string be let out when the length of string already let out is 200 feet? 16. you do not necessarily know a maximum value or a minimum value of a continuous function even exists. Find all points. b) . 17.

5. You should verify that this function fails to have a derivative at the point x = 1.12. 2.5 1 1. x = 0. It follows the function achieves its maximum at the end point.5 2 Example 8.5 0. −2.5. f (1) = −1. THE DERIVATIVE AND OPTIMIZATION 171 The points where f (x) = 3x2 − 3 = 0 are x = 1 or −1. Therefore. 1 and the point where the derivative equals zero.8. f (0) = 1. 1) the function equals 1 − x2 and so its derivative equals zero at x = 0. x = 2 and its minimum at the point. 2. the points to look at are the end points. −.12. The following is a graph of this function. evaluate the function at −1. the function equals x2 − 1 and so there is no point larger than 1 where the derivative equals zero. The graph of this function is given below. 6 4 2 -2 -1 0 -2 -4 1 2 Example 8. Therefore. The following is a graph of this function. Therefore. Now f (−.2 Find the maximum and minimum values of the function f (x) = x2 − 1 on the interval [−. and f (2) = 3. 2]. There are no points where the derivative does not exist. " . f (1) = 0. the maximum value of the function occurs at the point −1 and 2 and has the value of 3 while the minimum value of the function occurs at −1 and −2 and equals −1. ∞). For x > 1.3 Find the minimum value of the function f (x) = x + 1 x for x ∈ (0. x = 1 where the derivative fails to exist.12. Thus f (−1) = 3. and 1. -0. and f (2) = 3. the points where the derivative fails to exist. f (−2) = −1.5. For x ∈ (−. 75.5) = .

x = −1 is of no interest because it is not greater than zero. it seems there should exist a minimum value at the bottom of the graph. Example 8. This is a little messy but ﬁnally f (x) = −64 + x3 x2 (x2 + 64) 1 and the value where this equals zero is x = 4. ∞) equals f (1) = 2 as suggested by the graph. (Why?) Also the function is diﬀerentiable and so it suﬃces to consider only points where the derivative equals zero. 42 . Therefore.172 PROPERTIES AND APPLICATIONS OF DERIVATIVES 5 4 3 2 1 0 1 2 3 4 5 From the graph. yt t wall t t t t d d x t d d warehouse In this picture. To ﬁnd it.12. the slanted line represents the ladder and x and y are as shown.4 An eight foot high wall stands one foot from a warehouse. Therefore. The solution to the equation. From the Pythagorean theorem the length √ of the ladder equals 1 + y 2 + x2 + 64. the function of a single variable. A diagram of this situation is the following picture. to minimize is f (x) = 1+ 64 + x2 x2 + 64. What is the length of the shortest ladder which extends from the ground to the warehouse. Thus 1 1− 2 =0 x and so x = 1. y/1 = 8/x. By similar triangles. x. the minimum value of the function on the interval. xy = 8. (0. Now using the relation between x and y. Clearly x > 0 so there are no endpoints to worry about. take the derivative of f and set it equal to zero and then solve for the value of x. It follows the shortest ladder is of length √ √ 1 + 64 + 42 + 64 = 5 5 feet.

A rectangle is inscribed in a circle of radius r. Find the rectangle of this sort which has the largest possible area. What is the relation between θ1 and θ2 ? The time it takes for the light to go from (x1 .5 Fermat’s principle says that light travels on a path which will minimize the total time. the total time is 2 2 2 (x2 − x1 − x) + y2 x2 + y1 T = + c1 c2 Thus T is minimized if dT d = dx dx = x c1 x2 + 2 y1 x2 + c1 2 y1 + 2 (x2 − x1 − x) + y2 2 c2 (x2 − x1 − x) − c2 2 (x2 − x1 − x) + y2 2 = sin (θ1 ) sin (θ2 ) − =0 c1 c2 at this point. . EXERCISES 173 Example 8. 2. (x. y) is on the ellipse. y) as one end of a diagonal and (0. 0) to (x2 . 8. y2 ) is (x2 − x1 − x) + y2 /c2 . Find the minimum cost for such a can. The ordered pair. (x1 . y1 ) to the point (xp . 3. The top and bottom of the can are constructed of a material costing one cent per square inch and the sides are constructed of a material costing 2 cents per square inch. y1 ) to (x2 . 0) at the other end. This is called Snell’s law. this yields the desired relation between θ1 and θ2 . The picture indicates a situation in which c1 > c2 . Find the formula for the rectangle of this sort which has the largest possible area. Form the rectangle which has (x. Therefore.13 Exercises 1. 4. 0) θ2d Speed equals c2 d d d d• (x2 . A cylindrical can is to be constructed to hold 30 cubic inches. Therefore. 0) equals 2 2 x2 + y1 /c1 2 and the time it takes to go from (xp . x2 + 4y 2 = 4. Find the numbers if their product is to be as large as possible. The angle θ1 is called the angle of incidence while the angle. Two positive numbers sum to 8. Consider the following picture of light passing from (x1 .8.12.13. y2 ) Then deﬁne by x the quantity xp − x1 . θ2 is called the angle of refraction. y2 ) as shown. y1 )• θ1 Speed equals c1 d (xp .

Find the path which will minimize the time it takes for him to get to the cabin. A rectangular beam of height h and width w is to be sawed from a circular log of radius 1 foot. Oil needs to ﬂow from a mooring place for oil tankers to this reﬁnery. A hiker begins to walk to a cabin in a dense forest. f (x) = 2+sin x for x ∈ (0. ﬁnd the one with smallest area. A feeding trough is to be made from a rectangular piece of metal which is 3 feet wide and 12 feet long by folding up two rectangular pieces of dimension one foot by 12 feet. 10. He walks 5 miles per hour on the road but only 3 miles per hour in the woods. The volume of the right circular cone is (1/3) πr2 h and √ the area of the cone is πr h2 + r2 . x2 + 4y 2 = 4 which is in the ﬁrst quadrant. 15. The area of the triangle formed by the y axis. Here k is some constant which depends on the type of wood used. An open box is to be made by cutting out little squares at the corners of a rectangular piece of cardboard which is 20 inches wide and 40 inches long. Describe how you would ﬁnd the maximum value of the function. 6. It is desired to carry a ladder horizontally around the corner. Find maximum and minimum values if they exist for the function. Describe the most economical route for a pipeline from the mooring place to the reﬁnery. Find the dimensions of the right circular cone which has the smallest area given the volume is 30π cubic inches. Suppose the mooring place is two miles oﬀ shore from a point on the shore 8 miles away from the reﬁnery and that it costs ﬁve times as much to lay pipe under water than above the ground. Hint: You might want to use a calculator to graph this and get an idea what is going on. what is the largest area that he can enclose. 16. 9. Out of all possible triangles formed in this way. Two hallways. one of length x and the other of length 10 − x. A wire of length 10 inches is cut into two pieces. ln x x for ln x 7. and then folding up the rectangular tabs which result. A point is picked on the ellipse. one 5 feet wide and the other 6 feet wide meet. f (x) = x > 0. the x axis. 13. Then a line tangent to this point is drawn which intersects the x axis at a point. One piece is bent into the shape of a square and the other piece is bent into the shape of a circle. A reﬁnery is on a straight shore line. What are the lengths if the sum of the two areas is to be as small as possible. A farmer has 600 feet of fence with which to enclose a rectangular piece of land that borders a river. Find the dimensions of the strongest such beam assuming the strength is of the form kh2 w. If he can use the river as one side. Find the two lengths such that the sum of the areas of the circle and the square is as large as possible. 8. 14. 6) if it exists. and the line just drawn is thus x12y1 . What is the largest possible volume which can be obtained? 11. x1 and the y axis at the point y1 .174 PROPERTIES AND APPLICATIONS OF DERIVATIVES 5. He is walking on a road which runs from East to West and the cabin is located exactly one mile north of a point two miles down the road. What is the longest ladder which can be carried in this way? Hint: Consider a line through the inside corner which extends . What is the best angle for this fold? 12.

A triangle is inscribed in a circle in such a way that one side of the triangle is always the same length. α s 18. The shortest such line will be the length of the longest ladder. there exists x ∈ (0. bi positive numbers. not just to prove it exists.1. The problem consists of how to ﬁnd this number. q n n 1/p n 1/q ai bi ≤ i=1 i=1 ap i i=1 bq i . 17. THE NEWTON RAPHSON METHOD 175 to the opposite walls.14. 21. Therefore. Suppose the total perimeter of the window is to be no more than 4π + 8 feet.8. 1 p 20. Use the law of sines and this fact. . A window is to be constructed for the wall of a church which is to consist of a rectangle of height b surmounted by a half circle of radius a. 2) √ such that f (x) = 0 and this x must equal 2. Hint: From theorems in plane geometry. For p + 1 = 1. You know limx→∞ ln x = ∞. Show that out of all such triangles the maximum area is obtained when the triangle is an isosceles triangle. and ai . suppose you want to ﬁnd 2. t q ap Show the minimum value of f is ab. by the intermediate value theorem. In the following picture.14 The Newton Raphson Method The Newton Raphson method is a way to get approximations of solutions to various equa√ √ tions. α is a constant. Prove the important inequality. ab ≤ p + bq . 8. The following picture illustrates the procedure of the Newton Raphson method. You might also consider Example 7. then limx→∞ ln x xα = 0. Show that if α > 0. Using Problem 20 establish the following magniﬁcent inequality which is a case of 1 Holder’s inequality. 19. the angle opposite the side having ﬁxed length is a constant. The existence of 2 is not diﬃcult to establish by considering the continuous function. Find the shape and dimensions of the window which will admit the most light. Now q q b let a and b be two positive numbers and consider f (t) = (at) + for t > 0. Suppose p and q are two positive numbers larger than 1 which satisfy 1 p p 1 q + 1 = 1. f (x) = x2 − 2 which is negative at x = 0 and positive at x = 2. s.3 on Page 141. For example. no matter how you draw the triangle.

24) which hopefully has the property that limn→∞ xn = x where f (x) = 0. Thus x2 = x1 − f (x1 ) . 2.176 PROPERTIES AND APPLICATIONS OF DERIVATIVES x2 x1 In this picture. This method does not always work.5) − 2 x3 = 1.5 − ≤ 1. Then extend this tangent line to ﬁnd where it intersects the x axis. denoted in the picture as x1 is chosen and then the tangent line to the curve y = f (x) at the point (x1 . the sequence oscillates between positive and negative values as its absolute value gets larger and larger. f (x1 )) is obtained. showing that the Newton Raphson method has yielded a very good approximation after only a few iterations. 414 216 302 046 577. x3 . 2 (1. 417. 2 (1. f (x2 ) Continuing this way. a ﬁrst approximation. x2 is the second approximation and the same process is done for x2 that was done for x1 in order to get the third approximation. For example. From (8. x4 = 1. set y = 0 and solve for x.24).417) − 2 = 1.417) √ √ What is the true value of 2? To several decimal places this is 2 = 1. The equation of this tangent line is y − f (x1 ) = f (x1 ) (x − x1 ) . which was not very good. Now carry out the computations in the above case for x1 = 2 and f (x) = x2 − 2. You can see from the above picture that this must work out in the case of f (x) = x2 − 2. 2 x2 = 2 − = 1.5. This value of x is denoted by x2 .417 − 2 . Thus x3 = x2 − f (x2 ) . {xn } given by xn+1 = xn − f (xn ) . In other words. even starting with an initial approximation. This is because. f (xn ) (8.5) (1. suppose you wanted to ﬁnd the solution to f (x) = 0 where f (x) = x1/3 . yields a sequence of points. 414 213 562 373 095. f (x1 ) This second point. You should check that the sequence of iterates which results does not converge. 4 Then 2 (1. starting with x1 the above procedure yields x2 = −2x1 and so as the iteration continues.

. Draw some graphs to illustrate the Newton Raphson method does not yield a convergent sequence in the case where f (x) = x1/3 . you can draw a picture to show that the method will yield a sequence which converges to x0 provided the ﬁrst approximation. Using the Newton Raphson method and an appropriate picture. c > 0 and p > 1.15 Exercises 1. 5. if f (x0 ) = 0 and f (x) > 0 for x near x0 . Use the Newton Raphson method to approximate the ﬁrst positive solution of x − tan x = 0. Similarly. Hint: You may need to use a calculator to deal with tan x. √ 4. then the method produces a sequence which converges to x0 provided x1 is close enough to x0 . By drawing representative pictures. 8. Use the Newton Raphson method to compute an approximation to 3 which is within 10−6 of the true value. if f (x) < 0 for x near x0 . Explain how you know you are this close. 3. x1 is taken suﬃciently close to x0 .15. show convergence of the Newton Raphson method in the cases described above where f (x) > 0 near x0 or f (x) < 0 near x0 . discuss the convergence of the recursively deﬁned sequence xn+1 = (p − 1) xn + cx1−p /p where n x1 . 2. EXERCISES 177 However.8.

178 PROPERTIES AND APPLICATIONS OF DERIVATIVES .

y (x)) . Then letting H (x) ≡ A (x) − B (x) . 9.1. Many interesting problems may be solved by formulating them as solutions of a suitable diﬀerential equation with initial condition called an initial value problem. it follows that H (a) = 0 and H (x) = A (x) − B (x) = f (x) − f (x) = 0. are called antiderivatives and there is a special notation used to denote them. Thus the initial value problem of interest here is one of the form y (x) = f (x) . y (x) for x ∈ [a.1) Theorem 9. Deﬁnition 9. Therefore.1) is in ﬁnding a function whose derivative equals the given function.2 Let f be a function. This is in general a very hard problem. f (x) dx denotes the set of antiderivatives of f .4 on Page 133. (9.1.1 There is at most one solution to the initial value problem. this constant must equal H (a) = 0. By continuity of H. b] . The main diﬃculty in solving these initial value problems like (9. b). Nevertheless. you can’t escape the fact that it is common usage to call that symbol an integral. The functions whose derivatives equal a given function. It is customary to refer to f (x) as the integrand. The initial value problem is to ﬁnd a function. b] such that Various assumptions are made on f (t. from Corollary 6. This symbol is also called the indeﬁnite integral and sometimes is referred to as an integral although this last usage is not correct.1) which is continuous on [a. although techniques for doing this are presented later which will cover many cases of interest. At this time it is assumed that f does not depend on y and f is a given continuous deﬁned [a. (9. b]. Diﬀerential equations are the unifying idea in this chapter. y (a) = y0 . y (a) = y0 . The reason this last usage is not correct is that the integral of a function is a single number not a whole set of functions. 179 .8. Proof: Suppose both A (x) and B (x) are solutions to this initial value problem for x ∈ (a. Thus F ∈ f (x) dx means F (x) = f (x). f (x) . b). it follows H (x) equals a constant on (a. y) .Antiderivatives And Diﬀerential Equations A diﬀerential equation is an equation which involves an unknown function and its derivatives.1 Initial Value Problems y (x) = f (x.

4 on Page 133. Lemma 9.3 Suppose F . Suppose then that F ∈ f (x) dx and G ∈ g (x) dx. Now take H ∈ (af (x) + bg (x)) dx and pick F ∈ f (x) dx. It is necessary to verify the two sets of functions are the same. Then there exists a constant.8. Taking the derivative of this function.4 If a and b are nonzero real numbers. Thus it is customary to write f (x) dx = F (x) + C where it is understood that C is an arbitrary constant.1. C. Proof: F (x) − G (x) = f (x) − f (x) = 0 for all x ∈ (a. There is another simple lemma about antiderivatives. b). Consequently.1. G ∈ f (x) dx for x ∈ (a. F (x) − G (x) = C. This proves the lemma. bg (x) dx = b g (x) dx This has shown the two sets of functions are the same and proves the lemma. the following table of antiderivatives follows. b) .1.180 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS Lemma 9. then every other function in f (x) dx is of the form F (x) + C for a suitable constant. From Lemma 9. b). Therefore H ∈ aF + b g (x) dx ⊆ a f (x) dx + b g (x) dx. C such that for all x ∈ (a. From the formulas for derivatives presented earlier.3 it follows that if F (x) ∈ f (x) dx. and if nonempty. called a constant of integration. yields af (x) + bg (x) and so this shows the set of functions on the right side is a subset of the set of functions on the left. Then aF ∈ a f (x) dx and (H (x) − aF (x)) = af (x) + bg (x) − af (x) = bg (x) showing that H − aF ∈ because b = 0. then (af (x) + bg (x)) dx = a f (x) dx + b f (x) dx and g (x) dx are g (x) dx Proof: The symbols on the two sides of the equation denote sets of functions. F (x) = G (x) + C. n = −1 x−1 cos (x) sin (x) ex cosh (x) sinh (x) √ 1 1−x2 √ 1 1+x2 f (x) dx +C ln |x| + C sin (x) + C − cos (x) + C ex + C sinh (x) + C cosh (x) + C arcsin (x) + C arcsinh x + C xn n+1 . Then aF + bG is a typical function of the right side of the equation. f (x) xn . by Corollary 6.

h h→0 The consideration of h < 0 is also straightforward. for continuous functions. 9.5 Let n k=0 181 ak xk be a polynomial. h h h Therefore.10 on Page 96 implies there exists xM . AREAS The above table is a good starting point for other antiderivatives. f. This happens because the function is decreasing near x. x + h]} and f (xm ) ≡ min {f (x) : x ∈ [x. f (xm ) = A (x) ≡ lim A (x + h) − A (x) = f (x) .7. In general. You can see that this area is between hf (x) and hf (x + h). Theorem 5.1. Then n n ak xk dx = k=0 k=0 ak xk+1 +C k+1 Proof: This follows from the above table and Lemma 9. xm ∈ [x. A (x + h) − A (x) as indicated. x + h] with the properties f (xM ) ≡ max {f (x) : x ∈ [x. This discussion implies the following theorem. hf (xm ) A (x + h) − A (x) hf (xM ) ≤ ≤ = f (xM ) .5.2 Areas Consider the problem of ﬁnding the area between the graph of a function of one variable and the x axis as illustrated in the following picture. x + h up to the curve and the vertical line from x up to the curve deﬁne the area. For example. a and the point x as shown. The vertical line from the point.2.9. A(x) a A(x + h) − A(x) xd d dx + h The curved line on the top represents the graph of the function y = f (x) and the symbol. Proposition 9. . Theorem 5. using the squeezing theorem. and the continuity of f. x + h]} Then.4.9. A (x) represents the area between this curve and the x axis between the point.1.

182 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS Theorem 9. ∞) be continuous. Example 9.2. x.1 Let a < b and let f : [a. A (x) = f (x) for x ∈ (a. Then letting A (x) denote the area between a.2) 9. x + C has the property that its derivative gives x2 . The function. b]. Letting C = 2. b) because of the existence of its derivative and the above argument can also be used to obtain one sided derivatives for A at the end points.2. 3 Thus C = −1 . g (x) is on top.2 Let f (x) = x2 for x ∈ [1. The assertion that A should be continuous on [a. It is called this because it is an equation for an unknown function. b] → [0. Also. It follows that A (x) = x − 1 . Find the area between the graph of the function. 1 A (x) = − x + 2 satisﬁes the appropriate initial value problem and so the area equals A (3) = 5 3. A (x) written in terms of the derivative of this unknown function. 3 (9. A (a) = 0. A is continuous on [a. a and b. The problem for A described in the above theorem is called an initial value problem and the equation.3 Area Between Graphs It is a minor generalization to consider the area between the graphs of two functions. It is the length of the vertical line joining the two graphs which is of importance . and the x axis. b) . f (x) A(x) g(x) A(x + h) − A(x) a x x+h You see that sometimes the function. Consider the following picture. 2]. the graph of the function. This is true for any 3 C. f (x) is on top and sometimes the function. b]. b] follows from the fact that it has to be continuous on (a. 3 3 3 Example 9.2. 1 The function − x + C has the property that its derivative equals 1/x2 . the points 1 and 2. It only remains to choose C in such a way that the function equals zero at x = 1.3 Find the area between the graph of the function y = 1/x2 and the x axis for x between 1/2 and 3. Therefore. A (x) = f (x) is a diﬀerential equation. the area described equals 3 3 3 A (2) = 8 − 1 = 7 square units. which yields continuity on [a. and the x axis.

Therefore. 3] because it is supposed to have a derivative. g : [a. 3). b) . 3). This yields the following theorem which generalizes the one presented earlier because the x axis is the graph of the function. 3]. t] .5 on Page 100. −3) . −3) . A is continuous on [a.2 Let f (x) = 8 − x and g (x) = 2 of the two functions for x ∈ [−4. x + h]} . AREA BETWEEN GRAPHS 183 and this length is always |f (x) − g (x)| regardless of which function is larger. Then A (x + h) − A (x) |f (xM ) − g (xM )| h |f (xm ) − g (xm )| h ≤ ≤ . A (−3) = 3 x3 44 − 9x − 3 3 (−3) 44 10 − 9 (−3) − = . A (−3) = for x ∈ (−3. By Theorem 5.1 Let a < b and let f.3.9. −3] 9 − x2 if x ∈ [−3. Theorem 5. b] → R be continuous. Find the area between the graphs You should graph the two functions. A (t) = |f (t) − g (t)| for t ∈ (a. x + h] satisfying |f (xM ) − g (xM )| ≡ max {|f (x) − g (x)| : x ∈ [x. A (x) = x − 9x + C where C is chosen so that A (−4) = 0. Then letting A (t) denote the area between the graphs of the two functions for x ∈ [a. Also. b]. Example 9. 3] It follows that on (−4.3. 3 3 3 Now consider A (x) for x ∈ (−3. xm ∈ [x. 3 Thus 3 (−4) − 9 (−4) + C = 0 3 and so C = − 44 . for x ∈ (−4.9. A (−4) = 0. The answer is A (3) where A (x) = |f (x) − g (x)| = 9 − x2 . It follows that A (x) = 9x − x3 +D 3 10 3 .10 on Page 96 there exist xM . Hence A (x) = 9 − x2 . 3 A (x) = Also. as h → 0 A (x) = |f (x) − g (x)| Also A (a) = 0 as before. A (a) = 0.7. 2 (9.3) x2 2 − 1. h h h and using the squeezing theorem. y = 0. Now 9 − x2 = 3 x2 − 9 if x ∈ [−4. Theorem 9. It is necessary to have A continuous on the interval [−4.3. x + h]} and |f (xm ) − g (xm )| ≡ min {|f (x) − g (x)| : x ∈ [x.

Procedure 9. ﬁnd an antiderivative of |f (x) − g (x)| . 3).1.4 To solve the initial value problem. it follows C = −G (a) and so F (x) = G (x)−G (a) . 4. Then from the above explanation. . Find the area between the graphs of y = 5x + 14 and y = x2 . Find the area between the graphs of the functions. A (x).3. Proof: Suppose B (x) = |f (x) − g (x)| . Then to ﬁnd the area between the two graphs of the functions. The solution is y = −1 and 4. For y in this interval. consider the initial value problem F (x) = f (x) . F (a) = 0. (9.3. Proof: Suppose F solves (9. d] . 3 3 3 This illustrates the following procedure which you can use to ﬁnd areas. (9.1. C such that A (x) = B (x) + C. 2.3 Suppose y = f (x) and y = g (x) are two functions deﬁned for x ∈ [a.5 Find the area between x = 4 − y 2 and x = −3y.4 Exercises 1.184 where ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS A (−3) = 9 (−3) − 3 (−3) 10 +D = 3 3 3 Thus D = 64 and A (x) = 9x − x + 64 for x ∈ (−3. 3]. A (3) .3 there is some constant. y = 2x2 and y = 6x + 8. 3 3 3 is given by 3 (3) 64 118 A (3) = 9 (3) − + = . there exists a constant. If there is a solution to the initial value problem. such that G (x) = f (x) . From Lemma 9. B (a) = 0. By continuity of A. the area. Then by Lemma 9.4) Procedure 9. y = x2 + 1 and y = 3x + 5. C such that F (x) = G (x)+C. A (b) − A (a) = (B (b) + C) − (B (a) + C) = (B (b) + C) − C = B (b) which is the desired area.4).4) it will equal G (x) − G (a) . G. you can verify that 4−y 2 > −3y and so 4 − y 2 − (−3y) = 4−y 2 +3y. Therefore. 6 9. Example 9. A similar procedure holds for ﬁnding the area between two functions which are of the form x = f (y) and x = g (y) for y ∈ [c. 3. Find the area between the graphs of the functions. (9. More generally.3. ﬁnd an antiderivative. Since F (a) = 0. b] . An antiderivative is 4y − 4 (4) − y3 3 3 + 3y 2 2 and so the desired area is 2 3 (4) (4) + − 3 2 4 (−1) − (−1) 3 (−1) + 3 2 3 2 = 125 . Find the area between the graphs of the functions y = x + 1 and y = 2x for x ∈ [0. 4 − y 2 = −3y. First ﬁnd where the two graphs intersect.4).3. The desired area is then given by A (b) − A (a) . B (b) is the desired area.

Find the area between y = ln x and y = sin x for x > 0. you have to ﬁnd a solution to ex = 2x + 1 and this will require a numerical procedure such as Newton’s method or graphing and zooming on your calculator. Find the area between y = ex and y = 2x + 1 for x > 0. Show that the area of a right triangle equals one half the product of two sides which are not the hypotenuse. b x x = tp−1 t = xq−1 a t In the picture the right side of the inequality represents the sum of all the areas and the left side is the area of the rectangle determined by (a. Find the area between the graphs of y = x and y = sin x for x ∈ − π . 1) . f (x) = x − x2 . Hint: This will likely involve the intermediate value theorem. 11. 4 14. Write an equation satisﬁed by k and then ﬁnd an approximate value for k. Find the area between the x axis and the graph of the function 2x 1 + x2 for x ∈ [0. π . Find the area between the graphs of x = y 2 and y = 2 − x. . 9. 12. Find the area between the graph of f (x) = 1/x2 for x ∈ [1. Hint: Recall the chain rule for derivatives. the line y = kx divides this region into two pieces. Explain why there exists a number. k such that the area of these two pieces is exactly equal. For k ∈ (0. An inequality which is of major importance is ab ≤ ap bq + p q where here q is deﬁned by 1/p + 1/q = 1. you have to ﬁnd a solution to ln x = sin x and this will require a numerical procedure such as Newton’s method or graphing and zooming on your calculator. 2π] . 2] and the x axis in terms of known functions. π . b). EXERCISES 5. 2] and the x axis in terms of known functions. 2 7. 13. Find the area between ex and cos x for x ∈ [0. In order to do this.4. 2]. 2]. Let p > 1. Find the area between y = sin x and y = cos x for x ∈ 0. 6. Let A denote the region between the x axis and the graph of the function. Find the area between the graph of f (x) = 1/x for x ∈ [1. Hint: You should draw plenty of pictures to do this last part. 10.9. In order to do this. 185 √ 8. 17. 16. 0) and (0. 15. Find the area between y = |x| and the x axis for x ∈ [−2. Establish this inequality by adding up areas in the following picture.

The method of substitution is based on the following formula which is merely a restatement of the chain rule.5) and massage things to get them in that form. This is a hard problem if you insist on ﬁnding antiderivatives in terms of known functions but there are some general procedures for ﬁnding antiderivatives which are presented next. 1 12 3 12 + 2 sin2 x + C. There is a trick based on the Leibniz notation for the derivative which is very useful and illustrated in the following example.4 Find cos (2x) sin2 (2x) dx. 2 Thus cos (2x) sin2 (2x) dx = = 1 2 u2 du = 3 u3 +C 6 (sin (2x)) +C 6 . the answer is 6 7 1 + x2 2x dx = 1 + x2 /7 + C.186 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 9. Here are some other examples of the method of substitution. The crucial step in the solution of this initial value problem is to ﬁnd an antiderivative for a given function.5 The Method Of Substitution Finding areas can often be reduced to solving an initial value problem. This is a special case of (9.5. sin (x) cos (x) Example 9. Therefore. it is not necessary to recall (9. Now formally du = cos (2x) dx. f (g (x)) g (x) dx = F (x) + C.2 Find 3 + 2−1 sin2 (x) dx = 6 (3 + u) 3 . 1 + x2 2x dx.5. Example 9. Then = 2 cos (2x).3 Find This equals 1 2 1+x 2 6 1 + x2 6 x dx 1 1 + x2 2x dx = 2 7 7 1 + x2 +C = 14 7 + C.5) with g (x) = 2−1 sin2 (x) and F (u) = Therefore.5.5.5) when g (x) = 1 + x2 and F (u) = u7 /7. Actually. Example 9.1 Find sin (x) cos (x) 3 + 2−1 sin2 (x) dx 2 3 (9. du dx Let u = sin (2x).5) Note it is of the form given in (9. where F (y) = f (y). Example 9.

5 Find √ 3 2x + 7x dx. 4 . 2 Therefore. it is a good idea to check your answer to be sure you have not made a mistake.5. Thus in this example. Thus 2 1 1 du = [u + C] 2 ln (3) 2 ln (3) 2 1 1 = 3x + C 2 ln (3) 2 ln (3) Since the constant is an arbitrary constant. This was solved and ﬁnally the original variable was replaced. 1 + cos (2x) cos2 (x) = . (sin(2x))3 6 = cos (2x) sin2 (2x) which veriﬁes the answer is right. du = 2dx and so cos2 (x) dx = 1 + cos (u) du 4 1 1 = u + sin u + C 4 4 1 = (2x + sin (2x)) + C. this is written as 2 1 3x + C.5.5. Then subtracting and solving for cos2 (x). THE METHOD OF SUBSTITUTION 187 The expression (1/2) du replaced the expression cos (2x) dx which occurs in the original problem and the resulting problem in terms of u was much easier.9.5.7 Find cos2 (x) dx Recall that cos (2x) = cos2 (x) − sin2 (x) and 1 = cos2 (x) + sin2 (x). When using this method. Here Example 9. 2 ln (3) Example 9. Then x dx √ 3 2x + 7x dx = √ u−71 3 du u 2 2 1 4/3 7 1/3 = u − u du 4 4 3 7/3 21 4/3 = u − u +C 28 16 3 21 7/3 4/3 = (2x + 7) − (2x + 7) + C 28 16 Example 9. In this example u = 2x + 7 so that du = 2dx. the chain rule implies is another example. 1 + cos (2x) cos2 (x) dx = dx 2 Now letting u = 2x.6 Find Let u = 3x so that 2 x3x dx du dx 2 = 2x ln (3) 3x and x3x dx = 2 2 du 2 ln(3) = x3x dx.

Example 9. this becomes −1 du. You write as sec(x)(sec(x)+tan(x)) dx and note that (sec(x)+tan(x)) the numerator of the integrand is the derivative of the denominator. Thus sec (x) dx = ln |sec (x) + tan (x)| + C. Example 9. Find the indicated antiderivatives. This is usually done by a trick. (a) x cosh x2 + 1 dx .5.10 Find sec (x) dx. recall that (ln |u|) = 1/u.6 Exercises (a) (b) (c) (d) (e) √ x 2x−3 1. C. Procedure 9. dx csc (x) = − csc (x) cot (x) and − csc(x)(cot(x)+csc(x)) 2 cot (x) = − csc (x) .5. Thus this antiderivative is u − ln |u| + C = ln u−1 + C and so tan (x) dx = ln |sec x| + C. Write the integral as − dx = − ln |cot (x) + csc (x)|+ (cot(x)+csc(x)) 9. This follows from the chain rule. d This is done like the antiderivatives for the secant.8 Find tan (x) dx Let u = cos x so that du = − sin (x) dx. Find the indicated antiderivatives.5. Then writing the antiderivative in terms of u. 2. (a) (b) (c) (d) (e) sec (3x) dx sec2 (3x) tan (3x) dx 1 3+5x2 dx √ 1 dx 5−4x2 √ 3 dx x 4x2 −5 3. This illustrates a general procedure.9 f (x) f (x) dx = ln |f (x)| + C.188 Also ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 1 1 sin2 (x) dx = − cos x sin x + x + C 2 2 which is left as an exercise. dx 5 x 3x2 + 6 x sin x 3 √ 1 1+4x2 2 dx dx sin (2x) cos (2x) dx Hint: Remember the sinh−1 function and its derivative.11 Find d dx csc (x) dx. Find the indicated antiderivatives.5. At this point. Example 9.

Hint: Derive and use sin2 (x) = 1−cos(2x) . 4]. sin2 (x) dx. 6. π/2].6. 11. Find the area between the graphs of y = sin2 x. 2 4 189 5. Find 1 dx.9. (a) (b) (c) (d) (e) (f) ln x x 3 dx dx Hint: Complete the square in the denominator and then let u = x + 1. . Find the indicated antiderivatives. Find the area between the graphs of y = cos2 x and y = sin2 x for x ∈ [0. x 3+x4 dx 1 x2 +2x+2 √ 1 dx 4−x2 1 √ dx x x2 −9 ln(x2 ) x Hint: Let x = 3u. for x ∈ [0. 2π]. 8. Find the indicated antiderivatives. 7. (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) (k) x (2x + 4) dx x (3x + 2) dx √ 1 2 dx (36−25x ) 1 x (3x + 5) dx 1 (9−4x2 ) 1 (1+4x2 ) x (3x−1) √ √ dx dx dx 4√ 1 x2 3 (6x + 4) dx dx dx dx 4√ x 2 x (5x+1) √ 1 (9x2 −4) √ 1 (9+4x2 ) 10. 2π]. and the x axis. EXERCISES (b) (c) (d) (e) 4. dx √ x3 (6x2 +5) (g) Find (h) Find dx x 3 (6x + 4) dx 9. x1/3 +x1/2 Hint: Try letting x = u6 and use long division. Find x3 5x dx sin (x) 7cos(x) dx x sin x2 dx √ x5 2x2 + 1 dx Hint: Let u = 2x2 + 1. Find the area between the graphs of y = sin (2x) and y = cos (2x) for x ∈ [0. Find the area between the graph of f (x) = x3 / 2 + 3x4 and the x axis for x ∈ [0.

2 . Example 9. Example 9.7.2 Find x sin (x) dx Let u (x) = x and v (x) = sin (x). then (uv) (x) = u (x) v (x) + u (x) v (x) .7). x sin (x) dx = (− cos (x)) x − (− cos (x)) dx = −x cos (x) + sin (x) + C. This proves the proposition. arctan (x) dx = x arctan (x) − = x arctan (x) − x 1 dx 1 + x2 2x dx 1 + x2 1 2 1 = x arctan (x) − ln 1 + x2 + C. Then applying (9.4 Find arctan (x) dx Let u (x) = arctan (x) and v (x) = 1.7) is a function found in the right side of (9. Recall the product rule. Then (uv − F ) = (uv) − F = (uv) − u v = uv by the chain rule.190 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 9. x ln (x) dx = x2 x2 1 ln (x) − 2 2 x 2 x x = ln (x) − 2 2 2 x 1 = ln (x) − x2 + C 2 4 Example 9.1 Let u and v be diﬀerentiable functions for which u (x) v (x) dx are nonempty. Then (uv − G) = −uv + (uv) = u v by the chain rule. Then uv − Proof: Let F ∈ u (x) v (x) dx = u (x) v (x) dx. Thus every function from the right in (9. Therefore every function from the left in (9. Then from (9. Therefore. (uv) (x) − u (x) v (x) = u (x) v (x) Proposition 9.7 Integration By Parts Another technique for ﬁnding antiderivatives is called integration by parts and is based on the product rule. It follows that uv − G ∈ u (x) v (x) dx and so G ∈ uv − u (x) v (x) dx. u (x) v (x) dx and (9.7.7) is a function from the left. If u and v exist.7.7).7).7).7.6) (9.3 Find x ln (x) dx Let u (x) = ln (x) and v (x) = x. Now let G ∈ u (x) v (x) dx. Then from (9.7) u (x) v (x) dx.

If you don’t make any mistakes. Find the following antiderivatives. .8. Integrate by parts. Consider the following argument. Find the antiderivatives (a) (b) (c) (d) (e) (f) (g) (h) x2 sin x dx x2 sin x dx x3 7x dx x2 ln x dx (x + 2) ex dx x3 2x dx sec3 (2x) tan (2x) dx x2 7x dx 2 6. Now do it by taking sin2 x dx = x sin2 x − 2 x sin x cos x dx = x sin2 x − x sin (2x) dx. 1 x2 3. 4. Try doing sin2 x dx the obvious way. Show that tan (x) sec (x) + ln |sec x + tan x| + C. (a) (b) (c) (d) (e) (f) (g) (h) x3 arctan (x) dx x3 ln (x) dx x2 sin (x) dx x2 cos (x) dx x arcsin (x) dx cos (2x) sin (3x) dx x3 ex dx x3 cos x2 dx 2 5. letting u (x) = x and v (x) = to get 1 dx = x x 1 x2 dx = 1 dx. x3 cos x2 dx sec3 (x) dx = 1 2 1 2 2 2 1 x(ln(|x|))2 1. the process will go in circles. x − 1 x x+ 1 dx x = −1 + Now subtracting what? 1 x dx from both sides. 2.8 Exercises (a) (b) (c) (d) (e) xe−3x dx dx √ x 2 − x dx (ln |x|) dx Hint: Let u (x) = (ln |x|) and v (x) = 1.9. EXERCISES 191 9. Is there anything wrong here? If so. Find the following antiderivatives. 0 = −1.

putting in this information to change back to the x variable. Example 9. x + 1 = tan u so dx = sec2 u du. this last indeﬁnite integral becomes sec2 u tan2 u + 1 2 2 dx = 1 (x + 1) + 1 2 2 dx (9. Substitutions Certain antiderivatives are easily obtained by making an auspicious substitution involving a trig.1 Find 1 (x2 +2x+2)2 dx.8). 2 2 (x + 1)2 + 1 . (x+1)2 +1 Therefore.9. sin u = √ x+1 (x+1)2 +1 and cos u = √ 1 . Complete the square as before and write 1 (x2 + 2x + 2) Use the following substitution next. The technique will be illustrated by presenting examples. 1 (x2 + 2x + 2) 1 1 arctan (x + 1) + 2 2 2 dx = = x+1 (x + 1) + 1 2 1 (x + 1) + 1 2 +C 1 1 x+1 arctan (x + 1) + + C. function.9 Trig.192 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 9. x+1 (x + 1)2 + 1 u 1 In this picture which is descriptive of (9.8) du = = = = u 2 u 2 cos2 u du 1 + cos 2u du 2 sin 2u + +C 4 2 sin u cos u + +C 4 Next write this in terms of x using the following device based on the following picture. Therefore.

restoring the x = ln = ln 3 + x2 x √ + √ +C 3 3 3 + x2 + x + C. Example 9. √ 1 x2 + 3 dx and tan u = √ x √ .2 Find Let x = √ √ 1 x2 +3 193 dx.9. SUBSTITUTIONS Example 9. x √ 3 + x2 u √ 3 √ 3+x2 √ 3 Using the above diagram. sec u = variable. Making the substitution. Then making the substitution. √ √ 3 1/2 2 3 tan u + 1 sec2 (u) du 2 3 = sec3 (u) du.9.3 Find √ √ Let 2x = 3 tan u so 2dx = 3 sec2 (u) du. TRIG.9. consider √ 1 3 sec2 u du √ √ 2 3 tan u + 1 3 tan u so dx = √ = (sec u) du = ln |sec u + tan u| + C Now the following diagram is descriptive of the above transformation. 3 sec2 u du. 2 Now use integration by parts to obtain sec3 (u) du = sec2 (u) sec (u) du = 1/2 (9. 3 Therefore.9) = tan (u) sec (u) − = tan (u) sec (u) − = tan (u) sec (u) + tan2 (u) sec (u) du sec2 (u) − 1 sec (u) du sec (u) du − sec3 (u) du sec3 (u) du = tan (u) sec (u) + ln |sec (u) + tan (u)| − .9. 4x2 + 3 dx.

tan (u) = and sec (u) = 4x2 + 3 1/2 √ 3+4x2 √ .10) Now it follows from (9.9. Making the . let 5x = 3 sin (u) so 5dx = 3 cos (u) du. bx = a tan u and the trig was the right one to use. This is the auspicious substitution which often simpliﬁes these sorts of problems. 2 and so sec3 (u) du = ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS sec3 (u) du = tan (u) sec (u) + ln |sec (u) + tan (u)| + C 1 [tan (u) sec (u) + ln |sec (u) + tan (u)|] + C.194 Therefore. dx = 3 2x √ 4 3 = = √ 3 + 4x2 √ + ln 3 √ 3 + 4x2 2x √ +√ 3 3 +C √ √ 3 2x 3 + 4x2 3 + 4x2 2x √ √ √ + ln +√ +C 4 3 3 3 3 1 3 x (3 + 4x2 ) + ln 3 + 4x2 + 2x + C 2 4 a2 + (bx) 2 Note that these examples involved something of the form substitution. 2x √ 3 + 4x2 u √ 3 2x √ 3 From the diagram.9) that in terms of u the set of antiderivatives is given by 3 [tan (u) sec (u) + ln |sec (u) + tan (u)|] + C 4 Use the following diagram to change back to the variable. 3 Therefore. x. 2 (9. Example 9. The reason this might be a good idea is that it will get rid of the square root sign as shown below.4 Find √ 3 − 5x2 dx √ √ √ √ In this example.

2 5 2 5 = .5 Find 5x2 − 3 dx √ √ √ √ In this example. √ 5x u √ 3 √ 3 − 5x2 √ 5x √ 3 From the diagram.10). SUBSTITUTIONS substitution. 3 − 5x2 dx = = √ 3 195 √ 3 1 − sin2 (u) √ cos (u) du 5 3 √ cos2 (u) du 5 1 + cos 2u 3 du = √ 2 5 3 u sin 2u = √ + +C 4 5 2 3 3 √ u + √ sin u cos u + C = 2 5 2 5 The appropriate diagram is the following. 3 Therefore. 5 Now from (9.9. changing back to x. let 5x = 3 sec (u) so 5dx = 3 sec (u) tan (u) du. this equals 3 1 √ [tan (u) sec (u) + ln |sec (u) + tan (u)|] − ln |tan (u) + sec (u)| + C 5 2 3 3 √ tan (u) sec (u) − √ ln |sec (u) + tan (u)| + C.9. 3 − 5x2 dx = √ √ √ 3 5x 5x 3 − 5x2 3 √ √ arcsin √ + √ √ +C 2 5 3 2 5 3 3 1√ 1 3√ 5 arcsin 15x + x (3 − 5x2 ) + C = 10 3 2 √ Example 9. TRIG. consider √ √ 3 2 (u) − 1 √ sec (u) tan (u) du 3 sec 5 3 tan2 (u) sec (u) du = √ 5 3 = √ sec3 (u) du − sec (u) du . sin (u) = and cos (u) = √ 3−5x2 √ . Then changing the variable.9.

9. here is a short table of auspicious substitutions corresponding to certain expressions. tan (u) = √ 5x2 −3 √ 3 and sec (u) = and so 5x2 − 3 dx = √ √ √ 5x2 − 3 5x 3 5x 5x2 − 3 √ √ − √ ln √ + √ +C 2 5 3 3 2 5 3 3 √ 1 3√ 5x2 − 3 x − 5 ln 5x + (−3 + 5x2 ) + C 2 10 3 √ √ = = To summarize. substitution bx = a tan (u) bx = a sin (u) ax = b sec (u) Of course there are no “magic bullets” but these substitutions will often simplify an expression enough to allow you to ﬁnd an antiderivative. dx dx dx 3 (36−25x2 ) 3 (16−25x2 ) 1 (4−9x2 ) 1 (36−x2 ) dx dx 3 (9 − 16x2 ) (16 − x2 ) 5 dx dx (25 − 36x2 ) dx .196 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS Now it is necessary to change back to x. Expression a2 + b2 x2 a2 − b2 x2 a2 x2 − b2 Trig. The diagram is as follows. These substitutions are often especially useful when the expression is enclosed in a square root.10 (a) (b) (c) (d) (e) (f) (g) (h) Exercises √ √ √ √ √ x (4−x2 ) 1. Find the antiderivatives. √ 5x2 − 3 u 5x √ √ 3 √ 5x √ 3 Therefore.

Example 9.11. (a) (b) (c) (d) (e) (f) √ √ √ (4x2 + 16x + 15) dx (x2 + 6x) dx 3 (−32−9x2 −36x) 3 (−5−x2 −6x) dx dx dx 1 (9−16x2 −32x) (4x2 + 16x + 7) dx 9. Find the antiderivatives. (x2 + 9) dx (4x2 + 25) dx √ x 4x4 + 9 dx √ x3 4x4 + 9 dx 1 1 (25+36(2x−3)2 ) (16+25(x−3)2 ) 2 2 dx dx dx Hint: Complete the square. 3 1 261+25x2 −150x (25x2 + 9) 1 25+16x2 dx dx 4.1 Find 1 x2 +2x+2 dx. .9. (a) (b) (c) (d) (36x2 − 25) dx (x2 − 4) dx (16x2 − 9) 3 dx (25x2 − 16) dx 3.11. PARTIAL FRACTIONS (i) (j) (4 − 9x2 ) 3 197 dx (1 − 9x2 ) dx 2. Before presenting this technique.11 Partial Fractions The main technique for ﬁnding antiderivatives in the case f (x) = p(x) for p and q polyq(x) nomials is the technique of partial fractions. Find the antiderivatives. Hint: Complete the square. (a) (b) (c) (d) (e) (f) (g) (h) (i) (j) 1 26+x2 −2x dx Hint: Complete the square. a few more examples are presented. Find the antiderivatives.

198 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS To do this complete the square in the denominator to write 1 dx = x2 + 2x + 2 1 (x + 1) + 1 2 dx Now change the variable.3 Find dx. 1 1 dx = ln |3x + 5| + C.2 Find 1 3x+5 dx. Therefore. The following simple but important Lemma is needed to continue. 3 ln 2 2 √ 3 x+ 1 2 2 √ 3 x+ 1 2 + C. . u 3 Let u = 3x + 5 so du = 3dx and changing the variable. Therefore.11.11. 1 2 = 3 2 u 4 changing the variable. 3x + 2 +x+ 1 + 4 3x + 2 1 2 2 First complete the square in the denominator. dx = 4 3 = = = 3 2 √ 3 2 √ 3 √ 3 2 du and √ 3 1 2 u− 2 u2 + 1 √ 2 3 √ 3 u2 u 2 du − +1 3 u2 2 2u 1 du − du u2 + 1 3 u2 + 1 √ 3 3 2 ln u + 1 − arctan u + C 2 3 3x + 2 dx = x2 + x + 1 √ 2 3 +1 − arctan 3 Therefore. Then the last indeﬁnite integral reduces to 1 du = arctan u + C u2 + 1 and so 1 dx = arctan (x + 1) + C. 2 + 2x + 2 x Example 9. x2 3x + 2 dx = +x+1 = Now let x+ so that x + 1 2 x2 3 4 dx x+ 2 + 3 4 dx. + 2 √3 2 du 1 du +1 = √ 3 2 u. 3x + 5 3 3x+2 x2 +x+1 Example 9. 1 3 1 1 du = ln |u| + C. letting u = x+1 so that du = dx.

pick the one which has the smallest degree. If this doesn’t happen. c.11. a. b.4 Let f (x) and g (x) be polynomials. then the degree of r ≥ the degree of g. Suppose f (x) − g (x) l (x) is never equal to zero for any l (x).6 Find −x3 +11x2 +24x+14 (2x+3)(x+5)(x2 +x+1) dx. In this problem. and d so that the above rational functions sum to the integrand. q (x) such that r (x) f (x) = q (x) + . f (x) − g (x) q1 (x) + an f (x) − g (x) q1 (x) − it follows this is not zero by the assumption that f (x) − g (x) l (x) is never equal to zero for any l (x). This proves the lemma. Proof: Consider the polynomials of the form f (x) − g (x) l (x) and out of all these polynomials. a contradiction to the construction of r (x). b cx + d a + + 2x + 3 x + 5 x2 + x + 1 and try to ﬁnd constants.9. ﬁrst check to see if the degree of the numerator in the integrand is less than the degree of the denominator. If it is not so. b. Corollary 9. r (x) such that the degree of r (x) < degree of g (x) and a polynomial. This can be done because of the well ordering of the natural numbers discussed earlier. q (x) such that f (x) = q (x) g (x) + r (x) where the degree of r (x) < degree of g (x) or r (x) = 0. Then there exists a polynomial. such that −x3 + 11x2 + 24x + 14 a b cx + d = + + 2 2 + x + 1) (2x + 3) (x + 5) (x 2x + 3 x + 5 x + x + 1 . Now look for a partial fractions expansion for the integrand which is in the following form.11. d. this is so. use long division to write the integrand as the sum of a polynomial with a rational function in which the degree of the numerator is less than the degree of the denominator.5 Let f (x) and g (x) be polynomials. Thus the problem involves ﬁnding a. g (x) g (x) Example 9. Let this take place when l (x) = q1 (x) and let r (x) = f (x) − g (x) q1 (x). PARTIAL FRACTIONS 199 Lemma 9. The reason cx + d is used in the numerator of the last expression is that x2 + x + 1 cannot be factored using real polynomials. Let r (x) g (x) = bm xm + · · · + b1 x + b0 = an xn + · · · + a1 x + a0 where m ≥ n and bm and an are nonzero.11.11. Then letting r1 (x) = r (x) − = = xm−n bm g (x) an xm−n bm g (x) an xm−n bm . Now the degree of r1 (x) < degree of r (x). It is required to show the degree of r (x) is smaller than the degree of g (x) . See the preceding corollary which guarantees this can be done. c. It is required to show degree of r (x) < degree of g (x) or else r (x) = 0. Then there exists a polynomial. Then r (x) = 0. In this case.

First let x = −5 in (9. (9.200 and so ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS −x3 + 11x2 + 24x + 14 = a (x + 5) x2 + x + 1 + b (2x + 3) x2 + x + 1 + (cx + d) (2x + 3) (x + 5) . In this case the degree of the numerator is larger than the degree of the denominator and so long division must ﬁrst be used. −x3 + 11x2 + 24x + 14 dx = (2x + 3) (x + 5) (x2 + x + 1) 1 dx − 2x + 3 2 dx + x+5 x2 1+x dx. Anyway. 2b + 2c + a = −1 6a + 5b + 13c + 2d = 11 6a + 13d + 5b + 15c = 24 15d + 5a + 3b = 14 The solution is c = 1. b = −2. Example 9.11) and obtain a simple equation for ﬁnding b. dx. Multiplying the right side out and collecting the terms. (2x + 3) (x + 5) (x2 + x + 1) 2x + 3 x + 5 x2 + x + 1 This may look like a fairly formidable problem. In reality it is not that bad. Thus 7 + 3x 3x5 + 7 = 3x3 + 3x + 2 x2 − 1 x −1 Now look for a partial fractions expansion of the form 7 + 3x a b = + . it is now possible to ﬁnd the antiderivative of the given function. 2−1 x (x − 1) (x + 1) . d = 1.11. Next let x = −3/2 to get a simple equation for a. +x+1 Now each of these indeﬁnite integrals can be found using the techniques given above. −x3 + 11x2 + 24x + 14 = (2b + 2c + a) x3 + (6a + 5b + 13c + 2d) x2 + (6a + 13d + 5b + 15c) x + 15d + 5a + 3b and therefore. it is necessary to solve the equations. a = 1. they have the same coeﬃcients. This reduces the above system to a more manageable size. −x3 + 11x2 + 24x + 14 1 2 1+x = − + . Thus the desired antiderivative equals 1 ln |2x + 3| − 2 ln |x + 5| + 2 √ 1 1√ 3 ln x2 + x + 1 + 3 arctan (2x + 1) 2 3 3 This was a long example.11) Now these are two polynomials which are supposed to be equal. Therefore. Here is an easier one. Therefore.7 Find 3x5 +7 x2 −1 + C.

If this is not so. it follows b = −2. 2−1 x x−1 x+1 3x5 + 7 3 3 dx = x4 + x2 + 5 ln (x − 1) − 2 ln (x + 1) + C.11.11. Computing the constants yields 3x + 7 1 2 2 = 2 2 + x+2 − x+3 (x + 2) (x + 3) (x + 2) and therefore.9. Next follow the procedure illustrated in the above examples. Then letting x = −1. Therefore. Since this is the case. (x + 2) (x + 2)2 (x + 3) The reason there are two terms devoted to (x + 2) is that this is squared. 7 + 3x 5 2 = − x2 − 1 x−1 x+1 and so 5 2 3x5 + 7 = 3x3 + 3x + − . In this case the correct form of the partial fraction expansion is a b c + + . 7 + 3x = a (x + 1) + b (x − 1) . Then you factor the denominator into a product of factors. some linear of the form ax + b and others quadratic. look for a partial fractions decomposition in the following form. x+2 Example 9. PARTIAL FRACTIONS Therefore.9 Find the proper form for the partial fractions expansion of x3 + 7x + 9 (x2 + 2x + 2) (x + 2) (x + 1) (x2 + 1) 3 2 . (x + 2) (x + 2)2 (x + 1) x +1 These examples illustrate what to do when using the method of partial fractions. First check to see if the degree of the numerator is smaller than the degree of the denominator. cx + d ex + f ax + b + + 3+ (x2 + 2x + 2) (x2 + 2x + 2)2 (x2 + 2x + 2) A B D gx + h + + + 2 .11. x2 − 1 4 2 What is done when the factors are repeated? Example 9. . Letting x = 1. ax2 + bx + c which cannot be factored further. You ﬁrst check to be sure the degree of the numerator is less than the degree of the denominator. First observe that the degree of the numerator is less than the degree of the denominator.8 Find 3x+7 (x+2)2 (x+3) 201 therefore. 3x + 7 (x + 2) (x + 3) 2 dx = − 1 + 2 ln (x + 2) − 2 ln (x + 3) + C. a = 5. dx. do a long division.

The substitution is u = tan θ .1 Find dθ. the indeﬁnite integral is 2 cos (2 arctan u) du. Give a condition on a. Otherwise you may be looking for something which is not there. (1 + cos (2 arctan u)) (1 + u2 ) You can evaluate cos (2 arctan u) exactly. (a) (b) (c) (d) 2x+7 (x+1)2 (x+2) 5x+1 (x2 +1)(2x+3) 5x+1 (x2 +1)2 (2x+3) 5x4 +10x2 +3+4x3 +6x (x+1)(x2 +1)2 3. 5.12. Now replace u to obtain − tan θ 2 + 2 arctan tan θ 2 + C. Find 1+sin θ dθ.202 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 9. 9. When you use partial fractions. When such a thing occurs there is a substitution which will reduce the integrand to a rational function like those above which can then be integrated using partial fractions. in terms of u the antiderivative equals −u + 2 arctan u. This equals 2 cos2 (arctan u)−1. Thus in this example. and c such that ax2 + bx + c cannot be factored as a product of two polynomials which have real coeﬃcients. cos (arctan u) equals 1/ 1 + u2 and so the integrand reduces to √ 2 2 2 1/ 1 + u2 − 1 1 − u2 2 = = −1 + √ 2 2 1+u 1 + u2 1 + 2 1/ 1 + u2 − 1 (1 + u2 ) therefore. Find the partial fractions expansion of the following rational functions. 2. Hint: Use the above procedure of letting u = tan θ and then 2 multiply both the top and the bottom by (1 − sin θ) to see another way of doing it. Setting up a little √ triangle as above. .13 Exercises 1. b. try the substitution u = sin (x) . du = 1 + tan2 θ 1 dθ and so in terms of this new 2 2 2 variable. In ﬁnding sec (x) dx. Find the antiderivatives (a) (b) (c) x5 +4x4 +5x3 +2x2 +2x+7 (x+1)2 (x+2) 5x+1 (x2 +1)(2x+3) dx dx 5x+1 (x2 +1)2 (2x+3) sin θ 4. Functions cos θ 1+cos θ Example 9. The integrand is an example of a rational function of cosines and sines. be sure you look for something which is of the right form.12 Rational Functions Of Trig.

I shall give answers to these problems so you can see whether you have it right. the better you will be at taking antiderivatives. However.14 Practice Problems For Antiderivatives Here are lots of practice problems for ﬁnding antiderivatives. Some of these are very hard but you don’t have to do all of them if you don’t want to. Find the antiderivatives (a) (b) (c) 52x2 +68x+46+15x3 (x+1)2 (5x2 +10x+8) dx dx 9x2 −42x+38 (3x+2)(3x2 −12x+14) 9x −6x+19 (3x+1)(3x2 −6x+5) 2 dx 9. 203 7. (a) (b) (c) (d) 17x−3 (6x+1)(x−1) 4 3 dx dx Hint: Notice the degree of the numerator is larger than the degree of the denominator.9. (a) (b) (c) 1 (3x2 +12x+13)2 1 (5x2 +10x+7)2 dx dx dx 1 (5x2 −20x+23)2 9. there is no guarantee that my answers are right. Most of these problems are modiﬁcations of problems I found in a Russian calculus book. 8x2 +x−5 (3x+1)(x−1)(2x−1) 3x+2 (5x+3)(x+1) 50x −95x −20x2 −3x+7 (5x+3)(x−2)(2x−1) dx dx 8. Also. the more you do. . PRACTICE PROBLEMS FOR ANTIDERIVATIVES 6. Beware that sometimes you may get it right even though it looks diﬀerent than the answer given. This book had some which were even harder. Find the antiderivatives. Find the antiderivatives.14. In ﬁnding csc (x) dx try the substitution u = cos (x) .

1 16 1 cosh 8x sinh 8x+ 2 x+ Answer: 1 49+4x2 dx = 16. Find Answer: tanh 2x dx = − 1 tanh 2x − 2 +1 4 9. Find Answer: dx. 1 3 1 − 2 sin(2x+3) cos (2x + 3) + C Answer: cosh (3x + 3) dx = C 10. Find Answer: (3 + 2x) C 15. Find x 1 dx = − 48(2x+3)24 + C dx = (3 + 2x) + 5x dx. Find +C cosh2 8x dx. 1 x ln 5 5 dx. 2 19. Find (36 + x2 ) + C (7x + 1) 30 30 Answer: 3. Find √ 1 (4x2 −9) dx. 4.204 1. Find 81+x dx. 1 217 5−2x2 5+2x2 Answer: (7x + 1) 13. Find 1 1+cos 6x dx. Find dx = (3 + 2x) 5 dx. Find √ √ 2x+1 x ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS dx. 1 14 2 arctan 7 x + C Answer: 5 dx = 7. √ 11. tanh2 2x dx. Answer: cosh2 8x dx = C 8. dx. Find 5−2x2 5+2x2 et t+1 +C dx. Find 1 49+4x2 5 dx = (7x + 1) 31 +C Answer: C √ √ dx = −x+ 10 arctan 1 x 10+ 5 3 − x5 2 2 (1 + sin 3x) dx. Find (36+x2 )+ √ √ (36−x2 ) Answer: 2x+1 x (1296−x4 ) dx. √ √ dx = 2 2 x + ln x + C Write this as 2. Answer: (2x + 3) 6. Find Answer: (1 + sin 3x) dx = √ (1+sin 3x) 2 (sin 3x − 1) cos 3x +C 3 − x6 + 9x + C 14. Find 1 4 2 Answer: √ 12 1 2 (4x −9) dx = (4x2 − 9) + C dx. ln 2x + 17. 2 3x ln 2 2 18. ln x + 12. 1 7 7 (2x + 3) −25 −25 dx. Find + 2 3x 3 ln 2 2 1 1+sin(3x) Answer: 81+x dx = +C Answer: 1 1+sin 3x dx = − 3 (tan 3 x+1) 2 +C . 1 11 11 x 3 − x5 5. tet Find (t+1)2 dt Hint: (1+t)et et − (1+t)2 dt (t+1)2 Answer: √ √ (36+x2 )+ (36−x2 ) √ dx = 4 (1296−x ) √ 1 (36−x2 ) dx+ √ 1 (36+x2 ) dx = arcsin 1 x+ 6 = et (t+1) − et (1+t)2 dt. Find ln (−1 + tanh 2x) 1 sin2 (2x+3) Answer: 1 sin2 (2x+3) dx = ln (1 + tanh 2x) + C cosh (3x + 3) dx. sinh (3x + 3) + Answer: 1 1+cos 6x dx = tan 1 x + C 2 dx.

33. Find u 3 18 (6+2x)(6−x) Answer: + C. Find (3x9 + 2) dx 7 1 x(2+ln 2x) Answer: 1 x(2+ln 2x) dx = (3x9 + 2) x 3+8x4 +C ln (2 + ln 2x) + C 30. 1 5 Answer: (3x9 + 2) 5 ln4 x x 21. 23. 3 3 25. Find Answer: (4x3 + 2) dx = (4x3 + 2) x8 3 Answer: x7 ex dx = 1 ex + C 8 28. Find dx = ln5 x + C dx. Answer: 1 24 √ x 3+8x4 dx = 2 2 3x Answer: √sin 7x+cos 7x √ 6 +C 2 7 (sin 7x−cos 7x) dx = 6 arctan (sin 7x − cos 7x) + C csc 4x dx. Answer: x8 = 2 189 29. Find 5 ln4 x x +C dx. dx. Find x2 1 18 205 x7 ex dx. Find x23 2 − 6x12 10 10 dx. dx Answer: √ 1 3 dx (x(3x−5)) Answer: x23 2 − 6x12 dx = 11 1 5184 = 5 6 √ 3 ln √ 2 − 6x √ 12 12 3 x− + (3x2 − 5x) + C 1 − 2376 2 − 6x12 + C 26. PRACTICE PROBLEMS FOR ANTIDERIVATIVES 20. Find √ 1 (x(3x−5)) sin ln 6+2x + C 6−x 34. 31. 27.9. Find 1 x cos(ln 4x) 35. Find dx.14. 18 (6+2x)(6−x) √ 1 (25x2 −9) dx = 1 3 du = cos ln 6+2x 6−x Now restoring the original variables. Find Answer: √ √ 3 6 arctan 1 √ 3(2+x) x Answer: csc 4x dx = √ √ √ 3 x 6 +C dx. Find √sin 7x+cos 7x (sin 7x−cos 7x) 22. this yields 1 arcsec 5 |x| + C. Find √ 1 x (25x2 −9) Answer: arctan 5x 1+25x2 Answer: You could let 3 sec u = 5x so (3 sec u tan u) du = 5 dx and then x dx = arctan2 5x + C cos ln 6+2x 6−x dx = dx. Find dx. 1 4 dx = 1 6 ln |csc 4x − cot 4x| + C arctan 5x 1+25x2 1 9 32. 8 8 8 x2 (4x3 + 2) dx. dx. Find 1 √ 3(2+x) x dx. Find dx. x5 (6−3x2 ) dx. Answer: 1 x cos(ln 4x) Answer: 5 √ x 1 − 15 x4 (6−3x2 ) dx = 8 2 45 x dx = (6 − 3x2 ) − (6 − 3x2 ) + C (6 − 3x2 ) ln (sec (ln 4x) + tan (ln 4x)) + C − 32 45 . 1 10 24.

Find cos3 3x sin 2 3x dx. Then this is √ 1+sin θ (1−sin2 θ) cos θ dθ = (1 + sin θ) dθ = θ − cos θ + C. √ x+C Answer: √ C 44. Now let + (x2 − 4) + C. Find √ 1 (ex +1) dx. Find Answer: 1 dx. Find 1 (x2 +49) 39. In terms of u this is 2 √1 u 1+u2 2 tan θ − 2 ln (sec θ + tan θ) + C. −1−xex ex Answer: 1 e2x +ex Multiply the fraction on the top and √ 2 (x −4) bottom by x + 2 to get dx x+2 Now let x = 2 sec θ so dx = 2 sec θ tan θ dθ and in terms of θ the indeﬁnite integral is √ 2 sec (θ)−1 2 sec θ tan θ dθ = sec θ+1 2 =2 tan2 θ sec(θ) 1+sec θ tan2 θ 1+cos θ dx = + ln (ex + 1) + C dθ sec2 θ dθ−2 sec θ dθ = 38. 1 1 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS x−2 x+2 41. Find x ln2 (6x) dx. Now let x = sin θ so dx = (1−x ) Answer: x ln (3x) dx = 1 x2 ln 3x − 1 x2 + C 2 4 46. 1x e2 √ u2 +1 u 1 u dx = (4 + 9x2 )− (4 + 9x2 ) + C dx. Find √ 1 (x2 −36) 1 (x2 +49) Answer: dx = ln x + (x2 + 49) + √ arctan x √ x(1+x) = arctan2 1+x 1−x dx. dx. cos θ dθ. 40. Now in terms of x this is (x2 − 4)− 2 ln 42. In terms of u this is 2 ln − + C and in terms of x this is √ x 1 (e +1) 2 ln − e− 2 x + C . Find 1 2x 1 2 du. Find Answer: dx. Answer: √ C 45. Answer: 1 2 4x 1 1 x ln2 6x dx = 2 x2 ln2 6x− 2 x2 ln 6x+ +C . Answer: cos3 3x sin 2 3x dx = cos 3x 1 − sin2 3x sin 2 3x 2 = − 21 sin 2 3x + 7 dx 2 9 sin 2 3x + C 3 37. In terms of x this gives arcsin x − (1 − x2 ) + C. Find √ arctan x √ x(1+x) dx. u = tan θ so du = sec2 θ dθ. Then the indeﬁnite integral becomes 2 sec2 θ tan θ sec(θ) √ x2 (4+9x2 ) dθ = 2 csc θ dθ = Answer: √ 1 18 x 2 27 x2 (4+9x2 ) 2 ln |csc θ − cot θ| + C. Find 1 e2x +ex dx. =2 Answer: Let u2 = ex so 2udu = ex dx = au2 dx. ln 3x + √ 43. 1 (x2 −36) dx = ln x + (x2 − 36) + Multiply the fraction on the top and bottom by 1 + x to get √ 1+x 2 dx.206 36. Find x ln (3x) dx.

Answer: e3x sin2 2x dx = 1 e3x − 6 x+C 58. 2 2 Answer: 4x +65x+266 2 (x+2)(x2 +16x+66) dx Answer: cos (ln 7x) dx = x cos (ln 7x) + x sin (ln (7x)) 1 x = 8 ln (x + 2) + √ √ 2 arctan 1 (16 + 2x) 2 + C 4 62.9. Find 4x +65x+266 2 (x+2)(x2 +16x+66) dx. Find +C dx.14. Find Answer: (e4x + 1) dx (e4x + 1)+ 1 arcsinh e2x +C 4 cos (ln 7x) dx. ln 1 + x2 + C Answer: x+4 3 (x+3)2 dx = 3 x+3 dx Answer: arctan 6x dx x3 1 3 = − 2x2 arctan 6x− x −18 arctan 6x+C 3 ln (x + 3) − 59. Answer: x2 sin 8x dx = − 1 x2 8 49. Find (1 − x2 ) + C 57. 2 ln (csc 4x − cot 4x) + C e2x (e4x + 1) dx. x3 ex dx. = x cos (ln 7x) + x sin (ln 7x) − 1 2 Answer: dx = 1 ln (x + 2) − 24 ln x2 − 2x + 4 √ √ 1 + 12 3 arctan 1 (2x − 2) 3 + C 6 cos (ln (7x)) dx and so cos (ln 7x) dx = [x cos (ln 7x) + x sin (ln 7x)] + C . Find arctan 6x x3 x+4 3 (x+3)2 dx. Find 3 x2 207 sin (ln 7x) . Answer: 3x +7x+8 2 (x+2)(x2 +2x+2) dx 53. Find e2x = 1 e2x 4 54. Find − 1 x2 2e Answer: +C Answer: sin (ln 7x) dx = 1 2 x2 sin 8x dx [x sin (ln 7x) − x cos (ln 7x)] + C e5x sin 3x dx. Find 1 cos 8x+ 256 1 cos 8x+ 32 x sin 8x+C 56. Find Answer: e5x sin 3x dx 3 = − 34 e5x cos 3x + 5 5x 34 e arcsin x dx sin 3x + C Answer: arcsin x dx = x arcsin x + 50. = 6 ln (x + 2) + 2 arctan (1 + x) + C 61. dx = 1 2 x2 2x e 2 55. PRACTICE PROBLEMS FOR ANTIDERIVATIVES 47. Find 2 +C 3x +7x+8 2 (x+2)(x2 +2x+2) dx. Find 1 2 3 3x 25 e arctan x dx (cos 2) x − 4 3x 25 e (sin 2) Answer: arctan x dx = x arctan x − 51. Find x e 48. Find sin 4x ln (tan 4x) dx dx 2 x+7 Answer: Integration by parts gives − 1 cos 4x ln (tan 4x) + 4 1 4 = 7 ln (x + 2) − 60. Find 1 x3 +8 1 12 1 x3 +8 dx dx. 7x2 +100x+347 (x+2)(x+7)2 Answer: 7x2 +100x+347 (x+2)(x+7)2 52. Find e3x sin2 2x dx.

Find x2 +4 x6 +64 x2 +4 x6 +64 + dx. 3x+ 1 √ 2 48 + (x+ 3) +1 1 96 1 192 √ Answer: dx = √ √ dx = √ 1 − 96 3 ln x2 − 2 3x + 4 + √ 1 3 + 16 arctan x − √ √ 1 2 96 3 ln x + 2 3x + 4 √ 1 + 16 arctan x + 3 + C √ 1 arctan 1 x− 384 3 ln x2 − 2 3x + 4 2 √ 1 + 192 arctan x − 3 + √ √ 1 2 384 3 ln x + 2 3x + 4 + √ 1 3 +C 192 arctan x + . 2 +1)(x2 +2x+5) 2 3 Answer: 2 (x4+4x+3x +x dx 2 +1)(x2 +2x+5) = 1 2 dx. √ √ x4 −x2 √2+1 x5 1 x8 +1 dx = 16 2 ln x4 +x2 2+1 √ √ + 1 2 arctan x2 2 + 1 8 √ √ + 1 2 arctan x2 2 − 1 + C 8 71. the indeﬁnite integral is 3x2 +12 16 √ 3x+ 1 + 1 − 192 √ 3x+ 1 √ 2 48 x− 3) +1 ( du. Hint: √ x2 + 3x 2 + 9 . There(x+ 3) +1 fore. = u 2u4 +2 2 x5 x8 +1 dx + 192 √ 2 48 . Now use partial fractions.208 63. Find 1 x4 +x2 +1 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS dx. Find x8 +1 dx + 1 . Find x2 (x−3)100 dx. Find ln 1 + x7 + C dx x2 +1 x4 −2x2 +1 dx. Find 1 x4 +81 1 11(3−x)99 + 1 97(3−x)97 +C 67. Find 1 x6 +64 2+1 √ 1 3x 2 − 1 + C √ = ln x − 69. Answer: x2 +1 x4 −2x2 +1 Answer: First factor the denominator. Find dx + 1x + 2 Now you can use partial fractions to ﬁnd 1 x4 +81 1 216 1 108 dx = √ x2 +3x√2+3 + 2 −3x 2+3 x 1 3x Answer: 1−x7 x(1+x7 ) √ √ dx 2 7 2 ln 2 arctan √ 1 + 108 2 arctan 65. 66. the partial fractions decomposition is √ 1 1 − 192 3x+ 48 16 √ 3x2 +12 + (x− 3)2 +1 1 Ax+B Cx+D Ex+F √ √ x2 +4 + (x− 3)2 +1 + (x+ 3)2 +1 Let u = x2 . Find 2 (x4+4x+3x +x dx. 1 2 x4 + 81 = √ x2 − 3x 2 + 9 Answer: ln x2 + 1 + arctan x+ 1 2 ln x2 + 2x + 5 + arctan 1−x7 x(1+x7 ) C 68. Answer: 1 x4 +x2 +1 1 4 Answer: dx = = x2 (x−3)100 dx − 3 49(3−x)98 2 3 ln x2 + x + 1 + √ √ 1 1 6 3 arctan 3 (2x + 1) 3 − 1 ln x2 − x + 1 + 4 √ √ 1 1 6 3 arctan 3 (2x − 1) 3 + C 64. x6 + 64 = x2 + 4 1 x6 +64 dx 1 2(1+x) 1 = − 2(x−1) − 5 +C x− √ 3 2 +1 x+ √ 3 2 x 70. Answer: = and so after wading through much afﬂiction.

Find sin6 (3x) dx. You might try letting u = sinh−1 (x + 2) . x2 −9 77. Find x3 √1 x2 +9 dx. Find 76.14. Answer: √ dx 1 tan(2x) = 5 (x2 +4x+5) x (x +4x+5) x dx = √ √ 2 5 +C (x +4x+5) √ 1 − 5 arctanh 5 (5 + 2x) + 4x + 5) + 2 arcsinh (x + 2) tan 2 tan 2 2 1+2 tanx x + 2 1+2 tanx x 2 2 √ √ tan 1 x 1 2 −2 tan 2 x + 1 2 arctan 2 2 1−2 tan x 4 √ 1 √ √ + 1 2 ln 2 tan x+2 2 tan 2 x+1 + C 4 2 (1+2 tan x) You might try the substitution. PRACTICE PROBLEMS FOR ANTIDERIVATIVES 72. Find Answer: dx √ √ x(2+3 x+ 3 x) Answer: = 6 u(2+3u3 +u2 ) du = x4 1 27x3 dx √ x2 −9 = 2 243x 6 3 ln u− 7 ln (1 + u)− 15 ln 3u2 − 2u + 2 14 √ √ 3 1 − 35 5 arctan 10 (6u − 2) 5 + C (x2 − 9) + (x2 − 9) + C 78. 82. = 3 ln x1/6 − 6 7 ln 1 + x1/6 − − 2 √ 5+C Answer: sin6 (3x) dx = − 1 sin5 x cos x − 2 − 15 cos x sin x + 16 79. Find √ 1 8 74. Find tan5 (2x) dx. Answer: √ √ √(x+3)−√(x−3) dx = (x+3)+ (x−3) 1 2 6x Answer: sin3 (x) cos4 (x) 3 4 dx = − 1 6 (x + 3) (x − 3) − 1 (x + 3) (x − 3)+ 2 √ 3 √ ((x−3)(x+3)) √ ln x + (x2 − 9) +C 2 (x−3) (x+3) 1 sin x 1 sin4 x 3 cos3 x − 3 cos x − 2 1 2 3 sin x cos x − 3 cos x +C 80. You might try letting u = . Find dx √ √ . x(2+3 x+ 3 x) 209 x4 dx √ . Find √ √ √(x+3)−√(x−3) dx. Answer: √ 1 3 dx cos x+3 sin x+4 Answer: x3 √1 x2 +9 = dx = 1 1 6 arctan 12 6 tan 2 x + 6 √ 6+C x 2 1 − 18x2 (x2 + 9)+ 1 √ 3 54 arctanh (x2 +9) Try the substitution. x2 +9 . Find sin3 (x) cos4 (x) 3 5 8 sin x cos x 5 16 x + C 15 1/3 − 2x1/6 + 2 14 ln 3x √ 3 1 1/6 − 35 5 arctan 10 6x 73. Answer: x2 x2 +6x+13 tan4 2x dx = (x2 + 6x + 13)+ + 1 ln 2 + 2 tan2 2x + C 4 dx .9. u = tan (x) . Answer: tan5 (2x) dx = − 1 tan2 2x + 4 81. tan(2x) 9 (x2 + 6x + 13)− 2 7 ln x + 3 + √ 75. Find √ 1 2x x2 √ x2 +6x+13 dx. (x+3)+ (x−3) dx. u = tan +C √ 1 . Find Answer: √ 2 (x2 (x2 + 6x + 13) + C dx. dx cos x+3 sin x+4 .

the length of one of the sides. the constant must equal 5 3 Therefore. dV = A (y) . A (x) = 4 100 − x2 and so V (x) = 400x − 4 x3 + C. 5 Example 9. V (0) = 0. Then the volume of the building between the two parallel planes at height y and height y + ∆y would be approximately ∆y (A (y)) . Therefore.1 Consider a pyramid which sits on a square base of length 500 feet and suppose the pyramid has height 300 feet. would satisfy the diﬀerential equation. y. Find the volume of the resulting solid.210 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 9. The length of one side of this √ surface is 2 100 − x2 because the equation of the circle which bounds the base is x2 + y 2 = 100. 000.3 Find the volume of a sphere of radius R. 000 cubic feet. 2 dV 5 = (300 − y) dy 3 3 x 300−y = 500 300 = 5 3 and so and so V (y) = − 1 500 − 5 y + C. yields dV = A (x) . 3 + 1 3 (500) = 25. the total volume of the pyramid would equal − 1 5 5 500 − 300 3 3 1 5 (500) . Example 9. When this solid is cut with a plane which is perpendicular to the xy plane and the x axis at x. At height.2 The base of a solid is a circle of radius 10 meters in the xy plane. equal A (y) . Since V (0) = 0. To ﬁnd C. Let the area of this intersection. Here A (x) is the area of the surface resulting from the intersection of the plane with the solid.15. the total volume is 3 3 V (10) = 400 (10) − 4 8000 16 000 3 (10) + = 3 3 3 Example 9. the result is a square. You can use diﬀerential equations to compute volumes of other shapes as well. dV = π R2 − y 2 . Find the volume of the pyramid in cubic feet. Therefore. V (−10) = 0 dx as an appropriate diﬀerential equation for this volume. Therefore. called the cross section at height y. dy The volume of the building would be V (h) .15. V (y) . Thus the radius of a the cross section at height y would be R2 − y 2 and so the area of this cross section is π R2 − y 2 . V (−R) = 0 dy . would satisfy 5 x = 3 (300 − y) . use 3 3 V (−10) = 0 and so C = −400 (−10) + 4 (−10) = 8000 .15 Volumes Imagine a building having height h and consider the intersection of the building with a plane which is parallel to the ground at height y. Therefore. Written in terms of diﬀerentials dV = A (y) dy and so the total volume of the building between 0 and y. The sphere is obtained by revolving a disk of radius R about the y axis.15. x. Reasoning as above for the building and letting V (x) denote the volume between −10 and x.

9. The following picture is descriptive of the situation. Therefore. Therefore. y = f (x) y = g(x) c a s xd b d x+h . 2 In this picture the radius of the inner circle will be r and the radius of the outer circle will be r + ∆r while the height of the shell is H. C = π − R + R3 and so 3 3 211 R3 4 + R3 = πR3 3 3 √ Example 9. This method involves the notion of shells. Now since V (−R) = 0. the volume of the shell would be the diﬀerence in the volumes of the two cylinders or π(r + ∆r)2 H − πr2 h = 2πHr (∆r) + πH (∆r) . From the diﬀerential equation and initial condition. V (R) = π R3 − 1 3 (R) 3 +π − The cross sections perpendicular to the x axis for this shape are circles and the cross √ section at x has radius x. Consider the following picture of a circular shell. f (x) = x is revolved about the x axis.15. 2 (9. Find the volume of the resulting shape between x = 0 and x = 8. V (0) = 0 dx and the answer is V (8) . V (x) = π x 2 and so V (8) = π 64 = 32π.4 The graph of the function. VOLUMES 1 and so V (y) = π R2 y − 3 y 3 + C.15.12) Now consider the problem of revolving the region between y = f (x) and y = g (x) for x ∈ [a. 2 There is another way to ﬁnd volumes without using cross sections. A (x) = πx and the appropriate diﬀerential equation and initial value is dV = πx. b] about the line x = c for c < a.

dV = 2π |f (x) − g (x)| (x − c) . V (a) = 0 and this means that to ﬁnd the volume of revolution it suﬃces to solve the initial value problem. This results in a solid which is very nearly a circular shell like the one shown in the previous picture and the approximation gets better as let h decreases to zero. the volume is V (π/4) = 2 (5 cos (π/4) + (π/4) sin (π/4) + 3 sin (π/4) + (π/4) cos (π/4)) π − 10π √ 1 √ = 2 4 2 + π 2 π − 10π 4 .5 Find the volume of the solid formed by revolving the region between y = sin (x) and the x axis for x ∈ [0. However. π] . V (0) = 0. c = 0 and since sin (x) ≥ 0 for x ∈ [0. Example 9. it is not necessary to worry about which is larger. Therefore. V (x) = V (x + h) − V (x) h→0 h 2π |f (x) − g (x)| (x − c) h + π |f (x) − g (x)| h2 = lim h→0 h = 2π |f (x) − g (x)| (x − c) . the volume is V (π) = 2π 2 . dx Then using integration by parts. the initial value problem is dV = 2πx sin (x) . π/4] and so the initial value problem is dV = 2π (x + 4) (cos x − sin x) .6 Find the volume of the solid formed by revolving the region between y = sin x. f (x) or g (x) because it is expressed in terms of the absolute value of their diﬀerence. Therefore. in doing the computations necessary to solve a given problem. x] about the line x = c. V (a) = 0 dx (9. V (0) = 0. Thus V (x + h) − V (x) equals the volume which results from revolving the region between the graphs of f and g which is also between the two vertical lines shown in the above picture. lim (9. [a.13) Also. V (x) = 2π (sin x − x cos x) + C and from the initial condition.14) and the volume of revolution will equal V (b) . In this example. π] about the y axis. dx Using integration by parts V ∈ 2π (x + 4) (cos x − sin x) dx = 2 (5 cos x + x sin x + 3 sin x + x cos x) π + C and the initial condition gives C = −10π. cos x > sin x for x ∈ [0. C = 0. you typically will have to worry about which is larger. Note that in the above formula.212 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS Let V (x) denote the volume of the solid which results from revolving the region between the graphs of f and g above the interval. π/4] about the line x = −4 In this example.15. y = cos x and the x axis for x ∈ [0. Therefore.15. Example 9.

using the method of substitution. Find the volume of this solid. What is the volume of the resulting solid? 9. C = 3 R3 π.15. Find the volume of what is left after the hole has been drilled. Find the volume of this solid. Now ﬁnd the volume of the solid obtained by revolving this ellipse about the y axis. the volume of the cone is (1/3) × A × h. How much material was taken out? 3. Therefore. The equation of an ellipse is x + y = 1. Hint: The cross sections at height y look just like R but shrunk. The bounded region between y = x2 and y = x is revolved about the x axis. A square having each side equal to r in the xy plane is the base of a solid which has the property that cross sections perpendicular to the x axis are equilateral triangles. 2 Argue that the area at height y. Sketch the graph of this in the case where a b a = 2 and b = 3. π/3] is revolved about the line x = −1. Thus this volume satisﬁes the initial value problem dV = 2π 2 R2 − x2 x. and the x axis for x ∈ [1. 213 √ In this case the volume of the sphere is obtained by revolving the region between y = √ R2 − x2 . EXERCISES Example 9. You should get an oval shaped thing which looks like a squashed circle. R] about the y axis. the same formula obtained earlier using the method of cross sections to set up the diﬀerential equation. h2 5. 6. Find the volume of the solid which results. 8. A circle of radius r in the xy plane is the base of a solid which has the property that cross sections perpendicular to the x axis are equilateral triangles. denoted by A (y) is simply A (y) = A (h−y) . dx Thus. What is the volume if it is revolved about the x axis? What is the volume of the solid obtained by revolving it about the line x = −2? 2. Let R be a region in the xy plane of area A and consider the cone formed by ﬁxing a point in space h units above R and taking the union of all lines starting at this point which end in R. Show that under these general conditions. .7 Find the volume of a sphere of radius R using the method of shells. The region between y = 2 + sin 3x and the x axis for x ∈ [0. 2π 2 R2 − x2 x dx = − 4 3 (R2 − x2 ) 3 π+C 4 and from the initial condition.9. 3] is revolved about the y axis. Find the volume of the solid which results.16. The region between y = ln x. A sphere of radius R has a hole drilled through a diameter which is centered at the diameter and of radius r < R. 9.16 Exercises 2 2 1. 7. 4. Show the volume of a right circular cone is (1/3) × area of the base × height. V (x) = − 4 3 (R2 − x2 ) 3 4 π + R3 π 3 4 and so the volume of the sphere is V (R) = 3 R3 π. V (0) = 0. y = − R2 − x2 and the x axis for x ∈ [0.

£ dl £ dy £ £ dx £ £ which depicts a small right triangle attached as shown to the graph of a function. dl = dx Thus l (x) = 1+ d−b c−a 2 2 2 1+ d−b c−a 2 . l (x) which gives the length of this curve on [a. our diﬀerential equation is dl = dx 1+ (r2 x2 = − x2 ) r2 r2 r =√ . letting l denote the arc length function.214 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 9. Thus this diﬀerential equation gives the right answer in the familiar cases but it also can be used to ﬁnd lengths for more general curves than straight lines. What is obtained from the above initial value problem? The equation of the line is f (x) = d−b d−b b+ c−a (x − a) and so f (x) = c−a .17. Then the right answer is given by the Pythagorean theorem or distance formula and is (a − c) + (b − c) . the length of the line is given by d−b c−a 2 1+ (c − a) = (a − c) + (b − c) 2 2 as hoped. l (a) = 0. To see this. y = f (x) for x ∈ [a.1 Find the length of the part of the circle having radius r which is between √ √ √ the points 0. y = f (x) . x] .17 Lengths And Areas Of Surfaces Of Revolution The same techniques can be used to compute lengths of the graph of a function. Therefore. Thus. Therefore. − x2 r2 − x2 . 22 r and 22 r. (x − a) and in particular. d) where a < c. l (c) . this shows the length of the curve joined by the hypotenuse of the right triangle is essentially equal to the length of the hypotenuse. b] . If the triangle is small enough. This deﬁnition gives the right answer for the length of a straight line. consider a straight line through the points (a. b) and (c. Consider the following picture. Example 9. √ √ Here the function is f (x) = r2 − x2 and so f (x) = −x/ r2 − x2 . Here is another familiar example. 2 2 2 2 (dl) = (dx) + (dy) and dividing by (dx) yields dl = dx 1+ dy dx 2 = 1 + f (x) . l (a) = 0 2 as an initial value problem for the function. 22 r .

10) on Page 194. substitution. Using a trig substitution. l (0) = 0. Here f (x) = 2x and so the initial value problem to be solved is dl = 1 + 4x2 . Plugging in x = 0 this gives 0 = C and l (x) = r arcsin x so in r particular. 2 2 4 Note this gives the length of one eighth of the circle and so from this the length of the whole circle should be 2rπ. It only remains r to ﬁnd the constant. l is an antiderivative of this last function.17. 2x = tan u so dx = tution and using (9. LENGTHS AND AREAS OF SURFACES OF REVOLUTION 215 Therefore. T (x) 0 T (x) sin θ T q θ E T (x) cos θ T ' 0q ρl(x)g c . 1 + 4x2 dx = 1 2 sec2 u du. x = r sin θ. Use the trig.17. Here is another example Example 9.9. the length of the desired arc is √ √ 2 2 π l r = r arcsin =r . Therefore. making this substi- 1 sec3 u du 2 1 1 = (tan u) (sec u) + ln |sec u + tan u| + C 4 4 1 1 (2x) 1 + 4x2 + ln 2x + 1 + 4x2 + C = 4 4 and since l (0) = 0 it must be the case that C = 0 and so the desired length is √ 1 1√ 5 + ln 2 + 5 l (1) = 2 4 Example 9.3 What is the equation of a hanging chain? Consider the following picture of a portion of this chain. it follows dx = r cos θdθ and so r 1 √ dx = r cos θ dθ 2 − x2 r 1 − sin2 θ = r dθ = rθ + C Hence changing back to the variable x it follows l (x) = r arcsin x + C. dx Thus.17. getting the exact answer depends on ﬁnding 1 + 4x2 dx.2 Find the length of the graph of y = x2 between x = 0 and x = 1.

this yields y (x) = cl (x) . In particular. If this chain does not move. T (x) cos θ = T0 . as shown. The problem of ﬁnding the surface area of a solid of revolution is closely related to that of ﬁnding the length of a graph.216 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS In this picture. dividing these yields ≡c sin θ = l (x) ρg/T0 . y (x) = cl (x) = c 1 + y (x) and so y (x) 1 + y (x) 2 2 = c. Note these curves result from an assumption the only forces acting on the chain are as shown. √ by (8. T (x) and T0 represent the magnitude of the tension in the chain at x and at 0 respectively. Therefore. sinh−1 (y (x)) = cx + d and so y (x) = sinh (cx + d) . Therefore. cos θ Now letting y (x) denote the y coordinate of the hanging chain corresponding to x. then all these forces acting on it must balance. T (x) sin θ = l (x) ρg. combining the constants of integration. Let the bottom of the chain be at the origin as shown. Now diﬀerentiating both sides of the diﬀerential equation. sin θ = tan θ = y (x) . . Curves of this sort are called catenaries. ρ denotes the density of the chain which is assumed to be constant and g is the acceleration due to gravity. First consider the following picture of the frustrum of a y (x) = Therefore. Change the variable in the antiderivative letting u = z (x) and this yields z (x) √ dx 1 + z2 = = du = sinh−1 (u) + C 1 + u2 sinh−1 (z (x)) + C. 1 cosh (cx + d) + k c where d and k are some constants and c = ρg/T0 . cos θ Therefore.9) on Page 160. √1+z2 dx = cx + d. Let z (x) = y (x) so the above diﬀerential equation becomes z (x) √ = c. 1 + z2 z (x) Therefore.

2π (l + l1 ) l + l1 Therefore. In this picture.2 on Page 66. by Theorem 3. The wet paint would make the following shape.9.17. the frustrum of the cone is the left part which has an l next to it and the lateral surface area is this part of the area of the cone. LENGTHS AND AREAS OF SURFACES OF REVOLUTION 217 cone in which it is desired to ﬁnd the lateral surface area. imagine painting the sides and rolling the shape on the ﬂoor for exactly one revolution. rr rl rr rr rr RE ¨¨ ¨ ¨¨ ¨ l r1 rr rE r ¨ ¨¨ ¨¨ ¨¨ ¨¨ rr To do this. Both of these have the same central angle equal to 2πR 2πR 2π = . wet paint on ﬂoor 2πr l1 2πR l What would be the area of this wet paint? Its area would be the diﬀerence between the areas of the two sectors shown. one having radius l1 and the other having radius l + l1 . this area is l + 2l1 πR 2 πR − l1 = πRl (l + l1 ) (l + l1 ) l + l1 (l + l1 ) 2 The view from the side is .8.

y = r for x ∈ [a. [a. Therefore. this yields the following initial value problem for A which can be used to ﬁnd the area of a surface of revolution.4 Find the surface area of the surface obtained by revolving the function. l1 = lr/ (R − r) .17. deﬁned on an interval. Now consider a function. 1 A (x + h) − A (x) ≈ 2π h h h2 + (f (x + h) − f (x)) 2 f (x + h) + f (x) 2 where ≈ denotes that these are close to being equal and the approximation gets increasingly good as h → 0. b] and suppose it is desired to ﬁnd the area of the surface which results when the graph of this function is revolved about the x axis. Then from the above formula for the area of a frustrum. b] about the x axis.218 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS R−r rr l rr rr l1 r rr r ¨¨ r ¨¨ ¨¨ ¨ ¨ ¨¨ and so by similar triangles. Consider the following picture of a piece of this graph. y = f (x) x x+h x axis Let A (x) denote the area which results from revolving the graph of the function restricted to [a. f. Of course this is just the cylinder of radius r and height b − a so this area should equal 2πr (b − a) . substituting this into the above. A (x) = 2πf (x) 1 + f (x) . (Imagine painting it and rolling it on the ﬂoor and then taking the area of the rectangle which results. and using A (a) = 0. the area of this frustrum is πRl l+2 l+ lr R−r lr R−r = πl (R + r) = 2πl R+r 2 . A (a) = 0. 2 Example 9.) . rewriting this a little yields A (x + h) − A (x) ≈ 2π h 1+ f (x + h) − f (x) h 2 f (x + h) + f (x) 2 Therefore. Therefore. taking the limit as h → 0. x] .

y = x1/2 − x 3 is revolved about the x axis. Find the length of y = cosh (x) for x ∈ [0. ﬁnd the length of y = course your answer should depend on a. EXERCISES Using the above initial value problem. the chain was horizontal. π/4] . Find the area of the surface of revolution. 3] . 9. √ 8. 4. 9. r] and it is to be revolved about the x axis. For a a positive real number. 219 The solution is A (x) = 2πr (x − a) . √ Here the function involved is f (x) = r2 − x2 for x ∈ [−r.17. The graph of the function. Of 6.9. Find the area of the surface of revolution if x ∈ [0.5 Find the surface area of a sphere of radius r. In this case −x f (x) = √ r2 − x2 and so the initial value problem is of the form =r A (x) = 2π r2 − x2 1+ x2 . 3/2 .18. 11. 1] . The graph of the function. Find the area of the surface of revolution. The graph of the function. y = x for x ∈ [0. 3. solve A (x) = 2πr 1 + 02 . 5. Can you modify the above derivation for the hanging chain to show that even in this case the chain will be in the form of a catenary? 2. In the hanging chain problem the picture and the derivation involved an assumption that at its lowest point. y = sinh x for x ∈ [0. Example 9. y = x2 for x ∈ [0. the surface area equals A (r) = 2πr2 + 2πr2 = 4πr2 . 1] is revolved about the y axis. y = cosh x for x ∈ [0. simplifying the above yields A (x) = 2πr and so A (x) = 2πrx+C and since A (−r) = 0. Find the area of the surface of revolution. Imagine lifting the end higher and higher and you will see this might not be the case in general. Find the length of the graph of y = x1/2 − x3/2 3 for x ∈ [0. A (b) = 2πr (b − a) as expected. 10. 1] is revolved about the x axis. The graph of the function. 1] is revolved about the x axis. Find the area of the surface of revolution. The giant arch in St. Hint: Switch x and y and then use the previous problem. The graph of the function. 1] is revolved about the x axis. Why? 7. A (−r) = 0 r2 − x2 Thus. Find the length of the graph of y = ln (cos x) for x ∈ [0.18 Exercises 1. 2] . ax2 2 − 1 4a ln x for x ∈ [1. Therefore. 2] . Therefore. A (a) = 0. Louis is in the form of an inverted catenary. it follows that C = 2πr2 .

This proves the following theorem. x2 a2 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS + y2 b2 = 1 is revolved about the x axis. then it must be of the form Ce−at for some constant.19 The Equation y + ay = 0 The homogeneous ﬁrst order constant coeﬃcient linear diﬀerential equation is a diﬀerential equation of the form y + ay = 0. this indicates that to ﬁnd the solutions of the equation. since the derivative of the function t → eat y (t) equals zero. C. dx g (y) The reason these are called separable is that if you formally cross multiply. Consequently. Generalizations of it include the entire subject of linear diﬀerential equations and even many of the most important partial diﬀerential equations occurring in applications. Now show that if G (y) = g (y) and F (x) = f (x) . revolution. .220 12. showing this yields all solutions. then x → y (x) solves the separable diﬀerential equation. Here is how to ﬁnd the solutions to this equation. yeat = C and so y (t) = Ce−at . it suﬃces to take G (y) ∈ g (y) dy. dt Therefore. y2 15. Find the solution to the initial value problem. In short. y (0) = 0. F (x) − G (y) = c for c an arbitrary constant is regarded as an acceptable description of the solution of the diﬀerential equation. The ellipse. The x variables are on one side and the y variables are on the other. then if the equation. Find the area of the surface of 13. Multiply both sides of the equation by eat . it follows this function must equal some constant. y + ay = 0. Separable diﬀerential equations are those which can be written in the form dy f (x) = . Then use the product and chain rules to verify that eat (y + ay) = d at e y = 0. y (0) = 1. F (x) − G (y) = c speciﬁes y as a diﬀerentiable function of x. and obtain the solutions as F (x) − G (y) = c where c is a constant. Hint: Use implicit diﬀerentiation or the chain rule. Find the solution to the initial value problem. This shows that if there is a solution of the equation. y = 1 + y 2 . Such an equation was encountered in the hanging chain problem. You should verify that every function of the form. For this reason the expression. F (x) ∈ f (x) dx. 14. y (t) = Ce−at is a solution of the above diﬀerential equation.15) It is arguably the most important diﬀerential equation in existence. (9. y = x . g (y) dy = f (x) dx and the variables are “separated”. 9. C. √ √ dz Recall z (x) / 1 + z 2 = c so dx = c 1 + z 2 .

y (t) = Ce−at solves the diﬀerential equation.06. From Theorem 9. There are ten grams of a radioactive substance which is allowed to decay for ﬁve years. 7. r is the interest rate per year. letting A (t) denote the amount of the radioactive substance at time t. The rate of decay is proportional to the amount present. In Example 9. A substance is decaying at the rate of . Hint: Use the given information to ﬁnd k 2 and then use Problem 4. A sample of 4 grams of a radioactive substance has decayed to 2 grams in 5 days. 6.2 the half life is the time it takes for half of the original amount of the substance to decay. Find the half life of the substance. Letting t = 0.20. Give your answer in terms of logarithms. 5. it follows A0 = A (0) = C. A (t) = −k 2 A (t) . The half live of a substance is 40 years. and A0 is the initial amount. 9. 10. A (t) = rA (t) .19.1 The solutions to the equation. Prove Theorem 9. Hint: You should have ln x > 0.9.1 by using Problem 13 on Page 220.5 grams of the substance left. 9.0 1 per year. A (t) satisﬁes the following initial value problem.19. Find a formula for the half life assuming you know k 2 . If $100 is placed in an account which is compounded continuously at 6% per year. What is the solution to the initial value problem? Write the diﬀerential equation as A (t)+ k 2 A (t) = 0. For x suﬃciently large. y + ay = 0. Exercise 9. Find A (t) explicitly.19.2 Radioactive substances decay in the following way. the function. This means it satisﬁes the diﬀerential equation. How big does x need to be 1. C. Find the time it takes for the population to double. Thus A (t) = A0 exp −k 2 t .19. y + ay = 0 consist of all functions of the form. A (0) = A0 where A0 is the initial amount of the substance.20 Exercises ln x Find f (x) . They do so by letting the amount in the account satisfy the initial value problem. Sometimes banks compound interest continuously. r = . 4. At the end of the ﬁve years there are 9. A = . Find the half life of the substance. 8. 2. In other words. Find the half life of the substance. A population is growing at the rate of 11% per year. Verify that for any constant. how many years will it be before there is $200 in the account? Hint: In this case. A (0) = A0 where here A (t) is the amount at time t measured in years. Find the rate of decay of the substance. 3.19.1 the solution is A (t) = Ce−k 2 t and it only remains to ﬁnd C. let f (x) = (ln x) in order for this to make sense. . Ce−at where C is some constant.11A. EXERCISES 221 Theorem 9.

Given three measurements of population at three equally spaced times. What percentage of the original carbon 14 should it contain? 16. The following picture is what you would see. The reason for this is that when the population gets large enough. the model is far too simplistic for human population growth.) 15. assuming this has been going on for billions of years. One model for population growth is to assume the rate of growth is proportional to both the population and the diﬀerence between some maximum sustainable population and the population.21 Force On A Dam And Work Of A Pump Imagine you are a ﬁsh swimming in a lake behind a dam and you are interested in the total force acting on the dam. How many were present in the original sample? 13. there begin to be insuﬃcient resources. see Problem 5. Conclude A (t) = D − C/k 2 e−k t . Carbon 14 is a radioactive isotope of carbon and it is produced at a more or less constant rate in the earth’s atmosphere by radiation from the sun. the time since the death of the animal or plant can be estimated. 9. What happens to this over a long period of time? 14. if A is the amount of Carbon 14 in the atmosphere. show how to predict the maximum sustainable population1 . how long has it been since the mummy was alive? (To see how to come up with this ﬁgure for the half life. A certain tree stump is known to be 4000 years old. it is reasonable to assume the amount of Carbon 14 in the atmosphere is essentially constant over time. there are 3000. Thus dA = kA (M − A) . When it dies. the amount of carbon 14 can be measured and on this basis. Given the half life of carbon 14 is 5730 years and the amount of carbon 14 in a mummy is .3 what it was at the time of death. Show this implies that. 1 This has been done with the earth’s population and the maximum sustainable population has been exceeded. After some time. .222 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS 11. You do experiments and take measurements over a smaller period of time. t. However. it quits absorbing carbon and the carbon 14 begins to decay. it would work somewhat better for predicting the growth of things like bacteria. The method of carbon dating is based on the result of Problem 13. A sample of one ounce of water from a water supply is cultured in a Petri dish. Now diﬀerentiate both 2 2 sides and show A (t) = Ce−k t . It also decays at a rate proportional to the amount present. $1000 is deposited in an account that earns interest at the rate of 5% per year compounded continuously. Therefore. The half life of carbon 14 is known to be 5730 years. where k and M are dt positive constants. After four hours. Hint: By assumption.0 bacteria present and after six hours there are 7000 bacteria present. When an animal or plant is alive it absorbs carbon from the atmosphere and so when it is living it has a known percentage of carbon 14. dA = −k 2 A + r where k 2 is the constant of dt decay described above and r is the constant rate of production. How much will be in the account after 10 years? 12. Show this is a separable diﬀerential equation and its solutions are of the form M A (t) = 1 + CM e−kM t where C is a constant.

21. Then the total force acting on this base area would be 62. FORCE ON A DAM AND WORK OF A PUMP 223 ' slice of area of height dy The reason you would be interested in that long thin slice of area having essentially the same depth. In tons this is 2. An initial value problem for the work done to pump the water down to a depth of y feet would be dW = (y + H) 62. Here 62. Therefore. say at y feet is because the pressure in the water at that depth is constant and equals 62. F (0) = 0. The total weight of a slice of water having thickness dy at this depth is 62. 3 2 . the work done is dW = (y + H) 62. W (0) = 0.1 Suppose the width of a dam at depth y feet equals L (y) = 1000 − y and its depth is 500 feet. 2 Later on a nice result on hydrostatic pressure will be presented which will verify this assertion. 750.5A (y) dy. 83y 3 + 31250y 2 . the total force the water exerts on the long thin slice is dF = 62.5y pounds per square foot2 .5 is the weight in pounds of a cubic foot of water.9. Suppose also that at depth y below the surface.5yL (y) dy where L (y) denotes the length of the slice. 000. The total force on the dam would be F (500) = −20. 208. F (0) = 0 dy which is F (y) = −20. this is obtained as the solution to the initial value problem dF = 62.5A (y) .5 × y pounds. Find the total force in pounds exerted on the dam.5A (y) dy and the pump needs to lift this weight a distance of y + H feet.0 pounds. Therefore. the area of a cross section having constant depth is A (y) . dy The reason for the initial condition is that the pump has done no work to pump no water.5 you would do the same things but replace the number. dy Example 9. That is a lot of force. Therefore.21. If the weight of the ﬂuid per cubic foot were diﬀerent than 62.5y (1000 − y) . think of a column of water of height y having base area equal to 1 square foot. 375. 83 (500) + 31250 (500) = 5. From the above. the total force on the dam up to depth y is obtained as a solution to the initial value problem. 604 . dF = 62.5yL (y) . Now suppose you are pumping water from a tank of depth d to a height of H feet above the top of the water in the tank. If you like.

If the depth of the hole is 20 feet and the radius of the hole is 10 feet.22 Exercises 1.2 A spherical storage tank sitting on the ground having radius 30 feet is half ﬁlled with a ﬂuid which weighs 50 pounds per cubic foot. A cylindrical storage tank having radius 10 feet is ﬁlled with water to a depth of 20 feet. how much work is needed to pump the water to a height of 10 feet above the ground? 4. A silo is 10 feet in diameter and at a height of 30 feet there is a hemispherical top. How much work is done to pump this ﬂuid to a height of 100 feet? Letting r denote the radius of a cross section y feet below the level of the ﬂuid. what is the total force the water exerts on the sides of the tank? Hint: The pressure in the water at depth y is 62. suppose the person on the top maintains a constant velocity of 1 foot per second. This tank is lying on its side on the ground. 3. How much work was done in ﬁlling it to the very top? 8. 000π 9. W (0) = 0. The bucket is full at the bottom but leaks at the rate of . dy Therefore.5y pounds per square foot. A cylindrical storage tank having radius 20 feet and length 40 feet is ﬁlled with a ﬂuid which weighs 50 pounds per cubic foot. Find the work needed to pump the ﬂuid to a height of 50 feet. A dam 500 feet high has a width at depth y equal to 4000 − 2y feet. How much work does the person on the top of the building do in lifting the bucket to the top? Will the bucket be empty when it reaches the top? You can use Newton’s law that force equals mass times acceleration. 125. Suppose the tank in the above problem is ﬁlled to a depth of 8 feet.21. The silage weighs 10 pounds per cubic foot. Therefore. It follows the area of the cross section at depth y is π 900 − y 2 . Here H = 70 and so the initial value problem to solve is dW = (y + 70) 50π 900 − y 2 . r2 + y 2 = 900. r = 900 − y 2 . 6. . If the storage tank stands upright on its circular base. 2. Find the total force acting on the ends of the tank by the ﬂuid. W (y) = 50π − 1 y 4 − 4 equals 70 3 3 y + 450y 2 + 63 000y and the total work in foot pounds 1 70 4 3 2 W (30) = 50π − (30) − (30) + 450 (30) + 63 000 (30) 4 3 = 73 .224 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS Example 9.1 cubic feet per second. A conical hole is ﬁlled with water. What is the total force on the dam if it is ﬁlled? 5. In the situation of the above problem. When the bucket is ﬁlled with water it weighs 30 pounds and when empty it weighs 2 pounds and the person on top of a 100 foot building exerts a constant force of 40 pounds. How much work does he do and is the bucket empty when it reaches the top? 7.

A spherical storage tank having radius 10 feet is ﬁlled with water. EXERCISES 225 9. What is the total force the water exerts on the storage tank? Hint: The pressure in the water at depth y is 62. This is a surface of revolution and you know how to deal with these.22.5y consider the area corresponding to a slice at height y.9. .

226 ANTIDERIVATIVES AND DIFFERENTIAL EQUATIONS .

y (x) = f (x) . A (0) = 0. but what is the solution? Does it even exist? More generally. y (0) = y0 ? These questions are typical of mathematics. 227 . consider the problem of ﬁnding the 2 area between y = ex and the x axis for x ∈ [0. It is related to the fundamental mathematical question of existence. · · ·. The two questions are often very diﬀerent and one can have a good understanding of one without having any idea how to go about considering the other. The main diﬀerence is that here the triangles will be replaced with rectangles. As an illustration. The integral is needed to remove some of the mathematical loose ends and also to enable the study of more general problems involving diﬀerential equations. There are usually two aspects to a mathematical concept. 1 Archimedes 287-212 B. The story is told about how he determined that a gold smith had cheated the king by giving him a crown which was not solid gold as had been claimed. One is the question of existence and the other is how to ﬁnd that which exists. It will also be useful for formulating other physical models. 2 10. He also made fundamental contributions to physics.C.C. there is a profound diﬀerence between what is about to be presented and what has just been done. As pointed out earlier. the area is obtained as a solution to the initial value problem. {x0 . However.1 Upper And Lower Sums The Riemann integral pertains to bounded functions which are deﬁned on a bounded interval. b] be a closed interval. Let [a. f does there exist a solution to the initial value problem. Archimedes1 was ﬁnding areas of various curved shapes about 250 B.The Integral The integral originated in attempts to ﬁnd areas of various shapes and the ideas involved in ﬁnding integrals are much older than the ideas related to ﬁnding derivatives. In the preceding chapter the only thing considered was the second question. b]. A set of points in [a. However.1 there is at most one solution. So what is the solution to this initial value problem? By Theorem 9. You may be wondering what the fuss is about. Areas have already been found as solutions of diﬀerential equations. both are absolutely essential. In fact. xn } is a partition if a = x0 < x1 < · · · < xn = b. He did this by ﬁnding the amount of water displaced by the crown and comparing with the amount of water it should have displaced if it had been solid gold. 1] . The technique used for ﬁnding the area of a circular segment presented early in the book was essentially that employed by Archimedes and contains the essential ideas for the integral. found areas of curved regions by stuﬃng them with simple shapes which he knew the area of and taking a limit.1. A (x) = ex . for which functions.

is added between xk .1 If P ⊆ Q then U (f. In the following picture. labeled z has been added in. b] . P ) . y. P ) ≡ i=1 Mi (f ) ∆xi and L (f.1. are well deﬁned real numbers because f is assumed to be bounded and R is complete. the sum of the areas of the rectangles in the picture on the left is a lower sum for the function in the picture and the sum of the areas of the rectangles in the picture on the right is an upper sum for the same function which uses the same partition.228 THE INTEGRAL Such partitions are denoted by P or Q. y = f (x) x0 x1 x2 x3 x0 x1 x2 x3 What happens when you add in more points in a partition? The following pictures illustrate in the context of the above example. let Mi (f ) ≡ sup{f (x) : x ∈ [xi−1 . Q) . Then deﬁne upper and lower sums as n n U (f. y. In general this is the way it works and this is shown in the following lemma. xi ]} is bounded above and below. For f a bounded function deﬁned on [a. xk+1 . In this example a single additional point. · · ·. Thus let P = {x0 . xi ]}. Proof: This is veriﬁed by adding in one point at a time. and L (f. xk . Thus exactly one point. · · ·. Mi (f ) and mi (f ) . Lemma 10. The numbers. P ) ≤ L (f. Also let ∆xi ≡ xi − xi−1 . y = f (x) x0 x1 x2 z x3 x0 x1 x2 z x3 Note how the lower sum got larger by the amount of the area in the shaded rectangle and the upper sum got smaller by the amount in the rectangle shaded by dots. Thus the set S = {f (x) : x ∈ [xi−1 . P ) ≡ i=1 mi (f ) ∆xi respectively. xi ]}. xn } and let Q = {x0 . Q) ≤ U (f. xn }. · · ·. mi (f ) ≡ inf{f (x) : x ∈ [xi−1 .

a b f (x) dx ≡ I = I. xk+1 ] and U (f. Now the term in the upper sum which corresponds to the interval [xk . Q) . then L (f. since Q is arbitrary. Q) is sup {f (x) : x ∈ [xk . P ) where P is a partition}. This proves the ﬁrst part of the lemma pertaining to upper sums because if Q ⊇ P.1.3 I ≡ inf{U (f. xk+1 ]} (y − xk ) = sup {f (x) : x ∈ [xk . xk+1 ]} (xk+1 − xk ) and the term which corresponds to the interval [xk .1. xk+1 ]} ≥ max (M1 . Q) where Q is a partition} I ≡ sup{L (f. Q) where Q is a partition} ≡ I where the inequality holds because it was just shown that I is a lower bound to the set of all upper sums and so it is no larger than the greatest lower bound of this set. P ) is (10.1. P ∪ Q) ≤ U (f. The second part is similar and is left as an exercise. Proof: By Lemma 10.2) is no larger than sup {f (x) : x ∈ [xk .5 A bounded function f is Riemann integrable.1. Proof: From Lemma 10. UPPER AND LOWER SUMS 229 and xk+1 . . I = sup{L (f.2 If P and Q are two partitions. xk+1 ] in U (f. Q) is an upper bound to the set of all lower sums and so it is no smaller than the least upper bound. P ) where P is a partition} ≤ inf{U (f. This proves the theorem. written as f ∈ R ([a. Q) . [xk . xk+1 ]} (xk+1 − y) + sup {f (x) : x ∈ [xk . Theorem 10. Therefore.1. one can obtain Q from P by adding in one point at a time and each time a point is added.2.2) (10. Q) because U (f. Note that I and I are well deﬁned real numbers.1) sup {f (x) : x ∈ [xk .1.3) All the other terms in the two sums coincide. Lemma 10. y]} (y − xk ) + sup {f (x) : x ∈ [y. the corresponding upper sum either gets smaller or stays the same. b]) if I=I and in this case. Now sup {f (x) : x ∈ [xk . M2 ) and so the expression in (10. the term corresponding to the interval. P ∪ Q) ≤ U (f. Deﬁnition 10. xk+1 ]} (xk+1 − y) ≡ M1 (y − xk ) + M2 (xk+1 − y) (10. xk+1 ]} (xk+1 − xk ) . Deﬁnition 10.1. P ) ≤ L (f. P ) where P is a partition} ≤ U (f.10. P ) ≤ U (f. I = sup{L (f. xk+1 ] in U (f. L (f.1. P ) .4 I ≤ I.

P ) ≤ ε.3. Then since I = I. P ) . P ) < ε. .5).7 A bounded function f is Riemann integrable if and only if for all ε > 0. Q) < I + ε/2.1.4) holds. P ) and 2 L (f. P ) < I + ε/2 − (I − ε/2) = ε. there exists a partition P such that U (f.6 Let S be a nonempty set and suppose sup (S) exists. Then let P and Q be two partitions such that U (f. let f (x) ≡ 1 if x ∈ Q 0 if x ∈ R \ Q (10. S ∩ (sup (S) − δ. then for every δ > 0. Then for given ε and partition P corresponding to ε I − I ≤ U (f. Verify that for f given in (10. 10. For example. Recall Proposition 2. 2. 2 . Proposition 10. P ) − L (f. in words. 0.230 THE INTEGRAL Thus. Find U (f. Q ∪ P ) − L (f.1.1.4) Proof: First assume f is Riemann integrable.1 about lower sums. the lower sums on the interval [0. If inf (S) exists. 1] all upper sums for f equal 1 while all lower sums for f equal 0. (10.2 Exercises 1. the Riemann integral is the unique number which lies between all upper sums and all lower sums if there is such a unique number. Since ε is arbitrary. sup (S)] = ∅. Let f (x) = 1 + x2 for x ∈ [−1. inf (S) + δ) = ∅. Then for every δ > 0. − 3 . Q) − L (f. 1 . b] = [0. U (f. Prove the second half of Lemma 10. Theorem 10. 1] are all equal to zero while the upper sums are all equal to one. Now suppose that for all ε > 0 there exists a partition such that (10.14. P ) − L (f. L (f. P ) > I − ε/2. 3] and let P = −1. this shows I = I and this proves the theorem. S ∩ [inf (S) . P ∪ Q) ≤ U (f. 1. It is stated here for ease of reference. The condition described in the theorem is called the Riemann criterion . Therefore the Riemann criterion is violated for ε = 1/2. 1 3.5) Then if [a. This proposition implies the following theorem which is used to determine the question of Riemann integrability. Not all bounded functions are Riemann integrable.

d2 ] → R satisfy. b]). · · ·.3. h.3. d1 ] and g ([a. b]) and f is changed at ﬁnitely many points. Find upper and lower sums for the function. 1 1 . Let H : [c1 . f (x) = 4 2 4 using this partition. g be bounded functions and let f ([a. If f ∈ R ([a. This is not the most general theorem which will relate to this question but it will be enough for the needs of this book. g)) − mi (H ◦ (f. n f (xk ) (xk − xk−1 ) . |Mi (H ◦ (f.3 Functions Of Riemann Integrable Functions It is often necessary to consider functions of Riemann integrable functions and a natural question is whether these are Riemann integrable. g))| ≤ K [|Mi (f ) − mi (f )| + |Mi (g) − mi (g)|] . b]) . Proof: In the following claim.10. d1 ] × [c2 . Mi (h) and mi (h) have the meanings assigned above with respect to some partition of [a. This is shown in more advanced courses when the Lebesgue integral is studied. xk+1 ] . show the new function is also in R ([a. b]) ⊆ [c2 . 10. there exists a partition. regardless of whether x is rational. 1 1 . What does this tell you about ln (2)? 1 x 6. The following theorem gives a partial answer to this question.5).1 Let f. k=1 f (zk ) (xk − xk−1 ) . Deﬁne a “left sum” as n f (xk−1 ) (xk − xk−1 ) k=1 and a “right sum”. xn } such that for any zk ∈ [xk . |H (a1 . . Then if f. 1 3 . This illustrates why this method of deﬁning the integral in terms of left and right sums is total nonsense. Show that if f ∈ R ([a. b]) . k=1 Also suppose that all partitions have the property that xk − xk−1 equals a constant. b]) it follows that H ◦ (f. 2 . b2 )| ≤ K [|a1 − a2 | + |b1 − b2 |] for some constant K. Claim: The following inequality holds. g ∈ R ([a. g) ∈ R ([a. and deﬁne the integral to be the number these right and left sums get close to as n gets larger and larger. b] for the function. Let P = 1. d2 ] . 5. It turns out that the correct answer should always equal zero for that function. {x0 . b n f (x) dx − a n k=1 f (zk ) (xk − xk−1 ) < ε This sum. 7. x x Show that for f given in (10. b]) ⊆ [c1 . is called a Riemann sum and this exercise shows that the integral can always be approximated by a Riemann sum. FUNCTIONS OF RIEMANN INTEGRABLE FUNCTIONS 231 4. 0 f (t) dt = 1 if x is rational and 0 f (t) dt = 0 if x is irrational. Theorem 10. b1 ) − H (a2 . (b − a) /n so the points in the partition are equally spaced.

g (x1 )) − H (f (x2 ) . g (x2 ))| < 2η + K [|f (x1 ) − f (x2 )| + |g (x1 ) − g (x2 )|] ≤ 2η + K [|Mi (f ) − mi (f )| + |Mi (g) − mi (g)|] . to see that |f | is Riemann integrable. The following theorem gives an example of many functions which are Riemann integrable. i = 0. This theorem implies that if f. there exist x1 . along with inﬁnitely many other such continuous combinations of Riemann integrable functions. Then f ∈ R ([a. this proves the claim. this shows H ◦ (f. xi ] be such that H (f (x1 ) .3. Clearly this function satisﬁes the conditions of the above theorem and so |f | = H (f. g)) − mi (H ◦ (f. n b−a n . g))| < 2η + |H (f (x1 ) . Proof: Let ε > 0 be given and let xi = a + i Then since f is increasing. P ) − L (f. g)) . Theorem 10. . Now continuing with the proof of the theorem. x2 ∈ [xi−1 . f ) ∈ R ([a. g (x1 )) + η > Mi (H ◦ (f. g (x2 )) − η < mi (H ◦ (f. then so is af +bg. g) is Riemann integrable as claimed. let H (a. 2K n (Mi (g) − mi (g)) ∆xi < i=1 ε . b]. The proof for decreasing f is similar. Thus the Riemann criterion is satisﬁed and so the function is Riemann integrable. g))) ∆xi i=1 n < i=1 K [|Mi (f ) − mi (f )| + |Mi (g) − mi (g)|] ∆xi < ε.232 THE INTEGRAL Proof of the claim: By the above proposition. g are Riemann integrable. g)) − mi (H ◦ (f. n. let P be such that n (Mi (f ) − mi (f )) ∆xi < i=1 ε . Since η > 0 is arbitrary.2 Let f : [a. f 2 . |f | . b]) as claimed. b] → R be either increasing or decreasing on [a. U (f. For example. and H (f (x2 ) . g)) . n (Mi (H ◦ (f. g) satisﬁes the Riemann criterion and hence H ◦ (f. 2K Then from the claim. Then |Mi (H ◦ (f. Since ε > 0 is arbitrary. P ) = i=1 (f (xi ) − f (xi−1 )) b−a n b−a n <ε = (f (b) − f (a)) whenever n is large enough. This proves the theorem. · · ·. b) = |a| . b]) .

then Mi − mi < b−a .13. although this is harder. Theorem 10. 10.4 Properties Of The Integral The integral has many important algebraic properties. Proof: By Corollary 5. K.4. it is enough to assume φ is continuous.3. This proves the theorem. b]) . Lemma 10.10. Let P ≡ {x0 . such that for all x.4 Suppose f : [a. b−a By the Riemann criterion. Then −x ≤ sup (−S) and so x ≥ − sup (−S) . P ) − L (f.7) Proof: Consider (10. − inf (S) is an upper bound for −S and so − inf (S) ≥ sup (−S) .7) is similar and is left as an exercise. This proves the corollary. b] → R is continuous. Then x ∈ S and so x ≥ inf (S) which implies −x ≤ − inf (S) . In particular. Formula (10. In fact. the above lemma implies that for Mi (f ) and mi (f ) deﬁned above Mi (−f ) = −mi (f ) . Therefore. ε if ε > 0 is given. |φ (x) − φ (y)| < K |x − y| . Now let −x ∈ −S. This implies sup (−S) ≥ − inf (S) . − sup (−S) ≤ inf (S) . First here is a simple lemma. y.4. . · · ·. b] → R be Lipschitz continuous. This is the content of the next theorem which is where the diﬃcult theorems about continuity and uniform continuity are used. b) ≡ φ (a). b]) then −f ∈ R ([a.2 If f ∈ R ([a. there exists a δ > 0 such that if |xi − xi−1 | < δ. sup (−S) = − inf (S) (10. f is Riemann integrable.2. Then f ∈ R ([a. Recall that a function.4. Then if −S ≡ {−x : x ∈ S} . Then by Theorem 10. If follows that − sup (−S) is a lower bound for S and therefore.3 Let [a. f ) = φ ◦ f = φ is also Riemann integrable. b]) .3.5 on Page 114. Let x ∈ S. Then n U (f. and mi (−f ) = −Mi (f ) . P ) < i=1 (Mi − mi ) (xi − xi−1 ) < ε (b − a) = ε. This shows (10. b]) . Lemma 10. Then φ ∈ R ([a. Then by Theorem 10. Therefore. (10. xn } be a partition with |xi − xi−1 | < δ. Let H (a. f ∈ R ([a.3.6). Proof: Let f (x) = x.1 Let S be a nonempty set which is bounded above and below.3.6). φ. is Lipschitz continuous if there is a constant. f is uniformly continuous on [a. PROPERTIES OF THE INTEGRAL 233 Corollary 10. b] .6) and inf (−S) = − sup (S) .1 H ◦ (f. b]) and b b − a f (x) dx = a −f (x) dx. b] be a bounded closed interval and let φ : [a.

P ) . αf + βg ∈ R ([a. P1 ) < ε/2. Theorem 10. . P ) − U (g. Thus. P ) + U (g. since ε is arbitrary. Q1 ) − L (f. U (g. b b b (αf + βg) (x) dx = α a a f (x) dx + β a g (x) dx. Therefore. P ) ≤ U (f. U (g + f. P1 ) − L (g. P ) < ε/2. whenever f. P ) < ε/2.2 since the function φ (y) ≡ −y is Lipschitz continuous.Q1 be such that U (f. P ) − L (f.234 THE INTEGRAL Proof: The ﬁrst part of the conclusion of this lemma follows from Theorem 10. P ) ≥ L (f.3 The integral is linear. Next note that mi (f + g) ≥ mi (f ) + mi (g) . P ) < ε.3. It follows b b b b −f (x) dx ≤ − a a f (x) dx = − a − (−f (x)) dx ≤ a −f (x) dx and this proves the lemma.1. (10. b]) then b b b (f + g) (x) dx = a a f (x) dx + a g (x) dx. β ∈ R. g ∈ R ([a.4. P ) . Then letting P ≡ P1 ∪ Q1 . Q1 ) < ε/2.8) Let P1 . Now choose P such that b −f (x) dx − L (−f.1. b n b n ε> a −f (x) dx − i=1 mi (−f ) ∆xi = a −f (x) dx + i=1 Mi (f ) ∆xi which implies b n b b ε> a −f (x) dx + i=1 Mi (f ) ∆xi ≥ a −f (x) dx + a f (x) dx. a b b −f (x) dx ≤ − a f (x) dx whenever f ∈ R ([a. consider the claim that if f. Mi (f + g) ≤ Mi (f ) + Mi (g) . a Then since mi (−f ) = −Mi (f ) . b]) and α. and U (g.3. b]) . To begin with. P ) + L (g. Lemma 10.1 implies U (f. b]) . L (g + f. g ∈ R ([a. Proof: First note that by Theorem 10.

P ) − L (f. i = 1. c]) and c b c f (x) dx = a a f (x) dx + b f (x) dx.4. P1 ) − L (f. b 235 (f + g) (x) dx ∈ [L (f + g. P ) .10. This proves the theorem. P ) + U (g. then this and Lemma 10. Suppose ﬁrst that α ≥ 0. PROPERTIES OF THE INTEGRAL For this partition. c] and U (f. P ) . b b (f + g) (x) dx − a a f (x) dx + a g (x) dx ≤ U (f.9) Proof: Let P1 be a partition of [a. P2 ) < ε/2 + ε/2 = ε. P ) + L (g. b]) and f ∈ R ([b.2 imply b b αf (x) dx = a b a (−α) (−f (x)) dx b = (−α) a (−f (x)) dx = α a f (x) dx. Then b αf (x) dx ≡ sup{L (αf. (10.4 If f ∈ R ([a. then f ∈ R ([a. P ) + U (g. This proves (10.10) . Then P is a partition of [a. Therefore. P1 ) + U (f. P ) = U (f. U (f. Pi ) < ε/2.8) since ε is arbitrary. P ) : P is a partition} ≡ α a f (x) dx. P )] and a b b f (x) dx + a b g (x) dx ∈ [L (f. Let P ≡ P1 ∪ P2 . Theorem 10. P ) + U (g.4. P ) + L (g. U (f + g. If α < 0.4. It remains to show that b b α a f (x) dx = a αf (x) dx. P ) − (L (f. c]) . 2. P ) : P is a partition} = a b α sup{L (f. c] such that U (f. P ) . P ) + L (g. Pi ) − L (f. U (f. (10. P )] . P2 ) − L (f. P )] a ⊆ [L (f. P )) < ε/2 + ε/2 = ε. b] and P2 be a partition of [b.

P2 ) . P2 )] = [L (f. b] be a closed and bounded interval and suppose that a = y1 < y2 · ·· < yl = b and that f is a bounded function deﬁned on [a. b] which has the property that f is either increasing on [yj . b c c f (x) dx + a b f (x) dx = a f (x) dx. yj+1 ] for j = 1. P )] . Then f ∈ R ([a.4. P1 ) + L (f. Note that with this deﬁnition. b) . P1 ) + U (f. b c THE INTEGRAL f (x) dx + a b f (x) dx ∈ [L (f. c]) by the Riemann criterion and also for this partition. a a f (x) dx = − a a a f (x) dx and so a f (x) dx = 0. Then from Theorem 10.6.3. · · ·. U (f. U (f. P ) . Proof: This follows from Theorem 10. Hence by (10.6 Let [a. a f (x) dx when a > b has not yet been deﬁned. c b c f (x) dx − a a f (x) dx + b f (x) dx < U (f. This proves the theorem. b]) . b] be an interval and let f ∈ R ([a. Corollary 10.4. P ) − L (f.4. P ) . assume c ∈ (a. Theorem 10. b]) .5 Let [a.9) holds.4. f ∈ R ([a. P ) < ε which shows that since ε is arbitrary. Then a b f (x) dx ≡ − b a f (x) dx. (10.4.7 Assuming all the integrals make sense. c b b f (x) dx + a c f (x) dx = a f (x) dx .4. b The symbol. Deﬁnition 10. P )] and a c f (x) dx ∈ [L (f. Proof: This follows from Theorem 10. yj+1 ] or decreasing on [yj .236 Thus. U (f.4 and Deﬁnition 10. For example.2.4.4.10).4 and Theorem 10. l − 1.

5 Fundamental Theorem Of Calculus With these properties. To show this one. Then by (10. c b b 237 f (x) dx = a a b f (x) dx − c c f (x) dx f (x) dx. (10. Isaac Barrow had made some progress in this direction. a b b b (αf + βg) (x) dx = α a b c a f (x) dx + β a c g (x) dx. The ﬁrst version of the fundamental theorem of calculus is a statement about the derivative of the function x x→ a f (t) dt.15) (10. c]) . FUNDAMENTAL THEOREM OF CALCULUS and so by Deﬁnition 10. However the connection between these two ideas had not been fully made although Newton’s predecessor. If f ∈ R ([a. 10. b] . Let f ∈ R ([a. . f ∈ R ([a. b b |f (x)| dx ≥ a a f (x) dx b and a b |f (x)| dx ≥ − a b b f (x) dx. Therefore.14) (10. if a < b.11) (10. 2 This theorem is why Newton and Liebnitz are credited with inventing calculus. The following properties of the integral have either been established or they follow quickly from what has been shown so far.16) f (x) dx + a b b f (x) dx = a f (x) dx. b = a f (x) dx + The other cases are similar.6. f (x) dx ≥ 0 if f (x) ≥ 0 and a < b. note that |f (x)| − f (x) ≥ 0. by (10. b]) then if c ∈ [a.16). b] . |f (x)| dx ≥ a a f (x) dx . Therefore. b]) .11) f ∈ R ([a.4. The integral had been around for thousands of years and the derivative was by their time well known.15) and (10. The only one of these claims which may not be completely obvious is the last one.10. This implies (10. b (10.13). x]) for each x ∈ [a.5.12) α dx = α (b − a) . a b b f (x) dx ≤ a a |f (x)| dx . If b < a then the above inequality holds with a and b switched. |f (x)| + f (x) ≥ 0.13) (10. it is easy to prove the fundamental theorem of calculus2 .

b] . b) . lim h−1 (F (x + h) − F (x)) = f (x) and this proves the theorem. if |h| < δ. Then by using (10.14).2 Let f ∈ R ([a. this shows h→0 −1 ε |h| = ε. b] . h−1 (F (x + h) − F (x)) − f (x) = h−1 x+h x+h x+h x+h f (t) dt. Therefore. b]) but this is a hard theorem using the diﬃcult result about uniform continuity.1 Let f ∈ R ([a. x (f (t) − f (x)) dt x ≤ h−1 |f (t) − f (x)| dt . by (10.3 The next theorem is also called the fundamental theorem of calculus. a (10. x Let ε > 0 and let δ > 0 be small enough that if |t − x| < δ. Then if f is continuous at x ∈ (a. x f (x) dt. b) and G is continuous on [a. Theorem 10. then |f (t) − f (x)| < ε.16). b] .12) shows that h−1 (F (x + h) − F (x)) − f (x) ≤ |h| Since ε > 0 is arbitrary. the above inequality and (10.5. b) be a point of continuity of f and let h be small enough that x + h ∈ [a.5. F (x) = f (x) . Then b f (x) dx = G (b) − G (a) . then f ∈ R ([a. using (10. such that G (x) = f (x) for every point of (a. . Proof: Let x ∈ (a. F (x) = f (x) .12).17) 3 Of course it was proved that if f is continuous on a closed interval. b]) and let x THE INTEGRAL F (x) ≡ a f (t) dt.238 Theorem 10. h−1 (F (x + h) − F (x)) = h−1 Also. b]) and suppose there exists an antiderivative for f. f (x) = h−1 Therefore. G. Note this gives existence for the initial value problem. [a. F (a) = 0 whenever f is Riemann integrable and continuous.

10. Then there exists a partition. P ) . b] and let P ≡ {x0 . U (f. xi ] is chosen. n} . xi ] . P )] . Then b a f (x) dx = F (b) − F (a) ≡ F (x) |b . P ) < ε. and also a b f (x) dx ∈ [L (f. Proposition 10. The following notation is often used in this context.3 Let f be a bounded function deﬁned on a closed interval [a. Suppose zi ∈ [xi−1 . that G (b) − G (a) ∈ [L (f.5. P ) < ε. P ) − L (f. U (f. P ≡ {x0 . b]) . xk ] . · · ·. a Deﬁnition 10. ||P || ≡ max {|xi − xi−1 | : i = 1. P ) .5. Since ε > 0 is arbitrary. n G (b) − G (a) = i=1 n G (zi ) (xi − xi−1 ) f (zi ) ∆xi i=1 = where zi is some point in [xi−1 . P )] . This proves the theorem. xn } with the property that for any choice of zk ∈ [xk−1 . b) .5. · · ·. Then the sum n f (zi ) (xi − xi−1 ) i=1 is known as a Riemann sum. (10. . Then G (b) − G (a) = G (xn ) − G (x0 ) n 239 = i=1 G (xi ) − G (xi−1 ) . FUNDAMENTAL THEOREM OF CALCULUS Proof: Let P = {x0 . Also. By the mean value theorem. xn } be a partition of the interval.4 Suppose f ∈ R ([a. Therefore. Suppose F is an antiderivative of f as just described with F continuous on [a. xn } be a partition satisfying U (f. b] and F = f on (a. It follows. b n f (x) dx − a k=1 f (zk ) (xk − xk−1 ) < ε. b G (b) − G (a) − a f (x) dx < U (f. P ) − L (f. · · ·. · · ·.17) holds. since the above sum lies between the upper and lower sums.

Solve the following initial value problem from ordinary diﬀerential equations which is to ﬁnd a function y such that y (x) = x6 x7 + 1 . there exists δ > 0 such that if P ≡ {x0 . x1 . Furthermore the two integrals coincide. there are two deﬁnitions of the Riemann integral. The proof of this theorem is left for the exercises in Problems 10 . dt. and zi ∈ [xi−1 . Let F (x) = 2. Find F (x) . y (10) = 5.5. The number b a f (x) dx is deﬁned as I. The reason for using the Darboux approach to the integral is that all the existence theorems are easier to prove in this context.5. n k=1 Deﬁnition 10. P ) . xi ] . b] is Riemann integrable in the sense of Deﬁnition 10.6 Exercises x3 t5 +7 x2 t7 +87t6 +1 x 1 2 1+t4 1. It isn’t essential that you understand this theorem so if it does not interest you. Sketch a graph of F and explain why it looks the way it does.5. Theorem 10. + 97x5 + 7 . n I− i=1 f (zi ) (xi − xi−1 ) < ε. P ) < ε and then both a f (x) dx and f (zk ) (xk − xk−1 ) are contained in [L (f.240 THE INTEGRAL b Proof: Choose P such that U (f. P ) − L (f. Note that it implies that given a Riemann integrable function f in either sense. leave it out. 10. I with the property that for every ε > 0. it can be approximated by Riemann sums whenever ||P || is suﬃciently small. P )] and so the claimed inequality must hold. 3. f deﬁned on [a. Thus. · · ·.5 A bounded function. b] is said to be Riemann integrable if there exists a number. Let a and b be positive numbers and consider the function.6 A bounded function deﬁned on [a.5 if and only if it is integrable in the sense of Darboux. The deﬁnition of Riemann integrability given in this chapter is also called Darboux integrability and the integral deﬁned as the unique number which lies between all upper sums and all lower sums which is given in this chapter is called the Darboux integral . It is signiﬁcant because it gives a way of approximating the integral. This proves the proposition. U (f. 4. a2 + t2 Show that F is a constant. xn } is any partition having ||P || < δ. The deﬁnition of the Riemann integral in terms of Riemann sums is given next. It turns out they are equivalent which is the following theorem of Darboux. Both versions of the integral are obsolete but entirely adequate for most applications and as a point of departure for a more up to date and satisfactory integral.12. ax F (x) = 0 a2 1 dt + + t2 a/x b 1 dt. Let F (x) = dt.

7. then |U (f. 8. Suppose f is a bounded function deﬁned on [a. Consider the function f (x) ≡ g (t) dt.6. x∗ . ↑ Prove Theorem 10. P )| < ε. Q)| . n 0 Show that |U (f. show F (x) = G (x) + C for some constant. Q) − L (f. b) . and terms for which [xi−1 . 10. If F. the sum of these terms should be k−1 k no larger than |U (f. b] and |f (x)| < M for all x ∈ [a. P )| ≤ 2M n ||P || + |U (f. 10. [xi−1 . (In fact continuous implies Riemann integrable. In the latter case. Suppose f is Riemann integrable on [a. P ) − L (f. b) such that f (c) = 1 b−a b f (x) dx. xi ] contains at least one point of Q. Now let Q be a partition having n points. b] and continuous. Use this to give a diﬀerent proof of the fundamental theorem of calculus which has for its b conclusion a f (t) dt = G (b) − G (a) where G (x) = f (x) . ↑ If ε > 0 is given and f is a Darboux integrable function deﬁned on [a.) Show there exists c ∈ (a. show there exists δ > 0 such that whenever ||P || < δ. P ) − L (f. x a Hint: Deﬁne F (x) ≡ a f (t) g (t) dt and let G (x) ≡ mean value theorem on these two functions. xi ] must be contained in some interval.7 Return Of The Wild Assumption The entire treatment of exponential functions and logarithms up till this time has been based on the Wild Assumption on Page 143. P ) and split this sum into two sums. RETURN OF THE WILD ASSUMPTION 241 5. a x Hint: You might consider the function F (x) ≡ a f (t) dt and use the mean value theorem for derivatives and the fundamental theorem of calculus. b]. b] and that g (x) = 0 on (a. G ∈ f (x) dx for all x ∈ R. Therefore.10. · · ·. Then use the Cauchy 1 sin x if x = 0 . Prove the second part of Theorem 10. {x∗ . Q) − L (f. C. Suppose f and g are continuous functions on [a. Show there exists c ∈ (a. This was a totally unjustiﬁed assumption that exponential functions existed and were diﬀerentiable. b] . Hint: Write the sum for U (f. Q)| . x∗ } and let P be any other partition.7.3. 0 if x = 0 Is f Riemann integrable? Explain why or why not. P ) − L (f.2 about decreasing functions. 9. 11. 12. 6. b) such that b b f (c) a x g (x) dx = a f (x) g (x) dx. x∗ . Some pictures were drawn by a . the sum of terms for which [xi−1 .5. xi ] does not contain any points of Q.

8. L1 (xy) = L1 (x) + L1 (y) . strictly increasing. Also the second derivative equals L1 (x) = −1 <0 x2 √ n xm = m L1 (x) .18) 1 t There is no problem in writing this integral because the function. f (x) is a constant.242 THE INTEGRAL computer as evidence.7. (10. 1 From the Fundamental theorem of calculus. (10. L1 (xn ) = nL1 (x) . and its graph is concave downward. L1 : (0.1.1. Deﬁne x 1 L1 (x) ≡ dt. whenever m ∈ Q n L1 Proof: Fix y > 0 and let f (x) = L1 (xy) − (L1 (x) + L1 (y)) Then by Theorem 6. However. f (1) = 0 and so this proves (10.21) . L1 (1) = 0.5.3.1.19). logarithms were deﬁned. First note that from the deﬁnition. It is now possible to establish Wild Assumption 7. L1 (2) ≥ 1/2 as can be seen by looking at a lower sum for 2 1 (1/t) dt. Theorem 10. n. by Corollary 6. L1 (x) = x > 0 and so L1 is a strictly increasing function and is therefore one to one. f (t) = 1/t is decreasing. 1 1 f (x) = y − = 0. ∞) → R satisﬁes the following properties.1 The function. (10. L1 ((x) (x) (x)) = L1 ((x) (x)) + L1 (x) = L1 (x) + L1 (x) + L1 (x) = 3L1 (x) . Continuing in this way it follows that for any positive integer.19) The function. L1 is one to one and onto. n (10. In fact.20) showing that the graph of L1 is concave down. In addition to this. Theorem 10. Based on this outrageous wild assumption. xy x Therefore.5. Also. Now consider the assertion that L1 is onto. L1 (2) > 0. Theorem 10.2.4 on Page 133.6 on Page 122 and the Fundamental theorem of calculus. Now if x > 0 L1 (x × x) = L1 x + L1 x = 2L1 x.

21) and use the deﬁnition of L1 to verify that L1 (2) > 0. 1 (10. positive or negative. 2 Also. Thus the picture of the graph of L1 looks like the following where the graph approaches the y axis as x gets close to 0.21).3.1 on Page 143 can be fully justiﬁed.22).7.10.22) You see that 1 can be made as close to zero as desired by taking n suﬃciently large. 1 1 L1 = nL1 = −nL1 (2) 2n 2 showing that L1 (x) gets arbitrarily large in the negative direction provided that x is a suﬃciently small positive number. L1 (x) achieves arbitrarily large values as x gets increasingly large because you can take x = 2 in (10.2 For b > 0 deﬁne expb (x) ≡ L−1 (xL1 (b)) . mL1 (x) = L1 (xm ) = L1 and so √ n xm n = nL1 √ n xm √ m L1 (x) = L1 n xm . n This proves the theorem. L1 (xn ) = nL1 (x) .1 on Page 143 and L1 (b) = ln b as deﬁned on Page 143. Now if x > 0 =1 1 0 = L1 x (x) = L1 1 x + L1 (x) showing that L1 x−1 = L1 n 1 x = −L1 x. Now Wild Assumption 7. Since L1 is continuous.3. From (10.23) satisﬁes all conditions of Wild Assumption 7. Therefore. RETURN OF THE WILD ASSUMPTION 243 Therefore. letting m.7. It only remains to verify the claim about raising x to a rational power. the intermediate value theorem may be used to ﬁll in all the numbers in between.23) Proposition 10. (10. from (10.7. n. Deﬁnition 10. n be integers.21) and (10. .3 The function just deﬁned in (10. for any integer.

First. Now L1 (expab (x)) = xL1 (ab) = xL1 (a) + xL1 (b) while L1 (expa (x) expb (x)) = = L1 (expa (x)) + L1 (expb (x)) xL1 (a) + xL1 (b) . To verify that expb (m/n) = = √ n bm √ n bm . this shows the ﬁrst law of exponents holds. That expb (x) > 0 follows immediately from the deﬁnition of the inverse function. then expb (h) = 1. 1 . (L1 is deﬁned on positive real numbers and so L−1 has values in the positive real numbers. 1 = expb (1 + (−1)) = expb (−1) expb (1) = expb (−1) b showing that expb (−1) = b−1 .3. hL1 (b) = L1 (1) = 0. √ m m n ≡ L−1 L1 (b) = L−1 L1 bm 1 1 n n by (10. expb deﬁned in (10. Suppose then that expb (h) = 1. these were assumed to hold. expb exists by Theorem 8. expab (x) = expa (x) expb (x) .1.23) satisﬁes all the conditions of Wild Assumption 7. L1 (expb (x + y)) = (x + y) L1 (b) and L1 (expb (x) expb (y)) = = L1 (expb (x)) + L1 (expb (y)) xL1 (b) + yL1 (b) = (x + y) L1 (b) . Therefore.6 on Page 151.3.244 Proof: First let x = expb m n THE INTEGRAL where m and n are integers. Hence either h = 0 or b = 1 which are both excluded. This establishes all the laws of exponents for arbitrary real values of the exponent.1. Finally. L1 expexpb (x) (y) = yL1 (expb (x)) = yxL1 (b) while L1 (expb (xy)) = xyL1 (b) and so since L1 is one to one. What of the laws of exponents for arbitrary values of x and y? As part of Wild Assumption 7. since L1 is one to one. Again. From the deﬁnition. Therefore. expb (x + y) = expb (x) expb (y) .) 1 Consider the claim that if h = 0 and b = 1. expb (1) = b and expb (0) = 1. expexpb (x) (y) = expb (xy) . Then doing L1 to both sides. L1 L−1 (x) = x 1 and so L1 L−1 (x) 1 L−1 (x) = 1. It remains to consider the derivative of expb and verify L1 (b) = ln b. Since L1 is one to one.20).1.

[a. x1 . xj − xj−1 = h. P = {x0 . b . Would this be true of upper and lower sums for such a function? Can you show that in the case of the function.10.8 Exercises 1. This is known as the trapezoidal rule.18). Now by the chain rule. Verify that if f (x) = mx + b. Now use a calculator or table to ﬁnd the exact value of ln 2. Show that adding these up h [f0 + 2f1 + · · · + 2fn−1 + fn ] 2 as an approximation to a f (x) dx. f0 h f1 Thus 1 2 yields fi +fi−1 2 approximates the area under the curve. 10.5. Let fi = f (xi ) and assuming f ≥ 0 on the interval [xi−1 . 1] . f (x) = y on an interval. and fi as shown in the following picture. From the deﬁnition of ln b on Page 143. b] . 1 1 expb (x) = L−1 (xL1 (b)) L1 (b) 1 = L−1 (xL1 (b)) L1 (b) 1 = expb (x) L1 (b) . it follows ln = L1 . xn } where for each j. Apply the trapezoidal rule to estimate ln 2 in the case where h = 1/5. 2.8. approximate the area above this interval and under the curve with the area of a trapezoid having vertical sides. the trapezoidal rule gives the exact answer for the integral. xi ] . f (t) = 1/t the trapezoidal rule will always yield an answer which is too large for ln 2? 3. fi−1 . · · ·. Show ln 2 ∈ [. L−1 (x) 1 L−1 (x) 1 =1 245 showing that L−1 (x) = L−1 (x). Form a uniform partition. EXERCISES By (10. There is a general procedure for estimating the integral of a function.

24) Show the only diﬀerentiable solutions to this equation are functions of the form x Lk (x) = 1 k dt. f. xi+1 gi (x) dx xi−1 is an approximation to xi+1 xi−1 f (x) dx. Next. [x0 . If you had a table of logarithms to some base. and xi + 2h ≡ xi+1 . x4 ] . (10. L : (0. · · ·. f deﬁned on this interval are fi = f (xi ) . xi−1 . In fact. Logarithms were invented before calculus. and xi+1 respectively. 5. Then let y = 1. Use Simpson’s rule to compute an approximation to ln 2 letting n = 4. and fi+1 at the points xi−1 . gi is an approximation to the function. Recall that ln e = 1. has the value fi−1 at x. y→0+ Hint: Consider ln (1 + yx) 1/y = 1 y ln (1 + yx) = 1 y sums and then the squeezing theorem to verify ln (1 + yx) is continuous. 3 3 3 Now suppose n is even and {x0 . 3 This is called Simpson’s rule. . The function. Recall that x → ex 6. xi−1 + h ≡ xi . he did not have calculus. a Scottish nobleman. Describe how you could construct a table for log10 x for x various numbers. Let there be three equally spaced points. L(1) = 0. xn ] . show that an approximation to b a f (x) dx is h [f0 + 4f1 + 2f2 + 4f3 + 2f2 + · · · + 4fn−1 + fn ] . · · ·. You can ﬁnd the logarithm of the number you are after. and fi+1 at x + 2h. you could then see what number this corresponded to. Also. 7. x2 ] . one of the inventors being Napier. His interest in logarithms was computational. xi+1 ] . Compare with the answer from a calculator or computer. 1+yx 1 t 1 1/y dt. [a. Suppose it is desired to ﬁnd a function. fi at x + h. [xn−2 . ∞) → R which satisﬁes L (xy) = Lx + Ly. this was how e was deﬁned in Problem ?? on Page ??. x1 . Suppose also a function.246 THE INTEGRAL 4. Show xi+1 xi−1 gi (x) dx equals hfi−1 hfi 4 hfi+1 + + . [x2 . xi . Also describe how one could use logarithms to do computations involving large numbers. Then consider gi (x) ≡ fi−1 fi fi+1 (x − xi ) (x − xi+1 )− 2 (x − xi−1 ) (x − xi+1 )+ 2 (x − xi−1 ) (x − xi ) . b] and the values of a function. 2h2 h 2h Check that this is a second degree polynomial which equals the values fi−1 . fi . Hint: Fix x > 0 and diﬀerentiate both sides of the above equat tion with respect to y. how do you suppose Napier did it? 1 Remember. Adding these approximations for the integral of f on the succession of intervals. Hint: logb 351/7 = 7 logb (35) . f on the interval [xi−1 . Describe how one could use logarithms to ﬁnd 7th roots for example. xn } is a partition of the interval. use upper and lower → x. Show that 1/y lim (1 + yx) = ex .

2 2 . How can you remember this? The easiest way is to use the Leibniz notation. This is ﬁne provided you don’t write anything which is false. Then making the substitution b a f (y) dy ? f (g (x))g (x) dx = ? f (y) dy. 10.25) where F (y) = f (y) .9. (10. it follows that y = g (a) to the bottom limit must equal g (a) . changing the variables gives 4 x sin x2 dx = 1 2 1 1 1 sin (u) du = − cos (4) + cos (1) 2 2 Sometimes people prefer not to worry about the limits.9 Techniques Of Integration The techniques for ﬁnding antiderivatives may be used to ﬁnd integrals. Example 10. TECHNIQUES OF INTEGRATION 247 10. Therefore. How does this relate to ﬁnding deﬁnite integrals? This is based on the following formula in which all the functions are integrable and F (y) = f (y) .1 Recall The Method Of Substitution f (g (x)) g (x) dx = F (x) + C.9. both sides equal F (g (b)) − F (g (a)) .9.26) This formula follows from the observation that. Then dy = g (x) dx and so formally dy = g (x) dx.10. by the fundamental theorem of calculus. 1 x sin x2 dx = − cos x2 + C 2 and so from an application of the fundamental theorem of calculus 2 1 1 x sin x2 dx = − cos x2 |2 1 2 1 1 = − cos (4) + cos (1) . The above problem can be done in the following way.26) let y = g (x) . you must change the limits! When x = a. What should go in as the top and bottom limits of the integral? The important thing to remember is that if you change the variable. b g(b) f (g (x)) g (x) dx = a g(a) f (y) dy. In (10. Similarly the top limit should be g (b) . (10.1 Find 2 1 x sin x2 dx du 2 Let u = x2 so du = 2x dx and so 2 1 = x dx.

Proposition 10.28) Proof: Use the product rule and properties of integrals to write b b b u (x) v (x) dx = a a (uv) (x) dx − a b u (x) v (x) dx = uv (x) |b − a This proves the proposition.5 Find π 0 u (x) v (x) dx. 2 b a2 2 2 THE INTEGRAL If you sketch the ellipse. b] such that uv . u v ∈ R ([a. y = β and the left top quarter is just the reﬂection of the top right quarter reﬂected about the line. π 0 x sin (x) dx = (− cos (x)) x|π − 0 = −π cos (π) = π.248 Example 10.27). a x sin (x) dx Let u (x) = x and v (x) = sin (x) . (10. Proposition 10.9. b 1− (x − α) dx a2 1 a 2 Change the variables.3 Let u and v be diﬀerentiable functions for which u (x) v (x) dx are nonempty. this is stated in the following proposition. Then b a u (x) v (x) dx = uv (x) |b − a b u (x) v (x) dx a (10. letting u = Then du = correspond to the new variables. Example 10.9. b]) .2 Integration By Parts u (x) v (x) dx and Recall the following proposition for ﬁnding antiderivative.27) In terms of integrals.9.9. Then uv − u (x) v (x) dx = u (x) v (x) dx. you see that it suﬃces to ﬁnd the area of the top right quarter for y ≥ β and x ≥ α and multiply by 4 since the bottom half is just a reﬂection of the top half about the line. Then applying (10.2 Find the area of the ellipse. 10. x = α.4 Let u and v be diﬀerentiable functions on [a.9. π (− cos (x)) dx 0 . (y − β) (x − α) + = 1. Thus the area of the ellipse is α+a 4 α x−α a . this equals π/4 1 dx and so upon changing the limits to 4ba 0 1 1 − u2 du = 4 × ab × π = πab 4 because the integral in the above is just one quarter of the unit circle and so has area equal to π/4.

28) 1 0 xe2x dx = 1 2x e2x 1 e x|0 − dx 2 2 0 e2x 1 e2x 1 = x| − | 2 0 4 0 e2 1 e2 = − − 2 4 4 2 1 e + = 4 4 10. EXERCISES 1 0 249 xe2x dx Example 10.10. (a) (b) (c) (d) (e) 6 √ x 5 2x−3 dx 5 4 x 3x2 + 6 dx 2 π x sin x2 dx 0 π/4 sin3 (2x) cos (2x) 0 7 √ 1 dx Hint: Remember 0 1+4x2 the sinh−1 function and its derivative. Find 3.10. Find 5. dx √ 1 (c) 0 x 2 − x dx 3 2 (ln |x|) dx 2 π 3 x cos x2 0 2 1 (d) (e) 2. (a) (b) (c) (d) π/9 sec (3x) dx 0 π/9 sec2 (3x) tan (3x) 0 5 1 dx 0 3+5x2 1 √ 1 dx 0 5−4x2 dx .6 Find Let u (x) = x and v (x) = e2x . Find the integrals. Find the integrals. Then from (10.9. dx 2 x ln x2 dx sin (x) dx 1 x e 0 1 0 2 0 2x cos (x) dx x3 cos (x) dx 6. Find Hint: Let u (x) = (ln |x|) and v (x) = 1. Find the integrals. 7.10 (a) (b) Exercises 4 xe−3x dx 0 3 1 2 x(ln(|x|))2 1. Find 4.

(a) (b) (c) (d) π 2 x sin x3 dx 1 6 x dx 1 1+x2 . verify the left side equals d P (t) e y (t) . Find the following integrals. The most important of all diﬀerential equations is the ﬁrst order linear equation. 2π] .5 1 dx 0 1+4x2 4 2√ x 1 + x dx 1 12. y (a) = ya is y (t) = e−P (t) ya + e−P (t) a t t eP (s) f (s) ds. n f (x) = f (x0 ) + k=1 f (k) (x0 ) 1 k (x − x0 ) + k! n! x x0 (x − t) f (n+1) (t) dt n where n! ≡ n (n − 1) · · · 3 · 2 · 1 if n ≥ 1 and 0! ≡ 1. b) and that f is a function which has n + 1 continuous derivatives on this interval. x f (x) = f (x0 ) + x0 f (t) dt x = f (x0 ) + (t − x) f (t) |x0 + x = f (x0 ) + f (x0 ) (x − x0 ) + (x − t) f (t) dt x0 x (x − t) f (t) dt.250 (e) 6 √ 3 2 x 4x2 −5 THE INTEGRAL dx 8. (a) (b) (c) (d) (e) 9. . 2 sin2 (x) dx. y + p (t) y = f (t) . Suppose x0 ∈ (a. 1−cos(2x) . Find 3 x cosh x2 + 1 dx 0 2 3 x4 x 5 dx 0 π sin (x) 7cos(x) dx −π π x sin x2 dx 0 2 5√ x 2x2 + 1 dx Hint: 1 π/4 0 Let u = 2x2 + 1. Multiply both sides by eP (t) . where P (t) = a p (s) ds. 13. dt and then take the integral. Show the solution to the initial value problem consisting of this equation and the initial condition. Find the area between the graphs of y = sin (2x) and y = cos (2x) for x ∈ [0. Give conditions under which everything is correct. x0 Explain the above steps and continue the process to eventually obtain Taylor’s formula. Find the integrals. Consider the following. t a of both sides. Hint: You use the integrating factor approach. Hint: Derive and use sin2 (x) = 10. 11.

xn } where xi − xi−1 = 1/n for each i. change variables to write b 1 f (y) dy a = ≈ = (b − a) 0 f (a + (b − a) x) dx n b−a n b−a n ci f i=1 n a + (b − a) i n ci fi i=1 i where fi = f a + (b − a) n . p3 (x) . Show that if fi are the values of f at b a. In the above Taylor’s formula. Consider [0.10. The approximate integration scheme for a function. Show using Problem 14. it follows that the above formula gives a f (x) dx 2 2 exactly whenever f is a polynomial of degree three. · · ·. It is necessary to ﬁnd constants c0 and c1 such that c0 + c1 = 1 = 0 1 dx 1 0c0 + c1 = 1/2 = 0 x dx. x2 . the case when x > x0 and the case when x < x0 . x. 15. Show that c0 = c1 = 1/2. 1] and divide this interval into n pieces using a uniform partition. · · ·. use Problem 7 on Page 241 to obtain the existence of some z between x0 and x such that n f (x) = f (x0 ) + k=1 f (k) (x0 ) f (n+1) (z) k n+1 (x − x0 ) + (x − x0 ) .10. EXERCISES 251 14. ci are chosen in such a way that the above sum 1 gives the exact answer for 0 f (x) dx where f (x) = 1. will be of the form 1 n n 1 ci fi ≈ i=0 0 f (x) dx where fi = f (xi ) and the constants. there exists a polynomial of degree three. Consider the case where n = 1. Show also that if this integration scheme is applied to any polynomial of degree 3 the result will be exact. and b with f1 = f a+b . 1 2 1 4 1 f0 + f1 + f2 3 3 3 1 1 = 0 f (x) dx whenever f (x) is a polynomial of degree three. {x0 . a+b . When this has been done. 16. and that this yields the trapezoidal rule. Obtain an integration scheme for n = 3. xi+1 ] where xi+1 = xi−1 + 2h and xi = xi−1 + h. f. That is. Let f have four continuous derivatives on [xi−1 . k! (n + 1)! Hint: You might consider two cases. There is a general procedure for constructing these methods of approximate integration like the trapezoidal rule and Simpson’s rule. xn . Next take n = 2 and show the above procedure yields Simpson’s rule. such that 1 4 f (x) = p3 (x) + f (4) (ξ) (x − xi ) 4! .

f. b. b] . 10. if limx→b− f (t) = ±∞ but f is integrable on [a. Similarly. if limx→a+ f (t) = ±∞. Now let S (a. xi ] . 4! 5 where M satisﬁes. then b b f (t) dt ≡ lim a δ→0+ f (t) dt a+δ whenever this limit exists. Nevertheless people do consider things like the following: 0 f (t) dt. R a f (t) dt is deﬁned above as limδ→0+ f (t) dt. then ∞ R f (t) dt ≡ lim a R→∞ f (t) dt a R a+δ where the improper integral.2 Find ∞ −t e 0 dt.11.3 Find 1 1 √ 0 x dt = limR→∞ 1 − e−R = 1. Example 10. 2m) deb note the approximation to a f (x) dx obtained from Simpson’s rule using 2m equally spaced points. . You can probably construct other examples of improper integrals such as integrals of the a form −∞ f (t) dt. dx.11. b − δ] for all small δ. this equals limR→∞ Example 10. Finally. then b b−δ f (t) dt ≡ lim a δ→0+ f (t) dt a whenever the limit exists. If limx→a+ f (t) = ±∞ but f is integrable on [a + δ. M ≥ max f (4) (t) : t ∈ [xi−1 . Deﬁnition 10. Whenever things like this occur they require a special deﬁnition. f. b.11 Improper Integrals The integral is only deﬁned for certain bounded functions which are deﬁned on closed and ∞ bounded intervals. The deﬁnitions are analogous to the above.252 Now use Problem 15 and Problem 6 on Page 246 to conclude xi+1 THE INTEGRAL f (x) dx − xi−1 hfi−1 hfi 4 hfi+1 + + 3 3 3 < M 2h5 . b] for all small δ. 2m) < a M 5 1 (b − a) 1920 m4 where M ≥ max f (4) (t) : t ∈ [a.11. However. Better estimates are available in numerical analysis books. Show b f (x) dx − S (a. these also have the error in the form C 1/m4 . R −t e 0 From the deﬁnition. In this section a few types of improper integrals will be discussed.1 The symbol ∞ a f (t) dt is deﬁned to equal R R→∞ lim f (t) dt a whenever this limit exists.

I]. Letting δ → 0+. the integral a f (t) dt exists and R f (t) dt ≤ M. R]) whenever δ > 0 is small enough. IMPROPER INTEGRALS 253 √ 1 From the deﬁnition this equals limδ→0+ δ x−1/2 dx = limδ→0+ 2 − 2 δ = 2. b] and there exists M such that b ∞ f (t) dt ≤ M a+δ (10.11.29) Then a f (t) dt exists. If f (t) ≥ 0 for all t ∈ (a. ∞) and that f ∈ R ([a + δ. a+δ 0 Therefore. If f (t) ≥ 0 for all t ∈ [a. R a Proof: Suppose (10. for every R > a. f (t) dt − I < ε. there exists R0 such that follows that for R ≥ R0 . R R0 a R0 f (t) dt : R > a ≤ M. I −ε< b b f (t) dt ≤ a+δ 0 a+δ f (t) dt ≤ I . b a+δ Now suppose (10. and small δ > 0. Suppose also there exists a number. since f (t) ≥ 0. Then I ≡ sup is given.31) for all δ > 0. then a f (t) dt exists. a (10.11. The following theorem is about this question of existence. Then I ≡ sup ε > 0 is given. whenever R > R0 .4 Suppose f (t) ≥ 0 for all t ∈ [a.30) for all δ > 0.10. Then. It follows that if ε > 0 f (t) dt ∈ (I − ε. R R0 R R0 f (t) dt = a a R a f (t) dt + R0 f (t) dt ≥ a f (t) dt. It follows that if f (t) dt ∈ (I − ε. M such that R for all R > a. I]. if δ < δ 0 . Since ε is arbitrary. there exists δ 0 > 0 such that b f (t) dt : δ > 0 ≤ M. then b a f (t) dt exists. Theorem 10.29). Therefore.30). Sometimes you can argue the improper integral exists even though you can’t ﬁnd it. b) and there exists M such that b−δ b f (t) dt ≤ M a (10. it R f (t) dt = a+δ a+δ f (t) dt + R0 f (t) dt. the conditions for R I = lim are satisﬁed and so I = ∞ a R→∞ f (t) dt a f (t) dt.

This is the case when functions are not all the same sign for example. sin δ 1 Now using the argument of Example 10. R δ e−t tα−1 dt ≤ δ k e−t tα−1 dt + k R e−t tα−1 dt k ≤ 0 t α−1 ∞ dt + k e−t/2 dt. 1 Example 10. This proves the theorem. the ﬁrst integral in the above is bounded above √ √ √ 1−δ 1 by δ1 − δ 6. Sometimes the existence of the improper integral is a little more subtle. This is what is meant by the expression b δ→0+ lim f (t) dt = I a+δ and so I = a f (t) dt. let M ≡ k α−1 t 0 dt and this shows from the above theorem that e−t tα−1 dt ≤ M for all large R and so 0 e−t tα−1 dt exists. The second integral equals √sin δ .11. the improper integral 1 √ √ √ 1−δ 1 exists because the conditions of Theorem 10. Here k is chosen such that if t ≥ k. Does the improper integral exist? ∞ −t α−1 e t 0 dt whenever α > You should supply the details to the following estimate in which δ is a small positive number less than 1 and R is a large positive number.5 Does 1 √1 0 sin x b dx exist? I don’t know how to ﬁnd an antiderivative for this function but the question of existence x x 3 can still be resolved. it follows 2 x < sin x 3 and so if δ < δ 1 . b a+δ THE INTEGRAL f (t) dt − I < ε. Such a k exists because e−t tα−1 = 0. Since limx→0+ sin x = 1. Example 10. it follows that for x small enough. Therefore.4 with M = √sin δ + δ1 − δ 6. The last case is entirely similar to this one.11.6 The gamma function is deﬁned by Γ (α) ≡ 0.11.11. Then for such x. ∞ . 1 √ δ 1 dx ≤ sin x δ1 δ 3 1 √ dx + 2 x 1 √ δ1 1 dx.3. sin x < 2 . e−t tα−1 < e−t/2 . say for x < δ 1 . t→∞ e−t/2 lim dt + ∞ −t/2 e k R 0 Therefore.254 showing that for such δ.

7.11.10 Does ∞ 0 cos x2 dx exist? This is called a Fresnel integral and it has also√ √ evaluated exactly using techniques been ∞ from complex analysis.11. However.11. Then f + (t) ≡ |f (t)|+f (t) and f − (t) ≡ 2 |f (t)|−f (t) .11.9. x2 x ∞ This is an important example. First change the variable letting x2 = u and then integrate by parts. x2 Thus the improper integral exists if it can be shown that R 1 ∞ cos x x2 1 dx exists.12. Verify all the details in Example 10.11. Example 10. 2.11.32) limR→∞ 1 cos x dx also exists and so 0 sin x dx exists. Theorem 10.11.10. Corollary 10. The above argument is a special case of the following corollary to Theorem 10. Break it up into positive and u3/2 negative parts and use Corollary 10.12 Exercises 1.4 hold for both f + and f − .11. 10. Deﬁnition 10.4. In fact 0 cos x2 dx = 1 2 π.4 applies and the limits R R→∞ lim 1 (|cos x| − cos x) dx.32) cos x dx = x2 R 1 R 1 R 1 cos x + |cos x| dx − x2 R 1 R 1 R 1 (|cos x| − cos x) dx x2 ∞ 0 ∞ 0 and cos x + |cos x| dx ≤ x2 2 dx ≤ x2 2 dx ≤ x2 2 dx < ∞ x2 2 dx < ∞. they involve techniques which will not be discussed in this book.7 Does You should verify R 0 1 sin x x 0 dx exists and that 1 sin x dx x = 0 1 sin x dx + x R 1 sin x dx x R 1 = 0 cos R sin x dx + cos 1 − − x R cos x dx. ∞ You will eventually get an integral of the form 0 sin u du.6.11.8 Let f be a real valued function.11. Verify all the details of Example 10. . (10. Then 0 f (t) dt exists. Thus |f (t)| = f + (t) + f − (t) and f (t) = f + (t) − f − (t) while both f + and f − 2 are nonnegative functions. EXERCISES ∞ sin x x 0 255 dx exist? Example 10. There are at least two ways to show that 0 sin x dx = x 1 2 π. It is a standard problem in the subject of complex analysis. However. lim R→∞ x2 R R 1 cos x + |cos x| dx x2 ∞ both exist and so from (10. x2 (|cos x| − cos x) dx ≤ x2 Since both integrands are positive. The veriﬁcation that this integral 4 exists is left to you.9 Suppose f is a real valued function and the conditions of Theorem ∞ 10.

For α > 0 ﬁnd ln t dt if it exists and if it does not exist. Verify all the details of Example 10.256 3. Recall the area of the surface obtained by revolving the graph of y = f (x) about the x axis for x ∈ [a. √ 5. Next show that ε→0+ ∞ lim dx = f (x1 ) .10. b] for f a positive continuous function having continuous derivative is given by b 2π a f (x) 1 + (f (x)) dx. 2 Also the volume of the solid obtained in the same way is given by the integral. (tan t) (sin t) α−1 α−1 11. 1 α−1 t dt 0 1 0 1 0 8. Prove that for every α > 0. dt exists. Show Γ (1) = 1 = Γ (2) . ∞) in terms of an improper integral. Prove that for every α > 0. f (x) = 1/x for x ≥ 1. It can be shown that Γ 1 = π. This last limit is called the Cauchy principle value integral. explain why. Show ∞ −∞ ε π ε2 + (x − x1 ) εf (x) π ε2 + (x − x1 ) 2 2 dx = 1. Try to show the following: For every number. Next show Γ (α + 1) = αΓ (α) and prove that for n a nonnegative integer. The methods of complex analysis typically give the Cauchy principle value. dt exists. 1 + x2 This is true and shows why such principle value integrals are somewhat disreputable. Using this and Problem 4. Suppose f is a continuous function which is bounded and deﬁned on R. Prove that for every α > 0. Sometimes people call this Gabriel’s horn. Try it on the function. THE INTEGRAL 4. bn → ∞ such that bn n→∞ ∞ R lim −an 2x dx = A. ﬁnd Γ 5 . Show the resulting solid has ﬁnite volume but inﬁnite surface area. A. −∞ . there exist sequences an . Thus you could ﬁll it but you couldn’t paint it. 2x 2x 12. It is not a very respectable thing. 9. exists and ﬁnd the answer. Γ (n + 1) = n!. 10. What is the 2 2 advantage of the gamma function over the notion of factorials? Hint: For what values of x is Γ (x) deﬁned? 6. b π a (f (x)) dx.11. Prove ∞ 0 sin x2 dx exists. Show 0 1+x2 dx does not exist but that limR→∞ −R 1+x2 dx = 0. 2 It seems reasonable to deﬁne the surface area and volume for x ∈ [a. 1 α t 0 7. 13.

4! = 4 × 3! = 24. (a. b) . and the observation that g (c) and h (c) both equal 0. By the mean value theorem.1 Suppose f has n + 1 derivatives on an interval.) Proof: Suppose ﬁrst that n = 0. and those which come as short words. (11. etc.1) . the method of Taylor polynomials is discussed. Theorem 6. the symbol 0 k=1 ak will denote the number 0. In this section. Theorem 11.2 on Page 133. k! (n + 1)! (In this formula. Before presenting it.Inﬁnite Series 11. b) . 2! = 2. there exists ξ between x and c such that h (ξ) g (x) = g (ξ) h (x) and so n+1 (n + 2) (ξ − c) n+1 f (x) − f (c) − k=1 f (k) (c) k! n+2 = n+1 f (ξ) − k=1 f (k) (c) k−1 (ξ − c) (k − 1)! 257 (x − c) . This latter type of function is not so easy to evaluate. those which come from a formula like f (x) = x2 + 2 which are easy to evaluate by following a simple procedure. things like ln (x) or sin (x) . By the Cauchy mean value theorem. you have noticed there are two sorts of functions. Deﬁne 0! ≡ 1 = 1! and (n + 1)! ≡ (n + 1) n! so that n! = n (n − 1) · · · 1. there exists ξ between c and x such that n f (x) = f (c) + k=1 f (k) (c) f (n+1) (ξ) k (x − c) + .1 Approximation By Taylor Polynomials By now. For example. h (x) = (x − c) k! so g (c) = h (c) = 0.8. 3! = 3 × 2! = 6. f (x) − f (c) = f (ξ) (x − c) for some ξ between c and x. Then let n+1 g (x) ≡ f (x) − f (c) − k=1 f (k) (c) k n+2 (x − c) . Now assume the theorem holds for n. The following theorem is called Taylor’s theorem. what is sin 2? Can you get it by doing a simple sequence of operations like you can with f (x) = x2 + 2? How can you ﬁnd sin 2? It turns out there are many ways to do so.1. Then if x ∈ (a. recall the meaning of n! for n a positive integer. b) and let c ∈ (a. In particular.

dividing by (n + 2) (ξ − c) f (x) − f (c) − k=1 n+1 (n+1) (x − c) n+2 . Use Taylor’s formula just presented and let c = 0. f (0) = 1. is called the remainder and this particular form of the remainder is called the Lagrange form of the remainder. f (x) = cos x. Therefore. Thus ξ 1 is between c and x and this proves the theorem. f (0) = 0. For n = 2 in the above. (ξ) The term f (n+1)! . Thus the Taylor polynomial for sin x is of the form x3 x2n+1 x− +···± 3! (2n + 1)! while the remainder is of the form f (2n+2) (ξ) (2n + 2)! for some ξ between 0 and x. n+1 (n + 2) (ξ − c) n+1 f (x) − f (c) − k=1 f (k) (c) k! = (f ) (ξ 1 ) n+1 (ξ − c) (n + 1)! and so. f (x) = − cos x. f (0) = −1.1.2) .258 The induction hypothesis in (11.1) yields n+1 INFINITE SERIES f (ξ) − k=1 f (k) (c) k−1 (ξ − c) (k − 1)! n = = f (ξ) − f (c) − k=1 (n+1) (f ) (k) (c) k! (ξ − c) k (f ) (ξ 1 ) n+1 (ξ − c) (n + 1)! for some ξ 1 between ξ and c. sin x − x − ≤ ≤ 3! 5! 6! 6! . (11. the resulting polynomial is x− x5 x3 + 3! 5! and the error between this polynomial and sin x must be measured by the remainder term. Consequently. f (x) = − sin x. etc. (n+1) Example 11.2 Approximate sin x for x in some open interval containing 0. Then for f (x) = sin x. Therefore. etc. f (k) (c) (f ) (ξ 1 ) n+2 = (x − c) k! (n + 2)! (n+1) n+1 where ξ 1 is between c and ξ while ξ is between c and x. x5 x3 f (6) (ξ) x6 x6 + . f (0) = 0.

then this requirement is achieved if and only if a0 = f (c) . Let pn (x) = a0 + k=1 ak (x − c) . EXERCISES 259 For small x. sin (. Thus the Taylor polynomial of degree n and its ﬁrst n derivatives agree with the function and its ﬁrst n derivatives when x = c. Find the Taylor polynomials for cos x for x near 0 along with a formula for the remainder. This is illustrated in the following picture. Suppose for example. 2. · · ·. If this is used to ﬁnd sin (. 4.5) + 3! 5! 3 5 ≤ 1 (. 6 4 sin(x) 2 –4 –3 –2 –1 0 –2 –4 –6 1 2 x 3 © 4 x− © x3 3! + x5 5! You see from the picture that the polynomial is a very good approximation for the function.2 Exercises n k 1. Find the Taylor polynomials for ln (1 − x) for x near 0.2. you wanted to ﬁnd sin (. the approximation is lousy. 3.5) = 6! 46 080 6 so diﬀerence between the approximation and sin (.5) − (. an = f n!(c) .5) .5) (. The above estimate indicates the good approximation holds as long as |x| is small and it quantiﬁes how good the approximation is. pn (c) = (n) f (c) . Use your approximate polynomial to compute cos (. pn (c) = f (n) (c) . sin x as long as |x| is small but that if |x| gets very large. 11. 1+x 1−x (n) and use it to compute ln 5 to three decimal .5) to 3 decimal places and prove your approximation is this good.1) the polynomial approximation would be even closer.5) − (. Find a Taylor polynomial for ln places. no such conclusion can be drawn.5) is less than 10−4 . Find the Taylor polynomials for ln (1 + x) for x near 0. a1 = f (c) . Then from the above error estimate. this error is very small but if x is large. Show that if you require that pn (c) = f (c) . · · ·. 5.11.

Hint: Prove by n! √ n √ n induction that M n /n! ≤ (2M ) / ( n) . 1] . 1 + t2 0 show 1 = 1 + t2 n (−1) t2k ± k=0 k t2n+2 . Use this to verify that π = lim n→∞ 4 n (−1) k=1 k−1 1 . Thus the error would be no larger than x 0 t2n+2 dt ≤ 1 + t2 x 0 t2n+2 dt . . Find an such that arctan (x) = limn→∞ k=0 ak xk for some values of x. x 8. show ex = limn→∞ n xk k=1 k! n for all x ∈ R. Now consider what happens when n is much larger than 2M. 11. n n−1 2n−2 9. 1 + t2 and then integrate this ﬁnite sum from 0 to x. Find the values of x for which the limit is true and prove your result. Hint: Use the sin x − k=1 (−1) k−1 x2k−1 |x| ≤ (2k − 1)! (2n)! 2n and then use the result of Problem 6. Repeat 10 for the function ln (1 − x) . 7. 0 1+t 13. If you did Problem 10 correctly. Using Problem 6. Hint: It is a good idea to use x 1 arctan (x) = dt. Show cos (x) = limn→∞ k=1 (−1) (2n−2)! by ﬁnding a suitable formula for the remainder and then using an argument similar to that done in Problem 7. 10. Verify that limn→∞ M = 0 whenever M is a positive real number. 2k − 1 12. Use x 1 ln (1 + x) = dt. you found n arctan x = lim n→∞ (−1) k=1 k−1 x2k−1 2k − 1 and that this limit will hold for x ∈ [−1. sin (x) = limn→∞ formula for the error to conclude that n n k=1 (−1) n−1 x2n−1 (2n−1)! . Do for ln (1 + x) what was done for arctan (x) and ﬁnd a formula of this sort for ln 2.260 n INFINITE SERIES 6. Show that for every x ∈ R.

is an increasing sequence because n ak − k=m k=m ak = an+1 ≥ 0.15 on Page n 107 it follows the sequence of partial sums converges to sup { k=m ak : n = m.3. people will sometimes write k=m ak < ∞ to denote the case where convergence ∞ occurs and k=m ak = ∞ for the case where divergence occurs.1 Inﬁnite Series Of Numbers Basic Considerations Earlier in Deﬁnition 5. It follows the symbol just written is meaningless and the inﬁnite sum diverges. Sometimes they do and sometimes they don’t. Proposition 11. the inﬁnite series diverges.1 on Page 103 the notion of limit of a sequence was discussed.11. If it is bounded above. then k=m ak converges and its value equals n n ∞ sup k=m ak : n = m. INFINITE SERIES OF NUMBERS 261 11.9 on Page 106. the above proposition shows there are only two alternatives available. Be very careful you never think this way in the case where it is not true that all ak ≥ 0. ∞ k=m When the sequence is not bounded above. there is no limit for the sequence of partial sums. There is a very closely related concept called an inﬁnite series which is dealt with in this section. This ∞ n series is n=0 r . consider k=1 (−1) . The partial sums corresponding to this symbol alternate between −1 and 0. depending on the behavior of the partial ∞ k sums. See Theorem 5. As an example.11. m + 1.11.13 on Page 107. If this ∞ sequence is bounded above. then it is not a Cauchy sequence and so it does not converge.3. For ∞ this reason.11.1 Deﬁne ∞ n ak ≡ lim k=m n→∞ ak k=m whenever the limit exists and is ﬁnite. If it does ∞ n not converge. Then { k=m ak }n=m is an increasing sequence. This proves the proposition. From this deﬁnition. Proof: It follows { n k=m ∞ ak }n=m n+1 ak diverges. If the sequence of partial sums is not bounded.3. In the ﬁrst case convergence occurs and in the second case. In the case where ak ≥ 0. then by the form of completeness found in Theorem 5. it is said to diverge. In this case the series is said to converge. For example. Let Sn ≡ k=0 rk . The sequence { k=m ak }n=m in the above is called the sequence of partial sums. Then n n n+1 Sn = k=0 rk . m + 1. rSn = k=0 rk+1 = k=1 rk .2 Let ak ≥ 0.3 11. · · · . Deﬁnition 11. . Either the sequence of partial sums is bounded above or it is not bounded above. the partial ∞ k sums of k=1 (−1) are bounded because they are all either −1 or 0 but the series does not converge.3. One of the most important examples of a convergent series is the geometric series. Therefore. it should be clear that inﬁnite sums do not always make sense. · · ·}.11. The study of this series depends on simple High school algebra and n Theorem 5.

11.5) follows from the observation that. Now if |r| ≥ 1. limn→∞ Sn = 1−r in the case when |r| < 1. 1−r INFINITE SERIES 1 By Theorem 5. In fact. Proof: The above theorem is really only a restatement of Theorem 5.262 Therefore. Thus ∞ n n+j ∞ ak = lim k=m n→∞ ak = lim k=m n→∞ ak−j = k=m+j k=m+j ak−j . from the triangle inequality. .3. n ∞ ak ≤ k=m k=m n |ak | ∞ and so ∞ ak = lim k=m n→∞ ak ≤ k=m k=m |ak | .4) (11. then ∞ ak = k=m ∞ k=m+j ∞ ak−j ∞ (11. y are numbers.3 The geometric series.4 If ak and ∞ ∞ k=m bk both converge and x. Sn = 1 − rn+1 . the limit clearly does not exist because Sn fails to be a Cauchy sequence (Why?).11. To establish (11.6 on Page 104 to write ∞ n xak + ybk k=m = = = n→∞ lim lim xak + ybk k=m n n n→∞ ∞ x k=m ak + y k=m ∞ bk x k=m ak + y k=m bk . the last sum equals +∞ if the partial sums are not bounded above. ∞ k=m ∞ n n=0 r converges and equals 1 1−r if |r| < 1 and If the series do converge.4). subtracting the second equation from the ﬁrst yields (1 − r) Sn = 1 − rn+1 and so a formula for Sn is available. diverges if |r| ≥ 1.6 on Page 104 and the above deﬁnitions of inﬁnite series.3. if r = 1.3) xak + ybk = x k=m ∞ k=m ∞ ak + y k=m bk (11. Theorem 11. the following holds. Theorem 11. use Theorem 5. This shows the following. Formula (11.9.11.5) ak ≤ k=m k=m |ak | where in the last inequality.

3. from the triangle inequality. Then { k=m ak }n=m is a Cauchy sequence by Theorem 5. Theorem 11.8 If ∞ k=m ak converges absolutely.3.11. . INFINITE SERIES OF NUMBERS Example 11. Therefore.6) Proof: Suppose ﬁrst that the series converges.3. then it converges. Proof: Let ε > 0 be given.3. In fact. By the deﬁnition of inﬁnite series. it converges. q |ak | < ε. there exists nε such that whenever q ≥ p ≥ nε .3.3. Thus its validity implies completeness. the above theorem is really another version of the completeness axiom.6. there exists nε q ak < ε. then ∞ k=m ak converges if and only if for all ε > 0. Deﬁnition 11.11.3.6. then it is said to converge conditionally. by the completeness axiom.5 Find ∞ n=0 5 2n 263 . (11.7) Next suppose (11.6 The sum such that if q ≥ p ≥ nε . k=p n ∞ (11. k=p Therefore. Then from (11. + 6 3n From the above theorem and Theorem 11.7) it follows upon letting p be replaced with ∞ n p + 1 that { k=m ak }n=m is a Cauchy sequence and so.7 A series ∞ ak k=m is said to converge absolutely if ∞ |ak | k=m converges.13 on Page 107. Theorem 11.3. this shows the inﬁnite sum converges as claimed.6) holds. By Theorem 11. If the series does converge but does not converge absolutely. ∞ n=0 5 6 + n n 2 3 = 5 = 5 1 1 +6 n 2 3n n=0 n=0 1 1 +6 = 19. there exists nε > m such that if q ≥ p − 1 ≥ nε > m. 1 − (1/2) 1 − (1/3) ∞ ∞ The following criterion is useful in checking convergence.3. You might try to show this. Then by assumption and Theorem 11. q q ε> k=p ∞ |ak | ≥ k=p ak . k=m ak converges and this proves the theorem. q p−1 q ak − k=m k=m ak = k=p ak < ε.

3. then ∞ n=m an converges.{ n=m an }p=m is bounded above and increasing. .11 Let an . This is considered next. m) and for all n ≥ n∗ bn ≥ an . then an an > a and <b bn bn and so for all such n. Then if p ≥ n∗ .10 Determine the convergence of For n > 1. The second claim is left as an exercise. ∞ 1 n=1 n2 . bn bn bn converge or diverge together. 0<a< Then an and an an ≤ < b < ∞. 1 1 ≤ . Then 1.264 INFINITE SERIES Theorem 11. letting an = n2 and bn = n(n−1) A convenient way to implement the comparison test is to use the limit comparison test. Example 11. diverges. bn > 0 and suppose for all n large enough. 2 n n (n − 1) p p Now 1 n (n − 1) n=2 = n=2 1 1 − n−1 n 1 →1 p = 1− 1 1 Therefore. an diverges. Therefore. an ≤ bn . Theorem 11.3. then ∞ n=m bn Proof: Consider the ﬁrst claim.3. Proof: Let n∗ be such that n ≥ n∗ . Example 11. it converges by completeness. abn < an < bbn and so the conclusion follows from the comparison test. n=k ≤ n=m p ∞ an + Thus the sequence. If 2. If ∞ n=k bn ∞ n=k converges.9 (comparison test) Suppose {an } and {bn } are sequences of non negative real numbers and suppose for all n suﬃciently large.3.12 Determine the convergence of ∞ 1 √ k=1 n4 +2n+7 . p n∗ k an n=m ≤ n=m n∗ an + n=n∗ +1 ∞ bn bn . From the assumption there exists n∗ such that n∗ > max (k.

n4 dominates the other terms in n4 + 2n + 7. Therefore. Claim: n n 2 n 2k a2k−1 ≥ k=1 k=1 ak ≥ k=0 2k−1 a2k . n=1 n−1 diverges ∞ −2 while n=1 n converges. In particular.3. where p is a positive number. INFINITE SERIES OF NUMBERS 265 This series converges by the limit comparison test.3.3. Then a2n = ∞ 1 n . It follows ∞ that the p series converges if p > 1 and diverges if p ≤ 1. Compare with the series of Example 11. Theorem 11. some which converge and some which do not. n+1 n 2n 2k a2k−1 = 2n+1 a2n + k=1 2n+1 2n k=1 2k a2k−1 ≥ 2n+1 a2n + k=1 n n+1 ak ≥ k=1 ak ≥ 2 a2n+1 + k=1 n ak ≥ 2n a2n+1 + k=0 2k−1 a2k = k=0 ∞ 1 k=1 kp 2k−1 a2k . it is still a geometric series but in this case the common ratio is either 1 or greater than 1 so the series diverges.10. the higher order term.3. Proof of the Claim: Note the claim is true for n = 1. since 2n+1 − 2n = 2n . If p > 1. Let an = 1 np . it is desirable to get lots of examples of series.14 Determine the convergence of These are called the p series. Suppose the claim is true for n. Then. Of course this is not always easy and there is room for acquiring skill through practice. Thus an ≥ an+1 for all n. The tool for obtaining these examples here will be the following wonderful theorem known as the Cauchy condensation test.3.13 Let an ≥ 0 and suppose the terms of the sequence {an } are decreasing. Proof: This follows from the inequality of the following claim. 3 n n Therefore. Then ∞ ∞ an and n=1 n=0 2n a2n converge or diverge together. Example 11. √ reasoning that 1/ n4 + 2n + 7 is a lot like 1/n2 for large n. If p ≤ 1. the last series above is a geometric series having common ratio less than 1 and so it converges. and the terms. an . 2p From the Cauchy condensation test the two series ∞ 1 and np n=1 2n n=0 1 2p n ∞ = n=0 2(1−p) n converge or diverge together. . √ 1 n4 + 2n + 7 n2 lim = lim n→∞ √ n→∞ 1 n2 n4 +2n+7 = n→∞ lim 1+ 7 2 + 4 = 1. the series converges with the series of Example 11. How did I know what to √ √ compare with? I noticed that n4 + 2n + 7 is essentially like n4 = n2 for large enough n. To really exploit this limit comparison test. are decreasing. You see.10.11. it was easy to see what to compare with.

Proof: Apply Theorem 11. A−1 ≡ A0 ≡ 0. Sometimes it is good to be able to say a series does not converge.15 Determine the convergence of Use the limit comparison test. the tests for convergence have been applied to non negative terms only.16 Determine the convergence of Use the Cauchy condensation test. k=n It is very important to observe that this theorem goes only in one direction. Sometimes. Let {an } and {bn } be sequences and let n An ≡ k=1 ak . you don’t ∞ know anything from this information. Theorem 11. The following picture is descriptive of the situation. If this happens.2 More Tests For Convergence So far.17 If ∞ n=m an converges.3. INFINITE SERIES n→∞ =1 and so this series diverges with ∞ 1 k=1 k .3. a series converges.6 to conclude that n n→∞ lim an = lim n→∞ ak = 0.266 Example 11. not because the terms of the series get small fast enough.3. That is. then limn→∞ an = 0.3. . Example 11.3. It is really a corollary of Theorem 11. A discussion of this involves some simple algebra. The nth term test gives such a condition which is suﬃcient for this. you cannot conclude the series converges if limn→∞ an = 0.3. Recall limn→∞ n−1 = 0 but n=1 n−1 diverges.6. The above series does the same thing in terms of convergence as the series ∞ ∞ 1 1 = 2n n 2 ln (2n ) n=1 n ln 2 n=1 1 and this series diverges by limit comparison with the series n. lim an = 0 an converges an = n−1 11. ∞ 1 k=2 k ln k . but because of cancellation taking place between positive and negative terms. lim 1 n 1 √ n2 +100n ∞ 1 √ k=1 n2 +100n .

6. Then |an+1 | < R |an | an+1 <R an .3. This is discussed next. and using the partial summation formula above along with the assumption that the bn are decreasing. letting |An | ≤ C. It is just like integration by parts. Proof: This follows quickly from Theorem 11. INFINITE SERIES OF NUMBERS Then if p < q q q q q 267 an bn = n=p q n=p q−1 bn (An − An−1 ) = n=p b n An − n=p q−1 bn An−1 = n=p bn An − n=p−1 bn+1 An = bq Aq − bp Ap−1 + n=p An (bn − bn+1 ) This formula is called the partial summation formula. The following corollary is known as the alternating series test. this last expression is small whenever p and q are suﬃciently large. then ∞ n=1 (−1) bn converges. Theorem 11.18 (Dirichlet’s test) Suppose An is bounded and limn→∞ bn = 0. then 0< where r < R < 1. Then ∞ an bn n=1 converges. |an | n→∞ Then diverges if r > 1 converges absolutely if r < 1 .20 Suppose |an | > 0 for all n and suppose lim |an+1 | = r.3. Indeed. with bn ≥ bn+1 . Corollary 11.3.19 If limn→∞ bn = 0.3. Theorem 11. n A favorite test for convergence is the ratio test.3. Then there exists n1 such that if n ≥ n1 . q q−1 an bn = bq Aq − bp Ap−1 + n=p n=p q−1 An (bn − bn+1 ) ≤ C (|bq | + |bp |) + C n=p (bn − bn+1 ) = C (|bq | + |bp |) + C (bp − bq ) and by assumption. an n=1 test fails if r = 1 ∞ Proof: Suppose r < 1.11. with bn ≥ bn+1 . This proves the theorem.

k! (11.23 Show that for all x ∈ R.8) and so if m > n. To see the test fails if r = 1. Since the nth term fails to converge to 0.21 Suppose |an | 1/n < R < 1 for all n suﬃciently large. then an diverges. note that if r > 1. 1/n Next suppose |an | ≥ 1 for inﬁnitely many values of n.) The ratio test is very useful for many diﬀerent examples but it is somewhat unsatisfactory mathematically. To see the test fails ∞ ∞ 1 1 if r = 1. n=1 If there are inﬁnitely many values of n such that |an | ∞ 1/n ≥ 1.21. Then for such n. |an | converges. Showing limn→∞ |an | = ∞.3.22 Suppose limn→∞ |an | exists and equals r. |an | ≥ 1 and so the series fails to converge by the nth term test.3. |an | ≤ Rn and so the comparison test with a geometric series applies and gives absolute convergence as claimed. then |am | < Rm−n1 |an1 | . By the comparison test and the theorem on geometric series. The next test. n=1 Proof: Suppose ﬁrst that |an | for all n ≥ nR . This proves the convergence part of the theorem. Say this holds n |an | < R. consider the two series n=1 n and n=1 n2 both of which have r = 1 but having diﬀerent convergence properties. Then ∞ converges absolutely if r < 1 test fails if r = 1 ak diverges if r > 1 k=m Proof: The ﬁrst and last alternatives follow from Theorem 11. necessitated by the need to divide by an . Therefore. ∞ 1/n ex = k=0 xk . then (11. To verify the divergence part.9) . 1/n < R < 1 for all n suﬃciently large. The ﬁrst series diverges while the second one converges but in both cases. r = 1. One reason for this is the assumption that an > 0. it follows the series diverges. Then for those values of n.3. and the other reason is the possibility that the limit might not exist.8) can be turned around for some R > 1. (Be sure to check this last claim. Corollary 11.3. for such n. |an1 +p | < R |an1 +p−1 | < R2 |an1 +p−2 | < · · · < Rp |an1 | INFINITE SERIES (11. Example 11. called the root test removes both of these objections. Theorem 11.268 for all such n. consider n−1 and n−2 . Therefore. Then ∞ an converges absolutely.

It is placed here just to prove conclusively there is a question which needs to be considered. (n + 1)! n→∞ Therefore. In other words. taking the limit in (11.3 Double Series Sometimes it is required to consider double series which are of the form ∞ ∞ ∞ ∞ ajk ≡ k=m j=m k=m j=m ajk . it is not always true that it is permissible to interchange the two sums. The major consideration for these double series is the question of when ∞ ∞ ∞ ∞ ajk = k=m j=m j=m k=m ajk . limits foul up algebra and also introduce things which are counter intuitive. In general. Here is an example. you may not do it without agonizing over the question. the nth term converges to zero by the nth term test and so for each x ∈ R. Now for any x ∈ R eξ n |x| xn+1 ≤ e|x| (n + 1)! (n + 1)! n+1 and an application of the ratio test shows ∞ e|x| n=0 |x| < ∞.3. .11. (n + 1)! n+1 Therefore. INFINITE SERIES OF NUMBERS By Taylor’s theorem ex = 1+x+ n 269 xn+1 x2 xn +···+ + eξn 2! n! (n + 1)! (11.9) holds. lim eξn xn+1 = 0. 11. Therefore.10) = k=0 xk xn+1 + eξ n k! (n + 1)! where |ξ n | ≤ |x| . when does it make no diﬀerence which subscript is summed over ﬁrst? In the case of ﬁnite sums there is no issue here. A general rule of thumb is this: If something involves changing the order in which two limits are taken. This example is a little technical. In other words. You can always write M N N M ajk = k=m j=m j=m k=m ajk because addition is commutative. ﬁrst sum on j yielding something which depends on k and then sum these. However.3.10) it follows (11. there are limits involved with inﬁnite sums and the interchange in order of summation involves taking limits in a diﬀerent order.

∞ n=1 amn = b if m = 1. n=1 m=1 Next observe that if m > 2. 0r 0r 0r 0r 0r cr 0r -c r 0r 0r 0r 0r cr 0r -c r 0r 0r 0r 0r cr 0r -c r 0r 0r 0r 0r cr 0r -c r 0r 0r 0r 0r cr 0r -c r 0r 0r 0r 0r br 0r -c r 0r 0r 0r 0r 0r 0r ar 0r 0r 0r 0r 0r 0r The numbers next to the point are the values of amn . Therefore. a12 = b.3. First. ∞ ∞ ∞ n=1 amn = a + c if m = 2. Then Therefore. some notation should be discussed. ∞ m=1 amn = 0. it turns out that if aij ≥ 0 for all i.270 INFINITE SERIES Example 11. j.24 Consider the following picture which depicts some of the ordered pairs (m. limits are taken in two diﬀerent orders. a21 = a. and c. It happens because. and ∞ n=1 amn = 0 amn = b + a + c m=1 n=1 and so the two sums are diﬀerent. This example in no way violates the commutative law of addition which has nothing to do with limits. as indicated above. An inﬁnite sum always involves a limit and this illustrates why you must always remember this. amn = a + b − c. you can get an example for any two diﬀerent numbers desired. n) where m. Moreover. You see ann = 0 for all n. This is shown next and is based on the following lemma. Don’t become upset by this. b. However. n) on the line y = x − 1 whenever m > 2. amn = c for (m. ∞ ∞ m=1 ∞ amn = b − c if n = 2 and if n > 2. and amn = −c for all (m. n are positive integers. ∞ m=1 amn = a if n = 1. n) on the line y = 1 + x whenever m > 1. you can see that by assigning diﬀerent values of a. . then you can always interchange the order of summation.

The symbol. ∞ ∞ ∞ aij ≥ sup j=r i=r n j=r i=r n m aij = sup lim n m→∞ aij j=r i=r m n = = sup lim n n m→∞ aij = sup n n n→∞ i=r ∞ i=r j=r ∞ m→∞ lim aij j=r ∞ ∞ aij i=r j=r sup n i=r j=r aij = lim aij = i=r j=r Interchanging the i and j in the above argument proves the theorem. b) is either a number. Of course there is no such number. b) ∈ [−∞. Proof: First note there is no trouble in deﬁning these sums because the aij are all nonnegative. it only diverges to ∞ and so ∞ is the value of the sum. ∞. b) ≤ supb∈B supa∈A f (a. B are sets. The other case is that r = −∞. you can take the sup in diﬀerent orders. But in this case. b) means sup (Sb ) where Sb ≡ {f (a. Proof: Let sup ({An : n ∈ N}) = r. INFINITE SERIES OF NUMBERS 271 Deﬁnition 11. ∞] for a ∈ A and b ∈ B where A.28 Let aij ≥ 0. for all a. to get the conclusion of the lemma. If a sum diverges. ∞] for a ∈ A and b ∈ B where A. That is why it is called ∞. Then sup sup f (a. ∞ aij ≥ i=r i=r n aij .26 Let f (a. In the case where r = ∞. b) ∈ [−∞. it follows that if m > n. suppose r < ∞. b). there exists n such that An ∈ (r − ε.3. b) = sup sup f (a. then r − ε < An ≤ Am ≤ r and so limn→∞ An = r as claimed. An = −∞ for all n and so limn→∞ An = −∞. supb∈B f (a. Then letting ε > 0 be given. The following is the fundamental result on double sums.3. b) . −∞ is interpreted similarly. or −∞. b) ≤ sup sup f (a. Lemma 11.3. Next note that ∞ ∞ ∞ n aij ≥ sup j=r i=r n j=r i=r n aij because for all j.3. there exists n such that An > a. it follows if m > n. Then ∞ i=1 ∞ j=1 aij = ∞ j=1 ∞ i=1 aij . Theorem 11.11. f (a. Since {Ak } is increasing.25 Let f (a. Then supa∈A f (a. b) : a ∈ A} . ∞]. a∈A b∈B b∈B a∈A Repeat the same argument interchanging a and b. The symbol. Am > a. a∈A b∈B b∈B a∈A Proof: Note that for all a. b. b) ≤ supb∈B supa∈A f (a. . sup sup f (a. Lemma 11. m n Therefore.27 If {An } is an increasing sequence in [−∞. r]. But this is what is meant by limn→∞ An = ∞. B are sets which means that f (a. +∞ is interpreted as a point out at the end of the number line which is larger than every real number. Therefore. b) and therefore. In the ﬁrst case. Unlike limits. then sup {An } = limn→∞ An . then if a is a real number. b) .3. Since {An } is increasing.

8 again. Also. every i.28 and Theorem 11.3.3.8 both converge. ∞ ∞ aij . It only remains to verify they are equal. ∞ i=r |aij | ∞ j=r aij ∞ < ∞ and for each i.3.30 Let ∞ i=r ai and cn ≡ ∞ i=r bi n be two series.3. k=r The series ∞ n=r cn is called the Cauchy product of the two series. ∞ on Page 263. by Theorem 11. the ﬁrst one for every j and the second for ∞ ∞ ∞ aij ≤ j=r i=r j=r i=r ∞ ∞ |aij | < ∞ and ∞ ∞ aij ≤ i=r j=r i=r j=r ∞ ∞ |aij | < ∞ so by Theorem 11. j=r i=r i=r j=r aij both exist. . Proof: By Theorem 11.272 Theorem 11. By Theorem 11. i=r aij . deﬁne ak bn−k+r .3.28 ∞ ∞ ∞ ∞ |aij | = j=r i=r i=r j=r |aij | < ∞ ∞ Therefore. Note 0 ≤ (|aij | + aij ) ≤ |aij | . This proves the theorem.29 Let aij be a number and suppose ∞ ∞ INFINITE SERIES |aij | < ∞ . i=r j=r Then ∞ ∞ ∞ ∞ aij = i=r j=r j=r i=r aij and every inﬁnite sum encountered in the above equation converges.4 on Page 262 ∞ ∞ ∞ ∞ ∞ ∞ |aij | + j=r i=r j=r i=r aij = j=r i=r ∞ ∞ (|aij | + aij ) (|aij | + aij ) i=r j=r ∞ ∞ ∞ ∞ = = i=r j=r ∞ ∞ |aij | + i=r j=r ∞ ∞ aij aij i=r j=r = j=r i=r ∞ ∞ ∞ ∞ |aij | + and so j=r i=r aij = i=r j=r aij as claimed. For n ≥ r. for each j.3. Therefore.3. Deﬁnition 11. j=r |aij | < ∞. One of the most important applications of this theorem is to the problem of multiplication of series.

···. Also. This yields a0 b0 + (a0 b1 + b0 a1 ) + (a0 b2 + a1 b1 + a2 b0 ) + · · · and you see the expressions in parentheses above are just the cn for n = 0. Then ∞ bj = n=r cn i=r where cn = n ak bn−k+r .3. Theorem 11. Therefore. it is necessary to prove a theorem. k=r Proof: Let pnk = 1 if r ≤ k ≤ n and pnk = 0 if k > n.3.31 Suppose ∞ i=r ∞ ai and ai ∞ j=r bj ∞ j=r both converge absolutely1 . it is reasonable to conjecture that ∞ ∞ ∞ ai i=r j=r bj = n=r cn and of course there would be no problem with this in the case of ﬁnite sums but in the case of inﬁnite sums.29 ∞ ∞ n ∞ ∞ cn n=r = = k=r ∞ ak bn−k+r = n=r k=r ∞ ∞ n=r k=r ∞ pnk ak bn−k+r ∞ ak n=r ∞ pnk bn−k+r = k=r ak n=k bn−k+r = k=r ak m=r bm 1 Actually. The following is a special case of Merten’s theorem. 1.3. 2. it is only necessary to assume one of the series converges and the other converges absolutely. ∞ ∞ ∞ ∞ pnk |ak | |bn−k+r | = k=r n=r k=r ∞ |ak | n=r ∞ pnk |bn−k+r | |bn−k+r | n=k ∞ = k=r ∞ |ak | |ak | k=r ∞ n=k ∞ = = k=r bn−(k−r) |bm | < ∞. This is known as Merten’s theorem and may be read in the 1974 book by Apostol listed in the bibliography. Then ∞ cn = k=r pnk ak bn−k+r .11. . INFINITE SERIES OF NUMBERS 273 It isn’t hard to see where this comes from. by Theorem 11. Formally write the following in the case r = 0: (a0 + a1 + a2 + a3 · ··) (b0 + b1 + b2 + b3 · ··) and start multiplying in the usual way. m=r |ak | Therefore.

274 and this proves the theorem. Prove this theorem. tell why. Determine whether the following series converge absolutely. conditionally. then ak does not converge absolutely. ∞ .01 n 10n (−1) (1. |ak | ln k ∞ k=1 exists. If p < 1.4 Exercises 1. can you conclude ∞ converges? What if it is only the case that n=1 an converges? ln 1 an bn 9. and bn is bounded. A person says a series diverges by the alternating series test. (a) (b) (c) (d) (e) (f) (g) ∞ n=1 ∞ n=1 ∞ n=1 ∞ n=1 (−1) (−1) n n √ 1 n2 +n+1 √ n+1− √ n n (n!)2 (−1) (2n)! (−1) n (2n)! (n!)2 ∞ (−1)n n=1 2n+2 ∞ n=1 ∞ n=1 (−1) (−1) n n n+1 n n+1 n n2 n 2. If n=1 an converges absolutely. A person says a series converges conditionally by the ratio test. or not at all and give reasons for your answers. Suppose Explain. The logarithm test states the following. or not at all and give reasons for your answers. conditionally. (a) (b) (c) (d) (e) (f) ∞ n=1 ∞ n=1 ∞ n=1 ∞ n=1 ∞ n=1 ∞ n=1 (−1) (−1) 5 n ln(k ) k 5 n ln(k ) k1. can the same thing be said about ∞ n=1 a2 ? n 4. INFINITE SERIES 11. can you conclude the sum. Find a series which diverges using one test but converges using another if possible. Explain why his statement is total nonsense. If n=1 an and verges? ∞ ∞ ∞ n=1 bn both converge. If this is not possible. 5.01)n n 1 (−1) sin n n 1 (−1) tan n2 n 1 (−1) cos n2 3. Explain why his statement is total nonsense. Determine whether the following series converge absolutely. then k=1 ak converges absolutely. ∞ n=1 an converges absolutely. If p > 1. 7. ∞ n=1 an bn con∞ n=1 8. Suppose ak = 0 for large k and that p = limk→∞ the series. 6.

an interesting variation on the ratio test. C and a number p such that ak+1 p ≤1− ak k+C for all k large enough. Deﬁne f (x) ≡ (1 − x) − 1 + px and show that f (0) = 0 while f (x) ≥ 0 for all x ∈ (0.1 Let {ak }k=0 be a sequence of numbers. This test says the following. 15.5 Taylor Series Earlier Taylor polynomials were used to approximate known functions such as sin x and ln (1 + x) . show that (1 − x) ≥ 1 − px. Now conclude that there exists an integer. ∞). Verify Theorem 11. A much more exciting idea is to use inﬁnite series of known functions as deﬁnitions of possibly new functions. and p ≥ 1. This is also called a power series centered at a. For 1 ≥ x ≥ 0. The expression. Determine the convergence of the series 16. 1 f (t) dt and the sum k=1 f (k) converge or diverge together. ∞ ∞ ak (x − a) k=0 k (11. Show the ∞ ∞ improper integral. k0 such that bk0 > 1 and for all k ≥ k0 the given inequality above holds.31 on the two series ∞ n=1 ∞ k=0 ∞ 1 n=2 n(ln(n))p for various values n 1 −n/2 k=1 k . bk > 1. 12.11. Suppose f is a nonnegative continuous decreasing function deﬁned on [1. Hint: This can be done using p the mean value theorem from calculus. . Use this test to verify convergence of k=1 k1 whenever α α > 1 and divergence whenever α ≤ 1. 1) . k k Now use comparison theorems and the p series to obtain the conclusion of the theorem. Then if p > 1. Using the result of Problem 10 establish Raabe’s Test. Suppose there exists a constant.5. determine the convergence of of p. Hint: Let bk ≡ k − 1 + C and note that for all k large enough. Deﬁnition 11. TAYLOR SERIES p 275 10. What about the Cauchy product of this series? Does k=0 (−1) n+1 it even converge? What does this mean about using algebra on inﬁnite sums as though they were ﬁnite sums? 2 √1 . 14.11) is called a Taylor series centered at a. Thus |ak | ≤ C/bp = C/ (k − 1 + C) . Consider the series ∞ ∞ k=0 n p (−1) n √1 to write . 11. ∞ This is called the integral test.5. n+1 Show this series converges and so it makes sense 13. For p a positive number. Use Problem 10 to conclude that ak+1 p ≤1− ≤ ak k+C 1− 1 k+C p ∞ = bk bk+1 p showing that |ak | bp is decreasing for k ≥ k0 . 2−k and ∞ k=0 11. it follows that k=1 ak converges absolutely.3. 3−k .

much can be said about such functions.3 Let k=0 ak (x − a) be a Taylor series.5. |an | |x − a| = |an | |z − a| ∞ k n n |an | |z − a| < 1 |x − a| |x − a| n n ≤ n <r |z − a| |z − a| rn . so for such n. The nth term test implies n→∞ lim |an | |z − a| = 0 n n and so for all n large enough. Deﬁning. Proof: Let r ≡ sup {|y − a| : y ∈ D} .5.4 Find the interval of convergence of the Taylor series Use Corollary 11. k=0 |ak | |x − a| converges by comparison with the geometric series. if |x − a| > r. x is a variable. Thus you can put in various values of x and ask whether the resulting series of numbers converges. Example 11. k This might be a totally new function. D to be the set of all values of x such that the resulting series does converge. With this lemma. However. the determination of r tends to be pretty easy. f deﬁned on D as ∞ f (x) ≡ k=0 ak (x − a) .2 Suppose z ∈ D. Determining which points of {x : |x − a| = r} are in D requires the use of speciﬁc convergence tests and can be quite hard. Proof: Let 1 > r = |x − a| / |z − a| .22. Lemma 11.3. r wouldn’t be as deﬁned. Furthermore. ∞ k Theorem 11. This proves the theorem. From now on D will be referred to as the interval of convergence and r of the above theorem as the radius of convergence. Nevertheless. by the above lemma k=0 |ak | |x − a| converges and this proves the ﬁrst part of this theorem. the following fundamental theorem is obtained. Then there exists r ≤ ∞ such that the Taylor series converges absolutely if |x − a| < r. Then if |x − a| < |z − a| . Then if |x − a| < r. lim |x| n n 1/n ∞ xn n=1 n . In fact |x − a| would then be an upper bound to {|y − a| : y ∈ D} . ∞ k Therefore. the Taylor series diverges.5. r fails to be an upper bound to {|y − a| : y ∈ D} and so the Taylor series must diverge as claimed.276 INFINITE SERIES In the above deﬁnition. one which has no name. then x ∈ D also and furthermore. ∞ k Now suppose |x − a| > r. n n Therefore. The following lemma is fundamental in considering the form of D which always turns out to be an interval centered at a which may or may not contain either end point. deﬁne a new function. If k=0 ak (x − a) converges then by the above lemma. ∞ k the series k=0 |ak | |x − a| converges. n→∞ |x| = lim √ = |x| n→∞ n n . it follows there exists z ∈ D such that |z − a| > |x − a| since otherwise.

5.11. the series converges because it reduces to n=1 n and the alternating series test applies and gives convergence.13) has the same radius of convergence as the original series. the limit involved in taking the derivative and the limit of the sequence of ﬁnite sums.5. Consequently. TAYLOR SERIES 277 √ because limn→∞ n n = 1 and so if |x| < 1 the series converges. and multiply power series. 11. |an | |y − a| n−1 < 1. . however. but it is a serious mistake to think this. Proof: First it will be shown that the series on the right in (11. The endpoints require ∞ 1 special attention.6 Let r > 0 and let ∞ n=0 an (x − a) n be a Taylor series having radius of convergence ∞ f (x) ≡ n=0 an (x − a) n (11. When x = 1 the series diverges because it reduces to n=1 n . has radius of convergence equal to r. just as you would diﬀerentiate a polynomial. Then n→∞ lim |an | |y − a| ∞ n−1 = lim |an | |y − a| = 0 n→∞ n n because |an | |y − a| < ∞ n=0 and so. At the ∞ (−1)n other endpoint. The following is special and pertains to power series. Thus let |x − a| < r and pick y such that |x − a| < |y − a| < r. You usually cannot diﬀerentiate an inﬁnite series whose terms are functions even if the functions are themselves polynomials. The following theorem says one can diﬀerentiate power series in the most natural way on the interval of convergence. This theorem may seem obvious. It is another example of the interchange of two limits.1 Operations On Power Series It is desirable to be able to diﬀerentiate. in this case. r = 1/e.12) for |x − a| < r. integrate. th Apply the ratio test. the derived series. Theorem 11.5. Then ∞ ∞ f (x) = n=0 an n (x − a) n−1 = n=1 an n (x − a) n−1 (11.5 Find the radius of convergence of ∞ nn n n=1 n! x .5.13) and this new diﬀerentiated power series. Example 11. for n large enough. Taking the ratio of the absolute values of the (n + 1) terms n+1 (n+1)(n+1) n 1 (n+1)n! |x| n → |x| e = (n + 1) |x| n−n = |x| 1 + n n n n n! |x| and the nth Therefore the series converges absolutely if |x| e < 1 and diverges if |x| e > 1.

f (y) − f (x) y−x ∞ = n=0 ∞ an (y − x) n−1 an nzn n=1 −1 [(y − a) − (x − a) ] (11. By the comparison test. the radius of convergence of the derived series is at least as large as that of the original series. it follows n=1 an n (x − a) converges absolutely for any x satisfying |x − a| < r. It remains to verify the assertion about the derivative.15) n n = where the last equation follows from the mean value theorem and zn is some point between x − a and y − a. ∞ n−1 ∞ n−1 if n=1 |an | n |x − a| converges then by the comparison test. Then considering the diﬀerence quotient. it follows both x and y are in [a − r1 . a + r1 ] . letting r2 ∈ (r1 . a + r1 ] ⊆ (a − r. Therefore. a + r1 ) ⊆ [a − r1 . n=0 n=0 ∞ n |an | r2 < ∞ ∞ n−1 (11.278 Therefore. |an | n |x − a| n−1 INFINITE SERIES = ≤ |an | |y − a| n x−a y−a n−1 n x−a y−a n−1 n−1 and ∞ n n=1 x−a y−a n−1 converges by the ratio test. ∞ n |an | r1 . n=1 |an | |x − a| and ∞ n therefore n=1 |an | |x − a| also converges which shows the radius of convergence of the derived series is no larger than that of the original series. f (y) − f (x) n−1 = an nzn = y−x n=1 ∞ ∞ n−1 an n zn − (x − a) n=1 ∞ n−1 ∞ = = n=2 + n=1 an n (x − a) ∞ n−1 an n (n − n−2 1) wn (zn − (x − a)) + n=1 an n (x − a) n−1 (11.16) where wn is between zn and x − a. r) . Let |x − a| < r and let r1 < r be close enough to r that x ∈ (a − r1 .14) Letting y be close enough to x. Thus. On the other hand. for large enough n. Thus wn is between x − a and y − a and so wn + a ∈ [a − r1 . a + r1 ] . Therefore. a + r) .

y−x n=1 Taking the limit as y → x (11. .14). The ﬁrst sum on the right in (11.17) Then an = (11. n−2 |an | r2 r1 r2 n−2 < 1 for all n large enough. Next use Theorem 11.11. Therefore.7 Let and let ∞ n=0 ∞ an (x − a) be a Taylor series with radius of convergence r > 0 ∞ n f (x) ≡ n=0 an (x − a) .16) therefore satisﬁes ∞ n−2 an n (n − 1) wn (zn − (x − a)) n=2 ∞ 279 ≤ |y − x| n=2 ∞ |an | n (n − 1) |wn | n−2 ≤ |y − x| n=2 ∞ n−2 |an | r2 n (n − 1) n=2 n−2 |an | n (n − 1) r1 = |y − x| Now from (11. TAYLOR SERIES which implies |wn | ≤ r1 . r1 r2 n−2 n−2 |an | r2 n (n − 1) n−2 ≤ n (n − 1) r1 r2 n−2 r1 and the series n (n − 1) r2 converges by the ratio test.5. ∞ ∞ f (x) = n=1 an n (x − a) n−1 = a1 + n=2 an n (x − a) n−1 . This proves the theorem. Continuing this way proves the corollary.6 again to take the second derivative and obtain ∞ f (x) = 2a2 + n=3 an n (n − 1) (x − a) n−2 let x = a in this equation and obtain a2 = f (a) /2 = f (a) /2!. From Theorem 11. for such n. f (a) = a0 ≡ f (0) (a) /0!. Therefore.6. it is possible to characterize the coeﬃcients of a Taylor series.17).5. there exists a constant. Corollary 11. As an immediate corollary.16) f (y) − f (x) n−1 − an n (x − a) ≤ C |y − x| .18) Proof: From (11. Now let x = a and obtain that f (a) = a1 = f (a) /1!.13) follows.5. C independent of y such that ∞ n−2 |an | n (n − 1) r1 = C < ∞ n=2 Consequently. from (11. n! n (11.5. f (n) (a) .

5. f − cos (x) etc.280 INFINITE SERIES This also shows the coeﬃcients of a Taylor series are unique. doing the series term by term and obtain ∞ 2k k x cos (x) = (−1) (2k)! k=0 for all x ∈ R. from Taylor’s formula. This implies sin (x) = k=0 (−1) k x2k+1 f (2n+2) (ξ n ) + (2k + 1)! (2n + 2)! and the last term converges to zero as n → ∞ for any value of x and therefore. f (x) = 0 + x + 0 − x3 x5 x2n+1 f (2n+2) (ξ n ) +0+ +···+ + 3! 5! (2n + 1)! (2n + 2)! (x) = where ξ n is some number between 0 and x. by the nth term test limn→∞ n = 0.5. Furthermore. f (x) = − sin (x) . Then f (x) = cos (x) .8 Find the power series for sin (x) . it follows that 1 <∞ (2n + 2)! n=0 and so by the comparison test. Therefore. First consider f (x) = sin (x) . . By Theorem 11. and cos (x) centered at 0 and give the interval of convergence. ≤ (2n + 2)! (2n + 2)! By the ratio test. Thus f (2n+2) (ξ n ) 1 . Example 11. That is.1 on Page 257. then ak = bk for all k. ∞ sin (x) = k=0 (−1) k x2k+1 (2k + 1)! for all x ∈ R.1.5. this equals either ± sin (ξ n ) or ± cos (ξ n ) and so its absolute value is no larger than 1. Theorem 11. Therefore. Example 11.9 Find the sum ∞ k=1 k2−k .6. you can diﬀerentiate both sides. ∞ n=0 ∞ f (2n+2) (ξ n ) <∞ (2n + 2)! f (2n+2) (ξ n ) (2n+2)! also. if ∞ ∞ ak (x − a) = k=0 k=0 k bk (x − a) k for all x in some interval.

However. To −α see this. Then 4= 1 −2 = k=1 ktk−1 ∞ 2 (1 − (1/2)) −1 = k=1 k2−(k−1) and so if you multiply both sides by 2 . such that (1 + x) −α = 0. from (11. 1−t = k=0 tk if |t| < 1.20). First note that if y (x) ≡ (1 + x) . y (0) = 1 and so it must be that C = 1. then y is a solution of the following initial value problem. y (0) = 1.5.19) α α Next it is necessary to observe there is only one solution to this initial value problem. ∞ 1 From the formula for the sum of a geometric series. However.11. (11. ∞ 2= k=1 k2−k . there is exactly one solution α to the initial value problem in (11. Example 11. y − α y = 0. ∞ ∞ (1 + x) n=0 an nxn−1 − n=0 ∞ αan xn = 0 and so ∞ an nxn−1 + n=1 n=0 an (n − α) xn = 0 .6 to do this. Diﬀerentiate both sides to obtain ∞ (1 − t) whenever |t| < 1. Let ∞ y (x) ≡ n=0 an xn (11. When this is done one obtains d α −α −α (1 + x) y = (1 + x) y − y dx (1 + x) Therefore.10 Find a Taylor series for the function (1 + x) centered at 0 valid for |x| < 1.5.20) y = C.19) and it is y (x) = (1 + x) .5. From Theorem 11. multiply both sides of the diﬀerential equation in (11.5. The following is a very important example known as the binomial series.6 and the initial value problem. Therefore.19) by (1 + x) . Of course it is not known at this time whether such a series exists. the process of ﬁnding it will demonstrate its existence. TAYLOR SERIES 281 It may not be obvious what this sum equals but with the above theorem it is easy to ﬁnd.19). (1 + x) (11. Use Theorem 11. Let t = 1/2.21) be a solution to (11. C. there must exist a constant. The strategy for ﬁnding the Taylor series of this function consists of ﬁnding a series which solves the initial value problem above.

You may have noticed the prominent role played by geometric series.22) Therefore. The function f (x) = (1 + x) makes sense for all x > −1 but one is only able to describe it with a power series on the interval (−1. 1) . Then ∞ (1 + x) = n=0 α α n x . a1 = α. Theorem 11. . In general. n! n=0 ∞ Furthermore. a2 = α(α−1) . To completely understand power series. these are topics for another course.22) and letting n = 0.5. our candidate for the Taylor series is y (x) = α (α − 1) · · · (α − n + 1) n x . it is necessary to take a course in complex analysis. ∞ ∞ INFINITE SERIES an+1 (n + 1) x + n=0 n=0 n an (n − α) xn = 0 and from Corollary 11. n of these factors an = α (α − 1) · · · (α − n + 1) .19) this requires an+1 = an (α − n) . The expression. Thus the ratio st of the absolute values of (n + 1) term to the absolute value of the nth term is α(α−1)···(α−n+1)(α−n) (n+1)n! α(α−1)···(α−n+1) n! |x| n n+1 |x| = |x| |α − n| → |x| n+1 showing that the radius of convergence is 1 since the series converges if |x| < 1 and diverges if |x| > 1. With this notation. Using the same process. To ﬁnd the radius of convergence. use the ratio test. Think about this. the above discussion shows this series solves the initial value problem on its interval of convergence. This is no accident. n There is a very interesting issue related to the above theorem which illustrates the α limitation of power series.282 Changing the order variable of summation in the ﬁrst sum. α(α−1)···(α−n+1) is often denoted as α . The above technique is a standard one for obtaining solutions of diﬀerential equations and this example illustrates a deﬁciency in the method. However. a3 = = α(α−1)(α−2) . It only remains to show the radius of convergence of this series α equals 1. Then using (11. It turns out that the right way to consider Taylor series is through the use of geometric series and something called the Cauchy integral formula of complex analysis. a0 = 1. n! Therefore. By 2 3 3! now you can spot the pattern.22) again along with this ( α(α−1) )(α−2) 2 information. It will then follow that this series equals (1 + x) because of uniqueness of the initial value problem. from (11.7 and the initial condition for (11.5. You can also integrate power series on their interval of convergence. n+1 (11. the following n! n theorem has been established.11 Let α be a real number and let |x| < 1.

23 on Page 268. Theorem 11. all that is required is to multiply sin x ex x3 x5 x2 x3 + · ·· x − + + · · · 1 + x + 2! 3! 3! 5! From the above theorem the result should be x + x2 + − 1 1 + 3! 2! x3 + · · · .5.5. Using Problems 7 . TAYLOR SERIES Theorem 11. ∞ ∞ an (y−a)n+1 . By Theorem 11. and Example 11.3. both positive. r2 ) . Then if |y − a| < r. But C = 0 because F (a) − G (a) = 0.14 Find the Taylor series for ex sin x centered at x = 0. Then ∞ ∞ ∞ n ∞ n ∞ n an (x − a) n=0 n n=0 bn (x − a) n = n=0 k=0 ak bn−k (x − a) n whenever |x − a| < r ≡ min (r1 .5. Here is an example. n=0 n+1 G (y) = n=0 an (y − a) = f (y) = F (y) . Next consider the problem of multiplying two power series.3 both series converge absolutely if |x − a| < r.13 Let n=0 an (x − a) and n=0 bn (x − a) be two power series having radii of convergence r1 and r2 .6 Proof: Deﬁne F (y) ≡ a f (x) dx and G (y) ≡ and the Fundamental theorem of calculus. G (y) − F (y) = C for some constant. Therefore. This theorem can be used to ﬁnd Taylor series which would perhaps be hard to ﬁnd without it.5. The signiﬁcance of this theorem in terms of applications is that it states you can multiply power series just as you would multiply polynomials and everything will be all right on the common interval of convergence. by Theorem 11.3. This proves the theorem. y ∞ ∞ n=0 n 283 an (x − a) and suppose the interval of convergence an (y − a) n+1 n=0 ∞ n+1 y a f (x) dx = a y n=0 an (x − a) dx = n .9 on Page 260 or Example 11.5.31 ∞ ∞ an (x − a) n=0 ∞ n k n n=0 bn (x − a) ∞ n n−k n = n ak (x − a) bn−k (x − a) n=0 k=0 = n=0 k=0 ak bn−k (x − a) .8 on Page 280.5. n Therefore.11.5.12 Let f (x) = is r > 0. Example 11. This proves the theorem. Proof: By Theorem 11.

tan (x) will have a power series valid on some interval centered at 0 and this becomes completely obvious when one uses methods from complex analysis but it isn’t too obvious at this point.5. In fact. Clearly one can continue in this manner. Find k2−k . there does not appear to be a very simple pattern for the coeﬃcients of the power series for tan x. 1 1 1 1 7 x + x2 + x3 − x5 − x6 − x +··· 3 30 90 630 INFINITE SERIES I don’t see a pattern in these coeﬃcients but I can go on generating them as long as I want. but if it does. (In practice this tends to not be very long. 0 −1 + a2 x2 = 0 so a2 = 0.284 1 = x + x2 + x3 + · · · 3 You can continue this way and get the following to a few more terms. a3 − a1 x3 = 2 2 −1 3 1 1 1 3! x so a3 − 2 = − 6 so a3 = 3 . a0 = 0. the above must be it. . Find the radius of convergence of the following. 3! 5! Using the above.15 Find the Taylor series for tan x centered at x = 0. read the last section of the chapter. Note also that what has been accomplished is to divide the power series for sin x by the power series for cos x just like they were polynomials. 2. Thus the ﬁrst several terms of the power series for tan are 1 tan x = x + x3 + · · ·. as you see. 11.6 (a) (b) (c) (d) (e) Exercises ∞ x n k=1 2 ∞ 1 n n k=1 sin n 3 x ∞ k k=0 k!x ∞ (3n)n n n=0 (3n)! x ∞ (2n)n n n=0 (2n)! x ∞ k=1 1. Lets suppose it has a Taylor series a0 + a1 x + a2 x2 + · · ·. 3 You can go on calculating these terms and ﬁnd the next two yielding 1 2 17 7 tan x = x + x3 + x5 + x +··· 3 15 315 This is a very signiﬁcant technique because. Then cos x x4 x2 a0 + a1 x + a2 x2 + · · · 1 − + + · · · = 2 4! x− x3 x5 + +··· . If you are interested in this issue. Example 11. Of course there are some issues here about whether tan x even has a power series.) I also know the resulting power series will converge for all x because both the series for ex and the one for sin x converge for all x. a1 x = x so a1 = 1.

f (x) ≡ |x| has no derivative at x = 0. However.5. ex ≡ k=0 Theorem 11. and it is this very example which is used to show this. the function. Use Show that f (k) (x) exists for all k and for all x. 9.5. this function is very important to those who use modern techniques to study diﬀerential equations. Show that for all x. Tell where your solution gives a valid description of a solution for the initial value problem. Use the power series technique on the initial value problem y + y = 0. ex is deﬁned in terms of a power series. 5. y (0) = 1. 0 if x = 0 ∞ xk k! . Thus even though fn (x) → f (x) for all x. 8. y (0) = 1. they do. 10.10. The main diﬀerence is you have to do two diﬀerentiations of the power series instead of one. x = 0. The theory of complex variables can be used to show there are no examples of such functions if they have a valid power series expansion. Be sure to check all the hypotheses. there are enough of them. Therefore. 1 ||x| − fn (x)| ≤ √ .6. Suppose the function. y (0) = 1. fn (x) be given in Problem 9 and consider g1 (x) = f1 (x) . Let the functions. ex+y = ex ey . Show also that f (k) (0) = 0 for all ∞ k ∈ N.11. One needs to consider test functions which have the property they have inﬁnitely many derivatives but vanish outside of some interval.31 on Page 273 to show directly the usual law of exponents. It even becomes a little questionable whether such strange functions even exist at all. 7.3. Hint: This is a little diﬀerent but you proceed the same way as in Example 11. gn (x) = fn (x) − fn−1 (x) if n > 1. 4. 2 Surprisingly. n Now show fn (0) = 0 for all n and so fn (0) → 0.10 to consider the initial value problem y = y. However. EXERCISES 285 3. Nevertheless. This yields another way to obtain the power series for ex . y (0) = 0. Use the power series technique to ﬁnd solutions in terms of power series to the initial value problem y + xy = 0. What is the solution to this initial value problem? 6. you cannot say that fn (0) → f (0) . the power series for f (x) is of the form k=0 0xk and it converges for all values of x. Find the power series centered at 0 for the function 1/ 1 + x2 and give the radius of convergence. Deﬁne the following function2 : f (x) ≡ 2 e−(1/x ) if x = 0 . Let fn (x) ≡ 1 n + x2 1/2 . Use the power series technique which was applied in Example 11. it fails to converge to f (x) except at the single point. .

Find the exact value of ∞ n=1 n2 2−n . 11. How do you know this is the power series for sin x2 ? 16. You should plug in small values of x using a calculator and see what you get using modern technology. Theorem 11. You know 0 t21 dt = arctan x. Replace ex with an appropriate power series and estimate this integral. INFINITE SERIES ∞ gk (x) = |x| k=0 and that gk (0) = 0 for all k. Use this and Theorem 11.7 Some Other Theorems First recall Theorem 11. ln |1 + x|? x 14. k=0 Furthermore. Thus it is nonsense to diﬀerentiate an inﬁnite sum term by term without a theorem of some sort. Therefore. In fact.31 on Page 273.5. bad can this get? It can be much worse than this. Every such function can be written as an inﬁnite sum of polynomials which of course have derivatives at every point. 3 How ∞ n=0 cn converges absolutely. Where does this power series converge? Where does it converge to the function.12 to ﬁnd the power series for ln |1 + x| centered at 0.286 Show that for all x. n 12.5. the version of this theorem which is of interest here is listed below. you can’t diﬀerentiate the series term by term and get the right answer3 . . For convenience. arctan x? 15. 1 13. x7 1 0 1 2 x sin x2 dx. Do the same as the previous problem for 18. Then ∞ bj = n=0 cn i=0 where cn = ak bn−k . Where does this power series converge? Where does it converge to the function. Use this and Theorem 11. 17. 11. You know 0 t+1 dt = ln |1 + x| . 4 This is a wonderful example.3.7. Use the theorem about the binomial series to give a proof of the binomial theorem (a + b) = k=0 n n n−k k a b k whenever n is a positive integer.1 Suppose ∞ i=0 ai and ∞ ai ∞ j=0 bj ∞ j=0 n both converge absolutely. Find limx→0 tan(sin x)−sin(tan x) 4 .12 to ﬁnd the power +1 series for arctan x centered at 0. Find the power series for sin x2 by plugging in x2 where ever there is an x in the power series for sin x. It is hard to ﬁnd 0 ex dx because you don’t have a convenient antiderivative for the 2 integrand. We typically don’t have names for them but they are there just the same. there are functions which are continuous everywhere and diﬀerentiable nowhere.

7.3 Suppose ∞ n=0 an converges absolutely. The theorem is about multiplying two series. Then ∞ p ∞ an n=0 = m=0 cmp where cmp ≡ k1 +···+kp =m ak1 · · · akp . · · ·.28 on Page 271 and letting pnk be as deﬁned there.3. Theorem 11. By Theorem 11. It turns out a similar theorem holds for replacing 2 with p. ∞ ∞ n |cn | n=0 = n=0 k=0 ∞ n ak bn−k ∞ ∞ ≤ n=0 k=0 ∞ ∞ |ak | |bn−k | = pnk n=0 k=0 ∞ ∞ k=0 n=k |ak | |bn−k | |ak | |bn−k | = = k=0 pnk |ak | |bn−k | = k=0 n=0 ∞ ∞ |ak | n=0 |bn | < ∞.7. . form the product. kp which have the p property that i=1 ki = m. ak1 ak2 · · · akp and then add all these products. SOME OTHER THEOREMS 287 Proof: It only remains to verify the last series converges absolutely. For each such list of integers. Consider all ordered lists of nonnegative integers k1 .2 Deﬁne ak1 ak2 · · · akp k1 +···+kp =m as follows. from the above theorem. Note that n ak an−k = k=0 k1 +k2 =n ak1 ak2 Therefore. This proves the theorem. if ∞ 2 ai converges absolutely.11.7. it follows ∞ ai i=0 = n=0 k1 +k2 =n ak1 ak2 . What if you wanted to consider ∞ p an n=0 where p is a positive integer maybe larger than 2? Is there a similar theorem to the above? Deﬁnition 11.

Now suppose this is true for p and consider ( n=0 an ) .15 on Page 284 about the existence of the power series for tan x. By the induction hypothesis and the above theorem on the Cauchy product. ∞ p ∞ an (x − a) n=0 n = n=0 bnp (x − a) n where bnp ≡ k1 +···+kp =n ak1 · · · akp .288 INFINITE SERIES Proof: First note this is obviously true if p = 1 and is also true if p = 2 from the above p+1 ∞ theorem. This question is clearly included in the more general question of when ∞ −1 an (x − a) n=0 n . ∞ p+1 ∞ p ∞ an n=0 = n=0 ∞ an n=0 ∞ an an n=0 = = cmp m=0 ∞ n n=0 k=0 ∞ n ckp an−k ak1 · · · akp an−k n=0 k=0 k1 +···+kp =k ∞ = = n=0 k1 +···+kp+1 =n ak1 · · · akp+1 and this proves the theorem.7. Then if |x − a| < r. Corollary 11. ∞ n=0 Proof: Since |x − a| < r. This theorem implies the following corollary for power series. the above theorem applies and ∞ an (x − a) . n n=0 k1 +···+kp =n With this theorem it is possible to consider the question raised in Example 11.5. p n an (x − a) ∞ n=0 n = n=0 k1 ak1 (x − a) ∞ k · · · akp (x − a) p = k1 +···+kp =n ak1 · · · akp (x − a) . the series. r > 0. converges absolutely. Therefore.4 Let ∞ an (x − a) n=0 n be a power series having radius of convergence.

7. Suppose also that f (a) = 1. SOME OTHER THEOREMS has a power series. Then 1 f (x) = = 1+ 1+ 1 n an (x − a) 1 ∞ n n=0 cn (x − a) ∞ n=1 ∞ where cn = an if n > 0 and c0 = 0. n=1 n Now pick such an x. a power series having radius of convergence r > 0.23) and the formula for the sum of a geometric series. bnp (x − a) p=0 n=0 n (11. f (x) n=0 Proof: By continuity. ∞ 1 n = bn (x − a) . then ∞ ∞ n |an | |x − a| < 1.7.3.3. the following theorem is easy to obtain.5 Let f (x) = n=0 an (x − a) . Since the series of (11.24) converges absolutely.7. .24) where bnp = k1 +···+kp =n ck1 · · · ckp . Theorem 11.11. there exists r1 > 0 such that if |x − a| < r1 .4.7. ∞ ∞ n |bnp | |x − a| p=0 n=0 ≤ p=0 n=0 ∞ Bnp |x − a| ∞ n p = p=0 n=0 |cn | |x − a| n <∞ by (11.24) equals ∞ n=0 ∞ ∞ bnp p=0 (x − a) n and so. letting p=0 bnp ≡ bn .23) and so from the formula for the sum of a geometric series. 289 Lemma 11.28 on Page 271 implies the series in (11. this proves the lemma. With this lemma. this equals ∞ ∞ ∞ ∞ p cn (x − a) n=0 n . Then there exists r1 > 0 and {bn } such that for all |x − a| < r1 . 1 = f (x) p=0 By Corollary 11. Thus |bnp | ≤ k1 +···+kp =n ∞ ∞ |ck1 | · · · ckp ≡ Bnp and so by Theorem 11. Then ∞ an (x − a) n=1 n ≤ n=1 |an | |x − a| < 1 n (11.

What happens is this. Suppose also that f (a) = 0. Then by that lemma. ∞ ∞ . there exists r1 > 0 and a sequence. you should see the book by Apostol [1] or any text on complex variable.6 Let f (x) = n=0 an (x − a) .290 ∞ n INFINITE SERIES Theorem 11. 1/f (x) will have a power series that will converge for |x − a| < r1 where r1 is the distance between a and the nearest singularity or zero of f (x) in the complex plane. ∞ 1 n = bn (x − a) . f (x) n=0 Proof: Let g (x) ≡ f (x) /f (a) so that g (x) satisﬁes the conditions of the above lemma. Consider f (x) = 1 + x2 . This proves the theorem. {bn } such that f (a) n = bn (x − a) f (x) n=0 for all |x − a| < r1 . One might think that if |x − a| < r. In this case r = ∞ but the power series for 1/f (x) converges only if |x| < 1. In the case of f (x) = 1 + x2 this function has a zero at x = ±i. There is a very interesting question related to r1 in this theorem. the radius of convergence of f (x) and if f (x) = 0 it should be possible to write 1/f (x) as a power series centered at a.7. Then 1 n = bn (x − a) f (x) n=0 where bn = bn /f (a) . a power series having radius of convergence r > 0. Then there exists r1 > 0 and {bn } such that for all |x − a| < r1 . To read more on power series. Unfortunately this is not true. This is just another instance of why the natural setting for the study of power series is the complex plane.

Part III Vector Valued Functions 291 .

.

xj = yj . Deﬁnition 12. Recall that R is identiﬁed with the points of a line. · · ·. 1. The set {(0. 1. Look at the number line again. Then from the deﬁnition. The reason you go to the left is that there is a − sign on the eight. · · ·. 3) · Notice how you can identify a point shown in the plane with the ordered pair. every ordered pair determines a unique point in the plane. even though the same numbers are involved. you can identify another point in the plane with the ordered pair (−8. In other words a real number determines where you are on this line. 2. n} . 4) ∈ R3 but (1. · (2. · · ·. xn ) : xj ∈ R for j = 1. More precisely. t. 0) is called the origin.Rn The notation. 4) because. · · ·. · · ·. 6) . 0) : t ∈ R } for t in the ith slot is called the ith coordinate axis. 6) 3 2 −8 6 (−8. taking a point in the plane. You go to the right a distance of 2 and then up a distance of 6. the ﬁrst entries are not equal.0. xn ) by the single bold face letter. The point 0 ≡ (0. 0. From this reasoning. R1 = R. yn ) if and only if for all j = 1. (x1 . · · ·. 2. In particular. Similarly. they don’t match up. · · ·. xn ) ∈ Rn . Go to the left a distance of 8 and then up a distance of 3. When (x1 . one vertical and the 293 . Conversely. (2. xn ) = (y1 . · · ·. Thus (1. 3) . Now suppose n = 2 and consider two lines which intersect each other at right angles as shown in the following picture. it is conventional to denote (x1 .7 Deﬁne Rn ≡ {(x1 . · · ·. 0. x. The numbers. Why would anyone be interested in such a thing? First consider the case when n = 1. 4) ∈ R3 and (2. Observe that this amounts to identifying a point on this line with a real number. you could draw two lines through the point. n. Rn refers to the collection of ordered lists of n real numbers. 4) = (2. xj are called the coordinates. consider the following deﬁnition. · · ·.

Now suppose n = 3.1) This is known as scalar multiplication. How many coordinates would be needed? You would need one for the temperature. v + w = w + v. −5) would mean to determine the point in the plane that goes with (1. Thus. physiology.1. xn + yn ) (12. β scalars.294 RN other horizontal and determine unique points. and physics being some of them. (real numbers). This is often the case in applications to business when they are trying to maximize proﬁt subject to constraints. · · ·. If the two objects moved around. Then ax ∈ Rn is deﬁned by ax = a (x1 . yn ) ≡ (x1 + y1 . axn ) . depending on whether this number is positive or negative. you would need a time coordinate as well. the algebraic properties satisfy the conclusions of the following theorem. As another example. · · ·. Deﬁnition 12. points in the plane can be identiﬁed with ordered pairs similar to the way that points on the real line are identiﬁed with real numbers. If x. such that the point of interest is identiﬁed with the ordered pair. It also occurs in numerical analysis when people try to solve hard problems on a computer.2) With this deﬁnition. the following hold. In this case. the coordinates are known as Cartesian coordinates after Descartes1 who invented this idea in the ﬁrst half of the seventeenth century.2 For v. the ﬁrst two coordinates determine a point in a plane. y ∈ Rn then x + y ∈ Rn and is deﬁned by x + y = (x1 . One is addition and the other is multiplication by numbers. xn ) + (y1 . 4) and then to go below this plane a distance of 5 to obtain a unique point in space. As just explained. Sometimes n is very large. this determines a point in space. (12. · · ·. chemistry. three for the position of the point in the object and one more for the time. (x1 . Therefore. · · ·. . You can’t stop here and say that you are only interested in n ≤ 3. Thus you would need to be considering R5 . Descartes ended up dying in Sweden. Theorem 12. 1 Ren´ e (12. xn ) ≡ (ax1 . You see that the ordered triples correspond to points in space just as the ordered pairs correspond to points in a plane and single real numbers correspond to points on a line. x1 on the horizontal line in the above picture and x2 on the vertical line in the above picture. Letting the third component determine how far up or down you go. There are other ways to identify points in space with three numbers but the one presented is the most basic. (1.1 Algebra in Rn There are two algebraic operations done with elements of Rn .3) Descartes 1596-1650 is often credited with inventing analytic geometry although it seems the ideas were actually known much earlier. I will often not bother to draw a distinction between the point in n dimensional space and its Cartesian coordinates. called scalars. also called a scalar. 4. you would need to be considering R6 . · · ·. consider a hot object which is cooling and suppose you want the temperature of this object.1 If x ∈ Rn and a is a number. He also wrote a large book in which he tried to explain the book of Genesis scientiﬁcally. He was interested in many diﬀerent subjects. In short. w ∈ Rn and α. x2 ) . 12.1. What if you were interested in the motion of two objects? You would need three coordinates to describe where the ﬁrst object is and you would need another three coordinates to describe where the other object is located. Many other examples can be given.

12. · · ·. 2.6) (12. 3. For example. 0). EXERCISES the commutative law of addition. Draw a picture of the points in R3 which are determined by the following ordered triples. (a) (1. 3) (d) (2. −2.8) (12.4) (12. −2.2 Exercises 1. the associative law for addition. 2. 1)? Explain. (a) (1. αvn + αwn ) = (αv1 . · · ·. 2) + (2. 1v = v. 2) (b) (−2.5) (12. v + 0 = v.10) 12.3)-(12. 5. Does it make sense to write (1. Draw a picture of the points in R2 which are determined by the following ordered pairs. 3. You should verify these properties all hold. (v + w) + z = v+ (w + z) . Also α (v + w) = αv+αw. the existence of an additive inverse. the existence of an additive identity. αwn ) = αv + αw. −2) . 295 (12. 1) (c) (−2. −2) (c) (−2. αvn ) + (αw1 . · · ·. α (βv) = αβ (v) . Verify all the properties (12. 3. · · ·. −2) + 6 (2. 7) . · · ·.7) α (v + w) = α (v1 + w1 .10).2. Compute 5 (1.9) (12. 0) (b) (−2. α (vn + wn )) = (αv1 + αw1 . 2.7) (12. −5) 4. · · ·. 1. v+ (−v) = 0. In the above 0 = (0. As usual subtraction is deﬁned as x − y ≡ x+ (−y) . consider (12. 3. (α + β) v =αv+βv. vn + wn ) = (α (v1 + w1 ) .

Geometrically. x3 ) and (y1 . x2 ) (y1 . Then the solid line joining these two points is the hypotenuse of a right triangle which is half of the rectangle shown in dotted lines. Thus it is the same formula as the above deﬁnition except there is only one term in the sum. Consider the following picture in which one of the solid lines joins the two points and a dotted line joins . The symbol. This is called the distance formula. There the distance between two points. x2 . x2 ) There are two points in the plane whose Cartesian coordinates are (x1 .3.3 Distance in Rn How is distance between two points in Rn deﬁned? Deﬁnition 12. · · ·. B (a. Then |x − y| to indicates the distance between these points and is deﬁned as n 1/2 2 distance between x and y ≡ |x − y| ≡ k=1 |xk − yk | . y2 . r) ≡ {x ∈ Rn : |x − a| < r} . y2 ) respectively. First of all note this is a generalization of the notion of distance in R. (y1 . this is the right way to deﬁne distance which is seen from the Pythagorean theorem. This is called an open ball of radius r centered at a.1 Let x = (x1 . y2 ) 2 1/2 (x1 . x2 ) and (y1 . Now |x − y| = (x − y) where the square root is always the positive square root.296 RN 12. r) is deﬁned by B (a. Thus |x − y| is equal to the distance between these two points on R. y3 ) be two points in R3 . Therefore. x and y was given by the absolute value of their diﬀerence. It gives all the points in Rn which are closer to a than r. What is its length? Note the lengths of the sides of this triangle are |y1 − x1 | and |y2 − x2 | . · · ·. Consider the following picture in the case that n = 2. yn ) be two points in Rn . Now suppose n = 3 and let (x1 . xn ) and y = (y1 . the Pythagorean theorem implies the length of the hypotenuse equals |y1 − x1 | + |y2 − x2 | 2 2 1/2 = (y1 − x1 ) + (y2 − x2 ) 2 2 1/2 which is just the formula for the distance given above.

x2 .2 Find the distance between the points in R4 . x3 ) and (y1 . Example 12. y2 . x3 ) to (y1 . y3 ) 297 (y1 . −4. x3 ) (x1 .3. x3 ) equals (y1 − x1 ) + (y2 − x2 ) 2 2 1/2 while the length of the line joining (y1 . All this amounts to deﬁning the distance between two points as the length of a straight line joining these two points. However.3. a = (1. x3 ) and (y1 . In R3 it is customary to denote the ﬁrst and second components as just described while the third component is called z. Another convention which is usually followed. y2 . It won’t be done in this book but sometimes this sort of thing is done. y2 . especially in R2 and R3 is to denote the ﬁrst component of a point in R2 by x and the second component by y.12. x2 . This completes the argument that the above deﬁnition is reasonable. y2 . there is nothing sacred about using straight lines. y2 . 2. 2 2 2 2 2 . y3 ) is just |y3 − x3 | . by the Pythagorean theorem again. y2 . x2 . −1. One could deﬁne the distance to be the length of some other sort of line joining these points. x2 . y3 ) equals (y1 − x1 ) + (y2 − x2 ) 2 2 2 1/2 2 1/2 + (y3 − x3 ) 2 1/2 2 = (y1 − x1 ) + (y2 − x2 ) + (y3 − x3 ) 2 . (y1 . 2) . 3. Example 12. which is again just the distance formula above. 6) and b = (2. Here is an example. 0) Use the distance formula and write |a − b| = (1 − 2) + (2 − 3) + (−4 − (−1)) + (6 − 0) = 47 √ Therefore. x3 ) . the length of the line joining the points (x1 .3 Describe the points which are at the same distance between (1. 3) and (0. y2 . x3 ) By the Pythagorean theorem. x2 . DISTANCE IN RN the points (x1 . 2. |a − b| = 47. x3 ) (y1 . Therefore.3. the length of the dotted line joining (x1 . x3 ) and (y1 . Of course you cannot continue drawing pictures in ever higher dimensions but there is not problem with the formula for distance in any number of dimensions. 1.

n n n 0 ≤ p (t) = = |x| + 2t 2 x2 + 2t i i=1 n i=1 i=1 xi θyi + t2 i=1 2 2 yi xi θyi + t2 |y| If |y| = 0 then (12.3. z) such that (12. The following lemma is fundamental. · · ·. it either has no zeroes. yn ) be two points in Rn .298 Let (x. If it has two zeros. the above inequality must be violated because in this case the graph must dip below the x axis. the set of points which is at the same distance from the two given points consists of the points. Then n xi yi ≤ |x| |y| . assume |y| = 0 and then p (t) is a polynomial of degree two whose graph opens up. Therefore.12) is obviously true because both sides equal zero. z) be such a point. 2 2 2 2 2 2 2 2 RN x2 + (y − 1) + (z − 2) . y. From the quadratic formula (see Problem 18 on Page 44) this happens exactly when n 2 4 i=1 xi θyi − 4 |x| |y| ≤ 0 2 2 and so n n xi θyi = i=1 i=1 xi yi ≤ |x| |y| .12) Proof: Let θ be either 1 or −1 such that n n n θ i=1 xi yi = i=1 2 xi (θyi ) = i=1 xi yi and consider p (t) ≡ n i=1 (xi + tθyi ) . Therefore. Then (x − 1) + (y − 2) + (z − 3) = Squaring both sides (x − 1) + (y − 2) + (z − 3) = x2 + (y − 1) + (z − 2) and so x2 − 2x + 14 + y 2 − 4y + z 2 − 6z = x2 + y 2 − 2y + 5 + z 2 − 4z which implies −2x + 14 − 4y − 6z = −2y + 5 − 4z and so 2x + 2y + 2z = −9. i=1 (12. · · ·. two zeros or one repeated zero. 2 2 (12.11) holds. Then for all t ∈ R. (x. Lemma 12.4 Let x = (x1 .11) Since these steps are reversible. y. it either has no zeros or exactly one. Therefore. It is a form of the Cauchy Schwarz inequality. xn ) and y = (y1 .

|x − y| ≥ 0 and equals 0 only if y = x. z0 ) equals r. y0 .12.4 Exercises 1. Recall that in any triangle the sum of the lengths of two sides is always at least as large as the third side. z) whose distance to (x0 . 5. This proves the inequality. 0) . The third fundamental property of distance is known as the triangle inequality. Proof: Using the above lemma. y. y) and (x . Note that 3 = 4+2 . (0. n |x + y| 2 ≡ i=1 n (xi + yi ) x2 + 2 i i=1 2 2 n n = ≤ xi yi + i=1 2 i=1 2 yi |x| + 2 |x| |y| ≤ |y| 2 = (|x| + |y|) and so upon taking square roots of both sides. (4. y1 . Then |x + y| ≤ |x| + |y| . Show the distance from the point. Corollary 12. 4.4.5 Let x. Write an equation for this sphere in R3 . Suppose the distance between (x. EXERCISES 299 as claimed. z0 ) ∈ R3 having radius r consists of all points. 0) if this notion of distance is used. (3. (x. y. You are given two points in R3 . (x. −4) and (2. Draw a picture of the sphere centered at the point. y be points of Rn . 4 = 5+3 . z) and (x1 . 2. Find an equation for the sphere which is the set of all points that are at a distance of 4 from the point (1. The above lemma makes possible a completely algebraic proof of this inequality. A sphere centered at the point (x0 . −2) to the ﬁrst of these points is the same as the distance from this point to the second of the original pair of points. The following corollary is equivalent to this simple statement and this will be more clear in the next chapter. 3. z1 ) and prove your theorem using the distance formula. . 4. |x + y| ≤ |x| + |y| and this proves the corollary. A sphere is the set of all points which are at a given distance from a single given point. Two of them which follow directly from the deﬁnition are |x − y| = |y − x| . 3. In the next chapter a diﬀerent proof is given. 2. y ) were deﬁned to equal the larger of the two numbers |x − x | and |y − y | . Obtain a 2 2 theorem which will be valid for general pairs of points. 12. 3) in R3 .3. y0 . There are certain properties of the distance which are obvious.

y1 ) + t (x2 − x1 . y1 ) Now by similar triangles. consider x = x1 + t (x2 − x1 ) where t ∈ R and the totality of all such points will give R. x1 +x2 . |x| . y2 − y1 ) . y1 +y2 . Therefore. y2 . showing that every point on R is of this form. Therefore.300 RN 5. y. z2 ) . you obtain this equation along with y = y1 + mt (x2 − x1 ) = y1 + t (y2 − y1 ) . yi . y) m≡ y2 − y1 y − y1 = x2 − x1 x − x1 and so the point slope form of the line. y) is an arbitrary point on l.)? 7. the only line is just R1 = R. l. Then if (x. (x. y2 ) (x. 2. Suppose that x1 = x2 . is given as y − y1 = m (x − x1 ) . If (x1 .5 Lines in Rn To begin with consider the case n = 1. z)| = maximum of |z| . What would happen 2 2 2 if you used the way of measuring distance given in Problem 4 (|(x. show that in terms of the usual distance. y) = (x1 . You see that you can always solve the above equation for t. l. (x2 . Now consider the plane. |y| . if x1 and x2 are two diﬀerent points in R. 2. z2 ) are two points such that |(xi . y1 . z1 +z2 < 1. (x1 . In the case where n = 1. Give a simple description using the distance formula of the set of points which are at an equal distance between the two points (x1 . y1 . y1 ) and (x2 . Repeat the same problem except this time let the distance between the two points be |x − x | + |y − y | . 12. z1 ) and (x2 . 6. If t is deﬁned by x = x1 + t (x2 − x1 ) . . z1 ) and (x2 . Does a similar formula hold? Let (x1 . y2 . y2 ) be two diﬀerent points in R2 which are contained in a line. zi )| = 1 for i = 1.

. This is known as a parametric equation and the variable t is called the parameter.5. It tells us how the set of points are obtained. Example 12. Deﬁnition 12. rather than the word. → Think of − as an arrow whose point is on q and whose base is at p as shown in the pq following picture. x1 and x2 is the collection of points of the form x = x1 + t x2 − x1 where t ∈ R. “the” is there are inﬁnitely many diﬀerent parametric equations for the same line. This proves the lemma. The directed line segment from → p to q. What happens is this: The line is a set of points but the parametric description gives more information than that. the following is the deﬁnition of a line in Rn . b ∈ Rn with a = 0. The diﬀerence is they are obtained from diﬀerent values of the parameter. 2. Deﬁnition 12.5. t ∈ R. To see this replace t with 3s. x = x1 . 0) and (2. Proof: Let x1 = b and let x2 − x1 = a so that x2 = x1 . Then you obtain a parametric equation for the same line because the same set of points are obtained.1 A line in Rn containing the two diﬀerent points.3 Let p and q be two points in Rn . Lemma 12.12. pq. Obviously. −6.4 Find a parametric equation for the line through the points (1. Because of this. The reason for the word. 6) .5. Often t denotes time in applications to Physics. t ∈ R. 0) + t (1. Then x = ta + b. p = q. 2. “a”. y1 = y2 and so you still obtain the above formula for the line. y. denoted by − is deﬁned to be the collection of points. is a line. t ∈ [0. Use the deﬁnition of a line given above to write (x. Then ta + b = x1 + t x2 − x1 and so x = ta + b is a line containing the two diﬀerent points. 6) . there are many ways to trace out a given set of points and each of these ways corresponds to a diﬀerent parametric equation for the line. −4. 1] with the direction corresponding to increasing t. x = p + t (q − p) . x1 and x2 . z) = (1.5. p 0 q This line segment is a part of a line from the above Deﬁnition.2 Let a. Note this deﬁnition agrees with the usual notion of a line in two dimensions and so this is consistent with earlier concepts. Since the two given points are diﬀerent. then in place of the point slope form above.5. LINES IN RN 301 If x1 = x2 .

(x. 0)| + |(x. The next deﬁnition will end up being quite important. To illustrate this deﬁnition. 5) and (−2.6 Exercises 1. This is because an open set. x ∈ U is an interior point of U if there exists r > 0 such that x ∈ B (x. Give a simple description using the distance formula of the perpendicular bisector of the line segment joining these two points.302 RN 12. Simplify this to the form algebra.1 Let U ⊆ Rn . 3. y1 ) and (x2 . Let (x. Find a parametric equation for the line through the points (2. y) ∈ R2 : |(x. then so is y whenever y is close enough to x. r) U You see in this picture how the edges are dotted. y2 ) be two points in R2 . y) − (a. 0) and (a. 3. For example. y1 )| = |(x. 3. 4.2 A subset. In other words U is an open set exactly when every point of U is an interior point of U . 12. The set of points described by (x. can not include the edges or the set would fail to be open. qx B(x. Thus you want all points. if U is any subset of Rn . Describe the set of points encountered as t changes. Suppose you are given two points. Deﬁnition 12. 2 sin (t)) where t ∈ [0. of Rn is called a closed set if Rn \ C is an open set. consider what would happen . y. U is an open set if whenever x ∈ U. y) − (−a. y2 )| . Describe the set of points encountered as t changes. t) where t ∈ R. r > 2a. z) = (2 cos (t) .7. 4. 0)| = r is known as an ellipse. Deﬁnition 12. surely there should be something called a closed set and here is the deﬁnition of one. Let (x. y) such that |(x.7 Open And Closed Sets Eventually. The two given points are known as the focus points of the ellipse. 5. y) − (x2 . Let (x1 . This is a nice exercise in messy 2.7. r) ⊆ U. y) = (2 cos (t) . 1) . If there is something called an open set. C. 2π] . consider the following picture. y) − (x1 . More generally. 0. 0) in R2 and a number. there exists r > 0 such that B (x. (−a. r) ⊆ U. one must consider functions which are deﬁned on subsets of Rn and their properties. 2 sin (t) . It describe a type of subset of Rn with the property that if x is in this set. x−A 2 α 2 + y β = 1.

there exists a ball. Therefore. Therefore. Therefore. 1) is also / contained in Rn \ ∅ = Rn . the empty set. B (x. r) contained entirely / in Rn \ C. This will be done in the next theorem and will give examples of open sets. It is roughly the case that open sets don’t have their skins while closed sets do. what would be its skin? It can’t be in Rn \ C and so it must be in C. Here is a picture of a closed set. C. the statment: Whenever a pig is born with wings it can ﬂy must be taken as true. as equally true. We do not consider biological or aerodynamic considerations in such statements. .3 Let x ∈ Rn and let r ≥ 0. r) = ∅. ∅ is both open and closed. r) . Therefore. Since ∅ has no points in it. It is necessary to show there exists r1 > 0 such that B (y. Note that if r = 0 then B (x. you can say anything you want about the elements of the empty set and no one can gainsay your statement. C qx B(x. r) ≡ {y ∈ Rn : |y − x| ≤ r} is a closed set. If you look at Rn \ C. |x − y| ≥ 0 and so y ∈ B (x. This must be considered to be true because there is nothing in ∅ so there can be no example to show it false2 .7. On the other hand we would also consider the statement: Whenever a pig is born with wings it can’t possibly ﬂy. r) Note that x ∈ C and since Rn \ C is open. There is no such thing as a winged pig and therefore. it follows ∅ is open. then there must be a ball centered at x which is also contained in ∅. Also. Every open ball centered at that point would have in it some points which are outside U . Then B (x. r) ought to be an open set. Also note that Rn and ∅ are both open and closed. / (There are none. 0) . If x ∈ ∅. This is a rough heuristic explanation of what is going on with these deﬁnitions. Here is why.12. Then if |z − y| < r1 . Theorem 12. you can see that if x is close to the edge of U. r) is an open set. r1 ) ⊆ B (x. This is intuitively clear but does require a proof.r) . It is also closed because if x ∈ ∅. From this. You also see the edges of B (x. all winged pigs must be superb ﬂyers since there can be no example of one which is not. This is because if y ∈ Rn . You may say this is a very strange way of thinking about truth and ultimately this is because mathematics is not about truth.) satisﬁes the desired property of being an interior point. The point is. 2 To a mathematician. such a point would violate the above deﬁnition. r) dotted suggesting that B (x. from the deﬁnition. D (x. then B (x. OPEN AND CLOSED SETS 303 if you picked a point out on the edge of U in the above picture. it follows Rn is also both open and closed. it must be open because every point in it. Also. Proof: Suppose y ∈ B (x. It is more about consistency and logic. Deﬁne r1 ≡ r − |x − y| . you might have to take r to be very small. such statements are considered as true by default.7. it follows from the above triangle inequality that |z − x| = ≤ < |z − y + y − x| |z − y| + |y − x| r1 + |y − x| = r − |x − y| + |y − x| = r.

the Cartesian product of two sets. then it follows that s ∈ B (x. r1 ) ⊆ U and that t ∈ B (y. there exists r1 > 0 such that B (x. A × B. Then U × V is an open set in Rn+m . r) is an open set which shows from the deﬁnition that D (x. then by the triangle inequality. T r r1 q E q x y B(x. Either x ∈ C or y ∈ H since otherwise (x. · · ·. · · ·.7. Next suppose (x. Hence U × V is open as claimed. [ai . Also here is an important deﬁnition. it follows that / if z ∈ B (y. y) . δ) ⊆ Rn+m \ (C × H) . · · ·. b) : a ∈ A. bi ] . t) ∈ Rn+m : k=1 |xk − sk | + j=1 2 |yj − tj | < δ 2 2 Therefore. The following theorem has to do with the Cartesian product of two closed sets or two open sets.304 RN Now suppose y ∈ D (x. (x. xm . y) ∈ U × V. It is necessary to show there exists δ > 0 such that / B ((x.5 Let U be an open set in Rm and let V be an open set in Rn . δ) . y) . A picture which is descriptive of the conclusion of the above theorem which also implies the manner of proof is the following. you can write it in the form (x. If C is a closed set in Rm and H is a closed set in Rn . · · ·. r) is a closed set as claimed. y) ∈ A × B. Now suppose A ⊆ Rm and B ⊆ Rn . yn ) . starting with something in Rn+m . In general.7. means {(a. Now B ((x. r) . xm ) and y = (y1 . then C × H is a closed set in Rn+m . R2 is also written as R × R. Proof: Let (x. If C and H are bounded. then so is C × H. y) . y) would be a / / . δ) . r) r1 E q y Recall R2 consists of ordered pairs. t) ∈ B ((x. the following identiﬁcation will be made. y) where x ∈ Rm and y ∈ Rn . b ∈ B} . A ⊆ Rn is said to be bounded if there exist ﬁnite intervals. Similarly. Theorem 12. r2 ) ⊆ V which shows that B ((x. δ) ⊆ U × V.4 A set. r) T r q x D(x. Since U is open. there exists r2 > 0 such that B (y. x = (x1 . r) . y) ∈ C × H. if δ ≡ min (r1 . yn ) ∈ Rn+m . r1 ) ⊆ U. r2 ) ⊆ V . δ) ≡ m n (s. y1 . Similarly. y) such that x ∈ R and y ∈ R. (x. y) = (x1 . bi ] such that n A⊆ i=1 [ai . Deﬁnition 12. δ) ⊆ Rn \ D (x. Therefore. r) . Then |x − y| > r and deﬁning δ ≡ |x − y| − r. Since y was an arbitrary point in Rn \ D (x. it follows Rn \D (x. Then if (x. y) . |x − z| ≥ |x − y| − |y − z| > |x − y| − δ = |x − y| − (|x − y| − r) = r and this shows that B (y. r2 ) and (s.

t ∈ [0. Subtract p from all these points to obtain the directed line segment consisting of the points 0 + t (q − p) . Suppose therefore. how hard you push and the direction you push. bi ] such that C ⊆ i=1 [ai . EXERCISES 305 point of C ×H.12. r) ⊆ Rm \ C. If (s. and the same magnitude (length). Show carefully that Rn is both open and closed. the direction in which the arrows point. B ((x. 3.5. / m If C is bounded. r) . Show that every open set in Rn is the union of open balls contained in it. 2. q − p. Since C is closed. For example. r) ⊆ Rn+m \ (C × H) showing C × H is closed. − was slid so it points in the same direction and the base is pq. it follows that s ∈ B (x. The point in Rn . What is important? There are really two things which are important. there exist [ai . Then from Deﬁnition 12. Vectors are used to model this. Let − be a directed line segment or pq − consists of the points of the form → vector. y) . bm+n ] . t) ∈ B ((x. Therefore. Geometrically think of vectors as directed line segments as shown in the following picture in which all the directed line segments are considered to be the same vector because they have the same direction. see the following picture.8 Exercises 1. It has two essential ingredients. the arrow. Show the intersection of any two open sets is an open set. r) . y) . will represent the vector. Therefore. · · ·. it is always possible → to put a vector in a certain particularly simple form.9 Vectors Suppose you push on something. Consider B ((x. [am+n . its magnitude and its direction. C × H ⊆ m+n i=1 [ai . ¢ ¢ . 1] .3 that pq p + t (q − p) where t ∈ [0. 0. A similar argument holds if y ∈ H. What was just described would be called a force vector.8. 12. bi ] and this establishes the last part of this theorem. → Geometrically. ¢ ¢ ¢ ¢¢ ¢ ¢ ¢ ¢ ¢ Because of this fact that only direction and magnitude are important. bi ] and if H is bounded. r) which is contained in Rm \ C. bm+1 ] . that x ∈ C. there exists r > 0 such that / B (x. m+n H ⊆ i=m+1 [ai . at the origin. bi ] for intervals [am+1 . 12. 1] . y) .

what is rv? n 1/2 n 1/2 |rv| = k=1 (rai ) 2 1/2 2 = k=1 1/2 r (ai ) 2 2 n = r a2 i k=1 = |r| |v| . Thus the unit distance along any coordinate axis now has length r and in this rescaled system the vector is represented by a. z e3 T E e1 e2 © y x . By the distance formula this equals n 1/2 (qk − pk ) k=1 2 = |p − q| n 1/2 2 and for v any vector in Rn the magnitude of v equals = |v| . a will be referred to as a vector instead of an element of Rn representing a vector as just described. From now on. ¢ ¢ ¢ ¢¢ v 2v ¢ −2v ¢ ¢ ¢ ¢ Note there are n special vectors which point along the coordinate axes. v because multiplying by the scalar. See the picture in the case of R3 . · · ·. 0. These are ei ≡ (0. → The magnitude of a vector determined by a directed line segment − is just the distance pq between the point p and the point q. 1. · · ·. only has the eﬀect of scaling all the distances. the element of Rn at its point is a. Thus the magnitude of rv equals |r| times the magnitude of v. If r is positive. 0) where the 1 is in the ith slot and there are zeros in all the other spaces.306 ¢ RN ¢ ¢ ¢ ¢ ¢ ¢ ¢ ¢ In this way vectors can be identiﬁed with elements of Rn . The following picture illustrates the eﬀect of scalar multiplication. k=1 vk What is the geometric signiﬁcance of scalar multiplication? If a represents the vector. then the vector represented by rv has the same direction as the vector. r. If r < 0 similar considerations apply except in this case all the ai also change sign. 0. v in the sense that when it is slid to place its tail at the origin.

How can one obtain this geometrically? Consider the − → directed line segment. u + v. determines u + v. Thus the addition of vectors according to the rules of addition in Rn . · · ·. n = i=1 (ai + bi ) ei . ai ei is the ith component of the vector. a involves a component in the i direction. knowledge of the ith component of the vector is equivalent to knowledge of the vector because it gives the entry in the ith slot and for v = (a1 . ai . see the following picture ¢ I +v u u ¢ ¢ ¢ I ¢ v ¢ Note the vector u + v is the diagonal of a parallelogram determined from the two vectors u and v and that identifying u + v with the directed diagonal of the parallelogram determined by the vectors u and v amounts to the same thing as the above procedure. the second exerts a force of 3i+5j + k . This is exactly what is obtained when the vectors. an ) . ai ei while the component in the ith direction of b is bi ei . an ) .1 There are three ropes attached to a car and three people pull on these ropes. The ﬁrst exerts a force of 2i+3j−2k Newtons. n v= k=1 a i ei . · · ·. follow −− −→ −−− the directed line segment u (u + v) to its end. j for e2 .9. 0u and then. starting at the end of this directed line segment. v = (v1 . The point of this slid vector. th Then the vector. yields the appropriate vector which duplicates the cumulative eﬀect of all the vectors in the sum. · · ·. In the case of Rn where n ≤ 3. · · ·. 0. a + b = (a1 + b1 . Given a vector. 0) and so this vector gives something possibly nonzero only in the ith direction. Also. and k for e3 . To illustrate. un + vn ) .12. An item of notation should be mentioned here. What does addition of vectors mean physically? Suppose two forces are applied to some object.9. · · ·. vn ) Then u + v = (u1 + v1 . VECTORS 307 The direction of ei is referred to as the ith direction. v = (a1 . u = (u1 . In other words. a and b are added. Each of these would be represented by a force vector and the two forces acting together would yield an overall force acting on the object which would also be a force vector n n known as the resultant. · · ·. v are vectors. place the vector u in standard position with its base at the origin and then slide the vector v till its base coincides with the point of u. Then it seems physically reasonable that the resultant vector should have a component in the ith direction equal to (ai + bi ) ei . · · ·. Suppose the two vectors are a = k=1 ai ei and b = k=1 bi ei . What is the geometric signiﬁcance of vector addition? Suppose u. Thus ai ei = (0. it is standard notation to use i for e1 . an + bn ) . 0. Now here are some applications of vector addition to some problems. un ) . Example 12. · · ·.

The vector has length 100.4 A certain river is one half mile wide with a current ﬂowing at 4 miles per hour from East to West. the new displacement vector for the airplane is (1. is the initial position vector of the airplane.308 RN Newtons and the third exerts a force of 5i − j+2k.9.9. To ﬁnd the total force add the vectors as described above. Therefore. As it moves. Therefore. Now using that vector as the hypotenuse of a right triangle √ having equal sides. The Newton is a unit of force like pounds. . 1) . . Newtons. This gives 10i+7j + k Example 12. Example 12. 2. 2.9. Find the total force in the direction of i. After one minute the airplane has moved in the i direction a 1 1 distance of 100 × 60 = 5 kilometer.2 An airplane ﬂies North East at 100 miles per hour. Therefore. 2. A man swims directly toward the opposite shore from the South bank of the river at a speed of 3 miles per hour. The velocity of the swimmer in still water would be 3j while the velocity of the river would be −4i. the vector would the √ be 100/ 2i + 100/ 2j. Find the position of this airplane one minute later. In the j direction it has moved 60 kilometer during this 3 1 same time. 1) + 5 1 1 .3 An airplane is traveling at 100i + j + k kilometers per hour and at a certain instant of time its position is (1. Here imagine a Cartesian coordinate system in which the third component is altitude and the ﬁrst and second components are measured on a line from West to East and a line from South to North. the velocity . T 3 ' 4 You should write these vectors in terms of components. How far down the river does he ﬁnd himself when he has swam across? How far does he end up swimming? Consider the following picture. 3 60 60 = 8 121 121 . Newtons. Consider the vector (1. Write this as a vector. A picture of this situation follows. Therefore. 1) . the position vector changes. while it moves 60 kilometer in the k direction. the force in the i direction is 10 Newtons.√ sides should be each of length 100/ 2. 3 60 60 Example 12.

How far down the river does he ﬁnd himself when he has swam across? How far does he end up swimming? 5. An airplane is ﬂying due north at 150 miles per hour. The wind blows from West to East at a speed of 50 kilometers per hour and an airplane is heading North West at a speed of 300 Kilometers per hour. Since the component of velocity in the direction across the river is The speed at which he travels is √ 3. Find the third force. What will be the speed of the airplane? 3. A certain river is one half mile wide with a current ﬂowing at 2 miles per hour from East to West. and the positive x axis points east and the positive y axis points north. the plane starts ﬂying 30◦ East of North. Three forces are applied to a point which does not move. Find the displacement vector from the nest to the telephone pole. A wind is pushing the airplane due east at 40 miles per hour. EXERCISES 309 of the swimmer is −4i + 3j.12. 5 42 + 32 = 5 miles per hour and so he travels 5 × 1 = 6 miles. Two of the forces are 2i + j + 3k Newtons and i − 3j + 2k Newtons. A bird ﬂies from its nest 5 km. Now to ﬁnd the distance 6 downstream he ﬁnds himself. . it follows the trip takes 1/6 hour or 10 minutes. 7.10 Exercises 1. Therefore. A man can swim at 3 miles per hour in still water. After 1 hour. It then ﬂies 10 km. In what direction should he swim in order to travel directly across the river? What would the answer to this problem be if the river ﬂowed at 3 miles per hour and the man could swim only at the rate of 2 miles per hour? 6.10. What is the velocity of the airplane relative to the ground? What is the component of this velocity in the direction North? 2. Assuming the plane starts at (0. In the situation of Problem 1 how many degrees to the West of North should the airplane head in order to ﬂy exactly North. 3 12. A certain river is one half mile wide with a current ﬂowing at 2 miles per hour from East to West. where is it after 2 hours? Let North be the direction of the positive y axis and let East be the direction of the positive x axis. 0) . x and 1/2 are two legs of a right triangle whose hypotenuse equals 5/6 miles. by the Pythagorean theorem the distance downstream is (5/6) − (1/2) = 2 2 2 miles. note that if x is this distance. Will the airplane have enough fuel to arrive at its destination given that it has 63 gallons of fuel? 4. in the direction 60◦ north of east where it stops to rest on a tree. in the direction due southeast and lands atop a telephone pole. Place an xy coordinate system so that the origin is the bird’s nest. In the situation of 2 suppose the airplane uses 34 gallons of fuel every hour at that air speed and that it needs to ﬂy North a distance of 600 miles. 8. A man swims directly toward the opposite shore from the South bank of the river at a speed of 3 miles per hour.

310 RN .

1. 3) . This equals 0 + 2 + 0 + −3 = −1. The ﬁrst of these is called the dot product. c will denote vectors.6) Furthermore equality is obtained if and only if one of a or b is a scalar multiple of the other.1.2 The dot product satisﬁes the following properties.5) You should verify these properties.2) (13. It is listed here for the sake of convenience.4) (13. Proposition 13.4 The dot product satisﬁes the inequality |a · b| ≤ |a| |b| . Also be sure you understand that (13.Vector Products 13. Theorem 13. b be two vectors in Rn deﬁne a · b as n a·b≡ k=1 ak bk .4) follows from the ﬁrst three and is therefore redundant. The dot product satisﬁes a fundamental inequality known as the Cauchy Schwartz inequality. 1.3) (13. In the statement of these properties. 2. also called the scalar product and sometimes the inner product. Deﬁnition 13. (13. −1) · (0.1. b.3 Find (1. 2. With this deﬁnition.1. Example 13. 0. α and β will denote scalars and a. a·b=b·a a · a ≥ 0 and equals zero if and only if a = 0 (αa + βb) · c =α (a · c) + β (b · c) c · (αa + βb) = α (c · a) + β (c · b) |a| = a · a 2 (13.1) (13. 311 .1 Let a.1 The Dot Product There are two ways of multiplying vectors which are of great importance in applications. there are several important properties satisﬁed by the dot product.

(13.1).(13. Then by (13.3).2). This means that whenever something satisﬁes these properties.312 VECTOR PRODUCTS Proof: First note that if b = 0 both sides of (13. it will be assumed in what follows that b = 0. Deﬁne a function of t ∈ R f (t) = (a + tb) · (a + tb) . − a·b |b| 2 2 2 ≥ 0. Also from (13. f (t) ≥ 0 for all t ∈ R. b ∈ Rn |a + b| ≤ |a| + |b| (13.4). Now f (t) = |b| 2 2 2 t2 + 2t a·b |b| 2 + |a| 2 2 2 |b| 2 = |b| t2 + 2t a·b |b| 2 + 2 a·b |b| 2 |a| |b| 2 − a·b |b| a·b |b| 2 2 2 2 = |b| t + 2 2 |a| + 2 |b| 2 a·b |b| 2 + 2 − ≥ 0 for all t ∈ R. In particular f (t) ≥ 0 when t = − a · b/ |b| |a| 4 2 2 which implies (13. The Cauchy Schwartz inequality allows a proof of the triangle inequality for distances in Rn in much the same way as the triangle inequality for the absolute value.6) whenever one of the vectors is a scalar multiple of the other.7). You should note that the entire argument was based only on the properties of the dot product listed in (13. This proves the theorem.5) f (t) = a · (a + tb) + tb · (a + tb) = a · a + t (a · b) + tb · a + t2 b · b = |a| + 2t (a · b) + |b| t2 .8) and equality holds if and only if one of the vectors is a nonnegative scalar multiple of the other.6). Therefore.1.2).6) equal zero and so the inequality holds in this case. It only remains to verify this is the only way equality can occur. equality holds in (13. There are many other instances of these properties besides vectors in Rn . it follows that for this value of t.4.6) so it can be assumed both vectors are non zero and that equality is obtained in (13.7) |b| Multiplying both sides by |b| . then equality is obtained in (13. and (13.(13. Theorem 13. a+tb = 0 showing a = −tb.1) . From Theorem 13.9) . the Cauchy Schwartz inequality holds. This implies that f (t) = 0 2 when t = − a · b/ |b| and so from (13. Also ||a| − |b|| ≤ |a − b| (13. |a| |b| ≥ (a · b) 2 2 which yields (13. If either vector equals zero.5 (Triangle inequality) For a.5).1.

The dot product can be used to determine the included angle between two vectors.1. |a| − |b| ≤ |a − b| Similarly. Therefore. (13.8). (13. the included angle is the angle between these two vectors which is less than or equal to 180 degrees. then that vector equals zero times the other vector and the claim about when equality occurs is veriﬁed. If α < 0 then equality cannot occur in the ﬁrst inequality because in this case 2 2 (a · b) = α |a| < 0 < |α| |a| = |a · b| Therefore. ||a| − |b|| ≤ |a − b| . |a + b| = (a + b) · (a + b) = (a · a) + (a · b) + (b · a) + (b · b) = |a| + 2 (a · b) + |b| ≤ |a| + 2 |a · b| + |b| ≤ |a| + 2 |a| |b| + |b| = (|a| + |b|) . 2 2 2 2 2 2 2 2 313 Taking square roots of both sides you obtain (13.2. Therefore.10) 13. Say b = αa. This proves the theorem. To see how to do this.1 The Geometric Signiﬁcance Of The Dot Product The Angle Between Two Vectors Given two vectors.11) that (13.2. THE GEOMETRIC SIGNIFICANCE OF THE DOT PRODUCT Proof : By properties of the dot product and the Cauchy Schwartz inequality. α ≥ 0. To get equality in the second inequality above. a=a−b+b so |a| = |a − b + b| ≤ |a − b| + |b| .10) or (13.11) and either way. consider the following picture.2 13. To get the other form of the triangle inequality. Theorem 13. This is because ||a| − |b|| equals the left side of either (13.10) and (13. If either vector equals zero.9) holds. B ¨ b¨ e ¨ e ¨¨ θ e a e e e e q a − be e e .11) It follows from (13. it can be assumed both vectors are nonzero.4 implies one of the vectors must be a multiple of the other.13. It remains to consider when equality occurs. |b| − |a| ≤ |b − a| = |a − b| . a and b.

you should get used to thinking in terms of radians and not degrees.76616 = 360 because 360◦ corresponds to 2π 2π radians. If the answer is zero. a · b = |a| |b| cos θ. You can tell if two nonzero vectors are perpendicular by simply taking their dot product. Example 13.2. Example 13. 720 58 26 6 Now the cosine is known. 898◦ . . Also from the properties of the dot product. v be two vectors whose magnitudes are equal to 3 and 4 respectively and such that if they are placed in standard position with their tails at the origin. 2 2 2 2 2 2 VECTOR PRODUCTS (13. Set up a proportion.12) the cosine of the included angle equals cos θ = √ 9 √ = . The answer is θ = .2. Note this gives a geometric description of the dot product which does not depend explicitly on the coordinates of the vectors. the angle can be determines by solving the equation. cos θ = . the angle between u and the positive x axis equals 30◦ and the angle between v and the positive x axis is -30◦ . θ = .12) u · v = 3 × 4 × cos (60◦ ) = 3 × 4 × 1/2 = 6. Example 13.2.12) In words. in calculus.314 By the law of cosines.1 Find the angle between the vectors 2i + j − k and 3i + 4j + k. . this means they are are perpendicular because cos θ = 0. 720 58. |a − b| = |a| + |b| − 2 |a| |b| cos θ.4 Determine whether the two vectors. Observation 13. This will involve using a calculator or a table of trigonometric functions.2. From the geometric description of the dot product in (13. the dot product of two vectors equals the product of the magnitude of the two vectors multiplied by the cosine of the included angle. This is because all the important calculus formulas are deﬁned in terms of radians. When you take this dot product you get 2 + 3 − 5 = 0 and so these two are indeed perpendicular. Find u · v. √ √ The dot product of these two vectors equals 6+4−1 = 9 and the norms are 4 + 1 + 1 = √ √ 6 and 9 + 16 + 1 = 26. 2i + j − k and 1i + 3j + 5k are perpendicular. Therefore. from (13. 766 16 radians or in terms of degrees.2 Let u. 766 16 × 360 = 43. Recall how this 2π x last computation is done. However.3 Two vectors are said to be perpendicular if the included angle is π/2 radians (90◦ ). |a − b| = (a − b) · (a − b) = |a| + |b| − 2a · b and so comparing the above two formulas.

The only part of a force which does work in the sense of physics is the component of the force in the direction of motion. The work is deﬁned to be the magnitude of the component of this force times the distance over which it acts in the case where this component of force points in the direction of motion and (−1) times the magnitude of this component times the distance in case the force tends to impede the motion. the work equals |F| |p2 − p1 | cos θ = F· (p2 −p1 ) .2.13. one which is parallel to a given vector.2. THE GEOMETRIC SIGNIFICANCE OF THE DOT PRODUCT 315 13. F on the object would have been negative because in this case. Thus. p1 to the point p2 . This is illustrated in the following picture in the case where the given force contributes to the motion. Thus from the geometric description of the dot product given above. since F|| points in the direction of the vector from p1 to p2 .5 Let F be a force acting on an object which moves from the point. Then the work done on the object by the given force equals F· (p2 − p1 ) . F in terms of two vectors.2 Work And Projections Our ﬁrst application will be to the concept of work.2. cos θ would also be negative and so it is still the case that the work done would be given by the above formula. D and the other which is perpendicular can also be explained with no reliance on trigonometry. . The concept of writing a given vector. F|| and F⊥ and the picture is intended to indicate that when you add these two vectors you get F while F|| acts in the direction of motion and F⊥ acts perpendicular to the direction of motion. The reason for such an unusual deﬁnition is that even though your arms exerted considerable force on the weight. There are two vectors shown. sweating profusely and exerting all your strength to keep the weight from falling on your feet. then the work done by the force. the total work done should equal −→ |F| −1− 2 cos θ = |F| |p2 − p1 | cos θ p p If the included angle had been obtuse. The physical concept of work does not in any way correspond to the notion of work employed in ordinary conversation. keeping the height always three feet and then deposit this weight on another three foot high table. the physical concept of work would indicate that the force exerted by your arms did no work during this project even though the muscles in your hands and arms would likely be very tired. the force tends to impede the motion from p1 to p2 but in this case. This explains the following deﬁnition. y g F⊥ g $q p F $$$ 2 $$ $$ X $ θ g$$$ F|| q p1 In this picture the force. From trigonometry. enough to keep it from falling. Deﬁnition 13. For example. if you were to slide a 150 pound weight oﬀ a table which is three feet high and shuﬄe along the ﬂoor for 50 yards. F is applied to an object which moves on the straight line from p1 to p2 . Only F|| contributes to the work done by F on the object as it moves from p1 to p2 . completely in terms of the algebraic properties of the dot product. Thus the work done by a force on an object as the object moves from one point to another is a measure of the extent to which the force contributes to the motion. the direction of motion was at right angles to the force they exerted. you see the magnitude of F|| should equal |F| |cos θ| .

316

VECTOR PRODUCTS

As before, this is mathematically more signiﬁcant than any approach involving geometry or trigonometry because it extends to more interesting situations. This is done next. Theorem 13.2.6 Let F and D be nonzero vectors. Then there exist unique vectors F|| and F⊥ such that F = F|| + F⊥ (13.13) where F|| is a scalar multiple of D, also referred to as projD (F) , and F⊥ · D = 0. Proof: Suppose (13.13) and F|| = αD. Taking the dot product of both sides with D and using F⊥ · D = 0, this yields 2 F · D = α |D| which requires α = F · D/ |D| . Thus there can be no more than one vector, F|| . It follows F⊥ must equal F − F|| . This veriﬁes there can be no more than one choice for both F|| and F⊥ . Now let F·D F|| ≡ 2 D |D| and let F⊥ = F − F|| = F− Then F|| = α D where α =

F·D . |D|2 2

F·D |D|

2

D

**It only remains to verify F⊥ · D = 0. But
**

2 D·D |D| = F · D − F · D = 0.

F⊥ · D = F · D−

F·D

This proves the theorem. Example 13.2.7 Let F = 2i+7j − 3k Newtons. Find the work done by this force in moving from the point (1, 2, 3) to the point (−9, −3, 4) where distances are measured in meters. According to the deﬁnition, this work is (2i+7j − 3k) · (−10i − 5j + k) = −20 + (−35) + (−3) = −58 Newton meters. Note that if the force had been given in pounds and the distance had been given in feet, the units on the work would have been foot pounds. In general, work has units equal to units of a force times units of a length. Instead of writing Newton meter, people write joule because a joule is by deﬁnition a Newton meter. That word is pronounced “jewel” and it is the unit of work in the metric system of units. Also be sure you observe that the work done by the force can be negative as in the above example. In fact, work can be either positive, negative, or zero. You just have to do the computations to ﬁnd out. Example 13.2.8 Find proju (v) if u = 2i + 3j − 4k and v = i − 2j + k.

13.2. THE GEOMETRIC SIGNIFICANCE OF THE DOT PRODUCT From the above discussion in Theorem 13.2.6, this is just 1 (i − 2j + k) · (2i + 3j − 4k) (2i + 3j − 4k) 4 + 9 + 16 −8 16 24 32 (2i + 3j − 4k) = − i − j + k. 29 29 29 29

317

=

Example 13.2.9 Suppose a, and b are vectors and b⊥ = b − proja (b) . What is the magnitude of b⊥ in terms of the included angle?

|b⊥ | = (b − proja (b)) · (b − proja (b)) = b−

2

2

b·a |a|

2

a

·

2

b−

b·a |a|

2

a

2 2

= |b| − 2 = |b|

2

(b · a) |a|

2

+

2 2

b·a |a|

2

|a|

1−

(b · a)

2

|a| |b|

= |b|

2

1 − cos2 θ = |b| sin2 (θ)

2

where θ is the included angle between a and b which is less than π radians. Therefore, taking square roots, |b⊥ | = |b| sin θ.

13.2.3

The Parabolic Mirror

When light is reﬂected the angle of incidence is always equal to the angle of reﬂection. This is illustrated in the following picture in which a ray of light reﬂects oﬀ something like a mirror.

c

Angle of incidence

} Angle of reﬂection Reﬂecting surface

An interesting problem is to design a curved mirror which has the property that it will direct all rays of light coming from a long distance away (essentially parallel rays of light) to a single point. You might be interested in a reﬂecting telescope for example or some sort of scheme for achieving high temperatures by reﬂecting the rays of the sun to a small area. Turning things around, you could place a source of light at the single point and desire to have the mirror reﬂect this in a beam of light consisting of parallel rays. How can you design such a mirror?

318

VECTOR PRODUCTS

It turns out this is pretty easy given the above techniques for ﬁnding the angle between vectors. Consider the following picture.

(0, p) r }

c

T Piece of the curved mirror

r (x, y(x))

Tangent to the curved mirror

It suﬃces to consider this in a plane for x > 0 and then let the mirror be obtained as a surface of revolution. In the above picture, let (0, p) be the special point at which all the parallel rays of light will be directed. This is set up so the rays of light are parallel to the y axis. The two indicated angles will be equal and the equation of the indicated curve will be y = y (x) while the reﬂection is taking place at the point (x, y (x)) as shown. To say the two angles are equal is to say their cosines are equal. Thus from the above, (0, 1) · (1, y (x)) 1 + y (x)

2

=

(−x, p − y) · (−1, −y (x)) x2 + (y − p)

2

1 + y (x)

2

.

This follows because the vectors forming the sides of one of the angles are (0, 1) and (1, y (x)) while the vectors forming the other angle are (−x, p − y) and (−1, −y (x)) . Therefore, this yields the diﬀerential equation, y (x) = which is written more simply as x2 + (y − p) + (p − y) y = x Now let y − p = xv so that y = xv + v. Then in terms of v the diﬀerential equation is xv = √ 1 − v. 1 + v2 − v

2

−y (x) (p − y) + x x2 + (y − p)

2

Using the technique for solving separable diﬀerential equations described in Problem 13 on Page 220 this reduces to 1 dx √ − v dv = x 1 + v2 − v

√ 1 To ﬁnd − v dv use a trig. substitution, v = tan θ. Then in terms of θ, the 1+v 2 −v antiderivative becomes

1 − tan θ sec2 θ dθ sec θ − tan θ

=

sec θ dθ

= ln |sec θ + tan θ| + C.

13.2. THE GEOMETRIC SIGNIFICANCE OF THE DOT PRODUCT Now in terms of v, this is ln v + 1 + v 2 = ln x + c.

319

There is no loss of generality in letting c = ln C because ln maps onto R. Therefore, from laws of logarithms, ln v + 1 + v2 = ln x + c = ln x + ln C = ln Cx. Therefore, v+ and so 1 + v 2 = Cx − v. Now square both sides to get 1 + v 2 = C 2 x2 + v 2 − 2Cxv which shows 1 = C 2 x2 − 2Cx Solving this for y yields y= C 2 1 x + p− 2 2C y−p = C 2 x2 − 2C (y − p) . x 1 + v 2 = Cx

and for this to correspond to reﬂection as described above, it must be that C > 0. As described in an earlier section, this is just the equation of a parabola. Note it is possible to choose C as desired adjusting the shape of the mirror.

13.2.4

The Equation Of A Plane

The dot product makes possible a description of planes. To ﬁnd the equation of a plane, you need two things, a point contained in the plane and a vector normal to the plane. Let p0 = (x0 , y0 , z0 ) denote the position vector of a point in the plane, let p = (x, y, z) be the position vector of an arbitrary point in the plane, and let n denote a vector normal to the plane. This means that n· (p − p0 ) = 0 whenever p is the position vector of a point in the plane. The following picture illustrates the geometry of this idea. ¢ ¢ ¢ ¢ ¢ ¢ # £ ¢ ¢ £n ¢ ¢ p !i £p ¢ ¢ 0 ¢ ¢ ¢ ¢ ¢ ¢ ¡ ¡ ¡ ¡ ¡ Expressed equivalently, the plane is just the set of all points p such that the vector, p − p0 is perpendicular to the given normal vector, n.

320

VECTOR PRODUCTS

Example 13.2.10 Find the equation of the plane with normal vector, n = (1, 2, 3) containing the point (2, −1, 5) . From the above, the equation of this plane is just (1, 2, 3) · (x − 2, y + 1, z − 3) = x − 9 + 2y + 3z = 0 Example 13.2.11 2x + 4y − 5z = 11 is the equation of a plane. Find the normal vector and a point on this plane. You can write this in the form 2 x − 11 + 4 (y − 0) + (−5) (z − 0) = 0. Therefore, a 2 normal vector to the plane is 2i + 4j − 5k and a point in this plane is 11 , 0, 0 . Of course 2 there are many other points in the plane. Deﬁnition 13.2.12 Suppose two planes intersect. The angle between the planes is deﬁned to be the angle between their normal vectors.

13.3

Exercises

1. Use formula (13.12) to verify the Cauchy Schwartz inequality and to show that equality occurs if and only if one of the vectors is a scalar multiple of the other. 2. For u, v vectors in R3 , deﬁne the product, u ∗ v ≡ u1 v1 + 2u2 v3 + 3u3 v3 . Show the ax1/2 1/2 ioms for a dot product all hold for this funny product. Prove |u ∗ v| ≤ (u ∗ v) (v ∗ v) . Hint: Do not try to do this with methods from trigonometry. 3. Find the angle between the vectors 3i − j − k and i + 4j + 2k. 4. Find the angle between the vectors i − 2j + k and i + 2j − 7k. 5. If F is a force and D is a vector, show projD (F) = (|F| cos θ) u where u is the unit vector in the direction of D, u = D/ |D| and θ is the included angle between the two vectors, F and D. 6. A boy drags a sled for 100 feet along the ground by pulling on a rope which is 20 degrees from the horizontal with a force of 10 pounds. How much work does this force do? 7. An object moves 10 meters in the direction of j. There are two forces acting on this object, F1 = i + j + 2k, and F2 = −5i + 2j−6k. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force. 8. An object moves 10 meters in the direction of j + i. There are two forces acting on this object, F1 = i + 2j + 2k, and F2 = 5i + 2j−6k. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force. 9. An object moves 20 meters in the direction of k + j. There are two forces acting on this object, F1 = i + j + 2k, and F2 = i + 2j−6k. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force. 10. If a, b, and c are vectors. Show that (b + c)⊥ = b⊥ + c⊥ where b⊥ = b− proja (b) .

13.4.

THE CROSS PRODUCT

321

11. In the discussion of the reﬂecting mirror which directs all rays to a particular point, (0, p) . Show that for any choice of positive C this point is the focus of the parabola 1 and the directrix is y = p − C . 12. Suppose you wanted to make a solar powered oven to cook food. Are there reasons for using a mirror which is not parabolic? Also describe how you would design a good ﬂash light with a beam which does not spread out too quickly. 13. Find (1, 2, 3, 4) · (2, 0, 1, 3) . 14. Show that (a · b) =

1 4

|a + b| − |a − b|

2

2

.

2

15. Prove from the axioms of the dot product the parallelogram identity, |a + b| + 2 2 2 |a − b| = 2 |a| + 2 |b| . 16. Find the equation of the plane having a normal vector, 5i + 2j−6k which contains the point (2, 1, 3) . 17. Find the equation of the plane having a normal vector, i + 2j−4k which contains the point (2, 0, 1) . 18. Find the equation of the plane having a normal vector, 2i + j−6k which contains the point (1, 1, 2) . 19. Find the equation of the plane having a normal vector, i + 2j−3k which contains the point (1, 0, 3) . 20. If (a, b, c) = (0, 0, 0) , show that ax + by + cz = d is the equation of a plane with normal vector ai + bj + ck. 21. Find the cosine of the angle between the two planes 2x+3y−z = 11 and 3x+y+2z = 9. 22. Find the cosine of the angle between the two planes x+3y −z = 11 and 2x+y +2z = 9. 23. Find the cosine of the angle between the two planes 2x+y−z = 11 and 3x+5y+2z = 9. 24. Find the cosine of the angle between the two planes x+3y+z = 11 and 3x+2y+2z = 9.

13.4

The Cross Product

The cross product is the other way of multiplying two vectors in R3 . It is very diﬀerent from the dot product in many ways. First the geometric meaning is discussed and then a description in terms of coordinates is given. Both descriptions of the cross product are important. The geometric description is essential in order to understand the applications to physics and geometry while the coordinate description is the only way to practically compute the cross product. Deﬁnition 13.4.1 Three vectors, a, b, c form a right handed system if when you extend the ﬁngers of your right hand along the vector, a and close them in the direction of b, the thumb points roughly in the direction of c.

322

VECTOR PRODUCTS

For an example of a right handed system of vectors, see the following picture. ¢ ¢c

y a ¢¢ ¨ b ¨ ¨ % ¨

¢

¢

In this picture the vector c points upwards from the plane determined by the other two vectors. You should consider how a right hand system would diﬀer from a left hand system. Try using your left hand and you will see that the vector, c would need to point in the opposite direction as it would for a right hand system. From now on, the vectors, i, j, k will always form a right handed system. To repeat, if you extend the ﬁngers of our right hand along i and close them in the direction j, the thumb points in the direction of k. The following is the geometric description of the cross product. It gives both the direction and the magnitude and therefore speciﬁes the vector. Deﬁnition 13.4.2 Let a and b be two vectors in Rn . Then a × b is deﬁned by the following two rules. 1. |a × b| = |a| |b| sin θ where θ is the included angle. 2. a × b · a = 0, a × b · b = 0, and a, b, a × b forms a right hand system. The cross product satisﬁes the following properties. a × b = − (b × a) , a × a = 0, For α a scalar, (αa) ×b = α (a × b) = a× (αb) , For a, b, and c vectors, one obtains the distributive laws, a× (b + c) = a × b + a × c, (b + c) × a = b × a + c × a. (13.16) (13.17) (13.15) (13.14)

Formula (13.14) follows immediately from the deﬁnition. The vectors a × b and b × a have the same magnitude, |a| |b| sin θ, and an application of the right hand rule shows they have opposite direction. Formula (13.15) is also fairly clear. If α is a nonnegative scalar, the direction of (αa) ×b is the same as the direction of a × b,α (a × b) and a× (αb) while the magnitude is just α times the magnitude of a × b which is the same as the magnitude of α (a × b) and a× (αb) . Using this yields equality in (13.15). In the case where α < 0, everything works the same way except the vectors are all pointing in the opposite direction and you must multiply by |α| when comparing their magnitudes. The distributive laws are much harder to establish but the second follows from the ﬁrst quite easily. Thus, assuming the ﬁrst, and using (13.14), (b + c) × a = −a× (b + c) = − (a × b + a × c) = b × a + c × a.

13.4.

THE CROSS PRODUCT

323

A proof of the distributive law is given in a later section for those who are interested. Now from the deﬁnition of the cross product, i × j = k j × i = −k k × i = j i × k = −j j × k = i k × j = −i With this information, the following gives the coordinate description of the cross product. Proposition 13.4.3 Let a = a1 i + a2 j + a3 k and b = b1 i + b2 j + b3 k be two vectors. Then a × b = (a2 b3 − a3 b2 ) i+ (a3 b1 − a1 b3 ) j+ + (a1 b2 − a2 b1 ) k. Proof: From the above table and the properties of the cross product listed, (a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k) = a1 b2 i × j + a1 b3 i × k + a2 b1 j × i + a2 b3 j × k+ +a3 b1 k × i + a3 b2 k × j = a1 b2 k − a1 b3 j − a2 b1 k + a2 b3 i + a3 b1 j − a3 b2 i = (a2 b3 − a3 b2 ) i+ (a3 b1 − a1 b3 ) j+ (a1 b2 − a2 b1 ) k (13.19) This proves the proposition. It is probably impossible for most people to remember (13.18). Fortunately, there is a somewhat easier way to remember it. This involves the notion of a determinant. A determinant is a single number assigned to a square array of numbers as follows. det a b c d ≡ a b c d = ad − bc. (13.18)

This is the deﬁnition of the determinant of a square array of numbers having two rows and two columns. Now using this, the determinant of a square array of numbers in which there are three rows and three columns is deﬁned as follows. a b c e f 1+1 det d e f ≡ (−1) a i j h i j + (−1)

1+2

b

d f h j

+ (−1)

1+3

c

d h

e . i

Take the ﬁrst element in the top row, a, multiply by (−1) raised to the 1 + 1 since a is in the ﬁrst row and the ﬁrst column, and then multiply by the determinant obtained by crossing out the row and the column in which a appears. Then add to this a similar number 1+2 obtained from the next element in the ﬁrst row, b. This time multiply by (−1) because b is in the second column and the ﬁrst row. When this is done do the same for c, the last element in the ﬁrst row using a similar process. Using the deﬁnition of a determinant for square arrays of numbers having two columns and two rows, this equals a (ej − if ) + b (f h − dj) + c (di − eh) , an expression which, like the one for the cross product will be impossible to remember, although the process through which it is obtained is not too bad. It turns out these two

324

VECTOR PRODUCTS

impossible to remember expressions are linked through the process of ﬁnding a determinant which was just described. The easy way to remember the description of the cross product in terms of coordinates, is to write i a × b = a1 b1 j a2 b2 k a3 b3 (13.20)

and then follow the same process which was just described for calculating determinants above. This yields (a2 b3 − a3 b2 ) i− (a1 b3 − a3 b1 ) j+ (a1 b2 − a2 b1 ) k (13.21)

which is the same as (13.19). Later in the book a complete discussion of determinants is given but this will suﬃce for now. Example 13.4.4 Find (i − j + 2k) × (3i − 2j + k) . Use (13.20) to compute this. i 1 3 j k −1 2 −2 1 = −1 −2 2 1 i− 1 2 3 1 j+ 1 3 −1 −2 k

= 3i + 5j + k.

13.4.1

The Distributive Law For The Cross Product

This section gives a proof for (13.16), a fairly diﬃcult topic. It is included here for the interested student. If you are satisﬁed with taking the distributive law on faith, it is not necessary to read this section. The proof given here is quite clever and follows the one given in [5]. Another approach, based on volumes of parallelepipeds is found in [16] and is discussed a little later. Lemma 13.4.5 Let b and c be two vectors. Then b × c = b × c⊥ where c|| + c⊥ = c and c⊥ · b = 0. Proof: Consider the following picture.

c⊥ Tc θ b

E

b b Now c⊥ = c − c· |b| |b| and so c⊥ is in the plane determined by c and b. Therefore, from the geometric deﬁnition of the cross product, b × c and b × c⊥ have the same direction. Now, referring to the picture,

|b × c⊥ | = |b| |c⊥ | = |b| |c| sin θ = |b × c| . Therefore, b × c and b × c⊥ also have the same magnitude and so they are the same vector. With this, the proof of the distributive law is in the following theorem.

13.4.

THE CROSS PRODUCT

325

Theorem 13.4.6 Let a, b, and c be vectors in R3 . Then a× (b + c) = a × b + a × c (13.22) Proof: Suppose ﬁrst that a · b = a · c = 0. Now imagine a is a vector coming out of the page and let b, c and b + c be as shown in the following picture. a × (b + c) f w f f f f f f f f f s d f d a×c d f

T a×b

I f c d f r b + c r d E f b

Then a × b, a× (b + c) , and a × c are each vectors in the same plane, perpendicular to a as shown. Thus a × c · c = 0, a× (b + c) · (b + c) = 0, and a × b · b = 0. This implies that to get a × b you move counterclockwise through an angle of π/2 radians from the vector, b. Similar relationships exist between the vectors a× (b + c) and b + c and the vectors a × c and c. Thus the angle between a × b and a× (b + c) is the same as the angle between b + c and b and the angle between a × c and a× (b + c) is the same as the angle between c and b + c. In addition to this, since a is perpendicular to these vectors, |a × b| = |a| |b| , |a× (b + c)| = |a| |b + c| , and |a × c| = |a| |c| . Therefore, |a × c| |a × b| |a× (b + c)| = = = |a| |b + c| |c| |b|

|b + c| |a× (b + c)| |b + c| |a× (b + c)| = , = |a × c| |c| |a × b| |b| showing the triangles making up the parallelogram on the right and the four sided ﬁgure on the left in the above picture are similar. It follows the four sided ﬁgure on the left is in fact a parallelogram and this implies the diagonal is the vector sum of the vectors on the sides, yielding (13.22). Now suppose it is not necessarily the case that a · b = a · c = 0. Then write b = b|| + b⊥ where b⊥ · a = 0. Similarly c = c|| + c⊥ . By the above lemma and what was just shown, a× (b + c) = a× (b + c)⊥ = a× (b⊥ + c⊥ ) = a × b⊥ + a × c⊥ = a × b + a × c. This proves the theorem. The result of Problem 10 of the exercises 13.3 is used to go from the ﬁrst to the second line.

and so

326

VECTOR PRODUCTS

13.4.2

Torque

Imagine you are using a wrench to loosen a nut. The idea is to turn the nut by applying a force to the end of the wrench. If you push or pull the wrench directly toward or away from the nut, it should be obvious from experience that no progress will be made in turning the nut. The important thing is the component of force perpendicular to the wrench. It is this component of force which will cause the nut to turn. For example see the following picture.

¢¢ F ¢ ¢θ¨¨ ¨ ¢

¢¢ F u e ¢ e F⊥ ¢θ¨¨ e¨ B¢ ¨¨ R ¨ ¨¨ ¨

In the picture a force, F is applied at the end of a wrench represented by the position vector, R and the angle between these two is θ. Then the tendency to turn will be |R| |F⊥ | = |R| |F| sin θ, which you recognize as the magnitude of the cross product of R and F. If there were just one force acting at one point whose position vector is R, perhaps this would be suﬃcient, but what if there are numerous forces acting at many diﬀerent points with neither the position vectors nor the force vectors in the same plane; what then? To keep track of this sort of thing, deﬁne for each R and F, the Torque vector, τ ≡ R × F. That way, if there are several forces acting at several points, the total torque can be obtained by simply adding up the torques associated with the diﬀerent forces and positions. Example 13.4.7 Suppose R1 = 2i − j+3k, R2 = i+2j−6k meters and at the points determined by these vectors there are forces, F1 = i − j+2k and F2 = i − 5j + k Newtons respectively. Find the total torque about the origin produced by these forces acting at the given points. It is necessary to take R1 × F1 + R2 × F2 . Thus the total torque equals i 2 1 j k −1 3 −1 2 + i 1 1 j 2 −5 k −6 1 = −27i − 8j − 8k Newton meters

Example 13.4.8 Find if possible a single force vector, F which if applied at the point i + j + k will produce the same torque as the above two forces acting at the given points. This is fairly routine. The problem is to ﬁnd F = F1 i + F2 j + F3 k which produces the above torque vector. Therefore, i 1 F1 j 1 F2 k 1 F3 = −27i − 8j − 8k

13.4.

THE CROSS PRODUCT

327

which reduces to (F3 − F2 ) i+ (F1 − F3 ) j+ (F2 − F1 ) k = − 27i − 8j − 8k. This amounts to solving the system of three equations in three unknowns, F1 , F2 , and F3 , F3 − F2 = −27 F1 − F3 = −8 F2 − F1 = −8 However, there is no solution to these three equations. (Why?) Therefore no single force acting at the point i + j + k will produce the given torque. The mass of an object is a measure of how much stuﬀ there is in the object. An object has mass equal to one kilogram, a unit of mass in the metric system, if it would exactly balance a known one kilogram object when placed on a balance. The known object is one kilogram by deﬁnition. The mass of an object does not depend on where the balance is used. It would be one kilogram on the moon as well as on the earth. The weight of an object is something else. It is the force exerted on the object by gravity and has magnitude gm where g is a constant called the acceleration of gravity. Thus the weight of a one kilogram object would be diﬀerent on the moon which has much less gravity, smaller g, than on the earth. An important idea is that of the center of mass. This is the point at which an object will balance no matter how it is turned. Deﬁnition 13.4.9 Let an object consist of p point masses, m1 , · · ··, mp with the position of the k th of these at Rk . The center of mass of this object, R0 is the point satisfying

p

(Rk − R0 ) × gmk u = 0

k=1

for all unit vectors, u. The above deﬁnition indicates that no matter how the object is suspended, the total torque on it due to gravity is such that no rotation occurs. Using the properties of the cross product,

p p

Rk gmk − R0

k=1 k=1

gmk

×u=0

(13.23)

for any choice of unit vector, u. You should verify that if a × u = 0 for all u, then it must be the case that a = 0. Then the above formula requires that

p p

Rk gmk − R0

k=1 k=1

gmk = 0.

dividing by g, and then by

p k=1

mk , R0 =

p k=1 Rk mk . p k=1 mk

(13.24)

This is the formula for the center of mass of a collection of point masses. To consider the center of mass of a solid consisting of continuously distributed masses, you need the methods of calculus. Example 13.4.10 Let m1 = 5, m2 = 6, and m3 = 3 where the masses are in kilograms. Suppose m1 is located at 2i+3j+k, m2 is located at i−3j+2k and m3 is located at 2i−j+3k. Find the center of mass of these three masses.

The result will be either the desired volume or minus the desired volume. i + 3j − 6k. s.328 Using (13.3 The Box Product Deﬁnition 13.3i + 2j + 3k. b] . The following is a picture of such a thing.4. Example 13. 1] and form ra+sb + tb. You should consider what happens if you interchange the b with the c or the a with the c. You can see geometrically from drawing pictures that this merely introduces a minus sign. t ∈ [0. b. c. a. (i + 2j − 5k) × (i + 3j − 6k) = = i j k 1 2 −5 1 3 −6 3i + j + k . s.4. pick any two of these. According to the above discussion.12 Find the volume of the parallelepiped determined by the vectors. r. take the cross product and then take the dot product of this with the third of these vectors. Therefore. and c consists of {ra+sb + tb : r. and t each in [0. the parallelogram determined by the vectors. a and c has area equal to |a × c| while the altitude of the parallelepiped is |b| cos θ where θ is the angle shown in the picture between b and a × c. i + 2j − 5k. That is. ¨¨ ¢ ¢ ¨ ¢ ¢ ¨¨ ¨ ¢ ¢ ¢ ¢ ¢ ¢ ¢ ¢ ¢ a × c ¢ ¢ ¢ ¢ c¢ ¢ ¢ ¢ ¢ ¢ B¢ ¢ ¨¨ ¨¨ ¨ θ ¢ ¨¨b ¨ ¨¨ ¢ ¨ E¨ a You notice the area of the base of the parallelepiped. if you pick three numbers. 1]} .24) R0 = VECTOR PRODUCTS 5 (2i + 3j + k) + 6 (i − 3j + 2k) + 3 (2i − j + 3k) 5+6+3 11 3 13 = i− j+ k 7 7 7 13. This expression is known as the box product and is sometimes written as [a. the volume of this parallelepiped is the area of the base times the altitude which is just |a × c| |b| cos θ = a × c · b.4. then the collection of all such points is what is meant by the parallelepiped determined by these three vectors. In any case the box product of three vectors always equals either the volume of the parallelepiped determined by the three vectors or else minus this volume.11 A parallelepiped determined by the three vectors.

and c are three vectors whose components are all integers. Now to prove the distributive law. Suppose a = (a1 . 5. Find the center of mass of these three masses. c2 . m2 = 3. Suppose m1 is located at 2i − 3j + k. x · a× (b + c) = (x × a) · (b + c) = (x × a) · b+ (x × a) · c =x·a×b+x·a×c = x· (a × b + a × c) . Show that if a × u = 0 for all unit vectors.5. this holds for x = a× (b + c) − (a × b + a × c) showing that a× (b + c) = a × b + a × c and this proves the distributive law for the cross product another way. c3 ) . What does it mean geometrically if the box product of three vectors gives zero? 8. and c = (c1 . Let m1 = 5. j. b3 ) . In particular. a3 ) . EXERCISES 329 Now take the dot product of this vector with the third which yields (3i + j + k) · (3i + 2j + 3k) = 9 + 2 + 3 = 14. Can you conclude the volume of the parallelepiped determined from these three vectors will always be an integer? 7. and m3 = 1 where the masses are in kilograms and the distance is in meters. c or -1 times the volume of this parallelepiped. and m3 = 4 where the masses are in kilograms and the distance is in meters. i − 2j − 6k. m2 = 1. Here is another proof of the distributive law for the cross product. show that this implies (13. b2 . Find the center of mass of these three masses. b. i − 7j − 5k. 4. Show the box product. c] equals the determinant c1 a1 b1 c2 a2 b2 c3 a3 b3 .23) holds for u = i. b = (b1 . Suppose a. let x be a vector. m2 is located at i − 2j + k and m3 is located at 4i + j + 3k. b.13. b. m2 is located at i − 3j + 6k and m3 is located at 2i + j + 3k.5 Exercises 1. u. . Find the volume of the parallelepiped determined by the vectors.23) holds for all unit vectors. If you only assume (13. From the above picture (a × b) ·c = a· (b × c) because both of these give either the volume of a parallelepiped determined by the vectors a. Suppose m1 is located at 2i − j + k. x· [a× (b + c) − (a × b + a × c)] = 0 for all x. then a = 0. 13. Let m1 = 2. [a. 3. Therefore. From the above observation.3i + 2j + 3k. a2 . This shows the volume of this parallelepiped is 14 cubic units. 2. u. 6. k.

j. 1) . (2. j. Next verify that the distributive law holds for the coordinate description of the cross product. a × b has the property that it is perpendicular to both a and b. . by the coordinate description of the dot and cross product and by geometric reasoning. 1. Explain why |a × b| has the correct magnitude. 2. the dot and the cross can be switched as long as the order of the vectors remains the same. Hint: There are two ways to do this. k) = (1. show that (a × b) ·c = a· (b × c) In other words. 2) . 0 if i = j With the Kroneker symbol. 2) −1 if (i. 10. It is desired to ﬁnd an equation of a plane containing the two vectors.6. here is the deﬁnition of these symbols. or (3. j.6. Verify directly from the coordinate description of the cross product that the right thing happens with regards to the vectors i. εijk is deﬁned as follows. Using Problem 7. z) such that the above expression equals zero. 1) . First deﬁne it in terms of coordinates and then get the geometric properties from this. y. δ ij ≡ 1 if i = j . n} for any n ∈ N. This gives another way to approach the cross product. 13. k. δ ij . Deﬁnition 13.330 VECTOR PRODUCTS 9. 2. Then show by direct computation that this coordinate description satisﬁes |a × b| = |a| |b| − (a · b) = |a| |b| 2 2 2 2 2 2 1 − cos2 (θ) where θ is the angle included between the two vectors. j. the set of all (x. and k integers in the set. 3) . i and j can equal any integer in {1.2 For i. A single one is called an index. 1 if (i. {1. · · ·. show an equation for this plane is x a1 b1 y a2 b2 z a3 b3 =0 That is. Deﬁnition 13. 3. a and b. Verify directly that the coordinate description of the cross product. Using the notion of the box product yielding either plus or minus the volume of the parallelepiped determined by the given three vectors. 3} . 2. or (3. εijk ≡ 0 if there are any repeated integers The subscripts ijk and ij in the above are called indices. 3) .1 The symbol. All that is missing is the material about the right hand rule. k) = (2. called the Kroneker delta symbol is deﬁned as follows. (1. δ ij and εijk which are very useful in dealing with vector identities. 11. 1. 2. This symbol. To begin with. 3. εijk is also called the permutation symbol.6 Vector Identities And Notation There are two special symbols.

i u1 v1 j u2 v2 k u3 v3 . Then the right side equals zero. v are vectors in R3 . note that if the two sets are not equal. It is useful to use the Einstein summation convention when dealing with these symbols. You should check that this rule reduces to the above deﬁnition. s} then every term in the sum on the left must have either εijk or εirs contains a repeated index. When you use this convention. εijk εirs = (δ jr δ ks − δ kr δ js ) . Also. Therefore. the left side equals zero. If there is a repeated index in {j. u×v ≡ and so (u × v)1 = (u2 v3 − u3 v2 ) . k} . If i = r and j = s for s = r. then there is exactly one term in the sum on the left and it equals 1. vn ). The right side also reduces to -1 in this case. For example. If u. The reason for this is that you end up getting confused about what is meant. δ ik ak = ai . · · ·. k. s} . un ) and the Cartesian coordinates of v are (v1 . With this notation. VECTOR IDENTITIES AND NOTATION 331 The way to think of εijk is that ε123 = 1 and if you switch any two of the numbers in the list i. Then u · v = ui vi . Lemma 13. then (u × v)i = εijk uj vk . it could be that j is not equal to either r or s. the convention is that you sum over the repeated index. j. it immediately implies that if there is a repeated index. The second is veriﬁed by simply checking it works. For example. To see this. it can be assumed {j.4 Let u. there is exactly one term in the sum on the left which is nonzero and it must equal -1. the answer is zero. The last claim follows directly from the deﬁnition. then there is one of the indices in one of the sets which is not in the other set. v be vectors in Rn where the Cartesian coordinates of u are (u1 . it changes the sign. · · ·. From the above formula in the proposition. It is this: Never have an index be repeated more than once. Simply stated. Thus ai bi is all right but aii bi is not. The right side also equals zero in this case. you can easily discover vector identities and simplify expressions which involve the cross product. there is one very important thing to never forget. This follows because εiij = −εiij and so εiij = 0. Proof: If {j.3 The following holds. Therefore. If you want to write i ai bi ci it is best to simply use the summation notation. Thus εijk = −εjik and εijk = −εkji etc.6. Proof: The ﬁrst claim is obvious from the deﬁnition of the dot product. Also. The cases for (u × v)2 and (u × v)3 are veriﬁed similarly.6. k} = {r. the same thing. Thus ai bi means i ai bi . For example. then every term in the sum on the left equals zero. k} = {r. δ ij xj means j δ ij xj = xi .6.13. The right also reduces to zero in this case because then j = k = r = s and so the right side becomes (1) (1) − (−1) (−1) = 0. Proposition 13. If i = s and j = r. The right also reduces to 1 in this case. ε1jk uj vk ≡ u2 v3 − u3 v2 . There is a very important reduction identity connecting these two symbols.

From the above reduction formula. 2 2 2 2 . Actually. Discover a vector identity for (u × v) × (z × w) in terms of box products. ((u × v) ×w)i = εijk (u × v)j wk = εijk εjrs ur vs wk = −εjik εjrs ur vs wk = − (δ ir δ ks − δ is δ kr ) ur vs wk = − (ui vk wk − uk vi wk ) = u · wvi − v · wui = ((u · w) v − (v · w) u)i . Prove that εijk εijr = 2δ kr . Simplify (u × v) · (v × w) × (w × z) . It also occurs in the subject of diﬀerential geometry. Discover a vector identity for u× (v × w) . Discover a vector identity for (u × v) · (z × w) . 5.5 Discover a formula which simpliﬁes (u × v) ×w.6.332 Example 13. it follows that (u × v) ×w = (u · w) v − (v · w) u. Simplify |u × v| + (u × v) − |u| |v| . 6. VECTOR PRODUCTS Since this holds for all i. 13. 3. but there is time for this more general notion later. This is good notation and it will be used in the rest of the book whenever convenient. You will see it in advanced books on mechanics in physics and engineering. 2. 4. this notation is a special case of something more elaborate in which the level of the indices is also important.7 Exercises 1.

Example 14. 9. If f : D (f ) → X and g : X → Y. Then D (f ) would consist of the set z of all (x. f × g by f × g (t) ≡ f (t) × g (t) . 1 − x2 .4 Let f . y 3 + x. z) such that |x| ≤ 1 and z = 0. θ (t) is the longitude. D (f ) denotes the domain of the function. f (1. As before. Here is a simple example which is obviously of interest. This is called the composition of the two functions.0. You could deﬁne a function having values in R3 as (r (t) . y . deﬁne a new function. When D (f ) is not speciﬁed. it will be understood that the domain of f consists of those things for which f makes sense. f which is written in bold face because it will possibly have values in Rp . φ (t)) where r (t) is the distance of the center of mass of the rocket from the center of the earth. g be functions with values in Rp . Let a. then g ◦ f is the name of a function whose domain is {x ∈ D (f ) : f (x) ∈ D (g)} which is deﬁned as g ◦ f (x) ≡ g (f (x)) . 333 . Then f is a function deﬁned on R2 which has values in R3 . There are many ways to make new functions from old ones. Example 14.3 Let f (x. 16). For example.2 Let f (x. x4 . b be elements of R (scalars).0. y) = sin xy. If f and g have values in R3 .0.1 A rocket is launched from the rotating earth. 2) = (sin 2. and φ (t) is the latitude of the rocket. g) is the name of a function whose domain is D (f ) ∩ D (g) which is deﬁned as (f . √ Example 14.0. g) (x) ≡ f · g (x) ≡ f (x) · g (x) . Deﬁnition 14. z) = x+y .Functions Vector valued functions have values in Rp where p is an integer at least as large as 1. y. Then af + bg is the name of a function whose domain is D (f ) ∩ D (g) which is deﬁned as (af + bg) (x) = af (x) + bg (x) . f · g or (f . θ (t) . y.

f is continuous if it is continuous at every point of D (f ) . y) ≡ x + y. and h (t) = (sin t. g be given in the previous problem. t. Find f × g. 1 + t2 and g:R2 → R is given by g (x. Note the total similarity to the scalar valued case. Deﬁnition 14. Find D (f ) if f (x. . 1. Example 14. It is the value of the function at the point.0. t+1 2. t2 + 1. people often write f (x) to denote a function and it doesn’t cause too many problems in beginning courses. g. y. t. Let f (t) = t. I will use this slightly sloppy abuse of notation whenever convenient. t2 +1 . z. t2 . Find the time rate of change of the volume of the parallelepiped spanned by the vectors f .1 Exercises t and let g (t) = t + 1. Let f (t) = t.2.5 Let f (t) ≡ (t. The name of the function is f .1 A function f : D (f ) ⊆ Rp → Rq is continuous at x ∈ D (f ) if for each ε > 0 there exists δ > 0 such that whenever y ∈ D (f ) and |y − x| < δ it follows that |f (x) − f (y)| < ε.0. w) = xy zw . Example 14. 3. t 1. When this is done. 14. Then f · g is the name of the function satisfying f · g (t) = f (t) · g (t) = t3 + t + t2 + 2t = t3 + t2 + 3t Note that in this case is was assumed the domains of the functions consisted of all of R because this was the set on which the two both made sense.2 Continuous Functions What was done earlier for scalar functions is generalized here to include the case of a vector valued function.334 FUNCTIONS You should note that f (x) is not a function. g (t) = 1. 4. Nevertheless. 1 + t. t . 1 + t2 = 1 + 2t + t2 .6 Suppose f (t) = 2t. 1) . Let f . 6 − x2 y 2 . x should be considered as a generic variable free to be anything in D (f ) . Also note that f and g map R into R3 but f · g maps R into R. t2 . t3 . 2) and g (t) ≡ t2 . x. t. Then g ◦ f : R → R and g ◦ f (t) = g (f (t)) = g 2t. and h. 14. the variable. Find f · g.

For example the ﬁrst claim says that (af + bg) (y) is close to (af + bg) (x) when y is close to x provided the same can be said about f and g. Suppose |f (x) − f (y)| ≤ K |x − y| where K is a constant. Deﬁnition 14. EXERCISES 335 14. The fourth claim is immediate from the triangle inequality.3. fq ) : D (f ) → Rq . and g is continuous at f (x) . If f is continuous at x. g (f (y)) is close to g (f (x)) . If f = (f1 .2.then g ◦ f is continuous at x. f (x) is close to f (y) and so by continuity of g at f (x). The above theorem implies that polynomials are all continuous. Show that f is everywhere continuous. where the dα are real numbers. 2. there is a notion of polynomial just as there is for functions deﬁned on R. b ∈ R. For the second claim. Let f (t) = (t.2 The following assertions are valid. let n |α| ≡ i=1 |αi | The symbol. Its conclusions are not surprising. For functions deﬁned on Rn . 4. f (x) ∈ D (g) ⊆ Rp . . sin t) . 2. The function f : Rp → R. The function. given by f (x) = |x| is continuous.14.1 Suﬃcient Conditions For Continuity The next theorem is a fundamental result which will allow us to worry less about the ε δ deﬁnition of continuity. Theorem 14.3 Exercises 1. To see the third claim is likely. Show f is continuous at every point t. · · ·. The proof of this theorem is in the last section of this chapter. xα . Also. 3. This means α = (α1 . 3 1 2 An n dimensional polynomial of degree m is a function of the form p (x) = |α|≤m dα xα. αn ) where each αi is a natural number or zero. note that closeness in Rp is the same as closeness in each coordinate. then f is continuous if and only if each fk is a continuous real valued function. 14.means xα ≡ xα1 xα2 · · · xαn . af + bg is continuous at x whenever f . 1.3 Let α be an n dimensional multi-index. g are continuous at x ∈ D (f ) ∩ D (g) and a. · · ·. if y is close to x.2.2. Functions satisfying such an inequality are called Lipschitz functions.

Use Theorem 14.3 If limy→x f (y) = L and limy→x f (y) = L1 . Deﬁnition 14. There exists δ > 0 such that if 0 < |y − x| < δ and y ∈ D (f ) . A point. Suppose f : R3 → R is given by f (x) = 3x1 x2 + 2x2 . Then lim f (y) = L y→x if and only if the following condition holds. and y ∈ D (f ) then.4. Pick such a y. x.4 Limits Of A Function As in the case of scalar valued functions of one variable.336 α FUNCTIONS 3. x.2.2 to verify 3 that f is continuous. There exists one because x is a limit point of D (f ) .4. Suppose |f (x) − f (y)| ≤ K |x − y| where K is a constant and α ∈ (0. then |f (y) − L| < ε. which are limit points of D (f ) and this concept is deﬁned next. Deﬁnition 14. π k : R3 → R given by π k (x) = xk is a continuous function.1 which involves quotients of functions encountered in the previous problem. Generalize the previous problem to the case where f : Rq → R is a polynomial.4. a concept closely related to continuity is that of the limit of a function. Hint: You should ﬁrst verify that the function. For all ε > 0 there exists δ > 0 such that if 0 < |y − x| < δ. is a limit point of A if B (x. The following theorem is just like the one variable version presented earlier. there exists δ > 0 such that whenever |y − x| < δ and y ∈ D (f ) . 14.4. Proof: Let ε > 0 be given. |f (y) − L1 | < ε. As in the case of functions of one variable. 1). 4. limy→x f (x) = ∞ if for every number l.2 Let f : D (f ) ⊆ Rp → Rq be a function and let x be a limit point of D (f ) .4. Theorem 14. then L = L1 . r) contains inﬁnitely many points of A for every r > 0.4 If f (x) ∈ R. Show that f is everywhere continuous. Then |L − L1 | ≤ |L − f (y)| + |f (y) − L1 | < ε + ε = 2ε.1 Let A ⊆ Rm be a set. Since ε > 0 was arbitrary. Deﬁnition 14. State and prove a theorem using Theorem 5. then f (x) > l. The notion of limit of a function makes sense at points. . 6. |L − f (y)| < ε. one can deﬁne what it means for limy→x f (x) = ±∞. 5. this shows L = L1 .

b ∈ R. |g (y) − K| < . and so for such y. then y→x lim h ◦ f (y) = h (L) .3) is left to you. there exists δ > 0 such that if 0 < |y − x| < δ.2)is to be veriﬁed.5) Then letting 0 < δ ≤ min (δ 1 . then |h (y) −h (L)| < ε Now since limy→x f (y) = L. (14. It is like a corresponding theorem for continuous functions.2). The proof of (14.5 Suppose limy→x f (y) = L and limy→x g (y) = K where K. then |L − b| ≤ r also. Then by the triangle inequality. (14. Consider (14. y→x lim f (y) g (y) = LK.3) Also. Therefore. Then if a. if 0 < |y − x| < δ. there exists η > 0 such that if |y − L| < η. LIMITS OF A FUNCTION 337 Theorem 14.4.1) is left for you. then |f (y) − L| < 1. if h is a continuous function deﬁned near L. then |f (y) −L| < η. |f · g (y) − L · K| ≤ |fg (y) − f (y) · K| + |f (y) · K − L · K| ≤ |f (y)| |g (y) − K| + |K| |f (y) − L| . Let ε > 0 be given. Since h is continuous near L.4.2) and if g is scalar valued with limy→x g (y) = K = 0. the triangle inequality implies.5) that |f · g (y) − L · K| < ε and this proves (14. L ∈ Rq .14. Now (14. 2 (1 + |K| + |L|) 2 (1 + |K| + |L|) (14. lim (af (y) + bg (y)) = aL + bK.4) Suppose limy→x f (y) = L. |f (y) − L| < ε ε . Proof: The proof of (14. |f (y)| < 1 + |L| . Now let 0 < δ 2 be such that if y ∈ D (f ) and 0 < |x − y| < δ 2 . If |f (y) − b| ≤ r for all y suﬃciently close to x. . δ 2 ) . Therefore. (14.1) y→x y→x lim f · g (y) = LK (14. There exists δ 1 such that if 0 < |y − x| < δ 1 and y ∈ D (f ) . for 0 < |y − x| < δ 1 . it follows that for ε > 0 given. |h (f (y)) −h (L)| < ε. |f · g (y) − L · K| ≤ (1 + |K| + |L|) [|g (y) − K| + |f (y) − L|] . it follows from (14.4).

The following theorem is important. Then letting ε > 0 be given. it follows |f (x) − f (y)| < ε. .7) where f (y) ≡ (f1 (y) . Then if 0 < |y − x| < δ.7 Suppose f : D (f ) → Rq .4. If this is not true. there exists δ k such that if 0 < |y − x| < δ k . Then letting ε > 0 be given there exists δ > 0 such that if 0 < |y − x| < δ. · · ·. Then for every ε > 0 there exists δ > 0 such that if |y − x| < δ and y ∈ D (f ) . f is continuous at x if and only if lim f (y) = f (x) . y→x Proof: First suppose f is continuous at x a limit point of D (f ) .7) holds. by the triangle inequality. Assume |f (y) − b| ≤ r. Theorem 14. In particular. Since L is the limit of f . This means that if ε > 0 there exists δ > 0 such that for 0 < |x − y| < δ and y ∈ D (f ) . then ε |fk (y) − Lk | < √ . However. fp (y)) and L ≡ (L1 . then |L − b| > r. Next suppose x is a limit point of D (f ) and limy→x f (y) = f (x) .6). Now suppose (14. then |f (y) − f (x)| = |f (x) − f (x)| = 0 and so whenever y ∈ D (f ) and |x − y| < δ. Theorem 14. it follows |f (y) − f (x)| < ε. this holds if 0 < |x − y| < δ and this is just the deﬁnition of the limit. |L − b| − r) whenever y ∈ D (f ) is close enough to x. Lp ) .6 For f : D (f ) → Rq and x ∈ D (f ) a limit point of D (f ) . Then for x a limit point of D (f ) . a contradiction to the assumption that |b − f (y)| ≤ r. · · ·. It is required to show that |L − b| ≤ r. Hence f (x) = limy→x f (y) . p Let 0 < δ < min (δ 1 .7). if y = x. |f (y) − L| < |L − b| − r and so r < |L − b| − |f (y) − L| ≤ ||b − L| − |f (y) − L|| ≤ |b − f (y)| . Consider B (L. it follows |fk (y) − Lk | ≤ |f (y) − L| < ε which veriﬁes (14. Proof: Suppose (14. it follows f (y) ∈ B (L. it follows p 1/2 |f (y) − L| = k=1 p |fk (y) − Lk | ε2 p 1/2 2 < k=1 = ε. δ p ) .6) if and only if y→x lim fk (y) = Lk (14. Thus.338 FUNCTIONS It only remains to verify the last assertion. then |f (x) − f (y)| < ε. |L − b| − r) . y→x lim f (y) = L (14. · · ·. showing f is continuous at x.4.

14. To see this. (d) lim(x. it follows there can be no limit.y)→(0. = 6 and lim(x. why must x be a limit point of D (f )? Hint: If x were not a limit point of D (f ). 0) there are points where the function equals 1/2 and points where the function has the value 0. x2 −9 x−3 x2 −9 x−3 . t) = (x − 1. 0)} .0) (x2 −y4 ) 2 (x2 +y 4 )2 Hint: Consider along y = 0 and along x = y 2 . y − 2) . take points on the line y = 0.1) equals (6. In fact. Hint: It might help to −y 2 write this in terms of the variables (s. Note it is necessary to rely on the deﬁnition of the limit much more than in the case of a function of one variable and it is the case there are no easy ways to do limit problems for functions of more than one variable.3 is a limit and that so is 22 and 7 and 11. Example 14.4. y . Now consider points on the line y = x where the value of the function equals 1/2.y)→(0.y)→(0.0) (c) lim(x. the value of the function equals 0. one can consider the limit of a sequence of points in Rp .0) x sin 2 1 x2 +y 2 3 2 2 2 3 −18y +6x −13x−20−xy −x (e) lim(x. Therefore. Therefore. this is the case here. the limit may not exist.1) y = 1. δ) contains no points of D (f ) other than possibly x itself. This theorem shows it suﬃces to consider the components of a vector valued function when computing the limit. In other words the concept is totally worthless. 0) is a limit point of the domain of the function so it might make sense to take a limit.y)→(3.4. just as in the case of a function of one variable. Example 14.8 Find lim(x. It is what it is and you will not deal with these concepts without agony. 14.2) −2yx +8yx+34y+3y+4y−5−x2 +2x . However.y)→(3. .5 Exercises x2 −y 2 x2 +y 2 x(x2 −y 2 ) (x2 +y 2 ) 1.1) It is clear that lim(x.9 Find lim(x. every point in R2 except the origin. In the deﬁnition of limit.y)→(0. Since arbitrarily close to (0.5. 1) .6 The Limit Of A Sequence As in the case of real numbers. 14.y)→(3. Just take ε = 1/10 for example. EXERCISES 339 This proves the theorem.0) First of all observe the domain of the function is R2 \ {(0. At these points. this limit xy x2 +y 2 . Find the following limits if possible (a) lim(x.y)→(0. You can’t be within 1/10 of 1/2 and also within 1/10 of 0 at the same time.0) (b) lim(x. show there exists δ > 0 such that B (x. Argue that 33.y)→(1. 2. (0.

ε. an ∈ Rp .6. 3n2 +5n . |an −a| < ε. n 1/2 n |an − a| = k=1 |an k − ak | 2 < k=1 ε2 p 1/2 = ε.8). p. √ |an − ak | < ε/ p. (14.340 Deﬁnition 14. k Therefore. showing that limn→∞ an = a. a contradiction. Therefore.6. Theorem 14. Proof: Suppose a1 = a.8) k n→∞ Proof: First suppose limn→∞ an = a. Theorem 14. n n +3 sin (n) . and write n→∞ ∞ FUNCTIONS lim an = a or an → a if and only if for every ε > 0 there exists nε such that whenever n ≥ nε . np ) . · · ·. This proves the theorem. Example 14. 2 It suﬃces to consider the limits of the components according to the following theorem. 0. As in the case of a vector valued function. Then limn→∞ an = a ≡ (a1 . it suﬃces to consider the components.6. There is absolutely no diﬀerence between this and the deﬁnition for sequences of numbers other than here bold face is used to indicate an and a are points in Rp . for such n.4 Let an = 1 1 n2 +1 . Then let 0 < ε < |a1 −a| /2 in the deﬁnition of the limit. |a1 −a| ≤ |a1 −an | + |an −a| < ε + ε < |a1 −a| /2 + |a1 −a| /2 = |a1 −a| . · · ·. Then letting ε > 0 be given there exist nk such that if n > nk . In words the deﬁnition says that given any measure of closeness. the terms of the sequence are eventually all this close to a.3 Let an = an . · · ·. Thus the limit is (0. then |an − ak | ≤ |an − a| < ε k which establishes (14.6. 1/3) .2 If limn→∞ an = a and limn→∞ an = a1 then a1 = a.1 A sequence {an }n=1 converges to a. Now suppose (14. · · ·. This is the content of the next theorem. lim an = ak . . It follows there exists nε such that if n ≥ nε . Then given ε > 0 there exists nε such that if n > nε . then |an −a| < ε and |an −a1 | < ε. it follows that for n > nε . letting nε > max (n1 . ap ) if and p 1 only if for each k = 1.8) holds for each k.

Then n→∞ lim xan + ybn = xa + yb n→∞ (14.10) is entirely similar and is left for you. |an · bn −a · b| ≤ (|a| + 1) |bn −b| + |b| |an −a| ε ε < (|a| + 1) + |b| ≤ ε. Then for such n.5 Suppose {an } and {bn } are sequences and that n→∞ 341 lim an = a and lim bn = b. n get large. the triangle inequality and Cauchy Schwarz inequality imply |an · bn −a · b| ≤ |an · bn −an · b| + |an · b − a · b| ≤ |an | |bn −b| + |b| |an −a| ≤ (|a| + 1) |bn −b| + |b| |an −a| .6.6 {an } is a Cauchy sequence if for all ε > 0. 2 (|a| + 1) 2 (|b| + 1) This proves (14.6. 14.9) (14.14. let nε > max (n1 . THE LIMIT OF A SEQUENCE Theorem 14. To do the second. there exists nε such that whenever n. let ε > 0 be given and choose n1 such that if n ≥ n1 then |an −a| < 1. A sequence is Cauchy means the terms are “bunching up to each other” as m. Theorem 14. m ≥ nε . Therefore.6. n2 ) . |an −am | < ε.7 Let {an }n=1 be a Cauchy sequence in Rp . Deﬁnition 14. 2 (|a| + 1) 2 (|b| + 1) Such a number exists because of the deﬁnition of limit. and |an −a| < . then Proof: The ﬁrst of these claims is left for you to do.6. |bn −b| < ε ε . For n ≥ nε . ∞ . Now let n2 be large enough that for n ≥ n2 . If bn ∈ R.10) lim an · bn = a · b an bn → ab.1 Sequences And Completeness Recall the deﬁnition of a Cauchy sequence. The proof of (14.6. Then there exists a ∈ Rp such that an → a. n→∞ Also suppose x and y are real numbers.9).

.2 Continuity And The Limit Of A Sequence Just as in the case of a function of one variable.9 If a sequence {an } in Rp converges.6.6.10 A function f : D (f ) → Rq is continuous at x ∈ D (f ) if and only if. it says a function is continuous if it takes convergent sequences to convergent sequences whenever possible. · · ·. n ≥ nε + 1. Proof: Let ε > 0 be given and suppose an → a. |an −an1 | < 1. since ε > 0 is arbitrary. · · ·. then the sequence is a Cauchy sequence. Therefore. By completeness k of R. |an | < 1 + |an1 | . |an | < M for some M < ∞. Then from the deﬁnition of convergence. Theorem 14. {an } is a Cauchy sequence.342 Proof: Let an = an . it follows that |an −a| < Therefore.6. an .3 that lim an = a.6. Proof: Let ε = 1 in the deﬁnition of a Cauchy sequence and let n > n1 . Theorem 14. In words. Theorem 14. for all n. Then p 1 |an − am | ≤ |an − am | k k FUNCTIONS which shows for each k = 1. if m. Then from the deﬁnition.6. p. n1 |an | ≤ 1 + |an1 | + k=1 |ak | . it follows f (xn ) → f (x) . This proves the theorem. it follows k from Theorem 14. It follows that for all n > n1 . it follows there exists ak such that limn→∞ an = ak . it follows that |an −am | ≤ |an −a| + |a − am | < ε ε + =ε 2 2 ε 2 showing that. there is a very useful way of thinking of continuity in terms of limits of sequences found in the following theorem. whenever xn → x with xn ∈ D (f ) . it follows {an }n=1 is a Cauchy sequence. 14. · · ·. Letting a = (a1 . n→∞ ∞ This proves the theorem. ap ) .8 The set of terms in a Cauchy sequence in Rp is bounded in the sense that for all n. there exists nε such that if n > nε .

14. This means there exist. Then there exists ε > 0 and xn ∈ D (f ) 1 such that |x − xn | < n . Show every Lipschitz function is uniformly continuous which means that given ε > 0 there exists δ > 0 independent of x such that if |x − y| < δ. But this is clearly a contradiction because. then |f (x) − f (y)| < ε. This proves the theorem. Let ε > 0 be given. g (x) = 0. If and f and g are each real valued functions continuous at x.7 Properties Of Continuous Functions Functions of p variables have many of the same properties as functions of one variable. y ∈ D. b ∈ R. then f g is continuous at x.then g ◦ f is continuous at x. x2 ∈ C such that for all x ∈ C. By continuity. f (x1 ) ≤ f (x) ≤ f (x2 ) . af + bg is continuous at x when f . Theorem 14. Theorem 14. First there is a version of the extreme value theorem generalizing the one dimensional case. although xn → x. Then f achieves its maximum and its minimum on C. If. fq ) : D (f ) → Rq . 14. f (xn ) fails to converge to f (x) . 4. 5. It follows f must be continuous after all. Now suppose the condition about taking convergent sequences to convergent sequences holds at x.7. These theorems are proved in the next section. and g is continuous at f (x) . Suppose f fails to be continuous at x. there exists δ > 0 such that if |y − x| < δ.2 The following assertions are valid 1. |f (x) −f (xn )| < ε which shows f (xn ) → f (x) . However. If f is continuous at x. 2.14. given by f (x) = |x| is continuous. Explain why. g are continuous at x ∈ D (f ) ∩ D (g) and a. K such that |f (x) − f (y)| ≤ K |x − y| for all x. does it follow that |f | is also uniformly continuous? If |f | is uniformly continuous does it follow that f is uniformly continuous? Answer the same questions with “uniformly continuous” replaced with “continuous”. then |f (x) − f (y)| < ε. If f is uniformly continuous.7. f : D ⊆ Rp → Rq is Lipschitz continuous or just Lipschitz for short if there exists a constant. yet |f (x) −f (xn )| ≥ ε. then f /g is continuous at x. in addition to this. then |xn −x| < δ and so for all n this large. PROPERTIES OF CONTINUOUS FUNCTIONS 343 Proof: Suppose ﬁrst that f is continuous at x and let xn → x. . The function.7. The function f : Rp → R. then f is continuous if and only if each fk is a continuous real valued function.1 Let C be closed and bounded and let f : C → R be continuous. x1 . · · ·. f (x) ∈ D (g) ⊆ Rp .8 Exercises 1. there exists nδ such that if n ≥ nδ . There is also the long technical theorem about sums and products of continuous functions. 2. 3. If f = (f1 .

it follows |f (x) − f (y)| < 2(|a|+|b|+1) and there exists δ 2 > 0 ε such that whenever |x − y| < δ 2 .then g ◦ f is continuous at x. δ 2 ) . Then if |x − y| < δ. af + bg is continuous at x when f . all the above hold at once and |f g (x) − f g (y)| ≤ . By assumption. there exist δ 1 > 0 such ε that whenever |x − y| < δ 1 . fq ) : D (f ) → Rq . 4. f (x) ∈ D (g) ⊆ Rp .) Let ε > 0 be given. · · ·. |f (y)| < 1 + |f (x)| . Therefore.344 FUNCTIONS 14. If |x − y| < δ. then f /g is continuous at x. 2 (1 + |g (x)| + |f (y)|) and there exists δ 3 such that if |x − y| < δ 3 . Then let 0 < δ ≤ min (δ 1 . The function. Now let ε > 0 be given.1 The following assertions are valid 1. in addition to this. 2. it follows that |g (x) − g (y)| < 2(|a|+|b|+1) . 5. b ∈ R. δ 2 . If f = (f1 . and g is continuous at f (x) .9 Some Advanced Calculus This section contains the proofs of the theorems which were just stated without proof. g are continuous at x ∈ D (f ) ∩ D (g) and a. δ 3 ) . Proof: Begin with 1. given by f (x) = |x| is continuous. |f g (x) − f g (y)| ≤ |f (x) g (x) − g (x) f (y)| + |g (x) f (y) − f (y) g (y)| ≤ |g (x)| |f (x) − f (y)| + |f (y)| |g (x) − g (y)| ≤ (1 + |g (x)| + |f (y)|) [|g (x) − g (y)| + |f (x) − f (y)|] . then |g (x) − g (y)| < ε . Now begin on 2. Theorem 14. then |f (x) − f (y)| < 1. If. If and f and g are each real valued functions continuous at x. 3. then |f (x) − f (y)| < ε 2 (1 + |g (x)| + |f (y)|) Now let 0 < δ ≤ min (δ 1 . If f is continuous at x. The function f : Rp → R. Therefore. for such y. then f g is continuous at x. using the triangle inequality |af (x) + bf (x) − (ag (y) + bg (y))| ≤ |a| |f (x) − f (y)| + |b| |g (x) − g (y)| < |a| ε 2 (|a| + |b| + 1) + |b| ε 2 (|a| + |b| + 1) < ε.9. g (x) = 0. then everything happens at once. There exists δ 2 such that if |x − y| < δ 2 . It follows that for such y.) There exists δ 1 > 0 such that if |y − x| < δ 1 . then f is continuous if and only if each fk is a continuous real valued function.

) To obtain the second part. Then if |x − y| < min (δ 0 . |g (x) − g (y)| < |g (x)| /2 and so by the triangle inequality. − |g (x)| /2 ≤ |g (y)| − |g (x)| ≤ |g (x)| /2 which implies |g (y)| ≥ |g (x)| /2. δ 1 ) . then |g (y) − g (x)| < ε −1 M . and |g (y)| < 3 |g (x)| /2. then |f (x) − f (y)| < and let δ 3 be such that if |x − y| < δ 3 . 345 This proves the ﬁrst part of 2. δ 1 . and |x − y| < δ. δ 2 . SOME ADVANCED CALCULUS (1 + |g (x)| + |f (y)|) [|g (x) − g (y)| + |f (x) − f (y)|] < (1 + |g (x)| + |f (y)|) ε ε + 2 (1 + |g (x)| + |f (y)|) 2 (1 + |g (x)| + |f (y)|) = ε.14. δ 3 ) .9. everything holds and f (x) f (y) − ≤ M [|f (x) − f (y)| + |g (y) − g (x)|] g (x) g (y) . 2 ε −1 M 2 Then if 0 < δ ≤ min (δ 0 . f (x) f (y) f (x) g (y) − f (y) g (x) − = g (x) g (y) g (x) g (y) |f (x) g (y) − f (y) g (x)| ≤ 2 |g(x)| 2 = 2 |f (x) g (y) − f (y) g (x)| |g (x)| 2 ≤ ≤ ≤ ≤ 2 |g (x)| 2 |g (x)| 2 |g (x)| 2 2 [|f (x) g (y) − f (y) g (y) + f (y) g (y) − f (y) g (x)|] [|g (y)| |f (x) − f (y)| + |f (y)| |g (y) − g (x)|] 3 |g (x)| |f (x) − f (y)| + (1 + |f (x)|) |g (y) − g (x)| 2 2 2 2 (1 + 2 |f (x)| + 2 |g (x)|) [|f (x) − f (y)| + |g (y) − g (x)|] |g (x)| ≡ M [|f (x) − f (y)| + |g (y) − g (x)|] where M≡ 2 |g (x)| 2 (1 + 2 |f (x)| + 2 |g (x)|) Now let δ 2 be such that if |x − y| < δ 2 . let δ 1 be as described above and let δ 0 > 0 be such that for |x − y| < δ 0 .

q.11) Suppose ﬁrst that f is continuous at x.) Note that in these proofs no eﬀort is made to ﬁnd some sort of “best” δ. Here is a multidimensional version of the nested interval lemma. Now begin on 3. · · ·. Now suppose each function. This shows the only if part. c ∈ R which is an element of every Ik . bk ≡ x ∈ Rp : xi ∈ ak . · · ·.). there exists δ k > 0 such that whenever |x − y| < δ k |fk (x) − fk (y)| < ε/q. 2 2 This completes the proof of the second part of 2. Then if |x − y| < δ. then |f (z) − f (x)| < η.) and completes the proof of the theorem. (14. Then if |x − z| < δ and z ∈ D (g ◦ f ) ⊆ D (f ) .) says: If f = (f1 . and g is continuous at f (x) . 2.) Part 4.11) implies q |f (x) − f (y)| ≤ i=1 q |fi (x) − fi (y)| ε = ε. The ﬁrst part of the above inequality then shows that for each k = 1. · · ·. Then there exists η > 0 such that if |y − f (x)| < η and y ∈ D (g) . Let ε > 0 be given. Either is it or it is not continuous. Then if ε > 0 is given. · · ·.) To verify part 5. all the above hold and so |g (f (z)) − g (f (x))| < ε. δ q ) . bk i i i i Ik ⊇ Ik+1 . It follows from continuity of f at x that there exists δ > 0 such that if |x − z| < δ and z ∈ D (f ) .2 Let Ik = k = 1. Then |fk (x) − fk (y)| ≤ |f (x) − f (y)| q 1/2 ≡ i=1 q |fi (x) − fi (y)| ≤ i=1 2 |fi (x) − fi (y)| . the above inequality holds for all k and so the last part of (14.346 <M FUNCTIONS ε −1 ε −1 M + M = ε.then g ◦ f is continuous at x. fq ) : D (f ) → Rq . then f is continuous if and only if each fk is a continuous real valued function. p i=1 ak . Now let 0 < δ ≤ min (δ 1 . let ε > 0 be given and let δ = ε.9. p . If f is continuous at x. This proves part 3. This proves part 5. fk is continuous. The problem is one which has a yes or a no answer. then |f (x) − f (y)| < ε. the triangle inequality implies |f (x) − f (y)| = ||x| − |y|| ≤ |x − y| < δ = ε. it follows that |g (y) − g (f (x))| < ε. Lemma 14.). Then there exists δ > 0 such that if |x − y| < δ. f (x) ∈ D (g) ⊆ Rp . |fk (x) − fk (y)| < ε. and suppose that for all Then there exists a point. For |x − y| < δ. q < i=1 This proves part 4.

Deﬁnition 14. bi . c ∈ Ik for all k. al : l = k.9. for each i and each k. bk ⊇ ak+1 . . Proof: Let {an } ⊆ C.9. The following deﬁnition is similar to that given earlier. This proves the lemma.13) shows that is an upper bound for the set.7. · · · i ci = sup al : l = k. If you don’t like the proof. picking any k. Lemma 5. bi ] .13) By the ﬁrst inequality in (14. Then C is sequentially compact. bk+1 and so by the nested interval i i i i theorem for one dimensional problems. ak . · · · and so it is at least as large as the least upper bound of this i set which is the deﬁnition of ci given in (14. Only half of this result will be needed in this book and this is proved next. if x and y are two points in one of these sets.4 A set. · · ·.5 Let C ⊆ Rp be closed and bounded. k + 1. letting D0 = p i=1 p p ∞ p |bi − ai | p 2 1/2 . 1/2 2 |x − y| = i=1 |xi − yi | p −1 i=1 1/2 ≤2 |bi − ai | 2 ≡ 2−1 D0 . · · ·.14).14) (14. · · ·. SOME ADVANCED CALCULUS 347 Proof: Since Ik ⊇ Ik+1 . bk ≡ x ∈ Rp : xi ∈ ak . · · ·. i i Deﬁning c ≡ (c1 . here is another one which follows directly from the one variable nested interval lemma. let C ⊆ i=1 [ai . there exists a point ci ∈ ak . Theorem 14. 2. {xnk }k=1 such that xnk → x. · · · i bk i (14. p.12). ak ≤ ak+1 . It deﬁnes what is meant by a sequentially compact set in Rp . i i i ci ≡ sup al : l = 1. Lemma 14. bk for all k. ak ≤ ci ≤ bk . bk+1 . di ] = ai +bi .9. Thus there are 2p of these sets 2 2 because there are two choices for the ith slot for i = 1. · · ·. and consider all sets of the form i=1 [ci . i i i i This implies that for each i. Therefore. (14. This proves the lemma. it follows that for each i = 1.7 on Page 95. ai +bi or [ci . Thus. Also. 2 · · · . bk i i i i Ik ⊇ Ik+1 . if k ≤ l. bk ⊇ ak+1 .9. k + 1. there exists a point. · · ·. for each k = 1. Now deﬁne al ≤ bl ≤ bk . di ] where [ci .(14.12) i i i i Consequently. p. 2. K ⊆ Rp is sequentially compact if and only if whenever {xn }n=1 ∞ is a sequence of points in K. cp ) . Therefore. di ] equals either ai . cp ) it follows c ∈ Ik for all k. c ∈ R which is an element of every Ik .3 Let Ik = k = 1. p i=1 ak . Proof: For each i = 1. and suppose that for all Then there exists a point. x ∈ K and a subsequence. It turns out the sequentially compact sets in Rp are exactly those which are closed and bounded.14. Then i i letting c ≡ (c1 . |xi − yi | ≤ 2−1 |bi − ai | . bk ≥ bk+1 . ak . p .

and a point. dp ) and c ≡ (c1 . Therefore. Ik such that Ik ⊇ Ik+1 and for any two points in Ik . let an2 ∈ I2 ∩ {an }n=1 with n2 > n1 . x1 . r) and so has no points of C in it. ∞ Having picked these two.9. p 1/2 FUNCTIONS D1 ≡ i=1 |di − ci | 2 ≤ 2−1 D0 Denote by {J1 . there exists a point. {xn } such that f (xn ) → M. x. Having picked this. Claim: c ∈ C. and Ik ∩ C contains an for inﬁnitely many values of n. f (x1 ) ≤ f (x) ≤ f (x2 ) .6. In other words. This means there exist. it follows |x − y| ≤ 2−k D0 . ∞ ∞ Now pick an1 ∈ I1 ∩ {an }n=1 . Since the union of these sets equals all of I0 . By the nested interval lemma. . This proves f achieves its maximum and also shows its maximum is less than ∞. But then since f is continuous at x. Let I1 ≡ Jk . Recall that a function is uniformly continuous if the following deﬁnition holds. Ik must be contained in B (c. Continue in this way obtaining sets.8 Let f :C → Rq be continuous where C is a closed and bounded set in Rp . Theorem 14. x ∈ C such that xnk → x. cp ) are two such points. Now do to I1 what was done to I0 to obtain I2 ⊆ I1 and for any two points. Since C is sequentially compact. Then since c ∈ Ik .9.7 Let f : D (f ) → Rq . Then f is uniformly continuous on C. Here is a proof of the extreme value theorem. · · ·. r) is contained completely in Rp \ C. k=1 Pick Jk such that an is contained in Jk ∩ C for inﬁnitely many values of n. let an3 ∈ I3 ∩ {an }n=1 with n3 > n2 and continue this way. c ∈ C as claimed. J2p } these sets determined above. Deﬁnition 14. there exists r > 0 such that / B (c. Let k be so large that D0 2−k < r. Then f is uniformly continuous if for every ε > 0 there exists δ > 0 such that whenever |x − y| < δ. · · ·. Proof: Let M = sup {f (x) : x ∈ C} . x. Let x2 = x.10 on Page 342 that f (x) = limk→∞ f (xnk ) = M. This proves the theorem.348 In particular.9. it follows |f (x) − f (y)| < ε. Proof of claim: Suppose c ∈ C. Then there exists a sequence. and I2 ∩ C contains an for inﬁnitely many values of n. · · ·. Then f achieves its maximum and its minimum on C. contrary to the manner in which the Ik are deﬁned in which Ik contains an for inﬁnitely many values of n. r) contains no points of C. y. The case of a minimum is handled similarly. Recall this means +∞ if f is not bounded above and it equals the least upper bound of these values of f if f is bounded above. Since C is a closed set. there exists a subsequence. y ∈ I2 |x − y| ≤ 2−1 D1 ≤ 2−2 D0 . c which is contained in each Ik .6 Let C be closed and bounded and let f : C → R be continuous. it follows from Theorem 14. x2 ∈ C such that for all x ∈ C. since d ≡ (d1 . and any two points of Ik are closer than D0 2−k . xnk . B (c. The ∞ result is a subsequence of {an }n=1 which converges to c ∈ C because any two points in Ik −k are within D0 2 of each other. Theorem 14. it follows p C = ∪2 Jk ∩ C.

10 on Page 342. This proves the theorem. xn and yn satisfying |xn − yn | < 1/n but |f (xn ) − f (yn )| ≥ ε.6. But |xnk − ynk | < 1/k and so ynk → x also. from Theorem 14. there exists ε > 0 and pairs of points. SOME ADVANCED CALCULUS 349 Proof: If this is not so. Therefore. k→∞ a contradiction. there exists x ∈ C and a subsequence. .9. ε ≤ lim |f (xnk ) − f (ynk )| = |f (x) − f (x)| = 0. {xnk } satisfying xnk → x. Since C is sequentially compact.14.

350 FUNCTIONS .

s→t+ lim f (s) = L if and only if for all ε > 0 there exists δ > 0 such that if 0 < s − t < δ. 351 . one can give a meaning to s→t+ The above discussion considered expressions like lim f (s) . s→t− lim f (s) = L if and only if for all ε > 0 there exists δ > 0 such that if 0 < t − s < δ. then |f (s) − L| < ε. t + r) .1 In the case where D (f ) is only assumed to satisfy D (f ) ⊇ (t. s→t− s→∞ and s−∞ lim f (s) . t) .1 Limits Of A Vector Valued Function f (t0 + h) − f (t0 ) h and determined what they get close to as h gets small. Deﬁnition 15.1. In other words it is desired to consider f (t0 + h) − f (t0 ) lim h→0 h Specializing to functions of one variable. lim f (s) .Limits And Derivatives 15. then |f (s) − L| < ε. lim f (s) . In the case where D (f ) is only assumed to satisfy D (f ) ⊇ (t − r.

|f (t) − L| < ε and t→−∞ (15.1 The derivative of a function. lim t→π/2 t→π/2 t→π/2 t2 + 1 .3 Let f (t) = Recall that limt→0 (1. lim ln (t) t→π/2 = 0. ln 4 2 sin t 2 t . then neither does f (t) .2 The Derivative And Integral The following deﬁnition is on the derivative and integral of a vector valued function of one variable. sin t t + 1 . sin t.1) holds. Use Theorem 14. limt→0 f (t) = 15. b]) . Find limt→π/2 f (t) .7 on Page 338. a b b f1 (t) dt. a a fp (t) dt . is deﬁned as the following limit whenever the limit exists. This is what is meant by saying f ∈ R ([a. Find limt→0 f (t) . (15. 0. then b f (t) dt is deﬁned as the vector.1. ln π π + 1 . lim sin t.2 Let f (t) = cos t.2.7 on Page 338 and the continuity of the functions to write this limit equals lim cos t. f (t) . If f (t) = (f1 (t) . .4.1) lim f (t) = L if for every ε > 0 there exists l such that whenever t < l. The only diﬀerence is that here |·| refers to the norm or length in Rp where maybe p > 1.t .1. Example 15. · · ·.4. = 1. Then from Theorem 14. lim f (t + h) − f (x) ≡ f (t) h h→0 The function of h on the left is called the diﬀerence quotient just as it was for a scalar b valued function.352 LIMITS AND DERIVATIVES One can also consider limits as a variable “approaches” inﬁnity. 1. Note that in all of this the deﬁnitions are identical to the case of scalar valued functions. t→∞ lim f (t) = L if for every ε > 0 there exists l such that whenever t > l.t 2 . t2 + 1. p. · · ·. Of course nothing is “close” to inﬁnity and so this requires a slightly diﬀerent deﬁnition. ln (t) . Deﬁnition 15. If the limit does not exist. · · ·. 1) . Example 15. fp (t)) and a fi (t) dt exists for each i = 1.

2. fp (t)) . As before. Theorem 15. fp (t)) = f (t) a and this proves the claim. f (t + h) − f (t) − f (t) < 1.3 If f ∈ R ([a. Proof: Suppose ε > 0 is given and choose δ 1 > 0 such that if |h| < δ 1 . there is a fundamental theorem of calculus. then |f (t + h) − f (t)| < ε. Find f (t) . This proves the theorem. THE DERIVATIVE AND INTEGRAL This is exactly like the deﬁnition for a scalar valued function. then d dt t f (s) ds a = f (t) . diﬀerentiability implies continuity but not the other way around. this shows that if |y − t| < δ. b) . bt) where a. 1+|fε (x)| it follows if |h| < δ. |f (y) − f (t)| < ε which proves f is continuous at t. The diﬀerence quotient. y−x As in the case of a scalar valued function. then f is continuous at t. From the above discussion this derivative is just the vector valued functions whose components consist of the derivatives of the components of f . b) . Find f (x) . Then it follows 1 h t+h f (s) ds − a t+h 1 h t f (s) ds = a 1 h t+h f1 (s) ds. t 1 h t+h fp (s) ds t 1 and limh→0 h t fi (s) ds = fi (t) for each i = 1. Now letting δ < min δ 1 . Letting y = h + t.2. f (x + h) − f (x) c−c = =0 h h Therefore.2. the triangle inequality implies |f (t + h) − f (t)| < |h| + |f (t)| |h| . f (x) = lim y→x 353 f (y) − f (x) . As in the scalar case.2.2. Example 15. · · ·. f (x + h) − f (x) = lim 0 = 0 h→0 h→0 h lim Example 15. h then for such h.15. · · ·. Therefore. · · ·. Theorem 15. p from the fundamental theorem of calculus for scalar valued functions. 1 h→0 h lim t+h f (s) ds − a 1 h t f (s) ds = (f1 (t) . b]) and if f is continuous at t ∈ (a. Thus f (t) = (a.5 Let f (t) = (at.4 Let f (x) = c where c is a constant.2 If f (t) exists. · · ·. b are constants. . Proof: Say f (t) = (f1 (t) .

t + h] divided by the length of time. rq (t)) . Therefore. ¨ ¨¨ ¨r¨ r ¨¨ ¨ r 2 222 r(t + h) ¨¨ ¨22222 ¨ B r(t) ¨¨ 22 I r 2 X r 22 ¨ E In this picture there are unit vectors in the direction of the vector from r (t) to r (t + h) .1 Geometric And Physical Signiﬁcance Of The Derivative Suppose r is a vector valued function of a parameter. Therefore. t not necessarily time and consider the following picture of the points traced out by r. the expression r (t0 + h) − r (t0 ) h (15. a unit tangent vector to the curve at the point r (t) . [t.2. the approximate distance travelled on the time interval. |r (t + h) − r (t)| Thus Th → T. this expression is really the average speed of the object on this small time interval and so the limit as h → 0. How do you go about computing r (t)? Letting r (t) = (r1 (t) . converge to a unit vector. r (t) ≡ = r (t + h) − r (t) |r (t + h) − r (t)| r (t + h) − r (t) = lim h→0 h→0 h h |r (t + h) − r (t)| |r (t + h) − r (t)| lim Th = |r (t)| T. T which is tangent to the curve at the point r (t) . T which deﬁnes the direction in which the object is moving. Thus |r (t)| T represents the speed times a unit direction vector. h→0 h lim In the case that t is time. The real distance would be the length of the curve joining the two points but if h is very small. this is essentially equal to |r (t + h) − r (t)| as suggested by the picture below. Now each of these unit vectors is of the form r (t + h) − r (t) ≡ Th . · · ·. Thus r (t) is the velocity of the object. r(t + h) r r(t) r |r (t + h) − r (t)| h gives for small h. h.2) Therefore.354 LIMITS AND DERIVATIVES 15. You can see that it is reasonable to suppose these unit vectors. the expression |r (t + h) − r (t)| is a good approximation for the distance traveled by the object on the time interval [t. deserves to be called the instantaneous speed of the object. t + h] . if they converge. This is the physical signiﬁcance of the derivative when t is time. .

t2 . Find a tangent line to the curve parameterized by r at the point r (2) . h h Then as h converges to 0.4.7 Let r (t) = sin t. 4. the cross product. · · ·. 15.3) (15. 355 where vk = rk (t) . r (t) and so it is possible to ﬁnd parametric equations for the line tangent to the curve at various points. 3) + t (cos 2. (15. Therefore.15.4) . (15.2. t + 1 for t ∈ [0.7 on Page 338. From the above discussion.2 Diﬀerentiation Rules There are rules which relate the derivative to the various operations done with vectors such as the dot product. z) . Theorem 15. v if and only if all the coordinate functions of the term in (15. 4. 5] .2. vq ) . Find the velocity vector when t = 1. 5] . 1) = (x. (15. Then the following formulas are obtained. 4. (f · g) (t) = f (t) · g (t) + f (t) · g (t) If f . From the above discussion. In the case where t is time. 2. T determines a direction vector which is tangent to the curve at the point. a parametric equation for the tangent line is (sin 2. which says that the term in (15.2. a direction vector has the same direction as r (2) .2.2) gets close to a vector.2) converges to v ≡ (v1 . (af + bg) (t) = af (t) + bg (t) . and (15. This by Theorem 14. t2 . r (t) . this is simply r (1) = (cos 1. b ∈ R and suppose f (t) and g (t) exist.8 Let a.2) get close to the corresponding coordinate functions of v. THE DERIVATIVE AND INTEGRAL is equal to r1 (t0 + h) − r1 (t0 ) rq (t0 + h) − rq (t0 ) . it suﬃces to simply use r (2) as a direction vector for the line. and vector addition and scalar multiplication. t + 1 for t ∈ [0.5) are referred to as the product rule. the vector. g have values in R3 . 1) .4). y.6 Let r (t) = sin t. Example 15.5) (15. Therefore. r (2) = (cos 2. then (f × g) (t) = f (t) × g (t) + f (t) × g (t) The formulas. Example 15.2. · · ·. 1) . this simply says the velocity vector equals the vector whose components are the derivatives of the components of the displacement vector. In any case.

ln (t + 1) .2. 1 From (15. 1+1 .5) is left as an exercise which follows from the product rule and the deﬁnition of the cross product in terms of components given on Page 323. sin t.356 LIMITS AND DERIVATIVES Proof: The ﬁrst formula is left for you to prove. Find the velocity of the object in kilometers per hour when t = 1.5) this equals(2t. (15. sometimes it is possible to give a simple description of the curve which results. . Find (r (t) × p (t)) . cos t. sin 2t. 2t) + t2 .12 Describe the curve which results from the vector valued function. the velocity is r (1) = 1 1 3. Example 15. . π 0 cos t dt = 1 3 3 π . sin t. cos t × 1. ln (t + 1) .2. Consider the second. That is. Therefore. t) where t ∈ R. 2t). you can consider the possibility of taking the derivative of the derivative and then the derivative of that and so forth. cos t and let p (t) = (t. ﬁnd r (t) and plug in t = 1 to ﬁnd the velocity.9 Let r (t) = t2 . f · g (t + h) − fg (t) f (t + h) · g (t + h) − f (t + h) · g (t) f (t + h) · g (t) − f (t) · g (t) = lim + h→0 h→0 h h h (g (t + h) − g (t)) (f (t + h) − f (t)) + · g (t) = lim f (t + h) · h→0 h h lim n = lim n h→0 fk (t + h) k=1 (gk (t + h) − gk (t)) + h n n k=1 (fk (t + h) − fk (t)) gk (t) h = k=1 fk (t) gk (t) + k=1 fk (t) gk (t) = f (t) · g (t) + f (t) · g (t) . Formula (15.4).2. Usually it is not possible to do this! Example 15. Example 15. r (t) = = When t = 1. 1 (1 + t) − t 1 2 t +2 . dt. 2 . √ 4 3 kilometers per hour. Thus r (t) denotes the second derivative. r (t) = (cos 2t. t2 + 2 kilometers where t is given in hours. sin t. − sin t) × (t. π 0 sin t dt.10 Let r (t) = t2 . . cos t Find This equals π 2 t 0 π 0 r (t) dt. 3t2 . 3t2 . 2. 2 2 (1 + t) 1 (1 + t) 2. this can be continued. The main thing to consider about this is the notation and it is exactly like it was in the case of a scalar valued function presented earlier.2. When you are given a vector valued function of one variable. −1/2 2t 1 (t2 + 2) t Obviously. 0 √ t Example 15. Recall the velocity at time t was r (t) .11 An object has position r (t) = t3 . t+1 .

z (t)) . LEIBNIZ’S NOTATION 357 The ﬁrst two components indicate that for r (t) = (x (t) . The sound waves spread out and you would expect your sound to be inaudible very far away. While it is doing so. notations holding. here is an interesting application. possibly far away from you? Then you might have the interesting phenomenon of someone far away hearing what you said quite clearly. Suppose you are in a large room and you make a sound. z (t) is moving at a steady rate in the positive direction.2.3 Leibniz’s Notation dy dt Leibniz’s notation also generalizes routinely.6) Now and a similar formula holds for P1 replaced with P0 . For example.6). Draw a picture to see this. Therefore. the curve which results is a cork skrew shaped thing called a helix. y (t) . 1 −1/2 ((r (t) − P1 ) · (r (t) − P1 )) 2 ((r (t) − P1 ) · r (t)) 2 (r (t) − P1 ) · (r (t)) .This implies the curve of intersection of the plane with the room is an ellipse having P0 and P1 as the foci. (P0 − r (t)) · (−r (t)) (P1 − r (t)) · (r (t)) = . Consider a plane which contains the two points and let r (t) denote a parameterization of the intersection of this plane with the walls of the room. = y (t) with other similar . Then the condition that the angle of reﬂection equals the angle of incidence reduces to saying the angle between P0 − r (t) and −r (t) equals the angle between P1 − r (t) and r (t) . the pair. But what if the room were shaped so that the sound is reﬂected oﬀ the wall toward a single point.13 Sound waves have the angle of incidence equal to the angle of reﬂection. Therefore. Example 15.15. |r (t) − P1 | 15. It is also a situation where the curve can be identiﬁed as something familiar.3. y (t)) traces out a circle. C. from (15. |P0 − r (t)| |r (t)| |P1 − r (t)| |r (t)| This reduces to (r (t) − P0 ) · (−r (t)) (r (t) − P1 ) · (r (t)) = |r (t) − P0 | |r (t) − P1 | (r (t) − P1 ) · (r (t)) d = |r (t) − P1 | |r (t) − P1 | dt (15. How should the room be designed? Suppose you are located at the point P0 and the point where your sound is to be reﬂected is P1 . As an application of the theorems for diﬀerentiating curves. d d (|r (t) − P1 |) + (|r (t) − P0 |) = 0 dt dt showing that |r (t) − P1 | + |r (t) − P0 | = C for some constant. This is because |r (t) − P1 | = (r (t) − P1 ) · (r (t) − P1 ) and so using the chain rule and product rule. (x (t) . d |r (t) − P1 | dt = = Therefore.

(b) Show that you can conclude that r (t) · r (t) = 0.4 Exercises (a) limx→0+ |x| x . Is the velocity of this object ever equal to zero? If so. Let r (t) = sin t. t + 1 for t ∈ [0. 9. Prove (15. Let r (t) = sin t.5) directly from the deﬁnition of the derivative without considering components. Find a tangent line to the curve parameterized by r at the point r (2) . 8. a + 1) for all a ∈ R. Let r (t) = t. x−2 2 √ √ 3. cos x x x |x| . the velocity is always perpendicular to the displacement. 5. 4] . Find the following limits if possible (b) limx→0+ (c) limx→4 (d) limx→∞ 2. cos t2 . 5] . 12. Let r (t) = 4 + (t − 1) . 11. x + 1) = ( 3 a. ln t2 + 1 . Hint: You might want to use the formula for the diﬀerence of two cubes. Find a tangent line to the curve parameterized by r at the point r (2) . 2t + 1 for t ∈ [0. t2 . Find the velocity when t = 3. x x2 −4 2 x+2 . 14. sin x/x. ﬁnd the value of t at which this occurs and the point in R3 at which the velocity is zero. x −4 . Find limx→2 + 7. x + 2x − 1. Let r (t) = sin t. tan 4x 5x x x2 sin x2 1+x2 . 7. t + 1 for t ∈ [0. 3 2 √ t2 + 1 (t − 1) . sec x. Let r (t) = sin 2t. e x2 −16 x+4 .358 LIMITS AND DERIVATIVES 15. Let r (t) = t. (t−1) t5 3 3 describe the position of an object in R as a function of t where t is measured in seconds and r (t) is measured in meters. . Find a tangent line to the curve parameterized by r at the point r (2) . Prove from the deﬁnition that limx→a ( 3 x. sin t2 . Suppose an object has position r (t) ∈ R3 where r is diﬀerentiable and suppose also that |r (t)| = c where c is a constant. 10. Prove (15. t2 . Find the velocity when t = 3. That is. (a) Show ﬁrst that this condition does not require r (t) to be a constant. Find the velocity when t = 3. x 1. 13. a3 − b3 = (a − b) a2 + ab + b2 . t + 1 for t ∈ [0. t + 1 for t ∈ [0. 1+x2 .5) from the formula (f × g)i = εijk fj gk . 5] . 5] .5) from the component description of the cross product. cos t2 for t ∈ [0. 5] . Prove (15. 6. Hint: You can do this either mathematically or by giving a physical example. 4. 5] . t2 .

Suppose r (t). .5 Newton’s Laws Of Motion Deﬁnition 15. He ﬁnished his life as Master of the Mint. Principia. NEWTON’S LAWS OF MOTION 359 15. The second law is the one of most interest. b a f (t) dt = 15. The next example involves the concept that if you know the force along with the initial velocity and initial position.7) where a is the acceleration and m is the mass of the object. the ﬁrst law being implied by the second. made major contributions to optics. If F (t) = f (t) for all t ∈ (a. Recall that n = n = 1 and 0 n n n n−1 = 1 = n. the thing applies the same force back. b) and F is continuous on [a. 0. and r (0) = (0. He showed that Kepler’s laws for the motion of the planets came from calculus and his laws of gravitation.” Of these laws. 1) meters/sec. A bezier curve in Rn is a vector valued function of the form n y (t) = k=0 n n−k k xk (1 − t) t k where here the n are the binomial coeﬃcients and xk are n + 1 points in Rn . Show k that y (0) = x0 . s (t) . c such that r (t) = c for all t ∈ (a. or. 1. Then the acceleration of the object is deﬁned to be r (t) . Example 15. However. The third law says roughly that if you apply a force to something. in which many of his ideas are found. 1 − t2 . Find r (t) . b) .2 Let r (t) denote the position of an object of mass 2 kilogram at time t and suppose the force acting on the object is given by F (t) = t. If r (t) = 0 for all t ∈ (a. b] . and directed to contrary parts. 2e−t . and p (t) are three diﬀerentiable functions of t which have values in R3 . Newton was also very interested in theology and had strong views on the nature of God which were based on his study of the Bible and early Christian writings. 1) meters. Newton’s third law states: “To every action there is always opposed an equal reaction. and stated the fundamental laws of mechanics listed here. 1 Isaac Newton 1642-1727 is often credited with inventing calculus although this is not correct since most of the ideas were in existence earlier. 18.1 Let r (t) denote the position of an object. he made major contributions to the subject partly in order to study physics and astronomy. and ﬁnd y (0) and y (1) .” Newton’s second law is: F = ma (15. 16. only the second two are independent of each other. In 1686 he published an important book. He invented a version of the binomial theorem when he was only 23 years old and built a reﬂecting telescope. show there exists a constant vector. Suppose r (0) = (1.5. Curves of this sort are important in various computer programs.5. Newton’s1 ﬁrst law is: “Every body persists in its state of rest or of uniform motion in a straight line unless it is compelled to change that state by forces impressed on it. 17. show F (b) − F (a) . He formulated the laws of gravity. b). the mutual actions of two bodies upon each other are always equal. Find a formula for (r (t) × s (t) · p (t)) . y (1) = xn . then you can determine the position. Newton used calculus and these laws to solve profound problems involving the motion of the planets and other problems in mechanics. Note that the statement of this law depends on the concept of the derivative because the acceleration is deﬁned as a derivative.5.15.

C3 ) 12 2 where the constant vector. the displacement has also been found. 2r (t) = F (t) = t. Rather. Thus r (0) = (1. −e−t 4 2 +c where c is a constant vector which must be determined from the initial condition given for the velocity. C2 . 4 2 Now from this. 1) = (0. Denoting r (t) as (x (t) . C2 . C3 ) which means C1 = 1. the velocity is found. and C3 = 0. 1) + (C1 . 1. 0. How far does the shell ﬂy before hitting the ground? Neglect air resistance in this problem. r (t) = t3 t2 /2 − t4 /12 + 1. and c3 = 2. (C1 . −e−t + 2 . 1) = (0. it comes as measurements taken by instruments and the position is continuously being updated based on this information. . . mr (t) = −mgj. Thus the initial velocity is the vector. 1 − t2 /2. 0.5. e−t + 2t 12 2 meters. Thus letting c = (c1 . 400 cos (π/6) i + 400 sin (π/6) j while the only force experienced by the shell after leaving the artillery piece is the force of gravity. in applications of this sort of thing acceleration does not usually come to you as a nice given function written in terms of simple functions you understand. + 1.3 An artillery piece is ﬁred at ground level on a level plain. C2 . C3 ) must be determined from the initial condition for the displacement. 0) . Actually. r (0) = 400 cos (π/6) i + 400 sin (π/6) j.360 LIMITS AND DERIVATIVES By Newton’s second law. −1) + (c1 . r (0) = (0. + t. r (t) = t2 t − t3 /3 . The acceleration of gravity equals 9. (0. c3 ) which requires c1 = 0. x (t) = 0. 2e−t and so r (t) = t/2. c2 . c3 ) . y (t)) . y (t) = −g. c2 = 1. c2 . e−t . −mgj where m is the mass of the shell. Also let the direction of ﬂight be along the positive x axis. 1 − t2 . C2 = 0. the displacement must equal r (t) = t3 t2 /2 − t4 /12 . Therefore. Another situation which often occurs is the case when the forces on the object depend not just on time but also on the position or velocity of the object. 0. e−t + 2t + (C1 .8 meters per sec2 and so the following needs to be solved. + t. Therefore the velocity is given by r (t) = t2 t − t3 /3 . The angle of elevation is π/6 radians and the speed of the shell is 400 meters per second. Example 15. Therefore.

The jet is ﬂying at 600 miles per hour. Example 15. 30000) feet and the initial velocity is r (0) = (880.1) (r1 (t) . The position of the lump of blue ice will be computed from a point on the ground directly beneath the airplane at the instant the blue ice escapes and regard the airplane as moving in the direction of the positive x axis. Also ﬁnd the velocity when the lump hits the ground. The ﬁrst thing needed is to obtain information which involves consistent units. C = 400 sin (π/6) = 200. 816 326 530 6 seconds and so at this time. √ x = 200 3 (40. 0) . The force of gravity is (0. Find the position and velocity of the blue ice as a function of time measured in seconds. Thus the initial displacement is r (0) = (0. The shell hits the ground when y = 0 and this occurs when −4. This blue ice weighs 64 pounds near the earth and experiences a force of air resistance equal to (−. Thus it ﬂies at 600 × 5280 = 880 feet per second.9t2 + 200t + D.5. Thus t = 40. Similarly.1) r (t) pounds. 60 × 60 The explanation for this is that there are 5280 feet in a mile and so it goes 600×5280 feet in one hour. The next example is more complicated because it also takes in to account air resistance. −64) . 816 326 530 6) = 14139. −64) pounds and the force due to air resistance is (−.1) r (t) pounds. x (t) = 400 cos (π/6) so x (t) = 400 cos (π/6) t = 200 3t. Thus y (t) = −4. 190 265 9 meters. There are 60 × 60 seconds in an hour. 64 = m × 32. r2 (t)) . NEWTON’S LAWS OF MOTION 361 Therefore. Such lumps have been known to surprise people on the ground. The slug is the unit of mass in the system involving feet and pounds. r2 (0)) = (0. I want to change this to feet per second. r2 (0)) = (880. (r1 (0) . Thus m = 2 slugs. 2 (r1 (t) .5. r2 (t)) = (−.000 feet. The blue ice weighs 32 pounds near the earth. Newtons second law yields the following initial value problem for r (t) = (r1 (t) . Thus 32 pounds is the force exerted by gravity on the lump and so its mass must be given by Newton’s second law as follows.9t2 + 200t = 0.4 A lump of “blue ice” escapes the lavatory of a jet ﬂying at 600 miles per hour at an altitude of 30.15. (r1 (0) . 0) feet/sec. 30000) . D = 0 because the artillery piece is ﬁred at ground level which requires both x and √ to y equal zero at this time. r2 (t)) + (0. y (t) = −gt+C and from the information on the initial velocity.

1) r1 (t) = 0 2r2 (t) + (.8) mr + kr = c. You should check this gives d e(k/m)t r dt Therefore.8) to ﬁnd r (t) = (c/k) t − r1 (t) = − 1 2 exp (. (15.0 and r2 (t) = (−64/ (. r2 (t) = −640. r (0) = v. r1 (t) = 880.0t − 12800.1) = −640. e(k/m)t r = = (c/m) e(k/m)t 1 kt em c + C k and using the initial condition.1)) t − (.0 exp (−.0 exp (−.0 5t) + 42800. What of the velocity? Using (15.0 exp (−.1) .0 + 640.9) = −17600. Now multiply both sides by e(k/m)t . Thus m c u=− v− +D k k r (t) = (c/k) t − and so 1 −kt c m c me m v − + u+ v− k k k k Now apply this to the system (15.1) + 30000 + 2 (. v = c/k + C and so r (t) = (c/k) + (v − (c/k)) e− m t Now this implies 1 −kt c me m v − +D k k where D is a constant to be determined from the initial conditions.1) 2 64 (.0 5t) .0 exp (−.10) .1) 1 2 exp − t (.1) r2 (t) = −64 r1 (0) = 0 r2 (0) = 30000 r1 (0) = 880 r2 (0) = 0 To save on repetition solve LIMITS AND DERIVATIVES (15. Divide the diﬀerential equation by m and get r + (k/m) r = c/m.0 This gives the coordinates of the position. 2r1 (t) + (.0 5t) + 17600.1) t 2 (880) + 2 (880) (. r (0) = u.1) 64 (.362 Therefore.0 5t) . k (15.9) in the same way to obtain the velocity.1) − (.

0. as is often done in beginning physics courses that everything takes place in a vacuum. However.588 feet.14)) + 42800.0 + 640. This is not really true because it actually depends on the distance between the center of mass of the earth and the center of mass of the lump. W (0) = 0.0 exp (−. 2 2 . You see that air resistance can be very important so it is not enough to pretend.0 (66. F (t) = mv (t) .11) implies dW = F · v. −640. When a force is exerted on an object which causes the object to move. 15. dW m d 2 = mv (t) · v (t) = |v (t)| .0 = 1.10) to determine the velocity. Thus it suﬃces to solve the equation. This is an over simpliﬁcation when high speeds are involved.14.11) If no work has been done at time t = 0. Notice how because of air resistance the component of velocity in the horizontal direction is only about 32 feet per second even though this component started out at 880 feet per second while the component in the vertical direction is -616 feet per second even though this component started oﬀ at 0 feet per second. (t. This is a fairly hard equation to solve using the methods of algebra. Actually. It was assumed the force acting on the lump of blue ice by gravity was constant. and letting v be the velocity.0 5t) + 42800. dt Hence. It hits ground when r2 (t) = 0. this problem used several physical simpliﬁcations. However if plugging in various values of t using a calculator you eventually ﬁnd that when t = 66.1 Kinetic Energy Newton’s second law is also the basis for the notion of kinetic energy.0 5 (66.0t − 12800.then (15. In fact. 23. it is necessary to ﬁnd the value of t when this event takes place and then to use (15.14) − 12800. This is close enough to hitting the ground and so plugging in this value for t yields the approximate velocity. dt 2 dt Therefore. (15. −616. (880. How is the total work done on the object by the force related to the ﬁnal velocity of the object? By Newton’s second law. it follows that the force is doing work which manifests itself in a change of velocity of the object. the work done on the object would be approximately equal to dW = F (t) · v (t) dt. 0 = −640. NEWTON’S LAWS OF MOTION 363 To determine the velocity when the blue ice hits the ground.5. t + dt) .5.14)) .0 exp (−.15. It was also assumed the air resistance is proportional to the velocity. the total work done up to time t would be W (t) = m |v (t)| − m |v0 | where |v0 | 2 2 denotes the initial speed of the object. 56) . This diﬀerence represents the change in the kinetic energy. increasingly correct models can be studied in a systematic way as above. Now in a small increment of time.0 5 (66. I do not have a good way to ﬁnd this value of t using algebra.0 exp (−. −640.0 5 (66.0 exp (−.14))) = (32.

b b F (t) dt = a a m dv dt = mv (b) − mv (a) dt (15. a This is deﬁned as b b b F1 (t) dt. The notion of impulse and momentum are related in the following theorem. The linear momentum of an object of mass m and velocity v is deﬁned as Linear momentum = mv.5. Theorem 15. In other words. b] and denoting the two masses by mA and mB and their velocities by vA and vB it follows that b mA vA (b) − mA vA (a) = a (Force of B on A) dt and b mB vB (b) − mB vB (a) = a (Force of A on B) dt b = − a (Force of B on A) dt = − (mA vA (b) − mA vA (a)) and this shows mB vB (b) + mA vA (b) = mB vB (a) + mA vA (a) . Impulse involves a force which acts on an object for an interval of time. This is known as the conservation of linear momentum. a a F2 (t) dt. . [a. [a. b] . Newton’s third law says the force exerted by mass A on mass B is equal in magnitude but opposite in direction to the force exerted by mass B on mass A.364 LIMITS AND DERIVATIVES 15. Then the impulse equals the change in momentum.6 Let F be a force acting on an object of mass m. a F3 (t) dt . a Proof: This is really just the fundamental theorem of calculus and Newton’s second law applied to the components of F.12) Now suppose two point masses.2 Impulse And Momentum Work and energy involve a force acting on an object for some distance. Letting the collision take place in the time interval. A and B collide.5 Let F be a force which acts on an object during the time interval. More precisely. in a collision between two masses the total linear momentum before the collision equals the total linear momentum after the collision.5.5. Deﬁnition 15. The impulse of this force is b F (t) dt. b F (t) dt = mv (b) − mv (a) .

Show that if v (t) = 0. 11. If the angle of elevation equals π/4. θ from ground level on a vast plain. Find the position of the object as a function of t. Suppose that the air resistance is proportional to the velocity but it is desired to ﬁnd the constant of proportionality. only its direction. Suppose the force acting on an object. Find a formula for the displacement. Therefore. use a calculator or other means to estimate the time before the cannon ball hits the ground. then there exists a constant vector. 1) meters at time t = 0 and that its initial velocity at this time is v = i + j − k meters per sec. 8. ﬁnd a formula for how far the cannon ball goes before hitting the ground. z independent of t such that v (t) = z for all t. possibly from gravity. Show the Kinetic energy of the object is constant. Does there exist a terminal velocity? What is it? 2. Suppose also that the object is at the point (0. Also. why would dW = F (t) · v (t) dt? 2 6. z. Show the maximum range for the cannon ball is achieved when θ = π/4.6. what can be said about z and r? . Show using Corollary 6. v (0) = v0 is v (t) = v0 − c e−rt + (c/r) . f .15. argue from Newton’s second law that this is the equation for ﬁnding the velocity. A cannon is ﬁred at an angle.12) carefully by considering the components. Verify that the motion of the object takes place in a plane. this equals mr (t) . r (t) of the cannon ball. Describe how you could ﬁnd this constant. 1. Suppose in the context of Problem 7 that the cannon ball has mass 10 kilograms and it experiences a force of air resistance which is . In particular verify that mv (t) · 2 d v (t) = m dt |v (t)| . 3. b) . Such forces are sometimes called forces of constraint because they do not contribute to the speed of the object. Verify Formula (15. Now argue that r × r = (r × r) . 4. v of an object acted on by air resistance proportional to the velocity and a constant force.8 meters per sec2 . F (t) =e−t i + cos (t) j + t2 k meters per sec2 . 10.4 that Newton’s ﬁrst law can be obtained from the second law. Also suppose that the initial speed is 100 meters per second. F is always perpendicular to the velocity of the object. Therefore. 9. 5. EXERCISES 365 15.8. Hint: Let r (t) denote the position vector of the object from the ﬁxed point. 7. Suppose an object moves in three dimensional space in such a way that the only force acting on the object is directed toward a single ﬁxed point in three dimensional space.01v Newtons where v is the velocity in meters per second. Thus F · v = 0. mr × r = g (r) r × r = 0. Show the solution to v + rv = c with the initial condition. Fill in the details for the derivation of kinetic energy. for all t ∈ (a. Then the force acting on the object must be of the form g (r (t)) r (t) and by Newton’s second law. Suppose an object having mass equal to 5 kilograms experiences a time dependent force. showing that (r × r) must equal a constant vector. Neglecting air resistance. If v is velocity and r = k/m where k is a constant for air r resistance and m is the mass.6 Exercises 1. The speed of the ball as it leaves the mouth of the cannon is known to be s meters per second. and c = f /m. The acceleration of gravity is 9.

Explain how this can be done and why. 2 15. Of course this could change if you ﬁxed things so the balls would stick to each other. 16. Why does this happen? Why don’t two or more of the stationary balls start to move. Hint: Use Newton’s second law to show the time derivative of the above expression equals zero. Many times the resulting system is so complex that it is impossible to solve in terms of known functions. The ball at the right is lifted and allowed to swing. Thus F · v = 0.7 Systems Of Ordinary Diﬀerential Equations Newton’s laws of motion yield a system of ordinary diﬀerential equations. It is a large massive block of wood hanging from a long string. 13. Show the total energy of the object. how fast will it be going? Assume it starts from rest. There is a popular toy consisting of identical steel balls suspended from strings of equal length as illustrated in the following picture. When it collides with the other balls. Thus the energy is 1 (m + M ) V 2 and this block of wood rises to a height of h. Using Problem 12. By conservation of momentum mv = (m + M ) V where V is the speed of the block of wood immediately after the collision. Now use Problem 12. −mgk and a force. When it reaches the bottom. Thus linear momentum is conserved but energy is not. the ball on the left is observed to swing away from the others with the same speed the ball on the right had when it collided.366 LIMITS AND DERIVATIVES 12. The speed of the bullet can be determined from measuring how high the block of wood rises. there is a natural question about whether there even exist solutions. Suppose the only forces acting on an object are the force of gravity. Therefore. perhaps at a slower speed? This is an example of an elastic collision because energy is conserved. suppose an object slides down a frictionless inclined plane from a height of 100 feet. 14. A riﬂe is ﬁred into the block of wood which then moves. 15. Hint: Let v be the speed of the bullet which has mass m and let the block of wood have mass M. This is . Such a collision is called inelastic. F which is perpendicular to the motion of the object. show the kinetic energy before the collision is greater than the kinetic energy after the collision. In the experiment of Problem 14. Here v is the velocity and the ﬁrst term is the kinetic energy while the second is the potential energy. 1 2 E ≡ m |v| + mgz 2 is constant. The ballistic pendulum is an interesting device which is used to determine the speed of a bullet.

b].2 Suppose x : [a. Therefore.3. if f is continuous on [a.16) . The ﬁrst of these conditions is known as a Lipschitz condition. To do this. Therefore.13) (15.7. then f ∈ R ([a.2.1 Let f be a continuous function deﬁned on [a. This proves the lemma.4 on Page 233. x (s)) ds. x) . x (c) = x0 if and only if x is a solution to the integral equation. b]) . vn ) . In fact you may let v = u/ |u| if u = 0. f is continuous and bounded. recall from Theorem 10. b] .15. v such that |v| = 1 and u · v = |u| . each fi is also and so lims→t |f (s)| = |f (t)| showing f is continuous at t. let v be such a unit vector with the property that b b v· a f (t) dt = a f (t) dt .7. Proof: By the triangle inequality. note that if u ∈ Rn . t (15. b] → Rn is a continuous function and c ∈ [a. (15. x = f (t. b] × Rn → Rn satisﬁes the following two conditions. If u = 0. · · ·.1 Picard Iteration |f (t. b] having values in Rn . Then |f | is also continuous and b b f (t) dt ≤ a a |f | (t) dt. Lemma 15. Also from this deﬁnition. First of all. Then letting v = (v1 . b]) . 15. Lemma 15.15) x (t) = x0 + c f (s. if f is a continuous vector valued function deﬁned on [a. one can obtain the following simple lemma. Then x is a solution to the initial value problem. x1 )| ≤ K |x − x1 | .14) Suppose that f : [a. It only remains to prove the inequality. n 1/2 ||f (t)| − |f (s)|| ≤ |f (t) − f (s)| ≡ i=1 |fi (t) − fi (s)| 2 and since f is continuous.1 on Page 352 that f ∈ R ([a. there exists a vector.7. SYSTEMS OF ORDINARY DIFFERENTIAL EQUATIONS 367 a profound question which must ultimately be dealt with. let v = e1 .7. This section contains a fairly general result on existence and uniqueness for systems of ordinary diﬀerential equations. b] it follows from Deﬁnition 15. b b b v· a f (t) dt = v1 a b f1 (t) dt + · · · + vn a b fn (t) dt = a v · f (t) dt ≤ a |f (t)| dt by the Cauchy Schwarz inequality. x) − f (t. (15.

3 Let k ≥ 2. b] .19) . gives x (c) = x0 . t x1 (t) ≡ x0 + c t f (s. x (t)) .3 on Page 353 can be used to diﬀerentiate both sides and obtain x (t) = f (t. These deﬁnitions imply some simple estimates. The most famous technique for studying this integral equation is the method of Picard iteration .18) Continuing in this way leads to the following lemma.2. Theorem 15.16) and so it suﬃces to study this integral equation. This proves the lemma. Also. (15. |x1 (t) − x0 | ≤ t |f (s. Then for any t ∈ [a. t x2 (t) ≡ x0 + c f (s.7.368 LIMITS AND DERIVATIVES Proof: If x solves (15. 2 2 KMf c |s − c| ds ≤ KMf (15. you begin with an initial function. Let Mf ≡ max {|f (s. x0 )| ds ≤ c t K |x1 (s) − x0 | ds ≤ |t − c| . b] . Lemma 15. It follows from this lemma that the initial value problem. x ∈ Rn } . x1 (s)) ds.the fundamental theorem of calculus. x0 (s)) ds = x0 + c f (s. x0 )| ds ≤ Mf |t − c| . and if xk−1 (s) has been determined.15) is equivalent to the integral equation (15. Then for any t ∈ [a. b] . x0 (t) ≡ x0 and then iterate as follows. x1 (s)) − f (s. then since f is continuous. |xk (t) − xk−1 (t)| ≤ Mf K k−1 |t − c| . Conversely. c (15. xk−1 (s)) ds. integrate both sides from c to t obtaining (15. k! k (15.17) Now using this estimate and the Lipschitz condition for f . t |x2 (t) − x1 (t)| ≤ c t |f (s.16). if x is a solution of the initial value problem. letting t = c on both sides. x0 )| : s ∈ [a. x0 ) ds.16). In this method. t xk (t) ≡ x0 + c f (s.

(k + 1)! Lemma 15.22) . SYSTEMS OF ORDINARY DIFFERENTIAL EQUATIONS 369 Proof: The estimate has been veriﬁed when k = 2. r=0 (K(b−a)) .7. {xk (t)}k=1 is a Cauchy sequence.1 implies t t |xk+1 (t) − xk (t)| = x0 + c t f (s. then for all t ∈ [a. This shows that for every ε > 0. xk−1 (s))| ds t ≤ c K |xk (s) − xk−1 (s)| ds which. (15. ∞ |x (t) − xl (t)| ≤ Mf r=l (K (b − a)) <ε (r + 1)! r (15. l > L. by the induction hypothesis. there exists L such that if l ≥ L. |x (t) − xl (t)| < ε. Lemma 15. b] .20) ∞ r a quantity which converges to zero as l → ∞ due to the convergence of the series.21) whenever l is suﬃciently large. then |xk (t) − xl (t)| < ε. it follows that for all ε > 0. Then if k > l.. Then the Lipshitz condition on f and Lemma 15. Since the right side of the inequality does not depend on t. is dominated by t ≤ c KMf K k−1 |s − c| ds k! k+1 k ≤ Mf K k This proves the lemma. b] . ∞ Since {xk (t)}k=1 is a Cauchy sequence. {xk (t)}k=1 is a Cauchy sequence. Proof: Pick such a t ∈ [a. the triangle inequality and the estimate of the above lemma imply k−1 ∞ |xk (t) − xl (t)| ≤ r=l ∞ r+1 |xr+1 (t) − xr (t)| ≤ ∞ r=l Mf r=l Kr |t − c| ≤ Mf (b − a) (r + 1)! (K (b − a)) (r + 1)! r (15.7. Assume it is true for k. ∞ there exists L such that if k. (r+1)! a fact which is easily seen by an application of the ratio test.20).4 For each t ∈ [a.7.7.5 The function t → x (t) is continuous and l→∞ lim (sup {|x (t) − xl (t)| : t ∈ [a.15. b]}) = 0. xk (s)) ds − x0 + c f (s. xk (s)) − f (s. xk+1 (s)) ds = c |f (s. In other words. Letting k → ∞ in (15. denote by x (t) the point to which it converges. |t − c| . b] . This proves the lemma.

x (s)) ds ≤ c |f (s.21).22) was established above in (15. x is a solution to the initial value problem. xl (s)) − f (s. xk−1 (s)) ds f (s. . |x (s) − x (t)| ≤ |x (s) − xl (s)| + |xl (s) − xl (t)| + |xl (t) − x (t)| ≤ 2ε + |xl (s) − xl (t)| . b] . x1 (s))) ds. xl (s)) ds = c c f (s. Therefore.15). assume x and x1 are two solutions to the initial value problem.15). Proof: It only remains to verify the uniqueness assertion. 3 ε By the continuity of xl .13) and (15.2. b]}) < t t ε . (15.7. (15. This veriﬁes continuity of t → x (t) and proves the lemma. b]}) < ε/3. Then there exists a unique solution to the initial value problem.7. x (s)) ds. K (b − a) < K (b − a) Since ε is arbitrary.6 Let f satisfy (15. xl (s)) ds − c t c f (s.370 LIMITS AND DERIVATIVES Proof: Formula (15. this veriﬁes that t l→∞ t lim f (s. Let ε > 0 be given and let t ∈ [a. K (b − a) f (s. x (s))| ds t ≤ c K |xl (s) − x (s)| ds ε = ε. Then letting l be large enough that l→∞ lim (sup {|x (t) − xl (t)| : t ∈ [a.14).7. x (s)) ds c = x0 + and so by Lemma 15.the last term is dominated by 3 whenever |s − t| is small enough. Theorem 15.2 it follows that t x (t) − x1 (t) = c (f (s. Then by Lemma 15. Letting l be large enough that l→∞ lim (sup {|x (t) − xl (t)| : t ∈ [a. This proves the existence part of the following theorem. x (s)) − f (s. It follows that x (t) = lim xk (t) k→∞ t = lim k→∞ x0 + c t f (s.

15.7. .15. Then t |x (t) − x1 (t)| ≤ c t |f (s. There are much simpler and better ways to obtain approximate solutions to diﬀerential equations than using Picard iteration. ≤K c Letting h (t) = t c |x (s) − x1 (s)| ds. x1 (s))| |x (s) − x1 (s)| ds. it follows that −g (t) = |x (t) − x1 (t)| and so −g (t) ≤ Kg (t) which implies 0 ≤ g (t) + Kg (t) and consequently. But h (c) = 0 and so this requires t h (t) ≡ c |x (s) − x1 (s)| ds ≤ 0 and so x (t) = x1 (t) whenever t ≥ c. But g (c) = 0 and so 0 ≥ g (t) ≥ 0.2 Numerical Methods For Diﬀerential Equations The above considers a fundamental philosophical question about existence of solutions to the initial value problem. Now suppose that t ≥ c. Letting g (t) = c t |x (s) − x1 (s)| ds. x (0) = x0 . The most primitive way of doing this is the Euler method. Therefore. Then this equation implies that for all such t. This proves uniqueness and establishes the theorem. In principle. c |x (t) − x1 (t)| ≤ t c |f (s. 0 ≤ eKt g (t) which implies integration from t to c gives 0 ≤ eKc g (c) − eKt g (t) . x1 = x. x (s)) − f (s. Suppose you want to ﬁnd the solution to the initial value problem. Hence g (t) = 0 for all t ≤ c and so x (t) = x1 (t) for these values of t. x (s)) − f (s. SYSTEMS OF ORDINARY DIFFERENTIAL EQUATIONS 371 Suppose ﬁrst that t < c. x1 (s))| ≤ t K |x (s) − x1 (s)| ds.7. x) . it yields a method for ﬁnding the solution as well but it is never used this way. it follows that h (t) ≤ Kh (t) and so e−Kt h (t) ≤0 which means e−Kt h (t) − e−Kc h (c) ≤ 0. x = f (t.

x (0) = x0 . tk+1 ) you deﬁne (tk+1 − t) (t − tk ) x (tk ) + x (tk+1 ) .372 LIMITS AND DERIVATIVES for t ∈ [0. 5000 you would have (1. Using x (t2 ) one used (15. (n−1)T .24) to ﬁnd x (t3 ) and you continue doing this till you get x (tn ). 0. x) . y (0) = 1 for t ∈ [0. Thus y (5) = e5 = 148. xn + k2 2 2 hf (tn + h. and using the pattern which is emerging. x (ti )) . To solve x = f (t. In general. · · ·. this can be written more simply as x (ti+1 ) = x (ti ) + hf (ti . 2T . this can be substituted back in to (15. You split the time interval into n equal pieces. y500 = (1. From (15. This is done by a computer. x0 ) . x (0) = x0 . x (t1 )) . 413 159 102 576 6. . x (ti + h) − x (ti ) = f (ti .01) 1 = 1.24) x (t1 ) = x0 + hf (0. you then n n n n n replace the derivative with a diﬀerence quotient as follows. Letting h = T and ti = ih. Now from Euler’s method. xn + k3 ) .001. y3 = 1.24) Now this can be used to ﬁnd an approximate solution as follows. T ] . nT . 042 which is pretty close.01 = (1. Therefore. x (ti )) . In the following example.01 and a step size of . For t ∈ (tk .001) = 148. x0 is known because it is given. when you are working with numerical methods you test them on known examples and see how well they perform.01. 5] using Euler’s method with a step size of . (15. T . y1 = 1 + .24) to ﬁnd x (t2 ) = x (t1 ) + hf (t1 . 77 and this is the value which Euler’s method gives for y (5) . there will be no pattern and you just have to iterate things till you get to the end. The correct answer is y = ex . xn ) 1 1 hf tn + h.01) 1. x (0) = x0 .001. Having found x (t1 ) . the resulting function will be piecewise linear and hopefully will not be too far oﬀ. Of course this was an easy example because it led to a pattern which was easy to apply. It is customary to write xi instead of x (ti ) . You need a much better method than the Euler method if you want to solve ordinary diﬀerential equations. Next. k1 k2 k3 k4 ≡ ≡ ≡ ≡ hf (tn . h Noting that ti + h = ti+1 . h h This whole idea is based on approximating functions with their linear approximations locally. where f satisﬁes the conditions given above for existence.7.01) . x (t) ≡ Example 15.23) (15. There are many methods and the detailed study of them is outside the scope of this book. Continuing this way. Euler’s method will be applied to an initial value problem whose solution is known. you do the following to go from tn to tn+1 . How well does it work? In general.01) 500 2 = 144. One of the most famous is the Runge Kutta fourth order method.01 (0) = 1 Now y2 = 1 + (. If you used the step size . xn + k1 2 2 1 1 hf tn + h.01 + (.7 Solve the initial value problem y = y.

y (0) = 0 so the diﬀerential equation does not depend on y. the the the the 4. 6 This method is called fourth order because it can be shown that the diﬀerence between the true solution and this approximate solution is dominated by Ch4 where C is some constant independent of the choice of h.8 Exercises 1.15. Consider the model problem y = y. xn ) . f.2 using the Euler method and compare with the true solution. Then xn+1 is the correction. Then EXERCISES 373 1 (k1 + 2k2 + 2k3 + k4 ) . uses one such system which is a ﬁfth order method. Now show how these errors add up and conclude Euler method is of order 1. Now use the Runge Kutta method for h = . Maple. Even better schemes have been devised. The important thing for you to realize at this point is that the initial value problem for an ordinary diﬀerential equation is a signiﬁcant problem and it has been solved quite well. x∗ n+1 2 ∗ where here x∗ n+1 = xn + hf (tn .8.2 and compare with the true solution.2 and compare with the true solution. Work the above model problem using the following method known as a predictor corrector method. This xn+1 is the prediction. 1] . Assume anything you like about smoothness of the function. 3. Show that in going from 0 to t1 . The computer algebra system. . Solve it numerically for h = .the diﬀerence between the Euler solution and true solution is less than a constant times h2 . Thus the diﬀerence between the true solution and Euler solution is less than Ch. xn+1 = xn + h f (tn . Show that if y = f (x) . xn+1 ≡ xn + 15. then the Runge Kutta method above reduces to Simpson’s rule for numerical integration. y (0) = 1 on the interval [0. Use h = . 2. xn ) + f tn+1 .

374 LIMITS AND DERIVATIVES .

· · ·. · · ·. This is done by deﬁning things in such a way that the more general concept reduces to the earlier notion.1 Arc Length And Orientations The application of the integral considered here is the concept of the length of a curve. b]} . 16. xn ) . b] . 3. 2. xi (a + h) − xi (a) lim . [a. 375 . First it is necessary to consider what is meant by arc length. Consider such a smooth curve having parameterization (x1 . xn (t)) : t ∈ [a. b] → R such that the following conditions hold 1. The integral is used to deﬁne what is meant by the length of such a smooth curve. · · ·. xn (ti ) ). b] ⊆ R and functions xi : [a. |p (t)| ≡ n i=1 |xi (t)| 2 1/2 = 0 for all t ∈ [a. For p (t) ≡ (x1 (t) . C = ∪ {(x1 (t) . xi (t) . C is a smooth curve in Rn if there exists an interval. t → p (t) is one to one on (a. xi exists and is continuous and bounded on [a.Line Integrals The concept of the integral can be extended to functions which are not deﬁned on an interval of the real line but on some curve in Rn . b) . C. deﬁned above are giving the coordinates of a point in Rn and the list of these functions is called a parameterization for the smooth curve. b] . The functions. b] . The earlier treatment in terms of initial value problems will be made more rigorous here. xn (t)) . Note the natural direction of the interval also gives a direction for moving along the curve. 5. you could consider the polygon formed by lines from p0 to p1 and from p1 to p2 and from p3 to p4 etc. a = t0 < · · · < tn = b and letting pi = ( x1 (ti ). 4. b]. h→0+ h and xi (b) deﬁned similarly as the derivative from the left. with xi (a) deﬁned as the derivative from the right. Forming a partition of [a. to be an approximation to the curve. · · ·. xi is continuous on [a. The following picture illustrates what is meant by this. Such a direction is called an orientation.

it is reasonable to expect this to be close to n |p (tk−1 )| (tk − tk−1 ) k=1 which is seen to be a Riemann sum for the integral b |p (t)| dt a and it is this integral which is deﬁned as the length of the curve. 1] . The answer to this question is that the length of the curve does not depend on parameterization. t ∈ [0. p and q. You can see from the following picture that the polygonal approximation would appear to be even better and that as more points are added in the partition. p3 p1 p2 p0 Thus the length of the curve is approximated by n |p (tk ) − p (tk−1 )| .376 LINE INTEGRALS p1 4 p2 4 4 4 4 4 p3 p0 Now consider what happens when the partition is reﬁned by including more points. A parameterization for the line segment joining these two points is fi (t) ≡ tpi + (1 − t) qi . Does the deﬁnition of length given above correspond to the usual deﬁnition of length in the case when the curve is a line segment? It is easy to see that it does so by considering two points in Rn . Would the same length be obtained if another parameterization were used? This is a very important question because the length of the curve should depend only on the curve itself and not on the method used to trace out the curve. . The proof is somewhat technical so is given in the last section of this chapter. k=1 Since the functions in the parameterization are diﬀerentiable. the sum of the lengths of the line segments seems to get close to something which deserves to be deﬁned as the length of the curve. C.

d] → C be two parameterizations satisfying 1 .1.3) Proof: Formula (16.6. Lemma 16. (16. If f ∼ g and g ∼ h.1. Imagine t is time and p (t) gives the location of a point in space. and if f : [a. If f ∼ g then f −1 ◦ g is increasing. Corollary 16. To see (16. This proves the lemma.3 The following hold for ∼.1. Now g−1 ◦ f must also be increasing because it is the inverse of f −1 ◦ g.1) (16.1. it must be the case that g−1 ◦ h is increasing. Example 16. consider g (t) ≡ f ((a + b) − t) .1. t2 for t ∈ [−1. They give the direction of motion on C.1 Let C be a smooth curve and let f : [a. which satisfy h ∼ f are called the equivalence class determined by f and those h ∼ g are called the equivalence class determined by g. the message is that there are exactly two directions of motion along a curve.3). The proof that curve length is well deﬁned for a smooth curve contains a result which deserves to be stated as a corollary. f ∼ f.4 Graph the curve t3 . You see that going from f to g corresponds to tracing out the curve in the opposite direction. then f ∼ h. It is proved in Lemma 16. This veriﬁes (16. It is also said that the two parameterization give the same orientation for the curve when f ∼ g. f −1 ◦ h = f −1 ◦ g ◦ g−1 ◦ h and so since both of these functions are increasing. Thus this new deﬁnition which is valid for smooth curves which may not be straight line segments gives the usual length for straight line segments.1. In simple language. ARC LENGTH AND ORIENTATIONS 377 Using the deﬁnition of length of a smooth curve just given. Deﬁnition 16. Now by Corollary 16. h. if h is a parameterization.1) is obvious because f −1 ◦ f (t) = t so it is clearly an increasing function.2). This is not hard to believe.1. then f and g are said to be equivalent parameterizations and this is written as f ∼ g.16. When the parameterizations are equivalent. . If f ∼ g then g ∼ f .5.2 on Page 385 but the proof is mathematically fairly advanced so it is presented later. they preserve the direction. b] → C is a parameterization of C. of motion along the curve and this also shows there are exactly two orientations of the curve since either g−1 ◦ f is increasing or it is decreasing. in the deﬁnition of a smooth curve that p (t) = 0. Consequently. then if f −1 ◦ h is not increasing. Here is an example. These parameterizations. also a parameterization of C. The symbol. If C is such a smooth curve just described.2) (16. Sometimes people wonder why it is required. These two classes are called orientations of C. either h ∼ g or h ∼ f .2 If g−1 ◦ f is increasing. 1] . Then g−1 ◦ f is either strictly increasing or strictly decreasing. b] → C and g : [c. the length according to this deﬁnition is 1 0 n 1/2 (pi − qi ) i=1 2 dt = |p − q| . the point can stop and change directions abruptly. producing a pointy place in C. it follows f −1 ◦ h is also increasing. If p (t) is allowed to equal zero. ∼ is called an equivalence relation.

378 LINE INTEGRALS In this case. C. F(x(t)) x(t) x(t + h) In this picture. F on an object which moves from the point x (t) to the point x (t + h) along the straight line shown would equal F·(x (t + h) − x (t)) . Such a curve should not be considered smooth. one writes dt for h and dW = F (x (t)) · x (t) dt . Note the pointy place. in the direction determined by the given orientation of the curve will be deﬁned. Earlier the work done by a force which acts on an object moving in a straight line was discussed but here the object moves over a curve. x (t + h) − x (t) ≈ x (t) h where the wriggly equal sign indicates the two quantities are close. Also. provided h is small. it is called a force ﬁeld. each of which is called an “orientation”. C is an “oriented curve” if the only parameterizations considered are those which lie in exactly one of the two equivalence classes.2. it speciﬁes the order in which the points of C are encountered. Deﬁnition 16. In simple language. the work done by a constant force. The pair of concepts consisting of the set of points making up the curve along with a direction of motion along the curve is called an oriented curve. 16.2 Line Integrals And Work Let C be a smooth curve contained in Rp . Thus the graph of this curve looks like the picture below. A curve.1 Suppose F (x) ∈ Rp is given for each x ∈ C where C is a smooth oriented curve and suppose x → F (x) is continuous. consider the following picture. t = x1/3 and so y = x2/3 . The mapping x → F (x) is called a vector ﬁeld. It is reasonable to assume this would be a good approximation to the work done in moving along the curve joining x (t) and x (t + h) provided h is small enough. F on an object as it moves along the curve. In order to deﬁne what is meant by the work. This is new. In the case that F (x) is a force. In the notation of Leibniz. Next the concept of work done by a force ﬁeld. orientation speciﬁes a direction over which motion along the curve is to take place. Thus.

Is the same thing obtained from the above deﬁnition? Let x (t) ≡ p+t (q − p) . this will show up as the work being negative.2 Let F (x) be given above. When F is interpreted as a force. this is called the line integral. Does the concept of work as deﬁned here coincide with the earlier concept of work when the object moves over a straight line when acted on by a constant force? Let p and q be two points in Rn and suppose F is a constant force acting on an object which moves from p to q along the straight line joining these points. 1] be a parameterization for this oriented curve. Regardless the physical interpretation of F. W (a) = 0. the above deﬁnition yields 1 F · (q − p) dt = F · (q − p) . Theorem 16. the straight line in the direction from p to q.16. dW = F (x (t)) · x (t) . This proves the theorem. to equal zero. . Proof: Suppose g : [c.2. [a. Therefore. 0 Therefore. Then the work done is F · (q − p) . dW = F (x (t)) · x (t) . Deﬁnition 16. LINE INTEGRALS AND WORK or in other words. the line integral measures the extent to which the motion over the curve in the indicated direction is aided by the force. x is one of the allowed parameterizations of C in the given orientation of C.2. dt 379 Deﬁning the total work done by the force at t = 0. d] → C is another allowed parameterization. Then since φ is increasing.3 The symbol. φ. corresponding to the ﬁrst endpoint of the curve. is well deﬁned in the sense that every parameterization in the given orientation of C gives the same value for C F · dR. there is an interval. dt This motivates the following deﬁnition of work. d b F (g (s)) · g (s) ds = c b a F (g (φ (t))) · g (φ (t)) φ (t) dt b = a F (f (t)) · d g g−1 ◦ f (t) dt dt = a F (f (t)) · f (t) dt. the new deﬁnition adds to but does not contradict the old one.2. In other words. Then the work done by this force ﬁeld on an object moving over the curve C in the direction determined by the speciﬁed orientation is deﬁned as b F · dR ≡ C a F (x (t)) · x (t) dt where the function. If the net eﬀect of the force on the object is to impede rather than to aid the motion. C F · dR. Thus g−1 ◦ f is an increasing function. Then x (t) = q − p and F (x (t)) = F. b] and as t goes from a to b. x (t) moves in the direction determined from the given orientation of the curve. t ∈ [0. the work would satisfy the following initial value problem.

−2 sin (2t) . F (x. y. 2 sin 2t. 3. To ﬁnd this line integral use the above deﬁnition and write π F · dR = C 0 2t (cos (2t)) . z) ≡ 2xyi + x2 j + k. If f and g are both increasing functions. this equals the following integral. y. 1] the position of an object is given by r (t) = ti + tj + tk.1 · (1. 5. 2yz and here is the parameterization of a curve. Find F · dR C where C is the curve traced out by this object which has the orientation determined by the direction of increasing t. Also suppose there is a force ﬁeld deﬁned on R3 . x + z 2 . C. 2π] the position of an object is given by r (t) = ti + cos (2t) j + sin (2t) k. Repeat the problem for r (t) = ti + t2 j + tk.t2 .380 LINE INTEGRALS Example 16.4 Suppose for t ∈ [0. Suppose for t ∈ [0. Repeat the problem for r (t) = ti + t2 j + tk. 4. Find C F· dR. y. show f ◦ g is an increasing function also. t) where t goes from 0 to π/4. Assume anything you like about the domains of the functions. F (x. Suppose for t ∈ [0. Find F · dR C where C is the curve traced out by this object which has the orientation determined by the direction of increasing t.3 Exercises 1. Suppose for t ∈ [0. Find F · dR C where C is the curve traced out by this object which has the orientation determined by the direction of increasing t. R (t) = (cos 2t. Also suppose there is a force ﬁeld deﬁned on R3 . z) ≡ yzi + xzj + xyk.2. the y in the formula for F with cos (2t) and the z in the formula for F with sin (2t) because these are the values of these variables which correspond to the value of t. y. 2 cos (2t)) dt In evaluating this replace the x in the formula for F with t. F (x. z) ≡ zi + xzj + xyk. Also suppose there is a force ﬁeld deﬁned on R3 . π 2t cos 2t − 2 (sin 2t) t2 + 2 cos 2t dt = π 2 0 16. . Find F · dR C where C is the curve traced out by this object which has the orientation determined by the direction of increasing t. Also suppose there is a force ﬁeld deﬁned on R3 . π] the position of an object is given by r (t) = ti + cos (2t) j + sin (2t) k. 3] the position of an object is given by r (t) = ti + tj + tk. 2. F (x. z) ≡ 2xyi+ x2 + 2zy j+ y 2 k. y. Here is a vector ﬁeld. Taking the dot product.

In particular. the radius of curvature is small and at points where the curvature is small. κ (s) with T (s) = κ (s) N (s) . y. from the fundamental theorem of calculus. 16. assume anything about smoothness and continuity to make the following manipulations valid. there is no special direction normal to the curve at such points which could be distinguished from any other direction normal to the curve. z (t)) dt = a b m (x (t) . and a satellite orbiting the earth all have something in common. then there exists a unit vector. This is known as the parameterization of the curve in terms of arc length. 1 = |R (s)| = |T (s)| . Show directly that the work done equals the diﬀerence in the kinetic energy. B (s) ≡ T (s) × N (s) is called the binormal. y (t) . z (t)) · (x (t) . Thus R (s) equals the point on the curve which occurs when you have traveled a distance of s along the curve from one end.4. the vector. The plane determined by the two vectors. κ (s) = 0 and the radius of curvature would be considered inﬁnite. MOTION ON A SPACE CURVE 381 6. Denote by R (s) the function which takes s to a point on this curve where s is arc length. b] . z (t)) · (x (t) . In case T (s) = 0. T is perpendicular to the vector.16. assume that R exists and is continuous. y (t) . y (t) . T and N is called the osculating plane. y (t) . z (t)) dt. T let N (s) = |T (s) and so T (s) = |T (s)| N (s) . z (t)) for t ∈ [a. showing the scalar valued function is (s)| κ (s) = |T (s)| . This makes sense because if there is no curving going on. Let F (x. the radius of curvature is large. Lemma 16. Deﬁnition 16. the principal normal is undeﬁned in this case. T (s) equals a constant and so the space curve is a straight line which it would be supposed has no curvature. 1 The radius of curvature is deﬁned as ρ = κ .4. Therefore. s = 0 |R (r)| dr because of the deﬁnition of arc length.When T (s) = 0 so the principal normal is deﬁned. The function. Then |T (s)| = 1 and if T (s) = 0. In the case where |T (s)| = 0.2 The vector. T · T = 1 and so upon diﬀerentiating this on both sides. It identiﬁes a particular plane which is in a sense tangent to this space curve. Therefore. y (t) . N (s) is called the principal normal.4 Motion On A Space Curve A ﬂy buzzing around the room. Thus it measures the rate at which the curve twists. Note also that it incorporates an orientation on the curve because there are exactly two ends you could begin measuring length from. a etc. Hint: b F (x (t) . In this section.4. T. They are moving over some sort of curve in three dimensions. Proof: First. yields T · T + T · T = 0 which shows T · T = 0. κ (s) in the above lemma is called the curvature. This proves the lemma. N (s) perpendicular to T (s) and a scalar valued function. m on a curve with parameterization. a person riding a roller coaster. In the case where |T (s)| = 0 near the point of interest. z) be a given force ﬁeld and suppose it acts on an object having mass. Therefore. (x (t) . The binormal is normal to the osculating plane and B tells how fast this vector changes. T (s) is called the unit tangent vector and the vector. Thus at points where there is a lot of curvature. the vector. s .1 Deﬁne T (s) ≡ R (s) . Also.

B must be some scalar multiple of N and it is this multiple called τ . B is given to form a right handed system of unit vectors each perpendicular to the others. N = −κT − τ B where κ is the curvature and is nonnegative and τ is the torsion. Therefore. Therefore. and principal normal at this point of the curve. Now recall again that T. Suppose |T (s)| = 0 T so the principal normal. τ is called the torsion. T and B and since this is in three dimensions.3 Let R (s) be a parameterization of a space curve with respect to arc length and let the vectors. B is a unit vector and so B (s) · B (s) = 0. B is either zero or is perpendicular to T. B = τ N. note that B × T = N which follows because T.4. B must be a multiple of N. There are two vectors. N. none of this is deﬁned because in this case there is not a well deﬁned osculating plane. draw two unit vectors on it which are perpendicular. and principal normal vectors. The ﬁrst two equations are already established. B forms a right handed system of unit vectors each of which is perpendicular to every other. It is just the solution of an initial value problem of the sort discussed earlier. To get the third. It is a remarkable result because it says that from knowledge of the two scalar functions. unit tangent. . you can reconstruct the entire space curve starting at some point. In case T = 0. Later it will be done algebraically but for now. s R0 because R (s) = T (s) and so R (s) = R0 + 0 T (r) dr. N. T. and you can diﬀerentiate both sides and get B = T × N + T × N . This establishes the Frenet Serret formulas. and N (t) be the binormal. T and N have the same direction and B = T × N . unit tangent.4. τ (s) such that B = τ N. T = κN. This is an important example of a system of diﬀerential equations in R9 . T.B (s) · B (s) + B (s) · B (s) = 0 showing that B is perpendicular to B also. The following example is typical. Thus N × T = −B and B × N = −T. B is a right hand system. T (t) . Thus B = τ N as claimed. Proof: From the deﬁnition of B = T × N. hence perpendicular to the plane determined by them. the following situation is obtained. N (s) = |T (s) is deﬁned. B is a vector which is perpendicular to both vectors. Take a piece of paper. and initial values for B. The scalar function.382 LINE INTEGRALS Lemma 16. from the deﬁnition of B.) Now take the derivative of this expression. Theorem 16. The vectors. N. thus N = = B ×T+B×T τ N × T+κB × N. Diﬀerentiating this. and B be as deﬁned above. Having done this. Proof: κ ≥ 0 because κ = |T (s)| . Now recall that T is a multiple called curvature multiplied by N so the vectors. Then the following system of diﬀerential equations holds in R9 . Therefore. and N when s = 0 you can obtain the binormal. especially in applications. Thus a value of t would correspond to a point on this curve and you could let B (t) . N. B. T and B which are perpendicular to each other and both B and N are perpendicular to these two vectors. Then you can see that any two vectors which are perpendicular to this plane must be multiples of each other.4 (Serret Frenet) Let R (s) be the parameterization with respect to arc length of a space curve and T (s) = R (s) is the unit tangent vector. Often. (Use your right hand. and N are vectors which are functions of position on the space curve. The binormal is the vector B ≡ T × N (s)| so T. Then B = T × N and there exists a scalar function. But also. The conclusion of the following theorem is called the Serret Frenet formulas. T. you deal with a space curve which is parameterized by a function of t where t is time. τ and κ. Lets go over this last claim a little more.

B (t) . Now the tangent vector is dR dR dt 1 R (t) = =√ ds dt ds a2 + b2 1 √ ((−a sin t) i + (a cos t) j + bk) 2 + b2 a 1 ((b sin t) i−b cos tj + ak) + b2 a cos t a2 +b2 2 Now the curvature. Here t ∈ [0.the unit tangent vector.4. remembering that t is a function of s. B (s) = = = 1 dt ((b cos t) i+ (b sin t) j) 2 ds +b 1 ((b cos t) i+ (b sin t) j) a2 + b2 τ (− (cos t) i − (sin t) j) = τ N √ a2 and it follows −b/ a2 + b2 = τ . Corollary 16. s (t) . the binormal. MOTION ON A SPACE CURVE 383 Example 16. Therefore. T ] . R (t) = (a cos t) i + (a sin t) j + (bt) k. Note the curvature is constant in this example. κ (t) = dT ds = + a sin t a2 +b2 2 = a a2 +b2 .4. s = 0 v (r) dr and so ds dt 1 = v . T (t) . Recall that B = τ N where the derivative on B is taken with respect to arc length. Therefore. dt dt ds dt dt Proof: T = dR = dR ds = v ds .5 Given the circular helix. The arc length is s (t) = 0 obtained using the chain rule as T = = The principal normal: dT ds = = and so N= The binormal: B = = √ √ 1 2 + b2 a a2 i −a sin t − cos t j k a cos t b − sin t 0 dT dt dt ds 1 ((−a cos t) i + (−a sin t) j + 0k) a2 + b2 dT dT / = − ((cos t) i + (sin t) j) ds ds t √ a2 + b2 dr = √ a2 + b2 t. T = v/v which implies v = vT as claimed. v (t) = R (t) and let v (t) ≡ |v (t)| denote the speed and let a (t) denote the acceleration. N (t) . Then v = vT and a = dv T + κv 2 N. the curvature. the principal normal. τ (t) . t ds dt = v which implies .16. An important application of the usefulness of these ideas involves the decomposition of the acceleration in terms of these vectors of an object moving over a space curve. ﬁnd the arc length. The ﬁnal task is to ﬁnd the torsion. κ (t) .6 Let R (t) be a space curve and denote by v (t) the velocity.4. and the torsion. Also.

r r r r Sometimes curves don’t come to you parametrically. a= Example 16.4. The parameterization of the object which is as described is R (t) = r cos Therefore. 4) and (1. . v |T | T (t) = v 2 κN and so v = κ |T | and N = T (t) |T | From this. Find a parametrization for the intersection of the planes 2x + y + 3z = −2 and 3x − 2y + z = −4. N points from the object toward the center of the circle because it is a positive multiple of the vector. Suppose an object orbits a point at constant speed. 1] . This is unfortunate when it occurs but you can sometimes ﬁnd a parametric description of such curves. the parameterization is (x. The vector. It follows v rt v v t . What is the centripetal acceleration of this object? You may know from a physics class that the answer is v 2 /r where r is the radius. y. You need to solve for x and y in terms of x.7 Find a parameterization for the intersection of the surfaces y + 3z = 2x2 + 4 and y + 2z = x + 1. 4) . 2. 10. − v sin v t .4. This follows from the above quite easily. 2. Therefore. 5) − (3. 2. This yields z = 2x2 − x + 3. 1) = (3 − 2t. . 2 + 8t. r sin t r r v rt . − v cos v t . 4 + t) where t ∈ [0. 2. y = −4x2 + 3x − 5. 16. y. 10. − v sin r v rt . Also note that from the above. dt r The vector.8 Find a parametrization for the straight line joining (3. Find a parametrization for the intersection of the plane 4x + 2y + 3z = 2 and the elliptic cylinder x2 + 4z 2 = 9. 4) + t (−2. 8. (x. 1) is obtained from (1. Example 16. Thus. Note where this came from. −4t2 − 5 + 3t. Now you should check to see this works. T = − sin 1 r . a = = dv dT T+v dt dt dv dT dv T+v v= T + v 2 κN dt ds dt Note how this decomposes the acceleration into a component tangent to the curve and one |T | which is normal to it.384 LINE INTEGRALS Now the acceleration is just the derivative of the velocity and so by the Serrat Frenet formulas. 5) . cos v rt and T = − v cos r . v. 3. it is possible to give an important formula from physics. z) = (3. 8. (−2. letting t = x.5 Exercises 1. Find a parametrization for the intersection of the plane 3x + y + z = −3 and the circular cylinder x2 + y 2 = 1. z) = t. κ = |T (t)| /v = dv v2 T + v 2 κN = N. −t + 3 + 2t2 .

then there exists a sequence. Is the force acting on the object a continuous function? Explain. 2. INDEPENDENCE OF PARAMETERIZATION 4. t. 7. {tn } ⊆ [a. b] . √ 6. b] . s as a function of the parameter. Therefore. Find the arc length. 16. This proves the theorem. d b f (s) ds = c a f (φ (t)) φ (t) dt s Proof: Let F (s) = f (s) . φ is either strictly increasing or strictly decreasing. Suppose φ is strictly decreasing. g : [c. (For example. b] .) Then the ﬁrst integral equals F (d) − F (c) by the fundamental theorem of calculus. It only remains to verify continuity on [a. 1) and (−1. Lemma 16. b) from the conditions f and g satisfy. If φ is not continuous on [a. for some ε > 0 there 1 Recall that all continuous functions of this sort are Riemann integrable. √ 8. d] be one to one and suppose φ exists and is continuous on [a. s as a function of the parameter. Let R (t) = 2i + (4t + 2) j + 4tk.7. t.4 on Page 94.6. if t = 0 is taken to correspond to s = 0. Then if f is a continuous function deﬁned on [a. b] → [c. let F (s) = a f (r) dr. 9.6. s as a function of the parameter. b) . Find a parametrization for the straight line joining (1.16.3 on Page 117. b] → C. Find a parametrization for the intersection of the surfaces 3y + 3z = 3x2 + 2 and 3y + 2z = 3. Then φ (a) = d and φ (b) = c. if t = 0 is taken to correspond to s = 0. By Lemma 5. Find the arc length. Let R (t) = e5t i+e−5t j+5 2tk. Find the arc length. b] . An object moves along the x axis toward (0. 4. b] was a parameterization of a smooth curve. φ ≤ 0 and the second integral equals b a − a f (φ (t)) φ (t) dt = b d (F (φ (t))) dt dt = F (φ (a)) − F (φ (b)) = F (d) − F (c) . 0) and then along the curve y = x2 in the direction of increasing x at constant speed. 385 5. Proof: It is obvious φ is 1 − 1 on (a. if t = 0 is taken to correspond to s = 0. Therefore.6. Then φ (t) ≡ g−1 ◦ f (t) is 1 − 1 on (a. continuous on [a. Let R (t) = (cos t) i + (cos t) j + 2 sin t k. b] because then the ﬁnal claim follows from Lemma 5. C.15. d] → C be parameterizations of a smooth curve which satisfy conditions 1 .5. .1 Let φ : [a.2 Let f : [a. b] which is Riemann integrable1 . t. 4) . b] such that tn → t but φ (tn ) fails to converge to φ (t) . the length of C is deﬁned as |p (t)| dt a If some other parameterization were used to trace out C. and either strictly increasing or strictly decreasing on [a. The case when φ is increasing is similar. would the same answer be obtained? Theorem 16.6 Independence Of Parameterization b Recall that if p (t) : t ∈ [a.

for s ∈ I. gi (s) = fi (t) . J ≡ gi (I) is also an open interval and you can deﬁne a diﬀerentiable function. Let s0 ∈ φ ([a + δ. Now f (t) = g◦ g−1 ◦ f (t) = g (φ (t)) and it was just shown that φ is a continuous function on [a − δ. it follows t ∈ J1 . Proof: Let C be the curve and suppose f : [a. φ is continuous as claimed. Is it true that a |f (t)| dt = c |g (s)| ds? Let φ (t) ≡ g−1 ◦f (t) for t ∈ [a. an open interval containing φ−1 (s0 ) . φ(b−δ) b−δ |g (s)| ds = φ(a+δ) a+δ b−δ |g (φ (t))| φ (t) dt |f (t)| dt. this shows φ exists on [a + δ. d] → C both satisfy b d conditions 1 . Since s0 is arbitrary. still denoted by n such that |φ (tn ) − φ (t)| ≥ ε. f (t) = g (s) and so in particular. Consequently. by Theorem 16. Also. By continuity of gi . hi : J → I by hi (gi (s)) = s. 1 hi (gi (s)) = . Therefore. b] and letting δ → 0+ in the above. s. φ (t) = hi (fi (t)) fi (t) = hi (gi (s)) fi (t) = fi (t) gi (φ (t)) (16.1. it follows gi (s) = 0 for all s ∈ I where I is an open interval contained in [c.6.4) Now letting s = φ (t) for s ∈ I. gi (s) (16. Therefore.386 LINE INTEGRALS exists a subsequence.6. the continuity of f and g imply f (tn ) → g (s) and f (tn ) → f (t) . It follows that on this interval. still denoted by n such that {φ (tn )} converges to a point. and f on [a. yields d b |g (s)| ds = c a |f (t)| dt and this proves the theorem. Then by the above lemma φ is either strictly increasing or strictly decreasing on [a.5. for s and t related this way. Suppose for the sake of simplicity that it is strictly increasing.5) which shows that φ exists and is continuous on J1 .9. b − δ] and is continuous there. d] which contains s0 . Using the sequential compactness of [c. an open interval. As in Theorem 8. b]. gi (s0 ) = 0 for some i. (See Theorem 14. g (s) = f (t) and so s = g−1 ◦ f (t) = φ (t) . b] → C and g : [c. b + δ] . d) . a contradiction.1. Therefore. The decreasing case is handled similarly. a+δ = Now using the continuity of φ. Therefore. this implies that for s ∈ I. b] . Thus g−1 ◦ f (tn ) → s and still tn → t. d] which is not equal to φ (t) . s = hi (fi (t)) = φ (t) and so. of [c. d] .6 on Page 151.3 The length of a smooth curve is not dependent on parameterization.) there is a further subsequence. for t ∈ J1 . It follows f (t) = g (φ (t)) φ (t) and so.5 on Page 347. g . Theorem 16. b − δ]) ⊂ (c. . gi is either strictly increasing or strictly decreasing. Then by assumption 4.

there is a physical motivation for the circular functions. Any mass will do. Now let y be the displacement from this equilibrium position of the bottom of the spring with the positive direction being up. Then Newton’s second law along with Hooke’s law imply my = k (l − y) − mg and since kl − mg = 0. the only thing necessary do calculus is the concept of completeness and a way to measure distance. to the bottom end of the spring as shown in the following picture. the restoring force exerted by the spring is of the form kz where k is some constant which depends on the spring. This involved similar triangles and such notions. m. A major consideration was the careful deﬁnition of arc length of a circle and showing the radian measure of an angle is well deﬁned. Is it possible to deﬁne the circular functions beginning with these things instead of dragging in all that plane geometry? Also. mg = kl. z. Consider a garage door spring. the distance formula. what are some things the circular functions are good for? This chapter is devoted to these questions. It has been experimentally observed that as long as the extension. On the other hand. (It would be diﬀerent for a slinky than for a garage door spring.) This is known as Hooke’s law which is the simplest model for elastic materials. mg. moving the bottom end of the spring downward a distance l where the upward force exerted by the spring exactly balances the downward force exerted on the mass by gravity. You recall the important role of plane geometry in deﬁning them. Attach such a spring to the ceiling and attach a mass. m The weight of this mass. Thus the acceleration of the spring would be y . of such a spring is not too great. The extension of the spring in terms of y would be (l − y) . is a downward force which extends the spring. These springs exert a force which resists extension. 387 . Therefore. this yields my + ky = 0. To begin with.The Circular Functions Again The circular functions have already been discussed. It does not have to be a small elephant.

However. Remember. in particular. (17. It follows immediately from (17. Then let x = z − y and use Theorem 6. y + ω 2 y = 0. C + C = 0. C (0) = 0.6 again to write 2 2 d (x ) ω 2 (x) + = 0.4) Consider the function. To establish the existence of a solution.6 to conclude x satisﬁes the following initial value problem. Based on physical reasoning just presented. which satisﬁes the following initial value problem. (17.2) Now let C (t) ≡ n=0 ∞ (−1) n t2n t2n+1 n . Suppose both z and y are solutions to the problem (17. you can deﬁne functions in terms of power series. dt 2 2 By Corollary 6. it is not enough to base questions of existence in mathematics on physical intuition. (17. . y. (2n!) (2n + 1)! n=0 ∞ (17. The quantity which is a constant in the above proof of uniqueness is sometimes referred to as the energy.1) is the function. y (a) = y0 . It follows from the chain rule and what ω was just observed about S and C that z solves (17. The following theorem gives the necessary existence and uniqueness results. y (t) ≡ z (t − a). Now multiply both sides of the diﬀerential equation by x and use Theorem 6.4 There exists exactly one function. although it is sometimes done.0. However this expression equals zero when t = a and so it must be zero for all t. (Of course you know what these series are already but you don’t need to know this for the purposes of this presentation. This proves the uniqueness part.388 THE CIRCULAR FUNCTIONS AGAIN Dividing by m and letting ω 2 = k/m yields the equation for undamped oscillation. C (0) = 1. y (a) = y1 . y + ω 2 y = 0. Therefore. let z (t) ≡ y (t + a) .2. Theorem 17.8. S (t) ≡ (−1) .1) if and only if z solves z + ω 2 z = 0.1).) Then you see right away using the theorem about diﬀerentiation of power series that S (t) = C (t) and C (t) = −S (t) .1) Proof: First consider the question of uniqueness.2). Therefore. Then y solves (17.3) Using the ratio test. x (t) = 0 and so z (t) = y (t) . x (a) = x (a) = 0. It is the displacement of the bottom end of a spring from the equilibrium position. z (t) = y0 C (ωt) + y0 S (ωt) . our solution to (17. z (0) = y1 . there should be a solution to this equation. x + ω 2 x = 0. z (0) = y0 .3) that C (t) satisﬁes the following initial value problems. you see both series in the above converge for all t ∈ R.4 on Page 133 the expression (x ) ω 2 (x) + 2 2 2 2 must equal a constant.2.

6 x2 x2 x4 1− ≤ C (x) ≤ 1 − + .4. while y (0) = C (0) + S (0) = −C (0) + C (0) = 0. S (x + y) = S (x) C (y) + C (x) S (y) . m and the function would be the displacements from equilibrium of the bottom of the spring when initially it is at the equilibrium point and moving upward with unit speed. another proof is given of Property (17. These properties will be obtained directly from the initial value problems they satisfy.6 to obtain d dt (C (t)) (C (t)) + 2 2 2 2 +C +S +S =0 =0 . k equal in magnitude to the mass.4) is proved similarly only this time consider y (x) = S (x) − C (x) . y (0) = S (0) − C (0) = 1 − 1 = 0.2.10) Proof: To verify (17. Then y +y =C and y (0) = C (0) + S (0) = 0 + 0 = 0.5) use the initial value problem satisﬁed by C to write C + C = 0.4) consider the function. The second of these functions. The second half of (17. S (−t) = −S (t) . By the Theorem 17.7) (17. C (x − y) = C (x) C (y) + S (x) S (y) . S (t) is the solution to the initial value problem S + S = 0.5 The functions C and S satisfy the following properties. C and S. S (0) = 1.389 In terms of the spring. Now multiply by C and use Theorem 6. the spring constant. (17.6) These conditions in (17. k would be equal in magnitude to the mass. It will be shown that y solves the diﬀerential equation and the initial condition. C (−t) = C (t) .5) (17. The next theorem is on properties of these functions. S (0) = 0. For x ≥ 0. Thus by Theorem 17.0. y (x) = C + S. In addition.6) say that S is odd and C is even because the property of being odd or even is deﬁned in terms of these equations. 2 2 24 x− (17.4 again y (x) = 0.0.8) (17. y (x) = 0. To verify (17.9) (17. this would have the spring constant. x3 ≤ S (x) . In terms of the spring. Theorem 17. C 2 (t) + S 2 (t) = 1.4) which is based on the initial value problems satisﬁed by the functions rather than the diﬀerentiation theorem for power series.0. m and the function would be the displacements from equilibrium of the bottom of the spring when initially it is raised one unit above the equilibrium point and released with zero velocity.

c equals 1/2. Therefore. by Corollary 6.7). Then z (0) = 0 and z (x) = −S (x) + x ≥ 0 so by Corollary 6. it follows that y (x) = S (x) − x − (17. yields (17.9). Now the initial conditions for C imply this constant.5).6). Then w (0) = 0 and w (x) = 1 − C (x) ≥ 0. To verify (17. The proof of (17.10) other properties of C and S related to their periodic behavior may be obtained. it follows that w is increasing on [0.8. With (17. ﬁx y and let y (x) = S (x + y) − (S (x) C (y) + S (y) C (x)) .10). Therefore. it follows y (t) = 0.8) is completely similar and is left as an exercise. This proves + x 24 4 − C (x) . Then by Theorem 17.10). Therefore.390 and so THE CIRCULAR FUNCTIONS AGAIN (C (t)) (C (t)) + =c 2 2 for some constant. u (x) ≥ 0 for all x ≥ 0 and this shows the top half of (17. Let z (x) = C (x) − 1 − x2 2 2 2 .4.0.7).5 on Page 133.5. Also using Theorem 6. Using C = −S.0.2.8.4).8. ∞). w (x) ≥ 0 which shows that x − S (x) ≥ 0. it follows y (x) = 0 proving (17. by Theorem 17.8.9) and (17.5. . Now let y (x) = S (x) − x − y (x) = C (x) − 1 − x2 2 ≥0 x3 6 x 6 3 x2 2 ≥ 0 for all x ≥ 0. y (x) = C (x + y) − (C (x) C (y) − S (y) S (x)) and y (x) = −S (x + y) − (−S (x) C (y) − S (y) C (x)) so y + y = 0. The second part is similar.5on Page 133. .4. Also y (0) = S (y) − S (y) = 0 and y (0) = C (y) − C (y) = 0. Then from (17. Now let u (x) = 1 − x 2 2 ≥ 0 for all x ≥ 0. Then by Theorem 6.9). This proves the ﬁrst part of (17. Let y (t) = S (−t) + S (t) .6 y (t) = −S (−t) + S (t) and so y (0) = 0 and y (0) = −1 + 1 = 0. By Corollary 6. Then u (0) = 0 and x3 + S (x) ≥ 0 6 u (x) = −x + by (17. Let w (x) = x − S (x) . c.6 again.2. y (t) = S (−t) + S (t) and so y + y = 0. Then y (0) = 0 and so by Corollary 6. This brings us to the estimates. it follows that z (x) = C (x) − 1 − proving the bottom half of (17.

S (t)) is one to one for t ∈ [0. P ] . S is increasing on (0. P/2). 2 4 C (x) = −S (x) ≤ −x + x2 x3 = (−x) 1 − 6 6 <0 showing that C is strictly decreasing on this interval because of Corollary 6. In going from P to 3P/2. In going from 3P/2 to 2P. P/2) while S (x + P/2) = S (x) C (P/2) + S (P/2) C (x) = C (x) ≥ 0. Also the function. First consider the claim that S and C are 2P periodic. P/2] .6 There exists a positive number. r (t) ≡ (C (t) . 3P/2] . P ) and 2 C P = 0. From (17. 3P/2] . S (x + 2P ) = S (x + P ) C (P ) + C (x + P ) S (P ) = −S (x + P ) = − [S (x) C (P ) + C (x) S (P )] = S (x) . < 0 if x ∈ (0.7). P/2) . S (x + 2P ) = S (x) . In addition to this. It follows that r is one to one on [0. 2p) and t1 ∈ [0.(17.0. both C and S are periodic of period 2P. Since C is strictly decreasing on the interval [0. This means that for 2 all x ∈ R. Now consider what happens on the interval. Therefore. P/2] . . then C (t1 ) = C (t2 ) and S (t1 ) = S (t2 ) . But S (t1 ) ≥ 0 and so t1 = P = t2 .6). Now if r (t1 ) = r (t2 ) for t2 ∈ (3P/2. If r (t1 ) = r (t2 ) for t1 ∈ [0. Points on this interval are of the form. By the intermediate value theorem there exists a number. 2 3 C (1) = 1 > 0. Therefore. P/2] . Therefore. P/2) . P ] . P ] because its derivative. P/2 ∈ (0. P ) .391 Theorem 17. and (17. P/2) because S (x) = C (x) and C (x) > 0 on (0. 2 Proof: From (17.10) that C (2) < 0 because C (2) ≤ 1 − 2 + 24 = − 1 < 0. P ] and t2 ∈ [P. P such that C (x) > 0 on [0. Furthermore. C (x + P/2) = C (x) C (P/2) − S (x) S (P/2) = −S (x) ≤ 0. use the same arguments to verify that C and S are strictly increasing and C (2P ) = 1 while S (2P ) = 0.5). t1 ∈ [0.8). There is only one such number because for x ∈ (0.6. S (P ) = S (P/2) C (P/2) + S (P/2) C (P/2) = 0 C (P ) = C (P/2) C (P/2) − S (P/2) S (P/2) = −1. P ] it follows that r is one to one on [0. C (t1 ) = C (t2 ) > 0 and so. 2P ).8. S (P/2) = 1. Therefore. Recall. This proves the ﬁrst part of the theorem. S (t2 ) ≤ 0 and so S (t1 ) ≤ 0. and that C = cos while S = sin. r is one to one on [0. 3P/2] . It only remains to verify that C and S are periodic of period 2P. −S is negative on (P/2. x + P/2 where x ∈ [0. Now S (t1 ) = S (t2 ) < 0 which can’t happen for t1 ∈ [0. 2P ) and C (t) = cos t while S (t) = sin t for t given in radians and P = π where 2π is the circumference of the unit circle. P ). [P/2. that P = π. from (17. C is negative on (P/2. Also this shows that r (t) is one to one for t ∈ [0. S is strictly decreasing on [P/2. > 0 if x ∈ (0. C (x + 2P ) = C (x) . 2) such that C (P/2) = 0. 2) . Also C is strictly decreasing on this interval because its derivative. use similar arguments to verify that S is strictly decreasing and that C is strictly increasing and that S (3P/2) = −1 while C (−3P/2) = 0.

A (t) = (C (t)) + (S (t)) = 2 2 S 2 (t) + C 2 (t) = 1.1 The Equations Of Undamped And Damped Oscillation With the circular functions. 0) and continuing around the unit circle till (1.1. 0) . this involves starting at (1. so this gives an alternate way of deﬁning the circular functions. r (t) = (C (t) . 2P ] is just 2P but from the above discussion. the equation of undamped oscillation. (C (t) .392 To verify the assertion about C. Now it remains to verify the other assertions. Since r is one to one on (0. S (t)) is the point on the unit circle which is at a distance of t measured along an arc which starts at (1. No reference was made to plane geometry. s] is given by A (s) where A solves the initial value problem. A (0) = 0. the length of the path traced out by r for t ∈ [0. y + ω 2 y = 0 may be solved. The solution to this is just A (t) = t and so the length of this path traced out by r is just s. THE CIRCULAR FUNCTIONS AGAIN C (x + 2P ) = C (x + P ) C (P ) − S (x + P ) S (P ) = −C (x + P ) = − [C (x) C (P ) − S (x) S (P )] = C (x) . (C(t). S (t)) is a point on the unit circle. y (0) = y1 (17. Note that this ﬁnishes oﬀ all the trig identities as well thanks to the previous theorem.1 The initial value problem.5). Deﬁne 2π to be this length. This proves the theorem. 0) and proceeds counter clock wise to this point as shown in the following picture in which t denotes the length of the indicated arc of the circle. In other words. y (0) = y0 . Therefore. P = π. Theorem 17. 17. drawing a right triangle with the radius of the circle as its hypotenuse and t the angle just described. First note that by (17. just the deﬁnition of distance along with the notion of arc length. This equation resulted from the oscillations of a heavy weight on a spring but it occurs in many other physical contexts also.11) . 2P ) . C (t) = cos t and S (t) = sin t. y + ω 2 y = 0. S(t)) t Thus. Now the length of the curve traced out by r for t ∈ [0.

1 on Page 221. Thus in our spring example. every solution of y = 0 is of the form C1 t + C2 for some constants.17. It follows from Theorem 17. In the example of the object bobbing on the end of a spring. it follows any solution of (17. Therefore. e−at y = and so y= C1 −2at e + C2 −2a C1 −at e + C2 eat . If the car is just sitting still the shock absorber applies no force to the car. (D − a) y ≡ y − ay = C1 e−at .13) is of the form y = C1 e−at +C2 eat . You know how these work. let Dy ≡ y .1.11). Theorem 17. Now you should verify that any expression of this form actually solves the equation.1. ω 393 (17. All that remains of the proof is to do the part when a = 0 which is left as an exercise involving the mean value theorem. d −at e y = C1 e−2at . (17. By the product and chain rules.4 that this is the only solution to this problem. It only gives a force in response to up and down motion of the car and you assume this force is proportional to the velocity and opposite the velocity. C1 and C2 .12) does indeed provide a solution to the initial value problem (17. This is a device which resists motion like a shock absorber on a car. Now consider another sort of diﬀerential equation. my = −ky where k was the spring constant and m was the mass of the object.12) Proof: You should verify that (17. This proves most of the following theorem. Let z = (D − a) y. a > 0 (17.19.2 Every solution of the diﬀerential equation. −2a Now since C1 is arbitrary. In the case when a = 0.13). dt Therefore. you would have my = −ky − δ 2 y . Suppose the object is also attached to a dash pot. Multiply both sides of this last equation by e−at .13) To give the complete solution. Then the diﬀerential equation may be written as (D + a) (D − a) y = 0. Thus (D + a) z = 0 and so z (t) = C1 e−at from Theorem 9. y − a2 y = 0 is of the form C1 e−at + C2 eat for some constants C1 and C2 provided a > 0. Now consider the diﬀerential equation of damped oscillation.0. y − a2 y = 0. THE EQUATIONS OF UNDAMPED AND DAMPED OSCILLATION has a unique solution and this solution is y (t) = y0 cos (ωt) + y1 sin (ωt) .

The voltage drop across the capacitor is v = Q where Q is the charge on the C capacitor and C is a constant called the capacitance.1. as claimed. y + 2by + ay = 0. z (t) = C1 cos (ωt) + C2 sin (ωt) where ω = Therefore.1.1.1. If b2 − a < 0. a second order linear diﬀerential equation of the sort discussed above. The current equals the rate of change of the charge on the capacitor. (17. (17. the following theorem is given. R C L di The voltage drop across the inductor is L dt where i is the current and L is the inductance. When these voltages are summed.such damped oscillation satisﬁes an equation of the form. This is according to Ohm’s law. more general than what results from damped oscillation. z is a solution to the equation z + a − b2 z = 0.14) in terms of z. The voltage drop across the resistor is Ri where R is the resistance.17) Proof: Let z = ebt y and write (17. The other two cases are completely similar. you di must get zero because there is no voltage source in the circuit.14) are of the form e−bt (C1 + C2 t) .1. Theorem 17. then by Theorem 17. Then all solutions of (17. . In this theorem the ﬁrst case is referred to as the underdamped case. (17.1. a resistor.3 Suppose b2 − a < 0. where r = √ b2 − a. Example 17. Thus i = Q = Cv . y = e−bt (C1 cos (ωt) + C2 sin (ωt)) (17.15) √ where ω = a − b2 and C1 and C2 are constants. In the case that b2 − a = 0 the solutions of (17. (17. Concerning the solutions to this equation.18) √ a − b2 .4 An important example of these equations occurs in an electrical circuit having a capacitor.16) In the case that b2 − a > 0 the solutions are of the form e−bt C1 e−rt + C2 ert . Thus.1. and an inductor in series as shown in the following picture. this becomes LCv + CRv + v = 0. The second case is called the critically damped case and the third is called the overdamped case.394 THE CIRCULAR FUNCTIONS AGAIN where δ 2 is the constant of proportionality of the resisting force.14) Actually this is a general homogeneous second order equation. Dividing by m and adjusting the coeﬃcients.14) are of the form e−bt (C1 cos (ωt) + C2 sin (ωt)) .2 rather than Theorem 17. They use Theorem 17. Thus L dt + Ri + Q = 0 and C written in terms of the voltage drop across the capacitor.

Keep everything the same in Problem 8 except suppose the suspended end of the spring is also attached to a dash pot which provides a force opposite the direction of the √ velocity having magnitude 10 19 |v| Newtons for |v| the speed. x→0 x lim 15. S (x) = 1.5 and Theorem 17. y (0) = 0. 10. Is the derivative of a function always continuous? Hint: Consider h (x) = 1 x2 sin x if x = 0 . Solve the initial value problem y + 2y + 2y = 0. Using Theorem 17.0.0. 12. Find the displacement from the equilibrium position of the end of the spring as a function of time. x→0 x lim 8.0.14) consider the equation in which b = −1 and a = 3. 2. Give upper and lower bounds for these numbers. 13. What is h (x) for x = 0? . This mass causes the end of the spring to be displaced a distance of 39.8).5 and Problem 5 and Theorem 5. Show from Corollary 6. 2 cm. In (17. y + 5y − y = 0.18). estimate S (. 11. Using Theorem 17. y (0) = 1.1) .6). 4.5 that for all x ≥ 0. 7. Also verify the second half of (17. Verify that all solutions to the diﬀerential equation.19) 6. Using Theorem 17.1) and C (. EXERCISES 395 17.9. establish the limit. Verify that y = C1 e−at + C2 eat solves the diﬀerential equation.5. In the case of undamped oscillation show the solution can be written in the form A cos (ωt − φ) where φ is some angle called a phase shift and a constant.8. 6 120 (17. The mass end of the spring is then pulled down a distance of one cm. 14. y C1 t + C2 .5 and Problem 5. Solve the initial value problem.9.8m/ sec2 .17. 1 − C (x) = 0. A mass of ten Kilograms is suspended from a spring attached to the ceiling. A. 9.5. = 0 are of the form y = 5. Explain why this equation describes a physical system which has some dubious properties.2 Exercises 1.13).5 and Problem 5 and Theorem 5. 0 if x = 0 Show h (0) = 0. establish the limit. y (0) = 0. 3.2. and released. (17. S (x) ≤ x − x3 x5 + .0. Assume the acceleration of gravity is 9. called the amplitude. y (0) = 1. Give the displacement as before. Verify (17. Prove from the initial value problems satisﬁed by S and C the formula (17.

.396 THE CIRCULAR FUNCTIONS AGAIN 16. Now recall the formulas for the sine and cosine of sums of angles. Show there exists φ such that f (t) = a2 + b2 sin (ωt + φ) . √a2 +b2 is a point on the unit circle so it is of the form (cos φ.π. π/4.5π/4.2π. For θ = 0. Suppose f (t) = a cos ωt + b sin ωt. 7π/4.5π/3. show these angles on the unit circle and ﬁnd the sine and cosine of each one.3π/2. 2π/3.11π/6.3π/4. sin φ) .5π/6. π/6. 17. Do this without any reference to triangles or plane geometry. π/2.4π/3.7π/6. These are important examples. π/3. √ a a Hint: f (t) = a2 + b2 √a2 +b2 cos ωt + √a2b+b2 sin ωt and √a2b+b2 .

y = r sin (θ) . π . x = 5 cos π = 5 3 and y = 5 sin π = 5 . The idea is suggested in the following picture. 397 . 4) . 2π).1) Example 18. Find the polar coordinates. θ) θ r x You see in this picture. x = r cos (θ) . θ) . (x.2 Suppose the Cartesian coordinates of a point are (3.1 Polar Cylindrical And Spherical Coordinates So far points have been identiﬁed in terms of Cartesian coordinates but there are other ways of specifying points in two and three dimensional space. √ From (18. Probably the simplest curvilinear coordinate system is that of polar coordinates.1 The polar coordinates of a point in the plane are 5. Thus the Cartesian coordinates 6 2 6 2 √ 5 5 are 2 3. (18. the number r identiﬁes the distance of the point from the origin. indicated by a small dot in the picture. This angle will always be given in radians and is in the interval [0. How are the two coordinates systems related? From the picture. y) or the polar coordinates.1.Curvilinear Coordinate Systems 18. 2 . 0) while θ is the angle shown between the positive x axis and the line from the origin to the point. (r. y) • (r. y (x. In general these lists of numbers which have a diﬀerent meaning than Cartesian coordinates are called Curvilinear coordinates.1. Find the Carte6 sian or rectangular coordinates of this point. (0. Example 18.1). These other ways involve using a list of two or three numbers which have a totally diﬀerent meaning than Cartesian coordinates to specify a point in two or three dimensional space. can be described in terms of the Cartesian coordinates. Thus the given point.

dividing yields tan (θ) = 4/3. The spherical coordinates are determined by (ρ. 927 295 radians. and 4 = 5 sin (θ) . Spherical coordinates should be especially interesting to you because you live on the surface of a sphere. y1 .) Therefore. 0. For Cylindrical coordinates. 0) as shown. θ) . Thus r and θ determine a point in the plane determined by letting z = 0 and r and θ are the usual polar coordinates. while r is the length of this line. and θ. y1 . At this point. z1 ) are the cylindrical coordinates of the dotted point. z = ρ cos (φ) . the location of the point is narrowed down to a circle and ﬁnally. π]. θ ∈ [0. like the one shown as a dot. Therefore. When ρ is speciﬁed. Then when φ is given. use a calculator or a table of trigonometric functions to ﬁnd that at least approximately. φ. (r. z=z and for spherical coordinates. The picture shows how to relate these new coordinate systems to Cartesian coordinates. Thus r ≥ 0 and θ ∈ [0. this indicates that the point of interest is on some sphere of radius ρ which is centered at the origin. φ. Now consider two three dimensional generalizations of polar coordinates. y = ρ sin (φ) sin (θ) . θ determines which point is on this circle. the angle is something between 0 and π/2 and also 3 = 5 cos (θ) . z T (ρ. the point whose Cartesian coordinates are (0.398 CURVILINEAR COORDINATE SYSTEMS √ Recall that r is the distance form (0. 2π). y = r sin (θ) . θ. 2π). The angle between the positive z axis and the line between the origin and the point indicated by a dot is denoted by φ. θ. z1 ). z1 ) • (x1 . ρ is the distance between the origin. θ) (r. z1 ) . and ρ ∈ [0. φ. and (ρ. It remains to identify the angle. z1 ) φ ρ y1 E y r z1 x1 © x θ (x1 . is the angle between the positive x axis and the line joining the origin to the point (x1 . 0) In this picture. y1 . (r. Let φ ∈ [0. θ. θ) . 0) and the point indicated by a dot and labeled as (x1 . The following picture serves as motivation for the deﬁnition of these two other coordinate systems. Letting z1 denote the usual z coordinate of a point in three dimensions. (Both the x and y values are positive. ∞). This has been known for several hundred years. You may also know . x = ρ sin (φ) cos (θ) . θ = . Note the point is in the ﬁrst quadrant. y1 . 0) and so r = 5 = 32 + 42 . x = r cos (θ) .

2. THE ACCELERATION IN POLAR COORDINATES 399 that the standard way to determine position on the earth is to give the longitude and latitude. Note that er is a unit vector pointing away from 0 and der deθ eθ = . sin θ) and the vector. This force is always directed toward the sun and so the force vector changes as the planet moves.3) dt dt Using (18.2). der deθ = θ (t) eθ . 0 to the point. . You should convince yourself that the picture above corresponds to this deﬁnition of the two vectors. cos θ) . Thus r (t) is just the distance from the origin. θ) θ The vector. der der deθ deθ = θ (t) . consider the following simple diagram in which two unit vectors. r (t) er + r (t) = er + 2r (t) θ (t) + r (t) θ (t) eθ . (18. eθ = (− sin θ.3) as needed along with the product rule and the chain rule. s d eθ e d r y (r. = θ (t) dt dθ dt dθ and so from (18. When this is the case. = −θ (t) er (18. er = − . The latitude corresponds to φ and the longitude corresponds to θ.1 18. A good example of this is the force exerted by the sun on a planet. er and eθ are shown.4) 1 Actually latitude is determined on maps and in navigation by measuring the angle from the equator rather than the pole but it is essentially the same idea that we have presented here.Then r (t) = r (t) er (θ (t)) where r (t) = |r (t)| . it is often useful to express things in terms of diﬀerent coordinates which are consistent with these directions.2 The Acceleration In Polar Coordinates Sometimes you have information about forces which act not in the direction of the coordinate axes but in some other direction. r (t) = = Next consider the acceleration. r (t) . r (t) = r (t) er + r (t) der d + r (t) θ (t) eθ + r (t) θ (t) eθ + r (t) θ (t) (eθ ) dt dt = r (t) er + 2r (t) θ (t) eθ + r (t) θ (t) eθ + r (t) θ (t) (−er ) θ (t) r (t) − r (t) θ (t) 2 d (er (θ (t))) dt r (t) er + r (t) θ (t) eθ . What is the velocity and acceleration? Using the chain rule. er = (cos θ.18. (18.2) dθ dθ Now consider the position vector from 0 of a point in the plane. To discuss this.

Go to a playground and have someone spin one of those merry go rounds while you ride it and move from the center toward the edge. The next example has as a special case one of Kepler’s laws.4) mr (t) = −mr (t) ω 2 er + m2vωeθ . a constant and since the speed is constant. Consider the following examples. Find the force acting on the object.400 CURVILINEAR COORDINATE SYSTEMS This is a very profound formula. mr × r = m (r × r) = g (r) r × r = 0. This proves the lemma. a positive constant. (r × r) = n. Then r (t) = v. a constant vector and sor · n = r· (r × r) = 0 showing that n is a normal vector to a plane which contains r (t) for all t. Thus in a central force ﬁeld. r (t) = 0. The term 2r θ is called the Coriolis force. θ (t) = 0. the term in (18. Proof: Let r (t) denote the position vector of the object. Therefore. Kepler’s second law . Therefore. Then from the deﬁnition of a central force and Newton’s second law.2. From (18. r is associated a force. the force an object experiences will always be directed toward or away from the origin.4). R This is the familiar formula for centripetal force from elementary physics. Example 18. Thus θ (t) = s/R and so mr = −mR s R 2 er = −m s2 er .2 A platform rotates at a constant speed in the counter clockwise direction and an object of mass m moves from the center of the platform toward the edge at constant speed.2. Thus the object experiences centripetal force from the ﬁrst term and also a funny force from the second term which is in the direction of rotation of the platform.1 Suppose an object of mass m moves at a uniform speed. objects must move in a plane. The following simple lemma is very interesting because it says that in a central force ﬁeld. s.4) corresponding 2 to eθ equals zero and mr = −Rθ (t) er . Suppose at each point of space. the force acting on the object is mr . mr = g (r) r. In this case. This is called a force ﬁeld. You can observe this by experiment if you like. . while θ (t) = ω. What forces act on this object? Let v denote the constant speed of the object moving toward the edge of the platform. Therefore. Lemma 18. a force ﬁeld is a central force ﬁeld if F (r) = g (r) er . Then the motion of the object is in a plane. r (t) = R. around a circle of radius R. The speed of the object is s and so it moves s/R radians in unit time. the equal area law. Example 18.3 Suppose an object moves in three dimensions in such a way that the only force acting on the object is a central force. θ = 0. obtained as a very special case of (18. By Newton’s second law.2. F (r) which a given object of mass m will experience if its position vector is r. 0.

G. the equal area law described above holds. m by a mass.2. It is now accepted to be 6.5) dθ 2 2222 2 222 In this picture. PLANETARY MOTION 401 Example 18.4 An object moves in three dimensions in such a way that the only force acting on the object is a central force. Therefore. 2 1 dθ c dA = r2 = . it follows from (18. The constant. dθ is the indicated angle and the two lines determining this angle are position vectors for the object at point t and point t + dt. These laws.67 × 10−11 Newton meter2 /kilogram2 .4) that 2r (t) θ (t) + r (t) θ (t) = 0 Multiply both sides of this equation by r.3.7) 18. (18. C. (18. This yields 2rr θ + r2 θ = r2 θ Consequently.6) = 0. were shown by Newton to be consequences of his law of gravitation which states that the force acting on a mass. From the assumption that the force ﬁeld is a central force ﬁeld. Gravity acts according to this law and so does electrostatic force. and there is a formula for the time it takes for the planet to move around the sun.18. M is given by F = −GM m 1 r3 r = − GM m 1 r2 er where r is the distance between centers of mass and r is the position vector from M to m.3 Planetary Motion Kepler’s laws of planetary motion state that planets move around the sun along an ellipse. is essentially r2 dθ and so dA = 1 r2 dθ. dt 2 dt 2 (18. This is called an inverse square law. The experiment involved a light source shining on a mirror attached to a quartz ﬁber from which was suspended a long rod with two equal masses at the ends which were attracted . r2 θ = c for some constant. The above lemma says the object moves in a plane. Then the object moves in a plane and the radius vector from the origin to the object sweeps out area at a constant rate. Here G is the gravitation constant. dA. Now consider the following picture. discovered by Kepler. The area of the circular sector. is very small when usual units are used and it has been computed using a very delicate experiment.

The constant was ﬁrst measured successfully by Lord Cavendish in 1798 and the present accepted value was obtained in 1942.9).5).) Consider the ﬁrst of Kepler’s laws. dθ r dt d2 θ dt2 −2c dr −2c dr dθ = 3 3 dt r r dθ dt −2c2 dr −2c dr c = 5 r3 dθ r2 r dθ d2 r dθ2 d2 r dθ2 d2 r dθ2 dr d2 θ dθ dt2 2 dr −2c2 dr + dθ r5 dθ + 2 2 (18.9) Therefore. c. = = Diﬀerentiating (18. by (18. θ = dr dθ dr = dθ dt dt and by (18.402 CURVILINEAR COORDINATE SYSTEMS by two larger masses. (18. d2 r dt2 = = = dθ dt c r2 c r2 − dr dθ 2 2c2 r5 . The gravitation force between the suspended masses and the two large masses caused the ﬁbre to twist ever so slightly and this twisting was measured by observing the deﬂection of the light reﬂected from the mirror on a scale placed some distance from the ﬁbre. 2r (t) θ (t) + r (t) θ (t) = 0.10) and (18. (It could also be a comet or an asteroid. r (t) − r (t) θ (t) Thus k = GM and r (t) − r (t) θ (t) = −k As in (18.9). In the following argument. dr c dr = 2 .12) (18.10) r2 r r dt r2 −2c dr = (18. the motion is in a plane. Experiments like these are major accomplishments. r2 θ 2 2 er + 2r (t) θ (t) + r (t) θ (t) eθ = − GM m m 1 r2 er = −k 1 r2 er 1 r2 . Now from (18.13) Also. regarding r as a function of θ and θ as a function of t.8) = 0 and so there exists a constant.11) r3 dt Now consider the ﬁrst of the above equations. By the chain rule. (18. such that r2 θ = c.4) and Newton’s second law. also. The question of interest is to know how r depends on θ.2. M is the mass of the sun and m is the mass of the planet. . From Lemma 18.3.12) again with respect to t. the one which states that planets move along ellipses. and so 2rr θ + r2 θ = 0 −2r θ −2 dr c −2rr θ = = (18.

1 d 2 dθ dρ dθ 2 dρ dθ . r = ρ−1 . Thus.8) yields r (t) − r (t) θ (t) = −k d2 r dθ2 c r2 2 2 1 r2 403 − dr dθ 2 2c2 r5 −r c r2 2 = −k 1 r2 which is a fairly messy looking diﬀerential equation. from the chain rule.14) Next consider the above equation in terms of ρ = r−1 . dr dρ = (−1) ρ−2 . Multiply both sides by d2 ρ dρ dρ k dρ +ρ = 2 dθ c dθ dθ2 dθ Then from the product rule. d2 ρ k + ρ = 2. 2 dθ ρ c dθ Now note that the ﬁrst and third terms add to zero and so −ρ−2 d2 ρ k −1 =− 2 2 2 −ρ ρ c dθ Multiplying both sides by −ρ−2 yields the equation. PLANETARY MOTION It follows that the ﬁrst equation of (18. dθ dθ dρ dθ 2 d2 r = 2ρ−3 dθ2 substituting this in to (18. However. there exists a constant. d r = dθ2 2 − ρ−2 d2 ρ .14).3.18. + ρ2 = k dρ . c2 dθ Therefore. c dθ2 a much more manageable equation. it can be simpliﬁed by 4 multiplying both sides by r2 to get c d2 r − dθ2 dr dθ 2 2 r −r =− k 2 r c2 (18. c1 such that 1 2 dρ dθ 2 + ρ2 − k ρ = c1 c2 . dθ2 dr = dθ 2 2ρ−3 dρ dθ 2 − ρ−2 d2 ρ dρ k − (−1) ρ−2 (2ρ) − ρ−1 = − 2 2 .

Therefore. Therefore. this is called a hyperbola. If ε < 1. δ dρ1 dθ 2 + ρ1 δ 2 =1 is a point on the unit circle.17) 2 In case ε = 1. θ if necessary. (Let θ = θ + φ) it can be assumed that φ = 0 so ρ− Thus ρ= and so r = = where 1 c2 /k = k 1 + (c2 /k) δ sin θ c2 + δ sin θ pε 1 + ε sin θ ε = c2 /k δ and p = c2 /kε. ρ1 = δ sin (α (θ)) . dρ1 = δ cos (α (θ)) .404 and so dρ dθ 2 CURVILINEAR COORDINATE SYSTEMS = 2c1 + = ≡ k ρ − ρ2 c2 − ρ− 2 k2 + 2c1 4c4 δ2 − ρ − k 2c2 2 k 2c2 Now letting ρ1 = ρ − k 2c2 . Then squaring both sides. φ. Redeﬁning. there exists an angle.15) (18. this reduces to the equation of a parabola. This proves that objects which . dρ1 = α (θ) δ cos (α (θ)) dθ and so α (θ) = 1. dθ Diﬀerentiating the second equation with respect to θ.16) Here all these constants are nonnegative. 2c2 k + δ sin θ c2 (18. Thus r + εr sin θ = εp and so r = (εp − εy) . (18. α (θ) = θ + φ for some constant. k = ρ1 = δ sin θ. this reduces to the equation of an ellipse and if ε > 1. 1 δ2 which shows that α (θ) such that 1 dρ1 ρ1 δ dθ . x2 + y 2 = (εp − εy) = ε2 p2 − 2pε2 y + ε2 y 2 And so x2 + 1 − ε2 y 2 = ε2 p2 − 2pε2 y.

ε is called the eccentricity. PLANETARY MOTION 405 are acted on only by a force of the form given in the above example move along hyperbolas.16). The case where ε = 0 corresponds to a circle. holds for any central force.3.7) the following equation must hold.2 on Page 248 which gives the area of an ellipse. From (18. 2) (1 − ε (1 − ε2 ) / ε2 p2 (1 − ε2 ) 2 = 1.18) Now note this is the equation of an ellipse and that the diameter of this ellipse is 2εp ≡ 2a. εp c εp =T 2 1 − ε2 (1 − ε2 ) 2 πε2 p2 c (1 − ε2 )3/2 4π 2 ε4 p4 c2 (1 − ε2 ) 3 T = and so T2 = Now using (18.19) This relationship is known as Kepler’s third law. (1 − ε2 ) This follows because ε2 p2 (1 − 2 ε2 ) ≥ ε2 p2 . Kepler’s second law. the equal area formula. (18. Using this formula.9. area of ellipse π√ Therefore. this has shown T2 = 4π 2 GM diameter of ellipse 2 3 . T2 = = 4π 2 ε4 p4 3 kεp (1 − ε2 ) k (1 − ε2 ) 2 3 2 3 4π a 4π a = . Kepler’s third law involves the time it takes for the planet to orbit the sun. This is called Kepler’s ﬁrst law in the case of a planet. Recall Example 10. ellipses or circles. and (18. not just one which satisﬁes an inverse square law.17) you can complete the square and obtain x2 + 1 − ε2 and this yields x2 / ε2 p2 1 − ε2 + y+ pε2 1 − ε2 2 y+ pε2 1 − ε2 2 = ε2 p 2 + p2 ε4 ε2 p2 = . The constant. 1 − ε2 Now let T denote the time it takes for the planet to make one revolution about the sun. . (18. k GM = 4π 2 (εp) 3 3 Written more memorably.18.

8. Describe the following surface in rectangular coordinates. 6.4 Exercises 1. and ρ (t) = t. 9. φ = π/4 where φ is the polar angle in spherical coordinates. Describe the following surface in rectangular coordinates. but also it takes you in entirely the wrong direction because the whole point of introducing the new coordinates is to write everything in terms of these new coordinates and not in terms of Cartesian coordinates. Find the speed and the velocity of the object in terms of Cartesian coordinates. Describe the following surface in rectangular coordinates. θ = π/4 where θ is the angle measured from the postive x axis cylindrical coordinates. z = x2 + y 2 in cylindrical coordinates and in spherical coordinates. A point has Cartesian coordinates. θ (t) = 1 + t. Can you ﬁgure out the velocity of the point? Speciﬁcally.406 CURVILINEAR COORDINATE SYSTEMS 18. suppose φ (t) = t. ρ = 4 where ρ is the distance to the origin. 7. θ = π/4 where θ is the angle measured from the postive x axis spherical coordinates. However. Suppose you know how the spherical coordinates of a moving point change as a function of t. Describe the following surface in rectangular coordinates. Hint: You would need to ﬁnd x (t) . (b) x2 − y 2 = 1 (c) z 2 + x2 + y 2 = 6 (d) z = (e) y = x (f) z = x 11. (a) z = x2 + y 2 . 3) . r = 5 where r is one of the cylindrical coordinates. Give the cone. 5. (b) x2 − y 2 = 1 (c) z 2 + x2 + y 2 = 6 (d) z = (e) y = x (f) z = x 10. the velocity would be x (t) i + y (t) j + z (t) k. Then in terms of Cartesian coordinates. sometimes this inversion can be done. Find its spherical and cylindrical coordinates. Write the following in spherical coordinates. Write the following in cylindrical coordinates. In general it is a stupid idea to try to use algebra to invert and solve for a set of curvilinear coordinates such as polar or cylindrical coordinates in term of Cartesian coordinates. y (t) . 3. 4. (a) z = x2 + y 2 . x2 + y 2 x2 + y 2 . and z (t) . Not only is it often very diﬃcult or even impossible to do it. Describe the following surface in rectangular coordinates. Describe how to solve for r and θ in terms of x and y in polar coordinates. (1. 2. 2.

Is it necessary that the orbit be circular? What if you want the satellite to stay above the same point on the earth at all times? If the orbit is to be circular and the satellite is to stay above the same point. EXERCISES 407 12.19) and the above value of the universal gravitation constant. the low pressure area already has a counter clockwise rotation because of the rotation of the earth and its spherical shape. Such a satellite is called geosynchronous. It is desired to place a satellite above the equator of the earth which will rotate about the center of mass of the earth every 24 hours. .) 15. Explain why low pressure areas rotate counter clockwise in the Northern hemisphere and clockwise in the Southern hemisphere. Using (18. How are things diﬀerent in the Southern hemisphere? 13.6). determine the mass of the sun. Hint: Note that from the point of view of an observer ﬁxed in space. The orbit of the earth is pretty nearly circular and the distance from the sun to the earth is about 149 × 106 kilometers. (Actually it is 365.256 days. at what distance from the center of mass of the earth should the satellite be? You may use that the mass of the earth is 5.98 × 1024 kilograms. Now consider (18.18. What are some physical assumptions which are made in the above derivation of Keplers laws from Newton’s laws of motion? 14.4. The earth goes around it in 365 days. In the low pressure area stuﬀ will move toward the center so r gets smaller.

408 CURVILINEAR COORDINATE SYSTEMS .

Part IV Functions Of More Than One Variable 409 .

.

(aij ) refers to a matrix in which the i denotes the row and the j denotes the column. 3 because it is in the second row and the third column. For example. here is a matrix. multiplied by a scalar and sometimes multiplied. To illustrate scalar multiplication. The ﬁrst column is 5 . the second row is (5 2 8 7) and so forth. For example. Also.1 Matrices Several of them are referred to as matrices. 5 2 6 −4 11 −2 Two matrices are equal exactly when they are the same size and the corresponding entries are identical. 411 . Thus 1 2 −1 4 0 6 3 4 + 2 8 = 5 12 . Stated in a general form the rules for scalar multiplication and addition are these. a12 = 2. a32 = −9. a23 = 8. 3 4 8 7 1 2 A matrix is a rectangular array of numbers. 8 is in position 2. you can remember the columns are like columns in a Greek temple. Two matrices which are the same size can be added.Linear Algebra 19. 3 5 2 8 7 = 15 6 −9 1 2 18 −27 3 6 The new matrix is obtained by multiplying every entry of the original matrix by the given scalar. Using this notation on the above matrix. Elements of the matrix are identiﬁed according to position in the matrix. They can sometimes be added. If A is an m × n matrix. They stand up right while the rows just lay there like rows made by a tractor in a plowed ﬁeld. consider the following example. The 6 convention in dealing with matrices is to always list the rows ﬁrst and then the columns. etc. 1 2 5 2 6 −9 This matrix is a 3 × 4 matrix because there are three rows and four columns. 1 2 3 4 3 6 9 12 6 24 21 . the result is the matrix which is obtained by adding corresponding entries. The symbol. When this is done. There are various operations which are done on matrices. −A is deﬁned to equal (−1) A. The ﬁrst 1 row is (1 2 3 4) .

(A + B) + C = A + (B + C) .2) (19. and C.1. the existence of an additive inverse. β scalars. α (A + B) = αA + αB. the associative law for addition.1) The above properties. From now on. while a 1 × n matrix of the form (x1 · · · xn ) is referred to as a row vector.3) (19. Also. the following properties are all obvious but you should verify all of these properties valid for A.(19. x = .6) (19. xn These n × 1 matrices are also called vectors. 1A = A. . The zero matrix. 3 × 4 zero matrices. The number Aij will typically refer to the ij th entry of the matrix. m × n matrices and 0 an m × n zero matrix. .1.7) (19. (19. With this deﬁnition. A + B = B + A. α (βA) = αβ (A) . A + 0 = A.412 LINEAR ALGEBRA Deﬁnition 19. .1) . Deﬁnition 19. (19. this list of numbers will be arranged vertically. sometimes column vectors. Thus x1 . Then A + B = C where C = (cij ) for cij = aij + bij . for α. Recall this means x is a list of n real numbers.8) (19. the following also hold.2 Let x ∈ Rn . the existence of an additive identity. B. A + (−A) = 0.8) are known as the vector space axioms and the fact that the m × n matrices satisfy these axioms is what is meant by saying this set of matrices forms a vector space. denoted by 0 will be the matrix consisting of all zeros. In fact for every size there is a zero matrix. etc. the commutative law of addition.1 Let A = (aij ) and B = (bij ) be two m × n matrices. A. xA = (cij ) where cij = xaij . Also if x is a scalar. (α + β) A = αA + βA.5) (19.4) (19. Note there are 2 × 3 zero matrices.

1. vn Then Av is an m × 1 matrix and the ith component of this matrix is n (Av)i = j=1 Aij vj . in this case.19. a11 a21 a12 a22 a13 a23 x1 x2 = x3 a11 x1 + a12 x2 + a13 x3 a21 x1 + a22 x2 + a23 x3 . an n × 1 matrix. This is where things quit being banal. (vector) Deﬁnition 19. Note also that multiplication by an m × n matrix takes a vector in Rn . Motivated by this example. (Av)i = Aij vj . . here is the deﬁnition of how to multiply an m × n matrix by an n × 1 matrix. v = . First consider the problem of multiplying an m × n matrix by an n × 1 column vector. Slide the vector. n j=1 Amj vj n j=1 Using the repeated index summation convention on Page 331. an m × 1 matrix. . a 2 × 1 matrix. Thus 7 1 2 3 7×1+8×2+9×3 50 8 = = . Consider the following example 7 1 2 3 8 =? 4 5 6 9 The way I like to remember this is as follows. 8 9 7 4 5 6 multiply the numbers on the top by the numbers on the bottom and add them up to get a single number for each row of the matrix. Thus Av = A1j vj . MATRICES 413 All the above is ﬁne. v1 . and produces a vector in Rm . . .1. placing it on top the two rows as shown 7 8 9 1 2 3 . 4 5 6 7×4+8×5+9×6 122 9 In more general terms. These numbers are listed in the same order giving.3 Let A = Aij be an m × n matrix and let v be an n × 1 matrix. but the real reason for considering matrices is that they can be multiplied. .

Then AB is an m × p matrix and n (AB)ij = k=1 Aik Bkj . ith entry of the j th column of AB. bj = . Hence AB as just What is the ij th entry of AB? It would be the it would be the ith entry of Abj . these must match (m × n) (n × p )=m×p If the two middle numbers don’t match. · · ·. k=1 (19. Thus if A is an r × s matrix and B is a s × p then A and B are conformable in the order. It is of the form (3 × 2) (2 × 3) and since the inside numbers match.5 Multiply if possible 3 1 . 3 1 3 . . the next task is to multiply an m × n matrix times an n × p matrix.414 LINEAR ALGEBRA With this done. bp ) where bk is an n × 1 matrix.1.9) deﬁned is an m × p matrix.1. you can’t multiply the matrices. · · ·.10) This motivates the deﬁnition for matrix multiplication which identiﬁes the ij th entries of the product. it must be possible to do this. (19. 1 2 2 3 1 Example 19.11) In terms of the repeated summation convention. (AB)ij = Aik Bkj . on Page 331. Bnj and from the above deﬁnition. (19. Then B is of the form B = (b1 . AB is deﬁned as follows: AB ≡ (Ab1 .12) Two matrices. Then an m × p matrix. 7 6 2 2 6 First check to see if this is possible. Before doing so.4 Let A = (Aij ) be an m × n matrix and let B = (Bij ) be an n × p matrix. AB. the ith entry is n (19. Let A be an m × n matrix and let B be an n × p matrix. 3 1 1 7 6 2 2 6 2 6 2 6 . the following may be helpful. A and B are said to be conformable in a particular order if they can be multiplied in that order. Now B1j . Thus Aik Bkj . Deﬁnition 19. The answer is of the form 1 2 1 2 1 2 3 1 2 . Abp ) where Abk is an m × 1 matrix.

In fact. When the multiplication is done it equals 13 13 29 32 . 14 This is not possible because it is of the form (3 × 2) (3 × 3) and the middle numbers don’t match. 2 3 1 2 Example 19.6 Multiply if possible 3 1 7 6 0 0 2 6 1 2 .9 If all multiplications and additions make sense. A (aB + bC) = a (AB) + b (AC) (B + C) A = BA + CA A (BC) = (AB) C (19.7 Multiply if possible 7 6 2 3 1 . 0 415 resulting product.13) (19. Example 19. 0 0 Note the above two examples show matrix multiplication is not commutative.1. there are some properties which do hold.14) (19.1. the following hold for matrices and a. 0 1 1 0 = 2 4 1 3 . It seems natural to ask if the product of any two n × n matrices commutes.1.19. Thus the above product 5 5 .15) . it can happen that AB makes sense while BA does not make sense which was the case in these examples.1.1. Therefore. Proposition 19. 2 6 0 0 0 This is possible because in this case it is of the form (3 × 3) (3 × 2) and the middle numbers do match. b scalars. However. 2 3 1 1 2 Example 19. you cannot conclude that AB = BA for matrix multiplication. and you see these are not equal.8 Compare The ﬁrst product is 1 2 3 4 the second product is 0 1 1 0 1 3 2 4 = 3 1 4 2 . MATRICES where the commas separate the columns in the equals 16 15 13 15 46 42 a 3 × 3 matrix as desired. 1 2 3 4 0 1 1 0 and 0 1 1 0 1 3 2 4 .

(AB) T ij T (19. Then (AB) = B T AT and if α and β are scalars.16) (19.17) (αA + βB) = αAT + βB T T = = = = (AB)ji Ajk Bki B T ik AT B T AT ij kj (19. denoted by placing a T as an exponent on the matrix. Proof: From the deﬁnition.17) is left as an exercise and this proves the lemma. .15). Deﬁnition 19.416 LINEAR ALGEBRA Proof: Using the repeated index summation convention and the above deﬁnition of matrix multiplication.1. (A (aB + bC))ij = Aik (aB + bC)kj = Aik (aBkj + bCkj ) = aAik Bkj + bAik Ckj = a (AB)ij + b (AC)ij = (a (AB) + b (AC))ij showing that A (B + C) = AB + AC as claimed. AT ij = Aji The transpose of a matrix has the following important property.14) is entirely similar. This motivates the following deﬁnition of the transpose of a matrix.1. Then AT denotes the n × m matrix which is deﬁned as follows. Another important operation on matrices is that of taking the transpose. This proves (19. Lemma 19. Consider (19.15).11 Let A be an m × n matrix and let B be a n × p matrix. T 1 2 1 3 2 3 1 = 2 1 6 2 6 What happened? The ﬁrst column became the ﬁrst row and the second column became the second row. The following example shows what is meant by this operation. the associative law of multiplication. (A (BC))ij = Aik (BC)kj = Aik Bkl Clj = (AB)il Clj = ((AB) C)ij .10 Let A be an m × n matrix. The number 3 was in the second row and the ﬁrst column and it ended up in the ﬁrst row and second column. Thus the 3 × 2 matrix became a 2 × 3 matrix. Formula (19.

MATRICES There is a special matrix called I and deﬁned by Iij = δ ij 417 where δ ij is the Kroneker symbol deﬁned on Page 330. 1 1 1 1 −1 1 = 0 0 and if A−1 existed.13 An n×n matrix.14 Let A = 1 1 1 1 .1.15 Let A = To check this. multiply 1 1 1 2 and 2 −1 −1 1 2 −1 1 1 −1 1 1 2 = 1 0 1 0 0 1 0 1 1 1 1 2 . Proof: (AIn )ij = = Aik δ kj Aij and so AIn = A. = .1 Finding The Inverse Of A Matrix A little later a formula is given for the inverse of a matrix.19. it also follows that Im A = A. Deﬁnition 19.1. It is called the identity matrix because it is a multiplicative identity in the following sense. Thus the answer is that A does not have an inverse. The other case is left as an exercise for you. this could not happen because you could write 0 0 = A−1 −1 1 0 0 =I = A−1 A −1 1 = −1 1 −1 1 = . it is not a good way to ﬁnd the inverse for a matrix.1. However. Lemma 19. = A−1 A a contradiction. It is also important to note that not all matrices have inverses. Then AIn = A.1. A has an inverse. Example 19. Example 19. However. Does A have an inverse? One might think A would have an inverse because it does not equal zero. Show 2 −1 −1 1 is the inverse of A. If Im is the m × m identity matrix.12 Suppose A is an m × n matrix and In is the n × n identity matrix.1.1. A−1 if and only if AA−1 = A−1 A = I where I = (δ ij ) for 1 if i = j δ ij ≡ 0 if i = j 19.

Didn’t the above seem rather repetitive? Note that exactly the same row operations were used in both systems. the inverse is 2 −1 −1 1 . how would you ﬁnd A−1 ? You wish to ﬁnd a matrix. z + 2w = 1. Lets solve the ﬁrst system.418 showing that this matrix is indeed the inverse of A. Take (−1) times the ﬁrst row and add to the second to get 1 1 1 0 1 −1 Now take (−1) times the second row and add to the ﬁrst to get 1 0 0 1 2 −1 . such that 1 1 1 2 x z y w = 1 0 0 1 .19) 1 0 (19. the end result was something of the form (I|v) x where I is the identity and v gave a column of the inverse. Writing the augmented matrix for these two systems gives 1 1 1 2 for the ﬁrst system and 1 1 1 2 0 1 (19. the ﬁrst y z column of the inverse was obtained ﬁrst and then the second column . x + 2y = 0 and z + w = 0. x z y w This requires the solution of the systems of equations.18) for the second. Now solve the second system. In the above.19) to ﬁnd z and w. 0 1 1 Now take (−1) times the second row and add to the ﬁrst to get 1 0 0 1 −1 1 . Putting in the variables. Putting in the variables. LINEAR ALGEBRA In the last example. (19. x + y = 1. this says z = −1 and w = 1. . In each case. Therefore. . Take (−1) times the ﬁrst row and add to the second to get 1 1 0 . this says x = 2 and y = −1. w This is the reason for the following simple procedure for ﬁnding the inverse of a matrix.

16 Suppose A is an n × n matrix.8) show −A is unique. form the augmented n × 2n matrix. Using only the properties (19. 1 1 0 0 1 0 1 0 . 5. 4. Using only the properties (19. B = A−1 .8) show 0A = 0. 1 19.8) and previous problems show (−1) A = −A. EXERCISES 419 Procedure 19.19. 1 1 0 2 2 1 −1 0 . Find A−1 . 1 0 1 Example 19.(19.1) .1) .8) show 0 is unique.2.17 Let A = 1 −1 1 . −1 0 0 1 1 1 1 0 −1 1 Now do row operations until the n × n matrix on the left becomes the identity matrix. When this has been done. 7. Prove (19.(19.20).1) . Prove that Im A = A where A is an m × n matrix.1) .17). Just multiply the matrices and see if 1 1 1 0 1 0 1 0 2 2 1 −1 1 1 −1 0 = 0 1 1 1 −1 0 0 1 −1 −1 2 2 it works. Using only the properties (19. A has no inverse exactly when it is impossible to do row operations and end up with one like (19. To ﬁnd A−1 if it exists. .1) .1.(19. (A|I) and then do row operations until you obtain an n × 2n matrix of the form (I|B) (19. Here the 0 on the left is the scalar 0 and the 0 on the right is the zero for m × n matrices.(19.2 Exercises 1. 6. The matrix. 1 1 1 −2 −2 Checking the answer is easy.20) if possible. In (19. Using only the properties (19. 2.8) describe −A and 0. 3. 0 0 . This yields after a some computations.(19. 1 1 −1 Form the augmented matrix. 1 1 1 0 0 0 2 2 0 1 0 1 −1 0 1 0 0 1 1 −2 −1 2 and so the inverse of A is the matrix on the right.1.

Suppose A and B are square matrices of the same size. A such that A2 = I and yet A = I and A = −I. 17. and yet AB = AC. 14. ·)Rk denotes the dot product in Rk . and C = −1 2 2 1 −2 −3 −1 0 2 = B −1 A−1 . 11. Which of the following are correct? (a) (A − B) = A2 − 2AB + C 2 (b) (AB) = A2 B 2 (c) (A + B) = A2 + 2AB + C 2 (d) (A + B) = A2 + AB + BA + B 2 (e) A2 B 2 = A (AB) B (f) (A + B) = A3 + 3A2 B + 3AB 2 + B 3 (g) (A + B) (A − B) = A2 − B 2 (h) None of the above. y)Rm = x. A. B such that neither A nor B equals zero and yet AB = 0. Give another example other than the one given in this section of two square matrices. Give an example 1 12. Find xT y and xyT if possible. A = 0. x1 − x2 + 2x3 x1 2x3 + x1 in the form A x2 where A is an appropriate matrix. Give an example of a matrix.AT y Rn where (·. Show that if A−1 exists for an n × n matrix. Let A and be a real m × n matrix and let x ∈ Rn and y ∈ Rm . −1. 1) and y = (0. 19. Use the result of Problem 8 to verify directly that (AB) = B T AT without making any reference to subscripts. if BA = I and AB = I. Show (Ax. 15. A and B such that AB = BA. That is. Find −1 . Give an example of matrices. B. 2) . Let A = −2 1 if possible. Let x = (−1. 1. 1 1 −3 1 1 −1 −2 0 . (a) AB (b) BA (c) AC (d) CA (e) CB (f) BC 13. Write x3 3x3 3x4 + 3x2 + x1 x4 18. 9. then it is unique. 16. They are all wrong. A. 10. B = . C such that B = C. 3 2 2 2 2 .420 LINEAR ALGEBRA 8. then B = A−1 . Show (AB) −1 T of matrices.

then for v. If A−1 does not exist. LINEAR TRANSFORMATIONS (i) All of the above.1 A function. B such that AB = 0. 22.13). 19. It turns out this is the only type of linear transformation available. . They are all right.3.22) i Also. ei ∈ Rn is deﬁned 0 . · · ·.3. b scalars. determine why. Thus v is a list of real numbers arranged vertically.19. . Lemma 19. These vectors have a signiﬁcant property. · · ·. Before showing this. ∈Rn A au + bv = aAu + bAv ∈ Rm (19. if A is an m × n matrix. 1.21) holds.2 A vector. v1 . 20.21). From (19. 1 0 2 Find A−1 if possible. Thus if A is a linear transformation from Rn to Rm . there is always a matrix which produces A.23) T . matrix multiplication deﬁnes a linear transformation as just deﬁned. v ∈ Rn and a. Find all 2 × 2 matrices. if A is an m × n matrix. · · ·. where the 1 is in the ith position and there are zeros everywhere else. ei ≡ 1 . However. (19. Let 1 2 3 A = 2 1 4 . eT Aej = Aij i (19. b scalars. 421 21. vn . Of course the ei for a particular value of i in Rn would be diﬀerent than the ei for that same value of i in Rm for m = n. Let A = −1 3 −1 3 . Deﬁnition 19. Prove that if A−1 exists and Ax = 0 then x = 0.21) Deﬁnition 19. then letting ei ∈ Rm and ej ∈ Rn . One of them is longer than the other. (19. A : Rn → Rm is called a linear transformation if for all u. Then eT v = v i . 0. 0) . u vectors in Rn and a.3. 0. which one is meant will be determined by the context in which they occur.3. here is a simple deﬁnition.3 Linear Transformations By (19. 0 as follows: . . Thus ei = (0.3 Let v ∈ Rn . .

Therefore.4 Let L : Rn → Rm be a linear transformation.3. i To verify uniqueness. and noting that (ej )k = δ kj A1j A1k (ej )k . From the above theorem and corollary.6 Find the linear transformation. means the linear transformation determined by the matrix is one to one. to say a matrix is one to one. From the deﬁnition of matrix multiplication using the repeated index summation convention. Then there exists a unique m × n matrix. Then in particular. . Proof: This follows immediately from the above theorem. . (Lx)i = eT Lx = eT xk Lek = eT Lek xk . This proves the lemma. .5 A linear transformation. .23). L : Rn → Rm is completely determined by the vectors {Le1 . .422 LINEAR ALGEBRA Proof: First note that eT is a 1 × n matrix and v is an n × 1 matrix so the above i multiplication in (19.22) makes perfect sense. For example. . Theorem 19. · · ·.3. Consider (19. . · · ·0) vi = vi . this linear 1 3 transformation is that determined by matrix multiplication by the matrix 2 1 1 3 . i i i Let Aik = eT Lek .3. T T T ei Aej = ei Aik (ej )k = ei Aij = Aij . It equals v1 . . . This theorem shows that any linear transformation deﬁned on Rn can always be considered as a matrix. Amj Amk (ej )k by the ﬁrst part of the lemma. Corollary 19. vn as claimed. . This proves uniqueness. . · · ·. suppose Bx = Ax = Lx for all x ∈ Rn . to prove the existence part of the theorem. . Example 19. L : R2 → R2 which has the property 2 1 that Le1 = and Le2 = . (0. the terms “linear transformation” and “matrix” will be used interchangeably. 1. . . (19. The ik th entry of this matrix is given by eT Lek i Proof: By the lemma. Len } . A such that Ax = Lx for all x ∈ Rn .24) . The unique matrix determining the linear transformation which is given in (19.24) depends only on these vectors. this is true for x = ej and then multiply on the left by eT to obtain i Bij = eT Bej = eT Aej = Aij i i showing A = B.

· · ·. The following lemma is very signiﬁcant. A = 0.25) . fk } for all j. Then there exist vectors. fr } such that fi · fj = δ ij and span {f1 . Before reading further. fk } . fk } for some j. · · ·. · · ·. {f1 . · · ·.7 Let {x1 . xk }. (19.3. · · ·.3. These sums are called linear combinations. fr } = A (Rn ) ≡ {Ax : x ∈ Rn }. LINEAR TRANSFORMATIONS 423 19.3. Deﬁne span {x1 . · · ·. Also recall that δ ij = 1 if i = j . fk } for all j ≤ ik .19. · · ·. · · ·. en } with the property that Aei = 0. fk } = A (Rn ) . you should review the dot product and its properties on Page 311. Aeik } = span {f1 . · · ·. stop and consider span {f1 . · · ·. 0 if i = j Lemma 19. · · ·. (By construction. · · ·. First it could happen that Aej ∈ span {f1 . In this case. Thus span {f1 } = span {Aei1 } = span {Ae1 . · · ·. fk } . fk } have been chosen in such a way that fi · fj = δ ij and span {Aei1 . · · ·. Aeik } = span {Ae1 . Aei1 } because Aek = 0 for k < i1 . Aej is in / span {f1 . xk } be a set of vectors in Rn . There are two cases to consider. fk } showing that span {f1 . {e1 . Proof: Let f1 = Aei1 |Aei1 | where ei1 is the ﬁrst vector in the list.3. This was just accomplished for k = 1. Ae2 .) Let ik+1 be the smallest index larger than ik such that Aeik+1 ∈ span {f1 . · · ·. n k Ax = A i=1 xi ei =Aei = i=1 xi Aei n k k n = i=1 xi j=1 yij fj = j=1 i=1 xi yij fj ∈ span {f1 . · · ·.1 Least Squares Problems k Deﬁnition 19. ck ∈ R In words span {x1 . fk } . · · ·. The other case is that Aej ∈ span {f1 . For x ∈ Rn arbitrary. Then deﬁne / fk+1 ≡ Aeik+1 − Aeik+1 − k j=1 k j=1 Aeik+1 · fj fj Aeik+1 · fj fj . xk } consists of all vectors which can be obtained by adding up scalars times the vectors in the set {x1 . Now suppose {f1 . · · ·. · · ·. xk } ≡ i=1 ci xi : c1 . being a special case of something called the Gram Schmidt procedure. · · ·.8 Let A be an m × n matrix.

it follows that if you can ﬁnd y1 . Furthermore. yr in such a way as to minimize r 2 n y− k=1 r y k fk . Then from the deﬁnition of |·| and the properties of the dot product. . fr } = A (Rn ).26) fk+1 · fl = = Aeik+1 · fl − Aeik+1 − Aeik+1 − k j=1 k j=1 Aeik+1 · fj fj · fl Aeik+1 · fj fj =0 Aeik+1 · fl − Aeik+1 · fl k j=1 Aeik+1 · fj fj Therefore. it follows that {f1 . x minimizes this function if and only if (y−Ax) · Aw = 0 for all w ∈ R . Aeik+1 = span Ae1 . · · ·. / Then span {f1 . · · ·. · · ·. then letting Ax = k=1 yk fk . Theorem 19. Let y1 .9 Let y ∈ Rm and let A be an m × n matrix. · · ·. Proof: Let {f1 .8 with span {f1 . yr be a list of scalars. since |fk+1 | = 1. fk+1 } has the property that fi · fj = δ ij and each fj ∈ A (Rn ) . Continue until the ﬁrst case is obtained. · · ·. ij < ij+1 and ij can’t be any larger than n. fk+1 } = span Aei1 . fr } . r 2 r r y− k=1 yk fk = y− k=1 yk fk r · y− k=1 yk fk δ kl = |y| − 2 k=1 r 2 yk (y · f k ) + k r l 2 yk k=1 yk yl fk · fl = |y| − 2 k=1 r 2 yk (y · f k ) + 2 yk − 2yk (y · f k ) = |y| + k=1 2 Now complete the square to obtain r r 2 yk − 2yk (y · f k ) + (y · f k ) k=1 r r 2 = = |y| + |y| + k=1 2 2 − k=1 2 (y · f k ) 2 (yk − (y · f k )) − k=1 2 (y · f k ) . |y−Ax| . The following is a fundamental existence theorem which is based on the above lemma. Then there exists x ∈ Rn 2 minimizing the function. · · ·. · · ·. it will follow that this x is the desired solution. Since A (Rn ) = span {f1 . δ jl (19. This proves the lemma. from the properties of the dot product on Page 311. fr } be the vectors of Lemma 19. · · ·. fk } .3. for l < k + 1. Aeik+1 Also. · · ·.424 LINEAR ALGEBRA Note the denominator is non zero because of the assumption that Aeik+1 ∈ span {f1 . This must happen in no more than n applications of the above process because for each j.3. · · ·.

The desired system is y1 x1 1 . (Why?) This proves the theorem. According to Theorem 19.9. Lemma 19.10 Let A be an m × n matrix. . 2 2 2 2 19. . b yn xn 1 which is of the form y = Ax and it is desired to choose m and b to make m b y1 . .19. . the best values for m and b occur as the solution to y1 m . 2 2 2 2 Then from the above equation. This proves the existence part of the Theorem. m .2 The Least Squares Regression Line As an important application of the above theorem is the problem of ﬁnding the least squares n regression line in statistics. Ax · y = = = Aij xj yi xj AT yi ji x·AT y. −2t (y−Ax) · Aw + t2 |Aw| ≥ 0. = . . Using the repeated index summation convention. AT A = AT . yn 2 A as small as possible. To verify the other part. Suppose you are given points in the plane. which occurs if and only if (y−Ax) · Aw = 0 for all w ∈ Rn . {(xi . b yn .3. − . b to get as close as possible. LINEAR TRANSFORMATIONS 425 This shows that the minimum is obtained when yk = (y · f k ) . Then Ax · y = x·AT y Proof: This follows from the deﬁnition. let t ∈ R and consider |y−A (x + tw)| 2 = = (y−Ax − tAw) · (y−Ax − tAw) |y−Ax| − 2t (y−Ax) · Aw + t2 |Aw| . Therefore. try to ﬁnd m. Of course this will be impossible in general. . yi )}i=1 and you would like to ﬁnd constants m and b such that the line y = mx + b goes through all these points. |y−Ax| ≤ |y−Az| for all z ∈ Rn if and only if for all w ∈ Rn and t ∈ R |y−Ax| − 2t (y−Ax) · Aw + t2 |w| ≥ |y−Ax| and this happens if and only if for all t ∈ R and w ∈ Rn . .3.3.3. .

.9 there exists 2 x minimizing |y − Ax| which therefore satisﬁes (y − Ax) · Aw = 0 for all w ∈ Rn . .426 where 1 . (y − Ax) · y = 0. . including many in higher dimensions and they are all solved the same way. .9 and Lemma 19. . Proof: First suppose that for some x ∈ Rn . Ax = y. b. 19. b = .11 Let A be an m × n matrix. One could clearly do a least squares ﬁt for curves of the form y = ax2 + bx + c in the same way. In this case you want to solve as well as possible for a. A= . c x2 xn 1 yn n and one would use the same technique as above. Then there exists x ∈ Rn such that Ax = y if and only if whenever AT z = 0 it follows that z · y = 0. Therefore. for all w ∈ Rn .3. xn Thus. . AT (y − Ax) · w = 0 which shows that AT (y − Ax) = 0. AT z = 0 it follows that z · y = 0. It comes from Theorem 19.10. suppose that whenever.3 The Fredholm Alternative The next major result is called the Fredholm alternative. . y · z = Ax · z = x · AT z = x · 0 = 0. This proves half the theorem. . Many other similar problems are important.3. 1 LINEAR ALGEBRA x1 . . It is necessary to show there exists x ∈ Rn such that y = Ax. .27) . To do the other half. n 2 i=1 xi n i=1 xi n i=1 xi n m b = n i=1 xi yi n i=1 yi Solving this system of equations for m and b. . Theorem 19. (Why?) Therefore. (19. m= and b= −( n i=1 xi ) ( n i=1 n i=1 yi ) + ( ( n i=1 x2 ) n − ( i ( n i=1 xi yi ) n 2 n i=1 xi ) n i=1 −( xi ) ( n i=1 xi yi + n 2 i=1 xi ) n − ( n i=1 yi ) 2 n i=1 xi ) x2 i . From Theorem 19. by assumption.3. and c the system 2 y x1 x1 1 1 a . .3. . computing AT A. Then letting AT z = 0 and using the above lemma.3. .

then x = 0. Now do the same thing for y = ax2 + bx + c. 19.3.12 Let A be an m × n matrix. 2) .8 explain the assertion (19. Then by Theorem 19. ﬁnding a.3. Now let y ∈ Rm be given. The following corollary is also called the Fredholm alternative.9 concluded with the following observation.11 the following argument was used. If AT x = AT y. If −ta+t2 b ≥ 0 for all t ∈ R and b ≥ 0. 3.11.3. then AT (x − y) = 0 and so x − y = 0. Why is this so? 4. show that an m × n matrix is onto if and only if its transpose is one to one.4 Exercises 1. (1. 7. Then A is onto if and only if AT is one to one. Show the following are equivalent. In the proof of Theorem 19. then a = 0. and 0 is a solution to this equation.3. Referring to Problem 7. You are doing experiments and have obtained the ordered pairs. ﬁnd the least squares solution to the following system. (2.11 there exists x such that Ax = y therefore. and (3. since AT is assumed to be one to one.27) with w = x. 8. 3. y = Ax. Explain why there always exists a solution to the equation. it follows that for all y ∈ Rm . by Theorem 19. 6. and c to give the best approximation. A is onto. Is it possible that AT is one to one? What does this say about A being onto? Prove your answer.19. The proof of Theorem 19.5) . it must be the only solution.4. EXERCISES Now by (19. If x · w = 0 for all w ∈ Rn . Proof: Suppose ﬁrst A is onto. (y − Ax) · (y−Ax) = (y − Ax) · y− (y − Ax) · Ax = 0 showing that y = Ax. In the proof of Lemma 19.3. Suppose A is a 3 × 2 matrix. Thus AT is one to one. y · z = 0 whenever AT z = 0. Therefore.12 and Problem 4. Therefore.3. This proves the theorem. 5. let y = z where AT z = 0 and conclude that z · z = 0 whenever AT z = 0. 4) . y · z = 0 whenever AT z = 0 because. (b) L is one to one. (a) Lx = 0 implies x = 0.26). Suppose L : Rn → Rm is a linear transformation. 1) .3. AT y = AT Ax and also explain why this is called a least squares solution to the equation. Using Corollary 19. Find m and b such that y = mx + b approximates these four points as well as possible. (0. Why is this so? 2. b. 427 Corollary 19. . x + 2y = 1 2x + 3y = 2 3x + 5y = 4 9.

the ij th entry of A (t) B (t) equals the ij th entry of A (t) B (t) + A (t) B (t) and this proves the lemma.5. Describe how to ﬁnd a polynomial. Deﬁnition 19. To begin with.5 Moving Coordinate Systems The study of moving coordinate systems gives a non trivial example of the usefulness of the above ideas involving linear transformations and matrices. Therefore.2 Let A (t) be an m × n matrix and let B (t) be an n × p matrix with the property that all the entries of these matrices are diﬀerentiable functions. Say A (t) = (Aij (t)) . A (t) is the matrix which consists of replacing each entry by its derivative. The next lemma is just a version of the product rule. (xi . Lemma 19. Proof: (A (t) B (t)) = Cij (t) where Cij (t) = Aik (t) Bkj (t) and the repeated index summation convention is being used.1 Let A (t) be an m × n matrix. Is there any reason you have to use a polynomial? Would similar approaches work for other combinations of functions just as well? 19. Then deﬁne A (t) ≡ Aij (t) . 19. That is. z = a + bx + cy + dxy + ex2 + f y 2 for example giving the best ﬁt to the given ordered triples. Suppose also that Aij (t) is a diﬀerentiable function for all i. Such an m × n matrix in which the entries are diﬀerentiable functions is called a diﬀerentiable matrix. Cij (t) = Aik (t) Bkj (t) + Aik (t) Bkj (t) = (A (t) B (t))ij + (A (t) B (t))ij = (A (t) B (t) + A (t) B (t))ij Therefore. Then (A (t) B (t)) = A (t) B (t) + A (t) B (t) . Suppose you have several ordered triples. zi ) .428 LINEAR ALGEBRA 10. yi . one pointing East and one pointing directly away from the center of the earth. Now consider unit vectors.5. here is the concept of the product rule extended to matrix multiplication.1 The Coriolis Acceleration Imagine a point on the surface of the earth. one pointing South.5. j. k ' i£ £ £ rrj r j .

and the vector has changed although the components have not.5. k at time t0 as u ≡ u1 i (t0 ) + u2 j (t0 ) + u3 k (t0 ) . Lemma 19. yields a moving coordinate system. Thus u (t) ≡ u1 i (t) + u2 j (t) + u3 k (t) . 3 1/2 i 2 |Q (t) u| = i=1 u = |u| . Letting the positive x axis extend in the direction of i (t) . but of course they are not. like the vectors described in the ﬁrst paragraph. Then deﬁne the components of u with respect to these vectors. Let u (t) be deﬁned as the vector which has the same components with respect to i.5. It is assumed these vectors are C 1 functions of t. This is exactly the situation in the case of the apparently ﬁxed basis vectors on the earth if u is a position vector from the given spot on the earth’s surface to a point regarded as ﬁxed with the earth due to its keeping the same coordinates relative to the coordinate axes which are ﬁxed with the earth. Then Q (t) Q (t) = Q (t) Q (t) = I. i. Now let u be a vector and let t0 be some reference time. velocities. diﬀerentiable n × n matrix which preserves disT T tances. they change direction and so each is in reality a function of t. In general. k∗ be the usual ﬁxed vectors in space and let i (t) . Q (t) (αu + βv) ≡ αu1 + βv 1 i (t) + αu2 + βv 2 j (t) + αu3 + βv 3 k (t) = αu1 i (t) + αu2 j (t) + αu3 k (t) + βv 1 i (t) + βv 2 j (t) + βv 3 k (t) = α u1 i (t) + u2 j (t) + u3 k (t) + β v 1 i (t) + v 2 j (t) + v 3 k (t) ≡ αQ (t) u + βQ (t) v showing that Q (t) is a linear transformation. Nevertheless. β. k (t) be an orthonormal basis of vectors for each t. . Q (t) preserves all distances because. since the vectors. Now deﬁne a linear transformation Q (t) mapping R3 to R3 by Q (t) u ≡ u1 i (t) + u2 j (t) + u3 k (t) where u ≡ u1 i (t0 ) + u2 j (t0 ) + u3 k (t0 ) Thus letting v be a vector deﬁned in the same manner as u and α. As the earth turns. the positive y axis extend in the direction of j (t). i (t) . and displacements. j∗ . then there exists a vector. the second as j and the third as k. Also. and the positive z axis extend in the direction of k (t) . k but at time t. Also. j. j (t) .3 Suppose Q (t) is a real. scalars. it is with respect to these apparently ﬁxed vectors that you wish to understand acceleration. j. k (t) form an orthonormal set.19. j (t) . MOVING COORDINATE SYSTEMS 429 Denote the ﬁrst as i. Ω (t) such that u (t) = Ω (t) × u (t) . If you are standing on the earth you will consider these vectors as ﬁxed. let i∗ . For example you could let t0 = 0. if u (t) ≡ Q (t) u.

29) where because Ω (t) × u (t) ≡ Ω (t) = ω 1 (t) i∗ +ω 2 (t) j∗ +ω 3 (t) k∗ . Therefore. k∗ .28) that T the matrix of Q (t) Q (t) is of the form 0 −ω 3 (t) ω 2 (t) ω 3 (t) 0 −ω 1 (t) −ω 2 (t) ω 1 (t) 0 for some time dependent scalars. This proves the lemma and yields the existence part of the following theorem. i∗ .2 that Q (t) Q (t) + Q (t) Q (t) = 0 and so Q (t) Q (t) = − Q (t) Q (t) From the deﬁnition. It follows from the product rule. . w.5. T Q (t) Q (t) u · w = (u · w) for all u. T u (t) = Ω (t) ×u (t) = Q (t) Q (t) u (t) (19. Lemma 19. Therefore.430 Proof: Recall that (z · w) = 1 4 LINEAR ALGEBRA |z + w| − |z − w| 2 2 . (19. ω i .28) u (t) = Q (t) u =Q (t) Q (t) u (t). Therefore. =u T T T T T T T T . i∗ w1 u1 j∗ w2 u2 k∗ w3 u3 T T ≡ i∗ w2 u3 − w3 u2 + j∗ w3 u1 − w1 u3 + k∗ w1 u2 − w2 u1 . Q (t) Q (t) u = u and so Q (t) Q (t) = Q (t) Q (t) = I. where these are the usual basis vectors for R3 . (Q (t) u·Q (t) w) = = = This implies 1 2 2 |Q (t) (u + w)| − |Q (t) (u − w)| 4 1 2 2 |u + w| − |u − w| 4 (u · w) . This proves the ﬁrst part of the lemma. Then writing the matrix of Q (t) Q (t) with respect to ﬁxed in space orthonormal basis vectors. it follows from (19. j∗ . 1 1 u 0 −ω 3 (t) ω 2 (t) u u2 (t) = ω 3 (t) 0 −ω 1 (t) u2 (t) u3 −ω 2 (t) ω 1 (t) 0 u3 w2 (t) u3 (t) − w3 (t) u2 (t) = w3 (t) u1 (t) − w1 (t) u3 (t) w1 (t) u2 (t) − w2 (t) u1 (t) where the ui are the components of the vector u (t) in terms of the ﬁxed vectors i∗ . Q (t) u = u (t) . j∗ . Therefore. k∗ .

4 Let i (t) . k are constant. j.29). (Ω − Ω1 ) ×Q (t) u = 0 for all u and since Q (t) is one to one and onto. then u (t) = Ω (t) × u (t) . if e ∈ {i. k} . Therefore. Now let R (t) be a position vector and let r (t) = R (t) + rB (t) where rB (t) ≡ x (t) i (t) +y (t) j (t) +z (t) k (t) . Proof: It only remains to prove uniqueness. e = Ω × e because the components of these vectors with respect to i. k (t). thus regarding p (t) as the origin. j (t) . Now consider the acceleration. rB (t) ¢¢d d ¢ R(t) ¢ r(t) ¢ In the example of the earth. This proves the theorem. (19. j (t) .5.5. vB (t) and aB (t) will be the velocity and acceleration relative to i (t) . this implies (Ω − Ω1 ) ×w = 0 for all w and thus Ω − Ω1 = 0. Therefore. B on those relative to the moving coordinates system. By . k (t) . j. v = R + x i + y j + z k + Ω × rB = R + x i + y j + z k + Ω× (xi + yj + zk) . MOVING COORDINATE SYSTEMS 431 Theorem 19. rB (t) is the position vector of a point as perceived by the observer on the earth with respect to the vectors he thinks of as ﬁxed. Similarly. Then u (t) = Q (t) u and so u (t) = Q (t) u and Q (t) u = Ω×Q (t) u = Ω1 ×Q (t) u for all u. Ω×vB a = v = R + x i + y j + z k+x i + y j + z k + Ω × rB . Suppose Ω1 also works. R (t) is the position vector of a point p (t) on the earth’s surface and rB (t) is the position vector of another point from p (t) . Then there exists a unique vector Ω (t) such that if u (t) is a vector whose components are constant with respect to i (t) . xi + yj + zk = xΩ × i + yΩ × j + zΩ × k = Ω × (xi + yj + zk) and consequently.19. and so vB = x i + y j + z k and aB = x i + y j + z k. Quantities which are relative to the moving coordinate system and quantities which are relative to a ﬁxed coordinate system are distinguished by using the subscript. Then v ≡ r = R + x i + y j + z k+xi + yj + zk . j (t) . k (t) be as described.

The mass multiplied by the Coriolis acceleration deﬁnes the Coriolis force. Ω = ωk∗ and so =0 R = Ω × R+Ω × R = Ω× (Ω × R) since Ω does not depend on t. Also. Letting R (t) be the position vector of the point p. Then the whole circular room begins to revolve faster and faster. j pointing East. it being an acceleration felt by the observer relative to the moving coordinate system which he regards as ﬁxed. k∗ . p. 19. There is a ride found in some amusement parks in which the victims stand next to a circular wall covered with a carpet or some rough material. Thus k∗ is ﬁxed in space with k∗ pointing in the direction of the north pole from the center of the earth while i∗ and j∗ point to ﬁxed points on the survace of the earth.432 vB Ω×rB (t) LINEAR ALGEBRA +Ω× x i + y j + z k+xi + yj + zk = R + aB + Ω × rB + 2Ω × vB + Ω× (Ω × rB ) . The force they feel is called centrifugal force and it causes centrifugal acceleration. The term Ω× (Ω × rB ) is called the centripetal acceleration. it follows from the geometric deﬁnition of the cross product that R = ωk∗ × R Therefore. (19. At some point. if the nauseated victim moves relative to the rotating wall. j.2 The Coriolis Acceleration On The Rotating Earth Now consider the earth. k be the unit vectors described earlier with i pointing South.30) implies aB = a − Ω× (Ω × R) − 2Ω × vB − Ω× (Ω × rB ) .30) Here the term − (Ω× (Ω × rB )) is called the centrifugal acceleration. (19. the bottom drops out and the victims are held in place by friction. The acceleration aB is that perceived by an observer who is moving with the moving coordinate system and for whom the moving coordinate system is ﬁxed. observe the coordinates of R (t) are constant with respect to i (t) . aB = a − R − Ω × rB − 2Ω × vB − Ω× (Ω × rB ) . It is not necessary to move relative to coordinates ﬁxed with the revolving wall in order to feel this force and it is pretty predictable. Formula (19.31) . and k pointing away from the center of the earth at some point of the rotating earth’s surface.5. The diﬀerence between these forces is that the Coriolis force is caused by movement relative to the moving coordinate system and the centrifugal force is not. k (t) . Solving for aB . an acceleration experienced by the observer as he moves relative to the moving coordinate system. j (t) . since the earth rotates from West to East and the speed of a point on the surface of the earth relative to an observer ﬁxed in space is ω |R| sin φ where ω is the angular speed of the earth about an axis through the poles. Thus i∗ and j∗ depend on t while k∗ does not. Let i∗ . and the term −2Ω × vB is called the Coriolis acceleration. he will feel the eﬀects of the Coriolis force and this force is really strange. Let i. However. j∗ . be the usual basis vectors attached to the rotating earth. from the center of the earth.

the following formula describes the acceleration relative to a coordinate system moving with the earth’s surface. This is described in terms of the vectors i (t) .32) . this term is therefore. φ) be the usual spherical coordinates of the point p (t) on the surface taken with respect to i∗ . To see this. although the components of acceleration which are in other directions are very small when compared with the acceleration due to the force of gravity and are often neglected.19. θ. i. the Coriolis acceleration could be signiﬁcant. b. ρ cos (φ) It follows. k∗ coordinates of this point are ρ sin (φ) cos (θ) ρ sin (φ) sin (θ) . it follows the i∗ . note seconds in a day ω (24) (3600) = 2π. and so ω = 7. j∗ .2722 × 10−5 2 |rB | . the acceleration relative to the moving coordinate system on the earth is not directed exactly toward the center of the earth except at the poles and at the equator. Clearly this is not worth considering in the presence of the acceleration due to gravity which is approximately 32 feet per second squared near the surface of the earth. Letting (ρ. 2 Ω× (Ω × R) = (Ω · R) Ω− |Ω| R and so g. is due to gravity. Therefore. then aB = a − Ω× (Ω × R) − 2Ω × vB = ≡g − Note that GM (R + rB ) |R + rB | 3 − Ω× (Ω × R) − 2Ω × vB ≡ g − 2Ω × vB . i = cos (φ) cos (θ) i∗ + cos (φ) sin (θ) j∗ − sin (φ) k∗ j = − sin (θ) i∗ + cos (θ) j∗ + 0k∗ and k = sin (φ) cos (θ) i∗ + sin (φ) sin (θ) j∗ + cos (φ) k∗ . If the acceleration a. if the relative velocity. MOVING COORDINATE SYSTEMS 433 In this formula. no larger than 7. k∗ the usual way with φ the polar angle. j. k.2722 × 10−5 in radians per second. Thus the following equation needs to be solved for a. It is necessary to obtain k∗ in terms of the vectors. Ω is quite small. j (t) . vB is large. if the only force acting on an object is due to gravity. you can totally ignore the term Ω× (Ω × rB ) because it is so small whenever you are considering motion near some point on the earth’s surface. aB = g−2 (Ω × vB ) While the vector. If you are using seconds to measure time and feet to measure distance. c to ﬁnd k∗ = ai+bj+ck 0 cos (φ) cos (θ) − sin (θ) 0 = cos (φ) sin (θ) cos (θ) 1 − sin (φ) 0 k∗ sin (φ) cos (θ) a sin (φ) sin (θ) b cos (φ) c (19. k (t) next. j∗ .5.

However. Example 19. aB = −gk−2z ω sin φj.6 In 1851 Foucault set a pendulum vibrating and observed the earth rotate out from under it. . Therefore. p (t) on the earth’s surface. Now the Coriolis acceleration on the earth equals ∗ k 2 (Ω × vB ) = 2ω − sin (φ) i+0j+ cos (φ) k × (x i+y j+z k) . Where will it strike? Assume a = −gk and the j component of aB is approximately −2ω (x cos φ + z sin φ) . Also. This shows the rock does not fall directly towards the center of the earth as expected but slightly to the east. This is what allowed him to observe the earth rotate out from under it. (19. this is 2gtω sin φ. This equals 2ω [(−y cos φ) i+ (x cos φ + z sin φ) j − (y sin φ) k] . aB = a − Ω× (Ω × R) −2ω [(−y cos φ) i+ (x cos φ + z sin φ) j − (y sin φ) k] . The dominant term in this expression is clearly the second one because x will be small. considering the j component. and c = cos (φ) . Clearly such a pendulum will take 24 hours for the plane of vibration to appear to make one complete revolution at the north pole. It is also reasonable to expect that no such observed rotation would take place on the equator. b = 0.33) Remember φ is ﬁxed and pertains to the ﬁxed point.5. the second is j and the third is k in the above matrix. Example 19.31). aB = g−2ω [(−y cos φ) i+ (x cos φ + z sin φ) j − (y sin φ) k] B where g = − GM (R+r3 ) − Ω× (Ω × R) as explained above.5. there is a little sign which says Warning! 1000 ohms. Is it possible to predict what will take place at various latitudes? Using (19. It was a very long pendulum with a heavy weight at the end so that it would vibrate for a long time without stopping1 . in (19.5 Suppose a rock is dropped from a tall building. z = −gt approximately. Therefore. Therefore.434 LINEAR ALGEBRA The ﬁrst column is i. the Coriolis force will not be neglected. 1 There is such a pendulum in the Eyring building at BYU and to keep people from touching it.33). a is due to gravity. The term Ω× (Ω × R) is pretty |R+rB | small and so it will be neglected. The solution is a = − sin (φ) . the i and k contributions will be very small. the following equation is descriptive of the situation. Two integrations give ωgt3 /3 sin φ for the j component of the relative displacement at time t. if the acceleration.

x (0) = 0.35) yields a solution to (19. Therefore. the pendulum is assumed to be long with a heavy weight so that x and y are also small. √ c b2 + 4a2 x (0) = 0. y + a2 y = −bx where a2 = constant. The pendulum can be 2 thought of as the position vector from (0. Also. y (0) = 0. l (19. 0. ml All terms of the form xy or y y can be neglected because it is assumed x and y remain small. l) to the surface of the sphere x2 +y 2 +(z − l) = 2 l . the ﬁrst two equations become x = − (gm − 2ωmy sin φ) and x + 2ωy cos φ ml y − 2ω (x cos φ + z sin φ) . the last equation may be solved for T to get z =T gm − 2ωy sin (φ) m = T. y = c cos 2 bt 2 √ sin b2 + 4a2 t 2 x = c sin (19. Ω× (Ω × R) .34) along with the initial conditions.19. this becomes = −gk + T/m−2ω [(−y cos φ) i+ (x cos φ + z sin φ) j − (y sin φ) k] where T. 2 (19. z = z = 0. the tension in the string of the pendulum. the diﬀerential equations of relative motion are x = −T y = −T and x + 2ωy cos φ ml y − 2ω (x cos φ + z sin φ) ml l−z − g + 2ωy sin φ. Then it is fairly tedious but routine to verify that for each bt 2 √ sin b2 + 4a2 t . and m is the mass of the pendulum bob. the equations of motion become y = − (gm − 2ωmy sin φ) x +g and y +g These equations are of the form x + a2 x = by . c. y l−z x k T = −T i−T j+T l l l and consequently. ml If the vibrations of the pendulum are small so that for practical purposes. With these simplifying assumptions. g l x = 2ωy cos φ l y = −2ωx cos φ. MOVING COORDINATE SYSTEMS 435 Neglecting the small term.36) .5. y (0) = . is directed towards the point at which the pendulum is supported. Therefore.34) and b = 2ω cos φ.

Therefore. a = −a (rB ) rB where rB = r will denote the distance from the ﬁxed point p (t) on the earth’s surface which is also the lowest pressure point. The term sin t changes relatively fast 2 and takes values between −1 and 1. Therefore. Also note the pendulum would not appear to change its plane of vibration at the equator because limφ→π/2 sec φ = ∞. i. Writing these solutions diﬀerently. it follows ω = hour. observe T in hours.35) by uniqueness of the initial value problem. the above formula implies T = 24 sec φ. (Recall k points away from the center of the earth and j points East. The purpose of this discussion is not to establish these self evident facts but to predict how long it takes for the plane of vibration to make one revolution.7 It is known that low pressure areas rotate counterclockwise as seen from above in the Northern hemisphere but clockwise in the Southern hemisphere. The Coriolis acceleration is also responsible for the phenomenon of the next example. there will be some instant in time at which the pendulum will be vibrating in a plane determined by k and j. 2 Therefore. Of course the situation could be more complicated but this will suﬃce to explain the above question. This is what describes the actual observed vibrations of the pendulum. The plane of vibration is √ b2 +4a2 determined by this vector and the vector k. ) At this instant in time.436 LINEAR ALGEBRA It is clear from experiments with the pendulum that the earth does indeed rotate out from under it causing the plane of vibration of the pendulum to appear to rotate. Why? Neglect accelerations other than the Coriolis acceleration and the following acceleration which comes from an assumption that the point p (t) is the location of the lowest pressure. Thus the plane of vibration will have made one complete revolution when t = T for bT ≡ 2π.5. and then solve the above equation for φ. = π 12 in radians per I think this is really amazing. not by taking readings with instruments using the North Star but by doing an experiment with a big pendulum. Then the acceleration observed by a person on the earth relative to the apparently ﬁxed vectors. √ b2 + 4a2 x (t) sin bt 2 t =c sin bt y (t) cos 2 2 sin bt 2 always has magnitude equal to |c| cos bt 2 but its direction changes very slowly because b is very small. j. the time it takes for the earth to turn out from under the pendulum is This is very interesting! The vector. the conditions of (19.36) will hold for some value of c and so the solution to (19. k. is aB = −a (rB ) (xi+yj+zk) − 2ω [−y cos (φ) i+ (x cos (φ) + z sin (φ)) j− (y sin (φ) k)] . You could actually determine latitude.34) having these initial conditions will be those of (19. You would set it vibrating. c T = 4π 2π = sec φ. deﬁned as t = 0. 2ω cos φ ω 2π 24 Since ω is the angular speed of the rotating earth. Example 19.

one obtains some diﬀerential equations from aB = x i + y j + z k by matching the components. Is this true on the rotating earth assuming the experiment .6 Exercises 1. so that z is fairly constant and the last term can be neglected. Show that i (t) · i (t0 ) j (t) · i (t0 ) k (t) · i (t0 ) Q (t) = i (t) · j (t0 ) j (t) · j (t0 ) k (t) · j (t0 ) . this implies rB × rB = 0 initially and then the above implies its kth component is negative in the upper hemisphere when φ < π/2 and positive in the lower hemisphere when φ > π/2. Therefore. EXERCISES 437 Therefore. An illustration used in many beginning physics books is that of ﬁring a riﬂe horizontally and dropping an identical bullet from the same height above the perfectly ﬂat ground followed by an assertion that the two bullets will hit the ground at exactly the same time.6. (rB × rB ) = i −a (rB ) x + 2ωy cos φ x i x x j y y k z z = i x x j y y k z z = j k −a (rB ) y − 2ωx cos φ − 2ωz sin (φ) −a (rB ) z + 2ωy sin φ y z Then the kth component of this cross product equals ω cos (φ) y 2 + x2 + 2ωxz sin (φ) . Using the right hand and the geometric deﬁnition of the cross product. 2 2 Beginning with a point at rest. i. These are x + a (rB ) x = y + a (rB ) y = z + a (rB ) z = 2ωy cos φ −2ωx cos φ − 2ωz sin (φ) 2ωy sin φ Now remember. the motion towards the low pressure has to be more pronounced in comparison with the motion in the k direction in order to draw this conclusion. Note also that as φ gets close to π/2 near the equator. Q (t) as described above. j. Remember the Coriolis force was 2Ω × vB where Ω was a particular vector which came from the matrix. the vectors.19. π and positive if φ ∈ π . from the properties of the determinant and the above diﬀerential equations. The ﬁrst term will be negative because it is assumed p (t) is the location of low pressure causing y 2 +x2 to be a decreasing function. i (t) · k (t0 ) j (t) · k (t0 ) k (t) · k (t0 ) There will be no Coriolis force exactly when Ω = 0 which corresponds to Q (t) = 0. π . then the kth component of (rB × rB ) is negative provided φ ∈ 0. 19. When will Q (t) = 0? 2. Therefore. the above reasoning tends to break down because cos (φ) becomes close to zero. this shows clockwise rotation in the lower hemisphere and counter clockwise rotation in the upper hemisphere. k are ﬁxed relative to the earth and so are constant vectors. If it is assumed there is not a substantial motion in the k direction.

19. = a b c d . j points East. What other irregularities will occur? Recall the Coriolis force is 2ω [(−y cos φ) i+ (x cos φ + z sin φ) j − (y sin φ) k] where k points away from the center of the earth.7 Determinants Let A be an n × n matrix. Multiply this by 4 and then by (−1) because the 4 is in the ﬁrst column and the second row.7.438 LINEAR ALGEBRA takes place over a large perfectly ﬂat ﬁeld so the curvature of the earth is not an issue? Explain. (−1) 2+1 4 2 3 2 1 + (−1) 2+2 3 1 3 3 1 + (−1) 2+3 2 1 3 2 2 = 0.7. Thus det Example 19. This is the pattern used to evaluate the determinant by expansion along the ﬁrst column. you cross out the row . Deﬁnition 19. Then take the determinant of the resulting 2 × 2 matrix. (−1) 1+1 1 3 2 2 1 + (−1) 2+1 4 2 2 3 1 + (−1) 3+1 3 2 3 3 2 = 0. 1 Here is how it is done by “expanding along the ﬁrst column”. It follows exactly the same pattern and you see it gave the same answer. denoted as det (A) is a number. Having deﬁned what is meant by the determinant of a 2 × 2 matrix. this number is very easy to ﬁnd.3 Find the determinant of 1 2 4 3 3 2 3 2 . This gives the ﬁrst term in the above sum. Then det (A) ≡ ad − cb. What is going on here? Take the 1 in the upper left corner and cross out the row and the column containing the 1. If the matrix is a 2×2 matrix. Finally consider the 3 on the bottom of the ﬁrst column. Then 3+1 multiply this by 3 and by (−1) because this 3 is in the third row and the ﬁrst column. what about a 3 × 3 matrix? Example 19.1 Let A = a b c d . Cross out the row and column containing this 3 and take the determinant of what is left. The determinant of A.7. Now go to the 4. You could also expand the determinant along the second row as follows. The determinant is also often denoted by enclosing the matrix with two vertical lines. You pick a row or column and corresponding to each number in that row or column. From the deﬁnition this is just (2) (6) − (−1) (4) = 16. and i points South. Now 1+1 multiply this determinant by 1 and then multiply by (−1) because this 1 is in the ﬁrst row and the ﬁrst column.2 Find det 2 4 −1 6 a b c d . Cross out the row and the column which contain 4 and take the determinant of the 2 × 2 2+1 matrix which remains.

I hope you see that from the above pattern there will be 10! = 3. Therefore. take the determinant of what is left.7.4 Find det (A) where 1 5 A= 1 3 2 4 3 4 3 2 4 3 4 3 5 2 As in the case of a 3 × 3 matrix. cof (A) is deﬁned by cof (A) = (cij ) where to obtain cij delete the ith row and the j th column of A.5 Let A = (aij ) be an n×n matrix.6 Let A be an n × n matrix where n ≥ 2. The above examples motivate the following incredible theorem and deﬁnition.19. However. multiply this by the number i+j and by (−1) assuming the number is in the ith row and the j th column. What about a 4 × 4 matrix? Example 19. It is incredible and not at all obvious.7. ) and then multiply this number by (−1) . Theorem 19. Then a new matrix called the cofactor matrix.37) The ﬁrst formula consists of expanding the determinant along the ith row and the second expands the determinant along the j th column. DETERMINANTS 439 and column containing it. cof (A)ij will denote the ij entry of the cofactor matrix. you will get −12 assuming you make no mistakes. Then adding these gives the value of the determinant. Suppose now you have a 10 × 10 matrix. .7. take the determinant of the (n − 1) × (n − 1) matrix which results. (19. To make the th formulas easier to remember. Now you know how to expand each of these 3 × 3 matrices along a row or a column. You could expand this matrix along any row or any column and assuming you make no mistakes. This method of evaluating a determinant by expanding along a row or a column is called the method of Laplace expansion. A. you can expand this along any row or column.7. 800 terms involved in the evaluation of such a determinant by Laplace expansion along a row or column. 628 . det (A) = 3 (−1) 1+3 5 4 1 3 3 4 1 2 5 4 3 4 3 5 2 4 3 2 + 2 (−1) 2+3 1 1 3 1 5 1 2 3 4 2 4 3 4 5 2 4 3 5 + 4 (−1) 3+3 + 3 (−1) 4+3 . This is a lot of terms. it requires a little eﬀort to establish it. This is done in the section on the theory of the determinant which follows. If you do so. there will be 4 × 3 × 2 = 24 terms to evaluate in order to ﬁnd the determinant using the method of Laplace expansion. I think you should regard the above claim that you always get the same answer by picking any row or column with considerable skepticism. In addition to the diﬃculties just discussed. Lets pick the third column. (This i+j is called the ij th minor of A. Then n n det (A) = j=1 aij cof (A)ij = i=1 aij cof (A)ij . Note that each of the four terms above involves three terms consisting of determinants of 2 × 2 matrices and each of these will need 2 terms. you will always get the same thing which is deﬁned to be the determinant of the matrix. Deﬁnition 19.

From the above corollary. you could expand along the ﬁrst column. .. . Suppose the ith row of A equals (xa1 + yb1 . Then det (M ) is obtained by taking the product of the entries on the main diagonal. Thus such a matrix equals zero below the main diagonal. There are many properties satisﬁed by determinants.7. certain types of matrices are very easy to deal with. Corollary 19. .9 Let 1 0 A= 0 0 Find det (A) . Example 19. You should verify the following using the above theorem on Laplace expansion. Thus det (A) = 1 × 2 × 3 × −1 = −6. the determinant of the resulting matrix equals (−1) times the determinant of the original matrix.7 A matrix M . det is a linear function of each .7. .7. 0 ∗ . 0 ··· 0 ∗ A lower triangular matrix is deﬁned similarly as a matrix for which all entries above the main diagonal are equal to zero.8 Let M be an upper (lower) triangular matrix. an ) and the ith row of A2 is (b1 . the entries of the form Mii . · · ·. This example also demonstrates why the above corollary is true.7 0 0 −1 and now expand this along the ﬁrst column to get this equals 1×2× 3 33. is upper triangular if Mij = 0 whenever i > j.440 LINEAR ALGEBRA Notwithstanding the diﬃculties involved in using the method of Laplace expansion. · · ·.10 If two rows or two columns in an n × n matrix... In other words. Theorem 19. it suﬃces to take the product of the diagonal elements.. . · · ·. ∗ . . Deﬁnition 19. . are switched. as shown.7. xan + ybn ). .7 0 −1 Next expand the last along the ﬁrst column which reduces to the product of the main diagonal elements as claimed. ∗ ∗ ··· ∗ . bn ) . Some of the most important are listed in the following theorem. Then det (A) = x det (A1 ) + y det (A2 ) where the ith row of A1 is (a1 . If A is an n×n matrix in which two rows are equal or two columns are equal then det (A) = 0.7 0 −1 2 2 0 0 3 77 6 7 3 33. Without using the corollary. all other rows of A1 and A2 coinciding with those of A. This gives 2 6 7 1 0 3 33. A.

7. it has the same determinant as A. det (B) = −1 det (C) where 3 1 0 C= 0 0 2 0 −3 6 3 11 −8 30 4 22 . Finally. the method of Laplace expansion will not be practical for any matrix of large size. 1 2 3 4 0 −9 −13 −17 B= 0 −3 −8 −13 0 −2 −10 −3 and from the above corollary. and if A is an n × n matrix. If B is the matrix obtained from A by replacing the ith row (column) of A by a times the ith row (column) then a det (A) = det (B) . The same is true with the word “row” replaced with the word “column”. Then replace the third row by (−4) times the ﬁrst row added to it. Then det (C) = − det (D) where 1 0 D= 0 0 2 −3 0 0 3 4 −8 −13 11 22 14 −17 . replace the fourth row by (−2) times the ﬁrst row added to it.12 Find the determinant of the matrix. if A and B are n × n matrices. Here is an example which shows how to use this corollary to ﬁnd a determinant. This theorem implies the following corollary which gives a way to ﬁnd determinants.7. Now using the corollary some more. 1 2 3 4 5 1 2 3 A= 4 5 4 3 2 2 −4 5 Replace the second row by (−5) times the ﬁrst row added to it. This yields the matrix. then det (A) = det AT .7. Corollary 19. In addition to this.11 Let A be an n × n matrix and let B be the matrix obtained by replacing the ith row (column) of A with the sum of the ith row (column) added to a multiple of another row (column). As I pointed out above. DETERMINANTS 441 row A. −13 9 The second row was replaced by (−3) times the third row added to the second row and then the last row was multiplied by (−3) . then det (AB) = det (A) det (B) . Now replace the last row with 2 times the third added to it and then switch the third and second rows.19. Example 19. Then det (A) = det (B) .

and similar reasoning. Proof: By Theorem 19. Now n n air cof (A)ik = i=1 T i=1 air cof (A)ki T which is the krth entry of cof (A) A. However.13 on Page 417. Thus 11 22 det (D) = 1 (−3) = 1485 14 −17 and so det (C) = −1485 and det (A) = det (B) = −1 (−1485) = 495.7. Bk whose determinant equals zero by Theorem 19.1. expanding this matrix along the k th column yields n th 0 = det (Bk ) det (A) Summarizing. Recall the deﬁnition of the inverse of a matrix in Deﬁnition 19. Replace the k column with the rth column to obtain a matrix. 3 The theorem about expanding a matrix along any row or column also provides a way to give a formula for the inverse of a matrix.10. n −1 = i=1 air cof (A)ik det (A) −1 air cof (A)ik det (A) i=1 −1 = δ rk . i=1 Now consider n air cof (A)ik det(A)−1 i=1 when k = r. cof (A) A = I. then A−1 = a−1 ij where a−1 = det(A)−1 cof (A)ji ij for cof (A)ij the ij th cofactor of A.7. n T (19. Theorem 19.6 and letting (air ) = A.7.6.13 A−1 exists if and only if det(A) = 0.7. det (A) Using the other formula in Theorem 19. Therefore.442 LINEAR ALGEBRA You could do more row operations or you could note that this can be easily expanded along the ﬁrst column followed by expanding the 3 × 3 matrix which results along its ﬁrst column. If det(A) = 0. if det (A) = 0.38) arj cof (A)kj det (A) j=1 −1 = δ rk Now n n arj cof (A)kj = j=1 j=1 arj cof (A)jk T . n air cof (A)ir det(A)−1 = det(A) det(A)−1 = 1.

Therefore.7. Using Corollary 19.7.39) that A−1 = a−1 . 0 This way of ﬁnding inverses is especially useful in the case where it is desired to ﬁnd the inverse of a matrix whose entries are functions.7. Example 19.7. This proves the theorem.11 on Page 441. 1 2 3 A= 3 0 1 1 2 1 First ﬁnd the determinant of this matrix. from the above theorem. Then by Theorem 19. DETERMINANTS which is the rk th entry of A cof (A) . A−1 is equal to one over the determinant of A times the adjugate matrix of A. the determinant of this matrix equals the determinant of the matrix. The transpose of the cofactor matrix is called the adjugate or sometimes the classical adjoint of the matrix A. A−1 = cof (A) . . take the transpose of the cofactor matrix and divide by the determinant. Theorem 19. Therefore. Now suppose A−1 exists. the inverse of A should equal −2 1 4 12 2 −2 −2 8 T 1 −6 6 0 = −1 6 1 −6 2 1 3 −1 6 1 6 2 3 −1 2 . A cof (A) = I. 1 2 3 0 −6 −8 0 0 −2 which equals 12.19.38) and (19. It is an abomination to call it the adjoint although you do sometimes see it referred to in this way. det (A) T −1 . In words.7.13 says that to ﬁnd the inverse. det (A) T T 443 (19. The cofactor matrix of A −2 4 2 is −2 6 −2 0 . where ij a−1 = cof (A)ji det (A) ij In other words. 1 = det (I) = det AA−1 = det (A) det A−1 so det (A) = 0.14 Find the inverse of the matrix.10. 8 −6 Each entry of A was replaced by its cofactor.39) and it follows from (19.

xi = det . r. · · ·. y = (y1 . x = A−1 A x = A−1 (Ax) = A−1 y thus solving the system. The cofactor matrix is 1 0 0 C (t) = 0 et cos t et sin t 0 −et sin t et cos t and so the inverse is 1 0 1 0 et cos t et 0 −et sin t T −t 0 e et sin t = 0 et cos t 0 0 − sin t . yn ) . Let A be an m × n matrix. Cramer’s rule gives a formula for the solutions. x.15 Suppose LINEAR ALGEBRA et 0 A (t) = 0 Find A (t) −1 0 0 cos t sin t − sin t cos t . . xn ) . ∗ · · · y1 · · · ∗ 1 . The row rank of a matrix is the smallest number. cj such that as = j=1 cj aij . · · ·. n n xi = j=1 a−1 yj ij = j=1 1 cof (A)ji yj .7. . det (A) ∗ · · · yn · · · ∗ where here the ith column of A is replaced with the column vector. The column rank of a matrix is the smallest number. det (A) By the formula for the expansion of a determinant along a column. r such that every row is a linear combination of some r rows. . it follows that if A−1 exists.7. The determinant rank of the matrix equals r where r is the largest number such that some r × r submatrix of A has a non zero determinant. yn ) for x = (x1 . as of a matrix. This formula is known as Cramer’s rule. and the determinant of this modiﬁed matrix is taken and divided by det (A).17 A submatrix of a matrix A is a rectangular array of numbers obtained by deleting some rows and columns of A.444 Example 19.16 Suppose A is an n × n matrix and it is desired to solve the system T T Ax = y. Using this formula. ···. Ax = y. · · ·. to a system of equations. A given row. Procedure 19. Then Cramer’s rule says xi = det Ai det A T T where Ai is obtained from A by replacing the ith column of A with the column (y1 .7. First note det (A (t)) = et . there is a formula for A−1 given above. Now in the case that A−1 exists. . Deﬁnition 19. A is a linear combination of rows r ai1 . . Ax = y for x. . In case you are solving a system of equations. . . air if there are scalars. cos t 0 cos t sin t This formula for the inverse also implies a famous procedure known as Cramer’s rule. (y1 · · · ·. . such that every column is a linear combination of some r columns. yn ) .

A is one to one. give a counter example. This amazing result is proved in the next section but is stated here. explain why it is so and if it is not so. Thus the inverse of an orthogonal matrix is just its transpose.19.8. 1. (The answer is −2. 19. The following theorem is of fundamental importance and ties together many of the ideas presented above.) 3 −9 3 1 2 3 2 1 3 2 3 (c) 4 1 5 0 . 5. Theorem 19.8 Exercises 1. There also exist r columns such that every column is a linear combination of these r columns.18 If A has determinant rank. r. det (A) = 0. It is a remarkable fact that for n × n matrices. EXERCISES 445 The following theorem is proved in the section on the theory of the determinant.20 Let A be an n × n matrix. 3. Find the determinants of the following matrices. Theorem 19. Then the following are equivalent. 4. A is onto. 1 2 3 (a) 3 2 2 (The answer is 31.) 1 2 1 2 2. one to one is equivalent to onto. If A−1 exist. Explain your answer. Show det (A) = 0. A matrix is said to be orthogonal if AT A = I. Then A is one to one if and only if A is onto. It says the row rank and column rank are both no larger than the determinant rank. then there exist r rows of the matrix such that every other row is a linear combination of these r rows.19 Let A be an n × n matrix. Is it true that det (A + B) = det (A) + det (B)? If this is so.) 0 9 8 4 3 2 (b) 1 7 8 (The answer is 375. 2. what is the relationship between det (A) and det A−1 .7. it is easy to see they are all equal once the theorem has been proved. Corollary 19. It is proved in the next section. Let A be an r × r matrix and suppose there are r − 1 rows (columns) such that all rows (columns) are linear combinations of these r − 1 rows (columns).7. In fact.7. . What are the possible values of det (A) if A is an orthogonal matrix? 3.

Is it true that A−1 will also be upper triangular? Explain. 10.7.446 LINEAR ALGEBRA 6. Let A be an n × n matrix and let x be a nonzero vector such that Ax = λx for some scalar.7. In the context of Problem 9 show that if A ∼ B. 7. t e 0 0 . 11. Now use Theorem 19. Let A and B be two n × n matrices. 13. Show also that A ∼ A and that if A ∼ B and B ∼ C. Verify a (t) c (t) b (t) d (t) + det a (t) c (t) b (t) d (t) . S such that A = S −1 BS.18 showed that determinant rank is at least as large as column and row rank. Suppose A is an upper triangular matrix. g (t) h (t) i (t) Conjecture a general result valid for n × n matrices and explain why it will be true. Can a similar thing be done with the columns? 14. the vector. g (t) h (t) i (t) c (t) f (t) i (t) Use Laplace expansion and the ﬁrst part to verify F (t) = a (t) b (t) c (t) a (t) b (t) det d (t) e (t) f (t) + det d (t) e (t) g (t) h (t) i (t) g (t) h (t) a (t) b (t) c (t) + d (t) e (t) f (t) . Only certain ones are. then A ∼ C.7. Show using Theorem 19. Show det (aA) = an det (A) where here A is an n × n matrix and a is a scalar. Show that A−1 exists if and only if all elements of the main diagonal are non zero. Use the formula for the inverse in terms of the cofactor matrix to ﬁnd the inverse of the matrix. What sort of equation is this? How many solutions does it have? 12. λ is called an eigenvalue. Is everything the same for lower triangular matrices? 9. F (t) = det Now suppose a (t) b (t) c (t) F (t) = det d (t) e (t) f (t) .19 to argue det (λI − A) = 0. then B ∼ A. Show they are all equal. Let F (t) = det a (t) c (t) b (t) d (t) . then (λI − A) x = 0.19 there exists x = 0 such that (λI − A) x = 0. λ. x is called an eigenvector and the scalar. Show that if A ∼ B. When this occurs. Explain why this shows that (λI − A) is not one to one and not onto. It turns out that not every number is an eigenvalue. et cos t et sin t A= 0 t t t 0 e cos t − e sin t e cos t + et sin t . ↑Theorem 19. Why? Hint: Show that if Ax = λx. then det (A) = det (B) . A ∼ B (A is similar to B) means there exists an invertible matrix. Suppose det (λI − A) = 0. 8.

in } = {1.11. Let A be an r × r matrix and let B be an m × m matrix such that r + m = n. in+1 ) .40) (19. where the D is an m × r matrix. Also. · · ·. If there are any repeated numbers in (i1 . in ) . The following Lemma will be essential in the deﬁnition of the determinant. the value of the function is multiplied by −1. Thus. (i1 . 2) = 1 and sgn2 (2. sgnn (1. iθ+1 . · · ·. In the case where n = 2. · · ·.9. THE MATHEMATICAL THEORY OF DETERMINANTS 447 15. · · ·. (i1 . · · ·. iθ−1 . 2) = sgn2 (1. it is necessary to show the existence of such a function. · · ·. n} . · · ·. sgnn+1 will be deﬁned in terms of sgnn . n + 1. or −1 which also has the following properties. in ) be an ordered list of numbers from {1. and the 0 is a r × m matrix. 3) and (2.41) In words. iθ+1 . 1) = 0 and verify it works. · · ·. q. · · ·.9. in+1 ) ≡ 0. iθ−1 . If there are no repeats.42) it must be that sgnn+1 (i1 . 2. · · ·. · · ·. in+1 ) ≡ . 1. iθ−1 . n} to one of the three numbers. · · ·. Let sgn2 (1. · · ·. the second property states that if two of the numbers are switched. Proof: To begin with. Consider the following n × n block matrix C= A D 0 B . iθ−1 . sgnn (i1 . sgnn+1 (i1 . 0. · · ·. D Im See Corollary 19. Let (i1 . · · ·. it is also clearly true. This is clearly true if n = 1. p. in ) = − sgnn (i1 . 1. This means the order is important so (1. n + 1. Let θ be the position of the number n + 1 in the list. 19. in ) (19.42) where n = iθ in the ordered list. sgnn which maps each list of numbers from {1.1 There exists a unique function. in+1 ) . in ) (19. 1) = 0 while sgn2 (2. · · ·. · · ·. Now explain why det (C) = det (A) det (B) . Deﬁne sgn1 (1) ≡ 1 and observe that it works. · · ·. in ) . n) = 1 sgnn (i1 . · · ·. q. Letting Ik denote the k × k identity matrix. · · ·. tell why C= A D 0 Im Ir 0 0 B . · · ·. n. · · ·. There will be some repetition between this section and the previous section for the convenience of the reader.7.19. iθ+1 . · · ·.9 The Mathematical Theory Of Determinants It is easiest to give a diﬀerent deﬁnition of the determinant which is clearly well deﬁned and then prove the earlier one in terms of Laplace expansion. No switching is possible. n} so that every number from {1. in the case where n > 1 and {i1 . 3) are diﬀerent. Lemma 19. n} appears in the ordered list. · · ·. · · ·. Hint: Part of this will require an explanation of why A 0 det = det (A) . p. the list is of the form (i1 . in ) ≡ (−1) n−θ sgnn−1 (i1 . iθ+1 . Assuming such a function exists for n. From (19. then n + 1 appears somewhere in the ordered list.

· · ·. · · ·. n + 1. · · ·. then it is obvious (19. It is necessary to verify this satisﬁes (19.44). n + 1. · · ·. By induction. p. sgnn i1 . · · ·. n + 1. · · ·. · · ·. q. q. a switch of p and q introduces a minus sign in the result. If there are repeated numbers in (i1 . · · ·. · · ·. Similarly. in+1 r s−1 and so. · · ·.43) to the rth position in (19. q. in+1 (−1) n+1−θ = sgnn i1 . It remains to verify (19. in+1 ) . n + 1. q. · · ·. · · ·. The ﬁrst of these is obviously true because sgnn+1 (i1 . q. in+1 . p.41) in the case where there are no numbers repeated in (i1 . Then sgnn+1 i1 . · · ·. in+1 r s−1 (−1) s−1−r sgnn i1 . · · ·. · · ·. q. in+1 2s−1 sgnn i1 . · · ·. in+1 = (−1) n+1−s sgnn i1 . · · ·. in+1 = (−1) Therefore. · · ·. · · ·. · · ·. sgnn+1 i1 . in+1 . · · ·. · · ·. q. q .41) holds because both sides would equal zero from the above deﬁnition. in+1 = (−1) r (−1) n+1−s sgnn i1 . sgnn+1 i1 .44) By making s − 1 − r switches. · · ·. in+1 = (−1) while n+1−r r s sgnn i1 .40) and (19. · · ·. · · ·. p. q is in the sth position. p is in the rth position and the s above the q indicates that the number. move the q which is in the s − 1th position in (19. in+1 (−1) while r n+1−θ r θ s r s ≡ sgnn i1 . q. r (19. n + 1. · · ·. Suppose ﬁrst that r < θ < s. in+1 θ s r s−1 sgnn+1 i1 . · · ·. · · ·. · · ·. · · ·. q . by induction. q.41) holds. in+1 ) . if θ > s or if θ < r it also follows that (19. in+1 r s s−1 (19. · · ·. · · ·. q. · · ·. · · ·.41) with n replaced with n + 1. · · ·. in+1 r . p . · · ·. · · ·. · · ·. · · ·. iθ−1 . in+1 = (−1) = (−1) = (−1) n+s n+1−r r s n+1−r sgnn i1 . The interesting case is when θ = r or θ = s. · · ·. Consider sgnn+1 i1 . · · ·. where the r above the p indicates the number. · · ·. n + 1) ≡ (−1) n+1−(n+1) sgnn (i1 . · · ·. in+1 . each of these switches introduces a factor of −1 and so r s−1 s−1−r sgnn i1 . iθ+1 . · · ·. n. in+1 ) . q. n) = 1. · · ·.448 (−1) n+1−θ LINEAR ALGEBRA sgnn (i1 .43) sgnn+1 i1 . p. · · ·. q . · · ·. q. Consider the case where θ = r and note the other case is entirely similar. q . · · ·.

rn ) be an ordered list of numbers from {1. sgn (k1 .···. kn ) = 0 and so that term contributes 0 to the sum. · · ·. rn ) det (A) = (k1 . 1) + f (1. · · ·. THE MATHEMATICAL THEORY OF DETERMINANTS = − sgnn+1 i1 . n) and applying the same sequence of switches. n) = A. kn ) of numbers of {1. · · ·. kn ) ar1 k1 · · · arn kn (19. kn ) a1k1 · · · ankn where the sum is taken over all ordered lists of numbers from {1. in ) .kn ) sgn (k1 . If there exist two functions. Then sgn (r1 . A = (aij ) and let (r1 . · · ·.41). · · ·.···. 2) + f (2. · · ·.···. A. · · ·.41) gives both functions are equal to zero for that ordered list.9.···. n}. Let A be an n × n matrix. n} .3 Let (aij ) = A denote an n × n matrix.9. Let A (r1 .kn ) sgn (k1 .2 Let f be a real valued function which has the set of ordered lists of numbers from {1. The determinant of A. 2) . rn ) denote an ordered list of n numbers from {1. · · ·. n}. · · ·. For example. n} as its domain. · · ·. then (19. f (k1 .45) and A (1. · · ·. Deﬁnition 19. Note it suﬃces to take the sum over only those ordered lists in which there are no repeats because if there are. If any numbers are repeated.47) = det (A (r1 .19. you could start with f (1. rn )) = (k1 . n}. in ) = g (i1 . denoted by det (A) is deﬁned by det (A) ≡ (k1 . · · ·.kn ) to be the sum of all the f (k1 · · · kn ) for all possible choices of ordered lists (k1 . (k1 . To see this function is unique. · · ·. · · ·.46) (19. · · ·.4 Let (r1 . f and g both satisfying (19. · · ·. r s 449 This proves the existence of the desired function. note that you can obtain any ordered list of distinct numbers from a sequence of switches. rn )) .9.kn ) sgn (k1 . · · ·. Proposition 19. · · ·. q. rn ) denote the matrix whose k th row is the rk row of the matrix. k2 ) = f (1.40) and (19. · · ·. · · ·. . n + 1. · · ·. · · ·. Thus det (A (r1 . 1) + f (2. in+1 . · · ·. This proves the lemma. Deﬁne f (k1 · · · kn ) (k1 . eventually arrive at f (i1 . In what follows sgn will often be used rather than sgnn because the context supplies the appropriate n. n) = g (1.k2 ) Deﬁnition 19. kn ) ar1 k1 · · · arn kn (19.9. · · ·.

6 The following formula for det (A) is valid. ks = − det (A (1. However. · · ·. det (A) = 1 · n! (19. · · ·. kr . rn ) = 0 so the formula holds in this case also.···. kr . · · ·. · · ·. n)) = − det (A) Now letting A (1. · · ·. n)) = − det (A (1. rn )) = (−1) det (A) where it took p switches to obtain(r1 . det (A (1. r. ij aT = aji . · · ·. · · ·. · · ·. To see this. ks . rn ) from (1. it is possible to give a more symmetric description of the determinant from which it will follow that det (A) = det AT . · · ·s. · · ·. p det (A (r1 . s. n} . · · ·. switching pairs of numbers. rn )) = (−1) det (A) = sgn (r1 . consider n slots placed in order. this equals = (k1 . For each of these choices.450 Proof: Let (1. kn a1k1 · · · askr · · · arks · · · ankn (19. · · ·. · · ·. · · ·.kn ) And also det AT = det (A) where AT is the transpose of A. · · ·. Then for each of these ways there are n − 2 choices left for the third slot. · · ·. and continuing in this way. there are n − 1 choices for the second.···. r. ks . · · ·. n) so r < s. · · ·. · · ·. · · ·. · · ·.48) -(19. · · ·.48) sgn (k1 . Thus there are n (n − 1) ways to ﬁll the ﬁrst two slots. r. kn ) a1k1 · · · arks · · · askr · · · ankn These got switched .1. n). With the above. s. s. By Lemma 19. kn ) ar1 k1 · · · arn kn . (r1 .···. · · ·.9. · · ·. · · ·.9. there are n! ordered lists of distinct numbers from {1. n)) = LINEAR ALGEBRA (19.49) = (k1 .50) p sgn (r1 . Corollary 19. det (A (1.kn ) and renaming the variables. rn ) sgn (k1 .···. Observation 19. · · ·. · · ·.5 There are n! ordered lists of distinct numbers from {1. Continuing this way. · · ·. if there is a repeat.kn ) − sgn k1 . then the reasoning of (19. · · ·. r. n} as stated in the observation. n)) . r. r. · · ·. kr and kr . · · ·. s. n) = (1. this implies det (A (r1 . kr . · · ·. s. Consequently. calling ks . rn ) = 0 and also sgn (r1 . n) play the role of A.rn ) (k1 . kn ) a1k1 · · · arkr · · · asks · · · ankn . (Recall that for AT = aT . · · ·. · · ·.49) shows that A (r1 . (r1 . · · ·. · · ·. rn ). There are n choices for the ﬁrst slot. · · ·.···. · · ·. ks .9. · · ·. rn ) det (A) and proves the proposition in the case when there are no repeated numbers in the ordered list. say the rth row equals the sth row.kn ) sgn (k1 . (k1 . · · ·.) ij .

The same is true of columns because det AT = det (A) and the rows of AT are the columns of A. 1 If A has two equal columns or two equal rows. · · ·.kn ) This proves the corollary since the formula gives the same number for A as it does for AT .8 We say a vector. an ) and the ith row of A2 is (b1 . · · ·.kn ) 451 sgn (r1 . Corollary 19. · · ·. A. · · ·. rn ) where the ri are distinct. all other rows of A1 and A2 coinciding with those of A. bn ) .9.···.6 the same holds for columns because the columns of the matrix equal the rows of the transposed matrix. kn ) a1k1 · · · (xaki + ybki ) · · · ankn sgn (k1 . kn ) a1k1 · · · aki · · · ankn (k1 . The same is true with the word “row” replaced with the word “column”. · · ·. kn ) ar1 k1 · · · arn kn .kn ) =x +y (k1 .9. rn ) sgn (k1 . w. It remains to verify the last assertion.19. This is the same as saying w ∈ span {v1 . Summing over all ordered lists. (If the ri are not distinct. Suppose the ith row of A equals (xa1 + yb1 .7 If two rows or two columns in an n × n matrix.···. THE MATHEMATICAL THEORY OF DETERMINANTS Proof: From Proposition 19. vr } r if there exists scalars.···. · · ·. · · ·. xan + ybn ). Therefore. Thus if A1 is the matrix obtained from A by switching two columns.rn ) (k1 . kn ) ar1 k1 · · · arn kn . det (A) ≡ (k1 . · · ·. By Corollary 19.4 when two rows are switched. vr } .9.kn ) sgn (k1 . If A is an n×n matrix in which two rows are equal or two columns are equal then det (A) = 0. then switching them results in the same matrix.9. are switched. . rn ) = 0 and so there is no contribution to the sum.) n! det (A) = sgn (r1 . · · ·. (r1 . · · ·cr such that w = k=1 ck vk . det is a linear function of each row A. · · ·.9. if the ri are distinct.···. · · ·. · · ·.4. kn ) a1k1 · · · bki · · · ankn ≡ x det (A1 ) + y det (A2 ) . · · ·. the determinant of the resulting matrix equals (−1) times the determinant of the original matrix.kn ) sgn (k1 . In other words. det (A) = (k1 . is a linear combination of the vectors {v1 . Proof: By Proposition 19.9. rn ) sgn (k1 .···. det (A) = − det (A) and so det (A) = 0. c1 . The following corollary is also of great use. det (A) = det AT = − det AT = − det (A1 ) . sgn (r1 . Then det (A) = x det (A1 ) + y det (A2 ) where the ith row of A1 is (a1 . (r1 .···. Deﬁnition 19. the determinant of the resulting matrix is (−1) times the determinant of the original matrix. · · ·.

452 LINEAR ALGEBRA Corollary 19.9. Then by Proposition 19. kn ) c1k1 · · · cnkn (k1 . Recall the following deﬁnition of matrix multiplication.51) (19. Then det (A) = 0. AB = (cij ) where n cij ≡ k=1 aik bkj . det (A) = k=1 ck det a1 · · · ar ··· an−1 ak = 0.9 Suppose A is an n × n matrix and some column (row) is a linear combination of r other columns (rows).4.10 If A and B are n × n matrices.7 r a1 ··· ar ··· an−1 r k=1 ck ak . Lemma 19. Proof: Let A = a1 · · · an be the columns of A and suppose the condition that one column is a linear combination of r of the others is satisﬁed.12 Suppose a matrix is of the form M= or M= A 0 A ∗ ∗ a 0 a (19. Thus an = k=1 ck ak and so det (A) = det By Corollary 19. Deﬁnition 19. Proof: Let cij be the ij th entry of AB. kn ) br1 k1 · · · brn kn (a1r1 · · · anrn ) sgn (r1 · · · rn ) a1r1 · · · anrn det (B) = det (A) det (B) . Then by using Corollary 19.9. The case for rows follows from the fact that det (A) = det AT . A = (aij ) and B = (bij ). det (AB) = sgn (k1 .rn ) = This proves the theorem. · · ·.9.···. (r1 ···.7 we may rearrange the columns to have the nth column a linear combination of the r ﬁrst r columns.rn ) (k1 . One of the most important rules about determinants is that the determinant of a product equals the product of the determinants. This proves the corollary.kn ) sgn (k1 .52) .···. · · ·.···.kn ) = (k1 .9. kn ) r1 a1r1 br1 k1 ··· rn anrn brn kn = (r1 ···. Theorem 19.9.9.11 Let A and B be n × n matrices.kn ) sgn (k1 .9. · · ·. Then det (AB) = det (A) det (B) .

1. Thus in the ﬁrst case. In terms of the theory of determinants. .19. Deﬁnition 19. the only terms which survive are those for which θ = n or in other words. take the determinant of the (n − 1) × (n − 1) matrix which results. mnn = a and mni = 0 if i = n while in the second case. kn ) then using the earlier conventions used to prove Lemma 19.14 Let A be an n × n matrix where n ≥ 2.kn ) n−θ sgnn−1 k1 .9.51) use Corollary 19.9. From the deﬁnition of the determinant. the above expression reduces to a (k1 . The following is the main result. Then if kn = n. This proves the lemma. Theorem 19. the term involving mnkn in the above expression equals zero.···. ) and then multiply this number by (−1) . kn ) m1k1 · · · mnkn Letting θ denote the position of n in the ordered list. · · ·. Then a new matrix called the cofactor matrix. THE MATHEMATICAL THEORY OF DETERMINANTS 453 where a is a number and A is an (n − 1) × (n − 1) matrix and ∗ denotes either a column or a row having length n − 1 and the 0 denotes either a column or a row of length n − 1 consisting entirely of zeros.···. The following theorem proves this assertion. kθ+1 .6 and (19. Proof: Denote M by (mij ) . · · ·kn−1 ) m1k1 · · · m(n−1)kn−1 = a det (A) .52) to write det (M ) = det M T = det AT ∗ 0 a = a det AT = a det (A) . those for which kn = n. mnn = a and min = 0 if i = n. · · ·. To get the assertion in the situation of (19. kθ−1 .52). i+j (This is called the ij th minor of A. Then det (M ) = a det (A) . Therefore. Therefore.kn ) sgnn (k1 . cof (A) is deﬁned by cof (A) = (cij ) where to obtain cij delete the ith row and the j th column of A.9. Earlier this was given as a deﬁnition and the outrageous totally unjustiﬁed assertion was made that the same number would be obtained by expanding the determinant along any row or column.13 Let A = (aij ) be an n × n matrix.53) The ﬁrst formula consists of expanding the determinant along the ith row and the second expands the determinant along the j th column. det (M ) equals (−1) (k1 . · · ·. kn θ n−1 m1k1 · · · mnkn Now suppose (19. (k1 . This will follow from the above deﬁnition of a determinant.9. · · ·.9. det (M ) ≡ (k1 . Then n n det (A) = j=1 aij cof (A)ij = i=1 aij cof (A)ij . (19. To make th the formulas easier to remember. cof (A)ij will denote the ij entry of the cofactor matrix. arguably the most important idea is that of Laplace expansion along a row or a column.kn−1 ) sgnn−1 (k1 .···.

i=1 Now consider n air cof (A)ik det(A)−1 i=1 when k = r.9. · · ·. If det(A) = 0. if det (A) = 0. i+j Aij 0 n aij cof (A)ij which is the formula for expanding det (A) along the ith row. Proof: By Theorem 19. 0) . n det (A) = j=1 det (Bj ) Denote by Aij the (n − 1) × (n − 1) matrix obtained by deleting the ith row and the j th i+j column of A.15 A−1 exists if and only if det(A) = 0. Therefore. Then by Corollary 19. this results in multiplying the determinant of the old matrix by −1 to get the determinant of the new matrix. expanding this matrix along the k th column yields n th 0 = det (Bk ) det (A) −1 = i=1 air cof (A)ik det (A) −1 . Replace the k column with the rth column to obtain a matrix. Note that this gives an easy way to write a formula for the inverse of an n × n matrix. n air cof (A)ir det(A)−1 = det(A) det(A)−1 = 1.9. n det (A) = = det A n T = j=1 aT cof AT ij ij aji cof (A)ji j=1 which is the formula for expanding det (A) along the ith column.14 and letting (air ) = A.13 on Page 417. However.9. det (Bj ) = = Therefore. are switched.9. Let Bj be the matrix obtained from A by leaving every row the same except the ith row which in Bj equals (0.9. by Lemma 19.7. Recall the deﬁnition of the inverse of a matrix in Deﬁnition 19. recall that from Proposition 19. then A−1 = a−1 ij where a−1 = det(A)−1 cof (A)ji ij for cof (A)ij the ij th cofactor of A. 0. · · ·.4. ain ) be the ith row of A.12. Theorem 19. M. · · ·. Bk whose determinant equals zero by Corollary 19. Thus cof (A)ij ≡ (−1) det Aij . This proves the theorem.7.1. At this point. 0. aij .454 LINEAR ALGEBRA Proof: Let (ai1 . det (A) = j=1 (−1) (−1) n−j (−1) det n−i det ∗ aij Aij 0 ∗ aij = aij cof (A)ij . when two rows or two columns in a matrix.9. Also.

19.9. THE MATHEMATICAL THEORY OF DETERMINANTS Summarizing,

455

n

**air cof (A)ik det (A)
**

i=1

−1

= δ rk .

**Using the other formula in Theorem 19.9.14, and similar reasoning,
**

n

**arj cof (A)kj det (A)
**

j=1

−1

= δ rk

This proves that if det (A) = 0, then A−1 exists with A−1 = a−1 , where ij a−1 = cof (A)ji det (A) ij

−1

.

Now suppose A−1 exists. Then by Theorem 19.9.11, 1 = det (I) = det AA−1 = det (A) det A−1 so det (A) = 0. This proves the theorem. The next corollary points out that if an n × n matrix, A has a right or a left inverse, then it has an inverse. Corollary 19.9.16 Let A be an n × n matrix and suppose there exists an n × n matrix, B such that BA = I. Then A−1 exists and A−1 = B. Also, if there exists C an n × n matrix such that AC = I, then A−1 exists and A−1 = C. Proof: Since BA = I, Theorem 19.9.11 implies det B det A = 1 and so det A = 0. Therefore from Theorem 19.9.15, A−1 exists. Therefore, A−1 = (BA) A−1 = B AA−1 = BI = B. The case where CA = I is handled similarly. The conclusion of this corollary is that left inverses, right inverses and inverses are all the same in the context of n × n matrices. Theorem 19.9.15 says that to ﬁnd the inverse, take the transpose of the cofactor matrix and divide by the determinant. The transpose of the cofactor matrix is called the adjugate or sometimes the classical adjoint of the matrix A. It is an abomination to call it the adjoint although you do sometimes see it referred to in this way. In words, A−1 is equal to one over the determinant of A times the adjugate matrix of A. In case you are solving a system of equations, Ax = y for x, it follows that if A−1 exists, x = A−1 A x = A−1 (Ax) = A−1 y thus solving the system. Now in the case that A−1 exists, there is a formula for A−1 given above. Using this formula,

n n

xi =

j=1

a−1 yj = ij

j=1

1 cof (A)ji yj . det (A)

By the formula for the expansion of a determinant along a column, ∗ · · · y1 · · · ∗ 1 . . . , . . det . xi = . . . det (A) ∗ · · · yn · · · ∗

456

LINEAR ALGEBRA

T

where here the ith column of A is replaced with the column vector, (y1 · · · ·, yn ) , and the determinant of this modiﬁed matrix is taken and divided by det (A). This formula is known as Cramer’s rule. Deﬁnition 19.9.17 A matrix M , is upper triangular if Mij = 0 whenever i > j. Thus such a matrix equals zero below the main diagonal, the entries of the form Mii as shown. ∗ ∗ ··· ∗ . .. 0 ∗ . . . . . . .. ... ∗ . 0 ··· 0 ∗ A lower triangular matrix is deﬁned similarly as a matrix for which all entries above the main diagonal are equal to zero. With this deﬁnition, here is a simple corollary of Theorem 19.9.14. Corollary 19.9.18 Let M be an upper (lower) triangular matrix. Then det (M ) is obtained by taking the product of the entries on the main diagonal. Recall the following deﬁnition of rank presented earlier. Deﬁnition 19.9.19 A submatrix of a matrix A is a rectangular array of numbers obtained by deleting some rows and columns of A. Let A be an m × n matrix. The determinant rank of the matrix equals r where r is the largest number such that some r × r submatrix of A has a non zero determinant. A given row, as of a matrix, A is a linear combination of rows r ai1 , ···, air if there are scalars, cj such that as = j=1 cj aij . The row rank of a matrix is the smallest number, r such that every row is a linear combination of some r rows. The column rank of a matrix is the smallest number, r, such that every column is a linear combination of some r columns. Theorem 19.9.20 The determinant rank of A coincides with the row rank. Proof: Suppose the determinant rank of A = (aij ) equals r. If rows and columns are interchanged, the determinant rank of the modiﬁed matrix is unchanged. Thus rows and columns can be interchanged to produce an r × r matrix in the upper left corner of the matrix which has non zero determinant. Now consider the r + 1 × r + 1 matrix, M, a11 · · · a1r a1p . . . . . . . . . ar1 · · · arr arp al1 · · · alr alp where C will denote the r×r matrix in the upper left corner which has non zero determinant. I claim det (M ) = 0. There are two cases to consider in verifying this claim. First, suppose p > r. Then the claim follows from the assumption that A has determinant rank r. On the other hand, if p < r, then the determinant is zero because there are two identical columns. Expand the determinant along the last column and divide by det (C) to obtain

r

alp = −

i=1

cof (M )ip det (C)

aip .

19.9. THE MATHEMATICAL THEORY OF DETERMINANTS

457

**Now note that cof (M )ip does not depend on p. Therefore the above sum is of the form
**

r

alp =

i=1

mi aip

which shows the lth row is a linear combination of the ﬁrst r rows of A. Since l is arbitrary, this proves that every row is a linear combination of the ﬁrst r rows. Now suppose for some k < r, there exist k rows such that every row is a linear combination of these k rows. Then the matrix, C in the upper left corner must have determinant equal to zero because this matrix would be of the form k 1 T i=1 ci vi . . .

k r T i=1 ci vi T where the rows are vi matrix is of the form k i=1

**. By Corollary 19.9.7 on Page 451 the determinant of the above T vi1 . 1 2 k ci1 ci2 · · · cik det . .
**

i1 ,···,ik T vik

where each matrix in the above sum has r rows. But there are only k choices for vectors to ﬁll these rows and so there must be repeated rows resulting in the above sum equaling zero as claimed. Therefore, k ≥ r after all and so the row rank of A equals r. Corollary 19.9.21 The column rank of A coincides with the determinant rank of A. Proof: This follows from the above theorem by considering AT . The rows of AT are the columns of A and the determinant rank of AT and A are the same by Corollary 19.9.6 on Page 450. The following theorem is of fundamental importance and ties together many of the ideas presented above. Theorem 19.9.22 Let A be an n × n matrix. Then the following are equivalent. 1. det (A) = 0. 2. A, AT are not one to one. 3. A is not onto. Proof: Suppose det (A) = 0. Then the determinant rank of A = r < n. Therefore, there exist r columns such that every other column is a linear combination of these columns by Corollary 19.9.21. In particular, it follows that for some m, the mth column is a linear combination of all the others. Thus letting A = a1 · · · am · · · an where the columns are denoted by ai , there exists scalars, αi such that am =

k=m

α k ak . ··· −1 ··· αn

T

Now consider the column vector, x ≡

α1

. Then

Ax = −am +

k=m

αk ak = 0.

458

LINEAR ALGEBRA

Since also A0 = 0, it follows A is not one to one. Similarly, AT is not one to one by the same argument applied to AT . This veriﬁes that 1.) implies 2.). Now suppose 2.). Then since AT is not one to one, it follows there exists x = 0 such that AT x = 0. Taking the transpose of both sides yields xT A = 0 where the 0 is a 1 × n matrix or row vector. Now if Ay = x, then |x| = xT (Ay) = xT A y = 0y = 0 contrary to x = 0. Consequently there can be no y such that Ay = x and so A is not onto. This shows that 2.) implies 3.). Finally, suppose 3.). If 1.) does not hold, then det (A) = 0 but then from Theorem 19.9.15 on Page 454 A−1 exists and so for every y ∈ Rn there exists a unique x ∈ Rn such that Ax = y. In fact x = A−1 y. Thus A would be onto contrary to 3.). This shows 3.) implies 1.) and proves the theorem. Corollary 19.9.23 Let A be an n × n matrix. Then the following are equivalent. 1. det(A) = 0. 2. A and AT are one to one. 3. A is onto. Proof: This follows immediately from the above theorem.

2

19.10

Exercises

1. Let m < n and let A be an m × n matrix. Show that A is not one to one. Hint: Consider the n × n matrix, A1 which is of the form A1 ≡ A 0

where the 0 denotes an (n − m) × n matrix of zeros. Thus det A1 = 0 and so A1 is not one to one. Now observe that A1 x is the vector, A1 x = which equals zero if and only if Ax = 0. Ax 0

19.11

The Determinant And Volume

The determinant is the essential algebraic tool which provides a way to give a uniﬁed treatment of the concept of volume and it is this which is the most signiﬁcant application of determinants.

19.11.

THE DETERMINANT AND VOLUME

459

Lemma 19.11.1 Suppose A is an m × n matrix where m > n. Then A does not map Rn onto Rm . Proof: Suppose A did map Rn onto Rm . Then consider the m × m matrix, A1 ≡ A 0 where 0 refers to an n × (m − n) matrix. Thus A1 cannot be onto Rm because it has at least one column of zeros and so its determinant equals zero. However, if y ∈ Rm and A is onto, then there exists x ∈ Rn such that Ax = y. Then A1 x 0 = Ax + 0 = Ax = y.

Since y was arbitrary, it follows A1 would have to be onto. The following proposition is a special case of a fundamental linear algebra result sometimes called the exchange theorem. Proposition 19.11.2 Suppose {v1 , · · ·, vp } are vectors in Rn and span {v1 , · · ·, vp } = Rn . Then p ≥ n. Proof: Deﬁne a linear transformation from Rp to Rn as follows.

p

Ax ≡

i=1

x i vi .

(Why is this a linear transformation?) Thus A (Rp ) = span {v1 , · · ·, vp } = Rn . Then from the above lemma, p ≥ n since if this is not so, A could not be onto. Proposition 19.11.3 Suppose {v1 , · · ·, vp } are vectors in Rn such that p < n. Then there exist at least n − p vectors, {wp+1 , · · ·, wn } such that wi · wj = δ ij and wk · vj = 0 for every j = 1, · · ·, vp . Proof: Let A : Rp → span {v1 , · · ·, vp } be deﬁned as in the above proposition so that A (Rp ) = span {v1 , · · ·, vp } . Since p < n there exists zp+1 ∈ A (Rp ) . Then by Lemma 19.3.8 / on Page 423 there exists xp+1 such that (zp+1 − xp+1 , Ay) = 0 for all y ∈ R . Let wp+1 ≡ (zp+1 − xp+1 ) / |zp+1 − xp+1 | . Now if p + 1 = n, stop. {wp+1 } is the desired list of vectors. Otherwise, do for span {v1 , · · ·, vp , wp+1 } what was done for span {v1 , · · ·, vp } using Rp+1 instead of Rp and obtain wp+2 in this way such that wp+2 · wp+1 = 0 and wp+2 · vk = 0 for all k. Continue till a list of n − p vectors have been found. Recall the geometric deﬁnition of the cross product of two vectors found on Page 322. As explained there, the magnitude of the cross product of two vectors was the area of the parallelogram determined by the two vectors. There was also a coordinate description of the cross product. In terms of the notation of Proposition 13.6.4 on Page 331 the ith coordinate of the cross product is given by εijk uj vk where the two vectors are (u1 , u2 , u3 ) and (v1 , v2 , v3 ) . Therefore, using the reduction identity of Lemma 13.6.3 on Page 331 |u × v|

2 p

= εijk uj vk εirs ur vs = (δ jr δ ks − δ kr δ js ) uj vk ur vs = uj vk uj vk − uj vk uk vj = (u · u) (v · v) − (u · v)

2

460 which equals det u·u u·v u·v v·v .

LINEAR ALGEBRA

Now recall the box product and how the box product was ± the volume of the parallelepiped spanned by the three vectors. From the deﬁnition of the box product u×v·w = i j k u1 u2 u3 · (w1 i + w2 j + w3 k) v1 v2 v3 w1 w2 w3 = det u1 u2 u3 . v1 v2 v3 w2 u2 v2 2 w3 u3 v3

Therefore,

w1 2 |u × v · w| = det u1 v1

which from the theory of determinants equals u1 u2 u3 u1 v1 w1 det v1 v2 v3 det u2 v2 w2 = w1 w2 w3 u3 v3 w3 u1 u2 u3 u1 v1 w1 det v1 v2 v3 u2 v2 w2 = w1 w2 w3 u3 v3 w3 u1 v1 + u2 v2 + u3 v3 u1 w1 + u2 w2 + u3 w3 u2 + u2 + u2 3 2 1 2 2 2 v1 w1 + v2 w2 + v3 w3 v1 + v2 + v3 det u1 v1 + u2 v2 + u3 v3 2 2 2 u1 w1 + u2 w2 + u3 w3 v1 w1 + v2 w2 + v3 w3 w1 + w2 + w3 u·u u·v u·w = det u · v v · v v · w u·w v·w w·w You see there is a deﬁnite pattern emerging here. These earlier cases were for a parallelepiped determined by either two or three vectors in R3 . It makes sense to speak of a parallelepiped in any number of dimensions. Deﬁnition 19.11.4 Let u1 , · · ·, up be vectors in Rk . The parallelepiped determined by these vectors will be denoted by P (u1 , · · ·, up ) and it is deﬁned as p P (u1 , · · ·, up ) ≡ sj uj : sj ∈ [0, 1] .

j=1

**The volume of this parallelepiped is deﬁned as volume of P (u1 , · · ·, up ) ≡ (det (ui · uj ))
**

1/2

.

In this deﬁnition, ui ·uj is the ij th entry of a p×p matrix. Note this deﬁnition agrees with all earlier notions of area and volume for parallelepipeds and it makes sense in any number of dimensions. However, it is important to verify the above determinant is nonnegative. After all, the above deﬁnition requires a square root of this determinant.

19.11.

THE DETERMINANT AND VOLUME

461

Lemma 19.11.5 Let u1 , · · ·, up be vectors in Rk for some k. Then det (ui · uj ) ≥ 0. Proof: Recall v · w = vT w. Therefore, in terms of matrix multiplication, the matrix (ui · uj ) is just the following

p×k

uT 1 . . u1 . uT p which is of the form U T U.

k×p

···

up

First show det U T U ≥ 0. By Proposition 19.11.3 there are vectors, wp+1 , · · ·, wk such that wi · wj = δ ij and for all i = 1, · · ·, p, and l = p + 1, · · ·, k, wl · ui = 0. Then consider U1 = (u1 , · · ·, up , wp+1 , · · ·, wk ) ≡ where W T W = I. Then

T U1 U1 =

U

W

UT WT

U

W

=

UT U 0

0 I

. (Why?)

Now using the cofactor expansion method, this last k × k matrix has determinant equal T T to det U T U (Why?) On the other hand this equals det U1 U1 = det (U1 ) det U1 = 2 det (U1 ) ≥ 0. In the case where k < p, U T U has the form W W T where W = U T has more rows than columns. Thus you can deﬁne the p × p matrix, W1 ≡ and in this case,

T 0 = det W1 W1 = det

W

0

,

W

0

WT 0

= det W W T = det U T U.

This proves the lemma and shows the deﬁnition of volume is well deﬁned. Note it gives the right answer in the case where all the vectors are perpendicular. Here is why. Suppose {u1 , · · ·, up } are vectors which have the property that ui · uj = 0 if i = j. Thus P (u1 , · · ·, up ) is a box which has all p sides perpendicular. What should its volume be? Shouldn’t it equal the product of the lengths of the sides? What does det (ui · uj ) give? The matrix (ui · uj ) is a diagonal matrix having the squares of the magnitudes of the sides 1/2 down the diagonal. Therefore, det (ui · uj ) equals the product of the lengths of the sides as it should. The matrix, (ui · uj ) whose determinant gives the square of the volume of the parallelepiped spanned by the vectors, {u1 , · · ·, up } is called the Grammian matrix and sometimes the metric tensor. These considerations are of great signiﬁcance because they allow the computation in a systematic manner of k dimensional volumes of parallelepipeds which happen to be in Rn for n = k. Think for example of a plane in R3 and the problem of ﬁnding the area of something on this plane. Example 19.11.6 Find the equation of the plane containing the three points, (1, 2, 3) , (0, 2, 1) , and (3, 1, 0) .

462

LINEAR ALGEBRA

These three points determine two vectors, the one from (0, 2, 1) to (1, 2, 3) , i + 0j + 2k, and the one from (0, 2, 1) to (3, 1, 0) , 3i + (−1) j+ (−1) k. If (x, y, z) denotes a point in the plane, then the volume of the parallelepiped spanned by the vector from (0, 2, 1) to (x, y, z) and these other two vectors must be zero. Thus x y−2 z−1 −1 −1 = 0 det 3 1 0 2 Therefore, −2x − 7y + 13 + z = 0 is the equation of the plane. You should check it contains all three points.

19.12

Exercises

T T T

1. Here are three vectors in R4 : (1, 2, 0, 3) , (2, 1, −3, 2) , (0, 0, 1, 2) . Find the volume of the parallelepiped determined by these three vectors. 2. Here are two vectors in R4 : (1, 2, 0, 3) , (2, 1, −3, 2) . Find the volume of the parallelepiped determined by these two vectors. 3. Here are three vectors in R2 : (1, 2) , (2, 1) , (0, 1) . Find the volume of the parallelepiped determined by these three vectors. Recall that from the above theorem, this should equal 0. 4. If there are n + 1 or more vectors in Rn , Lemma 19.11.5 implies the parallelepiped determined by these n + 1 vectors must have zero volume. What is the geometric signiﬁcance of this assertion? 5. Find the equation of the plane through the three points (1, 2, 3) , (2, −3, 1) , (1, 1, 7) .

T T T T T

19.13

Linear Systems Of Ordinary Diﬀerential Equations

Another important application of linear algebra is to linear systems of ordinary diﬀerential equations. Consider for t ∈ [a, b] the system x = A (t) x (t) + g (t) , x (c) = x0 , (19.54)

where c ∈ [a, b] , A (t) is an n × n matrix whose entries are continuous functions of t, (aij (t)) and g (t) is a vector whose components are continuous functions of t. It turns out that this system satisﬁes the conditions of Theorem 15.7.6 on Page 370 with f (t, x) ≡ A (t) x + g (t) . Lemma 19.13.1 The system, (19.54) satisﬁes the hypotheses of Theorem 15.7.6 on Page 370 with f (t, x) ≡ A (t) x + g (t) . Proof: First note that if the ai are nonnegative numbers, then by the Cauchy Schwarz inequality,

n n n 1/2 n 1/2

ai

i=1

=

i=1

ai 1 ≤

n

a2 i

i=1 1/2 i=1

1

=

n1/2

i=1

a2 i

19.13.

LINEAR SYSTEMS OF ORDINARY DIFFERENTIAL EQUATIONS

463

**so squaring both sides,
**

n

2

n

ai

i=1 T

≤n

i=1

a2 . i

T

Now let x = (x1 , · · ·, xn ) and x1 = (x11 , · · ·x1n ) . Then letting M ≡ max {|aij (t)| : t ∈ [a, b] , i, j ≤ n} , it follows from the above inequality |f (t, x) − f (t, x1 )| = |A (t) (x − x1 )| =

n n 2

1/2

** aij (t) (xj − x1j ) 2 1/2 |xj − x1j | 1/2
**

2 |xj − x1j | j=1

i=1 j=1

≤ M ≤ M

i=1 n i=1 n

n j=1 n

n

= Mn

n j=1

1/2

2 |xj − x1j |

= M n |x − x1 | .

Therefore, let K = M n. This lemma yields the following extremely important theorem. Theorem 19.13.2 Let A (t) be a continuous n × n matrix and let g (t) be a continuous vector for t ∈ [a, b] and let c ∈ [a, b] and x0 ∈ Rn . Then there exists a unique solution to (19.54) valid for t ∈ [a, b] . This theorem includes more examples of linear systems of equations than one typically encounters in any beginning course on diﬀerential equations. With this and the theory of determinants and matrices, some fundamental theorems about the nature of solutions to linear systems of equations become possible. Deﬁnition 19.13.3 Let Φ (t) = (x1 (t) , · · ·, xn (t)) be an n × n matrix in which the ith column is the vector, xi (t) . Then Φ (t) ≡ (x1 (t) , · · ·, xn (t)) . Φ (t) is called a fundamental matrix for the equation x (t) = A (t) x (t) , on the interval [a, b] if for each i = 1, 2, · · ·, n, xi (t) = A (t) xi (t) , and for all t ∈ [a, b] , Φ (t)

−1

(19.55)

exists.

464

LINEAR ALGEBRA

A fundamental matrix consists of a matrix whose columns are solutions of (19.55). The following lemma is an amazing result about matrices of this form. Lemma 19.13.4 Let Φ (t) be a matrix which has the property that its columns are solutions of (19.55) where A (t) is a continuous n × n matrix deﬁned on an interval, [a, b] . Then if c is any constant vector, it follows that Φ (t) c is a solution of (19.55). Furthermore, this gives all possible solutions to (19.55) if and only if det (Φ (t0 )) = 0 for some t0 ∈ [a, b] . (When all solutions to (19.55) are obtained in the form Φ (t) c, Φ (t) c is called the general solution.) Finally, det (Φ (t)) either equals zero for all t ∈ [a, b] or det (Φ (t)) is never equal to zero for any t ∈ [a, b] . Proof: Let Φ (t) = (x1 (t) , · · ·, xn (t)) . Then Φ (t) c =

n n n i=1 ci xi

(t) and so

(Φ (t) c)

=

i=1 n

ci xi (t)

=

i=1

ci xi (t)

n

=

i=1

ci A (t) xi (t) = A (t)

i=1

ci xi (t)

= A (t) (Φ (t) c) .

This proves the ﬁrst part, also called the principle of superposition. Suppose now that det (Φ (t0 )) = 0 for some t0 ∈ [a, b] and let z be a solution of (19.55). Choose a constant vector, c as follows. c ≡ Φ (t0 )

−1

z (t0 ) .

Then Φ (t0 ) c = z (t0 ) and deﬁning x (t) ≡ Φ (t) c, it follows x is a solution to (19.55) which has the property that at the single point, t0 , x (t0 ) = z (t0 ) . Therefore, by the uniqueness part of Theorem 19.13.2, x (t) = z (t) for all t ∈ [a, b] , showing that z (t) = Φ (t) c. Since z is an arbitrary solution to (19.55) this proves all solutions to the equation, (19.55), are obtained as Φ (t) c if det (Φ (t0 )) = 0 at some t0 ∈ [a, b] . Now it is shown that if at some point, t0 , det (Φ (t0 )) = 0 then you do not obtain all solutions in this form. Since det (Φ (t0 )) = 0, it follows from Theorem 19.9.22 on Page 457 that Φ (t0 ) cannot map Rn onto Rn and so there exists z0 ∈ Rn with the property that there is no solution, c, to the equation Φ (t0 ) c = z0 . From the existence part of Theorem 19.13.2, there does exist a solution, z, to the initial value problem z (t) = A (t) z (t) , z (t0 ) = z0 , and this solution cannot equal Φ (t) c for any choice of c because if it did, you could plug in t0 and ﬁnd that Φ (t0 ) c = z0 , a contradiction to how z0 was chosen. It remains to verify the last amazing assertion. This follows from the following picture which summarizes the above conclusions which have been proved at this point. The graph in the picture represents det (Φ (t)) and shows it is impossible to have this function equal to zero at some point and yet nonzero at another.

19.13.

LINEAR SYSTEMS OF ORDINARY DIFFERENTIAL EQUATIONS

465

f Φ(t)c is general solution a f t s db d Φ(t)c not general solution

Either Φ (t) c is the general solution or it is not. Therefore, the graph of det (Φ (t)) cannot cross the t axis and this proves the lemma. The function, t → det (Φ (t)) is known as the Wronskian2 . and the above lemma is sometimes called the Wronskian alternative. Theorem 19.13.5 Let A (t) be a continuous n × n matrix and let c ∈ [a, b]. Then there exists a fundamental matrix for the equation, x (t) = A (t) x (t) , Φ (t) , which satisﬁes Φ (c) = I. Proof: From Theorem 19.13.2 there exists xi satisfying (19.55) and the initial condition, xi (c) = ei . Now let Φ (t) = (x1 (t) , · · ·, xn (t)) . Thus Φ (c) = I and so det (Φ (t)) is never −1 equal to zero on [a, b] by the above lemma. Therefore, for all t ∈ [a, b] , Φ (t) exists and consequently, Φ (t) is a fundamental matrix. This proves the theorem. The most important formula in the theory of diﬀerential equations is the variation of constants formula, sometimes called Green’s formula. This formula is concerned with the solution to the system, x = A (t) x (t) + g (t) , x (c) = x0 , (19.56) where c ∈ [a, b] , A (t) is an n × n matrix whose entries are continuous functions of t, (aij (t)) and g (t) is a vector whose components are continuous functions of t. By Theorem 19.13.2 there exists a unique solution to this initial value problem but that is all this theorem says. The variation of constants formula gives a representation of the solution to this problem in terms of the fundamental matrix for x = A (t) x. Look for a solution to (19.56) which T is of the form x (t) = Φ (t) v (t) , v (t) = (v1 (t) , · · ·, vn (t)) , where Φ (t) is a fundamental matrix which satisﬁes Φ (c) = I. Let Φ (t) = (x1 (t) , · · ·, xn (t)) . Then

n

Φ (t) v (t) =

i=1

vi (t) xi (t)

**and so by the product rule,
**

n

(Φ (t) v (t))

=

i=1

vi (t) xi (t) + vi xi (t) Φ (t) v (t) + Φ (t) v (t) .

=

**Therefore, x will be a solution to the diﬀerential equation in (19.56) if and only if
**

=A(t)Φ(t)v(t)

Φ (t) v (t) + Φ (t) v (t) = A (t) Φ (t) v (t) + g (t) .

2 This

is named after Wronski, a Polish mathematician who lived from 1778-1853.

466

LINEAR ALGEBRA

Since Φ (t) is a fundamental matrix, Φ (t) = A (t) Φ (t) and so this equation reduces to Φ (t) v (t) = g (t) . Now multiplying on both sides by Φ (t) which exists because Φ (t) is a fundamental matrix, it follows that x will be a solution to (19.56) if and only if v (t) = Φ (t)

−1 −1

g (t) .

−1

From the formula for the inverse of a matrix, Φ (t) has all continuous entries and so the right side of the above is continuous. Therefore, it makes sense to deﬁne

t

v (t) ≡

c

Φ (s)

−1

g (s) ds + x0

**and Φ (t) v (t) will be a solution to the diﬀerential equation of (19.56). Now consider
**

t

x (t) = Φ (t) v (t) = Φ (t) x0 + Φ (t)

c

Φ (s)

−1

g (s) ds.

(19.57)

From the above discussion, it solves the diﬀerential equation of (19.56). However, it also satisﬁes the initial condition because x (c) = Φ (c) x0 +Φ (c) 0 = Ix0 . Therefore, by Theorem 19.13.2 the formula given in (19.57) is the solution to the initial value problem (19.56). It is (19.57) which is called the variation of constants formula.

19.14

Exercises

1. The diﬀerential equation of undamped oscillation is y + ω 2 y = 0 where ω = 0. Deﬁne a new variable, z = y and show this diﬀerential equation may be written as the ﬁrst order system z 0 −ω 2 z = . y 1 0 y Now show a fundamental matrix for this equation is ω cos ωt sin ωt −ω sin ωt cos ωt .

Finally, show that all solutions of the equation, y +ω 2 y = 0 are of the form c1 cos ωt+ c2 sin ωt. 2. Do the same thing for the equation y − ω 2 y = 0 that was done in Problem 1. 3. Now suppose you have the equation, y + 2by + 4cy = 0. Try to reduce this to one of the above problems by changing the equation to one for the dependent variable z = eλt y and choosing λ appropriately. 4. Show that any diﬀerential equation of the form y (n) + a1 (t) y (n−1) + · · · + an−1 (t) y + an (t) y = 0 can be written as a ﬁrst order system using a technique similar to that outlined in Problem 1. 5. Show that if Φ (t) is a fundamental matrix for x = A (t) x and xp = A (t) xp + f (t) then if z = A (t) z + f (t) , there exists c a constant vector such that xp + Φ (t) c = z. Thus the general solution for the equation x = A (t) x + f (t) is xp + Φ (t) c where c is an arbitrary constant vector. The solution, xp is called a particular solution.

x1 (t) = sin t t2 and x2 (t) = tet t Find the continuous matrix if possible. tell why.19. He has forgotten the continuous matrix. EXERCISES 467 6.14. . 1] used in the diﬀerential equation x = A (t) x but remembers there were two solutions. A (t) for t ∈ [−1. If not possible.

468 LINEAR ALGEBRA .

z) such that z = f (x. Notice how elaborate this picture is. 469 . The contour graph of the above three dimensional graph is below and comes from using the computer algebra system again. say x and consider the function z = f (x. How does one go about depicting such a graph? The usual way is to ﬁx one of the variables. I used a grid consisting of 70 choices for x and 70 choices for y. In calculus. The lines in the drawing correspond to taking one of the variables constant and graphing the curve which results. This is because one needs more than three dimensions to accomplish the task and we can only visualize things in three dimensions. Let f (x. y. one of the main purposes of calculus is to free us from the tyranny of art. a computer algebra system.1 The Graph Of A Function Of Two Variables With vector valued functions of many variables.1 . One of these is the concept of a scalar valued function of two variables.Functions Of Many Variables 20. Graphing this would give a curve which lies in the surface to be depicted. Ultimately. y) denote a scalar valued function of two variables evaluated at the point (x. Its graph consists of the set of points. y) . The following is the graph of the function z = cos (x) sin (2x + y) drawn using Maple. Computers do this very well. y) . Sometimes attempts are made to understand three dimensional objects like the above graph by looking at contour graphs in two dimensions. (x. we are permitted and even required to think in a meaningful way about things which cannot be drawn. ﬁle which I then imported into this document. Then do the same thing for other values of x and the result would depict the graph desired graph. it doesn’t take long before it is impossible to draw meaningful pictures. y) where y is allowed to vary and x is ﬁxed. it is certainly interesting to consider some things which can be visualized and this will help to formulate and understand more general notions which make sense in contexts which cannot be visualized. However. 1I used Maple and exported the graph as an eps. The computer did this drawing in seconds but you couldn’t do it as well if you spent all day on it.

This is done by looking at level surfaces in R3 which are deﬁned as surfaces where the function assumes a constant value. However. The level surfaces of this function would be concentric spheres centered at 0. In a contour geographic map. consider f (x. It is the derivative of a function in a particular direction. If many contour lines are close to each other. z) = x2 + y 2 + z 2 . However.2 The Directional Derivative The directional derivative is just what its name suggests. 20. z T ¨ ¨¨v ¨ ¨¨ B ¨ z = f (x. some people like to try and visualize even these examples. there really are limits to what you can accomplish in this direction. these lines are called isotherms or isobars depending on whether the function involved is temperature or pressure. temperature. If you have looked at a weather map. As a simple example.470 FUNCTIONS OF MANY VARIABLES 4 y 2 –4 –2 0 –2 –4 2 x 4 This is in two dimensions and the diﬀerent lines in two dimensions correspond to points on the three dimensional graph which have the same z value. The following picture illustrates the situation in the case of a function of two variables. y) (x0 . A scalar function of three variables. y. this indicates rapid change in the altitude. They play the role of contour lines for a function of two variables. y0 ) ¨ E y x © . So much for art. or whatever else may be measured. cannot be visualized because four dimensions are required. (Why?) Another way to visualize objects in higher dimensions involves the use of color and animation. pressure. the contour lines represent constant altitude.

3 Let U be an open subset of Rn and let f : U → R. · · ·. xi + t.2. The directional derivative is just a measure of the steepness in a given direction. · · ·xn ) .2 Find the directional derivative of the function. Then to ﬁnd the directional derivative from the deﬁnition. y) moves in the direction of v. These are the vectors. This unit 1 1 vector is v ≡ √2 . . = lim t→0 t t→0 lim 2 Actually. ∂f (x) ∂xi ≡ f (x+tei ) − f (x) t f (x1 . First you need a unit vector which has the same direction as the given vector. 0. deﬁne the directional derivative of f in the direction. 2 t √ 2 2 2+ t √ 2 and f (x) = 2. v. Thus. √2 . letting t → 0. xi . · · ·0) and the 1 is in the ith position. t t 1 + √2 2 + √2 − 2 f (x + tv) − f (x) = . f (x. 4 Dv f (1.20. 2 There is something you must keep in mind about this. this results in a change in z = f (x. y) as shown in the picture. The direction vector must always be a unit vector2 .2. · · ·. ei where T ei = (0. · · ·xn ) − f (x1 . t Example 20. ∂xi This is called the partial derivative of f. If you looked at it from the side. y0 ) is a point in the xy plane. · · ·. this you diﬀerence quotient equals 1 2 10 + 4t 2 + t2 and so. However. t t and to ﬁnd the directional √ derivative. There are some special unit vectors which come to mind immediately. This motivates the following general deﬁnition of the directional derivative. ∂f (x) ≡ Dei f (x) . 2) . Then letting T x = (x1 .2. y0 + tv2 ) − f (x0 . THE DIRECTIONAL DERIVATIVE 471 In this picture. write the diﬀerence quotient described above. · · ·. at the point x as Dv f (x) ≡ lim t→0 f (x + tv) − f (x) . v2 ) is a unit vector in the xy plane and x0 ≡ (x0 . The directional derivative in this direction is deﬁned as f (x0 + tv1 . A simple example of this is a person climbing a mountain. some steeper than others. y0 ) . Deﬁnition 20. For x ∈ U. 1. √ take the limit of this as t → 0. you would be getting the slope of the indicated tangent line. t It tells how fast z is changing in this direction.1 Let f : U → R where U is an open set in Rn and let v be a unit vector. v ≡ (v1 . t→0 lim Deﬁnition 20. Thus f (x + tv) = 1 + Therefore. 2) = 5√ 2 . y) = x2 y in the direction of i + j at the point (1. When (x. there is a more general formulation of this known as the Gateaux derivative in which the length of v is not considered but it will not be considered here. 0.2. xn ) be a typical element of Rn . He could go various directions.

2. deﬁne the directional derivative of f in the direction. one could take the partial ∂2f derivative with respect to y of the partial derivative with respect to x. 5 5 5 5 You see from this example and the above deﬁnition that all you have to do is to form the vector which is obtained by replacing each component of the vector with its directional derivative. fxy are called mixed partial derivatives. Having taken ∂x ∂y ∂z one partial derivative.2. The concept of a directional derivative for a vector valued function is also easy to deﬁne although the geometric signiﬁcance expressed in pictures is not. If y = f (x) .472 FUNCTIONS OF MANY VARIABLES and to ﬁnd the partial derivative. w) = T xy 2 . Thus. t . z 3 T . z sin (xy) . Find the directional derivative in the direction √ √ T First. y) = y sin x + x2 y + z. y) . fx (x.2. you can take partial derivatives of vector valued functions and use the same notation. at the point x as Dv f (x) ≡ lim Example 20. z) = y 2 . there is no reason to stop doing it. ∂xi Example 20. Example 20. yx T (1. In the above example. the partial derivative of f with respect to xi may also be denoted by ∂y or yxi . . For x ∈ U. x 5 + y 5 + t t→0 5 5 5 5 25 5 5 5 4 √ 1√ 2 2 √ 1 √ = xy 5 + 5y . T t→0 f (x + tv) − f (x) . fyxx = − sin x + 2. and ∂f ∂z if f (x.i . or Di f. y. the desired limit is √ √ 2 √ √ x + t 1/ 5 y + t 2/ 5 − xy 2 .2.7 Find the partial derivative with respect to x of the function f (x.6 Let f (x. ∂f = y cos x + 2xy. ∂x∂y Higher order partial derivatives are deﬁned by analogy to the above. f. ∂y∂x Also observe that ∂2f = fyx = cos x + 2x. 2) at the point (x. ∂y . denoted by ∂y∂x or fxy . zy cos (xy) . x + t 1/ 5 y + t 2/ 5 − xy lim t→0 t 1√ 2 4 4 2√ 2 √ 4 √ 4 1 √ 2 = lim xy 5 + xt + 5y + ty + t 5. Other notation for this partial derivative is fxi . These partial derivatives.4 Find ∂f ∂f ∂x . v. Thus in the above example. Deﬁnition 20. x 5 + y 5 . y. From the above deﬁnition. y. In particular. a unit vector in this direction is 1/ 5. ∂f = sin x + x2 . y) = xy 2 .5 Let f : U → Rp where U is an open set in Rn and let v be a unit vector. and ∂f = 1. z. z) = D1 f (x. ∂2f = fxy = cos x + 2x. From the deﬁnition above. z 3 x . 2/ 5 and from the deﬁnition. diﬀerentiate with respect to the variable of interest and regard all the others as constants.

fzx . fxy . 6. Find fx . you want to minimize the function of two variables. (a) x2 y 3 z 4 + sin (xyz) (b) sin (xyz) + x2 yz (c) z ln x2 + y 2 + 1 (d) ex 2 +y 2 +z 2 (e) tan (xyz) 5.3 Exercises 1. b) ≡ i=1 (ati + b − xi ) . fxz. and ∂f ∂z for f = (a) x2 y + cos (xy) + z 3 y (b) ex 2 +y 2 z sin (x + y) 2 (c) z 2 sin3 ex +y 3 (d) x2 cos sin tan z 2 + y 2 (e) xy 2 +z 3. xi )}i=1 the set of all such pairs and try to ﬁnd numbers a and b such that the line x = at + b approximates these ordered pairs as well as p 2 possible in the sense that out of all choices of a and b. 4. ∂f ∂y (0. Why must the vector in the deﬁnition of the directional derivative be a unit vector? Hint: Suppose not. Would the directional derivative be a correct manifestation of steepness? 2. Experiments are done at n times. f (x) ≤ f (y) . EXERCISES 473 20.20.3. Prove that if f has all of its partial derivatives at x. i=1 (ati + b − xi ) is as small as possible. then fxi (x) = 0 for each xi .5. 0) and (0. fzy . 0) . You just do this argument given earlier for each variable to get the conclusion. p 2 f (a. y) = Find ∂f ∂x 2xy+6x3 +12xy 2 +18yx2 +36y 3 +sin(x3 )+tan(3y 3 ) 3x2 +6y 2 if (x. You will be ﬁnding the formula for the least squares regression line. fyz for the following and form a conjecture about the mixed partial derivatives. Find a formula for a and b in terms of the given ordered pairs. 0) . . t1 . fyx . As an important application of Problem 5 consider the following. Suppose f (x. 0) 0 if (x. y) = (0. Suppose f : U → R where U is an open set and suppose that x ∈ U has the property that for all y near x. · · ·. fy . y) = (0.2 on Page 126. tn and at each time there results a collection of numerical p outcomes. Hint: This is just a repeat of the similar one variable theorem given earlier. t2 . In other words. Theorem 6. Denote by {(ti . Find ∂f ∂f ∂x . fz . ∂y .

(s. Then if fxy and fyx are continuous at the point (x. t) ≡ {f (x + t. t) = fxy (x. 1). r) ⊆ U.474 FUNCTIONS OF MANY VARIABLES 20. y) . Applying the mean value theorem again. β ∈ (0. by the mean value theorem from calculus and the (one variable) chain rule.4. t) is also unchanged and the above argument shows there exist γ.4. 1) such that ∆ (s. y + s)−f (x + t.0) lim ∆ (s. fy .1). This astonishing fact is due to Euler in 1734. The following is a well known example [1]. 2 As implied above. k. Corollary 20. l. t) → (0. ∆ (s.2 Suppose U is an open subset of Rn and f : U → R has the property that for two indices. Letting (s. The following is obtained from the above by simply ﬁxing all the variables except for the two of interest. y) ∈ U . t) = fyx (x + γt. y) and f (x. If the terms f (x + t. Therefore. y + s) − f (x. y + s) ∈ U because |(x + t. . fxl . y + s) are interchanged in (20.1 Suppose f : U ⊆ R2 → R where U is an open set on which fx . y + s) − fx (x + αt. h (t) ≡ f (x + t. y))}. y + s) − f (x + t.t)→(0. y)| = ≤ |(t. fxk . y) − (f (x. y) = fyx (x.1) r = √ < r. 1) . it follows fxy (x. y + s) − (x. y) = fyx (x. y). ∆ (s. δ ∈ (0. 0) and using the continuity of fxy and fyx at (x. s)| = t2 + s2 r2 r2 + 4 4 1/2 1/2 (20. there exists r > 0 such that B ((x. This proves the theorem. Now let |t| .4 Mixed Partial Derivatives Under certain conditions the mixed partial derivatives will always be equal. y) . Theorem 20. y)) s for some α ∈ (0. t) = = 1 1 (h (t) − h (0)) = h (αt) t st st 1 (fx (x + αt. y) . y + δs) . ∆ (s. y) . and fxk xl exist on U and fxk xl and fxl xk are both continuous at x ∈ U. fxy and fyx exist. fxl xk . st Note that (x + t. It is necessary to assume the mixed partial derivatives are continuous in order to assert they are equal. Then fxk xl (x) = fxl xk (x) . Proof: Since U is open. t) = fxy (x + αt. |s| < r/2 and consider h(t) h(0) 1 ∆ (s. y + βs) where α.

0) = fy (0. 0) − fy (0. p ≥ 1 be a function and let x be a limit point of D (f ). THE LIMIT OF A FUNCTION OF MANY VARIABLES Example 20. Then lim f (y) = L y→x .5.3 Let f : D (f ) ⊆ Rp → Rq where q.5 The Limit Of A Function Of Many Variables Limits of scalar valued functions of one variable and later vector valued functions of one variable were considered earlier. r) be a ball as above. y) = if (x. r) .1 Let A denote a nonempty subset of Rp . A point. Now every nonempty ball in Rp contains inﬁnitely many points (Why?) and this proves the lemma. Now consider vector valued functions of many variables.5. y) − fx (0. x is said to be a limit point of the set. Then letting δ = min (r1 . 0) . Proof: Let x ∈ A and let B (x. 20. 0) 0 if (x. y) = (0. 0) x x4 lim =1 x→0 (x2 )2 x→0 x4 − y 4 + 4x2 y 2 (x2 + 2 y2 ) . r) contains inﬁnitely many points of A. then every point of A is a limit point of A.4. r1 ) ⊆ A for some r1 > 0.3 Let f (x. 0) y −y 4 lim = −1 y→0 (y 2 )2 y→0 lim lim showing that although the mixed partial derivatives do exist at (0. 0) .5. 0) = 0. fx = y Now fxy (0. r) . B (x. A if for every r > 0. Deﬁnition 20.5. D (f ) and this concept is deﬁned next.2 If A is a nonempty open set in Rp . When f : D (f ) → Rq one can only consider in a meaningful way limits at limit points of the set.20. y) = (0. 0) xy (x2 −y 2 ) x2 +y 2 475 From the deﬁnition of partial derivatives it follows immediately that fx (0. The case where A is open will be the one of most interest but many other sets have limit points. for (x. they are not equal there. 0) ≡ = fy (x. Using the standard rules of diﬀerentiation. δ) ⊆ A∩B (x. Lemma 20. 0) ≡ = while fyx (0. it follows B (x. B (x. Deﬁnition 20. Since x ∈ A. y) = (0. fy = x x4 − y 4 − 4x2 y 2 (x2 + y 2 ) 2 fx (0. an open set.

and so for such y. (20. Therefore.476 FUNCTIONS OF MANY VARIABLES if and only if the following condition holds.7) (20. y→x lim af (y) + bg (y) = aL + bK. limy→x f (y) = L = (L1 · ··. y→x (20.2) (20. |f · g (y) − L · K| ≤ |f · g (y) − f (y) · K| + |f (y) · K − L · K| ≤ |f (y)| |g (y) − K| + |K| |f (y) − L| .3). Now let 0 < δ 2 be such that for 0 < |x − y| < δ 2 . |L1 − L2 | ≤ |L1 − f (y)| + |f (y) − L2 | < ε + ε = 2ε. b ∈ R. f (y) = (f1 (y) . let δ i > 0 correspond to Li in the deﬁnition of the limit and let δ = min (δ 1 . Since ε > 0 is arbitrary. p ≥ 1 be a function and let x be a limit point of D (f ).2) is left for you. Then by the triangle inequality. p. Then for ε > 0 be given.4) T For a vector valued function. it follows from the triangle inequality that |f (y)| < 1 + |L| . fq (y)) . it must be unique.5 Suppose x is a limit point of D (f ) and limy→x f (y) = L .4 Let f : D (f ) ⊆ Rp → Rq where q.5) y→x for each k = 1. δ 2 ). Then if limy→x f (y) exists. · · ·. δ) ∩ D (f ) . Proposition 20.6) . then y→x lim h ◦ f (y) = h (L) .5. the properties of the dot product and the Cauchy Schwartz inequality. Consider (20. Lk ) if and only if lim fk (y) = Lk (20. |L − f (y)| < ε. Since x is a limit point. · · ·. then |f (y) − L| < 1. It is like a corresponding theorem for continuous functions. There exists δ 1 such that if 0 < |y − x| < δ 1 . For all ε > 0 there exists δ > 0 such that if 0 < |y − x| < δ and y ∈ D (f ) then. for 0 < |y − x| < δ 1 . if h is a continuous function deﬁned near L. |f (y) − L| < ε ε . this shows L1 = L2 . |f · g (y) − L · K| ≤ (1 + |K| + |L|) [|g (y) − K| + |f (y) − L|] . Proof: Suppose limy→x f (y) = L1 and limy→x f (y) = L2 . limy→x g (y) = K where K and L are vectors in Rp for p ≥ 1. 2 (1 + |K| + |L|) 2 (1 + |K| + |L|) (20.5. there exists y ∈ B (x. Let ε > 0 be given. Then if a. |g (y) − K| < .3) lim f · g (y) = L · K Also. Therefore. Proof: The proof of (20. Theorem 20.

then |f (x) − f (y)| < ε. then |h (f (y)) − h (L)| < ε.5. there exists δ > 0 such that if 0 < |y − x| < δ.5. . there exists δ > 0 such that whenever y ∈ D (f ) and |y − x| < δ. Then if ε > 0 is given. δ 2 ) . Suppose then that limy→x f (y) = f (x) and let ε > 0 be given. Proof: It follows immediately that if f is continuous at x then for all ε > 0 there exists δ > 0 such that if |x − y| < δ and y ∈ D (f ) . Therefore. From the deﬁnition of the limit. it follows from (20. Next suppose limy→x fk (y) = Lk for each k. q Then. · · ·. if follows that if 0 < |y − x| < δ. |f (y) − L| < ε. limy→x f (y) = f (x) .4). Suppose ﬁrst that limy→x f (y) = L.7) that |f · g (y) − L · K| < ε 477 and this proves (20. since x is a limit point of D (f ) . there exists η > 0 such that if |y − L| < η. then |f (y) − L| < η.5). then |h (y) − h (L)| < ε Since limy→x f (y) = L.6 For f : D (f ) → Rq and x ∈ D (f ) such that x is a limit point of D (f ) . |fk (y) − Lk | ≤ |f (y) − L| < ε. δ q ) . Theorem 20. it follows |f (y) − f (x)| < ε. Therefore. But for each k. Consider (20. Consider (20. letting 0 < δ ≤ min (δ 1 . Then if ε > 0 is given. This proves the theorem. it follows that for ε > 0 given. there exists δ k such that if 0 < |y − x| < δ k . THE LIMIT OF A FUNCTION OF MANY VARIABLES Then letting 0 < δ ≤ min (δ 1 .3). |f (x) − f (x)| < ε and so if |x − y| < δ. if 0 < |y − x| < δ. q 1/2 |f (y) − L| ≡ k=1 q |fk (y) − Lk | ε2 q 1/2 2 < k=1 = ε. |fk (y) − Lk | ≤ |f (y) − L| and so if |y − x| < δ. it follows f is continuous at x if and only if limy→x f (y) = f (x) . then |f (x) − f (y)| < ε. L. Also.20. there exists δ > 0 such that if |y − x| < δ. Since h is continuous near. ε |fk (y) − Lk | < √ .

limy→x f (y) was only deﬁned in the case where x was a limit point of D (f ) . Prove lim g (|x|) = g (|x0 |) . Then the tangent line to the graph of the function. then limy→x f (y) × g (y) = L × M. 0) xy x2 +y 2 n Find lim(x. f (x0 )) is given by y − f (x0 ) = f (x0 ) (x − x0 ) (20. f (x0 )) ¢ ¢ ¢ ¢ Thus from (20. Suppose g is a continuous vector valued function of one variable deﬁned on [0. y) if it exists. Let f (x. Suppose x is deﬁned to be a limit point of a set. y) = (0. y = f (x0 ) + f (x0 ) (x − x0 ) .478 FUNCTIONS OF MANY VARIABLES 20. f at the point (x0 .6 Exercises 1. 7. 4. B (x. ∞). Show that if f . y) = (0. y) ≡ if (x.y)→(0. 20. x→x0 8. Show this set has no limit points. A if and only if for all r > 0. r) contains a point of A diﬀerent than x. . Find limx→0 sin(|x|) |x| and prove your answer from the deﬁnition of limit. 6.0) f (x.8).8) and a picture of the situation is depicted below. If it does not exist. 5. 0 if (x. 0) . Why? 2.7 Approximation With A Tangent Plane To begin with. 3. show if limy→x f (y) = L and limy→x g (y) = M. Give some examples of limit problems for functions of many variables which have limits and prove your assertions. g :U → R3 where U is a nonempty subset of Rp and x is a limit point of U. tell why it does not exist. suppose f is a function of one variable which has a derivative at the point x0 . Let {xk }k=1 be any ﬁnite set of points in Rp . Show this is equivalent to the above deﬁnition of limit point. ¢ ¢ ¢ ¢ ¢r (x0 .

y0 ) + av1 + bv2 + o (v1 . y0 ) v1 + (x0 .1 Let U be an open subset of R2 and suppose f : U → R has the property that the partial derivatives fx and fy exist for (x. y0 + sv2 ) v2 ∂x ∂y . y0 )) − changes only in ﬁrst component = f (x0 + v1 . v2 )) − f (x0 . |v| (20. but what about letting v be arbitrary and small? The idea similar to the above is an expression of the form =o(v) f ((x0 . y0 ) v1 + (x0 . y) ∈ U and are continuous at the point (x0 . v2 )) = f (x0 . for small v. y0 ) v1 + (x0 . APPROXIMATION WITH A TANGENT PLANE How close y is to f (x)? Letting v = (x − x0 ) .7.7. 1] such that this equals = ∂f ∂f (x0 + tv1 . y0 ) + ∂f ∂f (x0 . y0 ) v2 ∂x ∂y changes only in second component ∂f ∂f (x0 . y0 ) v2 ∂x ∂y − By the mean value theorem. y0 + v2 ) v1 + (x0 . = v Now this shows f (x0 + v) − (f (x0 ) + f (x0 ) v) = o (v) where o (v) = 0. y0 + v2 ) − f (x0 . f (x0 + v) . f (x0 + v) − y = f (x0 + v) − f (x0 ) (v) − f (x0 ) (v) v →0 as v→0 479 f (x0 + v) − f (x0 ) − f (x0 ) v. y0 ) v2 + o (v) ∂x ∂x (20. v2 )) = f (x0 . y0 ) v1 + (x0 . y0 ) + (v1 . y0 ) + (v1 .10) = (f (x0 + v1 . What of a function of two variables? Obviously such an approximation exists in one direction if the function has a directional derivative in the given direction. Then f ((x0 . y0 + v2 ) − f (x0 .9). v2 ) where |v|→0 lim |o (v)| = 0. Proof: f ((x0 . y0 + v2 ) + f (x0 .20. y0 ) + where o (v) satisﬁes (20. y0 ) + (v1 .9) Theorem 20. y0 + v2 ) − f (x0 . v→0 |v| lim Therefore. there exist numbers s and t in [0. f (x0 ) + f (x0 ) v is a very good approximation to. y0 ) ∂f ∂f (x0 . y0 ) . y0 ) v2 ∂x ∂y ∂f ∂f (x0 .

denote by θi v the vector T (0. there is a meaningful theorem about approximating the function with one of a special form as above. 0. · · ·. y0 ) (x − x0 ) + (x0 . Of course the above theorem has a generalization to any number of variables but for more than two it is impossible to draw a meaningful picture. (x0 ) . Nevertheless. The message of the above theorem is that the approximation is very good if both |x − x0 | and |y − y0 | are small. y0 + sv2 ) − (x0 . y0 + v2 ) − (x0 . y0 ) ∂x ∂x ∂y ∂y |v| . vi . · · ·. Then p ∂f f (x0 + v) = f (x0 ) + (x0 ) vi + o (v) . letting v ≡ (x − x0 . The right side of the above. y − y0 ) . y0 ) v1 + (x0 . y0 + v2 ) − (x0 .10). y0 + v2 ) − (x0 . vi+1 . y0 ) v2 ∂y ∂y ∂f ∂f (x0 + tv1 . y0 ) |v| ∂x ∂x ∂y ∂y Therefore. y0 + sv2 ) − (x0 . f (x. ∂x1 ∂xp Proof: The proof is similar to the case of two variables. y) = f (x0 . letting o (v) denote the expression in (20. and noticing that |v1 | and |v2 | are both no larger than |v| . |o (v)| ≤ It follows |o (v)| ∂f ∂f ∂f ∂f ≤ (x0 + tv1 . · · ·. ∂xi i=1 That is.480 − = FUNCTIONS OF MANY VARIABLES ∂f ∂f (x0 . y0 ) + (x0 . Theorem 20.7. y0 ) (y − y0 ) + o (v) ∂x ∂x ∂f ∂f ∂f ∂f (x0 + tv1 . y0 ) and this proves the theorem. f is diﬀerentiable at x0 and the derivative of f equals the linear transformation obtained by multiplying by the 1 × p matrix. f (x0 . limv→0 |o(v)| = 0 because of the assumption that fx and fy are continuous at |v| the point (x0 . y0 ) + ∂f ∂f (x0 . ∂f ∂f (x0 ) . y0 ) (x − x0 ) + ∂f (x0 . y0 ) v1 + ∂x ∂x Therefore. Letting v = (v1 · ··.2 Let U be an open subset of Rp for p ≥ 1 and suppose f : U → R has the property that the partial derivatives fxi exist for all x ∈ U and are continuous at the point x0 ∈ U. vp ) . y0 ) + (x0 . y0 + sv2 ) − (x0 . y0 ) (y − y0 ) = z is the ∂x ∂x equation of a plane tangent to the graph of f. vp ) T . y0 ) + ∂f (x0 . Writing this in a diﬀerent form. y0 ) v2 ∂x ∂y ∂f ∂f (x0 .

fx = y. Approximate f (x. 3) . Sometimes people use C 1 as an adjective and say f is C 1 Example 20.7. y. Deﬁnition 20. fy (1. fy = x.3 Let f :U → Rp where U is an open set in Rn . Now p T 481 f (x0 + v) − p f (x0 ) + i=1 ∂f (x0 ) vi ∂xi p i=1 (20. As before. 0. 1) such that the above expression equals p p ∂f ∂f = (x0 + θi v+si vi ) vi − (x0 ) vi ∂xi ∂xi i=1 i=1 and so letting o (v) equal the expression in (20.20. Therefore. 3) = 1. This proves the theorem. y. fz = 2z.4 Suppose f (x. Therefore. Then f is in C 1 (U ) if all its partial derivatives exist and are continuous on U. APPROXIMATION WITH A TANGENT PLANE Thus θ0 v = v. 3) + 1 (x − 1) + 2 (y − 2) + 6 (z − 3) = 11 + 1 (x − 1) + 2 (y − 2) + 6 (z − 3) = −12 + x + 2y + 6z What about the case where f has values in Rq rather than R? Is there a similar theorem about linear approximations in this case also? . z) for (x. 2. 3) = 6. 2. This is called the linear approximation of the function f at the point x0 . 2. vp ) . p |o (v)| ≤ i=1 p ∂f ∂f (x0 + θi v + si vi ) − (x0 ) |vi | ∂xi ∂xi ≤ i=1 ∂f ∂f (x0 + θi v + si vi ) − (x0 ) |v| ∂xi ∂xi p and so lim |o (v)| ∂f ∂f ≤ lim (x0 + θi v + si vi ) − (x0 ) = 0 v→0 |v| v→0 ∂xi ∂xi i=1 because of continuity of the fxi at x0 .7.11). the desired approximation is f (1. let x − x0 = v and write p f (x) = f (x0 ) + i=1 ∂f (x0 ) (xi − x0i ) + o (v) ∂xi where o (v) satisﬁes (20.7. Taking the partial derivatives of f. 2. 2. and θp v = 0.9). z) near (1. and fz (1. z) = xy + z 2 . 3) = 2. y.11) changes only in the ith position = i=1 f (x0 + θi−1 v) − f (x0 + θi v) − ∂f (x0 ) vi ∂xi Now by the mean value theorem there exist numbers si ∈ (0. fx (1. · · ·. θp−1 (v) = (0.

7.6 Let f : U → Rq where U is an open subset of Rp .13) When a formula like the above holds.5 Let U be an open subset of Rp for p ≥ 1 and suppose f : U → Rq has the property that the partial derivatives fxi exist for all x ∈ U and are continuous at the point x0 ∈ U. Proof: Let f (x) ≡ (f1 (x) .12) ∂xi i=1 where o (v) satisﬁes (20. then ∂f f is continuous at x0 . let δ = min δ 1 . ∂xi i=1 Deﬁne o (v) ≡ (o1 (v) .7. If (20. then whenever |x − x0 | is small enough.482 FUNCTIONS OF MANY VARIABLES Theorem 20. p i=1 ∂f (x0 ) (xi − x0i ) ≤ C ∂xi p |xi − x0i | ≤ Cp |x − x0 | i=1 Therefore. there exists δ 1 > 0 such that if |x − x0 | < δ 1 . p |f (x) − f (x0 )| ≤ i=1 ∂f (x0 ) (xi − x0i ) + |x − x0 | ∂xi < (Cp + 1) |x − x0 | ε which veriﬁes (20. Cp+1 .5 on Page 299. Furthermore. · · ·. i = 1. This is the content of the following important lemma.13) holds. the above equation is then equivalent to (20. if C ≥ max ∂xi (x0 ) .12).9). the function must be continuous at x0 . p ∂fk fk (x) = fk (x0 ) + (x0 ) vi + ok (v) . Then for |x − x0 | < δ.7.9). Corollary 12. Now letting ε > 0 be given. Then (20.9) holds for o (v) because it holds for each of the components of o (v) . oq (v)) . this is equivalent to p T T f (x) = f (x0 ) + i=1 ∂f (x0 ) (xi − x0i ) + o (x − x0 ) . q. by the triangle inequality. then |o (x − x0 )| < |x − x0 | . · · ·. From Theorem 20. fq (x)) .2.14). then p ∂f f (x0 + v) = f (x0 ) + (x0 ) vi + o (v) (20. · · ·. ∂xi (20.14) Proof: Suppose (20. the following holds for each k = 1. if |x − x0 | < δ 1 . Lemma 20. Letting v = x − x0 . Since o (v) satisﬁes (20. |f (x) − f (x0 )| ≤ (Cp + 1) |x − x0 | (20. · · ·. But also.13) holds. ε =ε Cp + 1 .3. Also. |f (x) − f (x0 )| < (Cp + 1) |x − x0 | < (Cp + 1) showing f is continuous at x0 . p .

5. Here is an example. it is likely the picture would never have revealed the truth.7. y) = (0. That is. 0) xy x2 +y 2 Similarly. it is impossible to give the sort of approximation for f described in Theorem 20. 0) because for every δ > 0. y) = (0.7. δ) contains points where the value of the function equals 1/2 and also points where the value of the function equals 0. δ/2) = 0. namely (20.1 at the point (0.20.7 A function. f (x. This is both imprecise and useful because it neglects everything which is not important and directs our attention to that which really matters. f (x. You might ﬁnd the picture of this function amusing. I realize some of the above theorems and deﬁnitions are hard but they are the only route to correct understanding of multivariable calculus. Consequently no limit can exist. 0) 1 if (x. Therefore. However. y) ≡ if (x. However.15) . 0) . y) = (0. f (δ/2. This illustrates the limitations of abandoning careful logical reasoning and deﬁnitions in favor of pictures. |v| 483 v→0 Also. 0) . 0) = 0 so the partial derivatives do indeed exist. (Why?) It follows from Theorem 20. fy (0. I think you should notice that from the above discussion you see there is a problem and so it can be seen in the picture because it was expected. y) ≡ Then fx (0. 0) − f (0. lim |g (v)| = 0. APPROXIMATION WITH A TANGENT PLANE Deﬁnition 20. 0) 0 if (x.9). means o (·) is a function which satisﬁes (20.9). 0) . o (v) . δ/2) = 1 while 2 f (0.9). 0) ≡ lim f (t. You see the way it is pinched near the center. This time it is of the function.7.9) holds for g (v) . 0) 0 = lim = 0.6 on Page 477 that f is not continuous at (0. this function is not even continuous at the point (0. Thus B ((0. y) = (0. the symbol. t→0 t→0 t t if (x. from the above discussion. I ﬁnd it helpful to think of o (v) as an adjective rather than a precise deﬁnition. Thus it is permissible to write things like o (v) = 10o (v) or o (v) = o (v) + o (v) because if o (v) satisﬁes (20. Here is another picture. 0) x2 −y 4 x2 +y 4 2 (20. then so does o (v) + o (v) and 10o (v) . Something more than existence of the partial derivatives at a point is necessary in order to accomplish such a linear approximation. v → g (v) is o (v) if (20.

1 Diﬀerentiation And The Chain Rule The Chain Rule As in the case of a function of one variable. Show the function in (20.9..6. And it was an hand breadth thick. and his height was ﬁve cubits: and a line of thirty cubits did compass it round about..5 feet. ten cubits from the one brim to the other: it was round all about. There exists η > 0 such that if |g (x + v) − g (x)| < η. 1) . A function. it is important to consider the derivative of a composition of two functions. 3..1) and then use a calculator to make a comparison.. (20. then |o (g (x + v) − g (x))| < ε Cn + 1 |g (x + v) − g (x)| (20. 8. z) = x 3 yz 2 near the point (1. 0). there exists δ > 0 such that if |v| < δ.16). Proof: It is necessary to show v→0 lim |o (g (x + v) − g (x))| = 0.7.. Then o (g (x + v) − g (x)) = o (v) . How many baths of brass were used? 20.15) is not continuous at (0.1 Let g : U → Rp where U is an open set in Rn and suppose g has a good linear approximation at x ∈ U .18) Now let ε > 0 be given.1. Use this approximation to estimate f (1.9.8 Exercises 1. 20. 7.484 FUNCTIONS OF MANY VARIABLES This is a very interesting example which you will examine in the exercises. 1. (20.16) ∂xi i=1 Lemma 20. √ 2.17) From (20..9 20. “And he made a molten sea. What was the approximate volume in cubic feet of the brass used in the molten sea.” Lets assume a hand breadth is 8 inches and a cubit is 1. and Lemma 20. 0) and yet it has directional derivatives in every direction at (0. Find a linear approximation to the function f (x. It contained two thousand baths. In 1 Kings chapter 7 there is a description of the molten sea in Solomon’s temple.19) .. y. g : U → Rp has a good linear approximation at x ∈ U if n ∂g (x) g (x + v) = g (x) + vi + o (v) .9. then |g (x + v) − g (x)| ≤ (Cn + 1) |v| . |v| (20.

p f (g (x + v)) = f (g (x)) + i=1 ∂f (g (x)) (gi (x + v) − gi (x)) + o (g (x + v) − g (x)) ∂yi which by Lemma 20. Thus f ◦ g is the name of a function and this function is deﬁned by what was just written.9.21) Proof: From the assumption that f has a good linear approximation at g (x) .9. and let f : V → Rq . n p i=1 f (g (x + v)) = f (g (x)) + j=1 ∂f (g (x)) ∂gi (x) ∂yi ∂xj vj + o (v) . Then f ◦ g has a good linear approximation at x and furthermore. which implies 485 |o (g (x + v) − g (x))| < < ε Cn + 1 ε Cn + 1 |g (x + v) − g (x)| (Cn + 1) |v| and so |o (g (x + v) − g (x))| <ε |v| which establishes (20. let g : U → Rp be such that g (U ) ⊆ V. Theorem 20. let V be an open set in Rp . ∂yi ∂xj (20. the above becomes p n ∂gi (x) ∂f (g (x)) f (g (x + v)) = f (g (x)) + vj + o (v) + o (v) ∂yi ∂xj j=1 i=1 p p n ∂gi (x) ∂f (g (x)) ∂f (g (x)) vj + o (v) + o (v) = f (g (x)) + ∂yi ∂xj ∂yi j=1 i=1 i=1 n p i=1 = f (g (x)) + j=1 ∂f (g (x)) ∂gi (x) ∂yi ∂xj vj + o (v) . Recall the notation f ◦ g (x) ≡ f (g (x)) . DIFFERENTIATION AND THE CHAIN RULE η Let |v| < min δ.20.17). Cn+1 .2 Let U be an open set in Rn .9. (20. Suppose g has a good linear approximation at x ∈ U and that f has a good linear approximation at g (x) . ∂yi Now since g has a good linear approximation at x. This proves the lemma.20) and ∂ (f ◦ g) (x) = ∂xj p i=1 ∂f (g (x)) ∂gi (x) . |g (x + v) − g (x)| ≤ η. The last assertion of the following theorem is known as the chain rule.1 equals p f (g (x + v)) = f (g (x)) + i=1 ∂f (g (x)) (gi (x + v) − gi (x)) + o (v) . For such v.

and eφ = (ρ cos φ cos θ. Then the above says ∂z ∂yi ∂z = . ρ sin φ cos θ. eρ = (sin φ cos θ. −ρ sin φ) . describe its acceleration in terms of spherical coordinates and the vectors. Example 20. ρ cos φ sin θ. v) = sin (uv) and let u (x. y = ρ sin φ sin θ. t) = t sin x + cos y and v (x. sin φ sin θ. There is an easy way to remember this in terms of the repeated index summation convention presented earlier.20). ∂z ∂z ∂u ∂z ∂v = + = v cos (uv) t cos (x) + us sec2 (x) cos (uv) . ∂ (f ◦ g) (x) ∂xk = = lim lim f (g (x+tek )) − f (g (x)) t p i=1 t→0 1 t→0 t ∂f (g (x)) ∂gi (x) ∂yi ∂xk t + o (tek ) . 0) . Let y = g (x) and z = f (y) . y. It only remains to ∂yi verify (20. This follows from observing that for v = tek . t. ∂t ∂u ∂t ∂v ∂t Also.486 p FUNCTIONS OF MANY VARIABLES because i=1 ∂f (g(x)) o (v) + o (v) = o (v) . Letting z = f (u. T T T . ∂yi ∂xk ∂xk Remember there is a sum on the repeated index. ∂t From the above. v are as just described.21). ∂x ∂u ∂x ∂v ∂x Clearly you can continue in this way taking partial derivatives with respect to any of the other variables.9. vi = 0 unless i = k in which case vk = t. y. If an object moves in three dimensions. Example 20. Now t→0 lim |o (tek )| =0 t and so.9. cos φ) .4 Recall spherical coordinates are given by x = ρ sin φ cos θ. v) where u. s) = ∂z s tan x + y 2 + ts. taking the limit yields ∂ (f ◦ g) (x) = ∂xk p i=1 ∂f (g (x)) ∂gi (x) ∂yi ∂xk as claimed.3 Let f (u. ﬁnd ∂z and ∂x . ∂z ∂z ∂u ∂z ∂v = + = v cos (uv) sin (x) + us cos (uv) . Using the formula. This establishes (20. eθ = (−ρ sin φ sin θ. z = ρ cos φ.

ρ sin φ sin θ. Let r (ρ. Letting only φ change and ﬁxing θ and ρ. The giant turtle carrying the ﬂat earth on its back as it proceeds through the universe is not usually accepted as a correct model. this gives a curve in the direction of increasing ρ.22). . −ρ sin φ sin θ. the solution is a = 0. ∂φ ρ ∂eρ 1 T = (− sin φ sin θ. y (t) .22) T Therefore. 0) = eθ . ρ cos φ) T 487 and ﬁx φ and θ. sin φ cos θ. dφ dθ eφ + eθ . To do so. by the chain rule. 0) T = (− cos φ sin φ) eφ + −ρ sin2 φ eρ .20. this gives a vector which is tangent to the sphere of radius ρ and points South. letting only ρ change. ∂ρ deρ dt ∂eρ dφ + ∂φ dt 1 dφ eφ + ρ dt ∂eρ dθ ∂θ dt 1 dθ eθ . c such that ? = aeθ + beφ + ceρ . Similarly. b. 0) =? ∂θ where it is desired to ﬁnd a. θ. b = − cos φ sin φ. This must be diﬀerentiated with respect to t.9. = = By (20. ∂eρ 1 T = (cos φ cos θ. ∂θ ρ and ∂eρ = 0. (20. It is thought by most people that we live on a large sphere. Given we live on a sphere. z (t)) Now this implies the velocity is r (t) = ρ (t) eρ (t) + ρ (t) (eρ (t)) . −ρ sin φ sin θ. You see. − sin φ) = eφ . φ) where each of these variables is a function of t. Thus r (t) = ρ (t) eρ (t) = (x (t) . r = ρ eρ + ∂eθ T = (−ρ sin φ cos θ. θ. Thus it is a vector which points away from the origin. Thus −ρ sin φ sin θ ρ cos φ cos θ sin φ cos θ a −ρ sin φ cos θ ρ sin φ cos θ ρ cos φ sin θ sin φ sin θ b = −ρ sin φ sin θ 0 −ρ sin φ cos φ c 0 Using Cramer’s rule. ρ dt (20. cos φ sin θ. φ) = (ρ sin φ cos θ.23) dt dt Now things get interesting. letting θ change and ﬁxing the other two gives a vector which points East and is tangent to the sphere of radius ρ. DIFFERENTIATION AND THE CHAIN RULE Why these vectors? Note how they were obtained. and c = −ρ sin2 φ. what directions would be most meaningful? Wouldn’t it be the directions of the vectors just described? Let r (t) denote the position vector of the object from the origin. eρ = eρ (ρ. Thus ∂eθ ∂θ = (−ρ sin φ cos θ.

0) = eθ . 0) (cot φ) eθ ∂eφ 1 T = (cos φ cos θ. T = = (−ρ cos φ sin θ. the chain rule is used to obtain r which will yield a formula for the acceleration in terms of the spherical coordinates and these special vectors. ∂ρ ρ With these formulas for various partial derivatives.23). ρ cos φ cos θ. − sin φ) = eφ . FUNCTIONS OF MANY VARIABLES ∂eθ T = (−ρ cos φ sin θ. ∂eφ ∂eφ ∂eφ θ + φ + ρ ∂θ ∂φ ∂ρ θ cot φ eθ + φ (−ρeρ ) + ρ eφ ρ r = ρ eρ + φ eφ + θ eθ + ρ (eρ ) + φ (eφ ) + θ (eθ ) and from the above. sin φ cos θ. this equals ρ eρ + φ eφ + θ eθ + ρ φ θ φ θ eθ + eφ ρ ρ ρ θ cot φ eθ + φ (−ρeρ ) + eφ ρ + + ρ eθ ρ θ (− cos φ sin φ) eφ + −ρ sin2 φ eρ + φ (cot φ) eθ + .488 Also. ∂ρ ρ and Now in (20. ∂eφ T = (−ρ sin φ cos θ. ρ cos φ cos θ. −ρ cos φ) = −ρeρ ∂φ ∂eφ ∂θ and ﬁnally. By the chain rule. −ρ sin φ sin θ. cos φ sin θ. 0) = (cot φ) eθ ∂φ ∂eθ 1 T = (− sin φ sin θ. d (eρ ) = dt = ∂eρ ∂eρ ∂eρ θ + φ + ρ ∂θ ∂φ ∂ρ θ φ eθ + eφ ρ ρ d (eθ ) dt = = ∂eθ ∂eθ ∂eθ θ + φ + ρ ∂θ ∂φ ∂ρ θ (− cos φ sin φ) eφ + −ρ sin2 φ eρ + φ (cot φ) eθ + ρ eθ ρ d (eφ ) = dt = By (20.23) it is also necessary to consider eφ .

this would be a known function.9. 20. (There is no reason to think these will stay the same. . φ . and ρ. Thus r equals r = ρ −ρ φ + θ + 2 489 −ρ θ 2 sin2 (φ) eρ + φ + 2ρ φ − θ ρ 2 cos φ sin φ eφ + 2θ ρ + 2φ θ cot (φ) eθ . θ . probably 0. You might want to specify forces in the eθ and eφ direction as well and attempt to control the position of the rocket or rather its payload.2 Diﬀerentiation And The Derivative By now.5 Let f : U → Rp where U is an open set in Rn for n. As an example of how this could be used. This sort of computation is usually done by a computer. Suppose you wanted to know the latitude and longitude of the rocket as a function of time. 3 You won’t be able to ﬁnd the solution to equations like these in terms of simple functions. More could be said here involving moving coordinate systems and the Coriolis force. θ.) Then all that would be required would be to solve the system of diﬀerential equations3 . Note the prominent role played by the chain rule. All of the above is done in books on mechanics for general curvilinear coordinate systems and in the more general context. say a (t).20. Of course this force produces a corresponding acceleration which can be computed as a function of time. p ≥ 1 and let x ∈ U be given.) and then initial conditions for φ. θ.). 2ρ φ 2 − θ cos φ sin φ = 0. the rocket becomes less massive and so the acceleration will be an increasing function of t. DIFFERENTIATION AND THE CHAIN RULE and now all that remains is to collect the terms. Thus you could predict where the booster shells would fall to earth so you would know where to look for them. However. Of course there are many variations of this. ρ (0) = ρ0 (the distance from the launch site to the center of the earth. consider a rocket. The point is that if you are interested in doing all this in terms of φ. ρ −ρ φ φ + 2 −ρ θ 2 sin2 (φ) = a (t) . ρ (0) = ρ1 (the initial vertical component of velocity of the rocket. This means you determine the value of the various variables for various values of t without ﬁnding a neat formula involving known functions for the solution.9. Suppose for simplicity that it experiences a force only in the direction of eρ . As the fuel is burned. you may have suspected there should be a deﬁnition of something which will be equivalent to the assertion that a function has a good linear approximation. the above shows how to do it systematically and you see it is all an exercise in using the chain rule. Deﬁnition 20.9. This is in fact the case. The initial value problems could then be solved numerically and you would know the distance from the center of the earth as a function of t along with θ and φ. they can be solved numerically. special theorems are developed which make things go much faster but these theorems are all exercises in the chain rule. However. Then f is deﬁned to be diﬀerentiable at x ∈ U if and only if f has a good linear approximation at x. directly away from the earth. ρ and this gives the acceleration in spherical coordinates. ρ 2θ ρ + 2φ θ cot (φ) = 0 θ + ρ along with initial conditions.

an obvious question is: What is the derivative? In the case of a scalar function of one variable. What if n. f (x) equals the limit of the diﬀerence quotient as before and this shows that if f is diﬀerentiable in the new way. Why bother with a new deﬁnition if there is an old one which is already perfectly good? It is because in the case of a function of n variables. the old deﬁnition makes absolutely no sense because you can’t divide by a vector. p = 1. Is this new deﬁnition equivalent to the old one in this case? In the case where n.490 FUNCTIONS OF MANY VARIABLES Of course an obvious question arrises. Thus v → f (x) v is a linear mapping from R to R. Till now. if f is diﬀerentiable in the old way. ∂xj Taking the ith coordinate of the above equation yields p fi (x + v) = fi (x) + j=1 ∂fi (x) vj + o (v) ∂xj . Then |f (x + v) − f (x) − f (x) v| =0 |v| |v|→0 lim because |f (x + v) − f (x) − f (x) v| f (x + v) − f (x) = − f (x) |v| v and f (x) is deﬁned as the limv→0 f (x+v)−f (x) . suppose f (x) exists. the derivative is the same thing. p = 1. (20.24) where v ∈ R. but multiplication by it yields a linear transformation from R to R. it has been thought of as a number and this is what it is. Now that a deﬁnition of what it means for a function to be diﬀerentiable has been given. Recall this implies p f (x + v) = f (x) + j=1 ∂f (x) vj + o (v) . a linear transformation. In the more general setting described above. Now suppose f is diﬀerentiable in the new way in terms of having a good linear approximation. and f (x) can be thought of as this linear mapping. it is diﬀerentiable in the new way just described. Let f : U → Rq where U is an open subset of Rp and f is diﬀerentiable. Therefore. Therefore. f (x + v) = f (x) + f (x) v + o (v) . then it is diﬀerentiable in the old way. Then f (x + v) − f (x) − f (x) v = = f (x) v + o (v) − f (x) v o (v) v and this last expression converges to 0 by the deﬁnition of what it means to be o (v) . This is the whole reason for the new way of looking at things. v It follows that f (x + v) − f (x) − f (x) v = o (v) .

y) ∂y2 ∂f1 (x. Deﬁnition 20.y) ∂f1 (x. .y) ∂y2 ∂f2 (x. here is a picture. y) = ∂f1 (x. DIFFERENTIATION AND THE CHAIN RULE 491 and it follows that the term with a sum is nothing more than the ith component of J (x) v where J (x) is the q × p matrix. ∂f1 ∂x1 ∂f2 ∂x1 ∂f1 ∂x2 ∂f2 ∂x2 .y) ∂xn ∂f2 (x.20.y) ∂y2 ∂fq (x. First there was the concept of continuity. y) = f (x. Thus J (x) v = Df (x) v = f (x) v.y) ∂y1 ∂f2 (x. Next there was the concept of diﬀerentiability and the derivative being a linear transformation determined by a certain matrix. . .y) ∂xn ∂fq (x. it was shown that if a function is C 1 . ﬁx y and take the derivative of the resulting function of x. where the ﬁrst of these terms means matrix multiplication in which v ∈ Rp and the second two denote a linear transformation acting on the vector. Next the concept of partial or directional derivative. y) + D1 f (x. D1 f (x.y) ∂x1 ··· ··· .y) D1 f (x. y) is a linear transformation mapping Rn to Rq which satisﬁes f (x + v. y) v + o (v) . There is no need to distinguish these subtle issues at this time. .y) ∂yp There have been quite a few terms deﬁned. Also suppose f : U × V → Rq is in C 1 (U × V ) .y) ∂yp . Generalization to more factors follows the same pattern. . then it has a derivative.y) ∂y1 ∂fq (x. .25) The linear transformation which results by multiplication by this q × p matrix is known as the derivative.y) ∂x1 .y) ∂xn while D2 f (x. . ∂x2 ∂f2 (x. y) is deﬁned similarly as the derivative of what is obtained when x is ﬁxed and the variable of interest is y. y) and D2 f (x.9. (See Problem 1) In other words. ··· ··· ··· . There is also a notion of partial derivative in this context. Finally.. y) as follows. ∂f1 (x.y) ∂x2 ∂fq (x. y) = ∂x1 ∂f2 (x. To give a rough idea of the relationships of these topics. ··· ∂f1 ∂xp ∂f2 ∂xp ∂fq ∂xp and that f (x + v) = f (x) + J (x) v + o (v) .y) ∂yp ∂f2 (x. In terms of matrices of partial derivatives. ··· ∂f1 (x.y) ∂y1 . D2 f (x. ∂f1 (x. Just think of them as diﬀerent notations for the derivative.6 Suppose U is an open set in Rn and V is an open set in Rp . .y) ∂x2 ∂fq (x. Deﬁne D1 f (x. v.. ∂fq ∂x1 ∂fq ∂x2 ··· ··· . . . (20.9. This linear transformation is also denoted by Df (x) or sometimes f (x) to correspond completely with the earlier notation.. ∂fq (x.

and are all equal to zero but the function is not even continuous at (0. show that Jf (x) the above matrix of partial derivatives exists and Df (x) v = Lv = Jf (x) v. 0) . 2. Let f (x. 3. y) = if (x. Show Dv f (x) = Df (x) v. . L such that f (x + v) = f (x) + Lv + o (v) . Of course there are. 0) and that the partial derivatives exist at (0. 4. In fact. i = 1. such an example exists even in one dimension. 0) but the function is not diﬀerentiable at (0. Consider f (x) = 1 x2 sin x if x = 0 . Hint: Replace v with tv and use the above to write f (x+tv) − f (x) o (t) = Li v+ .26) Then you should verify that f (x) exists for all x ∈ R but f fails to be continuous at x = 0. t t Now this implies L1 v = L2 v+ o(t) . y) = x if |y| > |x| . using the deﬁnition in Problem 1. 20. 0) . 0 if x = 0 (20. In more advanced treatments of diﬀerentiation the following is done. 0) . −x if |y| ≤ |x| Show f is continuous at (0. t 2. Suppose f : U → Rq and let x ∈ U and v be a unit vector. Let f : U → Rq where U is an open set in Rp . y) = (0. y) = (0. 0) (x2 +y 4 )2 (x2 −y4 ) 2 Show that all directional derivatives of f exist at (0.492 FUNCTIONS OF MANY VARIABLES Partial derivatives xy x2 +y 2 derivative C1 Continuous |x| + |y| You might ask whether there are examples of functions which are diﬀerentiable but not C 1 . Now. That is. 0) .10 Exercises 1. Let f (x. f is diﬀerentiable at x ∈ U if there exists a linear transformation. there can’t be two diﬀerent linear transformations which satisfy the above equation. then Df (x) is well deﬁned. 5. 1 if (x. Thus the function is diﬀerentiable at every point of R but fails to be C 1 at every point of R. Show that deﬁning L ≡ Df (x) .

x cos (xy) .31) occurs when v = − f (x) / | f (x)| . · · ·.11.30) Proof: From (20.11. f (x) ≡ ∂f ∂x1 ∂f (x) .2 Let f (x.27) ∂xi j=1 Deﬁnition 20.11 The Gradient Let f : U → R where U is an open subset of Rn and suppose f is diﬀerentiable on U. from the above.1 Deﬁne gradient vector. 1 1 1 Therefore. the maximum in (20.3 Let f : U → R be a diﬀerentiable function and let x ∈ U. Find Dv f (1. 1) . It follows immediately from (20. √ . 0. y.11. Note this vector which is given is already a unit vector. the directional derivative is (2. y. 1) = (2.29) 1 1 1 √ . From (20. Thus if x ∈ U. for v a unit vector. 0.29) and the Cauchy Schwarz inequality. THE GRADIENT 493 20. taking t → 0. 0. 1) . f (x+tv) − f (x) t = = Therefore. Furthermore. 1)· √3 .28) An important aspect of the gradient is its relation with the directional derivative. Because of (20.11. √ 3 3 3 o (tv) t o (t) f (x) · v+ . it is only necessary to ﬁnd f (1. z) = (2x.29) it is easy to ﬁnd the largest possible directional derivative and the smallest possible directional derivative. √3 . √3 = √ 4 3 3. (20.31) f (x) / | f (x)| and the minimum (20.27) that f (x + v) = f (x) + f (x) · v + o (v) (20.30) occurs when v = in (20. 1. ∂xn (x) T . 1) and take the dot product. Then max {Dv f (x) : |v| = 1} = | f (x)| and min {Dv f (x) : |v| = 1} = − | f (x)| . Therefore. n ∂f (x) f (x + v) = f (x) + vi + o (v) . (20. There are ways to deﬁne the gradient for vector valued functions but this will not be attempted in this book.28). f (1. f (x. 1) where v = . (20. t f (x) · v+ Example 20. − | f (x)| ≤ Dv f (x) ≤ | f (x)| . Proposition 20. |Dv f (x)| ≤ | f (x)| and so for any choice of v with |v| = 1. Therefore. This vector is called the This deﬁnes the gradient for a scalar valued function.20. 1. . z) = x2 +sin (xy)+z. Dv f (x) = f (x) · v.

The gradient has fundamental geometric signiﬁcance illustrated by the following picture. Thus the surface is deﬁned by f (x. y. it is indistinguishable from the problem of heat ﬂow. z) = c} . Because it has cool places and hot places. The above relation between the heat ﬂux and u is usually called the Fourier heat conduction law and the constant. this would be a piece of a sphere. An identical relationship is usually postulated for the ﬂow of a diﬀusing species. K is known as the coeﬃcient of thermal conductivity. it follows 4 Do there exist any smooth curves which lie in the level surface of f and pass through the point (x0 . it tries to move from areas of high concentration toward areas of low concentration. is C 1 . z2 (s)) which intersect at the point. z) . y2 (s) . y. Like heat. It is a material property. In most applications. y1 (t) . Mathematically. the surface is a piece of a level surface of a function of three variables. It is reasonable to suppose the rate at which heat ﬂows across this surface will be largest when n is in the direction of greatest rate of decrease of the temperature. z0 ) . It may be an insecticide in ground water for example. z0 )? It turns out there do if f (x0 . For example. 2 The conclusion of the above proposition is important in many physical models. (x0 . However. y0 . This intersection occurs when t = t0 and s = s0 . If it is desired to ﬁnd the rate in calories per second at which heat crosses this little surface in the direction of n it is deﬁned as J · nA where A is the area of the surface and J is called the heat ﬂux. For example. z0 ) on this surface4 . x1 (t) for t in an interval lie in the level surface. this relationship is known as Fick’s law. diﬀerent for iron than for aluminum. y0 . it is expected that the heat will ﬂow from the hot places to the cool places. y0 . Since the points. z) = x2 + y 2 + z 2 . There are two smooth curves in this picture which lie in the surface having parameterizations. When applied to diﬀusion. Experiments show this scalar should depend on temperature. if f (x. In this problem. f. consider some material which is at various temperatures depending on location. y0 . K is considered to be a constant but this is wrong. x1 (t) = (x1 (t) . y. Thus n is a normal to this surface and has unit length. n. Nevertheless. something like a pollutant diﬀuses. z) : f (x. then Dv f (x) = = f (x) · ( f (x) / | f (x)|) | f (x)| / | f (x)| = | f (x)| . Consider a small surface having a unit normal. this is a £ # £ f (x0 . y. things get very diﬃcult if this dependence is allowed. heat ﬂows most readily in the direction which involves the maximum rate of decrease in temperature.494 FUNCTIONS OF MANY VARIABLES The proposition is proved by noting that if v = − f (x) / | f (x)| . z) = c or more completely as {(x. b & £ & x1 (t0 ) £ & t t t x2 (s0 ) In this picture. y. z0 ) = 0 and if the function. In this case J = −K c where c is the concentration of the diﬀusing species. Note the importance of the gradient in formulating these models. In other words. z1 (t)) and x2 (s) = (x2 (s) . then Dv f (x) = f (x) · (− f (x) / | f (x)|) 2 = − | f (x)| / | f (x)| = − | f (x)| while if v = f (x) / | f (x)| . This expectation will be realized by taking J = −K u where K is a positive scalar function which can depend on a variety of things. f (x. The constant can depend on position in the material or even on time.

11. y1 (t) . z1 (t)) · x1 (t) = 0. 4y. y2 (s) . it suﬃces to ﬁnd the normal vector to the proposed plane. 1) is a point on this level surface. 1 1 1 √ 2. z. these two direction vectors would determine a plane which deserves to be called the tangent plane to the level surface of f at the point (x0 . First note that (1. w. 6z) and so f (1. (x0 . y. 2. ∂y ∂z In terms of the gradient. 2) 2. it follows f (x0 . f (x. one of the greatest theorems in all mathematics and a topic for an advanced calculus class. 1. 6) · (x − 1. y. 4. taking the derivative of both sides and using the chain rule on the left.20. 1. this merely states f (x1 (t) . y0 . 1) . 1) consequence of the implicit function theorem. y0 . y0 . z0 ). f (x2 (s) .12 Exercises 1. u) = (1. 1. z1 (t)) x1 (t) + ∂x ∂f ∂f (x1 (t) . from this problem. 0) (c) u ln x + y + z 2 + w at (x. . z0 ) · x2 (s0 ) = 0. Therefore. z0 ) · x1 (t0 ) = 0. z0 ) and that f (x0 . y1 (t) . 4. f (x0 . (a) x2 y + z 3 at (1. 6) . y1 (t) . z1 (t)) = c for all t in some interval. 1. 1. 1. 2) (b) z sin x2 y + 2x+y at (1. z1 (t)) y1 (t) + (x1 (t) . But f (x. Therefore.4 Find the equation of the tangent plane to the level surface. Find the gradients of f = (a) x2 y + z 3 at (1. y. the equation of the plane is (2. z) = (2x. 1. EXERCISES 495 f (x1 (t) . ∂f (x1 (t) . z0 ) is perpendicular to both the direction vectors of the two indicated curves shown. Similarly. z1 (t)) z1 (t) = 0. It follows f (x0 . 2 . 1. y0 . 1. z0 ) is perpendicular to this tangent plane at the point. z − 1) = 0 or in other words. y0 . y. Surely if things are as they should be. 20. y0 . Example 20. y − 1. z) = x2 + 2y 2 + 3z 2 at the point (1. 2x − 12 + 4y + 6z = 0. 1) = (2. z2 (s)) · x2 (s) = 0. Letting s = s0 and t = t0 . z) = 6 of the function. y1 (t) . Therefore.12. y1 (t) . Find the directional derivatives of f at the indicated point in the direction. f (x.

r) . (a) x2 y + z 3 = 2 at (1.1 Let D (x0 . n. it follows that for all x. y (t) . y ∈ D (x0 . z) = 0 and f2 (x. r) ≡ {x ∈ Rn : |x − x0 | ≤ r} and suppose U is an open set containing D (x0 . 1.13 Nonlinear Ordinary Diﬀerential Equations Local Existence Recall Deﬁnition 20. 2) (c) xy + z 2 + w at (1.7. 22 . 1] as shown in the following picture. z (t)) . 1) . r) r r x x0 r y Observe x+ (0) (y − x) = x while x+ (1) (y − x) = y and for t ∈ (0. |f (x) − f (y)| ≤ K |x − y| . where M denotes ∂f the maximum of ∂xi (z) for z ∈ D (x0 .13.496 (b) z sin x2 y + 2x+y at (1. 1. The following lemma applies to such functions. Proof: Let x. 2 2 4. y. 5. 1) (b) z sin x2 y + 2x+y = 2 sin 1 + 4 at (1. y. 2) (c) cos (x) + z sin (x + y) = 1 at −π. 3) FUNCTIONS OF MANY VARIABLES 3. A direction vector for the desired curve should be perpendicular to both of these gradients. (x (t) . x+t (y − x) for t ∈ [0. Explain why the displacement vector of an object moving in R3 is always perpendicular to the velocity vector if the object is always at a ﬁxed distance from a given point. r) and consider the line segment joining these two points. The level surfaces x2 + y 2 + z 2 = 4 and z + x2 + y 2 = 4 have the point 22 . D(x0 . · · ·. 2. Hint: Recall the gradients of the two surfaces are perpendicular to the corresponding surfaces at this point. |x+t (y − x) − x0 | = ≤ ≤ |(1 − t) (x − x0 ) + t (y − x0 )| (1 − t) |x − x0 | + t |y − x0 | (1 − t) r + tr = r . Then for K = M n.3. Find a direction vector for this curve at this point. z) = 0 are two level surfaces which intersect in a curve which has parameterization. r) such that f : U → Rn is C 1 (U ) . r) and i = 1. √ √ 20. 1 in the curve formed by the intersection of these surfaces. In a slightly more general setting. suppose f1 (x. Find a diﬀerential equation for this curve. 3π . Find the tangent plane to the indicated level surface at the indicated point. 6. 1. Lemma 20. y ∈ D (x0 .

the cosine of this angle is always negative. It follows that (y−P y) · (P x−P y) ≤ 0 and (x−P x) · (P y−P x) ≤ 0. If h (t) = f (x+t (y − x)) for t ∈ [0. 1] are contained in D (x0 . the angle between the vectors x−P x and z−P x is always greater than π/2 radians. 1 n 0 i=1 1 n i=1 n i=1 ∂f (x+t (y − x)) (yi − xi ) dt ∂xi ∂f (x+t (y − x)) |yi − xi | dt ∂xi ≤ 0 ≤ M |yi − xi | ≤ M n |x − y| . For x ∈ D (x0 . Also. by the chain rule. r) P x will be the closest point in D (x0 .2 For any pair of points. |P x − P y| ≤ |x − y| . y ∈ Rn . Now consider the map. For x ∈D (x0 . . n h (t) = i=1 ∂f (x+t (y − x)) (yi − xi ) .13. NONLINEAR ORDINARY DIFFERENTIAL EQUATIONS LOCAL EXISTENCE 497 showing that the points on the line segment x+t (y − x) for t ∈ [0. r) as claimed and as indicated in the picture. P x = x. P which maps all of Rn to D (x0 . r) . then 1 f (y) − f (x) = h (1) − h (0) = 0 h (t) dt. Therefore. r) • x0 Lemma 20. r) to x. / x $$P x W $$ • $ z ' D(x0 .13. Proof: From the above picture and z ∈ D (x0 . ∂xi |f (y) − f (x)| = Therefore.20. x. r) arbitrary. r) given as follows. 1] .

then |f (t. c − Proof: Consider the following picture. Therefore. then one can take I = [p. b] × U → Rn be continuous and suppose that for each t ∈ [a. y ∈ D (x0 . r)} .498 FUNCTIONS OF MANY VARIABLES Thus (x−P x) · (P x−P y) ≥ 0 and so if subtracting yields (x−P x− (y−P y)) · (P x−P y) ≥ 0 which implies (x − y) · (P x−P y) ≥ (P x−P y) · (P x−P y) = |P x−P y| . b] . Now apply the Cauchy Schwarz inequality to the left side of the above inequality to obtain |x − y| |P x−P y| ≥ |P x−P y| 2 2 which yields the claim of the lemma. x)| : (t. x) − f (t. U D(x0 .13. r) . |f (t.33) ∂f From Lemma 20. If D (x0 . r M and q = min b. x (c) = x0 . there exists a constant. q] where p = max a. and if M ≡ max {|f (t. r) The large dotted circle represents U and the little solid circle represents D (x0 .13. r) is a closed ball as deﬁned above which is contained in U. (20. b] × D (x0 . Theorem 20. Now let P denote the projection map deﬁned above. This implies the following local existence and uniqueness theorem. P x) − f (t. x) is continuous. y)| ≤ K |x − y| for all t ∈ [a. r) is contained in U as shown. c + r M .13. by Lemma 20.32) valid for t ∈ I. K such that if x. Here r is chosen so small that D (x0 .1 and the continuity of x → ∂xi (t. Then there exists an interval. x) . x) ∈ [a. x = f (t. b] . P y)| ≤ K |P x−P y| ≤ K |x − y| . x (c) = x0 (20.2. b] be a closed interval and let U be an open subset of Rn .3 Let [a. b] . . P x) . x) . Also let x0 ∈ U and c ∈ [a. I ⊆ [a. Let ∂f f : [a. the map x → ∂xi (t. b] such that c ∈ I and there exists a unique solution to the initial value problem. Consider the initial value problem x = f (t. r) as indicated.

20.14. This was easy because you could easily solve the constraint equation for one of the variables in terms of the other. However. Then there exists an interval. It turns out that at an extremum. x (t) ∈ B (x0 . f (t.4 Let [a. c + r M . Things are clearly getting more involved and messy. x + y = 4 to ﬁnd y = 4 − x. t t |x (t) − x0 | = c x (s) ds ≤ c |x (s)| ds ≤ M |t − c| r and so. Also. Theorem 20. For example. P x) = f (t. This relation can be seen geometrically as in the following picture.32) on I. substitute it into f. b] be a closed interval and let U be an open subset of Rn . Therefore. x) ∈ [a. Solve for one of the variables. r)} . r) and so for [p. b] . y. b] × D (x0 . . z) subject to the constraints x2 + y 2 = 4 and z = 2x + 3y 2 . for these values of t. it seems you might encounter many cases and it does look a little fussy.34) and q = min b. there is a simple relationship between the gradient of the function to be maximized and the gradient of the constraint function.33) has a unique solution valid for t ∈ [a. in the constraint equation. q] where p = max a.33) is the solution to (20. However. it follows x (t) ∈ D (x0 . Let f : [a. It remains to describe I. Thus the two numbers are x = y = 2. q] . b] . Since x is continuous. and then ﬁnd where the partial derivatives equal zero to ﬁnd candidates for the extremum. it follows that there exists an interval. There is a more general theorem which just gives existence. If D (x0 . LAGRANGE MULTIPLIERS 499 It follows from Theorem 15. suppose you want to maximize xy given that x+y = 4. This is not too hard to do using methods developed earlier. it does involve the notion of compactness in a space of continuous functions and this is a topic for an advanced calculus course. then one can take I = [p. z) = xyz subject to the constraint that x2 + y 2 + z 2 = 4? It is still possible to do this using the techniques of Problem 5 on Page 473. x (c) = x0 valid for t ∈ I. c − r M (20. Solve for one of the variables in the constraint equation. What if you wanted to maximize f (x.13.14 Lagrange Multipliers Lagrange multipliers are used to solve extremum problems for a function deﬁned on a level set of another function. Then the function to maximize is f (x) = x (4 − x) and the answer is clearly x = 2. and if M ≡ max {|f (t.32). I containing c such that for t ∈ I. y. However. say y. I ⊆ [a. x) and so there is a unique solution to (20. x) . if |t − c| < M . x = f (t. This is called the Peano (pronounced piano) existence theorem and it is one of the most useful results in certain areas of applied math. r) .6 on Page 370 that (20. x)| : (t.7. Also let x0 ∈ U and c ∈ [a. P x (t) = x (t) and the solution to (20. 20. say z. q] as given. b] such that c ∈ I and there exists a solution to the initial value problem. if t ∈ [p. Now what if you wanted to maximize f (x. what if you had many constraints. sometimes you can’t solve the constraint equation for one variable in terms of the others. This proves the theorem. r) is a closed ball which is contained in U. The proof is not particularly hard and may be found as an exercise in [10] or [14]. b] × U → Rn be continuous.

y0 . z0 ) f (x0 .500 FUNCTIONS OF MANY VARIABLES £ £ £ & £t££ q (x0 . Then g (x. z0 ) is some scalar multiple of g (x0 . y. 2z) and f (x. y. z0 ) is also perpendicular to the surface. However. h (t) = f (x (t) . y0 . being essentially conﬁned to three dimensions. the surface represents a piece of the level surface of g (x. it fails to consider the situation in which there are many constraints. z) = (yz.14. z0 ) This λ is called a Lagrange multiplier after Lagrange who considered such problems in the 1700’s. z0 ) . z) = x2 +y 2 +z 2 −27. Then at the point which maximizes this function5 . z0 ) · x (t0 ) and since this holds for any such smooth curve. y (t0 ) . As shown above. But this means 0 = h (t0 ) = f (x (t0 ) . z (t)) is a point on a smooth curve which passes through (x0 . 2z) . y0 . y. (yz. z0 ). y0 . y0 . z (t)) must have either a maximum or a minimum at the point. g (x0 . z) = xyz while g (x. z0 ) .1 Maximize xyz subject to x2 + y 2 + z 2 = 27. Therefore. z0 ) £ £ £ £ p In the picture. then the function. Thus f (x0 . Here f (x. It does not deal with the question of existence of smooth curves lying in the constraint surface passing through (x0 . z) = 0 and f (x. z) = (2x. y0 . z0 ) is perpendicular to the surface or more precisely to the tangent plane. y. xy) . 2y. This picture represents a situation in three dimensions and you can see that it is intuitively clear that this implies f (x0 . if x (t) = (x (t) . z (t0 )) · x (t0 ) = f (x0 . xz. z0 ) £ t £ £ £ £ £ # g(x0 . Of course the above argument is at best only heuristic. y0 . However. 2y. . z0 ) = λ g (x0 . z0 ) when t = t0 . y0 . xy) = λ (2x. 5 There exists such a point because the sphere is closed and bounded. I think it is likely a geometric notion like that presented above which led Lagrange to formulate the method. xz. y0 . z) is the function of three variables which is being maximized or minimized on the level surface and suppose the extremum of f occurs at the point (x0 . f (x0 . Nor does it consider all cases. y0 . y0 . h (t0 ) = 0. Example 20. y0 . y. y0 . y. In addition to this. y (t) . t = t0 . y (t) .

Then there exists (p. Therefore. 2λz 2 equals xyz. −3. i = 1. x0 is a local maximum if f (x0 ) ≥ f (x) for all x near x0 which also satisﬁes the constraints (20. 3) (−3. · · ·. the only candidates for the point where the maximum occurs are (3. There are no magic bullets here. Not much is lost in doing this because this extra regularity will be available in nearly every conceivable example of any interest. m (20. x (t)) . Theorem 20. LAGRANGE MULTIPLIERS 501 Therefore. |x| = |y| = |z| .2 Let f : (a. x to the initial value problem.4 on Page 499 there exists an open interval. First it is essential to have a special case of the implicit function theorem. Now consider the system of nonlinear equations f (x) gi (x) = t = 0. etc. x (t)) + D2 f (t.36) be a collection of equality constraints with m < n. A local minimum is deﬁned similarly. q) → U such that f (t. x (t))) = D1 f (t. Then 0 = f (t0 . d (f (t. x0 ) and by the chain rule. 3. x) is continuous. q) such that t0 ∈ (p. This proves the theorem. f (t0 . (−3.35) Therefore. · · ·. Let f : U → R be a C 1 function and let gi (x) = 0. x (t) = −D2 f (t. it does often help to do it this way. D1 f (t. Thus. x0 )) = 0. x (t)) must be a constant vector and this vector equals zero by (20. x (t)) x (t) = 0 dt D1 f (t. x : (p. x) (20.14. q) .13. x (t))−1 D1 f (t. the function. b) × U. 3) . .20. However. q) . each of 2λx2 .36). x) → −D2 f (t.35). However you can just assume in addition that the function (t. det (D2 f (t0 . i = 1. (p. 3) which can be veriﬁed by plugging in to the function which is being maximized. Theorem 20. The above generalizes to a general procedure. 3. 3) . 3. x (t)) + D2 f (t. (t. x (t0 ) = x0 −1 for t ∈ (p. 2λy 2 . x (t)) x (t) = 0. t → f (t. This is because by assumption. x) → −D2 f (t. It was still required to solve a system of nonlinear equations to get the answer. The maximum occurs at (3. Proof: By the Peano existence theorem6 . q) containing t0 and a solution. x (t)) = 0 for all t ∈ (p. m. 6 I have not presented a proof of this important theorem so this part of the argument is technically incomplete. q) and a function. Suppose also that for t0 ∈ (a. It follows that at any point which maximizes xyz. b) × U → Rn where U is an open set in Rn and suppose D1 f and D2 f are both continuous on (a. x (t)) −1 D1 f (t. x (t)) is C 1 and then use the Theorem on the local existence for solutions to ordinary diﬀerential equations which was proved. x0 ) = 0.14. b) and x0 ∈ U.

. and a function y deﬁned on this interval with the property that F (t. · · ·. · · ·. fx1 (x0 ) fx2 (x0 ) · · · fxn (x0 ) g1x1 (x0 ) g1x2 (x0 ) · · · g1xn (x0 ) (20. . Now let f 1 (y) be deﬁned as what is obtained from f (x) by replacing xj for j none of the ik with x0j and xik with yk and let g l (y) be deﬁned similarly for l = 1. · · ·. . · · ·. replace the variable. m. g1x1 (x0 ) g1x2 (x0 ) · · · g1xn (x0 ) . (The case of a local minimum is handled similarly.9. · · ·. q) . . . . x0n ) and let y ∈ Rm+1 be deﬁned as y ≡ (y1 . ··· ··· ··· 1 fym+1 (y0 ) 1 gym+1 (y0 ) . . ym+1 ) ≡ xi1 . if j is not equal to one of the ik . . y (t)) = 0 for all t ∈ (p. . m gym+1 (y0 ) has nonzero determinant. Suppose therefore. (20. m.37) has rank less than m + 1 as claimed. . . .19 on Page 456. . .) Thus t0 ≡ f (x0 ) is the largest value of any f (x) for x near x0 satisfying the constraints. xim+1 where x ∈ U. g l (y0 ) = 0. . as noted above. 1 1 fy1 (y0 ) fy2 (y0 ) 1 1 gy (y0 ) gy (y0 ) 2 1 .39) . m m gy1 (y0 ) gy2 (y0 ) = 0 for each l = 1. This is a contradiction because. gmx1 (x0 ) gmx2 (x0 ) ··· gmxn (x0 ) . 2. gmx1 (x0 ) gmx2 (x0 ) · · · gmxn (x0 ) It will be shown that this matrix has rank less than m + 1. f 1 (y0 ) = t0 . . . xim+1 corresponding to the columns used to get the m + 1 × m + 1 matrix having nonzero determinant. this matrix has rank m + 1 then by Deﬁnition 19. . xim+1 and let y0 ≡ (y01 . y0 ) where F : R×W → Rm+1 is deﬁned as 1 f (y) − t g 1 (y) F (t. the matrix of (20. .502 FUNCTIONS OF MANY VARIABLES Now suppose x0 satisﬁes all the constraints and is a point where f has a local maximum. g m (y) Therefore by Theorem 20. xi1 . y0m+1 ) ≡ x0i1 .37) . For x ≡ (x1 . · · ·. · · ·. xn ) . · · ·.14. Therefore. t0 ≥ f 1 (y) . Consider the m + 1 × n matrix. x0im+1 . . t0 is the largest value for all f 1 (y) with y close to y0 such that g l (y) = 0. . . some m + 1 × m + 1 submatrix has nonzero determinant. But this matrix equals D2 F (t0 . This will be established by deriving a contradiction if the matrix has rank m + 1. and for all y close enough to y0 with g l (y) Furthermore.2 there is an open interval containing t0 . Suppose now the rank of the matrix. · · ·. This determines m + 1 variables. (p. q) . . . .38) . xj with the constant x0j where x0 ≡ (x01 . y) ≡ (20. · · ·. Denote by W the set of points in Rm+1 W ≡ y ∈ Rm+1 : y = xi1 . Therefore.

To help remember how to use (20. λ1 .4 Minimize xyz subject to the constraints x2 + y 2 + z 2 = 4 and x − 2y = 0. If µ = 0 and λ = 0. from the ﬁrst two equations. + · · · + λm = λ1 . xn . In general. When you take the derivatives with respect to the Lagrange multipliers. then there exist scalars. yz − 2λx − µ = xz − 2λy + 2µ = xy − 2λz x + y2 + z2 2 0 0 0 4 0 = = = x − 2y Now you have to ﬁnd the solutions to this system of equations. Therefore. this could be very hard or even impossible. This means there is an m × m submatrix which has nonzero determinant and so none of the above rows can be a linear combination of the others. which . · · ·.37) is a linear combination of the rows of the above matrix. . Theorem 20. x1 .40) L = f (x) − i=1 λi gi (x) and then proceed to take derivatives with respect to each of the components of x and also derivatives with respect to each λi and set all of these equations equal to 0. ···. . gmxn (x0 ) g1xn (x0 ) fxn (x0 ) The λi are called the Lagrange multipliers. Then if x0 ∈ U is either a local maximum or local minimum of f subject to the constraints (20. λ1 .39) equals m. In particular. and set what results equal to 0. λ1 .40) is what results from taking the derivatives of L with respect to the components of x. µ = 0 also.40) it may be helpful to do the following. λm . Therefore. by Theorem 19. then from the ﬁrst two equations.9. If λ = 0. This yields n + m equations for the n + m unknowns. but you just do your best and hope it will work out. and if the rank of the matrix in (20. . · · ·.14. . you just pick up the constraint equations. This proves the following theorem.20. m (20. First write the Lagrangian. λm such that gmx1 (x0 ) g1x1 (x0 ) fx1 (x0 ) . Example 20. . xyz = 2λx2 and xyz = 2λy 2 and so either x = y or x = −y. .14. either x or y must equal 0. there exist scalars. Then you proceed to look for solutions to these equations. The formula (20. Of course these might be impossible to ﬁnd using methods of algebra. LAGRANGE MULTIPLIERS 503 equals m. Form the Lagrangian.40) holds. λm such that (20. · · ·. .20 on Page 456 every row of the matrix of (20. L = xyz − λ x2 + y 2 + z 2 − 4 − µ (x − 2y) and proceed to take derivatives with respect to every possible variable.3 Let U be an open subset of Rn and let f : U → R be a C 1 function. then from the third equation.36).14. leading to the following system of equations.

divide the last equation by y and solve for λ to get λ = (2/5) z. Therefore. Find the points on y 2 x = 9 which are closest to (0. 4 3 . Maximize 2x + 3y − 6z subject to the constraint. and −2 8 15 . A can is supposed to have a volume of 36π cubic centimeters. 4. What does this say about the method of Lagrange multipliers? 5. 2. 8 15 . − √ √ 2 and the minimum value of the function subject to the constraints is − 5 30 − 2 3. The top and bottom of the can are made of tin costing 4 cents per square centimeter and the sides of the can are made of aluminum costing 5 cents per square centimeter. A can is supposed to have a volume of 36π cubic centimeters. x2 + 2y 2 + 3z 2 = 9. − 15 . However. But then from the fourth equation. Therefore. 5y 2 + z 2 = 4 y 2 − λz = 0 yz − λy + µ = 0 yz − 4λy − µ = 0 From the last equation. tell why. Find the dimensions of the can which minimizes the surface area. 3 You should rework this problem ﬁrst solving the second easy constraint for x and then producing a simpler problem involving only the variables y and z. Find the dimensions of the can which minimizes the cost. 6.504 FUNCTIONS OF MANY VARIABLES requires that both x and y equal zero thanks to the last equation. µ = (yz − 4λy) . Clearly the one which gives the smallest value is 2 8 15 . Substitute this into the third and get 5y 2 + z 2 = 4 y 2 − λz = 0 yz − λy + yz − 4λy = 0 y = 0 will not yield the minimum value from the above example. ± 4 3 . both µ and λ must be non zero. This satisﬁes the constraints and the product of these numbers equals a negative number. I know this is not the best value for a minimizer because I can take x = 2 3 . − 8 15 . Now put this in the second equation to conclude 5y 2 + z 2 = 4 . y = 3 . 0) if any exist. 3. and 5 5 z = −1. . Therefore. Now use the last equation eliminate x and write the following system. 0) . 4 3 4 3 8 8 or −2 15 . Thus y 2 = 8/15 and z 2 = 4/3. − 20. z = ±2 and now this contradicts the third equation. Find points on xy = 4 farthest from (0. a choice of 4 points to check. xyz equals zero in this case. ± 8 15 . candidates for minima are 2 8 15 . Find the dimensions of the largest rectangle which can be inscribed in a circle of radius r. Thus µ and λ are either both equal to zero or neither one is and the expression. If none exist.15 Exercises 1. y 2 − (2/5) z 2 = 0 a system which is easy to solve.

Find the point on the intersection of z = x2 + y 2 and x + y + z = 1 which is closest to (0. Maximize their product subject to the constraint that x1 + 2x2 + 3x3 + 4x4 + 5x5 = 300. z) on the level surface. Find the point. 18. 11. 1 + 3s) . 4x2 + y 2 − z 2 = 1which is closest to (0. A curve is formed from the intersection of the plane. 14. 19. x = (1 + 2t. Show ∂xi (Aij xj xi ) = 2Aij xj . 0. 9. xn ) = xn x2 · · · x1 . y. 1 + 2s. Let n be a positive integer. 0. Find the point on x2 4 T T + y2 9 + z 2 = 1 closest to the plane x + y + z = 10. 8. 15. Let A = (Aij ) be an n × n matrix which is symmetric. Minimize 4x2 + y 2 + 9z 2 subject to x + y − z = 1 and x − 2y + z = 0. Thus Aij = Aji and recall ∂ (Ax)i = Aij xj where as usual sum over the repeated index. the value of λ which j corresponds to the maximum value of this functions is such that Aij xj = λxi . Your answer should be j some function of a which you may assume is a positive number. 0) . 10. 16. 2x + 3y + z = 4 which is closest to the point (1. 0. 0) . j=1 x2 = 1. 3) . one chosen on one line and the other on the other line. 2. 0) . n 1 n S≡ x ∈ Rn : i=1 ixi = 1 and each xi ≥ 0 . A curve is formed from the intersection of the plane. Find the point on the level surface. ﬁnd x1 /xn . Minimize j=1 xj subject to the constraint j=1 x2 = a2 . 0) . Thus Ax = λx. Find the point on this curve which is closest to (0. · · ·. 20. 2x2 + xy + z 2 = 16 which is closest to (0. 2x + 3y + z = 3 and the cylinder x2 + y 2 = 4. 0.20. Find n numbers whose sum is 8n and the sum of the squares is as small as possible.15. 17. Let f (x1 . n−1 22. Then f achieves a maximum on the set. 21. 0) . (x. 3 + t) and x = (2 + s. This λ is called an eigenvalue of the matrix. If x ∈ S is the point where this maximum is achieved. 0. Minimize xyz subject to the constraints x2 + y 2 + z 2 = r2 and x − y = 0. Find the dimensions of the largest triangle which can be inscribed in a circle of radius r. x5 be 5 positive numbers. Find points p1 on the ﬁrst line and p2 on the second with the property that |p1 − p2 | is at least as small as the distance between any other pair of points. 12. · · ·. Find the point on this curve which is closest to (0. Find the point on the plane. Let x1 . 2x + 3y + z = 3 and the sphere x2 + y 2 + z 2 = 16. A. Show that when you use the method of Lagrange multipliers to maximize n the function. . 2 + t. Aij xj xi subject to the constraint. Here are two lines. 13. EXERCISES n n 505 7.

Deﬁnition 20. Extend the tangent line through (x.16. y) be a point on the ellipse. n 1 2 3 i n Show the maximum is r2 /n . ∂i g ≡ ∂xi . i x2 i i=1 ≤ 1 n n x2 i i=1 and ﬁnally. What is the analog of Taylor’s formula for a function of one variable? To answer this question. 24. Then if v ∈ Rp with |v| < r. 2r) ⊆ U. This is a function of one variable and so the one variable Taylor’s formula applies to F provided F has derivatives. Now show from this that n 1/n n n i=1 x2 = r2 . it follows that x + tv ∈ U whenever |t| ≤ 2. then n 1/n xi i=1 ≤ 1 n n xi i=1 and there exist values of the xi for which equality holds. Find the maximum value of the area of this triangle as a function of a and b. Now conclude that if x. 20. y) denote the area of the triangle formed by this line and the two coordinate axes. then xp yq xy ≤ + p q and there are values of x and y where this inequality is an equation. 2) . q are real numbers larger than 1 which have the property that 1 1 + = 1.1 Let ∂i denote the partial derivative with respect to xi where x = (x1 . The consideration of this question involves the function F (t) ≡ f (x + tv) for t ∈ (0. Maximize x2 y 2 subject to the constraint x2p y 2q + = r2 p q where p. p q show the maximum is achieved when x2p = y 2q and equals r2 . xp ) . This says the “geometric mean” is always smaller than the arithmetic mean. Let (x. First a deﬁnition is convenient. y) till it intersects the x and y axes and let A (x.506 FUNCTIONS OF MANY VARIABLES 23. y > 0. Thus ∂g . Maximize i=1 x2 (≡ x2 × x2 × x2 × · · · × x2 ) subject to the constraint. let x ∈ U and let B (x. · · ·. That this is the case is part of the next lemma. 25. x2 /a2 + y 2 /b2 = 1 which is in the ﬁrst quadrant.16 Taylor’s Formula For Functions Of Many Variables Let U be an open subset of Rp and let f : U → R be a C k function. conclude that if each number xi ≥ 0.

in in {1. TAYLOR’S FORMULA FOR FUNCTIONS OF MANY VARIABLES p i=1 507 vi ∂i ac- Now for v = (v1 . p p 1 F (t) = i=1 vi ∂i f (x + tv) = i=1 vi ∂i f (x + tv) and this proves the lemma if l = 1. Similarly.j. Now suppose this is true for l < k. Lemma 20. p 2 vi ∂i i=1 = i. deﬁne the “diﬀerential operator”.···.j vi vj (∂i (∂j g)) .20.16. p} . · · ·.in vi1 vi2 · · · vin ∂i1 ∂i2 · · · ∂in where the sum is taken over all choices of the indices. · · ·. p 3 vi ∂i i=1 = i. ( vi ∂i ) is deﬁned just as you might expect. Then p l F (l) (t) = i=1 vi ∂i f (x + tv) = i1 . Then F ∈ C k (0.j vi vj ∂i ∂j (g) = i.2 Let f and F be as above. a ﬁxed vector. 2) and for each l ≤ k p l (l) F (t) = i=1 vi ∂i f (x + tv) . i1 .j vi vj ∂i ∂j where i. vp ) . · · ·.16.k vi vj vk ∂i ∂j ∂k and more generally. p n vi ∂i i=1 = i1 . cording to the rule p p vi ∂i i=1 (g) ≡ i=1 p vi (∂i g) vi i=1 = p i=1 2 ∂g .il vi1 vi2 · · · vil ∂i1 ∂i2 · · · ∂il f (x + tv) .···. ∂xi The symbol. Proof: From the chain rule.

Then Ax · y = x·AT y. Therefore. µ such that λ = min {Hx · x : x ∈ S} . yields F (l+1) (t) = i1 . If xλ is a point of S at which Hxλ · xλ = λ.41) and this is Taylor’s formula for a function of many variables.16. . This proves the following theorem.1 Some Linear Algebra Hx · y = x·Hy =Hy · x. there exists s ∈ (0. Similarly. if xµ is a point of S at which Hxµ · xµ = µ.il+1 p vi1 vi2 · · · vil+1 ∂i1 ∂i2 · · · ∂il+1 f (x + tv) l+1 = i=1 vi ∂i f (x + tv) and this proves the lemma. ji yi = x·AT y. if Hx =rx for some real number.16.41) holds whenever |v| is suﬃciently small. Theorem 11.508 FUNCTIONS OF MANY VARIABLES Taking the derivative of this using the chain rule.3 Let U be an open set in Rp and let f : U → R be C k .16. Ax · y = Aij xj yi = xj Aij yi = xj AT This lemma implies (20.42). Theorem 20. F (t) = F (0) + F (0) t + · · · + F (k−1) (0) tk−1 F (k) (st ) tk + (k − 1)! k! for some st ∈ (0.5 Let H be a real n × n symmetric matrix and let S denote the vectors in Rn which have unit length.1. (20. 1) such that p f (x + v) = f (x) + i=1 vi ∂i f (x) + · · · p k 1 ···+ (k − 1)! p k−1 vi ∂i i=1 1 f (x) + k! vi ∂i i=1 f (x + sv) (20.···. then λ ≤ r ≤ µ. then Hxλ = λxλ .42) If H is an n × n matrix with H = H T . Then there exists λ. In particular. Lemma 20. it follows from Exercise 8 on Page 420 In case you didn’t work this exercise. Theorem 20. µ = max {Hx · x : x ∈ S} .···. here is a proof of this exercise.4 Let A be an m × n matrix.1 on Page 257. then Hxµ = µxµ .16. In addition to this. By Taylor’s theorem.il p vil+1 ∂il+1 f (x + tv) vi1 vi2 · · · vil ∂i1 ∂i2 · · · ∂il il+1 =1 = i1 . r and x = 0. Proof: Using the repeated index summation convention. this formula holds for t = 1 since 1 < 2. Then there exists s ∈ (0. from the above lemma. 20. t) . 1) such that (20.

43) . If t is negative this yields 2Hxλ · v + tHv · v ≤ 2λxλ · v + tλv · v Now let t → 0 to obtain 2Hxλ · v ≤ 2λxλ · v.16. Consequently.j 1 f (x) + vT H (x + sv) v 2 1 f (x) + H (x + sv) v · v 2 (20.3 the following equation is valid for all v having |v| small enough. The set of such points. Since this holds for any v. This proves the theorem. These points are called critical points. subtract this from both sides and then divide both sides by t. Since Hxλ · xλ = λ. x is a closed and bounded set in Rn and so since x →Hx · x is continuous. First note that if u is any nonzero vector in Rn . TAYLOR’S FORMULA FOR FUNCTIONS OF MANY VARIABLES 509 Proof: A vector in S determines a unique point. µ. (H − λI) xλ = 0. Now let v ∈ Rn be an arbitrary vector. λ and a maximum value. What is the analog of the second derivative test for functions of one variable? Let x be a point of U where Df (x) = 0. x which is at unit distance from 0.2 The Second Derivative Test Let f : U → R where U is an open set in Rn and f is C 2 . The claim about xµ and µ is completely similar. it holds in particular for v = (H − λI) xλ and therefore.16. H and so u u · ≥λ |u| |u| 2 Hu · u ≥ λ |u| = λu · u and so (Hu − λu) ·u ≥ 0 for all u = 0.20. =λ Hxλ · xλ + 2tHxλ · v + t Hv · v ≥ λxλ · xλ + 2tλxλ · v + t2 λv · v. It remains to show µ and λ satisfy the given conditions.42).16. f (x + v) = = = = f (x) + f (x) + 1 2 1 2 p 2 vi ∂i i=1 f (x + sv) vi vj ∂i ∂j f (x + sv) i. it achieves a minimum value. By Theorem 20. Hxλ · v = λxλ · v and so (H − λI) xλ · v = 0. This means H |x| · |x| = r |x| · |x| = r and so λ ≤ r ≤ µ by the way these two numbers are deﬁned. x x x x Now suppose Hx = rx. Then from what was just shown H (xλ + tv) · (xλ + tv) ≥ λ (xλ + tv) · (xλ + tv) and so from properties of dot products and (20. 2 20. Letting t > 0 and doing the same thing leads to 2Hxλ · v ≥ 2λxλ · v. Thus all partial derivatives of f at this point equal zero.

By Theorem 20.By the assumption f is C 2 . The line through x having v as its direction vector is just x+tv and so from (20. ··· ··· . If either λ or µ equals zero. 0) . all the entries of H (y) are continuous functions of y. Also let µ and λ be deﬁned for H = H (x) as in Theorem 20. This matrix is called the Hessian matrix. .44) f (x+tv) = f (x) + t2 t2 H (x) v · v+ (H (x + sv) − H (x)) v · v 2 2 µt2 t2 2 = f (x) + |v| + (H (x + sv) − H (x)) v · v.43). . the test fails. 0) is clearly neither a local minimum nor a local maximum for this function. .5 there exists v such 2 that H (x) v = µv. The case where µ < 0 is completely similar and is left as an exercise.4. Thus λ = min {H (x) u · u : u ∈ S} .46) . y) = x4 − y 4 in which (0. If λ < 0 and µ > 0 there exists a direction in which when f is evaluated on the line through the critical point having this direction. the resulting function of one variable has a local maximum. fxn xn (y) fxn x1 (y) fxn x2 (y) · · · and this is a symmetric matrix due to Theorem 20.510 where H (y) ≡ FUNCTIONS OF MANY VARIABLES fx1 x1 (y) fx2 x1 (y) . Finally consider the case where µ > 0 and λ < 0. f (x + v) ≥ ≥ 1 v v 1 2 f (x) + H (x) · |v| + 2 |v| |v| 2 λ 2 f (x) + |v| 4 λ 2 − |v| 2 λ 2 |v| 2 (20. f (x. 2 2 (20. .44) showing f has a local minimum at x. y) = x4 + y 4 in which f clearly has a local minimum at (0.1 on Page 474. 0) . The following theorem is the second derivative test for a function of many variables. Proof: Suppose ﬁrst λ > 0. From (20. and f (x. If µ < 0 then f has a local maximum at x. This is easily done by looking at f (x. |(H (x + sv) − H (x)) v · v| ≤ Therefore. . Thus H (x) v · v = µ |v| .16. µ = max {H (x) u · u : u ∈ S} . If λ > 0 then f has a local minimum at x. This last case is called a saddle point.45) (20. for all v small enough.5. fx1 x2 (y) fx2 x2 (y) . . Theorem 20. fx1 xn (y) fx2 xn (y) . y) = −x4 − y 4 in which f has a local maximum at (0.16. the resulting function of one variable has a local minimum and there exists a direction in which when f is evaluated on the line through the critical point having this direction.6 Let f : U → R for U an open set in Rn and let f be a C 2 function and suppose that at some x ∈ U. it follows that if |v| is small enough. .16. It remains to verify the test fails in case either µ or λ = 0.. 1 1 f (x + v) = f (x) + H (x) v · v+ (H (x + sv) − H (x)) v · v 2 2 By continuity of the entries of H (y) . ∂i f (x) = 0 for all i.

The solutions to this equation are called eigenvalues of the matrix. Therefore. f has a local minimum at the point. The number λ is the smallest solution to this equation and the number µ is the largest. det (λI − H) = 0. then (λI − H) x = 0 even though x = 0. and (1. there is a picture to help you remember the situation. −1) .5 in which it was shown that Hxλ = λxλ and Hxµ = µxµ . Therefore. −8. Therefore. How can you tell which case occurs? The way to do this is to recall Theorem 20. (0. In general. This is the case λ > 0. y) = 8x3 − 12x2 + 28x + 24yx − 12y − 12 and fy (x. and the thing to determine is the sign of its eigenvalues evaluated at the critical points. This matrix is and its eigenvalues are 16. Find the critical points and determine whether they are local minimums. fx (x. for small enough |v| in a certain direction. therefore. . if Hx = λx and x = 0.16. called the characteristic equation of the matrix. 2 4 4 showing that in that direction. − 4 . − 1 . y) = 12x2 − 12x + 4y + 4. you could multiply both sides of (λI − H) x = 0 by this inverse and conclude x = 0. 16 0 First consider the point 1 . 4 −12 Next consider (0. this point is a saddle point. The minus signs denote all negative eigenvalues. or saddle points. 2 The Hessian matrix is 24x2 + 28 + 24y − 24x 24x − 12 24x − 12 4 . local maximums. It follows the numbers. Note this equation will always be an nth degree polynomial equation. From continuity of the partial derivatives. Example 20.16. −1) at this point the Hessian matrix is and the eigen−12 4 values are 16. f (x+tv) = f (x) + ≥ µt2 t2 2 |v| + (H (x + sv) − H (x)) v · v 2 2 µt2 µ 2 µt2 2 2 f (x) + |v| − |v| = f (x) + |v| . As in the case of a function of one variable. The 1 points at which both fx and fy equal zero are 1 . x. The line is unchanged if v is multiplied by a small scalar. TAYLOR’S FORMULA FOR FUNCTIONS OF MANY VARIABLES 511 Now consider the last term.16. In the picture the plus signs denote all positive eigenvalues. λ and µ will be found among the solutions to the above equation. −1) .20. the matrix (λI − H) must not have an inverse since if it did. This is the case µ < 0.7 Let f (x. The case where λ < 0 is similar. |(H (x + sv) − H (x)) v · v| ≤ µ 2 |v| 4 provided |v| is small enough. y) = 2x4 − 4x3 + 14x2 + 12yx2 − 12yx − 12x + 2y 2 + 4y + 2. 4 2 4 0 4 showing that this is a local minimum.

2 0. −1) .5 –1 –1.5 –2 –1 x0 1 2 –2 –1 0y 1 2 Or course sometimes the second derivative test is inadequate to determine what is going on.512 FUNCTIONS OF MANY VARIABLES Finally consider the point (1. For a function of two variables. 0. it preserves all the information about whether the given function is increasing or decreasing in certain directions. .1 y –1 0x –0. 1. y) = arctan 6xy 2 − 2x3 − 3y 4 .1 –0. −1) .16. Below is a graph of this function which illustrates the behavior near the point (1. In one direction it looks like a local minimum while in another it looks like a local maximum.6 0. If you want to get a better picture. In fact.5 1 0.4 0. At this point the Hessian is 4 12 12 4 and the eigenvalues are 16. Before doing anything it might be interesting to look at the graph of this function of two variables plotted using Maple.2 0 –0. This should be no surprise since this was the case even for a function of one variable.1 –0.2 You see it is a lot like the place where you sit on a saddle. Show (0. Example 20.9 –0. −1) . you could graph instead f (x.8 Suppose f (x. −8 so this point is also a saddle point.2 –1. (0.5 0 –0.4 –1. Since arctan is a strictly increasing function. 0) is a critical point for which the second derivative test gives no information. a nice example is the Monkey saddle. they do look like a saddle. y) = arctan 2x4 − 4x3 + 14x2 + 12yx2 − 12yx − 12x + 2y 2 + 4y + 2 . Here is a picture of the graph of the above function near the saddle point. The geometric signiﬁcance of a saddle point was explained above.8 0.2 –0.

16. t) = arctan(4t3 − t4 ). local maximums or saddle points. This implies fxx . So are (1. 3. fyy are all equal to zero at (0. 0) also. along the line. EXERCISES 513 This picture should indicate why this is called a monkey saddle. y) = 12xy − 12y 3 and clearly (0. It is because the monkey can sit in the saddle and have a place for his tail. f has a local maximum at (0. y) ∂x 1 + g (x.6. Now gxx (0. Therefore. and (1.16.17 Exercises 1. t) = −3t4 . −1) for Example 20. Then p (t) ≡ f (0. Therefore. it suﬃces to verify that for g (x. Verify the claims made about the three examples in Theorem 20. fxy . If H = H T and Hx = λx while Hx = µx for λ = µ. Show the points 1 .5. 2. 1) and (1. the Hessian matrix is the zero matrix and clearly has only the zero eigenvalue. gx (x. Now to see (0.8. gy (x. nor a local min. show x · y = 0. . suppose you took x = t and y = t and evaluated this function on this line. 0) is a critical point. Finish the proof of Theorem 20. (0. 4. (0. 0) = 0 and so does gxy (0. the second derivative test is totally useless at this point. 5. −1) . 0) is a critical point. which is strictly increasing near t = 0. Next let x = 0 and y = t. y) = 6xy 2 − 2x3 − 3y 4 gx (0. −4) are critical points of the following 2 4 function of three variables and classify them as local minimums. and (1. Therefore. 0) and gyy (0. y))) 1 = 2 gx (x. − 21 . 0) = 0. −4) . note that ∂ (arctan (g (x. y) = 6y 2 − 6x2 . y) and that a similar formula holds for the partial derivative with respect to y.16. (Why?) Therefore. 0) .20. t) . Use the second derivative test on the critical points (1. 0) . 0) of f is neither a local max.17. 20. 0) = gy (0. (0. This reduces to h (t) = f (t. This shows the critical point. 1) . However.

At this point the Hessian matrix is −2 −10 −10 −2 and the eigenvalues are 8. − 53 . local maximums or saddle points. The same 3 analysis shows there are saddle points at the other two critical points. Show the points 1 . (0.514 FUNCTIONS OF MANY VARIABLES f (x. −12 so the function has a saddle point. both negative. Answer: The Hessian matrix is −36x2 + 74 + 20y + 36x 20x − 10 20x − 10 −6 . −4) . Finally consider the point (1. and (1. 1 53 2 . the Hessian is −24 0 0 −2 and its eigenvalues are −24. Answer: The Hessian matrix is −12x2 + 78 + 20y + 12x 20x − 10 20x − 10 −2 The eigenvalues must be checked at the critical points. −4) . Therefore. − 12 Check its eigenvalues at the critical points. The Hessian equals −2 10 10 −2 having eigenvalues: 8. y) = 5x4 − 10x3 + 17x2 − 6yx2 + 6yx − 12x + 5y 2 − 20y + 20. At this and its eigenvalues are − 16 . 2) are critical points of the following function 2 20 of three variables and classify them according to whether they are local minimums. − 4 . −6 so there is a local maximum at this point. Next consider (0. 2) . 6. 37 . local maximums or saddle points. First consider the point point the Hessian is − 16 3 0 0 −6 . f (x. −4) are critical points of the following func2 12 tion of three variables and classify them according to whether they are local minimums. f (x. the function has a local maximum at this point. −2. 7. (0. First consider the point 1 21 2 . Answer: . and (1. y) = −x4 + 2x3 + 39x2 + 10yx2 − 10yx − 40x − y 2 − 8y − 16. y) = −3x4 + 6x3 + 37x2 + 10yx2 − 10yx − 40x − 3y 2 − 24y − 48. −12 and so there is a saddle point here. Show the points 1 . At this point. −4) .

local maximums or saddle points. EXERCISES 515 The Hessian matrix is 60x2 + 34 − 12y − 60x −12x + 6 −12x + 6 10 . −5) . z) = − 3 x2 + 2 x − 3 2 3 + 8 yx + 2 y + 3 3 14 3 zx − 28 3 z − 5 y2 + 3 14 3 zy − 8 z2. 20 Check its eigenvalues at the critical points. Show the points 1 . y) = 4x4 − 8x3 − 4yx2 + 4yx + 8x − 4x2 + 4y 2 + 16y + 16. Otherwise the test fails. 0) . 4. Find the critical points of the following function of three variables and classify them according to whether they are local minimums. 1. 2) . then local max. 5 Next consider (0. First consider the point point. Find the critical points of the following function of three variables and classify them according to whether they are local minimums. At this and its eigenvalues are − 16 . The eigenvalues of the Hessian matrix at this point are −6. 5 f (x. . and (1. 8. −10. − 8 . −2. f (x. Therefore. 8. z) = 3 x2 + 32 3 x + 4 3 − 16 3 yx − 58 3 y − 4 zx − 3 46 3 z 4 + 1 y 2 − 3 zy − 5 z 2 . This matrix is and its eigenvalues are −3. −2) at this point the Hessian matrix is and the eigenvalues are 12. 3 Answer: The eigenvalues are 4. 3 3 Answer: The critical point is at (−2. local maximums or saddle points. y. If the eigenvalues are both negative. eigenvalues: 12. −2) . − 17 . −2) are critical points of the following func2 8 tion of three variables and classify them according to whether they are local minimums. then local min. Therefore. Next consider (0. Check its eigenvalues at −8x + 4 8 17 1 the critical points. 1 37 2 . y. (1. 3. (0. 1 f (x. 10. there is a saddle point.20. Answer: The Hessian matrix is −3 0 0 8 8 4 8 −4 4 8 −4 8 48x2 − 8 − 8y − 48x −8x + 4 . 9. There is also a local minimum at the critical point. 4. Finally consider the point (1. 4. local maximums or saddle points. the Hessian matrix is − 16 5 0 0 10 . and −6 and the only critical point is (1. If both positive. and 6. 2) at this point the Hessian matrix is 10 6 6 10 and the eigenvalues are 16. . First consider the point 2 . −2) . there is a local minimum at this point. 10.17.

Find the critical points of the following function of three variables and classify them as local minimums. z) = t. z) = − 3 x2 + 28 3 x + 37 3 + 14 3 yx + 10 3 y − 4 zx − 3 26 3 z 4 − 2 y 2 − 3 zy + 7 z 2 . 2 f (x. z) = − 11 x2 + 3 40 3 x − 56 3 + 8 yx + 3 10 3 y − 4 zx + 3 22 3 z − 11 2 3 y 4 − 3 zy − 5 z 2 . (0. and (3. 0) are critical points of the following function 2 20 of three variables and classify them as local minimums. y. Show the points 3 . −5) . − 9 . local maximums or saddle points. 15. y. 0) . f (x. z) = 10 2 3 x − 44 3 x + 64 3 − 10 3 yx + 16 3 y + 2 zx − 3 20 3 z + 10 2 3 y + 2 zy + 4 z 2 . y. local maximums or saddle points. 16. 7 f (x. z) = 4x2 − 30x + 510 − 2yx + 60y − 2zx − 70z + 4y 2 − 2zy + 4z 2 . Show by giving examples that the second derivative tests fails. y) = 5x4 − 30x3 + 45x2 + 6yx2 − 18yx + 5y 2 . the resulting function of one variable has a local minimum. local maximums or saddle points. 3 3 13. local maximums or saddle points. State and prove a similar result in the case where some eigenvalue of the Hessian matrix is negative. y) = 2x4 − 4x3 + 42x2 + 8yx2 − 8yx − 40x + 2y 2 + 20y + 50. local maximums or saddle points. y. 17. Show that if f has a critical point and some eigenvalue of the Hessian matrix is positive. 3 12. 2t2 − 10t. Show the critical points of the following function are points of the form. and (1. Find the critical points of the following function of three variables and classify them according to whether they are local minimums. −5) are critical points of the following 2 function of three variables and classify them as local minimums. Suppose µ = 0 but there are negative eigenvalues of the Hessian at a critical point. z) = 3 x2 + 4x + 75 − 14 3 yx 8 − 38y − 8 zx − 2z + 2 y 2 − 3 zy − 1 z 2 . then there exists a direction in which when f is evaluated on the line through the critical point having this direction. 18. 3 3 3 21. Find the critical points of the following function of three variables and classify them as local minimums. (x. f (x. f (x. − 11 . 2 f (x. and (2. f (x. (0. y. Find the critical points of the following function of three variables and classify them as local minimums. local maximums or saddle points. 14.516 FUNCTIONS OF MANY VARIABLES 11. 3 20. 27 . −5) are critical points of the following func2 2 tion of three variables and classify them as local minimums. . Find the critical points of the following function of three variables and classify them according to whether they are local minimums. −5) . z) = − 3 x2 − 146 3 x + 83 3 + 16 3 yx + 4y − 3 14 3 zx + 94 3 z − 7 y2 − 3 14 3 zy + 8 z2. Show the points 1 . local maximums or saddle points. y) = 4x4 − 16x3 − 4x2 − 4yx2 + 8yx + 40x + 4y 2 + 40y + 100. y. f (x. 3 3 19. local maximums or saddle points. y. Find the critical points of the following function of three variables and classify them as local minimums. local maximums or saddle points. Show the points 1. f (x. (0. local maximums or saddle points. −t2 + 5t for t ∈ R and classify them as local minimums. 22.

−4 16 3 0 −4 0 0 0 . 6 The veriﬁcation that the critical points are of the indicated form is left for you. (2. −3. 1 2 √ 97. The Hessian is −2x2 + 10x − 25 + 3 20 50 3 x− 3 38 95 3 x− 3 at a critical point it is − 4 t2 + 20 t − 3 3 20 50 3 (t) − 3 38 95 3 (t) − 3 The eigenvalues are 2 10 2 0. y. y. 0 2 −4 0 0 −3 1 2 Next consider the critical point. 2 The Hessian is −12 + 36x + 2z − 18x2 0 −2 + 2x 0 −4 0 −2 + 2x 0 −3 Now consider the critical point. 2 10 2 − t2 + t − 6 − (t4 − 10t3 + 493t2 − 2340t + 2916). (2. − 15 − 2 √ 97. Show the critical points of the following function are (0.20. − 1 . local maximums or saddle points. At this point the Hessian is . 1. and 1.17. z) = − 6 x4 + 5 x3 − 3 − 50 19 2 3 yx + 3 zx − 95 5 2 3 zx − 3 y − 10 3 zy − 1 z2. − 3 and classify them as local minimums. 3 f (x. −3. −3. 0) . −3. EXERCISES 25 2 6 x 517 + 10 2 3 yx 1 f (x. (0. −3 and so this point is a saddle point. −3. 0) . z) = − 2 x4 + 6x3 − 6x2 + zx2 − 2zx − 2y 2 − 12y − 18 − 3 z 2 . −3. 3 3 3 If you graph these functions of t you ﬁnd the second is always positive and the third is always negative. 1 23. Therefore. 25 3 20 3 20 3 y + 38 3 z 20 3 x − − 10 3 − 10 3 50 3 38 3 x − − 10 3 −1 3 95 3 (t) − − 10 3 − 10 3 50 3 38 3 (t) − − 10 3 −1 3 95 3 . −3. 0) . − 15 + 2 is a local max. 0) . − t2 + t − 6 + 3 3 3 and (t4 − 10t3 + 493t2 − 2340t + 2916). At this point the Hessian matrix is −12 0 2 The eigenvalues are −4. At this point the Hessian matrix equals 3 0 0 The eigenvalues are 16 3 . all these critical points are saddle points. all negative so at this point there Finally consider the critical point.

0) . 1 f (x. and (2. 2t2 + 6t. y. −t2 − 3t for t ∈ R and classify them as local minimums. (x. all negative. z) = 2 x4 − 4x3 + 8x2 − 3zx2 + 12zx + 2y 2 + 4y + 2 + 1 z 2 . y. local maximums or saddle points. (4. Therefore. −1. −1. 24. Show the critical points of the following function are (0. 25. 2 . z) = −2yx2 − 6yx − 4zx2 − 12zx + y 2 + 2yz. Show the critical points of the following function are points of the form. f (x. 0) . y. local maximums or saddle points. there is a local maximum at this point. z) = t.518 FUNCTIONS OF MANY VARIABLES −12 0 −2 0 −2 −4 0 0 −3 and the eigenvalues are the same as the above. −12) and classify them as local minimums. −1.

y) Q In this picture. z = f (x. 519 . y) outside of some bounded set. (21. y) = 0 for all (x. y) ≥ 0 and f (x.1 Methods For Double Integrals This chapter is on the Riemann integral for a function of n variables. consider the following picture. To solve this problem.1) and v (Q) is deﬁned as the area of Q. It begins by introducing the basic concepts and applications of the integral. y) where f (x. Now consider the following picture. the volume of the little prism which lies above the rectangle Q and the graph of the function would lie between MQ (f ) v (Q) and mQ (f ) v (Q) where MQ (f ) ≡ sup {f (x) : x ∈ Q} . To begin with consider the problem of ﬁnding the volume under a surface of the form z = f (x. The proofs of the theorems involved are diﬃcult and are left till the end. mQ (f ) ≡ inf {f (x) : x ∈ Q} .The Riemann Integral On Rn 21.

means for each Q. Then each of those little squares are the base of a prism of the sort in the previous picture and the sum of the volumes of those prisms should be the volume under the surface. None of this depends in any way on the function being nonnegative. and then add all these numbers together. MQ (f ) v (Q) and Q Q mQ (f ) v (Q) where the notation. Q MQ (f ) v (Q) . d] is a rectangle in R2 . Q MQ (f ) v (Q) is called an upper sum. b] and also y ∈ [c. means x ∈ [a. the desired volume must lie between the two numbers. and the desired volume is caught between these upper and lower sums. d [a. namely those which intersect the circle. v (Q) . Thus in Q MQ (f ) v (Q) . Therefore. The situation is illustrated in the following picture. b] × [c. y) .520 THE RIEMANN INTEGRAL ON RN In this picture. Q mQ (f ) v (Q) is a lower sum. it is assumed f equals zero outside the circle and f is a bounded nonnegative function. It also does not depend in any essential way on the function being deﬁned on R2 . y) ∈ [a. Note this is a ﬁnite sum because by assumption. First you must understand that something like [a. multiply it by the area of Q. f = 0 except for ﬁnitely many Q. z = f (x. adds numbers which are at least as large as what is desired while in Q mQ (f ) v (Q) numbers are added which are at least as small as what is desired. although it is impossible to draw meaningful pictures in higher dimensional cases. d] and the points which do this . To deﬁne the Riemann integral. The sum. d] c a b (x. b] × [c. it is necessary to ﬁrst give a description of something called a grid. b] × [c. take MQ (f ) . having sides parallel to the axes. d] .

F is a reﬁnement of G if every box of G is the union of boxes of F.3) Q = α1 . αi < αi . Then f dV is deﬁned to be the unique number which lies between all upper sums and all lower sums.1. If G is a grid. this means to take a rectangle from G multiply MQ (f ) deﬁned in (21. d] .1. y) outside some bounded set. Deﬁnition 21. another grid. let αi k k→∞ ∞ k=−∞ 521 be points on R which satisfy (21. This deﬁnition begs a diﬃcult question. αl+1 . Deﬁnition 21. For which functions does there exist a unique number between all the upper and lower sums? This interesting question is discussed in the section on the theory of the Riemann integral. For G a grid. MQ (f ) v (Q) Q∈G is called the upper sum associated with the grid. Again.1 For i = 1. METHODS FOR DOUBLE INTEGRALS comprise the rectangle just as shown in the picture. First consider the question: How can the Riemann integral be computed? Consider the following picture in which f equals zero outside the rectangle [a.2 Let f : R2 → R be a bounded function which equals zero for all (x. mQ (f ) v (Q) . the expression. The symbol. b] × [c.1) by its area. In the case of R2 . Q∈G called a lower sum.1. . it is common to replace the V with A and write this symbol as f dA where A stands for area. With this preparation it is time to give a deﬁnition of the Riemann integral of a function of two variables. k k k+1 For such sequences. α1 k k+1 × αl . G as described above in the discussion of the volume under a surface. k k→−∞ lim αi = −∞.21. v (Q) and sum all these products for every Q ∈ G. is deﬁned similarly. 2.2) lim αi = ∞. deﬁne a grid on R2 denoted by G or F as the collection of rectangles of the form 2 2 (21.

Therefore.4) and (21. Therefore. y) on [a. obtaining a function of x. it is expected that this is a way to evaluate f dV. y) dy dx (21. It is the same here. A hard problem. Thus b a c d b d f (x. In practice. say F (y) and then do d c a b d f (x. Then b V (y + h) − V (y) ≈ h a f (x. but there is no diﬀerence in the general case where f is not necessarily nonnegative. y) ≥ 0 and the integral represents a volume. Note what has been gained here. you could add these up and get the volume of the solid. you could ﬁnd the volume of the whole loaf by adding the volumes of individual slices.5) These expressions in (21. I have presented this for the case where f (x. The same thing should be obtained from the integral. having thickness equal to h. V (c) = 0. ﬁnding f dV. y) dy dx and it is understood that you are to do the inside integral ﬁrst and then when you have done it. Of course there is nothing special about ﬁxing y ﬁrst. dividing by h and taking a limit. b a c d f (x. First do b f (x. y) dy dx = a c f (x. y) dx where the integral is obtained by ﬁxing y and integrating with respect to x.5) are called iterated integrals.522 THE RIEMANN INTEGRAL ON RN It depicts a slice taken from the solid deﬁned by {(x. y) dx dy (21. y) dx dy = c F (y) dy. If you wanted to ﬁnd the volume of the loaf of bread. and you knew the volume of each slice of bread. b] × [c. the parenthesis is usually omitted in these expressions. y] by V (y) . You see these when you look at a loaf of bread. y)} . .4) but this was also the result of f dV. If you could ﬁnd the volume of the slice represented in this picture. The slice in the picture corresponds to constant y and is assumed to be very thin. y) dx. They are tools for evaluating f dV which would be hard to ﬁnd otherwise. Thus. y) dx a getting a function of y. it is expected that b V (y) = a f (x. It is hoped that the approximation would be increasingly good as h gets smaller. y) : 0 ≤ y ≤ f (x. you integrate this function of x. is reduced to a sequence of easier problems. Denote the volume of the solid under the graph of z = f (x. the volume of the solid under the graph of z = f (x. y) is given by d c a b f (x.

It is only intended to make things plausible. Find This is done using iterated integrals like those deﬁned above. 3 x2 y + yx dx = Now 2 0 0 1 x2 y + yx dx dy = 0 2 dy = If a diﬀerent answer had been obtained it would have been a sign that a mistake had been made. For example. Thus 1 2 0 R f dV = R 0 x2 y + yx dy dx. METHODS FOR DOUBLE INTEGRALS 523 Throughout. y) = x2 y + yx for (x. 2 0 1 0 0 1 x2 y + yx dx dy 5 y 6 5 y 6 5 . S ⊆ R2 . I have been assuming the notion of volume has some sort of independent meaning. This assumption is nonsense and is one of many reasons the above explanation does not rise to the level of a proof.3 Let f : S → R where S is a subset of R2 . y) = x2 y + yx for (x.21. y) ∈ R where R is the triangular region deﬁned to be in the ﬁrst quadrant. below the line y = x and to the left of the line x = 4. y) ∈ S . Find R f dV. Example 21. . The inside integral yields 2 0 x2 y + yx dy = 2x2 + 2x 1 0 and now the process is completed by doing 1 0 0 2 to what was just obtained. f1 (x.4 Let f (x. Another aspect of this is the notion of integrating a function which is deﬁned on some set. A careful presentation will be given later.1. not on all R2 . Example 21. Thus 1 0 x2 y + yx dy dx = 2x2 + 2x dx = 5 .1.1. 2] ≡ R.5 Let f (x. 1] × [0.1. 3 If the integration is done in the opposite order. What is meant by f dV ? S Deﬁnition 21. f dV. y) ≡ 0 if (x. y) if (x. Then denote by f1 the function deﬁned by f (x. y) ∈ [0. the same answer should be obtained. y) ∈ S / Then f dV ≡ S f1 dV. suppose f is deﬁned on the set.

1. 4 x 0 f dV = R 0 x2 y + yx dy dx The reason for this is that x goes from 0 to 4 and for each ﬁxed x between 0 and 4. 4 y x2 y + yx dx = 4 1 1 88 y − y4 − y3 3 3 2 672 . y) ∈ R where R is the triangular region deﬁned to be in the ﬁrst quadrant. Find f dV. Thus y goes from 0 to x. the variable x. below the line y = 2x and to the left of the line x = 4. goes from y to 4. y goes fr