
AMA286 Supplementary Notes

September 3, 2008


Chapter 1 Taylor's Theorem for One Variable


Theorem 1.0.1 (Taylor's Theorem) Suppose F is a real-valued function on [a, b], F⁽ⁿ⁾ is continuous on [a, b], and F⁽ⁿ⁺¹⁾(t) exists for t ∈ (a, b). Let α, β be distinct points in [a, b] and define

$$P_n(t) = F(\alpha) + F'(\alpha)(t - \alpha) + \frac{F''(\alpha)}{2!}(t - \alpha)^2 + \cdots + \frac{F^{(n)}(\alpha)}{n!}(t - \alpha)^n. \tag{1.0.1}$$

Then there exists a point ξ between α and β such that

$$R_n(\beta) = F(\beta) - P_n(\beta) = \frac{F^{(n+1)}(\xi)}{(n+1)!}(\beta - \alpha)^{n+1}.$$

Proof: Let

$$M = \frac{F(\beta) - P_n(\beta)}{(\beta - \alpha)^{n+1}}$$

and put

$$g(t) = F(t) - P_n(t) - M(t - \alpha)^{n+1}.$$

Then g(β) = 0. Furthermore, since P_n^{(k)}(α) = F⁽ᵏ⁾(α) for k = 0, 1, ..., n, we have

$$g(\alpha) = g'(\alpha) = \cdots = g^{(n)}(\alpha) = 0.$$

Hence there is some ξ₁ between α and β such that g′(ξ₁) = 0, by Rolle's theorem (or the Mean Value Theorem). Since g′(α) = 0, we conclude similarly that g″(ξ₂) = 0 for some ξ₂ between α and ξ₁. Continuing the process, we obtain g⁽ⁿ⁺¹⁾(ξ) = 0 for some ξ between α and ξₙ, that is, between α and β. Since Pₙ has degree n, g⁽ⁿ⁺¹⁾(t) = F⁽ⁿ⁺¹⁾(t) − (n+1)! M, so M = F⁽ⁿ⁺¹⁾(ξ)/(n+1)!, which is the stated formula for Rₙ(β).

Remark 1.0.1 In particular, if α = 0 then we obtain the nth Maclaurin polynomial

$$P_n(t) = F(0) + F'(0)\, t + \frac{F''(0)}{2!}\, t^2 + \cdots + \frac{F^{(n)}(0)}{n!}\, t^n \tag{1.0.2}$$

and

$$R_n(t) = \frac{F^{(n+1)}(c)}{(n+1)!}\, t^{n+1},$$

where c is between 0 and t.

Examples

1. The nth Maclaurin polynomial of eᵗ is

$$e^t \approx 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^n}{n!} = \sum_{k=0}^{n} \frac{t^k}{k!},$$

and

$$R_n(t) = \frac{e^c}{(n+1)!}\, t^{n+1}.$$

2. The nth Maclaurin polynomial of sin(t) is

$$\sin(t) \approx P_{2n+1}(t) = P_{2n+2}(t) = \sum_{k=0}^{n} (-1)^k \frac{t^{2k+1}}{(2k+1)!}.$$

[Figure: sin(t) together with its Maclaurin polynomials P₁, P₃, P₅, P₇.]
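These partial sums are easy to check numerically. The following sketch (Python is our choice here and in the later examples; the helper name maclaurin_sin is ours) compares the partial sums with sin(t) and with the remainder bound |t|^(2n+3)/(2n+3)! implied by Taylor's theorem, since every derivative of sin is bounded by 1 in absolute value.

```python
import math

def maclaurin_sin(t, n):
    """Partial sum P_{2n+1}(t) = sum_{k=0}^{n} (-1)^k t^(2k+1)/(2k+1)!."""
    return sum((-1)**k * t**(2*k + 1) / math.factorial(2*k + 1)
               for k in range(n + 1))

t = 1.0
for n in range(4):
    approx = maclaurin_sin(t, n)
    # |R_{2n+2}(t)| <= |t|^(2n+3)/(2n+3)!  since |sin^{(k)}| <= 1
    bound = abs(t)**(2*n + 3) / math.factorial(2*n + 3)
    print(n, approx, abs(math.sin(t) - approx), bound)
```

Each actual error is comfortably below the corresponding bound, as the theorem guarantees.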

Chapter 2 Mixed Partial Derivatives


Example 2.0.1

1. Let

$$g(x, y) = \begin{cases} \dfrac{x^2 y^2}{x^4 + y^4}, & (x, y) \neq (0, 0); \\[2mm] 0, & (x, y) = (0, 0). \end{cases}$$

[Figure: surface plot of g(x, y) near the origin.]

(a) Show that lim_{(x,y)→(0,0)} g(x, y) does not exist.

As (x, y) tends to (0, 0) along the x-axis, g(x, y) = g(x, 0) = 0 tends to 0. On the other hand, as (x, y) tends to (0, 0) along the line y = x, g(x, y) = g(x, x) = 1/2 tends to 1/2. Thus the function is not continuous at (0, 0).

(b) Show that g_x(0, 0) and g_y(0, 0) both exist and find their values.

$$g_x(0, 0) = \lim_{\Delta x \to 0} \frac{g(0 + \Delta x, 0) - g(0, 0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{1}{\Delta x} \cdot \frac{(\Delta x)^2 \cdot 0^2}{(\Delta x)^4 + 0^4} = 0.$$

Similarly, g_y(0, 0) = lim_{Δy→0} [g(0, 0 + Δy) − g(0, 0)]/Δy = 0.

2. Let

$$f(x, y) = \begin{cases} \dfrac{xy(y^2 - x^2)}{x^2 + y^2}, & (x, y) \neq (0, 0); \\[2mm] 0, & (x, y) = (0, 0). \end{cases}$$

[Figure: surface plot of f(x, y) near the origin.]

Show that f_{xy}(0, 0) ≠ f_{yx}(0, 0).

For any y₀, one has

$$f_x(0, y_0) = \lim_{\Delta x \to 0} \frac{f(0 + \Delta x,\, y_0) - f(0, y_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{1}{\Delta x} \cdot \frac{\Delta x\, y_0 \left( y_0^2 - (\Delta x)^2 \right)}{(\Delta x)^2 + y_0^2} = y_0,$$

and for any x₀,

$$f_y(x_0, 0) = \lim_{\Delta y \to 0} \frac{f(x_0,\, 0 + \Delta y) - f(x_0, 0)}{\Delta y} = \lim_{\Delta y \to 0} \frac{1}{\Delta y} \cdot \frac{x_0\, \Delta y \left( (\Delta y)^2 - x_0^2 \right)}{x_0^2 + (\Delta y)^2} = -x_0.$$

Thus

$$f_{yx}(0, 0) = \lim_{\Delta x \to 0} \frac{f_y(0 + \Delta x, 0) - f_y(0, 0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{-\Delta x - 0}{\Delta x} = -1,$$

and

$$f_{xy}(0, 0) = \lim_{\Delta y \to 0} \frac{f_x(0, 0 + \Delta y) - f_x(0, 0)}{\Delta y} = \lim_{\Delta y \to 0} \frac{\Delta y - 0}{\Delta y} = 1.$$
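The unequal mixed partials can also be seen numerically. The sketch below (the step sizes h and k are illustrative choices of ours) approximates f_y(x, 0) and f_x(0, y) by forward difference quotients with a much smaller inner step, then differences those again; the two nested quotients approach −1 and +1 respectively.

```python
def f(x, y):
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (y**2 - x**2) / (x**2 + y**2)

h, k = 1e-4, 1e-8            # outer and (much smaller) inner step sizes

def fy(x):                   # forward difference approximating f_y(x, 0)
    return (f(x, k) - f(x, 0.0)) / k

def fx(y):                   # forward difference approximating f_x(0, y)
    return (f(k, y) - f(0.0, y)) / k

fyx = (fy(h) - fy(0.0)) / h  # approximately -1
fxy = (fx(h) - fx(0.0)) / h  # approximately +1
print(fyx, fxy)
```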

Theorem 2.0.2 Suppose f ∈ C²(E), where E is an open set in ℝ². Then, if (a, b) ∈ E,

$$\frac{\partial^2 f}{\partial x \partial y}(a, b) = \frac{\partial^2 f}{\partial y \partial x}(a, b).$$

Proof: Method 1. Let R be a closed rectangle in E with sides parallel to the coordinate axes, having (a, b) and (a + h, b + k) as opposite vertices, where h ≠ 0 and k ≠ 0. Put

$$S_{h,k} = f(a + h, b + k) - f(a + h, b) - f(a, b + k) + f(a, b),$$

and define u(x) = f(x, b + k) − f(x, b). Then, applying the Mean Value Theorem, we obtain

$$S_{h,k} = u(a + h) - u(a) = h\, u'(x^*),$$

where x* is between a and a + h. Notice that

$$u'(x) = \frac{\partial f}{\partial x}(x, b + k) - \frac{\partial f}{\partial x}(x, b).$$

Therefore, repeating the use of the Mean Value Theorem, we obtain

$$S_{h,k} = h\, u'(x^*) = h\left[ \frac{\partial f}{\partial x}(x^*, b + k) - \frac{\partial f}{\partial x}(x^*, b) \right] = hk\, \frac{\partial^2 f}{\partial y \partial x}(x^*, y^*),$$

where y* is between b and b + k. Put

$$A = \frac{\partial^2 f}{\partial y \partial x}(a, b).$$

Choose ε > 0. If h and k are sufficiently small, we have

$$\left| \frac{\partial^2 f}{\partial y \partial x}(x, y) - A \right| < \varepsilon \quad \text{for all } (x, y) \in R,$$

and hence

$$\left| \frac{S_{h,k}}{hk} - A \right| < \varepsilon.$$

Fix h, and let k → 0. Since ∂f/∂y exists in E, the last inequality implies that

$$\left| \frac{\dfrac{\partial f}{\partial y}(a + h, b) - \dfrac{\partial f}{\partial y}(a, b)}{h} - A \right| \le \varepsilon.$$

Since ε is arbitrary, and this inequality holds for all sufficiently small h ≠ 0, it follows that

$$\frac{\partial^2 f}{\partial x \partial y}(a, b) = A = \frac{\partial^2 f}{\partial y \partial x}(a, b).$$

Method 2. Use the rectangle R of the last proof and let P₁ = (a, b), Q₁ = (a + h, b), P₂ = (a + h, b + k) and Q₂ = (a, b + k). Consider the double integral

$$\iint_R \frac{\partial^2 f}{\partial y \partial x}(x, y)\, dy\, dx = \int_a^{a+h} \int_b^{b+k} \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right) dy\, dx = \int_a^{a+h} \left[ \frac{\partial f}{\partial x}(x, b + k) - \frac{\partial f}{\partial x}(x, b) \right] dx$$

$$= \int_a^{a+h} \frac{\partial f}{\partial x}(x, b + k)\, dx - \int_a^{a+h} \frac{\partial f}{\partial x}(x, b)\, dx = \left[ f(a + h, b + k) - f(a, b + k) \right] - \left[ f(a + h, b) - f(a, b) \right] = f(P_2) - f(Q_2) + f(P_1) - f(Q_1).$$

Repeating the argument shows that

$$\iint_R \frac{\partial^2 f}{\partial x \partial y}(x, y)\, dx\, dy = f(P_2) - f(Q_2) + f(P_1) - f(Q_1) = \iint_R \frac{\partial^2 f}{\partial y \partial x}(x, y)\, dy\, dx.$$

Thus

$$\iint_R \left[ \frac{\partial^2 f}{\partial x \partial y}(x, y) - \frac{\partial^2 f}{\partial y \partial x}(x, y) \right] dA = 0$$

for every choice of R, and the (continuous) integrand must be identically 0 in E; that is,

$$\frac{\partial^2 f}{\partial x \partial y}(x, y) = \frac{\partial^2 f}{\partial y \partial x}(x, y).$$

Chapter 3 Second order derivative test


The following theorem is known as the second order derivative test for functions of two variables.

Theorem 3.0.1 Let (x₀, y₀) be a critical point of f(x, y) and suppose that A = f_{xx}(x₀, y₀), B = f_{xy}(x₀, y₀), C = f_{yy}(x₀, y₀) and H = AC − B².

1. If H > 0 and A < 0 (or C < 0), then f(x₀, y₀) is a relative maximum.
2. If H > 0 and A > 0 (or C > 0), then f(x₀, y₀) is a relative minimum.
3. If H < 0, then f(x₀, y₀) is not a relative extremum. [A critical point which is NOT a relative extremum is commonly known as a saddle point of the function.]
4. If H = 0, then the second order derivative test is inconclusive.

[Figure: three typical surfaces. Relative maximum: f_x = 0 and f_y = 0, H = AC − B² > 0, A < 0. Relative minimum: f_x = 0 and f_y = 0, H = AC − B² > 0, A > 0. Saddle point: f_x = 0 and f_y = 0, H = AC − B² < 0.]

The proof of this theorem requires Taylor's theorem. By the second order Taylor expansion,

$$\Delta f = f(x_0 + \Delta x,\, y_0 + \Delta y) - f(x_0, y_0) \approx f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + \frac{1}{2}\left[ f_{xx}(x_0, y_0)(\Delta x)^2 + 2 f_{xy}(x_0, y_0)\,\Delta x\,\Delta y + f_{yy}(x_0, y_0)(\Delta y)^2 \right].$$

At a critical point (x₀, y₀), f_x(x₀, y₀) = f_y(x₀, y₀) = 0, so

$$\Delta f \approx \frac{1}{2}\left[ A(\Delta x)^2 + 2B\,\Delta x\,\Delta y + C(\Delta y)^2 \right] = \frac{A}{2}\left[ (\Delta x)^2 + \frac{2B}{A}\,\Delta x\,\Delta y + \frac{C}{A}(\Delta y)^2 \right] = \frac{A}{2}\left[ \left( \Delta x + \frac{B}{A}\,\Delta y \right)^2 + \frac{AC - B^2}{A^2}(\Delta y)^2 \right],$$

where Δx = (x − x₀) and Δy = (y − y₀).

Case 1: If AC − B² > 0 and A < 0 (or C < 0), then Δf < 0. Thus, (x₀, y₀) is a local maximum.
Case 2: If AC − B² > 0 and A > 0 (or C > 0), then Δf > 0. Thus, (x₀, y₀) is a local minimum.
Case 3: If AC − B² < 0, then Δf takes on different signs. Thus, (x₀, y₀) is a saddle point.
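To illustrate the test on a concrete function, here is a small sketch (assuming the sympy library is available; the sample function w = x³ + y³ − 3xy is our choice, and it reappears in Chapter 8). It finds the critical points and classifies each with H = AC − B².

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 + y**3 - 3*x*y          # sample function (our choice)

fx, fy = sp.diff(f, x), sp.diff(f, y)
A, B, C = sp.diff(f, x, 2), sp.diff(fx, y), sp.diff(f, y, 2)
H = A*C - B**2                   # the discriminant H = AC - B^2

for p in sp.solve([fx, fy], [x, y], dict=True):
    if not (p[x].is_real and p[y].is_real):
        continue                 # skip complex critical points
    a, h = A.subs(p), H.subs(p)
    if h > 0:
        kind = "relative maximum" if a < 0 else "relative minimum"
    elif h < 0:
        kind = "saddle point"
    else:
        kind = "inconclusive"
    print(p, kind)
# {x: 0, y: 0} saddle point;  {x: 1, y: 1} relative minimum
```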

Chapter 4 Computation of Directional Derivative


Theorem 4.0.2 Suppose f is differentiable at a point P₀. Then, given any vector a ≠ 0, the directional derivative

$$D_{\mathbf{a}} f(P_0) = \nabla f(P_0) \cdot \frac{\mathbf{a}}{\|\mathbf{a}\|} = \|\nabla f\| \cos\theta,$$

where θ is the angle between the vectors a and ∇f.

Proof: Recall that the directional derivative with respect to a vector v is defined as

$$D_{\mathbf{v}} f(P_0) = \lim_{t \to 0} \frac{f(P_0 + t\mathbf{v}) - f(P_0)}{t}.$$

Let P = P₀ + tv. Then ΔP = P − P₀ = tv. If we write the Taylor series expansion for f at P₀ in vector form, we have

$$f(P_0 + t\mathbf{v}) = f(P_0) + \nabla f(P_0) \cdot \Delta P + R(\Delta P).$$

Then

$$\frac{f(P_0 + t\mathbf{v}) - f(P_0)}{t} = \frac{\nabla f(P_0) \cdot \Delta P + R(\Delta P)}{t} = \nabla f(P_0) \cdot \mathbf{v} + \frac{R(\Delta P)}{t}.$$

Since R(ΔP)/t → 0 as t → 0,

$$D_{\mathbf{v}} f(P_0) = \lim_{t \to 0} \frac{f(P_0 + t\mathbf{v}) - f(P_0)}{t} = \nabla f(P_0) \cdot \mathbf{v}.$$

In particular, if we choose v as the unit vector a/‖a‖, we have the result. Note that we write D_a f(P₀) instead of D_{a/‖a‖} f(P₀).
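The identity D_a f(P₀) = ∇f(P₀) · a/‖a‖ is easy to verify numerically; in the sketch below the function f, the point P₀, and the direction a are arbitrary choices of ours.

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 * y + np.sin(y)

def grad_f(p):                       # analytic gradient of the sample f
    x, y = p
    return np.array([2*x*y, x**2 + np.cos(y)])

P0 = np.array([1.0, 0.5])
a = np.array([3.0, 4.0])             # any nonzero direction
u = a / np.linalg.norm(a)            # unit vector a/||a||

t = 1e-6                             # difference-quotient approximation
D_numeric = (f(P0 + t*u) - f(P0)) / t
D_formula = grad_f(P0) @ u           # gradient dotted with the unit vector
print(D_numeric, D_formula)          # the two values agree to ~6 digits
```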


Chapter 5 Derivatives in ℝⁿ

Definition 5.0.3 Let U be an open set in ℝⁿ. A map F : U → ℝᵐ is said to be differentiable at a ∈ U if there is a linear function, denoted by DF(a) : ℝⁿ → ℝᵐ and called the derivative of F at a, such that

$$\lim_{x \to a} \frac{\| F(x) - F(a) - DF(a)(x - a) \|}{\| x - a \|} = 0.$$

Remark 5.0.4 Here DF(a)(x − a) denotes the value of the linear map DF(a) applied to the vector x − a, so DF(a)(x − a) ∈ ℝᵐ.

Let F = (f₁, f₂, ..., fₘ) where fᵢ : ℝⁿ → ℝ. As a linear transformation, DF(a) has a matrix representation with respect to the standard bases, which is given by

$$DF(a) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(a) & \frac{\partial f_1}{\partial x_2}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a) \\ \frac{\partial f_2}{\partial x_1}(a) & \frac{\partial f_2}{\partial x_2}(a) & \cdots & \frac{\partial f_2}{\partial x_n}(a) \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1}(a) & \frac{\partial f_m}{\partial x_2}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a) \end{pmatrix}.$$

This matrix is called the Jacobian matrix of F. Let u = x − a; then

$$DF(a)(x - a) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a) \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a) \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}.$$

In particular, when m = 1, DF(a) is called the gradient of F and is denoted by

$$\nabla F(a) = \left( \frac{\partial F}{\partial x_1}(a),\; \frac{\partial F}{\partial x_2}(a),\; \ldots,\; \frac{\partial F}{\partial x_n}(a) \right).$$

Moreover, the derivative applied to a vector u is

$$\nabla F(a)\, u = \left( \frac{\partial F}{\partial x_1}(a) \;\cdots\; \frac{\partial F}{\partial x_n}(a) \right) \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} = \sum_{i=1}^{n} \frac{\partial F}{\partial x_i}(a)\, u_i.$$

Remark 5.0.5 The definition of DF(a) is independent of the basis used. If we change the basis from the standard basis to another one, the matrix elements will also change (see any Linear Algebra book for details).
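Since the Jacobian matrix consists of the partial derivatives ∂fᵢ/∂xⱼ(a), it can be approximated by difference quotients. A minimal sketch (the helper jacobian and the sample map F are our own choices):

```python
import numpy as np

def jacobian(F, a, h=1e-6):
    """Numerical Jacobian of F: R^n -> R^m at a, by central differences."""
    a = np.asarray(a, dtype=float)
    m, n = len(F(a)), len(a)
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (np.asarray(F(a + e)) - np.asarray(F(a - e))) / (2 * h)
    return J

# Sample map F(x, y) = (x^2 y, 5x + sin y)
F = lambda p: np.array([p[0]**2 * p[1], 5*p[0] + np.sin(p[1])])
print(jacobian(F, [1.0, 2.0]))
# Analytic Jacobian at (1, 2): [[2xy, x^2], [5, cos y]] = [[4, 1], [5, -0.4161...]]
```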

Chapter 6 Level curves


Given

$$f(x, y) = \left( \cos^2 x + \sin^2 y \right) e^{-\frac{x^2 + y^2}{8}}.$$

[Figure: surface plot of f(x, y).]

[Figure: some level curves of f(x, y).]

[Figure: side views of f(x, y), obtained by fixing x and by fixing y.]

Theorem 6.0.6 Let S : f(x, y, z) = c be a level surface of a function f(x, y, z). If P₀ = (x₀, y₀, z₀) is a point on S, then ∇f(x₀, y₀, z₀) is a vector normal to S at P₀.

Proof: Suppose that r(t) = x(t)i + y(t)j + z(t)k is any curve C lying on S and passing through P₀ when t = t₀. Then we have f(x(t), y(t), z(t)) = c. Differentiating with respect to t using the chain rule, we have

$$\frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = \left( \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k} \right) \cdot \left( \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k} \right) = \nabla f \cdot \frac{d\mathbf{r}}{dt} = 0.$$

Since dr/dt(t₀) is tangent to C at P₀ for all curves C lying on S, ∇f(x₀, y₀, z₀) must be perpendicular to S.

Similarly, if Γ is a level curve of a function g(x, y) = c and Q = (x₀, y₀) is any point on Γ, then ∇g(x₀, y₀) is a vector normal to Γ at (x₀, y₀).

[Figure: the level curve g(x, y) = g(x₀, y₀) with the normal vector ∇g(x₀, y₀) at the point (x₀, y₀).]
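A small numerical illustration of the theorem (the sphere and the great circle through P₀ are our own choices): the gradient at P₀ is orthogonal to the tangent vector of any curve lying on the level surface.

```python
import numpy as np

# Level surface S: f(x, y, z) = x^2 + y^2 + z^2 = 9 (sphere of radius 3)
grad_f = lambda p: 2 * p                       # gradient of the sample f

P0 = np.array([2.0, 2.0, 1.0])                 # 4 + 4 + 1 = 9, so P0 lies on S
e1 = P0 / np.linalg.norm(P0)
e2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)   # unit vector with e2 . e1 = 0

# A great circle on S through P0: r(t) = 3(cos t * e1 + sin t * e2), r(0) = P0
r = lambda t: 3 * (np.cos(t) * e1 + np.sin(t) * e2)
t = 1e-6
tangent = (r(t) - r(-t)) / (2 * t)             # numerical r'(0)

print(grad_f(P0) @ tangent)                    # ~ 0: the gradient is normal to S
```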

Chapter 7 Lagrange Multiplier Method


Theorem 7.0.7 Suppose that f(x, y, z) has continuous first partial derivatives in a region that contains the differentiable curve C : R(t) = x(t)i + y(t)j + z(t)k. If P₀ is a point on C where f has a local maximum or minimum relative to its values on C, then ∇f is perpendicular to C at P₀.

Proof: We show that ∇f · v = 0 at the points in question, where v = R′(t) is the tangent vector to C. The values of f on C are given by the composite function f(x(t), y(t), z(t)), whose derivative with respect to t is

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = \nabla f \cdot \mathbf{v}.$$

At any point P₀ where f has a local maximum or minimum relative to its values on the curve, df/dt = 0, so that ∇f · v = 0.

This theorem is the key to why the method of Lagrange multipliers works. Suppose that f(x, y, z) and g(x, y, z) have continuous first partial derivatives and that P₀ is a point on the surface g(x, y, z) = 0 where f has a local maximum or minimum value relative to its other values on the surface. Then f takes on a local maximum or minimum at P₀ relative to its values on every differentiable curve through P₀ on the surface g(x, y, z) = 0. Therefore, at P₀, ∇f is perpendicular to the velocity vector of every such differentiable curve through P₀. But so is ∇g (because ∇g is perpendicular to the level surface g = 0). Therefore, at P₀, ∇f is some scalar multiple of ∇g.

The method of Lagrange multipliers: Suppose that f(x, y, z) and g(x, y, z) have continuous partial derivatives. To find the local maximum and minimum values of f subject to the constraint g(x, y, z) = 0, find the values of x, y, z, and λ that satisfy the equations

$$\nabla f = \lambda \nabla g \qquad\text{and}\qquad g(x, y, z) = 0$$

simultaneously.

Remark 7.0.8 Some books describe the method of Lagrange multipliers without vector notation in the following equivalent way: To maximize or minimize a function f(x, y, z) subject to the constraint g(x, y, z) = 0, construct the auxiliary function

$$H(x, y, z, \lambda) = f(x, y, z) - \lambda\, g(x, y, z).$$

Then find the values of x, y, z, and λ for which the partial derivatives of H are all zero:

$$H_x = 0, \qquad H_y = 0, \qquad H_z = 0, \qquad H_\lambda = 0.$$

These requirements are equivalent to the earlier ones, as we can see by calculating

H_x = f_x − λg_x = 0, or f_x = λg_x,
H_y = f_y − λg_y = 0, or f_y = λg_y,
H_z = f_z − λg_z = 0, or f_z = λg_z,
H_λ = −g(x, y, z) = 0, or g(x, y, z) = 0.

The first three equations give ∇f = λ∇g, and the last is g(x, y, z) = 0.
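The remark translates directly into a computation. A minimal sketch with sympy (the objective and constraint below are sample choices of ours, not from the notes): set all partial derivatives of H to zero and solve.

```python
import sympy as sp

x, y, z, lam = sp.symbols('x y z lam', real=True)

f = x**2 + y**2 + z**2      # sample objective (our choice)
g = x + y + z - 3           # sample constraint g = 0 (our choice)

H = f - lam * g             # auxiliary function H = f - lam*g
eqs = [sp.diff(H, v) for v in (x, y, z, lam)]
print(sp.solve(eqs, [x, y, z, lam], dict=True))
# -> [{x: 1, y: 1, z: 1, lam: 2}]: the minimum of f on the plane is at (1, 1, 1)
```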

Chapter 8 Nonlinear optimization


8.1 Method of Steepest Descent

[Figure: the surface z = φ(x, y) and the descent points (x₀, y₀), (x₁, y₁), (x₂, y₂), (x₃, y₃), shown on the surface and on the level curves of φ in the xy-plane.]
To minimize a function φ(x, y), we start with an initial point (x₀, y₀) and search for points (x₁, y₁), (x₂, y₂), (x₃, y₃), ... such that

$$\varphi(x_0, y_0) > \varphi(x_1, y_1) > \varphi(x_2, y_2) > \varphi(x_3, y_3) > \cdots.$$

From the unit on Partial Differentiation, we learned that at each point Pₙ = (xₙ, yₙ), the negative gradient −∇φ(xₙ, yₙ) is the direction in which the function φ(x, y) decreases most rapidly. Hence, the next point (xₙ₊₁, yₙ₊₁) may be chosen by minimizing

$$\varphi\big( P_n + t\,(-\nabla\varphi(x_n, y_n)) \big) \quad \text{for } t > 0.$$

This method of searching for the minimum is known as the method of steepest descent.

Example 8.1.1 Use the method of steepest descent to locate the minimum point of the function (to 2 decimal places only)

$$\varphi(x, y) = x^4 + xy + y^2 + 2y.$$

Use (0, 0) as the initial approximation and perform two iterations.

Solution: With φ(x, y) = x⁴ + xy + y² + 2y, we have

$$\nabla\varphi = \begin{pmatrix} 4x^3 + y \\ x + 2y + 2 \end{pmatrix}.$$

At (x₀, y₀) = (0, 0), the steepest descent direction is given by the vector

$$-\nabla\varphi = -\begin{pmatrix} 4(0)^3 + 0 \\ 0 + 2(0) + 2 \end{pmatrix} = \begin{pmatrix} 0 \\ -2 \end{pmatrix}.$$

We consider the function

$$\varphi\big( P_0 + t\,(-\nabla\varphi(x_0, y_0)) \big) = \varphi\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix} + t \begin{pmatrix} 0 \\ -2 \end{pmatrix} \right) = \varphi(0, -2t) = 4t^2 - 4t.$$

Since d(4t² − 4t)/dt = 8t − 4, the function 4t² − 4t attains its minimum at t = 1/2, and we choose (x₁, y₁) = (0, −1).

At (x₁, y₁) = (0, −1), the direction of steepest descent is given by

$$-\nabla\varphi = -\begin{pmatrix} 4(0)^3 + (-1) \\ 0 + 2(-1) + 2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix},$$

and we have to consider the function

$$\varphi\big( P_1 + t\,(-\nabla\varphi(x_1, y_1)) \big) = \varphi\left( \begin{pmatrix} 0 \\ -1 \end{pmatrix} + t \begin{pmatrix} 1 \\ 0 \end{pmatrix} \right) = \varphi(t, -1) = t^4 - t - 1.$$

Since d(t⁴ − t − 1)/dt = 4t³ − 1, the function t⁴ − t − 1 attains its minimum at t = 1/∛4 ≈ 0.63. Therefore, (x₂, y₂) = (0.63, −1).
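The two iterations above can be reproduced in a few lines. A minimal sketch (assuming scipy is available for the one-dimensional line search; any minimizer of a function of one variable would do in its place):

```python
import numpy as np
from scipy.optimize import minimize_scalar

phi = lambda p: p[0]**4 + p[0]*p[1] + p[1]**2 + 2*p[1]
grad = lambda p: np.array([4*p[0]**3 + p[1], p[0] + 2*p[1] + 2])

p = np.array([0.0, 0.0])                  # initial approximation
for _ in range(2):                        # two iterations, as in the example
    d = -grad(p)                          # steepest-descent direction
    # exact line search: minimize phi(p + t d) over t > 0
    t = minimize_scalar(lambda t: phi(p + t*d),
                        bounds=(0, 10), method='bounded').x
    p = p + t*d
    print(np.round(p, 2))
# prints (0, -1) and then approximately (0.63, -1)
```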

8.2 Newton's Method

[Figure: the tangent line to y = f(x) at x = x₀, meeting the x-axis at x₁.]

The iteration scheme follows immediately from the right triangle shown in the figure above, which has the angle of inclination θ of the tangent line to the curve at x = x₀ as one of its acute angles:

$$\tan\theta = f'(x_0) = \frac{f(x_0)}{x_0 - x_1} \qquad \Longrightarrow \qquad x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.$$

We continue the iteration scheme by computing

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}.$$

In general,

$$x_{n+1} = g(x_n) = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{8.2.1}$$

Formula (8.2.1) is known as Newton's method.

Remark 8.2.1 Assuming that f′(x) ≠ 0 in an interval J, we have

$$x = x - \frac{f(x)}{f'(x)} = g(x) \quad \Longleftrightarrow \quad f(x) = 0.$$

Consequently, α is a root of f(x) = 0 if and only if α = g(α). Since

$$g'(x) = 1 - \frac{f'(x)\, f'(x) - f(x)\, f''(x)}{[f'(x)]^2} = \frac{f(x)\, f''(x)}{[f'(x)]^2},$$

we have g′(α) = 0, and |g′(x)| ≤ k < 1 for some k near α. Hence, the iteration scheme in (8.2.1) converges to α at least quadratically.

Examples

1. Estimate √2 using Newton's method.

Solution: Let f(x) = x² − 2 and consider the equation f(x) = 0. Then f′(x) = 2x. Choosing x₀ = 1, one obtains, for n = 0, 1, 2, 3, ...,

$$x_{n+1} = x_n - \frac{x_n^2 - 2}{2 x_n} = \frac{x_n}{2} + \frac{1}{x_n}.$$

Therefore,

n      x_n            x_{n+1}
0      1.000000000    1.500000000
1      1.500000000    1.416666667
2      1.416666667    1.414215686
3      1.414215686    1.414213562
4      1.414213562    1.414213562

The result is accurate to 10 significant figures after only 4 iterations.

2. Find the real root of the cubic equation, arising from an optimization problem,

$$2x^3 + x - 1 = 0. \tag{8.2.2}$$

Solution: Let f(x) = 2x³ + x − 1. Since f(0) = −1, f(1) = 2, and f′(x) = 6x² + 1 > 0, we conclude that there is one and only one real root of (8.2.2). Choosing x₀ = 1, Newton's method yields

n      x_n        x_{n+1}
0      1.00000    0.71429
1      0.71429    0.60517
2      0.60517    0.59002
3      0.59002    0.58975
4      0.58975    0.58975

We may take x₄ = 0.58975 as our approximation to the root of (8.2.2). Note that f(0.58975) ≈ −0.000013929.
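Both examples follow the same three-line iteration, so a single sketch covers them (the function names are our own):

```python
def newton(f, fprime, x0, n_iter=4):
    """Newton's method: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for n in range(n_iter):
        x_next = x - f(x) / fprime(x)
        print(n, x, x_next)
        x = x_next
    return x

# Example 1: estimate sqrt(2) via f(x) = x^2 - 2
newton(lambda x: x**2 - 2, lambda x: 2*x, 1.0)

# Example 2: the cubic 2x^3 + x - 1 = 0
newton(lambda x: 2*x**3 + x - 1, lambda x: 6*x**2 + 1, 1.0)
```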

8.2.1 Solving Systems of Nonlinear Equations

Let F : ℝⁿ → ℝᵐ. We want to find x ∈ ℝⁿ such that F(x) = 0. Supposing F ∈ C¹, the linear (affine) approximation of F at a ∈ ℝⁿ is

$$M(x) = F(a) + DF(a)(x - a),$$

where DF(a) is the Jacobian matrix of F evaluated at a; that is, if F = (f₁, f₂, ..., fₘ) and x = (x₁, x₂, ..., xₙ), then

$$DF(a) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(a) & \frac{\partial f_1}{\partial x_2}(a) & \cdots & \frac{\partial f_1}{\partial x_n}(a) \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1}(a) & \frac{\partial f_m}{\partial x_2}(a) & \cdots & \frac{\partial f_m}{\partial x_n}(a) \end{pmatrix}.$$

Solving M(x) = 0 is equivalent to

$$-F(a) = DF(a)(x - a).$$

If DF(a) is invertible (or has a left inverse), then the Newton step is given by

$$x_{n+1} = a - [DF(a)]^{-1} F(a).$$

Example 8.2.1

1. In particular, if F : ℝ² → ℝ², that is, F = (f, g) and x = (x, y), then

$$DF(a) = \begin{pmatrix} f_x(a) & f_y(a) \\ g_x(a) & g_y(a) \end{pmatrix} \qquad\text{and}\qquad [DF(a)]^{-1} = \frac{1}{\begin{vmatrix} f_x(a) & f_y(a) \\ g_x(a) & g_y(a) \end{vmatrix}} \begin{pmatrix} g_y(a) & -f_y(a) \\ -g_x(a) & f_x(a) \end{pmatrix}.$$

Therefore,

$$[DF(a)]^{-1} F(a) = \frac{1}{\begin{vmatrix} f_x(a) & f_y(a) \\ g_x(a) & g_y(a) \end{vmatrix}} \begin{pmatrix} g_y(a) f(a) - f_y(a) g(a) \\ -g_x(a) f(a) + f_x(a) g(a) \end{pmatrix}.$$

Finally, if a = (xₙ, yₙ), then

$$x_{n+1} = x_n - \frac{g_y(a) f(a) - f_y(a) g(a)}{\begin{vmatrix} f_x(a) & f_y(a) \\ g_x(a) & g_y(a) \end{vmatrix}} = x_n - \frac{\begin{vmatrix} f(a) & f_y(a) \\ g(a) & g_y(a) \end{vmatrix}}{\begin{vmatrix} f_x(a) & f_y(a) \\ g_x(a) & g_y(a) \end{vmatrix}}$$

and

$$y_{n+1} = y_n - \frac{-g_x(a) f(a) + f_x(a) g(a)}{\begin{vmatrix} f_x(a) & f_y(a) \\ g_x(a) & g_y(a) \end{vmatrix}} = y_n - \frac{\begin{vmatrix} f_x(a) & f(a) \\ g_x(a) & g(a) \end{vmatrix}}{\begin{vmatrix} f_x(a) & f_y(a) \\ g_x(a) & g_y(a) \end{vmatrix}}.$$

Remark 8.2.2 Consider the minimization problem min f(x). Solving f′(x) = 0 by Newton's method, we have

$$x = a - \frac{f'(a)}{f''(a)}.$$

Consider the second degree Taylor polynomial of f(x) at x = a,

$$m(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2.$$

Then m′(x) = f′(a) + f″(a)(x − a), so finding x such that m′(x) = 0 is equivalent to

$$x = a - \frac{f'(a)}{f''(a)}.$$

Hence, solving the minimization problem by Newton's method can be considered as modelling the function f(x) by a quadratic model. In general, if f : ℝⁿ → ℝ, the second degree Taylor polynomial is given by

$$m(x) = f(a) + \nabla f(a)^T (x - a) + \frac{1}{2}(x - a)^T H_a (x - a),$$

where ∇f(a) is the gradient of f and Hₐ is the Hessian of f at a, which is given by

$$H_a = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(a) & \frac{\partial^2 f}{\partial x_1 \partial x_2}(a) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(a) \\ \vdots & \vdots & & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1}(a) & \frac{\partial^2 f}{\partial x_n \partial x_2}(a) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(a) \end{pmatrix}.$$

Applying the modelling concept, the Newton step for minimizing f(x) is given by

$$x_{n+1} = a - [H_a]^{-1} \nabla f(a),$$

provided that Hₐ is invertible.

Example 8.2.2

1. Newton's method can be used to find a critical point by solving the resulting system of nonlinear equations. For instance, if w = x³ + y³ − 3xy, we want

$$\frac{\partial w}{\partial x} = 3x^2 - 3y = 0, \qquad \frac{\partial w}{\partial y} = 3y^2 - 3x = 0.$$

Define F : ℝ² → ℝ² by F = (f₁, f₂) where

$$f_1(x, y) = 3x^2 - 3y, \qquad f_2(x, y) = 3y^2 - 3x.$$

Then

$$DF(x, y) = \begin{pmatrix} 6x & -3 \\ -3 & 6y \end{pmatrix} \qquad\text{and}\qquad [DF(a)]^{-1} = \frac{1}{36xy - 9} \begin{pmatrix} 6y & 3 \\ 3 & 6x \end{pmatrix}.$$

Using x₀ = [1, 2] as initial guess, we obtain

n      x         y
0      1         2
1      1.1429    1.2857
2      1.0275    1.0424
3      1.0010    1.0014
4      1.0000    1.0000

Hence x ≈ 1.0000 and y ≈ 1.0000.


On the other hand, using x₀ = [1/4, 1/4] as initial guess yields x ≈ 0.0000 and y ≈ 0.0000.

Warning: The method may fail with x₀ = [1/2, 1/2] or x₀ = [1, 0]. (At [1/2, 1/2] the Jacobian is singular, since 36xy − 9 = 0 there; starting from [1, 0] the iterates fall into a cycle and never converge.)
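The iteration for this example is a direct transcription of the Newton step. A minimal sketch (solving the linear system DF(p) d = F(p) instead of forming the inverse, which is the standard numerical choice):

```python
import numpy as np

# Critical points of w = x^3 + y^3 - 3xy: solve F(x, y) = 0
F = lambda p: np.array([3*p[0]**2 - 3*p[1], 3*p[1]**2 - 3*p[0]])
DF = lambda p: np.array([[6*p[0], -3.0], [-3.0, 6*p[1]]])

p = np.array([1.0, 2.0])                  # initial guess
for _ in range(5):
    p = p - np.linalg.solve(DF(p), F(p))  # Newton step p - DF(p)^{-1} F(p)
    print(np.round(p, 4))
# converges to (1, 1); starting from (0.25, 0.25) it converges to (0, 0)
```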

2. When applying the Lagrange multiplier method to solve a constrained optimization problem, we obtain a system of nonlinear equations which can be solved numerically by Newton's method. For instance, the standard oil drum problem in the text (page 102) is formulated as a system of nonlinear equations

$$2\pi h + 4\pi r = 2\pi r h \lambda, \qquad 2\pi r = \lambda \pi r^2, \qquad \pi r^2 h - 10 = 0.$$

Define F : ℝ³ → ℝ³ by F = (f₁, f₂, f₃) where

$$f_1(r, h, \lambda) = 2\pi h + 4\pi r - 2\pi r h \lambda, \qquad f_2(r, h, \lambda) = 2\pi r - \lambda \pi r^2, \qquad f_3(r, h, \lambda) = 10 - \pi r^2 h.$$

Then

$$DF = \begin{pmatrix} 4\pi - 2\pi h\lambda & 2\pi - 2\pi r\lambda & -2\pi r h \\ 2\pi - 2\pi r\lambda & 0 & -\pi r^2 \\ -2\pi r h & -\pi r^2 & 0 \end{pmatrix}.$$

Using x₀ = [1, 2, 3] as initial guess, we obtain

n      r         h         λ
0      1         2         3
1      1.1972    2.3944    1.2113
2      1.1683    2.3365    1.6887
3      1.1675    2.3351    1.7130
4      1.1675    2.3351    1.7130

Hence r ≈ 1.1675, h ≈ 2.3351 and λ ≈ 1.7130.
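The same loop handles the three-variable system; a minimal sketch (again solving the linear system rather than inverting DF):

```python
import numpy as np

pi = np.pi
# Oil drum: stationarity of area 2*pi*r^2 + 2*pi*r*h under pi*r^2*h = 10
def F(v):
    r, h, lam = v
    return np.array([2*pi*h + 4*pi*r - 2*pi*r*h*lam,
                     2*pi*r - lam*pi*r**2,
                     10 - pi*r**2*h])

def DF(v):
    r, h, lam = v
    return np.array([[4*pi - 2*pi*h*lam, 2*pi - 2*pi*r*lam, -2*pi*r*h],
                     [2*pi - 2*pi*r*lam, 0.0,               -pi*r**2],
                     [-2*pi*r*h,         -pi*r**2,           0.0]])

v = np.array([1.0, 2.0, 3.0])            # initial guess [r, h, lam]
for _ in range(4):
    v = v - np.linalg.solve(DF(v), F(v))
    print(np.round(v, 4))
# converges to r ~ 1.1675, h ~ 2.3351, lam ~ 1.7130, matching the table
```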

Chapter 9 Curvilinear Coordinates


To have a useful coordinate system in three dimensions, each point in space must be associated with a unique triple of real numbers (the coordinates of the point), and each triple of real numbers must determine a unique point.

9.1 Rectangular Coordinates

Just as points in a plane can be placed in one-to-one correspondence with pairs of real numbers by using two perpendicular coordinate lines, points in three-dimensional space can be placed in one-to-one correspondence with triples of real numbers by using three mutually perpendicular coordinate lines. To obtain this correspondence, we choose the coordinate lines so that they intersect at their origins, and we call these lines the x-axis, the y-axis, and the z-axis. The three coordinate axes form a three-dimensional rectangular or Cartesian coordinate system, and the point of intersection of the coordinate axes is called the origin of the coordinate system. Each pair of coordinate axes determines a plane called a coordinate plane. These are referred to as the xy-plane, the xz-plane, and the yz-plane. To each point P in 3-space we assign a triple of numbers (a, b, c), called the coordinates of P, by passing three planes through P parallel to the coordinate planes and letting a, b, and c be the coordinates of the intersections of these planes with the x-, y-, and z-axes, respectively. The notation P(a, b, c) will sometimes be used to denote a point P with coordinates (a, b, c).

Rectangular coordinate systems in 3-space fall into two different categories: left-handed and right-handed. A right-handed system has the property that when the fingers of the right hand are cupped so that they curve from the positive x-axis toward the positive y-axis, the thumb points in the direction of the positive z-axis. A system that is not right-handed is called left-handed. We shall use only right-handed coordinate systems.

Just as the coordinate axes in a two-dimensional coordinate system divide 2-space into four quadrants, the coordinate planes of a three-dimensional coordinate system divide 3-space into eight parts, called octants.

9.2 Transformation of Coordinates

Let the rectangular coordinates (x, y, z) of any point be expressed as functions of (u, v, w), so that

$$x = x(u, v, w), \qquad y = y(u, v, w), \qquad z = z(u, v, w).$$

Suppose that the above system can be solved for u, v, w in terms of x, y, z, i.e.,

$$u = u(x, y, z), \qquad v = v(x, y, z), \qquad w = w(x, y, z).$$

The functions in these two systems are assumed to be single-valued, so that the correspondence between (x, y, z) and (u, v, w) is unique. Given a point P with rectangular coordinates (x, y, z), we can, from the second system, associate a unique set of coordinates (u, v, w), called the curvilinear coordinates of P. The sets of equations in the first system or the second system define a transformation of coordinates.

9.3 Orthogonal Curvilinear Coordinates

The surfaces u = c₁, v = c₂, w = c₃, where c₁, c₂, c₃ are constants, are called coordinate surfaces, and each pair of these surfaces intersects in curves called coordinate curves or lines. If the coordinate surfaces intersect at right angles, the curvilinear coordinate system is called orthogonal. The u, v, and w coordinate curves of a curvilinear system are analogous to the x-, y-, and z-coordinate axes of a rectangular system.

9.3.1 Cylindrical Coordinates (r, θ, z)

To define cylindrical coordinates, we take an axis (usually called the z-axis) and a perpendicular plane, on which we choose a ray (the initial ray) originating at the intersection of the plane and the axis (the origin). The coordinates of a point P are the polar coordinates (r, θ) of the projection of P on the plane, and the coordinate z of the projection of P on the axis. The coordinate transformations are given by

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z,$$

where r ≥ 0, 0 ≤ θ < 2π, and −∞ < z < ∞; and conversely

$$r = \sqrt{x^2 + y^2}, \qquad \tan\theta = \frac{y}{x}, \qquad z = z.$$

Common uses: The most common use of cylindrical coordinates is to give the equation of a surface of revolution. If the z-axis is taken as the axis of revolution, then the equation will not involve θ at all.

[Figure: the cylindrical coordinates (r, θ, z) of a point (x, y, z).]

9.3.2 Spherical Coordinates (ρ, φ, θ)

To define spherical coordinates, we take an axis (the polar axis) and a perpendicular plane (the equatorial plane), on which we choose a ray (the initial ray) originating at the intersection of the plane and the axis (the origin). The coordinates of a point P are: the distance ρ from P to the origin; the angle φ (zenith) between the line OP and the positive polar axis; and the angle θ (azimuth) between the initial ray and the projection of OP onto the equatorial plane. As in the case of polar and cylindrical coordinates, θ is only defined up to multiples of 2π, and likewise φ. Usually φ is assigned a value between 0 and π.

The coordinate transformations are given by

$$x = \rho\sin\varphi\cos\theta, \qquad y = \rho\sin\varphi\sin\theta, \qquad z = \rho\cos\varphi,$$

where ρ ≥ 0, 0 ≤ θ < 2π, and 0 ≤ φ ≤ π; and conversely

$$\rho = \sqrt{x^2 + y^2 + z^2}, \qquad \tan\theta = \frac{y}{x}, \qquad \cos\varphi = \frac{z}{\sqrt{x^2 + y^2 + z^2}}.$$

Common uses: The most common use for spherical coordinates is in situations where a function has values that are spherically symmetric; that is, where they depend only on the distance from the origin.

[Figure: the spherical coordinates (ρ, φ, θ) of a point (x, y, z).]
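The transformation pair is straightforward to code; a minimal sketch (the function names are ours). Note that the formula tan θ = y/x does not determine the quadrant of θ, so the sketch uses the two-argument arctangent instead.

```python
import numpy as np

def spherical_to_rect(rho, phi, theta):
    """x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)."""
    return (rho * np.sin(phi) * np.cos(theta),
            rho * np.sin(phi) * np.sin(theta),
            rho * np.cos(phi))

def rect_to_spherical(x, y, z):
    rho = np.sqrt(x**2 + y**2 + z**2)
    phi = np.arccos(z / rho)                # zenith angle, 0 <= phi <= pi
    theta = np.arctan2(y, x) % (2 * np.pi)  # arctan2 resolves the quadrant
    return rho, phi, theta

print(rect_to_spherical(*spherical_to_rect(2.0, np.pi/3, np.pi/4)))
# -> (2.0, 1.0471..., 0.7853...), recovering the original (rho, phi, theta)
```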

9.3.3 Examples

1. Describe the coordinate surfaces and coordinate curves for cylindrical coordinates.

Coordinate surfaces are:
r = c₁: cylinders coaxial with the z-axis.
θ = c₂: planes through the z-axis.
z = c₃: planes perpendicular to the z-axis.

Coordinate curves are:
The intersection of r = c₁ and θ = c₂ (z curve) is a straight line.
The intersection of r = c₁ and z = c₃ (θ curve) is a circle (or a point).
The intersection of θ = c₂ and z = c₃ (r curve) is a straight line.

2. Describe the coordinate surfaces and coordinate curves for spherical coordinates.

Coordinate surfaces are:
ρ = c₁: spheres having centre at the origin (or the origin itself if c₁ = 0).
θ = c₂: planes through the z-axis.
φ = c₃: cones having vertex at the origin (lines if c₃ = 0 or π, and the xy-plane if c₃ = π/2).

Coordinate curves are:
The intersection of ρ = c₁ and φ = c₃ (θ curve) is a circle (or a point).
The intersection of ρ = c₁ and θ = c₂ (φ curve) is a semi-circle (c₁ ≠ 0).
The intersection of θ = c₂ and φ = c₃ (ρ curve) is a line.

3. Prove that a cylindrical coordinate system is orthogonal. (This example uses partial derivatives, which are discussed in a later section.)

The position vector of any point in cylindrical coordinates is

$$\mathbf{P}(r, \theta, z) = (r\cos\theta,\; r\sin\theta,\; z).$$

The tangent vectors to the r, θ, and z curves are given respectively by

$$\frac{\partial \mathbf{P}}{\partial r} = (\cos\theta, \sin\theta, 0), \qquad \frac{\partial \mathbf{P}}{\partial \theta} = (-r\sin\theta, r\cos\theta, 0), \qquad \frac{\partial \mathbf{P}}{\partial z} = (0, 0, 1).$$

Then

$$\frac{\partial \mathbf{P}}{\partial r} \cdot \frac{\partial \mathbf{P}}{\partial \theta} = 0, \qquad \frac{\partial \mathbf{P}}{\partial r} \cdot \frac{\partial \mathbf{P}}{\partial z} = 0, \qquad \frac{\partial \mathbf{P}}{\partial \theta} \cdot \frac{\partial \mathbf{P}}{\partial z} = 0,$$

and so the tangent vectors are mutually perpendicular and the coordinate system is orthogonal.
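The three dot products can also be checked symbolically; a minimal sketch with sympy (assuming the library is available):

```python
import sympy as sp

r, theta, z = sp.symbols('r theta z', positive=True)
P = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta), z])  # position vector

Pr, Pt, Pz = P.diff(r), P.diff(theta), P.diff(z)      # tangent vectors
print(sp.simplify(Pr.dot(Pt)),
      sp.simplify(Pr.dot(Pz)),
      sp.simplify(Pt.dot(Pz)))
# -> 0 0 0: the tangent vectors are mutually perpendicular
```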
