
MATH 100

Spring 2006-07

Introduction to Multivariable Calculus Lecture Notes

Dr. Tony Yee Department of Mathematics The Hong Kong University of Science and Technology

February 28, 2007


Contents

1 Vectors and Geometry of Space
1.1 Three-Dimensional Coordinate Systems
1.2 Vectors
1.3 The Dot Product
1.4 The Cross Product
1.5 Equations of Lines
1.6 Equations of Planes
1.7 Quadric Surfaces

2 Vector-Valued Functions
2.1 Vector Functions
2.2 Calculus with Vector Functions
2.3 Tangent, Normal and Binormal Vectors
2.4 Arc Length in Space

3 Partial Derivatives
3.1 Functions of Several Variables
3.2 Limits and Continuity
3.3 Partial Derivatives
3.4 The Chain Rule
3.5 Directional Derivatives
3.6 Applications of Partial Derivatives


Chapter 3

Partial Derivatives
The notation $y = f(x)$ is used to indicate that the variable $y$ depends on the single independent variable $x$, that is, that $y$ is a function of $x$. In fact, many functions depend on more than one independent variable. For instance, the volume of a circular cone, $V = \frac{1}{3}\pi r^2 h$, depends on its radius $r$ and its height $h$, so it is a function $V(r, h)$ of two variables. In this chapter we extend the basic ideas of single variable calculus to functions of several variables. The calculus of several variables is basically single variable calculus applied to several variables one at a time. When we hold all but one of the independent variables of a function constant and differentiate with respect to that one variable, we get a partial derivative. Section 3.3 will show how partial derivatives are defined and interpreted geometrically, and how to calculate them by applying the rules for differentiating functions of a single variable. Although this chapter is about derivatives, we first develop the fundamentals and introduce the basic concepts of limits and continuity for functions of several variables.

3.1 Functions of Several Variables

In this section we first define functions of more than one independent variable and discuss their geometric representations. Real-valued functions of several independent variables are defined similarly to functions in the single variable case. By analogy with the corresponding definition for functions of a single variable, we define a function of $n$ variables as follows:

3.1.1 Definition Suppose $D$ is a set of $n$-tuples of real numbers $(x_1, x_2, \dots, x_n)$. A real-valued function $f$ on $D$ is a rule that assigns a unique (single) real number $w = f(x_1, x_2, \dots, x_n)$ to each element in $D$. The set $D$ is the function's domain. The set of $w$-values taken on by $f$ is the function's range. The symbol $w$ is the dependent variable of $f$, and $f$ is said to be a function of the $n$ independent variables $x_1$ to $x_n$. We also call the $x_j$'s the function's input variables and call $w$ the function's output variable.

Most of the examples we consider hereafter will be functions of two or three independent variables. When a function $f$ depends on two variables, we will usually call these independent variables $x$ and $y$, and we will use $z$ to denote the dependent variable that represents the value of the function; that is, $z = f(x, y)$. We will normally use $x$, $y$, and $z$ as the independent variables of a function of three variables and $w$ as the value of the function: $w = f(x, y, z)$. Some definitions will be given, and some theorems will be stated, only for the two-variable case, but extensions to three or more variables will usually be natural and obvious.


Natural domain
As stated in the previous definition, the independent variables of a function of two or more variables may be restricted to lie in some set $D$, which we call the domain of $f$. Sometimes the domain is determined by physical restrictions on the variables (for instance, the time $t$ must be non-negative). If the function is defined by a formula and there are no physical or other restrictions stated explicitly, then it is understood that the domain consists of all points for which the formula yields a real value for the dependent variable. We call this the natural domain of the function.

Example 3.1.1 (Natural domain) Sketch the natural domain of the function $f(x, y) = \ln(x^2 - y)$.

Solution The function $\ln(x^2 - y)$ is defined only when $x^2 - y > 0$, that is, when $y < x^2$.

We first sketch the parabola $y = x^2$ as a dashed curve.

[Figure: the parabola $y = x^2$ drawn as a dashed curve in the $xy$-plane.]

The region $y < x^2$ then consists of all points below this curve. Note that the dashed boundary does not belong to the domain. □
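The domain test in Example 3.1.1 can also be checked point by point. The following is a minimal sketch, assuming the function reads $f(x, y) = \ln(x^2 - y)$; the helper names `in_natural_domain` and `f` are illustrative and not part of the notes.

```python
import math


def in_natural_domain(x: float, y: float) -> bool:
    """Return True when ln(x**2 - y) is real, i.e. when y < x**2."""
    return x * x - y > 0


def f(x: float, y: float) -> float:
    """Evaluate ln(x**2 - y); raise ValueError outside the natural domain."""
    if not in_natural_domain(x, y):
        raise ValueError("(x, y) is outside the natural domain y < x**2")
    return math.log(x * x - y)


print(in_natural_domain(2.0, 1.0))  # a point below the parabola
print(in_natural_domain(1.0, 1.0))  # a point on the dashed boundary
print(in_natural_domain(0.0, 0.5))  # a point above the parabola
```

Only the first point lies in the domain; the boundary point is rejected because the inequality is strict.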

Example 3.1.2 (Natural domain) Let $f(x, y, z) = \sqrt{1 - x^2 - y^2 - z^2}$. Find $f(0, \tfrac{1}{2}, \tfrac{1}{2})$ and the natural domain of $f$.

Solution By substitution,
$$f\left(0, \tfrac{1}{2}, \tfrac{1}{2}\right) = \sqrt{1 - (0)^2 - \left(\tfrac{1}{2}\right)^2 - \left(\tfrac{1}{2}\right)^2} = \sqrt{\tfrac{1}{2}}.$$
Because of the square root sign, we must have $1 - x^2 - y^2 - z^2 \ge 0$ in order to have a real value for $f(x, y, z)$. Rewriting this inequality in the form
$$x^2 + y^2 + z^2 \le 1,$$
we see that the natural domain of $f$ consists of all points on or within the sphere $x^2 + y^2 + z^2 = 1$. □


Graphical representations
The graph of a function $f$ of one variable (i.e., the graph of the equation $y = f(x)$) is the set of points in the $xy$-plane having coordinates $(x, f(x))$, where $x$ is in the domain of $f$. Similarly, the graph of a function of two variables (i.e., the graph of the equation $z = f(x, y)$) is the set of points in 3-space having coordinates $(x, y, f(x, y))$, where $(x, y)$ belongs to the domain of $f$. This graph is a surface in $\mathbb{R}^3$ lying above (if $f(x, y) > 0$) or below (if $f(x, y) < 0$) the domain of $f$ in the $xy$-plane. The graph of a function of three variables is a three-dimensional hypersurface in 4-space, $\mathbb{R}^4$. In general, the graph of a function of $n$ variables is an $n$-dimensional surface in $\mathbb{R}^{n+1}$. However, we will not attempt to draw graphs of functions of more than two variables!

Example 3.1.3 (Surface as a plane) Consider the function
$$f(x, y) = 4\left(1 - \frac{x}{2} - \frac{y}{3}\right), \qquad 0 \le x \le 2, \quad 0 \le y \le 3 - \frac{3x}{2}.$$
The graph of $f$ is the plane triangular surface with vertices at $(2, 0, 0)$, $(0, 3, 0)$, and $(0, 0, 4)$. If the domain of $f$ had not been explicitly stated to be a particular set in the $xy$-plane, the graph would have been the whole plane through these three points. □

Example 3.1.4 (Surface as a shell) Consider the function
$$f(x, y) = \sqrt{9 - x^2 - y^2}.$$
The expression inside the square root cannot be negative, so the domain is the disk $x^2 + y^2 \le 9$ in the $xy$-plane. If we square the equation $z = \sqrt{9 - x^2 - y^2}$, we can rewrite the result in the form $x^2 + y^2 + z^2 = 9$. This is a spherical shell of radius 3 centered at the origin. However, the graph of $f$ is only the upper hemisphere, where $z \ge 0$. □

Quite often it is difficult to sketch the surface $z = f(x, y)$ on two-dimensional paper without considerable artistic talent and training. Nevertheless, you should always try to visualize such a graph and sketch it as best you can. Sometimes it is convenient to sketch only part of a graph, for instance, the part lying in the first octant. It is also helpful to determine (and sketch) the intersections of the graph with various planes, especially the coordinate planes and planes parallel to the coordinate planes.

Another way to represent the function $f(x, y)$ graphically is to produce a two-dimensional topographic map of the surface $z = f(x, y)$. In the $xy$-plane we sketch the curves $f(x, y) = C$ for various values of the constant $C$. These curves are called level curves of $f$ because they are the vertical projections onto the $xy$-plane of the curves in which the graph $z = f(x, y)$ intersects the horizontal (level) planes $z = C$. For example, the graph of the function $f(x, y) = x^2 + y^2$ is a circular paraboloid in 3-space; the level curves are circles centered at the origin in the $xy$-plane.

Example 3.1.5 (Level curves) The level curves of the function $f(x, y) = 4\left(1 - \frac{x}{2} - \frac{y}{3}\right)$ of Example 3.1.3 are the segments of the straight lines
$$4\left(1 - \frac{x}{2} - \frac{y}{3}\right) = C \quad \text{or} \quad \frac{x}{2} + \frac{y}{3} = 1 - \frac{C}{4}, \qquad 0 \le C \le 4,$$
which lie in the first quadrant. Drawn for equally spaced values of $C$, these level curves are equally spaced, which indicates the uniform steepness of the graph of $f$. □


Example 3.1.6 (Level curves) The level curves of the function $f(x, y) = \sqrt{9 - x^2 - y^2}$ of Example 3.1.4 are the concentric circles
$$\sqrt{9 - x^2 - y^2} = C \quad \text{or} \quad x^2 + y^2 = 9 - C^2, \qquad 0 \le C \le 3.$$
The level curves (i.e., concentric circles) should be drawn for several equally spaced values of $C$. The circles get closer and closer together as $C$ gets closer to 0. The decreasing spacing indicates the increasing steepness of the hemispherical surface that is the graph of $f$. □
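The spacing claim in Example 3.1.6 can be checked numerically. This is a small sketch, assuming $f(x, y) = \sqrt{9 - x^2 - y^2}$, so that the level curve $f = C$ is a circle of radius $\sqrt{9 - C^2}$; the function name `level_radius` and the particular levels chosen are illustrative only.

```python
import math


def level_radius(c: float) -> float:
    """Radius of the level curve f = c for f(x, y) = sqrt(9 - x**2 - y**2)."""
    if not 0.0 <= c <= 3.0:
        raise ValueError("level curves exist only for 0 <= C <= 3")
    return math.sqrt(9.0 - c * c)


levels = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # equally spaced values of C
radii = [level_radius(c) for c in levels]

# Distance between consecutive circles, ordered from C = 0 upward.
gaps = [radii[i] - radii[i + 1] for i in range(len(radii) - 1)]

for c, r in zip(levels, radii):
    print(f"C = {c:.1f}  radius = {r:.4f}")
print("gaps:", [round(g, 4) for g in gaps])
```

The gaps grow as $C$ increases, so the circles bunch together near $C = 0$, which is where the hemisphere is steepest.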

Let us summarize this terminology for a function of two variables in the following definition.

3.1.2 Definition The set of points in the plane where a function $f(x, y)$ has a constant value $f(x, y) = C$ is called a level curve of $f$. The set of all points $(x, y, f(x, y))$ in space, for $(x, y)$ in the domain of $f$, is called the graph of $f$. The graph of $f$ is also called the surface $z = f(x, y)$.

Exercise 3.1.1

Find the natural domains of the following functions.
1. $f(x, y) = \dfrac{x + y}{x - y}$.
2. $f(x, y) = \sqrt{xy}$.
3. $f(x, y) = \dfrac{x}{x^2 + y^2}$.
4. $f(x, y) = \sqrt{4x^2 + 9y^2 - 36}$.
5. $f(x, y) = \dfrac{1}{\sqrt{x^2 - y^2}}$.
6. $f(x, y) = \ln(1 + xy)$.

Describe the level curves of the functions in words.
7. $f(x, y) = x - y$.
8. $f(x, y) = \dfrac{x^2}{y}$.
9. $f(x, y) = x^2 + 4y^2$.
10. $f(x, y) = xy$.
11. $f(x, y) = \dfrac{x - y}{x + y}$.
12. $f(x, y) = \dfrac{y}{x^2 + y^2}$.
13. $f(x, y) = e^{-(x^2 + y^2)}$.
14. $f(x, y) = \ln(x^2 + y^2)$.

Sketch the level curve $z = C$ for the specified values of $C$.
15. $z = x^2 + y^2$; $C = 0, 1, 2, 3, 4$.
16. $z = y/x$; $C = -2, -1, 0, 1, 2$.
17. $z = x^2 + y$; $C = -2, -1, 0, 1, 2$.
18. $z = x^2 + 9y^2$; $C = 0, 1, 2, 3, 4$.
19. $z = x^2 - y^2$; $C = -2, -1, 0, 1, 2$.
20. $z = y \csc x$; $C = -2, -1, 0, 1, 2$.

21. Let $f(x, y) = y e^x$. Find an equation of the level curve that passes through the point (a) $(\ln 2, 1)$; (b) $(0, 3)$; (c) $(1, -2)$.


3.2 Limits and Continuity

In this section we take a look at limits involving functions of more than one variable. We will concentrate on limits of functions of two variables, but the ideas extend to functions of more than two variables. Before getting into this, let us briefly recall how limits of functions of one variable work. We say that
$$\lim_{x \to a} f(x) = L \quad \text{(exists)}$$
provided that the two one-sided limits exist and have equal value, that is,
$$\lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x) = L.$$
Also recall that $\lim_{x \to a^-} f(x)$ is the left-hand limit and requires us to look only at values of $x$ that are less than $a$. Likewise, $\lim_{x \to a^+} f(x)$ is the right-hand limit and requires us to look only at values of $x$ that are greater than $a$.

[Figure: graph of a function of a single variable $y = f(x)$, showing the point of interest approached from the left and from the right, with limit value $L$.]

Now, notice that in this case there are only two paths that we can take as we move in towards x = a. We can either move in from the left or we can move in from the right. Then in order for the limit of a function of one variable to exist the function must be approaching the same value as we take each of these paths in towards x = a.

Limits of functions of two variables


With functions of two variables we will have to do something similar, except that this time there may be a lot more work involved. Let us first address the notation and get a feeling for what we are asking for in these kinds of limits. We will be asking to take the limit of the function $f(x, y)$ as $x$ approaches $a$ and $y$ approaches $b$. This can be written in several ways. Here are a couple of the more standard notations:
$$\lim_{x \to a,\; y \to b} f(x, y), \qquad \lim_{(x, y) \to (a, b)} f(x, y).$$

We will use the second notation in this course. The second notation is also a little more helpful in illustrating what we are really doing here when we are taking a limit. In taking a limit of a function of two variables we are really asking what the value of f (x, y ) is doing as we move the point (x, y ) in closer and closer to the point (a, b) without actually letting it be (a, b).


Just as with limits of functions of one variable, in order for this limit to exist the function must approach the same value regardless of the path that we take as we move in towards $(a, b)$. The problem that we are immediately faced with is that there are infinitely many paths that we can take as we move towards the point $(a, b)$.

[Figure: left, the surface $z = f(x, y)$ of a function of two variables; right, the domain of $f(x, y)$ in the $xy$-plane with several paths approaching the point $(a, b)$.]

The above figure (right side) gives a few examples of paths that we could take. We put in several straight line paths as well as a couple of stranger paths that are not straight lines. We included only 6 paths here, but simply by varying the slope of the straight line paths we obtain infinitely many of these, and then we would still need to consider paths that are not straight lines. In other words, to show that a limit of a function of two variables exists we would technically need to check infinitely many paths and verify that the function approaches the same value regardless of the path used to approach the point. Of course, in practice this is simply not possible. Fortunately, we can use the main ideas from single variable calculus to help us take limits here. See the following definition of continuity of a function of two variables.

Continuity of functions of two variables


As for functions of one variable, continuity of a function $f$ at a point of its domain is defined directly in terms of the limit.

3.2.1 Definition A function $f$ is continuous at the point $(a, b)$ if
1. $f$ is defined at $(a, b)$,
2. $\lim_{(x, y) \to (a, b)} f(x, y)$ exists,
3. $\lim_{(x, y) \to (a, b)} f(x, y) = f(a, b)$.

A function is continuous if it is continuous at every point of its domain.

From a graphical viewpoint this definition means the same thing as it did when we first learnt single variable calculus. A function is continuous at a point if its graph does not have any holes or breaks at that point.


How can this help us take limits? Just as in single variable calculus, if you know that a function is continuous at $(a, b)$ then you also know that
$$\lim_{(x, y) \to (a, b)} f(x, y) = f(a, b)$$
must be true. So, if we know that a function is continuous at a point, then all we need to do to take the limit of the function at that point is plug the point into the function. All the standard functions that we know to be continuous are still continuous even if we are plugging in more than one variable now. We just need to watch out for division by zero, square roots of negative numbers, logarithms of zero or negative numbers, etc. Note, however, that the idea of paths is not one that we should forget, since it is a nice way to determine that a limit does not exist. If we can find two paths along which the function approaches different values as we get near the point, then we know that the limit does not exist. Let us first take a look at a couple of examples.

Example 3.2.1 (Evaluation of limits) Determine whether the following limits exist. If they do exist, give the value of the limit.
(a) $\lim_{(x, y) \to (2, 3)} \left(2x - y^2\right)$,
(b) $\lim_{(x, y) \to (a, b)} x^2 y$,
(c) $\lim_{(x, y) \to (2, -1)} \left(3x^2 + xy \cos(\pi x)\right)$.

Solution In this example the three functions are continuous at the respective points, so all we need to do is plug in the values and we are done.
(a) $\lim_{(x, y) \to (2, 3)} \left(2x - y^2\right) = 2(2) - (3)^2 = 4 - 9 = -5$,
(b) $\lim_{(x, y) \to (a, b)} x^2 y = a^2 b$,
(c) $\lim_{(x, y) \to (2, -1)} \left(3x^2 + xy \cos(\pi x)\right) = 3(2)^2 + (2)(-1)\cos(2\pi) = 10$.
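As a numerical sanity check on Example 3.2.1, the sketch below evaluates two of the functions at points closing in on the limit point and compares with direct substitution. It assumes part (a) reads $2x - y^2$ at $(2, 3)$ and part (c) reads $3x^2 + xy\cos(\pi x)$ at $(2, -1)$; the helper `approach` and the diagonal direction it uses are arbitrary illustrative choices, and sampling one direction only illustrates the limit, it does not prove it.

```python
import math


def f_a(x, y):
    # Part (a), read here as 2x - y^2.
    return 2 * x - y**2


def f_c(x, y):
    # Part (c), read here as 3x^2 + x*y*cos(pi*x).
    return 3 * x**2 + x * y * math.cos(math.pi * x)


def approach(f, a, b, steps=6):
    """Values of f at (a + t, b + t) for t = 0.1, 0.01, ..., shrinking toward 0."""
    return [f(a + 10.0**-k, b + 10.0**-k) for k in range(1, steps + 1)]


print(approach(f_a, 2.0, 3.0)[-1], f_a(2.0, 3.0))
print(approach(f_c, 2.0, -1.0)[-1], f_c(2.0, -1.0))
```

In both cases the sampled values settle on the value obtained by plugging in the point, as continuity predicts.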

Recall that combinations and compositions of continuous functions (e.g. polynomials, rational functions, sine and cosine functions, exponential functions, etc.) are still continuous. For example, the composite functions
$$\sqrt{x^2 + y^2 + 1}, \qquad \cos\left(\frac{xy}{x^2 + 1}\right), \qquad \ln(1 + x^2 y^2)\, e^{xy}$$
are continuous at every point $(x, y)$. So, basically, we can calculate the limits of continuous functions of this kind by evaluating the function at $(a, b)$. The only caveat is that rational functions must be defined at $(a, b)$. □

Example 3.2.2 (Evaluation of limits) Investigate the limiting behavior of the function $f$ as $(x, y)$ approaches $(5, 1)$:
$$f(x, y) = \frac{xy}{x + y}.$$

Solution In this example the function $f$ is not continuous along the line $y = -x$, since we encounter division by zero there. However, for this problem that is not something we need to worry about, since the point at which we are taking the limit is not on this line. Therefore, all that we need to do is plug in the point as in Example 3.2.1, since the function is continuous at this point:
$$\lim_{(x, y) \to (5, 1)} \frac{xy}{x + y} = \frac{5}{6} \quad \text{(exists)}.$$
□


Example 3.2.3 (Existence of limits) Find the limit if it exists.
$$\lim_{(x, y) \to (0, 0)} \frac{x^2 y^2}{x^4 + 3y^4}.$$

Solution In this example the function is not continuous at the point in question, so we cannot just plug in the point. Since the function is not continuous at the point, there is at least a chance that the limit does not exist. If we can find two different paths approaching the point that give two different values for the limit, then we know that the limit does not exist. Two of the more common paths to check are the $x$-axis and the $y$-axis, so let us try those.

Before doing this we need to address exactly what we mean when we say that we approach a point along a path. When we approach a point along a path we do this either by fixing $x$ or $y$ or by relating $x$ and $y$ through some function. In this way we reduce the limit to a limit involving a single variable, which we know how to handle from elementary calculus.

So, let us see what happens along the $x$-axis. Along the $x$-axis we have $y = 0$, so we plug $y = 0$ into the function and then take the limit as $x$ approaches zero:
$$\lim_{(x, y) \to (0, 0)} \frac{x^2 y^2}{x^4 + 3y^4} = \lim_{(x, 0) \to (0, 0)} \frac{x^2 (0)^2}{x^4 + 3(0)^4} = \lim_{x \to 0} 0 = 0.$$
So, along the $x$-axis the function approaches zero as we move in towards the origin.

Now, let us try the $y$-axis. Along the $y$-axis we have $x = 0$, so the limit becomes
$$\lim_{(x, y) \to (0, 0)} \frac{x^2 y^2}{x^4 + 3y^4} = \lim_{(0, y) \to (0, 0)} \frac{(0)^2 y^2}{(0)^4 + 3y^4} = \lim_{y \to 0} 0 = 0.$$
So, we get the same limit along two paths. Do not misread this. It does NOT say that the limit exists and has the value zero. It only means that the function happens to approach the same value along two special paths.

Let us take a look at the limit of the function along a third (fairly common) path. In this case we move in towards the origin along the path $y = x$. This is what we meant previously by relating $x$ and $y$ through a function. To do this we replace every $y$ with $x$ and then let $x$ approach zero:
$$\lim_{(x, y) \to (0, 0)} \frac{x^2 y^2}{x^4 + 3y^4} = \lim_{(x, x) \to (0, 0)} \frac{x^2 (x)^2}{x^4 + 3(x)^4} = \lim_{x \to 0} \frac{x^4}{4x^4} = \lim_{x \to 0} \frac{1}{4} = \frac{1}{4}.$$
So, we get a different value from the previous two paths, and this means that the limit
$$\lim_{(x, y) \to (0, 0)} \frac{x^2 y^2}{x^4 + 3y^4}$$
does not exist.

Note that we can use this idea of moving in towards the origin along a line with the more general path $y = mx$ if we need to. □
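The remark about the general path $y = mx$ can be made concrete. Substituting $y = mx$ into $x^2 y^2 / (x^4 + 3y^4)$ gives $m^2 x^4 / (x^4 + 3m^4 x^4) = m^2 / (1 + 3m^4)$ for $x \ne 0$, a value that depends on the slope $m$. The sketch below, with illustrative helper names `value_on_line` and `predicted`, compares that formula with direct evaluation at a point close to the origin.

```python
def f(x, y):
    return (x**2 * y**2) / (x**4 + 3 * y**4)


def value_on_line(m, x=1e-3):
    """Value of f at the point (x, m*x) on the line y = m*x near the origin."""
    return f(x, m * x)


def predicted(m):
    """Value of f along y = m*x after cancelling x**4: m^2 / (1 + 3 m^4)."""
    return m**2 / (1 + 3 * m**4)


for m in (0.0, 0.5, 1.0, 2.0):
    print(m, value_on_line(m), predicted(m))
```

Slope $m = 0$ reproduces the value 0 found along the $x$-axis and $m = 1$ reproduces $1/4$; since different slopes give different values, the limit cannot exist.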

Example 3.2.4 (Existence of limits) Find the limit if it exists.
$$\lim_{(x, y) \to (0, 0)} \frac{x^3 y}{x^6 + y^2}.$$


Solution In this final example we again have a continuity problem at the origin. So, let us see if we can find a couple of paths that give different values of the limit. First, we use the path $y = x$. Along this path we have
$$\lim_{(x, y) \to (0, 0)} \frac{x^3 y}{x^6 + y^2} = \lim_{(x, x) \to (0, 0)} \frac{x^3 (x)}{x^6 + (x)^2} = \lim_{x \to 0} \frac{x^4}{x^2 (x^4 + 1)} = \lim_{x \to 0} \frac{x^2}{x^4 + 1} = 0.$$
Now, let us try the path $y = x^3$. Along this path the limit becomes
$$\lim_{(x, y) \to (0, 0)} \frac{x^3 y}{x^6 + y^2} = \lim_{(x, x^3) \to (0, 0)} \frac{x^3 (x^3)}{x^6 + (x^3)^2} = \lim_{x \to 0} \frac{x^6}{2x^6} = \lim_{x \to 0} \frac{1}{2} = \frac{1}{2}.$$
We now have two paths that give different values for the limit, so the limit does not exist. As this limit has shown us, we can, and often need to, use paths other than straight lines. □
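The two paths of Example 3.2.4 can be compared numerically. This is a minimal sketch, assuming the function is $f(x, y) = x^3 y / (x^6 + y^2)$; the sample points $x = 10^{-1}, \dots, 10^{-5}$ are an arbitrary choice.

```python
def f(x, y):
    return (x**3 * y) / (x**6 + y**2)


xs = [10.0**-k for k in range(1, 6)]      # x = 0.1, 0.01, ..., 1e-5

along_line = [f(x, x) for x in xs]        # path y = x
along_cubic = [f(x, x**3) for x in xs]    # path y = x**3

for x, v_line, v_cubic in zip(xs, along_line, along_cubic):
    print(f"x = {x:.0e}  y = x: {v_line:.3e}  y = x^3: {v_cubic:.6f}")
```

Along $y = x$ the values shrink toward 0, while along $y = x^3$ they stay at $1/2$, matching the two limits computed above.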

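Exercise 3.2.1

Find the limits (if they exist).
1. $\lim_{(x, y) \to (0, 0)} \dfrac{3x^2 - y^2 + 5}{x^2 + y^2 + 2}$.
2. $\lim_{(x, y) \to (0, 4)} \dfrac{x}{\sqrt{y}}$.
3. $\lim_{(x, y) \to (3, 4)} \sqrt{x^2 + y^2 - 1}$.
4. $\lim_{(x, y) \to (0, 0)} e^{1/(|x| + |y|)}$.
5. $\lim_{(x, y) \to (0, 0)} \dfrac{e^y \sin x}{x}$.
6. $\lim_{(x, y) \to (0, 0)} \dfrac{xy + 2}{x + y}$.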

At what points $(x, y)$ in the plane are the following functions continuous?
7. $f(x, y) = \sin(x + y)$.
8. $f(x, y) = \dfrac{x + y}{x - y}$.
9. $f(x, y) = \dfrac{x + y}{2 + \cos x}$.
10. $f(x, y) = \ln(x^2 + y^2)$.
11. $f(x, y) = \dfrac{x^2 + y^2}{x^2 - 3x + 2}$.
12. $f(x, y) = \dfrac{1}{x^2 - y}$.

By considering different paths of approach, show that the functions have no limit as $(x, y) \to (0, 0)$.
13. $f(x, y) = \dfrac{x - y}{\sqrt{x^2 + y^2}}$.
14. $f(x, y) = \dfrac{x^4}{x^4 + y^2}$.
15. $f(x, y) = \dfrac{x^4 - y^2}{x^4 + y^2}$.
16. $f(x, y) = \dfrac{xy}{|xy|}$.
17. $f(x, y) = \dfrac{x - y}{x + y}$.
18. $f(x, y) = \dfrac{y}{x^3 + y^3}$.
19. $f(x, y) = \dfrac{x^2 + y^3}{x^2 + y}$.
20. $f(x, y) = \dfrac{x^2}{y}$.

21. Let $f(x, y) = xy \ln(x^2 + y^2)$. Is it possible to redefine $f(0, 0)$ so that $f$ is continuous at $(0, 0)$?


3.3 Partial Derivatives

In this section we begin differentiating functions of more than one variable. Before we start, let us recall an important interpretation of derivatives of functions of one variable. Given a function of one variable, $f(x)$, the derivative $f'(x)$ represents the rate of change of the function as $x$ changes. This is an important interpretation of derivatives and we do not want to lose it with functions of more than one variable. The difficulty with functions of more than one variable is precisely that there is more than one variable. What do we do if we want only one of the variables to change, or if we want more than one of them to change? If we allow more than one of the variables to change, there are infinitely many ways for them to change. For instance, one variable could be changing faster than the other variable(s) in the function. Notice as well that the function may change differently depending on how we allow the variables to change. We will need to develop ways, and notations, for dealing with all of these cases.

In this section we concentrate exclusively on changing one of the variables at a time, while the remaining variable(s) are held fixed. We will deal with allowing multiple variables to change in a later section (Section 3.5). Because we allow only one of the variables to change, taking the derivative becomes a fairly simple process. Let us start off with a fairly simple function. Consider the function $f(x, y) = 2x^2 y^3$. Let us determine the rate at which the function is changing at a point $(a, b)$, first if we hold $y$ fixed and allow $x$ to vary, and then if we hold $x$ fixed and allow $y$ to vary.

We start with the case of holding $y$ fixed and allowing $x$ to vary. Since we are interested in the rate of change of the function at $(a, b)$ and are holding $y$ fixed, we always have $y = b$. Doing this gives a function involving only $x$, and we can define a new function as follows:
$$g(x) = f(x, b) = 2x^2 b^3.$$
This is a function of a single variable, and all that we are asking is to determine the rate of change of $g(x)$ at $x = a$. In other words, we want to compute $g'(a)$, and since $g$ is a function of a single variable we already know how to do that. Here is the rate of change of the function at $(a, b)$ if we hold $y$ fixed and allow $x$ to vary:
$$g'(a) = 4ab^3.$$
We call $g'(a)$ the partial derivative of $f(x, y)$ with respect to $x$ at $(a, b)$ and denote it in the following way:
$$f_x(a, b) = 4ab^3.$$

Now, let us do it the other way. We hold $x$ fixed and allow $y$ to vary. Since we are holding $x$ fixed, it must be fixed at $x = a$, so we can define a new function of $y$ and then differentiate it as we have always done with functions of one variable. Here is the work for this:
$$h(y) = f(a, y) = 2a^2 y^3 \quad \Longrightarrow \quad h'(b) = 6a^2 b^2.$$
In this case we call $h'(b)$ the partial derivative of $f(x, y)$ with respect to $y$ at $(a, b)$ and denote it as follows:
$$f_y(a, b) = 6a^2 b^2.$$
Note that these two partial derivatives are sometimes called the first-order partial derivatives. Just as with functions of one variable, we can have derivatives of all orders. We will look at higher-order (partial) derivatives later in this section.
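The "freeze one variable" construction can be mirrored numerically. The sketch below assumes $f(x, y) = 2x^2 y^3$ and estimates $g'(a)$ and $h'(b)$ with a central difference quotient; the central difference, the step size `h`, and the sample point $(1.5, 2)$ are illustrative choices and not part of the notes.

```python
def f(x, y):
    return 2 * x**2 * y**3


def partial_x(f, a, b, h=1e-6):
    """Estimate f_x(a, b): hold y fixed at b and differentiate g(x) = f(x, b)."""
    g = lambda x: f(x, b)
    return (g(a + h) - g(a - h)) / (2 * h)


def partial_y(f, a, b, h=1e-6):
    """Estimate f_y(a, b): hold x fixed at a and differentiate k(y) = f(a, y)."""
    k = lambda y: f(a, y)
    return (k(b + h) - k(b - h)) / (2 * h)


a, b = 1.5, 2.0
print(partial_x(f, a, b), 4 * a * b**3)      # estimate vs. 4ab^3
print(partial_y(f, a, b), 6 * a**2 * b**2)   # estimate vs. 6a^2 b^2
```

At this sample point the estimates should agree closely with the closed forms $4ab^3 = 48$ and $6a^2 b^2 = 54$.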


Note that the notation for partial derivatives is different from that for derivatives of functions of a single variable. With functions of a single variable we could denote the derivative with a single prime. With partial derivatives, however, we always need to remember the variable with respect to which we are differentiating, so we write that variable as a subscript. We will shortly see some alternative notations for partial derivatives as well. Note also that we usually do not use the $(a, b)$ notation for partial derivatives. The more standard notation is to continue to use $(x, y)$. So, the partial derivatives from above are more commonly written as
$$f_x(x, y) = 4xy^3 \quad \text{and} \quad f_y(x, y) = 6x^2 y^2.$$

As this quick example has shown, taking derivatives of functions of more than one variable is done in much the same manner as taking derivatives of functions of a single variable. To compute $f_x(x, y)$, all we need to do is treat every $y$ as a constant (or number) and then differentiate with respect to $x$ as usual. Likewise, to compute $f_y(x, y)$ we treat every $x$ as a constant and then differentiate with respect to $y$. Before we do a few examples, let us get the formal definition of the partial derivative out of the way, as well as some alternative (but equivalent) notations. Since we can think of the two partial derivatives above as derivatives of single variable functions, it should not be too surprising that the definition of each is very similar to the definition of the derivative for single variable functions. For a function of two variables, we make this precise in the following definition.

3.3.1 Definition The first partial derivatives of the function $f(x, y)$ with respect to the variables $x$ and $y$ are the functions $f_x(x, y)$ and $f_y(x, y)$ given by
$$f_x(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h} \quad \text{and} \quad f_y(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h},$$
provided these limits exist. Each of the two partial derivatives is the limit of a difference quotient in one of the variables. Observe that $f_x(x, y)$ is just the ordinary first derivative of $f(x, y)$ considered as a function of $x$ only, regarding $y$ as a constant parameter. Similarly, $f_y(x, y)$ is the first derivative of $f(x, y)$ considered as a function of $y$ alone, with $x$ held fixed.

Example 3.3.1 (Partial derivatives) If $f(x, y) = x^2 \sin y$, then
$$f_x(x, y) = 2x \sin y \quad \text{and} \quad f_y(x, y) = x^2 \cos y.$$
□
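As a worked instance of the limit definition, here is how the difference quotients for $f(x, y) = x^2 \sin y$ reduce to the two formulas of Example 3.3.1. The only outside fact used is the single-variable limit $\lim_{h \to 0} (\sin(y + h) - \sin y)/h = \cos y$.

```latex
\begin{align*}
f_x(x, y) &= \lim_{h \to 0} \frac{(x + h)^2 \sin y - x^2 \sin y}{h}
           = \lim_{h \to 0} \frac{(2xh + h^2) \sin y}{h}
           = \lim_{h \to 0} (2x + h) \sin y
           = 2x \sin y, \\
f_y(x, y) &= \lim_{h \to 0} \frac{x^2 \sin(y + h) - x^2 \sin y}{h}
           = x^2 \lim_{h \to 0} \frac{\sin(y + h) - \sin y}{h}
           = x^2 \cos y.
\end{align*}
```

In the first line $\sin y$ is carried along as a constant factor, and in the second line $x^2$ is, which is exactly the "treat the other variable as a constant" rule.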

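Various notations can be used freely to denote the partial derivatives of $z = f(x, y)$ considered as functions of $x$ and $y$:
$$f_x(x, y) = f_x = \frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\big(f(x, y)\big) = z_x = \frac{\partial z}{\partial x},$$
$$f_y(x, y) = f_y = \frac{\partial f}{\partial y} = \frac{\partial}{\partial y}\big(f(x, y)\big) = z_y = \frac{\partial z}{\partial y}.$$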


For the fractional notation for the partial derivative, notice the difference between the partial derivative and the ordinary derivative from single variable calculus:
$$f(x) \quad \Longrightarrow \quad f'(x) = \frac{df}{dx},$$
$$f(x, y) \quad \Longrightarrow \quad f_x(x, y) = \frac{\partial f}{\partial x} \quad \text{and} \quad f_y(x, y) = \frac{\partial f}{\partial y}.$$
To distinguish partial derivatives from ordinary derivatives we use the symbol $\partial$ rather than the $d$ used in single variable calculus. The symbol $\partial/\partial x$ should be read as "partial with respect to $x$", so $\partial f/\partial x$ is "partial $f$ with respect to $x$".

Let us work some examples. When working these examples, always keep in mind which variable we are differentiating with respect to. This is important because we treat all other variables as constants and then proceed with the derivative as if the function were a function of a single variable. Also note that the standard differentiation rules for sums, products, reciprocals, and quotients continue to apply to partial derivatives.

Example 3.3.2 (Partial derivatives) Find all of the first-order partial derivatives of the following functions.
(a) $f(x, y) = x^4 + 6\sqrt{y} - 10$.
(b) $w = x^2 y - 10y^2 z^3 + 44x - 7\tan(4y)$.
(c) $h(s, t) = t^7 \ln(s^2) + \dfrac{9}{t^3} - \sqrt[7]{s^4}$.
(d) $f(x, y) = \cos\left(\dfrac{4}{x}\right) e^{x^2 y - 5y^3}$.
(e) $z = \dfrac{9u}{u^2 + 5v}$.
(f) $g(x, y, z) = \dfrac{x \sin y}{z^2}$.
(g) $z = \sqrt{x^2 + \ln(5x - 3y^2)}$.

Solution (a) Let us first take the derivative with respect to $x$, remembering that as we do so every $y$ is treated as a constant. The partial derivative with respect to $x$ is
$$f_x(x, y) = 4x^3.$$
Notice that the second and third terms differentiate to zero in this case. It should be clear why the third term differentiates to zero: it is a constant, and constants always differentiate to zero. This is also the reason that the second term differentiates to zero. Since we are differentiating with respect to $x$, we treat every $y$ as a constant, so terms that involve only $y$ are treated as constants and hence differentiate to zero. Now, let us take the derivative with respect to $y$. In this case we treat every $x$ as a constant, so the first term, which involves only $x$, differentiates to zero, just as the third term does. The partial derivative with respect to $y$ is
$$f_y(x, y) = \frac{3}{\sqrt{y}}.$$

(b) With this function we have three first-order derivatives to compute. Let us do the partial derivative with respect to $x$ first. Since we are differentiating with respect to $x$, we treat every $y$ and every $z$ as a constant. This means that the second and fourth terms differentiate to zero, since they involve only $y$ and $z$. The first term contains both $x$ and $y$, so when we differentiate with respect to $x$ the $y$ is treated as a multiplicative constant, and the first term is differentiated just as the third term is. Here is the partial derivative with respect to $x$:
$$\frac{\partial w}{\partial x} = 2xy + 44.$$
Let us now differentiate with respect to $y$. In this case every $x$ and every $z$ is treated as a constant. This means that the third term differentiates to zero, since it contains only $x$, while the $x^2$ in the first term and the $z^3$ in the second term are treated as multiplicative constants. Here is the partial derivative with respect to $y$:
$$\frac{\partial w}{\partial y} = x^2 - 20yz^3 - 28\sec^2(4y).$$
Finally, let us get the derivative with respect to $z$. Since only one of the terms involves $z$, this is the only non-zero term in the derivative. The $y^2$ in that term is treated as a multiplicative constant. Here is the partial derivative with respect to $z$:
$$\frac{\partial w}{\partial z} = -30y^2 z^2.$$

(c) With this function we will not put in the detail of the first two. Before taking derivatives, let us rewrite the function a little to help with the differentiation process.
$$h(s, t) = 2t^7\ln s + 9t^{-3} - s^{4/7}.$$
The fact that we are using $s$ and $t$ here instead of the standard $x$ and $y$ should not be a problem at all. It works the same way. Here are the two partial derivatives for this function.
$$h_s(s, t) = \frac{\partial h}{\partial s} = 2t^7\left(\frac{1}{s}\right) + 0 - \frac{4}{7}s^{-3/7} = \frac{2t^7}{s} - \frac{4}{7}s^{-3/7},$$
$$h_t(s, t) = \frac{\partial h}{\partial t} = 14t^6\ln s - 27t^{-4}.$$

(d) Now, we cannot forget the product rule with derivatives. The product rule works the same way here as it does with functions of one variable. We just need to be careful to remember which variable we are differentiating with respect to. Let us start by differentiating with respect to $x$. In this case both the cosine and the exponential contain $x$'s, so we have a product of two functions involving $x$'s and we need the product rule. Here is the derivative with respect to $x$.
$$f_x(x, y) = -\sin\left(\frac{4}{x}\right)\left(-\frac{4}{x^2}\right)e^{x^2y - 5y^3} + \cos\left(\frac{4}{x}\right)e^{x^2y - 5y^3}(2xy)$$
$$= \frac{4}{x^2}\sin\left(\frac{4}{x}\right)e^{x^2y - 5y^3} + 2xy\cos\left(\frac{4}{x}\right)e^{x^2y - 5y^3}.$$

Do not forget the chain rule for functions of one variable. We will be looking at the chain rule for more complicated multivariable expressions in a later section (Section 3.4). However, at this point we are treating all the $y$'s as constants, so the chain rule continues to work as it does in single variable calculus. Also, do not forget how to differentiate exponential functions:
$$\frac{d}{dx}\,e^{f(x)} = f'(x)\,e^{f(x)}.$$
Now, let us differentiate with respect to $y$. In this case we do not have a product rule to worry about, since the only place that $y$ shows up is in the exponential. Therefore, since $x$'s are considered to be constants for this derivative, the cosine in front is also treated as a multiplicative constant. Here is the partial derivative with respect to $y$.
$$f_y(x, y) = (x^2 - 15y^2)\cos\left(\frac{4}{x}\right)e^{x^2y - 5y^3}.$$

(e) We also cannot forget about the quotient rule. Since there is not much to this one, we will simply give the derivatives.
$$z_u = \frac{9(u^2 + 5v) - 9u(2u)}{(u^2 + 5v)^2} = \frac{-9u^2 + 45v}{(u^2 + 5v)^2},$$
$$z_v = \frac{(0)(u^2 + 5v) - 9u(5)}{(u^2 + 5v)^2} = \frac{-45u}{(u^2 + 5v)^2}.$$
In the case of the derivative with respect to $v$, recall that $u$'s are constants, so when we differentiate the numerator we get zero.

(f) Now, we do need to be careful not to use the quotient rule when it does not need to be used. In this case we do have a quotient. However, since the $x$'s and $y$'s appear only in the numerator and the $z$'s appear only in the denominator, this really is not a quotient rule problem. Let us compute the derivatives with respect to $x$ and $y$ first. In both cases the $z$'s are constants, so the denominator is a constant and we do not need to worry much about it. Here are the derivatives for these two cases.
$$g_x(x, y, z) = \frac{\sin y}{z^2} \quad\text{and}\quad g_y(x, y, z) = \frac{x\cos y}{z^2}.$$
In the case of differentiation with respect to $z$, we can avoid the quotient rule with a quick rewrite of the function. Here is the rewrite as well as the derivative with respect to $z$.
$$g(x, y, z) = x\sin y\; z^{-2},$$
$$g_z(x, y, z) = -2x\sin y\; z^{-3} = -\frac{2x\sin y}{z^3}.$$

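(g) In the last part we are going to apply the chain rule. If you have a good knowledge of single variable calculus, this should not be all that difficult a problem. Here are the two derivatives.
$$z_x = \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2}\frac{\partial}{\partial x}\left(x^2 + \ln(5x - 3y^2)\right)$$
$$= \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2}\left(2x + \frac{5}{5x - 3y^2}\right)$$
$$= \left(x + \frac{5}{2(5x - 3y^2)}\right)\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2},$$
$$z_y = \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2}\frac{\partial}{\partial y}\left(x^2 + \ln(5x - 3y^2)\right)$$
$$= \frac{1}{2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2}\left(\frac{-6y}{5x - 3y^2}\right)$$
$$= -\frac{3y}{5x - 3y^2}\left(x^2 + \ln(5x - 3y^2)\right)^{-1/2}.$$
□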

So, there are some examples of partial derivatives. Hopefully you will agree that, as long as we remember to treat the other variables as constants, these work in exactly the same manner as derivatives of functions of one variable. So, if you can differentiate functions of a single variable, you should not have too much difficulty with basic partial derivatives.
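A formula such as the one for $g_z$ in part (f) can be sanity-checked numerically. The sketch below compares it with a central finite-difference quotient; the helper name `central_diff`, the step size and the sample point are illustrative choices, not part of the notes.

```python
import math

def central_diff(func, point, index, h=1e-6):
    """Estimate the partial derivative of func in coordinate `index`,
    holding every other coordinate fixed."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def g(x, y, z):
    # the function of part (f): g = x sin(y) / z^2
    return x * math.sin(y) / z**2

def g_z(x, y, z):
    # the formula obtained in part (f)
    return -2 * x * math.sin(y) / z**3

point = (1.5, 0.7, 2.0)          # arbitrary sample point with z != 0
approx = central_diff(g, point, 2)
exact = g_z(*point)
print(approx, exact)
```

The two printed values should agree to several decimal places. Changing `index` to 0 or 1 checks $g_x$ and $g_y$ in the same way.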

Exercise 3.3.1
Find the first-order derivatives $f_x$ and $f_y$ for the following functions.
1. $f(x, y) = (x - 1)(y^2 + 2)$.
2. $f(x, y) = (xy + 2)^2$.
3. $f(x, y) = (2x - 3y)^3$.
4. $f(x, y) = \sqrt{x^2 + y^2 + 1}$.
5. $f(x, y) = (x^3 + y)^{2/3}$.
6. $f(x, y) = 1/(x + y)$.
7. $f(x, y) = x/(x^2 + y^2)$.
8. $f(x, y) = xy$.

Find the first-order derivatives $f_x$, $f_y$ and $f_z$ for the following functions.
9. $f(x, y, z) = \sin(x + y + z)$.
10. $f(x, y, z) = xy + yz + zx$.
11. $f(x, y, z) = 1 + xy^2 - 2z^2$.
12. $f(x, y, z) = (x^2 + y^2 + z^2)^{1/2}$.
13. $f(x, y, z) = yz\ln(xy)$.
14. $f(x, y, z) = e^{xyz}$.

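Evaluate the indicated partial derivatives.
15. $f(x, y) = 9 - x^2 - 7y^3$; $f_x(3, 1)$, $f_y(3, 1)$.
16. $f(x, y) = x^2 y e^{xy}$; $\partial f/\partial x\,(1, 1)$, $\partial f/\partial y\,(1, 1)$.
17. $z = \sqrt{x^2 + 4y^2}$; $\partial z/\partial x\,(1, 2)$, $\partial z/\partial y\,(1, 2)$.
18. $w = x^2\cos xy$; $\partial w/\partial x\,(\tfrac{1}{2}, \pi)$, $\partial w/\partial y\,(\tfrac{1}{2}, \pi)$.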

Z
19. Find fx and fy if f (x, y ) =
y

et dt.

20. Let
$$f(x, y) = \begin{cases} x^3 y, & \text{if } y \ge x, \\ x y^3, & \text{if } y < x. \end{cases}$$
Does $f_y(1, 1)$ exist?

21. The volume $V$ of a right circular cone is given by
$$V = \frac{\pi}{24}\,d^2\sqrt{4s^2 - d^2},$$
where $s$ is the slant length and $d$ is the diameter of the base. Suppose that $s$ has a constant value of 10 cm, but $d$ varies. Find the rate of change of $V$ with respect to $d$ when $d = 16$ cm.

22. The temperature at a point $(x, y)$ on a metal plate in the $xy$-plane is $T(x, y) = x^3 + 2y^2 + x$ degrees Celsius. Assume that distance is measured in centimeters and find the rate at which temperature changes with respect to distance if we start at the point $(1, 2)$ and move (a) to the right and parallel to the $x$-axis; (b) upward and parallel to the $y$-axis.


Implicit differentiation
There is one more important topic that we need to take a quick look at in this section: implicit differentiation. Before getting into implicit differentiation for multivariable functions, let us first recall how implicit differentiation works for functions of one variable.

Example 3.3.3 (Implicit differentiation) Find $\dfrac{dy}{dx}$ for $3y^4 + x^7 = 5x$.

Solution Remember that the key is to always think of $y$ as a function of $x$, that is $y = y(x)$. Whenever we differentiate a term involving $y$'s with respect to $x$ we need to use the chain rule, which means that we attach a factor $\dfrac{dy}{dx}$ to that term. The first step is to differentiate both sides with respect to $x$:
$$12y^3\frac{dy}{dx} + 7x^6 = 5.$$
The second step is to solve for $\dfrac{dy}{dx}$:
$$\frac{dy}{dx} = \frac{5 - 7x^6}{12y^3}.$$
□
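The result of Example 3.3.3 can be checked numerically. In this sketch we solve $3y^4 + x^7 = 5x$ explicitly for the positive branch of $y$ (an assumption made only so that there is something to difference), and compare a finite-difference slope with the implicit formula. The function names and the sample point $x = 1$ are illustrative.

```python
def y_of_x(x):
    # positive branch of 3*y**4 + x**7 = 5*x; needs 5*x - x**7 > 0
    return ((5 * x - x**7) / 3) ** 0.25

def dydx_implicit(x, y):
    # formula obtained by implicit differentiation
    return (5 - 7 * x**6) / (12 * y**3)

x0 = 1.0
h = 1e-6
numeric_slope = (y_of_x(x0 + h) - y_of_x(x0 - h)) / (2 * h)
formula_slope = dydx_implicit(x0, y_of_x(x0))
print(numeric_slope, formula_slope)
```

Both values should agree closely, which is what the implicit formula predicts without ever solving for $y$.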

Implicit differentiation works in exactly the same manner with multivariable functions. If we have an equation in three variables $x$, $y$ and $z$, we will assume that $z$ is in fact a function of $x$ and $y$. In other words, $z = z(x, y)$. Then whenever we differentiate $z$'s with respect to $x$ we use the chain rule and attach a factor $\dfrac{\partial z}{\partial x}$. Likewise, whenever we differentiate $z$'s with respect to $y$ we attach a factor $\dfrac{\partial z}{\partial y}$. Let us take a quick look at a couple of examples of implicit differentiation problems.

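Example 3.3.4 (Implicit differentiation) Find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ for each of the following equations.
(a) $x^3z^2 - 5xy^5z = x^2 + y^3$.
(b) $x^2\sin(2y - 5z) = 1 + y\cos(6xz)$.

Solution (a) Let us start with finding $\dfrac{\partial z}{\partial x}$. We differentiate both sides with respect to $x$ and remember to attach a factor $\dfrac{\partial z}{\partial x}$ whenever we differentiate a $z$.
$$3x^2z^2 + 2x^3z\frac{\partial z}{\partial x} - 5y^5z - 5xy^5\frac{\partial z}{\partial x} = 2x.$$
Remember that since we are assuming $z = z(x, y)$, any product of $x$'s and $z$'s is a product of two functions of $x$, so we need the product rule. Now, solve for $\dfrac{\partial z}{\partial x}$.
$$\left(2x^3z - 5xy^5\right)\frac{\partial z}{\partial x} = 2x - 3x^2z^2 + 5y^5z,$$
$$\frac{\partial z}{\partial x} = \frac{2x - 3x^2z^2 + 5y^5z}{2x^3z - 5xy^5}.$$
Now, we do the same thing for $\dfrac{\partial z}{\partial y}$, except this time we need to remember to attach a factor $\dfrac{\partial z}{\partial y}$ whenever we differentiate a $z$.
$$2x^3z\frac{\partial z}{\partial y} - 25xy^4z - 5xy^5\frac{\partial z}{\partial y} = 3y^2,$$
$$\left(2x^3z - 5xy^5\right)\frac{\partial z}{\partial y} = 3y^2 + 25xy^4z,$$
$$\frac{\partial z}{\partial y} = \frac{3y^2 + 25xy^4z}{2x^3z - 5xy^5}.$$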

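(b) Basically, we do the same thing for this equation as we did in the previous part. Let us find $\dfrac{\partial z}{\partial x}$ first.
$$2x\sin(2y - 5z) + x^2\cos(2y - 5z)\left(-5\frac{\partial z}{\partial x}\right) = -y\sin(6xz)\left(6z + 6x\frac{\partial z}{\partial x}\right).$$
Do not forget to use the chain rule on each of the trigonometric functions, and note that when we differentiate the inside function of the cosine we also need the product rule. Now let us solve for $\dfrac{\partial z}{\partial x}$.
$$2x\sin(2y - 5z) - 5x^2\cos(2y - 5z)\frac{\partial z}{\partial x} = -6yz\sin(6xz) - 6xy\sin(6xz)\frac{\partial z}{\partial x},$$
$$2x\sin(2y - 5z) + 6yz\sin(6xz) = \left(5x^2\cos(2y - 5z) - 6xy\sin(6xz)\right)\frac{\partial z}{\partial x},$$
$$\frac{\partial z}{\partial x} = \frac{2x\sin(2y - 5z) + 6yz\sin(6xz)}{5x^2\cos(2y - 5z) - 6xy\sin(6xz)}.$$
Next, let us find $\dfrac{\partial z}{\partial y}$. This one is slightly easier than the first one.
$$x^2\cos(2y - 5z)\left(2 - 5\frac{\partial z}{\partial y}\right) = \cos(6xz) - y\sin(6xz)\left(6x\frac{\partial z}{\partial y}\right),$$
$$2x^2\cos(2y - 5z) - 5x^2\cos(2y - 5z)\frac{\partial z}{\partial y} = \cos(6xz) - 6xy\sin(6xz)\frac{\partial z}{\partial y},$$
$$\left(6xy\sin(6xz) - 5x^2\cos(2y - 5z)\right)\frac{\partial z}{\partial y} = \cos(6xz) - 2x^2\cos(2y - 5z),$$
$$\frac{\partial z}{\partial y} = \frac{\cos(6xz) - 2x^2\cos(2y - 5z)}{6xy\sin(6xz) - 5x^2\cos(2y - 5z)}.$$
□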

Exercise 3.3.2
Assume that each of the following equations defines $y$ as a differentiable function of $x$. Find the value of $dy/dx$ at the given point.
1. $x^3 - 2y^2 + xy = 0$, $(1, 1)$.
2. $xy + y^2 - 3x - 3 = 0$, $(-1, 1)$.
3. $x^2 + xy + y^2 - 6 = 0$, $(1, 2)$.
4. $xe^y + \sin xy + y - \ln 3 = 0$, $(0, \ln 2)$.


Interpretations of partial derivatives

At this point we will show that the two main interpretations of derivatives of functions of a single variable still hold for partial derivatives, with small modifications to account for the fact that we now have more than one variable.

Rates of change. The first interpretation we have already seen, and it is the more important of the two. As with functions of a single variable, partial derivatives represent the rates of change of the function as the variables change. As we saw previously, $f_x(x, y)$ represents the rate of change of the function $f(x, y)$ as we change $x$ and hold $y$ fixed, while $f_y(x, y)$ represents the rate of change of $f(x, y)$ as we change $y$ and hold $x$ fixed.

Example 3.3.5 (Rates of change) Determine if $f(x, y) = \dfrac{x^2}{y^3}$ is increasing or decreasing at $(2, 5)$,
(a) if we allow $x$ to vary and hold $y$ fixed.
(b) if we allow $y$ to vary and hold $x$ fixed.

Solution (a) In this case we first need $f_x(x, y)$ and its value at the point.
$$f_x(x, y) = \frac{2x}{y^3} \quad\Longrightarrow\quad f_x(2, 5) = \frac{4}{125} > 0.$$
The partial derivative with respect to $x$ is positive, and therefore if we hold $y$ fixed the function is increasing at $(2, 5)$ as we vary $x$.

(b) For this part we need $f_y(x, y)$ and its value at the point.
$$f_y(x, y) = -\frac{3x^2}{y^4} \quad\Longrightarrow\quad f_y(2, 5) = -\frac{12}{625} < 0.$$
The partial derivative with respect to $y$ is negative, and therefore the function is decreasing at $(2, 5)$ as we vary $y$ and hold $x$ fixed.
□
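The two rates of change in Example 3.3.5 can be estimated directly from the definition of a partial derivative: vary one variable a little and hold the other fixed. The following is a minimal sketch; the helper `partial` and the step size are illustrative choices.

```python
def f(x, y):
    return x**2 / y**3

def partial(func, x, y, var, h=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to var ('x' or 'y') at the point (x, y)."""
    if var == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

fx = partial(f, 2.0, 5.0, "x")   # compare with 4/125
fy = partial(f, 2.0, 5.0, "y")   # compare with -12/625
print(fx, 4 / 125)
print(fy, -12 / 625)
```

The sign of each estimate tells us whether $f$ is increasing or decreasing in that direction, matching parts (a) and (b).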

Note that it is completely possible for a function to be increasing for a fixed $y$ and decreasing for a fixed $x$ at a point, as the above example has shown. To see a nice example of this, take a look at the following graph.

[Figure: graph of a hyperbolic paraboloid.]

This is a graph of a hyperbolic paraboloid. At the origin we can see that if we move along the positive $x$-axis the graph is increasing, and if we move along the positive $y$-axis the graph is decreasing. So it is completely possible to have a graph both increasing and decreasing at a point, depending upon the direction in which we move. We should never expect the function to behave in exactly the same way at a point as each variable changes.

Slopes of tangent lines. The next interpretation is one of the standard interpretations in any single variable calculus course. We know from single variable calculus that $f'(a)$ represents the slope of the tangent line to the curve $y = f(x)$ at $x = a$. Here, $f_x(x, y)$ and $f_y(x, y)$ also represent the slopes of tangent lines. The difference is in the curves that they are tangent to. Partial derivatives are the slopes of traces. By a trace to $f(x, y)$ at the point $(a, b)$ we mean the intersection curve between the surface defined by $z = f(x, y)$ and the plane defined by $x = a$ (resp. by the plane $y = b$). In particular, the partial derivative $f_x(a, b)$ is the slope of the trace to $f(x, y)$ for the plane $y = b$ at the point $(a, b)$. Likewise, the partial derivative $f_y(a, b)$ is the slope of the trace to $f(x, y)$ for the plane $x = a$ at the point $(a, b)$.

Example 3.3.6 (Slopes of tangent lines) Find the slopes of the traces to $z = 10 - 4x^2 - y^2$ at the point $(1, 2)$.

Solution We sketch the graphs of the traces for the planes $x = 1$ and $y = 2$ in the following.

[Figure: two panels, "Trace for x = 1" and "Trace for y = 2".]

Next we need the two partial derivatives so we can get the slopes.
$$f_x(x, y) = -8x, \qquad f_y(x, y) = -2y.$$
To get the slopes, all we need to do is evaluate the partial derivatives at the point $(1, 2)$.
$$f_x(1, 2) = -8, \qquad f_y(1, 2) = -4.$$
So, the tangent line at $(1, 2)$ for the trace to $z = 10 - 4x^2 - y^2$ for the plane $y = 2$ has a slope of $-8$. Also, the tangent line at $(1, 2)$ for the trace to $z = 10 - 4x^2 - y^2$ for the plane $x = 1$ has a slope of $-4$.
□

Example 3.3.7 (Slopes of tangent lines) The plane $x = 1$ intersects the paraboloid $z = x^2 + y^2$ in a parabola. Find the slope of the tangent to the parabola at $(1, 2, 5)$.

Solution The slope is the value of the partial derivative $\partial z/\partial y$ at $(1, 2)$.
$$\left.\frac{\partial z}{\partial y}\right|_{(1,2)} = \left.\frac{\partial}{\partial y}\left(x^2 + y^2\right)\right|_{(1,2)} = 2y\,\Big|_{(1,2)} = 4.$$
As a check, we can treat the parabola as the graph of the single variable function $z = (1)^2 + y^2 = 1 + y^2$ in the plane $x = 1$ and ask for the slope at $y = 2$. The slope, calculated now as an ordinary derivative, is
$$\left.\frac{dz}{dy}\right|_{y=2} = \left.\frac{d}{dy}\left(1 + y^2\right)\right|_{y=2} = 2y\,\Big|_{y=2} = 4.$$
Could you sketch the graphs of the functions involved in this question?
□

Vector equations of tangent line. Finally, let us briefly talk about getting the equations of the tangent lines. Recall that the equation of a line in 3-space is given by a vector equation, and that to write it down we need a point on the line and a vector that is parallel to the line. The point is easy. Since we know the $x$- and $y$-coordinates of the point, all we need to do is plug them into the function to get the $z$-coordinate. So, the point is $(a, b, f(a, b))$. The parallel (or tangent) vector is also easy. We can write the equation of the surface as a vector function as follows,
$$\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k} = x\,\mathbf{i} + y\,\mathbf{j} + f(x, y)\,\mathbf{k},$$
or in the alternate vector notation
$$\mathbf{r}(x, y) = \langle x, y, f(x, y)\rangle.$$
We know that if we have a vector function of one variable we can get a tangent vector by differentiating the vector function. The same is still true here. If we differentiate with respect to $x$ we get a tangent vector to traces for the plane $y = b$ (i.e. for fixed $y$), and if we differentiate with respect to $y$ we get a tangent vector to traces for the plane $x = a$ (for fixed $x$). The following is the tangent vector for traces with fixed $y$.
$$\mathbf{r}_x(x, y) = \langle 1, 0, f_x(x, y)\rangle.$$
We differentiated each component with respect to $x$. Therefore the first component becomes a one and the second becomes a zero, because we are treating $y$ as a constant when we differentiate with respect to $x$. The third component is just the partial derivative of the function with respect to $x$. For traces with fixed $x$ the tangent vector is
$$\mathbf{r}_y(x, y) = \langle 0, 1, f_y(x, y)\rangle.$$
The equation for the tangent line to traces with fixed $y$ is
$$\mathbf{r}(t) = \langle a, b, f(a, b)\rangle + t\,\langle 1, 0, f_x(a, b)\rangle,$$
whereas the tangent line to traces with fixed $x$ is
$$\mathbf{r}(t) = \langle a, b, f(a, b)\rangle + t\,\langle 0, 1, f_y(a, b)\rangle.$$

Example 3.3.8 (Vector equations of tangent line) Write down the vector equations of the tangent lines to the traces to $z = 10 - 4x^2 - y^2$ at the point $(1, 2)$.

Solution There really is not much to do here other than plugging the values and function into the formulas above. We have already computed the derivatives and their values at $(1, 2)$ in Example 3.3.6, and the point on each trace is $(1, 2, f(1, 2)) = (1, 2, 2)$. The equation of the tangent line to the trace for the plane $y = 2$ is given by
$$\mathbf{r}(t) = \langle 1, 2, 2\rangle + t\,\langle 1, 0, -8\rangle = \langle 1 + t,\; 2,\; 2 - 8t\rangle,$$
and the equation of the tangent line to the trace for the plane $x = 1$ is given by
$$\mathbf{r}(t) = \langle 1, 2, 2\rangle + t\,\langle 0, 1, -4\rangle = \langle 1,\; 2 + t,\; 2 - 4t\rangle.$$
□
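One way to see that the lines in Example 3.3.8 really are tangent to the traces is to measure the vertical gap between each line and the surface as $t \to 0$: for a tangent line the gap should shrink like $t^2$, not like $t$. The sketch below does this; the function names and the sample values of $t$ are illustrative.

```python
def f(x, y):
    return 10 - 4 * x**2 - y**2

def tangent_y_fixed(t):
    # r(t) = <1, 2, 2> + t <1, 0, -8>, the line in the plane y = 2
    return (1 + t, 2, 2 - 8 * t)

def tangent_x_fixed(t):
    # r(t) = <1, 2, 2> + t <0, 1, -4>, the line in the plane x = 1
    return (1, 2 + t, 2 - 4 * t)

def gap(line, t):
    """Vertical distance between the line and the surface above the same (x, y)."""
    x, y, z = line(t)
    return abs(f(x, y) - z)

steps = (0.1, 0.01, 0.001)
gaps_y = [gap(tangent_y_fixed, t) for t in steps]
gaps_x = [gap(tangent_x_fixed, t) for t in steps]
print(gaps_y)
print(gaps_x)
```

Working the algebra by hand gives gaps of exactly $4t^2$ and $t^2$, so each tenfold reduction in $t$ reduces the gap a hundredfold.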

Exercise 3.3.3
Find the rate of change of $z$ with respect to $x$ at the given point with $y$ held fixed.
1. $f(x, y) = \sin(y^2 - 4x)$ at $(2, 1)$.
2. $f(x, y) = (x + y)^{-1}$ at $(2, 4)$.

Find the rate of change of $z$ with respect to $y$ at the given point with $x$ held fixed.
3. $f(x, y) = \sin(y^2 - 4x)$ at $(2, 1)$.
4. $f(x, y) = (x + y)^{-1}$ at $(2, 4)$.

Find the slope of the surface $z = f(x, y)$ in the $x$-direction at the given point.
5. $f(x, y) = \sqrt{3x + 2y}$ at $(4, 2)$.
6. $f(x, y) = xe^{y} + 5y$ at $(3, 0)$.

Find the slope of the surface $z = f(x, y)$ in the $y$-direction at the given point.
7. $f(x, y) = \sqrt{3x + 2y}$ at $(4, 2)$.
8. $f(x, y) = xe^{y} + 5y$ at $(3, 0)$.

Find the slope of the tangent line at $(1, 1, 5)$ to the curve of intersection of the surface $z = x^2 + 4y^2$ and the plane
9. $x = 1$.
10. $y = 1$.


Higher-order partial derivatives


Just as we have higher-order derivatives of functions of one variable, we also have higher-order (partial) derivatives of functions of more than one variable. Consider the case of a function of two variables, $f(x, y)$. Since both of the first-order partial derivatives are also functions of $x$ and $y$, we can in turn differentiate each with respect to $x$ or $y$. This means that for a function of two variables there are a total of four possible second-order derivatives. Here are the notations that we will use to denote them.
$$(f_x)_x = f_{xx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2},$$
$$(f_x)_y = f_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x},$$
$$(f_y)_x = f_{yx} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y},$$
$$(f_y)_y = f_{yy} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2}.$$
In the above, the second and third second-order partial derivatives are often called mixed partial derivatives, since we are taking derivatives with respect to more than one variable. Note as well that the order in which we take the derivatives is given by the notation. If we are using the subscript notation, for example $f_{xy}$, then we differentiate from left to right. In other words, in this case we differentiate first with respect to $x$ and then with respect to $y$. With the fractional notation, for example $\dfrac{\partial^2 f}{\partial y\,\partial x}$, it is the opposite. In these cases we differentiate moving along the denominator from right to left. So, again, in this case we differentiate with respect to $x$ first and then $y$. Let us take a quick look at an example.

Example 3.3.9 (Second-order partial derivatives) Find all the second-order derivatives for $f(x, y) = \cos(2x) - x^2e^{5y} + 3y^2$.

Solution We need the first-order derivatives, so here they are.
$$f_x(x, y) = -2\sin(2x) - 2xe^{5y}, \qquad f_y(x, y) = -5x^2e^{5y} + 6y.$$
Now, let us get the second-order derivatives. They are
$$f_{xx} = -4\cos(2x) - 2e^{5y}, \quad f_{xy} = -10xe^{5y}, \quad f_{yx} = -10xe^{5y}, \quad f_{yy} = -25x^2e^{5y} + 6.$$
□

Note that we dropped the $(x, y)$ from the derivatives (i.e., writing for example $f_{xx}$ instead of $f_{xx}(x, y)$). This is fairly standard and we will be doing it most of the time from now on. We will also be dropping it for the first-order derivatives in most cases. You may have noticed that the mixed second-order partial derivatives
$$f_{xy} = \frac{\partial^2 f}{\partial y\,\partial x} \quad\text{and}\quad f_{yx} = \frac{\partial^2 f}{\partial x\,\partial y}$$
in Example 3.3.9 are equal. This is not a coincidence. If the function is nice enough this will always be the case. So, what exactly is nice enough? The following theorem tells us the answer.
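The equality of the mixed partials in Example 3.3.9 can also be observed numerically. In the sketch below we start from the two first-order formulas, difference $f_x$ in the $y$-direction and $f_y$ in the $x$-direction, and compare both with $-10xe^{5y}$. The function names, step size and sample point are illustrative choices.

```python
import math

def fx(x, y):
    return -2 * math.sin(2 * x) - 2 * x * math.exp(5 * y)

def fy(x, y):
    return -5 * x**2 * math.exp(5 * y) + 6 * y

def fxy_numeric(x, y, h=1e-5):
    # differentiate f_x with respect to y (x first, then y)
    return (fx(x, y + h) - fx(x, y - h)) / (2 * h)

def fyx_numeric(x, y, h=1e-5):
    # differentiate f_y with respect to x (y first, then x)
    return (fy(x + h, y) - fy(x - h, y)) / (2 * h)

x0, y0 = 0.5, 0.2
exact = -10 * x0 * math.exp(5 * y0)
print(fxy_numeric(x0, y0), fyx_numeric(x0, y0), exact)
```

All three printed numbers should agree to several decimal places, whichever order of differentiation is used.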

Theorem 3.3.1 (The Mixed Derivative Theorem) If $f(x, y)$ and its partial derivatives $f_x$, $f_y$, $f_{xy}$ and $f_{yx}$ are defined on a disk containing a point $(a, b)$ and are all continuous at $(a, b)$, then
$$f_{xy}(a, b) = f_{yx}(a, b).$$

The theorem is also known as Clairaut's Theorem, named after the French mathematician Alexis Clairaut who discovered it. The proof is omitted here. This theorem says that to calculate a mixed second-order derivative, we may differentiate in either order, provided the continuity conditions are satisfied. This can work to our advantage.

Example 3.3.10 (Mixed derivative) Find $\dfrac{\partial^2 w}{\partial x\,\partial y}$ if $w = xy + \dfrac{e^y}{y^2 + 1}$.

Solution The symbol $\dfrac{\partial^2 w}{\partial x\,\partial y}$ tells us to differentiate first with respect to $y$ and then with respect to $x$. If we postpone the differentiation with respect to $y$ and differentiate first with respect to $x$, however, we get the answer more quickly. In two steps,
$$\frac{\partial w}{\partial x} = y \quad\text{and}\quad \frac{\partial^2 w}{\partial y\,\partial x} = 1.$$
If we differentiate first with respect to $y$, we still obtain $\dfrac{\partial^2 w}{\partial x\,\partial y} = 1$ as well. We can differentiate in either order because the conditions of Theorem 3.3.1 hold for $w$ at all points.
□

Although we will deal mostly with first- and second-order partial derivatives, because these appear most frequently in applications, there is no theoretical limit to how many times we can differentiate a function as long as the derivatives involved exist. There are higher-order derivatives as well, and the following are a couple of the third-order partial derivatives of a function of two variables.
$$f_{xyx} = (f_{xy})_x = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial y\,\partial x}\right) = \frac{\partial^3 f}{\partial x\,\partial y\,\partial x},$$
$$f_{yxx} = (f_{yx})_x = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) = \frac{\partial^3 f}{\partial x^2\,\partial y}.$$
Notice as well that for both of these we differentiate once with respect to $y$ and twice with respect to $x$. There is one more third-order partial derivative of this kind, $f_{xxy}$. There is an extension of Clairaut's Theorem that says that if all three of these are continuous then they are all equal,
$$f_{xxy} = f_{xyx} = f_{yxx}.$$
To this point we have only looked at functions of two variables, but everything that we have done here works regardless of the number of variables in the function, and there are natural extensions of Clairaut's theorem to all of these cases as well. For instance,
$$f_{xz}(x, y, z) = f_{zx}(x, y, z),$$
provided both of the derivatives are continuous. In general, we can extend Clairaut's theorem to any function and any mixed partial derivatives. The only requirement is that in each derivative we differentiate with respect to each variable the same number of times. In other words, provided we meet the continuity condition, the following are equal,
$$f_{ssrtsrr} = f_{rtsrssr},$$
because in each case we differentiate with respect to $t$ once, $s$ three times and $r$ three times. Let us do a couple of examples with higher-order derivatives and functions of more than two variables.

Example 3.3.11 (Higher-order derivatives) Find the indicated derivative for each of the following functions.
(a) Find $f_{xxyzz}$ for $f(x, y, z) = z^3y^2\ln x$.
(b) Find $\dfrac{\partial^3 f}{\partial y\,\partial x^2}$ for $f(x, y) = e^{xy}$.

Solution (a) In this case remember that we differentiate from left to right. The derivatives are
$$f_x = \frac{z^3y^2}{x}, \qquad f_{xx} = -\frac{z^3y^2}{x^2}, \qquad f_{xxy} = -\frac{2z^3y}{x^2},$$
$$f_{xxyz} = -\frac{6z^2y}{x^2}, \qquad f_{xxyzz} = -\frac{12zy}{x^2}.$$
(b) Here we differentiate from right to left. The derivatives are
$$\frac{\partial f}{\partial x} = ye^{xy}, \qquad \frac{\partial^2 f}{\partial x^2} = y^2e^{xy}, \qquad \frac{\partial^3 f}{\partial y\,\partial x^2} = 2ye^{xy} + xy^2e^{xy}.$$
□

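Exercise 3.3.4
Let $z = x\cos y$. Find the following.
1. $\partial^2 z/\partial x^2$.
2. $\partial^2 z/\partial y^2$.
3. $\partial^2 z/\partial x\,\partial y$.
4. $\partial^2 z/\partial y\,\partial x$.

Let $f(x, y) = 4x^2 - 2y + 7x^4y^5$. Find the following.
5. $f_{xx}$.
6. $f_{yy}$.
7. $f_{xy}$.
8. $f_{yx}$.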

2 Conrm that the mixed second-order partial derivatives of f are the same. p 9. f (x, y ) = x2 + y 2 . 11. f (x, y ) = ln(x2 + y 2 ). 10. f (x, y ) = exy .
2

12. f (x, y ) = ex cos y .

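Express the following derivatives in $\partial$ notation.
13. $f_{xxx}$.
14. $f_{xyy}$.
15. $f_{yyxx}$.
16. $f_{xyyy}$.

Express the following derivatives in subscript notation.
17. $\dfrac{\partial^3 f}{\partial y^2\,\partial x}$.
18. $\dfrac{\partial^4 f}{\partial x^4}$.
19. $\dfrac{\partial^4 f}{\partial y^2\,\partial x^2}$.
20. $\dfrac{\partial^5 f}{\partial x^2\,\partial y^3}$.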


3.4 The Chain Rule

The chain rule for functions of a single variable says that when $y = f(x)$ is a differentiable function of $x$ and $x = g(t)$ is a differentiable function of $t$, $y$ becomes a differentiable function of $t$ and $dy/dt$ can be calculated with the formula
$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt}.$$
It is now time to extend the chain rule to more complicated situations. Notice that in the above the derivative $dy/dt$ really does make sense, since if we plug in for $x$ then $y$ really is a function of $t$. One way to remember this form of the chain rule is to note that if we think of the two derivatives on the right side as fractions, the $dx$'s cancel to give the same derivative on both sides. As with many topics in multivariable calculus, there are in fact many different formulas depending on the number of variables that we are dealing with. So, let us start this discussion off with a function of two variables, $z = f(x, y)$. From this point there are still many different possibilities that we can look at. We will look at two distinct cases before generalizing the whole idea.

Case 1. $z = f(x, y)$, $x = g(t)$, $y = h(t)$ and compute $\dfrac{dz}{dt}$.

This case is analogous to the standard chain rule from single variable calculus that we looked at above. In this case we are going to compute an ordinary derivative, since $z$ really is a function of $t$ only once we substitute in for $x$ and $y$. The chain rule for this case is
$$\frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$
So, basically what we have done here is differentiate $f$ with respect to each variable in it and then multiply each of these by the derivative of that variable with respect to $t$. The final step is to add them all up. Let us take a look at a couple of examples.

Example 3.4.1 (Chain rule) Compute $\dfrac{dz}{dt}$ for each of the following.
(a) $z = xe^{xy}$, $x = t^2$, $y = t^{-1}$.
(b) $z = x^2y^3 + y\cos x$, $x = \ln t$, $y = \sin(4t)$.

Solution (a) We may directly apply the formula.
$$\frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = \left(e^{xy} + xye^{xy}\right)(2t) + x^2e^{xy}\left(-t^{-2}\right) = 2t(1 + xy)\,e^{xy} - t^{-2}x^2e^{xy}.$$
So, technically we have computed the derivative. However, we should probably go ahead and substitute in for $x$ and $y$ as well at this point, since we have already got $t$'s in the derivative. Doing this gives
$$\frac{dz}{dt} = 2t(1 + t)\,e^t - t^{-2}t^4e^t = (2t + t^2)\,e^t.$$
Note that in this case it might actually be easier to just substitute in for $x$ and $y$ in the original function and compute the derivative as we normally would. For comparison, let us do that.
$$z = t^2e^t \quad\Longrightarrow\quad \frac{dz}{dt} = 2te^t + t^2e^t.$$
The same result for less work. Note, however, that often it will actually be more work to do the substitution first.

(b) In this case it would almost definitely be more work to do the substitution first, so we will use the chain rule first and then substitute.
$$\frac{dz}{dt} = \left(2xy^3 - y\sin x\right)\left(\frac{1}{t}\right) + \left(3x^2y^2 + \cos x\right)\left(4\cos(4t)\right)$$
$$= \frac{2\sin^3(4t)\ln t - \sin(4t)\sin(\ln t)}{t} + 4\cos(4t)\left(3\sin^2(4t)\ln^2 t + \cos(\ln t)\right).$$
Note that sometimes, because of the significant mess of the final answer, we will only simplify the first step a little and leave the answer in terms of $x$, $y$ and $t$. This depends upon the situation, class and instructor. However, this kind of substitution work is not necessary in the examinations for this class.
□
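Part (a) of Example 3.4.1 gives two routes to the same derivative, and they can be compared numerically. The sketch below evaluates the Case 1 chain rule term by term and compares it with the simplified answer $(2t + t^2)e^t$; the function names and the sample value of $t$ are illustrative.

```python
import math

def dz_dt_chain(t):
    """Case 1 chain rule for z = x e^{xy}, x = t^2, y = 1/t."""
    x, y = t**2, 1 / t
    dz_dx = math.exp(x * y) + x * y * math.exp(x * y)   # partial of z in x
    dz_dy = x**2 * math.exp(x * y)                      # partial of z in y
    dx_dt = 2 * t
    dy_dt = -1 / t**2
    return dz_dx * dx_dt + dz_dy * dy_dt

def dz_dt_direct(t):
    """Derivative of z = t^2 e^t obtained by substituting first."""
    return (2 * t + t**2) * math.exp(t)

t0 = 1.3
print(dz_dt_chain(t0), dz_dt_direct(t0))
```

The two values should agree up to rounding error, for any $t \neq 0$.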

Now, there is a special case that we should take a quick look at before moving on to the next case. Let us suppose that we have the following situation,
$$z = f(x, y) \quad\text{and}\quad y = g(x).$$
In this case the chain rule for $\dfrac{dz}{dx}$ becomes
$$\frac{dz}{dx} = \frac{\partial f}{\partial x}\frac{dx}{dx} + \frac{\partial f}{\partial y}\frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}.$$
In the first term we used the fact that
$$\frac{dx}{dx} = \frac{d}{dx}(x) = 1.$$
Let us take a quick look at an example.

Example 3.4.2 (Chain rule) Compute $\dfrac{dz}{dx}$ for $z = x\ln(xy) + y^3$ and $y = \cos(x^2 + 1)$.

Solution We just plug into the formula.
$$\frac{dz}{dx} = \left(\ln(xy) + x\,\frac{y}{xy}\right) + \left(x\,\frac{x}{xy} + 3y^2\right)\left(-2x\sin(x^2 + 1)\right)$$
$$= \ln\left(x\cos(x^2 + 1)\right) + 1 - 2x\sin(x^2 + 1)\left(\frac{x}{\cos(x^2 + 1)} + 3\cos^2(x^2 + 1)\right)$$
$$= \ln\left(x\cos(x^2 + 1)\right) + 1 - 2x^2\tan(x^2 + 1) - 6x\sin(x^2 + 1)\cos^2(x^2 + 1).$$
□

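Let us take a look at the second case.

Case 2. $z = f(x, y)$, $x = g(s, t)$, $y = h(s, t)$ and compute $\dfrac{\partial z}{\partial s}$ and $\dfrac{\partial z}{\partial t}$.

In this case, if we substitute in for $x$ and $y$ we find that $z$ is a function of $s$ and $t$, so it makes sense that we are computing partial derivatives here and that there are two of them. Here is the chain rule for both of these partial derivatives.
$$\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s} \quad\text{and}\quad \frac{\partial z}{\partial t} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}.$$
So, not surprisingly, these are very similar to the first case that we looked at. Here is a quick example of this kind of chain rule.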

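Example 3.4.3 (Chain rule) Find $\dfrac{\partial z}{\partial s}$ and $\dfrac{\partial z}{\partial t}$ for
$$z = e^{2r}\sin(3\theta), \qquad r = st - t^2, \qquad \theta = \sqrt{s^2 + t^2}.$$

Solution Here is the chain rule for $\dfrac{\partial z}{\partial s}$.
$$\frac{\partial z}{\partial s} = \left(2e^{2r}\sin(3\theta)\right)(t) + \left(3e^{2r}\cos(3\theta)\right)\frac{s}{\sqrt{s^2 + t^2}}$$
$$= 2t\,e^{2(st - t^2)}\sin\left(3\sqrt{s^2 + t^2}\right) + \frac{3s\,e^{2(st - t^2)}\cos\left(3\sqrt{s^2 + t^2}\right)}{\sqrt{s^2 + t^2}}.$$
Now the chain rule for $\dfrac{\partial z}{\partial t}$.
$$\frac{\partial z}{\partial t} = \left(2e^{2r}\sin(3\theta)\right)(s - 2t) + \left(3e^{2r}\cos(3\theta)\right)\frac{t}{\sqrt{s^2 + t^2}}$$
$$= 2(s - 2t)\,e^{2(st - t^2)}\sin\left(3\sqrt{s^2 + t^2}\right) + \frac{3t\,e^{2(st - t^2)}\cos\left(3\sqrt{s^2 + t^2}\right)}{\sqrt{s^2 + t^2}}.$$
□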
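The Case 2 formulas of Example 3.4.3 can be checked against finite differences of the composite function. This is a minimal sketch: the names `z`, `dz_ds`, `dz_dt`, the step size and the sample point are illustrative, and the symbol $\theta$ for the second intermediate variable is our labelling.

```python
import math

def z(s, t):
    r = s * t - t**2
    theta = math.sqrt(s**2 + t**2)
    return math.exp(2 * r) * math.sin(3 * theta)

def dz_ds(s, t):
    r = s * t - t**2
    theta = math.sqrt(s**2 + t**2)
    return (2 * t * math.exp(2 * r) * math.sin(3 * theta)
            + 3 * s * math.exp(2 * r) * math.cos(3 * theta) / theta)

def dz_dt(s, t):
    r = s * t - t**2
    theta = math.sqrt(s**2 + t**2)
    return (2 * (s - 2 * t) * math.exp(2 * r) * math.sin(3 * theta)
            + 3 * t * math.exp(2 * r) * math.cos(3 * theta) / theta)

s0, t0, h = 1.0, 0.5, 1e-6
numeric_s = (z(s0 + h, t0) - z(s0 - h, t0)) / (2 * h)
numeric_t = (z(s0, t0 + h) - z(s0, t0 - h)) / (2 * h)
print(numeric_s, dz_ds(s0, t0))
print(numeric_t, dz_dt(s0, t0))
```

Each pair of printed values should agree to several decimal places.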

We have seen a couple of cases of the chain rule, so let us now see the general version.

Chain Rule Suppose that $z$ is a function of $n$ variables $x_1, x_2, \ldots, x_n$, and that each of these variables is in turn a function of $m$ variables $t_1, t_2, \ldots, t_m$. Then for any variable $t_i$ ($i = 1, 2, \ldots, m$), we have the following,
$$\frac{\partial z}{\partial t_i} = \frac{\partial z}{\partial x_1}\frac{\partial x_1}{\partial t_i} + \frac{\partial z}{\partial x_2}\frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial z}{\partial x_n}\frac{\partial x_n}{\partial t_i}.$$
This is a bit cumbersome. There is actually an easier way to construct all the chain rules that we have discussed in this section or will look at in later examples. We can build up a tree diagram that gives us the chain rule for any situation. To see how these work, let us go back and take a look at the chain rule for $\partial z/\partial s$ given that $z = f(x, y)$, $x = g(s, t)$, $y = h(s, t)$. Of course we already know the answer, but the following tree diagram is used as an illustration. For reference, here is the chain rule for this case,
$$\frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}.$$

Here is the tree diagram for this case.

[Tree diagram: $z$ at the top branches to $x$ and $y$ (branches labelled $\partial z/\partial x$ and $\partial z/\partial y$); $x$ branches to $s$ and $t$ (labelled $\partial x/\partial s$ and $\partial x/\partial t$); $y$ branches to $s$ and $t$ (labelled $\partial y/\partial s$ and $\partial y/\partial t$).]

We start at the top with the function itself and then branch out from that point. The first set of branches is for the variables in the function. From each of these endpoints we put down a further set of branches that gives the variables that both $x$ and $y$ are functions of. We connect each letter with a line, and each line represents a partial derivative as shown. Note that the letter in the numerator of the partial derivative is the upper node of the branch and the letter in the denominator is the lower node. To use this to get the chain rule, we start at the bottom and, for each branch that ends with the variable we want to differentiate with respect to ($s$ in this case), we move up the tree until we hit the top, multiplying the derivatives that we see along that set of branches. Once we have done this for each branch that ends at $s$, we add the results up to get the chain rule for that situation. Note that we do not usually write the derivatives in the tree. They are always an assumed part of the tree. Let us write down the chain rules for a couple of examples.

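Example 3.4.4 (Chain rule) Use a tree diagram to write down the chain rule for the given derivatives.
(a) $\dfrac{dw}{dt}$ for $w = f(x, y, z)$, $x = g_1(t)$, $y = g_2(t)$, $z = g_3(t)$.
(b) $\dfrac{\partial w}{\partial r}$ for $w = f(x, y, z)$, $x = g_1(r, s, t)$, $y = g_2(r, s, t)$, $z = g_3(r, s, t)$.

Solution (a) We first draw the tree diagram.

[Tree diagram: $w$ at the top branches to $x$, $y$ and $z$; each of these branches to $t$.]

From this tree diagram we know that the chain rule is given by
$$\frac{dw}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt},$$
which is really just a natural extension of the two variable case that we saw before.

(b) Here is the tree diagram for this situation.

[Tree diagram: $w$ at the top branches to $x$, $y$ and $z$; each of these branches to $r$, $s$ and $t$.]

From this tree diagram we know that the chain rule is given by
$$\frac{\partial w}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r}.$$
□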

So, provided we can construct the tree diagram, it is not too difficult to write down the chain rule for any setup that we might run across. We have now seen how to take the first-order derivatives in these more complicated situations, but what about higher-order derivatives? How do we do these? It is probably easiest to see how to deal with them through an example.

Example 3.4.5 (Chain rule) Compute 2z for z = f (x, y ) if x = r cos and y = r sin . 2

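Solution We will need the first-order derivative before we can even think about finding the second-order derivative, so let us get that. This situation falls into the second case that we looked at above, so we do not need a new tree diagram. The following is the first-order derivative.
$$\frac{\partial f}{\partial\theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial\theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial\theta} = -r\sin\theta\,\frac{\partial f}{\partial x} + r\cos\theta\,\frac{\partial f}{\partial y}.$$

Now, the second-order derivative is given by
$$\frac{\partial^2 f}{\partial\theta^2} = \frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial\theta}\right) = \frac{\partial}{\partial\theta}\left(-r\sin\theta\,\frac{\partial f}{\partial x} + r\cos\theta\,\frac{\partial f}{\partial y}\right).$$

The issue here is to correctly deal with this derivative. The two first-order derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are both functions of $x$ and $y$, which are in turn functions of $r$ and $\theta$, so both of these terms are products. Using the product rule gives the following,
$$\frac{\partial^2 f}{\partial\theta^2} = -r\cos\theta\,\frac{\partial f}{\partial x} - r\sin\theta\,\frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial x}\right) - r\sin\theta\,\frac{\partial f}{\partial y} + r\cos\theta\,\frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial y}\right).$$

We now need to determine what $\frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial x}\right)$ and $\frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial y}\right)$ will be. These are both chain rule problems again, since both of the derivatives are functions of $x$ and $y$ and we want to take the derivative with respect to $\theta$.
$$\frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right)\frac{\partial x}{\partial\theta} + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)\frac{\partial y}{\partial\theta} = -r\sin\theta\,\frac{\partial^2 f}{\partial x^2} + r\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x},$$
$$\frac{\partial}{\partial\theta}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)\frac{\partial x}{\partial\theta} + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right)\frac{\partial y}{\partial\theta} = -r\sin\theta\,\frac{\partial^2 f}{\partial x\,\partial y} + r\cos\theta\,\frac{\partial^2 f}{\partial y^2}.$$

The final step is to plug these back into the second-order derivative and do some simplifying.
$$\begin{aligned}
\frac{\partial^2 f}{\partial\theta^2} &= -r\cos\theta\,\frac{\partial f}{\partial x} - r\sin\theta\left(-r\sin\theta\,\frac{\partial^2 f}{\partial x^2} + r\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x}\right) \\
&\quad - r\sin\theta\,\frac{\partial f}{\partial y} + r\cos\theta\left(-r\sin\theta\,\frac{\partial^2 f}{\partial x\,\partial y} + r\cos\theta\,\frac{\partial^2 f}{\partial y^2}\right) \\
&= -r\cos\theta\,\frac{\partial f}{\partial x} + r^2\sin^2\theta\,\frac{\partial^2 f}{\partial x^2} - r^2\sin\theta\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x} \\
&\quad - r\sin\theta\,\frac{\partial f}{\partial y} - r^2\sin\theta\cos\theta\,\frac{\partial^2 f}{\partial x\,\partial y} + r^2\cos^2\theta\,\frac{\partial^2 f}{\partial y^2} \\
&= -r\cos\theta\,\frac{\partial f}{\partial x} - r\sin\theta\,\frac{\partial f}{\partial y} + r^2\sin^2\theta\,\frac{\partial^2 f}{\partial x^2} - 2r^2\sin\theta\cos\theta\,\frac{\partial^2 f}{\partial y\,\partial x} + r^2\cos^2\theta\,\frac{\partial^2 f}{\partial y^2}.
\end{aligned}$$
In the last step we combined the two mixed partial derivatives, which are equal whenever they are continuous. It is long and fairly messy but there it is. □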
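As a sanity check on the final formula of Example 3.4.5, the following minimal sketch compares it with a finite-difference second derivative. The test function $f(x, y) = x^2 y$, the sample point and all function names are assumptions chosen only for illustration; they are not part of the notes.

```python
import math

# Hypothetical test function f(x, y) = x^2 * y and its exact partial derivatives.
def f(x, y): return x * x * y
def fx(x, y): return 2 * x * y
def fy(x, y): return x * x
def fxx(x, y): return 2 * y
def fxy(x, y): return 2 * x
def fyy(x, y): return 0.0

def second_theta_formula(r, theta):
    """Right-hand side of the chain-rule result for d^2 f / d theta^2."""
    s, c = math.sin(theta), math.cos(theta)
    x, y = r * c, r * s
    return (-r * c * fx(x, y) - r * s * fy(x, y)
            + r**2 * s**2 * fxx(x, y)
            - 2 * r**2 * s * c * fxy(x, y)
            + r**2 * c**2 * fyy(x, y))

def second_theta_numeric(r, theta, h=1e-4):
    """Central second difference of theta -> f(r cos theta, r sin theta)."""
    g = lambda t: f(r * math.cos(t), r * math.sin(t))
    return (g(theta + h) - 2 * g(theta) + g(theta - h)) / h**2

r, theta = 1.5, 0.7
print(second_theta_formula(r, theta), second_theta_numeric(r, theta))
```

The two printed numbers should agree to several decimal places; a mismatch would point to a sign error in the hand computation.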

Implicit Differentiation

The final topic in this section is a revisiting of implicit differentiation. With these forms of the chain rule, implicit differentiation actually becomes a fairly simple process. We will start with a function in the form $F(x, y) = 0$ (if it is not in this form, simply move everything to one side of the equal sign to get it into this form) where $y = y(x)$. In a single variable calculus course we were asked to compute $\frac{dy}{dx}$ and this was often a fairly messy process. Using the chain rule from this section, however, we can get a nice simple formula for doing this. We will start by differentiating both sides with respect to $x$. This will mean using the chain rule on the left side, and the right side will differentiate to zero. Here is the result of that.
$$F_x + F_y\,\frac{dy}{dx} = 0 \quad\Longrightarrow\quad \frac{dy}{dx} = -\frac{F_x}{F_y}.$$

As shown, all we need to do next is solve for $\frac{dy}{dx}$ (which requires $F_y \neq 0$) and we now have a very nice formula to use for implicit differentiation. Note as well that in order to simplify the formula we switched back to using the subscript notation for the derivatives. Let us check out a quick example.

Example 3.4.6 (Implicit differentiation) Find $\dfrac{dy}{dx}$ for
$$x\cos(3y) + x^3y^5 = 3x - e^{xy}.$$

Solution The first step is to get a zero on one side of the equal sign and that is easy enough to do.
$$x\cos(3y) + x^3y^5 - 3x + e^{xy} = 0.$$
Now, the function on the left is $F(x, y)$ in our formula, so all we need to do is use the formula to find the derivative.
$$\frac{dy}{dx} = -\frac{\cos(3y) + 3x^2y^5 - 3 + ye^{xy}}{-3x\sin(3y) + 5x^3y^4 + xe^{xy}}.$$
□
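The formula $\frac{dy}{dx} = -F_x/F_y$ can also be evaluated numerically. The sketch below estimates the two partial derivatives of the $F(x, y)$ from Example 3.4.6 by central differences and compares the result with the closed form above. The helper names `partial` and `implicit_dydx` and the sample point are assumptions made for illustration.

```python
import math

def partial(F, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative of F in coordinate i."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (F(*up) - F(*down)) / (2 * h)

def implicit_dydx(F, x, y):
    """dy/dx = -F_x / F_y for a curve F(x, y) = 0 (requires F_y != 0)."""
    return -partial(F, (x, y), 0) / partial(F, (x, y), 1)

def F(x, y):
    """Example 3.4.6 rewritten in the form F(x, y) = 0."""
    return x * math.cos(3 * y) + x**3 * y**5 - 3 * x + math.exp(x * y)

def dydx_closed_form(x, y):
    """The expression for dy/dx obtained by hand in Example 3.4.6."""
    num = math.cos(3 * y) + 3 * x**2 * y**5 - 3 + y * math.exp(x * y)
    den = -3 * x * math.sin(3 * y) + 5 * x**3 * y**4 + x * math.exp(x * y)
    return -num / den

print(implicit_dydx(F, 1.0, 0.5), dydx_closed_form(1.0, 0.5))
```

The same helper works for any smooth $F$; for the circle $x^2 + y^2 - 1 = 0$ at $(0.6, 0.8)$ it returns approximately $-0.75$, matching $-x/y$.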

We can also do something similar to handle the types of implicit differentiation problems involving partial derivatives like those we saw when we first introduced partial derivatives. In these cases we will start off with a function in the form $F(x, y, z) = 0$, assume that $z = f(x, y)$, and we want to find $\frac{\partial z}{\partial x}$ and/or $\frac{\partial z}{\partial y}$. Let us start by trying to find $\frac{\partial z}{\partial x}$. We will differentiate both sides with respect to $x$, and we will need to remember that we are going to be treating $y$ as a constant. Also, the left side will require the chain rule. Here is the derivative.
$$\frac{\partial F}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = 0.$$
Now, we have the following,
$$\frac{\partial x}{\partial x} = 1 \quad\text{and}\quad \frac{\partial y}{\partial x} = 0.$$

The first is because we are just differentiating $x$ with respect to $x$ and we know that is 1. The second is because we are treating $y$ as a constant and so it will differentiate to zero. Plugging these in and solving for $\frac{\partial z}{\partial x}$ gives
$$\frac{\partial z}{\partial x} = -\frac{F_x}{F_z}.$$

A similar argument can be used to show that
$$\frac{\partial z}{\partial y} = -\frac{F_y}{F_z}.$$
As with the one-variable case, we switched to the subscript notation for derivatives to simplify the formulas. Let us take a quick look at an example of this.

Example 3.4.7 (Implicit differentiation) Find $\dfrac{\partial z}{\partial x}$ and $\dfrac{\partial z}{\partial y}$ for
$$x^2\sin(2y - 5z) = 1 + y\cos(6xz).$$

Solution This is one of the functions discussed in Example 3.3.4 (page 64). You might go back and see the difference between the two. First let us get everything on one side,
$$x^2\sin(2y - 5z) - 1 - y\cos(6xz) = 0.$$
Now, the function on the left is $F(x, y, z)$ and so all we need to do is use the formulas developed above to find the derivatives,
$$\frac{\partial z}{\partial x} = -\frac{2x\sin(2y - 5z) + 6yz\sin(6xz)}{-5x^2\cos(2y - 5z) + 6xy\sin(6xz)},$$
$$\frac{\partial z}{\partial y} = -\frac{2x^2\cos(2y - 5z) - \cos(6xz)}{-5x^2\cos(2y - 5z) + 6xy\sin(6xz)}.$$

If you go back and compare these answers to those that we found the first time around, you will notice that they might appear to be different. However, if you take into account the minus sign that sits in front of our answers here, you will see that they are in fact the same. □

Exercise 3.4.1

Use the chain rule to find $dz/dt$.

1. $z = \ln(2x^2 + y)$, $x = \sqrt{t}$, $y = t^{2/3}$.

2. $z = 3\cos x - \sin xy$, $x = 1/t$, $y = 3t$.

3. $z = e^{1 - xy}$, $x = t^{1/3}$, $y = t^3$.

4. $z = \sqrt{1 + x - 2xy^4}$, $x = \ln t$, $y = t$.

5. Find $\dfrac{\partial}{\partial x} f(x^2y, x + 2y)$ and $\dfrac{\partial}{\partial y} f(x^2y, x + 2y)$ in terms of the partial derivatives of $f$, assuming that these partial derivatives are continuous.

6. Find $dz/dt$, where $z = f(x, y, t)$, $x = g(t)$, and $y = h(t)$.

7. If $x = t\sin s$ and $y = t\cos s$, find $\dfrac{\partial^2}{\partial s\,\partial t} f(x, y)$.

8. Find $\dfrac{\partial^3}{\partial x\,\partial y^2} f(2x + 3, xy)$ in terms of partial derivatives of the function $f$.

9. Find $\dfrac{\partial^2}{\partial y\,\partial x} f(y^2, xy, x^2)$ in terms of partial derivatives of the function $f$.

10. Let $F(x, y, z) = 0$ where $x = x(y, z)$, $y = y(x, z)$, and $z = z(x, y)$. Prove that
$$\frac{\partial x}{\partial y}\,\frac{\partial y}{\partial z}\,\frac{\partial z}{\partial x} = -1.$$

11. Suppose that the temperature $T$ in a certain liquid varies with depth $z$ and time $t$ according to the formula $T = e^{-t}z$. Find the rate of change of temperature with respect to time at a point that is moving through the liquid so that at time $t$ its depth is $f(t)$. What is the rate if $f(t) = e^t$? What is happening in this case?


3.5 Directional Derivatives

To this point we have only looked at the two partial derivatives $f_x(x, y)$ and $f_y(x, y)$. Recall that these derivatives represent the rate of change of $f$ as we vary $x$ (holding $y$ fixed) and as we vary $y$ (holding $x$ fixed), respectively. We now need to discuss how to find the rate of change of $f$ if we allow both $x$ and $y$ to change simultaneously. The problem here is that there are many ways to allow both $x$ and $y$ to change. For instance, one could be changing faster than the other, and then there is also the issue of whether each is increasing or decreasing. So, before we get into finding the rate of change we need to take care of a couple of preliminary ideas first. The main idea that we need to look at is just how we are going to define the changing of $x$ and/or $y$.

Let us start off by supposing that we want the rate of change of $f$ at a particular point, say $(x_0, y_0)$. Let us also suppose that both $x$ and $y$ are increasing and that, in this case, $x$ is increasing twice as fast as $y$ is increasing. So, as $y$ increases by one unit of measure, $x$ will increase by two units of measure. To help us see how we are going to define this change, let us suppose that a particle is sitting at $(x_0, y_0)$ and the particle will move in the direction given by the changing $x$ and $y$. Therefore, the particle will move off in a direction of increasing $x$ and $y$, and the $x$-coordinate of the point will increase twice as fast as the $y$-coordinate.

Now that we are thinking of this changing $x$ and $y$ as a direction of movement, we can get a way of defining the change. We know that vectors can be used to define a direction, and so the particle, at this point, can be said to be moving in the direction $\mathbf{v} = \langle 2, 1\rangle$. Since this vector can be used to define how a particle at a point is changing, we can also use it to describe how $x$ and/or $y$ is changing at a point. For our example we will say that we want the rate of change of $f$ in the direction of $\mathbf{v} = \langle 2, 1\rangle$. In this way we will know that $x$ is increasing twice as fast as $y$ is.

There is still a small problem with this, however. There are many vectors that point in the same direction. For instance, all of the following vectors point in the same direction as $\mathbf{v} = \langle 2, 1\rangle$:
$$\mathbf{v} = \left\langle \tfrac{1}{5}, \tfrac{1}{10}\right\rangle, \qquad \mathbf{v} = \langle 6, 3\rangle, \qquad \mathbf{v} = \left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}}\right\rangle.$$

We need a way to consistently find the rate of change of a function in a given direction. We will do this by insisting that the vector that defines the direction of change be a unit vector. Recall that a unit vector is a vector with length, or magnitude, of 1. This means that for the example that we started off thinking about we would want to use
$$\mathbf{v} = \left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}}\right\rangle,$$
since this is the unit vector that points in the direction of change. For reference purposes, recall that the magnitude or length of the vector $\mathbf{v} = \langle a, b, c\rangle$ is given by
$$\|\mathbf{v}\| = \sqrt{a^2 + b^2 + c^2}.$$

For two-dimensional vectors we drop the $c$ from the formula. Sometimes we will give the direction of changing $x$ and $y$ as an angle. For instance, we may say that we want the rate of change of $f$ in the direction of $\theta = \pi/3$. The unit vector that points in this direction is given by $\mathbf{u} = \langle \cos\theta, \sin\theta\rangle$. Now that we know how to define the direction of changing $x$ and $y$, it is time to start talking about finding the rate of change of $f$ in this direction. Let us first give the formal definition.

3.5.1 Definition The rate of change of $f(x, y)$ in the direction of the unit vector $\mathbf{u} = \langle a, b\rangle$ is called the directional derivative and is denoted by $D_{\mathbf{u}} f(x, y)$. The definition of the directional derivative is
$$D_{\mathbf{u}} f(x, y) = \lim_{h\to 0} \frac{f(x + ah, y + bh) - f(x, y)}{h}.$$

So, the definition of the directional derivative is very similar to the definition of partial derivatives. However, in practice this can be a very difficult limit to compute, so we need an easier way of taking directional derivatives. It is actually fairly simple to derive an equivalent formula for taking directional derivatives. To see how we can do this, let us define a new function of a single variable,
$$g(z) = f(x_0 + az, y_0 + bz),$$
where $x_0$, $y_0$, $a$, and $b$ are some fixed numbers. Note that this really is a function of a single variable now, since $z$ is the only letter that is not representing a fixed number. Then by the definition of the derivative for functions of a single variable we have
$$g'(z) = \lim_{h\to 0}\frac{g(z + h) - g(z)}{h},$$
and the derivative at $z = 0$ is given by
$$g'(0) = \lim_{h\to 0}\frac{g(h) - g(0)}{h}.$$
If we now substitute in for $g(z)$ we get
$$g'(0) = \lim_{h\to 0}\frac{g(h) - g(0)}{h} = \lim_{h\to 0}\frac{f(x_0 + ah, y_0 + bh) - f(x_0, y_0)}{h} = D_{\mathbf{u}} f(x_0, y_0). \tag{3.1}$$

Now let us look at this from another perspective. Let us rewrite $g(z)$ as follows,
$$g(z) = f(x, y), \quad\text{where } x = x_0 + az \text{ and } y = y_0 + bz.$$

We can now use the chain rule to compute
$$g'(z) = \frac{dg}{dz} = \frac{\partial f}{\partial x}\frac{dx}{dz} + \frac{\partial f}{\partial y}\frac{dy}{dz} = f_x(x, y)\,a + f_y(x, y)\,b.$$

So, from the chain rule we get the following relationship
$$g'(z) = f_x(x, y)\,a + f_y(x, y)\,b. \tag{3.2}$$

If we now take $z = 0$ we will get that $x = x_0$ and $y = y_0$ (from how we defined $x$ and $y$ above), and plugging these into (3.2) we get
$$g'(0) = f_x(x_0, y_0)\,a + f_y(x_0, y_0)\,b. \tag{3.3}$$
Now, simply equate (3.1) and (3.3) to get that
$$D_{\mathbf{u}} f(x_0, y_0) = g'(0) = f_x(x_0, y_0)\,a + f_y(x_0, y_0)\,b.$$

If we now go back to allowing $x$ and $y$ to be any numbers, we get the following formula for computing directional derivatives.
$$D_{\mathbf{u}} f(x, y) = f_x(x, y)\,a + f_y(x, y)\,b.$$
This is much simpler than the limit definition. Also note that this definition assumed that we were working with functions of two variables. There are similar formulas that can be derived by the same type of argument for functions with more than two variables. For instance, the directional derivative of $f(x, y, z)$ in the direction of the unit vector $\mathbf{u} = \langle a, b, c\rangle$ is given by
$$D_{\mathbf{u}} f(x, y, z) = f_x(x, y, z)\,a + f_y(x, y, z)\,b + f_z(x, y, z)\,c.$$

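Let us work a couple of examples.

Example 3.5.1 (Directional derivative) Find each of the directional derivatives.

(a) $D_{\mathbf{u}} f(2, 0)$, where $f(x, y) = xe^{xy} + y$ and $\mathbf{u}$ is the unit vector in the direction of $\theta = \frac{2\pi}{3}$.

(b) $D_{\mathbf{u}} f(x, y, z)$, where $f(x, y, z) = x^2z + y^3z^2 - xyz$ and $\mathbf{u}$ is the unit vector in the direction of $\mathbf{v} = \langle -1, 0, 3\rangle$.

Solution (a) We will first find $D_{\mathbf{u}} f(x, y)$ and then use this formula for finding $D_{\mathbf{u}} f(2, 0)$. The unit vector giving the direction is
$$\mathbf{u} = \left\langle \cos\tfrac{2\pi}{3}, \sin\tfrac{2\pi}{3}\right\rangle = \left\langle -\tfrac{1}{2}, \tfrac{\sqrt{3}}{2}\right\rangle.$$
So, the directional derivative is
$$D_{\mathbf{u}} f(x, y) = -\tfrac{1}{2}\left(e^{xy} + xye^{xy}\right) + \tfrac{\sqrt{3}}{2}\left(x^2e^{xy} + 1\right),$$
$$D_{\mathbf{u}} f(2, 0) = -\tfrac{1}{2}(1) + \tfrac{\sqrt{3}}{2}(5) = \frac{5\sqrt{3} - 1}{2}.$$

(b) In this case let us first check to see whether the direction vector is a unit vector, and if it is not, convert it into one. To do this all we need to do is compute its magnitude.
$$\|\mathbf{v}\| = \sqrt{1 + 0 + 9} = \sqrt{10}.$$
So, it is not a unit vector. Recall that we can normalize it into the unit vector
$$\mathbf{u} = \frac{1}{\sqrt{10}}\langle -1, 0, 3\rangle = \left\langle -\tfrac{1}{\sqrt{10}}, 0, \tfrac{3}{\sqrt{10}}\right\rangle.$$
The directional derivative is then
$$\begin{aligned}
D_{\mathbf{u}} f(x, y, z) &= -\tfrac{1}{\sqrt{10}}\left(2xz - yz\right) + 0\left(3y^2z^2 - xz\right) + \tfrac{3}{\sqrt{10}}\left(x^2 + 2y^3z - xy\right) \\
&= \tfrac{1}{\sqrt{10}}\left(3x^2 + 6y^3z - 3xy - 2xz + yz\right).
\end{aligned}$$
□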
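The two routes to the directional derivative, the limit definition and the formula $f_x a + f_y b$, can be compared numerically. The following is a minimal sketch using the function from Example 3.5.1(a); the function names and the step size `h` are assumptions chosen for illustration.

```python
import math

def directional_derivative(fx, fy, point, direction):
    """D_u f = f_x * a + f_y * b, after scaling `direction` to a unit vector."""
    x, y = point
    norm = math.hypot(*direction)
    a, b = direction[0] / norm, direction[1] / norm
    return fx(x, y) * a + fy(x, y) * b

def difference_quotient(f, point, direction, h=1e-6):
    """The quotient from the limit definition, evaluated at one small h."""
    x, y = point
    norm = math.hypot(*direction)
    a, b = direction[0] / norm, direction[1] / norm
    return (f(x + a * h, y + b * h) - f(x, y)) / h

# f(x, y) = x e^{xy} + y and its partial derivatives, as in Example 3.5.1(a).
f = lambda x, y: x * math.exp(x * y) + y
fx = lambda x, y: math.exp(x * y) + x * y * math.exp(x * y)
fy = lambda x, y: x**2 * math.exp(x * y) + 1

theta = 2 * math.pi / 3
u = (math.cos(theta), math.sin(theta))
print(directional_derivative(fx, fy, (2, 0), u))
print(difference_quotient(f, (2, 0), u))
```

Both values should be close to $(5\sqrt{3} - 1)/2$. Normalizing inside the helper means a non-unit direction such as $\langle -1, \sqrt{3}\rangle$ gives the same result.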

There is another form of the formula that we may use to get the directional derivative that is a little nicer and somewhat more compact. It is also a much more general formula that will encompass both of the formulas above. Let us start with the second one and notice that we can rewrite it as follows.
$$\begin{aligned}
D_{\mathbf{u}} f(x, y, z) &= f_x(x, y, z)\,a + f_y(x, y, z)\,b + f_z(x, y, z)\,c \\
&= \langle f_x, f_y, f_z\rangle \cdot \langle a, b, c\rangle.
\end{aligned}$$

In other words, we can write the directional derivative as a dot product, and notice that the second vector is nothing more than the unit vector $\mathbf{u}$ that gives the direction of change. Also, if we use this version for functions of two variables the third component will not be there, but other than that the formula will be the same.

Now let us give a name and notation to the first vector in the dot product, since this vector will show up fairly regularly throughout this course. The gradient of $f$, or gradient vector of $f$, is defined to be
$$\nabla f = \langle f_x, f_y, f_z\rangle \quad\text{or}\quad \nabla f = \langle f_x, f_y\rangle.$$

Or, if we want to use the standard basis vectors, the gradient is
$$\nabla f = f_x\,\mathbf{i} + f_y\,\mathbf{j} + f_z\,\mathbf{k} \quad\text{or}\quad \nabla f = f_x\,\mathbf{i} + f_y\,\mathbf{j}.$$

The definition is only shown for functions of two or three variables; however, there is a natural extension to functions of any number of variables that we would like. With the definition of the gradient we can now say that the directional derivative is given by
$$D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u},$$
where we will no longer show the variables and will use this formula for any number of variables. Note as well that we will sometimes use the following notation
$$D_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u},$$
where $\mathbf{x} = \langle x, y, z\rangle$ or $\mathbf{x} = \langle x, y\rangle$ as needed. This notation will be used when we want to note the variables in some way, but don't really want to restrict ourselves to a particular number of variables. In other words, $\mathbf{x}$ will be used to represent as many variables as we need in the formula, and we will most often use this notation when we are already using vectors or vector notation in the problem. Let us work a couple of examples using this formula for the directional derivative.

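Example 3.5.2 (Directional derivative) Find each of the directional derivatives.

(a) $D_{\mathbf{u}} f(2, 0)$ for $f(x, y) = x\cos y$ in the direction of $\mathbf{v} = \langle 2, 1\rangle$.

(b) $D_{\mathbf{u}} f(x, y, z)$ for $f(x, y, z) = \sin(yz) + \ln x^2$ at $(1, 1, \pi)$ in the direction of $\mathbf{v} = \langle 1, 1, -1\rangle$.

Solution (a) Let us first compute the gradient for this function.
$$\nabla f = \langle \cos y, -x\sin y\rangle.$$
Also, as we saw earlier in this section, the unit vector for the direction of $\mathbf{v}$ is
$$\mathbf{u} = \left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}}\right\rangle.$$
The directional derivative is then
$$D_{\mathbf{u}} f(\mathbf{x}) = \langle \cos y, -x\sin y\rangle \cdot \left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}}\right\rangle = \tfrac{1}{\sqrt{5}}\left(2\cos y - x\sin y\right),$$
so that $D_{\mathbf{u}} f(2, 0) = \tfrac{2}{\sqrt{5}}$.

(b) In this case we are asking for the directional derivative at a particular point. To do this we will first compute the gradient, evaluate it at the point in question, and then do the dot product. So, let us get the gradient.
$$\nabla f(x, y, z) = \left\langle \tfrac{2}{x}, z\cos(yz), y\cos(yz)\right\rangle,$$
$$\nabla f(1, 1, \pi) = \left\langle 2, \pi\cos\pi, \cos\pi\right\rangle = \langle 2, -\pi, -1\rangle.$$

Next, we need the unit vector for the direction.
$$\|\mathbf{v}\| = \sqrt{3}, \qquad \mathbf{u} = \left\langle \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}}\right\rangle.$$

Finally, the directional derivative at the point is
$$D_{\mathbf{u}} f(1, 1, \pi) = \langle 2, -\pi, -1\rangle \cdot \left\langle \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, -\tfrac{1}{\sqrt{3}}\right\rangle = \tfrac{1}{\sqrt{3}}\left(2 - \pi + 1\right) = \frac{3 - \pi}{\sqrt{3}}.$$
□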
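The gradient form $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}$ works for any number of variables, which the following sketch tries to show with a finite-difference gradient applied to the function of Example 3.5.2(b). The helper names and the step size are illustrative assumptions, and the gradient here is only a numerical estimate, not the exact one computed above.

```python
import math

def gradient(f, point, h=1e-6):
    """Central-difference estimate of the gradient of f at `point`."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((f(*up) - f(*down)) / (2 * h))
    return grad

def directional_derivative(f, point, v):
    """D_u f = grad f . u, where u is v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return sum(g * c / norm for g, c in zip(gradient(f, point), v))

# f(x, y, z) = sin(yz) + ln(x^2), evaluated at (1, 1, pi) in the direction (1, 1, -1).
f = lambda x, y, z: math.sin(y * z) + math.log(x**2)
value = directional_derivative(f, (1.0, 1.0, math.pi), (1, 1, -1))
print(value, (3 - math.pi) / math.sqrt(3))
```

The two printed values should agree closely, since the second is the closed form $(3 - \pi)/\sqrt{3}$ for the point $(1, 1, \pi)$ and direction $\langle 1, 1, -1\rangle$.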

Before proceeding, let us note that the first-order partial derivatives that we were looking at in the majority of the section can be thought of as special cases of the directional derivatives. For instance, $f_x$ can be thought of as the directional derivative of $f$ in the direction of $\mathbf{u} = \langle 1, 0\rangle$ or $\mathbf{u} = \langle 1, 0, 0\rangle$, depending on the number of variables that we are working with. The same can be done for $f_y$ and $f_z$.

Gradient vectors. We will finish this section with a couple of nice facts about the gradient vector. The first tells us how to determine the maximum rate of change of a function at a point and the direction that we need to move in order to achieve that maximum rate of change.

Theorem 3.5.1 The maximum value of $D_{\mathbf{u}} f(\mathbf{x})$ (and hence the maximum rate of change of the function $f(\mathbf{x})$) is given by $\|\nabla f(\mathbf{x})\|$ and will occur in the direction given by $\nabla f(\mathbf{x})$.

This theorem provides a very useful interpretation of the gradient vector. We now discuss the details of this theorem and the reasoning behind it.

For any point $\mathbf{x}$ and any unit vector $\mathbf{u}$ we have
$$D_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u} = \|\nabla f(\mathbf{x})\|\cos\theta,$$
where $\theta$ is the angle between the vector $\mathbf{u}$ and $\nabla f(\mathbf{x})$ (recall that $\|\mathbf{u}\| = 1$). Since $\cos\theta$ only takes on values between $-1$ and $1$, $D_{\mathbf{u}} f(\mathbf{x})$ only takes on values between $-\|\nabla f(\mathbf{x})\|$ and $\|\nabla f(\mathbf{x})\|$. Moreover,

$D_{\mathbf{u}} f(\mathbf{x}) = -\|\nabla f(\mathbf{x})\|$ if and only if $\mathbf{u}$ points in the opposite direction to $\nabla f(\mathbf{x})$ ($\cos\theta = -1$),

$D_{\mathbf{u}} f(\mathbf{x}) = \|\nabla f(\mathbf{x})\|$ if and only if $\mathbf{u}$ points in the same direction as $\nabla f(\mathbf{x})$ ($\cos\theta = 1$).

The directional derivative is zero in the direction $\theta = \pi/2$; this is the direction of the (tangent line to the) level curve of $f$ through $\mathbf{x}$. We summarize these properties of the gradient as follows.

Theorem 3.5.2 (Geometric Properties of the Gradient Vector)

(a) At $\mathbf{x}$, $f(\mathbf{x})$ increases most rapidly in the direction of the gradient vector $\nabla f(\mathbf{x})$. The maximum rate of increase is $\|\nabla f(\mathbf{x})\|$.

(b) At $\mathbf{x}$, $f(\mathbf{x})$ decreases most rapidly in the direction of $-\nabla f(\mathbf{x})$. The maximum rate of decrease is $\|\nabla f(\mathbf{x})\|$.

(c) The rate of change of $f(\mathbf{x})$ at $\mathbf{x}$ is zero in directions tangent to the level curve of $f$ that passes through $\mathbf{x}$.

As we remarked before, these properties hold in three dimensions as well as two.

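Example 3.5.3 (Gradient vector) Find the directions in which the function $f(x, y) = \dfrac{x^2}{2} + \dfrac{y^2}{2}$

(a) increases most rapidly at the point $(1, 1)$;

(b) decreases most rapidly at $(1, 1)$.

(c) What are the directions of zero change in $f$ at $(1, 1)$?

Solution (a) The function increases most rapidly in the direction of $\nabla f(\mathbf{x})$ at $(1, 1)$. The gradient there is
$$\nabla f(1, 1) = \left.\left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right\rangle\right|_{(1,1)} = \langle x, y\rangle\big|_{(1,1)} = \langle 1, 1\rangle.$$
Its direction is
$$\mathbf{u} = \tfrac{1}{\sqrt{2}}\langle 1, 1\rangle = \left\langle \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right\rangle.$$

(b) The function decreases most rapidly in the direction of $-\nabla f(\mathbf{x})$ at $(1, 1)$, which is
$$-\mathbf{u} = \left\langle -\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right\rangle.$$

(c) The directions of zero change at $(1, 1)$ are the directions orthogonal to $\nabla f$:
$$\mathbf{n} = \left\langle -\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right\rangle \quad\text{and}\quad -\mathbf{n} = \left\langle \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right\rangle.$$
□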

Example 3.5.4 (Gradient vector) Suppose that the height of a hill above sea level is given by $z = 1000 - 0.01x^2 - 0.02y^2$. If you are at the point $(60, 100)$, in what direction is the elevation changing fastest? What is the maximum rate of change of the elevation at this point? Is the maximum rate of change of the elevation towards the center of the hill or away from it?

Solution First, you will hopefully recognize that the graph of the function is an elliptic paraboloid that opens downward. So even though most hills are not this symmetrical, it will at least be vaguely hill shaped and so the question makes at least some sense. There are a couple of questions to answer here, but using Theorem 3.5.2 makes answering them very simple. We will first need the gradient vector.
$$\nabla f(\mathbf{x}) = \nabla f(x, y) = \langle -0.02x, -0.04y\rangle.$$
The maximum rate of change of the elevation will then occur in the direction of
$$\nabla f(60, 100) = \langle -1.2, -4\rangle.$$
The maximum rate of change of elevation at this point is
$$\|\nabla f(60, 100)\| = \sqrt{(-1.2)^2 + (-4)^2} = \sqrt{17.44} \approx 4.176.$$
To answer the final part it might be convenient to have a quick sketch of the gradient at this point.

[Figure: sketch of the region $0 \le x \le 100$, $0 \le y \le 100$ of the $xy$-plane showing the gradient vector at the point $(60, 100)$.]

We have only shown a portion of the axis system here to make the picture easier to see. The center of the hill is at the origin and that is also the highest point on the hill. If we are standing at the point $(60, 100)$, then the direction with the maximum rate of change of the elevation is given by $\nabla f(60, 100) = \langle -1.2, -4\rangle$. This means that in this direction both $x$ and $y$ are decreasing (since both components are negative) and $y$ is decreasing faster than $x$. This is shown by the vector in the above sketch. This also shows that the direction with the maximum rate of change of the elevation is generally up the hill (and hence towards the center) rather than down the hill (away from the center). □
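Theorem 3.5.2 can be checked by brute force for the hill of Example 3.5.4: scan many unit directions, record the directional derivative in each, and compare the largest one with $\|\nabla f\|$. This is only a sketch under assumed names (`grad_height`, `rate`) and an assumed scan resolution of 3600 directions.

```python
import math

def grad_height(x, y):
    """Gradient of z = 1000 - 0.01 x^2 - 0.02 y^2."""
    return (-0.02 * x, -0.04 * y)

def rate(x, y, angle):
    """Directional derivative in the unit direction (cos angle, sin angle)."""
    gx, gy = grad_height(x, y)
    return gx * math.cos(angle) + gy * math.sin(angle)

gx, gy = grad_height(60, 100)
max_rate = math.hypot(gx, gy)                 # ||grad f(60, 100)|| = sqrt(17.44)
gradient_angle = math.atan2(gy, gx) % (2 * math.pi)

# Brute-force scan over 3600 equally spaced directions (0.1 degree apart).
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best = max(angles, key=lambda a: rate(60, 100, a))

print(max_rate, rate(60, 100, best))
print(gradient_angle, best)
```

The best scanned direction should sit within one grid step of the gradient direction, and its rate should fall just short of $\sqrt{17.44}$.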

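The second fact about the gradient vector that we need to give before the end of this section will be very convenient in some later sections. Let us consider the case of two variables for illustration. If a differentiable function $f(x, y)$ has a constant value $c$ along a smooth curve $\mathbf{r} = \langle g(t), h(t)\rangle$ (making the curve a level curve of $f$), then $f(g(t), h(t)) = c$. Differentiating both sides of this equation with respect to $t$ leads to the equations
$$\begin{aligned}
\frac{d}{dt} f(g(t), h(t)) &= \frac{d}{dt}(c), \\
\frac{\partial f}{\partial x}\frac{dg}{dt} + \frac{\partial f}{\partial y}\frac{dh}{dt} &= 0, \\
\left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right\rangle \cdot \left\langle \frac{dg}{dt}, \frac{dh}{dt}\right\rangle &= 0,
\end{aligned}$$
in which we might denote
$$\nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right\rangle \quad\text{and}\quad \frac{d\mathbf{r}}{dt} = \left\langle \frac{dg}{dt}, \frac{dh}{dt}\right\rangle.$$

The last equation says that $\nabla f$ is normal to the tangent vector $d\mathbf{r}/dt$, so it is normal to the curve. We summarize the second fact about the gradient vector as follows.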

Theorem 3.5.3 (Gradient Vector Normal to Level Curve) The gradient vector $\nabla f(x_0, y_0)$ is orthogonal (or perpendicular) to the level curve $f(x, y) = c$ at the point $(x_0, y_0)$. Likewise, the gradient vector $\nabla f(x_0, y_0, z_0)$ is orthogonal to the level surface $f(x, y, z) = c$ at the point $(x_0, y_0, z_0)$.
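A small numerical illustration of Theorem 3.5.3, under assumptions that are not from the notes: take the hypothetical function $f(x, y) = x^2 + 4y^2$ and its level curve $f = 4$, parametrized as $\mathbf{r}(t) = \langle 2\cos t, \sin t\rangle$. The dot product of the gradient with the tangent vector should vanish at every parameter value.

```python
import math

def grad_f(x, y):
    """Gradient of the hypothetical example f(x, y) = x^2 + 4 y^2."""
    return (2 * x, 8 * y)

def curve(t):
    """Parametrization r(t) of the level curve f(x, y) = 4."""
    return (2 * math.cos(t), math.sin(t))

def tangent(t):
    """Tangent vector dr/dt for the parametrization above."""
    return (-2 * math.sin(t), math.cos(t))

def dot_at(t):
    """grad f(r(t)) . r'(t), which the theorem says is zero."""
    gx, gy = grad_f(*curve(t))
    tx, ty = tangent(t)
    return gx * tx + gy * ty

for k in range(4):
    print(k * 0.5, dot_at(k * 0.5))
```

Each printed dot product should be zero up to floating-point rounding.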

As we will see in later sections, we will often need vectors that are orthogonal to a surface or curve. Using this fact, all we need to do is compute a gradient vector and we will get the orthogonal vector that we need. We will see the first application of this in the next section.

Exercise 3.5.1

Find the directions in which the functions increase and decrease most rapidly at $P_0$. Then find the derivatives of the functions in these directions.

1. $f(x, y) = x^2 + xy + y^2$, $P_0(1, 1)$.

2. $f(x, y) = x^2y + e^{xy}\sin y$, $P_0(1, 0)$.

3. $f(x, y, z) = (x/y) - yz$, $P_0(4, 1, 1)$.

4. $f(x, y, z) = xe^y + z^2$, $P_0(1, \ln 2, 1/2)$.

5. $f(x, y, z) = \ln(xy) + \ln(yz) + \ln(zx)$, $P_0(1, 1, 1)$.

6. $f(x, y, z) = \ln(x^2 + y^2 - 1) + y + 6z$, $P_0(1, 1, 0)$.

7. Is there a direction $\mathbf{u}$ in which the rate of change of the temperature function $T(x, y, z) = 2xy - yz$ (temperature in degrees Celsius, distance in feet) at $P(1, 1, 1)$ is $3\,^\circ\mathrm{C}/\mathrm{ft}$? Give reasons.

8. If the temperature is given by $f(x, y, z) = 3x^2 - 5y^2 + 2z^2$ and you are located at $\left(\tfrac{1}{3}, \tfrac{1}{5}, \tfrac{1}{2}\right)$ and want to get cool as soon as possible, in which direction should you move?


3.6 Applications of Partial Derivatives

In this section we will take a look at a couple of applications of partial derivatives. Most of the applications will be extensions of applications of ordinary derivatives that we saw back in single variable calculus. For instance, we will be looking at finding absolute and relative extrema of a function, and we will also be looking at optimization. Both of these subjects were major applications back in single variable calculus. They will, however, be a little more work here because we now have more than one variable.

Tangent planes and linear approximations


Earlier we saw (page 67) how the two partial derivatives $f_x$ and $f_y$ can be thought of as the slopes of traces. We want to extend this idea out a little in this section. The graph of a function $z = f(x, y)$ is a surface in $\mathbb{R}^3$ (three-dimensional space), and so we can now start thinking of the plane that is tangent to the surface at a point.

Let us start out with a point $(x_0, y_0)$ and let $C_1$ represent the trace of $f(x, y)$ for the plane $y = y_0$ (i.e., allowing $x$ to vary with $y$ held fixed), and let $C_2$ represent the trace of $f(x, y)$ for the plane $x = x_0$ (i.e., allowing $y$ to vary with $x$ held fixed). Now, we know that $f_x(x_0, y_0)$ is the slope of the tangent line to the trace $C_1$ and $f_y(x_0, y_0)$ is the slope of the tangent line to the trace $C_2$. So, let $L_1$ be the tangent line to the trace $C_1$ and let $L_2$ be the tangent line to the trace $C_2$. The tangent plane will then be the plane that contains the two lines $L_1$ and $L_2$.

Geometrically this plane will serve the same purpose that a tangent line did in single variable calculus. A tangent line to a curve was a line that just touched the curve at that point and was parallel to the curve at the point in question. Tangent planes to a surface are planes that just touch the surface at the point and are parallel to the surface at the point. Note that this gives us a point that is on the plane. Since the tangent plane and the surface touch at $(x_0, y_0)$, the following point will be on both the surface and the plane.
$$(x_0, y_0, z_0) = (x_0, y_0, f(x_0, y_0)).$$
What we need to do now is determine the equation of the tangent plane. We know that the general equation of a plane is given by
$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0,$$
where $(x_0, y_0, z_0)$ is a point that is on the plane, which we know already. Let us rewrite this a little. We will move the $x$ terms and $y$ terms to the other side and divide both sides by $c$. Doing this gives
$$z - z_0 = -\frac{a}{c}(x - x_0) - \frac{b}{c}(y - y_0).$$
Now, let us rename the constants to simplify the notation a little. Let us rename them as follows.
$$A = -\frac{a}{c}, \qquad B = -\frac{b}{c}.$$

With this renaming the equation of the tangent plane becomes
$$z - z_0 = A(x - x_0) + B(y - y_0),$$
and we need to determine values for $A$ and $B$. Let us first think about what happens if we hold $y$ fixed, i.e., if we assume that $y = y_0$. In this case the equation of the tangent plane becomes
$$z - z_0 = A(x - x_0).$$

This is the equation of a line, and this line must be tangent to the surface at $(x_0, y_0)$ (since it is part of the tangent plane). In addition, this line assumes that $y = y_0$ (i.e., fixed) and $A$ is the slope of this line. But if we think about it, this is exactly what the tangent to $C_1$ is: a line tangent to the surface at $(x_0, y_0)$ assuming that $y = y_0$. In other words,
$$z - z_0 = A(x - x_0)$$
is the equation for $L_1$, and we know that the slope of $L_1$ is given by $f_x(x_0, y_0)$. Therefore we have the following
$$A = f_x(x_0, y_0).$$
If we hold $x$ fixed at $x = x_0$, the equation of the tangent plane becomes
$$z - z_0 = B(y - y_0).$$
However, by a similar argument to the one above we can see that this is nothing more than the equation for $L_2$ and that its slope is $B$ or $f_y(x_0, y_0)$. So,
$$B = f_y(x_0, y_0).$$
The equation of the tangent plane to the surface given by $z = f(x, y)$ at $(x_0, y_0)$ is then
$$z - z_0 = f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$
Also, if we use the fact that $z_0 = f(x_0, y_0)$ we can rewrite the equation of the tangent plane as
$$\begin{aligned}
z - f(x_0, y_0) &= f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0), \\
z &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).
\end{aligned}$$

We will see another derivation of this formula (actually a more general formula) later on (page 92). So if you didn't quite follow this argument, hold off until then to see a better derivation.

Example 3.6.1 (Tangent plane) Find the equation of the tangent plane to $z = \ln(2x + y)$ at the point $(-1, 3)$.

Solution There really is not too much to do here other than taking a couple of derivatives and doing some quick evaluations.
$$\begin{aligned}
f(x, y) &= \ln(2x + y), & z_0 = f(-1, 3) &= \ln 1 = 0, \\
f_x(x, y) &= \frac{2}{2x + y}, & f_x(-1, 3) &= 2, \\
f_y(x, y) &= \frac{1}{2x + y}, & f_y(-1, 3) &= 1.
\end{aligned}$$
The equation of the plane is then
$$\begin{aligned}
z - 0 &= 2(x + 1) + (1)(y - 3), \\
z &= 2x + y - 1.
\end{aligned}$$
□
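The tangent plane formula translates directly into a small helper. The sketch below builds the plane for $z = \ln(2x + y)$ at the point $(-1, 3)$ (the point for which $\ln(2x + y) = \ln 1$) and compares it with $z = 2x + y - 1$. The name `tangent_plane` and the choice to return the plane as a function are assumptions made for illustration.

```python
import math

def tangent_plane(f, fx, fy, x0, y0):
    """Return the function (x, y) -> z describing the tangent plane at (x0, y0)."""
    z0, a, b = f(x0, y0), fx(x0, y0), fy(x0, y0)
    return lambda x, y: z0 + a * (x - x0) + b * (y - y0)

# f(x, y) = ln(2x + y) with its exact partial derivatives.
f = lambda x, y: math.log(2 * x + y)
fx = lambda x, y: 2 / (2 * x + y)
fy = lambda x, y: 1 / (2 * x + y)

plane = tangent_plane(f, fx, fy, -1, 3)
for (x, y) in [(0, 0), (2, 5), (-1, 3)]:
    print((x, y), plane(x, y), 2 * x + y - 1)
```

At each sample point the last two printed values should coincide, and at $(-1, 3)$ both equal $f(-1, 3) = 0$, since the plane touches the surface there.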

One nice use of tangent planes is that they give us a way to approximate a surface near a point. As long as we are near the point $(x_0, y_0)$, the tangent plane should closely approximate the function. The tangent plane to the graph of $z = f(x, y)$ at $(x_0, y_0)$ is $z = L(x, y)$, where
$$L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$
is the linear approximation of $f$ at $(x_0, y_0)$. We can use $L(x, y)$ to approximate values of $f(x, y)$ near $(x_0, y_0)$:
$$f(x, y) \approx L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$

Example 3.6.2 (Linear approximation) Find an approximate value for $f(x, y) = \sqrt{2x^2 + e^{2y}}$ at $(2.2, -0.2)$.

Solution It is convenient to use the linear approximation at $(x_0, y_0) = (2, 0)$, where the values of $f$ and its partial derivatives are easily evaluated:
$$\begin{aligned}
f(x, y) &= \sqrt{2x^2 + e^{2y}}, & f(2, 0) &= 3, \\
f_x(x, y) &= \frac{2x}{\sqrt{2x^2 + e^{2y}}}, & f_x(2, 0) &= \frac{4}{3}, \\
f_y(x, y) &= \frac{e^{2y}}{\sqrt{2x^2 + e^{2y}}}, & f_y(2, 0) &= \frac{1}{3}.
\end{aligned}$$
Thus, $L(x, y) = 3 + \frac{4}{3}(x - 2) + \frac{1}{3}(y - 0)$, and
$$f(2.2, -0.2) \approx L(2.2, -0.2) = 3 + \frac{4}{3}(2.2 - 2) + \frac{1}{3}(-0.2 - 0) = 3.2.$$

(For the sake of comparison, $f(2.2, -0.2) \approx 3.2172$ to 4 decimal places.) □
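The comparison in Example 3.6.2 is easy to reproduce. The sketch below takes $f(x, y) = \sqrt{2x^2 + e^{2y}}$ and the evaluation point $(2.2, -0.2)$, which is the point consistent with the stated values 3.2 and 3.2172, and computes both the linear approximation about $(2, 0)$ and the exact value. The function names are illustrative assumptions.

```python
import math

def f(x, y):
    """f(x, y) = sqrt(2 x^2 + e^{2y})."""
    return math.sqrt(2 * x**2 + math.exp(2 * y))

def linear_approx(x, y, x0=2.0, y0=0.0):
    """L(x, y) built from f and its exact partial derivatives at (x0, y0)."""
    root = f(x0, y0)
    fx0 = 2 * x0 / root              # f_x(x0, y0)
    fy0 = math.exp(2 * y0) / root    # f_y(x0, y0)
    return root + fx0 * (x - x0) + fy0 * (y - y0)

approx = linear_approx(2.2, -0.2)
exact = f(2.2, -0.2)
print(approx, exact, abs(exact - approx))
```

The gap between the two values is the error of the linear approximation; it shrinks roughly quadratically as the evaluation point moves closer to $(2, 0)$.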

Exercise 3.6.1

Find the equation of the tangent plane to the graph of the given function at the specified point.

1. $f(x, y) = x^2 - y^2$ at $(2, 1)$.

2. $f(x, y) = \dfrac{xy}{x + y}$ at $(1, 1)$.

3. $f(x, y) = ye^{x^2}$ at $(0, 1)$.

4. $f(x, y) = \sqrt{1 + x^3y^2}$ at $(2, 1)$.

Use suitable linear approximations to find approximate values for the given functions at the specified points.

5. $f(x, y) = \sin(xy + \ln y)$ at $(0.01, 1.05)$.

6. $f(x, y) = \dfrac{24}{x^2 + xy + y^2}$ at $(2.1, 1.8)$.

7. $f(x, y, z) = \sqrt{x + 2y + 3z}$ at $(1.9, 1.8, 1.1)$.

8. $f(x, y) = xe^{y + x^2}$ at $(2.05, 3.92)$.


Gradient vector, tangent planes and normal lines


In this subsection we want to revisit the derivation of tangent planes, only this time we will look at them in light of the gradient vector. In the process we will also take a look at a normal line to a surface. Let us first recall that the equation of a plane that contains the point $(x_0, y_0, z_0)$ with normal vector $\mathbf{n} = \langle a, b, c\rangle$ is given by
$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0.$$
When we introduced the gradient vector in Section 3.5 (page 84) on directional derivatives we gave the following fact in Theorem 3.5.3 (page 88).

Fact 1 The gradient vector $\nabla f(x_0, y_0, z_0)$ is orthogonal to the level surface $f(x, y, z) = c$ at the point $(x_0, y_0, z_0)$.

This says that the gradient vector is always orthogonal, or normal, to the surface at the point. Also recall that the gradient vector is
$$\nabla f = \langle f_x, f_y, f_z\rangle.$$
So, the tangent plane to the surface given by $f(x, y, z) = c$ at $(x_0, y_0, z_0)$ has the equation
$$f_x(x_0, y_0, z_0)(x - x_0) + f_y(x_0, y_0, z_0)(y - y_0) + f_z(x_0, y_0, z_0)(z - z_0) = 0.$$
This is a much more general form of the equation of a tangent plane than the one that was derived previously (page 90). Note, however, that we can also get the equation from the previous subsection (page 90) using this more general formula. To see this, let us start with the equation $z = f(x, y)$; we want to find the tangent plane to the surface given by $z = f(x, y)$ at the point $(x_0, y_0, z_0)$ where $z_0 = f(x_0, y_0)$. In order to use the formula above we need to have all the variables on one side. This is easy enough to do. All we need to do is subtract $z$ from both sides to get
$$f(x, y) - z = 0.$$
Now, if we define a new function
$$F(x, y, z) = f(x, y) - z,$$
we can see that the surface given by $z = f(x, y)$ is identical to the surface given by $F(x, y, z) = 0$, and this new equivalent equation is in the correct form for the equation of the tangent plane that we derived in this subsection. So, the first thing that we need to do is find the gradient vector for $F$,
$$\nabla F = \langle F_x, F_y, F_z\rangle = \langle f_x, f_y, -1\rangle.$$
Notice that
$$\begin{aligned}
F_x &= \frac{\partial}{\partial x}\left(f(x, y) - z\right) = f_x, \\
F_y &= \frac{\partial}{\partial y}\left(f(x, y) - z\right) = f_y, \\
F_z &= \frac{\partial}{\partial z}\left(f(x, y) - z\right) = -1.
\end{aligned}$$

The equation of the tangent plane is then
$$f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) - (z - z_0) = 0.$$
Solving for $z$ gives
$$z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0),$$
which is identical to the equation that we derived in the previous subsection (page 90).

We can get another nice piece of information out of the gradient vector as well. We might on occasion want a line that is orthogonal to a surface at a point, sometimes called the normal line. This is easy enough to get if we recall that the equation of a line only requires that we have a point and a parallel vector. Since we want a line that passes through the point $(x_0, y_0, z_0)$, we know that this point must be on the line, and we know that $\nabla f(x_0, y_0, z_0)$ is a vector that is normal to the surface and hence will be parallel to the line. Therefore the equation of the normal line is
$$\mathbf{r}(t) = \langle x_0, y_0, z_0\rangle + t\,\nabla f(x_0, y_0, z_0).$$

Example 3.6.3 (Tangent plane, normal line) Find the tangent plane and normal line to $x^2 + y^2 + z^2 = 30$ at the point $(1, -2, 5)$.

Solution For this case the function that we are going to be working with is
\[ F(x, y, z) = x^2 + y^2 + z^2, \]
and note that we do not have to have a zero on one side of the equal sign. All that we need is a constant. To finish this problem we simply need the gradient evaluated at the point:
\[ \nabla F(x, y, z) = \langle 2x, 2y, 2z \rangle, \qquad \nabla F(1, -2, 5) = \langle 2, -4, 10 \rangle. \]
The tangent plane is then
\[ 2(x - 1) - 4(y + 2) + 10(z - 5) = 0. \]
The normal line is
\[ \mathbf{r}(t) = \langle 1, -2, 5 \rangle + t \langle 2, -4, 10 \rangle = \langle 1 + 2t, -2 - 4t, 5 + 10t \rangle. \qquad \Box \]
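The computation in Example 3.6.3 can be checked numerically. The following is a minimal sketch, not part of the original notes; the helper names `grad_F`, `tangent_plane_residual` and `normal_line` are chosen here purely for illustration.

```python
# Surface F(x, y, z) = x^2 + y^2 + z^2 = 30 and the point (1, -2, 5)
# from Example 3.6.3. All helper names are illustrative assumptions.

def grad_F(x, y, z):
    """Gradient of F(x, y, z) = x^2 + y^2 + z^2."""
    return (2 * x, 2 * y, 2 * z)


def tangent_plane_residual(point, normal, q):
    """Left-hand side n . (q - p) of the tangent-plane equation at q."""
    return sum(n * (qi - pi) for n, pi, qi in zip(normal, point, q))


def normal_line(point, direction, t):
    """Point r(t) = p + t * d on the normal line."""
    return tuple(p + t * d for p, d in zip(point, direction))


p0 = (1, -2, 5)
n = grad_F(*p0)          # normal vector to the surface at p0

print("normal vector:", n)
print("point on normal line at t = 1:", normal_line(p0, n, 1))
# (3, -1, 5) satisfies 2(x - 1) - 4(y + 2) + 10(z - 5) = 0
print("residual at (3, -1, 5):", tangent_plane_residual(p0, n, (3, -1, 5)))
```

A residual of zero means the test point lies on the tangent plane; any other point can be substituted to see how far it is from satisfying the plane equation.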

Exercise 3.6.2
1. Find the equations of the tangent plane and normal line to the surface $z = y + \ln(x/z)$ at the point $(1, 1, 1)$.

2. Find the coordinates of all points on the surface with equation $z = x^4 - 4xy^3 + 6y^2 - 2$ where the surface has a horizontal tangent plane.

3. Find all horizontal planes that are tangent to the surface with equation $z = xy\,e^{-(x^2 + y^2)/2}$. At what points are they tangent?


Relative minima and maxima


In this subsection we are going to extend one of the more important ideas from single variable calculus to functions of two variables. We are going to start looking at how to find minima and maxima of functions. Recall that we will often use the word extrema to refer to both minima and maxima. This in fact will be the topic of the following two subsections as well (i.e., absolute maxima and Lagrange multipliers). The definition of relative extrema for functions of two variables is identical to that for functions of one variable; we just need to remember that we are now working with functions of two variables. So, for the sake of completeness, here is the definition of relative minima and relative maxima for functions of two variables.

3.6.1 Definition (Relative Extrema)
1. A function $f(x, y)$ has a relative minimum at the point $(a, b)$ if $f(x, y) \ge f(a, b)$ for all points $(x, y)$ in some region around $(a, b)$.
2. A function $f(x, y)$ has a relative maximum at the point $(a, b)$ if $f(x, y) \le f(a, b)$ for all points $(x, y)$ in some region around $(a, b)$.

Note that this definition does not say that a relative minimum is the smallest value that the function will ever take. It only says that in some region around the point $(a, b)$ the function will never be smaller than $f(a, b)$. Outside of that region it is completely possible for the function to be smaller. Likewise, a relative maximum only says that around $(a, b)$ the function will never be larger than $f(a, b)$. Again, outside of that region it is completely possible that the function will be larger.

Next we need to extend the idea of critical values/points to functions of two variables. Recall that a critical value of the function $f(x)$ is a number $x = c$ such that either $f'(c) = 0$ or $f'(c)$ does not exist. If $x = c$ is a critical value of $f$, then $(c, f(c))$ is said to be a critical point. We have a similar definition for critical points of functions of two variables.

3.6.2 Definition The point $(a, b)$ is a critical point (or a stationary point) of $f(x, y)$ provided one of the following is true.
1. $\nabla f(a, b) = \mathbf{0}$ (this is equivalent to saying that $f_x(a, b) = 0$ and $f_y(a, b) = 0$).
2. $f_x(a, b)$ and/or $f_y(a, b)$ does not exist.

To see the equivalence in the first part, let us start off with $\nabla f = \mathbf{0}$ and put in the definition of each part:
\[ \nabla f(a, b) = \mathbf{0} \quad\Longleftrightarrow\quad \langle f_x(a, b), f_y(a, b) \rangle = \langle 0, 0 \rangle. \]

The only way that these two vectors can be equal is to have $f_x(a, b) = 0$ and $f_y(a, b) = 0$. In fact, we will use this definition of the critical point more than the gradient definition, since it will be easier to find the critical points if we start with the first-order partial derivatives. Note as well that both of the first-order partial derivatives must be zero at $(a, b)$. If only one of the first-order partial derivatives is zero at the point, then the point will NOT be a critical point. We now have the following fact that, at least partially, relates critical points to relative extrema.


Fact 2 If the point (a, b) is a relative extremum of the function f (x, y ) then (a, b) is also a critical point of f (x, y ).

Note that this does NOT say that all critical points are relative extrema. It only says that relative extrema will be critical points of the function. To see this, let us consider the function
\[ f(x, y) = xy. \]
The two first-order partial derivatives are
\[ f_x(x, y) = y \quad\text{and}\quad f_y(x, y) = x. \]
The only point that will make both of these derivatives zero at the same time is $(0, 0)$, and so $(0, 0)$ is a critical point for the function. Here is the graph of the function.

[Figure: graph of $f(x, y) = xy$ near the origin.]

Note that the axes are not in the standard orientation here so that we can see more clearly what is happening at the origin, i.e., at $(0, 0)$. If we start at the origin and move into either of the quadrants where $x$ and $y$ have the same sign, the function increases. However, if we start at the origin and move into either of the quadrants where $x$ and $y$ have opposite signs, then the function decreases. In other words, no matter what region you take about the origin there will be points where $f$ is larger than $f(0, 0) = 0$ and points where $f$ is smaller than $f(0, 0) = 0$. Therefore, there is no way that $(0, 0)$ can be a relative extremum. Critical points that exhibit this kind of behavior are called saddle points.

While we have to be careful not to misinterpret the results of this fact, it is very useful in helping us to identify relative extrema. Because of this fact we know that if we have all the critical points of a function, then we also have every possible relative extremum of the function. The fact tells us that all relative extrema must be critical points, so if the function does have relative extrema then they must be in the collection of all the critical points. Remember, however, that it is completely possible that at least one of the critical points will not be a relative extremum. So, once we have all the critical points in hand, all we need to do is test these points to see whether they are relative extrema or not. To determine if a critical point is a relative extremum (and in fact to determine if it is a minimum or a maximum) we can use the following fact.


Fact 3 (Second Derivative Test) Suppose that $(a, b)$ is a critical point of $f(x, y)$ and that the second-order partial derivatives are continuous in some region that contains $(a, b)$. Next define
\[ D = D(a, b) = f_{xx}(a, b)\, f_{yy}(a, b) - [f_{xy}(a, b)]^2. \]
We then have the following classifications of the critical point.
1. If $D > 0$ and $f_{xx}(a, b) > 0$, then $(a, b)$ is a relative minimum.
2. If $D > 0$ and $f_{xx}(a, b) < 0$, then $(a, b)$ is a relative maximum.
3. If $D < 0$, then $(a, b)$ is a saddle point.
4. If $D = 0$, then $(a, b)$ may be a relative minimum, relative maximum or a saddle point. Other techniques would need to be used to classify the critical point.
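The four cases of the second derivative test translate directly into a small decision function. This is a minimal sketch rather than anything from the original notes; the function name `classify_critical_point` and the returned strings are assumptions made for illustration, and the caller is expected to supply second-order partials already evaluated at a critical point.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the second derivative test at a critical point (a, b).

    fxx, fyy, fxy are the second-order partial derivatives evaluated
    at (a, b). The function name and return strings are illustrative.
    """
    D = fxx * fyy - fxy ** 2
    if D > 0:
        # D > 0 forces fxx != 0, so only the sign of fxx matters here.
        return "relative minimum" if fxx > 0 else "relative maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test gives no information


# f(x, y) = xy at (0, 0): fxx = 0, fyy = 0, fxy = 1, so D = -1.
print(classify_critical_point(0, 0, 1))
```

For $f(x, y) = xy$ the sketch reports a saddle point at the origin, which agrees with the discussion of that function above.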

Note that we are not going to be seeing any cases in this class where $D = 0$. We will be able to classify all the critical points that we find. Let us see a couple of examples.

Example 3.6.4 (Critical points) Find and classify all the critical points of
\[ f(x, y) = 4 + x^3 + y^3 - 3xy. \]

Solution We first need all the first-order (to find the critical points) and second-order (to classify the critical points) partial derivatives, so let us get those.
\[ f_x = 3x^2 - 3y, \quad f_y = 3y^2 - 3x, \quad f_{xx} = 6x, \quad f_{yy} = 6y, \quad f_{xy} = -3. \]

Let us first find the critical points. Critical points will be solutions to the system of equations
\[ \begin{cases} f_x = 3x^2 - 3y = 0, \\ f_y = 3y^2 - 3x = 0. \end{cases} \]

This is a nonlinear system of equations and these can (quite often) be difficult to solve. However, in this case it is not too bad. We can solve the first equation for $y$ as follows:
\[ 3x^2 - 3y = 0 \quad\Longrightarrow\quad y = x^2. \]
Plugging this into the second equation gives
\[ 3(x^2)^2 - 3x = 3x(x^3 - 1) = 0. \]
From this we can see that we must have $x = 0$ or $x = 1$. Now use the fact that $y = x^2$ to get the critical points.
\[ x = 0:\ y = 0^2 = 0 \ \Longrightarrow\ (0, 0), \qquad x = 1:\ y = 1^2 = 1 \ \Longrightarrow\ (1, 1). \]

So we get two critical points. All we need to do now is classify them. To do this we will need to know the sign of $D$. Here is the general formula for $D$.
\[ D(x, y) = f_{xx}(x, y)\, f_{yy}(x, y) - [f_{xy}(x, y)]^2 = (6x)(6y) - (-3)^2 = 36xy - 9. \]

To classify the critical points all that we need to do is plug in the critical points and use the fact above.

For the critical point $(0, 0)$:
\[ D = D(0, 0) = -9 < 0. \]
For $(0, 0)$, $D$ is negative and so this must be a saddle point.

For the critical point $(1, 1)$:
\[ D = D(1, 1) = 36 - 9 = 27 > 0, \qquad f_{xx}(1, 1) = 6 > 0. \]
For $(1, 1)$, $D$ is positive and $f_{xx}$ is positive and so we must have a relative minimum. $\Box$

Example 3.6.5 (Critical points) Find and classify all the critical points of
\[ f(x, y) = 3x^2 y + y^3 - 3x^2 - 3y^2 + 2. \]

Solution We first need all the first-order (to find the critical points) and second-order (to classify the critical points) partial derivatives, so let us get those.
\[ f_x = 6xy - 6x, \quad f_y = 3x^2 + 3y^2 - 6y, \quad f_{xx} = 6y - 6, \quad f_{yy} = 6y - 6, \quad f_{xy} = 6x. \]

We will first need the critical points. The equations that we will need to solve this time are
\[ \begin{cases} 6xy - 6x = 0, \\ 3x^2 + 3y^2 - 6y = 0. \end{cases} \]

These equations are a little trickier to solve than the first set, but once you see what to do they really are not too complicated. First, notice that we can factor out $6x$ from the first equation to get
\[ 6x(y - 1) = 0. \]
So, we can see that the first equation will be zero if $x = 0$ or $y = 1$. Be careful not to just cancel $x$ from both sides. If we did this we would miss the case $x = 0$. To find the critical points we can plug these (individually) into the second equation and solve for the remaining variable.
\[ x = 0:\quad 3y^2 - 6y = 3y(y - 2) = 0 \ \Longrightarrow\ y = 0,\ y = 2, \]
\[ y = 1:\quad 3x^2 - 3 = 3(x^2 - 1) = 0 \ \Longrightarrow\ x = -1,\ x = 1. \]

So, if $x = 0$ we have the critical points $(0, 0)$ and $(0, 2)$, and if $y = 1$ the critical points are $(1, 1)$ and $(-1, 1)$.

Now all we need to do is classify the critical points. To do this we will need the general formula for $D$.
\[ D(x, y) = (6y - 6)(6y - 6) - (6x)^2 = (6y - 6)^2 - 36x^2. \]
To classify the critical points all that we need to do is plug in the critical points and use the fact above.

For the critical point $(0, 0)$: $D = D(0, 0) = 36 > 0$ and $f_{xx}(0, 0) = -6 < 0$. So $(0, 0)$ is a relative maximum.

For the critical point $(0, 2)$: $D = D(0, 2) = 36 > 0$ and $f_{xx}(0, 2) = 6 > 0$. So $(0, 2)$ is a relative minimum.

For the critical point $(1, 1)$: $D = D(1, 1) = -36 < 0$. So $(1, 1)$ is a saddle point.

For the critical point $(-1, 1)$: $D = D(-1, 1) = -36 < 0$. So $(-1, 1)$ is a saddle point. $\Box$
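The classification in Example 3.6.5 can be double-checked with a few lines of code. This is a minimal sketch under the assumption that the partial derivatives are typed in by hand from the example; the function names (`f_x`, `f_y`, `f_xx`, `discriminant`, `classify`) are illustrative and not part of the notes.

```python
# Partial derivatives of f(x, y) = 3x^2 y + y^3 - 3x^2 - 3y^2 + 2,
# entered by hand from Example 3.6.5 (names are illustrative).

def f_x(x, y):
    return 6 * x * y - 6 * x


def f_y(x, y):
    return 3 * x**2 + 3 * y**2 - 6 * y


def f_xx(x, y):
    return 6 * y - 6


def discriminant(x, y):
    # D = f_xx * f_yy - f_xy^2 with f_yy = 6y - 6 and f_xy = 6x
    return (6 * y - 6) * (6 * y - 6) - (6 * x) ** 2


def classify(x, y):
    d = discriminant(x, y)
    if d > 0:
        return "relative minimum" if f_xx(x, y) > 0 else "relative maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"


critical_points = [(0, 0), (0, 2), (1, 1), (-1, 1)]
summary = {}
for x, y in critical_points:
    # Each candidate must make both first-order partials vanish.
    assert f_x(x, y) == 0 and f_y(x, y) == 0
    summary[(x, y)] = classify(x, y)

for point, kind in summary.items():
    print(point, kind)
```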

Let us do one more example that is a little different from the first two.

Example 3.6.6 (Critical points) Determine the point on the plane $4x - 2y + z = 1$ that is closest to the point $(-2, -1, 5)$.

Solution Note that we are NOT asking for the critical points of the plane. In order to do this example we first need to work out the function that we are going to optimize. Let us suppose that $(x, y, z)$ is any point on the plane. The distance between this point and the point in question, $(-2, -1, 5)$, is given by the formula
\[ d = \sqrt{(x + 2)^2 + (y + 1)^2 + (z - 5)^2}. \]
What we are then asking is to find the minimum value of this distance function. The point $(x, y, z)$ that gives the minimum value will be the point on the plane that is closest to $(-2, -1, 5)$.

There are a couple of issues with this function. First, it is a function of $x$, $y$ and $z$, and we can only deal with functions of $x$ and $y$ at this point. This is easy to fix, however. We can solve the equation of the plane to see that
\[ z = 1 - 4x + 2y. \]
Plugging this into the distance function gives
\[ d = \sqrt{(x + 2)^2 + (y + 1)^2 + (1 - 4x + 2y - 5)^2} = \sqrt{(x + 2)^2 + (y + 1)^2 + (-4 - 4x + 2y)^2}. \]
The next issue is that there is a square root in this formula, and we know that we are going to be differentiating this eventually. So, in order to make our work a little easier, let us notice that finding the minimum value of $d$ is equivalent to finding the minimum value of $d^2$. So, let us instead find the minimum value of
\[ f(x, y) = d^2 = (x + 2)^2 + (y + 1)^2 + (-4 - 4x + 2y)^2. \]

Now, we need to be a little careful here. We are being asked to find the closest point on the plane to $(-2, -1, 5)$, and that is not really the same thing as what we have been doing in this subsection. In this subsection we have been finding and classifying critical points as relative minima or maxima, and what we are really asking for is the smallest value the function will take, or the absolute minimum. Hopefully, it does make sense from a physical standpoint that there will be a closest point on the plane to $(-2, -1, 5)$. Also, this point should be a relative minimum. So, let us go through the process from the first and second examples and see what we get as far as relative minima go. If we only get a single relative minimum then we will be done, since that point will also need to be the absolute minimum of the function and hence the point on the plane that is closest to $(-2, -1, 5)$.

We will need the derivatives first.
\[ f_x = 2(x + 2) + 2(-4)(-4 - 4x + 2y) = 36 + 34x - 16y, \]
\[ f_y = 2(y + 1) + 2(2)(-4 - 4x + 2y) = -14 - 16x + 10y, \]
\[ f_{xx} = 34, \qquad f_{yy} = 10, \qquad f_{xy} = -16. \]

Now, before we get into finding the critical point(s), let us compute $D$ quickly.
\[ D = (34)(10) - (-16)^2 = 84 > 0. \]
So, in this case $D$ will always be positive, and $f_{xx} = 34 > 0$ is also always positive, so any critical points that we get are guaranteed to be relative minima. Now, let us find the critical point(s). This will mean solving the system
\[ \begin{cases} 36 + 34x - 16y = 0, \\ -14 - 16x + 10y = 0. \end{cases} \]
To do this we can solve the first equation for $x$:
\[ x = \frac{1}{34}(16y - 36) = \frac{1}{17}(8y - 18). \]
Now, plug this into the second equation and solve for $y$:
\[ -14 - \frac{16}{17}(8y - 18) + 10y = 0 \quad\Longrightarrow\quad y = -\frac{25}{21}. \]

Substituting this back into the equation for $x$ gives $x = -34/21$. So we get a single critical point
\[ \left( -\frac{34}{21}, -\frac{25}{21} \right). \]
Also, since we know this will be a relative minimum and it is the only critical point, we know that these are also the $x$ and $y$ coordinates of the point on the plane that we are looking for. We can find the $z$ coordinate by plugging into the equation of the plane as follows:
\[ z = 1 - 4\left(-\frac{34}{21}\right) + 2\left(-\frac{25}{21}\right) = \frac{107}{21}. \]

So, the point on the plane that is closest to $(-2, -1, 5)$ is
\[ \left( -\frac{34}{21}, -\frac{25}{21}, \frac{107}{21} \right). \qquad \Box \]
Exercise 3.6.3 Locate all relative maxima, relative minima, and saddle points, if any.
1. $f(x, y) = y^2 + xy + 3y + 2x + 3$.
2. $f(x, y) = x^2 + xy - 2y - 2x + 1$.
3. $f(x, y) = x^2 + xy + y^2 - 3x$.
4. $f(x, y) = xy - x^3 - y^2$.
5. $f(x, y) = xy + \dfrac{2}{x} + \dfrac{4}{y}$.
6. $f(x, y) = y \sin x$.
7. $f(x, y) = e^x \sin y$.
8. $f(x, y) = x^2 + y^2 + \dfrac{2}{xy}$.
9. $f(x, y) = e^{-(x^2 + y^2 + 2x)}$.
10. $f(x, y) = xy + \dfrac{8}{x} + \dfrac{1}{y}$.
11. $f(x, y) = \cos(x + y)$.
12. $f(x, y) = \dfrac{xy}{2 + x^4 + y^4}$.
13. $f(x, y) = x e^{-x^3 + y^3}$.
14. $f(x, y) = \dfrac{x^2}{x^2 + y^2}$.
15. $f(x, y) = \dfrac{xy}{x^2 + y^2}$.
16. $f(x, y) = \left(1 + \dfrac{1}{x}\right)\left(1 + \dfrac{1}{y}\right)\left(\dfrac{1}{x} + \dfrac{1}{y}\right)$.

Show that the second-derivative test provides no information about the critical points of the following functions. Classify all critical points of $f$ as relative maxima, relative minima, or saddle points.
17. f (x, y ) = x4 + y 4 . 18. f (x, y ) = x4 y 4 . 19. f (x, y ) = x3 + y 3 . 20. f (x, y ) = exp(x4 y 4 ).

21. Let $f(s, t)$ denote the square of the distance between a typical point of the line $x = t$, $y = t + 1$, $z = 2t$ and a typical point of the line $x = 2s$, $y = s - 1$, $z = s + 1$. Show that the single critical point of $f$ is a relative minimum. Hence find the closest points on these two skew lines.
22. Let $f(x, y)$ denote the square of the distance from $(0, 0, 2)$ to a typical point of the surface $z = xy$. Find and classify the critical points of $f$.
23. Show that the graph of the function
\[ f(x, y) = xy \exp\left( \tfrac{1}{8}\,[x^2 + 4y^2] \right) \]
has a saddle point but no relative extrema.


24. Consider the function

f (x, y ) = sin Find and classify the critical points of the function.
25. Let the function
\[ f(x, y) = \frac{xy(x^2 - y^2)}{x^2 + y^2}. \]
Classify the behavior of $f$ near the critical point $(0, 0)$.


Absolute minima and maxima


Now we are going to extend the work from the previous subsection. In the previous subsection we were asked to find and classify all critical points as relative minima, relative maxima and/or saddle points. In this subsection we want to optimize a function, that is, identify the absolute minimum and/or the absolute maximum of the function, on a given region in $\mathbb{R}^2$. Note that when we say we are going to be working on a region in $\mathbb{R}^2$ we mean that we are going to be looking at some region in the $xy$-plane. In order to optimize a function on a region we first need to get a couple of definitions out of the way. Here are the definitions.

3.6.3 Definition
1. A region in $\mathbb{R}^2$ is called closed if it includes its boundary. A region is called open if it does not include any of its boundary points.
2. A region in $\mathbb{R}^2$ is called bounded if it can be completely contained in a disk. In other words, a region will be bounded if it is finite.

Let us think a little more about the definition of closed. We said a region is closed if it includes its boundary. Just what does this mean? Let us think of a rectangle. Below are two definitions of a rectangle, one open and the other closed.
\[ \text{Open:}\quad -5 < x < 3,\quad 1 < y < 6. \qquad\qquad \text{Closed:}\quad -5 \le x \le 3,\quad 1 \le y \le 6. \]

In the first case we do not allow the ranges to include the endpoints (i.e., we are not including the edges of the rectangle), and so we are not allowing the region to include any points on the edge of the rectangle. In other words, we are not allowing the region to include its boundary and so it is open. In the second case we are allowing the region to contain points on the edges, so it will contain its entire boundary and hence will be closed. This is an important idea because of the following fact.

Theorem 3.6.1 (Extreme Value Theorem) If $f(x, y)$ is continuous in some closed, bounded set $D$ in $\mathbb{R}^2$, then there are points $(x_1, y_1)$ and $(x_2, y_2)$ in $D$ so that $f(x_1, y_1)$ is the absolute maximum and $f(x_2, y_2)$ is the absolute minimum of the function in $D$.

Note that this theorem does NOT tell us where the absolute minimum or absolute maximum will occur. It only tells us that they will exist. Note as well that the absolute minimum and/or absolute maximum may occur in the interior of the region or on the boundary of the region. The basic process for finding absolute extrema is pretty much identical to the process that we used in single variable calculus when we looked at finding absolute extrema of functions of a single variable. There will, however, be some differences to account for the fact that we are now dealing with functions of two variables. Here is the process.


Theorem 3.6.2 (Finding Absolute Extrema)
1. Find all the critical points of the function that lie in the region $D$ and determine the function value at each of these points.
2. Find all extrema of the function on the boundary. This usually involves the single variable calculus approach.
3. The largest and smallest values found in the first two steps are the absolute maximum and the absolute minimum of the function, respectively.

The main difference between this process and the process that we used in single variable calculus is that the boundary in single variable calculus is just two points, and so there is very little to do in the second step. For these problems the majority of the work is often in the second step, as we will often end up doing an absolute extrema problem in single variable calculus one or more times. Let us take a look at a couple of examples.

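Example 3.6.7 (Absolute extrema) Find the absolute minimum and absolute maximum of
\[ f(x, y) = x^2 + 4y^2 - 2x^2 y + 4 \]
on the rectangle given by $-1 \le x \le 1$ and $-1 \le y \le 1$.

Solution Let us first get a quick picture of the rectangle for reference purposes.

[Figure: the rectangle bounded by the lines $x = -1$, $x = 1$, $y = -1$, and $y = 1$.]

The boundary of this rectangle is given by the following conditions.
\[ \begin{array}{lll} \text{right side:} & x = 1, & -1 \le y \le 1, \\ \text{left side:} & x = -1, & -1 \le y \le 1, \\ \text{upper side:} & y = 1, & -1 \le x \le 1, \\ \text{lower side:} & y = -1, & -1 \le x \le 1. \end{array} \]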

These will be important in the second step of our process. We will start by finding all the critical points that lie inside the given rectangle. To do this we will need the two first-order derivatives
\[ f_x = 2x - 4xy, \qquad f_y = 8y - 2x^2. \]

Note that since we are not going to be classifying the critical points we do not have to consider the second-order derivatives. To find the critical points we will need to solve the system
\[ \begin{cases} 2x - 4xy = 0, \\ 8y - 2x^2 = 0. \end{cases} \]
We can solve the second equation for $y$ to get
\[ y = \frac{x^2}{4}. \]
Plugging this into the first equation gives
\[ 2x - 4x\left(\frac{x^2}{4}\right) = 2x - x^3 = x(2 - x^2) = 0. \]
This implies that we must have
\[ x = 0 \quad\text{or}\quad x = \pm\sqrt{2} \approx \pm 1.414. \]

Now, recall that we only want critical points in the region that we are given. This means that we only want critical points for which $-1 \le x \le 1$. The only value of $x$ that satisfies this is the first one, so we can ignore the last two for this problem. Note, however, that a simple change to the boundary would include these two, so do not forget to always check whether the critical points are in the region (or on the boundary, since that can also happen). Plugging $x = 0$ into the equation for $y$ gives
\[ y = \frac{0^2}{4} = 0. \]

The single critical point in the region (and again, that is important) is $(0, 0)$. We now need the value of the function at the critical point:
\[ f(0, 0) = 4. \]

Eventually we will compare this to values of the function found in the next step and take the largest and smallest as the absolute extrema of the function on the rectangle.

Now we have reached the long part of this problem. We need to find the absolute extrema of the function along the boundary of the rectangle. What this means is that we are going to look at what the function is doing along each of the sides of the rectangle listed above. Let us first take a look at the right side. As noted above, the right side is defined by
\[ x = 1, \qquad -1 \le y \le 1. \]
Notice that along the right side we know that $x = 1$. Let us take advantage of this by defining a new function as follows:
\[ g(y) = f(1, y) = (1)^2 + 4y^2 - 2(1)^2 y + 4 = 5 + 4y^2 - 2y. \]
Now, finding the absolute extrema of $f(x, y)$ along the right side is equivalent to finding the absolute extrema of $g(y)$ in the range $-1 \le y \le 1$. Hopefully you can recall how to do this from single variable calculus. We find the critical points of $g(y)$ in the range $-1 \le y \le 1$ and then evaluate $g(y)$ at the critical points and at the end points of the range of $y$. Let us do that for this problem.
\[ g'(y) = 8y - 2 = 0 \quad\Longrightarrow\quad y = \frac{1}{4}. \]

This is in the range, and so we will need the following function evaluations.
\[ g(-1) = 11, \qquad g(1) = 7, \qquad g\left(\tfrac{1}{4}\right) = \tfrac{19}{4} = 4.75. \]

Notice that, using the definition of $g(y)$, these are also function values for $f(x, y)$.
\[ g(-1) = f(1, -1) = 11, \qquad g(1) = f(1, 1) = 7, \qquad g\left(\tfrac{1}{4}\right) = f\left(1, \tfrac{1}{4}\right) = \tfrac{19}{4} = 4.75. \]

We can now do the left side of the rectangle, which is defined by
\[ x = -1, \qquad -1 \le y \le 1. \]

Again, we will define a new function (it does not matter that we are still using the symbol $g$) as follows:
\[ g(y) = f(-1, y) = (-1)^2 + 4y^2 - 2(-1)^2 y + 4 = 5 + 4y^2 - 2y. \]
Notice, however, that for this boundary this is the same function as the one we looked at for the right side. This will not always happen, but since it does here, let us take advantage of the fact that we have already done the work for this function. We know that the critical point is $y = 1/4$, and we know that the function values at the critical point and the end points are
\[ g(-1) = 11, \qquad g(1) = 7, \qquad g\left(\tfrac{1}{4}\right) = \tfrac{19}{4} = 4.75. \]

The only real difference here is that these will correspond to values of $f(x, y)$ at different points than for the right side. In this case these correspond to the following function values for $f(x, y)$.
\[ g(-1) = f(-1, -1) = 11, \qquad g(1) = f(-1, 1) = 7, \qquad g\left(\tfrac{1}{4}\right) = f\left(-1, \tfrac{1}{4}\right) = \tfrac{19}{4} = 4.75. \]

We can now look at the upper side, defined by
\[ y = 1, \qquad -1 \le x \le 1. \]

We will again define a new function, except this time it will be a function of $x$.
\[ h(x) = f(x, 1) = x^2 + 4(1)^2 - 2x^2(1) + 4 = 8 - x^2. \]
We need to find the absolute extrema of $h(x)$ on the range $-1 \le x \le 1$. First find the critical points.
\[ h'(x) = -2x = 0 \quad\Longrightarrow\quad x = 0. \]

The values of this function at the critical point and the end points are
\[ h(-1) = 7, \qquad h(1) = 7, \qquad h(0) = 8, \]
and the corresponding values for $f(x, y)$ are
\[ h(-1) = f(-1, 1) = 7, \qquad h(1) = f(1, 1) = 7, \qquad h(0) = f(0, 1) = 8. \]

Note that there are several repeats here. The first two function values have already been computed when we looked at the right and left sides. This will often happen. Finally, we need to take care of the lower side. This side is defined by
\[ y = -1, \qquad -1 \le x \le 1. \]
The new function we will define in this case is
\[ h(x) = f(x, -1) = x^2 + 4(-1)^2 - 2x^2(-1) + 4 = 8 + 3x^2. \]
The critical point for this function is
\[ h'(x) = 6x = 0 \quad\Longrightarrow\quad x = 0. \]

The function values at the critical point and the end points are
\[ h(-1) = 11, \qquad h(1) = 11, \qquad h(0) = 8, \]
and the corresponding values for $f(x, y)$ are
\[ h(-1) = f(-1, -1) = 11, \qquad h(1) = f(1, -1) = 11, \qquad h(0) = f(0, -1) = 8. \]

The final step in this long process is to collect all the function values for $f(x, y)$ that we have computed in this problem. Here they are.
\[ \begin{array}{lll} f(0, 0) = 4, & f(1, -1) = 11, & f(1, 1) = 7, \\ f\left(1, \tfrac{1}{4}\right) = 4.75, & f(-1, 1) = 7, & f(-1, -1) = 11, \\ f\left(-1, \tfrac{1}{4}\right) = 4.75, & f(0, 1) = 8, & f(0, -1) = 8. \end{array} \]

The absolute minimum is $4$, at $(0, 0)$, since this is the smallest function value, and the absolute maximum is $11$, which occurs at $(-1, -1)$ and $(1, -1)$, since these two points give the largest value of all. $\Box$
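As a sanity check on Example 3.6.7, one can simply sample the function on a fine grid covering the closed rectangle and compare the extreme sampled values with the ones found analytically. This is a brute-force sketch, not the method of the notes; the grid resolution `n = 200` is an arbitrary assumption, chosen so that the origin and the corners fall exactly on grid points.

```python
def f(x, y):
    return x**2 + 4 * y**2 - 2 * x**2 * y + 4


n = 200  # grid resolution per axis (an arbitrary choice for this sketch)
grid = [-1 + 2 * i / n for i in range(n + 1)]   # covers [-1, 1] inclusive

# Evaluate f at every grid point, keeping the location with each value.
samples = [(f(x, y), x, y) for x in grid for y in grid]

abs_min = min(samples, key=lambda s: s[0])
abs_max = max(samples, key=lambda s: s[0])

print("smallest sampled value:", abs_min)
print("largest sampled value :", abs_max)
```

A grid search only approximates the extrema in general; here it happens to hit the exact locations because they lie on the grid.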

As this example has shown, these can be very long problems. Let us take a look at an easier problem with a different kind of boundary.

Example 3.6.8 (Absolute extrema) Find the absolute minimum and absolute maximum of
\[ f(x, y) = 2x^2 - y^2 + 6y \]
on the disk of radius 4, $x^2 + y^2 \le 16$.

Solution First note that a disk of radius 4 is given by the inequality in the problem statement. The "less than" part of the inequality gives the interior of the disk, and the "equal" part gives the boundary. Of course, this also means that the boundary of the disk is a circle of radius 4.

[Figure: the disk with circular boundary $x^2 + y^2 = 16$.]

Let us find the critical points of the function that lie inside the disk. This will require the following two first-order partial derivatives.
\[ f_x = 4x, \qquad f_y = -2y + 6. \]

To find the critical points we will need to solve the system
\[ \begin{cases} 4x = 0, \\ -2y + 6 = 0. \end{cases} \]

This is actually a fairly simple system to solve. The first equation tells us that $x = 0$ and the second tells us that $y = 3$. So the only critical point for this function is $(0, 3)$, and this is inside the disk of radius 4. The function value at this critical point is
\[ f(0, 3) = 9. \]
Now we need to look at the boundary. This one will be somewhat different from the previous example. In this case we do not have fixed values of $x$ and $y$ on the boundary. Instead we have
\[ x^2 + y^2 = 16. \]
We can solve this for $x^2$ and plug this into the $x^2$ in $f(x, y)$ to get a function of $y$ as follows:
\[ x^2 = 16 - y^2, \qquad g(y) = 2(16 - y^2) - y^2 + 6y = 32 - 3y^2 + 6y. \]

We will need to find the absolute extrema of this function on the range $-4 \le y \le 4$ (this is the range of $y$ for the disk). We will first need the critical points of this function.
\[ g'(y) = -6y + 6 = 0 \quad\Longrightarrow\quad y = 1. \]

The values of this function at the critical point and the end points are
\[ g(-4) = -40, \qquad g(4) = 8, \qquad g(1) = 35. \]

Unlike the first example, we will still need to find the values of $x$ that correspond to these. We can do this by plugging the value of $y$ into our equation for the circle and solving for $x$.
\[ \begin{array}{lll} y = -4: & x^2 = 16 - 16 = 0 & \Longrightarrow\ x = 0, \\ y = 4: & x^2 = 16 - 16 = 0 & \Longrightarrow\ x = 0, \\ y = 1: & x^2 = 16 - 1 = 15 & \Longrightarrow\ x = \pm\sqrt{15}. \end{array} \]

The function values for $g(y)$ then correspond to the following function values for $f(x, y)$.
\[ \begin{array}{ll} g(-4) = -40 & \Longrightarrow\ f(0, -4) = -40, \\ g(4) = 8 & \Longrightarrow\ f(0, 4) = 8, \\ g(1) = 35 & \Longrightarrow\ f(-\sqrt{15}, 1) = 35 \ \text{ and } \ f(\sqrt{15}, 1) = 35. \end{array} \]

Note that the third one actually corresponds to two different points, since that value of $y$ produces two different values of $x$. So, comparing these values to the value of the function at the critical point of $f(x, y)$ that we found earlier, we can see that the absolute minimum occurs at $(0, -4)$, while the absolute maximum occurs twice, at $(-\sqrt{15}, 1)$ and $(\sqrt{15}, 1)$. $\Box$
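The boundary step of Example 3.6.8 can also be checked by parametrizing the circle as $x = 4\cos\theta$, $y = 4\sin\theta$ and sampling $\theta$. This parametrization is a different route from the substitution $x^2 = 16 - y^2$ used above and is offered only as a hedged numerical cross-check; the number of samples `n` is an arbitrary assumption.

```python
import math


def f(x, y):
    return 2 * x**2 - y**2 + 6 * y


n = 3600  # number of sample angles on the circle (arbitrary for this sketch)
boundary_values = [
    f(4 * math.cos(2 * math.pi * k / n), 4 * math.sin(2 * math.pi * k / n))
    for k in range(n)
]

# Candidates for the absolute extrema: boundary samples plus the
# interior critical point (0, 3) found in the example.
candidates = boundary_values + [f(0, 3)]

approx_min = min(candidates)
approx_max = max(candidates)
print("approximate absolute minimum:", approx_min)
print("approximate absolute maximum:", approx_max)
```

Because the maximum at $y = 1$ does not fall exactly on a sample angle, the sampled maximum is only close to 35; a finer grid brings it closer.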

In both of these examples one of the absolute extrema actually occurred at more than one place. Sometimes this will happen and sometimes it will not, so do not read too much into the fact that it happened in both examples given here. Also note that, as we have seen, absolute extrema will often occur on the boundaries of these regions, although they do not have to. There are more complicated examples with multiple critical points in which the absolute extrema occur in the interior of the region and not on the boundary.

Exercise 3.6.4 Find the absolute extrema of the given function on the indicated closed and bounded set $R$.
1. $f(x, y) = xy - x - 3y$; $R$ is the triangular region with vertices $(0, 0)$, $(0, 4)$, and $(5, 0)$.
2. $f(x, y) = x^2 - 3y^2 - 2x + 6y$; $R$ is the square with vertices $(0, 0)$, $(0, 2)$, $(2, 2)$, and $(2, 0)$.
3. $f(x, y) = xe^y - x^2 - e^y$; $R$ is the rectangular region with vertices $(0, 0)$, $(0, 1)$, $(2, 1)$, and $(2, 0)$.
4. $f(x, y) = x^2 + 2y^2 - x$; $R$ is the disk $x^2 + y^2 \le 4$.
5. Find three positive numbers whose sum is 48 and such that their product is as large as possible.
6. Find all points on the portion of the plane $x + y + z = 5$ in the first octant at which $f(x, y, z) = xy^2 z^2$ has a maximum value.
7. Find the dimensions of the rectangular box of maximum volume that can be inscribed in a sphere of radius $a$.


Lagrange multipliers
In the previous subsection we optimized (i.e., found the absolute extrema of) a function on a region that contains its boundary. Finding potential optimal points in the interior of the region is not too difficult in general; all that we need to do is find the critical points and plug them into the function. However, as we saw in Examples 3.6.7 and 3.6.8, finding potential optimal points on the boundary is often a fairly long and messy process. Now we are going to take a look at another method (Lagrange multipliers) of optimizing a function subject to given constraint(s). The constraint(s) may be equation(s) that describe the boundary of a region, although in this subsection we will not concentrate on those types of problems, since this method just requires a general constraint and does not really care where the constraint came from. So, let us get things set up. We want to optimize (find the minimum and maximum of) a function, $f(x, y, z)$, subject to the constraint $g(x, y, z) = c$. Again, the constraint may be the equation that describes the boundary of a region or it may not be. The process is actually fairly simple, although the work can still be a little overwhelming at times.

Theorem 3.6.3 (Method of Lagrange Multipliers)
1. Solve the following system of equations
\[ \begin{cases} \nabla f(x, y, z) = \lambda\, \nabla g(x, y, z), \\ g(x, y, z) = c. \end{cases} \]
2. Plug in all solutions, $(x, y, z)$, from the first step into $f(x, y, z)$ and identify the minimum and maximum values, provided they exist.
The constant, $\lambda$, is called the Lagrange multiplier.

Notice that the system of equations actually has four equations; we just wrote the system in a simpler form. To see this, let us take the first equation and put in the definition of the gradient vector to see what we get.
\[ \langle f_x, f_y, f_z \rangle = \lambda \langle g_x, g_y, g_z \rangle = \langle \lambda g_x, \lambda g_y, \lambda g_z \rangle. \]

In order for these two vectors to be equal the individual components must also be equal. So, we actually have three equations here.
\[ f_x = \lambda g_x, \qquad f_y = \lambda g_y, \qquad f_z = \lambda g_z. \]

These three equations along with the constraint, $g(x, y, z) = c$, give four equations in the four unknowns $x$, $y$, $z$, and $\lambda$. Note as well that if we only have functions of two variables then we will not have the third component of the gradient, and so will only have three equations in the three unknowns $x$, $y$, and $\lambda$. Let us work a couple of examples.
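To make the two-variable case concrete, here is a small illustration that is not taken from the notes: the function $f(x, y) = xy$ and the constraint $x + y = 2$ are hypothetical choices made only to show the shape of the three-equation system.
\[ f(x, y) = xy, \qquad g(x, y) = x + y = 2, \]
\[ \begin{cases} f_x = \lambda g_x: & y = \lambda, \\ f_y = \lambda g_y: & x = \lambda, \\ g(x, y) = c: & x + y = 2. \end{cases} \]
The first two equations give $x = y = \lambda$, and the constraint then forces $2\lambda = 2$, so $\lambda = 1$ and the only candidate point is $(1, 1)$ with $f(1, 1) = 1$. On the line $x + y = 2$ we have $f = x(2 - x)$, which is largest at $x = 1$, so in this illustration the candidate is a maximum.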

108

3.6 Applications of Partial Derivatives Example 3.6.9 (Lagrange multiplier) Find the dimensions of the box with largest volume if the total surface area is 64 cm2 . Solution Before we start the process here note that we also have a way to solve this kind of problem in single variable calculus, except in those problems we require a condition that relates one of the sides of the box to the other sides so that we can get down to a volume and surface area function that only involve two variables. We no longer need this condition for these problems. Now, let us get on to solving the problem. We rst need to identify the function that we are going to optimize as well as the constraint. Let us set the length of the box to be x, the width of the box to be y and the height of the box to be z . We want to nd the largest volume and so the function that we want to optimize is given by f (x, y, z ) = xyz. Next we know that the surface area of the box must be a constant 64. So this is the constraint. The surface area of a box is simply the sum of the areas of each of the sides so the constraint is given by 2xy + 2yz + 2xz = 64 = xy + yz + xz = 32.

Note that we divide the constraint by 2 to simplify the equation a little. Also, we get the function $g(x, y, z)$ from this.
\[ g(x, y, z) = xy + yz + xz. \]
Here are the four equations that we need to solve.
\begin{align}
yz &= \lambda (y + z) && (f_x = \lambda g_x), \tag{3.4} \\
xz &= \lambda (x + z) && (f_y = \lambda g_y), \tag{3.5} \\
xy &= \lambda (x + y) && (f_z = \lambda g_z), \tag{3.6} \\
xy + yz + xz &= 32 && (g(x, y, z) = 32). \tag{3.7}
\end{align}

Although the equations are nonlinear, there are many ways to solve this system. We will solve it in the following way. Let us multiply equation (3.4) by $x$, equation (3.5) by $y$ and equation (3.6) by $z$.
\begin{align}
xyz &= \lambda x (y + z), \tag{3.8} \\
xyz &= \lambda y (x + z), \tag{3.9} \\
xyz &= \lambda z (x + y). \tag{3.10}
\end{align}

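Now notice that we can set equations (3.8) and (3.9) equal. Doing this gives
\begin{align*}
\lambda x (y + z) &= \lambda y (x + z), \\
\lambda (xy + xz) - \lambda (xy + yz) &= 0, \\
\lambda (xz - yz) &= 0 \quad \Longrightarrow \quad \lambda = 0 \ \text{ or } \ xz = yz.
\end{align*}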

This implies two possibilities. The first, $\lambda = 0$, is not possible, since if this is the case equation (3.4) will reduce to
\[ yz = 0 \quad \Longrightarrow \quad y = 0 \ \text{ or } \ z = 0. \]

Since we are talking about the dimensions of a box, neither of these is possible, so we can discount $\lambda = 0$. This leaves the second possibility.
\[ xz = yz. \]


Since we know that $z \neq 0$ (again since we are talking about the dimensions of a box) we can cancel the $z$ from both sides. This gives
\[ x = y. \tag{3.11} \]
Next, let us set equations (3.9) and (3.10) equal. Doing this gives
\begin{align*}
\lambda y (x + z) &= \lambda z (x + y), \\
\lambda (xy + yz - xz - yz) &= 0, \\
\lambda (xy - xz) &= 0 \quad \Longrightarrow \quad \lambda = 0 \ \text{ or } \ xy = xz.
\end{align*}

As already discussed, we know that $\lambda = 0$ won't work and so this gives $xy = xz$. We can also say that $x \neq 0$ since we are dealing with the dimensions of a box, so we have
\[ y = z. \tag{3.12} \]
Plugging equations (3.11) and (3.12) into equation (3.7) we get
\[ y^2 + y^2 + y^2 = 3y^2 = 32 \quad \Longrightarrow \quad y = \pm\sqrt{\frac{32}{3}} \approx \pm 3.266. \]

However, we know that $y$ must be positive since we are talking about the dimensions of a box. Therefore the only solution that makes physical sense here is
\[ x = y = z \approx 3.266 \text{ cm}. \]
This shows that we have a cube here.

We should be a little careful here. Since we have obtained only one solution, we might be tempted to assume that these are the dimensions that will give the largest volume. The method of Lagrange multipliers will give a set of points that will either maximize or minimize a given function subject to the constraint. However, when we get a single solution it may be either a maximum or a minimum.

To verify that we indeed have a maximum, as we want, all we need to do is pick any other point that satisfies the constraint and check its volume against the volume of the point we got above. If the volume at the point above is larger than at the second point, we will know that we have a maximum rather than a minimum. To get the second point let us choose $y = z = 2$. Plugging these into the constraint gives
\[ 2x + 2x + 4 = 32 \quad \Longrightarrow \quad x = 7. \]
Checking the volume at the two points gives
\begin{align*}
f(3.266, 3.266, 3.266) &= 34.8376, \\
f(7, 2, 2) &= 28.
\end{align*}
So the cube gives the larger volume, and we did get a maximum value as expected. $\square$
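The comparison against a single point can be extended to many points. The following is a minimal numerical sketch, not part of the Lagrange method itself: it picks $y$ and $z$ on a grid, solves the constraint $xy + yz + xz = 32$ for $x$, and records the largest volume found. The grid spacing of 0.1 and the helper names are arbitrary choices made for illustration.

```python
import math


def volume(x, y, z):
    return x * y * z


def x_from_constraint(y, z):
    # Solve xy + yz + xz = 32 for x, given y and z.
    return (32 - y * z) / (y + z)


side = math.sqrt(32 / 3)              # x = y = z from the solution above
cube_volume = volume(side, side, side)

# The cube must satisfy the constraint (3.7).
constraint_residual = 3 * side * side - 32

# Scan other boxes that satisfy the constraint.
best_other = 0.0
for i in range(1, 60):
    for j in range(1, 60):
        y, z = 0.1 * i, 0.1 * j
        x = x_from_constraint(y, z)
        if x > 0:                     # keep only genuine boxes
            best_other = max(best_other, volume(x, y, z))

print(cube_volume, best_other, volume(7, 2, 2))
```

None of the sampled boxes should beat the cube, which is consistent with the cube being the maximum. A scan of this kind is only a sanity check, since it samples finitely many points.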

Notice that we never actually found values for $\lambda$ in the above example. This is fairly standard for these kinds of problems. The value of $\lambda$ is not really important in determining whether the point is a maximum or a minimum, so often we will not bother with finding a value for it. On occasion we will need its value to help solve the system. Let us take a look at the next example for illustration.


Example 3.6.10 (Lagrange multiplier) Find the maximum and minimum of $f(x, y) = 5x - 3y$ subject to the constraint $x^2 + y^2 = 136$.

Solution This one is going to be a little easier than the previous one since it only has two variables. Here is the system that we need to solve.
\begin{align*}
5 &= 2\lambda x, \\
-3 &= 2\lambda y, \\
x^2 + y^2 &= 136.
\end{align*}

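Notice that, as with the last example, we cannot have $\lambda = 0$ since that would not satisfy the first two equations. So, since we know that $\lambda \neq 0$, we can solve the first two equations for $x$ and $y$, respectively. This gives
\[ x = \frac{5}{2\lambda}, \qquad y = -\frac{3}{2\lambda}. \]
Plugging these into the constraint gives
\[ \frac{25}{4\lambda^2} + \frac{9}{4\lambda^2} = \frac{17}{2\lambda^2} = 136. \]
We can solve this for $\lambda$.
\[ \lambda^2 = \frac{1}{16} \quad \Longrightarrow \quad \lambda = \pm\frac{1}{4}. \]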

Now that we know $\lambda$, we can find the points that will be potential maxima and/or minima.

If $\lambda = -\frac{1}{4}$, we get
\[ x = -10, \qquad y = 6. \]
If $\lambda = \frac{1}{4}$, we get
\[ x = 10, \qquad y = -6. \]
To determine if we have maxima or minima we just need to plug these into the function.
\begin{align*}
f(-10, 6) &= -68, && \text{minimum at } (-10, 6), \\
f(10, -6) &= 68, && \text{maximum at } (10, -6).
\end{align*}
$\square$
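As a cross-check on these values, the constraint circle can be parametrized as $x = \sqrt{136}\cos t$, $y = \sqrt{136}\sin t$ and the function sampled along it. This is a brute-force sketch under that parametrization, not the Lagrange method; the sample count `n` is an arbitrary choice.

```python
import math

R = math.sqrt(136)                    # radius of the constraint circle


def f(x, y):
    return 5 * x - 3 * y


n = 20000                             # number of sample points on the circle
samples = []
for k in range(n):
    t = 2 * math.pi * k / n
    x, y = R * math.cos(t), R * math.sin(t)
    samples.append((f(x, y), x, y))

f_max, x_max, y_max = max(samples)
f_min, x_min, y_min = min(samples)

print(f_max, (x_max, y_max))          # should be close to 68 near (10, -6)
print(f_min, (x_min, y_min))          # should be close to -68 near (-10, 6)
```

The sampled extremes should agree with the values found above up to the sampling resolution.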

In the first two examples we have excluded $\lambda = 0$ either for physical reasons or because it would not satisfy one or more of the equations. Do not always expect this to happen. Sometimes we will be able to automatically exclude a value of $\lambda$ and sometimes we won't. Let us take a look at another example.


Example 3.6.11 (Lagrange multiplier) Find the maximum and minimum of $f(x, y, z) = xyz$ subject to the constraint $x + y + z = 1$. Assume that $x, y, z \geq 0$.

Solution Here is the system that we need to solve.
\begin{align}
yz &= \lambda, \tag{3.13} \\
xz &= \lambda, \tag{3.14} \\
xy &= \lambda, \tag{3.15} \\
x + y + z &= 1. \tag{3.16}
\end{align}

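Let us start this solution process off by noticing that since the first three equations all have $\lambda$ on the right-hand side, their left-hand sides are all equal. So, let us start off by setting equations (3.13) and (3.14) equal.
\[ yz = xz \quad \Longrightarrow \quad z(y - x) = 0 \quad \Longrightarrow \quad z = 0 \ \text{ or } \ y = x. \]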

So, we have two possibilities here. Let us consider the first possibility: $z = 0$. With this we can see from either equation (3.13) or (3.14) that we must have $\lambda = 0$. From equation (3.15) we see that this means that $xy = 0$. This in turn means that either $x = 0$ or $y = 0$. So, we have two possible cases to deal with here. In each case two of the variables must be zero. Once we know this we can plug into the constraint, equation (3.16), to find the remaining value.
\begin{align*}
z = 0 \ \text{ and } \ x = 0 \quad &\Longrightarrow \quad y = 1, \\
z = 0 \ \text{ and } \ y = 0 \quad &\Longrightarrow \quad x = 1.
\end{align*}

So, we get two possible solutions $(0, 1, 0)$ and $(1, 0, 0)$.

Now, let us go back and take a look at the other possibility, $y = x$. We have two possible cases to look at here as well. The first case is $x = y = 0$. In this case we can see from the constraint that we must have $z = 1$, and so we now have a third solution $(0, 0, 1)$.

The second case is $x = y \neq 0$. Let us set equations (3.14) and (3.15) equal.
\[ xz = xy \quad \Longrightarrow \quad x(z - y) = 0 \quad \Longrightarrow \quad x = 0 \ \text{ or } \ z = y. \]

Now, we have already assumed that $x \neq 0$ and so the only possibility is that $z = y$. However, this also means that
\[ x = y = z. \]
Using this in the constraint gives
\[ 3x = 1 \quad \Longrightarrow \quad x = \frac{1}{3}. \]

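So, the next solution is $\left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\right)$.

We got four solutions by setting the first two equations equal. To completely finish this problem out we should probably set equations (3.13) and (3.15) equal as well as setting equations (3.14) and (3.15) equal to see what we get. Doing this gives
\begin{align*}
yz = xy \quad &\Longrightarrow \quad y(z - x) = 0 \quad \Longrightarrow \quad y = 0 \ \text{ or } \ z = x, \\
xz = xy \quad &\Longrightarrow \quad x(z - y) = 0 \quad \Longrightarrow \quad x = 0 \ \text{ or } \ z = y.
\end{align*}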


Both of these are very similar to the first situation that we looked at, and we will leave it up to you to show that in each of these cases we arrive back at the four solutions that we already found.

So, we have four solutions that we need to check in the function to see whether we have minima or maxima.
\begin{align*}
f(0, 0, 1) = 0, \quad f(0, 1, 0) = 0, \quad f(1, 0, 0) = 0 &: \ \text{all minima}, \\
f\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) = \tfrac{1}{27} &: \ \text{maximum}.
\end{align*}
So, in this case the maximum occurs only once while the minimum occurs three times.

Note as well that we never really used the assumption that $x, y, z \geq 0$ in this problem. This assumption is here mostly to make sure that we really do have a maximum and a minimum of the function. Without this assumption it would not be too difficult to find points that give both larger and smaller values for the function. For example,
\begin{align*}
x = -100, \ y = 100, \ z = 1 &: \quad -100 + 100 + 1 = 1, \quad f(-100, 100, 1) = -10000, \\
x = -50, \ y = -50, \ z = 101 &: \quad -50 - 50 + 101 = 1, \quad f(-50, -50, 101) = 252500.
\end{align*}

With these examples you can clearly see that it is not too hard to find points that will give larger and smaller function values. However, all of these examples required negative values of $x$, $y$, and/or $z$ to make sure we satisfy the constraint. By eliminating these we can now say that we have found the minimum and maximum values of the function. $\square$
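The role of the assumption $x, y, z \geq 0$ can also be seen numerically. The sketch below scans a grid on the triangle $x + y + z = 1$, $x, y, z \geq 0$, and then evaluates the two points with negative coordinates given above. It is a brute-force check rather than the Lagrange method, and the grid resolution `n = 300` is an arbitrary choice (a multiple of 3, so that the point with all coordinates equal to $1/3$ lies on the grid).

```python
def f(x, y, z):
    return x * y * z


n = 300                               # grid resolution: x = i/n, y = j/n
best_value, best_point = -1.0, None
worst_value, worst_point = 2.0, None

for i in range(n + 1):
    for j in range(n + 1 - i):
        k = n - i - j                 # forces x + y + z = 1 with z >= 0
        point = (i / n, j / n, k / n)
        value = f(*point)
        if value > best_value:
            best_value, best_point = value, point
        if value < worst_value:
            worst_value, worst_point = value, point

print(best_value, best_point)         # compare with 1/27 at (1/3, 1/3, 1/3)
print(worst_value)                    # compare with 0 on the boundary

# Points that satisfy the constraint but violate x, y, z >= 0.
print(f(-100, 100, 1), f(-50, -50, 101))
```

On the triangle the scan should reproduce the maximum and minimum found above, while the last line shows values outside that range once negative coordinates are allowed.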

To this point we have only looked at constraints that were equations. We can also have constraints that are inequalities. The process for these types of problems is nearly identical to what we have been doing. The main difference between the two types of problems is that we will also need to find all the critical points that satisfy the inequality in the constraint and check these in the function when we check the values we found using Lagrange multipliers. We are not going to give any examples of this type here and this will not appear in our examinations.

The final topic that we need to discuss here is what to do if we have more than one constraint. We will look at two constraints, but we can naturally extend the work here to more than two constraints. We want to optimize $f(x, y, z)$ subject to the constraints $g(x, y, z) = c$ and $h(x, y, z) = k$. The system that we need to solve in this case is
\[
\begin{cases}
\nabla f(x, y, z) = \lambda \nabla g(x, y, z) + \mu \nabla h(x, y, z), \\
g(x, y, z) = c, \\
h(x, y, z) = k.
\end{cases}
\]

So in this case we get two Lagrange multipliers (i.e., $\lambda$ and $\mu$). Also, note that the first equation really is three equations as we saw in the previous examples. Let us see an example of this kind of optimization problem.

Example 3.6.12 (Lagrange multiplier) Find the maximum and minimum of $f(x, y, z) = 4y - 2z$ subject to the constraints $2x - y - z = 2$ and $x^2 + y^2 = 1$.


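Solution Here is the system that we need to solve.
\begin{align}
0 &= 2\lambda + 2\mu x && (f_x = \lambda g_x + \mu h_x), \tag{3.17} \\
4 &= -\lambda + 2\mu y && (f_y = \lambda g_y + \mu h_y), \tag{3.18} \\
-2 &= -\lambda && (f_z = \lambda g_z + \mu h_z), \tag{3.19} \\
2x - y - z &= 2, \tag{3.20} \\
x^2 + y^2 &= 1. \tag{3.21}
\end{align}
First, let us notice that from equation (3.19) we get $\lambda = 2$. Plugging this into equation (3.17) and equation (3.18) and solving for $x$ and $y$ respectively gives
\begin{align*}
0 = 4 + 2\mu x \quad &\Longrightarrow \quad x = -\frac{2}{\mu}, \\
4 = -2 + 2\mu y \quad &\Longrightarrow \quad y = \frac{3}{\mu}.
\end{align*}
Now, plug these into equation (3.21).
\[ \frac{4}{\mu^2} + \frac{9}{\mu^2} = \frac{13}{\mu^2} = 1 \quad \Longrightarrow \quad \mu = \pm\sqrt{13}. \]
So, we have two cases to look at here. First, let us see what we get when $\mu = \sqrt{13}$. In this case we have
\[ x = -\frac{2}{\sqrt{13}} \quad \text{and} \quad y = \frac{3}{\sqrt{13}}. \]
Plugging these into equation (3.20) gives
\[ -\frac{4}{\sqrt{13}} - \frac{3}{\sqrt{13}} - z = 2 \quad \Longrightarrow \quad z = -2 - \frac{7}{\sqrt{13}}. \]
So, we have obtained one solution. Let us now see what we get if we take $\mu = -\sqrt{13}$. Here we have
\[ x = \frac{2}{\sqrt{13}} \quad \text{and} \quad y = -\frac{3}{\sqrt{13}}. \]
Plugging these into equation (3.20) gives
\[ \frac{4}{\sqrt{13}} + \frac{3}{\sqrt{13}} - z = 2 \quad \Longrightarrow \quad z = -2 + \frac{7}{\sqrt{13}}, \]
and this is the second solution.

Now all that we need to do is check the two solutions in the function to see which is the maximum and which is the minimum.
\begin{align*}
f\left(-\frac{2}{\sqrt{13}}, \frac{3}{\sqrt{13}}, -2 - \frac{7}{\sqrt{13}}\right) &= 4 + \frac{26}{\sqrt{13}} \approx 11.2111, \\
f\left(\frac{2}{\sqrt{13}}, -\frac{3}{\sqrt{13}}, -2 + \frac{7}{\sqrt{13}}\right) &= 4 - \frac{26}{\sqrt{13}} \approx -3.2111.
\end{align*}
So, we have a maximum at
\[ \left(-\frac{2}{\sqrt{13}}, \frac{3}{\sqrt{13}}, -2 - \frac{7}{\sqrt{13}}\right) \]
and a minimum at
\[ \left(\frac{2}{\sqrt{13}}, -\frac{3}{\sqrt{13}}, -2 + \frac{7}{\sqrt{13}}\right). \]
$\square$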
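These two values can be cross-checked without any multipliers. The two constraints together describe a closed curve: the second constraint allows $x = \cos t$, $y = \sin t$, and the first then fixes $z = 2x - y - 2$. The sketch below samples $f$ along this curve; the parametrization, the helper name `point_on_constraints`, and the sample count `n` are choices made here for illustration and are not part of the Lagrange method.

```python
import math


def f(x, y, z):
    return 4 * y - 2 * z


def point_on_constraints(t):
    x, y = math.cos(t), math.sin(t)   # satisfies x^2 + y^2 = 1
    z = 2 * x - y - 2                 # satisfies 2x - y - z = 2
    return x, y, z


n = 20000                             # number of sample points on the curve
samples = [f(*point_on_constraints(2 * math.pi * k / n)) for k in range(n)]
f_max, f_min = max(samples), min(samples)

# Values of f at the two solutions found with Lagrange multipliers.
s = math.sqrt(13)
claimed_max = f(-2 / s, 3 / s, -2 - 7 / s)    # 4 + 26/sqrt(13)
claimed_min = f(2 / s, -3 / s, -2 + 7 / s)    # 4 - 26/sqrt(13)

print(f_max, claimed_max)
print(f_min, claimed_min)
```

Along the curve the function reduces to $6\sin t - 4\cos t + 4$, whose amplitude is $\sqrt{52} = 2\sqrt{13} = 26/\sqrt{13}$, in agreement with the two values above.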
