ACL49 NP

Lecture 49: Derivatives of functions of several
variables
Winfried Just
Department of Mathematics, Ohio University
Companion to Advanced Calculus
Winfried Just, Ohio University MATH4/5302, Lecture 49: Derivatives in several variables
Review: Derivatives in one variable
Let f : E → R, where E ⊆ R. Recall that for such f we define the
derivative at x0 ∈ E as:
\[ f'(x_0) := \lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \frac{f(x) - f(x_0)}{x - x_0}. \]
We would like to use the exact same definition for derivatives at x0
when E ⊆ R^n for some n > 1.
Question L49.1: Why does this not work?
Unfortunately this does not work, since x − x0 is an n-dimensional
vector in this case and we do not have a well-defined operation of
dividing a real number f(x) − f(x0) by a multidimensional vector.
We need to take a different approach, based on Newton’s
approximation theorem.
A linear approximation of f near x0
Lemma 6.2.1: Let E be a subset of R, let f : E → R be a
function, x0 ∈ E, and L ∈ R. Then the following are equivalent:
(a) f is differentiable at x0, and f′(x0) = L.
(b) We have
\[ \lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \frac{|f(x) - (f(x_0) + L(x - x_0))|}{|x - x_0|} = 0. \]
We can look at this result as follows: When f′(x0) = L, and when
we let T be the linear transformation that maps the vector y
to Ly, then the difference (f(x) − f(x0)) − T(x − x0), divided by
|x − x0|, approaches 0 as x → x0:
\[ \frac{(f(x) - f(x_0)) - T(x - x_0)}{|x - x_0|} \to 0 \quad \text{as } x \to x_0. \]
More informally, f(x) − f(x0) ≈ T(x − x0).
Question L49.2: What is the matrix A such that T = L_A?
This matrix is simply the 1 × 1 matrix (f′(x0)) = (L).
This observation motivates the definition of derivatives of
real-valued functions on subsets of R^n as linear transformations.
Derivatives of functions of several variables
Definition 6.2.2: (Differentiability) Let E be a subset of R^n,
f : E → R^m be a function, x0 ∈ E be a point, and let
L : R^n → R^m be a linear transformation. We say that f is
differentiable at x0 with derivative L if we have
\[ \lim_{x \to x_0;\, x \in E \setminus \{x_0\}} \frac{\|f(x) - (f(x_0) + L(x - x_0))\|}{\|x - x_0\|} = 0. \]
Here ||x|| is the length of x (as measured in the ℓ2 metric):
\[ \|(x_1, x_2, \dots, x_n)\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}. \]
Notice that in the general case we divide by the Euclidean norm,
which is of course the same number ||x − x0|| = |x − x0| in the
special case when E ⊆ R.
An example
Example 6.2.3: Let f : R^2 → R^2 be the map f(x, y) := (x^2, y^2),
let x0 be the point x0 := (1, 2), and let L : R^2 → R^2 be the map
L(x, y) := (2x, 4y).
Question L49.3: What is the matrix A such that L = L_A?
Here
\[ A = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}. \]
A worked-out example
We claim that f is differentiable at x0 with derivative L.
To see this, we compute
\[ \lim_{(x,y) \to (1,2);\, (x,y) \neq (1,2)} \frac{\|f(x, y) - (f(1, 2) + L((x, y) - (1, 2)))\|}{\|(x, y) - (1, 2)\|}. \]
Making the change of variables (x, y) = (1, 2) + (a, b), we get
\[ \lim_{(a,b) \to (0,0);\, (a,b) \neq (0,0)} \frac{\|f(1 + a, 2 + b) - (f(1, 2) + L(a, b))\|}{\|(a, b)\|}. \]
Substituting the formulas for f and for L, this becomes
\[ \lim_{(a,b) \to (0,0);\, (a,b) \neq (0,0)} \frac{\|((1 + a)^2, (2 + b)^2) - (1, 4) - (2a, 4b)\|}{\|(a, b)\|}, \]
which simplifies to
\[ \lim_{(a,b) \to (0,0);\, (a,b) \neq (0,0)} \frac{\|(a^2, b^2)\|}{\|(a, b)\|}. \]
A worked-out example, completed
We use the squeeze test.
Since norms are nonnegative, we have
\[ 0 \le \frac{\|(a^2, b^2)\|}{\|(a, b)\|}. \]
On the other hand, we have by the triangle inequality
\[ \|(a^2, b^2)\| \le \|(a^2, 0)\| + \|(0, b^2)\| = a^2 + b^2, \]
and hence, since a^2 + b^2 = ||(a, b)||^2,
\[ 0 \le \frac{\|(a^2, b^2)\|}{\|(a, b)\|} \le \sqrt{a^2 + b^2}. \]
Since √(a^2 + b^2) → 0 as (a, b) → (0, 0), we see from the squeeze test that
\[ \lim_{(a,b) \to (0,0);\, (a,b) \neq (0,0)} \frac{\|(a^2, b^2)\|}{\|(a, b)\|} \]
exists and is equal to 0.
Thus f is differentiable at x0 with derivative L.
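As a numerical sanity check of Example 6.2.3 (not a proof), the following Python sketch evaluates the quotient from Definition 6.2.2 along one particular path (a, b) = (h, −h) with h → 0. The helper names `norm` and `error_ratio` and the choice of path are assumptions made for illustration only.

```python
import math

def f(x, y):
    # The map from Example 6.2.3.
    return (x * x, y * y)

def L(a, b):
    # Candidate derivative at (1, 2): L(a, b) = (2a, 4b).
    return (2 * a, 4 * b)

def norm(v):
    # Euclidean length of a vector in R^2.
    return math.hypot(v[0], v[1])

def error_ratio(a, b):
    # ||f(1+a, 2+b) - (f(1,2) + L(a,b))|| / ||(a,b)||
    fx = f(1 + a, 2 + b)
    f0 = f(1, 2)
    la = L(a, b)
    residual = (fx[0] - f0[0] - la[0], fx[1] - f0[1] - la[1])
    return norm(residual) / norm((a, b))

# Approach (0, 0) along the (arbitrarily chosen) path (h, -h), h = 10^-k.
ratios = [error_ratio(10.0 ** -k, -(10.0 ** -k)) for k in range(1, 6)]
for k, r in enumerate(ratios, start=1):
    print(k, r)
```

The printed ratios shrink roughly in proportion to h, in line with the bound √(a^2 + b^2) from the squeeze argument. Checking a single path cannot establish the limit; the squeeze test is what does that.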
Uniqueness of derivatives
As in the one-dimensional case, derivatives, if they exist, are
unique:
Lemma 6.2.4: (Uniqueness of derivatives) Let E be a subset
of R^n, f : E → R^m be a function, x0 ∈ E be an interior point of E,
and let L1 : R^n → R^m and L2 : R^n → R^m be linear transformations.
Suppose that f is differentiable at x0 with derivative L1, and also
differentiable at x0 with derivative L2. Then L1 = L2.
This lemma allows us to talk about the derivative of a function f
at x0 and to use the notation f′(x0). We will prove it in Module 62.
The calculations in Example 6.2.3 were rather tedious, and we will
now try a different approach that allows for finding derivatives of
multidimensional functions more easily. Essentially, let’s try to
reduce the problem to finding derivatives of several one-dimensional
functions and then assembling the derivative L from these pieces.
Directional derivatives
Let us assume that f : E → R^m, and v ≠ 0 is a vector such that
∃δ > 0 ∀t ∈ [0, δ) x0 + tv ∈ E.
Then the function g : [0, δ) → R^m defined by g(t) = f(x0 + tv) is a function
of one variable, and we can consider its ordinary one-sided derivative
at 0, if it exists.
This derivative gives us information about the rate of change of f at x0
in the direction of v, and we call it a directional derivative. Formally:
Definition 6.3.1: (Directional derivative) Let E be a subset of R^n, let
f : E → R^m be a function, let x0 be an interior point of E, and let v be a
vector in R^n. If the limit
\[ \lim_{t \to 0;\, t > 0,\, x_0 + tv \in E} \frac{f(x_0 + tv) - f(x_0)}{t} \]
exists, we say that f is differentiable in the direction v at x0, and we
denote the above limit by D_v f(x0):
\[ D_v f(x_0) := \lim_{t \to 0;\, t > 0} \frac{f(x_0 + tv) - f(x_0)}{t}. \]
An example of a directional derivative
Definition 6.3.1 makes the assumption that f takes values in R^m
for arbitrary m ≥ 1.
Let us consider the case when m = 1 first.
Then D_v f(x0), if it exists, is simply a real number.
When m > 1, then we can think of f(x) = (f1(x), . . . , fm(x)),
where fi : E → R for each i ∈ {1, . . . , m}.
Then D_v f(x0) is the vector (D_v f1(x0), . . . , D_v fm(x0)).
Example 6.3.4: We use the function f : R^2 → R^2 defined by
f(x, y) := (x^2, y^2) from before, and let x0 := (1, 2) and
v := (3, 4). Then
\[ \begin{aligned}
D_v f(x_0) &= \lim_{t \to 0;\, t > 0} \frac{f(1 + 3t, 2 + 4t) - f(1, 2)}{t} \\
&= \lim_{t \to 0;\, t > 0} \frac{(1 + 6t + 9t^2, 4 + 16t + 16t^2) - (1, 4)}{t} \\
&= \lim_{t \to 0;\, t > 0} (6 + 9t, 16 + 16t) = (6, 16).
\end{aligned} \]
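The limit in Example 6.3.4 can be illustrated numerically. The sketch below, with the hypothetical helper name `one_sided_quotient`, evaluates the one-sided difference quotient from Definition 6.3.1 for a few small positive t; it is an approximation for illustration, not a computation of the limit itself.

```python
def f(x, y):
    # The map from Example 6.2.3 and Example 6.3.4.
    return (x * x, y * y)

def one_sided_quotient(t, x0=(1.0, 2.0), v=(3.0, 4.0)):
    # (f(x0 + t v) - f(x0)) / t for a positive step t.
    fx = f(x0[0] + t * v[0], x0[1] + t * v[1])
    f0 = f(*x0)
    return ((fx[0] - f0[0]) / t, (fx[1] - f0[1]) / t)

# Step sizes t = 10^-1, ..., 10^-6, all positive as the definition requires.
quotients = [one_sided_quotient(10.0 ** -k) for k in range(1, 7)]
for k, q in enumerate(quotients, start=1):
    print(k, q)
```

Up to rounding, each quotient equals (6 + 9t, 16 + 16t), so the printed values approach (6, 16) as t decreases.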
Computing directional derivatives from derivatives
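Recall that the derivative in this example was f′(x0) = L_A, where
\[ A = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}. \]
Moreover, note that in this example we have
\[ D_v f(x_0) = (6, 16) = \left( \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} 3 \\ 4 \end{pmatrix} \right)^T = f'(x_0)v. \]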
This illustrates the following result:
Lemma 6.3.5: Let E be a subset of R^n, f : E → R^m be a
function, x0 be an interior point of E, and let v be a vector in R^n.
If f is differentiable at x0, then f is also differentiable in the
direction v at x0, and D_v f(x0) = f′(x0)v.
Thus if we know the derivative f′(x0) at x0, we can easily deduce
all directional derivatives.
Next we will show how, in many cases, we can compute the
derivative itself from certain directional derivatives.
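To illustrate Lemma 6.3.5 for the running example, the following sketch compares the matrix-vector product Av with a one-sided difference quotient for several directions v. The helper names `mat_vec` and `directional_quotient`, the list of test directions, and the step size t = 1e-6 are all choices made here for illustration.

```python
# Matrix of f'(x0) for f(x, y) = (x^2, y^2) at x0 = (1, 2).
A = [[2.0, 0.0], [0.0, 4.0]]

def mat_vec(M, v):
    # Plain matrix-vector product, returned as a tuple.
    return tuple(sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M)))

def f(x, y):
    return (x * x, y * y)

def directional_quotient(v, x0=(1.0, 2.0), t=1e-6):
    # One-sided difference quotient approximating D_v f(x0).
    fx = f(x0[0] + t * v[0], x0[1] + t * v[1])
    f0 = f(*x0)
    return ((fx[0] - f0[0]) / t, (fx[1] - f0[1]) / t)

directions = [(3.0, 4.0), (1.0, 0.0), (0.0, 1.0), (-2.0, 5.0)]
for v in directions:
    print(v, mat_vec(A, v), directional_quotient(v))
```

For each direction the two printed vectors agree up to a small discretization error, which is what the lemma predicts for a function that is differentiable at x0.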
Partial derivatives
Definition 6.3.7: (Partial derivative) Let E be a subset of R^n, let
f : E → R^m be a function, let x0 be an interior point of E, and let
1 ≤ j ≤ n. Then the partial derivative of f with respect to the xj
variable at x0, denoted ∂f/∂xj (x0), is defined by
\[ \frac{\partial f}{\partial x_j}(x_0) := \lim_{t \to 0;\, t \neq 0,\, x_0 + te_j \in E} \frac{f(x_0 + te_j) - f(x_0)}{t} = \frac{d}{dt} f(x_0 + te_j)\Big|_{t=0} \]
provided of course that the limit exists.
If the limit does not exist, we leave ∂f/∂xj (x0) undefined.
You can think of the partial derivatives as directional derivatives in
the directions of the standard basis vectors ej (or coordinate axes
if you will). For reasons that are mysterious to your
instructor, the textbook defines directional derivatives by only
considering positive t. But most other textbooks don’t, so that the
partial derivatives simply are special cases of directional
derivatives: ∂f/∂xj (x0) = D_{ej} f(x0).
Computing partial derivatives
Partial derivatives are easy to compute: For calculating ∂f/∂xj (x0),
simply treat all xi for i ≠ j as constant, and treat f as a function
of one variable xj.
Question L49.4: Let f(x, y) := x^y. Find ∂f/∂x (1, 2).
Here we treat f as a power function with fixed exponent y = 2.
Thus ∂f/∂x (1, 2) = 2x^1 evaluated at (1, 2), which is 2.
Question L49.5: Let f(x, y) := x^y. Find ∂f/∂y (1, 2).
Here we treat f as an exponential function with fixed base
x = 1, which is the constant function 1^y = 1. Thus ∂f/∂y (1, 2) = 0.
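The answers to Questions L49.4 and L49.5 can be checked numerically. The sketch below uses a symmetric difference quotient, which is a standard approximation chosen here for illustration and not the limit from Definition 6.3.7 itself; the helper name `partial` and the step size are assumptions.

```python
def f(x, y):
    # f(x, y) = x^y, as in Questions L49.4 and L49.5.
    return x ** y

def partial(g, point, j, t=1e-6):
    # Symmetric difference quotient in the j-th coordinate (0-based index):
    # all other coordinates are held constant.
    plus = list(point)
    minus = list(point)
    plus[j] += t
    minus[j] -= t
    return (g(*plus) - g(*minus)) / (2 * t)

df_dx = partial(f, (1.0, 2.0), 0)  # approximates the partial derivative in x at (1, 2)
df_dy = partial(f, (1.0, 2.0), 1)  # approximates the partial derivative in y at (1, 2)
print(df_dx, df_dy)
```

The first value is close to 2 and the second is 0, matching the hand computations: with y fixed at 2 the function is x^2, and with x fixed at 1 it is constant.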
In these examples we had m = 1 and we could somewhat
informally treat the partial derivatives as real numbers. In the
general case though, we would have f = (f1, . . . , fm) and
\[ \frac{\partial f}{\partial x_j}(x_0) = \left( \frac{\partial f_1}{\partial x_j}(x_0), \dots, \frac{\partial f_m}{\partial x_j}(x_0) \right) \]
is a vector.
Directional derivatives vs. partial derivatives
Let f : R^n → R^m. By the remark above, the vector ∂f/∂xj (x0) will
also be the directional derivative D_{ej} f(x0) in the direction of the
vector ej.
Now let v = x1 e1 + · · · + xn en be any vector in R^n (here x1, . . . , xn
denote the coordinates of v). If f is differentiable at x0, then
f′(x0) is a linear transformation, and by Lemma 6.3.5:
\[ \begin{aligned}
D_v f(x_0) &= f'(x_0)v = f'(x_0)(x_1 e_1 + \cdots + x_n e_n) \\
&= x_1 f'(x_0)e_1 + x_2 f'(x_0)e_2 + \cdots + x_n f'(x_0)e_n \\
&= x_1 \frac{\partial f}{\partial x_1}(x_0) + x_2 \frac{\partial f}{\partial x_2}(x_0) + \cdots + x_n \frac{\partial f}{\partial x_n}(x_0).
\end{aligned} \]
Thus the partial derivatives allow us to calculate all directional
derivatives in this case. Moreover, since ∂f/∂xi (x0) = f′(x0)ei for
all ei, the columns of the matrix A for which f′(x0) = L_A must be
the transposes of the partial derivatives at x0.
Assembling derivatives from partial derivatives
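Thus A itself, also denoted by Df(x0), will be of the form:
\[ Df(x_0) = \left( \frac{\partial f_i}{\partial x_j}(x_0) \right)_{1 \le i \le m;\, 1 \le j \le n}
= \begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x_0) & \frac{\partial f_1}{\partial x_2}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\
\frac{\partial f_2}{\partial x_1}(x_0) & \frac{\partial f_2}{\partial x_2}(x_0) & \cdots & \frac{\partial f_2}{\partial x_n}(x_0) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(x_0) & \frac{\partial f_m}{\partial x_2}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0)
\end{pmatrix} \]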
A natural question is this: Suppose we can calculate all partial
derivatives at x0 and form Df(x0) as above. Is it then true that
f′(x0) exists?
This is not always the case; we will see an example in Module 62.
However, this will be true under the additional assumption that all
partial derivatives exist in a neighbourhood of x0 and are
continuous at x0.
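The assembly of Df(x0) from partial derivatives can be sketched in Python for the running example f(x, y) = (x^2, y^2) at x0 = (1, 2). The function name `jacobian` and the use of symmetric difference quotients are assumptions made for illustration; the code only approximates the partial derivatives and says nothing about whether f′(x0) exists.

```python
def f(x, y):
    # The map from Example 6.2.3.
    return (x * x, y * y)

def jacobian(g, point, t=1e-6):
    # Column j holds the approximate partial derivative of g with respect
    # to the j-th variable, computed by a symmetric difference quotient.
    n = len(point)
    columns = []
    for j in range(n):
        plus = list(point)
        minus = list(point)
        plus[j] += t
        minus[j] -= t
        gp = g(*plus)
        gm = g(*minus)
        columns.append([(p - q) / (2 * t) for p, q in zip(gp, gm)])
    m = len(columns[0])
    # Rearrange the columns into an m x n matrix given as a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

Df = jacobian(f, (1.0, 2.0))
for row in Df:
    print(row)
```

The rows printed are close to (2, 0) and (0, 4), which is the matrix A found in Example 6.2.3. For this f the partial derivatives are continuous everywhere, so the matrix does represent f′(x0).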
Derivatives vs. partial derivatives
Theorem 6.3.8: Let E be a subset of R^n, f : E → R^m be a
function, F ⊆ E, and x0 be an interior point of F. If all the partial
derivatives ∂f/∂xj exist on F and are continuous at x0,
then f is differentiable at x0, and the linear transformation
f′(x0) : R^n → R^m is defined, for every vector v = (vj)_{1≤j≤n} ∈ R^n,
by:
\[ f'(x_0)v = f'(x_0)(v_j)_{1 \le j \le n} = \sum_{j=1}^{n} v_j \frac{\partial f}{\partial x_j}(x_0). \]
The textbook gives a rigorous proof of Theorem 6.3.8, but we will
omit it here.
