Vector Calculus


Contents

1 Vectors
  1.1 Points in Space
  1.2 Distance between Points
  1.3 Scalars and Vectors
  1.4 Vectors and Coordinate Systems
  1.5 Vector Operations
  1.6 Revision of Vectors
  1.7 Quiz

2 Topics in 3D Geometry
  2.1 Planes
  2.2 Lines
  2.3 Surfaces
  2.4 Functions of several variables
  2.5 Change of Coordinates

3 Partial Derivatives
  3.1 First Order Partial Derivatives
  3.2 Higher Order Partial Derivatives
  3.3 Chain Rule
  3.4 Maxima and Minima

  4.1 Gradient
  4.2 Directional Derivative
  4.3 Curl and Divergence
  4.4 Laplacian

5 Linear Equations
  5.1 Introduction
  5.2 Row Echelon Form
  5.3 Solving Systems of Linear Equations

  6.1 Linear Combination of Vectors
  6.2 Linear Independence of Vectors
  6.3 Orthogonal and orthonormal vectors
  6.4 Gram-Schmidt Process

7 Matrices
  7.1 Introduction
  7.2 Matrix Operations
  7.3 Special matrices
  7.4 Determinants
  7.5 Inverses

  8.1 Introduction
  8.2 Method to find eigenvalues and eigenvectors
  8.3 Dimension of Eigenspace
  8.4 Diagonalization
  8.5 Diagonalization of symmetric matrices

  9.1 Operators
  9.2 Linear Operators
  9.3 Composing Operators
  9.4 Commutators

  10.1 Introduction
  10.2 Definition of a Fourier Series
  10.3 Why the coefficients are found as they are
  10.4 Fourier Series
  10.5 Odd and even functions
  10.6 Functions with period 2L
  10.7 Parseval's Result

1 Vectors

1.1 Points in Space

A point in two dimensional (2D) space is represented on the XY -plane using

two coordinates (x, y).

3D Coordinate system

The coordinate system in 3D space consists of three axes X,Y and Z. To

determine the directions of these axes, we follow the right hand rule:

We first define a plane in 3D space as the XY plane. Then place your right

hand on the X-axis, curl your fingers in the direction of the Y -axis, and let

your thumb stick out. Your thumb then points in the direction of the positive

Z axis.

A point in 3D space is represented using three coordinates (x, y, z).

1.2 Distance between Points

In 2D-space, the distance between two points P(x1, y1) and Q(x2, y2) is

d(P, Q) = √((x2 − x1)² + (y2 − y1)²).

Similarly in 3D-space, the distance between two points P(x1, y1, z1) and Q(x2, y2, z2) is

d(P, Q) = √((x2 − x1)² + (y2 − y1)² + (z2 − z1)²).

Example. Find the distance between the points P(1, 2, 3) and Q(0, 0, 1).

d(P, Q) = √((0 − 1)² + (0 − 2)² + (1 − 3)²) = √9 = 3.
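The original notes contain no code; as a quick numerical check of the distance formula (the function name is our own), it can be sketched in Python:

```python
import math

def distance(p, q):
    """Euclidean distance between two points of the same dimension."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

print(distance((1, 2, 3), (0, 0, 1)))  # the worked example above: 3.0
```

The same function works in 2D, since it sums over however many coordinates the points have.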

1.3 Scalars and Vectors

Certain quantities in physical applications of mathematics are represented by vectors. To understand what a vector is, it is perhaps best to first understand what a scalar is:

Definition. A scalar quantity is defined completely by a single number, e.g.

length, area, volume, mass, time etc. It has only magnitude.

Definition. (Physical) A vector quantity is defined by magnitude and

direction, e.g. force, velocity, acceleration.

Example. 1. Force is represented by a vector; the direction of the vector describes the direction in which the force is applied, and the magnitude of the vector indicates the strength applied.

2. Velocity of a car is represented by a vector; the direction gives the

direction of motion and the magnitude gives the speed of the car.

Note. Two vectors with the same length and same direction are equal; we do not care about their position in space.

Vectors can occur in two, three or even higher dimensions.

Definition. (Geometric) A vector in 2D-space can be represented by a

directed line segment, with an initial point and a terminal point. We denote

a vector by an overhead arrow, e.g. ~v. Then the magnitude of ~v is given by the length of the directed line segment, and is denoted by |~v|, and the direction of ~v is given by the direction of the line segment.

Example. Plot the vector P~Q with initial point P (2, 3) and terminal point

Q(−1, 4). Find the magnitude of the vector P~Q.

|P~Q| = √((−1 − 2)² + (4 − 3)²) = √10

Example. We can also represent the vector P~Q as the difference of the vectors OQ~ and OP~ (use diagram):

P~Q = OQ~ − OP~ = (−1, 4) − (2, 3) = (−3, 1)

Then the directed line segment from O(0, 0) to R(−3, 1) also represents the

vector P~Q. Verify that the magnitude of both line segments is the same.

|OR~| = √((−3 − 0)² + (1 − 0)²) = √10

The two vectors also point in the same direction.

Definition. (Algebraic) A vector in 2D-space is represented by two com-

ponents (x, y), indicating the change in the X and Y directions from the

origin (0, 0).

Note. When we say a point P(−3, 1) in 2D-space, we mean the point with x-coordinate −3 and y-coordinate 1. When we say the vector ~v = (−3, 1) in 2D-space, we mean the directed line segment from the point (0, 0) to the point (−3, 1), which has magnitude or length √((−3)² + 1²) = √10 and points in the direction of the line segment.

Definition. A vector in 3D-space is given by a three tuple (x, y, z), indi-

cating the change in the X, Y and Z directions from the origin.

Note. We represent a vector ~v = (x, y, z) in 3D-space as the directed line segment from the point (0, 0, 0) to the point (x, y, z), which has length √(x² + y² + z²) and points in the direction of the line segment.

Example. Plot the vector ~v = (1, 2, 3) and determine its length.

1.5 Vector Operations

Addition and Subtraction

We add and subtract vectors component-wise.

Example. Let ~v1 = (1, 0, 1) and ~v2 = (−1, 2, 3). Then

~v1 + ~v2 = (1 − 1, 0 + 2, 1 + 3) = (0, 2, 4)

and

~v1 − ~v2 = (1 − (−1), 0 − 2, 1 − 3) = (2, −2, −2).

Scalar Multiplication

If c is a real number and ~v = (x, y, z), then

c~v = (cx, cy, cz).

This is known as scaling: we change the length of ~v but not the line along which it points (a negative c reverses the direction).

|c~v| = √(c²x² + c²y² + c²z²) = |c| √(x² + y² + z²) = |c| |~v|

Special vectors

The zero vector: ~o = (0, 0, 0).

Definition. A unit vector is a vector with length equal to one. i.e. |~v | = 1.

In 2D space, the unit vectors in the X and Y directions are special and denoted by:

~i = (1, 0)
~j = (0, 1)

In 3D space, the unit vectors in the X, Y and Z directions are denoted by:

~i = (1, 0, 0)
~j = (0, 1, 0)
~k = (0, 0, 1)

Example. Using the special unit vectors i, j, k, we get a different way of

representing a vector,

~v = (13, 6, −8)

= 13(1, 0, 0) + 6(0, 1, 0) − 8(0, 0, 1)

= 13~i + 6~j − 8~k

This representation is often useful and we will use both notations in the

course.

Example. Find a unit vector in the direction of the vector ~v = (6, 8, 10).
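The example above is left unsolved in the notes. As a numerical sketch (Python is not part of the notes), scaling a vector by the inverse of its length gives a unit vector:

```python
import math

def unit(v):
    """Scale a vector by the inverse of its length to get a unit vector."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

u = unit((6, 8, 10))               # length is sqrt(200) = 10*sqrt(2)
print(u)
print(sum(c * c for c in u))       # squared length of the result: 1 (up to rounding)
```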

Dot Product

Definition. The dot product of 2 vectors ~a = (a1 , a2 , a3 ) and ~b = (b1 , b2 , b3 )

in 3-dimensional space is defined to be the scalar

~a.~b = a1 b1 + a2 b2 + a3 b3 .

On the other hand, if θ is the angle between the vectors ~a and ~b, using the Law of Cosines on the triangle with sides ~a, ~b, ~a − ~b we get:

(a1 − b1)² + (a2 − b2)² + (a3 − b3)² = a1² + a2² + a3² + b1² + b2² + b3² − 2|~a||~b| cos θ
a1b1 + a2b2 + a3b3 = |~a||~b| cos θ
~a · ~b = |~a||~b| cos θ

Remark. The two main applications of taking dot products;

(i) to find the angle between vectors and

(ii) to determine if two vectors are perpendicular.

Example. Find the dot product of ~a = (1, −1, √3) and ~b = (2, −1, 0). Also find the angle between the two vectors.

Solution:

~a · ~b = 1·2 + (−1)(−1) + √3·0 = 3
|~a| = √(1 + 1 + 3) = √5
|~b| = √(4 + 1) = √5
cos θ = (~a · ~b) / (|~a||~b|) = 3/5
θ = cos⁻¹(3/5)
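The angle computation above can be checked numerically (a sketch only; the notes themselves do no computation):

```python
import math

def dot(a, b):
    """Component-wise dot product."""
    return sum(x * y for x, y in zip(a, b))

a = (1, -1, math.sqrt(3))
b = (2, -1, 0)
cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
print(dot(a, b))              # 3.0
print(cos_theta)              # ~0.6, i.e. 3/5
print(math.acos(cos_theta))   # the angle theta in radians
```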

Proposition. Let ~a and ~b be non-zero vectors. The vectors ~a and ~b are perpendicular to each other if and only if ~a · ~b = 0.

Proof: Since the vectors are non-zero, |~a| ≠ 0 and |~b| ≠ 0. Let θ be the angle between the two vectors.

~a · ~b = 0
⇔ |~a||~b| cos θ = 0
⇔ cos θ = 0 (since |~a| ≠ 0 and |~b| ≠ 0, we can divide the equation by them)
⇔ θ = π/2

Example. Determine if the vectors ~a = (1, −1, 1) and ~b = (1, 2, 1) are perpendicular to each other.

Solution: ~a · ~b = (1)(1) + (−1)(2) + (1)(1) = 1 − 2 + 1 = 0. Therefore by the above proposition, the vectors are perpendicular to each other.

Cross Product

Definition. Given 2 vectors ~a = (a1, a2, a3) and ~b = (b1, b2, b3) in 3-dimensional space, the cross product ~a × ~b is a vector of magnitude |~a||~b| sin θ in the direction of the unit vector ~n perpendicular to both ~a and ~b, determined by the right-hand rule: place your right hand on ~a and curl it towards ~b; ~n now points in the direction of your thumb. This is the geometric definition of the cross-product.

There is also an algebraic definition of the cross product, but before we can define it, we need to be able to compute determinants!

The determinant of a 2 × 2 matrix is defined by:

| a b |
| c d | = ad − bc.

Example:

| 1  3 |
| −5 6 | = 1·6 − 3·(−5) = 21.

The determinant of a 3 × 3 matrix is computed by expanding along the first row in terms of 2 × 2 determinants:

| a11 a12 a13 |
| a21 a22 a23 | = a11(a22·a33 − a32·a23) − a12(a21·a33 − a31·a23) + a13(a21·a32 − a31·a22)
| a31 a32 a33 |

Example:

| 4   2  1 |
| −2 −6  3 | = 4((−6)·0 − 3·5) − 2((−2)·0 − 3·(−7)) + 1((−2)·5 − (−6)(−7))
| −7  5  0 |

= 4(−15) − 2(21) + (1)(−52)
= −154
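The cofactor expansion above translates directly into code. A minimal sketch (not from the notes) that reproduces the worked example:

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def det3(m):
    """Cofactor expansion of a 3x3 matrix along the first row, as in the notes."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    return (a11 * det2(a22, a23, a32, a33)
            - a12 * det2(a21, a23, a31, a33)
            + a13 * det2(a21, a22, a31, a32))

print(det3([(4, 2, 1), (-2, -6, 3), (-7, 5, 0)]))  # -154
```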

Definition. Given 2 vectors ~a = (a1 , a2 , a3 ) and ~b = (b1 , b2 , b3 ) in 3-dimensional

space the cross product ~a × ~b can also be defined as follows:

~i ~j ~k

~a × ~b = a1 a2 a3

b1 b2 b3

where ~i = (1, 0, 0), ~j = (0, 1, 0), ~k = (0, 0, 1) are unit vectors along the X,

Y , Z axes respectively. However while calculating the determinant they only

have symbolic value.

Remark. The cross product operates on 2 vectors and produces a new vec-

tor.

Example. Consider the two vectors ~a = (2, 1, 0) and ~b = (−1, 1, 0) in 3-dimensional space. Find the cross-product of the two vectors and verify that it is indeed perpendicular to both ~a and ~b. Also find the magnitude of ~a × ~b.

Solution:

          | ~i ~j ~k |
~a × ~b = |  2  1  0 |
          | −1  1  0 |

= (1·0 − 0·1)~i − (2·0 − 0·(−1))~j + (2·1 − 1·(−1))~k
= 0~i − 0~j + 3~k
= 0(1, 0, 0) − 0(0, 1, 0) + 3(0, 0, 1)
= (0, 0, 3)

Observe that the two vectors lie in the XY -plane and the cross product lies

on the Z-axis and is indeed perpendicular to the plane containing the two

vectors. But we can verify this more concretely using the dot product. Let

~n = ~a × ~b = (0, 0, 3).

~n · ~a = (2, 1, 0).(0, 0, 3) = 0 + 0 + 0 = 0.

~n · ~b = (−1, 1, 0).(0, 0, 3) = 0 + 0 + 0 = 0.

Since both the dot products are zero, we can conclude that ~a × ~b is perpen-

dicular to both ~a and ~b.

Finally the magnitude of ~a × ~b is given by,

|~a × ~b| = √(0 + 0 + 9) = 3.

The direction of ~a × ~b is perpendicular to both ~a and ~b. In particular it points in the direction of your thumb if the fingers of your right hand curl from ~a to ~b.

The magnitude |~a × ~b| = |~a||~b| sin θ, where θ is the angle between ~a and ~b. It is the area of the parallelogram determined by ~a and ~b.
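The component formula obtained from the determinant definition, together with the perpendicularity check via dot products, can be sketched as follows (an illustration, not part of the notes):

```python
def cross(a, b):
    """Cross product via the component formula from the determinant definition."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a, b = (2, 1, 0), (-1, 1, 0)
n = cross(a, b)
print(n)                      # (0, 0, 3), as in the worked example
print(dot(n, a), dot(n, b))   # both 0: n is perpendicular to a and b
```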

Example. Find a unit vector in the direction of each of the following vectors. All we need to do is scale each vector by the inverse of its length.

(i) ~v1 = (2, −3).
The length of ~v1 is |~v1| = √(2² + (−3)²) = √13.
A unit vector u1 in the direction of v1 may then be obtained as follows:

u1 = (1/√13)(2, −3) = (2/√13, −3/√13).

(ii) ~v2 = (1, −2, 5).
The length of ~v2 is |~v2| = √(1² + (−2)² + 5²) = √30.
A unit vector u2 in the direction of v2 may then be obtained as follows:

u2 = (1/√30)(1, −2, 5) = (1/√30, −2/√30, 5/√30).

(iii) ~v3 = (1, 1, 1, 1).
The length of ~v3 is |~v3| = √(1² + 1² + 1² + 1²) = √4 = 2.
A unit vector u3 in the direction of v3 may then be obtained as follows:

u3 = (1/2)(1, 1, 1, 1) = (1/2, 1/2, 1/2, 1/2).

Example. Consider the two vectors ~a = (2, 1, −4) and ~b = (3, −2, 5) in 3-dimensional space. Find the cross-product of the two vectors.

Solution:

          | ~i ~j ~k |
~a × ~b = |  2  1 −4 |
          |  3 −2  5 |

= (1·5 − (−4)(−2))~i − (2·5 − (−4)·3)~j + (2·(−2) − 1·3)~k
= −3~i − 22~j − 7~k
= (−3, −22, −7)

1.7 Quiz

1. Find d/dx (tan(x²) cos(eˣ)).

Solution:

d/dx (tan(x²) cos(eˣ)) = tan(x²) d/dx (cos(eˣ)) + cos(eˣ) d/dx (tan(x²))
= tan(x²)(− sin(eˣ))eˣ + cos(eˣ)(sec²(x²))2x

2. Find the stationary points of f(x) = x³ − 3x.

Solution: Let f(x) = x³ − 3x. Since f(x) is differentiable, the stationary points of f(x) are the solutions of the equation f′(x) = 0.

f′(x) = 3x² − 3
3x² − 3 = 0
3(x − 1)(x + 1) = 0
x = 1, −1

3. Find ∫ x ln x dx.

Solution: Integrate by parts with

u = ln(x),   dv = x dx
du = dx/x,   v = x²/2

∫ x ln(x) dx = ∫ u dv = uv − ∫ v du
= (x² ln(x))/2 − ∫ (x²/2)(dx/x)
= (x² ln(x))/2 − ∫ (x/2) dx
= (x² ln(x))/2 − x²/4 + C.

4. Find ∫ (1 − x²)^(−1/2) dx.

Solution: Recall the standard integral

∫ 1/√(a² − u²) du = sin⁻¹(u/a) + c.

Therefore,

∫ (1 − x²)^(−1/2) dx = sin⁻¹(x) + c.

5. Find the determinant of the matrix with rows (2, −1) and (−6, 2).

Solution:

| 2  −1 |
| −6  2 | = (2)(2) − (−1)(−6) = 4 − 6 = −2

6. Let ~u = (2, 3, 5) and ~v = (1, 1, −1).

(a) A unit vector in the direction of ~v:

(1/√(1² + 1² + (−1)²)) (1, 1, −1) = (1/√3)(1, 1, −1) = (1/√3, 1/√3, −1/√3).

(b)

3~u − 5~v = 3(2, 3, 5) − 5(1, 1, −1) = (6, 9, 15) − (5, 5, −5) = (1, 4, 20).

(c)

~u · ~v = (2, 3, 5).(1, 1, −1) = 2 + 3 − 5 = 0.

(d)

          | ~i ~j ~k |
~u × ~v = |  2  3  5 |
          |  1  1 −1 |

= (3(−1) − (5)(1))~i − (2(−1) − (5)(1))~j + ((2)(1) − (3)(1))~k
= −8~i + 7~j − ~k
= (−8, 7, −1)

7. Determine if ~u = (−2, 2, 1, −1) is perpendicular to ~v = (−2, −3, 1, 1).

Solution: ~u · ~v = (−2)(−2) + (2)(−3) + (1)(1) + (−1)(1) = 4 − 6 + 1 − 1 = −2 ≠ 0, so the vectors are not perpendicular.

Summary

1. The dot product of 2 vectors ~a = (a1 , a2 , a3 ) and ~b = (b1 , b2 , b3 ) in

3-dimensional space is defined to be the scalar

~a.~b = a1 b1 + a2 b2 + a3 b3 .

or

~a.~b = |~a|.|~b| cos θ.

2. Let ~a and ~b be non-zero vectors. The vectors, ~a and ~b, are perpendic-

ular to each other if and only if ~a · ~b = 0.

3. Given 2 vectors ~a = (a1, a2, a3) and ~b = (b1, b2, b3) in 3-dimensional space, the cross product ~a × ~b can be defined as follows:

          | ~i ~j ~k |
~a × ~b = | a1 a2 a3 |
          | b1 b2 b3 |

where ~i = (1, 0, 0), ~j = (0, 1, 0), ~k = (0, 0, 1) are unit vectors along the X, Y, Z axes respectively.

4. The direction of ~a × ~b is perpendicular to both ~a and ~b. In particular it points in the direction of your thumb if the fingers of your right hand curl from ~a to ~b.

5. The magnitude |~a × ~b| = |~a||~b| sin θ is the area of the parallelogram determined by ~a and ~b.

References

1. A complete set of notes on Pre-Calculus, Single Variable Calculus, Multivariable Calculus and Linear Algebra. Here is a link to the chapter on Vectors.
http://tutorial.math.lamar.edu/Classes/CalcII/VectorsIntro.aspx

2. http://people.usd.edu/ jflores/MultiCalc02/WebBook/Chapter1 3/Graphics/Chapter131 /Demo

3. Examples and demos on related topics, from the University of Minnesota.
http://www.math.umn.edu/ nykamp/m2374/readings/completelist.html

4. http://www.calculus.org/

2 Topics in 3D Geometry

In two dimensional space, we can graph curves and lines. In three dimensional

space, there is so much extra space that we can graph planes and surfaces in

addition to lines and curves. Here we will have a very brief introduction to

Geometry in three dimensions.

2.1 Planes

Just as it is easy to write the equation of a line in 2D space, it is easy to

write the equation of a plane in 3D space.

A vector perpendicular to a plane is said to be normal to the plane and is

called a normal vector, or simply a normal.

Suppose we are given a point P(x0, y0, z0) on the plane and a normal vector ~n = (a, b, c) to the plane. Then a point Q(x, y, z) lies on the plane,

⇔ ~n · P~Q = 0,
⇔ (a, b, c) · (x − x0, y − y0, z − z0) = 0.

Definition. The point-normal equation of a plane that contains the point P(x0, y0, z0) and has normal vector ~n = (a, b, c) is

a(x − x0) + b(y − y0) + c(z − z0) = 0.

Example. Consider the plane which contains the three points A = (1, 2, 3), B = (2, 3, 4), and C = (−2, 0, 3). Find a vector which is normal to the plane. Find an equation of the plane.

Solution: We need a point on the plane and a normal to the plane. The vector AB~ × AC~ = (2, −3, 1) is a normal to the plane and we take A = (1, 2, 3) as a point on the plane (you can choose B or C instead of A if you want).

The equation on the plane in point-normal form is:

2(x − 1) − 3(y − 2) + (z − 3) = 0

or equivalently,

2x − 3y + z = −1

Observe that the coefficients of x, y and z are (2, −3, 1) which is the normal

to the plane.
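The normal-via-cross-product recipe in the example can be checked numerically (an illustrative sketch; the notes do this by hand):

```python
def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

A, B, C = (1, 2, 3), (2, 3, 4), (-2, 0, 3)
AB = tuple(b - a for a, b in zip(A, B))    # (1, 1, 1)
AC = tuple(c - a for a, c in zip(A, C))    # (-3, -2, 0)
n = cross(AB, AC)                          # normal vector (2, -3, 1)
d = -sum(ni * ai for ni, ai in zip(n, A))  # constant term in n.x + d = 0
print(n, d)                                # (2, -3, 1) 1, i.e. 2x - 3y + z + 1 = 0
```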

2.2 Lines

Vector equation of a line

To write the vector equation of a line, we need a point P (x0 , y0 , z0 ) on the

line and a vector ~v = (a, b, c) that is parallel to the line.

Definition. The vector equation of a line that contains the point P (x0 , y0 , z0 )

and is parallel to the vector ~v = (a, b, c) is:

P + t~v = ~r, where t is a scalar.

or,

(x0 , y0 , z0 ) + t(a, b, c) = (x, y, z)

(x0 + ta, y0 + tb, z0 + tc) = (x, y, z)

The parametric equation of a line is derived from the vector equation of a

line.

Definition. The parametric equation of a line that contains the point P (x0 , y0 , z0 )

and is parallel to the vector ~v = (a, b, c) is:

x = x0 + ta

y = y0 + tb

z = z0 + tc

Example. Let L be the line which passes through the points P(1, 1, 1) and Q(3, 2, 1).

Find a vector which is parallel to the line. Find the vector-equation and

parametric equation of the line.

Solution: The vector P~Q = (2, 1, 0) is parallel to the line and we take the

point P (1, 1, 1) on the line.

The vector equation of the line is (1, 1, 1) + t(2, 1, 0) = (x, y, z), and the parametric equations are:

x = 1 + 2t
y = 1 + t
z = 1
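The parametric form can be evaluated directly: plugging in t-values traces out points on the line. A small sketch (not part of the notes):

```python
def line_point(P, v, t):
    """Point on the line through P parallel to v, i.e. P + t*v."""
    return tuple(p + t * c for p, c in zip(P, v))

P, v = (1, 1, 1), (2, 1, 0)
print(line_point(P, v, 0))  # (1, 1, 1): the point P itself
print(line_point(P, v, 1))  # (3, 2, 1): the point Q from the example
```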

Example. Find the equation of the plane which contains the point (0, 1, 2)

and is perpendicular to the line (1, 1, 1) + t(2, 1, 0) = (x, y, z).

2.3 Surfaces

The graph in 3D space of an equation in x, y and z is a surface. Often the

graph is too difficult to draw, but here we sketch the graph of a few special

types of equations whose graphs are easy to visualize.

Cylindrical surfaces

The graph in 3D space of an equation containing only one or two of the three

variables x, y, z is called a cylindrical surface.

Example. Plot y = x2 .

Plot x2 + y 2 = 5.

Quadric Surfaces

The graph in 2D space of a second degree equation in x and y is an ellipse,

parabola or hyperbola. In 3D space, the graph of a second degree equation

in x, y and z is one of six quadric surfaces.

1. Ellipsoid: x²/a² + y²/b² + z²/c² = 1

2. Elliptic Cone: z² = x²/a² + y²/b²

3. Elliptic Paraboloid: z = x²/a² + y²/b²

4. Hyperbolic Paraboloid: z = y²/b² − x²/a²

5. Elliptic Hyperboloid of one sheet: x²/a² + y²/b² − z²/c² = 1

6. Elliptic Hyperboloid of two sheets: −x²/a² − y²/b² + z²/c² = 1

2.4 Functions of several variables

So far you have studied about functions of one variable, e.g.

f (x) = x + x2 .

You have learned how to graph these functions, perhaps how to determine

their domain and range. You have gone much further in your quest to under-

stand functions; you have learned how to differentiate them, then to calculate

the maximum and minimum, then to integrate them. At the end of this term

(in March) we will learn how to write a function as a sum of simpler func-

tions, i.e. a Fourier Series Expansion of a function.

However now we will learn something new. We will learn about functions of several variables, e.g.

f(x, y) = x² + y².

You will very quickly see that although the concepts are new, the techniques are old and familiar.

To draw the graph of f(x) = x², we drew the graph of the equation y = x². Similarly, to draw the graph of f(x, y) = x² + y², we draw the graph of the equation

z = x² + y².

We now use the methods developed in the last lecture to draw graphs. Note

however that it is difficult to graph general surfaces.

Remark. The graph of a function f (x) of one variable is the graph of the

equation y = f (x), a curve in 2D space. The graph of a function f (x, y) of

two variables is the graph of the equation z = f (x, y), a surface in 3D space.

The graph of a function f (x, y, z) is a set of points in 4D space and we cannot

draw the graph.

One way to understand functions of two or more variables is by using

level sets.

Example. An example of level sets is a topographic map, which maps hills

and valleys in a region by drawing curves indicating height or elevation. If

h(x, y) is the height function over a region, say the Himalayas, then if we

mark all the points (x, y) on the ground at which the height of the mountain

is h(x, y) = 3000m, we get the level set for L = 3000. If we draw the level

sets for different heights, e.g 2000m, 3000m, 4000m, 5000m, 6000m, 7000m,

8000m, we get a rough topographic map for the Himalayas.

Note that we draw the level sets on the ground, i.e. in the domain.

Definition. We fix a number C. The level set of a function of two variables

f (x, y) is the set of points (x, y) in the domain which satisfy the equation

f (x, y) = C.

In general, for a fixed number C and a function of several variables f , we

define the level set to be the collection of points in the domain which satisfy

the equation f = C.

Example. Let f (x) = x2 and say the constant c = 1. The level set is the

set of points such that

x² = 1
x = 1, −1

Example. Let f(x, y) = x² + y² and consider the level sets for C = 1, 4, 9:

C = 1: x² + y² = 1; a circle of radius 1
C = 4: x² + y² = 4; a circle of radius 2
C = 9: x² + y² = 9; a circle of radius 3

Example. Let f (x, y, z) = x2 + y 2 + z 2 and say the constant c = 1. The

level set is the set of points such that

x2 + y 2 + z 2 = 1

Hence the level set for c = 1 is the set of points lying on the unit sphere.

2.5 Change of Coordinates

Before we describe cylindrical and spherical coordinate systems, we will recall the polar coordinate system in 2D space.

Polar Coordinates

The polar coordinate system is equivalent to the rectangular coordinate system. It locates points using two coordinates r and θ. The coordinate r is the distance from a point to the origin, and θ is the angle used in trigonometry, which measures the counterclockwise rotation from the positive X-axis.

Conversion from rectangular to polar coordinates:
Let (x, y) be a point in the rectangular coordinate system.

r = √(x² + y²)
tan θ = y/x

Conversion from polar to rectangular coordinates:
Let (r, θ) be a point in the polar coordinate system.

x = r cos θ
y = r sin θ

Example. Express in polar coordinates the portion of the unit disc that lies

in the first quadrant.

Solution: The region may be expressed in polar coordinates as

0 ≤ r ≤ 1; 0 ≤ θ ≤ π/2

Example. Express in polar coordinates the function

f (x, y) = x2 + y 2 + 2yx.

Cylindrical Coordinates

This is a three dimensional extension of plane polar coordinates. Cylindrical

coordinates are given by the 3-tuple (r, θ, z), the polar coordinates of the

X, Y plane and the rectangular coordinate z. Given a point (x, y, z) in 3-

dimensional space, to calculate r and θ we project the point to the XY-plane

(x, y, z) 7→ (x, y). Then calculate (r, θ) as in the previous section.

Cylindrical to Rectangular Conversion Formulas:

x = r cos θ
y = r sin θ
z = z

Rectangular to Cylindrical Conversion Formulas:

r = √(x² + y²)
θ = tan⁻¹(y/x)
z = z

Example. Express in cylindrical coordinates the function

f(x, y, z) = x² + y² + z² − 2z√(x² + y²) = r² + z² − 2zr.

Example. The equation r = sin θ describes a surface in cylindrical coordinates.

Spherical coordinates

Spherical coordinates consist of the 3-tuple (ρ, θ, φ). These are determined as follows:

1. ρ = the distance from the origin to the point.
2. θ = the same angle that we saw in polar/cylindrical coordinates.
3. φ = the angle between the positive z-axis and the line from the origin to the point.

Spherical to Rectangular Conversion Formulas:

x = ρ sin(φ) cos(θ)
y = ρ sin(φ) sin(θ)
z = ρ cos(φ)

Rectangular to Spherical Conversion Formulas:

ρ = √(x² + y² + z²)
θ = tan⁻¹(y/x)
φ = cos⁻¹(z/ρ)

Example. Express in spherical coordinates the function

f(x, y, z) = x² + y² + z² = ρ².

Example. The equation ρ = 5 describes a sphere of radius 5 centred at the origin.
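The conversion formulas can be checked with a round trip from rectangular to spherical coordinates and back (a sketch, not part of the notes; `atan2` is used instead of tan⁻¹(y/x) so the quadrant comes out right):

```python
import math

def to_spherical(x, y, z):
    """Rectangular -> spherical (rho, theta, phi)."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)      # quadrant-aware version of tan^-1(y/x)
    phi = math.acos(z / rho)      # angle from the positive z-axis
    return rho, theta, phi

def to_rectangular(rho, theta, phi):
    """Spherical -> rectangular, using the formulas above."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

p = (1.0, 2.0, 2.0)
print(to_spherical(*p))                    # rho = 3.0 for this point
print(to_rectangular(*to_spherical(*p)))   # round trip recovers (1, 2, 2)
```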

Summary

1. The point-normal equation of a plane that contains the point P (x0 , y0 , z0 )

and has normal vector ~n = (a, b, c) is

a(x − x0 ) + b(y − y0 ) + c(z − z0 ) = 0.

2. The vector equation of a line that contains the point P(x0, y0, z0) and is parallel to the vector ~v = (a, b, c) is:

P + t~v = ~r, where t is a scalar.

3. The parametric equation of a line that contains the point P(x0, y0, z0) and is parallel to the vector ~v = (a, b, c) is:

x = x0 + ta
y = y0 + tb
z = z0 + tc

4. For a fixed number C, the level set of a function of two variables f(x, y) is the set of points (x, y) in the domain which satisfy the equation f(x, y) = C.

5. The polar coordinate system has two coordinates r and θ. The coordinate r is the distance from a point to the origin, and θ is the angle used in trigonometry which measures the counterclockwise rotation from the positive X-axis.

Conversion from rectangular to polar coordinates:
Let (x, y) be a point in the rectangular coordinate system.

r = √(x² + y²)
tan θ = y/x

Conversion from polar to rectangular coordinates:
Let (r, θ) be a point in the polar coordinate system.

x = r cos θ
y = r sin θ

6. Cylindrical Coordinate System is a three dimensional extension of

plane polar coordinates. Cylindrical coordinates are given by the 3-tuple

(r, θ, z), the polar coordinates of the X, Y plane and the rectangular

coordinate z.

Cylindrical to Rectangular Conversion Formulas:

x = r cos θ
y = r sin θ
z = z

Rectangular to Cylindrical Conversion Formulas:

r = √(x² + y²)
θ = tan⁻¹(y/x)
z = z

7. Spherical coordinates consist of the 3-tuple (ρ, θ, φ). These are deter-

mined as follows:

1. ρ = the distance from the origin to the point.

2. θ = the same angle that we saw in polar/cylindrical coordinates.

3. φ = the angle between the positive z-axis and the line from the origin

to the point.

Spherical to Rectangular Conversion Formulas:

x = ρ sin(φ) cos(θ)
y = ρ sin(φ) sin(θ)
z = ρ cos(φ)

Rectangular to Spherical Conversion Formulas:

ρ = √(x² + y² + z²)
θ = tan⁻¹(y/x)
φ = cos⁻¹(z/ρ)

References

1. A complete set of notes on Pre-Calculus, Single Variable Calculus, Multivariable Calculus and Linear Algebra. Here is a link to the chapter on Lines, Planes and Quadric Surfaces. Also read the section on cylindrical and spherical coordinates.
http://tutorial.math.lamar.edu/Classes/CalcIII/3DSpace.aspx

2. Examples and demos on related topics, from the University of Minnesota.
http://www.math.umn.edu/%7Erogness/quadrics/

3. http://www.calculus.org/

3 Partial Derivatives

3.1 First Order Partial Derivatives

A function f(x) of one variable has a first order derivative denoted by f′(x) or

df/dx = lim(h→0) [f(x + h) − f(x)] / h.

It calculates the slope of the tangent line of the function f at x.

A function f(x, y) of two variables has two first order partial derivatives, ∂f/∂x and ∂f/∂y. Just like in the one variable case, the partial derivatives are also related to the tangent plane of the function at a point (x, y).

Definition. ∂f/∂x is defined as the derivative of f with respect to x with y treated as a constant:

∂f/∂x = lim(h→0) [f(x + h, y) − f(x, y)] / h.

∂f/∂y is defined as the derivative of f with respect to y with x treated as a constant:

∂f/∂y = lim(k→0) [f(x, y + k) − f(x, y)] / k.

Similarly for f(x, y, z) we can define three first order partial derivatives: ∂f/∂x, ∂f/∂y, ∂f/∂z.

Example. 1. f(x, y) = x²y³

∂f/∂x = ∂(x²y³)/∂x = y³ · ∂(x²)/∂x = 2xy³
∂f/∂y = ∂(x²y³)/∂y = 3x²y²

2. f(x, y, z) = z e^(2x+3y+4z)

3. f(x, y) = x² + y²

Example. Consider the function f(x, y) = x²/y. Calculate the first order partial derivatives and evaluate them at the point P(2, 1).

Solution:

f(x, y) = x²/y
∂f/∂x = 2x/y
∂f/∂x |(x=2, y=1) = 4
∂f/∂y = −x²/y²
∂f/∂y |(x=2, y=1) = −4

Remark. Partial derivatives are used in the same manner as the derivative of a function of one variable. The partial of f(x, y) with respect to x is the rate of change (or the slope) of f with respect to x as y stays constant. Similarly the partial of f(x, y) with respect to y is the rate of change (or the slope) of f with respect to y as x stays constant. For instance in the above example, the slope of the function at the point P(2, 1) in the x direction is 4, and the slope of the function at P in the y direction is −4.
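The limit definitions can be approximated numerically by holding the other variable fixed. A finite-difference sketch (not part of the notes) that recovers the values in the example:

```python
def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of df/dx with y held constant."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of df/dy with x held constant."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

f = lambda x, y: x * x / y
print(partial_x(f, 2, 1))  # ~4, matching the exact value 2x/y
print(partial_y(f, 2, 1))  # ~-4, matching the exact value -x^2/y^2
```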

Let f(x, y) be a function of two variables and let P(x0, y0, z0) be a point on the graph. The equation of the tangent plane to the graph of f(x, y) at the point P is given by

z − z0 = ∂f/∂x |(x0, y0) (x − x0) + ∂f/∂y |(x0, y0) (y − y0).

We will see why this is true after we study about Directional Derivative and

the Gradient. For now we accept it as a result and use it in some examples.

Example. We consider the function f (x, y) = x3 + 2xy − y + 3. Compute

the tangent plane at the point P0 = (1, 2, 6) on the graph of the function.

Solution: We first check that the point P0(x0, y0, z0) = (1, 2, 6) is on the graph of the function:

f(1, 2) = 1³ + 2·1·2 − 2 + 3 = 1 + 4 − 2 + 3 = 6

So the point P0 does indeed lie on the graph of f.

Then the equation of the tangent plane is given by:

z − z0 = ∂f/∂x |(x0, y0) (x − x0) + ∂f/∂y |(x0, y0) (y − y0)

∂f/∂x |(1, 2) = 3x² + 2y |(1, 2) = 7
∂f/∂y |(1, 2) = 2x − 1 |(1, 2) = 1

z − 6 = 7(x − 1) + 1(y − 2) = 7x − 7 + y − 2
z = 7x + y − 3

Therefore the tangent plane to the graph of f at (1, 2, 6) is

7x + y − z − 3 = 0.
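A tangent plane should agree with the graph exactly at the point of tangency and approximate it closely nearby. A quick sketch (not part of the notes) checking both properties for this example:

```python
def f(x, y):
    """The function from the example."""
    return x ** 3 + 2 * x * y - y + 3

def tangent_plane(x, y):
    """The plane z = 7x + y - 3 computed above."""
    return 7 * x + y - 3

# Exact agreement at the point of tangency (1, 2): both give z = 6.
print(f(1, 2), tangent_plane(1, 2))

# Near (1, 2) the gap shrinks quadratically in the offset.
eps = 1e-3
print(abs(f(1 + eps, 2 + eps) - tangent_plane(1 + eps, 2 + eps)))  # O(eps^2)
```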

3.2 Higher Order Partial Derivatives

If f is a function of several variables, then we can find higher order partials in the following manner.

Definition. If f(x, y) is a function of two variables, then ∂f/∂x and ∂f/∂y are also functions of two variables and their partials can be taken. Hence we can differentiate them with respect to x and y again and find:

∂²f/∂x², the derivative of f taken twice with respect to x,
∂²f/∂x∂y, the derivative of f with respect to y and then with respect to x,
∂²f/∂y∂x, the derivative of f with respect to x and then with respect to y,
∂²f/∂y², the derivative of f taken twice with respect to y.

We can carry on and find ∂³f/∂x∂y², which is taking the derivative of f first with respect to y twice and then with respect to x. In this manner we can find nth-order partial derivatives of a function.

Theorem. ∂²f/∂x∂y and ∂²f/∂y∂x are called mixed partial derivatives. They are equal when ∂²f/∂x∂y and ∂²f/∂y∂x are continuous.

Note. In this course all the functions we will encounter will have equal mixed partial derivatives.

Example. Find all the first and second order partial derivatives of f(x, y) = x⁴y² − x²y⁶.

Solution:

∂f/∂x = 4x³y² − 2xy⁶
∂f/∂y = 2x⁴y − 6x²y⁵
∂²f/∂x² = 12x²y² − 2y⁶
∂²f/∂y∂x = 8x³y − 12xy⁵
∂²f/∂y² = 2x⁴ − 30x²y⁴
∂²f/∂x∂y = 8x³y − 12xy⁵
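The equality of the mixed partials in this example can be checked numerically with a central-difference stencil (a sketch; the notes compute them symbolically):

```python
def f(x, y):
    """The function from the example."""
    return x ** 4 * y ** 2 - x ** 2 * y ** 6

def mixed(f, x, y, h=1e-4):
    """Central-difference estimate of the mixed second partial f_xy."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)

x, y = 1.0, 1.0
exact = 8 * x ** 3 * y - 12 * x * y ** 5   # f_xy = f_yx from the solution
print(mixed(f, x, y), exact)               # both close to -4
```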

Notations:

fx = ∂f/∂x
fy = ∂f/∂y
fxx = ∂²f/∂x²
fyy = ∂²f/∂y²
fxy = ∂²f/∂x∂y

3.3 Chain Rule

You are familiar with the chain rule for functions of one variable: if f is

a function of u, denoted by f = f (u), and u is a function of x, denoted

u = u(x). Then

df df du

= .

dx du dx

For a two-dimensional version, suppose z is a function of u and v, denoted z = z(u, v), where u and v are in turn functions of x and y. Then

∂z/∂x = (∂z/∂u)(∂u/∂x) + (∂z/∂v)(∂v/∂x)
∂z/∂y = (∂z/∂u)(∂u/∂y) + (∂z/∂v)(∂v/∂y)

(Dependence diagram: z depends on u and v, and u and v each depend on x and y. Each term of the chain rule corresponds to one path from z down to x or y, multiplying the partial derivatives along the path.)

Example. Let z = z(u, v), where

u = xy
v = 2x + 3y.

Find ∂z/∂x and ∂z/∂y.

Solution:

∂z/∂x = (∂z/∂u)(∂u/∂x) + (∂z/∂v)(∂v/∂x) = y ∂z/∂u + 2 ∂z/∂v

∂z/∂y = (∂z/∂u)(∂u/∂y) + (∂z/∂v)(∂v/∂y) = x ∂z/∂u + 3 ∂z/∂v
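The chain rule result can be tested against a direct finite difference for any concrete outer function; the choice z = sin(u) + v below is our own, made only for the check (not from the notes):

```python
import math

# Hypothetical outer function for the check: z = sin(u) + v.
z  = lambda u, v: math.sin(u) + v
zu = lambda u, v: math.cos(u)   # dz/du
zv = lambda u, v: 1.0           # dz/dv

def dz_dx(x, y):
    """Chain rule: dz/dx = zu * du/dx + zv * dv/dx, with u = xy, v = 2x + 3y."""
    u, v = x * y, 2 * x + 3 * y
    return zu(u, v) * y + zv(u, v) * 2

# Compare with a direct finite difference of the composite z(u(x,y), v(x,y)).
g = lambda x, y: z(x * y, 2 * x + 3 * y)
x, y, h = 0.7, 1.3, 1e-6
print(dz_dx(x, y))
print((g(x + h, y) - g(x - h, y)) / (2 * h))  # should agree closely
```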

To find second order partials, we can use the same techniques as first order

partials, but with more care and patience!

Example. Let z = z(u, v), where

u = x²y
v = 3x + 2y.

1. Find ∂²z/∂y².

Solution: We will first find ∂z/∂y:

∂z/∂y = (∂z/∂u)(∂u/∂y) + (∂z/∂v)(∂v/∂y) = x² ∂z/∂u + 2 ∂z/∂v.

Now differentiate again with respect to y to obtain

∂2z ∂ 2 ∂z ∂ ∂z

2

= (x )+ (2 )

∂y ∂y ∂u ∂y ∂v

∂ ∂z ∂ ∂z

= x2 ( ) + 2 ( )

∂y ∂u ∂y ∂v

Note that z is a function of u and v, and u and v are functions of x and y. Then the partial derivatives ∂z/∂u and ∂z/∂v are also initially functions of u and v and eventually functions of x and y. In other words they have the same dependence diagram as z.

Note. ∂/∂y (∂z/∂u) ≠ ∂²z/∂y∂u. Never write ∂²z/∂y∂u, as it is mathematically meaningless.

(Dependence diagrams: ∂z/∂u and ∂z/∂v each depend on u and v, and each of u and v depends on x and y, with edges labelled ∂²z/∂u², ∂²z/∂v∂u, ∂²z/∂u∂v, ∂²z/∂v², ∂u/∂x, ∂u/∂y, ∂v/∂x, ∂v/∂y.)

Therefore

∂²z/∂y² = x² (∂²z/∂u² ∂u/∂y + ∂²z/∂v∂u ∂v/∂y) + 2 (∂²z/∂u∂v ∂u/∂y + ∂²z/∂v² ∂v/∂y)
        = x² (x² ∂²z/∂u² + 2 ∂²z/∂v∂u) + 2 (x² ∂²z/∂u∂v + 2 ∂²z/∂v²)

The mixed partials are equal so the answer simplifies to

∂²z/∂y² = x⁴ ∂²z/∂u² + 4x² ∂²z/∂u∂v + 4 ∂²z/∂v².

2. Find ∂²z/∂x∂y.

Solution: We have

∂z/∂y = (∂z/∂u)(∂u/∂y) + (∂z/∂v)(∂v/∂y) = x² ∂z/∂u + 2 ∂z/∂v.

Now differentiate again with respect to x to obtain

∂²z/∂x∂y = ∂/∂x (x² ∂z/∂u) + ∂/∂x (2 ∂z/∂v)
         = x² ∂/∂x (∂z/∂u) + 2x ∂z/∂u + 2 ∂/∂x (∂z/∂v)
         = x² (∂²z/∂u² ∂u/∂x + ∂²z/∂v∂u ∂v/∂x) + 2x ∂z/∂u + 2 (∂²z/∂u∂v ∂u/∂x + ∂²z/∂v² ∂v/∂x)
         = x² (2xy ∂²z/∂u² + 3 ∂²z/∂v∂u) + 2x ∂z/∂u + 2 (2xy ∂²z/∂u∂v + 3 ∂²z/∂v²)
         = 2x³y ∂²z/∂u² + (3x² + 4xy) ∂²z/∂v∂u + 2x ∂z/∂u + 6 ∂²z/∂v²
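Both second order formulas can be verified the same way, with an arbitrary concrete choice of z (a sketch, sympy assumed):

```python
import sympy as sp

x, y, u, v = sp.symbols('x y u v')
z = sp.sin(u) * v + u**3          # arbitrary smooth test function z(u, v)
subs = {u: x**2 * y, v: 3*x + 2*y}
z_xy = z.subs(subs)

# formula derived above: d2z/dy2 = x^4 z_uu + 4 x^2 z_uv + 4 z_vv
rhs = (x**4 * sp.diff(z, u, 2)
       + 4 * x**2 * sp.diff(z, u, v)
       + 4 * sp.diff(z, v, 2)).subs(subs)

assert sp.simplify(sp.diff(z_xy, y, 2) - rhs) == 0
```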

3.4 Maxima and Minima

Recall from 1-dimensional calculus: to find the points of maxima and minima of a function, we first find the critical points, i.e. where the tangent line is horizontal, f′(x) = 0. Then

(i) If f″(x) > 0 the gradient is increasing and we have a local minimum.
(ii) If f″(x) < 0 the gradient is decreasing and we have a local maximum.

We are now interested in finding points of local maxima and minima for a

function of two variables.

Definition. A function f(x, y) has a relative minimum (resp. maximum) at the point (a, b) if f(x, y) ≥ f(a, b) (resp. f(x, y) ≤ f(a, b)) for all points (x, y) in some region around (a, b).

Definition. A point (a, b) is a critical point of a function f(x, y) if one of the following is true:

(i) fx(a, b) = 0 and fy(a, b) = 0
(ii) fx(a, b) and/or fy(a, b) does not exist.

Classification of Critical Points

We will need two quantities to classify the critical points of f(x, y):

1. fxx, the second partial derivative of f with respect to x.
2. H = fxx fyy − fxy², the Hessian.

If the Hessian is zero, then the critical point is degenerate. If the Hessian is non-zero, then the critical point is non-degenerate and we can classify the points in the following manner:

case(i) If H > 0 and fxx < 0 then the critical point is a relative maximum.
case(ii) If H > 0 and fxx > 0 then the critical point is a relative minimum.
case(iii) If H < 0 then the critical point is a saddle point.

Example. Find and classify the critical points of f(x, y) = 12x³ + 12x²y + y³ − 75y.

Solution:

fx = 36x² + 24xy = 12x(3x + 2y)
fy = 3y² + 12x² − 75 = 3(4x² + y² − 25)

Setting fx = 0 gives 12x(3x + 2y) = 0, so either x = 0 or 3x + 2y = 0.

case(i) x = 0.
Then substituting this in fy we get fy = 3(y² − 25) = 0, which implies y = ±5.

case(ii) 3x + 2y = 0.
Then y = −3x/2 and substituting this in fy we get

fy = 3(4x² + 9x²/4 − 25)
   = (3/4)(16x² + 9x² − 100)
   = (3/4)(25x² − 100)
   = (75/4)(x² − 4)

Setting fy = 0:

(75/4)(x² − 4) = 0
x² − 4 = 0
x = ±2

Thus we have found four critical points: (0, 5), (0, −5), (2, −3), (−2, 3).

We must now classify these points.

fxx = 72x + 24y = 24(3x + y)
fxy = 24x
fyy = 6y
H = fxx fyy − fxy² = 24(3x + y)(6y) − (24x)²

Point     fxx    H      Classification
(0, 5)    120    3600   Minimum
(0, −5)   −120   3600   Maximum
(2, −3)   72     −3600  Saddle
(−2, 3)   −72    −3600  Saddle
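Assuming the function f(x, y) = 12x³ + 12x²y + y³ − 75y reconstructed from the partials above, the critical points and the Hessian test can be reproduced with sympy (a sketch):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = 12*x**3 + 12*x**2*y + y**3 - 75*y   # assumed f, matching fx and fy above

fx, fy = sp.diff(f, x), sp.diff(f, y)
critical = sp.solve([fx, fy], [x, y], dict=True)

# Hessian H = fxx*fyy - fxy^2
H = sp.diff(f, x, 2) * sp.diff(f, y, 2) - sp.diff(f, x, y)**2

pts = {(p[x], p[y]) for p in critical}
assert pts == {(0, 5), (0, -5), (2, -3), (-2, 3)}   # the four critical points
```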

Summary

1. The First Order Partial Derivative ∂f/∂x is defined as the derivative of f with respect to x, with y treated as a constant. Similarly we can define ∂f/∂y.

2. ∂f/∂x and ∂f/∂y are also functions of two variables and hence their partials can be taken. We can differentiate them with respect to x and y again and find Higher Order Partial Derivatives:

∂²f/∂x², ∂²f/∂x∂y, ∂²f/∂y∂x, ∂²f/∂y², ∂³f/∂x³, etc.

3. Chain Rule for Partial Derivatives. Suppose z is a function of u and v, denoted z = z(u, v), and u and v are functions of x and y, u = u(x, y) and v = v(x, y). Then

∂z/∂x = (∂z/∂u)(∂u/∂x) + (∂z/∂v)(∂v/∂x),  ∂z/∂y = (∂z/∂u)(∂u/∂y) + (∂z/∂v)(∂v/∂y)

4. A point (a, b) is a critical point of a function f(x, y) if one of the following is true:

(i) fx(a, b) = 0 and fy(a, b) = 0
(ii) fx(a, b) and/or fy(a, b) does not exist.

5. We will need two quantities to classify the critical points of f(x, y):

1. fxx, the second partial derivative of f with respect to x.
2. H = fxx fyy − fxy², the Hessian.

If the Hessian is zero, then the critical point is degenerate. If the Hessian is non-zero, then the critical point is non-degenerate and we can classify the points in the following manner:

case(i) If H > 0 and fxx < 0 then the critical point is a relative maximum.
case(ii) If H > 0 and fxx > 0 then the critical point is a relative minimum.
case(iii) If H < 0 then the critical point is a saddle point.

References

Multivariable Calculus and Linear Algebra. Here is a link to the chapter on Partial Differentiation.

http://tutorial.math.lamar.edu/Classes/CalcIII/PartialDerivatives.aspx.

http://tutorial.math.lamar.edu/Classes/CalcIII/HighOrderPartialDerivs.aspx.

http://tutorial.math.lamar.edu/Classes/CalcIII/ChainRule.aspx.

http://tutorial.math.lamar.edu/Classes/CalcIII/RelativeExtrema.aspx

http://people.usd.edu/ jflores/MultiCalc02/WebBook/Chapter 15/

http://www.calculus.org/.

4 A little Vector Calculus

4.1 Gradient

Vector Function/ Vector Fields

The functions of several variables we have so far studied take a point (x, y, z) and give a real number f(x, y, z). We call these types of functions scalar-valued functions, for example

f(x, y, z) = x² + 2xyz.

A vector-valued function (or vector function) takes a point (x, y, z) and the value of f(x, y, z) is a vector. For example, a vector function of one variable takes a real number and returns a vector:

f(1) = (3, 5, −2),  f(−1) = (1, 1, 4),  f(0) = (2, 3, 1)

and a vector function of three variables, such as f(x, y, z) = (y, x, z²), gives

f(0, 1, 0) = (1, 0, 0),  f(0, 0, 1) = (0, 0, 1),  f(1, 2, 1) = (2, 1, 1),  f(2, 1, 2) = (1, 2, 4).

Definition. A vector field is an assignment of a vector to every point in

space.

Example. 1. In a magnetic field, we can assign a vector which describes

the force and direction to every point in space.

2. Normal to a surface. For a surface, at every point on the surface, we

can get a tangent plane and hence a normal vector to every point on the

surface.

Example. Let f(x, y, z) = x² + 2xyz be a scalar-valued function. We then define a vector-valued function by taking its partial derivatives:

∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z) = (2x + 2yz, 2xz, 2xy)

This kind of vector function has a special name, the gradient.

Definition. Suppose that f(x, y) is a scalar-valued function of two variables. Then the gradient of f is the vector function defined as

∇f = (∂f/∂x, ∂f/∂y) = (∂f/∂x) ~i + (∂f/∂y) ~j.

Similarly, if f(x, y, z) is a scalar-valued function of three variables, then the gradient of f is the vector function defined as

∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z) = (∂f/∂x) ~i + (∂f/∂y) ~j + (∂f/∂z) ~k.

Remark. The gradient of a scalar function is a vector field which points in

the direction of the greatest rate of increase of the scalar function, and whose

magnitude is the greatest rate of change.

Example. Consider the scalar-valued function defined by f(x, y) = x² + y². Find the gradient of f at the point x = 2, y = 5.

Solution: The gradient of f is the vector function

∇f = (∂f/∂x, ∂f/∂y) = (2x, 2y),  so  ∇f(2, 5) = (4, 10).

4.2 Directional Derivative

For a function of 2 variables f(x, y), we have seen that the function can be used to represent the surface

z = f(x, y).

(i) fx(a, b) represents the rate of change of the function f(x, y) as we vary x and hold y = b fixed.
(ii) fy(a, b) represents the rate of change of the function f(x, y) as we vary y and hold x = a fixed.

What if we want the rate of change of f in an arbitrary direction?

Recall the definition of the vector function ∇f:

∇f = (∂f/∂x, ∂f/∂y).

We observe that

∇f · î = fx
∇f · ĵ = fy

So we can find the rate of change of f in any direction, by taking the dot product of ∇f with a unit vector, ~u, in the desired direction.

Definition. The directional derivative of the function f in the direction ~u, denoted by D~u f, is defined to be

D~u f = (∇f · ~u)/|~u|

Example. What is the rate of change of f(x, y) = x² + xy in the direction ~i + 2~j at the point (1, 1)?

Solution: We first find ∇f.

∇f = (∂f/∂x, ∂f/∂y) = (2x + y, x)
∇f(1, 1) = (3, 1)

Let ~u = ~i + 2~j. Then |~u| = √(1² + 2²) = √(1 + 4) = √5.

D~u f(1, 1) = (∇f · ~u)/|~u|
            = (3, 1) · (1, 2)/√5
            = ((3)(1) + (1)(2))/√5
            = 5/√5
            = √5
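A numeric sketch of the same computation (numpy assumed):

```python
import numpy as np

def directional_derivative(grad, u):
    """D_u f = (grad f . u) / |u| for a direction vector u."""
    u = np.asarray(u, dtype=float)
    return float(np.dot(grad, u) / np.linalg.norm(u))

grad_f = np.array([3.0, 1.0])      # grad f(1, 1) from the example
D = directional_derivative(grad_f, [1, 2])
assert np.isclose(D, np.sqrt(5))   # matches the hand computation
```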

Gradients and Directional Derivatives

Let θ be the angle between ∇f and ~u. Then

D~u f = (∇f · ~u)/|~u| = |∇f||~u| cos θ / |~u| = |∇f| cos θ.

This is largest when cos θ = 1, i.e. when ~u points in the direction of ∇f, so the rate of change in that direction is maximum. Therefore we may conclude that

1. (i) ∇f points in the steepest direction.
   (ii) The magnitude of ∇f gives the slope in the steepest direction.

2. At any point P, ∇f(P) is perpendicular to the level set through that point.

Example. Let f(x, y) = x² + y². The point P = (1, 2, 5) lies on the graph of f since f(1, 2) = 5. Find the slope and the direction of the steepest ascent at P on the graph of f.

Solution: The direction of the steepest ascent at P on the graph of f is the direction of the gradient vector at the point (1, 2).

∇f = (∂f/∂x, ∂f/∂y) = (2x, 2y)
∇f(1, 2) = (2, 4).

The slope in this direction is the magnitude of the gradient vector at the point (1, 2):

|∇f(1, 2)| = √(2² + 4²) = √20.

Example. Find a normal vector to the surface z = x² + y² at the point (1, 2, 5). Hence write an equation for the tangent plane at the point (1, 2, 5).

Solution: For a function g, ∇g(P) is perpendicular to the level set of g through P. So we want our surface z = x² + y² to be the level set of a function. Therefore we define a new function

g(x, y, z) = x² + y² − z,

whose zero level set is our surface:

g(x, y, z) = 0
x² + y² − z = 0
z = x² + y²

∇g = (∂g/∂x, ∂g/∂y, ∂g/∂z) = (2x, 2y, −1)
∇g(1, 2, 5) = (2, 4, −1)

The point (1, 2, 5) lies on the level set g = 0, therefore ∇g(P) is the required normal vector.

Finally, an equation for the tangent plane at the point (1, 2, 5) on the surface is given by

2(x − 1) + 4(y − 2) − (z − 5) = 0.

4.3 Divergence and Curl

We denoted the gradient of a scalar function f(x, y, z) as

∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z).

Let us separate or isolate the operator ∇ = (∂/∂x, ∂/∂y, ∂/∂z). We can then define various physical quantities such as div and curl by specifying the action of the operator ∇.

Divergence

Definition. Given a vector field ~v(x, y, z) = (v1(x, y, z), v2(x, y, z), v3(x, y, z)), the divergence of ~v is a scalar function defined as the dot product of the vector operator ∇ and ~v:

Div ~v = ∇ · ~v = (∂/∂x, ∂/∂y, ∂/∂z) · (v1, v2, v3) = ∂v1/∂x + ∂v2/∂y + ∂v3/∂z

Example. Compute the divergence of the vector field (x − y)~i + (x + y)~j + z~k.

Solution:

Div ~v = ∇ · ~v
       = (∂/∂x, ∂/∂y, ∂/∂z) · ((x − y), (x + y), z)
       = ∂(x − y)/∂x + ∂(x + y)/∂y + ∂z/∂z
       = 1 + 1 + 1
       = 3

Curl

Definition. The curl of a vector field ~v is a vector function defined as the cross product of the vector operator ∇ and ~v:

                   | i     j     k    |
Curl ~v = ∇ × ~v = | ∂/∂x  ∂/∂y  ∂/∂z |
                   | v1    v2    v3   |

= (∂v3/∂y − ∂v2/∂z) i − (∂v3/∂x − ∂v1/∂z) j + (∂v2/∂x − ∂v1/∂y) k

Example. Compute the curl of the vector function (x − y)~i + (x + y)~j + z~k.

Solution:

                   | i      j      k    |
Curl ~v = ∇ × ~v = | ∂/∂x   ∂/∂y   ∂/∂z |
                   | x − y  x + y  z    |

= (∂z/∂y − ∂(x + y)/∂z) i − (∂z/∂x − ∂(x − y)/∂z) j + (∂(x + y)/∂x − ∂(x − y)/∂y) k
= (0 − 0)~i − (0 − 0)~j + (1 − (−1))~k
= 2~k
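Both computations can be reproduced symbolically (a sketch, sympy assumed):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
v = (x - y, x + y, z)              # the vector field from the examples

# divergence: sum of dv_i/dx_i
div = sum(sp.diff(v[i], s) for i, s in enumerate((x, y, z)))

# curl via the determinant expansion
curl = (sp.diff(v[2], y) - sp.diff(v[1], z),
        sp.diff(v[0], z) - sp.diff(v[2], x),
        sp.diff(v[1], x) - sp.diff(v[0], y))

assert div == 3
assert curl == (0, 0, 2)           # i.e. Curl v = 2k
```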

4.4 Laplacian

We have seen above that given a vector function, we can calculate the divergence and curl of that function. A scalar function f has a vector function ∇f associated to it. We now look at Curl(∇f) and Div(∇f).

Curl(∇f) = ∇ × ∇f
         = (∂fz/∂y − ∂fy/∂z) i + (∂fx/∂z − ∂fz/∂x) j + (∂fy/∂x − ∂fx/∂y) k
         = (fyz − fzy) i + (fzx − fxz) j + (fxy − fyx) k
         = 0

Div(∇f) = ∇ · ∇f
        = (∂/∂x, ∂/∂y, ∂/∂z) · (∂f/∂x, ∂f/∂y, ∂f/∂z)
        = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²

Definition. The Laplacian of a scalar function f(x, y) of two variables is defined to be Div(∇f) and is denoted by ∇²f:

∇²f = ∂²f/∂x² + ∂²f/∂y².

The Laplacian of a scalar function f(x, y, z) of three variables is defined to be Div(∇f) and is denoted by ∇²f:

∇²f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z².

Example. Compute the Laplacian of f(x, y, z) = x² + y² + z².

Solution:

∇²f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²
    = ∂(2x)/∂x + ∂(2y)/∂y + ∂(2z)/∂z
    = 2 + 2 + 2
    = 6.

We have the following identities for the Laplacian in different coordinate systems:

Rectangular: ∇²f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²
Polar:       ∇²f = (1/r) ∂/∂r (r ∂f/∂r) + (1/r²) ∂²f/∂θ²
Cylindrical: ∇²f = (1/r) ∂/∂r (r ∂f/∂r) + (1/r²) ∂²f/∂θ² + ∂²f/∂z²
Spherical:   ∇²f = (1/ρ²) ∂/∂ρ (ρ² ∂f/∂ρ) + (1/(ρ² sin θ)) ∂/∂θ (sin θ ∂f/∂θ) + (1/(ρ² sin² θ)) ∂²f/∂φ²

Example. Consider the same function f(x, y, z) = x² + y² + z². We have seen that in rectangular coordinates we get

∇²f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z² = 6.

We now calculate this in cylindrical and spherical coordinate systems, using

the formulas given above.

1. Cylindrical Coordinates.
We have x = r cos θ and y = r sin θ, so

f(r, θ, z) = r² cos² θ + r² sin² θ + z² = r² + z².

Using the above formula:

∇²f = (1/r) ∂/∂r (r ∂f/∂r) + (1/r²) ∂²f/∂θ² + ∂²f/∂z²
    = (1/r) ∂/∂r (r · 2r) + 0 + ∂(2z)/∂z
    = (1/r)(4r) + 2
    = 4 + 2
    = 6

2. Spherical Coordinates.
We have x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ and ρ = √(x² + y² + z²), so

f(ρ, θ, φ) = ρ².

Using the above formula:

∇²f = (1/ρ²) ∂/∂ρ (ρ² ∂f/∂ρ) + (1/(ρ² sin θ)) ∂/∂θ (sin θ ∂f/∂θ) + (1/(ρ² sin² θ)) ∂²f/∂φ²
    = (1/ρ²) ∂/∂ρ (ρ² · 2ρ) + 0 + 0
    = (1/ρ²) ∂/∂ρ (2ρ³)
    = (1/ρ²)(6ρ²)
    = 6.

These three different calculations all produce the same result because ∇2

is a derivative with a real physical meaning, and does not depend on the

coordinate system being used.
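The agreement between coordinate systems can be checked directly (a sketch, sympy assumed):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 + y**2 + z**2
lap_cart = sp.diff(f, x, 2) + sp.diff(f, y, 2) + sp.diff(f, z, 2)
assert lap_cart == 6

rho, theta, phi = sp.symbols('rho theta phi', positive=True)
g = rho**2                          # the same function in spherical coordinates
lap_sph = (sp.diff(rho**2 * sp.diff(g, rho), rho) / rho**2
           + sp.diff(sp.sin(theta) * sp.diff(g, theta), theta) / (rho**2 * sp.sin(theta))
           + sp.diff(g, phi, 2) / (rho**2 * sp.sin(theta)**2))
assert sp.simplify(lap_sph) == 6    # same answer as in Cartesian coordinates
```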

Summary

1. If f(x, y) is a scalar-valued function of two variables, then the gradient of f is the vector function defined as

∇f = (∂f/∂x, ∂f/∂y) = (∂f/∂x) ~i + (∂f/∂y) ~j.

2. Properties of the gradient:

• ∇f points in the steepest direction.
• The magnitude of ∇f gives the slope in the steepest direction.
• At any point P, ∇f(P) is perpendicular to the level set through that point.

3. The directional derivative of the function f in the direction ~u, denoted by D~u f, is defined to be

D~u f = (∇f · ~u)/|~u|

4. Given a vector field ~v(x, y, z) = (v1(x, y, z), v2(x, y, z), v3(x, y, z)), the divergence of the vector field ~v is a scalar function defined as the dot product of the vector operator ∇ and ~v:

Div ~v = ∇ · ~v = (∂/∂x, ∂/∂y, ∂/∂z) · (v1, v2, v3) = ∂v1/∂x + ∂v2/∂y + ∂v3/∂z

5. The curl of a vector field ~v is a vector function defined as the cross product of the vector operator ∇ and ~v:

                   | i     j     k    |
Curl ~v = ∇ × ~v = | ∂/∂x  ∂/∂y  ∂/∂z |
                   | v1    v2    v3   |

= (∂v3/∂y − ∂v2/∂z) i − (∂v3/∂x − ∂v1/∂z) j + (∂v2/∂x − ∂v1/∂y) k

6. We have the following identities for the Laplacian in different coordinate systems:

Rectangular: ∇²f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²
Polar:       ∇²f = (1/r) ∂/∂r (r ∂f/∂r) + (1/r²) ∂²f/∂θ²
Cylindrical: ∇²f = (1/r) ∂/∂r (r ∂f/∂r) + (1/r²) ∂²f/∂θ² + ∂²f/∂z²
Spherical:   ∇²f = (1/ρ²) ∂/∂ρ (ρ² ∂f/∂ρ) + (1/(ρ² sin θ)) ∂/∂θ (sin θ ∂f/∂θ) + (1/(ρ² sin² θ)) ∂²f/∂φ²

References

An animation illustrating that the greatest rate of change of f occurs in the direction of the gradient vector. The animation shows:

• a surface

• a unit vector rotating about the point (1, 1, 0), (shown as a rotating

black arrow at the base of the figure)

• a rotating plane parallel to the unit vector, (shown as a grey grid)

• the traces of the planes in the surface, (shown as a black curve on

the surface)

• the tangent lines to the traces at (1, 1, f (1, 1)), (shown as a blue

line)

• the gradient vector (shown in green at the base of the figure)

http://archives.math.utk.edu/ICTCM/VOL10/C009/dd.gif

Multivariable Calculus and Linear Algebra.

Here is a link to the chapter on Directional Derivatives.

http://tutorial.math.lamar.edu/Classes/CalcIII/DirectionalDeriv.aspx.

http://tutorial.math.lamar.edu/Classes/CalcIII/CurlDivergence.aspx

5 Linear Equations

5.1 Introduction

A linear equation is one in which all the unknown variables occur with a power of one; e.g. 2x + y = 5 is a linear equation, but 2x² + y = 5 is not a linear equation, since x has power 2. Systems of linear equations are common in all branches of science. In school you have learned how to simultaneously solve a system of two linear equations:

3x1 + 2x2 = 7
−x1 + x2 = 6

The solution of such a system is the point of intersection of the two lines given by the two equations.

Example. Toluene C7H8 is mixed with nitric acid HNO3 to produce trinitrotoluene (TNT) C7H5O6N3 along with water: x C7H8 + y HNO3 → z C7H5O6N3 + w H2O. In what proportion should those components be mixed, i.e. solve for x, y, z, w?

Solution: The number of atoms of each element before the reaction must equal the number present afterward!

Carbon:   7x = 7z
Hydrogen: 8x + y = 5z + 2w
Nitrogen: y = 3z
Oxygen:   3y = 6z + w

To find the answer we need to solve the above system of linear equations. In this chapter we will learn an easy method to solve such a system.
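For reference, the balance equations can already be solved with sympy (a sketch; the hand method is developed next):

```python
import sympy as sp

x, y, z, w = sp.symbols('x y z w')
eqs = [sp.Eq(7*x, 7*z),            # carbon
       sp.Eq(8*x + y, 5*z + 2*w),  # hydrogen
       sp.Eq(y, 3*z),              # nitrogen
       sp.Eq(3*y, 6*z + w)]        # oxygen

# solve for x, y, w in terms of the free parameter z
sol = sp.solve(eqs, [x, y, w], dict=True)[0]
assert sol == {x: z, y: 3*z, w: 3*z}   # mix in the ratio x:y:z:w = 1:3:1:3
```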

5.2 Row Echelon Form

In this section we learn Gauss’ Method to solve a system of linear equations.

We will use an example to understand the method.

x − y = 0
2x − 2y + z + 2w = 4
y + w = 0
2z + w = 5

We can abbreviate this linear system with the array of numbers whose entries are the coefficients of the equations:

[ 1 −1 0 0 | 0 ]
[ 2 −2 1 2 | 4 ]
[ 0  1 0 1 | 0 ]
[ 0  0 2 1 | 5 ]

The vertical bar just reminds us that the coefficients of the system are on the left hand side of the bar and the constants are on the right. We call this an augmented matrix. We can now proceed with Gauss' method.

R2 = −2R1 + R2:

[ 1 −1 0 0 | 0 ]
[ 0  0 1 2 | 4 ]
[ 0  1 0 1 | 0 ]
[ 0  0 2 1 | 5 ]

R2 ↔ R3:

[ 1 −1 0 0 | 0 ]
[ 0  1 0 1 | 0 ]
[ 0  0 1 2 | 4 ]
[ 0  0 2 1 | 5 ]

R4 = −2R3 + R4:

[ 1 −1 0  0 |  0 ]
[ 0  1 0  1 |  0 ]
[ 0  0 1  2 |  4 ]
[ 0  0 0 −3 | −3 ]

The resulting equations are:

x − y = 0
y + w = 0
z + 2w = 4
−3w = −3,  i.e. w = 1

We then back-substitute to solve this simpler set of equations, having used row operations to put the array into Row Echelon Form. You probably have two questions:

1. Why do both the original system and the new simpler system have the same solution set?

Answer: There is a theorem which says that if a linear system is changed to another by one of these operations

• an equation is swapped with another
• an equation has both sides multiplied by a nonzero constant
• an equation is replaced by the sum of itself and a multiple of another

then the two systems have the same set of solutions.

2. What is Row Echelon Form?

Answer: A matrix is in Row Echelon Form if it has the following form:

• any all-zero rows are at the bottom of the reduced matrix
• in a non-zero row, the first-from-left nonzero value is 1
• the first 1 in each row is to the right of the first 1 in the row above

To reduce a matrix to Row Echelon Form:

(i) Make the first entry of the first row non-zero by doing a swap if necessary.
(ii) Make the first entry of the first row 1 by multiplying by the reciprocal. The first row is now in the correct form.
(iii) Make the first entry in all the rows below the first row 0, by using the 1 in the first row above to subtract. The first column should then be (1, 0, 0, ..., 0)ᵀ.
(iv) Now we can disregard the first row and first column because they are in the correct form, and only consider the remaining sub-matrix. Repeat steps (i)-(iv) for the smaller sub-matrix. Continue till you run out of rows.

The solution set of a system of linear equations can be:

(i) Empty
(ii) A single point (a unique solution)
(iii) Infinite

An example where the solution set is empty:

x + 2y = 4
x + 2y = 8

[ 1 2 | 4 ]  R2 = −R1 + R2  [ 1 2 | 4 ]
[ 1 2 | 8 ]       →         [ 0 0 | 4 ]

We get the absurd relation 0 = 4. This implies that the system has no solutions. Geometrically the two equations are two parallel lines and hence there is no point of intersection.

An example where the solution set is infinite.

x1 + 2x2 = 4
x2 − x3 = 0
x1 + 2x3 = 4

The augmented matrix:

[ 1 2  0 | 4 ]
[ 0 1 −1 | 0 ]
[ 1 0  2 | 4 ]

R3 = −R1 + R3:

[ 1  2  0 | 4 ]
[ 0  1 −1 | 0 ]
[ 0 −2  2 | 0 ]

R3 = 2R2 + R3:

[ 1 2  0 | 4 ]
[ 0 1 −1 | 0 ]
[ 0 0  0 | 0 ]

We have a bottom row of zeros. The second row gives x2 = x3 and the first row gives x1 + 2x2 = 4. We can express x1 in terms of x3 to get x1 = 4 − 2x3. Thus we can write the solution set for the system in the following manner:

S = { (x1, x2, x3)ᵀ | x2 = x3 and x1 = 4 − 2x3 }
  = { (4 − 2x3, x3, x3)ᵀ }
  = { (4, 0, 0)ᵀ + (−2, 1, 1)ᵀ x3 | x3 ∈ R }

Example. Consider the linear system

x + y + z − w = 1
y − z + w = −1
3x + 6z − 6w = 6
−y + z − w = 1

The augmented matrix:

[ 1  1  1 −1 |  1 ]
[ 0  1 −1  1 | −1 ]
[ 3  0  6 −6 |  6 ]
[ 0 −1  1 −1 |  1 ]

Gauss' Method:

R3 = −3R1 + R3:

[ 1  1  1 −1 |  1 ]
[ 0  1 −1  1 | −1 ]
[ 0 −3  3 −3 |  3 ]
[ 0 −1  1 −1 |  1 ]

R3 = 3R2 + R3, R4 = R2 + R4:

[ 1 1  1 −1 |  1 ]
[ 0 1 −1  1 | −1 ]
[ 0 0  0  0 |  0 ]
[ 0 0  0  0 |  0 ]

The remaining equations are

x + y + z − w = 1
y − z + w = −1

The second equation gives y = −1 + z − w. Substituting this in equation one we get x + (−1 + z − w) + z − w = 1. Solving for x we get x = 2 − 2z + 2w. The solution set:

S = { (x, y, z, w)ᵀ | y = −1 + z − w and x = 2 − 2z + 2w }
  = { (2 − 2z + 2w, −1 + z − w, z, w)ᵀ }
  = { (2, −1, 0, 0)ᵀ + (−2, 1, 1, 0)ᵀ z + (2, −1, 0, 1)ᵀ w | z, w ∈ R }
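Gauss' method for this last example can be reproduced with sympy's `rref` and `linsolve` (a sketch):

```python
import sympy as sp

# augmented matrix of the last example
M = sp.Matrix([[1,  1,  1, -1,  1],
               [0,  1, -1,  1, -1],
               [3,  0,  6, -6,  6],
               [0, -1,  1, -1,  1]])

R, pivots = M.rref()
assert pivots == (0, 1)            # two pivots, so two free variables (z and w)

x, y, z, w = sp.symbols('x y z w')
sol = sp.linsolve(M, (x, y, z, w))
assert sol == sp.FiniteSet((2 - 2*z + 2*w, -1 + z - w, z, w))
```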

Summary

1. A matrix is in Row Echelon Form if it has the following form:

• any all-zero rows are at the bottom of the reduced matrix
• in a non-zero row, the first-from-left nonzero value is 1
• the first non-zero entry in each row is to the right of the first non-zero entry in the row above

To reduce a matrix to Row Echelon Form:

(i) Make the first entry of the first row non-zero by doing a swap if necessary.
(ii) Make the first entry of the first row 1 by multiplying by the reciprocal. The first row is now in the correct form.
(iii) Make the first entry in all the rows below the first row 0, by using the 1 in the first row to subtract, so that the first column is (1, 0, 0, ..., 0)ᵀ.
(iv) Now we can disregard the first row and first column because they are in the correct form, and only consider the remaining sub-matrix. Repeat steps (i)-(iv) for the smaller sub-matrix. Continue till you run out of rows.

2. To solve a system of linear equations:

(i) Write down the associated augmented matrix
(ii) Reduce the augmented matrix to row echelon form
(iii) Back substitute to find the solution set.

6 Linear Independence and Gram Schmidt

6.1 Linear Combination of Vectors

A linear combination of vectors in R² is a sum of the form

α (x1, x2)ᵀ + β (y1, y2)ᵀ,  e.g.  2 (1, −1)ᵀ + (−4) (0, 1)ᵀ.

A linear combination of vectors in R³ is a sum of the form

α (x1, x2, x3)ᵀ + β (y1, y2, y3)ᵀ,  e.g.  2 (1, −1, 1)ᵀ + (−4) (0, 1, 3)ᵀ.

Example. Write (9, 8, 3)ᵀ as a linear combination of the vectors (1, 2, 3)ᵀ, (2, −1, 0)ᵀ, (3, 1, −1)ᵀ.

We look for x, y, z such that

x (1, 2, 3)ᵀ + y (2, −1, 0)ᵀ + z (3, 1, −1)ᵀ = (9, 8, 3)ᵀ,

i.e.

(x + 2y + 3z, 2x − y + z, 3x − z)ᵀ = (9, 8, 3)ᵀ.

We get a system of three linear equations:

x + 2y + 3z = 9
2x − y + z = 8
3x − z = 3

We can use the methods of the last chapter to find the solution set of this system of linear equations. The associated augmented matrix:

[ 1  2  3 | 9 ]
[ 2 −1  1 | 8 ]
[ 3  0 −1 | 3 ]

R2 = −2R1 + R2, R3 = −3R1 + R3:

[ 1  2   3 |   9 ]
[ 0 −5  −5 | −10 ]
[ 0 −6 −10 | −24 ]

R2 = −(1/5)R2:

[ 1  2   3 |   9 ]
[ 0  1   1 |   2 ]
[ 0 −6 −10 | −24 ]

R3 = 6R2 + R3:

[ 1 2  3 |   9 ]
[ 0 1  1 |   2 ]
[ 0 0 −4 | −12 ]

R3 = −(1/4)R3:

[ 1 2 3 | 9 ]
[ 0 1 1 | 2 ]
[ 0 0 1 | 3 ]

Back substituting:

z = 3
y = 2 − z = 2 − 3 = −1
x = 9 − 2y − 3z = 9 − 2(−1) − 3(3) = 2

Therefore we write

(2) (1, 2, 3)ᵀ + (−1) (2, −1, 0)ᵀ + (3) (3, 1, −1)ᵀ = (9, 8, 3)ᵀ.
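The coefficients can also be found numerically by putting the vectors as the columns of a matrix (numpy assumed):

```python
import numpy as np

# columns are the vectors (1,2,3), (2,-1,0), (3,1,-1)
A = np.array([[1,  2,  3],
              [2, -1,  1],
              [3,  0, -1]], dtype=float)
b = np.array([9, 8, 3], dtype=float)

coeffs = np.linalg.solve(A, b)
assert np.allclose(coeffs, [2, -1, 3])   # matches the back substitution above
```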

6.2 Linear Independence of Vectors

Definition. A set of vectors S is linearly independent if there is no non-trivial linear combination of the vectors that sums to zero.

Remark. If a set is not linearly independent, it is linearly dependent.

Example. 1. Consider a linear combination which sums to zero:

α (1, 0)ᵀ + β (0, 1)ᵀ = (0, 0)ᵀ
(α, 0)ᵀ + (0, β)ᵀ = (0, 0)ᵀ
(α, β)ᵀ = (0, 0)ᵀ

so α = 0, β = 0 is the unique solution. Therefore the vectors (1, 0)ᵀ and (0, 1)ᵀ are linearly independent.

2. Consider a linear combination which sums to zero:

x (1, 0)ᵀ + y (0, 1)ᵀ + z (1, 1)ᵀ = (0, 0)ᵀ
(x, 0)ᵀ + (0, y)ᵀ + (z, z)ᵀ = (0, 0)ᵀ
(x + z, y + z)ᵀ = (0, 0)ᵀ

x + z = 0, so x = −z
y + z = 0, so y = −z

So for any value of z, say z = 1, we can choose x = y = −z = −1 and get

(−1) (1, 0)ᵀ + (−1) (0, 1)ᵀ + (1) (1, 1)ᵀ = (0, 0)ᵀ,

a non-trivial linear combination which sums to zero. So the vectors (1, 0)ᵀ, (0, 1)ᵀ and (1, 1)ᵀ are not linearly independent.

3. Determine whether the set of vectors {[1, 1, 0], [1, 0, 1], [2, 1, 1]} is linearly independent.

Solution: Take a linear combination of the vectors which sums to zero:

x [1, 1, 0]ᵀ + y [1, 0, 1]ᵀ + z [2, 1, 1]ᵀ = [0, 0, 0]ᵀ
(x + y + 2z, x + z, y + z)ᵀ = (0, 0, 0)ᵀ

We get a system of three linear equations:

x + y + 2z = 0
x + z = 0
y + z = 0

We can use the methods of the last chapter to find the solution set of this system of linear equations. If x = y = z = 0 is the unique solution, then the vectors are linearly independent; otherwise we can find a dependency. The associated augmented matrix:

[ 1 1 2 | 0 ]
[ 1 0 1 | 0 ]
[ 0 1 1 | 0 ]

We now reduce the matrix to row echelon form.

R2 = −R1 + R2:

[ 1  1  2 | 0 ]
[ 0 −1 −1 | 0 ]
[ 0  1  1 | 0 ]

R3 = R2 + R3, then R2 = −R2:

[ 1 1 2 | 0 ]
[ 0 1 1 | 0 ]
[ 0 0 0 | 0 ]

We have a bottom row of zeros and hence cannot have a unique solution. The second row gives y + z = 0 and the first row gives x + y + 2z = 0. So

y = −z
x = −y − 2z = −(−z) − 2z = −z

If we choose z = 1, then x = −1 and y = −1, and we get a non-trivial linear combination which sums to zero:

(−1) [1, 1, 0]ᵀ + (−1) [1, 0, 1]ᵀ + (1) [2, 1, 1]ᵀ = [0, 0, 0]ᵀ

Theorem. Let S = {~v1, ~v2, ..., ~vn} be a set of n, n-dimensional vectors. Then the set S is linearly independent if and only if the determinant of the matrix having the n vectors as columns is non-zero.

Example. 1. We can apply this result to the example above. Let [1, 1, 0], [1, 0, 1], [2, 1, 1] be three 3-dimensional vectors.

| 1 1 2 |
| 1 0 1 | = 1(0 − 1) − 1(1 − 0) + 2(1 − 0) = −1 − 1 + 2 = 0.
| 0 1 1 |

Since the determinant is zero, the set of vectors is not linearly independent.

2. Consider the set {[1, −1, 1], [1, 0, 1], [1, 1, 2]}.

|  1 1 1 |
| −1 0 1 | = 1(0 − 1) − 1((−1)(2) − (1)(1)) + 1((−1)(1) − 0) = −1 + 3 − 1 = 1 ≠ 0.
|  1 1 2 |

Since the determinant is non-zero, the set of vectors is linearly independent.
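The determinant test is easy to automate (numpy assumed):

```python
import numpy as np

def independent(vectors, tol=1e-9):
    """True iff the vectors, taken as matrix columns, have non-zero determinant."""
    return abs(np.linalg.det(np.column_stack(vectors))) > tol

assert not independent([[1, 1, 0], [1, 0, 1], [2, 1, 1]])   # det = 0, dependent
assert independent([[1, -1, 1], [1, 0, 1], [1, 1, 2]])      # det = 1, independent
```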

6.3 Orthogonal and Orthonormal Vectors

Definition. We say that 2 vectors are orthogonal if they are perpendicular to each other, i.e. the dot product of the two vectors is zero.

Definition. We say that a set of vectors {~v1, ~v2, ..., ~vn} is mutually orthogonal if every pair of vectors is orthogonal, i.e. ~vi · ~vj = 0 for all i ≠ j.

Example. The set of vectors (1, 0, −1)ᵀ, (1, √2, 1)ᵀ, (1, −√2, 1)ᵀ is mutually orthogonal:

(1, 0, −1) · (1, √2, 1) = 0
(1, 0, −1) · (1, −√2, 1) = 0
(1, √2, 1) · (1, −√2, 1) = 0

Definition. We say that a set of vectors is orthonormal if each vector has magnitude 1 and the set of vectors is mutually orthogonal.

Example. We have seen that the vectors

~v1 = (1, 0, −1)ᵀ,  ~v2 = (1, √2, 1)ᵀ,  ~v3 = (1, −√2, 1)ᵀ

are mutually orthogonal. The vectors however are not normalized (this term is sometimes used to say that the vectors are not of magnitude 1). Let

~u1 = ~v1/|~v1| = (1/√2)(1, 0, −1)ᵀ = (1/√2, 0, −1/√2)ᵀ
~u2 = ~v2/|~v2| = (1/2)(1, √2, 1)ᵀ = (1/2, √2/2, 1/2)ᵀ
~u3 = ~v3/|~v3| = (1/2)(1, −√2, 1)ᵀ = (1/2, −√2/2, 1/2)ᵀ

Then {~u1, ~u2, ~u3} is an orthonormal set of vectors.

6.4 Gram-Schmidt Process

Given a set of linearly independent vectors, it is often useful to convert them into an orthonormal set of vectors. We first define the projection operator.

Definition. The projection of a vector ~v onto a vector ~u is defined as follows:

Proj~u ~v = ((~v · ~u)/|~u|²) ~u.

Example. Consider the two vectors ~v = (1, 1)ᵀ and ~u = (1, 0)ᵀ. These two vectors are linearly independent. However they are not orthogonal to each other. We create an orthogonal vector in the following manner:

~v1 = ~v − Proj~u ~v

Proj~u ~v = (((1)(1) + (1)(0))/(√(1² + 0²))²) (1, 0)ᵀ = (1) (1, 0)ᵀ

~v1 = (1, 1)ᵀ − (1) (1, 0)ᵀ = (0, 1)ᵀ

Let ~v1, ~v2, ..., ~vn be a set of n linearly independent vectors in Rⁿ. Then we can construct an orthonormal set of vectors as follows:

Step 1. Let ~u1 = ~v1.  ~e1 = ~u1/|~u1|.
Step 2. Let ~u2 = ~v2 − Proj~u1 ~v2.  ~e2 = ~u2/|~u2|.
Step 3. Let ~u3 = ~v3 − Proj~u1 ~v3 − Proj~u2 ~v3.  ~e3 = ~u3/|~u3|.
Step 4. Let ~u4 = ~v4 − Proj~u1 ~v4 − Proj~u2 ~v4 − Proj~u3 ~v4.  ~e4 = ~u4/|~u4|.
... and so on, until all n vectors have been processed.

Example. We will apply the Gram-Schmidt algorithm to orthonormalize the set of vectors

~v1 = (1, −1, 1)ᵀ,  ~v2 = (1, 0, 1)ᵀ,  ~v3 = (1, 1, 2)ᵀ.

To apply Gram-Schmidt, we first need to check that the set of vectors is linearly independent:

|  1 1 1 |
| −1 0 1 | = 1(0 − 1) − 1((−1)(2) − (1)(1)) + 1((−1)(1) − 0) = 1 ≠ 0.
|  1 1 2 |

Therefore the vectors are linearly independent.

Gram-Schmidt algorithm:

Step 1. Let

~u1 = ~v1 = (1, −1, 1)ᵀ
~e1 = ~u1/|~u1| = (1/√3)(1, −1, 1)ᵀ.

Step 2. Let

~u2 = ~v2 − Proj~u1 ~v2

Proj~u1 ~v2 = ((1, 0, 1) · (1, −1, 1))/(1² + (−1)² + 1²) (1, −1, 1)ᵀ = (2/3)(1, −1, 1)ᵀ

~u2 = (1, 0, 1)ᵀ − (2/3)(1, −1, 1)ᵀ = (1/3, 2/3, 1/3)ᵀ

~e2 = ~u2/|~u2| = (3/√6)(1/3, 2/3, 1/3)ᵀ.

Step 3. Let

~u3 = ~v3 − Proj~u1 ~v3 − Proj~u2 ~v3

Proj~u1 ~v3 = ((1, 1, 2) · (1, −1, 1))/(1² + (−1)² + 1²) (1, −1, 1)ᵀ = (2/3)(1, −1, 1)ᵀ

Proj~u2 ~v3 = ((1, 1, 2) · (1/3, 2/3, 1/3))/((1/3)² + (2/3)² + (1/3)²) (1/3, 2/3, 1/3)ᵀ = (5/2)(1/3, 2/3, 1/3)ᵀ

~u3 = (1, 1, 2)ᵀ − (2/3)(1, −1, 1)ᵀ − (5/2)(1/3, 2/3, 1/3)ᵀ = (−1/2, 0, 1/2)ᵀ

~e3 = ~u3/|~u3| = √2 (−1/2, 0, 1/2)ᵀ.

Example. Consider the vectors {[3, 0, 4], [−1, 0, 7], [2, 9, 11]}. Check that the vectors are linearly independent and use the Gram-Schmidt process to find orthogonal vectors.

Ans. {[3, 0, 4], [−4, 0, 3], [0, 9, 0]}. Check that these vectors are mutually orthogonal.
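The whole process can be sketched in numpy (an illustration, not the notes' own code):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors."""
    basis = []
    for v in vectors:
        v = np.asarray(v, dtype=float)
        u = v.copy()
        for e in basis:
            u -= np.dot(v, e) * e          # subtract the projection onto e
        basis.append(u / np.linalg.norm(u))
    return basis

e1, e2, e3 = gram_schmidt([[1, -1, 1], [1, 0, 1], [1, 1, 2]])
assert np.allclose(e1, np.array([1, -1, 1]) / np.sqrt(3))
assert np.allclose(e3, np.sqrt(2) * np.array([-0.5, 0.0, 0.5]))
```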

Summary

We assume ~v1, ~v2, ~v3 are vectors and a, b, c are real numbers for this summary.

1. To write a vector ~w as a linear combination of ~v1, ~v2, ~v3, we find real numbers a, b, c such that

a~v1 + b~v2 + c~v3 = ~w.

2. A set of vectors ~v1, ~v2, ~v3 is linearly independent if the only solution of the vector equation

a~v1 + b~v2 + c~v3 = ~0

is a = b = c = 0. If one can find a non-zero solution, then the vectors are linearly dependent, i.e. we can express one vector as a linear combination of the remaining vectors.

3. We say that a set of vectors {~v1, ~v2, ..., ~vn} is mutually orthogonal if every pair of vectors is orthogonal, i.e. ~vi · ~vj = 0 for all i ≠ j.

4. A set of vectors is orthonormal if each vector has magnitude 1 and the set of vectors is mutually orthogonal.

5. The Gram-Schmidt Algorithm:

Let ~v1, ~v2, ..., ~vn be a set of n linearly independent vectors in Rⁿ. Then we can construct an orthonormal set of vectors as follows:

Step 1. Let ~u1 = ~v1.  ~e1 = ~u1/|~u1|.
Step 2. Let ~u2 = ~v2 − Proj~u1 ~v2.  ~e2 = ~u2/|~u2|.
Step 3. Let ~u3 = ~v3 − Proj~u1 ~v3 − Proj~u2 ~v3.  ~e3 = ~u3/|~u3|.
Step 4. Let ~u4 = ~v4 − Proj~u1 ~v4 − Proj~u2 ~v4 − Proj~u3 ~v4.  ~e4 = ~u4/|~u4|.
... and so on.

7 Matrices

7.1 Introduction

Definition. An m × n matrix is a rectangular array of numbers having m rows and n columns:

    [ a11 a12 ... a1n ]
    [ a21 a22 ... a2n ]
A = [ a31 a32 ... a3n ]
    [  .   .  ...  .  ]
    [ am1 am2 ... amn ]

The element aij represents the entry in the ith row and jth column. We sometimes denote A by (aij)m×n.

Addition

We can only add two matrices of the same dimension, i.e. the same number of rows and columns. We then add element-wise.

Example.

[ 2 1 ]   [ 4 3 ]   [ 6 4 ]
[ 1 3 ] + [ 5 6 ] = [ 6 9 ]

Scalar Multiplication

If c is a real number and A = (aij)m×n is a matrix, then cA = (caij)m×n.

Example.

  [ 2 1 ]   [ 10  5 ]
5 [ 1 3 ] = [  5 15 ]

Matrix Multiplication

Given a matrix A = (aij)m×n and a matrix B = (bij)r×s, we can only multiply them if n = r. In such a case the product is defined to be the matrix C = (cij)m×s with

cij = Σ_{k=1}^{n} aik bkj.

We may view the cij-th element as the dot product of the ith row of the matrix A and the jth column of the matrix B.

Example. 1.

[ 2 1 ] [ 4 3 ]   [ (2)(4) + (1)(5)  (2)(3) + (1)(6) ]   [ 13 12 ]
[ 1 3 ] [ 5 6 ] = [ (1)(4) + (3)(5)  (1)(3) + (3)(6) ] = [ 19 21 ]

2.

[ 1 2 3 ] [ 2 0 0 ]   [ 4 12 4 ]
[ 2 1 0 ] [ 1 3 2 ] = [ 5  3 2 ]
[ 0 3 1 ] [ 0 2 0 ]   [ 3 11 6 ]
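Example 2 can be checked with numpy's matrix product (a sketch):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 1, 0],
              [0, 3, 1]])
B = np.array([[2, 0, 0],
              [1, 3, 2],
              [0, 2, 0]])

C = A @ B
assert (C == np.array([[4, 12, 4],
                       [5,  3, 2],
                       [3, 11, 6]])).all()

# entry c_ij is the dot product of row i of A with column j of B
assert C[0, 1] == np.dot(A[0, :], B[:, 1])
```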

7.3 Special matrices

(a) A matrix is called a square matrix if the number of rows is equal to the number of columns.

(b) The transpose of a matrix A, denoted Aᵀ, is the matrix in which the rows of A become the columns of Aᵀ.

Example. The transpose of

    [ 1 2 3 ]         [ 1 2 0 ]
A = [ 2 1 0 ]  is  Aᵀ = [ 2 1 3 ]
    [ 0 3 1 ]         [ 3 0 1 ]

(c) A matrix is called a diagonal matrix if all the off-diagonal elements in the matrix are zero, e.g.

[ 3  0 0 ]
[ 0 −4 0 ]
[ 0  0 1 ]

(d) The identity matrix I is a diagonal matrix in which all the non-zero elements are 1, e.g.

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

It has the property that AI = A = IA.

7.4 Determinants

Definition. The determinant of a 2 × 2 matrix is defined as follows:

| a b |
| c d | = ad − bc.

The determinant of a 3 × 3 matrix is defined by expanding along the first row:

| a11 a12 a13 |
| a21 a22 a23 | = a11 | a22 a23 | − a12 | a21 a23 | + a13 | a21 a22 |
| a31 a32 a33 |       | a32 a33 |       | a31 a33 |       | a31 a32 |

= a11(a22 a33 − a32 a23) − a12(a21 a33 − a31 a23) + a13(a21 a32 − a31 a22)

This definition can be extended to a square matrix of any size. Let A = (aij) be a square n × n matrix. Fix a row i. Then the determinant is defined to be

det(A) = Σ_{j=1}^{n} (−1)^(i+j) aij det(Aij),

where Aij is the minor matrix obtained by deleting row i and column j; the quantity (−1)^(i+j) det(Aij) is called the ij-cofactor of A.

Example. Find the determinant of

    [ 1  1 0 −1 ]
A = [ 0 −1 1  4 ]
    [ 0  0 2  2 ]
    [ 1  0 1  2 ]

Choose row i = 1.

      [ −1 1 4 ]
A11 = [  0 2 2 ],  det(A11) = −2
      [  0 1 2 ]

      [ 0 1 4 ]
A12 = [ 0 2 2 ],  det(A12) = −6
      [ 1 1 2 ]

      [ 0 −1 4 ]
A13 = [ 0  0 2 ],  det(A13) = −2
      [ 1  0 2 ]

      [ 0 −1 1 ]
A14 = [ 0  0 2 ],  det(A14) = −2
      [ 1  0 1 ]

det(A) = Σ_{j=1}^{4} (−1)^(1+j) a1j det(A1j)
       = (−1)²(1) det(A11) + (−1)³(1) det(A12) + (−1)⁴(0) det(A13) + (−1)⁵(−1) det(A14)
       = (1)(1)(−2) + (−1)(1)(−6) + (1)(0)(−2) + (−1)(−1)(−2)
       = 2
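A recursive cofactor expansion along the first row, matching the definition above (a sketch):

```python
import numpy as np

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]   # delete row 0, column j
        total += (-1) ** j * a * det(minor)
    return total

A = [[1, 1, 0, -1],
     [0, -1, 1, 4],
     [0, 0, 2, 2],
     [1, 0, 1, 2]]

assert det(A) == 2                                   # matches the hand computation
assert np.isclose(np.linalg.det(np.array(A, dtype=float)), 2)
```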

7.5 Inverses

Definition. A matrix I which has 1's on the diagonal and 0's everywhere else is called the identity matrix. This matrix has the property that AI = A.

Example. 1. A 2 × 2 identity matrix has the form

I = [ 1 0 ]
    [ 0 1 ]

2. A 3 × 3 identity matrix has the form

    [ 1 0 0 ]
I = [ 0 1 0 ]
    [ 0 0 1 ]

Definition. Let A = (aij) be a square matrix. A matrix B = (bij) is called the inverse of A if

AB = BA = I.

Remark. A matrix A has an inverse if and only if det(A) ≠ 0.

Inverse of a 2 × 2 matrix

a b

Definition. The inverse of a 2 × 2 matrix, A = is given by,

c d

−1 1 d −b

A = .

−c a

ad − bc

Example. Find the inverse of the matrix

A = ( 1 1 )
    ( 1 2 ) .

Solution: We first check that the inverse of A exists!

det(A) = (1)(2) − (1)(1) = 1 ≠ 0.

Hence the inverse of A must exist and is given by

A^(-1) = 1/1 (  2 −1 )   (  2 −1 )
             ( −1  1 ) = ( −1  1 ) .

We check that this is correct by multiplying A^(-1) A to see if we get the identity matrix.

(  2 −1 ) ( 1 1 )   ( (2)(1)+(−1)(1)  (2)(1)+(−1)(2) )   ( 1 0 )
( −1  1 ) ( 1 2 ) = ( (−1)(1)+(1)(1)  (−1)(1)+(1)(2) ) = ( 0 1 ) .
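The 2 × 2 inverse formula from the definition above can be sketched and verified numerically (NumPy assumed available; `inv2` is an illustrative helper, not part of the notes):

```python
import numpy as np

def inv2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the 2x2 formula."""
    det = a * d - b * c
    assert det != 0, "matrix is not invertible"
    return np.array([[d, -b], [-c, a]]) / det

A = np.array([[1, 1], [1, 2]])
Ainv = inv2(1, 1, 1, 2)
# Ainv == [[2, -1], [-1, 1]], and Ainv @ A is the identity matrix.
print(Ainv @ A)
```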

We can find the inverse in another way: Write as (A|I2 )

1 1 1 0

.

1 2 0 1

Then use matrix row operations to get it into the form (I2 |B).

1 0 a b

.

0 1 c d

1 1 1 0 R2 =R2 −R1 1 1 1 0 R1 =R1 −R2 1 0 2 −1

→ →

1 2 0 1 0 1 −1 1 0 1 −1 1

a b 2 −1

=

c d −1 1

is the inverse of A.

Inverse of a 3 × 3 matrix

The method is the same as in the 2 × 2 case:

• Check the determinant of A is non-zero.

• Rewrite as (A|I3).

• Use row operations to put it in the form (I3|A^(-1)).

Example. Let

    (  1 −1 1 )
A = (  0 −2 1 ) .   Find A^(-1).
    ( −2 −3 0 )

1. Check that det(A) ≠ 0.

det(A) = 1((−2)(0) − (−3)(1)) − (−1)((0)(0) − (−2)(1)) + 1((0)(−3) − (−2)(−2))
       = 3 + 2 − 4
       = 1 ≠ 0

2. Rewrite as (A|I3 ).

1 −1 1 1 0 0

0 −2 1 0 1 0 .

−2 −3 0 0 0 1

1 −1 1 1 0 0 1 −1 1 1 0 0

0 −2 1 0 1 0 R3 =R→ 3 +2R1

0 −2 1 0 1 0

−2 −3 0 0 0 1 0 −5 2 2 0 1

−1

R2 = 2 R2

1 −1 1 1 0 0

→ 0 1 −1/2 0 −1/2 0

0 −5 2 2 0 1

1 −1 1 1 0 0

R3 =R3 +5R2

→ 0 1 −1/2 0 −1/2 0

0 0 −1/2 2 −5/2 1

1 −1 1 1 0 0

R3 =−2R3

→ 0 1 −1/2 0 −1/2 0

0 0 1 −4 5 −2

R2 =R2 + 12 R3

1 −1 1 1 0 0

→ 0 1 0 −2 2 −1

0 0 1 −4 5 −2

1 −1 0 5 −5 2

R1 =R1 −R3

→ 0 1 0 −2 2 −1

0 0 1 −4 5 −2

1 0 0 3 −3 1

R1 =R1 +R2

→ 0 1 0 −2 2 −1

0 0 1 −4 5 −2

4.

3 −3 1

A−1 = −2 2 −1 .

−4 5 −2
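The row-reduction result can be checked by multiplying back, or against NumPy's built-in inverse (a cross-check sketch, assuming NumPy is available):

```python
import numpy as np

A = np.array([[1, -1, 1], [0, -2, 1], [-2, -3, 0]])
# Result of the row reduction worked out above.
Ainv = np.array([[3, -3, 1], [-2, 2, -1], [-4, 5, -2]])

print(np.allclose(A @ Ainv, np.eye(3)))     # True: A Ainv = I
print(np.allclose(np.linalg.inv(A), Ainv))  # True: matches NumPy's inverse
```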

Summary

1. The determinant of a 2 × 2 matrix is defined as follows:

| a b |
| c d | = ad − bc.

2. The determinant of a 3 × 3 matrix is found by expanding along the first row:

| a11 a12 a13 |
| a21 a22 a23 | = a11 det( a22 a23 ; a32 a33 ) − a12 det( a21 a23 ; a31 a33 ) + a13 det( a21 a22 ; a31 a32 )
| a31 a32 a33 |

= a11(a22 a33 − a32 a23) − a12(a21 a33 − a31 a23) + a13(a21 a32 − a31 a22)

3. To find the inverse of a matrix A:

• Check that det(A) ≠ 0.

• Rewrite as (A|I3).

• Use row operations to put it in the form (I3|A^(-1)).

8 Eigenvalues, Eigenvectors and Diagonaliza-

tion

8.1 Introduction

Definition. Suppose that A is an n × n square matrix. Suppose also that

~x is a non-zero vector in Rn and that λ is a scalar so that,

A~x = λ~x.

We then call ~x an eigenvector of A and λ an eigenvalue of A.

Example. Suppose

A = ( 4 2 )
    ( 1 3 ) .

Then (2, 1)^T is an eigenvector associated to the eigenvalue 5 because

A~x = ( 4 2 ) ( 2 )   ( 10 )     ( 2 )
      ( 1 3 ) ( 1 ) = (  5 ) = 5 ( 1 ) = 5~x.

Also (−1, 1)^T is an eigenvector associated to the eigenvalue 2 because

A~x = ( 4 2 ) ( −1 )   ( −2 )     ( −1 )
      ( 1 3 ) (  1 ) = (  2 ) = 2 (  1 ) = 2~x.

In this chapter we will learn how to find these eigenvalues and eigenvec-

tors.

We start with A~x = λ~x and rewrite it as follows,

A~x = λI~x
λI~x − A~x = 0
(λI − A)~x = 0

This homogeneous system has a non-zero solution ~x if and only if det(λI − A) = 0.

Steps to find eigenvalues and eigenvectors:

1. Form the characteristic equation

det(λI − A) = 0.

2. Solve the characteristic equation; the solutions λ are the eigenvalues of A.

3. For each eigenvalue λ, the corresponding set of eigenvectors are the non-zero solutions of the linear system of equations

(λI − A)~x = 0.

Example. 1.

A = ( 4 2 )
    ( 1 3 ) .

Solution: (i) The characteristic equation of A.

We must first find the matrix λI − A.

λI − A = λ ( 1 0 ) − ( 4 2 )
           ( 0 1 )   ( 1 3 )

       = ( λ 0 ) − ( 4 2 )
         ( 0 λ )   ( 1 3 )

       = ( λ−4  −2  )
         ( −1   λ−3 )

The characteristic equation:

det(λI − A) = 0

| λ−4  −2  |
| −1   λ−3 | = 0

(λ − 4)(λ − 3) − (−2)(−1) = 0
λ² − 7λ + 12 − 2 = 0
λ² − 7λ + 10 = 0

(ii) To find all the eigenvalues of A, solve the characteristic equa-

tion.

λ2 − 7λ + 10 = 0

(λ − 5)(λ − 2) = 0

So we have two eigenvalues λ1 = 2, λ2 = 5.

(iii) For each eigenvalue λ, the corresponding set of eigenvectors

are the non-zero solutions of the linear system of equations

given by,

(λI − A)~x = 0.

case(i) λ1 = 2.

(2I − A)~x = 0

( 2−4  −2  ) ( x1 )   ( 0 )
( −1   2−3 ) ( x2 ) = ( 0 )

( −2 −2 ) ( x1 )   ( 0 )
( −1 −1 ) ( x2 ) = ( 0 )

( −2x1 − 2x2 )   ( 0 )
( −x1 − x2   ) = ( 0 )

We get two equations:

2x1 + 2x2 = 0
x1 + x2 = 0

Both equations give the relation x1 = −x2. Therefore the set of eigenvectors corresponding to λ1 = 2 is given by:

{ (x1, x2)^T | x1 = −x2 }
= { (−x2, x2)^T }
= { x2 (−1, 1)^T | x2 is a non-zero real number }

An eigenvector corresponding to λ1 = 2 is (−1, 1)^T.

case(ii) λ2 = 5.

(5I − A)~x = 0

( 5−4  −2  ) ( x1 )   ( 0 )
( −1   5−3 ) ( x2 ) = ( 0 )

( 1  −2 ) ( x1 )   ( 0 )
( −1  2 ) ( x2 ) = ( 0 )

( x1 − 2x2  )   ( 0 )
( −x1 + 2x2 ) = ( 0 )

We get two equations:

x1 − 2x2 = 0
−x1 + 2x2 = 0

Both equations give the relation x1 = 2x2. Therefore the set of eigenvectors corresponding to λ2 = 5 is,

{ (x1, x2)^T | x1 = 2x2 }
= { (2x2, x2)^T }
= { x2 (2, 1)^T | x2 is a non-zero real number }

An eigenvector corresponding to λ2 = 5 is (2, 1)^T.
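The three-step recipe above can be cross-checked with NumPy's eigenvalue routine (assumed available). Note `np.linalg.eig` returns unit-length eigenvectors, so they agree with ours only up to scaling:

```python
import numpy as np

A = np.array([[4, 2], [1, 3]])
vals, vecs = np.linalg.eig(A)
print(np.sort(vals))          # [2. 5.]

# Each column of `vecs` satisfies A v = lambda v (up to scaling).
for lam, v in zip(vals, vecs.T):
    assert np.allclose(A @ v, lam * v)
```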

2.
    (  4  0  1 )
A = ( −1 −6 −2 ) .
    (  5  0  0 )

The characteristic equation is:

det(λI − A) = 0

| λ−4   0   −1 |
|  1   λ+6   2 | = 0
| −5    0    λ |

(λ − 4)((λ + 6)(λ) − 0) + (−1)(0 − (−5)(λ + 6)) = 0
λ³ + 2λ² − 29λ − 30 = 0

We need to solve the characteristic equation, i.e. we need to factorize the characteristic polynomial. In general it is quite difficult to guess what the factors of a cubic may be. We try λ = ±1, ±2, ±3, etc. and hope to quickly find one root. Trying λ = −1 gives (−1)³ + 2(−1)² − 29(−1) − 30 = 0, so (λ + 1) is a factor. We divide the polynomial λ³ + 2λ² − 29λ − 30 by λ + 1 using long division:

            λ² + λ − 30
        ___________________
λ + 1 ) λ³ + 2λ² − 29λ − 30
        λ³ +  λ²
        --------
              λ² − 29λ − 30
              λ² +   λ
        -------------------
                  −30λ − 30
                  −30λ − 30
        -------------------
                          0

The quotient is λ² + λ − 30 and the remainder is 0. Therefore

λ³ + 2λ² − 29λ − 30 = (λ + 1)(λ² + λ − 30) + 0.

λ³ + 2λ² − 29λ − 30 = 0
(λ + 1)(λ² + λ − 30) = 0
(λ + 1)(λ + 6)(λ − 5) = 0

Therefore the eigenvalues are: {−1, −6, 5}.
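The long division and the resulting roots can be checked numerically; a sketch using NumPy's polynomial helpers (assumed available):

```python
import numpy as np

# Divide lambda^3 + 2 lambda^2 - 29 lambda - 30 by (lambda + 1).
quotient, remainder = np.polydiv([1, 2, -29, -30], [1, 1])
print(quotient)    # coefficients of lambda^2 + lambda - 30
print(remainder)   # zero remainder: (lambda + 1) is an exact factor

roots = np.roots([1, 2, -29, -30])
print(np.sort(roots))   # the eigenvalues -6, -1, 5
```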

Alternatively, we can factor the determinant directly without long division:

| λ−4   0   −1 |
|  1   λ+6   2 | = 0
| −5    0    λ |

(λ − 4)((λ + 6)(λ) − 0) − 5(λ + 6) = 0
(λ + 6)(λ(λ − 4) − 5) = 0
(λ + 6)(λ² − 4λ − 5) = 0
(λ + 6)(λ − 5)(λ + 1) = 0

Therefore the eigenvalues of A are: λ1 = −1, λ2 = −6 and λ3 = 5.

(iii) Find Eigenvectors corresponding to each Eigenvalue:

case(i) λ1 = −1.

The eigenvectors are the solution space of the following system:

( −5 0 −1 ) ( x1 )   ( 0 )
(  1 5  2 ) ( x2 ) = ( 0 )
( −5 0 −1 ) ( x3 )   ( 0 )

−5x1 − x3 = 0;  x1 = −(1/5) x3
x1 + 5x2 + 2x3 = 0;  x2 = −(9/25) x3

The set of eigenvectors corresponding to λ1 = −1 is,

{ (x1, x2, x3)^T | x1 = −(1/5) x3, x2 = −(9/25) x3 }
= { (−(1/5) x3, −(9/25) x3, x3)^T }
= { x3 (−1/5, −9/25, 1)^T | x3 is a non-zero real number }

An eigenvector corresponding to λ1 = −1 is (−1/5, −9/25, 1)^T.

case(ii) λ2 = −6.

The eigenvectors are the solution space of the following system:

( −10 0 −1 ) ( x1 )   ( 0 )
(   1 0  2 ) ( x2 ) = ( 0 )
(  −5 0 −6 ) ( x3 )   ( 0 )

−10x1 − x3 = 0;  x3 = −10x1
x1 + 2x3 = 0;  x1 = −2x3 = −2(−10)x1 = 20x1
⟹ x1 = 0 = x3

The set of eigenvectors corresponding to λ2 = −6 is,

{ (x1, x2, x3)^T | x1 = x3 = 0 }
= { (0, x2, 0)^T | x2 is a non-zero real number }
= { x2 (0, 1, 0)^T | x2 is a non-zero real number }

An eigenvector corresponding to λ2 = −6 is (0, 1, 0)^T.

case(iii) λ3 = 5.

The eigenvectors are the solution space of the following system:

(  1  0 −1 ) ( x1 )   ( 0 )
(  1 11  2 ) ( x2 ) = ( 0 )
( −5  0  5 ) ( x3 )   ( 0 )

x1 − x3 = 0;  x1 = x3
x1 + 11x2 + 2x3 = 0;  x2 = −(3/11) x3

The set of eigenvectors corresponding to λ3 = 5 is,

{ (x1, x2, x3)^T | x1 = x3, x2 = −(3/11) x3 }
= { (x3, −(3/11) x3, x3)^T }
= { x3 (1, −3/11, 1)^T | x3 is a non-zero real number }

An eigenvector corresponding to λ3 = 5 is (1, −3/11, 1)^T.
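Each eigenvalue/eigenvector pair found above can be verified directly from the definition A~v = λ~v; a quick NumPy check (assumed available):

```python
import numpy as np

A = np.array([[4, 0, 1], [-1, -6, -2], [5, 0, 0]])
pairs = [(-1, np.array([-1/5, -9/25, 1])),
         (-6, np.array([0, 1, 0])),
         (5,  np.array([1, -3/11, 1]))]

for lam, v in pairs:
    # A v should equal lambda v for every pair above.
    print(lam, np.allclose(A @ v, lam * v))   # True for each pair
```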

3. An example of a repeated eigenvalue having two eigenvectors.

    ( 0 1 1 )
A = ( 1 0 1 ) .
    ( 1 1 0 )

The characteristic equation is:

| λ  −1 −1 |
| −1  λ −1 | = 0
| −1 −1  λ |

λ(λ² − (−1)(−1)) − (−1)((−1)λ − (−1)(−1)) + (−1)((−1)(−1) − λ(−1)) = 0
λ³ − 3λ − 2 = 0

We need to solve the characteristic equation, i.e. we need to factorize the characteristic polynomial. Again we try λ = ±1, ±2, etc. and hope to quickly find one root. Trying λ = −1 gives (−1)³ − 3(−1) − 2 = 0, so (λ + 1) is a factor. We divide the polynomial λ³ − 3λ − 2 by λ + 1:

            λ² − λ − 2
        _________________
λ + 1 ) λ³ + 0λ² − 3λ − 2
        λ³ +  λ²
        --------
             −λ² − 3λ − 2
             −λ² −  λ
        -----------------
                 −2λ − 2
                 −2λ − 2
        -----------------
                       0

The quotient is λ² − λ − 2 and the remainder is 0. Therefore

λ³ − 3λ − 2 = (λ + 1)(λ² − λ − 2) + 0.

λ³ − 3λ − 2 = 0
(λ + 1)(λ² − λ − 2) = 0
(λ + 1)(λ + 1)(λ − 2) = 0

Alternatively, factoring directly:

| λ  −1 −1 |
| −1  λ −1 | = 0
| −1 −1  λ |

λ(λ² − 1) + 1(−λ − 1) − 1(1 + λ) = 0
λ(λ − 1)(λ + 1) − 1(λ + 1) − 1(1 + λ) = 0
(λ + 1)(λ(λ − 1) − 1 − 1) = 0
(λ + 1)(λ² − λ − 2) = 0
(λ + 1)²(λ − 2) = 0

Therefore the eigenvalues of A are: λ1 = −1 (repeated twice) and λ2 = 2.

(iii) Find Eigenvectors corresponding to each Eigenvalue:

case(i) λ1 = −1.

The eigenvectors are the solution space of the following system:

( −1 −1 −1 ) ( x1 )   ( 0 )
( −1 −1 −1 ) ( x2 ) = ( 0 )
( −1 −1 −1 ) ( x3 )   ( 0 )

−x1 − x2 − x3 = 0;  x1 = −x2 − x3

The set of eigenvectors corresponding to λ1 = −1 is,

{ (x1, x2, x3)^T | x1 = −x2 − x3 }
= { (−x2 − x3, x2, x3)^T | at least one of x2 and x3 is a non-zero real number }
= { x2 (−1, 1, 0)^T + x3 (−1, 0, 1)^T | at least one of x2 and x3 is a non-zero real number }

We can choose two linearly independent eigenvectors corresponding to λ1 = −1:

~v1 = (−1, 1, 0)^T,  ~v2 = (−1, 0, 1)^T.

case(ii) λ2 = 2.

The eigenvectors are the solution space of the following system:

(  2 −1 −1 ) ( x1 )   ( 0 )
( −1  2 −1 ) ( x2 ) = ( 0 )
( −1 −1  2 ) ( x3 )   ( 0 )

2x1 − x2 − x3 = 0
−x1 + 2x2 − x3 = 0
−x1 − x2 + 2x3 = 0

Since none of the equations reduces immediately, it would be a good idea to use row reduction to simplify the system.

(  2 −1 −1 | 0 )  R1↔R3  ( −1 −1  2 | 0 )
( −1  2 −1 | 0 )    →    ( −1  2 −1 | 0 )
( −1 −1  2 | 0 )         (  2 −1 −1 | 0 )

R1=(−1)R1  (  1  1 −2 | 0 )
    →      ( −1  2 −1 | 0 )
           (  2 −1 −1 | 0 )

R2=R1+R2    ( 1  1 −2 | 0 )
R3=R3−2R1 → ( 0  3 −3 | 0 )
            ( 0 −3  3 | 0 )

R2=(1/3)R2  ( 1  1 −2 | 0 )
    →       ( 0  1 −1 | 0 )
            ( 0 −3  3 | 0 )

R3=R3+3R2  ( 1 1 −2 | 0 )
    →      ( 0 1 −1 | 0 )
           ( 0 0  0 | 0 )

The resulting set of equations is:

x2 − x3 = 0 ⟹ x2 = x3
x1 + x2 − 2x3 = 0 ⟹ x1 = x3

The set of eigenvectors corresponding to λ2 = 2 is,

{ (x1, x2, x3)^T | x1 = x2 = x3 }
= { (x3, x3, x3)^T | x3 is a non-zero real number }
= { x3 (1, 1, 1)^T | x3 is a non-zero real number }

An eigenvector corresponding to λ2 = 2 is (1, 1, 1)^T.

4. An example of a repeated eigenvalue having only one linearly independent eigenvector.

    ( 6  3 −8 )
A = ( 0 −2  0 ) .
    ( 1  0 −3 )

Solution: The characteristic polynomial for A:

| λ−6 −3   8  |
|  0  λ+2  0  | = 0
| −1   0  λ+3 |

(λ − 6)((λ + 2)(λ + 3) − 0) − (−3)(0 − 0) + 8(0 − (−1)(λ + 2)) = 0
λ³ − λ² − 16λ − 20 = 0
(λ + 2)(λ² − 3λ − 10) = 0
(λ + 2)²(λ − 5) = 0

Therefore the eigenvalues of A are: λ1 = −2 (repeated twice) and λ2 = 5.

case(i) λ1 = −2.

The eigenvectors are the solution space of the following system:

( −8 −3 8 ) ( x1 )   ( 0 )
(  0  0 0 ) ( x2 ) = ( 0 )
( −1  0 1 ) ( x3 )   ( 0 )

−x1 + x3 = 0 ⟹ x1 = x3
−8x1 − 3x2 + 8x3 = 0 ⟹ x2 = 0

The set of eigenvectors corresponding to λ1 = −2 is,

{ (x1, x2, x3)^T | x1 = x3, x2 = 0 }
= { (x3, 0, x3)^T | x3 is a non-zero real number }
= { x3 (1, 0, 1)^T | x3 is a non-zero real number }

An eigenvector corresponding to λ1 = −2 is (1, 0, 1)^T.

case(ii) λ2 = 5.

The eigenvectors are the solution space of the following system:

( −1 −3 8 ) ( x1 )   ( 0 )
(  0  7 0 ) ( x2 ) = ( 0 )
( −1  0 8 ) ( x3 )   ( 0 )

−x1 + 8x3 = 0 ⟹ x1 = 8x3
7x2 = 0 ⟹ x2 = 0

The set of eigenvectors corresponding to λ2 = 5 is,

{ (x1, x2, x3)^T | x1 = 8x3, x2 = 0 }
= { (8x3, 0, x3)^T | x3 is a non-zero real number }
= { x3 (8, 0, 1)^T | x3 is a non-zero real number }

An eigenvector corresponding to λ2 = 5 is (8, 0, 1)^T.

5. An example of a triply repeated eigenvalue having two linearly independent eigenvectors.

    ( 4 0 −1 )
A = ( 0 3  0 ) .
    ( 1 0  2 )

Solution: The characteristic equation for A:

det(λI − A) = 0

| λ−4   0    1  |
|  0   λ−3   0  | = 0
| −1    0   λ−2 |

λ³ − 9λ² + 27λ − 27 = 0
(λ − 3)³ = 0

We now find the eigenvectors corresponding to λ = 3:

( −1 0 1 ) ( x1 )   ( 0 )
(  0 0 0 ) ( x2 ) = ( 0 )
( −1 0 1 ) ( x3 )   ( 0 )

−x1 + x3 = 0 ⟹ x1 = x3

The set of eigenvectors is,

{ (x1, x2, x3)^T | x1 = x3 }
= { (x3, x2, x3)^T | at least one of x2 and x3 is a non-zero real number }
= { x2 (0, 1, 0)^T + x3 (1, 0, 1)^T | at least one of x2 and x3 is a non-zero real number }

We can choose two linearly independent eigenvectors corresponding to the eigenvalue λ = 3:

(0, 1, 0)^T and (1, 0, 1)^T.

8.3 Dimension of Eigenspace

This section is optional for the exam. Here we will see why a repeated

eigenvalue can have one or more linearly independent eigenvectors associated

to it. First some definitions.

Definition. The rank of a matrix is defined to be the number of linearly

independent rows or columns.

• Let A be a square matrix and let Aechelon be the row echelon form of

the matrix. Then both A and Aechelon have the same number of linearly

independent rows.

• Therefore, Rank(A) = Rank(Aechelon ).

• But Rank(Aechelon )= number of linearly independent rows of Aechelon

= number of non-zero rows of Aechelon

• We may conclude that

Rank(A) = Rank(Aechelon ) = number of non-zero rows in Aechelon .

Example. 1.

     ( −1 −1 −1 )        ( 1 1 1 )
Rank ( −1 −1 −1 ) = Rank ( 0 0 0 ) = 1.
     ( −1 −1 −1 )        ( 0 0 0 )

2.
     ( −8 −3 8 )        ( 1  0 −1 )
Rank (  0  0 0 ) = Rank ( 0 −3  0 ) = 2.
     ( −1  0 1 )        ( 0  0  0 )

3.
     ( −1 0 1 )        ( −1 0 1 )
Rank (  0 0 0 ) = Rank (  0 0 0 ) = 1.
     ( −1 0 1 )        (  0 0 0 )

Definition. Let A be a square matrix of size n × n having an eigenvalue λ. The set of eigenvectors corresponding to the eigenvalue λ, along with the zero vector, is called the eigenspace corresponding to eigenvalue λ.

The dimension of the eigenspace is defined to be the maximum number of

linearly independent vectors in the eigenspace.

Theorem Let A be a matrix of size n × n and let λ be an eigenvalue of A

repeating k times. Then the dimension of the eigenspace of λ or equivalently

the maximum number of linearly independent eigenvectors corresponding to

λ is:

n − Rank(λI − A).

Example. Let us use this result to see why some repeated eigenvalues had one or more linearly independent eigenvectors in the examples in the previous section.

1. Recall

    ( 0 1 1 )
A = ( 1 0 1 )
    ( 1 1 0 )

• has eigenvalue λ = −1 repeating 2 times.

• We found two linearly independent eigenvectors corresponding to this

eigenvalue.

~v1 = (−1, 1, 0)^T,  ~v2 = (−1, 0, 1)^T.

We will use the above theorem to confirm that we should indeed expect

to find two linearly independent eigenvectors. By the above theorem, the

maximum number of linearly independent eigenvectors corresponding to

λ = −1 is:

= 3 − Rank(λI − A)
= 3 − Rank( −1 −1 −1 ; −1 −1 −1 ; −1 −1 −1 )
= 3 − 1
= 2

2. Recall

    ( 6  3 −8 )
A = ( 0 −2  0 )
    ( 1  0 −3 )

• has eigenvalue λ = −2 repeating 2 times.

• We found only one linearly independent eigenvector corresponding to this eigenvalue:

~v1 = (1, 0, 1)^T.

We will use the above theorem to confirm that we should indeed expect

to find only one linearly independent eigenvector. By the above theorem,

the maximum number of linearly independent eigenvectors corresponding

to λ = −2 is:

= 3 − Rank(λI − A)
= 3 − Rank( −8 −3 8 ; 0 0 0 ; −1 0 1 )
= 3 − 2
= 1

3. Recall

    ( 4 0 −1 )
A = ( 0 3  0 )
    ( 1 0  2 )

• has eigenvalue λ = 3 repeating 3 times.

• We found only two linearly independent eigenvectors corresponding to this eigenvalue:

~v1 = (0, 1, 0)^T,  ~v2 = (1, 0, 1)^T.

We will use the above theorem to confirm that we should indeed expect to

find only two linearly independent eigenvectors. By the above theorem,

the maximum number of linearly independent eigenvectors corresponding

to λ = 3 is:

= 3 − Rank(λI − A)
= 3 − Rank( −1 0 1 ; 0 0 0 ; −1 0 1 )
= 3 − 1
= 2
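The formula n − Rank(λI − A) for the eigenspace dimension is easy to check numerically; a sketch using NumPy's rank routine (assumed available) on the three examples above:

```python
import numpy as np
from numpy.linalg import matrix_rank

def eigenspace_dim(A, lam):
    """Number of linearly independent eigenvectors for eigenvalue lam."""
    n = A.shape[0]
    return n - matrix_rank(lam * np.eye(n) - A)

d1 = eigenspace_dim(np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]]), -1)
d2 = eigenspace_dim(np.array([[6, 3, -8], [0, -2, 0], [1, 0, -3]]), -2)
d3 = eigenspace_dim(np.array([[4, 0, -1], [0, 3, 0], [1, 0, 2]]), 3)
print(d1, d2, d3)   # 2 1 2, matching the three computations above
```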

8.4 Diagonalization

Definition. A square matrix of size n is called a diagonal matrix if it is of the form

( d1 0  .  .  .   0  )
( 0  d2 0  .  .   0  )
( .  .  .         .  )
( 0  .  .  dn−1   0  )
( 0  .  .  . 0    dn )

Definition. Two square matrices A and B of size n are said to be similar, if there exists an invertible matrix P such that

B = P^(-1) A P.

Definition. A square matrix A is said to be diagonalizable if it is similar to a diagonal matrix, i.e. there exists an invertible matrix P such that P^(-1) A P = D where D is a diagonal matrix. In this case, P is called a diagonalizing matrix.

Example. Determine whether the following matrix is diagonalizable:

A = ( 4 2 )
    ( 1 3 ) .

Solution: In previous lectures we have found:

the eigenvalues of A are λ1 = 2 and λ2 = 5;
an eigenvector corresponding to λ1 = 2 is (−1, 1)^T;
an eigenvector corresponding to λ2 = 5 is (2, 1)^T.

Define a new matrix P such that its columns are the eigenvectors of A:

P = ( −1 2 )
    (  1 1 ) .

Note that since eigenvectors corresponding to distinct eigenvalues are linearly

independent, the columns of P are linearly independent, hence P is an in-

vertible matrix. (We can also confirm this by verifying that the determinant

of P is non-zero).

det(P) = (−1)(1) − (2)(1) = −3, so

P^(-1) = −(1/3) (  1 −2 )
                ( −1 −1 )

P^(-1) A P = −(1/3) (  1 −2 ) ( 4 2 ) ( −1 2 )
                    ( −1 −1 ) ( 1 3 ) (  1 1 )

           = −(1/3) (  2 −4 ) ( −1 2 )
                    ( −5 −5 ) (  1 1 )

           = −(1/3) ( −6   0 )
                    (  0 −15 )

           = ( 2 0 )
             ( 0 5 ) ,

which is a diagonal matrix. Therefore A is diagonalizable and a diagonalizing matrix for A is

P = ( −1 2 )
    (  1 1 ) ,

and the corresponding diagonal matrix D is

D = ( 2 0 )
    ( 0 5 ) .
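The diagonalization P^(-1) A P = D can be reproduced in a couple of lines with NumPy (assumed available):

```python
import numpy as np

A = np.array([[4, 2], [1, 3]])
P = np.array([[-1, 2], [1, 1]])      # columns are the eigenvectors of A
D = np.linalg.inv(P) @ A @ P
print(np.round(D))                   # diagonal matrix with entries 2 and 5
```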

Theorem. A square matrix A of size n × n is diagonalizable if and only if there exist n linearly independent eigenvectors. In this case a diagonalizing matrix P is formed by taking as its columns the eigenvectors of A, and

P^(-1) A P = D,

where D is a diagonal matrix whose diagonal entries are eigenvalues corresponding to the eigenvectors of A. The ith diagonal entry is the eigenvalue corresponding to the eigenvector in the ith column of P.

Theorem. If λ1 and λ2 are distinct eigenvalues of A, and ~v1 and ~v2 are eigenvectors corresponding to λ1 and λ2 respectively, then ~v1 and ~v2 are linearly independent.

Corollary. If an n × n matrix A has n distinct eigenvalues then A is diagonalizable.

Example. For the following matrices:

(i) Find the eigenvalues and eigenvectors of A.

(ii) Is A diagonalizable?

(iii) If it is diagonalizable, find a diagonalizing matrix P such that P −1 AP =

D, where D is a diagonal matrix.

1.
    (  4  0  1 )
A = ( −1 −6 −2 ) .
    (  5  0  0 )

Solution: (i) In previous lectures we have found:

The eigenvalues of A are: λ1 = −1, λ2 = −6 and λ3 = 5.
An eigenvector corresponding to λ1 = −1 is (−1/5, −9/25, 1)^T.
An eigenvector corresponding to λ2 = −6 is (0, 1, 0)^T.
An eigenvector corresponding to λ3 = 5 is (1, −3/11, 1)^T.

(ii) The matrix A is diagonalizable, since A is a square matrix of size

3 × 3 and we have found three linearly independent eigenvectors

for A.

(iii) A diagonalizing matrix P has the eigenvectors as its columns. Let

    ( −1/5   0   1    )
P = ( −9/25  1  −3/11 ) .
    (  1     0   1    )

The corresponding diagonal matrix D will have the eigenvalues as diagonal entries corresponding to the columns of P, i.e. the ith diagonal entry will be the eigenvalue corresponding to the eigenvector in the ith column:

    ( −1  0 0 )
D = (  0 −6 0 ) .
    (  0  0 5 )

We will now verify that P^(-1) A P = D.

     (  4  0  1 ) ( −1/5   0   1    )   ( 1/5    0   5      )
AP = ( −1 −6 −2 ) ( −9/25  1  −3/11 ) = ( 9/25  −6  −15/11  )
     (  5  0  0 ) (  1     0   1    )   ( −1     0   5      )

     ( −1/5   0   1    ) ( −1  0 0 )   ( 1/5    0   5      )
PD = ( −9/25  1  −3/11 ) (  0 −6 0 ) = ( 9/25  −6  −15/11  )
     (  1     0   1    ) (  0  0 5 )   ( −1     0   5      )

Therefore

AP = PD    (1)

The matrix P has linearly independent columns, hence a non-zero determinant, and is therefore invertible. We multiply (1) by P^(-1) on both sides:

P^(-1) A P = P^(-1) P D
           = I D
           = D

2.
    ( 6  3 −8 )
A = ( 0 −2  0 ) .
    ( 1  0 −3 )

Solution: In a previous lecture we found that A has two eigenvalues:

λ1 = −2 occurring with multiplicity 2, and
λ2 = 5 occurring with multiplicity 1.

We could find only one linearly independent eigenvector corresponding to λ1 = −2, and one linearly independent eigenvector corresponding to λ2 = 5. Therefore we could find only two linearly independent eigenvectors. By the above theorem, A is not diagonalizable.

Definition. Let A be a square matrix of size n. A is a symmetric matrix if A^T = A.

Definition. A square matrix P is called orthogonal if its columns are mutually orthogonal unit vectors. For an orthogonal matrix,

P^(-1) = P^T.

Theorem. If A is a real symmetric matrix, then A is diagonalizable by an orthogonal matrix, i.e. there exists an orthogonal matrix P such that P^T A P = D, a diagonal matrix. Moreover, eigenvectors of a real symmetric matrix corresponding to distinct eigenvalues are mutually orthogonal.

Example. Recall

    ( 0 1 1 )
A = ( 1 0 1 ) .
    ( 1 1 0 )

(i) Find the eigenvalues and eigenvectors of A.
(ii) Is A diagonalizable?
(iii) If so, find an orthogonal diagonalizing matrix P such that P^T A P = D, a diagonal matrix.

Solution: We found the eigenvalues and eigenvectors of this matrix in a previous lecture.

(i), (ii) Observe that A is a real symmetric matrix. By the above theorem, we know that A is diagonalizable, i.e. we will be able to find a sufficient number of linearly independent eigenvectors.

The eigenvalues of A were −1 and 2. We found two linearly independent eigenvectors corresponding to λ1 = −1:

~v1 = (−1, 1, 0)^T,  ~v2 = (−1, 0, 1)^T.

And one eigenvector corresponding to λ2 = 2: (1, 1, 1)^T.

Since A is a real symmetric matrix, eigenvectors corresponding to distinct eigenvalues are orthogonal, i.e. (1, 1, 1)^T is orthogonal to both (−1, 1, 0)^T and (−1, 0, 1)^T.

However the eigenvectors corresponding to eigenvalue λ1 = −1, ~v1 = (−1, 1, 0)^T and ~v2 = (−1, 0, 1)^T, are not orthogonal to each other, since we chose them from the eigenspace by making arbitrary choices*. We will have to use Gram Schmidt to get two corresponding orthogonal vectors.

~u1 = ~v1 = (−1, 1, 0)^T

Proj_{~u1} ~v2 = ((~v2 · ~u1)/(~u1 · ~u1)) ~u1
             = (((−1, 0, 1) · (−1, 1, 0))/2) (−1, 1, 0)^T
             = (1/2) (−1, 1, 0)^T

~u2 = ~v2 − Proj_{~u1} ~v2
    = (−1, 0, 1)^T − (−1/2, 1/2, 0)^T
    = (−1/2, −1/2, 1)^T

Note that since ~u1 = ~v1 and ~u2 is a linear combination of ~v1 and ~v2, the two new vectors are also eigenvectors of λ = −1. We now have a set of orthogonal eigenvectors:

{ (1, 1, 1)^T, (−1, 1, 0)^T, (−1/2, −1/2, 1)^T }.

We normalize the vectors to get a set of orthonormal vectors:

{ (1/√3, 1/√3, 1/√3)^T, (−1/√2, 1/√2, 0)^T, (−1/√6, −1/√6, 2/√6)^T }.

We are now finally ready to write the orthonormal diagonalizing matrix:

    ( 1/√3  −1/√2  −1/√6 )
P = ( 1/√3   1/√2  −1/√6 )
    ( 1/√3   0      2/√6 )

and the corresponding diagonal matrix D:

    ( 2  0  0 )
D = ( 0 −1  0 ) .
    ( 0  0 −1 )

     ( 0 1 1 ) ( 1/√3  −1/√2  −1/√6 )   ( 2/√3   1/√2   1/√6 )
AP = ( 1 0 1 ) ( 1/√3   1/√2  −1/√6 ) = ( 2/√3  −1/√2   1/√6 )
     ( 1 1 0 ) ( 1/√3   0      2/√6 )   ( 2/√3   0     −2/√6 )

     ( 1/√3  −1/√2  −1/√6 ) ( 2  0  0 )   ( 2/√3   1/√2   1/√6 )
PD = ( 1/√3   1/√2  −1/√6 ) ( 0 −1  0 ) = ( 2/√3  −1/√2   1/√6 )
     ( 1/√3   0      2/√6 ) ( 0  0 −1 )   ( 2/√3   0     −2/√6 )

Therefore

AP = PD    (1)

The matrix P has linearly independent columns, hence a non-zero determinant, and is therefore invertible. We multiply (1) by P^(-1) on both sides:

P^(-1) A P = P^(-1) P D
           = I D
           = D    (2)

Also since P is orthogonal, we have P^(-1) = P^T, i.e. P P^T = I = P^T P:

( 1/√3  −1/√2  −1/√6 ) (  1/√3   1/√3  1/√3 )   ( 1 0 0 )
( 1/√3   1/√2  −1/√6 ) ( −1/√2   1/√2  0    ) = ( 0 1 0 ) .
( 1/√3   0      2/√6 ) ( −1/√6  −1/√6  2/√6 )   ( 0 0 1 )

Therefore from (2) and since P^(-1) = P^T we finally get the relation

P^T A P = D.
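The whole orthogonal diagonalization above can be verified numerically; a NumPy sketch (assumed available) using the orthonormal P just constructed:

```python
import numpy as np

A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
s3, s2, s6 = np.sqrt(3), np.sqrt(2), np.sqrt(6)
P = np.array([[1/s3, -1/s2, -1/s6],
              [1/s3,  1/s2, -1/s6],
              [1/s3,  0,     2/s6]])

# P is orthogonal, so P^-1 = P^T ...
assert np.allclose(P.T @ P, np.eye(3))
# ... and P^T A P is the diagonal matrix diag(2, -1, -1).
print(np.round(P.T @ A @ P, 10))
```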

Note. *: Look back at how we selected the eigenvectors ~v1 and ~v2; we chose x2 = 1, x3 = 0 to get ~v1 and x2 = 0, x3 = 1 to get ~v2. If we had instead chosen x2 = 1, x3 = 0 to get ~v1 and x2 = −1/2, x3 = 1 to get ~v2, then ~v1 and ~v2 would be orthogonal. However it is much easier to make arbitrary choices for x2 and x3 and then use the Gram Schmidt Process to orthogonalize the vectors as we have done in this example.

Summary

1. Suppose that A is an n × n square matrix. Suppose also that ~x is a non-zero vector in Rn and that λ is a scalar so that,

A~x = λ~x.

We then call ~x an eigenvector of A and λ an eigenvalue of A.

2. To find eigenvalues and eigenvectors:

(i) Form the characteristic equation

det(λI − A) = 0.

(ii) To find all the eigenvalues of A, solve the characteristic equation for λ.

(iii) For each eigenvalue λ, to find the corresponding set of eigenvectors, find a non-zero solution to the linear system of equations:

(λI − A)~x = 0.

3. A square matrix A of size n × n is diagonalizable if and only if there exist n linearly independent eigenvectors. In this case a diagonalizing matrix P is formed by taking as its columns the eigenvectors of A. Then we have

P^(-1) A P = D,

where D is a diagonal matrix whose ith diagonal entry is the eigenvalue corresponding to the eigenvector in the ith column of P.

4. If A is a real symmetric matrix, then there exists an orthogonal matrix P such that:

(i) P^T A P = D, a diagonal matrix.
(ii) The diagonal entries of D are the eigenvalues of A.
(iii) The column vectors of P are eigenvectors of A.
(iv) If λi ≠ λj then the corresponding eigenvectors are orthogonal.

Note. Point (iv) above says that if we have 3 distinct eigenvalues, then

the 3 corresponding eigenvectors will be mutually orthogonal and we can

define P to be the matrix whose columns are these eigenvectors.

However if we have a repeated root, then the 2 eigenvectors corresponding

to the repeated eigenvalue will not necessarily be orthogonal. Here we will

have to use Gram Schmidt to orthogonalize the vectors. Note that the

orthogonalized vectors will also be eigenvectors of the repeated eigenvalue.


9 Operators and Commutators

9.1 Operators

Operators are commonly used to perform a specific mathematical operation

on another function. The operation can be to take the derivative or integrate

with respect to a particular term, or to multiply, divide, add or subtract a

number or term with regards to the initial function. Operators are commonly

used in physics, mathematics and chemistry.

Example (operators on numbers). (i) Scaling

f : x 7→ ax,

where a is a real number.

Operator: the function f

Operates on: real numbers

Action: multiply by a.

(ii) Squaring

f : x 7→ x2

Operator: the function f

Operates on: real numbers

Action: squaring

Example (operators on functions). (i)

â : f (x) 7→ af (x),

where a is a real number.

Operator: â

Operates on: scalar functions

Action: multiply by a.

(ii)

x̂ : f (x) 7→ xf (x),

Operator: x̂

Operates on: scalar functions

Action: multiply by x.

(iii) Differentiation Operator

d

D : f (x) 7→ f (x)

dx

Operator: D

Operates on: scalar functions

Action: Differentiate with respect to x.

We can similarly define partial differentiation operators, grad opera-

tor, integral operator, etc.

(iv) The Momentum Operator

P̂ = (ℏ/i) ∂/∂x

where:

– ℏ is Planck's constant,
– i is the imaginary unit.

P̂ = (ℏ/i) ∂/∂x : ψ(x, t) ↦ (ℏ/i) ∂ψ(x, t)/∂x

Operator: P̂
Operates on: wave function, ψ(x, t).

(v) The Energy Operator

Ê = (ℏ/i) ∂/∂t

where:

– ℏ is Planck's constant,
– i is the imaginary unit.

Ê = (ℏ/i) ∂/∂t : ψ(x, t) ↦ (ℏ/i) ∂ψ(x, t)/∂t

Operator: Ê
Operates on: wave function, ψ(x, t).

(vi) The Hamiltonian Operator

Ĥ = −(ℏ²/2m) ∂²/∂x²

where:

– ℏ is Planck's constant,
– m is the mass of the particle (regarded as constant).

Ĥ = −(ℏ²/2m) ∂²/∂x² : ψ(x, t) ↦ −(ℏ²/2m) ∂²ψ(x, t)/∂x²

Operator: Ĥ
Operates on: wave function, ψ(x, t).

Example. (i) Let

A = ( 1 1 )
    ( 0 1 )

be a square matrix and ~v = (x1, x2)^T be an arbitrary vector. Define an operator in the following manner:

A : ~v ↦ A~v

( x1 )    ( 1 1 ) ( x1 )   ( x1 + x2 )
( x2 ) ↦  ( 0 1 ) ( x2 ) = (   x2    ) .

Operator: A
Operates on: two dimensional vectors
Action: maps the vector (x1, x2)^T to (x1 + x2, x2)^T.

9.2 Linear Operators

An operator O is a linear operator if it satisfies the following two conditions:

(i) O(f + g) = O(f) + O(g),
(ii) O(λf) = λO(f), where λ is a scalar.

Example. Determine whether the following operators are linear.

1. O = â : f(x) ↦ af(x).

Solution:

O(f + g) = a(f + g) = af + ag = O(f) + O(g)
O(λf) = a(λf) = λaf = λO(f)

Therefore O is linear.

2. O = squaring : x ↦ x².

Solution:

O(x + y) = (x + y)² = x² + y² + 2xy = O(x) + O(y) + 2xy

For example,

O(1 + 2) = O(3) = 9
O(1) + O(2) = 1 + 4 = 5

Since O(1 + 2) ≠ O(1) + O(2), condition (i) fails. Therefore O is not linear.

3. D : f(x) ↦ (d/dx) f(x).

Solution:

D(f + g) = (d/dx)(f + g) = (d/dx) f + (d/dx) g = D(f) + D(g)

D(λf) = (d/dx)(λf) = λ (d/dx) f = λD(f)

Therefore D is linear.

4. A = ( 1 1 ; 0 1 ) : ~v ↦ A~v.

Solution: Let ~v = (x1, x2)^T and ~w = (y1, y2)^T.

A(~v + ~w) = ( 1 1 ) ( x1 + y1 )
             ( 0 1 ) ( x2 + y2 )

           = ( (x1 + y1) + (x2 + y2) )
             (        x2 + y2        )

           = ( x1 + x2 ) + ( y1 + y2 )
             (   x2    )   (   y2    )

           = ( 1 1 ) ( x1 ) + ( 1 1 ) ( y1 )
             ( 0 1 ) ( x2 )   ( 0 1 ) ( y2 )

           = A(~v) + A(~w)

A(λ~v) = ( 1 1 ) ( λx1 )
         ( 0 1 ) ( λx2 )

       = ( λx1 + λx2 )
         (    λx2    )

       = λ ( x1 + x2 )
           (   x2    )

       = λA~v

Therefore A is linear.

9.3 Composing Operators

When we have two operators and we want to apply them to a function in

sequence, we use the notation O1 ◦ O2 .

O1 ◦ O2 : f 7→ O1 (O2 (f )).

Example. Let

O1 : f(x) ↦ xf(x)
O2 : g(x) ↦ (g(x))²

Find O1 ◦ O2 and O2 ◦ O1.

Solution:

(O1 ◦ O2)(h(x)) = O1(O2(h(x)))
               = O1((h(x))²)
               = x(h(x))²

O1 ◦ O2 : h(x) ↦ x(h(x))²

We now find O2 ◦ O1.

(O2 ◦ O1)(h(x)) = O2(O1(h(x)))
               = O2(xh(x))
               = x²(h(x))²

O2 ◦ O1 : h(x) ↦ x²(h(x))²

Note that O1 ◦ O2 ≠ O2 ◦ O1: the order in which we compose operators is important.

9.4 Commutators

Let OA and OB be two operators. The commutator of OA and OB is the

operator defined as

[OA , OB ] = OA ◦ OB − OB ◦ OA .

Example. 1. Consider the following two operators.

OA : ~v ↦ A~v,  A = ( 1 0 )
                    ( 0 0 )

OB : ~v ↦ B~v,  B = ( 1 1 )
                    ( 1 0 )

Find [OA, OB].

Solution: We first find OA ◦ OB.

OA ◦ OB(~v) = OA(OB(~v)) = OA(B~v) = AB~v

OA ◦ OB : ~v ↦ AB~v,  AB = ( 1 1 )
                           ( 0 0 )

( x1 )    ( 1 1 ) ( x1 )   ( x1 + x2 )
( x2 ) ↦  ( 0 0 ) ( x2 ) = (    0    )

We now find OB ◦ OA.

OB ◦ OA(~v) = OB(OA(~v)) = OB(A~v) = BA~v

OB ◦ OA : ~v ↦ BA~v,  BA = ( 1 0 )
                           ( 1 0 )

( x1 )    ( 1 0 ) ( x1 )   ( x1 )
( x2 ) ↦  ( 1 0 ) ( x2 ) = ( x1 )

Therefore the commutator of OA and OB is

[OA, OB] = OA ◦ OB − OB ◦ OA

[OA, OB](~v) = (OA ◦ OB − OB ◦ OA)(~v) = (AB − BA)(~v)

2. Consider the following two operators.

D : f(x) ↦ (D)f(x) = (d/dx) f(x)
xD : f(x) ↦ (xD)f(x) = x (d/dx) f(x)

Find [D, xD].

Solution:

(D ◦ xD)(f(x)) = D(x (d/dx) f(x))
              = (d/dx)(x (d/dx) f(x))
              = x d²f/dx² + df/dx
              = (xD² + D)(f(x))

(xD ◦ D)(f(x)) = xD(D(f(x)))
              = xD((d/dx) f(x))
              = x (d/dx)((d/dx) f(x))
              = x d²f/dx²
              = (xD²)(f(x))

[D, xD] = D ◦ xD − xD ◦ D
        = xD² + D − xD²
        = D
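The operator identity [D, xD] = D can be verified symbolically for an arbitrary function; a sketch using SymPy (an assumed dependency):

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Function('f')(x)

D_xD = sp.diff(x * sp.diff(f, x), x)     # D(xD f) = x f'' + f'
xD_D = x * sp.diff(sp.diff(f, x), x)     # xD(D f) = x f''
commutator = sp.simplify(D_xD - xD_D)
print(commutator)                        # the single derivative f'(x), i.e. [D, xD] = D
```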

3. Consider the two operators

O1 = D + 1 : f(x) ↦ (D + 1)f(x) = (d/dx) f(x) + f(x)
O2 = D + 2 : f(x) ↦ (D + 2)f(x) = (d/dx) f(x) + 2f(x)

Find [O1, O2].

Solution:

O1 ◦ O2(f(x)) = O1((d/dx) f(x) + 2f(x))
             = (d/dx)((d/dx) f(x) + 2f(x)) + ((d/dx) f(x) + 2f(x))
             = d²f/dx² + 2 df/dx + df/dx + 2f
             = (D² + 3D + 2)(f(x))

O2 ◦ O1(f(x)) = O2((d/dx) f(x) + f(x))
             = (d/dx)((d/dx) f(x) + f(x)) + 2((d/dx) f(x) + f(x))
             = d²f/dx² + df/dx + 2 df/dx + 2f
             = (D² + 3D + 2)(f(x))

[O1, O2] = O1 ◦ O2 − O2 ◦ O1
         = (D² + 3D + 2) − (D² + 3D + 2) = 0

4. Find the commutator [x ∂/∂y, y ∂/∂x].

Solution:

[x ∂/∂y, y ∂/∂x] = x ∂/∂y ◦ y ∂/∂x − y ∂/∂x ◦ x ∂/∂y

(x ∂/∂y ◦ y ∂/∂x)(f) = x ∂/∂y (y ∂f/∂x)
                     = x (y ∂²f/∂y∂x + ∂f/∂x)
                     = (xy ∂²/∂y∂x + x ∂/∂x)(f)

(y ∂/∂x ◦ x ∂/∂y)(f) = y ∂/∂x (x ∂f/∂y)
                     = y (x ∂²f/∂x∂y + ∂f/∂y)
                     = (xy ∂²/∂x∂y + y ∂/∂y)(f)

Therefore the commutator is:

[x ∂/∂y, y ∂/∂x] = (xy ∂²/∂y∂x + x ∂/∂x) − (xy ∂²/∂x∂y + y ∂/∂y)
                 = x ∂/∂x − y ∂/∂y,

since the mixed partial derivatives are equal.
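This commutator can also be checked symbolically for an arbitrary function of two variables; a SymPy sketch (assumed available):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Function('f')(x, y)

# Apply each composed operator to f, then subtract.
lhs = x * sp.diff(y * sp.diff(f, x), y) - y * sp.diff(x * sp.diff(f, y), x)
# Expected result: (x d/dx - y d/dy) applied to f.
rhs = x * sp.diff(f, x) - y * sp.diff(f, y)
print(sp.simplify(lhs - rhs))    # 0
```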

5. Find the commutator of the following two operators.

Ê : ψ(x, t) ↦ iℏ ∂ψ(x, t)/∂t
t̂ : ψ(x, t) ↦ tψ(x, t)

Solution:

[Ê, t̂]ψ(x, t) = (iℏ ∂/∂t ◦ t̂ − t̂ ◦ iℏ ∂/∂t)(ψ(x, t))
             = (iℏ ∂/∂t ◦ t̂)(ψ(x, t)) − (t̂ ◦ iℏ ∂/∂t)(ψ(x, t))
             = iℏ ∂/∂t (tψ(x, t)) − t iℏ ∂ψ(x, t)/∂t
             = iℏ t ∂ψ(x, t)/∂t + iℏ ψ(x, t) ∂t/∂t − t iℏ ∂ψ(x, t)/∂t
             = iℏ ψ(x, t)

6. Find the commutator of the following two operators.

Ĥ : ψ(x, t) ↦ −(ℏ²/2m) ∂²ψ(x, t)/∂x²
x̂ : ψ(x, t) ↦ xψ(x, t)

Solution:

[Ĥ, x̂] = Ĥ ◦ x̂ − x̂ ◦ Ĥ

(Ĥ ◦ x̂)(ψ(x, t)) = −(ℏ²/2m) ∂²/∂x² (x̂(ψ))
                 = −(ℏ²/2m) ∂²/∂x² (xψ)
                 = −(ℏ²/2m) ∂/∂x (∂/∂x (xψ))
                 = −(ℏ²/2m) ∂/∂x (x ∂ψ/∂x + ψ)
                 = −(ℏ²/2m) (∂/∂x (x ∂ψ/∂x) + ∂ψ/∂x)
                 = −(ℏ²/2m) (x ∂²ψ/∂x² + 2 ∂ψ/∂x)

Ĥ ◦ x̂ = −(ℏ²x/2m) ∂²/∂x² − (ℏ²/m) ∂/∂x

(x̂ ◦ Ĥ)(ψ(x, t)) = (x̂ ◦ (−(ℏ²/2m) ∂²/∂x²))(ψ)
                 = −(ℏ²x/2m) ∂²ψ/∂x²

x̂ ◦ Ĥ = −(ℏ²x/2m) ∂²/∂x²

[Ĥ, x̂] = −(ℏ²x/2m) ∂²/∂x² − (ℏ²/m) ∂/∂x + (ℏ²x/2m) ∂²/∂x²
        = −(ℏ²/m) ∂/∂x

7. Prove the identity

[A ◦ B, C] = A ◦ [B, C] + [A, C] ◦ B.

Solution:

A ◦ [B, C] + [A, C] ◦ B = A ◦ (B ◦ C − C ◦ B) + (A ◦ C − C ◦ A) ◦ B
                        = A ◦ B ◦ C − A ◦ C ◦ B + A ◦ C ◦ B − C ◦ A ◦ B
                        = (A ◦ B) ◦ C − C ◦ (A ◦ B)
                        = [A ◦ B, C]
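For matrix operators, composition is matrix multiplication, so the identity can be spot-checked on random matrices; a NumPy sketch (assumed available):

```python
import numpy as np

def comm(X, Y):
    """Commutator of two matrices under multiplication."""
    return X @ Y - Y @ X

rng = np.random.default_rng(0)
A, B, C = rng.standard_normal((3, 3, 3))   # three random 3x3 matrices

lhs = comm(A @ B, C)
rhs = A @ comm(B, C) + comm(A, C) @ B
print(np.allclose(lhs, rhs))               # True: [AB, C] = A[B, C] + [A, C]B
```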

8. Find the commutator

[D², e^x] = D² ◦ e^x − e^x ◦ D².

Try the direct method at home. In class we will use the identity to find the commutator. First we find [D, e^x]:

(D ◦ e^x)(f) = (d/dx)(e^x f)
             = e^x (df/dx) + f (d/dx)(e^x)
             = e^x (df/dx) + f e^x
             = (e^x D + e^x)(f)

(e^x ◦ D)(f) = (e^x D)(f)

[D, e^x] = (e^x D + e^x) − (e^x D) = e^x

Now, applying the identity with A = B = D and C = e^x,

[D², e^x] = D ◦ [D, e^x] + [D, e^x] ◦ D
          = D ◦ e^x + e^x ◦ D
          = (e^x D + e^x) + e^x D
          = e^x (2D + 1)

Summary

1. Operators are commonly used to perform a specific mathematical operation on another function.

2. An operator O is a linear operator if it satisfies the following two conditions:

(i) O(f + g) = O(f) + O(g),
(ii) O(λf) = λO(f), where λ is a scalar.

3. The commutator of two operators OA and OB is the operator defined as

[OA, OB] = OA ◦ OB − OB ◦ OA.

10 Fourier Series

10.1 Introduction

When the French mathematician Joseph Fourier (1768-1830) was trying to

study the flow of heat in a metal plate, he had the idea of expressing the

heat source as an infinite series of sine and cosine functions. Although the

original motivation was to solve the heat equation, it later became obvious

that the same techniques could be applied to a wide array of mathematical

and physical problems.

In this course, we will learn how to find Fourier series to represent periodic

functions as an infinite series of sine and cosine terms.

A function f(x) is said to be periodic with period T, if f(x + T) = f(x) for all x. The period of the function is the interval between two successive repetitions.

Let f be a bounded function defined on [−π, π] with at most a finite number of maxima and minima and at most a finite number of discontinuities in the interval. Then the Fourier series of f is the series

f(x) = (1/2) a0 + a1 cos x + a2 cos 2x + a3 cos 3x + ...
       + b1 sin x + b2 sin 2x + b3 sin 3x + ...

where

a0 = (1/π) ∫_{−π}^{π} f(x) dx

an = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx

bn = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx

10.3 Why the coefficients are found as they are.

We want to derive formulas for a0 , an and bn . To do this we take advantage

of some properties of sinusoidal signals. We use the following orthogonality

conditions:

Orthogonality conditions

(i) The average value of cos(nx) and sin(nx) over a period is zero (for n ≠ 0):

∫_{−π}^{π} cos(nx) dx = 0

∫_{−π}^{π} sin(nx) dx = 0

(ii) ∫_{−π}^{π} sin(mx) cos(nx) dx = 0

(iii) ∫_{−π}^{π} sin(mx) sin(nx) dx = π if m = n ≠ 0, and 0 otherwise.

(iv) ∫_{−π}^{π} cos(mx) cos(nx) dx = 2π if m = n = 0, π if m = n ≠ 0, and 0 if m ≠ n.

The following product-to-sum identities can be used to derive the above orthogonality conditions:

cos A cos B = (1/2)[cos(A − B) + cos(A + B)]
sin A sin B = (1/2)[cos(A − B) − cos(A + B)]
sin A cos B = (1/2)[sin(A − B) + sin(A + B)]

Finding an

We assume that we can represent the function by

1

f (x) = a0 + a1 cos x + a2 cos 2x + a3 cos 3x + ...

2

+ b1 sin x + b2 sin 2x + b3 sin 3x + ... (1)

permissible to integrate the series term by term.

$$\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx = \frac{1}{2}a_0\int_{-\pi}^{\pi}\cos(nx)\,dx + a_1\int_{-\pi}^{\pi}\cos x\cos(nx)\,dx + a_2\int_{-\pi}^{\pi}\cos 2x\cos(nx)\,dx + \dots$$
$$\qquad + b_1\int_{-\pi}^{\pi}\sin x\cos(nx)\,dx + b_2\int_{-\pi}^{\pi}\sin 2x\cos(nx)\,dx + \dots$$
$$= a_n\pi, \text{ because of the above orthogonality conditions.}$$

Hence

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx.$$

Similarly, multiplying (1) by sin(nx) and integrating from −π to π, we can find the formula for the coefficient $b_n$. To find $a_0$, simply integrate (1) from −π to π.

10.4 Fourier Series

Definition. Let f(x) be a 2π-periodic function which is integrable on [−π, π]. Set

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(nx)\,dx.$$

Then the Fourier series of f is

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos(nx) + b_n\sin(nx)\right).$$

Remark. Notice that we are not saying f (x) is equal to its Fourier Series.

Later we will discuss conditions under which that is actually true.
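As a sanity check on the defining integrals, the coefficients can also be computed numerically. The helper below is our own illustration (the function name and grid size are arbitrary choices); for f(x) = cos 2x it should recover a₂ = 1 with every other coefficient near zero.

```python
import numpy as np

def fourier_coefficients(f, n_max, samples=4096):
    """Approximate a_0..a_{n_max} and b_1..b_{n_max} of a 2*pi-periodic f
    from the defining integrals, using a trapezoid rule on [-pi, pi]."""
    x = np.linspace(-np.pi, np.pi, samples + 1)
    w = np.ones_like(x)
    w[0] = w[-1] = 0.5                       # trapezoid end weights
    dx = x[1] - x[0]
    integrate = lambda y: np.sum(w * y) * dx
    a = np.array([integrate(f(x) * np.cos(n * x)) / np.pi for n in range(n_max + 1)])
    b = np.array([integrate(f(x) * np.sin(n * x)) / np.pi for n in range(n_max + 1)])
    return a, b                              # a[0] is a_0; b[0] is unused

a, b = fourier_coefficients(lambda t: np.cos(2 * t), n_max=4)
assert abs(a[2] - 1) < 1e-9 and abs(a[1]) < 1e-9 and abs(b[3]) < 1e-9
```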

Example. Find the Fourier coefficients and the Fourier series of the square-wave function f defined by

$$f(x) = \begin{cases} 0 & \text{if } -\pi \le x < 0 \\ 1 & \text{if } 0 \le x < \pi \end{cases} \qquad\text{and}\qquad f(x + 2\pi) = f(x).$$

Using the formulas for the Fourier coefficients we have

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx = \frac{1}{\pi}\left(\int_{-\pi}^{0} 0\,dx + \int_{0}^{\pi} 1\,dx\right) = \frac{1}{\pi}(\pi) = 1$$

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx = \frac{1}{\pi}\left(\int_{-\pi}^{0} 0\,dx + \int_{0}^{\pi}\cos(nx)\,dx\right) = \frac{1}{\pi}\left[\frac{\sin(nx)}{n}\right]_0^{\pi} = \frac{1}{n\pi}(\sin n\pi - \sin 0) = 0$$

$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(nx)\,dx = \frac{1}{\pi}\left(\int_{-\pi}^{0} 0\,dx + \int_{0}^{\pi}\sin(nx)\,dx\right) = \frac{1}{\pi}\left[-\frac{\cos(nx)}{n}\right]_0^{\pi} = -\frac{1}{n\pi}(\cos n\pi - \cos 0)$$
$$= -\frac{1}{n\pi}\left((-1)^n - 1\right), \text{ since } \cos n\pi = (-1)^n$$
$$= \begin{cases} 0 & \text{if } n \text{ is even} \\ \dfrac{2}{n\pi} & \text{if } n \text{ is odd} \end{cases}$$

The Fourier series of f is therefore

$$\frac{1}{2}a_0 + a_1\cos x + a_2\cos 2x + \dots + b_1\sin x + b_2\sin 2x + b_3\sin 3x + \dots$$
$$= \frac{1}{2} + \frac{2}{\pi}\sin x + 0\cdot\sin 2x + \frac{2}{3\pi}\sin 3x + 0\cdot\sin 4x + \frac{2}{5\pi}\sin 5x + \dots$$
$$= \frac{1}{2} + \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x + \frac{2}{5\pi}\sin 5x + \dots$$

Remark. In the above example we have found the Fourier Series of the

square-wave function, but we don’t know yet whether this function is equal

to its Fourier series. If we plot

$$\frac{1}{2} + \frac{2}{\pi}\sin x$$

$$\frac{1}{2} + \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x$$

$$\frac{1}{2} + \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x + \frac{2}{5\pi}\sin 5x$$

$$\frac{1}{2} + \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x + \frac{2}{5\pi}\sin 5x + \frac{2}{7\pi}\sin 7x$$

$$\frac{1}{2} + \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x + \frac{2}{5\pi}\sin 5x + \frac{2}{7\pi}\sin 7x + \frac{2}{9\pi}\sin 9x + \frac{2}{11\pi}\sin 11x + \frac{2}{13\pi}\sin 13x,$$

we see that as we take more terms, we get a better approximation to the

square-wave function. The following theorem tells us that for almost all

points (except at the discontinuities), the Fourier series equals the function.

Theorem. If f is a periodic function with period 2π, and f and f′ are piecewise continuous on [−π, π], then the Fourier series is convergent. The sum of the Fourier series is equal to f(x) at all numbers x where f is continuous. At the numbers x where f is discontinuous, the sum of the Fourier series is the average value, i.e.

$$\frac{1}{2}\left[f(x^+) + f(x^-)\right].$$

Remark. If we apply this result to the example above, the Fourier Series

is equal to the function at all points except −π, 0, π. At the discontinuity 0,

observe that

$$f(0^+) = 1 \quad\text{and}\quad f(0^-) = 0.$$

The average value is 1/2, and therefore the series equals 1/2 at the discontinuities.

10.5 Odd and even functions

Definition. A function is said to be even if f(−x) = f(x) for all x, and odd if f(−x) = −f(x) for all x. For example, x², |x| and cos x are even functions, while x, x³ and sin x are examples of odd functions. The product of two even functions is even, and the product of two odd functions is also even. The product of an even and an odd function is odd.

If f is an odd function, then

$$\int_{-\pi}^{\pi} f(x)\,dx = 0,$$

and if f is an even function, then

$$\int_{-\pi}^{\pi} f(x)\,dx = 2\int_{0}^{\pi} f(x)\,dx.$$

Consequently, if f(x) is odd, then

• $f(x)\cos(nx)$ is odd, hence $a_n = 0$, and
• $f(x)\sin(nx)$ is even, hence $b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin(nx)\,dx$.

If f(x) is even, then

• $f(x)\cos(nx)$ is even, hence $a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos(nx)\,dx$, and
• $f(x)\sin(nx)$ is odd, hence $b_n = 0$.

Example. 1. Let f be a periodic function of period 2π such that

f (x) = x for − π ≤ x < π

Find the Fourier series associated to f .

Solution: f is periodic with period 2π. Since f(−x) = −x = −f(x), f(x) is odd. Therefore,

$$a_n = 0, \qquad b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin(nx)\,dx.$$

Using the formulas for the Fourier coefficients we have

$$b_n = \frac{2}{\pi}\int_0^{\pi} x\sin(nx)\,dx$$
$$= \frac{2}{\pi}\left(\left[-x\frac{\cos nx}{n}\right]_0^{\pi} - \int_0^{\pi}\left(-\frac{\cos nx}{n}\right)dx\right)$$
$$= \frac{2}{\pi}\left(\frac{1}{n}\left[-\pi\cos n\pi\right] + \left[\frac{\sin nx}{n^2}\right]_0^{\pi}\right)$$
$$= -\frac{2}{n}\cos n\pi$$
$$= \begin{cases} -2/n & \text{if } n \text{ is even} \\ 2/n & \text{if } n \text{ is odd} \end{cases}$$

The Fourier series of f is therefore

$$f(x) = b_1\sin x + b_2\sin 2x + b_3\sin 3x + \dots = 2\sin x - \frac{2}{2}\sin 2x + \frac{2}{3}\sin 3x - \frac{2}{4}\sin 4x + \frac{2}{5}\sin 5x - \dots$$

2. Let f be a periodic function of period 2π such that f(x) = π² − x² for −π ≤ x ≤ π.

Since f is even,

$$b_n = 0, \qquad a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos(nx)\,dx.$$

Using the formulas for the Fourier coefficients we have

$$a_n = \frac{2}{\pi}\int_0^{\pi} (\pi^2 - x^2)\cos(nx)\,dx$$
$$= \frac{2}{\pi}\left(\left[(\pi^2 - x^2)\frac{\sin nx}{n}\right]_0^{\pi} - \int_0^{\pi}(-2x)\frac{\sin nx}{n}\,dx\right)$$
$$= \frac{2}{\pi}\left(\left[(\pi^2 - \pi^2)\frac{\sin n\pi}{n} - (\pi^2 - 0)\frac{\sin 0}{n}\right] + \int_0^{\pi} 2x\frac{\sin nx}{n}\,dx\right)$$
$$= \frac{2}{\pi}\cdot\frac{2}{n}\int_0^{\pi} x\sin(nx)\,dx$$
$$= \frac{2}{\pi}\cdot\frac{2}{n}\left(\left[-x\frac{\cos nx}{n}\right]_0^{\pi} - \int_0^{\pi}\left(-\frac{\cos nx}{n}\right)dx\right)$$
$$= \frac{2}{\pi}\cdot\frac{2}{n}\left(\frac{1}{n}\left[-\pi\cos n\pi\right] + \left[\frac{\sin nx}{n^2}\right]_0^{\pi}\right)$$
$$= -\frac{4}{n^2}\cos n\pi$$
$$= \begin{cases} -4/n^2 & \text{if } n \text{ is even} \\ 4/n^2 & \text{if } n \text{ is odd} \end{cases}$$

It remains to calculate a0 .

$$a_0 = \frac{2}{\pi}\int_0^{\pi}(\pi^2 - x^2)\,dx = \frac{2}{\pi}\left[\pi^2 x - \frac{x^3}{3}\right]_0^{\pi} = \frac{4\pi^2}{3}$$

The Fourier series of f is therefore

$$f(x) = \frac{1}{2}a_0 + a_1\cos x + a_2\cos 2x + a_3\cos 3x + \dots + b_1\sin x + b_2\sin 2x + b_3\sin 3x + \dots$$
$$= \frac{2\pi^2}{3} + 4\left(\cos x - \frac{1}{4}\cos 2x + \frac{1}{9}\cos 3x - \frac{1}{16}\cos 4x + \frac{1}{25}\cos 5x - \dots\right)$$

10.6 Functions with period 2L

If a function has period other than 2π, we can find its Fourier series by making

a change of variable. Suppose f (x) has period 2L, that is f (x + 2L) = f (x)

for all x. If we let

$$t = \frac{\pi x}{L} \qquad\text{and}\qquad g(t) = f(x) = f\!\left(\frac{Lt}{\pi}\right),$$

then, g has period 2π and x = ±L corresponds to t = ±π. We know the

Fourier series of g(t):

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos nt + b_n\sin nt\right),$$

where

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\,dt, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\cos(nt)\,dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\sin(nt)\,dt.$$

If we now use the Substitution Rule with x = Lt/π, then t = πx/L and dt = (π/L)dx, and we have the following:

Definition. If f(x) is a piecewise continuous function on [−L, L], its Fourier series is

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos\left(\frac{n\pi x}{L}\right) + b_n\sin\left(\frac{n\pi x}{L}\right)\right),$$

where

$$a_0 = \frac{1}{L}\int_{-L}^{L} f(x)\,dx, \qquad a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right)dx, \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\left(\frac{n\pi x}{L}\right)dx.$$

Example. Find the Fourier series of the triangular wave function defined by f(x) = |x| − a for −a ≤ x ≤ a and f(x + 2a) = f(x) for all x.

Since f(−x) = |−x| − a = f(x), f is even. Therefore,

$$b_n = 0, \qquad a_n = \frac{2}{a}\int_0^{a} f(x)\cos\left(\frac{n\pi x}{a}\right)dx.$$

Using the formulas for the Fourier coefficients we have

$$a_n = \frac{2}{a}\int_0^{a}(|x| - a)\cos\left(\frac{n\pi x}{a}\right)dx = \frac{2}{a}\int_0^{a}(x - a)\cos\left(\frac{n\pi x}{a}\right)dx$$
$$= \frac{2}{a}\left(\left[(x - a)\,\frac{a}{n\pi}\sin\left(\frac{n\pi x}{a}\right)\right]_0^{a} - \int_0^{a}\frac{a}{n\pi}\sin\left(\frac{n\pi x}{a}\right)dx\right)$$
$$= \frac{2}{a}\left(\frac{a^2}{n^2\pi^2}\left[\cos\left(\frac{n\pi x}{a}\right)\right]_0^{a}\right)$$
$$= \frac{2a}{n^2\pi^2}\left(\cos n\pi - 1\right)$$
$$= \begin{cases} 0 & \text{if } n \text{ is even} \\ -\dfrac{4a}{n^2\pi^2} & \text{if } n \text{ is odd} \end{cases}$$

It remains to determine a0 .

$$a_0 = \frac{2}{a}\int_0^{a}(|x| - a)\,dx = \frac{2}{a}\int_0^{a}(x - a)\,dx = \frac{2}{a}\left[\frac{x^2}{2} - ax\right]_0^{a} = -a$$

The Fourier series of f is therefore

$$-\frac{a}{2} - \frac{4a}{\pi^2}\left(\cos\left(\frac{\pi x}{a}\right) + \frac{1}{9}\cos\left(\frac{3\pi x}{a}\right) + \frac{1}{25}\cos\left(\frac{5\pi x}{a}\right) + \dots\right)$$

10.7 Parseval’s Result

The following result can be used to find the sum of certain series from calculated Fourier series.

Theorem. Let f be a bounded function defined on the interval [−L, L] with at most a finite number of maxima and minima and at most a finite number of discontinuities in the interval [−L, L], and associated to the Fourier series

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos\left(\frac{n\pi x}{L}\right) + b_n\sin\left(\frac{n\pi x}{L}\right)\right).$$

Then

$$\frac{1}{L}\int_{-L}^{L}(f(x))^2\,dx = \frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right).$$

Example. Let f(x) = x for −π ≤ x ≤ π. Use Parseval's result to show that

$$\frac{\pi^2}{6} = \sum_{n=1}^{\infty}\frac{1}{n^2}.$$

From the earlier example, the Fourier series of f is

$$2\sin x - \frac{2}{2}\sin 2x + \frac{2}{3}\sin 3x - \frac{2}{4}\sin 4x + \frac{2}{5}\sin 5x - \dots,$$

so $a_n = 0$ and $b_n^2 = 4/n^2$ for every n.

Using Parseval’s Identity

$$\frac{1}{\pi}\int_{-\pi}^{\pi}(f(x))^2\,dx = \frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right)$$

$$\frac{1}{\pi}\left[\frac{x^3}{3}\right]_{-\pi}^{\pi} = 4\left(1 + \frac{1}{4} + \frac{1}{9} + \dots\right)$$

$$\frac{2\pi^2}{3} = 4\left(1 + \frac{1}{4} + \frac{1}{9} + \dots\right)$$

$$\frac{\pi^2}{6} = 1 + \frac{1}{4} + \frac{1}{9} + \dots = \sum_{n=1}^{\infty}\frac{1}{n^2}$$

Summary

1. The Fourier Series of a periodic function f (x) defined on the interval

[−π, π] is:

$$\frac{1}{2}a_0 + a_1\cos x + a_2\cos 2x + a_3\cos 3x + \dots + b_1\sin x + b_2\sin 2x + b_3\sin 3x + \dots$$

where the coefficients $a_n$ and $b_n$ are given by the formulae

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(nx)\,dx.$$

2. Even and Odd Functions for functions with period 2π.

If f(x) is odd (i.e. f(−x) = −f(x)), then

• $a_n = 0$ and
• $b_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin(nx)\,dx$.

If f(x) is even (i.e. f(−x) = f(x)), then

• $a_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos(nx)\,dx$ and
• $b_n = 0$.

3. The Fourier Series of a periodic function f (x) defined on the interval

[−L, L] is:

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos\left(\frac{n\pi x}{L}\right) + b_n\sin\left(\frac{n\pi x}{L}\right)\right),$$

where

$$a_0 = \frac{1}{L}\int_{-L}^{L} f(x)\,dx, \qquad a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right)dx, \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\left(\frac{n\pi x}{L}\right)dx.$$

4. Even and Odd Functions for functions with period 2L.

If f(x) is odd (i.e. f(−x) = −f(x)), then

• $a_n = 0$ and
• $b_n = \frac{2}{L}\int_0^{L} f(x)\sin\left(\frac{n\pi x}{L}\right)dx$.

If f(x) is even (i.e. f(−x) = f(x)), then

• $a_n = \frac{2}{L}\int_0^{L} f(x)\cos\left(\frac{n\pi x}{L}\right)dx$ and
• $b_n = 0$.

5. Parseval’s Result

$$\frac{1}{L}\int_{-L}^{L}(f(x))^2\,dx = \frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right).$$

