
NOTES ON MATHEMATICS 2018-2019

1. Determinant & Matrix


1.1 Determinant of a Matrix
The determinant of the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$.

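Example Find the following:

$\begin{vmatrix} 2 & 2 \\ -3 & 6 \end{vmatrix} = 2(6) - (-3)(2) = 12 + 6 = 18$

$\begin{vmatrix} 4 & 2 \\ 6 & 3 \end{vmatrix} = 4(3) - (6)(2) = 12 - 12 = 0$

$\begin{vmatrix} 1 & 5 \\ -6 & 4 \end{vmatrix} = (1)(4) - (-6)(5) = 34$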

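Finding Determinant of 3x3 Matrix ~ Expansion by Minors

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & j \end{vmatrix} = a\begin{vmatrix} e & f \\ h & j \end{vmatrix} - b\begin{vmatrix} d & f \\ g & j \end{vmatrix} + c\begin{vmatrix} d & e \\ g & h \end{vmatrix} = aej - ahf - bdj + bgf + cdh - cge$$
(This is the expansion by the first row.)

To set up the minor for a coefficient, ignore the row and column that contain that coefficient: for $a$ delete row 1 and column 1, for $b$ delete row 1 and column 2, and for $c$ delete row 1 and column 3.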

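Example Find the determinant using expansion by minors:

$$\begin{vmatrix} -2 & 1 & 3 \\ 1 & 1 & 4 \\ -3 & 6 & 2 \end{vmatrix} = -2\begin{vmatrix} 1 & 4 \\ 6 & 2 \end{vmatrix} - 1\begin{vmatrix} 1 & 4 \\ -3 & 2 \end{vmatrix} + 3\begin{vmatrix} 1 & 1 \\ -3 & 6 \end{vmatrix}$$

$= -2(2 - 24) - 1(2 - (-12)) + 3(6 - (-3))$
$= -2(-22) - 1(14) + 3(9)$
$= 44 - 14 + 27$
$= 30 + 27$
$= 57$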

Finding Determinant of 3x3 Matrix ~ Diagonals Method


$$\begin{array}{ccc|cc} a & b & c & a & b \\ d & e & f & d & e \\ g & h & j & g & h \end{array} \quad = aej + bfg + cdh - gec - hfa - jdb$$


Example Find the determinant for the matrix in the previous example using the diagonals method.
$$\begin{array}{rrr|rr} -2 & 1 & 3 & -2 & 1 \\ 1 & 1 & 4 & 1 & 1 \\ -3 & 6 & 2 & -3 & 6 \end{array}$$
$= (-2)(1)(2) + (1)(4)(-3) + (3)(1)(6) - (-3)(1)(3) - (6)(4)(-2) - (2)(1)(1)$
$= -4 - 12 + 18 + 9 + 48 - 2$
$= -16 + 27 + 46$
$= 57$

Applications of Determinants
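Example Find the area of a triangle whose vertices are at $(0, 0)$, $(10, -25)$ and $(28, -20)$.
“Formula”: Let $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ be the vertices of a triangle. Then
$$A = \pm\frac{1}{2}\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}$$
where the sign is chosen so that the area is positive.
$$A = \pm\frac{1}{2}\begin{vmatrix} 0 & 0 & 1 \\ 10 & -25 & 1 \\ 28 & -20 & 1 \end{vmatrix}$$
$= \frac{1}{2}\left[0 + 0 - 200 - (-700) - 0 - 0\right]$
$= \frac{1}{2}(500) = 250$ square units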

Example Find the equation of the line that contains $(-4, 1)$ and $(3, -5)$.
“Formula”: $\begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix} = 0$

$$\begin{vmatrix} x & y & 1 \\ -4 & 1 & 1 \\ 3 & -5 & 1 \end{vmatrix} = 0$$
$x + 3y + 20 - 3 - (-5x) - (-4y) = 0$
$x + 3y + 17 + 5x + 4y = 0$
$6x + 7y = -17$

Check using the point-slope form of a line:

$m = \dfrac{1 - (-5)}{-4 - 3} = -\dfrac{6}{7}$
$y - 1 = -\dfrac{6}{7}(x + 4) \;\Rightarrow\; y = -\dfrac{6}{7}x - \dfrac{17}{7} \;\Rightarrow\; 6x + 7y = -17$
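
The two applications can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the notes: the helper names are assumptions, and the line coefficients come from expanding the determinant "formula" along its first row.

```python
def det3(m):
    """3x3 determinant by the diagonals method."""
    (a, b, c), (d, e, f), (g, h, j) = m
    return a*e*j + b*f*g + c*d*h - g*e*c - h*f*a - j*d*b


def triangle_area(p1, p2, p3):
    """Half the absolute value of the determinant with rows (x, y, 1)."""
    rows = [[x, y, 1] for x, y in (p1, p2, p3)]
    return abs(det3(rows)) / 2


def line_through(p1, p2):
    """Coefficients (a, b, c) of a*x + b*y + c = 0 through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


print(triangle_area((0, 0), (10, -25), (28, -20)))  # 250.0
print(line_through((-4, 1), (3, -5)))               # (6, 7, 17), i.e. 6x + 7y = -17
```

Taking the absolute value plays the role of the sign choice in the area formula.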


Example

1) Expand the determinant

2) Prove that


3) Find the value of


1.2 Matrix
Row Matrix [1 4 8 2 4]

Column Matrix $\begin{pmatrix} 3 \\ 5 \\ 1 \\ 2 \end{pmatrix}$


To give a matrix a name, use underline to indicate that it is a matrix. e.g. x = [ 1 2 4 3 ]

Matrix Arithmetic

Addition and Subtraction


You can only add or subtract matrices that are of the same type and have the same number of elements.

Examples

[1 3 2 4] + [3 -1 2 6] = [4 2 4 10]

 2  6 8
     
 3    1   4
  4  4  0
     
 8 
1  
   3 
If x=   1 and y=   2  then
1  
   
 
9  7
   
x+y=  2  and x - y =   4
 1  3 
   

A Square Matrix is a matrix which has the same number of rows and columns.


A Diagonal Matrix is a square matrix which has non-zero element values only on the leading diagonal of the matrix; all the other elements are zero,
e.g. $\begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$

A Symmetric Matrix is one where the non-diagonal elements have a 'mirror-image' across the leading diagonal,
e.g. $\begin{pmatrix} 5 & 2 & 1 \\ 2 & 8 & 0 \\ 1 & 0 & 7 \end{pmatrix}$

The Transpose of a matrix is the matrix formed by interchanging the rows and the columns.
The transpose of A is denoted by A' (or sometimes $A^T$),
e.g. if A = $\begin{pmatrix} 1 & 0 & 5 \\ 2 & 3 & 6 \end{pmatrix}$ then A' = $\begin{pmatrix} 1 & 2 \\ 0 & 3 \\ 5 & 6 \end{pmatrix}$

Matrix Arithmetic

Addition and Subtraction


You can only add or subtract matrices of the same size and shape.
Simply add or subtract the corresponding elements to get a matrix of the same size and shape as the original ones,

e.g. $\begin{pmatrix} 1 & 4 \\ 2 & 2 \\ 5 & -5 \end{pmatrix} + \begin{pmatrix} 2 & 7 \\ -1 & 4 \\ 9 & 3 \end{pmatrix} = \begin{pmatrix} 3 & 11 \\ 1 & 6 \\ 14 & -2 \end{pmatrix}$

Thus, the result of C = A + B, that is $[c_{ij}] = [a_{ij}] + [b_{ij}]$, is

$c_{ij} = a_{ij} + b_{ij}$
for i = 1, 2, ..., n where n is the number of rows
and for j = 1, 2, ..., m where m is the number of columns.

$(\forall i \in 1..n,\ \forall j \in 1..m:\ c_{ij} = a_{ij} + b_{ij})$

Matrix Multiplication
Only certain sizes and shapes of matrices can be multiplied together.
The order in which the matrices are multiplied is significant.
The dimensions of the resultant matrix may be different from the original ones.
The number of columns in the first matrix must be the same as the number of rows in the second matrix.

Example
To multiply x.M where x is a row matrix of 3 elements and M is a 3 x 2 matrix:

$\begin{pmatrix} 2 & 4 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 0 & 1 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 4 & 11 \end{pmatrix}$

The result is a row matrix of 2 elements.
What we did to obtain each element in the result r was:
To get r11, take the 1st row of x and the 1st column of M, then
r11 = x11 . m11 + x12 . m21 + x13 . m31
To get r12, take the 1st row of x and the 2nd column of M, then
r12 = x11 . m12 + x12 . m22 + x13 . m32
Remember, [m x n] means a matrix with m rows and n columns.
For multiplication we go
along the rows of the first matrix and
down the columns of the second matrix.
The size of the answer can be predicted: [m x n] [n x p] gives an answer of size [m x p].

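Example
$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 2 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 2 & 1 \\ 2 & 2 \end{pmatrix} = \begin{pmatrix} 1\cdot 1 + 2\cdot 2 + 3\cdot 2 & 1\cdot 2 + 2\cdot 1 + 3\cdot 2 \\ 2\cdot 1 + 1\cdot 2 + 2\cdot 2 & 2\cdot 2 + 1\cdot 1 + 2\cdot 2 \end{pmatrix} = \begin{pmatrix} 11 & 10 \\ 8 & 9 \end{pmatrix}$$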
 

Inverse of a Matrix
There is a special case for square matrices where
A.B = B.A = I.
This is when matrix B is the inverse of matrix A.
(It also follows that matrix A is the inverse of matrix B.)
The inverse of matrix A is denoted by $A^{-1}$. It plays the role that the reciprocal 1/a plays for a number a, but it exists only when A is non-singular, i.e. |A| ≠ 0.
The result of the multiplications $A.A^{-1}$ and $A^{-1}.A$ is the identity matrix I. This is a special type of diagonal matrix where all the diagonal elements have the value 1.
$$A . A^{-1} = A^{-1} . A = I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$
Symmetric Matrices
Definition of symmetric matrix

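A matrix A is said to be a symmetric matrix when $A = A^T$.

An $r \times r$ matrix $A_{r \times r}$ is defined as symmetric if $A = A^t$. That is,

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{12} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1r} & a_{2r} & \cdots & a_{rr} \end{pmatrix}, \qquad a_{ij} = a_{ji}.$$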


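Example
$A = \begin{pmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{pmatrix}$ is symmetric since $A = A^t$.

The products $AA^t$ and $A^tA$ are always symmetric. For instance, let

$$A = \begin{pmatrix} 1 & 2 & -1 \\ 3 & 0 & 1 \end{pmatrix} \quad\text{and}\quad A^t = \begin{pmatrix} 1 & 3 \\ 2 & 0 \\ -1 & 1 \end{pmatrix}.$$
Then,
$$AA^t = \begin{pmatrix} col_1(A) & col_2(A) & col_3(A) \end{pmatrix}\begin{pmatrix} row_1(A^t) \\ row_2(A^t) \\ row_3(A^t) \end{pmatrix} = \begin{pmatrix} col_1(A) & col_2(A) & col_3(A) \end{pmatrix}\begin{pmatrix} col_1^t(A) \\ col_2^t(A) \\ col_3^t(A) \end{pmatrix}$$
$$= col_1(A)\,col_1^t(A) + col_2(A)\,col_2^t(A) + col_3(A)\,col_3^t(A)$$
$$= \begin{pmatrix} 1 \\ 3 \end{pmatrix}\begin{pmatrix} 1 & 3 \end{pmatrix} + \begin{pmatrix} 2 \\ 0 \end{pmatrix}\begin{pmatrix} 2 & 0 \end{pmatrix} + \begin{pmatrix} -1 \\ 1 \end{pmatrix}\begin{pmatrix} -1 & 1 \end{pmatrix}$$
$$= \begin{pmatrix} 1 & 3 \\ 3 & 9 \end{pmatrix} + \begin{pmatrix} 4 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} 6 & 2 \\ 2 & 10 \end{pmatrix}.$$
In addition,
$$A^tA = row_1^t(A)\,row_1(A) + row_2^t(A)\,row_2(A) = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}\begin{pmatrix} 1 & 2 & -1 \end{pmatrix} + \begin{pmatrix} 3 \\ 0 \\ 1 \end{pmatrix}\begin{pmatrix} 3 & 0 & 1 \end{pmatrix}$$
$$= \begin{pmatrix} 1 & 2 & -1 \\ 2 & 4 & -2 \\ -1 & -2 & 1 \end{pmatrix} + \begin{pmatrix} 9 & 0 & 3 \\ 0 & 0 & 0 \\ 3 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 10 & 2 & 2 \\ 2 & 4 & -2 \\ 2 & -2 & 2 \end{pmatrix}.$$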

Note:
Let A and B be symmetric matrices. Then AB is not necessarily equal to $BA = (AB)^t$. That is, AB might not be a symmetric matrix.
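
A short Python sketch can confirm the two products above and illustrate the note. It is not part of the notes: the helper names are illustrative, and the matrices S and T are hypothetical values chosen only to show a non-symmetric product of two symmetric matrices.

```python
def transpose(m):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def is_symmetric(m):
    return m == transpose(m)


A = [[1, 2, -1], [3, 0, 1]]
At = transpose(A)
print(matmul(A, At))  # [[6, 2], [2, 10]]
print(matmul(At, A))  # [[10, 2, 2], [2, 4, -2], [2, -2, 2]]

# Two symmetric matrices (hypothetical values) whose product is not symmetric.
S = [[1, 2], [2, 3]]
T = [[0, 1], [1, 0]]
print(matmul(S, T), is_symmetric(matmul(S, T)))  # [[2, 1], [3, 2]] False
```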


A square matrix A is said to be skew-symmetric if $a_{ij} = -a_{ji}$ for all i and j. In other words, matrix A is skew-symmetric if the transpose of A is equal to the negative of A, i.e. $A^T = -A$.

Note that all the main diagonal elements of a skew-symmetric matrix are zero.

Let's take as an example the matrix $A = \begin{pmatrix} 0 & -5 & 4 \\ 5 & 0 & -1 \\ -4 & 1 & 0 \end{pmatrix}$

It is a skew-symmetric matrix because $a_{ij} = -a_{ji}$ for all i and j. For example, $a_{12} = -5$ and $a_{21} = 5$, which means $a_{12} = -a_{21}$. Similarly, this condition holds for all other values of i and j.
We can also verify that the transpose of A is equal to the negative of A, i.e. $A^T = -A$:

$A^T = \begin{pmatrix} 0 & 5 & -4 \\ -5 & 0 & 1 \\ 4 & -1 & 0 \end{pmatrix}$ and $-A = \begin{pmatrix} 0 & 5 & -4 \\ -5 & 0 & 1 \\ 4 & -1 & 0 \end{pmatrix}$
We can clearly see that $A^T = -A$, which makes A a skew-symmetric matrix.

Let's take another example, the matrix $B = \begin{pmatrix} 0 & 5 & 4 \\ 5 & 0 & -1 \\ -4 & 1 & 0 \end{pmatrix}$.
Matrix B is not skew-symmetric because $b_{12} \neq -b_{21}$, so $B^T \neq -B$.
Note that in a skew-symmetric matrix we need $a_{11} = -a_{11}$, $a_{22} = -a_{22}$, $a_{33} = -a_{33}$, ..., so all the main diagonal elements have to be zero.
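
The definition can be tested entry by entry, and the same idea splits any square matrix into a symmetric and a skew-symmetric part. The sketch below is an illustration, not the notes' own method: it uses the standard identity $M = \frac{1}{2}(M + M^T) + \frac{1}{2}(M - M^T)$, which is not stated in the text, and the matrix M is a hypothetical example.

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def is_skew_symmetric(m):
    """True when m[i][j] == -m[j][i] for all i and j."""
    n = len(m)
    return all(m[i][j] == -m[j][i] for i in range(n) for j in range(n))


def split_symmetric_skew(m):
    """Return (S, K) with S symmetric, K skew-symmetric and S + K == m."""
    t = transpose(m)
    n = len(m)
    half = Fraction(1, 2)  # exact halves, no floating-point rounding
    s = [[half * (m[i][j] + t[i][j]) for j in range(n)] for i in range(n)]
    k = [[half * (m[i][j] - t[i][j]) for j in range(n)] for i in range(n)]
    return s, k


A = [[0, -5, 4], [5, 0, -1], [-4, 1, 0]]
print(is_skew_symmetric(A))  # True

M = [[1, 2], [4, 3]]  # hypothetical matrix, not taken from the notes
S, K = split_symmetric_skew(M)
print(S == [[1, 3], [3, 3]], K == [[0, -1], [1, 0]])  # True True
```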

Example

1) Express the following matrix as a sum of a symmetric and skew symmetric.


2) Show that the matrix

3) Prove that

Solutions
Let A be a non-singular square matrix, i.e. |A| ≠ 0.
Again we know that

2. Numerical Methods
Newton forward interpolation
Suppose we are given the following values of y = f(x) for a set of values of x:
x : x0 x1 x2 …… xn
y : y0 y1 y2 …… yn
The process of finding the value of y corresponding to any intermediate value of x between $x_0$ and $x_n$ is called interpolation.
The technique of estimating the value of a function outside the given range is called extrapolation.
The study of interpolation is based on the concept of differences of a function.
Suppose that the function y=f(x) is tabulated for the equally spaced values x = x0, x1=x0+h, x2=x0+2h,
…, xn=x0+nh giving y = y0, y1, y2, …, yn. To determine the values of f(x) and f '(x) for some
intermediate values of x, we use the following three types of differences
Forward differences
Backward differences
Central differences
Forward differences: The forward differences are defined and denoted by $\Delta f(x) = f(x+h) - f(x)$, so that
$\Delta y_0 = y_1 - y_0$
$\Delta y_1 = y_2 - y_1$
$\Delta y_2 = y_3 - y_2$
…………….
$\Delta y_r = y_{r+1} - y_r$
…………….
$\Delta y_{n-1} = y_n - y_{n-1}$
These are called the first forward differences and $\Delta$ is the forward difference operator.
Similarly the second forward differences are defined by
$\Delta^2 y_r = \Delta y_{r+1} - \Delta y_r$.
In general
$\Delta^p y_r = \Delta^{p-1} y_{r+1} - \Delta^{p-1} y_r$
defines the pth forward differences.
The forward differences are systematically set out in a table called the forward difference table.

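$$\begin{array}{ccccccc}
\text{Value of } x & \text{Value of } y & \text{1st diff. } \Delta & \text{2nd diff. } \Delta^2 & \text{3rd diff. } \Delta^3 & \text{4th diff. } \Delta^4 & \text{5th diff. } \Delta^5 \\
x_0 & y_0 & & & & & \\
 & & \Delta y_0 & & & & \\
x_1 & y_1 & & \Delta^2 y_0 & & & \\
 & & \Delta y_1 & & \Delta^3 y_0 & & \\
x_2 & y_2 & & \Delta^2 y_1 & & \Delta^4 y_0 & \\
 & & \Delta y_2 & & \Delta^3 y_1 & & \Delta^5 y_0 \\
x_3 & y_3 & & \Delta^2 y_2 & & \Delta^4 y_1 & \\
 & & \Delta y_3 & & \Delta^3 y_2 & & \\
x_4 & y_4 & & \Delta^2 y_3 & & & \\
 & & \Delta y_4 & & & & \\
x_5 & y_5 & & & & &
\end{array}$$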

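Example The table gives the distances in nautical miles of the visible horizon for the given heights in feet above the earth’s surface:

$$\begin{array}{l|ccccccc}
x = \text{height} & 100 & 150 & 200 & 250 & 300 & 350 & 400 \\ \hline
y = \text{distance} & 10.63 & 13.03 & 15.04 & 16.81 & 18.42 & 19.90 & 21.27
\end{array}$$

Find the values of y when (i) x = 218 ft. (ii) x = 410 ft.
Sol. The difference table is

$$\begin{array}{cccccc}
x & y & \Delta & \Delta^2 & \Delta^3 & \Delta^4 \\
100 & 10.63 & & & & \\
 & & 2.40 & & & \\
150 & 13.03 & & -0.39 & & \\
 & & 2.01 & & 0.15 & \\
x_0 = 200 & 15.04 & & -0.24 & & -0.07 \\
 & & 1.77 & & 0.08 & \\
250 & 16.81 & & -0.16 & & -0.05 \\
 & & 1.61 & & 0.03 & \\
300 & 18.42 & & -0.13 & & -0.01 \\
 & & 1.48 & & 0.02 & \\
350 & 19.90 & & -0.11 & & \\
 & & 1.37 & & & \\
x_n = 400 & 21.27 & & & &
\end{array}$$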

(i) If we take $x_0 = 200$, then $y_0 = 15.04$, $\Delta y_0 = 1.77$, $\Delta^2 y_0 = -0.16$, $\Delta^3 y_0 = 0.03$, $\Delta^4 y_0 = -0.01$.

Since x = 218 and the step length is h = 50, $p = (x - x_0)/h = 18/50 = 0.36$.
By Newton’s forward interpolation formula, we have
$$y(218) = y_0 + p\,\Delta y_0 + \frac{p(p-1)}{2!}\Delta^2 y_0 + \frac{p(p-1)(p-2)}{3!}\Delta^3 y_0 + \frac{p(p-1)(p-2)(p-3)}{4!}\Delta^4 y_0$$
$$= 15.04 + 0.36(1.77) + \frac{0.36(0.36-1)}{2}(-0.16) + \frac{0.36(0.36-1)(0.36-2)}{6}(0.03) + \frac{0.36(0.36-1)(0.36-2)(0.36-3)}{24}(-0.01)$$
$= 15.04 + 0.6372 + 0.0184 + 0.0019 + 0.0004 = 15.6979$
≈ 15.7 nautical miles.
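
The calculation in part (i) can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the notes' own code: the function names and the `start` parameter (the index of the tabulated point used as $x_0$) are introduced here for convenience.

```python
def forward_differences(ys):
    """Columns of the forward difference table: [y, Δy, Δ²y, ...]."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table


def newton_forward(xs, ys, x, start=0):
    """Newton's forward formula for equally spaced xs, with x0 = xs[start]."""
    h = xs[1] - xs[0]
    p = (x - xs[start]) / h
    diffs = forward_differences(ys[start:])
    total, coeff = 0.0, 1.0          # coeff holds p(p-1)...(p-k+1)/k!
    for k, column in enumerate(diffs):
        total += coeff * column[0]   # column[0] is the k-th difference of y0
        coeff *= (p - k) / (k + 1)
    return total


xs = [100, 150, 200, 250, 300, 350, 400]
ys = [10.63, 13.03, 15.04, 16.81, 18.42, 19.90, 21.27]
print(round(newton_forward(xs, ys, 218, start=2), 4))  # 15.6979
```

Starting from $x_0 = 200$ keeps p small (0.36), so the higher-order terms shrink quickly.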

Gaussian Elimination

Gaussian elimination is one popular method of solving linear equations. We illustrate this technique
by means of an example.

Example Find x, y and z that satisfy the following three equations at the same time.

(1)   x - y + 3z = 4
      2x - y + 2z = 6
      3x + y - 2z = 9

Before discussing the details of Gaussian elimination, let's look at two ways to reformulate a system of linear equations. Both ways begin by putting the equations in vector form. For the equations above this is the following.

(2)   $\begin{pmatrix} x - y + 3z \\ 2x - y + 2z \\ 3x + y - 2z \end{pmatrix} = \begin{pmatrix} 4 \\ 6 \\ 9 \end{pmatrix}$

The left side we can write as the matrix of coefficients times the vector of unknowns.

$\begin{pmatrix} 1 & -1 & 3 \\ 2 & -1 & 2 \\ 3 & 1 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 4 \\ 6 \\ 9 \end{pmatrix}$
or

(3) Au = b

where
1 -1 3 x 4
A = 2 -1 2 u = y b = 6
3 1 -2 z 9

So the original equations (1) are equivalent to (3). In general the problem of solving a system of
linear equations is equivalent to solving Au = b where A is the matrix of coefficients, b is the vector
of numbers on the right side and u is the vector of unknowns.

The second reformulation of the equations starts with (2) and writes the vector on the left as the sum
of three vectors where each term contains the terms with one of the variables. We get
 x -y  3z  4
 2x  +  - y  +  2z  =  6 
 3x   y  - 2z  9

Now we factor the variables out of each of the vectors on the left to get

 1 -1  3 4
x 2  + y-1 + z 2 = 6
3  1 -2 9
or
xv1 + yv2 + zv3 = b

where
 1 -1  3
v1 =  2  v2 =  - 1  v3 =  2 
3  1 -2

So the original equations (1) are equivalent to writing b as a linear combination of v1, v2 and v3. In
general the problem of solving a system of linear equations is equivalent to writing b as a linear
combination of the vectors that are the coefficients of each of the variables.

Now let's look at solving linear equations using Gaussian elimination. We shall look at two methods to keep track of our calculations. One is with the equations themselves. The other is by means of another matrix, which is just the coefficient matrix A and the right-hand side b of the equation combined. It is called the augmented matrix. For the equations in the example above it is

$M = \left(\begin{array}{ccc|c} 1 & -1 & 3 & 4 \\ 2 & -1 & 2 & 6 \\ 3 & 1 & -2 & 9 \end{array}\right)$

Note that we draw a line separating the last column, which contains b, from the rest, which contains A.
To start out we have the original equations and the corresponding M.

Equations:
x - y + 3z = 4
2x - y + 2z = 6
3x + y - 2z = 9

Augmented matrix:
$M = \left(\begin{array}{ccc|c} 1 & -1 & 3 & 4 \\ 2 & -1 & 2 & 6 \\ 3 & 1 & -2 & 9 \end{array}\right)$
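
As a rough sketch of how elimination on the augmented matrix M could proceed, the Python code below reduces M to upper-triangular form and then back-substitutes. It is an illustration, not the procedure from the notes: partial pivoting is an added assumption for numerical safety, and the function name is made up for this example.

```python
def gaussian_elimination(aug):
    """Solve a square linear system given as an augmented matrix [A | b]."""
    m = [[float(v) for v in row] for row in aug]  # work on a copy
    n = len(m)
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry of this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (or nearly so)")
        m[col], m[pivot] = m[pivot], m[col]
        # Subtract multiples of the pivot row to zero the entries below it.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution, starting from the last equation.
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (m[r][n] - known) / m[r][r]
    return u


M = [[1, -1, 3, 4], [2, -1, 2, 6], [3, 1, -2, 9]]
x, y, z = gaussian_elimination(M)
print(round(x, 6), round(y, 6), round(z, 6))  # 3.0 2.0 1.0
```

Substituting x = 3, y = 2, z = 1 back into the three equations confirms the result: 3 - 2 + 3 = 4, 6 - 2 + 2 = 6 and 9 + 2 - 2 = 9.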

Numerical Integration: Trapezoidal and Simpson’s 1/3 Rule

Trapezoidal Rule/Formula
We approximate the function f(x) in the interval $[x_i, x_{i+1}]$ by a straight line joining the points $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$. For convenience let us consider the integral in the first interval $[x_0, x_1]$. The line joining $(x_0, y_0)$ and $(x_1, y_1)$ may be written from Lagrange’s formula as
$$y(x) = \frac{x - x_1}{x_0 - x_1}\,y_0 + \frac{x - x_0}{x_1 - x_0}\,y_1 .$$
Now, with $h = x_1 - x_0$,
$$\int_{x_0}^{x_1} f(x)\,dx \approx \int_{x_0}^{x_1}\left[\frac{x - x_1}{x_0 - x_1}\,y_0 + \frac{x - x_0}{x_1 - x_0}\,y_1\right]dx = \frac{1}{2h}\Big[-(x - x_1)^2\,y_0 + (x - x_0)^2\,y_1\Big]_{x = x_0}^{x = x_1} = \frac{h}{2}(y_0 + y_1).$$
Adding over all the intervals, the Trapezoidal formula/rule may be written as
$$\int_a^b f(x)\,dx \approx \int_{x_0}^{x_n} y(x)\,dx = \frac{h}{2}\{(y_0 + y_1) + (y_1 + y_2) + \dots + (y_{n-2} + y_{n-1}) + (y_{n-1} + y_n)\}$$
$$= \frac{h}{2}\{y_0 + 2(y_1 + y_2 + \dots + y_{n-1}) + y_n\}$$
or,
$$h\left[\frac{y_0 + y_n}{2} + (y_1 + y_2 + \dots + y_{n-1})\right].$$
Geometrically, the integral in an interval is approximated by the area of a trapezium
(see Figure 2).


Figure Trapezoidal Rule

Simpson’s 1/3rd Rule/Formula


In this case the integral is evaluated over two intervals at a time, say $[x_0, x_1]$ and $[x_1, x_2]$. The function f(x) is approximated by a quadratic passing through the points $(x_0, y_0)$, $(x_1, y_1)$ and $(x_2, y_2)$. From Lagrange’s formula we may write the quadratic as
$$y(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}\,y_0 + \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)}\,y_1 + \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}\,y_2 .$$
Integrating term by term we get
$$\int_{x_0}^{x_2}\frac{(x - x_1)(x - x_2)}{(-h)(-2h)}\,dx = \frac{1}{2h^2}\left[(x - x_1)\frac{(x - x_2)^2}{2} - \frac{(x - x_2)^3}{6}\right]_{x_0}^{x_2} = \frac{h}{3}$$
$$\int_{x_0}^{x_2}\frac{(x - x_0)(x - x_2)}{(h)(-h)}\,dx = -\frac{1}{h^2}\left[(x - x_0)\frac{(x - x_2)^2}{2} - \frac{(x - x_2)^3}{6}\right]_{x_0}^{x_2} = \frac{4}{3}h$$
$$\int_{x_0}^{x_2}\frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}\,dx = \frac{1}{2h^2}\left[(x - x_0)\frac{(x - x_1)^2}{2} - \frac{(x - x_1)^3}{6}\right]_{x_0}^{x_2} = \frac{h}{3}$$
Hence we get
$$\int_{x_0}^{x_2} f(x)\,dx \approx \int_{x_0}^{x_2} y(x)\,dx = \frac{h}{3}y_0 + \frac{4h}{3}y_1 + \frac{h}{3}y_2 = \frac{h}{3}(y_0 + 4y_1 + y_2).$$
Applying this formula over the next two intervals, then the next two, and so on for $\frac{n}{2}$ times, and adding, we get
$$\int_a^b f(x)\,dx \approx \int_{x_0}^{x_n} y(x)\,dx = \int_{x_0}^{x_2} y(x)\,dx + \int_{x_2}^{x_4} y(x)\,dx + \dots + \int_{x_{n-2}}^{x_n} y(x)\,dx$$
$$= \frac{h}{3}\left[(y_0 + 4y_1 + y_2) + (y_2 + 4y_3 + y_4) + \dots + (y_{n-2} + 4y_{n-1} + y_n)\right]$$
$$= \frac{h}{3}\left[y_0 + y_n + 4(y_1 + y_3 + \dots + y_{n-1}) + 2(y_2 + y_4 + \dots + y_{n-2})\right]$$
Obviously n must be chosen as a multiple of 2, i.e. an even number, for this formula to apply.
Example

Evaluate the integral $I = \displaystyle\int_0^1 \frac{dx}{\sqrt{1 + x^2}}$ by the trapezoidal rule, dividing the interval [0, 1] into five equal parts. Compute up to five decimals.
Solution
$n = 5; \quad h = \dfrac{1 - 0}{5} = 0.2$

$$\begin{array}{l|cccccc}
i & 0 & 1 & 2 & 3 & 4 & 5 \\
x & 0 & 0.2 & 0.4 & 0.6 & 0.8 & 1.0 \\
y = \dfrac{1}{\sqrt{1 + x^2}} & 1.0 & 0.98058 & 0.92848 & 0.85749 & 0.78087 & 0.70711
\end{array}$$

From the Trapezoidal Rule:

$I \approx \dfrac{h}{2}\,[y_0 + y_5 + 2(y_1 + y_2 + y_3 + y_4)]$
$= \dfrac{0.2}{2}\,[1.0 + 0.70711 + 2(0.98058 + 0.92848 + 0.85749 + 0.78087)]$
$= 0.1\,[1.70711 + 2 \times 3.54742]$
$= 0.1 \times 8.80195 = 0.880195$, i.e. $I \approx 0.8802$
Example

Evaluate the integral $I = \displaystyle\int_0^{0.8} \frac{dx}{\sqrt{1 + x}}$ by Simpson’s 1/3rd rule, dividing the interval [0, 0.8] into 4 equal sub-intervals. Compute up to five places of decimal only.

Solution
$n = 4; \quad h = \dfrac{0.8 - 0}{4} = 0.2$

$$\begin{array}{l|ccccc}
i & 0 & 1 & 2 & 3 & 4 \\
x & 0 & 0.2 & 0.4 & 0.6 & 0.8 \\
y = \dfrac{1}{\sqrt{1 + x}} & 1.0 & 0.91287 & 0.84515 & 0.79057 & 0.74536
\end{array}$$

From Simpson’s 1/3rd Rule

$I = \displaystyle\int_0^{0.8} y\,dx \approx \frac{h}{3}\,[(y_0 + 4y_1 + y_2) + (y_2 + 4y_3 + y_4)]$
$= \dfrac{h}{3}\,[y_0 + y_4 + 4(y_1 + y_3) + 2y_2]$
$= \dfrac{0.2}{3}\,[1.0 + 0.74536 + 4(0.91287 + 0.79057) + 2 \times 0.84515]$
$= \dfrac{0.2}{3}\,[1.74536 + 4 \times 1.70344 + 1.69030]$
= 0.68329
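
Both composite rules are easy to code. The following Python sketch is illustrative (the function names are assumptions); it evaluates the two integrals of the examples above with the same number of sub-intervals.

```python
from math import sqrt


def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule with n equal sub-intervals."""
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    return h * ((ys[0] + ys[-1]) / 2 + sum(ys[1:-1]))


def simpson(f, a, b, n):
    """Composite Simpson's 1/3 rule; n must be even."""
    if n % 2:
        raise ValueError("n must be even for Simpson's 1/3 rule")
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    odd = sum(ys[1:-1:2])    # y1 + y3 + ... + y(n-1)
    even = sum(ys[2:-1:2])   # y2 + y4 + ... + y(n-2)
    return h / 3 * (ys[0] + ys[-1] + 4 * odd + 2 * even)


print(round(trapezoidal(lambda x: 1 / sqrt(1 + x * x), 0, 1, 5), 4))  # 0.8802
print(round(simpson(lambda x: 1 / sqrt(1 + x), 0, 0.8, 4), 4))        # 0.6833
```

For comparison, the second integral has the exact value $2(\sqrt{1.8} - 1) \approx 0.68328$, so Simpson's rule with only four sub-intervals is already accurate to about four decimal places.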

Example
1) State the appropriate interpolation formula which is used to calculate the value of exp(1.75) from
the following data and hence evaluate it from the data.

Solution
The Difference table is under


2)


3)

4)


3. Integration
Revision of Integration (Indefinite Integration)

When a function f(x) is known we can differentiate it to obtain its derivative df/dx. The reverse
process is to obtain the function f(x) from knowledge of its derivative. This process is called
integration and it has numerous applications in all areas of sciences.

Suppose we differentiate the function $y = x^2$. We obtain

$$\frac{dy}{dx} = 2x$$

Integration reverses this process and we say that the integral of $2x$ is $x^2$. Schematically we can regard the process of integration in the following way:

$x^2 \;\xrightarrow{\ \text{Differentiate}\ }\; 2x$
$x^2 \;\xleftarrow{\ \text{Integrate}\ }\; 2x$
In reality the situation is a bit more complicated because differentiating any of

$x^2 + 7, \quad x^2 - 3, \quad x^2 + 0.5$
yields 2x. In fact, any function of the form $x^2 + c$, where c is a constant, will be an answer to our question, since differentiating the constant term gives zero. Consequently, when we reverse the process of differentiation, we do not know what the original constant term might have been. So we include in our answer an unknown constant, c say, called the constant of integration, and we state that the integral of 2x is $x^2 + c$. This constant must always be included when finding an indefinite integral. The solution containing the constant c defines a family of curves, all of which are solutions of the integral: if one of these curves is a solution, they all are. To select between them we need more information, which is supplied by so-called initial conditions, a topic you will meet later when solving differential equations.
Indefinite integration is represented by the symbol ∫, known as the integral sign. The integral sign is always accompanied by a term of the form dx, which indicates the independent variable involved in the integration, in this case x. So we write
$$\int 2x\,dx = x^2 + c$$

Example
(a) State the derivative of $x^3$.
(b) Hence find the indefinite integral of $3x^2$.

Solution
(a) From our knowledge of differentiation, the derivative of $x^3$ is $3x^2$.
(b) Indefinite integration reverses the process of differentiation, and so we write

$$\int 3x^2\,dx = x^3 + c$$

We always include the additional constant of integration when finding indefinite integrals. Note that our answer can be checked by differentiating $x^3 + c$ to obtain $3x^2$.
More generally, we have the following relationship between derivatives and indefinite integrals:

if $\dfrac{d}{dx}F(x) = f(x)$ then $\displaystyle\int f(x)\,dx = F(x) + c$

In the expression ∫ f(x) dx, the function f(x) is referred to as the integrand. When we have calculated ∫
f(x) dx, we say f(x) has been integrated with respect to x to yield F(x)+c.

In what follows we are going to discuss some techniques which can be used to evaluate many types of
integrals.

Integration by parts

If f and g are differentiable functions, then by the Product Rule

$$\frac{d}{dx}\left[f(x)g(x)\right] = f(x)\frac{dg(x)}{dx} + g(x)\frac{df(x)}{dx}$$
or equivalently
$$f(x)g'(x) = \frac{d}{dx}\left[f(x)g(x)\right] - g(x)f'(x)$$

where the dash denotes differentiation of the function with respect to its argument. Integrating both sides of the previous equation gives us

$$\int f(x)g'(x)\,dx = \int \frac{d}{dx}\left[f(x)g(x)\right]dx - \int g(x)f'(x)\,dx$$

The first integral on the right side is simply f(x)g(x) + C, because integration reverses differentiation. Since another constant of integration results from the second integral, it is unnecessary to include C in the formula; that is

$$\int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx$$

If we let u = f(x) and v = g(x), so that du = f′(x)dx and dv = g′(x)dx, then the preceding formula may be written (integration by parts)

$$\int u\,dv = uv - \int v\,du$$
Example
Find $\displaystyle\int x e^{2x}\,dx$

Solution
There are four possible choices for dv, namely $dx$, $x\,dx$, $e^{2x}dx$ or $xe^{2x}dx$. If we let $dv = e^{2x}dx$, then the remaining part of the integrand is u; that is u = x. To find v we integrate dv to obtain $v = e^{2x}/2$. Note that a constant of integration is not added at this stage of the solution. Since u = x we see that du = dx. For ease of reference it is convenient to display these expressions as follows

$u = x \qquad\qquad dv = e^{2x}dx$
$du = dx \qquad\quad v = \frac{1}{2}e^{2x}$

Substituting these expressions in the integration by parts formula, we obtain

$$\int xe^{2x}\,dx = x\left(\frac{1}{2}e^{2x}\right) - \int \frac{1}{2}e^{2x}\,dx$$

The integral on the right hand side may be found using the integrals of exponential functions (see the table below). This gives us

$$\int xe^{2x}\,dx = \frac{1}{2}xe^{2x} - \frac{1}{4}e^{2x} + C$$

It takes considerable practice to become proficient in making a suitable choice for dv. To illustrate, if we had chosen dv = x dx, then it would have been necessary to let $u = e^{2x}$, giving us

$u = e^{2x} \qquad\qquad\ \ dv = x\,dx$
$du = 2e^{2x}dx \qquad v = \frac{1}{2}x^2$

Integrating by parts we obtain

$$\int xe^{2x}\,dx = \frac{1}{2}x^2e^{2x} - \int x^2e^{2x}\,dx$$

Since the exponent associated with x has increased, the integral on the right is more complicated than the given integral. This indicates an incorrect choice of dv.
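
An antiderivative can always be checked by differentiating it. As a small numerical sketch (not part of the notes; the helper names and sample points are arbitrary choices), the Python code below compares a central-difference derivative of $F(x) = \frac{1}{2}xe^{2x} - \frac{1}{4}e^{2x}$ with the integrand $xe^{2x}$.

```python
from math import exp


def F(x):
    """Antiderivative found by parts (constant of integration taken as 0)."""
    return x * exp(2 * x) / 2 - exp(2 * x) / 4


def integrand(x):
    return x * exp(2 * x)


def central_difference(g, x, h=1e-5):
    """Numerical derivative of g at x."""
    return (g(x + h) - g(x - h)) / (2 * h)


for x in (-1.0, 0.0, 0.5, 2.0):
    # The derivative of F should reproduce the integrand at every sample point.
    assert abs(central_difference(F, x) - integrand(x)) < 1e-6
print("F'(x) matches x*exp(2x) at the sample points")
```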

Trigonometric integrals

In many cases the integrand contains trigonometric functions. For example, integrals of the type $\int \sin^n x\,dx$ require a new method. If n is an odd positive integer, we begin by writing

$$\int \sin^n x\,dx = \int \sin^{n-1}x\,\sin x\,dx$$

Since the integer n-1 is even, we may then use the fact that $\sin^2 x = 1 - \cos^2 x$ to obtain a form which is easy to integrate.

Example
Evaluate $\int \sin^5 x\,dx$.


Solution
As discussed earlier we have

$$\int \sin^5 x\,dx = \int \sin^4 x\,\sin x\,dx = \int (\sin^2 x)^2 \sin x\,dx = \int (1 - \cos^2 x)^2 \sin x\,dx = \int (1 - 2\cos^2 x + \cos^4 x)\sin x\,dx$$

We next employ the method of substitution, letting u = cos x and du = -sin x dx. Thus

$$\int \sin^5 x\,dx = -\int (1 - 2\cos^2 x + \cos^4 x)(-\sin x)\,dx = -\int (1 - 2u^2 + u^4)\,du$$
$$= -u + \frac{2}{3}u^3 - \frac{1}{5}u^5 + C = -\cos x + \frac{2}{3}\cos^3 x - \frac{1}{5}\cos^5 x + C$$
A similar technique can be employed for odd powers of cos x. Specifically, we write

$$\int \cos^n x\,dx = \int \cos^{n-1}x\,\cos x\,dx$$

and use the fact that $\cos^2 x = 1 - \sin^2 x$ in order to obtain an integrable form. If the integrand is $\sin^n x$ or $\cos^n x$ and n is even, then the half-angle formulas

$$\sin^2 x = \frac{1 - \cos 2x}{2} \quad\text{or}\quad \cos^2 x = \frac{1 + \cos 2x}{2}$$
may be used to simplify the integrand.

Example
Evaluate $\int \sin^4 x\,dx$

Solution

$$\int \sin^4 x\,dx = \int (\sin^2 x)^2\,dx = \int \left(\frac{1 - \cos 2x}{2}\right)^2 dx = \frac{1}{4}\int (1 - 2\cos 2x + \cos^2 2x)\,dx$$

We apply a half-angle formula again and write

$$\cos^2 2x = \frac{1}{2}(1 + \cos 4x) = \frac{1}{2} + \frac{1}{2}\cos 4x$$

Substituting in the last integral and simplifying gives us

$$\int \sin^4 x\,dx = \frac{1}{4}\int \left(\frac{3}{2} - 2\cos 2x + \frac{1}{2}\cos 4x\right)dx = \frac{3}{8}x - \frac{1}{4}\sin 2x + \frac{1}{32}\sin 4x + C$$
Integrals of the form $\int \sin^m x \cos^n x\,dx$, where m and n are positive integers, may be found by using variations of the previous techniques. If m and n are both even, then half-angle formulas should be employed first. If n is odd, we can write

$$\int \sin^m x \cos^n x\,dx = \int \sin^m x\,\cos^{n-1}x\,\cos x\,dx$$

and express $\cos^{n-1}x$ in terms of sin x by using the identity $\cos^2 x = 1 - \sin^2 x$. The substitution u = sin x then leads to an integrand which can be handled easily. A similar technique can be used if m is odd.
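
As a short worked instance of the case where m is odd, consider the integrand $\sin^3 x \cos^2 x$. It is chosen here purely for illustration and does not appear in the notes. Split off one factor of sin x, rewrite the remaining even power of sin x in terms of cos x, and substitute u = cos x:

```latex
\begin{align*}
\int \sin^3 x \cos^2 x \, dx
  &= \int (1 - \cos^2 x)\cos^2 x \,\sin x \, dx \\
  &= -\int (1 - u^2)\,u^2 \, du \qquad (u = \cos x,\ du = -\sin x \, dx) \\
  &= -\frac{u^3}{3} + \frac{u^5}{5} + C \\
  &= -\frac{1}{3}\cos^3 x + \frac{1}{5}\cos^5 x + C
\end{align*}
```

Differentiating the result gives $\cos^2 x \sin x - \cos^4 x \sin x = \sin^3 x \cos^2 x$, as required.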

Trigonometric substitutions


If an integrand contains the expression $\sqrt{a^2 - x^2}$ where a > 0, then the trigonometric substitution x = a sin θ leads to

$$\sqrt{a^2 - x^2} = \sqrt{a^2 - a^2\sin^2\theta} = \sqrt{a^2(1 - \sin^2\theta)} = \sqrt{a^2\cos^2\theta} = a|\cos\theta|$$

When making this substitution or the other trigonometric substitutions in the next examples, we shall assume that θ is in the range of the corresponding inverse trigonometric function. Thus, for the sine substitution above, –π/2 ≤ θ ≤ π/2. Consequently cos θ ≥ 0 and $\sqrt{a^2 - x^2} = a\cos\theta$. Of course, if $\sqrt{a^2 - x^2}$ occurs in a denominator we make the further restriction –π/2 < θ < π/2.

Example
Evaluate

$$\int \frac{1}{x^2\sqrt{a^2 - x^2}}\,dx, \quad a > 0$$

Solution
Let x = a sin θ, where –π/2 < θ < π/2. It follows that

$$\sqrt{a^2 - x^2} = \sqrt{a^2 - a^2\sin^2\theta} = \sqrt{a^2(1 - \sin^2\theta)} = a\cos\theta$$
Since x = a sin θ, we have dx = a cos θ dθ. Substituting in the given integral

$$\int \frac{1}{x^2\sqrt{a^2 - x^2}}\,dx = \int \frac{1}{(a^2\sin^2\theta)\,a\cos\theta}\,a\cos\theta\,d\theta = \frac{1}{a^2}\int \frac{1}{\sin^2\theta}\,d\theta = -\frac{1}{a^2}\cot\theta + C$$

It is now necessary to return to the original variable of integration x. A simple method of doing so is to use a geometrical approach. If 0 < θ < π/2, then since sin θ = x/a, we may interpret θ as an acute angle of a right triangle having opposite side and hypotenuse of lengths x and a, respectively. The length $\sqrt{a^2 - x^2}$ of the adjacent side is calculated by means of the Pythagorean Theorem.

Referring to the triangle we see that

$$\cot\theta = \frac{\sqrt{a^2 - x^2}}{x}$$

It can be shown that this formula is also valid if –π/2 < θ < 0. Thus the above figure can be used whether θ is positive or negative. Substituting the new form of cot θ in the result obtained for the integral, we have

$$\int \frac{1}{x^2\sqrt{a^2 - x^2}}\,dx = -\frac{1}{a^2}\cdot\frac{\sqrt{a^2 - x^2}}{x} + C = -\frac{\sqrt{a^2 - x^2}}{a^2x} + C$$

If an integrand contains $\sqrt{a^2 + x^2}$, where a > 0, then the substitution x = a tan θ will eliminate the radical sign. When using this substitution it will be assumed that θ is in the range of the inverse tangent function; that is –π/2 < θ < π/2. After making this substitution and evaluating the resulting trigonometric integral, it is necessary to return to the original variable, x. For this we use

$$\tan\theta = \frac{x}{a} \quad\text{and}\quad \sec\theta = \frac{\sqrt{a^2 + x^2}}{a}$$

For integrands containing $\sqrt{x^2 - a^2}$ we substitute x = a sec θ, where θ is chosen in the range of the inverse secant function; that is either 0 ≤ θ < π/2 or π ≤ θ < 3π/2. After evaluating the integral, we have to return to the original variable and we use

$$\sec\theta = \frac{x}{a} \quad\text{and}\quad \tan\theta = \frac{\sqrt{x^2 - a^2}}{a}$$

Example

Evaluate $\displaystyle\int \frac{\sqrt{x^2 - 9}}{x}\,dx$

Solution
Let us substitute as follows

$$x = 3\sec\theta, \quad dx = 3\sec\theta\tan\theta\,d\theta$$

Consequently

$$\sqrt{x^2 - 9} = \sqrt{9\sec^2\theta - 9} = 3\sqrt{\sec^2\theta - 1} = 3\tan\theta$$

and therefore

$$\int \frac{\sqrt{x^2 - 9}}{x}\,dx = \int \frac{3\tan\theta}{3\sec\theta}\,3\sec\theta\tan\theta\,d\theta = 3\int \tan^2\theta\,d\theta = 3\int (\sec^2\theta - 1)\,d\theta = 3(\tan\theta - \theta) + C$$

Since sec θ = x/3, we return to the original variable using the formulas above and write

$$\int \frac{\sqrt{x^2 - 9}}{x}\,dx = 3\left[\frac{\sqrt{x^2 - 9}}{3} - \sec^{-1}\left(\frac{x}{3}\right)\right] + C = \sqrt{x^2 - 9} - 3\sec^{-1}\left(\frac{x}{3}\right) + C$$

Example
1)

2)


3)

4)


4. Ordinary Differential Equation


Homogeneous differential equations

A function f of two variables is said to be homogeneous of degree n if

$$f(tx, ty) = t^n f(x, y) \qquad (4.1)$$

for every t > 0 such that (tx, ty) is in the domain of f. For example, if

$$f(x, y) = 2x^4 - x^2y^2 + 5xy^3$$
then f is homogeneous of degree 4, since

$$f(tx, ty) = 2(tx)^4 - (tx)^2(ty)^2 + 5(tx)(ty)^3 = t^4\left(2x^4 - x^2y^2 + 5xy^3\right) = t^4 f(x, y)$$

Similarly, if

$$f(x, y) = \frac{1}{x^2 + y^2}\,e^{x/y}$$

then f is homogeneous of degree -2, since

$$f(tx, ty) = \frac{1}{t^2x^2 + t^2y^2}\,e^{tx/ty} = t^{-2} f(x, y)$$

A homogeneous differential equation is an equation which can be written in the form

$$P(x, y)\,dx + Q(x, y)\,dy = 0 \qquad (4.2)$$

where P and Q are homogeneous functions of the same degree. Equations of this type can be transformed into separable equations by means of the substitution

$$y = xv, \text{ where } v = v(x) \qquad (4.3)$$

and

$$dy = v\,dx + x\,dv$$
Thus, substitution of xv for y in (4.2) yields

$$P(x, xv)\,dx + Q(x, xv)(v\,dx + x\,dv) = 0$$

If P and Q are homogeneous functions of degree n, then

$$P(x, xv) = x^n P(1, v) \quad\text{and}\quad Q(x, xv) = x^n Q(1, v)$$

Substituting in the preceding differential equation and dividing both sides by $x^n$ we obtain

$$P(1, v)\,dx + Q(1, v)(v\,dx + x\,dv) = 0$$

This equation can be written in the separable form

$$\frac{1}{x}\,dx + \frac{Q(1, v)}{P(1, v) + vQ(1, v)}\,dv = 0 \qquad (4.4)$$

provided non-zero denominators occur. We have proved that if y = xv is a solution of (4.2), then v is a solution of (4.4). Conversely, if v is a solution of (4.4), then reversing our argument shows that y = vx is a solution of (4.2). It is not advisable to memorize the final form of (4.4). Instead, remember the substitution y = xv which is used to simplify the homogeneous equation.

Example
Solve the differential equation

$$(y^2 - xy)\,dx + x^2\,dy = 0.$$
Solution
If $P(x, y) = y^2 - xy$ and $Q(x, y) = x^2$, then the functions P and Q are both homogeneous of degree 2. Hence the differential equation is homogeneous and we substitute

$$y = xv, \quad dy = v\,dx + x\,dv.$$

This leads to the following chain of equations

$(x^2v^2 - x^2v)\,dx + x^2(v\,dx + x\,dv) = 0$
$x^2(v^2 - v)\,dx + x^2(v\,dx + x\,dv) = 0$
$(v^2 - v)\,dx + v\,dx + x\,dv = 0$
$v^2\,dx + x\,dv = 0$
$\dfrac{1}{x}\,dx + \dfrac{1}{v^2}\,dv = 0$

Integrating each term gives us

$$\ln|x| - \frac{1}{v} = C_1 \;\Rightarrow\; \ln|x| - \frac{x}{y} = C_1 \;\Rightarrow\; y = \frac{x}{\ln|x| - C_1}$$

Linear differential equations of the first order

A first-order linear differential equation is an equation of the form

$$y' + P(x)y = Q(x) \qquad (4.5)$$

where P and Q are continuous functions. If Q(x) = 0 for all x, then equation (4.5) is separable and we may write

$$\frac{y'}{y} = -P(x)$$

provided y ≠ 0. Integrating, we obtain

$$\ln|y| = -\int P(x)\,dx + \ln|C|.$$

We have expressed the constant of integration as ln|C| in order to change the form of the last equation as follows:

$$\ln|y| - \ln|C| = -\int P(x)\,dx$$
$$\ln\left|\frac{y}{C}\right| = -\int P(x)\,dx$$
$$\frac{y}{C} = e^{-\int P(x)\,dx}$$
$$y\,e^{\int P(x)\,dx} = C$$

We next observe that

$$\frac{d}{dx}\left[y\,e^{\int P(x)\,dx}\right] = y'e^{\int P(x)\,dx} + P(x)\,y\,e^{\int P(x)\,dx} = e^{\int P(x)\,dx}\left[y' + yP(x)\right].$$

Consequently, if we multiply both sides of (4.5) by $e^{\int P(x)\,dx}$, then the resulting equation may be written as

$$\frac{d}{dx}\left[y\,e^{\int P(x)\,dx}\right] = Q(x)\,e^{\int P(x)\,dx}.$$

This gives us the following (implicit) solution of (4.5):

$$y\,e^{\int P(x)\,dx} = \int Q(x)\,e^{\int P(x)\,dx}\,dx + D. \qquad (4.6)$$

Solving this equation for y leads to an explicit solution. The expression $e^{\int P(x)\,dx}$ is called an integrating factor of (4.5). We have shown that multiplication of both sides of (4.5) by this expression leads to an equation which has the solution (4.6).

Example
Solve the differential equation

x 2 y '5 xy  3 x 5  0

where x≠0.

Solution
In order to find the integrating factor we begin by expressing the given differential equation in the “standardized” form (4.5), where the coefficient of y′ is 1. Thus, dividing both sides by $x^2$ we obtain

$$y' + \frac{5}{x}\,y = -3x^3$$

which has the form (4.5) with $P(x) = 5/x$ and $Q(x) = -3x^3$. From the preceding discussion, the required integrating factor is

$$e^{\int P(x)\,dx} = e^{5\ln|x|} = e^{\ln|x|^5} = |x|^5.$$

If x>0 then $|x|^5 = x^5$, whereas if x<0 then $|x|^5 = -x^5$. In either case, multiplying both sides of the standardized form by $|x|^5$ gives us (after cancelling the common sign when x<0)

$$x^5 y' + 5x^4 y = -3x^8 \quad\text{or}\quad \frac{d}{dx}\left(x^5 y\right) = -3x^8.$$

Thus a solution is

$$x^5 y = -\frac{x^9}{3} + C \quad\text{or}\quad y = -\frac{x^4}{3} + \frac{C}{x^5}.$$
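The result of this example can be checked by substituting it back into the equation. The sketch below is only a numerical sanity check; the function names and the sample values of x and C are illustrative assumptions, not part of the notes.

```python
# Check that y = -x^4/3 + C/x^5 satisfies x^2 y' + 5 x y + 3 x^5 = 0
# for arbitrary C and any x != 0.

def y(x, C):
    return -x**4 / 3 + C / x**5

def dy(x, C):
    # Derivative of y with respect to x, computed by hand.
    return -4 * x**3 / 3 - 5 * C / x**6

def residual(x, C):
    # Left-hand side of the differential equation; should be close to 0.
    return x**2 * dy(x, C) + 5 * x * y(x, C) + 3 * x**5

if __name__ == "__main__":
    for x, C in [(1.5, 2.0), (-0.7, -3.0)]:
        print(x, C, residual(x, C))
```

The residual is zero up to rounding error for both positive and negative x, which matches the remark that the sign of x does not change the final form.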
A generalization of (4.5) is the Bernoulli equation

$$y' + P(x)\,y = Q(x)\,y^n, \qquad (4.7)$$

where n≠0 and n≠1 (for n=1 the equation is already linear and homogeneous). Evidently y=0 is a solution when n>0. If y≠0 we may divide both sides by $y^n$, obtaining

$$y^{-n} y' + P(x)\,y^{1-n} = Q(x). \qquad (4.8)$$

If we let $w = y^{1-n}$, then

$$w' = \frac{dw}{dx} = (1-n)\,y^{-n} y'$$

and hence

$$y^{-n} y' = \frac{1}{1-n}\,w'.$$

Replacing $y^{-n} y'$ in (4.8) by the last expression gives us

$$\frac{1}{1-n}\,w' + P(x)\,w = Q(x).$$

This first-order linear differential equation may be solved for w using the integrating factor technique. Once w is found, the solution of (4.7) is given by $y^{1-n} = w$ (together with y=0).

Example
Solve the differential equation

$$y' + \frac{2y}{x} = x^6 y^3.$$

Solution
The equation has the Bernoulli form (4.7) with n=3. If, as in the previous discussion, we multiply both sides by $y^{-3}$ and substitute $w = y^{1-n} = y^{-2}$ we obtain

$$
\begin{aligned}
y^{-3} y' + \frac{2}{x y^2} &= x^6\\
-\frac{1}{2}\,w' + \frac{2w}{x} &= x^6\\
w' - \frac{4w}{x} &= -2x^6
\end{aligned}
$$

The integrating factor for the last equation is

$$e^{\int (-4/x)\,dx} = e^{-4\ln x} = e^{\ln x^{-4}} = x^{-4}$$

supposing that x>0 (it can be shown that considering x<0 eventually gives the same result). Multiplying by this factor, we can write

$$x^{-4} w' - 4x^{-5} w = -2x^2.$$

Consequently

$$x^{-4} w = -\frac{2x^3}{3} + C \quad\text{or}\quad w = -\frac{2x^7}{3} + Cx^4.$$

Finally, since $w = y^{-2}$, the solution of the given equation is

$$y^{-2} = -\frac{2x^7}{3} + Cx^4 \quad\text{or}\quad \left(-\frac{2x^7}{3} + Cx^4\right) y^2 = 1.$$
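As a hedged check of the Bernoulli example, the sketch below takes the positive branch $y = w^{-1/2}$ with $w = -2x^7/3 + Cx^4$ and substitutes it into $y' + 2y/x = x^6 y^3$. The function names and test points are illustrative assumptions; the points are chosen so that w > 0.

```python
# Verify numerically that y = w**(-1/2), w = -2x^7/3 + C x^4,
# satisfies y' + 2y/x = x^6 y^3 wherever w > 0.

def w(x, C):
    return -2 * x**7 / 3 + C * x**4

def y(x, C):
    return w(x, C) ** -0.5

def dy(x, C):
    # Chain rule: y' = -(1/2) w^(-3/2) w'
    dw = -14 * x**6 / 3 + 4 * C * x**3
    return -0.5 * w(x, C) ** -1.5 * dw

def residual(x, C):
    # Should be close to 0 for a genuine solution.
    return dy(x, C) + 2 * y(x, C) / x - x**6 * y(x, C) ** 3

if __name__ == "__main__":
    print(residual(1.0, 2.0), residual(0.5, 1.0))
```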
Example
Solve the differential equations
xy   x  y  e x
y  y  y 2 e x

Linear differential equations of the second order

If $f_1, f_2, \ldots, f_n$ and k are functions of one variable which have the same domain, then an equation of the form

$$y^{(n)} + f_1(x)\,y^{(n-1)} + \cdots + f_{n-1}(x)\,y' + f_n(x)\,y = k(x) \qquad (5.1)$$

is called a linear differential equation of order n. If k(x)=0 for all x, the equation is said to be
homogeneous. Notice that this meaning of the word homogeneous is different from that we met
earlier. If k(x)≠0 for some x, then the equation (5.1) is said to be non-homogeneous. We shall restrict
our work to second-order equations in which f1 and f2 are constant functions. First we are going to
discuss the homogeneous case.

The general second-order homogeneous differential equation with constant coefficients has the form

$$a y'' + b y' + c y = 0 \qquad (5.2)$$

where a, b and c are constants. Before attempting to find particular solutions let us establish the
following result.

Theorem (the superposition principle): If y=f(x) and y=g(x) are solutions of the differential equation ay″+by′+cy=0, then

$$y = C_1 f(x) + C_2 g(x) \qquad (5.3)$$

is a solution for all real numbers $C_1$ and $C_2$.

Proof: By hypothesis

$$a f''(x) + b f'(x) + c f(x) = 0$$

$$a g''(x) + b g'(x) + c g(x) = 0$$

If we multiply the first of these equations by $C_1$, the second by $C_2$, and add, the result is

$$a\left[C_1 f''(x) + C_2 g''(x)\right] + b\left[C_1 f'(x) + C_2 g'(x)\right] + c\left[C_1 f(x) + C_2 g(x)\right] = 0$$

Thus $C_1 f(x) + C_2 g(x)$ is a solution.

It can be shown that if the solutions f and g in the theorem above have the property that f(x)≠Cg(x) for all real numbers C, and if g(x) is not identically 0, then $y = C_1 f(x) + C_2 g(x)$ is a general solution of ay″+by′+cy=0. Thus, to determine the general solution it is sufficient to find two such functions f and g and employ Eq. (5.3).
In our search for solutions of (5.2) we shall use $y = Ae^{mx}$ as a trial solution, where A is an arbitrary constant and m is to be found.

Then

$$\frac{dy}{dx} = Ame^{mx}; \qquad \frac{d^2y}{dx^2} = Am^2 e^{mx}$$

Substituting in Eq. (5.2) gives

$$\left(am^2 + bm + c\right) Ae^{mx} = 0$$

or, since $e^{mx} \neq 0$ and we are interested in non-trivial solutions (A≠0),

$$am^2 + bm + c = 0 \qquad (5.4)$$

The equation (5.4) is called the auxiliary equation of eq. (5.2). It can be obtained from this differential equation by replacing y″ by $m^2$, y′ by m and y by 1. In simple cases, the roots of the auxiliary equation can be found by factoring. If the factorisation is not obvious, then applying the quadratic formula, we see that the roots of the auxiliary equation are given by

$$m = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad (5.5)$$

According to the sign of $b^2 - 4ac$ in eq. (5.5), there are three different cases.

Theorem 1: If the roots $m_1$ and $m_2$ of the auxiliary equation are real and unequal, then the general solution of ay″+by′+cy=0 is

$$y = C_1 e^{m_1 x} + C_2 e^{m_2 x} \qquad (5.6)$$
Example
Solve the differential equation

y″+10y′+16y=0

Solution
The auxiliary equation is

$$m^2 + 10m + 16 = 0 \quad\text{or}\quad (m+8)(m+2) = 0$$

which means that $m_1 = -8$ and $m_2 = -2$. Since the roots of the auxiliary equation are real and unequal, it follows from Theorem 1 that the general solution is

$$y = C_1 e^{-8x} + C_2 e^{-2x}$$

with $C_1$ and $C_2$ two arbitrary constants. To find the two constants we need two boundary (or initial) conditions. For example, we may give two points the solution passes through, or one point and the gradient for some value of x, or two gradients. In this case, if y=0 and dy/dx=1 when x=0, using the general solution given above we have

$$C_1 + C_2 = 0$$

$$-8C_1 - 2C_2 = 1$$

which leads to $C_1 = -1/6$ and $C_2 = 1/6$. So, the final form of the solution is

$$y = \frac{1}{6}\left(-e^{-8x} + e^{-2x}\right)$$
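The two constants come from a 2×2 linear system, so the determinant method of Section 1 applies. The following is a small sketch using Cramer's rule with exact fractions; the helper name `solve_2x2` is an illustrative assumption and it expects integer coefficients.

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule.

    Coefficients are assumed to be integers so the result is exact.
    """
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system has no unique solution")
    x = Fraction(b1 * a22 - a12 * b2, det)
    y = Fraction(a11 * b2 - a21 * b1, det)
    return x, y

# y(0) = 0 gives C1 + C2 = 0; y'(0) = 1 gives -8*C1 - 2*C2 = 1.
C1, C2 = solve_2x2(1, 1, -8, -2, 0, 1)

if __name__ == "__main__":
    print(C1, C2)
```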
Theorem 2: If the auxiliary equation has a double root m, then the general solution of the equation ay″+by′+cy=0 is

$$y = C_1 e^{mx} + C_2 x e^{mx} = e^{mx}\left(C_1 + C_2 x\right) \qquad (5.7)$$

Proof: Using (5.5) with $b^2 - 4ac = 0$, we obtain m = −b/(2a), or 2am+b=0. Since m satisfies the auxiliary equation, $y = e^{mx}$ is a solution of the differential equation. According to the remark following the proof of the superposition theorem (5.3), it is sufficient to show that $y = xe^{mx}$ is also a solution. Substitution of $xe^{mx}$ for y in ay″+by′+cy=0 gives us

$$
\begin{aligned}
a\left(2me^{mx} + m^2 x e^{mx}\right) + b\left(mxe^{mx} + e^{mx}\right) + cxe^{mx}
&= \left(am^2 + bm + c\right) x e^{mx} + (2am + b)\,e^{mx}\\
&= 0\cdot xe^{mx} + 0\cdot e^{mx} = 0
\end{aligned}
$$

which is what we wished to show.

Example
Solve the differential equation

y″-6y′+9y=0

Solution
The auxiliary equation $m^2 - 6m + 9 = 0$, or equivalently $(m-3)^2 = 0$, has a double root 3. Hence by Theorem 2 (Eq. 5.7), the general solution is

$$y = C_1 e^{3x} + C_2 x e^{3x} = e^{3x}\left(C_1 + C_2 x\right)$$
Example:
Solve the differential equation:


y   5 y   6 y  0

Solution
The third case, when deciding the nature of the solution of a second-order homogeneous differential equation, is $b^2 - 4ac < 0$. In this case the auxiliary equation has complex roots.

Complex numbers may be represented by expressions of the form a+ib, where a and b are real numbers, and i is a symbol which may be manipulated in the same manner as a real number, but has the additional property that $i^2 = -1$.

Two complex numbers a+ib and c+id are said to be equal, and we write a+ib=c+id, if and only if a=c and b=d. Operations of addition, subtraction, multiplication and division are defined just as in the case of real numbers; in the case of complex numbers all letters denote real numbers, with the additional stipulation that whenever $i^2$ occurs, it may be replaced by –1. For example, the formulas for addition and multiplication of two complex numbers a+ib and c+id are

$$(a + ib) + (c + id) = (a + c) + i(b + d)$$

$$(a + ib)(c + id) = (ac - bd) + i(ad + bc)$$

We may regard the real numbers as a subset of the complex numbers by identifying the real number a
with the complex number a+i0. A complex number of the form 0+ib is called an imaginary number.

Complex numbers are often required for solving equations of the form f(x)=0, where f(x) is a polynomial. For example, if only real numbers are allowed, then the equation $x^2 = -4$ has no solutions. However, if complex numbers are allowed, then the equation has a solution 2i, since

$$(2i)^2 = 2^2 i^2 = 4(-1) = -4$$

Similarly, −2i is a solution of the equation $x^2 = -4$.

Since $i^2 = -1$, we sometimes use the symbol $\sqrt{-1}$ in place of i and write

$$\sqrt{-13} = i\sqrt{13}, \qquad 2 + \sqrt{-25} = 2 + i\sqrt{25} = 2 + 5i$$

and so on.


A quadratic equation $ax^2 + bx + c = 0$, where a, b and c are real numbers and a≠0, has roots given by the quadratic formula

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

If $b^2 - 4ac < 0$, then the roots are complex numbers. To illustrate, if we apply the quadratic formula to the equation $x^2 - 4x + 13 = 0$ we obtain

$$x = \frac{4 \pm \sqrt{16 - 52}}{2} = \frac{4 \pm \sqrt{-36}}{2} = \frac{4 \pm 6i}{2} = 2 \pm 3i$$

Thus the equation has two complex roots 2+3i and 2−3i.

The complex number a-bi is called the conjugate of the complex number a+bi. We see from the
quadratic formula that if a quadratic equation with real coefficients has complex roots, then they must
be conjugate complex numbers.

It follows from the previous discussion that if the auxiliary equation $am^2 + bm + c = 0$ has complex roots, then they are of the form

$$z_1 = s + it \quad\text{and}\quad z_2 = s - it$$

where s and t are real numbers. We may anticipate, from Theorem 1 (Eq. 5.6), that the general solution of the differential equation ay″+by′+cy=0 is

$$y = C_1 e^{z_1 x} + C_2 e^{z_2 x} = C_1 e^{(s+it)x} + C_2 e^{(s-it)x} \qquad (5.8)$$

In order to handle such complex exponents it is necessary to extend some of the concepts of calculus
to include functions whose domain includes complex numbers. Since a complete development is
beyond the scope of our work, we shall merely outline the main ideas.

Euler's Theorem

For every complex number z,

$$e^{iz} = \cos z + i \sin z$$

It can be shown that the Laws of Exponents are true for complex numbers. In addition, formulas for derivatives developed earlier can be extended to functions of a complex variable z. One such formula is $\frac{d}{dz}e^{kz} = ke^{kz}$, where k is a complex number.

Reminder: Laws of exponents

$$a^m a^n = a^{m+n}, \qquad \left(a^m\right)^n = a^{mn}, \qquad (ab)^m = a^m b^m,$$

$$\frac{a^m}{a^n} = a^{m-n}, \qquad \left(\frac{a}{b}\right)^m = \frac{a^m}{b^m}$$

It can be proved that the general solution of ay″+by′+cy=0, where the roots of the auxiliary equation are complex numbers s±ti, is given by Eq. (5.8). The form of this solution may be changed as follows:

$$
\begin{aligned}
y &= C_1 e^{z_1 x} + C_2 e^{z_2 x} = C_1 e^{(s+it)x} + C_2 e^{(s-it)x}\\
  &= C_1 e^{sx + txi} + C_2 e^{sx - txi}\\
  &= C_1 e^{sx} e^{txi} + C_2 e^{sx} e^{-txi}\\
  &= e^{sx}\left(C_1 e^{itx} + C_2 e^{-itx}\right)
\end{aligned}
\qquad (5.9)
$$

This can be further simplified by using Euler's formula. Specifically, we see from Euler's Theorem that

$$e^{itx} = \cos tx + i \sin tx$$

$$e^{-itx} = \cos tx - i \sin tx$$

from which it follows that

$$\cos tx = \frac{e^{itx} + e^{-itx}}{2}; \qquad \sin tx = \frac{e^{itx} - e^{-itx}}{2i} \qquad (5.10)$$

If we let $C_1 = C_2 = 1/2$ in (5.9) and use (5.10) we obtain the particular solution $y = e^{sx}\cos tx$ of ay″+by′+cy=0. Letting $C_1 = -i/2$ and $C_2 = i/2$ gives us the particular solution $y = e^{sx}\sin tx$. This is a partial proof of the next theorem.

Example: Evaluate

$$e^{i\pi} + 1$$


Theorem
If the auxiliary equation $am^2 + bm + c = 0$ has distinct complex roots s±ti, then the general solution of ay″+by′+cy=0 is

$$y = e^{sx}\left(C_1 \cos tx + C_2 \sin tx\right) \qquad (5.11)$$

Example
Solve the differential equation

$$y'' - 10y' + 41y = 0$$

Solution
The roots of the auxiliary equation $m^2 - 10m + 41 = 0$ are

$$m = \frac{10 \pm \sqrt{100 - 164}}{2} = \frac{10 \pm \sqrt{-64}}{2} = \frac{10 \pm 8i}{2} = 5 \pm 4i$$

Hence by the theorem above (Eq. 5.11), the general solution of the differential equation is

$$y = e^{5x}\left(C_1 \cos 4x + C_2 \sin 4x\right)$$
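The three cases (real unequal roots, a double root, complex roots) can be summarised in a short routine. This is a sketch only: the function name `classify_roots` and the tuple layout it returns are illustrative assumptions, and the exact comparison with zero is reliable only for coefficients that are exactly representable, such as small integers.

```python
import math

def classify_roots(a, b, c):
    """Classify the roots of a*m^2 + b*m + c = 0 (a != 0).

    Returns ("real_distinct", m1, m2), ("double", m) or ("complex", s, t),
    where the complex roots are s +/- t*i.
    """
    disc = b * b - 4 * a * c
    if disc > 0:
        r = math.sqrt(disc)
        return ("real_distinct", (-b + r) / (2 * a), (-b - r) / (2 * a))
    if disc == 0:
        return ("double", -b / (2 * a))
    return ("complex", -b / (2 * a), math.sqrt(-disc) / abs(2 * a))

if __name__ == "__main__":
    print(classify_roots(1, 10, 16))   # y = C1 e^{m1 x} + C2 e^{m2 x}
    print(classify_roots(1, -6, 9))    # y = e^{mx}(C1 + C2 x)
    print(classify_roots(1, -10, 41))  # y = e^{sx}(C1 cos tx + C2 sin tx)
```

The three calls correspond to the three worked examples in this section.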


Example
Find the particular solution of the differential equation 2y″−3y′+2y=0 which satisfies the boundary conditions y=0 and y′=2 when x=0.

Solution
The roots of the auxiliary equation $2m^2 - 3m + 2 = 0$ are

$$m = \frac{3 \pm \sqrt{9 - 16}}{4} = \frac{3 \pm \sqrt{-7}}{4} = \frac{3 \pm i\sqrt{7}}{4}$$

According to (5.11), the solution of the given differential equation can be written as

$$y = e^{3x/4}\left(C_1 \cos\frac{\sqrt{7}}{4}x + C_2 \sin\frac{\sqrt{7}}{4}x\right)$$

This solution contains two constants which can be determined using the boundary conditions. It is obvious that $y(0) = C_1 = 0$. In order to use the second boundary condition we have to differentiate the solution y(x) with respect to x

$$y' = \frac{3}{4}e^{3x/4}\left(C_1 \cos\frac{\sqrt{7}}{4}x + C_2 \sin\frac{\sqrt{7}}{4}x\right) + e^{3x/4}\left(-\frac{\sqrt{7}}{4}C_1 \sin\frac{\sqrt{7}}{4}x + \frac{\sqrt{7}}{4}C_2 \cos\frac{\sqrt{7}}{4}x\right)$$

Taking into account that $C_1 = 0$, we obtain that

$$y'(0) = \frac{\sqrt{7}}{4}C_2 = 2$$

Solving this equation for $C_2$, we obtain that

$$C_2 = \frac{8}{\sqrt{7}} = \frac{8\sqrt{7}}{7}$$

The final form of the solution will be

$$y = \frac{8\sqrt{7}}{7}\,e^{3x/4}\sin\frac{\sqrt{7}}{4}x$$
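A quick way to gain confidence in this particular solution is to substitute it back numerically. The sketch below uses hand-computed derivatives of $y = K e^{sx}\sin tx$ with s = 3/4, t = √7/4 and K = 8/√7; the names `S`, `T`, `K` and the test point are illustrative assumptions.

```python
import math

S = 3 / 4                 # real part of the roots
T = math.sqrt(7) / 4      # imaginary part of the roots
K = 8 / math.sqrt(7)      # the constant C2 found above

def y(x):
    return K * math.exp(S * x) * math.sin(T * x)

def dy(x):
    return K * math.exp(S * x) * (S * math.sin(T * x) + T * math.cos(T * x))

def d2y(x):
    return K * math.exp(S * x) * (
        (S * S - T * T) * math.sin(T * x) + 2 * S * T * math.cos(T * x)
    )

def residual(x):
    # Left-hand side of 2y'' - 3y' + 2y = 0; should be close to 0.
    return 2 * d2y(x) - 3 * dy(x) + 2 * y(x)

if __name__ == "__main__":
    print(y(0.0), dy(0.0), residual(1.3))
```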

Example: Solve the differential equation:

y   6 y   13 y  0; y ( 0 )  2 , y ( 0 )  3

A special second order homogeneous differential equation is the Euler-Cauchy differential equation:

$$x^2 y'' + bxy' + cy = 0$$

where b and c are constants. Although this equation does not seem to be an equation with constant coefficients, it can be reduced to that form using a new variable

$$x = e^t \quad\text{or}\quad t = \ln|x|$$

Since we changed the variable, we have to change y′ and y″ as well. To do so, we use the chain rule (taking x>0):

$$\frac{dy}{dx} = \frac{dy}{dt}\frac{dt}{dx} = \frac{dy}{dt}\frac{1}{x} = \frac{dy}{dt}\,e^{-t}$$

$$\frac{d^2y}{dx^2} = \frac{dy}{dt}\frac{d^2t}{dx^2} + \frac{d^2y}{dt^2}\left(\frac{dt}{dx}\right)^2 = \frac{dy}{dt}\left(-\frac{1}{x^2}\right) + \frac{d^2y}{dt^2}\frac{1}{x^2} = -e^{-2t}\frac{dy}{dt} + e^{-2t}\frac{d^2y}{dt^2}$$

Introducing all derivatives back into the original equation, we have

$$e^{2t}\left(e^{-2t}\frac{d^2y}{dt^2} - e^{-2t}\frac{dy}{dt}\right) + b\,e^{t}e^{-t}\frac{dy}{dt} + cy = 0$$

$$\frac{d^2y}{dt^2} + (b - 1)\frac{dy}{dt} + cy = 0$$

which now is a second order homogeneous differential equation with constant coefficients. Therefore,
we use the previous results to solve this equation.

The auxiliary equation is

$$m^2 + (b - 1)m + c = 0$$

and the solution of the Euler-Cauchy differential equation can be found according to the nature of the roots of the auxiliary equation.

a. Real and distinct roots ($m_1$ and $m_2$)

$$y = Ae^{m_1 t} + Be^{m_2 t} = Ae^{m_1 \ln|x|} + Be^{m_2 \ln|x|} = A|x|^{m_1} + B|x|^{m_2}$$

b. Real but equal roots

$$y = (A + Bt)\,e^{mt} = (A + B\ln|x|)\,e^{m\ln|x|} = (A + B\ln|x|)\,|x|^{m}$$

c. Complex roots ($m_{1,2} = \alpha \pm i\beta$)

$$y = e^{\alpha t}\left(C_1 \cos\beta t + C_2 \sin\beta t\right) = e^{\alpha\ln|x|}\left(C_1 \cos(\beta\ln|x|) + C_2 \sin(\beta\ln|x|)\right) = |x|^{\alpha}\left(C_1 \cos(\beta\ln|x|) + C_2 \sin(\beta\ln|x|)\right)$$


6. Partial Differentiation
Now we enter new territory. Having spent the semester studying functions of several
variables, and having worked through the concept of a partial derivative, we are in position to
generalize the concept of a differential equation to include equations that involve partial
derivatives, not just ordinary ones. Solutions to such equations will involve functions not just
of one variable, but of several variables. Such equations arise naturally, for example, when
one is working with situations that involve positions in space that vary over time. To model
such a situation, one needs to use functions that have several variables to keep track of the
spatial dimensions and an additional variable for time.

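Examples of some important PDEs:

(1) $\dfrac{\partial^2 u}{\partial t^2} = c^2 \dfrac{\partial^2 u}{\partial x^2}$   One-dimensional wave equation

(2) $\dfrac{\partial u}{\partial t} = c^2 \dfrac{\partial^2 u}{\partial x^2}$   One-dimensional heat equation

(3) $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$   Two-dimensional Laplace equation

(4) $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = f(x, y)$   Two-dimensional Poisson equation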

Note that for PDEs one typically uses some other function letter such as u instead of y, which
now quite often shows up as one of the variables involved in the multivariable function.

In general we can use the same terminology to describe PDEs as in the case of ODEs. For starters, we will call any equation involving one or more partial derivatives of a multivariable function a partial differential equation. The order of such an equation is the highest order partial derivative that shows up in the equation. In addition, the equation is called linear if it is of the first degree in the unknown function u and its partial derivatives $u_x$, $u_{xx}$, $u_y$, etc. (this means that the highest power of the function u and its derivatives is just equal to one in each term in the equation, and that only one of them appears in each term). If each term in the equation involves either u or one of its partial derivatives, then the equation is classified as homogeneous.

Take a look at the list of PDEs above. Try to classify each one using the terminology given
above. Note that the f(x,y) function in the Poisson equation is just a function of the variables
x and y, it has nothing to do with u(x,y).

Answers: all of these PDEs are second order, and are linear. All are also homogeneous
except for the fourth one, the Poisson equation, as the f(x,y) term on the right hand side
doesn’t involve u or any of its derivatives.


The reason for defining the classifications linear and homogeneous for PDEs is to bring up the principle of superposition. This excellent principle (which also shows up in the study of linear homogeneous ODEs) is useful exactly whenever one considers solutions to linear homogeneous PDEs. The idea is as follows. Suppose two functions $u_1$ and $u_2$ satisfy a linear homogeneous differential equation. Taking the derivative of a sum of functions is the same as taking the sum of their derivatives. So, as long as the highest power of each derivative in the equation is one (i.e., it's linear) and each term contains u or one of its derivatives (i.e., it's homogeneous), it's a straightforward exercise to see that the sum of $u_1$ and $u_2$ will also be a solution to the differential equation. In fact, so will any linear combination $au_1 + bu_2$, where a and b are constants.

For instance, the two functions cos(xy) and sin(xy) are both solutions of the first-order linear homogeneous PDE:

(5) $x\dfrac{\partial u}{\partial x} - y\dfrac{\partial u}{\partial y} = 0$

It's a simple exercise to check that cos(xy) + sin(xy) and 3cos(xy) − 2sin(xy) are also solutions to the same PDE (as will be any linear combination of cos(xy) and sin(xy)).
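The superposition claim can be tested numerically. The sketch below approximates the partial derivatives by central differences and evaluates $x\,u_x - y\,u_y$ for one linear combination of cos(xy) and sin(xy); the coefficients 3 and −2, the step size and the test point are illustrative assumptions.

```python
import math

def u(x, y):
    # One linear combination of the two basic solutions.
    return 3 * math.cos(x * y) - 2 * math.sin(x * y)

def partial(f, x, y, var, h=1e-5):
    # Central-difference approximation of a first partial derivative.
    if var == "x":
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def pde_residual(x, y):
    # x*u_x - y*u_y, which should vanish for a solution.
    return x * partial(u, x, y, "x") - y * partial(u, x, y, "y")

if __name__ == "__main__":
    print(pde_residual(0.8, -1.3))
```

The printed value is zero up to the error of the finite-difference approximation.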

Solving PDEs
Solving PDEs is considerably more difficult in general than solving ODEs, as the level of complexity involved can be great. For instance the following seemingly completely unrelated functions are all solutions to the two-dimensional Laplace equation:

(1) $x^2 - y^2$, $e^x \cos(y)$ and $\ln(x^2 + y^2)$

You should check to see that these are all in fact solutions to the Laplace equation by doing the same thing you would do for an ODE solution, namely, calculate $\dfrac{\partial^2 u}{\partial x^2}$ and $\dfrac{\partial^2 u}{\partial y^2}$, substitute them into the PDE and see if the two sides of the equation are identical.
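The hand calculation can be cross-checked with a finite-difference approximation of the Laplacian. This is a rough numerical sketch, not a proof; the step size `h`, the test point and the dictionary name `candidates` are illustrative assumptions.

```python
import math

def laplacian(f, x, y, h=1e-3):
    # Second-order central differences for u_xx + u_yy.
    uxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    uyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    return uxx + uyy

candidates = {
    "x^2 - y^2": lambda x, y: x**2 - y**2,
    "e^x cos y": lambda x, y: math.exp(x) * math.cos(y),
    "ln(x^2 + y^2)": lambda x, y: math.log(x**2 + y**2),
}

if __name__ == "__main__":
    for name, f in candidates.items():
        print(name, laplacian(f, 1.2, 0.7))
```

For contrast, the same routine applied to $x^2 + y^2$ gives approximately 4, so that function is not a solution.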

Now, there are certain types of PDEs for which finding the solutions is not too hard. For instance, consider the first-order PDE

(2) $\dfrac{\partial u}{\partial x} = 3x^2 - xy^2$

where u is assumed to be a two-variable function depending on x and y. How could you solve this PDE? Think about it: is there any reason that we couldn't just undo the partial derivative of u with respect to x by integrating with respect to x? No, so try it out! Here, note that we are given information about just one of the partial derivatives, so when we find a solution, there will be an unknown term that's not necessarily just an arbitrary constant, but in fact is a completely arbitrary function depending on y.


To solve (2), then, integrate both sides of the equation with respect to x, as mentioned. Thus

(3) $\displaystyle\int \frac{\partial u}{\partial x}\,dx = \int \left(3x^2 - xy^2\right)dx$

so that $u(x, y) = x^3 - \frac{1}{2}x^2y^2 + F$. What is F? Note that it could be any function such that when one takes its partial derivative with respect to x, the result is 0. This means that in the case of PDEs, the arbitrary constants that we ran into during the course of solving ODEs are now taking the form of whole functions. Here F is in fact any function F(y) of y alone. To check that this is indeed a solution to the original PDE, it is easy enough to take the partial derivative of this u(x, y) function with respect to x and see that it indeed satisfies the PDE in (2).

Now consider a second-order PDE such as

(4) $\dfrac{\partial^2 u}{\partial x\,\partial y} = 5x + y^2$

where u is again a two-variable function depending on x and y. We can solve this PDE by integrating first with respect to x, to get to an intermediate PDE,

(5) $\dfrac{\partial u}{\partial y} = \dfrac{5}{2}x^2 + xy^2 + F(y)$

where F(y) is a function of y alone. Now, integrating both sides with respect to y yields

(6) $u(x, y) = \dfrac{5}{2}x^2y + \dfrac{1}{3}xy^3 + F(y) + G(x)$

where now G(x) is a function of x alone, and F(y) has been relabelled to stand for the integral of the earlier arbitrary function, which is again an arbitrary function of y (Note that we could have integrated with respect to y first, then x, and we would have ended up with the same result). Thus, whereas in the ODE world, general solutions typically end up with as many arbitrary constants as the order of the original ODE, here in the PDE world, one typically ends up with as many arbitrary functions in the general solutions.
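To illustrate that the arbitrary functions really drop out, the sketch below picks two sample choices, F(y) = sin y and G(x) = x³, and approximates the mixed partial derivative by central differences. It assumes the right-hand side of the PDE reads 5x + y² (the signs are hard to read in the source); the sample functions, step size and test point are likewise illustrative assumptions.

```python
import math

def u(x, y):
    F = math.sin(y)   # illustrative arbitrary function of y alone
    G = x**3          # illustrative arbitrary function of x alone
    return 2.5 * x**2 * y + x * y**3 / 3 + F + G

def mixed_partial(f, x, y, h=1e-3):
    # Central-difference approximation of d^2 f / (dx dy).
    return (
        f(x + h, y + h) - f(x + h, y - h) - f(x - h, y + h) + f(x - h, y - h)
    ) / (4 * h * h)

if __name__ == "__main__":
    x, y = 1.5, 0.4
    print(mixed_partial(u, x, y), 5 * x + y**2)
```

Changing F or G to any other smooth functions of a single variable leaves the mixed partial unchanged.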

To end up with a specific solution, then, we will need to be given extra conditions that
indicate what these arbitrary functions are. Thus the initial conditions for PDEs will typically
involve knowing whole functions, not just constant values. We will also see that the initial
conditions that appeared in specific ODE situations have slightly more involved analogs in
the PDE world, namely there are often so-called boundary conditions as well as initial
conditions to take into consideration.


7. Statistics
(I) Mean:

Sample mean (sample statistic):

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Population mean (population parameter):

$$\mu = \frac{\sum_{i=1}^{N} y_i}{N}$$

Basically, the mean provides information about the “center” of the data. Intuitively, it measures the rough “location” of the data.

Example (continued):

$$\bar{x} = \frac{1 + 3 + \cdots + 10}{10} = 5.5$$

(II) Median:

The data are arranged in ascending (or descending) order. Then,

If the sample size is odd, the median is the middle value.
If the sample size is even, the median is the mean of the middle two numbers.

Example (continued):

$$\text{median} = \frac{5 + 6}{2} = 5.5$$

If the data are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, then median = 6.

Note: the median is less sensitive to extreme values than the mean. For example, suppose the last value in the previous data has been wrongly typed, so that the data become 1, 3, 5, 7, 9, 2, 4, 6, 8, 100. Then the median is still 5.5 while the mean becomes 14.5.
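The effect of the mistyped value on the mean and the median can be reproduced with Python's standard `statistics` module. The helper name `summary` and the variable names are illustrative assumptions.

```python
import statistics

data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
typo = data[:-1] + [100]   # last value mistyped as 100

def summary(values):
    # Returns (mean, median) of the sample.
    return statistics.mean(values), statistics.median(values)

if __name__ == "__main__":
    print(summary(data))
    print(summary(typo))
```

Only the mean moves when the outlier is introduced; the median stays at the average of the two middle values.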

(III) Mode:
The data value that occurs with the greatest frequency (the data need not be numerical).


Note: if the data have exactly two modes, we say that the data are bimodal. If the data have more than
two modes, we say that the data are multimodal.
Example:
Suppose there are two factories producing the batteries. From each factory, 10 batteries are drawn to
test for the lifetime (in hours). These lifetimes are:

Factory 1: 10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1
Factory 2: 16, 5, 7, 14, 6, 15, 3, 13, 9, 12.

The mean lifetimes of the two factories are both 10. However, by looking at the data, it is obvious that
the batteries produced by factory 1 are much more reliable than the ones by factory 2. This implies
other measures for measuring the “dispersion” or “variation” of the data are required.

(I) Range:
range=(largest value of the data)-(smallest value of the data).
Example (continued):

Range of lifetime data for factory 1 = 10.1 − 9.9 = 0.2

Range of lifetime data for factory 2 = 16 − 3 = 13

⇒ The range of battery lifetimes for factory 1 is much smaller than the one for factory 2.
Note: the range is seldom used as the only measure of dispersion. The range is highly influenced by an extremely large or an extremely small data value.

(II) Variance and Standard Deviation:

Population deviation about the mean:

$$y_i - \mu, \quad i = 1, 2, \ldots, N$$

Sample deviation about the mean:

$$x_i - \bar{x}, \quad i = 1, 2, \ldots, n$$

Intuitively, the population deviation and the sample deviation measure how far each data value is from the “center” of the data. The population variance and sample variance are then built from the sums of the squared population deviations and sample deviations,

$$\sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \mu)^2}{N}$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1},$$

respectively.
The population standard deviation and sample standard deviation are the square roots of the population variance and sample variance:

$$\sigma = \sqrt{\sigma^2} \quad\text{and}\quad s = \sqrt{s^2},$$

respectively.
A large sample variance or sample standard deviation implies the data are “dispersed” or highly varied.

Note:

$$\sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - n\bar{x} = \sum_{i=1}^{n} x_i - n\,\frac{\sum_{i=1}^{n} x_i}{n} = \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} x_i = 0$$

Example:

$$s^2(\text{factory 1}) = \frac{(10.1 - 10)^2 + (9.9 - 10)^2 + \cdots + (10.1 - 10)^2}{10 - 1} = 0.0111$$

$$s^2(\text{factory 2}) = \frac{(16 - 10)^2 + (5 - 10)^2 + \cdots + (12 - 10)^2}{10 - 1} = 21.1111$$

⇒ The sample variance of battery lifetimes for factory 2 is about 1900 times larger than the one for factory 1.

The sample standard deviations for the data from factories 1 and 2 are

$$\sqrt{0.01111} = 0.1054 \quad\text{and}\quad \sqrt{21.1111} = 4.5946,$$

respectively.
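The sample variances above can be reproduced with `statistics.variance`, which divides by n − 1 as in the formula for $s^2$. The variable names below are illustrative.

```python
import statistics

factory1 = [10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1]
factory2 = [16, 5, 7, 14, 6, 15, 3, 13, 9, 12]

# Sample variance (divisor n - 1) and sample standard deviation.
var1 = statistics.variance(factory1)
var2 = statistics.variance(factory2)
sd1 = statistics.stdev(factory1)
sd2 = statistics.stdev(factory2)

if __name__ == "__main__":
    print(var1, var2, var2 / var1)
    print(sd1, sd2)
```

The ratio of the two variances comes out near 1900, since the sums of squared deviations are 190 and 0.1.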

(III) Coefficient of Variation:


The coefficient of variation is another useful statistic for measuring the dispersion of the data. The coefficient of variation is

$$C.V. = \frac{s}{\bar{x}} \times 100$$

The coefficient of variation is invariant with respect to the scale of the data. On the other hand, the standard deviation is not scale-invariant. The following example demonstrates the property.

Example:
In the battery data from factory 1, suppose the measurement is in minutes rather than hours. Then, the data are 606, 594, 606, 594, 594, 606, 594, 606, 594, 606.
Thus, the standard deviation becomes 6.3245, which is 60 times larger than the value 0.1054 based on the original data measured in hours. However, whether the data are measured in hours or in minutes, the coefficient of variation is

$$C.V. = \frac{0.1054}{10} \times 100 = \frac{6.3245}{600} \times 100 = 1.054.$$


Note: since the coefficient of variation is scale-invariant, it is very useful for comparing the dispersion
of different data. For example, in the previous battery data, if the lifetime of the batteries from factory
1 and factory 2 are measured in minutes and hours, respectively, the standard deviation for factory 1,
6.3245, would be larger than for factory 2, 4.5946. However, the coefficient of variation for factory 1,
1.054 is still much smaller than the one for factory 2, 45.946.
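Scale invariance of the coefficient of variation is easy to confirm numerically. In this sketch the function name `coefficient_of_variation` is an illustrative assumption; the data are the factory 1 lifetimes in hours and in minutes.

```python
import statistics

def coefficient_of_variation(values):
    # C.V. = (sample standard deviation / sample mean) * 100
    return statistics.stdev(values) / statistics.mean(values) * 100

hours = [10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1]
minutes = [606, 594, 606, 594, 594, 606, 594, 606, 594, 606]

if __name__ == "__main__":
    print(statistics.stdev(hours), statistics.stdev(minutes))
    print(coefficient_of_variation(hours), coefficient_of_variation(minutes))
```

The standard deviations differ by a factor of 60, while the two coefficients of variation agree.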


8. Probability
Outcomes – possible results of an experiment

Event – collection of outcomes

Favorable Outcomes – the outcomes of a specified event

Probability – a measure of the likelihood that an event will occur

To find the probability of an event, divide the number of favorable outcomes by the total
number of outcomes.

Theoretical Probability – based on knowing all of the equally likely outcomes

Experimental Probability – based on repeated trials of an experiment

A tree diagram helps count all possible outcomes by using branching to list choices.

The Counting Principle is a method used to count possible outcomes by multiplying the
outcomes of each event.
Permutation – order is imPortant
Combination – we don’t Care about order

Disjoint Events are events that have no outcomes in common.

Overlapping Events are events that have one or more outcomes in common.

Two disjoint events in which one or the other must occur are called complementary events.

If A and B are disjoint events, then


P(A or B) = P(A) + P(B)

When you consider outcomes of two events, the events are called compound events.

Two compound events that do not affect each other are called independent events.
P(A and B) = P(A) • P(B)

Two compound events are dependent events if the occurrence of one event does affect the
likelihood that the other will occur.
P(A and B) = P(A) • P(B given A)

• Sample space. This is just terminology. A sample space is the set of all possible outcomes of whatever ‘experiment’ you are doing.
For example, if you are flipping a coin 3 times, the sample space is

S = {TTT, TTH, THT, HTT, THH, HTH, HHT, HHH}.   (1.1)


If you are considering the possible years of age of a human being, the sample space
consists of the non-negative integers up to 150. If you are considering the possible
birthdates of a person drawn at random, the sample space consists of the days of the year,
thus the integers from 1 to 366. If you are considering the possible birthdates of two
people selected at random, the sample space consists of all pairs of the form (j, k) where j
and k are integers from 1 to 366.
To reiterate: S is just the collection of all conceivable outcomes.

• Events. An event is a subset of the sample space, thus, a subset of possible outcomes for your experiment. Thus, if S is the sample space for flipping a coin three times, then {HTH} is an event. The event that a head appears on the first flip is the four element subset {HTT, HHT, HTH, HHH}.
Thus, an event is simply a certain subset of the possible outcomes.
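The sample space for three coin flips and the event "head on the first flip" can be listed mechanically. This is a small sketch using the standard library; the names `S` and `first_flip_head` are illustrative.

```python
from itertools import product

# All sequences of three flips, each flip being H or T.
S = ["".join(flips) for flips in product("HT", repeat=3)]

# An event is a subset of S: here, a head on the first flip.
first_flip_head = {outcome for outcome in S if outcome[0] == "H"}

if __name__ == "__main__":
    print(S)
    print(sorted(first_flip_head))
```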

• Axiomatic definition of probability. A probability function on a sample space is, by definition, an assignment of a non-negative number to every subset of S subject to the following rules:

$$P(S) = 1 \quad\text{and}\quad P(A \cup B) = P(A) + P(B) \ \text{when}\ A \cap B = \emptyset. \qquad (1.2)$$

Here, the notation is as follows: A subset of S is a collection of its elements. If A and B are subsets of S, then $A \cup B$ is the subset of elements that are in A or in B. Meanwhile, $A \cap B$ is the subset of elements that are in both A and B. Finally, ø is the stupid subset with no elements; deemed the ‘empty set’. Note that $A \cup B$ is said to be the union of A and B, while $A \cap B$ is said to be the intersection of A and B.
Note that condition P(S) = 1 says that there is probability 1 of at least something happening. Meanwhile, the condition $P(A \cup B) = P(A) + P(B)$ when A and B have no points in common asserts the following: The probability of something happening that is in either A or B is the sum of the probabilities of something happening from A or something happening from B.
There is a general rule:

If you know what P assigns to each element in S, then you know P on every subset: Just
add up the probabilities that are assigned to its elements.
This assumes that S is a finite set. We’ll talk about the story when it isn’t later in the
course. Anyway, the preceding illustrates the more intuitive notion of probability that we
all have: It says simply that if you know the probability of every outcome, then you can
compute the probability of any subset of outcomes by summing up the probabilities of the
outcomes that are in the subset.
For example, if S is the set of outcomes for flipping a fair coin three times (as depicted in (1.1)), then each of its elements has P(·) = 1/8 and then we can use the rule in (1.2) to assign probabilities to any given subset of S. For example, the subset given by {HHT, HTH, THH} has probability 3/8 since

P({HHT, HTH, THH}) = P({HHT, HTH}) + P(THH)


by invoking (1.2). Invoking it a second time finds P({HHT, HTH}) = P(HHT) + P(HTH), and so P({HHT, HTH, THH}) = P(HHT) + P(HTH) + P(THH) = 3/8.
Here are some consequences of the definition of probability.

a) $P(\emptyset) = 0$.
b) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
c) $P(A) \le P(B)$ if $A \subset B$.
d) $P(B) = P(B \cap A) + P(B \cap A^c)$.
e) $P(A^c) = 1 - P(A)$.
(1.3)
In the preceding, $A^c$ is the set of elements that are not in A. The set $A^c$ is called the ‘complement’ of A.
I want to stress that all of these conditions are simply translations into symbols of
intuition that we all have about probabilities. Here are the respective English versions of
(1.3):

a) The probability that no outcomes appear is zero. This is to say that if S is the list of
all possible outcomes, then at least one outcome must appear.
b) The probability that an outcome is in either A or B is the probability that it is in A plus the
probability that it is in B minus the probability that it is in both. The point here is that
if A and B have elements in common, then one is overcounting by just summing the
two probabilities. If you doubt this, try the case where A = B.
c) The probability of an outcome from A is no greater than that of an outcome from B in
the case that all outcomes from A are contained in the set B.
d) The probability of an outcome from the set B is the sum of the probability that the
outcome is in the portion of B that is contained in A and the probability that the
outcome is in the portion of B that is not contained in A.
e) The probability of an outcome that is not in A is 1 minus the probability that an
outcome is in A.
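
The five rules in (1.3) can be spot-checked on the three-flip sample space. The Python sketch below is an illustration under stated assumptions: the coin is fair, and the particular events `A` and `B` are chosen here only as examples. Passing these checks for one pair of events is a sanity check, not a proof.

```python
from fractions import Fraction
from itertools import product

# Three flips of a coin; the fair coin assumption gives each outcome 1/8.
S = {"".join(f) for f in product("HT", repeat=3)}

def prob(event):
    return Fraction(len(event), len(S))

A = {s for s in S if s[0] == "H"}        # example event: head on the first flip
B = {s for s in S if s.count("H") >= 2}  # example event: at least two heads
A_c = S - A                              # complement of A

checks = {
    "a": prob(set()) == 0,
    "b": prob(A | B) == prob(A) + prob(B) - prob(A & B),
    "c": prob(A & B) <= prob(B),         # the intersection of A and B is a subset of B
    "d": prob(B) == prob(B & A) + prob(B & A_c),
    "e": prob(A_c) == 1 - prob(A),
}
print(checks)
```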

 Conditional probability: This is the probability that an event in A occurs given that you
already know that an event in B occurs. It is denoted by P(A|B) and it is a probability
assignment for S that is typically not the same as the original one, P. The rule for
computing this new probability is

P(A|B) = P(A ∩ B)/P(B).
(1.4)
You can check that this obeys all of the rules for being a probability. In English, this
says:
The probability of an event occurring from A given that the event is in B is the probability
of the event being in both A and B divided by the probability of the event being in B in
the first place.
Another way to view this notion is as follows: Since we are told that the event B
happened, we can shrink the sample space from the whole of S to just the elements that
define the event B. The probability of A given that B happened is then the probability
assigned to the part of A in B (thus, P(A ∩ B)) divided by P(B). In this regard, the
division by P(B) is done to make the conditional probability of B given that B happened
equal to 1.
Anyway, here is an example: Suppose we want the conditional probability of a head
on the last flip granted that there is a head on the first flip. Use B to denote the event that
there is a head on the first flip. Then P(B) = 1/2. The conditional probability that there is a
head on the final flip given that you know there is one on the first flip is obtained using
(1.4). Here, A is the event that there is a head on the final flip, thus the set {TTH, THH,
HTH, HHH}. Its intersection with B is A ∩ B = {HTH, HHH}. This set has probability
1/4, so our conditional probability is (1/4)/(1/2) = 1/2.
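
The same computation can be sketched in Python. The helper names `prob` and `cond_prob` are introduced here for illustration and the coin is assumed to be fair. The second printed value illustrates the remark that dividing by P(B) makes the conditional probability of B given B equal to 1.

```python
from fractions import Fraction
from itertools import product

# Three flips of a coin; the fair coin assumption gives each outcome 1/8.
S = {"".join(f) for f in product("HT", repeat=3)}

def prob(event):
    return Fraction(len(event), len(S))

def cond_prob(A, B):
    """P(A|B) = P(A and B) / P(B), as in (1.4). Requires P(B) > 0."""
    return prob(A & B) / prob(B)

A = {s for s in S if s[2] == "H"}  # head on the final flip
B = {s for s in S if s[0] == "H"}  # head on the first flip

print(cond_prob(A, B))  # 1/2
print(cond_prob(B, B))  # 1
```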

 That’s all there is to probability: You have just seen most of probability theory for
sample spaces with a finite number of elements. There are a few new notions that are
introduced later, but a good deal of what follows concerns either various consequences of
the notions that were just introduced, or else various convenient ways to calculate
probabilities that arise in common situations.

 Decomposing a subset to compute probabilities: It is often the case (as we will see) that it
is easier to compute conditional probabilities. This can be used to one’s advantage in the
following situation: Suppose that S is decomposed into a union of some number, N, of
subsets that have no elements in common: S = A1 ∪ A2 ∪ · · · ∪ AN, where {Aj}1≤j≤N are subsets of S
with Aj ∩ Aj´ = ø when j ≠ j´. Now suppose that A is any given set. Then

P(A) = ∑1≤j≤N P(A|Aj)·P(Aj).


(1.5)
In words, this says the following:

The probability of A is the probability that an outcome from A occurs that is in A1, plus
the probability that an outcome from A occurs that is in A2, plus the corresponding
probability for each remaining subset, up through AN.

By the way, do you recognize (1.5) as a linear equation? You might if you denote
P(A) by y, each P(Aj) by xj and P(A|Aj) by aj so that this reads

y = a1x1 + a2x2 + · · · + aNxN

Thus, linear systems arise


Here is an example: Suppose we have a stretch of DNA of length N, and want to
know the probability of not seeing the base G = Guanine in this stretch. Let A = Event
of this happening for a length N stretch and B = Event for a length N-1 stretch. If each of the
four bases has equal probability of appearing, then P(A|B) = 3/4. Since A can only occur when B
does, P(A) = P(A|B)·P(B). Thus, writing P_N for P(A), we learn that P_N = (3/4)·P_(N-1), and so
we can iterate this taking N = 1, N = 2, etc. to find the general formula P_N = (3/4)^N.
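
The closed formula can be compared against direct enumeration for small N. The Python sketch below is an illustration under two assumptions that it makes explicit: each of the four bases is equally likely at every site, and all sequences of a given length are equally likely. The function name `prob_no_G` is introduced here.

```python
from fractions import Fraction
from itertools import product

BASES = "AGCT"

def prob_no_G(N):
    """Probability that a length-N sequence contains no G, by enumeration.

    Assumes all 4**N sequences are equally likely.
    """
    seqs = list(product(BASES, repeat=N))
    no_G = [s for s in seqs if "G" not in s]
    return Fraction(len(no_G), len(seqs))

# Compare the enumeration with the formula (3/4)^N for small N.
for N in range(1, 6):
    assert prob_no_G(N) == Fraction(3, 4) ** N

print(prob_no_G(3))  # 27/64
```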
Here is more linear algebra: Let {Aj}j=1,2,3,4 denote the event that a given site in DNA
has base {A, G, C, T} = {1, 2, 3, 4}. Let {Bj}j=1,…4 denote the analogous event for the
adjacent site to the 5´ end of the DNA. (The ends of a DNA molecule are denoted 3´ and
5´ for reasons that have to do with a tradition of labelling carbon atoms on sugar
molecules.) According to the rule in (1.5), we must have

P(Aj) = ∑k P(Aj|Bk)·P(Bk).

So, we have a 4 × 4 matrix M whose entry in row j and column k is P(Aj|Bk). Now write
each P(Aj) as yj and each P(Bk) as xk, and this last equation reads yj = ∑k Mjkxk.
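
To make the matrix form concrete, here is a small Python sketch. The entries of `M` and `x` are hypothetical numbers invented for this illustration; they are not measured DNA statistics. The only constraints imposed are that each column of M sums to 1 and that the entries of x sum to 1.

```python
from fractions import Fraction as F

# Hypothetical conditional probabilities M[j][k] = P(Aj | Bk).
# Each column sums to 1, since some base must appear at the site.
M = [
    [F(1, 2), F(1, 4), F(1, 4), F(1, 4)],
    [F(1, 6), F(1, 4), F(1, 4), F(1, 4)],
    [F(1, 6), F(1, 4), F(1, 4), F(1, 4)],
    [F(1, 6), F(1, 4), F(1, 4), F(1, 4)],
]

# Hypothetical probabilities x[k] = P(Bk) for the adjacent site.
x = [F(2, 5), F(1, 5), F(1, 5), F(1, 5)]

# y[j] = sum over k of M[j][k] * x[k], which is (1.5) applied to each Aj.
y = [sum(M[j][k] * x[k] for k in range(4)) for j in range(4)]

print(y)       # the probabilities P(Aj)
print(sum(y))  # 1
```

Because every column of M sums to 1, the resulting y again sums to 1, as a probability assignment must.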

 Independent events: An event A is said to be independent of B in the case that

P(A|B) = P(A).

In English: Events A and B are independent when the probability of A given B is the
same as that of A with no knowledge about B. Thus, whether the outcome is in B or not
has no bearing on whether it is in A.
Here is an equivalent definition: Events A and B are independent when P(A ∩ B) =
P(A)P(B). This is equivalent because P(A|B) = P(A ∩ B)/P(B). Note that the equality
between P(A ∩ B) and P(A)P(B) implies that P(B|A) = P(B). Thus, independence is
symmetric. Here is the English version of this equivalent definition: Events A and B are
independent in the case that the probability of an event being both in A and in B is the
product of the probability that it is in A and the probability that it is in B.
For an example, take A to be the event that a head appears on the first coin toss and B
the event that it appears on the third. Are these events independent? Well, P(A) is 1/2 as is
P(B). Meanwhile, P(A ∩ B) = 1/4 which is P(A)P(B). Thus, they are indeed
independent.
For a second example, consider A to be the event that a head appears on the first toss
and B the event that a tail appears on the first toss. Then A ∩ B = ø, so P(A ∩ B) is zero
but P(A)P(B) = 1/4. So, these two events are not independent. (Are you surprised?)
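
Both examples can be checked with a short Python sketch that tests the product form of the definition. The helper names `prob` and `independent` are introduced here for illustration, and the coin is assumed to be fair.

```python
from fractions import Fraction
from itertools import product

# Three flips of a coin; the fair coin assumption gives each outcome 1/8.
S = {"".join(f) for f in product("HT", repeat=3)}

def prob(event):
    return Fraction(len(event), len(S))

def independent(A, B):
    """True when P(A and B) equals P(A) times P(B)."""
    return prob(A & B) == prob(A) * prob(B)

head_first = {s for s in S if s[0] == "H"}
head_third = {s for s in S if s[2] == "H"}
tail_first = {s for s in S if s[0] == "T"}

print(independent(head_first, head_third))  # True
print(independent(head_first, tail_first))  # False
```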
Here is food for thought: Is it reasonable to suppose that the probability of seeing the
base G at a given site is independent of seeing it at the next site in a stretch of DNA?
Check out the DNA code for most commonly used amino acids and let me know.

Exercises:
1. Suppose we have an experiment with three possible outcomes, labeled 1, 2, and 3.
Suppose in addition, that we do the experiment three successive times.
a) Give the sample space for the possible outcomes of the three experiments.
b) Write down the subset of your sample space that corresponds to the event that
outcome 1 occurs in the second experiment.
c) Suppose that we have a theoretical model of the situation that predicts equal
probability for any of the three outcomes for any one given experiment. Our model also says
that the event that outcome k appears in one experiment and outcome j in another are
independent. Use these facts to give the probability of three successive experiments getting as
outcome any given triple (i, j, k) with i either 1, 2 or 3, and with j and k likewise constrained.
d) Which is more likely: Getting exactly two identical outcomes in the three
experiments, or getting three distinct outcomes in the three experiments?

2. Suppose that 1% of Harvard students have a particular mutation in a certain protein, that
20% of people with this mutation have trouble digesting lactose, and that 5% of Harvard
students have trouble digesting lactose. If a Harvard student has trouble digesting lactose,
what is the probability that the student has the particular mutation? (Hint: Think Bayes’
theorem.)

3. A certain experiment has N ≥ 2 possible outcomes. Let S1 denote the corresponding
sample space. Suppose that k is a positive integer and that p ∈ (0, 1).
a) How many elements are in the sample space for the possible outcomes of k separate
repeats of the experiment?
b) Suppose that N ≥ 2 and that we have a theoretical model that predicts that one
particular outcome, ô ∈ S1, has probability p and all others have probability (1 − p)/(N − 1).
Suppose that we run the experiment twice and that our model predicts that the event of
getting any given outcome in the first run is independent from the event of getting any
particular outcome in the second. How big must p be before it is more probable to get ô
twice as opposed to never?
c) How big must p be before it is more probable to get ô twice in two consecutive runs as
opposed to ô just once in the two experiments?

4. Label the four bases that are used in a DNA molecule as {1, 2, 3, 4}.
a) Granted this labeling, write down the sample space for the possible bases at two given
sites on the molecule.
b) Let {Aj}j=1,2,3,4 denote the event in this 2-site sample space that the first site has the
base j, and let {Bj}j=1,…,4 denote the analogous event for the second site. Explain why
P(A1|Bk) + P(A2|Bk) + P(A3|Bk) + P(A4|Bk) = 1 for all k.
c) If Ai is independent from each Bk, we saw that P(Ai|Bk) = P(Bk|Ai) for all i and k.
Can this last symmetry condition hold if some pair Ai and Bk are not independent?
If so, give an example by specifying the associated probability function on the two-site
sample space. If not, explain why.
