26 3 2 12 6 18
2 2
3 6
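To set up the minor matrix, ignore the row and column that contain the coefficient.
\[
\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & j \end{vmatrix}
= a \begin{vmatrix} e & f \\ h & j \end{vmatrix}
- b \begin{vmatrix} d & f \\ g & j \end{vmatrix}
+ c \begin{vmatrix} d & e \\ g & h \end{vmatrix}
\]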
NOTES ON MATHEMATICS 2018-2019
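Example Find the determinant for the matrix in example 2 using the diagonals method.
\[
\begin{array}{rrr|rr}
-2 & 1 & 3 & -2 & 1 \\
1 & 1 & 4 & 1 & 1 \\
-3 & 6 & 2 & -3 & 6
\end{array}
\]
\[
\begin{aligned}
\det &= (-2)(1)(2) + (1)(4)(-3) + (3)(1)(6) - (-3)(1)(3) - (6)(4)(-2) - (2)(1)(1) \\
&= -4 - 12 + 18 + 9 + 48 - 2 \\
&= -16 + 27 + 46 \\
&= 57
\end{aligned}
\]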
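The diagonals method can be sketched in a few lines of code. This is a minimal illustration, not part of the original notes: the function name `det3_diagonals` is made up here, and the matrix entries are read from the worked example with the minus signs inferred from its arithmetic.

```python
def det3_diagonals(m):
    """Determinant of a 3x3 matrix by the diagonals method (rule of Sarrus)."""
    (a, b, c), (d, e, f), (g, h, j) = m
    down = a * e * j + b * f * g + c * d * h  # products along the three downward diagonals
    up = g * e * c + h * f * a + j * d * b    # products along the three upward diagonals
    return down - up

# Matrix assumed from the worked example (signs inferred from the arithmetic).
matrix = [[-2, 1, 3],
          [1, 1, 4],
          [-3, 6, 2]]
print(det3_diagonals(matrix))
```

The method only works for 3 x 3 matrices; larger determinants need cofactor expansion or elimination.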
Applications of Determinants
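Example Find the area of a triangle whose vertices are at (0, 0), (10, 25) and (28, 20).
"Formula": Let $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ be the vertices of a triangle. Then
\[
A = \pm\frac{1}{2}\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}
\]
where the sign is chosen so that the area is positive.
\[
\begin{aligned}
A &= \pm\frac{1}{2}\begin{vmatrix} 0 & 0 & 1 \\ 10 & 25 & 1 \\ 28 & 20 & 1 \end{vmatrix} \\
&= \pm\frac{1}{2}\,(0 + 0 + 200 - 700 - 0 - 0) \\
&= \pm\frac{1}{2}\,(-500) = 250 \text{ square units}
\end{aligned}
\]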
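As a quick cross-check of the area formula, the determinant can be expanded along its first row in code. This is a sketch only; the function name `triangle_area` is an illustrative choice, not something from the notes.

```python
def triangle_area(p1, p2, p3):
    """Area of a triangle from the 3x3 determinant with rows (x, y, 1)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Cofactor expansion of the determinant along the first row.
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    return abs(det) / 2  # the absolute value plays the role of the +/- sign

print(triangle_area((0, 0), (10, 25), (28, 20)))
```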
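Example Find the equation of the line that contains $(4, 1)$ and $(-3, -5)$.
"Formula":
\[
\begin{vmatrix} x & y & 1 \\ x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \end{vmatrix} = 0
\]
\[
\begin{vmatrix} x & y & 1 \\ 4 & 1 & 1 \\ -3 & -5 & 1 \end{vmatrix} = 0
\]
\[
\begin{aligned}
x - 3y - 20 + 3 + 5x - 4y &= 0 \\
x - 3y - 17 + 5x - 4y &= 0 \\
6x - 7y &= 17
\end{aligned}
\]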
Example
2) Prove that
1.2 Matrix
Row Matrix [1 4 8 2 4]
Column Matrix $\begin{bmatrix} 3 \\ 5 \\ 1 \\ 2 \end{bmatrix}$
Matrix Arithmetic
Examples
[1 3 2 4] + [3 -1 2 6] = [4 2 4 10]
2 6 8
3 1 4
4 4 0
8
1
3
If x= 1 and y= 2 then
1
9 7
x+y= 2 and x - y = 4
1 3
A Square Matrix is a matrix which has the same number of rows and columns.
A Diagonal Matrix is a square matrix which has non-zero element values only on the leading diagonal of the matrix; all the other elements are zero.
e.g. $\begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$
A Symmetric Matrix is one where the non-diagonal elements have a 'mirror-image' across the leading diagonal.
e.g. $\begin{bmatrix} 5 & 2 & 1 \\ 2 & 8 & 0 \\ 1 & 0 & 7 \end{bmatrix}$
The Transpose of a matrix is the matrix formed by interchanging the rows and the columns.
The transpose of A is denoted by $A'$ (or sometimes $A^T$).
e.g. If $A = \begin{bmatrix} 1 & 0 & 5 \\ 2 & 3 & 6 \end{bmatrix}$ then $A' = \begin{bmatrix} 1 & 2 \\ 0 & 3 \\ 5 & 6 \end{bmatrix}$
Matrix Arithmetic
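e.g. $\begin{bmatrix} 1 & 4 \\ 2 & 2 \\ 5 & 5 \end{bmatrix} + \begin{bmatrix} 2 & 7 \\ -1 & 4 \\ 9 & -3 \end{bmatrix} = \begin{bmatrix} 3 & 11 \\ 1 & 6 \\ 14 & 2 \end{bmatrix}$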
Matrix Multiplication
Only matrices of certain sizes and shapes can be multiplied together.
The order in which the matrices are multiplied is significant.
The dimensions of the resultant matrix may be different from those of the original ones.
The number of columns in the first matrix must be the same as the number of rows in the second matrix.
Example
To multiply x . M, where x is a row matrix of 3 elements and M is a 3 x 2 matrix:
\[
\begin{bmatrix} 2 & 4 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 2 & 3 \end{bmatrix}
= \begin{bmatrix} 4 & 11 \end{bmatrix}
\]
The result is a row matrix of 2 elements.
What we did to obtain each element in the result r was:
To get $r_{11}$, take the 1st row of x and the 1st column of M, then
$r_{11} = x_{11} m_{11} + x_{12} m_{21} + x_{13} m_{31}$
To get $r_{12}$, take the 1st row of x and the 2nd column of M, then
$r_{12} = x_{11} m_{12} + x_{12} m_{22} + x_{13} m_{32}$
Remember, [m x n] means a matrix with m rows and n columns.
And for multiplication we go
Along the rows of the first matrix
Down the columns of the second matrix
The size of the answer can be predicted: [m x n] [n x p] gives answer [m x p]
Example
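\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 2 & 1 \\ 2 & 2 \end{bmatrix}
= \begin{bmatrix}
1\cdot 1 + 2\cdot 2 + 3\cdot 2 & 1\cdot 2 + 2\cdot 1 + 3\cdot 2 \\
2\cdot 1 + 1\cdot 2 + 2\cdot 2 & 2\cdot 2 + 1\cdot 1 + 2\cdot 2
\end{bmatrix}
= \begin{bmatrix} 11 & 10 \\ 8 & 9 \end{bmatrix}
\]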
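The "along the rows, down the columns" rule translates directly into code. The sketch below is illustrative only (the function name `matmul` and the list-of-rows representation are choices made here); it reproduces both worked examples.

```python
def matmul(a, b):
    """Multiply an [m x n] matrix by an [n x p] matrix, both given as lists of rows."""
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("columns of the first matrix must equal rows of the second")
    p = len(b[0])
    # Element (i, j): go along row i of a and down column j of b.
    return [[sum(row[k] * b[k][j] for k in range(n)) for j in range(p)]
            for row in a]

print(matmul([[2, 4, 1]], [[1, 2], [0, 1], [2, 3]]))
print(matmul([[1, 2, 3], [2, 1, 2]], [[1, 2], [2, 1], [2, 2]]))
```

The size check mirrors the rule that an [m x n] matrix can only multiply an [n x p] matrix, and the result has m rows and p columns.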
Inverse of a Matrix
There is a special case for square matrices where
A.B = B.A.
This is when matrix B is the inverse of matrix A.
(It also follows that matrix A is the inverse of matrix B.)
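The inverse of matrix A is denoted by $A^{-1}$ (informally, $A^{-1} = 1/A$).
The result of the multiplications $A \cdot A^{-1}$ and $A^{-1} \cdot A$ is the identity matrix I. This is a special type of diagonal matrix where the diagonal elements have the value 1.
\[
A \cdot A^{-1} = A^{-1} \cdot A = I =
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & 1
\end{bmatrix}
\]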
Symmetric Matrices
Definition of symmetric matrix
Example
$A = \begin{bmatrix} 1 & 2 & 5 \\ 2 & 3 & 6 \\ 5 & 6 & 4 \end{bmatrix}$ is symmetric since $A = A^t$.
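Let
\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 0 & -1 \end{bmatrix}
\quad\text{and}\quad
A^t = \begin{bmatrix} 1 & 3 \\ 2 & 0 \\ 1 & -1 \end{bmatrix}.
\]
Then,
\[
\begin{aligned}
AA^t &= \mathrm{col}_1(A)\,\mathrm{row}_1(A^t) + \mathrm{col}_2(A)\,\mathrm{row}_2(A^t) + \mathrm{col}_3(A)\,\mathrm{row}_3(A^t) \\
&= \mathrm{col}_1(A)\,\mathrm{col}_1^t(A) + \mathrm{col}_2(A)\,\mathrm{col}_2^t(A) + \mathrm{col}_3(A)\,\mathrm{col}_3^t(A) \\
&= \begin{bmatrix} 1 \\ 3 \end{bmatrix}\begin{bmatrix} 1 & 3 \end{bmatrix}
 + \begin{bmatrix} 2 \\ 0 \end{bmatrix}\begin{bmatrix} 2 & 0 \end{bmatrix}
 + \begin{bmatrix} 1 \\ -1 \end{bmatrix}\begin{bmatrix} 1 & -1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 3 \\ 3 & 9 \end{bmatrix}
 + \begin{bmatrix} 4 & 0 \\ 0 & 0 \end{bmatrix}
 + \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
 = \begin{bmatrix} 6 & 2 \\ 2 & 10 \end{bmatrix}
\end{aligned}
\]
In addition,
\[
\begin{aligned}
A^tA &= \mathrm{row}_1^t(A)\,\mathrm{row}_1(A) + \mathrm{row}_2^t(A)\,\mathrm{row}_2(A) \\
&= \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}\begin{bmatrix} 1 & 2 & 1 \end{bmatrix}
 + \begin{bmatrix} 3 \\ 0 \\ -1 \end{bmatrix}\begin{bmatrix} 3 & 0 & -1 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}
 + \begin{bmatrix} 9 & 0 & -3 \\ 0 & 0 & 0 \\ -3 & 0 & 1 \end{bmatrix}
 = \begin{bmatrix} 10 & 2 & -2 \\ 2 & 4 & 2 \\ -2 & 2 & 2 \end{bmatrix}
\end{aligned}
\]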
Note:
If A and B are symmetric matrices, then AB is not necessarily equal to $BA = (AB)^t$. That is, AB might not be a symmetric matrix.
Note that all the main diagonal elements in a skew-symmetric matrix are zero.
Let's take an example of a matrix
\[
A = \begin{bmatrix} 0 & -5 & 4 \\ 5 & 0 & -1 \\ -4 & 1 & 0 \end{bmatrix}
\]
It is a skew-symmetric matrix because $a_{ij} = -a_{ji}$ for all i and j. For example, $a_{12} = -5$ and $a_{21} = 5$, which means $a_{12} = -a_{21}$. Similarly, this condition holds true for all other values of i and j.
We can also verify that the transpose of matrix A is equal to the negative of matrix A, i.e. $A^T = -A$:
\[
A^T = \begin{bmatrix} 0 & 5 & -4 \\ -5 & 0 & 1 \\ 4 & -1 & 0 \end{bmatrix}
\quad\text{and}\quad
-A = \begin{bmatrix} 0 & 5 & -4 \\ -5 & 0 & 1 \\ 4 & -1 & 0 \end{bmatrix}
\]
We can clearly see that $A^T = -A$, which makes A a skew-symmetric matrix.
Let's take another example of a matrix
\[
B = \begin{bmatrix} 0 & 5 & 4 \\ 5 & 0 & 1 \\ 4 & 1 & 0 \end{bmatrix}
\]
Matrix B is not skew-symmetric because $b_{12} \neq -b_{21}$, or $B^T \neq -B$.
Note that in skew-symmetric matrices, to satisfy $a_{11} = -a_{11}$, $a_{22} = -a_{22}$, $a_{33} = -a_{33}$, ..., all the main diagonal elements have to be zero.
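The two definitions can be tested mechanically by comparing a matrix with its transpose. The helper names below (`transpose`, `is_symmetric`, `is_skew_symmetric`) are illustrative, and the entries of A and B are assumed from the examples above, with the signs of A chosen so that $a_{ij} = -a_{ji}$.

```python
def transpose(m):
    """Interchange rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def is_symmetric(m):
    return transpose(m) == m

def is_skew_symmetric(m):
    negated = [[-v for v in row] for row in m]
    return transpose(m) == negated

# Matrices assumed from the examples in the text.
A = [[0, -5, 4], [5, 0, -1], [-4, 1, 0]]
B = [[0, 5, 4], [5, 0, 1], [4, 1, 0]]
print(is_skew_symmetric(A), is_skew_symmetric(B), is_symmetric(B))
```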
Example
3) Prove that
Solutions
Let A be a non-singular square matrix, i.e. |A| is not 0.
Again we know that
2. Numerical Methods
Newton forward interpolation
Suppose we are given the following values of y = f(x) for a set of values of x:
x : x0 x1 x2 …… xn
y : y0 y1 y2 …… yn
The process of finding the value of y corresponding to any value of x between $x_0$ and $x_n$, that is, estimating the value of a function for an intermediate value of the independent variable, is called interpolation.
The technique of estimating the value of a function outside the given range is called extrapolation.
The study of interpolation is based on the concept of differences of a function.
Suppose that the function y = f(x) is tabulated for the equally spaced values $x = x_0,\ x_1 = x_0 + h,\ x_2 = x_0 + 2h,\ \dots,\ x_n = x_0 + nh$, giving $y = y_0, y_1, y_2, \dots, y_n$. To determine the values of f(x) and f'(x) for some intermediate values of x, we use the following three types of differences:
Forward differences
Backward differences
Central differences
Forward differences: The forward differences are defined and denoted by $\Delta f(x) = f(x+h) - f(x)$, so that
\[
\begin{aligned}
\Delta y_0 &= y_1 - y_0 \\
\Delta y_1 &= y_2 - y_1 \\
\Delta y_2 &= y_3 - y_2 \\
&\;\;\vdots \\
\Delta y_r &= y_{r+1} - y_r \\
&\;\;\vdots \\
\Delta y_{n-1} &= y_n - y_{n-1}
\end{aligned}
\]
These are called the first forward differences and $\Delta$ is the forward difference operator.
Similarly the second forward differences are defined by
\[
\Delta^2 y_r = \Delta y_{r+1} - \Delta y_r .
\]
In general, the pth forward differences are
\[
\Delta^p y_r = \Delta^{p-1} y_{r+1} - \Delta^{p-1} y_r .
\]
The forward differences are systematically set out in a table called the forward difference table.
Value of x   Value of y   1st diff.   2nd diff.   3rd diff.   4th diff.   5th diff.
                          ∆           ∆2          ∆3          ∆4          ∆5
x0           y0
                          ∆y0
x1           y1                       ∆2y0
                          ∆y1                     ∆3y0
x2           y2                       ∆2y1                    ∆4y0
                          ∆y2                     ∆3y1                    ∆5y0
x3           y3                       ∆2y2                    ∆4y1
                          ∆y3                     ∆3y2
x4           y4                       ∆2y3
                          ∆y4
x5           y5
Example The table gives the distances in nautical miles of the visible horizon for the given heights in feet above the earth's surface:
x = height      100     150     200     250     300     350     400
y = distance    10.63   13.03   15.04   16.81   18.42   19.90   21.27
Find the values of y when (i) x = 218 ft. (ii) x = 410 ft.
Sol. The difference table is
x          y        ∆       ∆2       ∆3      ∆4
100        10.63
                    2.40
150        13.03            -0.39
                    2.01             0.15
200        15.04            -0.24            -0.07
                    1.77             0.08
250        16.81            -0.16            -0.05
                    1.61             0.03
300        18.42            -0.13            -0.01
                    1.48             0.02
350        19.90            -0.11
                    1.37
xn = 400   21.27
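The interpolation step for this example is not worked out above. The sketch below assumes the standard Newton forward formula $y(x_0 + ph) = y_0 + p\,\Delta y_0 + \frac{p(p-1)}{2!}\Delta^2 y_0 + \dots$ with $p = (x - x_0)/h$, and assumes the y values implied by the first entry and the first differences of the table. The function names are illustrative.

```python
def forward_differences(ys):
    """Return [y0, Δy0, Δ²y0, ...], the leading entries of the forward difference table."""
    leading = [ys[0]]
    column = list(ys)
    while len(column) > 1:
        column = [b - a for a, b in zip(column, column[1:])]
        leading.append(column[0])
    return leading

def newton_forward(xs, ys, x):
    """Newton forward interpolation for equally spaced xs (assumed formula, see lead-in)."""
    h = xs[1] - xs[0]
    p = (x - xs[0]) / h
    total, coeff = 0.0, 1.0
    for k, diff in enumerate(forward_differences(ys)):
        total += coeff * diff
        coeff *= (p - k) / (k + 1)  # builds p(p-1)...(p-k) / (k+1)!
    return total

# Heights (ft) and horizon distances (nautical miles), rebuilt from the difference table.
xs = [100, 150, 200, 250, 300, 350, 400]
ys = [10.63, 13.03, 15.04, 16.81, 18.42, 19.90, 21.27]
print(round(newton_forward(xs, ys, 218), 2))
```

For x = 410, which lies just beyond the end of the table, a backward-difference formula started from $x_n = 400$ is the usual choice; the forward form is used here only because it matches the table as laid out.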
Gaussian Elimination
Gaussian elimination is one popular method of solving linear equations. We illustrate this technique
by means of an example.
Example Find x, y and z that satisfy the following three equations at the same time.
        x - y + 3z = 4
(1)    2x - y + 2z = 6
       3x + y - 2z = 9
Before discussing the details of Gaussian elimination, let's look at two ways to reformulate a system
of linear equations. Both ways begin by putting the equations in vector form. For the equations
above this is the following.
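\[
(2)\qquad
\begin{bmatrix} x - y + 3z \\ 2x - y + 2z \\ 3x + y - 2z \end{bmatrix}
= \begin{bmatrix} 4 \\ 6 \\ 9 \end{bmatrix}
\]
The left side we can write as the matrix of coefficients times the vector of unknowns.
\[
\begin{bmatrix} 1 & -1 & 3 \\ 2 & -1 & 2 \\ 3 & 1 & -2 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
= \begin{bmatrix} 4 \\ 6 \\ 9 \end{bmatrix}
\]
or
\[
(3)\qquad Au = b
\]
where
\[
A = \begin{bmatrix} 1 & -1 & 3 \\ 2 & -1 & 2 \\ 3 & 1 & -2 \end{bmatrix},
\qquad
u = \begin{bmatrix} x \\ y \\ z \end{bmatrix},
\qquad
b = \begin{bmatrix} 4 \\ 6 \\ 9 \end{bmatrix}
\]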
So the original equations (1) are equivalent to (3). In general the problem of solving a system of
linear equations is equivalent to solving Au = b where A is the matrix of coefficients, b is the vector
of numbers on the right side and u is the vector of unknowns.
The second reformulation of the equations starts with (2) and writes the vector on the left as the sum
of three vectors where each term contains the terms with one of the variables. We get
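\[
\begin{bmatrix} x \\ 2x \\ 3x \end{bmatrix}
+ \begin{bmatrix} -y \\ -y \\ y \end{bmatrix}
+ \begin{bmatrix} 3z \\ 2z \\ -2z \end{bmatrix}
= \begin{bmatrix} 4 \\ 6 \\ 9 \end{bmatrix}
\]
Now we factor the variables out of each of the vectors on the left to get
\[
x \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
+ y \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}
+ z \begin{bmatrix} 3 \\ 2 \\ -2 \end{bmatrix}
= \begin{bmatrix} 4 \\ 6 \\ 9 \end{bmatrix}
\]
or
\[
x v_1 + y v_2 + z v_3 = b
\]
where
\[
v_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix},
\qquad
v_2 = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix},
\qquad
v_3 = \begin{bmatrix} 3 \\ 2 \\ -2 \end{bmatrix}
\]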
So the original equations (1) are equivalent to writing b as a linear combination of v1, v2 and v3. In
general the problem of solving a system of linear equations is equivalent to writing b as a linear
combination of the vectors that are the coefficients of each of the variables.
Now let's look at solving linear equations using Gaussian elimination. We shall look at two methods
to keep track of our calculations. One is with the equations themselves. The other is by means of
another matrix which is just the coefficient matrix A and right hand side b of the equation combined.
It is called the augmented matrix. For the equations in Example 1 it is.
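\[
M = \left[\begin{array}{rrr|r} 1 & -1 & 3 & 4 \\ 2 & -1 & 2 & 6 \\ 3 & 1 & -2 & 9 \end{array}\right]
\]
Note that we draw a line separating the last column, which contains b, from the rest, which contains A.
To start out we have the original equations and the corresponding M.
\[
\begin{array}{r} x - y + 3z = 4 \\ 2x - y + 2z = 6 \\ 3x + y - 2z = 9 \end{array}
\qquad
M = \left[\begin{array}{rrr|r} 1 & -1 & 3 & 4 \\ 2 & -1 & 2 & 6 \\ 3 & 1 & -2 & 9 \end{array}\right]
\]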
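The elimination steps that operate on M can be sketched in code. This is a minimal illustration, not the notes' own procedure: it assumes a square system with a unique solution, adds partial pivoting (a common safeguard that the text does not discuss), and the function name is made up here.

```python
def gaussian_elimination(aug):
    """Solve a linear system from its augmented matrix [A | b] (list of rows)."""
    m = [[float(v) for v in row] for row in aug]
    n = len(m)
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

M = [[1, -1, 3, 4],
     [2, -1, 2, 6],
     [3, 1, -2, 9]]
print(gaussian_elimination(M))
```

By hand, subtracting multiples of the first row gives y - 4z = -2 and 4y - 11z = -3, from which z = 1, y = 2 and x = 3; the code should agree up to rounding.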
Trapezoidal Rule/Formula
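We approximate the function f(x) in the interval $[x_i, x_{i+1}]$ by a straight line joining the points $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$. For convenience let us consider the integral in the first interval $[x_0, x_1]$, with $h = x_1 - x_0$. The line joining $(x_0, y_0)$ and $(x_1, y_1)$ may be written from Lagrange's formula as
\[
y(x) = \frac{x - x_1}{x_0 - x_1}\, y_0 + \frac{x - x_0}{x_1 - x_0}\, y_1
\]
Now,
\[
\begin{aligned}
\int_{x_0}^{x_1} f(x)\,dx
&\approx \int_{x_0}^{x_1} \left[ \frac{x - x_1}{x_0 - x_1}\, y_0 + \frac{x - x_0}{x_1 - x_0}\, y_1 \right] dx \\
&= \frac{1}{2h}\Big[ -(x - x_1)^2\, y_0 + (x - x_0)^2\, y_1 \Big]_{x = x_0}^{x = x_1} \\
&= \frac{h}{2}\,(y_0 + y_1)
\end{aligned}
\]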
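Adding over all the intervals, the Trapezoidal formula/rule may be written as
\[
\int_a^b f(x)\,dx = \int_{x_0}^{x_n} f(x)\,dx \approx \frac{h}{2}\,\big[\, y_0 + y_n + 2\,(y_1 + y_2 + \dots + y_{n-1}) \,\big]
\]
For Simpson's 1/3rd rule, f(x) is approximated over the two intervals $[x_0, x_2]$ by the quadratic through $(x_0, y_0)$, $(x_1, y_1)$ and $(x_2, y_2)$, which Lagrange's formula gives as
\[
y(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}\, y_0
+ \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)}\, y_1
+ \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}\, y_2
\]
The coefficients of $y_0$, $y_1$ and $y_2$ integrate to
\[
\int_{x_0}^{x_2} \frac{(x - x_1)(x - x_2)}{(-h)(-2h)}\,dx
= \frac{1}{2h^2}\left[ \frac{(x - x_1)(x - x_2)^2}{2} - \frac{(x - x_2)^3}{6} \right]_{x_0}^{x_2}
= \frac{h}{3}
\]
\[
\int_{x_0}^{x_2} \frac{(x - x_0)(x - x_2)}{h\,(-h)}\,dx
= -\frac{1}{h^2}\left[ \frac{(x - x_0)(x - x_2)^2}{2} - \frac{(x - x_2)^3}{6} \right]_{x_0}^{x_2}
= \frac{4h}{3}
\]
\[
\int_{x_0}^{x_2} \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}\,dx
= \frac{1}{2h^2}\left[ \frac{(x - x_0)(x - x_1)^2}{2} - \frac{(x - x_1)^3}{6} \right]_{x_0}^{x_2}
= \frac{h}{3}
\]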
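Hence we get
\[
\int_{x_0}^{x_2} f(x)\,dx \approx \int_{x_0}^{x_2} y(x)\,dx
= \frac{h}{3}\, y_0 + \frac{4h}{3}\, y_1 + \frac{h}{3}\, y_2
= \frac{h}{3}\,(y_0 + 4y_1 + y_2)
\]
Applying this formula over the next two intervals, then the next two, and so on for $n/2$ times, and adding, we get
\[
\begin{aligned}
\int_a^b f(x)\,dx = \int_{x_0}^{x_n} f(x)\,dx
&= \int_{x_0}^{x_2} f(x)\,dx + \int_{x_2}^{x_4} f(x)\,dx + \dots + \int_{x_{n-2}}^{x_n} f(x)\,dx \\
&\approx \frac{h}{3}\,\big[ (y_0 + 4y_1 + y_2) + (y_2 + 4y_3 + y_4) + \dots + (y_{n-2} + 4y_{n-1} + y_n) \big] \\
&= \frac{h}{3}\,\big[ y_0 + y_n + 4\,(y_1 + y_3 + \dots + y_{n-1}) + 2\,(y_2 + y_4 + \dots + y_{n-2}) \big]
\end{aligned}
\]
Obviously n should be chosen as a multiple of 2, i.e. an even number, for applying this formula.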
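Example
\[
\begin{aligned}
I &\approx \frac{h}{2}\,\big[ y_0 + y_5 + 2\,(y_1 + y_2 + y_3 + y_4) \big] \\
&= \frac{0.2}{2}\,\big[ 1.0 + 0.70711 + 2\,(0.98058 + 0.92848 + 0.85749 + 0.78087) \big] \\
&= 0.1\,\big[ 1.70711 + 2 \times 3.54742 \big] \\
&= 0.88020
\end{aligned}
\]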
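Example
Evaluate the integral $I = \displaystyle\int_0^{0.8} \frac{dx}{\sqrt{1 + x}}$ by Simpson's 1/3rd rule, dividing the interval [0, 0.8] into 4 equal subintervals.
\[
\begin{aligned}
I = \int_0^{0.8} y\,dx
&\approx \frac{h}{3}\,\big[ (y_0 + 4y_1 + y_2) + (y_2 + 4y_3 + y_4) \big] \\
&= \frac{h}{3}\,\big[ y_0 + y_4 + 4\,(y_1 + y_3) + 2\,y_2 \big] \\
&= \frac{0.2}{3}\,\big[ 1.0 + 0.74536 + 4\,(0.91287 + 0.79057) + 2 \times 0.84515 \big] \\
&= \frac{0.2}{3}\,\big[ 1.74536 + 4 \times 1.70344 + 1.69030 \big] \\
&= 0.68329
\end{aligned}
\]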
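Both composite rules are short enough to sketch in code. The function names are illustrative. The Simpson call uses the integrand $1/\sqrt{1+x}$, which is what the tabulated values in this example correspond to; the trapezoidal call assumes the integrand $1/\sqrt{1+x^2}$ on [0, 1] with h = 0.2, an inference from the values 1.0, 0.98058, ..., 0.70711 in the preceding example, whose problem statement is not given.

```python
import math

def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    inner = sum(f(a + i * h) for i in range(1, n))
    return h / 2 * (f(a) + f(b) + 2 * inner)

def simpson_one_third(f, a, b, n):
    """Composite Simpson's 1/3 rule; n must be even."""
    if n % 2:
        raise ValueError("n must be an even number")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))    # y1, y3, ...
    even = sum(f(a + i * h) for i in range(2, n, 2))   # y2, y4, ...
    return h / 3 * (f(a) + f(b) + 4 * odd + 2 * even)

print(simpson_one_third(lambda x: 1 / math.sqrt(1 + x), 0.0, 0.8, 4))
# Assumed integrand for the trapezoidal example (inferred from its tabulated values).
print(trapezoidal(lambda x: 1 / math.sqrt(1 + x * x), 0.0, 1.0, 5))
```

For comparison, the exact value of the Simpson example is $2(\sqrt{1.8} - 1) \approx 0.68328$.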
Example
1) State the appropriate interpolation formula which is used to calculate the value of exp(1.75) from
the following data and hence evaluate it from the data.
Solution
The Difference table is under
3. Integration
Revision of Integration (Indefinite Integration)
When a function f(x) is known we can differentiate it to obtain its derivative df/dx. The reverse process is to obtain the function f(x) from knowledge of its derivative. This process is called integration and it has numerous applications in all areas of science. For example, if $y = x^2$ then
\[
\frac{dy}{dx} = 2x
\]
Integration reverses this process and we say that the integral of 2x is $x^2$. Schematically we can regard the process of integration in the following way:
\[
x^2 \;\xrightarrow{\ \text{Differentiate}\ }\; 2x,
\qquad
2x \;\xrightarrow{\ \text{Integrate}\ }\; x^2
\]
Example
(a) State the derivative of $x^3$
(b) Hence find the indefinite integral of $3x^2$
Solution
(a) From our knowledge of differentiation, the derivative of $x^3$ is $3x^2$.
(b) Indefinite integration reverses the process of differentiation, and so we write
\[
\int 3x^2\,dx = x^3 + c
\]
We always include the additional constant of integration when finding indefinite integrals. Note that our answer can be checked by differentiating $x^3 + c$ to obtain $3x^2$.
More generally, we have the following relationship between derivatives and indefinite integrals: if
\[
\frac{d}{dx} F(x) = f(x) \quad\text{then}\quad \int f(x)\,dx = F(x) + c
\]
In the expression ∫ f(x) dx, the function f(x) is referred to as the integrand. When we have calculated ∫
f(x) dx, we say f(x) has been integrated with respect to x to yield F(x)+c.
In what follows we are going to discuss some techniques which can be used to evaluate many types of
integrals.
Integration by parts
The product rule for differentiation states that
\[
\frac{d}{dx}\big[ f(x)\,g(x) \big] = f(x)\,\frac{dg(x)}{dx} + g(x)\,\frac{df(x)}{dx}
\]
or equivalently
\[
f(x)\,g'(x) = \frac{d}{dx}\big[ f(x)\,g(x) \big] - g(x)\,f'(x)
\]
where the dash means differentiation of the function with respect to its argument. Integrating both sides of the previous equation gives us
\[
\int f(x)\,g'(x)\,dx = \int \frac{d}{dx}\big[ f(x)\,g(x) \big]\,dx - \int g(x)\,f'(x)\,dx
\]
The first integral on the right side is simply f(x)g(x) + C, because integration reverses differentiation. Since another constant of integration results from the second integral, it is unnecessary to include C in the formula; that is
\[
\int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int g(x)\,f'(x)\,dx
\]
If we let u = f(x) and v = g(x), so that du = f′(x)dx and dv = g′(x)dx, then the preceding formula may be written (integration by parts)
\[
\int u\,dv = uv - \int v\,du
\]
Example
Find $\displaystyle\int x e^{2x}\,dx$
Solution
There are four possible choices for dv, namely $dx$, $x\,dx$, $e^{2x}dx$ or $xe^{2x}dx$. If we let $dv = e^{2x}dx$, then the remaining part of the integrand is u; that is, u = x. To find v we integrate dv to obtain $v = e^{2x}/2$. Note that a constant of integration is not added at this stage of the solution. Since u = x we see that du = dx. For ease of reference it is convenient to display these expressions as follows:
\[
\begin{array}{ll}
u = x & dv = e^{2x}\,dx \\
du = dx & v = \tfrac{1}{2} e^{2x}
\end{array}
\]
\[
\int x e^{2x}\,dx = x \cdot \tfrac{1}{2} e^{2x} - \int \tfrac{1}{2} e^{2x}\,dx
\]
The integral on the right hand side may be found using the integrals of exponential functions. This gives us
\[
\int x e^{2x}\,dx = \tfrac{1}{2}\, x e^{2x} - \tfrac{1}{4}\, e^{2x} + C
\]
It takes considerable practice to become proficient in making a suitable choice for dv. To illustrate, if we had chosen $dv = x\,dx$, then it would have been necessary to let $u = e^{2x}$, giving us
\[
\begin{array}{ll}
u = e^{2x} & dv = x\,dx \\
du = 2e^{2x}\,dx & v = \tfrac{1}{2} x^2
\end{array}
\]
\[
\int x e^{2x}\,dx = \tfrac{1}{2}\, x^2 e^{2x} - \int x^2 e^{2x}\,dx
\]
Since the exponent associated with x has increased, the integral on the right is more complicated than the given integral. This indicates a poor choice of dv.
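A result obtained by parts can be sanity-checked numerically: the definite integral of the integrand over an interval should match the difference of the antiderivative at the end points. The sketch below is an illustration only; the interval [0, 1], the helper names and the use of Simpson's rule for the numerical side are all choices made here.

```python
import math

def antiderivative(x):
    """Result of integrating x*exp(2x) by parts, with the constant C taken as 0."""
    return 0.5 * x * math.exp(2 * x) - 0.25 * math.exp(2 * x)

def simpson(f, a, b, n=200):
    """Composite Simpson's 1/3 rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3

numeric = simpson(lambda x: x * math.exp(2 * x), 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)
print(numeric, exact)
```

The two printed values should agree to several decimal places; a large gap would point to a slip in the choice of u and dv or in the algebra.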
Trigonometric integrals
In many cases, evaluating an integral involves trigonometric functions. For example, integrals of the type $\int \sin^n x\,dx$ require a new method of solution. If n is an odd positive integer, we begin by writing
\[
\int \sin^n x\,dx = \int \sin^{n-1} x\,\sin x\,dx
\]
Since the integer n - 1 is even, we may then use the fact that $\sin^2 x = 1 - \cos^2 x$ to obtain a form which is easy to integrate.
Example
Evaluate $\int \sin^5 x\,dx$.
Solution
As discussed earlier we have
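\[
\int \sin^5 x\,dx = \int \sin^4 x\,\sin x\,dx = \int (\sin^2 x)^2 \sin x\,dx = \int (1 - \cos^2 x)^2 \sin x\,dx
\]
We next employ the method of substitution, letting $u = \cos x$ and $du = -\sin x\,dx$. Thus
\[
\begin{aligned}
\int \sin^5 x\,dx &= -\int (1 - u^2)^2\,du = -\int (1 - 2u^2 + u^4)\,du \\
&= -u + \tfrac{2}{3} u^3 - \tfrac{1}{5} u^5 + C
= -\cos x + \tfrac{2}{3} \cos^3 x - \tfrac{1}{5} \cos^5 x + C
\end{aligned}
\]
A similar technique can be employed for odd powers of cos x. Specifically, we write
\[
\int \cos^n x\,dx = \int \cos^{n-1} x\,\cos x\,dx
\]
and use the fact that $\cos^2 x = 1 - \sin^2 x$ in order to obtain an integrable form. If the integrand is $\sin^n x$ or $\cos^n x$ and n is even, then the half-angle formulas
\[
\sin^2 x = \frac{1 - \cos 2x}{2} \quad\text{or}\quad \cos^2 x = \frac{1 + \cos 2x}{2}
\]
may be used to simplify the integrand.
Example
Evaluate $\int \sin^4 x\,dx$
Solution
\[
\int \sin^4 x\,dx = \int \left( \frac{1 - \cos 2x}{2} \right)^2 dx = \frac{1}{4} \int \left( 1 - 2\cos 2x + \cos^2 2x \right) dx
\]
Using the half-angle formula again,
\[
\cos^2 2x = \tfrac{1}{2}\,(1 + \cos 4x) = \tfrac{1}{2} + \tfrac{1}{2} \cos 4x
\]
so that
\[
\int \sin^4 x\,dx = \frac{1}{4} \int \left( \frac{3}{2} - 2\cos 2x + \frac{1}{2} \cos 4x \right) dx
= \frac{3}{8}\, x - \frac{1}{4} \sin 2x + \frac{1}{32} \sin 4x + C
\]
Integrals of the form $\int \sin^m x\,\cos^n x\,dx$, where m and n are positive integers, may be found by using variations of the previous techniques. If m and n are both even, then half-angle formulas should be employed first. If n is odd, we can write
\[
\int \sin^m x\,\cos^n x\,dx = \int \sin^m x\,\cos^{n-1} x\,\cos x\,dx
\]
and express $\cos^{n-1} x$ in terms of sin x by using the identity $\cos^2 x = 1 - \sin^2 x$. The substitution u = sin x then leads to an integrand which can be handled easily. A similar technique can be used if m is odd.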
Trigonometric substitutions
If an integrand contains the expression $\sqrt{a^2 - x^2}$ where a > 0, then the trigonometric substitution $x = a \sin\theta$ leads to
\[
\sqrt{a^2 - x^2} = \sqrt{a^2 - a^2 \sin^2\theta} = \sqrt{a^2 (1 - \sin^2\theta)} = \sqrt{a^2 \cos^2\theta} = a\,|\cos\theta|
\]
When making this substitution or the other trigonometric substitutions in the next examples, we shall assume that θ is in the range of the corresponding inverse trigonometric function. Thus, for the sine substitution above, $-\pi/2 \le \theta \le \pi/2$. Consequently $\cos\theta \ge 0$ and $\sqrt{a^2 - x^2} = a \cos\theta$. Of course, if $\sqrt{a^2 - x^2}$ occurs in a denominator we make the further restriction $-\pi/2 < \theta < \pi/2$.
Example
Evaluate
\[
\int \frac{1}{x^2 \sqrt{a^2 - x^2}}\,dx, \qquad a > 0
\]
Solution
Let $x = a \sin\theta$, where $-\pi/2 < \theta < \pi/2$. It follows that
\[
\sqrt{a^2 - x^2} = \sqrt{a^2 - a^2 \sin^2\theta} = \sqrt{a^2 (1 - \sin^2\theta)} = a \cos\theta
\]
Since $x = a \sin\theta$, we have $dx = a \cos\theta\,d\theta$. Substituting in the given integral,
\[
\int \frac{1}{x^2 \sqrt{a^2 - x^2}}\,dx
= \int \frac{1}{(a^2 \sin^2\theta)\,a \cos\theta}\; a \cos\theta\,d\theta
= \frac{1}{a^2} \int \frac{1}{\sin^2\theta}\,d\theta
= -\frac{1}{a^2} \cot\theta + C
\]
It is now necessary to return to the original variable of integration x. A simple method of doing so is to use a geometrical approach. If $0 < \theta < \pi/2$, then since $\sin\theta = x/a$, we may interpret θ as an acute angle of a right triangle having opposite side and hypotenuse of lengths x and a, respectively. The length $\sqrt{a^2 - x^2}$ of the adjacent side is calculated by means of the Pythagorean Theorem.
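\[
\cot\theta = \frac{\sqrt{a^2 - x^2}}{x}
\]
It can be shown that this formula is also valid if $-\pi/2 < \theta < 0$. Thus the above figure can be used whether θ is positive or negative. Substituting the new form of cot θ in the result obtained for the integral, we have
\[
\int \frac{1}{x^2 \sqrt{a^2 - x^2}}\,dx = -\frac{1}{a^2} \cdot \frac{\sqrt{a^2 - x^2}}{x} + C = -\frac{\sqrt{a^2 - x^2}}{a^2 x} + C
\]
If an integrand contains $\sqrt{a^2 + x^2}$, where a > 0, then the substitution $x = a \tan\theta$ will eliminate the radical sign. When using this substitution it will be assumed that θ is in the range of the inverse tangent function; that is, $-\pi/2 < \theta < \pi/2$. After making this substitution and evaluating the resulting trigonometric integral, it is necessary to return to the original variable, x. To do so we use
\[
\tan\theta = \frac{x}{a} \quad\text{and}\quad \sec\theta = \frac{\sqrt{a^2 + x^2}}{a}
\]
For integrands containing $\sqrt{x^2 - a^2}$ we substitute $x = a \sec\theta$, where θ is chosen in the range of the inverse secant function; that is, either $0 \le \theta < \pi/2$ or $\pi \le \theta < 3\pi/2$. After evaluating the integral, we have to return to the original variable and we use
\[
\sec\theta = \frac{x}{a} \quad\text{and}\quad \tan\theta = \frac{\sqrt{x^2 - a^2}}{a}
\]
Example
Evaluate $\displaystyle\int \frac{\sqrt{x^2 - 9}}{x}\,dx$
Solution
Let us substitute as follows:
\[
x = 3 \sec\theta, \qquad dx = 3 \sec\theta \tan\theta\,d\theta, \qquad \sqrt{x^2 - 9} = 3 \tan\theta
\]
Consequently
\[
\begin{aligned}
\int \frac{\sqrt{x^2 - 9}}{x}\,dx
&= \int \frac{3 \tan\theta}{3 \sec\theta}\; 3 \sec\theta \tan\theta\,d\theta
= 3 \int \tan^2\theta\,d\theta
= 3 \int (\sec^2\theta - 1)\,d\theta \\
&= 3 \tan\theta - 3\theta + C
\end{aligned}
\]
Since $\sec\theta = x/3$, we may return to the original variable and write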
\[
\int \frac{\sqrt{x^2 - 9}}{x}\,dx = 3 \cdot \frac{\sqrt{x^2 - 9}}{3} - 3 \sec^{-1}\frac{x}{3} + C = \sqrt{x^2 - 9} - 3 \sec^{-1}\frac{x}{3} + C
\]
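A function f of two variables is said to be homogeneous of degree n if
\[
f(tx, ty) = t^n f(x, y) \tag{4.1}
\]
for every t > 0 such that (tx, ty) is in the domain of f. For example, if
\[
f(x, y) = 2x^4 - x^2 y^2 + 5xy^3
\]
then f is homogeneous of degree 4, since
\[
f(tx, ty) = t^4 \left( 2x^4 - x^2 y^2 + 5xy^3 \right) = t^4 f(x, y)
\]
Similarly, if
\[
f(x, y) = \frac{1}{x^2 + y^2}\, e^{x/y}
\]
then
\[
f(tx, ty) = \frac{1}{t^2 x^2 + t^2 y^2}\, e^{tx/ty} = t^{-2} f(x, y)
\]
so f is homogeneous of degree -2.
A homogeneous differential equation is one that can be written in the form
\[
P(x, y)\,dx + Q(x, y)\,dy = 0 \tag{4.2}
\]
where P and Q are homogeneous functions of the same degree. Equations of this type can be transformed into separable equations by means of the substitutions
\[
y = xv, \quad\text{where } v = v(x) \tag{4.3}
\]
and
\[
dy = v\,dx + x\,dv
\]
Thus, substitution of xv for y in (4.2) yields
\[
P(x, xv)\,dx + Q(x, xv)\,(v\,dx + x\,dv) = 0
\]
If P and Q are homogeneous of degree n, then $P(x, xv) = x^n P(1, v)$ and $Q(x, xv) = x^n Q(1, v)$. Dividing by $x^n$ and collecting terms gives
\[
\big[ P(1, v) + v\,Q(1, v) \big]\,dx + x\,Q(1, v)\,dv = 0
\]
\[
\frac{1}{x}\,dx + \frac{Q(1, v)}{P(1, v) + v\,Q(1, v)}\,dv = 0 \tag{4.4}
\]
provided the denominators are non-zero. We have proved that if y = xv is a solution of (4.2), then v is a solution of (4.4). Conversely, if v is a solution of (4.4), then reversing our argument shows that y = vx is a solution of (4.2). It is not advisable to memorize the final form of (4.4). Instead, remember the substitution y = xv, which is used to simplify the homogeneous equation.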
Example
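Solve the differential equation
\[
(y^2 - xy)\,dx + x^2\,dy = 0.
\]
Solution
If $P(x, y) = y^2 - xy$ and $Q(x, y) = x^2$, then the functions P and Q are both homogeneous of degree 2. Hence the differential equation is homogeneous and we substitute $y = xv$, $dy = v\,dx + x\,dv$:
\[
(x^2 v^2 - x^2 v)\,dx + x^2 (v\,dx + x\,dv) = 0
\]
Dividing by $x^2$ and simplifying,
\[
v^2\,dx + x\,dv = 0
\]
\[
\frac{1}{x}\,dx + \frac{1}{v^2}\,dv = 0
\]
Integrating, and then using $v = y/x$,
\[
\ln|x| - \frac{1}{v} = C_1, \qquad \ln|x| - \frac{x}{y} = C_1, \qquad y = \frac{x}{\ln|x| - C_1}
\]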
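A solution of this kind can be checked by substituting it back into the differential equation. The sketch below does so numerically, assuming the equation $(y^2 - xy)\,dx + x^2\,dy = 0$ and the solution $y = x / (\ln|x| - C_1)$; the function names, the sample points and the finite-difference step are illustrative choices.

```python
import math

def y_solution(x, c1):
    """Candidate solution y = x / (ln|x| - C1) of the homogeneous equation."""
    return x / (math.log(abs(x)) - c1)

def residual(x, c1, h=1e-6):
    """(y^2 - x*y) + x^2 * dy/dx, which should vanish along a solution."""
    dy_dx = (y_solution(x + h, c1) - y_solution(x - h, c1)) / (2 * h)  # central difference
    y = y_solution(x, c1)
    return (y * y - x * y) + x * x * dy_dx

for x, c1 in [(2.0, -1.0), (5.0, 0.5), (-3.0, -2.0)]:
    print(x, c1, residual(x, c1))
```

The residuals are not exactly zero because the derivative is approximated, but they should be many orders of magnitude smaller than the individual terms.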
A first-order linear differential equation has the form
\[
y' + P(x)\,y = Q(x) \tag{4.5}
\]
where P and Q are continuous functions. If Q(x) = 0 for all x, then equation (4.5) is separable and we may write
\[
\frac{y'}{y} = -P(x)
\]
\[
\ln|y| = -\int P(x)\,dx + \ln|C| .
\]
We have expressed the constant of integration as ln|C| in order to change the form of the last equation as follows:
\[
\begin{aligned}
\ln|y| - \ln|C| &= -\int P(x)\,dx \\
\ln\left| \frac{y}{C} \right| &= -\int P(x)\,dx \\
\frac{y}{C} &= e^{-\int P(x)\,dx} \\
y\,e^{\int P(x)\,dx} &= C
\end{aligned}
\]
We next observe that
\[
\frac{d}{dx}\left[ y\,e^{\int P(x)\,dx} \right]
= y'\,e^{\int P(x)\,dx} + P(x)\,y\,e^{\int P(x)\,dx}
= e^{\int P(x)\,dx}\,\big[ y' + y\,P(x) \big].
\]
Consequently, if we multiply both sides of (4.5) by $e^{\int P(x)\,dx}$, then the resulting equation may be written as
\[
\frac{d}{dx}\left[ y\,e^{\int P(x)\,dx} \right] = Q(x)\,e^{\int P(x)\,dx} .
\]
Integrating both sides gives
\[
y\,e^{\int P(x)\,dx} = \int Q(x)\,e^{\int P(x)\,dx}\,dx + D . \tag{4.6}
\]
Solving this equation for y leads to an explicit solution. The expression $e^{\int P(x)\,dx}$ is called an integrating factor of (4.5). We have shown that multiplication of both sides of (4.5) by this expression leads to an equation which has the solution (4.6).
Example
Solve the differential equation
\[
x^2 y' + 5xy + 3x^5 = 0
\]
where x ≠ 0.
Solution
In order to find the integrating factor we begin by expressing the given differential equation in the "standardized" form (4.5), where the coefficient of y′ is 1. Thus, dividing both sides by $x^2$ we obtain
\[
y' + \frac{5}{x}\, y = -3x^3
\]
which has the form (4.5) with $P(x) = 5/x$ and $Q(x) = -3x^3$. From the preceding discussion, the required integrating factor is
\[
e^{\int P(x)\,dx} = e^{5 \ln|x|} = e^{\ln|x|^5} = |x|^5 .
\]
If x > 0 then $|x|^5 = x^5$, whereas if x < 0 then $|x|^5 = -x^5$. In either case, multiplying both sides of the standardized form by $|x|^5$ gives us
\[
x^5 y' + 5x^4 y = -3x^8 \quad\text{or}\quad \frac{d}{dx}\left( x^5 y \right) = -3x^8 .
\]
Thus a solution is
\[
x^5 y = -\frac{x^9}{3} + C \quad\text{or}\quad y = -\frac{x^4}{3} + \frac{C}{x^5} .
\]
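Formula (4.6) can also be evaluated numerically when the integrals are awkward to do by hand. The sketch below is an illustration under stated assumptions: it takes an initial condition $y(x_0) = y_0$ (not given in the example), normalises the integrating factor so that it equals 1 at $x_0$, and uses Simpson's rule for both integrals. The function names are made up for this example.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's 1/3 rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3

def solve_linear_first_order(P, Q, x0, y0, x):
    """y(x) for y' + P(x) y = Q(x) with y(x0) = y0, using the integrating factor."""
    def mu(s):
        # Integrating factor exp(integral of P from x0 to s), so that mu(x0) = 1.
        return math.exp(simpson(P, x0, s))
    integral = simpson(lambda s: Q(s) * mu(s), x0, x)
    return (y0 + integral) / mu(x)

# Example from the text in standardized form: y' + (5/x) y = -3 x^3, with assumed y(1) = 1.
numeric = solve_linear_first_order(lambda s: 5 / s, lambda s: -3 * s**3, 1.0, 1.0, 2.0)
C = 1.0 + 1.0 / 3.0                      # from y(1) = 1 in y = -x^4/3 + C/x^5
closed_form = -2.0**4 / 3 + C / 2.0**5
print(numeric, closed_form)
```

With the assumed condition y(1) = 1, the closed form gives C = 4/3, and the two printed values should agree closely.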
A generalization of (4.5) is the Bernoulli equation
\[
y' + P(x)\,y = Q(x)\,y^n , \tag{4.7}
\]
where n ≠ 0 (and n ≠ 1, so that the substitution below is defined). Evidently y = 0 is a solution. If y ≠ 0 we may divide both sides by $y^n$, obtaining
\[
y^{-n} y' + P(x)\,y^{1-n} = Q(x) .
\]
If we let $w = y^{1-n}$, then $w' = (1 - n)\,y^{-n} y'$ and hence
\[
y^{-n} y' = \frac{1}{1 - n}\, w' .
\]
\[
\frac{1}{1 - n}\, w' + P(x)\,w = Q(x) .
\]
This first-order linear differential equation may be solved for w using the integrating factor technique. After w has been found, the solution of (4.7) is given by $y^{1-n} = w$ (and y = 0).
Example
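Solve the differential equation
\[
y' + \frac{2y}{x} = x^6 y^3 .
\]
Solution
The equation has the Bernoulli form (4.7) with n = 3. If, as in the previous discussion, we multiply both sides by $y^{-3}$ and substitute $w = y^{1-n} = y^{-2}$, we obtain
\[
\begin{aligned}
y^{-3} y' + \frac{2}{x y^2} &= x^6 \\
-\frac{1}{2}\, w' + \frac{2w}{x} &= x^6 \\
w' - \frac{4w}{x} &= -2x^6
\end{aligned}
\]
The integrating factor is
\[
e^{\int (-4/x)\,dx} = e^{-4 \ln x} = e^{\ln x^{-4}} = x^{-4}
\]
supposing that x > 0 (in fact it can be shown that considering x < 0, eventually we would obtain the same result). Now we can write
\[
x^{-4} w' - 4x^{-5} w = -2x^2 .
\]
Consequently
\[
x^{-4} w = -\frac{2x^3}{3} + C \quad\text{or}\quad w = -\frac{2x^7}{3} + C x^4 .
\]
\[
y^{-2} = -\frac{2x^7}{3} + C x^4 \quad\text{or}\quad \left( -\frac{2x^7}{3} + C x^4 \right) y^2 = 1
\]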
Example
Solve the differential equations
xy x y e x
y y y 2 e x
If $f_1, f_2, \dots, f_n$ and k are functions of one variable which have the same domain, then an equation of the form
\[
y^{(n)} + f_1(x)\,y^{(n-1)} + \dots + f_{n-1}(x)\,y' + f_n(x)\,y = k(x) \tag{5.1}
\]
is called a linear differential equation of order n. If k(x) = 0 for all x, the equation is said to be homogeneous. Notice that this meaning of the word homogeneous is different from the one we met earlier. If k(x) ≠ 0 for some x, then equation (5.1) is said to be non-homogeneous. We shall restrict our work to second-order equations in which $f_1$ and $f_2$ are constant functions. First we are going to discuss the homogeneous case.
The general second-order homogeneous differential equation with constant coefficients has the form
\[
a y'' + b y' + c y = 0 \tag{5.2}
\]
where a, b and c are constants. Before attempting to find particular solutions let us establish the following result.
Theorem (the superposition principle): If y = f(x) and y = g(x) are solutions of the differential equation ay″+by′+cy=0, then
\[
y = C_1 f(x) + C_2 g(x) \tag{5.3}
\]
is a solution for all real numbers $C_1$ and $C_2$.
Proof: By hypothesis
\[
\begin{aligned}
a f''(x) + b f'(x) + c f(x) &= 0 \\
a g''(x) + b g'(x) + c g(x) &= 0
\end{aligned}
\]
If we multiply the first of these equations by $C_1$, the second by $C_2$, and add, the result is
\[
a \big[ C_1 f''(x) + C_2 g''(x) \big] + b \big[ C_1 f'(x) + C_2 g'(x) \big] + c \big[ C_1 f(x) + C_2 g(x) \big] = 0
\]
Thus $C_1 f(x) + C_2 g(x)$ is a solution.
It can be shown that if the solutions f and g in the above Theorem have the property that f(x) ≠ Cg(x) for all real numbers C, and if g(x) is not identically 0, then $y = C_1 f(x) + C_2 g(x)$ is a general solution of ay″+by′+cy=0. Thus, to determine the general solution it is sufficient to find two such functions f and g and employ Eq. (5.3).
In our search for a solution of (5.2) we shall use $y = A e^{mx}$ as a trial solution, where A is an arbitrary constant and we need to find m.
Then
\[
\frac{dy}{dx} = A m e^{mx} ; \qquad \frac{d^2 y}{dx^2} = A m^2 e^{mx}
\]
Substituting into (5.2) gives
\[
\left( a m^2 + b m + c \right) A e^{mx} = 0
\]
\[
a m^2 + b m + c = 0 \tag{5.4}
\]
Equation (5.4) is called the auxiliary equation of (5.2). It can be obtained from this differential equation by replacing y″ by $m^2$, y′ by m and y by 1. In simple cases, the roots of the auxiliary equation can be found by factoring. If the factorisation is not obvious, then applying the quadratic formula, we see that the roots of the auxiliary equation are given by
\[
m = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \tag{5.5}
\]
According to the sign of $b^2 - 4ac$ in (5.5), there are three different cases.
Theorem 1: If the roots $m_1$ and $m_2$ of the auxiliary equation are real and unequal, then the general solution of ay″+by′+cy=0 is
\[
y = C_1 e^{m_1 x} + C_2 e^{m_2 x} \tag{5.6}
\]
Example
Solve the differential equation
y″+10y′+16y=0
Solution
The auxiliary equation is
\[
m^2 + 10m + 16 = 0 \quad\text{or}\quad (m + 8)(m + 2) = 0
\]
which means that $m_1 = -8$ and $m_2 = -2$. Since the roots of the auxiliary equation are real and unequal, it follows from Theorem 1 that the general solution is
\[
y = C_1 e^{-8x} + C_2 e^{-2x}
\]
with $C_1$ and $C_2$ two arbitrary constants. To find the two constants we need two boundary (or initial) conditions. For example, we may give two points the solution passes through, or one point and the gradient for some value of x, or two gradients. In this case, if y = 0 and dy/dx = 1 when x = 0, using the general solution given above we have
\[
\begin{aligned}
C_1 + C_2 &= 0 \\
-8C_1 - 2C_2 &= 1
\end{aligned}
\]
which leads to $C_1 = -1/6$ and $C_2 = 1/6$. So, the final form of the solution is
\[
y = \frac{1}{6} \left( -e^{-8x} + e^{-2x} \right)
\]
Theorem 2: If the auxiliary equation has a double root m, then the general solution of the equation ay″+by′+cy=0 is
\[
y = C_1 e^{mx} + C_2 x e^{mx} \tag{5.7}
\]
Proof: Using (5.5) with $b^2 - 4ac = 0$, we obtain m = -b/2a, or 2am + b = 0. Since m satisfies the auxiliary equation, $y = e^{mx}$ is a solution of the differential equation. According to the remark following the proof of the superposition principle, it is sufficient to show that $y = x e^{mx}$ is also a solution. Substitution of $x e^{mx}$ for y in ay″+by′+cy gives us
\[
\begin{aligned}
a \left( 2m e^{mx} + m^2 x e^{mx} \right) + b \left( m x e^{mx} + e^{mx} \right) + c x e^{mx}
&= \left( a m^2 + b m + c \right) x e^{mx} + (2am + b)\, e^{mx} \\
&= 0 \cdot x e^{mx} + 0 \cdot e^{mx} = 0
\end{aligned}
\]
which is what we wished to show.
Example
Solve the differential equation
y″-6y′+9y=0
Solution
The auxiliary equation $m^2 - 6m + 9 = 0$, or equivalently $(m - 3)^2 = 0$, has a double root 3. Hence by Theorem 2, the general solution is
\[
y = C_1 e^{3x} + C_2 x e^{3x} = e^{3x} \left( C_1 + C_2 x \right)
\]
Example:
Solve the differential equation:
y 5 y 6 y 0
Solution
The remaining case when deciding the nature of the solution of a second-order homogeneous differential equation is $b^2 - 4ac < 0$. In this case the auxiliary equation is said to have complex roots.
Complex numbers may be represented by expressions of the form a+ib, where a and b are real numbers, and i is a symbol which may be manipulated in the same manner as a real number, but has the additional property that $i^2 = -1$.
Two complex numbers a+ib and c+id are said to be equal, and we write a+ib=c+id, if and only if a=c and b=d. Operations of addition, subtraction, multiplication and division are defined just as in the case of real numbers; all letters denote real numbers, with the additional stipulation that whenever $i^2$ occurs, it may be replaced by -1. For example, the formulas for addition and multiplication of two complex numbers a+ib and c+id are
\[
\begin{aligned}
(a + ib) + (c + id) &= (a + c) + i\,(b + d) \\
(a + ib)(c + id) &= (ac - bd) + i\,(ad + bc)
\end{aligned}
\]
We may regard the real numbers as a subset of the complex numbers by identifying the real number a with the complex number a+i0. A complex number of the form 0+ib is called an imaginary number.
Complex numbers are often required for solving equations of the form f(x)=0, where f(x) is a polynomial. For example, if only real numbers are allowed, then the equation $x^2 = -4$ has no solutions. However, if complex numbers are allowed, then the equation has a solution 2i, since
\[
(2i)^2 = 2^2 i^2 = 4\,(-1) = -4
\]
Since $i^2 = -1$, we sometimes use the symbol $\sqrt{-1}$ in place of i and write
\[
\sqrt{-13} = i \sqrt{13}, \qquad 2 + \sqrt{-25} = 2 + i \sqrt{25} = 2 + 5i
\]
and so on.
A quadratic equation $ax^2 + bx + c = 0$, where a, b and c are real numbers and a ≠ 0, has roots given by the quadratic formula
\[
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} .
\]
If $b^2 - 4ac < 0$, then the roots are complex numbers. To illustrate, if we apply the quadratic formula to the equation $x^2 - 4x + 13 = 0$ we obtain
\[
x = \frac{4 \pm \sqrt{16 - 52}}{2} = \frac{4 \pm \sqrt{-36}}{2} = \frac{4 \pm 6i}{2} = 2 \pm 3i
\]
Thus the equation has two complex roots 2+3i and 2-3i.
The complex number a-bi is called the conjugate of the complex number a+bi. We see from the quadratic formula that if a quadratic equation with real coefficients has complex roots, then they must be conjugate complex numbers.
It follows from the previous discussion that if the auxiliary equation $am^2 + bm + c = 0$ has complex roots, then they are of the form
\[
z_1 = s + it \quad\text{and}\quad z_2 = s - it
\]
where s and t are real numbers. We may anticipate, from Theorem 1, that the general solution of the differential equation ay″+by′+cy=0 is
\[
y = C_1 e^{(s + it)x} + C_2 e^{(s - it)x} \tag{5.8}
\]
In order to handle such complex exponents it is necessary to extend some of the concepts of calculus to include functions whose domain includes complex numbers. Since a complete development is beyond the scope of our work, we shall merely outline the main ideas. The key result is Euler's formula:
\[
e^{iz} = \cos z + i \sin z
\]
It can be shown that the Laws of Exponents are true for complex numbers. In addition, formulas for derivatives developed earlier can be extended to functions of a complex variable z. One such formula is $\frac{d}{dz} e^{kz} = k e^{kz}$, where k is a complex number.
\[
a^m a^n = a^{m+n}, \qquad (a^m)^n = a^{mn}, \qquad (ab)^m = a^m b^m, \qquad \frac{a^m}{a^n} = a^{m-n}, \qquad \left( \frac{a}{b} \right)^m = \frac{a^m}{b^m}
\]
It can be proved that the general solution of ay″+by′+cy=0, where the roots of the auxiliary equation are complex numbers s±ti, is given by Eq. (5.8). The form of this solution may be changed as follows:
\[
y = C_1 e^{sx + itx} + C_2 e^{sx - itx} = e^{sx} \left( C_1 e^{itx} + C_2 e^{-itx} \right) \tag{5.9}
\]
This can be further simplified by using Euler's formula. Specifically, we see that
\[
e^{itx} = \cos tx + i \sin tx, \qquad e^{-itx} = \cos tx - i \sin tx \tag{5.10}
\]
If we let $C_1 = C_2 = 1/2$ in (5.9) and use (5.10) we obtain the particular solution $y = e^{sx} \cos tx$ of ay″+by′+cy=0. Letting $C_1 = -i/2$ and $C_2 = i/2$ gives us the particular solution $y = e^{sx} \sin tx$. This is a partial proof of the next theorem.
Example: Evaluate
e i 1
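Theorem
If the auxiliary equation $am^2 + bm + c = 0$ has distinct complex roots s±ti, then the general solution of ay″+by′+cy=0 is
\[
y = e^{sx} \left( C_1 \cos tx + C_2 \sin tx \right) \tag{5.11}
\]
Example
Solve the differential equation
\[
y'' - 10y' + 41y = 0
\]
Solution
The roots of the auxiliary equation $m^2 - 10m + 41 = 0$ are
\[
m = \frac{10 \pm \sqrt{100 - 164}}{2} = \frac{10 \pm \sqrt{-64}}{2} = \frac{10 \pm 8i}{2} = 5 \pm 4i
\]
Hence, by the theorem, the general solution is
\[
y = e^{5x} \left( C_1 \cos 4x + C_2 \sin 4x \right)
\]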
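The three cases for ay″+by′+cy=0 can be told apart mechanically from the discriminant of the auxiliary equation. The sketch below is illustrative: the function name and the case labels are made up here, and `cmath.sqrt` is used so that a negative discriminant yields complex roots instead of an error.

```python
import cmath

def auxiliary_roots(a, b, c):
    """Roots of a*m^2 + b*m + c = 0 and the matching case for a*y'' + b*y' + c*y = 0."""
    disc = b * b - 4 * a * c
    root = cmath.sqrt(disc)
    m1 = (-b + root) / (2 * a)
    m2 = (-b - root) / (2 * a)
    if disc > 0:
        case = "real and unequal"    # y = C1 e^(m1 x) + C2 e^(m2 x)
    elif disc == 0:
        case = "double root"         # y = (C1 + C2 x) e^(m x)
    else:
        case = "complex conjugate"   # y = e^(s x) (C1 cos t x + C2 sin t x)
    return m1, m2, case

for coeffs in [(1, 10, 16), (1, -6, 9), (1, -10, 41)]:
    print(coeffs, auxiliary_roots(*coeffs))
```

The exact comparison `disc == 0` is reliable for integer coefficients as in these examples; with floating-point coefficients a small tolerance would be a safer test.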
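Solution
The roots of the auxiliary equation $2m^2 - 3m + 2 = 0$ are
\[
m = \frac{3 \pm \sqrt{9 - 16}}{4} = \frac{3 \pm \sqrt{-7}}{4} = \frac{3 \pm i \sqrt{7}}{4}
\]
According to (5.11), the solution of the given differential equation can be written as
\[
y = e^{3x/4} \left( C_1 \cos \frac{\sqrt{7}}{4} x + C_2 \sin \frac{\sqrt{7}}{4} x \right)
\]
This solution contains two constants which can be determined using the boundary conditions. It is obvious that $y(0) = C_1 = 0$. In order to use the second boundary condition we have to differentiate the solution y(x) with respect to x:
\[
y' = \frac{3}{4}\, e^{3x/4} \left( C_1 \cos \frac{\sqrt{7}}{4} x + C_2 \sin \frac{\sqrt{7}}{4} x \right)
+ e^{3x/4} \left( -\frac{\sqrt{7}}{4}\, C_1 \sin \frac{\sqrt{7}}{4} x + \frac{\sqrt{7}}{4}\, C_2 \cos \frac{\sqrt{7}}{4} x \right)
\]
Taking into account that $C_1 = 0$, we obtain that
\[
y'(0) = \frac{\sqrt{7}}{4}\, C_2 = 2
\]
\[
C_2 = \frac{8}{\sqrt{7}} = \frac{8 \sqrt{7}}{7}
\]
\[
y = \frac{8 \sqrt{7}}{7}\, e^{3x/4} \sin \frac{\sqrt{7}}{4} x
\]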
y 6 y 13 y 0; y ( 0 ) 2 , y ( 0 ) 3
Consider next the Euler-Cauchy differential equation
\[ x^2 y'' + bxy' + cy = 0, \]
where b and c are constants. Although this equation does not seem to be an equation with constant
coefficients, it can be reduced to that form using a new variable
\[ x = e^t \quad \text{or} \quad t = \ln |x|. \]
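Since we changed the variable, we have to change y′ and y″ as well. To do so, we use the chain rule:
\[ \frac{dy}{dx} = \frac{dy}{dt}\frac{dt}{dx} = \frac{dy}{dt}\frac{1}{x} = \frac{dy}{dt}\, e^{-t} \]
\[ \frac{d^2y}{dx^2} = \frac{dy}{dt}\frac{d^2t}{dx^2} + \frac{d^2y}{dt^2}\left(\frac{dt}{dx}\right)^2 = -\frac{dy}{dt}\frac{1}{x^2} + \frac{d^2y}{dt^2}\frac{1}{x^2} = -e^{-2t}\frac{dy}{dt} + e^{-2t}\frac{d^2y}{dt^2} \]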
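Substituting these expressions, together with $x = e^t$, into the equation gives
\[ e^{2t}\left(e^{-2t}\frac{d^2y}{dt^2} - e^{-2t}\frac{dy}{dt}\right) + b\,e^{t} e^{-t}\frac{dy}{dt} + cy = 0 \]
\[ \frac{d^2y}{dt^2} + (b - 1)\frac{dy}{dt} + cy = 0 \]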
which now is a second order homogeneous differential equation with constant coefficients. Therefore,
we use the previous results to solve this equation.
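The auxiliary equation is
\[ m^2 + (b - 1)m + c = 0 \]
and the solution of the Euler-Cauchy differential equation can be found according to the nature of the
roots of the auxiliary equation. For distinct real roots m1 and m2,
\[ y = Ae^{m_1 t} + Be^{m_2 t} = Ae^{m_1 \ln |x|} + Be^{m_2 \ln |x|} = A|x|^{m_1} + B|x|^{m_2}, \]
and for a repeated real root m,
\[ y = (A + Bt)e^{mt} = (A + B\ln |x|)e^{m \ln |x|} = (A + B\ln |x|)\,|x|^{m}. \]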
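The reduction can be illustrated on a concrete case. The sketch below takes b = 2 and c = −6 (values chosen here for illustration, not taken from the notes), solves $m^2 + (b-1)m + c = 0$, and checks that each power $x^m$ satisfies $x^2y'' + bxy' + cy = 0$ for x > 0.

```python
import cmath

def euler_cauchy_exponents(b, c):
    """Roots of the auxiliary equation m**2 + (b - 1)*m + c = 0."""
    disc = cmath.sqrt((b - 1) ** 2 - 4 * c)
    return (-(b - 1) + disc) / 2, (-(b - 1) - disc) / 2

def residual_power(m, b, c, x):
    """x^2 y'' + b x y' + c y for y = x^m, using exact derivatives (x > 0)."""
    y = x ** m
    dy = m * x ** (m - 1)
    d2y = m * (m - 1) * x ** (m - 2)
    return x * x * d2y + b * x * dy + c * y

# Illustrative equation: x^2 y'' + 2x y' - 6y = 0.
m1, m2 = euler_cauchy_exponents(2, -6)
print(m1, m2)                                   # (2+0j) (-3+0j)
print(abs(residual_power(m1, 2, -6, 1.7)))      # close to 0
print(abs(residual_power(m2, 2, -6, 1.7)))      # close to 0
```

Here the roots are real and distinct, so the general solution for x > 0 is $y = Ax^2 + Bx^{-3}$.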
6. Partial Differentiation
Now we enter new territory. Having spent the semester studying functions of several
variables, and having worked through the concept of a partial derivative, we are in position to
generalize the concept of a differential equation to include equations that involve partial
derivatives, not just ordinary ones. Solutions to such equations will involve functions not just
of one variable, but of several variables. Such equations arise naturally, for example, when
one is working with situations that involve positions in space that vary over time. To model
such a situation, one needs to use functions that have several variables to keep track of the
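spatial dimensions and an additional variable for time.
(1) $\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}$    One-dimensional wave equation
(2) $\frac{\partial u}{\partial t} = c^2 \frac{\partial^2 u}{\partial x^2}$    One-dimensional heat equation
(3) $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$    Two-dimensional Laplace equation
(4) $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y)$    Two-dimensional Poisson equation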
Note that for PDEs one typically uses some other function letter such as u instead of y, which
now quite often shows up as one of the variables involved in the multivariable function.
In general we can use the same terminology to describe PDEs as in the case of ODEs. For
starters, we will call any equation involving one or more partial derivatives of a multivariable
function a partial differential equation. The order of such an equation is the highest order
partial derivative that shows up in the equation. In addition, the equation is called linear if it
is of the first degree in the unknown function u and its partial derivatives $u_x$, $u_{xx}$, $u_y$, etc.
(this means that the highest power of the function u and of its derivatives is one in
each term in the equation, and that only one of them appears in each term). If each term in
the equation involves either u or one of its partial derivatives, then the equation is classified
as homogeneous.
Take a look at the list of PDEs above. Try to classify each one using the terminology given
above. Note that the f(x,y) function in the Poisson equation is just a function of the variables
x and y, it has nothing to do with u(x,y).
Answers: all of these PDEs are second order, and are linear. All are also homogeneous
except for the fourth one, the Poisson equation, as the f(x,y) term on the right hand side
doesn’t involve u or any of its derivatives.
The reason for defining the classifications linear and homogeneous for PDEs is to bring up
the principle of superposition. This excellent principle (which also shows up in the study of
linear homogeneous ODEs) is useful exactly whenever one considers solutions to linear
homogeneous PDEs. The idea is that if two functions $u_1$ and $u_2$ satisfy a linear
homogeneous differential equation, then, since the derivative of a sum of functions is
the sum of their derivatives, the sum $u_1 + u_2$ is also a solution. This works as long as the
function and its derivatives appear only to the first power (i.e., the equation is linear) and each
term contains the function or one of its derivatives (i.e., the equation is homogeneous). In fact, so
is any linear combination $au_1 + bu_2$, where a and b are constants.
For instance, the two functions cos(xy) and sin(xy) are both solutions of the first-order
linear homogeneous PDE:
(5) $x \frac{\partial u}{\partial x} - y \frac{\partial u}{\partial y} = 0$
It’s a simple exercise to check that cos(xy) + sin(xy) and 3cos(xy) − 2sin(xy) are also
solutions to the same PDE (as will be any linear combination of cos(xy) and sin(xy)).
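The superposition claim can be checked numerically. The sketch below assumes the PDE reads $x u_x - y u_y = 0$ (the sign for which cos(xy) and sin(xy) are indeed solutions) and approximates the partial derivatives by central differences; the helper names and sample point are illustrative.

```python
import math

def pde_residual(u, x, y, h=1e-6):
    """x*u_x - y*u_y at (x, y), with central-difference partial derivatives."""
    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return x * u_x - y * u_y

def u1(x, y):
    return math.cos(x * y)

def u2(x, y):
    return math.sin(x * y)

def combo(a, b):
    """Linear combination a*u1 + b*u2 as a new two-variable function."""
    return lambda x, y: a * u1(x, y) + b * u2(x, y)

for u in (u1, u2, combo(1, 1), combo(3, -2)):
    print(abs(pde_residual(u, 0.8, 1.3)))   # each value is close to 0
```

By contrast, a function such as u(x, y) = x + y gives a residual of x − y, which is non-zero away from the line y = x.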
Solving PDEs
Solving PDEs is considerably more difficult in general than solving ODEs, as the level of
complexity involved can be great. For instance the following seemingly completely
unrelated functions are all solutions to the two-dimensional Laplace equation:
You should check to see that these are all in fact solutions to the Laplace equation by doing
the same thing you would do for an ODE solution, namely, calculate $\frac{\partial^2 u}{\partial x^2}$ and $\frac{\partial^2 u}{\partial y^2}$,
substitute them into the PDE and see if the two sides of the equation are identical.
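The same check can be carried out numerically. The list of functions from the notes is not reproduced here, so the sketch below uses three standard harmonic functions as stand-ins (an assumption, not the notes' own examples) and a second-order central-difference approximation of $u_{xx} + u_{yy}$.

```python
import math

def laplacian(u, x, y, h=1e-4):
    """Central-difference approximation of u_xx + u_yy at (x, y)."""
    u_xx = (u(x + h, y) - 2 * u(x, y) + u(x - h, y)) / (h * h)
    u_yy = (u(x, y + h) - 2 * u(x, y) + u(x, y - h)) / (h * h)
    return u_xx + u_yy

# Stand-in examples (hypothetical choices, not the list from the notes).
candidates = {
    "x^2 - y^2": lambda x, y: x * x - y * y,
    "e^x cos y": lambda x, y: math.exp(x) * math.cos(y),
    "ln(x^2 + y^2)": lambda x, y: math.log(x * x + y * y),
}

for name, u in candidates.items():
    print(name, abs(laplacian(u, 0.9, 0.4)))   # each value is close to 0

# A function that does not satisfy the Laplace equation, for contrast.
print(laplacian(lambda x, y: x * x + y * y, 0.9, 0.4))   # approximately 4
```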
Now, there are certain types of PDEs for which finding the solutions is not too hard. For
instance, consider the first-order PDE
(2) $\frac{\partial u}{\partial x} = 3x^2 - xy^2$
To solve (2), integrate both sides of the equation with respect to x. Thus
(3) $\int \frac{\partial u}{\partial x}\,dx = \int (3x^2 - xy^2)\,dx$
so that $u(x, y) = x^3 - \frac{1}{2}x^2y^2 + F$. What is F? Note that it could be any function such that
when one takes its partial derivative with respect to x, the result is 0. This means that in the
case of PDEs, the arbitrary constants that we ran into during the course of solving ODEs are
now taking the form of whole functions. Here F is in fact any function, F(y), of y alone. To
check that this is indeed a solution to the original PDE, it is easy enough to take the partial
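derivative of this u(x, y) function and see that it indeed satisfies the PDE in (2).
Next, consider the second-order PDE
(4) $\frac{\partial^2 u}{\partial x \partial y} = 5x + y^2$
where u is again a two-variable function depending on x and y. We can solve this PDE by
integrating first with respect to x, to get to an intermediate PDE,
(5) $\frac{\partial u}{\partial y} = \frac{5}{2}x^2 + xy^2 + F(y)$
where F(y) is a function of y alone. Now, integrating both sides with respect to y yields
(6) $u(x, y) = \frac{5}{2}x^2y + \frac{1}{3}xy^3 + F(y) + G(x)$
(here F(y) has been renamed to stand for an antiderivative of the arbitrary function in (5), which is again an arbitrary function of y)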
where now G(x) is a function of x alone (Note that we could have integrated with respect to y
first, then x and we would have ended up with the same result). Thus, whereas in the ODE
world, general solutions typically end up with as many arbitrary constants as the order of the
original ODE, here in the PDE world, one typically ends up with as many arbitrary functions
in the general solutions.
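The role of the two arbitrary functions can be seen numerically. The sketch below assumes that the PDE reads $u_{xy} = 5x + y^2$ with general solution $u = \frac{5}{2}x^2y + \frac{1}{3}xy^3 + F(y) + G(x)$ (the plus signs are an assumption about the original equations), and picks F = sin and G = exp purely for illustration.

```python
import math

def make_solution(F, G):
    """Build u(x, y) = (5/2) x^2 y + (1/3) x y^3 + F(y) + G(x)."""
    return lambda x, y: 2.5 * x * x * y + x * y ** 3 / 3 + F(y) + G(x)

def mixed_partial(u, x, y, h=1e-3):
    """Central-difference approximation of the mixed derivative u_xy."""
    return (u(x + h, y + h) - u(x + h, y - h)
            - u(x - h, y + h) + u(x - h, y - h)) / (4 * h * h)

u_a = make_solution(math.sin, math.exp)          # one arbitrary choice of F and G
u_b = make_solution(lambda y: 0.0, lambda x: 0.0)  # the choice F = G = 0

x0, y0 = 0.6, 1.1
print(mixed_partial(u_a, x0, y0), mixed_partial(u_b, x0, y0), 5 * x0 + y0 ** 2)
```

All three printed numbers agree to several decimal places: F(y) and G(x) drop out of the mixed derivative, which is why they remain arbitrary.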
To end up with a specific solution, then, we will need to be given extra conditions that
indicate what these arbitrary functions are. Thus the initial conditions for PDEs will typically
involve knowing whole functions, not just constant values. We will also see that the initial
conditions that appeared in specific ODE situations have slightly more involved analogs in
the PDE world, namely there are often so-called boundary conditions as well as initial
conditions to take into consideration.
7. Statistics
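(I) Mean:
The sample mean of the observations $x_1, \dots, x_n$ and the population mean of the values $y_1, \dots, y_N$ are
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{and} \quad \mu = \frac{\sum_{i=1}^{N} y_i}{N}, \]
respectively.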
Basically, the mean can provide the information about the “center” of the data. Intuitively, it can
measure the rough “location” of the data.
Example (continued):
\[ \bar{x} = \frac{1 + 3 + \cdots + 10}{10} = 5.5 \]
(II) Median:
Example (continued):
\[ \text{median} = \frac{5 + 6}{2} = 5.5 \]
If the data are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, then
\[ \text{median} = 6 \]
Note: the median is less sensitive to extreme values in the data than the mean. For example, in the
previous data, suppose the last value has been mistyped, so that the data become 1, 3, 5, 7, 9, 2, 4, 6, 8,
100. Then the median is still 5.5 while the mean becomes 14.5.
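The comparison can be reproduced with Python's standard `statistics` module. This is a small sketch using the data from the example; the variable names are illustrative.

```python
from statistics import mean, median

data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
typo = data[:-1] + [100]            # last value mistyped as 100

print(mean(data), median(data))     # 5.5 5.5
print(mean(typo), median(typo))     # 14.5 5.5
```

One wrong value moves the mean from 5.5 to 14.5, while the median does not change.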
(III) Mode:
The data value that occurs with the greatest frequency (the data need not be numerical).
Note: if the data have exactly two modes, we say that the data are bimodal. If the data have more than
two modes, we say that the data are multimodal.
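A short sketch of how modes might be found and labelled, using `statistics.multimode` from the standard library (Python 3.8 or later). The helper name `describe_modes` and the sample data are hypothetical; the data are non-numerical on purpose.

```python
from statistics import multimode

def describe_modes(values):
    """Return the list of modes and a label based on how many there are."""
    modes = multimode(values)   # all values tied for the highest frequency
    label = {1: "unimodal", 2: "bimodal"}.get(len(modes), "multimodal")
    return modes, label

print(describe_modes(["red", "blue", "red", "blue", "green"]))
print(describe_modes([2, 3, 3, 5, 7]))
```

The first call reports the two modes "red" and "blue" (bimodal); the second reports the single mode 3.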
Example:
Suppose there are two factories producing the batteries. From each factory, 10 batteries are drawn to
test for the lifetime (in hours). These lifetimes are:
Factory 1: 10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1
Factory 2: 16, 5, 7, 14, 6, 15, 3, 13, 9, 12.
The mean lifetimes of the two factories are both 10. However, by looking at the data, it is obvious that
the batteries produced by factory 1 are much more reliable than the ones by factory 2. This implies
other measures for measuring the “dispersion” or “variation” of the data are required.
(I) Range:
range=(largest value of the data)-(smallest value of the data).
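Example (continued):
(II) Variance and standard deviation:
The population variance and the sample variance are
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \mu)^2}{N} \quad \text{and} \quad s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}, \]
respectively.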
The population standard deviation and sample standard deviation are the square roots of the population
variance and sample variance:
\[ \sigma = \sqrt{\sigma^2} \quad \text{and} \quad s = \sqrt{s^2}, \]
respectively.
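A large sample variance or sample standard deviation implies the data are “dispersed” or are highly
varied.
Note:
\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 = \sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \]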
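Example:
\[ s^2(\text{factory 1}) = \frac{(10.1 - 10)^2 + (9.9 - 10)^2 + \cdots + (10.1 - 10)^2}{10 - 1} = 0.0111 \]
\[ s^2(\text{factory 2}) = \frac{(16 - 10)^2 + (5 - 10)^2 + \cdots + (12 - 10)^2}{10 - 1} = 21.1111 \]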
The sample variance of battery lifetimes for factory 2 is 1900 times larger than the one for
factory 1.
The sample standard deviations for the data from factories 1 and 2 are
\[ \sqrt{0.01111} = 0.1054 \]
and
\[ \sqrt{21.1111} = 4.5946, \]
respectively.
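These values can be reproduced with the standard `statistics` module, whose `variance` and `stdev` functions use the sample (n − 1) definitions. The sketch below uses the battery data from the example; variable names are illustrative.

```python
from statistics import mean, stdev, variance

factory1 = [10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1]
factory2 = [16, 5, 7, 14, 6, 15, 3, 13, 9, 12]

for name, data in (("factory 1", factory1), ("factory 2", factory2)):
    # Sample mean, sample variance (divide by n - 1), sample standard deviation.
    print(name, round(mean(data), 4), round(variance(data), 4), round(stdev(data), 4))

# Ratio of the two sample variances.
print(round(variance(factory2) / variance(factory1)))
```

Both means come out as 10, the variances as about 0.0111 and 21.1111, and the ratio of the variances as about 1900.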
Example:
In the battery data from factory 1, suppose the measurement is in minutes rather than hours. Then the
data are 606, 594, 606, 594, 594, 606, 594, 606, 594, 606.
Thus, the standard deviation becomes 6.3245, which is 60 times larger than the value 0.1054 based on
the original data measured in hours. However, whether the data are measured in hours or in minutes,
the coefficient of variation is
\[ C.V. = \frac{0.1054}{10} \times 100 = \frac{6.3245}{600} \times 100 = 1.054. \]
Note: since the coefficient of variation is scale-invariant, it is very useful for comparing the dispersion
of different data. For example, in the previous battery data, if the lifetime of the batteries from factory
1 and factory 2 are measured in minutes and hours, respectively, the standard deviation for factory 1,
6.3245, would be larger than for factory 2, 4.5946. However, the coefficient of variation for factory 1,
1.054 is still much smaller than the one for factory 2, 45.946.
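Scale invariance is easy to confirm in code. The sketch below takes the coefficient of variation to be 100 × (sample standard deviation) / (sample mean), as in the computation above; `coefficient_of_variation` is a hypothetical helper name.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """100 * s / x-bar, where s is the sample standard deviation."""
    return 100 * stdev(data) / mean(data)

factory1_hours = [10.1, 9.9, 10.1, 9.9, 9.9, 10.1, 9.9, 10.1, 9.9, 10.1]
factory1_minutes = [60 * h for h in factory1_hours]      # same data, different unit
factory2_hours = [16, 5, 7, 14, 6, 15, 3, 13, 9, 12]

print(stdev(factory1_minutes), stdev(factory2_hours))    # about 6.32 vs about 4.59
print(coefficient_of_variation(factory1_hours))          # about 1.054
print(coefficient_of_variation(factory1_minutes))        # about 1.054
print(coefficient_of_variation(factory2_hours))          # about 45.95
```

Changing the unit multiplies the standard deviation by 60 but leaves the coefficient of variation unchanged.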
8. Probability
Outcomes – possible results of an experiment
To find the probability of an event, divide the number of favorable outcomes by the total
number of outcomes.
A tree diagram helps count all possible outcomes by using branching to list choices.
The Counting Principle is a method used to count possible outcomes by multiplying the
outcomes of each event.
Permutation – order is imPortant
Combination – we don’t Care about order
Overlapping Events are events that have one or more outcomes in common.
Two disjoint events in which one or the other must occur are called complementary events.
When you consider outcomes of two events, the events are called compound events.
Two compound events that do not affect each other are called independent events.
P(A and B) = P(A) • P(B)
Two compound events are dependent events if the occurrence of one event does affect the
likelihood that the other will occur.
P(A and B) = P(A) • P(B given A)
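The counting ideas and the two product rules above can be illustrated with a few lines of Python. The scenarios (shirts and trousers, the letters A to D, a coin, a 52-card deck) are hypothetical examples chosen for this sketch, not taken from the notes.

```python
import math
from fractions import Fraction
from itertools import combinations, permutations, product

# Counting principle: 3 shirts and 2 trousers give 3 * 2 outfits.
shirts, trousers = ["red", "blue", "green"], ["jeans", "chinos"]
outfits = list(product(shirts, trousers))

# Permutations (order matters) vs combinations (order ignored) of 2 letters from 4.
letters = "ABCD"
ordered = list(permutations(letters, 2))
unordered = list(combinations(letters, 2))

# Independent events: P(A and B) = P(A) * P(B), e.g. two heads in two fair flips.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

# Dependent events: P(A and B) = P(A) * P(B given A), e.g. two aces drawn
# without replacement from a standard 52-card deck.
p_two_aces = Fraction(4, 52) * Fraction(3, 51)

print(len(outfits), len(ordered), len(unordered))   # 6 12 6
print(p_two_heads, p_two_aces)                      # 1/4 1/221
```

The list lengths agree with `math.perm(4, 2)` and `math.comb(4, 2)`, available in Python 3.8 and later.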
Sample space. This is just terminology. A sample space is the set of all possible
outcomes of whatever ‘experiment’ you are doing.
For example, if you are flipping a coin 3 times, the sample space is
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
(1.1)
If you are considering the possible years of age of a human being, the sample space
consists of the non-negative integers up to 150. If you are considering the possible
birthdates of a person drawn at random, the sample space consists of the days of the year,
thus the integers from 1 to 366. If you are considering the possible birthdates of two
people selected at random, the sample space consists of all pairs of the form (j, k) where j
and k are integers from 1 to 366.
To reiterate: S is just the collection of all conceivable outcomes.
Events. An event is a subset of the sample space, thus, a subset of possible outcomes for
your experiment. Thus, if S is the sample space for flipping a coin three times, then HTH
is an event. The event that a head appears on the first flip is the four element subset
{HTT, HHT, HTH, HHH}.
Thus, an event is simply a certain subset of the possible outcomes.
If you know what P assigns to each element in S, then you know P on every subset: Just
add up the probabilities that are assigned to its elements.
This assumes that S is a finite set. We’ll talk about the story when it isn’t later in the
course. Anyway, the preceding illustrates the more intuitive notion of probability that we
all have: It says simply that if you know the probability of every outcome, then you can
compute the probability of any subset of outcomes by summing up the probabilities of the
outcomes that are in the subset.
For example, if S is the set of outcomes for flipping a fair coin three times (as
depicted in (1.1)), then each of its elements has P(·) = 1/8, and we can use the rule in
(1.2) to assign probabilities to any given subset of S. For example, the subset given by
{HHT, HTH, THH} has probability 3/8 since
P({HHT, HTH, THH}) = P({HHT, HTH}) + P(THH)
by invoking (1.2). Invoking it a second time finds P({HHT, HTH}) = P(HHT) + P(HTH),
and so P({HHT, HTH, THH}) = P(HHT) + P(HTH) + P(THH) = 3/8.
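The summation rule is straightforward to carry out by machine. The sketch below builds the three-flip sample space, assigns probability 1/8 to every outcome, and adds up the probabilities of the outcomes in an event; the names `S`, `P` and `prob` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for three coin flips: HHH, HHT, ..., TTT.
S = ["".join(flips) for flips in product("HT", repeat=3)]

# A fair coin gives every outcome the same probability, 1/8.
P = {outcome: Fraction(1, 8) for outcome in S}

def prob(event):
    """Probability of an event (a set of outcomes): sum over its elements."""
    return sum(P[outcome] for outcome in event)

print(len(S))                          # 8
print(prob({"HHT", "HTH", "THH"}))     # 3/8
```

Exact fractions are used so that the result is 3/8 rather than a rounded decimal.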
Here are some consequences of the definition of probability.
a) P(ø) = 0.
b) P(A ∪ B) = P(A) + P(B) – P(A ∩ B).
c) P(A) ≤ P(B) if A ⊆ B.
d) P(B) = P(B ∩ A) + P(B ∩ A^c).
e) P(A^c) = 1 – P(A).
(1.3)
In the preceding, A^c is the set of elements that are not in A. The set A^c is called the
‘complement’ of A.
I want to stress that all of these conditions are simply translations into symbols of
intuition that we all have about probabilities. Here are the respective English versions of
(1.3):
a) The probability that no outcomes appear is zero. This is to say that if S is the list of
all possible outcomes, then at least one outcome must appear.
b) The probability that an outcome is in either A or B is the probability that it is in A plus the
probability that it is in B minus the probability that it is in both. The point here is that
if A and B have elements in common, then one is overcounting by just summing the
two probabilities. If you doubt this, try the case where A = B.
c) The probability of an outcome from A is no greater than that of an outcome from B in
the case that all outcomes from A are contained in the set B.
d) The probability of an outcome from the set B is the sum of the probability that the
outcome is in the portion of B that is contained in A and the probability that the
outcome is in the portion of B that is not contained in A.
e) The probability of an outcome that is not in A is 1 minus the probability that an
outcome is in A.
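The five rules can be verified directly on the three-flip sample space using Python sets. This is a sketch under the assumption of a fair coin (equally likely outcomes); the events A and B and the helper names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

S = {"".join(flips) for flips in product("HT", repeat=3)}

def prob(event):
    """Equally likely outcomes: P(event) = |event| / |S|."""
    return Fraction(len(event), len(S))

def check_rules(A, B):
    """Evaluate rules a) to e) for the events A and B; each entry should be True."""
    Ac = S - A                                   # complement of A
    return {
        "a": prob(set()) == 0,
        "b": prob(A | B) == prob(A) + prob(B) - prob(A & B),
        "c": prob(A & B) <= prob(B),             # A & B is a subset of B
        "d": prob(B) == prob(B & A) + prob(B & Ac),
        "e": prob(Ac) == 1 - prob(A),
    }

A = {s for s in S if s[0] == "H"}                # head on the first flip
B = {s for s in S if s.count("H") == 2}          # exactly two heads
print(check_rules(A, B))
```

Rule c) is tested with the pair A ∩ B and B, since A ∩ B is always contained in B.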
Conditional probability: This is the probability that an event in A occurs given that you
already know that an event in B occurs. It is denoted by P(A|B), and it is a probability
assignment for S that is typically not the same as the original one, P. The rule for
computing this new probability is
P(A|B) ≡ P(A ∩ B)/P(B).
(1.4)
You can check that this obeys all of the rules for being a probability. In English, this
says:
The probability of an event occurring from A given that the event is in B is the probability
of the event being in both A and B divided by the probability of the event being in B in
the first place.
Another way to view this notion is as follows: Since we are told that the event B
happened, we can shrink the sample space from the whole of S to just the elements that
define the event B. The probability of A given that B happened is then the probability
assigned to the part of A in B (thus, P(A ∩ B)) divided by P(B). In this regard, the
division by P(B) is done to make the conditional probability of B given that B happened
equal to 1.
Anyway, here is an example: Suppose we want the conditional probability of a head
on the last flip granted that there is a head on the first flip. Use B to denote the event that
there is a head on the first flip. Then P(B) = 1/2. The conditional probability that there is a
head on the final flip given that you know there is one on the first flip is obtained using
(1.4). Here, A is the event that there is a head on the final flip, thus the set {TTH, THH,
HTH, HHH}. Its intersection with B is A ∩ B = {HTH, HHH}. This set has probability
1/4, so our conditional probability is (1/4)/(1/2) = 1/2.
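The same computation, written out as a sketch in Python for the fair three-flip sample space; `cond_prob` is a hypothetical helper implementing P(A|B) = P(A ∩ B)/P(B).

```python
from fractions import Fraction
from itertools import product

S = {"".join(flips) for flips in product("HT", repeat=3)}

def prob(event):
    """Equally likely outcomes: P(event) = |event| / |S|."""
    return Fraction(len(event), len(S))

def cond_prob(A, B):
    """Conditional probability P(A|B) = P(A and B) / P(B); requires P(B) > 0."""
    return prob(A & B) / prob(B)

A = {s for s in S if s[-1] == "H"}    # head on the final flip
B = {s for s in S if s[0] == "H"}     # head on the first flip

print(sorted(A & B))                  # ['HHH', 'HTH']
print(cond_prob(A, B))                # 1/2
print(cond_prob(B, B))                # 1
```

The last line illustrates why one divides by P(B): the conditional probability of B given B is 1.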
That’s all there is to probability: You have just seen most of probability theory for
sample spaces with a finite number of elements. There are a few new notions that are
introduced later, but a good deal of what follows concerns either various consequences of
the notions that were just introduced, or else various convenient ways to calculate
probabilities that arise in common situations.
Decomposing a subset to compute probabilities: It is often the case (as we will see) that it
is easier to compute conditional probabilities. This can be used to one’s advantage in the
following situation: Suppose that S is decomposed into a union of some number, N, of
subsets that have no elements in common: S = ∪_{1≤j≤N} A_j, where {A_j}_{1≤j≤N} are subsets of S
with A_j ∩ A_j′ = ø when j ≠ j′. Now suppose that A is any given set. Then
P(A) = ∑_{1≤j≤N} P(A|A_j)·P(A_j).
(1.5)
The probability of A is the probability that an outcome from A occurs that is in A_1, plus
the probability that an outcome from A occurs that is in A_2, plus the corresponding
probability for each of the remaining subsets up to A_N.
By the way, do you recognize (1.5) as a linear equation? You might if you denote
P(A) by y, each P(A_j) by x_j and P(A|A_j) by a_j so that this reads
y = a_1 x_1 + a_2 x_2 + ··· + a_N x_N.
Applying the same decomposition to each of several events A_j, with S decomposed into subsets B_k, gives
P(A_j) = ∑_k P(A_j|B_k)·P(B_k).
So, we have a 4 × 4 matrix M whose entry in row j and column k is P(A_j|B_k). Now write
each P(A_j) as y_j and each P(B_k) as x_k, and this last equation reads y_j = ∑_k M_jk x_k.
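The matrix form y_j = ∑_k M_jk x_k can be sketched as follows. The entries of M and x below are made-up numbers chosen only so that each column of M and the vector x sum to 1; they are not data from the notes.

```python
from fractions import Fraction as F

# Hypothetical conditional probabilities: M[j][k] stands for P(A_j | B_k).
# Each column sums to 1 because, given B_k, exactly one A_j occurs.
M = [
    [F(1, 2), F(1, 4), F(1, 4), F(1, 10)],
    [F(1, 4), F(1, 2), F(1, 4), F(2, 10)],
    [F(1, 8), F(1, 8), F(1, 4), F(3, 10)],
    [F(1, 8), F(1, 8), F(1, 4), F(4, 10)],
]

# Hypothetical probabilities x_k = P(B_k); they sum to 1.
x = [F(1, 4), F(1, 4), F(1, 4), F(1, 4)]

def mat_vec(matrix, vector):
    """Matrix-vector product: y_j = sum over k of matrix[j][k] * vector[k]."""
    return [sum(m_jk * x_k for m_jk, x_k in zip(row, vector)) for row in matrix]

y = mat_vec(M, x)
print(y, sum(y))
```

Because every column of M sums to 1, the resulting probabilities y_j = P(A_j) again sum to 1.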
Independent events: Events A and B are said to be independent when
P(A|B) = P(A).
In English: Events A and B are independent when the probability of A given B is the
same as that of A with no knowledge about B. Thus, whether the outcome is in B or not
has no bearing on whether it is in A.
Here is an equivalent definition: Events A and B are independent when P(A ∩ B) =
P(A)P(B). This is equivalent because P(A|B) = P(A ∩ B)/P(B). Note that the equality
between P(A ∩ B) and P(A)P(B) implies that P(B|A) = P(B). Thus, independence is
symmetric. Here is the English version of this equivalent definition: Events A and B are
independent in the case that the probability of an event being both in A and in B is the
product of the probability that it is in A and the probability that it is in B.
For an example, take A to be the event that a head appears on the first coin toss and B
the event that it appears on the third. Are these events independent? Well, P(A) is 1/2, as is
P(B). Meanwhile, P(A ∩ B) = 1/4, which is P(A)P(B). Thus, they are indeed
independent.
For a second example, consider A to be the event that a head appears on the first toss
and B the event that a tail appears on the first toss. Then A ∩ B = ø, so P(A ∩ B) is zero
but P(A)P(B) = 1/4. So, these two events are not independent. (Are you surprised?)
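Both examples can be checked with the product criterion P(A ∩ B) = P(A)P(B). The sketch below assumes a fair coin flipped three times; `is_independent` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

S = {"".join(flips) for flips in product("HT", repeat=3)}

def prob(event):
    """Equally likely outcomes: P(event) = |event| / |S|."""
    return Fraction(len(event), len(S))

def is_independent(A, B):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(A & B) == prob(A) * prob(B)

head_first = {s for s in S if s[0] == "H"}
head_third = {s for s in S if s[2] == "H"}
tail_first = {s for s in S if s[0] == "T"}

print(is_independent(head_first, head_third))   # True
print(is_independent(head_first, tail_first))   # False
```

In the second case the intersection is empty, so P(A ∩ B) = 0 while P(A)P(B) = 1/4.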
Here is food for thought: Is it reasonable to suppose that the probability of seeing the
base G at a given site is independent of seeing it at the next site in a stretch of DNA?
Check out the DNA code for most commonly used amino acids and let me know.
Exercises:
1. Suppose we have an experiment with three possible outcomes, labeled 1,2, and 3.
Suppose in addition, that we do the experiment three successive times.
a) Give the sample space for the possible outcomes of the three experiments.
b) Write down the subsets of your sample space that correspond to the event that
outcome 1 occurs in the second experiment.
c) Suppose that we have a theoretical model of the situation that predicts equal
probability for any of the three outcomes for any one given experiment. Our model also says
that the event that outcome k appears in one experiment and outcome j in another are
independent. Use these facts to give the probability of three successive experiments getting as
outcome any given triple (i, j, k) with i either 1, 2 or 3, and with j and k likewise constrained.
d) Which is more likely: Getting exactly two identical outcomes in the three
experiments, or getting three distinct outcomes in the three experiments?
2. Suppose that 1% of Harvard students have a particular mutation in a certain protein, that
20% of people with this mutation have trouble digesting lactose, and that 5% of Harvard
students have trouble digesting lactose. If a Harvard student has trouble digesting lactose,
what is the probability that the student has the particular mutation? (Hint: Think Bayes’
theorem.)
4. Label the four bases that are used in a DNA molecule as {1, 2, 3, 4}.
a) Granted this labeling, write down the sample space for the possible bases at two given
sites on the molecule.
b) Let {A_j}_{j=1,2,3,4} denote the event in this 2-site sample space that the first site has the
base j, and let {B_j}_{j=1,…,4} denote the analogous event for the second site. Explain why
P(A_1|B_k) + P(A_2|B_k) + P(A_3|B_k) + P(A_4|B_k) = 1 for all k.
c) If A_i is independent from each B_k, we saw that P(A_i|B_k) = P(B_k|A_i) for all i and k.
Can this last symmetry condition hold if some pair A_i and B_k are not independent?
If so, give an example by specifying the associated probability function on the two
site sample space. If not, explain why.