
AMA1110 Basic Mathematics I - Calculus and Probability & Statistics

Lecture 10
12 November 2019

Dr. Guofeng Zhang

Email: guofeng.zhang@polyu.edu.hk
Office: TU832
Telephone: 2766 6936
Student Consultation Hours: Tue. 13:00–15:00

A brief review of the last lecture

Conditional probability:

    P(A|B) = P(A ∩ B) / P(B).

Law of total probability. Let B1, . . . , Bn be a partition of the sample space. Then

    P(A) = P(B1)P(A|B1) + · · · + P(Bn)P(A|Bn).

In particular, writing B' for the complement of B,

    P(A) = P(B)P(A|B) + P(B')P(A|B').

Events A and B are independent if P(A|B) = P(A), or P(B|A) = P(B), or P(A ∩ B) = P(A)P(B).

Bayes' Theorem:

    P(Bk|A) = P(Bk ∩ A) / P(A) = P(Bk)P(A|Bk) / [P(B1)P(A|B1) + · · · + P(Bn)P(A|Bn)].

In particular,

    P(B|A) = P(A ∩ B) / P(A) = P(B)P(A|B) / [P(B)P(A|B) + P(B')P(A|B')].
Outline

Random Variables
Mean of Discrete Random Variables
Variance of Discrete Random Variables
The Binomial Probability Distribution
Poisson Distribution

Chapter 11: Probability distributions

In some scenarios, we need a quantitative description of a sample space.

Example 1
We have 10 balls in a box, whose weights (in grams) are
10, 9, 9, 10, 9, 9, 8, 7, 8, 8. If we randomly select one ball (this is an
experiment) and let W denote its weight, what would W be?

Weight W Number of Balls Probability


7 1 0.1
8 3 0.3
9 4 0.4
10 2 0.2

We see that the weight W can take four possible values {7, 8, 9, 10}, and
it takes each value with the probability shown in the table above.
Example. We toss a fair die (this is an experiment) and let W denote
the number we get. Then W could possibly be 1, 2, 3, 4, 5, or 6, each with
probability 1/6. Therefore, we write

    P(W = 1) = P(W = 2) = P(W = 3) = P(W = 4) = P(W = 5) = P(W = 6) = 1/6.
Some other examples:
• the number of iPhones sold at a shop in a given month
• the number of people coming to a theater on a certain evening
• the number of complaints received by an airline in a certain year
• the number of heads one gets after flipping a coin 100 times

Definition of random variables

Definition 11.1.1. A random variable is a variable whose values are determined by the outcomes of an experiment.

In the above examples, the variables take discrete values; we call them discrete random variables.

There are many random variables taking continuous values, such as
• the time taken to finish an assignment
• the life length of a light bulb
• the length of a person’s sleep on a certain night
• the price of a stock at 3pm next Monday
We call them continuous random variables.
Example 2

We have 10 balls in a box, whose weights (in grams) are

Ball     a  b  c  d  e  f  g  h  i  j
Weight  10  9  9 10  9  9  8  7  8  8

We randomly take one ball and denote its weight by W.


Weight W Number of balls Probability
7 1 0.1
8 3 0.3
9 4 0.4
10 2 0.2

Find P (W < 9) and P (W > 9).

Solution. Since the events {W = 7}, {W = 8}, {W = 9}, and {W = 10} are mutually exclusive,

    P(W < 9) = P(W = 7) + P(W = 8) = 0.1 + 0.3 = 0.4,
    P(W > 9) = P(W = 10) = 0.2.
Let X be a discrete random variable. We define

    fX(u) := P(X = u).

We call fX the probability function (or probability distribution) of the random variable X.

It is convenient to denote random variables by capital letters (X, Y, Z, W, . . .) and the fixed real numbers they take by small letters (a, b, c, x, y, z, . . .).

Recall for Example 2,

Weight W   Number of Balls   Probability
7          1                 0.1
8          3                 0.3
9          4                 0.4
10         2                 0.2
                     Total:  1

The total probability is

    Σu P(X = u) = 1.

This is always true for any random variable, because P(S) = 1, where S is the whole sample space.
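The probability function can be tabulated directly from the data. A minimal Python sketch using the ball weights of Example 2 (variable names are illustrative); exact fractions avoid rounding noise:

```python
from collections import Counter
from fractions import Fraction

# Weights of the ten balls in Example 2; one ball is drawn uniformly at random.
weights = [10, 9, 9, 10, 9, 9, 8, 7, 8, 8]

# Probability function f_W(u) = P(W = u) = (number of balls of weight u) / 10.
counts = Counter(weights)
f_W = {u: Fraction(n, len(weights)) for u, n in counts.items()}

# The probabilities always sum to 1, since P(S) = 1.
total_probability = sum(f_W.values())
print(f_W[9])             # 2/5, i.e. 0.4
print(total_probability)  # 1
```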
Mean of discrete random variable

Example. Recall the previous problem, where we have ten balls of weights 10, 9, 9, 10, 9, 9, 8, 7, 8, 8. What is the average weight?

That's easy:

    (10 + 9 + 9 + 10 + 9 + 9 + 8 + 7 + 8 + 8) / 10 = 8.7.

We can also compute the average weight another way:

    (10 + 9 + 9 + 10 + 9 + 9 + 8 + 7 + 8 + 8) / 10
    = 7 × 1/10 + 8 × 3/10 + 9 × 4/10 + 10 × 2/10
    = 7 × P(W = 7) + 8 × P(W = 8) + 9 × P(W = 9) + 10 × P(W = 10) = 8.7.
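The agreement between the two ways of counting the average can be checked numerically (a sketch, using exact rational arithmetic):

```python
from collections import Counter
from fractions import Fraction

weights = [10, 9, 9, 10, 9, 9, 8, 7, 8, 8]
n = len(weights)

# First way: add the weights and divide by 10.
direct = Fraction(sum(weights), n)

# Second way: weight each value u by its probability P(W = u).
counts = Counter(weights)
weighted = sum(Fraction(u) * Fraction(c, n) for u, c in counts.items())

print(direct, weighted)  # 87/10 87/10, i.e. both equal 8.7
```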
Mean of discrete random variable

In general, we have the following definition.

Definition 1
Let X be a discrete random variable. The mean, or expected value, of X is defined as

    µ = E[X] := Σu u × P(X = u).

We see that E[X] = Σu u fX(u), where fX(u) = P(X = u) is the probability function of X.
Example 3

Compare random variables X and Y, which have the following probability functions fX and fY:

u       1    2    3    4   5    6    7
fX(u)   0.1  0.13 0.17 0.2 0.17 0.13 0.1
fY(u)   0.01 0.09 0.2  0.4 0.2  0.09 0.01

[Figure: plots of the probability functions fX(u) and fY(u) for u = 1, 2, . . . , 7, each marked with its mean.]

X and Y have the same mean, which is 4. However, Y appears to be more “concentrated” around the mean value 4.
Variance of discrete random variables

To measure how “concentrated” a random variable is around its mean value, we introduce the concept of variance.

Definition 2 (Definition 11.1.11)
Let X be a discrete random variable with mean E[X] = µ. The variance of X is defined as

    σ² = Var(X) := Σu (u − µ)² fX(u),

where fX(u) = P(X = u) is the probability function of X.

Fact:

    Var(X) = Σu (u − µ)² fX(u) = Σu (u² − 2uµ + µ²) fX(u)
           = Σu u² fX(u) − 2µ Σu u fX(u) + µ² Σu fX(u)
           = Σu u² fX(u) − µ².
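The “shortcut” formula Var(X) = Σu u² fX(u) − µ² can be sanity-checked against the definition, here with the two distributions of Example 3 (a sketch; the helper names are mine):

```python
def mean(f):
    """E[X] = sum of u * f(u) over the support of X."""
    return sum(u * p for u, p in f.items())

def var_by_definition(f):
    """Var(X) = sum of (u - mu)^2 * f(u)."""
    mu = mean(f)
    return sum((u - mu) ** 2 * p for u, p in f.items())

def var_by_shortcut(f):
    """Var(X) = sum of u^2 * f(u) - mu^2."""
    return sum(u ** 2 * p for u, p in f.items()) - mean(f) ** 2

f_X = {1: 0.1, 2: 0.13, 3: 0.17, 4: 0.2, 5: 0.17, 6: 0.13, 7: 0.1}
f_Y = {1: 0.01, 2: 0.09, 3: 0.2, 4: 0.4, 5: 0.2, 6: 0.09, 7: 0.01}

print(round(var_by_definition(f_X), 4))  # 3.18
print(round(var_by_shortcut(f_Y), 4))    # 1.3
```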
Let’s revisit Example 3. Recall that E[X] = E[Y] = 4.

u                1    2    3    4    5    6    7
(u − 4)²         9    4    1    0    1    4    9
fX(u)            0.1  0.13 0.17 0.2  0.17 0.13 0.1
(u − 4)² fX(u)   0.90 0.52 0.17 0.00 0.17 0.52 0.90    Var(X) = 3.18
fY(u)            0.01 0.09 0.2  0.4  0.2  0.09 0.01
(u − 4)² fY(u)   0.09 0.36 0.20 0.00 0.20 0.36 0.09    Var(Y) = 1.30

[Figure: plots of the probability functions fX(u) and fY(u) for u = 1, 2, . . . , 7, each marked with its mean.]

The variance of Y is smaller, which means the notion of variance can indeed be used to measure how “concentrated” a random variable is: the smaller the variance, the more concentrated (on the mean value) the random variable is.
Two important examples on variance

Let us look at an extreme case first.

Example. Suppose the random variable X takes only one value, c. That is, fX(c) = P(X = c) = 1. Then we have

    µ = E[X] = c fX(c) = c × 1 = c;
    σ² = Var(X) = (c − µ)² fX(c) = 0 × 1 = 0.

In this case, X does not vary at all!
Example 4
Let X be a discrete random variable and let c be a constant. Prove that
(i) E[X + c] = E[X] + c
(ii) Var(X + c) = Var(X)

Proof.
We have

    E[X + c] = Σu (u + c) P(X = u)
             = Σu u P(X = u) + c Σu P(X = u) = E[X] + c.

    Var(X + c) = Σu (u + c − E[X + c])² P(X = u)
               = Σu (u − E[X])² P(X = u) = Var(X).
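Example 4 can also be verified numerically. The sketch below shifts the Example 2 weight distribution by an arbitrary constant c = 5 (my choice, not from the slides) and checks that the mean shifts by c while the variance is unchanged:

```python
def mean(f):
    return sum(u * p for u, p in f.items())

def var(f):
    mu = mean(f)
    return sum((u - mu) ** 2 * p for u, p in f.items())

# Probability function of W from Example 2.
f_W = {7: 0.1, 8: 0.3, 9: 0.4, 10: 0.2}

c = 5  # arbitrary constant
f_shifted = {u + c: p for u, p in f_W.items()}  # distribution of W + c

assert abs(mean(f_shifted) - (mean(f_W) + c)) < 1e-9  # E[W + c] = E[W] + c
assert abs(var(f_shifted) - var(f_W)) < 1e-9          # Var(W + c) = Var(W)
```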
Example 5
We toss a fair die and let X denote the number obtained. Find E[X] and Var(X).

Solution.

    E[X] = Σ_{i=1}^{6} i × P(X = i)
         = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6 = 21/6 = 3.5.

    Var(X) = Σ_{i=1}^{6} (i − 3.5)² × P(X = i)
           = (1 − 3.5)² × 1/6 + (2 − 3.5)² × 1/6 + (3 − 3.5)² × 1/6
             + (4 − 3.5)² × 1/6 + (5 − 3.5)² × 1/6 + (6 − 3.5)² × 1/6 = 2.9167.
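A quick numeric check of Example 5 (a sketch; exact fractions show the values 7/2 and 35/12 behind the decimals):

```python
from fractions import Fraction

# Fair die: each face 1..6 has probability 1/6.
f = {i: Fraction(1, 6) for i in range(1, 7)}

mean_X = sum(i * p for i, p in f.items())                    # E[X]
var_X = sum((i - mean_X) ** 2 * p for i, p in f.items())     # Var(X)

print(mean_X)  # 7/2, i.e. 3.5
print(var_X)   # 35/12, approximately 2.9167
```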
Example 6 (Q4(a), Semester 1, 2017/2018)
The following table lists the probability distribution of the number of breakdowns per week for a machine, based on past data:

Breakdowns per week   0     1     2    3
Probability           c     0.25  0.2  0.3

Find the value of c, the mean µ, and the variance σ² for this probability distribution.

Solution. As

    1 = c + 0.25 + 0.2 + 0.3,

we have c = 0.25. The mean value is

    µ = 0 × 0.25 + 1 × 0.25 + 2 × 0.2 + 3 × 0.3 = 1.55.

The variance is

    σ² = (0 − 1.55)² × 0.25 + (1 − 1.55)² × 0.25 + (2 − 1.55)² × 0.2 + (3 − 1.55)² × 0.3 = 1.3475.
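Example 6 can be reproduced in a few lines (a sketch; the variable names are illustrative):

```python
# Given probabilities for 1, 2, 3 breakdowns; c = P(0 breakdowns) is unknown.
known = {1: 0.25, 2: 0.2, 3: 0.3}

# Probabilities must sum to 1, which pins down c.
c = 1 - sum(known.values())
f = {0: c, **known}

mu = sum(u * p for u, p in f.items())
sigma2 = sum((u - mu) ** 2 * p for u, p in f.items())

print(round(c, 4), round(mu, 4), round(sigma2, 4))  # 0.25 1.55 1.3475
```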
Example 7 (Q4(a), Semester 2, 2017/2018)
Suppose X is a random variable that can only assume the values 1, 2, 3, and 4, with probability distribution P(X = k) = k/10 + c for k = 1, 2, 3, 4, where c is an unknown constant. Find the value of c, and then find the mean µ and standard deviation σ for this probability distribution.

Solution. Solving

    1 = (1/10 + c) + (2/10 + c) + (3/10 + c) + (4/10 + c) = 1 + 4c,

we get c = 0. Then

    µ = Σ_{k=1}^{4} k × k/10 = 3,

    σ² = Σ_{k=1}^{4} (k − 3)² × k/10 = 1,

so the standard deviation is σ = 1.
§11.1.1 The Binomial Probability Distribution

Suppose we flip a coin with two outcomes, H and T, standing for head and tail respectively.

Denote the probabilities P(H) = p and P(T) = q. Hence p + q = 1.

We do the experiment 3 times and denote the outcomes by the random variables X, Y, Z, respectively.

Question. What is the probability of the outcome (X = H, Y = T, Z = H)?

Solution. Since the three flips are independent,

    P(X = H, Y = T, Z = H) = P(X = H) P(Y = T) P(Z = H) = p² q.
A binomial experiment (a fixed number of independent Bernoulli trials) has the following properties:
• A fixed number of trials, say n.
• Each trial has only two possible outcomes, conventionally labelled “success” and “failure”.
• The probability of success is p and the probability of failure is q = 1 − p.
• The trials are independent; in other words, the outcome of one trial does not affect the outcome of any other trial.
Example: the coin-flipping example on the previous page.

Define a random variable X as the number of successes in the n trials of a binomial experiment.

Clearly, X can take the values 0, 1, . . . , n.

We want to calculate P(X = k) for k = 0, 1, 2, . . . , n.
Start with n = 1, namely, only one trial. Clearly,

    P(X = 0) = P(failure) = 1 − p = q,    P(X = 1) = P(success) = p.

Now let n = 2 (2 trials):

    P(X = 0) = P(failure, failure) = P(failure) × P(failure) = q²

    P(X = 1) = P(success, failure OR failure, success)
             = P(success, failure) + P(failure, success) = 2pq

    P(X = 2) = P(success, success) = P(success) × P(success) = p²

What if n = 10 (10 trials)? For example, how do we calculate the probability P(X = 4)?
Definition 3 (Definition 11.1.15)
In general, suppose we perform n independent trials, each of which succeeds with probability p. Let X denote the total number of successes in the n trials and write q = 1 − p. Then X ∈ {0, 1, . . . , n} and

    P(X = k) = C(n, k) p^k q^(n−k),

where C(n, k) = n! / (k!(n − k)!) is the binomial coefficient. We say that X follows a binomial distribution, and we write X ∼ Binomial(n, p).

We have the following theorem.

Theorem 1
If X ∼ Binomial(n, p) with q = 1 − p, then

    E[X] = np and Var(X) = npq.
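Definition 3 and Theorem 1 can be checked directly: compute the pmf with the formula, then verify that the probabilities sum to 1 and that E[X] = np and Var(X) = npq (a sketch, using Binomial(10, 0.3) from the figure on the next page):

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]

total = sum(pmf)
mean = sum(k * q for k, q in enumerate(pmf))
var = sum(k ** 2 * q for k, q in enumerate(pmf)) - mean ** 2

# Theorem 1: E[X] = np = 3 and Var(X) = npq = 2.1 for Binomial(10, 0.3).
assert abs(total - 1) < 1e-12
assert abs(mean - n * p) < 1e-9
assert abs(var - n * p * (1 - p)) < 1e-9
```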
[Figure: probability functions P(x), x = 0, . . . , 10, for four binomial distributions:
X ∼ Binomial(10, 0.3) ⇒ E[X] = 3, Var(X) = 2.1;
X ∼ Binomial(10, 0.5) ⇒ E[X] = 5, Var(X) = 2.5;
X ∼ Binomial(10, 0.7) ⇒ E[X] = 7, Var(X) = 2.1;
X ∼ Binomial(10, 0.9) ⇒ E[X] = 9, Var(X) = 0.9.]
Example 8
Tom flips a fair coin four times. Let X be the number of “Heads” he will get. Find P(X = 0), P(X = 1), P(X = 3), E[X], and Var(X).

Solution. We have X ∼ Binomial(4, 0.5), so

    P(X = 0) = C(4, 0) (1/2)^0 (1/2)^4 = (4!/(0!4!)) (1/2)^4 = 0.0625,
    P(X = 1) = C(4, 1) (1/2)^1 (1/2)^3 = (4!/(1!3!)) (1/2)^4 = 0.25,
    P(X = 3) = C(4, 3) (1/2)^3 (1/2)^1 = (4!/(3!1!)) (1/2)^4 = 0.25,
    E[X] = np = 4 × 0.5 = 2,
    Var(X) = np(1 − p) = 4 × 0.5 × (1 − 0.5) = 1.
Example 9
If 80 percent of the balls in a box are red, and we randomly draw 10 balls with replacement, what is the probability that no more than 8 red balls are obtained?

Solution.
• Let X be the number of red balls we would get.
• Then X ∼ Binomial(10, 0.8).
• We have

    P(X ≤ 8) = 1 − P(X = 10) − P(X = 9)
             = 1 − C(10, 10) × 0.8^10 × 0.2^0 − C(10, 9) × 0.8^9 × 0.2^1
             = 0.6242.
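The complement trick in Example 9, sketched in Python (two terms instead of summing nine):

```python
from math import comb

n, p = 10, 0.8  # 10 draws with replacement, P(red) = 0.8

def pmf(k):
    """P(X = k) for X ~ Binomial(10, 0.8)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# P(X <= 8) = 1 - P(X = 9) - P(X = 10).
prob = 1 - pmf(9) - pmf(10)
print(round(prob, 4))  # 0.6242
```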
§11.1.2 Poisson Distribution

A Poisson random variable is usually used to model:

1. the number of customers arriving at a counter during a fixed time span;
2. the number of misprints on a page of a book;
3. the number of accidents in a manufacturing plant per week;
4. the number of phone calls received at an exchange in an hour;
5. the number of cars passing through the Hung Hom Tunnel per day.
Suppose we are given an interval (this could be time, length, area or volume) and we are interested in the number of “successes” in that interval (see the examples on the previous page).

Assume that the interval can be divided into very small subintervals such that:
• the probability of more than one success in any subinterval is zero;
• the probability of one success in a subinterval is constant for all subintervals and is proportional to its length;
• what happens in a subinterval is independent of any other subinterval.

Let the random variable X denote the number of successes in the whole interval, and let λ be the mean number of successes in the whole interval.

It turns out that X follows a Poisson distribution, i.e., X ∼ Poisson(λ).
Definition 4 (Definition 11.1.20, Poisson Distribution)
A discrete random variable X is said to have the Poisson distribution with parameter λ > 0 if for every integer k = 0, 1, 2, . . .,

    P(X = k) = e^(−λ) λ^k / k!.

Theorem 2 (Theorem 11.1.23)
If X ∼ Poisson(λ), then

    E[X] = Var(X) = λ.
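Theorem 2 can be verified numerically by truncating the infinite sums (a sketch; the cutoff 100 is my choice and leaves a negligible tail for λ = 2):

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

lam = 2.0
pmf = [poisson_pmf(lam, k) for k in range(101)]  # tail beyond k = 100 is negligible

mean = sum(k * p for k, p in enumerate(pmf))
var = sum(k ** 2 * p for k, p in enumerate(pmf)) - mean ** 2

# Theorem 2: E[X] = Var(X) = lambda.
assert abs(mean - lam) < 1e-9
assert abs(var - lam) < 1e-9
```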
Example 10
Suppose X ∼ Poisson(2). Find
1. P(X = 3)
2. P(X ≤ 3)
3. P(X > 3)

Solution. (1).

    P(X = 3) = e^(−2) × 2³/3! = 0.1804.

(2).

    P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
             = e^(−2) (2⁰/0! + 2¹/1! + 2²/2! + 2³/3!) = 0.8571.

(3). P(X > 3) = 1 − P(X ≤ 3) = 0.1429.
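Example 10, reproduced in a short sketch:

```python
from math import exp, factorial

lam = 2  # X ~ Poisson(2)

def pmf(k):
    """P(X = k) for X ~ Poisson(2)."""
    return exp(-lam) * lam ** k / factorial(k)

p3 = pmf(3)
p_le_3 = sum(pmf(k) for k in range(4))  # P(X <= 3)
p_gt_3 = 1 - p_le_3                     # P(X > 3) by the complement

print(round(p3, 4), round(p_le_3, 4), round(p_gt_3, 4))  # 0.1804 0.8571 0.1429
```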
Example 11
A washing machine in a laundry breaks down an average of three times per month. Using the Poisson distribution to model the number of times it breaks down, find the probability that during the next month this machine will have
(a). exactly two breakdowns;
(b). at most one breakdown.

Solution.
• Let X be the number of times the machine breaks down in one month.
• Suppose X ∼ Poisson(λ). Then λ = E[X] = 3, so X ∼ Poisson(3).
• (a). P(X = 2) = 3² e^(−3) / 2! = 0.2240.
• (b). P(X ≤ 1) = P(X = 0) + P(X = 1) = e^(−3) (3⁰/0! + 3¹/1!) = 0.1991.
Example 12
For a certain manufacturing industry, the number of industrial accidents averages three per week. Assume accidents occur independently of each other. Find the probability that:
(a). at most four accidents occur in a given week;
(b). exactly 4 accidents occur in TWO weeks.

Solution:
(a) Let X be the number of accidents that occur in a given week. Then X ∼ Poisson(3), so

    P(X ≤ 4) = Σ_{k=0}^{4} 3^k e^(−3) / k! = 0.8153.

(b) Let Y be the number of accidents that occur in TWO weeks. Then Y ∼ Poisson(2λ) = Poisson(6):

    P(Y = 4) = e^(−6) × 6⁴/4! = 0.1339.
Example 13 (Q4(b), Semester 1, 2017/2018)
The average number of accidents at a crossroad every month is 4. Suppose that the number of accidents per month at the crossroad follows the Poisson distribution.
(i) Calculate the probability that there are exactly 3 accidents this month.
(ii) What is the probability that exactly 2 of the coming 5 months have 3 accidents each?
Solution:
(i) Let X be the number of accidents per month at the crossroad. Then X ∼ Poisson(4), so

    P(X = 3) = e^(−4) × 4³/3! ≈ 0.1954.
(ii) Let Y be the number of months (out of the coming 5) in which there are exactly 3 accidents. Then Y ∼ Binomial(5, 0.1954). (Notice that 0.1954 is the probability obtained in item (i) above: the 5 coming months can be regarded as 5 independent trials, and a “success” in a trial (month) means that there are exactly 3 accidents in that month.) Hence,

    P(Y = 2) = C(5, 2) × 0.1954² × (1 − 0.1954)³ ≈ 0.1989.
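The two-stage structure of Example 13 (Poisson within each month, then Binomial across months) can be sketched as:

```python
from math import comb, exp, factorial

# Stage 1: accidents in one month ~ Poisson(4); "success" = exactly 3 accidents.
lam = 4
p_success = exp(-lam) * lam ** 3 / factorial(3)

# Stage 2: number of "success" months out of 5 ~ Binomial(5, p_success).
p_two_of_five = comb(5, 2) * p_success ** 2 * (1 - p_success) ** 3

print(round(p_success, 4))      # 0.1954
print(round(p_two_of_five, 3))  # 0.199
```

Carrying the unrounded stage-1 probability through gives about 0.1988; the slide's 0.1989 comes from using the rounded value 0.1954.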
Example 14 (Q4(a), Semester 2, 2018/2019)
Defects occur at random in planks of wood at a constant rate of 0.5 per 10 cm of length. Timothy buys a plank of length 100 cm.
(i) Find the probability that Timothy’s plank contains at most 3 defects.
(ii) Timothy’s friend George buys 8 planks, each of length 100 cm. Find the probability that more than 6 of George’s planks contain at most 3 defects.

Solution:
(i) Let X be the number of defects in a plank of 100 cm. Then X ∼ Poisson(0.5 × 100/10) = Poisson(5). Hence

    P(X ≤ 3) = e^(−5) (1 + 5/1! + 5²/2! + 5³/3!) ≈ 0.2650.
(ii) Let Y be the number of George’s planks that contain at most 3 defects. Then Y ∼ Binomial(8, 0.2650). We have

    P(Y > 6) = P(Y = 7) + P(Y = 8)
             = C(8, 7) × 0.2650⁷ × (1 − 0.2650)¹ + C(8, 8) × 0.2650⁸ × (1 − 0.2650)⁰
             ≈ 0.00056.
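Example 14 combines the same two ingredients as Example 13; a numeric check (a sketch) confirms P(X ≤ 3) ≈ 0.2650 for Poisson(5) and the final answer:

```python
from math import comb, exp, factorial

# Defects in a 100 cm plank ~ Poisson(5) (rate 0.5 per 10 cm).
lam = 5
p_at_most_3 = sum(exp(-lam) * lam ** k / factorial(k) for k in range(4))

# Number of George's 8 planks with at most 3 defects ~ Binomial(8, p_at_most_3).
p = p_at_most_3
p_more_than_6 = comb(8, 7) * p ** 7 * (1 - p) + p ** 8

print(round(p_at_most_3, 4))    # 0.265
print(round(p_more_than_6, 5))  # 0.00056
```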
