Chapter six

Elementary Probability
1.1. Definition of basic terms of probability

Random experiment: a process of measurement or observation that can be repeated any number of times and whose outcome cannot be predicted with certainty. E.g. tossing a coin.
Outcome: a particular result of an experiment (the result of a single trial of an experiment).
Sample space: the set of all possible outcomes of a random experiment. Each possible outcome is called a sample point.
Event: a subset of a sample space (one or more outcomes of an experiment).
Example 1: if we toss a coin, the sample space (S) of this experiment is
S = {head, tail}, where head and tail are the two faces of a coin. If we are interested in the outcome that a head turns up, then the event is E = {head}.
Example 2: the sample space of tossing a coin twice is
S = {HH, HT, TH, TT}
Elementary or simple event: an event that has only one sample point.
Mutually exclusive events: two events E1 and E2 are said to be mutually exclusive if there is no sample point which is common to E1 and E2, that is,

$E_1 \cap E_2 = \emptyset$
i.e., if E1 and E2 are mutually exclusive events, then $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
Independent events: two events E1 and E2 are said to be independent if the occurrence or non-occurrence of one does not affect the probability of occurrence of the other.
Equally likely outcomes: outcomes are equally likely if each outcome in a sample space has the same chance of occurring.
Example: In throwing a fair die all possible outcomes are equally likely. That means the elements of the sample space have an equal chance of occurring.
Definition of probability
Probability: the chance (likelihood) of occurrence of an event. It is expressed by a numerical value between 0 and 1 inclusive. Probability is a building block of inferential statistics.
Counting techniques:
In order to determine the number of outcomes one can use several rules of counting.
1. Multiplication rule: in a sequence of n events in which the first event has k1 possibilities, the second event has k2 possibilities, ..., and the nth event has kn possibilities, the total number of possibilities of the sequence is k1 * k2 * ... * kn.


Example: the personnel department of a large corporation wishes to issue each employee an ID card with two letters followed by two digits. How many different ID cards can be issued?
Solution

K1    K2    K3    K4
26    26    10    10

Thus the total number of ID cards that could be issued is:
26*26*10*10 = 67600 (with repetition)
26*25*10*9 = 58500 (without repetition)
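The multiplication rule in the ID-card example can be checked with a few lines of Python. This is only an illustrative sketch; the function name `count_ids` and the constants are assumptions introduced here, not part of the notes.

```python
# Multiplication rule for the ID-card example:
# two letters followed by two digits.
LETTERS = 26  # possibilities for each letter position
DIGITS = 10   # possibilities for each digit position


def count_ids(with_repetition: bool) -> int:
    """Number of possible ID cards (hypothetical helper for this example)."""
    if with_repetition:
        return LETTERS * LETTERS * DIGITS * DIGITS
    # Without repetition the second letter and second digit have one
    # fewer choice than the first.
    return LETTERS * (LETTERS - 1) * DIGITS * (DIGITS - 1)


print(count_ids(True))   # 67600
print(count_ids(False))  # 58500
```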
2. Permutation: an arrangement of n objects in a specific order. In this case order is crucial.
a) The number of permutations of n objects taken all together is n!, i.e. nPn = n! / (n-n)! = n! / 0! = n!
b) The number of arrangements of n distinct objects in a specific order taking r objects at a time is given by
nPr = n!/(n-r)! = n(n-1)(n-2)...(n-r+1)
c) The number of permutations of n objects in which k1 are alike, k2 are alike, ..., kn are alike is
n! / (k1! k2! ... kn!)
Example: a photographer wants to arrange 3 persons in a row for a photograph. How many different photographs are possible?
Solution:
Assume the 3 persons are Aster (A), Lemma (L) and Yared (Y), so n = 3.
Since n! = 3! = 3*2*1 = 6, there are 6 possible arrangements: ALY, AYL, LAY, LYA, YLA and YAL.
Example 2: fifteen athletes including Haile were entered in a race.
a) In how many different ways could prizes for the first, the second and the third place be awarded?
b) How many of the triplets counted above have Haile in the first position?
Solution:
a) 15 objects taken 3 at a time: 15P3 = 15! / (15-3)! = 2730
b) With Haile fixed in first place, the other two places are filled from the remaining 14 athletes: 14P2 = 14! / (14-2)! = 182
3. Combination: a counting technique in which the order of the objects is immaterial, i.e. the selection of r objects from a collection of n objects, where r <= n, without regard to order. The number of combinations of n objects taken r at a time is given by
nCr = n! / ((n-r)! r!)
Example: In a club containing 7 members a committee of 3 people is to be formed. In how many ways can the committee be formed?
Solution: 7C3 = 7! / ((7-3)! 3!) = 35
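The permutation and combination examples above can be reproduced with Python's standard library. The sketch below assumes Python 3.8 or later (for `math.perm` and `math.comb`); the variable names are illustrative only.

```python
import math
from itertools import permutations

# Photograph example: all orderings of Aster (A), Lemma (L), Yared (Y).
people = ["A", "L", "Y"]
photos = ["".join(p) for p in permutations(people)]
print(len(photos), photos)

# Race example: ordered prizes among 15 athletes.
prizes_total = math.perm(15, 3)        # 15P3
prizes_haile_first = math.perm(14, 2)  # Haile fixed in first place: 14P2

# Committee example: unordered selection of 3 members out of 7.
committees = math.comb(7, 3)           # 7C3

print(prizes_total, prizes_haile_first, committees)
```

Enumerating the photographs with `itertools.permutations` lists the arrangements themselves, while `math.perm` and `math.comb` only count them.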
Basic approaches to probability
Classical approach: uses the sample space to determine the numerical probability that an event will happen. If there are n equally likely outcomes of an experiment, and event E occurs in k of those n outcomes, the probability of the event E, denoted by P(E), is defined as
P (E) = n (E)/ n(S) = k/n
Deficiencies of the classical approach. It cannot be used:
- If the total number of outcomes is infinite, or if it is not possible to enumerate all elements of the sample space.
- If the outcomes are not equally likely.
Example: in the experiment of tossing a coin and a die together, find the probability of the event E consisting of a head and an even number.
Solution: S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}, then
E = {H2, H4, H6}; thus, P (E) = n (E)/n(S) = 3/12 = 1/4
Let S be the sample space of an experiment. P is called a probability function if it satisfies the following conditions:
$0 \le P(A) \le 1$ for each event A, where P(A) is called the probability of A
P(S) = 1
Note: If A and B are mutually exclusive events, then $P(A \cup B) = P(A) + P(B)$.
Similarly, if $A_1, A_2, \dots, A_n$ are pairwise mutually exclusive events, then
$P\left(\bigcup_{i=1}^{n} A_i\right) = P(A_1) + P(A_2) + \dots + P(A_n) = \sum_{i=1}^{n} P(A_i)$

Relative frequency approach (empirical approach): suppose we repeat a certain experiment n times, let A be an event of the experiment, and let k be the number of times that event A occurs. Then the ratio k/n is called the relative frequency of event A.
$P(A) = \frac{\text{number of times event A has occurred}}{\text{total number of observations}} = \frac{k}{n}$
In other words, given a frequency distribution, the probability of an event (E) being in a given class is
$P(E) = \frac{\text{frequency of the class}}{\text{total frequency in the distribution}}$

Example: the National Center for Health Statistics reported that out of every 539 deaths in recent years, 24 resulted from automobile accidents, 182 from cancer, and 333 from other diseases. What is the probability that a particular death is due to an automobile accident?
Solution: P (automobile) = deaths due to automobile accidents / total deaths = 24/539
Rules of probability
Rule 1: let A be an event and A' be the complement of A with respect to a given sample space of an experiment; then P(A') = 1 - P(A).
Proof: let S be the sample space. Then
$S = A \cup A'$ and $A \cap A' = \emptyset$, so $P(A \cap A') = 0$
$P(S) = P(A \cup A') = P(A) + P(A') - P(A \cap A')$
$1 = P(A) + P(A') - 0$
$1 = P(A) + P(A')$
$P(A') = 1 - P(A)$

Rule 2: let A and B be events of a sample space S; then
$P(A' \cap B) = P(B) - P(A \cap B)$
Proof: $B = S \cap B = (A \cup A') \cap B = (A \cap B) \cup (A' \cap B)$
Case 1: if $A \cap B \ne \emptyset$, then $P(B) = P(A \cap B) + P(A' \cap B)$
$\Rightarrow P(A' \cap B) = P(B) - P(A \cap B)$
Case 2: if $A \cap B = \emptyset$, then $P(B) = P(A \cap B) + P(A' \cap B)$, and since $P(A \cap B) = P(\emptyset) = 0$
$\Rightarrow P(B) = P(A' \cap B)$
Rule 3: Suppose A and B are two events of a sample space; then
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Example: A fair die is thrown twice. Calculate the probability that the sum of the spots on the faces that turn up is divisible by 2 or 3.
Solution:
S = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
(3,1),(3,2),(3,3),(3,4),(3,5),(3,6),(4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
(5,1),(5,2),(5,3),(5,4),(5,5),(5,6),(6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}
This sample space has 6*6 = 36 elements. Let E1 be the event that the sum of the spots is divisible by 2 and E2 be the event that the sum of the spots is divisible by 3; then
$P(E_1 \text{ or } E_2) = P(E_1 \cup E_2)$
$= P(E_1) + P(E_2) - P(E_1 \cap E_2)$
= 18/36 + 12/36 - 6/36 = 24/36 = 2/3
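The addition rule in the two-dice example can be verified by enumerating the sample space. The following is a minimal sketch; the helper `prob` and the set names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of throwing a fair die twice.
sample_space = list(product(range(1, 7), repeat=2))

# E1: sum divisible by 2, E2: sum divisible by 3.
e1 = {o for o in sample_space if sum(o) % 2 == 0}
e2 = {o for o in sample_space if sum(o) % 3 == 0}


def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(sample_space))


p_union_direct = prob(e1 | e2)                      # count the union directly
p_union_rule = prob(e1) + prob(e2) - prob(e1 & e2)  # Rule 3

print(prob(e1), prob(e2), prob(e1 & e2))
print(p_union_direct, p_union_rule)
```

Both routes give the same value, which is the point of Rule 3: outcomes in both events would otherwise be counted twice.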


Conditional probability: the conditional probability of an event A given an event B is defined as the probability that event A occurs given that event B has already occurred.
$P(A/B) = P(A \cap B) / P(B)$, where P(B) > 0
Remark: (i) $P(A \cap B)$ and P(B) are computed with respect to the original sample space.
(ii) $P(S/B) = P(S \cap B)/P(B) = P(B)/P(B) = 1$
(iii) P(B/S) = P(B), because $P(B/S) = P(B \cap S)/P(S) = P(B)/1 = P(B)$
(iv) If A and B are independent events, then P(A/B) = P(A) and P(B/A) = P(B). Two events are independent if the occurrence of B doesn't affect the occurrence of A. From $P(A/B) = P(A \cap B)/P(B)$ we get
$P(A \cap B) = P(A/B) * P(B)$, but P(A/B) = P(A).
Hence $P(A \cap B) = P(A) * P(B)$

Example: Suppose that an office has 100 calculating machines. Some of them use electric power (E) while others are manual (M), and some machines are new (N) while others are used (U). The table below gives the number of machines in each category. A person enters the office, picks a machine at random and discovers that it is new. What is the probability that it uses electric power?

         E     M     Total
N        40    30    70
U        20    10    30
Total    60    40    100

Solution: $P(E/N) = P(E \cap N) / P(N)$ = (40/100) / (70/100) = 40/70 = 4/7
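The conditional probability in the calculating-machine example can be computed directly from the cell counts. This is a sketch only; the dictionary layout and the helper `p` are assumptions introduced for illustration.

```python
from fractions import Fraction

# Cell counts keyed by (power source, condition):
# E = electric, M = manual, N = new, U = used.
counts = {("E", "N"): 40, ("M", "N"): 30, ("E", "U"): 20, ("M", "U"): 10}
total = sum(counts.values())


def p(pred):
    """Probability of the event described by the predicate `pred`."""
    return Fraction(sum(c for k, c in counts.items() if pred(k)), total)


p_n = p(lambda k: k[1] == "N")           # P(N)
p_e_and_n = p(lambda k: k == ("E", "N"))  # P(E and N)
p_e_given_n = p_e_and_n / p_n             # P(E/N)

print(p_n, p_e_and_n, p_e_given_n)
```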

Chapter seven
Probability Distributions
Probability distribution: a list of all the possible outcomes of an experiment and the probability associated with each outcome.
Example: Suppose we are interested in the number of heads showing face up on 3 tosses of a coin. This is the experiment, and the possible outcomes are 0 heads, 1 head, 2 heads, and 3 heads. What is the probability distribution for the number of heads?
Solution: The experiment has 8 possible outcomes, and below is the list of all the outcomes.

Possible result   1st toss   2nd toss   3rd toss   No. of heads
1                 T          T          T          0
2                 T          T          H          1
3                 T          H          T          1
4                 T          H          H          2
5                 H          T          T          1
6                 H          T          H          2
7                 H          H          T          2
8                 H          H          H          3

From the above table, the probability distribution for the number of heads is

No. of heads, x   P(x)
0                 1/8
1                 3/8
2                 3/8
3                 1/8
Total             1
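The distribution of the number of heads in 3 tosses can be rebuilt by listing the 8 outcomes and counting. The sketch below is illustrative, and its variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2*2*2 = 8 equally likely outcomes of three coin tosses.
outcomes = list(product("TH", repeat=3))

# Count how many outcomes give each number of heads.
heads = Counter(o.count("H") for o in outcomes)

# Probability distribution of x = number of heads.
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(heads.items())}

for x, px in distribution.items():
    print(x, px)
print("Total", sum(distribution.values()))
```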

6.1. Random variables. A random variable is a quantity resulting from an experiment that can assume different values. In any experiment of chance, the outcomes occur randomly. For example, rolling a single die is an experiment, and any one of the six possible outcomes can occur at a time.
A random variable may be either discrete or continuous.
i. Discrete random variable: a variable that results from counting and can assume only certain clearly separated values.
Example: The number of heads in flipping a fair coin 5 times.
ii. Continuous random variable: a variable that results from measuring and can take any value within a certain range of values.
Example: The distance between Sodo and Addis Ababa could be 330 km, 330.5 km, 331.5 km, and so on, depending on the accuracy of our measuring device.
6.2. Discrete probability distributions (probability mass function), expectation and variance
of discrete random variable
If we organize the values of a discrete random variable and their probabilities in a probability distribution, the distribution is called a discrete probability distribution; it is also called a probability mass function (pmf). It can be summarized by its mean and variance.
Mean: The mean of a probability distribution is also referred to as the expected value, E (x), and is given by
$\mu = E(x) = \sum \left(x \, p(x)\right)$
where p(x) is the probability of each possible value of the random variable x.
Variance & standard deviation: Though the mean is a typical value used to summarize a discrete probability distribution, it does not describe the spread of the distribution; the variance does this.
$\sigma^2 = \sum \left((x - \mu)^2 p(x)\right) = \sum x^2 p(x) - \mu^2$
Standard deviation: $\sigma = \sqrt{\text{variance}}$
Example: the following is the probability distribution for the number of cars a company expects to sell on a particular day.

No. of cars sold, x   Probability, P(x)
0                     0.1
1                     0.2
2                     0.3
3                     0.3
4                     0.1
Total                 1.0

1. What type of distribution is it?
2. On a typical day, how many cars does the company expect to sell?
3. What is the variance of the distribution? What is the standard deviation?
Solution:
1. It is a discrete probability distribution.
2. $\mu = E(x) = \sum \left(x \, p(x)\right)$
= 0(0.1) + 1(0.2) + 2(0.3) + 3(0.3) + 4(0.1)
= 2.1
Interpretation: Over a large number of days, the company expects to sell 2.1 cars a day. Of course, it is not possible to sell exactly 2.1 cars on any particular day; the mean is a long-run average, and is therefore sometimes called the expected value.
3. $\sigma^2 = \sum x^2 p(x) - \mu^2 = (0^2(0.1) + 1^2(0.2) + \dots + 4^2(0.1)) - (2.1)^2 = 5.7 - 4.41 = 1.29$
$\sigma = \sqrt{1.29} = 1.136$
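The mean, variance and standard deviation of the car-sales distribution can be computed as follows. This is a small sketch using floating-point arithmetic; the dictionary `pmf` is simply the table above written as code.

```python
import math

# Probability mass function of x = number of cars sold per day.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.3, 4: 0.1}

# Expected value: sum of x * p(x).
mean = sum(x * p for x, p in pmf.items())

# Variance via the shortcut formula: sum of x^2 * p(x), minus mean^2.
variance = sum(x**2 * p for x, p in pmf.items()) - mean**2
std_dev = math.sqrt(variance)

print(round(mean, 3), round(variance, 3), round(std_dev, 3))
```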

6.3. Common discrete probability distributions


1. Binomial distribution.
It is used to represent the probability distribution of discrete random variables. Binomial means
two categories. The successive repetition of an observation (trial) may result in an outcome
which possesses or which does not possess a specified character. Our primary interest will be
either of these possibilities. Conventionally, the outcome of primary interest is termed as
success. The alternative outcome is termed as failure. These terminologies are used irrespective
of the nature of the outcome. For example, non-germination of a seed may be termed as success.
Properties:
1. There must be only two mutually exclusive outcomes: success or failure.
2. The probability of success, p, and the probability of failure, q = 1-p, remain constant from one trial to another.
3. The probability of success in one trial is totally independent of any other trial.
4. The experiment can be repeated many times.
Example: The coin flip experiment has only two possible outcomes: head or tail. The probability
of each is known and constant from one trial to another. We can flip a coin many times.
The binomial distribution is computed by
$P(x) = {}_nC_x \, p^x q^{n-x}$
where
C = combination
n = number of trials
x = number of successes
p = the probability of success
q = 1-p = the probability of failure
Mean of a binomial distribution
$\mu = np$
Variance of a binomial distribution
$\sigma^2 = npq$
Example: There are 5 flights daily from Addis Ababa to Washington, suppose the probability that
any flight arrives late is 0.2. What is the probability that
a. None of the flights are late today?
b. Exactly one flight is late today?
c. Construct the entire probability distribution
d. What is the probability that more than 4 flights are late?
e. Between 2 and 4 (inclusive) flights are late?
f. What is the mean?
g. What is the variance?
Solution: the probability that a particular flight is late is 0.2, and thus the probability that a particular flight is not late is 0.8. There are 5 flights, so n = 5, and x refers to the number of successes. In questions a to e we are asked about the late flights, so here let success = late flight. Then p = 0.2 and q = 0.8.
a. P (none of the flights are late today) = P (0 flights are late) = P (x = 0)
$P(x) = {}_nC_x \, p^x q^{n-x}$
$P(0) = {}_5C_0 (0.2)^0 (0.8)^{5-0} = 0.3277$
b. P (exactly one flight is late today) = P (1 flight is late) = P (x = 1)
$P(1) = {}_5C_1 (0.2)^1 (0.8)^{5-1} = 0.4096$
c. The entire distribution is

Number of late flights, x   P (x)
0                           0.3277
1                           0.4096
2                           0.2048
3                           0.0512
4                           0.0064
5                           0.0003
Total                       1.0000

d. P (x > 4) = P (x = 5) = 0.0003
e. $P(2 \le x \le 4)$ = P (x = 2) + P (x = 3) + P (x = 4) = 0.2048 + 0.0512 + 0.0064 = 0.2624
f. $\mu = np$ = 5 * 0.2 = 1 late flight (or 5 * 0.8 = 4 flights that are not late)
g. $\sigma^2 = npq$ = 5 * 0.2 * 0.8 = 0.8
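The flight example can be reproduced with a short binomial probability function. The sketch below assumes Python 3.8 or later for `math.comb`; `binom_pmf` is a hypothetical helper name.

```python
import math


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(x) = nCx * p^x * q^(n-x), with q = 1 - p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 5, 0.2  # 5 flights, each late with probability 0.2

# Entire distribution for x = 0, 1, ..., 5 late flights.
dist = [binom_pmf(x, n, p) for x in range(n + 1)]
for x, px in enumerate(dist):
    print(x, round(px, 4))

p_more_than_4 = dist[5]       # part d
p_2_to_4 = sum(dist[2:5])     # part e
mean = n * p                  # part f
variance = n * p * (1 - p)    # part g
```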

2. The Poisson distribution


The Poisson distribution is also used to represent the probability distribution of a discrete
random variable. It is employed in describing random events that occur rarely over some
unit of time or space.
Examples of events where Poisson probability function can be used:
Number of telephone calls per hour
Number of typing errors per page
Number of accidents on a particular road per day
Hospital emergencies per day,
etc
Assumptions:
1. The probability of occurrence of an event is constant for any two intervals of time or
space
2. The occurrence of an event in any interval is independent of the occurrence in any other
interval.
Given these assumptions, the Poisson distribution is given by the function
$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
Where x = the number of times the event has occurred
$\lambda$ = the mean number of occurrences per unit of time or space
e = 2.71828, the base of the natural logarithm system.
Example: Simple observation over the past 80 hours has shown that 800 customers have entered the shop. What is the probability that
a. exactly 5 customers will arrive during any given hour?
b. more than 3 customers will arrive during any given hour?
c. exactly 5 customers will arrive during any 30 minutes?
Solution: $\lambda = \frac{800}{80} = 10$ customers/hour
a. $P(x = 5) = \frac{10^5 e^{-10}}{5!} = \frac{(10)^5 \times 2.71828^{-10}}{5!} = 0.0378$
b. P (x > 3) = P (4) + P (5) + ...
By the complement rule that we have discussed earlier, $P(x > 3) = 1 - P(x \le 3)$
$= 1 - [P(0) + P(1) + P(2) + P(3)]$
$= 1 - \left[\frac{10^0 e^{-10}}{0!} + \frac{10^1 e^{-10}}{1!} + \frac{10^2 e^{-10}}{2!} + \frac{10^3 e^{-10}}{3!}\right]$
$= 1 - 0.0103 = 0.9897$
c. P (x = 5 per 30 minutes)
Here, as we are asked per 30 minutes, we should change the value of $\lambda$ to a rate per 30 minutes; thus
$\lambda = \frac{800}{80} = \frac{10 \text{ customers}}{\text{hour}} = \frac{10 \text{ customers}}{60 \text{ minutes}} = \frac{5 \text{ customers}}{30 \text{ minutes}}$
$P(x = 5) = \frac{5^5 e^{-5}}{5!} = \frac{(5)^5 \times 2.71828^{-5}}{5!} = 0.175$
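The three parts of the shop example can be checked with a direct implementation of the Poisson formula. This is a sketch; `poisson_pmf` is a hypothetical helper, not a library function.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """P(x) = lam^x * e^(-lam) / x!"""
    return lam**x * math.exp(-lam) / math.factorial(x)


lam_hour = 800 / 80  # mean arrivals per hour = 10

# a. exactly 5 customers in an hour
p_a = poisson_pmf(5, lam_hour)

# b. more than 3 customers in an hour, via the complement rule
p_b = 1 - sum(poisson_pmf(x, lam_hour) for x in range(4))

# c. exactly 5 customers in 30 minutes: the rate is halved
lam_half_hour = lam_hour / 2
p_c = poisson_pmf(5, lam_half_hour)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Part c shows why the unit of the rate matters: the same question over a different interval needs $\lambda$ rescaled to that interval first.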
6.4. Continuous probability distribution
A continuous probability distribution is described by a probability density function (pdf).
Let x be a continuous random variable; then the pdf of x is a function f(x) such that for any two numbers a and b with $a \le b$,
$P(a \le x \le b) = \int_a^b f(x)\,dx$
which is the area under the curve bounded by x = a and x = b.
If f(x) is the pdf of x, then
1. $f(x) \ge 0$ for all x
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
i.e. the area under the graph of f(x) must equal 1, since the sum of relative frequencies is 1.
Example: The diameter of an electronic cable, say x, is assumed to be a continuous random variable with pdf f(x) = 6x(1-x), $0 \le x \le 1$
1. Check that f(x) is a pdf
2. Determine number b such that P(0.5<x<0.9)
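So/n: 1. To check that f(x) is a pdf, we should check the two conditions:
i. $f(x) \ge 0$ for all x: on $0 \le x \le 1$ both x and 1-x are non-negative, so $f(x) = 6x(1-x) \ge 0$.
ii. $\int_{-\infty}^{\infty} f(x)\,dx = 1$:
$\int_0^1 6x(1-x)\,dx = \int_0^1 (6x - 6x^2)\,dx = \int_0^1 6x\,dx - \int_0^1 6x^2\,dx = \left[\frac{6x^2}{2}\right]_0^1 - \left[\frac{6x^3}{3}\right]_0^1 = 3 - 2 = 1$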

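Expected value and variance of a continuous random variable:
1. $E(x) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$
2. $\mathrm{Var}(x) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$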

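Example: Calculate E(x) and Var(x) for the following function
f(x) = 2x, $0 \le x \le 1$
So/n: 1. $E(x) = \int_0^1 x f(x)\,dx = \int_0^1 x(2x)\,dx = \int_0^1 2x^2\,dx = \left[\frac{2x^3}{3}\right]_0^1 = \frac{2}{3}$
2. $\mathrm{Var}(x) = \int_0^1 x^2 f(x)\,dx - \mu^2 = \int_0^1 x^2(2x)\,dx - \left(\frac{2}{3}\right)^2 = \int_0^1 2x^3\,dx - \frac{4}{9} = \left[\frac{2x^4}{4}\right]_0^1 - \frac{4}{9} = \frac{2}{4} - \frac{4}{9} = \frac{2}{36} = \frac{1}{18}$
The cumulative distribution function (cdf), F(x)
If x is a continuous random variable with pdf f(x), then
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$, $-\infty < x < \infty$
Properties
1. $0 \le F(x) \le 1$
2. $F'(x) = f(x)$
3. $F(-\infty) = 0$, $F(\infty) = 1$
4. $P(a \le x \le b) = F(b) - F(a)$
Example: Given f(x) = 6x(1-x), $0 \le x \le 1$,
1. Find F(x)
2. What is $P(0.3 \le x \le 0.8)$?
So/n: 1. $F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x 6t(1-t)\,dt = \int_0^x 6t\,dt - \int_0^x 6t^2\,dt = \left[\frac{6t^2}{2}\right]_0^x - \left[\frac{6t^3}{3}\right]_0^x$, $0 \le x \le 1$
=> $F(x) = 3x^2 - 2x^3$
2. $P(0.3 \le x \le 0.8) = F(0.8) - F(0.3) = (3(0.8)^2 - 2(0.8)^3) - (3(0.3)^2 - 2(0.3)^3) = 0.896 - 0.216 = 0.68$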
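The pdf and cdf of the example f(x) = 6x(1-x) can be cross-checked numerically. The sketch below uses a simple midpoint rule written by hand (an assumption made for illustration, not a library routine) and compares it with the closed form $F(x) = 3x^2 - 2x^3$.

```python
def f(x: float) -> float:
    """pdf of the example: 6x(1 - x) on [0, 1], zero elsewhere."""
    return 6 * x * (1 - x) if 0 <= x <= 1 else 0.0


def F(x: float) -> float:
    """Closed-form cdf on [0, 1]: 3x^2 - 2x^3."""
    return 3 * x**2 - 2 * x**3


def integrate(g, a: float, b: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


total_area = integrate(f, 0, 1)         # should be close to 1
p_closed_form = F(0.8) - F(0.3)         # uses property 4 of the cdf
p_numeric = integrate(f, 0.3, 0.8)      # same probability, integrated directly

print(round(total_area, 6), round(p_closed_form, 6), round(p_numeric, 6))
```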

6.5. Common continuous probability distributions


1. Normal distribution (N-distribution)
It is the most important distribution for describing a continuous random variable and is used as an approximation of other distributions. A random variable X is said to have a normal distribution if its probability density function is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, where x is the real value of X,
i.e. $-\infty < x < \infty$, $-\infty < \mu < \infty$ and $\sigma > 0$
Where $\mu = E(X)$
$\sigma^2$ = variance(X)
$\mu$ and $\sigma^2$ are the parameters of the normal distribution.
Properties of Normal Distribution:
1. It is bell shaped and is symmetrical about its mean. The maximum ordinate is at $x = \mu$.
2. The curve approaches the horizontal x-axis as we go in either direction from the mean.
3. Total area under the curve sums to 1, that is
$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 1$
4. The probability that a random variable will have a value between any two points is equal to the area under the curve between those points.
5. The height of the normal curve attains its maximum at $x = \mu$; this implies the mean and mode coincide (are equal).

6.4.2 Standard normal Distribution


It is a normal distribution with mean 0 and variance 1. A normal distribution can be converted to the standard normal distribution as follows. If X has a normal distribution with mean $\mu$ and standard deviation $\sigma$, then the standard normal deviate Z is given by $Z = \frac{X - \mu}{\sigma}$, with density
$P(Z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}z^2}$
Properties of the standard normal distribution:
The same as the normal distribution, but the mean is zero and the variance is one.
Areas under the standard normal distribution curve have been tabulated in various ways. The most common ones are the areas between Z = 0 and a positive value of Z.
Given a normally distributed random variable X with mean $\mu$ and standard deviation $\sigma$,
$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right)$
But $\frac{X - \mu}{\sigma} = Z$, the standard normal r.v., so
$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$
Note: i) P (a<X<b) = P (a<=X<b)
= P (a<X<=b)
= P (a<=X<=b)
ii) $P(-\infty < Z < \infty) = 1$
iii) $P(a < Z < b) = P(Z < b) - P(Z < a)$ for a < b
Consider the following situations under the standard normal curve. It is clear that
i) $P(0 < Z < \infty) = 0.5 = P(-\infty < Z < 0)$
ii) Let Z0 be a negative number; then
$P(Z > Z_0) = P(Z > 0) + P(Z_0 < Z < 0)$
iii) If Z0 is a positive real number, then
$P(Z < Z_0) = P(Z < 0) + P(0 < Z < Z_0)$
iv) Let Z1 be a negative number and Z2 be a positive real number; then
$P(Z_1 < Z < Z_2) = P(Z_1 < Z < 0) + P(0 < Z < Z_2)$
v) If Z1 and Z2 are positive real numbers with Z1 < Z2, then
$P(Z_1 < Z < Z_2) = P(0 < Z < Z_2) - P(0 < Z < Z_1)$
[Figure: shaded areas under the standard normal curve for p(Z<Z0), p(Z>Z0) and P(Z1<Z<Z2)]
As the value of $\sigma$ increases, the curve becomes more and more flat, and vice versa.

Examples: For a standard normal variable Z find
a) P(-2.2 < Z < 1.2)
b) P(Z > 1.05)
c) P(0 < Z < 0.96)
d) P(-1.45 < Z < 0)
Solution:
a) P (-2.2 < Z < 1.2) = P (0 < Z < 1.2) + P (-2.2 < Z < 0)
= P (0 < Z < 1.2) + P (0 < Z < 2.2)
= 0.3849 + 0.4861
= 0.8710
b) P (Z > 1.05) = 0.5 - P (0 < Z < 1.05)
= 0.5 - 0.3531 = 0.1469
c) P (0 < Z < 0.96) = 0.3315
d) P (-1.45 < Z < 0) = P (0 < Z < 1.45)
= 0.4265
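The table look-ups in these examples can be cross-checked in code. The sketch below builds the standard normal cumulative probability from `math.erf` in the standard library; the helper name `phi` is an assumption introduced here.

```python
import math


def phi(z: float) -> float:
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


p_a = phi(1.2) - phi(-2.2)  # P(-2.2 < Z < 1.2)
p_b = 1 - phi(1.05)         # P(Z > 1.05)
p_c = phi(0.96) - phi(0)    # P(0 < Z < 0.96)
p_d = phi(0) - phi(-1.45)   # P(-1.45 < Z < 0)

for label, value in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(label, round(value, 4))
```

Note that `phi` returns the area to the left of z, whereas the table used in these notes gives the area between 0 and z, so the two differ by 0.5 for positive z.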

NOTE: By determining the z- value, we can find the area or the probability under any normal
curve by referring to the standard normal distribution table.

How to use the Normal distribution table to determine probabilities
a. If you wish to find the area between 0 and Z (or -Z), look up the value directly in the table.
Example: P (0 < Z < 0.96) = 0.3315
Example: P (-0.96 < Z < 0) = P (0 < Z < 0.96) = 0.3315, because the curve is symmetric about z = 0
b. To find area between two points on the different sides of the mean, add the corresponding
areas found in the N table.
Example: P (-2.2 < Z < 1.2) = P (-2.2 < Z < 0) + P (0 < Z < 1.2)
=P (0 < Z < 2.2) + P (0 < Z < 1.2)
=0.4861 + 0.3849
= 0.8710
c. To find the area between two points on the same side of the mean, determine the areas related
to the two values from the table, and then subtract the smaller area from the larger.
Example: P (0.96 < Z < 1.2) = P (0 < Z < 1.2) - P (0 < Z < 0.96)
= 0.3849 - 0.3315
= 0.0534
Example: P (-1.2 < Z < -0.96) = P (-1.2 < Z < 0) - P (-0.96 < Z < 0)
= P (0 < Z < 1.2) - P (0 < Z < 0.96), because the curve is symmetric about z = 0
= 0.3849 - 0.3315
= 0.0534
d. To find the area beyond a Z (or -Z) value, on the side away from the mean, look up the value of Z directly in the table, and then subtract it from 0.5.
Example: P (Z > 1.05) = 0.5 - P (0 < Z < 1.05)
= 0.5 - 0.3531
= 0.1469
Example: P (Z < -1.05) = 0.5 - P (-1.05 < Z < 0)
= 0.5 - P (0 < Z < 1.05), because the curve is symmetric about z = 0
= 0.5 - 0.3531
= 0.1469
e. To find the area beyond a Z (or -Z) value on the side that includes the mean, look up the value of Z directly in the table, and then add 0.5 to the probability.
Example: P (Z > -1.05) = P (-1.05 < Z < 0) + P (0 < Z < $\infty$)
= P (0 < Z < 1.05) + 0.5
= 0.3531 + 0.5
= 0.8531
Example: P (Z < 1.05) = P ($-\infty$ < Z < 0) + P (0 < Z < 1.05)
= 0.5 + 0.3531
= 0.8531

Example: The average satellite transmission is 150 seconds, with a standard deviation of 15 seconds. Transmission time appears to be normally distributed. What is the probability that a call will last
a. between 125 and 150 seconds
b. between 145 and 155 seconds
c. more than 175 seconds
d. less than 160 seconds
e. less than 125 seconds
f. between 160 and 165 seconds
g. between 135 and 140 seconds
h. more than 140 seconds
So/n: Given $\mu = 150$, $\sigma = 15$, and let x = time
a) $P(125 < x < 150) = P\left(\frac{125 - \mu}{\sigma} < \frac{x - \mu}{\sigma} < \frac{150 - \mu}{\sigma}\right) = P\left(\frac{125 - 150}{15} < Z < \frac{150 - 150}{15}\right) = P(-1.67 < Z < 0)$
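The standardisation step in part a) can be sketched in code, taking $\mu = 150$ and $\sigma = 15$ as in the solution. The helpers `phi` and `normal_between` are assumptions introduced for illustration; the result is computed without rounding z to two decimals, so it can differ slightly from a table look-up at z = -1.67.

```python
import math


def phi(z: float) -> float:
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def normal_between(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) for X normal, via Z = (X - mu) / sigma."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return phi(z_b) - phi(z_a)


mu, sigma = 150, 15

z_low = (125 - mu) / sigma                      # about -1.67
p_part_a = normal_between(125, 150, mu, sigma)  # P(125 < x < 150)

print(round(z_low, 2), round(p_part_a, 4))
```

The same function can be reused for the other "between" parts by changing the two bounds.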