
PROBABILITY

INTRODUCTION:
• If an experiment is repeated under essentially
homogeneous and similar conditions we
generally come across two types of situations:

i. The result (Outcome) is unique or certain.

ii. The result is not unique but may be one of the


several possible outcomes.

• The
  phenomena covered by (i) are known as
'deterministic' or 'predictable' phenomena.

 By a deterministic phenomenon we mean one in


which the result, can be predicted with certainty.

 For example:
If the potential difference E between the two ends of a conductor and the resistance R are known, the current I flowing in the conductor is uniquely determined by Ohm's Law, viz., I = E/R.
There are phenomena [as covered by (ii) above]
which do not lend themselves to deterministic
approach and are known as 'unpredictable' or
'probabilistic' phenomena.

 For example:
I. In tossing of a coin one is not sure if a head or
tail will be obtained.
II. A salesman cannot predict with certainty the sales for the next year.
Basic Terminology

• Random experiment:
Random experiments are those experiments whose results depend on chance (the word 'random' may be taken as equivalent to 'depending on chance').

• Sample Space:
The set of all possible outcomes of a random experiment is
called the sample space (or universal set), and it is
denoted by S. An element in sample space S is called a
sample point.
Each outcome of a random experiment corresponds to a
sample point.
Trial and Event:
Consider an experiment which, though repeated under essentially identical conditions, does not give unique results (outcomes), but whose result may be any one of several possible outcomes.
The experiment is known as a trial and the outcomes are known as events or cases.

For example:

 Throwing of a die is a trial and getting 1 or 2 or 3, ... or 6 is an event.

 Tossing of a coin is a trial and getting head (H) or tail (T) is an event.

 Drawing two cards from a pack of well-shuffled cards is a trial and getting a king and a queen are events.
• Exhaustive Events:
The total number of possible outcomes in any trial is known as exhaustive events or exhaustive cases. For example:

 In tossing of a coin there are two exhaustive cases, viz., head and tail (the possibility of the coin standing on an edge being ignored).

 In throwing of a die, there are six exhaustive cases, since any one of the 6 faces 1, 2, ..., 6 may come uppermost.

 In drawing two cards from a pack of 52 cards the exhaustive number of cases is 52C2, since 2 cards can be drawn out of 52 cards in 52C2 ways.
• Favourable Events or Cases:
The number of cases favourable to an event in
a trial is the number of outcomes which entail
the happening of the event. For example:

 In drawing a card from a pack of 52 cards the number of cases favourable to drawing of an ace is 4, for drawing a spade is 13, and for drawing a red card is 26.

 In throwing of two dice, the number of cases favourable to getting the sum 9 is: (3,6), (4,5), (5,4), (6,3), i.e., 4.
• Mutually exclusive events:
Events are said to be mutually exclusive if the happening of any one of them precludes the happening of all the others (i.e., no two or more of them can happen simultaneously in the same trial).
For example:

 In throwing a die all the 6 faces numbered 1 to 6 are mutually exclusive, since if any one of these faces comes, the possibility of the others, in the same trial, is ruled out.

 Similarly, in tossing a coin the events head and tail are mutually exclusive.
• Equally likely events:
Outcomes of a trial are said to be equally likely
if taking into consideration all the relevant
evidence, there is no reason to expect one in
preference to the others.
For example:
In tossing an unbiased or uniform coin, head
or tail are equally likely events.
In throwing an unbiased die, all the six faces
are equally likely to come.
• Independent events:
Several events are said to be independent if the happening (or non-happening) of an event is not affected by the supplementary knowledge concerning the occurrence of any number of the remaining events. For example:

 In tossing an unbiased coin the event of getting a head in the first toss is independent of getting a head in the second, third and subsequent throws.

 If we draw a card from a pack of well-shuffled cards and replace it before drawing the second card, the result of the second draw is independent of the first draw. If, however, the first card drawn is not replaced, then the second draw is dependent on the first draw.
Mathematical or Classical probability
Definition:
If a random experiment (or a trial) results in n exhaustive, mutually exclusive and equally likely outcomes (or cases) and m of them are favourable to the happening of an event E, then the probability 'p' of happening of E is given by

p = P(E) = (Number of favourable cases)/(Exhaustive number of cases) = m/n   ...(1.1)

 We compute the probability in (1.1) by logical reasoning, without conducting any experiment.
 Since the probability in (1.1) can be computed prior to obtaining any experimental data, it is also termed 'a priori' or 'mathematical' probability.
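As a quick illustration of (1.1), the classical probability can be obtained by listing the exhaustive cases and counting the favourable ones. The short Python sketch below (not part of the original notes) does this for the earlier example of getting a sum of 9 with two dice.

# Classical probability by enumeration: P(E) = favourable cases / exhaustive cases
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))    # 36 exhaustive, equally likely cases
favourable = [o for o in outcomes if sum(o) == 9]  # (3,6), (4,5), (5,4), (6,3)

p = len(favourable) / len(outcomes)
print(len(favourable), len(outcomes), p)           # 4 36 0.111...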
Example. What is the chance that a leap year selected at random will contain 53 Sundays?

Solution: A leap year (which consists of 366 days) contains 52 complete weeks and 2 days over.

The following are the possible combinations for these two 'over' days:
(i) Sunday and Monday,
(ii) Monday and Tuesday,
(iii) Tuesday and Wednesday,
(iv) Wednesday and Thursday,
(v) Thursday and Friday,
(vi) Friday and Saturday, and
(vii) Saturday and Sunday.

In order that a leap year selected at random should contain 53 Sundays, one of the two 'over' days must be a Sunday.
Since out of the above 7 possibilities, 2, viz., (i) and (vii), are favourable to this event,

Required probability = (Number of favourable cases)/(Exhaustive number of cases) = 2/7.
Example. A letter of the English alphabet is
chosen at random. Calculate the probability that
the letter so chosen
a) Is a vowel
b) precedes m and is a vowel.
• Remarks:
 Since m ≥ 0, n > 0 and m ≤ n, we get from (1.1)
P(E) ≥ 0 and P(E) ≤ 1, i.e., 0 ≤ P(E) ≤ 1.
 Sometimes we express (1.1) by saying that the odds in favour of E are m : (n - m), or the odds against E are (n - m) : m.
 Since the number of cases favourable to the non-happening of the event E is (n - m), the probability 'q' that E will not happen is given by
q = P(Ē) = (n - m)/n = 1 - m/n = 1 - p, so that p + q = 1.
 P(happening of at least one of the events E1, E2, ..., En)
= 1 - P(none of the events E1, E2, ..., En happens).
 The probability 'p' of the happening of an event is also known as the probability of success, and the probability 'q' of its non-happening as the probability of failure.
 If P(E) = 1, E is called a certain event, and if P(E) = 0, E is called an impossible event.
• Limitations of Classical Definition:
This definition of Classical Probability breaks down
in the following cases:
 If the various outcomes of the trial are not equally
likely or equally probable.
For example, the probability that a candidate will
pass in a certain test is not 50% since the two
possible outcomes, viz., success and failure
(excluding the possibility of a compartment) are
not equally likely.
 If the exhaustive number of cases in a trial is
infinite or unknown.
Statistical or Empirical or Estimated Probability
Definition (Von Mises).
If a trial is repeated a number of times under
essentially homogeneous and identical conditions,
then the limiting value of the ratio of the number of
times the event happens to the number of trials, as the
number of trials becomes indefinitely large, is called
the probability of happening of the event.
(It is assumed that the limit is finite and unique).
Symbolically, if in N trials an event E happens M times, then the probability 'p' of the happening of E is given by

p = P(E) = lim (N → ∞) M/N   ........(1.2)
• For example. If we want to find the probability that a spare part produced by a machine is defective, we study the record of defective items produced by the machine over a considerable period of time. If out of 10,000 items produced, 500 are defective, it is assumed that the probability of a defective item is 0.05.
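A minimal simulation sketch (not from the notes) of the Von Mises relative-frequency idea: the ratio M/N settles down as N grows. The 5% defect rate used below is only an assumed value for illustration.

import random

random.seed(1)
true_p = 0.05
for n in (100, 1_000, 10_000, 100_000):
    defectives = sum(random.random() < true_p for _ in range(n))
    print(n, defectives / n)   # the relative frequency approaches 0.05 as n grows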
• Remarks
 Since in relative frequency approach, the probability is
obtained objectively by repetitive empirical observations, it
is also known as ‘Empirical probability’.

 Empirical probability approaches classical probability as the number of trials becomes indefinitely large.
• Limitations of Empirical probability:
If an experiment is repeated a large number of times, the experimental conditions may not remain identical and homogeneous.
The limit in (1.2) may not attain a unique value, however large N may be.

Axiomatic approach to probability:

• The axiomatic definition of probability includes both the classical and the statistical definitions as particular cases and overcomes the deficiencies of each of them.

• On this basis, it is possible to construct a logically perfect structure of the modern theory of probability. This modern theory introduces probability simply as a number associated with each event. It is based on certain axioms, which express the rules for operating with such numbers.
• Definition.
Let S be a sample space of a random experiment. If to each event
A of sample space S, we associate a number P(A), then P(A) is
called the probability of event A, if the following axioms hold:
Axiom 1. For every event A
P(A) ≥ 0 (axioms of non-negativity)

Axiom 2. For the sure event S.


P(S) = 1 (axioms of certainty)

Axiom 3. For any finite or countably infinite collection of mutually exclusive events A1, A2, A3, ... of S,
P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ...
(axioms of additivity)
Some Theorems on Probability
• In development of the probability theory, all the results are
derived directly or indirectly using only axioms of probability .
• In this section we shall prove some simple theorems which help
us to evaluate the probabilities of some complicated event.
Th.1.Probability of the impossible event is zero., i.e.
P(φ)= 0.
Proof: The certain event S and the impossible event φ are mutually exclusive.
S ∪ φ = S
⟹ P(S ∪ φ) = P(S)
⟹ P(S) + P(φ) = P(S)   ....(by axioms of additivity)
⟹ P(φ) = 0
Ex. If a year is selected at random, find the probability that February 30 is a Saturday.
Sol. The given event can never occur, as no year contains February 30. Hence the required probability is zero.

Similarly, we can prove that the probability of the complementary event Ā of A is P(Ā) = 1 - P(A).
Addition theorem of Probability
Th. If A and B are two events and are not mutually exclusive, then
P(A or B) = P(A) + P(B) - P(A and B)
i.e. P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

[Venn diagram: sample space S with overlapping events A and B]

Proof:
A ∪ B = (A ∩ B̄) ∪ (A ∩ B) ∪ (Ā ∩ B)
⟹ P(A ∪ B) = P[(A ∩ B̄) ∪ (A ∩ B) ∪ (Ā ∩ B)]
            = P(A ∩ B̄) + P(A ∩ B) + P(Ā ∩ B)   ....(by axioms of additivity)
            = [P(A) - P(A ∩ B)] + P(A ∩ B) + [P(B) - P(A ∩ B)]
            = P(A) + P(B) - P(A ∩ B)
• Cor.1 If events A and B are mutually exclusive, then A ∩ B = φ
⟹ P(A ∩ B) = P(φ) = 0
⟹ P(A ∪ B) = P(A) + P(B)

• Cor.2 For three non-mutually-exclusive events A, B and C, we have

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(B ∩ C) - P(C ∩ A) + P(A ∩ B ∩ C)
Q.1 A, B and C are three mutually exclusive and exhaustive events associated with a random experiment. Find P(A) given that:
P(B) = 3/2 P(A) and P(C) = 1/2 P(B)

Q.2 Two dice are thrown. Find the probability of getting 'an even number on the first die or a total of 8'. (A short enumeration check for this question is sketched after Q.3.)

Q.3 A card is drawn from a pack of 52 cards. Find the probability of getting a king or a heart or a red card.
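The following Python sketch (not part of the notes) checks Q.2 with the addition theorem P(A ∪ B) = P(A) + P(B) - P(A ∩ B) by enumerating the 36 outcomes.

from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))
A = {o for o in outcomes if o[0] % 2 == 0}   # even number on the first die
B = {o for o in outcomes if sum(o) == 8}     # total of 8

n = Fraction(len(outcomes))
lhs = Fraction(len(A | B)) / n               # direct count of 'A or B'
rhs = Fraction(len(A)) / n + Fraction(len(B)) / n - Fraction(len(A & B)) / n
print(lhs, rhs)                              # both print 5/9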
Multiplication Law of Probability and Conditional
Probability
For two events A and B,
P(A ∩ B) = P(A) · P(B/A), P(A) ≠ 0
         = P(B) · P(A/B), P(B) ≠ 0
where P(B/A) represents the conditional probability of occurrence of event B when A has already happened, and P(A/B) is the conditional probability of happening of event A, given that B has already happened.

i.e. the probability of simultaneous occurrence of two events A and B is equal to the product of the probability of one of these events and the conditional probability of the other, given that the first one has occurred.
Remarks
1. P(B/A) = P(A ∩ B)/P(A) and P(A/B) = P(A ∩ B)/P(B)

Thus the conditional probabilities P(B/A) and P(A/B) are defined if and only if P(A) ≠ 0 and P(B) ≠ 0, respectively.

2. P(B/B) = 1

3. If A, B and C are three events in a sample space S such that P(A ∩ B) > 0, then
P(A ∩ B ∩ C) = P(A) · P(B/A) · P(C/A ∩ B)


Example:
From a city population, the probability of selecting a male or a smoker is 7/10, a male smoker is 2/5, and a male, if a smoker is already selected, is 2/3. Find the probability of selecting
(i) a non-smoker,
(ii) a male, and
(iii) a smoker, if a male is already selected.
Sol.
Let A = selecting a male, B = selecting a smoker.
P(male or a smoker) = P(A ∪ B) = 7/10
P(A ∩ B) = 2/5
P(A/B) = 2/3
(i) P(B̄) = 1 - P(B), where P(B) = P(A ∩ B)/P(A/B) = (2/5)/(2/3) = 3/5
    ⟹ P(B̄) = 1 - 3/5 = 2/5
(ii) P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
    ⟹ P(A) = P(A ∪ B) - P(B) + P(A ∩ B) = 7/10 - 3/5 + 2/5 = 5/10 = 1/2
(iii) P(B/A) = P(A ∩ B)/P(A) = (2/5)/(1/2) = 4/5
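A short check (sketch, not from the notes) of the worked example above using exact fractions:

from fractions import Fraction

P_AuB = Fraction(7, 10)       # P(A ∪ B): male or smoker
P_AnB = Fraction(2, 5)        # P(A ∩ B): male smoker
P_A_given_B = Fraction(2, 3)  # P(A/B): male, given smoker

P_B = P_AnB / P_A_given_B            # P(B) = P(A∩B)/P(A/B) = 3/5
P_not_B = 1 - P_B                    # (i) non-smoker = 2/5
P_A = P_AuB - P_B + P_AnB            # (ii) male = 1/2 (addition theorem)
P_B_given_A = P_AnB / P_A            # (iii) smoker given male = 4/5
print(P_not_B, P_A, P_B_given_A)     # 2/5 1/2 4/5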
Independent events
Two or more events are said to be independent if the
happening or non-happening of any of them, does
not, in any way, affect the happening of others.
Definition:
An event A is said to be independent (or statistically
independent) of another event B, if conditional
probability of A, given B, i.e. P(A/B) is equal to the
unconditional probability of A.
i.e., if P(A/B) = P(A)
Multiplication theorem of probability for
independent events
Theorem. If A and B are two events with positive probabilities {P(A) ≠ 0 and P(B) ≠ 0}, then
A and B are independent ⟺ P(A ∩ B) = P(A) · P(B)
Extension:
The necessary and sufficient condition for the independence of n events A1, A2, A3, ..., An is that the probability of their simultaneous happening is equal to the product of their respective probabilities, i.e.,

P(A1 ∩ A2 ∩ A3 ∩ ... ∩ An) = P(A1) · P(A2) · P(A3) ... P(An)
Mutually exclusive (disjoint) events and
independent events:
• Two mutually disjoint events with positive
probabilities are always dependent (not
independent) events.
• Two independent events (both of which are
possible events), can not be mutually disjoint.
• If A and B are independent events, then
(i) A and B̄, (ii) Ā and B, (iii) Ā and B̄
are also independent.
• If A1, A2 A3……. An are independent events with
respective probabilities of occurrence P(A1), P(A2),……
P(An) find the probability of occurrence of at least
one of them.
Proof:
The complement of (A1 ∪ A2 ∪ ... ∪ An) is (Ā1 ∩ Ā2 ∩ ... ∩ Ān)   (De Morgan's law).
Hence the probability of happening of at least one of the events is given by
P(A1 ∪ A2 ∪ ... ∪ An) = 1 - P(Ā1 ∩ Ā2 ∩ ... ∩ Ān)
                      = 1 - P(Ā1) · P(Ā2) ... P(Ān)
• P[happening of at least one of the events A1, A2, ..., An]
= 1 - P(none of the events A1, A2, ..., An happens)

or equivalently,
• P{none of the given events happens}
= 1 - P{at least one of them happens}.

Ex. A problem in Statistics is given to the three students A,B


and C whose chances of solving it are 1/2 , 3/4 , and 1/4
respectively. What is the probability that (i) the problem will
be solved and (ii)at least two of them solved the problem if all
of them try independently?
A  A solved the problem
B  B solved the problem
C  C solved the problem

Given
P ( A)  1/ 2  P( A)  1/ 2
P ( B )  3 / 4  P( B )  1/ 4
P (C )  1/ 4  P(C )  3 / 4
P ( problem is solved )
 P ( Atleast one of them solved problem)
 1  P(none of the A,B & Csolved problem)
=1-P(A  B  C)
=1-P(A)P(B)P(C) ( A,B, C are ind.)
1 1 3
=1- . .
2 4 4
29

32
(ii) P(at least two of them solve the problem)
= P(ABC̄ or AB̄C or ĀBC or ABC)
= P(A)P(B)P(C̄) + P(A)P(B̄)P(C) + P(Ā)P(B)P(C) + P(A)P(B)P(C)
= 9/32 + 1/32 + 3/32 + 3/32
= 16/32 = 1/2
Ex. A person has undertaken a construction job. The probabilities are 0.65 that there will be a strike, 0.80 that the construction job will be completed on time if there is no strike, and 0.32 that the construction job will be completed on time if there is a strike. Determine the probability that the construction job will be completed on time.
Solution:
Let A be an event that construction job will be completed
on time &
B is the event that there will be strike.

Given P (B )  0.65 then

P (there is no strike) = P (B )  1  0.65  0.35


Conditional probability that the construction job will be completed on time if there is a
strike.

 A
P    0.32
B
Conditional probability that the construction job will be completed on time if there is no
strike.

 A
P    0.80
B

Since

A  ( A & B ) or ( A & B )

 ( A  B)  ( A  B )

 P ( A)  P [ A  B )  ( A  B )]

Since ( A  B ) & ( A  B ) are mutually disjoint events


Since ( A  B ) & ( A  B ) are mutually disjoint events

 P ( A)  P ( A  B ) +P ( A  B )

 P (B ).P ( A / B ) +P (B ).P ( A / B )

 0.65  0.32  0.80  0.35

 0.49 (Approx)
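A minimal sketch (not from the notes) of the same total-probability computation:

p_strike = 0.65
p_no_strike = 1 - p_strike
p_on_time_given_strike = 0.32
p_on_time_given_no_strike = 0.80

# P(A) = P(B)·P(A/B) + P(B̄)·P(A/B̄)
p_on_time = p_strike * p_on_time_given_strike + p_no_strike * p_on_time_given_no_strike
print(round(p_on_time, 3))   # 0.488, i.e. about 0.49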
Bayes Theorem
The probability revision process works as follows: prior probabilities, combined with new information, yield revised (posterior) probabilities.

Theorem:
If B1, B2, B3, ..., Bn is a set of exhaustive and mutually exclusive events associated with a random experiment, and an event A can occur either with B1 or B2 or B3 or ... or Bn, then the conditional probability P(Bi/A) of a specified event Bi, when A is stated to have actually occurred, is given by

P(Bi/A) = P(Bi) P(A/Bi) / [Σ (i=1 to n) P(Bi) P(A/Bi)]

This is known as Bayes theorem.


[Figure: the outer circle is the sample space partitioned into B1, B2, ..., Bn; the inner region is the event A, split into the pieces AB1, AB2, ..., ABn.]
Proof:
Outer circle represents a sample space which
consists Bi i = 1,2, ……, n, n mutually exclusive and
exhaustive cases and inner circle represents event A
can occur with either B1 or, B2 or, B3, or ……., Bn.
Therefore
P(A) = P(B1A or B2A or B3A or…………or BnA)
= P(B1A) + P(B2A) + P(B3A) +…………+ P(BnA)
[B1A or B2A or B3A or…………or BnA are also mutually
exclusive events.]
= P(B1)P(A/B1) + P(B2)P(A/B2) + ... + P(Bn)P(A/Bn)

= Σ (i=1 to n) P(Bi)P(A/Bi)

Again,
P(ABi) = P(A)·P(Bi/A)
P(BiA) = P(Bi)·P(A/Bi)

Since the events ABi and BiA are equivalent, their probabilities are equal:

P(A)·P(Bi/A) = P(Bi)·P(A/Bi)

⟹ P(Bi/A) = P(Bi)P(A/Bi) / P(A)

⟹ P(Bi/A) = P(Bi)P(A/Bi) / [Σ (i=1 to n) P(Bi)P(A/Bi)]
• Remarks:
a) The probabilities P(B1), P(B2), ..., P(Bn) are termed the 'a priori probabilities' because they exist before we gain any information from the experiment itself.

b) The probabilities P(A/B1), P(A/B2), ..., P(A/Bn) are called 'likelihoods' because they indicate how likely the event A under consideration is to occur, given each and every a priori probability.

c) The probabilities P(Bi/A), i = 1, 2, ..., n, are called 'posterior probabilities' because they are determined after the results of the experiment are known.
• The posterior probabilities, using Bayes theorem, can be obtained conveniently in tabular form, which involves the following steps:
1. List the mutually exclusive events B1, B2, B3, ..., Bn occurring in the problem.
2. Write the prior probabilities P(B1), P(B2), ..., P(Bn) for the events.
3. Write the conditional probabilities of the new information (A) given each event, viz., P(A/Bi), i = 1, 2, ..., n.
4. Compute the joint probabilities for each event using the formula P(BiA) = P(Bi) P(A/Bi).
5. Compute the posterior probabilities using the basic relation of conditional probability, i.e. P(Bi/A) = P(Bi) P(A/Bi) / P(A).
Example:
• For certain binary communication
channel, the probability that a
transmitted ‘0’ is received as ‘0’ is 0.95
and the probability that transmitted ‘1’ is
received as ‘1’ is 0.90 and the probability
that a ‘0’ is transmitted is 0.4, find the
probability that
I. A ‘1’ is received.
II. A ‘1’ is transmitted given that a ‘1’ is
received.
Let the events be

• B1 → The event of transmitting ‘0’


• B2 → The event of transmitting ‘1’
Clearly B1 and B2 are exhaustive and mutually
exclusive events.
Let A → The event of receiving ‘1’
Given:
P(B1)= 0.4 &
P(B2)= 0.6
The probability that a transmitted ‘0’ is received as
‘0’ is 0.95.i.e.

P  A / B1   0.95

The probability that transmitted ‘1’ is received as


‘1’ is 0.90. i.e.
P  A / B2   0.90

 The probability that a transmitted ‘0’ is received


as ‘1’ is

P  A / B1   0.05
Therefore the probability that a '1' is received, i.e.

P(A) = P(B1A ∪ B2A)
     = P(B1A) + P(B2A)
     = P(B1) P(A/B1) + P(B2) P(A/B2)
     = 0.4 × 0.05 + 0.6 × 0.90
     = 0.56
P(Ā) = 1 - P(A) = 1 - 0.56 = 0.44
Therefore the probability that a '0' is received, i.e.

P(Ā) = P(B1Ā ∪ B2Ā)
     = P(B1Ā) + P(B2Ā)
     = P(B1) P(Ā/B1) + P(B2) P(Ā/B2)
     = 0.4 × 0.95 + 0.6 × 0.10
     = 0.44
• And the probability that a '1' is transmitted given that a '1' is received (by Bayes theorem) is:

P(B2/A) = P(B2) P(A/B2) / P(A) = (0.6 × 0.9) / 0.56 = 27/28
The tabular computation for this example:

Event | Prior probability | Conditional probability of the new information (A) | Joint probability                        | Posterior probability
B1    | P(B1) = 0.4       | P(A/B1) = 0.05                                      | P(B1A) = P(B1) P(A/B1) = 0.4 × 0.05 = 0.02 | P(B1/A) = 0.02/0.56 = 1/28
B2    | P(B2) = 0.6       | P(A/B2) = 0.90                                      | P(B2A) = P(B2) P(A/B2) = 0.6 × 0.90 = 0.54 | P(B2/A) = 0.54/0.56 = 27/28
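The same tabular Bayes computation can be checked with a few lines of Python (a sketch, not part of the notes):

from fractions import Fraction

priors = {"sent 0": Fraction(4, 10), "sent 1": Fraction(6, 10)}
likelihood = {"sent 0": Fraction(5, 100), "sent 1": Fraction(90, 100)}  # P(received 1 | sent i)

joint = {b: priors[b] * likelihood[b] for b in priors}      # P(Bi ∩ A)
p_received_1 = sum(joint.values())                          # total probability P(A)
posterior = {b: joint[b] / p_received_1 for b in joint}     # Bayes: P(Bi / A)

print(p_received_1)            # 14/25 = 0.56
print(posterior["sent 1"])     # 27/28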
Recap of basic results:

P(A) = (No. of favourable cases for A)/(Total no. of cases) = n(A)/n(S)
P(Ā) = 1 - P(A)
0 ≤ P(A) ≤ 1

Addition theorem
(i) Mutually exclusive (M.E.) events:
If A and B are two M.E. events,
P(A or B) = P(A ∪ B) = P(A) + P(B)

If A1, A2, ..., An are M.E., then
P(A1 or A2 ... or An) = P(A1 ∪ A2 ∪ ... ∪ An) = P(A1) + P(A2) + ... + P(An)

e.g. drawing an ace, jack, king or queen from a pack of cards:
P(A ∪ J ∪ K ∪ Q) = P(A) + P(J) + P(K) + P(Q) = 4/52 + 4/52 + 4/52 + 4/52 = 16/52

(ii) If the events are not M.E.:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

e.g. drawing an ace or a heart:
A = {ace of H, C, D, S}, n(A) = 4
H = {A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K of hearts}, n(H) = 13
n(A ∩ H) = 1
P(A ∪ H) = P(A) + P(H) - P(A ∩ H) = 4/52 + 13/52 - 1/52 = 16/52

Multiplication theorem
Independent events:
P(A ∩ B) = P(AB) = P(A) P(B)

Events that are not independent:
P(A ∩ B) = P(A) · P(B/A) = P(B) · P(A/B)

P(A/B) = P(A ∩ B)/P(B),  P(B/A) = P(A ∩ B)/P(A)
Introduction
• The outcome of a random experiment may be numerical or non-numerical in nature. For example, the number of telephone calls received at a switchboard in 1 hour is numerical in nature, while the result of a coin-tossing experiment in which 2 coins are tossed at a time is non-numerical in nature.

• As it is often useful to describe the outcome of a random experiment by a number, we will assign a number to every non-numerical outcome of the experiment. For example, in the two-coin tossing experiment we can assign the value
0 to the outcome of getting 2 tails,
1 to the outcome of getting 1 head and 1 tail, and
2 to the outcome of getting 2 heads.
Thus in any experimental situation, we can assign a number x to every element s of the sample space S, i.e. the function X(s) = x that maps the elements of the sample space into real numbers is called the random variable (R.V.) associated with the random experiment.
Random Variables
Definition:
A random variable is a function that assigns a real number X(s) to every element s ∈ S, where S is the sample space corresponding to a random experiment,
i.e. X: S → R.
The function X(s) = x that maps the elements of the sample space into real numbers is called a R.V.
Type of random variables

• Discrete random variable

• Continuous random variable.


• Discrete random variable:
• If X is a R. V. which can take a finite number or
countably infinite number of values, X is called
discrete random variable.

• For example, in a dice throwing experiment in


which two dice are thrown at a time, sum of the
number on the upper most face of the dice is a
discrete random variable.
When a R.V. is discrete, the possible values of X may be
assumed as

x1, x2, x3, ……, xn, ……….

In the finite case, the list of values terminates and in the


countably infinite case, the list goes up to infinity.

Ex. The number of failures preceding the first success in an infinite series of independent trials.
In this experiment the RV X denotes the number of failures preceding the first success, i.e.
X: 0, 1, 2, 3, ...
Probability mass function (p.m.f.)
If X is a R.V. which can take the values x1, x2, x3, ..., xn, ... such that P(X = xi) = pi, then pi is called the probability mass function (p.m.f.) or point probability function, provided the pi (i = 1, 2, ...) satisfy the following conditions:

1. pi ≥ 0 for all i, and
2. Σ (i=1 to ∞) pi = 1.

The collection of pairs {xi, pi}, i = 1, 2, 3, ..., is called the probability distribution of the R.V. X.
Verify whether the following function can be regarded as a probability mass function for the given values of X:

P(X = x) = (x - 2)/5 for x = 1, 2, 3, 4, 5;  0, otherwise.

x             : 1     2    3    4    5
P(X = x) = px : -1/5  0    1/5  2/5  3/5

P(X = 1) = -1/5 < 0, so the first condition is not satisfied,
even though Σ (x=1 to 5) px = -1/5 + 0 + 1/5 + 2/5 + 3/5 = 1.
⟹ the given function is not a p.m.f.
• Check whether the following functions can serve as probability mass functions:
• P(X = x) =        where x = 1, 2, 3, 4
• P(X = x) =        where x = 1, 2, 3, 4
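The two p.m.f. conditions are easy to automate. The small Python helper below (a sketch, not from the notes) applies them to the function P(X = x) = (x - 2)/5 discussed above.

from fractions import Fraction

def is_pmf(probs):
    """probs: list of P(X = x_i); checks p_i >= 0 for all i and sum p_i == 1."""
    return all(p >= 0 for p in probs) and sum(probs) == 1

probs = [Fraction(x - 2, 5) for x in range(1, 6)]   # -1/5, 0, 1/5, 2/5, 3/5
print(sum(probs), is_pmf(probs))                    # sum is 1, but is_pmf -> False (negative value)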
Continuous random variable:
• If X is a R. V. which can take all values (i.e., infinite
number of values)in a certain interval, then x is called
a continuous RV.

• In other words a random variable is said to be


continuous when its different values cannot be put in
1-1 correspondence with a set of positive integers.
• A continuous random variable is a random variable
that can be measured to any desired degree of
accuracy.
• Examples of continuous random variables are age,
height, weight etc.
Note:
• In continuous RV the number of events is
indefinitely large , probability that a particular
event will occur is practically zero. Here
instead of finding the probability at a
particular value x we find the probability of x
falling in small interval.
• Probability density function (p.d.f.):
If X is a continuous RV such that
P(x - dx/2 ≤ X ≤ x + dx/2) = f(x) dx,
i.e. f(x) dx represents the probability that X falls in the extremely small interval (x - dx/2, x + dx/2),
then f(x) is called the probability density function of X, provided f(x) satisfies the following conditions:
1. f(x) ≥ 0 for all x ∈ Rx, and
2. ∫ (over Rx) f(x) dx = 1.
• The probability for a variate value to lie in the interval dx is f(x) dx, and hence the probability for a variate value to fall in the finite interval [a, b] is:

P(a ≤ X ≤ b) = ∫ (a to b) f(x) dx   ..............(*)

Important Remark.
• In case of discrete random variable the probability at a point,
i.e., P (x = c) is not zero for some fixed c.

• However, in case of continuous random variables the probability


at a point is always zero, i.e., P (x = c) = 0 for all possible values
of c. This follows directly from (*) by taking a= b = c.
This property of continuous r.v., viz.,
P(X = c) = 0, for every c,
leads us to the following important result:

P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b)

i.e., in the case of a continuous R.V., it does not matter whether we include the end points of the interval from a to b.
• Cumulative distribution function:
If X is an RV, discrete or continuous, then P(X ≤ x) is called the cumulative distribution function (or distribution function) of X and is denoted by F(x).
If X is discrete,
F(x) = Σ (xj ≤ x) pj
If X is continuous,
F(x) = P(-∞ < X ≤ x) = ∫ (-∞ to x) f(t) dt
Properties of cdf F(x)
• F(x)is non-decreasing function of x.
• F(-∞) = 0 and F(∞)=1
• If X is a discrete RV taking the values x1, x2, ..., where x1 < x2 < ... < xn-1 < xn < ..., then
P(X = xn) = F(xn) - F(xn-1)
• If X is continuous RV, then
d/dx F(x)=f(x),
at all point where F(x) is differentiable.
Find the constant k such that the function
f(x) = kx² for 0 ≤ x ≤ 3;  0, otherwise
is a probability density function, and compute (i) P(1 ≤ X ≤ 2), (ii) P(X ≤ 2), and (iii) P(X ≥ 2).
Sol.
For f(x) to be a pdf:
(i) f(x) ≥ 0 for 0 ≤ x ≤ 3, and
(ii) ∫ (0 to 3) f(x) dx = 1
⟹ k ∫ (0 to 3) x² dx = 1
⟹ k [x³/3] (0 to 3) = 1
⟹ 9k = 1
⟹ k = 1/9

(i) P(1 ≤ X ≤ 2) = ∫ (1 to 2) x²/9 dx = (1/9)[x³/3] (1 to 2) = 7/27

(ii) P(X ≤ 2) = ∫ (0 to 2) x²/9 dx = (1/9)[x³/3] (0 to 2) = 8/27

(iii) P(X ≥ 2) = ∫ (2 to 3) x²/9 dx = (1/9)[x³/3] (2 to 3) = 19/27
     or P(X ≥ 2) = 1 - P(X < 2) = 1 - 8/27 = 19/27
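A numeric cross-check of these results (a sketch, not from the notes), using a simple midpoint-rule integral for f(x) = x²/9 on [0, 3]:

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x**2 / 9
print(round(integrate(f, 0, 3), 4))   # ≈ 1.0, so k = 1/9 normalises f
print(round(integrate(f, 1, 2), 4))   # ≈ 0.2593 = 7/27
print(round(integrate(f, 0, 2), 4))   # ≈ 0.2963 = 8/27
print(round(integrate(f, 2, 3), 4))   # ≈ 0.7037 = 19/27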
Let the continuous random variable X have the probability density function
f(x) = 2/x³ for 1 ≤ x < ∞;  0, otherwise.
Find F(x).

F(x) = P(X ≤ x) = ∫ (1 to x) 2/t³ dt = 2 [t⁻²/(-2)] (1 to x) = 1 - 1/x²,  x ≥ 1

⟹ F(x) = 1 - 1/x² for 1 ≤ x < ∞;  0, otherwise.
• Ex. Find the value of k so that f(x) = kx(2 - x) may be a p.d.f. of a RV X for 0 ≤ x ≤ 2.

• Ex. Verify that the following function F(x) is a distribution function:
F(x) = 1 - e^(-x/4) for x ≥ 0;  0 for x < 0.
Also, find the probabilities P(X ≤ 4), P(X ≥ 8), P(4 ≤ X ≤ 8).

• Ex. A continuous random variable X has the distribution function
F(x) = 0, if x ≤ 1
     = k(x - 1)⁴, if 1 < x ≤ 3
     = 1, if x > 3
Find (i) k, (ii) the probability density function f(x).
Example: A random variable X has the following probability distribution.
x               : 0    1     2     3    4     5       6       7
p(x)            : 0    K     2K    2K   3K    K²      2K²     7K² + K
F(x) = P(X ≤ x) : 0    1/10  3/10  1/2  8/10  81/100  83/100  100/100

Find (i) the value of K.
Since p(x) is a pmf,
Σ (x=0 to 7) p(x) = 1
⟹ 0 + K + 2K + 2K + 3K + K² + 2K² + 7K² + K = 1
⟹ 10K² + 9K - 1 = 0
⟹ 10K² + 10K - K - 1 = 0
⟹ K = 1/10 or -1
⟹ K = 1/10

(ii) P(1.5 < X < 4.5 / X > 2)
Using P(A/B) = P(AB)/P(B):
P(1.5 < X < 4.5 / X > 2) = P(1.5 < X < 4.5 and X > 2) / P(X > 2)
= P(X = 3 or X = 4) / P(X = 3, 4, 5, 6 or 7)
= [P(X = 3) + P(X = 4)] / [1 - P(X ≤ 2)]
= 5K / (1 - 3K) = (5/10)/(7/10) = 5/7

(iii) Find the smallest value of λ for which P(X ≤ λ) > 1/2.

4
 xe  x / 2 x  0
2

f ( x)  
0, x0
(a) show that f(x) is a pdf (of a continuous RV X.)
(b) find its distribution function F(x).
(a) f(x) is a pdf
(i) f ( x)  0

(ii )  f ( x)dx  1
-
 0 

  
2
x /2
f ( x)dx  0.dx  xe dx
- - 0
let x / 2  t  xdx  dt limits x=0  t=0;x=  t  
2

 
e  t
 0   e t dt   
0  1  0
 0  (1)  1
x 0 x2 / 2
(b)F(x)=P(X  x)=  f ( x)dx   0.dx   e t dt
- - 0
2
x /2
e  t
 x2 / 2
   e 1
 1  0
 x2 / 2
 1 e
A continuous RV X that can assume any value between x = 2 and x = 5 has a density function given by f(x) = k(1 + x). Find P(X < 4).

f(x) = k(1 + x) for 2 ≤ x ≤ 5;  0, elsewhere.

∫ (2 to 5) k(1 + x) dx = 1
⟹ k [x + x²/2] (2 to 5) = 1
⟹ k (27/2) = 1
⟹ k = 2/27

P(X < 4) = (2/27) ∫ (2 to 4) (1 + x) dx = (2/27)[x + x²/2] (2 to 4) = 16/27
Expectation (mean) and variance of a RV X

INTRODUCTION:
• A discrete/continuous RV is no doubt completely described by its pmf/pdf or by its cumulative distribution function.
• For many purposes, however, this description is often considered to contain too many details.
• It is rather simpler and more convenient to describe a RV, or to characterize its distribution, by a few parameters, known as statistical measures, that are representative of the distribution, of which the expectation (or mean) is the most important.
• Mathematical Expectation
If X is a discrete R.V. with p.m.f. P(X = xi) = pi, then the mathematical expectation of X (or arithmetic mean of X), denoted E(X), is defined by
E(X) = Σ xi pi.
If X is a continuous R.V. with p.d.f. f(x), then the mathematical expectation of X is defined by
E(X) = ∫ x f(x) dx.
• Properties of expected values:
• E(X + Y) = E(X) + E(Y), provided all the expectations exist.
• If X and Y are independent random variables, then E(XY) = E(X) · E(Y).
• In general, E(XY) ≠ E(X) · E(Y).
• Ex. Find the expectation of the number on a die when it is thrown.
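A two-line check (not part of the notes) of this exercise:

from fractions import Fraction

E = sum(x * Fraction(1, 6) for x in range(1, 7))   # E(X) = Σ x_i p_i
print(E)                                           # 7/2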
Discrete probability distribution

• In this section we shall discuss some of the


probability distributions that figure most
prominently in statistical theory and application.
• We shall also study their parameter and derivation
of some of their important characteristics.
• We shall introduce a number of discrete probability distributions that have been successfully applied in a wide variety of decision situations.
• The purpose of this section is to show the type of situations in which these distributions can be applied.
• This section is devoted to the study of univariate distributions like
1. Discrete uniform (rectangular) distribution
2. Binomial distribution
3. Poisson distribution
4. Geometric distribution &
5. Hypergeometric distribution
• We have already defined distribution function,
mathematical expectation, variance and m.g.f.
This prepares us for a study of discrete
theoretical probability distribution.
Discrete uniform (rectangular) distribution
• Definition: A r.v. X is said to have a discrete uniform (rectangular) distribution over the range [1, n] if its p.m.f. is expressed as follows:

• P(X = x) = 1/n, for x = 1, 2, ..., n;  0, otherwise.

• Here n is known as the parameter of the distribution and lies in the set of all positive integers.

• Mean = E(X) = (n + 1)/2

• Variance = Var(X) = (n² - 1)/12

• e.g. the number of points obtained in a single throw of an unbiased die follows a uniform distribution with p.m.f.
P(X = x) = 1/6, x = 1, 2, ..., 6.
Binomial distribution
•• Let
  a random experiment be performed repeatedly and let
the occurrence of an event in a trial be called a success and
its non-occurrence a failure.
• Consider a set of n independent trials (n being finite) in
which the probability 'p' of success in any trial is constant
for each trial. Then q = 1 - p, is the probability of failure in
any trial.
• Definition.
• A random variable X is said to follow the binomial distribution if it assumes only non-negative values and its probability mass function is given by

• P(X = x) = C(n, x) pˣ qⁿ⁻ˣ, x = 0, 1, 2, ..., n; q = 1 - p.
• The
  two independent constants n and p in the
distribution are known as the parameters of the
distribution.

• 'n' is also, sometimes known as the degree of the


binomial distribution.

• We shall use the notation
X ~ B(n, p)
to denote that the random variable X follows the binomial distribution with parameters n and p.
• examples of binomial distribution
• (i) Number of defective bolts in a box
containing n bolts.
• (ii) Number of post-graduates in a group of n
people.
• (iii) Number of oil wells yielding natural gas in
a group of n wells test drilled.
• (iv) Number of machines lying idle in a factory
having n machines.
Physical conditions for Binomial Distribution.

We get the binomial distribution under the following


experimental conditions.

1) Each trial results in two mutually disjoint outcomes


termed as success and failure.

2) The number of trials ‘n' is finite.

3) The' trials are independent of each other.

4) The probability of success 'p’ is constant for each trial.


• Mean and variance of binomial distribution

Mean = E(X) = np

Var(X) = npq
Example 1
The incidence of an occupational disease in an industry is such that the workers have a 20%
chance of suffering from it. What is the probability that out of 6 workers chosen at random,
four or more will suffer from the disease?
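A short sketch (not from the notes) computing the probability asked for in Example 1, with X ~ B(n = 6, p = 0.2):

from math import comb

n, p = 6, 0.2
q = 1 - p
# P(X >= 4) = sum of C(6, x) p^x q^(6 - x) for x = 4, 5, 6
prob = sum(comb(n, x) * p**x * q**(n - x) for x in range(4, n + 1))
print(round(prob, 5))   # ≈ 0.01696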
Poisson Distribution
Poisson distribution is a distribution related to probabilities of events which are
extremely rare, but which have a large number of independent opportunities
for occurrence.

Following are some instances where Poisson distribution may be successfully


employed
1) Number of deaths from a disease. Such as heart attack or cancer or due to
snake bite.
2) Number of suicides reported in a particular city.
3) The number of defective material in a packing manufactured by a good
concern.
4) Number of faulty blades in a packet of 100.
5) Number of air accidents in some unit of time.
6) Number of printing mistakes at each page of the book.
7) Number of cars passing a crossing per minute during the busy hours of a day.
•  Poisson distribution is a limiting case of the
binomial distribution under the following
conditions:
1) n, the number of trials is indefinitely large, i.e.,
n →∞.

2) p, the constant probability of success for each trial, is indefinitely small, i.e., p → 0.

3) np = λ (say) is finite. Thus p = λ/n, q = 1 - λ/n, where λ is a positive real number.
• Definition:
• A random variable X is said to follow a Poisson distribution if it assumes only non-negative values and its probability mass function is given by
• P(X = x) = e^(-λ) λˣ / x!, x = 0, 1, 2, ...

• Here λ is known as the parameter of the distribution.

• We shall use the notation X ~ P(λ) to denote that X is a Poisson variate with parameter λ.
• Mean and variance of Poisson distribution

• Mean = E(X) = λ.

• Var X = λ
A car-hire firm has two cars, which it hires out day by day. The number
of demands for a car on each day is distributed as a Poisson distribution
with a mean of 1.5. Calculate the proportion of days on which (i) neither
car is used, and (ii) the proportion of days on which some demand is
refused.
X: the number of demands for a car on a given day
X ~ P(m), with mean m = 1.5.
Probability of a day on which there are x demands for a car:
P(X = x) = e^(-m) mˣ / x! = e^(-1.5) (1.5)ˣ / x!   .......(*)

(i) Neither car is used (x = 0):
P(X = 0) = e^(-1.5) (1.5)⁰ / 0! = e^(-1.5) = 0.22313

(ii) Proportion of days on which some demand is refused (more than 2 demands):
P(X > 2) = 1 - P(X ≤ 2)
= 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 - e^(-1.5) [1 + 1.5 + (1.5)²/2]
= 0.1912
Six coins are tossed 6400 times. Using the Poisson distribution, what is
the approximate probability of getting six heads 10 times?
X: the number of times six heads are obtained.
X ~ P(m), where
n = 6400,
p = probability of getting six heads in one toss of six coins = (1/2)⁶ = 1/64,
m = np = 6400/64 = 100.
Probability of getting six heads x times:
P(X = x) = e^(-m) mˣ / x! = e^(-100) 100ˣ / x!
The approximate probability of getting six heads 10 times is
P(X = 10) = e^(-100) 100¹⁰ / 10!
•   Geometric distribution
•Suppose we have a series of independent trials or
repetitions and on each repetition or trial the
probability of success 'p' remains the same. Then the
probability that there are x failures preceding the first
success is given by qxp.
 
• Definition.
A random variable X is said to have a geometric distribution if it assumes only non-negative values and its probability mass function is given by
• P(X = x) = qˣ p; x = 0, 1, 2, ...; 0 < p ≤ 1, q = 1 - p.
• Remarks:
1. Since the various probabilities for x = 0, 1, 2, ... are the various terms of a geometric progression, hence the name geometric distribution.

2. Clearly, the assignment of probabilities is permissible, since
Σ (x=0 to ∞) qˣ p = p/(1 - q) = 1.
• Mean and variance of geometric distribution

• Mean = E(X) = q/p.


 
• Var X = q/p2
Hypergeometric distribution
• If X represents the number of defectives found when n items are drawn without replacement from a lot of N items containing k defectives and (N - k) non-defectives, then

P(X = r) = [C(k, r) · C(N - k, n - r)] / C(N, n);  r = 0, 1, 2, ..., min(n, k)

• Definition:
• If X is a discrete RV that can assume the non-negative values 0, 1, 2, ..., such that its p.m.f. is given by
P(X = r) = [C(k, r) · C(N - k, n - r)] / C(N, n);  r = 0, 1, 2, ..., min(n, k),
then X follows a hypergeometric distribution with parameters N, k and n.
• Remarks:
• Clearly, the assignment of probabilities is permissible, since Σ (over r) P(X = r) = 1.
• Mean and variance of the hypergeometric distribution:
• E(X) = nk/N
• Var(X) = nk(N - k)(N - n) / [N²(N - 1)]
Continuous probability distribution

• In this section we shall discuss some of the


probability distributions that figure most
prominently in statistical theory and applications.
• We shall also study their parameter and derivation
of some of their important characteristics.
• We shall introduce a number of continuous probability distributions that have been successfully applied in a wide variety of decision situations.
• The purpose of this section is to show the type of situations in which these distributions can be applied.
This section is devoted to the study of univariate distributions like
1. Continuous uniform (rectangular) distribution
2. exponential distribution
3. Normal distribution
4. Chi square distribution
We have already defined distribution function,
mathematical expectation, variance and m.g.f.

This prepares us for a study of continuous


theoretical probability distribution.
Rectangular (or Uniform Distribution).

• A random variable X is said to have a


continuous uniform distribution over an
interval (a, b) if its probability density
function is constant = k (say), over the
entire range of X, i.e.

k, a  x  b
f (x)  
0, otherwise
Since total probability is always unity, we have
b b

 f (x)dx  1  k  dx  1
a a

1
i.e.k 
ba

 1
 , a  x  b
 f (x)   b  a
0, otherwise
Remarks.
• a and b, (a < b) are the two parameters of the
uniform distribution on (a, b).
• The distribution is also known as rectangular
distribution, since the curve y = f(x) describes a
rectangle over the x-axis and between the ordinates
at x=a and x=b.
• The distribution function F(x) is given by
F(x) = 0, if -∞ < x < a
     = (x - a)/(b - a), a ≤ x < b
     = 1, b ≤ x < ∞
• For a rectangular or uniform variate X in (-a, a), the p.d.f. is given by
f(x) = 1/(2a), -a < x < a;  0, otherwise.
Moments of Rectangular Distribution.

μ'r = (1/(b - a)) · (b^(r+1) - a^(r+1))/(r + 1)

• In particular,

Mean = E(X) = μ'1 = (a + b)/2

Variance = (b - a)²/12
• Ex. If X is uniformly distributed
with mean 1 and variance 4/3. find
P(X < 0).

• Find mean and variance of


rectangular distribution given by
p.d.f. f(x) = 1, (0 ≤ x ≤ 1).
The Exponential Distribution
A continuous random variable X assuming non-negative values is said to have an exponential distribution with parameter λ > 0, if its p.d.f. is given by
f(x) = λ e^(-λx), x ≥ 0;  0, otherwise.

• The distribution function F(x) is given by
F(x) = 1 - e^(-λx), x ≥ 0;  0, otherwise.

• Moment Generating Function of the Exponential Distribution:
M_X(t) = λ/(λ - t), t < λ

• Mean = 1/λ

• Variance = 1/λ²
Ex. Show that the exponential distribution "lacks memory", i.e. if X has an exponential distribution, then for every constant a > 0 one has
P(Y ≤ x / X ≥ a) = P(X ≤ x) for all x, and
P(Y > x / X ≥ a) = P(X > x) for all x, where Y = X - a.
Ex. The time (in hours) required to repair a machine is exponentially distributed with parameter λ = 1/2.
(i) What is the probability that the repair time exceeds 2 hours?
(ii) What is the conditional probability that a repair takes at least 10 hours, given that its duration exceeds 9 hours?
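A brief sketch (not from the notes) for this exercise, using the survival function P(X > x) = e^(-λx) and the memoryless property:

from math import exp

lam = 0.5
surv = lambda x: exp(-lam * x)        # P(X > x) = e^(-λx)

print(round(surv(2), 4))              # (i) P(repair time > 2 h) ≈ 0.3679
# (ii) by memorylessness, P(X >= 10 | X > 9) = P(X >= 1)
print(round(surv(10) / surv(9), 4), round(surv(1), 4))   # both ≈ 0.6065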
Normal (Gaussian) distribution
------ 68.27%----
-----------------95.45% ------------------
----------------------------- 99.73%----------------------------
Area between z = 1 is 68.27%
Area between z = 2 is 95.45%
Area between z = 1 is 99.73%
uses of normal distribution
(i) The normal distribution can be used to approximate binomial and Poisson
distributions.
(ii) It is used extensively in sampling theory. It helps to estimate parameters from
statistics and to find confidence limits of the parameter.
(iii) It is widely used in testing statistical hypotheses and tests of significance, in which it is always assumed that the population from which the samples have been drawn has a normal distribution.
(iv) It serves as a guiding instrument in the analysis and interpretation of statistical data.
(v) It can be used for smoothing and graduating a distribution which is not normal, simply by fitting a normal curve to it.
• The average seasonal rainfall in a place is 16 inches with
an SD of 4 inches. What is the probability that the
rainfall in that place will be between 20 and 24 inches in
a year?
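A sketch (not from the notes) answering the rainfall question with X ~ N(16, 4²); only the standard library is used, via the error function:

from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    """Standard normal CDF evaluated at the standardised value (x - mu)/sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

p = norm_cdf(24, 16, 4) - norm_cdf(20, 16, 4)   # area between z = 1 and z = 2
print(round(p, 4))                              # ≈ 0.1359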
Chi square distribution
If X1, X2, ..., Xn are independent standard normal random variables, then it is known that (X1² + X2² + ... + Xn²) follows a probability distribution called the chi-square distribution with n degrees of freedom.
Ex. The following data give the number of aircraft accidents that occurred during the various days of a week.

Day:              Mon  Tues  Wed  Thu  Fri  Sat
No. of accidents: 15   19    13   12   16   15

• Test whether the accidents are uniformly distributed over the week.
• (Given χ²₀.₀₅(5) = 11.07)
Marginal Probability Distribution
P(X = xi) = P{(X = xi and Y = y1) or (X = xi and Y = y2) or ...}
          = pi1 + pi2 + ... = Σ (over j) pij = pi*
is called the marginal probability function of X.
It is defined for X = x1, x2, ... and denoted as pi*.
The collection of pairs {xi, pi*}, i = 1, 2, 3, ..., is called the marginal probability distribution of X.
Similarly, the collection of pairs {yj, p*j}, j = 1, 2, 3, ..., is called the marginal probability distribution of Y, where
p*j = Σ (over i) pij = P(Y = yj).

In the continuous case,
fX(x) = ∫ (-∞ to ∞) f(x, y) dy is called the marginal density of X.
Similarly,
fY(y) = ∫ (-∞ to ∞) f(x, y) dx is called the marginal density of Y.

Conditional Probability Distribution

P(X = xi / Y = yj) = P(X = xi, Y = yj) / P(Y = yj) = pij / p*j
is called the conditional probability function of X, given that Y = yj.
P(Y = yj / X = xi) = P(X = xi, Y = yj) / P(X = xi) = pij / pi*
is called the conditional probability function of Y, given that X = xi.
X and Y are independent
⟺ pij / p*j = P(X = xi) = pi*  ⟺ pij = pi* · p*j

In the continuous case,
f(x/y) = fXY(x, y) / fY(y)
is called the conditional density of X, given Y.
f(y/x) = fXY(x, y) / fX(x)
is called the conditional density of Y, given X.
X and Y are independent
⟺ fXY(x, y) / fY(y) = fX(x)  ⟺ fXY(x, y) = fX(x) · fY(y)
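A small sketch (not from the notes) computing marginal and conditional distributions from a joint table p_ij; the joint values below are invented purely for illustration.

from fractions import Fraction

# joint[(i, j)] = P(X = x_i, Y = y_j)  (hypothetical values)
joint = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 4),
         (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 2)}

xs = {i for i, _ in joint}
ys = {j for _, j in joint}

p_x = {i: sum(joint[(i, j)] for j in ys) for i in xs}      # marginal of X: p_i*
p_y = {j: sum(joint[(i, j)] for i in xs) for j in ys}      # marginal of Y: p_*j
cond_x_given_y1 = {i: joint[(i, 1)] / p_y[1] for i in xs}  # P(X = x_i / Y = y_1)

print(p_x, p_y, cond_x_given_y1)
# independence check: p_ij == p_i* * p_*j for every cell?
print(all(joint[(i, j)] == p_x[i] * p_y[j] for i in xs for j in ys))   # False here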
Functions of random variables

The p.d.f. fY(y), when fX(x) is known:
• Theorem:
If two RVs are independent, then the density function of their sum is given by the convolution of their density functions.
• Theorem:
If two RVs X and Y are independent, then the p.d.f. of Z = XY can be expressed in terms of the density functions of X and Y.