DISTRIBUTIONS
Ishaan Taneja
About the author
The author, Ishaan Taneja, is an accomplished and driven professional with deep expertise in the fields of Statistics and Computer Science. They have a solid academic background, including a B.Sc. (Hons.) in Statistics from Ram Lal Anand College, University of Delhi, and an M.Sc. in Statistics from Hindu College, University of Delhi, and they have consistently demonstrated exceptional performance and a passion for data analysis.
Currently pursuing an M.S. in Computer Science at IIT Madras, Ishaan Taneja is actively exploring the intersection of Statistics and Machine Learning, further expanding their knowledge and understanding of the subject. Their diverse experience includes a role as a Course Instructor in the IIT Madras BS Degree Programme, where they contributed to the development of an online Statistics coursework.
Ishaan Taneja possesses the unique ability to bridge theoretical concepts with practical applications. Through their book, "Probability and Probability Distributions," they aim to provide readers with a comprehensive and accessible exploration of these subjects, sharing their expertise and empowering others in the field.
Contents
1 Data, Statistics and Probability
1.1 Introduction
1.1.1 Deterministic pattern
1.1.2 Random-like pattern
1.1.3 Statistical study of phenomenon
1.2 Basic Concepts
1.2.1 Experiment
1.2.2 Outcome
1.2.3 Sample Space
1.2.3.1 Solved Examples:
1.3 Events
1.3.1 Occurrence of events
1.3.1.1 One event can be contained in another:
1.3.1.2 Complement of an event:
1.3.2 Combining events
1.3.2.1 Union of events:
1.3.2.2 Intersection of events:
1.3.2.3 Solved examples:
1.3.3 Disjoint events
1.3.3.1 Partition
1.3.4 De Morgan's Law
1.3.5 Solved Examples:
1.4 Venn Diagram
2 Introduction to Probability
2.1 Events and Chance
2.2 Mathematical and Probability theory
2.2.1 Mathematical Theory
2.2.2 Probability Theory
2.2.2.1 Probability
2.3 Basic properties of probability
2.3.1 Working with probability spaces
2.4 Distributions
2.4.1 Uniform Distribution on a finite sample space
2.4.2 Solved Examples:
2.5 Conditional Probability
2.5.1 Conditional Probability Space
2.5.2 Multiplication rule:
2.5.3 Solved Examples:
2.6 Bayes' Theorem and Independence
2.6.1 Law of Total Probability
2.6.1.1 Solved Examples:
2.6.2 Bayes' Theorem
2.6.2.1 Solved Examples:
2.6.3 Independence of Two Events
2.6.3.1 Solved Examples:
2.6.4 Mutual Independence of Three Events
2.6.4.1 Solved Examples:
2.7 Repeated Independent Trials
2.7.1 Bernoulli Distribution
2.7.1.1 Single Bernoulli Trial
2.7.1.2 Repeated Bernoulli Trials
2.7.1.3 Solved Examples:
2.7.2 Binomial Distribution
2.7.2.1 Visualisation of Binomial Distribution
2.7.2.2 Solved Examples:
2.7.3 Geometric Distribution
2.7.3.1 Visualisation of Geometric Distribution
2.7.3.2 Solved Examples:
2.8 Discrete Random Variable
2.8.1 Random Variable and events
2.8.2 Distribution of a Discrete Random Variable
2.8.2.1 Probability Mass Function (PMF)
2.8.2.2 Properties of PMF
2.8.3 Solved Examples:
2.9 Common Discrete Distributions
2.9.1 Uniform Random Variable
2.9.2 Bernoulli Random Variable
2.9.3 Binomial Random Variable
2.9.4 Geometric Random Variable
2.9.5 Negative Binomial Distribution
2.9.5.1 Negative Binomial Random Variable
2.9.6 Poisson Distribution
2.9.6.1 Application of Poisson Distribution
2.9.6.2 Visualisation of Poisson Distribution
2.9.6.3 Poisson Random Variable
2.9.7 Hypergeometric Distribution
2.9.7.1 Hypergeometric Random Variable
2.10 Functions of one discrete random variable
2.10.1 Solved Examples:
Chapter 1
Data, Statistics and Probability
1.1.3 Statistical study of phenomenon
1.2.2 Outcome
The result of an experiment (in as much detail as necessary) is referred to as the Outcome
of that experiment.
The recorded outcomes of an experiment may also be referred to as data.
Examples
(i) When the experiment is ‘Tossing a coin’, the possible outcomes could be either ‘Heads’
or ‘Tails’.
(ii) When the experiment is ‘Rolling a die’, the possible outcomes could be either ‘1’ or ‘2’
or ‘3’ or ‘4’ or ‘5’ or ‘6’.
(iii) When the experiment is complex, like the ‘Indian Premier League tournament’, we cannot easily list down all possible outcomes. (The outcome could, for instance, be recorded in a structured file such as a YAML file.)
1.2.3 Sample Space
The sample space is a set that contains all outcomes of an experiment. It is typically denoted
by ‘S’.
Note:
• In practice, it is enough to imagine a sample space instead of explicitly writing it down.
• In situations where confusion occurs, writing down the sample space can give clarity.
Examples
(i) If the experiment is ‘Tossing a coin’ then, the sample space; S = {Heads, Tails}.
(ii) If the experiment is ‘Rolling a die’ then, the sample space; S = {1, 2, 3, 4, 5, 6}.
(iii) If the experiment is as complex as ‘Indian Premier League tournament’ then the sample
space is difficult to write down. So, we generally break it into small experiments
depending on what level of details we want.
Q1. In an urn of marbles, there are three marbles each of colour red, blue and white. If
the experiment is drawing a marble from the urn, then write down its sample space.
Solution:
The possible outcomes are that either the marble drawn is of red colour or blue colour
or white colour.
Hence, the sample space; S = {red, blue, white}
Q2. In an urn of marbles, there are two red coloured marbles, and one each of blue and white colour. If the experiment is drawing two marbles from the urn with replacement,
then write down its sample space.
Solution:
The possible outcomes in the first draw are that the marble drawn is red, blue or white in colour.
Since we are drawing the marbles with replacement, the possible outcomes for the second draw are the same as for the first draw.
Let us denote, Red: R, Blue: B and White: W
Hence, the sample space; S = {RR, RB, RW, BR, BW, BB, WR, WB, WW}
Q3. In an urn of marbles, there are two red coloured marbles, and 1 each of blue and white
colour. If the experiment is drawing two marbles from the urn without replacement,
then write down its sample space.
Solution:
The possible outcomes in the first draw are that the marble drawn is red, blue or white in colour.
Since we are drawing the marbles without replacement, the possible outcomes for the second draw depend on the first draw.
Let us denote, Red: R, Blue: B and White: W
Hence, the sample space; S = {RR, RB, RW, BR, BW, WR, WB}
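As a cross-check, the two sample spaces above can be enumerated programmatically. A minimal Python sketch (illustrative, not from the text), using only the standard `itertools` module:

```python
from itertools import product, permutations

# Urn contents: two red marbles, one blue, one white.
marbles = ["R", "R", "B", "W"]
colours = sorted(set(marbles))

# With replacement: every ordered pair of colours is possible.
with_replacement = sorted(set(product(colours, repeat=2)))

# Without replacement: ordered pairs of two distinct marbles,
# reduced to distinct colour sequences.
without_replacement = sorted(set(permutations(marbles, 2)))

print(len(with_replacement))     # 9 outcomes: RR, RB, ..., WW
print(len(without_replacement))  # 7 outcomes: BB and WW are impossible
```

Note that RR survives in the without-replacement case only because the urn holds two red marbles.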
Q4. If the experiment is to draw a card from a well shuffled pack of 52 cards, then write
down its sample space.
Solution:
The possible outcomes are that either a numbered card or a face card is drawn.
In addition, it could belong to any of the four suits, i.e. Hearts, Diamonds, Clubs and
Spades.
So, the sample space can conveniently be written as the Cartesian product of these
two sets, i.e. Set of suits and Set of values.
Hence, S = {Hearts, Diamonds, Clubs, Spades} × {2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A}
which is the same as the set of 52 cards in the shuffled pack.
Remark:
In the above example, if we were to draw 13 cards from the well shuffled pack of 52 cards, then the number of possible outcomes is ⁵²C₁₃.
Writing down the sample space for this experiment is a very tedious task. Hence, in such cases it is not useful to write down the sample space explicitly, but to imagine it instead.
1.3 Events
An event is a subset of the sample space. There is a technical restriction on what subsets
can be events.
(More details on technical restrictions in section 2.8.1)
Note:
• All set-theoretic notions apply to events, and they tend to have a natural meaning.
Examples
(i) Tossing a coin: S = {Heads, Tails}.
• Number of events = 2² = 4
• Events: empty set, {Heads}, {Tails}, {Heads, Tails}
(ii) Rolling a die: S = {1, 2, 3, 4, 5, 6}.
• Number of events = 2⁶ = 64
• Events: empty set, {1}, {2}, {3}, {4}, {5}, {6}, {1,2}, {1,3}, . . . , {1, 2, 3, 4, 5, 6}
• Some events can also be described in words: Getting an even number i.e. {2,4,6},
Getting a multiple of 3 i.e. {3,6}, etc.
(iii) Fisherman goes out to fish:
• Sample space is complicated as we have to describe everything like catch of fish
in kilos, type of fish and so on.
• We can still define events of interest quite easily, even on the complicated sample
space.
Events: Catch is more than 100 kg, Pomfret is the catch, etc.
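For a finite sample space such as the die above, the count of events can be verified by enumerating the power set. A small Python sketch (an illustration, not part of the original text):

```python
from itertools import combinations

# Sample space for rolling a die.
S = {1, 2, 3, 4, 5, 6}

# Every subset of S is an event, so there are 2**len(S) events in total.
events = [set(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

print(len(events))          # 64 = 2**6
print({2, 4, 6} in events)  # True: "getting an even number" is one of them
```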
1.3.2 Combining events
We can combine events to create new events; the standard way is to operate on events using two set operations that are central in set theory, namely, ‘Unions’ and ‘Intersections’.
• The union of events is said to have occurred if at least one of the events has occurred.
Examples
(ii) Fisherman’s Catch: A = {more than 200 Kg} and B = {less than 50 Kg}
Implies that, A ∪ B = {either more than 200 Kg or less than 50 Kg}
• The intersection of events is said to have occurred if all of the events have occurred.
Examples
(ii) Fisherman’s Catch: A = {more than 100 Kg} and B = {less than 150 Kg}
Implies that, A ∩ B = {more than 100 Kg and less than 150 Kg}
1.3.2.3 Solved examples:
Q1. For an experiment of rolling a die, let us define the events as follows:
A = {even number} and B = {prime number}
(i) Find A ∪ B.
(ii) Find A ∩ B.
Solution:
A = {even number} = {2, 4, 6}
B = {prime number} = {2, 3, 5}
(i) A ∪ B = {2, 3, 4, 5, 6}
(ii) A ∩ B = {2}
Q2. 5 cards are to be drawn without replacement from a well shuffled pack of 52 cards.
Can we represent the event that there are no aces as an intersection of 5 events?
Solution:
Yes. Let us define the events as follows:
E = {no aces in 5 draws of cards}
E1 = {no ace in the first draw of card}
E2 = {no ace in the second draw of card}
E3 = {no ace in the third draw of card}
E4 = {no ace in the fourth draw of card}
E5 = {no ace in the fifth draw of card}
Then, we can express the event E as (E1 ∩ E2 ∩ E3 ∩ E4 ∩ E5 )
Q3. In an IPL experiment, consider an event E = {5 runs being scored in two legal deliveries}.
Can we represent this event as the union of events?
Solution:
Yes. The possible number of runs that can be scored off bat on any legal delivery is
0, 1, 2, 3, 4 and 6. (Assuming there are no extras and no overthrows)
Let us define the events as follows:
E1 = {1 run off the first delivery and 4 runs off the second delivery} = {1+4}
E2 = {2 runs off the first delivery and 3 runs off the second delivery} = {2+3}
E3 = {3 runs off the first delivery and 2 runs off the second delivery} = {3+2}
E4 = {4 runs off the first delivery and 1 run off the second delivery} = {4+1}
Then, we can express the event E as (E1 ∪ E2 ∪ E3 ∪ E4 )
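The same union can be built by brute-force enumeration of the two deliveries. A minimal Python sketch (the variable names are illustrative):

```python
from itertools import product

runs = [0, 1, 2, 3, 4, 6]   # runs possible off the bat on a legal delivery

# E = {5 runs scored in two legal deliveries}, the union of the
# smaller events E1..E4 from the text.
E = {pair for pair in product(runs, repeat=2) if sum(pair) == 5}

print(sorted(E))   # [(1, 4), (2, 3), (3, 2), (4, 1)]
```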
1.3.3 Disjoint events
If the events A and B are disjoint, then it implies that: A ∩ B = empty set
More generally, the events E1 , E2 , . . . , En are disjoint if:
Ei ∩ Ej = empty set ; for any i ̸= j
Examples
(ii) Fisherman’s Catch: A = {more than 200 Kg} and B = {less than 50 Kg}
A and B are disjoint events because, A ∩ B = empty set
1.3.3.1 Partition
If two or more disjoint events together make up the whole sample space, then they are referred to as a Partition (of the sample space).
We can make a partition of large sample spaces into multiple disjoint events for study.
Note:
The event and its complement are referred to as a “Partition” because:
• The event and its complement are always disjoint events, i.e. A ∩ Ac = empty set
• The event and its complement together cover all the outcomes, i.e. A ∪ Ac = S
Examples
For an experiment of drawing a card from a pack of 52 well shuffled cards, consider the
events:
E1 = {The card is a Spade}, E2 = {The card is a Heart}, E3 = {The card is a Club}
and E4 = {The card is a Diamond}
Then, the events E1 , E2 , E3 and E4 form a Partition of the sample space.
1.3.4 De Morgan’s Law
For any two events A and B:
• (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
• (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
These laws are very useful and will come to the rescue when we want to interpret an event given in English.
1.3.5 Solved Examples:
Going through these examples will enhance your skill of translating descriptions in English into events.
Q1. The hats of 5 persons are identical and get mixed up in a box. If each person picks a
hat at random from the box, then consider the following events:
A = {No person gets their own hat}
B = {Every person gets their own hat}
C = {At least one person does not get their own hat}
D = {At least one person gets their own hat}
Based on the given information, answer the following questions.
(i) What is Ac ?
(ii) What is B c ?
(iii) Are A and B disjoint events?
(iv) What is A ∩ B c ?
Solution:
(i) Aᶜ = {outcomes in S that are not in A} = {At least one person gets their own hat} = D
(Note: Do not read Aᶜ as “Everyone gets their own hat”; taking a complement is not an exercise in English comprehension. This is a case where writing down the complete sample space is complicated, but having an idea of the sample space is useful.)
(ii) Again, as we know that, B c = {outcomes in S and not in B} = (S \B)
So, even if any one of the 5 persons does not get their own hat, that outcome will
not belong to the event B but will belong to S.
Hence, B c = {At least one person does not get their own hat} = C
(iii) Since, A ∩ B = empty set =⇒ A and B are disjoint events.
i.e. the occurrence of these events that no person gets their own hat and every
person gets their own hat is not possible together.
(iv) A ∩ Bᶜ contains all the outcomes that are in the event A but not in B.
Now,
A ∩ Bᶜ = A ∩ C ; (because Bᶜ = C)
= {No person gets their own hat and at least one person does not get their own hat}
= {No person gets their own hat}
∴ A ∩ Bᶜ = A
Another way to approach:
Since A and B are disjoint events =⇒ A ⊆ B c
Hence, A ∩ B c = A
Q2. For the IPL experiment, in one over 6 deliveries are bowled where, in each delivery
0, 1, 2, 3, 4 or 6 runs may be scored. Consider the following events:
A = {No 4s in one over}
B = {No 6s in one over}
C = {Exactly 20 runs scored in one over}
Based on the given information, answer the following questions.
(i) What is A ∪ B?
(ii) What is A ∩ B?
(iii) Can A ∩ B ∩ C occur?
Solution:
1.4 Venn Diagram
A Venn diagram represents events as regions within the sample space.
• Union of events: all covered regions.
• Intersection of events: the common region.
The shaded region in the above Venn diagram represents A ∩ B c , i.e. Region of A
outside B.
Example:
Rolling a die: Even number, but not a multiple of 3.
• Disjoint events:
No overlap in regions.
In the above Venn diagram, events A and C and events B and C are disjoint events.
Similarly, we can perform more complicated operations and obtain different-looking regions.
One can also verify De Morgan’s laws using a Venn diagram. (Do give it a try!)
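Besides a Venn diagram, De Morgan's laws can be checked directly with set operations. A small Python sketch (using the die events from earlier; not part of the original text):

```python
# Verify De Morgan's laws on the die sample space.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even number
B = {3, 6}      # multiple of 3

assert S - (A | B) == (S - A) & (S - B)   # (A ∪ B)^c = A^c ∩ B^c
assert S - (A & B) == (S - A) | (S - B)   # (A ∩ B)^c = A^c ∪ B^c
print("De Morgan's laws hold for these events")
```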
Chapter 2
Introduction to Probability
2.1 Events and Chance
Now that we have defined the sample space, outcomes and events, we are interested in
whether the event will occur or not. If yes, then what are the chances of it occurring?
For instance,
• 13 cards drawn randomly from a well shuffled pack of 52 cards: What are the chances
of getting all 13 cards of the same suit?
Intuitively, we can say that the chance of an event occurring is low or high, but we still want to assign a proper numerical value of chance to each event in an experiment.
2.2 Mathematical and Probability theory
Probability theory lets us work with numbers that represent chance, and allows us to associate a mathematical value of chance with the occurrence of an event in an experiment.
Probability theory is a mathematically developed theory, built from:
• a few assumptions about the objects we have, or conditions which are always valid, i.e., axioms
• events
• a probability function
We have already discussed sample space and events in detail in the earlier sections. Let us
now discuss the probability function in detail.
2.2.2.1 Probability
Probability is a function P that assigns to each event a real number between 0 and 1. The entire probability space (sample space, events and probability function) should satisfy the following two axioms:
(i) P (S) = 1 (Probability of the entire sample space equals 1)
(ii) If A and B are disjoint events, then P (A ∪ B) = P (A) + P (B) (Probabilities of disjoint events add up)
• The higher the output value, the higher the chance of that event occurring.
• 0 means the event cannot occur, and 1 means it will always occur.
Examples
Tossing a coin: S = {H, T}.
Which of the following probability function(s) satisfy the axioms?
(ii) For 0 ≤ p ≤ 1:
P (empty) = 0, P ({H}) = p, P ({T}) = 1 − p and P ({H, T}) = 1
Solution:
2.3 Basic properties of probability
Property 1: Empty Set
The probability of an empty set (denoted by Φ) equals 0.
i.e., P (Φ) = 0
Proof:
We know that an event and its complement are always disjoint.
Since Φc = S, S and Φ are disjoint and Φ ∪ S = S
By using Axiom (2),
P (S ∪ Φ) = P (S) + P (Φ)
=⇒ P (S) = P (S) + P (Φ)
∴ P (Φ) = 0
Property 2: Complement
Let E c be the complement of an event E. Then,
P (E c ) = 1 − P (E)
Proof:
We know that an event and its complement are always disjoint. So, E and E c are disjoint
and E ∪ E c = S
By using Axiom (2),
P (E ∪ E c ) = P (E) + P (E c )
P (S) = P (E) + P (E c )
=⇒ 1 = P (E) + P (E c ) ; (By Axiom 1, P (S) = 1)
∴ P (E c ) = 1 − P (E)
Property 3: Subset
If the event E is a subset of the event F , i.e., E ⊆ F , then
P (F ) = P (E) + P (F \E) =⇒ P (E) ≤ P (F )
Proof:
Since E ⊆ F , we can write F = E ∪ (F \E), where E and (F \E) are disjoint events.
By using Axiom (2),
P (F ) = P (E) + P (F \E)
And since P (F \E) ≥ 0, it follows that P (E) ≤ P (F ).
Property 4: Union and Intersection
For any two events E and F :
P (E) = P (E ∩ F ) + P (E \F )
P (F ) = P (E ∩ F ) + P (F \E)
Proof:
Since E ∩ F is a subset of E, using the subset property
P (E) = P (E ∩ F ) + P (E \(E ∩ F ))
Now, E \(E ∩ F ) = E \F
Hence, P (E) = P (E ∩ F ) + P (E \F )
Similarly,
P (F ) = P (E ∩ F ) + P (F \E)
P (E ∪ F ) = P (E) + P (F ) − P (E ∩ F )
Proof:
The events (E \F ), (E ∩ F ) and (F \E) are disjoint.
Also, E ∪ F = (E \F ) ∪ (E ∩ F ) ∪ (F \E)
By using Axiom (2),
P (E ∪ F ) = P (E \F ) + P (E ∩ F ) + P (F \E)
(As we know that for any events A and B, P (A \B) = P (A) − P (A ∩ B))
=⇒ P (E ∪ F ) = [P (E) − P (E ∩ F )] + P (E ∩ F ) + [P (F ) − P (E ∩ F )]
∴ P (E ∪ F ) = P (E) + P (F ) − P (E ∩ F )
If the events E and F are disjoint, then E ∩ F = Φ, so P (E ∩ F ) = 0. Plugging this in, we get back Axiom (2).
Hence, this property is a generalization of Axiom (2).
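The union property can be verified numerically on a small uniform space. A minimal Python sketch (the helper `P` is my own, assuming a fair die; not from the text):

```python
from fractions import Fraction

# Uniform distribution on the die sample space.
S = {1, 2, 3, 4, 5, 6}

def P(event):
    """Probability of an event under the uniform distribution on S."""
    return Fraction(len(event), len(S))

E = {2, 4, 6}   # even number
F = {3, 6}      # multiple of 3

# P(E ∪ F) = P(E) + P(F) − P(E ∩ F)
assert P(E | F) == P(E) + P(F) - P(E & F)
print(P(E | F))   # 2/3
```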
Solution:
Let us denote, David: D, Megha: M , Rajesh: R and Veronica: V .
(i) The outcome of this experiment is of the form {Waiter name, Cashier name}
Therefore, the sample space is given by:
(ii) If cashier is from Delhi, then the cashier position can be filled either by David or
by Megha.
Therefore, the event A is given by:
(iii) If exactly one position is to be filled by a Delhiite, then either the cashier position
or the waiter position, but not both, can be filled either by David or by Megha.
Therefore, the event B is given by:
(iv) Neither position being filled by a Delhiite means that neither position is filled by David or by Megha.
Therefore, the event C is given by:
Q2. In a town, there are fishing boats that go out to catch fish every day. Over the years, folks have observed the following:
What are the chances of catching between 400 and 500 kg of fish in a day?
Solution:
Let us define the events as follows:
A : Catching more than 400 kg of fish in a day.
B : Catching more than 500 kg of fish in a day.
So, P (A) = 0.35 and P (B) = 0.10
Let E be the event of catching between 400 and 500 kg of fish in a day. In other words, the catch of fish in a day should be more than 400 kg but not more than 500 kg. Therefore,
P (E) = P (A \B)
(By subset property, if B ⊆ A then, P (A) = P (B) + P (A \B))
=⇒ P (E) = P (A) − P (B)
= 0.35 − 0.10
∴ P (E) = 0.25
Q3. Suppose you hear the following forecast for rain and temperature:
• Chance of rain tomorrow is 60%.
• Chance of maximum temperature above 30◦ C tomorrow is 70%.
• Chance of rain and maximum temperature above 30◦ C tomorrow is 40%.
What are the chances of no rain and maximum temperature below 30◦ C tomorrow?
Solution:
Let us define the events as follows:
A : There will be rain ; Ac : There will be no rain
B : The maximum temperature is above 30◦ C ; B c : The maximum temperature
is below 30◦ C
So, P (A) = 0.60, P (B) = 0.70 and P (A ∩ B) = 0.40
Now, the event
E ={There will be no rain and maximum temperature below 30◦ C} = Ac ∩ B c
Therefore,
P (E) = P (Ac ∩ B c )
= P (A ∪ B)c ; (Using De Morgan’s Law)
= 1 − P (A ∪ B)
=⇒ P (E) = 1 − [P (A) + P (B) − P (A ∩ B)] ; (Using Property 4: Union and intersection)
= 1 − [0.60 + 0.70 − 0.40]
∴ P (E) = 0.10
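The steps of this solution can be scripted directly. A minimal Python sketch (variable names are my own):

```python
# P(A) = rain, P(B) = max temp above 30°C, from the forecast above.
p_A, p_B, p_AB = 0.60, 0.70, 0.40

# P(A ∪ B) by the union property, then De Morgan for P(A^c ∩ B^c).
p_union = p_A + p_B - p_AB
p_E = 1 - p_union

print(round(p_E, 2))   # 0.1
```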
2.4 Distributions
The idea of distributions is to assign probabilities to each of the individual outcomes in the
sample space.
It gives us the sense of how the probabilities are distributed over the outcomes.
This is possible when the outcomes can be enumerated as first, second, third and so on.
Example:
Rolling a die: S = {1,2,3,4,5,6}
Let us suppose that all the individual outcomes are events and,
P ({1}) = p1 , P ({2}) = p2 , P ({3}) = p3 , P ({4}) = p4 , P ({5}) = p5 and, P ({6}) = p6
where, 0 ≤ pi ≤ 1; i = 1, 2, 3, 4, 5, 6
Also, the events {1}, {2}, {3}, {4}, {5} and {6} are disjoint events.
And, {1} ∪ {2} ∪ {3} ∪ {4} ∪ {5} ∪ {6} = S.
Hence, by Axiom (2), p1 + p2 + p3 + p4 + p5 + p6 = P (S) = 1
Each of the pi ; i = 1, 2, 3, 4, 5, 6, have to be between 0 and 1, and they have to add up to 1.
We can also find the probability of any complicated event by splitting it as union of in-
dividual outcomes.
Note:
If the die is fair, then pi = 1/6 ; i = 1, 2, 3, 4, 5, 6
i.e. each individual outcome has the same probability. (Generally, referred to as equally-
likely outcomes)
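Such a distribution is naturally represented as a mapping from outcomes to probabilities. A small Python sketch (illustrative, not from the text, using exact fractions):

```python
from fractions import Fraction

# A fair die: every outcome gets probability 1/6.
p = {i: Fraction(1, 6) for i in range(1, 7)}

# The p_i lie between 0 and 1 and sum to 1, as the axioms require.
assert sum(p.values()) == 1

# Probability of a more complicated event, by splitting it into outcomes.
even = {2, 4, 6}
print(sum(p[i] for i in even))   # 1/2
```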
Q2. In a throw of two fair dice, what is the probability that the sum of the two numbers is
8?
Solution:
The sample space consists of 36 outcomes given by:
S ={(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
Since, the fair dice are rolled, the distribution is assumed to be uniform.
Let us define the event,
A = {The sum of the two numbers is 8} = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}
Therefore,
P (A) = (Number of outcomes in the event A) / (Number of outcomes in S) ; (because, Uniform distribution)
∴ P (A) = 5/36
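The count of favourable outcomes can be cross-checked by enumeration. A minimal Python sketch (not from the text):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of throwing two fair dice.
S = list(product(range(1, 7), repeat=2))
A = [pair for pair in S if sum(pair) == 8]

print(A)                         # [(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)]
print(Fraction(len(A), len(S)))  # 5/36
```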
Q3. The hats of 3 persons are identical and get mixed up. Each person picks a hat at
random. What is the probability that none of the persons gets their own hat?
Solution:
Let the three persons be P1 , P2 and P3 , and their hats are H1 , H2 and H3 respectively.
Then the possible outcomes are listed in the table below, where each column represents a person and each row is one way the hats can be picked. The sample space is as follows:
     P1   P2   P3
1    H1   H2   H3
2    H1   H3   H2
3    H2   H1   H3
4    H2   H3   H1
5    H3   H1   H2
6    H3   H2   H1
Since each person picks a hat at random, the distribution is uniform over these 6 outcomes.
The outcomes in which none of the persons gets their own hat are rows 4 and 5.
∴ P (None of the persons gets their own hat) = 2/6 = 1/3
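The same enumeration can be done with permutations in code. A small Python sketch (illustrative, not part of the original text):

```python
from fractions import Fraction
from itertools import permutations

hats = ["H1", "H2", "H3"]

# Each row of the table is one assignment of hats to persons P1, P2, P3.
S = list(permutations(hats))
nobody_own = [row for row in S if all(row[i] != hats[i] for i in range(3))]

print(len(S), len(nobody_own))            # 6 2
print(Fraction(len(nobody_own), len(S)))  # 1/3
```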
2.5 Conditional Probability
2.5.1 Conditional Probability Space
Let B be an event with P (B) > 0. Then the Conditional Probability Space given B is:
• Sample space : B, because the event B has already occurred, and we have moved into
this sample space.
• Probability function : P (A|B) = P (A ∩ B)/P (B), for any event A.
Remember that the new probability function also satisfies both the axioms:
Note:
• P (B) > 0 because, if P (B) = 0, then B has no chance of occurring and cannot be observed.
• One can also check that all the basic properties of probability are satisfied by the
probability function P (A|B).
Example:
Toss a coin three times:
Initial sample space ; S = {HHH, HHT, HT H, T HH, T T H, T HT, HT T, T T T }
Let us suppose, an event B = {First toss result in tails} has occurred.
Therefore, the reduced probability space to work on for the second and third toss is:
2.5.2 Multiplication rule:
For events A and B with P (B) > 0, the multiplication rule states that:
P (A ∩ B) = P (A|B)P (B)
It can be generalized to n events by iterating the multiplication rule (do try to derive it!):
P (A1 ∩ A2 ∩ · · · ∩ An ) = P (A1 )P (A2 |A1 )P (A3 |A1 ∩ A2 ) · · · P (An |A1 ∩ · · · ∩ An−1 )
2.5.3 Solved Examples:
Q1. A fair die is rolled and it is given that the outcome is an even number, i.e. the event E = {2, 4, 6} has occurred. Compute the following conditional probabilities:
(i) P ({2}|E)
(ii) P ({1}|E)
(iii) P ({2, 3, 4}|E)
Solution:
Sample space: S = {1, 2, 3, 4, 5, 6}
Since, the fair die is rolled, the distribution is assumed to be uniform, i.e.
P ({i}) = 1/6 ; i = 1, 2, 3, 4, 5, 6
The event E = {2, 4, 6}. Therefore,
P (E) = (Number of outcomes in the event E) / (Number of outcomes in S) ; (because, Uniform distribution)
∴ P (E) = 3/6
(i) P ({2}|E) = P ({2} ∩ E) / P (E) = P ({2}) / P (E)
∴ P ({2}|E) = (1/6) / (3/6) = 1/3
(ii) P ({1}|E) = P ({1} ∩ E) / P (E) = P (Φ) / P (E)
∴ P ({1}|E) = 0 / (3/6) = 0
(iii) P ({2, 3, 4}|E) = P ({2, 3, 4} ∩ E) / P (E) = P ({2, 4}) / P (E)
∴ P ({2, 3, 4}|E) = (2/6) / (3/6) = 2/3
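For uniform spaces, these conditional probabilities reduce to counting outcomes. A minimal Python sketch (the helper `cond_prob` is my own, not from the text):

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}

def cond_prob(A, B):
    """P(A|B) = |A ∩ B| / |B| under the uniform distribution on S."""
    return Fraction(len(A & B), len(B))

print(cond_prob({2}, E))         # 1/3
print(cond_prob({1}, E))         # 0
print(cond_prob({2, 3, 4}, E))   # 2/3
```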
Q2. There are two urns ‘A’ and ‘B’. Urn ‘A’ contains 7 red marbles and 6 blue marbles
while, Urn ‘B’ contains 5 red marbles and 8 blue marbles. If an urn is chosen at
random and then the marble is drawn at random from the chosen urn. Based on this
information, compute the following probabilities:
(i) A red marble is drawn given that Urn ‘A’ is chosen.
(ii) A blue marble is drawn given that Urn ‘B’ is chosen.
Solution:
(i) It is given that the Urn ‘A’ is chosen. Hence, we have the conditional probability
space, where the sample space is given by:
S = {R1 , R2 , R3 , R4 , R5 , R6 , R7 , B1 , B2 , B3 , B4 , B5 , B6 }
Since, we are drawing the marble at random, the distribution is uniform.
∴ P (Red Marble | Urn A) = 7/13
(ii) It is given that the Urn ‘B’ is chosen. Hence, we have the conditional probability
space, where the sample space is given by:
S = {R1 , R2 , R3 , R4 , R5 , B1 , B2 , B3 , B4 , B5 , B6 , B7 , B8 }
Since, we are drawing the marble at random, the distribution is uniform.
∴ P (Blue Marble | Urn B) = 8/13
Q3. Consider a class of 15 students: 4 from State-1, 8 from State-2 and 3 from State-
3. Three different students are chosen at random one after another. What is the
probability that the selected three are from State-1, State-3 and State-1 again in that
order?
Solution:
Let us define the following events:
A1 : The first student is from State-1
A2 : The second student is from State-3
A3 : The third student is from State-1
Since the students are chosen at random, we have:
P (A1 ) = 4/15
P (A2 |A1 ) = 3/14 ; (Because, it is given that one student of State-1 is already selected)
And, P (A3 |A1 ∩ A2 ) = 3/13
Because, it is given that the one from each State-1 and 3 is already selected.
The probability that the selected three are from State-1, State-3 and State-1 again in
that order is P (A1 ∩ A2 ∩ A3 ).
Now, by the multiplication rule,
P (A1 ∩ A2 ∩ A3 ) = P (A1 ) × P (A2 |A1 ) × P (A3 |A1 ∩ A2 )
= (4/15) × (3/14) × (3/13)
∴ P (A1 ∩ A2 ∩ A3 ) = 36/2730 = 6/455
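The multiplication-rule answer can be cross-checked by counting ordered selections. A small Python sketch (illustrative, not from the text):

```python
from fractions import Fraction
from itertools import permutations

# 4 students from State-1, 8 from State-2, 3 from State-3.
students = ["S1"] * 4 + ["S2"] * 8 + ["S3"] * 3

# Multiplication rule: P(A1) * P(A2|A1) * P(A3|A1 ∩ A2).
p = Fraction(4, 15) * Fraction(3, 14) * Fraction(3, 13)
print(p)   # 6/455

# Cross-check by counting ordered selections of three distinct students.
favourable = sum(1 for (i, j, k) in permutations(range(15), 3)
                 if (students[i], students[j], students[k]) == ("S1", "S3", "S1"))
assert Fraction(favourable, 15 * 14 * 13) == p
```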
Q4. A family has 2 children. What is the probability that both are girls, given that at least
one is a girl?
Solution:
Let us denote Boy: B and Girl: G
Then the sample space: S = {BB, BG, GB, GG} and, the distribution is assumed to
be uniform.
Let us define the following events:
E = {At least one is a girl} = {GB, BG, GG}
F = {Both are girls} = {GG}
Therefore, the probability that both are girls, given that at least one is a girl is:
P (F |E) = P (F ∩ E) / P (E)
= P ({GG} ∩ {GB, BG, GG}) / P ({GB, BG, GG})
= P ({GG}) / P ({GB, BG, GG})
= (1/4) / (3/4) ; (Because, Uniform distribution)
∴ P (F |E) = 1/3
2.6 Bayes’ Theorem and Independence
2.6.1 Law of Total Probability
Let us consider the events A and B such that the event B partitions the sample space into two parts, i.e. B and Bᶜ, and A is an event of interest.
Then, the Law of Total Probability states that:
P (A) = P (A ∩ B) + P (A ∩ Bᶜ) = P (A|B)P (B) + P (A|Bᶜ)P (Bᶜ)
Proof:
Since, A ∩ B and A ∩ B c are disjoint events and, A = (A ∩ B) ∪ (A ∩ B c )
By using Axiom (2),
P (A) = P (A ∩ B) + P (A ∩ B c )
=⇒ P (A) = P (A|B)P (B) + P (A|B c )P (B c ) ; (Using Multiplication rule)
Generalization:
If the events B1 , B2 , . . . , Bn partition the sample space, then for any event A in the sample space:
P (A) = P (A|B1 )P (B1 ) + P (A|B2 )P (B2 ) + · · · + P (A|Bn )P (Bn )
2.6.1.1 Solved Examples:
Q1. There are two urns ‘A’ and ‘B’. Urn ‘A’ contains 7 red marbles and 6 blue marbles
while, Urn ‘B’ contains 5 red marbles and 8 blue marbles. If an urn is chosen at
random and then the marble is drawn at random from the chosen urn, then find the
probability that a red ball is drawn.
Solution:
In Q2 of section 2.5.3, we have already computed that:
P (Red Marble | Urn A) = 7/13 ; P (Red Marble | Urn B) = 5/13
P (Blue Marble | Urn A) = 6/13 ; P (Blue Marble | Urn B) = 8/13
Let us define the event,
B1 : The marble is drawn from Urn ‘A’
B2 : The marble is drawn from Urn ‘B’
R: The marble drawn is red
The events B1 and B2 partition the sample space into two parts, i.e. B1ᶜ = B2 .
Therefore, by using the Law of Total Probability, we get:
P (R) = P (R|B1 )P (B1 ) + P (R|B2 )P (B2 )
= (7/13) × (1/2) + (5/13) × (1/2)
∴ P (R) = 6/13
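This Law of Total Probability computation can be scripted. A minimal Python sketch (the dictionary keys are my own labels):

```python
from fractions import Fraction

# Urn A: 7 red, 6 blue.  Urn B: 5 red, 8 blue.  Urn chosen at random.
p_urn = {"A": Fraction(1, 2), "B": Fraction(1, 2)}
p_red_given = {"A": Fraction(7, 13), "B": Fraction(5, 13)}

# Law of Total Probability: P(R) = Σ P(R|urn) P(urn)
p_red = sum(p_red_given[u] * p_urn[u] for u in p_urn)
print(p_red)   # 6/13
```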
Q2. An economic model predicts that if interest rates rise, then there is a 60% chance that
unemployment will increase, but that if interest rates do not rise, then there is only a
30% chance that unemployment will increase. If the economist believes that there is
a 40% chance that interest rates will rise, what should she calculate is the probability
that unemployment will increase?
Solution:
Let us define the events,
B1 : The interest rates rise
B2 : The interest rates do not rise
A: There is an increase in unemployment
Then,
P (B1 ) = 40/100 ; P (A|B1 ) = 60/100
P (B2 ) = 60/100 ; P (A|B2 ) = 30/100
Either the interest rates will rise or they will not. Hence, the events B1 and B2 partition
the sample space into two halves (B1^c = B2).
Therefore, by using the Law of Total Probability, we get:
P(A) = P(A|B1)P(B1) + P(A|B2)P(B2)
= (60/100 × 40/100) + (30/100 × 60/100)
∴ P(A) = 0.42
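As a quick numerical check of the Law of Total Probability, here is a minimal Python sketch of the interest-rate example (the variable names are my own):

```python
# Law of Total Probability: P(A) = P(A|B1)P(B1) + P(A|B2)P(B2)
p_rise = 0.40              # P(B1): interest rates rise
p_up_given_rise = 0.60     # P(A|B1): unemployment rises if rates rise
p_up_given_no_rise = 0.30  # P(A|B2): unemployment rises if rates do not rise

# Weight each conditional probability by the probability of its partition.
p_unemployment_up = (p_up_given_rise * p_rise
                     + p_up_given_no_rise * (1 - p_rise))
```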
Q3. A man possesses 5 coins: 2 are double-headed, 1 is double-tailed, and 2 are normal.
He picks a coin at random and tosses it. What is the probability that he sees a head?
Solution:
Let us define the events,
B1: A double-headed coin is tossed
B2: The double-tailed coin is tossed
B3: A normal coin is tossed
A: He sees a head
Then,
P(B1) = 2/5 ; P(A|B1) = 1
P(B2) = 1/5 ; P(A|B2) = 0
P(B3) = 2/5 ; P(A|B3) = 1/2
The events B1, B2 and B3 partition the sample space into three parts.
Therefore, by using the Law of Total Probability, we get:
P(A) = P(A|B1)P(B1) + P(A|B2)P(B2) + P(A|B3)P(B3)
= (1 × 2/5) + (0 × 1/5) + (1/2 × 2/5)
∴ P(A) = 3/5
2.6.2 Bayes' Theorem
Bayes' theorem states that if A and B are events with P(A) > 0, P(B) > 0, then:
P(B|A) = P(A|B)P(B) / P(A)
Note:
• Bayes' theorem allows us to move between different conditional probability spaces.
• Bayes' theorem, together with the Law of Total Probability, is at the heart of many
applications in probability.
Q1. There are two urns ‘A’ and ‘B’. Urn ‘A’ contains 7 red marbles and 6 blue marbles
while, Urn ‘B’ contains 5 red marbles and 8 blue marbles. If an urn is chosen at
random and then the marble is drawn at random from the chosen urn, then find the
probability that Urn ‘A’ is chosen given that a red marble is drawn.
Solution:
In Q1 of section 2.6.1.1, we have already computed that:
P(A|B1) = 7/13 ; P(A|B2) = 5/13
P(A^c|B1) = 6/13 ; P(A^c|B2) = 8/13
And, P(A) = 6/13 ; (By using the Law of Total Probability)
Here A is the event that a red marble is drawn, and B1, B2 are the events that the
marble is drawn from Urn ‘A’ and Urn ‘B’, respectively.
Therefore, by using Bayes' theorem, we get:
P(B1|A) = P(A|B1)P(B1) / P(A)
= (7/13 × 1/2) / (6/13)
∴ P(B1|A) = 7/12
Q2. In a city, 1% of people have Swine Flu. In the Flu test, 95% of people with Swine
flu test positive, while 2% of people without the disease will test positive. A person
is randomly chosen from the city and tests positive. What is the probability that the
person actually has Swine Flu?
Solution:
Let us define the events,
A: Person tests positive
B: Person has Swine Flu
=⇒ B c : Person does not have Swine Flu
Then,
P(B) = 0.01 ; P(A|B) = 0.95 ; P(A|B^c) = 0.02
The events B and B^c partition the sample space into two halves. Therefore, by using
the Law of Total Probability, we get:
P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
= (0.95 × 0.01) + (0.02 × 0.99)
= 0.0293
Now, by using Bayes' theorem, we get:
P(B|A) = P(A|B)P(B) / P(A)
= (0.95 × 0.01) / 0.0293
∴ P(B|A) = 0.3242
This implies that there is only a 32.42% chance that the person has Swine Flu given that
the person tests positive. Hence, the test is not very reliable.
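The two-step computation (Law of Total Probability, then Bayes' theorem) can be sketched in Python as follows (the variable names are my own):

```python
# Bayes' theorem for the Swine Flu test.
p_flu = 0.01                # P(B): prevalence in the city
p_pos_given_flu = 0.95      # P(A|B): test positive with the disease
p_pos_given_healthy = 0.02  # P(A|B^c): false-positive rate

# Law of Total Probability gives the overall positive rate.
p_pos = p_pos_given_flu * p_flu + p_pos_given_healthy * (1 - p_flu)
# Bayes' theorem inverts the conditioning.
p_flu_given_pos = p_pos_given_flu * p_flu / p_pos
```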
Q3. A student attempting an MCQ with 4 choices (of which one is correct) knows the
correct answer with probability 3/4. If she does not know, she guesses a random choice.
Given that a question was answered correctly, what is the conditional probability that
she knows the answer?
Solution:
Let us define the events,
A: She answered the question correctly
B: She knows the correct answer of the question
=⇒ B c : She guesses the answer of the question
Then,
P(B) = 3/4 ; P(A|B) = 1
P(B^c) = 1/4 ; P(A|B^c) = 1/4
The events B and B^c partition the sample space into two halves. Therefore, by using
the Law of Total Probability, we get:
P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
= (1 × 3/4) + (1/4 × 1/4)
= 13/16
Now, by using the Bayes’ Theorem, we get:
P(B|A) = P(A|B)P(B) / P(A)
= (1 × 3/4) / (13/16)
= 12/13
∴ P(B|A) = 0.9231
This implies that there is approximately a 92.31% chance that she knew the answer,
given that the question was answered correctly.
Q4. You first roll a fair die, then toss as many fair coins as the number that showed on the
die. Given that 5 heads are obtained, what is the probability that the die showed 5?
Solution:
Let us define the events,
Ei : The die showed the number i ; i = 1, 2, 3, 4, 5, 6
A: 5 Heads are obtained
Then,
P(Ei) = 1/6 ; i = 1, 2, 3, 4, 5, 6 and P(A|Ei) = 0 ; i = 1, 2, 3, 4
Now, since all toss sequences of i fair coins are equally likely, we get:
P(A|E5) = 1/32 and P(A|E6) = 6/64 = 3/32
Since the events E1, E2, . . . , E6 partition the sample space into six parts, by using the
Law of Total Probability, we get:
P (A) = P (A|E1 )P (E1 ) + P (A|E2 )P (E2 ) + . . . P (A|E6 )P (E6 )
= 0 + (1/32 × 1/6) + (3/32 × 1/6)
=⇒ P(A) = 1/48
Now, by using the Bayes’ theorem, we get:
P(E5|A) = P(A|E5)P(E5) / P(A)
= (1/32 × 1/6) / (1/48)
= 1/4
∴ P(E5|A) = 0.25
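The whole calculation can be verified exactly with Python's `fractions` module; the helper `p_5_heads` is my own naming, and the binomial count `comb(i, 5)` generalizes the 1/32 and 3/32 values computed above:

```python
from fractions import Fraction
from math import comb

# P(A|E_i): chance of exactly 5 heads when tossing i fair coins.
def p_5_heads(i):
    if i < 5:
        return Fraction(0)
    return Fraction(comb(i, 5), 2 ** i)

p_die = Fraction(1, 6)                                   # P(E_i) for a fair die
p_a = sum(p_5_heads(i) * p_die for i in range(1, 7))     # Law of Total Probability
p_e5_given_a = p_5_heads(5) * p_die / p_a                # Bayes' theorem
```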
2.6.3 Independence of Two Events
Two events A and B are independent if:
P(A ∩ B) = P(A)P(B)
In other words, the events A and B are independent if the probability of occurrence of A
(B) is unaffected by the occurrence of B (A), i.e. P (A|B) = P (A) and P (B|A) = P (B).
If the two events A and B are independent, then it implies that:
P((A ∩ B) ∪ (A ∩ B^c)) = P(A ∩ B) + P(A ∩ B^c)
P(A) = P(A ∩ B) + P(A ∩ B^c)
=⇒ P(A ∩ B^c) = P(A) − P(A ∩ B)
= P(A) − P(A)P(B) ; (A and B are independent events)
= P(A)[1 − P(B)]
∴ P(A ∩ B^c) = P(A)P(B^c)
P(A^c ∩ B^c) = P((A ∪ B)^c) ; (Using De Morgan's Law)
= 1 − P(A ∪ B)
= 1 − [P(A) + P(B) − P(A ∩ B)]
= 1 − P(A) − P(B) + P(A)P(B) ; (A and B are independent events)
= [1 − P(A)] − P(B)[1 − P(A)]
= [1 − P(A)][1 − P(B)]
∴ P(A^c ∩ B^c) = P(A^c)P(B^c)
Note:
• Independence is defined precisely in probability; one should not rely on intuition
alone to conclude that events are independent.
• If the two events A and B are independent, then the multiplication rule states that
the probability of occurrence of both A and B is given by P(A ∩ B) = P(A)P(B).
• If the two events A and B are disjoint and B occurs, then A definitely did not occur.
Thus, they can never be independent (when both have positive probability), as the
occurrence of B impacts the conditional probability of A.
=⇒ For events to be independent, they should have a non-empty intersection.
Q1. If a fair coin is tossed thrice and the events are defined as follows:
A: First toss is heads
B: Second toss is heads
Then, check the following:
(i) Are events A and B independent?
(ii) Are events A and B c independent?
Solution:
The sample space; S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Since the same fair coin is tossed thrice, the distribution is uniform.
A = {HHH, HHT, HTH, HTT} ; P(A) = 1/2
B = {HHH, HHT, THH, THT} ; P(B) = 1/2
and, A ∩ B = {HHH, HHT} ; P(A ∩ B) = 1/4
(i). Since,
P(A ∩ B) = 1/4 = 1/2 × 1/2
∴ P(A ∩ B) = P(A)P(B)
Hence, A and B are independent events.
(ii). Since the events A and B are independent, the events A and B^c are also
independent.
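Since the sample space here is tiny, independence can also be checked by brute-force enumeration; this sketch (with my own helper `prob`) mirrors the calculation above:

```python
from itertools import product

# Enumerate the 8 equally likely outcomes of three fair-coin tosses.
outcomes = list(product("HT", repeat=3))

def prob(event):
    return sum(1 for o in outcomes if event(o)) / len(outcomes)

A = lambda o: o[0] == "H"   # first toss is heads
B = lambda o: o[1] == "H"   # second toss is heads

p_ab = prob(lambda o: A(o) and B(o))
independent = p_ab == prob(A) * prob(B)   # 1/4 == 1/2 * 1/2
```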
Q2. If a fair die is rolled and the events are defined as follows:
A: The die showed even number
B: The die showed odd number
Are events A and B independent?
Solution:
The sample space; S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6} ; P(A) = 1/2
B = {1, 3, 5} ; P(B) = 1/2
A ∩ B = ∅ =⇒ A and B are disjoint events
Since A and B are disjoint events (each with positive probability), they cannot be
independent. Therefore, A and B are dependent events.
Q3. If a card is drawn from a well shuffled pack of 52 cards, and the events are defined as
follows:
A : Card is a spade
B : Card is a king
Are events A and B independent?
Solution:
A = {Spade card} ; P(A) = 13/52
B = {King card} ; P(B) = 4/52
A ∩ B = {King of Spades} ; P(A ∩ B) = 1/52
Now,
P(A)P(B) = 13/52 × 4/52 = 1/52
∴ P(A)P(B) = P(A ∩ B)
Hence, the events A and B are independent.
Remark: A single outcome can belong to multiple events. That is why, even though only
one card is drawn, two or more events can still be independent.
Note:
• For mutual independence of n events, there are about 2^n constraints, i.e.
P({Intersection of any subset of events}) = Product of the P({Events})
• If n events are mutually independent, then any subset of the events, with or
without complementing, is independent as well.
Q1. If a fair coin is tossed twice and the events are defined as follows:
A: Both tosses show the same face (both heads or both tails)
B: First toss is heads
C: Second toss is heads
Then, check for independence of the three events.
Solution:
The sample space; S = {HH, HT, TH, TT}
Since the same fair coin is tossed twice, the distribution is uniform.
A = {HH, TT} ; P(A) = 2/4
B = {HH, HT} ; P(B) = 2/4
C = {HH, TH} ; P(C) = 2/4
Now,
A ∩ B = {HH} ; P(A ∩ B) = 1/4 = P(A)P(B) =⇒ Events A and B are independent
A ∩ C = {HH} ; P(A ∩ C) = 1/4 = P(A)P(C) =⇒ Events A and C are independent
B ∩ C = {HH} ; P(B ∩ C) = 1/4 = P(B)P(C) =⇒ Events B and C are independent
But, A ∩ B ∩ C = {HH} ; P(A ∩ B ∩ C) = 1/4 ≠ P(A)P(B)P(C)
Hence, the events A, B and C are not mutually independent but are pairwise inde-
pendent, i.e. every pair of these events is independent.
Q2. Two roads connect A to B, and two roads connect B to C. Each of the four roads
gets blocked with probability p, independently of all other roads. What is the proba-
bility that there is an open route from A to B, given that there is no open route from
A to C?
Solution:
Let us define the events,
E1 : First road from A to B i.e. Road 1, gets blocked
E2 : Second road from A to B i.e. Road 2, gets blocked
E3 : First road from B to C i.e. Road 3, gets blocked
E4 : Second road from B to C i.e. Road 4, gets blocked
F : There is no open route from A to B
E : There is no open route from A to C
It is given that each road gets blocked with probability p, independently of all other
roads. This implies that the four events are mutually independent and,
P(Ei) = p ; i = 1, 2, 3, 4
Now,
P(F^c) = P(Either Road 1 or Road 2 is open)
= P(E1^c ∪ E2^c)
= P((E1 ∩ E2)^c) ; (Using De Morgan's Law)
= 1 − P(E1 ∩ E2)
= 1 − P(E1)P(E2) ; (E1 and E2 are independent events)
∴ P(F^c) = 1 − p^2
and, P(F) = 1 − P(F^c) = p^2
And, P(E|F) = 1,
because if there is no open route from A to B, then certainly there is no open route
from A to C.
Also,
P(E|F^c) = P(Road 3 and Road 4 are blocked)
= P(E3 ∩ E4)
= P(E3)P(E4) ; (E3 and E4 are independent events)
∴ P(E|F^c) = p^2
Therefore,
P(E) = P(E|F)P(F) + P(E|F^c)P(F^c) ; (Law of Total Probability)
= (1 × p^2) + (p^2 × (1 − p^2))
∴ P(E) = 2p^2 − p^4
Hence, probability that there is an open route from A to B given that there is no open
route from A to C is:
P(F^c|E) = P(E ∩ F^c) / P(E)
= P(E|F^c)P(F^c) / P(E) ; (Using Multiplication rule)
= p^2(1 − p^2) / (2p^2 − p^4)
∴ P(F^c|E) = (1 − p^2) / (2 − p^2)
function P .
Consider an event A such that P(A) = p.
If, in an experiment, the occurrence of the event A is considered a success and its
non-occurrence a failure, then we can interpret it as a Bernoulli trial as follows:
Success (A occurs) → 1, with probability p
Failure (A does not occur) → 0, with probability 1 − p
Generalization:
If a Bernoulli(p) trial is repeated n times independently, then the probability of any
particular sequence of outcomes containing k successes and n − k failures is
p^k (1 − p)^(n−k).
Q1. Suppose a fair coin is tossed five times. If in each toss of the coin, getting tails is
considered a success, then compute the following:
(i) Probability of getting 0 tails
(ii) Probability of getting 2 tails
Solution:
For a single toss of a fair coin, consider the event A = {Tail}.
Since the occurrence of A is considered a success and its non-occurrence a failure, we
can interpret each toss as a Bernoulli(p) trial with:
p = P(Tail) = 1/2
(i) P(0 Tails) = P(00000) = (1 − 1/2)^5 = 1/32
(ii) P(2 Tails) = 5C2 × (1/2)^2 × (1 − 1/2)^3 = 10/32 = 5/16
Q2. Suppose a biased coin with P(T) = 2/3 is tossed five times. If in each toss of the
coin, getting tails is considered a success, then compute the following:
Solution:
For a single toss of the given coin, consider the event A = {Tail}.
Since the occurrence of A is considered a success and its non-occurrence a failure, we
can interpret each toss as a Bernoulli(p) trial with:
p = P(T) = 2/3
(ii) P(5 Tails) = P(TTTTT)
= P(1 in all the 5 trials)
= P(11111)
= (2/3)^5 × (1 − 2/3)^(5−5) ; (Trials are independent)
= (2/3)^5
∴ P(5 Tails) = 32/243
2.7.2 Binomial Distribution
The repeated Bernoulli trials of section 2.7.1.2 lead naturally to the Binomial distribution.
Consider a Bernoulli(p) trial with S = {0, 1}.
If this Bernoulli(p) trial is repeated n times independently, then the number of successes
in these n trials is given by the Binomial distribution as follows:
P(B(n, p) = k) = nCk p^k (1 − p)^(n−k) ; k = 0, 1, 2, . . . , n
Remark:
Note:
• The plot of the PMF starts at (1 − p)^n, increases till it reaches the peak, and then
falls to p^n.
• The peak is roughly around np and the exact values are as follows:
(1) If (n + 1)p is an integer, then the PMF is bimodal (i.e. it has two peaks) and the
two peak values are attained at (n + 1)p and (n + 1)p − 1.
(2) If (n + 1)p is not an integer, then there is a unique modal value (i.e. a unique
peak), attained at the integral part of (n + 1)p.
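The mode rule can be checked with a short sketch; for n = 10 and p = 0.5, (n + 1)p = 5.5 is not an integer, so a unique peak at 5 is expected (helper names are my own):

```python
from math import comb

def binom_pmf(n, p, k):
    # P(B(n, p) = k) = nCk p^k (1-p)^(n-k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5
pmf = [binom_pmf(n, p, k) for k in range(n + 1)]

# (n + 1)p = 5.5 is not an integer, so the unique peak is at floor(5.5) = 5.
mode = max(range(n + 1), key=lambda k: pmf[k])
```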
2.7.2.2 Solved Examples:
Q1. Each person has a disease with probability 0.1 independently. Out of 100 random
persons tested for the disease, what is the probability that 20 persons test positive?
Assume that the disease can be tested accurately with no false positives.
Solution:
Each test can be considered as a Bernoulli(p) trial, where success is the person tested
as positive and p = 0.1.
Now, this Bernoulli(0.1) trial is repeated 100 times independently. Hence, the proba-
bility of getting 20 successes in these 100 trials is given by:
P(B(100, 0.1) = 20) = 100C20 (0.1)^20 (0.9)^80 ≈ 0.0012
Therefore, the probability that out of 100 random persons tested for the disease, exactly
20 persons test positive is approximately 0.0012.
Q2. Suppose a fair coin is tossed 10 times, then find the following probabilities:
Solution:
Each toss of a fair coin can be considered as a Bernoulli(p) trial, where success is
getting a head and p = 0.5.
Now, this Bernoulli(0.5) trial is repeated 10 times independently.
Hence, its distribution is defined as Binomial(10, 0.5), say B with sample space
S = {0, 1, 2, . . . , 10}
Let us define the events
E : {Number of heads is a multiple of 3} = {0, 3, 6, 9}
F : {Number of heads is even} = {0, 2, 4, 6, 8, 10}
(i) The probability that the number of heads is a multiple of 3 is given by:
P(E) = P(B = 0 or B = 3 or B = 6 or B = 9)
= P(B = 0) + P(B = 3) + P(B = 6) + P(B = 9) ; (Because, disjoint events)
= (1/2)^10 + 10C3 (1/2)^3 (1/2)^7 + 10C6 (1/2)^6 (1/2)^4 + 10C9 (1/2)^9 (1/2)^1
= (1/2)^10 [1 + 10C3 + 10C6 + 10C9]
= 341/1024
∴ P(E) = 0.33
(ii) The probability that the number of heads is even is given by:
P(F) = P(B = 0 or B = 2 or B = 4 or B = 6 or B = 8 or B = 10)
= P(B = 0) + P(B = 2) + P(B = 4) + P(B = 6) + P(B = 8) + P(B = 10)
= (1/2)^10 [1 + 10C2 + 10C4 + 10C6 + 10C8 + 10C10]
= 512/1024
∴ P(F) = 0.50
Q3. A bit (0 or 1) sent by Alice to Bob gets flipped with probability 0.1.
(i) If 5 bits are sent by Alice independently, what is the probability that at most 2
bits get flipped?
(ii) If 10 bits are sent by Alice independently, what is the probability that at most 2
bits get flipped?
Solution:
Each bit sent by Alice can be considered as a Bernoulli(p) trial, where success is a bit
getting flipped and p = 0.1.
(i) The Bernoulli(0.1) trial is repeated 5 times independently.
Hence, its distribution is defined as Binomial(5, 0.1), say B with sample space;
S = {0, 1, 2, . . . , 5}
Let us define an event E : {At most 2 bits gets flipped} = {0, 1, 2}
P(E) = P(B = 0 or B = 1 or B = 2)
= P(B = 0) + P(B = 1) + P(B = 2) ; (Because, disjoint events)
= (1 − 0.1)^5 + 5C1 (0.1)(1 − 0.1)^4 + 5C2 (0.1)^2 (1 − 0.1)^3
= (0.9)^5 + 5C1 (0.1)(0.9)^4 + 5C2 (0.1)^2 (0.9)^3
∴ P(E) = 0.9914
Hence, the probability that at most 2 of the 5 bits get flipped is 0.9914.
(ii) The Bernoulli(0.1) trial is repeated 10 times independently.
Hence, its distribution is defined as Binomial(10, 0.1), say B, with sample space;
S = {0, 1, 2, . . . , 10}
Let us define an event E : {At most 2 bits gets flipped} = {0, 1, 2}
P(E) = P(B = 0 or B = 1 or B = 2)
= P(B = 0) + P(B = 1) + P(B = 2) ; (Because, disjoint events)
= (1 − 0.1)^10 + 10C1 (0.1)(1 − 0.1)^9 + 10C2 (0.1)^2 (1 − 0.1)^8
= (0.9)^10 + 10C1 (0.1)(0.9)^9 + 10C2 (0.1)^2 (0.9)^8
∴ P(E) = 0.9298
Hence, the probability that at most 2 of the 10 bits get flipped is 0.9298.
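Both parts follow the same "at most k" pattern, which can be captured in one helper (the naming is my own):

```python
from math import comb

# P(at most kmax successes) for a Binomial(n, p) random variable.
def p_at_most(n, p, kmax):
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(kmax + 1))

p5 = p_at_most(5, 0.1, 2)    # 5 bits sent, at most 2 flipped
p10 = p_at_most(10, 0.1, 2)  # 10 bits sent, at most 2 flipped
```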
The repeated Bernoulli trials (section 2.7.1.2) also lead to the Geometric distribution.
Consider a Bernoulli(p) trial with S = {0, 1}.
If this Bernoulli(p) trial is repeated independently, then the number of trials needed
for the first success is given by the Geometric distribution as follows:
P(G(p) = k) = (1 − p)^(k−1) p ; k = 1, 2, 3, . . .
Proof:
P(G(p) = k) = P(Failure in the first (k − 1) trials and success in the k-th trial)
= P(First (k − 1) trials result in 0 and the k-th trial is 1)
= P(00 . . . 0 1), where 0 appears (k − 1) times
∴ P(G(p) = k) = (1 − p)^(k−1) p ; k = 1, 2, 3, . . .
Remark: P(G(p) ≤ k) = 1 − P(No success in the first k trials) = 1 − (1 − p)^k
Note:
• The plot of the PMF starts at p and then keeps falling.
• Even though the plot keeps decreasing, if p < 1 it never actually reaches zero.
Q1. In Ludo, a player needs to repeatedly throw a fair die till she gets a 1. Find the
following:
(i) The probability that she needs fewer than 6 throws.
(ii) The probability that she needs fewer than 11 throws.
(iii) The probability that she needs fewer than 21 throws.
Solution:
Each throw of a fair die can be considered as a Bernoulli(p) trial, where success is
getting a 1 and p = 1/6.
Now, this Bernoulli(1/6) trial is repeated independently till we get the first success.
Hence, its distribution is Geometric(1/6), say G, with sample space S = {1, 2, 3, . . .}
(i) The probability that she needs fewer than 6 throws is:
P(G < 6) = P(G ≤ 5)
= 1 − (1 − 1/6)^5 ; (Remark in section 2.7.3)
= 1 − (5/6)^5
∴ P(G < 6) = 0.5981
Hence, the probability that she needs fewer than 6 throws is 0.5981.
(ii) The probability that she needs fewer than 11 throws is:
P(G < 11) = P(G ≤ 10)
= 1 − (1 − 1/6)^10 ; (Remark in section 2.7.3)
= 1 − (5/6)^10
∴ P(G < 11) = 0.8385
Hence, the probability that she needs fewer than 11 throws is 0.8385.
(iii) The probability that she needs fewer than 21 throws is:
P(G < 21) = P(G ≤ 20)
= 1 − (1 − 1/6)^20 ; (Remark in section 2.7.3)
= 1 − (5/6)^20
∴ P(G < 21) = 0.9739
Hence, the probability that she needs fewer than 21 throws is 0.9739.
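The three parts all use P(G < n) = 1 − (1 − p)^(n−1), which a small sketch (my own naming) reproduces:

```python
# P(G < n) for G ~ Geometric(1/6): success = rolling a 1 on a fair die.
def p_less_than(n, p=1/6):
    # P(G < n) = P(G <= n-1) = 1 - (1-p)^(n-1)
    return 1 - (1 - p) ** (n - 1)

probs = {n: p_less_than(n) for n in (6, 11, 21)}
```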
Q2. Player 1 is a 40% free-throw shooter, while Player 2 is a 70% shooter. Each throw
is independent of all previous throws. The two players alternate shooting, with
Player 1 starting, until a basket is scored.
(i) What is the probability that Player 1 wins before the 3rd round?
(ii) What is the probability that Player 1 wins?
Solution:
Each throw of basketball by Player 1 and by Player 2 can be considered as Bernoulli(0.4)
and Bernoulli(0.7) trial, respectively, where success is getting the basket scored.
(i) The favourable outcomes for an event that Player 1 wins before the third round
is as follows:
• Player 1 wins in the first round, i.e. 1P1
And, the probability is given by:
P (1P1 ) = 0.4
• Player 1 wins in the second round, which implies that both players miss in the
first round, i.e. 0P1 0P2 1P1
And, the probability is given by:
P (0P1 0P2 1P1 ) = P (0P1 )P (0P2 )P (1P1 ) ; (because, independent events)
= (1 − 0.4)(1 − 0.7)(0.4)
= (0.6)(0.3)(0.4)
∴ P (0P1 0P2 1P1 ) = 0.072
Since, the events {1P1 } and {0P1 0P2 1P1 } are disjoint. Therefore, the probability
that Player 1 wins before the third round is:
P (Player 1 wins before 3rd round) = P (1P1 ) + P (0P1 0P2 1P1 )
= 0.4 + 0.072
= 0.472
Hence, there is 47.2% chance that Player 1 wins the game before the third
round.
(ii) Similarly, the favourable outcomes for the event that Player 1 wins the game are:
{1P1, 0P1 0P2 1P1, (0P1 0P2)^2 1P1, (0P1 0P2)^3 1P1, (0P1 0P2)^4 1P1, . . .}
P(1P1) = 0.4, P(0P1 0P2 1P1) = (0.18)(0.4), P((0P1 0P2)^2 1P1) = (0.18)^2(0.4),
and so on, since P(0P1 0P2) = (0.6)(0.3) = 0.18.
Since the events {1P1}, {0P1 0P2 1P1}, {(0P1 0P2)^2 1P1}, . . . are disjoint, the
probability that Player 1 wins the game is:
P(Player 1 wins) = 0.4 [1 + 0.18 + (0.18)^2 + . . .] = 0.4/(1 − 0.18) ≈ 0.4878
Remark:
This question is not quite based on the geometric distribution because, had it been
geometric, the common ratio would have been 0.6 instead of (0.6)(0.3).
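The geometric series for Player 1's overall winning chance can be summed in closed form; a sketch of both parts (variable names are my own):

```python
p1, p2 = 0.4, 0.7
r = (1 - p1) * (1 - p2)   # both players miss a round: 0.6 * 0.3 = 0.18

# Player 1 wins in round k+1 with probability r^k * p1, for k = 0, 1, 2, ...
p_before_round_3 = p1 + r * p1   # wins in round 1 or round 2
p_win = p1 / (1 - r)             # sum of the infinite geometric series
```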
2.8 Discrete Random Variable
Random Variable:
A random variable is a function with domain as the sample space of an experiment
and range as the real numbers, i.e. a function from the sample space to the real line.
There is a technical condition that functions need to satisfy, which will be discussed in
section 2.8.1.
Discrete Random Variable:
A random variable is said to be discrete if its range is a discrete set.
Remarks:
• The range of a random variable is the set of values taken by it and is a subset of the
real line.
• The discrete sets are as follows:
(i) All finite subsets of the real line are discrete.
(ii) All subsets of integers are discrete.
(iii) The set of all integer multiples of a real number is discrete.
(iv) Any subset of a discrete set is discrete.
• The non-discrete sets are as follows:
(i) Any interval is not discrete, i.e. (a, b) is not discrete for a < b.
(ii) Any subset that contains a non-discrete set is not discrete.
Examples:
However, xi ’s need not be distinct.
Suppose x2 = x4 = x6 = 1 and x1 = x3 = x5 = 0, then
Random variable E : E(2) = E(4) = E(6) = 1, E(1) = E(3) = E(5) = 0
But if xi ’s are distinct, then the random variable is a one-to-one function.
Note:
• For an experiment, many valid random variables are possible, but usually the
meaningful functions are considered, like the random variable X in the first example.
• In practice, random variables make it easier to assign probabilities and to work
with events.
Examples:
• E : {1, 2, 3} : (X < 4)
• E : {2, 5} : (X = 2) ∪ (X = 5)
(2) Rolling a die: S = {1, 2, 3, 4, 5, 6}
Random variable X : X(2) = X(4) = X(6) = 1, X(1) = X(3) = X(5) = 0
Range (X) = {0, 1}
Any event E can be expressed in terms of X as follows:
• E : {1, 3, 5} : (X = 0)
• E : {2, 4, 6} : (X = 1)
• E : {Null Event} : (X < 0)
• E : {1, 2, 3, 4, 5, 6} : (X ≤ 1)
Remark: Not all events can be written in terms of X, e.g. E : {2, 5}.
Hence, due to the technical restriction, such a subset of the sample space cannot be
considered as an event defined through X.
The probability mass function (PMF) of a discrete random variable X with range T is
defined as:
fX(t) = P(X = t) for t ∈ T
In general, the probability mass function gives the probability that the random variable
takes one particular value.
The probability of any event, defined using the random variable X, can be computed
by using the PMF as follows:
Note: A discrete random variable X with range T = {t1, t2, . . . , tk} and PMF fX can
be represented in tabular form as follows:
t t1 t2 ... tk
fX (t) fX (t1 ) fX (t2 ) ... fX (tk )
where, the first row is the value that the random variable takes and the second row is
the probability with which the random variable takes that value.
Solution:
(i) The range of a random variable is the set of values taken by it, i.e.
TX = {−1, 1, 2, 4}
(ii) By the property of PMF, we know that Σ_{t∈T} fX(t) = 1.
(iii) Let (X ∈ B) be an event such that B = (X > 3) = {4}, then
P(X > 3) = Σ_{t∈B} P(X = t) = P(X = 4)
∴ P(X > 3) = 0.125
(iv) Let (X ∈ B) be an event such that B = (X < 3/2) = {−1, 1}, then
P(X < 3/2) = Σ_{t∈B} P(X = t) = P(X = −1) + P(X = 1) = 0.5 + 0.25
∴ P(X < 3/2) = 0.75
Q2. The PMF of a random variable X is given by:
fX(k) = c/3^k ; k = 1, 2, 3, . . .
Compute the following:
(i) c
(ii) P (X > 10)
(iii) P (X > 10|X > 5)
Solution:
(i) The range of the random variable X is not finite but countable, i.e.
T = {1, 2, 3, . . .}
By the property of PMF, we know that Σ_{t∈T} fX(t) = 1
=⇒ c [1/3 + 1/3^2 + 1/3^3 + . . .] = c × (1/3)/(1 − 1/3) = c/2 = 1
=⇒ c = 2
• fX(3) = P({HHH}) = 1/8
Hence,
t 0 1 2 3
fX (t) 1/8 3/8 3/8 1/8
Q4. For a certain lottery, a three-digit number is randomly selected (from 000 to
999). If a ticket matches the number exactly, it is worth ₹2 lakhs. If the ticket
matches exactly two of the three digits, it is worth ₹20,000. Otherwise, it is
worth nothing. Let X be the value of the ticket. Find the distribution of X.
Solution:
There are a total of 1000 tickets numbered from 000 to 999.
Range (X) ; T = {0, 20000, 200000}
• fX(200000) = P(All three digits are matched) = 1/1000
• fX(20000) = P(Exactly two of the three digits are matched) = (3 × 9)/1000 = 27/1000
(there are 3 choices for the position of the unmatched digit and 9 wrong values for it)
• fX(0) = 1 − fX(20000) − fX(200000) ; (By using the property of PMF)
=⇒ fX(0) = 1 − 27/1000 − 1/1000 = 1 − 28/1000
∴ fX(0) = 972/1000
• Range (X) = T
• PMF: fX(t) = 1/|T| for all t ∈ T, where |T| is the size of the finite set T
Note: If all the outcomes of a random variable are equally likely, then it can be
assumed to be uniform.
Examples:
It is also referred to as binary, because it takes only two values, 0 and 1.
Refer to section 2.7.1 for more details.
2.9.3 Binomial Random Variable
• Range (X) = {0, 1, 2, . . . , n}
• PMF: fX(k) = nCk p^k (1 − p)^(n−k) for k = 0, 1, . . . , n
Refer section 2.7.3 for more details.
2.9.5.1 Negative Binomial Random Variable
A random variable X follows Negative Binomial distribution with parameters r and p,
i.e.
X ∼ Negative Binomial(r, p), where
p = Probability of success ; 0 ≤ p ≤ 1
r = Number of successes ; r is a positive integer
then the number of trials needed for the r-th success is given by:
P(X = k) = (k−1)C(r−1) p^r (1 − p)^(k−r) ; k = r, r + 1, r + 2, . . .
Consider an event that keeps occurring over a period of time with the following two
assumptions:
(1) The rate of occurrence of the event is constant over time.
(2) Each occurrence of the event is independent of the past occurrences.
Then the number of times the event occurs in a fixed interval of time is given by the
Poisson distribution as follows:
P(X = k) = e^(−λ) λ^k / k! ; k = 0, 1, 2, . . . , where λ is the rate of occurrence in
that interval.
No. of particles 0 1 2 3 4 5 6 7 8 9 10
No. of times 57 203 383 525 532 408 273 139 45 27 16
Fraction 0.022 0.078 0.147 0.201 0.204 0.156 0.105 0.053 0.017 0.0104 0.0061
The rate of occurrence of the event is
λ = 3.8673
This implies that, on average, there are 3.8673 emissions of particles per 7.5 seconds.
Here, the emission rate can be assumed to be constant, and also it is reasonable
to assume that the time of next emission is independent of the past emissions.
Hence, by using the Poisson distribution, we get:
P(No. of particles = k) = e^(−λ) λ^k / k!
The probabilities computed using the Poisson distribution are close to the fractions
given in the Table.
This implies that the above assumptions hold true and the Poisson distribution fits
the data, i.e. the chance that k particles are emitted in a given time interval of
7.5 seconds can be obtained using the Poisson distribution.
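The fit can be reproduced numerically; this sketch (my own naming) compares the Poisson probabilities with the observed fractions from the table above:

```python
from math import exp, factorial

lam = 3.8673  # average emissions per 7.5-second interval (from the data)

def poisson_pmf(k):
    return exp(-lam) * lam**k / factorial(k)

# Observed fractions for k = 0, 1, ..., 10 particles (from the table).
observed = [0.022, 0.078, 0.147, 0.201, 0.204, 0.156,
            0.105, 0.053, 0.017, 0.0104, 0.0061]
fitted = [poisson_pmf(k) for k in range(11)]
max_gap = max(abs(o - f) for o, f in zip(observed, fitted))
```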
(2) A meteorite entering earth’s atmosphere
In 276 months from 1994-2016, the number of fireballs entering the atmosphere is
as follows:
No. of fireballs 0 1 2 3 4 5 6 7 8 9 10
No. of times 24 54 76 52 37 21 7 3 1 0 1
Fraction 0.087 0.196 0.275 0.188 0.134 0.076 0.025 0.0109 0.0036 0 0.0036
P(No. of fireballs = k) = e^(−λ) λ^k / k!
The probabilities computed using the Poisson distribution are close to the fractions
given in the Table.
This implies that the above assumptions hold true and the Poisson distribution fits
the data, i.e. the chance that k fireballs arrive in a month can be obtained using the
Poisson distribution.
2.9.6.2 Visualisation of Poisson Distribution
Note: As the value of λ increases, the peak is shifted from left to right.
For more details, refer to examples of section 2.9.7.1
• P(H(N, r, m) = k) = (rCk × (N−r)C(m−k)) / NCm ; k = max(0, m − (N − r)), . . . , min(r, m)
Examples:
Remarks:
If X is a random variable with range T and PMF fX(t), then the PMF of a function of
the random variable X, i.e. g(X), can be found using fX(t) as follows:
fg(X)(y) = Σ fX(t), where the sum runs over all t ∈ T with g(t) = y
2.10.1 Solved Examples:
Solution:
Range (X) ; T = {−2, −1, 0, 1, 2}
fX(t) = 1/5 for all t ∈ T ; (Since X ∼ Uniform)
t          −2    −1    0     1     2
fX(t)      1/5   1/5   1/5   1/5   1/5
g(t) = t^2 4     1     0     1     4
• fg(X)(0) = P(X = 0) = 1/5
• fg(X)(1) = P(X = −1) + P(X = 1) = 2/5
• fg(X)(4) = P(X = −2) + P(X = 2) = 2/5
Q2. Let X ∼ Geometric(1/2) and
g(X) = X for X < 5, and g(X) = 5 for X ≥ 5.
Find:
(i) Range of g(X)
(ii) PMF of g(X)
Solution:
Range (X) ; T = {1, 2, 3, . . .} and Range (g(X)) = {1, 2, 3, 4, 5}
fX(t) = (1/2)^(t−1) × (1/2) = (1/2)^t for all t ∈ T ; (Because X ∼ Geometric(1/2))
t       1      2       3       4       5       6       7       ...
fX(t)   1/2    1/2^2   1/2^3   1/2^4   1/2^5   1/2^6   1/2^7   ...
g(t)    1      2       3       4       5       5       5       ...
• fg(X)(1) = P(X = 1) = 1/2
• fg(X)(2) = P(X = 2) = 1/2^2 = 1/4
• fg(X)(3) = P(X = 3) = 1/2^3 = 1/8
• fg(X)(4) = P(X = 4) = 1/2^4 = 1/16
• fg(X)(5) = P(X ≥ 5) = 1/2^5 + 1/2^6 + 1/2^7 + . . . = 1/2^4 = 1/16
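The PMF of g(X) can be checked exactly (the helper names are my own); note how the geometric tail probabilities collapse onto the value 5:

```python
from fractions import Fraction

# X ~ Geometric(1/2), g(x) = x for x < 5 and g(x) = 5 for x >= 5.
def f_X(t):
    return Fraction(1, 2) ** t   # (1/2)^(t-1) * (1/2) = (1/2)^t

def f_gX(y):
    if y < 5:
        return f_X(y)            # g is one-to-one below the cutoff
    # g collapses every t >= 5 onto 5; the geometric tail sums to (1/2)^4
    return Fraction(1, 2) ** 4

pmf = {y: f_gX(y) for y in range(1, 6)}
```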