Lecture slides
by
Dr. Suchandan Kayal
Department of Mathematics
National Institute of Technology Rourkela
Rourkela - 769008, Odisha, India
Autumn, 2020
Outline (Part-I)
Random variable
Historical motivation
Birth of an offspring:
Experiment
An experiment consists of observing something happen, or
conducting something under certain conditions, which results
in some outcomes.
Example
Rainfall: It is a consequence of several things such as cloud
formation, El Niño occurrence, humidity, atmospheric pressure,
etc. Finally, we observe that there is rainfall. Thus, observing
weather is an experiment.
Types of experiment
Deterministic experiment: It results in known outcomes under
certain conditions.
Random experiment: Under fixed conditions, the outcomes
are not known.
Basic notions (random experiment)
Random experiment
An experiment is said to be a random experiment if the
following conditions are satisfied.
The set of all possible outcomes of the experiment is known
in advance.
The outcomes of a particular performance (trial) of the
experiment cannot be predicted in advance.
The experiment can be repeated under identical conditions.
Sample space
The collection of all possible outcomes of a random experiment
is called the sample space. It is denoted by Ω.
Basic notions (sample space and event)
Sample space/examples
Throwing of a die. Here Ω = {1, 2, 3, 4, 5, 6}.
Throwing of a die and tossing of a coin simultaneously.
Ω = {1, 2, 3, 4, 5, 6} × {H, T }
A coin is flipped repeatedly until a tail is observed.
Ω = {T, HT, HHT, HHHT, · · · }
Lifetime of a battery. Here Ω = [0, 10000].
Event
An event is a set of outcomes of an experiment (a subset of the
sample space) to which a probability is assigned.
Basic notions
Remarks on event
When the sample space is finite, any subset of the sample
space is an event. In this case, all elements of the power set
of the sample space are defined as events.
This approach does not work well in cases where the
sample space is uncountably infinite. So, when defining a
probability space it is possible, and often necessary to
exclude certain subsets of the sample space from being
events.
In general measure theoretic description of probability
spaces an event may be defined as an element of a selected
sigma-field of subsets of the sample space.
Basic notions (impossible and sure events)
Impossible event
An event is said to be impossible if the probability of
occurrence of that event is zero. For example, during the rolling
of a six-faced die, the event that the face 7 will occur is
impossible.
Sure event
An event with probability of occurrence one is called the sure
event. The sample space of any random experiment is always a
sure event. Another example could be the event that the lifetime
of a battery is a nonnegative number.
Basic notions
Various operations
Union:
A ∪ B means occurrence of at least one of A and B.
∪_{i=1}^{n} Ai means occurrence of at least one of Ai , i = 1, · · · , n.
∪_{i=1}^{∞} Ai means occurrence of at least one of Ai , i = 1, 2, · · · .
Intersection:
A ∩ B means simultaneous occurrence of both A and B.
∩_{i=1}^{n} Ai means simultaneous occurrence of Ai , i = 1, · · · , n.
∩_{i=1}^{∞} Ai means simultaneous occurrence of Ai , i = 1, 2, · · · .
Exhaustive events:
If ∪_{i=1}^{n} Ai = Ω, we call A1 , · · · , An exhaustive events.
Basic notions
A. Classical approach
Assumptions:
A random experiment results in a finite number of equally
likely outcomes.
Let Ω = {ω1 , · · · , ωn } be a finite sample space with n ∈ N
possible outcomes, N denotes the set of natural numbers.
For a subset E of Ω, |E| denotes the number of elements in
E.
Result:
The probability of occurrence of an event E is given by
P(E) = |E| / |Ω|.
Observations
For any event E, P (E) ≥ 0
For mutually exclusive events E1 , · · · , En ,
P(∪_{i=1}^{n} Ei) = |∪_{i=1}^{n} Ei| / |Ω| = Σ_{i=1}^{n} |Ei| / |Ω| = Σ_{i=1}^{n} P(Ei)
P(Ω) = |Ω| / |Ω| = 1.
Methods of assigning probabilities/Classical approach
(cont...)
Example-1
Suppose that in your section, we have 150 students born
in the same year. Assume that a year has 365 days. Find
the probability that all the students of your section are
born on different days of the year.
Solution
Denote the event that all the students are born on different
days of the year by E. Here, |Ω| = 365^150 and
|E| = 365 × 364 × · · · × 216, so that
P(E) = (365 × 364 × · · · × 216) / 365^150.
Example-2
Find the probability of getting exactly two heads in three tosses
of a fair coin.
Solution
Denote the event of getting exactly two heads in three tosses of
a fair coin by E. Here,
Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
and
E = {HHT, HT H, T HH}.
Thus, P(E) = |E|/|Ω| = 3/8.
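Classical probabilities such as this one can be checked by brute-force enumeration of the finite sample space; a minimal Python sketch (my addition, not part of the original slides) for the three-toss example:

```python
from itertools import product
from fractions import Fraction

# Sample space for three tosses of a fair coin: all 2^3 = 8 outcomes.
omega = list(product("HT", repeat=3))

# Event E: exactly two heads.
E = [w for w in omega if w.count("H") == 2]

# Classical probability: P(E) = |E| / |Omega|.
p_E = Fraction(len(E), len(omega))
print(p_E)  # 3/8
```

Exact rational arithmetic via `Fraction` avoids any floating-point rounding in the check.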
Methods of assigning probabilities/Classical approach
(cont...)
Drawbacks
The random experiment must produce equally likely
outcomes.
The total number of outcomes of the random experiment
must be finite.
Methods of assigning probabilities
Observations
For any event E, P (E) ≥ 0
For mutually exclusive events E1 , · · · , En ,
P(∪_{i=1}^{n} Ei) = Σ_{i=1}^{n} P(Ei)
P (Ω) = 1.
Methods of assigning probabilities/Relative frequency approach
Example-3
After tossing a fair coin, we have the following outputs:
Solution
Let a_n denote the number of heads observed in the first n
tosses. Note that

a_n / n = 1/1, 2/2, 2/3, 3/4, 4/5, 4/6, · · ·

that is,

a_n / n = (2k − 1)/(3k − 2) if n = 3k − 2,
          2k/(3k − 1)       if n = 3k − 1,
          2k/(3k)           if n = 3k,      k = 1, 2, · · · .

Thus, lim_{n→∞} a_n / n = 2/3 = P(H).
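The convergence of the relative frequency a_n/n can also be illustrated by simulation; a small Python sketch (my addition), assuming a coin with P(H) = 2/3 as in the example:

```python
import random

random.seed(0)  # reproducibility

# Simulate a coin with P(H) = 2/3 and estimate P(H) by the
# relative frequency a_n / n of heads in the first n tosses.
n = 100_000
heads = sum(random.random() < 2 / 3 for _ in range(n))
rel_freq = heads / n
print(rel_freq)  # close to 2/3 for large n
```

This also illustrates the drawback noted below: the estimate is only an approximation, improving with the number of trials.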
Methods of assigning probabilities/Relative frequency
approach (cont...)
Drawbacks
The probability has been calculated based on an
approximation.
The random experiment has to be conducted a large
number of times. This is not always possible since some
experiments are costly (launching satellite).
lim_{n→∞} √n / n = 0 ⇒ P(E) = 0 (not correct!)
lim_{n→∞} (n − √n) / n = 1 ⇒ P(E) = 1 (not correct!)
Axiomatic approach to probability
Basic concepts
A set whose elements are themselves sets is called a class of
sets. For example, A = {{2}, {2, 3}}.
A set function is a real-valued function whose domain is a
class of sets.
A sigma-field of subsets of Ω is a class F of subsets of Ω
satisfying the following properties:
(i) Ω ∈ F
(ii) E ∈ F ⇒ E c = Ω − E ∈ F (closed under complement)
(iii) Ei ∈ F , i = 1, 2, · · · ⇒ ∪_{i=1}^{∞} Ei ∈ F (closed under
countably infinite unions)
F = {ϕ, Ω} is a sigma-field (the trivial sigma-field).
Suppose A ⊂ Ω. Then, F = {ϕ, Ω, A, Ac } is a sigma field of
subsets of Ω.
Axiomatic approach to probability (cont...)
Definition
Let Ω be a sample space of a random experiment. Let F be the
event space or a sigma field of subsets of Ω. Then, a probability
function or a probability measure is a set function P , defined on
F, satisfying the following three axioms:
For any event E ∈ F , P (E) ≥ 0 (nonnegativity)
P (Ω) = 1 (normalization)
For a countably infinite collection of mutually exclusive
events E1 , E2 , · · · , we have

P(∪_{i=1}^{∞} Ei) = Σ_{i=1}^{∞} P(Ei) (countable additivity)

Theorem (finite additivity)
For mutually exclusive events E1 , · · · , En ∈ F ,

P(∪_{i=1}^{n} Ei) = Σ_{i=1}^{n} P(Ei).
Proof
See it during lecture.
Inequalities
Boole’s inequality
Let (Ω, F , P ) be a probability space and let E1 , · · · , En ∈ F ,
where n ∈ N. Then,
P(∪_{i=1}^{n} Ei) ≤ Σ_{i=1}^{n} P(Ei).
Proof
See it during the lecture.
Note
To prove Boole’s inequality for a countable collection of events,
we can use ∪_{i=1}^{n} Ei → ∪_{i=1}^{∞} Ei as n → ∞, along with
the continuity of the probability measure P.
Inequalities (cont...)
Bonferroni’s inequality
Let (Ω, F, P ) be a probability space and let E1 , · · · , En ∈ F ,
where n ∈ N. Then,
P(∩_{i=1}^{n} Ei) ≥ Σ_{i=1}^{n} P(Ei) − (n − 1).
Proof
See it during the lecture.
Note
Bonferroni’s inequality holds only for the probability of a
finite intersection of events!
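Bonferroni's bound is easy to check numerically on a small sample space; an illustrative Python sketch (the two dice events E1, E2 are my own choice, not from the slides):

```python
from fractions import Fraction
from itertools import product

# Two events on the sample space of a pair of fair dice.
omega = list(product(range(1, 7), repeat=2))
E1 = [w for w in omega if w[0] <= 4]     # first die shows at most 4
E2 = [w for w in omega if sum(w) <= 7]   # total is at most 7

P = lambda ev: Fraction(len(ev), len(omega))
both = [w for w in E1 if w in E2]

# Bonferroni with n = 2: P(E1 ∩ E2) >= P(E1) + P(E2) - 1.
print(P(both), P(E1) + P(E2) - 1)  # 1/2 versus the bound 1/4
```
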
Conditional probability
Example
Let us toss two fair coins. Let A denote the event that both
coins show the same face and B the event that at least one coin
shows a head. Obtain the probability of A given that B has
already occurred.
Solution
Listen to my lecture.
Definition
Let (Ω, F, P ) be a probability space and B ∈ F be a fixed event
such that P (B) > 0. Then, the conditional probability of event
A given that B has already occurred is defined as

P(A|B) = P(A ∩ B) / P(B).
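For the two-coin example above, the conditional probability can be computed by direct enumeration; a Python sketch (my addition):

```python
from itertools import product
from fractions import Fraction

omega = list(product("HT", repeat=2))     # two fair coins
A = [w for w in omega if w[0] == w[1]]    # both show the same face
B = [w for w in omega if "H" in w]        # at least one head
AB = [w for w in A if w in B]             # A ∩ B = {HH}

P = lambda ev: Fraction(len(ev), len(omega))
p_A_given_B = P(AB) / P(B)                # definition of P(A|B)
print(p_A_given_B)  # 1/3
```
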
Conditional probability (cont...)
Example
Solution
Clearly,

P(A ∩ B) = P(A) = C(13, 6) / C(52, 6) and
P(B) = [C(13, 5) C(39, 1) + C(13, 6)] / C(52, 6).

Thus, P(A|B) = C(13, 6) / [C(13, 5) C(39, 1) + C(13, 6)],

where C(n, r) denotes the binomial coefficient.
Conditional probability (cont...)
Note
For events E1 , E2 , · · · , En ∈ F , n ≥ 2, we have
P (E1 ∩ E2 ) = P (E1 )P (E2 |E1 ) if P (E1 ) > 0
P (E1 ∩ E2 ∩ E3 ) = P (E1 )P (E2 |E1 )P (E3 |E1 ∩ E2 ) if
P (E1 ∩ E2 ) > 0. This condition also guarantees that
P (E1 ) > 0, since E1 ∩ E2 ⊂ E1
P (∩ni=1 Ei ) =
P (E1 )P (E2 |E1 )P (E3 |E1 ∩E2 ) · · · P (En |E1 ∩E2 ∩· · ·∩En−1 ),
provided P (E1 ∩ E2 ∩ · · · ∩ En−1 ) > 0, which also
guarantees that P (E1 ∩ E2 ∩ · · · ∩ Ei ) > 0, for
i = 1, 2, · · · , n − 1.
Conditional probability (cont...)
Example
An urn contains four red and six black balls. Two balls are
drawn successively, at random and without replacement,
from the urn. Find the probability that the first draw
resulted in a red ball and the second draw resulted in a
black ball.
Solution
Let A denote the event that the first draw results in a red
ball and B that the second ball results in a black ball.
Then,
P(A ∩ B) = P(A) P(B|A) = (4/10) × (6/9) = 12/45.
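The multiplication rule for this urn example can be verified by enumerating all equally likely ordered draws; a Python sketch (my addition):

```python
from fractions import Fraction
from itertools import permutations

# Urn: 4 red (R) and 6 black (B) balls; two drawn without replacement.
balls = ["R"] * 4 + ["B"] * 6

# All ordered pairs of distinct balls are equally likely.
pairs = list(permutations(range(10), 2))
fav = [(i, j) for i, j in pairs if balls[i] == "R" and balls[j] == "B"]

p = Fraction(len(fav), len(pairs))
print(p)  # 4/15, i.e. (4/10) * (6/9) = 12/45
```
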
Total probability
Theorem (total probability)
Let (Ω, F , P ) be a probability space and let {Ei ; i ∈ A} be a
countable collection of mutually exclusive and exhaustive events
with P (Ei ) > 0 for i ∈ A. Then, for any event E ∈ F ,
P(E) = Σ_{i∈A} P(E|Ei) P(Ei).
Proof
Let F = ∪i∈A Ei . Then, P (F ) = P (Ω) = 1 and
P (F c ) = 1 − P (F ) = 0. Again,
E ∩ F c ⊂ F c ⇒ 0 ≤ P (E ∩ F c ) ≤ P (F c ) = 0.
Total probability (cont...)
Proof (cont...)
Thus,

P(E) = P(E ∩ F) + P(E ∩ F^c)
     = P(E ∩ F)
     = P(∪_{i∈A} (E ∩ Ei))
     = Σ_{i∈A} P(E ∩ Ei)
     = Σ_{i∈A} P(E|Ei) P(Ei).
Bayes theorem
Theorem
Let (Ω, F, P ) be a probability space and let {Ei ; i ∈ A} be a
countable collection of mutually exclusive and exhaustive events
with P (Ei ) > 0 for i ∈ A. Then, for any event E ∈ F , with
P (E) > 0, we have
P(Ej |E) = P(E|Ej) P(Ej) / Σ_{i∈A} P(E|Ei) P(Ei), j ∈ A.
Proof
For j ∈ A,

P(Ej |E) = P(Ej ∩ E) / P(E) = P(E|Ej) P(Ej) / P(E)
         = P(E|Ej) P(Ej) / Σ_{i∈A} P(E|Ei) P(Ei),

from the theorem of total probability.
Bayes theorem (cont...)
Note
P (Ej ), j ∈ A are known as the prior probabilities.
P (Ej |E) are known as the posterior probabilities.
Bayes theorem (cont...)
Example
Urn U1 contains four white and six black balls and urn U2
contains six white and four black balls. A fair die is cast and
urn U1 is selected if the upper face of die shows 5 or 6 dots,
otherwise urn U2 is selected. A ball is drawn at random from
the selected urn.
Given that the drawn ball is white, what is the conditional
probability that it came from urn U1 ?
Given that the ball is white, find the conditional
probability that it came from urn U2 .
Solution
W → drawn ball is white;
E1 → Urn U1 is selected;
E2 → Urn U2 is selected.
Bayes theorem (cont...)
Solution (contd...)
E1 and E2 are mutually exclusive and exhaustive events.
(i) P(E1 |W ) = P(W |E1 ) P(E1 ) / [P(W |E1 ) P(E1 ) + P(W |E2 ) P(E2 )]
             = (4/10 × 2/6) / (4/10 × 2/6 + 6/10 × 4/6) = 1/4.
(ii) Similarly, P(E2 |W ) = 1 − P(E1 |W ) = 3/4.
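A quick Python check of the Bayes computation above, using exact fractions (my addition):

```python
from fractions import Fraction

# Priors: urn U1 is chosen when the die shows 5 or 6, else U2.
p_E1, p_E2 = Fraction(2, 6), Fraction(4, 6)
# Likelihoods of drawing a white ball from each urn.
p_W_E1, p_W_E2 = Fraction(4, 10), Fraction(6, 10)

p_W = p_W_E1 * p_E1 + p_W_E2 * p_E2   # theorem of total probability
p_E1_W = p_W_E1 * p_E1 / p_W          # Bayes' theorem
p_E2_W = p_W_E2 * p_E2 / p_W
print(p_E1_W, p_E2_W)  # 1/4 3/4
```

Note how the posterior probabilities sum to one, since E1 and E2 are exhaustive.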
Note
If P (B) = 0, then P (A ∩ B) = 0 = P (A)P (B) for all
A ∈ F . That is, if P (B) = 0, then any event A ∈ F and B
are independent.
If P (B) > 0, then A and B are said to be independent if
and only if P (A|B) = P (A).
Independence
Let (Ω, F, P ) be a probability space. Let A ⊂ R be an index set
and let {Eα : α ∈ A} be a collection of events in F.
Events {Eα : α ∈ A} are said to be pairwise independent if
any pair of events Eα and Eβ , α ̸= β in the collection
{Ej : j ∈ A} are independent, that is, if
P (Eα ∩ Eβ ) = P (Eα )P (Eβ ), α, β ∈ A and α ̸= β.
Let A = {1, 2, · · · , n} for some n ∈ N. The events
E1 , · · · , En are said to be independent if for any
subcollection {Eα1 , · · · , Eαk } of {E1 , · · · , En } (k = 2, 3, · · · , n),

P(∩_{j=1}^{k} Eαj ) = Π_{j=1}^{k} P(Eαj ).
Independence
Independence ⇒ pairwise independence.
Pairwise independence ⇏ independence (in general!).
Solution
See during the lecture!
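Since the slide defers the counterexample to the lecture, here is a standard one sketched in Python (a Bernstein-style example, my addition): four equally likely outcomes with three events that are pairwise but not mutually independent.

```python
from fractions import Fraction

# Four equally likely outcomes.
omega = {1, 2, 3, 4}
A, B, C = {1, 2}, {1, 3}, {1, 4}

P = lambda ev: Fraction(len(ev), len(omega))

# Pairwise independent: each pairwise intersection is {1}, with
# probability 1/4 = (1/2)(1/2).
pairwise = (P(A & B) == P(A) * P(B)
            and P(A & C) == P(A) * P(C)
            and P(B & C) == P(B) * P(C))

# But not mutually independent: P(A ∩ B ∩ C) = 1/4 != 1/8.
mutual = P(A & B & C) == P(A) * P(B) * P(C)
print(pairwise, mutual)  # True False
```
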
Assignment-I
Problems
Q1. A student prepares for a quiz by studying a list of ten
problems. She only can solve six of them. For the quiz, the
instructor selects five questions at random from the list of
ten. What is the probability that the student can solve all
five problems on the examination?
Q2. A total of n shells is fired at a target. The probability that
the ith shell hits the target is pi , i = 1, · · · , n. Find the
probability that at least two shells out of n find the target.
Q3. A bag contains 5 white and 2 black balls and balls are
drawn one by one without replacement. What is the
probability of drawing the second white ball before the
second black ball?
Assignment-I (cont...)
Problems
Q4. Balls are drawn repeatedly and with replacement from a
bag consisting of 60 white and 30 black balls. What is the
probability of drawing the third white ball before the
second black ball?
Q5. Let A and B be two events which are independent. Then,
show that A and B c , Ac and B, and Ac and B c are
independent.
Q6. Consider the experiment of tossing a coin three times. Let
Hi , i = 1, 2, 3, denote the event that the ith toss is a head.
Assuming that the coin is fair and has an equal probability
of landing heads or tails on each toss, show that the events
H1 , H2 and H3 are mutually independent.
Assignment-I (cont...)
Problems
Q7. When coded messages are sent, there are sometimes errors
in transmission. In particular, Morse code uses “dots” and
“dashes”, which are known to occur in the proportion of
3 : 4. This means that for any given symbol,
3 4
P (dot sent) = and P (dash sent) = .
7 7
Suppose there is interference on the transmission line, and
with probability 1/8 a dot is mistakenly received as a dash,
and vice versa. If we receive a dot, can we be sure that a
dot was sent? (Ans. 21/25)
Solve more problems beyond these exercises if you wish to
earn a good grade.
Part-II
Random variable
Motivation
Someone may not be interested in the full physical
description of the sample space or events. Rather, one may
be interested in the numerical characteristic of the event
considered.
For example, suppose some components have been put on a
test. After a certain time t > 0, we may be interested in how
many of these are functioning and how many are not.
Here, we are not interested in which units have failed.
To study certain phenomena of a random experiment, it is
required to quantify the phenomena. One option is to
associate a real number to every outcome of the random
experiment. This encourages us to develop the concept of
the random variable.
Random variable (cont...)
Definition
Let (Ω, F, P ) be a probability space and let X : Ω → R be a
given function. We say that X is a random variable if
X^{−1}(B) = {ω ∈ Ω : X(ω) ∈ B} ∈ F for every Borel set B ⊆ R.
Alternative
Let (Ω, F, P ) be a probability space. Then, a real valued
measurable function defined on the sample space is known as
the random variable.
Random variable (cont...)
Theorem
Let (Ω, F, P ) be a probability space and let X : Ω → R be a
given function. Then, X is a random variable if and only if
X^{−1}((−∞, a]) = {ω ∈ Ω : X(ω) ≤ a} ∈ F
for all a ∈ R.
Random variable (cont...)
Example
Consider the experiment of tossing of a coin. Then, the sample
space is Ω = {H, T }. Define X as the number of heads. Then,
X(H) = 1 and X(T ) = 0. Consider
Definition
A function FX : R → R defined by FX(x) = P(X ≤ x), x ∈ R, is
called the distribution function of the random variable X.
Theorem
Let FX be the distribution function of a random variable X.
Then,
FX is non-decreasing.
FX is right continuous.
FX (∞) = 1 and FX (−∞) = 0.
Distribution function (cont...)
Example
Suppose that a fair coin is independently flipped thrice, and let
X denote the number of heads obtained. Then, the sample space is
Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
Example (cont...)
The distribution function of X is

FX(x) =
  0,    x < 0
  1/8,  0 ≤ x < 1
  1/2,  1 ≤ x < 2
  7/8,  2 ≤ x < 3
  1,    x ≥ 3.
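This step function can be reproduced by enumerating the eight outcomes; a small Python sketch (my addition):

```python
from itertools import product
from fractions import Fraction

# Three independent tosses of a fair coin; X = number of heads.
omega = list(product("HT", repeat=3))
X = [w.count("H") for w in omega]

def F(x):
    """Distribution function F_X(x) = P(X <= x) by counting outcomes."""
    return Fraction(sum(1 for v in X if v <= x), len(X))

print(F(0), F(1), F(2), F(3))  # 1/8 1/2 7/8 1
```
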
Note
Let −∞ < a < b < ∞. Then,
P (a < X ≤ b) = P (X ≤ b) − P (X ≤ a)
P (a < X < b) = P (X < b) − P (X ≤ a)
P (a ≤ X < b) = P (X < b) − P (X < a)
P (a ≤ X ≤ b) = P (X ≤ b) − P (X < a)
P (X ≥ a) = 1 − P (X < a)
P (X > a) = 1 − P (X ≤ a)
Theorem
Let G : R → R be a non-decreasing and right continuous
function for which G(−∞) = 0 and G(+∞) = 1. Then, there
exists a random variable X defined on a probability space
(Ω, F, P ) such that the distribution function of X is G.
Distribution function (cont...)
Example
Consider a function G : R → R defined by
G(x) =
  0,           x < 0
  1 − e^{−x},  x ≥ 0.
Observations
Clearly, G is nondecreasing, continuous and satisfies
G(−∞) = 0 and G(∞) = 1. Thus, G is a distribution
function for a random variable X.
Since G is continuous, we have
P (X = x) = G(x) − G(x− ) = 0 for all x ∈ R, where G(x− )
is the left hand limit of G at the point x.
Distribution function (cont...)
Example (cont...)
For −∞ < a < b < ∞, P (a < X < b) = P (a ≤ X < b) =
P (a ≤ X ≤ b) = P (a < X ≤ b) = G(b) − G(a).
P (X ≥ a) = P (X > a) = 1 − G(a) and
P (X < a) = P (X ≤ a) = G(a).
P (2 < X ≤ 3) = G(3) − G(2) = e−2 − e−3
P (−2 < X ≤ 3) = G(3) − G(−2) = 1 − e−3
P (X ≥ 2) = 1 − G(2) = e−2
P (X > 5) = 1 − G(5) = e−5 .
Note that the sum of sizes of jumps of G is 0.
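A minimal Python sketch (my addition) of G and the probabilities computed above:

```python
import math

def G(x):
    """Distribution function G(x) = 1 - exp(-x) for x >= 0, else 0."""
    return 1 - math.exp(-x) if x >= 0 else 0.0

# Since G is continuous, interval probabilities are just differences of G.
p1 = G(3) - G(2)    # P(2 < X <= 3)  = e^{-2} - e^{-3}
p2 = G(3) - G(-2)   # P(-2 < X <= 3) = 1 - e^{-3}
p3 = 1 - G(2)       # P(X >= 2)      = e^{-2}
print(p1, p2, p3)
```
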
Types of the random variables
Definition
A random variable X is said to be of discrete type if there
exists a non-empty and countable set SX such that
P(X = x) > 0 for every x ∈ SX and

PX(SX) = Σ_{x∈SX} P(X = x) = Σ_{x∈SX} [FX(x) − FX(x−)] = 1.
Theorem
Let X be a random variable with distribution function FX and
let DX be the set of discontinuity points of FX . Then, X is of
discrete type if and only if
P (X ∈ DX ) = 1.
Definition
Let X be a discrete type random variable with support SX . The
function fX : R → R defined by

fX(x) =
  P(X = x),  x ∈ SX
  0,         x ∈ SX^c

is called the probability mass function (PMF) of X.
Example
Let us consider a random variable X having the distribution
function FX : R → R defined below. Show that X is of discrete
type and find its probability mass function.
FX(x) =
  0,     x < 0
  1/8,   0 ≤ x < 2
  1/4,   2 ≤ x < 3
  1/2,   3 ≤ x < 6
  4/5,   6 ≤ x < 12
  7/8,   12 ≤ x < 15
  1,     x ≥ 15.
Solution
The set of discontinuity points of FX is DX = {0, 2, 3, 6, 12, 15}
and P(X ∈ DX) = Σ_{x∈DX} [FX(x) − FX(x−)] = 1. Thus, the
random variable X is of discrete type with support
SX = DX = {0, 2, 3, 6, 12, 15}. The probability mass function is

fX(x) = FX(x) − FX(x−) for x ∈ SX and 0 for x ∈ SX^c, that is,

fX(x) =
  1/8,   x ∈ {0, 2, 15}
  1/4,   x = 3
  3/10,  x = 6
  3/40,  x = 12
  0,     otherwise.
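The jump-size computation fX(x) = FX(x) − FX(x−) can be sketched in Python (my addition); the steps below transcribe the CDF of the example:

```python
from fractions import Fraction

# The piecewise-constant CDF as (jump point, new value) steps.
steps = [(0, Fraction(1, 8)), (2, Fraction(1, 4)), (3, Fraction(1, 2)),
         (6, Fraction(4, 5)), (12, Fraction(7, 8)), (15, Fraction(1))]

pmf, prev = {}, Fraction(0)
for x, Fx in steps:
    pmf[x] = Fx - prev   # jump size F_X(x) - F_X(x-)
    prev = Fx

print(pmf)
print(sum(pmf.values()))  # 1, so X is of discrete type
```
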
Discrete random variables (cont...)
Remark
The PMF of a discrete type random variable X having support
SX satisfies the following properties:
(i) fX(x) > 0 for all x ∈ SX and fX(x) = 0 for all x ∈ SX^c.
(ii) Σ_{x∈SX} fX(x) = Σ_{x∈SX} P(X = x) = 1.
Conversely, if a function satisfies the above two properties, then
it is a probability mass function.
Thank You