
Probability and Stochastic Processes

Reza Pulungan

Department of Computer Science and Electronics
Faculty of Mathematics and Natural Sciences
Universitas Gadjah Mada
Yogyakarta

October 3, 2020
Random Variables

Definition
A random variable R on a probability space is a total function whose domain is the sample space of the probability space.

The codomain of R can be any set, but it is usually a subset of R.
The name “random variable” is not exactly precise, since it is not a variable, but a function.

Reza Pulungan Probability and Stochastic Processes 3


Example

Suppose we toss 3 mutually independent and unbiased coins.
Let C be the number of heads produced in a toss.
Let M = 1 if the toss produces three heads or three tails, and M = 0 otherwise.
Every outcome of this 3-coin tossing experiment uniquely determines the values of C and M.
Hence, C and M can be regarded as functions that map the outcomes from

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

to numbers.


Example
Suppose we toss 3 mutually independent and unbiased coins.
More specifically, C is a function that maps every outcome to a number as follows:

C(HHH) = 3  C(HHT) = 2  C(HTH) = 2  C(HTT) = 1
C(THH) = 2  C(THT) = 1  C(TTH) = 1  C(TTT) = 0

M is a function that maps every outcome to a number as follows:

M(HHH) = 1  M(HHT) = 0  M(HTH) = 0  M(HTT) = 0
M(THH) = 0  M(THT) = 0  M(TTH) = 0  M(TTT) = 1

Therefore, C and M are random variables.
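The two mappings can be verified by a short enumeration; a minimal sketch in Python, where the dictionaries `C` and `M` mirror the random variables of the example:

```python
from itertools import product

# All 8 outcomes of tossing 3 coins, e.g. "HHT".
outcomes = ["".join(t) for t in product("HT", repeat=3)]

# C maps an outcome to its number of heads.
C = {w: w.count("H") for w in outcomes}

# M maps an outcome to 1 if all three coins agree, 0 otherwise.
M = {w: 1 if w in ("HHH", "TTT") else 0 for w in outcomes}

print(C["HTT"], M["TTT"])  # 1 1
```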


Indicator Random Variables

Definition
An indicator random variable is a random variable that maps
every outcome to either 0 or 1.

Indicator random variables are also called Bernoulli variables.
The variable M in the previous example is an indicator random variable.


Indicator random variables are closely related to events, because they partition the sample space into two blocks: the block containing outcomes mapped to 1, and the block containing outcomes mapped to 0.
Similarly, an event also partitions the sample space into two blocks: the block containing the outcomes that are elements of the event, and the block containing the outcomes that are not.
Hence, with each event E we can associate an indicator random variable I_E such that:

I_E(ω) = 1, if ω ∈ E,
I_E(ω) = 0, if ω ∉ E.
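This association can be expressed directly in code; a minimal sketch, where `indicator` builds I_E from any event E given as a set of outcomes:

```python
def indicator(E):
    """Return the indicator random variable I_E of the event E."""
    return lambda omega: 1 if omega in E else 0

# Example: the event "all three coins show the same face".
I_E = indicator({"HHH", "TTT"})
print(I_E("TTT"), I_E("HTH"))  # 1 0
```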
Random Variables and Events

Events are also closely related to random variables: a random variable that takes n different values partitions the sample space into n blocks.
Example: the random variable C partitions S into 4 blocks:

{TTT}   {TTH, THT, HTT}   {THH, HTH, HHT}   {HHH}
 C=0          C=1               C=2          C=3

Each block is a subset of S, and therefore, an event.


Furthermore, we can view an equality or an inequality
involving a random variable as an event.



Random Variables and Events
Example: C = 2 is an event containing outcomes
THH , HTH , HHT .
C ≤ 1 is an event containing outcomes
TTT , TTH , THT , HTT .
Furthermore, we can talk about the probability of the
events determined by random variables; for instance:

Pr(C = 2) = Pr(THH) + Pr(HTH) + Pr(HHT) = 1/8 + 1/8 + 1/8 = 3/8,

and

Pr(C ≤ 1) = Pr(TTT) + Pr(TTH) + Pr(THT) + Pr(HTT) = 4 · 1/8 = 1/2.
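Both probabilities can be confirmed by summing over the uniform sample space; a minimal sketch using exact fractions:

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]
p = Fraction(1, 8)  # each of the 8 outcomes is equally likely

def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum((p for w in outcomes if event(w)), Fraction(0))

print(pr(lambda w: w.count("H") == 2))  # 3/8
print(pr(lambda w: w.count("H") <= 1))  # 1/2
```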



Functions of Random Variables

Random variables can be combined to form more complex random variables.
Example: suppose we toss two independent and unbiased dice. Let D_i be the random variable representing the outcome of the i-th die, i = 1, 2. Then:

Pr(D1 = 3) = Pr(D2 = 3) = 1/6.

We can form a more complex random variable T = D1 + D2, which represents the sum of the two dice. Then:

Pr(T = 3) = 1/18, and Pr(T = 7) = 1/6.
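The values Pr(T = 3) = 1/18 and Pr(T = 7) = 1/6 follow from counting the 36 equally likely outcomes; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

# Count, for each sum t, how many of the 36 outcomes produce it.
counts = {}
for d1, d2 in product(range(1, 7), repeat=2):
    counts[d1 + d2] = counts.get(d1 + d2, 0) + 1

pr_T = {t: Fraction(c, 36) for t, c in counts.items()}
print(pr_T[3], pr_T[7])  # 1/18 1/6
```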



Conditional Probability

Combining conditional probability with events involving random variables creates no problems.
Example: Pr(C ≥ 2 | M = 0) is the probability that at least 2 coins produce heads (C ≥ 2), given that not all coins produce the same result (M = 0).
We can calculate:

Pr(C ≥ 2 | M = 0) = Pr(C ≥ 2 ∩ M = 0) / Pr(M = 0)
                  = Pr({THH, HTH, HHT}) / Pr({THH, HTH, HHT, HTT, THT, TTH})
                  = (3/8) / (6/8) = 1/2.
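The same computation can be done by enumeration; a minimal sketch, where both events are sets of outcomes and the uniform distribution lets counts stand in for probabilities:

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]

# Events as sets of outcomes.
C_ge_2 = {w for w in outcomes if w.count("H") >= 2}
M_eq_0 = {w for w in outcomes if w not in ("HHH", "TTT")}

# Pr(C >= 2 | M = 0) = |C_ge_2 ∩ M_eq_0| / |M_eq_0| under uniformity.
cond = Fraction(len(C_ge_2 & M_eq_0), len(M_eq_0))
print(cond)  # 1/2
```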



Independence
The concept of independence also applies to random variables,
similar to events.
Definition
Two random variables R1 and R2 are independent if and only if
for all x1 ∈ co(R1 ) and for all x2 ∈ co(R2 ), either
Pr(R2 = x2 ) = 0 or:

Pr(R1 = x1 | R2 = x2 ) = Pr(R1 = x1 ).

Definition
Two random variables R1 and R2 are independent if and only if
for all x1 ∈ co(R1 ) and for all x2 ∈ co(R2 ):

Pr(R1 = x1 ∩ R2 = x2 ) = Pr(R1 = x1 ) · Pr(R2 = x2 ).



Independence
For instance: are C and M independent?

C and M are not independent if we can find x1, x2 ∈ R such that:

Pr(C = x1 ∩ M = x2) ≠ Pr(C = x1) · Pr(M = x2).

We can find them, namely x1 = 2 and x2 = 1:

Pr(C = 2 ∩ M = 1) = 0,

and

Pr(C = 2) · Pr(M = 1) = 3/8 · 1/4 = 3/32.
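This counterexample can be checked by enumeration; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]
p = Fraction(1, 8)

def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum((p for w in outcomes if event(w)), Fraction(0))

# Joint probability Pr(C = 2 ∩ M = 1) vs the product of marginals.
joint = pr(lambda w: w.count("H") == 2 and w in ("HHH", "TTT"))
marginals = pr(lambda w: w.count("H") == 2) * pr(lambda w: w in ("HHH", "TTT"))

print(joint, marginals)  # 0 3/32
```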



Independence

Like events, the concept of independence also applies to more than two random variables.

Definition
Random variables R1, R2, · · · , Rn are mutually independent if and only if:

Pr(R1 = x1 ∩ R2 = x2 ∩ · · · ∩ Rn = xn) = Pr(R1 = x1) · Pr(R2 = x2) · · · Pr(Rn = xn),

for all x1, x2, · · · , xn.



Distributions

Definition
Let R be a random variable with codomain V. The probability density function (PDF) of R is a function PDF_R : V −→ [0, 1] defined by:

PDF_R(x) ::= Pr(R = x), if x ∈ range(R),
PDF_R(x) ::= 0, if x ∉ range(R).

Consequently:

Σ_{x ∈ range(R)} PDF_R(x) = 1.



Example
Suppose we toss two independent and unbiased dice.
Let T be the random variable representing the sum of the results of the two dice; then the codomain of T is V = {2, 3, 4, · · · , 12}.

[Figure: plot of PDF_T(x) for x = 2, 3, · · · , 12; it peaks at x = 7, where PDF_T(7) = 1/6.]


Definition
The cumulative distribution function (CDF) of R is a function CDF_R : V −→ [0, 1] defined by:

CDF_R(x) ::= Pr(R ≤ x).

From the definitions of PDF and CDF:

CDF_R(x) = Pr(R ≤ x) = Σ_{y ≤ x} Pr(R = y) = Σ_{y ≤ x} PDF_R(y).

PDF_R(x) is the probability that R = x, and CDF_R(x) is the probability that R ≤ x.
Several random variables can have the same PDF and CDF; the PDF and CDF represent the distribution.
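For the two-dice sum T of the earlier example, the PDF and CDF can be tabulated exactly; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

# PDF of T: probability of each possible sum of two fair dice.
pdf = {t: Fraction(0) for t in range(2, 13)}
for d1, d2 in product(range(1, 7), repeat=2):
    pdf[d1 + d2] += Fraction(1, 36)

def cdf(x):
    """CDF_T(x) = Pr(T <= x), summing the PDF over values y <= x."""
    return sum((pdf[t] for t in pdf if t <= x), Fraction(0))

print(pdf[7], cdf(6), cdf(12))  # 1/6 5/12 1
```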
Example
Continuing the previous example, the CDF of the random variable T is:

[Figure: step plot of CDF_T(x) for x = 1, 2, · · · , 12, increasing from 0 to 1.]



Bernoulli Distributions
Bernoulli distributions are the simplest and the most frequently occurring distributions.
These are the distributions of the indicator random variables.
The PDF of a Bernoulli distribution is f_p : {0, 1} −→ [0, 1], where:

f_p(0) = p, and
f_p(1) = 1 − p,

for some p ∈ [0, 1].
The CDF of a Bernoulli distribution is F_p : R −→ [0, 1], where:

F_p(x) = 0, if x < 0,
F_p(x) = p, if 0 ≤ x < 1,
F_p(x) = 1, if 1 ≤ x.
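A direct transcription of these formulas, keeping the slide's convention that f_p(0) = p; a minimal sketch:

```python
def bernoulli_pdf(p):
    """PDF f_p on {0, 1}, with f_p(0) = p as in the text."""
    return {0: p, 1: 1 - p}

def bernoulli_cdf(p, x):
    """CDF F_p over the reals: 0 below 0, p on [0, 1), then 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return p
    return 1.0

f = bernoulli_pdf(0.3)
print(bernoulli_cdf(0.3, 0.5))  # 0.3
```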



Uniform Distributions
Random variables whose values all have the same probability are called uniform.
Let the sample space be of the form {1, 2, · · · , n}; then the PDF of the uniform distribution is f_n : {1, 2, · · · , n} −→ [0, 1], where:

f_n(k) = 1/n,

for some n ∈ N+.
The CDF of the uniform distribution is F_n : R −→ [0, 1], where:

F_n(x) = 0, if x < 1,
F_n(x) = k/n, if k ≤ x < k + 1 for 1 ≤ k < n,
F_n(x) = 1, if n ≤ x.
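The uniform CDF is a step function; a minimal sketch:

```python
import math

def uniform_cdf(n, x):
    """CDF F_n of the uniform distribution on {1, ..., n}."""
    if x < 1:
        return 0.0
    if x >= n:
        return 1.0
    return math.floor(x) / n

print(uniform_cdf(6, 3.5), uniform_cdf(6, 0))  # 0.5 0.0
```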



Binomial Distributions

Suppose a random variable R represents the number of heads that occur in n independent tosses of a coin.
If the coin is fair, then the random variable has an unbiased binomial distribution with PDF f_n : {0, 1, · · · , n} −→ [0, 1], where:

f_n(k) = (n choose k) · 2^−n,

for some n ∈ N+.
This is because there are (n choose k) sequences (ways of selecting which tosses come up heads) of n tosses that produce precisely k heads, and each such sequence has probability (1/2)^n.
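With Python's `math.comb`, f_n(k) is one line; a minimal sketch, checked against Pr(C = 2) = 3/8 from the 3-coin example:

```python
from math import comb

def binom_pdf_unbiased(n, k):
    """f_n(k) = (n choose k) * 2^(-n): a fair coin tossed n times."""
    return comb(n, k) / 2**n

print(binom_pdf_unbiased(3, 2))  # 0.375
```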



Binomial Distributions

The CDF of the unbiased binomial distribution is F_n : R −→ [0, 1], where:

F_n(x) = 0, if x < 0,
F_n(x) = Σ_{i=0}^{k} (n choose i) 2^−n, if k ≤ x < k + 1 for 0 ≤ k < n,
F_n(x) = 1, if n ≤ x.



PDF for n = 20
The PDF of the unbiased binomial distribution with n = 20.

[Figure: plot of f_20(k) for k = 0, · · · , 20, symmetric around k = 10, where it peaks at about 0.176.]

The decreasing parts on the left and on the right of the graph are called the tails of the distribution.
General Binomial Distributions

If the coin is biased, with probability p of producing heads, then the random variable has a general binomial distribution with PDF f_{n,p} : {0, 1, · · · , n} −→ [0, 1], where:

f_{n,p}(k) = (n choose k) p^k (1 − p)^{n−k},

for some n ∈ N+ and p ∈ [0, 1].
This is because there are (n choose k) sequences of n tosses that produce precisely k heads and n − k tails, but each of those sequences now has probability p^k (1 − p)^{n−k}.
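The general PDF translates just as directly; a minimal sketch that also checks the PDF sums to 1 for n = 20 and p = 0.75:

```python
from math import comb, isclose

def binom_pdf(n, p, k):
    """f_{n,p}(k) = (n choose k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# The PDF sums to 1 over k = 0, ..., n.
total = sum(binom_pdf(20, 0.75, k) for k in range(21))
print(isclose(total, 1.0))  # True
```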



General Binomial Distributions

The CDF of the general binomial distribution is F_{n,p} : R −→ [0, 1], where:

F_{n,p}(x) = 0, if x < 0,
F_{n,p}(x) = Σ_{i=0}^{k} (n choose i) p^i (1 − p)^{n−i}, if k ≤ x < k + 1 for 0 ≤ k < n,
F_{n,p}(x) = 1, if n ≤ x.



PDF for n = 20 and p = 0.75
The PDF of the general binomial distribution with n = 20 and p = 0.75.

[Figure: plot of f_{20,0.75}(k) for k = 0, · · · , 20, peaking near k = 15 at about 0.20.]
PDF Approximation
The following is a method to approximate the value of the binomial coefficients.
Lemma

(n choose αn) ∼ 2^{nH(α)} / √(2πα(1−α)n), and (n choose αn) < 2^{nH(α)} / √(2πα(1−α)n),   (1)

where H(α) is the entropy function:

H(α) ::= α log(1/α) + (1 − α) log(1/(1 − α)).

Moreover, if αn > 10 and (1 − α)n > 10, then the RHS and LHS of Eq. (1) differ by at most 2%. If αn > 100 and (1 − α)n > 100, then the RHS and LHS of Eq. (1) differ by at most 0.2%.
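The lemma is easy to test numerically; a minimal sketch with α = 1/2 and n = 100 (so αn = 50 > 10, and the bound should hold to within 2%):

```python
from math import comb, log2, sqrt, pi

def H(a):
    """Binary entropy H(alpha); logs are base 2, matching 2^(nH)."""
    return a * log2(1 / a) + (1 - a) * log2(1 / (1 - a))

def rhs(n, a):
    """Right-hand side of the lemma: 2^(nH(a)) / sqrt(2 pi a (1-a) n)."""
    return 2 ** (n * H(a)) / sqrt(2 * pi * a * (1 - a) * n)

n, a = 100, 0.5
exact = comb(n, n // 2)
print(exact < rhs(n, a))             # True: the bound holds
print(rhs(n, a) / exact - 1 < 0.02)  # True: within 2%, as stated
```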



PDF Approximation

Now we can write that the PDF of the general binomial distribution with n and p satisfies:

f_{n,p}(αn) = (n choose αn) p^{αn} (1 − p)^{n−αn}
            < (2^{nH(α)} / √(2πα(1−α)n)) · p^{αn} (1 − p)^{n−αn}
            = 2^{n(α log(p/α) + (1−α) log((1−p)/(1−α)))} / √(2πα(1−α)n).

Although this expression looks complicated, it is easy to evaluate.

