
Probability and Stochastic Processes

Reza Pulungan

Department of Computer Science and Electronics
Faculty of Mathematics and Natural Sciences
Universitas Gadjah Mada
Yogyakarta

October 3, 2020
Random Variables

Definition
A random variable R on a probability space is a total function whose domain is the sample space of the probability space.

The codomain of R can be any set, but it is usually a subset of R.
The name “random variable” is not exactly precise, since it is not a variable, but a function.

Reza Pulungan Probability and Stochastic Processes 3


Example

Suppose we toss 3 mutually independent and unbiased coins.
Let C be the number of heads produced in a toss.
Let M = 1 if the toss produces three heads or three tails, and M = 0 otherwise.
Every outcome of this 3-coin tossing experiment uniquely determines the values of C and M.
Hence, C and M can be regarded as functions that map the outcomes from

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

to numbers.


Example
Suppose we toss 3 mutually independent and unbiased coins.
More specifically, C is a function that maps every outcome to a number as follows:

C(HHH) = 3  C(HHT) = 2  C(HTH) = 2  C(HTT) = 1
C(THH) = 2  C(THT) = 1  C(TTH) = 1  C(TTT) = 0

M is a function that maps every outcome to a number as follows:

M(HHH) = 1  M(HHT) = 0  M(HTH) = 0  M(HTT) = 0
M(THH) = 0  M(THT) = 0  M(TTH) = 0  M(TTT) = 1

Therefore, C and M are random variables.
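The two mappings can be verified by a short enumeration; a minimal sketch in Python, where the dictionaries `C` and `M` mirror the random variables of the example:

```python
from itertools import product

# All 8 outcomes of tossing 3 coins, e.g. "HHT".
outcomes = ["".join(t) for t in product("HT", repeat=3)]

# C maps an outcome to its number of heads.
C = {w: w.count("H") for w in outcomes}

# M maps an outcome to 1 if all three coins agree, 0 otherwise.
M = {w: 1 if w in ("HHH", "TTT") else 0 for w in outcomes}

print(C["HTT"], M["TTT"])  # 1 1
```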


Indicator Random Variables

Definition
An indicator random variable is a random variable that maps
every outcome to either 0 or 1.

Indicator random variables are also called Bernoulli variables.
The variable M in the previous example is an indicator random variable.


Indicator random variables are closely related to events, because they partition the sample space into two blocks: the block containing outcomes mapped to 1, and the block containing outcomes mapped to 0.
Similarly, an event also partitions the sample space into two blocks: the block containing the outcomes that are elements of the event, and the block containing the outcomes that are not.
Hence, with each event E we can associate an indicator random variable I_E such that:

I_E(ω) = 1, if ω ∈ E,
I_E(ω) = 0, if ω ∉ E.
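This association can be expressed directly in code; a minimal sketch, where `indicator` builds I_E from any event E given as a set of outcomes:

```python
def indicator(E):
    """Return the indicator random variable I_E of the event E."""
    return lambda omega: 1 if omega in E else 0

# Example: the event "all three coins show the same face".
I_E = indicator({"HHH", "TTT"})
print(I_E("TTT"), I_E("HTH"))  # 1 0
```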
Random Variables and Events

Events are also closely related to random variables: a random variable that takes n different values partitions the sample space into n blocks.
Example: the random variable C partitions S into 4 blocks:

{TTT}   {TTH, THT, HTT}   {THH, HTH, HHT}   {HHH}
 C=0          C=1               C=2          C=3

Each block is a subset of S, and therefore, an event.


Furthermore, we can view an equality or an inequality
involving a random variable as an event.



Random Variables and Events
Example: C = 2 is an event containing outcomes
THH , HTH , HHT .
C ≤ 1 is an event containing outcomes
TTT , TTH , THT , HTT .
Furthermore, we can talk about the probability of the
events determined by random variables; for instance:

Pr(C = 2) = Pr(THH) + Pr(HTH) + Pr(HHT) = 1/8 + 1/8 + 1/8 = 3/8,

and

Pr(C ≤ 1) = Pr(TTT) + Pr(TTH) + Pr(THT) + Pr(HTT) = 4 · 1/8 = 1/2.
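Both probabilities can be confirmed by summing over the uniform sample space; a minimal sketch using exact fractions:

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]
p = Fraction(1, 8)  # each of the 8 outcomes is equally likely

def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum((p for w in outcomes if event(w)), Fraction(0))

print(pr(lambda w: w.count("H") == 2))  # 3/8
print(pr(lambda w: w.count("H") <= 1))  # 1/2
```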



Functions of Random Variables

Random variables can be combined to form more complex random variables.
Example: suppose we toss two independent and unbiased dice. Let D_i be the random variable representing the outcome of the i-th die, i = 1, 2. Then:

Pr(D1 = 3) = Pr(D2 = 3) = 1/6.

We can form a more complex random variable T = D1 + D2, which represents the sum of the two dice. Then:

Pr(T = 3) = 1/18, and Pr(T = 7) = 1/6.
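The values Pr(T = 3) = 1/18 and Pr(T = 7) = 1/6 follow from counting the 36 equally likely outcomes; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

# Count, for each sum t, how many of the 36 outcomes produce it.
counts = {}
for d1, d2 in product(range(1, 7), repeat=2):
    counts[d1 + d2] = counts.get(d1 + d2, 0) + 1

pr_T = {t: Fraction(c, 36) for t, c in counts.items()}
print(pr_T[3], pr_T[7])  # 1/18 1/6
```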



Conditional Probability

Combining conditional probability with events involving random variables creates no problems.
Example: Pr(C ≥ 2 | M = 0) is the probability that at least 2 coins produce heads (C ≥ 2), given that not all coins produce the same result (M = 0).
We can calculate:

Pr(C ≥ 2 | M = 0) = Pr(C ≥ 2 ∩ M = 0) / Pr(M = 0)
                  = Pr({THH, HTH, HHT}) / Pr({THH, HTH, HHT, HTT, THT, TTH})
                  = (3/8) / (6/8) = 1/2.
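The same computation can be done by enumeration; a minimal sketch, where both events are sets of outcomes and the uniform distribution lets counts stand in for probabilities:

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]

# Events as sets of outcomes.
C_ge_2 = {w for w in outcomes if w.count("H") >= 2}
M_eq_0 = {w for w in outcomes if w not in ("HHH", "TTT")}

# Pr(C >= 2 | M = 0) = |C_ge_2 ∩ M_eq_0| / |M_eq_0| under uniformity.
cond = Fraction(len(C_ge_2 & M_eq_0), len(M_eq_0))
print(cond)  # 1/2
```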



Independence
The concept of independence also applies to random variables,
similar to events.
Definition
Two random variables R1 and R2 are independent if and only if
for all x1 ∈ co(R1 ) and for all x2 ∈ co(R2 ), either
Pr(R2 = x2 ) = 0 or:

Pr(R1 = x1 | R2 = x2 ) = Pr(R1 = x1 ).

Definition
Two random variables R1 and R2 are independent if and only if
for all x1 ∈ co(R1 ) and for all x2 ∈ co(R2 ):

Pr(R1 = x1 ∩ R2 = x2 ) = Pr(R1 = x1 ) · Pr(R2 = x2 ).



Independence
For instance: are C and M independent?

C and M are not independent if we can find x1, x2 ∈ R such that:

Pr(C = x1 ∩ M = x2) ≠ Pr(C = x1) · Pr(M = x2).

We can find them, namely x1 = 2 and x2 = 1:

Pr(C = 2 ∩ M = 1) = 0,

and

Pr(C = 2) · Pr(M = 1) = 3/8 · 1/4 = 3/32.
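This counterexample can be checked by enumeration; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(t) for t in product("HT", repeat=3)]
p = Fraction(1, 8)

def pr(event):
    """Probability of an event given as a predicate on outcomes."""
    return sum((p for w in outcomes if event(w)), Fraction(0))

# Joint probability Pr(C = 2 ∩ M = 1) vs the product of marginals.
joint = pr(lambda w: w.count("H") == 2 and w in ("HHH", "TTT"))
marginals = pr(lambda w: w.count("H") == 2) * pr(lambda w: w in ("HHH", "TTT"))

print(joint, marginals)  # 0 3/32
```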



Independence

Like events, the concept of independence also applies to more than two random variables.

Definition
Random variables R1, R2, · · · , Rn are mutually independent if and only if:

Pr(R1 = x1 ∩ R2 = x2 ∩ · · · ∩ Rn = xn) = Pr(R1 = x1) · Pr(R2 = x2) · · · Pr(Rn = xn),

for all x1, x2, · · · , xn.



Distributions

Definition
Let R be a random variable with codomain V. The probability density function (PDF) of R is a function PDF_R : V −→ [0, 1] defined by:

PDF_R(x) ::= Pr(R = x), if x ∈ range(R),
PDF_R(x) ::= 0, if x ∉ range(R).

Consequently:

Σ_{x ∈ range(R)} PDF_R(x) = 1.



Example
Suppose we toss two independent and unbiased dice.
Let T be the random variable representing the sum of the results of the two dice; then the codomain of T is V = {2, 3, 4, · · · , 12}.

[Figure: plot of PDF_T(x) for x = 2, 3, · · · , 12; it peaks at x = 7, where PDF_T(7) = 1/6.]


Definition
The cumulative distribution function (CDF) of R is a function CDF_R : V −→ [0, 1] defined by:

CDF_R(x) ::= Pr(R ≤ x).

From the definitions of PDF and CDF:

CDF_R(x) = Pr(R ≤ x) = Σ_{y ≤ x} Pr(R = y) = Σ_{y ≤ x} PDF_R(y).

PDF_R(x) is the probability that R = x, and CDF_R(x) is the probability that R ≤ x.
Several random variables can have the same PDF and CDF; the PDF and CDF represent the distribution.
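For the two-dice sum T of the earlier example, the PDF and CDF can be tabulated exactly; a minimal sketch:

```python
from fractions import Fraction
from itertools import product

# PDF of T: probability of each possible sum of two fair dice.
pdf = {t: Fraction(0) for t in range(2, 13)}
for d1, d2 in product(range(1, 7), repeat=2):
    pdf[d1 + d2] += Fraction(1, 36)

def cdf(x):
    """CDF_T(x) = Pr(T <= x), summing the PDF over values y <= x."""
    return sum((pdf[t] for t in pdf if t <= x), Fraction(0))

print(pdf[7], cdf(6), cdf(12))  # 1/6 5/12 1
```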
Example
Continuing the previous example, the CDF of the random variable T is:

[Figure: step plot of CDF_T(x) for x = 1, 2, · · · , 12, increasing from 0 to 1.]



Bernoulli Distributions
Bernoulli distributions are the simplest and the most frequently occurring distributions.
These are the distributions of the indicator random variables.
The PDF of a Bernoulli distribution is f_p : {0, 1} −→ [0, 1], where:

f_p(0) = p, and
f_p(1) = 1 − p,

for some p ∈ [0, 1].
The CDF of a Bernoulli distribution is F_p : R −→ [0, 1], where:

F_p(x) = 0, if x < 0,
F_p(x) = p, if 0 ≤ x < 1,
F_p(x) = 1, if 1 ≤ x.
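A direct transcription of these formulas, keeping the slide's convention that f_p(0) = p; a minimal sketch:

```python
def bernoulli_pdf(p):
    """PDF f_p on {0, 1}, with f_p(0) = p as in the text."""
    return {0: p, 1: 1 - p}

def bernoulli_cdf(p, x):
    """CDF F_p over the reals: 0 below 0, p on [0, 1), then 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return p
    return 1.0

f = bernoulli_pdf(0.3)
print(bernoulli_cdf(0.3, 0.5))  # 0.3
```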



Uniform Distributions
Random variables whose values all have the same probability are called uniform.
Let the sample space be of the form {1, 2, · · · , n}; then the PDF of the uniform distribution is f_n : {1, 2, · · · , n} −→ [0, 1], where:

f_n(k) = 1/n,

for some n ∈ N+.
The CDF of the uniform distribution is F_n : R −→ [0, 1], where:

F_n(x) = 0, if x < 1,
F_n(x) = k/n, if k ≤ x < k + 1 for 1 ≤ k < n,
F_n(x) = 1, if n ≤ x.
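The uniform CDF is a step function; a minimal sketch:

```python
import math

def uniform_cdf(n, x):
    """CDF F_n of the uniform distribution on {1, ..., n}."""
    if x < 1:
        return 0.0
    if x >= n:
        return 1.0
    return math.floor(x) / n

print(uniform_cdf(6, 3.5), uniform_cdf(6, 0))  # 0.5 0.0
```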



Binomial Distributions

Suppose a random variable R represents the number of heads that occur in n independent tosses of a coin.
If the coin is fair, then the random variable has an unbiased binomial distribution with PDF f_n : {0, 1, · · · , n} −→ [0, 1], where:

f_n(k) = (n choose k) · 2^−n,

for some n ∈ N+.
This is because there are (n choose k) sequences (ways of selecting which tosses come up heads) of n tosses that produce precisely k heads, and each such sequence has probability (1/2)^n.
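With Python's `math.comb`, f_n(k) is one line; a minimal sketch, checked against Pr(C = 2) = 3/8 from the 3-coin example:

```python
from math import comb

def binom_pdf_unbiased(n, k):
    """f_n(k) = (n choose k) * 2^(-n): a fair coin tossed n times."""
    return comb(n, k) / 2**n

print(binom_pdf_unbiased(3, 2))  # 0.375
```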



Binomial Distributions

The CDF of the unbiased binomial distribution is F_n : R −→ [0, 1], where:

F_n(x) = 0, if x < 0,
F_n(x) = Σ_{i=0}^{k} (n choose i) 2^−n, if k ≤ x < k + 1 for 0 ≤ k < n,
F_n(x) = 1, if n ≤ x.



PDF for n = 20
The PDF of the unbiased binomial distribution with n = 20.

[Figure: plot of f_20(k) for k = 0, · · · , 20, symmetric around k = 10, where it peaks at about 0.176.]

The decreasing parts on the left and on the right of the graph are called the tails of the distribution.
General Binomial Distributions

If the coin is biased, with probability p of producing heads, then the random variable has a general binomial distribution with PDF f_{n,p} : {0, 1, · · · , n} −→ [0, 1], where:

f_{n,p}(k) = (n choose k) p^k (1 − p)^{n−k},

for some n ∈ N+ and p ∈ [0, 1].
This is because there are (n choose k) sequences of n tosses that produce precisely k heads and n − k tails, but each of those sequences now has probability p^k (1 − p)^{n−k}.
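The general PDF translates just as directly; a minimal sketch that also checks the PDF sums to 1 for n = 20 and p = 0.75:

```python
from math import comb, isclose

def binom_pdf(n, p, k):
    """f_{n,p}(k) = (n choose k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# The PDF sums to 1 over k = 0, ..., n.
total = sum(binom_pdf(20, 0.75, k) for k in range(21))
print(isclose(total, 1.0))  # True
```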



General Binomial Distributions

The CDF of the general binomial distribution is F_{n,p} : R −→ [0, 1], where:

F_{n,p}(x) = 0, if x < 0,
F_{n,p}(x) = Σ_{i=0}^{k} (n choose i) p^i (1 − p)^{n−i}, if k ≤ x < k + 1 for 0 ≤ k < n,
F_{n,p}(x) = 1, if n ≤ x.



PDF for n = 20 and p = 0.75
The PDF of the general binomial distribution with n = 20 and p = 0.75.

[Figure: plot of f_{20,0.75}(k) for k = 0, · · · , 20, peaking near k = 15 at about 0.20.]
PDF Approximation
The following is a method to approximate the value of the binomial coefficients.
Lemma

(n choose αn) ∼ 2^{nH(α)} / √(2πα(1−α)n), and (n choose αn) < 2^{nH(α)} / √(2πα(1−α)n),   (1)

where H(α) is the entropy function:

H(α) ::= α log(1/α) + (1 − α) log(1/(1 − α)).

Moreover, if αn > 10 and (1 − α)n > 10, then the RHS and LHS of Eq. (1) differ by at most 2%. If αn > 100 and (1 − α)n > 100, then the RHS and LHS of Eq. (1) differ by at most 0.2%.
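The lemma is easy to test numerically; a minimal sketch with α = 1/2 and n = 100 (so αn = 50 > 10, and the bound should hold to within 2%):

```python
from math import comb, log2, sqrt, pi

def H(a):
    """Binary entropy H(alpha); logs are base 2, matching 2^(nH)."""
    return a * log2(1 / a) + (1 - a) * log2(1 / (1 - a))

def rhs(n, a):
    """Right-hand side of the lemma: 2^(nH(a)) / sqrt(2 pi a (1-a) n)."""
    return 2 ** (n * H(a)) / sqrt(2 * pi * a * (1 - a) * n)

n, a = 100, 0.5
exact = comb(n, n // 2)
print(exact < rhs(n, a))             # True: the bound holds
print(rhs(n, a) / exact - 1 < 0.02)  # True: within 2%, as stated
```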



PDF Approximation

Now we can write that the PDF of the general binomial distribution with n and p satisfies:

f_{n,p}(αn) = (n choose αn) p^{αn} (1 − p)^{n−αn}
            < (2^{nH(α)} / √(2πα(1−α)n)) · p^{αn} (1 − p)^{n−αn}
            = 2^{n(α log(p/α) + (1−α) log((1−p)/(1−α)))} / √(2πα(1−α)n).

Although this expression looks complicated, it is easy to evaluate.

