Basics on Probability

Jingrui He
09/11/2007
Coin Flips
You flip a coin
Head with probability 0.5


You flip 100 coins
How many heads would you expect?
Coin Flips cont.
You flip a coin
Head with probability p
Binary random variable
Bernoulli trial with success probability p
You flip k coins
How many heads would you expect?
Number of heads X: discrete random variable
Binomial distribution with parameters k and p
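The question "how many heads would you expect" can be explored by simulation. The following is a minimal sketch, not part of the original slides; the function names `count_heads` and `average_heads`, the number of trials, and the seed are all illustrative assumptions.

```python
import random

def count_heads(k, p, rng):
    """Flip k coins, each landing heads with probability p; return the number of heads."""
    return sum(rng.random() < p for _ in range(k))

def average_heads(k, p, trials=2000, seed=0):
    """Average number of heads over repeated experiments of k flips each."""
    rng = random.Random(seed)
    return sum(count_heads(k, p, rng) for _ in range(trials)) / trials

print(average_heads(100, 0.5))  # should land near k * p = 50
```

The average settles near k * p, which is the mean of the binomial distribution with parameters k and p.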
Discrete Random Variables
Random variables (RVs) which may take on
only a countable number of distinct values
E.g. the total number of heads X you get if you
flip 100 coins

X is a RV with arity k if it can take on exactly
one value out of $\{x_1, \ldots, x_k\}$
E.g. the possible values that X can take on are 0,
1, 2, ..., 100
Probability of Discrete RV
Probability mass function (pmf): $P(X = x_i)$
Easy facts about pmf
$\sum_i P(X = x_i) = 1$
$P(X = x_i \cap X = x_j) = 0$ if $i \neq j$
$P(X = x_i \cup X = x_j) = P(X = x_i) + P(X = x_j)$ if $i \neq j$
$P(X = x_1 \cup X = x_2 \cup \cdots \cup X = x_k) = 1$
Common Distributions
Uniform
$X \sim U[1, \ldots, N]$
X takes values 1, 2, ..., N
$P(X = i) = 1/N$
E.g. picking balls of different colors from a box
Binomial
$X \sim \mathrm{Bin}(n, p)$
X takes values 0, 1, ..., n
$P(X = i) = \binom{n}{i} p^i (1 - p)^{n - i}$
E.g. coin flips
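The two pmfs on this slide can be written down directly. The sketch below is illustrative only: the function names are assumptions, and `Fraction` is used so that the check that each pmf sums to 1 is exact.

```python
from fractions import Fraction
from math import comb

def uniform_pmf(i, N):
    """P(X = i) for X uniform on {1, ..., N}."""
    return Fraction(1, N) if 1 <= i <= N else Fraction(0)

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ Bin(n, p): C(n, i) p^i (1 - p)^(n - i)."""
    if not 0 <= i <= n:
        return Fraction(0)
    return comb(n, i) * p**i * (1 - p)**(n - i)

# Both pmfs sum to 1 over their support (exact when p is a Fraction).
p = Fraction(1, 2)
print(sum(uniform_pmf(i, 6) for i in range(1, 7)))     # 1
print(sum(binomial_pmf(i, 10, p) for i in range(11)))  # 1
print(binomial_pmf(5, 10, p))                          # 63/256
```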
Coin Flips of Two Persons
Your friend and you both flip coins
Head with probability 0.5
You flip 50 times; your friend flips 100 times
How many heads will both of you get?

Joint Distribution
Given two discrete RVs X and Y, their joint
distribution is the distribution of X and Y
together
E.g. P(You get 21 heads AND your friend gets 70
heads)
$\sum_x \sum_y P(X = x \cap Y = y) = 1$
E.g. $\sum_{i=0}^{50} \sum_{j=0}^{100} P(\text{You get } i \text{ heads AND your friend gets } j \text{ heads}) = 1$
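As a rough numerical check of the double sum, the sketch below builds the joint pmf for the two-person coin example. It assumes that your flips and your friend's flips do not influence each other, so the joint probability is a product of two binomial probabilities; that assumption and the function names are not from the slides.

```python
from math import comb

def binomial_pmf(i, n, p=0.5):
    """P(i heads in n flips) when each flip is heads with probability p."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def joint_pmf(i, j, n_you=50, n_friend=100):
    """P(you get i heads AND your friend gets j heads), assuming the two
    sets of flips do not influence each other (an assumption of this sketch)."""
    return binomial_pmf(i, n_you) * binomial_pmf(j, n_friend)

total = sum(joint_pmf(i, j) for i in range(51) for j in range(101))
print(total)              # approximately 1
print(joint_pmf(21, 70))  # a very small probability
```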

Conditional Probability
$P(X = x \mid Y = y)$ is the probability of $X = x$,
given the occurrence of $Y = y$
E.g. you get 0 heads, given that your friend gets
61 heads
$P(X = x \mid Y = y) = \frac{P(X = x \cap Y = y)}{P(Y = y)}$
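The definition of conditional probability can be applied mechanically to a joint table. The sketch below uses a made-up joint pmf for two binary RVs; the numbers and function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y) for two binary RVs (values chosen for illustration).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10),
}

def marginal_y(y):
    """P(Y = y), obtained by summing the joint pmf over x."""
    return sum(p for (_, yy), p in joint.items() if yy == y)

def conditional_x_given_y(x, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    return joint[(x, y)] / marginal_y(y)

print(conditional_x_given_y(0, 1))  # 3/7
```

For a fixed y, the conditional probabilities over all x again sum to 1.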
Law of Total Probability
Given two discrete RVs X and Y, which take
values in $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$, we have
$P(X = x_i) = \sum_j P(X = x_i \cap Y = y_j) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$

Marginalization
$P(X = x_i) = \sum_j P(X = x_i \cap Y = y_j) = \sum_j P(X = x_i \mid Y = y_j) P(Y = y_j)$
Marginal probability: $P(X = x_i)$
Joint probability: $P(X = x_i \cap Y = y_j)$
Conditional probability: $P(X = x_i \mid Y = y_j)$
Marginal probability: $P(Y = y_j)$
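Marginalization via the law of total probability might look as follows in code. The probabilities and the labels `x1`, `x2`, `y1`, `y2` are hypothetical values picked for illustration.

```python
from fractions import Fraction

# Hypothetical ingredients: P(Y = y_j) and P(X = x_i | Y = y_j).
p_y = {"y1": Fraction(1, 4), "y2": Fraction(3, 4)}
p_x_given_y = {
    ("x1", "y1"): Fraction(1, 2), ("x2", "y1"): Fraction(1, 2),
    ("x1", "y2"): Fraction(1, 3), ("x2", "y2"): Fraction(2, 3),
}

def marginal_x(x):
    """P(X = x) = sum_j P(X = x | Y = y_j) P(Y = y_j)."""
    return sum(p_x_given_y[(x, y)] * p_y[y] for y in p_y)

print(marginal_x("x1"))  # 3/8
print(marginal_x("x2"))  # 5/8
```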
Bayes Rule
X and Y are discrete RVs
$P(X = x \mid Y = y) = \frac{P(X = x \cap Y = y)}{P(Y = y)}$
$P(X = x_i \mid Y = y_j) = \frac{P(Y = y_j \mid X = x_i) P(X = x_i)}{\sum_k P(Y = y_j \mid X = x_k) P(X = x_k)}$
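Bayes rule for discrete RVs translates almost line by line into code. In the sketch below, the function name `posterior` and the two-coin example are assumptions made for illustration.

```python
from fractions import Fraction

def posterior(prior, likelihood, y):
    """Return P(X = x | Y = y) for every x via Bayes rule.

    prior[x] is P(X = x); likelihood[(y, x)] is P(Y = y | X = x).
    """
    evidence = sum(likelihood[(y, x)] * prior[x] for x in prior)  # P(Y = y)
    return {x: likelihood[(y, x)] * prior[x] / evidence for x in prior}

# Hypothetical example: a fair coin vs. a coin that lands heads 3/4 of the time.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood = {("H", "fair"): Fraction(1, 2), ("H", "biased"): Fraction(3, 4)}
post = posterior(prior, likelihood, "H")
print(post["fair"], post["biased"])  # 2/5 3/5
```

The denominator is the law of total probability applied to P(Y = y), so the posterior values sum to 1.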
Independent RVs
Intuition: X and Y are independent means that
$X = x$ makes it neither more nor less probable
that $Y = y$
Definition: X and Y are independent iff
$P(X = x \cap Y = y) = P(X = x) P(Y = y)$
More on Independence




$P(X = x \cap Y = y) = P(X = x) P(Y = y)$ implies
$P(X = x \mid Y = y) = P(X = x)$
$P(Y = y \mid X = x) = P(Y = y)$
E.g. no matter how many heads you get, your
friend will not be affected, and vice versa
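The definition of independence can be checked on any joint table by comparing each joint probability with the product of the marginals. The helper `is_independent` and both example tables below are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint, xs, ys):
    """True iff P(X = x, Y = y) = P(X = x) P(Y = y) for every pair (x, y)."""
    p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
    p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}
    return all(joint[(x, y)] == p_x[x] * p_y[y] for x, y in product(xs, ys))

# Two fair coins flipped separately: every outcome pair has probability 1/4.
two_coins = {(x, y): Fraction(1, 4) for x, y in product("HT", "HT")}
# A second coin that always copies the first one.
copied = {("H", "H"): Fraction(1, 2), ("H", "T"): Fraction(0),
          ("T", "H"): Fraction(0), ("T", "T"): Fraction(1, 2)}

print(is_independent(two_coins, "HT", "HT"))  # True
print(is_independent(copied, "HT", "HT"))     # False
```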
Conditionally Independent RVs
Intuition: X and Y are conditionally
independent given Z means that once Z is
known, the value of X does not add any
additional information about Y
Definition: X and Y are conditionally
independent given Z iff
$P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$
More on Conditional Independence
$P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z) P(Y = y \mid Z = z)$ implies
$P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)$
$P(Y = y \mid X = x, Z = z) = P(Y = y \mid Z = z)$
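A small example may help separate conditional independence from plain independence. In the hypothetical setup below (not from the slides), Z selects one of two coins and X, Y are two flips of the selected coin: given Z the flips are independent, but without knowing Z they are not.

```python
from fractions import Fraction
from itertools import product

# Hypothetical setup: Z selects a coin, then X and Y are two flips of that coin.
p_z = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
p_heads = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}

def flip(outcome, z):
    """P(one flip = outcome | Z = z)."""
    return p_heads[z] if outcome == "H" else 1 - p_heads[z]

# Joint pmf P(X = x, Y = y, Z = z) built from the setup above.
joint = {(x, y, z): p_z[z] * flip(x, z) * flip(y, z)
         for x, y, z in product("HT", "HT", p_z)}

def prob(x=None, y=None, z=None):
    """Sum the joint pmf over every variable left as None."""
    return sum(p for (a, b, c), p in joint.items()
               if x in (None, a) and y in (None, b) and z in (None, c))

# Given Z, knowing Y does not change the probability of X ...
cond_indep = all(prob(x, y, z) / prob(y=y, z=z) == prob(x=x, z=z) / prob(z=z)
                 for x, y, z in joint)
# ... but without Z, X and Y are dependent.
marg_indep = all(prob(x=x, y=y) == prob(x=x) * prob(y=y)
                 for x, y in product("HT", "HT"))
print(cond_indep, marg_indep)  # True False
```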
Monty Hall Problem
You're given the choice of three doors: Behind one
door is a car; behind the others, goats.
You pick a door, say No. 1
The host, who knows what's behind the doors, opens
another door, say No. 3, which has a goat.
Do you want to pick door No. 2 instead?





[Figure: the possible cases after your first pick. Panel captions: "Host must reveal Goat B"; "Host must reveal Goat A"; "Host reveals Goat A or Host reveals Goat B".]





Monty Hall Problem: Bayes Rule
$C_i$: the car is behind door i, i = 1, 2, 3
$P(C_i) = 1/3$
$H_{ij}$: the host opens door j after you pick door i
$P(H_{ij} \mid C_k) = \begin{cases} 0 & i = j \\ 0 & j = k \\ 1/2 & i = k \\ 1 & i \neq k,\ j \neq k \end{cases}$

Monty Hall Problem: Bayes Rule cont.


WLOG, i = 1, j = 3
$P(C_1 \mid H_{13}) = \frac{P(H_{13} \mid C_1) P(C_1)}{P(H_{13})}$
$P(H_{13} \mid C_1) P(C_1) = \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}$







Monty Hall Problem: Bayes Rule cont.
$P(H_{13}) = P(H_{13}, C_1) + P(H_{13}, C_2) + P(H_{13}, C_3)$
$= P(H_{13} \mid C_1) P(C_1) + P(H_{13} \mid C_2) P(C_2)$
$= \frac{1}{6} + 1 \cdot \frac{1}{3}$
$= \frac{1}{2}$
$P(C_1 \mid H_{13}) = \frac{1/6}{1/2} = \frac{1}{3}$
Monty Hall Problem: Bayes Rule cont.
$P(C_1 \mid H_{13}) = \frac{1/6}{1/2} = \frac{1}{3}$
$P(C_2 \mid H_{13}) = 1 - \frac{1}{3} = \frac{2}{3} > P(C_1 \mid H_{13})$
You should switch!
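The Bayes-rule derivation for the Monty Hall problem can be reproduced exactly in a few lines. The sketch below encodes the case table for P(H_ij | C_k) and the uniform prior P(C_k) = 1/3; the function names are illustrative assumptions.

```python
from fractions import Fraction

def p_host_opens(i, j, k):
    """P(H_ij | C_k): host opens door j after you pick door i, car behind door k."""
    if j == i or j == k:
        return Fraction(0)      # host never opens your door or the car's door
    if i == k:
        return Fraction(1, 2)   # you picked the car: host chooses between two goats
    return Fraction(1)          # otherwise only one door is left to open

def posterior_car(i, j):
    """P(C_k | H_ij) for k = 1, 2, 3, with the uniform prior P(C_k) = 1/3."""
    prior = Fraction(1, 3)
    evidence = sum(p_host_opens(i, j, k) * prior for k in (1, 2, 3))
    return {k: p_host_opens(i, j, k) * prior / evidence for k in (1, 2, 3)}

post = posterior_car(1, 3)
print(post[1], post[2], post[3])  # 1/3 2/3 0
```

Switching to door 2 wins with probability 2/3, in agreement with the derivation.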
Continuous Random Variables
What if X is continuous?
Probability density function (pdf) instead of
probability mass function (pmf)
A pdf is any function $f(x)$ that describes the
probability density in terms of the input
variable x.
PDF
Properties of pdf
$f(x) \geq 0, \ \forall x$
$\int_{-\infty}^{+\infty} f(x)\,dx = 1$
$f(x) \leq 1$ ???
Actual probability can be obtained by taking
the integral of pdf
E.g. the probability of X being between 0 and 1 is
$P(0 \leq X \leq 1) = \int_0^1 f(x)\,dx$
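Regarding the question of whether f(x) must stay below 1: a density is not a probability, so it can exceed 1 as long as it integrates to 1. The sketch below uses a hypothetical RV spread evenly over [0, 0.5] and a simple midpoint rule; both are assumptions chosen for illustration.

```python
def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (n + 0.5) * h) for n in range(steps))

def f(x):
    """Density of a RV spread evenly over [0, 0.5] (a hypothetical example)."""
    return 2.0 if 0.0 <= x <= 0.5 else 0.0

print(integrate(f, -1.0, 2.0))   # total probability, approximately 1
print(integrate(f, 0.0, 0.25))   # P(0 <= X <= 0.25), approximately 0.5
print(f(0.1))                    # 2.0: a density value may exceed 1
```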
Cumulative Distribution Function

$F_X(v) = P(X \leq v)$
Discrete RVs
$F_X(v) = \sum_{v_i \leq v} P(X = v_i)$
Continuous RVs
$F_X(v) = \int_{-\infty}^{v} f(x)\,dx$
$\frac{d}{dx} F_X(x) = f(x)$
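For a discrete RV the CDF is a running sum of the pmf. A minimal sketch, assuming a fair six-sided die as the example (the die and the name `cdf` are not from the slides):

```python
from fractions import Fraction

# pmf of a fair six-sided die (a hypothetical example).
pmf = {v: Fraction(1, 6) for v in range(1, 7)}

def cdf(v):
    """F_X(v) = P(X <= v): add up the pmf over all values v_i <= v."""
    return sum((p for v_i, p in pmf.items() if v_i <= v), Fraction(0))

print(cdf(0), cdf(2), cdf(3.5), cdf(6))  # 0 1/3 1/2 1
```

The CDF is defined for every real v, not only for the values the RV can take, which is why `cdf(3.5)` is meaningful.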
Common Distributions
Normal
$X \sim N(\mu, \sigma^2)$
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\}, \ x \in \mathbb{R}$
E.g. the height of the entire population
[Figure: plot of f(x) against x, for x from -5 to 5 and f(x) from 0 to 0.4]
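The normal density and its CDF can be evaluated with the standard library alone. In this sketch the function names and the default parameters (mean 0, standard deviation 1) are assumptions; the CDF is expressed through the error function `math.erf`.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)

def normal_cdf(v, mu=0.0, sigma=1.0):
    """F_X(v) = P(X <= v), written with the error function erf."""
    return 0.5 * (1 + erf((v - mu) / (sigma * sqrt(2))))

print(normal_pdf(0.0))                     # about 0.3989, the peak for mu = 0, sigma = 1
print(normal_cdf(1.0) - normal_cdf(-1.0))  # about 0.6827, P(-1 <= X <= 1)
```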
Common Distributions cont.
Beta
$X \sim \mathrm{Beta}(\alpha, \beta)$
$f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1}, \ x \in [0, 1]$
$\alpha = \beta = 1$: uniform distribution between 0 and 1
E.g. the conjugate prior for the parameter p in
Binomial distribution
[Figure: plot of f(x) against x, for x from 0 to 1 and f(x) from 0 to 1.6]
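The Beta density can be sketched with `math.gamma`, using the identity B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b). The function names below are assumptions; the first print illustrates that the case where both parameters equal 1 is the uniform distribution on [0, 1].

```python
from math import gamma

def beta_function(a, b):
    """B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in [0, 1]; zero outside that interval."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_function(a, b)

print([beta_pdf(x, 1, 1) for x in (0.1, 0.5, 0.9)])  # all equal to 1: uniform on [0, 1]
print(beta_pdf(0.5, 2, 2))                           # approximately 1.5
```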
Joint Distribution
Given two continuous RVs X and Y, the joint
pdf can be written as $f_{X,Y}(x, y)$
$\int_x \int_y f_{X,Y}(x, y)\,dx\,dy = 1$
Multivariate Normal
Generalization to higher dimensions of the
one-dimensional normal
$f_X(x_1, \ldots, x_d) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right\}$
Covariance matrix: $\Sigma$
Mean: $\mu$
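To make the multivariate formula concrete, the sketch below evaluates it for the special case d = 2, where the determinant and inverse of the covariance matrix can be written out by hand. The function name `mvn_pdf_2d` and the restriction to two dimensions are assumptions of this example.

```python
from math import exp, pi, sqrt

def mvn_pdf_2d(x, mu, cov):
    """Density of a 2-dimensional normal with mean mu and covariance matrix cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c                                # |Sigma|
    inv = ((d / det, -b / det), (-c / det, a / det))   # Sigma^{-1}
    dx = (x[0] - mu[0], x[1] - mu[1])                  # x - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    quad = sum(dx[r] * inv[r][s] * dx[s] for r in range(2) for s in range(2))
    dim = 2
    return exp(-0.5 * quad) / ((2 * pi) ** (dim / 2) * sqrt(det))

# With identity covariance the density at the mean is 1 / (2 pi), about 0.1592.
print(mvn_pdf_2d((0.0, 0.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))))
```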
Moments
Mean (Expectation): $\mu = E(X)$
Discrete RVs: $E(X) = \sum_{v_i} v_i P(X = v_i)$
Continuous RVs: $E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx$
Variance: $V(X) = E\left((X - \mu)^2\right)$
Discrete RVs: $V(X) = \sum_{v_i} (v_i - \mu)^2 P(X = v_i)$
Continuous RVs: $V(X) = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$
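For discrete RVs the two definitions are plain weighted sums. A minimal sketch, assuming the pmf is stored as a dictionary from values to probabilities (a representation chosen for this example):

```python
from fractions import Fraction

def mean(pmf):
    """E(X) = sum_i v_i P(X = v_i)."""
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    """V(X) = sum_i (v_i - mu)^2 P(X = v_i), with mu = E(X)."""
    mu = mean(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

die = {v: Fraction(1, 6) for v in range(1, 7)}  # fair six-sided die
print(mean(die), variance(die))  # 7/2 35/12
```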
Properties of Moments
Mean
$E(X + Y) = E(X) + E(Y)$
$E(aX) = a E(X)$
If X and Y are independent, $E(XY) = E(X) E(Y)$
Variance
$V(aX + b) = a^2 V(X)$
If X and Y are independent, $V(X + Y) = V(X) + V(Y)$
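These properties can be spot-checked on a small example. The sketch below uses two independent fair dice and exact arithmetic; the helper names `transform` and `sum_of_independent` are assumptions introduced for this check.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((v - mu) ** 2 * p for v, p in pmf.items())

def transform(pmf, g):
    """pmf of g(X), given the pmf of X."""
    out = {}
    for v, p in pmf.items():
        out[g(v)] = out.get(g(v), 0) + p
    return out

def sum_of_independent(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0) + px * py
    return out

die = {v: Fraction(1, 6) for v in range(1, 7)}
a, b = 3, 5
print(variance(transform(die, lambda v: a * v + b)) == a**2 * variance(die))  # True
two = sum_of_independent(die, die)
print(mean(two) == 2 * mean(die), variance(two) == 2 * variance(die))  # True True
```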
Moments of Common Distributions
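Uniform $X \sim U[1, \ldots, N]$
Mean $(1 + N)/2$; variance $(N^2 - 1)/12$
Binomial $X \sim \mathrm{Bin}(n, p)$
Mean $np$; variance $np(1 - p)$
Normal $X \sim N(\mu, \sigma^2)$
Mean $\mu$; variance $\sigma^2$
Beta $X \sim \mathrm{Beta}(\alpha, \beta)$
Mean $\alpha/(\alpha + \beta)$; variance $\frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}$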
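The closed-form moments of the two discrete distributions can be compared against a direct computation from the pmf. The values N = 10, n = 8 and p = 1/3 below are arbitrary choices for illustration, and `mean_var` is a hypothetical helper.

```python
from fractions import Fraction
from math import comb

def mean_var(pmf):
    """Return (E(X), V(X)) computed directly from a pmf dictionary."""
    mu = sum(v * p for v, p in pmf.items())
    return mu, sum((v - mu) ** 2 * p for v, p in pmf.items())

N = 10
uniform = {i: Fraction(1, N) for i in range(1, N + 1)}
print(mean_var(uniform) == (Fraction(N + 1, 2), Fraction(N**2 - 1, 12)))  # True

n, p = 8, Fraction(1, 3)
binom = {i: comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)}
print(mean_var(binom) == (n * p, n * p * (1 - p)))  # True
```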
Probability of Events
X denotes an event that could possibly happen
E.g. X = you will fail in this course
P(X) denotes the likelihood that X happens,
or X = true
What's the probability that you will fail in this
course?
$\Omega$ denotes the entire event set
$\Omega = \{X, \neg X\}$
The Axioms of Probabilities
$0 \leq P(X) \leq 1$
$P(\Omega) = 1$
$P(X_1 \cup X_2 \cup \cdots) = \sum_i P(X_i)$, where $X_i$ are
disjoint events
Useful rules
$P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2)$
$P(\neg X) = 1 - P(X)$
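The useful rules can be checked on a finite event set with equally likely outcomes. The die example, the events `x1` and `x2`, and the helper `prob` are assumptions made for this sketch.

```python
from fractions import Fraction

omega = set(range(1, 7))            # outcomes of one roll of a fair die (hypothetical example)

def prob(event):
    """P(event) when every outcome in omega is equally likely."""
    return Fraction(len(event & omega), len(omega))

x1 = {2, 4, 6}                      # "the roll is even"
x2 = {4, 5, 6}                      # "the roll is at least 4"

print(prob(x1 | x2) == prob(x1) + prob(x2) - prob(x1 & x2))  # True
print(prob(omega - x1) == 1 - prob(x1))                      # True
print(prob(omega))                                           # 1
```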
Interpreting the Axioms
[Figure: diagram of the event set $\Omega$ containing the events $X_1$ and $X_2$]
