
Basics on Probability

Jingrui He
09/11/2007

Coin Flips

- You flip a coin
  - Heads with probability 0.5
- You flip 100 coins
  - How many heads would you expect?

Coin Flips cont.

- You flip a coin
  - Heads with probability p
  - Binary random variable
  - Bernoulli trial with success probability p
- You flip k coins
  - How many heads would you expect?
  - Number of heads X: discrete random variable
  - Binomial distribution with parameters k and p

Discrete Random Variables

- Random variables (RVs) which may take on only a countable number of distinct values
  - E.g. the total number of heads X you get if you flip 100 coins
- X is a RV with arity k if it can take on exactly one value out of $\{x_1, \ldots, x_k\}$
  - E.g. the possible values that X can take on are 0, 1, 2, ..., 100

Probability of Discrete RV

- Probability mass function (pmf): $P(X = x_i)$
- Easy facts about pmf
  - $\sum_i P(X = x_i) = 1$
  - $P(X = x_i \cap X = x_j) = 0$ if $i \neq j$
  - $P(X = x_i \cup X = x_j) = P(X = x_i) + P(X = x_j)$ if $i \neq j$
  - $P(X = x_1 \cup X = x_2 \cup \ldots \cup X = x_k) = 1$
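These facts are easy to check numerically. A minimal sketch in Python, using a fair six-sided die as an assumed example distribution (not from the slides):

```python
# Sanity-check the pmf facts with a fair six-sided die (assumed example).
from fractions import Fraction

pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Fact: the probabilities of all outcomes sum to 1.
assert sum(pmf.values()) == 1

# Fact: distinct outcomes are mutually exclusive, so the probability
# of the union {X=1 or X=2} is just the sum of the two probabilities.
p_1_or_2 = pmf[1] + pmf[2]
assert p_1_or_2 == Fraction(1, 3)

print("pmf facts verified")
```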

Common Distributions

- Uniform: $X \sim U[1, \ldots, N]$
  - X takes values 1, 2, ..., N
  - $P(X = i) = 1/N$
  - E.g. picking balls of different colors from a box
- Binomial: $X \sim \text{Bin}(n, p)$
  - X takes values 0, 1, ..., n
  - $P(X = i) = \binom{n}{i} p^i (1 - p)^{n - i}$
  - E.g. coin flips
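A short sketch evaluating both pmfs in Python (assuming scipy is available; N = 6 is an illustrative choice, while n = 100, p = 0.5 echoes the coin-flip slide):

```python
# Evaluate the uniform and binomial pmfs numerically.
from scipy.stats import binom, randint

N = 6                        # uniform on {1, ..., N} (illustrative choice)
n, p = 100, 0.5              # 100 fair coin flips, as in the earlier slide

# Uniform: P(X = i) = 1/N for every i.
uniform = randint(1, N + 1)  # scipy's randint is uniform on {low, ..., high-1}
print(uniform.pmf(3))        # 1/6 ~ 0.1667

# Binomial: P(X = i) = C(n, i) p^i (1-p)^(n-i).
print(binom.pmf(50, n, p))   # probability of exactly 50 heads ~ 0.0796
assert abs(sum(binom.pmf(i, n, p) for i in range(n + 1)) - 1) < 1e-9
```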

Coin Flips of Two Persons

- Your friend and you both flip coins
  - Heads with probability 0.5
  - You flip 50 times; your friend flips 100 times
- How many heads will each of you get?

Joint Distribution

- Given two discrete RVs X and Y, their joint distribution is the distribution of X and Y together
  - E.g. P(you get 21 heads AND your friend gets 70 heads)
- $\sum_x \sum_y P(X = x \cap Y = y) = 1$
  - E.g. $\sum_{i=0}^{50} \sum_{j=0}^{100} P(\text{you get } i \text{ heads AND your friend gets } j \text{ heads}) = 1$
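Because the two flippers do not influence each other, this joint pmf factors into a product of two binomials; a minimal sketch verifying the normalization (independence is assumed here, anticipating the later slide):

```python
# Joint pmf of (your heads, friend's heads) for independent fair-coin flips.
from scipy.stats import binom

total = sum(
    binom.pmf(i, 50, 0.5) * binom.pmf(j, 100, 0.5)
    for i in range(51)       # your heads: 0..50
    for j in range(101)      # friend's heads: 0..100
)
print(total)  # ~ 1.0: the joint distribution sums to 1
```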

Conditional Probability

- $P(X = x \mid Y = y)$ is the probability of $X = x$, given the occurrence of $Y = y$
  - E.g. you get 0 heads, given that your friend gets 61 heads
- $P(X = x \mid Y = y) = \dfrac{P(X = x \cap Y = y)}{P(Y = y)}$

Law of Total Probability

Given two discrete RVs X and Y, which take values in $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$, we have:

- Marginalization (marginal probability from the joint probability):
  $P(X = x_i) = \sum_j P(X = x_i \cap Y = y_j)$
- Law of total probability (marginal probability from the conditional probability):
  $P(X = x_i) = \sum_j P(X = x_i \mid Y = y_j)\, P(Y = y_j)$

Bayes Rule

X and Y are discrete RVs:

- $P(X = x \mid Y = y) = \dfrac{P(X = x \cap Y = y)}{P(Y = y)}$
- $P(X = x_i \mid Y = y_j) = \dfrac{P(Y = y_j \mid X = x_i)\, P(X = x_i)}{\sum_k P(Y = y_j \mid X = x_k)\, P(X = x_k)}$
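A minimal numeric sketch of the second formula. The setup (a fair vs. biased coin, and the probabilities used) is a made-up illustration, not from the slides:

```python
# Bayes rule for a discrete X: posterior is likelihood x prior, normalized.
# Hypothetical setup: X = which coin was picked (fair or biased),
# Y = the picked coin came up heads.
prior = {"fair": 0.5, "biased": 0.5}        # P(X = x_i), assumed
likelihood = {"fair": 0.5, "biased": 0.9}   # P(Y = heads | X = x_i), assumed

# Denominator: sum_k P(Y | X = x_k) P(X = x_k)
evidence = sum(likelihood[x] * prior[x] for x in prior)

posterior = {x: likelihood[x] * prior[x] / evidence for x in prior}
print(posterior)  # {'fair': ~0.357, 'biased': ~0.643}
```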

Independent RVs

- Intuition: X and Y are independent means that $X = x$ neither makes it more nor less probable that $Y = y$
- Definition: X and Y are independent iff
  $P(X = x \cap Y = y) = P(X = x)\, P(Y = y)$

More on Independence

- $P(X = x \cap Y = y) = P(X = x)\, P(Y = y)$ implies
  - $P(X = x \mid Y = y) = P(X = x)$
  - $P(Y = y \mid X = x) = P(Y = y)$
- E.g. no matter how many heads you get, your friend will not be affected, and vice versa

Conditionally Independent RVs

- Intuition: X and Y are conditionally independent given Z means that once Z is known, the value of X does not add any additional information about Y
- Definition: X and Y are conditionally independent given Z iff
  $P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z)\, P(Y = y \mid Z = z)$

More on Conditional Independence

- $P(X = x \cap Y = y \mid Z = z) = P(X = x \mid Z = z)\, P(Y = y \mid Z = z)$ implies
  - $P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)$
  - $P(Y = y \mid X = x, Z = z) = P(Y = y \mid Z = z)$

Monty Hall Problem

- You're given the choice of three doors: behind one door is a car; behind the others, goats.
- You pick a door, say No. 1.
- The host, who knows what's behind the doors, opens another door, say No. 3, which has a goat.
- Do you want to pick door No. 2 instead?

[Figure: if your door hides the car, the host reveals Goat A or Goat B; if your door hides Goat A, the host must reveal Goat B; if your door hides Goat B, the host must reveal Goat A]

Monty Hall Problem: Bayes Rule

- $C_i$: the car is behind door i, i = 1, 2, 3
  - $P(C_i) = 1/3$
- $H_{ij}$: the host opens door j after you pick door i

$$P(H_{ij} \mid C_k) = \begin{cases} 0 & i = j \text{ (the host never opens your door)} \\ 0 & j = k \text{ (the host never reveals the car)} \\ 1/2 & i = k \text{ (two goat doors to choose from)} \\ 1 & i \neq k,\ j \neq k \text{ (only one goat door left)} \end{cases}$$

Monty Hall Problem: Bayes Rule cont.

WLOG, take i = 1, j = 3:

$$P(C_1 \mid H_{13}) = \frac{P(H_{13} \mid C_1)\, P(C_1)}{P(H_{13})}$$

$$P(H_{13} \mid C_1)\, P(C_1) = \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}$$

Monty Hall Problem: Bayes Rule cont.

$$P(H_{13}) = P(H_{13}, C_1) + P(H_{13}, C_2) + P(H_{13}, C_3)$$
$$= P(H_{13} \mid C_1)\, P(C_1) + P(H_{13} \mid C_2)\, P(C_2) = \frac{1}{6} + 1 \cdot \frac{1}{3} = \frac{1}{2}$$

(the $C_3$ term vanishes because $P(H_{13} \mid C_3) = 0$: the host never reveals the car)

$$P(C_1 \mid H_{13}) = \frac{1/6}{1/2} = \frac{1}{3}$$

Monty Hall Problem: Bayes Rule cont.

$$P(C_1 \mid H_{13}) = \frac{1/6}{1/2} = \frac{1}{3}$$
$$P(C_2 \mid H_{13}) = 1 - \frac{1}{3} = \frac{2}{3} > P(C_1 \mid H_{13})$$

You should switch!
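A quick Monte Carlo sketch confirming the 1/3 vs. 2/3 split (pure Python, assuming only the standard rules of the game):

```python
# Monte Carlo simulation of the Monty Hall problem.
import random

def play(switch: bool) -> bool:
    """Return True if the player wins the car."""
    car = random.randrange(3)
    pick = random.randrange(3)
    # Host opens a door that is neither the player's pick nor the car.
    host = random.choice([d for d in range(3) if d != pick and d != car])
    if switch:
        # Switch to the one remaining unopened door.
        pick = next(d for d in range(3) if d != pick and d != host)
    return pick == car

trials = 100_000
print(sum(play(switch=False) for _ in range(trials)) / trials)  # ~ 1/3
print(sum(play(switch=True) for _ in range(trials)) / trials)   # ~ 2/3
```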

Continuous Random Variables

- What if X is continuous?
- Probability density function (pdf) instead of probability mass function (pmf)
- A pdf is any function $f(x)$ that describes the probability density in terms of the input variable x

PDF

- Properties of pdf
  - $f(x) \geq 0,\ \forall x$
  - $\int_{-\infty}^{\infty} f(x)\, dx = 1$
  - $f(x) \leq 1$??? Not necessarily: a density may exceed 1 at some points, as long as it integrates to 1
- Actual probability can be obtained by taking the integral of the pdf
  - E.g. the probability of X being between 0 and 1 is
    $P(0 \leq X \leq 1) = \int_0^1 f(x)\, dx$
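A small sketch evaluating such an integral numerically (assuming scipy; the standard normal is an illustrative choice of f, not from the slides):

```python
# P(0 <= X <= 1) for a standard normal X, by numerical integration.
from scipy.integrate import quad
from scipy.stats import norm

prob, _err = quad(norm.pdf, 0, 1)    # integrate the pdf over [0, 1]
print(prob)                          # ~ 0.3413
print(norm.cdf(1) - norm.cdf(0))     # same value via the CDF (next slide)
```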

Cumulative Distribution Function

- $F_X(v) = P(X \leq v)$
- Discrete RVs:
  $F_X(v) = \sum_{v_i \leq v} P(X = v_i)$
- Continuous RVs:
  $F_X(v) = \int_{-\infty}^{v} f(x)\, dx$
  $\dfrac{d}{dx} F_X(x) = f(x)$

Common Distributions

- Normal: $X \sim N(\mu, \sigma^2)$
  - $f(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\dfrac{(x - \mu)^2}{2\sigma^2} \right),\ x \in \mathbb{R}$
  - E.g. the height of the entire population

[Figure: standard normal density f(x), plotted for x from -5 to 5]

Common Distributions cont.

- Beta: $X \sim \text{Beta}(\alpha, \beta)$
  - $f(x; \alpha, \beta) = \dfrac{1}{B(\alpha, \beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1},\ x \in [0, 1]$
  - $\alpha = \beta = 1$: uniform distribution between 0 and 1
  - E.g. the conjugate prior for the parameter p in the Binomial distribution

[Figure: Beta density f(x), plotted for x from 0 to 1]
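A minimal sketch of the conjugacy claim (assuming scipy): starting from a Beta(α, β) prior on p and observing h heads in n flips, the posterior over p is Beta(α + h, β + n − h). The numbers below are illustrative, not from the slides:

```python
# Beta-Binomial conjugacy: the posterior over p is again a Beta.
from scipy.stats import beta

a, b = 1.0, 1.0        # Beta(1, 1) prior = uniform on [0, 1]
n, heads = 10, 7       # illustrative data: 7 heads in 10 flips

posterior = beta(a + heads, b + n - heads)   # Beta(8, 4)
print(posterior.mean())                      # ~ 0.667, posterior mean of p
```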

Joint Distribution

- Given two continuous RVs X and Y, the joint pdf can be written as $f_{X,Y}(x, y)$
- $\int_x \int_y f_{X,Y}(x, y)\, dx\, dy = 1$

Multivariate Normal

- Generalization to higher dimensions of the one-dimensional normal
- $f_{\vec{X}}(x_1, \ldots, x_d) = \dfrac{1}{(2\pi)^{d/2}\, |\Sigma|^{1/2}} \exp\left( -\dfrac{1}{2} (\vec{x} - \vec{\mu})^T \Sigma^{-1} (\vec{x} - \vec{\mu}) \right)$
  where $\vec{\mu}$ is the mean vector and $\Sigma$ is the covariance matrix
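A short sketch evaluating this density (assuming scipy; the 2-D mean and covariance are made-up values):

```python
# Evaluate a 2-D multivariate normal density.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 0.0])           # mean vector (illustrative)
sigma = np.array([[1.0, 0.5],       # covariance matrix (illustrative)
                  [0.5, 2.0]])

mvn = multivariate_normal(mean=mu, cov=sigma)
print(mvn.pdf([0.0, 0.0]))   # density at the mean ~ 0.1203
```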

Moments

- Mean (expectation): $\mu = E(X)$
  - Discrete RVs: $E(X) = \sum_{v_i} v_i\, P(X = v_i)$
  - Continuous RVs: $E(X) = \int_{-\infty}^{\infty} x f(x)\, dx$
- Variance: $V(X) = E\left( (X - \mu)^2 \right)$
  - Discrete RVs: $V(X) = \sum_{v_i} (v_i - \mu)^2\, P(X = v_i)$
  - Continuous RVs: $V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx$

Properties of Moments

- Mean
  - $E(aX) = a\, E(X)$
  - $E(X + Y) = E(X) + E(Y)$
  - If X and Y are independent, $E(XY) = E(X)\, E(Y)$
- Variance
  - $V(aX + b) = a^2\, V(X)$
  - If X and Y are independent, $V(X + Y) = V(X) + V(Y)$
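These properties can be spot-checked by simulation; a minimal numpy sketch with independent draws (the distribution choices are arbitrary):

```python
# Spot-check E(X+Y) = E(X)+E(Y), V(aX+b) = a^2 V(X), and E(XY) = E(X)E(Y).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(1.0, 2.0, size=1_000_000)    # X ~ N(1, 4), arbitrary choice
y = rng.exponential(3.0, size=1_000_000)    # Y ~ Exp(mean 3), independent of X

print((x + y).mean(), x.mean() + y.mean())  # both ~ 4.0
print(np.var(5 * x + 7), 25 * np.var(x))    # both ~ 100
print((x * y).mean(), x.mean() * y.mean())  # both ~ 3.0 (uses independence)
```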

Moments of Common Distributions

- Uniform $X \sim U[1, \ldots, N]$
  - Mean $(1 + N)/2$; variance $(N^2 - 1)/12$
- Binomial $X \sim \text{Bin}(n, p)$
  - Mean $np$; variance $np(1 - p)$
- Normal $X \sim N(\mu, \sigma^2)$
  - Mean $\mu$; variance $\sigma^2$
- Beta $X \sim \text{Beta}(\alpha, \beta)$
  - Mean $\alpha / (\alpha + \beta)$; variance $\alpha\beta \,/\, \left( (\alpha + \beta)^2 (\alpha + \beta + 1) \right)$
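These formulas can be cross-checked against scipy's built-in moments; a brief sketch (the parameter values are illustrative):

```python
# Cross-check the mean/variance formulas against scipy.stats.
from scipy.stats import randint, binom, beta

N, n, p, a, b = 6, 100, 0.5, 2.0, 3.0           # illustrative parameters

print(randint(1, N + 1).mean(), (1 + N) / 2)    # 3.5 and 3.5
print(randint(1, N + 1).var(), (N**2 - 1) / 12) # ~ 2.9167 both
print(binom(n, p).var(), n * p * (1 - p))       # 25.0 and 25.0
print(beta(a, b).mean(), a / (a + b))           # 0.4 and 0.4
```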

Probability of Events

- X denotes an event that could possibly happen
  - E.g. X = "you will fail in this course"
- P(X) denotes the likelihood that X happens, or X = true
  - What's the probability that you will fail in this course?
- $\Omega$ denotes the entire event set
  - $\Omega = \{X, \neg X\}$

The Axioms of Probabilities

- $0 \leq P(X) \leq 1$
- $P(\Omega) = 1$
- $P(X_1 \cup X_2 \cup \ldots) = \sum_i P(X_i)$, where the $X_i$ are disjoint events
- Useful rules
  - $P(\neg X) = 1 - P(X)$
  - $P(X_1 \cup X_2) = P(X_1) + P(X_2) - P(X_1 \cap X_2)$

Interpreting the Axioms

[Figure: Venn diagram of two events X1 and X2]
