

## Lecture 3: Review of basic probability theory

Dr. Itamar Arel
College of Engineering
Electrical Engineering and Computer Science Department
The University of Tennessee
Fall 2011
August 25, 2011
ECE-517: Reinforcement Learning in
Artificial Intelligence
ECE-517 - Reinforcement Learning in AI
Outline
Probability theory fundamentals
Random variables
Basic definitions
The collection or set of "all possible" distinct outcomes of
an experiment is called the sample space of the
experiment/trial.
Flipping a coin: {H, T}
Rolling a die: {1, 2, 3, 4, 5, 6}
Outcomes: the elements of the sample space
Event: a subset of the sample space, i.e. a set of possible outcomes of an experiment/trial
[Figure: an experiment and its sample space]
More definitions
Independence
Two experiments are independent if the outcome
of either one does not depend on the outcome of
the other
Deterministic: the outcome of a trial is predictable with certainty (100%)
Randomness: the absence of any pattern
A sample space is called discrete if it is a finite or a countably infinite set; otherwise it is called continuous
Probability can be viewed as the likelihood of an
event occurring
Fundamentals
Let S denote the sample space and $\{A_i\}$ the set of all possible outcomes, with probabilities $P(A_i)$, respectively
$P(A_i) \ge 0$ for all $i$
$\sum_i P(A_i) = 1$
For example, a probabilistic model might represent the length of a packet sent over a network
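Two events A and B are called mutually exclusive, or disjoint, if they have no common outcomes
In general, P(A + B) = P(A) + P(B) - P(AB), where A + B denotes the union of A and B and AB their intersection
If A and B are mutually exclusive, then P(AB) = 0 and so P(A + B) = P(A) + P(B)
[Figure: Venn diagram of events A and B with overlap AB]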
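The addition rule can be checked on a small discrete sample space. The sketch below assumes a fair die with equally likely outcomes; the events `A`, `B`, and `C` are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

# Sample space of a fair die; equally likely outcomes are an assumption.
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}  # illustrative event: "even"
B = {4, 5, 6}  # illustrative event: "greater than 3"
C = {1, 3}     # illustrative event, disjoint from A

# General addition rule: P(A + B) = P(A) + P(B) - P(AB)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)

# Mutually exclusive case: P(A + C) = P(A) + P(C)
print(prob(A | C), prob(A) + prob(C))
```

Exact fractions are used so that the two sides can be compared without floating-point rounding.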
Conditional Probability
Conditional probability
P(A|B) = P(AB)/P(B), defined when P(B) > 0
A and B are defined as independent events if and only if
P(AB) = P(A)P(B)
Equivalently, when P(B) > 0, P(A|B) = P(A)
Bayes Rule
Consider two events, A and B, where P(AB) = P(A|B)P(B) and P(BA) = P(B|A)P(A)
But P(AB) = P(BA), so P(A|B)P(B) = P(B|A)P(A), and since P(A|B) = P(AB)/P(B),
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$
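A minimal numeric sketch of Bayes' rule follows. The function name `posterior` and all numbers are hypothetical. P(B) is expanded with the law of total probability, which the slide does not state, so treat that step as an assumption of the example.

```python
def posterior(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A|B) = P(B|A) P(A) / P(B).

    P(B) is obtained from the law of total probability:
    P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A))   (assumption of this sketch)
    """
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical numbers, chosen only for illustration.
p_a = 0.01
p_b_given_a = 0.90
p_b_given_not_a = 0.05

p_a_given_b = posterior(p_b_given_a, p_a, p_b_given_not_a)
print(p_a_given_b)
```

Even though B is much more likely under A than under its complement, the posterior stays well below one half here because the prior P(A) is small.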
Discrete Random Variables
A random variable (r.v.) is a function that assigns a real number
to each outcome in the sample space of a random experiment
For a discrete r.v. X, the probability mass function (PMF) gives the probability that X will take on a particular value in its range. We denote this by $P_X$, i.e.
$P_X(x) = P(X = x)$
The expected value of a discrete r.v. X is defined by
$E[X] = \sum_x x\,P_X(x)$
The variance of X is defined as
$E[(X - E[X])^2] = E[X^2] - (E[X])^2$
Question: in what scenario will the variance be zero?
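The two definitions can be evaluated directly from a PMF. The sketch below assumes a fair six-sided die as the r.v.; the helper names `expectation` and `variance` are illustrative.

```python
from fractions import Fraction

# PMF of a fair die (assumed example): P_X(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * P_X(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    mean = expectation(pmf)
    return expectation(pmf, lambda x: x * x) - mean ** 2

mean = expectation(pmf)
var = variance(pmf)
# The centered definition E[(X - E[X])^2] must give the same value.
var_centered = expectation(pmf, lambda x: (x - mean) ** 2)
print(mean, var, var_centered)
```

A PMF that puts all its mass on a single value, such as `{3: Fraction(1)}`, gives a variance of zero.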
Bernoulli and Geometric Random Variables with parameter p
X is a Bernoulli r.v. with parameter p if it can take on values 1 (success) and 0 (failure) with
P(X=1) = p
P(X=0) = 1-p
Example: Packet arrivals may be modeled as either correct (1) or erroneous (0)
Given a sequence of independent Bernoulli r.v.s, let T be the number of trials up to and including the first success. Then T will have a geometric distribution; its PMF is given by
$P(T = n) = (1-p)^{n-1}\,p$, for n = 1, 2, ...
$E[T] = 1/p$
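As a quick sanity check, the geometric PMF can be summed numerically. This is a sketch under the assumption p = 0.3; the infinite sums are truncated at 500 terms, where the neglected tail is negligible.

```python
def geometric_pmf(n, p):
    """P(T = n) = (1 - p)^(n - 1) * p for n = 1, 2, ..."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (1.0 - p) ** (n - 1) * p

p = 0.3                 # assumed success probability
ns = range(1, 501)      # truncation of the infinite support

total = sum(geometric_pmf(n, p) for n in ns)      # should be close to 1
mean = sum(n * geometric_pmf(n, p) for n in ns)   # should be close to 1/p
print(total, mean, 1.0 / p)
```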
Memoryless property: the number of time steps n that have passed without a success has no influence on future events
The memoryless property makes the geometric distribution very useful in various analyses
[Figure: geometric distribution with N = 16, p = 0.3]
Binomial Random Variable with parameters p and n
Let S denote the number of successes out of n independent Bernoulli r.v.s. The PMF is given by
$$P(S = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
for k = 0, 1, ..., n.
The expected number of successes is given by
$E[S] = np$
Example: if packets arrive correctly at a node in a network with probability p (independently), then the number of correct arrivals out of n is a Binomial r.v.
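The binomial PMF and its mean can be verified numerically. The sketch assumes n = 10 and p = 0.3 purely for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(S = k) = C(n, k) * p^k * (1 - p)^(n - k) for k = 0..n."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

n, p = 10, 0.3   # assumed parameters

total = sum(binomial_pmf(k, n, p) for k in range(n + 1))     # should be 1
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))  # should be n * p
print(total, mean, n * p)
```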
Mean of a Binomial r.v.
Note that:
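One standard route to the mean is sketched below. It assumes the identity $k\binom{n}{k} = n\binom{n-1}{k-1}$ and the substitution $j = k - 1$, which may differ from the steps used in the lecture.

```latex
\begin{align*}
E[S] &= \sum_{k=0}^{n} k \binom{n}{k} p^{k} (1-p)^{n-k} \\
     &= \sum_{k=1}^{n} n \binom{n-1}{k-1} p^{k} (1-p)^{n-k} \\
     &= np \sum_{j=0}^{n-1} \binom{n-1}{j} p^{j} (1-p)^{(n-1)-j} \\
     &= np \,\bigl(p + (1-p)\bigr)^{n-1} = np .
\end{align*}
```

The last step uses the binomial theorem, so the remaining sum is the total probability of a Binomial r.v. with parameters p and n - 1.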
Examples (from ECE-453)
Consider the following network. Packets transmitted from Router A to Router B have a packet error rate (PER) of $p_{AB}$, while packets transmitted from Router B to Router C have a PER of $p_{BC}$. The packet error rates are assumed to be independent.
[Figure: A → B → C, with PER $p_{AB}$ on the link from A to B and PER $p_{BC}$ on the link from B to C]
If all traffic from Router A to Router C traverses Router B, what is the probability
that all N packets transmitted from Router A to Router C are received correctly?
Given that N packets were transmitted from Router A to Router B, write an
expression for the probability that at least m of those N packets are received
correctly.
Assuming Router A has transmitted N packets to Router C (via Router B), what is
the probability that exactly m packets (where m<N) are received correctly at Router
C?
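One possible way to work these three questions is sketched below. It assumes that errors are independent from packet to packet as well as between links, and that a packet counts as correct at Router C only if it survives both hops. The function names are hypothetical.

```python
from math import comb

def p_all_correct(N, p_ab, p_bc):
    """All N packets sent from A arrive correctly at C (via B)."""
    q = (1.0 - p_ab) * (1.0 - p_bc)   # per-packet end-to-end success
    return q ** N

def p_at_least_m_on_ab(N, m, p_ab):
    """At least m of N packets sent from A arrive correctly at B."""
    s = 1.0 - p_ab                    # per-packet success on link A-B
    return sum(comb(N, k) * s ** k * p_ab ** (N - k) for k in range(m, N + 1))

def p_exactly_m_end_to_end(N, m, p_ab, p_bc):
    """Exactly m of N packets sent from A arrive correctly at C."""
    q = (1.0 - p_ab) * (1.0 - p_bc)
    return comb(N, m) * q ** m * (1.0 - q) ** (N - m)

# Illustrative values only.
print(p_all_correct(2, 0.1, 0.2))
print(p_at_least_m_on_ab(5, 4, 0.1))
print(p_exactly_m_end_to_end(5, 3, 0.1, 0.2))
```

Under these assumptions the number of correct packets is a Binomial r.v. in each case; only the per-packet success probability changes.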
CDF and PDF
The Cumulative Distribution Function (CDF) of a r.v. X, $F_X(x)$, is defined as the probability of the event $\{X \le x\}$
$$F_X(x) = P[X \le x], \quad -\infty < x < \infty$$
Related axioms are:
$$0 \le F_X(x) \le 1, \quad \lim_{x \to \infty} F_X(x) = 1, \quad \lim_{x \to -\infty} F_X(x) = 0$$
$$\text{if } a \ge b \text{ then } F_X(a) \ge F_X(b)$$
The probability density function (pdf) of a r.v. X, $f_X(x)$, is defined as the derivative of the CDF
$$f_X(x) = \frac{dF_X(x)}{dx}$$
For a discrete r.v., the PMF is $P_X(k) = \Pr\{X = k\}$
Exponential Distribution
Continuous random variable
Continuous-time analogue of the geometric distribution (the memoryless property holds)
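For reference, a sketch of the standard definitions, assuming a rate parameter $\lambda > 0$ (the symbol is an assumption, not taken from the slide):

```latex
\begin{align*}
f_X(x) &= \lambda e^{-\lambda x}, \quad x \ge 0 \\
F_X(x) &= 1 - e^{-\lambda x}, \quad x \ge 0 \\
E[X]   &= \frac{1}{\lambda} \\
P(X > s + t \mid X > s) &= \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X > t)
\end{align*}
```

The last line is the memoryless property: having already waited s time units does not change the distribution of the remaining wait.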
Minimum of Independent Exponential rvs
Assume $X_1, X_2, \ldots, X_n$ are independent exponential r.v.s
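A sketch of the standard result, assuming each $X_i$ has rate $\lambda_i$ (these symbols are assumptions, not taken from the slide):

```latex
\begin{align*}
P\bigl(\min(X_1, \ldots, X_n) > t\bigr)
  &= P(X_1 > t, \ldots, X_n > t) \\
  &= \prod_{i=1}^{n} P(X_i > t) \quad \text{(independence)} \\
  &= \prod_{i=1}^{n} e^{-\lambda_i t}
   = e^{-\left(\sum_{i=1}^{n} \lambda_i\right) t}, \quad t \ge 0
\end{align*}
```

So the minimum is itself exponential, with rate $\lambda_1 + \lambda_2 + \cdots + \lambda_n$.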
Memoryless Property
True for the geometric and exponential distributions:
The coin does not remember that it came up tails $l$ times
Root cause of Markov property (discussed later)
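The memoryless property of the geometric distribution can be checked numerically. The sketch below assumes p = 0.3 and illustrative values l = 5 and m = 3, and computes tail probabilities by summing the PMF rather than using a closed form.

```python
def tail(n, p, terms=500):
    """P(T > n) for a geometric r.v., from a truncated sum of the PMF."""
    return sum((1.0 - p) ** (k - 1) * p for k in range(n + 1, n + 1 + terms))

p = 0.3      # assumed success probability
l, m = 5, 3  # l failures already observed; m further trials considered

# P(T > l + m | T > l) = P(T > l + m) / P(T > l)
conditional = tail(l + m, p) / tail(l, p)
unconditional = tail(m, p)   # P(T > m)
print(conditional, unconditional)
```

Both values agree with (1 - p)^m, so the l tails already seen have no effect on how much longer the wait will be.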
Useful Results
The following are some results that are useful for manipulating many of the equations that may arise when dealing with discrete-time probabilistic models
$$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}, \quad x \ne 1$$
when |x| < 1,
$$\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$$
Differentiating both sides of the previous equation, and then multiplying by x, yields another useful expression:
$$\sum_{k=0}^{\infty} k x^k = \frac{x}{(1 - x)^2}$$
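These identities are easy to check numerically. The sketch below assumes x = 0.5 and n = 10, and truncates the infinite sums at 200 terms, where the neglected tail is negligible for this x.

```python
x = 0.5      # assumed value with |x| < 1
n = 10       # assumed upper limit for the finite sum
terms = 200  # truncation point for the infinite sums

finite_sum = sum(x ** k for k in range(n + 1))
finite_closed = (1.0 - x ** (n + 1)) / (1.0 - x)

geometric_sum = sum(x ** k for k in range(terms))
geometric_closed = 1.0 / (1.0 - x)

weighted_sum = sum(k * x ** k for k in range(terms))
weighted_closed = x / (1.0 - x) ** 2

print(finite_sum, finite_closed)
print(geometric_sum, geometric_closed)
print(weighted_sum, weighted_closed)
```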