
Basic Rules of Probability and Combinatorial Principles

Basic Set Operations
A ∪ B = {x ∣ x ∈ A or x ∈ B or both} ; A ∩ B = {x ∣ x ∈ A and x ∈ B}

A′ or A^c = complement of A = {x ∣ x ∉ A};   (A ∪ B)^c = A^c ∩ B^c;   (A ∩ B)^c = A^c ∪ B^c   (De Morgan's laws)

A ⊂ B means that B contains all the sample points in event A; A is called a subevent of B.
If A ⊂ B , then A ∩ B = A and A ∪ B = B

Basic Probability
For a sample space S, Pr[S] = 1 and Pr[∅] = 0, where ∅ is the null or empty set.

For an event A in the sample space S, 0 ≤ Pr[A] ≤ 1.

For any event A, Pr[A] = 1 − Pr[A^c].
 
If A and B are two mutually exclusive events (no sample points in common, A ∩ B = ∅), then Pr[A ∩ B] = 0.

General formulas:

Pr[A ∪ B] = Pr[A] + Pr[B] − Pr[A ∩ B]

Pr[A ∪ B ∪ C ] = Pr[A] + Pr[B] + Pr[C ] − Pr[A ∩ B] − Pr[A ∩ C ] − Pr[B ∩ C ] + Pr[A ∩ B ∩ C ]
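For example, if Pr[A] = 0.5, Pr[B] = 0.4, and Pr[A ∩ B] = 0.2, then Pr[A ∪ B] = 0.5 + 0.4 − 0.2 = 0.7.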

If A ⊂ B, then Pr[A] ≤ Pr[B].

Pr[A] = Pr[A ∩ B] + Pr[A ∩ B^c]
Conditional Probability 
For two events A and B, the conditional probability of A given that B has occurred is:

Pr[A ∣ B] = Pr[A ∩ B] / Pr[B]

If A ⊂ B, then Pr[A ∣ B] = Pr[A] / Pr[B] and Pr[B ∣ A] = 1.

Pr[A^c ∣ B] = 1 − Pr[A ∣ B], and if A_1 ∩ A_2 = ∅ then Pr[A_1 ∪ A_2 ∣ B] = Pr[A_1 ∣ B] + Pr[A_2 ∣ B];

i.e. Pr[ · ∣ B] is a probability on S.

Bayes' Theorem
Let A_1, A_2, …, A_n be a collection of n mutually exclusive and exhaustive events with Pr[A_i] > 0 for i = 1, …, n.
Then, for any other event B ,
Pr[B] = ∑_{i=1}^{n} Pr[B ∣ A_i] Pr[A_i]

Pr[A_k ∣ B] = Pr[A_k ∩ B] / Pr[B] = Pr[B ∣ A_k] Pr[A_k] / ∑_{i=1}^{n} Pr[B ∣ A_i] Pr[A_i],   k = 1, …, n
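A minimal Python sketch of these two formulas; the priors and likelihoods below are arbitrary illustrative numbers, not from the text:

```python
# Hypothetical example: three mutually exclusive, exhaustive events A1, A2, A3
# with prior probabilities Pr[Ai], and an event B with known Pr[B | Ai].
prior = [0.5, 0.3, 0.2]            # Pr[A1], Pr[A2], Pr[A3]  (sum to 1)
likelihood = [0.1, 0.4, 0.8]       # Pr[B | A1], Pr[B | A2], Pr[B | A3]

# Law of total probability: Pr[B] = sum_i Pr[B | Ai] Pr[Ai]
pr_b = sum(l * p for l, p in zip(likelihood, prior))

# Bayes' theorem: Pr[Ak | B] = Pr[B | Ak] Pr[Ak] / Pr[B]
posterior = [l * p / pr_b for l, p in zip(likelihood, prior)]

print(pr_b)        # 0.33
print(posterior)   # posteriors sum to 1
```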

Independence
Two events A and B are independent if and only if Pr[B ∣ A] = Pr[B],

if and only if Pr[A ∣ B] = Pr[A],

if and only if Pr[A ∩ B] = Pr[A] ⋅ Pr[B].


Combinatorial Principles

Fundamental Principle of Counting: (also known as the multiplication rule for counting) If a task
can be performed in n1 ways, and for each of these a second task can be performed in n2 ways, and
for each of the latter a third task can be performed in n3 ways, ... , and for each of the latter a kth task
can be performed in nk ways, then the entire sequence of k tasks can be performed in

n1 • n2 • n3 • ... • nk ways.

Permutations
An ordered sequence of k objects taken from a set of n distinct objects is called a permutation of the objects of size k. nP_k denotes the number of size-k permutations that can be constructed from the n objects.

nP_k = n! / (n − k)!

Combinations
An unordered subset of k objects taken from a set of n distinct objects is called a combination. C(n, k) (read "n choose k") denotes the number of combinations of size k that can be formed from n objects.

C(n, k) = n! / [(n − k)! k!]

C(n, n) = 1,   C(n, 0) = 1,   C(n, 1) = C(n, n−1) = n,   C(n, k) = C(n, n−k)
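Both counts are available in Python's standard library (math.perm and math.comb, Python 3.8+); a quick sketch:

```python
import math

n, k = 10, 3
print(math.perm(n, k))   # nP_k = 10!/7! = 720
print(math.comb(n, k))   # C(10, 3) = 120

# Symmetry identity: C(n, k) == C(n, n - k)
assert math.comb(n, k) == math.comb(n, n - k)
# Relation: nP_k = C(n, k) * k!  (each k-subset can be ordered k! ways)
assert math.perm(n, k) == math.comb(n, k) * math.factorial(k)
```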
Random Variables and Probability Distributions, Expectation and Other Distributional Parameters
Random Variables, Probability Distributions and Independence
Given an experiment with sample space S, a random variable X is any rule that associates a number with each outcome in S.
The cumulative distribution function (cdf) of a random variable, written F(x), is defined for every number x by: F(x) = Pr[X ≤ x]

Two random variables X and Y are independent if Pr[X ∈ A, Y ∈ B] = Pr[X ∈ A] Pr[Y ∈ B] for every A and B.
Discrete Random Variables 
A set is described as discrete if its elements can be listed in sequence or if it consists of a finite number of elements.
The probability distribution or probability mass function (pmf) of a discrete random variable, written p(x), is defined for every number x by: p(x) = Pr[X = x]
The probability mass function p(x) must satisfy: 0 ≤ p(x) ≤ 1 and ∑_x p(x) = 1.

The cumulative distribution function (cdf) of a discrete random variable with pmf p(x), written F(x), is defined by:

F(x) = ∑_{k ≤ x} Pr[X = k]   for every number x
Two random discrete variables X and Y are independent if Pr [ X=x, Y=y ] = Pr [ X=x ] Pr [ Y=y ] for every x and y.

Continuous  Random Variables
A random variable X is continuous if its set of possible values is an entire interval of numbers.
The probability distribution or probability density function (pdf) of a continuous random variable, written f(x), is the function satisfying:

Pr[a ≤ X ≤ b] = ∫_a^b f(x) dx


The probability density function f(x) must satisfy: f(x) ≥ 0 and ∫_{−∞}^{∞} f(x) dx = 1.

The cumulative distribution function of a continuous random variable, written F(x), is: F(x) = Pr[X ≤ x] = ∫_{−∞}^{x} f(y) dy

We have Pr[a ≤ X ≤ b] = F(b) − F(a) and F′(x) = f(x).
Expectation, Variance, Standard deviation, Covariance
The expected value or mean value of a random variable X, written E(X) or μ_X, is:

E(X) = ∑_x x Pr[X = x] if X is discrete and E(X) = ∫_{−∞}^{∞} x f(x) dx if X is continuous.

The variance of a random variable X, written Var(X) or σ_X^2, is: Var(X) = E[(X − E(X))^2] = E[X^2] − [E(X)]^2

Var(X) = ∑_x (x − μ_X)^2 Pr[X = x] if X is discrete and Var(X) = ∫_{−∞}^{∞} (x − μ_X)^2 f(x) dx if X is continuous.


σ_X = standard deviation = √Var(X)

Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y) = Cov(Y, X)

If X, Y are independent, then Cov(X, Y) = 0

Rules of expected value:
For any constant a, E(aX) = aE(X)

For any random variables X, Y, E(X + Y) = E(X) + E(Y)

Rules of variance:
For any constants a and b, Var(aX + b) = a^2 Var(X)

For any random variables X, Y, Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)

If X, Y are independent, then Var(X + Y) = Var(X) + Var(Y)
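A quick Monte Carlo sketch of these rules, assuming arbitrary illustrative distributions (agreement is up to sampling noise):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(2.0, 3.0, size=1_000_000)   # independent samples of X
y = rng.exponential(1.5, size=1_000_000)   # independent samples of Y

a, b = 4.0, 7.0
# Var(aX + b) = a^2 Var(X): the shift b drops out, the scale enters squared
print(np.var(a * x + b), a**2 * np.var(x))
# For independent X, Y: Var(X + Y) = Var(X) + Var(Y)
print(np.var(x + y), np.var(x) + np.var(y))
```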


Expected value of a function of a random variable

E[h(X)] = ∑_x h(x) Pr[X = x] if X is discrete and E[h(X)] = ∫_{−∞}^{∞} h(x) f(x) dx if X is continuous.

Moment generating function

The moment generating function of a random variable X, written M_X(t), is: M_X(t) = E(e^{tX})

• Expectation, variance, and moments via mgf's: M_X(0) = 1, M_X′(0) = E(X), M_X″(0) = E(X^2), M_X‴(0) = E(X^3), etc.; Var(X) = M_X″(0) − [M_X′(0)]^2.
• Mgf of a multiple of a r.v.: If X has mgf M_X(t), and Y = cX with c a constant, then the mgf of Y is M_Y(t) = E(e^{tY}) = E(e^{tcX}) = M_X(ct).
• Mgf of a sum of independent r.v.'s X and Y: If X and Y are independent, then X + Y has mgf M_{X+Y}(t) = M_X(t) M_Y(t).
(An analogous formula holds for sums of more than two independent r.v.’s.)
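Differentiating an mgf at t = 0 recovers the moments. A minimal sympy sketch using the binomial mgf listed in the Discrete Distributions section below (n = 10 is an arbitrary choice):

```python
import sympy as sp

t, p = sp.symbols('t p')
n = 10
M = (1 - p + p * sp.exp(t))**n       # mgf of Binomial(n=10, p)

EX = sp.diff(M, t).subs(t, 0)        # M'(0)  = E(X)
EX2 = sp.diff(M, t, 2).subs(t, 0)    # M''(0) = E(X^2)
print(sp.simplify(EX))               # 10*p          -> n p
print(sp.simplify(EX2 - EX**2))      # 10*p*(1 - p)  -> n p (1 - p)
```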

Other parameters of the distribution


For 0 ≤ p ≤ 1, the (100p)th percentile of a continuous random variable, written η(p), satisfies:

p = F[η(p)] = ∫_{−∞}^{η(p)} f(y) dy

The median of a continuous distribution, written μ̃, satisfies 0.5 = F(μ̃); it is the 50th percentile of the distribution.

The mode of a distribution is any point m at which the probability mass function p(x) or the density function f(x) is maximized.

The skewness of a distribution is: E[(X − μ)^3] / σ^3
Discrete Distributions

1. Uniform   A distribution on N points, 1, 2, …, N, where N is an integer.

p(x) = 1/N,   E(X) = (N + 1)/2,   Var(X) = (N^2 − 1)/12,   M_X(t) = e^t (e^{Nt} − 1) / [N (e^t − 1)]
2. Binomial   An experiment consists of a fixed number n of trials. Each trial is identical and results in one of two possible outcomes, denoted success (S) or failure (F). The trials are independent and the probability of a success is denoted p. The binomial random variable X is equal to the number of successes in n trials.

p(x) = C(n, x) p^x (1 − p)^{n−x},   x = 0, 1, 2, …, n,   E(X) = np,   Var(X) = np(1 − p),   M_X(t) = (1 − p + p e^t)^n
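A sketch checking these formulas against scipy.stats (n = 12, p = 0.3 are arbitrary):

```python
from scipy.stats import binom

n, p = 12, 0.3
print(binom.pmf(4, n, p))                # Pr[X = 4] = C(12,4) p^4 (1-p)^8
print(binom.mean(n, p), n * p)           # both 3.6
print(binom.var(n, p), n * p * (1 - p))  # both 2.52
```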

3. Hypergeometric   Closely related to the binomial, but it is known that the population (of size N) contains M successes. X is the number of successes in a random sample of size n drawn from the population.

p(x) = C(M, x) C(N − M, n − x) / C(N, n),   E(X) = n(M/N),   Var(X) = n (M/N)(1 − M/N)(N − n)/(N − 1)

4. Pascal   Related to the binomial. An experiment continues until a total of r successes have been observed (r is a positive integer). X is equal to the number of trials until the rth success is obtained.

p(x) = C(x − 1, r − 1) p^r (1 − p)^{x−r},   x = r, r+1, r+2, …,   E(X) = r/p,   Var(X) = r(1 − p)/p^2,   M_X(t) = [ p e^t / (1 − (1 − p) e^t) ]^r

5. Geometric   Special case of the Pascal (negative binomial) distribution with r = 1: the experiment is performed until the first success occurs. X is equal to the number of trials until the first success.

p(x) = p (1 − p)^{x−1},   x = 1, 2, …,   E(X) = 1/p,   Var(X) = (1 − p)/p^2,   M_X(t) = p e^t / (1 − (1 − p) e^t)

       
6. Poisson   Not based on any simple experiment. Most often used to count the number of events of a certain type that occur in a period of time.

p(x) = e^{−λ} λ^x / x!,   x = 0, 1, 2, …,   E(X) = λ,   Var(X) = λ,   M_X(t) = e^{λ(e^t − 1)}

for some λ > 0, where λ is frequently some rate per unit of time.
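Similar checks work for the hypergeometric, geometric, and Poisson models; note that scipy's hypergeom takes (population size, successes in population, sample size) in that order:

```python
from scipy.stats import hypergeom, geom, poisson

# Hypergeometric: population N = 50 with M = 20 successes, sample n = 10
N, M, n = 50, 20, 10
print(hypergeom.mean(N, M, n), n * M / N)          # both 4.0
print(hypergeom.var(N, M, n),
      n * (M/N) * (1 - M/N) * (N - n) / (N - 1))   # both ~1.959

# Geometric (trials until first success), p = 0.25
print(geom.mean(0.25), 1 / 0.25)                   # both 4.0

# Poisson with rate lambda = 3
print(poisson.mean(3), poisson.var(3))             # both 3.0
```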


Continuous Distributions

1. Uniform   X is distributed uniformly over (a, b).

f(x) = 1/(b − a),   a < x < b,   E(X) = (a + b)/2,   Var(X) = (b − a)^2/12,   M_X(t) = (e^{bt} − e^{at}) / [(b − a) t]

2. Normal   Distribution with mean μ and variance σ^2.

f(x) = (1/(σ√(2π))) e^{−(x−μ)^2/(2σ^2)},   E(X) = μ,   Var(X) = σ^2,   M_X(t) = exp[μt + σ^2 t^2 / 2]

When μ = 0 and σ = 1, the normal distribution is called a standard normal distribution and the standard normal random variable is denoted Z.

f_Z(x) = (1/√(2π)) e^{−x^2/2}

If X has a normal distribution, it can be standardized by: Z = (X − μ)/σ
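A short sketch of standardization with scipy.stats (μ = 100, σ = 15 are arbitrary):

```python
from scipy.stats import norm

mu, sigma = 100.0, 15.0
x = 120.0
z = (x - mu) / sigma                      # standardize: Z = (X - mu) / sigma

# Pr[X <= 120] equals Pr[Z <= 1.3333...]
print(norm.cdf(x, loc=mu, scale=sigma))   # ~0.9088
print(norm.cdf(z))                        # same value via the standard normal
```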



3. Gamma   For α > 0, the gamma function Γ(α) is defined by: Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx.
Γ(α + 1) = α Γ(α) and Γ(n) = (n − 1)! for n a positive integer.
A continuous random variable X is said to have a gamma distribution with parameters α, β > 0 if:

f(x) = (β^α / Γ(α)) x^{α−1} e^{−βx},   x > 0,   E(X) = α/β,   Var(X) = α/β^2,   M_X(t) = (β/(β − t))^α   for t < β

4. Exponential   The exponential distribution is a special case of the gamma distribution with α = 1 and β = λ.

f(x) = λ e^{−λx},   x > 0,   E(X) = 1/λ,   Var(X) = 1/λ^2,   M_X(t) = λ/(λ − t)   for t < λ
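A sketch of the exponential moments and the gamma connection; note that scipy parameterizes both by scale = 1/β, the reciprocal of the rate used here:

```python
from scipy.stats import expon, gamma

lam = 2.0
# scipy's expon uses scale = 1/lambda
print(expon.mean(scale=1/lam), 1/lam)      # both 0.5
print(expon.var(scale=1/lam), 1/lam**2)    # both 0.25

# Exponential(lambda) == Gamma(alpha=1, beta=lambda)
print(gamma.mean(1, scale=1/lam))          # also 0.5 = alpha/beta
```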
Joint, Marginal and Conditional Distributions

 
Let X and Y be two discrete random variables defined on the sample space S of an experiment. The joint probability mass function p_{X,Y}(x, y) is defined for each (x, y) pair by:

p(x, y) = p_{X,Y}(x, y) = Pr[X = x, Y = y],

and p(x, y) must satisfy: 0 ≤ p(x, y) ≤ 1 and ∑_x ∑_y p(x, y) = 1.

     
Let X and Y be two continuous random variables. Then the joint probability density function f(x, y) = f_{X,Y}(x, y) must satisfy: f(x, y) ≥ 0 and ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1.

Marginal Distribution
The marginal probability mass functions for the discrete random variables X and Y, denoted p_X(x) and p_Y(y), are given by:

p_X(x) = ∑_y p(x, y)   and   p_Y(y) = ∑_x p(x, y)

In the continuous case, they are given by:

f_X(x) = ∫_{−∞}^{∞} f(x, y) dy   and   f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx
Expected Value
   
Let X and Y be two jointly distributed random variables. The expected value of a function h of (X, Y), denoted E[h(X, Y)], is:

E[h(X, Y)] = ∑_x ∑_y h(x, y) p(x, y)   if X and Y are discrete

E[h(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x, y) f(x, y) dx dy   if X and Y are continuous

Moment Generating Function
The moment generating function for two jointly distributed random variables X and Y, denoted M_{X,Y}(t_1, t_2), is:

M_{X,Y}(t_1, t_2) = E[e^{t_1 X + t_2 Y}]

Variance , Covariance , Correlation


The covariance between two random variables X and Y is:
Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y)

Cov(X, Y) = ∑_x ∑_y x y p(x, y) − μ_X μ_Y in the discrete case and Cov(X, Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x y f(x, y) dx dy − μ_X μ_Y in the continuous case.

If X and Y are independent, then E[X ⋅ Y] = E[X] E[Y] and Cov(X, Y) = 0.

V ar[X + Y ] = V ar[X] + V ar[Y ] + 2 C ov[X, Y ]

The correlation coefficient of X and Y, denoted by corr(X, Y) or ρ_{X,Y}, is defined by:

ρ_{X,Y} = Cov(X, Y) / (σ_X σ_Y)

If a and c are either both positive or both negative, Corr(aX + b, cY + d) = Corr(X, Y).

For any two random variables X and Y, −1 ≤ Corr(X, Y) ≤ 1.
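A numpy sketch on simulated data (the linear construction of Y from X is an arbitrary way to induce correlation):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(size=100_000)   # Y depends on X, so Cov > 0

cov = np.cov(x, y)[0, 1]                 # sample Cov(X, Y)
rho = np.corrcoef(x, y)[0, 1]            # sample Corr(X, Y), in [-1, 1]
print(cov, rho)

# Corr is invariant under positive affine changes: Corr(aX+b, cY+d) unchanged
print(np.corrcoef(3 * x + 1, 5 * y - 2)[0, 1])   # ~ same rho
```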
Independence
Two random variables X and Y  are said to be independent  if and only if one of the following holds:
(i) for every pair of  x and  y  values:

p(x, y) = p_X(x) p_Y(y) if X and Y are discrete, and f(x, y) = f_X(x) f_Y(y) if X and Y are continuous.

(ii) for every pair of functions g and h:

E[g(X) h(Y)] = E[g(X)] E[h(Y)]
(iii) for every pair of  t1  and  t2  values:

M_{X,Y}(t_1, t_2) = M_X(t_1) M_Y(t_2)

Conditional Distributions in the continuous case


     
Let X and Y be two random variables with joint pdf f(x, y) and marginal X pdf f_X(x). For any x value for which f_X(x) > 0, the conditional probability density function of Y given that X = x is:

f_{Y∣X=x}(y) = f(x, y) / f_X(x)

Conditional expectation and variance


For two jointly distributed random variables X and Y :

 
E[X ∣ Y = y] = ∫_{−∞}^{∞} x f_{X∣Y=y}(x) dx = g(y)

E[X ∣ Y] = g(Y);   Var[X ∣ Y] = E[X^2 ∣ Y] − (E[X ∣ Y])^2

E[X] = E[ E[X ∣ Y] ]   and   Var[X] = E[ Var[X ∣ Y] ] + Var[ E[X ∣ Y] ]
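A simulation sketch of the last two identities, assuming an arbitrary hierarchical model (Y Poisson, then X ∣ Y = y normal with mean y):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
y = rng.poisson(4.0, size=n)            # Y, with E[Y] = Var[Y] = 4
x = rng.normal(loc=y, scale=1.0)        # X | Y = y  ~  Normal(y, 1)

# E[X] = E[ E[X | Y] ] = E[Y] = 4
print(x.mean())
# Var[X] = E[Var[X|Y]] + Var[E[X|Y]] = 1 + Var(Y) = 5
print(x.var())
```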
