
Mark’s Formula Sheet for Exam P

Discrete Distributions

• Uniform, U(m)
  – PMF: f(x) = 1/m, for x = 1, 2, . . . , m
  – µ = (m + 1)/2 and σ² = (m² − 1)/12
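A quick numerical check of these two moments (an addition to the sheet, not part of the original; m = 10 is arbitrary):

    import numpy as np

    m = 10
    x = np.arange(1, m + 1)           # support {1, ..., m}
    mu = x.mean()                     # E[X]
    var = ((x - mu) ** 2).mean()      # E[(X - mu)^2]
    print(mu, (m + 1) / 2)            # both 5.5
    print(var, (m**2 - 1) / 12)       # both 8.25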
• Hypergeometric
  – PMF: f(x) = C(N_1, x) C(N_2, n − x) / C(N, n), where N = N_1 + N_2
  – x is the number of items from the sample of n items that are from group/type 1.
  – µ = n(N_1/N) and σ² = n(N_1/N)(N_2/N)((N − n)/(N − 1))
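A sanity check with scipy.stats.hypergeom (not part of the original sheet); note scipy's argument order is (total items, type-1 items, sample size), and the values below are illustrative:

    from scipy.stats import hypergeom

    N1, N2, n = 7, 13, 5
    N = N1 + N2
    rv = hypergeom(N, N1, n)          # scipy order: total, type-1 count, sample size
    print(rv.mean(), n * N1 / N)                                  # both 1.75
    print(rv.var(), n * (N1/N) * (N2/N) * (N - n) / (N - 1))      # both match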
• Binomial, b(n, p)
  – PMF: f(x) = C(n, x) p^x (1 − p)^(n−x), for x = 0, 1, . . . , n
  – x is the number of successes in n trials.
  – µ = np and σ² = np(1 − p) = npq
  – MGF: M(t) = [(1 − p) + pe^t]^n = (q + pe^t)^n
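The MGF can be spot-checked by computing E[e^(tX)] directly from the PMF; a small sketch (the values of n, p, t are illustrative):

    import numpy as np
    from scipy.stats import binom

    n, p, t = 10, 0.3, 0.2
    x = np.arange(n + 1)
    mgf = np.sum(np.exp(t * x) * binom.pmf(x, n, p))   # E[e^{tX}] by direct sum
    print(mgf, ((1 - p) + p * np.exp(t)) ** n)         # both match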

• Negative Binomial, nb(r, p)
  – PMF: f(x) = C(x − 1, r − 1) p^r (1 − p)^(x−r), for x = r, r + 1, r + 2, . . .
  – x is the number of trials necessary to see r successes.
  – µ = r(1/p) = r/p and σ² = r(1 − p)/p² = rq/p²
  – MGF: M(t) = (pe^t)^r / [1 − (1 − p)e^t]^r = (pe^t / (1 − qe^t))^r
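One caution when checking this numerically: scipy's negative binomial counts failures rather than total trials, so the mean needs a shift of r. A sketch with illustrative r and p:

    from scipy.stats import nbinom

    r, p = 3, 0.4
    rv = nbinom(r, p)                    # counts failures before the r-th success
    print(rv.mean() + r, r / p)          # trials = failures + r; both 7.5
    print(rv.var(), r * (1 - p) / p**2)  # variance is shift-invariant; both 11.25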
• Geometric, geo(p)
  – PMF: f(x) = (1 − p)^(x−1) p, for x = 1, 2, . . .
  – x is the number of trials necessary to see 1 success.
  – CDF: P(X ≤ k) = 1 − (1 − p)^k = 1 − q^k and P(X > k) = (1 − p)^k = q^k
  – µ = 1/p and σ² = (1 − p)/p² = q/p²
  – MGF: M(t) = pe^t / [1 − (1 − p)e^t] = pe^t / (1 − qe^t)
  – The distribution is said to be “memoryless” because P(X > k + j | X > k) = P(X > j).
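A numerical illustration of the memoryless property (scipy's geom lives on {1, 2, . . .}, matching the sheet; k and j are arbitrary):

    from scipy.stats import geom

    p, k, j = 0.3, 4, 2
    lhs = geom.sf(k + j, p) / geom.sf(k, p)   # P(X > k+j | X > k); sf(k) = P(X > k)
    print(lhs, geom.sf(j, p))                 # both equal q^j = 0.49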
• Poisson
  – PMF: f(x) = λ^x e^(−λ) / x!, for x = 0, 1, 2, . . .
  – x is the number of changes in a unit of time or length.
  – λ is the average number of changes in a unit of time or length in a Poisson process.
  – CDF: P(X ≤ x) = e^(−λ) (1 + λ + λ²/2! + · · · + λ^x/x!)
  – µ = σ² = λ
  – MGF: M(t) = e^(λ(e^t − 1))
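The partial-sum form of the CDF can be checked against scipy.stats.poisson.cdf (λ and x below are illustrative):

    import math
    from scipy.stats import poisson

    lam, x = 2.5, 4
    partial = math.exp(-lam) * sum(lam**k / math.factorial(k) for k in range(x + 1))
    print(partial, poisson.cdf(x, lam))   # both ~0.891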

Continuous Distributions
• Uniform, U(a, b)
  – PDF: f(x) = 1/(b − a), for a ≤ x ≤ b
  – CDF: P(X ≤ x) = (x − a)/(b − a), for a ≤ x ≤ b
  – µ = (a + b)/2 and σ² = (b − a)²/12
  – MGF: M(t) = (e^(tb) − e^(ta)) / (t(b − a)), for t ≠ 0, and M(0) = 1
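A sketch verifying the MGF formula against the defining integral E[e^(tX)] (the values of a, b, t are illustrative):

    import numpy as np
    from scipy.integrate import quad

    a, b, t = 1.0, 4.0, 0.5
    formula = (np.exp(t * b) - np.exp(t * a)) / (t * (b - a))
    numeric, _ = quad(lambda x: np.exp(t * x) / (b - a), a, b)   # E[e^{tX}]
    print(formula, numeric)   # both ~3.83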
• Exponential
  – PDF: f(x) = (1/θ) e^(−x/θ), for x ≥ 0
  – x is the waiting time until one change occurs.
  – θ is the average waiting time between changes in a Poisson process. (Its reciprocal, 1/θ, is the rate of the process, sometimes called the “hazard rate.”)
  – CDF: P(X ≤ x) = 1 − e^(−x/θ), for x ≥ 0
  – µ = θ and σ² = θ²
  – MGF: M(t) = 1/(1 − θt)
  – The distribution is said to be “memoryless” because P(X ≥ x_1 + x_2 | X ≥ x_1) = P(X ≥ x_2).
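A numerical check of the memoryless property with scipy.stats.expon, which parametrizes by scale = θ (values illustrative):

    from scipy.stats import expon

    theta, x1, x2 = 2.0, 1.5, 0.7
    rv = expon(scale=theta)
    lhs = rv.sf(x1 + x2) / rv.sf(x1)   # P(X >= x1 + x2 | X >= x1)
    print(lhs, rv.sf(x2))              # both equal e^{-x2/theta}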

• Gamma
  – PDF: f(x) = x^(α−1) e^(−x/θ) / (Γ(α) θ^α) = x^(α−1) e^(−x/θ) / ((α − 1)! θ^α), for x ≥ 0 (the second form requires α to be a positive integer)
  – x is the waiting time until α changes occur.
  – θ is the average waiting time between changes in a Poisson process and α is the number of changes that we are waiting to see.
  – µ = αθ and σ² = αθ²
  – MGF: M(t) = 1/(1 − θt)^α
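A moment check with scipy.stats.gamma, which takes the shape α and scale = θ (values illustrative):

    from scipy.stats import gamma

    alpha, theta = 3.0, 2.0
    rv = gamma(alpha, scale=theta)
    print(rv.mean(), alpha * theta)     # both 6.0
    print(rv.var(), alpha * theta**2)   # both 12.0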
• Chi-square (Gamma with θ = 2 and α = r/2)
  – PDF: f(x) = x^(r/2 − 1) e^(−x/2) / (Γ(r/2) 2^(r/2)), for x ≥ 0
  – µ = r and σ² = 2r
  – MGF: M(t) = 1/(1 − 2t)^(r/2)
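The special-case relationship can be confirmed by comparing the two scipy densities (r and the evaluation points are arbitrary):

    import numpy as np
    from scipy.stats import chi2, gamma

    r = 5
    x = np.array([0.5, 1.0, 3.0, 8.0])
    print(np.allclose(chi2.pdf(x, r), gamma.pdf(x, r / 2, scale=2)))   # True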
• Normal, N(µ, σ²)
  – PDF: f(x) = (1/(σ√(2π))) e^(−(x−µ)²/(2σ²))
  – MGF: M(t) = e^(µt + σ²t²/2)
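A sketch checking the normal MGF at one point against the defining integral (µ, σ, t illustrative):

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import norm

    mu, sigma, t = 1.0, 2.0, 0.3
    numeric, _ = quad(lambda x: np.exp(t * x) * norm.pdf(x, mu, sigma), -50, 50)
    print(numeric, np.exp(mu * t + sigma**2 * t**2 / 2))   # both ~1.616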

Integration Formulas
• ∫ p(x)e^(ax) dx = (1/a) p(x)e^(ax) − (1/a²) p′(x)e^(ax) + (1/a³) p″(x)e^(ax) − · · ·

• ∫_a^∞ x (1/θ) e^(−x/θ) dx = (a + θ) e^(−a/θ)

• ∫_a^∞ x² (1/θ) e^(−x/θ) dx = ((a + θ)² + θ²) e^(−a/θ)
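The second and third formulas (tail moments of the exponential) can be spot-checked numerically; θ and a below are arbitrary:

    import numpy as np
    from scipy.integrate import quad

    theta, a = 2.0, 1.0
    i1, _ = quad(lambda x: x * np.exp(-x / theta) / theta, a, np.inf)
    print(i1, (a + theta) * np.exp(-a / theta))
    i2, _ = quad(lambda x: x**2 * np.exp(-x / theta) / theta, a, np.inf)
    print(i2, ((a + theta)**2 + theta**2) * np.exp(-a / theta))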
Other Useful Facts

• σ² = E[(X − µ)²] = E[X²] − µ² = M″(0) − [M′(0)]²

• Cov(X, Y) = E[(X − µ_x)(Y − µ_y)] = E[XY] − µ_x µ_y

• Cov(X, Y) = σ_xy = ρσ_x σ_y and ρ = σ_xy / (σ_x σ_y)
• Least squares regression line: y = µ_y + ρ(σ_y/σ_x)(x − µ_x)
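On sample data the same line falls out of an ordinary least squares fit; a sketch with simulated data (the model y = 2x + noise is arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 2 * x + rng.normal(size=200)
    r = np.corrcoef(x, y)[0, 1]
    slope = r * y.std() / x.std()           # rho * (sigma_y / sigma_x)
    intercept = y.mean() - slope * x.mean()
    print(slope, intercept)
    print(np.polyfit(x, y, 1))              # [slope, intercept]; same values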
• When the variables X_1, X_2, . . . , X_n are not pairwise independent, then
  Var(Σ_{i=1}^n X_i) = Σ_{i=1}^n σ_i² + 2 Σ_{i<j} σ_ij
  and
  Var(Σ_{i=1}^n a_i X_i) = Σ_{i=1}^n a_i² σ_i² + 2 Σ_{i<j} a_i a_j σ_ij,
  where σ_ij is the covariance of X_i and X_j. (When the variables are pairwise independent, the covariance terms vanish.)
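The second identity is just the expansion of the quadratic form aᵀΣa; a check with a small hypothetical covariance matrix:

    import numpy as np

    Sigma = np.array([[4.0, 1.0, 0.5],
                      [1.0, 3.0, -0.7],
                      [0.5, -0.7, 2.0]])   # hypothetical covariance matrix
    a = np.array([1.0, 2.0, -1.0])
    lhs = a @ Sigma @ a                    # Var(sum a_i X_i) in matrix form
    rhs = np.sum(a**2 * np.diag(Sigma)) + 2 * sum(
        a[i] * a[j] * Sigma[i, j] for i in range(3) for j in range(i + 1, 3))
    print(lhs, rhs)                        # equal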

• When X depends upon Y, E[X] = E[E[X|Y]].

• When X depends upon Y, Var(X) = E[Var(X|Y)] + Var(E[X|Y]). (Called the “law of total variance.”)
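A simulation sketch of both identities for a hypothetical two-stage model, Y ~ Gamma(α = 2, θ = 3) and X | Y ~ Poisson(Y), so that E[X] = E[Y] = 6 and Var(X) = E[Y] + Var(Y) = 24:

    import numpy as np

    rng = np.random.default_rng(1)
    y = rng.gamma(shape=2.0, scale=3.0, size=1_000_000)
    x = rng.poisson(y)     # X | Y=y is Poisson(y)
    print(x.mean())        # ~6  = E[E[X|Y]]
    print(x.var())         # ~24 = E[Var(X|Y)] + Var(E[X|Y])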

• Chebyshev’s Inequality: For a random variable X having any distribution with finite mean µ and variance σ², P(|X − µ| ≥ kσ) ≤ 1/k².
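The bound is typically loose; for example, against an exact standard normal tail (k = 2 chosen arbitrarily):

    from scipy.stats import norm

    k = 2.0
    exact = 2 * norm.sf(k)    # P(|Z| >= k) for a standard normal
    print(exact, 1 / k**2)    # ~0.0455 <= 0.25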

• For the variables X and Y having the joint PMF/PDF f(x, y), the moment generating function for this distribution is
  M(t_1, t_2) = E[e^(t_1 X + t_2 Y)] = E[e^(t_1 X) e^(t_2 Y)] = Σ_x Σ_y e^(t_1 x) e^(t_2 y) f(x, y)
  – µ_x = M_{t_1}(0, 0) and µ_y = M_{t_2}(0, 0) (These are the first partial derivatives.)
  – E[X²] = M_{t_1 t_1}(0, 0) and E[Y²] = M_{t_2 t_2}(0, 0) (These are the “pure” second partial derivatives.)
  – E[XY] = M_{t_1 t_2}(0, 0) = M_{t_2 t_1}(0, 0) (These are the “mixed” second partial derivatives.)
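A symbolic sketch with sympy on a tiny hypothetical joint PMF, differentiating M(t_1, t_2) and evaluating at (0, 0):

    import sympy as sp

    t1, t2 = sp.symbols('t1 t2')
    pmf = {(0, 0): sp.Rational(1, 5), (0, 1): sp.Rational(3, 10),
           (1, 0): sp.Rational(1, 10), (1, 1): sp.Rational(2, 5)}   # hypothetical f(x, y)
    M = sum(p * sp.exp(t1 * x + t2 * y) for (x, y), p in pmf.items())
    at0 = {t1: 0, t2: 0}
    print(sp.diff(M, t1).subs(at0))       # E[X]  = 1/2
    print(sp.diff(M, t1, t2).subs(at0))   # E[XY] = 2/5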

• Central Limit Theorem: As the sample size n grows,
  – the distribution of Σ_{i=1}^n X_i becomes approximately normal with mean nµ and variance nσ²
  – the distribution of X̄ = (1/n) Σ_{i=1}^n X_i becomes approximately normal with mean µ and variance σ²/n.
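A simulation sketch: standardized sample means of an Exponential(θ = 1) population (so µ = σ = 1) look approximately standard normal:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    n, reps = 50, 200_000
    xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    z = (xbar - 1.0) * np.sqrt(n)             # (xbar - mu) / (sigma / sqrt(n))
    print((z <= 1.0).mean(), norm.cdf(1.0))   # approximately equal (~0.84)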

• If X and Y are jointly distributed with PMF f(x, y), then
  – the marginal distribution of X is given by f_x(x) = Σ_y f(x, y)
  – the marginal distribution of Y is given by f_y(y) = Σ_x f(x, y)
  – f(x | y = y_0) = f(x, y_0) / f_y(y_0)
  – E[X | Y = y_0] = Σ_x x f(x | y = y_0) = Σ_x x f(x, y_0) / f_y(y_0)
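A sketch computing marginals and a conditional mean from a small hypothetical joint PMF table (rows indexed by x, columns by y):

    import numpy as np

    f = np.array([[0.10, 0.20],    # f(0, 0), f(0, 1)
                  [0.30, 0.40]])   # f(1, 0), f(1, 1)
    fx = f.sum(axis=1)             # marginal of X
    fy = f.sum(axis=0)             # marginal of Y
    x = np.array([0, 1])
    y0 = 1                         # condition on Y = 1
    print(fx, fy)
    print(np.sum(x * f[:, y0]) / fy[y0])   # E[X | Y = 1] = 0.4/0.6 = 2/3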
