
Probability Cheat Sheet

Distributions

Uniform Distribution
  notation:    U[a, b]
  cdf:         F_X(x) = (x − a)/(b − a) for x ∈ [a, b]
  pdf:         f_X(x) = 1/(b − a) for x ∈ [a, b]
  expectation: (a + b)/2
  variance:    (b − a)²/12
  mgf:         (e^{tb} − e^{ta})/(t(b − a))
  story: all intervals of the same length on the distribution's support are equally probable.

Poisson Distribution
  notation:    Poisson(λ)
  pmf:         P(X = k) = (λ^k/k!) e^{−λ} for k ∈ ℕ
  cdf:         F_X(k) = e^{−λ} Σ_{i=0}^{k} λ^i/i!
  expectation: λ
  variance:    λ
  mgf:         exp(λ(e^t − 1))
  ind. sum:    Σ_{i=1}^{n} X_i ∼ Poisson(Σ_{i=1}^{n} λ_i)
  story: the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event.

Exponential Distribution
  notation:    exp(λ)
  cdf:         F_X(x) = 1 − e^{−λx} for x ≥ 0
  pdf:         f_X(x) = λe^{−λx} for x ≥ 0
  expectation: 1/λ
  variance:    1/λ²
  mgf:         λ/(λ − t) for t < λ
  ind. sum:    Σ_{i=1}^{k} X_i ∼ Gamma(k, λ)
  minimum:     min_i X_i ∼ exp(Σ_{i=1}^{n} λ_i) for independent X_i ∼ exp(λ_i)
  story: the amount of time until some specific event occurs, starting from now, being memoryless.

Gamma Distribution
  notation:    Gamma(k, θ), where Γ(k) = ∫_0^∞ x^{k−1} e^{−x} dx
  pdf:         f_X(x) = (θ^k x^{k−1} e^{−θx}/Γ(k)) · I_{x>0}
  expectation: k/θ
  variance:    k/θ²
  mgf:         (1 − t/θ)^{−k} for t < θ
  ind. sum:    Σ_{i=1}^{n} X_i ∼ Gamma(Σ_{i=1}^{n} k_i, θ)
  story: the sum of k independent exponentially distributed random variables, each with rate θ (equivalently, mean θ^{−1}).

Normal Distribution
  notation:    N(µ, σ²)
  pdf:         f_X(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}
  expectation: µ
  variance:    σ²
  mgf:         exp(µt + σ²t²/2)
  ind. sum:    Σ_{i=1}^{n} X_i ∼ N(Σ_{i=1}^{n} µ_i, Σ_{i=1}^{n} σ_i²)
  story: describes data that cluster around the mean.

Standard Normal Distribution
  notation: N(0, 1)
  cdf:      Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t²/2} dt
  pdf:      f_X(x) = (1/√(2π)) e^{−x²/2}
  story: the normal distribution with µ = 0 and σ = 1.

Binomial Distribution
  notation:    Bin(n, p)
  pmf:         P(X = i) = (n choose i) p^i (1 − p)^{n−i}
  cdf:         F_X(k) = Σ_{i=0}^{k} (n choose i) p^i (1 − p)^{n−i}
  expectation: np
  variance:    np(1 − p)
  mgf:         (1 − p + pe^t)^n
  story: the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p.

Geometric Distribution
  notation:    G(p)
  pmf:         P(X = k) = (1 − p)^{k−1} p for k ∈ ℕ
  cdf:         F_X(k) = 1 − (1 − p)^k for k ∈ ℕ
  expectation: 1/p
  variance:    (1 − p)/p²
  mgf:         pe^t/(1 − (1 − p)e^t)
  story: the number X of Bernoulli trials needed to get one success. Memoryless.
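The tables above are easy to cross-check against a library. The sketch below is illustrative only (it assumes scipy is installed; all parameter values are arbitrary) and maps each distribution to its scipy.stats counterpart; note that scipy parameterizes the exponential and gamma families by scale, the reciprocal of the rates λ and θ used here.

    # Cross-check the distribution tables against scipy.stats.
    # Note: scipy uses SCALE parameters, i.e. scale = 1/rate.
    from scipy import stats

    a, b = 2.0, 5.0        # uniform endpoints (arbitrary)
    lam = 1.5              # Poisson / exponential rate (arbitrary)
    k, theta = 3, 2.0      # gamma shape and rate (arbitrary)
    n, p = 10, 0.3         # binomial parameters (arbitrary)

    checks = [
        ("uniform",     stats.uniform(loc=a, scale=b - a), (a + b) / 2, (b - a) ** 2 / 12),
        ("poisson",     stats.poisson(lam),                lam,         lam),
        ("exponential", stats.expon(scale=1 / lam),        1 / lam,     1 / lam ** 2),
        ("gamma",       stats.gamma(k, scale=1 / theta),   k / theta,   k / theta ** 2),
        ("binomial",    stats.binom(n, p),                 n * p,       n * p * (1 - p)),
        ("geometric",   stats.geom(p),                     1 / p,       (1 - p) / p ** 2),
    ]
    for name, dist, mean, var in checks:
        assert abs(dist.mean() - mean) < 1e-9 and abs(dist.var() - var) < 1e-9, name
        print(f"{name:12s} mean={dist.mean():.4f}  var={dist.var():.4f}")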
Quantile Function

The function X* : [0, 1] → ℝ for which, for every p ∈ [0, 1],

  F_X(X*(p)⁻) ≤ p ≤ F_X(X*(p))

  F_{X*} = F_X

  E(X*) = E(X)
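Because F_{X*} = F_X, pushing uniform samples through the quantile function reproduces the distribution (inverse-transform sampling). A minimal sketch, assuming numpy/scipy, using the exponential quantile X*(p) = −ln(1 − p)/λ obtained by inverting the cdf:

    # Inverse-transform sampling: X*(U) with U ~ U[0,1] has cdf F_X.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    lam = 2.0                          # rate (arbitrary)
    u = rng.uniform(size=100_000)
    x = -np.log(1 - u) / lam           # exponential quantile function X*(p)

    print(stats.kstest(x, stats.expon(scale=1 / lam).cdf))  # F_{X*} = F_X
    print(x.mean(), 1 / lam)                                # E(X*) = E(X)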

Expectation

  E(X) = ∫_0^1 X*(p) dp

  E(X) = ∫_0^∞ (1 − F_X(t)) dt − ∫_{−∞}^0 F_X(t) dt

  E(X) = ∫_{−∞}^{∞} x f_X(x) dx

  E(g(X)) = ∫_{−∞}^{∞} g(x) f_X(x) dx

  E(aX + b) = aE(X) + b
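A numerical spot-check of the cdf form of the expectation, a sketch assuming scipy, with X ∼ N(1, 4) (an arbitrary choice) whose mean is known to be 1:

    # Check E(X) = ∫_0^∞ (1 − F(t)) dt − ∫_{−∞}^0 F(t) dt for X ~ N(1, 4).
    from scipy import stats
    from scipy.integrate import quad

    F = stats.norm(loc=1.0, scale=2.0).cdf
    pos, _ = quad(lambda t: 1 - F(t), 0, float("inf"))
    neg, _ = quad(F, -float("inf"), 0)
    print(pos - neg)   # ≈ 1.0 = E(X)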
Variance

  Var(X) = E(X²) − (E(X))²

  Var(X) = E((X − E(X))²)

  Var(aX + b) = a² Var(X)

Standard Deviation

  σ(X) = √(Var(X))

Covariance

  Cov(X, Y) = E(XY) − E(X) E(Y)

  Cov(X, Y) = E((X − E(X))(Y − E(Y)))

  Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)

Correlation Coefficient

  ρ_{X,Y} = Cov(X, Y)/(σ_X σ_Y)
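These identities can be confirmed empirically; a small sketch (assuming numpy; the coupling between x and y is an arbitrary illustration):

    # Verify Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X,Y) on sampled data.
    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=200_000)
    y = 0.5 * x + rng.normal(size=200_000)        # correlated with x

    cov = np.cov(x, y)[0, 1]
    print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov)      # agree up to noise
    print(np.corrcoef(x, y)[0, 1], cov / (x.std() * y.std()))  # ρ computed two ways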
Moment Generating Function

  M_X(t) = E(e^{tX})

  E(X^n) = M_X^{(n)}(0)

  M_{aX+b}(t) = e^{tb} M_{aX}(t)
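For example, differentiating the exponential mgf λ/(λ − t) at t = 0 recovers the moments E(X^n) = n!/λ^n; a sketch assuming sympy:

    # Moments from the mgf: E(X^n) = M^(n)(0), here for X ~ exp(lam).
    import sympy as sp

    t, lam = sp.symbols("t lam", positive=True)
    M = lam / (lam - t)                    # mgf of the exponential distribution
    for n in range(1, 4):
        print(n, sp.simplify(sp.diff(M, t, n).subs(t, 0)))  # n!/lam**n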
Basics

Cumulative Distribution Function

  F_X(x) = P(X ≤ x)

Probability Density Function

  F_X(x) = ∫_{−∞}^{x} f_X(t) dt

  ∫_{−∞}^{∞} f_X(t) dt = 1

  f_X(x) = (d/dx) F_X(x)
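A quick numerical illustration (a sketch assuming scipy) that the density is the derivative of the cdf, via a central difference on the standard normal:

    # f_X(x) ≈ (d/dx) F_X(x), checked by a central difference for X ~ N(0, 1).
    from scipy import stats

    d = stats.norm()
    x, h = 0.7, 1e-6
    print((d.cdf(x + h) - d.cdf(x - h)) / (2 * h), d.pdf(x))  # both ≈ 0.3123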
Joint Distribution

  P_{X,Y}(B) = P((X, Y) ∈ B)

  F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y)

Joint Density

  P_{X,Y}(B) = ∬_B f_{X,Y}(s, t) ds dt

  F_{X,Y}(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f_{X,Y}(s, t) dt ds

  ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_{X,Y}(s, t) ds dt = 1

Marginal Distributions

  P_X(B) = P_{X,Y}(B × ℝ)

  P_Y(B) = P_{X,Y}(ℝ × B)

  F_X(a) = ∫_{−∞}^{a} ∫_{−∞}^{∞} f_{X,Y}(s, t) dt ds

  F_Y(b) = ∫_{−∞}^{b} ∫_{−∞}^{∞} f_{X,Y}(s, t) ds dt

Marginal Densities

  f_X(s) = ∫_{−∞}^{∞} f_{X,Y}(s, t) dt

  f_Y(t) = ∫_{−∞}^{∞} f_{X,Y}(s, t) ds

Joint Expectation

  E(ϕ(X, Y)) = ∬_{ℝ²} ϕ(x, y) f_{X,Y}(x, y) dx dy
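A worked numerical example of these definitions (a sketch assuming scipy; the joint density f(x, y) = 4xy on [0, 1]², with marginals f_X(x) = 2x and f_Y(y) = 2y, is an arbitrary choice):

    # Marginal density and joint expectation by numerical integration.
    from scipy.integrate import quad, dblquad

    def f(x, y):                      # joint density 4xy on the unit square
        return 4 * x * y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

    s = 0.6
    fx, _ = quad(lambda t: f(s, t), 0, 1)        # f_X(s) = ∫ f(s, t) dt
    print(fx, 2 * s)                             # marginal is 2s on [0, 1]

    exy, _ = dblquad(lambda y, x: x * y * f(x, y), 0, 1, lambda x: 0, lambda x: 1)
    print(exy, 4 / 9)                            # E(XY) = ∬ xy f(x, y) dx dy = 4/9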
Independent r.v.

  P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y)

  F_{X,Y}(x, y) = F_X(x) F_Y(y)

  f_{X,Y}(s, t) = f_X(s) f_Y(t)

  E(XY) = E(X) E(Y)

  Var(X + Y) = Var(X) + Var(Y)

Independent events:

  P(A ∩ B) = P(A) P(B)

Conditional Probability

  P(A | B) = P(A ∩ B)/P(B)

  Bayes: P(A | B) = P(B | A) P(A)/P(B)

Conditional Density

  f_{X|Y=y}(x) = f_{X,Y}(x, y)/f_Y(y)

  f_{X|Y=n}(x) = f_X(x) P(Y = n | X = x)/P(Y = n)

  F_{X|Y=y}(x) = ∫_{−∞}^{x} f_{X|Y=y}(t) dt

Conditional Expectation

  E(X | Y = y) = ∫_{−∞}^{∞} x f_{X|Y=y}(x) dx

  E(E(X | Y)) = E(X)

  P(Y = n) = E(I_{Y=n}) = E(E(I_{Y=n} | X))
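The tower property E(E(X | Y)) = E(X) in simulation (a sketch assuming numpy; the two-stage model is an arbitrary illustration): with Y ∼ exp(1) and X | Y = y ∼ N(y, 1), we have E(X | Y) = Y, so E(X) = E(Y) = 1.

    # Tower property: E(E(X | Y)) = E(X) for a two-stage experiment.
    import numpy as np

    rng = np.random.default_rng(2)
    y = rng.exponential(scale=1.0, size=500_000)   # Y ~ exp(1), E(Y) = 1
    x = rng.normal(loc=y, scale=1.0)               # X | Y=y ~ N(y, 1), E(X|Y) = Y
    print(x.mean(), y.mean())                      # both ≈ 1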
Sequences and Limits

  lim sup A_n = {A_n i.o.} = ∩_{m=1}^{∞} ∪_{n=m}^{∞} A_n

  lim inf A_n = {A_n eventually} = ∪_{m=1}^{∞} ∩_{n=m}^{∞} A_n

  lim inf A_n ⊆ lim sup A_n

  (lim sup A_n)^c = lim inf A_n^c

  (lim inf A_n)^c = lim sup A_n^c

  P(lim sup A_n) = lim_{m→∞} P(∪_{n=m}^{∞} A_n)

  P(lim inf A_n) = lim_{m→∞} P(∩_{n=m}^{∞} A_n)

Borel-Cantelli Lemma

  Σ_{n=1}^{∞} P(A_n) < ∞ ⇒ P(lim sup A_n) = 0

And if the A_n are independent:

  Σ_{n=1}^{∞} P(A_n) = ∞ ⇒ P(lim sup A_n) = 1
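An illustration by simulation (a sketch assuming numpy): for independent events A_n = {U_n ≤ n^{−2}} the probabilities are summable, so almost surely only finitely many occur, while for A_n = {U_n ≤ 1/n} the sum diverges and infinitely many occur.

    # Borel-Cantelli: summable probabilities → finitely many events occur.
    import numpy as np

    rng = np.random.default_rng(3)
    n = np.arange(1, 1_000_001)
    u = rng.uniform(size=n.size)
    print((u <= 1.0 / n**2).sum())   # stays small: Σ 1/n² < ∞
    print((u <= 1.0 / n).sum())      # grows like ln n: Σ 1/n = ∞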
Convergence

Convergence in Probability

  notation: X_n →_p X
  meaning:  lim_{n→∞} P(|X_n − X| > ε) = 0 for every ε > 0

Convergence in Distribution

  notation: X_n →_D X
  meaning:  lim_{n→∞} F_n(x) = F(x) at every continuity point x of F

Almost Sure Convergence

  notation: X_n →_{a.s.} X
  meaning:  P(lim_{n→∞} X_n = X) = 1

Criteria for a.s. Convergence

  • ∀ε > 0 ∃N: P(|X_n − X| < ε for all n > N) > 1 − ε
  • ∀ε > 0: P(lim sup {|X_n − X| > ε}) = 0
  • ∀ε > 0: Σ_{n=1}^{∞} P(|X_n − X| > ε) < ∞ (sufficient, by Borel-Cantelli)

Convergence in L^p

  notation: X_n →_{L^p} X
  meaning:  lim_{n→∞} E(|X_n − X|^p) = 0

Relationships

  L^q ⇒ L^p for q > p ≥ 1, and L^p ⇒ in probability

  a.s. ⇒ in probability ⇒ in distribution

  If X_n →_D c for a constant c, then X_n →_p c.

  If X_n →_p X, then there exists a subsequence n_k s.t. X_{n_k} →_{a.s.} X.
Laws of Large Numbers

If the X_i are i.i.d. r.v. and X̄_n = (1/n) Σ_{i=1}^{n} X_i, then:

  weak law:   X̄_n →_p E(X_1)

  strong law: X̄_n →_{a.s.} E(X_1)
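The law of large numbers in simulation (a sketch assuming numpy; the exponential distribution is an arbitrary choice): the running sample mean settles on E(X_1).

    # Running sample mean converging to E(X_1) (law of large numbers).
    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.exponential(scale=2.0, size=1_000_000)      # E(X_1) = 2
    running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
    for n in (10, 1_000, 100_000, 1_000_000):
        print(n, running_mean[n - 1])                   # drifts toward 2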
Central Limit Theorem

With S_n = Σ_{i=1}^{n} X_i for i.i.d. X_i with mean µ and variance σ²:

  (S_n − nµ)/(σ√n) →_D N(0, 1)

  If t_n → t, then P((S_n − nµ)/(σ√n) ≤ t_n) → Φ(t)
Inequalities

Markov's inequality

  P(|X| ≥ t) ≤ E(|X|)/t

Chebyshev's inequality

  P(|X − E(X)| ≥ ε) ≤ Var(X)/ε²

Chernoff's inequality

  Let X ∼ Bin(n, p); then P(X − E(X) > tσ(X)) < e^{−t²/2}

  Simpler result, valid for every X and every t > 0: P(X ≥ a) ≤ M_X(t) e^{−ta}

Jensen's inequality

  For ϕ a convex function, ϕ(E(X)) ≤ E(ϕ(X))
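A numerical spot-check of the Markov and Chebyshev bounds (a sketch assuming numpy) on exponential samples with E(|X|) = Var(X) = 1:

    # Empirical tail probabilities never exceed the Markov/Chebyshev bounds.
    import numpy as np

    rng = np.random.default_rng(6)
    x = rng.exponential(scale=1.0, size=1_000_000)   # E(|X|) = Var(X) = 1
    for t in (2.0, 4.0, 8.0):
        print((x >= t).mean(), "<=", 1 / t)                # Markov
        print((abs(x - 1) >= t).mean(), "<=", 1 / t ** 2)  # Chebyshev, ε = t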
Kolmogorov's 0-1 Law

If A is in the tail σ-algebra, then P(A) = 0 or P(A) = 1.

Convolution

For independent X, Y and Z = X + Y:

  f_Z(z) = ∫_{−∞}^{∞} f_X(s) f_Y(z − s) ds
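For example, convolving two exp(λ) densities yields the Gamma(2, λ) density (in the rate parameterization used above); a numerical sketch assuming scipy:

    # f_Z = f_X * f_Y (convolution) for independent X, Y ~ exp(lam).
    from scipy import stats
    from scipy.integrate import quad

    lam, z = 1.5, 2.0                                # arbitrary rate and point
    f = stats.expon(scale=1 / lam).pdf
    fz, _ = quad(lambda s: f(s) * f(z - s), 0, z)    # integrand vanishes outside [0, z]
    print(fz, stats.gamma(2, scale=1 / lam).pdf(z))  # matches Gamma(2, lam) density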
Miscellaneous

  E(Y) < ∞ ⟺ Σ_{n=0}^{∞} P(Y > n) < ∞ (for Y ≥ 0)

  E(X) = Σ_{n=0}^{∞} P(X > n) (for X ∈ ℕ)

  X ∼ U(0, 1) ⟺ −ln X ∼ exp(1)
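The last identity checked by simulation (a sketch assuming numpy/scipy):

    # X ~ U(0,1)  ⇒  −ln X ~ exp(1).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    y = -np.log(rng.uniform(size=100_000))
    print(stats.kstest(y, stats.expon().cdf))   # large p-value
    print(y.mean(), y.var())                    # both ≈ 1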

Ugly Stuff

cdf of the Gamma distribution (integer k, rate θ):

  F(t) = ∫_0^t (θ^k x^{k−1} e^{−θx}/(k − 1)!) dx

This cheatsheet was made by Peleg Michaeli in January 2010, using LaTeX.
version: 1.01
comments: peleg.michaeli@math.tau.ac.il