
STAT6201 Lecture Note 3 Fall 2021

3 Common Families of Distributions


3.1 Introduction
A family of distributions: a class of pmfs/pdfs indexed by one or more “parameters”.
• distributions have a common functional form but different parameter values, quantifying certain characteristics of the distribution
• the functional form is tractable, giving a parametric distribution family.

3.2 Discrete Distributions


3.2.1 Discrete Uniform Distribution
• X: equal chance of taking 1, 2, . . . , N. Here N is the parameter.
• pmf:

P(X = x | N) = 1/N, x = 1, 2, . . . , N.

Note: Here P(X = x | N) is NOT a conditional probability; the notation indicates that the probability depends on the parameter N.
• Moments: E(X) = (N + 1)/2 and Var(X) = (N + 1)(N − 1)/12.
• Generalization: X takes all integer values N_0, N_0 + 1, . . . , N_1. With two parameters (N_0, N_1), the pmf is

P(X = x | N_0, N_1) = 1/(N_1 − N_0 + 1), x = N_0, N_0 + 1, . . . , N_1.
Think: What are E(X) and Var(X)? (A numerical check appears below.)
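As a quick numerical check, here is a minimal sketch in Python with numpy (the code and the function name are illustrative, not part of the original notes) that computes the exact moments of the generalized discrete uniform directly from its pmf:

import numpy as np

def discrete_uniform_moments(n0, n1):
    """Exact mean and variance of the uniform distribution on {n0, ..., n1}."""
    x = np.arange(n0, n1 + 1)            # the support N0, N0+1, ..., N1
    p = np.full(x.size, 1.0 / x.size)    # pmf: 1/(N1 - N0 + 1) at every point
    mean = np.sum(x * p)
    var = np.sum((x - mean) ** 2 * p)
    return mean, var

# Standard case N0 = 1, N1 = N: should match (N + 1)/2 and (N + 1)(N - 1)/12.
N = 10
print(discrete_uniform_moments(1, N))       # (5.5, 8.25)
print((N + 1) / 2, (N + 1) * (N - 1) / 12)  # 5.5 8.25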

3.2.2 Hypergeometric Distribution


• N balls, M red, N − M green.
• Draw K balls at random without replacement.
• X = # red balls ∼ Hypergeometric(N, M, K).
• pmf:

P(X = x | M, N, K) = \frac{\binom{M}{x} \binom{N − M}{K − x}}{\binom{N}{K}}, x = 0, 1, . . . , min(M, K).
In many cases, K is small compared to both M and N − M, so the range is simply 0 ≤ x ≤ K.
• Moments:

E(X) = \frac{KM}{N}, Var(X) = \frac{KM}{N} · \frac{(N − M)(N − K)}{N(N − 1)}.
Exercise: Prove them using the following facts:

x \binom{M}{x} = M \binom{M − 1}{x − 1}, \binom{N}{K} = \frac{N}{K} \binom{N − 1}{K − 1}.
• Applications:
– Capture-Recapture Method: There are N (unknown) fish in a pond. To estimate N, we catch M fish, mark them, and return them to the pond. After a while, we catch K fish and find X = # marked fish. How can we estimate N?
– Acceptance Sampling: There are N = 25 machine parts in a lot. We sample K = 10 parts and let X = # of defective parts found in this sample. If the total number of defective parts is M = 6, what is the probability that none of the sampled parts are defective? Calculate P(X = 0) (a numerical check follows below).
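The acceptance-sampling question can be answered numerically. A minimal sketch, assuming Python with scipy is available (the notes themselves use no code); note that scipy's hypergeom takes its arguments in a different order than our (N, M, K) notation:

from scipy.stats import hypergeom

# Acceptance sampling: N = 25 parts in the lot, M = 6 defective, sample K = 10.
# scipy's parameterization is hypergeom(M=pop_size, n=#special, N=#draws).
N, M, K = 25, 6, 10
rv = hypergeom(N, M, K)

print(rv.pmf(0))   # P(X = 0) = C(19,10)/C(25,10), roughly 0.028
print(rv.mean())   # KM/N = 10*6/25 = 2.4
print(rv.var())    # (KM/N)(N-M)(N-K)/(N(N-1)) = 2.4*19*15/600 = 1.14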

3.2.3 Binomial Distribution


• Bernoulli trial: two possible outcomes, Success (S) or Failure (F).
• p = P(Success) in each trial; q = 1 − p.
• Experiment: n independent Bernoulli trials, each with the same p.
• X = # of Successes ∼ Binomial(n, p).
• pmf:

P(X = k | n, p) = \binom{n}{k} p^k (1 − p)^{n−k} = \binom{n}{k} p^k q^{n−k}, k = 0, 1, . . . , n.
Note that (x + y)^n = \sum_{i=0}^{n} \binom{n}{i} x^i y^{n−i}, so P(X = k | n, p) is the (k + 1)th term in the expansion of (p + q)^n.


• Moments: E(X) = np, Var(X) = np(1 − p). (Can you derive them?)

• MGF: M(t) = (1 − p + pe^t)^n. (Can you derive it?)
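A short sketch (scipy assumed, as above) that checks the moment formulas and evaluates the MGF at one point against the definition E(e^{tX}):

import numpy as np
from scipy.stats import binom

n, p = 12, 0.3
rv = binom(n, p)

print(rv.mean(), n * p)            # both 3.6
print(rv.var(), n * p * (1 - p))   # both 2.52

# Check M(t) = (1 - p + p e^t)^n against E(e^{tX}) computed from the pmf.
t = 0.5
k = np.arange(n + 1)
print(np.sum(np.exp(t * k) * rv.pmf(k)))   # MGF from the pmf
print((1 - p + p * np.exp(t)) ** n)        # closed form; equal up to rounding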

3.2.4 Poisson Distribution


• The Poisson distribution is most commonly used to model the number of random occurrences of some phenomenon in a specified unit of time or space, e.g., the number of customers arriving at a bank between 10am and 11am.
• X ∼ Poisson(λ) if
P(X = x | λ) = \frac{e^{−λ} λ^x}{x!}, x = 0, 1, 2, . . . .

Here λ is called the intensity parameter, which specifies the mean rate of occurrence per unit (of time, area, volume, etc.), i.e., λ = E(X).
Verify that \sum_{x=0}^{+∞} P(X = x | λ) = 1.

Example 3.2.1 (Textbook Example 3.2.4, pp.93)

• Moments: E(X) = λ, Var(X) =?


• MGF: M_X(t) = e^{λ(e^t − 1)}.
• Poisson approximation to the binomial distribution (see the sketch below):
Let X ∼ Binomial(n, p). If n → ∞, p → 0, and np → λ (i.e., n large, p small, but np moderate), then Bin(n, p) → Poisson(λ).
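A minimal numerical illustration of this approximation (scipy assumed; the particular n and p are arbitrary choices):

import numpy as np
from scipy.stats import binom, poisson

n, p = 1000, 0.003          # n large, p small, np = 3 moderate
k = np.arange(10)

# The largest pointwise gap between the two pmfs is below 1e-3 here.
print(np.abs(binom.pmf(k, n, p) - poisson.pmf(k, n * p)).max())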

Example 3.2.2 (Textbook Example 2.3.13, pp.66)

Example 3.2.3 (Textbook Example 3.2.5, pp.94)

3.2.5 Geometric Distribution


• i.i.d. Bernoulli trials: P (Success) = p. Denote q = 1 − p.
• pmf: Let X = number of trials needed to get the first Success.

P(X = x | p) = p q^{x−1}, x = 1, 2, . . . .

• Moments: E(X) = 1/p, Var(X) = (1 − p)/p^2.


pet
• MGF: MX (t) = 1−(1−p)et .


• Memoryless Property:

P(X > a + b | X > a) = P(X > b) = (1 − p)^b.




Fun Fact: The geometric distribution is the only discrete distribution supported on {1, 2, . . .} with the memoryless property.
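A small sketch checking the memoryless property numerically (scipy assumed; scipy's geom uses the same "number of trials" convention as X above):

from scipy.stats import geom

p, a, b = 0.2, 5, 3
rv = geom(p)                    # support 1, 2, ..., i.e., trials to first Success

# P(X > a + b | X > a) = P(X > a + b)/P(X > a) should equal P(X > b) = q^b.
lhs = rv.sf(a + b) / rv.sf(a)   # sf(x) = P(X > x)
print(lhs, rv.sf(b), (1 - p) ** b)   # all three are approximately 0.512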

Example 3.2.4 (Textbook Example 3.2.7, pp.98)

3.2.6 Negative Binomial Distribution


• i.i.d. trials, with Success or Failure in each trial. P (Success) = p. Denote q = 1 − p.

• Let X denote the trial # on which the rth success occurs.


• pmf: X ∼ nb(r, p) if

P(X = x | r, p) = \binom{x − 1}{r − 1} p^r q^{x−r}, x = r, r + 1, . . . .

The geometric distribution is a special case of the negative binomial with r = 1.

• Moments:

E(X) = \frac{r}{p}, Var(X) = \frac{r(1 − p)}{p^2}.
• Alternative definition: Let Y = # failures before the rth Success, i.e., Y = X − r.

P(Y = y | r, p) = \binom{r + y − 1}{r − 1} p^r q^y = \binom{r + y − 1}{y} p^r q^y, y = 0, 1, . . . .

We can obtain
E(Y) = \frac{r(1 − p)}{p}, Var(Y) = \frac{r(1 − p)}{p^2}.
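As an aside not in the original notes: scipy's nbinom follows exactly this "number of failures" convention for Y, so the moment formulas can be checked directly; a minimal sketch:

from scipy.stats import nbinom

r, p = 4, 0.3
rv = nbinom(r, p)   # scipy's nbinom models Y = # failures before the rth Success

print(rv.mean(), r * (1 - p) / p)       # both approximately 9.3333
print(rv.var(), r * (1 - p) / p ** 2)   # both approximately 31.1111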
• Why is it called “negative binomial”? Newton’s Generalized Binomial Theorem: for any a ∈ R,

(x + z)^a = \sum_{y=0}^{∞} \binom{a}{y} x^y z^{a−y},

where the generalized binomial coefficient is defined as

\binom{a}{y} = \frac{a(a − 1) · · · (a − y + 1)}{y!}.

Letting a = −r, we have

\binom{−r}{y} = \frac{(−r)(−r − 1) · · · (−r − y + 1)}{y!} = (−1)^y \frac{(r + y − 1) · · · r}{y!} = (−1)^y \binom{r + y − 1}{y}.

Thus

P(Y = y | r, p) = \binom{r + y − 1}{y} p^r q^y = \binom{−r}{y} p^r (−q)^y.

Also

\sum_{y=0}^{∞} P(Y = y | r, p) = \sum_{y=0}^{∞} \binom{−r}{y} p^r (−q)^y = p^r \sum_{y=0}^{∞} \binom{−r}{y} (−q)^y 1^{−r−y} = p^r (1 − q)^{−r} = 1.


• MGF:

M(t) = E(e^{tY}) = \sum_{y=0}^{∞} e^{ty} \binom{−r}{y} p^r (−q)^y = p^r (1 − qe^t)^{−r}.

• Connection with geometric distributions: If X_1, . . . , X_r are i.i.d. Geometric(p), then

X = X_1 + · · · + X_r ∼ nb(r, p).

How to understand this relationship intuitively?
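Intuitively, the numbers of trials between consecutive Successes (the waiting time for the first Success, then from the first to the second, and so on) are i.i.d. Geometric(p), and the trial of the rth Success is their sum. A quick Monte Carlo sketch under that interpretation (numpy/scipy assumed; the seed and sample size are arbitrary choices):

import numpy as np
from scipy.stats import geom

rng = np.random.default_rng(0)
r, p, n_sim = 4, 0.3, 100_000

# Simulate X = X_1 + ... + X_r with X_i i.i.d. Geometric(p) (trial counts).
x = geom.rvs(p, size=(n_sim, r), random_state=rng).sum(axis=1)

print(x.mean(), r / p)                   # both approximately 13.33
print(x.var(), r * (1 - p) / p ** 2)     # both approximately 31.11
print(x.min())                           # at least r, matching nb(r, p) support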

Acknowledgement
The lecture notes of this course are based on the textbook and Prof. Huixia Judy Wang’s lecture slides. The
instructor thanks Prof. Wang for kindly sharing them.

The End.
