Nathaniel E. Helwig
Updated 17-Jan-2017
Copyright © 2017 by Nathaniel E. Helwig
Univariate Normal
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\} \qquad (1)$$
where
µ ∈ R is the mean
σ > 0 is the standard deviation (σ² is the variance)
e ≈ 2.71828 is the base of the natural logarithm
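As a quick numerical sanity check (not part of the slides), the density in (1) can be evaluated directly; for the standard normal (µ = 0, σ = 1) the peak height at x = 0 is 1/√(2π) ≈ 0.3989.

```python
import math

def dnorm(x, mu=0.0, sigma=1.0):
    """Univariate normal density f(x) from equation (1)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

peak = dnorm(0.0)   # standard normal peak height, 1/sqrt(2*pi)
```

The density is symmetric about µ, so dnorm(µ + c) equals dnorm(µ − c) for any c.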
Nathaniel E. Helwig (U of Minnesota) Introduction to Normal Distribution Updated 17-Jan-2017 : Slide 6
Univariate Normal Probability Calculations
where
$$F(x) = \int_{-\infty}^{x} f(u)\, du \qquad (4)$$
Normal Probabilities
[Figure: standard normal density with areas 34.1% within one SD of the mean (each side), 13.6% between one and two SDs, 2.1% between two and three SDs, and 0.1% beyond three SDs. From http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg]
[Figure: normal pdf φ(x) (left) and cdf Φ(x) (right) for several values of µ and σ². From http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg and http://en.wikipedia.org/wiki/File:Normal_Distribution_CDF.svg]
Note that the cdf has an elongated “S” shape, referred to as an ogive.
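The cdf F(x) in (4) has no closed form, but it can be written via the error function as Φ(x) = ½(1 + erf(x/√2)). A minimal sketch (using Python's math.erf as a convenience; this is a cross-check, not from the slides):

```python
import math

def pnorm(x, mean=0.0, sd=1.0):
    """Normal cdf F(x) via Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    z = (x - mean) / (sd * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

within_1sd = pnorm(1) - pnorm(-1)   # about 0.6827, matching 2 x 34.1%
within_2sd = pnorm(2) - pnorm(-2)   # about 0.9545
```

These areas reproduce the percentages in the standard-deviation diagram above.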
Likelihood Function
The likelihood function for the parameters (given the data) has the form
$$L(\mu, \sigma^2 | \mathbf{x}) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x_i-\mu)^2}{2\sigma^2}\right\}$$
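To see numerically that this likelihood peaks at the sample mean, one can compare the log-likelihood at µ = x̄ against nearby candidate values (a sketch with made-up data, holding σ = 1 fixed):

```python
import math

def log_lik(mu, sigma, xs):
    """Log-likelihood of (mu, sigma) for an iid normal sample xs."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - ss / (2.0 * sigma ** 2)

xs = [4.2, 5.1, 3.8, 5.6, 4.9]   # made-up sample
xbar = sum(xs) / len(xs)         # sample mean, 4.72
```

Evaluating log_lik at xbar and at points just above or below it confirms the maximum sits at the sample mean.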
Bivariate Normal
$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(x-\mu_x)^2}{\sigma_x^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2}\right]\right\}$$
where
µx ∈ R and µy ∈ R are the marginal means
σx ∈ R⁺ and σy ∈ R⁺ are the marginal standard deviations
−1 < ρ < 1 is the correlation coefficient
[Figure: perspective and contour plots of a bivariate normal density f(x, y). From http://en.wikipedia.org/wiki/File:MultivariateNormal.png]
Bivariate Normal Distribution Form
[Figure: three bivariate normal densities with (µx, µy) = (0, 0), (1, 2), and (−1, −1).]
Note: for all three plots σx² = 1, σy² = 2, and ρ = 0.6/√2.
[Figures: two further rows of bivariate normal density plots; the surviving panel titles in the second row are σy = 1, σy = 2, and σy = 2.]
[Figure: bivariate normal pdf f(x, y) (left) and cdf F(x, y) (right).]
Note: µx = µy = 0, σx² = 1, σy² = 2, and ρ = 0.6/√2.
Note that the cdf still has an ogive shape (now in two dimensions).
Bivariate Normal Affine Transformations
Let $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ and $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$ with $A \neq 0_{2\times 2} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.
$$f_{Y|X}(y \,|\, X = x) = \frac{f_{XY}(x, y)}{f_X(x)} \qquad (8)$$
where
fXY(x, y) is the joint pdf of X and Y
fX(x) and fY(y) are the marginal pdfs of X and Y
Example #1
A statistics class takes two exams X (Exam 1) and Y (Exam 2) where
the scores follow a bivariate normal distribution with parameters:
µx = 70 and µy = 60 are the marginal means
σx = 10 and σy = 15 are the marginal standard deviations
ρ = 0.6 is the correlation coefficient
The probability that the sum of Exam 1 and Exam 2 is above 150 follows from X + Y ∼ N(130, 505), since Var(X + Y) = σx² + σy² + 2ρσxσy = 100 + 225 + 180 = 505:
$$\begin{aligned} P(X + Y > 150) &= P\left(Z > \frac{150 - 130}{\sqrt{505}}\right) \\ &= P(Z > 0.8899883) \\ &= 1 - \Phi(0.8899883) \\ &= 1 - 0.8132639 \\ &= 0.1867361 \end{aligned}$$
The probability that the student did better on Exam 1 than Exam 2 follows from X − Y ∼ N(10, 145), since Var(X − Y) = σx² + σy² − 2ρσxσy = 100 + 225 − 180 = 145:
$$\begin{aligned} P(X > Y) &= P(X - Y > 0) \\ &= P\left(Z > \frac{0 - 10}{\sqrt{145}}\right) \\ &= P(Z > -0.8304548) \\ &= 1 - \Phi(-0.8304548) \\ &= 1 - 0.2031408 \\ &= 0.7968592 \end{aligned}$$
# Example 1a
> pnorm(1, lower=F)
[1] 0.1586553
> pnorm(75, mean=60, sd=15, lower=F)
[1] 0.1586553

# Example 1b
> pnorm(0.5, lower=F)
[1] 0.3085375
> pnorm(75, mean=69, sd=12, lower=F)
[1] 0.3085375

# Example 1c
> pnorm(20/sqrt(505), lower=F)
[1] 0.1867361
> pnorm(150, mean=130, sd=sqrt(505), lower=F)
[1] 0.1867361

# Example 1d
> pnorm(-10/sqrt(145), lower=F)
[1] 0.7968592
> pnorm(0, mean=10, sd=sqrt(145), lower=F)
[1] 0.7968592

# Example 1e
> pnorm(0.8, lower=F)
[1] 0.2118554
> pnorm(150, mean=110, sd=50, lower=F)
[1] 0.2118554
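The same upper-tail probabilities can be reproduced outside R via the complementary error function, 1 − Φ(z) = ½ erfc(z/√2) (a cross-check of Examples 1c and 1d, not part of the slides):

```python
import math

def pnorm_upper(q, mean=0.0, sd=1.0):
    """Upper-tail normal probability, like R's pnorm(q, ..., lower=FALSE)."""
    return 0.5 * math.erfc((q - mean) / (sd * math.sqrt(2.0)))

p_sum = pnorm_upper(150, mean=130, sd=math.sqrt(505))   # Example 1c
p_diff = pnorm_upper(0, mean=10, sd=math.sqrt(145))     # Example 1d
```

Both values agree with the R output above to the printed precision.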
Multivariate Normal
$$f(\mathbf{x}) = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}(\mathbf{x} - \mu)' \Sigma^{-1} (\mathbf{x} - \mu)\right\}$$
where
µ = (µ₁, …, µp)′ is the p × 1 mean vector
$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix} \text{ is the } p \times p \text{ covariance matrix}$$
$$f_{Y|X}(\mathbf{y} \,|\, \mathbf{X} = \mathbf{x}) = \frac{f_{XY}(\mathbf{x}, \mathbf{y})}{f_X(\mathbf{x})} \qquad (15)$$
where
fY |X (y|X = x) is the conditional distribution of y given x
fXY (x, y) is the joint pdf of x and y
fX (x) is the marginal pdf of x
Note that Σxy = 0p×q implies that the p elements of x are uncorrelated
with the q elements of y.
For multivariate normal variables: uncorrelated ⟹ independent
For non-normal variables: uncorrelated ⇏ independent
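A classic counterexample for the non-normal case: if Z ∼ N(0, 1) and Y = S·Z with S an independent random sign, then cov(Z, Y) = 0 yet Y is completely determined by (Z, S). A small seeded simulation sketch (not from the slides):

```python
import random

random.seed(1)
n = 100_000
data = []
for _ in range(n):
    z = random.gauss(0.0, 1.0)
    s = random.choice([-1.0, 1.0])   # independent random sign
    data.append((z, s * z))          # the pair (Z, Y) with Y = S * Z

# E[ZY] = E[S] E[Z^2] = 0, so the sample covariance is near zero...
cov_zy = sum(z * y for z, y in data) / n
# ...yet |Y| == |Z| holds in every draw, so Y and Z are dependent
```

Uncorrelatedness here is an artifact of symmetry; only joint normality would upgrade it to independence.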
Example #2
Each Delicious Candy Company store makes candy bars in 3 sizes:
regular (X1), fun size (X2), and big size (X3).
Assume the weight (in ounces) of the candy bars (X1 , X2 , X3 ) follow a
multivariate normal distribution with parameters:
$$\mu = \begin{pmatrix} 5 \\ 3 \\ 7 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} 4 & -1 & 0 \\ -1 & 4 & 2 \\ 0 & 2 & 9 \end{pmatrix}$$
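Any linear combination a′x of a multivariate normal vector is itself normal with mean a′µ and variance a′Σa. As an illustration (the total weight X1 + X2 + X3 is a hypothetical combination chosen here, not one of the slide's questions):

```python
# parameters from Example #2
mu = [5.0, 3.0, 7.0]
Sigma = [[4.0, -1.0, 0.0],
         [-1.0, 4.0, 2.0],
         [0.0, 2.0, 9.0]]
a = [1.0, 1.0, 1.0]   # hypothetical combination: total weight X1 + X2 + X3

mean_total = sum(ai * mi for ai, mi in zip(a, mu))    # a'mu
var_total = sum(a[i] * Sigma[i][j] * a[j]
                for i in range(3) for j in range(3))  # a'Sigma a
```

So X1 + X2 + X3 ∼ N(15, 19), and probabilities for it follow from pnorm exactly as in Example #1.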
# Example 2b
> pnorm(2.25/sqrt(119/32),lower=F)
[1] 0.1216523
> pnorm(8,mean=5.75,sd=sqrt(119/32),lower=F)
[1] 0.1216523
# Example 2c
> pnorm(1)
[1] 0.8413447
> pnorm(63,mean=46,sd=17)
[1] 0.8413447
Multivariate Normal Parameter Estimation
Likelihood Function
The likelihood function for the parameters (given the data) has the form
$$L(\mu, \Sigma | \mathbf{X}) = \prod_{i=1}^{n} f(\mathbf{x}_i) = \prod_{i=1}^{n} \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}(\mathbf{x}_i - \mu)' \Sigma^{-1} (\mathbf{x}_i - \mu)\right\}$$
$$\frac{\partial LL(\mu, \Sigma | \mathbf{X})}{\partial \mu} = -2n\Sigma^{-1}\bar{\mathbf{x}} + 2n\Sigma^{-1}\mu \quad \Longleftrightarrow \quad \hat{\mu} = \bar{\mathbf{x}}$$
The sample mean vector x̄ is the MLE of the population mean vector µ.
Similarly, the sample covariance matrix $\hat{\Sigma} = (1/n)\sum_{i=1}^{n} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})'$ is the MLE of the population covariance matrix Σ.
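The two MLEs can be computed directly from a data matrix whose rows are the observations xi (a plain-Python sketch with a made-up 3 × 2 data matrix):

```python
def mvn_mle(X):
    """MLEs: xbar and Sigma_hat = (1/n) * sum of (xi - xbar)(xi - xbar)'."""
    n, p = len(X), len(X[0])
    xbar = [sum(row[j] for row in X) / n for j in range(p)]
    Sigma_hat = [[sum((row[j] - xbar[j]) * (row[k] - xbar[k]) for row in X) / n
                  for k in range(p)] for j in range(p)]
    return xbar, Sigma_hat

X = [[2.0, 1.0], [4.0, 3.0], [6.0, 5.0]]   # made-up data, n = 3, p = 2
xbar, Sigma_hat = mvn_mle(X)
```

Note the 1/n divisor: the MLE is biased, unlike the usual 1/(n − 1) sample covariance.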
Sampling Distributions
$$\sigma^2 \chi^2_k = \sum_{i=1}^{k} z_i^2 \quad \text{where } z_i \overset{iid}{\sim} N(0, \sigma^2)$$
$$W_k(\Sigma) = \sum_{i=1}^{k} \mathbf{z}_i \mathbf{z}_i' \quad \text{where } \mathbf{z}_i \overset{iid}{\sim} N(\mathbf{0}_p, \Sigma)$$
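The first relation can be checked by simulation: sums of k squared N(0, σ²) draws should average kσ², the mean of a σ²χ²_k variable. A seeded sketch (not from the slides):

```python
import random

random.seed(0)
k, sigma2, reps = 5, 2.0, 20_000
sd = sigma2 ** 0.5
# each replicate: sum of k squared iid N(0, sigma^2) draws ~ sigma^2 * chi^2_k
sums = [sum(random.gauss(0.0, sd) ** 2 for _ in range(k)) for _ in range(reps)]
avg = sum(sums) / reps   # should be close to k * sigma2
```

With k = 5 and σ² = 2 the average settles near 10, as the χ² mean predicts.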