MATH 466: Theory of Statistics


http://math.arizona.edu/~yueniu/yueniu/Math466_Fall12.html
Yue Selena Niu
yueniu@math.arizona.edu
Office: 520 Math Building
Course Outline:
1. Sampling Distributions and the Central Limit Theorem
2. Estimation and Confidence Intervals
3. Properties of Point Estimators and Estimation Methods
4. Hypothesis Testing
5. Linear Regression
7 Statistics, Sampling Distributions, and the Central Limit Theorem
7.1 Sampling Distributions
Definition: If n random variables (rvs) are independent and identically distributed, we refer to them as a random sample and denote them as i.i.d. rvs.
Example 7.1.1 Toss a coin n times. The results $X_1, \ldots, X_n$ are a random sample from Bin(1, p). Here $X_i = 1$ for Heads and $X_i = 0$ for Tails, and $p = \Pr(\text{Head})$.
Example 7.1.2 Take n = 100 measurements of the outdoor temperature at noon. The readings $X_1, \ldots, X_{100}$ can be regarded as a random sample from $N(\mu, \sigma^2)$, where $\mu$ is the true temperature and the measurement error is assumed to follow a normal distribution.
Definition: A statistic is a function of the observable random variables in a sample.
Examples of statistics:
- sample total $\sum_{i=1}^n X_i$
- sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$
- sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$
- order statistics $X_{(1)}, \ldots, X_{(n)}$
In 7.1.1, a commonly used statistic is the sample total $\sum_{i=1}^n X_i$.
In 7.1.2, a commonly used statistic is the sample mean $\bar{X} = n^{-1}\sum_{i=1}^n X_i$.
Note that $\bar{X} - \mu$ is not a statistic because it involves the unknown parameter $\mu$.
One of the goals of statistical theory is to estimate the unknown population parameters by statistics; e.g., $\mu$ can be estimated by $\bar{X}$.
Examples:
In 7.1.1, we want to use the data to estimate p.
In 7.1.2, we want to use the data to estimate $\mu$.
Definition: The probability distribution of a statistic (such as $\bar{X}$) is called the sampling distribution of that statistic.
Examples:
In 7.1.1, the sampling distribution of $\sum_{i=1}^n X_i$ is Bin(n, p).
In 7.1.2, the sampling distribution of $\bar{X}$ is $N(\mu, \sigma^2/100)$. (Details shown later.)
Sampling distributions can be used to express the uncertainty of the estimators.
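The 7.1.2 claim can be checked by simulation. The sketch below uses made-up values $\mu = 20$ and $\sigma = 3$ (the example does not specify them) and confirms that sample means of n = 100 readings scatter around $\mu$ with standard deviation $\sigma/\sqrt{100}$:

```python
# Simulation sketch: sampling distribution of the mean of n = 100 normal
# readings (mu = 20, sigma = 3 are illustrative, made-up values).
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 20.0, 3.0, 100, 20000

# Draw `reps` independent samples of size n and record each sample mean.
means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print(means.mean())       # close to mu = 20
print(means.std(ddof=1))  # close to sigma / sqrt(n) = 0.3
```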
7.2 Sampling Distributions Related to the Normal Distribution
Definition: The expression $U = a_1 X_1 + \cdots + a_n X_n$ is called a linear combination of the rvs $X_1, \ldots, X_n$.
It is important to understand the properties (e.g. mean, variance, sampling distribution) of linear combinations because many statistics are linear combinations, or functions of linear combinations.
Examples of linear combinations:
$\sum_{i=1}^n X_i = X_1 + \cdots + X_n$, $\quad \bar{X} = \frac{1}{n}X_1 + \cdots + \frac{1}{n}X_n$.
Sampling Dist. of Linear Combinations of Normal rvs
Recall Theorem 6.3: Let $X_1, \ldots, X_n$ be independent rvs with $X_i \sim N(\mu_i, \sigma_i^2)$, $i = 1, \ldots, n$, and let $a_1, \ldots, a_n$ be known constants. Then $U = a_1 X_1 + \cdots + a_n X_n \sim N(\mu_U, \sigma_U^2)$, where
$\mu_U = a_1\mu_1 + \cdots + a_n\mu_n, \quad \sigma_U^2 = a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2$.
Proof: Use the method of moment generating functions:
$m_{X_i}(t) = \exp\left(\mu_i t + \frac{\sigma_i^2 t^2}{2}\right)$,
and the fact that
$m_U(t) = m_{X_1}(a_1 t)\, m_{X_2}(a_2 t) \cdots m_{X_n}(a_n t)$.
Sampling Distribution of the Sample Mean of Normals
Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Apply Theorem 6.3 to determine the distribution of the sample mean $\bar{X} = (X_1 + \cdots + X_n)/n$.
Answer: $\bar{X} \sim N(\mu, \sigma^2/n)$.
Example 7.2.1 SAT scores of entering students at UofA follow the normal distribution with a mean of 575 and a standard deviation of 40. A random sample of 25 students is selected. Find the probability that the sample mean of the 25 SAT scores is less than 585.
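By the result above, $\bar{X} \sim N(575, 40^2/25)$, so Example 7.2.1 reduces to standardizing: $P(\bar{X} < 585) = \Phi(1.25)$. A sketch of the computation using scipy:

```python
# Example 7.2.1: X-bar ~ N(575, 40^2/25); standardize and use the normal cdf.
from math import sqrt
from scipy.stats import norm

mu, sigma, n = 575, 40, 25
z = (585 - mu) / (sigma / sqrt(n))   # (585 - 575) / 8 = 1.25
p = norm.cdf(z)
print(round(z, 2), round(p, 4))      # 1.25 0.8944
```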
Example 7.2.2 In Example 7.2.1, how many observations should be included in the sample if we wish the sample mean to differ from the population mean by no more than 10 with probability 0.95?
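Example 7.2.2 can be solved by requiring $z_{0.025}\,\sigma/\sqrt{n} \le 10$ and rounding up to the next integer; a sketch:

```python
# Example 7.2.2: need P(|X-bar - mu| <= 10) = 0.95 with sigma = 40,
# i.e. n >= (z_{0.025} * sigma / 10)^2.
from math import ceil
from scipy.stats import norm

sigma, margin, conf = 40, 10, 0.95
z = norm.ppf(1 - (1 - conf) / 2)     # z_{0.025}, about 1.96
n = ceil((z * sigma / margin) ** 2)  # smallest sample size that works
print(n)                             # 62
```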
Sampling Distribution of Other Linear Combinations of Normals
Note: Linear combinations of normal random variables are still normal.
Let $X_1, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$. Then
$U_1 = (X_1 + \cdots + X_n)/n \sim N(\mu, \sigma^2/n)$
$U_2 = X_1 + \cdots + X_n \sim N(n\mu, n\sigma^2)$
$U_3 = X_1 - X_2 \sim N(0, 2\sigma^2)$
$U_4 = \frac{X_1 - X_2}{\sqrt{2}\,\sigma} \sim N(0, 1)$
$U_5 = X_1 + X_2 + X_3 - 3X_4 \sim N(0, 12\sigma^2)$
Example 7.2.3 Suppose that random variables $Y_1, Y_2$ and $Y_3$ are a random sample from the normal distribution with $\mu = 0$ and $\sigma^2 = 1$. State the distribution with associated parameter values of each of the following functions of $Y_1, \ldots, Y_3$.
1. $U_1 = (Y_1 + 2Y_2)/3 - Y_3$
2. $U_2 = Y_1 + Y_2 + Y_3$
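Theorem 6.3 gives $U_1 \sim N(0, 1/9 + 4/9 + 1) = N(0, 14/9)$ and $U_2 \sim N(0, 3)$ for Example 7.2.3; a simulation sketch checking those variances:

```python
# Example 7.2.3 check: Y_i i.i.d. N(0,1); empirical variances of U1 and U2
# should match 14/9 and 3 from Theorem 6.3.
import numpy as np

rng = np.random.default_rng(1)
Y = rng.standard_normal((100000, 3))
U1 = (Y[:, 0] + 2 * Y[:, 1]) / 3 - Y[:, 2]
U2 = Y.sum(axis=1)
print(U1.var())   # close to 14/9, about 1.556
print(U2.var())   # close to 3
```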
Sampling Distribution of the Chi-square Statistic
The sum of squares of v independent standard normal rvs is a chi-square with v degrees of freedom. That is, if $Z_1, \ldots, Z_v$ are i.i.d. N(0, 1) random variables, then
$U = Z_1^2 + Z_2^2 + \cdots + Z_v^2$
follows $\chi^2_v$, the chi-square distribution with v degrees of freedom.
Proof: Use the method of moment generating functions:
$m_{Z_i^2}(t) = (1 - 2t)^{-1/2}$,
and the fact that
$m_U(t) = m_{Z_1^2}(t)\, m_{Z_2^2}(t) \cdots m_{Z_v^2}(t) = (1 - 2t)^{-v/2}$.
Recall: $\chi^2_v$ is the same as the Gamma distribution with $\alpha = v/2$ and $\beta = 2$:
$\chi^2_v = \text{Gamma}(\alpha = v/2, \beta = 2)$.
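The Gamma identity is easy to verify numerically; scipy parameterizes the Gamma by shape $\alpha$ and scale $\beta$, so the two densities should agree pointwise:

```python
# Check the identity chi^2_v = Gamma(alpha = v/2, beta = 2) numerically.
import numpy as np
from scipy.stats import chi2, gamma

v = 5
x = np.linspace(0.1, 20, 200)
same = np.allclose(chi2.pdf(x, df=v), gamma.pdf(x, a=v / 2, scale=2))
print(same)   # True
```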
Sampling Distribution of the Sample Variance
Let $X_1, \ldots, X_n$ be i.i.d. $N(\mu, \sigma^2)$ rvs, and define the sample mean
$\bar{X} = \frac{1}{n}(X_1 + \cdots + X_n)$
and the sample variance
$S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$.
Then
$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$,
i.e. it follows the chi-square distribution with $v = n - 1$ degrees of freedom. That is,
$\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma^2} \sim \chi^2_{n-1}$.
Proof (for n = 2):
First show that
$\frac{\sum_{i=1}^2 (X_i - \bar{X})^2}{\sigma^2} = \left(\frac{X_1 - X_2}{\sqrt{2}\,\sigma}\right)^2$.
What's the distribution of $\frac{X_1 - X_2}{\sqrt{2}\,\sigma}$?
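A simulation sketch of the n = 2 case (with made-up values $\mu = 10$, $\sigma = 2$): since $(X_1 - X_2)/(\sqrt{2}\,\sigma)$ is N(0, 1), the quantity $S^2/\sigma^2$ should behave like a $\chi^2_1$ rv:

```python
# Simulate (n-1)S^2/sigma^2 for n = 2 and compare with chi^2_1.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
mu, sigma = 10.0, 2.0                       # illustrative, made-up values
X = rng.normal(mu, sigma, size=(100000, 2))
W = X.var(axis=1, ddof=1) / sigma**2        # (n-1)S^2/sigma^2 with n = 2
print(W.mean())                             # close to E(chi^2_1) = 1
print(np.mean(W <= 1), chi2.cdf(1, df=1))   # empirical vs theoretical cdf
```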
Properties of the chi-square distribution
The percentage points for the chi-square distribution are tabulated in Table 6 of Appendix 3. Suppose $\chi^2$ is a rv following the chi-square distribution with v degrees of freedom. For a given $\alpha$, the table gives the value $\chi^2_{v,\alpha}$ that solves
$P(\chi^2 \ge \chi^2_{v,\alpha}) = \alpha$.
The value $\chi^2_{v,\alpha}$ is the $\alpha$th percentage point, and the $(1-\alpha)$th quantile, of the chi-square distribution with v degrees of freedom.
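In software, a Table 6 lookup corresponds to the upper-tail quantile function; for example, the familiar table entry $\chi^2_{10,\,0.05} \approx 18.307$:

```python
# chi^2_{v,alpha} solves P(chi^2 >= c) = alpha, so it is the (1-alpha)
# quantile: chi2.ppf(1 - alpha, v), or equivalently chi2.isf(alpha, v).
from scipy.stats import chi2

v, alpha = 10, 0.05
c = chi2.isf(alpha, df=v)   # upper 5% point of chi^2_10
print(round(c, 3))          # about 18.307, matching the table
print(chi2.sf(c, df=v))     # recovers alpha = 0.05
```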
Example 7.2.4 SAT scores of entering students at UofA follow the normal distribution with a mean of 575 and a standard deviation of 40. A random sample of 25 students is selected. Find the probability that the sample variance of the 25 SAT scores is less than 2200.
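Example 7.2.4 reduces to a chi-square probability via $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$: $P(S^2 < 2200) = P(\chi^2_{24} < 24 \cdot 2200/40^2)$. A sketch:

```python
# Example 7.2.4: convert the event {S^2 < 2200} into a chi^2_24 event.
from scipy.stats import chi2

n, sigma2 = 25, 40**2
c = (n - 1) * 2200 / sigma2     # 24 * 2200 / 1600 = 33.0
p = chi2.cdf(c, df=n - 1)       # P(chi^2_24 < 33)
print(round(c, 1), round(p, 3))
```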
If $X_1, \ldots, X_n$ are i.i.d. $N(\mu, \sigma^2)$ rvs, then $\bar{X}$ and $S^2$ are independent random variables.
Justification for the independence of $\bar{X}$ and $S^2$ (n = 2):
- $U_1 = X_1 + X_2$ and $U_2 = X_1 - X_2$ can be shown to be independent (see Example 6.13 in the WMS text).
- $\bar{X}$ is a function of $U_1$ only, and $S^2$ is a function of $U_2$ only, so $\bar{X}$ and $S^2$ are also independent.
Properties of $\bar{X}$ and $S^2$ for a $N(\mu, \sigma^2)$ random sample:
1. $\bar{X} \sim N(\mu, \sigma^2/n)$
2. $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$
3. $\bar{X}$ and $S^2$ are independent
For a formal proof, refer to Section 4.8 of Introduction to Mathematical Statistics by Hogg and Craig.
Example 7.2.5 Suppose that random variables $Y_1, Y_2$ and $Y_3$ are a random sample from the normal distribution with $\mu = 1$ and $\sigma^2 = 4$. Find the distribution of $U = \sum_{i=1}^3 (Y_i - \bar{Y})^2$, where $\bar{Y} = \frac{1}{3}\sum_{i=1}^3 Y_i$.
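Since $U/\sigma^2 = (n-1)S^2/\sigma^2 \sim \chi^2_2$ here, U is $\sigma^2 = 4$ times a $\chi^2_2$ rv, so in particular $E(U) = 4 \cdot 2 = 8$; a simulation sketch:

```python
# Example 7.2.5 check: U = sum (Y_i - Y-bar)^2 for Y_i i.i.d. N(1, 4)
# should have mean sigma^2 * E(chi^2_2) = 4 * 2 = 8.
import numpy as np

rng = np.random.default_rng(3)
Y = rng.normal(1.0, 2.0, size=(100000, 3))
U = ((Y - Y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
print(U.mean())   # close to 8
```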
Student's t-statistic
Definition: If $Z \sim N(0, 1)$ and $\chi^2_v$ is a chi-square rv with v degrees of freedom, and if Z and $\chi^2_v$ are independent, then the statistic
$t = t_v = \frac{Z}{\sqrt{\chi^2_v / v}}$
is a Student's t-statistic with v degrees of freedom, and it has the pdf
$f(t) = K\left(1 + \frac{t^2}{v}\right)^{-(v+1)/2}, \quad -\infty < t < +\infty,$
where
$K = \frac{\Gamma((v+1)/2)}{\sqrt{\pi v}\,\Gamma(v/2)}$.
See Exercise 7.98 for the derivation of the pdf.
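The pdf formula, including the normalizing constant K, can be checked against scipy's t density:

```python
# Compare the stated t pdf, f(t) = K (1 + t^2/v)^{-(v+1)/2}, with scipy.
import numpy as np
from scipy.special import gamma as G
from scipy.stats import t as tdist

v = 5
K = G((v + 1) / 2) / (np.sqrt(np.pi * v) * G(v / 2))
x = np.linspace(-4, 4, 100)
f = K * (1 + x**2 / v) ** (-(v + 1) / 2)
print(np.allclose(f, tdist.pdf(x, df=v)))  # True
```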
Properties of the t-distribution
- Like the standard normal, it is symmetric about 0.
- Centered at 0, with a shape similar to that of the normal.
- Has fatter tails and larger variation than the standard normal.
- As $v \to \infty$, $t_v \to N(0, 1)$.
The mean and variance of the t-distribution are:
$E(t_v) = 0$ if $v > 1$; $\quad V(t_v) = \frac{v}{v-2}$ if $v > 2$.
$t_1$ is the Cauchy(0, 1) distribution, which has no mean, variance, or higher moments defined.
A comparison of histograms of N(0, 1), $t_3$ and $t_{10}$.
Percentage points of the t-distribution
The percentage points for the t-distribution are tabulated in Table 5 of Appendix 3. For a given $\alpha$, the table gives the value $t_{v,\alpha}$ that solves
$P(t_v \ge t_{v,\alpha}) = \alpha$.
The value $t_{v,\alpha}$ is the $\alpha$th percentage point, and the $(1-\alpha)$th quantile, of the t-distribution with v degrees of freedom.
The t-statistic in Normal Samples
If $\bar{X}$ and $S^2$ are the sample mean and variance of a random sample from $N(\mu, \sigma^2)$, then
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1) \quad (1)$
$W = \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \quad (2)$
$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1} \quad (3)$
Verify (3)!
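One way to "verify (3)" empirically (with made-up values $\mu = 5$, $\sigma = 3$, $n = 6$): simulate many normal samples and compare a tail probability of the studentized mean with the $t_{n-1}$ tail:

```python
# Simulation check of (3): (X-bar - mu)/(S/sqrt(n)) should follow t_{n-1}
# regardless of mu and sigma.
import numpy as np
from scipy.stats import t as tdist

rng = np.random.default_rng(4)
mu, sigma, n = 5.0, 3.0, 6          # illustrative, made-up values
X = rng.normal(mu, sigma, size=(100000, n))
T = (X.mean(axis=1) - mu) / (X.std(axis=1, ddof=1) / np.sqrt(n))
# Empirical P(T > 2) versus the t_5 tail probability.
print(np.mean(T > 2), tdist.sf(2, df=n - 1))
```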
William Gosset, aka "Student"
William Gosset, who worked at the Guinness brewery, published an article in 1908 in Biometrika describing the t-statistic and its distribution. He published under the pseudonym "Student".
See: http://en.wikipedia.org/wiki/William_Gosset
Example 7.2.6 Suppose $T \sim t_5$.
1. Calculate P(T > 2)
2. Find c such that P(|T| < c) = 0.90
3. Find c such that P(T > c) = 0.25
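Table 5 gives the answers for Example 7.2.6; the same numbers come from scipy, where `sf` is the upper-tail probability and `isf` its inverse:

```python
# Example 7.2.6 with T ~ t_5.
from scipy.stats import t as tdist

p1 = tdist.sf(2, df=5)        # 1. P(T > 2)
c2 = tdist.ppf(0.95, df=5)    # 2. P(|T| < c) = 0.90 puts 5% in each tail
c3 = tdist.isf(0.25, df=5)    # 3. P(T > c) = 0.25
print(round(p1, 4), round(c2, 3), round(c3, 3))
```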
F Distribution: Definition
Let $W_1$ and $W_2$ be independent chi-square random variables with $v_1$ and $v_2$ degrees of freedom, respectively. Then the statistic
$F = \frac{W_1/v_1}{W_2/v_2}$
is said to follow the F distribution with $v_1$ and $v_2$ df. Here $v_1$ is called the numerator df and $v_2$ the denominator df. We denote $F \sim F(v_1, v_2)$.
The pdf of the F Distribution
The pdf for $F(v_1, v_2)$ can be derived:
$f_F(x) = \frac{\Gamma\left(\frac{v_1+v_2}{2}\right)}{\Gamma(v_1/2)\,\Gamma(v_2/2)} \left(\frac{v_1}{v_2}\right)^{v_1/2} x^{v_1/2 - 1} \left(1 + \frac{v_1}{v_2}x\right)^{-(v_1+v_2)/2}$
for $0 < x < +\infty$, where $v_1$ and $v_2$ are positive integers.
F Distribution: Percentage Points
The percentage points for the F distribution are tabulated in Table 7 of Appendix 3. For given values of $v_1$, $v_2$ and $\alpha$, the table gives the value $F_{v_1,v_2,\alpha}$ that solves
$P(F_{v_1,v_2} \ge F_{v_1,v_2,\alpha}) = \alpha$.
Note that $F_{v_1,v_2}$ represents an F rv with $v_1$ and $v_2$ df, and $F_{v_1,v_2,\alpha}$ denotes the $\alpha$th percentage point of the F distribution with $v_1, v_2$ df.
Histogram of the F distribution, and the percentage point.
F Distribution
For $v_2 > 2$,
$E(F_{v_1,v_2}) = \frac{v_2}{v_2 - 2}$.
One useful conclusion: if $F \sim F(v_1, v_2)$, then $1/F \sim F(v_2, v_1)$. Why?
Percentage point relationship (extends the tables):
$F_{v_1,v_2,1-\alpha} = \frac{1}{F_{v_2,v_1,\alpha}}$
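The reciprocal relationship is what lets the tables, which list only small values of $\alpha$, cover lower-tail percentage points as well; it is easy to confirm numerically:

```python
# Check F_{v1,v2,1-alpha} = 1 / F_{v2,v1,alpha} with scipy.
import numpy as np
from scipy.stats import f

v1, v2, alpha = 5, 10, 0.05
left = f.isf(1 - alpha, dfn=v1, dfd=v2)    # F_{v1,v2,1-alpha}
right = 1 / f.isf(alpha, dfn=v2, dfd=v1)   # 1 / F_{v2,v1,alpha}
print(np.isclose(left, right))             # True
```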
F Distribution: Connection to the Normal
Notice that if we have two independent samples of sizes $n_1$ and $n_2$, respectively, from normal distributions with a common variance, then $S_1^2/S_2^2$ has an F distribution with $(n_1 - 1)$ numerator df and $(n_2 - 1)$ denominator df.
More formally, suppose $X_{11}, \ldots, X_{1n_1}$ and $X_{21}, \ldots, X_{2n_2}$ are independent random samples from $N(\mu_1, \sigma^2)$ and $N(\mu_2, \sigma^2)$, respectively. Let
$S_i^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2, \quad i = 1, 2.$
Then
$W_i = \frac{(n_i - 1)S_i^2}{\sigma^2} \sim \chi^2_{n_i - 1}, \quad i = 1, 2,$
and $W_1$ and $W_2$ are independent. From these we can form
$F = \frac{\chi^2_{n_1-1}/(n_1 - 1)}{\chi^2_{n_2-1}/(n_2 - 1)} = \frac{W_1/(n_1 - 1)}{W_2/(n_2 - 1)} = \frac{(n_1 - 1)S_1^2/[\sigma^2(n_1 - 1)]}{(n_2 - 1)S_2^2/[\sigma^2(n_2 - 1)]} = \frac{S_1^2}{S_2^2}.$
Therefore, under the stated conditions, $S_1^2/S_2^2$ follows the F distribution with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ df.
Example 7.2.7 Compute $P(S_1 \ge cS_2)$ for $c^2 = 2$, $n_1 = 11$, $n_2 = 18$, using Appendix 3.
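Under a common variance, $S_1^2/S_2^2 \sim F$ with $(n_1 - 1, n_2 - 1) = (10, 17)$ df, so $P(S_1 \ge cS_2) = P(S_1^2/S_2^2 \ge c^2) = P(F_{10,17} \ge 2)$; a sketch of the computation:

```python
# Example 7.2.7: reduce P(S_1 >= c S_2) to an upper-tail F probability.
from scipy.stats import f

c2, n1, n2 = 2, 11, 18
p = f.sf(c2, dfn=n1 - 1, dfd=n2 - 1)   # P(F_{10,17} >= 2)
print(round(p, 3))
```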
Sir Ronald Fisher (1890-1962)
The F stands for Fisher. Sir Ronald Fisher was a British geneticist and
statistician who is often referred to as the father of modern statistics.
http://en.wikipedia.org/wiki/Ronald_Fisher
The F statistic is used to test important hypotheses in the analysis of
variance and in regression analysis.
Review for normal samples
Suppose $X_1, \ldots, X_n$ i.i.d. $N(\mu_x, \sigma_x^2)$ and $Y_1, \ldots, Y_m$ i.i.d. $N(\mu_y, \sigma_y^2)$ are independent. Then we have:
- $\bar{X} \sim N(\mu_x, \sigma_x^2/n)$
- $(n-1)S_x^2/\sigma_x^2 \sim \chi^2_{n-1}$, where $S_x^2 = (n-1)^{-1}\sum_{i=1}^n (X_i - \bar{X})^2$
- $\bar{X}$ and $S_x^2$ are independent
- $T = \frac{\bar{X} - \mu_x}{S_x/\sqrt{n}} \sim t_{n-1}$
- $F = \frac{S_x^2\,\sigma_y^2}{S_y^2\,\sigma_x^2} \sim F_{n-1,\,m-1}$
Example 7.2.8 Suppose that $Y_1, Y_2, Y_3$ are independent standard normal random variables. State the distribution with associated parameter values of each of the following functions of $Y_1, Y_2$ and $Y_3$.
1. $U_1 = \bar{Y}$
2. $U_2 = Y_1^2 + Y_2^2 + Y_3^2$
3. $U_3 = (Y_1 + Y_2)/\sqrt{Y_3^2}$
4. $U_4 = Y_1/\sqrt{0.5(Y_2^2 + Y_3^2)}$
5. $U_5 = \frac{2Y_1^2}{Y_2^2 + Y_3^2}$
7.3 Central Limit Theorem
Central Limit Theorem: Assume that $X_1, \ldots, X_n$ are i.i.d. rvs with finite mean $\mu$ and variance $\sigma^2$, with $\sigma^2 < +\infty$. Then as $n \to +\infty$, the distribution of
$U_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
approaches that of the standard normal N(0, 1).
Thus, for large n we may use the approximation
$P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le t\right) \approx P(Z \le t),$
where Z is a standard normal rv. The approximation improves as n increases.
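The CLT is easy to watch happen in a simulation; the sketch below uses Exponential(1), a conveniently skewed distribution whose mean and standard deviation are both 1, and compares an empirical probability for the standardized mean with the normal limit as n grows:

```python
# CLT demonstration: standardized means of Exponential(1) samples
# approach the standard normal as n grows.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
mu = sigma = 1.0                    # Exponential(1): mean = sd = 1
for n in (5, 30, 200):
    X = rng.exponential(1.0, size=(20000, n))
    U = (X.mean(axis=1) - mu) / (sigma / np.sqrt(n))
    # Empirical P(U_n <= 1) versus the limiting P(Z <= 1).
    print(n, round(np.mean(U <= 1.0), 3), round(norm.cdf(1.0), 3))
```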
When the CLT holds, we say $\bar{X}$ is asymptotically normal (AN) with mean $\mu$ and variance $\frac{\sigma^2}{n}$, i.e.
$\bar{X} = AN\left(\mu, \frac{\sigma^2}{n}\right).$
Similarly, when the CLT holds, we say $\sum_{i=1}^n X_i = n\bar{X}$ is asymptotically normal (AN) with mean $n\mu$ and variance $n\sigma^2$, i.e.
$\sum_{i=1}^n X_i = AN(n\mu, n\sigma^2).$
Figure 1: In this simulation experiment, random samples of size n (= 1, 10, 20, 30, 40, 100) were simulated from a Uniform(0, 1) distribution and the sample mean $\bar{x}$ was calculated. The histograms are based on the 1000 simulated sample means. Normality is achieved with n = 10!
Figure 2: In this simulation experiment, random samples of size n (= 1, 10, 20, 30, 40, 100) were simulated from an Exponential(1) distribution and the sample mean $\bar{x}$ was calculated. The histograms are based on the 1000 simulated sample means. Normality is achieved with n = 20!
Figure 3: In this simulation experiment, random samples of size n (= 1, 10, 20, 30, 40, 100) were simulated from a $t_1$ = Cauchy(0, 1) distribution and the sample mean $\bar{x}$ was calculated. The histograms are based on the 1000 simulated sample means. Clearly the CLT fails here!! Why?
Example 7.3.1 Let $X_1, \ldots, X_n$ be a random sample of size n of inter-arrival times between calls to a switchboard. It is known that the $X_i$'s are exponentially distributed with a mean inter-arrival time of 2 seconds. Find the probability that the sample mean of 36 observations will be less than 2.1.
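For Example 7.3.1, the exponential with mean 2 also has $\sigma = 2$, so by the CLT $\bar{X}$ is approximately $N(2, 4/36)$ and $P(\bar{X} < 2.1) \approx \Phi(0.3)$; a sketch:

```python
# Example 7.3.1 via the CLT: exponential mean = sd = 2, n = 36.
from math import sqrt
from scipy.stats import norm

mu = sigma = 2.0
n = 36
z = (2.1 - mu) / (sigma / sqrt(n))          # 0.1 / (2/6) = 0.3
print(round(z, 2), round(norm.cdf(z), 4))   # 0.3 0.6179
```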
Continuity Correction: Approximate Distribution of a Discrete rv
Suppose that $X = AN(\mu, \sigma^2)$ but is discrete, and suppose X is measured to the nearest whole unit. For example, suppose X = weight of female patients measured to the nearest lb, and suppose $X = AN(\mu = 125, \sigma^2 = 25)$. Then
$P(X \le 130) = P(X < 131) \approx P\left(Z \le \frac{130.5 - 125}{5}\right) = P(Z \le 1.1) = 0.8643,$
where 0.5 is added to 130 as a correction for continuity. A continuity correction can also be applied when other discrete distributions supported on the integers are approximated by the normal distribution.
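The weight example above, computed directly, with the 0.5 continuity correction applied before standardizing:

```python
# Continuity correction: X ~ AN(125, 25) recorded to whole pounds, so
# P(X <= 130) is approximated by Phi((130.5 - 125)/5) = Phi(1.1).
from scipy.stats import norm

mu, sigma = 125, 5
p = norm.cdf((130 + 0.5 - mu) / sigma)   # add 0.5 for continuity
print(round(p, 4))                       # 0.8643
```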
Approximating the Binomial with the Normal
Suppose $Y \sim \text{Binomial}(n, p)$. Let
$X_i = 1$ if trial i is a success, and $X_i = 0$ otherwise.
Then $Y = X_1 + \cdots + X_n$. Furthermore, $X_1, \ldots, X_n$ are independent and $X_i \sim \text{Binomial}(1, p)$, $i = 1, \ldots, n$. Thus, $E(X_i) = p$ and $V(X_i) = p(1-p) \le 0.25 < +\infty$. By the CLT,
$\bar{X} = AN(p, p(1-p)/n)$
and
$Y = n\bar{X} = AN(np, np(1-p)).$
Therefore, if n is large, with the continuity correction,
$P(Y \le y) \approx P\left(Z \le \frac{(y + 0.5) - np}{\sqrt{np(1-p)}}\right).$
Criteria for Approximating Binomial(n, p) with N(np, np(1-p))
The approximation is acceptable when
$0 < p - 3\sqrt{p(1-p)/n}$ and $p + 3\sqrt{p(1-p)/n} < 1$
both hold. These hold when n is moderately large and p is not near 0 or 1. In some other texts, the following criteria are used: $np \ge 10$ and $n(1-p) \ge 10$.
Example 7.3.2 Suppose that $Y \sim \text{Binomial}(50, 0.25)$; calculate $P(Y \le 10)$ with the normal approximation.
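For Example 7.3.2, $np = 12.5$ and $np(1-p) = 9.375$, so with the continuity correction $P(Y \le 10) \approx \Phi((10.5 - 12.5)/\sqrt{9.375})$; a sketch that also compares against the exact binomial probability:

```python
# Example 7.3.2: normal approximation (with continuity correction)
# versus the exact binomial cdf.
from math import sqrt
from scipy.stats import binom, norm

n, p, y = 50, 0.25, 10
approx = norm.cdf((y + 0.5 - n * p) / sqrt(n * p * (1 - p)))
exact = binom.cdf(y, n, p)
print(round(approx, 4), round(exact, 4))   # approximation vs exact
```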