Chi-Kong Ng, ENGG2780B, Dept. of SEEM, CUHK
Chapter 2.
Sampling Distributions
2.1. Introduction
2.1.1. Populations and Samples
• A population consists of the totality of the observations with which
we are concerned.
• When the population is too large to study in its entirety, or when the
techniques used in the study are destructive in nature, we must depend
on a subset or “sample” of observations from the population to help us
make inferences concerning that same population.
2.1.2. Random Sampling

• To eliminate any possibility of bias in the sampling procedure, it is
desirable to choose a random sample.
• If X1, X2, . . . , Xn are independent and identically distributed (IID)
random variables, we say that they constitute a random sample
of size n from the infinite population given by their common
probability distribution/density function f(x). Note that this
definition also applies to sampling with replacement from finite
populations.
• We say that the random variables X1, X2, . . . , Xn constitute a
random sample of size n from a finite population of size N if their
values are chosen so that each subset of n of the N elements of the
population has the same probability of being selected.
• Before a random sample of size n is selected, the observations are
modeled as the random variables X1, X2, . . . , Xn.
We also apply the term random sample to the set of observed values
x1, x2, . . . , xn of the random variables.
The lower case distinguishes the realization of a random sample
from the upper case, which represents the random variables before
they are observed.
2.1.3. Statistics and Sampling Distributions

• Statistical inferences are usually based on statistics.
• A statistic is a random variable which is a function of a random
sample X1, X2, . . . , Xn.
• Distributions of statistics are referred to as sampling distributions.
2.2. The Sample Mean

• If X1, X2, . . . , Xn constitute a random sample, then the statistic
\[ \bar{X} = \frac{1}{n} \sum_{j=1}^{n} X_j \]
is called the sample mean or the mean of the random sample.

Exercise 2.1. A random sample of size 9 yields the following
observations on the random variable X, the coal consumption in millions
of tons by electric utilities for a given year:
406, 395, 400, 450, 390, 410, 415, 401, 408.
Find the sample mean of these data.
2.2.1. Sampling from an Infinite Population

Theorem 2.1. If X̄ is the mean of a random sample of size n from
an infinite population with the mean µ and the variance σ², then
\[ E(\bar{X}) = \mu_{\bar{X}} = \mu \quad\text{and}\quad \operatorname{Var}(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}, \]
and $\sigma_{\bar{X}}$ (the positive square root of $\sigma_{\bar{X}}^2$) is called the standard error of
the mean.
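Both parts of Theorem 2.1 follow in a few lines from the linearity of expectation and the independence of the Xj. The following is a sketch of the standard argument, not a proof taken from these notes:
\[
E(\bar{X}) = \frac{1}{n}\sum_{j=1}^{n} E(X_j) = \frac{1}{n}\cdot n\mu = \mu,
\qquad
\operatorname{Var}(\bar{X}) = \frac{1}{n^2}\sum_{j=1}^{n}\operatorname{Var}(X_j) = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}.
\]
The variance step relies on independence, which makes the variance of the sum equal to the sum of the variances.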
Theorem 2.2 (Chebyshev’s Theorem for Sampling Distribution). If X̄
is the mean of a random sample of size n from an infinite population
with the mean µ and the variance σ², then for any positive constant c,
\[ P(|\bar{X} - \mu| \ge c) \le \frac{\sigma^2}{nc^2}, \]
or equivalently,
\[ P(|\bar{X} - \mu| < c) \ge 1 - \frac{\sigma^2}{nc^2}. \]
Theorem 2.3 (The Law of Large Numbers). If X̄ is the mean of a
random sample of size n from an infinite population with the mean µ
and the variance σ², then for any positive constant c,
\[ P(|\bar{X} - \mu| \ge c) \to 0 \quad\text{as } n \to \infty. \]
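Theorem 2.3 follows from Theorem 2.2, because the Chebyshev bound σ²/(nc²) shrinks to zero as n grows with σ and c fixed. The Python sketch below tabulates that bound. The function name `chebyshev_bound` and the values σ = 30 and c = 10 are illustrative assumptions, not taken from the notes.

```python
def chebyshev_bound(sigma: float, n: int, c: float) -> float:
    """Upper bound sigma^2 / (n c^2) on P(|X-bar - mu| >= c) from Theorem 2.2."""
    if sigma < 0 or n <= 0 or c <= 0:
        raise ValueError("need sigma >= 0, n > 0 and c > 0")
    # A probability never exceeds 1, so cap the bound there.
    return min(1.0, sigma**2 / (n * c**2))


# Hypothetical values, chosen only for illustration.
sigma, c = 30.0, 10.0
bounds = {n: chebyshev_bound(sigma, n, c) for n in (9, 36, 144, 576)}
for n, bound in bounds.items():
    print(f"n = {n:4d}: P(|X-bar - mu| >= {c}) <= {bound:.4f}")
```

Each fourfold increase in n cuts the bound by a factor of four. The bound is usually far from tight, so it guarantees convergence without giving a good estimate of the actual probability.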
Theorem 2.4 (Central Limit Theorem). If X̄ is the mean of a random
sample of size n from an infinite population with the mean µ and the
variance σ², then the statistic (called the standardized sample mean)
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
is a random variable whose probability density function approaches that
of the standard normal distribution as n → ∞.
Remark: Although the Central Limit Theorem will work well for small
samples in most cases, particularly where the population is continuous,
unimodal, and symmetric, larger samples (depending on the shape of
the population) will be required in other situations. In many cases
of practical interest, if n ≥ 30, the normal approximation will be
satisfactory regardless of the shape of the population.
[Figure: An illustration of the approach toward normality for the sampling distribution of X̄ as the sample size increases.]
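The approach toward normality can also be seen in a small simulation. The sketch below assumes an exponential population with µ = σ = 1. This choice is not from the notes and was picked only because it is strongly skewed. For a standard normal variable P(Z ≤ 0) = 0.5, so the simulated fraction of standardized sample means at or below zero should drift toward 0.5 as n increases.

```python
import random
import statistics


def standardized_means(n: int, reps: int, rng: random.Random) -> list[float]:
    """Draw `reps` samples of size n from an Exp(1) population (mu = sigma = 1)
    and return the standardized sample means (x-bar - mu) / (sigma / sqrt(n))."""
    mu = sigma = 1.0
    se = sigma / n**0.5
    return [
        (statistics.fmean(rng.expovariate(1.0) for _ in range(n)) - mu) / se
        for _ in range(reps)
    ]


rng = random.Random(2780)  # fixed seed so the run is repeatable
below_zero = {}
for n in (1, 5, 30):
    z = standardized_means(n, 20_000, rng)
    below_zero[n] = sum(v <= 0 for v in z) / len(z)
    print(f"n = {n:2d}: fraction of Z <= 0 is {below_zero[n]:.3f} (standard normal: 0.500)")
```

For n = 1 the fraction should be close to 1 − e⁻¹ ≈ 0.632, the probability that an Exp(1) variable falls below its mean. The gap to 0.5 narrows as n grows, but for a population this skewed it is still visible at n = 30.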
Example 2.2. If a 1-gallon can of paint covers on the average
513.3 square feet with a standard deviation of 30 square feet, what
is the probability that the sample mean area covered by a sample of 36
of these 1-gallon cans will be anywhere from 510 to 520 square feet?
Solution:
\begin{align*}
P(510 < \bar{X} < 520)
&= P\left(\frac{510 - 513.3}{30/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{520 - 513.3}{30/\sqrt{36}}\right) \\
&= P(-0.66 < Z < 1.34) \\
&\approx \Phi(1.34) - \Phi(-0.66) \\
&\approx 0.9099 - 0.2546 \\
&= 0.6553.
\end{align*}
✷
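The arithmetic in Example 2.2 can be reproduced without a normal table. The sketch below uses `statistics.NormalDist` from the Python standard library and treats X̄ as normal with mean µ and standard deviation σ/√n, which is the Central Limit Theorem approximation used in the solution. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 513.3, 30.0, 36          # values given in Example 2.2
standard_error = sigma / sqrt(n)        # sigma / sqrt(n) = 5
sampling_dist = NormalDist(mu, standard_error)  # CLT approximation for X-bar

prob = sampling_dist.cdf(520) - sampling_dist.cdf(510)
print(f"P(510 < X-bar < 520) is approximately {prob:.4f}")
```

The result should agree with the table-based value 0.6553 to about three decimal places. Small differences can arise because the table lookup rounds the z-values to two decimals.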
Exercise 2.3. The lifetime of a special type of battery is a random
variable with mean 40 hours and standard deviation 20 hours.
A battery is used until it fails, at which point it is replaced by a new
one.
Assuming a stockpile of 36 such batteries whose lifetimes are
independent, approximate the probability that more than 1560 hours of
use can be obtained.
2.2.2. Sampling from a Finite Population

Theorem 2.5. If X̄ is the mean of a random sample of size n from a
finite population of size N with the mean µ and the variance σ², then
\[ E(\bar{X}) = \mu_{\bar{X}} = \mu \quad\text{and}\quad \operatorname{Var}(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}. \]
• It is of interest to note that the formulas for Var(X̄) for a random
sample of size n from an infinite population and from a finite
population of size N differ only by the finite population
correction factor
\[ \frac{N - n}{N - 1}. \]
• Indeed, when N is large compared to n, then
\[ \frac{N - n}{N - 1} \to 1 \quad\text{and thus}\quad \operatorname{Var}(\bar{X}) \to \frac{\sigma^2}{n}. \]
A general rule of thumb is to use this approximation when
n ≤ N/20.
• In practice, we often deal with random samples from populations
that are finite, but large enough to be treated as if they were infinite.
Thus, most statistical theory and most of the methods we shall
discuss apply to samples from infinite populations.
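To get a feel for the size of the finite population correction factor, the Python sketch below evaluates Var(X̄) with and without it. The function name `var_of_sample_mean` and the numbers σ² = 900 and n = 50 are assumptions made for illustration only.

```python
from typing import Optional


def var_of_sample_mean(sigma2: float, n: int, N: Optional[int] = None) -> float:
    """Var(X-bar): sigma^2/n for an infinite population (Theorem 2.1), times
    (N - n)/(N - 1) for a finite population of size N (Theorem 2.5)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if N is None:                       # infinite population
        return sigma2 / n
    if N < 2 or n > N:
        raise ValueError("need N >= 2 and n <= N")
    return (sigma2 / n) * (N - n) / (N - 1)


# Hypothetical numbers: sigma^2 = 900 and n = 50, for several population sizes N.
sigma2, n = 900.0, 50
for N in (100, 1_000, 100_000):
    exact = var_of_sample_mean(sigma2, n, N)
    approx = var_of_sample_mean(sigma2, n)
    print(f"N = {N:6d}: n <= N/20 is {n <= N / 20}; finite = {exact:.3f}, infinite = {approx:.3f}")
```

At the boundary of the rule of thumb, n = N/20, the correction factor is roughly 0.95, so ignoring it overstates the variance by about 5 percent.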
2.3. The Sample Variance

• Suppose that X1, X2, . . . , Xn constitute a random sample of size n.
Let X̄ be the sample mean. The statistic
\[ S^2 = \frac{1}{n-1} \sum_{j=1}^{n} (X_j - \bar{X})^2 \]
is called the sample variance (or the variance of the random sample).
• The statistic S (the positive square root of S²) is called the
sample standard deviation (or the standard deviation of the random
sample).
Theorem 2.6. Suppose that X1, X2, . . . , Xn constitute a random
sample of size n from an infinite population which has the mean µ
and the variance σ². If S² is the sample variance, then
\[ E(S^2) = \sigma^2. \]
Theorem 2.7. Suppose that X1, X2, . . . , Xn constitute a random
sample of size n from an infinite population which has the mean µ
and the variance σ². If S² is the sample variance, then
\[ S^2 = \frac{1}{n-1} \left( \sum_{j=1}^{n} X_j^2 - n\bar{X}^2 \right). \]
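The identity in Theorem 2.7 comes from expanding the square in the definition of the sample variance. A short sketch of the algebra, using only the fact that the Xj sum to nX̄:
\[
\sum_{j=1}^{n} (X_j - \bar{X})^2
= \sum_{j=1}^{n} X_j^2 - 2\bar{X}\sum_{j=1}^{n} X_j + n\bar{X}^2
= \sum_{j=1}^{n} X_j^2 - 2n\bar{X}^2 + n\bar{X}^2
= \sum_{j=1}^{n} X_j^2 - n\bar{X}^2.
\]
Dividing both sides by n − 1 gives the stated formula.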
Exercise 2.4. A random sample of size 9 yields the following
observations on the random variable X, the coal consumption in millions
of tons by electric utilities for a given year:
406, 395, 400, 450, 390, 410, 415, 401, 408.
Using the result of Exercise 2.1, find the sample standard
deviation of these data.
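A hand computation for these data can be checked numerically. The Python sketch below evaluates the sample variance both from the definition and from the formula in Theorem 2.7, and compares the result with `statistics.variance` from the standard library, which also divides by n − 1. The variable names are illustrative.

```python
import math
import statistics

# Coal consumption data from Exercises 2.1 and 2.4 (millions of tons).
x = [406, 395, 400, 450, 390, 410, 415, 401, 408]
n = len(x)

x_bar = sum(x) / n                                        # sample mean
s2_definition = sum((xj - x_bar) ** 2 for xj in x) / (n - 1)
s2_shortcut = (sum(xj**2 for xj in x) - n * x_bar**2) / (n - 1)   # Theorem 2.7
s = math.sqrt(s2_definition)                              # sample standard deviation

print(f"mean = {x_bar:.3f}, S^2 = {s2_definition:.3f}, S = {s:.3f}")
# The standard library uses the same n - 1 divisor.
print(math.isclose(s2_definition, statistics.variance(x)))
print(math.isclose(s2_definition, s2_shortcut))
```

In floating-point arithmetic the formula of Theorem 2.7 subtracts two large, nearly equal numbers, so the definition is generally the safer choice when the data are large relative to their spread.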
2.4. Sampling Distributions from a Normal Population
2.4.1. Independence
Theorem 2.8. If X̄ and S² are the mean and the variance of a random
sample from a normal population, then X̄ and S² are independent.
2.4.2. The Sampling Distribution of the Mean

Theorem 2.9. If X̄ is the mean of a random sample of size n from
the normal population with the mean µ and the variance σ², then the
standardized sample mean
\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
has the standard normal distribution, no matter how small the size
of the sample.
Exercise 2.5. An electrical firm manufactures light bulbs that have a
length of life that is approximately normally distributed, with mean
equal to 800 hours and a standard deviation of 40 hours.
Find the probability that a random sample of 16 bulbs will have an
average life of less than 775 hours.
2.4.3. The t Distribution and its Applications in Sampling
The t Distribution
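• A random variable T has the t distribution (also called the Student-t
distribution or Student’s t distribution) with ν degrees of
freedom, and it is referred to as a t random variable, if and only if
its probability density function is
\[ f(t) = \frac{1}{\sqrt{\pi\nu}} \cdot \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)} \cdot \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2} \quad\text{for } -\infty < t < \infty, \]
where ν is a positive integer and
\[ \Gamma(\alpha) = \int_0^\infty y^{\alpha-1} e^{-y}\,dy \quad\text{for } \alpha > 0. \]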
• Since the t distribution arises in many important applications,
integrals of its probability density function have been extensively
tabulated.
• Let $t_{\alpha,\nu}$ be such that the area to its right under the curve of the
t distribution with ν degrees of freedom is equal to α. That is, $t_{\alpha,\nu}$
is such that $P(T \ge t_{\alpha,\nu}) = \alpha$.
[Figure: The $t_{\alpha,\nu}$ notation.]

• Since the probability density function of the t distribution is
symmetrical about t = 0, we have
\[ -t_{\alpha,\nu} = t_{1-\alpha,\nu} \quad\text{and}\quad P(T < -t_{\alpha,\nu}) = \alpha. \]
Example 2.6.
\[ t_{0.05,10} = 1.812 \quad\text{and}\quad t_{0.95,10} = -1.812. \]
✷
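The tabulated value in Example 2.6 can be checked directly against the t density. The Python sketch below integrates the density with a composite Simpson's rule. The helper names `t_pdf`, `simpson` and `t_right_tail` are illustrative and do not come from the notes. The right-tail area beyond 1.812 with 10 degrees of freedom should come out close to 0.05.

```python
import math


def t_pdf(t: float, nu: int) -> float:
    """Density of the t distribution with nu degrees of freedom."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))
    return coef * (1 + t * t / nu) ** (-(nu + 1) / 2)


def simpson(f, a: float, b: float, steps: int = 2000) -> float:
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def t_right_tail(t: float, nu: int) -> float:
    """P(T >= t) for t >= 0, using symmetry: 0.5 minus the area over [0, t]."""
    return 0.5 - simpson(lambda u: t_pdf(u, nu), 0.0, t)


alpha = t_right_tail(1.812, 10)          # tabulated t_{0.05,10} from Example 2.6
print(f"P(T >= 1.812) with 10 degrees of freedom: {alpha:.4f}")
```

Integrating over the finite interval [0, t] and subtracting from 0.5 avoids truncating the infinite tail, and it relies on the same symmetry about t = 0 that gives the second value in the example.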
Example 2.7. Find $P(-t_{0.025} < T < t_{0.05})$, for any degrees of
freedom ν.
Solution: For any degrees of freedom ν, we have
\[ P(T > -t_{0.025}) = P(T > t_{1-0.025}) = 1 - 0.025 = 0.975, \qquad P(T > t_{0.05}) = 0.05. \]
Thus
\[ P(-t_{0.025} < T < t_{0.05}) = P(T > -t_{0.025}) - P(T > t_{0.05}) = 0.975 - 0.05 = 0.925. \]
✷
Applications of the t Distribution in Sampling

Theorem 2.10. If X̄ and S are the mean and the standard deviation
of a random sample of size n from a normal population with the mean µ
(but unknown variance), then the statistic
\[ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has the t distribution with (n − 1) degrees of freedom.
Theorem 2.11. The t distribution with ν degrees of freedom
approaches the standard normal distribution as ν → ∞.
Remark: The standard normal distribution provides a good approximation
to the t distribution for samples of size n ≥ 30.
Example 2.8. In 16 one-hour test runs, the gasoline consumption of an
engine averaged 16.4 liters with a standard deviation of 2.1 liters.
Assume that the distribution of gasoline consumption is approximately
normal. Test the claim that the average gasoline consumption
of this engine is 12.0 liters per hour.
Solution: Degrees of freedom = n − 1 = 15.
Now, the value of the statistic is
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{16.4 - 12.0}{2.1/\sqrt{16}} = 8.38. \]
Since $t_{0.005,15} = 2.947$, that is,
\[ P(T > 2.947) = 0.005, \]
the probability P(T > 8.38) is far smaller than 0.005 and is practically zero.
Therefore, it would seem reasonable to conclude that the true average
hourly gasoline consumption of the engine exceeds 12.0 liters. ✷
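The t-value in Example 2.8 is a one-line computation. The Python sketch below recomputes it and compares it with the tabulated critical value 2.947 quoted in the solution. The variable names are illustrative.

```python
from math import sqrt

n, x_bar, s = 16, 16.4, 2.1      # sample size, mean and standard deviation from Example 2.8
mu_0 = 12.0                      # claimed mean hourly consumption
t_critical = 2.947               # tabulated t_{0.005,15} quoted in the solution

t_value = (x_bar - mu_0) / (s / sqrt(n))
print(f"t = {t_value:.2f} with {n - 1} degrees of freedom")
print("beyond t_{0.005,15}:", t_value > t_critical)
```

A value this far beyond the critical value would be very unlikely if the true mean were 12.0 liters per hour, which is the basis for the conclusion in the example.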
Exercise 2.9. A chemical engineer claims that the population mean
yield of a certain batch process is 500 grams per milliliter of raw
material.
To check this claim he samples 25 batches each month.
If the computed t-value falls between $-t_{0.025}$ and $t_{0.025}$, he is satisfied
with his claim.
Assuming the distribution of yields to be approximately normal, what
conclusion should he draw from a sample that has a mean of 518 grams
per milliliter and a sample standard deviation of 40 grams?
2.4.4. The Chi-Square Distribution and its Applications in Sampling
The Chi-Square Distribution
• A random variable χ² has a chi-square distribution, and it is referred
to as a chi-square random variable, if and only if its probability
density function is
\[ f(\chi^2) = \text{chi-square}(\chi^2; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\,(\chi^2)^{(\nu-2)/2}\, e^{-\chi^2/2}, & \text{for } \chi^2 > 0, \\ 0, & \text{elsewhere,} \end{cases} \]
where ν is a positive integer.
• The parameter ν is referred to as the number of degrees of freedom,
or simply the degrees of freedom.
• Since the chi-square distribution arises in many important
applications, integrals of its probability density function have been
extensively tabulated.
• Let $\chi^2_{\alpha,\nu}$ be such that the area to its right under the curve of the
chi-square distribution with ν degrees of freedom is equal to α.
That is, $\chi^2_{\alpha,\nu}$ is such that $P(\chi^2 \ge \chi^2_{\alpha,\nu}) = \alpha$.
[Figure: The $\chi^2_{\alpha,\nu}$ notation.]

Example 2.10.
\[ \chi^2_{0.05,10} = 18.307 \quad\text{and}\quad \chi^2_{0.95,10} = 3.940. \]
✷
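The two table values in Example 2.10 can be checked by integrating the standard chi-square density numerically. The Python sketch below uses a plain midpoint rule. The helper names `chi_square_pdf` and `right_tail` are illustrative. This crude integration is adequate for ν = 10, but it is not a general-purpose routine: for ν = 1 the density is unbounded near zero and the rule would lose accuracy.

```python
import math


def chi_square_pdf(x: float, nu: int) -> float:
    """Chi-square density with nu degrees of freedom (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    return x ** ((nu - 2) / 2) * math.exp(-x / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))


def right_tail(x: float, nu: int, steps: int = 20_000) -> float:
    """P(chi-square >= x) as 1 minus a midpoint-rule integral of the density over [0, x]."""
    h = x / steps
    area = h * sum(chi_square_pdf((i + 0.5) * h, nu) for i in range(steps))
    return 1.0 - area


upper = right_tail(18.307, 10)   # expected to be near 0.05
lower = right_tail(3.940, 10)    # expected to be near 0.95
print(f"P(chi2 >= 18.307) = {upper:.4f}, P(chi2 >= 3.940) = {lower:.4f}")
```

Unlike the t distribution, the chi-square distribution is not symmetric, so the lower and upper critical values have to be looked up separately.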
Applications of the Chi-Square Distribution in Sampling

Theorem 2.12. If S² is the variance of a random sample of size n
from a normal population with the variance σ², then the statistic
\[ \chi^2 = \frac{(n-1)S^2}{\sigma^2} \]
has the chi-square distribution with (n − 1) degrees of freedom.
Exercise 2.11. An optical firm purchases glass to be ground into lenses,
and it is known from past experience that the variance of the refractive
index of this kind of glass is $1.26 \times 10^{-4}$.
As it is important that the various pieces of glass have nearly the
same index of refraction, the firm rejects such a shipment if the sample
variance of 20 pieces selected at random exceeds $2.00 \times 10^{-4}$.
Assuming that the sample values may be looked upon as a random
sample from a normal population, what is the probability that a
shipment will be rejected even though $\sigma^2 = 1.26 \times 10^{-4}$?
2.4.5. The F Distribution and its Applications in Sampling
The F Distribution
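• A random variable X has an F distribution, and it is referred to as
an F random variable, if and only if its probability density function
is
\[ f(x) = \begin{cases} \dfrac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right) \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} x^{\nu_1/2-1}}{\Gamma\left(\frac{\nu_1}{2}\right) \Gamma\left(\frac{\nu_2}{2}\right) \left(\frac{\nu_1}{\nu_2}x + 1\right)^{(\nu_1+\nu_2)/2}}, & \text{for } x > 0, \\ 0, & \text{elsewhere,} \end{cases} \]
where ν1 and ν2 are positive integers.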

• The parameters ν1 and ν2 are referred to as the degrees of freedom for
the numerator and the denominator, respectively.
• In view of its importance, the F distribution has been tabulated
extensively.
• Let $F_{\alpha,\nu_1,\nu_2}$ be such that the area to its right under the curve of
the F distribution with ν1 and ν2 degrees of freedom is equal to α.
That is, $F_{\alpha,\nu_1,\nu_2}$ is such that $P(F \ge F_{\alpha,\nu_1,\nu_2}) = \alpha$.
[Figure: The $F_{\alpha,\nu_1,\nu_2}$ notation.]

• $F_{1-\alpha,\nu_1,\nu_2} = 1/F_{\alpha,\nu_2,\nu_1}$
Example 2.12. Find the value of $F_{0.95}$ for ν1 = 10 and ν2 = 20 degrees
of freedom.
Solution:
\[ F_{0.95,10,20} = \frac{1}{F_{0.05,20,10}} \approx \frac{1}{2.77} \approx 0.361 \]
✷
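The reciprocal relation used in Example 2.12 can be checked numerically from the standard F density. The Python sketch below integrates the density with a midpoint rule. The helper names `f_pdf` and `f_cdf` are illustrative and not part of the notes. If 2.77 is the upper 5 percent point for (20, 10) degrees of freedom, then the area to the left of 1/2.77 under the F density with (10, 20) degrees of freedom should also be about 0.05.

```python
import math


def f_pdf(x: float, nu1: int, nu2: int) -> float:
    """Density of the F distribution with nu1 and nu2 degrees of freedom."""
    if x <= 0:
        return 0.0
    ratio = nu1 / nu2
    num = math.gamma((nu1 + nu2) / 2) * ratio ** (nu1 / 2) * x ** (nu1 / 2 - 1)
    den = math.gamma(nu1 / 2) * math.gamma(nu2 / 2) * (ratio * x + 1) ** ((nu1 + nu2) / 2)
    return num / den


def f_cdf(x: float, nu1: int, nu2: int, steps: int = 20_000) -> float:
    """P(F <= x) by the midpoint rule on [0, x]."""
    h = x / steps
    return h * sum(f_pdf((i + 0.5) * h, nu1, nu2) for i in range(steps))


right_tail = 1.0 - f_cdf(2.77, 20, 10)      # P(F >= 2.77) for (20, 10) d.f., near 0.05
left_tail = f_cdf(1 / 2.77, 10, 20)         # P(F <= 1/2.77) for (10, 20) d.f.
print(f"{right_tail:.4f} {left_tail:.4f}")
```

The two areas agree because the reciprocal of an F random variable with (ν1, ν2) degrees of freedom is an F random variable with (ν2, ν1) degrees of freedom, which is why tables only need to list upper-tail points.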
Applications of the F Distribution in Sampling

Theorem 2.13. If $S_1^2$ and $S_2^2$ are the variances of independent
random samples of size n1 and n2, respectively, taken from two normal
populations with the variances $\sigma_1^2$ and $\sigma_2^2$, respectively, then the statistic
\[ F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \]
has the F distribution with (n1 − 1) and (n2 − 1) degrees of freedom.
