
CHAPTER 5 SAMPLING DISTRIBUTIONS

Objectives
1. To be able to understand and use the concept of sampling
distribution.
2. To calculate the probability for sampling distribution of sample
means.
3. To calculate the probability for sampling distribution of sample
proportions.

5.1 Introduction
Recall from Chapter 1 that the term population applies to a set or collection of objects, measurements, or observations under investigation. For example, if we are interested in determining the average number of television sets per household in Malaysia, the collection of these counts, one for each household, is the population of this study. In this case, the population is finite. If we instead consider the results obtained in an unending series of flips of a coin, the population is infinite.

If a population is infinite, it is impossible to observe all its values, and even if it is finite, it may be impractical or uneconomical to observe all elements. Thus, it is usually necessary to use a sample, a part of the population, and infer from it results pertaining to the entire population.

5.2 Population Distribution


Definition: The population distribution refers to the population data and
its probability distribution.

Example 5.1
Suppose there are only 5 students in an advanced statistics class and the
midterm scores for the 5 students are: 70, 78, 80, 80, 95. Compute:
(a) the probability distribution of this population
(b) the mean and standard deviation of the population distribution
92 Intro to Statistics & Probability

Solution

(a) Let X denote the midterm score and f the frequency of each score. The probability distribution is shown below:

X = x    f         Relative frequency    Probability, P(X = x)
70       1         1/5 = 0.20            0.20
78       1         1/5 = 0.20            0.20
80       2         2/5 = 0.40            0.40
95       1         1/5 = 0.20            0.20
         ∑f = 5    Sum = 1.00            ∑p(x) = 1.00

Table 5.1: Population probability distribution

Note: X ~ random variable
x ~ values of the random variable
Probability, P(X = x) = p(x) = Relative frequency = f / ∑f

(b) Mean of the population distribution:

$$\mu = \sum x\,p(x) = 70(0.2) + 78(0.2) + 80(0.4) + 95(0.2) = 80.6$$

Standard deviation of the population distribution:

$$\sigma = \sqrt{\sum x^2 p(x) - \mu^2} = \sqrt{[70^2(0.2) + 78^2(0.2) + 80^2(0.4) + 95^2(0.2)] - 80.6^2} = 8.09$$
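The calculation in Example 5.1 can be reproduced with a short script. This is only a sketch using the Python standard library; the variable names (scores, p, mu, sigma) are chosen here for illustration and are not part of the text.

```python
from collections import Counter
from math import sqrt

scores = [70, 78, 80, 80, 95]   # population from Example 5.1
N = len(scores)

# p(x) = f / sum(f) for each distinct score
p = {x: f / N for x, f in Counter(scores).items()}

# Population mean: sum of x * p(x)
mu = sum(x * px for x, px in p.items())

# Population standard deviation: sqrt(sum of x^2 * p(x) - mu^2)
sigma = sqrt(sum(x**2 * px for x, px in p.items()) - mu**2)

print(p)
print(round(mu, 2), round(sigma, 2))
```

Rounded to two decimal places, the printed mean and standard deviation can be compared with the hand-computed values 80.6 and 8.09.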

5.3 Sampling Distribution of the Sample Mean, x̄


The value of a population parameter is always constant, whereas the value of a sample statistic varies from sample to sample. Earlier, we touched on inferential statistics, which consists of methods that use sample results to help make decisions or predictions about a population. For example, a computer engineer who selects a sample of silicon wafers is interested only in using the sample mean to estimate the population average thickness. In this kind of situation, summary statistics such as the sample mean are used to draw conclusions about the corresponding population parameters.

The probability distribution of a statistic is called a sampling distribution. The sampling distribution of x̄ with sample size n is the distribution that results when an experiment is conducted over and over (always with sample size n) and the many values of x̄ are recorded. This sampling distribution describes the variability of the sample averages, x̄, around the population mean, µ.

Steps in Tabulating the Sampling Distribution of x̄:

(a) Obtain all possible samples of size n (without replacement) from a
population of size N. The number of possible samples is ᴺCₙ.
(b) Compute the sample mean for each sample.
(c) Tabulate the frequency distribution of x̄.
(d) Tabulate the probability distribution of x̄.

Example 5.2

Reconsider the population of midterm scores for 5 students in Example 5.1. Construct a sampling distribution of the sample mean for samples of size 3.

Solution

Step 1
Total number of possible samples = ⁵C₃ = 10

Step 2
Suppose we assign letters F=70, G=78, H=80, I=80, J=95.

Then the 10 possible samples of 3 scores each are: FGH, FGI, FGJ,
FHI, FHJ, FIJ, GHI, GHJ, GIJ, HIJ.

Sample    Scores        x̄
FGH       70, 78, 80    76
FGI       70, 78, 80    76
FGJ       70, 78, 95    81
FHI       70, 80, 80    76.67
FHJ       70, 80, 95    81.67
FIJ       70, 80, 95    81.67
GHI       78, 80, 80    79.33
GHJ       78, 80, 95    84.33
GIJ       78, 80, 95    84.33
HIJ       80, 80, 95    85.00

Table 5.2: Sample mean for each sample of size three

Step 3 and Step 4

Sample mean, x̄    f          Relative frequency    p(x̄)
76                2          2/10 = 0.2            0.2
76.67             1          1/10 = 0.1            0.1
79.33             1          1/10 = 0.1            0.1
81                1          1/10 = 0.1            0.1
81.67             2          2/10 = 0.2            0.2
84.33             2          2/10 = 0.2            0.2
85                1          1/10 = 0.1            0.1
                  ∑f = 10    Sum = 1               ∑p(x̄) = 1

Table 5.3: Sampling distribution of x̄

Note: Probability distribution of x̄, p(x̄) = Relative frequency
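The four tabulation steps can also be carried out by enumeration. The sketch below is one possible way to do it with the Python standard library, applied to the scores of Example 5.1; the names sample_means, freq and sampling_dist are illustrative assumptions. Exact fractions are used so that means such as 76.67 are not affected by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

scores = [70, 78, 80, 80, 95]   # population from Example 5.1
n = 3                            # sample size

# Steps (a) and (b): every sample of size n without replacement, and its mean.
# combinations() works on positions, so the two 80s count as distinct students.
sample_means = [Fraction(sum(s), n) for s in combinations(scores, n)]

# Steps (c) and (d): frequency, then probability = relative frequency.
freq = Counter(sample_means)
total = len(sample_means)
sampling_dist = {m: Fraction(f, total) for m, f in sorted(freq.items())}

for m, prob in sampling_dist.items():
    print(f"{float(m):6.2f}  f={freq[m]}  p={float(prob):.1f}")

# Mean of the sampling distribution, for comparison with the population mean
mean_of_means = sum(m * prob for m, prob in sampling_dist.items())
print(float(mean_of_means))
```

Changing scores and n adapts the same sketch to other small populations, although the number of samples grows quickly with N.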

Exercise 5.1
Consider the following population of six numbers: 4, 7, 9, 2, 8, 5

(a) Find the population mean.


(b) Construct a sampling distribution of the sample mean for samples of
size 4.

Mean and Standard Deviation of x̄

The mean and standard deviation calculated for the sampling distribution of x̄ are denoted by μ_x̄ and σ_x̄, respectively. μ_x̄ is referred to as the mean of the sample mean. σ_x̄ is referred to as the standard deviation of the sample mean. It is also called the standard error of x̄.

Mean of the sampling distribution of x̄

The mean of the sampling distribution of x̄ is always equal to the mean of the population. Thus,

$$\mu_{\bar{x}} = \mu$$
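As a quick check of this property, the mean of the sampling distribution tabulated in Table 5.3 can be computed directly and compared with the population mean found in Example 5.1. The worked line below uses the rounded sample means from the table, so it is an illustration rather than a proof:

```latex
\mu_{\bar{x}} = \sum \bar{x}\,p(\bar{x})
  = 76(0.2) + 76.67(0.1) + 79.33(0.1) + 81(0.1) + 81.67(0.2) + 84.33(0.2) + 85(0.1)
  = 80.6 = \mu
```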

Standard deviation of the sampling distribution of x̄

The standard deviation of the sampling distribution of x̄ is

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

where σ is the standard deviation of the population and n is the sample size. This formula is used when n/N ≤ 0.05, where N is the population size. In most practical applications, the sample size is small compared to the population size.

Example 5.3
The mean age of a microwave is 6.8 years with a standard deviation of 4.3 years. A random sample of 78 microwaves is taken. Assuming n/N ≤ 0.05, find the mean and standard deviation of the sample mean for these microwaves.

Solution

Given: n = 78, μ = 6.8, σ = 4.3

$$\mu_{\bar{x}} = \mu = 6.8$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4.3}{\sqrt{78}} = 0.4869$$
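The same arithmetic can be sketched in a few lines of Python. The helper name standard_error is an assumption made for this illustration, and the function relies on the n/N ≤ 0.05 condition stated in the text.

```python
from math import sqrt

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean, sigma / sqrt(n).

    Assumes n/N <= 0.05, as in the text.
    """
    return sigma / sqrt(n)

mu, sigma, n = 6.8, 4.3, 78      # values from Example 5.3
mu_xbar = mu                     # the mean of the sample mean equals mu
sigma_xbar = standard_error(sigma, n)

print(mu_xbar, round(sigma_xbar, 4))
```

Because the standard error is divided by the square root of n, quadrupling the sample size only halves it.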

Exercise 5.2
Consider a large population with μ = 95 and σ = 20. Assuming n/N ≤ 0.05, find the mean and standard deviation of the sample mean, x̄, for a sample size of
(a) 11
(b) 29

Shape of the Sampling Distribution of x̄

i. Sampling from a Normally Distributed Population

When samples are drawn from a normally distributed population with mean μ and standard deviation σ, the sampling distribution of x̄ has the following properties:

1. The shape of the sampling distribution of x̄ is normal, irrespective of the sample size, n.

2. The mean of the sampling distribution of x̄, μ_x̄, is equal to the population mean, µ:

$$\mu_{\bar{x}} = \mu$$

3. The standard deviation of the sampling distribution of x̄, σ_x̄, is equal to σ/√n, assuming n/N ≤ 0.05.

4. The z-value for a value of x̄ is

$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

Example 5.4

An electronics factory manufactures electric diodes whose length of life is approximately normally distributed, with a mean of 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 electric diodes will have an average life of less than 775 hours.

Solution

The sampling distribution of x̄ will be approximately normal.

Given: µ = 800, σ = 40 and n = 16

Therefore,

$$P(\bar{x} < 775) = P\left(z < \frac{775 - 800}{40 / \sqrt{16}}\right) = P(z < -2.5) = 0.0062$$
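Instead of reading the probability from a standard normal table, it can be computed numerically. The sketch below uses NormalDist from Python's standard statistics module; the variable names are illustrative and not part of the text.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800, 40, 16        # values from Example 5.4
sigma_xbar = sigma / sqrt(n)      # standard error = 10 hours

z = (775 - mu) / sigma_xbar       # z-value for a sample mean of 775
prob = NormalDist().cdf(z)        # P(Z < z) for the standard normal

# Equivalent: evaluate the sampling distribution of the sample mean directly
prob_direct = NormalDist(mu, sigma_xbar).cdf(775)

print(z, round(prob, 4))
```

Both routes give the same probability; standardising to z first simply mirrors the table-based method used in the solution.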

ii. Sampling from a Population That is Not Normally Distributed

Central Limit Theorem

If a random sample of n observations is drawn from a population with finite mean μ and standard deviation σ, then, when n is sufficiently large, the sampling distribution of the sample mean, x̄, can be approximated by a normal density function.

This means that when the sample size is large (n ≥ 30), the sampling distribution of x̄ is approximately normal, irrespective of the shape of the population distribution.

Note:
1. large sample size: n ≥ 30
2. mean of x̄, μ_x̄ = μ, and standard deviation of x̄, σ_x̄ = σ/√n
3. z-value for a value of x̄:

$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
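The central limit theorem can be illustrated with a small simulation. The sketch below is only a demonstration under assumptions not found in the text: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly right-skewed shape), the sample size is n = 30, and 5000 samples are drawn with a fixed random seed.

```python
import random
from statistics import fmean, pstdev

random.seed(0)                 # fixed seed so the run is repeatable
n, reps = 30, 5000             # sample size and number of repeated samples

# Skewed population: exponential with mean 1 and standard deviation 1
sample_means = [
    fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

print(round(fmean(sample_means), 3))    # should be near mu = 1
print(round(pstdev(sample_means), 3))   # should be near sigma/sqrt(n), about 0.183

# Share of sample means within one standard error of mu;
# for a normal distribution this share is about 0.68
se = 1 / n ** 0.5
within = sum(abs(m - 1) <= se for m in sample_means) / reps
print(round(within, 3))
```

Although individual observations are far from normal, the simulated sample means cluster around μ with a spread close to σ/√n, which is the behaviour the theorem describes.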

Example 5.5
From a study, it is known that the mean grams of hydrocarbons emitted
by automobiles per mile is 1.2 grams. If a random sample of 150
automobiles is selected:

(a) What is the population standard deviation so that the probability that
the sample mean will be more than 1.21 grams is 0.3121?
(b) Find the probability that the sample mean will be between 1.16 and
1.26 grams.

Solution

Since the sample size is large (n ≥ 30), the sampling distribution of x̄ is approximately normal.

Given: μ = 1.2, n = 150

(a)

$$P(\bar{x} > 1.21) = 0.3121$$

$$P\left(z > \frac{1.21 - 1.2}{\sigma / \sqrt{150}}\right) = 0.3121$$

$$P\left(z < \frac{1.21 - 1.2}{\sigma / \sqrt{150}}\right) = 0.6879$$

$$\frac{1.21 - 1.2}{\sigma / \sqrt{150}} = 0.49$$

$$\sigma = 0.2499$$

(b)

$$P(1.16 < \bar{x} < 1.26) = P\left(\frac{1.16 - 1.2}{0.2499 / \sqrt{150}} < z < \frac{1.26 - 1.2}{0.2499 / \sqrt{150}}\right) = P(-1.96 < z < 2.94) = 0.9734$$
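Example 5.5 works backwards from a probability to a z-value, which corresponds to an inverse normal lookup. The sketch below shows one way to do this with NormalDist.inv_cdf from Python's standard statistics module; the variable names are illustrative. Because the script does not round z to two decimal places as a table does, its results may differ slightly from the hand calculation in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()          # standard normal, mean 0 and SD 1
mu, n = 1.2, 150                   # values from Example 5.5

# (a) P(sample mean > 1.21) = 0.3121  <=>  P(Z < z) = 0.6879
z = std_normal.inv_cdf(1 - 0.3121)
sigma = (1.21 - mu) * sqrt(n) / z  # solve z = (1.21 - mu) / (sigma / sqrt(n))

# (b) P(1.16 < sample mean < 1.26) under the sampling distribution
xbar_dist = NormalDist(mu, sigma / sqrt(n))
prob = xbar_dist.cdf(1.26) - xbar_dist.cdf(1.16)

print(round(z, 2), round(sigma, 2), round(prob, 3))
```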

Exercise 5.3
Suppose that the current annual salary for all Malaysian teachers has a probability distribution that is skewed to the right with a mean of RM 30,000 and a standard deviation of RM 9874. Let x̄ be the mean annual salary for a sample of 300 teachers.

(a) What is the probability that the mean annual salary of Malaysian
teachers obtained from this sample will be less than the population
mean by RM 800 or more?
(b) What is the probability that the mean annual salary of Malaysian
teachers obtained from this sample will be within RM 1000 of the
population mean?

5.4 Sampling Distribution of the Sample Proportion, p̂

The concept of proportion is the same as the concept of relative frequency and the concept of probability of success in a binomial experiment. The relative frequency of a category or class gives the proportion of the sample or population that belongs to that category or class. Similarly, the probability of success in a binomial experiment represents the proportion of a sample or population that possesses a given characteristic.

Population and sample proportions

The population proportion, denoted by p, is obtained by taking the ratio of the number of elements in a population with a specific characteristic to the total number of elements in the population. The sample proportion, denoted by p̂ (pronounced as p hat), gives a similar ratio for a sample.

The population and sample proportions, denoted by p and p̂, respectively, are calculated as follows:

$$p = \frac{X}{N} \quad \text{and} \quad \hat{p} = \frac{x}{n}$$

where
N = total number of elements in the population
n = total number of elements in the sample
X = number of elements in the population that possess a specific
characteristic
x = number of elements in the sample that possess a specific
characteristic

Example 5.6
Suppose there are 4351 engineering students in UTeM and 378 of them have not sat for their MUET exam. Then, the proportion of UTeM engineering students who have sat for their MUET exam is:

N = population size = 4351
X = number of UTeM engineering students who have sat for their MUET exam = 4351 − 378 = 3973

$$p = \frac{X}{N} = \frac{3973}{4351} = 0.9131$$

Now, suppose a sample of 862 students is taken from this population, and 762 of them have sat for their MUET exam. Then, the sample proportion of UTeM engineering students who have sat for their MUET exam is:

$$\hat{p} = \frac{762}{862} = 0.884$$

Sampling Distribution of p̂

Definition: The sampling distribution of the sample proportion, p̂, is the probability distribution of the sample proportion.

Example 5.7

A postgraduate class has only 5 students. The table below gives the names of these 5 students and their current status as full-time or part-time students.

Name      Part-time student
Amir      Yes
Sandra    Yes
Chong     No
Rani      Yes
Daniel    No

Table 5.4: Names of postgraduate students and their status as part-time or full-time students

(a) What is the population proportion, p of part-time students?


(b) List all possible samples of size 3 (without replacement) that can be
selected from this population, and calculate the sample proportion,
p̂ of part-time students for each sample. Then, find the sampling
distribution of p̂ .

Solution

(a) p = X/N = 3/5 = 0.6

(b) Total number of samples of size 3 (without replacement) that can be selected from this population = ⁵C₃ = 10

No    Sample                   Sample proportion, p̂
1     Amir, Sandra, Chong      2/3 = 0.67
2     Amir, Sandra, Rani       3/3 = 1.00
3     Amir, Sandra, Daniel     2/3 = 0.67
4     Amir, Chong, Rani        2/3 = 0.67
5     Amir, Chong, Daniel      1/3 = 0.33
6     Amir, Rani, Daniel       2/3 = 0.67
7     Sandra, Chong, Daniel    1/3 = 0.33
8     Sandra, Chong, Rani      2/3 = 0.67
9     Sandra, Rani, Daniel     2/3 = 0.67
10    Chong, Rani, Daniel      1/3 = 0.33

Table 5.5: All possible samples of size 3 and the values of the sample proportion, p̂

Sample proportion, p̂    Frequency, f    Relative frequency    p(p̂)
0.33                    3               0.3                   0.3
0.67                    6               0.6                   0.6
1.00                    1               0.1                   0.1
                        ∑f = 10         Total = 1             ∑p(p̂) = 1

Table 5.6: Sampling Distribution of p̂ for sample size 3
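The enumeration in Example 5.7 can be reproduced programmatically. The following is a sketch using the Python standard library; the dictionary part_time encodes Table 5.4, and all names in the code are illustrative assumptions. Exact fractions are used so that 1/3 and 2/3 are not rounded.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Table 5.4: True means the student is part-time
part_time = {"Amir": True, "Sandra": True, "Chong": False,
             "Rani": True, "Daniel": False}
n = 3

# Population proportion p = X / N
p = Fraction(sum(part_time.values()), len(part_time))

# Sample proportion = x / n for every sample of size n without replacement
p_hats = [Fraction(sum(part_time[name] for name in sample), n)
          for sample in combinations(part_time, n)]

# Frequency, then probability = relative frequency
freq = Counter(p_hats)
sampling_dist = {ph: Fraction(f, len(p_hats)) for ph, f in sorted(freq.items())}

for ph, prob in sampling_dist.items():
    print(f"{float(ph):.2f}  f={freq[ph]}  p={float(prob):.1f}")

# Mean of the sampling distribution, for comparison with p
mean_p_hat = sum(ph * prob for ph, prob in sampling_dist.items())
print(p, mean_p_hat)
```

The last line prints the population proportion next to the mean of the sampling distribution so the two can be compared.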

Mean and Standard Deviation of p̂

1. The mean of the sample proportion, p̂, is denoted by μ_p̂ and is equal to the population proportion, p. Thus,

$$\mu_{\hat{p}} = p$$

2. The standard deviation of the sample proportion, p̂, is denoted by σ_p̂ and is given by the formula

$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$

where p is the population proportion, q = 1 − p, and n is the sample size. The formula is used when n/N ≤ 0.05, where N is the population size. In most practical applications, the sample size is small compared to the population size.

Shape of The Sampling Distribution of p̂

Central Limit Theorem for Sample Proportion

According to the central limit theorem, the sampling distribution of p̂ is approximately normal for a sufficiently large sample size. In the case of proportion, the sample size is considered to be sufficiently large if np and nq are both greater than 5, that is, if

np > 5 and nq > 5



Thus, the z-value for a value of p̂ is:

$$z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{pq/n}}$$

Example 5.8
According to the QA inspector of Southern Electronic Corp., 3% of the electric diodes the company produces every day are defective. A random sample of 100 electric diodes is inspected for being good or defective. Let p̂ be the proportion of good electric diodes in the sample. What is the probability that p̂ is between 0.94 and 0.95?

Solution

n = 100, p = 1 − q = 1 − 0.03 = 0.97

$$\begin{aligned}
P(0.94 < \hat{p} < 0.95) &= P\left(\frac{0.94 - 0.97}{\sqrt{(0.97)(0.03)/100}} < \frac{\hat{p} - p}{\sqrt{pq/n}} < \frac{0.95 - 0.97}{\sqrt{(0.97)(0.03)/100}}\right) \\
&= P(-1.76 < z < -1.17) \\
&= P(z < -1.17) - P(z < -1.76) \\
&= 0.1210 - 0.0392 \\
&= 0.0818
\end{aligned}$$
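The z-values and the probability in Example 5.8 can be checked numerically. The sketch below follows the same normal approximation as the solution and uses NormalDist from Python's standard statistics module; the helper z_value is a name introduced here for illustration. Since the script keeps the unrounded z-values, its probability may differ slightly from the table-based answer.

```python
from math import sqrt
from statistics import NormalDist

n, q = 100, 0.03                 # sample size and defective rate, Example 5.8
p = 1 - q                        # proportion of good diodes

sigma_p_hat = sqrt(p * q / n)    # standard deviation of the sample proportion

def z_value(p_hat: float) -> float:
    """z-value for a sample proportion under the normal approximation."""
    return (p_hat - p) / sigma_p_hat

z_low, z_high = z_value(0.94), z_value(0.95)
std_normal = NormalDist()
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(prob, 3))
```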

Exercise 5.4
A fast food outlet claims that 53% of their customers favor a certain type
of pizza. Assume that this claim is true. What is the probability that in a
random sample of 400 customers, less than 49% will favor this pizza?

Review Exercises
1. The heights of UTeM students are approximately normally distributed with a mean of 174.5 centimetres and a standard deviation of 6.9 centimetres. A random sample of size 36 is drawn from this population. Determine:
(a) the mean and the standard deviation of the sampling distribution of x̄.
(b) the probability that the average height is between 172.5 and 176.5 centimetres, inclusive.
(c) the probability that the average height is below 170.5 centimetres.

2. The amount of time that a telemarketer spends on a customer is a random variable with a mean, μ = 3.2 minutes and a standard deviation, σ = 1.6 minutes. If a random sample of 64 customers is observed from a normally distributed population, find the probability that the mean time of this sample is:
(a) at most 2.7 minutes.
(b) more than 3.5 minutes.
(c) at least 3.2 minutes but less than 3.4 minutes.

3. The average life of a bread-making machine is 7 years, with a standard deviation of 1 year. Assuming that the lives of these machines follow approximately a normal distribution, find:
(a) the probability that the mean life of a random sample of 9 such machines falls between 6.4 and 7.5 years.
(b) the probability that the mean life is more than the population mean by at least 2 standard deviations (n = 9).
(c) the value of x̄ to the right of which 15% of the means computed from random samples of size 9 would fall.

4. Suppose that a printer needs an average of Y minutes to print several designs. The standard deviation of the time to print all designs is 1.4 minutes. If a random sample of 49 designs is selected,
(a) what is the value of Y so that there is a 99% chance that the sample mean will be at least 13.634 minutes?
(b) find the probability that the sample mean will be below 13.5 or above 14.8 minutes.

5. Researchers are experimenting with a new compound used to bond A to steel. The drying time that the compound requires is being monitored and it is known that it is approximately normally distributed, with an average drying time of 4.70 minutes and a standard deviation of 0.40 minutes. Suppose that a sample of 25 drying times is selected,
(a) what is the probability that the sample mean will be at least 4.57 minutes?
(b) there is an 85 percent chance that the sample average will fall between two values symmetrically distributed around the population mean. What are those two values?

6. The test scores for 300 UTeM students were entered into a computer,
analyzed, and stored in a file. Knowing that 30% of the mean scores
were below 65 and 15% of the mean scores were above 90, find their
mean and standard deviation (assuming the scores are normally
distributed).

7. For a population with p = 76% and σ_p̂ = 0.2, find the probability that
(a) p̂ > 80%.
(b) 0.75 < p̂ < 0.85.
(c) p̂ lies within 0.12 of p.

8. A survey conducted by UTeM shows that 77% of the Information Technology students have laptops at home. If a random sample of 120 Information Technology students is selected, find the probability that the value of p̂ is
(a) at most 75%.
(b) at least 85%.
(c) between 70% and 80%.
(d) not within 0.15 of the population proportion.

9. A company that manufactures soda drinks claims that 85% of its soda drinks are good for 4 years or longer. Assume that the claim is true. Let p̂ be the proportion in a sample of 100 soda drinks that are good for 4 years or longer.
(a) What is the probability that this sample proportion is within 0.05 of the population proportion?
(b) What is the probability that this sample proportion is less than the population proportion by 0.06 or more?
(c) What is the probability that this sample proportion is greater than the population proportion by 0.07 or more?

10. The proportion of females in an organization is 85%. We have a random sample of n = 500 individuals.
(a) What are the mean and standard deviation of p̂, the sample proportion of females in the organization?
(b) Is the distribution of p̂ approximately normal? Justify your answer.
(c) What is the probability that the sample proportion exceeds 82%?
(d) What is the probability that the sample proportion lies between 83% and 88%?
(e) 99% of the time, the sample proportion would lie between what two symmetrical limits?

11. A city is planning to build a shopping complex building. A local newspaper found that 55% of the voters in this city favour the construction of this building. Assume that this result holds true for the population of all voters in this city.
(a) What is the probability that more than 50% of the voters in a random sample of 150 voters selected from this city will favour the construction of this building?
(b) A politician would like to take a random sample of voters in which over 50% would favour the construction of this building. How large a sample should be selected so that the politician is 95.5% sure of this outcome?

12. A survey conducted by the UTeM Exam Unit shows that 88% of UTeM students passed all subjects and completed their studies in 4 years’ time. If a random sample of 132 students is selected, find the probability that the sample proportion is within 0.10 of the population proportion. Next, find the sample size that must be selected if the probability is 0.38 that at most 81% of the sampled UTeM students passed all subjects and completed their studies in 4 years’ time.
