Sampling Distribution
Population and Sampling Distributions
Population Distribution

The population distribution is the probability distribution of the population data.

Suppose there are only five students in an advanced statistics class and the midterm scores
of these five students are:
70 78 80 80 95
Let x denote the score of a student. Using single-valued classes, we can write the frequency
distribution of scores as:

 x     f
70     1
78     1
80     2
95     1
     N = 5

The values of the mean and standard deviation calculated for this probability distribution
give the values of the population parameters μ and σ: μ = 80.60 and σ = 8.09.

 x     f     x∙f     μ – x     (μ – x)²     f∙(μ – x)²
70     1      70      10.6       112.36         112.36
78     1      78       2.6         6.76           6.76
80     2     160       0.6         0.36           0.72
95     1      95     -14.4       207.36         207.36

N = 5     μ = 80.6     σ² = 65.44
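
The arithmetic in the table can be checked with a short Python sketch. The variable names are illustrative and not part of the original notes; the standard-library functions used here divide by N, which matches the population formulas in the table.

```python
from statistics import fmean, pstdev, pvariance

# Midterm scores of the five students (the entire population).
scores = [70, 78, 80, 80, 95]

mu = fmean(scores)             # population mean
sigma_sq = pvariance(scores)   # population variance (divides by N, not N - 1)
sigma = pstdev(scores)         # population standard deviation

print(f"mu = {mu:.2f}, sigma^2 = {sigma_sq:.2f}, sigma = {sigma:.2f}")
```

The printed values should agree with μ = 80.60, σ² = 65.44, and σ = 8.09.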

Sampling Distribution

The value of a population parameter is always constant. For example, for any population
data set, there is only one value of the population mean, μ. However, we cannot say the same about
the sample mean, x̅ . We would expect different samples of the same size drawn from the same
population to yield different values of the sample mean, x̅ . The value of the sample mean for any one
sample will depend on the elements included in that sample. Consequently, the sample mean, x̅ , is a
random variable. Therefore, like other random variables, the sample mean possesses a probability
distribution, which is more commonly called the sampling distribution of x̅. Other sample statistics,
such as the median, mode, and standard deviation, also possess sampling distributions.

Sampling Distribution of x̅. The probability distribution of x̅ is called its sampling
distribution. It lists the various values that x̅ can assume and the probability of each value of x̅.

In general, the probability distribution of a sample statistic is called its sampling
distribution.

Reconsider the population of midterm scores of five students. Consider all possible samples
of three scores each that can be selected, without replacement, from that population. Suppose we
assign letters A, B, C, D, and E to the scores of the five students, so that

A = 70 , B = 78, C = 80, D = 80, E = 95

Then, the possible samples of three scores each are

ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE

/jlpj2018

These 10 samples and their respective means are as follows:

Sample     Scores in the Sample     x̅
ABC        70, 78, 80               76.000
ABD        70, 78, 80               76.000
ABE        70, 78, 95               81.000
ACD        70, 80, 80               76.667
ACE        70, 80, 95               81.667
ADE        70, 80, 95               81.667
BCD        78, 80, 80               79.333
BCE        78, 80, 95               84.333
BDE        78, 80, 95               84.333
CDE        80, 80, 95               85.000

Note that the first two samples have the same three scores. The reason for this is that two of
the students have the same score, and, hence, the samples ABC and ABD contain the same values.
The mean of each sample is obtained by dividing the sum of three scores included in that sample by
3. For instance, the mean of the first sample is (70 + 78 + 80)/3 = 76. Note that the values of the
means of samples are rounded to three decimal places.

x̅          f     x̅∙f        μ – x̅      (μ – x̅)²     f∙(μ – x̅)²
76.000     2     152.000     4.600      21.160        42.320
76.667     1      76.667     3.933      15.468        15.468
79.333     1      79.333     1.267       1.605         1.605
81.000     1      81.000    -0.400       0.160         0.160
81.667     2     163.334    -1.067       1.138         2.277
84.333     2     168.666    -3.733      13.935        27.871
85.000     1      85.000    -4.400      19.360        19.360

N = 10     μx̅ = 80.6     σx̅² ≈ 10.906     σx̅ ≈ 3.302
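
The enumeration above can be reproduced with a short Python sketch. The dictionary labels follow the letters A to E used in the text; everything else (variable names, output layout) is an illustrative assumption. Because the code works with unrounded means, its results may differ from the table in the third decimal place.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Scores labelled A..E as in the text.
population = {"A": 70, "B": 78, "C": 80, "D": 80, "E": 95}

# All samples of three students drawn without replacement (10 in total).
samples = list(combinations(population, 3))
means = [sum(population[label] for label in sample) / 3 for sample in samples]

# Sampling distribution: each distinct mean with its frequency and probability.
distribution = Counter(round(m, 3) for m in means)
for value, freq in sorted(distribution.items()):
    print(f"{value:7.3f}   f = {freq}   P = {freq / len(means):.1f}")

# Mean and standard deviation of the 10 sample means (population formulas).
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)
sigma_xbar = sqrt(var_xbar)
print(f"mu_xbar = {mu_xbar:.2f}, sigma_xbar = {sigma_xbar:.2f}")
```

Dividing each frequency by 10 gives the probability of that value of x̅, which is what turns the frequency table into a sampling distribution.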

Sampling and Nonsampling Errors


Usually, different samples selected from the same population will give different results
because they contain different elements. The result obtained from any one sample will generally be
different from the result obtained from the corresponding population. The difference between the
value of a sample statistic obtained from a sample and the value of the corresponding population
parameter obtained from the population is called the sampling error. Note that this difference
represents the sampling error only if the sample is random and no nonsampling error has been
made. Otherwise, only a part of this difference will be due to the sampling error.

Sampling Error. Sampling error is the difference between the value of a sample statistic and
the value of the corresponding population parameter. In the case of the mean,

Sampling error = x̅ - μ

assuming that the sample is random and no nonsampling error has been made.

It is important to remember that a sampling error occurs because of chance. The errors that
occur for other reasons, such as errors made during collection, recording, and tabulation of data,
are called nonsampling errors. These errors occur because of human mistakes, and not chance. Note
that there is only one kind of sampling error – the error that occurs due to chance. However, there
is not just one nonsampling error, but there are many nonsampling errors that may occur for
different reasons.

Nonsampling Error. The errors that occur in the collection, recording, and tabulation of data
are called nonsampling errors.


Example

Reconsider the population of five scores. Suppose one sample of three scores is selected
from this population, and this sample includes the scores 70, 80 and 95. Find the sampling error.

Solution

The population mean as illustrated by the table before is 80.60 and the mean of the random
sample of three scores which includes 70, 80, and 95 is

x̅ = (70 + 80 + 95) / 3 = 81.667

Consequently,

Sampling error = x̅ − μ = 81.667 − 80.60 = 1.067

That is, the mean score estimated from the sample is 1.067 higher than the mean score of
the population. Note that this difference occurred due to chance – that is, because we used a sample
instead of the population.

Now suppose, when we select the sample of three scores, we mistakenly record the second
score as 82 instead of 80. As a result, we calculate the sample mean as

x̅ = (70 + 82 + 95) / 3 = 82.333

The difference between this sample mean and the population mean is

x̅ − μ = 82.333 − 80.60 = 1.733

However, this difference between the sample mean and the population mean does not
represent the sampling error. As we calculated earlier, only 1.067 of this difference is due to the
sampling error. The remaining portion, which is equal to 1.733 − 1.067 = 0.666, represents the
nonsampling error because it occurred due to the error we made in recording the second score in
the sample. Thus, in this case,

Sampling error = 1.067

Nonsampling error = 0.666
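
The split between the two kinds of error can be sketched in Python as follows. The variable names are illustrative assumptions. The code uses unrounded means, so the nonsampling error prints as 0.667 rather than the 0.666 obtained above by subtracting rounded values.

```python
mu = 80.60                       # population mean from the earlier table

true_sample = [70, 80, 95]       # scores actually drawn
recorded_sample = [70, 82, 95]   # second score mistakenly recorded as 82

true_mean = sum(true_sample) / len(true_sample)
recorded_mean = sum(recorded_sample) / len(recorded_sample)

sampling_error = true_mean - mu            # due to chance alone
total_difference = recorded_mean - mu      # chance plus the recording mistake
nonsampling_error = total_difference - sampling_error

print(f"sampling error    = {sampling_error:.3f}")
print(f"nonsampling error = {nonsampling_error:.3f}")
```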

Exercises

Concepts and Procedures

1. Briefly explain the meaning of a population distribution and a sampling distribution. Give an
example of each.
2. Explain briefly the meaning of sampling error. Give an example. Does such an error occur
only in a sample survey, or can it occur in both a sample survey and a census?
3. Explain briefly the meaning of nonsampling error. Give an example. Does such an error
occur only in a sample survey, or can it occur in both a sample survey and a census?
4. Consider the following population of six numbers
16 13 7 20 9 12
a. Find the population mean
b. Liza selected one sample of four numbers from this population. The sample included
the numbers 13, 7, 9, and 12. Calculate the sample mean and sampling error for this
sample.
c. When Liza calculated the sample mean, she mistakenly used the numbers 13, 7, 6, and
12. Find the sampling and nonsampling errors in this
case.
d. List all samples of four numbers that can be selected from this population. Calculate
the sample mean and sampling error for each of these samples.


Application

5. The following data give the ages of all five members of a family:
55 53 28 25 21
a. Let x denote the age of a member of this family. Write the population distribution of
x
b. List all the possible samples of size 3 that can be selected from this population.
Calculate the mean for each of these samples. Write the sampling distribution of x̅ .
c. Calculate the mean for the population data. Select one random sample of size 3 and
calculate the sample mean x̅ . Compute the sampling error.

Mean and Standard Deviation of x̅


The mean and standard deviation calculated for the sampling distribution of x̅ are called the
mean and standard deviation of x̅ . Actually, the mean and standard deviation of x̅ are, respectively,
the mean and standard deviation of the means of all samples of the same size selected from a
population. The standard deviation of x̅ is also called the standard error of x̅ .

Mean and Standard Deviation of x̅ . The mean and standard deviation of the sampling
distribution of x̅ are called the mean and standard deviation of x̅ and are denoted by μx̅ and σx̅ ,
respectively.

If we calculate the mean and standard deviation of the 10 values of x̅ listed in the sampling
distribution for the five midterm scores, we obtain the mean, μx̅, and standard deviation, σx̅, of x̅.
From these calculations, we obtain μx̅ = 80.60 and σx̅ = 3.30.

The mean of the sampling distribution of x̅ is always equal to the mean of the population.

Mean of the Sampling Distribution of x̅ . The mean of the sampling distribution of x̅ is always
equal to the mean of the population. Thus,
μx̅ = μ

Hence, if we select all possible samples of the same size from a population and calculate their
means, the mean of all these sample means will be the same as the mean of the population.
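
As a rough check of this property, the following Python sketch enumerates every possible sample of each size from the five midterm scores and averages the sample means. Exact fractions are used so that the comparison with μ is not affected by rounding; the names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

scores = [70, 78, 80, 80, 95]
mu = Fraction(sum(scores), len(scores))   # exact population mean, 403/5

# For each sample size n, average the means of all possible samples
# drawn without replacement.
mean_of_means = {}
for n in range(1, len(scores) + 1):
    sample_means = [Fraction(sum(s), n) for s in combinations(scores, n)]
    mean_of_means[n] = sum(sample_means) / len(sample_means)
    print(n, float(mean_of_means[n]))

# The mean of the sample means equals mu for every sample size.
assert all(m == mu for m in mean_of_means.values())
```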

The sample mean, x̅ , is called an estimator of the population mean, μ. When the expected
value of a sample statistic is equal to the value of the corresponding population parameter, that
sample statistic is said to be an unbiased estimator. For the sample mean x̅ , μx̅ = μ. Hence, x̅ is an
unbiased estimator of μ. This is a very important property that an estimator should possess.

However, the standard deviation, σx̅ , of x̅ is not equal to the standard deviation, σ, of the
population distribution. The standard deviation of x̅ is equal to the standard deviation of the
population divided by the square root of the sample size; that is,
σx̅ = σ / √n

This formula for the standard deviation of x̅ holds true only when the sampling is done
either with replacement from a finite population or with or without replacement from an infinite
population. In practice, these two conditions are replaced by a single working rule: the formula
can be used if the sample size is small in comparison to the population size. The sample size is
considered to be small compared to the population size if the sample size is equal to or less than
5% of the population size; that is, if

n / N ≤ 0.05


If this condition is not satisfied, we use the following formula to calculate σx̅ :

σx̅ = (σ / √n) ∙ √((N − n) / (N − 1))

where the factor √((N − n) / (N − 1)) is called the finite population correction factor.

Standard Deviation of the Sampling Distribution of x̅. The standard deviation of the sampling
distribution of x̅ is

σx̅ = σ / √n

where σ is the standard deviation of the population and n is the sample size. This formula is
used when n/N ≤ 0.05, where N is the population size.
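
The choice between the two formulas can be sketched as a small Python function. The function name, its parameters, and the default threshold argument are illustrative assumptions; only the 5% rule and the two formulas come from the text.

```python
from math import sqrt

def standard_error(sigma, n, N=None, threshold=0.05):
    """Standard deviation of the sample mean.

    Applies the finite population correction factor when the population
    size N is given and n/N exceeds the threshold (5% in the text).
    """
    se = sigma / sqrt(n)
    if N is not None and n / N > threshold:
        se *= sqrt((N - n) / (N - 1))
    return se

# Five-score population: sigma = sqrt(65.44), samples of n = 3 from N = 5.
# Here n/N = 0.6, so the correction factor is applied.
print(round(standard_error(sqrt(65.44), 3, 5), 2))
```

For the five midterm scores, the corrected formula gives the same σx̅ ≈ 3.30 that was obtained by listing all 10 samples, whereas σ / √n alone would give about 4.67.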

Example

The mean wage per hour for all 5000 employees who work at a large company is ₱2750,
and the standard deviation is ₱370. Let x̅ be the mean wage per hour for a random sample of certain
employees selected from this company. Find the mean and standard deviation of x̅ for a sample size of
a. 30
b. 75
c. 200

Solution.

From the given information, for the population of all employees,

N = 5000
μ = ₱2750
σ = ₱370

a. The mean, μx̅, of the sampling distribution of x̅ is

μx̅ = μ = ₱2750

In this case, n = 30, N = 5000, and n/N = 30/5000 = 0.006. Because n/N is less than 0.05, the
standard deviation of x̅ is obtained by using the formula

σx̅ = σ / √n = 370 / √30 ≈ ₱67.55

Thus, we can state that if we take all possible samples of size 30 from the population of all
employees of this company and prepare the sampling distribution of x̅, the mean and standard
deviation of this sampling distribution of x̅ will be ₱2750 and ₱67.55, respectively.

b. In this case, n = 75, N = 5000, and n/N = 75/5000 = 0.015, which is less than 0.05. The mean
and standard deviation of x̅ are

μx̅ = μ = ₱2750

σx̅ = σ / √n = 370 / √75 ≈ ₱42.72

c. In this case, n = 200, N = 5000, and n/N = 200/5000 = 0.04, which is less than 0.05. The mean
and standard deviation of x̅ are

μx̅ = μ = ₱2750

σx̅ = σ / √n = 370 / √200 ≈ ₱26.16

From the preceding calculations we observe that the mean of the sampling distribution of x̅
is always equal to the mean of the population whatever the size of the sample. However, the value
of the standard deviation of x̅ decreases from ₱67.55 to ₱42.72 and then to ₱26.16 as the sample
size increases from 30 to 75 and then to 200.
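
The three cases can be recomputed with a short Python sketch. The variable names and the `results` dictionary are illustrative assumptions; the population values are those given in the example.

```python
from math import sqrt

N, mu, sigma = 5000, 2750, 370   # population values from the example

results = {}
for n in (30, 75, 200):
    # Each sample is at most 5% of the population, so no correction is needed.
    assert n / N <= 0.05
    results[n] = sigma / sqrt(n)
    print(f"n = {n:3d}: mean of x-bar = {mu}, std. dev. of x-bar = {results[n]:.2f}")
```

The mean stays at ₱2750 in every case, while the standard deviation of x̅ shrinks in proportion to 1/√n as the sample size grows.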

Exercises

Concepts and Procedures

1. Let x̅ be the mean of a sample selected from a population.
a. What is the mean of the sampling distribution of x̅ equal to?
b. What is the standard deviation of the sampling distribution of x̅ equal to? Assume
n/N ≤ 0.05
2. What is an estimator? When is an estimator unbiased? Is the sample mean, x̅ , an unbiased
estimator of μ? Explain.
3. How does the value of σx̅ change as the sample size increases? Explain.
4. Consider a large population with μ = 70 and σ = 10. Assuming n/N ≤ 0.05, find the mean
and standard deviation of the sample mean, x̅ , for a sample size of
a. 18
b. 80
5. Consider a large population with μ = 90 and σ = 18. Assuming n/N ≤ 0.05, find the mean
and standard deviation of the sample mean, x̅ , for a sample size of
a. 15
b. 40
6. A population of N = 5000 has σ = 25. In each of the following cases, which formula will you
use to calculate σx̅ and why? Using the appropriate formula, calculate σx̅ for each of these
cases.
a. n = 300
b. n = 100
7. A population of N = 100 000 has σ = 40. In each of the following cases, which formula will
you use to calculate σx̅ and why? Using the appropriate formula, calculate σx̅ for each of these
cases.
a. n = 2500
b. n = 7000

Applications

8. The living spaces of all homes in a city have a mean of 2300 square feet and a standard
deviation of 500 square feet. Let x̅ be the mean living space for a random sample of 25
homes selected from this city. Find the mean and standard deviation of the sampling
distribution of x̅ .
9. The mean monthly out-of-pocket cost of prescription drugs for all senior citizens in a
particular city is ₱520 with a standard deviation of ₱72. Let x̅ be the mean of such costs for a
random sample of 25 senior citizens from this city. Find the mean and standard deviation of
the sampling distribution of x̅ .
10. The standard deviation of the 2018 gross sales of all corporations is known to be ₱139.50
million. Let x̅ be the mean of the 2018 gross sales of a sample of corporations. What sample
size will produce the standard deviation of x̅ equal to ₱15.50 million?
