CHAPTER 7: Sampling and Sampling Distributions

MULTIPLE CHOICE

1. Which of the following statements correctly describes estimation?


a. It is the process of inferring the values of known population parameters from those of
unknown sample statistics.
b. It is the process of inferring the values of unknown sample statistics from those of known
population parameters.
c. It is the process of inferring the values of known sample statistics from those of unknown
population parameters.
d. It is the process of inferring the values of unknown population parameters from those of
known sample statistics.
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

2. A sample chosen in such a way that every possible subset of the same size has an equal chance of being
selected is called a(n)
a. interval estimation c. simple random sample.
b. point estimation d. statistic
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

3. The mean of the sampling distribution of $\bar{X}$ always equals


a. the population mean c. the population standard deviation
b. /n d. /n

ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

4. The sampling method in which a population is divided into blocks and then selected by choosing a
random mechanism is called a
a. random sampling c. stratified sampling
b. systematic sampling d. cluster sampling
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

5. Which of the following is not a consideration when determining appropriate sample size?
a. The cost of sampling c. Interviewer fatigue
b. The timely collection of the data d. The likelihood of nonsampling error
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

6. Identifiable subpopulations within a population are called:


a. clusters
b. samples
c. blocks
d. strata
e. None of these options
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

7. A sample in which the sampling units are chosen from the population by means of a random
mechanism is a
a. probability sample c. stratified sample
b. judgmental sample d. systematic sample
ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

8. A judgmental sample is a sample in which the


a. sampling units are chosen using a random number table
b. quality of sampling units is judged
c. sampling units are chosen according to the sampler’s judgment
d. sampling units are all biased and vocal about it
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

9. The defining property of a simple random sample is that:


a. every sample has the same chance of being chosen
b. the samples that are easiest to access are chosen
c. the fewest samples are chosen
d. every fourth subject is chosen as a sample
ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

10. The probability of being chosen in a simple random sample of size n from a population of size N is:
a. 1/N c. N/n
b. N – 1/n d. n/N
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

11. Selecting a random sample from each identifiable subgroup within a population is called:
a. random sampling
b. systematic sampling
c. stratified sampling
d. cluster sampling
e. None of these options
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

12. Potential sample members, called sampling units, are:


a. people c. households
b. companies d. All of these options
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

13. In sampling, a population is:


a. the set of all humans
b. the set of all members about which a study intends to make inferences
c. any group of test subjects
d. a random group of individuals, households, cities or countries
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

14. The key to using stratified sampling is:


a. identifying the strata c. defining the strata
b. selecting the appropriate strata d. randomizing the strata
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

15. A sampling error is the result of:


a. measurement error c. nontruthful responses
b. nonresponse bias d. bad luck
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

16. The standard deviation of $\bar{X}$ is usually called the


a. standard error of the mean c. standard error of the population
b. standard error of the sample d. randomized standard error
ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

17. The opportunity for nonsampling error is increased by:


a. larger sample sizes c. affluent samples
b. smaller sample sizes d. educated samples
ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

18. When a portion of the sample does not respond to the survey, _____ results.
a. measurement error
b. nonresponse bias
c. sampling error
d. systematic failure
e. nonlinear error
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

19. The accuracy of the point estimate is measured by its:


a. standard deviation c. sampling error
b. standard error d. nonsampling error
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

20. The sample mean $\bar{X}$ is the _____ estimate for the population mean $\mu$.
a. random c. simple
b. point d. interval
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

21. Non-truthful response is a particular problem when:


a. sensitive questions are asked. c. interviewers are not trained.
b. surveys are anonymous. d. the sample is from an unusual population.
ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

22. Measurement error occurs when:


a. a portion of the sample does not respond to the survey
b. the sample responses are not clear
c. the responses to questions do not reflect what the investigator had in mind
d. the investigator does not correctly tally all responses
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

23. The two basic sources for error when using random sampling are:
a. sampling and selection
b. identification and selection
c. sampling and nonsampling
d. bias and randomness
e. linear and nonlinear
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

24. Sampling error is evident when:


a. a question is poorly worded
b. the sample is too small
c. the sample is not random
d. the sample mean differs from the population mean
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

25. The opportunity for sampling error is decreased by:


a. larger sample sizes c. affluent samples
b. smaller sample sizes d. educated samples
ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

26. The theorem that states that the sampling distribution of the sample mean is approximately normal
when the sample size n is reasonably large is known as the:
a. central limit theorem c. simple random sample theorem
b. central tendency theorem d. point estimate theorem
ANS: A PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

27. A list of all members of the population is called a:


a. sampling unit c. frame
b. probability sample d. relevant population
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

28. There is approximately a _____% chance that any particular $\bar{X}$ will be within two standard deviations
of the population mean ($\mu$).
a. 90 c. 99
b. 95 d. 99.7
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

29. Which of the following statements are correct?


a. A point estimate is an estimate of the range of a population parameter
b. A point estimate is a single value estimate of the value of a population parameter
c. A point estimate is an unbiased estimator if its standard deviation is the same as the actual
value of the population standard deviation
d. All of these options
ANS: B PTS: 1

30. An unbiased estimator is a:


a. sample statistic used to approximate a population parameter
b. sample statistic, which has an expected value equal to the value of the population
parameter
c. sample statistic whose value is usually less than the population parameter
d. standard error of the mean
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

31. Which of the following statements are correct?


a. An interval estimate describes a range of values that is likely not to include the actual
population parameter
b. An interval estimate is an estimate of the range for a sample statistic.
c. An interval estimate is an estimate of the range of possible values for a population
parameter.
d. None of these options
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

32. Which of the following are reasons for why simple random sampling is used infrequently in real
applications?
a. Samples can be spread over a large geographic region
b. Simple random sampling requires that all sampling units be identified prior to sampling
c. Simple random sampling can result in underrepresentation or overrepresentation of certain
segments of the population
d. All of these options
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

33. If systematic sampling is chosen as the sampling technique, it is probably because:


a. Systematic sampling has better statistical properties than simple random sampling
b. Systematic sampling is more convenient
c. Systematic sampling always results in more representative sampling than simple random
sampling
d. None of these options
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

34. With proportional sample sizes:


a. The proportion of a stratum in the sample is independent of the proportion of that stratum
in the population
b. The proportion of a stratum in the sample is the same as the proportion of that stratum in
the population
c. The proportion of a stratum in the sample is greater than the proportion of that stratum in
the population
d. The proportion of a stratum in the sample is less than the proportion of that stratum in the
population
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

35. The approximate standard error of the sample mean is calculated as:
a. c.
b. d.

ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

36. The approximate 95% confidence interval for a population mean is:
a. c.
b. d.

ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference


37. The finite population correction factor, $\sqrt{(N-n)/(N-1)}$, should generally be used when:
a. N is any finite size
b. n is less than 5% of the population size N
c. n is greater than 5% of the population size N
d. n is any finite size
ANS: C PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

38. The reason the Central Limit Theorem (CLT) is such an important result in statistics is because:
a. The CLT allows us to assume that the population distribution is approximately normal,
provided n is reasonably large
b. The CLT allows us to estimate the population mean without knowing the exact form of
the population distribution, provided n is reasonably large
c. The CLT allows us to construct confidence intervals for the population mean without
knowing the exact form of the population distribution, provided n is reasonably large
d. All of these options
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

39. The Central Limit Theorem (CLT) is generally valid for:


a. n > 5
b. n > 10
c. n > 20
d. n > 30
e. any size n
ANS: D PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

40. The averaging effect means that as you average more and more observations from a given distribution,
the variance of the average
a. increases
b. decreases
c. is unaffected
d. could either increase, decrease or stay the same
ANS: B PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

TRUE/FALSE

1. The primary advantage of cluster sampling is sampling convenience (and possibly less cost). The
downside, however, is that the inferences drawn from a cluster sample can be less accurate, for a given
sample size, than for other sampling plans.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

2. We can measure the accuracy of judgmental samples by applying some simple rules of probability.
This way, judgmental samples are not likely to contain our built-in biases.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference


3. When we sample less than 5% of the population, the finite population correction factor,
fpc = $\sqrt{(N-n)/(N-1)}$, is used to modify the formula for the standard error of the sample mean.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

4. If a simple random sample of size n is chosen from a population of size N, then each member of the
population has probability N / n of being chosen in the sample.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

5. Simple random sampling can result in under-representation or over-representation of certain segments
of the population. This is one of several reasons that simple random samples are almost never used in
real applications.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

6. Simple random samples are typically used in real applications.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

7. A simple random sample is one where each member of the population has a known chance (this may
differ from one member to another) or probability of being chosen.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

8. A list of all members of the population from which we can choose a sample is called a frame, and the
potential sample members are called sampling units.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

9. In systematic sampling, one of the first k members is selected randomly, and then every kth member
after this one is selected. The value k is called the sampling interval and equals the ratio N / n, where N
is the population size and n is the desired sample size.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

10. A sample of size 20 is selected at random from a population of size N. If the finite population
correction factor is 0.9418, then N must be 169.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference
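
The arithmetic behind this item can be checked with a short sketch. It assumes the finite population correction factor has the form sqrt((N - n) / (N - 1)); the function name `fpc` is illustrative and does not come from the text.

```python
import math

def fpc(N, n):
    """Finite population correction, assuming the form sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

# Values from the item: a sample of size 20 and a candidate population size of 169.
factor = fpc(169, 20)
print(round(factor, 4))  # prints 0.9418
```

Because the rounded factor matches 0.9418, a population size of 169 is consistent with the statement.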

11. In stratified sampling, the population is divided into relatively homogeneous subsets called strata, and
then random samples are taken from each stratum.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

12. Stratified samples are typically not used in real applications because they provide less accurate
estimates of population parameters for a given sampling cost.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

13. In stratified sampling with proportional sample sizes, the proportion of each stratum selected differs
from stratum to stratum.
ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

14. In cluster sampling, the population is divided into subsets called clusters (such as cities or city blocks),
and then a random sample of the clusters is selected. Once the clusters are selected, we typically
sample all of the members in each selected cluster.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

15. Cluster sampling is often less convenient and more costly than other random sampling methods.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

16. A point estimate is a single numeric value, a “best guess” of a population parameter, calculated from
the sample data.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

17. The difference between the point estimate and the true value of the population parameter being
estimated is called the estimation error.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

18. An interval estimate is an interval calculated from the population data, where we strongly believe the
true value of the population parameter lies.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

19. One obvious advantage of stratified sampling is that we obtain separate estimates within each stratum
– which we would not obtain if we took a simple random sample from the entire population. A more
important advantage is that we can increase the accuracy of the resulting population estimates by using
appropriately defined strata.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

20. The sampling distribution of any point estimate (such as the sample mean or proportion) is the
distribution of the point estimates we would obtain from all possible samples of a given size drawn
from the population.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

21. An unbiased estimate is a point estimate such that the mean of its sampling distribution is equal to the
true value of the population parameter being estimated.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

22. A probability sample is a sample in which the sampling units are chosen from the population by means
of a random mechanism such as a random number table.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

23. The standard error of an estimate is the standard deviation of the sampling distribution of the estimate.
It measures how much estimates from different samples vary.
ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

24. Ideally, we prefer estimates that have large standard errors.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

25. The standard error of the sample mean is large when the observations in the population are spread out
(large $\sigma$), but the standard error can be reduced by taking a smaller sample.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

26. It is customary to approximate the standard error of the sample mean by substituting the sample
standard deviation s for $\sigma$ in the formula $SE(\bar{X}) = \sigma/\sqrt{n}$.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

27. An estimator is said to be unbiased if the mean of its sampling distribution equals the value of the
population parameter being estimated.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

28. Estimation is the process of inferring the value of an unknown population parameter using data from a
random sample.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

29. The sampling distribution of the mean will have the same mean as the original population from which
the samples were drawn.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

30. The sampling distribution of the mean will have the same standard deviation as the original population
from which the samples were drawn.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

31. Systematic sampling is generally similar to simple random sampling in its statistical properties.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

32. The randomized response technique is a way of getting at sensitive information to avoid estimation
errors due to nontruthful responses.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

33. Voluntary response bias occurs when the responses to questions do not reflect what the investigator
had in mind.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

34. The Central Limit Theorem (CLT) states that the sampling distribution of the mean is approximately
normal, no matter what the distribution of the population, so long as the sample size is large enough.
ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

35. If the sample size is greater than 30, the Central Limit Theorem (CLT) will always apply.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

36. The Central Limit Theorem (CLT) says that as long as the sample size is reasonably large, there is
about a 95% chance that the magnitude of the sampling error for the mean will be no more than two
standard errors.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

37. The size of a sample can be selected by first determining the desired standard error and then using the
formula to calculate n.

ANS: T PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

38. The averaging effect says that as you average more and more observations from a given distribution,
the variance of the average increases.

ANS: F PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

SHORT ANSWER

1. Consider the frame of 50 full-time employees of Computer Technologies, Inc (CTI). CTI’s human
resources manager has collected annual salary figures for all employees and she has calculated a mean
of $47,723, a median of $41,082 and a standard deviation of $24,167. A simple random sample of 10
employees is presented below (salary is in $1,000’s). Compute the mean, median, and standard
deviation for the sample and compare these statistics with the measures for the entire company.

Employee 1 2 3 4 5 6 7 8 9 10
Salary 38.8 46.7 61.1 49.6 58.5 78.8 36.7 46.5 47.6 56.7

ANS:
Sample statistics: mean = $52,100, median = $48,600, standard deviation = $12,279.5
Population parameters: mean = $47,723, median = $41,082, standard deviation = $24,167
The sample mean and median are larger than the corresponding population mean and median, but the
sample standard deviation is much smaller (about 51% of the population standard deviation).
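
The sample statistics above can be reproduced with a minimal sketch using Python's standard `statistics` module instead of Excel. The variable names are illustrative, and the salaries are entered in $1,000s exactly as in the table.

```python
import statistics

# Salaries of the 10 sampled employees, in $1,000s (from the table above).
salaries = [38.8, 46.7, 61.1, 49.6, 58.5, 78.8, 36.7, 46.5, 47.6, 56.7]

sample_mean = statistics.mean(salaries)
sample_median = statistics.median(salaries)
sample_sd = statistics.stdev(salaries)  # sample standard deviation (n - 1 divisor)

# Population standard deviation reported for the full frame, in $1,000s.
population_sd = 24.167
sd_ratio = sample_sd / population_sd

print(f"mean = {sample_mean:.1f}, median = {sample_median:.1f}, sd = {sample_sd:.4f}")
print(f"sample sd as a share of population sd = {sd_ratio:.0%}")
```

Note that `statistics.stdev` divides by n - 1, which is the convention that reproduces the $12,279.5 figure.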

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

2. A sales manager for a company that makes commercial ovens for restaurants is interested in estimating
the average number of restaurants in all metropolitan areas across the entire country. He does not have
access to the data for each metropolitan location, so he has decided to select a sample that will be
representative of all such areas, and will use a sample size of 30. Do you believe that simple random
sampling is the best approach to obtaining a representative subset of the metropolitan areas in the
given frame? Explain. If not, recommend how the sales manager might proceed to select a better
sample of size 30 from these data.

ANS:
Using a simple random sample may not be the best approach. If you are trying to determine the
number of restaurants in metropolitan areas, it seems as though this would be somewhat dependent on
the size (population) of the metropolitan areas under investigation. It may be better to use a stratified sample.
You could divide the metropolitan areas into several strata based on their population and then sample
within each stratum. This may be more representative of the metropolitan areas across the country.

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

NARRBEGIN: SA_81_82
A battery manufacturer wants to estimate the average number of defective (or dead) batteries contained
in a box shipped by the company. Production personnel at this company have recorded the number of
defective batteries found in each of the 2000 boxes shipped in the past week.
NARREND

3. (A) What sample size would be required for the production personnel to be approximately 95% sure
that their estimate of the average number of defective batteries per box is within 0.3 unit of the true
mean? Assume that the best estimate of the population standard deviation ($\sigma$) is 0.9 defective
batteries per box.

(B) How does your answer to (A) change if the production personnel want their estimate to be within
0.5 unit of the actual population mean? Evaluate the tradeoff between required accuracy and sample
size requirement for this case and the case in (A).

ANS:
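(A) $n = (2\sigma/B)^2 = (2(0.9)/0.3)^2 = 36$, where $B$ is the maximum probable absolute error.

(B) In this case, $n = (2(0.9)/0.5)^2 = 12.96 \approx 13$. This shows that we need almost 3
times as many observations to reduce the absolute error from 0.5 to 0.3 units. However, 36 is still a
relatively small sample, and it may be worth it to keep the absolute error within 0.3 units.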
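
The sample size calculation can be sketched as a small function. This is an illustration under the assumption that the approximate 95% rule uses a multiplier of 2 (which is what yields the sample size of 36 mentioned above); the function name and the rounding guard are not from the text.

```python
import math

def required_sample_size(sigma, max_error, multiplier=2.0):
    """Smallest n for which multiplier * sigma / sqrt(n) <= max_error."""
    raw = (multiplier * sigma / max_error) ** 2
    # Round before taking the ceiling so floating-point noise
    # (for example 36.00000000000001) does not push the result up by one.
    return math.ceil(round(raw, 9))

n_a = required_sample_size(sigma=0.9, max_error=0.3)  # part (A)
n_b = required_sample_size(sigma=0.9, max_error=0.5)  # part (B)
print(n_a, n_b, round(n_a / n_b, 2))
```

The ratio of the two sample sizes illustrates the tradeoff: the required n grows with the square of the reduction in allowable error.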

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

NARRBEGIN: SA_83_85
Auditors of Old Kent Bank are interested in comparing the reported value of customer savings account
balances with their own findings regarding the actual value of such assets. Rather than reviewing the
records of each savings account at the bank, the auditors decide to examine a representative sample of
savings account balances. The frame from which they will sample is shown below.

$75.30 $614.11 $696.34 $572.08
$748.23 $21.20 $99.79 $1,233.38
$530.40 $378.37 $596.14 $239.65
$2,995.38 $1,069.06 $929.80 $259.98
$123.65 $68.92 $192.35 $754.45
$309.00 $163.31 $71.75 $904.92
$40.70 $161.12 $459.38 $171.48
$402.81 $157.44 $41.81 $87.08
$489.97 $468.12 $400.57 $319.40
$533.82 $1,801.35 $1,666.50 $37.16
$85.92 $91.43 $193.14 $106.95
$214.62 $10.62 $582.18 $39.65
$123.66 $76.33 $291.73 $398.48
$659.18 $101.24 $1,740.47 $322.26
$1,509.34 $1,599.04 $358.62 $492.05
$1,052.68 $596.33 $100.54 $1,288.70
$421.46 $1,799.51 $581.21 $571.63
$180.58 $98.82 $358.68 $38.93
$874.78 $2,761.93 $750.44 $376.60
$269.48 $456.79 $216.81 $305.49

NARREND

4. (A) What sample size would be required for the auditors to be approximately 95% sure that their
estimate of the average savings account balance at this bank is within $150 of the true mean? Assume
that their best estimate of the population standard deviation is $300.

(B) Choose a simple random sample of the size found in (A).

(C) Compute the observed sampling error based on the sample you have drawn from the population.
How does the actual sampling error compare to the maximum probable absolute error
established in (A)? Explain.

ANS:
(A) $n = (2\sigma/B)^2 = (2(300)/150)^2 = 16$, where $B$ is the maximum probable absolute error.

(B) The simple random sample of size 16 was generated using StatTool’s Random Sample tool in the
Data Utilities section. Next, the VLOOKUP function was used to place the appropriate balances next
to the customers that were selected to be included in the sample. The following sample was obtained.

Customer Balance
40 456.79
51 193.14
63 239.65
37 1799.51
8 402.81
20 269.48
42 99.79
39 2761.93
78 38.93
3 530.40
35 1599.04
64 259.98
14 659.18
32 10.62
11 85.92
68 87.08

(C)
Based on the above sample (results will differ):
The sample mean = $593.39
The frame mean = $537.31
The sampling error is the difference between the sample mean and the frame mean. In this case, the
sampling error is $56.08, which is much less than the maximum probable absolute error of $150. This
is expected: the maximum probable absolute error is the bound that the sampling error should stay
within with about 95% certainty, so most samples produce an observed error smaller than this bound,
as this one does.
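
The figures in (C) can be verified with a short sketch. The balances are the 16 sampled values listed in (B), and the frame mean of $537.31 is taken from the answer as given rather than recomputed; the variable names are illustrative.

```python
# Balances of the 16 sampled customers, in dollars (from the sample in part (B)).
balances = [
    456.79, 193.14, 239.65, 1799.51, 402.81, 269.48, 99.79, 2761.93,
    38.93, 530.40, 1599.04, 259.98, 659.18, 10.62, 85.92, 87.08,
]

frame_mean = 537.31         # reported frame mean, taken as given
max_probable_error = 150.0  # bound chosen in part (A)

sample_mean = sum(balances) / len(balances)
sampling_error = abs(sample_mean - frame_mean)
within_bound = sampling_error <= max_probable_error

print(f"sample mean = {sample_mean:.2f}")
print(f"sampling error = {sampling_error:.2f}, within bound: {within_bound}")
```

A different random sample would give a different sampling error, so only this particular sample reproduces these numbers.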

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

NARRBEGIN: SA_86_90
A statistics professor has just given the final examination in his introductory statistics course. In
particular, he is interested in learning how his class of 50 students performed on this exam. The data
are shown below.

78 72 73 75 79 72 75 77 71 78
83 84 71 81 82 79 71 73 89 74
75 93 74 88 83 90 82 79 62 73
88 76 76 76 80 84 84 91 70 76
74 68 80 87 92 84 79 80 91 74
NARREND

5. (A) Using these 50 students as the frame, use Excel to generate a simple random sample of size 10
from this frame.

(B) Compute the mean scores in the frame and the simple random sample you generated in (A).

(C) Compare the mean scores you computed in (B). Is your simple random sample a good
representative of the frame? Why or why not?

(D) Using these 50 students as the frame, use Excel to generate a systematic sample of size 10 from
this frame.

(E) Compare the mean scores in the frame with that in the systematic sample in (D). What do you
conclude?

ANS:
(A) To solve this problem, we first generated an index value for each score in the given frame.
Then we used StatTool’s Random Sample tool in the Data Utilities section to generate a simple random
sample of scores from the population. Lastly, we used VLOOKUP function to find the corresponding
score for each index value. This process resulted in the following sample:

Index 6 34 37 10 26 24 32 36 4 33
Score 72 76 84 78 90 88 76 84 75 76

(B) Population mean score = 78.92, Sample mean score = 79.90 in the above case.

(C) The mean of the sample generated from the given frame of scores is clearly very close to the mean
of the population. Therefore we may conclude that the simple random sample is fairly representative of
the population of introductory statistics final exam scores.
(D) In order to generate a systematic sample, we must first divide the frame size (50) by the desired
sample size (10) to find the relevant intervals from which we will sample. The sampling interval in this
case is 5, meaning that every 5th score will be included in the sample. Next, we randomly choose a
number between 1 and 5. Suppose that this number happens to be 1. This will be our starting point in
the first block of 5 scores. To identify every 5th score thereafter, we first developed an index column to
assign an index value to each score. We then used Excel’s MOD function to label every 5th score with a
“1” assigned to it. We have now generated a systematic sample of size 10. The sample consists of the
following values: 78, 72, 83, 79, 75, 90, 88, 84, 74, and 84.

(E) The means of the frame and of the sample were found to be 78.92 and 80.7, respectively. We see
that these means are very close. From this analysis, we can conclude that the systematic sample is
fairly representative of the frame or population.
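
The systematic sampling procedure in (D) can also be sketched outside Excel. The example below is an illustration in Python, not the method used in the answer; it assumes the scores are indexed row by row as they appear in the table, and the function name `systematic_sample` is hypothetical.

```python
import random

# The 50 exam scores, read row by row from the table (index 1 is the first score).
scores = [
    78, 72, 73, 75, 79, 72, 75, 77, 71, 78,
    83, 84, 71, 81, 82, 79, 71, 73, 89, 74,
    75, 93, 74, 88, 83, 90, 82, 79, 62, 73,
    88, 76, 76, 76, 80, 84, 84, 91, 70, 76,
    74, 68, 80, 87, 92, 84, 79, 80, 91, 74,
]

def systematic_sample(frame, n, start=None):
    """Pick one of the first k members, then every kth member after it."""
    k = len(frame) // n              # sampling interval N / n
    if start is None:
        start = random.randint(1, k)  # random 1-based starting point in the first block
    return [frame[i] for i in range(start - 1, len(frame), k)][:n]

# A starting point of 1 reproduces the sample described in (D).
sample = systematic_sample(scores, 10, start=1)
frame_mean = sum(scores) / len(scores)
sample_mean = sum(sample) / len(sample)
print(sample)
print(round(frame_mean, 2), round(sample_mean, 1))
```

Calling `systematic_sample(scores, 10)` without `start` picks the starting point at random, so results will then differ from run to run.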

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

6. A cannery claims that its sardine cans have a net weight of 8 oz., with a standard deviation of 0.1 oz.
You take a simple random sample of 30 cans and encounter a sample mean of 7.85 oz. Are you
inclined to believe the claim?

ANS:
The sampling distribution of $\bar{X}$ is approximately normal (since $n \ge 30$), with mean and standard deviation
given by $E(\bar{X}) = \mu = 8$ and $SE(\bar{X}) = \sigma/\sqrt{n} = 0.1/\sqrt{30} = 0.0183$, respectively.
Therefore, $P(\bar{X} < 7.85) = P(Z < -8.2) \approx 0$. If the claim were true, such a sample would almost never be
encountered, so the claim is not believable.
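
The calculation can be checked with a minimal sketch that uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later). The variable names are illustrative, and the normal approximation is assumed to hold for n = 30.

```python
import math
from statistics import NormalDist

claimed_mean = 8.0    # claimed net weight, oz
sigma = 0.1           # claimed standard deviation, oz
n = 30                # sample size
observed_mean = 7.85  # sample mean, oz

standard_error = sigma / math.sqrt(n)
z = (observed_mean - claimed_mean) / standard_error
# Probability of a sample mean this low or lower if the claim were true.
p_lower = NormalDist().cdf(z)

print(f"SE = {standard_error:.4f}, z = {z:.1f}, P = {p_lower:.2e}")
```

The probability is so small that it is effectively zero, which is what drives the conclusion above.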

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

NARRBEGIN: SA_92_93
An editor of a local newspaper is concerned with the number of errors that are found in the daily paper.
In order to understand the extent of this problem, the editor would like to estimate the average number
of errors in the daily paper. The frame in this case is the number of errors found in the daily paper for
the past six months (180 issues).
NARREND

7. (A) What sample size would be required for the editor to be approximately 95% sure
that the estimate of the average number of errors per issue is within 4 errors of the true mean?
Assume that the editor’s best estimate of the population standard deviation ($\sigma$) is 10 errors per issue.

(B) How does your answer to (A) change if the editor wants the estimate to be within 3 errors of the
actual population mean? Explain the difference in your answers to (A) and (B).

ANS:
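(A) $n = (2\sigma/B)^2 = (2(10)/4)^2 = 25$, where $B$ is the maximum probable absolute error.

(B) In this case, $n = (2(10)/3)^2 \approx 44.4$, which rounds up to 45. This shows that we need almost
twice as many observations to decrease the absolute error from 4 to 3.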

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

NARRBEGIN: SA_94_95
Suppose that you are an entrepreneur interested in establishing a new Internet-based auction service.
Furthermore, suppose that you have gathered basic demographic information on a large number of
Internet users. You currently have information on 1000 individuals related to their gender, age,
education, marital status, annual household income, and number of people in household. Assume that
these individuals were carefully selected through stratified sampling.
NARREND

8. (A) To assess potential interest in your proposed enterprise, you would like to conduct telephone
interviews with a representative subset of the 1000 Internet users. How would you proceed to stratify
the given frame of 1000 individuals to choose 50 for telephone interviews? Explain your approach.

(B) Explain how you could apply cluster sampling to obtain a sample size of 50 from this frame. What
are the advantages and disadvantages of employing cluster sampling in this case?

ANS:
(A) Which of these factors will have an impact on the use of the auction service? You may want to use
gender, age, and annual household income. You should attempt to gather data on individuals that
represent the different gender, age, and annual income groups that represent your customers. You may
find that you have different responses between these groups.

(B) You may decide that you want to sample 50 people in your immediate area. You can use your local
phone directory and call customers in your area. This type of sampling is convenient and is less costly.
The drawback is that the inference drawn from this type of sample may not be representative of the
entire population.

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

NARRBEGIN: SA_96_102
The manager of a small computer company has collected current annual salaries and number of years
of post-secondary education for 52 full-time employees. The data are shown below:

Current annual salaries:

Number of years of post-secondary education:


NARREND

9. (A) Compute the mean, median, and standard deviation of the annual salaries for the 52 employees in
the given frame.

(B) Use Excel to choose a systematic sample of size 13 from the frame of annual salaries.

(C) Compute the mean, median, and standard deviation of the annual salaries for the 13 employees
included in your systematic sample in (B).

(D) Compare your statistics in (C) with your computed descriptive measures for the frame in (A). Is
your systematic sample representative of the frame with respect to the annual salary variable?

(E) Assume that we wish to stratify these employees by the number of years of post-secondary
education. Select such a stratified sample of size 15 with approximately proportional sample sizes.

(F) Compute the mean, median, and standard deviation of the annual salaries for the 15 employees
included in your stratified sample in (E).

(G) Compare these statistics in (F) with your computed descriptive measures for the frame obtained in
(A). Is your stratified sample representative of the frame with respect to the annual salary variable?

ANS:
(A) The mean, median, and standard deviation of the given frame were computed using StatTools as
shown below:

(B) In order to generate a systematic sample, we must first divide the frame size by the desired sample
size to find the relevant intervals from which we will sample. The sampling interval in this case is 4,
meaning that every 4th salary will be included in the sample. Next, we randomly choose a number
between 1 and 4. Suppose that this number happens to be 1. This will be our starting point in the first
block of 4 salaries. To identify every 4th salary thereafter, we first developed an index column to
assign an index value to each salary. We then used Excel's MOD function to label every 4th salary with
a "1" in column C. Lastly, we used an IF statement to identify every value that has a "1" assigned to it.
We have now generated a systematic sample of size 13. The sample consists of the values shown below
(read across rows):

$38,450 $109,285 $87,489 $49,638 $76,927 $90,473
$89,867 $28,743 $39,205 $54,199 $49,987 $21,750
$31,008
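
The systematic selection described in (B) can also be sketched outside Excel. The Python below is a minimal illustration: the function name systematic_sample and the use of index positions 1 to 52 in place of the actual salaries (which are not reproduced here) are assumptions.

```python
import random


def systematic_sample(frame, sample_size, start=None, seed=None):
    """Take every k-th item, where k = frame size // sample size."""
    interval = len(frame) // sample_size
    if start is None:
        # Random starting point inside the first block of `interval` items.
        start = random.Random(seed).randint(1, interval)
    # Positions are 1-based: start, start + k, start + 2k, ...
    positions = [start + step * interval for step in range(sample_size)]
    return [frame[position - 1] for position in positions]


# Stand-in frame: index values 1..52 instead of the real salary column.
frame_index = list(range(1, 53))
sample_positions = systematic_sample(frame_index, sample_size=13, start=1)
print(sample_positions)  # [1, 5, 9, ..., 49]
```

With a starting point of 1 and an interval of 4, the selected positions are 1, 5, 9, and so on up to 49, which gives 13 salaries in total.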

(C)
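
The summary measures for the systematic sample can be recomputed from the 13 salaries listed in (B). The following is a minimal Python sketch using only the standard library; the variable names are illustrative, and the sample standard deviation (n - 1 in the denominator) is assumed.

```python
from statistics import mean, median, stdev

# The 13 salaries of the systematic sample listed in (B).
sample_salaries = [
    38450, 109285, 87489, 49638, 76927, 90473,
    89867, 28743, 39205, 54199, 49987, 21750,
    31008,
]

sample_mean = mean(sample_salaries)
sample_median = median(sample_salaries)
sample_std = stdev(sample_salaries)  # sample standard deviation (n - 1)

print(f"mean   = {sample_mean:,.2f}")
print(f"median = {sample_median:,}")
print(f"stdev  = {sample_std:,.2f}")
```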

(D) After generating the summary measures for both the frame and the sample, we can conclude that
the sample does not represent the frame well. The mean, median, and standard deviation of the frame
are all much smaller than the mean, median, and standard deviation of the sample.

(E) This portion of the solution involves several steps. First, we noted the total sample size needed.
Second, we developed the strata we will use to separate the given frame: in this case we placed every
two years in a new stratum as shown below. Next, we generated a column labeled "Category", to place
a number between 1 and 5 next to the salary that corresponds with the stratum of that number. For
example, if the annual salary was of a person who only had 2 years of education beyond secondary
education, then a number 2 for Stratum 2 was placed next to the salary. The "Category" column was
generated using an IF statement. We then unstacked the categories in order to count the number of
salaries in each stratum. This was done by using StatTools's Data Utilities/Unstack function. Once this
was completed, we used Excel's COUNT function to count the number of values in each stratum and
then generated proportional numbers for each stratum with respect to the size of the given population.
Once the proportions were generated, we used Excel's random number function to assign a random
number to each salary. Then, by using Excel to sort the salaries in each stratum by their random
number (in this case by ascending number) we selected the salaries in each stratum that will be
included in the sample. These salaries are shown below.
(F)

(G) When looking at the mean, median, and standard deviation of both the sample and population, we
can conclude that the stratified sample represents the population fairly well, although the summary
measures are all slightly lower than those of the population.

PTS: 1 MSC: AACSB: Analytic | AACSB: Descriptive Statistics

NARRBEGIN: SA_103_105
Sally Bird of Big Rapids Realty has received data on 60 houses that were recently sold in Mecosta
County in Michigan. The data are recorded in the table shown below. Included in this data set are
observations for each of the following variables:

· The appraised value of each house (in thousands of dollars)
· The selling price of each house (in thousands of dollars)
· The size of each house (in hundreds of square feet)
· The number of bedrooms in each house
NARREND

10. (A) Suppose that Sally wishes to examine a representative subset of these 60 houses that has been
stratified by the number of bedrooms. Use Excel to assist her by finding such a stratified sample of
size 10 with proportional sample sizes.

(B) Explain how Sally could apply cluster sampling in selecting a sample of size 15 from this frame.

(C) What are the advantages and disadvantages of employing cluster sampling in this case?

ANS:
(A) In this problem, the stratified sample was found by using strata that were based on the number of
bedrooms in the house. Once we established how to stratify the frame, we unstacked the prices
according to the strata (in this case, the number of bedrooms). This was done by using StatPro’s Data
Utilities/Unstack variables. Once this was completed, we counted the number of houses in each
stratum and then assigned a proportional size to each stratum relative to the size of the frame (in this
case, size of the frame is 60). After the proportions were generated, we used Excel to generate a
random number for each price in each stratum. Next, we used Excel’s sort function to place the prices
in order of ascending random numbers. We then chose the prices to be included in the stratified
sample. These results are shown below. Note that the stratified sample size is 11 (not 10) due to
rounding.

HOUSE #   BEDROOMS   PRICE
17        2          132.54
29        2          111.95
18        2          114.33
2         2          111.70
1         3          132.98
45        3          136.16
27        3          153.69
34        3          127.30
12        4          136.51
32        4          155.46
6         5          162.03
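
The proportional allocation described in (A) can be sketched in Python. The stratum counts below are hypothetical, since the original table of 60 houses is not reproduced here; they were chosen only to show how rounding each stratum separately can produce a total of 11 rather than 10. The function names and the house IDs are likewise illustrative.

```python
import random


def proportional_allocation(stratum_sizes, sample_size):
    """Round each stratum's proportional share of the sample separately."""
    total = sum(stratum_sizes.values())
    return {
        stratum: round(sample_size * count / total)
        for stratum, count in stratum_sizes.items()
    }


def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    sizes = {stratum: len(members) for stratum, members in strata.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {
        stratum: rng.sample(members, allocation[stratum])
        for stratum, members in strata.items()
    }


# HYPOTHETICAL counts of houses by number of bedrooms (sums to 60).
hypothetical_counts = {2: 22, 3: 23, 4: 10, 5: 5}
allocation = proportional_allocation(hypothetical_counts, sample_size=10)
print(allocation, sum(allocation.values()))

# HYPOTHETICAL house IDs grouped by bedrooms, numbered 1..60 in order.
house_ids = iter(range(1, 61))
strata = {
    bedrooms: [next(house_ids) for _ in range(count)]
    for bedrooms, count in hypothetical_counts.items()
}
chosen = stratified_sample(strata, sample_size=10, seed=1)
```

Rounding each stratum independently keeps every stratum represented, at the cost of a total that can drift from the target by one or two units.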

(B) In this situation, Sally could have selected a few neighborhoods within Mecosta County, Michigan,
and obtained all the sample information from the selected neighborhoods.

(C) By using cluster sampling, Sally would be able to generate her sample more quickly and
conveniently. The disadvantage of cluster sampling in this case is that Sally would have to make sure
she selected neighborhoods that fairly represented the variety of households in the county. For
example, if the county had a large variety of homes with only 2 bedrooms, but the sample
neighborhoods selected mostly contained homes with 4 bedrooms, the sample information would not
fairly represent the entire frame. If this were the case, cluster sampling would not be a good way to
select a sample.

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference

NARRBEGIN: SA_106_109
The manager of a local fast-food restaurant is interested in improving service provided to customers
who use the restaurant’s drive-up window. As a first step in the process, the manager asks his assistant
to record the time (in minutes) it takes to serve a large number of customers at the final window in the
facility’s drive-up system. The given frame in this case is 200 customer service times observed during
the busiest hour of the day for this fast-food restaurant. The frame of 200 service times yielded a mean
of 0.881. A simple random sample of 10 from this frame is presented below.

Customer 1 2 3 4 5 6 7 8 9 10
Service time 1.02 1.18 0.95 0.90 0.85 1.10 0.75 0.60 1.25 1.00

NARREND

11. (A) Compute the point estimate of the population mean from the sample above. What is the sampling
error in this case? Assume that the population consists of the given 200 customer service times.

(B) Compute the point estimate of the population standard deviation from the sample above.

(C) Should you use the finite population correction (fpc) factor to estimate the standard error of X̄?
Explain. If your answer is yes, what is the value of the fpc?

(D) Determine a good approximation to the standard error of the mean in this case.

ANS:
(A) Sample mean = 0.96. Then, sampling error = 0.96 – 0.881 = 0.079

(B) s = 0.1963

(C) Yes, we should use the finite population correction factor in this case, since a sample size of 10 is
5% of the population size of 200. Here fpc = √((N – n)/(N – 1)) = √(190/199) = 0.9771.

(D) SE(X̄) ≈ fpc × s/√n = 0.9771 × 0.1963/√10 ≈ 0.0607

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference
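
The calculations in problem 11 can be verified with a short Python sketch built on the ten service times given in the narrative. The variable names are illustrative; the standard error uses the usual finite-population form fpc × s/√n.

```python
import math
from statistics import mean, stdev

service_times = [1.02, 1.18, 0.95, 0.90, 0.85, 1.10, 0.75, 0.60, 1.25, 1.00]
frame_size = 200      # N, size of the frame
frame_mean = 0.881    # mean of all 200 service times

n = len(service_times)
x_bar = mean(service_times)                 # point estimate of the mean
sampling_error = x_bar - frame_mean         # estimate minus true mean
s = stdev(service_times)                    # sample standard deviation
fpc = math.sqrt((frame_size - n) / (frame_size - 1))
standard_error = fpc * s / math.sqrt(n)     # corrected standard error

print(f"x_bar = {x_bar:.2f}, sampling error = {sampling_error:.3f}")
print(f"s = {s:.4f}, fpc = {fpc:.4f}, SE = {standard_error:.4f}")
```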

NARRBEGIN: SA_110_111
A university bookstore manager is mildly concerned about the number of textbooks that were under-
ordered and thus unavailable two days after the beginning of classes. The manager instructs an
employee to pick a random number, go to the place where that number book is shelved, examine the
next 50 titles, and record how many titles are unavailable.
NARREND

12. (A) Technically, this process does not yield a random sample of the books in the store. Why not?

(B) How could a truly random sample be obtained?

ANS:
(A) For true random sampling, all possible combinations of 50 books must have equal probability of
being sampled. In this process, books that are shelved far from each other could not be in the same
sample. Thus not all combinations would have equal probability; some would have probability 0.

(B) Obtain an inventory list of all book titles and number the books. Use a table of random numbers
(or computer generated random numbers) to select 50 books to be examined.

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference
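
The procedure in 12(B) can be sketched with computer-generated random numbers. In the Python below, the inventory size of 2000 titles and the function name simple_random_sample are hypothetical; the source does not state how many titles the store carries.

```python
import random


def simple_random_sample(num_titles, sample_size, seed=None):
    """Pick `sample_size` distinct title numbers from 1..num_titles."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so every subset of the
    # requested size is equally likely.
    return sorted(rng.sample(range(1, num_titles + 1), sample_size))


# HYPOTHETICAL inventory of 2000 numbered titles.
titles_to_check = simple_random_sample(num_titles=2000, sample_size=50, seed=7)
print(titles_to_check[:5])
```

Unlike the "next 50 titles" procedure, this selection lets books shelved far apart appear in the same sample.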

NARRBEGIN: SA_112_116
Suppose that the average weekly earnings for employees in general automotive repair shops is $450,
and that the standard deviation for the weekly earnings for such employees is $50. A sample of 100
such employees is selected at random.
NARREND

13. (A) Find the mean and standard deviation of the sampling distribution of the average weekly earnings
in the sample.

(B) Find the probability that the mean of the sample is less than $445.

(C) Find the probability that the mean of the sample is between $445 and $455.

(D) Find the probability that the mean of the sample is greater than $460.

(E) Explain why the assumption of normality about the distribution of the average weekly earnings for
employees was not involved in the answers to (A) through (D).

ANS:
(A) E(X̄) = μ = 450, and SE(X̄) = σ/√n = 50/√100 = 5

(B) P(X̄ < 445) = P(Z < -1) = 0.5000 – 0.3413 = 0.1587

(C) P(445 < X̄ < 455) = P(-1.0 < Z < 1.0) = 2(0.3413) = 0.6826

(D) P(X̄ > 460) = P(Z > 2.0) = 0.5000 – 0.4772 = 0.0228


(E) The sample size is large; n = 100 is greater than 30, so by the central limit theorem the sampling
distribution of the sample mean is at least approximately normal, whatever the shape of the
distribution of individual weekly earnings.

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference
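
The probabilities in problem 13 can be reproduced without a Z table. The sketch below uses Python's statistics.NormalDist and assumes, as the solution does, that the sample mean is approximately normal with mean 450 and standard error 5; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 450, 50, 100
standard_error = sigma / math.sqrt(n)       # 50 / 10 = 5

# Approximate sampling distribution of the sample mean.
xbar_dist = NormalDist(mu, standard_error)

p_below_445 = xbar_dist.cdf(445)
p_between_445_455 = xbar_dist.cdf(455) - xbar_dist.cdf(445)
p_above_460 = 1 - xbar_dist.cdf(460)

print(f"{p_below_445:.4f} {p_between_445_455:.4f} {p_above_460:.4f}")
```

Small differences in the last decimal place relative to the table-based answers (for example 0.6827 against 0.6826) come from rounding in the printed Z table.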

NARRBEGIN: SA_117_120
A columnist for the LA Times is working to meet a deadline on a story about commuting in Los
Angeles. She wants to include information about the current price of gasoline in the Los Angeles metro
area, but her source person for this type of information has already gone home for the day. So she
decides to take her own sample as she drives home, writing down the prices she observes as she makes
her way from downtown to her neighborhood in the suburbs. Below is the data sample she obtains
(units are $/gallon).

NARREND

14. (A) Do you think she has obtained a true random sample?

(B) What average price could she report, based on the above sample?

(C) What average price range could she report, based on the above sample?

(D) Do you see any issues with reporting the range calculated for (C)?

ANS:
(A) For a true random sample, all possible gas stations in the LA metro area must have an equal chance
of being sampled. In this case, only the stations on her route home were sampled, although they do at
least represent a variety of settings (different parts of town). Given her time constraints, this sample
may suffice, though.

(B) The sample mean is $3.23

(C) Using the sample mean and sample standard deviation (0.185), she could calculate a 95%
confidence interval for the true mean price of $3.15/gallon to $3.30/gallon.

(D) The sample, in addition to perhaps not being truly random, may also be too small to justify the
assumptions used in calculating the range in (C). Typically we want n > 30 unless the population data
are approximately normal.

PTS: 1 MSC: AACSB: Analytic | AACSB: Statistical Inference
