INTRODUCTION
Previously, we have seen that under certain conditions the sampling distribution of sample means is
normally distributed. As with any normal distribution, we are able to standardize a distribution of sample
means by finding Z-scores. We can do this if we know the standard deviation of the population. But what
happens when we don’t know this?
In this lesson, we will confront the reality that population standard deviations are rarely known. Not
knowing the population standard deviation influences the way we perform statistical inference regarding a
population mean.
When σ is Known
Suppose we have a population whose standard deviation (σ) is known. If the population is normal or the
sample size is sufficiently large (n > 30), a sampling distribution of sample means is approximately normal.
The standard error of the sample means is given by:
$$ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} $$
The standardized value (Z-score) of a particular sample mean x̄ from a sample of size n is called a test
statistic (or Z-statistic) and is computed with the formula:
$$ Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$
Since the sampling distribution is normal, the distribution of the test statistic (Z-statistic) is also normal. In
this case, the standard normal distribution describes the variability in the test statistic. Recall that in Module
7 we used Z-statistics to perform hypothesis tests about population proportions.
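To make the Z-statistic formula concrete, here is a minimal Python sketch. The function name `z_statistic` and the numbers passed to it are hypothetical, chosen only for illustration; they do not come from the lesson.

```python
import math

def z_statistic(sample_mean, mu, sigma, n):
    """Z-statistic for a sample mean when the population sigma is known."""
    standard_error = sigma / math.sqrt(n)  # sigma_xbar = sigma / sqrt(n)
    return (sample_mean - mu) / standard_error

# Hypothetical values, assumed for illustration only.
z = z_statistic(sample_mean=52.0, mu=50.0, sigma=8.0, n=64)
print(round(z, 2))
```

With these assumed values the standard error is 8 / √64 = 1, so the sample mean lies 2 standard errors above the hypothesized mean.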
When σ is Unknown
In most situations we do not know the population standard deviation. Another way of saying this is that σ is
unknown. The only option available to us is to approximate σ with a sample standard deviation, s. To do this
we need to substitute s for σ.
The T-Distribution
The T-distribution describes the variability of the test statistic obtained when s is substituted for σ:
$$ T = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$
This test statistic is called the T-statistic. Like the Z-statistic, the T-statistic is an estimate. It estimates how
many standard errors the sample mean is from the hypothesized population mean.
$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$
The deviations from the mean, x − x̄, are averaged in a sample standard deviation. When the deviations are
added together, they always add to zero. Because of this, the last deviation summed in this average is not
free: it is always the value that makes the resulting sum zero. There are n deviations from the mean, but
only n − 1 of these are free deviations. The number of free deviations, n − 1, is called the degrees of freedom.
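The claim that only n − 1 deviations are free can be checked numerically. The sketch below uses a small hypothetical data set (the five weights are assumed for illustration, not taken from the lesson) and compares a hand-computed sample standard deviation with Python's `statistics.stdev`, which also divides by n − 1.

```python
import math
import statistics

# Hypothetical weights in kg, assumed for illustration only.
weights = [470.0, 455.0, 490.0, 462.0, 483.0]
n = len(weights)
x_bar = sum(weights) / n

# Deviations from the sample mean always sum to zero
# (up to floating-point rounding).
deviations = [x - x_bar for x in weights]
total = sum(deviations)

# So the last deviation is fixed by the other n - 1 deviations.
last_from_others = -sum(deviations[:-1])

# Sample standard deviation: divide by n - 1, the degrees of freedom.
s = math.sqrt(sum(d ** 2 for d in deviations) / (n - 1))

print(total, last_from_others, deviations[-1])
print(s, statistics.stdev(weights))
```

Once the first four deviations are known, the fifth has no freedom left, which is why the sum of squared deviations is divided by n − 1 rather than n.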
It is important to know that the fewer the degrees of freedom, the more the sample standard deviation
varies. In other words, the smaller the sample size, the more the sample standard deviation varies. So, the
smaller the sample size, the greater the variability in a T-distribution.
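A small simulation can illustrate how the sample standard deviation varies more when the sample is small. This is only a sketch: the population standard deviation of 20 kg, the number of trials, the seed, and the function name `spread_of_sample_sd` are all assumptions made for illustration.

```python
import random
import statistics

def spread_of_sample_sd(n, trials=2000, mu=475.0, sigma=20.0, seed=0):
    """Standard deviation of s across many simulated normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample_sds = [
        statistics.stdev([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.stdev(sample_sds)

spread_small = spread_of_sample_sd(n=5)    # 4 degrees of freedom
spread_large = spread_of_sample_sd(n=50)   # 49 degrees of freedom
print(spread_small, spread_large)
```

Under these assumptions, the values of s from samples of 5 are spread out far more than those from samples of 50, which is the extra variability that the T-distribution accounts for.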
Researchers are very worried about how quickly sea ice is melting. If sea ice continues to melt, it is possible
that polar bears will become an endangered species. That means that eventually there may be no more
polar bears anywhere in the world. Polar bears use the sea ice for hunting and making their dens. Also, as
the ice melts and erodes, there are smaller seal populations. Seals are the polar bears’ main source of food.
This further endangers the polar bear population.
Biologists regularly visit Arctic regions to track the health and numbers of polar bears. One important
measurement is the weight of adult male polar bears.
Biologists estimate that the average weight of an adult male polar bear is approximately 475 kilograms
(1050 pounds). For the questions below, assume that polar bear weights are normally distributed.
1 Suppose that four random samples of five polar bears are drawn from four different areas in Alaska. The
weights of the polar bears in kilograms (kg) are displayed in the table below:
A All of the samples have the same sample size. What is the appropriate number of degrees of
freedom for this sample size? (Remember that degrees of freedom = n − 1.)
(1B) Answer:
            Mean (x̄), kg    Sample Standard Deviation (s), kg
Sample A    501.8           21.55
Sample B    456.8           20.89
Sample C    461.6           25.32
Sample D    491.6           14.54
T-Statistic
Sample A
Sample B
Sample C
Sample D
D Which of the four T-statistics in the previous question shows a sample mean that is the farthest
away from μ = 475 (in estimated standard errors)?
E Which of the four T-statistics indicate a sample mean that is below μ = 475?
F Which of the four T-statistics indicates a sample mean that is the closest (in estimated standard
errors) to μ = 475?
G Based on the sample statistics, which sample(s) should biologists and conservationists be most
worried about? (Note: Conservationists are people who are interested in conserving or saving
endangered animals or plant life.)
(1C) Answer:
            T-Statistic
Sample A     2.78
Sample B    −1.95
Sample C    −1.18
Sample D     2.55
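The T-statistics above can be reproduced from the sample means and standard deviations given in the (1B) answer, with n = 5 and μ = 475. The following Python sketch shows one way to do this; the function and variable names are illustrative choices, not part of the lesson.

```python
import math

MU = 475.0  # hypothesized mean weight in kg, from the lesson
N = 5       # polar bears per sample, from the lesson

# (sample mean, sample standard deviation) in kg, from the (1B) answer
samples = {
    "A": (501.8, 21.55),
    "B": (456.8, 20.89),
    "C": (461.6, 25.32),
    "D": (491.6, 14.54),
}

def t_statistic(x_bar, s, n, mu):
    """T = (x_bar - mu) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu) / (s / math.sqrt(n))

t_stats = {
    name: round(t_statistic(x_bar, s, N, MU), 2)
    for name, (x_bar, s) in samples.items()
}
print(t_stats)
```

Rounded to two decimal places, the results agree with the table: Samples A and D lie above the hypothesized mean, and Samples B and C lie below it.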