
The T-Distribution and T-Statistics

INTRODUCTION

Previously, we have seen that under certain conditions the sampling distribution of sample means is
normally distributed. As with any normal distribution, we are able to standardize a distribution of sample
means by finding Z-scores. We can do this if we know the standard deviation of the population. But what
happens when we don’t know this?

In practice, population standard deviations are rarely known. In this lesson, we will see how not
knowing the population standard deviation changes the way we perform statistical inference
regarding a population mean.

When σ is Known

Suppose we have a population whose standard deviation (σ) is known. If the population is normal or the
sample size is sufficiently large (n > 30), a sampling distribution of sample means is approximately normal.
The standard error of the sample means is given by:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

The standardized value (Z-score) of a particular sample mean x̄ from a sample of size n is called a test
statistic (or Z-statistic) and is computed with the formula:

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

Since the sampling distribution is normal, the distribution of the test statistic (Z-statistic) is also normal. In
this case, the standard normal distribution describes the variability in the test statistic. Recall that in Module
7 we used Z-statistics to perform hypothesis tests about population proportions.
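
As a rough illustration of the Z-statistic formula, the sketch below computes the standard error and the Z-statistic for one sample mean. The numbers (a sample mean of 52, a population mean of 50, σ = 6, and n = 36) are hypothetical values chosen only for this example; they do not come from the lesson.

```python
import math


def z_statistic(sample_mean, mu, sigma, n):
    """Number of standard errors between sample_mean and mu when sigma is known."""
    standard_error = sigma / math.sqrt(n)  # sigma divided by the square root of n
    return (sample_mean - mu) / standard_error


# Hypothetical values, chosen only for illustration.
z = z_statistic(sample_mean=52.0, mu=50.0, sigma=6.0, n=36)
print(round(z, 2))  # 2.0
```

Here the standard error is 6 / 6 = 1, so a sample mean of 52 lies 2 standard errors above the population mean.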

When σ is Unknown

In most situations we do not know the population standard deviation. Another way of saying this is that σ is
unknown. The only option available to us is to approximate σ with a sample standard deviation, s. To do this
we need to substitute s for σ.

Statway® Version 3.1, © 2018 WestEd. All rights reserved.


When we make this substitution, the standard error of the sampling
distribution is estimated by

$$\frac{s}{\sqrt{n}}$$

The test statistic for a sample mean is

$$\frac{\bar{x} - \mu}{s / \sqrt{n}}$$

Since the sample standard deviation (s) varies from sample to sample and only
approximates σ, its use in the test statistic calculation introduces
additional variability in the test statistic. The added variability means
that the distribution of the test statistic is not normal. The test
statistic actually varies according to what we call the Student’s
T-distribution.
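
The added variability can be seen in a small simulation. The sketch below is only an illustration under assumed settings that are not part of the lesson: a normal population with mean 0 and standard deviation 1, samples of size 5, 20,000 repetitions, and a cutoff of 1.96 (the value a standard normal statistic exceeds in absolute value about 5% of the time). For each sample it computes the statistic once with the known σ and once with s in its place.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed settings for illustration only.
MU, SIGMA, N, TRIALS = 0.0, 1.0, 5, 20000
CUTOFF = 1.96

z_extreme = 0
t_extreme = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    z = (x_bar - MU) / (SIGMA / math.sqrt(N))  # uses the known sigma
    t = (x_bar - MU) / (s / math.sqrt(N))      # uses s in place of sigma
    z_extreme += abs(z) > CUTOFF
    t_extreme += abs(t) > CUTOFF

z_fraction = z_extreme / TRIALS
t_fraction = t_extreme / TRIALS
print(f"beyond the cutoff with sigma known: {z_fraction:.3f}")
print(f"beyond the cutoff with s used:      {t_fraction:.3f}")
```

With these settings, the statistic built from s should land beyond the cutoff noticeably more often than the one built from σ, which is the extra variability described above.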

The T-Distribution
The T-distribution describes the variability of the test statistic,

$$T = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

when

● the sampling distribution of sample means is normal, and
● a sample standard deviation (s) is used to estimate the
population standard deviation (σ).

This test statistic is called the T-statistic. Like the Z-statistic, the T-statistic is an estimate. It estimates how
many standard errors the sample mean is from the hypothesized population mean.

The T-distribution is a family of continuous probability distributions.


The width of a T-distribution depends on how much a sample
standard deviation can vary. The amount of variability in a sample
standard deviation depends on how many deviations vary freely
when it is computed.

The sample standard deviation is computed using the formula below:


$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

The deviations from the mean (x − x̄) are averaged in a sample standard deviation. When the deviations are
added together, they always add to zero. Because of this, the last deviation summed in this average is not
free: it is always the value that makes the resulting sum zero. There are n deviations from the mean, but
only n − 1 of these are free deviations.
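
The claim that only n − 1 deviations are free can be checked numerically. The sketch below uses the Sample A weights from the exercise later in this lesson; the variable names are illustrative choices.

```python
import math

weights = [466, 520, 512, 513, 498]  # Sample A from the exercise in this lesson
n = len(weights)
x_bar = sum(weights) / n

deviations = [x - x_bar for x in weights]

# The deviations cancel out, so their sum is zero (up to floating-point rounding).
total = sum(deviations)
print(abs(total) < 1e-9)

# Knowing the first n - 1 deviations therefore fixes the last one.
last_implied = -sum(deviations[:-1])
print(math.isclose(last_implied, deviations[-1]))

# Sample standard deviation: squared deviations divided by n - 1, then square-rooted.
s = math.sqrt(sum(d ** 2 for d in deviations) / (n - 1))
print(round(s, 2))  # 21.55
```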



The variability of standard deviations depends on the number of free deviations in the sample standard
deviation, n − 1. This quantity is known as the degrees of freedom (d.f.). Technically speaking, each value
of the degrees of freedom defines its own T-distribution.

It is important to know that the fewer the degrees of freedom, the more the sample standard deviation
varies. In other words, the smaller the sample size, the more the sample standard deviation varies. So, the
smaller the sample size, the greater the variability in a T-distribution.

T-distributions have the following characteristics:

● T-distributions are bell-shaped and symmetric with a mean of 0.


● Each T-distribution depends on the degrees of freedom, d.f.
● T-distributions have heavier tails and lower peaks than the standard normal distribution.
● The area under each T-distribution curve is 1.
● As the degrees of freedom increase, the tails become thinner.
● As the degrees of freedom increase, the T-distribution eventually approaches the standard normal
distribution.
● When making inferences about a population mean, the degrees of freedom are equal to the sample
size minus 1 (d.f. = n − 1).
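
Several of these characteristics can be checked by evaluating the curves directly. The sketch below uses the standard density formula for Student's T-distribution, which this lesson does not state, and compares the curve heights at 0 (the center) and at 3 (in the tail) with the standard normal curve. The degrees of freedom 4, 30, and 200 are arbitrary illustrative choices.

```python
import math


def t_pdf(x, df):
    """Density of Student's T-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Curve height at the center (x = 0) and in the tail (x = 3).
for df in (4, 30, 200):
    print(f"d.f. = {df:>3}: center {t_pdf(0, df):.4f}, tail {t_pdf(3, df):.4f}")
print(f"normal    : center {normal_pdf(0):.4f}, tail {normal_pdf(3):.4f}")
```

As the degrees of freedom grow, the tail height should shrink toward the normal value and the center height should rise toward it.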



TRY THESE

Researchers are very worried about how quickly sea ice is melting. If sea ice continues to melt, it is possible
that polar bears will become an endangered species. That means that eventually there may be no more
polar bears anywhere in the world. Polar bears use the sea ice for hunting and making their dens. Also, as
the ice melts and erodes, there are smaller seal populations. Seals are the polar bears’ main source of food.
This further endangers the polar bear population.

Biologists regularly visit Arctic regions to track the health and numbers of polar bears. One important
measurement is the weight of adult male polar bears.

Biologists estimate that the average weight of an adult male polar bear is approximately 475 kilograms
(1050 pounds). For the questions below, assume that polar bear weights are normally distributed.

1 Suppose that four random samples of five polar bears are drawn from four different areas in Alaska. The
weights of the polar bears in kilograms (kg) are displayed in the table below:

Sample A    466    520    512    513    498
Sample B    471    476    461    453    423
Sample C    493    482    431    450    452
Sample D    481    492    475    498    512

A All of the samples have the same sample size. What is the appropriate number of degrees of
freedom for this sample size? (Remember that degrees of freedom = n − 1.)

(1A) Answer: d.f. = 5 − 1 = 4.


B What are the mean and standard deviation of each sample? Complete the table below:

(1B) Answer:
            Mean (x̄)    Sample Standard Deviation (s)
Sample A    501.8       21.55
Sample B    456.8       20.89
Sample C    461.6       25.32
Sample D    491.6       14.54
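
One way to check these values is with Python's standard `statistics` module, whose `stdev` function uses the n − 1 divisor. This is a sketch for verification only; the dictionary layout is an illustrative choice.

```python
import statistics

# Polar bear weights in kg, copied from the table in question 1.
samples = {
    "A": [466, 520, 512, 513, 498],
    "B": [471, 476, 461, 453, 423],
    "C": [493, 482, 431, 450, 452],
    "D": [481, 492, 475, 498, 512],
}

# Map each sample name to (mean, sample standard deviation).
summary = {
    name: (statistics.mean(w), statistics.stdev(w))
    for name, w in samples.items()
}

for name, (x_bar, s) in summary.items():
    print(f"Sample {name}: mean = {x_bar:.1f}, s = {s:.2f}")
```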

C Calculate the T-statistic for each sample. Assume that the population mean weight (μ) = 475 kg.
Complete the table below.

T-Statistic

Sample A

Sample B

Sample C

Sample D

D Which of the four T-statistics in the previous question shows a sample mean that is the farthest
away from μ = 475 (in estimated standard errors)?

E Which of the four T-statistics indicate sample means that are below μ = 475?

F Which of the four T-statistics indicates a sample mean that is the closest (in estimated standard
errors) to μ = 475?

G Based on the sample statistics, which sample(s) should biologists and conservationists be most
worried about? (Note: Conservationists are people who are interested in conserving or saving
endangered animals or plant life.)

(1C) Answer:

T-Statistic
Sample A 2.78
Sample B −1.95
Sample C −1.18
Sample D 2.55

(1D) Answer: Sample A.

(1E) Answer: Sample B and Sample C.

(1F) Answer: Sample C.

(1G) Answer: Sample B.
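
The answers to 1C through 1F can be reproduced with a short script. This is a sketch for checking the arithmetic, with μ = 475 kg as given in the question; the helper name `t_statistic` is an illustrative choice.

```python
import math
import statistics

MU = 475  # hypothesized population mean weight in kg

# Polar bear weights in kg, copied from the table in question 1.
samples = {
    "A": [466, 520, 512, 513, 498],
    "B": [471, 476, 461, 453, 423],
    "C": [493, 482, 431, 450, 452],
    "D": [481, 492, 475, 498, 512],
}


def t_statistic(data, mu):
    """Estimated number of standard errors between the sample mean and mu."""
    standard_error = statistics.stdev(data) / math.sqrt(len(data))
    return (statistics.mean(data) - mu) / standard_error


t_stats = {name: t_statistic(w, MU) for name, w in samples.items()}
for name, t in t_stats.items():
    print(f"Sample {name}: T = {t:.2f}")

farthest = max(t_stats, key=lambda name: abs(t_stats[name]))  # question D
below = [name for name, t in t_stats.items() if t < 0]        # question E
closest = min(t_stats, key=lambda name: abs(t_stats[name]))   # question F
print(farthest, below, closest)
```

The sign of each T-statistic shows whether the sample mean is above or below 475 kg, and its absolute value shows how far away it is in estimated standard errors.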
