
Hypothesis Testing

Hypothesis testing is a decision-making process for evaluating claims about a population.


In this process, the researcher must define the population under study, state the particular
hypotheses to be investigated, give the significance level, select a sample from the population,
collect the data, perform the required test, and reach a conclusion. There are two specific statistical
tests for hypothesis testing on means: the z test and the t test. The chi-square test, on the other hand,
is used for testing the standard deviation.

Steps in Hypothesis Testing

Every hypothesis test begins with the statement of a hypothesis. A statistical hypothesis
is an inference about a population parameter. This inference may or may not be true. Anyone who
has watched commercial TV cannot fail to be aware of the constant barrage of claims: Brand
X detergent will wash white clothes sparkling white; with a certain gasoline, your car will get more
kilometres to the liter than before; and so on.

The only sure way of finding the truth or falsity of a hypothesis is by examining the entire
population. Because this is not always feasible, a sample is instead examined for the purpose of
drawing a conclusion.

The null hypothesis, symbolized as H0, states that there is no difference between a
parameter and a specific value. The alternative hypothesis, symbolized as Ha, states a specific
difference between a parameter and a specific value.

In order to state the hypothesis correctly, the researcher must translate correctly the claim
into mathematical symbols. There are three possible sets of statistical hypotheses.

1. H0: parameter = specific value    (two-tailed test)
   Ha: parameter ≠ specific value

2. H0: parameter = specific value    (left-tailed test)
   Ha: parameter < specific value

3. H0: parameter = specific value    (right-tailed test)
   Ha: parameter > specific value

In hypothesis testing, there are four possible outcomes, as shown in the table. In reality,
the null hypothesis may or may not be true. The decision to reject or not to reject it is made on the
basis of the data obtained from a sample of the population.

                 Reject H0           Do not reject H0

H0 is true       Type I error        Correct decision

H0 is false      Correct decision    Type II error

A type I error occurs if one rejects the null hypothesis when it is true. A type II error
occurs if one does not reject the null hypothesis when it is false.

The decision is made on the basis of probabilities. That is, if there is a large difference
between the value of the statistic obtained from the sample and the hypothesized parameter, the
null hypothesis is probably not true. The next question the researcher would ask is “How large a
difference is necessary to reject the null hypothesis?” Here is where the level of significance is
used.

The level of significance is the maximum probability of committing a type I error. This
probability is symbolized by α (Greek letter alpha). That is, P(Type I error) = α. The probability
of a type II error is symbolized by β (Greek letter beta). That is, P(Type II error) = β. In
most hypothesis-testing situations, however, β cannot be computed.

Generally, statisticians agree on using three arbitrary significance levels: the 0.10, 0.05,
and 0.01 levels. That is, when the null hypothesis is true, the probability of rejecting it (a type I
error) will be 10%, 5%, or 1%, and the probability of correctly not rejecting it will be 90%, 95%, or
99%, depending on which level of significance is used. In other words, when α = 0.05, there is a 5%
chance of rejecting a true null hypothesis.

In a hypothesis-testing situation, the researcher decides what level of significance to use.
It does not have to be one of the levels mentioned above. It can be any level, depending on the
seriousness of the type I error.
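
To see what the level of significance means in practice, the following is a minimal simulation sketch in Python (standard library only). It assumes a normal population whose true mean equals the hypothesized mean, so every rejection it counts is a type I error. The population values (mean 100, standard deviation 15), the sample size, the number of trials, and the function name are illustrative assumptions, not values from the text.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha=0.05, mu0=100.0, sigma=15.0, n=30,
                      trials=4000, seed=1):
    """Estimate how often a two-tailed z test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Two-tailed critical value: an area of alpha/2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a type I error
    return rejections / trials


if __name__ == "__main__":
    rate = type_i_error_rate()
    print(f"Observed type I error rate: {rate:.3f} (alpha = 0.05)")
```

The observed rejection rate should come out close to the chosen α, and choosing a smaller α lowers it.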

After a significance level is chosen, a critical value is selected from a table for the
appropriate test. The critical value separates the critical region from the noncritical region. The
critical region, or rejection region, is the range of values of the test value that indicates that there
is a significant difference and that the null hypothesis should be rejected. The noncritical region, or
non-rejection region, is the range of values of the test value that indicates that the difference was
probably due to chance and that the null hypothesis should not be rejected.

The rejection region can be located on both sides, with the non-rejection region in the middle,
or it can be on the left side or the right side of the non-rejection region. A test with two rejection
regions is called a two-tailed test. In this test, the null hypothesis should be rejected when the test
value is in either of the two critical regions. A one-tailed test indicates that the null hypothesis
should be rejected when the test value is in the critical region on one side of the mean. A
one-tailed test is either right-tailed, when the inequality in the alternative hypothesis is greater than
(>), or left-tailed, when the inequality is less than (<).
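[Figures: rejection regions under the standard normal curve. Two-tailed test: critical values −z(α/2) and z(α/2). Left-tailed test: critical value −z(α). Right-tailed test: critical value z(α).]
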
If the test is two-tailed, there are two critical values, one positive and one negative. If the test is
left-tailed, the critical value will be negative. If the test is right-tailed, the critical value will be
positive.

Next, the researcher must perform the required test to compute the test statistic. As
mentioned earlier, two tests can be used for means, the z test and the t test, and the
chi-square test is used for the standard deviation. Based on the test results, the researcher will reach a
conclusion about the population under study.

The steps are summarized below.
1. State the null and alternative hypotheses.
2. Select the level of significance.
3. Determine the critical value and the rejection region/s.
4. State the decision rule.
5. Compute the test statistic.
6. Make a decision, whether to reject or not to reject the null hypothesis.

Example 1 Using the z table, find the critical value of a two-tailed test with α = 0.05.

Solution Draw the figure and indicate the appropriate area. Since this is a two-tailed test,
there are two areas equivalent to α/2 or 0.05/2 = 0.025

Subtract 0.025 from 0.5 to get 0.475. Find the z value that corresponds to 0.475. In
this case, it is 1.96. Since this is a two-tailed test, there are two critical values: one
is positive, and the other is negative. They are +1.96 and -1.96.
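
The table lookup in Example 1 can also be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_z` and the tail labels "two", "left", and "right" are names chosen for this illustration.

```python
from statistics import NormalDist


def critical_z(alpha, tail):
    """Return the critical value(s) of z for a given alpha and tail type.

    tail is "two", "left", or "right" (labels chosen for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
        return (-z, z)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)      # negative critical value
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)  # positive critical value
    raise ValueError("tail must be 'two', 'left', or 'right'")


if __name__ == "__main__":
    lower, upper = critical_z(0.05, "two")
    print(f"{lower:.2f}, {upper:.2f}")  # -1.96, 1.96
```

Computed values can differ slightly from values read off a printed z table, because the table is rounded and sometimes requires interpolation.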

Example 2 State the null and alternative hypotheses for each statement.
a) The average age of bus drivers in Metro Manila is 38.8 years.
b) The average number of calories of a low-calorie meal is at most 300.

Solution a) H0: µ = 38.8 years        b) H0: µ = 300 calories
            Ha: µ ≠ 38.8 years           Ha: µ > 300 calories

In b), the claim “at most 300” includes equality, so it belongs to the null hypothesis; the
alternative hypothesis is its complement, µ > 300.

Exercises

Using the z table, find the critical value/s for each.

1. α = 0.01, two-tailed test
2. α = 0.10, left-tailed test
3. α = 0.005, right-tailed test
4. α = 0.04, two-tailed test
5. α = 0.02, right-tailed test
6. α = 0.05, left-tailed test

For each statement, state the null and alternative hypotheses.
1. The average pulse rate of female joggers is less than 72 beats per minute.
2. The average age of sales representatives at a drug company is greater than 27.6 years.
3. The average weight loss of people who enrolled in an aerobics class for one month is at
4. The average content of soda in a can is equal to 300 mL.

Test on Large Sample Mean

Many hypotheses are tested using a statistical test based on the following general formula:

    test value = (observed value − expected value) / standard error

The observed value is the statistic, such as the mean, that is computed from the sample data. The
expected value is the parameter that one would expect to obtain if the null hypothesis is true. The
denominator is the standard error of the statistic being tested.

The z test
The z test is a statistical test for the mean of a population. It can be used when the sample
size is 30 or more (n ≥ 30), or when the population is normally distributed and σ is
known. The formula for the z test is:

    z = (X̄ − µ) / (σ/√n)

where: X̄ - sample mean
       µ - hypothesized mean
       σ - population standard deviation
       n - sample size

The next example illustrates the steps in hypothesis testing using the z test.

Example 1 A manufacturer claims that the average lifetime of his light bulbs is 3 years or 36
months. The standard deviation is 8 months. Fifty light bulbs are selected, and the
average lifetime is found to be 32 months. Should the manufacturer’s statement be
rejected at α = 0.01?
Solution Step 1 State the hypotheses.

H0: µ = 36 months
Ha: µ ≠ 36 months

Step 2 Level of significance: α = 0.01

Step 3 Determine the critical values and rejection region.

The significance level is 0.01. The ≠ sign in the alternative hypothesis indicates that
the test is two-tailed with two rejection regions, one in each tail of the normal
distribution curve of X̄. Because the total area of both rejection regions is 0.01
(the significance level), the area of rejection in each tail is:

    area in each tail = α/2 = 0.01/2 = 0.005

These areas are shown in the figure. Since the area to the right of µ is 0.5, the area
between 0 and the critical value z is 0.5 − 0.005 = 0.495. Looking at the normal distribution table,
the critical values are z = ±2.575.

Step 4 State the decision rule.

Reject the null hypothesis if zc > 2.575 or zc < −2.575.

Step 5 Compute the test statistic.

    zc = (X̄ − µ) / (σ/√n) = (32 − 36) / (8/√50) = −3.54

Step 6 Make a decision.

The test statistic zc = −3.54 is less than the critical value z = −2.575, so it falls in the
rejection region in the left tail. Therefore, reject H0 and conclude that the average
lifetime of the light bulbs is not equal to 36 months.
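
The arithmetic in Steps 4 to 6 of Example 1 can be reproduced with a few lines of Python. This is only a sketch: the function name `z_statistic` is chosen for illustration, and the critical value 2.575 is taken from the z table as in the solution rather than computed.

```python
from math import sqrt


def z_statistic(sample_mean, hypothesized_mean, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sigma / sqrt(n))


if __name__ == "__main__":
    # Values from Example 1 (light bulbs).
    z_c = z_statistic(sample_mean=32, hypothesized_mean=36, sigma=8, n=50)
    z_critical = 2.575  # two-tailed critical value for alpha = 0.01, from the z table

    # Two-tailed decision rule: reject H0 if z_c > 2.575 or z_c < -2.575.
    reject_h0 = abs(z_c) > z_critical
    print(f"z_c = {z_c:.2f}, reject H0: {reject_h0}")  # z_c = -3.54, reject H0: True
```
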
Example 2 A test of car braking reaction times for men between 18 and 30 years old has
produced a mean and standard deviation of 0.610 sec and 0.123 sec, respectively.
When 40 male drivers of this age group were randomly selected and tested for their
braking reaction times, the mean was 0.587 sec. At the α = 0.10 level of
significance, test the claim of the driving instructor that his graduates have faster
reaction times.

Solution The claim of the instructor means that his graduates have a mean braking reaction
time of less than 0.610 sec.

Step 1 H0: µ = 0.610 sec
       Ha: µ < 0.610 sec

Step 2 α = 0.10

Step 3 Since α = 0.10 and the test is left-tailed, the critical value is z = −1.28.

Step 4 Reject H0 if zc < −1.28.

Step 5 Compute the test statistic.

    zc = (X̄ − µ) / (σ/√n) = (0.587 − 0.610) / (0.123/√40) = −1.18

Step 6 Since the test statistic zc = −1.18 is greater than the critical value z = −1.28, it
falls within the noncritical region, so do not reject H0. There is not enough evidence
to support the instructor’s claim.

Example 3 A diet clinic states that there is an average loss of 24 pounds for those who stay on
the program for 20 weeks. The standard deviation is 5 pounds. The clinic tries a
new diet reducing salt intake to see whether that strategy will produce a greater
weight loss. A group of 40 volunteers loses an average weight of 16.3 pounds each
over 20 weeks. Should the clinic change to the new diet? Use α = 0.05.

Solution Step 1 H0: µ = 24
                Ha: µ < 24

Step 2 α = 0.05

Step 3 z = −1.65

Step 4 Decision rule: Reject H0 if zc < −1.65.

Step 5 Compute the test statistic.

    zc = (X̄ − µ) / (σ/√n) = (16.3 − 24) / (5/√40) = −7.7 / (5/(2√10)) = −9.74

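Step 6 The test statistic zc = −9.74 is less than the critical value z = −1.65.
Therefore, reject H0 and conclude that the average weight loss on the new diet is less
than 24 pounds. The new diet does not produce a greater weight loss, so the clinic
should not change to it.
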
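
The one-tailed procedure used in Examples 2 and 3 can be collected into a single helper. The following is a sketch under stated assumptions: the function name `z_test`, its parameter names, and the tail labels are hypothetical, and the critical value is computed with `statistics.NormalDist` instead of being read from the z table.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha, tail):
    """Run a z test for a mean; tail is "left", "right", or "two".

    Returns (test statistic, critical value, reject_h0).
    """
    z_c = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "left":
        critical = std_normal.inv_cdf(alpha)
        reject = z_c < critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z_c > critical
    elif tail == "two":
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_c) > critical
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z_c, critical, reject


if __name__ == "__main__":
    # Example 2: braking reaction times, left-tailed, alpha = 0.10.
    print(z_test(0.587, 0.610, 0.123, 40, alpha=0.10, tail="left"))
    # Example 3: diet clinic, left-tailed, alpha = 0.05.
    print(z_test(16.3, 24, 5, 40, alpha=0.05, tail="left"))
```

For α = 0.05 the computed critical value is about −1.645, which the solution above uses in rounded form as −1.65; the decision is the same either way.
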
Exercises

In each of the following exercises, test the given hypotheses.

1. Consider the following null and alternative hypotheses.

H0: µ = 120 versus Ha: µ > 120

A random sample of 81 observations taken from this population produced a sample mean
of 123.5 and a sample standard deviation of 15. If this test is made at the 2.5% significance
level, would you reject the null hypothesis?
2. Test the claim that µ = 100 against µ > 100 given a sample of n = 81 for which X̄ = 100.8.
Assume that σ = 5, and test at the α = 0.01 significance level.
3. Test the claim that µ = 15.5 against µ < 15.5 given a sample of n = 45 for which X̄ = 14.3.
Assume that σ = 5.5, and test at the α = 0.05 significance level.
4. A survey found that women over the age of 55 consume an average of 1660 calories a day.
In order to see if the number of calories consumed by women over the age of 55 living in a
certain city is the same, a researcher sampled 43 women over the age of 55 and found that the
mean number of calories consumed was 1446. The standard deviation of the sample was
56 calories. At α = 0.10, can it be concluded that there is no difference between the mean number
of calories consumed by women over age 55 in this city and the survey average?
5. The manufacturer of a certain brand of auto batteries claims that the mean life of these
batteries is 45 months. A consumer protection agency that wants to check this claim took
a random sample of 36 such batteries and found that the mean life for this sample is 43.75
months with a standard deviation of 4 months. Using the 2.5% significance level, would
you conclude that the mean life of these batteries is less than 45 months?
6. A study claims that all adults spend an average of 8 hours on chores during a weekend. A
researcher wanted to check if this claim is true. A random sample of 200 adults taken by
this researcher showed that these adults spend an average of 8.20 hours on chores during a
weekend, with a standard deviation of 2.1 hours. Using the 1% significance level, can you conclude
that the claim that all adults spend an average of 8 hours on chores during a weekend is false?
7. A paint manufacturing company claims that the mean drying time for its paints is 45
minutes. A random sample of 35 gallons of paint selected from the production line of this
company showed that the mean drying time for this sample is 50 minutes with a standard
deviation of 3 minutes. Assume that the drying times for these paints have a normal
distribution. Using the 1% significance level, would you conclude that the company’s
claim is true?
8. A sociologist finds that for a certain population, the mean number of years of education is
13.20, while the standard deviation is 2.95. In one region, a random sample of 60 people is
drawn from this population, and the sample mean is 13.87 years. At the 0.05 level of
significance, test the claim that the mean for this region is the same as the mean of the
population.
9. A certain nighttime cold medicine bears a label indicating the presence of 600 mg of
acetaminophen in each fluid ounce of the drug. A researcher claims that a fluid ounce
contains less than 600 mg. He randomly selects 65 one-ounce samples and finds that the mean
acetaminophen content is 589 mg, while the standard deviation is 21 mg. With α = 0.05,
test the claim that the population mean is equal to 600 mg. (Assume that the sample
standard deviation can be used for σ.)
10. A car dealer recommends that transmissions be serviced at 18,750 kilometers. In order to
see whether his customers are adhering to this recommendation, the dealer selects a sample
of 40 customers and finds that the average distance at which the cars are serviced is 19,035
kilometers. The standard deviation of the sample is 1,052 kilometers. At α = 0.10, determine if the
owners are servicing their cars at 18,750 kilometers.
11. The time in minutes taken by a biological cell to divide into two cells has a normal
distribution. From past experience, the population standard deviation was assumed to be
3.5 minutes. When sixteen cells were observed, the mean time taken by them to divide into
two was 31.6 minutes. At the 1% level of significance, test the following:
a) H0: µ = 30 against Ha: µ ≠ 30
b) H0: µ = 30 against Ha: µ > 30
12. The label on the can of pineapple slices states that the mean carbohydrate content per
serving of canned pineapple is over 50 grams. It may be assumed that the standard deviation
of the carbohydrate content σ is 4 grams. A random sample of forty servings has a mean
carbohydrate content of 52.3 grams. Is the company correct in its claim? Use α = 0.05
13. A company claims that the mean weight per banana it ships is 150 grams with a standard
deviation of 18 grams. Data generated from a sample of 49 bananas randomly selected
from a shipment indicated a mean weight of 153.5 grams per banana. Is there sufficient
evidence to reject the company’s claim?
14. A machine can be adjusted so that when under control, the mean amount of powdered soap
filled in the bag is 5 kilos. From past experiences, the standard deviation of the amount
filled is known to be 0.15 kilos. To check if the machine is under control, a random sample
of sixteen bags was weighed and the mean weight was found to be 5.1 kilos. At the 5%
significance level, is the machine out of control? (Assume a normal distribution of the
amount of powdered soap filled in the bag.)

Test on Small Sample Mean


When the population standard deviation is unknown and the sample size is less than 30,
the z test is inappropriate for testing hypotheses involving means. A different test, called the t test,
is used.

The t test

The t test is a statistical test for the mean of a population and is used when the population
is normally or approximately normally distributed, σ is unknown, and n < 30. The formula
for the t test is

    t = (X̄ − µ) / (s/√n)

The degrees of freedom are d.f. = n − 1.


The formula for the 𝑡 test is similar to the 𝑧 test. But since the population standard deviation
is unknown, the sample standard deviation is used. The critical values for a 𝑡 test are found in a 𝑡
table (Appendix C).

Example 1 Find the critical value t for α = 0.10 with d.f. = 12 for a right-tailed test.

Solution Find the 0.10 column in the one-tail row at the top and 12 in the left-hand column. Where
the row and the column meet, the appropriate critical value is found; it is +1.356.

        Confidence intervals    50%      80%      90%      95%      98%      99%
        One tail, α             0.25     0.10     0.05     0.025    0.01     0.005
d.f.    Two tails, α            0.50     0.20     0.10     0.05     0.02     0.01

11                              0.697    1.363    1.796    2.201    2.718    3.106
12                              0.695    1.356    1.782    2.179    2.681    3.055
13                              0.694    1.350    1.771    2.160    2.650    3.012
14                              0.692    1.345    1.761    2.145    2.624    2.977
15                              0.691    1.341    1.753    2.131    2.602    2.947
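
The row-and-column lookup described in Example 1 can be sketched as a small Python function. It assumes only the table excerpt shown above (d.f. 11 to 15); the names `T_TABLE`, `ONE_TAIL_ALPHAS`, and `critical_t` are chosen for illustration, and a full t table or statistical software would be needed for other degrees of freedom.

```python
# Excerpt of the t table shown above: rows are degrees of freedom,
# columns are the one-tailed alpha levels.
ONE_TAIL_ALPHAS = (0.25, 0.10, 0.05, 0.025, 0.01, 0.005)
T_TABLE = {
    11: (0.697, 1.363, 1.796, 2.201, 2.718, 3.106),
    12: (0.695, 1.356, 1.782, 2.179, 2.681, 3.055),
    13: (0.694, 1.350, 1.771, 2.160, 2.650, 3.012),
    14: (0.692, 1.345, 1.761, 2.145, 2.624, 2.977),
    15: (0.691, 1.341, 1.753, 2.131, 2.602, 2.947),
}


def critical_t(df, alpha, tail):
    """Look up the critical t value; tail is "left", "right", or "two"."""
    # A two-tailed test puts alpha/2 in each tail, so use that one-tail column.
    one_tail_alpha = alpha / 2 if tail == "two" else alpha
    # Match the column with a small tolerance to avoid float-equality surprises.
    for column, table_alpha in enumerate(ONE_TAIL_ALPHAS):
        if abs(table_alpha - one_tail_alpha) < 1e-9:
            break
    else:
        raise ValueError("alpha is not a column of the table excerpt")
    t = T_TABLE[df][column]  # KeyError if df is outside the excerpt (11 to 15)
    if tail == "two":
        return (-t, t)
    return (-t,) if tail == "left" else (t,)


if __name__ == "__main__":
    print(critical_t(12, 0.10, "right"))  # (1.356,)
```
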
Example 2 In order to increase customer service, a muffler repair shop claims its mechanics
can replace a muffler in 12 minutes. A time management specialist selected six
repair jobs and found their mean time to be 11.6 minutes. The standard deviation
of the sample was 2.1 minutes. At α = 0.025, is there enough evidence to conclude
that the mean time in changing a muffler is less than 12 minutes?

Solution Step 1 H0: µ = 12
                Ha: µ < 12

Step 2 α = 0.025

Step 3 Since α = 0.025 and d.f. = 6 − 1 = 5, then tα = −2.571.

Step 4 Reject H0 if tc < −2.571.

Step 5 Compute the test statistic.

    tc = (X̄ − µ) / (s/√n) = (11.6 − 12) / (2.1/√6) = −0.47

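Step 6 Since the test statistic tc = −0.47 falls within the noncritical region, do not reject H0.
There is not enough evidence to conclude that the mean time to change a muffler is less than 12 minutes.
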
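
The computation in Example 2 can be checked with a short Python sketch. The function name `t_statistic` is chosen for illustration, and the critical value −2.571 is taken from the t table as in the solution, since the Python standard library does not provide the t distribution.

```python
from math import sqrt


def t_statistic(sample_mean, hypothesized_mean, s, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n)), with d.f. = n - 1."""
    return (sample_mean - hypothesized_mean) / (s / sqrt(n))


if __name__ == "__main__":
    # Values from Example 2 (muffler repair shop).
    n = 6
    t_c = t_statistic(sample_mean=11.6, hypothesized_mean=12, s=2.1, n=n)
    t_critical = -2.571  # left-tailed, alpha = 0.025, d.f. = 5, from the t table

    # Left-tailed decision rule: reject H0 if t_c < -2.571.
    reject_h0 = t_c < t_critical
    print(f"d.f. = {n - 1}, t_c = {t_c:.2f}, reject H0: {reject_h0}")
    # d.f. = 5, t_c = -0.47, reject H0: False
```
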
Exercises

Find critical value(s) for the 𝑡 test for each.

1. n = 15, α = 0.05, right-tailed
2. n = 24, α = 0.01, left-tailed
3. n = 17, α = 0.025, right-tailed
4. n = 9, α = 0.02, two-tailed
5. n = 10, α = 0.10, left-tailed
6. n = 6, α = 0.01, two-tailed
7. n = 28, α = 0.02, two-tailed
8. n = 20, α = 0.10, right-tailed

Answer each of the following.
9. A random sample of 25 observations taken from a population that is normally distributed
produced a sample mean of 58.5 and a standard deviation of 7.5. Find the critical and
observed values of 𝑡 for each of the following tests of hypotheses using 𝛼 = 0.10.
a) H0: µ = 55   Ha: µ > 55
b) H0: µ = 55   Ha: µ ≠ 55
10. A new laboratory technician read a report that the average number of students using the
computer laboratory per hour was 16. To test this hypothesis, he selected a day at random
and kept track of the number of students who used the lab over an eight-hour period. The
results were as follows:

20,24,18,16,16,19,21,23

At 𝛼 = 0.05, can the technician conclude that the average is actually 16?
11. The manager of a car rental agency claims that the average mileage of cars rented is less
than 8000. A sample of five automobiles has an average mileage of 7723, with a standard
deviation of 500 miles. At 𝛼 = 0.01, is there enough evidence to reject the manager’s
claim?
12. A special cable has a breaking strength of 800 pounds. A researcher selects a sample of 20
cables and finds that the average breaking strength is 793 pounds with a standard deviation
of 12 pounds. Assuming that the variable is normally distributed, test the claim at α = 0.02.
13. A machine is designed to fill jars with 16 ounces of coffee. A consumer suspects that the
machine is not filling the jars completely. A sample of 8 jars has a mean of 15.6 ounces
and a standard deviation of 0.3 ounces. Is there enough evidence to support the
consumer’s claim at 𝛼 = 0.10?
14. A recent survey stated that households received an average of 37 telephone calls per month.
To test the claim, a researcher surveyed 29 households and found that the average number
of calls was 34.9. The standard deviation of the sample was 6. At 𝛼 = 0.02, can the claim
be substantiated?
15. In a certain city, a researcher wishes to determine whether the average age of its citizens is
really 61.2 years. A sample of 22 residents has an average age of 59.8. The standard
deviation of the sample is 1.5 years. At 𝛼 = 0.01, is the average age of the residents really
61.2 years? Assume that the variable is approximately normally distributed.
Additional Exercises
Answer each of the following.

1. A recent study stated that if a person smoked, the average number of cigarettes he or she
smoked was 14 per day. To test the claim, a researcher selected a random sample of 40
smokers and found that the mean number of cigarettes smoked per day was 18. The sample
standard deviation was 6. At 𝛼 = 0.05, is the number of cigarettes a person smokes per
day actually equal to 14?

2. A high school counsellor wishes to test the theory that the average age of high school
graduating students is 16.3 years. She samples 32 graduating students and finds that their
mean age is 16.9 years. At 𝛼 = 0.01, is the theory correct? The standard deviation of the
population is 0.3.

3. An advertisement claims that a certain drug will provide relief from indigestion in 10
minutes. For a test of the claim, 35 individuals were given the product; the average time
until relief was achieved was 9.25 minutes. From past studies, the standard deviation was
known to be 2 minutes. Can one conclude that the claim is justified? Use 𝛼 = 0.05.

4. A biologist knows that the average length of a leaf of a certain plant is 4 inches. The
standard deviation of the population is 0.6 inch. A sample of 20 leaves of that type of plant
given a new type of plant food had an average length of 4.2 inches. At 𝛼 = 0.02, is there
reason to believe that the new food is responsible for a change in the growth of leaves?

5. A magazine article stated that the average age of men who were getting married is equal to 30.
A researcher decided to test this theory at α = 0.02. She selected a sample of 20 men
who were recently married and found that the average age was 28.6 years. The standard
deviation of the sample was 4 years. Should the null hypothesis be rejected? Assume that
the variable is approximately normally distributed.

6. A study conducted by the census department showed that the mean family size was 3.16 in
year 2000. A researcher wanted to check if the current mean family size is less than 3.16.
A sample of 900 families taken from this year by this researcher produced a mean family
size of 3.13 with a standard deviation of 0.70. Using the 0.025 significance level, can he
conclude that the mean family size has declined since year 2000?
7. According to a study, 16- to 26-year-olds make an average of 6.9 visits to shopping malls
per week. A researcher wanted to check if this mean holds for the current population of
16- to 26-year-olds. A random sample of 45 such persons showed that the mean number of
visits to shopping malls was 7.2 per week with a standard deviation of 1.3. Using the 2%
significance level, is the mean number of visits to shopping malls per week for the current
population of 16- to 26-year-olds more than 6.9?
8. A study conducted a few years earlier claims that adult males spend an average of 27 hours
a week watching sports on television. A recent sample of 100 adult males showed that the
mean time they spend per week watching sports on television is 22 hours with a standard
deviation of 3.8 hours. Test at the 1% level of significance if currently all adult males spend
less than 27 hours watching sports on television.
9. A psychologist claims that the mean age at which children start walking is 12.5 months.
Nicole wanted to check if this claim is true. She took a random sample of 18 children and
found that the mean age at which these children started walking was 12.9 months with a
standard deviation of 0.80 months. Using the 2% significance level, can she conclude that
the mean age at which all children start walking is different from 12.5 months? Assume
that the ages at which all children start walking have an approximate normal distribution.
10. According to a basketball coach, the mean height of all female college volleyball players
is 69.5 inches. A random sample of 25 such players produced a mean height of 70.2 inches
with a standard deviation of 2.1 inches. Assuming that the heights of all female college
volleyball players are normally distributed, test at the 5% level of significance if their mean
height is greater than 69.5 inches.
