
Chapter 7

Hypothesis testing
7.1

Learning activity A7.1


Question:
Think about each of the following statements. Then give the null
and alternative hypotheses and say whether they will need one- or
two-tailed tests.
a) The general mean level of family income in a population is
known to be 10,000 ulam a year. You take a random sample in an
urban area U and find the mean family income is 6,000 ulam a year
in that area. Do the families in the chosen area have a lower income
than the population as a whole?
b) You are looking at data from two schools on the heights and
weights of children by age. Are the mean weights for girls aged
10-11 the same in the two schools?
c) You are looking at reading scores for children before and after a
new teaching programme. Have their scores improved?

Solution:
Look at the wording of a) to c) carefully. You can get clues as to
whether you are dealing with a one-tailed test (where $H_1$ will use
$<$ or $>$) or a two-tailed test (where $H_1$ will involve $\neq$)
from the use of words like:

increase, higher, greater, diminished

which all imply a one-tailed test, or the use of words like:

equal, changed, different from

which all imply a two-tailed test.


a) Here you want to know whether the mean of incomes in a chosen
area ($\mu_U$) is less than the general population mean ($\mu = 10{,}000$).
So you need a one-tailed test.
$H_0: \mu_U = 10{,}000$ (i.e. $\mu_U$ is equal to $\mu$)
$H_1: \mu_U < 10{,}000$ (i.e. $\mu_U$ is less than $\mu$)
b) Here we are comparing the mean weights in two schools. Let's say
the sample means are $\bar{x}_A$ and $\bar{x}_B$, with population means
$\mu_A$ and $\mu_B$ respectively. We want to know if the population
means are different, so we need a two-tailed test.

Statistics 1 Solutions to learning activities

$H_0: \mu_A = \mu_B$ (i.e. the means are the same)
$H_1: \mu_A \neq \mu_B$ (i.e. the means are different)
c) We look at the reading score means for children before, $\bar{x}_B$,
and after, $\bar{x}_A$, a teaching programme, with population means
$\mu_B$ and $\mu_A$ respectively.
$H_0: \mu_A = \mu_B$ (i.e. $\mu_B$ is no different from $\mu_A$)
$H_1: \mu_A > \mu_B$ (i.e. the mean score after the programme is greater
than before).


7.2

Learning activity A7.2


Question:
Complete the following chart:

                                           Real situation
Result from your test       $H_0$ true                       $H_0$ false

$H_0$ true (not rejected)   Correct.                         Type II error.
                            Probability $(1 - \alpha)$,      ______
                            called the confidence level
                            of the test

$H_0$ false (rejected)      ______                           ______
                                                             Probability $(1 - \beta)$,
                                                             called the power of the test

Solution:
Your completed table should look like this:

                                           Real situation
Result from your test       $H_0$ true                       $H_0$ false

$H_0$ true (not rejected)   Correct.                         Type II error.
                            Probability $(1 - \alpha)$,      Probability $= \beta$
                            called the confidence level
                            of the test

$H_0$ false (rejected)      Type I error.                    Correct.
                            Probability $\alpha$, called     Probability $(1 - \beta)$,
                            the significance level           called the power of the test
                            of the test


7.3

Learning activity A7.3


Question:
The manufacturer of a patent medicine claimed that it was 90%
effective in relieving an allergy for a period of 8 hours. In a sample
of 200 people suffering from the allergy, the medicine provided
relief for 160 people.
Determine whether the manufacturer's claim is legitimate. (Be
careful. Your parameter here will be $\pi$.) Is your test one- or
two-tailed?

Solution:
Here the manufacturer is claiming that 90% of the population will
be relieved over 8 hours, that is, $\pi = 0.9$.
In a sample of $n = 200$, 160 gained relief, that is, $p = 160/200 = 0.8$.
In assessing the manufacturer's claim, it is fair to assume that we
are asking whether $\pi$ is less than 0.9 or not (we would accept the
claim otherwise). So we have a one-tailed test with:
$H_0: \pi = 0.9$
$H_1: \pi < 0.9$.
We use the population value $\pi$ to work out the standard error, and
so calculate:
$$z = \frac{0.8 - 0.9}{\sqrt{(0.9 \times 0.1)/200}} = \frac{-0.1}{0.0212} = -4.717.$$
This goes beyond the tables (the most extreme $z$ given is $-4.417$),
so this result is highly significant: the p-value is nearly zero.
Looked at another way, you could look at Table 10 (Percentage points
of the t-distribution) and take the bottom line (where $\nu = \infty$).
Note that the 5% value is 1.645 for a one-tailed test and the 1%
and the 0.1% values are 2.326 and 3.090 respectively. This
confirms that the result is highly significant. So we reject $H_0$ and
the manufacturer's claim is not met. The proportion given relief in
the sample taken is significantly less than the 90% claimed.
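
The calculation above can be checked numerically. The following is a minimal sketch, assuming only the Python standard library; the function name `one_sample_proportion_z` is illustrative and not part of the course material. It uses the unrounded standard error, so the result differs from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_proportion_z(successes, n, pi0):
    """Return (z, lower-tail p-value) for H0: pi = pi0 against H1: pi < pi0."""
    p_hat = successes / n
    # The standard error uses the hypothesised value pi0, as in the solution.
    se = sqrt(pi0 * (1 - pi0) / n)
    z = (p_hat - pi0) / se
    p_value = NormalDist().cdf(z)  # P(Z < z) for a lower-tailed test
    return z, p_value


z, p_value = one_sample_proportion_z(160, 200, 0.9)
print(f"z = {z:.3f}, p-value = {p_value:.2e}")
```

The p-value printed is tiny, which agrees with the conclusion that the result is highly significant.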


7.4

Learning activity A7.4


Question:
A sample of seven is taken at random from a large batch of

(nominally 12 volt) batteries. These are tested and their true


voltages are shown below:
12.9   11.6   13.5   13.9   12.1   11.9   13.0

a) Test if the mean voltage of the whole batch is 12 volts.


b) Test if the mean batch voltage is less than 12.

Solution:
In part a) you are asked to do a two-sided test; in part b) it is a
one-sided test. Which is more appropriate will depend on the
purpose of the experiment, and your suspicions before you conduct
it.
If you suspected before collecting the data that the mean voltage
was less than 12 volts, the one-sided test could be appropriate.
If you had no prior reason to believe that the mean was less
than 12 volts you would do a two-sided test.
General rule: decide on whether it is a one- or two-sided test
before calculating the test statistic.
a) We are to test $H_0: \mu = 12$ v. $H_1: \mu \neq 12$. The key points
here are that $n$ is small and that $\sigma^2$ is unknown. We can use
the t-test, and this is valid provided the data are normally
distributed. The test statistic value is
$$t = \frac{\bar{x} - 12}{s/\sqrt{7}} = \frac{12.7 - 12}{0.858/\sqrt{7}} = 2.16.$$
This is compared to a Student's t distribution on 6 degrees of
freedom. The critical value corresponding to a 5% test is 2.447.
Hence we cannot reject the null hypothesis at the 5% level. (We
can reject at the 10% level, but the convention on this course is
to regard such evidence merely as casting doubt on $H_0$, rather
than justifying rejection as such.)
b) We are to test $H_0: \mu = 12$ v. $H_1: \mu < 12$. There is no need
to do a formal statistical test. As the sample mean is 12.7, which is
greater than 12, there is no evidence whatsoever for the
alternative hypothesis.
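
As a check on the arithmetic in parts a) and b), here is a small sketch using only the Python standard library. The 5% critical value is the one quoted in the solution; the 10% two-tailed value of 1.943 for 6 degrees of freedom is taken from standard t tables and is not stated in the text. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

voltages = [12.9, 11.6, 13.5, 13.9, 12.1, 11.9, 13.0]
mu0 = 12.0  # hypothesised mean voltage under H0

n = len(voltages)
x_bar = mean(voltages)
s = stdev(voltages)  # sample standard deviation (divisor n - 1)
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Two-tailed critical values for t on 6 d.f., copied from standard tables.
T6_CRIT_5PCT = 2.447
T6_CRIT_10PCT = 1.943

print(f"mean = {x_bar:.2f}, s = {s:.3f}, t = {t_stat:.2f}")
print("a) reject at 5%: ", abs(t_stat) > T6_CRIT_5PCT)
print("a) reject at 10%:", abs(t_stat) > T6_CRIT_10PCT)
# b) H1: mu < 12 could only be supported by a negative t statistic.
print("b) any support for mu < 12:", t_stat < 0)
```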


7.5

Learning activity A7.5


Question:
Explain what you understand by the statement: The test is
significant at the 5% level. How would you interpret a test that was
significant at the 10% level but not at the 5% level?
In a particular city it is known, from past surveys, that 25% of
homemakers regularly use a washing powder named SNOLITE.
After an advertising campaign, a survey of 300 randomly selected
homemakers showed that 100 had recently purchased SNOLITE. Is
there evidence that the campaign had been successful?


Solution:
"Significant at the 5% level" means that, if the null hypothesis were
true, there would be a less than 5% chance of getting data at least as
extreme as those observed. We take this as evidence that the data are
incompatible with the null hypothesis, which we reject.
"Significant at 10% but not at 5%" is often interpreted as meaning
there is some doubt about the null hypothesis, but not enough to
reject it.
Here we are testing a proportion: $H_0: \pi = 0.25$ v. $H_1: \pi > 0.25$.
Note that this is a one-sided test: we have reason to believe that
the sales campaign has increased sales, and we believe this before
collecting any data. As this is a test for proportions and $n$ is large,
we compute the test statistic value
$$z = \frac{100/300 - 0.25}{\sqrt{\pi(1 - \pi)/n}} = \frac{100/300 - 0.25}{\sqrt{0.000625}} = 3.33.$$
Compare this to the critical values of a normal distribution and you
will see that it is significant at the 1% level, say. That is, there is very
strong evidence that more than 25% of homemakers use SNOLITE.
It appears that the campaign has been successful.
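
The comparison with normal critical values can be sketched as follows, assuming only the Python standard library. The critical values are computed with `NormalDist().inv_cdf` rather than read from tables, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, purchasers, pi0 = 300, 100, 0.25

p_hat = purchasers / n
se = sqrt(pi0 * (1 - pi0) / n)  # standard error under H0
z = (p_hat - pi0) / se
print(f"z = {z:.2f}")

# Upper-tail critical values of the standard normal distribution.
for alpha in (0.10, 0.05, 0.01):
    critical = NormalDist().inv_cdf(1 - alpha)
    print(f"alpha = {alpha:.2f}: critical = {critical:.3f}, "
          f"reject H0: {z > critical}")
```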


7.6

Learning activity A7.6


Question:
If you live in California, the decision to purchase earthquake
insurance is a critical one. An article in the Annals of the Association
of American Geographers (June 1992) investigated many factors
that California residents consider when purchasing earthquake
insurance. The survey revealed that only 133 of 337 randomly
selected residences in Los Angeles County were protected by
earthquake insurance.
a) What are the appropriate null and alternative hypotheses to test
the research hypothesis that less than 40% of the residents of
Los Angeles County were protected by earthquake insurance?
b) Do the data provide sufficient evidence to support the research
hypothesis? (Use $\alpha = 0.10$.)
c) Calculate and interpret the p-value for the test.

Solution:
a) Appropriate hypotheses are $H_0: \pi = 0.4$ v. $H_1: \pi < 0.4$.
b) The estimate of the proportion covered by insurance is $133/337$, or
39.5%. For the formal test, as this is a test for proportions and $n$
is large, we use the test statistic:
$$z = \frac{133/337 - 0.4}{\sqrt{\pi(1 - \pi)/n}} = \frac{133/337 - 0.4}{0.02669} = -0.20.$$

Compare this to the critical values of a normal distribution and
you will see that it is not significant at even the 10% level.
c) To compute the p-value for this lower-tailed test, use
$P(Z < -0.20) = 0.4207$.
Hence we would not reject $H_0$ for any $\alpha < 0.4207$.
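
The p-value in part c) can be reproduced with a short sketch, assuming only the Python standard library; the variable names are illustrative. Because the code does not round $z$ to $-0.20$ before looking up the probability, the p-value may differ from the tabulated figure in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

insured, n, pi0, alpha = 133, 337, 0.4, 0.10

p_hat = insured / n
z = (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)
p_value = NormalDist().cdf(z)  # lower-tailed test: P(Z < z)

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0 at alpha = 0.10:", p_value < alpha)
```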


7.7

Learning activity A7.7


Question:
A random sample of 250 households in a particular community was
taken and in 50 of these the lead level in the water supply was
found to be above an acceptable level. A sample was also taken
from a second community, which adds anti-corrosives to its water
supply and, of these, only 16 out of 320 households were found to
have high lead levels. Is this conclusive evidence that the
addition of anti-corrosives reduces lead levels?

Solution:
Here we are testing the difference between two proportions with a
one-sided test. Also $n_1$ and $n_2$ are both large. Let:
$\pi_1$ = (population) proportion of households with unacceptable
levels of lead in the community without anti-corrosives.
$\pi_2$ = (population) proportion of households with unacceptable
levels of lead in the community with anti-corrosives.
We wish to test $H_0: \pi_1 = \pi_2$ v. $H_1: \pi_1 > \pi_2$. The test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{\frac{50}{250} - \frac{16}{320}}{\sqrt{0.116(1 - 0.116)\left(\frac{1}{250} + \frac{1}{320}\right)}} = 5.55,$$
where $p$ is the pooled sample proportion, i.e. $p = \frac{50 + 16}{250 + 320} = 0.116$.
This test statistic value (5.55) is highly significant, so there is strong
evidence that anti-corrosives reduce lead levels.
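
A minimal sketch of the pooled two-proportion calculation follows, assuming only the Python standard library; the function name `two_proportion_z` is illustrative. The upper-tail p-value is computed as $P(Z < -z)$, which equals $P(Z > z)$ by symmetry and is numerically more stable for large $z$.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: pi1 = pi2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Community 1: no anti-corrosives; community 2: anti-corrosives added.
z = two_proportion_z(50, 250, 16, 320)
p_value = NormalDist().cdf(-z)  # upper tail, for H1: pi1 > pi2

print(f"z = {z:.2f}, p-value = {p_value:.2e}")
```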

7.8

Learning activity A7.8


Question:
Two different methods of determination of the percentage fat
content in meat are available. Both methods are used on portions of
the same meat sample. Is there any evidence to suggest that one
method gives a higher reading than the other?


Meat sample   Method I   Method II
1             23.1       22.7
2             23.2       23.6
3             26.5       27.1
4             26.6       27.4
5             27.1       27.4
6             48.3       46.8
7             40.5       40.4
8             25.0       24.9
9             38.4       38.1
10            23.5       23.8
11            22.2       22.5
12            24.7       24.4
13            45.1       43.5
14            27.6       27.0
15            25.0       24.9
16            36.7       35.2

Solution:
Since the same meat is being put through two different tests, we are
clearly dealing with paired samples. We wish to test
$H_0: \mu_1 = \mu_2$ v. $H_1: \mu_1 \neq \mu_2$,
but for paired samples we do our calculations on differences, so we
might reformulate this presentation of the hypotheses in the form:
$H_0: \mu_1 - \mu_2 = 0$ v. $H_1: \mu_1 - \mu_2 \neq 0$.

If we list the differences (Method I minus Method II) from the
original data table then we get
Samples 1-8:    0.4  -0.4  -0.6  -0.8  -0.3   1.5   0.1   0.1
Samples 9-16:   0.3  -0.3  -0.3   0.3   1.6   0.6   0.1   1.5

The main statistics are

              n    Sample mean   Sample st. dev.   S.E. of mean
Differences   16   0.238         0.745             $0.745/\sqrt{16} = 0.186$


So the test statistic is $(0.238 - 0)/0.186 = 1.28$. Looking at the
tables for the $t_{15}$
distribution, we find that the critical value for a two-tailed test at
the 5% level is 2.131, so this result is not significant at the 5% level.
Indeed, the critical value for a test at the 20% level is 1.341, so the
result would not be significant even at 20% (which, of course, is in
any case too high a level to be used).
The test statistic shows, in fact, that the p-value of the test is over
20%, although the limited information in the standard tables means
that we cannot work it out exactly. (To do so we would need much
more detailed tables, or a computer programme.)
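
The paired calculation can be reproduced from the raw data with a short sketch, assuming only the Python standard library. The critical values are those quoted in the solution for $t_{15}$, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

method_1 = [23.1, 23.2, 26.5, 26.6, 27.1, 48.3, 40.5, 25.0,
            38.4, 23.5, 22.2, 24.7, 45.1, 27.6, 25.0, 36.7]
method_2 = [22.7, 23.6, 27.1, 27.4, 27.4, 46.8, 40.4, 24.9,
            38.1, 23.8, 22.5, 24.4, 43.5, 27.0, 24.9, 35.2]

# Paired differences (Method I minus Method II), rounded to remove
# floating-point noise from the subtraction.
diffs = [round(a - b, 1) for a, b in zip(method_1, method_2)]

n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)   # sample standard deviation of the differences
se = s_d / sqrt(n)   # standard error of the mean difference
t_stat = d_bar / se  # compared with t on n - 1 = 15 d.f.

# Two-tailed critical values for t on 15 d.f., as quoted in the solution.
T15_CRIT_5PCT = 2.131
T15_CRIT_20PCT = 1.341

print(f"mean = {d_bar:.3f}, s = {s_d:.3f}, S.E. = {se:.3f}, t = {t_stat:.2f}")
print("significant at 5%: ", abs(t_stat) > T15_CRIT_5PCT)
print("significant at 20%:", abs(t_stat) > T15_CRIT_20PCT)
```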


7.9

Learning activity A7.9


Question:
The data in the following table show the numbers of daily parking
offences in two areas of a city. The day identifications are unknown
and the recordings were not necessarily made on the same days. Is
there evidence that the areas experience different numbers of
offences?


Area A   Area B   Area A   Area B
38       32       33       28
38       38       27       32
29       22       32       34
45       30       32       24
42       34       34

Solution:
It is clear that the data for the two areas cannot be regarded as
paired, because there is no detail about the days to which the
figures refer (and because there are different numbers of values for
the two areas anyhow).
The sample sizes are 10 and 9 for A and B respectively, so we need
to decide whether we can use the combination of a pooled estimate
of the variance and a t distribution test on the appropriate number
of degrees of freedom.
So begin by calculating the sample means and variances (or
standard deviations), to begin to understand the pattern:
           A       B
n          10      9
Mean       35      30.44
St. dev.   5.68    5.08

Because the events to which the data refer are of the same type, it is
reasonable to assume that the population variances of the numbers
of offences are the same (especially since the sample standard
deviations turn out to be quite similar: it can be shown that, if the
ratio of the sample variances is between roughly 1/3 and 3, then there
is no reason to suppose that the population variances are different).
We test $H_0: \mu_A - \mu_B = 0$ v. $H_1: \mu_A - \mu_B \neq 0$. The pooled variance is
$$s_p^2 = \frac{(10 - 1) \times 5.68^2 + (9 - 1) \times 5.08^2}{(10 - 1) + (9 - 1)} = 29.2243$$
on 17 d.f.

Hence the estimated standard error is
$$\sqrt{29.2243 \left(\frac{1}{10} + \frac{1}{9}\right)} = 5.406 \times \sqrt{0.2111} = 2.484$$
on 17 d.f.

The test statistic is thus
$$\frac{35 - 30.44}{2.484} = \frac{4.56}{2.484} = 1.836$$
on 17 d.f.

For a two-tailed test at the 5% level the critical value from t17 is
2.110, so the evidence is not significant at the 5% level. For a
two-tailed test at the 10% level the critical value from t17 is 1.740, so
the evidence is significant at the 10% level.
We cannot conclude with confidence that there is a difference in the
number of parking offences between the two areas, but there is a
suspicion that there may be such a difference. It would be preferable
to examine more data in order to come to a firmer conclusion.
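
The pooled two-sample calculation can be sketched from the raw counts as follows, assuming only the Python standard library; the variable names are illustrative. Because the code works with unrounded variances, the intermediate values differ slightly from the hand calculation, which uses standard deviations rounded to two decimal places. The critical values are those quoted in the solution for $t_{17}$.

```python
from math import sqrt
from statistics import mean, variance

area_a = [38, 38, 29, 45, 42, 33, 27, 32, 32, 34]
area_b = [32, 38, 22, 30, 34, 28, 32, 34, 24]

n_a, n_b = len(area_a), len(area_b)
var_a, var_b = variance(area_a), variance(area_b)  # divisor n - 1

df = n_a + n_b - 2
# Pooled variance: sample variances weighted by their degrees of freedom.
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean(area_a) - mean(area_b)) / se

# Two-tailed critical values for t on 17 d.f., as quoted in the solution.
T17_CRIT_5PCT = 2.110
T17_CRIT_10PCT = 1.740

print(f"pooled variance = {pooled_var:.3f}, S.E. = {se:.3f}, "
      f"t = {t_stat:.3f} on {df} d.f.")
print("significant at 5%: ", abs(t_stat) > T17_CRIT_5PCT)
print("significant at 10%:", abs(t_stat) > T17_CRIT_10PCT)
```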

Solutions prepared by Dr James Abdey.
