
STATISTICAL TESTS OF HYPOTHESES

A hypothesis is a proposal intended to explain certain facts or observations. It is a concept that is not yet verified but, if true, would explain certain facts or phenomena. It is therefore a statement expressing an opinion based on incomplete evidence, and hence a claim which should be verified through a test.

Situations arise in which, rather than estimating the value of a population parameter, it must be decided whether a statement concerning the parameter is true or false. In doing so, a hypothesis is tested.

The two types of errors.

When testing a hypothesis we may commit one of two types of error.

Let H be the hypothesis.

               Accept H               Reject H
H is true      Correct decision       Type I error (α)
H is false     Type II error (β)      Correct decision

Type I error: Occurs when a correct hypothesis is rejected.

Type II error: Occurs when an incorrect hypothesis is accepted.

The two types of hypotheses:

The null hypothesis H 0 

This is the hypothesis to be tested. It may be looked at as any hypothesis set up primarily to see
whether it can be rejected.

The alternative hypothesis H 1 

This is the hypothesis we accept when the null hypothesis must be rejected.

The level of significance:

The testing of a hypothesis is based on the probability of Type I error also known as the level of
significance and denoted by α. It is the probability of rejecting the null hypothesis when it is true.
For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference
exists when there is no actual difference.
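
To see what a 5% level of significance means in practice, the sketch below repeatedly draws samples from a population for which the null hypothesis really is true and counts how often a two-tailed t test rejects it. It is only an illustration under stated assumptions: the normal population (mean 10, standard deviation 2), the sample size of 25, the number of trials and the seed are hypothetical choices made for this sketch, and 2.064 is the tabulated value of $t_{0.025}(24)$.

```python
import math
import random
import statistics

def type_i_error_rate(trials=2000, n=25, mu0=10.0, sigma=2.0, t_crit=2.064, seed=1):
    """Estimate how often a true H0: mu = mu0 is rejected by a two-tailed t test."""
    rng = random.Random(seed)  # hypothetical seed, fixed only for repeatability
    rejections = 0
    for _ in range(trials):
        # Sample from a population whose mean really is mu0, so H0 is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        t = (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))
        if abs(t) > t_crit:  # every rejection here is a Type I error
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should settle near 0.05 as the number of trials grows, which is what choosing α = 0.05 is meant to guarantee.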

Procedure:

1. Formulate the null hypothesis and the appropriate alternative hypothesis.

2. Specify the probability of Type I error (α) also known as the level of significance

3. Set up the criterion for testing the null hypothesis against the alternative hypothesis.

4. Calculate the test statistic.

5. Decide whether to accept or reject the null hypothesis.

Hypotheses concerning one mean

Null hypothesis: This is formulated as $H_0: \mu = \mu_0$, where $\mu_0$ is a given numerical value.

Alternative hypotheses:

Alternative hypotheses of the types $H_1: \mu < \mu_0$ and $H_1: \mu > \mu_0$ are called one-sided or one-tailed alternatives. This is because the rejection region for $H_0$ in each case is found on one side of the curve.

[Figure: rejection regions for $H_0$. For $H_1: \mu < \mu_0$ the area $\alpha$ lies in the left tail, below $-t_\alpha(n-1)$; for $H_1: \mu > \mu_0$ the area $\alpha$ lies in the right tail, above $t_\alpha(n-1)$.]

An alternative hypothesis of the type $H_1: \mu \ne \mu_0$ is called a two-sided or two-tailed alternative because the rejection regions for $H_0$ are found on both sides of the curve.

[Figure: rejection regions for $H_0$ when $H_1: \mu \ne \mu_0$. An area of $\alpha/2$ lies in each tail, below $-t_{\alpha/2}(n-1)$ and above $t_{\alpha/2}(n-1)$.]

Summary:

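Null hypothesis: $H_0: \mu = \mu_0$

Test statistic: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

where

$\bar{x} = \dfrac{\sum fx}{\sum f}$ and $s^2 = \dfrac{1}{n-1}\left[\sum fx^2 - \dfrac{\left(\sum fx\right)^2}{\sum f}\right]$, with $n = \sum f$.

Alternative hypothesis      Reject $H_0$ if:
$H_1: \mu < \mu_0$          $t < -t_\alpha(n-1)$
$H_1: \mu > \mu_0$          $t > t_\alpha(n-1)$
$H_1: \mu \ne \mu_0$        $t < -t_{\alpha/2}(n-1)$ or $t > t_{\alpha/2}(n-1)$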

NB: When formulating the alternative hypothesis, the question will usually be clear on the form it should take, $H_1: \mu < \mu_0$ or $H_1: \mu > \mu_0$. When this is not clear, formulate it as $H_1: \mu \ne \mu_0$.
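
The three rejection rules can be written as one small function. This is a minimal sketch rather than a standard routine: the function name and the labels "less", "greater" and "two-sided" are chosen here for illustration, and the caller is assumed to supply the positive critical value read from the t table.

```python
def reject_h0(t, t_crit, alternative):
    """Apply the rejection rule for a test concerning one mean.

    t_crit is the positive tabulated value: t_alpha(n-1) for a one-tailed
    alternative, t_{alpha/2}(n-1) for a two-tailed one.
    alternative is "less" (H1: mu < mu0), "greater" (H1: mu > mu0)
    or "two-sided" (H1: mu != mu0); these labels are this sketch's own.
    """
    if alternative == "less":
        return t < -t_crit
    if alternative == "greater":
        return t > t_crit
    if alternative == "two-sided":
        return t < -t_crit or t > t_crit
    raise ValueError(f"unknown alternative: {alternative!r}")

# Hypothetical statistic t = -3.0 checked against the tabulated t_0.01(24) = 2.492.
print(reject_h0(-3.0, 2.492, "less"))     # True: -3.0 lies in the left tail
print(reject_h0(-3.0, 2.492, "greater"))  # False: the right tail is not reached
```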

Examples:

1. The lengths of rods made by an artisan were measured and recorded as follows:

Length (cm)    14.2   14.3   14.4   14.6   16.0
No. of rods    6      5      3      7      4

Test at 1% level of significance whether the mean length is less than 14.8cm.

Solution:
For this test to be done we must assume that there was a claim that the mean length is 14.8cm.

1. $H_0: \mu = 14.8$
   $H_1: \mu < 14.8$

2. $\alpha = 0.01$

3. Criterion: Reject $H_0$ if $t < -t_\alpha(n-1)$, i.e. $t < -2.492$, since $t_{0.01}(24) = 2.492$.

4. Calculations:

$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

x       f     fx       fx^2
14.2    6     85.2     1209.84
14.3    5     71.5     1022.45
14.4    3     43.2     622.08
14.6    7     102.2    1492.12
16.0    4     64.0     1024.00
Total   25    366.1    5370.49

$\bar{x} = \dfrac{366.1}{25} = 14.644$

$s = \sqrt{\dfrac{1}{24}\left[5370.49 - \dfrac{(366.1)^2}{25}\right]} = 0.6232$

$t = \dfrac{14.644 - 14.8}{0.6232/\sqrt{25}} = -1.252$

5. Decision:

Since $-1.252 > -2.492$, $H_0$ is accepted. At the 1% level of significance there is not enough evidence to conclude that the mean length is less than 14.8cm.

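
The arithmetic for the rod data can be checked with a few lines of Python. This is a sketch under the formulas used in this section; the function name `t_from_frequency_table` is chosen here for illustration, and the critical value 2.492 is taken from the t table rather than computed.

```python
import math

def t_from_frequency_table(values, freqs, mu0):
    """Return (mean, s, t) for a test concerning one mean on grouped data."""
    n = sum(freqs)                                            # n = sum of f
    sum_fx = sum(f * x for x, f in zip(values, freqs))        # sum of fx
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))   # sum of fx^2
    mean = sum_fx / n
    s = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))
    t = (mean - mu0) / (s / math.sqrt(n))
    return mean, s, t

lengths = [14.2, 14.3, 14.4, 14.6, 16.0]
rods = [6, 5, 3, 7, 4]
mean, s, t = t_from_frequency_table(lengths, rods, mu0=14.8)
print(f"mean = {mean:.3f}, s = {s:.4f}, t = {t:.3f}")

# One-tailed test at the 1% level: reject H0 only if t < -t_0.01(24) = -2.492.
print("reject H0" if t < -2.492 else "do not reject H0")
```

The statistic comes out at about -1.25, well inside the acceptance region, so small rounding differences in the hand calculation do not change the decision.
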
2. The contents of a random sample of 15 similar containers of a chemical used in the
construction industry are as follows:

Content (litres)     9.6   9.8   10.0   10.2   10.4
No. of containers    2     4     5      3      1

The manufacturer claims that the mean content of the containers is 10.0 litres. Do the data
support the claim at 5% level of significance?

Solution:

1. $H_0: \mu = 10$
   $H_1: \mu \ne 10$

2. $\alpha = 0.05$, so $\alpha/2 = 0.025$

3. Criterion: Reject $H_0$ if $t < -t_{\alpha/2}(n-1)$ or $t > t_{\alpha/2}(n-1)$, i.e. $t < -2.145$ or $t > 2.145$, since $t_{0.025}(14) = 2.145$.

4. Calculations:
$\bar{x} = 9.96$, $s = 0.229$, $t = -0.677$

5. Decision:
Since $-2.145 < -0.677 < 2.145$, $H_0$ is accepted. We can conclude that the data support the claim at the 5% level of significance.
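
The calculations in step 4 are stated without working. A sketch of the intermediate arithmetic, using the same formulas as for the rod example and the frequencies in the table above, is:

```latex
\begin{align*}
n &= \sum f = 15, \qquad \sum fx = 149.4, \qquad \sum fx^2 = 1488.76 \\
\bar{x} &= \frac{149.4}{15} = 9.96 \\
s^2 &= \frac{1}{14}\left[1488.76 - \frac{(149.4)^2}{15}\right] = \frac{0.736}{14} \approx 0.0526,
\qquad s \approx 0.229 \\
t &= \frac{9.96 - 10}{0.229/\sqrt{15}} \approx -0.677
\end{align*}
```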

Exercises:

1. The times between seven calls for an ambulance and the patient’s arrival at the hospital are 27,
15, 20, 32, 18, 26 and 28. The ambulance service claims that it takes on average 20 minutes
between the call for an ambulance and the patient’s arrival at the hospital. Test at 5% level of
significance whether the claim is valid.

2. A random sample of a company’s files shows that 8 orders for a certain piece of machinery were filled in 10, 12, 19, 14, 15, 18, 11 and 13 days. At 1% level of significance, can it be concluded that the mean time taken to fill such orders exceeds 10.5 days?

Hypotheses concerning the difference between two means

We will consider two populations with a common variance $\sigma^2$ and test whether the difference between the means of the two populations equals zero, i.e. whether the two means are equal.

The null hypothesis is $H_0: \mu_1 = \mu_2$

The alternative hypotheses are:

$H_1: \mu_1 < \mu_2$, and reject $H_0$ if $t < -t_\alpha(n_1 + n_2 - 2)$

$H_1: \mu_1 > \mu_2$, and reject $H_0$ if $t > t_\alpha(n_1 + n_2 - 2)$

$H_1: \mu_1 \ne \mu_2$, and reject $H_0$ if $t < -t_{\alpha/2}(n_1 + n_2 - 2)$ or $t > t_{\alpha/2}(n_1 + n_2 - 2)$

Test statistic: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma^2}{n_1} + \dfrac{\sigma^2}{n_2}}}$

Examples:

1. The following random samples are measurements of the heat producing capacity, in millions of calories per ton, of specimens of coal from two mines.

Mine 1: 8260 8130 8350 8070 8340

Mine 2: 7950 7890 7900 8140 7920 7840

If the variance of the heat producing capacity from the two mines has been found to be 13,000,
test at 1% level of significance whether there is any difference in the heat producing capacity of
the two mines.

Solution:

Let $\mu_1$ and $\mu_2$ be the population means for mine 1 and mine 2 respectively.

1. $H_0: \mu_1 = \mu_2$
   $H_1: \mu_1 \ne \mu_2$

2. $\alpha = 0.01$, so $\alpha/2 = 0.005$

3. Criterion: Reject $H_0$ if $t < -3.250$ or $t > 3.250$, since $t_{0.005}(9) = 3.250$.

4. Calculations:
$\bar{x}_1 = 8230$, $\bar{x}_2 = 7940$

$t = \dfrac{8230 - 7940}{\sqrt{\dfrac{13000}{5} + \dfrac{13000}{6}}} = 4.20$

5. Decision:
Since $4.20 > 3.250$, $H_0$ is rejected. We can then conclude that there is a difference in the heat producing capacity of the two mines.
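
The same calculation can be sketched in Python. The function name `two_sample_t` is chosen here for illustration; it assumes, as this section does, that the common variance is given rather than estimated from the samples, and the critical value 3.250 is read from the t table.

```python
import math
import statistics

def two_sample_t(sample1, sample2, common_variance):
    """t statistic for H0: mu1 = mu2 when both populations share a given variance."""
    n1, n2 = len(sample1), len(sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    standard_error = math.sqrt(common_variance / n1 + common_variance / n2)
    return diff / standard_error

mine_1 = [8260, 8130, 8350, 8070, 8340]
mine_2 = [7950, 7890, 7900, 8140, 7920, 7840]
t = two_sample_t(mine_1, mine_2, common_variance=13000)
print(f"t = {t:.2f}")

# Two-tailed test at the 1% level with 5 + 6 - 2 = 9 degrees of freedom:
# reject H0 if |t| exceeds t_0.005(9) = 3.250.
print("reject H0" if abs(t) > 3.250 else "do not reject H0")
```
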

2. There are two methods for instructing trainees as part of an industrial training programme. The
scores obtained by each of 10 trainees instructed by each of these two methods are given below.

Trainee 1 2 3 4 5 6 7 8 9 10
Method A 71 65 75 69 73 66 68 71 74 68
Method B 72 84 77 78 69 70 77 73 65 75

If the standard deviation of the scores from the two methods has been found to be 20, test at 5%
level of significance whether method B is more effective than method A.

Solution:

Let $\mu_1$ and $\mu_2$ be the population means for methods A and B respectively.

1. $H_0: \mu_1 = \mu_2$
   $H_1: \mu_1 < \mu_2$

2. $\alpha = 0.05$

3. Criterion: Reject $H_0$ if $t < -1.734$, since $t_{0.05}(18) = 1.734$.

4. Calculations:
$\bar{x}_1 = 70$, $\bar{x}_2 = 74$

$t = \dfrac{70 - 74}{\sqrt{\dfrac{20^2}{10} + \dfrac{20^2}{10}}} = -0.447$

5. Decision:
Since $-0.447 > -1.734$, $H_0$ is accepted. At the 5% level of significance there is no evidence that method B is more effective than method A.

Exercise:

A potential buyer of electric light bulbs bought 100 bulbs of each of two famous brands, A and B. Upon testing them, he found that brand A had a mean life of 1500 hours while brand B had a mean life of 1530 hours. From past experience, the variance for the two brands has been found to be 55 hours. Test at 5% level of significance whether the two brands differ significantly in quality.

Two-way contingency table analyses.

Contingency tables arise when the classes in which frequencies are grouped are not class
intervals but correspond to some attribute or some descriptive quality. We’ll consider
contingency tables in which the data are tallied into a two-way classification having r rows and c
columns. They can therefore be referred to as $r \times c$ tables.

Example: If cars are rated as excellent (E), superior (S), average (A) or poor (P) with regard to performance and appearance, we get the following $4 \times 4$ contingency table.

                         APPEARANCE
                    E      S      A      P
PERFORMANCE    E    a
               S                  b
               A           c
               P                         d

Each car would fall into one of the 16 cells: a would be the number of cars that are excellent in
both performance and appearance, b would be the number of cars that are superior in
performance and average in appearance, c would be the number of cars that are average in
performance and superior in appearance while d would be the number of cars that are poor in
both performance and appearance.

This table contains the observed frequencies $O_{ij}$, $i = 1, 2, \ldots, r$, $j = 1, 2, \ldots, c$.

These tables are analyzed using the chi-square ($\chi^2$) distribution, where we seek to test the null hypothesis that the two attributes are independent:

$H_0$: Performance and appearance are independent

$H_1$: Performance and appearance are dependent

Procedure:

First, the expected frequencies $E_{ij}$ are calculated by multiplying the totals of the respective row and column and then dividing by the grand total.

Secondly, we substitute in the formula

$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$

and reject the null hypothesis if the calculated value of $\chi^2$ exceeds $\chi^2_\alpha((r-1)(c-1))$, which is the value, with $(r-1)(c-1)$ degrees of freedom, read from the chi-square distribution table to the right of which the area under the curve is $\alpha$.

Examples:

1. With $r = 3$, $c = 4$ and $\alpha = 0.05$, $\chi^2_\alpha((r-1)(c-1)) = \chi^2_{0.05}(6) = 12.592$

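NB: We have 12 cells and, with 6 degrees of freedom, we are free to calculate $E_{ij}$ for 6 cells and obtain the rest by subtraction from the totals. The totals of the $O_{ij}$ and $E_{ij}$ tables must be the same.

[Figure: chi-square curve with a right-tail rejection region of area $\alpha = 0.05$ beyond $\chi^2_{0.05}(6) = 12.592$.]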
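
One way to see where the count of free cells in the note above comes from, offered as a supporting sketch rather than a full proof: once the row and column totals are fixed, the last cell in each row and in each column is determined by subtraction, so the totals impose $r + c - 1$ independent conditions on the $rc$ cells.

```latex
\begin{align*}
\text{free cells} &= rc - (r + c - 1) = (r - 1)(c - 1) \\
r = 3,\ c = 4: \quad \text{free cells} &= 12 - 6 = 6
\end{align*}
```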

2. A company director wishes to determine whether there is a relationship between an employee’s performance in the company’s training programme and the employee’s ultimate success in the job. A sample of 400 cases from the files is taken and the following table obtained.

                                     Performance in the training programme
                                     Below Average    Average    Above Average
Success in the job      Poor         23               60         29
(Employer’s rating)     Average      28               79         60
                        Very Good    9                49         63

Test at 1% level of significance whether performance in the training programme and success in
the job are independent.

Solution:

1. $H_0$: Performance in the training programme and success in the job are independent

   $H_1$: Performance in the training programme and success in the job are dependent

2. $\alpha = 0.01$

3. Criterion: Reject $H_0$ if $\chi^2 > \chi^2_\alpha((r-1)(c-1))$, i.e. $\chi^2 > \chi^2_{0.01}(4)$, i.e. $\chi^2 > 13.277$

4. Calculations:
Since we’ve 4 degrees of freedom, we’ll calculate Ei j for 4 cells and obtain the rest by
subtraction from the totals of the respective rows and columns.
Observed frequencies $O_{ij}$

                                                       Total
$O_{11} = 23$    $O_{12} = 60$    $O_{13} = 29$        112
$O_{21} = 28$    $O_{22} = 79$    $O_{23} = 60$        167
$O_{31} = 9$     $O_{32} = 49$    $O_{33} = 63$        121
Total  60        188              152                  400

Expected frequencies $E_{ij}$

$E_{11} = \dfrac{112 \times 60}{400} = 16.8$, $E_{12} = \dfrac{112 \times 188}{400} = 52.6$, $E_{13} = 112 - (16.8 + 52.6) = 42.6$

$E_{21} = \dfrac{167 \times 60}{400} = 25.0$, $E_{22} = \dfrac{167 \times 188}{400} = 78.5$, $E_{23} = 167 - (25.0 + 78.5) = 63.5$

$E_{31} = 60 - (16.8 + 25.0) = 18.2$, $E_{32} = 188 - (52.6 + 78.5) = 56.9$, $E_{33} = 152 - (42.6 + 63.5) = 45.9$

$O_{ij}$    $E_{ij}$    $(O_{ij} - E_{ij})^2 / E_{ij}$
23          16.8        2.2881
60          52.6        1.0411
29          42.6        4.3418
28          25.0        0.3600
79          78.5        0.0032
60          63.5        0.1929
9           18.2        4.6505
49          56.9        1.0968
63          45.9        6.3706

$\chi^2 = 20.345$

5. Decision:
Since $20.345 > 13.277$, $H_0$ is rejected. We can therefore conclude that performance in the training programme and success in the job are dependent.
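
The whole calculation for this example can be sketched in Python. The function name `chi_square_independence` is chosen here for illustration, and the critical value 13.277 is read from the chi-square table rather than computed.

```python
def chi_square_independence(observed):
    """Return (expected, chi2, df) for an r x c table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E_ij = (row i total) x (column j total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi2, df

# Rows: Poor, Average, Very Good; columns: Below Average, Average, Above Average.
observed = [
    [23, 60, 29],
    [28, 79, 60],
    [9, 49, 63],
]
expected, chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.2f} with {df} degrees of freedom")

# Test at the 1% level: the tabulated chi-square_0.01(4) is 13.277.
print("reject H0" if chi2 > 13.277 else "do not reject H0")
```

Because the sketch keeps the expected frequencies unrounded, the statistic it prints differs somewhat from a hand calculation that rounds each $E_{ij}$ to one decimal place. The decision at the 1% level is the same either way.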

Exercise:

1. A company carries two lines of products and has 3 salespersons. The table below shows a
record of one month of the number of units of the two lines sold by each salesperson.

            Salesperson
            1     2     3
Line 1      20    8     15
Line 2      17    16    5

Test at 10% level of significance the claim that each salesperson’s ability depends on the line
he/she is selling.

2. 200 tyres each of brands A, B, C and D were tested for the distance in km that they would last
and the following table obtained.

                                   Brand
Distance                           A      B      C      D
Failed to last 32154 km            26     23     15     32
Lasted from 32154 to 48232 km      118    93     116    121
Lasted more than 48232 km          56     84     69     47

Test at 5% level of significance whether there is any difference in the quality of the four brands
with regard to their durability.
