
I.B.

Mathematics HL Option: Hypothesis Testing Chi-Squared Test

Question 1

A Headmaster of a large school wants to check on the number of students who are absent during one term. The results are shown in the table below.

Day of the week        Mon   Tues  Weds  Thurs  Fri   Total
Number of absentees    250   171   160   183    236   1000

Test the hypothesis that the number of absentees is independent of the days of
the week. Test at the 5% level. What conclusions might the headmaster draw?



Solution to question 1

The observed frequencies are shown in the table.

Day of the week        Mon   Tues  Weds  Thurs  Fri   Total
Number of absentees    250   171   160   183    236   1000

Performing the test

H0 : The absentees are evenly distributed.

H1 : The absentees are not evenly distributed.

Under H0 we would expect the absentees to be evenly distributed throughout the week, i.e. $\frac{1000}{5} = 200$ per day.

Day of the week        Mon   Tues  Weds  Thurs  Fri   Total
Number of absentees    200   200   200   200    200   1000

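Note: The number of degrees of freedom is the number of independent variables = 5 − 1 = 4.

We perform a $\chi^2$ test at the 5% level. Considering $\chi^2(4)$, we reject H0 if $\chi^2_{\text{calc}} > 9.488$, where

$$\chi^2_{\text{calc}} = \sum_{i=1}^{5} \frac{(O_i - E_i)^2}{E_i}$$

[Figure: the $\chi^2(4)$ density with the upper 5% tail beyond 9.488 marked as the region where H0 is rejected.]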

Using the graphics calculator in the STAT menu, 1:Edit, enter the observed frequencies into L1 and the expected frequencies into L2.

Now press QUIT and enter (L1 - L2)² ÷ L2 STO L3, ENTER. This is $\frac{(O_i - E_i)^2}{E_i}$.

Re-enter the STAT menu, 1:Edit to see the table. Press QUIT and enter the LIST menu, select MATH, 5: sum(L3) ENTER gives the value of $\chi^2_{\text{calc}} = 32.63$.

Day of the week   O_i     E_i     (O_i − E_i)²/E_i
Mon               250     200     12.5
Tues              171     200     4.205
Weds              160     200     8
Thurs             183     200     1.445
Fri               236     200     6.48
Total             1000    1000    32.63

As $\chi^2_{\text{calc}} = 32.63 > 9.488$, we reject H0 and conclude that the absentees are not evenly distributed across the week. The test tells us nothing about how the observed data is distributed. However, just by looking at the observed data, the Headmaster might suspect that some of his students are taking a long weekend.
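As a cross-check on the calculator working, the same statistic can be reproduced in a few lines of Python. This is only a sketch: the variable names are illustrative, and the critical value 9.488 is copied from the solution rather than computed.

```python
# Chi-squared goodness-of-fit statistic for the absentee data (Question 1).
observed = [250, 171, 160, 183, 236]  # Mon..Fri, from the question

# Under H0 the absentees are spread evenly over the five days.
expected = [sum(observed) / len(observed)] * len(observed)

# Contribution of each day: (O_i - E_i)^2 / E_i
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_calc = sum(contributions)

CRITICAL_5PCT_DF4 = 9.488  # table value quoted in the solution (assumed, not computed)
reject_h0 = chi2_calc > CRITICAL_5PCT_DF4

print(contributions)
print(round(chi2_calc, 2), reject_h0)
```

The list `contributions` corresponds to the last column of the table above, and `chi2_calc` to its total.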



Question 2

A bag contains red, yellow and green balls in the ratio 3:4:5. A ball is drawn at random from the bag, its colour is noted and it is then replaced in the bag.

In 240 trials the results are as follows.

Colour       Red   Yellow   Green   Total
Frequency    68    74       98      240

Perform a test at the 5% level to determine whether the differences between the
observed and expected frequencies are significant.



Solution to question 2

The observed frequencies are

Colour       Red   Yellow   Green   Total
Frequency    68    74       98      240

Performing the test

H0 : The colour of the balls is in the ratio 3:4:5.

H1 : The colour of the balls is not in the ratio 3:4:5.

Under H0 the expected frequencies are

Red $\frac{3}{12} \times 240 = 60$, Yellow $\frac{4}{12} \times 240 = 80$ and Green $\frac{5}{12} \times 240 = 100$.

Writing the expected frequencies into a table we have

Colour       Red   Yellow   Green   Total
Frequency    60    80       100     240

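Note: The number of degrees of freedom is the number of independent variables = 3 − 1 = 2.

We perform a $\chi^2$ test at the 5% level. Considering $\chi^2(2)$, we reject H0 if $\chi^2_{\text{calc}} > 5.991$, where

$$\chi^2_{\text{calc}} = \sum_{i=1}^{3} \frac{(O_i - E_i)^2}{E_i}$$

[Figure: the $\chi^2(2)$ density with the upper 5% tail beyond 5.991 marked as the region where H0 is rejected.]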

Using the graphics calculator in the STAT menu, 1:Edit, enter the observed frequencies into L1 and the expected frequencies into L2.

Now press QUIT and enter (L1 - L2)² ÷ L2 STO L3, ENTER. This is $\frac{(O_i - E_i)^2}{E_i}$.

Re-enter the STAT menu, 1:Edit to see the table. Press QUIT and enter the LIST menu, select MATH, 5: sum(L3) ENTER gives the value of $\chi^2_{\text{calc}} = 1.56$.

Colour    O_i    E_i    (O_i − E_i)²/E_i
Red       68     60     1.06666666
Yellow    74     80     0.45
Green     98     100    0.04
Total     240    240    1.55666666

As $\chi^2_{\text{calc}} = 1.56 < 5.991$, we do not reject H0 and conclude that there is no significant evidence at the 5% level that the colours of the balls are not in the ratio 3:4:5.
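The step from a stated ratio to expected frequencies, and then to the test statistic, can be sketched in Python as follows. The dictionary layout and names are illustrative assumptions; the critical value 5.991 is taken from the solution.

```python
# Goodness-of-fit test against a fixed ratio (Question 2).
ratio = {"Red": 3, "Yellow": 4, "Green": 5}       # hypothesised ratio under H0
observed = {"Red": 68, "Yellow": 74, "Green": 98}  # from the question

n = sum(observed.values())        # 240 trials
total_parts = sum(ratio.values())  # 3 + 4 + 5 = 12

# Expected frequency for each colour: (parts / 12) * 240
expected = {colour: n * parts / total_parts for colour, parts in ratio.items()}

chi2_calc = sum(
    (observed[c] - expected[c]) ** 2 / expected[c] for c in ratio
)

CRITICAL_5PCT_DF2 = 5.991  # table value quoted in the solution (assumed, not computed)
reject_h0 = chi2_calc > CRITICAL_5PCT_DF2

print(expected)
print(round(chi2_calc, 4), reject_h0)
```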



Question 3

The table below shows the result of planting seeds in rows of 6 and the number
of seeds that germinate in each row after a two-week period. Test at the 10%
level whether the data can be modelled by a binomial distribution.

Number of seeds that germinate (x)    0    1    2    3    4    5    6
Frequency (f)                         15   26   21   14   10   9    5



Solution to question 3

Let X be the r.v. ‘the number of seeds that germinate per row’. As we do not know the value of p, the probability of success, we will have to calculate it from the data given using the graphics calculator.

In the STAT menu, 1:Edit enter the data from the table into L1 and L2. Re-enter the STAT menu, select CALC followed by 1: 1-Var Stats ENTER, L1, L2 ENTER. The mean can then be written down.

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{225}{100} = 2.25$$

so an estimate for p is $\frac{2.25}{6} = 0.375$.

H0 : X ∼ Bin ( 6, 0.375 ) , the distribution is binomial.

H1 : The distribution is not binomial.

Under H0 the expected frequencies can be calculated by using a graphics calculator, remembering that

$$P(X = x) = {}^{n}C_{x}\, q^{n-x} p^{x} = {}^{6}C_{x}\, (0.625)^{6-x} (0.375)^{x}.$$

Press QUIT and in the DISTR menu select 0: binompdf(.

The parameters for binompdf are binompdf(n, p, x). Input the parameters as n = 6, p = 0.375 and x = L1, then enter STO L3, ENTER.



The expected frequencies can be worked out by
100p ( X = x ) . Enter 100 × L3 STO L4 ENTER
Re-enter the STAT menu, 1:Edit and you will get
the expected frequencies.

The completed table is shown below.

x        P(X = x)     100 P(X = x)
0        0.059604     5.9604
1        0.21457      21.457
2        0.32186      32.186
3        0.25749      25.749
4        0.11587      11.587
5        0.027809     2.7809
6        0.0027809    0.27809
Total                 100
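The estimate of p and the binomial expected frequencies in the table can be reproduced with a short Python sketch. It uses only `math.comb` from the standard library; the variable names are illustrative.

```python
from math import comb

# Data from Question 3: number of seeds germinating per row of 6.
xs = list(range(7))
freqs = [15, 26, 21, 14, 10, 9, 5]
n_trials = 6            # seeds per row
total = sum(freqs)      # 100 rows

# Estimate p from the sample mean: mean = n * p.
mean = sum(x * f for x, f in zip(xs, freqs)) / total
p_hat = mean / n_trials

# Binomial probabilities P(X = x) = nCx * p^x * q^(n - x)
probs = [
    comb(n_trials, x) * p_hat**x * (1 - p_hat) ** (n_trials - x) for x in xs
]
expected = [total * p for p in probs]

for x, p, e in zip(xs, probs, expected):
    print(x, round(p, 6), round(e, 4))
```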

We will perform a $\chi^2$ goodness of fit test. As the $\chi^2$ test is not valid for expected frequencies less than 5, we will combine the last 3 rows into a single cell, ‘4 or more’.

The degrees of freedom is the number of independent variables, which is

v = number of cells – the number of restrictions

v = 5 − 2 = 3

Note: The number of restrictions is 2, i.e. the totals agree and p has been estimated from the sample.

We perform a $\chi^2$ test at the 10% level. Considering $\chi^2(3)$, we reject H0 if $\chi^2_{\text{calc}} > 6.251$, where

$$\chi^2_{\text{calc}} = \sum_{i=1}^{5} \frac{(O_i - E_i)^2}{E_i}$$

[Figure: the $\chi^2(3)$ density with the upper 10% tail beyond 6.251 marked as the region where H0 is rejected.]

Using the graphics calculator in the STAT menu, 1:Edit, enter the observed frequencies into L1 and the expected frequencies into L2.

Now press QUIT and enter (L1 - L2)² ÷ L2 STO L3, ENTER. This is $\frac{(O_i - E_i)^2}{E_i}$.

Re-enter the STAT menu, 1:Edit to see the table. Press QUIT and enter the LIST menu, select MATH, 5: sum(L3) ENTER gives the value of $\chi^2_{\text{calc}} = 29.89$.

Number of seeds germinating    O_i    E_i       (O_i − E_i)²/E_i
0                              15     5.9604    13.709544
1                              26     21.457    0.961870
2                              21     32.186    3.887609
3                              14     25.749    5.360946
4 or more                      24     14.645    5.974161
Total                          100    100       29.894131

As $\chi^2_{\text{calc}} = 29.89 > 6.251$, we reject H0 and conclude that the distribution is not binomial, i.e. X is not Bin(6, 0.375).
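Pooling the small expected cells and counting the restrictions are the two steps most easily mistaken here, so a sketch may help. The helper `pool_tail` is a hypothetical name introduced for illustration; the expected frequencies are the rounded values from the solution's table, and 6.251 is the quoted critical value.

```python
# Pool the tail cells, then compute the statistic (Question 3).
observed = [15, 26, 21, 14, 10, 9, 5]
expected = [5.9604, 21.457, 32.186, 25.749, 11.587, 2.7809, 0.27809]


def pool_tail(values, start):
    """Merge values[start:] into a single final cell."""
    return values[:start] + [sum(values[start:])]


# Cells become 0, 1, 2, 3 and "4 or more".
obs_pooled = pool_tail(observed, 4)
exp_pooled = pool_tail(expected, 4)

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))

# Restrictions: totals agree (1) and p estimated from the sample (1).
restrictions = 2
df = len(obs_pooled) - restrictions

CRITICAL_10PCT_DF3 = 6.251  # table value quoted in the solution (assumed, not computed)
print(obs_pooled, [round(e, 3) for e in exp_pooled])
print(round(chi2_calc, 2), df, chi2_calc > CRITICAL_10PCT_DF3)
```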



Question 4

The number of telephone calls received by an operator at a hotel between the hours of 9.00 a.m. and 10.00 p.m. over a 100 day period is shown in the table below.

Number of phone calls    0    1    2    3    4    5
Number of days           25   36   16   11   8    4

Determine whether a Poisson distribution with mean 2 can model the above
distribution. Test at the 10% level.



Solution to question 4

Let X be the random variable ‘the number of telephone calls received between 9.00 a.m. and 10.00 a.m.’

H0 : The distribution is Poisson with mean 2, i.e. X ∼ Po(2)

H1 : X is not distributed in this way.

We now have to calculate the expected frequencies, which can be done using the graphics calculator. The calculator can calculate $P(X = x) = \frac{e^{-2} 2^x}{x!}$ for the different values of x. Press QUIT and in the DISTR menu select 0:poissonpdf(.

The parameters for poissonpdf are poissonpdf(µ, x). Input the parameters as µ = 2 and x = L1, then enter STO L3, ENTER.

As a Poisson distribution is infinite, you have to make the last cell x ≥ 5 rather than x = 5. Press QUIT, enter 1 – and then enter the DISTR menu and select C: poissoncdf( with µ = 2 and x = 4, giving 1 – poissoncdf(2, 4). Now replace the last entry in L3 by pressing ANS STO L3(6).

The expected frequencies can be worked out by 100 P(X = x). Enter 100 × L3 STO L4 ENTER. Re-enter the STAT menu, 1:Edit and you will get the expected frequencies.



The completed table is shown below.

Number of calls    P(X = x)                   100 P(X = x)
0                  0.13533                    13.534
1                  0.27067                    27.067
2                  0.27067                    27.067
3                  0.18044                    18.045
4                  0.090223                   9.0224
5 or more          1 − 0.947347 = 0.052653    5.2653
Total                                         100
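The Poisson expected frequencies, including the open-ended ‘5 or more’ cell, can be checked with the following sketch. It assumes only the mean of 2 and the 100 days stated in the question; the names are illustrative.

```python
from math import exp, factorial

mu = 2.0       # hypothesised Poisson mean under H0
n_days = 100   # number of days observed

# P(X = x) = e^(-mu) * mu^x / x!  for x = 0, 1, 2, 3, 4
probs = [exp(-mu) * mu**x / factorial(x) for x in range(5)]

# The last cell is the whole upper tail: P(X >= 5) = 1 - P(X <= 4)
probs.append(1 - sum(probs))

expected = [n_days * p for p in probs]

labels = ["0", "1", "2", "3", "4", "5 or more"]
for label, p, e in zip(labels, probs, expected):
    print(label, round(p, 6), round(e, 4))
```

Because the tail is obtained by subtraction, the probabilities sum to 1 and the expected frequencies sum to 100, which is the single restriction used in the degrees-of-freedom count.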

The degrees of freedom is the number of independent variables, which is

v = number of cells – the number of restrictions

v = 6 − 1 = 5

Note: The number of restrictions is 1, i.e. the totals agree.

We perform a $\chi^2$ test at the 10% level. Considering $\chi^2(5)$, we reject H0 if $\chi^2_{\text{calc}} > 9.236$, where

$$\chi^2_{\text{calc}} = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}$$

[Figure: the $\chi^2(5)$ density with the upper 10% tail beyond 9.236 marked as the region where H0 is rejected.]

Using the graphics calculator in the STAT menu, remembering that the observed frequencies are in List 2 and the expected frequencies are in List 4, press QUIT and enter (L2 – L4)² ÷ L4 STO L5, ENTER. This is $\frac{(O_i - E_i)^2}{E_i}$.

The same result can be built up in stages: store the differences $O_i - E_i$ in a spare list, square that list to give $(O_i - E_i)^2$, and then divide by the expected frequencies in List 4, putting the answer into List 5 for $\frac{(O_i - E_i)^2}{E_i}$.

Re-enter the STAT menu, 1:Edit to see the table. Press QUIT and enter the LIST menu, select MATH, 5: sum(L5) ENTER gives the value of $\chi^2_{\text{calc}} = 20.36$.

The completed table is shown below.

Number of calls    O_i    E_i       (O_i − E_i)²/E_i
0                  25     13.534    9.71640353
1                  36     27.067    2.94818373
2                  16     27.067    4.52501160
3                  11     18.045    2.74983019
4                  8      9.0224    0.11584607
5 or more          4      5.2653    0.30406395
Total              100    100       20.35850043

Now as $\chi^2_{\text{calc}} = 20.36 > 9.236$ we reject H0 and say that there is significant evidence at the 10% level that a Poisson distribution with mean 2 cannot model the distribution.
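A short Python sketch of the final comparison, using the rounded expected frequencies from the table, is given below. The critical value 9.236 is the one quoted in the solution; because the mean was specified in H0 rather than estimated, only one restriction is counted.

```python
# Test statistic for the Poisson fit (Question 4).
observed = [25, 36, 16, 11, 8, 4]
expected = [13.534, 27.067, 27.067, 18.045, 9.0224, 5.2653]  # from the table

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2_calc = sum(contributions)

# The mean was given in H0, so the only restriction is that the totals agree.
df = len(observed) - 1

CRITICAL_10PCT_DF5 = 9.236  # table value quoted in the solution (assumed, not computed)
reject_h0 = chi2_calc > CRITICAL_10PCT_DF5

print([round(c, 4) for c in contributions])
print(round(chi2_calc, 2), df, reject_h0)
```

The individual contributions differ from the table in the third or fourth decimal place because the table appears to have been computed from unrounded calculator values; the total still rounds to 20.36.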



Question 5

The heights measured in cm, of a group of students are given in the table below.
Determine whether the data can be modelled by a normal distribution. Test at the
5% level.

Height in cm    146-150   151-155   156-160   161-165   166-170   171-175
Frequency       10        17        20        14        10        9



Solution to question 5

Let X be the r.v. ‘the height in cm of students in a class’.

To find the mean and standard deviation of the distribution we take the mid-point of each interval.

Mid-interval (cm)    148   153   158   163   168   173
Frequency            10    17    20    14    10    9

Using the graphics calculator in the STAT menu, enter the mid-intervals into List
1 and the corresponding frequencies into List 2.

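$$\mu = \bar{x} = \frac{\sum fx}{\sum f} = \frac{12760}{80} = 159.5$$

$$\sigma = s_n = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{2039840}{80} - 159.5^2} = 7.59934207$$

and using the $\chi^2$ goodness-of-fit test we have

H0 : $X \sim N(159.5, 7.599^2)$
H1 : X is not distributed in this way.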
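The grouped-data estimates of the mean and standard deviation can be verified with a few lines of Python. This sketch uses the mid-interval values and the population form of the standard deviation (divisor Σf), as the solution does; the names are illustrative.

```python
from math import sqrt

midpoints = [148, 153, 158, 163, 168, 173]  # mid-interval values in cm
freqs = [10, 17, 20, 14, 10, 9]

n = sum(freqs)                                           # sum of f
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))    # sum of f*x
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))  # sum of f*x^2

mean = sum_fx / n
# Population standard deviation: sqrt(sum(f x^2)/sum(f) - mean^2)
sigma = sqrt(sum_fx2 / n - mean**2)

print(n, sum_fx, sum_fx2)
print(mean, round(sigma, 8))
```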

Now to calculate the expected frequencies we will use our graphics calculator. First go to the STAT menu, 1:Edit, and in List 1 put in the upper class boundaries (u.c.b.). Next press QUIT and standardise each u.c.b. using $z = \frac{x - \mu}{\sigma} = \frac{x - 159.5}{7.599\ldots}$. Enter (L1 – 159.5) ÷ 7.599… STO L2 ENTER. Press QUIT and enter the DISTR menu, then select 2:normalcdf(.



The parameters for normalcdf( are normalcdf(lowerbound, upperbound, µ, σ). Note that if you are using the standard normal curve you do not need the last two.

Select 2 and enter normalcdf(-1EE99, L2(1)) STO L3(1) ENTER. Then press 2nd ENTER, which recalls the last entry, and edit it to enter each of the following in turn:

normalcdf(L2(1), L2(2)) STO L3(2)
normalcdf(L2(2), L2(3)) STO L3(3)
normalcdf(L2(3), L2(4)) STO L3(4)
normalcdf(L2(4), L2(5)) STO L3(5)
normalcdf(L2(5), L2(6)) STO L3(6)

In the last case enter normalcdf(L2(6), 1EE99) STO L3(7).

For the expected frequencies we need to evaluate 80 P(a < X ≤ b), which we do by entering 80 × L3 STO L4. Re-enter the STAT menu to view the expected frequencies.



The completed table is shown below.

u.c.b.    Standardised u.c.b.    P(a < X ≤ b)    80 P(a < X ≤ b)
150.5     -1.184313052           0.11814         9.4512
155.5     -0.526361356           0.18118         14.4944
160.5     0.131590339            0.25303         20.2424
165.5     0.789542035            0.23275         18.62
170.5     1.44749373             0.14102         11.2816
175.5     2.105445426            0.05625         4.5
∞         ∞                      0.01763         1.4104
                                                 ∑ Ei = 80
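The normalcdf steps amount to differencing the cumulative distribution at successive upper class boundaries. A minimal Python sketch, assuming the standard-library error function `math.erf` as a stand-in for the calculator's normalcdf, is shown below; `phi` is an illustrative helper name.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma, n = 159.5, 7.59934207, 80
upper_bounds = [150.5, 155.5, 160.5, 165.5, 170.5, 175.5]  # u.c.b. in cm

# Cumulative probability at each boundary, plus 1.0 for the boundary at infinity.
cdf = [phi((b - mu) / sigma) for b in upper_bounds] + [1.0]

# First cell is everything below 150.5; the rest are successive differences.
probs = [cdf[0]] + [hi - lo for lo, hi in zip(cdf, cdf[1:])]
expected = [n * p for p in probs]

for e in expected:
    print(round(e, 4))
```

The first and last cells extend to −∞ and +∞ respectively, so that the expected frequencies sum to 80.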
Now the $\chi^2$ test is not valid for expected frequencies less than 5, so we combine the last two cells. The degrees of freedom is the number of independent variables, which is

v = number of cells – the number of restrictions

v = 6 − 3 = 3

Note: The number of restrictions is 3, i.e. the totals agree, the means agree and the variances agree.

We perform a $\chi^2$ test at the 5% level. Considering $\chi^2(3)$, we reject H0 if $\chi^2_{\text{calc}} > 7.815$, where

$$\chi^2_{\text{calc}} = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}$$

[Figure: the $\chi^2(3)$ density with the upper 5% tail beyond 7.815 marked as the region where H0 is rejected.]

Using the graphics calculator in the STAT menu, with the observed frequencies in List 2 and the expected frequencies (after combining the last two cells) in List 4, press QUIT and enter (L2 – L4)² ÷ L4 STO L5, ENTER. This is $\frac{(O_i - E_i)^2}{E_i}$.

Re-enter the STAT menu, 1:Edit to see the table. Press QUIT and enter the LIST menu, select MATH, 5: sum(L5) ENTER gives the value of $\chi^2_{\text{calc}} = 3.37$.

The completed table is shown below.

Heights    O_i    E_i        (O_i − E_i)²/E_i
≤150       10     9.4512     0.031867
151-155    17     14.4944    0.433135
156-160    20     20.2424    0.002902
161-165    14     18.62      1.1463158
166-170    10     11.2816    0.1455909
≥171       9      5.9104     1.6150562
Total      80     80         3.374868

Now as $\chi^2_{\text{calc}} = 3.37 < 7.815$ we do not reject H0 and say that there is no significant evidence at the 5% level against the distribution being modelled by a normal distribution X where $X \sim N(159.5, 7.599^2)$.
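The effect of estimating both parameters on the degrees of freedom can be made explicit in a sketch like the following. The expected frequencies are those from the solution's table, with the last two cells already combined; 7.815 is the quoted critical value.

```python
# Test statistic for the normal fit (Question 5).
observed = [10, 17, 20, 14, 10, 9]
expected = [9.4512, 14.4944, 20.2424, 18.62, 11.2816, 4.5 + 1.4104]

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Restrictions: totals agree, mean estimated, variance estimated.
estimated_parameters = 2
df = len(observed) - 1 - estimated_parameters

CRITICAL_5PCT_DF3 = 7.815  # table value quoted in the solution (assumed, not computed)
reject_h0 = chi2_calc > CRITICAL_5PCT_DF3

print(round(chi2_calc, 3), df, reject_h0)
```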



Question 6

A research team is interested to see whether there is a relation between family income and the type of car owned. They survey 200 families, with the following results.

                                          Family income
Type of car owned               Low income   Middle income   High income
Expensive ($35000 or more)      17           31              23
Mid-range ($15000-$35000)       20           22              16
Cheap (under $15000)            26           28              17

Test to see if there is a relationship between type of car and family income. Test
at the 5% level.



Solution to question 6

First determine the totals for the contingency table.

                                          Family income
Type of car owned               Low income   Middle income   High income   Total
Expensive ($35000 or more)      17           31              23            71
Mid-range ($15000-$35000)       20           22              16            58
Cheap (under $15000)            26           28              17            71
Total                           63           81              56            200

H0 : The type of car owned is independent of family income.

H1 : The type of car owned is not independent of family income.

Under H0 the expected frequencies have the same row and column totals and can be calculated for each cell by $\frac{\text{row total} \times \text{column total}}{\text{grand total}}$. The calculated values are shown in the table below.

                                          Family income
Type of car owned               Low income   Middle income   High income   Total
Expensive ($35000 or more)      22.365       28.755          19.88         71
Mid-range ($15000-$35000)       18.27        23.49           16.24         58
Cheap (under $15000)            22.365       28.755          19.88         71
Total                           63           81              56            200

Notice that when the first four cells are calculated the others are restricted by the
totals.
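The row-total × column-total rule can be sketched in Python as follows. The nested-list layout (rows are car types, columns are income bands) and the names are illustrative assumptions.

```python
# Expected frequencies for a contingency table (Question 6).
# Rows: Expensive, Mid-range, Cheap. Columns: Low, Middle, High income.
observed = [
    [17, 31, 23],
    [20, 22, 16],
    [26, 28, 17],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E_ij = (row total i) * (column total j) / grand total
expected = [
    [r * c / grand_total for c in col_totals] for r in row_totals
]

# Only (rows - 1) * (columns - 1) cells are free once the totals are fixed.
df = (len(row_totals) - 1) * (len(col_totals) - 1)

for row in expected:
    print([round(e, 3) for e in row])
print(df)
```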



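Note: The number of degrees of freedom is the number of independent variables = (3 − 1)(3 − 1) = 4.

We perform a $\chi^2$ test at the 5% level. Considering $\chi^2(4)$, we reject H0 if $\chi^2_{\text{calc}} > 9.488$, where

$$\chi^2_{\text{calc}} = \sum_{i=1}^{9} \frac{(O_i - E_i)^2}{E_i}$$

[Figure: the $\chi^2(4)$ density with the upper 5% tail beyond 9.488 marked as the region where H0 is rejected.]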

Using the graphics calculator in the STAT menu, remembering that the observed frequencies are in List 2 and the expected frequencies are in List 4, press QUIT and enter (L2 – L4)² ÷ L4 STO L5, ENTER. This is $\frac{(O_i - E_i)^2}{E_i}$.

Re-enter the STAT menu, 1:Edit to see the table. Press QUIT and enter the LIST menu, select MATH, 5: sum(L5) ENTER gives the value of $\chi^2_{\text{calc}} = 3.242$.



The completed table is shown below.

O_i      E_i       (O_i − E_i)²/E_i
17       22.365    1.2869763
20       18.27     0.1638149
26       22.365    0.5907992
31       28.755    0.1752747
22       23.49     0.0945125
28       28.755    0.0198235
23       19.88     0.4896579
16       16.24     0.0035479
17       19.88     0.4172233
200      200       3.2416294

Now as $\chi^2_{\text{calc}} = 3.242 < 9.488$ we do not reject H0 and say that there is no significant evidence at the 5% level that the type of car owned depends on family income.

This test can also be done very easily directly with a graphics calculator. However, be careful to write out each stage of the working as in the above solution.

In the MATRIX menu select EDIT then 1: [A], and state the number of rows as 3 and columns as 3. Now enter the data from the original contingency table given in the question. Enter the STAT menu and select TESTS, C: χ²-Test, ENTER.

Set Observed to Mat A, scroll down to Calculate and press ENTER. The calculator returns $\chi^2_{\text{calc}} = 3.242$, the degrees of freedom as 4 and the p-value (the area to the right of $\chi^2_{\text{calc}}$) as 51.8%. As this is greater than 5%, we do not reject H0.
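The 51.8% area reported by the calculator can be checked by hand. For 4 degrees of freedom the upper-tail probability of the $\chi^2$ distribution has the closed form $e^{-x/2}(1 + x/2)$; note that this shortcut is specific to 4 degrees of freedom and is not a general formula. The sketch below recomputes the statistic from the contingency table and then applies it; the names are illustrative.

```python
from math import exp

# Observed contingency table from Question 6.
observed = [
    [17, 31, 23],
    [20, 22, 16],
    [26, 28, 17],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over all nine cells, with E = row * column / grand total.
chi2_calc = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi2_calc += (o - e) ** 2 / e

# Upper-tail area for 4 degrees of freedom only: exp(-x/2) * (1 + x/2).
p_value = exp(-chi2_calc / 2) * (1 + chi2_calc / 2)

print(round(chi2_calc, 3), round(p_value, 3))
```

Since the p-value is well above 0.05, this agrees with the decision not to reject H0.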

Re-enter the STAT menu and select TESTS, C: χ²-Test, ENTER. Scroll down to Draw and press ENTER.



The calculator draws the graph and displays the p-value

Press F1 Mat followed by Ans then EXE will give the matrix of expected
frequencies.



Question 7

In a school the results obtained in the IB Mathematics examination and Physics examination are shown in the table below. Test at the 5% level whether the performance in both subjects is related.

                         Physics
                         Pass    Fail
Mathematics    Pass      52      22
               Fail      10      16



Solution to question 7

First determine the totals for the contingency table.

                         Physics
                         Pass    Fail    Total
Mathematics    Pass      52      22      74
               Fail      10      16      26
               Total     62      38      100

H0 : The results of Mathematics are independent of Physics.

H1 : The results of Mathematics are not independent of Physics.

Under H0 the expected frequencies have the same row and column totals and can be calculated for each cell by $\frac{\text{row total} \times \text{column total}}{\text{grand total}}$. The calculated values are shown in the table below.

                         Physics
                         Pass     Fail     Total
Mathematics    Pass      45.88    28.12    74
               Fail      16.12    9.88     26
               Total     62       38       100

Notice that when the first cell is calculated the others are restricted by the totals.

Note: The number of degrees of freedom is the number of independent variables = 1.

We perform a $\chi^2$ test at the 5% level. Considering $\chi^2(1)$, we reject H0 if $\chi^2_{\text{calc}} > 3.841$, where

$$\chi^2_{\text{calc}} = \sum_{i=1}^{4} \frac{(|O_i - E_i| - 0.5)^2}{E_i}$$

[Figure: the $\chi^2(1)$ density with the upper 5% tail beyond 3.841 marked as the region where H0 is rejected.]

Note: As the number of degrees of freedom is one (v = 1) we have to use Yates’ continuity correction when calculating $\chi^2_{\text{calc}}$.



Using the graphics calculator in the STAT menu, remembering that the observed frequencies are in List 2 and the expected frequencies are in List 4, press QUIT and enter (abs(L2 – L4) – 0.5)² ÷ L4 STO L5, ENTER, where abs( is found in the MATH menu under NUM, 1: abs(. This is $\frac{(|O_i - E_i| - 0.5)^2}{E_i}$.

Re-enter the STAT menu, 1:Edit to see the table. Press QUIT and enter the LIST menu, select MATH, 5: sum(L5) ENTER gives the value of $\chi^2_{\text{calc}} = 6.968$.

O_i     E_i      (|O_i − E_i| − 0.5)²/E_i
52      45.88    0.688413252
10      16.12    1.959330025
22      28.12    1.123200569
16      9.88     3.196801619
100     100      6.967745465

Now as $\chi^2_{\text{calc}} = 6.968 > 3.841$ we reject H0 and say that there is significant evidence at the 5% level that the results of Mathematics are not independent of the results of Physics. Note: The graphics calculator's χ²-Test cannot be used to perform this test, as it does not allow for Yates’ continuity correction, although it can calculate the expected frequencies. (See previous question)
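Since the calculator's built-in test does not apply the correction, a short Python sketch can serve as a check. It computes the expected frequencies from the totals and then applies Yates' continuity correction; the critical value 3.841 is the one quoted in the solution, and the names are illustrative.

```python
# 2 x 2 contingency table with Yates' continuity correction (Question 7).
# Rows: Mathematics pass / fail. Columns: Physics pass / fail.
observed = [
    [52, 22],
    [10, 16],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2_yates = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        # Yates: reduce each absolute difference by 0.5 before squaring.
        chi2_yates += (abs(o - e) - 0.5) ** 2 / e

CRITICAL_5PCT_DF1 = 3.841  # table value quoted in the solution (assumed, not computed)
reject_h0 = chi2_yates > CRITICAL_5PCT_DF1

print(round(chi2_yates, 3), reject_h0)
```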
