
Hypothesis testing: Two Proportions
Bio 106 Statistical Biology
Chi-square test of independence

• The chi-square test of independence is used to determine whether there is a significant relationship between two nominal (categorical) variables.

• The frequency of each category for one nominal variable is compared across the categories of the second nominal variable.

• Assumption: the data were obtained through random sampling.


Presence or absence of Species A and Species B (Response: Present or Absent)
Statistical hypothesis

• Null hypothesis: There is no association between Species A and Species B.
• Alternative hypothesis: There is an association between Species A and Species B.
Set α

• Let’s set the significance level at 5% here (α = 0.05)


Data collection

• Done via random sampling (required)
• Quadrat method (a field sampling method)
• Presence or absence of Species A and Species B (Response: Present or Absent)
• Record data
• Tabulate data into a contingency table
Determine the expected values

• Expected frequency = (row total × column total) / grand total
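As a minimal sketch of the formula above, the Python snippet below computes expected frequencies for a 2 × 2 contingency table. The observed counts are hypothetical placeholders (they are not the quadrat data from these slides), and the function name `expected_frequencies` is only an illustrative choice.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical quadrat counts (NOT the data from the slides).
# Rows: Species A present / absent; columns: Species B present / absent.
observed = [[20, 30],
            [30, 20]]

expected = expected_frequencies(observed)
print(expected)  # [[25.0, 25.0], [25.0, 25.0]]
```

Here every row and column total is 50 and the grand total is 100, so each expected count is 50 × 50 / 100 = 25.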
Calculate chi-square statistic

χ² = 0.0 + 0.1 + 0.1 + 0.2 = 0.4
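The statistic is the sum of (O − E)² / E over all cells, where O is the observed count and E the expected count. The sketch below shows one way to compute it in Python; the counts are hypothetical and chosen only to keep the arithmetic simple, so the result does not match the 0.4 obtained on the slide.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of the contingency table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical observed and expected counts (NOT the data from the slides).
observed = [[20, 30], [30, 20]]
expected = [[25.0, 25.0], [25.0, 25.0]]

chi2 = chi_square_statistic(observed, expected)
print(chi2)  # 4.0 (each of the four cells contributes 25 / 25 = 1.0)
```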


Calculate degrees of freedom

• Degrees of freedom are calculated using the following formula:

DF = (r − 1)(c − 1)

where
DF = degrees of freedom
r = number of rows
c = number of columns

* The row and column holding the totals in the contingency table are not counted.
• DF = (2 − 1) × (2 − 1) = 1 × 1 = 1
• If the calculated χ² value is lower than the critical value at the 0.05 level of significance, fail to reject the null hypothesis and conclude that there is NO significant association between the variables.

• If the calculated χ² value is higher than the critical value at the 0.05 level of significance, reject the null hypothesis and conclude that there IS a significant association between the variables.

• χ² = 0.4 < 3.841 (DF = 1, α = 0.05)

• There is NO significant association between Species A and Species B.
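The degrees-of-freedom formula and the decision rule for the Species A and Species B example can be sketched in Python as follows. The function names are illustrative assumptions; the statistic (0.4) and the critical value (3.841 for DF = 1, α = 0.05) are the ones quoted on the slides.

```python
def degrees_of_freedom(n_rows, n_cols):
    """DF = (r - 1)(c - 1); the totals row and column are not counted."""
    return (n_rows - 1) * (n_cols - 1)


def decide(chi2, critical_value):
    """Reject H0 only when the statistic exceeds the critical value."""
    if chi2 > critical_value:
        return "reject H0"
    return "fail to reject H0"


df = degrees_of_freedom(2, 2)  # 2 x 2 table: present/absent for each species
critical_value = 3.841         # from the slides: DF = 1, alpha = 0.05
result = decide(0.4, critical_value)

print(df, result)  # 1 fail to reject H0
```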
Conduct the chi-square test
Make a graph
Expected frequency
(row total × column total) / grand total

             Asia     Africa   South America
Malaria A    30.96    23.04    36
Malaria B    20.64    15.36    24
Malaria C    34.40    25.60    40
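Expected counts always have the same row and column totals as the observed counts, so the margins behind this table can be recovered by summing it. The sketch below does that in Python and then rebuilds each cell with (row total × column total) / grand total as a consistency check. Only the numbers in the table are used; the variable names are illustrative.

```python
# Expected frequencies copied from the table above.
expected = {
    "Malaria A": {"Asia": 30.96, "Africa": 23.04, "South America": 36.0},
    "Malaria B": {"Asia": 20.64, "Africa": 15.36, "South America": 24.0},
    "Malaria C": {"Asia": 34.40, "Africa": 25.60, "South America": 40.0},
}
regions = ["Asia", "Africa", "South America"]

# Summing the expected counts recovers the row, column, and grand totals.
row_totals = {m: sum(cells.values()) for m, cells in expected.items()}
col_totals = {r: sum(expected[m][r] for m in expected) for r in regions}
grand_total = sum(row_totals.values())

# Rebuild every cell from the margins and compare it with the table.
max_error = max(
    abs(row_totals[m] * col_totals[r] / grand_total - expected[m][r])
    for m in expected
    for r in regions
)

print({m: round(t, 2) for m, t in row_totals.items()})
print({r: round(t, 2) for r, t in col_totals.items()})
print(round(grand_total, 2), max_error < 1e-9)
```

The recovered margins are 90, 60, and 100 for the rows, 86, 64, and 100 for the columns, and 250 overall; for instance, 90 × 86 / 250 = 30.96 for Malaria A in Asia.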


(O − E)² / E

             Asia         Africa   South America
Malaria A    0.0000516    3.546    2.25
Malaria B    16.83        6.99     35.04
Malaria C    10.06        14.70    36.10

χ² = Σ (O − E)² / E = 125.516

DF = (c − 1)(r − 1) = (3 − 1)(3 − 1) = 4


        Probability level (alpha)
DF      0.50     0.10     0.05      0.02      0.01      0.001
1       0.455    2.706    3.841     5.412     6.635     10.827
2       1.386    4.605    5.991     7.824     9.210     13.815
3       2.366    6.251    7.815     9.837     11.345    16.268
4       3.357    7.779    9.488     11.668    13.277    18.465
5       4.351    9.236    11.070    13.388    15.086    20.517

Reject H₀ because 125.516 is greater than 9.488 (the critical value for DF = 4, α = 0.05).
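The table lookup can also be sketched in Python. The dictionary below simply transcribes the critical values from the table above; the helper `smallest_significant_alpha` is a hypothetical name that reports the strictest tabulated alpha at which a statistic is still significant.

```python
# Critical values transcribed from the table above: {DF: {alpha: value}}.
CRITICAL_VALUES = {
    1: {0.50: 0.455, 0.10: 2.706, 0.05: 3.841, 0.02: 5.412, 0.01: 6.635, 0.001: 10.827},
    2: {0.50: 1.386, 0.10: 4.605, 0.05: 5.991, 0.02: 7.824, 0.01: 9.210, 0.001: 13.815},
    3: {0.50: 2.366, 0.10: 6.251, 0.05: 7.815, 0.02: 9.837, 0.01: 11.345, 0.001: 16.268},
    4: {0.50: 3.357, 0.10: 7.779, 0.05: 9.488, 0.02: 11.668, 0.01: 13.277, 0.001: 18.465},
    5: {0.50: 4.351, 0.10: 9.236, 0.05: 11.070, 0.02: 13.388, 0.01: 15.086, 0.001: 20.517},
}


def smallest_significant_alpha(chi2, df):
    """Smallest tabulated alpha whose critical value chi2 exceeds, else None."""
    passed = [alpha for alpha, crit in CRITICAL_VALUES[df].items() if chi2 > crit]
    return min(passed) if passed else None


print(smallest_significant_alpha(125.516, 4))  # 0.001
print(smallest_significant_alpha(0.4, 1))      # None
```

With DF = 4, the value 125.516 exceeds every tabulated critical value, including 18.465 at α = 0.001, whereas 0.4 with DF = 1 does not exceed even the α = 0.50 value of 0.455.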
