
Chi-square test

Session Outline
• Tests of Association
  ◦ Chi-square test
  ◦ Fisher’s exact test


Contingency Tables

• When we wish to compare two categorical variables, we present the data in the form of a table

• Consider the following contingency table comparing two different treatments of cancer of the larynx

• How can we show statistically which of the two treatments is best for these patients?

                    Cancer       Cancer Not
                    Controlled   Controlled   Total
Surgery             21           10           31
Radiation Therapy   15           8            23
Total               36           18           54

• One possible approach could be to compare the proportion of surgery patients who had their cancer controlled (0.68) vs. the proportion of radiation treatment patients who had their cancer controlled (0.65)
  ◦ difference between two proportions

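The two proportions quoted above can be reproduced directly from the table. This is a minimal Python sketch; the variable names are illustrative and do not come from the slides.

```python
# Counts taken from the contingency table above.
surgery_controlled, surgery_total = 21, 31
radiation_controlled, radiation_total = 15, 23

# Proportion of patients in each treatment group whose cancer was controlled.
p_surgery = surgery_controlled / surgery_total
p_radiation = radiation_controlled / radiation_total

print(f"Surgery:    {p_surgery:.2f}")
print(f"Radiation:  {p_radiation:.2f}")
print(f"Difference: {p_surgery - p_radiation:.3f}")
```
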
Pearson’s Chi-square Test
• Under the chi-square test, we wish to determine if there is an association between two variables

• Null hypothesis: there is no association between the two variables.
• Alternative hypothesis: there is an association between the two variables.

• We calculate, for each cell in the table, the frequency that we would expect if there were no association between the two variables
• To do this, we use the row and column totals, so that the expected frequencies are based on these marginal totals

• If the two variables are not associated, the observed and expected frequencies should be close together
  ◦ any minor discrepancy being due to random error

• How “minor” a discrepancy should we accept?

• We need a statistic which measures this


• The expected frequency of a cell is calculated as:

    expected frequency = (row total × column total) ÷ grand total

                    Cancer       Cancer Not
                    Controlled   Controlled   Total
Surgery             a            c            31
Radiation Therapy   b            d            23
Total               36           18           54

a = (31 × 36) ÷ 54 = 20.67
b = (23 × 36) ÷ 54 = 15.33
c = (31 × 18) ÷ 54 = 10.33
d = (23 × 18) ÷ 54 = 7.67

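The same four expected frequencies can be obtained for any table from its row and column totals. The sketch below is one way to do it in plain Python; the function name `expected_frequencies` is a hypothetical helper, not something named in the slides.

```python
def expected_frequencies(observed):
    """Return the expected counts under no association for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Each cell: (row total x column total) / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: Surgery, Radiation Therapy. Columns: Controlled, Not Controlled.
observed = [[21, 10], [15, 8]]
expected = expected_frequencies(observed)
for row in expected:
    print([round(x, 2) for x in row])
```
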
Observed frequencies (expected frequencies in parentheses):

                    Cancer        Cancer Not
                    Controlled    Controlled    Total
Surgery             21 (20.67)    10 (10.33)    31
Radiation Therapy   15 (15.33)    8 (7.67)      23
Total               36            18            54

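How to do a chi-square test in Excel:

= CHITEST(E4:F5,E10:F11)

• The first range (E4:F5) holds the observed frequencies and the second (E10:F11) holds the expected frequencies
• CHITEST returns the p-value of the test
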
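For readers without Excel, the calculation can be sketched in Python. The slides do not write out the test statistic; the sketch assumes the standard Pearson form, the sum over all cells of (observed − expected)² ÷ expected, with 1 degree of freedom for a 2 × 2 table and no continuity correction. The function name `chi_square_2x2` is hypothetical, and the p-value shortcut used here is valid only for 1 degree of freedom.

```python
import math


def chi_square_2x2(observed):
    """Pearson chi-square statistic and p-value for a 2 x 2 table (no continuity correction)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp

    # With 1 degree of freedom, the upper-tail chi-square probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


statistic, p_value = chi_square_2x2([[21, 10], [15, 8]])
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

For the cancer example the statistic is very small and the p-value is large (roughly 0.85), so these data give no evidence of an association between treatment and cancer control.
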
Assumptions

• No expected cell frequency less than 1

• No more than 20% of cells with expected frequencies less than 5

• For the cancer example:
  ◦ smallest expected cell frequency was 7.67
  ◦ all expected cell frequencies > 5
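The two rules of thumb above can be checked mechanically once the expected frequencies are known. A minimal sketch, with a hypothetical helper name `check_chi_square_assumptions`:

```python
def check_chi_square_assumptions(expected):
    """Return True if the expected counts satisfy both rules of thumb."""
    cells = [x for row in expected for x in row]
    # Rule 1: no expected cell frequency less than 1.
    none_below_1 = min(cells) >= 1
    # Rule 2: no more than 20% of cells with expected frequency less than 5.
    share_below_5 = sum(x < 5 for x in cells) / len(cells)
    return none_below_1 and share_below_5 <= 0.20


# Expected frequencies from the cancer example.
expected = [[20.67, 10.33], [15.33, 7.67]]
print(check_chi_square_assumptions(expected))
```
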
Thank you
