# 2/10/2016

Chi-Square Test of Independence
Do you remember how to test the independence of two categorical variables? This test is performed by using a chi-square test of independence.

Recall that we can summarize two categorical variables within a two-way table, also called an r × c contingency table, where r = number of rows and c = number of columns. Our question of interest is "Are the two variables independent?" This question is set up using the following hypothesis statements:

Null Hypothesis: The two categorical variables are independent.
Alternative Hypothesis: The two categorical variables are dependent.

The chi-square test statistic is calculated by using the formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O represents the observed frequency and E is the expected frequency under the null hypothesis, computed by:

$$E = \frac{\text{row total} \times \text{column total}}{\text{sample size}}$$

We will compare the value of the test statistic to the critical value $\chi^2_{\alpha}$ with degrees of freedom = (r − 1)(c − 1), and reject the null hypothesis if $\chi^2 > \chi^2_{\alpha}$.

Example
Is gender independent of education level? A random sample of 395 people was surveyed, and each person's gender and education level are summarized in the following table:

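|        | High School | Bachelors | Masters | Ph.D. | Total |
|--------|-------------|-----------|---------|-------|-------|
| Female | 60          | 54        | 46      | 41    | 201   |
| Male   | 40          | 44        | 53      | 57    | 194   |
| Total  | 100         | 98        | 99      | 98    | 395   |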

Question: Are gender and education level dependent at a 5% level of significance? In other words, given the data collected above, is there a relationship between the gender of an individual and the level of education that they have obtained?

Here's the table of expected counts:

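|        | High School | Bachelors | Masters | Ph.D.  | Total |
|--------|-------------|-----------|---------|--------|-------|
| Female | 50.886      | 49.868    | 50.377  | 49.868 | 201   |
| Male   | 49.114      | 48.132    | 48.623  | 48.132 | 194   |
| Total  | 100         | 98        | 99      | 98     | 395   |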
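As a rough check on the expected counts, here is a minimal Python sketch that applies E = (row total × column total) / sample size to every cell of the observed table. The function name `expected_counts` and the list-of-lists layout are illustrative assumptions, not part of the original notes.

```python
# Observed counts from the example.
# Rows: Female, Male. Columns: High School, Bachelors, Masters, Ph.D.
observed = [
    [60, 54, 46, 41],
    [40, 44, 53, 57],
]


def expected_counts(table):
    """Return E = (row total * column total) / sample size for each cell.

    `expected_counts` is a hypothetical helper written for this sketch.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for label, row in zip(["Female", "Male"], expected):
    print(label, [round(e, 3) for e in row])
```

Rounded to three decimals, the printed values should agree with the table of expected counts.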

So, working this out,

$$\chi^2 = \frac{(60 - 50.886)^2}{50.886} + \cdots + \frac{(57 - 48.132)^2}{48.132} = 8.006$$

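The table has r = 2 rows and c = 4 columns, so there are (2 − 1)(4 − 1) = 3 degrees of freedom. The critical value of $\chi^2$ with 3 degrees of freedom at the 5% level is 7.815. Since 8.006 > 7.815, we reject the null hypothesis and conclude that education level depends on gender at a 5% level of significance.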
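The whole calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `chi_square_independence` is made up for this sketch, and the critical value 7.815 is copied from the text rather than computed.

```python
# Observed counts from the example.
# Rows: Female, Male. Columns: High School, Bachelors, Masters, Ph.D.
observed = [
    [60, 54, 46, 41],
    [40, 44, 53, 57],
]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table.

    Hypothetical helper: sums (O - E)^2 / E over all cells, with
    E = (row total * column total) / sample size.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            statistic += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


statistic, df = chi_square_independence(observed)

# Critical value for alpha = 0.05 and 3 degrees of freedom, taken from the text.
CRITICAL_VALUE = 7.815
reject_null = statistic > CRITICAL_VALUE

print(f"chi-square = {statistic:.3f}, df = {df}, reject H0: {reject_null}")
```

Hard-coding the critical value keeps the sketch dependency-free; a statistics library or a chi-square table would supply it for other significance levels or degrees of freedom.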

Using Minitab
We can enter the data into Minitab and request that the 'Chi-square test' be conducted for the above data.

The chi-square test of independence value that Minitab calculated is 8.006, which is the same as we calculated above.

[...] variables where the chance that something falls into a particular category depends on whether the variable falls into another category comes into play. This relationship of independence/dependence is important to be able to understand and use.

Chi-Square Goodness-of-Fit Tests
Do you remember how to use the chi-square goodness-of-fit test to test whether random categorical variables follow a particular probability distribution? Let's take a look at an example:

Example
Suppose the Penn State student population is 20% PA resident and 80% non-PA resident. Then, if a sample of 100 students yields 16 PA residents and 84 non-PA residents, how 'good' do the data 'fit' the assumed probability model of 20% PA resident and 80% non-PA resident?

We can use the chi-square goodness-of-fit statistic to test the hypothesis statements:
Null Hypothesis: $P_{\text{PA}} = 0.2$
Alternative Hypothesis: $P_{\text{PA}} \neq 0.2$

Working this out we get,

$$\chi^2 = \frac{(16 - 20)^2}{20} + \frac{(84 - 80)^2}{80} = 1$$

The critical value of $\chi^2$ with 1 degree of freedom is 3.84. Since 1 < 3.84, we cannot reject the null hypothesis at the 5% level of significance. In other words, the sample of randomly selected students in this example is consistent with the probability distribution that was specified.
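The goodness-of-fit calculation above can be sketched the same way. The dictionary layout and variable names are illustrative assumptions, and the critical value 3.84 is copied from the text rather than computed.

```python
# Observed sample of 100 students and the hypothesized proportions.
observed = {"PA resident": 16, "non-PA resident": 84}
hypothesized = {"PA resident": 0.20, "non-PA resident": 0.80}

n = sum(observed.values())

# Sum (O - E)^2 / E over the categories, where E = n * hypothesized proportion.
statistic = 0.0
for category, count in observed.items():
    expected = n * hypothesized[category]
    statistic += (count - expected) ** 2 / expected

# With k categories and fully specified proportions, df = k - 1.
df = len(observed) - 1

# Critical value for alpha = 0.05 and 1 degree of freedom, taken from the text.
CRITICAL_VALUE = 3.84
reject_null = statistic > CRITICAL_VALUE

print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {reject_null}")
```

Here the two terms are 16/20 = 0.8 and 16/80 = 0.2, so the statistic is 1 and the null hypothesis is not rejected.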
