

Chi-Square: Test of Homogeneity
This lesson explains how to conduct a chi-square test of homogeneity. The test is applied to a single variable from two different populations. It is used to determine whether frequency counts are distributed identically across the different populations.
When to Use Chi-Square
Test for Homogeneity
For each population, the sampling method is simple random sampling.
The variable under study is categorical.
If sample data are displayed in a contingency table (Populations x Category levels), the expected frequency count for each cell of the table is at least 5.
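The expected-count condition can be checked mechanically. The sketch below is a minimal illustration, not part of the lesson. The function names are hypothetical. The observed counts are the ones that appear in the lesson's worked chi-square computation (50, 30, 20 for boys; 50, 80, 70 for girls), and the expected counts use the formula given in Step 3 (row total times column total, divided by n).

```python
# Observed counts taken from the lesson's worked computation.
# Columns: The Lone Ranger, Sesame Street, The Simpsons.
observed = {
    "boys": [50, 30, 20],
    "girls": [50, 80, 70],
}


def expected_counts(table):
    """Return expected counts: (row total * column total) / n for each cell."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def meets_expected_count_condition(table, minimum=5):
    """True when every expected cell count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


print(meets_expected_count_condition(observed))
```

For the sample problem the smallest expected count is 30, so the condition holds.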
This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.
Sample Problem
In a study of the television
viewing habits of children, a
developmental psychologist
selects a random sample of 300
first graders - 100 boys and 200
girls. Each child is asked which of
the following TV programs they
like best: The Lone Ranger,
Sesame Street, or The Simpsons.
Step 1: State the Hypotheses
Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false; and vice versa.
Suppose that data were sampled from r populations, and assume that the categorical variable had c levels. At any specified level of the categorical variable, the null hypothesis states that each population has the same proportion of observations.
$H_0: P_{\text{level 1 of population 1}} = P_{\text{level 1 of population 2}} = \dots = P_{\text{level 1 of population } r}$
$H_0: P_{\text{level 2 of population 1}} = P_{\text{level 2 of population 2}} = \dots = P_{\text{level 2 of population } r}$
. . .
$H_0: P_{\text{level } c \text{ of population 1}} = P_{\text{level } c \text{ of population 2}} = \dots = P_{\text{level } c \text{ of population } r}$
Null hypothesis: The null hypothesis states that the proportion of boys who prefer The Lone Ranger is identical to the proportion of girls who prefer it, and likewise for the other programs. Thus,
$H_0: P_{\text{boys who prefer The Lone Ranger}} = P_{\text{girls who prefer The Lone Ranger}}$
$H_0: P_{\text{boys who prefer Sesame Street}} = P_{\text{girls who prefer Sesame Street}}$
$H_0: P_{\text{boys who prefer The Simpsons}} = P_{\text{girls who prefer The Simpsons}}$

Alternative hypothesis: At least one of the null hypothesis statements is false.
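To make the hypotheses concrete, it can help to look at the sample proportions that the null hypothesis says should match. The following is a rough sketch, not a test in itself. The helper name `proportions` is hypothetical, and the counts are those used later in the lesson's chi-square computation.

```python
# Observed counts from the lesson's worked computation.
programs = ["The Lone Ranger", "Sesame Street", "The Simpsons"]
observed = {
    "boys": [50, 30, 20],
    "girls": [50, 80, 70],
}


def proportions(counts):
    """Convert raw counts for one population into within-population shares."""
    total = sum(counts)
    return [count / total for count in counts]


for group, counts in observed.items():
    shares = proportions(counts)
    # One share per program; under H0 the boys' and girls' shares would agree.
    print(group, {p: round(s, 2) for p, s in zip(programs, shares)})
```

In the sample, half of the boys but only a quarter of the girls chose The Lone Ranger. The hypothesis test asks whether gaps of this size could plausibly arise from sampling variation alone.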
Step 2: Formulate an Analysis Plan
The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. The plan should specify the following elements.
Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
Test method. Use the chi-square test for homogeneity to determine whether observed sample frequencies differ significantly from expected frequencies specified in the null hypothesis. The chi-square test for homogeneity is described in the next section.
For this analysis, the significance level is 0.05. Using sample data, we will conduct a chi-square test for homogeneity.
Step 3: Analyze
Sample Data
Degrees of freedom. The degrees of freedom (DF) is equal to:
$\mathrm{DF} = (r - 1) \times (c - 1)$
where r is the number of populations, and c is the number of levels for the categorical variable.
Expected frequency counts. The expected frequency counts are computed separately for each population at each level of the categorical variable, according to the following formula:
$E_{r,c} = (n_r \times n_c) / n$
where $E_{r,c}$ is the expected frequency count for population r at level c of the categorical variable, $n_r$ is the total number of observations from population r, $n_c$ is the total number of observations at level c, and n is the total sample size.
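The two definitions above translate directly into code. This is a minimal sketch under the assumption that the contingency table is stored as a list of rows, one row per population. The function names are illustrative only, and the example table uses the observed counts from the lesson's chi-square computation.

```python
def degrees_of_freedom(table):
    """DF = (r - 1) * (c - 1) for a table with r rows and c columns."""
    r = len(table)
    c = len(table[0])
    return (r - 1) * (c - 1)


def expected_counts(table):
    """E[r][c] = (n_r * n_c) / n, from the row totals, column totals, and n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[n_r * n_c / n for n_c in col_totals] for n_r in row_totals]


# Rows: boys, girls. Columns: The Lone Ranger, Sesame Street, The Simpsons.
table = [[50, 30, 20], [50, 80, 70]]

print(degrees_of_freedom(table))
for row in expected_counts(table):
    print([round(e, 1) for e in row])
```

Rounded to one decimal place, the expected counts come out as 33.3, 36.7, 30.0 for boys and 66.7, 73.3, 60.0 for girls.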
Applying the chi-square test for homogeneity to sample data, we compute the degrees of freedom, the expected frequency counts, and the chi-square test statistic. Based on the chi-square statistic and the degrees of freedom, we determine the P-value.
$\mathrm{DF} = (r - 1) \times (c - 1) = (2 - 1) \times (3 - 1) = 2$

$E_{r,c} = (n_r \times n_c) / n$

$E_{1,1} = (100 \times 100) / 300 = 10000/300 = 33.3$
$E_{1,2} = (100 \times 110) / 300 = 11000/300 = 36.7$
$E_{1,3} = (100 \times 90) / 300 = 9000/300 = 30.0$
$E_{2,1} = (200 \times 100) / 300 = 20000/300 = 66.7$
$E_{2,2} = (200 \times 110) / 300 = 22000/300 = 73.3$
$E_{2,3} = (200 \times 90) / 300 = 18000/300 = 60.0$
Step 3: Analyze Sample Data (continued)
Test statistic. The test statistic is a chi-square random variable ($\chi^2$) defined by the following equation:
$\chi^2 = \sum \left[ (O_{r,c} - E_{r,c})^2 / E_{r,c} \right]$
where $O_{r,c}$ is the observed frequency count in population r for level c of the categorical variable, and $E_{r,c}$ is the expected frequency count in population r for level c of the categorical variable.
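The statistic is a sum over every cell of the table, which a short function can express directly. The sketch below is illustrative only. The function name is hypothetical, the observed counts are those used in the lesson's computation, and the expected counts are the rounded values from the Step 3 worked example.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells of two equally shaped tables."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Rows: boys, girls. Columns: The Lone Ranger, Sesame Street, The Simpsons.
observed = [[50, 30, 20], [50, 80, 70]]
# Expected counts rounded to one decimal place, as in the worked example.
expected = [[33.3, 36.7, 30.0], [66.7, 73.3, 60.0]]

print(round(chi_square_statistic(observed, expected), 2))
```

With the rounded expected counts this gives 19.39. Using unrounded expected counts would give a slightly smaller value.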
P-value. The P-value is the probability of observing a sample statistic at least as extreme as the test statistic, assuming the null hypothesis is true. Since the test statistic is a chi-square, use the Chi-Square Distribution Table to assess the probability associated with the test statistic. Use the degrees of freedom computed above.
$\chi^2 = \sum \left[ (O_{r,c} - E_{r,c})^2 / E_{r,c} \right]$
$\chi^2 = (50 - 33.3)^2/33.3 + (30 - 36.7)^2/36.7 + (20 - 30)^2/30 + (50 - 66.7)^2/66.7 + (80 - 73.3)^2/73.3 + (70 - 60)^2/60$
$\chi^2 = (16.7)^2/33.3 + (-6.7)^2/36.7 + (-10.0)^2/30 + (-16.7)^2/66.7 + (6.7)^2/73.3 + (10)^2/60$
$\chi^2 = 8.38 + 1.22 + 3.33 + 4.18 + 0.61 + 1.67 = 19.39$
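It can be instructive to see how much each cell contributes to the total, and how sensitive the total is to rounding. The sketch below recomputes the statistic from unrounded expected counts. It is not the lesson's own procedure, and the variable names are illustrative.

```python
# Rows: boys, girls. Columns: The Lone Ranger, Sesame Street, The Simpsons.
observed = [[50, 30, 20], [50, 80, 70]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Per-cell contribution (O - E)^2 / E, with E left unrounded.
contributions = []
for row, n_r in zip(observed, row_totals):
    cells = []
    for o, n_c in zip(row, col_totals):
        e = n_r * n_c / n
        cells.append((o - e) ** 2 / e)
    contributions.append(cells)

chi_square = sum(sum(cells) for cells in contributions)

for label, cells in zip(["boys", "girls"], contributions):
    print(label, [round(x, 2) for x in cells])
print("chi-square:", round(chi_square, 2))
```

The unrounded total is about 19.32 rather than 19.39. The difference comes only from rounding the expected counts to one decimal place and does not change the conclusion. The largest single contribution comes from boys choosing The Lone Ranger.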
The P-value is the probability that a chi-square statistic having 2 degrees of freedom is more extreme than 19.39.
We use the Chi-Square Distribution Table to find $P(\chi^2 > 19.39) < 0.0001$. (The actual P-value, of course, is not exactly zero. With 2 degrees of freedom, $P(\chi^2 > x) = e^{-x/2}$, so the actual P-value is a very small number, about 0.00006.)
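The tail probability can also be computed directly instead of being read from a table. For exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2), which the sketch below relies on. This shortcut is an assumption specific to DF = 2 and the function name is hypothetical. For other degrees of freedom a statistics library would be needed (for example, `scipy.stats.chi2.sf`).

```python
import math


def chi_square_p_value_df2(statistic):
    """Upper-tail probability P(X > statistic) for chi-square with 2 DF.

    Valid only for 2 degrees of freedom, where the survival function
    reduces to exp(-x / 2).
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-statistic / 2)


p_value = chi_square_p_value_df2(19.39)
print(f"{p_value:.6f}")
```

The result is roughly 0.00006, a very small probability that is nonetheless greater than zero.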
Step 4: Interpret Results
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.
Since the P-value (less than 0.0001) is less than the significance level (0.05), we reject the null hypothesis.
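The decision rule in this step is a single comparison. A minimal sketch follows. The function name and the returned strings are illustrative choices, and the P-value is computed with the exp(-x/2) shortcut that holds only for 2 degrees of freedom.

```python
import math


def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when the P-value is below alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"


# P-value for a chi-square statistic of 19.39 with 2 degrees of freedom.
p_value = math.exp(-19.39 / 2)
print(decide(p_value, alpha=0.05))
```

For the sample problem this prints "reject H0".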
1. The Chi-square: Test of Homogeneity is applied to a _____ from two different populations.
2. This test is used to
determine whether
_______ are distributed
identically across
different populations.
3-5. When should the Chi-square: Test of Homogeneity be used?
6-9. This approach consists of four steps. What are these steps?
10. How are babies
formed? (Without
using the concept of