
Chi-Square: Introduction to Nonparametrics

Psychology 320 Gary S. Katz, Ph.D.

Definitions
Parametric Tests
Statistical tests that involve assumptions about or estimations of population parameters (what we've been learning).

Nonparametric Tests
Also known as distribution-free tests. Statistical tests that do not rely on assumptions about distributions or parameter estimates (what we're going to be learning).

More Definitions
The Chi-Square (X2) test is a nonparametric test that is used to test hypotheses about distributions of frequencies across categories of data. This is different from what we've been learning:

Then: averages, scales. Now: frequencies, categories.

Two Applications of the Chi-Square Test


The X2 goodness-of-fit test.
Used when we have a distribution of frequencies across two or more categories on one variable. The test determines how well a hypothesized distribution fits an obtained distribution.

The X2 test of independence.


Used when we compare the distribution of frequencies across categories in two or more independent samples. Used in a single sample when we want to know whether two categorical variables are related.

Flowers & Genetics: The X2 Goodness-of-Fit Test


In my backyard, I have a new hybrid rose bush. I hypothesize (according to Mendelian genetic theory) that I should have 50% pink flowers, 25% white flowers, and 25% red flowers.

[Punnett square: Pp × Pp cross, yielding PP, Pp, Pp, pp]

Flowers
I grow 120 of these plants from seed. The resulting colors of flowers are as follows:

Flowers - Reality & Expectations


Recall, my expectations were 50% Pink, 25% White, 25% Red.

           Pink   White   Red
Observed     75      20    25

So, if I planted 120 seeds, I'd expect this set of colored flowers:

           Pink   White   Red
Expected     60      30    30

Flowers - Reality & Expectations


           Pink   White   Red
Observed     75      20    25
Expected     60      30    30

The Chi-Square Test


If my hypothesis is true (50%, 25%, 25%), how likely is it that I could get this difference between my actual distribution and my expected distribution of colored flowers?

The test is used to determine whether the probability < α, in which case the hypothesis is rejected, or the probability > α, in which case the hypothesis is not rejected.

The Chi-Square Test


Hypotheses
H0: P(pink, white, red) = .5, .25, .25
The population proportions of pink, white, and red flowers are .5, .25, and .25, respectively.

The Chi-Square Test


Notice that the hypotheses for the Chi-Square goodness-of-fit test are stated in terms of proportions. The Chi-Square TEST, however, is conducted on actual frequencies, not proportions. Specifically, the X2 test operates on differences between observed and expected frequencies. First, make sure everything is a frequency.

H1: P(pink, white, red) ≠ .5, .25, .25


The population proportions of pink, white, and red flowers are something other than .5, .25, and .25, respectively. (The categories are mutually exclusive and exhaustive: ΣP = 1.)

The Chi-Square Test


            Pink            White           Red
Observed      75              20             25         ΣO = 120
Expected   120(.5) = 60   120(.25) = 30  120(.25) = 30  ΣE = 120

Observed frequencies = O; expected frequencies = E.
E = N × expected proportion: E = N × P(cell)
Notice that ΣE = ΣO, always.

In the Chi-Square test, we calculate (O-E)^2/E in each cell, sum the (O-E)^2/E values over all cells, and compare this summed value to a critical value:

X2o = Σ [(O - E)^2 / E]

The Chi-Square Distribution


Statisticians have found that if H0 is true and we calculate the X2 statistic for all possible samples of size N, the values form a probability distribution called the X2 distribution.

Characteristics of the X2 distribution


A family of distributions varying in df (like the t distribution).
Positively skewed; the amount of skew decreases as df increases.
Minimum value = 0 (X2 can't be negative).
Average (typical) value increases (the entire distribution shifts to the right) as df increases.

Characteristics of the X2 distribution


A family of distributions varying in df (like the t distribution).

[Plot of X2 densities for df = 1, 3, 5, 10]

Characteristics of the X2 distribution


X2o = Σ [(O - E)^2 / E]

As differences between Os and Es get bigger, X2 gets bigger. Since we are only interested in rejecting H0 if the differences between the obtained frequencies and the expected frequencies are greater than expected by chance, the rejection region is in the upper tail.

Chi-Square: Upper One-Tailed Test

Finding X2c
Table E.1 has the tabled values. df?

df = k - 1. Why?
If you have 3 categories, only the counts in 2 of them are free to vary.

Decision Rule: Reject H0 if X2o > X2c

Choose α, then read down the list of df to find X2c.

Finding X2c
df    α = .050   α = .025   α = .010   α = .005
 1      3.841      5.024      6.635      7.879
 2      5.991      7.378      9.210     10.597
 3      7.815      9.348     11.345     12.838
 4      9.488     11.143     13.277     14.860
 5     11.070     12.832     15.086     16.750
 6     12.592     14.449     16.812     18.548
 7     14.067     16.013     18.475     20.278
 8     15.507     17.535     20.090     21.955
 9     16.919     19.023     21.666     23.589
10     18.307     20.483     23.209     25.188
11     19.675     21.920     24.725     26.757
12     21.026     23.337     26.217     28.300
13     22.362     24.736     27.688     29.819
14     23.685     26.119     29.141     31.319
15     24.996     27.488     30.578     32.801
16     26.296     28.845     32.000     34.267
17     27.587     30.191     33.409     35.718
18     28.869     31.526     34.805     37.156
19     30.144     32.852     36.191     38.582
20     31.410     34.170     37.566     39.997

The Dreaded Six Steps


1. State H0 and H1.
2. Choose α.
3. Relevant probability distribution is X2 with k - 1 df.
4. Find X2c and state the decision rule: I will reject H0 if X2o > X2c.
5. Calculate X2o.
6. Apply the decision rule.

Calculating X2o

           Pink   White   Red
Observed     75      20    25
Expected     60      30    30

Finding X2o

Observed - Expected:
           Pink   White   Red
O - E        15     -10    -5

Σ(O - E) = 0, always

Finding X2o

(Observed - Expected)^2:
            Pink   White   Red
(O-E)^2      225     100    25

Components of X2

(Observed - Expected)^2 / Expected:
            Pink   White   Red
(O-E)^2/E   3.75    3.33   .83

Σ[(O-E)^2/E] = X2o = 7.91
Since X2o > X2c, reject H0

Interpretation
Since we reject H0, the geneticist's hypothesis does not fit the data. The population distribution across the three categories is probably different from .50 pink, .25 white, .25 red.
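The whole goodness-of-fit calculation above can be sketched in a few lines of Python. This is an illustration, not part of the lecture; the function name is my own:

```python
def chi_square_gof(observed, proportions):
    """Goodness-of-fit X2: observed frequencies vs. hypothesized proportions."""
    n = sum(observed)                        # total frequency, N
    expected = [n * p for p in proportions]  # E = N * P(cell)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Flower example: 75 pink, 20 white, 25 red vs. hypothesized .50/.25/.25
x2o = chi_square_gof([75, 20, 25], [.50, .25, .25])
print(round(x2o, 2))  # 7.92 (the slides' 7.91 comes from rounding each component)
print(x2o > 5.991)    # True: exceeds X2c for alpha = .05, df = 2, so reject H0
```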

Another Example: Breakfast!


The manufacturer of Post's Raisin Bran cereal, which lags behind Kellogg's in sales, believes that, given the chance to try both, most consumers will prefer Post's. They devise a blind taste test. A sample of 100 people eat a small bowl of each cereal, without knowing which is which, and they are asked which cereal they like better. Fifty-seven people say they like Post's better, while 43 choose Kellogg's. Can the manufacturer advertise, "More people prefer Post's"?
H0: P(Post's) = P(Kellogg's), or P(Post's, Kellogg's) = .5, .5
H1: P(Post's) ≠ P(Kellogg's), or P(Post's, Kellogg's) ≠ .5, .5

Breakfast Answers
2) Use α = .05
3) df = 1; X2 distribution with 1 df
4) X2c for α = .05, df = 1, is 3.84; decision rule: reject H0 if X2o > 3.84
5) Calculations: E(Post's) = E(Kellogg's) = 100(.5) = 50

Cereal        O     E    O-E   (O-E)^2   (O-E)^2/E
Post's       57    50      7        49        0.98
Kellogg's    43    50     -7        49        0.98
Σ           100   100      0                  1.96 = X2o
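SciPy's chisquare function does the same arithmetic; called with observed frequencies only, it assumes equal expected frequencies, which is exactly the breakfast H0. A sketch:

```python
from scipy.stats import chisquare

# 57 prefer Post's, 43 prefer Kellogg's; H0 expects 50/50
stat, p = chisquare([57, 43])
print(round(stat, 2))  # 1.96
print(round(p, 3))     # 0.162: p > .05, so retain H0
```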

6) Since X2o < X2c (1.96 < 3.84), we retain H0. The manufacturer cannot claim that more people prefer Post's.

Extra Credit Breakfast

In the Breakfast Example, we found that a 57-to-43 majority isn't enough to reject H0. What is the smallest number of Post's preferences that will lead to a significant finding (rejection of H0) at α = .05? Correct and well-reasoned answers are worth 5 pts on top of your final (total) grade.

Assumptions of the Goodness-of-Fit Test


Observations in different categories are independent.
Categories are mutually exclusive.
Categories are exhaustive.
No expected frequency < 2.
Few expected frequencies < 5.

The X2 distribution does not accurately describe the probabilities of various sampling outcomes if expected frequencies are small.
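The expected-frequency assumptions can be checked mechanically. A small sketch (the helper name is my own, and reading "few" as at most 20% of cells is my assumption, a common rule of thumb):

```python
def check_expected_frequencies(expected, max_small_fraction=0.2):
    """Return True if no E < 2 and at most max_small_fraction of cells have E < 5."""
    if any(e < 2 for e in expected):
        return False
    small = sum(1 for e in expected if e < 5)  # count of cells with small E
    return small / len(expected) <= max_small_fraction


print(check_expected_frequencies([60, 30, 30]))  # True: the flower example is fine
print(check_expected_frequencies([50, 4, 3]))    # False: too many cells below 5
```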

The X2 test of Independence

The X2 Test of Independence


Used when:
We want to compare the distribution of frequencies across categories in two or more independent samples. We want to determine whether paired observations on two categorical variables are independent or associated.

Physical Contact in Neonates


A developmental psychologist hypothesizes that mothers who have physical contact with their infants immediately after birth are more likely to hold them on the left side, where the sound of the mother's heartbeat is more pronounced, than mothers who do not have such early contact with their infants. She observes 125 early-contact mothers and 105 late-contact mothers, with the following results:

         Left   Right
Early      80      45    125
Late       55      50    105
                         230

Is there a significant difference?

Same test.

Contingency Tables
         Left   Right
Early      80      45
Late       55      50

Stating H0 & H1
When two or more groups are being compared, H0 states that the population distributions across all categories are the same. H1 states that the population distributions differ.

This type of table is called a contingency table.


We are trying to determine if the frequencies in one variable are contingent upon the frequencies of the other variable.

This is a 2 × 2 contingency table.

Stating H0 & H1
These hypotheses can be stated in several equivalent ways:
H0: Early- and late-contact mothers do not differ in how they hold their neonates.
H1: Early- and late-contact mothers hold their neonates differently.
OR
H0: Group membership and distribution across categories are unrelated.
H1: Group membership and distribution across categories are related.
OR
H0: Time of first contact and how neonates are held are independent.
H1: Time of first contact and how neonates are held are dependent (correlated, related).

The X2 Test of Independence


The test statistic for the test of independence is the same as in the goodness-of-fit test:

X2o = Σ [(O - E)^2 / E]

Two differences:
1. Calculation of expected frequencies
2. Calculation of df

Calculation of Expected Frequencies: X2 test of Independence


For each cell,

E = (row sum)(column sum) / N

where N = total number of observations.

Calculation of Expected Frequencies: X2 test of Independence

               Left   Right   Row Sums
Early            80      45        125
Late             55      50        105
Column Sums     135      95    N = 230

E(early, left)  = (125)(135) / 230 = 73.4
E(early, right) = (125)(95)  / 230 = 51.6
E(late, left)   = (105)(135) / 230 = 61.6
E(late, right)  = (105)(95)  / 230 = 43.4
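The (row sum)(column sum) / N rule is a one-liner per cell. A sketch for the neonate table:

```python
observed = [[80, 45],   # early contact: left, right
            [55, 50]]   # late contact:  left, right

row_sums = [sum(row) for row in observed]        # [125, 105]
col_sums = [sum(col) for col in zip(*observed)]  # [135, 95]
n = sum(row_sums)                                # 230

# E = (row sum)(column sum) / N for each cell
expected = [[r * c / n for c in col_sums] for r in row_sums]
print([[round(e, 1) for e in row] for row in expected])
# [[73.4, 51.6], [61.6, 43.4]]
```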

Observed and Expected Frequencies


Observed Frequencies:
               Left   Right   Row Sums
Early            80      45        125
Late             55      50        105
Column Sums     135      95    N = 230

Expected Frequencies:
               Left   Right   Row Sums
Early          73.4    51.6        125
Late           61.6    43.4        105
Column Sums     135      95    N = 230

Why this Row and Column Sum?

P(row) = row sum / N
P(column) = column sum / N
If the row and column variables are independent,
P(row AND column) = (row sum / N) × (column sum / N)
Expected frequencies = N × P, so
E(row AND column) = N × (row sum / N) × (column sum / N) = (row sum)(column sum) / N

Marginal frequencies are fixed in X2 analyses.

df in X2 Independence Tests
df = (# rows - 1)(# columns - 1). Why? How many cells are truly free to vary?
Remember, marginals are fixed in X2 independence tests: once (rows - 1)(columns - 1) cells are filled in, the remaining cells are determined by the marginal totals.

Mothers and Neonates


H0: Time of first contact and how neonates are held are independent.
H1: Time of first contact and how neonates are held are dependent (correlated, related).
α = .05
X2 distribution with 1 df
X2c for α = .05, 1 df = 3.84; reject H0 if X2o > 3.84


Mothers and Neonates


Observed Frequencies:
         Left   Right
Early      80      45
Late       55      50

Expected Frequencies:
         Left   Right
Early    73.4    51.6
Late     61.6    43.4

Observed - Expected:
         Left   Right
Early     6.6    -6.6
Late     -6.6     6.6

(Observed - Expected)^2:
          Left    Right
Early    43.56    43.56
Late     43.56    43.56

(Observed - Expected)^2 / Expected:
         Left   Right
Early    0.59    0.84
Late     0.71    1.00

X2o = Σ [(O - E)^2 / E] = 3.14

Decision: Retain H0 (3.14 < 3.84); there is no significant relationship between holding side and time of first contact.

Overview of X2
X2 is a nonparametric test applied to categorical, frequency data. The relevant probability distribution is the X2 distribution:
A family of distributions varying in df
Positively skewed with minimum = 0
Skew decreases as df increases.
Center of distribution and critical values increase as df increases.

Overview of X2
Rejection region is in the upper tail. Decision rule: reject H0 if X2o > X2c. Two forms:
Goodness-of-fit: used to determine whether an obtained distribution fits a hypothesized one.
Independence: used to test whether two categorical variables are related, or whether two or more samples differ in their distributions across categories.
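The neonate independence test can be run end to end with SciPy's chi2_contingency (pass correction=False to match the hand calculation; note the slides' 3.14 reflects rounding the expected frequencies to one decimal, while the unrounded statistic is about 3.18, so the decision is the same):

```python
from scipy.stats import chi2_contingency

observed = [[80, 45],   # early contact: left, right
            [55, 50]]   # late contact:  left, right

stat, p, df, expected = chi2_contingency(observed, correction=False)
print(round(stat, 2), df)  # 3.18 1
print(p > 0.05)            # True: retain H0
```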