
NON-PARAMETRIC TEST:

CHI-SQUARE GOODNESS OF FIT TEST


Prepared by: Teffany V. Daniel, MS

Non-parametric tests do not rely on parameter estimates. They are especially useful with nominal or ordinal data.

Advantages of Non-parametric Test:

• Not as susceptible to outliers as parametric tests are
• Do not require restrictive assumptions (such as the population being normally distributed)

Chi-square is used for tests concerning frequency distributions of nominal or categorical data. A chi-square test can
be either a test for goodness of fit or a test for independence.

TEST FOR GOODNESS OF FIT


• In a goodness of fit test, the independent variable is a single categorical variable with at least two groups or
categories, while the dependent variable is the frequency in each category of the independent variable.
• It tests whether the proportions from the obtained sample are a good fit to the proportions known to exist in the
population, which we call expected frequencies/values.
• The expected frequencies/values should not be less than 5.
• It answers the question “Are my observations a good fit to what I would expect?”

Sample Problem 1:
Suppose as a market analyst you wished to see whether consumers have any preference among five flavors of a new fruit
soda.

A sample of 100 people provided these data:

Frequency   Cherry   Strawberry   Orange   Lime   Grape
Observed      32         28         16      14     10

The actual frequencies are called the observed frequencies. The frequencies obtained by calculation (as if there were no
preference) are called the expected frequencies.

If there were no preference, you would expect that all flavors are chosen randomly (equally). In this case, the equal
frequency is 100/5=20. That is, approximately 20 people would select each flavor.

Frequency   Cherry   Strawberry   Orange   Lime   Grape
Observed      32         28         16      14     10
Expected      20         20         20      20     20
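
The expected row above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name expected_if_no_preference and the variable names are assumptions made for this example and are not part of the original handout. It also checks the rule that no expected frequency should be less than 5.

```python
# Observed counts from Sample Problem 1 (100 respondents, five flavors).
observed = {"Cherry": 32, "Strawberry": 28, "Orange": 16, "Lime": 14, "Grape": 10}


def expected_if_no_preference(observed_counts):
    """Split the sample total equally across the categories (hypothetical helper)."""
    total = sum(observed_counts.values())
    share = total / len(observed_counts)
    return {category: share for category in observed_counts}


expected = expected_if_no_preference(observed)

# Rule from the handout: no expected frequency should be less than 5.
meets_minimum = all(value >= 5 for value in expected.values())

for flavor in observed:
    print(f"{flavor:<11} observed={observed[flavor]:>3}  expected={expected[flavor]:>5.1f}")
print("All expected frequencies at least 5:", meets_minimum)
```

With equal preference each of the five flavors receives 100/5 = 20, matching the table.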

The observed frequencies will almost always differ from the expected frequencies due to sampling error; that is, the
values differ from sample to sample. But the question is: Are these differences significant (a preference exists), or are
they due to chance? The chi-square goodness of fit test will enable the researcher to determine the answer.

The hypotheses are as follows:

H0: Consumers show no preference for flavors of the fruit soda.

H1: Consumers show a preference.

Formula for the Goodness-of-Fit Test

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

with degrees of freedom equal to the number of categories minus 1, and

O = Observed frequency
E = Expected frequency
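
As a rough illustration of the formula, the sketch below sums (O - E)^2 / E over the categories, using the observed and expected frequencies of Sample Problem 1. The function name chi_square_statistic is an assumption made for this example, not something defined in the handout.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories (hypothetical helper)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Frequencies from Sample Problem 1: Cherry, Strawberry, Orange, Lime, Grape.
observed = [32, 28, 16, 14, 10]
expected = [20, 20, 20, 20, 20]

test_value = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # number of categories minus 1

print(f"chi-square = {test_value:.1f}, df = {degrees_of_freedom}")
```

Run on these data, the sketch prints a test value of 18.0 with 4 degrees of freedom.
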
In Sample Problem 1, is there enough evidence to reject the claim that there is no preference in the selection of fruit soda
flavors? Let α = 0.05.

Solution:

Step 1: State the null and alternative hypotheses. For a goodness-of-fit test, the null hypothesis states that the
observed frequencies follow the expected distribution, and the alternative hypothesis states that they do not.

If the claim made in the null hypothesis is true, the observed and expected values are close to each other and
$(O - E)^2 / E$ is small for each category. When the observed data do not conform to what is expected on the basis of the
null hypothesis, the difference between the observed and expected values is large. Thus, in Sample Problem 1, the
hypotheses are as follows:

H0: Consumers show no preference for flavors of the fruit soda.

H1: Consumers show a preference.

Step 2: Exactly how large the test value must be in order to reject the null hypothesis is determined from the level
of significance and the chi-square table. From the chi-square distribution table, the critical value is 9.488 for
5 − 1 = 4 degrees of freedom and α = 0.05.
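
The table lookup can be cross-checked numerically. The sketch below is an assumption-laden illustration, not the handout's method: it uses only the Python standard library, evaluates the chi-square cumulative distribution through the series expansion of the regularized lower incomplete gamma function, and then finds the critical value by bisection. The names chi2_cdf and chi2_critical_value are made up for this example. In practice a statistics library such as SciPy (scipy.stats.chi2.ppf) would normally be used instead.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the series for the regularized lower incomplete gamma
    function P(df/2, x/2).
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_critical_value(alpha, df):
    """Value with right-tail area alpha, found by bisection on the CDF."""
    target = 1.0 - alpha
    low, high = 0.0, 1.0
    while chi2_cdf(high, df) < target:  # expand until the target is bracketed
        high *= 2.0
    for _ in range(100):
        mid = (low + high) / 2.0
        if chi2_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


critical = chi2_critical_value(alpha=0.05, df=4)
print(f"critical value = {critical:.3f}")
```

For α = 0.05 and 4 degrees of freedom the result agrees with the tabled value of 9.488 to three decimal places.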

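Step 3: Compute the test value.

$$\chi^2 = \frac{(32-20)^2}{20} + \frac{(28-20)^2}{20} + \frac{(16-20)^2}{20} + \frac{(14-20)^2}{20} + \frac{(10-20)^2}{20} = 18.0$$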

Step 4: If the calculated chi-square test value is greater than the critical (table) value, we reject the null
hypothesis and conclude that there is a significant difference between the observed and expected frequencies.
Hence, the decision is to reject the null hypothesis, since 18.0 > 9.488.

Step 5: There is enough evidence to reject the claim that consumers show no preference for flavors.

Why “goodness of fit”?

To get some idea why this test is called the goodness-of-fit test, examine the graphs of the observed values and expected
values below. When the observed values and expected values are close together, the chi-square test value will be small.
Then the decision will be to not reject the null hypothesis; hence, there is “a good fit”. When the observed values and
expected values are far apart, the chi-square test value will be large. Then the null hypothesis will be rejected; hence,
there is “not a good fit”.
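
The same idea can be seen numerically by listing how much each category contributes to the test value. The sketch below, with variable names chosen only for this illustration, prints the gap O − E and the contribution (O − E)^2 / E for each flavor in Sample Problem 1; categories whose observed count sits far from the expected count contribute the most.

```python
flavors = ["Cherry", "Strawberry", "Orange", "Lime", "Grape"]
observed = [32, 28, 16, 14, 10]
expected = [20, 20, 20, 20, 20]

# Contribution of each category to the chi-square test value.
contributions = {
    flavor: (o - e) ** 2 / e
    for flavor, o, e in zip(flavors, observed, expected)
}

for flavor, o, e in zip(flavors, observed, expected):
    print(f"{flavor:<11} gap={o - e:+3d}  contribution={contributions[flavor]:.1f}")

total = sum(contributions.values())
print(f"test value = {total:.1f}")
```

Cherry (12 above expected) and Grape (10 below expected) account for most of the test value, which is why the fit is judged poor.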

[Figure: Expected values and observed values for each flavor (Cherry, Strawberry, Orange, Lime, Grape); vertical axis: Frequency, 0 to 35.]

Sample Problem 2:
A researcher read that firearm-related deaths for people aged 1 to 18 were distributed as follows: 74% were accidental,
16% were homicides, and 10% were suicides. In her district, there were 68 accidental deaths, 27 homicides, and 5 suicides
during the past year. At α = 0.10, test the claim that the percentages in her district are equal to those stated.

Solution:

Step 1: H0: The deaths due to firearms for people aged 1 to 18 are distributed as follows: 74% were accidental, 16%
were homicides, and 10% were suicides.

H1: The distribution is not the same as stated in the null hypothesis.

Step 2: The critical value is 4.605 for 3 − 1 = 2 degrees of freedom and α = 0.10.

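Step 3: To get the expected values, multiply each of the population percentages by the total of the observed values.
Since the observed values total 68 + 27 + 5 = 100 deaths, the expected values are

0.74 × 100 = 74
0.16 × 100 = 16
0.10 × 100 = 10

The test value is

$$\chi^2 = \frac{(68-74)^2}{74} + \frac{(27-16)^2}{16} + \frac{(5-10)^2}{10} = 10.549$$

Step 4: The decision is to reject the null hypothesis, since 10.549 > 4.605.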

Step 5: There is enough evidence to reject the claim that the distribution is 74% accidental, 16% homicides, and 10%
suicides.
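
The steps of Sample Problem 2 can be sketched end to end in Python. This is an illustration under stated assumptions rather than a definitive implementation: the variable names are invented for the example, and the critical value uses the closed form −2 ln(α), which holds only for exactly 2 degrees of freedom (for other cases a chi-square table or a statistics library would be needed).

```python
import math

# Population proportions the researcher read, and her district's counts.
proportions = {"accidental": 0.74, "homicide": 0.16, "suicide": 0.10}
observed = {"accidental": 68, "homicide": 27, "suicide": 5}
alpha = 0.10

# Expected values: population proportion times the total observed count.
total = sum(observed.values())
expected = {kind: p * total for kind, p in proportions.items()}

# Test value: sum of (O - E)^2 / E over the categories.
test_value = sum(
    (observed[kind] - expected[kind]) ** 2 / expected[kind] for kind in observed
)

df = len(observed) - 1

# Assumption: closed form valid only when df == 2.
critical = -2.0 * math.log(alpha)

reject_null = test_value > critical

print(f"expected = {expected}")
print(f"test value = {test_value:.3f}, df = {df}, critical = {critical:.3f}")
print("Reject the null hypothesis:", reject_null)
```

The sketch gives a test value of about 10.549 against a critical value of about 4.605, which agrees with the decision in Step 4.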
