
SPSS INSTRUCTION – CHAPTER 5

The mathematical operations described in the chapter do not provide the researcher with
the exact p-value. This value, as explained in Chapter 3, indicates the possibility of making a
Type I error. Most often, the researcher simply needs to know if this probability lies above
his or her designated α value. But, for situations in which more detailed information is
desired, statistical software programs such as SPSS prove helpful.

The SPSS program contains two procedures for performing chi-square tests. Both are
accessed by choosing the Analyze option from the main menu. A researcher who wishes to
perform a one-variable chi-square test using SPSS follows a different procedure than one
who wishes to perform a multiple-variable test using the program.

One-Variable Chi-Square Tests with SPSS


Before performing the one-variable chi-square test, coded values for the relevant variable
should appear within a column of the SPSS Data Editor. In addition, the “values” cell of the
Variable View screen should contain the coding frame. Including the coding frame
instructs SPSS to display each category’s name, rather than numerical code, in output. With
data organized this way, the researcher can perform the one-variable chi-square test with
the following steps.
1. Choose “non-parametric tests” from the Analyze pull-down menu.
2. Choose “chi-square” from the options provided. A Chi-Square Test window should
appear on the screen.

FIGURE 5.7 – SPSS ONE-VARIABLE CHI-SQUARE TEST WINDOW


The user performs a one-variable chi-square test by selecting the appropriate variable from those listed in the
box above. SPSS assumes that equal expected values exist unless designated otherwise in the “Expected
Values” portion of the window.

3. Highlight the name of the relevant variable from the list appearing in the upper left
corner of the window. Click on the arrow to the right of the list to move the name of
the variable to the Test Variable List.
4. Redefine expected values if necessary. SPSS assumes equal expected values.
However, if expected values are unequal, define them within the Expected Values
section of the window. To do so, choose “values” and then input the expected values
or percents, clicking on the ADD button after each one.
5. Click OK.
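Although the chapter's procedure uses SPSS's dialogs, the same test can be sketched outside SPSS as an independent check. The following snippet, which assumes Python with the scipy library rather than anything SPSS provides, reproduces the one-variable test for the Example 5.3 data with equal expected values:

```python
from scipy.stats import chisquare

# Observed counts from Example 5.3: game, commercials, game and commercials
observed = [12, 19, 29]

# With no f_exp argument, chisquare assumes equal expected counts,
# mirroring SPSS's default of equal expected values
statistic, p_value = chisquare(observed)

print(round(statistic, 3))  # → 7.3
print(round(p_value, 3))    # → 0.026 (SPSS's "Asymp. Sig.")
```

The statistic and p-value match the SPSS output shown in Example 5.29.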

This process produces somewhat different output than that provided by the mathematical
operations described earlier in this chapter. Although this output may appear quite
intimidating, its meaning mirrors that of the hand-done calculation. A table labeled with the
relevant variable’s name contains expected values, observed values, and residuals. Another
table presents the calculated chi-square value, the degrees of freedom, and the asymptotic
significance level. Although SPSS provides the user with the calculated value, it does not
provide the critical value. Instead, it determines the actual p-value, otherwise known as
the asymptotic significance level.

A comparison of the p-value and the chosen α indicates whether the researcher should
accept or reject the null hypothesis. Recalling that the value of α represents the largest
probability that the researcher accepts of making a Type I error, only tests producing a p-
value smaller than α suggest a rejected null hypothesis. Because the most commonly used
α-value of .05 allows for only a 5% chance of incorrectly claiming that a significant
difference exists between the compared values, a researcher can reject the null hypothesis
only if p<.05. A p-value greater than .05 indicates too large a possibility of making a Type I
error and, thus, leads to acceptance of the null hypothesis. A researcher who wishes to use
an α-value other than .05 simply changes his or her requirements of p accordingly.
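The decision rule described above reduces to a single comparison. A minimal sketch (the function name is illustrative, not an SPSS or scipy function):

```python
def decide(p_value, alpha=0.05):
    """Reject the null hypothesis only when p falls below alpha."""
    return "reject" if p_value < alpha else "accept"

print(decide(0.026))        # p below the default alpha of .05 -> "reject"
print(decide(0.026, 0.01))  # a stricter alpha of .01 -> "accept"
```

Changing the α requirement, as the last sentence notes, simply changes the second argument.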

Example 5.29 – SPSS Output for One-Variable Chi-Square Test with Equal Expected
Values
Output produced by the data used for the one-variable chi-square with equal expected
values (See example 5.3) appears as follows.

reason

                       Observed N   Expected N   Residual
game                       12          20.0        -8.0
commercials                19          20.0        -1.0
game and commercials       29          20.0         9.0
Total                      60

Test Statistics

                reason
Chi-Square(a)    7.300
df                   2
Asymp. Sig.       .026

TABLE 5.12 AND TABLE 5.13 – SPSS OUTPUT USING EQUAL EXPECTED VALUES
Table 5.12 provides the values used to compute the chi-square statistics. The decision about whether to
accept or reject the null hypothesis depends upon the values in Table 5.13.

Tables 5.12 and 5.13 contain many familiar values. The values in Table 5.12 consist of the
observed and expected values that constitute the chi-square formula. Table 5.13 informs
the user that χ², itself, equals 7.3, the same value obtained through calculations earlier in
the chapter, and that the test uses two degrees of freedom, also the same as the previously-
mentioned value.

With this knowledge and a desired α-value, the researcher could obtain a critical value to
compare with the calculated value. However, this process becomes unnecessary with
SPSS’s inclusion of the p-value in the output. Rather than comparing computed and critical
chi-square values, the researcher can compare the asymptotic significance value to the
chosen α. The p-value suggests that a 2.6% chance of making a Type I error exists. One
using the standard α-value of .05, allowing up to a 5% chance of making a Type I error,
would reject the null hypothesis because the p-value of .026 lies below .05. This result
suggests that the number of individuals who watch games for each of the three reasons
differs significantly. However, one who does not accept a 5% chance of making a Type I
error may choose to compare the p-value to an α of .01, which would result in an accepted
null hypothesis. ▄

Because the process for performing a one-variable chi-square test with equal expected
values differs only slightly from that for performing the test with unequal expected values,
the output for the two tests looks similar. Differences lie only in the values within the
tables.

Example 5.30 – SPSS Output for One-Variable Chi-Square with Unequal Expected
Values
The following SPSS output results from a chi-square test using unequal expected values
(See Example 5.4).

reason

                       Observed N   Expected N   Residual
game                       12          15.0        -3.0
commercials                19          15.0         4.0
game and commercials       29          30.0        -1.0
Total                      60

Test Statistics

                reason
Chi-Square(a)    1.700
df                   2
Asymp. Sig.       .427

TABLE 5.14 AND TABLE 5.15 – SPSS OUTPUT USING UNEQUAL EXPECTED VALUES
Table 5.14 provides the values used to compute the chi-square statistics. The decision about whether to
accept or reject the null hypothesis depends upon the values in Table 5.15.

As in the condition involving equal expected values, the residuals listed in Table 5.14
constitute the values that appear in the numerator of the chi-square formula, and the
degrees of freedom value in Table 5.15 equals that used to determine the critical value. The
SPSS program also provides the same χ² as the one produced by the computations. A
comparison between the calculated chi-square value and the critical value, obtained using
the degrees of freedom, can take place. However, considering the asymptotic significance
(p) value of .427 provides the simplest means of determining whether to accept or reject
the null hypothesis. This value lies well above the standard α of .05 as well as any other
reasonable α-value. Thus, the researcher should accept the null hypothesis, concluding that
the observed values do not differ significantly from the expected values. ▄
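The unequal-expected case differs only in that the expected counts are supplied explicitly, just as step 4 of the SPSS procedure requires. A sketch of Example 5.4's test, again using scipy as a stand-in for the SPSS dialog:

```python
from scipy.stats import chisquare

observed = [12, 19, 29]        # game, commercials, game and commercials
expected = [15.0, 15.0, 30.0]  # unequal expected values from Example 5.4

# f_exp carries the user-defined expected counts, paralleling the
# "Expected Values" section of the Chi-Square Test window
statistic, p_value = chisquare(observed, f_exp=expected)

print(round(statistic, 3))  # → 1.7
print(round(p_value, 3))    # → 0.427
```

Both values match Tables 5.14 and 5.15.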

Multiple-Variable Chi-Square Tests with SPSS


Because multiple-variable chi-square tests address differences in frequencies of a
crosstabulation, the data entered into SPSS should appear as it does for the creation of a
crosstabulation. Thus, coded data for each of the variables in a multiple-variable chi-
square test must reside in a column of the SPSS Data Editor. As always, inputting the
coding frame into the “values” cell of the Variable View screen allows for the output to
display category names.
Unfortunately, familiarity with the process for conducting one-variable chi-square tests
does not prove helpful in a multiple-variable situation. The Chi-Square Test window used
for the one-variable test does not accommodate more than one variable (and, interestingly,
applications used to perform the multiple-variable test do not accommodate fewer than
two variables), necessitating the use of different procedures to perform the two versions of
the test. Conducting a multiple-variable chi-square test in SPSS requires the following
steps.
1. From the Analyze menu, choose “descriptive statistics,” and then “crosstabs.” A
Crosstabs window should appear on the screen.
2. Assign variables to row, column, and layer positions in the same way described in
Chapter 2. Highlight each variable individually and move it to the row, column, or
layer position by clicking on the arrow to the left of the appropriate box.
3. Click on the “statistics” button, located at the bottom of the Crosstabs window. A
Crosstabs: Statistics window should appear.

FIGURE 5.8 – SPSS MULTIPLE-VARIABLE CHI-SQUARE WINDOW


The user performs a multiple-variable chi-square test by selecting the “Chi-square” option in the above
window. SPSS calculates expected values based upon the data identified for the crosstabulation on the
previous screen.

4. Select “chi-square” from the options in the Crosstabs: Statistics window and click
“continue.” The basic Crosstabs window should, once again, appear.
5. Click OK.

Resulting output contains a case processing summary and a crosstabulation as well as the
chi-square output. The case processing summary informs the user of any data that the
SPSS program could not use in the analysis, usually due to missing values. The
crosstabulation reminds the user of the observed values that he or she wishes to compare
to the expected values. But, unlike the output resulting from a one-variable chi-square test,
SPSS output for a chi-square test involving multiple variables does not include expected or
residual values. All other important values appear in the chi-square table.

For a basic multiple-variable chi-square analysis, the researcher should focus upon the
information in the first row of the table, labeled “Pearson Chi-Square.” Here, he or she can
obtain the degrees of freedom used for the test, the value of χ², and the asymptotic
significance (p) value.

Example 5.31 – SPSS Output for Two-Variable Chi-Square


A chi-square test using the variables subjects’ sexes and reasons for watching the
Superbowl, as demonstrated by the calculations in Example 5.5, would produce the
following output if performed using SPSS.
Case Processing Summary

                            Cases
               Valid          Missing         Total
               N   Percent    N   Percent     N   Percent
sex * reason   60  100.0%     0   .0%         60  100.0%

sex * reason Crosstabulation

Count
                               reason
                                          game and
                 game    commercials    commercials    Total
sex    male        5          8             11           24
       female      7         11             18           36
Total             12         19             29           60

Chi-Square Tests

                                              Asymp. Sig.
                               Value    df     (2-sided)
Pearson Chi-Square            .101(a)    2       .951
Likelihood Ratio               .101      2       .951
Linear-by-Linear Association   .072      1       .788
N of Valid Cases                60

TABLE 5.16, TABLE 5.17, AND TABLE 5.18 – SPSS OUTPUT USING TWO VARIABLES
The values in Table 5.16 confirm that the chi-square test included data from all subjects. An n less than the
total number of subjects and a corresponding percent less than 100% indicate the omission of subjects, often
as a result of missing data. Table 5.17 provides the values used to compute the chi-square statistics.
The decision about whether to accept or reject the null hypothesis depends upon the values in Table 5.18.

Once again, SPSS produces the same calculated chi-square as obtained through the
calculations earlier in this chapter. But, a comparison between the p-value and the
designated α provides the easiest method of determining whether to accept or reject the
null hypothesis. The p-value of .951 indicates an accepted null hypothesis because of the
high possibility that any evident difference occurred only by chance. Due to the 95.1%
chance of making a Type I error, the researcher should accept the null hypothesis,
concluding that the observed and expected frequencies do not differ significantly. ▄
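The Pearson chi-square row of Table 5.18 can also be checked outside SPSS. The scipy function `chi2_contingency` (an assumption here, not an SPSS command) computes expected values from the crosstabulation's margins just as SPSS does:

```python
from scipy.stats import chi2_contingency

# sex (rows: male, female) by reason (columns: game, commercials, both)
table = [[5, 8, 11],
         [7, 11, 18]]

# Returns the Pearson statistic, p-value, degrees of freedom, and the
# expected frequencies derived from the row and column totals
statistic, p_value, df, expected = chi2_contingency(table)

print(round(statistic, 3))  # → 0.101
print(df)                   # → 2
print(round(p_value, 3))    # → 0.951
```

The three printed values correspond to the "Pearson Chi-Square" row in Table 5.18.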

Presenting SPSS Results


One who has used a statistical software program to perform the test can provide the
audience with more specific information than can one who performs the calculations “by
hand.” Because statistical programs provide the p-value, you can include this value in your
report.

Example 5.32 – Summary of Insignificant Results from SPSS


An appropriate summary of results from an SPSS analysis of the two-variable example used
in this chapter might appear as follows.
The chi-square test yielded a p-value of .951 (χ² = .101). This value suggests
that no significant difference exists between the number of males and of
females who watch the Superbowl with a preference for the game, with a
preference for the commercials, and with no preference, supporting the
null hypothesis at α=.05. Further, males and females fall into each of these
three categories in similar proportions. ▄

Descriptions of results from a test that indicates a significant difference require the
inclusion of some additional information. You cannot claim that a significant difference
between frequencies exists without identifying the source of the difference. This task is
most easily accomplished by including the frequencies for each category in the results
section of your report. For a study involving two categories, those reading the report can
easily see which category has the higher and which has the lower frequency. Explanations
of significant differences for studies involving more than two categories may require
further explanation. The frequency values reported can still indicate the “direction” of
differences. However, the audience needs to know which category or categories differ
significantly from the others.
Example 5.33 – Summary of Significant Results from SPSS
A report of the findings from Example 5.3 requires this sort of explanation. One could
reasonably assume that the frequency of 29, corresponding to the “no preference” category,
significantly exceeds the frequencies of 12 and 19, corresponding to the “game” and
the “commercials” categories respectively. The following passage presents an appropriate
summary of such results.
A rejected null hypothesis reflected the fact that the chi-square test
produced a χ² of 7.3 (p<.05). A lower number of individuals watch
the Superbowl with a preference for the game or for the commercials than
the number of individuals who watch with no preference. ▄

This context, however, is rather straightforward, as a comparatively large gap exists
between the frequency for the “no preference” category and the frequencies for the other
categories. When disparities are not so obvious, you may need to perform post-hoc tests to
determine exactly which frequencies differ from which. Usually, one suspects a particular
extreme frequency as the cause of the significant finding. Performing post-hoc chi-square
tests to compare this frequency to another or to a combination of all others can verify your
suspicion. In relation to this example, you could combine the two groups that do not seem
to differ into a single group with an expected frequency of 40 and compare this group to
the remaining one, which has an expected frequency of 20.
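This post-hoc comparison amounts to another one-variable chi-square test on the collapsed groups. A sketch using scipy (the collapsed counts follow from the example's data; this is an illustration, not SPSS output):

```python
from scipy.stats import chisquare

# Collapse "game" (12) and "commercials" (19) into one group of 31,
# compared against the "no preference" group of 29
observed = [12 + 19, 29]
expected = [40.0, 20.0]  # combined expected frequencies from the text

statistic, p_value = chisquare(observed, f_exp=expected)
print(round(statistic, 3))  # → 6.075
print(round(p_value, 3))
```

Here the p-value falls below .05, suggesting that the "no preference" frequency is indeed the source of the original significant finding.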

Hopefully, this analysis indicates the presence of a significant difference, which would
confirm your suspicions regarding the reason for the originally-rejected null hypothesis. In
this case, you can report this difference as described earlier. If, however, this test indicates
no significant difference, you should continue performing comparisons using different
arrangements of categories until you find the source of the significance. In data contexts
involving four or more categories, you may even combine multiple groups from the original
test to form two new comparison groups. For example, you may choose to compare the
combination of groups 1 and 2 to the combination of groups 3 and 4.

This sort of restructuring of data into groups occurs rather frequently in data analysis.
Forthcoming chapters present circumstances in which you may choose to re-organize non-
categorical data into categories to compare groupings of subjects using a chi-square test.
The scope of this test, consequently, reaches far beyond naturally occurring categorical
data.
