Contingency tables enable us to compare one characteristic of the
sample, e.g. degree of religious fundamentalism, for groups or
subsets of cases defined by another categorical variable, e.g. gender.

A contingency table, which SPSS calls a cross-tabulated table, is
shown below:

09/28/15

Slide 1

Each cell in the table represents a combination of the characteristics
associated with the two variables:

29 males were also fundamentalists.

42 females were fundamentalists.

While a larger number of females were fundamentalist, we cannot tell
whether females were more likely to be fundamentalist, because the
total number of females (146) was different from the total number of
males (107).

To answer the "more likely" question, we need to compare percentages.

Slide 2

There are three percentages that can be calculated for a contingency
table:

percentage of the total number of cases
percentage of the total in each row
percentage of the total in each column

Each of the three percentages provides different information and

Slide 3

The percentage of the total number of cases is computed by dividing the
number in each cell (e.g. 29, 42, etc.) by the total number of cases (253).

11.5% of the cases were both male and fundamentalist.

16.6% of the cases were both female and fundamentalist.

We have two clues that the table contains total percentages. First, the
rows that the percentages are on are labeled "% of Total".

Second, the 100% figure appears ONLY in the grand total cell beneath the
table total of 253.
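The arithmetic behind the total percentages can be checked outside SPSS. The following is a minimal Python sketch (Python is not used in the slides); the dictionary layout and variable names are illustrative, and only the two cell counts and the table total quoted above are used.

```python
# Cell counts quoted on the slide; the keys are illustrative labels.
cell_counts = {
    ("male", "fundamentalist"): 29,
    ("female", "fundamentalist"): 42,
}
grand_total = 253  # total number of cases in the table

# Total percentage = cell count / grand total * 100
total_pct = {cell: 100 * n / grand_total for cell, n in cell_counts.items()}

for (sex, outlook), pct in total_pct.items():
    print(f"{pct:.1f}% of the cases were both {sex} and {outlook}")
```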
Slide 4

The percentage of the total for each row is computed by dividing the
number in each cell (e.g. 29, 42) by the total for the row (71).

40.8% of the fundamentalists were male.

59.2% of the fundamentalists were female.

The label for the percentage tells us that it is computed within the
category for fundamentalist.

The percentages in each row sum to 100% in the total column for rows
(the row margin).
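A short Python sketch of the row-percentage calculation, assuming only the two counts in the fundamentalist row quoted above (the variable names are illustrative):

```python
# Counts in the "fundamentalist" row, as quoted on the slide.
row_counts = {"male": 29, "female": 42}
row_total = sum(row_counts.values())  # the row margin

# Row percentage = cell count / row total * 100
row_pct = {sex: 100 * n / row_total for sex, n in row_counts.items()}

for sex, pct in row_pct.items():
    print(f"{pct:.1f}% of the fundamentalists were {sex}")
```

Because every case in the row falls in exactly one column, the row percentages add up to 100.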

Slide 5

The percentage of the total for each column is computed by dividing the
number in each cell (e.g. 29, 36, and 42) by the total for the column (107).

27.1% of the males were fundamentalists.

33.6% of the males were moderates.

The label for the percentage tells us that it is computed within the
category for sex.

The percentages in each column sum to 100% in the total row for columns
(the column margin).
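The column percentages for males can be reproduced the same way. This Python sketch assumes the three male counts quoted above map to fundamentalist (29), moderate (36), and liberal (42); the liberal label is taken from the later slides, which report 39.3% of males as liberal.

```python
# Counts in the male column; the "liberal" label for 42 is taken from
# the later slides (39.3% of males), not from this slide.
male_counts = {"fundamentalist": 29, "moderate": 36, "liberal": 42}
column_total = sum(male_counts.values())  # the column margin

# Column percentage = cell count / column total * 100
col_pct = {outlook: 100 * n / column_total for outlook, n in male_counts.items()}

for outlook, pct in col_pct.items():
    print(f"{pct:.1f}% of the males were {outlook}")
```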

Slide 6

The three percentages tell us:

the percent that is in both categories (total percentage)
the percent of each row that is found in each of the column categories (row percentages)
the percent of each column that is found in each of the row categories (column percentages)

The row and column percentages are referred to as conditional or
contingent percentages.

Slide 7

Our real interest is in conditional or contingent percentages because these
tell us about the relationship between the variables.

The relationship between variables is defined by a distinct role for each:

the variable which is affected or impacted by the other is the dependent variable
the variable which affects or impacts the other is the independent variable

We assign the role to the variable. An independent variable in one analysis
may be a dependent variable in another analysis.

Slide 9

A categorical variable has a relationship to another categorical
variable if the probability of being in one category of the dependent
variable differs depending on the category of the independent variable.

For example, if there is a relationship between social class and college
attendance, the percentage of upper class persons who attend college will
be different from the percentage of

Slide 10

Given that we can represent this statistically with either the row or
column percentages in a contingency table, my practice is to always put
the independent variable in the columns and the dependent variable
in the rows, and compute column percentages.

This order matches the order for many graphics where the dependent
variable is on the vertical axis and the independent variable is on the
horizontal axis.

Slide 11

Based on the column percentages, we can make statements like the
following:

Males were most likely to be liberal (39.3%), while females were most
likely to be moderate (45.5%).

Slide 12

Based on the column percentages, we can make statements like the
following:

Males were more likely to be liberal (39.3%) compared to females (26.7%).

This is not equivalent to the statement that liberals are more likely to
be male or female.

Slide 13

We can also describe a relationship based on a comparison of odds. First,
we compute the odds separately for each category of the independent
variable:

The odds that a male would be a liberal rather than a fundamentalist
are 42 ÷ 29 = 1.45.

The odds that a female would be liberal rather than fundamentalist
are 39 ÷ 42 = .93.
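As a hedged illustration, the two odds can be computed in Python from the four counts quoted above. The nested dictionary is just one convenient layout, not something taken from SPSS.

```python
# Counts quoted on the slide, grouped by the independent variable (sex).
counts = {
    "male": {"liberal": 42, "fundamentalist": 29},
    "female": {"liberal": 39, "fundamentalist": 42},
}

# Odds of being liberal rather than fundamentalist, within each group.
odds = {sex: c["liberal"] / c["fundamentalist"] for sex, c in counts.items()}

for sex, value in odds.items():
    print(f"odds for {sex}: {value:.2f}")
```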
Slide 14

We compare the odds by computing the ratio between the two: 1.45 for
males ÷ .93 for females = an odds ratio of 1.56.

We can now state the relationship between the two variables as: males
are 1.56 times more likely to be liberal rather than fundamentalist, than are
females.
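A small Python sketch of the odds ratio, using the unrounded odds rather than the rounded 1.45 and .93 (both routes give 1.56 here). The cross-product line is an equivalent shortcut added for illustration; it does not appear on the slide.

```python
odds_male = 42 / 29      # liberal vs. fundamentalist, males
odds_female = 39 / 42    # liberal vs. fundamentalist, females

# Odds ratio of males to females.
odds_ratio = odds_male / odds_female

# Equivalent cross-product form: (42 * 42) / (29 * 39).
cross_product = (42 * 42) / (29 * 39)

print(f"odds ratio (males to females): {odds_ratio:.2f}")
```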
This could also be stated as: being
male increases the odds of being
liberal rather than fundamentalist by a

Slide 15

If the odds ratio were 1.0, then both groups would be equally likely to be
liberals rather than fundamentalists.

We could have divided the odds for females by the odds for males
(.93 ÷ 1.45 = .64) and stated that being female decreased the odds of being
liberal versus fundamentalist by a factor of .64, or 36%. The 36% comes from
subtracting the odds ratio from 1 (1.00 - .64 = .36) and multiplying .36 by
100 to convert it to a percent. Explaining decreases in odds is more awkward
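The reversed comparison can be sketched in Python as follows. The code works from the unrounded odds, so the percent decrease comes out as 35.9 before rounding to the 36% quoted above.

```python
odds_male = 42 / 29
odds_female = 39 / 42

# Odds ratio of females to males: the reciprocal of the male-to-female ratio.
ratio_female_to_male = odds_female / odds_male

# A ratio below 1 is a decrease; express it as a percent of the male odds.
percent_decrease = 100 * (1 - ratio_female_to_male)

print(f"odds ratio (females to males): {ratio_female_to_male:.2f}")
print(f"decrease in odds: {percent_decrease:.0f}%")
```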

Slide 16

The introductory statement in the question indicates:

The data set to use (GSS200R)
The statistic to use (contingency table)
The variable to use in the rows of the table (attitude toward life)
The variable to use in the columns of the table (sex)

slide 17

The first statement for us to evaluate concerns the number of
valid and missing cases. To answer this question, we produce the
contingency table in SPSS.

slide 18

To compute a contingency table in SPSS, select the
Descriptive Statistics > Crosstabs command from the Analyze menu.

slide 19

First, move the row variable life to the Row(s) list box.

Second, move the column variable sex to the Column(s) list box.

Third, click on the Cells button to specify what should be printed in
each cell of the table.

slide 20

First, mark the check boxes for Column and Total percentages.

Second, click on the Continue button to close the dialog box.

slide 21

After returning to the Crosstabs dialog box, click on the OK button
to produce the output.

slide 22

The SPSS output provides us with the answer to the question on sample size.

The 'Case Processing Summary' in the SPSS output showed the total number of
valid cases to be 186 and the number of missing cases to be 84.
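As a rough check on the case counts, the following Python sketch derives the overall total and the valid and missing percentages from the two numbers quoted above. The total of 270 and both percentages are computed here, not quoted from the SPSS output.

```python
valid_cases = 186    # quoted from the Case Processing Summary
missing_cases = 84   # quoted from the Case Processing Summary

# Derived values (not quoted on the slide).
total_cases = valid_cases + missing_cases
valid_pct = 100 * valid_cases / total_cases
missing_pct = 100 * missing_cases / total_cases

print(f"valid: {valid_cases} ({valid_pct:.1f}%)")
print(f"missing: {missing_cases} ({missing_pct:.1f}%)")
print(f"total: {total_cases}")
```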

slide 23

Click on the check box to mark the
statement as correct.


slide 24

The next statement asks which combination of characteristics was
most common. The key word "and" tells us that the problem is looking
for total percentages, i.e. the percentage that has both characteristics.

slide 25

The largest total percentage (30.1%) in the contingency table was in the
cell for the column labeled FEMALE and the row labeled ROUTINE.
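A minimal Python sketch of finding the most common combination. It uses only the four cell counts that appear elsewhere in these slides (ROUTINE and EXCITING for each sex) and the 186 valid cases. The remaining cells are not quoted, but together they hold only the cases left over, so none of them can be the largest.

```python
valid_cases = 186

# Cell counts quoted in the odds slides of this problem.
stated_cells = {
    ("ROUTINE", "FEMALE"): 56,
    ("EXCITING", "FEMALE"): 44,
    ("ROUTINE", "MALE"): 37,
    ("EXCITING", "MALE"): 41,
}

# Total percentage for each stated cell.
total_pct = {cell: 100 * n / valid_cases for cell, n in stated_cells.items()}
largest_cell = max(total_pct, key=total_pct.get)

# Cases in cells not quoted in the text; too few to beat the largest cell.
unstated_cases = valid_cases - sum(stated_cells.values())

print(largest_cell, f"{total_pct[largest_cell]:.1f}%")
```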

slide 26

The statement that "more survey respondents were female and said
that they generally find life pretty routine than any other combination of
categories" is correct.

The check box for the first statement is marked.

Since this precludes the second statement from being marked, its
checkbox is left unmarked.

slide 27

The next pair of statements asks us to compare the groups and identify
which group was more likely to have the specified characteristic, i.e.
which group had the larger proportion with it.

slide 28

The column percent for survey respondents who were female was
54.4%, which was larger than the column percent of 44.6% for survey
respondents who were male.
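The comparison of column percentages can be sketched in Python. The ROUTINE counts (56 and 37) are quoted elsewhere in the slides. The column totals are NOT quoted in the text: 103 females and 83 males are an assumption, inferred because they reproduce the reported 54.4% and 44.6% and add up to the 186 valid cases.

```python
# ROUTINE counts quoted in the odds slides.
routine = {"FEMALE": 56, "MALE": 37}

# ASSUMPTION: column totals inferred from the reported percentages,
# not quoted in the text. They sum to the 186 valid cases.
column_total = {"FEMALE": 103, "MALE": 83}

# Column percentage = cell count / column total * 100
col_pct = {sex: 100 * routine[sex] / column_total[sex] for sex in routine}
more_likely = max(col_pct, key=col_pct.get)

for sex, pct in col_pct.items():
    print(f"{sex}: {pct:.1f}% said ROUTINE")
print("more likely group:", more_likely)
```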

slide 29

The statement that "compared to survey respondents who were male,
those who were female were more likely to have said that they generally
find life pretty routine" is correct.

The check box for the first statement is marked.

Since this precludes the second statement from being marked, its
checkbox is left unmarked.

slide 30

The next pair of questions gives two options for the most likely response for
each group. The first option identifies alternatives for the most likely
response. The second option states that both groups have the same most
likely responses.

The question of which response was most likely for each group
requires that we identify the mode for each group.

slide 31

The category EXCITING had the largest percentage of cases (49.4%), making
it the modal category for survey respondents who were male.

slide 32

The category ROUTINE had the largest percentage of cases (54.4%), making
it the modal category for females.
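Finding the modal category for each group amounts to picking the largest count within each column. This Python sketch uses only the EXCITING and ROUTINE counts quoted in the slides; any remaining category is left out because the unquoted cells hold too few cases to be the mode.

```python
# Counts quoted in the odds slides, grouped by column (sex).
counts = {
    "MALE": {"EXCITING": 41, "ROUTINE": 37},
    "FEMALE": {"EXCITING": 44, "ROUTINE": 56},
}

# The modal category is the one with the largest count within the group.
modal = {sex: max(cats, key=cats.get) for sex, cats in counts.items()}

for sex, category in modal.items():
    print(f"modal category for {sex}: {category}")
```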

slide 33

The statement that "survey respondents who were male were most likely to
have said that they generally find life exciting, while survey respondents
who were female were most likely to have said that they generally find life
pretty routine" is correct.

The check box for the first statement is marked.

Since this precludes the second statement from being marked, its checkbox
is left unmarked.

slide 34

The final pair of questions requires us to compute the odds of describing
life as routine rather than exciting for each group, and then compute the
odds ratio to determine which group was more likely to have said life was
routine rather than exciting.

slide 35

The odds for survey respondents who were female were computed by dividing
the 'Count' for ROUTINE by the 'Count' for EXCITING (56 ÷ 44 = 1.27).

slide 36

The odds for survey respondents who were male were computed by dividing
the 'Count' for ROUTINE by the 'Count' for EXCITING (37 ÷ 41 = 0.90).

slide 37

I use Excel to do the calculations that I cannot do easily in SPSS.

First, I calculate the odds for each group (sex).

Second, I calculate the odds ratio, once for the ratio of group 1 (females)
to group 2 (males), and a second time for the ratio of group 2 (males) to
group 1 (females).

The odds ratio for females to males is 1.4, which corresponds to a greater
likelihood for females.
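The same two steps can be sketched in Python instead of a spreadsheet. This is only an illustration of the arithmetic, using the four counts quoted on the preceding slides; the variable names are hypothetical.

```python
# Step 1: odds of ROUTINE rather than EXCITING for each group.
odds_female = 56 / 44
odds_male = 37 / 41

# Step 2: the odds ratio in both directions.
ratio_female_to_male = odds_female / odds_male
ratio_male_to_female = odds_male / odds_female

print(f"odds, females: {odds_female:.2f}")
print(f"odds, males: {odds_male:.2f}")
print(f"odds ratio, females to males: {ratio_female_to_male:.2f}")
print(f"odds ratio, males to females: {ratio_male_to_female:.2f}")
```

A ratio above 1 for females to males means the odds of answering ROUTINE are higher for females, which is why only that direction is translated as "more likely".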

slide 38

The statement that "compared to survey respondents who were male, those
who were female were about 1.4 times more likely to have said that they
generally find life pretty routine" is correct. The odds for survey
respondents who were female were computed by dividing the 'Count' for the
ROUTINE row by the 'Count' for the EXCITING row in the FEMALE column
(56 ÷ 44 = 1.27). The odds for survey respondents who were male were
computed by dividing the 'Count' for the ROUTINE row by the 'Count' for the
EXCITING row in the MALE column (37 ÷ 41 = 0.90). The odds ratio for
survey respondents who were female to survey respondents who were male is
1.27 to 0.90, or 1.41 to 1.

The second statement in the pair is marked.

slide 39

The homework problems translate some of the decimal fractions for odds and
odds ratios from numbers to text. The following table shows the
translations used.

If the odds are:   Homework problems will describe the likelihood as:  Examples:
0.95 through 1.05  about equally likely                                0.95, 0.96, 0.97, 0.98, 0.99, 1.00, 1.01, 1.02, 1.03, 1.04, 1.05
1.95 through 2.05  about twice as likely                               1.95, 1.96, 1.97, 1.98, 1.99, 2.00, 2.01, 2.02, 2.03, 2.04, 2.05
2.95 through 3.05  about three times as likely                         2.95, 2.96, 2.97, 2.98, 2.99, 3.00, 3.01, 3.02, 3.03, 3.04, 3.05
3.95 through 4.05  about four times as likely                          3.95, 3.96, 3.97, 3.98, 3.99, 4.00, 4.01, 4.02, 4.03, 4.04, 4.05
4.95 through 5.05  about five times as likely                          4.95, 4.96, 4.97, 4.98, 4.99, 5.00, 5.01, 5.02, 5.03, 5.04, 5.05
5.95 through 6.05  about six times as likely                           5.95, 5.96, 5.97, 5.98, 5.99, 6.00, 6.01, 6.02, 6.03, 6.04, 6.05
6.95 through 7.05  about seven times as likely                         6.95, 6.96, 6.97, 6.98, 6.99, 7.00, 7.01, 7.02, 7.03, 7.04, 7.05
7.95 through 8.05  about eight times as likely                         7.95, 7.96, 7.97, 7.98, 7.99, 8.00, 8.01, 8.02, 8.03, 8.04, 8.05

Slide 40

If the decimal fraction for the odds is:  Homework problems will describe the likelihood as:  Examples:
0.20 through 0.30                         and a quarter times more likely                     3.21  three and a quarter times more likely
Greater than 0.30 and less than 0.37      and a third times more likely                       3.36  three and a third times more likely
0.45 through 0.55                         and a half times more likely                        3.49  three and a half times more likely
Greater than 0.63 and less than 0.70      and two thirds times more likely                    3.69  three and two thirds times more likely
0.70 through 0.80                         and three quarter times more likely                 3.70  three and three quarters times more likely
otherwise                                 reported as a number rounded to one decimal place   3.42  3.4 times more likely

Slide 41

After BlackBoard grades the assignment, it will give you an option to
review the results.

For this problem, we received the full 10 points because we marked all of
the correct answers and did not mark any of the incorrect answers. Note:
this version of BlackBoard does not give partial credit.

slide 42

The feedback after the graded answer explains what the correct answer
would have been.

slide 43