Investigate the relationship between two or more variables. The variables investigated have nominal or ordinal values. If a variable is on the interval scale, transform it to the ordinal scale first.
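The recoding of an interval variable into ordered categories can be sketched as follows. This is only an illustration: the function name `to_ordinal`, the cut points and the age values are all hypothetical and do not come from the course data.

```python
from bisect import bisect_right


def to_ordinal(value, cut_points, labels):
    """Map an interval-scale value to an ordered category.

    cut_points must be sorted in increasing order, and labels must hold
    exactly one more entry than cut_points.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    # bisect_right counts how many cut points are <= value,
    # which is the index of the category the value falls into.
    return labels[bisect_right(cut_points, value)]


# Hypothetical example: ages (interval) recoded into three ordered bands.
cuts = [30, 50]  # assumed boundaries, not from the source
bands = ["under 30", "30-49", "50 and over"]
ages = [22, 30, 49, 50, 67]
recoded = [to_ordinal(a, cuts, bands) for a in ages]
print(recoded)
```

Once recoded, the variable can be crosstabulated like any other ordinal variable.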

What is a Crosstabulation?

Joint frequency distribution of two or more class (ordinal/nominal) variables. Subdivision of one variable according to the values of another variable.

Academic Status    Gender
                   Male    Female
Staff              3       9
Student            4       11
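A minimal sketch of how such a joint frequency table is built from case-level data is given below. It reads the small table above as Staff = 3 males and 9 females, Student = 4 males and 11 females, which is an assumption about the original layout. The helper name `crosstab` is hypothetical.

```python
from collections import Counter


def crosstab(rows, cols):
    """Return {row_value: {col_value: count}} for two equal-length sequences."""
    if len(rows) != len(cols):
        raise ValueError("both variables need one value per case")
    counts = Counter(zip(rows, cols))  # joint frequency of each (row, col) pair
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    return {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}


# Case-level data rebuilt from the cell counts (assumed reading of the table).
cells = {
    ("Staff", "Male"): 3, ("Staff", "Female"): 9,
    ("Student", "Male"): 4, ("Student", "Female"): 11,
}
status, gender = [], []
for (s, g), n in cells.items():
    status += [s] * n
    gender += [g] * n

table = crosstab(status, gender)
for s, row in table.items():
    print(s, row)
```

Each inner dictionary is one row of the crosstabulation, i.e. one variable subdivided by the values of the other.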

SPSS Example

Open the data set tastetest.sav. The data set consists of 30 subjects responding to a product called mulch. Mulch is produced in 3 colors: Red, Blue and Black. The 30 subjects were divided into 3 groups of 10, and each group was randomly assigned one color of mulch.

The subjects' ratings of the taste of the colored mulch were recorded. Open SPSS and go to File > Open > Data. An Open File dialog box opens. Look for tastetest.sav and double-click it to open it.

Go to Utilities > Variables. A Variables display box opens. The variable color is highlighted on the left panel of the display box. Information with regard to color is shown on the right panel of the display box.

In the Variables display box, highlight taste. The following information is presented.


Go to Analyze > Descriptive Statistics > Crosstabs.

A Crosstabs dialog box opens. Place color in the Row(s): box and taste in the Column(s): box. Click OK.

Output

[Table: color * taste Crosstabulation (counts of each taste rating within each mulch color)]

Interpretation

Note that the red mulch (red = 1) has many more respondents rating its taste as above average than the other colors. Note also that the black mulch (black = 3) has many more below-average ratings than the other colors.

A clustered bar chart can be produced to aid in this interpretation. Use the Dialog Recall icon to recall the crosstabulation of color x taste. This time check two options, Display clustered bar charts and Suppress tables, then click OK.

[Figure: clustered bar chart of Count by Mulch color (Red, Blue, Black), with bars clustered by Taste scale (Far above average, Above average, Average, Below average, Far below average)]

Stacked Bar Charts and Clustered Pie Charts

Go to Graphs > Bar. A Bar Charts dialog box opens. Choose the Stacked icon. Choose the option Summaries for groups of cases in the Data in Chart Are section. Click Define.

A Define Stacked Bar: Summaries for Groups of Cases dialog box opens. Place color in the Category Axis: box. Place taste in the Define Stacks by: box. Note that the default choice in the Bars Represent section is N of cases. This is OK if every category of the grouping variable has the same number of cases.

In this particular example N of cases (red) = N of cases (blue) = N of cases (black) = 10. When the group sizes are not the same, choose the option % of cases instead. We can then adjust the bars to be of equal length so that comparisons can be made.

Observed Output

[Figure: vertical stacked bar chart of Count by Mulch color, stacked by Taste scale]

To change the orientation of the default output from vertical to horizontal, double-click the graph to invoke the Chart Editor. Click the Transpose chart coordinate system icon, then close the Chart Editor.


Preferred Output

[Figure: horizontal stacked bar chart of Count by Mulch color, stacked by Taste scale]

Recall that in the last example the number of cases in each of the colored mulch group = 10. Suppose this is not the case. Suppose our grouping variable of interest is taste, hence the number of cases for far above average, above average, average, below average and far below average are 3, , 11, and 3, respectively.

Go to Graphs > Bar. A Bar Charts dialog box opens. Choose the Stacked icon. Choose the option Summaries for groups of cases in the Data in Chart Are section. Click Define.

A Define Stacked Bar: Summaries for Groups of Cases dialog box opens. Place taste in the Category Axis: box. Place color in the Define Stacks by: box. Choose % of cases in the Bars Represent section.

Resultant Graph

[Figure: stacked bar chart of Percent by Taste scale (Far above average, Above average, Average, Below average, Far below average), stacked by Mulch color (Red, Blue, Black)]

Note

The default colors in the graph do not match the mulch colors, so we need to edit the graph to give each bar segment the color of the product it represents. The bar lengths are based on percentages of all cases. We need to change the bar lengths so that they are all the same, i.e. so that the percentage breakdown of colors is computed within each taste group.
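The difference between the two kinds of percentage can be sketched numerically. The counts below are hypothetical (they are not the tastetest.sav values), and the function names are made up for this illustration.

```python
def overall_percent(table):
    """Each cell as a percentage of all cases, so bars differ in length."""
    total = sum(sum(row.values()) for row in table.values())
    return {g: {k: 100.0 * v / total for k, v in row.items()}
            for g, row in table.items()}


def within_group_percent(table):
    """Each cell as a percentage of its own group, so every bar sums to 100."""
    out = {}
    for g, row in table.items():
        n = sum(row.values())  # size of this group only
        out[g] = {k: 100.0 * v / n for k, v in row.items()}
    return out


# Hypothetical counts: two taste groups of unequal size, split by mulch color.
counts = {
    "Average": {"Red": 2, "Blue": 5, "Black": 3},        # 10 cases
    "Below average": {"Red": 1, "Blue": 1, "Black": 2},  # 4 cases
}
overall = overall_percent(counts)
scaled = within_group_percent(counts)

for group in counts:
    print(group, sum(overall[group].values()), sum(scaled[group].values()))
```

With overall percentages the smaller group produces a shorter bar. With within-group percentages both bars have the same length, so the color breakdowns can be compared directly.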

Edits

Double click on the graph to activate the Chart Editor. Click on the blue key box representing red mulch in the Mulch Color legend. All the blue sections of the bars are highlighted. Double click on the blue key box in the legend. A Properties dialog box opens.

Go to the Fill & Border tab and click on it.

Change the color in Fill to red.


Click Apply.

Resulting Edit

In the same manner change the green color of the blue mulch to blue and the khaki color of the black mulch to black.

[Figure: stacked bar chart of Percent by Taste scale, stacked by Mulch color, with the legend recolored to Red, Blue and Black]

Double click on the bars to open the Properties dialog box. Click on the Bar Options tab. In the Bar Options tab choose Scale to 100% in the Stacked Bars section. Click Apply.

Resultant Graph

[Figure: stacked bar chart of Percent by Taste scale, stacked by Mulch color (Red, Blue, Black), with every bar scaled to 100%]

Reorientate

[Figure: horizontal 100% stacked bar chart of Percent by Taste scale, stacked by Mulch color (Red, Blue, Black)]

Note

Note that the x-axis labels are all wrong. They should be decimals (0, 0.2, 0.4, 0.6, 0.8 and 1), not 0%, 0.2%, ..., 1%. Relabel them appropriately. You can also use 0%, 20%, ..., 100%. What can you interpret from this graph?

Go to Graphs > Interactive > Pie > Clustered. The Create Clustered Pie Chart dialog box opens.

Choose the 2-D Coordinate option. Drag Taste from the variables window on the left and place it in the Slice by: box in the Pie Variables section on the right. Drag mulch and place it in the Panel Variables box. Click OK.

Resultant Graphs

[Figure: clustered pie charts, one panel per mulch color (Red, Blue, Black), sliced by Taste scale (Far above average, Above average, Average, Below average, Far below average); pies show counts]

Double click graph output to invoke the Interactive Graph Editor. Go to the legend box and double click on the key. A dotted frame appears around the key and all Far above average sectors in the three pie charts. A Color Legend-Taste Scale dialog box opens. Change the colors for every key to match those in the bar charts. Click OK.

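[Figure: clustered pie charts, one panel per mulch color (Red, Blue, Black), sliced by Taste scale, after editing the colors; pies show counts]

Clustered Pie Charts with Taste as Panel and Color as Slices after Editing.

[Figure: clustered pie charts, one panel per Taste scale category, sliced by Mulch color (Red, Blue, Black)]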

Measures of Association

Interval variables: Pearson Correlation Coefficient
Ordinal variables: Spearman Rho Correlation
Nominal & ordinal variables:
    Chi-Square Statistic
    Contingency Coefficient
    Phi and Cramer's V
    Lambda
    Uncertainty Coefficient
    Gamma
    Somers' d
    Kendall's tau-b
    Kendall's tau-c
Nominal x interval variables: eta

Given a set of bivariate data, the tendency towards a linear relationship between the two variables can be measured by the Pearson correlation coefficient, r, where -1 ≤ r ≤ 1. If r = 0 there is no tendency towards a linear relationship: the data could be random, or the relationship could be nonlinear. If r = ±1 the linear relationship is perfect.

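\[
r = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
{\sqrt{\left[ n\sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \right]
\left[ n\sum_{i=1}^{n} y_i^2 - \left( \sum_{i=1}^{n} y_i \right)^2 \right]}}
\]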
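The computational formula for r can be tried out on small data sets. The sketch below is an illustration only: the function name `pearson_r` and the five x values are made up, and they were chosen so that the three limiting cases described above (r = 1, r = -1, and r = 0 for a nonlinear pattern) appear.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation via the computational (sums) formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    num = n * sxy - sx * sy
    den = sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    if den == 0:
        raise ValueError("r is undefined when a variable is constant")
    return num / den


xs = [-2, -1, 0, 1, 2]                            # made-up values
r_line = pearson_r(xs, [2 * v + 1 for v in xs])   # points on an increasing line
r_down = pearson_r(xs, [-3 * v for v in xs])      # points on a decreasing line
r_curve = pearson_r(xs, [v * v for v in xs])      # symmetric parabola
print(r_line, r_down, r_curve)
```

The parabola case shows why r = 0 does not mean "no relationship": y is completely determined by x, but the relationship is not linear.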

[Table: Weight (X) and Glucose (Y), 16 paired observations]

\[
n = 16, \qquad \sum_{i=1}^{16} x_i y_i = 126128.1, \qquad \sum_{i=1}^{16} x_i = 1237.8, \qquad \sum_{i=1}^{16} y_i = 1621
\]

\[
\sum_{i=1}^{16} x_i^2 = 97178.6, \qquad \sum_{i=1}^{16} y_i^2 = 165801
\]

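\[
r = \frac{16(126128.1) - (1237.8)(1621)}
{\sqrt{\left[ 16(97178.6) - (1237.8)^2 \right]\left[ 16(165801) - (1621)^2 \right]}}
= 0.484
\]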
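The arithmetic can be checked with a few lines of Python. This sketch assumes the summary sums are read as n = 16, Σx = 1237.8, Σy = 1621, Σx² = 97178.6, Σy² = 165801 and Σxy = 126128.1. The variable names are chosen here for illustration.

```python
from math import sqrt

# Summary sums as read from the worked example (n = 16 subjects).
n = 16
sum_x, sum_y = 1237.8, 1621.0
sum_x2, sum_y2 = 97178.6, 165801.0
sum_xy = 126128.1

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
print(round(r, 3))
```

Under that reading of the sums, the result rounds to the value of r stated in the worked example.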

Interpretation

Body weight and blood glucose level have a weak affinity towards a linear relationship.

[Figure: scatter plot of glucose levels (mg/100ml) against weight (kg)]

SPSS Example

Run this data using SPSS. Type the data in the SPSS Data Editor. Name the first variable x and label it weight (kg). Name the second variable y and label it blood glucose level (mg/100ml). Go to File > Save As and save the file as sugar.sav.


Go to Analyze > Correlate > Bivariate. A Bivariate Correlations dialog box opens.

Place x and y in the Variables: box. Check the Pearson box under Correlation Coefficients. Uncheck Flag significant correlations. Click Options.

Check the Means and standard deviations and Cross-product and covariances boxes under Statistics. Click Continue.

Output

[Table: Descriptive Statistics for x and y]

Correlations

[Table: Correlations of x and y, with rows Pearson Correlation, Sig. (2-tailed), Sum of Squares and Cross-products, Covariance and N; the Pearson correlation between x and y is .484]

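Sum of squares of x = Corrected sum of squares of x

\[
= \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2
\]

\[
n \times (\text{Sum of squares of } x) = n\sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2
\]

Similarly

\[
\text{Sum of squares of } y = \sum_{i=1}^{n} y_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} y_i \right)^2
\]

\[
n \times (\text{Sum of squares of } y) = n\sum_{i=1}^{n} y_i^2 - \left( \sum_{i=1}^{n} y_i \right)^2
\]

and

\[
\text{Sum of cross products of } xy = \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i
\]

Therefore

\[
r = \frac{n \times (\text{Sum of cross products of } xy)}
{\sqrt{n \times (\text{Sum of squares of } x) \; \times \; n \times (\text{Sum of squares of } y)}}
= \frac{\text{Sum of cross products of } xy}
{\sqrt{(\text{Sum of squares of } x)(\text{Sum of squares of } y)}}
\]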
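The link between the SPSS "Sum of Squares and Cross-products" entries and r can be sketched numerically. The block assumes the same summary sums as the worked example (n = 16, Σx = 1237.8, Σy = 1621, Σx² = 97178.6, Σy² = 165801, Σxy = 126128.1) and takes the covariance to be the cross-product sum divided by n - 1. The variable names are chosen here for illustration.

```python
from math import sqrt

n = 16
sum_x, sum_y = 1237.8, 1621.0
sum_x2, sum_y2 = 97178.6, 165801.0
sum_xy = 126128.1

ss_x = sum_x2 - sum_x ** 2 / n        # corrected sum of squares of x
ss_y = sum_y2 - sum_y ** 2 / n        # corrected sum of squares of y
scp_xy = sum_xy - sum_x * sum_y / n   # corrected sum of cross products of xy

cov_xy = scp_xy / (n - 1)             # sample covariance of x and y
r = scp_xy / sqrt(ss_x * ss_y)        # same r as the computational formula

print(round(ss_x, 3), round(ss_y, 3), round(scp_xy, 3))
print(round(cov_xy, 3), round(r, 3))
```

Because the factor n cancels between numerator and denominator, dividing the cross-product sum by the square root of the product of the two sums of squares gives the same r as the formula written in terms of raw sums.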

Next Week

More Measures of Association & Test of Hypothesis

