Handout Practical SPSS



Those who are not [very] familiar with SPSS are advised to check the SPSS tutorial (see the Help
menu), and to study the book by Andy Field carefully.

Goals for this practical


 Getting acquainted with SPSS
 Obtaining descriptive statistics using SPSS
 Identifying outliers and coding missing values
 Basic statistical hypothesis testing using e.g. the chi-square test
 Comparing means / ranks of two groups (t-test, Mann-Whitney test)
 Comparing means / ranks of multiple groups (ANOVA, Kruskal-Wallis)

By the end of this course, you should be able to do the following:


Descriptive statistics:
- Report mean, standard deviation, range etc.
- Check distributions of variables for skewness, kurtosis
- Identify possible outliers through boxplots
- Deciding: Outliers in or out?
Hypotheses formulating and testing:
- Based on the given datasets and a given research question, what are the hypotheses you are
going to test?
- Chi-square test for relations between two nominal variables (or one nominal, and 1 ordinal)
Comparing two groups:
- Check assumptions
- T-test
- Mann-Whitney
Comparing multiple groups:
- Check assumptions
- ANOVA
- Kruskal-Wallis

1. Descriptive statistics: mean, sd, skewness, kurtosis, outliers…
The first exercise of this practical deals with obtaining descriptive statistics of your data. First,
open the file exercise.sav.

File -> Open -> Data

and go to the folder where you put the .sav file


The file contains the data from a small experiment concerning the effect of physical
exercise on mental health. Subjects assigned to condition 1 participated in an 8-month physical training
program. Subjects assigned to condition 2 participated in no physical exercise whatsoever during
these 8 months. The file contains the subject identification numbers (subject), the subjects’ test status
(tstatus; training yes or no), and anger and anxiety data that were collected with the Spielberger
anger / anxiety scale after the 8-month period.
The Spielberger Anger scale consists of 10 items endorsed on a 4-point scale, ranging from 1
(almost never) to 4 (almost always). The overall trait anger score as presented in this datafile is
created by summing the scores of all individual items resulting in the overall trait anger score, which
theoretically ranges from 10 to 40 (10x1 vs 10x4).
The Spielberger Anxiety scale consists of 15 items endorsed on a 4-point scale, ranging from
1 (almost never) to 4 (almost always). The overall trait anxiety score as presented in this datafile is
created by summing the scores of all individual items, resulting in an overall trait anxiety score that
theoretically ranges from 15 to 60 (15x1 vs 15x4).

Let’s focus on the anger data for now.

Outliers
Let us first examine whether our data contain any outliers.

Analyze -> Descriptive Statistics -> Explore

Select the anger variable as ‘Dependent variable’, the training status as ‘Factor’, and the subject
identification number as the variable to label cases by.

Note that we will now explore the anger


variable (the dependent variable) for the two
conditions separately as we have indicated
training status as ‘factor’. Also, when a certain
subject is a noticeable case (e.g., because it
is an outlier), SPSS will refer to this case by
using the subject identification number
because we have asked the program to use
the subject identification number to label
cases by.
(If you do not select a variable to label the cases by, SPSS will refer to specific cases by their current
position in the file (the grey column on the left-hand side of your ‘Data View’ window). This means that if you
change the order in the file, the referral number will change as well.)

Now click on the button ‘Statistics’ and select the options ‘Descriptives’ and ‘Outliers’, and click
‘Continue’. The Descriptives option will provide you with descriptive information on the anger variable
(mean, minimum score, maximum score, standard deviation etc.), including information on skewness
and kurtosis. The Outliers option will provide you with the five lowest and highest values, which is
convenient if you’re looking for outliers.
Click also the button ‘Plots’ and select the options ‘Stem-and-leaf’ and ‘Histogram’ and ‘Normality plots
with test’, and click ‘Continue’.

Now click on ‘Paste’; your command will now be shown in a new window called the syntax window,
and reads as follows:

EXAMINE VARIABLES=anger BY tstatus
/ID=subject
/PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
/COMPARE GROUPS
/STATISTICS DESCRIPTIVES EXTREME
/CINTERVAL 95
/MISSING LISTWISE
/NOTOTAL.

By pasting all your SPSS commands in the syntax window and saving them as a separate file
together with your data, you can log and keep track of the analyses that you performed and
any transformations of the data that you have made. This may be of help for
example when you want to perform similar analyses on new variables, but it also functions as a
log/journal of your research activities. If you look at your data file and your results in two years
time, you may not remember what is what, and what you did exactly. However, if you save all
your commands and actions as syntax, then you will be able to look up your previous actions.

You can run the SPSS syntax by selecting all the text and clicking the run button ► (or use ctrl+R).

First look at the table with Descriptives. As you can see, all descriptives are reported separately for
condition 1 (long-term training) and condition 2 (no training).

1.1 What are the mean and standard deviation of the anger variable in condition 1 and 2?

Mean condition 1: 16.27 Mean condition 2: 21.53

SD condition 1: 4.026 SD condition 2: 10.391


1.2 Look at the minimum and maximum anger scores in condition 1 and 2. Does something
strike you as odd? In condition 1 the minimum is 11 and the maximum is 23 (SD = 4.026). In
condition 2 the minimum is 12 and the maximum is 55: a remarkably large gap between minimum
and maximum. Moreover, the maximum should not be higher than 40, since the scale ranges from 10 to 40.

1.3 What are the skewness and kurtosis statistics for each condition, and what are their
standard errors?

Condition 1:
Skewness: 0.495 std error: 0.580

Kurtosis: -1.265 std error: 1.121

Condition 2:
Skewness: 2.636 std error: 0.580

Kurtosis: 8.277 std error: 1.121

The statistics for skewness and kurtosis are scale dependent so they are not directly interpretable.
However, if you divide the statistic by its standard error, you get a z-score (see pages 138-139). Z-
scores have a known mean and standard deviation, i.e., a known distribution. For example, we know
that only about 5% of the z-scores have absolute values greater than 1.96, and only 1% greater than
2.58. That is, if we observe a z-score with an absolute value larger than 1.96 or 2.58, we know that this is a
very unlikely score. In the case of testing for skewness or kurtosis, z-scores larger than 1.96 in absolute value are cause for alarm:
we then usually decide that the skew/kurtosis is significant (i.e., the distribution of our dependent
variable shows significant skew or kurtosis). We may then need to transform our variable before we
continue with subsequent analyses. However, significant skew/kurtosis may also be due to the
presence of outlying scores…

1.4 Given this information, calculate the z-scores for skewness and kurtosis for condition 1 and
2 and comment on the Z-values.

Condition 1:
Z score skewness 0.853
Z score kurtosis -1.128

Condition 2:
Z score skewness 4.545
Z score kurtosis 7.384
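As a quick check of the hand calculations, dividing each statistic by its standard error can be sketched in a few lines of Python; the statistic and standard-error values are copied from the SPSS Explore output described above.

```python
# Quick check of the z-scores for skewness and kurtosis:
# z = statistic / standard error. Values are taken from the
# SPSS Explore output for the anger variable.
def z_score(statistic, std_error):
    return statistic / std_error

# Condition 1
print(round(z_score(0.495, 0.580), 3))   # skewness  -> 0.853
print(round(z_score(-1.265, 1.121), 3))  # kurtosis  -> -1.128
# Condition 2
print(round(z_score(2.636, 0.580), 3))   # skewness  -> 4.545
print(round(z_score(8.277, 1.121), 3))   # kurtosis  -> 7.384
```

Since both absolute z-values in condition 2 exceed 1.96, skew and kurtosis are significant there, while in condition 1 they are not.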

1.5 Look at the table with Extreme values in your SPSS output. Do you see any extreme (low or
high) values in condition 1 or 2?

Yes: the lowest value is 11 and the highest is 55; the value 55 belongs to case number 17 in
condition 2.

SPSS prints 2 tests for normality: Kolmogorov-Smirnov and Shapiro-Wilk.

The column ‘Statistic’ presents the value of the test statistic, and the column ‘df’ the number of
degrees of freedom. The columns labeled ‘Sig.’ contain the p-values, i.e., the chance of observing a test
statistic of this magnitude given that the null-hypothesis is true. In this case, the null-hypothesis is “the
data are normally distributed” and the alternative hypothesis is “the data are not normally distributed”.
(see also pages 145-148 Field).

1.6 Given that we test against a criterion level α of .05, what do you conclude with respect to
the distribution of the anger scores in condition 1? And condition 2? Use the histograms of the
data as presented in the SPSS output to support your answer.

Conditions 1 and 2 are not normally distributed.

Finally, let’s take a look at the stem-and-leaf plot (or box-plot). See pages 99-103 of your book for
more details on how to interpret this type of representation (the length of the whiskers, the size of
the box, etc.). As you can see, subject 17 is a clear outlier in condition 2.

[Boxplot: Spielberger Trait Anger scale (10-60) by Training manipulation grouping (Long-term
training vs No training); case 17 is flagged as an outlier in the ‘No training’ group.]
The high z-scores for skew and kurtosis in condition 2 are probably due to this outlier and its high score
of 55. Actually, a score of 55 is impossible since the sum scores for this scale theoretically range
between 10 and 40. This score therefore is probably a typo. Ideally, you would go back to the raw data
and look up what the score should have been. At this point we do not have the raw data at hand, and
so, because we want to proceed with the analyses, we recode this value into “missing”. Let us make a
new variable for Anger and have the value 55 turned into the value -99999 (see footnote 1).

Always define a new variable when making changes in the data or save the file with the
changed variable under a different name!! This way you can always go back to the original
data.

Transform -> Compute

Give the name of the new variable (anger_R) in the box under ‘Target variable’ and select ‘anger’ in
the box ‘Numeric expression’. Note that the extension “_R” is often used to denote that a variable is
a revised version of the original variable ‘anger’. You are of course allowed to give the new variable
a totally different name. However, it is wise to develop a policy with respect to such things: the name
“anger_R” or “anger_new” or “anger_2” will immediately tell you that the variable is a revision of the
variable “anger” (but what kind of revision is not clear). You make your life much harder if you give
new variables names that are not informative, like ‘new’ (new what?) or ‘transform’ (transform what?
How?).

Note that we now simply copy all values from the original variable to make up the new variable (no
arithmetical transformation), but you can also use the Compute statement to add up different
variables, or to subtract 10, divide a variable by 100, log-transform them, etc. (see all possible
functions listed under “Functions”).

1
You can also leave the cell empty, but I don’t recommend that: you will not know whether you made a mistake
while typing in the data, or whether it is an actual missing value. I also recommend using the same missingness
code for ALL variables in your file. So do not use -1, -9, and 999 as different missingness codes for different
variables in one file: that would require you to look up the specific code every time. So: choose one sensible
missingness code (a value that is impossible for ALL variables in your file) and use it for all variables. I always
use -99999 in all my files as I have never come across a variable for which this is a possible value.

Now click on the ‘If’ button. In this next window you can specify which cases will be included in the
compute-statement. At present, we only want to copy those values of anger that are valid, and so we
specify that we only want to copy them when the values are smaller than 41 (10-40 are admissible
scores; scores < 10 are not present so we do not need to worry about these right now).

Click ‘Continue’. If you now paste the command in the syntax window [click ‘Paste’] you will see:

IF (anger<41) anger_R=anger.
EXECUTE.

Now copy this statement and adjust it as follows:

IF (anger>=41) anger_R = -99999 .
EXECUTE .

That is: all anger scores of 41 and higher will be replaced by the value -99999.

Now select this part of the syntax and run it. You will see that a new variable is added, namely
anger_R, for which the value 55 is replaced by -99999.
However, SPSS doesn’t know that -99999 indicates a missing score and will treat it as an admissible
score unless you tell the program that -99999 stands for missing.
You can either use the "Variable View" window (bottom left of the SPSS screen) to label the value
-99999 as a missing value for anger_R so that SPSS will exclude this observation from the analyses,
or you can run the following syntax (copy it to your syntax file so that your syntax file is complete,
i.e., so that every operation/manipulation is recorded in your syntax file):

MISSING VALUES anger_r (-99999).
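The recode-and-missing logic above can also be sketched outside SPSS; here is a minimal Python illustration, using None in place of the -99999 user-missing code.

```python
# Sketch of the recode logic: admissible anger scores (10-40) are
# copied to the new variable, everything else becomes missing.
# SPSS uses the user-missing code -99999; here we use None instead.
def recode_anger(score):
    if score is not None and 10 <= score <= 40:
        return score
    return None

scores = [11, 23, 55, 30]
print([recode_anger(s) for s in scores])  # [11, 23, None, 30]
```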

Look in your data view window: is one score coded as -99999 now?

You can check whether all went well:

Analyze -> Descriptive Statistics -> Descriptives

Select the new variable anger_R and click OK.


If correct, the new variable has a minimum of 11 and a maximum of 30, and there are 29 valid scores.

1.7 Now rerun the explore-analyses as we ran previously, but now with the new variable
anger_R. Calculate z-scores for skewness and kurtosis again for condition 2, and comment on
them. Are the skew and kurtosis still significant? And what about the Kolmogorov-Smirnov and
Shapiro-Wilk tests?

For future reference, save this file as exercise_R.sav (File -> Save as -> and paste the save
statement to your syntax file!) and save your syntax as Syntax_pract1_1.sps (check that you save
these all in the correct directory!) and go to the next question.

2. Relations between nominal variables

The data on which the following exercise is based, were taken from

Chase, M. A., and Dummer, G. M. (1992), "The Role of Sports as a Social Determinant for Children,"
Research Quarterly for Exercise and Sport, 63, 418-424
(http://lib.stat.cmu.edu/DASL/DataArchive.html)

As discussed during the first lecture, the relationship between two nominal variables (or between a
nominal variable and an ordinal variable) can be studied using the frequencies given in a contingency
table or crosstab. Based on the observed and the expected frequencies under the null-hypothesis, the
chi-square test-statistic can be calculated, which can subsequently be used to test whether the
variables show any relation or not.

Open the file goals.sav

Students in grades 4, 5 and 6 were asked the following question:


What would you like to do most at school?
1. make good grades
2. be good at sports
3. be popular

Demographic information was collected as well (e.g. urbanisation level, gender, race).
The file contains the following variables:
Gender, Grade, Race, Urban, Goals.
Familiarize yourself with the meaning of the numerical codes for each variable before proceeding, by
going to the “Variable View” window and looking up the description of the “Values”.

2.1. Describe the levels of each of the five variables in the file, and determine their
measurement level, and the frequency with which each level of each variable was endorsed.
(for the last question, use Analyze -> Descriptive Statistics -> Frequencies)

Variable  Level (code)  Meaning   Frequency  Percent
Gender    0             Boy       227        47.5
          1             Girl      251        52.5
Grade
Race      0             White     442        92.5
          1             Other     36         7.5
Urban     1             Rural     149        31.2
          2             Suburban  151        31.6
          3             Urban     178        37.2
Goals     1             Grades    247        51.7
          2             Sports    141        29.5
          3             Popular   90         18.8

2.2. Which goal was selected most often as most important by the students?
(Use Analyze -> Descriptive Statistics -> Frequencies)

Grades

Now that we have familiarized ourselves with the data in this file, we can begin to ask some questions.
For example, do boys and girls differ with respect to what they find the most important goal in school?
We will go through this research question together.

2.3. Testing the relationship between gender and goals


The research question we are going to study is:

Is there a relationship between gender and the goal which is appreciated as most important in school?

2.3.a. Is this a one- or two-sided research question? Explain

This is a two-sided research question: we only ask whether there is a relationship, not in which direction.

2.3.b. Formulate the hypotheses that we are going to test:

H0: states that there is no relationship between the dependent and the independent variable.

H1: states that there is a relationship between the dependent and the independent variable.

To test this hypothesis, proceed as follows:

Analyze -> Descriptive Statistics -> Crosstabs

In the window that appears, select on the left the variable ‘Gender’ for Row(s) and the variable
‘Goals’ for Column(s).

To get the chi-square test, click the button ‘Statistics’ and select the option ‘Chi-square’ [top left],
and click ‘Continue’.
To get the observed as well as the expected frequencies, click the button ‘Cells’, tick both
‘Observed’ and ‘Expected’ [top left], and click ‘Continue’.

Then click OK (or better still: paste).

Your output will look like this:

Case Processing Summary

Cases
Valid Missing Total
N Percent N Percent N Percent
Gender * Goals 478 100.0% 0 .0% 478 100.0%

Gender * Goals Crosstabulation

Goals
grades sports popular Total
Gender boy Count 117 50 60 227
Expected Count 117.3 67.0 42.7 227.0
girl Count 130 91 30 251
Expected Count 129.7 74.0 47.3 251.0
Total Count 247 141 90 478
Expected Count 247.0 141.0 90.0 478.0

Chi-Square Tests

                              Value    df   Asymp. Sig. (2-sided)
Pearson Chi-Square            21.455a  2    .000
Likelihood Ratio              21.769   2    .000
Linear-by-Linear Association  4.322    1    .038
N of Valid Cases 478
a. 0 cells (.0%) have expected count less than 5. The
minimum expected count is 42.74.

The first table is just a description of the number of valid and missing cases for this crosstab (0 in this
case).
The second table is the actual contingency table. As you can see, each cell contains an observed value
(count) and an expected value (expected count). Note that for the marginal cells, observed and expected counts
are equal.
The third table contains the test-statistics (Value: we will only look at the Pearson Chi-square, but if you
want to know what the other two are, look them up in the help function!), the degrees of freedom (df) and the
accompanying p-value (Asymp. Sig (2-sided)).

2.3.c. How is the expected count of 42.7 in cell “boy” in combination with “popular” calculated?

Expected count = (row total x column total) / grand total = (227 x 90) / 478 = 42.74.

2.3.d. Given the formula for the chi-square test statistic, calculate the chi-square test statistic
yourself:

χ² = Σ (observed − expected)² / expected = …

2.3.e. Why is the number of degrees of freedom 2?

Because df = (number of rows − 1) x (number of columns − 1) = (2 − 1) x (3 − 1) = 2.

2.3.f. What are the two assumptions that need to be met before one can do the chi-square test,
and are they met here?

Independence of observations, and expected frequencies of at least 5 per cell. Both are met here:
each student contributed only one answer, and footnote a of the Chi-Square Tests table shows that
0 cells have an expected count less than 5.

2.3.g. How large is the probability to observe a chi-square test statistic of this magnitude, given
that there is no relation between ‘Gender’ and ‘Goals’ ?

Very small: the reported p-value (Asymp. Sig.) is .000, i.e., p < .001.

2.3.h. Given this p-value and given that we test against a criterion level α of .05, what do you
conclude? Do you reject the null-hypothesis, and if so, what does this mean (with respect to
the research question!)?

Yes, we reject the null hypothesis because p < .05; this means that we accept H1: boys and girls
differ with respect to the goal they find most important in school.

2.3.i. We now know that boys and girls differ with respect to the goals they most aspire to in
school, but how do they differ? Look at the observed and expected counts in the six cells, and
describe the difference between the observed and expected counts: in which cells do they
differ most (and thus contribute most to chi-square test statistic)? Try to say something
intelligent about how and with respect to which goals boys and girls seem to differ.

Congrats, you have performed your first chi-square test (of the day)!

Note that if you would write a formal results section for a paper or assignment, the report would have to read
something like this:

“A chi-square test showed that there was a significant relationship between gender and most aspired goals
(χ2(2)=21.46, p < .001), meaning that boys and girls differ with respect to the goals they pursue in school.”

2.4. Testing the relationship between race and goals

Now we want to know whether there is a relationship between Race and Goals. As we worked through the
previous research question step by step, I believe you can do this next one on your own. The following elements
need to be present in your answer: phrase both hypotheses, state whether one- or two-sided, state the criterion level α
that you’ll use, mention the value of the test statistic and the accompanying p-value, explain the
number of df’s, and draw an informed conclusion (i.e., referring back to the actual research question,
which is: Is there a relationship between race and the goal which is appreciated as most important in
school?)

Go ahead:

3. Comparing means
Open the file balance.sav

The data for the following exercise were taken (and adjusted) from:

Teasdale, Bard, LaRue, & Fleury. (1993). Experimental Aging Research

http://lib.stat.cmu.edu/DASL/Stories/MaintainingBalance.html

The file contains data from a balance task. Each subject stood barefoot on a "force platform" and was
asked to maintain a stable upright position and to react as quickly as possible to an unpredictable
noise by pressing a hand held button. The noise came randomly and the subject concentrated on
reacting as quickly as possible. The platform automatically measured how much each subject swayed
in millimeters in both the forward/backward and the side-to-side directions.

The file contains four variables:


ID: subject ID number
GROUP: subject group membership (1=old, 2=young)
FO_BA: forward/backward sway in millimeters
SIDE: side-side sway in millimeters

The research question is: do older people experience more difficulty maintaining their balance
while performing a reaction time task than younger people?
Nine elderly people and nine young people were subjects in this experiment.

We have two groups, and the dependent variables are continuous, so in principle we could perform a
t-test to find out whether older people show more forward-backward sway and/or more side sway
than younger people. However, are the assumptions for the t-test met?

Let us first do some preliminary analyses to check whether the data are normally distributed within
each group (young vs old), and whether the variances can be considered equal for the two groups
(both assumptions for the t-test).

Analyze -> Descriptive Statistics -> Explore…

Select both forward-backward sway and side sway as Dependent variables and age group as Factor.
Click Plots and select Histogram, and Normality plots with tests, option Untransformed. Then click
Continue and OK.

Ignore all output except the tables entitled Tests for Normality and Tests for homogeneity of variance,
such as displayed below

Tests of Normality
                                   Kolmogorov-Smirnov(a)     Shapiro-Wilk
                      age group    Statistic  df  Sig.       Statistic  df  Sig.
forward-backward sway old .254 9 .097 .797 9 .019
young .216 9 .200* .885 9 .177
side-side sway old .134 9 .200* .945 9 .639
young .309 9 .013 .747 9 .005
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction

1. Are the variables normally distributed in the ‘old’ and ‘young’ group?
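To make the decision rule explicit, here is a small Python sketch that compares each Shapiro-Wilk p-value (copied from the Tests of Normality table above) to the criterion level α = .05.

```python
# Compare the Shapiro-Wilk Sig. (p) values from the Tests of
# Normality table above against the criterion level alpha = .05:
# p below alpha means we reject the normality hypothesis.
alpha = 0.05
shapiro_p = {
    ("forward-backward sway", "old"): 0.019,
    ("forward-backward sway", "young"): 0.177,
    ("side-side sway", "old"): 0.639,
    ("side-side sway", "young"): 0.005,
}
for (variable, group), p in shapiro_p.items():
    verdict = "normal" if p >= alpha else "not normal"
    print(variable, group, verdict)
```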

Test of Homogeneity of Variance

Levene
Statistic df1 df2 Sig.
forward-backward sway  Based on Mean                         7.074  1  16      .017
                       Based on Median                       1.880  1  16      .189
                       Based on Median and with adjusted df  1.880  1  8.579   .205
                       Based on trimmed mean                 5.830  1  16      .028
side-side sway         Based on Mean                         3.757  1  16      .070
                       Based on Median                       2.567  1  16      .129
                       Based on Median and with adjusted df  2.567  1  13.728  .132
                       Based on trimmed mean                 3.926  1  16      .065

2. Can the variance of forward-backward sway be considered equal across the two
groups? And for side sway?

3. Given these results, what do you conclude with respect to the assumptions of
normality and equal variances? So, which test do you perform: the parametric t-test or
the non-parametric Mann-Whitney?

Hopefully, you chose the Mann-Whitney test, because that is the test we are now going to perform.

Remember that the Mann-Whitney test uses ranked scores: the raw scores of both groups are pooled
and ranked. The ranks are then summed for each group. If the groups do not differ with respect to the
dependent variable (forward-backward / side sway), you expect the sum of ranks to be equal for the
two groups as the ranks will be randomly distributed across the two groups.
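The pool-rank-sum logic just described can be sketched in a few lines of Python; the scores below are made up for illustration (and chosen without ties, so each score gets a unique rank).

```python
# Sketch of the Mann-Whitney rank logic: pool the scores of both
# groups, rank them, then sum the ranks per group. The data are
# made up for illustration and contain no ties.
group_a = [12, 15, 19]
group_b = [14, 22, 25]

pooled = sorted(group_a + group_b)
rank_of = {score: i + 1 for i, score in enumerate(pooled)}

sum_a = sum(rank_of[s] for s in group_a)
sum_b = sum(rank_of[s] for s in group_b)
print(sum_a, sum_b)  # prints 8 13
```

If the groups did not differ, the two rank sums would be expected to be about equal.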

Analyze -> Non-parametric tests -> Independent samples…

INTERMEZZO

Note that the SPSS options etc. shown in Field pp. 546-549 are for SPSS 17; we use IBM SPSS Statistics
version 20. It looks different and no longer shows some of the options that were actually quite
handy, such as the Descriptives for the Mann-Whitney test (i.e., the mean rank and sum of ranks for each
group separately).
You can actually do the ranking of the data in SPSS yourself:

Transform -> Rank cases ->

And then select the two sway variables as Variables, select Assign Rank 1 to “Smallest Value”, and under
Ties select “Mean” (i.e., subjects with the same observed value for the variables are assigned the mean of
the corresponding ranks). For instance:

Observed 10 12 12 12 17
Ranks     1  3  3  3  5

The observations “12” all get rank 3 because they correspond to ranks 2, 3 and 4, and the mean of
2, 3 and 4 is (2+3+4)/3 = 3.
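This tie-handling rule can be sketched in Python as follows (a minimal illustration of the "Mean" option, not what SPSS runs internally):

```python
# Mean ranks for tied observations, as in the example above:
# tied scores receive the mean of the ranks they jointly occupy.
def mean_ranks(scores):
    ordered = sorted(scores)
    ranks = {}
    for value in set(ordered):
        # positions (1-based) that this value occupies in the sorted list
        positions = [i + 1 for i, v in enumerate(ordered) if v == value]
        ranks[value] = sum(positions) / len(positions)
    return [ranks[v] for v in scores]

print(mean_ranks([10, 12, 12, 12, 17]))  # [1.0, 3.0, 3.0, 3.0, 5.0]
```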

SPSS will now create two new variables with the ranks, and if you first Split the file so that SPSS shows the
means and variance separately for the old and the young group (Data -> Split File -> Organize output by
groups), you can run Descriptives (Analyze -> Descriptive Statistics -> Descriptives) to see the mean rank
for each group separately.

Remember that before you move on to do analyses on the full sample, undo the split option!
Data -> Split file -> Analyze all cases

You first see this screen, where you have to select the option “Automatically compare distributions
across groups” (that this is the option you want is clear from the description given below in the
screen).

In the next tab, Fields, select the two sway variables as Test fields and the age grouping variable as
Groups variable.

(Note that you do not have to tell SPSS how many levels the Groups variable has: if it has 2 levels,
SPSS will run the Mann-Whitney test; if it has more levels, SPSS will automatically run the
Kruskal-Wallis test.)

In the last tab, Settings, you can either let SPSS decide which test fits your data best, or select
some options yourself. Besides the Mann-Whitney test, I would also like to see the
Kolmogorov-Smirnov Z test and the Wald-Wolfowitz test, just for illustration purposes.

Do not confuse the just-requested “Kolmogorov-Smirnov Z” test with the Kolmogorov-Smirnov test
for normality. Kolmogorov-Smirnov Z tests whether two samples were actually drawn from the same
population (much like the Mann-Whitney test) and is more powerful than the Mann-Whitney test
when the groups consist of fewer than 25 observations each, as is the case in this data set.
The “Wald-Wolfowitz runs” test tests whether,
after ranking, there are “runs of scores” from
the same group in the ranked order (e.g., rank
5-6-7-8 are all in Group 1 while rank 20-21-22-
23 are all in Group 2: note that runs of scores
go against the idea that ranks are randomly
distributed across the groups) (see Field, p.
548).

This is the research question that we were interested in:

Do older people experience more difficulty maintaining their balance while performing a reaction
time task than younger people?

4. Is this a one- or two-sided question? What are the exact hypotheses?

I calculated the Ranks for each variable as described above in the INTERMEZZO box. Here are the
Descriptives for young and old separately:

5. Looking at the means of the Ranks, what would you conclude: do older people sway more or
less than younger people?

6. Looking at the Hypothesis Test Summary table from the non-parametric test output, what do you
conclude: do older people differ from younger people with respect to forward-backward sway? And
with respect to side sway? (DON’T FORGET: is this a 1- or 2-sided test? If 1-sided, you may need
to divide the p-value by 2 if SPSS only prints the 2-sided value!!)

7. If you double click on the Mann-Whitney test in the output table, you get a lot more
information: the actual mean ranks per group, the U statistic, but also the standardized
Test statistic, i.e., the Z-statistic (Note that in the lower part of the newly opened Model
Viewer window you can select whether you want to see results for the Mann-Whitney
or other tests: see page 544 Field); handy! Why is the Z-statistic negative?

8. Are the results for the K-S Z test, and the W-W runs test in agreement with the Mann-
Whitney U test? (Again, check whether you need to divide by 2, depending on the direction of
your hypotheses and the kind of p-value that SPSS prints).

9. Calculate the effect size r from the z-statistics printed in the table with the Mann-Whitney
test results (i.e., if you double click the Mann-Whitney test in the output table). Remember: r
= Z / √N. Are these effects small, medium or large?
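The r = Z / √N computation can be sketched as below; the Z value used here is hypothetical, so read the actual standardized test statistic from your own SPSS output.

```python
import math

# Effect size r = Z / sqrt(N) for the Mann-Whitney test. The Z
# value below is hypothetical; substitute the actual standardized
# test statistic from your SPSS Model Viewer output.
def effect_size_r(z, n):
    return z / math.sqrt(n)

r = effect_size_r(-2.5, 18)  # N = 18 subjects (9 old + 9 young)
print(round(abs(r), 2))      # prints 0.59
```

By the usual guidelines (.1 small, .3 medium, .5 large), an |r| of around .59 would count as a large effect.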

Open the file learningdisability.sav

The file contains data of a study on learning disabilities. The research question was: do children
with specific learning disabilities (either in reading (RD), in arithmetic (AD), or both (RAD)) also
have lower memory capacity than children without any learning disabilities (controls)?

The dataset contains 15 variables:


Subject descriptives:
ID: subject ID number
Sex: subject’s gender (0=male,1=female)
Age: subject’s age
Group: controls, reading disabled (RD), arithmetic disabled (AD), both (RAD)
Rakit: score on children’s intelligence test the RAKIT
Reading: reading score
Arithmetic: arithmetic score

Experimental tasks:
Visual: visual memory
Spatial: spatial memory
Counting: counting span memory
Listening: listening span memory
digitF: digit span forward memory
digitA: digit span backward memory
trail_num: trail making task, numbers only
trail_numlet: trail making task, numbers and letters

10. Is this a real experiment? Why (not)?

Let us first check whether the groups are accurately matched on IQ (rakit).
First test whether the IQ data are normally distributed in the 4 groups.

Analyze -> Descriptive Statistics -> Explore…

Select ‘group’ as Factor (so that the results will be shown for all four groups separately) and ‘rakit’
as Dependent variable. Also click on Plots and select Histograms and Normality plots with tests,
and the Levene tests on the Untransformed scores. Run the analysis.

Take a look at the table with the results for the normality tests.

11. Are the IQ scores normally distributed in all four groups?

The histogram for the controls looks like this:

In statistics we call this censoring from above, which means that the scale is not normally distributed
because a lot of subjects obtained the highest score. This means that the test was not difficult
enough to discriminate between children in the highest regions. Ideally, you would want to add some
difficult IQ items to this test, and ‘extend’ the scale of the test such that the 8 subjects who now all
received the highest score [zero mistakes in the test] will obtain different scores. This is of course
impossible now, so we’ll have to make do with what we got.

We know that the data in all groups should be normally distributed for the one-way ANOVA. Let’s do a
one-way ANOVA and a non-parametric Kruskal-Wallis test and see whether the results differ (i.e.,
whether the ANOVA procedure is robust against violation of the normality assumption in 1 of the 4 groups).

Analyze -> Compare means -> One-way ANOVA…

Select rakit as dependent variable and group as factor, select Tukey as Post Hoc test, and under Options select Descriptives, Homogeneity of variance, Brown-Forsythe, and Welch. Then run the analysis.

The Brown-Forsythe and the Welch tests can be used if the homogeneity of variance assumption is violated (see Field, pp. 379-380).
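Outside SPSS, the classic one-way ANOVA and the Levene test can be sketched in Python with scipy. This is a hedged illustration on simulated, made-up data, not the exercise dataset.

```python
# Illustrative only: simulated data, not the exercise dataset.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=2)
controls = rng.normal(104, 10, 25)
rd = rng.normal(98, 10, 25)
ad = rng.normal(99, 10, 25)
rad = rng.normal(95, 10, 25)

# Levene test for homogeneity of variance across the four groups
lev_stat, lev_p = stats.levene(controls, rd, ad, rad)
# Classic (equal-variance) one-way ANOVA
f_stat, f_p = stats.f_oneway(controls, rd, ad, rad)

print(f"Levene: W = {lev_stat:.2f}, p = {lev_p:.3f}")
print(f"ANOVA:  F = {f_stat:.2f}, p = {f_p:.4f}")
```

Note that scipy does not ship the Welch or Brown-Forsythe robust ANOVA variants that SPSS offers; those live in third-party packages.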

12. Reading the descriptives table, which group has the highest IQ mean? And which group the lowest? And can the variances be considered equal across the four groups (Levene test)?

13. Looking at the F-statistic, what do you conclude: do the four groups differ with respect to
IQ? Would you draw the same conclusion based on the Brown-Forsythe and the Welch tests?

14. Take a look at the pairwise post hoc test: are the means of all groups the same?

Now let’s reanalyze the data but using the Kruskal-Wallis test.

Analyze -> Nonparametric tests -> Independent samples…

Select group as grouping variable, and select rakit as dependent variable.

SPSS will now automatically choose the Kruskal-Wallis test because the group variable has more than
2 levels.

Note that in the Settings tab, you can customize the analysis, choose the Kruskal-Wallis test, and ask for multiple
comparisons (i.e., automatic post-hoc tests). However, these are only performed if the overall test is
significant. (This is slightly annoying: although SPSS “is right” that one should generally not look at
post hoc tests if the overall effect was not significant, it now actually excludes the possibility to do so.)
You can of course still do all paired comparisons manually by selecting 2 of the 4 groups (e.g., Data ->
Select cases -> If condition is satisfied, then choose e.g. “group=1 or group=2” to compare controls
(coded 1) specifically to reading disabled children (coded 2)) and conducting a Mann-Whitney test for
every pair.
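The overall Kruskal-Wallis test plus a manual pairwise Mann-Whitney follow-up can also be sketched in Python. Again, the data below are simulated and illustrative only, not the exercise dataset.

```python
# Illustrative only: simulated data, not the exercise dataset.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)
controls = rng.normal(104, 10, 25)
rd = rng.normal(96, 10, 25)
ad = rng.normal(99, 10, 25)
rad = rng.normal(94, 10, 25)

# Overall Kruskal-Wallis test across the four groups
h, p = stats.kruskal(controls, rd, ad, rad)
print(f"Kruskal-Wallis: H = {h:.2f}, p = {p:.4f}")

# Manual pairwise follow-up, e.g. controls vs. RD
# (mirrors Data -> Select cases in SPSS)
u, p_pair = stats.mannwhitneyu(controls, rd, alternative="two-sided")
print(f"controls vs RD: U = {u:.1f}, p = {p_pair:.4f}")
```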

15. Is the result of the Kruskal-Wallis chi-square test in accordance with the results from the
ANOVA?

We will now examine whether males have smaller memory spans, as measured with the counting span
task (variable “counting”), than females.

16. Phrase the two hypotheses and state whether this is a one- or two-sided research question.

Before we go on, we need to check the normality assumptions. Do this as we did before with the IQ
scores (Analyze -> Descriptive statistics -> Explore… etc.).

17. Are the counting span data normally distributed in males? And in females? And are the
variances homogeneous?

18. Given the results for the distribution of the variables, do you choose a t-test or the non-parametric variant, the Mann-Whitney test?

Let us perform a t-test:

Analyze -> Compare means -> Independent samples t-test…

Select counting as test variable and sex as grouping variable [and define the groups, i.e., click “Define
Groups” and tell SPSS how the two groups are coded. Note that this seems silly if your variable only
has two levels, but if it has more, you can do the t-test and in this window specify which 2 of the
multiple groups you actually want to compare. Try it for example with “group” (controls, RD, AD, RAD) as
grouping variable!].

Run the analysis.

19. What is the mean counting span score for females? And for males? And what is the
difference (3 decimals)?

20. First look at the result of the Levene test, as the choice of t-test depends on this result: are
the variances homogeneous (i.e., equal)?

21. Choose the appropriate t-test. What is the value of the t-statistic, the number of degrees of
freedom, and the accompanying p-value? And what do you conclude: do males have smaller
memory spans than females? [Note that the t-test output is always two-sided, while your hypothesis may be one-sided.]

22. Using equation 9.5 on page 335, calculate the t-statistic by hand, and give its number of
degrees of freedom.

Df=
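If you want to check your hand calculation, the pooled-variance t-statistic can be computed step by step and compared against a library result. A Python sketch with simulated, made-up data (not the exercise dataset):

```python
# Illustrative only: simulated data, not the exercise dataset.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=4)
males = rng.normal(4.8, 1.2, 40)
females = rng.normal(5.2, 1.2, 45)

n1, n2 = len(males), len(females)
s1, s2 = males.var(ddof=1), females.var(ddof=1)
# Pooled variance, then the pooled-variance t-statistic
sp2 = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
t_hand = (males.mean() - females.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2

# Library version for comparison (equal_var=True gives the pooled test)
t_scipy, p = stats.ttest_ind(males, females, equal_var=True)
print(f"hand: t = {t_hand:.3f}, df = {df}; scipy: t = {t_scipy:.3f}")
```

The two t-values agree exactly, which is a useful sanity check on the hand calculation.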

23. Now do a regression analysis

Analyze -> Regression -> Linear regression…

with the counting span scores as dependent variable and sex as independent variable, and
comment on the results. How should the constant (intercept) be interpreted? And the beta
coefficient (“slope”) for sex? Does the value of the t-statistic for this b-coefficient seem
familiar? Write out the regression equation, fill in 0 and 1 for sex, and compare the results with
the means you wrote down for males and females under question 19.
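The equivalence this question hints at, namely that a regression on a 0/1-coded predictor reproduces the pooled independent-samples t-test, can be verified in a small Python sketch with simulated, made-up data (not the exercise dataset):

```python
# Illustrative only: simulated data, not the exercise dataset.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=5)
sex = np.repeat([0, 1], 40)                      # 0 = male, 1 = female
counting = rng.normal(4.8, 1.1, 80) + 0.4 * sex

# Simple regression of counting span on the 0/1 dummy
res = stats.linregress(sex, counting)
# Intercept = mean of the group coded 0; slope = difference in group means
print(f"intercept = {res.intercept:.3f}, slope = {res.slope:.3f}")
print(f"t for slope = {res.slope / res.stderr:.3f}")

# Pooled-variance t-test gives exactly the same t-statistic
t, p = stats.ttest_ind(counting[sex == 1], counting[sex == 0], equal_var=True)
print(f"independent t = {t:.3f}")
```

Filling 0 and 1 into the fitted equation returns the two group means, which is exactly the comparison the question asks you to make.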
