SPSS Guide
This document was prepared using SPSS versions 8 and 10 for the PC. Mac versions or other
PC versions may look slightly different, but most instructions here should still work.
Table of contents
1. Getting Started
   1.1 Entering data from scratch
       Defining Variables (SPSS version 8)
       Defining Variables (SPSS version 10)
   1.2 Importing data from Excel
2. Getting your data in shape
   2.1 Calculating variables
   2.2 The If… button
   2.3 Recoding Variables
       Recoding into Same Variables
       Recoding into Different Variables
       Special case: Median (or tertile or quartile) splits
   2.4 Select cases
   2.5 Merging files
       Adding cases
       Adding variables
3. Analyzing your data
   3.1 Independent Samples t-test
   3.2 Paired t-test
   3.3 Oneway simple ANOVA
   3.4 Chi square contingency test
   3.5 Correlations (simple and partial)
   3.6 Regression
   3.7 ANOVA models and GLM
       Repeated Measures
   3.8 Reliability
4. Taking a look at your data
   4.1 Checking the numbers
       Frequencies
       Tables
   4.2 Graphing and plotting
       Scatterplots
       Histograms
       Bar charts
5. Output
   5.1 Organizing your output
   5.2 Results Coach
6. Using Syntax
   6.1 The Paste function
   6.2 Creating a Session Journal
7. For more information
1. Getting started
1.1 Entering data from scratch:
You will first want to create a template into which to enter data by defining variables. This is
done differently in SPSS 8 and SPSS 10, and is the most commonly used feature that differs
between the two versions.
Defining Variables (SPSS version 8)
Type your variable name into the Variable Name box (circled in red above). Variable names
must have 8 characters or fewer. Specify the variable Type by clicking on Type (circled in green
above). Numeric is the default, but date or string are other common types. If numeric, you can
specify the number of decimal places here. Specify whether your variable is scale, ordinal, or
nominal (circled in purple above). It has to be scale if you want to do things like add and
average it or to do typical statistics like t-tests. Specify labels for your variables by clicking on
Labels (circled in blue above). I strongly recommend that you do this. Many a grad student
has come back to their data a year later and had no idea what boms47 meant. Here is the Labels
dialog box:
Specify a Variable label (i.e. tell yourself that boms47 is the “Brat-o-meter Scale, question #47,
hairpulling”). Enter this information in the Variable label box (circled in red). Then specify
value labels if appropriate. For example, entering 1 may mean the person responded with “I
never pull people’s hair”; 2 means “I pull people’s hair occasionally”; 3 means “I pull people’s
hair often”… etc. In that case, you would enter 1 in the Value box above (circled in green), enter
“I never pull people’s hair” into the Value label box (circled in blue), then click Add (red arrow).
Your new value label will appear in the box circled in purple. Do that for value=2, 3, and so on
until you have all of your values entered. Then click Continue to return to the Define Variable
dialog box.
Click OK in the Define Variable dialog box, and that variable will be created. If you want to do
a whole slew of similar ones of these (e.g. boms1 - boms50), there may be easier ways. You can
do one, and then copy and paste the syntax to create all of your variables. I’ll explain how to do
this in the Using Syntax section below.
Defining Variables (SPSS version 10)
The good news is that defining variables got much easier in SPSS version 10.
At the opening screen, you will see two tabs at the bottom of the grid (circled in red below):
You start out in the Data View tab. You can click on the Variable View tab to define variables.
Once in Variable View, you can enter a variable name in the first
column, labeled name. Here, I have entered our old boms47 into the
first column, and all of the defaults have filled themselves in:
At the red arrow, you can see all of the characteristics of your variable
that can be specified, including our old Label (meaning variable label)
and Values (meaning value labels). Again, I strongly recommend that
you use variable and value labels. If you click in the Values box for
your variable, you will see 3 little dots in a box (circled in red below).
Clicking on those dots will pop up the Value Labels dialog box (circled
in green below). You can add value labels using this dialog box in the
same way you did in version 8.
1.2 Importing data from Excel
Importing data from Excel is easy. You can type your variable names (again, 8 characters or
fewer) in the first row (see red arrow below) and enter data below that. Once you are done, save
the file as a Microsoft Excel 4.0 Worksheet. Most versions of SPSS cannot read anything
newer than Excel version 4.0. Excel will prompt you to OK some things (like that you are only
saving the active worksheet, not the whole book and that some features may be lost). If you are
dealing with ordinary data, this should be fine. When in doubt, save as the current Excel version
first as a backup.
Once you have your data in Excel 4.0 format, open SPSS. Click on File → Open → Data (red
arrow below) which will open the Open File dialog box. Under “files of type” choose “Excel
(*.xls)” (green arrow below) to show your Excel 4.0 file. Choose your file (note: it must not be
currently open in Excel or you will not be able to open it in SPSS).
Once you choose the file, the Opening File Options
box (right) will pop up. If you put variable names in
your Excel file (which I do recommend), make sure the
“Read variable names” box is checked (red arrow right).
Then click OK.
This will read in your data (green arrow below) and pop
up an output window (outlined in red below) that will
show a Log that tells you how many variables and how
many cases were read in. Check to make sure this is correct.
Variable names longer than 8 characters will be truncated. Errors may result from funny
characters in your variable names, or duplicates. In that case, SPSS will give some dummy
variable name and will report an error in the Log.
2. Getting your data in shape
2.1 Calculating variables
In some cases, you may want to compute a variable one way for one set of subjects and another
way for another set. In that case you would use the If… button (purple arrow below) to specify
the conditions under which you want to compute this variable in the way that you are specifying.
I’ll describe that below in section 2.2. When you’re done and click OK, your new variable will
appear at the end of your dataset (i.e. as the last variable).
2.2 The If… button
The If… button will show up in quite a few dialog boxes, as it does in the Compute Variable box
above. Clicking it yields the If Cases dialog box below. The first thing to do is make sure the
Include if case satisfies condition radio button is selected (red arrow below). Then place your
condition in the box below that (where the expression is circled in green). Here, I have asked
that only cases where id <= 8 be included (circled in green). So only subjects with ID#s less
than or equal to 8 will be included. The buttons circled in blue below are the operators you can
use to place your conditions.
The table below gives definitions for the operators. Right clicking on them will also give you
their definitions. For example, right clicking on ** will tell you that it is the exponential
operation. So 5**2 is 5 squared, or 25.

<     Less than                     =     Is equal to
>     Greater than                  ~=    Is not equal to
<=    Less than or equal to         &     AND
>=    Greater than or equal to      |     OR
                                    ~     NOT

So for example, if you wanted to limit cases to females (where sex=1 means female) who were
also at least 18 years old, you would enter sex=1 & age >= 18 in the box.
Once you are finished with your If condition, clicking Continue will return you to the Compute
Variable dialog box, or whatever box you were in prior to clicking the If… button.
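If it helps to see the logic of an If condition written out, here is a minimal sketch in Python. This is not SPSS syntax, just an illustration of which cases the condition sex=1 & age >= 18 would keep. The four cases and their values are made up, and Python spells the operators == and "and" where SPSS uses = and &.

```python
# Hypothetical cases; in SPSS these would be rows in the data editor.
cases = [
    {"id": 1, "sex": 1, "age": 17},
    {"id": 2, "sex": 1, "age": 18},
    {"id": 3, "sex": 2, "age": 25},
    {"id": 4, "sex": 1, "age": 30},
]


def satisfies_condition(case):
    # Python equivalent of the SPSS condition: sex=1 & age >= 18
    return case["sex"] == 1 and case["age"] >= 18


# Only cases that satisfy the condition are included.
included_ids = [case["id"] for case in cases if satisfies_condition(case)]
print(included_ids)  # [2, 4]

# The ** operator means the same thing in Python as in SPSS.
print(5 ** 2)  # 25
```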
2.3 Recoding Variables
Once you have your variables in, you may decide that you need to recode. For example, you
may need to reverse code certain variables. Clicking on Transform → Recode (as below) gives
you two options—Into Same Variables or Into Different Variables (circled in red below).
Recoding into different variables leaves your initial variable intact. Recoding into the same
variable does not.
For example, let’s say that boms46 is a reverse coded item (e.g. 1 is I am a big brat; 2 is I am a
medium sized brat; etc.) so 1 becomes 4, 2 becomes 3, and so on.
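The arithmetic behind reverse coding is simple: add the lowest and highest possible scores and subtract the old value. Here is a rough Python sketch of that idea (not SPSS syntax). It assumes a 1-to-4 response scale, which is what "1 becomes 4, 2 becomes 3" implies, and the sample responses are invented.

```python
# Assumed response range for the item (1 to 4).
SCALE_MIN, SCALE_MAX = 1, 4


def reverse_code(value):
    # 1 -> 4, 2 -> 3, 3 -> 2, 4 -> 1
    return SCALE_MIN + SCALE_MAX - value


boms46 = [1, 2, 3, 4, 2]  # made-up responses
boms46r = [reverse_code(v) for v in boms46]
print(boms46r)  # [4, 3, 2, 1, 3]
```

Keeping the result in a separate list (boms46r) mirrors recoding into a different variable: the original responses stay intact.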
Clicking on Old and New Values (above) brings up the Old and New Values dialog box below.
Entering a 2 into the Old Value box (red arrow) and a 3 into the New Value box (green arrow)
and then clicking the Add button (circled in red) will make all 2’s in the boms46 column change
to 3’s. Once you add them, they will appear in the Old → New box where the 1 → 4 already
appears. You can also change a range of numbers using the three Range options (outlined in
purple) or change all remaining values to some value. For example, you could recode all other
values to system missing by clicking All other values on the left (inside the purple square) and
system missing on the right (blue arrow) then clicking Add. Or change missing values to zeros
by clicking system missing on the left (above the purple box) and entering zero in the new value
box on the right (green arrow) then clicking Add. Don’t forget to click Add. (It’s easy to
forget). When you’re done, click Continue to go back to the Recode dialog box. Then click OK.
Recoding into Different Variables
If instead you want to keep your original variable and just make a new one that recodes the
original, use Recode into Different Variables. That dialog box is shown below. In this example,
I have clicked boms46 over to the Numeric Variable → Output Variable box (circled in red
below) and typed boms46r (my new variable name) into the Output Variable Name box (circled
in green below). Notice I have also entered a variable label into the Label box (green arrow
below). Once you do this, click the Change button (red arrow below) and the question mark in
the red circle below will change to boms46r. If you want to do more than one variable at this
stage, click over the next variable into the Numeric Variable → Output Variable box and repeat
this process of typing in the new variable name and label. You have to click the Change button
between each Variable that you want to change. Once you are done, click the Old and New
Values button (circled in blue to left). This will send you to an identical dialog box to that for
Recode into Same Variables (above). Follow those instructions to recode your variable(s) then
click Continue and OK. Your new variable will be at the end of your dataset.
Special case: Median (or tertile or quartile) splits
One common form of recoding is to divide your variable values into two groups, split at the
median (or into four quartile groups, etc.). To do this in SPSS, there is a secret function in Rank
Cases. Click on Transform → Rank Cases (circled in red below). This will bring up the Rank
Cases dialog box (below). Click over the variable(s) you want to recode (circled in green below)
then click the Rank Types button (circled in blue below).
You should leave the default of Assign Rank 1 to Smallest Value (red arrow above) unless you
want your highest values to be assigned a value of 1 in your new recoded variable. Clicking on
Rank Types (circled in blue above) will get you to the Rank Cases: Types dialog box below.
To do a median split, check
the Ntiles box (red arrow to
left) then enter 2 in the box
(green arrow to left).
Entering 3 would give you
tertiles, 4 would give you
quartiles, etc. You can also
do a simple ranking by
checking the Rank box (blue
arrow to left), but that is not
what is needed for the
median split, so it is not
checked. Click Continue to
go back to the Rank Cases
dialog box.
Clicking on the Ties button (circled in purple on the previous page) will give you options for how to
deal with ties. The default is to give them the mean of the two values, but you can also put ties
into the lower or higher category if that is what you need. Once you are done, click OK in the
Rank Cases dialog box. This will create a new variable that has the same name as your old
variable with an n at the beginning. So in this example, we created nboms45, which takes on the
value 1 if boms45 was below the median and 2 if boms45 was above the median. This variable
will be placed at the end of your dataset.
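To make the idea of a median split concrete, here is a small Python sketch. It is only an illustration of the logic, not what SPSS runs internally: the scores are made up, and putting a value that falls exactly on the median into the lower group is my assumption (in SPSS that depends on how you set Ties).

```python
from statistics import median

boms45 = [3, 7, 5, 9, 1, 8]  # made-up scores
cut = median(boms45)         # 6.0 for these data

# Assumption: a score exactly at the median goes into the lower group (1).
nboms45 = [1 if score <= cut else 2 for score in boms45]
print(nboms45)  # [1, 2, 1, 2, 1, 2]
```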
2.4 Select cases
If you want, for example, to limit an analysis to only female subjects 18 or older, or to only Time
2 data, you can do that using the Select Cases function in SPSS. Click Data → Select Cases
(circled in red below). That brings up the Select Cases dialog box. To select a subset of cases
based on some condition, click the If condition is satisfied radio button (red arrow below) then
click the If… button (blue arrow below). The default is that cases that do not meet your
condition are merely filtered (I’ll show you what that looks like in a minute). But you can also
change that so that unselected cases are deleted (see green arrow below). Clicking the If…
button takes you to our old friend the If… dialog box which you already know how to use. Set
your condition (e.g. sex=1 & age >= 18, time=2, etc) then click Continue, then OK. Cases will
be filtered (or deleted). At this select cases dialog box, you can also take a random sample of
cases. When you are done with your specific analysis and want all of your data again (assuming
you filtered and did not delete), just go to Data → Select Cases again and click the All cases radio
button.
When you filter cases, a diagonal line will go through the case
number as shown to the right (column indicated by the red arrow)
for cases that are being filtered out. That is, for cases that are NOT
selected. In this case, I selected cases if boms45 <= 2, so all 3’s
and 4’s are filtered. Any analyses I do at this point will not include
any subjects who scored a 3 or 4 on boms45. Don’t forget to
Select all cases again when you are done. Incidentally, this also
creates a variable called FILTER_$ in your dataset that takes a
value of 1 if the case is selected and 0 if it is filtered out. You can
ignore that variable if you like, but sometimes it can be useful.
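Here is a rough Python sketch of what the filter variable amounts to. The scores are made up and this is not SPSS syntax. It just shows a 1/0 flag built from the condition boms45 <= 2, and an analysis that only sees the flagged cases.

```python
boms45 = [1, 3, 2, 4, 2]         # made-up item scores
bomstot = [10, 22, 14, 30, 12]   # made-up total scores for the same five cases

# 1 = case is selected (boms45 <= 2), 0 = case is filtered out.
filter_flag = [1 if score <= 2 else 0 for score in boms45]

# While the filter is on, analyses only use the selected cases.
selected_totals = [total for total, keep in zip(bomstot, filter_flag) if keep == 1]

print(filter_flag)      # [1, 0, 1, 0, 1]
print(selected_totals)  # [10, 14, 12]
```

Nothing is deleted here: bomstot still holds all five cases, which is the difference between filtering and deleting.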
2.5 Merging files
Sometimes it is easier to enter data into multiple separate data files (Time 1 data and Time 2 data
for example, or each questionnaire in a separate data file) to keep file size more manageable.
But at some point, you may need to look at data all together—that is, you need to merge your
data files. There are two ways to merge data files—adding cases (or adding subjects) and adding
variables.
Adding cases
To add cases to an existing data file, go to Data → Merge Files → Add Cases (circled in red below).
That will pop up the Add cases: Read file window shown below. Click on the file that contains
the cases you need to add. In this case, that is boms3.sav.
Adding variables
Go into one of your data files, and click Data → Merge Files → Add Variables (circled in red
below). This will pop up the Add Variables: Read File window. Choose the (sorted) file that
has the additional variables you want to add to your current (sorted) data file. In this case that is
boms2 (green arrow below). Click OK.
This will pop up the Add Variables
dialog box to the right. The red arrow
shows that one of the id variables is
being excluded, while the rest of our
variables are in the New working data
file (blue box). The key variables box
is currently empty (outlined in green).
This is NOT what you want—this is
just how the dialog box pops up by
default. You will want to check the
Match cases on key variables in
sorted files box (green arrow to right)
then click on id (red arrow to right)
example, if you calculated a total score from a questionnaire and don’t need all of the individual
items, you can click on them in the New Working Data File box and send them (using the little
arrow button) over to the Excluded Variables box. This is a good way to clean up your dataset
so you are only looking at the variables you need. But make sure you keep your original data
somewhere so you don’t have to re-enter it.
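The idea of matching cases on a key variable can be sketched in a few lines of Python. This is an illustration only, with made-up data. It is a simplification that keeps only the cases found in the first file and fills in None where the second file has no matching id.

```python
# Two hypothetical files, both keyed on id.
time1 = [
    {"id": 1, "bomstot": 12},
    {"id": 2, "bomstot": 20},
    {"id": 3, "bomstot": 15},
]
time2 = [
    {"id": 1, "bomstot2": 14},
    {"id": 3, "bomstot2": 11},
]

# Index the second file by the key variable so each id can be looked up.
time2_by_id = {row["id"]: row for row in time2}

merged = []
for row in time1:
    match = time2_by_id.get(row["id"], {})
    combined = dict(row)
    # None stands in for a missing value when there is no matching case.
    combined["bomstot2"] = match.get("bomstot2")
    merged.append(combined)

for row in merged:
    print(row)
```

Without the key, the second file's rows would simply be pasted next to the first file's rows in order, and subject 3's Time 2 score would land on subject 2.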
3. Analyzing your data
3.1 Independent Samples t-test
OK, let’s start simple—a t-test. Are men more bratty than women? A t-test on bomstot by
gender.
compare people who scored above 10 compared to below 10 on some scale) by using the cut point
radio button (green arrow to left). I tend not to use this option. Click Continue and then click
OK in the T-test dialog box. The Options button in the T-test dialog box (above) doesn’t do
much interesting. It does allow you to change the confidence level of the confidence intervals
that the t-test spits out. The default is a 95% confidence interval, which is what most people
want.
Here is the t-test output:
The window to the left above shows an outline of all of your output. I like to rename the tests so
I can see what I’ve done. For example, I would call this T-test of bomstot by gender (rather than
just T-test). I’ll show you how to do that later in the output section. You can see that SPSS has
spit out the two categories (female and male; red arrow above), the N for each group (green
arrow above) and the mean for each group (blue arrow above) as well as the standard deviation
and the standard error. Woohoo, boys are brattier than girls according to the means, but is it
significant? Levene’s test for equality of variances (outlined in red above) is not significant, so
the variances can be assumed to be equal. In that case, you use the first line of results (in purple
above). If Levene’s test had been significant, we would use the lower line of results (in
orange above). You can see the t value, degrees of freedom, and p value in the green box above,
and the 95% confidence interval for the difference in the blue box. In this case, men and women
are not significantly different on the Brat-O-Meter Scale, t(28)=-1.529, p=.137.
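If you are curious where the t and df on the "equal variances assumed" line come from, here is a minimal Python sketch of the pooled-variance calculation. The two small groups are made up (they are not the guide's data), and the sketch stops at t and df because Python's standard library has no t distribution for the p value. SPSS reports that for you.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent-samples t statistic assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df


women = [10, 12, 14, 16]  # hypothetical bomstot scores
men = [13, 15, 17, 19]    # hypothetical bomstot scores

t, df = pooled_t(women, men)
print(round(t, 3), df)  # -1.643 6
```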
3.2 Paired t-test
To do a paired t-test in SPSS, we will use the Time 1 vs. Time 2 bomstot variables. This will test
whether people differed in brattiness between the first time point (let’s say, right before a visit
to see parents) and the second (right after the same visit). Go to Analyze → Compare Means →
Paired Samples T-Test (circled in red below). This will pop up the Paired samples t-test dialog
box below. Click on the 2 variables that you want to compare (here bomstot and bomstot2—green
arrows below) then click the arrow button (circled in blue below). This will pair those two
variables. Again, the options button only allows you to change the percentage on your confidence
interval, and the default is 95%.
You can also see that there is not a significant effect of time (or not a significant difference
between Time 1 and Time 2 boms score), t(29)=.499, p=.622 (outlined in green). The blue box,
again, shows the confidence interval of the difference.
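A paired t-test is really a one-sample t-test on the difference scores. Here is a short Python sketch of that idea with made-up Time 1 and Time 2 scores. It is an illustration only and, as before, it stops at t and df rather than the p value.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-samples t statistic computed from the difference scores."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    std_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_error, n - 1


bomstot = [12, 15, 11, 18, 14]   # hypothetical Time 1 scores
bomstot2 = [10, 14, 12, 15, 13]  # hypothetical Time 2 scores

t, df = paired_t(bomstot, bomstot2)
print(round(t, 3), df)  # 1.809 4
```

Note that df is the number of pairs minus 1, which is why 30 subjects give t(29).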
3.3 Oneway simple ANOVA
The oneway ANOVA works pretty much like an independent samples t-test. We’ll do an
ANOVA to determine whether birth order has an effect on brattiness. Go to Analyze → Compare
Means → One-way ANOVA (circled in red below). This will pop up the One-way ANOVA
dialog box. You can see I have clicked over birth order into the factor box and bomstot into the
Dependent list box (again, you can analyze more than one dependent measure at once). Here,
you do get more options. Clicking the Options button
(blue arrow below) takes you to the Options dialog box
(to right). I generally check the Descriptive box (red
arrow to right) to get descriptive statistics of the
dependent measures for my groups. You can also
check the Homogeneity of variance box (green arrow to
right) to check that assumption of the ANOVA.
You can also click on the Post Hoc button (green arrow to left), which will bring up the Post Hoc
multiple comparisons window below. There are many post-hoc techniques to choose from. Simply
check the box or boxes you want. You can change the familywise error rate by changing the value
in the significance level box (circled in red below). As for choosing a post-hoc? I generally use
Tukey; it seems like a good mix of controlling error and not being too conservative (Bonferroni is
more conservative).
dataset (or 99’s that someone entered as a missing value). Next, Levene’s test for
homogeneity of variances is not significant (circled in red above) so equal variances can be
assumed. Next is a typical ANOVA table (outlined in green above) including SS, df, Mean
Squares, F, and p. This analysis is not significant (probably because the data are completely
random). Because you do not have a significant main effect, you should stop here, but we will
look at the output from the contrasts and post-hocs anyway as a learning exercise. In real life,
you do not look at these tests if your main ANOVA is not significant. The blue box above shows
the contrast coefficients; this is just a double-check. Next you have the contrast tests.
Because Levene’s test above was not significant, you can use the first row of numbers (assume
equal variances). This table includes the contrast value (blue arrow above), the t value (purple
arrow above), df (orange arrow above), and significance (pink arrow above). In this case, the
contrast value was –3.30 and was not significant, t(27)=-.774, p=.446. Finally we come to the
multiple comparisons. In the blue box above, you can see the mean difference for each pairwise
comparison and the significance value. When a difference is significant, the mean difference is
starred. The purple box above shows the confidence intervals for the difference. These all
include zero, confirming that our differences are not significant.
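For anyone who wants to see how the SS, df, Mean Squares and F in an ANOVA table fit together, here is a minimal Python sketch. The three birth-order groups are made up and tiny, so the numbers come out round. This shows the arithmetic only and does not compute the p value.

```python
from statistics import mean


def oneway_anova(groups):
    """Return F, df between, and df within for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Between-groups SS: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups SS: how far each score sits from its own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


only_child = [4, 5, 6]   # hypothetical bomstot scores
firstborn = [6, 7, 8]
later_born = [5, 6, 7]

f_ratio, df_between, df_within = oneway_anova([only_child, firstborn, later_born])
print(f_ratio, df_between, df_within)  # 3.0 2 6
```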
3.4 Chi square contingency test
This is the question about SPSS that I have fielded more than any other question. This oft-used
test is just not where you would think. As an example, we can examine whether gender is
associated with scoring above or below the median on the bomstot variable (using our median
split nbomstot). Go to Analyze → Descriptive Statistics → Crosstabs (circled in red below). Click
Also notice that the Crosstabs: Statistics box is where you would go to perform a Kappa
reliability test (blue arrow above)—Kappa is the reliability statistic used when two raters make
categorical judgments rather than continuous ratings.
Below is the Chi-square test output. First, you’ll see a Case Processing Summary (circled in
green to left). This will pop up in many of the statistics you do. It’s good to check that you
have the expected number of cases included and are not missing large portions of data. Next is
the crosstab, or contingency table (red arrow to left). Finally the Chi-square test is reported (in
blue box to left). The Pearson Chi-square on the first line is the typical test used for data of this
sort. Notice that SPSS will warn you if you have expected cell counts lower than 5 (purple arrow
to left). This test should not be used in that case.
Note that the Chi-square model fit test is under Analyze → Nonparametric Tests → Chi-square.
This is a different test, one in which you assign expected values to cells and test the goodness of
fit of that model. The fact that these are called the same thing has tricked many an SPSS user.
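To show what the Pearson Chi-square for a contingency table is doing, here is a short Python sketch. It works out each cell's expected count from the row and column totals and also keeps track of the smallest expected count, which is what the "lower than 5" warning is about. The 2 x 2 table of counts is invented for the example.

```python
def pearson_chi_square(table):
    """Pearson chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi_square += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df, min_expected


# Hypothetical counts: rows = gender, columns = below / above the median.
crosstab = [
    [10, 5],
    [5, 10],
]

chi_square, df, min_expected = pearson_chi_square(crosstab)
print(round(chi_square, 3), df, min_expected)  # 3.333 1 7.5
```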
3.5 Correlations (simple and partial)
Simple correlations are a piece of cake in SPSS. You can do a whole slew of ‘em if you want.
Go to Analyze → Correlate → Bivariate (circled in red below). Click over all of the variables that
you want to correlate. In this case, we have age, bomstot and bomstot2 (Time 1 and Time 2
brattiness). SPSS will compute all pairwise correlations. That’s it: just click OK.
SPSS will spit
out a nice table
(see below).
Each cell has a
correlation
coefficient, 2-
tailed
significance,
and N. Each
correlation
appears twice in
the symmetrical
table, and there
are 1’s (as
expected) on the
diagonal. Easy
as pie. Nothing
significant here,
as usual.
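For the curious, the correlation coefficient in each cell can be computed by hand. Here is a minimal Python sketch using made-up numbers. It illustrates the formula only and does not produce the significance value.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = mean(x), mean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


age = [1, 2, 3, 4, 5]      # hypothetical values
bomstot = [2, 4, 5, 4, 5]  # hypothetical values

r = pearson_r(age, bomstot)
print(round(r, 4))  # 0.7746
```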
For a partial correlation, go to Analyze → Correlate → Partial (circled in red below). Send the
variables of interest into the Variables box, and the control variable(s) into the Controlling for
box and click OK.
Below you will see
the results of this
analysis. The
(symmetrical) table
reports the
correlation, degrees
of freedom, and the
2-tailed p-value
(outlined in green
below). You can
see that the partial
correlation of age
and bomstot2,
controlling for
bomstot, is a
whopping -.0335.
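With a single control variable, a partial correlation can be worked out from the three ordinary correlations. Here is a small Python sketch of that formula. The three input correlations are made-up numbers, not values from this dataset.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y controlling for one variable z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical simple correlations among x, y, and the control variable z.
result = partial_r(r_xy=0.5, r_xz=0.3, r_yz=0.4)
print(round(result, 4))  # 0.4346
```

The degrees of freedom for the test are N - 2 minus the number of control variables, which is why the partial correlation table reports df rather than N.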
3.6 Regression
The linear regression function in SPSS covers a lot of ground. Go to Analyze → Regression →
Linear (circled in red below). That will pop up the Linear regression dialog box shown below.
Enter your dependent measure (here we used bomstot) into the Dependent box (red arrow
below). Enter your independent variable(s) (here age) into the Independent(s) box (green arrow
below). You can enter more than one independent variable here. Choose a regression method if
you are using more than one independent variable using the pulldown menu (blue arrow below).
Enter is the default and is standard linear regression but you can also use stepwise regression,
either forward or backward, enter (and remove) variables in blocks using the Previous and Next
buttons, etc. This is a very versatile dialog box. Of more common use are the Statistics, Save,
and Options buttons.
The Statistics button (outlined in green to left) brings up the Statistics window below. Checking
the Estimates box (red arrow below) gives you estimates of your regression coefficients (or
betas). Checking the Model fit box (green arrow below) gives you an R² for the regression
model. Checking R squared change will tell you the change in R² if each variable (when you have
more than one independent variable) is removed. Finally, checking casewise diagnostics (purple
arrow below) will give you information on outliers outside a range that you specify (here 2
standard deviations).
Clicking the Save button (outlined in purple above) allows you to save residuals of various kinds
from your regression in a column in your dataset (outlined in red to left below). This is useful in
examining residuals to look for patterns and in computing corrected means. Finally, clicking
the Options button (outlined in orange above) allows you to remove the constant from your
regression (forcing it to go through zero) by unchecking the Include constant box (orange arrow
to right below). It also gives some options for Stepwise regression.
There are clearly far too many regression
options for this guide to explicate all of
them, but again, right clicking on most
options in SPSS will give you more
information.
To the right is output from a simple but
typical SPSS regression analysis. The R
and R² are reported in the Model summary
(red and green arrows to right
respectively). An ANOVA table for the
regression is also reported (outlined in
blue to right). This tells whether your
regression model as a whole is predicting a
significant amount of variance. Finally the
Beta coefficients and t-tests for them are
reported in the orange box to right. Here,
the only thing that is significant is the
Constant (or the intercept). Don’t get
excited boys and girls, that doesn’t help
you get published.
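To connect the output to the underlying arithmetic, here is a minimal Python sketch of a regression with one independent variable. It computes the constant, the unstandardized slope, and R² from made-up data. It leaves out the t-tests and the ANOVA table, and it is an illustration only, not how SPSS does the work internally.

```python
from statistics import mean


def simple_regression(x, y):
    """Least-squares intercept, slope, and R squared for one predictor."""
    mean_x, mean_y = mean(x), mean(y)
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual SS / total SS).
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return intercept, slope, 1 - ss_residual / ss_total


age = [1, 2, 3, 4, 5]      # hypothetical predictor values
bomstot = [2, 4, 5, 4, 5]  # hypothetical outcome values

intercept, slope, r_squared = simple_regression(age, bomstot)
print(round(intercept, 2), round(slope, 2), round(r_squared, 2))  # 2.2 0.6 0.6
```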
you to assign some of your “covariates” as categorical. Output will also include a Chi-square
goodness of fit test (to test the goodness of your prediction) and a table of predicted values. A
full treatment of logistic regression is beyond the scope of this guide, but it is fairly
straightforward to use the SPSS functionality if you read and understand a chapter or so on the
statistical test that you are performing.
main effect of age, and also the interaction effect of age by gender (orange arrow below to right).
To do this just click on both age and gender, then while both are highlighted, click the arrow
button (outlined in purple below to left). Once you have the custom model you want, click
Continue.
Finally, the Options button in the Univariate window (blue arrow on previous page) allows you
to examine multiple comparisons in your factors, request homogeneity tests (green arrow below),
etc. Here, we have requested descriptive statistics (red arrow below) and LSD multiple
comparisons for the family variable (purple arrow below). By using the pulldown menu (blue
arrow below) you can change the comparison technique to Bonferroni or Sidak. This window
also allows you to do such things as report observed power, effect size estimates, etc.
Below and on the following
page, you will see the output
from this large analysis. First,
below and to the left, the output
simply reports your between
subjects factors (red arrow below
to left). You can double-check
your Ns here. Next, you have the
descriptive statistics that you
requested in the Options box to
left (green arrow below to left).
This presents a nice table of
means, suitable for later
graphing. Next, below to the
right, you have the Levene’s test
for equality of variances
(outlined in blue below to right)
that you also requested in
Options. Because this test is not
significant, you can assume your
equal variance assumption was met. Next you
have an ANOVA table (outlined below to
right in orange) that reports F, df, p-value etc.
for all of the main and interaction effects in
your custom model.
To the left, you’ll see the
results of the contrast we
requested on the birth order
variable. Level 1 (only child)
is not different from Level 2
(firstborn) (see red arrow to
left) and Level 1 is not
different from Level 3 (later
born) (see green arrow to
left). The overall test results
for this contrast indicate it is
not useful (see nonsignificant
p circled in orange to left).
Next, we have the estimated
marginal means for birth
order controlling for our
covariate—age (outlined in
pink to left). SPSS did also
spit out pairwise comparisons
for the birth order variable,
but that output looks identical
to the pairwise comparisons
we produced in the simple
oneway ANOVA example so
we will not go through them
in detail here.
Repeated measures
To give a full example of the functionality of the Repeated measures GLM, I have added 4 new
variables to our dataset. They are: bomsfam1, bomsfam2, bomsfrd1, and bomsfrd2. These
assess the family and friend subscales of the BOMS scale at Time 1 and Time 2. These will help
me to show an example of a fully crossed within-subjects design.
To run a repeated measures ANOVA, go to
Analyze → General Linear Model → Repeated
To the left is the
first page of
output from the
repeated measures
GLM. This
output can be very
confusing. First
you have a table
of your within-
subjects factors
(red arrow to
left). Next you
see your between
subjects factor(s)
(green arrow to
left). Next is a
large and scary-
looking table of
Multivariate tests
(orange arrow to
left). In most
cases, you can
actually ignore
this table. The
multivariate tests
are not necessarily
the tests you need
to look at,
although they are
often equivalent to
the within- and
between-subjects
tests later. Next,
something called
Mauchly’s test of
Sphericity will
print out. In this
example, there
were not sufficient
degrees of
freedom to do this
test. If Mauchly’s
test is significant,
you should NOT
use the
35
“Sphericity Assumed”
row in your ANOVA
table. (red arrow to
right). Otherwise, in
most cases, you can
assume Sphericity. In
fact, in most cases, all
rows within a cell of
this table will look the
same. This table also
gives information on
the error terms for
each group of tests—
most importantly, the
MSE for these tests
(green arrows to
right). Next, SPSS
prints out tests of
within-subjects
contrasts (red arrow
on next page). It does
this even if you don’t
request it, and uses
linear trend contrasts
as a default. These
tend not to be useful
to most people. You
can ignore this table
too. Finally, you get
to your between
subjects effects
ANOVA table (purple
arrow on next page).
You can see that Repeated measures GLM outputs quite a bit of material. You will probably
want to tidy this output up a little, which will be demonstrated in the Output section of this guide.
You can also see that we have a significant 3-way interaction in these data
(subscale*time*family above), thus showing that Type I error will give you a significant result
every so often even when nothing is going on.
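To put a rough number on "every so often", here is a minimal sketch in Python (not SPSS syntax). It assumes a design like this one, with two within-subjects factors and one between-subjects factor, which gives seven effects in total. It also assumes the seven tests are independent and each run at alpha = .05, which is a simplification, so treat the result as a ballpark figure. The function name is made up for illustration.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# Three factors give 3 main effects + 3 two-way interactions
# + 1 three-way interaction = 7 tests (assumed design, see text above).
NUM_EFFECTS = 7

rate = familywise_error_rate(NUM_EFFECTS)
print(f"Chance of at least one spurious effect: {rate:.3f}")
```

Under those assumptions the chance of at least one spuriously significant effect is roughly 30%, so a lone unexpected 3-way interaction deserves some skepticism.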
3.8 Reliability
To compute a reliability coefficient, go to Analyze → Scale → Reliability Analysis. This will
pop up the Reliability Analysis dialog box below. Click over all of your items or coders (here,
boms1-boms10) into the Items box. Make sure your Model is set to Alpha (orange arrow
below). You can also set this Model to split-half or some other forms of reliability. If you like,
you can press the Statistics button (outlined in green below). That will take you to a dialog box
in which you can do item analysis (e.g. get the alpha with each item of the scale deleted to see if
any items are pulling your alpha down, etc.). Otherwise, just press OK to see your alpha.
Easy as pie: you can see the Alpha in the simple output below (red arrow below). Generally an
alpha of .7 or higher is considered acceptable.
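If you are curious what is behind that number, here is a minimal sketch of the standard Cronbach's alpha formula in Python (not SPSS syntax). The item scores are invented for illustration, the function name is hypothetical, and the sketch assumes there are no missing values.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    items: one list of scores per item, with respondents in the same
    order in every list. Assumes no missing values.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical scores for three items from five respondents.
example_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(example_items), 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why dropping a poorly correlated item can raise it.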
4. Taking a look at your data
4.1 Checking the numbers
One way to get a simple look at your data is to look at frequencies or tables. Tables can give you
an idea of means or medians, etc. for your groups. Frequencies can alert you to outliers or data
entry errors. Perhaps this section should have come before data analysis, but I can never resist
getting a peek at significance levels before I tease myself with means and pretty graphs. I’m
weird that way.
Frequencies
The output is fairly straightforward. It gives the observed values of your variable (red arrow to
right), the observed frequency (green arrow to right), and the percentage of observations with
that value (blue arrow to right). The Valid percent column (purple arrow to right) gives the
percentages based on only non-missing observations (in this case that is the same). Finally you
get the cumulative percent (orange arrow to right). Then you can see the histogram that we
requested, clearly showing one outlier.
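To make the columns concrete, here is a minimal sketch in Python (not SPSS syntax) that builds the same kind of table by hand. The scores are invented, with None standing in for a missing value, and the function name is hypothetical. The sketch assumes the cumulative percent is accumulated from the valid percent.

```python
from collections import Counter


def frequency_table(values):
    """Rows of (value, frequency, percent, valid percent, cumulative percent).

    None marks a missing observation. Percent uses all observations;
    valid percent and cumulative percent use non-missing ones only.
    """
    valid = [v for v in values if v is not None]
    if not valid:
        return []
    counts = Counter(valid)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        freq = counts[value]
        percent = 100.0 * freq / len(values)
        valid_percent = 100.0 * freq / len(valid)
        cumulative += valid_percent
        rows.append((value, freq, percent, valid_percent, cumulative))
    return rows


# Hypothetical data: one missing value and one suspicious entry (40).
scores = [1, 2, 2, 3, 3, 3, None, 40]
for value, freq, pct, valid_pct, cum_pct in frequency_table(scores):
    print(f"{value:>5} {freq:>3} {pct:6.1f} {valid_pct:6.1f} {cum_pct:6.1f}")
```

A value that appears once, far from the rest, stands out in the last row of a table like this just as it does in the histogram.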
Tables
Click the Statistics button (outlined in green to left) to choose which statistics will go into
the table. Below you can see we have selected mean and standard error of the mean.
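For reference, the two statistics selected here can be sketched in a few lines of Python (not SPSS syntax). The standard error of the mean is the sample standard deviation divided by the square root of the group size. The gender codes and bomstot scores below are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_and_sem_by_group(pairs):
    """Map each group to (mean, standard error of the mean).

    pairs: iterable of (group, score). The SEM is undefined for a
    group with a single case, so it is reported as NaN there.
    """
    groups = defaultdict(list)
    for group, score in pairs:
        groups[group].append(score)
    table = {}
    for group, scores in groups.items():
        if len(scores) > 1:
            sem = stdev(scores) / sqrt(len(scores))
        else:
            sem = float("nan")
        table[group] = (mean(scores), sem)
    return table


# Hypothetical (gender, bomstot) observations.
data = [("female", 31.0), ("female", 35.0), ("female", 30.0),
        ("male", 28.0), ("male", 33.0), ("male", 29.0)]
for group, (m, sem) in sorted(mean_and_sem_by_group(data).items()):
    print(f"{group:<8} mean={m:.2f} sem={sem:.2f}")
```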
4.2 Graphing and plotting
OK, it’s pretty picture time. You can use scatterplots to get an idea about the relationship
between two variables, histograms to get an idea about the distribution of your variables, and bar
charts to help interpret interactions or to show your results to your friends and family (I include
grant reviewers in this category).
Scatterplots
To create a scatterplot, go to Graphs → Scatter (circled in red below). This will pop up the
Scatterplot dialog box in which you select a style of scatterplot. A simple scatterplot (red arrow
below) will serve most people’s purposes. Choose your style then click Define. This will pop
up the Simple Scatterplot dialog box below. Choose your X and Y axes from your variable list
(green arrows below). Click on Titles (outlined in blue below) to add titles to your scatterplot.
You will probably not need to click on Options (outlined in orange below).
Histograms
We saw one way to create histograms using the Frequencies function in the last section. You can
also create them another way. Go to Graphs → Histogram (circled in red below). This pops up
the histogram dialog box (to left). Click over the variable you want to graph. You can click on
the Titles button to add titles. You can check the Display normal curve box (green arrow to left)
if you want a normal curve superimposed on your histogram.
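A superimposed normal curve is usually a normal density with the sample mean and standard deviation, rescaled so that it lines up with the bar counts. Here is a minimal sketch of that idea in Python (not SPSS syntax). It shows one common way to scale the curve and is not necessarily what SPSS does internally. The data, bin edges, and function name are invented for illustration.

```python
from math import exp, pi, sqrt
from statistics import mean, stdev


def normal_curve_counts(values, bin_edges):
    """Height of a fitted normal curve at each bin midpoint, in counts.

    The density is multiplied by N times the bin width so that it is
    on the same scale as the histogram bars.
    """
    n = len(values)
    mu, sigma = mean(values), stdev(values)
    heights = []
    for low, high in zip(bin_edges, bin_edges[1:]):
        midpoint = (low + high) / 2
        z = (midpoint - mu) / sigma
        density = exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))
        heights.append(n * (high - low) * density)
    return heights


# Hypothetical scores and bin edges.
scores = [1, 2, 3, 4, 5]
edges = [0, 2, 4, 6]
print([round(h, 2) for h in normal_curve_counts(scores, edges)])
```

Comparing these heights with the observed bar counts is a quick way to see where a distribution departs from normal.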
Bar Charts
below). Note that the Other summary function radio button must be clicked in order to create
this kind of bar chart. You could, instead, do a bar chart on number of cases, or percentage,
using one of the other radio buttons. You can change the summary function from mean (the
default) to median or some other function by clicking the Change summary button (purple arrow
below). Again, you can add titles by using the Titles button. In this case, I do generally click the
Options button and deselect (uncheck) the Display groups defined by missing values checkbox.
If you don’t do this, you will get an extra group for anyone who is missing values in your dataset
and it gets in the way, in my opinion. Once you are done, click OK to see your bar chart.
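To see what that checkbox changes, here is a minimal sketch in Python (not SPSS syntax) of a bar chart of means with and without a group for cases that are missing the grouping variable. The birth order labels and scores are invented, None stands in for a missing value, and both function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_means(records, include_missing_group=False):
    """Mean score per group; None marks a missing group or score."""
    groups = defaultdict(list)
    for group, score in records:
        if score is None:
            continue  # a missing score cannot contribute to a mean
        if group is None and not include_missing_group:
            continue  # drop the extra "missing" bar
        groups[group].append(score)
    return {group: mean(scores) for group, scores in groups.items()}


def text_bar_chart(means, width=30):
    """One text line per group, with bar length proportional to the mean."""
    if not means:
        return []
    top = max(means.values())
    lines = []
    for group in sorted(means, key=str):
        bar = "#" * round(width * means[group] / top) if top > 0 else ""
        lines.append(f"{str(group):<12}{bar} {means[group]:.2f}")
    return lines


# Hypothetical (birth order, score) records; one case has no birth order.
records = [("only", 30.0), ("only", 34.0), ("firstborn", 28.0),
           ("laterborn", 33.0), (None, 25.0)]
print("\n".join(text_bar_chart(group_means(records))))
```

Calling group_means(records, include_missing_group=True) adds a bar labeled None, which is the extra group you get if you leave the checkbox ticked.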
And here is your completed bar chart. Line charts and other types of graphs are equally simple to
create, so I will leave it to you to play around with the rest of those.
5. Output
5.1 Organizing your output
As we mentioned before, some of these analyses spit out large amounts of output that you don’t
really need. In addition, a happy day of data analysis can leave you with more tests than you can
handle, so keeping things organized is the goal of this section.
We have been kind of ignoring the lefthand side of the output window, the organizational part.
You can see in the output to right that it is hard from the output window to know exactly what
analyses were done. The first big help is to rename the tests. Instead of T-test, report WHAT the
t-test was on. You can also click on the little minuses to temporarily hide analyses. Finally, you
can see in the output window above that there are Notes whose icons look like a closed book
rather than an open book. These are hidden sections of output. They will remind you exactly what
analysis you are looking at, whether a filter was in place, etc. You can unhide these notes (or
hide any visible output component) by double clicking on them. To rename a component, do not
double click on it. Rather, click on it twice, slowly, to highlight the name so that you can
change it. Below you will see a much tidier example of the same output, in which we have hidden
the Graph and renamed all of the components.
Still, if you print out your output, none of these pretty organizational things will show up. So
you need to incorporate some organization into the right side of the results window. Double
clicking on any element in the results window allows you to edit it. For example, double
clicking on the t-test title (red arrow above) will allow you to edit that title to read T-test of
bomstot by gender, or whatever is helpful to you. Double clicking on charts and graphs will give
you options to change them, add titles, change the axes, etc. Double clicking on output tables
will allow you to go in and change numbers, or copy and paste the cells out into Excel or some
other program.
5.2 Results Coach
To the left is the Results Coach. Simply hit the Next button (green arrow to left) to cycle
through the information given by the coach. This is a very helpful feature.
6. Using syntax
There are two simple ways to start using syntax. Either you can save a specific analysis by using
the Paste function, or you can log your entire session in a Session Journal.
All of the functions that you can use in SPSS to compute variables, do statistics, and create
graphs have a little button near the Cancel and OK buttons called Paste. Here is an example
from the Univariate ANOVA case (red arrow below). Hitting Paste instead of OK will paste the
syntax associated with the action you are about to perform into a syntax window (which will pop
up automatically). Below you will see the syntax associated with this analysis. To run this
syntax, highlight the part you wish to run (all of it in this case) and then hit the Arrow button
(orange arrow below).
Actually, SPSS has been creating a session journal, a kind of log file, every time you use SPSS.
But it has been putting it in a temporary directory and probably overwriting it. Go to Edit →
Options (circled in red below). In the Options window, go to the General Tab (it will probably
come up by default). Outlined below in green, you will see the Session Journal Options. If the
box for Record syntax in journal is not checked, check it. You will see that right now, my syntax
has been recorded in C:\WINNT\TEMP\spss.jnl. You can click the Browse button in the green
box to choose a file or directory that you’d like to save your syntax into. Decide whether you
want to append to the file or overwrite it each time you begin a new session. Then click
OK to have this Option take effect. This will save all of the syntax for your entire session into a
file that you choose. You can then go in and highlight parts to run them again at a later time, or
else simply keep the syntax as a record of your analyses.
can help you to find more information on specific kinds of analyses. Finally, you can contact
technical support assuming you are using a licensed copy of SPSS. Go to whoever handled the
licensing and ask for the tech support number, or else seek out the technical support person in
your organization. If you have any questions about this guide, please e-mail me at
pam@psych.stanford.edu.