2019-2020
Version 3
Contents
Downloading the data file
SPSS Settings
SPSS Windows
SPSS syntax
Opening an SPSS data file from the syntax
Good data management
Frequency table
Cross tabulation
Means
Chi-square
One sample t-test
Independent samples t-test
Correlation (Pearson)
Scatterplot
Regression
Cronbach’s alpha
Declaring user missing values
Renaming variables
Transforming and generating variables using recode
Transforming and generating variables using compute
Making a scale using compute
Dummy coding examples
Variable labels
Value labels
Sub-setting data
Splitting the data in subgroups
SPSS guide for Research Methods and Skills for Premaster - 2019-2020
SPSS Settings
We recommend adjusting the SPSS settings as follows:
Go to Edit > Options.
In the tab “General”, tick the box “only open one dataset at a time”.
In the tab “Pivot tables”, select the TableLook “compact”.
In the tab “Output”, set all to “names and labels” / “values and labels”.
In the tab “Viewer”, tick “Display commands in the log”.
Click “Apply”.
If you are working on your own laptop, you only have to do this once.
If you are working on a computer lab computer, you will need to adjust this at the start of each session.
SPSS Windows
When working with SPSS you will use three types of windows:
1. Main window
2. Syntax
3. Output
Open SPSS.
When you open SPSS, you will at first see only the main window.
The main SPSS window consists of two tabs, the ‘Data view’ and the ‘Variable view’.
SPSS syntax
In SPSS the commands are stored in a “syntax” file. This is the most important file. If you work with code, this is the only file you need to save at
the end of your session.
There are two main ways of working with statistical programmes such as SPSS. The first is to use the drop-down menus and click on whatever operation
you would like to execute (just like you would do in Word or Excel). The second is to use commands. These commands are the language of SPSS. The
rules of this language are called its “syntax”.
There are many advantages to using commands over drop-down menus. You can easily fine-tune settings and repeat a range of similar operations much
faster. Most importantly, commands allow you and others to trace what you have done. This is especially useful if you want to go back to an analysis or
recoding that you did a week or more ago: you do not have to remember what you did, you can just see it.
If you are working with several people on the same dataset, you can share your syntax file, and if you have a problem you can email the file to
someone so they can try to help you. Once you get the hang of it, it is also faster to use syntax.
While you can get SPSS to generate the code for you by using the ‘paste’ option in the menus, this approach does not help you understand what the
code means. Pasted code is also much longer than code you write yourself.
Remember to end an SPSS command with a full stop ‘.’. Otherwise SPSS does not understand that the command is finished and will not execute it. (Note: in
newer versions of SPSS you can also end a command with an empty line.)
It is highly recommended to write comments above each command that explain what your file and code are for. Comments
should start with ‘*’ and end with ‘.’
If you do this correctly the comment will turn grey.
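As a small illustration of a commented command (a sketch only: the comment text is made up for this example, and polintr is the ‘interest in politics’ variable used later in this guide):

```spss
* Request a frequency table for interest in politics.
freq polintr.
```

The comment line starts with ‘*’ and ends with a full stop, so SPSS skips it and only runs the ‘freq’ command.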
This guide covers examples of the commands most frequently used in this course.
The examples use the following placeholders:
file-path: replace this with the file path you are using.
filename: replace this with the name of the file you are using, or the name that you want to give to the file.
varlist: replace this with the names of one or more variables.
var: replace this with the name of one variable.
value(s): replace this with one or more numbers.
At the top of your files, make a comment on what the file is for.
Save your syntax file in your folder and give it a clear name (so not ‘syntax1’ but for example ‘OM-20180418’).
When all changes are saved, the disk icon turns grey.
You are strongly advised to save your syntax frequently (by clicking on the disk icon) during your SPSS session. This prevents you from
losing (a lot of) your work.
Opening an SPSS data file from the syntax
Command:
get file="file-path/filename".
It is important to put the file path and name between quotation marks (" " or ' ') so that SPSS knows where the file path starts and ends, and does not
get stuck on any spaces.
But how do you find the file path?
Go to Windows Explorer and open the folder in which you have saved your data file. Click in the address bar at the top; this will show you the file path
of your data file.
Copy this file path and paste it in your syntax, followed by the name of your file (I recommend also copying this from the Explorer). Make sure the
file name ends with ‘.sav’.
To run your command, select the command line in the syntax file and either use ctrl+r or click on the green arrow in the command ribbon.
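A filled-in version of the command could look like the sketch below. The folder and file name are hypothetical placeholders, not the actual location or name of the course data file; replace them with your own.

```spss
* Open the data file (hypothetical folder and file name).
get file="C:/Users/yourname/Documents/SPSS/ess.sav".
```

A path copied from the Windows Explorer address bar uses backslashes instead of forward slashes; that form can be pasted between the quotation marks in the same way.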
Open the European Social Survey (ESS) file via your syntax.
An output window should pop up with the ‘get file’ command (if not, check your SPSS settings). You should now see data and variables in the main
window.
In the Data view, each row represents a case (most commonly a respondent of a survey) and each column represents a variable.
In the Variable view you can see information on the variables in the data set:
• Name: the variable name. This is what you use in the code to refer to a variable.
• Type: indicates whether the variable is entered as numbers (numeric) or text (string). Note that a nominal variable can be entered as
numeric; for example, the nominal variable gender can be entered as 1=male and 2=female.
• Label: short description of the variable, usually the question text or a summary thereof.
• Values: for numeric variables you can find the labels for the values here.
• Missing: information on user-defined missing values. These are the codes for answer categories in the survey, such as ‘don’t know’
and ‘refusal’, that should not be included in analyses.
Frequency table
Command:
freq varlist.
Example:
Request a frequency table for the variable ‘interest in politics’:
freq polintr.
You can see that the missing values have already been declared by the data producers (as they are listed under ‘missing’ rather than under ‘valid’).
Request a frequency table for the gender of the respondent. The name of this variable is ‘gndr’.
Cross tabulation
Command:
crosstab varlist by varlist /cell col row count.
The first variable you list will make up the rows in the table and the second variable the columns.
You can add one or more of the following options:
col = column-wise percentages
row = row-wise percentages
count = absolute numbers
These options need to be specified after “/cell” because this subcommand refers to what SPSS should display in the cells of the cross tabulation.
Example:
Request a crosstabulation between the variable ‘interest in politics’ (polintr) and ‘main activity in last 7 days (recoded)’ (mnactic):
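Following the command template above, the request could be written as in this sketch. Putting polintr in the rows and mnactic in the columns is an assumption; swap the two variable names to turn the table around.

```spss
* Crosstab with interest in politics in the rows and main activity in the columns.
crosstab polintr by mnactic /cell col row count.
```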
You should see this in your output window (see next page):
Request a crosstabulation between “interest in politics” and “gender”. To see whether there is a difference in interest in
politics by gender, add percentages to your table. What do you conclude?
Means
Command:
means varlist.
Example:
Request the means for the variable ‘interest in politics’:
means polintr.
Chi-square
Command:
crosstabs
/tables = var by var
/statistic = chisq.
Example:
To test whether there is a relation between the variable ‘interest in politics’ and ‘main activity in last 7 days (recoded)’:
crosstab
/tables= mnactic by polintr
/statistic=chisq.
The test meets the assumptions of chi-square for minimum expected cell count (5). The chi-square test is significant at p<.001 (top row);
there is a significant relationship between main activity and interest in politics.
One sample t-test
Example:
Test whether Europeans are, overall, satisfied with the way democracy works (stfdem, measured on a scale from 0 ‘extremely dissatisfied’ to
10 ‘extremely satisfied’). Here we test whether mean satisfaction is above 5.
t-test
/testval=5
/variables=stfdem.
Because the command did not specify a confidence interval, SPSS presents the results for the default confidence interval (95% C.I.). The mean score in
the ESS dataset is 5.27. The difference between that mean and the value we test against (5) is .265 (rounded to .27). This difference is significant at
p<.001 (the p-value is listed under ‘Sig (2-tailed)’).
Independent samples t-test
Example:
Test whether women (respondents with a score of 2 on the variable gndr) are less interested in politics than men (respondents with a score of 1 on
the variable gndr).
t-test groups=gndr (1 2)
/variables= polintr.
The output shows that men have a lower mean score (2.39) than women (2.64); men in the sample are more interested in politics (on the interest in
politics variable a higher score means less interest). Levene’s test is significant (Sig .00: p<.001), so you have to look at the output of the bottom row
(Equal variances not assumed). The t-test has a p-value (Sig (2-tailed)) of p<.001; women are significantly less interested in politics than men.
Correlation (pearson)
Command:
correlations
/variables=varlist.
Example:
Is there a relationship between trust in the national parliament (trstprl) and in the European parliament (trstep)?
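Filling the two variable names from this example into the command template gives the following sketch:

```spss
* Pearson correlation between trust in the national and the European parliament.
correlations
/variables=trstprl trstep.
```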
Both variables are measured on a scale from 0-10, with higher scores signalling more trust. The correlation between the two is .55, with p<.001.
There is a significant positive relation between trust in the national and the European parliament.
Scatterplot
Command:
GRAPH
/SCATTERPLOT(BIVAR)=var WITH var.
A scatterplot is only useful if at least one of the variables has a wide range. Otherwise you just get rows of dots.
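As an illustration, the sketch below plots two variables that appear elsewhere in this guide. Pairing age (agea) with years of fulltime education (eduyrs) is an assumption made for this example only, chosen because both can take many different values.

```spss
* Scatterplot of age against years of fulltime education.
graph
/scatterplot(bivar)=agea with eduyrs.
```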
Regression
Command:
regr
/dep= depvar
/enter= varlist
/descriptives.
Example:
The effect of age (agea) on interest in politics (polintr), controlling for years of fulltime education (eduyrs)
Reg
/dep= polintr
/enter=agea eduyrs.
Age and education explain 9.6% of the variation in political interest (R-square is .096).
Controlling for education, for each year increase in age, the political interest score decreases by .010 (look at the column unstandardized coefficients, B).
This effect is significant with p<.001 (look at the Sig column): there is a significant positive relation between age and interest.
(Remember: a lower score on the political interest variable means more interest.)
Cronbach’s alpha
Command:
rel /variables=varlist /sum=total.
Example:
Examine whether the variables on trust in different institutions form a reliable scale of institutional trust.
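A sketch of what this command could look like is given below. The five item names are an assumption: they are the trust items that this guide later combines into the scale ‘poltrust’, and they may not be the exact items behind the output discussed here.

```spss
* Reliability analysis of five trust items (item selection assumed).
rel /variables=trstprl trstlgl trstplc trstplt trstprt /sum=total.
```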
The reliability analysis is conducted only with cases with no missing values on any
of the 5 items (variables) listed in the command, leaving an N of 44387.
The Cronbach’s alpha is .885 which is high.
The column ‘Cronbach’s alpha if item deleted’ in the bottom output table shows
that removing any of the items would decrease Cronbach’s alpha.
If this column suggests that Cronbach’s alpha would improve considerably after
removing an item, you can rerun the command without the ‘bad’ item. Never
remove more than 1 item at a time.
Declaring user missing values
Command:
missing values varlist (values).
In ESS, the data producers have already declared (almost) all missing values. This will not always be the case.
Therefore you should always inspect your variable in a frequency table before using it in any transformation or analysis. If you see that the
missing values have not yet been declared (because values such as 99 or ‘don’t know’ are listed under ‘valid’ rather than under ‘missing’), you can do this with the
‘missing values’ command.
Example:
Let’s say you have a variable z1 in which 88 stands for ‘don’t know’ and 99 for ‘refusal’. To inform SPSS that these are (user) missing values, you
should type and run the code:
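Filling the variable name and the two codes from this example into the command template gives:

```spss
* Declare 88 (don't know) and 99 (refusal) as user missing values for z1.
missing values z1 (88, 99).
```

Running a frequency table for z1 afterwards should show both codes listed under ‘missing’.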
Renaming variables
Command:
rename variable oldname=newname.
Example:
Rename the variable ‘trust in parliament’ (variable name = trstprl) to ‘trustpar’:
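With the two names from this example filled in, the command could be written as in the sketch below. It uses the parenthesised form of the rename; for a single variable this should be equivalent to the form in the command template.

```spss
* Rename trstprl to trustpar.
rename variables (trstprl=trustpar).
```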
Only the command will appear in the output; there is no other output to show.
The name of the variable has now changed in the dataset. If you use the old name in your commands, SPSS will return an error message.
Transforming and generating variables using recode
Command:
recode varlist (oldvalue1=newvalue1) (oldvalue2=newvalue2) into newvariablename.
Example:
Reverse the coding of the variable ‘interest in politics’ (polintr) so that a higher score means more interested, rather than less interested. Name the new
variable ‘polinterest_rev’.
Only the command will appear in the output; there is no other output to show.
Check whether the new variable was generated correctly by comparing it to the source variable in 2 ways:
1) Comparing the number of missing values
2) Comparing the coding (with a reverse coding, all values should be on the diagonal of the crosstab)
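The reverse coding and the two checks could be sketched as follows. This assumes that the source variable is polintr and that its valid values run from 1 to 4, which matches the four value labels given for polinterest_rev further on in this guide.

```spss
* Reverse the coding so that a higher score means more interest.
recode polintr (1=4) (2=3) (3=2) (4=1) into polinterest_rev.
exe.
* Check 1: compare the number of valid and missing cases.
descr polintr polinterest_rev.
* Check 2: compare the coding of the old and the new variable.
crosstab polintr by polinterest_rev.
```

Values that are not listed in the recode, such as declared missing values, are not copied and end up as system missing in the new variable.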
Transforming and generating variables using compute
Example:
Generate the variable age from information on the year of birth and the year of survey.
compute age= inwyys- yrbrn.
exe.
After running the ‘compute’ line SPSS may indicate that ‘transformations are pending’.
SPSS will finish the transformation once you run another command.
You can also force SPSS to execute the command by running the line ‘exe’ (short for execute).
You can check this variable by requesting an excerpt from the dataset, for example the first 15 rows:
list age inwyys yrbrn /cases from 1 to 15.
SPSS displays the values of the three variables in the list for rows 1 to 15 of the dataset. This allows you to see whether you used the correct
formula. A person born in 1982 was indeed 34 at the time of the survey in 2016.
Making a scale using compute
Combining items into a scale can best be done with ‘compute….mean’ because this command takes missing values into account.
Example
Making a scale of political trust using the items trstprl trstlgl trstplc trstplt trstprt.
All items are measured on the same 0-10 scale. If you want to combine items measured on different answer scales you should first standardize them,
before combining them into a scale (this is because a score of ‘2’ means something different on a scale from 1-3 than on a scale from 0-10).
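For the five trust items listed above, the scale could be computed as in this sketch. The name ‘poltrust’ is the one used in the checks that follow.

```spss
* Mean of the valid answers on the five trust items.
compute poltrust=mean(trstprl, trstlgl, trstplc, trstplt, trstprt).
exe.
```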
Check the newly generated variable by exploring the range (in this case, values should remain between 0 and 10, because that is the range of the
variables going into the scale) and by looking at a list.
descr poltrust.
list poltrust trstprl trstlgl trstplc trstplt trstprt /cases from 1 to 20.
As you may be able to see in the output, if a respondent answered fewer than 5 of the items, the score on ‘poltrust’ is based on the mean score on
the items for which they provided a valid answer.
If you had instead computed the sum of the items and divided it by 5,
the summed score of a respondent who answered 4 out of 5 questions would still have been divided by 5, artificially decreasing their score.
You can require a minimum number of valid answers for inclusion in the scale. Respondents who gave valid answers to fewer items will be assigned a
missing value in the scale. If for example, you want to only include respondents who gave at least 3 valid answers, you can use the code
Compute poltrust=mean.3(trstprl, trstlgl, trstplc, trstplt, trstprt).
Dummy coding examples
Option 1
recode gndr (1=1) (2=0) into male.
Option 2
compute male=$SYSMIS.
if gndr=1 male=1.
if gndr=2 male=0.
exe.
The first line of code generates a new column in your dataset named ‘male’ with only missing values ($SYSMIS).
The second line of code assigns a score of ‘1’ in the new variable ‘male’ to all men (men are people coded 1 in the variable gndr).
The third line of code assigns a score of ‘0’ in the new variable ‘male’ to all women (women are people coded 2 in the variable gndr).
“Exe” forces SPSS to execute all transformations.
Option 3
Compute male=gndr=1.
This tells SPSS to make a new variable ‘male’ which equals 1 when gndr equals 1, and 0 for all other valid values.
As with recode, you should always check your newly generated variables.
1) Comparing the number of missing values (this should (almost) always remain the same if your new variable is generated from 1 variable)
descr gndr male.
2) Comparing the coding (with a reverse coding, all values should be on the diagonal of the crosstab)
crosstab gndr by male.
To make dummies for ‘city/suburb’ and ‘rural’ from the variable ‘domicil’, which is coded:
1 A big city
2 Suburbs or outskirts of big city
3 Town or small city
4 Country village
5 Farm or home in countryside
The dummy ‘city’ will have a score of 1 for respondents in big cities and in suburbs or outskirts of a big city, and 0 for all other types of domicile.
The dummy ‘rural’ will have a score of 1 for respondents in country villages or on a farm or home in the countryside, and 0 for all other types of domicile.
Option 1
recode domicil (1 thru 2=1) (3 thru 5=0) into city.
recode domicil (1 thru 3=0) (4 thru 5=1) into rural.
Option 2
compute city=$SYSMIS.
if domicil <3 city =1.
if domicil >2 city =0.
exe.
The first line of code generates a new column in your dataset named ‘city’ with only missing values ($SYSMIS).
The second line of code assigns a score of ‘1’ in the new variable ‘city’ to all respondents living in a city or suburb (people coded 1
or 2 in the variable domicil).
The third line of code assigns a score of ‘0’ in the new variable ‘city’ to respondents in all other types of
domicile (respondents with codes of 3, 4 or 5 on the variable domicil). “Exe” forces SPSS to execute all transformations.
Option 3
Compute city= domicil<3.
This tells SPSS to make a new variable ‘city’ which equals 1 when domicil is smaller than 3 (so 1 or 2), and 0 for all other valid values.
As before, check the new variables by comparing their coding with the source variable in a crosstab:
crosstab domicil by city rural.
Variable labels
Command:
variable labels var 'label'.
Example:
Label the variable age (generated with ‘compute’ earlier in this guide) as “age at time of survey”.
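Filled in for this example, the command would be:

```spss
* Add a descriptive label to the computed variable age.
variable labels age 'age at time of survey'.
```

The new label should then appear in the ‘Label’ column of the Variable view.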
Value labels
Command:
value labels var value "label" value "label" value "label" value "label".
Example:
Add value labels to the variable political interest (reversed), polinterest_rev, generated in the recode example:
value labels polinterest_rev
1 "not at all interested"
2 "hardly interested"
3 "quite interested"
4 "very interested".
Depending on your preference you can type the code on one line, or start a line for each value. The full stop (.) should only be listed once, at the end of
the code (see example above).
Sub-setting data
Command:
Select if var=condition.
Examples:
The variable name for age is agea. The code to limit the dataset to only respondents aged 18 and over:
select if agea>17.
This command tells SPSS to drop all respondents with an age of 17 or lower from the dataset. All analyses from this point on will only be on
respondents aged 18 and up.
You want to limit the dataset to only respondents from France. Country (cntry) is a string variable. The code for France is FR. For string variables, codes
need to be placed between quotation marks.
select if cntry='FR'.
This command tells SPSS to drop respondents from all countries except France from the dataset. All analyses from this point on will only be on
respondents from France.
Sometimes it can be helpful to only do one analysis for a subgroup, rather than dropping respondents from the dataset. This can be done by adding the
temporary command. For example, to run a chi-square test only for France:
Temporary.
select if cntry='FR'.
crosstab
/tables= mnactic by polintr
/statistic=chisq.
It is important to run all three commands (temporary, select if, and the analysis) in one go.
Splitting the data in subgroups
Command:
Example:
You may want to know if the relation between age, education and political interest is the same for all countries in the dataset.
The variable for country is cntry. The code consists of four commands (sort, split, reg and split file off), which should all be run in one go.
SPSS returns a regression table that is split by country.
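The four commands mentioned in this example could look like the sketch below. It reuses the regression from the Regression section; the exact options of the original code are not known, so treat this as one possible version.

```spss
* Sort the data by country; split file needs the cases sorted on the grouping variable.
sort cases by cntry.
* Run all following analyses separately for each country.
split file by cntry.
reg
/dep=polintr
/enter=agea eduyrs.
* Switch the split off again so that later analyses use the whole dataset.
split file off.
```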