
Introduction

An introduction
to data entry, data analysis,
and graphing using SPSS
What is SPSS?
Statistical Package for the Social Sciences

A commonly used computer package in business,
government, research and academic organizations.
It is especially used in the social and behavioural
sciences for processing and analysing data and for
producing graphs.
In this session

Learn how to navigate through the different windows of
SPSS
Learn how to open and save data files
Learn how to calculate simple statistics from variables in
a data file
Learn how to calculate new variables using the
COMPUTE function
Learn how to compare different subsets and groups of
data using the SPLIT FILE and SELECT CASES IF
functions
Learn how to produce and edit simple graphs in SPSS
and incorporate them into a Word document
Seminar Worksheets
During the seminar you will have
worksheets to complete.
When you complete the worksheet enter
your answers on the online worksheet
which is on the U24103 resources page.
On Friday you will receive an email with
your mark and the correct answers.
Why are Statistics important?

They help put a number in its context*


*if used correctly
Putting a number in its context
January 8th, 2011 by Ben Goldacre in bad science

Many media outlets reported the story that 584 women
with the contraceptive implant had unplanned pregnancies.
MHRA estimate that 1.355 million implants have been sold.
Each implant lasts 3 years, which gives a total exposure time of
4.06 million women-years at risk.
584 unplanned pregnancies in this exposed population
means there were 1.4 unwanted pregnancies reported for
every 10,000 women with implants per year.
Or, you can say that the failure rate is 0.014% per year.
How the numbers are worked out:
Scale 4,065,000 down to 10,000, which you do by dividing by 406.5.
Then divide the 584 pregnancies by 406.5, which gives 1.43.
To get the percentage you divide 1.43 by 100, which gives 0.014%.
This is rather good:
– Implants are still the most reliable form of
contraception
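
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch that reuses the figures quoted above (1.355 million implants, 3 years each, 584 pregnancies); the variable names are illustrative and not part of the seminar materials.

```python
# Figures quoted on the slide (MHRA estimate and reported pregnancies).
implants_sold = 1_355_000
years_per_implant = 3
unplanned_pregnancies = 584

# Total exposure: each implant contributes three woman-years at risk.
women_years = implants_sold * years_per_implant

# Pregnancies per woman-year, then rescaled in two ways.
rate = unplanned_pregnancies / women_years
per_10_000 = rate * 10_000   # pregnancies per 10,000 women per year
percent = rate * 100         # failure rate as a percentage per year

print(f"Exposure: {women_years:,} women-years")
print(f"Rate: {per_10_000:.1f} per 10,000 women per year")
print(f"Rate: {percent:.3f}% per year")
```

Dividing by the exposure first and rescaling afterwards avoids the intermediate 406.5 step, but it gives the same 1.4 per 10,000 and 0.014%.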
The headline:

Is a lot less scary when the number 584 is put in context:

The failure rate for the contraceptive implant is 1.4 per 10,000 women per
year (0.014% per year)
Or without numbers:
Implants are still the most reliable form of Contraception

Back to SPSS
SPSS version 19 interface
Where to
find help
File menu.
Here is where
you load and
save files.

Switch between “Data view” & “Variable view”

Current ‘active’ data-entry cell
A sample of 50 people
were asked to answer a set
of questions for a
survey of health
behaviour.
How to open an SPSS data file

Select your file in the “Open Data” window


File menu: Open: Data

U:\data U24103 Seminar 1


SurveyData_Seminar1
The SPSS ‘Data View’ window
Each column
represents a
different variable

Each row
represents
a different
participant
The SPSS ‘Variable View’
This is where you tell SPSS what kinds of variables you have

Name is where you give your variable/column a short name. Remember you cannot use a space so you need to use “_”.
“Type” property: what is contained in the variable.
Numeric = Numbers
String = Text
Label is where you can give your variable a longer name.
“Values”: You can assign labels for each value of a variable. E.g. Male = 1, Female = 0.
Missing: If you are missing some data you should not just leave the cell blank. You should decide on a number and tell SPSS what it is in this box. People commonly use “666” or “999”.
“Measure” property: What is the ‘level of measurement’ in the variable: Nominal, Ordinal, or Scale (i.e. interval or ratio).
Roles: what the variable’s role is.
Input = independent variable (IV).
Target = dependent variable (DV).
Both = either IV or DV.
None. The variable has no role assignment.
Partition. Used to partition the data into separate samples.
Each row represents a different variable (column in the data view).
Each column represents a different property of that variable.
Orienting through SPSS menu options

Transform menu
COMPUTE function
(and all other functions relating to modifying values and
producing new variables from your data)
Analyse menu
DESCRIPTIVES function
(and all other functions relating to ANALYSIS of DATA)
Data menu
SPLIT FILE function
(and other functions relating to selecting/ordering data
based on criteria)
Graph menu
Bar Chart function
(and other graph-related functions)
Click here to see any names you have given to numbers in
the Values
The SPSS OUTPUT window

Where the data file is saved
Do not close this window
keep the same window open for the whole session
Print out of what you told SPSS to do
Click on these to navigate to the output from previous analyses
Use: “–” to hide things & “+” to show them again
Output from an analysis
Generating simple descriptive statistics
in SPSS
SPSS can generate a multitude of statistics. We will not be
using all of them in this course.
Analyse menu
DESCRIPTIVES
(and all other
functions relating to
ANALYSIS of DATA)

Today we are using Descriptive


Statistics to look at the
variables in your data,
measures of central tendency,
measures of dispersion etc
“Frequencies”
function

“Descriptives”
function

“Frequencies” and “Descriptives”


have a lot of overlapping
functions (e.g. both can give
means, standard deviations).
Frequencies has a greater range
of options (e.g. it can also
compute medians, modes).
Descriptives
“Options…”: This gives some
options for the type of
statistics you wish to show.

Left hand box contains all the numerical variables in your data set
Choose the variables that you want to compute statistics from by selecting them and clicking this arrow to move them into the right hand side box.
Click on the “Range” tick-box so you also get this statistic in your output
The OUTPUT window for Descriptives

Number of participants
The columns show the calculated values for each of the statistical measures you ask for
Each row designates one of your variables.

Q1: Find out the means and ranges of the heights and weights of the participants using the ‘Descriptives’ command.
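
For readers who want to see what the Descriptives table is calculating, the same measures can be sketched in Python with the standard library. The heights below are made-up values for illustration only; they are not taken from the seminar data file, and the variable names are assumptions.

```python
import statistics

# Hypothetical heights in cm for five participants (NOT the seminar data).
heights = [160, 172, 168, 181, 175]

n = len(heights)                          # "N" column: number of participants
mean_height = statistics.mean(heights)    # "Mean" column
value_range = max(heights) - min(heights) # "Range" column: max minus min
std_dev = statistics.stdev(heights)       # sample standard deviation (n - 1 denominator)

print(f"N = {n}, mean = {mean_height}, range = {value_range}, SD = {std_dev:.2f}")
```

Each variable you move into the right-hand box would get one such row of values.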


Frequencies
The “Statistics…” option gives output options for ‘Frequencies’
To select multiple items in a row click on the first item you want then hold down the “shift” key and click on the last one you want
Options for ‘median’ and ‘mode’
Options for ‘Standard deviation’ and ‘standard error’
If checked ‘Frequencies’ outputs a list of occurrences of a particular value (useful for categorical and ordinal variables)
Frequencies table.
e.g. for the ‘Cigarettes’ variable it tells you how many (and what percentage) of your data file are Smokers.
Each column designates a separate variable. The different calculated values for that variable are shown on the individual rows.

Q2: Use ‘Frequencies’ to find out
a) How many males took part in the survey?
b) What percentage of the sample are ‘Skilled labourers’?
c) The median weight of the sample
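
A frequencies table is simply a count and a percentage for each distinct value. The sketch below shows the idea in Python; the ten answers are invented for illustration (they are not the seminar data), and the value labels are assumptions.

```python
from collections import Counter

# Hypothetical answers for a 'Cigarettes' variable (NOT the seminar data).
cigarettes = ["Smoker", "Non-smoker", "Non-smoker", "Smoker", "Non-smoker",
              "Non-smoker", "Smoker", "Non-smoker", "Non-smoker", "Non-smoker"]

counts = Counter(cigarettes)
total = len(cigarettes)

# One entry per distinct value: (count, percentage of the sample).
table = {value: (count, 100 * count / total) for value, count in counts.items()}

for value, (count, pct) in table.items():
    print(f"{value:<12}{count:>4}{pct:>8.1f}%")
```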


How to produce new variables
from values in a data-file
SPSS allows us to calculate new variables based on
combining the variables we already have in our file.
For example, Body Mass Index (BMI) is a measure which is
mathematically derived from a person’s height and weight.

Formula for BMI: BMI = weight (kg) ÷ height (m)²

BMI is an (indirect) indicator of the proportion of


body fat a person has and is thus a useful health
measure.
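
As a quick illustration, the standard BMI definition (weight in kilograms divided by the square of height in metres) can be written as a small Python function. The function name and the example person are assumptions for illustration; check which units the height variable in your data file uses before applying the formula.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2

# Hypothetical person: 70 kg and 1.75 m tall.
example = bmi(70.0, 1.75)
print(round(example, 1))
```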
The Compute function
Transform menu
COMPUTE function
(and all other functions
relating to modifying
values and producing
new variables from
your data)
Enter the Target variable name here.
This is the name that SPSS will give the new calculated variable.
Note that spaces and certain characters aren’t allowed in variable
names (e.g. symbols such as ‘&’). The “Label” option box below it
allows you to specify things like the level of measurement etc. of the
variable.

Type a valid mathematical


formula to calculate the
variable here.
If you wish to use existing
variables then these can be
moved in using the arrow from
the horizontal variable box on
the left.
We want to calculate BMI
To do this we need people’s weight in metric kilograms rather than
imperial pounds (lb). The formula to do this is: kg = lb ÷ 2.2
First type a name for the new
variable here:
e.g. Weight_in_kg
(**don’t use an existing
variable name otherwise it will
be overwritten**).

To calculate the weight in
kg we need the data
variable for the weight in
pounds.
The conversion formula
then requires us to divide
this variable’s values by
2.2.
Weight_In_lb / 2.2

(N.B. for computers the * is a multiplication sign and the / is the division sign).

What we are telling SPSS to do is the sum:

Weight_in_kg = Weight_In_lb / 2.2

Press OK and SPSS will now create and compute this
new variable based on the formula you have given it.
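
What Compute does can be pictured as applying the same formula to every row of the data file. The Python sketch below mimics that with three invented participants; the rows, the ids and the constant name are hypothetical, and only the division by 2.2 comes from the slide.

```python
# Hypothetical rows; each dict stands for one row of the Data View (NOT the seminar data).
participants = [
    {"id": "A", "Weight_in_lb": 154.0},
    {"id": "B", "Weight_in_lb": 176.0},
    {"id": "C", "Weight_in_lb": 132.0},
]

LB_PER_KG = 2.2  # conversion factor used on the slide

# Apply the formula to every case, storing the result as a new variable.
for row in participants:
    row["Weight_in_kg"] = row["Weight_in_lb"] / LB_PER_KG

for row in participants:
    print(row["id"], round(row["Weight_in_kg"], 1))
```

Note that the new values go into a new key, so the original pounds values are left untouched, which is the reason for not reusing an existing variable name.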
In the output window SPSS has printed out the sum you put into the compute window.
In the data view window you should see that a new variable called Weight in kilograms has been calculated for each of the 50 participants in the sample.

Q3: Now you know how to use ‘Compute’ try to create a variable for BMI. Remember the formula is:
BMI = weight (kg) ÷ height (m)²
Here are the BMI
values for the first
five participants
(P00001 to P00005).
How to analyse data groups in SPSS
Often we are interested in looking at or comparing values of
different groups within our data. For instance we might want to
compare the average height of males and females.
SPSS has several ways to allow us to do this.

Data menu
SPLIT FILE, SELECT CASES IF… function
(and all other functions relating to selecting or
ordering data based on given criteria)
Or you can use the SPLIT FILE button
Split file function
Change to
‘Compare groups’

If we want to compare Males and Females then move the ‘Gender’ variable to this box
Note that in the bottom-right corner of the SPSS data window it informs you that the file is now split by gender.
This remains the case until you turn this off again in the ‘Split File’ function.
Now when you run any analysis again (e.g. Descriptives) you get separate values for Males and Females in the table.

Q4: Use ‘Split File’ to generate the mean heights of males and females in our sample.
**When you have done this make sure you turn off Split file again**.
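
Conceptually, Split File sorts the cases into groups and then repeats the analysis once per group. A minimal Python sketch of that idea follows; the six cases are invented for illustration (not the seminar data), and the coding 1 = Male, 0 = Female follows the example given for the Values property.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (gender, height in cm) pairs; 1 = Male, 0 = Female.
cases = [(1, 180), (0, 165), (1, 175), (0, 160), (0, 170), (1, 185)]

# Split the file: collect the heights separately for each gender value.
groups = defaultdict(list)
for gender, height in cases:
    groups[gender].append(height)

# Run the same analysis (here, the mean) once per group.
group_means = {gender: mean(heights) for gender, heights in groups.items()}

print(group_means)
```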
Select Cases If.. function
Sometimes we wish to include only a subset of the cases (participants)
in our analysis. The ‘Select Cases If’ function allows us to include only a
subset of cases (and ignore others) based on a criterion that we give it.

Select
“If condition is satisfied” Then
press the IF button so we can
enter our criterion.
Criterion window. What we need to do here is enter a Boolean
condition (i.e. a mathematical statement which is either True or
False)
Gender = 0

What this is telling SPSS is to select only those cases
(participants) which have the Gender value of “0” (and
therefore ignore all those that have the value 1).
In other words: select only if the participant is ‘Female’,
otherwise ignore.

The ‘is equal to’ sign ‘=’ is a commonly used relation in Select Cases IF statements.
Other common signs for this function are:
greater than ‘>’
less than ‘<’
not equal to ‘<>’ (note Gender <> 1 would have the same effect here as Gender = 0)
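
The same kind of selection can be sketched in Python, which may help to show what each comparison keeps. Note that Python writes ‘is equal to’ as == and ‘not equal to’ as !=, whereas the Select Cases IF box uses = and <>. The four cases and the 70 kg cut-off below are invented for illustration.

```python
# Hypothetical cases; Gender coding follows the slides (1 = Male, 0 = Female).
cases = [
    {"id": 1, "Gender": 0, "Weight_in_kg": 60.0},
    {"id": 2, "Gender": 1, "Weight_in_kg": 82.0},
    {"id": 3, "Gender": 0, "Weight_in_kg": 71.5},
    {"id": 4, "Gender": 1, "Weight_in_kg": 68.0},
]

# Gender = 0 in SPSS: keep only the female cases.
females = [c["id"] for c in cases if c["Gender"] == 0]

# Gender <> 1 in SPSS: same result here, because Gender only takes 0 or 1.
not_male = [c["id"] for c in cases if c["Gender"] != 1]

# Greater than and less than work the same way on numeric variables.
over_70 = [c["id"] for c in cases if c["Weight_in_kg"] > 70]
under_70 = [c["id"] for c in cases if c["Weight_in_kg"] < 70]

print(females, not_male, over_70, under_70)
```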
P.S. How to remember ‘less than’ &
‘more than’?
Pacman’s evil statistics-loving twin
brother always eats the largest number

P < .05: So the p value is less than .05 or p<.05
P > .05: So the p value is greater than .05 or p>.05
You can see that the male
cases are temporarily
crossed out.
A new filter variable (called filter_$) is inserted
“Filter On”
If you now run any analysis with the filter ON the analysis
will only be performed on the selected cases (others will be
ignored in the calculation).
**Remember to always turn off the filter after you finish with it in your
analyses**
Go back to the Select Cases IF menu and click on “Select all cases”.

Q5: Using Select Cases IF and the ‘Descriptives’ function,
calculate the mean weight for people who drink less than 15
units of alcohol a week.
You now have been shown the basics of data handling in
SPSS.
Now might be a good time to save a personal copy of the
data file onto your personal folder (H:) or pen-drive.

Type your file name here


Q6: Answer the following questions using what you have learned
(**Remember to take off Filter/Split File after use**)
a) What is the mode average of sleep that participants in our sample have?
b) What percentage of our sample are Students?
c) Do males or females have a larger standard deviation for BMI?
d) What is the median hours of sleep that someone in a manual labour job
reports they get?
e) How many people in our sample are aged 35 or over?
f) Who has a higher mean BMI in our sample, Smokers or non-smokers?

Q7: Some more difficult questions
(note that for these questions AND, OR, NOT can be used as well
in Boolean conditions)
g) How many people in our sample are both smokers and drink 15 or more
units of alcohol per week?
h) What is the mode average of units of alcohol drunk by someone who is over
the age of 45 and is in either a manual-labour, skilled labour, or
administrative/clerical/sales job?

We are going to stop for 20 min so you can work through Q6 & Q7 and have a break
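
To show how AND, OR and NOT combine simple conditions, here is a small Python sketch. The four cases and the key names are invented for illustration and are not the seminar data; Python writes the operators in lower case (and, or, not).

```python
# Hypothetical cases (NOT the seminar data).
cases = [
    {"smoker": True,  "alcohol_units": 20},
    {"smoker": True,  "alcohol_units": 5},
    {"smoker": False, "alcohol_units": 18},
    {"smoker": False, "alcohol_units": 2},
]

# AND: both conditions must be true for a case to be selected.
both = [c for c in cases if c["smoker"] and c["alcohol_units"] >= 15]

# OR: at least one of the conditions must be true.
either = [c for c in cases if c["smoker"] or c["alcohol_units"] >= 15]

# NOT: reverses a condition, here selecting cases where neither is true.
neither = [c for c in cases if not (c["smoker"] or c["alcohol_units"] >= 15)]

print(len(both), len(either), len(neither))
```

Every case falls into exactly one of "either" or "neither", which is a handy check that a combined condition has been typed correctly.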
Q6: Answer the following questions using what you have learned
a) What is the mode average of sleep that participants in our sample have?
......8......
b) What percentage of our sample are Students?
...14..........
c) Do males or females have a larger standard deviation for BMI?
...Females….
d) What is the median hours of sleep that someone in a manual labour job reports they get?
....8.5.....
e) How many people in our sample are aged 35 or over?
.....24........
f) Who has a higher mean BMI in our sample, Smokers or non-smokers?
.. non-smokers.....
Q7: Some more difficult questions
a) How many people in our sample are both smokers and drink 15 or more units of alcohol per
week?
..6...
b) What is the mode average of units of alcohol drunk by someone who is over the age of 45 and
is in either a manual-labour, skilled labour, or administrative/clerical/sales job?
... up to 14 units per week....
Using the graph functions
SPSS can plot graphs from any of the data in your file.

Graph menu
All graph-
related
functions
Histograms
Are used to look at the distribution of data.
Here is an example for age. This variable is clearly not very normally distributed.

Q8: Create histograms for height,


weight, BMI from the data file. Do
these variables show a normal
distribution?
Bar chart

Useful for comparing different participant groups on some measure.


Requires two variables

One usually
ordinal or scalar
for the Y axis.

One categorical
(for the x-axis)
Y-axis variable
(what values do you
want the height of the
individual columns to
show)

X-axis variable
(what groups do you
want the columns to
represent)

Select whether the height of the columns represents


the mean, median or mode average (or some other
measure) of the group.
An example SPSS bar chart showing a difference in height between the genders.

Q9: Now produce a Bar graph showing the median BMI for the different occupation groups.
Which of the different occupation groups has the lowest median BMI?
Scatterplot chart
Useful for plotting the relationship between two interval (or ratio) level
variables
Need to give two variables:
– One for the x-axis and One for the y-axis
You can have different markers for different groups
Each point is an individual participant’s score on the two values.

Q10: Now produce an SPSS graph for height (Y axis) versus weight (X axis) where gender is distinguished with different markers.
Double click on the graph to open in chart editor
Edit it so it is easy to understand when printed in black and white
Are the lines of best fit for males and females roughly parallel?
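
Two lines of best fit are roughly parallel when their slopes are similar. As a sketch of what a fit line's slope means, the least-squares slope can be computed by hand in Python. The weights and heights below are invented for illustration (not the seminar data), and the function name is an assumption.

```python
def slope(xs, ys):
    """Least-squares slope of y on x: covariation of x and y over variation of x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical weight (kg, x-axis) and height (cm, y-axis) values for each group.
male_weight, male_height = [70, 80, 90], [172, 178, 184]
female_weight, female_height = [55, 65, 75], [160, 165, 170]

male_slope = slope(male_weight, male_height)
female_slope = slope(female_weight, female_height)

print(f"Male slope: {male_slope:.2f} cm per kg")
print(f"Female slope: {female_slope:.2f} cm per kg")
```

With these made-up values the slopes are 0.6 and 0.5, so the two lines would look close to parallel on the scatterplot.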
Right click on
the finished
chart.
Copy the chart.

Open Word.

Paste into
Word as a
picture.
Self-study exercises for Seminar 1

That’s all for today. It’s worth spending a bit of time on your own using
SPSS to really familiarise yourself with its functions.

Try some exercises from the online book on the psychology resources
page in the Statistics folder :

Secure Resources
SPSS Version 17
A Beginner's Guide to SPSS for Windows: Entering and Analysing Questionnaire data

Using SPSS…
– Open the data file “spssraw.sav”
this can be found at: u:\data\SOCSCI\spssraw.sav
Have a look particularly at section 1. (pg. 1-6)
Have a look at section 6. and section 7 (pg. 22-31)
Have a look at section 10 (pg. 112-123).
