
Practical Statistical Training

Using SPSS Software

Trainer – Hailegebriel Yirdaw


(PhD Candidate at AAU and UoG)
E-mail: hailaenani@gmail.com

Data Handling Rules
◦ NEVER modify the original data as entered.
◦ Make modifications to a working file and
keep a syntax file that preserves all steps.
  ▫ Use menus and then the Paste button, or
  ▫ Type commands directly into a syntax file, save it,
    and then run the syntax file commands.
◦ Perform all analyses on working files.
◦ Preserve all analyses in a syntax file
  ▫ to document what you’ve done
  ▫ to reproduce output
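
As a minimal sketch of these rules in syntax form, the commands below open the original file, save a working copy, and continue on the copy. The file names are hypothetical placeholders, not files supplied with this training.

```spss
* Open the original data file (hypothetical file name).
GET FILE='original_data.sav'.
* Save a working copy and make every later change on that copy.
SAVE OUTFILE='working_data.sav'.
GET FILE='working_data.sav'.
```

Saving these commands in a syntax file, together with every later step, keeps a record that can be rerun to reproduce the output.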



Using the Syntax editor

Click Transform Recode  Into different


variables 
‘Continue.’
Click ‘Paste.’



Recoding Variables
It is possible to modify the values of existing
variables in the dataset. For example:
◦ Combine two categories into a single category, or
change one code into another.
Recoding can be
◦ Into the same variables (uses the existing
variable name)
◦ Into different variables (uses a new variable name)
Note: It is better to recode into different
variables, because this preserves the original
values and variables.
Recoding string variables
To convert the gender codes M and F to 1 and 2:
Transform > Recode > Into Different Variables >
the dialog window will appear >
select a variable from the existing dataset and
move it to the Input Variable -> Output Variable box >
supply the name of the new variable under Output
Variable: Name and click Change
> Click the button labelled Old and New Values
> another dialog window will appear > Enter
the original value in the box labelled Old Value
> Enter the new value in the box labelled New
Value > Add. Repeat for each value, then click Continue > OK to complete the recode process.
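
The same recode can be written as syntax, which is roughly what the Paste button produces. This is a sketch only: the variable names gender and gender_num are assumptions for illustration.

```spss
* Recode the string codes M and F into a new numeric variable.
* gender and gender_num are assumed variable names.
RECODE gender ('M'=1) ('F'=2) INTO gender_num.
EXECUTE.
```

String values are matched exactly, so 'm' and 'M' would be treated as different codes.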
Automatic Recode
Convert string variables to numeric codes by using
Automatic Recode:
Transform > Automatic Recode > the dialog
window will appear >
select a variable from the existing dataset
and move it to the Variable -> New Name
box > supply the name of the new
variable under New Name and click Add
New Name
> Choose whether to recode starting from the Lowest or the Highest value
> OK
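
A syntax sketch of Automatic Recode follows. The variable names are assumptions; the command itself assigns consecutive integers to the sorted values of the source variable and carries the old values over as value labels.

```spss
* Assign consecutive integer codes to the sorted values of a string variable.
* gender and gender_code are assumed variable names.
AUTORECODE VARIABLES=gender
  /INTO gender_code
  /PRINT.
```

Adding the /DESCENDING subcommand corresponds to starting from the highest value instead of the lowest.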
Recoding data into categories
Let’s create a new variable geduc,
based on the variable educ.
◦ 8 or less - group 1;
◦ 9 to 12 - group 2;
◦ more than 12 - group 3.



Follow these steps:
◦ Transform > Recode > Into Different
Variables; the dialog box will appear
◦ Move educ into the Input
Variable -> Output Variable box
◦ Enter the name (and, optionally, the label) of the
new variable > Change > Old and New Values > enter
the old and new values > Add
◦ Continue in the same manner until you
have recoded all groups
◦ Click Continue, then OK.
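
In syntax, the grouping can be sketched in one command. It assumes educ holds whole years of education and that a value of exactly 8 belongs to group 1.

```spss
* Group years of education into three categories in a new variable.
* Assumes educ is recorded in whole years.
RECODE educ (LOWEST THRU 8=1) (9 THRU 12=2) (13 THRU HIGHEST=3) INTO geduc.
EXECUTE.
```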



Example- Recode Variables
The Confidence variable indicates
students' responses to the question: On a
scale of 1 to 10, how confident are you
that you will learn statistics? Their
responses are currently Scale data
(1-10). To compare the
participants who answered with a low,
medium, or high response, you can create
groups (Ordinal data, because the groups are ordered).



Select: Transform > Recode > Into
Different Variables.
Highlight the Confidence question on the list
and click on the arrow to move
Confidence into the Input Variable box.
Type: “ConfLoHi” in Output Variable: Name.
Click on the Change button.

Select: Old and New Values. Under Old Value, select:
Range.
Type: “1” in the top box and “3” in the box under through.
Type: “1” in the Value box under New Value. Click: Add.
Type: “4” in the top Range box under Old Value and “6” in
the lower box.
Type: “2” in the Value box under New Value. Click: Add.
Type: “7” in the top Range box under Old Value and “10”
in the lower box.
Type: “3” in the Value box under New Value. Click: Add.



Click on Continue > OK
The new variable will appear at the right hand side
of your current variables
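
The dialog steps for this example correspond roughly to the syntax below. It is a sketch that assumes the three non-overlapping ranges 1 to 3, 4 to 6, and 7 to 10.

```spss
* Collapse the 1-10 Confidence scale into low, medium, and high groups.
* Assumed ranges: 1-3 = low (1), 4-6 = medium (2), 7-10 = high (3).
RECODE Confidence (1 THRU 3=1) (4 THRU 6=2) (7 THRU 10=3) INTO ConfLoHi.
EXECUTE.
```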
At the bottom left of your screen, select:
Variable View
Go to line 9 (ConfLoHi) and move over to
the Values column. Click on the cell and
then on the 3 dots shaded in grey. Type
“1” in the Value box and “Low” in the
Label box. Select: Add. Type “2” and
“Medium.” Select: Add. Type “3” and
“High.” Select: Add



Select: OK
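
The value labels can also be attached in syntax rather than through Variable View. A minimal sketch, using the ConfLoHi variable from this example:

```spss
* Attach descriptive labels to the three codes of ConfLoHi.
VALUE LABELS ConfLoHi
  1 'Low'
  2 'Medium'
  3 'High'.
```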
Activity 1. (10 minutes)

Use “Expenditure data” to recode into a
new variable gfsize, based on the variable
fsize. For instance, up to 4 is group 1; 5 to
8 is group 2; and above 8 is group 3.



Sorting Cases and Splitting Files
Sorting Cases
Sorting cases allows you to organize rows
of data in ascending or descending order
on the basis of one or more variables. Example: sort
by Current Salary.
Data > Sort Cases > dialog window will
appear > Select the variable and move it
to Sort by > Select the sort order > OK
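
A syntax sketch of the same sort follows. The variable names salary (Current Salary) and gender are assumptions for illustration.

```spss
* Sort by current salary in ascending order; use (D) for descending.
* salary is the assumed variable name for Current Salary.
SORT CASES BY salary (A).
* Sort on more than one variable: gender first, then salary within gender.
SORT CASES BY gender (A) salary (D).
```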



Activity 2. (3 minutes)

Based on “Expenditure data”, sort expenditure in
ascending order within the male and female groups.



Splitting Files
Sometimes it’s necessary to split your file
and to repeat analyses separately for each group (e.g.
male and female).
This procedure does not physically alter
your file in any permanent manner. It’s an
option you can turn on and off as it suits
your purposes.
Data > Split File > Compare groups >
Specify the grouping variable (e.g.
Gender) > OK
When you have finished the analyses, you
need to go back and turn Split File
off:
Data > Split File > Analyze all cases, do
not create groups > OK
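
In syntax, turning Split File on and off can be sketched as below. The variable names gender and educ are assumptions, and the FREQUENCIES command only stands in for whatever analysis is being repeated by group.

```spss
* The file must be sorted by the grouping variable before splitting.
SORT CASES BY gender.
* LAYERED corresponds to the Compare groups option.
SPLIT FILE LAYERED BY gender.
* Any analysis run here is repeated for each group (example analysis only).
FREQUENCIES VARIABLES=educ.
* Return to analyzing all cases together.
SPLIT FILE OFF.
```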



Select cases
To select certain cases for analysis:
click on Data > Select Cases > click on If
condition is satisfied > click on If > enter the condition > Continue > OK

To take a random sample from a given dataset:
click on Data > Select Cases > click
on Random sample of cases > Sample > specify the sample size >
Continue > OK



Merging Files
There are times when it’s necessary to
merge different SPSS data files.
SPSS allows you to merge files by adding
cases at the end of your file or by adding
variables for each of the cases in
an existing data file. Let’s see each of them
separately.
Use: Employee data, First year data, and More
observ data



Add Variables
First, keep three considerations in mind:
1) There should be at least one identification
variable in both files
2) Variables with the same names (not used to
match cases) need to be renamed or excluded
3) Sort both files in ascending order of the
identification variable
Data > Merge Files > Add Variables > dialog box
will appear > Select the file to be merged >
Open > Set the identification variable in the Key
Variables box > Activate Both files provide
cases > OK
Note: Any variables that you do not want
in the merged file can be highlighted in the
box labelled New Working Data File and
moved to the Excluded Variables box.
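
A syntax sketch of adding variables follows. The file names and the key variable id are hypothetical; both files are sorted by the key before they are matched.

```spss
* Sort the second file by the key variable and save it (hypothetical file names).
GET FILE='addvariable_data.sav'.
SORT CASES BY id (A).
SAVE OUTFILE='addvariable_sorted.sav'.

* Open and sort the working file, then add the variables from the second file.
GET FILE='working_data.sav'.
SORT CASES BY id (A).
MATCH FILES /FILE=*
  /FILE='addvariable_sorted.sav'
  /BY id.
EXECUTE.
```

Here the asterisk refers to the active dataset, and each /FILE subcommand contributes both cases and variables.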
Combining datasets
(Adding variables)



Activity 3. (5 minutes)

Use another data set “Addvariable data”
and merge the variable with “Expenditure
data” file.



Merging cases
Add Cases allows you to merge files that have the
same variables but different cases, for example
data entered by two different people. In this
case, the two files should have the same
variable names.
Data > Merge Files > Add Cases > dialog
box will appear > Select the file to be
merged > Open > Check that the common
variables are listed in the Variables in New Active
Dataset box > OK
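
Adding cases can be sketched in syntax as below. The file name is a hypothetical placeholder, and both files are assumed to use the same variable names.

```spss
* Append the cases of a second file to the end of the active dataset.
* The file name is hypothetical.
ADD FILES /FILE=*
  /FILE='addcases_data.sav'.
EXECUTE.
```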



Combining datasets
(adding cases)



Activity 4. (5 minutes)

Use another data set “Addcases data” and
merge the cases with “Expenditure data”
file.
Activity 4.5. (5 minutes)

Use the data sets “Employee data”,
“First year data”, and “moreobserv data” to
practice merging both cases and variables.



Aggregating data
The Aggregate procedure allows you to
condense a dataset by collapsing the data
on the basis of one or more variables.
◦ For example, to investigate the characteristics
of people in the company on the basis of the
amount of their education, you could collapse
all of the variables you want to analyze into
rows defined by the number of years of
education.



Data Aggregate  dialog box will appear
enter the break variable (the variable within
which other variables are summarized) 
Select the aggregate Variable(s) (the variables
that will be collapsed) There are several
options for summarizing variables  select
the variable  Function  Ok
Note: You can save your result in two
ways.
1. Save the number of cases that were
collapsed at each level of the break variable
or variables as a new variable.
2. Saved as a new file or replaces the working
dataset.
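
A syntax sketch of the education example follows. The variable names educ and salary, the new variable names, and the output file name are assumptions for illustration.

```spss
* Collapse the data to one row per level of education.
* educ and salary are assumed variable names; the output file name is hypothetical.
AGGREGATE
  /OUTFILE='aggr_by_educ.sav'
  /BREAK=educ
  /mean_salary=MEAN(salary)
  /n_cases=N.
```

The n_cases variable holds the number of cases collapsed at each level of the break variable, and /OUTFILE writes the result to a new file, leaving the working dataset unchanged.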
Activity 6. (7 minutes)

Based on “Expenditure data” summarize
the mean of exp, mean of inc, and
minimum age within each family size
(considering fsize as a break variable)



Computing New Variables
You may want to create new variables from the
variables already in your dataset. Example:
◦ The difference in salary (a new variable)
could be computed by subtracting the starting
salary from the present salary.
Transform > Compute
The Compute dialog box will appear.
Type the name of the new variable in the box labelled
Target Variable.
Type the expression defining the new variable in the
box labelled Numeric Expression.
Note: The numeric expression can also use
functions, and an If condition can be applied.
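
The salary example can be sketched in syntax as below. The variable names salary (present salary) and salbegin (starting salary), and the names of the new variables, are assumptions for illustration.

```spss
* Difference between present and starting salary.
* salary and salbegin are assumed variable names.
COMPUTE saldiff = salary - salbegin.
* A numeric expression using a function: natural log of salary.
COMPUTE ln_salary = LN(salary).
* A conditional computation: percentage raise only where starting salary is positive.
IF (salbegin > 0) pct_raise = 100 * (salary - salbegin) / salbegin.
EXECUTE.
```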



Activity 7. (7 minutes)

Based on “Expenditure data”, compute a
new variable saving by deducting
expenditure from income.



Cleaning your data: missing data
• There are two types of missing values in SPSS:
system-missing and user-defined.
• System-missing data is assigned by SPSS when a function
cannot be performed, for example when dividing a number
by zero. SPSS indicates that a value is system-missing
with a single period in the data cell.



Cleaning your data – missing data cont.
• When you have missing data in your data set, you
can fill in the missing data using surrounding
information so that the missingness does not
impede your analysis.
Click Transform > Replace Missing Values.
Select the variable with missing values
and move it to the right using the arrow.
SPSS will create a new, renamed
variable containing your filled-in data.
Click Method to select which
method you would like SPSS to use
when replacing missing values.
Click OK and view your new data in
Data View.
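
In syntax, Replace Missing Values corresponds to the RMV command. The sketch below assumes a variable named educ and uses the series mean method.

```spss
* Create educ_1, a copy of educ with missing values replaced by the series mean.
* educ is an assumed variable name.
RMV /educ_1=SMEAN(educ).
```

Other methods include MEAN and MEDIAN of nearby points, LINT (linear interpolation), and TREND (linear trend at point).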



Cleaning your data – missing data
• User-defined missing data are values that the researcher can tell
SPSS to recognize as missing. For example, 9999 is a common
user-defined missing value.
• To define a variable’s user-defined missing value…
• Look at your variables in VARIABLE VIEW
Find the column labeled MISSING
Find the variable that you would like to work with.
Select that variable’s missing cell by clicking on the gray box in the
right corner.
click DISCRETE MISSING VALUES
enter a specific value, such as 9999, to define this variable’s missing
value
A range can also be used if, for example, you only want to use half of
a scale.
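
User-defined missing values can also be declared in syntax. In this sketch the variable names income and Confidence are assumptions for illustration.

```spss
* Treat 9999 as missing for income (assumed variable name).
MISSING VALUES income (9999).
* Treat a range as missing, for example the upper half of a 1-10 scale.
MISSING VALUES Confidence (6 THRU 10).
```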



Weight cases
• Example:
▫ The proportion of male-headed households in the sampled
data is 69%,
▫ while the proportion of such households is 85% in the
total population.
▫ Weighting factor = (% in population) / (% in sample)
▫ Weighting factor for male-headed households is 85/69
= 1.23, and
▫ for female-headed households it is 15/31 = 0.48



Weight cases
◦ Compute a variable called weight whose value is 1.23 (Transform |
Compute Variable | Target Variable = weight | Numeric Expression = 1.23)
◦ Replace the values of weight by 0.48 for female headed households
(Transform | Recode into Same Variables | weight | Old Value = 1.23 |
New Value = 0.48 | Add | Continue | If malehead = 0)
◦ Now go to Data | Weight Cases
◦ Select Weight cases by and move in the variable weight
◦ Click OK

Note that the ‘Weight On’ indicator appears at the bottom right corner of
the Data Editor window. All results from this point on will be weighted by these
weight factors.
To end the weighting, go to Data | Weight Cases | Do not weight
cases
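
The weighting steps can be sketched in syntax as below. It assumes malehead is coded 1 for male-headed and 0 for female-headed households, as in the steps above.

```spss
* Start with the male-headed weighting factor for every case.
COMPUTE weight = 1.23.
* Replace it with the female-headed factor where malehead = 0.
IF (malehead = 0) weight = 0.48.
EXECUTE.
* Turn weighting on; later analyses use these factors.
WEIGHT BY weight.
* Turn weighting off when finished.
WEIGHT OFF.
```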
