
Introduction to SPSS

SPSS

• SPSS is used most widely in social science disciplines and courses.

• SPSS is one of the oldest statistical software programs. It was first made available in the 1960s and has been redeveloped over the years; the latest version at the time of writing was SPSS 24.0.

• SPSS has a "point and click" interface that lets you use pull-down menus to select the commands you wish to perform.
SPSS
• SPSS assists the user in describing data, testing hypotheses and looking for a correlation or relationship between two or more variables.

• SPSS is well suited to most regression analyses and many other kinds of analytical tests:
– regression (linear, logistic, etc.)
– survival analysis
– analysis of variance
– factor analysis
– multivariate analysis
It is not, however, well suited to time series analysis or multilevel regression analysis.
SPSS

PRO
• Easy to learn and use
• More powerful than some other software
• One of the most widely used statistical packages in academia and industry
• Has a command line interface in addition to the menu-driven user interface
• One of the most powerful statistical packages that is also easy to use

CON
• Very expensive
• Not adequate for modeling and cutting-edge statistical analysis
What is Used? (Academia)

Figure 7a. Use of data analysis software in academic publications as measured by hits on Google Scholar.
SPSS for Windows has 3 windows:

Data Editor

Viewer or Draft Viewer, which displays the output files

Syntax Editor, which displays syntax files

The Data Editor has two parts:

Data View window, which displays data from the active file in spreadsheet format

Variable View window, which displays metadata or information about the data in the active file, such as variable names and labels, value labels, formats, and missing value indicators.
SPSS Data View

SPSS Variable View

SPSS Menu & Toolbars

• File, Edit, View, Window, Help: Similar to most Windows applications.

File - Standard options for opening, saving, printing and exiting
Edit - Standard commands to undo, redo, cut, copy and paste
View - Options for showing/hiding toolbars and for displaying values or their labels in the Data Editor
Window - Provides options for switching between different SPSS windows
Help - Contains the SPSS help system
Toolbars Continued

• Data - Used to manipulate the data: sort, merge, etc.

• Transform - Creation of new variables.

• Analyze - Heart of SPSS.
– This menu provides access to the statistical procedures for analysing your data set.
– All the items on the Analyze menu have submenus.

• Graphs - Provides options to create high-quality plots and charts.

• Utilities - Used to display information on individual variables.


Data Entry into SPSS

• There are 2 ways to enter data into SPSS:

1. Enter it directly into SPSS by typing in Data View
2. Enter it into other software such as Excel, EpiInfo, EpiData, etc. and then import it into SPSS
1. Manual Data Entry
• Manually enter data:

1. Define variables in Variable View
2. Enter data in Data View

Enter variables

1. Click Variable View.
2. Type the variable name under the Name column (e.g. Age).
NOTE: A variable name can be up to 64 bytes long, and the first character must be a letter or one of the characters @, #, or $.
3. Type: numeric, string, etc.
4. Label: description of the variable.
The Workspace
[Screenshot of the Data Editor, with callouts for: Variables, Value labels, Cases, and the tabs used to toggle between Data View and Variable View]
Enter cases

1. There are two variables in the data set.
2. They are: Code and Q01.
3. Code is an ID variable, used to identify individual cases (NOT people’s real IDs).
4. Q01 is about participants’ ages: 1 = 12 years or younger, 2 = 13 years, 3 = 14 years…

Enter the values for each case under Data View.
2. Import from other software
Example: Reading in Data from Excel to SPSS

• Two options:

1 – Copy the data in Excel and paste it directly into the Data View screen
2 – Read in an Excel file (.xls)

Read in an Excel file (.xls)

• Select File > Open > Data
• Choose Excel as the file type
• Select the file you want to import
• Then click Open
Reading in Data from Excel to SPSS
Warning:

• SPSS is much better at handling numeric variables than string variables (categorical data entered as text).

• Therefore, if you want to transfer data from Excel to SPSS, it is a good idea to ensure that any categorical data (e.g. yes/no/don’t know, male/female, etc.) are entered in Excel as numeric data (codes) rather than text.

• For example, you could always code ’No’ as 0 and ’Yes’ as 1, and so on.
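
The coding rule above can be tried out before the file ever reaches SPSS. The Python sketch below is only an illustration and is not an SPSS feature: the codebook (including 9 for "don't know") and the sample answers are made up, and any text that is not in the codebook is returned as None so that it can be reviewed by hand.

```python
# Hypothetical codebook: text answer -> numeric code (0 = "No" as the reference).
# The code 9 for "don't know" is an assumption made for this example.
CODEBOOK = {"no": 0, "yes": 1, "don't know": 9}


def encode(answer):
    """Return the numeric code for a text answer, or None if it is not in the codebook."""
    if answer is None:
        return None
    return CODEBOOK.get(answer.strip().lower())


# Made-up answers as they might have been typed into a spreadsheet.
answers = ["Yes", "no", " YES ", "Don't know", "maybe"]
codes = [encode(a) for a in answers]
print(codes)
```

Unrecognised entries such as "maybe" come back as None rather than being guessed, which keeps typing errors visible.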
Clean data after importing data files
• Key in values and labels for each variable.

• Run frequencies for each variable.

• Check the outputs to see whether any variables have wrong values.

• Check missing values against the physical surveys if you use paper surveys, and make sure they are genuinely missing.

• Sometimes you need to recode string variables into numeric variables.
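
The idea behind running frequencies is to list every value a variable takes and spot the ones that should not be there. The sketch below shows that logic in Python rather than in SPSS; the Q01 values and the assumption that only codes 1 to 5 are valid are invented for the example.

```python
from collections import Counter

# Made-up values of Q01 (age group); None stands for a missing answer.
q01 = [1, 2, 2, 3, 22, 4, None, 3, 5]

# Assumption for this example: only the codes 1-5 are valid.
VALID_CODES = {1, 2, 3, 4, 5}

# Frequency table: how often each value occurs.
frequencies = Counter(q01)

# Values that are neither valid codes nor missing should be checked
# against the paper survey (22 is probably a typing error for 2).
suspect = sorted(v for v in frequencies if v is not None and v not in VALID_CODES)
n_missing = frequencies[None]

print(dict(frequencies))
print("suspect values:", suspect, "missing:", n_missing)
```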
General guidelines for data entry
• Encode categorical variables.

• Convert letters and words to numbers.

• Avoid mixing symbols with data; convert them to numbers.

• Give each participant a unique, sequential case number (ID).

• Place this ID number in the first column on the left.
General guidelines…

• Each variable should be in its own column.

Avoid this:        Change to:
Animal             Animal   Group
Control1           1        0
Control2           2        0
Experiment1        3        1
Experiment2        4        1

• Do not combine variables in one column.

• It is recommended to use 0/1 for 2 groups, with 0 as the reference group.
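
If a file has already been entered in the combined form, the two variables can be separated programmatically. The following Python sketch is one possible way to do it, assuming the labels always look like "Control" or "Experiment" followed by a number; the function name is hypothetical.

```python
import re


def split_animal(label, animal_id):
    """Split a combined label such as 'Control1' into (animal ID, group code).

    Group is coded 0 for Control (the reference group) and 1 for Experiment.
    """
    match = re.fullmatch(r"(Control|Experiment)(\d+)", label)
    if match is None:
        raise ValueError(f"unexpected label: {label!r}")
    group = 0 if match.group(1) == "Control" else 1
    return animal_id, group


labels = ["Control1", "Control2", "Experiment1", "Experiment2"]

# Give every animal a sequential ID, as in the "Change to" layout.
rows = [split_animal(label, i) for i, label in enumerate(labels, start=1)]
print(rows)
```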
General guidelines…
• All data for a project should be in one spreadsheet.

• Do not include graphs or summary statistics in the spreadsheet.

• Each participant should be entered on a single line or row.

• Do not copy a participant's information to another row to perform subgroup analysis.
General guidelines…

• However, when data are collected repeatedly from the same participant, it is recommended to enter each patient-day observation on a single line to ease data management.
• SPSS has a nice feature to convert from the longitudinal format to the horizontal format.
• When there are only a few repeats (2 or 3), the horizontal format may be preferred for simplicity.

Longitudinal data entry

Date       ID   SYSBP
1/2/2005   1    130
1/3/2005   1    120
1/4/2005   1    120
3/1/2005   2    110
3/2/2005   2    140

Horizontal data entry

ID   SYSBP1   SYSBP2   SYSBP3
1    130      120      120
2    110      140
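
To make the two layouts concrete, the conversion from longitudinal to horizontal format can be sketched in a few lines of Python. This is not the SPSS restructuring feature itself, only an illustration of what it does; it assumes the rows are already sorted by date within each ID, and the function name is made up.

```python
# The longitudinal table from the slide: (Date, ID, SYSBP).
long_rows = [
    ("1/2/2005", 1, 130),
    ("1/3/2005", 1, 120),
    ("1/4/2005", 1, 120),
    ("3/1/2005", 2, 110),
    ("3/2/2005", 2, 140),
]


def long_to_wide(rows):
    """Return one dict per ID with SYSBP1, SYSBP2, ... in the order the rows were given."""
    by_id = {}
    for _date, pid, sysbp in rows:
        by_id.setdefault(pid, []).append(sysbp)

    # Every participant gets the same number of columns; gaps are left as None.
    n_repeats = max(len(readings) for readings in by_id.values())
    wide = []
    for pid, readings in by_id.items():
        padded = readings + [None] * (n_repeats - len(readings))
        row = {"ID": pid}
        for i, value in enumerate(padded, start=1):
            row[f"SYSBP{i}"] = value
        wide.append(row)
    return wide


wide = long_to_wide(long_rows)
for row in wide:
    print(row)
```

Participant 2 has only two readings, so SYSBP3 stays empty (None), matching the blank cell in the horizontal table.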
General guidelines…
• Do not leave a cell blank to mean "no"; enter the code for "no" instead.

• Do not enter “?”, “*”, or “NA” for missing data, because this indicates to the statistical program that the variable is a string variable.

• String variables cannot be used for any arithmetic computation.

• Put ordinal variables into one column if the categories are mutually exclusive.

Avoid:                         Preferred:

Pain                           Pain
Mild   Moderate   Severe
1      0          0            1
0      1          0            2
0      0          1            3
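
Data already entered in the "Avoid" layout can be collapsed into a single ordinal column. The Python sketch below shows one way to do it; the function name is hypothetical, and treating rows without exactly one ticked level as missing is an assumption of this example.

```python
def pain_score(mild, moderate, severe):
    """Collapse three mutually exclusive 0/1 indicators into 1 = mild, 2 = moderate, 3 = severe."""
    flags = (mild, moderate, severe)
    if sorted(flags) != [0, 0, 1]:
        # Not exactly one level ticked: treat as missing and check the source form.
        return None
    return flags.index(1) + 1


indicator_rows = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)]
scores = [pain_score(*row) for row in indicator_rows]
print(scores)
```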
Data merging in SPSS – Adding Variables
It is a way of merging or joining two or more data sets into a single data set.

To do this, we must merge the variables in the two data sets by the values of the common (and unique) SampleID variable (this is known as the key variable), so that the correct unit information is associated with each candy packet.

In order to do this:

• Click Data > Merge Files > Add Variables
Side to Side Merge

ID   Health1   Health2        ID   Educ1   Educ2
01   02        03             01   34      45
02   04        05             02   71      55
03   14        24             03   62      34
:    :         :              :    :       :
n    X1        X2             n    X1      X2

• Used when data files have the same records but different variables

• Each file should have key field(s) to ensure correct merging

• For example: Person A enters Health data, Person B enters Education data
Data merging…
1. Make sure that both files are sorted by the key variable in ascending order.
2. In SPSS, open one of the data files.
3. Select Add Variables under Data > Merge Files.
4. Select the dataset you want to merge into the working file.
5. Click Match cases on key variables in sorted files.
6. Click Both files provide cases.
7. Highlight ID in the Excluded Variables box, then click ► next to Key Variables.
Notes on data merging in SPSS
• Cases must be sorted in the same order in both data files.

• If one or more key variables are used to match cases, the two data files must be sorted in ascending order of the key variable(s).

• Variable names in the second data file that duplicate variable names in the working data file are excluded by default, because Add Variables assumes that these variables contain duplicate information.

• Thus, before you merge data files, check carefully for any two variables with the same name.

• Even if two such variables contain different information, SPSS automatically drops the variable from the file that is being merged in.
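
The logic of a side to side merge on a key variable can be sketched in Python. This only illustrates the behaviour described above (match on ID, both files provide cases, the working file wins when names are duplicated); it is not how SPSS implements it, and the function name is hypothetical.

```python
# The two files from the Side to Side Merge slide.
health = [
    {"ID": "01", "Health1": 2, "Health2": 3},
    {"ID": "02", "Health1": 4, "Health2": 5},
    {"ID": "03", "Health1": 14, "Health2": 24},
]
educ = [
    {"ID": "01", "Educ1": 34, "Educ2": 45},
    {"ID": "02", "Educ1": 71, "Educ2": 55},
    {"ID": "03", "Educ1": 62, "Educ2": 34},
]


def add_variables(working, other, key="ID"):
    """Merge two files side by side, giving one row per key value found in either file."""
    working_by_key = {row[key]: row for row in working}
    other_by_key = {row[key]: row for row in other}

    # Column order: working-file variables first, then the new ones.
    names = list(dict.fromkeys(name for row in working + other for name in row))

    merged = []
    for k in sorted(set(working_by_key) | set(other_by_key)):
        row = {name: None for name in names}      # unmatched cells stay missing
        row.update(other_by_key.get(k, {}))
        row.update(working_by_key.get(k, {}))     # working file wins on duplicate names
        merged.append(row)
    return merged


merged = add_variables(health, educ)
for row in merged:
    print(row)
```

A case that appears in only one file is kept, with None in the variables that the other file would have supplied.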
Concatenating or appending data in SPSS
This merges cases that were entered into two different data sets.

• Click Data > Merge Files > Add Cases
Top to Bottom Merge
• Used when data files have the same variables but different records
• Used to combine data entered by different data entry staff
• For example: A enters records 1 to 10, B enters records 11 to 20

Records entered by A:

ID   Var1   Var2   Var3
01   24     54     62
02   32     54     14
03   54     24     35
:    :      :      :
10   35     46     45

Records entered by B:

ID   Var1   Var2   Var3
11   35     45     12
12   64     74     25
13   54     54     65
:    :      :      :
20   37     65     56
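
A top to bottom merge simply stacks the rows of the two files. The Python sketch below illustrates that, using the first three records from each table; the two checks (same variables, no duplicate IDs) are safeguards added for this example, and the function name is hypothetical.

```python
file_a = [
    {"ID": "01", "Var1": 24, "Var2": 54, "Var3": 62},
    {"ID": "02", "Var1": 32, "Var2": 54, "Var3": 14},
    {"ID": "03", "Var1": 54, "Var2": 24, "Var3": 35},
]
file_b = [
    {"ID": "11", "Var1": 35, "Var2": 45, "Var3": 12},
    {"ID": "12", "Var1": 64, "Var2": 74, "Var3": 25},
    {"ID": "13", "Var1": 54, "Var2": 54, "Var3": 65},
]


def add_cases(first, second):
    """Stack the rows of two files that share the same variables."""
    if first and second and set(first[0]) != set(second[0]):
        raise ValueError("the two files do not have the same variables")
    stacked = first + second
    ids = [row["ID"] for row in stacked]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate IDs: the same record was entered twice")
    return stacked


combined = add_cases(file_a, file_b)
print([row["ID"] for row in combined])
```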
Data Cleaning in SPSS
1. Re-coding existing variables – into the same variable

2. Re-coding existing variables – into a different variable

3. Creating a new variable from existing variables
Recoding existing variables
• We want to use numeric coding for Group instead of A and B.

ID   Old Group   New Group
1    A           0
2    A           0
3    B           1
4    B           1
Recoding existing variables (2)
From the SPSS menu bar, go to Transform > Recode > Into Same Variables.
Recoding existing variables (3)

1. Move Group from the variable list into the String Variables box
2. Click on Old and New Values to proceed

Recoding existing variables (4)

1. Type the old value and the new value you want to convert it into
2. Click on Add (to change or remove an entry, click on Change or Remove)
3. Add all values to the Old --> New box, then click Continue
4. Click OK to execute the commands.
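
What the dialog does to the data can be pictured with a short Python sketch. It is only an illustration of recoding into the same variable (the old values are overwritten), using the four cases from the Old/New table; it is not SPSS syntax.

```python
# Old value -> new value, as entered in the Old --> New box.
RECODE = {"A": 0, "B": 1}

rows = [
    {"ID": 1, "Group": "A"},
    {"ID": 2, "Group": "A"},
    {"ID": 3, "Group": "B"},
    {"ID": 4, "Group": "B"},
]

for row in rows:
    # Overwrite Group in place; an unexpected value raises KeyError
    # instead of being silently left as text.
    row["Group"] = RECODE[row["Group"]]

print(rows)
```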
Re-coding existing variables – into a different variable
• Recoding into a different variable transforms an original variable into a new variable.

• That is, the changes do not overwrite the original variable; they are instead applied to a copy of the original variable under a new name.

To recode into different variables, click Transform > Recode into Different Variables.
Re-coding existing variables – into a different variable
• The Recode into Different Variables window will appear.
Re-coding existing variables ….
Input Variable -> Output Variable: The center text box lists the variable(s) you have selected to recode, as well as the name your new variable(s) will have after the recode. You will define the new name in the Output Variable area (C).
Output Variable: Define the name and label for your recoded variable(s) by typing them in the text fields. Once you are finished, click Change. The center text box (B) will now display both the name of the original variable and the name of the new variable (e.g., “Height --> Height_categ”).
Old and New Values: Click Old and New Values to specify how you wish to recode the values for the selected variable.
If: The If option allows you to specify the conditions under which your recode will be applied.
Re-coding existing variables ….
Old and New Values
Once you click Old and New Values, a new window where you will
specify how to transform the values will appear.

Re-coding existing variables ….

Old Value: Specify the type of value you wish to recode (e.g., a specific value, missing data, or a range of values) and the specific value to be recoded (e.g., a value of “1” or a range of “1-5”).

New Value: Specify the new value for your variable (i.e., a specific numeric code such as “2,” system-missing, or copy old values).

Old --> New: Once you have selected the old and new values for your selected variable in (1) and (2), click Add in area (3), Old --> New.
• The recode that you have specified now appears in the text field.
• If you need to change one of the recodes that you have added to the Old --> New area, simply click on the one you wish to change and make changes in (1) and (2) as necessary.
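
Recoding a range of values into a different variable can be illustrated in Python, following the “Height --> Height_categ” example. The cut points (160 and 180 cm) and the sample heights are invented for this sketch; the point is that the original variable is kept and missing values stay missing.

```python
def recode_height(height_cm):
    """Hypothetical cut points: below 160 -> 1, 160 to below 180 -> 2, 180 or more -> 3."""
    if height_cm is None:
        return None  # missing stays missing
    if height_cm < 160:
        return 1
    if height_cm < 180:
        return 2
    return 3


# Made-up heights in centimetres.
rows = [{"Height": 152.0}, {"Height": 171.5}, {"Height": None}, {"Height": 180.0}]

for row in rows:
    # The result goes into a new variable; Height itself is not overwritten.
    row["Height_categ"] = recode_height(row["Height"])

print(rows)
```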
Creating a new variable for Diastolic blood pressure (DiasBP):
In SPSS, go to Variable View, then type DiasBP in the last row under Name.

Go back to Data View and type the diastolic blood pressure values directly into DiasBP, so that they are stored separately from SysBP. For ease of data entry, you can move DiasBP so that it sits right after SysBP. Then edit SysBP as well.
Creating new variable from existing variables
• Sometimes you may need to compute a new variable based on existing information (from other variables) in your data.

• For example, you may want to:

• Convert the units of a variable from feet to meters

• Use a subject's height and weight to compute their BMI

• Apply a computation conditionally, so that a new variable is only computed for cases where certain conditions are met
Creating new variable from….

To compute a new variable, click Transform > Compute Variable.

The Compute Variable window will open where you will specify how to calculate your new variable.
Target Variable: The name of the new variable that will be created during the computation.

The left column lists all of the variables in your dataset.

Numeric Expression: Specify how to compute the new variable by writing a numeric expression.

The center of the window includes a collection of arithmetic operators, Boolean operators, and numeric characters, which you can use to specify how your new variable will be calculated.

If: The If option allows you to specify the conditions under which your computation will be applied.

Function group: You can also use the built-in functions in the Function group list on the right-hand side of the window.
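
The three kinds of computation mentioned earlier (unit conversion, BMI, and a conditional computation) can be sketched in Python to show what the numeric expression and the If condition amount to. The variable names, the sample cases and the condition "age 18 or older" are assumptions made for this example.

```python
FEET_TO_METERS = 0.3048  # exact conversion factor


def compute_bmi(weight_kg, height_m):
    """BMI = weight (kg) / height (m) squared; missing or invalid input gives None."""
    if weight_kg is None or height_m is None or height_m <= 0:
        return None
    return weight_kg / height_m ** 2


# Made-up cases: height was recorded in feet.
rows = [
    {"weight_kg": 70.0, "height_ft": 5.5, "age": 30},
    {"weight_kg": 20.0, "height_ft": 3.5, "age": 6},
    {"weight_kg": None, "height_ft": 6.0, "age": 41},
]

for row in rows:
    # Numeric expression: convert feet to meters.
    row["height_m"] = row["height_ft"] * FEET_TO_METERS
    # Conditional computation: BMI only for cases that meet the condition.
    if row["age"] >= 18:
        row["bmi"] = compute_bmi(row["weight_kg"], row["height_m"])
    else:
        row["bmi"] = None

for row in rows:
    print(row)
```

Cases that do not meet the condition, or that lack a weight, end up with a missing BMI rather than a wrong one.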
