Introduction to SPSS

Sisay Wondaya (Ph.D.)


(Ass. Prof. in Statistics, Specialization in Biostatistics)

Email: sisaywondaya@gmail.com
Introduction to SPSS

❖ Originally, SPSS was an acronym for Statistical Package for the Social Sciences.

❖ It now stands for Statistical Product and Service Solutions.

❖ It is a statistical analysis and data management software package.

❖ It can take data from almost any type of file


Cont’d…

❖ It is used to:

o Create data sets

o Generate and tabulate reports and charts

o Plot distributions and trends

o Compute descriptive statistics and run complex statistical analyses.


Launching SPSS

❖ There are many ways to launch SPSS.

❖ The easiest way is to start it from the Start button located at the bottom
of the Windows desktop.

❖ Click Start ➔ All Programs ➔ SPSS Inc ➔ IBM SPSS Statistics 20.
…Launching SPSS
❖ A start-up dialog window opens and asks what you intend to do.

❖ You may select Run the tutorial to take a tour of the most basic SPSS features.

❖ If you select Type in data, the SPSS Data Editor opens.
SPSS windows

❖ In running SPSS, you will encounter several windows.

❖ The four most common windows in SPSS are

1. The Data Editor window

2. The Viewer window (The Output Navigator)

3. The Chart Editor

4. The Syntax Editor


1. Data Editor Window
❖ The most important components of the Data Editor window are
menus, toolbar, and status bar
A. Data Editor Menus
❖ The menu bar provides easy access to most SPSS features.

❖ It consists of eleven drop-down menus:


Getting Help
▪ There are various ways to get help through the SPSS Help system.

▪ Help/Topics. Gives information about how to carry out particular tasks.

▪ Help/Case Studies. Provides hands-on examples of how to create various types of statistical analyses and how to interpret the results.

▪ Help/Statistics Coach. Designed to assist in data analysis by leading you through a series of questions about your data and what you want to do with them.
B. Data Editor Toolbar

❖ It provides quick and easy access to frequently used features

❖ It is displayed below the menu bar on Data Editor window.

❖ Each button performs an action, such as opening a data file or selecting a chart for editing.

❖ To execute a function, place the pointer over the corresponding button and click.
C. Status Bar

❖ A status bar is located at the bottom of the SPSS application window

❖ It indicates the current status of the SPSS processor.

❖ If the processor is running a command, it displays the command name and a case counter indicating the current case number being processed.

❖ When the statement SPSS Processor is ready appears in the Status Bar,
SPSS is ready to receive your instructions.
Data Editor Window…Cont’d

❖ The Data Editor is a spreadsheet in which we define our variables and enter data.

❖ The Data Editor window consists of two views:

o The Data View

o The Variable View


A. The Data View

❖ It is simply a grid with rows and columns

❖ The rows represent subjects (cases or observations)

❖ The columns represent variables

❖ The cell is the intersection between row and column in the grid

❖ A cell will therefore contain the score of a particular subject (or case) on
one particular variable
…the Data View

❖ The Data View window displays the contents of the data file.

❖ It is used for:

o Creating and entering new data

o Editing and modifying existing data

❖ It opens automatically when you start an SPSS session.
B. The Variable View
❖ It is also a simple grid with rows and columns

❖ It contains descriptions of the attributes of each variable that makes up your data set.

❖ In this window the rows are variables and the columns are variable attributes.

❖ It is used to define the type of information that is entered into each column in Data View.

❖ Variable attributes can be added, deleted, and modified in this window.
2. The Viewer Window

❖ It is where results are displayed after a statistical procedure has been performed.

❖ It is divided into two main sections:

o The left pane contains an outline view of the output contents

o The right pane contains statistical tables, charts, and text output.

❖ You can edit the output in this window and save it for later use.

❖ This window opens automatically the first time you run a procedure that generates output.
3. The Chart Editor Window
❖ This window is used to edit charts and plots.

❖ It is only displayed after SPSS has been requested to produce a plot.

❖ You can use the window to change the colors, select different fonts or font sizes, rotate axes, change the chart type, and the like.

❖ The window can be accessed by double-clicking on any graph displayed in the Viewer.
4. The Syntax Editor

❖ Most SPSS commands are accessible from the SPSS menus and dialog
boxes.

❖ However, some commands and options are available only by using the
SPSS command language.

❖ In this case the Syntax Editor window is used.

❖ You will also use this window if you wish to run SPSS commands by typing them instead of clicking through the pull-down menus.

❖ To open it: File ➔ New ➔ Syntax
Working with the Data Editor

❖ Creating and Manipulating Data in SPSS

o We use the Data Editor window when creating or accessing data in SPSS

o There are three steps that must be followed to create a new data set in SPSS

• STEP 1: Defining Variables

• STEP 2: Entering Data

• STEP 3: Saving a New Data Set


STEP1: Defining a variable

❖ Whenever you work with data, make sure that the variables in the data are defined, so that we can understand exactly what was measured, and how.

❖ Defining a variable includes giving it a name, specifying its type, its values, etc.

❖ Without this information, the data will be much harder to understand and use.
…STEP1: Defining a variable
❖ There are two ways of defining information about variables:

o 1. Using the Variable View column attributes

o 2. Using the Define Variable Properties window

• In this case the data file must already contain data

Defining variables in Variable View
❖ The Variable View tab displays the information about each variable in your data, in columns
❖ There are eleven columns altogether, namely:
o Name
o Type
o Width
o Decimals
o Label
o Values
o Missing
o Columns
o Align
o Measure
o Role
Variable names
❖ It is always better to give meaningful names to all variables.

❖ If you do not, SPSS names the variables var00001, var00002, and so on.
Rules in Defining Variable Name
1. Must not exceed 64 bytes in current versions of SPSS, i.e. 64 characters for English letters. (A character is simply a letter, digit or symbol.)
2. Must begin with a letter.
3. May contain a mixture of letters, digits, and any of the following symbols: @, #, _, $.
4. Must not end with a full stop.
5. Must not contain blanks (spaces) or any of the following: !, ?, *.
6. Must not be one of the keywords reserved by SPSS (e.g., AND, NOT, EQ, BY, and ALL).
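The naming rules can be checked mechanically. The following is a minimal Python sketch of that logic, not SPSS syntax: the function name `is_valid_name` is hypothetical, the length limit is a parameter (its default of 64 is an assumption for recent SPSS versions, so set it to whatever your version enforces), and periods are allowed inside a name but not at the end.

```python
import re

# Keywords reserved by the SPSS command language.
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}

# First character a letter; later characters letters, digits, @ # _ $ or a period.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9@#_$.]*")


def is_valid_name(name, max_len=64):
    """Return True if `name` passes the naming rules listed on the slide."""
    if not name or len(name) > max_len:
        return False
    if NAME_PATTERN.fullmatch(name) is None:  # rejects blanks, !, ?, *
        return False
    if name.endswith("."):                    # must not end with a full stop
        return False
    return name.upper() not in RESERVED       # must not be a reserved keyword


for candidate in ["age", "sys_bp1", "1stvisit", "marital status", "by", "score."]:
    print(candidate, is_valid_name(candidate))
```

Only the first two candidates pass; the others break the rules on the first character, blanks, reserved keywords, and the trailing full stop, in that order.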
…variable names

❖ Changing the name of a variable does not change the data

o All values associated with the variable stay the same.

o The specified attributes of the variable remain the same

❖ To change a variable's name, double-click the name of the variable that you wish to rename and type the new name.
Specifying the type of variable
❖ This column enables you to specify the type of variable. Click on the ‘Type’ box to set it. The two basic types of variables that you will use are numeric and string.
Width
❖ The number of digits displayed for numerical values, or the length of a string variable.

o It determines the number of characters SPSS will allow to be entered for the variable

❖ To set a variable's width, click the cell corresponding to the “Width” column, then click the "up" or "down" arrow icons to increase or decrease the width.
Decimal
❖ The number of digits after a decimal point for each value of the variable
(applicable to non-string variables only).

❖ Note that this changes how the numbers are displayed, but does not
change the values in the dataset.

❖ To specify the number of decimal places for a numeric variable, click the cell corresponding to the “Decimals” column for that variable.

❖ Then click the “up” or “down” arrow icons to increase or decrease the number of decimal places.
Label
❖ A brief but descriptive definition or display name for the variable.

❖ When defined, a variable's label will appear in the output in place of its
name.

❖ Example: The variable prechf might be described by the label “preexisting congestive heart failure”.
Values

❖ Used for coded categorical variables to indicate which code represents which category.

❖ Value labels are useful primarily for categorical (i.e., nominal or ordinal) variables, especially if they have been recorded as codes (e.g., 1, 2, 3).

❖ Note that defining value labels only affects the labels associated with each value, and does not change the recorded values themselves.
…values
❖ When value labels are defined, the labels will display in the output instead of the original codes.
…values

❖ Type the first possible value (1) for your variable in the Value field.

❖ In the Label field type the label exactly as you want it to display (e.g.,
"Freshman").

❖ Click Add when you are finished defining the value and label.

❖ Your variable value and label will appear in the center box.

❖ Repeat these steps for each possible value for your variable.

❖ Click OK at the bottom of the window.
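The point that value labels change what is displayed, not what is stored, can be illustrated with a small Python sketch (not SPSS syntax). Only "Freshman" comes from the slide; the other labels and the data values are hypothetical.

```python
# Hypothetical value labels for a coded variable (code -> label).
value_labels = {1: "Freshman", 2: "Sophomore", 3: "Junior", 4: "Senior"}

# The recorded data remain numeric codes.
class_year = [1, 3, 2, 1, 4]

# Labels are looked up only for display; a code without a label is shown as-is.
displayed = [value_labels.get(code, str(code)) for code in class_year]

print(displayed)   # what the output shows
print(class_year)  # what the data file still contains
```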


Change or remove value and label

❖ To change a value/label, highlight it in the center text box of the Value Labels window.

❖ Make changes to the selected value or label as needed, and then click Change.

❖ To remove a specific value/label, highlight it in the center text box and click Remove.
Missing
❖ The user defined values that indicate data are missing for a variable (e.g.,
99).

❖ Note that this does not affect or eliminate SPSS's default missing value
code (".").

❖ This column merely allows the user to specify alternative codes for
missing values

❖ To set user defined missing value codes, click inside the cell
corresponding to the “Missing” column for that variable.

❖ A square button will appear; click on it


…Missing
❖ The Missing Values window appears.

❖ Click the option that best matches how you wish to define
missing data and enter any associated values, then click OK at the
bottom of the window.
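As a rough illustration of what a user-defined missing code does during analysis, here is a Python sketch (not SPSS syntax). The code 99 follows the slide's example; the ages are made up, and `None` plays the role of the system-missing value (".").

```python
from statistics import mean

USER_MISSING = {99}                    # user-defined missing code(s)
ages = [25, 31, 99, 42, None, 38]      # None stands in for system-missing

# Both kinds of missing value are left out of the calculation.
valid = [a for a in ages if a is not None and a not in USER_MISSING]

print(len(valid), "valid cases, mean age =", mean(valid))
```

Without the missing-value definition, 99 would be treated as a real age and would inflate the mean.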
Columns

❖ This simply refers to the width of the actual column in the spreadsheet

❖ To set a variable's column width, click inside the cell corresponding to the “Columns” column for that variable.

❖ Then click the “up” or “down” arrow icons to increase or decrease the
column width
Align
❖ The alignment of content in the cells of the SPSS Data View spreadsheet.

❖ Options include left-justified, right-justified, or center-justified

❖ To set the alignment for a variable, click inside the cell corresponding to
the "Align" column for that variable.

❖ Then use the dropdown menu to select your preferred alignment: Left,
Right, or Center
Measure
❖ The level of measurement for the variable (e.g., nominal, ordinal, or
scale)

❖ Some procedures in SPSS treat categorical and scale variables differently.

❖ By default, variables with numeric responses are automatically detected as “Scale” variables.

❖ If the numeric responses actually represent categories, you must change the specified measurement level to the appropriate setting.
…Measure

❖ To define a variable's measurement level, click inside the cell corresponding to the “Measure” column for that variable.

❖ Then click the dropdown arrow to select the level of measurement for that variable: Scale, Ordinal, or Nominal.

❖ It is vital that you correctly define each variable's measurement level.

❖ This setting affects everything from graphs to internal algorithms for statistical analysis.

❖ Incorrectly specifying the measurement level can have unintended and potentially disastrous effects on your results.
Role

❖ The role that a variable will play in your analyses (i.e., independent
variable, dependent variable, both independent and dependent).

❖ Some options in SPSS allow you to preselect variables for particular analyses based on their defined roles

❖ Any variable that meets the role requirements will be available for use in
such analyses
…Role

❖ Input: For variables used as a predictor (independent variable). This is the default assignment for variables.

❖ Target: For variables used as an outcome (dependent variable).

❖ Both: For variables used as both a predictor and an outcome (independent and dependent variable).

❖ None: The variable has no role assignment.

❖ Partition: The variable will partition the data into separate samples

❖ Split: Used with IBM® SPSS® Modeler (not IBM® SPSS® Statistics)

❖ To define a variable's role in your analysis, click inside the cell corresponding to the “Role” column for that variable.

❖ Then use the dropdown menu to select the role that variable will take: Input, Target, Both, None, Partition, or Split.
Exercise: Define Variables Below
ID No. ……………………        Facility Name ……………………
Town ……………………        Date of interview ……………………
Interviewer name and signature ……………………        Supervisor name and signature ……………………

Part I: Background information (before counseling)
SN  Question and response options
01  Respondent’s age (in completed years) _______________________
02  Ethnicity? 1. Oromo 2. Amhara 3. Others (specify) __________
03  Religion: 1. Muslim 2. Orthodox 3. Protestant 4. Others (specify) _______________
04  Level of education? 1. Illiterate 2. Read and write 3. Elementary 1st cycle (1-4) 4. Elementary 2nd cycle (4-8) 5. High school (9-10) 6. Preparatory (11-12) 7. College/university
05  Marital status: 1. Single 2. Divorced 3. Married 4. Widowed 5. Other (specify) _____________
06  Residence: 1. Rural 2. Urban
Step-2: Entering Data
❖ Data entry is accomplished in Data View window.

❖ Switch from the Variable View window to the Data View window.

❖ Each row represents a case or an observation.

❖ Clicking any cell will highlight it (active cell) and its contents will appear
in the cell editor.

❖ Data can be entered in any order.

❖ Data values are not recorded until you press Enter or select another
cell.
…Entering Data

❖ Unlike spreadsheet programs, cells in the Data Editor cannot contain formulas.

❖ Enter the values for all cases on one variable (column), and then repeat the procedure for the remaining columns.
Editing Data
❖ To delete the old value and enter a new value: click the cell, enter the
new value, press Enter.

❖ To modify a data value: click the cell, click the cell editor, edit the data
value, and press Enter.

❖ To delete the values in a range, select (highlight) the area concerned and
press Delete.

❖ Use the Undo command in Edit to undo any action you just performed.
For example, use the Undo command to delete the value you have just
entered in the Data Editor window.
Adding Cases
❖ To insert a new case (row) in between cases that already exist in your data file: click the row below the position where you wish to enter the new case, click Edit on the menu bar, then click Insert Cases from the pull-down menu (in some older versions this command is under the Data menu).

Deleting Cases
❖ To delete a case, click the case number that you wish to delete, click Edit on the menu bar, and then click Clear.

❖ The selected case will be deleted and the rows below will shift upward.
Exercise: Perform Data Entry
Saving Data Files

❖ To save a new SPSS data file, or to save data in a different format, make the Data Editor the active window.

❖ From the main menu choose File and then Save As…. The Save Data As dialog box is displayed.
…Saving Data Files
❖ Choose the appropriate directory in the Look in: box to save your file.

❖ Then type the name of the data file in the File name: box. No extension (i.e. a dot followed by three letters) is required.

❖ SPSS automatically adds the proper extension, which depends on the type of file.

❖ To save changes to an SPSS data file make the Data Editor the active
window and from the menus choose File and then Save.

❖ The modified data file is saved, overwriting the previous version of the
file.

❖ By default, this will save the data file as an SPSS data file.
Data Manipulation Using SPSS
Data Manipulation

❖ The Data menu is used to:

• Merge files

• Sort cases

• Split file

• Select cases
I. Merging Data

❖ Two data files can be merged (combined) using SPSS

❖ There are two types of merging

1. Merging to add cases (rows)

2. Merging to add variables (columns)


1. Merging to add cases (rows)

❖ For example, data files entered on two different computers can be merged into one

❖ To merge the data, the two files should have variables with the same characteristics

❖ Such merging is usually possible if the same template was used during data entry

❖ If there is even a slight difference in the characteristics of a variable, merging for that variable will be difficult
How to Merge Cases?

❖ Procedure
1. Open one of the two files (the file that you want to come first)

2. From the pull-down menu click “Data” and go down to select “Merge Files”

3. From “Merge Files” select “Add Cases”

o Data ➔ Merge Files ➔ Add Cases

[Screenshot: Add Cases dialog. 1st select the file option; 2nd click to select the file.]

➔ Select the data file you want to merge through the browser and click Continue
You will find two windows:

1. Unpaired variables: the list of variables that are not matched and therefore cannot be included in the new file

❖ If there is even a slight difference in the characteristics of a variable between the two files, it will be placed in this window

o * ➔ marks variables that come from the opened file

o + ➔ marks variables that come from the new file

2. Variables in the new working file

❖ The list of variables that can be merged into the file

❖ If all variables are found in this window, click “OK”. The result is a new file, which should be saved for further analysis.
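The logic of adding cases, including how a variable ends up as "unpaired", can be sketched in Python (this is an illustration of the idea, not SPSS syntax). The two small files and their variable names are hypothetical; matching here is by name only, whereas SPSS also compares variable characteristics such as type.

```python
# Two hypothetical files entered on different computers.
file_a = [{"id": 1, "age": 34, "sex": "M"},
          {"id": 2, "age": 29, "sex": "F"}]
file_b = [{"id": 3, "age": 41, "gender": "F"}]   # "gender" does not match "sex"

vars_a, vars_b = set(file_a[0]), set(file_b[0])
paired = sorted(vars_a & vars_b)     # variables found in both files
unpaired = sorted(vars_a ^ vars_b)   # variables found in only one file

# The merged file stacks the rows and keeps only the paired variables.
merged = [{v: row[v] for v in paired} for row in file_a + file_b]

print("paired:", paired)
print("unpaired:", unpaired)
print("cases in merged file:", len(merged))
```

Here `sex` and `gender` land in the unpaired list, which is why using the same data-entry template on every computer matters.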
2. Merging to add variables

❖ It is used when we want to add certain variables to a file from another database

❖ It is usually useful where a database exists

❖ It is also useful for taking a variable from master data

❖ A variable that identifies the same cases in both files (a common identity variable) is needed

❖ One or more variables are added to each case of the file


How to merge variables?
Procedure
1. Open the two files and prepare them for merging

o (Both files should be sorted in ascending order of the common identity variable and saved)

2. From the pull-down menu click “Data” and go down to select “Merge Files”

3. From “Merge Files” select “Add Variables”

o Data ➔ Merge Files ➔ Add Variables

4. Select the data file with the variable you want to merge and click Open
Cont’d…
❖ You will find two windows
1. Excluded variables: the list of variables that will be excluded from the working file
o Variables found in both files are also listed here, so the common identity variable will be found here too
• * ➔ marks variables that come from the opened file
• + ➔ marks variables that come from the new file
2. Variables in the new working data file: the list of variables that will be in the new file
o If all the variables needed in the original file are included, the merge can be started.
1st Select the common identity variable
2nd Tick “Match cases on key variables in sorted files”
3rd Pass the common identity variable to “Key Variables” by clicking the arrow
If you are sure that both data files are sorted and saved, click OK
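Adding variables by a key can be sketched as follows in Python (an illustration of the matching logic, not SPSS syntax). The files, the key name `id`, and the added variable `town` are hypothetical. Both files are sorted by the key first to mirror the requirement above, although the dictionary lookup used here does not itself depend on the order.

```python
# Working file and a hypothetical master file that holds an extra variable.
cases = [{"id": 3, "age": 41}, {"id": 1, "age": 34}, {"id": 2, "age": 29}]
master = [{"id": 2, "town": "Adama"}, {"id": 1, "town": "Jimma"}]

# Sort both files in ascending order of the common identity variable.
cases.sort(key=lambda row: row["id"])
master.sort(key=lambda row: row["id"])

# Match each case to the master record with the same key.
lookup = {row["id"]: row for row in master}
merged = []
for row in cases:
    match = lookup.get(row["id"], {})
    # A case with no match keeps a missing value (None) for the new variable.
    merged.append({**row, "town": match.get("town")})

for row in merged:
    print(row)
```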
II. Sort

❖ Sorting is useful in cleaning data

❖ When you sort in ascending or descending order, you can find ‘missing data’, ‘unknown (unexpected)’ data and ‘outliers’

❖ If you find such cases, you can re-check them against the hard-copy data for possible correction

❖ It is also useful during merging (especially when you are adding variables)
Procedure
Two ways

1. Using the pull-down menu

o From the pull-down menu click “Data” and go down to select “Sort Cases”

o Then select the variable you want to sort by and pass it to “Sort by”

o Select ascending or descending order and click “OK”

• Data ➔ Sort Cases

[Screenshot: Sort Cases dialog. 1st select the variable; 2nd click the arrow to pass it into “Sort by”; 3rd select ascending or descending order.]
Cont’d…

2. Right-clicking

o This is done in the “Data View” of SPSS

o Go to the name of the variable you want to sort by and right-click it

o Select ascending or descending sorting as you wish

o The outcome is then displayed in the Data Editor, in “Data View”
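How sorting helps with data cleaning can be seen in a short Python sketch (not SPSS syntax; the cases are made up). Sorting in descending order brings an implausible value to the top, and cases with a missing value are grouped together at the end.

```python
# Hypothetical cases: one outlier (age 199) and one missing age.
cases = [{"id": 1, "age": 34}, {"id": 2, "age": 199},
         {"id": 3, "age": None}, {"id": 4, "age": 27}]

known = [row for row in cases if row["age"] is not None]
missing = [row for row in cases if row["age"] is None]

# Descending sort by age; missing values are listed last.
by_age_desc = sorted(known, key=lambda row: row["age"], reverse=True) + missing

for row in by_age_desc:
    print(row)
```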


III. Split file

❖ It is used to do analysis by stratification

❖ Data are stratified by the variable selected for splitting.

❖ The outcome (result) of the planned analysis is displayed by stratum (i.e., the result for each value of the splitting variable is given separately)

❖ E.g., if data are split by sex, then any analysis done will be displayed first for males, then for females (first for value 1, then for value 2)
Procedure:

❖ From the pull-down menu click “Data” and go down to select “Split File”

❖ Then in the displayed window select “Organize output by groups”

❖ Then pass the variable to “Groups Based on” and click “OK”

o Data ➔ Split File

o Then do any analysis you want to perform


[Screenshot: Split File dialog. 1st select “Organize output by groups”; 2nd select the variable you want to split by; 3rd click the arrow to pass it to “Groups Based on:”; finally, click OK.]
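The effect of a split file on a later analysis can be sketched in Python (an illustration, not SPSS syntax). The data and the coding 1 = male, 2 = female are assumptions; the "analysis" here is simply a mean computed separately for each group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases; sex is coded 1 = male, 2 = female (an assumption).
cases = [{"sex": 1, "systolic": 120}, {"sex": 2, "systolic": 110},
         {"sex": 1, "systolic": 140}, {"sex": 2, "systolic": 130}]

# Stratify the data by the splitting variable.
groups = defaultdict(list)
for row in cases:
    groups[row["sex"]].append(row["systolic"])

# Run the same analysis within each stratum, in order of the group value.
results = {sex: mean(values) for sex, values in sorted(groups.items())}

for sex, value in results.items():
    print("sex =", sex, "mean systolic =", value)
```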
IV. Select cases

❖ It is useful when you want to analyze data within a certain category of a variable (conditionally selected data)

o E.g., you can analyze data among the male population only

❖ It can also be used to select study subjects from a sampling frame (simple random sampling)

❖ It can also be used to select a certain range of cases in sequence

A. Conditional selection

How is it performed?

1. From the pull-down menu click “Data” and go down to select “Select Cases”

2. Then in the displayed window select “If condition is satisfied” and click “If”

3. In the “Select Cases: If” window select the variable and pass it to the space given

4. Complete the condition you want to select on, using the mathematical functions given
Cont’d...
5. Click “Continue”

6. Back in the “Select Cases” window, select “Deleted” under “Unselected Cases Are” (choose “Filtered” instead if you want to keep the unselected cases in the file)

o Any analysis done will be on the selected category that fulfills the selection criteria

❖ Then do any analysis you want to perform


[Screenshots: Select Cases dialogs. 1st select “If condition is satisfied”; 2nd click “If”. In the If window: 1st select the variable you want to select on; 2nd click the arrow to pass the variable; then build the condition with the functions provided (right-clicking a function shows what it means) and click Continue. Finally, select “Deleted” and click OK. Any analysis done will be on the selected category that fulfills the selection criteria.]
Some commands

❖ In the numerical expression

❖ Select if
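A conditional selection amounts to keeping only the cases for which a logical expression is true. The Python sketch below illustrates this (it is not SPSS syntax); the data and the coding of sex as "M"/"F" are hypothetical.

```python
# Hypothetical cases.
cases = [{"id": 1, "sex": "M", "age": 50},
         {"id": 2, "sex": "F", "age": 61},
         {"id": 3, "sex": "M", "age": 38}]

# Keep only the cases that satisfy the condition (here: males).
selected = [row for row in cases if row["sex"] == "M"]

# Choosing "deleted" corresponds to dropping everything else for good.
print([row["id"] for row in selected])
```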
B. Simple random selection

How is it performed?

A list of the population (sampling frame) is needed

1. From the pull-down menu click “Data” and go down to select “Select Cases”

2. Then in the displayed window select “Random sample of cases” and click “Sample”

3. In the “Select Cases: Random Sample” window, click “Exactly”, then write the sample size in the first space and the total number in the population in the second space
4. Click “Continue”

5. Back in the “Select Cases” window, select “Deleted” under “Unselected Cases Are”

The result is the list of cases selected by simple random sampling


[Screenshots: Select Cases dialogs. 1st select “Random sample of cases”; 2nd click “Sample”. In the Random Sample window: 1st click “Exactly”; 2nd write the sample size; 3rd write the total population in the list; finally, click Continue. Then select “Deleted” and click “OK”. The list of the sample is the result.]
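Drawing exactly n cases from a sampling frame of N can be sketched in Python as follows (an illustration, not SPSS syntax). The frame of 200 IDs, the sample size of 20, and the fixed seed are all assumptions made for the example; the seed only makes the run repeatable.

```python
import random

population_ids = list(range(1, 201))   # hypothetical sampling frame, N = 200
sample_size = 20                       # hypothetical sample size, n = 20

rng = random.Random(2024)              # fixed seed so the draw can be repeated
sample_ids = sorted(rng.sample(population_ids, sample_size))

print(len(sample_ids), "cases selected")
print(sample_ids)
```

`random.sample` draws without replacement, so no case can be selected twice.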
Practical Exercise

❖ How would you sort the data by ‘Age’ in descending order?

❖ Split the file by a string variable of the given data.

❖ Select cases with a specified value or condition.

❖ Merge files using “Add cases”.

❖ Merge files using “Add variables”.


Transforming Data Using SPSS
Data Transformations
❖ After data have been entered into SPSS, it may be necessary to modify them in certain ways.

❖ With SPSS, you can perform data transformations ranging from simple tasks, such as combining categories for analysis, to more advanced tasks, such as creating new variables based on complex equations.
Data Transforming

❖ Compute

❖ Rank Cases

❖ Count

❖ Recode

❖ Date and time wiz…


A. Compute

❖ Compute is used to get a new variable by manipulating other variable(s).

❖ It uses different conditions and logical and mathematical functions

❖ It is used for numerical variables

❖ It can replace most data manipulation techniques


Procedure

Transform ➔ Compute

❖ From the pull-down menu click “Transform” and go down to select “Compute”

❖ Then in the displayed window write a new variable name and give its value in “Numeric Expression” (usually a numeric variable value)

❖ Click “If” to apply the manipulation conditionally (logical or mathematical condition)


…Compute
[Screenshot: Compute Variable dialog. 1st the space to write the variable name; 2nd the space to write the numeric expression; 3rd click “If” for conditional manipulation.]

❖ Within the “Compute Variable: If Cases” window, select “Include if case satisfies condition”

❖ You can now move any variable to the space provided

❖ Write a function or logical expression

❖ Click “Continue” to return.

❖ Click “OK” to run the manipulation now, or click “Paste” to save the command as syntax.
…Compute

❖ The numeric expression can be typed directly or assembled by clicking the arrows in the Variable and Function group boxes.

❖ Observe that there is a numeric variable icon (in the shape of a histogram or ruler) next to the variables age and systolic. In fact, all numeric variables (e.g., weight and height) are identified with this icon.

❖ On the other hand, all string variables (e.g., sex) are identified by a categorical variable icon (two circles) with the letter a
…Compute

❖ The If… dialog box allows you to apply data transformations to selected subsets of cases

❖ For example, to calculate a new variable EXCESS, the excess SBP defined as excess = systolic - 125, write excess as the target variable and systolic - 125 as the numeric expression

❖ When you have completed the expression, click OK to end the Compute command

❖ If you wish to obtain the excess SBP for people over 45 only, click the If… button and enter “age>45” in the appropriate text box. To restrict the result to males over 45, add the condition on sex as well, joined with “and” (&)

[Screenshot: Compute Variable: If Cases dialog. 1st click “Include if case satisfies condition”; 2nd select the variable and pass it to the space; write the condition you want; 3rd click “Continue”.]
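Mathematical Functions
[Screenshot: calculator pad of the Compute dialog, with keys for mathematical expressions (+, -, *, /), exponentiation (**), greater/lesser than (>, <), greater or equal and lesser or equal (>=, <=), equal (=), ‘not equal’ (~=), and (&), or (|), logical not (~), and brackets for grouping, plus a list of functions (e.g., time functions).]
Click “Paste” to keep the command as syntax.

3rd Click “OK” to run the manipulation now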
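The EXCESS example can be sketched in Python to show what a conditional compute does case by case (an illustration, not SPSS syntax). The three cases and the coding of sex as "M"/"F" are hypothetical; cases that do not satisfy the condition are left missing (`None`).

```python
# Hypothetical cases.
cases = [{"sex": "M", "age": 50, "systolic": 150},
         {"sex": "F", "age": 60, "systolic": 140},
         {"sex": "M", "age": 40, "systolic": 130}]

for row in cases:
    # Condition: males over 45 only.
    if row["sex"] == "M" and row["age"] > 45:
        row["excess"] = row["systolic"] - 125   # excess = systolic - 125
    else:
        row["excess"] = None                    # not computed, stays missing

for row in cases:
    print(row)
```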


B. Count
❖ It counts selected answers (similar to adding up items)

❖ It is useful for estimating the sum score (scale) of a phenomenon

o E.g., knowledge of HIV/AIDS

• It is estimated by the total number of questions an interviewee answered correctly.

❖ It is widely used in psychiatric and psychological measurement


Procedure
Transform ➔ Count…

❖ Make sure that all questions used for the scale use the same numeric code for the correct answer

❖ From the pull-down menu click “Transform” and go down to select “Count…”

❖ Then in the displayed window write the new variable name in the “Target Variable” space and its label in the “Target Label” space
Procedure . . .
❖ Pass the questions prepared for the scale to the “Variables” list, one by one or together

❖ Below the variable list click “Define Values”; another window for entering the value will open

❖ Write the common correct value in the “Value” space, pass it to “Values to Count” by clicking “Add”, and click Continue to return.

❖ Click “OK” to run the count now, or click “Paste” to save the command as syntax
Example
❖ Suppose we have data from a “self-reported questionnaire” with 20 yes/no questions

❖ 1 = Yes if the illness occurred, and 0 = No if not

❖ We want a score out of 20.


[Screenshots: Count dialogs. 1st write the new variable name; 2nd write the label of the variable; 3rd select the variables used for the scale and transfer them to “Variables”; then click “Define Values”. In the next window: 1st write the common correct value in “Value”; 2nd transfer it into “Values to Count” by clicking “Add”; finally click Continue to return. Click OK to complete the count, or click “Paste” to keep the command as syntax.]
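The 20-question example can be sketched in Python to show what the count produces for each respondent (an illustration, not SPSS syntax). The three answer patterns are made up; the code 1 = Yes follows the example above.

```python
COUNTED_VALUE = 1   # the common code to count (1 = Yes)

# Three hypothetical respondents, 20 answers each.
respondents = [
    [1] * 20,        # answered Yes to everything
    [0] * 20,        # answered No to everything
    [1, 0] * 10,     # alternating Yes / No
]

# The score is the number of answers equal to the counted value.
scores = [sum(1 for answer in answers if answer == COUNTED_VALUE)
          for answers in respondents]

print(scores)
```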
C. Recode
❖ It is useful for reducing data (continuous variables to discrete variables)

❖ It is used to group continuous data

❖ A single variable can be recoded into the same variable or into a different variable

❖ Conditional recoding is also possible

❖ Two types of recoding

o Recode into the same variable

o Recode into different variables


I. Recode into the same variable
❖ No new variable is produced by the recoding process

❖ The reduced values replace the original values in the same variable

❖ One or more variables can be recoded using a single command
Procedure
Transform ➔ Recode ➔ Into Same Variables…

❖ Decide what you want to recode and how to group the values; taking notes is essential.

❖ From the pull-down menu click “Transform” and go down to select “Recode” and “Into Same Variables”

[Screenshots: Recode into Same Variables dialogs. 1st select the variable; 2nd click “Old and New Values”; then group the old values and provide a new value.]
Cont’d. . .
❖ Select the old values to be grouped as you planned, give them a new value, and put the selection into the space called “Old --> New” by clicking Add

❖ Continue the procedure until all old values are recoded, and click Continue to return.

❖ Click “OK” to recode now, or click “Paste” to save the command as syntax.
Old Value
o Missing values
o Value: a single value is entered
o Range: a range of values, from the lowest to the highest
o Range, lowest through value: all values below the written value, e.g. < 20
o Range, value through highest: all values above the written value, e.g. > 50
o All other values: the other ungrouped values (click the option button first)
o Example: old values could be put in ranges
New Value
o Value: the value to be written
o System-missing
Buttons
o Add: to pass/add the selected values
o Change: to change a mistakenly selected value
o Remove: to remove the selected value
At the end click Continue
Click OK to complete the recoding now

Click “Paste” to keep the recoding command as syntax
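Grouping a continuous variable into categories can be sketched in Python (an illustration, not SPSS syntax). The cut points below 20 and above 50 follow the examples on the slide; the group codes 1 to 3, the middle range, and the ages are assumptions. The result is written to a new list, which corresponds to recoding into a different variable.

```python
def recode_age(age):
    """Map an age in years to a hypothetical age-group code."""
    if age is None:
        return None    # missing stays missing
    if age < 20:
        return 1       # all values below 20
    if age <= 50:
        return 2       # range 20 to 50
    return 3           # all values above 50


ages = [15, 20, 35, 50, 51, None]
age_group = [recode_age(a) for a in ages]

print(age_group)
```

Deciding beforehand which group the boundary values (here 20 and 50) belong to is part of the note-taking the procedure recommends.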


II. Recode into different variables
❖ A new variable is formed from the original (mother) variable

❖ The reduced values are written to a new variable

❖ Each original variable is recoded into one new variable


Procedure
Transform ➔ Recode ➔ Into Different Variables…

❖ Decide what you want to recode and how to group the values; taking notes is essential

❖ From the pull-down menu click “Transform” and go down to select “Recode” and “Into Different Variables”

❖ Then in the displayed window select the variable you want to recode and pass it to the space named “Input Variable -> Output Variable”.
[Screenshots: Recode into Different Variables dialogs.]
1st Select a variable to be recoded
2nd Click the arrow to move the variable
1. Then write the new variable name
2. Write also the label of the new variable
3. Click “Change” to move the new variable
4. Click “If” if you want conditional changes
5. Click “Old and New Values” to continue recoding
Cont’d…
❖ If Cases window
1st Click “Include if case satisfies condition” to get the variables for the conditional function
2nd Select the variable to be conditioned
3rd Click the arrow to pass the selected variable
4th Write the function
5th Click Continue


Cont’d…
❖ Select the old values for grouping as you planned, give them a
new value, and put the selection in the space called “Old -> New”
by clicking “Add”

❖ Repeat step 5 above till all old values are
recoded, then click “Continue” to return to the main dialog.

❖ Click “OK” to recode now, or click “Paste” to save the command as syntax.

In the Old and New Values window: group the old values, provide a new
value, and continue adding old and new values. At the end click
“Continue”, then click “OK”.
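
The logic of recoding into a different variable can be sketched outside SPSS. The Python example below is only an illustration, not SPSS syntax: the function name `recode`, the age cut-offs (below 20, 20 to 50, above 50) and the sample ages are assumptions made up for the sketch, and ages are assumed to be whole numbers.

```python
def recode(value, rules, else_value=None):
    """Return the new value for `value` from the first matching (low, high, new) rule."""
    if value is None:
        # A missing old value stays missing in the new variable.
        return None
    for low, high, new in rules:
        if low <= value <= high:
            return new
    # Values not covered by any rule ("all other values").
    return else_value


# Hypothetical grouping of whole-number ages: < 20 -> 1, 20 to 50 -> 2, > 50 -> 3
age_rules = [
    (float("-inf"), 19, 1),
    (20, 50, 2),
    (51, float("inf"), 3),
]

ages = [15, 20, 37, 50, 64, None]                   # the original (mother) variable
age_group = [recode(a, age_rules) for a in ages]    # the new variable

print(ages)
print(age_group)
```

The original list is left untouched and the grouped values go into a new list, which mirrors "Into Different Variables" as opposed to recoding into the same variable.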
D. Date and Time Wizard
❖ It is useful for performing operations on dates and times

❖ Time is a necessary part of epidemiological phenomena

❖ It is part of the calculation of ‘rates’, as in incidence, prevalence, etc.

❖ In longitudinal studies, it is useful for calculating person-years

❖ In survival analysis, the cumulative hazard and survival are
measured using a time measure

Date and Time Wizard…
❖ The wizard can produce a date format from simple numbers
(representing day, month and year)

❖ It can also calculate a time measure (days or years) from dates.

❖ It is able to extract a day, a month or a year from a ‘date format’.

❖ It is also able to extract the week of the year from a ‘date format’.

I. Creating a date variable
❖ A date format can be formed from simple numbers entered in
software that has no date format

❖ The date parts can be entered as day (1-31), month (1-12), and a
four-digit year

❖ In such a condition a date-format variable can be created


Procedure
Transform ➔ Date and Time Wizard

1. Select the required task in the first window of the wizard, then click “Next”
2. Select the variables holding the numbers that represent the year, month and day
3. Transfer each variable to its respective time field by clicking the arrow button, then click “Next”
4. Write the new variable name, choose the output format, write the label of the variable, and click “Finish”


II. Calculating a time measure
❖ Time is calculated from two date-format variables

❖ The outcome is measured in seconds, but can be transformed into days,
weeks, months or years

❖ E.g. date of interview minus date of birth gives the age of a person
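
Both wizard tasks, building a date from separate day, month and year numbers and measuring the time between two dates, can be sketched in Python. This is an illustration only, not what SPSS runs internally: the helper names `make_date` and `age_in_years`, the sample dates, and the use of 365.25 days per year are assumptions of the sketch.

```python
from datetime import date


def make_date(day, month, year):
    """Build a date from three plain numbers: day (1-31), month (1-12), four-digit year."""
    return date(year, month, day)


def age_in_years(birth, interview):
    """Time between two dates in years, assuming an average year of 365.25 days."""
    elapsed_days = (interview - birth).days
    return elapsed_days / 365.25


# Hypothetical respondent
birth = make_date(15, 6, 1980)
interview = make_date(1, 3, 2012)

print((interview - birth).days)                    # elapsed time in days
print(round(age_in_years(birth, interview), 1))    # the same interval in years
```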


Procedure
Transform ➔ Date and Time Wizard

1. Select the option for calculating with dates and times, then click “Next”
2. Choose whether you want to add/subtract time or to calculate the time between two dates
3. Then click “Next”
Performing Statistical Analysis Using SPSS
Statistical Analysis

❖ Descriptive Statistics

❖ Graphics

❖ Comparing means

❖ Comparing proportion

❖ Correlation and crosstab

❖ Regression…
Prerequisites for analysis
❖ Be clear about the objectives of the study

❖ Knowledge of the types of variables

o Qualitative vs. quantitative

o Dependent vs. independent

❖ Knowledge of the measurement scale

❖ Knowledge of the type of analysis needed for each objective, type
of variable and measurement scale
Study objectives
❖ Research is carried out principally to answer certain questions

❖ We should be aware that:

o Results should answer the objectives (study questions)

o Discussion should interpret what the results mean in
answering the objectives

o Conclusion should be based on the answers to the objectives

o Recommendations should also be based on the findings, not on
wishes
Cont’d…
❖ Results should answer the objectives (study questions)

❖ E.g.:

o To determine the prevalence of TB in a community

o To assess factors associated with HIV/AIDS

o To measure the effect of multiple partners on HIV/AIDS prevalence


Types of variables
❖ Before any analysis in research, be clear about:

o Dependent and independent variables

o Qualitative and quantitative variables

o Measurement scales (nominal, ordinal, interval and ratio
scales)
Cont…
❖ There are two major forms of variables

o Qualitative (or categorical)

o Quantitative (or numerical)


Summary

Types of variables and measurement scales

Type of variable              Measurement scale   Description    Example
Qualitative (categorical)     Nominal             not ordered    ethnic group
Qualitative (categorical)     Ordinal             ordered        response to treatment
Quantitative (measurement)    Discrete            count data     number of admissions
Quantitative (measurement)    Continuous          real-valued    height

Dependent vs Independent Variables
❖ Dependent variable

o Is the outcome (end-product) variable of a research

o Example ;

• Depression status

• HIV status

• Condom use

• Treatment defaulting

[Diagram: Independent variable ➔ Dependent variable]


Cont’d…
❖ Independent variable

o An explanatory variable that is assumed to be a determinant
(= cause) of the outcome variable

o Example

• Adverse life event

• Experience of violence

• HIV status, if the outcome is getting TB


Types of Analysis
❖ Univariate analysis

o Usually aimed at characterizing or describing your study
population

o To assess the nature of your variables, such as normality

❖ Bivariate analysis

o To assess or test independence or association between two
variables

Cont’d…
❖ Multivariate analysis

o To study the functional relationship between multiple
variables

o To forecast or predict the value of one variable
corresponding to a given value of another variable
Bivariate Analysis
❖ Bivariate analysis is the second step in analysis

❖ It is analysis made to test for the presence of a relationship between two
variables

❖ It describes the presence of association between two variables

❖ It answers the question: is there a relationship between these two
variables?

❖ It is the initial step in hypothesis testing


Possible combinations
❖ There are three possible combinations of variable types:

o Two qualitative variables

o Two quantitative variables

o One quantitative and one qualitative variable


1. Two qualitative variables
❖ This is when both the dependent & the independent variables are
categorical

❖ The analysis can be done using

o Crosstabs & logistic regression in SPSS

❖ Chi-square is the usual test statistic


Performing Bivariate Analysis using SPSS
❖ Analyze ➔ Descriptive Statistics ➔ Crosstabs

❖ Then under Crosstabs

o Put the dependent variable in “Column(s)” & the independent
variables in “Row(s)”.

o Click ‘Statistics’ and mark ‘Chi-square’ and ‘Risk’.

o Click ‘Cells’ and mark ‘Row’ under the percentages

❖ NB: For a case-control study, it is better to click ‘Cells’ and mark
‘Column’
Analyze ➔ Descriptive Statistics ➔ Crosstabs

[Screenshots of the Crosstabs dialogs: the independent variables (one or more categorical variables) go to “Row(s)” and the dependent variable to “Column(s)”; under ‘Statistics’, ‘Chi-square’ and ‘Risk’ are marked; under ‘Cells’, ‘Row’ is marked.]
Output

gender * depression diagnosis Crosstabulation

                                      depression diagnosis
                                      non-case   depression case    Total
gender   female   Count                  497          358            855
                  % within gender       58.1%        41.9%         100.0%
         male     Count                  420          160            580
                  % within gender       72.4%        27.6%         100.0%
Total             Count                  917          518           1435
                  % within gender       63.9%        36.1%         100.0%

Callouts on the table: “This is considered as the reference”; “Compare percentages
between different exposure status”.

Chi-Square Tests

                                Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                 (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              30.571(b)    1     .000
Continuity Correction(a)        29.955       1     .000
Likelihood Ratio                31.089       1     .000
Fisher's Exact Test                                              .000         .000
Linear-by-Linear Association    30.550       1     .000
N of Valid Cases                1435
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 209.37.

Callout on the table: “X2 that needs consideration (for 2x2)”.
• If the variables form a 2x2 table, take the X2 under the Continuity Correction
• If the table is 2x(>2), take the X2 under the Pearson Chi-Square
• If any cell in the table has an expected count < 5, choose the Likelihood Ratio or Fisher’s Exact Test
• If the dependent variable is of ordinal type, choose the Linear-by-Linear Association.
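
The Pearson and continuity-corrected chi-square values for a 2x2 table can be checked by hand. The sketch below uses the standard shortcut formulas for a 2x2 table and the four cell counts from the gender by depression crosstab; the function name `chi_square_2x2` is made up for this illustration and is not an SPSS routine.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistics for the 2x2 table [[a, b], [c, d]].

    Returns (Pearson chi-square, continuity-corrected chi-square,
    minimum expected cell count).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denom = row1 * row2 * col1 * col2
    diff = abs(a * d - b * c)

    pearson = n * diff ** 2 / denom
    # Yates' continuity correction subtracts n/2 before squaring.
    corrected = n * max(diff - n / 2, 0) ** 2 / denom
    # The smallest expected count comes from the smallest row and column totals.
    min_expected = min(row1, row2) * min(col1, col2) / n
    return pearson, corrected, min_expected


# Cell counts from the crosstab: female (497 non-case, 358 case), male (420, 160)
pearson, corrected, min_expected = chi_square_2x2(497, 358, 420, 160)
print(round(pearson, 3), round(corrected, 3), round(min_expected, 2))
```

The three results agree, after rounding, with the Pearson Chi-Square, Continuity Correction and minimum expected count reported in the output.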
Cont’d…

Risk Estimate

                                                        95% Confidence Interval
                                               Value      Lower      Upper
Odds Ratio for gender (female / male)           .529       .421       .664
For cohort depression diagnosis = non-case      .803       .744       .866
For cohort depression diagnosis =
   depression case                             1.518      1.302      1.770
N of Valid Cases                               1435

Callout on the table: “OR that needs consideration (for 2x2)”.

1. This table gives us the ‘OR’ or ‘RR’ if & only if the variables in the
analysis form a 2x2 table
2. The first row value of the independent variable is considered as the
reference in the OR (1st row) & RR (2nd row) of the above
analysis result

3. The second row value of the independent variable is considered as the
reference in the RR (3rd row) of the above analysis
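
The values in the Risk Estimate table can be reproduced from the four cell counts of the crosstab. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the confidence interval uses the usual large-sample (log odds ratio) formula with z = 1.96.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio of the 2x2 table [[a, b], [c, d]] with a large-sample 95% CI."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def risk_ratio(events_1, total_1, events_2, total_2):
    """Ratio of two proportions (group 1 relative to group 2)."""
    return (events_1 / total_1) / (events_2 / total_2)


# Rows: female, male. Columns: non-case, depression case.
or_value, or_low, or_high = odds_ratio_ci(497, 358, 420, 160)
rr_non_case = risk_ratio(497, 855, 420, 580)   # female / male, non-case column
rr_case = risk_ratio(358, 855, 160, 580)       # female / male, case column

print(round(or_value, 3), round(or_low, 3), round(or_high, 3))
print(round(rr_non_case, 3), round(rr_case, 3))
```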
Binary Dependent Variable
❖ When the dependent variable is binary we are able to use:

o Simple crosstabs (as above)

o Logistic regression (binary)

o If we are using binary logistic regression, the dependent
variable should be treated as success & failure

o The success should be coded as ‘1’ & the failure as ‘0’



Cont’d…
❖ Analyze ➔ Regression ➔ Binary Logistic

o Transfer the dependent variable to “Dependent” & the
predictor (only one predictor variable) to “Covariates”.

o If the predictor variable is categorical, click “Categorical”,
highlight the variable and transfer it to “Categorical
Covariates”

o Choose the reference category option (First or Last), click
“Change”, then click “Continue”.

o Click “Options” & mark “CI for exp(B): 95%”



Analyze ➔ Regression ➔ Binary Logistic

[Screenshots of the Binary Logistic dialogs:
1. The dependent variable and the independent variable are placed in their boxes; click “Categorical”, shade (highlight) the variable and pass it across with the arrow button.
2. The variable now appears under “Categorical Covariates”.
3. Choose the reference option, Last or First, according to your hypothesis or your expectation, then click “Change”.]


Choosing the Reference Group
❖ One or more values of the independent variable are considered as
either exposure or non-exposure for the outcome of interest
❖ The reference category of the independent variable is selected based on our
hypothesis or experience
❖ Usually the normal occurrence is considered as the reference
(non-exposure)
❖ The postulated reference category should be coded so that it comes First
or Last in order
❖ The reference group is then selected according to its place in
that order
Analyze ➔ Regression ➔ Binary Logistic

o Click “Options” and mark “CI for exp(B): 95%”


OUTPUT
Dependent Variable Encoding

Original Value       Internal Value
non-case                   0
depression case            1

Categorical Variables Codings

                                 Parameter coding
                    Frequency          (1)
gender   female        855            .000
         male          580           1.000

Callouts on the tables:
o These tables show the values of the dependent & independent variables
o The reference is female
o Parameter code (1) is given to the exposure (e.g. here ‘male’)
OUTPUT
Variables in the Equation

                                                                   95.0% C.I. for EXP(B)
                         B      S.E.     Wald    df   Sig.   Exp(B)    Lower     Upper
Step 1(a)   SEXNO(1)   -.637    .116   30.202     1   .000    .529      .421      .664
            Constant   -.328    .069   22.396     1   .000    .720
a. Variable(s) entered on step 1: SEXNO.

Here B is the regression coefficient: the slope for the predictor and the
intercept for the constant. For the predictor, it is the change in the logit of
the outcome variable associated with a one-unit change in the predictor variable.

The Wald statistic has a chi-square distribution.

The most crucial and most often reported values for the interpretation of logistic regression are
Exp(B) & its 95% CI. Exp(B) is the change in odds resulting from a unit
change in the predictor.
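
How Exp(B), its confidence interval and the Wald statistic follow from B and S.E. can be sketched as below. This is an illustration only: the function names are hypothetical, z = 1.96 is assumed for the 95% interval, and small differences from the SPSS output are expected because B and S.E. are printed with three decimals.

```python
import math


def exp_b_with_ci(b, se, z=1.96):
    """Odds ratio Exp(B) and its 95% CI from a logit coefficient and its standard error."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


def wald_statistic(b, se):
    """Wald chi-square statistic: the squared ratio of B to its standard error."""
    return (b / se) ** 2


def predicted_probability(constant, b, x):
    """Predicted probability of the outcome coded 1, for predictor value x."""
    logit = constant + b * x
    return 1 / (1 + math.exp(-logit))


# Values from the Variables in the Equation table
b, se, constant = -0.637, 0.116, -0.328

odds_ratio, ci_low, ci_high = exp_b_with_ci(b, se)
wald = wald_statistic(b, se)
p_female = predicted_probability(constant, b, 0)   # reference category (code 0)
p_male = predicted_probability(constant, b, 1)     # parameter code (1)

print(round(odds_ratio, 3), round(ci_low, 3), round(ci_high, 3), round(wald, 1))
print(round(p_female, 3), round(p_male, 3))
```

The two predicted probabilities correspond to the percentages of depression cases among females and males in the crosstab (41.9% and 27.6%).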
[Figure: odds-ratio scale running from 0 through +1, labelled “Preventive” and “Risk”]

The Exp(B) (odds ratio) & its 95% CI are usually the only results displayed
How should we display the results?
Say we got:

                            OR (95% CI), i.e. Exp(B)
❖ Sex
o Male                      1.00
o Female                    1.86 (1.05, 2.46)
❖ Residence
o Urban                     1.00
o Rural                     2.78 (0.78, 5.64)
❖ Marital status
o Single                    1.00
o Married                   0.67 (0.25, 0.89)
o Divorced/widowed          1.82 (1.04, 2.56)
Interpretation

                        OR (95% CI)          Interpretation
Sex
  Male                  1.00                 Non-exposure (so reference)
  Female                1.86 (1.05, 2.46)    Exposure; being female predicts the outcome
Residence
  Urban                 1.00                 Non-exposure (referent)
  Rural                 2.78 (0.78, 5.64)    Exposure; there is no statistically significant
                                             relation between residence and the outcome
Marital status
  Single                1.00                 Non-exposure (referent)
  Married               0.67 (0.25, 0.89)    Being married reduces the odds of
                                             having the outcome of interest
  Divorced/widowed      1.82 (1.04, 2.56)    Exposure; being divorced or widowed
                                             increases the odds of having the outcome
                                             of interest
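
The reading rule used in the interpretation above (check whether the 95% CI includes 1, then look at the direction of the OR) can be written as a small function. This is a sketch only; the function name and the wording of the returned labels are assumptions of the illustration.

```python
def interpret_or(odds_ratio, lower, upper):
    """Classify an odds ratio from its value and 95% confidence interval."""
    if lower <= 1.0 <= upper:
        # An interval that includes 1 is compatible with no association.
        return "not significant"
    return "increases odds" if odds_ratio > 1.0 else "reduces odds"


# Illustrative values from the interpretation table
results = {
    "Female": (1.86, 1.05, 2.46),
    "Rural": (2.78, 0.78, 5.64),
    "Married": (0.67, 0.25, 0.89),
    "Divorced/widowed": (1.82, 1.04, 2.56),
}

for category, (value, low, high) in results.items():
    print(category, interpret_or(value, low, high))
```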
2. Two quantitative variables
❖ Uses a correlation matrix

❖ Pearson’s correlation is used when the two variables are
continuous & symmetrically distributed

❖ Therefore, we should test the variables for symmetry

❖ If they fulfil the symmetry requirement, we can analyze them using
Pearson’s correlation matrix
Cont’d…
❖ Analyze ➔ Correlate ➔ Bivariate

1st Select the continuous variables
2nd Pass them across by clicking the arrow button
3rd Select ‘Pearson’, or make sure it is selected
Finally click ‘OK’ to see the result

❖ When the continuous variables are symmetrically distributed
we choose ‘Pearson Correlation’ (r)
The result of analysis
❖ Pearson’s correlation coefficient (r) tells you about

o Strength and

o Direction of the relationship

o We can also determine the significance of the correlation using the
p-value
Interpreting the Correlation Coefficient
❖ The value of “r” ranges from -1 to +1 and shows both the
direction and the strength of the relationship
Direction of Relationship
o r > 0: the two variables have a direct relationship, i.e. they tend to
increase or decrease together
o r < 0: the two variables have an inverse relationship, i.e. an
increase in one variable is accompanied by a decrease in the
other
o r = -1 or +1 indicates a perfect negative or
positive relationship, respectively.
Cont’d…
❖ Strength of Relationship
o r = 0.60 to 0.99 (-0.60 to -0.99): strong relationship

o r = 0.30 to 0.59 (-0.30 to -0.59): moderate relationship

o r = 0.01 to 0.29 (-0.01 to -0.29): weak relationship

o r = 0 indicates absence of a relationship (the variables are
uncorrelated)
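
The calculation of r and the strength and direction rules above can be sketched in Python. Everything named here is hypothetical (the functions `pearson_r` and `describe_r`, and the five made-up data pairs); the cut-offs are the ones listed above, with values between the listed bands (e.g. 0.295) assigned to the lower band as an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def describe_r(r):
    """Label the strength and direction of r using the slide's cut-offs."""
    size = abs(r)
    if size == 0:
        return "no relationship"
    if size < 0.30:
        strength = "weak"
    elif size < 0.60:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "direct" if r > 0 else "inverse"
    return f"{strength} {direction} relationship"


# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3), describe_r(r))
```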
Scatter plot
Significance

❖ The significance is indicated by the P-value

❖ When the P-value is below or equal to 0.05, we consider the
correlation statistically significant
Cont’d…
❖ If the variables (especially the dependent) are not symmetrically
distributed

o We should use non-parametric correlation:

• Kendall’s tau-b, or

• Spearman’s rho

Analyze ➔ Correlate ➔ Bivariate

Similar to the Pearson correlation procedure,
but select ‘Kendall’s tau-b’ & ‘Spearman’
❖ The correlation coefficient r & the P-value are interpreted in a similar way
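
Spearman’s rho is Pearson’s correlation computed on the ranks of the data, which is why it does not need symmetric distributions. A minimal sketch, with hypothetical function names and made-up data, assuming tied values receive the average of their ranks:

```python
import math


def average_ranks(values):
    """Rank the values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's correlation applied to the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Made-up skewed data: y rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 250]
print(spearman_rho(x, y))
```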
3. Qualitative & Quantitative Variables
❖ Here you can look at the difference in mean values between two or
more groups

❖ The test of significance is made by:

o ‘Student’s t-test’ for two groups, &

o ‘F-test’ for more than two groups

❖ The P-value is used to judge significance

o P < 0.05 ➔ it is significant

o P > 0.05 ➔ it is NOT significant


Cont’d…
❖ If the dependent variable is symmetrically distributed, look at
the independent variable

❖ 1. If it is categorical & of binary type,

• ➔ Use ‘Student’s t-test’.

Analyze ➔ Compare Means ➔ Independent-Samples T Test

Within Independent-Samples T Test…
❖ Move the dependent variable to the ‘Test Variable(s)’ space & the
independent variable to the ‘Grouping Variable’

❖ Define the groups of the independent variable by their coded numbers, & click
‘OK’

❖ This will give you the mean difference & its significance using the
t-test

E.g. Sex vs Verbal fluency

E.g. ‘Sexno’ is defined as
1. Female
2. Male
OUTPUT
Group Statistics

                                        gender    N     Mean   Std. Deviation   Std. Error Mean
verbal fluency - animal naming score    female   855   15.24       5.711             .195
                                        male     580   15.95       5.493             .228

The group statistics table tells us the mean animal naming score among
males and females.

Independent Samples Test (verbal fluency - animal naming score)

                                Levene's Test     t-test for Equality of Means
                                                                                                       95% CI of the Difference
                                                                 Sig.         Mean         Std. Error
                                  F      Sig.      t       df    (2-tailed)   Difference   Difference    Lower     Upper
Equal variances assumed          .643    .423   -2.336   1433      .020         -.71          .303      -1.300     -.113
Equal variances not assumed                     -2.354   1274.743  .019         -.71          .300      -1.296     -.118

Levene’s test for equality of variances tests the assumption of homogeneity of variance.
o If it is not significant, we can say ‘EQUAL VARIANCES ASSUMED’ and take the results from the first row.
o If it is significant, EQUAL VARIANCES are NOT ASSUMED, and taking the second row is advised.

The t-test tells us whether the mean difference observed in animal naming score between
males & females is statistically significant.
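
Both rows of the Independent Samples Test can be approximated from the Group Statistics table alone. The sketch below uses the standard pooled-variance and Welch formulas; the function names are hypothetical, and because the means and standard deviations are taken from the rounded output, the t values differ slightly from those printed by SPSS.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variances t-test from summary statistics: returns (t, df, std. error)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, se


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variances (Welch) t-test from summary statistics: returns (t, df, std. error)."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / se, df, se


# Group statistics: female first, male second
t_eq, df_eq, se_eq = pooled_t(15.24, 5.711, 855, 15.95, 5.493, 580)
t_w, df_w, se_w = welch_t(15.24, 5.711, 855, 15.95, 5.493, 580)

print(round(t_eq, 2), df_eq, round(se_eq, 3))
print(round(t_w, 2), round(df_w, 1), round(se_w, 3))
```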
❖ If the dependent variable is symmetrically distributed, look
at the independent variable

2. If it is categorical & of non-binary type,
➔ Use the F-test, through either of:

1. Analyze ➔ Compare Means ➔ One-Way ANOVA

2. Analyze ➔ Regression ➔ Linear

1. One-Way ANOVA

Analyze ➔ Compare Means ➔ One-Way ANOVA

❖ Move the dependent variable to the ‘Dependent List’
space & the independent variable to ‘Factor’.

❖ After clicking “Options”, choose

o ‘Descriptive’
o ‘Homogeneity of variance test’ &
o ‘Means plot’
Cont’d…
❖ After clicking “Post Hoc”, choose ‘Tukey’, then click ‘OK’.

o This will give you the between-group &
within-group differences & their significance
using the F-test

o It also gives you regression coefficients (the
intercept & the slope)
Analyze ➔ Compare Means ➔ One-Way ANOVA
e.g. Verbal fluency vs Marital status

[Screenshots of the One-Way ANOVA dialogs: under “Post Hoc” choose ‘Tukey’; under “Options” choose ‘Descriptive’, ‘Homogeneity of variance test’ and ‘Means plot’.]
OUTPUT
Descriptives
verbal fluency - animal naming score

                                                                              95% Confidence Interval for Mean
                                     N     Mean   Std. Deviation  Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Never married                        74   15.42       5.581          .649         14.13         16.71          6        35
currently married or cohabiting     759   16.27       5.471          .199         15.88         16.66          0        36
separated or divorced                58   17.55       6.319          .830         15.89         19.21          5        32
widowed                             427   14.36       5.466          .265         13.84         14.88          0        42
not known                            28   10.07       2.340          .442          9.16         10.98          4        17
Total                              1346   15.54       5.600          .153         15.24         15.84          0        42

The group descriptive statistics tell us the mean animal naming score for each
marital status.

Test of Homogeneity of Variances
verbal fluency - animal naming score

Levene Statistic    df1    df2     Sig.
     5.597           4     1341    .000

Levene’s test for equality of variances tests the assumption of homogeneity of variance. If
it is significant, we say that EQUAL VARIANCES are NOT ASSUMED.

ANOVA
verbal fluency - animal naming score

                   Sum of Squares     df    Mean Square      F       Sig.
Between Groups        2064.896          4     516.224      17.258    .000
Within Groups        40111.191       1341      29.911
Total                42176.086       1345

The ANOVA table tells us that there is a difference in mean animal naming
score between groups that is statistically significant.
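
The F value in the ANOVA table is the ratio of the two mean squares, each being a sum of squares divided by its degrees of freedom. A minimal sketch using the values from the table (the function name is made up for the illustration):

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Mean squares and F ratio from the sums of squares of a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom from the ANOVA table
ms_between, ms_within, f_ratio = anova_f(2064.896, 4, 40111.191, 1341)
print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 3))
```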
Multiple Comparisons
Dependent Variable: verbal fluency - animal naming score
Tukey HSD

                                                                     Mean                              95% Confidence Interval
(I) marital status                (J) marital status                 Difference (I-J)  Std. Error  Sig.   Lower Bound   Upper Bound
Never married                     currently married or cohabiting        -.85            .666      .709      -2.67          .97
                                  separated or divorced                 -2.13            .959      .172      -4.75          .49
                                  widowed                                1.06            .689      .541       -.83         2.94
                                  not known                              5.35*          1.213      .000       2.03         8.66
currently married or cohabiting   Never married                           .85            .666      .709       -.97         2.67
                                  separated or divorced                 -1.29            .745      .419      -3.32          .75
                                  widowed                                1.90*           .331      .000       1.00         2.81
                                  not known                              6.19*          1.052      .000       3.32         9.07
separated or divorced             Never married                          2.13            .959      .172       -.49         4.75
                                  currently married or cohabiting        1.29            .745      .419       -.75         3.32
                                  widowed                                3.19*           .765      .000       1.10         5.28
                                  not known                              7.48*          1.259      .000       4.04        10.92
widowed                           Never married                         -1.06            .689      .541      -2.94          .83
                                  currently married or cohabiting       -1.90*           .331      .000      -2.81        -1.00
                                  separated or divorced                 -3.19*           .765      .000      -5.28        -1.10
                                  not known                              4.29*          1.067      .001       1.38         7.21
not known                         Never married                         -5.35*          1.213      .000      -8.66        -2.03
                                  currently married or cohabiting       -6.19*          1.052      .000      -9.07        -3.32
                                  separated or divorced                 -7.48*          1.259      .000     -10.92        -4.04
                                  widowed                               -4.29*          1.067      .001      -7.21        -1.38
*. The mean difference is significant at the .05 level.

Here the mean of each single category is compared with the mean of every other category;
the result is displayed as the mean difference (I-J), and ‘Sig.’ is the P-value for that difference.

The multiple comparison statistics (Tukey) tell us which pairs of groups differ
in mean animal naming score.

[Figure: means plot, a graphical representation of the mean verbal fluency score by
marital status]
2. Analyze ➔ Regression ➔ Linear
❖ Move the dependent variable to the ‘Dependent’ space & the independent
variable to ‘Independent(s)’

❖ After clicking ‘Statistics’, choose ‘Estimates’, ‘Model fit’, ‘Confidence
intervals’ and ‘R squared change’ & click ‘OK’

o This will give you the between-group & within-group
differences, whose significance is measured using the F-test

o It also gives you the regression coefficients (the intercept & the slope)

o (ß = slope, which shows a positive or negative relationship between the
predictor & the outcome variable)

o It also gives you R2, which is the explanatory or predictive power of the
model in predicting the outcome variable
Analyze ➔ Regression ➔ Linear

[Screenshot of the Statistics dialog: ‘Estimates’, ‘Model fit’, ‘R squared change’ and ‘Confidence intervals’ are marked.]
OUTPUT
Model Summary

                                                                    Change Statistics
                             Adjusted    Std. Error of    R Square
Model     R       R Square   R Square    the Estimate     Change     F Change   df1    df2    Sig. F Change
1       .193(a)     .037       .037         5.496           .037      52.271      1    1344       .000
a. Predictors: (Constant), marital status

The Model Summary shows the R2, which tells us how much of the outcome variable the predictor
variables explain; in this example, it is 3.7%.

ANOVA(b)

Model              Sum of Squares     df    Mean Square      F       Sig.
1   Regression        1578.905          1     1578.905     52.271   .000(a)
    Residual         40597.181       1344       30.206
    Total            42176.086       1345
a. Predictors: (Constant), marital status
b. Dependent Variable: verbal fluency - animal naming score

The ANOVA table also tells us, using the F-test, whether the explanatory variable predicts the outcome
variable well.
OUTPUT
Coefficients(a)

                      Unstandardized Coefficients   Standardized Coefficients                       95% Confidence Interval for B
Model                     B          Std. Error              Beta                  t       Sig.     Lower Bound    Upper Bound
1   (Constant)          17.779          .344                                     51.718    .000       17.105         18.454
    marital status       -.808          .112                -.193                -7.230    .000       -1.027          -.589
a. Dependent Variable: verbal fluency - animal naming score

1. B is the coefficient that each independent variable contributes to the dependent
variable. For a predictor it is the slope (ß); for the constant it is the intercept, the predicted value when X = 0.

It tells us to what extent (degree) each predictor affects the outcome, if the effects of all
other predictors are held constant.

The equation will be: Verbal fluency score = ß0 + ß1 x Marital status + …
= 17.78 - 0.81 x Marital status + …
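
The fitted equation and the summary figures around it can be checked with a few lines of Python. This is a sketch only: the function name is hypothetical, the marital status codes passed to it (0 and 1) are illustrative since the actual coding is not shown in the output, and z = 1.96 is used as an approximation to the t critical value for the 95% interval.

```python
def predict_fluency(marital_code, intercept=17.779, slope=-0.808):
    """Predicted verbal fluency score from the fitted line: intercept + slope * code."""
    return intercept + slope * marital_code


# R squared = regression sum of squares / total sum of squares (ANOVA table)
r_squared = 1578.905 / 42176.086

# t for the slope = B / Std. Error; approximate 95% CI = B -/+ 1.96 * Std. Error
slope, slope_se = -0.808, 0.112
t_slope = slope / slope_se
ci_low = slope - 1.96 * slope_se
ci_high = slope + 1.96 * slope_se

print(predict_fluency(0), round(predict_fluency(1), 3))
print(round(r_squared, 3), round(t_slope, 2), round(ci_low, 3), round(ci_high, 3))
```

With a single predictor, the square of the slope's t value is approximately the F value in the regression ANOVA table.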
Coefficients(a), with the columns discussed below numbered 2 to 4

                      Unstandardized Coefficients   Standardized Coefficients (3)   (4)              95% Confidence Interval for B
Model                     B        Std. Error (2)            Beta                    t       Sig.     Lower Bound    Upper Bound
1   (Constant)          17.779          .344                                       51.718    .000       17.105         18.454
    marital status       -.808          .112                -.193                  -7.230    .000       -1.027          -.589
a. Dependent Variable: verbal fluency - animal naming score

2. The standard error: if its value is so small that adding it to or subtracting it from
ß (the slope) changes the slope only slightly, this suggests that the
coefficient is significant

3. The standardized coefficient may be useful & gives a good estimate
through relative estimation using the standard deviation

4. Student’s t-test is the statistic that estimates the significance; the
coefficient is significant if the upper & lower limits of the 95% CI are both negative or
both positive
Asymmetrical Dependent Variable
Use non-parametric analysis

1. Mann-Whitney Test
Analyze ➔ Nonparametric Tests ➔ Legacy Dialogs ➔ 2
Independent Samples
Within 2 Independent Samples
❖ Move the dependent variable to the ‘Test Variable List’ space and the independent
variable to the ‘Grouping Variable’.

❖ Click ‘Mann-Whitney U’ and ‘Kolmogorov-Smirnov Z’

❖ Define the groups of the independent variable by their coded numbers, and
click ‘OK’.

❖ This will give you the difference in mean ranks and its
significance using the Z score.
‘Sexno’ is defined as
1. Male
2. Female

Mann-Whitney U Test

Ranks
                                        gender     N     Mean Rank   Sum of Ranks
verbal fluency - animal naming score    female    855     700.21      598676.02
                                        male      580     744.23      431653.99
                                        Total    1435

Mean rank of animal naming score by sex

Test Statistics(a)
                             verbal fluency - animal naming score
Mann-Whitney U                   232736.000
Wilcoxon W                       598676.000
Z                                    -1.979
Asymp. Sig. (2-tailed)                 .048
a. Grouping Variable: gender

Kolmogorov-Smirnov Test

Test Statistics(a)
                                       verbal fluency - animal naming score
Most Extreme Differences   Absolute          .082
                           Positive          .082
                           Negative         -.001
Kolmogorov-Smirnov Z                        1.528
Asymp. Sig. (2-tailed)                       .019
a. Grouping Variable: gender
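
The Mann-Whitney U value can be recovered from the rank sum of one group and the two group sizes. The sketch below uses the Wilcoxon W reported for the female group; the function name is hypothetical, and the Z value is computed with the plain normal approximation without a correction for ties, so it is close to but not identical with the SPSS figure.

```python
import math


def mann_whitney_from_rank_sum(rank_sum_1, n1, n2):
    """Mann-Whitney U and its normal-approximation Z from the rank sum of group 1."""
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    # Standard deviation of U under the null hypothesis, ignoring ties.
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mean_u) / sd_u


# Wilcoxon W (rank sum of the female group), n female = 855, n male = 580
u_value, z_value = mann_whitney_from_rank_sum(598676.0, 855, 580)
print(u_value, round(z_value, 3))
```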
End
