SPSS
using SPSS
Jarir At Thobari, MD, DPharm, PhD
Faculty of Medicine
Universitas Gadjah Mada
SPSS Program Windows
• Variable Formats
• Variable Labels
• Value Labels
• Missing Values
• Copying Data Properties
Formatting Your Variables
Variable Formats
– Click on the Variable View tab of the Data Editor to edit or display formats
– Attributes: name, type, width, decimals, label, values, missing, columns, align, measure
Variable Labels
– Type in descriptive text that explains what the variable measures
Formatting Your Variables (cont.)
• Computing Variables
• Collapsing Variables Using Recode
• Counting Values in Other Variables
• Ranking Cases
• Date and Time Variables
Computing New Variables
• Create new variables using equations or functions
– Transform menu
• Compute Variable
– Enter a Target Variable Name, e.g. TestAvg
– Build a Numeric Expression
• E.g. (Test1 + Test2 + Test3)/3
– Click OK
Exercise!
Body Mass Index (BMI)
Calculated as body weight (in kg) divided by the square of height (in m): BMI = weight / height^2
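As a quick check on the result of the exercise, the same formula can be sketched in Python. The function name and the sample values are hypothetical and are not taken from the course dataset; in SPSS the equivalent expression goes into the Numeric Expression box of Compute Variable.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kg divided by squared height in m."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical subject: 70 kg, 1.75 m
print(round(bmi(70.0, 1.75), 1))  # 22.9
```

Note that height must be in metres; a height recorded in centimetres has to be divided by 100 first.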
• Counting Values in Other Variables
• Ranking Cases
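Hyperlipidemia syntax
* Flag hyperlipidemia (Chol1 of 8.00 or higher) among cases with MI = 0.
DO IF (MI=0).
RECODE Chol1 (8.00 thru Highest=1) INTO Hyperlipidemia.
END IF.
VARIABLE LABELS Hyperlipidemia 'Hyperlipidemia category'.
EXECUTE.
* Set every case not flagged above (still system-missing) to 0.
RECODE Hyperlipidemia (SYSMIS=0).
EXECUTE.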
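To make the effect of the two RECODE steps explicit, here is a minimal Python sketch of the same decision rule. The function name is hypothetical, and `None` stands in for an SPSS missing value; it is an illustration of the logic, not a replacement for the syntax.

```python
from typing import Optional


def hyperlipidemia_flag(mi: Optional[int], chol1: Optional[float]) -> int:
    """Return 1 only when MI = 0 and Chol1 >= 8.00, otherwise 0."""
    if mi == 0 and chol1 is not None and chol1 >= 8.00:
        return 1
    # Cases left system-missing by the first RECODE are set to 0 by the second.
    return 0


print(hyperlipidemia_flag(0, 8.5))  # 1
print(hyperlipidemia_flag(1, 9.0))  # 0
```

The second call shows a consequence worth checking against the study definition: a case with MI = 1 is coded 0 even when its cholesterol is high, and so is a case with missing cholesterol.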
Hypertension syntax
USE ALL.
COMPUTE filter_$=(Systolic >= 140.00 OR Diastolic >= 90.00).
VARIABLE LABEL filter_$ 'Systolic >= 140.00 OR Diastolic >= 90.00 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .
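The filter condition can be sketched in Python to show how each case is classified. The function name is hypothetical, `None` stands in for a missing value, and the handling of missing values follows SPSS's rule for OR (true OR missing is true, false OR missing is missing) as an assumption about how the filter treats incomplete cases.

```python
from typing import Optional


def hypertension_flag(systolic: Optional[float],
                      diastolic: Optional[float]) -> Optional[int]:
    """1 if Systolic >= 140 or Diastolic >= 90, 0 if neither, None if undecidable."""
    high_sys = None if systolic is None else systolic >= 140.0
    high_dia = None if diastolic is None else diastolic >= 90.0
    if high_sys is True or high_dia is True:
        return 1
    if high_sys is None or high_dia is None:
        return None  # cannot be decided; such a case is not selected by the filter
    return 0


print(hypertension_flag(150, 80))  # 1
print(hypertension_flag(120, 80))  # 0
```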
Descriptive Data Analysis
• Frequencies
• Descriptive
• Crosstabs
• Means
• Normality
The FREQUENCIES Procedure
1. Command syntax
2. Summary statistics
3. Frequency counts for each value
4. Percentages
– Raw percent
– Valid percents
– Cumulative percents
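The three percentages differ only in their denominators, which a short Python sketch can make concrete. The function name and the sample values are hypothetical; `None` marks a missing value. Raw percent divides by all cases, valid percent divides by non-missing cases, and cumulative percent is the running total of valid percents.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (value, count, raw %, valid %, cumulative %)."""
    total = len(values)
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        n = counts[value]
        raw_pct = 100.0 * n / total         # denominator: all cases
        valid_pct = 100.0 * n / len(valid)  # denominator: non-missing cases
        cumulative += valid_pct
        rows.append((value, n, raw_pct, valid_pct, cumulative))
    return rows


# Hypothetical smoker variable: 0 = no, 1 = yes, None = missing
for row in frequency_table([0, 0, 0, 1, 1, None, 0, 1, None, 0]):
    print(row)
```

With 10 cases of which 2 are missing, the value 0 (5 cases) has a raw percent of 50.0 but a valid percent of 62.5.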
Exercise!
Calculate the frequency of
Male subject
Smoker
Myocardial infarction
Obesity
Hypertension
Hyperlipidemia
Use of antihypertensive
Use of antidiabetic
Use of anticholesterol
The DESCRIPTIVES Procedure
1. Command syntax
2. Variable name and label
3. Number of cases
4. Statistics:
– Minimum
– Maximum
– Mean
– Standard Deviation
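The statistics in this output can be reproduced by hand for a small sample, which helps when checking results. The sketch below uses Python's standard library; the function name and sample values are hypothetical. It uses the sample standard deviation (n - 1 in the denominator) and needs at least two non-missing values.

```python
import statistics


def describe(values):
    """Number of valid cases, minimum, maximum, mean and sample SD."""
    valid = [v for v in values if v is not None]
    return {
        "n": len(valid),
        "min": min(valid),
        "max": max(valid),
        "mean": statistics.mean(valid),
        "sd": statistics.stdev(valid),  # sample SD, n - 1 denominator
    }


# Hypothetical ages; None marks a missing value
print(describe([50, 60, None, 70]))
```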
Exercise!
Calculate the mean and SD of
Age
Body Mass Index
Systolic BP
Diastolic BP
Glucose level
Cholesterol level
The CROSSTABS Procedure
1. Table title
2. Column variables
3. Row variables
4. Cell counts (# of cases)
5. Column percents (% of cases in column)
6. Statistics
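Cell counts and column percents can be sketched in a few lines of Python to show what the table reports. The function name, variable names, and sample values are hypothetical; each column percent is the cell count divided by the total of its column.

```python
from collections import Counter


def column_percents(row_values, col_values):
    """Map (row, column) to (cell count, % of cases in that column)."""
    cells = Counter(zip(row_values, col_values))
    col_totals = Counter(col_values)
    return {
        (r, c): (n, 100.0 * n / col_totals[c])
        for (r, c), n in cells.items()
    }


# Hypothetical data: rows = hyperlipidemia (0/1), columns = male (0/1)
hyperlipidemia = [1, 0, 1, 0, 0, 1]
male = [1, 1, 1, 0, 0, 0]
table = column_percents(hyperlipidemia, male)
print(table[(1, 1)])  # count and column percent of hyperlipidemia among males
```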
Exercise!
Calculate the % of hyperlipidemia by each of the following:
Male
Smoker
Obesity
Myocardial infarction
Hypertension
Use of hypertension drug
Use of anticholesterol drug
Use of antidiabetic drug
The MEANS Procedure
• MEANS calculates overall means and group means (defined by independent variables)
• Analyze Menu:
– Compare Means
• Means
• Highlight variables to create tables, click the arrow to add them to the Dependent or Independent variable lists, then click OK
• Optional Statistics are available
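What the procedure computes can be sketched in Python: the dependent variable is split by the values of the independent (group) variable, and a mean, count, and standard deviation are reported per group. The function name and sample values are hypothetical.

```python
from collections import defaultdict
import statistics


def group_means(dependent, group):
    """Map each group value to (mean, number of cases, sample SD)."""
    by_group = defaultdict(list)
    for value, g in zip(dependent, group):
        if value is not None and g is not None:  # skip cases with missing data
            by_group[g].append(value)
    return {
        g: (
            statistics.mean(vals),
            len(vals),
            statistics.stdev(vals) if len(vals) > 1 else None,
        )
        for g, vals in by_group.items()
    }


# Hypothetical data: systolic BP by obesity status (0 = no, 1 = yes)
systolic = [120, 130, 150, 160, None]
obese = [0, 0, 1, 1, 1]
print(group_means(systolic, obese))
```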
MEANS Output
1. Command syntax
2. Numbers of cases included and excluded
3. Dependent variable
4. Independent (group) variable
5. Means
6. Number of cases
7. Standard Deviations
Exercise!
Compare obese vs. normal-weight subjects
Mean of:
Age
Systolic BP
Diastolic BP
Cholesterol-1
Cholesterol-2
Glucose
Exercise!
Smoker
Myocardial Infarction
Use of Antihypertensive
Use of Antihyperlipidemia
Use of Antidiabetic
Home Assignment
Variables    Antilipidemia (Yes)    Antilipidemia (No)    p value
Age (mean +/- SD)
Male (%)
Smoker (%)
Systolic BP (mean +/- SD)
Diastolic BP (mean +/- SD)
Cholesterol (mean +/- SD)
Glucose (mean +/- SD)
Myocardial Infarction (%)
[Diagram: Independent Variable and Dependent Variable, with confounding by indication and confounding by outcome from the X variables: Age (X2), Smoking (X3), MI (X4)]
Sample t-test
Paired t-test
Cholesterol level before and after anticholesterol treatment
One way ANOVA
When do I use it?
• A one-way ANOVA allows us to test whether several
means (for different conditions or groups) are equal
across one variable.
One way ANOVA
Checking your assumptions
• Normality. Assume that the population distributions are normal. Check for
normality by creating a histogram.
• Equal Variances. Assume that the population distributions have the same
variances.
– As long as the largest variance is no more than 4 or 5 times the size of the smallest
variance and the sample sizes are equal, then the test is robust to violations.
– Standard deviation is the square root of variance. Square the largest and smallest
standard deviation to get the variances, and then divide the larger by the smaller.
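The equal-variance rule of thumb above can be sketched as a small calculation. The function name and the sample standard deviations are hypothetical; the 4 to 5 threshold is the one quoted in the text, not a formal test.

```python
def variance_ratio(standard_deviations):
    """Largest variance divided by smallest variance, from group SDs."""
    variances = [sd ** 2 for sd in standard_deviations]  # variance = SD squared
    smallest = min(variances)
    if smallest == 0:
        raise ValueError("a group has zero variance")
    return max(variances) / smallest


# Hypothetical group SDs taken from a Descriptive Statistics table
ratio = variance_ratio([5.0, 7.5, 10.0])
print(ratio)  # 4.0
```

A ratio of 4.0 sits at the edge of the quoted range, so with equal group sizes the test would still be treated as robust under this rule.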
One way ANOVA
• Select "General Linear Model" in the "Analyze" menu, then select "Univariate..." from that submenu.
• Select a dependent variable.
• Next, select the factor (independent variable) and place it in the "Fixed Factor(s)"
box.
One way ANOVA
• To display the group means, click the Options button and then add your
independent variable to the "Display Means for" list.
• Select the "Descriptive Statistics" option to obtain more information about each
group such as standard deviation and count in addition to the means
One way ANOVA
• If you wish to run post-hoc tests on the ANOVA to examine individual mean differences, click "Post Hoc..." and add the variables to test. Most often, you will use the Tukey test.
One way ANOVA
• The Descriptive Statistics table shows the mean, standard deviation, and total count (N) in each group.
Descriptive Statistics
(I) BMI categorical  (J) BMI categorical  Mean Difference (I-J)  Std. Error  Sig.  95% CI Lower Bound  95% CI Upper Bound
underweight          normal weight        -9.12                  5.428       .335  -23.07              4.84
underweight          overweight           -22.06*                5.427       .000  -36.01              -8.11
underweight          obese                -26.28*                5.474       .000  -40.35              -12.21
normal weight        underweight          9.12                   5.428       .335  -4.84               23.07
normal weight        overweight           -12.94*                .776        .000  -14.94              -10.95
normal weight        obese                -17.16*                1.060       .000  -19.89              -14.44
overweight           underweight          22.06*                 5.427       .000  8.11                36.01
overweight           normal weight        12.94*                 .776        .000  10.95               14.94
overweight           obese                -4.22*                 1.053       .000  -6.92               -1.51
obese                underweight          26.28*                 5.474       .000  12.21               40.35
obese                normal weight        17.16*                 1.060       .000  14.44               19.89
obese                overweight           4.22*                  1.053       .000  1.51                6.92
Based on observed means.
*. The mean difference is significant at the .05 level.
Exercise!
• Normality. Assume that the population distributions for each of your cells
are normal. ANOVA is quite robust over moderate violations of this
assumption. Check for normality by creating a histogram.
• Equal Variances. Assume that the population distributions for each cell
have the same variances.
Factorial ANOVA
• Select "General Linear Model" in the "Analyze" menu, then select "Univariate..." from that submenu.
• Select a dependent variable.
• Next, select the factor (independent variable) and place it in the
"Fixed Factor(s)" box.
Factorial ANOVA
• To display the group means, click the Options button and then add
your independent variable to the "Display Means for" list.
• Select the "Descriptive Statistics" option to obtain more information
about each group such as standard deviation and count in addition to
the means
Factorial ANOVA
• The third row of the output table is labeled with the independent variable in capitals; use the F-value and p-value from that row.
• The degrees of freedom "between" are also in that row. Use the degrees of freedom in the next row, labeled "Error," to get your degrees of freedom "within."
Factorial ANOVA
Descriptive Statistics
Chi Square
• Assign one variable to the rows and one variable to the columns.
• Click on the "Statistics" button.
Pearson correlation
• Graphically, the relationship between the two variables can be plotted (one as the independent, or predictor, variable and the other as the dependent, or predicted, variable).
• In this case, the direction of the fitted line shows the sign of the correlation, and how closely the points cluster around the line shows its strength; the steepness of the line alone does not. Otherwise, the correlation is reported in a table as "r"; the greater the absolute value of "r," the stronger the correlation.
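For reference, the coefficient "r" can be computed directly from its definition: the covariance of the two variables divided by the product of their standard deviations. The Python sketch below is illustrative only; the function name and sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with no variation has no correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements
age = [40, 50, 60, 70]
systolic = [118, 126, 131, 142]
print(round(pearson_r(age, systolic), 3))
```

The result always lies between -1 and 1; its sign gives the direction of the relationship and its absolute value gives the strength.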
Checking your assumptions
Pearson correlation