SPSS WORKSHOP (CONTINUED)
CB3021 11-2
Outline
• ANOVA
• Chi-Square Analysis
• Correlation Test
• Linear Regression Model
• Logistic Regression Model
Analysis of Variance (ANOVA)
• ANOVA is a method that can test whether the mean values
for more than two groups are equal
• It can involve more than one factor (e.g. gender) and compare the mean
values among the factor levels
• It can determine if there is an interaction effect between
factors
• Reference:
• https://en.wikipedia.org/wiki/Analysis_of_variance
• Assumptions:
• Independence of observations
• Normality
• Equal variances: the variance of the data is the same in every group
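To make the idea concrete, here is a minimal sketch of how the one-way ANOVA F statistic is built from between-group and within-group variation. The three groups and their values are made up for illustration (they are not the Deli data), and the function name one_way_anova_f is hypothetical.

```python
# Hypothetical scores for three made-up groups (not the Deli dataset).
groups = {
    "A": [2.0, 3.0, 4.0],
    "B": [4.0, 5.0, 6.0],
    "C": [6.0, 7.0, 8.0],
}

def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for s in samples for v in s]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group means versus the grand mean.
    ss_between = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)
    # Within-group sum of squares: each value versus its own group mean.
    ss_within = sum((v - sum(s) / len(s)) ** 2 for s in samples for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_b, df_w = one_way_anova_f(list(groups.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the spread inside the groups would suggest. SPSS reports the same F together with its p value in the ANOVA table.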
ANOVA
• Dataset: Deli (Week 10)
• X7, Gender
• X11, Drive Distance
Chi-Square Analysis: SPSS
• Select ANALYZE => DESCRIPTIVE STATISTICS =>
CROSSTABS
Chi-Square Analysis: SPSS
• Select X11 as Row and X7 as Column
Chi-Square Analysis: SPSS
• Click Statistics and then click Chi-square
• Click Cells and select Observed and Expected under the
box “Counts”
Chi-Square Analysis: SPSS output
• Check the table of Chi-square Tests
• H0: There is no association between Gender and Drive Distance
• H1: There is a significant association between the two variables
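The Observed and Expected counts requested in the Cells dialog are what the chi-square statistic is built from. The sketch below works through that arithmetic on a made-up 3 x 2 table (rows = a hypothetical three-level drive distance, columns = gender). The counts are not from the Deli data, and the closed-form p value used here is valid only because this table has exactly 2 degrees of freedom.

```python
import math

# Hypothetical observed counts: rows = drive-distance category, columns = gender.
observed = [
    [30, 20],
    [25, 25],
    [15, 35],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under H0 (no association): row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For exactly 2 degrees of freedom the upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.4f}")
```

If the p value is below 0.05, H0 is rejected in favour of H1. For tables with other degrees of freedom, read the p value from the "Asymp. Sig." column of the SPSS Chi-Square Tests table instead.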
Ref: https://en.wikipedia.org/wiki/Logistic_regression
Logistic Regression: SPSS
• Select Analyze => Regression => Binary Logistic…
Logistic Regression: SPSS
• Put Vote into the Dependent box, and Educ and Gender into the
Covariates (independent variables) box
Logistic Regression: SPSS
• Click Categorical, move Gender into the Categorical Covariates box,
choose Indicator as the contrast, and click Change
• Male=0
• Female=1
• We use female as the reference category
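As a rough illustration of what indicator coding with a reference category does, the sketch below recodes gender labels into the 0/1 dummy variable that enters the model. The function name indicator and the list of respondents are hypothetical. The Male=0 / Female=1 data coding and the female reference category are taken from the slide.

```python
# Coding stated on the slide: Male = 0, Female = 1 in the data file.
GENDER_CODES = {"Male": 0, "Female": 1}
REFERENCE = "Female"  # reference category named on the slide

def indicator(gender_label):
    """Return 0 for the reference category and 1 for the other category."""
    if gender_label not in GENDER_CODES:
        raise ValueError(f"unknown gender label: {gender_label!r}")
    return 0 if gender_label == REFERENCE else 1

# Hypothetical respondents, for illustration only.
respondents = ["Male", "Female", "Female", "Male"]
dummies = [indicator(g) for g in respondents]
print(dummies)
```

Because females are the reference (dummy = 0), the Gender coefficient in the output describes males relative to females.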
Logistic Regression: SPSS
• The Model Summary table shows how much of the variation in the
dependent variable is explained by the model
• The table contains two R-square values (Cox & Snell and Nagelkerke in
SPSS output); both are approximate measures of the explained variation
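For readers who want to see where the two R-square values come from, here is a minimal sketch of the standard Cox & Snell and Nagelkerke formulas, computed from the -2 log-likelihood of the intercept-only model and of the fitted model. The function name pseudo_r_squared and all three input numbers are made up for illustration. They are not taken from the workshop output.

```python
import math

def pseudo_r_squared(neg2ll_null, neg2ll_model, n):
    """Cox & Snell and Nagelkerke R-square from -2 log-likelihood values."""
    # Cox & Snell: 1 - (L_null / L_model)^(2/n), written with -2LL values.
    cox_snell = 1.0 - math.exp((neg2ll_model - neg2ll_null) / n)
    # Largest value Cox & Snell can reach for this sample.
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    # Nagelkerke rescales Cox & Snell so that the maximum is 1.
    nagelkerke = cox_snell / max_cox_snell
    return cox_snell, nagelkerke

# Hypothetical values: -2LL of the null model, -2LL of the fitted model, sample size.
cs, nk = pseudo_r_squared(neg2ll_null=138.0, neg2ll_model=120.0, n=100)
print(f"Cox & Snell = {cs:.3f}, Nagelkerke = {nk:.3f}")
```

Nagelkerke is always at least as large as Cox & Snell, which is why the two values in the Model Summary table differ.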
Logistic Regression: classification
• This gives the percentage of cases for which the dependent
variable is correctly classified by the model
• The percentage is
(824+567)/(824+567+414+563) = 58.7%
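The percentage above can be checked with a few lines of arithmetic. The four counts are the ones quoted on the slide. Treating 824 and 567 as the correctly classified cells and 414 and 563 as the misclassified cells follows the slide's formula. The variable names are only illustrative.

```python
# Counts quoted on the slide from the SPSS Classification Table.
correct = 824 + 567      # cases on the diagonal (predicted = observed)
incorrect = 414 + 563    # cases off the diagonal (predicted != observed)
total = correct + incorrect

accuracy_pct = 100 * correct / total
print(f"{correct} of {total} cases correct = {accuracy_pct:.1f}%")
```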
Logistic Regression: equation
• The "Variables in the Equation" table shows the
parameter estimate of each independent variable in the
model and its statistical significance.
Logistic Regression: Wald Test
• The Wald test is used to determine the statistical significance
of each independent variable. For the three factors, the
p value is < 0.05, showing that each has a statistically
significant effect on the outcome variable
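A minimal sketch of the calculation behind the Wald column: the statistic is the squared ratio of the coefficient to its standard error and is compared with a chi-square distribution with 1 degree of freedom. The coefficient -0.252 for educ is quoted in these slides, but the standard error 0.05 is a made-up value, since the slides do not report one. The function name wald_test is hypothetical.

```python
import math

def wald_test(b, se):
    """Wald chi-square (1 df) and its p value for a single coefficient."""
    wald = (b / se) ** 2
    # Upper tail of chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

# B for educ is from the slides; the standard error 0.05 is an assumed value.
wald, p = wald_test(-0.252, 0.05)
print(f"Wald = {wald:.2f}, p = {p:.2g}")
```

A coefficient is judged significant at the 5% level when this p value (the "Sig." column in SPSS) is below 0.05.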
Logistic Regression: SPSS
• The odds ratios are simply the exponentiated coefficients
from the logit model. For example, the coefficient for educ
was -.252. The odds ratio is exp(−.252)=.777.
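The exponentiation can be reproduced directly. This short sketch uses the educ coefficient quoted above and also expresses the odds ratio as a percentage change in the odds per one-unit increase in educ. The helper name odds_ratio is only illustrative.

```python
import math

def odds_ratio(coefficient):
    """Odds ratio (Exp(B) in SPSS) for a logit coefficient."""
    return math.exp(coefficient)

educ_or = odds_ratio(-0.252)
# Percentage change in the odds for a one-unit increase in educ.
pct_change = (educ_or - 1.0) * 100

print(f"odds ratio = {educ_or:.3f}, change in odds = {pct_change:.1f}%")
```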
Logistic Regression: odds ratio
• An odds ratio less than one means that an increase in X
leads to a decrease in the odds of voting for Trump.
• An odds ratio greater than one means that an increase in X
leads to an increase in the odds of voting for Trump.
Logistic Regression: odds ratio
• For Gender, the coefficient (B) is 0.356, which is positive, and the
odds ratio is 1.427, which is greater than 1.
• The odds of voting for Trump are therefore higher for males
than for females
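An odds ratio compares odds, not probabilities, so it can help to translate it into probabilities for one concrete case. The sketch below uses the Gender odds ratio of 1.427 from the slide. The 40% baseline probability for females is a made-up figure chosen only for illustration, and the comparison assumes the same education level for both groups.

```python
GENDER_ODDS_RATIO = 1.427  # Exp(B) for Gender reported on the slide

def prob_to_odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)

def odds_to_prob(odds):
    """Convert odds back into a probability."""
    return odds / (1.0 + odds)

# Hypothetical baseline: suppose 40% of comparable females vote for Trump.
female_prob = 0.40
female_odds = prob_to_odds(female_prob)

# Males have 1.427 times the odds of the reference group (females).
male_odds = female_odds * GENDER_ODDS_RATIO
male_prob = odds_to_prob(male_odds)

print(f"female: odds = {female_odds:.3f}, probability = {female_prob:.3f}")
print(f"male:   odds = {male_odds:.3f}, probability = {male_prob:.3f}")
```

The odds are multiplied by 1.427, but the probability does not rise by the same factor. How much it rises depends on the baseline.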