Basics
From descriptive statistics to histograms
Correlations
→ Quantify the relationship between two variables
Corr = 0 → no correlation
Corr > 0 → positive correlation
(|Corr| ≈ 0.1 – weak; 0.3 – moderate; 0.5 – strong)
Corr < 0 → negative correlation
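The rule of thumb above can be sketched with a small Pearson correlation computation; the data values here are hypothetical, purely for illustration:

```python
import numpy as np

# Hypothetical example data: x could be advertising spend, y sales.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson correlation: covariance scaled by both standard deviations.
r = np.corrcoef(x, y)[0, 1]

# Interpret with the rule of thumb above: |r| >= 0.5 counts as strong.
strength = "strong" if abs(r) >= 0.5 else "moderate" if abs(r) >= 0.3 else "weak"
print(r, strength)
```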
One-sample t-test
If Sig. < 0.05 → significant result (reject H0, support H1)
If Sig. > 0.05 → insignificant result (we cannot reject H0)
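A minimal sketch of this decision rule with `scipy.stats.ttest_1samp`; the sample values and the hypothesized mean of 500 are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical sample: measured fill weights, testing H0: population mean = 500.
sample = np.array([498.2, 501.5, 499.9, 502.3, 497.8, 500.4, 501.1, 499.0])

t_stat, p_value = stats.ttest_1samp(sample, popmean=500.0)

# Decision rule from the notes: reject H0 only if Sig. (the p-value) < 0.05.
reject_h0 = p_value < 0.05
print(t_stat, p_value, reject_h0)
```

Here the sample mean is very close to 500, so the p-value is large and H0 is not rejected.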
Cluster Analysis
Overall objective: maximize variation between clusters, minimize variation within
clusters
Clustering variables:
- Significant differences between variables across the clusters
- Relation between sample size and the number of clustering variables (m) as well as clusters (k):
o Clusters of equal size: n_min = 10 * m * k
o General recommendation: n_min = 70 * m
- Avoid using highly correlated variables (correlation 0.9 or higher)
- Data of strong quality
Agglomeration schedule
→ Different clustering algorithms (e.g. single linkage vs. average linkage) lead to different agglomeration schedules
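This difference can be seen directly with `scipy.cluster.hierarchy.linkage`; the six 2-D observations below are hypothetical:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical 2-D observations forming two visually separated groups.
X = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
              [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]])

# Different linkage criteria produce different agglomeration schedules.
single_schedule = linkage(X, method="single")
average_schedule = linkage(X, method="average")

# Each schedule row: (cluster i, cluster j, merge distance, new cluster size).
print(single_schedule)
print(average_schedule)
```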
Step 3: Decide on the number of clusters
← shows the differences in variables between clusters; helps to name the clusters
Identical solution: the initial partitioning of objects was retained
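One common way to decide on the number of clusters is to compare within-cluster variation for several values of k. A minimal sketch with scikit-learn's `KMeans` on hypothetical data (the overall objective above: within-cluster variation, i.e. inertia, should be small):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data with two clearly separated groups.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])

# Within-cluster variation (inertia) for k = 1, 2, 3.
inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in (1, 2, 3)}

# A large drop from k=1 to k=2 and only a small one afterwards suggests k=2.
print(inertias)
```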
Factor Analysis
→ A set of methods to reduce complex data structures (typically induced by a large number of variables) by identifying a smaller number of unifying variables, called factors, that represent the original variables in the best possible way.
Anti-image covariance: the part of a variable's covariance that is independent of the other variables.
If more than 25 % of the absolute values are greater than 0.09, the variables may be inappropriate for factor analysis.
Measure of sampling adequacy: KMO criterion and Bartlett's Test of Sphericity
→ Evaluate whether the correlation matrix as a whole is suitable for factor analysis
KMO should be at least 0.6 to continue with the factor analysis
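A minimal sketch of the KMO criterion computed from scratch (the data are hypothetical, generated from one shared latent factor). The partial correlations come from the inverse of the correlation matrix; KMO compares squared correlations with squared partial correlations:

```python
import numpy as np

# Hypothetical data: four indicators driven by one common latent factor.
rng = np.random.default_rng(1)
common = rng.normal(size=200)
X = np.column_stack([common + rng.normal(scale=0.5, size=200) for _ in range(4)])

R = np.corrcoef(X, rowvar=False)
R_inv = np.linalg.inv(R)
d = np.sqrt(np.outer(np.diag(R_inv), np.diag(R_inv)))
partial = -R_inv / d                       # partial correlation matrix

off = ~np.eye(R.shape[0], dtype=bool)      # off-diagonal elements only
kmo = (R[off] ** 2).sum() / ((R[off] ** 2).sum() + (partial[off] ** 2).sum())
print(kmo)  # rule of thumb above: continue with factor analysis if KMO >= 0.6
```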
à Assign each variable to a certain factor based in its maximum absolute factor loading
à Find an umbrella term for each factor that best describes the set of variables
associated with that factor
Total variance = sum of the diagonal elements in the (original) correlation matrix (i.e., correlations of the variables with themselves): R = 1+1+1+1+1 = 5
Reproduced variance = sum of the diagonal elements in the reproduced correlation matrix (i.e., the communalities): R_repr. = 0.821 + 0.818 + 0.796 + 0.902 + 0.912 = 4.25
A value > 50 % should raise concerns. If fewer than 50 % of the residuals have absolute values greater than 0.05, we can presume a good model fit!
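The arithmetic above can be reproduced directly, using the communalities stated in the notes:

```python
# Communalities from the notes; the diagonal of the reproduced correlation matrix.
communalities = [0.821, 0.818, 0.796, 0.902, 0.912]

total_variance = 5.0            # five variables, each with unit variance
reproduced = sum(communalities) # reproduced variance = sum of communalities
share = reproduced / total_variance

print(reproduced, share)        # about 4.25, i.e. roughly 85 % of total variance
```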
Cross Tabulations
→ Test the hypothesis that there is an association between two nominally scaled variables
Information about the joint frequency distribution of two variables and the absolute, relative, and expected frequencies
The more the expected frequencies ĥ_ij differ from the observed frequencies h_ij, the stronger the suspected dependency between X and Y
The likelihood ratio test is based on the maximum-likelihood method and, at large sample sizes, delivers the same result as the Pearson chi-squared test
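A minimal sketch of a cross tabulation with the Pearson chi-squared test via `scipy.stats.chi2_contingency`; the 2×2 table (e.g. gender vs. brand preference) is hypothetical:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical observed frequencies h_ij: rows = gender, columns = brand.
observed = np.array([[30, 10],
                     [20, 40]])

# Returns the test statistic, p-value, degrees of freedom, and the
# expected frequencies under independence.
chi2, p_value, dof, expected = chi2_contingency(observed)

# The observed cells differ strongly from the expected ones here,
# so the test indicates a dependency between the two variables.
print(chi2, p_value, expected)
```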
Correlation measures
One-way ANOVA
→ Examine mean differences between more than two groups (one metric dependent variable)
Step 1: Check the assumptions
a) The dependent variable should be normally distributed
b) Variances should be homogeneous across groups
If variances are not homogeneous → continue with Welch (robust test of equality of means; it does not assume homogeneity of variances)
→ Supporting H1: there are significant differences in the means of the groups
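A minimal one-way ANOVA sketch with scipy, including Levene's test for the homogeneity assumption; the three groups of scores are hypothetical:

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three groups (e.g. three advertising campaigns).
g1 = np.array([5.1, 4.9, 5.3, 5.0, 5.2])
g2 = np.array([6.8, 7.1, 6.9, 7.2, 7.0])
g3 = np.array([5.0, 5.2, 4.8, 5.1, 4.9])

# Levene's test checks homogeneity of variances (an ANOVA assumption);
# a large p-value means the assumption is not rejected.
lev_stat, lev_p = stats.levene(g1, g2, g3)

# Classic one-way ANOVA; p < 0.05 supports H1 (group means differ).
f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(lev_p, f_stat, p_value)
```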
Two-way ANOVA
→ Examine mean differences between groups defined by two factors (one metric dependent variable, two nominally scaled factors)
… What is the impact of the two factors?
… Is there any interaction between factor 1 and factor 2?
Partial eta squared: effect size (rule of thumb: take the square root and interpret it like a normal correlation)
Observed power: probability of not making a Type II error (→ the smaller the effect size, the smaller the power)
If Sig. < 0.05 → significant effect/influence on the dependent variable (no significance in this example → Sig. = 0.967 > 0.05 → no significant influence)
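A minimal two-way ANOVA sketch with statsmodels; the design (factors "ad" and "region", dependent variable "sales") and all data are hypothetical, with an effect built in for the "ad" factor only:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical balanced design: two factors, one metric dependent variable.
rng = np.random.default_rng(2)
ad = np.repeat(["A", "B"], 20)
region = np.tile(np.repeat(["North", "South"], 10), 2)
sales = rng.normal(100, 5, 40) + np.where(ad == "A", 10, 0)  # only "ad" matters

df = pd.DataFrame({"ad": ad, "region": region, "sales": sales})

# Two-way ANOVA with interaction: main effects plus the ad:region term.
model = smf.ols("sales ~ C(ad) * C(region)", data=df).fit()
table = anova_lm(model, typ=2)
print(table)  # expect a significant main effect of "ad" only
```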
Requirements:
→ Analysis of pairwise correlations
Test of model assumptions:
a) Linear model
→ Diagnosis: visual inspection
a: linear
b: log(x) transformation
c: x² transformation
b) No systematic errors
c) Homoscedasticity: if the error variance is not constant, this can be addressed by using generalized least squares (GLS) or weighted least squares (WLS) regression models
d) No Autocorrelation
e) Normal distribution
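A minimal OLS sketch that checks two of the assumptions above on the residuals: normality (e) via the Shapiro-Wilk test and autocorrelation (d) via the Durbin-Watson statistic. The data are hypothetical, generated from a true linear relationship:

```python
import numpy as np
from scipy import stats

# Hypothetical data with a true linear relationship y = 2x + 1 + noise.
rng = np.random.default_rng(3)
x = np.linspace(0, 10, 100)
y = 2 * x + 1 + rng.normal(0, 1, 100)

slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
residuals = y - (slope * x + intercept)

# e) Normal distribution of the errors: Shapiro-Wilk test on the residuals
# (a large p-value means no evidence against normality).
shapiro_p = stats.shapiro(residuals).pvalue

# d) No autocorrelation: Durbin-Watson statistic; values near 2 mean none.
dw = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)
print(slope, shapiro_p, dw)
```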