Quantitative Methods Workshop III
Correlation and Regression: Pearson and Spearman
Correlation and regression show the relationship between two continuous variables.
Correlation and Regression
• This chapter covers two kinds of correlation
• Pearson correlation (r) is used to assess the relationship between two
continuous variables (for each subject)
• Spearman rho correlation (ρ) assesses the relationship between two
sorted lists
Correlation and Regression
• The Pearson is the most commonly used form of correlation; hence, it
will be discussed first
• NOTE: If the term “correlation” is used, presume that it is Pearson, unless
stated otherwise
Correlation and Regression
• Determines the relationship between 2 continuous variables
• Correlation coefficient (r) range: −1…+1
• The sign of r (+ or −) indicates the direction of the correlation
Correlation and Regression
• The absolute value of r (the correlation coefficient) indicates the strength of the correlation
−1 (strong negative) … 0 (none) … +1 (strong positive)
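The range, sign, and strength described above can be checked with a minimal sketch that computes Pearson's r from its usual definition (the covariance of the two variables divided by the product of their standard deviations). The function name `pearson_r` and the three small data sets are hypothetical illustrations, not part of the workshop data.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson r: co-variation of x and y scaled by their individual variation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Hypothetical illustration data (not from the workshop data set)
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # y rises in step with x
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # y falls in step as x rises
r_weak = pearson_r(x, [3, 1, 4, 1, 5])   # scattered points

print(r_up, r_down, r_weak)
```

The first two cases land on the ends of the scale (+1 and −1), while the scattered case gives a value much closer to 0.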
Correlation and Regression
• Each dot depicts a pair of variables
[Figure: scatterplot with one example dot labeled with its pair of values, 83 and 107]
Correlation and Regression
• The regression line is the average pathway through the points
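One common way to make "the average pathway through the points" concrete is an ordinary least-squares straight line. The sketch below assumes that interpretation and uses hypothetical (time, grade) pairs, since the workshop data file is not reproduced here.

```python
import numpy as np

# Hypothetical (time, grade) pairs for illustration only
time = np.array([40, 55, 70, 85, 100, 115], dtype=float)
grade = np.array([58, 66, 69, 80, 84, 93], dtype=float)

# Degree-1 least-squares fit: grade is approximated by slope * time + intercept
slope, intercept = np.polyfit(time, grade, 1)

predicted = slope * time + intercept  # points on the fitted line
residuals = grade - predicted         # vertical distance of each dot from the line

print(f"grade = {slope:.3f} * time + {intercept:.3f}")
```

With a least-squares fit the residuals above and below the line cancel out, and the line passes through the point (mean time, mean grade), which is one sense in which it is an "average" path.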
Correlation and Regression
• Test time and test grade are positively correlated
r = .815
p < .001
Correlation and Regression
• Test time and test grade are negatively correlated
Positive example: r = .815, p < .001
Negative example: r = −.803, p < .001
Correlation and Regression
• Test time and test grade are not correlated
Correlation and Regression
• Hypotheses
H0: There is no correlation between the length of time spent
taking the exam and the grade on the exam
H1: There is a correlation between the length of time spent
taking the exam and the grade on the exam
Correlation and Regression
• Use data set:
Ch 08 - Example 01 - Correlation and Regression - Pearson.sav
Correlation and Regression
• Pretest checklist
Normality*
Linearity**
Homoscedasticity**
*Check before correlation and regression run
**Check after correlation and regression run
Correlation and Regression
• Pretest checklist
• Normality
• Click Analyze, Descriptive Statistics, Frequencies
Correlation and Regression
• Pretest checklist
• Normality
• Move time and grade to Variable(s)
• Click Charts
Correlation and Regression
• Pretest checklist
• Normality
• Select Histogram with normal curve
• Click Continue
Correlation and Regression
• Pretest checklist
• Normality
• Click OK
Correlation and Regression
• Pretest checklist
• Normality
• Inspect both histograms for normality
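Outside SPSS, the same normality inspection could be sketched roughly as follows. The data are a hypothetical stand-in for the time variable, and the Shapiro-Wilk test is a swapped-in numeric complement to the visual check; it is not part of the slides' procedure.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the "time" variable (n = 30), not the workshop data
rng = np.random.default_rng(0)
time = rng.normal(loc=80, scale=15, size=30)

# Histogram counts, as the Frequencies chart would display them
counts, edges = np.histogram(time, bins=8)

# Normal curve using the sample mean and SD, evaluated at the bin centers
# and scaled to expected counts so it is comparable with the histogram bars
centers = (edges[:-1] + edges[1:]) / 2
bin_width = edges[1] - edges[0]
expected = stats.norm.pdf(centers, loc=time.mean(), scale=time.std(ddof=1)) * len(time) * bin_width

# Swapped-in numeric check: Shapiro-Wilk test of the normality hypothesis
w_stat, p_value = stats.shapiro(time)

print(counts)
print(np.round(expected, 1))
print(w_stat, p_value)
```

Comparing `counts` with `expected` mirrors comparing the bars with the overlaid curve; a Shapiro-Wilk p above the chosen α gives no evidence against normality.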
Correlation and Regression
• Pretest checklist
Normality*
Linearity**
Homoscedasticity**
*Satisfied
**Check after correlation and regression run
Correlation and Regression
• Finalizing pretest checklist
– Click on Graphs, Chart Builder
Correlation and Regression
• Finalizing pretest checklist
– In the Choose from list, click on Scatter/Dot
– Double-click on the first (circled) graph choice or drag it to the Chart preview area
Correlation and Regression
• Finalizing pretest checklist
– Drag time to the X axis
– Drag grade to the Y axis
– Click OK
Correlation and Regression
• Finalizing pretest checklist
• Double-click on the scatterplot
Correlation and Regression
• Finalizing pretest checklist
• Click on the Add Fit Line at Total icon
Correlation and Regression
• Finalizing pretest checklist
Linearity is satisfied
Correlation and Regression
• Finalizing pretest checklist
Homoscedasticity is satisfied
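The slides judge linearity and homoscedasticity by eye from the scatterplot with its fit line. As a rough numeric companion, the sketch below compares the spread of the residuals in the lower and upper halves of the X range. This split-half comparison is an assumption of this example, not the slides' method, and the data are simulated with constant noise purely for illustration.

```python
import numpy as np

# Simulated illustration data: constant noise around a straight line
rng = np.random.default_rng(1)
time = np.sort(rng.uniform(30, 120, size=30))
grade = 40 + 0.45 * time + rng.normal(0, 5, size=30)

# Fit the straight line and take the residuals (dot minus line)
slope, intercept = np.polyfit(time, grade, 1)
residuals = grade - (slope * time + intercept)

# Compare residual spread for short test times vs. long test times
half = len(time) // 2
sd_low = residuals[:half].std(ddof=1)
sd_high = residuals[half:].std(ddof=1)
spread_ratio = max(sd_low, sd_high) / min(sd_low, sd_high)

print(sd_low, sd_high, spread_ratio)
```

A ratio near 1 is consistent with an evenly wide band of dots along the line; a much larger ratio would suggest the funnel shape that signals heteroscedasticity.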
Correlation and Regression
• Test run
– Click on Analyze, Correlate, Bivariate
Correlation and Regression
• Test run
– Move time and grade to Variables
– Click OK
Correlation and Regression
• Results
• The Pearson correlation (r) of .815 with p < .001 (below α = .05) indicates a
statistically significant strong positive correlation between test-taking
time and grade
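For readers working outside SPSS, the same test run and decision rule can be sketched with `scipy.stats.pearsonr`. The ten (time, grade) pairs below are hypothetical, so the output will not reproduce the r = .815 reported for the workshop data set.

```python
from scipy import stats

# Hypothetical (time, grade) pairs, not the workshop .sav file
time = [35, 50, 62, 70, 81, 90, 98, 105, 112, 118]
grade = [55, 62, 60, 71, 75, 73, 84, 88, 86, 95]

alpha = 0.05                        # significance level used in the slides
r, p = stats.pearsonr(time, grade)  # correlation coefficient and two-sided p value

reject_h0 = p < alpha               # reject "no correlation" when p falls below alpha
direction = "positive" if r > 0 else "negative"

print(f"r = {r:.3f}, p = {p:.4f}, reject H0: {reject_h0}, direction: {direction}")
```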
Correlation and Regression
• Results
• Since r = .815 is a positive correlation:
• ↑ time : ↑ grade
• ↓ time : ↓ grade
Correlation and Regression
• Hypothesis resolution
REJECT H0: There is no correlation between the length of time spent taking the exam
and the grade on the exam
ACCEPT H1: There is a correlation between the length of time spent taking the exam
and the grade on the exam
Correlation and Regression
• Documenting results
To determine if there is a correlation between the length of time
students spent taking a two-hour exam and the score on it, we
recorded the submission time (0…120 minutes) of each exam (n = 30).
The exams were then graded. We discovered a statistically significant
positive correlation between test time and grade (r = .815, p <
.001, α = .05), wherein students who spent longer on their exams
earned higher grades, and vice versa.
Correlation and Regression
• Correlation vs. Causation
• Correlation means that two variables move in a predictable fashion with
respect to each other
• In positive correlations, the variables move in the same direction:
(X↑ Y↑) or (X↓ Y↓)
• In negative correlations, the variables move in different directions:
(X↑ Y↓) or (X↓ Y↑)
Correlation and Regression
• Correlation vs. Causation
• No matter how strong the correlation, it would be inappropriate to
automatically claim that X causes Y, or that Y causes X
• Example:
• Z (Depression) may be causing:
X (Poor sleep)
and
Y (Low productivity)
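The depression example can be illustrated with a small simulation in which a hidden variable Z drives both X and Y, while X and Y never influence each other. All numbers are simulated assumptions for illustration; the helper `residualize` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

z = rng.normal(size=n)      # Z: hypothetical depression score (the hidden common cause)
x = z + rng.normal(size=n)  # X: poor sleep, driven by Z plus unrelated noise
y = z + rng.normal(size=n)  # Y: low productivity, driven by Z plus unrelated noise

# X and Y correlate even though neither causes the other
r_xy = np.corrcoef(x, y)[0, 1]

def residualize(v, z):
    """Remove the straight-line effect of z from v."""
    slope, intercept = np.polyfit(z, v, 1)
    return v - (slope * z + intercept)

# Correlation between X and Y after removing what Z explains in each
r_xy_given_z = np.corrcoef(residualize(x, z), residualize(y, z))[0, 1]

print(r_xy, r_xy_given_z)
```

The raw correlation between X and Y is clearly positive, but it largely disappears once Z is accounted for, which is why a strong r alone does not justify a causal claim.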
Correlation and Regression
• Correlation vs. Causation
• Three causality criteria
1. Association / correlation
Rule: Variable A and variable B must be empirically related; there must be a (scientific) logical relationship between A and B.
Example: Taking a dose of aspirin lowers fever.
2. Temporality
Rule: A (cause [independent variable]) precedes B (effect [dependent variable]).
Example: The person took aspirin, and then the fever went down; not the other way around.
3. Nonspurious
Rule: The relationship between A and B is not caused by other variable(s).
Example: The drop in fever is not due to the room getting colder, submerging the person in an ice bath, or other factors.
Spearman Correlation
• Spearman rho correlation (ρ) (−1…+1) assesses the relationship
between two sorted lists
Alice        Bill
Chocolate    Chocolate
Strawberry   Vanilla
Vanilla      Strawberry
Spearman Correlation
• Example
• A dietician asks a patient to arrange five cards in order of preference, with the
most favorite food at the top. The dietician will then use another set of cards
to show the recommended diet. The dietician will record the two card
sequences (patient : dietician) and compare them using Spearman’s rho.
Spearman Correlation
• Hypothesis
H0: There is no correlation between the dietician’s
recommended food ranking and the patient’s food preferences
H1: There is a correlation between the dietician’s recommended
food ranking and the patient’s food preferences
Spearman Correlation
Dietician Patient
Vegetables Fish
Fish Vegetables
Poultry Poultry
Beef Beef
Pork Pork
Spearman Correlation
• Value labels assigned for dietician and patient
Spearman Correlation
• Database showing labels and values
Spearman Correlation
• Test run
– Click Analyze, Correlate, Bivariate
Spearman Correlation
• Test run
– Move dietician and patient to Variables
Spearman Correlation
• Test run
– Uncheck Pearson
– Check Spearman
Spearman Correlation
• Test run
– Click OK
Spearman Correlation
• Results
• Spearman’s rho of .900 with p = .037 (below α = .05) indicates a
statistically significant strong positive correlation between the food
sequences of the dietician and the patient.
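The reported rho can be reproduced from the two card sequences shown earlier. The sketch below assumes rank 1 means the top card, applies the textbook formula for untied ranks, ρ = 1 − 6Σd² / (n(n² − 1)), and cross-checks it with `scipy.stats.spearmanr`; the variable names are illustrative.

```python
from scipy import stats

foods = ["Vegetables", "Fish", "Poultry", "Beef", "Pork"]

# Rank 1 = top card in each sequence (taken from the Dietician / Patient lists)
dietician = {"Vegetables": 1, "Fish": 2, "Poultry": 3, "Beef": 4, "Pork": 5}
patient = {"Fish": 1, "Vegetables": 2, "Poultry": 3, "Beef": 4, "Pork": 5}

d_ranks = [dietician[f] for f in foods]
p_ranks = [patient[f] for f in foods]

# Textbook formula for untied ranks: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
n = len(foods)
sum_d2 = sum((a - b) ** 2 for a, b in zip(d_ranks, p_ranks))
rho_manual = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Library cross-check; also returns the two-sided p value
rho, p_value = stats.spearmanr(d_ranks, p_ranks)

print(rho_manual, rho, p_value)
```

Only Vegetables and Fish swap places, so Σd² = 2 and ρ = 1 − 12/120 = .900, in line with the SPSS output; the p value from SciPy is approximately .037.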
Spearman Correlation
• Hypothesis resolution
REJECT H0: There is no correlation between the dietician’s recommended food ranking
and the patient’s food preferences
ACCEPT H1: There is a correlation between the dietician’s recommended food ranking
and the patient’s food preferences
Spearman Correlation
• Documenting results
The dietician compared the patient’s food preference (Fish,
Vegetables, Poultry, Beef, Pork) to the recommended nutrition for this
patient (Vegetables, Fish, Poultry, Beef, Pork).
Spearman’s rho produced a statistically significant (p = .037, α = .05)
positive correlation of .900, signifying a strong concurrence
between the two lists, suggesting that it should be fairly plausible to
assemble a healthy dietary plan that is suitable to this patient’s tastes.
Your Tasks Before Next Workshop