QUANTITATIVE METHODS III

1
Correlation and Regression —
Pearson and Spearman

Correlation and regression show the relationship between two continuous variables.

2
Correlation and Regression
• This chapter covers two kinds of correlation
• Pearson correlation (r) is used to assess the relationship between two
continuous variables (measured for each subject)
• Spearman rho correlation (ρ) assesses the relationship between two
sorted (ranked) lists

3
Correlation and Regression
• Pearson correlation is the most commonly used form of correlation; hence, it
will be discussed first
• NOTE: If the term “correlation” is used, assume that it is Pearson, unless
stated otherwise

13
Correlation and Regression
• Determines the relationship between 2 continuous variables
• Correlation (r) range: −1…+1
• Correlation sign (+ or −) indicates the direction of the correlation

Correlation   r       Variable directions
Positive      0…+1    (↑X ↑Y) or (↓X ↓Y)
Negative      −1…0    (↑X ↓Y) or (↓X ↑Y)
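
The sign rule can be checked numerically. The following is a minimal sketch, assuming the usual deviation-score formula for Pearson's r; the function name pearson_r and the data are hypothetical and are not taken from the chapter's data set.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 pairs)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
rises_with_x = [2, 4, 6, 8, 10]   # Y goes up as X goes up
falls_with_x = [10, 8, 6, 4, 2]   # Y goes down as X goes up

r_pos = pearson_r(x, rises_with_x)
r_neg = pearson_r(x, falls_with_x)
print(round(r_pos, 3), round(r_neg, 3))
```

Both toy series are perfectly linear, so they sit at the two ends of the range; real data would normally fall somewhere in between.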

16
Correlation and Regression
• Determines the relationship between 2 continuous variables
• Correlation (r) range: −1…+1
• Correlation value (distance from 0) indicates the strength of the correlation

−1           0           +1
Strong       Weak        Strong

17
Correlation and Regression
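• Each dot depicts the pair of values (one per variable) recorded for one subject

[Figure: scatterplot with one dot highlighted; callout values 83 and 107]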

18
Correlation and Regression
• The regression line (line of best fit) is the average pathway through the points
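
One common way to obtain such a line is ordinary least squares. The sketch below assumes that method; the function name fit_line and the (minutes, grade) pairs are hypothetical and do not come from the chapter's data set.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 pairs)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical (minutes, grade) pairs, invented for illustration only
minutes = [30, 45, 60, 75, 90]
grades = [55, 62, 70, 77, 86]

slope, intercept = fit_line(minutes, grades)
# The fitted line always passes through (mean of x, mean of y)
predicted_at_mean = intercept + slope * (sum(minutes) / len(minutes))
print(round(slope, 3), round(intercept, 3), round(predicted_at_mean, 3))
```

The slope is the average change in grade per additional minute for these invented points, which is what "average pathway" means in practice.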

19
Correlation and Regression
• Test time and test grade are positively correlated

r = .815
p < .001
20
Correlation and Regression
• Test time and test grade are negatively correlated

Positive: r = .815, p < .001
Negative: r = -.803, p < .001
21
Correlation and Regression
• Test time and test grade are not correlated

Positive: r = .815, p < .001
Negative: r = -.803, p < .001
Not correlated: r = -.072, p = .704
22
Correlation and Regression
• Example
• An instructor wants to discover if there is a correlation between the length of
time spent taking an exam and the grade on that exam
• The instructor gathers two pieces of data for each student
1. Minutes spent taking the exam
2. Grade on the exam

23
Correlation and Regression
• Hypotheses
H0: There is no correlation between the length of time spent
taking the exam and the grade on the exam
H1: There is a correlation between the length of time spent
taking the exam and the grade on the exam

24
Correlation and Regression
• Use data set:
Ch 08 - Example 01 - Correlation and Regression - Pearson.sav

25
Correlation and Regression
• Pretest checklist
– Normality*
– Linearity**
– Homoscedasticity**
*Check before the correlation and regression run
**Check after the correlation and regression run

26
Correlation and Regression
• Pretest checklist
• Normality
• Click Analyze, Descriptive Statistics, Frequencies

27
Correlation and Regression
• Pretest checklist
• Normality
• Move time and grade to Variable(s)
• Click Charts

28
Correlation and Regression
• Pretest checklist
• Normality
• Select Histogram with normal curve
• Click Continue

29
Correlation and Regression
• Pretest checklist
• Normality
• Click OK

30
Correlation and Regression
• Pretest checklist
• Normality
• Inspect both histograms for normality
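
For readers without SPSS at hand, the same visual check can be roughed out as a text histogram. This is only a sketch: the helper histogram_counts, the bin width, and the grades are hypothetical, and bins that contain no values are simply not listed.

```python
def histogram_counts(values, bin_width):
    """Count how many values fall in each bin [left, left + bin_width)."""
    start = (min(values) // bin_width) * bin_width
    counts = {}
    for v in values:
        left = start + ((v - start) // bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical exam grades, invented for illustration only
grades = [52, 58, 61, 64, 66, 68, 70, 71, 73, 74, 76, 78, 81, 85, 92]
bin_width = 10

counts = histogram_counts(grades, bin_width)
for left, count in counts.items():
    # One '#' per observation in the bin
    print(f"{left:>3}-{left + bin_width - 1:<3} {'#' * count}")
```

A roughly symmetric pile that peaks in the middle is the shape to look for; a long tail on one side would call the normality criterion into question.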

31
Correlation and Regression
• Pretest checklist
– Normality*
– Linearity**
– Homoscedasticity**
*Satisfied
**Check after the correlation and regression run

32
Correlation and Regression
• Finalizing pretest checklist
– Click on Graphs, Chart Builder

33
Correlation and Regression
• Finalizing pretest checklist
– In the Choose from list, click on Scatter/Dot
– Double-click on the first (circled) graph choice or drag it to the
Chart preview area
34
Correlation and Regression
• Finalizing pretest checklist
– Drag time to the X axis
– Drag grade to the Y axis
– Click OK

35
Correlation and Regression
• Finalizing pretest checklist
• Double-click on the scatterplot

36
Correlation and Regression
• Finalizing pretest checklist
• Click on the Add Fit Line at Total icon

38
Correlation and Regression
• Finalizing pretest checklist
– Linearity is satisfied

[Figure panels: Linearity satisfied | Linearity violated]

39
Correlation and Regression
• Finalizing pretest checklist
– Homoscedasticity is satisfied

[Figure panels: Homoscedasticity satisfied | Homoscedasticity violated]
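
The slides judge homoscedasticity by eye. As an informal numeric companion (an assumption of this sketch, not an SPSS procedure), one can compare how widely the points scatter around the fit line in the lower and upper halves of X. The function names and the funnel-shaped data are hypothetical.

```python
import statistics

def residuals(xs, ys):
    """Residuals (observed minus fitted y) from a least-squares line."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def spread_by_half(xs, ys):
    """Standard deviation of the residuals for the lower-x and upper-x halves."""
    pairs = sorted(zip(xs, residuals(xs, ys)))
    half = len(pairs) // 2
    low = [res for _, res in pairs[:half]]
    high = [res for _, res in pairs[half:]]
    return statistics.pstdev(low), statistics.pstdev(high)

# Hypothetical data whose scatter widens as x grows (a "funnel" shape)
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_funnel = [1.1, 1.9, 3.2, 3.7, 6.5, 4.0, 9.5, 5.0]

low_sd, high_sd = spread_by_half(x, y_funnel)
print(round(low_sd, 2), round(high_sd, 2))
```

A much larger spread in one half than the other, as in this invented funnel, corresponds to the "violated" picture; similar spreads correspond to the "satisfied" one. No cutoff is implied here.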

40
Correlation and Regression
• Test run
– Click on Analyze, Correlate, Bivariate

41
Correlation and Regression
• Test run
– Move time and grade to Variables
– Click OK

42
Correlation and Regression
• Results
• The Pearson correlation (r) of .815 with p < .001 (< .05) indicates a
statistically significant, strong positive correlation between
test-taking time and grade
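
As a rough check on why the p value is so small, r can be converted to a t statistic with n − 2 degrees of freedom. The worked line below assumes n = 30, the sample size reported in the write-up of this example; it is a sketch of the standard conversion, not SPSS output.

```latex
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  = \frac{.815\sqrt{28}}{\sqrt{1-.815^{2}}}
  \approx 7.44, \qquad df = n - 2 = 28
```

The two-tailed critical value at α = .05 with 28 degrees of freedom is about 2.048, so a t of roughly 7.44 is well beyond it.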

43
Correlation and Regression
• Results
• Since r = .815 is a positive correlation
• ↑ time : ↑ grade
• ↓ time : ↓ grade

44
Correlation and Regression
• Hypothesis resolution

REJECT H0: There is no correlation between the length of time spent taking the exam
and the grade on the exam

ACCEPT H1: There is a correlation between the length of time spent taking the exam
and the grade on the exam

45
Correlation and Regression
• Documenting results
To determine if there is a correlation between the length of time
students spent taking a two-hour exam and the score on it, we
recorded the submission time (0…120 minutes) of each exam (n = 30).
The exams were then graded. We discovered a statistically significant
positive correlation between test time and grade (r = .815, p <
.001, α = .05), wherein students who spent longer on their exams
earned higher grades, and vice versa.

46
Correlation and Regression
• Correlation vs. Causation
• Correlation means that two variables move in a predictable fashion with
respect to each other
• In positive correlations, the variables move in the same direction:
(↑X ↑Y) or (↓X ↓Y)
• In negative correlations, the variables move in opposite directions:
(↑X ↓Y) or (↓X ↑Y)

47
Correlation and Regression
• Correlation vs. Causation
• No matter how strong the correlation, it would be inappropriate to
automatically claim that X causes Y, or that Y causes X
• Example:
• Z (Depression) may be causing:
X (Poor sleep)
and
Y (Low productivity)
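
The depression example can be imitated with a small simulation. This is a sketch under stated assumptions: all scores are randomly generated, the noise levels are arbitrary, and the variable names are hypothetical. X and Y are each built from Z plus independent noise, so neither causes the other.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is repeatable
n = 500

# Z: simulated depression scores (arbitrary units)
depression = [random.gauss(0, 1) for _ in range(n)]
# Independent noise for each outcome
noise_sleep = [random.gauss(0, 0.3) for _ in range(n)]
noise_work = [random.gauss(0, 0.3) for _ in range(n)]
# X and Y both depend on Z, but not on each other
poor_sleep = [z + e for z, e in zip(depression, noise_sleep)]
low_productivity = [z + e for z, e in zip(depression, noise_work)]

r_xy = pearson_r(poor_sleep, low_productivity)
r_without_z = pearson_r(noise_sleep, noise_work)
print(round(r_xy, 2), round(r_without_z, 2))
```

X and Y correlate strongly even though the only link between them is Z; once Z's contribution is taken out (leaving only the noise terms), the correlation is close to zero.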

50
Correlation and Regression
• Correlation vs. Causation
• Three causality criteria
#  Criterion                  Rule                                   Example
1  Association / correlation  Variable A and variable B must be      Taking a dose of aspirin lowers
                              empirically related; there must be     fever.
                              a (scientific) logical relationship
                              between A and B.
2  Temporality                A (cause [independent variable])       The person took aspirin, and then
                              precedes B (effect [dependent          the fever went down; not the other
                              variable]).                            way around.
3  Nonspurious                The relationship between A and B is    The drop in fever is not due to the
                              not caused by other variable(s).       room getting colder, submerging the
                                                                     person in an ice bath, or other
                                                                     factors.

51
Spearman Correlation
• Spearman rho correlation (ρ) (−1…+1) assesses the relationship
between two sorted lists

Same order                   Reversed order
Alice        Bill            Alice        Bill
Chocolate    Chocolate       Chocolate    Vanilla
Strawberry   Strawberry      Strawberry   Strawberry
Vanilla      Vanilla         Vanilla      Chocolate
Spearman rho = 1             Spearman rho = -1
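
The two extreme cases can be reproduced with the standard shortcut for rankings without ties, rho = 1 − 6Σd² / (n(n² − 1)), where d is the difference between an item's positions in the two lists. The slides do not state this formula, so treat the sketch (and the hypothetical function name spearman_rho) as an illustration rather than the method SPSS necessarily uses internally.

```python
def spearman_rho(list_a, list_b):
    """Spearman's rho for two orderings of the same distinct items (no ties)."""
    if sorted(list_a) != sorted(list_b) or len(set(list_a)) != len(list_a):
        raise ValueError("both lists must rank the same distinct items")
    n = len(list_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Position (rank) of each item in the second list, starting at 1
    rank_b = {item: position for position, item in enumerate(list_b, start=1)}
    # Sum of squared rank differences across items
    d_squared = sum((rank_a - rank_b[item]) ** 2
                    for rank_a, item in enumerate(list_a, start=1))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

alice = ["Chocolate", "Strawberry", "Vanilla"]
same = spearman_rho(alice, ["Chocolate", "Strawberry", "Vanilla"])
reversed_order = spearman_rho(alice, ["Vanilla", "Strawberry", "Chocolate"])
print(same, reversed_order)
```

Identical orderings leave every rank difference at zero, and a fully reversed ordering makes the squared differences as large as they can be, which is why these two cases land on +1 and −1.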


58
Spearman Correlation
• Spearman rho correlation (ρ) (−1…+1) assesses the relationship
between two sorted lists

Alice        Bill
Chocolate    Chocolate
Strawberry   Vanilla
Vanilla      Strawberry

Spearman rho = .500 (rank differences 0, −1, +1 give Σd² = 2; 1 − 6(2)/(3(3² − 1)) = .500)


61
Spearman Correlation
• Use data set:
Ch 08 - Example 02 - Correlation and Regression - Spearman.sav

62
Spearman Correlation
• Example
• A dietician asks a patient to arrange five food cards in order of preference,
with the favorite food at the top. The dietician then arranges a second set of
cards to show the recommended diet, records the two card sequences
(patient : dietician), and compares them using Spearman’s rho.

63
Spearman Correlation
• Hypothesis
H0: There is no correlation between the dietician’s
recommended food ranking and the patient’s food preferences
H1: There is a correlation between the dietician’s recommended
food ranking and the patient’s food preferences

64
Spearman Correlation

Rank   Dietician    Patient
1      Vegetables   Fish
2      Fish         Vegetables
3      Poultry      Poultry
4      Beef         Beef
5      Pork         Pork

65
Spearman Correlation
• Value labels assigned for dietician and patient

66
Spearman Correlation
• Database showing labels and values

[Figure panels: Labels | Values]

67
Spearman Correlation
• Test run
– Click Analyze, Correlate, Bivariate

68
Spearman Correlation
• Test run
– Move dietician and patient to Variables

69
Spearman Correlation
• Test run
– Uncheck Pearson
– Check Spearman

70
Spearman Correlation
• Test run
– Click OK

71
Spearman Correlation
• Results
• The Spearman’s rho of .900 with p = .037 (< .05) indicates a
statistically significant strong positive correlation between the food
sequences of the dietician and the patient.
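
The reported value can be reproduced by hand from the two card sequences, using the standard no-ties formula (an assumption here; the slides do not show the formula). Vegetables and Fish each differ by one rank position between the two lists, and Poultry, Beef, and Pork match, so Σd² = 1 + 1 = 2 with n = 5:

```latex
\rho = 1 - \frac{6\sum d^{2}}{n(n^{2}-1)}
     = 1 - \frac{6(2)}{5(5^{2}-1)}
     = 1 - \frac{12}{120}
     = .900
```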

72
Spearman Correlation
• Hypothesis resolution

REJECT H0: There is no correlation between the dietician’s recommended food ranking
and the patient’s food preferences

ACCEPT H1: There is a correlation between the dietician’s recommended food ranking
and the patient’s food preferences

73
Spearman Correlation
• Documenting results
The dietician compared the patient’s food preference (Fish,
Vegetables, Poultry, Beef, Pork) to the recommended nutrition for this
patient (Vegetables, Fish, Poultry, Beef, Pork).
Spearman’s rho produced a statistically significant (p = .037, α =
.05) positive correlation of .900, signifying strong agreement
between the two lists and suggesting that it should be feasible to
assemble a healthy dietary plan that suits this patient’s tastes.

74
Your Tasks Before Next Workshop

• Study and practice the following resources available on Blackboard:
▪ Chi-Square

75
