Correlation
A correlation is a relationship between two variables. The data
can be represented by the ordered pairs (x, y) where x is the
independent (or explanatory) variable, and y is the dependent
(or response) variable.
A scatter plot can be used to determine whether a linear
(straight line) correlation exists between two variables.
Example:
x    1    2    3    4    5
y   –4   –2   –1    0    2
[Figure: scatter plot of y against x for the example data]
Larson & Farber, Elementary Statistics: Picturing the World, 3e 2
Example of Correlation
Is there an association between:
Children’s IQ and parents’ IQ?
Degree of social trust and number of memberships in voluntary associations?
Urban growth and air quality degradation?
Donor funding and number of publications by Ph.D. students?
Number of police patrols and number of crimes?
Grade on exam and time on exam?
Correlation Represents a Linear Relationship
Correlation involves a linear relationship.
"Linear" refers to the fact that, when we graph two correlated
variables, the points fall along a straight line.
Correlation tells you how much two variables are linearly
related, not necessarily how much they are related in
general.
In some cases, two variables may have a strong, or even
perfect, relationship that is not at all linear. In these
cases, the correlation coefficient might be zero.
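The point above can be checked numerically. The following is a minimal sketch, assuming the standard computational formula for the sample correlation coefficient r; the function name `pearson_r` and both data sets are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient from the sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: x values symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]

# y is completely determined by x, but the relationship is a parabola.
ys_curved = [x * x for x in xs]
# y is completely determined by x, and the relationship is a straight line.
ys_line = [2 * x + 1 for x in xs]

r_curved = pearson_r(xs, ys_curved)
r_line = pearson_r(xs, ys_line)
print("parabola:", r_curved)
print("line:    ", r_line)
```

For the parabola the positive and negative x values cancel in the numerator, so r comes out as zero even though y is perfectly predictable from x.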
Specific Example
For seven random summer days, a person recorded the
temperature and their water consumption during a
three-hour period spent outside.

Temperature (F)    Water Consumption (Glasses)
75                 16
83                 20
85                 25
85                 27
92                 32
97                 48
99                 48
How “strong” is the linear relationship?
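One way to answer this question is to compute the sample correlation coefficient for the seven days. The sketch below assumes the usual computational formula for r; the names `pearson_r`, `temperature`, and `glasses` are illustrative and not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient from the sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data from the table: temperature (F) and water consumption (glasses).
temperature = [75, 83, 85, 85, 92, 97, 99]
glasses = [16, 20, 25, 27, 32, 48, 48]

r_water = pearson_r(temperature, glasses)
print(round(r_water, 3))
```

The result is close to 1, which suggests a strong positive linear relationship between temperature and water consumption in this sample.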
Correlation Coefficient
The correlation coefficient is a measure of the strength and the
direction of a linear relationship between two variables. The
symbol r represents the sample correlation coefficient. The
formula for r is
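\[
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{n\sum x^{2} - \left(\sum x\right)^{2}}\,\sqrt{n\sum y^{2} - \left(\sum y\right)^{2}}}.
\]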
The range of the correlation coefficient is –1 to 1. If x and y have
a strong positive linear correlation, r is close to 1. If x and y have
a strong negative linear correlation, r is close to –1. If there is no
linear correlation or a weak linear correlation, r is close to 0.
Correlation Coefficient
Example:
Calculate the correlation coefficient r for the following data.
x     y     xy    x²    y²
1    –3    –3     1     9
2    –1    –2     4     1
3     0     0     9     0
4     1     4    16     1
5     2    10    25     4
Σx = 15   Σy = –1   Σxy = 9   Σx² = 55   Σy² = 15
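As a worked step, the column sums can be substituted into the formula for r. The sums used here (n = 5, Σx = 15, Σy = –1, Σxy = 9, Σx² = 55, Σy² = 15) follow directly from the five data rows.

```latex
\begin{aligned}
r &= \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
          {\sqrt{n\sum x^{2} - \left(\sum x\right)^{2}}\,\sqrt{n\sum y^{2} - \left(\sum y\right)^{2}}} \\
  &= \frac{5(9) - (15)(-1)}
          {\sqrt{5(55) - 15^{2}}\,\sqrt{5(15) - (-1)^{2}}} \\
  &= \frac{60}{\sqrt{50}\,\sqrt{74}} \approx 0.986
\end{aligned}
```

Because r is close to 1, these data show a strong positive linear correlation.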
Example:
Hours watching TV (x) and test score (y) for 12 observations.
Hours, x          0     1     2     3     3     5     5     5     6     7     7    10
Test score, y    96    85    82    74    95    68    76    84    58    65    75    50
[Figure: scatter plot of test score (y) against hours watching TV (x)]
Hours, x          0     1     2     3     3     5     5     5     6     7     7    10
Test score, y    96    85    82    74    95    68    76    84    58    65    75    50
xy                0    85   164   222   285   340   380   420   348   455   525   500
x²                0     1     4     9     9    25    25    25    36    49    49   100
y²             9216  7225  6724  5476  9025  4624  5776  7056  3364  4225  5625  2500
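The table rows can be totalled and substituted into the formula for r. The following is a minimal sketch of that calculation; the variable names are illustrative and not part of the original slides.

```python
import math

# Data from the example: hours watching TV (x) and test score (y).
hours = [0, 1, 2, 3, 3, 5, 5, 5, 6, 7, 7, 10]
scores = [96, 85, 82, 74, 95, 68, 76, 84, 58, 65, 75, 50]

n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_xy = sum(x * y for x, y in zip(hours, scores))  # total of the xy row
sum_x2 = sum(x * x for x in hours)                  # total of the x^2 row
sum_y2 = sum(y * y for y in scores)                 # total of the y^2 row

numerator = n * sum_xy - sum_x * sum_y
denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
               * math.sqrt(n * sum_y2 - sum_y ** 2))
r_tv = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r_tv, 3))
```

The sums come out as Σx = 54, Σy = 908, Σxy = 3724, Σx² = 332, and Σy² = 70836, giving r ≈ –0.831. In this sample, more hours watching TV are associated with lower test scores, and the linear relationship is fairly strong.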
BMI    Birthweight
20     2.7
30     2.9
50     3.4
45     3.0
10     2.2
30     3.1
40     3.3
25     2.3
50     3.5
20     2.5
10     1.5
55     3.8
60     3.7
50     3.1
35     2.8
[Figure: Scatter diagram of BMI and Birthweight]
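The strength of the pattern in this scatter diagram can be quantified with r. The sketch below uses the deviation-from-the-mean form of the correlation coefficient, which is algebraically equivalent to the sums formula. It assumes that the first data column is BMI and the second is birthweight; the slides do not label the columns, so that assignment is an inference from the axis ranges.

```python
import math

# Assumption: first column = BMI (x), second column = birthweight (y).
bmi = [20, 30, 50, 45, 10, 30, 40, 25, 50, 20, 10, 55, 60, 50, 35]
birthweight = [2.7, 2.9, 3.4, 3.0, 2.2, 3.1, 3.3, 2.3, 3.5, 2.5,
               1.5, 3.8, 3.7, 3.1, 2.8]

n = len(bmi)
mean_x = sum(bmi) / n
mean_y = sum(birthweight) / n

# Sums of products and squares of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bmi, birthweight))
s_xx = sum((x - mean_x) ** 2 for x in bmi)
s_yy = sum((y - mean_y) ** 2 for y in birthweight)

r_bmi = s_xy / math.sqrt(s_xx * s_yy)
print(round(r_bmi, 2))
```

Since r is symmetric in x and y, swapping the two columns would not change the value. It is roughly 0.91, a strong positive linear association.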
Correlation Coefficient, r
• r is a measure of the strength of the linear
association between two variables, x and y.
[Figure: six scatter plots of Y against X illustrating r = –1, r = –0.6, r = 0, r = +1, r = +0.3, and r = 0]
Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall
Linear Correlation
[Figure: scatter plots of Y against X contrasting linear relationships with curvilinear relationships]
[Figure: scatter plots of Y against X contrasting strong relationships with weak relationships]
[Figure: scatter plot showing no relationship]
Difference between Correlation and Regression
Weight X of father’s foot (kg)    1    3    4    6    8    9   11   14
Weight Y of son’s foot (kg)       1    2    4    4    5    7    8    9

Algebra (X)    75   80   93   65   87   71   98   68   84   77
Physics (Y)    82   78   86   72   91   80   95   72   89   74

Age (X)               56   42   72   36   63   47   55   49   38   42   68   60
Blood pressure (Y)    47   25   60   18   49   28   50   45   15   40   52   55