Professional Documents
Culture Documents
DATA ANALYSIS
Chapter 5
50
80
Percent
40
Percent
60
30
40 20
10
20
0
Gender Men Women Men Women Men Women
0
Opinion Agree Disagree No Opinion Opinion Agree Disagree No Opinion
y=
5
x
x=2
Describing the Scatterplot
Curvilinear No relationship
Numerical Measures for Two
Quantitative Variables
• Assume that the two variables x and y
exhibit a linear pattern or form.
• There are two numerical measures to
describe
–The strength and direction of the
relationship between x and y.
–The form of the relationship.
The Correlation Coefficient
• The strength and direction of the
relationship between x and y are
measured using the correlation
coefficient, r.
s xy ( xi )( yi )
xi yi −
r= where s xy = n
sx s y n −1
sx = standard deviation of the x’s
sy = standard deviation of the y’s
Example
• Living area x and selling price y of 5 homes.
Residence 1 2 3 4 5
x (thousand sq ft) 14 15 17 19 16
y ($000) 178 230 240 275 200
280
260
240
•The scatterplot
indicates a
y
220
200
positive linear
180
14 15 16 17 18 19
relationship.
x
x y xy Example
14 178 2492
15 230 3450 Calculate
17 240 4080
x = 16.2 s x = 1.924
19 275 5225
y = 224.6 s y = 37.360
16 200 3200
81 1123 18447
( xi )( yi ) s xy
xi yi − r=
s xy = n sx s y
n −1
63.6
(81)(1123) = = .885
18447 − 1.924(37.36)
= 5 = 63.6
4
Interpreting r
•-1 r 1 Sign of r indicates direction
of the linear relationship.
280
sy
b=r
260
sx 240
y
220
a = y − bx 200
180
14 15 16 17 18 19
x
sy 37.3604
b=r = (.885) = 17.189
sx 1.9235
a = y − b x = 224 .6 − 17 .189 (16 .2 ) = − 53 .86
RegressionLine : y = − 53 .86 + 17 .189 x
Example
• Predict the selling price for another
residence with 1600 square feet of
living area.
280
260
240
y
220
200
180
14 15 16 17 18 19
x