
Linear Correlation of Bivariate Data
Least Squares Criterion:
To decide which of two candidate lines is the
better line of best fit, we choose the line
that gives the smaller sum of squared residuals.
$SS_{res} = \sum \left( y - f(x) \right)^2$
x    y    y = 0.81x + 2.36    y = 0.65x + 3.08
1    3
2    5
3    3
4    7
5    5
6    9
7    7
8    9
Solution:
On the GDC, add a Lists & Spreadsheet page.
Enter the x values and y values, and label the
columns x and y.
Label the 3rd and 4th columns a and b.
Below the labels, type = 0.81x + 2.36 in column a
and = 0.65x + 3.08 in column b.
When prompted, choose Variable Reference.
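Label the 5th and 6th columns c and d.
Below the labels, type $(y - a)^2$ in column c and
$(y - b)^2$ in column d.
When prompted, choose Variable Reference.
Press Menu, Statistics, Stat Calculations,
Two-Variable Statistics.
Choose X List: c and Y List: d.
The values of $\sum x$ and $\sum y$ in the results are then
the sums of squared residuals for line 1 and line 2.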
x    y    y = 0.81x + 2.36    y = 0.65x + 3.08
1    3    3.17                3.73
2    5    3.98                4.38
3    3    4.79                5.03
4    7    5.60                5.68
5    5    6.41                6.33
6    9    7.22                6.98
7    7    8.03                7.63
8    9    8.84                8.28

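$SS_{res} = 12.5$ for line 1 (y = 0.81x + 2.36)
$SS_{res} = 13.5$ for line 2 (y = 0.65x + 3.08)
Line 1 is the better line of best fit, because it
has the smaller sum of squared residuals.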
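The two sums of squared residuals can also be checked away from the GDC. The following is a minimal Python sketch, assuming the eight (x, y) pairs from the table; the function name sum_squared_residuals and the variable names are illustrative only and do not come from the original slides.

```python
# Data points from the table in this example.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3, 5, 3, 7, 5, 9, 7, 9]


def sum_squared_residuals(xs, ys, slope, intercept):
    """Return the sum of (y - f(x))^2 for the line f(x) = slope * x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


ss_line1 = sum_squared_residuals(xs, ys, 0.81, 2.36)  # line 1: y = 0.81x + 2.36
ss_line2 = sum_squared_residuals(xs, ys, 0.65, 3.08)  # line 2: y = 0.65x + 3.08

# Report both sums to one decimal place, as on the slide.
print(round(ss_line1, 1), round(ss_line2, 1))

# The least squares criterion prefers the line with the smaller sum.
better = "line 1" if ss_line1 < ss_line2 else "line 2"
print("Smaller sum of squared residuals:", better)
```

With these data the sums come out at about 12.5 for line 1 and 13.5 for line 2, so line 1 has the smaller sum of squared residuals.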
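On the scatter graph page, press Menu,
Analyze, Show regression, Linear (mx+b).
To predict y for a given x, press ctrl, doc and choose Calculator, then press var
and select stat.regeqn. Type 19.5 and press Enter to evaluate the regression
equation at x = 19.5.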
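What the GDC does when it shows the linear regression and evaluates it can be sketched in Python using the standard least squares formulas. This is only an illustration: the data set behind the 19.5 prediction is not shown here, so the sketch assumes the eight points from the earlier table, and the function name least_squares_line and the value x_new = 4.5 are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Assumption: the table data from the first example.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3, 5, 3, 7, 5, 9, 7, 9]

slope, intercept = least_squares_line(xs, ys)
print(round(slope, 2), round(intercept, 2))

# Evaluate the regression equation at a hypothetical x-value,
# analogous to typing a number into stat.regeqn on the GDC.
x_new = 4.5
print(round(slope * x_new + intercept, 2))
```

For the table data the fitted line rounds to y = 0.81x + 2.36, which matches line 1 of the earlier comparison.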
Interpolation is using the data to predict a value within the range of
the data set.
Extrapolation is using the data to predict a value beyond the range of the data set.
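The distinction can be expressed as a simple range check. The sketch below is a hedged illustration: the function name prediction_type is made up for this example, and the x-values are borrowed from the earlier table rather than from the exercise that follows.

```python
def prediction_type(xs, x_new):
    """Classify a prediction at x_new as interpolation or extrapolation."""
    if min(xs) <= x_new <= max(xs):
        return "interpolation"  # x_new lies inside the observed x-range
    return "extrapolation"      # x_new lies outside the observed x-range


# Hypothetical x-values, taken from the earlier table for illustration.
xs = [1, 2, 3, 4, 5, 6, 7, 8]

inside = prediction_type(xs, 4.5)
outside = prediction_type(xs, 19.5)
print(inside, outside)
```

Predictions classified as extrapolation should be treated with caution, because the linear trend is not guaranteed to continue outside the observed range.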

a) The correlation is moderate and linear.


b) x = 0 is quite far outside the given data range.

c) x = 10 is inside the given data range.


d) The regression line of y on x is used to
predict the value of y for a given value of x
and not the other way round.
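One way to see why the y on x line should not be run backwards is to compare it with a line fitted the other way round. The sketch below is an illustration under assumptions: it reuses the eight points from the earlier table (not the data of this exercise), and the helper name fit_line and the target value y = 8 are hypothetical.

```python
def fit_line(us, vs):
    """Least squares line of v on u: return (slope, intercept) for v = slope * u + intercept."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    s_uu = sum((u - mean_u) ** 2 for u in us)
    slope = s_uv / s_uu
    return slope, mean_v - slope * mean_u


# Assumption: the table data from the first example.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3, 5, 3, 7, 5, 9, 7, 9]

m_yx, c_yx = fit_line(xs, ys)  # y on x: y = m_yx * x + c_yx
m_xy, c_xy = fit_line(ys, xs)  # x on y: x = m_xy * y + c_xy

y_target = 8  # hypothetical y-value for which we want an x estimate

# Rearranging the y on x line to solve for x ...
x_from_rearranged = (y_target - c_yx) / m_yx
# ... gives a different answer from the line fitted to predict x from y.
x_from_x_on_y = m_xy * y_target + c_xy

print(round(x_from_rearranged, 2), round(x_from_x_on_y, 2))
```

The two estimates differ because each line minimises residuals in a different direction: vertical distances for y on x, horizontal distances for x on y.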
