Learning Objectives
Presentation of Content
I. Linear Correlation
The linear correlation coefficient r measures the strength and direction of the linear relationship
between two variables (Larson and Farber, 2000; Pagala, 2011). We will use the formula below to
determine its value.
$$
r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}
$$
Where:
n = number of ordered pairs
x = value of the independent variable
y = value of the dependent variable
How will you use the formula to determine the relationship of two variables?
How can we apply the formula to predict values of the dependent variable?
Method of Least Squares
Since the intercept α and the slope β of the regression line are generally not known in a
regression problem, they must be estimated from sample data: observations of the dependent
variable y taken at a number of values of the independent variable x.
Note: The standard approach to estimating α and β is the method of least squares, which minimizes
the sum of the squared errors for your data points.
Sample estimates of α and β are denoted by $\hat{\alpha}$ and $\hat{\beta}$, respectively, and the
resulting regression line is called the sample least squares regression equation.
$$
\hat{y} = \hat{\alpha} + \hat{\beta}x
$$
The sum of the squared vertical deviations between the line and the scattered points should be minimized.
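The estimates of α and β that minimize this sum are given by the formulas below:
$$
\hat{\beta} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}
$$
$$
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}
$$
Note: Here, $\bar{x}$ and $\bar{y}$ denote the sample means of x and y.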
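To make the idea of "minimizing the sum of squared deviations" concrete, here is a minimal Python sketch. The five data points and the names `least_squares` and `sse` are invented for illustration and do not come from this module. The sketch computes the estimates from the deviation formulas and then checks that nudging either coefficient does not lower the sum of squared errors.

```python
# Toy data invented for illustration only (not from this module).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def least_squares(xs, ys):
    """Return (intercept, slope) using the deviation-form formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sse(intercept, slope, xs, ys):
    """Sum of squared vertical deviations between the points and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

a_hat, b_hat = least_squares(xs, ys)
best = sse(a_hat, b_hat, xs, ys)

# Moving either coefficient away from the estimate should not reduce the SSE.
for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]:
    assert sse(a_hat + da, b_hat + db, xs, ys) >= best

print(f"intercept={a_hat:.4f}, slope={b_hat:.4f}, SSE={best:.4f}")
```

Checking a handful of perturbations is only a spot check, not a proof; the formulas themselves guarantee the minimum.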
Alternative Formulas
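The alternative formulas for the estimates of α and β are as follows.
$$
\hat{\beta} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
$$
$$
\hat{\alpha} = \frac{\sum y - \hat{\beta}\sum x}{n}
$$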
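The deviation form and the sum form of the slope estimate are algebraically the same. The short derivation below is a sketch of why, using only the definitions $\bar{x} = \sum x / n$ and $\bar{y} = \sum y / n$.

$$
\sum (x - \bar{x})(y - \bar{y}) = \sum xy - \bar{y}\sum x - \bar{x}\sum y + n\bar{x}\bar{y} = \sum xy - \frac{\sum x \sum y}{n}
$$
$$
\sum (x - \bar{x})^2 = \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 = \sum x^2 - \frac{\left(\sum x\right)^2}{n}
$$
$$
\hat{\beta} = \frac{\sum xy - \frac{1}{n}\sum x \sum y}{\sum x^2 - \frac{1}{n}\left(\sum x\right)^2} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}
$$

In the same way, substituting the means into $\bar{y} - \hat{\beta}\,\bar{x}$ gives $\left(\sum y - \hat{\beta}\sum x\right)/n$ for the intercept estimate.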
Application
Example 1
Now, let us apply what we have learned. Here is an activity where we can use the formula
given. Remember to follow the guidelines in determining the linear correlation coefficient. Try to
solve the problem independently before comparing your answers to the answers provided.
Problem: The heights and weights of 10 basketball players are given below. Determine the
value of the linear correlation coefficient.
Heights and weights of 10 basketball players.
X (Height in Inches)      67 70 71 70 66 69 72 78 64 65
Y (Weight in Kilograms)   71 70 69 68 66 65 71 70 64 65
Have you tried answering the problem? Great! Now, we can compare your answers.
Solution:
We determine the values of the variables.
Height (X) Weight (Y) XY X² Y²
67 71 4,757 4,489 5,041
70 70 4,900 4,900 4,900
71 69 4,899 5,041 4,761
70 68 4,760 4,900 4,624
66 66 4,356 4,356 4,356
69 65 4,485 4,761 4,225
72 71 5,112 5,184 5,041
78 70 5,460 6,084 4,900
64 64 4,096 4,096 4,096
65 65 4,225 4,225 4,225
The values of the variables are:
n = 10    ∑xy = 47,050    n∑xy = 470,500
∑x = 692    ∑y = 679    ∑x∑y = 469,868
∑x² = 48,036    n∑x² = 480,360    (∑x)² = 478,864
∑y² = 46,169    n∑y² = 461,690    (∑y)² = 461,041
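$$
\begin{aligned}
r &= \frac{470{,}500 - (692)(679)}{\sqrt{\left[480{,}360 - (692)^2\right]\left[461{,}690 - (679)^2\right]}} \\
  &= \frac{470{,}500 - 469{,}868}{\sqrt{[480{,}360 - 478{,}864][461{,}690 - 461{,}041]}} \\
  &= \frac{632}{\sqrt{(1{,}496)(649)}} \\
  &= \frac{632}{\sqrt{970{,}904}} \\
  &= \frac{632}{985.34} \\
  &\approx 0.64
\end{aligned}
$$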
From the previous activity, the correlation coefficient is r ≈ 0.64, which can be interpreted as a
moderate positive correlation. There is a substantial degree of linear correlation between the
height and weight of the ten basketball players.
Awesome! Keep up the good work!
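If you would like to check the hand computation, the following Python sketch applies the same formula to the height and weight data of Example 1. The function name `pearson_r` is our own choice for illustration.

```python
import math

# Data from Example 1.
heights = [67, 70, 71, 70, 66, 69, 72, 78, 64, 65]  # X, inches
weights = [71, 70, 69, 68, 66, 65, 71, 70, 64, 65]  # Y, kilograms

def pearson_r(xs, ys):
    """Linear correlation coefficient computed from the raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

r = pearson_r(heights, weights)
print(round(r, 2))  # 0.64
```

The same function can be reused for Exercise 1 by replacing the two lists with the Mathematics and English scores.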
Exercise 1
Let us put your understanding into practice. Below are the test results of 10 students in their
Mathematics and English examinations. Determine the linear correlation coefficient and interpret
its value.
X (Score in Mathematics)   34 23 45 44 37 46 23 41 40 35
Y (Score in English)       35 21 43 42 32 45 23 47 43 37
Example 2
Using the given formulas, try to determine the values of the variables to come up with the least
squares regression equation.
Problem:
The Cagayan State University officials wished to determine whether the CSU College Admission
Test (CSU-CAT) score is a good indicator of the General Weighted Average (GWA) of 16 scholars
selected at random from the first-year class. Their GWA and CSU-CAT scores are shown in the
table below. What is the estimated GWA of a student with a CAT score of 83?
Student CAT Raw Score (x) GWA (y)
1 80 85
2 82 87
3 90 90
4 87 88
5 80 84
6 85 89
7 95 97
8 97 98
9 98 98
10 90 92
11 82 85
12 81 83
13 85 87
14 86 88
15 88 88
16 92 95
How can one predict and estimate GWA from CAT scores?
Solution
Now, we need to obtain the equation for the line that best fits the sample data.
Student CAT Raw Score (x) GWA (y) xy x² y²
1 80 85 6,800 6,400 7,225
2 82 87 7,134 6,724 7,569
3 90 90 8,100 8,100 8,100
4 87 88 7,656 7,569 7,744
5 80 84 6,720 6,400 7,056
6 85 89 7,565 7,225 7,921
7 95 97 9,215 9,025 9,409
8 97 98 9,506 9,409 9,604
9 98 98 9,604 9,604 9,604
10 90 92 8,280 8,100 8,464
11 82 85 6,970 6,724 7,225
12 81 83 6,723 6,561 6,889
13 85 87 7,395 7,225 7,569
14 86 88 7,568 7,396 7,744
15 88 88 7,744 7,744 7,744
16 92 95 8,740 8,464 9,025
Total 1,398 1,434 125,720 122,670 128,892
Solution:
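Using the formulas with the totals from the table:
$$
\bar{y} = \frac{1{,}434}{16} = 89.625
$$
$$
\bar{x} = \frac{1{,}398}{16} = 87.375
$$
$$
\hat{\beta} = \frac{16(125{,}720) - (1{,}398)(1{,}434)}{16(122{,}670) - (1{,}398)^2} = \frac{6{,}788}{8{,}316} \approx 0.8163
$$
$$
\hat{\alpha} = 89.625 - (0.8163)(87.375) = 18.3008
$$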
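The fitted equation describing the relationship between GWA and CAT scores is: GWA =
18.3008 + 0.8163x, where x is the CAT raw score. For a student with a CAT score of 83, the
estimated GWA is 18.3008 + 0.8163(83) ≈ 86.05.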
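The fit can also be checked by machine. The Python sketch below applies the alternative (sum) formulas to the CAT and GWA data of Example 2 and then evaluates the fitted line at a CAT score of 83. The function name `fit_line` is our own. Because the code carries the slope at full precision instead of rounding it to 0.8163 first, its intercept may differ from the hand value in the third decimal place.

```python
# Data from Example 2.
cat = [80, 82, 90, 87, 80, 85, 95, 97, 98, 90, 82, 81, 85, 86, 88, 92]
gwa = [85, 87, 90, 88, 84, 89, 97, 98, 98, 92, 85, 83, 87, 88, 88, 95]

def fit_line(xs, ys):
    """Return (intercept, slope) using the alternative sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope

intercept, slope = fit_line(cat, gwa)
predicted = intercept + slope * 83  # estimated GWA for a CAT score of 83

print(f"GWA = {intercept:.4f} + {slope:.4f}x")
print(f"Estimated GWA at CAT 83: {predicted:.2f}")
```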
Age (x)     10 12 11 26 28 21 22 18 16 15
Score (y)   32 30 34 39 38 32 29 28 25 20