PROBABILITY 11
4th Quarter
Week 8
Competencies:
- Calculates the Pearson’s sample correlation coefficient.
Code: M11/12SP-IVh - 2
- Solves problems involving correlation analysis.
Code: M11/12SP-IVh - 3
Learner’s Tasks
Lesson Overview
In the previous lesson, you learned about bivariate data. You also learned how
to draw the scatterplot of a pair of variables and interpret it qualitatively in terms of its
direction. Sometimes, a scatterplot does not clearly show that a correlation exists
between the two variables. This is the case with a very weak correlation, where it is
difficult to identify the trend line. Thus, we need a more accurate interpretation of the
scatterplot using quantitative methods. Here, we will compute a value, the Pearson's
sample correlation coefficient 𝒓, that indicates whether a correlation exists between the
two variables, and we will analyze and interpret its strength using an arbitrary scale.
$$
r = \frac{n\sum XY - \sum X \cdot \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}
$$
Where: n = number of pairs of data
X = values in the first set of data
Y = values in the second set of data
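The formula can also be written as a short Python function. This is only a sketch: the function name `pearson_r_from_sums` and the four sample pairs are made up for illustration and are not part of the module.

```python
from math import sqrt


def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson r from the five sums that appear in the formula."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical pairs (1, 2), (2, 1), (3, 4), (4, 3):
# n = 4, sum X = 10, sum Y = 10, sum XY = 28, sum X^2 = 30, sum Y^2 = 30
r = pearson_r_from_sums(n=4, sum_x=10, sum_y=10, sum_xy=28, sum_x2=30, sum_y2=30)
print(round(r, 2))  # 0.6
```

The function fails with a division by zero when all X values (or all Y values) are equal, because one bracket under the square root becomes 0. In that case r is undefined.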
1. If the trend line contains all the points in the scatterplot and rises to the
right, we conclude that there is a perfect positive correlation between the two
variables. The computed 𝒓 is 1.
2. If all the points fall on a trend line that falls to the right, then there exists a perfect
negative correlation between the pair of variables. The computed value of 𝒓 is -1.
3. If a trend line does not exist, there is no correlation between the pair of variables. This
is confirmed by a computed value of 𝒓 equal to 0.
4. The absolute value of r indicates the strength of correlation between the two
variables. The direction of correlation is indicated by the sign (positive or negative)
of r.
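The properties above can be checked on three tiny data sets. The sketch below uses a helper named `pearson_r` and made-up values, both of which are assumptions for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r computed directly from two lists of paired values."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )


xs = [1, 2, 3]                       # hypothetical X values
rising = pearson_r(xs, [2, 4, 6])    # every point on a line that rises
falling = pearson_r(xs, [6, 4, 2])   # every point on a line that falls
no_trend = pearson_r(xs, [1, 0, 1])  # no straight-line trend
print(rising, falling, no_trend)     # 1.0 -1.0 0.0
```

In the third data set the points form a V shape, so the variables are related, but not along a straight line. A value of 𝒓 = 0 only rules out a linear trend.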
The analysis and interpretation of the computed value of 𝒓 revolves around two elements:
direction and strength of correlation. For the direction, we just look at the sign of the
computed r, whether positive or negative. The strength of a correlation is indicated by the
absolute value of the computed 𝒓. The closer it gets to ±1, the higher or stronger the
correlation. The closer it is to 0, the lower or weaker the correlation. The most extreme
values that 𝒓 can take are +1 and −1. The value of 𝒓 never goes beyond these perfect values.
Normally, a boundary value could belong to either of the two segments it separates. For
example, r = 0.5 may be described as “moderately low” or “moderately high”. Let us agree to
place each boundary in the higher segment. Thus, if r = -0.75, the strength of correlation
would be “very high.”
We will use the following scale to determine the strength of the computed r.
Pearson r             Qualitative Description
±1                    Perfect
±0.75 to < ±1         Very High
±0.50 to < ±0.75      Moderately High
±0.25 to < ±0.50      Moderately Low
> 0 to < ±0.25        Very Low
0                     No Correlation
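The scale above, together with the agreement that a boundary value belongs to the higher segment, can be sketched as a small lookup function. The name `describe_strength` is hypothetical and not from the module.

```python
def describe_strength(r):
    """Return the qualitative description of r from the scale above."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength ignores the sign; the sign gives the direction
    if size == 1:
        return "Perfect"
    if size >= 0.75:  # boundaries go to the higher segment
        return "Very High"
    if size >= 0.50:
        return "Moderately High"
    if size >= 0.25:
        return "Moderately Low"
    if size > 0:
        return "Very Low"
    return "No Correlation"


print(describe_strength(-0.75))  # Very High
print(describe_strength(0.5))    # Moderately High
```

Using `>=` in each comparison is what places a boundary such as 0.50 or 0.75 in the higher segment.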
Illustrative Example
Solution:
a. The steps below will guide you on how to compute the Pearson product-moment
correlation coefficient 𝒓.
STEPS AND SOLUTION

1. Construct a table as shown below.

X     Y     XY     X²     Y²
1     5
2     10
3     10
4     15
5     25
6     30

2. Complete the table.
a. Multiply the entries in the X and Y columns. Put the products under the XY column.
b. Square each entry in the X column. Put the squares under the X² column.
c. Square each entry in the Y column. Put the squares under the Y² column.

X     Y     XY     X²     Y²
1     5     5      1      25
2     10    20     4      100
3     10    30     9      100
4     15    60     16     225
5     25    125    25     625
6     30    180    36     900

3. Get the sum of the entries in each column.
a. Get the sum of all entries in the X column. This is ∑X.
b. Get the sum of all entries in the Y column. This is ∑Y.
c. Get the sum of all entries in the XY column. This is ∑XY.
d. Get the sum of all entries in the X² column. This is ∑X².
e. Get the sum of all entries in the Y² column. This is ∑Y².

X          Y          XY           X²          Y²
1          5          5            1           25
2          10         20           4           100
3          10         30           9           100
4          15         60           16          225
5          25         125          25          625
6          30         180          36          900
∑X = 21    ∑Y = 95    ∑XY = 420    ∑X² = 91    ∑Y² = 1,975
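Steps 1 to 3 can be reproduced in a few lines of Python. The table stops at the column sums, so the final lines, which substitute those sums into the formula for 𝒓, are a continuation sketched here and not a step taken from the table. The variable names are illustrative.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5, 6]        # X column of the example
ys = [5, 10, 10, 15, 25, 30]   # Y column of the example

# Steps 1 and 2: one row per pair, holding X, Y, XY, X^2 and Y^2
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(xs, ys)]

# Step 3: the sum of each column
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(col) for col in zip(*rows))
print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 21 95 420 91 1975

# Continuation (assumed next step): substitute the sums and n into the formula
n = len(rows)
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(round(r, 2))  # 0.96
```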
The following are the heights, in inches, of ten fathers and their eldest sons:
Height of the father   74  66  67  65  68  66  69  71  63  60
Height of the son      74  69  69  65  66  63  68  70  60  58
a. Compute the Pearson’s correlation coefficient 𝒓 of the data given above.
b. What is the strength of the correlation?
c. Based on the computed 𝒓, can you conclude that height is hereditary? Explain.
Formative Test
Let us see how much you have learned from our lesson.
Directions: Choose the letter of the best answer. Write the chosen letter on a separate sheet
of paper.
1. In computing Pearson 𝒓, which of the following is the next step after obtaining the sum
of all entries in all columns in the table?
A. Construct a table.
B. Complete the table.
C. Simplify and compute for the value of 𝒓.
D. Substitute all the sums and 𝒏 in the formula.
5. Based on the findings, which of the following best describes the result?
A. A student who is good in Mathematics is also good in Science.
B. A student who is good in Mathematics is not good in Science.
C. A student who is not good in Mathematics is good in Science.
D. There is no correlation between the performances of the students in
Mathematics and Science.
Answer Key
Activity 1
1. Correlation
2. Correlation Analysis
3. -1, 1, -1 ≤ 𝑟 ≤ +1

Activity 3
1. There is a moderately high positive correlation between the number of theft cases and
the number of vandalism cases in the school.
2. Answers may vary.
3. Fuel consumption and mileage travelled by a car.

Activity 2
a.
X           Y           XY              X²              Y²
74          74          5476            5476            5476
66          69          4554            4356            4761
67          69          4623            4489            4761
65          65          4225            4225            4225
68          66          4488            4624            4356
66          63          4158            4356            3969
69          68          4692            4761            4624
71          70          4970            5041            4900
63          60          3780            3969            3600
60          58          3480            3600            3364
∑X = 669    ∑Y = 662    ∑XY = 44,446    ∑X² = 44,897    ∑Y² = 44,036
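$$
\begin{aligned}
r &= \frac{n\sum XY - \sum X \cdot \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} \\
  &= \frac{10(44{,}446) - (669)(662)}{\sqrt{\left[10(44{,}897) - (669)^2\right]\left[10(44{,}036) - (662)^2\right]}} \\
  &= \frac{1{,}582}{\sqrt{[1{,}409][2{,}116]}} \\
  &= \frac{1{,}582}{\sqrt{2{,}981{,}444}} \\
r &= 0.91621 \text{ or } 0.92
\end{aligned}
$$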
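b. Since 𝒓 = 0.92 lies between 0.75 and 1, there is a very high positive correlation.
c. Since there is a very high positive correlation between the heights of the fathers and the
heights of their sons, the data support the conclusion that height is hereditary.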
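The arithmetic in the Activity 2 solution can be double-checked with a short script. This is a sketch for checking only, and the variable names are illustrative. It prints the numerator and the two bracketed terms so they can be compared with the values in the solution.

```python
from math import sqrt

fathers = [74, 66, 67, 65, 68, 66, 69, 71, 63, 60]  # X, in inches
sons = [74, 69, 69, 65, 66, 63, 68, 70, 60, 58]     # Y, in inches
n = len(fathers)

# n * sum(XY) - sum(X) * sum(Y)
numerator = n * sum(f * s for f, s in zip(fathers, sons)) - sum(fathers) * sum(sons)
# n * sum(X^2) - (sum X)^2  and  n * sum(Y^2) - (sum Y)^2
x_term = n * sum(f * f for f in fathers) - sum(fathers) ** 2
y_term = n * sum(s * s for s in sons) - sum(sons) ** 2
print(numerator, x_term, y_term)  # 1582 1409 2116

r = numerator / sqrt(x_term * y_term)
print(round(r, 2))  # 0.92
```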