
STATISTICS and PROBABILITY 11
4th Quarter
Week 8

LEARNING ACTIVITY SHEET


Division of Surigao del Sur
Disclaimer: This Learning Activity Sheet (LAS) is based on the Self-Learning
Modules, Learner’s Materials, Textbooks and Teaching Guides released by DepEd
Central Office. Furthermore, the use of duly acknowledged external resources is
purely non-profit and for educational purposes, and constitutes fair use. All Rights Reserved.

Development Team Quality Assurance Team

Developer: Erick Jesson J. Bucalon Evaluators: Myracell P. Buenaflor


Layout Artist: Erick Jesson J. Bucalon Annabel C. Cubero
Danife B. Engcoy

PSDS/DIC: Rosalinda E. Urbiztondo


Mirasol Taray Learning Area EPS:
Ramonito D. Cortes Regina Euann A. Puerto

LAS Graphics and Design Credits:


Title Page Art: Marieto Cleben V. Lozada
Title Page Layout: Bryan L. Arreo
Visual Cues Art: Ivin Mae N. Ambos

For inquiries or feedback, please write or call:

Department of Education – Division of Surigao del Sur


Balilahan, Tandag City

Telephone: (086) 211-3225


Email Address: surigaodelsur.division@deped.gov.ph
Facebook: SurSur Division LRMS Updates
Facebook Messenger: Learning Resource Concerns

Competencies:
- Calculates the Pearson’s sample correlation coefficient.
Code: M11/12SP-IVh - 2
- Solves problems involving correlation analysis
Code: M11/12SP-IIIh - 3

Objectives: At the end of the week, you shall have

a. defined correlation and the terminologies related to it;
b. analyzed and interpreted the strength of correlation between two variables
based on the computed sample correlation coefficient r from a given real-life
problem; and
c. reflected on the importance of correlation analysis in decision making in a
real-life setting.

Learner’s Tasks

Lesson Overview
In the previous lesson, you learned about bivariate data. You also learned how
to draw the scatterplot of a pair of variables and interpret it qualitatively in terms of its
direction. Sometimes, a scatterplot does not clearly show whether a correlation exists
between the two variables. This is the case with a very weak correlation, where it is
difficult to identify the trend line. Thus, we need a more accurate
interpretation of the scatterplot using quantitative methods. Here, we will compute
a value that indicates whether a correlation exists between the two variables, and we
will analyze and interpret its strength using an agreed scale.

Pearson’s Sample Correlation Coefficient (Pearson 𝒓)


Statisticians devised quantitative ways to measure the association between two variables.
Correlation is used to determine the existence, strength, and direction of the relationship
between two variables. Correlation analysis is a statistical method used to determine
whether a relationship between two variables exists. The strength of correlation is indicated
by the correlation coefficient. There are several coefficients of correlation. The one most
commonly used in linear correlation is the Pearson sample correlation coefficient, symbolized
by r and named in honor of Karl Pearson, the statistician who did extensive research in this area.

To compute r, we use the formula

$$ r = \frac{n\sum XY - \sum X \cdot \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} $$

where:
n = number of pairs of data
X = values in the first set of data
Y = values in the second set of data
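
The formula above can be translated almost term by term into a short program. The following is only a minimal sketch in Python: the function name `pearson_r` and the sample lists are assumptions made for illustration and are not part of the lesson.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient via the computational formula."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")
    n = len(xs)                                   # number of pairs of data
    sum_x = sum(xs)                               # sum of X
    sum_y = sum(ys)                               # sum of Y
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of XY
    sum_x2 = sum(x * x for x in xs)               # sum of X squared
    sum_y2 = sum(y * y for y in ys)               # sum of Y squared
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when X or Y has no variation")
    return numerator / denominator

# Hypothetical data, used only to show how the function is called.
print(round(pearson_r([1, 2, 3, 4], [2, 3, 5, 4]), 2))  # prints 0.8
```

For the hypothetical data, the numerator is 4(39) − (10)(14) = 16 and the denominator is √(20 · 20) = 20, so r = 0.8.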

The Meaning of the Correlation Coefficient

1. If the trend line contains all the points in the scatterplot and the line rises to the
right, we conclude that there is a perfect positive correlation between the two
variables. The computed r is 1.
2. If all the points fall on a trend line that falls to the right, then there exists a perfect
negative correlation between the pair of variables. The computed value of r is -1.
3. If a trend line does not exist, there is no correlation between the pair of variables. This
is confirmed by the computed value of r, which is 0.
4. The absolute value of r indicates the strength of correlation between the two
variables. The direction of correlation is indicated by the sign (positive or negative)
of r.
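
The cases above can be checked numerically. The sketch below uses small made-up data sets (they are assumptions for illustration only): one whose points lie exactly on a rising line, one on a falling line, and one with no linear trend. The helper names `pearson_r` and `direction` are also hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the computational formula (assumes equal-length lists)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )

def direction(r):
    """The sign of r gives the direction of the correlation."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

x = [1, 2, 3, 4]
examples = {
    "rising line": [2, 4, 6, 8],    # every point on a line that rises to the right
    "falling line": [8, 6, 4, 2],   # every point on a line that falls to the right
    "no trend": [1, 2, 2, 1],       # no linear trend at all
}

results = {}
for label, y in examples.items():
    r = pearson_r(x, y)
    results[label] = r
    # abs(r) measures strength; the sign of r measures direction.
    print(f"{label}: r = {r:+.2f}, direction = {direction(r)}, strength = {abs(r):.2f}")
```

The three data sets give r = 1, r = -1, and r = 0 respectively, matching cases 1 to 3.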

Analysis and Interpretation of Pearson r

The analysis and interpretation of the computed value of r revolves around two elements:
direction and strength of correlation. For the direction, we just look at the sign of the
computed r, whether positive or negative. The strength of a correlation is indicated by the
absolute value of the computed r. The closer it gets to ±1, the higher or stronger the correlation. The
closer it is to 0, the lower or weaker the correlation. The extreme values that r can take are
+1 and −1. The value of r never goes beyond these perfect values.

The strength of correlation as indicated by the numerical value of r depends on who is judging it.
For example, for one person, r = 0.7 might already indicate a very high correlation.
However, to another person, it might not be the case. To have a more objective description
of the computed value of r in terms of strength, we use a scale that gives
the qualitative description of the strength for every computed value of r.

A boundary value could belong to either of the two segments it separates. For example, r = 0.5 may
be described as “moderately low” or “moderately high”. Let us agree that we will put the boundaries in the higher
segment. Thus, if r = -0.75, the strength of correlation would be “very high.”

We will use the scale below to determine the strength of the computed r.

Pearson r               Qualitative Description
±1                      Perfect
±0.75 to < ±1           Very High
±0.50 to < ±0.75        Moderately High
±0.25 to < ±0.50        Moderately Low
> 0 to < ±0.25          Very Low
0                       No Correlation
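
The scale can also be written as a small lookup. This is a minimal sketch, assuming the agreement stated in the lesson that a boundary value goes to the higher segment; the function name `describe_strength` is hypothetical.

```python
def describe_strength(r):
    """Return the qualitative description of a computed Pearson r."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)  # strength depends only on the absolute value of r
    if size == 1:
        return "Perfect"
    if size >= 0.75:   # boundaries belong to the higher segment
        return "Very High"
    if size >= 0.50:
        return "Moderately High"
    if size >= 0.25:
        return "Moderately Low"
    if size > 0:
        return "Very Low"
    return "No Correlation"

print(describe_strength(-0.75))  # prints Very High
print(describe_strength(0.5))    # prints Moderately High
```

The direction is read separately from the sign of r, so r = -0.75 is described in full as a "very high negative correlation."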

Illustrative Example

A group of research students wants to determine whether there is a correlation between
the time in hours that six students spent studying and their scores in a test. The table below
shows the time in hours spent studying (X) by six Grade 11 students and their scores in a
test (Y).

Hours spent studying (X)    1    2    3    4    5    6
Scores in a test (Y)        5   10   10   15   25   30

a. Compute Pearson’s sample correlation coefficient r.
b. What is the strength of the correlation?
c. What conclusion can be derived from the study?

Solution:
a. The steps below show how to compute the Pearson product-moment correlation
coefficient r.

Step 1. Construct a table with the columns X, Y, XY, X², and Y². Enter the given
data in the X and Y columns.

Step 2. Complete the table.
a. Multiply the entries in the X and Y columns. Put the products under the XY column.
b. Square each entry in the X column. Put the results under the X² column.
c. Square each entry in the Y column. Put the results under the Y² column.

Step 3. Get the sum of all entries in each column.
a. The sum of the X column is ∑X.
b. The sum of the Y column is ∑Y.
c. The sum of the XY column is ∑XY.
d. The sum of the X² column is ∑X².
e. The sum of the Y² column is ∑Y².

X         Y         XY          X²         Y²
1         5         5           1          25
2         10        20          4          100
3         10        30          9          100
4         15        60          16         225
5         25        125         25         625
6         30        180         36         900
∑X = 21   ∑Y = 95   ∑XY = 420   ∑X² = 91   ∑Y² = 1,975

Step 4. Substitute the values obtained from Step 3 in the formula. You may use your
calculator here!

$$
\begin{aligned}
r &= \frac{n\sum XY - \sum X \cdot \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} \\
  &= \frac{6(420) - (21)(95)}{\sqrt{\left[6(91) - (21)^2\right]\left[6(1{,}975) - (95)^2\right]}} \\
  &= \frac{2{,}520 - 1{,}995}{\sqrt{\left[546 - 441\right]\left[11{,}850 - 9{,}025\right]}} \\
  &= \frac{525}{\sqrt{[105][2{,}825]}} \\
  &= \frac{525}{\sqrt{296{,}625}} \\
r &= 0.96395 \text{ or } 0.96
\end{aligned}
$$

Note: For consistency, we round the final answer to two decimal places.

The value of r is a positive number. Therefore, we can say that there is a positive
correlation between the hours spent studying and the scores in the test.
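
The same steps can be followed in code as a way to check the hand computation. The sketch below uses the data from the illustrative example; the variable names (`hours`, `scores`, `rows`) are assumptions chosen for readability.

```python
from math import sqrt

hours = [1, 2, 3, 4, 5, 6]          # X: hours spent studying
scores = [5, 10, 10, 15, 25, 30]    # Y: scores in a test

# Steps 1 and 2: build the table rows (X, Y, XY, X^2, Y^2).
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(hours, scores)]

# Step 3: add up each column of the table.
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = (sum(column) for column in zip(*rows))
n = len(rows)

# Step 4: substitute the sums in the formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(f"{'X':>4}{'Y':>5}{'XY':>6}{'X^2':>6}{'Y^2':>7}")
for x, y, xy, x2, y2 in rows:
    print(f"{x:>4}{y:>5}{xy:>6}{x2:>6}{y2:>7}")
print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # prints 21 95 420 91 1975
print(round(r, 2))                           # prints 0.96
```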

b. What is the strength of the correlation?


Answer: Since the computed Pearson r is equal to 0.96, one can say that there is a “very
high positive correlation” between the time in hours that the six Grade 11
students spent studying and their scores in the test.

c. What conclusion can be derived from the study?


Answer: Based on the computed value of r and its strength, we can say that the more
time a student spends studying, the better his or her score in a test tends to be.

Activity 1 – Can you fill me?


Direction: Fill in the blanks to complete the statements below. Write your answers on a
separate sheet of paper.

1. ________________ is used to determine the existence, strength, and direction of
the relationship between two variables.
2. A statistical method to know the correlation between bivariate data is called
_____________________________.
3. The correlation coefficient r is a number between ___ and ___ that describes both the
strength and the direction of correlation. In symbols, we write _____________.

Activity 2 – Father and son alike.


Direction: Consider the given situation below and answer the following questions. Write your
answers on a separate sheet of paper.

The following are the heights of ten fathers and their eldest sons, in inches:

Height of the father   74   66   67   65   68   66   69   71   63   60
Height of the son      74   69   69   65   66   63   68   70   60   58

a. Compute Pearson’s correlation coefficient r of the given data above.
b. What is the strength of the correlation?
c. Based on the computed r, can you conclude that height is hereditary? Explain.

Activity 3 – Look Back and Reflect


Direction: Consider the situations below. Reflect on the usefulness of correlation analysis
about two variables in a real-life context. Write your answer on a separate sheet of paper.

A group of research students in your school wants to determine whether there is a
correlation between the number of theft cases X and the number of vandalism cases Y
recorded in your school. After collecting the data for 10 months, they obtain a Pearson’s
correlation coefficient r of 0.57.

1. What conclusion can the researchers derive from the study?


2. What is the importance of the drawn conclusion for your school?
3. Cite some real-life situations where correlation analysis is important in decision making.

Formative Test

Let us see how far you have learned from our lesson.

Direction: Choose the letter of the best answer. Write the chosen letter on a separate sheet
of paper.

1. In computing Pearson 𝒓, which of the following is the next step after obtaining the sum
of all entries in all columns in the table?
A. Construct a table. C. Simplify and compute for the value of 𝒓.
B. Complete the table. D. Substitute all the sum and 𝒏 in the formula.

2. Which of the following is the range of the correlation coefficient (r)?


A. 0 ≤ 𝒓 ≤ 1 C. -1 < 𝒓 < 1
B. 1 ≤ 𝒓 ≤ -1 D. -1 ≤ 𝒓 ≤ 1

For numbers 3 - 5, refer to the situation below:


A Mathematics teacher is interested in finding out if critical and scientific thinking exists
among students who are good in Mathematics and Science. Thus, he conducted a
study and gathered the scores of his respondents in Mathematics and Science. The following data
were obtained:

Score in Mathematics (X)    Score in Science (Y)
12                          13
10                          9
5                           8
7                           8
11                          14
6                           7

3. What is the computed Pearson’s sample correlation coefficient?


A. 0.87 C. 0.52
B. 0.80 D. 0.23

4. What is the strength of correlation?


A. moderately low positive C. very high positive
B. very high negative D. perfect positive

5. Based on the findings, which of the following best describes the result?
A. A student who is good in Mathematics is also good in Science.
B. A student who is good in Mathematics is not good in Science.
C. A student who is not good in Mathematics is good in Science.
D. There is no correlation between the performances of the students in
Mathematics and Science.

Answer Key

Activity 1
1. Correlation
2. Correlation Analysis
3. -1, 1, -1 ≤ r ≤ +1

Activity 3
1. There is a moderately high positive correlation between the number of theft cases
and the number of vandalism cases in the school.
2. Answers may vary.
3. Answers may vary. Example: fuel consumption and mileage travelled by a car.

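Activity 2
a.
X          Y          XY            X²            Y²
74         74         5476          5476          5476
66         69         4554          4356          4761
67         69         4623          4489          4761
65         65         4225          4225          4225
68         66         4488          4624          4356
66         63         4158          4356          3969
69         68         4692          4761          4624
71         70         4970          5041          4900
63         60         3780          3969          3600
60         58         3480          3600          3364
∑X = 669   ∑Y = 662   ∑XY = 44,446  ∑X² = 44,897  ∑Y² = 44,036

$$
\begin{aligned}
r &= \frac{n\sum XY - \sum X \cdot \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} \\
  &= \frac{10(44{,}446) - (669)(662)}{\sqrt{\left[10(44{,}897) - (669)^2\right]\left[10(44{,}036) - (662)^2\right]}} \\
  &= \frac{444{,}460 - 442{,}878}{\sqrt{\left[448{,}970 - 447{,}561\right]\left[440{,}360 - 438{,}244\right]}} \\
  &= \frac{1{,}582}{\sqrt{[1{,}409][2{,}116]}} \\
  &= \frac{1{,}582}{\sqrt{2{,}981{,}444}} \\
r &= 0.91621 \text{ or } 0.92
\end{aligned}
$$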

b. What is the strength of the correlation?


Answer: Since the computed Pearson r is equal to 0.92, one can say that there is a “very high
positive correlation” between the heights of the fathers and the heights of their eldest sons.

c. Since there is a “very high positive correlation” between the heights of the fathers and the
heights of their eldest sons, the data support the conclusion that height is hereditary.

References

Belecina, Rene R., Baccay, Elisa S., and Mateo, Efren B. “Statistics and Probability”. Manila:
Rex Book Store, Inc. (RBSI), 2016.

Statistics & Probability – Grade 11 Alternative Delivery Mode Quarter 4 – Module 18:
Calculating the Pearson’s Sample Correlation Coefficient, First Edition, 2020. Department of
Education – Region IV-A CALABARZON.

Statistics & Probability – Grade 11 Alternative Delivery Mode Quarter 4 – Module 19: Solving
Problems involving Correlation Analysis, First Edition, 2020. Department of Education – Region
IV-A CALABARZON.
