
STATISTICS & PROBABILITY
Quarter 4 - Module 2:
Correlation Analysis

Department of Education ● Republic of the Philippines


Statistics & Probability – Grade 11
Alternative Delivery Mode
Quarter 4 – Module 2: Correlation Analysis
First Edition, 2020

Republic Act 8293, section 176 states that: “No copyright shall subsist in any work of
the Government of the Philippines. However, prior approval of the government agency or
office wherein the work is created shall be necessary for exploitation of such work for profit.
Such agency or office may, among other things, impose as a condition the payment of
royalties.”

Borrowed materials included in this module are owned by their respective copyright
holders. Effort has been exerted to locate and seek permission to use these materials from
the respective copyright owners. The publisher and author do not represent nor claim
ownership over them.
Published by the Department of Education – Division of Misamis Oriental
Division Superintendent: Dr. Jonathan S. Dela Peña, CESO V
Development Team of the Module
Authors: Monina C. Raagas
Editors: Glenn C. Aradilla; Milger A. Baang, PhD
Reviewer/s: Flordeliz D. Laput

Illustrator:
Layout Artist:
Management Team:
Chairperson: Jonathan S. Dela Peña, PhD, CESO V
Schools Division Superintendent

Co-Chairpersons: Nimfa R. Lago, PhD, CESO VI
Assistant Schools Division Superintendent
Members:
Erlinda G. Dael, PhD, CES - CID
Lindo M. Cayadong, PhD, EPS-Science & Mathematics
Celieto B. Magsayo, EPS- LRMS Manager
Loucille M. Paclar, Librarian II
Kim Eric G. Lubguban, PDO II

Printed in the Philippines by
Department of Education – Division of Misamis Oriental
Office Address: Del Pilar corner Velez Street, Brgy. 29, Cagayan de Oro City, 9000
Telephone Nos.: (088) 881-3094: Text: 0917-8992245 (Globe)
Email: misamis.oriental@deped.gov.ph


This instructional material was collaboratively developed and reviewed by educators from public and private schools, colleges, and/or universities. We encourage teachers and other education stakeholders to email their feedback, comments, and recommendations to the Department of Education at action@deped.gov.ph.
We value your feedback and recommendations.

TABLE OF CONTENTS
Cover Page
Copyright Page
Title Page
Table of Contents
Introduction
Lesson 1: Correlation Analysis
    What I Need to Know
    What I Know
    What’s In
    What’s New: Activity 1
    What Is It
    What’s More: Creating Scatterplot in Spreadsheet or Excel
    What I Have Learned
    What I Can Do
    Assessment
Lesson 2: Pearson Product-Moment Correlation
    What I Need to Know
    What I Know
    What’s In
    What’s New: Activity 1
    What Is It
    What’s More: Correlation Coefficient Software
    What I Have Learned
    What I Can Do
    Assessment
Answer Key
References

INTRODUCTION

This module, developed as part of the Alternative Delivery Mode learning resources, is written for you, a student taking Statistics and Probability. It focuses on topics under Correlation Analysis, which include constructing a scatterplot, computing the Pearson product-moment correlation coefficient, and solving problems involving correlation analysis. The activities are suited to your own pace and capacity. You are also advised to use applications such as Excel on your computer to accomplish some of the objectives, so that you can compare manual computation with the use of formulas in a computer application. The module starts with a pre-test to assess how much you already know about the lessons. At the end, an Assessment checks that you have gained the understanding and skills set out in the objectives.

For the facilitator, teacher, or parent, this module serves as a guide in achieving the most essential learning competencies set by the Department of Education’s curriculum guide. You are not limited to the resources in this module; it is hoped that you will supplement it with materials and strategies that can help the student learn better.

The Author

Lesson 1: Correlation Analysis
Quarter: Fourth    Week: 7th
No. of Days: 4    No. of Hours: 4

What I Need to Know

At the end of this lesson, you are expected to:


⚫ illustrate the nature of bivariate data (M11/12SP-IVg-2);
⚫ construct a scatter plot (M11/12SP-IVg-3); and
⚫ describe shape (form), trend (direction), and variation (strength) based
on a scatter plot (M11/12SP-IVg-4).

To achieve the objectives of this module, follow the instructions below:


✓ Take time to read the lessons and study.
✓ Follow the directions and perform the activities required in the
lessons.
✓ Answer the questions in the pre-test and assessment.
✓ Internalize what you have learned and practice applying it to the real-life situations provided in the module.

REMINDER: DO NOT WRITE ANYTHING IN THE MODULE. ANSWER IN A SEPARATE NOTEBOOK OR PAPER.

What I Know

Directions: Write the letter that corresponds to the best answer in your
answer sheet.
1. Which scatterplot shows most likely a positive correlation?

a. A only        c. Both A and C
b. B only        d. Both B and D
2. In terms of strength of association, how do you compare scatterplot I
with II?

Scatterplot I Scatterplot II

a. The strength of association in Scatterplot I is greater.
b. The strength of association in Scatterplot II is greater.
c. The strength of association in both scatterplots is the same.
d. The strength of association in the scatterplots cannot be compared.

3. Which of these most likely describes the correlation between grades in
Math and Physics?
a. Strong, positive c. Weak, positive
b. Strong, negative d. Weak, negative

4. This scatterplot shows the relationship between which two variables?

a. Speed of an airplane (x) vs. distance traveled in one hour (y)
b. Outside air temperature (x) vs. air conditioning costs (y)
c. Age of an adult (x) vs. height of an adult (y)
d. Distance traveled (x) vs. gas remaining in the tank (y)

5. Which scatterplot below best describes the table of values for the
number of hours studied and the test scores?

a. c.

b. d.

What’s In

In your previous lessons, you were asked to plot ordered pairs in the rectangular coordinate system. Let us see if you can still do it. Plot the following points in the rectangular coordinate system.
1. (-3, 2)
2. (3, 3)
3. (1, -5)
4. (4, -4)
5. (-3, -5)
6. (3, 5)
7. (-2, 4)
8. (1, -3)
9. (-5, 0)
10. (0, 5)

What’s New

Bivariate Data
Data in statistics are sometimes classified according to how many variables are in a particular study. A study that looks at a single variable involves univariate data. For example, you might study a group of students to find out their average grade.
Bivariate data arise when you study two variables. These variables are compared to find the relationship between them. For example, age might be one variable and weight might be the other. Another example is the relationship between temperature and ice cream sales.
Using correlation analysis, we can find out the relationship between the variables in bivariate data. Many business, marketing, and social science questions and problems can be studied using bivariate data sets. For instance, is there a link between child obesity and family income? This is where correlation analysis is helpful.

Correlation analysis is a method of statistical evaluation used to study
the strength of a relationship between two numerically measured, continuous
variables (e.g., height and weight). This particular type of analysis is useful
when a researcher wants to establish if there are possible connections
between variables.

Activity 1
Arm Span and Height of a Person

Steps:
1. Using a meterstick or ruler, measure the arm span and the height of 10 household members or neighbors, in centimeters. Tabulate the results.
2. Graph the points corresponding to the bivariate data. Label the x-axis (Length of the arm span) and the y-axis (Height).
3. Present your data. As you present them, identify the variables and describe how the points are scattered.

Solution (table for Step 1):

Household Member/Neighbor    Length of the Arm Span (cm)    Height (cm)
1
2
3
4
5
6
7
8
9
10

The graph you have constructed is called a scatterplot. By examining the
points, can you say that there is a relationship between the length of the arm
span and the height of a person?

Activity 2
Number of Times Late and Grade of a Student

Steps:
1. Ask 10 of your classmates for their average grade in the first semester and the number of times they submitted late outputs. Tabulate the results.
2. Graph the points corresponding to the bivariate data. Label the x-axis (Number of times submitted late outputs) and the y-axis (Average grade in the first semester).
3. Present your data. As you present them, identify the variables and describe how the points are scattered.

Solution (table for Step 1):

Student    Number of Times Submitted Late Outputs    Average Grade in First Semester (%)
1
2
3
4
5
6
7
8
9
10

Is there a relationship between the number of times a student submitted late outputs and the student's average grade in the first semester?

Activity 3
Weight of a Person and Number of Facebook Friends

Steps:
1. Ask 10 of your classmates or friends for their weights and the number of friends in their Facebook account. Tabulate the results.
2. Graph the points corresponding to the bivariate data. Label the x-axis (Number of Facebook friends) and the y-axis (Weight of classmate or friend).
3. Present your data. As you present them, identify the variables and describe how the points are scattered.

Solution (table for Step 1):

Student    Weight (kg)    Number of Facebook Friends
1
2
3
4
5
6
7
8
9
10

Is there a relationship between the weight of a person and the number of Facebook friends?

What Is It

A scatterplot, or scatter diagram, is a type of mathematical diagram that uses Cartesian coordinates to display values for two variables in a set of data. The independent variable is plotted along the horizontal axis (x) and the dependent variable is plotted along the vertical axis (y). A scatterplot provides a visual representation of the correlation, or relationship, between the two variables. It shows the direction and strength of that relationship.
All correlations have two properties: direction and strength.
⚫ Positive correlation: Both variables move in the same direction. In
other words, as one variable increases, the other variable also
increases. As one variable decreases, the other variable also
decreases. An upward trend in points indicates a positive correlation.
Examples: IQ vs. academic performance;
salary vs. job satisfaction
⚫ Negative correlation: The variables move in opposite directions. As
one variable increases, the other variable decreases. As one variable
decreases, the other variable increases. A downward trend in points
indicates a negative correlation.
Examples: academic performance vs. no. of hours watching tv;
stress vs. job performance
⚫ Zero or no correlation: It means that there is no apparent
relationship between the two variables.
Example: shoe size vs. salary;
socio-economic status vs. grades
The strength of a correlation is determined by its numerical value. It may be perfect, very high, moderately high, moderately low, very low, or zero.

The diagram above shows some examples of scatter plots and correlations.
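
The direction of a trend can also be checked numerically. The sketch below is only an illustration, not part of the module's required procedure: it assumes that direction can be read from the sign of the sum of the products of each point's deviations from the mean of x and the mean of y. The function name `trend_direction` and all data values are hypothetical, made up for this example.

```python
def trend_direction(xs, ys, tol=1e-9):
    """Return 'positive', 'negative', or 'none' from the sign of the
    sum of products of deviations from the two means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if total > tol:
        return "positive"   # points tend to rise from left to right
    if total < -tol:
        return "negative"   # points tend to fall from left to right
    return "none"           # no linear trend detected


# Hypothetical data, made up for illustration only.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [60, 65, 72, 80, 88]
absences = [0, 2, 4, 6, 8]
grades = [92, 88, 85, 80, 74]
arch_x = [1, 2, 3, 4, 5]
arch_y = [1, 4, 5, 4, 1]

print(trend_direction(hours_studied, test_scores))  # positive
print(trend_direction(absences, grades))            # negative
print(trend_direction(arch_x, arch_y))              # none
```

Note that the last data set rises and then falls, so "none" here means only that there is no straight-line trend. A scatterplot would still show a curved pattern, which is one reason to look at the plot as well as the numbers.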
What’s More
Creating Scatterplot in Spreadsheet or Excel
You can also create a scatterplot from your data using Excel. Here are the steps:

• Select the worksheet range that contains the data.
• On the Insert tab, click the XY (Scatter) chart command button.
• Select the chart subtype that does not include any lines.
• Confirm the chart data organization.
• Annotate the chart, if appropriate. For example, you can use the Chart Title and Axis Titles buttons to add a title and descriptions of the axes used in the chart.
• If you want to add a trendline, click the Trendline command in the Add Chart Element menu.
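
If a spreadsheet is not available, the idea behind the chart (one mark per (x, y) pair) can be sketched in plain Python. This is a rough text-only illustration rather than a replacement for Excel. The function name `ascii_scatter`, the grid size, and the arm span and height values are all hypothetical.

```python
def ascii_scatter(points, width=40, height=12):
    """Return a rough text scatterplot of (x, y) pairs as a list of lines."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    # Avoid dividing by zero when all x (or all y) values are equal.
    x_span = (max(xs) - x_min) or 1
    y_span = (max(ys) - y_min) or 1

    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 of the grid is the top line

    lines = ["|" + "".join(cells) for cells in grid]
    lines.append("+" + "-" * width)
    return lines


# Hypothetical measurements in centimeters, made up for illustration only.
arm_span = [150, 155, 160, 162, 168, 170, 175, 178, 182, 185]
height_cm = [152, 154, 161, 160, 169, 171, 174, 180, 181, 186]

plot = ascii_scatter(list(zip(arm_span, height_cm)))
print("\n".join(plot))
```

With these made-up values the marks run from the lower left to the upper right, which is the upward trend described for a positive correlation.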

What I Have Learned

Based on this lesson, answer the following questions:

1. What are bivariate data? Give an example.
2. What is a scatterplot? Why is a scatterplot important?
3. What is a positive correlation? What is a negative correlation?
4. In the analysis of a scatterplot, what two elements should be considered?
5. How is the strength of a correlation determined?

⚫ Bivariate data involves the study of two variables. An example is the
IQ and age of students in a population.
⚫ A scatterplot is a mathematical diagram using Cartesian coordinates
to display values for two variables in a set of data. It provides a visual
representation of the correlation, or relationship between the two
variables.
⚫ In a positive correlation, both variables move in the same direction.
In other words, as one variable increases, the other variable also
increases. In a negative correlation, the variables move in opposite
directions. As one variable increases, the other variable decreases.
⚫ The two elements that should be considered in the analysis of a
scatterplot are: direction and strength of the correlation.
⚫ The strength of a correlation is determined by its numerical value. It may be perfect, very high, moderately high, moderately low, very low, or zero.

What I Can Do

Let us apply the scatterplot to a real-life situation. Suppose data are gathered on the number of people of different ages in our region who died of COVID-19 in the month of April. Construct a scatterplot of the number of deaths against age. Show your output using Excel.

Assessment

A. For each of the following cases, tell whether the relationship shows a positive correlation, a negative correlation, or no correlation.
1. The more students enroll in a school, the more teachers are needed.
2. The wealthier a person is, the more friends he has.
3. A student who has many absences has a decrease in grades.
4. As one increases in age, often one's agility decreases.
5. The longer your hair grows, the more shampoo you will need.

B. Determine whether the following bivariate data are correlated or not. If they are correlated, tell the direction of the association. Evaluate whether the correlation is most likely strong or weak.
1. time spent in a supermarket and money spent
2. income and value of car driven
3. number of children and time spent cleaning the house by the mother
4. amount spent on gas and distance traveled by car each week
5. age and reaction time of persons over 18 years of age
C. For each scatterplot below, write the letter of the description that best fits it.

1.    2.

3.    4.
A. Strong negative correlation
B. Strong positive correlation
C. Moderate positive correlation
D. Low negative correlation
E. Zero correlation
D. Construct a scatterplot for each of the following data sets and use it to comment on the form, direction, and strength of the relationship between the variables.

1.
Age of a person (years)    11    12    13    14    15    16    17    18    19    20
Weight (kg)                40    42    38    35    45    51    48    48    50    47

2.
Age of a car (years)       0.5   1     1.5   2     3     4     4.5   5     6     7
Mileage (km/L)             16    15    10    12    10    12    11    10    11    8

Lesson 2: Pearson Product-Moment Correlation
Quarter: Fourth    Week: 8th
No. of Days: 4    No. of Hours: 4

What I Need to Know

At the end of this lesson, you are expected to:


⚫ calculate Pearson’s sample correlation coefficient (M11/12SP-IVh-2); and
⚫ solve problems involving correlation analysis (M11/12SP-IVh-3).

What I Know

Directions: The table shows the correlations for the four graphs below. Match
each graph to the correlation coefficient.

A.

B.

C. Compute and interpret r for the following data.

1.
x    20     30    40    50    60
y    100    90    85    60    50

2.
x    6    15    30    12    20
y    3    6     15    5     15

What’s In

Check your readiness for this lesson by answering the following exercises.
A. Sketch a scatterplot that shows each of the following:
1. Strong positive correlation
2. Weak positive correlation
3. Perfect negative correlation
4. No correlation
B. Determine whether the correlation between the given bivariate data is most likely positive, negative, or zero.
1. hours spent sleeping and hours spent awake
2. years of education and yearly salary
3. shoe size and salary
4. temperature and ice cream sales
5. car speed and travel time

What’s New

Activity 1
Age and Weight of Children

A sample of 6 children was selected, and their ages in years and weights in kilograms were recorded as shown in the following table. Find out whether there is a relationship between age and weight, then interpret the result.

Child    Age, X    Weight, Y
1        7         12
2        6         8
3        8         12
4        5         10
5        6         11
6        9         13

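Steps and Solution

1. Construct a table like the one below. Complete the entries in each column, then get the sum of the entries in each column.

Child    X        Y        X²        Y²        XY
1        7        12
2        6        8
3        8        12
4        5        10
5        6        11
6        9        13
Sum      ΣX =     ΣY =     ΣX² =     ΣY² =     ΣXY =

2. Substitute the values obtained into the formula

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

where n is the number of paired observations.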

The value r is called the Pearson correlation coefficient. It indicates the degree of relationship between two variables. What do you think is the degree of relationship between age and weight?
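
The table-and-substitute procedure can be checked with a short Python sketch. It assumes that the formula intended in Step 2 is the usual computational form, r = (nΣXY − ΣXΣY) / √([nΣX² − (ΣX)²][nΣY² − (ΣY)²]). The variable names are chosen here for illustration, and the data are the six age and weight pairs from the table in this activity.

```python
from math import sqrt

# Age (X) and weight (Y) of the six children in the activity.
ages = [7, 6, 8, 5, 6, 9]
weights = [12, 8, 12, 10, 11, 13]

n = len(ages)

# Column sums of the working table: X, Y, X^2, Y^2, XY.
sum_x = sum(ages)
sum_y = sum(weights)
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in weights)
sum_xy = sum(x * y for x, y in zip(ages, weights))

# Substitute the sums into the computational formula for r.
numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 41 66 291 742 461
print(round(r, 2))                           # 0.76
```

Working by hand gives the same steps: the numerator is 6(461) − (41)(66) = 60 and the denominator is √(65 × 96) = √6240 ≈ 78.99, so r ≈ 0.76.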

Activity 2
Mathematics and Physics Scores

Steps and Solution

1. Below are the Mathematics and Physics scores of 5 students at Mabuhay High School. Compute the value of r by completing the working table.

Student    Math    Physics
1          55      66
2          93      89
3          89      94
4          60      52
5          90      84

Working table:

Student    X        Y        X²        Y²        XY
1
2
3
4
5
Sum        ΣX =     ΣY =     ΣX² =     ΣY² =     ΣXY =

2. Substitute the values obtained into the formula

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

where n is the number of paired observations.

Can you state the correlation coefficient for the relationship between Math and Physics scores?

What Is It

Pearson Correlation Coefficient

The most common coefficient of correlation is the Pearson product-moment correlation coefficient, or Pearson’s r. It is a measure of the linear correlation (dependence) between two variables X and Y, giving a value between −1 and +1. It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.

When studying the relationship between two variables, it is a good idea to compute the Pearson correlation coefficient to determine how strong that relationship is. If the coefficient is negative, the variables are negatively correlated: as one value increases, the other decreases. If the coefficient is positive, the variables are positively correlated: both values increase or decrease together.

To determine the strength of the computed r, use its absolute value |r|:

|r| = 0                  no association or correlation
0 < |r| < 0.25           very low correlation
0.25 < |r| < 0.50        moderately low correlation
0.50 < |r| < 0.75        moderately high correlation
0.75 < |r| < 1           very high or strong correlation
|r| = 1                  perfect correlation
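
The interpretation guide can be written as a small Python function. This is a sketch only: the function name `describe_r` is hypothetical. Placing a value that falls exactly on a cut-off (0.25, 0.50, or 0.75) in the higher category is an assumption of this sketch, since the guide does not say where such values belong.

```python
def describe_r(r):
    """Describe the strength and direction of a correlation coefficient r."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")

    size = abs(r)  # strength depends only on the absolute value
    if size == 0:
        return "no correlation"
    if size < 0.25:
        strength = "very low"
    elif size < 0.50:   # assumption: exactly 0.25 counts as moderately low
        strength = "moderately low"
    elif size < 0.75:   # assumption: exactly 0.50 counts as moderately high
        strength = "moderately high"
    elif size < 1:      # assumption: exactly 0.75 counts as very high
        strength = "very high"
    else:
        strength = "perfect"

    direction = "positive" if r > 0 else "negative"  # sign gives direction
    return f"{strength} {direction} correlation"


for value in (0.76, -0.97, 0.10, 0, 1):
    print(value, "->", describe_r(value))
```

A value such as 1.08 raises an error because a correlation coefficient cannot lie outside the range from −1 to +1.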

What’s More

Correlation Coefficient Software

Most spreadsheet applications, such as Excel, Google Sheets, and OpenOffice, can compute correlations for you. The illustration below shows an example. In Excel, click an empty cell where you want the correlation coefficient to appear, then enter the following formula:

=PEARSON(array1, array2)

Replace ‘array1’ with the range of cells containing the first variable and ‘array2’ with the range of cells containing the second variable.

For the example above, the Pearson correlation coefficient (r) is 0.76.
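
A hand computation can also be cross-checked in Python. The sketch below uses the deviation form of the formula, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² Σ(y − ȳ)²), which is algebraically equivalent to the computational form built from ΣX, ΣY, ΣX², ΣY², and ΣXY. The function name `pearson_r` is hypothetical. The data are item C.1 of the Lesson 2 "What I Know" exercise, for which the answer key lists r = −0.97.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the two means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Data from Lesson 2, What I Know, item C.1.
x = [20, 30, 40, 50, 60]
y = [100, 90, 85, 60, 50]

print(round(pearson_r(x, y), 2))  # -0.97
```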

What I Have Learned

Based on this lesson, answer the following questions:

1. What is the Pearson correlation coefficient?
2. What is the formula for computing r?
3. What are the indicators for determining the strength and direction of correlation?

⚫ The Pearson product-moment correlation coefficient, or Pearson’s r, is a measure of the linear correlation (dependence) between two variables X and Y, giving a value between −1 and +1.
⚫ The formula for computing r is

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

where n is the number of paired observations.
⚫ The direction of correlation is indicated by the sign of r, while its strength is indicated by the absolute value of r.

What I Can Do

Suppose we want to determine the strength of the relationship between the number of years of schooling and the salary received by 10 persons in your community. Compute the Pearson coefficient r using Excel. What conclusion can you derive from the computation?

Assessment

A. Encircle the letter of the correct answer.

1. Which of the following values cannot represent a correlation coefficient?
a. r = 1.08    b. r = 0.95    c. r = 0    d. r = −1.0
2. What could be the approximate value of the correlation coefficient for a weak negative correlation?
a. −0.85    b. −0.16    c. 0.21    d. 0.90
3. Which value of a correlation coefficient represents the strongest relationship between the two variables?
a. −0.94    b. 0    c. 0.5    d. 0.91
4. Which value of r represents data with a strong negative linear correlation between two variables?
a. −1.07    b. −0.89    c. −0.14    d. 0.92
5. A study compared the number of years of education a person received and that person's average yearly salary. It was determined that the relationship between these two quantities was linear and the correlation coefficient was 0.91. Which conclusion can be made based on the findings of this study?
a. There was a weak relationship.
b. There was a strong relationship.
c. There was no relationship.
d. There was an unpredictable relationship.

B. For each scatterplot below, write the letter of the interpretation that matches it.
A. strong negative correlation
B. moderate negative correlation
C. strong positive correlation
D. zero correlation
E. moderate positive correlation

1. 2.

3. 4.

C. Compute and interpret r for the following data.

1.
x    1    3     6     10    12
y    5    13    25    41    49

2.
x    1     3     5     7     9
y    44    34    24    14    4

3.
x    1     3     6     9     11
y    12    28    37    28    12

D. Find the value of the Pearson coefficient r. Give your conclusion about the variables in each study.

1. The diameters of the longest lichens growing on gravestones were measured. The data gathered are as follows:

Age of gravestone, X (years)    9    18    20    31    44    52    53    61    63    63
Diameter of lichen              2    3     4     20    22    41    35    22    28    32

2. In a biology experiment, a number of cultures were grown in the laboratory. The numbers of bacteria, in millions, and their ages, in days, are given below.

Age, X (days)                    1     2      3      4      5      6      7      8
No. of bacteria, Y (millions)    34    106    135    181    192    231    268    300

Answer Key

Lesson 1

What I Know
1. a
2. a
3. a
4. d
5. c

Assessment
A. 1. Positive
   2. No correlation
   3. Negative
   4. Negative
   5. Positive
B. 1. Strong positive correlation
   2. Strong positive correlation
   3. Weak negative correlation
   4. Strong positive correlation
   5. Strong negative correlation
C. 1. B
   2. C
   3. E
   4. A
D. 1.    2.

Lesson 2

What I Know
A. Graph A = 1
   Graph B = -1
   Graph C = 0
   Graph D = -0.72
B. Graph A = 0.96
   Graph B = -0.90
   Graph C = 0.72
   Graph D = -0.42
C. 1. r = -0.97; strong negative correlation
   2. r = 0.90; strong positive correlation

Assessment
A. 1. a
   2. b
   3. a
   4. b
   5. b
B. 1. D
   2. C
   3. B
   4. A
C. 1. r = 1; perfect positive correlation
   2. r = -1; perfect negative correlation
   3. r = 0; no correlation
D. 1. r = 0.86; There is a strong positive correlation between the age of the gravestone and the diameter of the lichen.
   2. r = 0.99; There is a strong positive correlation between the number of days and the number of bacteria.

References
Belecina, Rene R., et al. Statistics and Probability. P. Florentino St., Sta. Mesa Heights, Quezon City: Rex Printing Company, Inc., 2016.

Websites
https://www.onlinemathlearning.com/scatter-plots.html
https://courses.lumenlearning.com/boundless-statistics/chapter/correlation/
https://www.dummies.com/software/microsoft-office/excel/how-to-create-a-scatter-plot-in-excel/

MODULE WRITER’S PROFILE
Name: MONINA C. RAAGAS
Position: Teacher II
Educational Attainment:
MA units in Teaching Math at USTP; MA units in Educational
Supervision & Administration
BS in Elementary Education Major in Mathematics
Module Title: Module 2 – Correlation Analysis
Division: Misamis Oriental
School: Opol National Secondary Technical School
District: Opol

For inquiries or feedback, please write or call:

Department of Education – Division of Misamis Oriental
Office Address: Del Pilar corner Velez Street, Brgy. 29, Cagayan de Oro City, 9000
Telephone Nos.: (088) 881-3094; Text: 0917-8992245 (Globe)
Email: misamis.oriental@deped.gov.ph