
REPUBLIC OF THE PHILIPPINES

NORTHERN NEGROS STATE COLLEGE OF SCIENCE AND TECHNOLOGY
OLD SAGAY, SAGAY CITY, NEGROS OCCIDENTAL
(034)722-4169/www.nonescost.edu.com

COLLEGE OF NURSING and ALLIED HEALTH SCIENCES (CONAHS)

COURSE MODULE IN
MATHEMATICS IN THE MODERN WORLD
1st Semester; AY 2023-2024
COURSE FACILITATOR: JOEMAR A. SALIMBOT, REE, RME, LPT, MAT

2
FB/MESSENGER: Joemar Salimbot
Email Address: joem2968@gmail.com
Phone No.: 09652049674
This document is a property of NONESCOST Module 2 | Page 1
Unauthorized copying and / or editing is prohibited. (For Classroom Use Only) Prepared by: Cabahaga
/Toledo
LESSON

1 PROBLEM SOLVING & REASONING


6 HOURS

This lesson covers strategies for solving real-life mathematical problems, with emphasis on inductive
reasoning, deductive reasoning, and Polya’s four-step problem-solving strategy.

LEARNING OUTCOMES

At the end of this lesson, you are expected to:

1. Differentiate inductive reasoning from deductive reasoning;


2. Determine whether the argument is an example of inductive reasoning or deductive reasoning;
3. Use inductive reasoning, deductive reasoning, and Polya’s strategy to solve real-life math problems, and;
4. Make original problems that can be solved using deductive reasoning & Polya’s four-step problem-
solving strategy.

Lately, many images similar to the one below have been circulating in Facebook posts and news feeds,
challenging readers to find the answer. Some people even create their own images.
Examine this image. Give the value of each animal or insect, and provide the correct answer for the question
mark.

BrainFans.com

Try to experience & enjoy the Einstein’s Riddle below.

The image above is an example of an exciting problem that we might be challenged to solve.
In the process of solving each problem, our logical thinking and creativity skills are gradually
developed.

Most occupations require good problem-solving skills (Aufmann, 2018). For instance, the architects and
engineers must solve many complicated problems as they design and construct modern buildings that are
aesthetically pleasing, functional, and that meet stringent safety requirements.

Two goals of this lesson are to help you become a better problem solver and to demonstrate that problem
solving can be an enjoyable experience. After all, it is something we do every day.

Inductive and Deductive Reasoning

Inductive Reasoning

The type of reasoning that forms a conclusion based on the examination of specific examples is called
inductive reasoning. The conclusion formed by using inductive reasoning is often called a conjecture, since it may
or may not be correct (Aufmann, 2018). In other words, inductive reasoning uses specific examples to reach a
general conclusion (Baltazar, 2018).

When you examine a list of numbers and predict the next number in the list according to some pattern you
have observed, you are using inductive reasoning.

Example 1:

Use inductive reasoning to predict the next number in each of the following lists.

a. 3, 6, 9, 12, 15, ? b. 1, 3, 6, 10, 15, ?

Solution:
a. Each successive number is 3 larger than the preceding number. Thus, we predict that the next
number in the list is 3 larger than 15, which is 18.

b. The first two numbers differ by 2. The second and the third numbers differ by 3. It appears that
the difference between any two consecutive numbers is always 1 more than the preceding difference.
Since 10 and 15 differ by 5, we predict that the next number in the list will be 6 larger than 15,
which is 21.
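
The pattern-spotting used in Example 1 can be sketched in a few lines of Python. This is only an illustration, not part of the original module: the function name `predict_next` is hypothetical, and the method (take differences of consecutive terms until they become constant, then extend the pattern) is one possible way to mechanize the reasoning. Like any inductive conclusion, its output is a conjecture, not a guarantee.

```python
def predict_next(terms):
    """Conjecture the next term by extending the pattern of differences."""
    rows = [list(terms)]
    # Keep taking differences of consecutive terms until a row is constant.
    while len(set(rows[-1])) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    # The next term is the sum of the last entry of every row.
    return sum(row[-1] for row in rows)


print(predict_next([3, 6, 9, 12, 15]))   # list a: constant difference of 3
print(predict_next([1, 3, 6, 10, 15]))   # list b: differences grow by 1
```

For list a the differences are already constant, so the sketch adds 3 to 15. For list b the differences 2, 3, 4, 5 grow by 1, so the next difference is taken to be 6.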

Check Your Progress 1:

Use inductive reasoning to predict the next number in each of the following lists.

a. 5, 10, 15, 20, 25, ? b. 2, 5, 10, 17, 26, ?

Inductive reasoning is not used just to predict the next number in a list. In Example 2, we use inductive
reasoning to make a conjecture about an arithmetic procedure.

Example 2: Use Inductive Reasoning to Make a Conjecture

Consider the following procedure: Pick a number. Multiply the number by 8, add 6 to the product, divide
the sum by 2, and subtract 3.

Complete the above procedure for several different numbers. Use inductive reasoning to make a conjecture
about the relationship between the size of the resulting number and the size of the original number.

Solution:

Suppose we pick 5 as our original number. Then the procedure would produce the following results:

Original number: 5
Multiply by 8: 8 x 5 = 40
Add 6: 40 + 6 = 46
Divide by 2: 46 / 2 = 23
Subtract 3: 23 - 3 = 20

We started with 5 and followed the procedure to produce 20. Starting with 6 as our original number
produces a final result of 24. Starting with 10 produces a final result of 40. Starting with 100 produces a final result
of 400. In each of these cases the resulting number is four times the original number. We conjecture that following
the given procedure produces a number that is four times the original number.
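
Trying the procedure on several numbers is easy to automate. The short Python sketch below is an illustration only (the function name `procedure` is hypothetical); it repeats the four steps of Example 2 for the same starting numbers used above and prints each result next to four times the original number.

```python
def procedure(n):
    """Apply the steps of Example 2 to the original number n."""
    product = n * 8        # multiply the number by 8
    total = product + 6    # add 6 to the product
    half = total / 2       # divide the sum by 2
    return half - 3        # subtract 3


for original in (5, 6, 10, 100):
    print(original, procedure(original), 4 * original)
```

Agreement in these four cases supports the conjecture, but it does not prove it for every number.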

Check Your Progress 2:

Consider the following procedure: Pick a number. Multiply the number by 9, add 15 to the product, divide
the sum by 3, and subtract 5.

Complete the above procedure for several different numbers. Use inductive reasoning to make a
conjecture about the relationship between the size of the resulting number and the size of the original number.

Scientists often use inductive reasoning. For instance, Galileo Galilei (1564–1642) used inductive reasoning
to discover that the time required for a pendulum to complete one swing, called the period of the pendulum,
depends on the length of the pendulum. Galileo did not have a clock, so he measured the periods of
pendulums in “heartbeats.” The following table shows some results obtained for pendulums of various lengths. For the
sake of convenience, a length of 10 inches has been designated as 1 unit.

Example 3: Use Inductive Reasoning to Solve an Application

Use the data in the table and inductive reasoning to answer each of the following questions.

a. If a pendulum has a length of 49 units, what is its period?


b. If the length of a pendulum is quadrupled, what happens to its period?

Solution:
a. In the table, each pendulum has a period that is the square root of its length. Thus we
conjecture that a pendulum with a length of 49 units will have a period of 7 heartbeats.
b. In the table, a pendulum with a length of 4 units has a period that is twice that of a pendulum
with a length of 1 unit. A pendulum with a length of 16 units has a period that is twice that of a
pendulum with a length of 4 units. It appears that quadrupling the length of a pendulum
doubles its period.
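
Both conjectures in Example 3 can be explored with a small Python sketch. It assumes only the rule stated in the solution (the period in heartbeats is the square root of the length in units); the function name `period_in_heartbeats` is hypothetical and the lengths tried are the ones mentioned in the solution.

```python
import math


def period_in_heartbeats(length_units):
    """Conjectured rule from Example 3: period = square root of length."""
    return math.sqrt(length_units)


# Part a: a pendulum with a length of 49 units.
print(period_in_heartbeats(49))

# Part b: compare each pendulum with one that is four times as long.
for length in (1, 4, 16):
    ratio = period_in_heartbeats(4 * length) / period_in_heartbeats(length)
    print(length, ratio)
```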

Conclusions based on inductive reasoning may be incorrect. As an illustration, consider the circles shown
below. For each circle, all possible line segments have been drawn to connect each dot on the circle with all the
other dots on the circle.

The maximum numbers of regions formed by connecting dots on a circle

For each circle, count the number of regions formed by the line segments that connect the dots on the circle.
Your results should agree with the results in the following table.

There appears to be a pattern. Each additional dot seems to double the number of regions. Guess the
maximum number of regions you expect for a circle with six dots. Check your guess by counting the maximum
number of regions formed by the line segments that connect six dots on a large circle. Your drawing will show that
for six dots, the maximum number of regions is 31 (see the figure below), not 32 as you may have guessed. With
seven dots the maximum number of regions is 57. This is a good example to keep in mind. Just because a pattern
holds true for a few cases, it does not mean the pattern will continue. When you use inductive reasoning, you have
no guarantee that your conclusion is correct.
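
The counts quoted above can be reproduced without drawing. The sketch below uses a known closed form for this problem that is not given in the module: with n dots in general position, the maximum number of regions is 1 + C(n, 2) + C(n, 4), where C(n, k) counts the ways to choose k dots out of n. The function name `max_regions` is hypothetical.

```python
from math import comb


def max_regions(dots):
    """Maximum number of regions formed by joining all pairs of dots."""
    return 1 + comb(dots, 2) + comb(dots, 4)


for dots in range(1, 8):
    print(dots, max_regions(dots), 2 ** (dots - 1))
```

The second and third printed columns agree for one through five dots and then part ways at six dots, which is exactly where the doubling conjecture fails.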

Note:
• A good problem solver is one who can find a solution even when the path to the answer is not
immediately known.
• In the real world, decision-making and problem-solving are two key skills that one should be
good at in order to survive.

Example 4. Use Deductive Reasoning to Establish a Conjecture

Check Your Progress 3:

Use deductive reasoning to show that the following procedure produces a number that is three
times the original number.
Procedure: Pick a number. Multiply the number by 6, add 10 to the product, divide the sum by 2,
and subtract 5. Hint: Let n represent the original number.
Logic Puzzles

Logic puzzles, similar to the one in example 5, can be solved by using deductive reasoning and a chart that
enables us to display the given information in a visual manner.

Example # 5:

Solve a Logic Puzzle below.

Each of the four friends: person 1, person 2, person 3, and person 4, has a different pet (fish, cat,
dog, and snake). From the following clues, determine the pet of each individual:
1. Person 2 is older than one friend who owns the cat and younger than one friend who owns the
dog.
2. Person 3 and one friend who owns the snake are both of the same age and are the youngest
members of their group.
3. Person 1 is older than one friend who owns the fish.

Solution:

From Clue 1, Person 2 does not own a cat nor a dog. In the following chart, write X1 (which stands for
“ruled out by clue 1”) in the cat and dog column for Person 2.

            Fish   Cat   Dog   Snake
Person 1
Person 2           X1    X1
Person 3
Person 4

From Clue 2, Person 3 does not own the snake. Being one of the youngest, Person 3 also does not own the dog,
because Clue 1 says the dog’s owner is older than Person 2. Since Clue 1 also shows that Person 2 is not the
youngest, Person 2 does not own the snake either. Write X2 (ruled out by Clue 2) in the dog and snake columns
for Person 3 and X1 in the snake column for Person 2. There are now Xs for three of the pets in Person 2’s row;
therefore Person 2 owns the fish. Put a check ( / ) in that box. So Person 1, Person 3, and Person 4
do not own the fish; write X2 in the fish column for each of them.

            Fish   Cat   Dog   Snake
Person 1    X2
Person 2    /      X1    X1    X1
Person 3    X2           X2    X2
Person 4    X2

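From Clue 3, Person 1 is older than the friend who owns the fish, who is Person 2. Person 1 is therefore not one of
the youngest, so Person 1 does not own the snake (Clue 2). Because the cat’s owner is younger than Person 2
(Clue 1), Person 1 does not own the cat either. Hence Person 1 owns the dog. Write X3 (ruled out by Clue 3)
in the cat and snake columns for Person 1 and put a check in the dog column. There are now Xs in the snake column
for Person 1, Person 2, and Person 3; therefore Person 4 owns the snake. Put a check in that box. Write X3 in the
cat and dog columns for Person 4; hence Person 3 owns the cat. Put a check in that box.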

            Fish   Cat   Dog   Snake
Person 1    X2     X3    /     X3
Person 2    /      X1    X1    X1
Person 3    X2     /     X2    X2
Person 4    X2     X3    X3    /

Thus, person 2 owns the fish, person 1 owns the dog, person 4 owns the snake and person 3 owns the cat.
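
The chart method can be cross-checked by brute force. The Python sketch below is an illustration only: it tries every way of giving the four pets to the four friends, together with every assignment of age ranks from 1 (youngest) to 4, and keeps the pet assignments that satisfy all three clues. The names `satisfies_clues` and `solve`, and the use of age ranks, are assumptions made for this sketch.

```python
from itertools import permutations, product

PEOPLE = ["Person 1", "Person 2", "Person 3", "Person 4"]
PETS = ["fish", "cat", "dog", "snake"]


def satisfies_clues(pet_of, age_of):
    """Check the three clues for one pet assignment and one set of ages."""
    owner = {pet: person for person, pet in pet_of.items()}
    youngest = min(age_of.values())
    # Clue 1: Person 2 is older than the cat owner, younger than the dog owner.
    clue1 = age_of[owner["cat"]] < age_of["Person 2"] < age_of[owner["dog"]]
    # Clue 2: Person 3 and the snake owner (a friend) are both the youngest.
    clue2 = (owner["snake"] != "Person 3"
             and age_of["Person 3"] == age_of[owner["snake"]] == youngest)
    # Clue 3: Person 1 is older than the fish owner.
    clue3 = age_of["Person 1"] > age_of[owner["fish"]]
    return clue1 and clue2 and clue3


def solve():
    """Return every pet assignment that works for at least one set of ages."""
    solutions = set()
    for pets in permutations(PETS):
        pet_of = dict(zip(PEOPLE, pets))
        for ages in product(range(1, 5), repeat=len(PEOPLE)):
            if satisfies_clues(pet_of, dict(zip(PEOPLE, ages))):
                solutions.add(pets)
                break
    return solutions


print(solve())
```

Each tuple in the result lists the pets of Person 1 through Person 4 in order. A single tuple means the clues determine the answer uniquely.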

Example 6: Solve a Logic Puzzle

Inductive Reasoning vs. Deductive Reasoning

In example 7, we will analyze arguments to determine whether they use inductive or deductive reasoning.

Example # 7:

Polya’s Four-Step Problem-Solving Strategy

One of the foremost recent mathematicians to make a study of problem solving was George
Polya (1887 – 1985). George Polya was born in Hungary and moved to the United States in 1940. The
basic problem-solving strategy that Polya advocated consisted of the following four steps.

Polya’s Four-Step Problem-Solving Strategy:


1. Understand the problem.
2. Devise a plan.
3. Carry out the plan.
4. Review the solution.

Polya’s four steps are deceptively simple. To become a good problem solver, it helps to examine each of
these steps and determine what is involved.
I. Understand the Problem
This part of Polya’s four-step strategy is often overlooked. You must have a clear
understanding of the problem. To help you focus on understanding the problem, consider
the following questions:
1. Can you restate the problem in your own words?
2. Can you determine what is known about these types of
problems?
3. Is there missing information that, if known, would allow you
to solve the problem?
4. Is there extraneous information that is not needed to solve
the problem?
5. What is the goal?

II. Devise a Plan


Successful problem solvers use a variety of techniques when they attempt to solve
a problem. Here are some frequently used procedures.
1. Make a list of the known information.
2. Make a list of information that is needed.
3. Draw a diagram.
4. Make an organized list that shows all the possibilities.
5. Make a table or a chart.
6. Work backwards.
7. Try to solve a similar but simpler problem.
8. Look for a pattern.
9. Write an equation. If necessary, define what each variable represents.
10. Perform an experiment.
11. Guess at a solution and then check your result.

III. Carry out the Plan


Once you have devised a plan, you must carry it out.
1. Work carefully.
2. Keep an accurate and neat record of all your attempts.
3. Realize that some of your initial plans will not work and that you may have to
devise another plan or modify your existing plan.

IV. Review the Solution


Once you have found a solution, check the solution.
1. Ensure that the solution is consistent with the facts of the problem.
2. Interpret the solution in the context of the problem.
3. Ask yourself whether there are generalizations of the solution that could apply
to other problems.

In Example 1 we apply Polya’s four-step problem-solving strategy to solve a problem involving
the number of routes between two points.

Example # 1:
Consider the city map shown below. A person wishes to walk along the streets
from point A to point B. How many direct routes can the person take?

City Map

Solution:
Understand the Problem. We would not be able to answer the question if a person
retraced the path or traveled away from point B. Thus we assume that on a direct route,
the person always travels along a street in a direction that gets one closer to point B.
Devise a Plan. The city map above has many extraneous details. Thus we make a diagram
that allows us to concentrate on the essential information. See the figure below.

A simple diagram of the street map

Because there are many routes, we consider the similar but simpler diagrams shown
below. The number at each street intersection represents the number of routes
from point A to that particular intersection.

Simple street diagrams

Take note:
The strategy of working a similar but simpler problem is an important problem-
solving strategy that can be used to solve many problems.

Look for patterns. It appears that the number of routes to an intersection is the
sum of the number of routes to the adjacent intersection to its left and the number of
routes to the intersection directly above. For instance, the number of routes to the
intersection labeled 6 is the sum of the number of routes to the intersection to its left,
which is 3, and the number of routes to the intersection directly above, which is also 3.

Carry Out the Plan. Using the pattern discovered above, we see from the figure below that
the number of routes from point A to point B is 20 + 15 = 35.
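
The left-plus-above pattern translates directly into a short Python sketch. The street diagram itself is not reproduced here, so the grid size is an assumption: a full rectangular grid of 4 rows by 5 columns of intersections, with A at the top-left corner and B at the bottom-right corner, is the layout consistent with the labels 3, 6, 15, 20 and 35 mentioned in the text. The function name `count_routes` is hypothetical.

```python
def count_routes(rows, cols):
    """Label each intersection with the number of direct routes from A."""
    # Intersections in the top row or left column can be reached only one way.
    routes = [[1] * cols for _ in range(rows)]
    for r in range(1, rows):
        for c in range(1, cols):
            # Routes arrive either from the left or from directly above.
            routes[r][c] = routes[r][c - 1] + routes[r - 1][c]
    return routes


grid = count_routes(4, 5)   # assumed grid size, see the note above
for row in grid:
    print(row)
print("Routes from A to B:", grid[-1][-1])
```

Printing the whole grid makes it easy to compare every intersection label with a hand-drawn diagram.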

A street diagram with the number of routes to each intersection labeled.

Review the Solution. Ask yourself whether a result of 35 seems reasonable. If you were
required to draw each route, could you devise a scheme that would enable you to draw
each route without missing a route or duplicating a route?

Example # 2:
Consider the street map below. A person wishes to walk directly from point A to
point B. How many different routes can a person take if one wants to go past Starbucks on
Third Avenue?

Street Map

Example 2 illustrates the technique of using an organized list.


Solution:

Example # 3:
A baseball team won two out of their last four games. In how many different
orders could they have two wins and two losses in four games?

Solution:
Understand the problem. There are many different orders. The team may have
won two straight games and lost the last two (WWLL). Or maybe they lost the first two
games and won the last two (LLWW). Of course there are other possibilities, such as
WLWL.
Devise a Plan. We will make an organized list of all the possible orders. An
organized list is a list that is produced using a system that ensures that each of the
different orders will be listed once and only once.

Carry Out the Plan. Each entry in our list must contain two Ws and two Ls. We will
use a strategy that makes sure each order is considered, with no duplications. One such
strategy is to always write a W unless doing so will produce too many Ws or a duplicate of
one of the previous orders. If it is not possible to write a W, then and only then do we
write an L. This strategy produces the six different orders shown below.

1. WWLL (Start with two wins)
2. WLWL (Start with one win)
3. WLLW
4. LWWL (Start with one loss)
5. LWLW
6. LLWW (Start with two losses)

Review the Solution. We have made an organized list. The list has no duplicates
and the list considers all possibilities, so we are confident that there are six different
orders in which a baseball team can win exactly two out of four games.
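
An organized list can also be generated by a program, which is a convenient way to review the answer. The Python sketch below is an illustration only (the function name `win_loss_orders` is hypothetical). It builds every arrangement of two Ws and two Ls, removes duplicates with a set, and sorts the result so that orders with earlier wins come first, matching the "write a W whenever possible" strategy.

```python
from itertools import permutations


def win_loss_orders(wins, losses):
    """List every distinct order of the given number of wins and losses."""
    games = "W" * wins + "L" * losses
    # A set removes the duplicates produced by swapping identical letters.
    distinct = {"".join(order) for order in permutations(games)}
    # "W" sorts after "L", so reverse order puts early wins first.
    return sorted(distinct, reverse=True)


orders = win_loss_orders(2, 2)
for number, order in enumerate(orders, start=1):
    print(number, order)
```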

Example # 5:
Finding a Day of the Week
In 2017, Venus’ birthday fell on a Saturday, June 3. On what day of the week does
Venus’ birthday fall in 2020? Note that the year 2020 is a leap year.

Solution:
The number of days in a year is 365 except when it is a leap year where there’s one day
added. How many days are there after June 3, 2017 to June 3, 2020?

Number of days:
After June 3, 2017 to June 3, 2018: 365 days
After June 3, 2018 to June 3, 2019: 365 days
After June 3, 2019 to June 3, 2020: 366 days (leap year)
Total 1,096 days

Because 1,096 ÷ 7 = 156 weeks with a remainder of 4 days, we write
1,096 ≡ 4 (mod 7). Since a week is a cycle, any multiple of 7 days past a given day falls on
the same day of the week. Because 1,092 is a multiple of 7, the 1,092nd day
after June 3, 2017 is also a Saturday. The 1,096th day, four days later, is a
Wednesday. Thus, June 3, 2020 will be a Wednesday.
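
The same computation can be checked with Python's standard `datetime` module. This sketch is an illustration only; the list `DAY_NAMES` and the variable names are chosen for this example. It counts the days between the two birthdays, splits the count into whole weeks and leftover days, and shifts the starting weekday by the leftover days.

```python
from datetime import date

# date.weekday() numbers the days from Monday = 0 to Sunday = 6.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

start = date(2017, 6, 3)
end = date(2020, 6, 3)

elapsed = (end - start).days        # days from June 3, 2017 to June 3, 2020
weeks, extra = divmod(elapsed, 7)   # whole weeks and leftover days

# Shift the starting weekday by the leftover days, wrapping around the week.
shifted = DAY_NAMES[(start.weekday() + extra) % 7]

print(elapsed, weeks, extra)
print(DAY_NAMES[start.weekday()], "->", shifted)
```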

Activity 2 – 1
Carefully analyze the pictures below. Answer what is required to complete the equation. (You
may work with a partner.)

Activity 2 – 2

KenKen puzzles are another interesting type of puzzle. Search the internet to learn more about them. Solve
the given KenKen puzzles below:

A. 4x4 B. 5x5

3 5 1 4 2
2 3 5 1 4
0
4 2 3 5 1
1
5 1 4 2 3

1 4 2 3 5

Activity 2 – 3

Use Polya’s four-step method in solving this problem.

Mr. Cruz has chickens and cows in his backyard. All in all there are thirty-nine (39) animals
and a total of one hundred (100) legs. How many cows and chickens are there?

will not be graded.

LESSON

2 MEASURES OF CENTRAL TENDENCY

3 HOURS

This lesson presents the definition, functions, types and computations of measures of central
tendency such as the mean, median & mode for ungrouped data.

At the end of this lesson, you are expected to:

1. State the meaning & functions of measures of central tendency;


2. Identify the most common & widely used measures of central tendency;
3. Write the meaning of median;
4. Discuss the difference between population and sample mean;
5. Describe how to find the median value when the given frequency of scores is odd
& even;
6. State how to determine the mode value of a given distribution;
7. Write the meaning of unimodal, bimodal, trimodal & polymodal; and
8. Compute mean, median & mode.

The world of sports is best understood through Mathematics. Do you know that the career of
each player is described by Statistics? In the world of basketball, let me give you two of my favorite
stars – Jordan and Bryant. The careers of Michael Jordan and the late Kobe Bryant are described by
these:

CAREER AVERAGES
                         Michael Jordan   Kobe Bryant
Regular season games     1072             1346
Points                   30.1             25.0
Rebounds                 6.2              5.2
Assists                  5.3              4.7

https://www.basketball-reference.com

What is your favorite sport? Who is your favorite player? What are your player’s statistics?

This topic helps you recall and develop mastery of computations involving the different
measures of central tendency. It also discusses the importance, uses and applications of these
measures.

What is meant by measures of central tendency?


It is the value which provides a summary of the characteristics of a given set of data (Nocon, et. al.,
2000).
It is any measure indicating the center of a set of data, arranged in either an increasing or decreasing
magnitude (Broto, 2012).
It is a numerical value indicative of a typical value of the distribution (Montero-Galliguez, et. al.,
2016).

What is the function of measures of central tendency?
The measures of central tendency or location summarize the distribution of data in order to come
up with an average that is typical or representative of a given set of data. Raw data are abstract when
there is no order. Data are arranged according to magnitude to give further meaning or value to a given
distribution (Vizcarra, et. al., 2012).
What are the most commonly used measures of central tendency?
These are mean, median and mode.

MEAN
The mean is equal to the sum of all the values divided by the total number of values. It is
also referred to as the arithmetic mean or simple average. The mean is widely used in descriptive and
inferential statistics because it is representative of the whole distribution (Vizcarra, et. al. 2012).
The arithmetic mean of a finite set of measurements is equal to the sum of the measurements divided
by the total number of measurements (Montero-Galliguez, et. al., 2016).
The uses of the mean according to Nocon are:
1. for interval and ratio measurements;
2. if higher statistical computations are wanted;
3. if there are no extreme values in the distribution (that is, the distribution is approximately
normal), since the mean is easily affected by extremely high or extremely low scores; and
4. when the greatest reliability of the measure of central tendency is wanted since its
computations include all the given values.
The mean is classified into population mean and sample mean.

The Mean for Ungrouped Data:


Population mean is the average of all the values in a population. All the values in the population are
added together, and the sum is divided by the population size N (Broto, 2012). The formula is:
$$\mu = \frac{\Sigma X}{N}$$
Where:
μ = Population Mean
ΣX = Sum of the Population Values
N = Population Size
Example 1.
The number of faculty members in 10 different colleges is 16, 25, 40, 24, 15, 20, 50, 15, 35,
and 20. Treating the data as a population, find the population mean of faculty members for the 10
colleges.
Solution: X
16
25
40
24
15
20
50
15
35
20
ΣX = 260
N = 10

Substitute the given values to the formula.
$$\mu = \frac{\Sigma X}{N} = \frac{260}{10} = 26$$
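
The substitution above can be reproduced with a few lines of Python. This is only a sketch for checking hand computations; the function name `arithmetic_mean` is hypothetical, and the data are the faculty counts from Example 1.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


faculty = [16, 25, 40, 24, 15, 20, 50, 15, 35, 20]

print(sum(faculty))              # ΣX
print(len(faculty))              # N
print(arithmetic_mean(faculty))  # μ
```

The same function applies to a sample: the arithmetic is identical, and only the symbols (x̄ and n instead of μ and N) change.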

Sample mean is the average of the values in a sample. The sample values are added together, and the sum is
divided by the sample size n (Broto, 2012). The formula is:
$$\bar{x} = \frac{\Sigma x}{n}$$
Where:
x̄ = Sample Mean
Σx = Sum of the Sample Observations
n = Sample Size

Example 2. The following are the ages of a sample of 9 children in a slum area. Find the sample mean.
x
9
8
1
3
4
5
6
7
2
Σx = 45
n=9

Solution: Substitute the given values to the formula.

$$\bar{x} = \frac{\Sigma x}{n} = \frac{45}{9} = 5$$

The mean (average) is 5.


MEDIAN
Median refers to the value of the middle observation in an ordered distribution (Nocon, et. al., 2000).
Median (Md) is the value found at the middle when the data are arranged in an array form either from
the highest to the lowest or from the lowest to highest. If there are two middle values, the average is taken
(Broto, 2012).
Median is the midpoint in an ordered set of numbers. It is the middle value that divides the given set of
observations when it is arranged from the lowest to the highest or vice versa. The median is also defined as the
value at the 50% of both the upper and lower sides of a given set of observations (Vizcarra, et. al. 2012).
The uses of the median according to Nocon are:
1. for ordinal or ranked measurements;
2. if there are extreme cases, thus the distribution is markedly skewed;
3. if we desire to know whether the cases fall within the upper half or the lower half of a
distribution; and
4. for an open-end distribution; that is, the lowest or the highest class interval, or both, is not
bounded, as in “50 and below” or “100 and above.”

The Median for Ungrouped Data:


Example 1. Find the median of the given set of scores.
X
9
8
7
6
5 – is the median
4
3
2
1
Solution: Md = 5 because it is the middle point value.

Example 2. Given the scores below, find the value of the median.
X
9
8
7
6
5
4
3
2
1
1
Solution: Since there are two middle point values which are 5 and 4, then:
$$Md = \frac{5 + 4}{2} = \frac{9}{2} = 4.5$$
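
The two cases (an odd and an even number of scores) can be captured in one small Python function. This is a sketch for checking hand computations, not a prescribed method; the function name `median` is chosen for this example, and the data are the scores from Examples 1 and 2.

```python
def median(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: exactly one middle value.
        return ordered[mid]
    # Even number of scores: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([9, 8, 7, 6, 5, 4, 3, 2, 1]))      # Example 1
print(median([9, 8, 7, 6, 5, 4, 3, 2, 1, 1]))   # Example 2
```

Sorting first means the data may be entered in any order, which mirrors the requirement that the scores be arranged before the middle value is read off.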

MODE
The mode is the item or value in a distribution with the highest frequency or most number of cases
(Nocon, et. al., 2000).
The mode (Mo) is the value which occurs most often or with greatest frequency (Broto, 2012).
The mode is the value which occurs most frequently in a set of observations. It is considered as the least
reliable measure of location (Vizcarra, et. al. 2012).
The mode is the measurement which appears most frequently. There may be no mode if no value appears
more than any other (Montero-Galliguez, et. al., 2016).
The uses of the mode according to Nocon are:
1. for nominal or categorical data;
2. if the most popular or most typical case or value in a distribution is wanted and;
3. if a rough or quick estimate of a central value is wanted.

The Mode for Ungrouped Data:

Example 1. The following are the scores of 9 students in a spelling test of 10 items. Determine the mode.
X
0
9
9
7
6
5
3
2
1
Solution: Mo = 9, since 9 occurs twice. The data with one mode is called unimodal.

Example 2. Given the scores below, determine the mode values.


X
18
18
17
17
17
16
16
16
15
15
Solution: Mo = 17 and 16. The data are said to be bimodal because they have two modes.

Example 3. Determine the mode for the following data:


1, 2, 5, 8, 2, 9, 10, 1, 8, & 6.
Solution: Mo = 1, 2 & 8. The data are said to be trimodal because they have three modes.

Example 4. Given the data 9, 4, 7, 2, 1, 5, 9, 4, 8, 5, 6, & 7. Find the mode.


Solution: Mo = 9, 4, 5 & 7. The data are said to be polymodal because they have four modes.
Polymodal means the data have four or more modes.

Example 5. Given the distribution 9, 2, 8, 3, 4, 5, 7, & 6 determine the mode.


Solution: the mode does not exist because all the frequencies are equal.
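
The mode examples can be checked with a short Python sketch. It follows the convention used in this module, which is an assumption worth stating: when every distinct value has the same frequency, the mode is taken not to exist, so an empty list is returned. The function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when no mode exists."""
    counts = Counter(values)
    if not counts:
        return []
    # Module convention: equal frequencies for all values means no mode.
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([18, 18, 17, 17, 17, 16, 16, 16, 15, 15]))   # Example 2
print(modes([1, 2, 5, 8, 2, 9, 10, 1, 8, 6]))            # Example 3
print(modes([9, 4, 7, 2, 1, 5, 9, 4, 8, 5, 6, 7]))       # Example 4
print(modes([9, 2, 8, 3, 4, 5, 7, 6]))                   # Example 5
```

The length of the returned list tells whether the data are unimodal (1), bimodal (2), trimodal (3), or polymodal (4 or more).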

According to Montero-Galliguez the properties of the measures of central tendency of finite data
sets can be summarized using the table below.

Table 1. Properties of the Mean, Median, and Mode.

Property                      Mean   Median   Mode
Always exists                 Yes    Yes      No
Affected by extreme values    Yes    No       No

Use these exercises for practice and review of the lesson. You may work with a partner or in groups of
three.

I. Mean
1. Calculate the sample mean value of the following set of scores obtained by 5 students in a 10-
item test.
1, 3, 5, 7, 9.
2. Calculate the population mean (μ) of the heights (ft.) of 5 children.
Child Height (ft.)
1 3.48
2 4.50
3 4.55
4 5.23
5 5.12
3. Calculate the sample mean value of the given scores of students in statistics.
50, 42, 78, 50, 61, 40, 50, 68, 89, & 90.

4. Given the scores 86, 80, 75, 78, and 86, find the population mean value.

5. Find the sample mean value of the set of observations 3, 7, 12, 5, 7, and 10.

II. Median
1. Find the median value of the following set of scores obtained by 5
students. 1, 3, 5, 7, 9.

2. Find the median value in height (ft.) of 5 children.


Child Height (ft.)
1 3.48
2 4.50
3 4.55
4 5.23
5 5.12
3. Find the median value of the given scores of students in statistics.
50, 42, 78, 50, 61, 40, 50, 68, 89, & 90.

4. Given the scores 86, 80, 75, 78, and 86, find the median value.

5. Find the median value of the set of observations 3, 7, 12, 5, 7, and 10.

III. Mode
1. Determine the mode value of the following set of scores obtained by 5 students.
1, 3, 5, 7, 9.
2. Determine the mode value in height (ft.) of 5 children.
Child Height (ft.)
1 3.48
2 4.50
3 4.55
4 5.23
5 5.12

3. Determine the mode value of the given scores of students in statistics.


50, 42, 78, 50, 61, 40, 50, 68, 89, & 90.

4. Given the scores 86, 80, 75, 78, and 86, find the mode value.
5. Find the mode value of the set of observations 3, 7, 12, 5, 7, and 10.

Mean, Median & Mode

1. An experiment was conducted to determine whether fish scales could be used as a
fertilizer to accelerate the growth of plants. The heights (in centimeters) of the plants
that had fish scales as fertilizer were measured after 12 days. Find the mean of the
following sample data:
Plant Height (cm)
1 3.48
2 4.50
3 4.55
4 5.23
5 5.12

2. Find the median for the set of measurements 9, 2, 7, 11, 14, and 6.

3. Seven students were asked how many text messages they sent on a given day. Their
answers were 10, 14, 5, 20, 8, 30, and 15. Find the median.

4. Find the mode of the data set 3, 5, 3, 6, 6, 9, 10, 6, and 7.

5. Given the distribution 19, 20, 20, 20, 18, 15, 13, and 9, determine the values for mean,
median & mode.

LESSON

3 MEASURES OF VARIATION
6 HOURS

This lesson presents the definition, functions, types and computation of measures of
variation or dispersion.

At the end of this lesson you are expected to:

1. Construct a frequency distribution table.
2. Compute the following measures of variation:
2.1. Mean Absolute Deviation
2.2. Population Variance
2.3. Sample Variance
2.4. Population Standard Deviation
2.5. Sample Standard Deviation
2.6. Quartile Deviation
2.7. Percentile Range
3. Make interpretations of the computed results.

This is a comic strip created by a student in GE 104 about Variation. Analyze the comic strip. Was
the student able to present a clear idea about it?
What is variation? How is it used? When is it used? Where is it used?

Events in nature vary from time to time. People differ and change in many ways: in location, motion,
physical appearance, skin reaction to different chemicals, height, weight, hair color, eye color, ideas,
and even values in life.
In this lesson you will be introduced to the concept of variation: what it is, and how, when, and
where it is used.

Usually, the heights of a group of people of the same race tend to converge to a certain
common value. For example, if the mean height of Filipino males is approximately 5 feet and 6 inches,
then most Filipino male adults have heights that cluster around this value. The extent to which the
heights are spread around this central value is known as variation.
The measures of variation tell us how varied the observations are, whether there are
extreme values in the distribution, or whether the values are very close to each other. If the measure is
zero, there is no variation at all: the observations are all alike, or homogeneous. Otherwise,
they are heterogeneous. The common measures of variation are the range, mean absolute deviation,
variance, standard deviation, coefficient of variation, quartile deviation, and the percentile range.

Range (R)

This is the simplest form of measuring variation of a distribution. To get the range, subtract the
lowest score or observation from the highest score.

R = Highest observation – Lowest observation

Example 1:

A group of scientists went on an expedition to the mountain range in Sierra Madre, Philippines to
study the different species of plants existing in the area. The ages of the scientists are 34, 35,
45, 56, 32, 25, and 40. What is the range of their ages?

Solution:
Highest Age 56
Lowest Age 25

Range (R) = Highest observation – Lowest observation
= 56 – 25
= 31

Therefore, the range of their ages is 31. If the size of the population or sample is large, the range
is not a good measure of variation because it considers only the highest and the lowest values and
tells us nothing about the values between them. If one is interested in the position of each
observation
relative to the mean of the set of data, other measures of variation are necessary. The mean absolute
deviation can be applied.

Mean Absolute Deviation (MAD)

To find the mean absolute deviation, subtract the mean from each raw score, take the absolute value
of each difference, and add the results. This sum is the sum of the absolute deviations from the mean.
Next, divide this sum by N, the total number of cases. In symbols,

For ungrouped data:

$$\text{MAD} = \frac{\sum |x - \bar{x}|}{N}$$

For grouped data:

$$\text{MAD} = \frac{\sum f\,|x - \bar{x}|}{N}$$

where, in the grouped case, x is the class mark and f is the class frequency.

Example 2:

Find the MAD of the ages of the scientists in Example 1.

Solution:
Ages are 34, 35, 45, 56, 32, 25, and 40.
Mean Age: $\bar{x} = \frac{34 + 35 + 45 + 56 + 32 + 25 + 40}{7} = \frac{267}{7} = 38.14$

x        x − x̄      |x − x̄|
34       -4.14       4.14
35       -3.14       3.14
45        6.86       6.86
56       17.86      17.86
32       -6.14       6.14
25      -13.14      13.14
40        1.86       1.86
TOTAL               53.14

$$\text{MAD} = \frac{53.14}{7} = 7.59$$

Therefore, the mean absolute deviation is 7.59.
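
The range and mean absolute deviation computed in Examples 1 and 2 can be checked with a short Python sketch. The function names `value_range` and `mean_absolute_deviation` are illustrative only and are not part of the module.

```python
def value_range(data):
    """Highest observation minus lowest observation."""
    return max(data) - min(data)


def mean_absolute_deviation(data):
    """Average of the absolute deviations from the mean (ungrouped data)."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)


ages = [34, 35, 45, 56, 32, 25, 40]
print(value_range(ages))                        # 31
print(round(mean_absolute_deviation(ages), 2))  # 7.59
```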

Variance

Variance is another measure of variation which can be used instead of the range. The variance
considers the deviation of each observation from the mean. To obtain the variance of the distribution,
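compute the deviation from the mean of each raw score. Then, square the deviations from the mean and
add them. Finally, divide the resulting sum by N, the total number of cases, for the population
variance, or by N − 1 for the sample variance. In symbols, for ungrouped data,

$$\sigma_N^2 = \frac{\sum (x - \bar{x})^2}{N} \qquad s_{N-1}^2 = \frac{\sum (x - \bar{x})^2}{N - 1}$$

and for grouped data,

$$\sigma_N^2 = \frac{\sum f(x - \bar{x})^2}{N} \qquad s_{N-1}^2 = \frac{\sum f(x - \bar{x})^2}{N - 1}$$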

Except when specified that the population variance is to be used, we always use the sample
variance formula in the examples and exercises throughout the module.

Example 3. Compute the population and sample variances of the data below.

IQ Scores   f    x     fx      x²       fx²       x − x̄    (x − x̄)²   f(x − x̄)²
75-79       10   77    770     5,929    59,290    -13.7    187.69     1,876.9
80-84       12   82    984     6,724    80,688    -8.7     75.69      908.28
85-89       25   87    2,175   7,569    189,225   -3.7     13.69      342.25
90-94       34   92    3,128   8,464    287,776   1.3      1.69       57.46
95-99       19   97    1,843   9,409    178,771   6.3      39.69      754.11
100-104     15   102   1,530   10,404   156,060   11.3     127.69     1,915.35

N = 115   Σfx = 10,430   Σfx² = 951,810   Σf(x − x̄)² = 5,854.35

$$\bar{x} = \frac{10{,}430}{115} = 90.70$$
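
As a check on the table, the grouped-data variances can be computed with a short Python sketch. It assumes the class marks x stand in for every score in their class, and the function name `grouped_variances` is hypothetical. The results agree with the values 50.91 and 51.35 quoted for the same data in Example 5.

```python
def grouped_variances(class_marks, freqs):
    """Return (population variance, sample variance) for grouped data."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(class_marks, freqs)) / n
    # Sum of f * (x - mean)^2, the last column of the table.
    ss = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, freqs))
    return ss / n, ss / (n - 1)


class_marks = [77, 82, 87, 92, 97, 102]
freqs = [10, 12, 25, 34, 19, 15]
population, sample = grouped_variances(class_marks, freqs)
print(round(population, 2))  # 50.91
print(round(sample, 2))      # 51.35
```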

Example 4:
Find the population and sample variances of the following distribution.

34, 35, 45, 56, 32, 25, and 40.


Solution:

x        x − x̄      |x − x̄|     (x − x̄)²
34       -4.14       4.14        17.14
35       -3.14       3.14         9.86
45        6.86       6.86        47.06
56       17.86      17.86       318.98
32       -6.14       6.14        37.70
25      -13.14      13.14       172.66
40        1.86       1.86         3.46
TOTAL               53.14       606.86

$$\bar{x} = \frac{267}{7} = 38.14$$
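
The two variances for this distribution can be obtained by dividing the total of the squared deviations (606.86) by N = 7 and by N − 1 = 6. The sketch below uses Python's standard `statistics` module as an independent check. It is an illustration only, not the module's own worked solution.

```python
import statistics

ages = [34, 35, 45, 56, 32, 25, 40]

population_variance = statistics.pvariance(ages)  # divides by N
sample_variance = statistics.variance(ages)       # divides by N - 1

print(round(population_variance, 2))  # 86.69
print(round(sample_variance, 2))      # 101.14
```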

Standard Deviation

The standard deviation, σ for a population or s for a sample, is the square root of the value of
the variance. In symbols,

A. Population Standard Deviation (σ)

$$\sigma = \sqrt{\sigma_N^2}$$
B. Sample Standard Deviation (s)

$$s = \sqrt{s_{N-1}^2}$$

Unless specified, the sample standard deviation is used in all the examples and exercises
throughout the module.

To interpret the standard deviation: the larger the value, the greater the dispersion, which denotes
heterogeneous data. The smaller the value, the more homogeneous the scores.

Example 5:
Compute the population and sample standard deviation of the data below.

IQ Scores   f    x     fx      x²       fx²       x − x̄    (x − x̄)²   f(x − x̄)²
75-79       10   77    770     5,929    59,290    -13.7    187.69     1,876.9
80-84       12   82    984     6,724    80,688    -8.7     75.69      908.28
85-89       25   87    2,175   7,569    189,225   -3.7     13.69      342.25
90-94       34   92    3,128   8,464    287,776   1.3      1.69       57.46
95-99       19   97    1,843   9,409    178,771   6.3      39.69      754.11
100-104     15   102   1,530   10,404   156,060   11.3     127.69     1,915.35

N = 115   Σfx = 10,430   Σfx² = 951,810   Σf(x − x̄)² = 5,854.35

$$\bar{x} = \frac{10{,}430}{115} = 90.70$$

Solution:
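a. Population Variance

$$\sigma_N^2 = \frac{\sum f(x - \bar{x})^2}{N} = \frac{5{,}854.35}{115} = 50.91$$

Therefore, the value of the population standard deviation is $\sigma = \sqrt{50.91} = 7.13$.

b. Sample Variance

$$s_{N-1}^2 = \frac{\sum f(x - \bar{x})^2}{N - 1} = \frac{5{,}854.35}{114} = 51.35$$

Therefore, the value of the sample standard deviation is $s = \sqrt{51.35} = 7.17$.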

Example 6:
Find the standard deviation of the distribution in the table below.

Scores in Statistics Final Exam


Class Interval   f    x    fx      fx²
27-29            12   28   336     9,408
30-32            23   31   713     22,103
33-35            60   34   2,040   69,360
36-38            45   37   1,665   61,605
39-41            51   40   2,040   81,600
42-44            75   43   3,225   138,675
45-47            28   46   1,288   59,248
48-50            33   49   1,617   79,233
51-53            18   52   936     48,672
54-56            10   55   550     30,250

N = 355   Σfx = 14,410   Σfx² = 600,154

$$\bar{x} = \frac{14{,}410}{355} = 40.59$$

Therefore, the standard deviation of the scores is 6.56.
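
The value 6.56 can be reproduced from the column totals alone. The sketch below assumes the computational (shortcut) form of the sample variance, s² = [Σfx² − (Σfx)²/N] / (N − 1), which is algebraically equivalent to the deviation form. The module's own working is not shown, so this is one possible route rather than the author's method, and the function name `grouped_sample_sd` is hypothetical.

```python
import math


def grouped_sample_sd(class_marks, freqs):
    """Sample standard deviation of grouped data via the shortcut formula."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(class_marks, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(class_marks, freqs))
    return math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))


class_marks = [28, 31, 34, 37, 40, 43, 46, 49, 52, 55]
freqs = [12, 23, 60, 45, 51, 75, 28, 33, 18, 10]
print(round(grouped_sample_sd(class_marks, freqs), 2))  # 6.56
```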

Coefficient of Variation

When it is necessary to compare the variability of two or more groups, the task is easy if the
means are the same. For example, we can easily tell which of the following groups is more varied
in height:

Group 1: mean = 156 cm, standard deviation = 6
Group 2: mean = 156 cm, standard deviation = 10

Clearly, Group 2 is more varied because it has the higher standard deviation.
The task becomes more difficult if the means are not equal or the units are different, such as when
comparing the weights of groups belonging to different age brackets or genders.
How can we compare the variability of the weights of 9 girls, with a mean weight of 100
pounds and a standard deviation of 5, with that of the weights of 12 boys, with a mean weight
of 160 pounds and a standard deviation of 8? A statistic called the coefficient of
variation helps us answer the question. The formula is:

$$CV = \frac{s}{\bar{x}} \times 100\%$$

where s = standard deviation
x̄ = mean

Since s and x̄ have the same units, the units cancel out, so CV has no unit.

Example:
Suppose two groups of students are to be compared in terms of height.

Group    Mean Height   Standard Deviation   CV
Male     162 cm        10 cm                6.17%
Female   148 cm        4 cm                 2.70%

Solution for CV:

$$CV_{Male} = \frac{10}{162} \times 100\% = 6.17\%$$

$$CV_{Female} = \frac{4}{148} \times 100\% = 2.70\%$$

Comparing the relative variations in height of the male and female students, it can be seen
that the male students have higher coefficient of variation in height than the female students.
Thus, male students’ heights are more varied.
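
The same comparison can be sketched in Python. The function name `coefficient_of_variation` is illustrative. The last two lines apply the formula to the girls' and boys' weights described earlier in this section, using only the means and standard deviations stated there.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    return sd / mean * 100


print(f"Male:   {coefficient_of_variation(10, 162):.2f}%")  # Male:   6.17%
print(f"Female: {coefficient_of_variation(4, 148):.2f}%")   # Female: 2.70%

# Weights from the earlier question: girls (mean 100 lb, s = 5), boys (mean 160 lb, s = 8).
print(f"Girls:  {coefficient_of_variation(5, 100):.2f}%")   # Girls:  5.00%
print(f"Boys:   {coefficient_of_variation(8, 160):.2f}%")   # Boys:   5.00%
```

Although the boys' standard deviation is larger, both groups have the same relative variation once the means are taken into account.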

Example 2
Compare the variability of the height and weight of the students given the following data:

                    Mean      s        CV
Height in cm        168 cm    12 cm    7.14%
Weight in pounds    200 lb    20 lb    10.00%

From the results, it can be seen that the weight of the students is more varied than the
height.

Quartile Deviation

Quartile deviation is another way of determining the spread of a distribution, this time in terms of
quartiles. The following is the quartile deviation formula:

$$QD = \frac{Q_3 - Q_1}{2}$$

where QD = quartile deviation
Q3 = 3rd quartile
Q1 = 1st quartile

Example:

Find the QD of the following scores:

23 25 25 30 35 39 40 44 47 51 60

Solution:

$Q_3 = \frac{3N}{4} = \frac{3(11)}{4} = 8.25$th item. Thus, $Q_3 = 47$.

$Q_1 = \frac{N}{4} = \frac{11}{4} = 2.75$th item. Thus, $Q_1 = 25$.

$$QD = \frac{47 - 25}{2} = \frac{22}{2} = 11$$

Hence, the QD is 11.

Example:

Find the QD of car battery lives (in years).

1.2 1.4 1.6 2.2 2.5 2.8 3.0 3.0 3.1 4.4

Solution:

$Q_3 = \frac{3N}{4} = \frac{3(10)}{4} = 7.5$th item, that is, 3.0 years (the value midway between the
7th and 8th items, which is 3.0 in this example).

$Q_1 = \frac{N}{4} = \frac{10}{4} = 2.5$th item, that is, 1.5 years (since the number of cases is even,
the value midway between the 2nd and 3rd items, which is 1.5, is taken).

$$QD = \frac{3.0 - 1.5}{2} = \frac{1.5}{2} = 0.75$$

Hence, the QD is 0.75.
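
The two ungrouped examples above can be reproduced with the Python sketch below. The positioning rule in `item_at` is an assumption inferred from those two worked solutions (a position ending in .5 is taken midway between neighbouring items, and any other fractional position moves up to the next item). Other textbooks and software use different conventions, so the function names and the rule are illustrative only.

```python
import math


def item_at(sorted_data, position):
    """Value at a 1-based, possibly fractional, position (assumes position >= 1)."""
    lower = math.floor(position)
    frac = position - lower
    if frac == 0:
        return sorted_data[lower - 1]
    if frac == 0.5:
        # Midway between the two neighbouring items.
        return (sorted_data[lower - 1] + sorted_data[lower]) / 2
    # Assumed rule: any other fractional position moves up to the next item.
    return sorted_data[lower]


def quartile_deviation(data):
    """QD = (Q3 - Q1) / 2, with Q1 at position N/4 and Q3 at position 3N/4."""
    s = sorted(data)
    n = len(s)
    q1 = item_at(s, n / 4)
    q3 = item_at(s, 3 * n / 4)
    return (q3 - q1) / 2


print(quartile_deviation([23, 25, 25, 30, 35, 39, 40, 44, 47, 51, 60]))       # 11.0
print(quartile_deviation([1.2, 1.4, 1.6, 2.2, 2.5, 2.8, 3.0, 3.0, 3.1, 4.4]))  # 0.75
```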

Example:
Find the QD of the scores below.

Class Interval   f    x    fx      <CF   Class Boundary
27-29            12   28   336     12    26.5 – 29.5
30-32            23   31   713     35    29.5 – 32.5
33-35            60   34   2,040   95    32.5 – 35.5
36-38            45   37   1,665   140   35.5 – 38.5
39-41            51   40   2,040   191   38.5 – 41.5
42-44            75   43   3,225   266   41.5 – 44.5
45-47            28   46   1,288   294   44.5 – 47.5
48-50            33   49   1,617   327   47.5 – 50.5
51-53            18   52   936     345   50.5 – 53.5
54-56            10   55   550     355   53.5 – 56.5

Solution:

Percentile Range (PR)


The percentile range is the difference between the 90th percentile (P90) and the 10th percentile (P10).
In symbols,
PR = P90 – P10

Example:
The following data represent the scores of students in the final examination in Physics.
100 100 111 111 112 120 121 122 123 175
171 130 132 133 135 140 145 145 146 150
150 155 160 164 165 165 170 180 175

Calculate the percentile range of the scores.


Solution:
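Arrange the scores in ascending order, then locate the required items.

$P_{90} = \frac{90N}{100} = \frac{90(29)}{100} = 26.1$th item, that is, 171.

$P_{10} = \frac{10N}{100} = \frac{10(29)}{100} = 2.9$th item, that is, 111.

PR = P90 – P10 = 171 – 111 = 60

Hence, the percentile range of the scores is 60.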
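
The computation can be sketched in Python as follows. The sketch assumes that a fractional position is rounded to the nearest whole item, which reproduces the items chosen in this worked solution (26.1 gives the 26th item and 2.9 gives the 3rd item). That rounding rule and the function names are assumptions, and other conventions for percentiles exist.

```python
def percentile_item(sorted_data, k):
    """k-th percentile: item at position kN/100, rounded to the nearest item."""
    position = k * len(sorted_data) / 100
    rank = int(position + 0.5)  # assumed rule: nearest whole position
    return sorted_data[rank - 1]


def percentile_range(data):
    """PR = P90 - P10."""
    s = sorted(data)  # the scores must be in ascending order first
    return percentile_item(s, 90) - percentile_item(s, 10)


scores = [100, 100, 111, 111, 112, 120, 121, 122, 123, 175,
          171, 130, 132, 133, 135, 140, 145, 145, 146, 150,
          150, 155, 160, 164, 165, 165, 170, 180, 175]
print(percentile_range(scores))  # 60
```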

Use these exercises for practice and review of the lesson. You may work with a partner or in groups
of three.

I. The number of incorrect answers on a true-false Mathematics Proficiency Test for a random
sample of 20 students was recorded as follows:
3 3 5 6 1 2 1 4 4 5
1 3 3 2 5 4 4 5 1 2

Find the following:


a. Range
b. Mean absolute deviation
c. Population variance
d. Standard deviation
e. Quartile deviation

II. The data below are the volumes of different materials used by the Physics students in a certain
laboratory activity to determine the density of the materials using the displacement method.

Volume of Wooden Cubes in cubic cm   Number
0–5 2
6 – 11 3
12 – 17 5
18 – 23 8
24 – 29 7
30 – 35 4
36 – 41 5
42 – 47 3
48 – 53 3

Calculate the following:


a. Population variance
b. Population standard deviation
c. Sample variance
d. Sample standard deviation

III. Find the population mean, range, mean absolute deviation, and population variance of the
following set of data:
12 13 45 45 45 12 12 10 10 13 15

IV. Find the population mean, range, mean absolute deviation, and population variance of the
following set of data:
25 35 15 20 40 10 5 17 18 24 19

V. Which group is the most heterogeneous?


Group I scores: 100 123 122 150 146 141 132 122
Group II scores: 102 102 132 154 124 136 125 135

Group III scores: 150 120 130 114 112 105 136 104

VI. Find the sample standard deviation of the following numbers:

12 23 56 61 42 21 10 32 26 28 54
95 43 23 12 25 34 56 64 36 39 12

VII. Construct the frequency distribution table then find the sample variance and sample standard
deviation of the following scores:

10 20 50 23 21 12 13 15 24 26 25
23 24 25 28 24 56 20 10 32 30 31
13 25 65 45 51 42 35 65 52 10

VIII. Using the data below

12 13 45 45 45 12 12 10 10 13 15

Compute/determine the following:

1. N
2. Population Mean
3. Range
4. Mean Absolute Deviation
5. Population Variance
6. Population Standard Deviation

LESSON

4 LINEAR REGRESSION & CORRELATION
3 HOURS

This lesson presents the definition, functions, and applications of linear regression & correlation.

At the end of this lesson, you are expected to:

1. Use the methods of linear regression and correlations to predict the value of a
variable given certain conditions, and;

2. Make interpretations of the computed results.

Given the graph below, what have you noticed with the numbers in the x-axis versus the
numbers in the y-axis? What’s the meaning of red & blue lines?

In this lesson you will be introduced to linear regression and correlation as tools for relating two
variables and predicting the value of one from the other.

Linear Regression

When performing research studies, scientists often wish to know whether two variables
are related. If the variables are determined to be related, a scientist may then wish to find an equation
that can be used to model the relationship. For instance, a geologist might want to know whether there
is a relationship between the duration of an eruption of a geyser and the time between eruptions. A first
step in this determination is to collect some data. Data involving two variables are called bivariate data.
The table below gives bivariate data showing the time between two eruptions and the duration of the
second eruption for 10 eruptions of the geyser Old Faithful.

Time between eruptions (in seconds), x   272   227   237   238   203   270   218   226   250   245
Duration of eruption (in seconds), y     89    79    83    82    81    85    78    81    85    79

Once the data are collected, a scatter diagram or scatter plot can be drawn, as shown in the
figure below.

One way for the geologist to create a model of the relationship between the time between
two eruptions and the duration of the second eruption is to find a line that approximates the data points
plotted in the scatter plot. Many such lines can be drawn, as shown in the figure above.

Of all the possible lines that can be drawn, the one that is usually of most interest is called
the line of best fit or the least-squares regression line. The least-squares regression line is the line that
fits
the data better than any other line that might be drawn. The least-squares regression line is defined as
follows.

The Least-Squares Regression Line

The least-squares regression line for a set of bivariate data is the line that minimizes the sum of the
squares of the vertical deviations from each data point to the line.

In this definition, the phrase “minimizes the sum of the squares of the vertical deviations” is
somewhat daunting. Referring to the figure below, it means that of all the lines possible, the linear
equation that minimizes the sum

$$d_1^2 + d_2^2 + d_3^2 + d_4^2 + d_5^2 + d_6^2 + d_7^2 + d_8^2 + d_9^2 + d_{10}^2$$

is the equation of the line of best fit. In this expression, each $d_n$ represents the vertical distance
from data point n to the line.

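Applying some techniques from calculus, it is possible to find a formula for the least-squares line.
For n ordered pairs (x, y), the least-squares line is $\hat{y} = ax + b$, where

$$a = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} \qquad b = \bar{y} - a\bar{x}$$

To apply this formula to the data for Old Faithful, we first find the value of each summation.

$$\sum x = 2{,}386 \qquad \sum y = 822 \qquad \sum x^2 = 573{,}560 \qquad \sum xy = 196{,}636$$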

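The regression equation, with x the time between eruptions and ŷ the estimated duration, is
$\hat{y} \approx 0.1190x + 53.82$ (coefficients computed from the Old Faithful data and rounded).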

The graph of the regression equation and a scatter plot of the data are shown below.

We can now use the regression equation to estimate the duration of an eruption given the
time between eruptions. For instance, if the time between two eruptions is 200 seconds, then the
estimated duration of the second eruption is

$$\hat{y} \approx 0.1190(200) + 53.82 \approx 77.6$$

The approximate duration of the eruption is 78 seconds.
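
The Old Faithful estimate can be checked with a short Python sketch. It assumes the standard least-squares formulas for the slope and intercept, slope = [nΣxy − (Σx)(Σy)] / [nΣx² − (Σx)²] and intercept = ȳ − slope · x̄. The function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept


# Old Faithful data from the table in this lesson.
time_between = [272, 227, 237, 238, 203, 270, 218, 226, 250, 245]
duration = [89, 79, 83, 82, 81, 85, 78, 81, 85, 79]

slope, intercept = least_squares_line(time_between, duration)
estimate = slope * 200 + intercept
print(round(estimate))  # 78
```

The rounded estimate for a 200-second interval matches the 78 seconds stated above.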

Linear Correlation Coefficient
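To determine the strength of a linear relationship between two variables, statisticians use a
statistic called the linear correlation coefficient, which is denoted by the variable r and is defined as
follows. For n ordered pairs (x, y),

$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\;\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$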

If the linear correlation coefficient r is positive, the relationship between the variables has a
positive correlation. In this case, if one variable increases, the other variable also tends to increase. If r is
negative, the linear relationship between the variables has a negative correlation. In this case, if one
variable increases, the other variable tends to decrease.
The figure below shows some scatter diagrams along with the type of linear correlation that exists
between the x and y variables. The closer |r| is to 1, the stronger the linear relationship is between the
variables.

Example:
Find the linear correlation coefficient for stride length versus speed of an adult dinosaur. Round
your result to the nearest hundredth.

Stride length (m)   Speed (m/s)
x                   y             xy       x²       y²
2.5                 3.4           8.5      6.25     11.56
3.0                 4.9           14.7     9.0      24.01
3.3                 5.5           18.15    10.89    30.25
3.5                 6.6           23.1     12.25    43.56
3.8                 7.0           26.6     14.44    49.0
4.0                 7.7           30.8     16.0     59.29
4.2                 8.3           34.86    17.64    68.89
4.5                 8.7           39.15    20.25    75.69

Σx = 28.8   Σy = 52.1   Σxy = 195.86   Σx² = 106.72   Σy² = 362.25

n = 8   (Σx)² = 829.44   (Σy)² = 2,714.41

Substitute the values into the formula below:

$$r = \frac{8(195.86) - (28.8)(52.1)}{\sqrt{8(106.72) - 829.44}\;\sqrt{8(362.25) - 2{,}714.41}}$$

$$r = \frac{1{,}566.88 - 1{,}500.48}{\sqrt{853.76 - 829.44}\;\sqrt{2{,}898 - 2{,}714.41}}$$

$$r = \frac{66.4}{\sqrt{24.32}\;\sqrt{183.59}}$$

$$r = \frac{66.4}{(4.93)(13.55)}$$

$$r = \frac{66.4}{66.80}$$

$$r = 0.99$$

The value of r is 0.99, which is very close to 1. This indicates a strong positive linear correlation
between stride length and the speed of an adult dinosaur: as the stride length increases, the speed
also tends to increase.
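
The substitution above can be verified with a short Python sketch that applies the same formula directly to the data. The function name `linear_correlation` is illustrative.

```python
import math


def linear_correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


stride = [2.5, 3.0, 3.3, 3.5, 3.8, 4.0, 4.2, 4.5]
speed = [3.4, 4.9, 5.5, 6.6, 7.0, 7.7, 8.3, 8.7]
print(round(linear_correlation(stride, speed), 2))  # 0.99
```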

Properties of the Linear Correlation Coefficient
1. The linear correlation coefficient r is always a real number between –1 and 1, inclusive. In the
case in which
• all of the ordered pairs lie on a line with positive slope, r is 1.
• all of the ordered pairs lie on a line with negative slope, r is –1.
2. For any set of ordered pairs, the linear correlation coefficient r and the slope of the least-squares
line both have the same sign.

3. Interchanging the variables in the ordered pairs does not change the value of r. Thus, the value of
r for the ordered pairs (x1, y1), (x2, y2), …, (xn, yn) is the same as the value of r for the ordered
pairs (y1, x1), (y2, x2), …, (yn, xn).

4. The value of r does not depend on the units used. You can change the units of a variable from, for
example, feet to inches, and the value of r will remain the same.
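
Properties 3 and 4 can be illustrated numerically with the stride-length data from the example. This is a sketch rather than a proof. The function name `linear_correlation` is illustrative, and the conversion from meters to centimeters is an arbitrary choice of units.

```python
import math


def linear_correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    numerator = n * sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y
    spread_x = math.sqrt(n * sum(x * x for x in xs) - sum_x ** 2)
    spread_y = math.sqrt(n * sum(y * y for y in ys) - sum_y ** 2)
    return numerator / (spread_x * spread_y)


stride_m = [2.5, 3.0, 3.3, 3.5, 3.8, 4.0, 4.2, 4.5]
speed = [3.4, 4.9, 5.5, 6.6, 7.0, 7.7, 8.3, 8.7]

r = linear_correlation(stride_m, speed)
r_swapped = linear_correlation(speed, stride_m)        # Property 3: swap x and y
stride_cm = [100 * x for x in stride_m]
r_rescaled = linear_correlation(stride_cm, speed)      # Property 4: change units

print(math.isclose(r, r_swapped))   # True
print(math.isclose(r, r_rescaled))  # True
```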

Use these exercises for review of the lesson. You may work with a partner or in groups of three.

Part 1:
I. Prof. R. McNeill Alexander wanted to determine whether the stride length of a dinosaur, as shown
by its fossilized footprints, could be used to estimate the speed of the dinosaur. Stride length for
an animal is defined as the distance x from a particular point on a footprint to that same point on
the next footprint of the same foot. See the figure below.

Because dinosaurs are extinct, Alexander and fellow scientist A. S. Jayes carried out
experiments with many types of animals, including adult dinosaurs, dogs, camels,
ostriches, and elephants. Some of the results from these experiments are recorded in the
table below.

Calculate the regression lines of the following:

Speeds for selected stride lengths


a. Adult Dinosaurs

Stride length (m) 2.5 3.0 3.3 3.5 3.8 4.0 4.2 4.5
Speed (m/s) 3.4 4.9 5.5 6.6 7.0 7.7 8.3 8.7

b. Dogs

Stride length (m) 1.5 1.7 2.0 2.4 2.7 3.0 3.2 3.5
Speed (m/s) 3.7 4.4 4.8 7.1 7.7 9.1 8.8 9.9

c. Camels

Stride length (m) 2.5 3.0 3.2 3.4 3.5 3.8 4.0 4.2
Speed (m/s) 2.3 3.9 4.4 5.0 5.5 6.2 7.1 7.6

Part 2:
Find the linear correlation coefficient of the following and interpret the results:
A. Dogs

Stride length (m), x   Speed (m/s), y
1.5                    3.7
1.7                    4.4
2.0                    4.8
2.4                    7.1
2.7                    7.7
3.0                    9.1
3.2                    8.8
3.5                    9.9

B. Camels

Stride length (m), x   Speed (m/s), y
2.5                    2.3
3.0                    3.9
3.2                    4.4
3.4                    5.0
3.5                    5.5
3.8                    6.2
4.0                    7.1
4.2                    7.6
