COLLEGE OF NURSING
and ALLIED HEALTH
SCIENCES (CONAHS)
COURSE MODULE IN
MATHEMATICS
IN THE MODERN
WORLD
1st Semester; AY 2023-2024
COURSE FACILITATOR: JOEMAR A. SALIMBOT, REE, RME, LPT, MAT
2
FB/MESSENGER: Joemar Salimbot
Email Address: joem2968@gmail.com
Phone No.: 09652049674
This document is a property of NONESCOST Module 2 | Page 1
Unauthorized copying and / or editing is prohibited. (For Classroom Use Only) Prepared by: Cabahaga
/Toledo
LESSON
This lesson covers strategies for solving real-life mathematical problems, with emphasis on inductive
reasoning, deductive reasoning, and Polya’s four-step problem-solving strategy.
LEARNING OUTCOMES
BrainFans.com
The image above is an example of an exciting problem that we might be challenged to solve.
As we work through each problem, our logical thinking and creativity skills gradually develop.
Most occupations require good problem-solving skills (Aufmann, 2018). For instance, architects and
engineers must solve many complicated problems as they design and construct modern buildings that are
aesthetically pleasing, functional, and compliant with stringent safety requirements.
Two goals of this lesson are to help you become a better problem solver and to demonstrate that problem
solving can be an enjoyable experience. After all, it is something we do every day.
Inductive Reasoning
The type of reasoning that forms a conclusion based on the examination of specific examples is called
inductive reasoning. The conclusion formed by using inductive reasoning is often called a conjecture, since it may
or may not be correct (Aufmann, 2018). Put another way, inductive reasoning uses specific examples to reach a
general conclusion (Baltazar, 2018).
When you examine a list of numbers and predict the next number in the list according to some pattern you
have observed, you are using inductive reasoning.
Example 1:
Use inductive reasoning to predict the next number in each of the following lists.
Solution:
a. Each successive number is 3 larger than the preceding number. Thus, we predict that the next
number in the list is 3 larger than 15, which is 18.
b. The first two numbers differ by 2. The second and the third numbers differ by 3. It appears that
the difference between any two numbers is always 1 more than the preceding difference.
Since 10 and 15 differ by 5, we predict that the next number in the list will be 6 larger than 15,
which is 21.
Use inductive reasoning to predict the next number in each of the following lists.
Inductive reasoning is not used just to predict the next number in a list. In Example 2, we use inductive
reasoning to make a conjecture about an arithmetic procedure.
Consider the following procedure: Pick a number. Multiply the number by 8, add 6 to the product, divide
the sum by 2, and subtract 3.
Complete the above procedure for several different numbers. Use inductive reasoning to make a conjecture
about the relationship between the size of the resulting number and the size of the original number.
Solution:
Suppose we pick 5 as our original number. Then the procedure would produce the following results:
We started with 5 and followed the procedure to produce 20. Starting with 6 as our original number
produces a final result of 24. Starting with 10 produces a final result of 40. Starting with 100 produces a final result
of 400. In each of these cases the resulting number is four times the original number. We conjecture that following
the given procedure produces a number that is four times the original number.
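The conjecture can be tested on more starting numbers with a short script. This is a minimal sketch; the function name `procedure` is chosen here for illustration. Agreement on many inputs supports the conjecture but, like any inductive check, does not prove it.

```python
def procedure(n):
    """Multiply by 8, add 6, divide the sum by 2, then subtract 3."""
    return (n * 8 + 6) / 2 - 3

# Try the starting numbers used in the solution above.
for n in [5, 6, 10, 100]:
    print(n, "->", procedure(n))

# Check the conjecture "result = 4 x original" on a wider range of integers.
print(all(procedure(n) == 4 * n for n in range(-50, 51)))
```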
Consider the following procedure: Pick a number. Multiply the number by 9, add 15 to the product, divide
the sum by 3, and subtract 5.
Complete the above procedure for several different numbers. Use inductive reasoning to make a
conjecture about the relationship between the size of the resulting number and the size of the original number.
Scientists often use inductive reasoning. For instance, Galileo Galilei (1564–1642) used inductive reasoning
to discover that the time required for a pendulum to complete one swing, called the period of the pendulum,
depends on the length of the pendulum. Galileo did not have a clock, so he measured the periods of
pendulums in “heartbeats.” The following table shows some results obtained for pendulums of various lengths. For the
sake of convenience, a length of 10 inches has been designated as 1 unit.
Use the data in the table and inductive reasoning to answer each of the following questions.
Solution:
a. In the table, each pendulum has a period that is the square root of its length. Thus we
conjecture that a pendulum with a length of 49 units will have a period of 7 heartbeats.
b. In the table, a pendulum with a length of 4 units has a period that is twice that of a pendulum
with a length of 1 unit. A pendulum with a length of 16 units has a period that is twice that of a
pendulum with a length of 4 units. It appears that quadrupling the length of a pendulum
doubles its period.
Conclusions based on inductive reasoning may be incorrect. As an illustration, consider the circles shown
below. For each circle, all possible line segments have been drawn to connect each dot on the circle with all the
other dots on the circle.
For each circle, count the number of regions formed by the line segments that connect the dots on the circle.
Your results should agree with the results in the following table.
There appears to be a pattern. Each additional dot seems to double the number of regions. Guess the
maximum number of regions you expect for a circle with six dots. Check your guess by counting the maximum
number of regions formed by the line segments that connect six dots on a large circle. Your drawing will show that
for six dots, the maximum number of regions is 31 (see the figure below), not 32 as you may have guessed. With
seven dots the maximum number of regions is 57. This is a good example to keep in mind. Just because a pattern
holds true for a few cases, it does not mean the pattern will continue. When you use inductive reasoning, you have
no guarantee that your conclusion is correct.
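The counts quoted above can be reproduced with a known closed form for this problem, which is not stated in the lesson and is used here as an outside fact: with n dots in general position, the maximum number of regions is 1 + C(n, 2) + C(n, 4). The sketch below prints that count next to the "doubling" guess 2^(n − 1); the function name `max_regions` is illustrative.

```python
from math import comb

def max_regions(dots):
    """Maximum regions formed by joining `dots` points on a circle."""
    return 1 + comb(dots, 2) + comb(dots, 4)

# Compare the true maximum with the doubling conjecture.
for dots in range(1, 8):
    print(dots, max_regions(dots), 2 ** (dots - 1))
```

The two columns agree for one through five dots and part ways at six dots, which is exactly where the conjecture fails.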
Use deductive reasoning to show that the following procedure produces a number that is three
times the original number.
Procedure: Pick a number. Multiply the number by 6, add 10 to the product, divide the sum by 2,
and subtract 5. Hint: Let n represent the original number.
Logic Puzzles
Logic puzzles, similar to the one in example 5, can be solved by using deductive reasoning and a chart that
enables us to display the given information in a visual manner.
Example # 5:
Each of the four friends: person 1, person 2, person 3, and person 4, has a different pet (fish, cat,
dog, and snake). From the following clues, determine the pet of each individual:
1. Person 2 is older than one friend who owns the cat and younger than one friend who owns the
dog.
2. Person 3 and one friend who owns the snake are both of the same age and are the youngest
members of their group.
3. Person 1 is older than one friend who owns the fish.
Solution:
From Clue 1, Person 2 owns neither the cat nor the dog. In the following chart, write X1 (which stands for
“ruled out by clue 1”) in the cat and dog columns for Person 2.
From Clue 2, Person 3 is one of the youngest, so Person 3 owns neither the snake nor the dog (by Clue 1, the dog
owner is older than Person 2). And since Person 2 is not the youngest (Clue 1), Person 2 does not own the snake
either. Write X2 (ruled out by clue 2) in the snake column for Person 3 and X1 in the snake column for Person 2.
There are now Xs under three pets in Person 2’s row; therefore Person 2 owns the fish. Put a check ( / ) in that
box, which means Person 2’s pet is the fish. So Person 1, Person 3, and Person 4 do not own the fish.
From Clue 3, Person 1 is older than Person 2, the fish owner. The cat owner is younger than Person 2 (Clue 1) and
the snake owner is among the youngest (Clue 2), so Person 1 owns neither of them; hence Person 1 owns the dog.
Write X3 (ruled out by clue 3) in the cat and snake columns for Person 1. There are now Xs in the snake column
for Person 1, Person 2, and Person 3; therefore Person 4 owns the snake. Put a check in that box. Write X3 in the
cat column for Person 4; hence Person 3 owns the cat. Put a check in that box.
Thus, Person 2 owns the fish, Person 1 owns the dog, Person 4 owns the snake, and Person 3 owns the cat.
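The chart solution can be cross-checked by brute force. The sketch below is one possible encoding of the three clues and rests on stated assumptions: ages are modeled as ranks 0 to 3 with ties allowed, and "youngest" means having the minimum rank. All names (`satisfies_clues`, `solve`) are illustrative.

```python
from itertools import permutations, product

PEOPLE = ["Person 1", "Person 2", "Person 3", "Person 4"]
PETS = ["fish", "cat", "dog", "snake"]

def satisfies_clues(pet_of, age):
    """Return True if a pet assignment and an age ranking fit all three clues."""
    owner = {pet: person for person, pet in pet_of.items()}
    youngest = min(age.values())
    # Clue 1: Person 2 is older than the cat owner and younger than the dog owner.
    clue1 = age[owner["cat"]] < age["Person 2"] < age[owner["dog"]]
    # Clue 2: Person 3 and the snake owner (a different person) are the youngest.
    clue2 = (owner["snake"] != "Person 3"
             and age["Person 3"] == age[owner["snake"]] == youngest)
    # Clue 3: Person 1 is older than the fish owner.
    clue3 = age["Person 1"] > age[owner["fish"]]
    return clue1 and clue2 and clue3

def solve():
    """Collect every pet assignment that works for at least one age ranking."""
    solutions = set()
    for pets in permutations(PETS):
        pet_of = dict(zip(PEOPLE, pets))
        # Only the relative order of ages matters, so ranks 0..3 are enough.
        for ranks in product(range(4), repeat=4):
            if satisfies_clues(pet_of, dict(zip(PEOPLE, ranks))):
                solutions.add(pets)
                break
    return [dict(zip(PEOPLE, pets)) for pets in solutions]

print(solve())
```

Under these assumptions the search finds a single assignment, the same one reached with the chart.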
In example 7, we will analyze arguments to determine whether they use inductive or deductive reasoning.
Example # 7:
One of the foremost recent mathematicians to make a study of problem solving was George
Polya (1887 – 1985). George Polya was born in Hungary and moved to the United States in 1940. The
basic problem-solving strategy that Polya advocated consisted of the following four steps:
1. Understand the problem.
2. Devise a plan.
3. Carry out the plan.
4. Review the solution.
Polya’s four steps are deceptively simple. To become a problem solver, it helps to examine each of
these steps and determine what is involved.
I. Understand the Problem
This part of Polya’s four-step strategy is often overlooked. You must have a clear
understanding of the problem. To help you focus on understanding the problem, consider
the following questions:
1. Can you restate the problem in your own words?
2. Can you determine what is known about these types of
problems?
3. Is there missing information that, if known, would allow you
to solve the problem?
4. Is there extraneous information that is not needed to solve
the problem?
5. What is the goal?
Example # 1:
Consider the city map shown below. A person wishes to walk along the streets
from point A to point B. How many direct routes can the person take?
City Map
Solution:
Understand the Problem. We would not be able to answer the question if a person
retraced the path or traveled away from point B. Thus we assume that on a direct route,
the person always travels along a street in a direction that gets one closer to point B.
Devise a Plan. The city map above has many extraneous details. Thus we make a diagram
that allows us to concentrate on the essential information. See the figure below.
Because there are many routes, we consider the similar but simpler diagrams shown
below. The number at each street intersection represents the number of routes
from point A to that particular intersection.
Take note:
The strategy of working a similar but simpler problem is an important problem-
solving strategy that can be used to solve many problems.
Look for patterns. It appears that the number of routes to an intersection is the
sum of the number of routes to the adjacent intersection to its left and the number of
routes to the intersection directly above. For instance, the number of routes to the
intersection labeled 6 is the sum of the number of routes to the intersection to its left,
which is 3, and the number of routes to the intersection directly above, which is also 3.
Carry Out the Plan. Using the pattern discovered above, we see from the figure below that
the number of routes from point A to point B is 20 + 15 = 35.
Review the Solution. Ask yourself whether a result of 35 seems reasonable. If you were
required to draw each route, could you devise a scheme that would enable you to draw
each route without missing a route or duplicating a route?
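The "left plus above" pattern can be written as a small table-filling routine. This is a sketch under an assumption: the street grid is taken to be 4 blocks across and 3 blocks down, a size chosen here because it reproduces the 20 + 15 = 35 count (the map itself is not reproduced in this text). The function name `count_routes` is illustrative.

```python
def count_routes(blocks_across, blocks_down):
    """Count direct routes from the top-left corner A to the bottom-right corner B."""
    # routes[r][c] holds the number of direct routes from A to intersection (r, c).
    # Intersections on the top row and left column can be reached in only one way.
    routes = [[1] * (blocks_across + 1) for _ in range(blocks_down + 1)]
    for r in range(1, blocks_down + 1):
        for c in range(1, blocks_across + 1):
            # Routes arrive either from the left neighbor or from directly above.
            routes[r][c] = routes[r][c - 1] + routes[r - 1][c]
    return routes[blocks_down][blocks_across]

print(count_routes(2, 2))  # the intersection labeled 6 in the simpler diagram
print(count_routes(4, 3))  # the assumed full grid
```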
Example # 2:
Consider the street map below. A person wishes to walk directly from point A to
point B. How many different routes can a person take if one wants to go past Starbucks on
Third Avenue?
Street Map
Solution:
Understand the problem. There are many different orders. The team may have
won two straight games and lost the last two (WWLL). Or maybe they lost the first two
games and won the last two (LLWW). Of course there are other possibilities, such as
WLWL.
Devise a Plan. We will make an organized list of all the possible orders. An
organized list is a list that is produced using a system that ensures that each of the
different orders will be listed once and only once.
Carry Out the Plan. Each entry in our list must contain two Ws and two Ls. We will
use a strategy that makes sure each order is considered, with no duplications. One such
strategy is to always write a W unless doing so will produce too many Ws or a duplicate of
one of the previous orders. If it is not possible to write a W, then and only then do we
write an L. This strategy produces the six different orders shown below.
Review the Solution. We have made an organized list. The list has no duplicates
and the list considers all possibilities, so we are confident that there are six different
orders in which a baseball team can win exactly two out of four games.
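An organized list of this kind can also be generated mechanically. The sketch below chooses which games are wins and fills the rest with losses; the function name `win_loss_orders` is illustrative.

```python
from itertools import combinations

def win_loss_orders(games, wins):
    """List every order of W and L with exactly `wins` wins in `games` games."""
    orders = []
    # Each combination picks the positions of the wins; combinations() never
    # repeats a selection, so no order is duplicated.
    for win_positions in combinations(range(games), wins):
        orders.append("".join("W" if i in win_positions else "L" for i in range(games)))
    return orders

orders = win_loss_orders(4, 2)
print(orders, len(orders))
```

Because the win positions are generated from earliest to latest, the list comes out in the same "write a W whenever possible" order described above.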
Example # 5:
Finding a Day of the Week
In 2017, Venus’ birthday fell on a Saturday, June 3. On what day of the week does
Venus’ birthday fall in 2020? Note that the year 2020 is a leap year.
Solution:
A year has 365 days, except a leap year, which has one extra day. How many days are there
from June 3, 2017 to June 3, 2020?
Number of days:
June 3, 2017 to June 3, 2018: 365 days
June 3, 2018 to June 3, 2019: 365 days
June 3, 2019 to June 3, 2020: 366 days (includes February 29, 2020)
Total: 1,096 days
Dividing 1,096 days by 7 gives 156 weeks with a remainder of 4 days, so we write
1,096 ≡ 4 (mod 7). Since a week is a cycle, any multiple of 7 days past a given day falls on
the same day of the week. Thus the 1,092nd day after June 3, 2017 is also a Saturday, because
1,092 is a multiple of 7. The 1,096th day, four days later, is a
Wednesday. Thus, June 3, 2020 falls on a Wednesday.
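The day count and the remainder can be confirmed with Python's standard `datetime` module. This is only a check of the arithmetic above; the variable names are illustrative.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

start = date(2017, 6, 3)
end = date(2020, 6, 3)

days_elapsed = (end - start).days   # total days between the two birthdays
remainder = days_elapsed % 7        # days left over after whole weeks

# date.weekday() numbers the days from Monday = 0 to Sunday = 6.
print(days_elapsed, remainder)
print(DAY_NAMES[start.weekday()], DAY_NAMES[end.weekday()])
```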
KenKen puzzles are another interesting kind of puzzle. Search the net to learn more about them, then solve
the given KenKen puzzles below:
A. 4x4 B. 5x5
3 5 1 4 2
2 3 5 1 4
0
4 2 3 5 1
1
5 1 4 2 3
1 4 2 3 5
Activity 2 – 3
Mr. Cruz has chickens and cows in his backyard. All in all there are thirty-nine (39) animals
and a total of one hundred (100) legs. How many cows and chickens are there?
2
3 HOURS
MEASURES OF CENTRAL TENDENCY
This lesson presents the definition, functions, types, and computation of the measures of central
tendency (mean, median, and mode) for ungrouped data.
CAREER AVERAGES
                        Michael Jordan    Kobe Bryant
Regular season games    1,072             1,346
Points                  30.1              25.0
Rebounds                6.2               5.2
Assists                 5.3               4.7
Source: https://www.basketball-reference.com
What is your favorite sport? Who is your favorite player? What are your player’s statistics?
This topic helps you to recall and even develop mastery on computations about the different
measures of central tendency. It will also discuss the importance, uses and applications of these
measures.
MEAN
The mean is equal to the sum of all the values divided by the total number of values. It is
also referred to as the arithmetic mean or simple average. The mean is widely used in descriptive and
inferential statistics because it represents the whole distribution (Vizcarra, et. al. 2012).
The arithmetic mean of a finite set of measurements is equal to the sum of the measurements divided
by the total number of measurements (Montero-Galliguez, et. al., 2016).
The uses of the mean according to Nocon are:
1. for interval and ratio measurements;
2. if higher statistical computations are wanted;
3. if there are no extreme values in a distribution since it is easily affected by extremely high or
extremely low scores. Thus, the distribution is approximately normal and;
4. when the greatest reliability of the measure of central tendency is wanted since its
computations include all the given values.
The mean is classified as either a population mean or a sample mean.
The sample mean is the sum of the observations in a sample divided by the sample size n (Broto, 2012).
The formula is:
\[ \bar{x} = \frac{\sum x}{n} \]
Where:
\(\bar{x}\) = Sample Mean
\(\sum x\) = Sum of the Sample Observations
\(n\) = Sample Size
Example 2. The following are the ages of a sample of 9 children in a slum area.
x: 9, 8, 1, 3, 4, 5, 6, 7, 2
Σx = 45
n = 9
\[ \bar{x} = \frac{\sum x}{n} = \frac{45}{9} = 5 \]
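The same computation in Python, as a minimal sketch using the nine ages from this example (variable names are illustrative):

```python
ages = [9, 8, 1, 3, 4, 5, 6, 7, 2]

total = sum(ages)         # sum of the sample observations
n = len(ages)             # sample size
sample_mean = total / n   # sum divided by sample size

print(total, n, sample_mean)
```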
Example 2. Given the scores below, find the value of the median.
X: 9, 8, 7, 6, 5, 4, 3, 2, 1, 1
Solution: Since there are two middle values, 5 and 4, the median is their average:
\[ Md = \frac{5 + 4}{2} = \frac{9}{2} = 4.5 \]
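A short sketch of the median rule used here: sort the scores, take the middle value when the count is odd, and average the two middle values when it is even. The function name `median` is illustrative.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 1]
print(median(scores))
```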
MODE
The mode is the item or value in a distribution with the highest frequency or most number of cases
(Nocon, et. al., 2000).
The mode (Mo) is the value which occurs most often or with greatest frequency (Broto, 2012).
The mode is the value which occurs most frequently in a set of observations. It is considered as the least
reliable measure of location (Vizcarra, et. al. 2012).
The mode is the measurement which appears most frequently. There may be no mode if no value appears
more than any other (Montero-Galliguez, et. al., 2016).
The uses of the mode according to Nocon are:
1. for nominal or categorical data;
2. if the most popular or most typical case or value in a distribution is wanted and;
3. if a rough or quick estimate of a central value is wanted.
Example
1. The following are the scores of 9 students in a spelling test of 10 items. Determine the mode.
X
0
9
9
7
6
5
3
2
1
Solution: Mo = 9, since 9 occurs twice. The data with one mode is called unimodal.
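The mode can be found by counting how often each value occurs. The sketch below uses the scores exactly as listed in this example and follows the convention quoted from Montero-Galliguez that there is no mode when no value appears more often than any other. The function name `modes` is illustrative, and it returns a list so that data with more than one mode can be reported.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or an empty list if there is no mode."""
    counts = Counter(values)
    # If several distinct values all occur equally often, none stands out.
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

scores = [0, 9, 9, 7, 6, 5, 3, 2, 1]
print(modes(scores))
```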
According to Montero-Galliguez the properties of the measures of central tendency of finite data
sets can be summarized using the table below.
I. Mean
1. Calculate the sample mean value of the following set of scores obtained by 5 students in a 10-
item test.
1, 3, 5, 7, 9.
2. Calculate the population mean (μ) of the heights (ft.) of 5 children.
Child Height (ft.)
1 3.48
2 4.50
3 4.55
4 5.23
5 5.12
3. Calculate the sample mean value of the given scores of students in statistics.
50, 42, 78, 50, 61, 40, 50, 68, 89, & 90.
4. Given the scores 86, 80, 75, 78, and 86, find the population mean value.
5. Find the sample mean value of the set of observations 3, 7, 12, 5, 7, and 10.
II. Median
1. Find the median value of the following set of scores obtained by 5
students. 1, 3, 5, 7, 9.
4. Given the scores 86, 80, 75, 78, and 86, find the median value.
5. Find the median value of the set of observations 3, 7, 12, 5, 7, and 10.
III. Mode
1. Determine the mode value of the following set of scores obtained by 5 students. 1, 3,
5, 7, 9.
2. Determine the mode value in height (ft.) of 5 children.
Child Height (ft.)
1 3.48
2 4.50
3 4.55
4 5.23
5 5.12
4. Given the scores 86, 80, 75, 78, and 86, find the mode value.
5. Find the mode value of the set of observations 3, 7, 12, 5, 7, and 10.
2. Find the median for the set of measurements 9, 2, 7, 11, 14, and 6.
3. Seven students were asked how many text messages they sent on a given day. Their
answers were 10, 14, 5, 20, 8, 30, and 15. Find the median.
5. Given the distribution 19, 20, 20, 20, 18, 15, 13, and 9, determine the values for mean,
median & mode.
3
6 HOURS
MEASURES OF VARIATION
This lesson presents the definition, functions, types and computation of measures of
variation or dispersion.
Events of nature always vary from time to time. People keep on changing places, motion, physical
appearance, skin reaction to different chemicals, height, weight, hair color, eye color, ideas, and even
values in life.
In this lesson you will be exposed to the concept of variation. What is it? How to use it? When to
use it? Where to use it?
Range (R)
This is the simplest form of measuring variation of a distribution. To get the range, subtract the
lowest score or observation from the highest score.
Example 1:
Solution:
Highest Age 56
Lowest Age 25
Therefore, the range of their ages is 56 − 25 = 31. If the size of the population or sample is large, the range
is not a good measure of variation because it considers only the highest and the lowest values and does
not tell anything about the values between them. If one is interested in the position of each
observation
relative to the mean of the set of data, other measures of variation are necessary. The mean absolute
deviation can be applied.
To find the mean absolute deviation (MAD), subtract the mean from each raw score, take the
absolute value of each difference, and add the results. This sum is called the sum of the absolute deviations
from the mean. Next, divide this sum by N, the total number of cases. In symbols,
\[ MAD = \frac{\sum |x - \bar{x}|}{N} \]
Example 2:
Solution:
Ages are 34, 35, 45, 56, 32, 25, and 40.
Mean Age:
\[ \bar{x} = \frac{34 + 35 + 45 + 56 + 32 + 25 + 40}{7} = \frac{267}{7} = 38.14 \]
x       x − x̄      |x − x̄|
34      −4.14      4.14
35      −3.14      3.14
45      6.86       6.86
56      17.86      17.86
32      −6.14      6.14
25      −13.14     13.14
40      1.86       1.86
TOTAL              53.14
\[ MAD = \frac{53.14}{7} = 7.59 \]
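The table can be reproduced with a few lines of Python. This is a minimal sketch using the seven ages from this example; the variable names are illustrative.

```python
ages = [34, 35, 45, 56, 32, 25, 40]

mean_age = sum(ages) / len(ages)
# Absolute deviation of each age from the mean.
abs_devs = [abs(x - mean_age) for x in ages]
mad = sum(abs_devs) / len(ages)

print(round(mean_age, 2), round(sum(abs_devs), 2), round(mad, 2))
```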
Variance
Variance is another measure of variation which can be used instead of the range. The variance
considers the deviation of each observation from the mean. To obtain the variance of the distribution,
compute the deviation from the mean of each raw score. Then, square the deviations from the mean and
add them. Finally, divide the resulting sum by N, the total number of cases, for the population variance,
or by N − 1 for the sample variance.
Except when specified that the population variance is to be used, we always use the sample
variance formula in the examples and exercises throughout the module.
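The steps for both versions of the variance can be sketched as follows. The data are an assumption made for illustration: the seven ages from the mean absolute deviation example (34, 35, 45, 56, 32, 25, 40). The function names are illustrative, and the sample version divides by one less than the number of cases.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by N - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)

ages = [34, 35, 45, 56, 32, 25, 40]  # assumed illustrative data
print(round(population_variance(ages), 2), round(sample_variance(ages), 2))
```

The sample variance is always the larger of the two for the same data, because the same sum is divided by a smaller number.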
Example 3. Compute the population and sample variances of the data below.
\[ \bar{x} = \frac{10{,}430}{115} = 90.70 \]
\[ \bar{x} = \frac{267}{7} = 38.14 \]
Standard Deviation
The standard deviation, σ for a population or s for a sample, is the square root of the value of
the variance. In symbols,
A. Population Standard Deviation (σ)
\[ \sigma = \sqrt{\sigma_N^2} \]
B. Sample Standard Deviation (s)
\[ s = \sqrt{s_{N-1}^2} \]
Unless specified, the sample standard deviation is used in all the examples and exercises
throughout the module.
To interpret the standard deviation: the larger the value, the greater the dispersion, which denotes
heterogeneous data. A smaller value means that the scores are more homogeneous.
Example 5:
Compute the population and sample standard deviation of the data below.
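\[ \bar{x} = \frac{10{,}430}{115} = 90.70 \]
Solution:
a. Population Variance
\[ \sigma_N^2 = 50.91 \]
Population standard deviation is \( \sigma = \sqrt{50.91} = 7.14 \)
b. Sample Variance
\[ s_{N-1}^2 = 51.35 \]
Sample standard deviation is \( s = \sqrt{51.35} = 7.17 \)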
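Since each standard deviation is just the square root of the matching variance, the last step of this example can be checked directly. The sketch starts from the two variances reported above, because the underlying data table is not reproduced in this text; variable names are illustrative.

```python
from math import sqrt

population_variance = 50.91  # value reported in the example
sample_variance = 51.35      # value reported in the example

population_sd = sqrt(population_variance)
sample_sd = sqrt(sample_variance)

print(round(population_sd, 2), round(sample_sd, 2))
```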
Example 6: Find the standard deviation of the distribution in the table below.
\[ \bar{x} = \frac{14{,}410}{355} = 40.59 \]
Coefficient of Variation
When it is necessary to compare the variability of two or more groups, the task is easy if the
means are the same. For example, we can easily compare which group is more varied in
height between the following groups:
Clearly one can see that Group 2 is more varied because it has a higher standard deviation.
The task becomes more difficult if the means are not equal and the units are different, such as
comparing the weights of the groups belonging to different age brackets or different gender.
How can we compare the variability of the weights of 9 girls, with a mean weight of 100
pounds and a standard deviation of 5, with that of the weights of 12 boys, with a mean weight
of 160 pounds and a standard deviation of 8? A statistic called the coefficient of
variation helps us answer the question. The formula is:
\[ CV = \frac{s}{\bar{x}} \times 100\% \]
Since s and \(\bar{x}\) have the same units, the units cancel out, so CV has no unit.
\[ CV_{Male} = \frac{10}{162} \times 100\% = 6.17\% \]
\[ CV_{Female} = \frac{4}{148} \times 100\% = 2.70\% \]
Comparing the relative variations in height of the male and female students, it can be seen
that the male students have a higher coefficient of variation in height than the female students.
Thus, the male students’ heights are more varied.
Example 2
Compare the variability of the height and weight of the students given the following data:
                    Mean      s        CV
Height in cm        168 cm    12 cm    7.14%
Weight in pounds    200 lb    20 lb    10.00%
From the results, it can be seen that the weight of the students is more varied than the
height.
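The coefficients of variation quoted in these examples can be recomputed with a one-line function. This is a minimal sketch; the function name `coefficient_of_variation` is illustrative, and the inputs are the standard deviations and means given above.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100

# Heights of male and female students.
print(round(coefficient_of_variation(10, 162), 2))
print(round(coefficient_of_variation(4, 148), 2))

# Height (cm) versus weight (lb) of the same students.
print(round(coefficient_of_variation(12, 168), 2))
print(round(coefficient_of_variation(20, 200), 2))
```

Because the result is a percentage with no unit, values measured in centimeters and in pounds can be compared directly.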
Quartile Deviation
Quartile deviation is another way of determining the spread of a distribution in terms of quartiles.
The following is the quartile deviation formula:
\[ QD = \frac{Q_3 - Q_1}{2} \]
Example:
23 25 25 30 35 39 40 44 47 51 60
\[ QD = \frac{47 - 25}{2} = \frac{22}{2} = 11 \]
Hence, the QD is 11.
Example:
1.2 1.4 1.6 2.2 2.5 2.8 3.0 3.0 3.1 4.4
Solution:
\[ Q_3 = \frac{3N}{4} = \frac{3(10)}{4} = 7.5\text{th item} \]
that is, 3.0 years (the value is midway between the 7th and 8th items, both of which are 3.0 in this example).
\[ Q_1 = \frac{N}{4} = \frac{10}{4} = 2.5\text{th item} \]
that is, 1.5 years (since the number of cases is even, the value midway between the 2nd and the 3rd items,
which is 1.5, is taken).
\[ QD = \frac{3.0 - 1.5}{2} = \frac{1.5}{2} = 0.75 \]
Hence, the QD is 0.75.
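Once the two quartiles are known, the quartile deviation is a single subtraction and division. The sketch below takes the quartiles as inputs rather than computing them, because conventions for locating quartiles differ between textbooks and software; the function name `quartile_deviation` is illustrative.

```python
def quartile_deviation(q1, q3):
    """Half the distance between the first and third quartiles."""
    return (q3 - q1) / 2

print(quartile_deviation(25, 47))    # first example
print(quartile_deviation(1.5, 3.0))  # second example
```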
Example:
Find the QD of the scores below.
Example:
The following data represent the scores of students in the final examination in Physics.
100 100 111 111 112 120 121 122 123 175
171 130 132 133 135 140 145 145 146 150
150 155 160 164 165 165 170 180 175
I. The number of incorrect answers on a true-false Mathematics Proficiency Test for a random
sample of 20 students was recorded as follows:
3 3 5 6 1 2 1 4 4 5
1 3 3 2 5 4 4 5 1 2
II. The data below are the volume of different materials used by the Physics students in a certain
laboratory activity to determine the density of the materials using the displacement method.
III. Find the population mean, range, mean absolute deviation, and population variance of the
following set of data:
12 13 45 45 45 12 12 10 10 13 15
IV. Find the population mean, range, mean absolute deviation, and population variance of the
following set of data:
25 35 15 20 40 10 5 17 18 24 19
12 23 56 61 42 21 10 32 26 28 54
95 43 23 12 25 34 56 64 36 39 12
VII. Construct the frequency distribution table then find the sample variance and sample standard
deviation of the following scores:
10 20 50 23 21 12 13 15 24 26 25
23 24 25 28 24 56 20 10 32 30 31
13 25 65 45 51 42 35 65 52 10
12 13 45 45 45 12 12 10 10 13 15
4
3 HOURS
LINEAR REGRESSION & CORRELATION
This lesson presents the definition, functions, and applications of linear regression & correlation.
1. Use the methods of linear regression and correlations to predict the value of a
variable given certain conditions, and;
In this lesson you will be exposed to linear regression and correlation as tools to be used when
we relate variables or predict that a certain event will happen.
When performing research studies, scientists often wish to know whether two variables
are related. If the variables are determined to be related, a scientist may then wish to find an equation
that can be used to model the relationship. For instance, a geologist might want to know whether there
is a relationship between the duration of an eruption of a geyser and the time between eruptions. A first
step in this determination is to collect some data. Data involving two variables are called bivariate data.
The table below gives bivariate data showing the time between two eruptions and the duration of the second
eruption for 10 eruptions of the geyser Old Faithful.
Time between eruptions (in seconds), x:  272  227  237  238  203  270  218  226  250  245
Duration of eruption (in seconds), y:     89   79   83   82   81   85   78   81   85   79
Once the data are collected, a scatter diagram or scatter plot can be drawn, as shown in the
figure below.
One way for the geologist to create a model of the relationship between the time between
two eruptions and the duration of the second eruption is to find a line that approximates the data points
plotted in the scatter plot. There are many such lines that can be drawn, as shown in figure above.
Of all the possible lines that can be drawn, the one that is usually of most interest is called
the line of best fit or the least-squares regression line. The least-squares regression line is the line that
fits
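the data better than any other line that might be drawn. The least-squares regression line is defined as
follows: for a set of bivariate data, it is the line that minimizes the sum of the squares of the vertical
deviations from each data point to the line.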
In this definition, the phrase “minimizes the sum of the squares of the vertical deviations” is
somewhat daunting. Referring to figure below, it means that of all the lines possible, the linear equation
that minimizes the sum
\[ d_1^2 + d_2^2 + d_3^2 + d_4^2 + d_5^2 + d_6^2 + d_7^2 + d_8^2 + d_9^2 + d_{10}^2 \]
is the equation of the line of best fit. In this expression, each \(d_n\) represents the vertical distance from data point n
to the line.
Applying some techniques from calculus, it is possible to find a formula for the least-squares line.
To apply this formula to the data for Old Faithful, we first find the value of each summation.
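The summations and the resulting line can be computed as sketched below. The sketch assumes the standard least-squares formulas for a line y = ax + b, namely a = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and b = ȳ − a·x̄, and applies them to the Old Faithful data in the table; the function name `least_squares_line` is illustrative.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return slope, intercept

time_between = [272, 227, 237, 238, 203, 270, 218, 226, 250, 245]  # x, seconds
duration = [89, 79, 83, 82, 81, 85, 78, 81, 85, 79]                # y, seconds

slope, intercept = least_squares_line(time_between, duration)
print(round(slope, 4), round(intercept, 2))

# Estimated duration when the time between eruptions is 200 seconds.
print(round(slope * 200 + intercept, 1))
```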
The graph of the regression equation and a scatter plot of the data are shown below.
We can now use the regression equation to estimate the duration of an eruption given the
time between eruptions. For instance, if the time between two eruptions is 200 seconds, then the
estimated duration of the second eruption is
If the linear correlation coefficient r is positive, the relationship between the variables has a
positive correlation. In this case, if one variable increases, the other variable also tends to increase. If r is
negative, the linear relationship between the variables has a negative correlation. In this case, if one
variable increases, the other variable tends to decrease.
Figure below shows some scatter diagrams along with the type of linear correlation that exists
between the x and y variables. The closer |r| is to 1, the stronger the linear relationship is between the
variables.
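\[ r = \frac{8(195.86) - (28.8)(52.1)}{\sqrt{8(106.72) - 829.44} \cdot \sqrt{8(362.25) - 2{,}714.41}} \]
\[ r = \frac{1{,}566.88 - 1{,}500.48}{\sqrt{853.76 - 829.44} \cdot \sqrt{2{,}898 - 2{,}714.41}} \]
\[ r = \frac{66.4}{\sqrt{24.32} \cdot \sqrt{183.59}} \]
\[ r = \frac{66.4}{4.93(13.55)} \]
\[ r = \frac{66.4}{66.80} \]
\[ r = 0.99 \]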
The value of r is 0.99 which means there is a linear correlation between stride length and speed of
an adult dinosaur. The strength of the relationship is strong positive which means that as the stride
length increases the speed of an adult dinosaur also increases.
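The hand computation above can be checked with a short function that follows the same pattern of sums. This is a sketch: the function name `correlation_coefficient` is illustrative, and the data are the eight stride length and speed pairs whose totals (Σx = 28.8, Σy = 52.1) appear in the computation.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

stride = [2.5, 3.0, 3.3, 3.5, 3.8, 4.0, 4.2, 4.5]  # stride length (m)
speed = [3.4, 4.9, 5.5, 6.6, 7.0, 7.7, 8.3, 8.7]   # speed (m/s)

r = correlation_coefficient(stride, speed)
print(round(r, 2))
```

Swapping the two lists in the call gives the same value, since the formula treats x and y symmetrically.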
3. Interchanging the variables in the ordered pairs does not change the value of r. Thus, the value of
r for the ordered pairs (x1, y1), (x2, y2), …, (xn, yn) is the same as the value of r for the ordered
pairs (y1, x1), (y2, x2), …, (yn, xn).
4. The value of r does not depend on the units used. You can change the units of a variable from, for
example, feet to inches, and the value of r will remain the same.
Use these exercises to review the lesson. You may work with a partner or in groups of three.
Part 1:
I. Prof. R. McNeill Alexander wanted to determine whether the stride length of a dinosaur, as shown
by its fossilized footprints, could be used to estimate the speed of the dinosaur. Stride length for
an animal is defined as the distance x from a particular point on a footprint to that same point on
the next footprint of the same foot. See the figure below.
Because dinosaurs are extinct, Alexander and fellow scientist A. S. Jayes carried out
experiments with many types of animals, including adult dinosaurs, dogs, camels,
ostriches, and elephants. Some of the results from these experiments are recorded in the
table below.
Stride length (m) 2.5 3.0 3.3 3.5 3.8 4.0 4.2 4.5
Speed (m/s) 3.4 4.9 5.5 6.6 7.0 7.7 8.3 8.7
b. Dogs
Stride length (m) 1.5 1.7 2.0 2.4 2.7 3.0 3.2 3.5
Speed (m/s) 3.7 4.4 4.8 7.1 7.7 9.1 8.8 9.9
c. Camels
Stride length (m) 2.5 3.0 3.2 3.4 3.5 3.8 4.0 4.2
Speed (m/s) 2.3 3.9 4.4 5.0 5.5 6.2 7.1 7.6
Part 2:
Find the linear correlation coefficient of the following and interpret the results:
A. Dogs
B. Camels