
Study Guide in Mathematics in the Modern World FM-AA -CIA-15 Rev. 0 10-July-2020

GE7 Mathematics in the Modern World


Module 4: Data Management

MODULE 4

MODULE OVERVIEW
This module consists of five lessons: Measures of Central Tendency, Measures of Dispersion, Measures of
Relative Position, Normal Distribution, and Regression and Correlation. Each lesson is designed as a
self-teaching guide. Definitions of terms and examples have been incorporated. Answering the problems in
"Your turn" will check your progress. You may compare your answers with the solutions provided in the later
part of this module; in that way you will be able to measure your achievement as well as the effectiveness of
the module. Exercises were prepared as your assignment to measure your understanding of the topics.

MODULE LEARNING OBJECTIVES

At the end of the module, you should be able to:


• Use a variety of statistical tools to process and manage numerical data
• Use the methods of linear regression and correlations to predict the value of a variable given certain
conditions
• Advocate the use of statistical data in making important decisions

LEARNING CONTENTS (MEASURES OF CENTRAL TENDENCY )


Introduction
Numerical data is everywhere, and every day more data is being generated. It is important for us to have a
working knowledge of basic statistical concepts and tools so that we can use this data correctly and optimally.
A lot of data is raw; that is, it has not yet been processed for use.

Discussion
Statistics involves the collection, organization, summarization, presentation, and interpretation of data. The
branch of statistics that involves the collection, organization, summarization, and presentation of data is
called descriptive statistics. The branch that interprets and draws conclusions from the data is called
inferential statistics.

Lesson 1: Measures of Central Tendency


A measure of central tendency is a summary measure that attempts to describe a whole set of
data with a single value that represents the middle or center of the data set. The most commonly used measures
of central tendency, or types of averages, are the arithmetic mean, the median, and the mode.

Arithmetic Mean
The arithmetic mean, or simply the mean, is the sum of the values of the observations in a data set
divided by the number of observations. The traditional symbol used to indicate a summation is the Greek
letter sigma, Σ. Thus, the notation Σx, called summation notation, denotes the sum of all the numbers in a given set.
The definition is the same for both the sample (a portion of the whole population) and the population (the
collection of all possible observations under a particular study), although we use a different symbol to refer to
each.
The symbol for the sample mean is x̄ (read "x bar"), and for the population mean it is the Greek letter mu (µ).
PANGASINAN STATE UNIVERSITY
numbers from smallest to largest gives

23, 46, 77, 89, 92, 108


The two middle numbers are 77 and 89. The mean of 77 and 89 is 83. Thus, 83 is the median of the data.
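
The ranking-and-middle-value procedure can be sketched in Python. This is only an illustration; the function name `median` is a choice made here, not something defined in the module. It sorts the data, takes the middle value when the count is odd, and averages the two middle values when the count is even.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # rank the data from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the single middle value
        return ordered[mid]
    # even count: the mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [23, 46, 77, 89, 92, 108]
print(median(data))  # 83.0
```
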

Your turn 2

Find the median of the data in the following:

a. A sample of senior citizens in Lingayen, Pangasinan receiving Social Security payments revealed these
monthly benefits : , , , , , , , .

b. The scores in a quiz of nine students in an MMW class are: 2, 4, 10, 7, 8, 0, 5, 8, and 2.

Mode
The mode is another type of average.

Mode
The mode is the value of the observation that appears most frequently.

Some lists of numbers do not have a mode. For instance, in the list 1, 6, 8, 10, 32, 15, 49, each number occurs
exactly once. No number occurs more often than the other numbers. Thus, there is no mode.
A list of numerical data can have more than one mode. For instance, in the list 4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7, the
numbers 2 and 9 each occur three times. Thus, 2 and 9 are both modes of the data.

Example 3 Find the mode of the data in the following lists.

a. 18, 15, 21, 16, 15, 14, 15, 21 b. 2, 5, 8, 9, 11, 4, 7, 23

Solution
a. In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than the other numbers. Thus 15 is
the mode.

b. Each number in the list 2, 5, 8, 9, 11, 4, 7, 23 occurs only once. No number occurs more often than the
others. Therefore, there is no mode.
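
Counting how often each value occurs can be sketched in Python as follows. The function name `modes` is an assumption made for this illustration, and so is the convention that a list in which every value occurs equally often is reported as having no mode (an empty result), which follows the wording used in this lesson.

```python
from collections import Counter


def modes(values):
    """Return the mode(s) of a non-empty list, or [] when there is no mode."""
    counts = Counter(values)                  # value -> number of occurrences
    highest = max(counts.values())
    # Assumed convention: if no value occurs more often than the others,
    # the list has no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([18, 15, 21, 16, 15, 14, 15, 21]))        # [15]
print(modes([2, 5, 8, 9, 11, 4, 7, 23]))              # []
print(modes([4, 2, 6, 2, 7, 9, 2, 4, 9, 8, 9, 7]))    # [2, 9]
```
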

Your turn 3 Find the mode of the data in the following lists.

a. 3, 3, 3, 4, 4, 4, 5, 5, 5, 8 b. 12, 34, 12, 71, 48, 93, 71, 12

The mean, median, and mode are all averages. However, they are generally not equal. The mean of a set of
data is the most sensitive of the averages. A change in any of the numbers changes the mean, and the mean can be
changed drastically by changing an extreme value.
In contrast, the median and the mode of a set of data are usually not changed by changing an extreme value.
When a data set has one or more extreme values that are very different from the majority of values, the mean
will not necessarily be a good indicator of an average value. In the following example, we compare the mean,
median, and mode for the salaries of five employees of a small company.
Salaries :
Courses        No. of Units    Final Grade
Math 112       3               2.5
English 101    6               2.0
PS 25          3               1.5
Fil 1          3               1.4
Chem 1         5               2.4
PE 1           2               1.1

4. A professor grades students on 4 tests, a term paper, and a final examination. Each test counts as 15% of
the course grade. The term paper counts as 20% of the course grade. The final examination counts as 20%
of the course grade. Alan has test scores of 80, 78, 92, and 84. Alan received an 84 on his term paper. His final
examination score was 88. Use the weighted mean formula to find Alan's average for the course. Hint: The
sum of all the weights is 100% = 1.

5. After 6 math tests, Zia has a mean score of 88. What score does Zia need on the next test to raise his
average (mean) to 90?

6. After 4 algebra tests, Alisa has a mean score of 82. One more 100-point test is to be given in this class. All
of the test scores are of equal importance. Is it possible for Alisa to raise her average (mean) to 90? Explain.

LEARNING CONTENTS (MEASURES OF DISPERSION)

Lesson 2: Measures of Dispersion

While the measures of central tendency are used to estimate "normal" values of a dataset, measures of
dispersion are important for describing the spread of the data, or its variation around a central value. Two
distinct samples may have the same mean or median, but completely different levels of variability, or vice
versa.
A measure of dispersion or variability tells us how much the observations spread out from the mean. The
higher the variability, the more dispersed are the observations; the lower it is, the more consistent are the
observations.

For instance, consider a soft-drink dispensing machine that should dispense 8 oz of your selection
into a cup. Table 2.1 shows data for two of these machines. The mean data value for each machine is 8 oz.

Table 2.1 Soda Dispensed (ounces)
Machine 1    Machine 2
9.52         8.01
6.41         7.99
10.07        7.95
5.85         8.03
8.15         8.02
x̄ = 8.0      x̄ = 8.0

However, look at the variation in data values for Machine 1. The quantity of soda dispensed is very
inconsistent: in some cases the soda overflows the cup, and in other cases too little soda is dispensed. The
machine obviously needs adjustment. Machine 2, on the other hand, is working just fine. The quantity
dispensed is very consistent, with little variation.

This example shows that average values do not reflect the spread or dispersion of data. To measure the
spread or dispersion of data, we must introduce statistical values known as the range, mean deviation,
standard deviation, and the variance.

The Range

The simplest measure of dispersion is the range. It is the difference between the largest and the
smallest values in a data set.

Range
Range = Largest value – Smallest value

Mean Deviation
A defect of the range is that it is based on only two values, the highest and the lowest; it does not take into
consideration all of the values. The mean deviation does. It measures the mean amount by which the values
in a population, or sample, vary from their mean. In terms of a definition: Mean Deviation is the arithmetic
mean of the absolute values of the deviations from the arithmetic mean.
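
As a rough illustration of the two definitions above, the following Python sketch computes the range and the mean deviation for the two sets of readings listed in Table 2.1. The function and variable names are assumptions introduced for this example only.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


# Readings (in ounces) as listed in Table 2.1
machine_1 = [9.52, 6.41, 10.07, 5.85, 8.15]
machine_2 = [8.01, 7.99, 7.95, 8.03, 8.02]

for name, data in (("Machine 1", machine_1), ("Machine 2", machine_2)):
    print(name, round(value_range(data), 2), round(mean_deviation(data), 3))
```

Both measures come out much larger for Machine 1 than for Machine 2, which agrees with the observation that Machine 1 dispenses inconsistently.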

Find the following for the amount in ounces dispensed by each machine in Table 2.1:
a. range
b. mean deviation

The Standard Deviation


The standard deviation of a set of numerical data makes use of the individual amount that each data value
deviates from the mean. These deviations, represented by (x − x̄), are positive when the data value x
is greater than the mean x̄ and are negative when x is less than the mean x̄. The sum of all the deviations
(x − x̄) is 0 for all sets of data. This is shown in Table 2.2 for the Machine 2 data of Table 2.1.

Table 2.2 Machine 2: Deviations from the Mean

x        x − x̄
8.01     8.01 − 8 = 0.01
7.99     7.99 − 8 = −0.01
7.95     7.95 − 8 = −0.05
8.03     8.03 − 8 = 0.03
8.02     8.02 − 8 = 0.02
         Sum of deviations = 0
Because the sum of all the deviations of the data values from the mean is always 0, we cannot use
the sum of the deviations as a measure of dispersion for a set of data. Instead, the standard deviation uses the
sum of the squares of the deviations.

Standard Deviation for Populations and Samples

If x1, x2, x3, …, xn is a population of n numbers with a mean of μ, then the standard deviation of the
population is

\[ \sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}} \]

If x1, x2, x3, …, xn is a sample of n numbers with a mean of x̄, then the standard deviation of the
sample is

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]

You may question why a denominator of n − 1 is used instead of n when we compute a sample standard
deviation. The reason is that a sample standard deviation is often used to estimate the population standard
deviation, and it can be shown mathematically that the use of n − 1 tends to yield better estimates.

Procedures for Computing a Standard Deviation


1. Determine the mean of the n numbers.
2. For each number, calculate the deviation (difference) between the number and the mean of the
numbers.
3. Calculate the square of each of the deviations and find the sum of these squared deviations.
4. If the data is a population, then divide the sum by n. If the data is a sample, then divide the sum by
n − 1.
5. Find the square root of the quotient in Step 4.

Example 2 The following numbers were obtained by sampling a population: 2, 4, 7, 12, 15.
Find the standard deviation of the sample.

Solution:
Step 1: Determine the mean.

x̄ = (2 + 4 + 7 + 12 + 15)/5 = 40/5 = 8

Step 2: For each number, calculate the deviation between the number and the mean.

x      x − x̄
2      2 − 8 = −6
4      4 − 8 = −4
7      7 − 8 = −1
12     12 − 8 = 4
15     15 − 8 = 7

Step 3: Calculate the square of each of the deviations in Step 2, and find the sum of these squared
deviations.

(−6)² + (−4)² + (−1)² + 4² + 7² = 36 + 16 + 1 + 16 + 49 = 118

Step 4: Because the data is a sample, divide the sum by n − 1 = 4: 118/4 = 29.5.

Step 5: Find the square root of the quotient in Step 4: s = √29.5 ≈ 5.43.
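
The five-step procedure can also be sketched in Python, with one line per step. The function name `standard_deviation` and its `sample` flag are assumptions made for this illustration; Python's own `statistics.stdev` and `statistics.pstdev` perform the same computations for a sample and a population respectively.

```python
import math


def standard_deviation(values, sample=True):
    """Standard deviation following the five-step procedure."""
    n = len(values)
    mean = sum(values) / n                          # Step 1: the mean
    deviations = [x - mean for x in values]         # Step 2: deviations from the mean
    sum_of_squares = sum(d ** 2 for d in deviations)  # Step 3: sum of squared deviations
    divisor = n - 1 if sample else n                # Step 4: n - 1 for a sample, n for a population
    return math.sqrt(sum_of_squares / divisor)      # Step 5: square root of the quotient


data = [2, 4, 7, 12, 15]
print(round(standard_deviation(data), 2))  # 5.43
```
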

Computer Solution
We can use a spreadsheet to find the range, standard deviation, variance, and mode of a data
set.
Let us use the same list of data as in Example 2: 2, 4, 7, 12, 15.

LEARNING CONTENTS ( MEASURES OF RELATIVE POSITION)


Lesson 3: Measures of Relative Position
The measures of relative position of a given value show where the value stands in relation to other
values in the same set of data. The most common measures of relative position are
quartiles, percentiles, and standard scores.

Quartiles divide a set of observations into four equal parts. To explain further, think of any set of values
arranged from smallest to largest. The first quartile, usually labeled Q1, is the value below which 25 percent of
the observations occur, and the third quartile, usually labeled Q3, is the value below which 75 percent of the
observations occur. Logically, Q2 is the median.


Your turn 1

Using the same data in Example 1, find the 4th decile.

Standard Scores (or the z-scores)


The z-score for a given data value x is the number of standard deviations that x is above or below the
mean of the data.


z-score
The following formulas show how to calculate the z-score for a data value x in a population
and in a sample.

Population: \[ z_x = \frac{x - \mu}{\sigma} \]
Sample: \[ z_x = \frac{x - \bar{x}}{s} \]

A negative z-score represents a value less than the mean. A positive z-score represents a value
greater than the mean. When z = 0, the data value is equal to the mean.
A z-score equal to 1 represents a value that is 1 standard deviation above the mean; a z-score equal to
−1 represents an element that is 1 standard deviation below the mean. If the number of elements in the
data set is large, about 68% of the elements have z-scores between −1 and 1. About 95% have z-scores
between −2 and 2, and about 99% have z-scores between −3 and 3.

Example 2 Andrew gets a score of 64 in a Mathematics test where the class mean is 50 with a
standard deviation of 8. Belle gets a score of 74 in a Physics test where the mean is 58
and the standard deviation is 10. Find out who actually performed better.

Solution
Find the z-score for each test.
Andrew: z = (64 − 50)/8 = 1.75
Belle: z = (74 − 58)/10 = 1.6

So although Belle's score is higher, Andrew's score is farther above the mean and we may say that
Andrew performed better.
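
The comparison can be checked with a few lines of Python. This is a minimal sketch; the function name `z_score` is an assumption for the example, and it simply applies the z-score formula to each student's score, class mean, and standard deviation.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


andrew = z_score(64, 50, 8)    # Mathematics test
belle = z_score(74, 58, 10)    # Physics test

print(andrew, belle)           # 1.75 1.6
print("Andrew" if andrew > belle else "Belle", "has the higher relative score")
```
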

Your turn 2
Cheryl has taken two quizzes in her history class. She scored 15 on the first quiz, for
which the mean of all scores was 12 and the standard deviation was 2.4. Her score
on the second quiz, for which the mean of all scores was 11 and the standard deviation
was 2.0, was 14. In comparison to her classmates, did Cheryl do better on the first quiz or the second
quiz?

Example 3 A consumer group tested a sample of 100 light bulbs. It found that the mean life
expectancy of the bulbs was 842 h, with a standard deviation of 90 h. One particular
light bulb from the DuraBright Company had a z-score of 1.2. What was the life span of this light bulb?

Solution
Substitute the given values into the z-score equation and solve for x.

1.2 = (x − 842)/90
108 = x − 842
x = 950

The light bulb had a life span of 950 h.


Your turn 3
Roland received a score of 70 on a test for which the mean score was 65.5.
Roland has learned that the z-score for his test is 0.6. What is the standard
deviation for this set of test scores?


LEARNING POINTS
The measures of relative position of a given value show where the value stands in relation to
other values in the same set of data. The most common measures of
relative position are quartiles, percentiles, and standard scores.

LEARNING ACTIVITY 3

In exercises 1 to 2, a data set has a mean of and a standard deviation of . Find the z-score for
each of the following.
1.
2.
A data set has a mean of and a standard deviation of 115. Find the z-score for each of the following.
3.
4.

In exercises 5 to 6. A random sample of 1000 oranges showed that the mean amount of juice per orange was
7.4 fluid ounces, with a standard deviation of 1.1 fluid ounces.

5. Determine the z-score, to the nearest hundredth, of an orange that produced 6.6 fluid ounces of juice.
6. The z-score for one orange was 3.15. How much juice was produced by this orange? Round to the nearest
tenth of a fluid ounce.

7. Which of the following fitness scores is the highest relative score?


a. A score of 42 on a test with a mean of 31 and a standard deviation of 6.5
b. A score of 1140 on a test with a mean of 1080 and a standard deviation of 68.2
c. A score of 4710 on a test with a mean of 3960 and a standard deviation of 560.4

In exercises 8 to 10. The following scores were received by 20 accounting students in a short quiz: 10, 9, 15,
20, 13, 15, 18, 11, 7, 12, 15, 13, 18, 19, 12, 8, 10, 13, 17, and 15. Find the following:
8. third quartile
9. eighth decile
10. fortieth percentile

11. Rene scored at the 84th percentile on a test given to 12,600 students. How many students scored
higher than Rene?

LEARNING CONTENTS (NORMAL DISTRIBUTION)

Lesson 4: Normal Distribution

Data that has not been organized or manipulated in any manner is called raw data. A large collection
of raw data may not provide much pertinent information that can be readily observed. A frequency
distribution, which is a table that lists observed events and the frequency of occurrence of each observed
event, is often used to organize raw data. For instance, consider the following table, which lists the number of
laptop computers owned by families in each of 40 homes in a subdivision.

Table 4.1
The frequency distribution in Table 4.2 below was constructed using the data from Table 4.1. The first
column of the frequency distribution consists of the numbers 0, 1, 2, 3, 4, 5, 6, and 7. The corresponding
frequency of occurrence, f, of each of the numbers in the first column is listed in the second column.

Table 4.2
Normal Distributions and the Empirical Rule
One of the most important statistical distributions of data is known as a normal distribution.
This distribution occurs in a variety of applications. Types of data that may demonstrate a normal distribution
include the lengths of leaves on a tree, the weights of newborns in a hospital, the lengths of time of a
student's trip from home to school over a period of months, the SAT scores of a large group of students, and
the life spans of light bulbs.
A normal distribution forms a bell-shaped curve that is symmetric about a vertical line through the
mean of the data. A graph of a normal distribution with a mean of 5 is shown below.

Properties of a Normal Distribution

Every normal distribution has the following properties.

• The graph is symmetric about a vertical line through the mean of the distribution.
• The mean, median, and mode are equal.
• The y-value of each point on the curve is the percent (expressed as a decimal) of the data
at the corresponding x-value.
• Areas under the curve that are symmetric about the mean are equal.
• The total area under the curve is 1.

In the normal distribution shown below, the area of the shaded region is 0.159 unit. This region
represents the fact that 15.9% of the data is greater than or equal to 10. Because the area under the curve is
1, the unshaded region under the curve has area 1 − 0.159, or 0.841, representing the fact that 84.1% of the
data are less than 10.
The following rule, called the Empirical Rule, describes the percent of data that lie within 1, 2, and 3 standard
deviations of the mean in a normal distribution.

Empirical Rule for a Normal Distribution
In a normal distribution, approximately
• 68% of the data lie within 1 standard deviation of the mean.
• 95% of the data lie within 2 standard deviations of the mean.
• 99.7% of the data lie within 3 standard deviations of the mean.

Example 2 Use the Empirical Rule to Solve an Application


A survey of 1000 U.S. gas stations found that the price charged for a gallon of
regular gas could be closely approximated by a normal distribution with a mean of $3.10 and a standard
deviation of $0.18.
How many of the stations charge
a. between $2.74 and $3.46 for a gallon of regular gas?
b. less than $3.28 for a gallon of regular gas?
c. more than $3.46 for a gallon of regular gas?

Solution
a. Converting $2.74 into a z-score, z = (2.74 − 3.10)/0.18 = −2, means that the $2.74 per gallon price is
2 standard deviations below the mean. For the $3.46 price, z = (3.46 − 3.10)/0.18 = 2; thus the $3.46 price
is 2 standard deviations above the mean. In a normal distribution, 95% of all data lie within 2 standard
deviations of the mean. See Figure 4.3. Therefore, approximately (95%)(1000) = (0.95)(1000) = 950 of the
stations charge between $2.74 and $3.46 for a gallon of regular gas.

Figure 4.3

b. Converting the $3.28 price into a z-score, z = (3.28 − 3.10)/0.18 = 1, we can say that the $3.28 price is 1 standard deviation
above the mean. See Figure 4.4. In a normal distribution, 34% of all data lie between the mean and 1
standard deviation above the mean. Thus, approximately (34%)(1000) = (0.34)(1000) = 340 of the stations
charge between $3.10 and $3.28 for a gallon of regular gasoline. Half of the 1000 stations, or 500 stations,
charge less than the mean. Therefore, about 340 + 500 = 840
of the stations charge less than $3.28 for a gallon of regular gas.

Figure 4.4

c. Converting the $3.46 price into a z-score, z = (3.46 − 3.10)/0.18 = 2, gives a result of 2 standard
deviations above the mean. In a normal distribution, 95% of all data are within 2 standard deviations of the
mean. This means that the other 5% of the data will lie either above 2 standard deviations of the mean or
below 2 standard deviations of the mean. We are interested only in the data that are more than 2 standard
deviations above the mean, which is 1/2 of 5%, or 2.5%, of the data. See Figure 4.5. Thus about
(2.5%)(1000) = (0.025)(1000) = 25 of the stations charge more than $3.46 for a gallon of regular gas.

Figure 4.5
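
The three counts in this example can be reproduced with a short Python sketch built only on the Empirical Rule percentages. The dictionary `WITHIN` and the helper names are assumptions introduced for this illustration, and the shares are the rounded rule-of-thumb values (68%, 95%, 99.7%), not exact normal-curve areas.

```python
# Approximate share of data within k standard deviations of the mean (Empirical Rule)
WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}


def share_within(k):
    """Share of data within k standard deviations of the mean."""
    return WITHIN[k]


def share_below(k):
    """Share of data below the value that is k standard deviations above the mean."""
    return 0.5 + WITHIN[k] / 2


def share_above(k):
    """Share of data above the value that is k standard deviations above the mean."""
    return (1 - WITHIN[k]) / 2


stations = 1000
between_274_and_346 = round(share_within(2) * stations)   # z from -2 to 2
less_than_328 = round(share_below(1) * stations)          # z below 1
more_than_346 = round(share_above(2) * stations)          # z above 2

print(between_274_and_346, less_than_328, more_than_346)  # 950 840 25
```
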
Your turn 2
A vegetable distributor knows that during the month of August, the weights of its tomatoes
are normally distributed with a mean of 0.61 lb and a standard deviation of 0.15 lb.
a. What percent of the tomatoes weigh less than 0.76 lb?
b. In a shipment of 6000 tomatoes, how many tomatoes can be expected to weigh more than 0.31 lb?
c. In a shipment of 4500 tomatoes, how many tomatoes can be expected to weigh from 0.31 lb to 0.91 lb?
The Standard Normal Distribution
It is often helpful to convert data values x to z-scores, as we did in the previous section by
using the z-score formulas:

\[ z_x = \frac{x - \mu}{\sigma} \quad \text{or} \quad z_x = \frac{x - \bar{x}}{s} \]

If the original distribution of values is a normal distribution, then the corresponding distribution of z-scores
will also be a normal distribution. This normal distribution of z-scores is called the standard normal distribution.
See Figure 4.6. It has a mean of 0 and a standard deviation of 1.
Table 4.4 (continued)
z     A       z     A       z     A       z     A       z     A       z     A
0.20  0.079   0.76  0.276   1.32  0.407   1.88  0.470   2.44  0.493   3.00  0.499
0.21  0.083   0.77  0.279   1.33  0.408   1.89  0.471   2.45  0.493   3.01  0.499
0.22  0.087   0.78  0.282   1.34  0.410   1.90  0.471   2.46  0.493   3.02  0.499
0.23  0.091   0.79  0.285   1.35  0.411   1.91  0.472   2.47  0.493   3.03  0.499
0.24  0.095   0.80  0.288   1.36  0.413   1.92  0.473   2.48  0.493   3.04  0.499
0.25  0.099   0.81  0.291   1.37  0.415   1.93  0.473   2.49  0.494   3.05  0.499
0.26  0.103   0.82  0.294   1.38  0.416   1.94  0.474   2.50  0.494   3.06  0.499
0.27  0.106   0.83  0.297   1.39  0.418   1.95  0.474   2.51  0.494   3.07  0.499
0.28  0.110   0.84  0.300   1.40  0.419   1.96  0.475   2.52  0.494   3.08  0.499
0.29  0.114   0.85  0.302   1.41  0.421   1.97  0.476   2.53  0.494   3.09  0.499
0.30  0.118   0.86  0.305   1.42  0.422   1.98  0.476   2.54  0.494   3.10  0.499
0.31  0.122   0.87  0.308   1.43  0.424   1.99  0.477   2.55  0.495   3.11  0.499
0.32  0.126   0.88  0.311   1.44  0.425   2.00  0.477   2.56  0.495   3.12  0.499
0.33  0.129   0.89  0.313   1.45  0.426   2.01  0.478   2.57  0.495   3.13  0.499
0.34  0.133   0.90  0.316   1.46  0.428   2.02  0.478   2.58  0.495   3.14  0.499
0.35  0.137   0.91  0.319   1.47  0.429   2.03  0.479   2.59  0.495   3.15  0.499
0.36  0.141   0.92  0.321   1.48  0.431   2.04  0.479   2.60  0.495   3.16  0.499
0.37  0.144   0.93  0.324   1.49  0.432   2.05  0.480   2.61  0.495   3.17  0.499
0.38  0.148   0.94  0.326   1.50  0.433   2.06  0.480   2.62  0.496   3.18  0.499
0.39  0.152   0.95  0.329   1.51  0.434   2.07  0.481   2.63  0.496   3.19  0.499
0.40  0.155   0.96  0.331   1.52  0.436   2.08  0.481   2.64  0.496   3.20  0.499
0.41  0.159   0.97  0.334   1.53  0.437   2.09  0.482   2.65  0.496   3.21  0.499
0.42  0.163   0.98  0.336   1.54  0.438   2.10  0.482   2.66  0.496   3.22  0.499
0.43  0.166   0.99  0.339   1.55  0.439   2.11  0.483   2.67  0.496   3.23  0.499
0.44  0.170   1.00  0.341   1.56  0.441   2.12  0.483   2.68  0.496   3.24  0.499
0.45  0.174   1.01  0.344   1.57  0.442   2.13  0.483   2.69  0.496   3.25  0.499
0.46  0.177   1.02  0.346   1.58  0.443   2.14  0.484   2.70  0.497   3.26  0.499
0.47  0.181   1.03  0.348   1.59  0.444   2.15  0.484   2.71  0.497   3.27  0.499
0.48  0.184   1.04  0.351   1.60  0.445   2.16  0.485   2.72  0.497   3.28  0.499
0.49  0.188   1.05  0.353   1.61  0.446   2.17  0.485   2.73  0.497   3.29  0.499
0.50  0.191   1.06  0.355   1.62  0.447   2.18  0.485   2.74  0.497   3.30  0.500
0.51  0.195   1.07  0.358   1.63  0.448   2.19  0.486   2.75  0.497   3.31  0.500
0.52  0.198   1.08  0.360   1.64  0.449   2.20  0.486   2.76  0.497   3.32  0.500
0.53  0.202   1.09  0.362   1.65  0.451   2.21  0.486   2.77  0.497   3.33  0.500
0.54  0.205   1.10  0.364   1.66  0.452   2.22  0.487   2.78  0.497
0.55  0.209   1.11  0.367   1.67  0.453   2.23  0.487   2.79  0.497

Because the standard normal distribution is symmetrical about the mean of 0, we can also use Table 4.4 to find
the area of a region that is located to the left of the mean.

Example 3 Find the area of the standard normal distribution between z = −1.44 and z = 0.

Solution
Because the standard normal distribution is symmetrical about the center line z = 0, the area of the standard
normal distribution between z = −1.44 and z = 0 is equal to the area between z = 0 and z = 1.44. The entry in
Table 4.4 associated with z = 1.44 is 0.425. Thus the area of the standard normal distribution between
z = −1.44 and z = 0 is 0.425 square unit.
See Figure 4.8.

Figure 4.8 Symmetrical region


Your turn 3 Find the area of the standard normal distribution between and

In Figure 4.9, the region to the right of z = 0.82 is called a tail region. A tail region is a region of the
standard normal distribution to the right of a positive z-value or to the left of a negative z-value. To
find the area of a tail region, we subtract the entry in Table 4.4 from 0.500. This procedure is illustrated in the
next example.

Example 4 Find the area of the standard normal distribution to the right of z = 0.82.

Solution
Table 4.4 indicates that the area from z = 0 to z = 0.82 is 0.294 square unit. The area to the right of
z = 0 is 0.500 square unit.
Thus the area to the right of z = 0.82 is 0.500 − 0.294 = 0.206 square unit.
See Figure 4.9.

Figure 4.9 Area of a tail region

Your turn4 Find the area of the standard normal distribution to the left of

The Standard Normal Distribution, Areas, Percentages, and Probabilities


In the standard normal distribution, the area of the distribution from z = a to z = b represents
• the percentage of z-values that lie in the interval from a to b.
• the probability that z lies in the interval from a to b.

Because the area of a portion of the standard normal distribution can be interpreted as a percentage
of the data or as a probability that the variable lies in an interval, we can use the standard normal distribution
to solve many application problems.

Example 5 A soda machine dispenses soda into 12-ounce cups. Tests show that the
actual amount of soda dispensed is normally distributed, with a mean of 11.5
oz and a standard deviation of 0.2 oz.

a. What percent of cups will receive less than 11.25 oz of soda?


b. What percent of cups will receive between 11.2 oz and 11.55 oz of soda?
c. If a cup is chosen at random, what is the probability that the machine will overflow the cup?

Solution
a. Recall that the formula for the z-score for a data value x is z = (x − x̄)/s.
The z-score for 11.25 oz is z = (11.25 − 11.5)/0.2 = −1.25.


Table 4.4 indicates that 0.394 (39.4%) of the data in a normal distribution are between z = 0 and
z = 1.25. Because the data are normally distributed, 39.4% of the data is also between z = 0 and
z = −1.25. The percent of data to the left of z = −1.25 is 50% − 39.4% = 10.6%. See Figure 4.9. Thus
10.6% of the cups filled by the soda machine will receive less than 11.25 oz of soda.

Figure 4.9 Portion of data to the left of z = −1.25


b. The z-score for 11.55 ounces is z = (11.55 − 11.5)/0.2 = 0.25.

Table 4.4 indicates that 0.099 (9.9%) of the data in a normal distribution is between z = 0 and z = 0.25.
The z-score for 11.2 oz is z = (11.2 − 11.5)/0.2 = −1.5.

Table 4.4 indicates that 0.433 (43.3%) of the data in a normal distribution are between z = 0 and
z = 1.5. Because the data are normally distributed, 43.3% of the data is also between z = 0 and
z = −1.5. See Figure 4.10. Thus the percent of the cups that the vending machine will fill with between 11.2
oz and 11.55 oz of soda is 43.3% + 9.9% = 53.2%.

Figure 4.10 Portion of data between two z-scores

c. A cup will overflow if it receives more than 12 oz of soda. The z-score for 12 oz is z = (12 − 11.5)/0.2 = 2.5.

Table 4.4 indicates that 0.494 (49.4%) of the data in the standard normal distribution are between z = 0 and
z = 2.5. The percent of data to the right of z = 2.5 is determined by subtracting 49.4% from 50%. See Figure
4.11. Thus 0.6% of the time the machine produces an overflow, and the probability that a cup chosen at
random will overflow is 0.006.
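
As a cross-check on the table look-ups, the same three areas can be computed with Python's standard library, which provides `statistics.NormalDist` (Python 3.8 or later). This is a sketch rather than part of the module's method; the variable names are assumptions, and the software uses exact normal-curve areas instead of the three-decimal entries of Table 4.4, so agreement is only to about three decimal places.

```python
from statistics import NormalDist

# Amount dispensed: normally distributed, mean 11.5 oz, standard deviation 0.2 oz
fill = NormalDist(mu=11.5, sigma=0.2)

less_than_11_25 = fill.cdf(11.25)                # area to the left of z = -1.25
between = fill.cdf(11.55) - fill.cdf(11.2)       # area from z = -1.5 to z = 0.25
overflow = 1 - fill.cdf(12)                      # area to the right of z = 2.5

print(round(less_than_11_25, 3), round(between, 3), round(overflow, 3))
# 0.106 0.532 0.006
```
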

Example 2 Find the regression line equation of Table 5.2 and predict the score in Physics if
the score in Mathematics of the student is 75.

Solution
Formulate the regression line equation by solving first for the values of b and a.
Solving for b

\[ b = \frac{12(54107) - (800)(811)}{12(53418) - (800)^2} \quad\Rightarrow\quad b = 0.48 \]

Solving for a

\[ a = 67.58 - 0.48(66.67) \quad\Rightarrow\quad a = 35.58 \]

Substitute the computed values of b and a into the regression line equation

y = a + bx

y = 35.58 + 0.48x    (regression line equation)

We can now estimate scores in Physics using the regression line equation by substituting a value of x,
the score in Mathematics. For instance, if x is equal to 75, then solving for y gives 71.58.

y = 35.58 + 0.48(75)
y = 71.58
Therefore, the estimated score in Physics is 71.58, or approximately 72, if the score in
Mathematics is 75. The regression line equation may now be used to estimate scores for y by substituting a
value of x.
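
The slope and intercept computation can be sketched in Python from the sums that appear in the substitution for b. The assignment of those numbers to n = 12, Σx = 800, Σy = 811, Σxy = 54107, and Σx² = 53418 is read off the substituted formula and should be treated as an assumption, as are the function and parameter names. Because the sketch keeps b unrounded, its intercept and prediction differ slightly from a hand computation that rounds b to 0.48 first.

```python
def regression_from_sums(n, sum_x, sum_y, sum_xy, sum_x2):
    """Least-squares line y = a + b*x computed from summary sums."""
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * (sum_x / n)   # a = y-bar - b * x-bar
    return intercept, slope


# Sums assumed to correspond to the 12 Mathematics (x) and Physics (y) scores
a, b = regression_from_sums(n=12, sum_x=800, sum_y=811, sum_xy=54107, sum_x2=53418)
predicted = a + b * 75            # estimated Physics score for a Mathematics score of 75

print(round(b, 4), round(a, 2), round(predicted, 2))  # 0.4764 35.82 71.55
```

Either way, the estimate rounds to a Physics score of about 72.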

Your turn 2 Find the regression line equation of Table 5.3 and predict the speed of a camel if
the stride length of the camel is 5.0.

Computer Solution
Using the data on the scores of 12 college students in Mathematics and Physics tests of 80 items (Table 5.1),
the following screenshot shows the results for the 12 paired values (occupying cells and cells
) as calculated by the spreadsheet's built-in PEARSON(), INTERCEPT(), and SLOPE() functions.
• Aufmann, Richard, et al., Mathematics in the Modern World
• Mathematics in the World book from RBSI
• Paguio et al., Statistics with Computer Based Discussion

Photo credits:
Population vs sample, keydifference.com
Figure 4.1 A histogram for the frequency distribution, Aufmann, Richard, et al., Mathematics in the Modern World
Answers to Your turn (Lesson 2)
1.
     Machine 1                              Machine 2

a.   Range = 10.07 − 5.85 = 4.22            Range = 8.03 − 7.95 = 0.08

b.   Mean deviation = 7.48/5 ≈ 1.50         Mean deviation = 0.12/5 = 0.024

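For the first quiz, z = (15 − 12)/2.4 = 1.25; for the second quiz, z = (14 − 11)/2.0 = 1.5.
These z-scores indicate that, in comparison to her classmates, Cheryl did better on the second quiz
than she did on the first quiz.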
The standard deviation for this set of test scores is 7.5.

Answers to Your turn (Lesson 4)


1. a. The percent of data in all classes with an upper bound of 25 s or less is the sum of the percents for the
first five classes in Table 4.2. Thus the percent of subscribers who required less than 25 s to download the file is
30.9%.
b. The percent of data in all the classes with a lower bound of at least 10 s and an upper bound of 30 s or
less is the sum of the percents in the third through sixth classes in Table 4.2. Thus the percent of subscribers
who required from 10 to 30 s to download the file is 47.8%. The probability that a subscriber chosen at
random will require from 10 to 30 s to download the file is 0.478.

2. a. 0.76 lb is 1 standard deviation above the mean of 0.61 lb. In a normal distribution, 34% of all data lie
between the mean and 1 standard deviation above the mean, and 50% of all data lie below the mean. Thus
34% + 50% = 84% of the tomatoes weigh less than 0.76 lb.

b. 0.31 lb is 2 standard deviations below the mean of 0.61 lb. In a normal distribution, 47.5% of all data lie
between the mean and 2 standard deviations below the mean, and 50% of all data lie above the mean. This
gives a total of 47.5% + 50% = 97.5% of the tomatoes that weigh more than 0.31 lb.
Therefore
(97.5%)(6000) = (0.975)(6000) = 5850 of the tomatoes can be expected to weigh more than 0.31 lb.

c. 0.31 lb is 2 standard deviations below the mean of 0.61 lb and 0.91 lb is 2 standard deviations above the
mean of 0.61 lb. In a normal distribution, 95% of all data lie within 2 standard deviations of the mean.
Therefore (95%)(4500) = (0.95)(4500) = 4275 of the tomatoes can be expected to weigh from 0.31 lb to 0.91 lb.

3. The area of the standard normal distribution between and is equal to the area between and . The entry
in Table 4.4 associated with is 0.249. Thus the area of the standard
normal distribution between and is 0.249 square unit.

4. Table 4.4 indicates that the area from to is 0.429 square unit. The area to the left of is
0.500 square unit. Thus the area to the left of is square unit.

5. Round z-scores to the nearest hundredth so you can use Table 4.4 .
a.
Table 4.4 indicates that 0.446 (44.6%) of the data in the standard normal distribution are between z = 0 and
z = 1.61. The percent of the data to the right of z = 1.61 is 50% − 44.6% = 5.4%. Approximately 5.4% of
professional football players have careers of more than 9 years.
b.
From Table 4.4:

The probability that a professional football player chosen at random will have a career of between 3 and 4
years is about 0.078.

We can now estimate the speed of a camel using the regression line equation by substituting a
value of x, the stride length of the camel. For instance, if x is equal to 5.0, then solving for y gives 10.2.

y = −3.3 + 2.7(5.0)
y = 10.2
Therefore, the estimated speed of a camel is 10.2 if its stride length is 5.0. The regression line equation
may now be used to estimate values of y by substituting a value of x.
Prepared by:

CORETA S. SANTILLAN
Math Faculty

Adopted by:

ANNA CLARICE M. YANDAY


Math Faculty
