Statistics Tools for Data Management

MATHEMATICS AS A TOOL
CHAPTER 4 : DATA MANAGEMENT

Statistics is a branch of applied mathematics that deals with gathering, organizing, presenting,
analyzing, and interpreting data. There are two branches of statistics: descriptive statistics and
inferential statistics. Descriptive statistics involves collecting, organizing, describing,
summarizing, and presenting gathered data in a meaningful and informative way, while inferential
statistics refers to the process of drawing conclusions and making decisions about a population
based on evidence obtained from a sample. Inferential statistics includes estimation and
hypothesis testing.
LEARNING OUTCOMES:
At the end of this chapter, you must be able to:
1. Use a variety of statistical tools to process and manage numerical data;
2. Use the methods of linear regression and correlation to predict the value of a variable given
certain conditions; and
3. Advocate the use of statistical data in making important decisions.

KEY CONCEPTS
Gathering and Organizing Data
Data – defined as the quantities (numbers) or qualities (attributes) measured or observed that are
to be collected and/or analyzed. A collection of data is called a data set.
Two categories of data:
1. Categorical data – these are measured on nominal and ordinal scales and use non-parametric
statistics.
Nominal scales consist of a finite set of possible values having no particular order.
Example: gender, mode of transportation, nationality, occupation, civil status.
Ordinal scales consist of a set of possible values having a specific order.
Example: pain level, social status, attitude toward a subject.
2. Continuous data – these are measured on interval and ratio scales and use parametric statistics.
Interval scales are measured on a continuum, and differences between any two numbers on the scale
are of known size.
Example: temperature, tons of garbage, number of arrests, income, and age.
Variable – refers to a property that can take on different values or categories which cannot be
predicted with certainty.
Common types of variables:
1. Independent variable or X variable – the explanatory variable; it may be continuous, nominal,
or ordinal.
2. Dependent variable or Y variable – the response variable.
3. Control variable or Z variable – a variable that is held constant and unchanged.
Classification of variables:
1. Quantitative variable – one that can be measured and ordered according to quantity.
A quantitative variable may be discrete or continuous.
A discrete variable takes a finite or countably infinite set of values.
A continuous variable takes any value in an interval of the real number line.
2. Qualitative variable – one simply used as a label to distinguish one group from another.
Presentation of data:
1. Textual presentation – uses statements with numerals to describe the data and give concrete
information in expository form.
2. Tabular presentation – uses a statistical table to directly display the quantities or variables
collected as data.
3. Graphical presentation – illustrates data in the form of graphs, helping readers understand the
text easily.
Example: circle graph, bar graph, line graph, pictograph.
The gathered data should be properly organized into grouped data called a frequency distribution.
Steps in constructing a frequency distribution table:
1. Estimate the number of classes k using k = 1 + 3 log(n), where log is the base-10 logarithm and
n is the number of observations.
2. Determine the range, r = highest value – lowest value.
3. Obtain the class size, c = r / k, rounded to a whole number.
4. Set the lowest value as the first lower limit and get the upper limit, which is equal to the
first lower limit + class size – 1.
5. Repeat the process until you reach the last class, the one that includes the highest value
in the data.
Example 1. Construct a frequency distribution table for the following data:
11 19 11 15 16 10
16 16 15 17 10 27
21 11 13 21 10 16
11 19 24 12 22 13
19 13 18 20 21 11
19 15 11 25 29 23
16 23 10 17 11 27
16 24 12 21 13 12
26 15 11 14 10 12
11 15 18 12 20 13
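Solution:
1. Determine the value of k = 1 + 3 log(n), where n = 60 and log 60 = 1.7781512.
k = 1 + 3(1.7781512)
k = 1 + 5.3344536
k = 6.3344536, or approximately 6. Therefore, 6 is the estimated number of classes for these data.
2. r = 29 – 10 = 19
3. Class size = 19 / 6 = 3.17, or 3
4. Starting from the lowest value, 10, with a class size of 3, the classes are 10 – 12, 13 – 15,
and so on. A seventh class, 28 – 30, is needed to include the highest value, 29.
Class Limits Frequency
28 – 30 1
25 – 27 4
22 – 24 5
19 – 21 10
16 – 18 10
13 – 15 11
10 – 12 19
Total, n 60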
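The construction steps can also be sketched in code. The following is a minimal Python illustration, not part of the original module: the function name `frequency_table` is hypothetical, it assumes whole-number data, and it rounds k and the class size to the nearest whole number as the worked solution does.

```python
import math

def frequency_table(data):
    """Return (k, c, rows); rows are (lower limit, upper limit, frequency), lowest class first."""
    n = len(data)
    k = round(1 + 3 * math.log10(n))   # estimated number of classes
    r = max(data) - min(data)          # range
    c = max(1, round(r / k))           # class size as a whole number (assumption)
    rows = []
    lower = min(data)
    while lower <= max(data):          # continue until the highest value is covered
        upper = lower + c - 1
        freq = sum(1 for x in data if lower <= x <= upper)
        rows.append((lower, upper, freq))
        lower = upper + 1
    return k, c, rows

# Data from Example 1
scores = [
    11, 19, 11, 15, 16, 10,
    16, 16, 15, 17, 10, 27,
    21, 11, 13, 21, 10, 16,
    11, 19, 24, 12, 22, 13,
    19, 13, 18, 20, 21, 11,
    19, 15, 11, 25, 29, 23,
    16, 23, 10, 17, 11, 27,
    16, 24, 12, 21, 13, 12,
    26, 15, 11, 14, 10, 12,
    11, 15, 18, 12, 20, 13,
]

k, c, rows = frequency_table(scores)
print("k =", k, "class size =", c)
for lower, upper, freq in reversed(rows):   # highest class first, as in the table
    print(f"{lower} – {upper}  {freq}")
```

Because the loop keeps adding classes until the highest value is included, the number of rows can exceed the estimate k by one, as it does for these data.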
EXERCISE no 1
Construct a frequency distribution table for the following scores of students in a Geometry
test.
55 63 44 37 50 57 44 57 42 46
58 40 54 65 39 27 28 56 38 45
30 35 56 78 55 27 50 28 44 28
39 37 65 43 33 70 60 61 60 44
Interpretation of Data
Data in statistics are useless if we do not interpret them. The most useful tools for describing
a distribution of observations are the measures of central tendency, measures of variation,
measures of relative position, z-scores, box-and-whisker plots, probability and the normal curve,
and linear regression and correlation.
Measures of Central Tendency
Central tendency determines a numerical value in the central region of a distribution of scores. It
refers to the center of a distribution of observations.
There are three measures of central tendency: the mean, the median, and the mode.
1. MEAN
The mean, Mn, is also called the arithmetic mean or average. It can be affected by extreme
scores. It is the balance point of a distribution.
How to compute the mean?
A. The mean of Ungrouped Data
Mean, Mn = (sum of the values) / (number of values)
Example: Jeffrey has been working on programming and updating a Web site for his company for the
past 24 months. The following numbers represent the number of hours Jeffrey has worked on this Web
site for each of the past 7 months: 24, 25, 33, 50, 53, 66, 78. What is the mean (average) number of
hours that Jeffrey worked on this Web site each month?
Solution:
Mn = (24 + 25 + 33 + 50 + 53 + 66 + 78) / 7 = 329 / 7 = 47
Jeffrey worked an average of 47 hours on this Web site each month.
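As a quick check of this arithmetic, here is a minimal Python sketch (an illustration added for this example, using the seven monthly values that appear in the solution's sum; the variable names are hypothetical):

```python
# Monthly hours as they appear in the solution's sum
hours = [24, 25, 33, 50, 53, 66, 78]

total = sum(hours)            # sum of the values
count = len(hours)            # number of values
mean_hours = total / count    # Mn = sum of the values / number of values

print(total)       # 329
print(mean_hours)  # 47.0
```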
Weighted Mean, WMn = ΣfX / N
Where:
WMn = weighted mean
f = frequency
X = score
ΣfX = sum of the products of frequency and score
N = total frequency
Example: There are 1000 notebooks sold at Php 10 each; 500 notebooks at Php 20 each; 500
notebooks at Php 25 each, and 100 notebooks at Php 30 each. Compute the weighted mean.
Solution:
Prepare the frequency distribution.
Notebook’s Price (X) f fX

Php 10 1000 Php 10000
Php 20 500 Php 10000
Php 25 500 Php 12500
Php 30 100 Php 3000
N= 2100 ΣfX = Php 35,500
Therefore:
WMn = ΣfX / N = 35,500 / 2,100 = Php 16.90
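The weighted mean computation can be sketched in Python as follows. This is only an illustration; the function name `weighted_mean` and the list names are assumptions, not part of the module.

```python
def weighted_mean(scores, freqs):
    """WMn = sum of (frequency x score) / total frequency."""
    total_fx = sum(f * x for x, f in zip(scores, freqs))
    n = sum(freqs)
    return total_fx / n

prices = [10, 20, 25, 30]       # notebook price X in Php
sold = [1000, 500, 500, 100]    # frequency f (notebooks sold at each price)

wm = weighted_mean(prices, sold)
print(round(wm, 2))  # 16.9
```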
B. The mean of Grouped Data
There are two ways to solve for the mean of grouped data in a frequency distribution.
a. Mn = ΣfXm / N
Where:
Mn = mean
f = frequency
Xm = class mark
ΣfXm = sum of the products of frequencies and class marks
N = total frequency
Example: The table below summarizes the weights of the Cubs. Find the average weight of the cubs.
Weights of the Cubs f

201 – 210 3
191 – 200 8
181 – 190 12
171 – 180 11
161 – 170 9
151 – 160 2
N = 45
Reminder: the class mark is the average of the lower class limit and the upper class limit of
each class in the given frequency distribution.
Solution:
In solving for the mean given the grouped data or frequency distribution, we have to add two
columns for class mark (Xm) and fXm, that is
Weights of the Cubs f Xm fXm
201 – 210 3 205.5 616.5
191 – 200 8 195.5 1564
181 – 190 12 185.5 2226
171 – 180 11 175.5 1930.5
161 – 170 9 165.5 1489.5
151 – 160 2 155.5 311
N = 45 ΣfXm = 8137.5
Therefore:
Mean, Mn = ΣfXm / N = 8137.5 / 45 = 180.83
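The class-mark method can be sketched in Python. This is a minimal illustration under the assumption that each class is stored as a (lower limit, upper limit, frequency) tuple; the name `grouped_mean` is hypothetical.

```python
def grouped_mean(classes):
    """Mn = sum of (f x class mark) / N, where class mark = (lower + upper) / 2."""
    n = sum(f for _, _, f in classes)
    total_fxm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fxm / n

# Weights of the cubs: (lower limit, upper limit, frequency)
cubs = [
    (201, 210, 3),
    (191, 200, 8),
    (181, 190, 12),
    (171, 180, 11),
    (161, 170, 9),
    (151, 160, 2),
]

print(round(grouped_mean(cubs), 2))  # 180.83
```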
EXERCISE no 2
1. The sizes of pants sold during one business day in a department store are 32, 28, 34,
42, 36, 34, 40, 44, 32, 34. Find the average size of the pants sold.
2. Given the frequency distribution for the weights of 50 pieces of luggage, compute
the mean.
Weight (kilograms) Number of Pieces, f
7 – 9 2
10 – 12 8
13 – 15 14
16 – 18 19
19 – 21 7
N 50
2. MEDIAN
The median, Md, is the value in the distribution that divides an arranged
(ascending/descending) set into two equal parts. It is the midpoint or middlemost of a
distribution of scores.
How to compute the median?
A. The median of Ungrouped Data
After the values are arranged in order, the median is the value in the (N+1)/2th position.
Examples:
1. Find the median of the following prices:
Php 50, Php 55, Php 60, Php 65, Php 12, Php 35, Php 48.
Solution:
Php 12, Php 35, Php 48, Php 50, Php 55, Php 60, Php 65, N = 7
Therefore:
Md = (N+1)/2 = (7+1)/2 = 4th score
Md = 50
2. Find the median of the following weights in kilos, 101, 107, 115, 120, 111, 105.
Solution:
Arranging the numbers in ascending order.
101, 105, 107, 111, 115, 120
N=6
Md = (N+1)/2th score
Md = (6+1)/2 = 3.5th score, that is between the 3rd and the 4th scores.
Md = (107+111)/2 = 109
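Both examples can be reproduced with a short Python sketch of the (N+1)/2 position rule. The function name `median_ungrouped` is an assumption made for this illustration.

```python
def median_ungrouped(values):
    """Median by the (N+1)/2 position rule on the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd N: (N+1)/2 is a whole position (1-based), so index is n // 2
        return ordered[n // 2]
    # Even N: the position falls halfway between the two middle scores
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

prices = [50, 55, 60, 65, 12, 35, 48]
weights = [101, 107, 115, 120, 111, 105]

print(median_ungrouped(prices))   # 50
print(median_ungrouped(weights))  # 109.0
```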
B. The median of Grouped Data
In computing the median of grouped data, determine the median class, which contains
the (N/2)th score under <cf of the cumulative frequency distribution. To solve for the median,
we use the formula:
Md = XLB + [(N/2 – cfb) / fm] i
Where:
Md = median
XLB = the lower boundary or true lower limit of the median class
N = total frequency
cfb = cumulative frequency before the median class
fm = frequency of the median class
i = size of the class interval
Example: Solve for the median for the following data.
Statistics Test Results

Class frequency F <cf
28 – 29 1 60
26 – 27 3 59
24 – 25 3 56
22 – 23 3 53
20 – 21 6 50
18 – 19 6 44
16 – 17 8 38
14 – 15 6 = fm 30 = median class
12 – 13 10 24 = cfb
10 – 11 14 14
N = 60
Solution:
N/2th score = (60/2)th score = 30th score
The median class is 14 – 15, since it is the first class whose cumulative frequency (30) reaches
the 30th score.
XLB = 13.5
cfb = 24
fm = 6
i = 2
Therefore:
Md = XLB + [(N/2 – cfb) / fm] i
= 13.5 + [(60/2 – 24) / 6] 2
= 13.5 + (6/6) 2
= 13.5 + (1) 2
= 13.5 + 2
= 15.5
This means that 50 percent of the students got a score below 15.5. If the passing score is 50 percent
of the total number of items, almost half of the class failed the test.
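The grouped median formula can be sketched in Python. This is a minimal illustration, assuming the classes are listed from lowest to highest with whole-number limits, so that the lower boundary is the lower limit minus 0.5. The name `grouped_median` is hypothetical.

```python
def grouped_median(classes):
    """Md = XLB + [(N/2 - cfb) / fm] * i for classes given lowest first."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    cf_before = 0                             # cfb: cumulative frequency before the class
    for lower, upper, f in classes:
        if f > 0 and cf_before + f >= half:   # first class whose <cf reaches N/2
            xlb = lower - 0.5                 # lower boundary (assumes integer limits)
            i = upper - lower + 1             # size of the class interval
            return xlb + (half - cf_before) / f * i
        cf_before += f
    raise ValueError("the frequency distribution is empty")

# Statistics Test Results: (lower limit, upper limit, frequency), lowest class first
results = [
    (10, 11, 14), (12, 13, 10), (14, 15, 6), (16, 17, 8), (18, 19, 6),
    (20, 21, 6), (22, 23, 3), (24, 25, 3), (26, 27, 3), (28, 29, 1),
]

print(grouped_median(results))  # 15.5
```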
EXERCISE no 3
1. The ages of 10 Administrators in a certain college are given as follows: 40, 38, 45, 51,
44, 53, 59, 45, 56, 45. Compute the median.
2. Compute the median given the following data:
Scores in Statistics f
75 – 79 6
70 – 74 7
65 – 69 2
60 – 64 8
55 – 59 12
50 – 54 7
45 – 49 10
40 – 44 8
N 60
3. MODE
The mode is the value with the largest frequency, that is, the value that occurs most frequently
in the distribution. It is used when the quickest estimate of typical performance is wanted. A
distribution can be unimodal (one mode), bimodal (two modes), or trimodal (three modes). In other
words, it can have more than one mode.
How to find the mode?
A. The mode of Ungrouped Data
The mode of ungrouped data is found merely by inspection.
Example: Find the mode of the following discounts.
4%, 7%, 7%, 7%, 8%, 8%, 9%, 10%, 11%, 11%, 13%
Solution:
By inspection, the mode is 7% since it has the largest frequency (it occurs three times).
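For longer lists, inspection can be replaced by counting. The sketch below is an illustration only: it uses Python's `collections.Counter`, and the function name `modes` is an assumption. It returns every value that shares the largest frequency, so bimodal and trimodal data yield more than one value.

```python
from collections import Counter

def modes(values):
    """Return all values that occur with the largest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

discounts = [4, 7, 7, 7, 8, 8, 9, 10, 11, 11, 13]   # in percent
print(modes(discounts))  # [7]
```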
B. The mode of Grouped Data
To find the mode of grouped data, first determine the modal class, which is the class with the
highest frequency. Then use the formula:
Mode, Mo = XLB + [df1 / (df1 + df2)] i
Where,
Mo = Mode
XLB = lower boundary of the modal class
df1 = difference between the frequency of the modal class and the frequency of the next lower class
(the row below it when the classes are listed from highest to lowest).
df2 = difference between the frequency of the modal class and the frequency of the next higher class
(the row above it when the classes are listed from highest to lowest).
i = size of the class interval
Example: Find the mode of the following data:
Statistics Test Results

Class frequency f
28 – 29 1
26 – 27 3
24 – 25 3
22 – 23 3
20 – 21 6
18 – 19 6
16 – 17 8
14 – 15 6
12 – 13 10
10 – 11 14
N = 60
Solution:
The modal class is 10 – 11, since it has the highest frequency, 14.
Mo = XLB + [df1 / (df1 + df2)] i
XLB = 9.5
df1 = 14 – 0 = 14, because there is no class below the modal class
df2 = 14 – 10 = 4
i = 2
Mo = 9.5 + [14 / (14 + 4)] 2
= 9.5 + (14/18) 2
= 9.5 + (0.78) 2
= 9.5 + 1.56
Mo = 11.06
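The same computation can be sketched in Python. This is a minimal illustration that follows the worked solution: df1 uses the next lower class and df2 the next higher class, and a missing neighbor counts as frequency 0. The name `grouped_mode` is hypothetical, classes are assumed to be listed lowest first with whole-number limits, and ties for the highest frequency are resolved by taking the lowest such class.

```python
def grouped_mode(classes):
    """Mo = XLB + [df1 / (df1 + df2)] * i for classes given lowest first."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))               # modal class (lowest one if tied)
    lower, upper, fm = classes[m]
    f_lower = freqs[m - 1] if m > 0 else 0                  # next lower class, 0 if none
    f_higher = freqs[m + 1] if m + 1 < len(freqs) else 0    # next higher class, 0 if none
    df1 = fm - f_lower
    df2 = fm - f_higher
    # Note: df1 + df2 is zero only when both neighbors equal fm; not handled here
    xlb = lower - 0.5                         # lower boundary (assumes integer limits)
    i = upper - lower + 1                     # size of the class interval
    return xlb + df1 / (df1 + df2) * i

# Statistics Test Results: (lower limit, upper limit, frequency), lowest class first
results = [
    (10, 11, 14), (12, 13, 10), (14, 15, 6), (16, 17, 8), (18, 19, 6),
    (20, 21, 6), (22, 23, 3), (24, 25, 3), (26, 27, 3), (28, 29, 1),
]

print(round(grouped_mode(results), 2))  # 11.06
```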
EXERCISE no 4
1. Find the mode of the following data:

1 5 6 9 11 15 17
2 5 7 9 12 15 17
3 5 7 9 12 15 18
4 6 8 12 10 16 18
4 6 9 12 11 16 18
2. Solve for the mode, given the frequency distribution:

Score in Algebra f
75 – 79 6
70 – 74 7
65 – 69 2
60 – 64 8
55 – 59 12
50 – 54 7
45 – 49 10
40 – 44 8
N 60