7
RESEARCH III
QUARTER 3
Week 2
TOPIC: Frequency Distribution; Mean, Median and Mode
LEARNING COMPETENCY
12. Describe a frequency distribution
13. Describe the measures of central tendency
a. Mean
b. Median
c. Mode
IMPORTANT: Do not write anything on this material. Write all your answers for the SAQ,
Let’s Practice Activities and Try Items on a separate sheet/s of paper.
UNDERSTAND
Frequency is how often something occurs or how often something happened. The
frequency of an observation tells you the number of times the observation occurs in the data.
Let us look at some examples.
Example 1:
Sam played football on:
Saturday Morning
Saturday Afternoon
Thursday Afternoon
The frequency was 2 on Saturday, 1 on Thursday and 3 for the whole week.
Example 2:
These are the numbers of newspapers sold at a local shop over the last 10 days:
Papers Sold   Frequency
18 2
19 0
20 4
21 0
22 2
23 1
24 0
25 1
One way to arrange data so that it makes sense is to use a frequency distribution table.
It's a list of all the different values in a variable, as well as the number of times they appear. A
frequency distribution, in other words, describes how frequencies are spread across values.
Frequency distributions are frequently used to summarize categorical variables.
Two types of frequency distributions that are most often used are the categorical
frequency distribution and the grouped frequency distribution.
The categorical frequency distribution is used for data that can be placed in specific
categories, such as nominal- or ordinal-level data. Categorical frequency distributions
are used to represent data such as political affiliation, religious affiliation, or major field of
study.
Example:
Twenty-five army inductees were given a blood test to determine their blood type. The
data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
To construct the frequency table for the given data, follow the steps illustrated
below.
Since the data are categorical, discrete classes can be used. There are four blood
types: A, B, O, and AB. These types will be used as the classes for the distribution.
Step 1: Make a table with columns for the class, tally, frequency, and percent.
Step 2: Tally the data and place the results in the tally column.
Step 3: Count the tallies and place the results in the frequency column.
Step 4: Find the percentage of values in each class by using the formula
$$\% = \frac{f}{n} \times 100\%$$
where f is the frequency of the class and n is the total number of values.
Percentages are not normally part of a frequency distribution, but they can be added
since they are used in certain types of graphs such as pie graphs. Also, the decimal equivalent
of a percent is called a relative frequency.
Step 5: Find the totals for columns frequency and percent. The completed table is shown.
Class   Tally         Frequency   Percent
A       IIIII         5           20%
B       IIIII II      7           28%
O       IIIII IIII    9           36%
AB      IIII          4           16%
TOTAL                 25          100%
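The completed table can be checked with a short program. The following is a minimal Python sketch, not part of the original module; the variable names (`blood_types`, `counts`, `table`) are chosen here for illustration only.

```python
from collections import Counter

# The 25 blood types from the example, read row by row.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()

counts = Counter(blood_types)  # frequency f of each class
n = len(blood_types)           # total number of values

# Percent for each class, using % = f / n x 100
classes = ["A", "B", "O", "AB"]
table = {cls: (counts[cls], 100 * counts[cls] / n) for cls in classes}

for cls, (freq, pct) in table.items():
    print(f"{cls:<6}{freq:>3}{pct:>6.0f}%")
total_pct = sum(pct for _, pct in table.values())
print(f"{'TOTAL':<6}{n:>3}{total_pct:>6.0f}%")
```

Dividing each percent by 100 gives the relative frequency of the class, for example 0.20 for blood type A.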
It is useful to divide a large collection of quantitative data into smaller intervals or classes
and count how many data values fall into each class. A frequency table divides data into
groups or intervals and displays the number of data values in each. The groups or intervals
are set up in such a way that each data value belongs to only one of them.
Constructing a frequency table involves several steps. Let us study the steps
illustrated below.
SITUATION: A task force to encourage carpooling did a study of one-way commuting
distances of workers in the downtown Zamboanga City area. A random sample of 60 of these
workers was taken. The commuting distances (in miles) of the workers in the sample are given
in Table 1.
Table 1. One-Way Commuting Distances (in miles) for 60 Workers in
Downtown Zamboanga City
13 47 10 3 16 20 17 40 4 2
7 25 8 21 19 15 3 17 14 6
12 45 1 8 4 16 11 18 23 12
6 2 14 13 7 15 46 12 9 18
34 13 41 28 36 17 24 27 29 9
14 26 10 24 37 31 8 16 12 16
NOTE: The data in the table are in their original form. Data collected in original form are called raw data.
Step 1 Decide how many classes you want. Typically, five to fifteen classes are
used. You risk losing too much information if you use fewer than five classes.
The data might not be properly summarized if you use more than 15 classes.
Assume that six classes will be used for the commuting data.
NOTE: The classes must be mutually exclusive. Mutually exclusive classes have
nonoverlapping class limits so that data cannot be placed into two classes. The classes must
be exhaustive. There should be enough classes to accommodate all the data.
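Step 2 Next, find the class width for the six classes. Divide the range of the data by the
number of classes:
$$\text{class width} \approx \frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}} = \frac{47 - 1}{6} \approx 7.7$$
so we use a class width of 8.
Note: To ensure that all the classes taken together cover the data, we need to increase the
result of this calculation to the next whole number, even if the calculation produced a whole
number. For instance, if the calculation produces the value 4, we make the class width 5.
The classes must be equal in width.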
Step 3 Determine the data range for each class. First let us take note of the following
terms:
▪ Lower class limit is the lowest data value that can fit in a class.
▪ Upper class limit is the highest data value that can fit in a class.
▪ Class width is the difference between the lower class limit of one class and the
lower class limit of the next class.
The smallest commuting distance in our sample is 1 mile. We use this smallest data value
as the lower class limit of the first class. Since the class width is 8, we add 8 to 1 to find that
the lower class limit for the second class is 9.
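The class width and the class limits can also be worked out in a few lines. This is a minimal Python sketch, not part of the original module. It assumes the common rule of dividing the range (largest value minus smallest value) by the number of classes, and it assumes whole-number data, so each upper class limit is one less than the next lower class limit.

```python
import math

smallest, largest = 1, 47  # smallest and largest distances in Table 1
num_classes = 6            # chosen in Step 1

# Divide the range by the number of classes, then move up to the next
# whole number, even when the quotient is already a whole number.
raw_width = (largest - smallest) / num_classes
class_width = math.floor(raw_width) + 1

# Lower limits start at the smallest value and increase by the class width.
class_limits = [
    (smallest + i * class_width, smallest + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

print(class_width)
print(class_limits)
```

Using `math.floor(raw_width) + 1` rather than `math.ceil(raw_width)` follows the note in Step 2: a quotient of exactly 4 still becomes a width of 5.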
Table 2. Frequency Table of One-Way Commuting Distances (in miles) for 60 Downtown Zamboanga City Workers
Class Limits   Class Boundaries   Tally   Frequency   Class Midpoint   Cumulative Frequency
Lower-Upper    Lower-Upper
1-8
9-16
17-24
25-32
33-40
41-48
For class 1-8, 1 is the lower class limit and 8 is the upper class limit.
Step 4 There is a space between the upper limit of one class and the lower limit of the
next class. The halfway points of these intervals are called class boundaries.
▪ To find upper class boundaries, add 0.5 unit to the upper class limits.
▪ To find lower class boundaries, subtract 0.5 unit from the lower class limits.
Table 2. Frequency Table of One-Way Commuting Distances (in miles) for 60 Downtown Zamboanga City Workers
Class Limits   Class Boundaries   Tally   Frequency   Class Midpoint   Cumulative Frequency
Lower-Upper    Lower-Upper
1-8            0.5-8.5
9-16           8.5-16.5
17-24          16.5-24.5
25-32          24.5-32.5
33-40          32.5-40.5
41-48          40.5-48.5
Step 5 Tally the commuting distance data into the six classes and find the frequency
for each class.
Table 2. Frequency Table of One-Way Commuting Distances (in miles) for 60 Downtown Zamboanga City Workers
Class Limits   Class Boundaries   Tally                       Frequency   Class Midpoint   Cumulative Frequency
Lower-Upper    Lower-Upper
1-8            0.5-8.5            IIIII IIIII IIII            14
9-16           8.5-16.5           IIIII IIIII IIIII IIIII I   21
17-24          16.5-24.5          IIIII IIIII I               11
25-32          24.5-32.5          IIIII I                     6
33-40          32.5-40.5          IIII                        4
41-48          40.5-48.5          IIII                        4
                                                              n = 60
Step 6 The center of each class is called the midpoint (or class mark). The
midpoint is often used as a representative value of the entire class. The
midpoint is found by adding the lower and upper class limits of one class and
dividing by 2.
Table 2. Frequency Table of One-Way Commuting Distances (in miles) for 60 Downtown Zamboanga City Workers
Class Limits   Class Boundaries   Tally                       Frequency   Class Midpoint   Cumulative Frequency
Lower-Upper    Lower-Upper
1-8            0.5-8.5            IIIII IIIII IIII            14          4.5
9-16           8.5-16.5           IIIII IIIII IIIII IIIII I   21          12.5
17-24          16.5-24.5          IIIII IIIII I               11          20.5
25-32          24.5-32.5          IIIII I                     6           28.5
33-40          32.5-40.5          IIII                        4           36.5
41-48          40.5-48.5          IIII                        4           44.5
                                                              n = 60
Table 2. Frequency Table of One-Way Commuting Distances (in miles) for 60 Downtown Zamboanga City Workers
Class Limits   Class Boundaries   Tally                       Frequency   Class Midpoint   Cumulative Frequency
Lower-Upper    Lower-Upper
1-8            0.5-8.5            IIIII IIIII IIII            14          4.5              14
9-16           8.5-16.5           IIIII IIIII IIIII IIIII I   21          12.5             35
17-24          16.5-24.5          IIIII IIIII I               11          20.5             46
25-32          24.5-32.5          IIIII I                     6           28.5             52
33-40          32.5-40.5          IIII                        4           36.5             56
41-48          40.5-48.5          IIII                        4           44.5             60
                                                              n = 60
Cumulative frequencies show how many data values are accumulated up to
and including a specific class. For example, 35 of the 60 one-way commuting distances are
less than or equal to 16 miles, and 52 are less than or equal to 32 miles.
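The whole frequency table can be rebuilt from the raw data as a check. The following is a minimal Python sketch, not part of the original module; it takes the six class limits from the steps above as given, and the variable names are illustrative.

```python
from itertools import accumulate

# Raw data from Table 1 (one-way commuting distances in miles).
distances = [
    13, 47, 10, 3, 16, 20, 17, 40, 4, 2,
    7, 25, 8, 21, 19, 15, 3, 17, 14, 6,
    12, 45, 1, 8, 4, 16, 11, 18, 23, 12,
    6, 2, 14, 13, 7, 15, 46, 12, 9, 18,
    34, 13, 41, 28, 36, 17, 24, 27, 29, 9,
    14, 26, 10, 24, 37, 31, 8, 16, 12, 16,
]
class_limits = [(1, 8), (9, 16), (17, 24), (25, 32), (33, 40), (41, 48)]

# Frequency: how many values fall between the lower and upper limits.
frequencies = [sum(lo <= d <= hi for d in distances) for lo, hi in class_limits]
# Boundaries: subtract 0.5 from the lower limit, add 0.5 to the upper limit.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in class_limits]
# Midpoint: (lower limit + upper limit) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in class_limits]
# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

rows = zip(class_limits, boundaries, frequencies, midpoints, cumulative)
for (lo, hi), (b_lo, b_hi), freq, mid, cum in rows:
    limits = f"{lo}-{hi}"
    bounds = f"{b_lo}-{b_hi}"
    print(f"{limits:<8}{bounds:<12}{freq:>4}{mid:>7}{cum:>5}")
print("n =", sum(frequencies))
```

Because the classes are mutually exclusive and exhaustive, the frequencies add up to the sample size, and the last cumulative frequency equals n.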
Let’s Practice #1! (Write your answer on the separate sheet/s of paper.)
1. A survey was taken on how much trust people place in the information they read on the
Internet. Construct a categorical frequency distribution for the data.
A - trust in everything they read
M - trust in most of what they read
H - trust in about one-half of what they read
S - trust in a small portion of what they read
(Based on information from the UCLA Internet Report.)
M M M A H M S M H M
S M M M M A M M A M
M M H M M M H M H M
A M M M H M M M M M
2. Medical: Tumor Recurrence: Certain kinds of tumors tend to recur. The following data
represent the lengths of time, in months, for a tumor to recur after chemotherapy (Reference:
D. P. Byar, Journal of Urology, Vol. 10, pp. 556–561). Note: These data are also available for
download at the Online Study Center.
19 18 17 1 21 22 54 46 25 49
50 1 59 39 43 39 5 9 38 18
14 45 54 59 46 50 29 12 19 36
38 40 43 41 10 50 41 25 19 39
27 20
Use five classes.
Now that you are familiar with the frequency distribution table, it's time
to learn about the MEAN, MEDIAN, and MODE.
The mean, median, and the mode are measures of central tendency.
Mode
Count the letters in each word of this sentence and give the mode. The numbers
of letters in the words of the sentence are
5 3 7 2 4 4 2 4 8 3 4 3 4
When we look at the numbers, we can see that 4 is the most common number, as there
are more words with 4 letters than with any other length. Before scanning for the mode, it is a
good idea to sort larger data sets.
For any qualitative or quantitative variable, the mode is the score (or scores) that
occurs with the highest frequency in a data set.
Not every data set has a mode. For example, if a science teacher gives equal numbers
of 85’s, 87’s, 89’s, 91’s, and 93’s, then there is no modal grade.
Data sets (variables) with two modes are often referred to as bi-modal. Similarly,
data sets with three modes are called tri-modal, and those with several modes are called
multi-modal.
The mode is not very stable. Changing only one number in a data set can drastically
alter the mode. Even so, the mode is a useful average when you want to know the most
frequently occurring data value.
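Finding the mode amounts to counting how often each value occurs. The following is a minimal Python sketch, not part of the original module. The helper `find_modes` is a hypothetical name, and it assumes a non-empty data set. It follows the rule used in this lesson: there is no mode when no value repeats or when every value occurs equally often.

```python
from collections import Counter

def find_modes(data):
    """Return the mode(s) of a non-empty data set, or [] if there is none."""
    frequency = Counter(data)
    highest = max(frequency.values())
    modes = [value for value, f in frequency.items() if f == highest]
    # No mode when nothing repeats, or when all distinct values tie.
    if highest == 1 or (len(frequency) > 1 and len(modes) == len(frequency)):
        return []
    return sorted(modes)

# Numbers of letters in the words of the example sentence.
letter_counts = [5, 3, 7, 2, 4, 4, 2, 4, 8, 3, 4, 3, 4]
print(find_modes(letter_counts))

# Equal numbers of each grade: no modal grade.
print(find_modes([85, 87, 89, 91, 93]))

# Two values tie for the highest frequency: a bi-modal data set.
print(find_modes([1, 1, 2, 2, 3]))
```

Python's built-in `statistics.multimode` (Python 3.8 and later) also returns every most frequent value, but it reports all values when they tie, so the extra check here is what expresses the "no mode" case.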
Median
The median is the middle value, or central value, of an ordered set of scores.
When you are given the median, you know that there are equal
numbers of data values above and below it in the ordered distribution.
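The middle value can be located with a short routine. The following is a minimal Python sketch, not part of the original module; the function name `median_of` is hypothetical. Averaging the two middle values when the count is even is the usual convention.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)  # the data must be ordered first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value.
        return ordered[middle]
    # Even count: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

# Odd count: the 13 letter counts from the Mode example.
print(median_of([5, 3, 7, 2, 4, 4, 2, 4, 8, 3, 4, 3, 4]))

# Even count: the first row of Table 1 (10 commuting distances).
print(median_of([13, 47, 10, 3, 16, 20, 17, 40, 4, 2]))
```

In the second call the ordered values are 2, 3, 4, 10, 13, 16, 17, 20, 40, 47, so the two middle values are 13 and 16.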
The given data set has an even number of values. The two middle values are 31 and 31. The
median is thus equal to
$$\frac{31 + 31}{2} = 31$$
Mean
An average that uses the exact value of each entry is the mean (sometimes called the
arithmetic mean). To compute the mean, add the values of all the entries and then divide by
the number of entries.
The mean for the data set 83, 93, 77, 33, 62, 28, 23 is
$$\frac{83 + 93 + 77 + 33 + 62 + 28 + 23}{7} = \frac{399}{7} = 57$$
When we compute the mean, we sum the given data. There is a convenient notation to
indicate the sum. Let x represent any value in the data set. Then the notation $\sum x$
means that you are to sum all the data values. In other words, we are to sum all the entries
in the distribution. The summation symbol ∑ means “sum the following” and is capital sigma,
the S of the Greek alphabet.
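The sum and the division can be checked numerically. This is a minimal Python sketch, not part of the original module, using the seven values from the mean example above; the variable names are illustrative.

```python
data = [83, 93, 77, 33, 62, 28, 23]

total = sum(data)  # the sum of all x values
n = len(data)      # the number of entries
mean = total / n   # sum of the entries divided by the number of entries

print(total, n, mean)  # 399 7 57.0
```

Unlike the median and the mode, the mean uses the exact value of every entry, so changing any single entry changes the result.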
The mean of a sample distribution of x values is denoted by $\bar{x}$ (read “x
bar”). Thus,
$$\bar{x} = \frac{\sum x}{n}$$
where n is the number of entries in the sample.
Let’s Practice #2! (Write your answer on the separate sheet/s of paper.)
For the following groups of numbers, calculate the mean, median and mode for each.
(Note: Show how you arrived at the answer for the median and the mean).
1. 18, 24, 17, 21, 24, 16, 29, 18
2. 75, 87, 49, 68, 75, 84, 98
3. 55, 47, 38, 66, 56, 64, 44, 63, 39
REMEMBER
Key Points
▪ A frequency distribution is the organization of raw data in table form, using classes.
▪ Two types of frequency distributions that are most often used are the categorical
frequency distribution and the grouped frequency distribution.
▪ The measures of central tendency are the mean, median and mode.
▪ The mean is the sum of n numbers divided by n.
▪ The median is the “middle” or “center” of a set of data.
▪ The mode is the value that occurs with the highest frequency and more than once.
TRY
Let’s see how much you have learned today!
Write your answers on a separate sheet of paper.
Directions: Read each question carefully and choose the letter corresponding to your
answer. (Write your answer on a separate sheet of paper.)
_____1. Which measure of central tendency is described as an “average”?
A. Mean B. Median C. Mode D. Standard Deviation
_____2. What measure of central tendency requires that the data be arranged according to
size?
A. Mean B. Median C. Mode D. Standard Deviation
_____3. The data distribution below shows the 2,439 complaints about comfort-related
characteristics of an airline’s planes:
_____5. The following table is based on the study in which 200 persons were asked how
many times they had visited the local zoo during the last twelve months:
Number of visits to the local zoo   Number of persons
0 90
1 72
2 26
3 8
4 3
5 0
6 1
TOTAL 200
What type of frequency distribution is shown?
A. Numerical distribution C. Categorical distribution
B. Qualitative distribution D. Frequency distribution table
For items 6-8, refer to the problem below.
In a recent month, the Bureau of Fisheries reported 53, 31, 67, 53, and 36 fishing
violations for five different regions.