
Chapter 2

Frequency Distribution and Graphs

Objectives:

1. Define some basic terms used in making a frequency distribution.
2. Organize data into a frequency distribution.
3. Construct the different graphs and charts.

1.1 Introduction

When conducting statistical research, the researcher must gather data for a particular variable under investigation. The
researcher must organize the data gathered in a meaningful way. Frequency distribution is used in organizing
data.

After organizing the data using frequency distribution, the researcher needs to present the data in such a way that it can
be understood easily. The most useful method in presenting data is by constructing graphs and charts.

1.2 Definition of Frequency Distribution

A frequency distribution is the organization of raw data in table form, using classes and frequencies. The data are
grouped into non-overlapping classes (categories), and the table shows the number of observations in each class.

1.3 Definition of Some Important Terms

1. Raw data – data collected in original form.
2. Range – the difference between the highest value and the lowest value in the distribution.
3. Class limits – the highest and lowest values describing a class.
4. Class boundaries (real limits) – the upper and lower values of a class in a grouped frequency distribution;
they have one more decimal place than the class limits and end in the digit 5.
5. Interval (or width) – the distance between the lower class boundary and the upper class boundary, denoted
by the symbol i.
6. Frequency (f) – the number of values in a specific class of a frequency distribution.
7. Relative frequency (rf) – the value obtained when the frequency of a class is divided by the total number
of values.
8. Percentage (%) – obtained by multiplying the relative frequency by 100%.
9. Cumulative frequency (cf) – the sum of the frequencies accumulated up to the upper boundary of a
class in a frequency distribution.
10. Midpoint – the point halfway between the class limits of a class; it is representative of the data
within that class.

1.4 Types of Frequency Distribution

1. Qualitative or categorical frequency distribution – used to organize nominal-level or ordinal-level data.
In this type, the data are grouped into non-numerical categories according to some qualitative characteristic.

Some examples where we can apply this distribution are gender, business type, political affiliation, and
others.

Steps in Constructing Categorical Frequency Distribution

1. Make a table
2. Tally the data and place the results in tally column
3. Count the tallies and place the results in the frequency column
4. Find the percentage of values in each class by using the formula
% = (f/N) × 100%
where: f = frequency of the class
N = total number of observations
Percentages are not normally a part of a frequency distribution, but they can be added since they are used in
certain types of graphical presentations, such as pie graphs.
5. Find the totals for the frequency column and the percentage column.

1
Subject: Elementary Statistics
Example:

1. Twenty-five army inductees were given a blood test to determine their blood type. The data set is shown below.
Construct a frequency distribution and interpret the results.

A    B    B    AB   O
O    O    B    AB   B
B    B    O    A    O
A    O    O    O    AB
AB   A    O    B    A

Solution:

Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O, and AB.
These types will be used as the classes for the distribution.

Step 1 Make a table as shown.

A       B             C           D
Class   Tally         Frequency   Percent (%)
A
B
O
AB

Step 2 Tally the data and place the results in column B.

A       B             C           D
Class   Tally         Frequency   Percent (%)
A       IIIII
B       IIIII-II
O       IIIII-IIII
AB      IIII

Step 3 Count the tallies and place the results in column C.

A       B             C           D
Class   Tally         Frequency   Percent (%)
A       IIIII         5
B       IIIII-II      7
O       IIIII-IIII    9
AB      IIII          4

Step 4 Find the percentage of values in each class and place the results in column D.

A       B             C           D
Class   Tally         Frequency   Percent (%)
A       IIIII         5           20
B       IIIII-II      7           28
O       IIIII-IIII    9           36
AB      IIII          4           16

Step 5 Find the totals for columns C (frequency) and D (percent). The completed table is shown.

A       B             C           D
Class   Tally         Frequency   Percent (%)
A       IIIII         5           20
B       IIIII-II      7           28
O       IIIII-IIII    9           36
AB      IIII          4           16
Total                 25          100

For class A, for example, the percentage is (5/25) × 100% = 20%.

For the sample of 25 army inductees, more people have type O blood than any other type.
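
The five steps above can also be sketched in Python. This is only an illustrative sketch, not part of the original solution: the function name `categorical_distribution` is an assumption made for the example, and the tallying is delegated to `collections.Counter` from the standard library.

```python
from collections import Counter

# Blood types of the 25 army inductees from the example above.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()


def categorical_distribution(data):
    """Return (class, frequency, percent) rows and the total count N."""
    counts = Counter(data)               # Steps 2-3: tally and count
    n = len(data)
    rows = [(cls, f, 100 * f / n)        # Step 4: % = (f/N) x 100%
            for cls, f in counts.items()]
    return rows, n


rows, n = categorical_distribution(blood_types)
print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for cls, f, pct in rows:
    print(f"{cls:<6}{f:>10}{pct:>9.0f}")
# Step 5: totals for the frequency and percent columns.
print(f"{'Total':<6}{n:>10}{sum(p for _, _, p in rows):>9.0f}")
```

The class with the largest frequency in `rows` is the one reported in the interpretation.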
Exercise:

1. Twenty applicants were given a performance evaluation appraisal. The data set is:
High High High Low Average

Average Low Average Average Average

Low Average Average High High

Low Average Average High High

2. A survey taken at a hotel in Bohol indicated that 40 guests preferred the following means of transportation:
Car Car Bus Plane Train Bus Bus Plane Car Plane

Bus Plane Car Car Train Train Car Car Plane Plane

Plane Car Bus Car Bus Car Plane Car Plane Plane

Car Car Bus Train Car Bus Car Car Car Car

2. Ungrouped frequency distribution – applicable to numerical data with fewer than 30 observations.

Steps in Constructing Ungrouped Frequency Distribution


a. Make a table.
b. Tally the data and place the results in the tally column.
c. Count the tallies and place the results in the frequency column.
d. Find the percentage of values in each class by using the formula % = (f/N) × 100%
where: f = frequency of the class
N = total number of observations
e. Find the totals for the frequency column and the percentage column.

Note: Percentages are not normally part of a frequency distribution, but they can be added since they are
used in certain types of graphical presentations, such as pie graphs.

Example: The heights (inches) of commonly grown herbs are shown below. Construct the FDT and think of a way
the results would be useful.

18 20 18 18 24 10 24
12 20 36 14 20 18
18 16 20 7 16 15
Solution:

Class   Tally    Frequency   %
36      I        1           5.26%
24      II       2           10.53%
20      IIII     4           21.05%
18      IIIII    5           26.32%
16      II       2           10.53%
15      I        1           5.26%
14      I        1           5.26%
12      I        1           5.26%
10      I        1           5.26%
7       I        1           5.26%
Total            19          100%

Conclusion: 26.32% of the commonly grown herbs are 18 inches tall, the most common height in the data set.
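
As a cross-check, the same table can be produced with a short Python sketch. The variable names are assumptions made for illustration, and the percentages are rounded to two decimal places, so they may add up to slightly less than 100.

```python
from collections import Counter

# Heights (inches) of the 19 commonly grown herbs from the example above.
heights = [18, 20, 18, 18, 24, 10, 24,
           12, 20, 36, 14, 20, 18,
           18, 16, 20, 7, 16, 15]

counts = Counter(heights)
n = len(heights)

# One row per distinct value, listed from highest to lowest as in the table.
table = [(value, counts[value], round(100 * counts[value] / n, 2))
         for value in sorted(counts, reverse=True)]

for value, f, pct in table:
    print(f"{value:>5}{f:>4}{pct:>8.2f}%")
print(f"Total{n:>4}")

# The value with the largest frequency is the one named in the conclusion.
most_common_height, most_common_count = counts.most_common(1)[0]
print(most_common_height, most_common_count)
```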

Supplementary Exercise: The heights of 20 young rambutan trees, which are to be transplanted by a tree
nursery aid, are 18, 20, 36, 30, 16, 14, 16, 20, 15, 16, 14, 17, 16, 21, 24, 22, 30, 35, 26, and 16 inches.

3. Grouped frequency distribution – used when the data set is large (n ≥ 30); data are grouped into numerical categories.

Several things to be noted, taking a class such as 24 – 30 as an example:

(24 – 30) – called the class limits

24 – lower class limit; it represents the smallest data value that can be included in the class.

30 – upper class limit; it represents the largest data value that can be included in the class.

CLASS BOUNDARIES – the numbers used to separate the classes so that there are no gaps in the frequency
distribution.

Basic Rule: The class limit should have the same decimal place value as the data, but the class boundaries should
have one additional place value and end in a 5.

For example, if the values in the data set are whole numbers, such as 24, 32, 18, the limits for the class
might be 31-37, and the boundaries are 30.5 – 37.5. 30.5 is the lower boundary and 37.5 is the upper boundary.

If the data are in tenths, such as 6.2, 7.8, and 12.6, the limits for class hypothetically might be 7.8 – 8.8,
and the boundaries for that class would be 7.75-8.85. Find these by subtracting 0.05 from 7.8 and adding 0.05 to
8.8.

The class width for a class in a frequency distribution is found by subtracting the lower (or upper) class
limit of one class from the lower (or upper) class limit of the next class.
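
The basic rule for class boundaries can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the function name `class_boundaries` is hypothetical, and the limits are passed as strings so that the standard-library `decimal` module can see exactly how many decimal places they carry.

```python
from decimal import Decimal


def class_boundaries(lower_limit, upper_limit):
    """Return (lower boundary, upper boundary) for one class.

    The boundaries have one more decimal place than the limits
    and end in the digit 5.
    """
    lower = Decimal(lower_limit)
    upper = Decimal(upper_limit)
    # Number of decimal places used by the class limits (0 for whole numbers).
    places = max(-lower.as_tuple().exponent, -upper.as_tuple().exponent, 0)
    half_unit = Decimal(5).scaleb(-(places + 1))   # 0.5, 0.05, 0.005, ...
    return lower - half_unit, upper + half_unit


print(class_boundaries("31", "37"))     # whole-number data
print(class_boundaries("7.8", "8.8"))   # data in tenths
```

Using `Decimal` rather than floating-point numbers keeps values such as 7.75 and 8.85 exact.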

Steps in Constructing Grouped Frequency Distribution

1. Arrange the raw data in ascending or descending order (optional). This makes the data easier to tally.
2. Determine the number of classes
Find the highest and lowest values;
Find the range; R = highest value – lowest value
Determine the number of classes (k); k = 1 + 3.322 log N, where N is the number of observations
(Sturges' approximation formula).
Determine the class interval (or width)

Note: Round the value of the interval up to the nearest whole number if there is a remainder.

Determining Class Interval

Generally, the number of classes for a frequency distribution table varies from 5 to 20, depending primarily on
the number of observations in the data set. It is preferable to have more classes as the size of the data set
increases. The decision about the number of classes depends on the method used by the researcher.

Rule 1: Use as the number of classes the smallest positive integer k such that 2^k ≥ n, where n is
the total number of observations. Using the formula below, we can get the ideal class interval.

i = Range / Number of classes = (HV − LV) / k

where: HV = highest value in the data set
LV = lowest value in the data set
k = number of classes
i = suggested class interval

Rule 2: Another way to determine the class interval is to use the following formula:

i = Range / (1 + 3.322 × logarithm of total frequencies)

Rule 3: Another guideline to determine the class interval is to have an ideal number of classes, then apply the
formula below:

i = (HV − LV) / Number of classes

Select the starting point (usually the lowest value or any convenient number less than the
lowest value);
Select the individual class limits
• Add the interval (or width) to the starting point to obtain the lower limit of the next class.
Keep adding until the computed number of classes is obtained.
• To obtain the upper class limits, subtract one unit from the lower limit of the second
class to obtain the upper limit of the first class. Then add the interval to each
upper limit to obtain the remaining upper limits.
Set the class boundaries of each class.
• To obtain the class boundaries for whole-number data, subtract 0.5 from each lower class limit
and add 0.5 to each upper class limit.
3. Tally the raw data.
4. Convert the tallied data into numerical frequencies.
5. Determine the relative frequency by dividing each class frequency by the total frequency.
6. Determine the percentage by multiplying each relative frequency by 100%.
7. Determine the cumulative frequencies. The cumulative frequency of a class is found by adding its
frequency to the total frequency of the classes preceding it.
8. Determine the midpoints. The midpoint of a class is the average of its upper limit and
lower limit.
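
The rules for the number of classes and the construction of the class limits can be sketched in Python as follows. The function names are assumptions made for illustration, and the sketch handles whole-number data only. As a design choice, it keeps adding classes until the highest value is covered, which normally gives k classes but gives one more when the range is an exact multiple of k. The sample call uses N = 50 observations ranging from 18 to 77, the values of the worked example that follows.

```python
import math


def classes_rule1(n):
    """Rule 1: smallest positive integer k such that 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def classes_sturges(n):
    """Sturges' approximation: k = 1 + 3.322 log N, rounded off."""
    return round(1 + 3.322 * math.log10(n))


def class_limits(lowest, highest, k, start=None):
    """Return the interval and the (lower, upper) limits of each class."""
    width = math.ceil((highest - lowest) / k)   # round the interval up
    lower = lowest if start is None else start
    limits = []
    while lower <= highest:                     # stop once the highest value is covered
        limits.append((lower, lower + width - 1))
        lower += width
    return width, limits


k = classes_sturges(50)
width, limits = class_limits(18, 77, k)
print(classes_rule1(50), k, width)
print(limits)
```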

Example:

1. Panglao Island Travel and Tours, one of the DOT-accredited travel and tour operators under the new
normal, offers special rates during the summer period. The owner wants additional information on the ages of
the people taking travel tours. A random sample of 50 customers taking travel tours last summer
revealed the ages below. Construct a grouped FDT using Rule 2 and interpret the results.

18 29 42 57 61 67 37 49 53 47

24 34 45 58 63 70 39 51 54 48

28 36 46 60 66 77 40 52 56 49

19 31 44 58 62 68 38 50 54 48

27 36 46 59 64 74 39 51 55 48

Solution:

Step 1: Determine the range (R); R = highest value – lowest value

R = 77 – 18

R = 59

Step 2: Determine the number of classes (k);

k = 1 + 3.322 log N, where N is the number of observations (Sturges' approximation formula)

N = 50
k = 1 + 3.322 log N
= 1 + 3.322 log 50 ≈ 6.64 ≈ 7 (rounded off)

Step 3: Find the class interval or width (i) by dividing the range by the number of classes and rounding up.

i = R/k = 59/7 ≈ 8.43 ≈ 9 (rounded up)

Step 4: Select the starting point (usually the lowest value or any convenient number less than the lowest value);
add the width to get the lower limits.

Class Limits
18
27    (add i = 9)
36
45
54
63
72

Step 5: Find the upper class limits.

Class Limits
18-26    (subtract 1 from the second lower limit: 27 – 1 = 26)
27-35    (add i = 9 to the previous upper limit)
36-44
45-53
54-62
63-71
72-80

Step 6: Find the class boundaries.

Class Limits   Class Boundaries
18-26          17.5 ≤ x < 26.5
27-35          26.5 ≤ x < 35.5
36-44          35.5 ≤ x < 44.5
45-53          44.5 ≤ x < 53.5
54-62          53.5 ≤ x < 62.5
63-71          62.5 ≤ x < 71.5
72-80          71.5 ≤ x < 80.5

Step 7: Find the midpoints. For the first class, the midpoint is (18 + 26)/2 = 22.

Class Limits   Class Boundaries   Midpoints
18-26          17.5 ≤ x < 26.5    22
27-35          26.5 ≤ x < 35.5    31
36-44          35.5 ≤ x < 44.5    40
45-53          44.5 ≤ x < 53.5    49
54-62          53.5 ≤ x < 62.5    58
63-71          62.5 ≤ x < 71.5    67
72-80          71.5 ≤ x < 80.5    76

Step 8: Tally the raw data.

Class Limits   Class Boundaries   Midpoints   Tally
18-26          17.5 ≤ x < 26.5    22          III
27-35          26.5 ≤ x < 35.5    31          IIIII
36-44          35.5 ≤ x < 44.5    40          IIIII-IIII
45-53          44.5 ≤ x < 53.5    49          IIIII-IIIII-IIII
54-62          53.5 ≤ x < 62.5    58          IIIII-IIIII-I
63-71          62.5 ≤ x < 71.5    67          IIIII-I
72-80          71.5 ≤ x < 80.5    76          II

Step 9: Find the numerical frequencies from the tallies.

Class Limits   Class Boundaries   Midpoints   Tally              Frequency
18-26          17.5 ≤ x < 26.5    22          III                3
27-35          26.5 ≤ x < 35.5    31          IIIII              5
36-44          35.5 ≤ x < 44.5    40          IIIII-IIII         9
45-53          44.5 ≤ x < 53.5    49          IIIII-IIIII-IIII   14
54-62          53.5 ≤ x < 62.5    58          IIIII-IIIII-I      11
63-71          62.5 ≤ x < 71.5    67          IIIII-I            6
72-80          71.5 ≤ x < 80.5    76          II                 2

Step 10: Find the cumulative frequencies.

Class Limits   Class Boundaries   Midpoints   Tally              Frequency   Cumulative Frequency
18-26          17.5 ≤ x < 26.5    22          III                3           3
27-35          26.5 ≤ x < 35.5    31          IIIII              5           8
36-44          35.5 ≤ x < 44.5    40          IIIII-IIII         9           17
45-53          44.5 ≤ x < 53.5    49          IIIII-IIIII-IIII   14          31
54-62          53.5 ≤ x < 62.5    58          IIIII-IIIII-I      11          42
63-71          62.5 ≤ x < 71.5    67          IIIII-I            6           48
72-80          71.5 ≤ x < 80.5    76          II                 2           50

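
The frequencies, cumulative frequencies and midpoints in the solution can be checked with a short Python sketch. It is a sketch only: the dictionary keys (`"f"`, `"rf"`, `"cf"` and so on) are names chosen here for illustration, and the starting point 18, the interval 9 and the 7 classes are taken from the steps above.

```python
# Ages of the 50 customers from the example above.
ages = [18, 29, 42, 57, 61, 67, 37, 49, 53, 47,
        24, 34, 45, 58, 63, 70, 39, 51, 54, 48,
        28, 36, 46, 60, 66, 77, 40, 52, 56, 49,
        19, 31, 44, 58, 62, 68, 38, 50, 54, 48,
        27, 36, 46, 59, 64, 74, 39, 51, 55, 48]

start, width, k = 18, 9, 7      # starting point, interval, number of classes

table = []
cumulative = 0
for c in range(k):
    lower = start + c * width
    upper = lower + width - 1
    # Count the values that fall within the class limits.
    f = sum(lower <= age <= upper for age in ages)
    cumulative += f
    table.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),
        "midpoint": (lower + upper) / 2,
        "f": f,
        "rf": f / len(ages),     # relative frequency
        "cf": cumulative,        # cumulative frequency
    })

for row in table:
    print(row)
```

The relative frequency column is included because it is part of the general procedure, even though the worked solution stops at the cumulative frequencies.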
1.6 Graphing Frequency Distribution

The graphical method is a pictorial or geometrical representation of a given set of data.

It is a method of presenting data.

1.6.1 Most Commonly Used graphs in Research

1. Histogram – a graph that displays the data by using vertical bars of various heights that are joined
together to represent the frequencies of the classes

Steps:

1. Draw and label the x and y axes. The x axis is always the horizontal axis, and the y axis is
always the vertical axis.
2. Represent the frequency on the y axis and the class boundaries on the x axis.
3. Using the frequencies as the heights, draw the vertical bars for each class.

Example:

Construct a histogram to represent the data shown below for the record of ages of 50 customers taking travel
tours.
Class boundaries   Frequency
17.5-26.5          3
26.5-35.5          5
35.5-44.5          9
44.5-53.5          14
53.5-62.5          11
62.5-71.5          6
71.5-80.5          2
Total              N = 50

[Figure: Histogram. x-axis: Ages (class boundaries); y-axis: Frequency]
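
Where no plotting tool is at hand, the shape of the histogram can be previewed in plain text. The sketch below is an approximation only: it draws the bars horizontally instead of vertically, and the function name `text_histogram` and the `#` symbol are assumptions made for illustration.

```python
# Class boundaries and frequencies from the table above.
boundaries = [17.5, 26.5, 35.5, 44.5, 53.5, 62.5, 71.5, 80.5]
frequencies = [3, 5, 9, 14, 11, 6, 2]


def text_histogram(boundaries, frequencies, symbol="#"):
    """Return one text row per class: boundaries, a bar, and the frequency."""
    lines = []
    for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
        # The length of the bar is the class frequency.
        lines.append(f"{lower:>5.1f}-{upper:<5.1f}|{symbol * f} {f}")
    return lines


print("\n".join(text_histogram(boundaries, frequencies)))
```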

2. Frequency Polygon – is a graph that displays the data by using lines that connect points plotted for the
frequencies at the midpoints of the classes. The frequencies are represented by the height of the
points.

Steps:

1. Find the midpoints of the classes.
2. Draw the x and y axes.
3. Use the midpoints for the x values and the frequencies for the y values.
4. Connect adjacent points with line segments.

Example:

Construct a frequency polygon to represent the data shown below for the record of ages of 50 customers
taking travel tours.
Midpoints frequency
22 3
31 5
40 9
49 14
58 11
67 6
76 2

[Figure: Frequency Polygon. x-axis: Ages (midpoints); y-axis: Frequency]
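
The points of the frequency polygon can be computed with a few lines of Python. This is a sketch under the assumption that the class boundaries of the travel-tour example are available as a list; the variable names are illustrative.

```python
boundaries = [17.5, 26.5, 35.5, 44.5, 53.5, 62.5, 71.5, 80.5]
frequencies = [3, 5, 9, 14, 11, 6, 2]

# Step 1: the midpoint of each class is the average of its two boundaries
# (the same value as the average of its two class limits).
midpoints = [(lower + upper) / 2
             for lower, upper in zip(boundaries, boundaries[1:])]

# Step 3: pair each midpoint (x) with its frequency (y).
points = list(zip(midpoints, frequencies))

# Step 4: adjacent points are joined by line segments.
segments = list(zip(points, points[1:]))

print(points)
```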

3. Cumulative Frequency Graph (Ogive) – a graph that displays the cumulative frequencies for the
classes in a frequency distribution. The vertical axis represents the cumulative frequency of the
distribution while the horizontal axis represents the upper-class boundaries (real upper limits) of the
frequency distribution.
Steps:
1. Find the cumulative frequency for each class.
2. Draw the x and y axes.
3. Represent the cumulative frequency on the y-axis and the upper class boundaries on the x-axis.
4. Connect adjacent points with line segments.

Example:

Construct a cumulative frequency graph or ogive to represent the data shown below for the record of ages
of 50 customers taking travel tours.

[Figure: Cumulative Frequency Graph (Ogive). x-axis: Ages (upper class boundaries); y-axis: Cumulative frequency]
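
The points plotted on the ogive can be obtained with the following Python sketch, which uses `itertools.accumulate` from the standard library to build the running totals. The variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies of the travel-tour example.
upper_boundaries = [26.5, 35.5, 44.5, 53.5, 62.5, 71.5, 80.5]
frequencies = [3, 5, 9, 14, 11, 6, 2]

# Step 1: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Step 3: x = upper class boundary, y = cumulative frequency.
ogive_points = list(zip(upper_boundaries, cumulative))

print(ogive_points)
```

Each point reads as "this many customers are younger than this boundary"; for instance, the point (53.5, 31) says that 31 of the 50 customers are below 53.5 years of age.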

1.6.2 Other Types of Graphs:

a. Pareto chart – used to represent a frequency distribution for categorical (nominal-level) data; the
frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to
lowest.

b. Bar graph – represents data by areas in the form of vertical rectangles or bars. It is used when the
quantities are independent of each other.

c. Pie graph – also known as the circle graph. The presentation makes use of a circle to represent given
data that make up a whole.

d. Time series graph – represents data that occur over a specific period of time under observation. It shows
a trend or pattern of increase or decrease over the period of time.

e. Pictograph or pictogram – picture symbols are used to illustrate or represent the data under consideration.
For example, population data are usually depicted with figures of persons.

f. Scatter graph or scatter plot – a graph used to present measurements or values that are thought to be
related.

Example 1: Using the information in the table below about the favorite snacks of freshman college students,
construct a Pareto chart, a bar chart and a pie chart.
Products Sales
Cookies 120
Candies 150
Ice Cream 190
Chocolate 220
Others 80
Solution:

a. Constructing a Pareto Chart

Step 1: Arrange the data from highest to lowest according to frequency.

Products    Sales
Chocolate   220
Ice Cream   190
Candies     150
Cookies     120
Others      80

Step 2: Draw and label the x-axis (Products) and y-axis (Sales).

Step 3: Draw bars of equal width, with heights corresponding to the frequencies.

[Figure: Pareto Chart for Favorite Snacks. x-axis: Products (Chocolate, Ice Cream, Candies, Cookies, Others); y-axis: Sales]
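
The ordering step that distinguishes a Pareto chart from an ordinary bar chart can be sketched in Python. This is an illustration only; the text bars (one `#` per 10 units of sales) stand in for the drawn bars and are an assumption of the sketch.

```python
# Sales of each product from the table in Example 1.
sales = {"Cookies": 120, "Candies": 150, "Ice Cream": 190,
         "Chocolate": 220, "Others": 80}

# Step 1: arrange the categories from highest to lowest frequency.
pareto_order = sorted(sales.items(), key=lambda item: item[1], reverse=True)

# Steps 2-3 in text form: one bar per product, one "#" per 10 units of sales.
for product, value in pareto_order:
    print(f"{product:<10}{'#' * (value // 10)} {value}")
```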

b. Constructing a Bar Chart

Step 1: Draw and label the x-axis (Products) and y-axis (Sales).

Step 2: Draw bars of equal width, with heights corresponding to the frequencies.

[Figure: Bar Chart for Favorite Snacks. x-axis: Products (Cookies, Candies, Ice Cream, Chocolate, Others); y-axis: Sales]

c. Constructing a Pie Chart

Step 1: Since there are 360° in a circle, the frequency of each class must be converted into a proportional part of
the circle. This conversion is done by applying the formula

Degrees = (f/n) × 360°

where f = frequency of each class
n = sum of all frequencies

Hence, with n = 120 + 150 + 190 + 220 + 80 = 760,

Cookies: (120/760) × 360° ≈ 57°

Candies: (150/760) × 360° ≈ 71°

Ice Cream: (190/760) × 360° = 90°

Chocolate: (220/760) × 360° ≈ 104°

Others: (80/760) × 360° ≈ 38°

Step 2: Each frequency must also be converted to a percentage, and the percentages must total 100%. This
conversion is done by applying the formula

Percentage = (f/n) × 100%

where f = frequency of each class
n = sum of all frequencies

Hence, with n = 760,

Cookies: (120/760)(100%) ≈ 16%

Candies: (150/760)(100%) ≈ 20%

Ice Cream: (190/760)(100%) = 25%

Chocolate: (220/760)(100%) ≈ 29%

Others: (80/760)(100%) ≈ 10.5%, taken as 10% so that the rounded percentages total 100%



Step 3: Using a protractor, graph each section and write its name and appropriate percentage.

[Figure: Pie Chart for Favorite Snacks. Chocolate 29%, Ice Cream 25%, Candies 20%, Cookies 16%, Others 10%]
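
The two conversions used for the pie chart can be sketched in Python. The variable names are assumptions made for illustration. The degrees are rounded to whole numbers, while the percentages are kept to one decimal place, so they differ slightly from the whole-number percentages shown on the chart.

```python
# Sales of each product from the table in Example 1.
sales = {"Cookies": 120, "Candies": 150, "Ice Cream": 190,
         "Chocolate": 220, "Others": 80}

n = sum(sales.values())   # sum of all frequencies

# Step 1: Degrees = (f/n) x 360
degrees = {product: round(f * 360 / n) for product, f in sales.items()}

# Step 2: Percentage = (f/n) x 100%
percent = {product: round(f * 100 / n, 1) for product, f in sales.items()}

for product in sales:
    print(f"{product:<10}{degrees[product]:>5} deg{percent[product]:>7.1f}%")
```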

Example 2: The data in the following table represent the number of professionals actively using credit
card payments for shopping from 2011 to 2018. Construct a time series graph.

Year                          2011   2012   2013   2014   2015   2016   2017   2018
Card Payments (in millions)   10.3   13.4   14.0   16.7   18.5   20.8   24.0   27.0

Solution:

Step 1: Draw and label the x-axis (Year) and y-axis (Card Payments).

Step 2: Plot each point according to the table.

Step 3: Draw line segments connecting adjacent points.

[Figure: Time-Series Graph for Card Payments. x-axis: Year (2011 to 2018); y-axis: Card Payments (in millions)]

Example 3: The following table shows the number of televisions sold by a company for the months of January
to May. Construct a pictograph for the table.

Months                  January   February   March   April   May
Number of Televisions   45        60         75      30      15
Solution:

Step 1: Draw and label the x-axis and y-axis.

Step 2: Label the x-axis for the months and the y-axis for the number of televisions.

Step 3: Draw picture symbols to represent the number of televisions, one symbol for every 15 televisions.

[Figure: Pictograph of televisions sold. x-axis: Months (January, February, March, April, May)]

Legend: one symbol = 15 televisions
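
The number of symbols to draw for each month follows directly from the legend. The Python sketch below is illustrative only; the text marker `[TV]` is a stand-in chosen here for the picture symbol, and the variable names are assumptions.

```python
# Televisions sold per month, from the table in Example 3.
televisions = {"January": 45, "February": 60, "March": 75,
               "April": 30, "May": 15}
UNIT = 15   # one symbol stands for 15 televisions, as in the legend

# Number of symbols per month = televisions sold / 15.
symbols_per_month = {month: sold // UNIT for month, sold in televisions.items()}

for month, count in symbols_per_month.items():
    print(f"{month:<9}{'[TV] ' * count}")
```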

Example 4: The owner of a chain of halo-halo stores would like to study the effect of atmospheric temperature on
sales during the summer season. A random sample of 12 days is selected, with the results given as follows:

Day                  1    2    3    4    5    6    7    8    9   10   11   12
Temperature (°F)    79   76   78   84   90   83   93   94   97   85   88   82
Total Sales        147  143  147  168  206  155  192  211  209  187  200  150

Construct a scatter plot.

Solution:

Step 1: Draw and label the x-axis (Temperature) and y-axis (Total Sales).
Step 2: Plot the point for each ordered pair (temperature, total sales) in the Cartesian coordinate system.

[Figure: Scatter Plot. x-axis: Temperature; y-axis: Total Sales]

1.7 Guidelines for Developing Good graphs/Charts

1. The graph/chart should include a title.
2. The scales for all axes should be included.
3. The scale on the y-axis should start at zero.
4. The graph/chart should not distort the data.
5. The x-axis and y-axis should be properly labeled.
6. The graph or chart should not contain unnecessary decorations.
7. The simplest possible graph/chart should be used for any data set.
