Objectives:
1.1 Introduction
When conducting statistical research, the researcher must gather data for the particular variable under investigation and then organize the data in a meaningful way. A frequency distribution is used to organize data.
After organizing the data in a frequency distribution, the researcher needs to present the data so that they can be understood easily. The most useful method of presenting data is to construct graphs and charts.
A frequency distribution is the organization of raw data in table form, using classes and frequencies. The data are grouped into categories, and the table shows the number of observations in each of the non-overlapping classes.
1. Categorical Frequency Distribution – used to organize categorical data in tabular form. Examples of variables to which this distribution applies are gender, business type, political affiliation, and others.
Steps:
1. Make a table
2. Tally the data and place the results in tally column
3. Count the tallies and place the results in the frequency column
4. Find the percentage of values in each class by using the formula
% = (f / N) × 100%
where f = frequency of the class and N = total number of observations.
Percentages are not normally a part of a frequency distribution, but they can be added since they are used in
certain types of graphical presentations, such as pie graphs.
5. Find the totals for the frequency column and the percentage column.
Subject: Elementary Statistics
Example:
1. Twenty-five army inductees were given a blood test to determine their blood type. The data set is
Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O, and AB.
These types will be used as the classes for the distribution.
Step 1 Make a table as shown.
A        B            C            D
Class    Tally        Frequency    Percent (%)
A
B
O
AB
Step 2 Tally the data and place the results in column B.
A        B            C            D
Class    Tally        Frequency    Percent (%)
A        IIII
B        IIII-II
O        IIII-IIII
AB       IIII
Step 3 Count the tallies and place the results in column C.
A        B            C            D
Class    Tally        Frequency    Percent (%)
A        IIII         5
B        IIII-II      7
O        IIII-IIII    9
AB       IIII         4
Step 4 Find the percentage of values in each class and place the results in column D.
A        B            C            D
Class    Tally        Frequency    Percent (%)
A        IIII         5            20
B        IIII-II      7            28
O        IIII-IIII    9            36
AB       IIII         4            16
Step 5 Find the totals for columns C (frequency) and D (percent). The completed table is shown.
A        B            C            D
Class    Tally        Frequency    Percent (%)
A        IIII         5            20 (= 5/25 × 100%)
B        IIII-II      7            28
O        IIII-IIII    9            36
AB       IIII         4            16
Total                 25           100
For the sample of 25 army-inductees, more people have type O blood than any other type.
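The five steps above can be sketched in a few lines of Python. This is only an illustration: the raw list of 25 blood types is an assumption rebuilt from the frequencies in the completed table (the order of the original results is not shown), and the function name `categorical_distribution` is hypothetical.

```python
from collections import Counter

# Assumed raw data: rebuilt from the frequencies in the completed table
# (A = 5, B = 7, O = 9, AB = 4). The original order is not shown.
blood_types = ["A"] * 5 + ["B"] * 7 + ["O"] * 9 + ["AB"] * 4


def categorical_distribution(data):
    """Return {class: (frequency, percent)} with percent = f / N * 100."""
    counts = Counter(data)          # Steps 2 and 3: tally and count
    n = len(data)                   # N = total number of observations
    return {cls: (f, f * 100 / n) for cls, f in counts.items()}  # Step 4


distribution = categorical_distribution(blood_types)

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for cls, (f, pct) in distribution.items():
    print(f"{cls:<6}{f:>10}{pct:>9.0f}")

# Step 5: totals for the frequency and percent columns
total_f = sum(f for f, _ in distribution.values())
total_pct = sum(pct for _, pct in distribution.values())
print(f"{'Total':<6}{total_f:>10}{total_pct:>9.0f}")
```

The class with the largest frequency can then be read off with `max(distribution, key=lambda c: distribution[c][0])`, which matches the conclusion stated for type O.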
Exercise:
1. Twenty applicants were given a performance evaluation appraisal. The data set is:
High High High Low Average
2. A survey taken at a hotel in Bohol indicated that 40 guests preferred the following means of transportation:
Car Car Bus Plane Train Bus Bus Plane Car Plane
Bus Plane Car Car Train Train Car Car Plane Plane
Plane Car Bus Car Bus Car Plane Car Plane Plane
Car Car Bus Train Car Bus Car Car Car Car
2. Ungrouped Frequency Distribution – applicable to numerical data with fewer than 30 observations.
Note: Percentages are not normally part of a frequency distribution, but they can be added since they are
used in certain types of graphical presentations, such as pie graphs.
Example: The heights (inches) of commonly grown herbs are shown below. Construct the FDT and think of a way
the results would be useful.
18 20 18 18 24 10 24
12 20 36 14 20 18
18 16 20 7 16 15
Solution:
Conclusion: 26.32% of the commonly grown herbs (5 out of 19) are 18 inches tall.
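As a rough check of this conclusion, the ungrouped frequency distribution of the 19 herb heights can be tabulated as follows. The function name `ungrouped_distribution` is hypothetical; the data are the heights listed in the example.

```python
from collections import Counter

# Heights (inches) of commonly grown herbs, as listed in the example
heights = [18, 20, 18, 18, 24, 10, 24,
           12, 20, 36, 14, 20, 18,
           18, 16, 20, 7, 16, 15]


def ungrouped_distribution(data):
    """Return rows (value, frequency, percent), sorted by value."""
    counts = Counter(data)
    n = len(data)
    return [(value, counts[value], counts[value] * 100 / n)
            for value in sorted(counts)]


table = ungrouped_distribution(heights)

print(f"{'Height':>6}{'Frequency':>11}{'Percent':>9}")
for value, f, pct in table:
    print(f"{value:>6}{f:>11}{pct:>9.2f}")
```

Each distinct height is its own class here, which is why this form suits small numerical data sets.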
Supplementary Exercises: The heights of 20 young rambutan trees, which are to be transplanted by a tree nursery aide, are 18, 20, 36, 30, 16, 14, 16, 20, 15, 16, 14, 17, 16, 21, 24, 22, 30, 35, 26, and 16 inches.
3. Grouped Frequency Distribution – used when the data set is large (n ≥ 30); the data are grouped into numerical classes.
Several things to be noted:
30 – upper class limit; it represents the largest data value that can be included in the class.
CLASS BOUNDARIES – the numbers used to separate the classes so that there are no gaps in the frequency
distribution.
Basic Rule: The class limits should have the same decimal place value as the data, but the class boundaries should have one additional place value and end in a 5.
For example, if the values in the data set are whole numbers, such as 24, 32, and 18, the limits for a class might be 31-37, and the boundaries are 30.5-37.5. Here 30.5 is the lower boundary and 37.5 is the upper boundary.
If the data are in tenths, such as 6.2, 7.8, and 12.6, the limits for a class might hypothetically be 7.8-8.8, and the boundaries for that class would be 7.75-8.85. Find these by subtracting 0.05 from 7.8 and adding 0.05 to 8.8.
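The basic rule can be sketched as a small helper. This is a minimal sketch under one assumption: the precision of the data is inferred from how the class limits are written (whole numbers, tenths, and so on). The function name `class_boundaries` is hypothetical, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal


def class_boundaries(lower_limit, upper_limit):
    """Return (lower boundary, upper boundary) for one class.

    Half of one unit of the data's precision is subtracted from the
    lower limit and added to the upper limit.
    Assumption: the precision is read from the limits as written.
    """
    lower = Decimal(str(lower_limit))
    upper = Decimal(str(upper_limit))
    # Exponent 0 means whole numbers, -1 means tenths, and so on.
    exponent = min(lower.as_tuple().exponent, upper.as_tuple().exponent)
    unit = Decimal(1).scaleb(exponent)   # 1, 0.1, 0.01, ...
    half = unit / 2                      # 0.5, 0.05, 0.005, ...
    return lower - half, upper + half


print(class_boundaries(31, 37))          # whole-number data
print(class_boundaries("7.8", "8.8"))    # data in tenths
```

Passing the limits as strings (for example `"8.0"`) keeps trailing zeros, so the precision is not lost when a limit happens to be a round number.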
The class width for a class in a frequency distribution is found by subtracting the lower (or upper) class
limit of one class from the lower (or upper) class limit of the next class.
1. Arrange the raw data in ascending or descending order (optional). This makes the data easier to tally.
2. Determine the number of classes.
   Find the highest and lowest values.
   Find the range: R = highest value – lowest value.
   Determine the number of classes (k): k = 1 + 3.322 log N, where N is the number of observations and log is the common (base-10) logarithm (Sturges' approximation formula).
   Determine the class interval (or width).
   Note: Round the value of the interval up to the nearest whole number if there is a remainder.
Generally, the number of classes for a frequency distribution table varies from 5 to 20, depending primarily on the number of observations in the data set. It is preferable to have more classes as the size of the data set increases. The decision about the number of classes depends on the method used by the researcher.
Rule 1: One way to determine the number of classes is to use the smallest positive integer k such that 2^k ≥ n, where n is the total number of observations. The ideal class interval is then given by the formula
i = Range / Number of Classes = (HV − LV) / k
where HV is the highest value and LV is the lowest value.
Rule 2: Another way to determine the class interval is to use the following formula:
i = Range / (1 + 3.322 log N)
Rule 3: Another guideline to determine the class interval is to have an ideal number of classes, then apply the formula below:
i = (HV − LV) / Number of Classes
   Select the starting point (usually the lowest value or any convenient number less than the lowest value).
   Select the individual class limits.
      • Add the interval (or width) to the lowest score, taken as the starting point, to obtain the lower limit of the next class. Keep adding until the computed number of classes is obtained.
      • To obtain the upper class limits, subtract one unit from the lower limit of the second class to get the upper limit of the first class. Then add the interval to each upper limit to obtain all the remaining upper limits.
   Set the class boundaries of each class.
      • To obtain the class boundaries for whole-number data, subtract 0.5 from each lower class limit and add 0.5 to each upper class limit.
3. Tally the raw data.
4. Convert the tallied data into numerical frequencies.
5. Determine the relative frequency. It can be found by dividing each frequency by the total frequency.
6. Determine the percentage. It can be found by multiplying each relative frequency by 100%.
7. Determine the cumulative frequencies. The cumulative frequency of a class can be found by adding its frequency to the total of the frequencies of the classes preceding it.
8. Determine the midpoints. The midpoint of a class can be found by averaging its lower limit and its upper limit.
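The choice of the number of classes and the class width in step 2 can be sketched as below. The function names are hypothetical, and the sample values (50 observations, highest value 77, lowest value 18) are borrowed from the travel-tour example in this handout. Rounding k to the nearest whole number is an assumption that follows that example.

```python
import math


def sturges_classes(n):
    """k = 1 + 3.322 log10(n), rounded to the nearest whole number."""
    return round(1 + 3.322 * math.log10(n))


def two_to_k_classes(n):
    """Smallest positive integer k such that 2**k >= n (Rule 1)."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def class_width(highest, lowest, k):
    """i = (HV - LV) / k, rounded up to the next whole number."""
    return math.ceil((highest - lowest) / k)


n, highest, lowest = 50, 77, 18

k_sturges = sturges_classes(n)
k_rule1 = two_to_k_classes(n)
print("Sturges:", k_sturges, "classes, width", class_width(highest, lowest, k_sturges))
print("2^k rule:", k_rule1, "classes, width", class_width(highest, lowest, k_rule1))
```

The two rules need not agree on k, which is consistent with the remark that the number of classes depends on the method chosen by the researcher.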
Example:
1. Panglao Island Travel and Tours, one of the DOT-accredited travel and tour operators under the new normal, offers special rates during the summer period. The owner wants additional information on the ages of the people taking travel tours. A random sample of 50 customers who took travel tours last summer revealed these ages. Construct a grouped FDT using Rule 2 and interpret the results.
18 29 42 57 61 67 37 49 53 47
24 34 45 58 63 70 39 51 54 48
28 36 46 60 66 77 40 52 56 49
19 31 44 58 62 68 38 50 54 48
27 36 46 59 64 74 39 51 55 48
Solution:
Step 1: Find the range.
R = 77 – 18 = 59
Step 2: Find the number of classes.
N = 50
k = 1 + 3.322 log N = 1 + 3.322 log 50 = 6.64 ≈ 7 (rounded to the nearest whole number)
Step 3: Find the class interval or width (i) by dividing the range by the number of classes and rounding up.
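The arithmetic for Step 3 is not written out, so it is sketched here from the values found above (R = 59, k = 7). The result agrees with the width i = 9 used in Step 4.

```latex
i = \frac{R}{k} = \frac{59}{7} \approx 8.43
\quad\Longrightarrow\quad
i = 9 \ \text{(rounded up to the next whole number)}
```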
Step 4: Select the starting point (usually the lowest value or any convenient number less than the lowest value);
add the width to get the lower limits.
Class Limits
18
27 (add i=9)
36
45
54
63
72
Step 5: Find the upper-class limits.
Class Limits
18-26 (subtract 1 from the second lower limit: 27 – 1 = 26)
27-35 (add i = 9 to the previous upper limit: 26 + 9 = 35)
36-44
45-53
54-62
63-71
72-80
Step 6: Find the class boundaries.
Class Limits    Class Boundaries
18-26           17.5 ≤ x < 26.5
27-35           26.5 ≤ x < 35.5
36-44           35.5 ≤ x < 44.5
45-53           44.5 ≤ x < 53.5
54-62           53.5 ≤ x < 62.5
63-71           62.5 ≤ x < 71.5
72-80           71.5 ≤ x < 80.5
Step 10: Find the cumulative frequencies.
Class Limits    Class Boundaries    Midpoint    Tally             Frequency    Cumulative Frequency
18-26           17.5 ≤ x < 26.5     22          III               3            3
27-35           26.5 ≤ x < 35.5     31          IIII              5            8
36-44           35.5 ≤ x < 44.5     40          IIII-IIII         9            17
45-53           44.5 ≤ x < 53.5     49          IIII-IIII-IIII    14           31
54-62           53.5 ≤ x < 62.5     58          IIII-IIII-I       11           42
63-71           62.5 ≤ x < 71.5     67          IIII-I            6            48
72-80           71.5 ≤ x < 80.5     76          II                2            50
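The whole table can be reproduced from the 50 ages as a check. This is a minimal sketch for whole-number data, using the starting point 18, width 9, and 7 classes from the solution above. The function name `grouped_fdt` and the dictionary keys are hypothetical.

```python
# Ages of the 50 sampled customers, as listed in the example
ages = [18, 29, 42, 57, 61, 67, 37, 49, 53, 47,
        24, 34, 45, 58, 63, 70, 39, 51, 54, 48,
        28, 36, 46, 60, 66, 77, 40, 52, 56, 49,
        19, 31, 44, 58, 62, 68, 38, 50, 54, 48,
        27, 36, 46, 59, 64, 74, 39, 51, 55, 48]


def grouped_fdt(data, start, width, k):
    """Build a grouped frequency distribution for whole-number data."""
    rows = []
    cumulative = 0
    for j in range(k):
        lower = start + j * width
        upper = lower + width - 1            # one unit below the next lower limit
        freq = sum(lower <= x <= upper for x in data)
        cumulative += freq
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "relative": freq / len(data),
            "cumulative": cumulative,
        })
    return rows


table = grouped_fdt(ages, start=18, width=9, k=7)

for row in table:
    lo, hi = row["limits"]
    print(f"{lo}-{hi}  midpoint {row['midpoint']:>4.0f}  "
          f"f = {row['frequency']:>2}  "
          f"{row['relative'] * 100:>4.0f}%  cf = {row['cumulative']:>2}")
```

Read this way, the class 45-53 holds the most customers (14 of 50, or 28%), and 31 of the 50 customers are 53 or younger.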
1.6 Graphing Frequency Distribution
1. Histogram – a graph that displays the data by using vertical bars of various heights that are joined
together to represent the frequencies of the classes
Steps:
1. Draw and label the x and y axes. The x axis is always the horizontal axis, and the y axis is
always the vertical axis.
2. Represent the frequency on the y axis and the class boundaries on the x axis.
3. Using the frequencies as the heights, draw the vertical bars for each class.
Example:
Construct a histogram to represent the data shown below for the record of ages of 50 customers taking travel
tours.
Class Boundaries    Frequency
17.5-26.5           3
26.5-35.5           5
35.5-44.5           9
44.5-53.5           14
53.5-62.5           11
62.5-71.5           6
71.5-80.5           2
Total               N = 50
[Figure: Histogram. x-axis: Ages (class boundaries); y-axis: Frequency]
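The steps for the histogram can be sketched without a plotting library: each bar is described by its left edge, right edge, and height, and a rough text rendering stands in for the drawing. The function name `histogram_bars` is hypothetical; the boundaries and frequencies are those in the table above.

```python
boundaries = [17.5, 26.5, 35.5, 44.5, 53.5, 62.5, 71.5, 80.5]
frequencies = [3, 5, 9, 14, 11, 6, 2]


def histogram_bars(boundaries, frequencies):
    """Return (left edge, right edge, height) for each class.

    Adjacent bars share an edge, so the bars are joined together.
    """
    return [(boundaries[j], boundaries[j + 1], f)
            for j, f in enumerate(frequencies)]


bars = histogram_bars(boundaries, frequencies)

# Text rendering (bars drawn sideways): one '#' per observation
for left, right, height in bars:
    print(f"{left:>5}-{right:<5} | {'#' * height} {height}")
```

These triples are the values a plotting tool would need in order to draw the bars at the class boundaries.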
2. Frequency Polygon – is a graph that displays the data by using lines that connect points plotted for the
frequencies at the midpoints of the classes. The frequencies are represented by the height of the
points.
Steps:
1. Find the midpoints of each class.
2. Draw the x and y axes.
3. Use the midpoints for the x values and the frequencies for the y values.
4. Connect adjacent points with line segments.
Example:
Construct a frequency polygon to represent the data shown below for the record of ages of 50 customers
taking travel tours.
Midpoints frequency
22 3
31 5
40 9
49 14
58 11
67 6
76 2
[Figure: Frequency Polygon. x-axis: Ages (midpoints) 22, 31, 40, 49, 58, 67, 76; y-axis: Frequency]
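The points of the frequency polygon can be computed directly from the class limits and frequencies. This is a small sketch, with the hypothetical function name `polygon_points`; the class limits come from the grouped distribution of the 50 ages.

```python
class_limits = [(18, 26), (27, 35), (36, 44), (45, 53),
                (54, 62), (63, 71), (72, 80)]
frequencies = [3, 5, 9, 14, 11, 6, 2]


def polygon_points(class_limits, frequencies):
    """Return (midpoint, frequency) pairs, one point per class."""
    return [((lower + upper) / 2, f)
            for (lower, upper), f in zip(class_limits, frequencies)]


points = polygon_points(class_limits, frequencies)

# Adjacent points are joined by line segments
for (x1, y1), (x2, y2) in zip(points, points[1:]):
    print(f"({x1:.0f}, {y1}) -> ({x2:.0f}, {y2})")
```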
3. Cumulative Frequency Graph (Ogive) – a graph that displays the cumulative frequencies for the
classes in a frequency distribution. The vertical axis represents the cumulative frequency of the
distribution while the horizontal axis represents the upper-class boundaries (real upper limits) of the
frequency distribution.
Steps:
1. Find the cumulative frequency for each class.
2. Draw the x and y axes.
3. Represent the cumulative frequency on the y axis and the upper class boundaries on the x axis.
4. Connect adjacent points with line segments.
Example:
Construct a cumulative frequency graph or ogive to represent the data shown below for the record of ages
of 50 customers taking travel tours.
[Figure: Cumulative Frequency Graph (Ogive). x-axis: Ages (upper class boundaries) 26.5 to 80.5; y-axis: Cumulative frequency]
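The points plotted in the ogive pair each upper class boundary with the running total of the frequencies. A minimal sketch follows, with the hypothetical function name `ogive_points` and the values taken from the grouped distribution of the 50 ages.

```python
from itertools import accumulate

upper_boundaries = [26.5, 35.5, 44.5, 53.5, 62.5, 71.5, 80.5]
frequencies = [3, 5, 9, 14, 11, 6, 2]


def ogive_points(upper_boundaries, frequencies):
    """Return (upper class boundary, cumulative frequency) pairs."""
    return list(zip(upper_boundaries, accumulate(frequencies)))


points = ogive_points(upper_boundaries, frequencies)

for boundary, cumulative in points:
    print(f"ages below {boundary}: {cumulative}")
```

The last point always reaches the total number of observations, here 50.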
a. Pareto Chart – used to represent a frequency distribution for categorical (nominal-level) data. The frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to lowest.
b. Bar graph – represents data by areas in the form of vertical rectangles or bars. It is used when the quantities are independent of each other.
c. Pie graph – also known as the circle graph. The presentation makes use of a circle to represent data that make up a whole.
d. Time Series Graph – represents data that occur over a specific period of time under observation. It shows a trend or pattern of increase or decrease over that period.
e. Pictograph or pictogram – picture symbols are used to illustrate or represent the data under consideration. Usually, in depicting population data, figures of persons are used.
f. Scatter Graph or Scatter Plot – a graph used to present measurements or values that are thought to be related.
Example 1: Using the information in the table below about the favorite snacks of freshmen college students, construct a Pareto chart, a bar chart, and a pie chart.
Products Sales
Cookies 120
Candies 150
Ice Cream 190
Chocolate 220
Others 80
Solution:
Step 1: Arrange the data from the highest to the lowest frequency.
Products Sales
Chocolate 220
Ice Cream 190
Candies 150
Cookies 120
Others 80
Step 2: Draw and label the x-axis (Products) and y-axis (Sales).
Step 3: Make a bar with the same width and draw the height corresponding to the frequencies.
[Figure: Pareto chart. x-axis: Products, ordered Chocolate, Ice Cream, Candies, Cookies, Others; y-axis: Sales]
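The ordering that distinguishes a Pareto chart from an ordinary bar chart can be sketched as a sort on the frequencies. The function name `pareto_order` is hypothetical, and the text bars (one `#` per 10 units of sales) are only a rough stand-in for the drawn chart.

```python
sales = {"Cookies": 120, "Candies": 150, "Ice Cream": 190,
         "Chocolate": 220, "Others": 80}


def pareto_order(frequencies):
    """Return (category, frequency) pairs from highest to lowest frequency."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)


ordered = pareto_order(sales)

for product, value in ordered:
    print(f"{product:<10} {'#' * (value // 10)} {value}")
```

Iterating over `sales.items()` directly, without sorting, gives the order used for the ordinary bar chart.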
Step 1: Draw and label the x-axis (Products) and y-axis (Sales).
Step 2: Make a bar with the same width and draw the height corresponding to the frequencies.
[Figure: Bar chart. x-axis: Products, in the original order Cookies, Candies, Ice Cream, Chocolate, Others; y-axis: Sales]
Step 1: Since there are 360° in a circle, the frequency of each class must be converted into a proportional part of the circle. This conversion is done by applying the formula
Degrees = (f / n) × 360°
where f = frequency of each class and n = sum of all frequencies.
Hence,
Cookies: (120 / 760) × 360° ≈ 57°
Step 2: Each frequency must also be converted to a percentage, and the percentages must total 100%. This conversion is done by applying the formula
Percentage = (f / n) × 100%
Hence,
Cookies: (120 / 760) × 100% ≈ 16%
[Figure: Pie chart of Products by share of Sales. Chocolate 29%, Ice Cream 25%, Candies 20%, Cookies 16%, Others 10%]
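Both pie chart conversions can be applied to every product at once. This is a minimal sketch with the hypothetical function name `pie_slices`; it prints the degrees and percentages to one decimal place, so the whole-number labels on a chart may differ slightly because of rounding.

```python
sales = {"Cookies": 120, "Candies": 150, "Ice Cream": 190,
         "Chocolate": 220, "Others": 80}


def pie_slices(frequencies):
    """Return {category: (degrees, percent)} for a pie chart.

    degrees = f / n * 360 and percent = f / n * 100,
    where n is the sum of all frequencies.
    """
    n = sum(frequencies.values())
    return {label: (f * 360 / n, f * 100 / n)
            for label, f in frequencies.items()}


slices = pie_slices(sales)

for label, (degrees, percent) in slices.items():
    print(f"{label:<10} {degrees:>6.1f} degrees {percent:>6.1f}%")
```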
Example 2: The data in the following table represent the number of professionals actively using credit card payments for shopping from 2011 to 2018.
Year                           2011   2012   2013   2014   2015   2016   2017   2018
Card Payments (in Millions)    10.3   13.4   14.0   16.7   18.5   20.8   24.0   27.0
Solution:
Step 1: Draw and label the x-axis (Year) and y-axis (Card Payments).
[Figure: Time series graph. x-axis: Year (2011 to 2018); y-axis: Card Payments (in millions)]
Example 3: The following table shows the number of televisions sold by a company for the months of January to May. Construct a pictograph for the table.
Months                   January   February   March   April   May
Number of Televisions    45        60         75      30      15
Solution:
Step 2: Label the x-axis for the months and the y-axis for the number of televisions.
[Figure: Pictograph. x-axis: Months (January, February, March, April, May); legend: one symbol = 15 televisions]
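The number of symbols to draw for each month follows from the legend: divide the number of televisions by 15. A small sketch is given below. The function name `pictograph_rows` and the `*` used as a stand-in symbol are assumptions, since the original picture symbol is not recoverable.

```python
televisions_sold = {"January": 45, "February": 60, "March": 75,
                    "April": 30, "May": 15}
UNIT = 15  # one symbol = 15 televisions (from the legend)


def pictograph_rows(values, unit, symbol="*"):
    """Return {label: row of symbols}, one symbol per full unit."""
    rows = {}
    for label, value in values.items():
        whole, remainder = divmod(value, unit)
        row = symbol * whole
        if remainder:                 # flag values that need a partial symbol
            row += " (+ part of a symbol)"
        rows[label] = row
    return rows


rows = pictograph_rows(televisions_sold, UNIT)

for month, row in rows.items():
    print(f"{month:<9} {row}")
```

All five monthly values are exact multiples of 15, so no partial symbols are needed for this table.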
Example 4. The owner of a chain of halo-halo stores would like to study the effect of atmospheric temperature on
sales during the summer season. A random sample of 12 days is selected with the results given as follows:
Day                 1     2     3     4     5     6     7     8     9     10    11    12
Temperature (°F)    79    76    78    84    90    83    93    94    97    85    88    82
Total Sales         147   143   147   168   206   155   192   211   209   187   200   150
Construct a scatter plot.
Solution:
Step 1: Draw and label the x-axis (Temperature) and y-axis (Total Sales).
Step 2: Plot the point for each ordered pair (Temperature, Total Sales) in the Cartesian coordinate system.
[Figure: Scatter Plot. x-axis: Temperature (0 to 120); y-axis: Total Sales (0 to 250)]