Module4-Data Organization Presentation
Data Organization
and Presentation
Objectives:
Introduction
In every research activity, the information gathered may result in large masses of data.
These data need to be organized and presented in such a manner that they can be easily
understood. Data sets are usually organized in tables and displayed through graphs.
Suppose you asked a sample of 20 persons “where in the Philippines would
they like to spend their summer vacation.” The responses of these respondents were recorded and
the results are as follows:
Boracay Baguio Palawan Bohol Boracay
CamSur Bohol Baguio Palawan Bohol
Palawan Bohol CamSur Palawan Boracay
Boracay Palawan CamSur Bohol Palawan
We may construct a frequency distribution table for these data. Note that the
variable in our activity, “It’s more fun in the Philippines”, is the different tourist
destinations, and is qualitative in nature. To construct a frequency distribution for qualitative
data, we simply list all categories and the number of responses that belong to each of the
categories.
The variable in the activity is classified into five categories: Baguio, Boracay, Bohol,
CamSur and Palawan. These categories are recorded in the first column of the frequency
distribution table. Each response in the given data is read, and a tally mark (|) is placed in the
second column beside its category. Finally, the total number of tallies for each category is recorded in the third
column of the table, called the column of frequencies and usually denoted by f. The sum of the
entries in the frequency column gives the sample size (n), or the total frequency.
The frequency distribution table for the data set on tourist destination is as follows:
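As an illustrative sketch (not part of the original module), the tallying procedure described above can be reproduced in Python with the standard library's `collections.Counter`. The variable names are assumptions made for this example; the responses are the 20 recorded above.

```python
from collections import Counter

# The 20 recorded responses, row by row.
responses = """
Boracay Baguio Palawan Bohol Boracay
CamSur Bohol Baguio Palawan Bohol
Palawan Bohol CamSur Palawan Boracay
Boracay Palawan CamSur Bohol Palawan
""".split()

# Counter plays the role of the tally column: one count per category.
frequency = Counter(responses)
n = sum(frequency.values())  # sample size = total frequency

# Print the categories (first column) and their frequencies f.
print(f"{'Destination':<12}{'f':>3}")
for category in sorted(frequency):
    print(f"{category:<12}{frequency[category]:>3}")
print(f"{'n':<12}{n:>3}")
```

Sorting the categories alphabetically is only a presentation choice; any fixed order of the five categories gives a valid table.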
Elemtary Statistics 2
The percentage for a category is obtained by multiplying the relative frequency of that category by 100.
In calculating the relative frequency and percentage distribution, we have:
Relative Frequency:
$$rf = \frac{f}{n}$$
where: rf – relative frequency
f – frequency for each category
n – total frequency or sample size
Percentage:
Percentage = rf × 100
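The two formulas can be checked on the tourist destination data with a short Python sketch. The frequencies below are the tallies of the 20 responses listed earlier; the dictionary and variable names are assumptions made for this illustration.

```python
# Frequencies tallied from the 20 tourist-destination responses.
frequency = {"Baguio": 2, "Bohol": 5, "Boracay": 4, "CamSur": 3, "Palawan": 6}
n = sum(frequency.values())  # total frequency (sample size)

# rf = f / n for each category, and percentage = rf * 100.
relative_frequency = {cat: f / n for cat, f in frequency.items()}
percentage = {cat: rf * 100 for cat, rf in relative_frequency.items()}

print(f"{'Category':<10}{'f':>3}{'rf':>7}{'%':>6}")
for cat, f in frequency.items():
    print(f"{cat:<10}{f:>3}{relative_frequency[cat]:>7.2f}{percentage[cat]:>6.0f}")
```

The relative frequencies always add up to 1 and the percentages to 100, which is a quick check on the arithmetic.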
Data may be read more easily if presented through graphs. Graphs give a
visual representation of the data, allowing us to communicate information about complicated
relationships among statistical data. This helps readers grasp the information more
effectively.
Some of the graphs that may be used to present qualitative data are:
1. Bar graph
A bar graph uses vertical or horizontal bars to compare the sizes of quantities. The
heights (or lengths) of the bars represent the frequencies of the respective categories.
[Figure: bar graph of frequency (vertical axis) by tourist destination (horizontal axis): Baguio, Bohol, Boracay, CamSur, Palawan]
2. Pie Graph
A pie graph is used to show the relationship of the parts to a whole. It is displayed by
a circle divided into portions that represent the relative frequencies or percentage of a
population or sample that belongs to different categories.
[Figure: pie graph titled “Tourist Destination”: Baguio 10%, Bohol 25%, Boracay 20%, CamSur 15%, Palawan 30%]
To construct a pie graph, we first determine the number of degrees that represents the
fractional part or percent of each category. Take note that a circle contains 360
degrees. This means that we multiply the relative frequency of each category (its percent written as a decimal) by 360 degrees to
get the angle size of its sector in the pie graph.
Example:
Tourist Destination   f    rf     Angle size of sector (degrees)
Baguio                2    0.10   360(0.10) = 36
Bohol                 5    0.25   360(0.25) = 90
Boracay               4    0.20   360(0.20) = 72
CamSur                3    0.15   360(0.15) = 54
Palawan               6    0.30   360(0.30) = 108
Total                 n = 20
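The angle computation in the example can be sketched in Python as follows. This is only an illustration with assumed variable names; it multiplies 360 by f/n directly, which is the same as multiplying 360 by the relative frequency.

```python
# Frequencies of the five tourist destinations (n = 20).
frequency = {"Baguio": 2, "Bohol": 5, "Boracay": 4, "CamSur": 3, "Palawan": 6}
n = sum(frequency.values())

# Angle of each sector: 360 degrees times the relative frequency f / n.
angles = {cat: 360 * f / n for cat, f in frequency.items()}

for cat, angle in angles.items():
    print(f"{cat:<10}{angle:>6.0f} degrees")

# The sectors together must fill the whole circle.
total = sum(angles.values())
print(f"{'Total':<10}{total:>6.0f} degrees")
```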
3. Line Graph
A line graph makes use of line segments to show changes in, and relationships between,
quantities.
Example: Figure 4. Average Age of the Total Population: 1980, 1990, 1995, 2000-2011,
2016, 2017, and 2040
Sources: 1/ Based on the 1980, 1990 and 2000 Census of Population and Housing (CPH) and 1995 Census of Population of NSO.
2/ Special computations made by the NSCB-Technical Staff (NSCB-TS) using the 2000 Census-based Population Projections of NSO.
Take Note: Bar graphs and line graphs may also be used for comparing quantities of two
or more data sets. Different styles or colors of bars and lines may be used to
distinguish one group from another.
Example: Gross Domestic Product and Gross National Income, at Constant Prices, 2000
to 2011
[Figure: line graph of GDP and GNI by year, 2000 to 2011; vertical axis from 0 to 9,000,000]
Dependency Ratio by Type in Percent
Census Years 1970, 1975, 1980, 1990, 1995, 2000, 2007 and 2010
TABULAR PRESENTATION OF QUANTITATIVE DATA
84 78 90 84 95 82 84 75 83 89
88 90 88 91 89 85 98 86 92 93
66 98 81 87 74 89 98 79 84 87
80 89 73 86 82 94 97 94 86 93
93 95 96 97 88 77 96 76 88 92
Literacy rate, a quantitative variable, may be organized using a stem and leaf display
or frequency distribution table.
STEM-and-LEAF DISPLAY
1. Split each value into two parts. The first part is the first digit, which is called the
stem. The second part will be the second digit, which is called the leaf.
2. Draw a vertical line and write the stems on the left side of it arranged in ascending
order.
3. After listing the stems, read the leaves for all values and record them next to the
corresponding stems on the right side of the vertical line.
Example: For the given data, the first two values are 84 and 78. Thus, leaf 4 is recorded next to stem 8, and leaf 8 next to stem 7.
The resulting stem-and-leaf display of the given data is:
6 | 6
7 | 8 5 4 9 3 7 6
8 | 4 4 2 4 3 9 8 8 9 5 6 1 7 9 4 7 0 9 6 2 6 8 8
9 | 0 5 0 1 8 2 3 8 8 4 7 4 3 3 5 6 7 6 2
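The three steps for building a stem-and-leaf display can be sketched in Python as below. This is a minimal illustration for two-digit values only, with assumed variable names; the data are the 50 values tabulated above, read row by row.

```python
from collections import defaultdict

# The 50 values of the data set, in reading order.
data = [
    84, 78, 90, 84, 95, 82, 84, 75, 83, 89,
    88, 90, 88, 91, 89, 85, 98, 86, 92, 93,
    66, 98, 81, 87, 74, 89, 98, 79, 84, 87,
    80, 89, 73, 86, 82, 94, 97, 94, 86, 93,
    93, 95, 96, 97, 88, 77, 96, 76, 88, 92,
]

# Step 1: split each two-digit value into a stem (tens digit)
# and a leaf (ones digit).
display = defaultdict(list)
for value in data:
    stem, leaf = divmod(value, 10)
    display[stem].append(leaf)  # leaves are kept in reading order

# Steps 2 and 3: list the stems in ascending order to the left of the
# vertical line and the leaves to the right of it.
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Because every value contributes exactly one leaf, the number of leaves in the display should equal the number of values (50 here), which is a useful check against skipped entries.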
A frequency distribution for quantitative data lists all the classes and the number of
values belonging to each class. Data presented in this form are called grouped data.
To construct a frequency distribution table for quantitative data, we have the following steps:
1. Find the range of the data set. The range (R) is given by the difference between the highest
(H) and lowest (L) data entries. So, for our given data set we have:
R = H – L = 98 – 66 = 32
2. Determine the number of classes, also known as the number of class intervals (c). Note that
these classes represent a variable. One rule to help us decide on the number of classes is
Sturges' formula, given by:
c = 1 + 3.322 log n
where log is the common (base 10) logarithm and n is the number of values in the data set.
3. Find the class size (i), also known as the class width, of the data set. Divide the range (R) by the
number of classes (c) and round up to find the class size. Thus, we have
i = R / c
6. The number of tally marks for a class interval is the frequency for that class. The frequency
distribution for the given data is shown below.
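The construction steps can be sketched in Python as follows. Two points are assumptions of this illustration rather than statements from the module: the number of classes from Sturges' formula is rounded up to a whole number, and the first class is taken to start at 65, a choice that is consistent with the values 67, 72, ..., 97 marked on the histogram later in this module. All variable names are hypothetical.

```python
import math

data = [
    84, 78, 90, 84, 95, 82, 84, 75, 83, 89,
    88, 90, 88, 91, 89, 85, 98, 86, 92, 93,
    66, 98, 81, 87, 74, 89, 98, 79, 84, 87,
    80, 89, 73, 86, 82, 94, 97, 94, 86, 93,
    93, 95, 96, 97, 88, 77, 96, 76, 88, 92,
]

# Step 1: range R = H - L.
R = max(data) - min(data)

# Step 2: Sturges' formula with the base 10 logarithm.
# ASSUMPTION: the result is rounded up to a whole number of classes.
n = len(data)
c = math.ceil(1 + 3.322 * math.log10(n))

# Step 3: class size i = R / c, rounded up.
i = math.ceil(R / c)

# ASSUMPTION: the lower limit of the first class is 65.
first_lower_limit = 65
classes = [
    (first_lower_limit + k * i, first_lower_limit + (k + 1) * i - 1)
    for k in range(c)
]

# Step 6: the frequency of a class is the number of values that fall
# between its lower and upper limits (inclusive).
frequencies = [sum(ll <= x <= ul for x in data) for ll, ul in classes]

print(f"R = {R}, c = {c}, i = {i}")
for (ll, ul), f in zip(classes, frequencies):
    print(f"{ll}-{ul}: {f}")
```

A different starting limit would shift every class and change the frequencies, so the starting point should be chosen so that the lowest and highest values are both covered.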
After constructing a frequency distribution such as above, there are several additional
features that we may include to help better understand the data.
1. Classmark (xm)
The classmark (xm), sometimes called the midpoint of the class interval, is the sum of the
lower limit (LL) and upper limit (UL) of the class interval divided by two.
Thus,
$$x_m = \frac{UL + LL}{2}$$
2. Class Boundaries
The class boundary is given by the midpoint of the upper limit of one class and the
lower limit of the next class. The class boundaries are the real limits of the class intervals.
Given below are the classmark and class boundaries of our data in Table 2
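As a hedged illustration, the classmarks and class boundaries can be computed in Python. The class limits used here (seven classes of size 5 starting at 65) are an assumption of this sketch, chosen to cover the data from 66 to 98; the names are hypothetical.

```python
# ASSUMPTION: seven classes of size 5, with the first class starting at 65.
classes = [(65, 69), (70, 74), (75, 79), (80, 84), (85, 89), (90, 94), (95, 99)]

# Classmark: (lower limit + upper limit) / 2.
classmarks = [(ll + ul) / 2 for ll, ul in classes]

# A class boundary lies midway between the upper limit of one class and
# the lower limit of the next, so each limit moves out by half the gap.
gap = classes[1][0] - classes[0][1]  # 70 - 69 = 1
boundaries = [(ll - gap / 2, ul + gap / 2) for ll, ul in classes]

print(f"{'Class':<8}{'xm':>6}{'Boundaries':>14}")
for (ll, ul), xm, (lb, ub) in zip(classes, classmarks, boundaries):
    print(f"{ll}-{ul:<5}{xm:>6.0f}{lb:>8.1f}-{ub:.1f}")
```

Each upper boundary coincides with the lower boundary of the next class, which is why the boundaries are described as the real limits of the class intervals.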
Take Note: We may distort or lose some information when we group the
raw data into classes. It is advised that we construct the frequency distribution table
carefully.
3. Relative Frequency
The relative frequency (rf) of a class interval is its frequency (f) divided by the total frequency (n):
$$rf = \frac{f}{n}$$
4. Cumulative Frequency
The cumulative frequency of a class interval is the sum of the frequency for the given
class and all previous classes. Cumulating the frequencies may be done by adding each
frequency starting from the lowest class interval, which gives the less than cumulative frequency (<cf). It
may also start from the highest class interval, which gives the greater than cumulative frequency (>cf).
5. Percentage
The percentage distribution of the class intervals lists the percentage for each class,
obtained by multiplying the relative frequency of the class interval by 100.
Percentage = relative frequency × 100
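The relative frequency, cumulative frequency, and percentage columns described above can be computed together, as in the Python sketch below. The class frequencies are an assumption of this illustration: they come from grouping the 50 values of the data set into classes 65-69, 70-74, ..., 95-99. The variable names are hypothetical.

```python
from itertools import accumulate

# ASSUMPTION: frequencies obtained by grouping the 50 values into
# seven classes of size 5 starting at 65.
classes = ["65-69", "70-74", "75-79", "80-84", "85-89", "90-94", "95-99"]
frequencies = [1, 2, 5, 9, 14, 10, 9]
n = sum(frequencies)

# Relative frequency rf = f / n and percentage = rf * 100.
relative = [f / n for f in frequencies]
percent = [rf * 100 for rf in relative]

# Less than cf: running total from the lowest class upward.
less_than_cf = list(accumulate(frequencies))
# Greater than cf: running total from the highest class downward.
greater_than_cf = list(accumulate(frequencies[::-1]))[::-1]

print(f"{'Class':<8}{'f':>4}{'rf':>7}{'%':>6}{'<cf':>6}{'>cf':>6}")
rows = zip(classes, frequencies, relative, percent, less_than_cf, greater_than_cf)
for label, f, rf, pct, lcf, gcf in rows:
    print(f"{label:<8}{f:>4}{rf:>7.2f}{pct:>6.0f}{lcf:>6}{gcf:>6}")
```

The last entry of the less than column and the first entry of the greater than column both equal n, which confirms that no class was skipped.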
GRAPHICAL PRESENTATION OF QUANTITATIVE DATA
Pictures convey the message more effectively than columns of numbers. It is
easier to identify patterns in a data set through a visual presentation of a frequency table.
Visual models, such as graphs, provide a better understanding of a data set.
Recall that for qualitative data, we may present the data set using a bar graph, line
graph, pictograph or pie graph. To show the information obtained from a frequency table of
quantitative data, we may use a histogram or a frequency polygon.
Histogram
Take Note: There are variants of histogram such as relative frequency histogram
or percentage histogram. The difference depends on whether the
relative frequencies or percentages are marked on the vertical axis.
[Figure: histogram of the grouped data; vertical axis ticks from 4 to 14; horizontal axis marked at 67, 72, 77, 82, 87, 92, 97]
Polygon
Another way of presenting quantitative data in graphical form is by constructing
polygons. This graph is formed by joining the midpoints of the tops of successive bars in a
histogram with straight lines. It emphasizes the continuous change in frequencies.
Take Note: Variants of the polygon are the frequency polygon, with frequencies marked on the
vertical axis, and the relative frequency polygon, with relative frequencies
marked on the vertical axis. Similarly, a percentage polygon has percentages
marked on the vertical axis.
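[Figure: frequency polygon of the grouped data; vertical axis ticks from 4 to 14; horizontal axis marked at 67, 72, 77, 82, 87, 92, 97]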
Types of ogives
1. Less than ogive – the upper class boundaries are marked on the horizontal axis
and the less than cumulative frequencies are marked on the vertical axis.
2. Greater than ogive – the lower class boundaries are marked on the horizontal
axis and the greater than cumulative frequencies are marked on the vertical axis.
How to construct an ogive
1. Construct a cumulative frequency distribution.
2. Specify the horizontal and vertical scales of the graph. The horizontal axis shows
the class boundaries and the vertical axis shows the cumulative frequencies.
3. Plot the points that represent the specified class boundaries and their
corresponding cumulative frequencies.
4. Connect the points on the graph.
5. Close each graph with broken lines on both ends.
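Steps 1 to 3 above amount to pairing class boundaries with cumulative frequencies. The Python sketch below computes the points to plot for both types of ogive; it does not draw the graph. The class boundaries and frequencies are assumptions of this illustration (seven classes of size 5 starting at 65 for the 50 values of the data set), and the names are hypothetical.

```python
from itertools import accumulate

# ASSUMPTION: class boundaries and frequencies for classes 65-69, ..., 95-99.
boundaries = [64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5, 99.5]
frequencies = [1, 2, 5, 9, 14, 10, 9]

# Step 1: cumulative frequency distributions.
less_than_cf = list(accumulate(frequencies))
greater_than_cf = list(accumulate(frequencies[::-1]))[::-1]

# Steps 2 and 3: less than ogive pairs each UPPER boundary with <cf;
# greater than ogive pairs each LOWER boundary with >cf.
less_than_points = list(zip(boundaries[1:], less_than_cf))
greater_than_points = list(zip(boundaries[:-1], greater_than_cf))

print("Less than ogive:   ", less_than_points)
print("Greater than ogive:", greater_than_points)
```

Connecting each list of points in order (step 4) gives a rising curve for the less than ogive and a falling curve for the greater than ogive.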
[Figures: less than ogive and greater than ogive; vertical axes marked from 14 to 56 in steps of 7]