
Module 4

Data Organization and Presentation
Objectives:

At the end of the lesson, the students are expected to:

1. prepare a stem-and-leaf plot;
2. construct a frequency distribution table;
3. create graphs for qualitative and quantitative data;
4. read and interpret graphs and tables; and
5. perform simple analysis of data.

Introduction

In every research activity, the information gathered may result in large masses of data.
These data need to be organized and presented in such a manner that they can be easily
understood. Data sets are usually organized in tables and displayed through graphs.

TABULAR PRESENTATION OF QUALITATIVE DATA

Suppose you asked a sample of 20 students “where in the Philippines would you
like to spend your summer vacation?” The responses of these students were recorded, and the
results are as follows:
Boracay Baguio Palawan Bohol Boracay
CamSur Bohol Baguio Palawan Bohol
Palawan Bohol CamSur Palawan Boracay
Boracay Palawan CamSur Bohol Palawan

We may construct a frequency distribution table for these data. Note that the variable
in our activity, “It’s more fun in the Philippines”, is the preferred tourist destination, which is
qualitative in nature. To construct a frequency distribution for qualitative data, we simply list
all categories and the number of responses that belong to each of the categories.

The variable in the activity is classified into five categories: Baguio, Bohol, Boracay,
CamSur and Palawan. These categories are recorded in the first column of the frequency
distribution table. Each response in the data set is read, and a tally mark (|) is recorded in the
second column beside its category. Finally, the total number of tallies for each category is
recorded in the third column of the table, called the column of frequencies, usually denoted
by f. The sum of the entries in the frequency column gives the sample size (n) or the total
frequency.
The frequency distribution table for the data set on tourist destination is as follows:

Table 1. Frequency Distribution of Preferred Tourist Destination

Tourist Destination    Tally       Frequency (f)
Baguio                 ||          2
Bohol                  |||||       5
Boracay                ||||        4
CamSur                 |||         3
Palawan                ||||| |     6
                                   n = 20

Additional information may be derived from the data set, such as the relative
frequency and percentage distribution. The relative frequency of a category is the fraction
or proportion of the total frequency that belongs to that category. The percentage for a
category, on the other hand, is determined by multiplying the relative frequency of that
category by 100.

Business Statistics

In calculating relative frequency and percentage distribution, we have,

Relative Frequency:

$$rf = \frac{f}{n}$$

where: rf – relative frequency
       f – frequency for each category
       n – total frequency or sample size

Percentage:

$$\text{Percentage} = rf \times 100$$

The relative frequency and percentage distribution of the data are given below:

Relative Frequency and Percentage Distributions of Preferred Tourist Destination

Tourist Destination    Frequency (f)    Relative Frequency    Percentage
Baguio                 2                2/20 = 0.10           10.0
Bohol                  5                5/20 = 0.25           25.0
Boracay                4                4/20 = 0.20           20.0
CamSur                 3                3/20 = 0.15           15.0
Palawan                6                6/20 = 0.30           30.0
                       n = 20           Σ = 1.00              Σ = 100
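
The tallying and the two formulas above can be checked with a short script. This is a minimal sketch, not part of the module: the function name `frequency_table` and the output layout are illustrative assumptions, and the responses are copied from the survey data above.

```python
from collections import Counter

# Responses copied from the survey data in this module.
responses = """
Boracay Baguio Palawan Bohol Boracay
CamSur Bohol Baguio Palawan Bohol
Palawan Bohol CamSur Palawan Boracay
Boracay Palawan CamSur Bohol Palawan
""".split()


def frequency_table(values):
    """Return rows of (category, f, rf, percentage), sorted by category name."""
    counts = Counter(values)          # tally each category
    n = len(values)                   # sample size (total frequency)
    return [(cat, f, f / n, 100 * f / n) for cat, f in sorted(counts.items())]


table = frequency_table(responses)
for category, f, rf, pct in table:
    print(f"{category:<10}{f:>3}{rf:>8.2f}{pct:>8.1f}")
print(f"n = {len(responses)}")
```

The relative frequencies should add up to 1 and the percentages to 100, which is a quick way to catch a tallying mistake.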

GRAPHICAL PRESENTATION of QUALITATIVE DATA

Data may easily be read if presented or displayed through graphs. Graphs give a
visual representation and thus help communicate information about complicated
relationships among statistical data. This helps the readers to grasp information more
effectively.

Some of the graphs that may be used to present qualitative data are:
1. Bar graph
A bar graph uses vertical or horizontal bars to compare sizes of quantities. The
heights of the bars represent the frequencies of the respective categories.

Example: Bar Graph of Preferred Tourist Destination

[Figure: bar graph; vertical axis: Frequency (0 to 8); horizontal axis: Baguio, Bohol, Boracay, CamSur, Palawan]


2. Pie Graph

A pie graph is used to show the relationship of the parts to a whole. It is displayed by
a circle divided into portions that represent the relative frequencies or percentage of a
population or sample that belongs to different categories.

Example: Pie Graph of Preferred Tourist Destination

[Figure: pie graph titled "Tourist Destination"; sectors: Baguio 10%, Bohol 25%, Boracay 20%, CamSur 15%, Palawan 30%]

To construct a pie graph, we first determine the number of degrees that represent each
fractional part or percent of the respective categories. Take note that a circle contains 360
degrees. This means that we have to multiply the relative frequency of each category by 360
degrees to get the angle size of its sector in the pie graph.
Example:

Tourist Destination    (f)    rf      Angle size/Area sector
Baguio                 2      0.10    360(0.10) = 36°
Bohol                  5      0.25    360(0.25) = 90°
Boracay                4      0.20    360(0.20) = 72°
CamSur                 3      0.15    360(0.15) = 54°
Palawan                6      0.30    360(0.30) = 108°
                       n = 20
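
The angle computation can be sketched in a few lines. The function name `sector_angles` is a hypothetical helper chosen for illustration; the frequencies are those of Table 1.

```python
# Frequencies from Table 1 of this module.
frequencies = {"Baguio": 2, "Bohol": 5, "Boracay": 4, "CamSur": 3, "Palawan": 6}


def sector_angles(freqs):
    """Map each category to its sector angle in degrees: 360 * (f / n)."""
    n = sum(freqs.values())
    return {category: 360 * f / n for category, f in freqs.items()}


angles = sector_angles(frequencies)
for category, angle in angles.items():
    print(f"{category:<10}{angle:>6.0f} degrees")
```

Because the relative frequencies add up to 1, the angles should add up to 360 degrees.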

3. Line Graph
A line graph makes use of line segments to show changes and relationships between
quantities.

Example: Average Age of the Total Population: 1980, 1990, 1995, 2000-2011, 2016, 2017, and 2040

[Figure: line graph]


Take Note: Bar graphs and line graphs may also be used for comparing quantities of two
or more data sets. Different styles or colors for bars and lines may be used to
distinguish one group from another.

Example: Gross Domestic Product and Gross National Income, at Constant Prices,
2000 to 2011

[Figure: graph comparing GDP and GNI; vertical axis: 0 to 9,000,000; horizontal axis: years 2000 to 2011]

Source of data: National Statistical Coordination Board (NSCB)


GDP - Gross Domestic Product
GNI - Gross National Income

Example: Dependency Ratio by Type in Percent, Census Years 1970, 1975, 1980, 1990, 1995, 2000, 2007 and 2010

[Figure: graph of dependency ratio by type]


Name: ____________________________________________ Date: ____________________

Section: ___________ Professor: ___________________________ Score: _____________

Activity 1

Refer to each graph below. Then answer the questions that follow.

1. OFW Remittances in Billion US Dollars, January to May, 2008 and 2009

[Figure: graph comparing monthly OFW remittances in 2008 and 2009; vertical axis: remittances in billion US dollars (0.00 to 1.80); horizontal axis: January, February, March, April, May]

________1. Which month in 2008 had the highest OFW remittance?
________2. How much was the highest OFW remittance in 2009?
________3. Which month in 2009 had the lowest OFW remittance?
________4. How much was the lowest OFW remittance in 2008?
________5. Which month had almost the same amount of remittances in both years?

2. Comparison of OFWs by Work Destination, 2000 and 2010

[Figure: two pie graphs, "2000 OFW Destination" and "2010 OFW Destination", with the sectors listed below]

Destination     2000 OFW Destination    2010 OFW Destination
Middle East     44%                     61%
Balance Asia    46%                     25%
Europe          6%                      4%
America         1%                      2%
Others          3%                      8%

Source of data: Philippine Overseas Employment Administration (POEA)

1. What is the most common OFW destination in 2000? _______________ in 2010? _______________
2. What is the difference in percent (from 2000 to 2010) of OFWs working in
Balance Asia? _______________ in Middle East? _______________ in Europe? _______________


TABULAR PRESENTATION OF QUANTITATIVE DATA

Data for quantitative variables may likewise be organized by determining the
frequency counts belonging to each group, called classes or class intervals. To present the
data effectively, we prepare a stem-and-leaf display or construct a frequency distribution
table.

Suppose a region-wide survey was conducted to determine the functional literacy
rate of a region. Functional literacy, according to the National Statistics Office (NSO), is a
higher level of literacy which includes not only reading and writing skills but also numerical
and comprehension skills. The survey covers household members 10 to 64 years old in the
provinces and key cities of the region. The literacy rate of the sample was determined, and
the results are as follows:

84 78 90 84 95 82 84 75 83 89
88 90 88 91 89 85 98 86 92 93
66 98 81 87 74 89 98 79 84 87
80 89 73 86 82 94 97 94 86 93
93 95 96 97 88 77 96 76 88 92

Literacy rate, a quantitative variable, may be organized using a stem-and-leaf display
or a frequency distribution table.

STEM-and-LEAF DISPLAY

A stem-and-leaf display presents quantitative data in condensed form while retaining
the individual observations; thus, no information is lost. Each value in this type of
presentation is divided into two parts: a stem and a leaf. The leaves for each stem are shown
separately in the presentation.

How to Prepare a Stem-and-Leaf Display

1. Split each value into two parts. The first part is the first digit, which is called the
stem. The second part will be the second digit, which is called the leaf.
2. Draw a vertical line and write the stems on the left side of it arranged in ascending
order.
3. After listing the stems, read the leaves for all values and record them next to the
corresponding stems on the right side of the vertical line.

Example: For the given data, the first two values are 84 and 78, thus:

stems    7 | 8    leaf for 78
         8 | 4    leaf for 84
         9 |

The resulting stem-and-leaf display of the given data is:

6 | 6
7 | 8 5 4 9 3 7 6
8 | 4 4 2 4 3 9 8 8 9 5 6 1 7 9 4 7 0 9 6 2 6 8 8
9 | 0 5 0 1 8 2 3 8 8 4 7 4 3 3 5 6 7 6 2
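
The three steps above can be sketched in code as a way to check a hand-made display. This is an illustrative sketch that assumes two-digit whole-number data, so that the tens digit is the stem and the ones digit is the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Literacy rates copied from the data set in this module.
literacy_rates = [
    84, 78, 90, 84, 95, 82, 84, 75, 83, 89,
    88, 90, 88, 91, 89, 85, 98, 86, 92, 93,
    66, 98, 81, 87, 74, 89, 98, 79, 84, 87,
    80, 89, 73, 86, 82, 94, 97, 94, 86, 93,
    93, 95, 96, 97, 88, 77, 96, 76, 88, 92,
]


def stem_and_leaf(values):
    """Group values by stem (tens digit), keeping leaves in reading order."""
    display = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)   # e.g. 84 -> stem 8, leaf 4
        display[stem].append(leaf)
    return dict(sorted(display.items()))  # stems in ascending order


display = stem_and_leaf(literacy_rates)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Counting the leaves on all stems should give back the sample size, here 50.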

Name: ____________________________________________ Date: ____________________

Section: ___________ Professor: ___________________________ Score: _____________

Activity 2

Below are the adult literacy rates in 2007 of selected Asian countries. Adult literacy,
according to the World Bank, is the percentage of the adult population who can, with
understanding, read and write a short, simple statement about their everyday life.

92 95 98 74 61 72 76 68 89 63
75 81 76 58 88 95 60 63 72 65
67 80 87 94 77 77 98 99 80 53
76 72 94 93 90 82 65 82 93 96

Prepare a stem-and-leaf display for these data.


Frequency Distribution

A frequency distribution for quantitative data lists all the classes and the number of
values belonging to each class. Data presented in this form are called grouped data.

To construct a frequency distribution table for quantitative data, we have the following steps:

1. Find the range of the data set. The range (R) is given by the difference between the highest
(H) and lowest (L) data entries. So, for our given data set we have:

R = H – L = 98 – 66 = 32
2. Determine the number of classes, also known as the number of class intervals (c). Note that
these classes represent a variable. One rule to help us decide on the number of classes is to
use Sturges' formula, given by:

$$c = 1 + 3.322 \log n$$

where: c – number of classes
       n – sample size/total frequency

Therefore:

$$c = 1 + 3.322 \log 50 = 1 + 3.322(1.699) \approx 6.64 \rightarrow c \approx 7$$

3. Find the class size (i), also known as the class width of the data set. Divide the range by the
number of classes (c) and round up to find the class size of the data set. Thus, we have

$$i = \frac{R}{c}$$

where: i = class size
       R = range
       c = number of classes

$$i = \frac{32}{7} \approx 4.57 \rightarrow i = 5$$
4. List the class intervals of the data set. For the given data, we will have to construct seven (7)
classes with a class size (i) of 5. Determine also the lower limits and the upper limits of the
classes.

a. The lower limit of the first class interval is the number nearest to the lowest value of
the data entries that is divisible by the class size. This value may be less than or
equal to the lowest value.

For the given data, the lowest value is 66. The nearest number to 66 that is divisible by
5 is 65, which is the lower limit of the first class interval. To find the lower limits of
the remaining six classes, add the class size to the lower limit of each previous class.
b. The upper limit of the first class interval is the number that is one less than the lower
limit of the second class. The upper limits of the remaining six classes are
determined by adding the class size to the upper limit of each previous class.

5. Tally the entries for each class interval.


6. The number of tally marks for a class interval is the frequency for that class. The frequency
distribution for the given data is shown below.

Literacy Rates of Provinces and Key Cities of Region X

Literacy Rate    Tally                Frequency (f)
65 – 69          |                    1
70 – 74          ||                   2
75 – 79          |||||                5
80 – 84          ||||| ||||           9
85 – 89          ||||| ||||| ||||     14
90 – 94          ||||| |||||          10
95 – 99          ||||| ||||           9
                                      n = 50

Parts of the table:
Variable: Literacy Rate
Third class: 75 – 79
Frequency of the fourth class: 9
Upper limit of the fifth class: 89
Lower limit of the sixth class: 90
Number of cases or sample size: n = 50
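
The construction steps above can be traced in code. The sketch below is illustrative only: the function name `frequency_distribution` is hypothetical, it assumes whole-number data, it rounds the Sturges result to the nearest whole number, and it takes the first lower limit as the largest multiple of the class size that does not exceed the lowest value.

```python
import math

# Literacy rates copied from the data set in this module.
literacy_rates = [
    84, 78, 90, 84, 95, 82, 84, 75, 83, 89,
    88, 90, 88, 91, 89, 85, 98, 86, 92, 93,
    66, 98, 81, 87, 74, 89, 98, 79, 84, 87,
    80, 89, 73, 86, 82, 94, 97, 94, 86, 93,
    93, 95, 96, 97, 88, 77, 96, 76, 88, 92,
]


def frequency_distribution(values):
    """Return a list of (lower limit, upper limit, frequency) rows."""
    n = len(values)
    data_range = max(values) - min(values)              # Step 1: R = H - L
    num_classes = round(1 + 3.322 * math.log10(n))      # Step 2: Sturges' formula
    class_size = math.ceil(data_range / num_classes)    # Step 3: i = R / c, rounded up
    first_lower = (min(values) // class_size) * class_size  # Step 4a
    rows = []
    for k in range(num_classes):
        lower = first_lower + k * class_size
        upper = lower + class_size - 1                   # Step 4b
        f = sum(lower <= v <= upper for v in values)     # Steps 5 and 6: tally and count
        rows.append((lower, upper, f))
    return rows


rows = frequency_distribution(literacy_rates)
for lower, upper, f in rows:
    print(f"{lower} - {upper}: {f}")
```

For other data sets the rounding choices may need adjusting, for example when the highest value falls outside the last class.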

After constructing a frequency distribution such as above, there are several additional
features that we may include to help better understand the data.

1. Classmark (xm)
The classmark (xm), sometimes called the midpoint of the class interval, is the sum of the
lower and upper limits of the class interval divided by two.
Thus,

$$x_m = \frac{LL + UL}{2}$$

where: xm - class mark
       UL - upper class limit
       LL - lower class limit

2. Class Boundaries
The class boundary is the midpoint between the upper limit of one class and the
lower limit of the next class. The class boundaries are the real limits of the class intervals.
Given below are the classmarks and class boundaries of our data in Table 2.

Classmark and Class Boundaries of a Frequency Distribution Table of Functional Literacy
Rate of Region X

Literacy Rate    Frequency (f)    Classmark (xm)    Class Boundaries
65 – 69          1                67                64.5 – 69.5
70 – 74          2                72                69.5 – 74.5
75 – 79          5                77                74.5 – 79.5
80 – 84          9                82                79.5 – 84.5
85 – 89          14               87                84.5 – 89.5
90 – 94          10               92                89.5 – 94.5
95 – 99          9                97                94.5 – 99.5

Parts of the table:
Upper class boundary of the 2nd class: 74.5
Classmark of the 5th class: 87
Lower class boundary of the 5th class: 84.5
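
The classmark formula and the boundary rule can be sketched as follows. The helper names `classmark` and `class_boundaries` are hypothetical, and the sketch assumes equally spaced classes listed in ascending order.

```python
# Class limits of the literacy rate distribution in this module.
class_limits = [(65, 69), (70, 74), (75, 79), (80, 84), (85, 89), (90, 94), (95, 99)]


def classmark(lower, upper):
    """Midpoint of a class interval: (LL + UL) / 2."""
    return (lower + upper) / 2


def class_boundaries(limits):
    """Extend each limit by half the gap between consecutive classes."""
    gap = limits[1][0] - limits[0][1]   # e.g. 70 - 69 = 1
    return [(lower - gap / 2, upper + gap / 2) for lower, upper in limits]


classmarks = [classmark(lower, upper) for lower, upper in class_limits]
boundaries = class_boundaries(class_limits)
for (lower, upper), xm, (lb, ub) in zip(class_limits, classmarks, boundaries):
    print(f"{lower} - {upper}: xm = {xm}, boundaries = {lb} - {ub}")
```

Each upper boundary coincides with the lower boundary of the next class, which is why the bars of a histogram touch.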


Take Note: We may distort or lose some information when we group the raw data
into classes. It is advised that we construct the frequency distribution table
carefully.

3. Relative Frequency (rf)

The relative frequency (rf) of the class interval is the portion or part of the data that
falls in that class. To find the rf, we have:

$$rf = \frac{f}{n}$$

where: rf – relative frequency
       f – frequency of the given class
       n – total number of cases or sample size

4. Cumulative Frequency (cf)

The cumulative frequency of a class interval is the sum of the frequency for the given
class and all previous classes. Cumulating the frequencies may be done by adding each
frequency starting from the lowest class interval, which gives the less than cumulative
frequency (<cf). It may also start from the highest class interval, which gives the greater
than cumulative frequency (>cf).

5. Percentage

The percentage distribution lists the percentage for each class interval, obtained by
multiplying the relative frequency of the class interval by 100.

Percentage = relative frequency × 100

6. Cumulative Percentage Frequency (cpf)

The cumulative percentage frequency of a class interval is the sum of the percentage
for the given class and all previous classes. As with the cumulative frequency, this may be
done in two ways: we add the percentage frequencies starting either from the lowest class
interval or from the highest class interval.

The relative frequency and percentage for the distribution in Table 2 are given here:

Relative Frequency and Percentage Distribution Table
of Functional Literacy Rate of Region X

Literacy Rate    Frequency (f)    Relative Frequency (rf)    Percentage (%)
65 – 69          1                0.02                       2
70 – 74          2                0.04                       4
75 – 79          5                0.10                       10
80 – 84          9                0.18                       18
85 – 89          14               0.28                       28
90 – 94          10               0.20                       20
95 – 99          9                0.18                       18
                 n = 50           1.00                       100
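
The relative, cumulative and percentage columns described in items 3 to 6 can all be derived from the frequency column. The sketch below is illustrative; the variable names are assumptions chosen to mirror the symbols rf, <cf, >cf and cpf.

```python
from itertools import accumulate

# Frequencies of the seven literacy rate classes, lowest class first.
frequencies = [1, 2, 5, 9, 14, 10, 9]
n = sum(frequencies)

relative = [f / n for f in frequencies]                  # rf = f / n
percentage = [100 * f / n for f in frequencies]          # rf x 100
less_than_cf = list(accumulate(frequencies))             # <cf: add from the lowest class
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]  # >cf: add from the highest class
less_than_cpf = [100 * cf / n for cf in less_than_cf]    # cumulative percentage (less than)

for row in zip(frequencies, relative, less_than_cf, greater_than_cf, percentage, less_than_cpf):
    print("{:>3} {:>6.2f} {:>4} {:>4} {:>6.1f} {:>6.1f}".format(*row))
```

The last entry of the less than column and the first entry of the greater than column should both equal the sample size.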


GRAPHICAL PRESENTATION OF QUANTITATIVE DATA

Pictures convey the message more effectively than columns of numbers. It is
easier to identify patterns in a data set through a visual presentation of a frequency table.
Visual models, such as graphs, provide a better understanding of a data set.
Recall that for qualitative data, we may present the data set using a bar graph, line
graph, pictograph or pie graph. To show the information obtained from a frequency table of
quantitative data, we may use a histogram and a frequency polygon.

Histogram

A histogram is a bar graph that represents the frequency distribution of a
“continuous” data set. It has the following properties:
1. The horizontal scale is quantitative and measures the values of the data set.
2. The vertical scale measures the frequency of the class interval.
3. There is “no gap” between consecutive bars.

Steps in Constructing Histogram


1. Mark the horizontal axis with the classmarks of the class intervals and the vertical
axis with the frequencies.
2. Draw a bar for each class, such that the classmark is at the center of the bar
and its height represents the frequency of that class.
3. Draw the bars adjacent to each other with no gap between bars. The resulting bar
graph is then called a frequency histogram, or simply histogram.

Take Note: There are variants of histogram such as relative frequency histogram
or percentage histogram. The difference depends on whether the
relative frequencies or percentages are marked on the vertical axis.

Example: The frequency histogram of the data in Table 2 is shown below.

[Figure: frequency histogram; vertical axis: frequency (up to 14); horizontal axis: classmarks 67, 72, 77, 82, 87, 92, 97]
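
As a rough stand-in for the drawn graph, the same bars can be printed as text. This is only an illustrative sketch: the function name `text_histogram` is hypothetical, and the bars are drawn sideways (one row per class) rather than upright as in a true histogram.

```python
# Classmarks and frequencies of the literacy rate distribution.
classmarks = [67, 72, 77, 82, 87, 92, 97]
frequencies = [1, 2, 5, 9, 14, 10, 9]


def text_histogram(marks, freqs, symbol="#"):
    """Return one text row per class; the bar length equals the frequency."""
    return [f"{mark:>3} | {symbol * f} ({f})" for mark, f in zip(marks, freqs)]


lines = text_histogram(classmarks, frequencies)
print("\n".join(lines))
```

The longest row marks the class with the highest frequency, here the class with classmark 87.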


Polygon
Another way of presenting quantitative data in graphical form is by constructing
polygons. This graph is formed by joining the midpoints of the tops of successive bars in a
histogram with straight lines. It emphasizes the continuous change in frequencies.

Steps in Constructing Polygon


1. Mark the horizontal axis with the classmark of the class interval and the vertical axis with
the frequencies.
2. Mark a dot above the midpoint of each class interval at a height equal to the frequency of
that class interval.
3. Mark two more classes one at each end and mark the midpoints. Take note that these two
classes have zero frequencies.
4. Join the adjacent dots with straight lines. The resulting line graph is then called a
frequency polygon, or simply polygon.

Take Note: Variants of the polygon are the frequency polygon, with frequencies marked on the
vertical axis; the relative frequency polygon, with relative frequencies marked on the
vertical axis; and the percentage polygon, with percentages marked on the
vertical axis.

[Figure: frequency polygon; vertical axis: frequency (up to 14); horizontal axis: classmarks 67, 72, 77, 82, 87, 92, 97]
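
The points to be joined in a frequency polygon can be listed with a short sketch. The function name `polygon_points` is hypothetical, and the sketch assumes equally spaced classmarks so that the two extra zero-frequency classes sit one class size beyond each end.

```python
# Classmarks and frequencies of the literacy rate distribution.
classmarks = [67, 72, 77, 82, 87, 92, 97]
frequencies = [1, 2, 5, 9, 14, 10, 9]


def polygon_points(marks, freqs):
    """Return (classmark, frequency) points, closed with a zero-frequency class at each end."""
    class_size = marks[1] - marks[0]
    xs = [marks[0] - class_size, *marks, marks[-1] + class_size]
    ys = [0, *freqs, 0]
    return list(zip(xs, ys))


points = polygon_points(classmarks, frequencies)
print(points)
```

Joining consecutive points with straight lines gives the polygon; the added end points bring the graph down to the horizontal axis.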

Cumulative Frequency Polygon or Ogive

A cumulative frequency polygon, or ogive (pronounced ō’jive), is a polygon that
presents the cumulative frequency of each class at its class boundaries.

Types of ogives
1. Less than ogive – the upper class boundaries are marked on the horizontal axis and
the less than cumulative frequencies are marked on the vertical axis.
2. Greater than ogive – the lower class boundaries are marked on the horizontal axis
and the greater than cumulative frequencies are marked on the vertical axis.

How to construct an ogive


1. Construct a cumulative frequency distribution.
2. Specify the horizontal and vertical scales of the graph. The horizontal axis shows
the class boundaries and the vertical axis shows the cumulative frequencies.
3. Plot the points that represent the specified class boundaries and their corresponding
cumulative frequencies.
4. Connect the points on the graph.


5. Close each graph with broken lines on both ends.

[Figure: less than ogive and greater than ogive; vertical axis: cumulative frequency (up to 56); horizontal axis: class boundaries 64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5, 99.5]
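
The points plotted for the two ogives can be sketched as follows. The function names are hypothetical, and closing each graph at a cumulative frequency of zero on one end is an assumption about how the broken closing lines are drawn.

```python
from itertools import accumulate

# Class boundaries and frequencies of the literacy rate distribution.
boundaries = [64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5, 99.5]
frequencies = [1, 2, 5, 9, 14, 10, 9]


def less_than_ogive(bounds, freqs):
    """Pair each boundary with the number of values below it."""
    return list(zip(bounds, [0, *accumulate(freqs)]))


def greater_than_ogive(bounds, freqs):
    """Pair each boundary with the number of values above it."""
    n = sum(freqs)
    return list(zip(bounds, [n - cf for cf in [0, *accumulate(freqs)]]))


less = less_than_ogive(boundaries, frequencies)
greater = greater_than_ogive(boundaries, frequencies)
print(less)
print(greater)
```

At every boundary the two cumulative frequencies add up to the sample size, so the two curves cross where each equals half of n.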


Name: ____________________________________________ Date: ____________________

Section: ___________ Professor: ___________________________ Score: _____________

Activity 3

You work as a manager in the admission department of a company. You were asked by your
department head to present and analyze the qualifying test scores of 40 employees who are
applying for 5-year benefits.

The recorded scores handed out to you are as follows:

84 81 74 92 80 88 98 79
82 85 97 82 89 84 86 91
85 87 95 90 90 84 93 92
88 85 86 90 86 89 88 91
88 98 96 94 83 92 95 87

From this data set, prepare the following:

1. Stem-and-leaf display
2. Complete frequency distribution table with 5 class intervals
3. Histogram

Complete the table:

Test Score    Tally    Frequency (f)    Class Boundaries    Xm    rf    <cf    >cf


Name: _____________________________________________ Date: _________________

Section: ___________ Professor: _________________________ Score: _____________

Activity 4

A. Directions: The following table gives the frequency distribution of times that
clients spent waiting in line for bank transactions.

Waiting time (in minutes) Number of clients


0–6 20
7 – 13 24
14 – 20 18
21 – 27 12
28 – 34 5
35 – 41 1

Determine the following:

_______1. Class size
_______2. Number of classes
_______3. Classmark of the third class
_______4. Lower limit of the fourth class
_______5. Upper class boundary of the third class
_______6. Number of clients surveyed
_______7. Highest frequency
_______8. Class that comprises 30% of the distribution
_______9. Frequency of clients that wait less than 20 minutes
_______10. Percentage of clients that wait less than 14 minutes

B. Analyze the histogram, then answer the questions that follow.

[Figure: histogram titled "Waiting Time of Patients"; vertical axis: Number of Patients (0 to 30); horizontal axis: Waiting Time, with classes 0–6, 7–13, 14–20, 21–27, 28–34, 35–41]

_______1. What is the class size of the distribution?
_______2. Which class has the highest frequency?
_______3. What are the class boundaries of the class with the lowest frequency?
_______4. How many patients have a waiting time of 7 to 27 minutes?
_______5. What percentage of the patients have a waiting time of less than or equal to
6 minutes and greater than or equal to 28 minutes?


