
2.0 Data presentation
2.1 Introduction
Recall some of the communication concepts and the purpose of communicating information to
corporate decision makers in various organizations. One approach that dovetails with our
study is the graphical presentation of information to convey statistical results.
Graphs communicate information more vividly and briefly than prose and tables alone.
Because of this powerful attribute of graphs and charts, it is worthwhile for
you to have a very good grasp of graphs and charts as well as tables.

2.2 Objectives
By the end of the chapter, a student should be able to:
• identify the different forms of graphs and charts.
• explain the use of these different graphs/charts/tables.
• associate specific data types with specific graphs/charts/tables.
• transform raw data into graphic/chart/tabular form.
• analyze the findings from a graph/chart/table.

2.3 Data presentation techniques

We will consider charts, graphs, frequency distributions, histograms, frequency polygons
and “less than” and “more than” ogive curves.

2.3.1 Charts

We will consider a table, pie-chart and simple bar graph (horizontal).

A Table: A typical table has columns and rows. Both absolute figures and relative figures
can be shown on a table. Any quantitative data can be presented in a table.

A Pie Chart: This is a circular chart divided into segments, where the area of each
segment is proportional to the frequency of each category of data. You may think of a pizza
from your favorite outlet.

The relative importance of each category is reflected by the relative size of its
segment. Consider the revenue streams for Liscate Investments (Private) Limited in
2020, presented first in Table 2.1, then as a pie chart and lastly as a bar graph. Note that
these revenue streams are presented in absolute (dollar) and relative (percentage) terms.

Table 2.1 Revenue stream in 2020

Revenue stream          Amount (USD)    Percentage of Total (%)
Tomatoes                   37,910.00                       42.7
Eggs                       16,180.00                       18.2
Chickens                   14,025.00                       15.8
Potatoes                    8,123.00                        9.2
Consultancy services        7,000.00                        7.9
Rentals                     5,533.00                        6.2
Total                      88,771.00                      100.0
Source: Liscate Investments (Private) Ltd Management Report (2020)
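
The relative (percentage) column can be checked from the absolute amounts. The short Python sketch below recomputes the total and each percentage from the six amounts listed in the table; the variable names are illustrative only.

```python
# Revenue figures (USD) copied from the six revenue streams in the table.
revenue = {
    "Tomatoes": 37910.00,
    "Eggs": 16180.00,
    "Chickens": 14025.00,
    "Potatoes": 8123.00,
    "Consultancy services": 7000.00,
    "Rentals": 5533.00,
}

# The total is the sum of the absolute figures.
total = sum(revenue.values())

# Relative figure = absolute figure / total, expressed as a percentage
# and rounded to one decimal place.
shares = {name: round(100 * amount / total, 1) for name, amount in revenue.items()}

for name, pct in shares.items():
    print(f"{name:<22}{revenue[name]:>12,.2f}{pct:>8.1f}")
print(f"{'Total':<22}{total:>12,.2f}")
```

Recomputing the percentages this way is a quick check that the relative figures agree with the absolute ones.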

Fig. 2.1 below shows the information from Table 2.1 as a pie chart:

Fig. 2.1 Revenue Streams

[Pie chart with one segment each for Tomatoes, Eggs, Chickens, Potatoes, Consultancy services and Rentals]

At a glance one can see that tomatoes contributed the biggest chunk of the revenue
and rentals the least. There are several varieties of pie chart, so select the one with the
highest impact on your audience of corporate decision makers.

If a simple bar graph is used instead, it will be as shown in Fig. 2.2 below:

Fig. 2.2 Revenue Streams

[Horizontal bar graph with one bar each for Tomatoes, Eggs, Chickens, Potatoes, Consultancy services and Rentals; horizontal axis scaled from 0 to 45]

This is a horizontal bar graph; the bars can also be drawn vertically. Again, the length of
each bar indicates the contribution of that revenue stream.
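
To see how bar length encodes contribution, here is a minimal text-only sketch in Python. It assumes one `#` character per whole percentage point, which is an arbitrary scale chosen for illustration, and uses the percentages from the revenue table.

```python
# Percentage shares copied from the revenue table.
shares = {
    "Tomatoes": 42.7,
    "Eggs": 18.2,
    "Chickens": 15.8,
    "Potatoes": 9.2,
    "Consultancy services": 7.9,
    "Rentals": 6.2,
}


def bar(pct):
    """Return a bar whose length is the share rounded to a whole percent."""
    return "#" * round(pct)


# One horizontal bar per revenue stream, longest for the biggest contributor.
for name, pct in shares.items():
    print(f"{name:<22}{bar(pct)} {pct}")
```

The longest bar belongs to tomatoes and the shortest to rentals, which is the same message the pie chart gives.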

2.4 Frequency Distributions

A frequency distribution is typically a table that summarizes numerical data, mostly
ratio-scaled and either discrete or continuous in nature, by grouping it into classes or
intervals. The frequency of each class/interval shows the number of observations whose
values fall in that class.

First, you need to understand what ungrouped data is. When data is first collected, it is
jumbled up and not arranged in any order.

On the other hand, once the data is put into classes/intervals, it is grouped data. It
is now information. Transforming ungrouped data into grouped data takes 5 steps.

Example
A business consultant in the Midlands province analyzed the turnovers of 40
Small-to-Medium Enterprises (SMEs), in thousands of USD.

Table 2.1 Small-to-Medium Enterprises turnover ($000)

175  146  165  170  195  150  116  165
162  124  178  182  142  177  135  144
140  165  164  150  182  162  168  158
178  172  155  120  118  190  126  176
160  135  160  143  140  155  162  185

What is the random variable and data type under study?

Step 1: Calculate the data range.

Range = Highest observation - Lowest observation = 195 - 116 = 79. Many students
misidentify the maximum or the minimum observation, so their range, and hence their
class width, comes out wrong.
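
Because the maximum and minimum are easy to misread by eye, Step 1 can be checked mechanically. The sketch below is only an illustration; it types in the 40 turnover values from the example and uses Python's built-in `max` and `min`.

```python
# The 40 SME turnover observations ($000) from the example.
turnover = [
    175, 146, 165, 170, 195, 150, 116, 165,
    162, 124, 178, 182, 142, 177, 135, 144,
    140, 165, 164, 150, 182, 162, 168, 158,
    178, 172, 155, 120, 118, 190, 126, 176,
    160, 135, 160, 143, 140, 155, 162, 185,
]

highest = max(turnover)
lowest = min(turnover)

# Range = highest observation - lowest observation.
data_range = highest - lowest

print(len(turnover), highest, lowest, data_range)
```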

Step 2: Establish number of classes/ intervals

Here you want to determine the suitable number of classes that will accommodate the given
data. We will use two methods to do this: Sturges' Rule and the logarithm method. A rule
of thumb is to have 5-8 classes.

Sturges' Rule says that $2^k \geq n$, where

n = sample size/ total number of observations
k = the expected number of classes/ intervals

Notice that we raise 2 to successive powers k until we reach a power that equals n = 40
or is just greater than 40, the total number of observations in the data set.

If k = 5, $2^5 = 32$, which is below 40.

If k = 6, $2^6 = 64$, which is just above 40. Therefore this data set will be adequately
accommodated in 6 classes.

If we use the logarithm method, the equation used is $k = \log n / \log 2 = \log 40 / \log 2 = 5.322$.
Remember there is no half class or quarter of a class, so we round up to 6 classes, not down to 5.
Hence k = 6, which satisfies Sturges' Rule.
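
Both methods in Step 2 can be sketched in a few lines of Python. This is a minimal illustration, assuming n = 40 as in the example; the variable names are hypothetical.

```python
import math

n = 40  # number of observations in the SME turnover data set

# Method 1: raise 2 to successive powers until 2**k >= n.
k = 1
while 2 ** k < n:
    k += 1

# Method 2: k = log n / log 2, then round up because a fraction
# of a class is not possible.
k_log = math.log(n) / math.log(2)
k_from_log = math.ceil(k_log)

print(k, round(k_log, 3), k_from_log)
```

The two methods agree because rounding log n / log 2 up to a whole number gives the smallest k for which 2 to the power k is at least n.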

Step 3: Calculate class / interval width, c

This is the constant difference between the upper and lower limits of a class/ interval.

$c = \text{Range} / k = 79 / 6 = 13.1667$ or

$c = \text{Range} / k = 79 / 5.322 = 14.844$

The class width should be a whole number (integer) that is easy to work with and
close to the calculated values above. Rounding 13.1667 down to 13 would leave out some
observations, because six classes of width 13 cover only 6 × 13 = 78, which is less than
the range of 79. Similar reasoning applies to 14.844, which rounds up to 15. The convenient
whole number in this case is therefore 15. Most students find this selection difficult to follow.
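
The arithmetic behind Step 3 can be checked with the sketch below. It assumes the range of 79 from Step 1 and k = 6 from Step 2, and it treats 15 as the chosen width; the coverage check at the end is an added illustration, not a rule stated in the text.

```python
import math

data_range = 79                     # from Step 1
k = 6                               # from Step 2
k_log = math.log(40) / math.log(2)  # about 5.322, from the logarithm method

# The two calculated widths from the text.
c_exact = data_range / k
c_exact_log = data_range / k_log

c = 15  # convenient whole-number width chosen in the text

print(round(c_exact, 4), round(c_exact_log, 3))

# k classes of width c must span at least the whole range.
covers_with_15 = k * c >= data_range
covers_with_13 = k * 13 >= data_range
print(covers_with_15, covers_with_13)
```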

Suppose your class width comes to 8.77; what should be your class width? What of 17; or
23?

Step 4: Calculate class limits

These may also be called class boundaries. Always make sure that the lower limit of the
first class is either equal to the lowest observation or just below it. Again,
select an integer that is easy to work with. Our lowest observation is 116, so a
value such as 115 will accommodate all the observations from 116 up to 195.

A value such as 110 would be unnecessarily far from 116.

With our first class starting at 115, the upper limit of this class is found by adding the class
width of 15 to give us 130. Hence this class runs from 115 up to but not including 130. This
long phrase is written 115-<130. The next class starts at 130 and ends at 130 + 15 = 145, and
so on. This way each observation is located in only one class, not two.
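
The class limits in Step 4 follow a simple pattern, which the Python sketch below generates. It assumes a starting lower limit of 115, a width of 15 and 6 classes, as chosen in the steps above.

```python
lower_start = 115  # just below the lowest observation, 116
c = 15             # class width from Step 3
k = 6              # number of classes from Step 2

# Each class runs from its lower limit up to, but not including,
# the lower limit plus the class width.
classes = [(lower_start + i * c, lower_start + (i + 1) * c) for i in range(k)]

for lower, upper in classes:
    print(f"{lower}-<{upper}")
```

Because the upper limit of one class is the lower limit of the next and is itself excluded, no observation can fall into two classes.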

Step 5: Tally table for placing observations in classes

Remember that each observation is placed in only one class, and this is achieved by use
of a tally table as shown below in Table 2.2. The table has both absolute frequency (fi) and
relative frequency (%). Study it closely.

Table 2.2 Tally table for allocating observations in classes

Company turnover ($000)   Tally             Number of companies         Relative frequency (%)
                                            (absolute frequency, fi)
115-<130                  lllll                        5                        12.5
130-<145                  lllll ll                     7                        17.5
145-<160                  lllll l                      6                        15
160-<175                  lllll lllll ll              12                        30
175-<190                  lllll lll                    8                        20
190-<205                  ll                           2                         5
Total                                           N = ∑fi = 40                   100
In Table 2.2,
fi means the absolute frequency, which is the number of observations in that specific class.
N = ∑fi is the total number of observations. This summation of all absolute frequencies
must equal the total number of observations in the data set. If it does not, you have made a
mistake along the way. Revise your work before going any further.

The relative frequencies must always sum to 100%. Each relative frequency is found by dividing
the respective absolute frequency by the total number (N) of observations in the data set. For
example, for the class 145-<160 the absolute frequency is 6, so the relative frequency is
6/40 = 15%. This signifies the “slice” or proportion of the total number of observations that
falls in each class.
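
The tallying in Step 5 can be reproduced with the minimal Python sketch below. It assumes the 40 observations from the example and the six classes from Step 4; the helper name `class_frequency` is hypothetical.

```python
# The 40 SME turnover observations ($000) from the example.
turnover = [
    175, 146, 165, 170, 195, 150, 116, 165,
    162, 124, 178, 182, 142, 177, 135, 144,
    140, 165, 164, 150, 182, 162, 168, 158,
    178, 172, 155, 120, 118, 190, 126, 176,
    160, 135, 160, 143, 140, 155, 162, 185,
]

# Class limits from Step 4: lower limit included, upper limit excluded.
classes = [(115, 130), (130, 145), (145, 160), (160, 175), (175, 190), (190, 205)]


def class_frequency(data, lower, upper):
    """Count the observations x with lower <= x < upper."""
    return sum(1 for x in data if lower <= x < upper)


freq = [class_frequency(turnover, lo, hi) for lo, hi in classes]
N = sum(freq)                       # must equal the number of observations
rel = [100 * f / N for f in freq]   # relative frequencies in percent

for (lo, hi), f, r in zip(classes, freq, rel):
    print(f"{lo}-<{hi:<6}{'l' * f:<14}{f:>3}{r:>7.1f}")
print(f"N = {N}, total relative frequency = {sum(rel)}")
```

The two checks described above appear in the last line of output: N should equal 40 and the relative frequencies should add up to 100.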

Homework: Draw a histogram of company turnover (SMEs) in the Midlands province using the
information in Table 2.2.
How many SMEs have the highest turnover and how many have the lowest turnover?
What are the major differences between a bar graph and a histogram?

2.5 Frequency Polygon

When the mid-points of successive class intervals, each plotted at the height of its class
frequency, are joined by straight lines, a frequency polygon is formed.

Fig. 2.3 Frequency polygon of SMEs turnover in the Midlands province

[Line graph; vertical axis: Number of companies (fi), scaled from 0 to 14; horizontal axis: turnover classes 115-<130, 130-<145, 145-<160, 160-<175, 175-<190 and 190-<205]

The vertical axis shows the number of companies, fi; the horizontal axis shows the
company turnover.
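
The points joined in a frequency polygon can be listed with the short sketch below. It assumes the class limits and absolute frequencies from Table 2.2 and takes each mid-point as the average of the lower and upper class limits.

```python
# Class limits and absolute frequencies from Table 2.2.
classes = [(115, 130), (130, 145), (145, 160), (160, 175), (175, 190), (190, 205)]
freq = [5, 7, 6, 12, 8, 2]

# Mid-point of a class = (lower limit + upper limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Each point of the polygon pairs a class mid-point with its frequency.
points = list(zip(midpoints, freq))

for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the polygon.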

2.6 Cumulative Frequency Distribution (Ogive)

Sometimes it is necessary to answer questions such as: How many SMEs have turnover of less
than $175,000.00 per annum? How many SMEs have turnover of more than $160,000.00 per
annum? What percentage of SMEs have turnover of more than, or less than, $160,000.00?
To answer these questions, one needs to construct “more than” and “less than”
cumulative frequency distribution curves (ogives).

2.6.1 Construction of the “more than ogive”

There are three steps that lead to the construction of a “more than ogive.”
a) Based on the last class with upper limit of 205, create another class with 205 as the lower
limit. Thus, we now have a class that reads 205 and above. This class has an absolute
frequency of 0.
b) Starting with this new class's lower limit, repeatedly ask the question: How many
observations are above 205? Above 190? Above 175? As you may have noticed, the question
targets the lower limit of each class until we get to 115, where the answer is all 40 observations.
c) The answer to each of these questions is obtained by adding the current absolute
frequency to the previous cumulative absolute frequency.

Construct a “more than ogive” of the SMEs turnover in the Midlands province.
Table 2.3 “More than ogive” for SMEs turnover (absolute and relative frequency).

Company turnover   Number of companies   Cumulative frequency
                          (fi)              %        Absolute
115-<130                    5             100           40
130-<145                    7              87.5         35
145-<160                    6              70           28
160-<175                   12              55           22
175-<190                    8              25           10
190-<205                    2               5            2
205 and above               0               0            0
Total                      40
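
The three steps for the “more than” cumulative frequencies can be sketched in Python as follows. This is a minimal illustration assuming the class frequencies from Table 2.2 plus the added class “205 and above”; it accumulates from the last class back to the first.

```python
# Lower limits of each class, including the added class "205 and above".
lower_limits = [115, 130, 145, 160, 175, 190, 205]
freq = [5, 7, 6, 12, 8, 2, 0]  # the added class has 0 absolute frequency
N = sum(freq)

# Work from the last class upwards, adding the current absolute frequency
# to the previous cumulative frequency.
more_than = []
running = 0
for f in reversed(freq):
    running += f
    more_than.append(running)
more_than.reverse()

# Cumulative relative frequency in percent.
more_than_pct = [100 * c / N for c in more_than]

for limit, c, p in zip(lower_limits, more_than, more_than_pct):
    print(limit, c, p)
```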

So, how many companies have turnover of more than $175,000.00? Your answers should be
both absolute and relative.

2.6.2 Construction of the “less than ogive”

Again, there are three steps to the successful construction of the “less than ogive”.
a) We base this construction on the first class, 115-<130. A new class is introduced below it,
with the lower limit of 115 becoming the upper limit of the new class. We then have a class
that reads below 115. Of course, it has an absolute frequency of 0.
b) Starting from this new class, repeatedly ask the question: How many observations are
below the upper limit of this class? For the new class, the answer is 0. For the following
classes the answers are 5, 12, 18 and so on.
c) The current cumulative frequency is obtained by adding the previous
cumulative frequency to the absolute frequency of the class in question. For example,
for the class interval 130-<145 the absolute frequency is 7 and the previous cumulative
frequency is 5, giving us 12 as the required cumulative absolute frequency. The same applies
to the cumulative relative frequencies.

Table 2.4 shows the cumulative absolute and relative frequencies of the SMEs turnover in the
Midlands province.
Table 2.4 “Less than ogive” for SMEs turnover (absolute and relative frequencies).

Company turnover   Number of        Cumulative        Percentage of    Cumulative percentage
                   companies (fi)   frequency (<)     companies (%)    frequency (<)
Below 115                0                0                 0                  0
115-<130                 5                5                12.5               12.5
130-<145                 7               12                17.5               30
145-<160                 6               18                15                 45
160-<175                12               30                30                 75
175-<190                 8               38                20                 95
190-<205                 2               40                 5                100
Total                   40                                100
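
The “less than” cumulative frequencies are a running total from the first class downwards. The sketch below assumes the class frequencies from Table 2.2 plus the added class “below 115”, and uses `itertools.accumulate` from the Python standard library to form the running total.

```python
from itertools import accumulate

# Upper limits of each class, including the added class "below 115".
upper_limits = [115, 130, 145, 160, 175, 190, 205]
freq = [0, 5, 7, 6, 12, 8, 2]  # the added class has 0 absolute frequency
N = sum(freq)

# Running total: previous cumulative frequency + current absolute frequency.
less_than = list(accumulate(freq))

# Cumulative relative frequency in percent.
less_than_pct = [100 * c / N for c in less_than]

for limit, c, p in zip(upper_limits, less_than, less_than_pct):
    print(f"below {limit}: {c} companies, {p}%")
```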

What percentage of companies have turnover of less than $160,000.00 or less than
$145,000.00?

2.7 Cumulative Frequency Polygon

When we graph an ogive, we get a cumulative frequency polygon. We can use the absolute
frequency or the relative frequency. The set of questions to be answered determines which
frequency to use. So, if the question asks about “percentage”, the
relative frequency has to be plotted.

Please note that when plotting the “more than ogive”, each cumulative frequency is plotted
against the lower limit of its class interval, because it is the sum of the observations above
that lower limit. Likewise, for the “less than ogive”, each cumulative frequency is plotted
against the upper limit of its class interval.

Homework: Draw the “more than” and “less than” ogives to answer the following
questions.

a) What percentage of SMEs have turnover of less than $145,000.00 in the
Midlands province?
b) Below what turnover do the lower 80% of all the SMEs in the Midlands
province fall?
c) What percentage of SMEs have turnover of more than $150,000.00 in the Midlands
province?
d) What do we call the point where the “more than” and “less than” ogives meet on your
graph?
