
ITE 3703 – Probability and Statistics Week 02

Lesson 02

Graphical Techniques

Lesson 2 – Graphical Techniques L2-1/23



2.0 Introduction

In the previous lesson we discussed how statistics transforms data into
information and how it deals with summarizing, interpreting and presenting that
information. A wide range of techniques is used to achieve this transformation,
interpretation and presentation. These techniques are called “Statistical Techniques”.
The ability to use statistical techniques is very important in your professional life.

There are different statistical techniques for different types of statistics.

Descriptive statistical techniques are classified as shown in Figure 2.0.0.

Figure 2.0.0

During this lesson, our focus is on graphical techniques which are used mainly in descriptive
statistics.


Learning outcomes
After completion of this lesson, you will be able to draw different types of graphs
for different types of data. You will also be able to interpret a graph or a
distribution.

In particular, you will be able to:

• Organize a given set of data into a frequency distribution.
• Draw a histogram, a frequency polygon and a cumulative frequency polygon
based on a given frequency distribution.
• Present a set of data in a stem-and-leaf display.
• Organize qualitative/quantitative data in a line chart, a bar chart and a pie
chart.
• Identify the shape of a histogram.

It is far more convenient to make decisions when information is presented graphically.
Information is also simpler and easier to understand when it is presented graphically.
Therefore, graphical techniques play a major role in descriptive statistics in interpreting,
summarizing and disseminating information.

Tables, as well as graphs, are usually categorized as graphical techniques.

Graphical techniques are often classified into two categories: graphical techniques for
quantitative data and graphical techniques for qualitative data. In most cases, however,
techniques from either category can be used with both qualitative and quantitative data.


2.1 Graphical Techniques for quantitative data

In this lesson we will study some of the graphical techniques used with quantitative data.

The following is a list of graphical techniques for quantitative data.

• Frequency Distribution
• Relative Frequency Distribution
• Frequency Histogram
• Relative Frequency Histogram
• Cumulative Relative Frequency Distribution and the Ogive Curve
• Stem and Leaf Diagram
• Frequency Polygon and the Relative Frequency Polygon

2.1.1 Frequency Distribution

A frequency distribution is a table-like arrangement that groups data into classes and
records the frequency of each class. Let’s look at an example.

Example 2.1.1.1:

A physician checked the blood sugar levels of 20 patients and recorded the results given
below.


212 221 210 218
217 207 210 203
208 210 210 199
215 209 213 208
200 218 202 214

Construct a frequency distribution from the above set of data.

Answer:

The frequency distribution table for the above set of data is given below.

Examine how the blood sugar levels have been classified into classes and how the
frequency of each class has been counted.

Blood Sugar Level Frequency

195-199 1
200-204 3
205-209 4
210-214 7
215-219 4
220-224 1

First, let’s become familiar with the terminology used in frequency distributions.

Lower Class Limit:

The smallest value that can fall into a class is called the lower class limit of that class.


In the above example,

200 is the lower class limit of the class 200-204


215 is the lower class limit of the class 215-219

Upper Class Limit:

The largest value that can fall into a class is called the upper class limit of that class.

In the above example,

204 is the upper class limit of the class 200-204


219 is the upper class limit of the class 215-219

Class Interval (width of a class):

Or, we can use the following equation.


Class mark (midpoint):

The class mark is the point that divides a class into two equal parts. It is the average
of the upper and lower class limits.
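The terminology above can be illustrated with a short Python sketch using the classes of Example 2.1.1.1. The variable names are hypothetical, and the width is computed under one common convention (the difference between consecutive lower class limits), which is an assumption rather than necessarily the formula intended in this lesson.

```python
# Classes from Example 2.1.1.1, written as (lower class limit, upper class limit).
classes = [(195, 199), (200, 204), (205, 209), (210, 214), (215, 219), (220, 224)]

# Assumed convention: class width = difference between consecutive lower class limits.
widths = [nxt[0] - cur[0] for cur, nxt in zip(classes, classes[1:])]

# Class mark (midpoint) = average of the lower and upper class limits.
class_marks = [(lower + upper) / 2 for lower, upper in classes]

print("widths:", widths)
print("class marks:", class_marks)
```

Under this convention every class in the example has the same width, which agrees with the requirement that class intervals be equal.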

You must be wondering how to determine the number of classes required and the length of
a class interval.

The class intervals used in a frequency distribution should be equal. However, there is no
hard and fast rule for determining the number of classes required or the length of each class.

In general, it is good to have a small number of classes when the number of observations is
small and a large number of classes when the number of observations is large.

Once the class intervals have been selected, find the frequency of each class by counting
the number of patients whose blood sugar levels fall into that class interval. Record the
frequency of each class against its class interval in a table. This table is called the
frequency distribution table.
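The counting step can be sketched in Python as follows. This is a minimal illustration using the data of Example 2.1.1.1; the function name `frequency_distribution` is hypothetical and the class limits are treated as inclusive at both ends.

```python
# Blood sugar levels of the 20 patients in Example 2.1.1.1.
data = [212, 221, 210, 218, 217, 207, 210, 203, 208, 210,
        210, 199, 215, 209, 213, 208, 200, 218, 202, 214]

# Classes written as (lower class limit, upper class limit).
classes = [(195, 199), (200, 204), (205, 209), (210, 214), (215, 219), (220, 224)]

def frequency_distribution(values, class_limits):
    """Count how many values fall inside each (lower, upper) class, limits included."""
    return [sum(lower <= v <= upper for v in values) for lower, upper in class_limits]

frequencies = frequency_distribution(data, classes)

# Print the frequency distribution table, one class per row.
for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}  {f}")
```

The printed rows should match the frequency distribution table given in the answer to Example 2.1.1.1.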

A frequency distribution table gives a quick overall idea of the data set. For
example, we can see at once that the most common blood sugar levels in the selected
sample of patients fall in the interval 210-214, which contains 7 of the 20 patients.

Now, let’s summarize what we learned about frequency distributions:

Frequency distributions enable us to get an overall idea about the distribution of a
set of data. A frequency distribution shows us the frequency of occurrence of a
particular data item (or a class of items) in the sample. The frequency distribution is
usually represented as a table with two columns, one column representing the class
and the other column representing the corresponding frequency.


2.1.2 Relative Frequency Distribution

Here, we use the relative frequency of a class, not the absolute frequency. The relative
frequency of a class is obtained by dividing the class frequency by the total frequency.

The total frequency is the total number of data items (in our example, the total number of
patients in the sample, which is 20).

That is,
relative frequency of a class = (class frequency / total frequency)

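A relative frequency distribution records the relative frequency of each class
against its class interval.

The relative frequency distribution for Example 2.1.1.1 is as follows.

Blood Sugar Level Relative Frequency

195-199 0.05
200-204 0.15
205-209 0.20
210-214 0.35
215-219 0.20
220-224 0.05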
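The division of each class frequency by the total frequency can be sketched in Python as below. The dictionary layout and variable names are illustrative choices, and the frequencies are those of the frequency distribution table in Example 2.1.1.1.

```python
# Class frequencies from the frequency distribution of Example 2.1.1.1.
frequencies = {
    "195-199": 1,
    "200-204": 3,
    "205-209": 4,
    "210-214": 7,
    "215-219": 4,
    "220-224": 1,
}

# Total frequency = total number of patients in the sample.
total = sum(frequencies.values())

# Relative frequency of a class = class frequency / total frequency.
relative = {label: f / total for label, f in frequencies.items()}

for label, r in relative.items():
    print(f"{label}  {r:.2f}")
```

The relative frequencies always add up to 1, which is a quick check on the arithmetic.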

2.1.3 Frequency Histogram

A frequency histogram is the graphical representation of a frequency distribution.


A histogram is a graph in which the classes are marked on the horizontal axis and the
class frequencies on the vertical axis. The class frequencies are represented by the
heights of the bars and the bars are drawn adjacent to each other.

The frequency histogram for Example 2.1.1.1 is shown in Figure 2.1.3.1.

Figure 2.1.3.1

Here, we have marked the lower class limit of each class along the X axis. For example,
the lower class limit of the class 195-199 is 195, the lower class limit of the class 200-204
is 200, and so on. At first glance the axis labels may suggest that the first class interval
is 195-200, but this is not the case.


Let’s look at the first bar. It represents the frequency of the class 195-199.

Alternatively, we can mark the complete class (with both lower and upper class limits)
along the X axis to avoid confusion. Look at the histogram below. We have used the
complete class intervals instead of lower class limits.

Figure 2.1.3.2

We can draw a relative frequency histogram in the same way. The only difference is that
the Y axis shows relative frequency instead of frequency.

Histograms are important because they can be used to determine the shape of a distribution.
To make this easier, we can draw a smooth curve through the histogram.
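As a rough, text-only illustration of the histogram for Example 2.1.1.1, the sketch below prints one bar per class. It is turned on its side (bars run left to right rather than upwards), and the helper name `text_histogram` and the `#` symbol are arbitrary choices, not part of any statistical package.

```python
# Class labels and frequencies from Example 2.1.1.1.
labels = ["195-199", "200-204", "205-209", "210-214", "215-219", "220-224"]
frequencies = [1, 3, 4, 7, 4, 1]

def text_histogram(class_labels, counts, symbol="#"):
    """Return one text line per class; the bar length equals the class frequency."""
    return [f"{label} | {symbol * count} ({count})"
            for label, count in zip(class_labels, counts)]

lines = text_histogram(labels, frequencies)
print("\n".join(lines))
```

Even in this crude form the bars rise to a peak at the class 210-214 and fall away on both sides, which is the kind of shape information a histogram is meant to convey.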


Figure 2.1.3.3

2.1.4 Cumulative Relative Frequency Distribution and the Ogive Curve

Let’s first see what a cumulative relative frequency distribution is.

Cumulative Frequency:

The cumulative frequency of a class is obtained by adding all the frequencies up to and
including that class. The cumulative relative frequency is then calculated by dividing the
cumulative frequency by the total frequency.


Cumulative Relative Frequency Distribution:

This is a table which shows the classes and the relevant cumulative relative frequency.

Therefore, the cumulative relative frequency distribution for Example 2.1.1.1 is:

Blood Sugar Level Cumulative Relative Frequency

195-199 0.05
200-204 0.20
205-209 0.40
210-214 0.75
215-219 0.95
220-224 1.00
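The running totals behind this table can be sketched in Python with `itertools.accumulate` from the standard library. The variable names are illustrative, and the frequencies are those of Example 2.1.1.1.

```python
from itertools import accumulate

# Class labels and frequencies from Example 2.1.1.1.
labels = ["195-199", "200-204", "205-209", "210-214", "215-219", "220-224"]
frequencies = [1, 3, 4, 7, 4, 1]

# Cumulative frequency: running total of the frequencies up to and including each class.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: cumulative frequency / total frequency.
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

for label, cr in zip(labels, cumulative_relative):
    print(f"{label}  {cr:.2f}")
```

The last cumulative relative frequency is always 1, because by the final class every observation has been counted.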

The Ogive Curve:

The ogive curve is obtained by plotting the cumulative relative frequency against the
upper class limit of the corresponding class. The procedure for constructing the ogive
curve is as follows:


a. Plot the cumulative relative frequency of the classes against the upper class
limit.

b. Join the plotted points by straight lines.

c. Close the graph by extending a straight line to the lower limit of the first class.

The Ogive Curve for the example 2.1.1.1 is given below.

Figure 2.1.4.1

The approximate proportion of observations that are less than any given value on the
horizontal axis can be read from the graph very easily.

For example, we can estimate that the proportion of patients with blood sugar levels less
than 211 is approximately 59%.
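Reading a value off the ogive amounts to interpolating along the straight lines that join the plotted points. The sketch below is one way to do this in Python, assuming the points are placed exactly at the upper class limits and the curve is closed at the lower limit of the first class, as in the procedure above. The function name `read_ogive` is hypothetical, and the exact proportion from the raw data is computed alongside for comparison.

```python
# Raw data of Example 2.1.1.1, used only to compute the exact proportion.
data = [212, 221, 210, 218, 217, 207, 210, 203, 208, 210,
        210, 199, 215, 209, 213, 208, 200, 218, 202, 214]

# Ogive points: (lower limit of first class, 0), then
# (upper class limit, cumulative relative frequency) for each class.
ogive = [(195, 0.0), (199, 0.05), (204, 0.20), (209, 0.40),
         (214, 0.75), (219, 0.95), (224, 1.0)]

def read_ogive(points, x):
    """Estimate the proportion of observations below x by linear interpolation."""
    if x <= points[0][0]:
        return 0.0
    if x >= points[-1][0]:
        return 1.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (x - x0) / (x1 - x0) * (y1 - y0)

estimate = read_ogive(ogive, 211)
exact = sum(v < 211 for v in data) / len(data)

print(f"interpolated estimate: {estimate:.2f}")
print(f"exact proportion:      {exact:.2f}")
```

An estimate from the ogive is only approximate: it depends on where the points are plotted and on how carefully the graph is read, so it will generally differ a little from the exact proportion in the raw data.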


Figure 2.1.4.2

2.1.5 Stem and Leaf Diagram

A stem and leaf diagram is a widely used statistical technique for displaying a set of data.

Each numerical value is divided into two parts: a stem and a leaf.

The leading digits become the stem and the trailing digits become the leaf.

A stem can be one digit or multiple digits.

First, the stems are selected and recorded in a column. Then, for each stem, the data set is
searched to find the corresponding leaves. These leaves are then recorded next to the
stem.


The stem and leaf diagram for Example 2.1.1.1 is given below.

Stems Leaves
19 9
20 7 3 8 9 8 0 2
21 2 0 8 7 0 0 0 5 3 8 4
22 1
Figure 2.1.5.1

Here, we have selected two-digit stems.

The stem and leaf diagram resembles the frequency histogram, but it also displays the actual
values. This is an advantage of the stem and leaf display over the frequency histogram.

The numbers in the leaves can be ordered to form an ordered stem and leaf diagram.
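An ordered stem and leaf diagram for Example 2.1.1.1 can be built with the short Python sketch below. It assumes two-digit stems and one-digit leaves, so each value is split by dividing by 10; the function name `stem_and_leaf` and the `divisor` parameter are hypothetical.

```python
from collections import defaultdict

# Blood sugar levels of the 20 patients in Example 2.1.1.1.
data = [212, 221, 210, 218, 217, 207, 210, 203, 208, 210,
        210, 199, 215, 209, 213, 208, 200, 218, 202, 214]

def stem_and_leaf(values, divisor=10):
    """Group the leaves (value % divisor) under their stems (value // divisor).

    Sorting the values first means the leaves of each stem come out in order.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // divisor].append(v % divisor)
    return dict(stems)

display = stem_and_leaf(data)

for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Every observation appears exactly once as a leaf, so the number of leaves should equal the number of data items (20 here).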

2.1.6 Frequency Polygon and the Relative Frequency Polygon

To construct a frequency polygon, follow the steps given below.

a. Plot a point above each class mark at a height equal to the frequency of
the class.

b. Connect the points by straight lines.

If we plot the frequencies against the class marks, the graph is called a frequency polygon;
if we plot the relative frequencies against the class marks, the graph is called a relative
frequency polygon.

The frequency polygon for the data set given in Example 2.1.1.1 is:


Figure 2.1.6.1

The relative frequency polygon for the data set given in example 2.1.1.1 is,

[Relative Frequency Polygon: relative frequency (vertical axis, 0.00 to 0.40) plotted against the classes 195-199 to 220-224 (horizontal axis)]

Figure 2.1.6.2


Now, we have come to the end of this section.

To sum up, we studied several graphical techniques used for quantitative data. As you may
have noticed, each diagram displays different information. The selection of a suitable
graphical technique therefore depends heavily on the context and the purpose of the
statistical study under consideration.

Now let’s move on to qualitative graphical techniques.

2.2 Graphical Techniques for qualitative data

The most commonly used graphical techniques for qualitative data are:

Pie Chart
Bar Chart
Line Chart

2.2.1 Pie Chart

A Pie Chart is a circle subdivided into a number of slices that represent different categories
of data based on their proportion.

The entire circle corresponds to 360°.

Therefore, every 1% corresponds to 3.6° (e.g., 25% = 25 × 3.6° = 90°).

Example 2.2.1.1:

This example is taken directly from “Statistics for Management and Economics” by Keller and Warrack.


The student placement office at a university conducted a survey of last year’s business
school graduates to determine the general areas in which the graduates found jobs.
The placement office intended to use the data to help decide where to concentrate its
efforts in attracting companies to campus to conduct job interviews. Each graduate was
asked in which area he or she found a job. The areas of employment are Accounting
(1), Finance (2), General Management (3), Marketing (4) and other (5).

Area                 Number of Graduates   Proportion of Graduates
Accounting           73                    28.90%
Finance              52                    20.60%
General Management   36                    14.20%
Marketing            64                    25.30%
Other                28                    11.10%
Total                253                   100%

Since 1% corresponds to 3.6°, we can calculate the number of degrees
represented by each category, and then draw the pie chart. Don’t worry about
measuring and drawing the angles by hand; there are statistical packages
to do the drawing for you!
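The angle calculation for Example 2.2.1.1 can be sketched in Python as follows. This only computes the angle of each slice from the counts in the table; it does not draw the chart, and the variable names are illustrative.

```python
# Number of graduates per employment area, from Example 2.2.1.1.
graduates = {
    "Accounting": 73,
    "Finance": 52,
    "General Management": 36,
    "Marketing": 64,
    "Other": 28,
}

total = sum(graduates.values())

# Each slice takes the same share of the 360 degrees as its share of the total.
angles = {area: count / total * 360 for area, count in graduates.items()}

for area, angle in angles.items():
    share = graduates[area] / total * 100
    print(f"{area}: {share:.1f}% -> {angle:.1f} degrees")
```

Working from the raw counts rather than the rounded percentages keeps the angles adding up to exactly 360 degrees.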

The pie chart for Example 2.2.1.1 is given in Figure 2.2.1.1.


[Pie chart: Employment Areas. Accounting 29%, Finance 21%, General Management 14%, Marketing 25%, Other 11%]
Figure 2.2.1.1
Decision makers can use this information to make informed decisions.

2.2.2 Bar Chart


A bar chart shows the categories on the horizontal axis and represents the frequency of
each category by the height of a bar on the vertical axis.
The bar chart for Example 2.2.1.1 is given below.

[Bar chart: Employment Areas. Horizontal axis: Area; vertical axis: Frequency (Number of Graduates). Accounting 73, Finance 52, General Management 36, Marketing 64, Other 28]

Figure 2.2.2.1


2.2.3 Line Chart

To construct a line chart,

o The frequency of a particular category is plotted as a point above the
horizontal axis.

o This should be done for all categories.

o Then, the points are joined by straight lines.

This is very similar to the frequency polygon except for the fact that the horizontal axis
contains categories instead of class intervals.
Example 2.2.3.1:

The table given below lists the number of new recruits to the US Army per year between
1989 and 1994.


The line chart for the above example is:

Figure 2.2.3.1

2.3 Shape of a Distribution

In statistical analysis it is very important to have an idea about the shape of the
distribution of a set of data. This knowledge can then be used to choose appropriate
statistical techniques.

To get an idea of the shape of the distribution, a smooth curve is first drawn through the
frequency histogram.


Figure 2.3.1

To determine the shape, a line is drawn down the centre of the distribution.

The shape of the distribution is:

Symmetrical: if the two sides have identical shapes (mirror images)
Right Skewed (Positively Skewed): if there is a long tail to the right
Left Skewed (Negatively Skewed): if there is a long tail to the left

Distribution shapes can be classified further, as shown in the figure below. At this
level, however, the three shapes above are sufficient.


Figure 2.3.2

We have come to the end of Lesson 2, and now it’s time for you to try the quiz!


Summary
In this lesson you learnt the importance of graphical techniques and you
were introduced to several graphical techniques which can be used with
qualitative/quantitative data.

Further Reading :
Anderson, D.R., Sweeney, D.J., Williams, T.A., 2007. Statistics for Business and Economics.
Chapter 2.
