Graphical Techniques: Lesson 02
2.0 Introduction
In the previous lesson we discussed how Statistics involves transforming data into
information, and how it deals with summarizing, interpreting and presenting that
information. A wide range of techniques is used to achieve this transformation,
interpretation and presentation. These techniques are called “Statistical Techniques”.
The ability to use Statistical Techniques is very important in your professional life.
Figure 2.0.0
During this lesson, our focus is on graphical techniques which are used mainly in descriptive
statistics.
Learning outcomes
After completion of this lesson, you will be able to draw different types of graphs
for different types of data. You will also be able to interpret a graph or a
distribution.
It is far more convenient to make decisions when information is presented graphically.
Information is also simpler and easier to understand when it is presented graphically.
Therefore, graphical techniques play a major role in descriptive statistics in interpreting,
summarizing and disseminating information.
Tables, as well as graphs, are usually categorized as graphical techniques.
In this lesson we will study some of the graphical techniques used with quantitative data:
Frequency Distribution
Relative Frequency Distribution
Frequency Histogram
Relative frequency Histogram
Cumulative Relative Frequency Distribution and the Ogive Curve
Stem and Leaf Diagram
Frequency Polygon and the Relative Frequency Polygon
A frequency distribution is a table-like arrangement that groups data into classes and
records the frequency of each class. Let’s look at an example.
Example 2.1.1.1:
A physician checked the blood sugar levels of 20 patients and recorded the results given
below.
Answer:
The frequency distribution table for the above set of data is given below.
Examine how the blood sugar levels have been classified into classes and how the
frequency of each class has been counted.
Blood sugar level    Frequency
195-199              1
200-204              3
205-209              4
210-214              7
215-219              4
220-224              1
The smallest value that can fall into a class is called the lower class limit of that class.
The largest value that can fall into a class is called the upper class limit of that class.
The class mark is the point that divides a class into two equal parts. It is the average
of the upper and lower class limits; for the class 195-199, the class mark is
(195 + 199) / 2 = 197.
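The class mark calculation can be sketched in a few lines of Python. This is only an illustration: the class limits are those of the blood sugar frequency table above, and the function name class_mark is our own choice, not part of the lesson.

```python
# Class limits (lower, upper) taken from the blood sugar frequency table.
classes = [(195, 199), (200, 204), (205, 209),
           (210, 214), (215, 219), (220, 224)]


def class_mark(lower, upper):
    """Return the class mark: the average of the two class limits."""
    return (lower + upper) / 2


class_marks = [class_mark(lower, upper) for lower, upper in classes]
print(class_marks)  # [197.0, 202.0, 207.0, 212.0, 217.0, 222.0]
```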
You must be wondering how to determine the number of classes required and the length of
a class interval.
The class intervals used in a frequency distribution should be of equal length. However,
there is no hard and fast rule for determining the number of classes or the length of each
class. In general, it is good to have a small number of classes when the number of
observations is small and a large number of classes when the number of observations is large.
Now you have learned how to select class intervals. Next, to find the frequency of each
class, count the number of patients whose blood sugar levels fall into each class interval,
and record the frequency of each class against its class interval in a table. This table is
called the frequency distribution table.
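The counting step can be sketched as follows. The readings in this sketch are hypothetical values invented for illustration, not the physician's data; only the class intervals (starting at 195, five whole values wide, six classes) come from the example. The function and constant names are our own.

```python
# Hypothetical readings, invented only for illustration (NOT the lesson's data).
readings = [198, 201, 203, 207, 209, 211, 212, 212, 216, 221]

FIRST_LOWER_LIMIT = 195  # lower class limit of the first class
CLASS_WIDTH = 5          # 195-199, 200-204, ... each cover 5 whole values
NUM_CLASSES = 6


def frequency_distribution(values):
    """Count how many values fall into each equal-width class."""
    counts = [0] * NUM_CLASSES
    for v in values:
        index = (v - FIRST_LOWER_LIMIT) // CLASS_WIDTH
        if not 0 <= index < NUM_CLASSES:
            raise ValueError(f"{v} lies outside the class intervals")
        counts[index] += 1
    table = []
    for i, count in enumerate(counts):
        lower = FIRST_LOWER_LIMIT + i * CLASS_WIDTH
        upper = lower + CLASS_WIDTH - 1
        table.append((f"{lower}-{upper}", count))
    return table


table = frequency_distribution(readings)
for label, count in table:
    print(f"{label}  {count}")
```

Because the classes are of equal length, the class of a value can be found by integer division instead of comparing it against every interval.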
A frequency distribution table gives us a quick overall idea of the data set. For
example, we can see at once that the interval 210-214 contains more of the sampled
patients (7 of 20) than any other class.
Here, we use the relative frequency of a class, not the absolute frequency. The relative
frequency of a class is obtained by dividing the class frequency by the total frequency.
The total frequency is the total number of data items (in our example, the total number of
patients in the sample, which is 20), i.e.
relative frequency of a class = (class frequency / total frequency)
A relative frequency distribution records the relative frequency of each class interval
against the class interval.
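As a small sketch of this formula, the relative frequencies for example 2.1.1.1 can be computed from the class frequencies in the frequency distribution table. The variable names are our own.

```python
# Class frequencies from the blood sugar frequency table (example 2.1.1.1).
frequencies = {"195-199": 1, "200-204": 3, "205-209": 4,
               "210-214": 7, "215-219": 4, "220-224": 1}

total = sum(frequencies.values())  # total frequency: 20 patients

# relative frequency of a class = class frequency / total frequency
relative = {label: f / total for label, f in frequencies.items()}

for label, r in relative.items():
    print(f"{label}  {r:.2f}")
```

The relative frequencies always add up to 1, which is a quick check on the arithmetic.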
A Histogram is a graph in which the classes are marked on the horizontal axis and the
class frequencies on the vertical axis. The class frequencies are represented by the
heights of the bars and the bars are drawn adjacent to each other.
Figure 2.1.3.1
Here, we have marked the lower class limit of each class along the X axis. For example,
the lower class limit of the class 195-199 is 195, the lower class limit of the class
200-204 is 200, and so on. Although at first glance it may seem that the first class
interval is 195-200, that is not the case: the first bar represents the frequency of the
class 195-199.
Alternatively, we can mark the complete class (with both lower and upper class limits)
along the X axis to avoid confusion. Look at the histogram below, where we have used the
complete class intervals instead of lower class limits.
Figure 2.1.3.2
We can draw a relative frequency histogram in the same way. The only difference is that
the Y axis records relative frequency instead of frequency.
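Without a plotting tool, a rough text version of the histogram can still be produced. The sketch below is only an illustration: it draws the bars sideways, one star per patient, and prints the relative frequency next to each bar. The class frequencies are those of example 2.1.1.1; the function name is our own.

```python
# Class frequencies from the blood sugar frequency table (example 2.1.1.1).
frequencies = [("195-199", 1), ("200-204", 3), ("205-209", 4),
               ("210-214", 7), ("215-219", 4), ("220-224", 1)]


def histogram_rows(table):
    """Build one text row per class: label, bar, relative frequency."""
    total = sum(f for _, f in table)
    width = max(f for _, f in table)  # pad every bar to the longest one
    rows = []
    for label, f in table:
        bar = "*" * f  # bar length equals the class frequency
        rows.append(f"{label} | {bar:<{width}} {f / total:.2f}")
    return rows


rows = histogram_rows(frequencies)
print("\n".join(rows))
```

The bar lengths are the same whether the scale is frequency or relative frequency; only the numbers on the scale change.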
Histograms are important because they can be used to determine the shape of the
distribution. To make this easier, we can draw a smooth curve through the histogram.
Figure 2.1.3.3
Cumulative Frequency:
The cumulative frequency of a class is obtained by adding all the frequencies up to and
including that class. The cumulative relative frequency is then calculated by dividing the
cumulative frequency by the total frequency.
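The cumulative relative frequency distribution is a table which shows each class and its
cumulative relative frequency. The cumulative relative frequency distribution for
example 2.1.1.1, worked out from the frequency distribution table, is:
Class      Cumulative frequency   Cumulative relative frequency
195-199    1                      0.05
200-204    4                      0.20
205-209    8                      0.40
210-214    15                     0.75
215-219    19                     0.95
220-224    20                     1.00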
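The cumulative figures for example 2.1.1.1 follow directly from the class frequencies. A minimal sketch of the arithmetic, using Python's itertools.accumulate to form the running totals (the variable names are our own):

```python
from itertools import accumulate

# Classes and class frequencies from example 2.1.1.1.
classes = ["195-199", "200-204", "205-209", "210-214", "215-219", "220-224"]
frequencies = [1, 3, 4, 7, 4, 1]

total = sum(frequencies)  # total frequency: 20

# Running total of the frequencies up to and including each class.
cumulative = list(accumulate(frequencies))

# Divide each cumulative frequency by the total frequency.
cumulative_relative = [c / total for c in cumulative]

for label, c, cr in zip(classes, cumulative, cumulative_relative):
    print(f"{label}  {c:>2}  {cr:.2f}")
```

The last class always has a cumulative relative frequency of 1, because by then every observation has been counted.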
The ogive curve is obtained by plotting the cumulative relative frequency against the
upper class limit of the corresponding class. The procedure for constructing the ogive
curve is:
a. Plot the cumulative relative frequency of each class against its upper class limit.
b. Join the plotted points.
c. Close the graph by extending a straight line to the lower limit of the first class.
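The coordinates needed for the ogive of example 2.1.1.1 can be listed with a short sketch. It only computes the points to plot; drawing them is left to graph paper or a plotting tool of your choice. The variable names are our own.

```python
from itertools import accumulate

# Class limits (lower, upper) and class frequencies from example 2.1.1.1.
class_limits = [(195, 199), (200, 204), (205, 209),
                (210, 214), (215, 219), (220, 224)]
frequencies = [1, 3, 4, 7, 4, 1]

total = sum(frequencies)
cumulative_relative = [c / total for c in accumulate(frequencies)]

# Step a: one point per class at (upper class limit, cumulative relative frequency).
points = [(upper, cr) for (_, upper), cr in zip(class_limits, cumulative_relative)]

# Step c: close the graph at the lower limit of the first class,
# where the cumulative relative frequency is still zero.
points.insert(0, (class_limits[0][0], 0.0))

for x, y in points:
    print(f"({x}, {y:.2f})")
```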
Figure 2.1.4.1
The approximate proportion of observations that are less than any given value on the
horizontal axis can be read from the graph very easily.
For example, we can estimate that the proportion of patients with blood sugar levels less
than 211 is approximately 59%.
Figure 2.1.4.2
A stem and leaf diagram is a widely used statistical technique for displaying a set of data.
Each numerical value is divided into two parts: a stem and a leaf.
The leading digits become the stem and the trailing digit becomes the leaf.
First, stems are selected and recorded as a column. Then for each stem, the data set is
searched to find the corresponding leaves. These leaves are then recorded opposite to the
stem.
The stem and leaf diagram for the example 2.1.1.1 is given below.
Stems Leaves
19 9
20 9 7 0 8 3
21 0 8 2 0 0 3 8 4
22 1
Figure 2.1.5.1
The stem and leaf diagram resembles the frequency histogram, but it also displays the
actual figures. This is an advantage of the stem and leaf display over the frequency
histogram. The numbers in the leaves can be sorted to form an ordered stem and leaf diagram.
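The construction of an ordered stem and leaf diagram can be sketched as below. The values are the ones that can be read off the diagram in Figure 2.1.5.1 (stem 19 with leaf 9 is 199, and so on); splitting each value with divmod(value, 10) assumes that the last digit is the leaf. The function name is our own.

```python
from collections import defaultdict

# Values read off the stem and leaf diagram in Figure 2.1.5.1.
values = [199, 209, 207, 200, 208, 203,
          210, 218, 212, 210, 210, 213, 218, 214, 221]


def ordered_stem_and_leaf(data):
    """Group values by stem (leading digits) and sort the leaves (last digit)."""
    stems = defaultdict(list)
    for v in data:
        stem, leaf = divmod(v, 10)  # e.g. 218 -> stem 21, leaf 8
        stems[stem].append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(stems.items())}


diagram = ordered_stem_and_leaf(values)
for stem, leaves in diagram.items():
    print(stem, " ".join(str(leaf) for leaf in leaves))
```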
The frequency polygon for the data set given in example 2.1.1.1 is,
Figure 2.1.6.1
The relative frequency polygon for the data set given in example 2.1.1.1 is,
[Relative frequency polygon: vertical axis Relative Frequency (0.00 to 0.40); horizontal axis classes 195-199, 200-204, 205-209, 210-214, 215-219, 220-224]
Figure 2.1.6.2
Pie Chart
Bar Chart
Line Chart
A Pie Chart is a circle subdivided into a number of slices that represent different categories
of data based on their proportion.
Example 2.2.1.1:
This example has been taken directly from “Statistics for Management and Economics” by Keller and Warrack.
The student placement office at a university conducted a survey of last year’s business
school graduates to determine the general areas in which the graduates found jobs.
The placement office intended to use the data to help decide where to concentrate its
efforts in attracting companies to campus to conduct job interviews. Each graduate was
asked in which area he or she found a job. The areas of employment are Accounting
(1), Finance (2), General Management (3), Marketing (4) and other (5).
Area                  Number of Graduates   Proportion of Graduates
Accounting            73                    28.9%
Finance               52                    20.6%
General Management    36                    14.2%
Marketing             64                    25.3%
Other                 28                    11.1%
Total                 253                   100%
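The proportions in the table, and the size of each slice of a pie chart, can be worked out as in the sketch below. The graduate counts are from example 2.2.1.1; converting a proportion into a slice angle by multiplying by 360 degrees is the usual convention for drawing a pie chart by hand, and the variable names are our own.

```python
# Number of graduates per employment area (example 2.2.1.1).
graduates = {"Accounting": 73, "Finance": 52, "General Management": 36,
             "Marketing": 64, "Other": 28}

total = sum(graduates.values())  # 253 graduates

percentages = {}
angles = {}
for area, count in graduates.items():
    proportion = count / total
    percentages[area] = round(100 * proportion, 1)  # share of graduates, in %
    angles[area] = 360 * proportion                 # slice angle, in degrees

for area in graduates:
    print(f"{area:<20} {percentages[area]:>5}%  {angles[area]:6.1f} degrees")
```

The rounded percentages add up to 100.1% rather than exactly 100% because each one is rounded separately.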
The Pie Chart for the example 2.2.1.1 is given by the Figure 2.2.1.1,
[Pie chart "Employment Areas": Accounting 29%, Finance 21%, General Management 14%, Marketing 25%, Other 11%]
Figure 2.2.1.1
Decision makers, such as the placement office in this example, can use this information
to decide where to concentrate their efforts.
[Bar chart "Employment Areas": vertical axis Frequency (Number of Graduates, 0 to 80); horizontal axis Area; bars: Accounting 73, Finance 52, General Management 36, Marketing 64, Other 28]
Figure 2.2.2.1
A line chart is very similar to the frequency polygon, except that the horizontal axis
contains categories instead of class intervals.
Example 2.2.3.1:
The table given below lists the number of new recruits to the US Army per year from
1989 to 1994.
Figure 2.2.3.1
In statistical analysis it is very important to have an idea about the shape of the
distribution of a set of data. This knowledge can then be used to choose appropriate
statistical techniques.
To get an idea of the distribution shape, first a smooth curve is drawn through the
frequency histogram.
Figure 2.3.1
To determine the shape, a line is drawn down the centre of the distribution.
The distribution shapes can be classified further, as shown in the figure below. However,
at your level, the above three shapes are sufficient.
Figure 2.3.2
We have come to the end of lesson 2, and now it’s time for you to try the quiz!
Summary
In this lesson you learnt the importance of graphical techniques and you
were introduced to several graphical techniques which can be used with
qualitative/quantitative data.
Further Reading:
Anderson, D.R., Sweeney, D.J., Williams, T.A., 2007. Statistics for Business and Economics.
Chapter 2.