Lecture#4

DATA PRESENTATION

Instructor: Sulaman
Methods for Data Presentations
• Classification of Data
• Bases of Classification
• Types of Classification
• Tabulation of Data
• Types of Tabulations
• Constructing a Statistical Table
• General Rules of Tabulations
• Table of Frequency Distributions
• Frequency Distribution
• Relative frequency distribution
• Cumulative frequency distribution
Organizing Data
After collecting data, the first task for a researcher is to organize and
simplify the data so that it is possible to get a general overview of
the results.
Raw Data: Data which is not organized is called raw data.
Ungrouped Data: Data in its original form is called ungrouped data.

Note: Raw data is also called ungrouped data.


Different Ways of Organizing Data
To get an understanding of the data, it is organized and
arranged into a meaningful form.
This is done by the following methods:

• Classification
• Tabulation (simple tables, frequency tables, stem and leaf plots, etc.)
• Graphs (bar graph, pie chart, histogram, frequency ogive, etc.)

Classification of Data
The process of arranging data into homogeneous groups or classes
according to some common characteristics present in the data is
called classification.

Example:
When letters are sorted in a post office, they are first classified
according to city and then further arranged by street.

Bases of Classification
There are four important bases of classification.

• Qualitative Base
• Quantitative Base
• Geographical Base
• Chronological or Temporal Base

Qualitative Base
When the data are classified according to some quality or attribute,
such as gender, religion, etc.

Quantitative Base
When the data are classified by quantitative characteristics like
heights, weights, ages, income etc.

Geographical Base
When the data are classified by geographical regions or location,
like states, provinces, cities etc.

Chronological or Temporal Base
When the data are classified or arranged by their time of
occurrence, such as years, months, weeks, days, etc. (e.g., time
series data).

Types of Classification
There are three main types of classification.

• One-way Classification
• Two-way Classification
• Multi-way Classification

One-way Classification
If we classify observed data according to a single characteristic, this
type of classification is known as one-way classification.

Example
The population of the world may be classified by gender as male and
female.

Two-way Classification
If we consider two characteristics at a time in order to classify the
observed data, then we are doing two-way classification.

Example
The population of the world may be classified by religion and gender.

Multi-way Classification
If we consider more than two characteristics at a time in order to
classify the observed data, then we are doing multi-way classification.

Example
The population of the world may be classified by religion, gender and
literacy.

Tabulation of Data
• The process of placing classified data into tabular form is known
as tabulation.
• A table is a systematic arrangement of statistical data in rows and
columns.
• Rows are horizontal arrangements whereas columns are vertical
arrangements.

Types of Tabulation
There are three main types of tabulation.

• Simple or One-way Table
• Double or Two-way Table
• Complex or Multi-way Table

Simple or One-way Table
When the data are tabulated according to one characteristic, it is said to be
simple tabulation or one-way tabulation.

Example
Tabulation of data on the population of the world classified by one
characteristic, like religion, is an example of simple tabulation.

Double or Two-way Table
When the data are tabulated according to two characteristics at a
time, it is said to be double tabulation or two-way tabulation.

Example
Tabulation of data on the population of the world classified by two
characteristics, like religion and gender, is an example of double
tabulation.

Complex or Multi-way Table
When the data are tabulated according to many characteristics
(generally more than two), it is said to be complex tabulation.

Example
Tabulation of data on the population of the world classified by three
characteristics, like religion, gender and literacy, is an example of
complex tabulation.

Construction of Statistical Table
A statistical table has at least four major parts and some
other minor parts.
• The Title
• The Box Head (column captions)
• The Stub (row captions)
• The Body
• Prefatory Notes
• Foot Notes
• Source Notes
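General Sketch of Table

                    THE TITLE
                (Prefatory Notes)

Stub           |  Box Head
               |  Column Caption    Column Caption
---------------+-----------------------------------
Row Caption    |
Row Caption    |  The Body
Row Caption    |

Foot Notes
Source Notes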

THE TITLE
• A title is the main heading, written in capital letters and shown
at the top of the table.
• It must explain the contents of the table and throw light on the table
as a whole.
• Different parts of the heading can be separated by commas, and no full
stop should be used in the title.

Box Head
The main heading of the columns.

Column Caption
• The vertical subheadings of the columns are called column
captions.
• Only the first letter of the box head is in capital letters and the
remaining words must be written in small letters.

Stub
The main heading of the rows.

Row Caption
• The horizontal subheadings of the rows are called row captions.
• Only the first letter of the stub is in capital letters and the
remaining words must be written in small letters.

The Body
It is the main part of the table, which contains the numerical
information classified with respect to row and column captions.

Prefatory Notes
A statement given below the title and enclosed in brackets, usually
describing the units of measurement.

Foot Notes
Foot notes appear immediately below the body of the table and provide
further explanation.

Source Notes
The source note is given at the end of the table, indicating the source
from which the information has been taken.

General Rules of Tabulation
• A table should be simple and attractive. A complex table may
be broken into relatively simple tables.
• Headings for columns and rows should be proper and clear.
• Suitable approximation may be adopted and figures may be
rounded off, but this should be mentioned in the prefatory
note or in the foot note.
• The unit of measurement and nature of data should be well
defined.

Organizing Data via Frequency Tables
One method for simplifying and organizing data is to construct a
frequency distribution.

Frequency Distribution
The organization of a set of data in a table showing the distribution
of the data into classes or groups, together with the number of
observations in each class or group, is called a Frequency
Distribution.

Class Frequency
The number of observations falling in a particular class is called
class frequency or simply frequency, denoted by ‘f’.

Why Use Frequency Distribution?
• A frequency distribution is a way to summarize data.
• A frequency distribution condenses the raw data into a more
meaningful form.
• A frequency distribution allows for a quick visual interpretation
of the data.
• Frequency Distributions can be drawn for qualitative data as
well as quantitative data.

Frequency Distribution of Discrete Data
Example: Number of children in 20 families.
2, 3, 1, 3, 2, 5, 4, 1, 4, 2, 3, 5, 2, 5, 2, 1, 3, 1, 2, 0
Construct an ungrouped (discrete) frequency distribution.

No. of Children   Tally     No. of Families (frequency) f
0                 |         1
1                 ||||      4
2                 |||| |    6
3                 ||||      4
4                 ||        2
5                 |||       3
Total                       20
Interpretation
There is 1 family with no children.
There are 4 families with one child.
There are 6 families with two children.
There are 4 families with three children.
There are 2 families with four children.
There are 3 families with five children.
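
The counting behind the table can be sketched in Python. This is only an illustration, assuming the 20 values are typed in exactly as listed in the example; the variable names `children` and `frequency` are hypothetical and not part of the lecture.

```python
from collections import Counter

# Number of children in the 20 families, copied from the example.
children = [2, 3, 1, 3, 2, 5, 4, 1, 4, 2, 3, 5, 2, 5, 2, 1, 3, 1, 2, 0]

# Counter maps each distinct value to the number of times it occurs,
# which is exactly the class frequency f of that value.
frequency = Counter(children)

print("No. of Children  Frequency (f)")
for value in sorted(frequency):
    print(f"{value:>15}  {frequency[value]:>13}")
print(f"{'Total':>15}  {sum(frequency.values()):>13}")
```

Sorting the distinct values before printing gives the rows in the same order as the hand-made table.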

Grouped Frequency Distribution
• Sometimes, when the data are continuous or cover a wide
range of values, it becomes very burdensome to list
all values, because the list would be too long.
• To remedy this situation, a grouped frequency distribution
table is used.

Grouped Frequency Distribution for Continuous Data

Example (Temperature Data)
The temperature on 20 winter days in Pakistan is recorded below:
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43,
44, 27, 53, 27.
Construct a frequency distribution table.
Note: Temperature is a continuous variable because it could be
measured to any degree of precision desired.

Steps in Constructing Grouped Frequency Distribution
• Sort raw data from low to high.
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46,
53, 58
• Find the range.
Range = Maximum value – Minimum value
= 58 – 12 = 46
• Select the number of classes: 5 (usually between 5 and 20).
• Compute the class width.
Class width = Range / no. of classes = 46/5 = 9.2 ≈ 10
• Determine the class limits: 11-20, 21-30, 31-40, 41-50, 51-60
(Note: the above classes should cover the full data.)
• Count the number of values in each class.

Frequency Distribution of Grouped Data
Sorted data: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41,
43, 44, 46, 53, 58

FREQUENCY DISTRIBUTION
(Temp Data)
Classes   Tally      Frequency (f)
11-20     |||        3
21-30     |||| ||    7
31-40     ||||       4
41-50     ||||       4
51-60     ||         2
Total                20
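
The construction steps for the temperature data can be traced in a short Python sketch. Two choices here are assumptions made for illustration: `math.ceil` stands in for the lecture's rounding of 9.2 up to 10, and `first_lower_limit = 11` is simply copied from the class limits chosen in the lecture rather than derived by a rule.

```python
import math

# Temperature data from the example (20 winter days).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5
data_range = max(temps) - min(temps)               # 58 - 12 = 46

# Assumption: round the raw width 46 / 5 = 9.2 up to the next whole number.
class_width = math.ceil(data_range / num_classes)  # 10

# Assumption: the first lower limit is taken from the lecture's classes.
first_lower_limit = 11
classes = [
    (first_lower_limit + i * class_width,
     first_lower_limit + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

# Count how many observations fall inside each class (limits inclusive).
frequencies = [sum(lower <= t <= upper for t in temps)
               for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}: {f}")
print("Total:", sum(frequencies))
```

Because the class limits are inclusive whole numbers, every observation falls in exactly one class and the frequencies add up to 20.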
Frequency Distribution for Qualitative Data
Political Party Affiliations: Professor X asked his introductory statistics
students to state their political party affiliation as PML-N (N), PPP (P), PTI, or
JUI. The responses of the 30 students in the class are:
PPP, PTI, N, JUI, PTI, N,
PTI, PPP, JUI, PTI, JUI,
PTI, JUI, PPP, PTI, N,
PTI, JUI, N, JUI, PPP,
JUI, PTI, JUI, N, JUI,
PTI, JUI, N, PPP.
Construct a frequency distribution.

Classes   Tally        Frequency (f)
JUI       |||| ||||    10
PTI       |||| ||||    9
PML-N     |||| |       6
PPP       ||||         5
Total                  30

Interpretation
Out of 30 students in the class,
10 are in favor of JUI,
9 are in favor of PTI,
6 are in favor of PML-N, and
5 are in favor of PPP.

Relative Frequency Distribution
• Relative frequency is the ratio of the frequency to the total
number of observations.

Relative frequency= frequency/total no. of observations

Example:
Relative frequency of students who favored JUI =10/30= 0.3333=33.33%
Relative frequency of students who favored PTI =9/30= 0.3=30%
Relative frequency of students who favored PML-N =6/30= 0.2=20%
Relative frequency of students who favored PPP =5/30= 0.1667=16.67%

Frequency Distribution of Qualitative Data
Party Affiliation Example:
Classes   Frequency (f)   Relative Frequency
JUI       10              10/30 = 0.3333
PTI       9               9/30 = 0.3
PML-N     6               6/30 = 0.2
PPP       5               5/30 = 0.1667
Total     30              1

Interpretation: Out of 30 students in the class,
33.33% are in favor of JUI,
30% are in favor of PTI,
20% are in favor of PML-N, and
16.67% are in favor of PPP.
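
As a rough check of the party affiliation example, the frequencies and relative frequencies can be recomputed in Python. The `responses` list below is an assumed transcription of the 30 answers as read from the slide, with "N" standing for PML-N; the names `counts` and `relative` are hypothetical.

```python
from collections import Counter

# The 30 student responses ("N" is the lecture's code for PML-N).
responses = [
    "PPP", "PTI", "N", "JUI", "PTI", "N",
    "PTI", "PPP", "JUI", "PTI", "JUI",
    "PTI", "JUI", "PPP", "PTI", "N",
    "PTI", "JUI", "N", "JUI", "PPP",
    "JUI", "PTI", "JUI", "N", "JUI",
    "PTI", "JUI", "N", "PPP",
]

counts = Counter(responses)
total = sum(counts.values())

# Relative frequency = frequency / total number of observations.
relative = {party: f / total for party, f in counts.items()}

print("Class  f   Relative  Percent")
for party, f in counts.most_common():
    print(f"{party:<5}  {f:<3} {relative[party]:.4f}    {relative[party]:.2%}")
print(f"Total  {total}")
```

The relative frequencies always sum to 1, which is a quick way to catch a counting mistake.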

Class Boundary
Constructing Class Boundaries:
Take the difference between the lower limit of the second class and the
upper limit of the first class (e.g., 21 – 20 = 1), then divide this
difference by 2 (i.e., 1/2 = 0.5). Subtract the resulting number (i.e., 0.5)
from the lower class limit of each class and add it to
the upper class limit of each class. The newly obtained classes are
called Class Boundaries (C.B.).

Classes Class Boundaries Frequency (f)
11-20 10.5-20.5 3
21-30 20.5-30.5 7
31-40 30.5-40.5 4
41-50 40.5-50.5 4
51-60 50.5-60.5 2
Total 20
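
The rule for class boundaries can be written as a few lines of Python. This is a minimal sketch that assumes equally spaced classes, so the gap between the first two classes applies to all of them; `class_limits`, `half_gap` and `boundaries` are illustrative names.

```python
# Class limits of the temperature data, as (lower, upper) pairs.
class_limits = [(11, 20), (21, 30), (31, 40), (41, 50), (51, 60)]

# Gap between the lower limit of the second class and the upper
# limit of the first class: 21 - 20 = 1, and half of it is 0.5.
gap = class_limits[1][0] - class_limits[0][1]
half_gap = gap / 2

# Subtract the half gap from every lower limit and add it to every upper limit.
boundaries = [(lower - half_gap, upper + half_gap)
              for lower, upper in class_limits]

for (lower, upper), (lb, ub) in zip(class_limits, boundaries):
    print(f"{lower}-{upper}  ->  {lb}-{ub}")
```

Each upper boundary coincides with the next lower boundary, so the classes join without gaps.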

Cumulative Frequency Distribution
Cumulative Frequency:
The total frequency of a variable from one end up to a certain value
(usually the upper class boundary in grouped data), called the base, is
known as the cumulative frequency less than or more than the base of
the variable.
Cumulative Frequency Distribution:
The table showing cumulative frequencies is called cumulative
frequency distribution.

Less than Cumulative Frequency Distribution

Frequency distribution of temperature data
Classes   Class Boundaries   Frequency (f)
11-20     10.5-20.5          3
21-30     20.5-30.5          7
31-40     30.5-40.5          4
41-50     40.5-50.5          4
51-60     50.5-60.5          2
Total                        20

Less than cumulative frequency distribution of temperature data
Class Boundaries   Cumulative Frequency
Less than 10.5     0
Less than 20.5     0+3=3
Less than 30.5     3+7=10
Less than 40.5     10+4=14
Less than 50.5     14+4=18
Less than 60.5     18+2=20

More than Cumulative Frequency Distribution

More than cumulative frequency distribution of temperature data
Class Boundaries   Cumulative Frequency
More than 10.5     20
More than 20.5     20-3=17
More than 30.5     17-7=10
More than 40.5     10-4=6
More than 50.5     6-4=2
More than 60.5     2-2=0
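
Both cumulative distributions can be derived from the class frequencies 3, 7, 4, 4, 2 of the temperature data. The sketch below uses `itertools.accumulate` for the running totals; the variable names are hypothetical.

```python
from itertools import accumulate

# Class boundaries and class frequencies of the temperature data.
boundaries = [10.5, 20.5, 30.5, 40.5, 50.5, 60.5]
frequencies = [3, 7, 4, 4, 2]
total = sum(frequencies)

# "Less than" cumulative frequencies: running totals, starting from 0
# because no observation lies below the first boundary.
less_than = [0] + list(accumulate(frequencies))

# "More than" cumulative frequencies: whatever is not below the boundary.
more_than = [total - c for c in less_than]

for b, lt, mt in zip(boundaries, less_than, more_than):
    print(f"Less than {b}: {lt:>2}    More than {b}: {mt:>2}")
```

At every boundary the two cumulative frequencies add up to the total of 20.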
Stem and Leaf Plot
Disadvantage of Frequency Table
An obvious disadvantage of using a frequency table is that the identity
of individual observations is lost in the grouping process.

A stem and leaf plot provides a solution by offering a quick and
clear way of sorting and displaying data simultaneously.

Stem and Leaf Plot
METHOD
• Sort the data series
• Separate the sorted data series into leading digits (the stem)
and trailing digits (the leaf),
e.g., in 13, the leading digit (stem) is 1 and the trailing digit (leaf) is
3, and in 21, the leading digit (stem) is 2 and the trailing digit (leaf)
is 1.
• List all stems in a column from low to high
• For each stem, list all associated leaves

Example: Consider the temp data again.
The sorted data from low to high is shown below:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46,
53, 58
Here, use the 10’s digit for the stem unit. For example, 13, 21 and 35
are split as follows:

Stem | Leaf
1    | 3
2    | 1
3    | 5

Stem and Leaf Plot

Stem | Leaf
1    | 2 3 7
2    | 1 4 4 6 7 7
3    | 0 2 5 7 8
4    | 1 3 4 6
5    | 3 8
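
The method can also be sketched in Python for two-digit data such as the temperatures. This assumes the tens digit is the stem and the units digit is the leaf, obtained with `divmod(value, 10)`; the name `stems` is illustrative.

```python
from collections import defaultdict

# Temperature data from the example.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Map each stem (tens digit) to the list of its leaves (units digits).
stems = defaultdict(list)
for value in sorted(temps):          # sorting first keeps leaves in order
    stem, leaf = divmod(value, 10)   # e.g. 13 -> stem 1, leaf 3
    stems[stem].append(leaf)

print("Stem | Leaf")
for stem in sorted(stems):
    print(f"{stem:<4} | {' '.join(str(leaf) for leaf in stems[stem])}")
```

Unlike the grouped frequency table, every original observation can be read back from the plot.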
