I've Come Loaded With Statistics, For I've Noticed That A Man Can't Prove Anything Without Statistics

Chapter two: Methods of data presentation
Contents
2.1 Sources of data
2.1.1 Methods of primary data collection
2.1.2 Methods of secondary data collection
2.2 Methods of data presentation
2.3 Frequency distribution
2.3.1 Categorical frequency distribution
2.3.2 Ungrouped frequency distribution
2.3.3 Grouped frequency distribution
2.4 Diagrammatic and Graphical Presentation of Data
“I've come loaded with statistics, for I've noticed that a man can't prove anything without
statistics”
M. TWAIN
2.1 Sources of data

Data are the raw material of statistics. They can be obtained either by measurement or by counting.
When we determine that answering a question will require the use of statistics, we begin to
search for suitable data to serve as the raw material for our investigation.
Statistical data may be classified into two categories, depending on their source.
1. Primary data: Data collected by the investigator himself for the purpose of a specific
inquiry or study. Such data are original in character and are mostly generated by surveys
conducted by individuals or research institutions.
Primary data tend to be more reliable and accurate, since the investigator can obtain the
correct information by clearing up any doubts respondents have about particular questions.
2. Secondary data: When an investigator uses data which have already been collected by
others, such data are called secondary data. Such data are primary data for the agency that
collected them, and become secondary for someone else who uses them for his own
purposes. Examples of secondary data: books, reports, magazines, etc.
When our source is secondary data, check that:
✓ The type and objective of the situation in which the data were collected are known.
✓ The purpose for which the data were collected is compatible with the
present problem.
✓ The nature and classification of the data are appropriate to our problem.
✓ There are no biases or misreporting in the published data.
Note: Data which are primary for one user may be secondary for another.
abebuabebaw@yahoo.com
2.1.1 Method of primary data collection

In primary data collection, you collect the data yourself using methods such as interviews and
questionnaires. The key point here is that the data you collect is unique to you and your research
and, until you publish, no one else has access to it. There are many methods of collecting
primary data and the main methods include:
Questionnaires: are a popular means of collecting data, but are difficult to design
and often require many rewrites before an acceptable questionnaire is produced.
Advantages:
Can be used as a method in its own right or as a basis for interviewing or a telephone survey.
Can be posted, e-mailed or faxed.
Can cover a large number of people or organizations
Wide geographic coverage.
Relatively cheap.
No prior arrangements are needed
Avoids embarrassment on the part of the respondent.
Respondent can consider responses
Possible anonymity of respondent.
No interviewer bias.
Disadvantages:
Design problems
Questions have to be relatively simple.
Historically low response rate (although inducements may help).
Time delay whilst waiting for responses to be returned
Require a return deadline.
Several reminders may be required.
Assumes no literacy problems.
No control over who completes it.
Not possible to give assistance if required.
Problems with incomplete questionnaires.
Replies not spontaneous and independent of each other.
Respondent can read all questions beforehand and then decide
whether to complete or not. For example, perhaps because it is too
long, too complex, uninteresting, or too personal.
Interviewing is a technique that is primarily used to gain an understanding of the
underlying reasons and motivations for people’s attitudes, preferences or
behavior. Interviews can be undertaken on a personal one-to-one basis or in a
group. They can be conducted at work, at home, in the street or in a shopping
center, or some other agreed location.
Advantages:
Serious approach by respondent resulting in accurate information.
Good response rate.
Completed and immediate.
Possible in-depth questions.
Interviewer in control and can give help if there is a problem.
Can investigate motives and feelings.
Can use recording equipment.
Characteristics of respondent assessed – tone of voice, facial
expression, hesitation, etc.
Can use props.
If one interviewer used, uniformity of approach.
Used to pilot other methods.
Disadvantages:
Need to set up interviews.
Time consuming.
Geographic limitations.
Can be expensive.
Normally need a set of questions.
Respondent bias – tendency to please or impress, create false
personal image, or end interview quickly.
Embarrassment possible if personal questions.
Transcription and analysis can present problems– subjectivity.
If many interviewers, training required.
Observation: involves recording the behavioral patterns of people, objects and
events in a systematic manner.
Diaries: A diary is a way of gathering information about the way individuals
spend their time on professional activities. They are not about records of
engagements or personal journals of thought! Diaries can record either
quantitative or qualitative data, and in management research can provide
information about work patterns and activities.
Advantages:
Useful for collecting information from employees.
Different writers compared and contrasted simultaneously.
Allows the researcher freedom to move from one organization to
another.
Researcher not personally involved.
Diaries can be used as a preliminary or basis for intensive
interviewing.
Used as an alternative to direct observation or where resources are
limited.
Disadvantages:
Subjects need to be clear about what they are being asked to do,
why and what you plan to do with the data.
Diarists need to be of a certain educational level.
Some structure is necessary to give the diarist focus, for example, a
list of headings.
Encouragement and reassurance are needed as completing a diary
is time-consuming and can be irritating after a while.
Progress needs checking from time-to-time.
Confidentiality is required as content may be critical.
Analysis can be problematic, so you need to consider how responses will be
coded before the subjects start filling in diaries.
Critical incidents: The critical incident technique is an attempt to identify the
more ‘noteworthy’ aspects of job behavior and is based on the assumption that
jobs are composed of critical and non-critical tasks. For example, a critical task
might be defined as one that makes the difference between success and failure in
carrying out important parts of the job. The idea is to collect reports about what
people do that is particularly effective in contributing to good performance. The
incidents are scaled in order of difficulty, frequency and importance to the job as
a whole.
The technique scores over the use of diaries as it is centered on specific
happenings and on what is judged as effective behavior. However, it is laborious
and does not lend itself to objective quantification.
2.1.2 Methods of secondary data collection

Secondary data analysis can be literally defined as second-hand analysis. It is the analysis of data
or information that was either gathered by someone else (e.g., researchers, institutions, other
NGOs, etc.) or gathered for some purpose other than the one currently being considered, or often a
combination of the two.
Some of the sources of secondary data are government documents, official statistics, technical
reports, scholarly journals, trade journals, review articles, reference books, research institutes,
universities, libraries, library search engines, computerized databases and the World Wide Web
(WWW).
Exercise-2: Write the merits and demerits of secondary data.
2.2 Methods of Data Presentation
This chapter introduces tabular and graphical methods commonly used to summarize both
qualitative and quantitative data. Tabular and graphical summaries of data can be obtained in
annual reports, newspaper articles and research studies. Everyone is exposed to these types of
presentations, so it is important to understand how they are prepared and how they will be
interpreted.
Having collected and edited the data, the next important step is to organize them, that is, to present them
in a readily comprehensible, condensed form that helps in drawing inferences. It is also necessary
that like items be separated from unlike ones.
The presentation of data is broadly classified in to the following two categories:
✓ Tabular presentation
✓ Diagrammatic and Graphic presentation.
The process of arranging data into classes or categories according to their similarities is technically called
classification. It eliminates inconsistency and also brings out the points of similarity and/or dissimilarity
among the collected items.
Classification is necessary because it is difficult to draw inferences and conclusions directly from
a large set of collected raw data.
2.2.1 Frequency Distribution

Frequency:- is the number of times a certain value or class of values occurs.
Frequency distribution (FD):- is the organization of raw data in table form using classes and
frequency.
There are three basic types of frequency distributions, and there are specific procedures for
constructing each type. The three types are categorical, ungrouped and grouped frequency
distributions.
The reasons for constructing a frequency distribution are as follows:
To organize the data in a meaningful, intelligible way.
To enable the reader to determine the nature or shape of the distribution.
To facilitate computational procedures for measures of average and spread.
To enable the researcher to draw charts and graphs for the presentation of data.
To enable the reader to make comparisons between different data sets.
I. Categorical Frequency Distribution
The categorical frequency distribution is used for data that can be placed in specific categories,
such as nominal or ordinal level data. For example, data such as political affiliation,
religious affiliation, or major field of study would use a categorical frequency distribution.
The major components of a categorical frequency distribution are class, tally and frequency.
Although percentage is not normally part of a frequency distribution, it is added here
because it is used in certain types of graphical presentation, such as the pie chart.
Steps for constructing a categorical frequency distribution
1. Verify that the data are on a nominal or ordinal scale of measurement.
2. Make a table with four columns: A (class), B (tally), C (frequency) and D (percent).
3. Put the distinct values of the data set in column A.
4. Tally the data and place the result in column B.
5. Count the tallies and place the results in column C.
6. Find the percentage of values in each class, and place it in column D, using the formula
$\% = \frac{f}{n} \times 100\%$
where $f$ is the frequency of the class and $n$ is the total number of values.
Example 2.1: Twenty-five army inductees were given a blood test to determine their blood type.
The data set is given as follows:
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data.
Solution:
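The six steps above can be sketched in a few lines of Python. This is only an illustration, not the original worked table: the function name categorical_frequency_table and the row layout are assumptions made here, and the tally column is replaced by a direct count.

```python
from collections import Counter

# Blood types of the 25 inductees, copied row by row from Example 2.1.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()


def categorical_frequency_table(values):
    """Return (class, frequency, percent) rows, one per distinct value."""
    counts = Counter(values)   # steps 3 to 5: distinct classes and their counts
    n = len(values)            # total number of values
    # step 6: percent = f / n * 100
    return [(cls, f, 100 * f / n) for cls, f in sorted(counts.items())]


table = categorical_frequency_table(blood_types)

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for cls, f, pct in table:
    print(f"{cls:<6}{f:>10}{pct:>8.0f}%")
print(f"{'Total':<6}{sum(f for _, f, _ in table):>10}"
      f"{sum(p for _, _, p in table):>8.0f}%")
```

The classes are listed in alphabetical order (A, AB, B, O); any other fixed order would serve equally well for nominal data.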
II. Ungrouped Frequency Distribution

When the data are numerical instead of categorical, the range of the data is small and each class is
only one unit wide, the distribution is called an ungrouped frequency distribution.
The major components of this type of frequency distribution are class, tally, frequency and
cumulative frequency. The steps are almost the same as those for the categorical frequency
distribution.
Cumulative frequencies show how many values are accumulated up to and including
a specific class.
Example 2.2: The following data represent the number of days of sick leave taken by each of 50
workers of a company over the last 6 weeks.
2 0 0 5 8 3 4 1 0 0 7 1
7 1 5 4 0 4 0 1 8 9 7 0
1 7 2 5 5 4 3 3 0 0 2 5
1 3 0 2 4 5 0 5 7 5 1 1
0 2
A. Construct ungrouped frequency distribution
B. How many workers had at least 1 day of sick leave?
C. How many workers had between 3 and 5 days of sick leave?
Solution:
A. Since this data set contains only a relatively small number of distinct or different
values, it is convenient to represent it in a frequency table which presents each distinct
value along with its frequency of occurrence.
B. Since 12 of the 50 workers had no days of sick leave, the answer is 50 − 12 = 38.
C. The answer is the sum of the frequencies for the values 3, 4 and 5, that is, 4 + 5 + 8 = 17.
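As a check on parts A to C, the ungrouped frequency distribution can be computed directly from the raw data. The sketch below is a minimal illustration; the variable names are chosen here, and the value 6 is kept as a class with frequency zero so that the classes stay continuous.

```python
from collections import Counter
from itertools import accumulate

# Days of sick leave for the 50 workers, copied from Example 2.2.
sick_days = [
    2, 0, 0, 5, 8, 3, 4, 1, 0, 0, 7, 1,
    7, 1, 5, 4, 0, 4, 0, 1, 8, 9, 7, 0,
    1, 7, 2, 5, 5, 4, 3, 3, 0, 0, 2, 5,
    1, 3, 0, 2, 4, 5, 0, 5, 7, 5, 1, 1,
    0, 2,
]

counts = Counter(sick_days)

# A. One class per value from the minimum to the maximum (one unit wide each).
classes = list(range(min(sick_days), max(sick_days) + 1))
freqs = [counts[c] for c in classes]     # a missing value counts as 0
cum_freqs = list(accumulate(freqs))      # running total up to and including each class

print(f"{'Class':>5}{'Frequency':>11}{'Cum. freq.':>12}")
for c, f, cf in zip(classes, freqs, cum_freqs):
    print(f"{c:>5}{f:>11}{cf:>12}")

# B. Workers with at least 1 day of sick leave.
at_least_one = len(sick_days) - counts[0]

# C. Workers with between 3 and 5 days (inclusive) of sick leave.
between_3_and_5 = sum(counts[d] for d in (3, 4, 5))

print("At least 1 day:", at_least_one)
print("Between 3 and 5 days:", between_3_and_5)
```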
III. Grouped Frequency Distribution
When the range of the data is large, the data must be grouped into classes that are more than
one unit in width.
Definition of some basic terms
• Grouped frequency distribution: a frequency distribution in which several values are grouped
into one class.
• Class limits (CL): the values that separate one class from another. The limits can actually appear in
the data, and there is a gap between the upper limit of one class and the lower limit of the next
class.
• Unit of measure (U): the smallest possible difference between successive values, e.g. 1,
0.1, 0.01, 0.001, ...
• Class boundaries: the values that separate one class in a grouped frequency distribution from the next.
A boundary has one more decimal place than the raw data, and there is no gap between the
upper boundary of one class and the lower boundary of the succeeding class. The lower
class boundary is found by subtracting half of the unit of measure from the lower class
limit, and the upper class boundary is found by adding half of the unit of measure to the
upper class limit.
• Class width (W): the difference between the upper and lower boundaries of any
class. The class width is also the difference between the lower limits (or the upper
limits) of two consecutive classes.
• Class mark (midpoint): found by adding the lower and upper class limits
(or boundaries) and dividing the sum by two.
• Cumulative frequency (CF): the number of observations less than the upper class
boundary, or greater than the lower class boundary, of a class.
• CF (less than type): the number of values less than the upper class boundary of a
given class.
• CF (greater than type): the number of values greater than the lower class boundary
of a given class.
• Relative frequency (Rf): the frequency of a class divided by the total frequency. It gives the
proportion of values falling in that class; multiplied by 100, it gives the percent.
$Rf_i = \frac{f_i}{n} = \frac{f_i}{\sum f_i}$
• Relative cumulative frequency (RCf): the running total of the relative frequencies, or equivalently the
cumulative frequency divided by the total frequency. It gives the proportion of values that
are less than the upper class boundary (less than type) or greater than the lower class boundary
(greater than type).
$RCf_i = \frac{Cf_i}{n} = \frac{Cf_i}{\sum f_i}$

To construct a grouped frequency distribution, follow the steps below.
1. Find the highest and the lowest values
2. Find the range; 𝑅𝑎𝑛𝑔𝑒 = 𝑀𝑎𝑥𝑖𝑚𝑢𝑚 − 𝑀𝑖𝑛𝑖𝑚𝑢𝑚 or 𝑅 = 𝐻 − 𝐿
3. Select the number of classes desired. There are two ways to choose the
number of classes:
I. Use Sturges' rule, $K = 1 + 3.32 \log_{10} n$, where $K$ is the number of classes
and $n$ is the number of observations.
II. Select the number of classes arbitrarily between 5 and 20. This is the conventional
way. If $K$ cannot be calculated by Sturges' rule, this method is the appropriate one.
When we choose the classes, we have to keep the following criteria in mind:
The classes must be mutually exclusive. Mutually exclusive classes have non-overlapping
class limits, so that no value can be placed into two classes.
The classes must be continuous. Even if there are no values in a class, the class
must be included in the frequency distribution. There should be no gaps in a
frequency distribution. The only exception occurs when the class with zero
frequency is the first or the last class: a class with zero frequency at either end
can be omitted without affecting the distribution.
The classes must be equal in width. The reason for having classes of equal
width is to avoid a distorted view of the data. One exception occurs
when a distribution is open-ended, i.e., it has no specific beginning or end value.
4. Find the class width by dividing the range by the number of classes:
$W = \frac{R}{K}$, that is, $\text{Width} = \frac{\text{Range}}{\text{Number of classes}}$
Note: Round the answer up to the next whole number if there is a remainder. For
instance, 4.7 ≈ 5 and 4.12 ≈ 5.
5. Select the starting point as the lowest class limit. This is usually the lowest score
(observation). Add the width to that score to get the lower class limit of the next class.
Keep adding until you achieve the number of desired class(𝐾) calculated in step 3.
6. Find the upper class limit; subtract unit of measurement(𝑈) from the lower class limit of
the second class in order to get the upper limit of the first class. Then add the width to
each upper class limit to get all upper class limits.
Unit of measurement: the difference between a value and the next possible value. For instance,
for the data 28, 23, 52 the unit of measurement is one: take any datum, say 23; the
next possible value is 24, so $U = 24 - 23 = 1$. If the data are 24.12, 30,
21.2, give priority to the datum with the most decimal places. Take 24.12; the
next possible value is 24.13, so $U = 24.13 - 24.12 = 0.01$.
Note: $U = 1$ is treated here as the maximum value of the unit of measurement, and it is the
value used when we have no information about the precision of the data.
7. Find the class boundaries:
$\text{Lower class boundary} = \text{Lower class limit} - \frac{U}{2}$ and $\text{Upper class boundary} = \text{Upper class limit} + \frac{U}{2}$.
In short, $LCB_i = LCL_i - \frac{U}{2}$ and $UCB_i = UCL_i + \frac{U}{2}$.
8. Tally the data and write the numerical values for tallies in the frequency column
9. Find the cumulative frequencies. There are two types of cumulative frequency: less than
cumulative frequency and more than cumulative frequency. The less than cumulative
frequency of a class is obtained by adding the frequencies of all the previous classes
to the frequency of the class itself, accumulating from the lowest class to
the highest. The more than cumulative frequency is obtained by accumulating the
frequencies from the highest class to the lowest.
For example, the following frequency distribution table gives the marks obtained by 40
students:
The above table shows how to find less than cumulative frequency and the table shown
below shows how to find more than cumulative frequency.
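The two kinds of cumulative frequency described in step 9 can be sketched as follows. The class limits and frequencies here are hypothetical illustration values chosen to sum to 40; they are not the figures from the marks table referred to above.

```python
from itertools import accumulate


def less_than_cf(freqs):
    """Accumulate frequencies from the lowest class to the highest."""
    return list(accumulate(freqs))


def more_than_cf(freqs):
    """Accumulate frequencies from the highest class to the lowest."""
    return list(accumulate(reversed(freqs)))[::-1]


# Hypothetical marks of 40 students (illustration only, not the original table).
class_limits = ["0-9", "10-19", "20-29", "30-39", "40-49"]
frequencies = [4, 9, 14, 8, 5]

less = less_than_cf(frequencies)
more = more_than_cf(frequencies)

print(f"{'Class':<8}{'f':>4}{'CF(<)':>7}{'CF(>)':>7}")
for cls, f, lt, mt in zip(class_limits, frequencies, less, more):
    print(f"{cls:<8}{f:>4}{lt:>7}{mt:>7}")
```

The last less than cumulative frequency and the first more than cumulative frequency both equal the total number of observations, which is a quick way to check a hand-built table.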
Example 2.3: Consider the following set of data and construct the frequency distribution.
11 29 6 33 14 21 18 17 22 38
31 22 27 19 22 23 26 39 34 27
Steps
1. Highest value = 39, Lowest value = 6
2. 𝑅 = 39 − 6 = 33
3. 𝐾 = 1 + 3.32 log 20 = 5.32 ≈ 6
4. $W = \frac{R}{K} = \frac{33}{6} = 5.5 \approx 6$
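5. Select the starting point. Take the minimum, which is 6, as the lower class limit of the first
class, then add the width 6 to it to get the lower class limit of the next class: 6 + 6 = 12.
6. Upper class limit. Since the unit of measurement is one, 12 − 1 = 11, so 11 is the UCL of
the first class. Therefore, 6 – 11 is the first class.
7. Find the class boundaries using the formulas in step 7: $LCB_1 = LCL_1 - 0.5$ and
$UCB_1 = UCL_1 + 0.5$.
8 and 9. Tally the data and find the frequencies and cumulative frequencies. The results are
shown in the following table.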
Class limit  Class boundary  Class mark  Tally     Frequency  CF(<)  CF(>)  RF           RCF(>)
6 – 11       5.5 – 11.5      8.5         //        2          2      20     2/20 = 0.10  1.00
12 – 17      11.5 – 17.5     14.5        //        2          4      18     2/20 = 0.10  0.90
18 – 23      17.5 – 23.5     20.5        ///// //  7          11     16     7/20 = 0.35  0.80
24 – 29      23.5 – 29.5     26.5        ////      4          15     9      4/20 = 0.20  0.45
30 – 35      29.5 – 35.5     32.5        ///       3          18     5      3/20 = 0.15  0.25
36 – 41      35.5 – 41.5     38.5        //        2          20     2      2/20 = 0.10  0.10
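Steps 1 to 8 of the procedure can be sketched as one function and applied to the data of Example 2.3. This is a minimal illustration under stated assumptions: the function name grouped_frequency_table and the row layout are chosen here, log is taken to base 10, and both K and W are always rounded up, as in the worked example.

```python
import math

# Raw data from Example 2.3.
data = [11, 29, 6, 33, 14, 21, 18, 17, 22, 38,
        31, 22, 27, 19, 22, 23, 26, 39, 34, 27]


def grouped_frequency_table(values, unit=1):
    """Return (LCL, UCL, LCB, UCB, class mark, frequency) rows."""
    low, high = min(values), max(values)                  # step 1
    data_range = high - low                               # step 2
    k = math.ceil(1 + 3.32 * math.log10(len(values)))     # step 3: Sturges' rule
    width = math.ceil(data_range / k)                     # step 4: round up
    rows = []
    for i in range(k):
        lcl = low + i * width                             # step 5: lower limits
        ucl = lcl + width - unit                          # step 6: upper limits
        lcb, ucb = lcl - unit / 2, ucl + unit / 2         # step 7: boundaries
        mark = (lcl + ucl) / 2                            # class mark (midpoint)
        freq = sum(lcl <= v <= ucl for v in values)       # step 8: count per class
        rows.append((lcl, ucl, lcb, ucb, mark, freq))
    return rows


rows = grouped_frequency_table(data)

print(f"{'Limits':<10}{'Boundaries':<14}{'Mark':>6}{'f':>4}")
for lcl, ucl, lcb, ucb, mark, freq in rows:
    print(f"{f'{lcl}-{ucl}':<10}{f'{lcb}-{ucb}':<14}{mark:>6}{freq:>4}")
```

One caution about this sketch: if the range happens to divide exactly by K, the last class built this way stops one unit short of the maximum, so an extra class (or a wider width) would be needed. The text does not cover that case, so it is not handled here.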
Relative Frequency Distribution

An important variation of the basic frequency distribution uses relative frequencies, which are
easily found by dividing each class frequency by the total of all frequencies. A relative frequency
distribution includes the same class limits as a frequency distribution, but relative frequencies are
used instead of actual frequencies. The relative frequencies are sometimes expressed as percents.
$\text{Relative frequency} = \frac{\text{Class frequency}}{\text{Sum of all frequencies}}$
Relative frequency distribution enables us to understand the distribution of the data and to
compare different sets of data.
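As a short illustration, the relative frequencies and the more than relative cumulative frequencies of Example 2.3 can be computed from the class frequencies alone. The variable names are chosen here for the sketch.

```python
from itertools import accumulate

# Class limits and frequencies taken from the table of Example 2.3.
class_limits = ["6-11", "12-17", "18-23", "24-29", "30-35", "36-41"]
frequencies = [2, 2, 7, 4, 3, 2]

n = sum(frequencies)                                    # sum of all frequencies

# Relative frequency = class frequency / sum of all frequencies.
relative = [f / n for f in frequencies]

# More than cumulative frequency, accumulated from the highest class down,
# then divided by n to give the relative cumulative frequency.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]
more_than_rcf = [cf / n for cf in more_than_cf]

print(f"{'Class':<8}{'f':>3}{'RF':>7}{'Percent':>9}{'RCF(>)':>8}")
for cls, f, rf, rcf in zip(class_limits, frequencies, relative, more_than_rcf):
    print(f"{cls:<8}{f:>3}{rf:>7.2f}{100 * rf:>8.0f}%{rcf:>8.2f}")
```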
2.2.2 Diagrammatic and Graphic presentation of data

The most convenient and popular way of describing data is using graphical presentation. It is easier to
understand and interpret data when they are presented graphically than using words or a frequency table.
A graph can present data in a simple and clear way. Also it can illustrate the important aspects of the data.
This leads to better analysis and presentation of the data. In this article, we discuss the approach for the
most commonly used diagrammatic or graphical methods such as bar chart, pie chart, histogram,
frequency polygon and cumulative frequency polygon.
The three most commonly used diagrammatic presentations for discrete as well as qualitative data are:
➢ Bar chart
➢ Pictogram
➢ Pie chart
A pie chart is a circle that is divided into sections or wedges according to the percentage of frequencies
in each category of the distribution. The angle of each sector is obtained using:
$\text{Angle of a sector} = \frac{\text{Value of the part}}{\text{The whole quantity}} \times 360^\circ$
Example 2.4: Draw a suitable diagram to represent the following population of a town.
Men   Women  Girls  Boys
2500  2000   4000   1500
Solutions:
Step 1: Find the percentage.
Step 2: Find the number of degrees for each class.
Step 3: Using a protractor and compass, graph each section and write its name with corresponding
percentage.
Class  Frequency  Percent  Degree
Men    2500       25       90
Women  2000       20       72
Girls  4000       40       144
Boys   1500       15       54
Total  10000      100      360
[Figure: Pie chart of the town population. Men 25%, Women 20%, Girls 40%, Boys 15%.]
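Steps 1 and 2 of this solution can be checked with a few lines of Python. The sketch below only computes the percentages and sector angles; drawing the chart itself is left out, and the names used are chosen here for illustration.

```python
# Population figures from Example 2.4.
population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}

total = sum(population.values())   # the whole quantity


def sector(value, whole):
    """Return (percent, angle in degrees) for one part of the whole."""
    percent = 100 * value / whole
    angle = 360 * value / whole    # angle = value of the part / whole * 360 degrees
    return percent, angle


sectors = {name: sector(value, total) for name, value in population.items()}

print(f"{'Class':<7}{'Frequency':>10}{'Percent':>9}{'Degree':>8}")
for name, value in population.items():
    percent, angle = sectors[name]
    print(f"{name:<7}{value:>10}{percent:>9.0f}{angle:>8.0f}")
print(f"{'Total':<7}{total:>10}"
      f"{sum(p for p, _ in sectors.values()):>9.0f}"
      f"{sum(a for _, a in sectors.values()):>8.0f}")
```

The angles should always add up to 360 degrees and the percentages to 100, which gives a simple check before the sectors are drawn with a protractor.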
A) Bar Charts
✓ Used to represent & compare the frequency distribution of discrete variables and attributes or
categorical series.
✓ Bars can be drawn either vertically or horizontally.
In presenting data using bar diagram,
✓ All bars must have equal width and the distance between bars must be equal.
✓ The height or length of each bar indicates the size (frequency) of the figure represented.
There are different types of bar charts. The most common are:
❖ Simple bar chart
❖ Component or sub divided bar chart.
❖ Multiple bar charts
I. Simple bar chart
✓ Are used to display data on one variable.
✓ They are thick lines (narrow rectangles) having the same breadth. The magnitude of a quantity is
represented by the height /length of the bar.
Example 2.5: The number of students in the four departments of a science college is given as follows:
Department          Physics  Maths  Chemistry  Biology
Number of students  200      400    450        600
Male                170      350    250        200
Female              30       50     200        400
Draw a simple bar chart of the number of students by department.
Solution:
[Figure: Simple bar chart. Horizontal axis: Department (Phys, Maths, Chem, Bio). Vertical axis: Frequency. Bar heights: 200, 400, 450, 600.]
II. Component Bar chart

✓ When there is a desire to show how a total (or aggregate) is divided in to its component parts,
we use component bar chart.
✓ The bars represent total value of a variable with each total broken in to its component parts and
different colors or designs are used for identifications
Example 2.6: Draw a component (sub-divided) bar chart of the number of students by department
given in Example 2.5.
Solution:
[Figure: Sub-divided bar chart. Horizontal axis: Department (Phys, Maths, Chem, Bio). Vertical axis: Frequency. Each bar is split into Male and Female components.]
III. Multiple Bar charts

✓ These are used to display data on more than one variable.
✓ They are used for comparing different variables at the same time.
Example 2.7: The following data represent the sales, by product, of a given company for three
products A, B and C from 1957 to 1959.
Product  Sales ($)
         1957  1958  1959
A        12    14    18
B        24    21    18
C        24    35    54
Draw a multiple bar chart to represent the sales by product from 1957 to 1959.
Solution:
B) Pictograph
In this diagram, we represent data by means of some picture symbols. We decide about a suitable picture
to represent a definite number of units in which the variable is measured.
2.2.4 Graphical Presentation of data
The histogram, the frequency polygon and the cumulative frequency graph (ogive) are the most commonly
used graphical representations for continuous data.
Procedures for constructing statistical graphs:
➢ Draw and label the X and Y axis.
➢ Choose a suitable scale for the frequencies or cumulative frequencies and label it on the Y axis.
➢ Represent the class boundaries for the histogram or ogive or the mid points for the frequency
polygon on the X axis.
➢ Plot the points.
➢ Draw the bars or lines to connect the points.
Histogram
A histogram is a graph that displays the data using contiguous vertical bars of various heights to represent
frequencies. Class boundaries are placed along the horizontal axis. Class marks or class limits are
sometimes used on the X axis instead.
Example 2.8: Construct a histogram to represent the following data.
Class limits  15-24  25-34  35-44  45-54  55-64  65-74  75-84
Frequency     3      4      10     15     12     4      2
Solution:
[Figure: Histogram. Horizontal axis: Class boundaries. Vertical axis: Frequency. Bar heights: 3, 4, 10, 15, 12, 4, 2.]
Frequency polygon
If we join the mid-points of the tops of the adjacent rectangles of the histogram with line segments, a
frequency polygon is obtained. When the polygon is continued to the x-axis just outside the range of the
data, the total area under the polygon is equal to the total area under the histogram.
Example 2.9: Construct a frequency polygon to represent the previous data in example 2.8.
Solution:
Class limits  Frequency  Class marks  Class boundaries  R.F.  % R.F.  Less than C.F.  More than C.F.
15 – 24       3          19.5         14.5 – 24.5       0.06  6%      3               50
25 – 34       4          29.5         24.5 – 34.5       0.08  8%      7               47
35 – 44       10         39.5         34.5 – 44.5       0.20  20%     17              43
45 – 54       15         49.5         44.5 – 54.5       0.30  30%     32              33
55 – 64       12         59.5         54.5 – 64.5       0.24  24%     44              18
65 – 74       4          69.5         64.5 – 74.5       0.08  8%      48              6
75 – 84       2          79.5         74.5 – 84.5       0.04  4%      50              2
Total         50                                        1.00  100%
Adding two class marks with $f_i = 0$, namely 9.5 at the beginning and 89.5 at the end, the following
frequency polygon is plotted:
[Figure: Frequency polygon. Horizontal axis: Class mark (9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5). Vertical axis: Frequency.]
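The points of the frequency polygon in Example 2.9 can be generated from the class limits and frequencies. The sketch below is an illustration only: the function name frequency_polygon_points is chosen here, and it assumes that all classes have equal width, as they do in Example 2.8.

```python
# Class limits and frequencies from Example 2.8.
class_limits = [(15, 24), (25, 34), (35, 44), (45, 54),
                (55, 64), (65, 74), (75, 84)]
frequencies = [3, 4, 10, 15, 12, 4, 2]


def frequency_polygon_points(limits, freqs):
    """Return (class mark, frequency) points, closed on the x-axis at both ends."""
    marks = [(lo + hi) / 2 for lo, hi in limits]    # class marks (midpoints)
    width = limits[1][0] - limits[0][0]             # equal class width assumed
    # Add one zero-frequency class mark before the first and after the last class.
    xs = [marks[0] - width] + marks + [marks[-1] + width]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))


points = frequency_polygon_points(class_limits, frequencies)
for x, y in points:
    print(f"{x:>5}  {y}")
```

Joining these points in order with straight line segments gives the polygon; the two added points at 9.5 and 89.5 are what bring it down to the x-axis.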
Ogive (cumulative frequency polygon)

An ogive (pronounced “oh-jive”) is a line graph that depicts cumulative frequencies, just as the cumulative
frequency distribution lists them. Note that the ogive uses class boundaries along the
horizontal scale, and that the graph begins at the lower boundary of the first class and ends at the upper
boundary of the last class. An ogive is useful for determining the number of values below or above some
particular value. There are two types of ogive: the less than ogive and the more than ogive. The
difference is that the less than ogive plots the less than cumulative frequency on the y axis, while the
more than ogive plots the more than cumulative frequency.
Example 2.10: Draw both types of ogive for the frequency distribution of Example 2.8.
Solutions:
[Figure: The less than ogive and the more than ogive. Horizontal axis: Class boundaries (14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5). Vertical axis: Cumulative frequency.]
Note: For both ogives, one class with frequency zero is added, for the same reason as in the frequency
polygon.
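The points plotted in the two ogives of Example 2.10 can be sketched as below. This is a minimal illustration with names chosen here: each point pairs a class boundary with the number of values below it (less than ogive) or above it (more than ogive).

```python
from itertools import accumulate

# Class boundaries and frequencies from Examples 2.8 and 2.9.
boundaries = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5]
frequencies = [3, 4, 10, 15, 12, 4, 2]

# Less than ogive: no values lie below the first boundary, then the running total.
less_than = [0] + list(accumulate(frequencies))

# More than ogive: all values lie above the first boundary, none above the last.
more_than = list(accumulate(reversed(frequencies)))[::-1] + [0]

less_than_points = list(zip(boundaries, less_than))
more_than_points = list(zip(boundaries, more_than))

print(f"{'Boundary':>8}{'Less than':>11}{'More than':>11}")
for b, lt, mt in zip(boundaries, less_than, more_than):
    print(f"{b:>8}{lt:>11}{mt:>11}")
```

Reading across one row gives both counts for a boundary; for example, 32 values lie below 54.5 and 18 lie above it, and the two always add up to the total of 50.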