Frequency Distribution.
FREQUENCY DISTRIBUTION
attribute.
Data on attributes may be of two types: ordinal and nominal. They are ordinal when there
is a clear ordering of the forms or categories of the attribute. For example, when the character
'education' is measured with categories primary school, high school, college and post-graduate,
or the character 'economic status' is measured with categories poor, middle class and rich,
there is a clear ordering of the categories though the absolute distances between them are
unknown. Such data are common in the social sciences, in particular for measuring attitudes
and opinions on various issues and status of various types. When the various forms of an
attribute differ in nature, not in quantity, the data are nominal. For example, when we record
the religions of persons as Hindu, Muslim, Christian, etc. or when the mother tongues of
students are noted as Bengali, Hindi, Marathi, etc. the data are nominal. It is obvious that the
order of listing of the different forms of an attribute is unimportant in this case.
The term variable (or variate) means a character of an item or an individual that can be
expressed in numerical terms. It is also called a quantitative character and such characters
can be measured or counted. Weight of students in a school, ages of boys, family size, etc.
are characters of this type.
The above table reveals how this total frequency, 40, is distributed over the two categories
of the attribute.
The table (TABLE 3.1) shows the frequency distribution of the attribute under study.
Sometimes, the proportions or the relative frequencies may be used as an alternative to
frequencies and a table similar to TABLE 3.1 may be prepared to present them against the
corresponding forms of the attribute. In this case, relative frequency is 18/40 (or 0.45) for the
form female and 22/40 (or 0.55) for the form male.
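The relative frequencies quoted above can be checked with a short computation. The following is a minimal sketch in Python (not part of the original text); it uses only the two frequencies stated above, 18 for female and 22 for male, and the variable names are illustrative.

```python
# Frequencies of the two forms of the attribute, as quoted in the text.
frequencies = {"female": 18, "male": 22}

total = sum(frequencies.values())  # total frequency, 40

# Relative frequency = frequency of a form / total frequency.
relative = {form: count / total for form, count in frequencies.items()}

for form, value in relative.items():
    print(form, frequencies[form], round(value, 2))
```

The relative frequencies of all the forms of an attribute always add up to 1, which gives a quick check on such a table.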
It should be mentioned that we may, similarly, have frequency distributions of attributes
with more than two forms.
TABLE 3.2
NUMBER OF MEMBERS OF DIFFERENT FAMILIES
3 2 2
3 5
5 5 3 4
6 4
2 3 3
6 6 B 5
4 4 5 7
6 4 6 4 3
5 3 6 2 4
5 4 5 4
4 3 4
4 4 4 3 4 4
If the data are arranged in a systematic and compact form, then one can easily understand
their significance. To meet the purpose, the frequency distribution of the variable family
size is constructed. On going through the data, we find that the range of the values is 2 to 7.
The values 2, 3, ..., 7 are taken successively to form six classes and the given figures are
considered one by one and recorded in the respective classes with the help of tally marks.
In order to facilitate counting, tally marks are kept in groups of five. After every four
marks, the fifth one is drawn across the preceding four.
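The tallying procedure can be imitated mechanically. The sketch below is written in Python and is only an illustration: the list family_sizes is a small hypothetical sample (not the data of TABLE 3.2), the helper tally is an invented name, and the string '||||/' is an assumed way of printing four strokes crossed by a fifth.

```python
from collections import Counter

# Hypothetical family sizes (illustrative only, not the data of TABLE 3.2).
family_sizes = [3, 2, 4, 4, 5, 3, 4, 6, 2, 4, 7, 3, 4, 5, 4]

def tally(count):
    """Return tally marks in groups of five, e.g. 7 -> '||||/ ||'."""
    groups, rest = divmod(count, 5)
    return " ".join(["||||/"] * groups + (["|" * rest] if rest else []))

# One class for each distinct value between the smallest and the greatest.
frequency = Counter(family_sizes)
for size in range(min(family_sizes), max(family_sizes) + 1):
    print(size, tally(frequency[size]), frequency[size])
```

Counting the strokes in each class gives the class frequency, and the class frequencies add up to the number of values recorded.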
TABLE 3.3
TALLY MARKS FOR THE GIVEN VALUES
Family size Tally Marks
2
Total 90
The same frequency distribution may be presented with relative frequencies in place of
frequencies. The relative frequency of the value 2 will give the proportion of families having
size 2, and similarly for other values. Thus for instance, the relative frequency of the value 3
is 20/90 (or 0.222). Again, to respond to such queries as 'How many families are there with
4 members or less?' or 'How many families are there with 5 members or more?', we are to
find cumulative frequencies of the less-than type and greater-than type. For the former we
are to give the totals of the frequencies proceeding from the lowest class upwards, and for
the latter proceeding from the highest class downwards.
One can also show cumulative relative frequencies, by adding successively the relative
frequencies, starting from the top of the table and then from the bottom of the table for the
less-than type and greater-than type respectively.
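The two kinds of cumulative frequency can be obtained by running totals. The following Python sketch is an illustration under a stated assumption: the class frequencies 9, 20, 30, 17, 10 and 4 for sizes 2 to 7 are taken as given because they agree with the total of 90 and with the relative frequency 20/90 quoted above. The variable names are not from the text.

```python
from itertools import accumulate

sizes = [2, 3, 4, 5, 6, 7]
# Assumed frequencies, consistent with the total of 90 and with 20/90 for size 3.
freq = [9, 20, 30, 17, 10, 4]

total = sum(freq)
less_than = list(accumulate(freq))                 # from the lowest class upwards
greater_than = list(accumulate(freq[::-1]))[::-1]  # from the highest class downwards
relative = [f / total for f in freq]

for row in zip(sizes, freq, less_than, greater_than, relative):
    print(*row[:4], round(row[4], 3))

# Families with 4 members or less, and families with 5 members or more.
print(less_than[sizes.index(4)], greater_than[sizes.index(5)])
```

Under these assumed frequencies the last line answers the two queries in the text: 59 families have 4 members or less and 31 families have 5 members or more.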
It may be noted that we take one class for each different value of a discrete variable when
the range of variation is small. But when the range is large we are forced to take a group of
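Family size   Relative frequency   Cumulative frequency   Cumulative frequency
                                   (less-than type)       (greater-than type)
2             0.100                 9                     90
3             0.222                29                     81
4             0.333                59                     61
5             0.189                76                     31
6             0.111                86                     14
7             0.044                90                      4
Total         0.999 (approximately 1)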
3.4 (b) Case of a continuous variable
A continuous variable may take an infinite number of values within its range of variation
and, as such, it is natural that an individual class cannot be considered for each distinct value of
the variable. To explain this fact, let us consider a continuous variable, namely height of
persons, and record the data (in cm) as 165.5, 166.4, 165.2 etc. Here each figure is correct to
one decimal place and in the real sense the reading 165.5, for example, means any value
between 165.45 and 165.55.
In fact, some suitable technique of classification would be necessary for presenting this
kind of data in some classes, their number being not very large. However, during the
construction of frequency distribution of such a character, we come across the following useful
terms.
24 STATISTICAL TOOLS AND
TECHNIQUES
boundary of the next class. The class boundaries are used for forming the frequency distribution
of a continuous variable.
5. Class-mark: The mid-value of a class interval that lies half-way between its two end
points (i.e. class limits or class boundaries) is termed as class-mark.
6. Class width: The difference between the upper and lower boundaries of a class interval
is called the width or size of the class.
7. Frequency density: The frequency density of a class is the frequency per unit width
of the class, i.e.,
frequency density = class frequency / width of class-interval.
Frequency densities are used for comparing the concentration of frequencies in different
classes, particularly when the classes are of unequal width.
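A small computation shows why frequency density, rather than frequency, is the fair basis of comparison when widths differ. The sketch below is in Python; the three classes and their frequencies are hypothetical and the function name frequency_density is illustrative.

```python
# Hypothetical classes of unequal width: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 20), (10, 20, 30), (20, 50, 30)]

def frequency_density(lower, upper, frequency):
    """Frequency per unit width of the class."""
    return frequency / (upper - lower)

densities = [frequency_density(*c) for c in classes]
for (lower, upper, frequency), density in zip(classes, densities):
    print(f"{lower}-{upper}", frequency, density)
```

Here the second and third classes have the same frequency, 30, but the third class is three times as wide, so its frequency density is only one third of that of the second class.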
Now, let us consider the problem of construction of frequency distribution of a continuous
variable and the relevant guidelines. Suppose we are given n values of a continuous variable.
To prepare a frequency distribution with the given values we proceed as follows.
We first pick up the smallest and the greatest of the given values. Their difference gives
the range of variation. The range is then divided into a suitable number of classes depending
on the total frequency.
In determining the classes we have to bear in mind the following points.
2. The classes should be mutually exclusive (i.e. non-overlapping) so that no value comes
under more than one class.
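Class limits
a to (a + c - d)
(a + c) to (a + 2c - d)
...
(a + (k-1)c) to (a + kc - d)
Here a ≤ the smallest given value;
c is the desired width of the classes;
k is the number of classes;
a + kc - d ≥ the greatest given value;
d = 1, 0.1 or 0.01 etc. according as the values are given in integers, up to one place of decimal, up to two places of decimal, etc.
Class boundaries                              Frequency
(a - d/2) to (a + c - d/2)
(a + c - d/2) to (a + 2c - d/2)
...
(a + (k-1)c - d/2) to (a + kc - d/2)
Total                                         n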
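The general scheme in terms of a, c, k and d can be turned into a small routine. The Python sketch below is one possible reading of that scheme, not a prescribed method; the function name class_scheme and the trial values a = 31, c = 10, k = 5, d = 1 are assumptions made for illustration.

```python
def class_scheme(a, c, k, d):
    """Return (class limits, class boundaries) for k classes of width c.

    a: lower limit of the first class, c: class width,
    d: unit of measurement (1, 0.1, 0.01, ...).
    """
    # i-th class limits: (a + i*c) to (a + (i+1)*c - d).
    limits = [(a + i * c, a + (i + 1) * c - d) for i in range(k)]
    # Boundaries extend each limit by half a unit on either side.
    boundaries = [(lo - d / 2, hi + d / 2) for lo, hi in limits]
    return limits, boundaries

# Hypothetical trial values: integer data, five classes of width 10 starting at 31.
limits, boundaries = class_scheme(a=31, c=10, k=5, d=1)
for lim, bnd in zip(limits, boundaries):
    print(lim, bnd)
```

The upper boundary of each class coincides with the lower boundary of the next, so the classes leave no gaps. For d = 0.1 or 0.01 ordinary floating-point arithmetic may show small rounding errors in the printed boundaries.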
Illustration 3.1
Suppose the following data relate to the marks of 50 students in a test on Mathematics.
72 55 54 33 48 56 34 77 65 58
47 59 44 35 75 40 45 56 55 65
48 56 52 53 34 42 58 65 43 54
46 57 62 58 53 43 47 54 60 48
Arrange the data in the form of a frequency distribution table in 5 classes of equal length.
Prepare the table for cumulative frequencies and relative frequencies.
Here the smallest value = 33 and the greatest value = 77. So range = 77 - 33 = 44. We
consider range as 50, slightly bigger than the range obtained from the data, and hence form
five classes each of length 10.
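The sorting of values into classes can be sketched in Python as follows. Two assumptions should be noted: only the 40 values reproduced above are used, although the illustration refers to 50 students, so the counts obtained are not those of the full data set; and the classes are taken as 31-40, 41-50, ..., 71-80, as in TABLE 3.6. The variable names are illustrative.

```python
# The 40 values reproduced in the text (the illustration refers to 50 students).
marks = [
    72, 55, 54, 33, 48, 56, 34, 77, 65, 58,
    47, 59, 44, 35, 75, 40, 45, 56, 55, 65,
    48, 56, 52, 53, 34, 42, 58, 65, 43, 54,
    46, 57, 62, 58, 53, 43, 47, 54, 60, 48,
]

first_lower, width, n_classes = 31, 10, 5  # classes 31-40, ..., 71-80

counts = [0] * n_classes
for mark in marks:
    counts[(mark - first_lower) // width] += 1  # index of the class containing mark

for i, count in enumerate(counts):
    lower = first_lower + i * width
    print(f"{lower}-{lower + width - 1}", count)
```

Integer division by the class width plays the role of the tally marks: each value is assigned to exactly one class, so the classes are mutually exclusive.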
TABLE 3.6
TALLY MARKS FOR THE DATA ON MARKS
Class limits    Tally marks
31-40
41-50
51-60
61-70
71-80
TABLE 3.7
FREQUENCY DISTRIBUTION OF MARKS OF 50 STUDENTS IN A COLLEGE
Class boundaries    Frequency
30.5-40.5            6
40.5-50.5           14
50.5-60.5           20
60.5-70.5            7
70.5-80.5            3
Total               50
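TABLE 3.8
RELATIVE FREQUENCY AND CUMULATIVE FREQUENCY TABLE OF MARKS
Class boundaries   Relative frequency   Cumulative frequency   Cumulative frequency
                                        (less-than type)       (more-than type)
70.5-80.5          0.06                 50                     3
Total              1.00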
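The columns of such a table follow directly from the class frequencies of TABLE 3.7. The Python sketch below shows the computation under one assumption: the frequency of the class 60.5-70.5 is taken as 7, a value inferred from the total of 50 and the other four frequencies. The variable names are illustrative.

```python
from itertools import accumulate

boundaries = [(30.5, 40.5), (40.5, 50.5), (50.5, 60.5), (60.5, 70.5), (70.5, 80.5)]
# The frequency 7 for 60.5-70.5 is inferred from the total: 50 - (6 + 14 + 20 + 3).
freq = [6, 14, 20, 7, 3]

total = sum(freq)
relative = [f / total for f in freq]
less_than = list(accumulate(freq))              # running total from the lowest class
more_than = list(accumulate(freq[::-1]))[::-1]  # running total from the highest class

for (lower, upper), r, lt, mt in zip(boundaries, relative, less_than, more_than):
    print(f"{lower}-{upper}", f"{r:.2f}", lt, mt)
```

For the class 70.5-80.5 this gives relative frequency 0.06, less-than cumulative frequency 50 and more-than cumulative frequency 3.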
Diagrammatic representation of frequency distribution
If a frequency distribution is exhibited in diagrams, then an overall idea regarding the
distribution may be readily developed. There are several modes of such graphical representation
but the choice of suitable figure depends on the nature of the character concerned.
Fig 3.1 Column diagram for the frequency distribution of family size (TABLE 3.4).
Fig 3.2 Frequency polygon for the frequency distribution of family size (TABLE 3.4).
types will resemble two staircases, the former ascending from left to right and the latter
ascending from right to left.
(c) Case of a continuous variable
(i) Frequency polygon: A frequency polygon may be drawn to exhibit the frequency
distribution of a continuous variable, provided the classes are of equal width. Here the
frequencies are plotted against the class-marks of the respective classes on the assumption
that the frequency of a class corresponds to its mid-value. Finally, the plotted points are joined
successively by line segments. To get a closed polygon, we take two additional classes, one
at each end, which have zero frequencies.
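The points of a frequency polygon can be listed before any drawing is attempted. The Python sketch below uses the class boundaries of TABLE 3.7; the frequency 7 for the class 60.5-70.5 is an inference from the total of 50, and the variable names are illustrative.

```python
# Class boundaries and frequencies of TABLE 3.7; the frequency 7 for
# 60.5-70.5 is inferred from the total of 50.
boundaries = [30.5, 40.5, 50.5, 60.5, 70.5, 80.5]
freq = [6, 14, 20, 7, 3]
width = boundaries[1] - boundaries[0]  # equal class width, 10

# Class-mark = mid-value of each class.
class_marks = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Add one empty class at each end so that the polygon closes on the axis.
xs = [class_marks[0] - width] + class_marks + [class_marks[-1] + width]
ys = [0] + freq + [0]

polygon = list(zip(xs, ys))
print(polygon)
```

Joining these points successively by line segments in any plotting tool gives the closed polygon, which starts and ends on the horizontal axis.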
(ii) Histogram: It is an appropriate diagram for representing the frequency distribution
of a continuous variable in the sense that it considers the fact that the frequency of a class is
dispersed over the interval. Here two coordinate axes are taken and the class-boundaries are
shown on the horizontal axis for locating the class intervals. Next, a rectangle is drawn over
each class-interval so that its area indicates the corresponding class frequency. In other words,
the height of a rectangle becomes equal to the corresponding frequency density. In this manner
a series of adjoining rectangles are erected so that the area covered by this entire group of
rectangles exhibits the total frequency. The diagram so formed is called the histogram of the
frequency distribution. It should be noted that the widths of rectangles, which are same as the
corresponding class widths, are not necessarily equal.
A histogram is used, as we shall see later, for finding mode of frequency distribution or
number of observations above or below a specified variate value. It also gives a rough idea
of the shape of the distribution.
Fig 3.4 Histogram for the frequency distribution of marks (TABLE 3.7).
Fig 3.5 Ogive for the frequency distribution of marks (TABLE 3.8).
(iii) Ogive: This diagram is used for exhibiting the frequency distribution of a continuous
variable in terms of cumulative frequencies (of either type). To draw an ogive, initially, two
rectangular axes of coordinates are taken, the horizontal one showing the variable values
and the vertical one representing the cumulative frequencies. In case of less-than type
cumulative frequencies, they are plotted against the upper class-boundaries as different points
which are joined successively by line segments to get less-than type ogive. Again, for a
more-than type ogive, cumulative frequencies of more-than type correspond to lower class-boundaries,
the mode of construction being similar to the previous one. Obviously, cumulative frequency
of less-than type is zero for the lower boundary of the lowest class and it is included in drawing
the diagram. Similarly, cumulative frequency of greater-than type is zero for the upper-boundary
of the highest class which has to be included.
An ogive is primarily used for finding quantiles of different orders (such as median, third
quartile). From it one can also find the number of observations above or below a certain variate
value.
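Reading a quantile off a less-than ogive amounts to linear interpolation along the line segment that crosses the required cumulative frequency. The Python sketch below illustrates this for the marks data; it assumes the frequency 7 for the class 60.5-70.5 (inferred from the total of 50), and the function name quantile is illustrative.

```python
from itertools import accumulate

boundaries = [30.5, 40.5, 50.5, 60.5, 70.5, 80.5]
freq = [6, 14, 20, 7, 3]  # 7 is inferred from the total of 50
total = sum(freq)

# Less-than ogive: cumulative frequencies against upper boundaries, starting from zero.
less_than = [0] + list(accumulate(freq))
# More-than ogive: cumulative frequencies against lower boundaries, ending at zero.
more_than = [total - c for c in less_than]

def quantile(p):
    """Read the value with cumulative frequency p * total off the less-than ogive."""
    target = p * total
    points = list(zip(boundaries, less_than))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            # Linear interpolation along the segment joining the two points.
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("p must lie between 0 and 1")

print(quantile(0.5), quantile(0.75))
```

With these figures the median comes out as 53.0 and the third quartile as 59.25. The two ogives cross where both cumulative frequencies equal half the total, so their point of intersection also locates the median.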
3.6 Frequency curve
Fig 3.6 Bell-shaped frequency curve.
Fig 3.7 U-shaped frequency curve.
Fig 3.8 J-shaped frequency curves.
The frequency curves may be of different shapes: bell-shaped (symmetrical or moderately
asymmetrical), U-shaped, J-shaped, etc.