
Q No#1: Tabulation

The collected data is usually contained in schedules and questionnaires, but that is
not an easily assimilable form. The answers require some analysis if their salient
points are to be brought out. As a rule, the first step in the analysis is to classify and
tabulate the information collected or, if published statistics have been employed, to
rearrange these into new groups and tabulate the new arrangement. In some
investigations, the classification and tabulation may give such a clear picture of
the significance of the material that no further analysis is required. In other cases
these processes, though they may materially assist the analysis, are not a sufficient
presentation of the facts. They are, however, very important. The schedules may have
been very carefully drawn up and the answers may be both complete and accurate, but
until these answers are all brought together into the classes to which they belong and
the whole information is displayed in tabular form, no one will be a great deal wiser
as to the contents of the replies.

5 Major Objectives of Tabulation:

(1) To Simplify the Complex Data

• It reduces the bulk of information, i.e. raw data, to a simplified and meaningful
form so that it can be easily understood by a common man in less time.

(2) To Bring Out Essential Features of the Data

• It brings out the chief/main characteristics of data.

• It presents facts clearly and precisely without textual explanation.

(3) To Facilitate Comparison

• Presentation of data in rows and columns is helpful in simultaneous detailed
comparison on the basis of several parameters.

(4) To Facilitate Statistical Analysis

• Tables serve as the best source of organized data for further statistical
analysis.
• The task of computing average, dispersion, correlation, etc. becomes much
easier if data is presented in the form of a table.

(5) Saving of Space

• A table presents facts in a better way than the textual form.

• It saves space without sacrificing the quality and quantity of data.

Q No#2: Frequency distribution and its Graphs:


A frequency distribution is a representation, in either graphical or tabular format,
that displays the number of observations within a given interval. The interval size
depends on the data being analyzed and the goals of the analyst. The intervals must
be mutually exclusive and exhaustive. Frequency distributions are typically used
within a statistical context. Generally, frequency distribution can be associated with
the charting of a normal distribution.
Frequency Distribution:

Many times it is not easy or feasible to find the frequency of data from a very large
dataset. So to make sense of the data we make a frequency table and graphs. Let us
take the example of the heights of ten students in centimeters.

Frequency Distribution Table

139, 145, 150, 145, 136, 150, 152, 144, 138, 138
A frequency table of these values helps us make better sense of the data given. Also,
when the data set is too big (say if we were dealing with 100 students) we use tally
marks for counting. It makes the task more organized and easier. Below is an example
of how we use tally marks.
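
As a rough sketch, the frequency table for the ten heights listed above can be built in Python with `collections.Counter`, printing tally marks next to each count. The names `heights`, `frequency` and `tally`, and the use of `/` for the crossing fifth stroke, are illustrative choices and not part of the original notes.

```python
from collections import Counter

# The ten student heights (in centimeters) from the example above.
heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]


def tally(count):
    """Render a count as tally marks, five strokes per group."""
    groups, rest = divmod(count, 5)
    marks = ["||||/"] * groups
    if rest:
        marks.append("|" * rest)
    return " ".join(marks)


# Counter maps each distinct height to the number of times it occurs.
frequency = Counter(heights)

print(f"{'Height (cm)':<12}{'Tally':<8}{'Frequency'}")
for height in sorted(frequency):
    print(f"{height:<12}{tally(frequency[height]):<8}{frequency[height]}")
print(f"{'Total':<20}{sum(frequency.values())}")
```

The frequencies add up to 10, the number of students, which is a quick check that no observation was missed.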

Frequency Distribution Table


 
 

Q No#3: What is an Average?


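In statistics, an average is defined as the number that measures the central tendency
of a given set of numbers. There are a number of different averages including but not
limited to: mean, median and mode. The range, covered at the end of this section,
measures the spread of the numbers rather than their central tendency.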

Mean
The mean is what most people commonly refer to as an average. The mean is the
number you obtain when you sum up a given set of numbers and then divide this
sum by the count of numbers in the set. It is more precisely called the
arithmetic mean.

Mean Example Problems


Example 1
Find the mean of the set of numbers below

Solution

The first step is to count how many numbers there are in the set, which we shall
call n

The next step is to add up all the numbers in the set

The last step is to find the actual mean by dividing the sum by n
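
The three steps can be sketched in Python as follows. The numbers for Example 1 are not reproduced in these notes, so this sketch reuses the ten student heights from the frequency distribution section. The variable names are illustrative.

```python
# Heights (cm) borrowed from the frequency distribution section,
# since the data for Example 1 is not reproduced in these notes.
heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]

# Step 1: count how many numbers there are in the set.
n = len(heights)

# Step 2: add up all the numbers in the set.
total = sum(heights)

# Step 3: divide the sum by n to get the arithmetic mean.
mean = total / n

print(f"n = {n}, sum = {total}, mean = {mean}")
```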

Median
The median is defined as the number in the middle of a given set of numbers
arranged in order of increasing magnitude. When given a set of numbers, the median
is the number positioned in the exact middle of the list when you arrange the
numbers from the lowest to the highest. If the set contains an even count of
numbers, there are two middle numbers and the median is their mean. The median is
also a measure of average, that is, of central tendency rather than of dispersion. It
is important because it splits the ordered set into two halves and is little affected
by extreme values.

Example 3
Find the median in the set of numbers given below

Solution

From the definition of median, we should be able to tell that the first step is to
rearrange the given set of numbers in order of increasing magnitude, i.e. from the
lowest to the highest

Then we inspect the set to find that number which lies in the exact middle.
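
A minimal Python sketch of these two steps is given below. The data for Example 3 is not reproduced in these notes, so it again uses the ten heights from the frequency distribution section. It assumes the usual convention that, for an even count of numbers, the median is the mean of the two middle numbers.

```python
def median(values):
    """Return the middle value of the data after sorting it."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    # Step 1: arrange the numbers from the lowest to the highest.
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    # Step 2: pick the number in the exact middle.
    if n % 2 == 1:
        return ordered[middle]
    # Even count: average the two middle numbers (assumed convention).
    return (ordered[middle - 1] + ordered[middle]) / 2


heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]
print(median(heights))          # the two middle values are 144 and 145
print(median([1, 3, 5, 6, 7]))  # odd count: the single middle value
```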

Mode
The mode is defined as the element that appears most frequently in a given set of
elements. Using the definition of frequency given above, mode can also be defined
as the element with the largest frequency in a given data set.

For a given data set, there can be more than one mode. As long as those elements all
have the same frequency and that frequency is the highest, they are all the modal
elements of the data set.
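
The definition, including the case of several modes, can be sketched in Python with `collections.Counter`. The function name `modes` is hypothetical, and the data is the set of ten heights from the frequency distribution section.

```python
from collections import Counter


def modes(values):
    """Return every element that shares the highest frequency, in sorted order."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]
# 138, 145 and 150 each occur twice, so all three are modal elements.
print(modes(heights))
```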

Example 5

Find the Mode of the following data set.

Solution

Mode = 3 and 15
Range
The range is defined as the difference between the highest and lowest number in a
given data set.

Example 7

Find the range of the data set below

Solution

Range = 20 − 3 = 17

Q No#4: Dispersion and Measures of Dispersion


Dispersion is the state of being dispersed or spread. Statistical dispersion means the
extent to which numerical data is likely to vary about an average value. In other
words, dispersion helps to understand the distribution of the data.
Measures of Dispersion
In statistics, the measures of dispersion help to interpret the variability of data, i.e. to
know how homogeneous or heterogeneous the data is. In simple terms, they
show how squeezed or scattered the variable is.

Types of Measures of Dispersion


There are two main types of dispersion methods in statistics:

• Absolute Measure of Dispersion

• Relative Measure of Dispersion

Absolute Measure of Dispersion


An absolute measure of dispersion is expressed in the same units as the original data set.
Absolute dispersion methods express the variation in terms of the average of the
deviations of the observations, as the standard deviation and mean deviation do. They
include the range, standard deviation, quartile deviation, etc.
The types of absolute measures of dispersion are:

1. Range: It is simply the difference between the maximum value and the
minimum value in a data set. Example: 1, 3, 5, 6, 7 => Range = 7 − 1 = 6
2. Variance: Subtract the mean from each value in the set, square each
difference, add the squares, and finally divide the sum by the total number of
values in the data set. Variance: $\sigma^2 = \sum (X - \mu)^2 / N$
3. Standard Deviation: The square root of the variance is known as the standard
deviation, i.e. S.D. $= \sigma = \sqrt{\sigma^2}$.
4. Quartiles and Quartile Deviation: The quartiles are values that divide a list of
numbers into quarters. The quartile deviation is half of the distance between
the third and the first quartile.
5. Mean and Mean Deviation: The average of the numbers is known as the mean,
and the arithmetic mean of the absolute deviations of the observations from a
measure of central tendency is known as the mean deviation.
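
As an illustration, the range and the mean deviation of the example data 1, 3, 5, 6, 7 can be computed with the short Python sketch below. It assumes the mean deviation is taken about the arithmetic mean. The definition above also allows other measures of central tendency, such as the median.

```python
data = [1, 3, 5, 6, 7]

# Range: maximum value minus minimum value.
data_range = max(data) - min(data)

# Mean: sum of the values divided by how many there are.
mean = sum(data) / len(data)

# Mean deviation (about the mean, by assumption): the average of the
# absolute distances between each value and the mean.
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

print(f"range = {data_range}")
print(f"mean = {mean}")
print(f"mean deviation = {mean_deviation:.2f}")
```

Both results are in the same unit as the data, which is what makes them absolute measures.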

Relative Measure of Dispersion:
The relative measures of dispersion are used to compare the distributions of two or
more data sets. These measures compare values without units. Common relative
dispersion methods include:

1. Coefficient of Range
2. Coefficient of Variation
3. Coefficient of Standard Deviation
4. Coefficient of Quartile Deviation
5. Coefficient of Mean Deviation

Coefficient of Dispersion
The coefficients of dispersion are calculated along with the measures of dispersion
when two series that differ widely in their averages are compared. The dispersion
coefficient is also used when two series with different units of measurement are
compared. It is denoted as C.D.
The common coefficients of dispersion are:

C.D. in Terms of             Coefficient of Dispersion

Range                        C.D. = (Xmax – Xmin) ⁄ (Xmax + Xmin)
Quartile Deviation           C.D. = (Q3 – Q1) ⁄ (Q3 + Q1)
Standard Deviation (S.D.)    C.D. = S.D. ⁄ Mean
Mean Deviation               C.D. = Mean deviation ⁄ Average
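
The four coefficients can be sketched in Python with the standard `statistics` module. This sketch rests on several assumptions not stated in the notes: the population standard deviation is used, the quartiles come from the "exclusive" method of `statistics.quantiles` (which follows an (n + 1)-based position rule), and the mean deviation is taken about the mean. The data is the set of ten heights from the frequency distribution section, expressed once in centimeters and once in meters.

```python
import statistics


def coefficient_of_range(values):
    return (max(values) - min(values)) / (max(values) + min(values))


def coefficient_of_quartile_deviation(values):
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return (q3 - q1) / (q3 + q1)


def coefficient_of_standard_deviation(values):
    # Population standard deviation, by assumption.
    return statistics.pstdev(values) / statistics.mean(values)


def coefficient_of_mean_deviation(values):
    mean = statistics.mean(values)
    mean_deviation = sum(abs(x - mean) for x in values) / len(values)
    return mean_deviation / mean


heights_cm = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]
heights_m = [h / 100 for h in heights_cm]  # same data in a different unit

for func in (
    coefficient_of_range,
    coefficient_of_quartile_deviation,
    coefficient_of_standard_deviation,
    coefficient_of_mean_deviation,
):
    print(f"{func.__name__}: cm = {func(heights_cm):.4f}, m = {func(heights_m):.4f}")
```

Each coefficient comes out the same for both units, which illustrates why relative measures are suited to comparing series measured in different units.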

Measures of Dispersion Formulas


The most important formulas for the different dispersion methods are:
Arithmetic Mean Formula

The sum of all the numbers in a group, divided by the number of items in the
group, is known as the Arithmetic Mean or Mean of the group. For example, the mean of
the numbers 5, 7, 9 is 7, since 5 + 7 + 9 = 21 and 21 divided by 3 [there are three
numbers] is 7.
$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$$
Quartile Formula
Quartiles divide a set of observations into 4 equal parts. The middle term
between the first term and the median is known as the first or lower quartile and is
written as Q1. Similarly, the middle term between the median and the last term is
known as the third or upper quartile and is denoted as Q3. The second quartile
is the median and is written as Q2.
$$Q_1 = \left(\frac{n+1}{4}\right)^{\text{th}} \text{ term}$$
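
The position rule can be sketched in Python as below. Two points are assumptions that go beyond the formula given: the rule is extended to the k-th quartile as the k(n + 1)/4-th term, and when the position falls between two terms the value is found by linear interpolation. The function name `quartile` is hypothetical, and the data is the set of ten heights from the frequency distribution section.

```python
def quartile(values, k):
    """Return the k-th quartile (k = 1, 2, 3) using the k(n + 1)/4 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = k * (n + 1) / 4  # 1-based position in the ordered data
    if position <= 1:
        return float(ordered[0])
    if position >= n:
        return float(ordered[-1])
    lower = int(position)            # term just below the position
    fraction = position - lower      # how far towards the next term
    below, above = ordered[lower - 1], ordered[lower]
    # Linear interpolation between the two neighbouring terms (assumed).
    return below + fraction * (above - below)


heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]
for k in (1, 2, 3):
    print(f"Q{k} = {quartile(heights, k)}")
```

For these ten heights, Q1 sits at position 2.75 and Q3 at position 8.25 of the ordered list, and Q2 matches the median.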

Standard Deviation Formula

The standard deviation formula is used to measure how far the values of a data set are
dispersed. In simple words, the standard deviation is defined as the deviation of the
values or data from their mean. A lower standard deviation indicates that the
values are very close to their average, whereas a higher one means the values are far
from the mean value. It should be noted that the standard deviation can never
be negative.
Standard Deviation is of two types:

1. Population Standard Deviation


2. Sample Standard Deviation
Variance Formulas
Variance can be computed for either grouped or ungrouped data. To recall, a variance
can be of two types:

• Variance of a population
• Variance of a sample
The variance of a population is denoted by $\sigma^2$ and the variance of a sample by $s^2$.

Variance Formulas for Ungrouped Data

Formula for Population Variance


The variance of a population for ungrouped data is defined by the following formula:

• $\sigma^2 = \sum (x - \bar{x})^2 / n$

Formula for Sample Variance


The variance of a sample for ungrouped data is defined by a slightly different
formula, which divides by n − 1 instead of n:

• $s^2 = \sum (x - \bar{x})^2 / (n - 1)$
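
Both formulas, and the standard deviations obtained as their square roots, can be sketched in Python as follows. The function names are illustrative, and the data is the set of ten heights from the frequency distribution section, treated first as a whole population and then as a sample.

```python
import math


def population_variance(values):
    """Sum of squared deviations from the mean, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]

pop_var = population_variance(heights)
samp_var = sample_variance(heights)

print(f"population variance = {pop_var:.2f}, S.D. = {math.sqrt(pop_var):.2f}")
print(f"sample variance     = {samp_var:.2f}, S.D. = {math.sqrt(samp_var):.2f}")
```

The sample variance is always the larger of the two because the same sum of squares is divided by the smaller number n − 1.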

Interquartile Range Formula

The interquartile range (IQR) is a measure of variability based on dividing a data set
into quartiles. The values that separate the four parts are called the first, second, and
third quartiles, and they are denoted by Q1, Q2, and Q3, respectively.

• Q1 is the “middle” value in the first half of the rank-ordered data set.
• Q2 is the median value in the set.
• Q3 is the “middle” value in the second half of the rank-ordered data set.

The formula for the interquartile range is given below:
IQR = Q3 − Q1
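
The "middle value of each half" description can be sketched in Python as below. One detail is an assumption: when the number of observations is odd, the median itself is left out of both halves. The function names are hypothetical, and the data is the set of ten heights from the frequency distribution section.

```python
def middle_value(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def interquartile_range(values):
    """Return (Q1, Q3, IQR) using the median-of-halves description."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For odd n, skip the overall median so it belongs to neither half (assumed).
    upper_half = ordered[half + (n % 2):]
    q1 = middle_value(lower_half)
    q3 = middle_value(upper_half)
    return q1, q3, q3 - q1


heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]
q1, q3, iqr = interquartile_range(heights)
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```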
