MCA Mathematical Foundation For Computer Application 12

UNIT
12 Statistics - 1
Names of Sub-Units
Descriptive Statistics, Mean: Arithmetic, Geometric and Harmonic means, Relationship among
different means, Median for raw data and grouped data, Mode for raw data and grouped data,
Relationship among mean, median and mode, Descriptive statistics standard deviation, Variance,
Coefficient of variation
Overview
The unit begins by introducing the concept of descriptive statistics. It then explains the measures of
central tendency (mean, median and mode) and the measures of dispersion (standard deviation, variance
and coefficient of variation).
Learning Objectives
In this unit, you will learn to:

 Define Descriptive Statistics.
 Explain the measures of central tendency
 Describe the types of mean
 Determine the relationship between different means
JGI JAIN
DEEMED-TO-BE UNIVERSITY
Mathematical Foundation for Computer Application
Learning Outcomes
At the end of this unit, you would:

 Recall and evaluate the measures of the spread of data with variance, standard deviation and
range.
 Examine the calculation of median and mode
 Determine the relationship of mean, median and mode
 Learn about standard deviation, variance and coefficient of variation
Pre-Unit Preparatory Material
 https://byjus.com/mean-median-mode-formula/
12.1 INTRODUCTION
The word ‘Statistics’ is related to the Latin word ‘Status’, the German word ‘Statistik’, the French word
‘Statistique’ and the Italian word ‘Statista’. In earlier times, it was regarded as Political Arithmetic, which
was used to collect and analyse data on population and wealth to assess military strength and financial
state. Statistics can be defined as the ordered process of collection, classification, analysis and
interpretation of data. The term Statistics also refers to the techniques or tools used in the analysis of
data. The statistical method includes 4 stages:
1. Collection of data
2. Classification/Tabulation/Presentation of data
3. Analysis of data
4. Interpretation of data
In the modern era, the scope of Statistics has expanded to nearly every branch of study, and statistical
methods may be applied in every field. The two methods most commonly used for analysing data are
Descriptive Statistics and Inferential Statistics.
Different types of Statistical Methods are given below:
(i) Descriptive Statistics
(ii) Analytical Statistics
(iii) Inductive Statistics
(iv) Inferential Statistics
(v) Applied Statistics
12.2 DESCRIPTIVE STATISTICS

Descriptive statistics consists of the preliminary steps leading to the final analysis and interpretation.
The steps included in descriptive statistics are:
1. Collection of data
2. Tabulation of data
3. Measures of Central Tendency

4. Dispersion
5. Skewness
6. Index numbers
7. Time Series
Descriptive Statistics helps to bring out the characteristics of data.
12.2.1 Collection of Data

Collection of data is the most important part of Statistical investigation. This process requires the most
care and caution because it constitutes the foundation on which statistical investigation is based.
There are many resources from which one can collect the different types of data. Data is categorised
into two types: Primary data and Secondary data.
Primary data are those data that are collected originally from a primary source with some objective by
a person or an organisation. It is directly collected from informants.
The various methods of collecting primary data are:
1. Direct Personal Investigation
2. Indirect Oral Investigation
3. Data from Local agents and Correspondents
4. Mailed Questionnaire
5. Questionnaire to be filled by enumerators
6. Results of Experiments
Secondary Data are those data that are published by one organisation and used by another organisation
or person for some objective. The originality of Secondary data is less as compared to Primary data.
The Various Sources of Secondary data are:
1. International Publications
2. Reports of Committee and Commissions
3. Publications by Trade and Professional bodies
4. Newspapers
5. Publications of Research Scholars
12.2.2 Classification or Tabulation of Data

Classification is a process of arranging data in a simplified way according to their characteristics for
better analysis.
The main objectives of classification are:
1. It promotes comparison between the related variables.
2. It makes the data easier to understand.
3. It promotes the statistical treatment of collected material.

4. It arranges huge data into tables or groups based on their characteristics.
Data are classified and arranged on the following bases:
 Geographical Classification
 Chronological Classification
 Qualitative Classification
 Quantitative Classification
After the classification of data, tabulation of data is necessary. Tabulation helps in the ordered
arrangement of data systematically. The tabulation helps in summarising the data in condensed form.
Tabulation is a process of ordered and systematic presentation of numerical data in a form designed to
explain the problem under consideration.
The main objectives of Tabulation are:
1. Tabulation is helpful to avoid unnecessary repetition.
2. Data presentation is economical via tabulation.
3. It represents the characteristics and features of data.
4. Tabulation promotes presentation in form of diagrams and graphs.
12.2.3 MEASURES OF CENTRAL TENDENCY

A primary objective of statistical analysis is to obtain a central or average value. It is of great
importance because it represents the characteristics of the whole group. The average value not only
represents these characteristics but also helps to compare two sets of values.
Characteristics of Ideal Average/Central Value
1. It should be rigidly defined. There should not be any confusion regarding the definition of average
value.
2. Average value should not be affected by extreme values.
3. It should be capable of further algebraic treatment.
In this unit, we will discuss the averages: Mean, Median and Mode.
12.3 MEAN: ARITHMETIC, GEOMETRIC AND HARMONIC MEANS

The mean of a data set is an average of the given values. An average helps in the comparison of two or
more data sets. Here we will discuss three types of mean: arithmetic, geometric and harmonic.
12.3.1 ARITHMETIC MEAN

Arithmetic mean is a widely used measure of central tendency which represents the entire data by a
single value. It is commonly known as ‘Mean’. Arithmetic mean is of two types:
1. Simple Arithmetic Mean
2. Weighted Arithmetic Mean
Computation of Simple Arithmetic Mean

For individual Series, there are two methods to calculate Simple Arithmetic mean
1. Direct Method
2. Short-cut Method
The direct method includes the following steps:
Step 1: Add all the values of the variable X, i.e., find $\sum X$.
Step 2: Find the total number of observations, N.
Step 3: Apply the formula, $\bar{X} = \frac{\sum X}{N}$.
Example 1: The marks of 10 students out of 50 are given below:

40, 46, 45, 32, 25, 48, 36, 30, 24, 41
Calculate the Arithmetic mean of marks.
Solution:
Students   1   2   3   4   5   6   7   8   9   10    N = 10
Marks, X   40  46  45  32  25  48  36  30  24  41    $\sum X = 367$

$\bar{X} = \frac{\sum X}{N} = \frac{367}{10} = 36.7$
The Short-cut method is applied when the number of observations is very large. Thus, we take deviations
from the assumed mean to reach the actual mean. The method includes the following steps:
Step 1: Take any value in the series as assumed mean, i.e., A.
Step 2: Find deviations of all the observations from the assumed mean, i.e., d = X - A.
Step 3: Apply the formula, $\bar{X} = A + \frac{\sum d}{N}$.
Example 2: The marks of 10 students are given below:

40, 46, 45, 32, 25, 48, 36, 30, 24, 41
Calculate the Arithmetic mean of marks using the Short-cut method.
Solution: Let the assumed mean for the observations be A = 36.
Students   Marks (X)   d = X - 36
1          40          40 - 36 = 4
2          46          46 - 36 = 10
3          45          45 - 36 = 9
4          32          32 - 36 = -4
5          25          25 - 36 = -11
6          48          48 - 36 = 12
7          36          36 - 36 = 0
8          30          30 - 36 = -6
9          24          24 - 36 = -12
10         41          41 - 36 = 5
N = 10                 $\sum d = 7$

$\bar{X} = A + \frac{\sum d}{N} = 36 + \frac{7}{10} = 36.7$
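The direct and short-cut methods can be checked against each other with a few lines of Python. This is only an illustrative sketch: the function names `mean_direct` and `mean_shortcut` are chosen here for the example and do not come from the unit.

```python
def mean_direct(values):
    """Direct method: sum of the observations divided by their count."""
    return sum(values) / len(values)


def mean_shortcut(values, assumed_mean):
    """Short-cut method: assumed mean plus the average deviation from it."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


# Marks of the 10 students used in Examples 1 and 2.
marks = [40, 46, 45, 32, 25, 48, 36, 30, 24, 41]

print(round(mean_direct(marks), 2))        # 36.7
print(round(mean_shortcut(marks, 36), 2))  # 36.7
```

Any assumed mean gives the same result; choosing a value near the centre of the data simply keeps the deviations small when working by hand.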
For Discrete Series, the two methods to calculate Simple Arithmetic mean are:
 Direct Method
 Short-cut Method
Direct method includes the following steps:
Step 1: Multiply the variable X by its respective frequency f for all the observations.
Step 2: Find $\sum fX$.
Step 3: Find the total number of observations, $N = \sum f$.
Step 4: Apply the formula, $\bar{X} = \frac{\sum fX}{N}$.
Example 3: Calculate the Arithmetic mean for the following data.
X 5 10 15 20 25 30 35 40
f 2 6 12 10 15 8 1 4
Solution:
X     f     fX
5     2     10
10    6     60
15    12    180
20    10    200
25    15    375
30    8     240
35    1     35
40    4     160
      $\sum f = N = 58$    $\sum fX = 1260$

$\bar{X} = \frac{\sum fX}{N} = \frac{1260}{58} = 21.72$
Short-cut method includes the following steps:

Step 1: Take any value in the series as assumed mean, i.e., A.
Step 2: Find deviations of all the observations from the assumed mean, i.e., d = X - A.
Step 3: Multiply the deviations obtained in Step 2 by the corresponding frequencies and obtain $\sum fd$.
Step 4: Find the total number of observations, $N = \sum f$.
Step 5: Apply the formula, $\bar{X} = A + \frac{\sum fd}{N}$.
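Example 4: Calculate the Arithmetic mean for the following data using the Short-cut method.
X 10 20 30 40 50 60 70
f 2 5 3 4 8 2 3
Solution: Let the assumed mean for the observations be A = 40.
X     f     d = X - A   fd
10    2     -30         -60
20    5     -20         -100
30    3     -10         -30
40    4     0           0
50    8     10          80
60    2     20          40
70    3     30          90
      $\sum f = N = 27$        $\sum fd = 20$

$\bar{X} = A + \frac{\sum fd}{N} = 40 + \frac{20}{27} = 40.74$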
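For a discrete series, both methods amount to a frequency-weighted average. The sketch below reproduces the two frequency tables worked above; the function names are illustrative assumptions, not part of the unit.

```python
def frequency_mean(values, freqs):
    """Direct method for a discrete series: sum(f * X) / sum(f)."""
    n = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / n


def frequency_mean_shortcut(values, freqs, assumed_mean):
    """Short-cut method: A + sum(f * d) / N, where d = X - A."""
    n = sum(freqs)
    total_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    return assumed_mean + total_fd / n


# First table: direct method.
x1 = [5, 10, 15, 20, 25, 30, 35, 40]
f1 = [2, 6, 12, 10, 15, 8, 1, 4]
print(round(frequency_mean(x1, f1), 2))               # 21.72

# Second table: short-cut method with A = 40.
x2 = [10, 20, 30, 40, 50, 60, 70]
f2 = [2, 5, 3, 4, 8, 2, 3]
print(round(frequency_mean_shortcut(x2, f2, 40), 2))  # 40.74
```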
For Continuous Series, the methods to calculate Simple Arithmetic mean are:
i. Direct Method
ii. Short-cut Method
iii. Step deviation method
Direct method includes the following steps:

Step 1: Find the mid-value of each class, denoted by m: $m = \frac{\text{Lower limit} + \text{Upper limit}}{2}$.
Step 2: Multiply the mid-value obtained in Step 1 by the corresponding frequency and obtain $\sum fm$.
Step 3: Find the total number of observations, $N = \sum f$.
Step 4: Apply the formula, $\bar{X} = \frac{\sum fm}{N}$.
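Example 5: Calculate the Arithmetic mean for the following data using the Direct method.
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60
f 4 10 16 20 15 10
Solution:
X          f     m     fm
0 – 10     4     5     20
10 – 20    10    15    150
20 – 30    16    25    400
30 – 40    20    35    700
40 – 50    15    45    675
50 – 60    10    55    550
           $\sum f = N = 75$    $\sum fm = 2495$

$\bar{X} = \frac{\sum fm}{N} = \frac{2495}{75} = 33.27$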
Short-cut method includes the following steps:

Step 1: Find the mid-value of each class, denoted by m: $m = \frac{\text{Lower limit} + \text{Upper limit}}{2}$.
Step 2: Take any value as the assumed mean, A.
Step 3: Find the deviations of all the mid-values from the assumed mean, i.e., d = m - A.
Step 4: Multiply the deviations obtained in Step 3 by the corresponding frequencies and obtain $\sum fd$.
Step 5: Find the total number of observations, $N = \sum f$.
Step 6: Apply the formula, $\bar{X} = A + \frac{\sum fd}{N}$.
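Example 6: Calculate the Arithmetic mean for the following data using the Short-cut method.
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60
f 4 10 16 20 15 10
Solution: Let the assumed mean be A = 25.
X          f     m     d = m - A   fd
0 – 10     4     5     -20         -80
10 – 20    10    15    -10         -100
20 – 30    16    25    0           0
30 – 40    20    35    10          200
40 – 50    15    45    20          300
50 – 60    10    55    30          300
           $\sum f = N = 75$              $\sum fd = 620$

$\bar{X} = A + \frac{\sum fd}{N} = 25 + \frac{620}{75} = 33.27$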
Step deviation method includes the following steps:
Step 1: Find the mid-value of each class, denoted by m: $m = \frac{\text{Lower limit} + \text{Upper limit}}{2}$.
Step 2: Take any value as the assumed mean, A.
Step 3: Find the step deviations of all the mid-values from the assumed mean, i.e., $d' = \frac{m - A}{i}$,
where i is the width of the class interval.
Step 4: Multiply the step deviations obtained in Step 3 by the corresponding frequencies and obtain $\sum fd'$.
Step 5: Find the total number of observations, $N = \sum f$.
Step 6: Apply the formula, $\bar{X} = A + \frac{\sum fd'}{N} \times i$.
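Example 7: Calculate the Arithmetic mean for the following data using the Step deviation method.
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60
f 4 10 16 20 15 10
Solution: Let the assumed mean be A = 25; the class width is i = 10.
X          f     m     d = m - A   d' = d/i   fd'
0 – 10     4     5     -20         -2         -8
10 – 20    10    15    -10         -1         -10
20 – 30    16    25    0           0          0
30 – 40    20    35    10          1          20
40 – 50    15    45    20          2          30
50 – 60    10    55    30          3          30
           $\sum f = N = 75$                         $\sum fd' = 62$

$\bar{X} = A + \frac{\sum fd'}{N} \times i = 25 + \frac{62}{75} \times 10 = 33.27$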
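The three methods for a continuous series always give the same mean. As a rough check, the following sketch applies the step deviation formula to the class table used above. It assumes equal class widths, and the name `grouped_mean` is hypothetical.

```python
def grouped_mean(classes, freqs, assumed_mean=None):
    """Step deviation method: A + (sum(f * d') / N) * i, with d' = (m - A) / i.

    `classes` is a list of (lower, upper) limits; equal widths are assumed.
    """
    mids = [(lower + upper) / 2 for lower, upper in classes]
    width = classes[0][1] - classes[0][0]
    a = mids[0] if assumed_mean is None else assumed_mean
    n = sum(freqs)
    total_fd = sum(f * (m - a) / width for m, f in zip(mids, freqs))
    return a + total_fd / n * width


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [4, 10, 16, 20, 15, 10]

print(round(grouped_mean(classes, freqs, assumed_mean=25), 2))  # 33.27
```

The result does not depend on the assumed mean, which is why the function can fall back to the first mid-value when none is supplied.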
Example 8: Calculate missing value, when the Arithmetic mean is 11.
X 5 8 12 14 - 20 24
f 20 16 11 10 8 4 2
Solution: Let the missing value be ‘a’.
X     f     fX
5     20    100
8     16    128
12    11    132
14    10    140
a     8     8a
20    4     80
24    2     48
      $\sum f = N = 71$    $\sum fX = 628 + 8a$

$\bar{X} = \frac{\sum fX}{N} = \frac{628 + 8a}{71} = 11$
628 + 8a = 781
8a = 153
a = 19.125, or approximately 19
12.3.2 Geometric Mean

The Geometric mean is another important measure of central tendency. It indicates the typical value of a
set of numbers by using the product of the values, while the arithmetic mean uses their sum. The Geometric
mean of n observations $x_1, x_2, x_3, \ldots, x_n$ is
$G.M. = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n}$
The Geometric mean of two numbers a and b is
$G.M. = \sqrt{ab}$
Also, taking logarithms,
$\log(G.M.) = \log\left(\sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n}\right)$
$\log(G.M.) = \frac{1}{n}\log(x_1 \cdot x_2 \cdot x_3 \cdots x_n)$
$\log(G.M.) = \frac{1}{n}\left[\log x_1 + \log x_2 + \log x_3 + \cdots + \log x_n\right]$
$\log(G.M.) = \frac{1}{n}\sum_{i=1}^{n}\log x_i$
$G.M. = \text{Antilog}\left[\frac{1}{n}\sum_{i=1}^{n}\log x_i\right]$
Applications of Geometric Mean

 Geometric mean is used by many financial departments in Stock indexes.
 Geometric mean is used to calculate annual return.
 It is used to calculate growth rate or Compound annual growth rate.
 It is also used in studies of Cell division and bacterial growth rate.
Example 9: Calculate Geometric mean for the following data:
X 50 60 59 120 135 110 7 10
Solution:
X      log X
50     1.6990
60     1.7782
59     1.7709
120    2.0792
135    2.1303
110    2.0414
7      0.8451
10     1.0000
n = 8  $\sum \log X = 13.3441$

$G.M. = \text{Antilog}\left[\frac{1}{n}\sum_{i=1}^{n}\log x_i\right]$
$= \text{Antilog}\left[\frac{13.3441}{8}\right]$
= Antilog [1.6680]
= 46.56
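The logarithm-and-antilog procedure of Example 9 translates directly into code. The sketch below uses base-10 logarithms to mirror the table; `geometric_mean` is a name chosen for illustration. (Python 3.8 and later also ship `statistics.geometric_mean`, which could be used instead.)

```python
import math


def geometric_mean(values):
    """G.M. = Antilog[(1/n) * sum(log x)], defined for positive values only."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log


data = [50, 60, 59, 120, 135, 110, 7, 10]
print(round(geometric_mean(data), 2))  # 46.56
```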
12.3.3 Harmonic Mean

Harmonic mean is the average used when we need to average rates, for example speeds.
It is one of the three Pythagorean means, along with the arithmetic and geometric means, and is used in
geometry and music. For positive values that are not all equal, it is the smallest of the three means.
The Harmonic mean of n observations $x_1, x_2, x_3, \ldots, x_n$ is
$H.M. = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}$
$H.M. = \frac{n}{\sum_{i=1}^{n}\frac{1}{x_i}}$
The Harmonic mean of two numbers a and b is
$H.M. = \frac{2}{\frac{1}{a} + \frac{1}{b}} = \frac{2ab}{a + b}$
Applications of Harmonic Mean

 Harmonic mean is used to determine the Fibonacci series.
 It is used in the field of finance to calculate multiple averages.
 Quantities like speed can be calculated using Harmonic mean.
Example 10: Calculate the Harmonic mean for the following data:
X 4 6 7 11 15 20 26
Solution:
X      1/X
4      0.2500
6      0.1667
7      0.1429
11     0.0909
15     0.0667
20     0.0500
26     0.0385
n = 7  Sum = 0.8057

$H.M. = \frac{n}{\sum_{i=1}^{n}\frac{1}{x_i}} = \frac{7}{0.8057} = 8.688$
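As a quick check of Example 10, the harmonic mean can be computed without rounding the reciprocals. The function name below is an illustrative assumption. Because the table rounds each reciprocal to four decimal places, the unrounded result (about 8.69) differs slightly from the 8.688 obtained by hand.

```python
def harmonic_mean(values):
    """H.M. = n / sum(1 / x), defined for non-zero values."""
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return len(values) / sum(1 / x for x in values)


data = [4, 6, 7, 11, 15, 20, 26]
print(round(harmonic_mean(data), 2))  # 8.69
```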
12.4 RELATIONSHIP AMONG DIFFERENT MEANS

ab
For two numbers a and b, A.M. =
2
G.M. = ab
2ab G.M. 2
H.M. = =
ab A.M.
 G.M.  H.M. .  A.M.

2
 G.M .   H .M .  . A.M .


ab
Also, A.M. = and G.M. = ab
2
ab
A.M. – G.M. = – ab
2
a  b  2 ab
 A.M. – G.M. =
2
 
2
a b
 A.M. – G.M. = 0
2
 A.M. – G.M.  0
 A.M. G.M.
Since the harmonic mean is less than both arithmetic and geometric mean, the relationship among
A.M., G. M. and H. M. is
A.M. G.M. H.M.

Example 11: If the A.M. of two numbers is 20 and the G.M. is 15, find the H.M.
Solution: Given A.M. = 20 and G.M. = 15
As $G.M. = \sqrt{H.M. \times A.M.}$,
$\Rightarrow 15 = \sqrt{H.M. \times 20}$
$\Rightarrow 225 = H.M. \times 20$
$\Rightarrow H.M. = 11.25$
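The two results of this section, $(G.M.)^2 = A.M. \times H.M.$ and $A.M. \geq G.M. \geq H.M.$, can be verified numerically for any pair of positive numbers. The sketch below is only a check under that assumption; the function names and the sample pair (4, 9) are chosen for illustration.

```python
import math


def pythagorean_means(a, b):
    """Return (A.M., G.M., H.M.) of two positive numbers."""
    am = (a + b) / 2
    gm = math.sqrt(a * b)
    hm = 2 * a * b / (a + b)
    return am, gm, hm


def hm_from_am_gm(am, gm):
    """Rearranged identity: H.M. = G.M.^2 / A.M."""
    return gm ** 2 / am


am, gm, hm = pythagorean_means(4, 9)
print(am, gm, round(hm, 4))   # 6.5 6.0 5.5385
print(hm_from_am_gm(20, 15))  # 11.25
```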
12.5 MEDIAN FOR RAW DATA AND GROUPED DATA

The median of a series is the actual or estimated value of the middle item when the series is arranged in
order, so that it divides the series into two equal parts. It is a positional average and an important
measure of central tendency. Data given as an unorganised list of observations are called raw data or
ungrouped data; data that have been organised into a frequency distribution are called grouped data.
Computation of Median for Individual Series or Ungrouped Data:
Step 1: Arrange the data in increasing or decreasing order.
Step 2: Find the number of observations, N.
If N is odd, Median = value of the $\left(\frac{N + 1}{2}\right)$th observation.
If N is even, Median = mean of the $\left(\frac{N}{2}\right)$th and $\left(\frac{N}{2} + 1\right)$th observations.
Example 12: Calculate median for the following data:
126, 105, 115, 110, 104, 109, 120, 121, 116, 108, 130
Solution: Firstly, arrange the data in increasing order. That is;
104, 105, 108, 109, 110, 115, 116, 120, 121, 126, 130.
As N = 11 is odd,
Median = $\left(\frac{N + 1}{2}\right)$th observation = $\left(\frac{11 + 1}{2}\right)$th observation = 6th observation
So, Median = 115.
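The odd and even cases of the rule above can be sketched as a short function. The name `median` is illustrative; Python's standard library offers `statistics.median` with the same behaviour.

```python
def median(values):
    """Median of raw data: middle item if N is odd, mean of the two middle items if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the (N + 1)/2-th observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # N/2-th and (N/2 + 1)-th


data = [126, 105, 115, 110, 104, 109, 120, 121, 116, 108, 130]
print(median(data))          # 115
print(median([1, 2, 3, 4]))  # 2.5
```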
Computation of Median for Discrete Series (Grouped data):
If the data is discrete and given in the form of frequency distribution, follow these steps to find the
median.
Step 1: Arrange the data according to their magnitude.
Step 2: Find the cumulative frequencies.
Step 3: Median = Magnitude of the $\left(\frac{N + 1}{2}\right)$th observation.
Example 13: Calculate median for the following data:
Marks 5 10 15 20 25 30 35
Number of Students 2 6 10 12 11 6 2
Solution: After arranging the data, we will find the cumulative frequencies.
Marks (X)   No. of Students (f)   C.f.
5           2                     2
10          6                     8
15          10                    18
20          12                    30
25          11                    41
30          6                     47
35          2                     49
            N = 49

Median = Magnitude of the $\left(\frac{N + 1}{2}\right)$th observation
= Magnitude of the $\left(\frac{49 + 1}{2}\right)$th observation
= Magnitude of the 25th observation

The 25th observation falls under the cumulative frequency 30, whose value is 20. So, Median = 20.
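The cumulative-frequency search of Example 13 can be sketched as follows. The function name is hypothetical, and the code assumes the convention used in the example: the median is the first value whose cumulative frequency reaches position (N + 1)/2.

```python
from itertools import accumulate


def discrete_median(values, freqs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    pairs = sorted(zip(values, freqs))          # arrange by magnitude
    n = sum(f for _, f in pairs)
    if n == 0:
        raise ValueError("no observations")
    position = (n + 1) / 2
    running = accumulate(f for _, f in pairs)   # cumulative frequencies
    for (value, _), cum_freq in zip(pairs, running):
        if cum_freq >= position:
            return value


marks = [5, 10, 15, 20, 25, 30, 35]
students = [2, 6, 10, 12, 11, 6, 2]
print(discrete_median(marks, students))  # 20
```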
Computation of Median for Continuous Series (Grouped Data):
Step 1: Find the cumulative frequencies.
Step 2: Find the median class by locating $\frac{N}{2}$ in the column of cumulative frequencies (C.F.).
Step 3: Apply the formula, $\text{Median} = l + \frac{i}{f}\left(\frac{N}{2} - c\right)$
where i is the width of the class interval,
l is the lower limit of the median class,
f is the frequency of the median class, and
c is the cumulative frequency preceding the median class.
Example 14: Calculate the median for the following data.
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
f 3 5 13 9 6 10 4
Solution:
X          f     C.F.
0 – 10     3     3
10 – 20    5     8
20 – 30    13    21   (c)
30 – 40    9     30   (median class: l = 30, f = 9)
40 – 50    6     36
50 – 60    10    46
60 – 70    4     50
           $\sum f = N = 50$

First, we find the median class by locating $\frac{N}{2}$ in the column of C.F.: $\frac{N}{2} = 25$ falls under the
cumulative frequency 30. So, the median class interval is 30 – 40, with
l = 30, f = 9, c = 21 and i = 10.
Applying the formula, $\text{Median} = 30 + \frac{10}{9}\left(\frac{50}{2} - 21\right) = 30 + \frac{10}{9} \times 4$
= 34.44
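The interpolation formula for a continuous series can be expressed as a short function. This is a minimal sketch with a hypothetical name, `grouped_median`; it takes the class limits as (lower, upper) pairs and uses each class's own width as i.

```python
def grouped_median(classes, freqs):
    """Median = l + (i / f) * (N/2 - c) for a continuous frequency distribution."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cum = 0  # cumulative frequency preceding the current class (c)
    for (lower, upper), f in zip(classes, freqs):
        if cum + f >= half:                       # this is the median class
            return lower + (upper - lower) / f * (half - cum)
        cum += f


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [3, 5, 13, 9, 6, 10, 4]
print(round(grouped_median(classes, freqs), 2))  # 34.44
```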

Merits of Median
 Median can be located graphically.
 It is not affected by extreme values.
 It is a very useful measure for dealing with qualitative data.
 Median is the most suitable measure to apply in Skewed distributions.
12.6 MODE FOR RAW DATA AND GROUPED DATA

The word Mode is taken from the French ‘la mode’, which means the most popular phenomenon.
Thus, the mode is the value that occurs most frequently, that is, the value with the largest frequency.
On a frequency curve, it is the value of the variable at which the curve reaches its maximum.
Computation of Mode for Individual Series:
In this method, we locate mode by inspection. A value occurring most frequently is termed as mode.
Example 15: Find the mode from the following:
X: 12 10 9 16 14 12 9 10 12 15 12
Solution: From the given series, by inspection method, the number 12 occurs most of the time. So, Mode
= 12
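Locating the mode by inspection is simply counting how often each value occurs. A small sketch using the standard-library `Counter` is shown below; the function name `modes` is an assumption, and it returns a list because a series may have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the largest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(x for x, c in counts.items() if c == top)


series = [12, 10, 9, 16, 14, 12, 9, 10, 12, 15, 12]
print(modes(series))  # [12]
```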
Computation of Mode for Discrete Series:

The two methods for finding mode are:
i. Inspective method
In this method, the mode can be computed by observing individual items. The value that occurs in
the maximum frequency is the mode.
Example 16: Find the mode from the following data :
50,40,42,50,46,42,42,50,48,44,52,50
Solution: First the series converted into discrete series in an ascending order:
X 40 42 44 46 48 50 52
f 1 3 1 1 1 4 1
Since 50 is the value with the largest frequency. Mode = 50.
ii. Grouping Method
It can be difficult to locate the mode by inspection when the difference between the maximum
frequency and the frequency preceding or succeeding it is very small. In such cases, we apply the
Grouping Method to check whether the value with the highest frequency is really the mode. The
grouping method uses two tables:
a. Grouping Table
b. Analysis Table
Steps to be followed for preparing Grouping Table:

Step 1: Mark the maximum frequency in column I.
Step 2: Group the frequencies in two’s, starting from the first frequency, in column II.
Step 3: Leave the first frequency and group the others in two’s in column III.
Step 4: Group the frequencies in three’s, starting from the first frequency, in column IV.
Step 5: Leave the first frequency and group the others in three’s in column V.
Step 6: Leave the first two frequencies and group the others in three’s in column VI.
Steps to be followed for preparing Analysis Table:
Step 1: Put the column number of the grouping table in leftmost column of the Analysis Table.
Step 2: Place the values given in problem in the topmost row.
Step 3: Mark the values with highest frequencies in the grouping table.
Step 4: Calculate the total of each column and find the value with highest frequency. The value with
maximum repetition is the mode.
Example 16: Find out the mode by grouping method:
Size 5 6 7 8 9 10 11 12 13
Frequency 2 5 6 8 12 15 6 2 1
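Solution: Grouping Table
Column   Grouping                          Sums (sizes covered)
I        single frequencies                2, 5, 6, 8, 12, 15, 6, 2, 1
II       two's from the 1st frequency      7 (5-6), 14 (7-8), 27 (9-10), 8 (11-12)
III      two's from the 2nd frequency      11 (6-7), 20 (8-9), 21 (10-11), 3 (12-13)
IV       three's from the 1st frequency    13 (5-7), 35 (8-10), 9 (11-13)
V        three's from the 2nd frequency    19 (6-8), 33 (9-11)
VI       three's from the 3rd frequency    26 (7-9), 23 (10-12)

The maximum in each column is 15 (I), 27 (II), 21 (III), 35 (IV), 33 (V) and 26 (VI).

Analytical Table (x marks the sizes covered by the maximum of each column)
Col. No.   5    6    7    8    9    10   11   12   13
I                                   x
II                             x    x
III                                 x    x
IV                        x    x    x
V                              x    x    x
VI                   x    x    x
Total                1    2    4    5    2
From the Analytical Table, the Mode of the series is 10.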
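The grouping and analysis tables can also be produced mechanically. The sketch below is one possible implementation, assuming the six-column scheme described in the steps above (single frequencies, two's from the first and second frequency, three's from the first, second and third frequency). The function name is hypothetical.

```python
def grouping_method_mode(sizes, freqs):
    """Return (modal sizes, votes per size) using the six grouping columns."""
    votes = [0] * len(sizes)
    for group_size in (1, 2, 3):              # columns I; II-III; IV-VI
        for start in range(group_size):       # how many leading frequencies to skip
            starts = range(start, len(freqs) - group_size + 1, group_size)
            sums = [(sum(freqs[i:i + group_size]), i) for i in starts]
            if not sums:
                continue
            best = max(s for s, _ in sums)
            for s, i in sums:
                if s == best:                 # mark every size inside the maximal group
                    for j in range(i, i + group_size):
                        votes[j] += 1
    top = max(votes)
    return [size for size, v in zip(sizes, votes) if v == top], votes


sizes = [5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [2, 5, 6, 8, 12, 15, 6, 2, 1]
mode, votes = grouping_method_mode(sizes, freqs)
print(mode)   # [10]
print(votes)  # [0, 0, 1, 2, 4, 5, 2, 0, 0]
```

The `votes` list corresponds to the totals row of the analysis table.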

For Continuous Series
The following steps should be followed for Calculating mode:
Step 1: First, identify the modal class by inspection.
Step 2: Apply the formula,
$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i$
where l is the lower limit of the modal class,
$f_m$ is the frequency of the modal class,
$f_1$ is the frequency of the class preceding the modal class,
$f_2$ is the frequency of the class succeeding the modal class, and
i is the width of the class interval.
Example 17: Calculate the mode for following data:
Marks 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
Frequency 2 5 8 12 8 6 2
Solution: By inspection, the modal class is 30 – 40, so
l = 30, $f_m$ = 12, $f_1$ = 8, $f_2$ = 8 and i = 10.
$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i$
$\text{Mode} = 30 + \frac{12 - 8}{24 - 8 - 8} \times 10 = 35$
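The modal-class formula can be sketched in code as below. Two points are assumptions made for this illustration rather than statements from the unit: the modal class is taken to be the first class with the largest frequency, and a missing neighbouring class (when the modal class is the first or last one) is treated as having frequency 0.

```python
def grouped_mode(classes, freqs):
    """Mode = l + (fm - f1) / (2*fm - f1 - f2) * i for a continuous series."""
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    fm = freqs[k]
    f1 = freqs[k - 1] if k > 0 else 0                  # assumed 0 if no preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0     # assumed 0 if no succeeding class
    lower, upper = classes[k]
    denominator = 2 * fm - f1 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*fm = f1 + f2")
    return lower + (fm - f1) / denominator * (upper - lower)


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [2, 5, 8, 12, 8, 6, 2]
print(grouped_mode(classes, freqs))  # 35.0
```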
Merits of Mode
 Mode is readily intelligible and is used in many everyday situations.
 It is not affected by extreme values.
 Mode can be located graphically.
 Mode is useful in the field of Commerce and Business.
12.7 RELATIONSHIP AMONG MEAN, MEDIAN AND MODE
Let Mean = $\bar{X}$, Mode = Z and Median = M.

In a perfectly symmetrical distribution, the mean, median and mode are equal:
$\bar{X} = M = Z$
In a moderately asymmetrical (skewed) distribution, the mean, median and mode are not equal.
For a positively skewed distribution, $\bar{X} > M > Z$
For a negatively skewed distribution, $\bar{X} < M < Z$

The empirical relation among mean, median and mode for a moderately skewed distribution is
Mean – Mode = 3 (Mean – Median), i.e., $\bar{X} - Z = 3(\bar{X} - M)$

Example 18: Calculate Mean, Median and Mode for the following data:
X 5 10 15 20 25 30 35 40 45 50
f 4 6 10 12 15 12 8 6 5 2
Solution:
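Let the assumed mean be A = 25 and the common factor be i = 5.
X     f     d = X - A   d' = d/5   fd'    c.f.
5     4     -20         -4         -16    4
10    6     -15         -3         -18    10
15    10    -10         -2         -20    20
20    12    -5          -1         -12    32
25    15    0           0          0      47
30    12    5           1          12     59
35    8     10          2          16     67
40    6     15          3          18     73
45    5     20          4          20     78
50    2     25          5          10     80
      N = 80                       $\sum fd' = 10$

Mean = $\bar{X} = A + \frac{\sum fd'}{N} \times i = 25 + \frac{10}{80} \times 5 = 25.625$
Median = Magnitude of the $\left(\frac{N + 1}{2}\right)$th observation
= Magnitude of the $\left(\frac{80 + 1}{2}\right)$th observation
= Magnitude of the 40.5th observation
= 25
By inspection, Mode = 25 (the value with the largest frequency, 15).
Mean = 25.625, Mode = 25 and Median = 25.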
12.8 DESCRIPTIVE STATISTICS STANDARD DEVIATION

The concept of standard deviation was introduced by the statistician Karl Pearson. Standard deviation
is a measure of dispersion that shows how widely the data are spread around the mean. It is also called
the ‘Root Mean Square Deviation’ and is denoted by $\sigma$. Standard deviation is generally preferred
to mean deviation and is widely used in economics and business.
Computation of Standard deviation for Individual Series
The following steps should be followed for computing Standard deviation:
Step 1: First calculate the mean, $\bar{X}$.
Step 2: Find the deviations from the mean and check that the sum of the deviations is zero.
Step 3: Compute the squares of the deviations and find their sum.
Step 4: Apply the formula
$\sigma = \sqrt{\frac{\sum x^2}{N}}$
where x is the deviation from the mean, $x = X - \bar{X}$.

Example 19: Calculate Standard deviation from the following data:
Marks: 10 8 12 16 14 20 22 25 28 15
Solution:
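Marks (X)   $x = X - \bar{X}$   $x^2$
10          -7                  49
8           -9                  81
12          -5                  25
16          -1                  1
14          -3                  9
20          3                   9
22          5                   25
25          8                   64
28          11                  121
15          -2                  4
N = 10, $\sum X = 170$    $\sum x = 0$    $\sum x^2 = 388$

$\bar{X} = \frac{\sum X}{N} = \frac{170}{10} = 17$
$\sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{388}{10}} = 6.23$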
Methods based on Assumed Mean

Step 1: Take any value from the series as the assumed mean, A.
Step 2: Find the deviations from the assumed mean, i.e., d = X – A.
Step 3: Compute the squares of the deviations and find their sum, $\sum d^2$.
Step 4: Apply the formula
$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$
where d is the deviation from the assumed mean.

Example 20: Calculate the Standard Deviation from the following data:
12 15 18 24 25 30 35 40
Solution:
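Let the assumed mean be A = 25.
X      d = X - A   $d^2$
12     -13         169
15     -10         100
18     -7          49
24     -1          1
25     0           0
30     5           25
35     10          100
40     15          225
N = 8  $\sum d = -1$    $\sum d^2 = 669$

$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$
$= \sqrt{\frac{669}{8} - \left(\frac{-1}{8}\right)^2}$
$= \sqrt{83.625 - 0.016}$
$= \sqrt{83.609} = 9.144$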
Method based on actual use of data
Step 1: Find the sum of the values in the series, i.e., $\sum X$.
Step 2: Square each value of X and add the squares to obtain $\sum X^2$.
Step 3: Apply the following formula
$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$
Note. This method can be applied when the number of observations is few.
Example 21: Calculate Standard Deviation from the Series:
Wages: 5 10 15 20 25 30 35
Solution: Calculation of Standard Deviation
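X       5    10   15   20   25   30   35      $\sum X = 140$
$X^2$   25   100  225  400  625  900  1225    $\sum X^2 = 3500$

$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$
$= \sqrt{\frac{3500}{7} - \left(\frac{140}{7}\right)^2}$
$= \sqrt{500 - 400} = \sqrt{100} = 10$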
For Discrete Series:
Method based on Assumed mean
Step 1: Take the deviations of the items from the assumed mean, i.e., d = X – A.
Step 2: Multiply these deviations by their respective frequencies and sum to obtain $\sum fd$.
Step 3: Multiply fd of each row by d and sum to obtain $\sum fd^2$.
Step 4: Apply the formula
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$
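Example 22: Calculate the Standard Deviation from the following data:
X 5 10 15 20 25 30
f 2 3 7 10 5 3
Solution: Let the assumed mean be A = 15.
X      f     d = X - A   fd     $fd^2$
5      2     -10         -20    200
10     3     -5          -15    75
15     7     0           0      0
20     10    5           50     250
25     5     10          50     500
30     3     15          45     675
       N = 30            $\sum fd = 110$    $\sum fd^2 = 1700$

$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$
$= \sqrt{\frac{1700}{30} - \left(\frac{110}{30}\right)^2}$
$= \sqrt{56.67 - 13.44}$
$= \sqrt{43.22} = 6.57$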
Step Deviation Method

Step 1: Take the deviations of the values from the assumed mean, i.e., d = X – A.
Step 2: Divide the deviations by the common factor i to obtain the step deviations, d' = d/i.
Step 3: Multiply d' by the respective frequencies and sum to find $\sum fd'$.
Step 4: Multiply fd' of each row by d' and sum to obtain $\sum fd'^2$.
Step 5: Apply the formula
$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$
This method helps to simplify the calculation. The result must be multiplied by the common factor i
because the S.D. is not independent of a change of scale.
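Example 23: Calculate the Standard Deviation from the following data using the Step Deviation method:
X 20 30 40 50 60 70 80
f 2 5 7 20 7 6 3
Solution: Let the assumed mean be A = 50 and the common factor be i = 10.
X      f     d = X - A   d'    fd'    $fd'^2$
20     2     -30         -3    -6     18
30     5     -20         -2    -10    20
40     7     -10         -1    -7     7
50     20    0           0     0      0
60     7     10          1     7      7
70     6     20          2     12     24
80     3     30          3     9      27
       N = 50                  $\sum fd' = 5$    $\sum fd'^2 = 103$

$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$
$= \sqrt{\frac{103}{50} - \left(\frac{5}{50}\right)^2} \times 10$
$= \sqrt{2.06 - 0.01} \times 10 = 14.32$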
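Both discrete-series procedures above are the same formula, with a step of 1 for the plain assumed-mean method. The sketch below, using the hypothetical name `frequency_std_dev`, reproduces the two frequency tables just worked.

```python
import math


def frequency_std_dev(values, freqs, assumed_mean, step=1):
    """sigma = sqrt(sum(f*d'^2)/N - (sum(f*d')/N)^2) * i, with d' = (X - A) / i."""
    n = sum(freqs)
    d = [(x - assumed_mean) / step for x in values]
    sum_fd = sum(f * v for f, v in zip(freqs, d))
    sum_fd2 = sum(f * v * v for f, v in zip(freqs, d))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * step


# Assumed-mean method, A = 15.
x1 = [5, 10, 15, 20, 25, 30]
f1 = [2, 3, 7, 10, 5, 3]
print(round(frequency_std_dev(x1, f1, assumed_mean=15), 2))           # 6.57

# Step deviation method, A = 50 and i = 10.
x2 = [20, 30, 40, 50, 60, 70, 80]
f2 = [2, 5, 7, 20, 7, 6, 3]
print(round(frequency_std_dev(x2, f2, assumed_mean=50, step=10), 2))  # 14.32
```

Changing the assumed mean or the step does not change the result; these choices only simplify hand calculation.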
For Continuous Series:

Step 1: Find the mid-value of each class, denoted by m: $m = \frac{\text{Lower limit} + \text{Upper limit}}{2}$.
Step 2: Take any value as the assumed mean A and take deviations from A: d = m – A.
Step 3: Divide the deviations by the common factor i to obtain d' = d/i.
Step 4: Multiply d' by the respective frequencies and sum to find $\sum fd'$.
Step 5: Multiply fd' of each row by d' and sum to obtain $\sum fd'^2$.
Step 6: Apply the formula
$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$
where $N = \sum f$.
Note:
1. Standard Deviation is not independent of change of Scale.
2. We can use common factor to simplify calculation only if class intervals are of equal size.
3. In case of unequal class interval, we can’t apply Step Deviation method.
Example 24. Calculate Standard Deviation from the following data :
Class 0-10 10-20 20-30 30-40 40-50 50-60

Frequency 3 5 10 12 7 3
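Solution: Let the assumed mean be A = 25 and the common factor be i = 10.
X        f     m     d     d'    fd'    $fd'^2$
0-10     3     5     -20   -2    -6     12
10-20    5     15    -10   -1    -5     5
20-30    10    25    0     0     0      0
30-40    12    35    10    1     12     12
40-50    7     45    20    2     14     28
50-60    3     55    30    3     9      27
         N = 40                  $\sum fd' = 24$    $\sum fd'^2 = 84$

$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i = \sqrt{\frac{84}{40} - \left(\frac{24}{40}\right)^2} \times 10 = \sqrt{2.10 - 0.36} \times 10 = 13.19$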
12.9 VARIANCE
The term Variance was introduced by R.A. Fisher in 1918. It is calculated by squaring the Standard
Deviation.
Variance = $(S.D.)^2 = \sigma^2$
Example 25: Calculate mean and variance from the following data:
Size 0-10 10-20 20-30 30-40 40-50 50-60

F 5 7 10 12 4 2
Solution:
Let the assumed mean be A = 25 and the common factor be i = 10.
Size     f     m     d     d'    fd'    $fd'^2$
0-10     5     5     -20   -2    -10    20
10-20    7     15    -10   -1    -7     7
20-30    10    25    0     0     0      0
30-40    12    35    10    1     12     12
40-50    4     45    20    2     8      16
50-60    2     55    30    3     6      18
         N = 40                  $\sum fd' = 9$    $\sum fd'^2 = 73$

$\bar{X} = A + \frac{\sum fd'}{N} \times i$
$= 25 + \frac{9}{40} \times 10$
$= 25 + 2.25 = 27.25$
Variance = $(S.D.)^2$
$= \left[\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2\right] \times i^2$
$= \left[\frac{73}{40} - \left(\frac{9}{40}\right)^2\right] \times 10^2$
$= (1.825 - 0.0506) \times 100$
= 177.44
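The mean and variance of Example 25 can be cross-checked by working directly from the class mid-values instead of step deviations. This is a sketch under the usual grouped-data assumption that every observation in a class sits at its mid-value; the function name is hypothetical.

```python
def grouped_mean_variance(classes, freqs):
    """Return (mean, variance) of a continuous series using class mid-values."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(mids, freqs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n
    return mean, variance


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 7, 10, 12, 4, 2]
mean, variance = grouped_mean_variance(classes, freqs)
print(mean, round(variance, 2))  # 27.25 177.44
```

The standard deviation is the square root of the variance, about 13.32 here.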
12.10 COEFFICIENT OF VARIATION

It is a widely used and important relative measure of dispersion. It is used to compare the variability of
two or more series.
$C.V. = \frac{\sigma}{\bar{X}} \times 100\%$
where C.V. is the coefficient of variation, $\sigma$ is the standard deviation and $\bar{X}$ is the mean.

The series with the smaller C.V. is more uniform and less variable than the series with the larger
C.V.
Example 26: The scores of two batsmen X and Y in one-day matches are as follows. Find which batsman is more consistent.
X 10 44 32 58 0 24 42 86
Y 40 8 0 16 50 56 48 70
Solution:
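X     $x = X - \bar{X}$   $x^2$    Y     $y = Y - \bar{Y}$   $y^2$
10    -27                 729      40    +4                  16
44    +7                  49       8     -28                 784
32    -5                  25       0     -36                 1296
58    +21                 441      16    -20                 400
0     -37                 1369     50    +14                 196
24    -13                 169      56    +20                 400
42    +5                  25       48    +12                 144
86    +49                 2401     70    +34                 1156
$\sum X = 296$   $\sum x = 0$   $\sum x^2 = 5208$   $\sum Y = 288$   $\sum y = 0$   $\sum y^2 = 4392$

$\bar{X} = \frac{\sum X}{N} = \frac{296}{8} = 37$
$\bar{Y} = \frac{\sum Y}{N} = \frac{288}{8} = 36$
$\sigma_X = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{5208}{8}} = 25.51$
$\sigma_Y = \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{4392}{8}} = 23.43$
C.V. for series X = $\frac{\sigma_X}{\bar{X}} \times 100 = \frac{25.51}{37} \times 100 = 68.95\%$
C.V. for series Y = $\frac{\sigma_Y}{\bar{Y}} \times 100 = \frac{23.43}{36} \times 100 = 65.08\%$
C.V. for series Y < C.V. for series X. Hence Y is more consistent.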
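The comparison of the two batsmen can be reproduced with a short function. The name `coefficient_of_variation` is an illustrative choice. Because the code does not round the standard deviation before dividing, its results differ from the hand calculation in the second decimal place (about 68.96% and 65.09%), but the conclusion is the same.

```python
import math


def coefficient_of_variation(values):
    """C.V. = (standard deviation / mean) * 100, using the population S.D."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sd / mean * 100


scores_x = [10, 44, 32, 58, 0, 24, 42, 86]
scores_y = [40, 8, 0, 16, 50, 56, 48, 70]

cv_x = coefficient_of_variation(scores_x)
cv_y = coefficient_of_variation(scores_y)
print(round(cv_x, 2), round(cv_y, 2))
print("More consistent:", "Y" if cv_y < cv_x else "X")
```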
12.11 CONCLUSION
 Statistics can be defined as the ordered process of collection, classification, analysis and
interpretation of data. The term Statistics also refers to the techniques or tools used in the analysis of data.
 Descriptive statistics consists of the preliminary steps leading to the final analysis and interpretation.
It helps to bring out the characteristics of data.
 Classification is a process of arranging data in a simplified way according to their characteristics
for better analysis.
 Tabulation is a process of ordered and systematic presentation of numerical data in a form designed
to explain the problem under consideration.
 Arithmetic mean is a widely used measure of central tendency which represents the entire data by
a single value. It is commonly known as ‘Mean’.
 The Geometric mean is another important measure of central tendency. It indicates the typical value
of a set of numbers by using the product of the values, while the arithmetic mean uses their sum.
 The harmonic mean is the average used when we need to average rates, for example speeds.
 Data given as an unorganised list of observations are called raw data or ungrouped data; data
that have been organised into a frequency distribution are called grouped data.
 The mode is the value with the largest frequency, i.e., the value of the variable at which the
frequency curve reaches its maximum.
 Standard deviation is a measure of dispersion that shows how widely the data are spread around the
mean. It is also called the ‘Root Mean Square Deviation’ and is denoted by $\sigma$.
 The term Variance was introduced by R.A. Fisher in 1918. It is calculated by squaring the Standard
Deviation: Variance = $(S.D.)^2$.
 The coefficient of variation is a widely used and important relative measure of dispersion. It is used
to compare the variability of two or more series.
12.12 GLOSSARY
 Mean: A widely used measure of central tendency which represents the entire data by a single
value. It is also called the average.
 Median: The actual or estimated value of the middle item when the series is arranged in order, so
that it divides the series into two equal parts. It is a positional average and an important measure of
central tendency.
 Mode: The value with the largest frequency, i.e., the value of the variable at which the frequency
curve reaches its maximum.
 Standard deviation: A measure of dispersion that shows how widely the data are spread around the
mean.
 Mean = Median = Mode: This holds in a perfectly symmetrical distribution. In a moderately
asymmetrical distribution, the mean, median and mode are not equal.
 $G.M. = \sqrt{H.M. \times A.M.}$


12.13 SELF-ASSESSMENT QUESTIONS



A. Essay type Questions:

1. Find out Arithmetic mean from the following series:
Wages 30 40 50 60 70 80 90
No. of workers 2 8 18 30 50 65 70
2. The sum of deviations of a certain number of items measured from 35 is 50, and measured from 45 is
-50. Find N and the mean.
3. Calculate the median for the following data:
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80
f 2 4 10 15 20 10 6 3
4. Find the mode from the following data:
X 40 – 50 50 – 60 60 – 70 70 – 80 80 – 90
f 5 10 12 15 20
5. In a moderately skewed distribution, the values of the mode and the arithmetic mean are 20 and 35
respectively. Find the median.
6. The G.M. of 10 items on a certain variable was 14. It was later found that one of the items was wrongly
recorded as 12 instead of 21. Calculate the correct G.M.
7. Find H.M. from the following data:
X 10 20 40 60 120
f 1 3 6 5 4
8. If the A.M. of two numbers is 25 and the G.M. is 20, find the H.M.
9. Calculate S.D. of the following series:
Age 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
No. of persons 2 4 8 10 12 4
10. The following information is given:
Firms Average Wages No. of Workers S.D.
X 175 600 10
Y 186 500 9
a. Which firm pays larger wages?
b. Which firm is more flexible in wage structure?
12.14 ANSWERS AND HINTS FOR SELF-ASSESSMENT QUESTIONS
A. Hints for Essay Type Questions

1. 60.29
2. N = 10, Mean = 40
3. 42
4. 82
5. 30
6. 14.80
7. 36.78
8. 16
9. 13.219
10. a. Firm X
b. Firm Y
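The hints for questions 3, 4 and 9 can be reproduced with a short Python sketch. It assumes the usual grouped-data formulas for equal-width classes: Median = L + ((N/2 − c.f.) / f) × i, Mode = L + ((f1 − f0) / (2f1 − f0 − f2)) × i, and a standard deviation computed from class mid-points with divisor N. The function names are hypothetical, the modal class is picked by inspection rather than by a grouping table, and the last class of question 3 is assumed to be 70 – 80.

```python
from math import sqrt

def grouped_median(lowers, width, freqs):
    """Median = L + ((N/2 - c.f.) / f) * i for equal-width classes."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lowers, freqs):
        if cum + f >= n / 2:          # first class whose c.f. reaches N/2
            return lower + (n / 2 - cum) / f * width
        cum += f

def grouped_mode(lowers, width, freqs):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * i, modal class by inspection."""
    k = freqs.index(max(freqs))                      # index of the modal class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                # frequency of the preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0   # frequency of the following class
    return lowers[k] + (f1 - f0) / (2 * f1 - f0 - f2) * width

def grouped_sd(lowers, width, freqs):
    """Population standard deviation computed from class mid-points."""
    mids = [lower + width / 2 for lower in lowers]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    mean_sq = sum(f * x * x for f, x in zip(freqs, mids)) / n
    return sqrt(mean_sq - mean ** 2)

# Question 3 (last class assumed to be 70 - 80)
print(grouped_median([0, 10, 20, 30, 40, 50, 60, 70], 10, [2, 4, 10, 15, 20, 10, 6, 3]))
# Question 4
print(grouped_mode([40, 50, 60, 70, 80], 10, [5, 10, 12, 15, 20]))
# Question 9
print(round(grouped_sd([10, 20, 30, 40, 50, 60], 10, [2, 4, 8, 10, 12, 4]), 3))
```

Under these assumptions the three results agree with the hints above (42, 82 and 13.219).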
12.15 POST-UNIT READING MATERIAL
 https://www.wallstreetmojo.com/standard-deviation-examples/
12.16 TOPICS FOR DISCUSSION FORUMS
 Discuss how standard deviation and covariance can be used in the field of commerce and business.