MCA Mathematical Foundation For Computer Application 12
12 Statistics - 1
Names of Sub-Units
Descriptive Statistics, Mean: Arithmetic, Geometric and Harmonic means, Relationship among
different means, Median for raw data and grouped data, Mode for raw data and grouped data,
Relationship among mean, median and mode, Descriptive statistics standard deviation, Variance,
Coefficient of variation
Overview
The unit begins by introducing the concept of descriptive statistics. It then explains the measures of
central tendency (mean, median and mode) and the measures of dispersion (standard deviation, variance
and coefficient of variation).
Learning Objectives
Learning Outcomes
https://byjus.com/mean-median-mode-formula/
12.1 INTRODUCTION
The word ‘Statistics’ is related to the Latin word ‘Status’, the German word ‘Statistik’, the French word
‘Statistique’ and the Italian word ‘Statista’. In earlier times, it was regarded as Political Arithmetic, which
was used to collect and analyse data on population and wealth in order to assess military strength and the
financial state. Statistics can be defined as the ordered process of collection, classification, analysis and
interpretation of data. The term Statistics also refers to the techniques or tools used in the analysis of data.
The statistical method includes four stages:
1. Collection of data
2. Classification/Tabulation/Presentation of data
3. Analysis of data
4. Interpretation of data
The scope of Statistics has expanded significantly across all branches of study in the modern era, and
statistical methods may be applied in every field. Two statistical methods are most widely used for
analysing data: Descriptive Statistics and Inferential Statistics.
Different types of Statistical Methods are given below:
(i) Descriptive Statistics
(ii) Analytical Statistics
(iii) Inductive Statistics
(iv) Inferential Statistics
(v) Applied Statistics
UNIT 12: Statistics - 1 JGI JAIN
DEEMED-TO-BE UNIVERSITY
Secondary data are data that have been published by one organisation and are used by another organisation
or person for some other objective. Secondary data are therefore less original than primary data.
The various sources of secondary data are:
1. International Publications
2. Reports of Committee and Commissions
3. Publications by Trade and Professional bodies
4. Newspapers
5. Publications of Research Scholars
Data are classified, that is, arranged into an ordered form, on the following bases:
Geographical Classification
Chronological Classification
Qualitative Classification
Quantitative Classification
After the classification of data, tabulation is necessary. Tabulation is the ordered and systematic
presentation of numerical data in a form designed to explain the problem under consideration. It
summarises the data in a condensed form.
The main objectives of Tabulation are:
1. Tabulation is helpful to avoid unnecessary repetition.
2. Data presentation is economical via tabulation.
3. It represents the characteristics and features of data.
4. Tabulation promotes presentation in form of diagrams and graphs.
In this unit, we will discuss the averages: Mean, Median and Mode.
Students    1    2    3    4    5    6    7    8    9    10    N = 10
Marks, X    40   46   45   32   25   48   36   30   24   41    ΣX = 367
$$\bar{X} = \frac{\sum X}{N} = \frac{367}{10} = 36.7$$
The Short-cut method is applied when the number of observations is very large. Thus, we take deviations
from the assumed mean to reach the actual mean. The method includes the following steps:
Step 1: Take any value in the series as assumed mean, i.e., A.
Step 2: Find deviations of all the observations from the assumed mean, i.e., d = X - A.
Step 3: Apply the formula $\bar{X} = A + \frac{\sum d}{N}$.
N = 10, Σd = 7
$$\bar{X} = A + \frac{\sum d}{N} = 36 + \frac{7}{10} = 36.7$$
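The direct and short-cut methods for an individual series can be sketched in Python as follows. This is only an illustration: the function names `mean_direct` and `mean_shortcut` are hypothetical, and the marks are those of the ten students above, with the assumed mean A = 36.

```python
def mean_direct(values):
    """Direct method: sum of observations divided by their number."""
    return sum(values) / len(values)


def mean_shortcut(values, assumed_mean):
    """Short-cut method: A + (sum of deviations from A) / N."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


marks = [40, 46, 45, 32, 25, 48, 36, 30, 24, 41]
print(round(mean_direct(marks), 2))
print(round(mean_shortcut(marks, 36), 2))
```

Both calls give the same mean, because the choice of the assumed mean only shifts the deviations and does not change the result.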
For Discrete Series, the two methods to calculate Simple Arithmetic mean are:
Direct Method
Short-cut Method
Direct method includes the following steps:
Step 1: Multiply the variable X with its respective frequency for all the observations.
X 5 10 15 20 25 30 35 40
f 2 6 12 10 15 8 1 4
Solution:
X       f       fX
5       2       10
10      6       60
15      12      180
20      10      200
25      15      375
30      8       240
35      1       35
40      4       160
        Σf = N = 58    ΣfX = 1260
$$\bar{X} = \frac{\sum fX}{N} = \frac{1260}{58} = 21.72$$
Step 3: Multiply the deviation obtained in Step 2 by the corresponding frequency and then obtain Σfd.
Step 4: Find the total number of observations, N.
Step 5: Apply the formula $\bar{X} = A + \frac{\sum fd}{N}$.
X 10 20 30 40 50 60 70
f 2 5 3 4 8 2 3
X       f       d = X – A     fd
10      2       –30           –60
20      5       –20           –100
30      3       –10           –30
40      4       0             0
50      8       10            80
60      2       20            40
70      3       30            90
        Σf = N = 27           Σfd = 20
$$\bar{X} = A + \frac{\sum fd}{N} = 40 + \frac{20}{27} = 40.74$$
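For a discrete series, the two methods can be sketched as below. The function names are hypothetical, and the data are the two frequency tables worked above (A = 40 for the short-cut method).

```python
def mean_discrete_direct(xs, fs):
    """Direct method: sum(f * X) / N, where N = sum(f)."""
    n = sum(fs)
    return sum(f * x for x, f in zip(xs, fs)) / n


def mean_discrete_shortcut(xs, fs, assumed_mean):
    """Short-cut method: A + sum(f * d) / N, where d = X - A."""
    n = sum(fs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(xs, fs))
    return assumed_mean + sum_fd / n


# Direct-method example
xs1 = [5, 10, 15, 20, 25, 30, 35, 40]
fs1 = [2, 6, 12, 10, 15, 8, 1, 4]
print(round(mean_discrete_direct(xs1, fs1), 2))

# Short-cut example with assumed mean A = 40
xs2 = [10, 20, 30, 40, 50, 60, 70]
fs2 = [2, 5, 3, 4, 8, 2, 3]
print(round(mean_discrete_shortcut(xs2, fs2, 40), 2))
```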
For Continuous Series, the methods to calculate Simple Arithmetic mean are:
i. Direct Method
ii. Short-cut Method
iii. Step deviation method
Step 2: Multiply the mid-value obtained in Step 1 by the corresponding frequency and then obtain Σfm.
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60
f 4 10 16 20 15 10
Solution:
X          f       m       fm
0 – 10     4       5       20
10 – 20    10      15      150
20 – 30    16      25      400
30 – 40    20      35      700
40 – 50    15      45      675
50 – 60    10      55      550
           Σf = N = 75     Σfm = 2495
$$\bar{X} = \frac{\sum fm}{N} = \frac{2495}{75} = 33.27$$
Step 5: Apply the formula $\bar{X} = A + \frac{\sum fd}{N}$.
Example 6: Calculate the Arithmetic mean for the following data.
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60
f 4 10 16 20 15 10
X          f       m       d = m – A     fd
0 – 10     4       5       –20           –80
10 – 20    10      15      –10           –100
20 – 30    16      25      0             0
30 – 40    20      35      10            200
40 – 50    15      45      20            300
50 – 60    10      55      30            300
           Σf = N = 75                   Σfd = 620
$$\bar{X} = A + \frac{\sum fd}{N} = 25 + \frac{620}{75} = 33.27$$
Step deviation method includes the following steps:
Step 1: Find the mid-value of each class denoted by m.
$$m = \frac{\text{Lower limit} + \text{Upper limit}}{2}$$
Step 5: Apply the formula $\bar{X} = A + \frac{\sum fd'}{N} \times i$.
Example 7: Calculate the Arithmetic mean for the following data.
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60
f 4 10 16 20 15 10
X          f       m       d = m – A     d' = d/i     fd'
0 – 10     4       5       –20           –2           –8
10 – 20    10      15      –10           –1           –10
20 – 30    16      25      0             0            0
30 – 40    20      35      10            1            20
40 – 50    15      45      20            2            30
50 – 60    10      55      30            3            30
           Σf = N = 75                                Σfd' = 62
$$\bar{X} = A + \frac{\sum fd'}{N} \times i = 25 + \frac{62}{75} \times 10 = 33.27$$
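The three methods for a continuous series give the same answer, which a short Python sketch can confirm. The helper names are hypothetical; the classes and frequencies are those of Examples 6 and 7, with A = 25 and class width i = 10.

```python
def midpoints(classes):
    """Mid-value m = (lower limit + upper limit) / 2 for each class."""
    return [(lower + upper) / 2 for lower, upper in classes]


def grouped_mean_direct(classes, freqs):
    """Direct method: sum(f * m) / N."""
    n = sum(freqs)
    return sum(f * m for m, f in zip(midpoints(classes), freqs)) / n


def grouped_mean_shortcut(classes, freqs, assumed_mean):
    """Short-cut method: A + sum(f * d) / N, where d = m - A."""
    n = sum(freqs)
    sum_fd = sum(f * (m - assumed_mean)
                 for m, f in zip(midpoints(classes), freqs))
    return assumed_mean + sum_fd / n


def grouped_mean_step_deviation(classes, freqs, assumed_mean, width):
    """Step deviation method: A + (sum(f * d') / N) * i, where d' = (m - A) / i."""
    n = sum(freqs)
    sum_fd_step = sum(f * (m - assumed_mean) / width
                      for m, f in zip(midpoints(classes), freqs))
    return assumed_mean + sum_fd_step / n * width


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [4, 10, 16, 20, 15, 10]
print(round(grouped_mean_direct(classes, freqs), 2))
print(round(grouped_mean_shortcut(classes, freqs, 25), 2))
print(round(grouped_mean_step_deviation(classes, freqs, 25, 10), 2))
```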
Example 8: Calculate missing value, when the Arithmetic mean is 11.
X 5 8 12 14 - 20 24
f 20 16 11 10 8 4 2
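Let the missing value be a.
X       f       fX
5       20      100
8       16      128
12      11      132
14      10      140
a       8       8a
20      4       80
24      2       48
        Σf = N = 71    ΣfX = 628 + 8a
$$\bar{X} = \frac{\sum fX}{N} = \frac{628 + 8a}{71} = 11$$
628 + 8a = 781
8a = 153
a = 19.125, or approximately 19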
$$\text{G.M.} = \sqrt{ab}$$
$$\log(\text{G.M.}) = \frac{1}{n}\sum_{i=1}^{n} \log x_i$$
$$\text{G.M.} = \text{Antilog}\left[\frac{1}{n}\sum_{i=1}^{n} \log x_i\right]$$
Solution:
X       log X
50      1.6990
60      1.7782
59      1.7709
120     2.0792
135     2.1303
110     2.0414
7       0.8451
10      1.0000
n = 8   Σ log X = 13.3441
$$\text{G.M.} = \text{Antilog}\left[\frac{1}{n}\sum_{i=1}^{n} \log x_i\right] = \text{Antilog}\left[\frac{13.3441}{8}\right] = \text{Antilog}[1.6680] = 46.56$$
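The logarithmic computation above can be reproduced with a few lines of Python. This is a minimal sketch: `geometric_mean` is a hypothetical helper, and `math.log10` plays the role of the log table while `10 ** ...` plays the role of the antilog.

```python
import math


def geometric_mean(values):
    """G.M. = antilog of the average of the base-10 logarithms."""
    log_sum = sum(math.log10(x) for x in values)
    return 10 ** (log_sum / len(values))


values = [50, 60, 59, 120, 135, 110, 7, 10]
print(round(geometric_mean(values), 2))
```

The result agrees with the table-based answer to within the rounding of the four-figure logarithms. The method requires every observation to be positive, since the logarithm of zero or a negative number is undefined.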
X 4 6 7 11 15 20 26
Solution:
X       1/X
4       0.2500
6       0.1667
7       0.1429
11      0.0909
15      0.0667
20      0.0500
26      0.0385
n = 7   Σ(1/X) = 0.8057
$$\text{H.M.} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = \frac{7}{0.8057} = 8.688$$
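The harmonic mean of the same series can be checked in Python. The function name `harmonic_mean` is a hypothetical choice; the small difference from 8.688 arises because the table rounds each reciprocal to four decimal places.

```python
def harmonic_mean(values):
    """H.M. = n divided by the sum of reciprocals of the observations."""
    return len(values) / sum(1 / x for x in values)


xs = [4, 6, 7, 11, 15, 20, 26]
print(round(harmonic_mean(xs), 2))
```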
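For two positive numbers a and b:
$$\text{A.M.} = \frac{a + b}{2}, \qquad \text{G.M.} = \sqrt{ab}$$
$$\text{H.M.} = \frac{2ab}{a + b} = \frac{(\text{G.M.})^2}{\text{A.M.}}$$
$$\text{A.M.} - \text{G.M.} = \frac{a + b - 2\sqrt{ab}}{2}$$
$$\text{A.M.} - \text{G.M.} = \frac{\left(\sqrt{a} - \sqrt{b}\right)^2}{2} \geq 0$$
$$\text{A.M.} - \text{G.M.} \geq 0$$
$$\text{A.M.} \geq \text{G.M.}$$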
Since the harmonic mean is never greater than either the arithmetic or the geometric mean, the relationship
among A.M., G.M. and H.M. is A.M. ≥ G.M. ≥ H.M.
$$15 = \sqrt{\text{H.M.} \times 20}$$
$$225 = \text{H.M.} \times 20$$
H.M. = 11.25
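The relationship among the three means of two numbers can be verified numerically. The sketch below uses hypothetical helper names and the illustrative pair a = 4, b = 9 (not taken from the text), then solves the case G.M. = 15, A.M. = 20 worked above.

```python
import math


def three_means(a, b):
    """Return (A.M., G.M., H.M.) of two positive numbers."""
    am = (a + b) / 2
    gm = math.sqrt(a * b)
    hm = 2 * a * b / (a + b)
    return am, gm, hm


def hm_from_am_gm(am, gm):
    """For two numbers, G.M.^2 = A.M. * H.M., so H.M. = G.M.^2 / A.M."""
    return gm ** 2 / am


am, gm, hm = three_means(4, 9)
print(am >= gm >= hm)              # ordering A.M. >= G.M. >= H.M.
print(math.isclose(gm ** 2, am * hm))
print(hm_from_am_gm(20, 15))
```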
Step 3: Median = Magnitude of $\left(\frac{N + 1}{2}\right)$th observation.
Marks 5 10 15 20 25 30 35
Number of Students 2 6 10 12 11 6 2
Solution: After arranging the data, we will find the cumulative frequencies.
$$\text{Median} = \text{Magnitude of } \left(\frac{N + 1}{2}\right)\text{th observation}$$
$$= \text{Magnitude of } \left(\frac{49 + 1}{2}\right)\text{th observation}$$
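Locating the median of a discrete series through cumulative frequencies can be sketched as follows. `median_discrete` is a hypothetical helper; when (N + 1)/2 is not a whole number, this sketch averages the two neighbouring observations, which is an assumption about how ties between positions are resolved.

```python
import math


def median_discrete(xs, fs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    pairs = sorted(zip(xs, fs))          # arrange the values in ascending order
    n = sum(fs)
    position = (n + 1) / 2               # position of the median observation

    def value_at(k):
        # Return the value of the k-th observation (1-based).
        cumulative = 0
        for x, f in pairs:
            cumulative += f
            if cumulative >= k:
                return x

    lower = value_at(math.floor(position))
    upper = value_at(math.ceil(position))
    return (lower + upper) / 2


marks = [5, 10, 15, 20, 25, 30, 35]
students = [2, 6, 10, 12, 11, 6, 2]
print(median_discrete(marks, students))
```

For these data N = 49, so the median is the 25th observation, which falls in the cumulative frequency of the value 20.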
X 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
f 3 5 13 9 6 10 4
Solution:
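X          f       C.F.
0 – 10     3       3
10 – 20    5       8
20 – 30    13      21   (c)
30 – 40    9 (f)   30   (median class, l = 30)
40 – 50    6       36
50 – 60    10      46
60 – 70    4       50
           Σf = N = 50
First, we find the median class by locating N/2 in the C.F. column: N/2 = 25 is first reached at the
cumulative frequency 30. So, the median class interval = 30 – 40.
l = 30, f = 9, c = 21 and the class width i = 10
$$\text{Median} = l + \frac{\frac{N}{2} - c}{f} \times i = 30 + \frac{25 - 21}{9} \times 10 = 34.44$$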
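The median-class search and interpolation can be sketched in Python. The function name `median_grouped` is hypothetical, and the code assumes the classes are listed in ascending order.

```python
def median_grouped(classes, freqs):
    """Median = l + ((N/2 - c) / f) * i for a continuous series."""
    n = sum(freqs)
    half = n / 2
    cumulative = 0  # c: cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:
            width = upper - lower
            return lower + (half - cumulative) / f * width
        cumulative += f


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [3, 5, 13, 9, 6, 10, 4]
print(round(median_grouped(classes, freqs), 2))
```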
Merits of Median
Median can be located graphically.
It is not affected by extreme values.
It is a very useful measure for dealing with qualitative data.
Median is the most suitable measure to apply in Skewed distributions.
Size 5 6 7 8 9 10 11 12 13
Frequency 2 5 6 8 12 15 6 2 1
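Grouping Table (each group total is written against the first size in the group)
Size    I     II    III   IV    V     VI
5       2     7           13
6       5           11          19
7       6     14                      26
8       8           20    35
9       12    27                33
10      15          21                23
11      6     8           9
12      2           3
13      1
Analytical Table
Col. No.    5    6    7    8    9    10   11   12   13
I                                    1
II                              1    1
III                                  1    1
IV                         1    1    1
V                               1    1    1
VI                    1    1    1
Total                 1    2    4    5    2
Size 10 occurs the maximum number of times (5), so the mode is 10.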
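The grouping and analysis tables can be generated programmatically. The sketch below is one possible implementation under stated assumptions: `mode_by_grouping` is a hypothetical name, and columns I to VI are encoded as (group width, starting offset) pairs.

```python
from collections import Counter


def mode_by_grouping(sizes, freqs):
    """Tally how often each size belongs to the largest group in columns I-VI."""
    n = len(freqs)
    tally = Counter()
    # Column I: single frequencies; II, III: pairs; IV, V, VI: triples.
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for width, offset in columns:
        groups = [
            (sum(freqs[i:i + width]), range(i, i + width))
            for i in range(offset, n - width + 1, width)
        ]
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:
                for i in members:
                    tally[sizes[i]] += 1
    return tally


sizes = [5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [2, 5, 6, 8, 12, 15, 6, 2, 1]
tally = mode_by_grouping(sizes, freqs)
print(dict(sorted(tally.items())))
print(tally.most_common(1)[0][0])   # size with the highest tally
```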
$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i$$
Marks 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
Frequency 2 5 8 12 8 6 2
$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i$$
l = 30, f_m = 12, f_1 = 8, f_2 = 8, i = 10
$$\text{Mode} = 30 + \frac{12 - 8}{24 - 8 - 8} \times 10 = 35$$
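The mode formula for a continuous series translates directly into code. This is a minimal sketch with a hypothetical function name; it assumes a single modal class and treats a missing neighbouring class as having frequency 0.

```python
def mode_grouped(classes, freqs):
    """Mode = l + (fm - f1) / (2*fm - f1 - f2) * i for the modal class."""
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of modal class
    fm = freqs[k]
    f1 = freqs[k - 1] if k > 0 else 0                  # preceding class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0     # succeeding class
    lower, upper = classes[k]
    return lower + (fm - f1) / (2 * fm - f1 - f2) * (upper - lower)


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [2, 5, 8, 12, 8, 6, 2]
print(mode_grouped(classes, freqs))
```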
Merits of Mode
Mode is used in many everyday situations and is readily intelligible.
It is not affected by extreme values.
Mode can be located graphically.
Mode is useful in the field of Commerce and Business.
In a symmetrical distribution, the mean, median and mode are equal. In a moderately asymmetrical
distribution, the mean, median and mode are not equal.
X 5 10 15 20 25 30 35 40 45 50
f 4 6 10 12 15 12 8 6 5 2
Solution:
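$$\text{Mean} = \bar{X} = A + \frac{\sum fd'}{N} \times i = 25 + \frac{10}{80} \times 5 = 25.625$$
$$\text{Median} = \text{Magnitude of } \left(\frac{N + 1}{2}\right)\text{th observation}$$
$$= \text{Magnitude of } \left(\frac{80 + 1}{2}\right)\text{th observation}$$
= 40.5th observation
= 25
By inspection, Mode = 25 (the value with the largest frequency, 15).
Mean = 25.625, Median = 25 and Mode = 25.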
Standard deviation is a measure of dispersion that is mostly used to know how widely the data is spread.
Standard deviation is also called ‘Root Mean Square Deviation’. It is denoted by σ. Standard deviation is
a better measure than Mean deviation, and it is widely used in the economic and business fields.
Computation of Standard deviation for Individual Series
The following steps should be followed for computing Standard deviation:
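$$\sigma = \sqrt{\frac{\sum x^2}{N}}, \qquad x = X - \bar{X}$$
Here ΣX = 170 and N = 10, so the mean is 17.
Marks     x = X – X̄     x²
10        –7            49
8         –9            81
12        –5            25
16        –1            1
14        –3            9
20        3             9
22        5             25
25        8             64
28        11            121
15        –2            4
                        Σx² = 388
$$\sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{388}{10}} = 6.23$$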
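The computation for an individual series can be checked with the sketch below. The function names are hypothetical; the second function uses deviations from an assumed mean A (here A = 15, an illustrative choice) and gives the same answer as the first.

```python
import math


def std_dev(values):
    """sigma = sqrt(sum(x^2) / N), where x = X - mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def std_dev_assumed_mean(values, assumed_mean):
    """sigma = sqrt(sum(d^2)/N - (sum(d)/N)^2), where d = X - A."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return math.sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


marks = [10, 8, 12, 16, 14, 20, 22, 25, 28, 15]
print(round(std_dev(marks), 2))
print(round(std_dev_assumed_mean(marks, 15), 2))
```

Note that both functions divide by N, as in this unit, not by N - 1.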
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$
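X         d = X – A (A = 25)     d²
12        –13                    169
15        –10                    100
18        –7                     49
24        –1                     1
25        0                      0
30        5                      25
35        10                     100
40        15                     225
N = 8     Σd = –1                Σd² = 669
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{669}{8} - \left(\frac{-1}{8}\right)^2} = \sqrt{83.609} = 9.144$$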
$$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$$
Note. This method can be applied when the number of observations is few.
Wages: 5 10 15 20 25 30 35
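X      5     10    15    20    25    30    35      ΣX = 140
X²     25    100   225   400   625   900   1225    ΣX² = 3500
$$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} = \sqrt{\frac{3500}{7} - \left(\frac{140}{7}\right)^2} = \sqrt{100} = 10$$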
Step 1: Take deviation of items from the assumed mean, i.e., d = X-A.
$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$$
Example 22: Calculate the Standard Deviation from the following data:
X 5 10 15 20 25 30
f 2 3 7 10 5 3
Solution:
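$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} = \sqrt{\frac{1700}{30} - \left(\frac{110}{30}\right)^2} = \sqrt{43.22} = 6.57$$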
$$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$$
This method helps to simplify the calculation. The formula for S.D. must be multiplied by the common
factor i because S.D. is not independent of change of scale.
Example 23: Calculate the Standard Deviation from the following data:
X 20 30 40 50 60 70 80
f 2 5 7 20 7 6 3
Solution:
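$$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i = \sqrt{\frac{103}{50} - \left(\frac{5}{50}\right)^2} \times 10 = \sqrt{2.05} \times 10 \approx 14.32$$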
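The step deviation formula for a frequency distribution can be sketched as follows. `std_dev_step` is a hypothetical helper; the data are those of Example 23, with the assumed mean A = 50 and common factor i = 10 (the values that reproduce Σfd' = 5 and Σfd'² = 103).

```python
import math


def std_dev_step(xs, fs, assumed_mean, factor):
    """sigma = sqrt(sum(f*d'^2)/N - (sum(f*d')/N)^2) * i, with d' = (X - A) / i."""
    n = sum(fs)
    steps = [(x - assumed_mean) / factor for x in xs]
    sum_fd = sum(f * d for f, d in zip(fs, steps))
    sum_fd2 = sum(f * d * d for f, d in zip(fs, steps))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * factor


xs = [20, 30, 40, 50, 60, 70, 80]
fs = [2, 5, 7, 20, 7, 6, 3]
print(round(std_dev_step(xs, fs, 50, 10), 2))
```

Setting `factor` to 1 reduces the function to the ordinary assumed-mean formula, so the same helper also covers the discrete series of Example 22.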
$$\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$$
where N = Σf.
Note:
1. Standard Deviation is not independent of change of Scale.
2. We can use common factor to simplify calculation only if class intervals are of equal size.
3. In case of unequal class interval, we can’t apply Step Deviation method.
Example 24. Calculate Standard Deviation from the following data :
Solution:
X         f       m       d       d'      fd'     fd'²
0-10      3       5       -20     -2      -6      12
10-20     5       15      -10     -1      -5      5
20-30     10      25      0       0       0       0
30-40     12      35      10      1       12      12
40-50     7       45      20      2       14      28
50-60     3       55      30      3       9       27
          N = 40                          Σfd' = 24    Σfd'² = 84
12.9 VARIANCE
The term Variance was introduced by R.A. Fisher in 1918. It is calculated by squaring the Standard
Deviation.
$$\text{Variance} = (\text{S.D.})^2 = \sigma^2$$
Example 25: Calculate mean and variance from the following data:
Solution:
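$$\text{Variance} = (\text{S.D.})^2 = \left[\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2\right] \times i^2$$
$$= \left[\frac{73}{40} - \left(\frac{9}{40}\right)^2\right] \times 10^2$$
= 177.44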
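The variance formula can be sketched in two ways: from the summary totals, as in the calculation above, and directly from a grouped table. Both function names are hypothetical. The second call uses the class intervals and frequencies from the solution table of Example 24, because the data table for Example 25 is not reproduced here.

```python
def variance_from_sums(sum_fd2, sum_fd, n, width):
    """Variance = [sum(f*d'^2)/N - (sum(f*d')/N)^2] * i^2."""
    return (sum_fd2 / n - (sum_fd / n) ** 2) * width ** 2


def grouped_variance(classes, freqs):
    """Variance = sum(f * (m - mean)^2) / N, using class mid-values m."""
    n = sum(freqs)
    mids = [(lower + upper) / 2 for lower, upper in classes]
    mean = sum(f * m for m, f in zip(mids, freqs)) / n
    return sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n


# Totals from the calculation above
print(round(variance_from_sums(73, 9, 40, 10), 2))

# Grouped data from the Example 24 table
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [3, 5, 10, 12, 7, 3]
print(round(grouped_variance(classes, freqs), 2))
print(round(variance_from_sums(84, 24, 40, 10), 2))
```

The last two lines agree, which shows that the step deviation shortcut and the definition of variance lead to the same value.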
X 10 44 32 58 0 24 42 86
Y 40 8 0 16 50 56 48 70
Solution:
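X       x = X – X̄     x²        Y       y = Y – Ȳ     y²
10      –27           729       40      +4            16
44      +7            49        8       –28           784
32      –5            25        0       –36           1296
58      +21           441       16      –20           400
0       –37           1369      50      +14           196
24      –13           169       56      +20           400
42      +5            25        48      +12           144
86      +49           2401      70      +34           1156
ΣX = 296              Σx² = 5208        ΣY = 288      Σy² = 4392
$$\bar{X} = \frac{\sum X}{N} = \frac{296}{8} = 37, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{288}{8} = 36$$
$$\sigma_X = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{5208}{8}} = 25.51, \qquad \sigma_Y = \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{4392}{8}} = 23.43$$
$$\text{C.V. for series X} = \frac{\sigma_X}{\bar{X}} \times 100 = \frac{25.51}{37} \times 100 = 68.95\%$$
$$\text{C.V. for series Y} = \frac{\sigma_Y}{\bar{Y}} \times 100 = \frac{23.43}{36} \times 100 = 65.08\%$$
C.V. for series Y < C.V. for series X. Hence series Y is more consistent.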
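The comparison of the two series can be reproduced with a short sketch. `coefficient_of_variation` is a hypothetical helper that divides the standard deviation (computed with N in the denominator, as in this unit) by the mean and expresses the ratio as a percentage.

```python
import math


def coefficient_of_variation(values):
    """C.V. = (standard deviation / mean) * 100."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sd / mean * 100


series_x = [10, 44, 32, 58, 0, 24, 42, 86]
series_y = [40, 8, 0, 16, 50, 56, 48, 70]
cv_x = coefficient_of_variation(series_x)
cv_y = coefficient_of_variation(series_y)
print(round(cv_x, 2), round(cv_y, 2))
print("Y is more consistent" if cv_y < cv_x else "X is more consistent")
```

The printed values can differ from the hand calculation in the second decimal place because the hand calculation rounds the standard deviation before dividing.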
Statistics can be defined as the ordered process including collection, classification, analysis, and
interpretation of data. The term Statistics is related to techniques or tools in the analysis of data.
Descriptive statistics comprises the preliminary steps that lead to the final analysis and interpretation.
Descriptive Statistics helps to bring out the characteristics of data.
Classification is a process of arranging data in a simplified way according to their characteristics
for better analysis.
Tabulation is a process of ordered and systematic presentation of numerical data in a form designed
to explain the problem under consideration.
Arithmetic mean is a widely used measure of central tendency which represents the entire data by
a single value. It is commonly known as ‘Mean’.
The Geometric mean is another important measure of central tendency. It indicates the typical value
of a set of numbers by using the product of values while arithmetic mean uses the sum of values.
The harmonic mean is the average that is used when we need to average rates, such as speeds.
When a data set is given in unarranged form, it is called raw data or ungrouped data; if a data
set is given in an ordered and arranged form, it is called grouped data.
The value of the variable at which the frequency curve reaches its maximum is called the mode. “Mode is the
value which has the largest frequency”.
Standard deviation is a measure of dispersion that is mostly used to know how widely the data is spread.
Standard deviation is also called ‘Root Mean Square Deviation’. It is denoted by σ.
The term Variance was introduced by R.A. Fisher in 1918. It is calculated by squaring the Standard
Deviation: Variance = (S.D.)² = σ².
The coefficient of Variation is a widely used and important relative measure of dispersion. It is used
to compare the variability of two or more series.
12.12 GLOSSARY
Mean: Mean is a widely used measure of central tendency which represents the entire data by a single
value. It is also called the average.
Median: The median of a series is the actual or estimated value that divides the series into two equal
parts when the series is arranged in order. It is a positional average and an important measure of central
tendency.
Mode: The value of the variable at which the frequency curve reaches its maximum is called the mode. Mode is
the value which has the largest frequency.
Standard deviation: It is a measure of dispersion that is mostly used to know how widely the data is
spread.
In a symmetrical distribution, Mean = Median = Mode. In a moderately asymmetrical distribution, the mean,
median and mode are not equal.
$$\text{G.M.} = \sqrt{\text{H.M.} \times \text{A.M.}}$$
8. If the A.M. of 2 numbers is 25 and the G.M. is 20, find the H.M.
9. Calculate S.D. of the following series:
Age 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
No. of persons 2 4 8 10 12 4
10. The following information is given:
Firms Average Wages No. of Workers S.D.
X 175 600 10
Y 186 500 9
a. Which firm pays larger wages?
b. Which firm is more flexible in wage structure?
https://www.wallstreetmojo.com/standard-deviation-examples/
Discuss how Standard deviation and Covariance can be used in field of Commerce and Business.