Introduction To Statistics CH4
PROBABILITY
CHAPTER-IV
Measures of Dispersion
(Variation)
4.1 Introduction
Merits:
• It is rigidly defined.
• It is easy to calculate and simple to understand.
Demerits:
• It is not based on all observations.
• It is highly affected by extreme observations.
• It is affected by fluctuation in sampling.
• It is not liable to further algebraic treatment.
• It cannot be computed in the case of an open-ended distribution.
• It is very sensitive to the size of the sample.
Example:
$Q.D = \dfrac{Q_3 - Q_1}{2}$
c) Coefficient of Quartile Deviation (C.Q.D)
Values      Frequency   LCF
140-150     17          17
150-160     29          46
160-170     42          88
170-180     72          160
180-190     84          244
190-200     107         351
200-210     49          400
210-220     34          434
220-230     31          465
230-240     16          481
240-250     12          493
Solution:
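$Q_i = L_{Q_i} + \dfrac{w}{f_{Q_i}}\left(\dfrac{i\,n}{4} - cf\right)$, where $i = 1, 2, 3$.
Here $L_{Q_i}$ is the lower limit of the class containing $Q_i$, $w$ is the class width, $f_{Q_i}$ is the frequency of that class, $cf$ is the cumulative frequency of the classes before it, and $n$ is the total frequency.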
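Note that the first quartile is the value of the $\left(\dfrac{i\,n}{4}\right)$th item $= \left(\dfrac{1 \times 493}{4}\right)$th item $= 123.25$th item, which lies between the 123rd and 124th items, so it belongs to the fourth class (170-180).
$Q_1 = L_{Q_1} + \dfrac{w}{f_{Q_1}}\left(\dfrac{1 \times n}{4} - cf\right) = 170 + \dfrac{10}{72}\left(\dfrac{1 \times 493}{4} - 88\right) = 174.9$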
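Note that the third quartile is the value of the $\left(\dfrac{i\,n}{4}\right)$th item $= \left(\dfrac{3 \times 493}{4}\right)$th item $= 369.75$th item, which lies between the 369th and 370th items, so it belongs to the seventh class (200-210).
$Q_3 = L_{Q_3} + \dfrac{w}{f_{Q_3}}\left(\dfrac{3 \times n}{4} - cf\right) = 200 + \dfrac{10}{49}\left(\dfrac{3 \times 493}{4} - 351\right) = 203.83$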
Then $Q.D = \dfrac{Q_3 - Q_1}{2} = \dfrac{203.83 - 174.9}{2} = 14.47$
$C.Q.D = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{203.83 - 174.9}{203.83 + 174.9} = 0.076$
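The quartile calculation above can be checked with a short Python sketch. It assumes equal class widths and linear interpolation inside the quartile class, as in the formula for $Q_i$; the function name `grouped_quartile` is an illustrative choice, not something defined in these slides.

```python
# Class lower limits, frequencies, and common class width from the table above.
lower_limits = [140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240]
frequencies = [17, 29, 42, 72, 84, 107, 49, 34, 31, 16, 12]
width = 10


def grouped_quartile(i, lower_limits, frequencies, width):
    """Return Q_i (i = 1, 2, 3) by interpolating inside the quartile class."""
    n = sum(frequencies)
    position = i * n / 4      # the (i*n/4)-th item
    cumulative = 0            # cf: cumulative frequency before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= position:
            return lower + (width / f) * (position - cumulative)
        cumulative += f
    raise ValueError("position lies beyond the last class")


q1 = grouped_quartile(1, lower_limits, frequencies, width)
q3 = grouped_quartile(3, lower_limits, frequencies, width)
quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)

print(round(q1, 2), round(q3, 2))                              # 174.9 203.83
print(round(quartile_deviation, 2), round(coefficient_qd, 3))  # 14.47 0.076
```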
x_i      |x_i - mean|   |x_i - median|   |x_i - mode|
         (mean = 6)     (median = 5.5)   (mode = 5)
4        2              1.5              1
4        2              1.5              1
5        1              0.5              0
5        1              0.5              0
5        1              0.5              0
6        0              0.5              1
7        1              1.5              2
7        1              1.5              2
8        2              2.5              3
9        3              3.5              4
Total    14             14               14
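• M.D about the mean: $M.D(\bar{X}) = \dfrac{\sum_{i=1}^{n} |x_i - \bar{X}|}{n} = \dfrac{\sum_{i=1}^{n} |x_i - 6|}{n} = \dfrac{14}{10} = 1.4$
• M.D about the median: $M.D(\tilde{X}) = \dfrac{\sum_{i=1}^{n} |x_i - \tilde{X}|}{n} = \dfrac{\sum_{i=1}^{n} |x_i - 5.5|}{n} = \dfrac{14}{10} = 1.4$
• M.D about the mode: $M.D(\hat{X}) = \dfrac{\sum_{i=1}^{n} |x_i - \hat{X}|}{n} = \dfrac{\sum_{i=1}^{n} |x_i - 5|}{n} = \dfrac{14}{10} = 1.4$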
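As a quick check of the three mean deviations, here is a minimal Python sketch using the ten observations from the table. The helper name `mean_deviation` is an illustrative choice; `mean`, `median` and `mode` come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# The ten raw observations from the table above.
data = [4, 4, 5, 5, 5, 6, 7, 7, 8, 9]


def mean_deviation(values, center):
    """Average absolute deviation of the values from a chosen center."""
    return sum(abs(x - center) for x in values) / len(values)


md_mean = mean_deviation(data, mean(data))      # center = 6
md_median = mean_deviation(data, median(data))  # center = 5.5
md_mode = mean_deviation(data, mode(data))      # center = 5

print(md_mean, md_median, md_mode)  # 1.4 1.4 1.4
```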
Example-2:
Find the mean deviation about the mean, median and
mode for the following distribution.
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
Solution: First find the mean deviation about
the arithmetic mean
Class    Frequency   Class mark   f_i X_i   |X_i - mean|   f_i |X_i - mean|
         (f_i)       (X_i)
40-44    7           42           294       13             91
45-49    10          47           470       8              80
50-54    22          52           1144      3              66
55-59    15          57           855       2              30
60-64    12          62           744       7              84
65-69    6           67           402       12             72
70-74    3           72           216       17             51
Total    75                       4125                     474
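Arithmetic mean: $\bar{X} = \dfrac{\sum_{i=1}^{7} f_i X_i}{\sum_{i=1}^{7} f_i} = \dfrac{4125}{75} = 55$
$M.D(\bar{X}) = \dfrac{\sum_{i=1}^{7} f_i |X_i - \bar{X}|}{\sum_{i=1}^{7} f_i} = \dfrac{474}{75} \approx 6.3$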
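The grouped calculation can be reproduced with a few lines of Python. This is only a sketch: it treats every observation in a class as if it sat at the class mark, which is the same approximation the table uses.

```python
# Class marks and frequencies from the table above.
class_marks = [42, 47, 52, 57, 62, 67, 72]
frequencies = [7, 10, 22, 15, 12, 6, 3]

total = sum(frequencies)  # 75 observations

# Frequency-weighted mean of the class marks.
grouped_mean = sum(f * x for f, x in zip(frequencies, class_marks)) / total

# Frequency-weighted average of the absolute deviations from that mean.
md_about_mean = sum(
    f * abs(x - grouped_mean) for f, x in zip(frequencies, class_marks)
) / total

print(grouped_mean, round(md_about_mean, 2))  # 55.0 6.32
```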
Solution: Second find the mean deviation about
the median and the mode
Class    Frequency   Class mark   LCF   |X_i - median|   f_i |X_i - median|   |X_i - mode|   f_i |X_i - mode|
         (f_i)       (X_i)              (median = 54.2)                       (mode = 52.7)
40-44    7           42           7     12.2             85.4                 10.7           74.9
45-49    10          47           17    7.2              72                   5.7            57
50-54    22          52           39    2.2              48.4                 0.7            15.4
55-59    15          57           54    2.8              42                   4.3            64.5
60-64    12          62           66    7.8              93.6                 9.3            111.6
65-69    6           67           72    12.8             76.8                 14.3           85.8
70-74    3           72           75    17.8             53.4                 19.3           57.9
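• M.D about the mode: $M.D(\hat{X}) = \dfrac{\sum_{i=1}^{7} f_i |X_i - \hat{X}|}{\sum_{i=1}^{7} f_i} = \dfrac{467.1}{75} = 6.2$, where 467.1 is the total of the last column of the table.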
• Remark: Mean deviation is always minimum about the median.
e) Coefficient of Mean Deviation (C.M.D)
Example:
Calculate the C.M.D about the mean, median and
mode for the data in Example 1 above.
Solution:
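A minimal Python sketch of this calculation follows. It assumes the usual definition, C.M.D = M.D divided by the average about which the deviations are taken, which these slides do not spell out, and it assumes that Example 1 refers to the ten raw observations 4, 4, 5, 5, 5, 6, 7, 7, 8, 9. The function name is an illustrative choice.

```python
from statistics import mean, median, mode

# Raw observations assumed to be the data of Example 1.
data = [4, 4, 5, 5, 5, 6, 7, 7, 8, 9]


def coefficient_of_mean_deviation(values, center):
    """Mean deviation about `center`, divided by `center` (assumed definition)."""
    md = sum(abs(x - center) for x in values) / len(values)
    return md / center


cmd_mean = coefficient_of_mean_deviation(data, mean(data))      # 1.4 / 6
cmd_median = coefficient_of_mean_deviation(data, median(data))  # 1.4 / 5.5
cmd_mode = coefficient_of_mean_deviation(data, mode(data))      # 1.4 / 5

print(round(cmd_mean, 3), round(cmd_median, 3), round(cmd_mode, 3))
# 0.233 0.255 0.28
```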
f) Variance
Definition: The variance is the arithmetic mean of the squared distances of the values from the mean. The symbol for the population variance is $\sigma^2$ (σ is the Greek lowercase letter sigma). Let $x_1, x_2, \ldots, x_N$ be the measurements on N population units. Then the population variance is given by the formula:
$\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$ and, for a frequency distribution with $k$ distinct values or class marks, $\sigma^2 = \dfrac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{N}$,
where $\mu$ = population mean $= \dfrac{\sum_{i=1}^{N} x_i}{N}$ and N = population size (for a frequency distribution, $N = \sum_{i=1}^{k} f_i$).
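The two forms of the population variance can be compared in a short Python sketch. The four-value population below is hypothetical, made up purely for illustration, and the function names are illustrative as well; `pvariance` is the population variance from Python's standard `statistics` module and is used only as a cross-check.

```python
from statistics import pvariance


def population_variance(values):
    """Mean of the squared deviations from the population mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_variance_from_frequencies(values, freqs):
    """Same quantity computed from distinct values and their frequencies."""
    n = sum(freqs)
    mu = sum(f * x for f, x in zip(freqs, values)) / n
    return sum(f * (x - mu) ** 2 for f, x in zip(freqs, values)) / n


population = [2, 4, 4, 6]  # hypothetical population, not from the slides

print(population_variance(population))                            # 2.0
print(population_variance_from_frequencies([2, 4, 6], [1, 2, 1]))  # 2.0
print(pvariance(population))  # library cross-check, also equals 2
```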
g) Standard Deviation
5. Since the data is a sample, divide the number (from step 4 above) by
the number of observations minus one, i.e., n-1 (where n is equal to the
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
Solution 1:
x_i      x_i - mean   (x_i - mean)^2
5        -6           36
10       -1           1
12       1            1
17       6            36
$\sum x_i = 44$, $\sum (x_i - \bar{x})^2 = 74$
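$\bar{x}$ = sample mean $= \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{44}{4} = 11$
Sample variance $= s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \dfrac{74}{4 - 1} = 24.67$ and
sample standard deviation $= s = \sqrt{s^2} = \sqrt{24.67} = 4.97$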
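The same numbers can be reproduced in Python. The sketch below computes the sample variance by hand with the n - 1 divisor and then cross-checks it against `variance` and `stdev` from the standard `statistics` module, which also divide by n - 1.

```python
from statistics import stdev, variance

# The four sample observations from the solution above.
sample = [5, 10, 12, 17]

n = len(sample)
sample_mean = sum(sample) / n                                 # 44 / 4
sum_of_squares = sum((x - sample_mean) ** 2 for x in sample)  # 74
sample_variance = sum_of_squares / (n - 1)                    # 74 / 3
sample_sd = sample_variance ** 0.5

print(sample_mean, round(sample_variance, 2), round(sample_sd, 2))
# 11.0 24.67 4.97
print(round(variance(sample), 2), round(stdev(sample), 2))
# 24.67 4.97
```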
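Solution:
Class    Frequency   Class mark   f_i x_i   x_i - mean   (x_i - mean)^2   f_i (x_i - mean)^2
         (f_i)       (x_i)
40-44    7           42           294       -13          169              1183
45-49    10          47           470       -8           64               640
50-54    22          52           1144      -3           9                198
55-59    15          57           855       2            4                60
60-64    12          62           744       7            49               588
65-69    6           67           402       12           144              864
70-74    3           72           216       17           289              867
Total    75                       4125                                    4400
Arithmetic mean: $\bar{x} = \dfrac{\sum_{i=1}^{7} f_i x_i}{\sum_{i=1}^{7} f_i} = \dfrac{4125}{75} = 55$
Sample variance $= s^2 = \dfrac{\sum_{i=1}^{7} f_i (x_i - \bar{x})^2}{n - 1} = \dfrac{4400}{75 - 1} = 59.46$
Sample standard deviation $= s = \sqrt{s^2} = \sqrt{59.46} = 7.71$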
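For the grouped data, a short Python sketch gives the same result. It assumes, as the solution does, that each class is represented by its class mark and that n is the total frequency.

```python
# Class marks and frequencies of the grouped data (classes 40-44 to 70-74).
class_marks = [42, 47, 52, 57, 62, 67, 72]
frequencies = [7, 10, 22, 15, 12, 6, 3]

n = sum(frequencies)  # 75
grouped_mean = sum(f * x for f, x in zip(frequencies, class_marks)) / n

# Frequency-weighted sum of squared deviations from the mean.
weighted_squares = sum(
    f * (x - grouped_mean) ** 2 for f, x in zip(frequencies, class_marks)
)
sample_variance = weighted_squares / (n - 1)
sample_sd = sample_variance ** 0.5

print(grouped_mean, weighted_squares)                    # 55.0 4400.0
print(round(sample_variance, 2), round(sample_sd, 2))    # 59.46 7.71
```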
b) 32 and 68 are at an equal distance from the mean, 50, and this distance is 18.
$ks = 18 \iff k = 3$, since $s = 6$.
Then at least $\left(1 - \dfrac{1}{k^2}\right) \times 100\% = \left(1 - \dfrac{1}{3^2}\right) \times 100\% = \dfrac{8}{9} \times 100\% = 88.89\%$ of the data lie between 32 and 68.
c) It is just the complement of a), i.e. at most $\dfrac{1}{k^2} \times 100\% = \dfrac{1}{2^2} \times 100\% = 25\%$ of the numbers are less than 38 or more than 62.
d) It is just the complement of b), i.e. at most $\dfrac{1}{k^2} \times 100\% = \dfrac{1}{3^2} \times 100\% = 11.11\%$ of the numbers are less than 32 or more than 68.
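The bounds in parts b) to d) can be tabulated with a small Python sketch. It uses the mean of 50 and standard deviation of 6 from this example; the function names are illustrative choices.

```python
def chebyshev_min_fraction(k):
    """Chebyshev lower bound on the fraction of data within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def chebyshev_interval(mean_value, s, k):
    """Interval (mean - k*s, mean + k*s)."""
    return (mean_value - k * s, mean_value + k * s)


mean_value, s = 50, 6
for k in (2, 3):
    inside = chebyshev_min_fraction(k)
    low, high = chebyshev_interval(mean_value, s, k)
    print(f"k={k}: at least {inside:.2%} in ({low}, {high}), "
          f"at most {1 - inside:.2%} outside")
```

For k = 2 the interval is (38, 62) and for k = 3 it is (32, 68), matching the limits used in the example.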
Example :
The scores on a special test of knowledge of wood
refinishing have a mean of 53 and a standard deviation of 6. Find
the range of values in which at least 75% of the scores will lie.
Solution:
From Chebyshev's Theorem, at least $\left(1 - \dfrac{1}{k^2}\right) \times 100\%$ of the data belongs to the interval $(\bar{x} - ks,\ \bar{x} + ks)$.
But $\left(1 - \dfrac{1}{k^2}\right) \times 100\% = 75\%$, so
$1 - \dfrac{1}{k^2} = \dfrac{3}{4} \iff \dfrac{k^2 - 1}{k^2} = \dfrac{3}{4} \iff 4k^2 - 4 = 3k^2 \iff k^2 - 4 = 0 \iff k = 2$ or $k = -2$, but $k$ cannot be negative. Thus $k = 2$.
Then at least 75% of the data belongs to the interval
$(\bar{x} - ks,\ \bar{x} + ks) = (53 - 2 \times 6,\ 53 + 2 \times 6) = (41, 65)$.
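Solving $1 - 1/k^2 = p$ for the positive root gives $k = 1/\sqrt{1 - p}$. A minimal Python sketch of that inversion, applied to the mean of 53 and standard deviation of 6 from this example, is shown here; the function name `k_for_fraction` is an illustrative choice.

```python
import math


def k_for_fraction(p):
    """Positive k with 1 - 1/k**2 == p, for a target fraction 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return 1 / math.sqrt(1 - p)


mean_score, s = 53, 6
k = k_for_fraction(0.75)
interval = (mean_score - k * s, mean_score + k * s)

print(k, interval)  # 2.0 (41.0, 65.0)
```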
Solutions:
Using property c) above, the new standard deviation
$= |k|\,s = 2 \times 3 = 6$
Example:
The mean and the standard deviation of a set of numbers are
respectively 500 and 10.
a. If 10 is added to each of the numbers in the set, then what
will be the variance and standard deviation of the new set?
b. If each of the numbers in the set is multiplied by -5, then
what will be the variance and standard deviation of the new
set?
Solutions:
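The two standard properties involved here are that adding a constant to every value leaves the standard deviation unchanged, and that multiplying every value by k multiplies the standard deviation by |k| (and the variance by k squared). The Python sketch below first checks both properties on a small hypothetical data set, made up for illustration, and then applies them to the stated standard deviation of 10.

```python
import math
from statistics import pstdev

# Hypothetical data, used only to illustrate the two properties.
values = [3.0, 7.0, 8.0, 12.0]
base_sd = pstdev(values)

shifted_sd = pstdev([x + 10 for x in values])  # add 10 to every value
scaled_sd = pstdev([-5 * x for x in values])   # multiply every value by -5

assert math.isclose(shifted_sd, base_sd)       # unchanged by the shift
assert math.isclose(scaled_sd, 5 * base_sd)    # scaled by |-5| = 5

# Applying the properties to the example (standard deviation 10):
s = 10
sd_after_adding_10 = s                 # part a: standard deviation stays 10
var_after_adding_10 = s ** 2           # part a: variance stays 100
sd_after_times_minus_5 = abs(-5) * s   # part b: standard deviation becomes 50
var_after_times_minus_5 = (-5) ** 2 * s ** 2  # part b: variance becomes 2500

print(sd_after_adding_10, var_after_adding_10)          # 10 100
print(sd_after_times_minus_5, var_after_times_minus_5)  # 50 2500
```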
City2 22 21 24 22 20
City3 32 27 35 24 28
Solutions:
Calculate the standard score of both students.
Relatively speaking:
a) Which group is more consistent in its
performance?
b) Suppose person A from group one takes 9.2
minutes while person B from group two takes 9.3
minutes. Who was faster in performing the task?
Why?
Solution:
• Person B is faster, because the time taken by
person B is two standard deviations shorter
than the average time taken by group 2,
while the time taken by person A is only one
standard deviation shorter than the
average time taken by group 1.