Statistics and Probability (STT-500)

Mr. Adeel Sohail email id: Adeel@biit.edu.pk, WhatsApp# 0331-5002033

(Week 11) Lecture 21,22


Objectives: The learning objectives of this lecture are

• Objectives of Computing Dispersion
• Characteristics of a Good Measure of Dispersion
• Merits and Demerits of Range
• Merits and Demerits of Quartile Deviation
• Merits and Demerits of Mean Deviation
• Merits and Demerits of Standard Deviation
• Skewness
• Types of Distributions
• Measures of Skewness
• Pearson’s Coefficient of Skewness

Objectives of Computing Dispersion:


Comparative Study
• A measure of dispersion gives a single value indicating the degree of consistency or
uniformity of a distribution. This single value helps us compare different distributions.
• The smaller the value of dispersion, the higher the consistency or uniformity, and vice versa.
Reliability of an Average
• A small value of dispersion means low variation of the observations around the average. The
average is then a good representative of the observations and can be considered reliable.
• A large value of dispersion means greater deviation of the observations from the average. In
this case, the average is not a good representative and cannot be considered reliable.
Control the Variability
• Different measures of dispersion describe variability from different angles, and this
knowledge can help in controlling the variation.
• These measures are especially useful in the financial analysis of businesses and in medicine.
Basis for Further Statistical Analysis
• Measures of dispersion provide the basis for further statistical analysis, such as computing
correlation, regression, tests of hypotheses, etc.
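The comparative use of dispersion can be sketched with two made-up series that share the same mean. The data and the names `batsman_a` and `batsman_b` are hypothetical and not part of the lecture; the population standard deviation is used here as the measure of dispersion.

```python
from statistics import mean, pstdev

# Hypothetical scores of two batsmen over five innings (illustrative data only).
batsman_a = [48, 50, 52, 49, 51]
batsman_b = [10, 90, 35, 75, 40]

mean_a, mean_b = mean(batsman_a), mean(batsman_b)
sd_a, sd_b = pstdev(batsman_a), pstdev(batsman_b)

# Both series have the same average but very different dispersion.
print(f"A: mean = {mean_a}, S.D. = {sd_a:.2f}")
print(f"B: mean = {mean_b}, S.D. = {sd_b:.2f}")

# The series with the smaller dispersion is the more consistent one,
# and its average is the more reliable representative.
more_consistent = "A" if sd_a < sd_b else "B"
print("More consistent series:", more_consistent)
```

Since the averages are equal, the average alone cannot separate the two series; the dispersion does.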

Characteristics of a Good Measure of Dispersion:


• It should be easy to calculate and simple to understand.
• It should be based on all the observations of the series.
• It should be rigidly defined.
• It should not be affected by extreme values.
• It should not be unduly affected by sampling fluctuations.
• It should be capable of further mathematical treatment and statistical analysis.

Merits and Demerits of Range:


Merits
1. It is very easy to calculate and simple to understand.
2. No special knowledge is needed while calculating range.
3. It takes the least time to compute.
4. It provides the broad picture of the data at a glance.
Demerits
1. It is a crude measure because it is only based on two extreme values (highest and lowest).
2. It cannot be calculated in case of open-ended series.
3. Range is significantly affected by fluctuations of sampling i.e. it varies widely from sample
to sample.
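The first demerit can be illustrated with a minimal sketch: two hypothetical series with the same smallest and largest values have the same range, even though their middle values are spread very differently. The data and the helper name `value_range` are assumptions made for this illustration.

```python
from statistics import pstdev

def value_range(data):
    """Range = largest observation minus smallest observation."""
    return max(data) - min(data)

# Two hypothetical series with identical extremes (10 and 90).
clustered = [10, 50, 50, 50, 50, 90]
spread_out = [10, 20, 40, 60, 80, 90]

range_clustered = value_range(clustered)
range_spread_out = value_range(spread_out)

# The range cannot tell the two series apart ...
print("Ranges:", range_clustered, range_spread_out)

# ... while a measure based on all observations can.
sd_clustered = pstdev(clustered)
sd_spread_out = pstdev(spread_out)
print(f"S.D.s: {sd_clustered:.2f} and {sd_spread_out:.2f}")
```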
Merits and Demerits of Quartile Deviation
Merits
1. It is also quite easy to calculate and simple to understand.
2. It can be used even in case of open-end distribution.
3. It is less affected by extreme values, so it is superior to the Range.
4. It is useful when the dispersion of the middle 50% of the observations is to be measured.
Demerits
1. It is not based on all the observations.
2. It is not capable of further algebraic treatment or statistical analysis.
3. It is affected considerably by fluctuations of sampling.
4. It is not regarded as a very reliable measure of dispersion because it ignores 50% of the
observations.
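As a rough sketch of the third merit, the snippet below computes the quartile deviation, (Q3 − Q1) / 2, before and after the largest value is replaced by an extreme one. The marks are hypothetical, and the quartiles come from Python's `statistics.quantiles` with its default rule; textbooks use several slightly different quartile conventions, so hand calculations may differ a little.

```python
from statistics import quantiles

def quartile_deviation(data):
    """Quartile deviation = (Q3 - Q1) / 2."""
    q1, _median, q3 = quantiles(data, n=4)  # default 'exclusive' method
    return (q3 - q1) / 2

# Hypothetical marks; the second list replaces the largest mark by an extreme value.
marks = [12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 38]
marks_with_outlier = marks[:-1] + [380]

qd_before = quartile_deviation(marks)
qd_after = quartile_deviation(marks_with_outlier)
range_before = max(marks) - min(marks)
range_after = max(marks_with_outlier) - min(marks_with_outlier)

print("Q.D. before and after:", qd_before, qd_after)
print("Range before and after:", range_before, range_after)
```

The extreme value changes the range but leaves the quartile deviation untouched, because only the middle 50% of the observations enter the calculation.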
Merits and Demerits of Mean Deviation
Merits
1. It is based on all the observations of the series and not only on the limits like Range and
Q.D.
2. It is simple to calculate and easy to understand.
3. It is not much affected by extreme values.
4. For calculating mean deviation, deviations can be taken from any average.
Demerits
1. Ignoring the + and – signs of the deviations is unsound from the mathematical viewpoint.
2. It is not capable of further mathematical treatment.
3. It is difficult to compute when the mean or median is a fraction.
4. It may not be possible to use this method in the case of an open-ended series.
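A minimal sketch of the mean deviation follows, taking the deviations first from the mean and then from the median (merit 4). The data and the function name `mean_deviation` are illustrative assumptions.

```python
from statistics import mean, median

def mean_deviation(data, center):
    """Average of the absolute deviations of the data from `center`."""
    return sum(abs(x - center) for x in data) / len(data)

values = [1, 2, 3, 4, 10]  # hypothetical observations

# Signs are ignored by taking absolute values; otherwise the deviations
# from the mean would always sum to zero.
md_about_mean = mean_deviation(values, mean(values))      # deviations from 4
md_about_median = mean_deviation(values, median(values))  # deviations from 3

print("Mean deviation about the mean:  ", md_about_mean)
print("Mean deviation about the median:", md_about_median)
```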
Merits and Demerits of Standard Deviation
Merits
1. Squaring the deviations overcomes the drawback of ignoring signs in the mean deviation.
2. It is suitable for further mathematical treatment.
3. It is the measure least affected by fluctuations of sampling.
4. The standard deviation is zero if all the observations are equal.
5. It is independent of a change of origin.

Demerits
1. It is not easy to calculate.
2. It is difficult for a layman to understand.
3. It depends on a change of scale.
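The behaviour under a change of origin and a change of scale can be checked with a short sketch. The data are hypothetical, and the population standard deviation (`pstdev`) is assumed as the definition of S.D.

```python
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

sd_original = pstdev(data)
sd_shifted = pstdev([x + 100 for x in data])  # change of origin: add a constant
sd_scaled = pstdev([3 * x for x in data])     # change of scale: multiply by a constant
sd_constant = pstdev([7, 7, 7, 7])            # all observations equal

print("Original:", sd_original)
print("Shifted: ", sd_shifted)   # same as the original
print("Scaled:  ", sd_scaled)    # three times the original
print("Constant:", sd_constant)  # zero
```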
Skewness
In probability theory and statistics, skewness is a measure of the asymmetry of the probability
distribution of a real-valued random variable about its mean. The skewness value can be positive,
zero, negative, or undefined.
For a unimodal distribution, negative skew commonly indicates that the tail is on the left side of
the distribution, and positive skew indicates that the tail is on the right. In cases where one tail is
long but the other tail is fat, skewness does not obey a simple rule. For example, a zero value
means that the tails on both sides of the mean balance out overall; this is the case for a symmetric
distribution, but can also be true for an asymmetric distribution where one tail is long and thin, and
the other is short but fat.
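The last point can be sketched numerically. The snippet assumes that skewness here means the moment-based measure (third central moment divided by the cube of the S.D.), which the lecture does not define; the data set and function names are made up for illustration. The third central moment is zero, so the skewness is zero, yet the data are not symmetric about the mean.

```python
def third_central_moment(data):
    """Average of the cubed deviations from the mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** 3 for x in data) / len(data)

def is_symmetric_about_mean(data):
    """True if reflecting every value about the mean gives the same data set."""
    m = sum(data) / len(data)
    reflected = sorted(2 * m - x for x in data)
    return reflected == sorted(data)

# Short, heavy left side (four values at -2); long, thin right side (one value at 3).
lopsided = [-2, -2, -2, -2, 1, 1, 1, 1, 1, 3]

m3 = third_central_moment(lopsided)
symmetric = is_symmetric_about_mean(lopsided)

print("Third central moment:", m3)
print("Symmetric about the mean?", symmetric)
```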
Types of Distributions
There are three types of distribution:
• Symmetrical Distribution
• Positively Skewed Distribution
• Negatively Skewed Distribution
Symmetrical Distribution
A symmetric distribution is one in which the left side of the distribution mirrors the right
side, as shown in the figure.

[Figure: Symmetrical Distribution]

Positively Skewed Distribution


If the right tail is longer than the left tail, the distribution is said to be positively skewed.

[Figure: Positively Skewed Distribution]

Negatively Skewed Distribution


If the left tail is longer than the right tail, the distribution is said to be negatively skewed.

[Figure: Negatively Skewed Distribution]

Relationship of mean, median and mode for skewness


Skewness is directly related to the relationship between the mean, median and mode. If the
distribution is symmetric, then the mean is equal to the median and the median is equal to the mode,
i.e.
Mean = Median = Mode

If the distribution is positively skewed, then the mean is greater than the median and the
median is greater than the mode,
i.e.
Mean > Median > Mode

In a negatively skewed distribution, the mode is greater than the median and the median is
greater than the mean,
i.e.
Mode > Median > Mean
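These orderings can be checked on small data sets. The two lists below are hypothetical, chosen so that one has a long right tail and the other is its mirror image; the relationship is a rule of thumb for typical unimodal data rather than a guarantee for every data set.

```python
from statistics import mean, median, mode

# Hypothetical data with a long right tail ...
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9]
# ... and its mirror image, with a long left tail.
left_skewed = [10 - x for x in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(f"{name}: mean = {mean(data):.2f}, median = {median(data)}, mode = {mode(data)}")

# Positively skewed: Mean > Median > Mode
positive_order = mean(right_skewed) > median(right_skewed) > mode(right_skewed)
# Negatively skewed: Mode > Median > Mean
negative_order = mode(left_skewed) > median(left_skewed) > mean(left_skewed)
print(positive_order, negative_order)
```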

Measures of Skewness:
The difference between the measures of location, being an indication of the amount of skewness
or asymmetry, is used as a measure of skewness. A measure of skewness is defined in such a way
that
i. the measure is zero when the distribution is symmetric, and
ii. the measure is a pure number, i.e. independent of the origin and the units of measurement.

Pearson’s Coefficient of Skewness


Accordingly, to measure the degree of skewness of a distribution or curve, Karl Pearson
(1857-1936) introduced a coefficient of skewness denoted by $S_k$ and defined by

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

This is called Pearson’s first coefficient of skewness. It usually lies between -3 and +3, and
its sign indicates the direction of the skewness: the coefficient is negative for a negatively
skewed distribution, positive for a positively skewed distribution and 0 for a symmetrical
distribution.
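The two requirements on a measure of skewness (zero for a symmetric distribution, and a pure number) can be checked for this coefficient with a small sketch. It works on raw, ungrouped data, takes the mode as the most frequent value and uses the population S.D.; the data and the function name `pearson_first_skewness` are illustrative assumptions.

```python
from statistics import mean, mode, pstdev

def pearson_first_skewness(data):
    """Pearson's first coefficient: (mean - mode) / S.D. for raw data."""
    return (mean(data) - mode(data)) / pstdev(data)

scores = [1, 2, 2, 2, 3, 3, 4, 5, 9]   # hypothetical, long right tail
symmetric = [1, 2, 3, 3, 3, 4, 5]      # hypothetical, symmetric about 3

sk = pearson_first_skewness(scores)
sk_shifted = pearson_first_skewness([x + 50 for x in scores])   # change of origin
sk_rescaled = pearson_first_skewness([10 * x for x in scores])  # change of units
sk_symmetric = pearson_first_skewness(symmetric)

print(f"Original: {sk:.4f}")
print(f"Shifted:  {sk_shifted:.4f}")
print(f"Rescaled: {sk_rescaled:.4f}")
print(f"Symmetric data: {sk_symmetric:.4f}")
```

Shifting or rescaling the data changes the mean, the mode and the S.D., but the changes cancel in the ratio, so the coefficient stays the same.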

Example
Calculate skewness by Pearson’s Coefficient of Skewness from the following frequency
distribution and interpret the result.

Ages         15–19  20–24  25–29  30–34  35–39  40–44  45–49  50–54
No. of Men      29    176    208    173     82     40     15      3

Solution
The necessary calculations are given below.

Classes   Frequency (f)   C.B.          x      fx      fx^2
15–19          29         14.5–19.5    17     493      8381
20–24         176         19.5–24.5    22    3872     85184
25–29         208         24.5–29.5    27    5616    151632   (modal class)
30–34         173         29.5–34.5    32    5536    177152
35–39          82         34.5–39.5    37    3034    112258
40–44          40         39.5–44.5    42    1680     70560
45–49          15         44.5–49.5    47     705     33135
50–54           3         49.5–54.5    52     156      8112
Total         726                           21092    646414

Here we have to find Pearson’s coefficient of skewness

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

where

$$\text{Mean} = \frac{\sum f x}{\sum f}$$

$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$$

where
$l$ = lower boundary of the modal class
$f_m$ = frequency of the modal class
$f_1$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class succeeding the modal class
$h$ = size of the class interval.
Note:
The modal (mode) class is the class with the maximum frequency.
And

$$\text{S.D.} = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}$$

Here, for the mean,

$$\sum f = 726 \quad \text{and} \quad \sum f x = 21092$$

So

$$\text{Mean} = \frac{21092}{726}$$

$$\text{Mean} = 29.0523$$
Now, for the mode,
$l = 24.5$
$f_m = 208$
$f_1 = 176$
$f_2 = 173$
$h = 5$
So,

$$\text{Mode} = 24.5 + \frac{208 - 176}{(208 - 176) + (208 - 173)} \times 5$$

$$\text{Mode} = 24.5 + \frac{32}{32 + 35} \times 5$$

$$\text{Mode} = 24.5 + \frac{32}{67} \times 5$$

$$\text{Mode} = 26.888$$
Now, for the S.D.,

$$\sum f = 726, \qquad \sum f x = 21092$$

and

$$\sum f x^2 = 646414$$

So,

$$\text{S.D.} = \sqrt{\frac{646414}{726} - \left(\frac{21092}{726}\right)^2}$$

$$\text{S.D.} = \sqrt{890.3774 - (29.05234)^2}$$

$$\text{S.D.} = \sqrt{46.339}$$

$$\text{S.D.} = 6.807$$

So now, for Pearson’s coefficient of skewness,

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

Putting in the values,

$$S_k = \frac{29.0523 - 26.888}{6.807}$$

$$S_k = \frac{2.1643}{6.807}$$

$$S_k = 0.318$$

As the value of the coefficient of skewness is positive, the distribution is positively skewed.
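The hand calculation can be cross-checked with a short script. This is a sketch, not part of the original solution: the function names are hypothetical, and the mode function assumes a single modal class that is neither the first nor the last class. The class boundaries and frequencies are those of the table above.

```python
from math import sqrt

# Class boundaries and frequencies copied from the worked example (ages of men).
boundaries = [(14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5),
              (34.5, 39.5), (39.5, 44.5), (44.5, 49.5), (49.5, 54.5)]
freqs = [29, 176, 208, 173, 82, 40, 15, 3]

def grouped_mean(boundaries, freqs):
    """Mean = sum(f * x) / sum(f), with x the class midpoints."""
    mids = [(lo + hi) / 2 for lo, hi in boundaries]
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)

def grouped_sd(boundaries, freqs):
    """S.D. = sqrt(sum(f * x^2) / sum(f) - mean^2)."""
    mids = [(lo + hi) / 2 for lo, hi in boundaries]
    mean_of_squares = sum(f * x * x for f, x in zip(freqs, mids)) / sum(freqs)
    return sqrt(mean_of_squares - grouped_mean(boundaries, freqs) ** 2)

def grouped_mode(boundaries, freqs):
    """Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h."""
    i = freqs.index(max(freqs))        # position of the modal class
    l, upper = boundaries[i]
    fm, f1, f2 = freqs[i], freqs[i - 1], freqs[i + 1]
    return l + (fm - f1) / ((fm - f1) + (fm - f2)) * (upper - l)

mean_age = grouped_mean(boundaries, freqs)
mode_age = grouped_mode(boundaries, freqs)
sd_age = grouped_sd(boundaries, freqs)
sk = (mean_age - mode_age) / sd_age

print(f"Mean = {mean_age:.4f}, Mode = {mode_age:.4f}, S.D. = {sd_age:.4f}")
print(f"Pearson's first coefficient of skewness = {sk:.4f}")
```

Because the script carries full precision through every step, its last decimal places can differ slightly from a hand calculation that rounds intermediate results.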



Example
Find the skewness of the given distribution.

Classes     0-9.5  10-19.5  20-29.5  30-39.5  40-49.5  50-59.5  60-69.5  70-79.5  80-89.5  90-99.5
Frequency      10       19       23       30       39       33       20       18       17        6

Solution:
The necessary calculations are given below.

Classes    C.B.            x        x^2          f     fx          fx^2
0-9.5      -0.25-9.75      4.75     22.5625     10     47.5        225.625
10-19.5    9.75-19.75     14.75    217.5625     19    280.25      4133.6875
20-29.5    19.75-29.75    24.75    612.5625     23    569.25     14088.9375
30-39.5    29.75-39.75    34.75   1207.5625     30   1042.5      36226.875
40-49.5    39.75-49.75    44.75   2002.5625     39   1745.25     78099.9375   (modal class)
50-59.5    49.75-59.75    54.75   2997.5625     33   1806.75     98919.5625
60-69.5    59.75-69.75    64.75   4192.5625     20   1295        83851.25
70-79.5    69.75-79.75    74.75   5587.5625     18   1345.5     100576.125
80-89.5    79.75-89.75    84.75   7182.5625     17   1440.75    122103.5625
90-99.5    89.75-99.75    94.75   8977.5625      6    568.5      53865.375
Total                                          215  10141.25    592090.9375
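The C.B. and x columns can be reproduced with a short sketch. It assumes that class boundaries are obtained by splitting the gap between consecutive class limits evenly (here the gap is 10 − 9.5 = 0.5, so 0.25 is subtracted from each lower limit and added to each upper limit) and that all gaps are equal; the function name `class_boundaries` is hypothetical.

```python
def class_boundaries(limits):
    """Turn class limits into class boundaries by splitting each gap evenly.

    Assumes every gap between consecutive classes is the same size.
    """
    gap = limits[1][0] - limits[0][1]   # e.g. 10 - 9.5 = 0.5
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in limits]

# Class limits of the example: 0-9.5, 10-19.5, ..., 90-99.5
limits = [(10 * k, 10 * k + 9.5) for k in range(10)]

bounds = class_boundaries(limits)
midpoints = [(lo + hi) / 2 for lo, hi in bounds]   # the x column
width = bounds[0][1] - bounds[0][0]                # class interval h

for (lo, hi), x in zip(bounds, midpoints):
    print(f"{lo:6.2f} to {hi:6.2f}   x = {x:5.2f}")
print("h =", width)
```

The class interval is measured between boundaries, not between limits, which is why h is 10 and not 9.5 in this example.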

So, here

$$\text{Arithmetic Mean} = \frac{\sum f x}{\sum f}$$

$$\text{Arithmetic Mean} = \frac{10141.25}{215}$$

$$\text{Arithmetic Mean} = 47.1686 \approx 47.17$$
Now, for the mode,
$l = 39.75$
$f_m = 39$
$f_1 = 30$
$f_2 = 33$
$h = 10$
So,

$$\text{Mode} = 39.75 + \frac{39 - 30}{(39 - 30) + (39 - 33)} \times 10$$

$$\text{Mode} = 39.75 + \frac{9}{9 + 6} \times 10$$

$$\text{Mode} = 39.75 + \frac{9}{15} \times 10$$

$$\text{Mode} = 45.75$$
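Now, for the S.D.,

$$\text{Standard deviation} = \sqrt{\frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2}$$

$$= \sqrt{\frac{592090.9375}{215} - \left(\frac{10141.25}{215}\right)^2}$$

$$= \sqrt{2753.9113 - 2224.8773} = \sqrt{529.034}$$

$$\text{Standard deviation} \approx 23$$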

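So now, for Pearson’s coefficient of skewness,

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}}$$

Putting in the values, with the mean 10141.25/215 carried to four decimal places (47.1686),

$$S_k = \frac{47.1686 - 45.75}{23}$$

$$S_k = \frac{1.4186}{23}$$

$$S_k = 0.0617$$

As the value of the coefficient of skewness is positive, the distribution is positively skewed,
though only slightly, since the value is close to zero.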



Assignment

Question 1:
Write down the importance of skewness in statistics.
Question 2:
Find Pearson’s coefficient of skewness for the following distribution.

Marks             25–29  30–34  35–39  40–44  45–49  50–54  55–59  60–64  65–69  70–74  75–79
No. of students       6      9     17     28     25     18     13      6     10      7      5

Question 3:
Check whether the following distribution is symmetric or not.

Classes     0-9.7  10-19.7  20-29.7  30-39.7  40-49.7  50-59.7  60-69.7  70-79.7  80-89.7  90-99.7
Frequency      10        9       13       20       29       23       15       18       17        8
