
4. MEASURES OF SHAPE

4.2 LEARNING OBJECTIVES


⚫ Define, describe and interpret the three forms of data skewness – right-skewed, left-skewed and symmetric.
⚫ Determine the skewness of a data distribution by comparing the mean and median.
⚫ Define, describe and interpret the three forms of data kurtosis – platykurtic, leptokurtic and mesokurtic.
⚫ Interpret the kurtosis coefficient in terms of tail-ness and peak-ness in a data distribution.
⚫ Use MS Excel to compute and interpret skewness and kurtosis coefficients.

PRACTICE EXERCISES
⚫ Review Questions 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.9, 8.11, 8.12 and 8.13.

Dr. Raphael Djabatey
4.2 MEASURES OF SHAPE
⚫ The measures of central tendency tell you about the central value in your data, while the measures of variation
tell you how far your data are spread from the central value. What about the shape of your data?

Measures of Central Tendency: measure the location of the central value of the data

Measures of Variation: measure the spread of the data from the central value

Measures of Shape: measure the direction / frequency of outliers away from the central value

⚫ A histogram provides insight into the shape of a data distribution, but the two measures that quantify the
shape of a distribution are skewness and kurtosis.

⚫ Skewness measures the direction of extreme values in an asymmetrical distribution.

⚫ Kurtosis measures the frequency of extreme values in a symmetrical distribution.


I SKEWNESS

1. SKEWNESS
⚫ The relative concentration of data may indicate whether the distribution is asymmetrical. A distribution is
called asymmetric when one tail is longer than the other.

⚫ Skewness measures the asymmetry of a data distribution around its mean. Thus, it quantifies how much of
the data is skewed to one side of the mean.

⚫ Skewness describes direction in a data distribution. A distribution may be skewed towards one direction in
two senses:

❑ The direction in which the extreme values lie – right side, left side or both sides.
❑ The region where most values are situated – low end, high end or middle of the distribution.

⚫ We can describe the skewness of a distribution with a simple but less precise method: comparing the
mean and median. Alternatively, we can use MS Excel to compute a skewness coefficient, which measures
skewness precisely and describes the direction of the extreme values in a data distribution.
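The coefficient mentioned above can also be reproduced outside a spreadsheet. The sketch below assumes the sample-adjusted formula documented for Excel's SKEW function, n / ((n - 1)(n - 2)) × Σ((x - mean) / s)³, where s is the sample standard deviation. The function name `excel_style_skew` and the sample values are illustrative and are not taken from the slides.

```python
import statistics


def excel_style_skew(values):
    """Sample skewness, assuming the adjusted formula documented for Excel's SKEW."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    # Sum of cubed standardized deviations: the sign shows the side of the long tail.
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed


# Illustrative sample (not from the slides): one large value on the right.
sample = [1, 2, 2, 3, 3, 3, 20]
print(round(excel_style_skew(sample), 3))
```

A positive result points to a long right tail, a negative result to a long left tail, and a result near zero to an approximately symmetric sample.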

⚫ Your data are skewed towards one direction by an extreme value. A test of skewness tells you which
direction your data are skewed towards. Data can be skewed towards the right or the left relative to
the mean.

Negative skewness indicates a distribution with an asymmetric tail extending toward more negative values.
Positive skewness indicates a distribution with an asymmetric tail extending toward more positive values.

[Figure: three curves labelled negative skewed, zero skewed and positive skewed]

• Data are skewed to the left when the extreme value lies to the left; as a
result, most values are situated at the high end of the distribution.

• Data are skewed to the right when the extreme value lies to the right; as a
result, most values are situated at the low end of the distribution.

• Data are not skewed (i.e. symmetric) when the extreme values lie on both
sides; as a result, most values are situated in the middle of the distribution.


Also Known As                        | Negative Skewness        | Zero Skewness           | Positive Skewness
Direction of Tail                    | Left Side or Negative    | Both Sides or Zero      | Right Side or Positive
Location of Extreme Value            | Left Side                | Both Sides              | Right Side
Location of Bulk of Data             | High End of Distribution | Middle of Distribution  | Low End of Distribution
Bulk of Data vs Mean                 | Most Values Above Mean   | Most Values Around Mean | Most Values Below Mean
Relationship Between Mean and Median | Mean Less Than Median    | Mean Equal to Median    | Mean More Than Median
Excel Coefficient                    | Less Than -0.5           | Between -0.5 and +0.5   | More Than +0.5
1. Left-Skewed Distribution

Dataset: 4, 80, 82, 88, 88, 90, 100

⚫ This is also known as a negative-skew distribution. This arises when the extreme values lie to the left of the
distribution and the bulk of the data is situated at the high-end of the distribution.

⚫ The distribution is characterized by high values, hence most values are above the mean. The distribution is
left-skewed when the mean is less than the median. The above dataset has a mean of 76 and a median of 88.

⚫ In mathematical terms, the mean minus the median yields a negative result (76 – 88 = – 12). Since the result is
a negative value, the distribution is described as negatively skewed.

⚫ EXCEL provides a robust method where a skewness coefficient of less than – 0.5 indicates a left-skewed distribution.

⚫ On a chart, we observe a long tail to the left of the distribution, stretching towards the negative direction.

Mean < Median < Mode
2. Right-Skewed Distribution

Dataset: 2, 4, 4, 7, 10, 12, 80

⚫ This is also known as a positive-skew distribution. This arises when the extreme values lie to the right of the
distribution and the bulk of the data is situated at the low-end of the distribution.

⚫ The distribution is characterized by low values, hence most values are below the mean. The distribution is
right-skewed when the mean is more than the median. The above dataset has a mean of 17 and a median of 7.

⚫ In mathematical terms, the mean minus the median yields a positive result (17 – 7 = 10). Since the result is a
positive value, the distribution is described as positively skewed.

⚫ EXCEL provides a robust method where a skewness coefficient of more than 0.5 indicates a right-skewed distribution.

⚫ On a chart, we observe a long tail to the right of the distribution, stretching towards the positive direction.

Mode < Median < Mean
3. Normal Distribution

Dataset: 1, 3, 4, 5, 6, 8

⚫ This is also known as a zero-skew or symmetric distribution. This arises when extreme values lie both to the
left and right, and the bulk of the data is situated around the center or middle of the distribution.

⚫ The data are characterized by average values, hence most values are close to the mean. The distribution is
symmetric when the mean is equal to the median. The above dataset has a mean of 4.5 and a median of 4.5.

⚫ In mathematical terms, the mean minus the median yields a zero result (4.5 – 4.5 = 0). Since the result is
zero, a symmetrical distribution is also described as zero skewed.

⚫ EXCEL provides a robust method where a skewness coefficient between – 0.5 and + 0.5 indicates
approximate symmetry, while a skewness coefficient equal to zero (0) indicates perfect symmetry.

⚫ On a chart, we observe tails to the left and right of the distribution, stretching towards the negative and
positive directions.

Mean = Median = Mode
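The mean-versus-median comparison used in the three examples above can be sketched in a few lines. The helper name `skew_by_mean_median` is hypothetical, and the rule only looks at the sign of (mean - median), so it is the simple, less precise method rather than the Excel coefficient.

```python
import statistics


def skew_by_mean_median(values):
    """Rough direction of skew from the sign of (mean - median)."""
    diff = statistics.mean(values) - statistics.median(values)
    if diff < 0:
        return "left-skewed"
    if diff > 0:
        return "right-skewed"
    return "symmetric"


# The three datasets used in the examples above.
datasets = {
    "left": [4, 80, 82, 88, 88, 90, 100],
    "right": [2, 4, 4, 7, 10, 12, 80],
    "symmetric": [1, 3, 4, 5, 6, 8],
}

for name, values in datasets.items():
    mean = statistics.mean(values)
    median = statistics.median(values)
    print(name, mean, median, skew_by_mean_median(values))
```

With real data the mean and median are rarely exactly equal, so a practical version would likely treat small differences as symmetric; the exact comparison here is kept only to mirror the slides.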

MS Excel Skewness Coefficient

[Figure: scale of the skewness coefficient, negative (left-skewed) to positive (right-skewed)]

Less than -1.0      high negative skewness (left-skewed)
-1.0 to -0.5        moderate negative skewness (left-skewed)
-0.5 to 0.0         approximate symmetry
0.0                 perfect symmetry
0.0 to +0.5         approximate symmetry
+0.5 to +1.0        moderate positive skewness (right-skewed)
More than +1.0      high positive skewness (right-skewed)
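The scale above can be read as a small lookup rule. The following is a minimal sketch with a hypothetical function name, `interpret_skewness`. How the exact boundary values (±0.5 and ±1.0) are assigned is an assumption, since the slides do not say which side they belong to.

```python
def interpret_skewness(coef):
    """Map a skewness coefficient to the labels on the scale above."""
    if coef == 0:
        return "perfect symmetry"
    # Assumption: exactly +/-0.5 counts as approximate symmetry.
    if abs(coef) <= 0.5:
        return "approximate symmetry"
    side = "negative (left-skewed)" if coef < 0 else "positive (right-skewed)"
    # Assumption: exactly +/-1.0 counts as moderate rather than high.
    strength = "moderate" if abs(coef) <= 1.0 else "high"
    return f"{strength} {side}"


for coef in (-2.4, -0.7, 0.0, 0.3, 1.5):
    print(coef, "->", interpret_skewness(coef))
```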

[Figure: three histograms]
Shopping Time (bins 0 - 15 to 90 - 105): Left-Skewed
911 Emergency Waiting Time (bins 0 - 5 to 30 - 35): Right-Skewed
Students' Grades (bins 20 - 30 to 80 - 90): Normal Distribution
II KURTOSIS


2. KURTOSIS
⚫ Kurtosis measures the tail-ness and peaked-ness of a distribution relative to a normal distribution. It
describes the degree to which values are clustered in the tail or peak of a distribution.

⚫ Kurtosis determines whether the tails of a data distribution match the normal distribution.
It compares the tails of a distribution to the normal distribution to determine how heavy or light the tails are.

⚫ Kurtosis quantifies the tail-ness of a distribution by measuring the frequency of extreme observations in the
dataset. It checks whether a distribution has more or fewer extreme values than a normal distribution.

⚫ The modern definition holds that kurtosis is influenced more by extreme values (tails) than by the values in the
center (peak) of the distribution. Thus, the recent definition measures how much of the variation in the data is due
to extreme values. There are three forms of kurtosis:

❑ When the distribution has more variation than normal, it is called a leptokurtic distribution.
❑ When the distribution has less variation than normal, it is called a platykurtic distribution.
❑ When the distribution has the same variation as normal, it is called a mesokurtic distribution.
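To make the three forms concrete, the sketch below computes an excess kurtosis coefficient in Python. It assumes the sample formula documented for Excel's KURT function, which subtracts a correction term so that a normal distribution scores about zero. The function name `excel_style_kurt` and both samples are illustrative and are not taken from the slides.

```python
import statistics


def excel_style_kurt(values):
    """Sample excess kurtosis, assuming the formula documented for Excel's KURT."""
    n = len(values)
    if n < 4:
        raise ValueError("kurtosis needs at least four values")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    # Fourth powers make values far from the mean dominate the sum.
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


flat = list(range(1, 11))            # evenly spread values, no extremes
with_extremes = [5] * 8 + [-20, 30]  # mostly identical, two far-out values

print(round(excel_style_kurt(flat), 3))           # negative: platykurtic
print(round(excel_style_kurt(with_extremes), 3))  # positive: leptokurtic
```

The evenly spread sample comes out negative (platykurtic), while the sample with two far-out values comes out positive (leptokurtic).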

                                | Platykurtic Distribution | Mesokurtic Distribution | Leptokurtic Distribution
Also Known As                   | Negative Kurtosis        | Zero Kurtosis           | Positive Kurtosis
Size of Tail                    | Thin                     | Normal                  | Thick
Frequency of Extreme Value      | Less Frequent            | Normal                  | More Frequent
Peakness                        | Less Peaked              | Normal                  | More Peaked
Frequency of Values Around Mean | Less Frequent            | Normal                  | More Frequent
Shoulder                        | More Data                | Normal                  | Less Data
Nature of Variation             | Small Changes            | Normal                  | Large Changes
Degree of Risk                  | Low Risk                 | Normal                  | High Risk
Excel Coefficient               | Less Than -0.5           | Between -0.5 and +0.5   | More Than +0.5


[Figure: leptokurtic and platykurtic curves]

⚫ The larger the kurtosis coefficient, the more extreme values there are, the fatter the tails are, the more
peaked the distribution is around the mean, and the more risk there is in the distribution compared to a
normal distribution.

⚫ The smaller the kurtosis coefficient, the fewer extreme values there are, the thinner the tails are, the less
peaked the distribution is around the mean, and the less risk there is in the distribution compared to a
normal distribution.

Platy = Broad                             | Lepto = Skinny
light tail, few extreme values            | heavy tail, more extreme values
less peaked at center                     | highly peaked at center
small changes are more frequent           | small changes are less frequent
less risk due to fewer extreme values     | more risk due to many extreme values
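Extreme values dominate the kurtosis coefficient because the standard moment-based calculation raises each deviation from the mean to the fourth power. The sketch below illustrates this by reusing the left-skewed dataset from the skewness examples and showing each value's share of the total fourth-power deviation. The variable names are illustrative only.

```python
import statistics

# Left-skewed dataset from the skewness examples; 4 is the extreme value.
values = [4, 80, 82, 88, 88, 90, 100]
mean = statistics.mean(values)

# Fourth-power deviations from the mean drive the kurtosis coefficient.
fourth_powers = [(x - mean) ** 4 for x in values]
total = sum(fourth_powers)

for x, fp in zip(values, fourth_powers):
    print(f"value {x:>3}: {fp / total:.1%} of the total")

# Share contributed by the single most extreme value.
largest_share = max(fourth_powers) / total
```

In this dataset the single extreme value accounts for almost all of the total, which is why more extreme values go with a larger coefficient.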

1. Platykurtic Distribution

⚫ This is also known as a negative kurtosis and it describes a distribution with fewer extreme values and fewer
peak values (i.e. around the mean) than a normal distribution.

⚫ A negative kurtosis is less-tailed (or light-tailed) because it has fewer extreme values in the tails than a normal
distribution. It is also less-peaked (or low-peaked) because fewer values are clustered around the mean.

⚫ Because it is light-tailed and low-peaked, there are fewer data in the center, more data in the shoulder and fewer
data in the tail. Such a distribution indicates less variation or less risk than normal.

⚫ A negative kurtosis indicates less variation because small changes are
more common while large changes are less likely.

⚫ In EXCEL, an excess kurtosis coefficient of less than – 0.5 indicates
a negative kurtosis. It is moderately negative if the coefficient lies between
– 0.5 and – 1.0, but highly negative if the coefficient is less than – 1.0.

2. Leptokurtic Distribution

⚫ This is also known as a positive kurtosis and it describes a distribution with more extreme values and more peak
values (i.e. around the mean) than a normal distribution.

⚫ A positive kurtosis is more-tailed (or heavy-tailed) because it has more extreme values in the tails than a normal
distribution. It is also more-peaked (or high-peaked) because more values are clustered around the mean.

⚫ Because it is heavy-tailed and high-peaked, there are more data in the center, fewer data in the shoulder and more
data in the tail. Such a distribution indicates more variation and more risk than normal.

⚫ A positive kurtosis indicates more variation because large changes are
more common while small changes are less likely.

⚫ In EXCEL, an excess kurtosis coefficient of more than + 0.5 indicates
a positive kurtosis.

3. Mesokurtic Distribution

⚫ This is also known as a zero kurtosis and it describes a distribution which has the same characteristics as a
normal distribution.

⚫ A zero kurtosis has the same proportion of values in the tails (extreme values) as a normal distribution, as well as
the same degree of clustering around the mean (peak values) as a normal distribution.

⚫ In EXCEL, an excess kurtosis between – 0.5 and + 0.5 indicates a distribution with approximately zero
kurtosis. An excess kurtosis coefficient equal to 0.0 represents a perfect zero kurtosis.

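Excess Kurtosis Coefficient

[Figure: scale of the excess kurtosis coefficient, negative (platykurtic) to positive (leptokurtic)]

Less than -1.0      high negative kurtosis (platykurtic)
-1.0 to -0.5        moderate negative kurtosis (platykurtic)
-0.5 to 0.0         near zero kurtosis
0.0                 perfect kurtosis
0.0 to +0.5         near zero kurtosis
+0.5 to +1.0        moderate positive kurtosis (leptokurtic)
More than +1.0      high positive kurtosis (leptokurtic)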
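The kurtosis scale can be applied in the same way as a lookup rule. This is a minimal sketch with a hypothetical function name, `interpret_excess_kurtosis`. Where the exact boundary values (±0.5 and ±1.0) fall is an assumption, because the slides leave it open.

```python
def interpret_excess_kurtosis(coef):
    """Map an excess kurtosis coefficient to the labels on the scale above."""
    if coef == 0:
        return "perfect zero kurtosis (mesokurtic)"
    # Assumption: exactly +/-0.5 counts as near zero.
    if abs(coef) <= 0.5:
        return "near zero kurtosis (mesokurtic)"
    if coef < 0:
        shape = "negative kurtosis (platykurtic)"
    else:
        shape = "positive kurtosis (leptokurtic)"
    # Assumption: exactly +/-1.0 counts as moderate rather than high.
    strength = "moderate" if abs(coef) <= 1.0 else "high"
    return f"{strength} {shape}"


for coef in (-1.2, -0.8, 0.0, 0.3, 4.5):
    print(coef, "->", interpret_excess_kurtosis(coef))
```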


2. EXCEL COEFFICIENTS
