
Skewness and Percentiles

A. Skewness
• Skewness in statistics represents an imbalance, or asymmetry, about the mean of a data
distribution.1
• Skewness is a measure of the asymmetry of a distribution. The highest point of a distribution is its
mode, which marks the value on the x-axis that occurs with the highest probability.
A distribution is skewed if the tail on one side of the mode is fatter or longer than on the other,
that is, if it is asymmetrical.2

▪ In an asymmetrical distribution, a negative skew indicates that the tail on the left
side is longer than on the right side (left-skewed). Conversely, a positive skew
indicates that the tail on the right side is longer than on the left (right-skewed).
Asymmetric distributions occur when extreme values pull the distribution away from
the symmetric shape of the normal distribution.

Negative skew: elongated tail at the left. This means that there is more data in the left tail
than would be expected in a normal distribution.

Positive skew: elongated tail at the right. This means that there is more data in the right tail
than would be expected in a normal distribution.

(Empirical rule) Interpretation of the values:

• Measure of skewness = 0: the frequency curve is symmetrical;
• Measure of skewness > 0: the distribution is positively skewed; and
• Measure of skewness < 0: the distribution is negatively skewed.

1. Pearson Mode Skewness

1 https://study.com/academy/lesson/skewness-in-statistics-definition-formula-example.html
2 https://www.statista.com/statistics-glossary/definition/390/skewness/

Pearson mode skewness uses the above facts to help you find out if you have positive or negative
skewness.

If you have a distribution and you know the mean, mode, and standard deviation (σ), then the Pearson
mode skewness formula is:

Sk = (Mean – Mode) / σ

or, written out in full,

A3 = (Mean – Mode) / Standard deviation

Sample problem: You have data with a mean of 19, a mode of 20 and a standard deviation of 25. What
does Pearson Mode Skewness tell you about the distribution?

Solution:
Sk= (mean-mode)/σ = (19-20)/25 = -0.04.

There is a very slight negative skewness (-0.04). Note: For most intents and purposes, this would count as
a symmetric distribution as the skewness is so small.
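
The sample problem can be checked with a few lines of Python. This is a minimal sketch: the function names `pearson_mode_skewness` and `skew_direction` are illustrative choices, not part of any library.

```python
def pearson_mode_skewness(mean: float, mode: float, std_dev: float) -> float:
    """Pearson mode skewness: Sk = (mean - mode) / standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean - mode) / std_dev


def skew_direction(sk: float) -> str:
    """Map the sign of a skewness measure to its interpretation."""
    if sk > 0:
        return "positive (right-skewed)"
    if sk < 0:
        return "negative (left-skewed)"
    return "symmetrical"


# Values from the sample problem: mean 19, mode 20, standard deviation 25.
sk = pearson_mode_skewness(mean=19, mode=20, std_dev=25)
print(round(sk, 2), skew_direction(sk))  # -0.04 negative (left-skewed)
```

The sign alone gives the direction of the skew; whether a value as small as -0.04 should be treated as effectively symmetric is a judgment call that the code does not make.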

Pearson Mode Skewness: Alternative Formula.

If you don’t know the mode, you won’t be able to use Pearson mode skewness. However, the direction of
skewness can also be found by comparing the mean with the median. According to Business
Statistics, this leads to a second formula:

Sk = 3 (Mean – Median) / σ

This formula is also called Pearson’s second coefficient of skewness. The factor of 3 comes from the
empirical relationship Mean – Mode ≈ 3 (Mean – Median), which holds approximately for moderately
skewed distributions.

The difference between the mean and mode, or mean and median, will tell you how far the distribution
departs from symmetry. A symmetric distribution (for example, the normal distribution) has a skewness
of zero.
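
When only raw data are available, the mean, median and standard deviation can be computed first and the median-based coefficient derived from them. The sketch below uses Python's standard `statistics` module. Pearson's second coefficient is conventionally written with a factor of 3, Sk = 3 (Mean – Median) / σ; the factor changes the size of the result but not its sign. The data sets and the choice of the population standard deviation (`pstdev`) are assumptions made for illustration.

```python
import statistics


def pearson_median_skewness(data):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    std_dev = statistics.pstdev(data)  # population SD: an assumption, not from the text
    if std_dev == 0:
        raise ValueError("all values are identical; skewness is undefined")
    return 3 * (mean - median) / std_dev


right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]  # hypothetical data with one large value
symmetric = [1, 2, 3, 4, 5]               # hypothetical symmetric data

print(pearson_median_skewness(right_tailed) > 0)  # True: mean 4.75 exceeds median 3
print(pearson_median_skewness(symmetric))         # 0.0: mean equals median
```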

B. Percentile

• In statistics, a percentile (or centile) is a score below which a given percentage of scores in its
frequency distribution falls (exclusive definition), or a score at or below which a given percentage
falls (inclusive definition). For example, the 50th percentile (the median) is the score below which
(exclusive) or at or below which (inclusive) 50% of the scores in the distribution may be found.
• Percentiles indicate the percentage of scores that fall below a particular value.
• A percentile expresses how a score compares to other scores in the same
set. While there is technically no standard definition of percentile, it is typically communicated as the
percentage of values that fall below a particular value in a set of data scores.

In statistical terms, there are three commonly used definitions of percentile:
▪ Greater than: The kth percentile is the lowest score in a data set that is greater than k percent of
the scores. For example, if k = 25, you'd be trying to identify the lowest score that is greater than 25% of
the scores in the data set.

▪ Greater than or equal to: The kth percentile is the lowest score in the data set that is greater than or equal
to k percent of the scores. For example, if k = 25, you'd be looking for the lowest score that is greater than or
equal to 25% of the scores.

▪ Weighted average: In this method, the kth percentile is a weighted average of the percentiles calculated
with the two definitions above. This method allows for numbers to be more neatly rounded and defines the
median of the set as the 50th percentile.
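
The first two definitions can be sketched in Python as follows. This is one possible reading of the wording, not a standard implementation. "Greater than k% of the scores" is taken to mean that at least k% of the scores lie strictly below the value, and "greater than or equal to" to mean that at least k% lie at or below it. The function names and the data are hypothetical. The weighted-average method is left out because the text does not specify the weights.

```python
def percentile_greater_than(scores, k):
    """Lowest score that is greater than at least k% of the scores."""
    n = len(scores)
    for value in sorted(scores):
        below = sum(1 for s in scores if s < value)
        if below * 100 >= k * n:  # integer comparison avoids float rounding
            return value
    return None  # no score qualifies (for example, k = 100)


def percentile_greater_or_equal(scores, k):
    """Lowest score that is greater than or equal to at least k% of the scores."""
    n = len(scores)
    for value in sorted(scores):
        at_or_below = sum(1 for s in scores if s <= value)
        if at_or_below * 100 >= k * n:
            return value
    return None


scores = [10, 20, 30, 40]  # hypothetical data
print(percentile_greater_than(scores, 50))      # 30
print(percentile_greater_or_equal(scores, 50))  # 20
```

For this small even-sized example the two definitions give 30 and 20, and their midpoint, 25, equals the median of the data.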

The ordinal rank of a percentile can be calculated using the formula n = (P/100) × N,

where
P = percentile,
N = number of values in the data set (sorted from smallest to largest), and
n = ordinal rank of a given value.

Percentiles are frequently used to understand test scores and biometric measurements.

Sample Problem:

There are 25 test scores: 72, 54, 56, 61, 62, 66, 68, 43, 69, 69, 70, 71, 77, 78, 79, 85, 87, 88, 89,
93, 95, 96, 98, 99, 99. Find the 60th percentile.

Solution:
Step 1:
Arrange the data in ascending order.
Ascending order = 43, 54, 56, 61, 62, 66, 68, 69, 69, 70, 71, 72, 77, 78, 79, 85, 87, 88, 89, 93, 95,
96, 98, 99, 99.
Step 2:
Express the percentile as a fraction,
k = Percentile / 100
= 60 / 100
= 0.60
Step 3:
Find the ordinal rank of the 60th percentile,
n = 0.60 × 25
= 15
Step 4:
Count the values in the sorted data set from left to right until you reach position 15.

In the sorted data set, the 15th value is 79. Now take the 15th value and the 16th value and
find their average: (79 + 85) / 2 = 164 / 2 = 82
Hence, the 60th percentile of the given data set = 82.
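
The four steps above can be reproduced with a short Python sketch. The function name `percentile_by_rank` is illustrative. The worked example only covers a whole-number rank, so the handling of a fractional rank (rounding up to the next position) is an assumption added here, as is the clamping at the ends of the list.

```python
def percentile_by_rank(values, p):
    """p-th percentile using the ordinal rank n = (P / 100) * N."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)                     # Step 1: ascending order
    count = len(ordered)
    rank, remainder = divmod(p * count, 100)     # Steps 2-3: n = (P / 100) * N
    rank = int(rank)
    if remainder == 0 and 1 <= rank < count:
        # Step 4: whole rank, so average the n-th and (n + 1)-th values
        return (ordered[rank - 1] + ordered[rank]) / 2
    # Assumption (not in the text): a fractional rank is rounded up,
    # and the position is clamped to the range 1..count.
    position = min(max(rank + (1 if remainder else 0), 1), count)
    return ordered[position - 1]


scores = [72, 54, 56, 61, 62, 66, 68, 43, 69, 69, 70, 71, 77,
          78, 79, 85, 87, 88, 89, 93, 95, 96, 98, 99, 99]
print(percentile_by_rank(scores, 60))  # 82.0
```

Other conventions exist for fractional ranks, such as interpolating between neighbouring values, so results from statistical software may differ slightly from this method.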
