
Definition of Skewness

Skewness in statistics describes the asymmetry of a data distribution around its mean. If you
plot a normal data distribution, you get a bell curve that is perfectly symmetrical. Now, this
doesn't happen all that often! To understand when a data distribution is imperfect and skewed,
let's first look at a normal data distribution and its symmetrical bell curve.
First, let me remind you of a few basic terms:

• Mean is the average of the numbers in the data distribution
• Median is the number that falls directly in the middle of the data distribution
• Mode is the number that appears most frequently in the data distribution
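
As a quick illustration of the three terms above, here is a minimal sketch using Python's built-in statistics module. The list of values is hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics

# Hypothetical values, assumed only to illustrate the three terms.
values = [1, 2, 2, 3, 7]

mean = statistics.mean(values)      # sum of the values divided by their count
median = statistics.median(values)  # middle value once the data is sorted
mode = statistics.mode(values)      # value that appears most frequently

print(mean, median, mode)
```

Here the single large value (7) pulls the mean above the median and the mode, which is a small preview of what skewness looks like.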

In a normal data distribution, the mean is directly in the middle (and top point) of the bell curve.
Imagine that Mrs. Thomas wants to introduce her high school statistics class to data
distributions, standard deviations, and bell curves on the first day. She asks the 16 students in
her class to privately share their summer job incomes. Each student hands Mrs. Thomas a piece of
paper with their income written on it. She rounds each income to the nearest $500 and makes a chart.
Now that we see the data on a chart, we can see that four of the students made about $2,000 in total
over the summer. If we find the mean, we see that it is $2,000. The mode and median in this data
distribution also happen to be $2,000. In a normal data distribution and perfectly symmetrical bell
curve, the median and mean are always the same value. Take a look at the graph of the data, which
represents a normal bell curve (no skewness at all!).
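
The classroom example can be checked with a short sketch. The original chart is not reproduced here, so the income counts below are an assumption: only the class size of 16, the rounding to the nearest $500, and the four students at $2,000 come from the text. The rest is one hypothetical symmetric arrangement that is consistent with those facts.

```python
import statistics

# Hypothetical income counts (income -> number of students). Only the class
# size (16) and the four students at $2,000 are taken from the text; the
# other counts are assumed so that the distribution is symmetric.
counts = {500: 1, 1000: 2, 1500: 3, 2000: 4, 2500: 3, 3000: 2, 3500: 1}
incomes = [income for income, n in counts.items() for _ in range(n)]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
mode_income = statistics.mode(incomes)

print(len(incomes), mean_income, median_income, mode_income)
```

With these assumed counts, the mean, median, and mode all come out to $2,000, which is what a symmetric distribution should produce.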

Properties of Skewed Bell Curves

In a symmetric bell curve, the mean, median, and mode are all the same value. How easy is that?
But in a skewed distribution, the mean, median, and mode generally differ from one another. You can
see this represented in this image:
A skewed data distribution or bell curve can be either positive or negative. A positive skew means
that the extreme values are on the high end, so the curve has a longer tail to the right. These
large values pull the mean (average) up, so the mean will typically be larger than the median in a
positively skewed data set. A negative skew means the opposite: the extreme values are on the low
end. The mean is pulled down, and the median is typically larger than the mean.
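
To see how extreme values move the mean relative to the median, here is a small sketch with two hypothetical data sets: one with a single very large value and its mirror image with a single very small value. Both data sets are made up for illustration.

```python
import statistics

# Hypothetical data sets, assumed for illustration only.
positive_skew = [1, 2, 2, 3, 3, 3, 4, 20]          # one extreme large value
negative_skew = [-20, -4, -3, -3, -3, -2, -2, -1]  # mirror image: one extreme small value

for name, data in [("positive", positive_skew), ("negative", negative_skew)]:
    # Compare the mean and median of each data set.
    print(name, statistics.mean(data), statistics.median(data))
```

In the first data set the mean ends up above the median, and in the second it ends up below, matching the descriptions of positive and negative skew.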

Formula for Skewness


A common formula for estimating skewness by hand, known as Pearson's second coefficient of
skewness, is this:
skewness = (3 * (mean - median)) / standard deviation
In order to use this formula, we need to know the mean and median, of course. As we saw earlier,
the mean is the average. It's the sum of the values in the data distribution divided by the number of
values in the distribution. And if the data distribution is arranged in numerical order, the median
is the value directly in the middle.
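
The formula can be sketched as a small function. The function name and sample data below are hypothetical. The sketch also assumes the population standard deviation (statistics.pstdev), since the text does not say whether the population or sample version is intended.

```python
import statistics

def pearson_median_skewness(data):
    """Return 3 * (mean - median) / standard deviation for `data`."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    # Assumption: population standard deviation; the text does not specify
    # whether the population or sample version should be used.
    std_dev = statistics.pstdev(data)
    if std_dev == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (mean - median) / std_dev

symmetric = [1, 2, 3, 4, 5]      # hypothetical symmetric data
right_tailed = [1, 2, 2, 3, 12]  # hypothetical data with one large value

print(pearson_median_skewness(symmetric))
print(pearson_median_skewness(right_tailed))
```

The symmetric data gives a skewness of zero because its mean and median are equal, while the data with one large value gives a positive result.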
Now, you may be asking: What is standard deviation? Standard deviation tells you how varied your
data set really is. It shows you how far your numbers typically spread out from the mean. Here is
the formula to find standard deviation:
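
The original formula is not reproduced here, so the following is only a sketch of one standard definition: the population standard deviation, which is the square root of the average squared distance from the mean. The sample version divides by the number of values minus one instead, and which of the two the original intended is not known. The function name and data are hypothetical.

```python
import math

def population_std_dev(data):
    """Square root of the average squared distance from the mean."""
    n = len(data)
    mean = sum(data) / n
    # Squared distance of each value from the mean.
    squared_deviations = [(x - mean) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / n)

# Hypothetical data, assumed for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std_dev(values))  # 2.0
```

For this data the mean is 5, the squared distances sum to 32, and 32 divided by 8 values is 4, so the standard deviation is 2.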
