You are on page 1of 6

Support Contact Us Knowledge Base Signup Revision History View Cart

Are the Skewness and Kurtosis Useful Statistics?


April 2008

In this Issue:
Introduction
Skewness
Kurtosis
Population
The Impact of Sample Size
Conclusions
Quick Links

Greetings,
Many software packages have a feature that generates descriptive statistics for a set of data. These
statistics include the mean, standard deviation, variation, count, and a host of other numbers. Two other
numbers often encountered are the results for the skewness and kurtosis.
This month's newsletter covers these two statistics. These two statistics are meant to be "shape" statistics,
i.e., they describe the shape of the distribution. What do the skewness and kurtosis really represent? And do
they help you understand your process any better? Are they useful statistics? You might be surprised at the
answer.

Best regards,
Bill

Introduction
Are your data normally distributed? This is a very common question today. There are many good ways of
determining whether your data come from a normal distribution. Many books say that the skewness and
kurtosis statistics give you insights into the shape of the distribution. Skewness is a measure of symmetry. A
normal distribution has a skewness equal to 0; i.e. perfect symmetry. Of course, that symmetry does not
exist too often in real life. Kurtosis is a measure of the flatness of the distribution. The kurtosis will be 3 for a
normal distribution. Some calculations (including that in Microsoft Excel) subtract 3 from the result, so the
kurtosis will be 0 for a normal distribution. Skewness and kurtosis are defined in the next two sections.

Skewness
In his book The Six Sigma Handbook, Thomas Pyzdek has the following definition for skewness:
"A measure of asymmetry. Zero indicates perfect symmetry; the normal distribution has a skewness of zero.
Positive skewness indicates that the "tail" of the distribution is more stretched on the side above the mean.
Negative skewness indicates that the tail of the distribution is more stretched on the side below the mean."
The equation for skewness is:
where x are the individual values, Xbar is the overall average, n is then sample size and s is the standard
deviation. This equation is from Dr. Donald Wheeler's excellent book Advanced Topics in Statistical Process
Control (www.spcpress.com).
Main Menu
Catalog
Software (5)
Complete Teaching Guides
to SPC (11)
Special Offers (5)
PowerPoint Training
Modules (17)
Newsletter Sign-up
Click here to sign up for our
FREE monthly newsletter,
featuring SPC and other
statistical topics, case
studies and more!
.
SPC for Excel is used in over
60 countries internationally.
Click here for a list of those
countries.
Home
SPC for Excel Software
SPC Training
SPC Consulting
SPC Knowledge Base
Other Software
How to Order
About Us
Shopping Cart
MyAccount
Are the Skewness and Kurtosis Useful Statistics? | BPI Consulting http://www.spcforexcel.com/are-skewness-and-kurtosis-useful...
1 of 6 10/09/2014 2:14 PM

Microsoft Excel, on the other hand, uses the following to define skewness:
"Skewness characterizes the degree of asymmetry of a distribution around its mean. Positive skewness
indicates a distribution with an asymmetric tail extending toward more positive values. Negative skewness
indicates a distribution with an asymmetric tail extending toward more negative values."
Very close to Pyzdek's definition. Excel uses a slightly different equation for skewness:

Does the difference in equations cause a difference in the results? Just a little. For the most part, the results
are pretty close. Three examples of distributions, each with a different skewness, are shown below.
Distribution Number 1: Skewness > 0; Tail Extends to the Right

Distribution Number 2: Skewness = 0; Normal Distribution

Distribution Number 3: Skewness < 0; Tail Extends to the Left

Our SPC for Excel software easily constructs and updates histograms, control charts and many other
SPC tools. Click here for more information.
Are the Skewness and Kurtosis Useful Statistics? | BPI Consulting http://www.spcforexcel.com/are-skewness-and-kurtosis-useful...
2 of 6 10/09/2014 2:14 PM

Kurtosis
Pyzdek defines the following:
"Kurtosis is a measure of flatness of the distribution. Heavier tailed distributions have larger kurtosis
measures. The normal distribution has a kurtosis of 3"
The equation for kurtosis (from Wheeler's book) is:

Microsoft Excel has the following definition:
"Kurtosis characterizes the relative peakedness or flatness of a distribution compared with the normal
distribution. Positive kurtosis indicates a relatively peaked distribution. Negative kurtosis indicates a relatively
flat distribution."
Excel uses the following equation to calculate kurtosis:
This is where it can get confusing. The first equation for kurtosis above (from Wheeler) will give a kurtosis of
3 for a normal distribution. Excel's equation has a built-in correction that will give a kurtosis of 0 for a normal
distribution. The different equations do give slightly different results (when accounting for the difference of 3).
Two examples of distributions with differing values of kurtosis are shown below.
Distribution with Too Much Peak (Kurtosis > 0)

Too Flat Distribution (Kurtosis < 0)

The Population
Are the skewness and kurtosis any value to you? To explore this, a data set of 5000 points was randomly
generated. The goal was to have a mean of 100 and a standard deviation of 10. The random generation
resulted in a data set with a mean of 99.95 and a standard deviation of 10.01. The histogram for this data is
shown below and looks fairly bell-shaped.
Are the Skewness and Kurtosis Useful Statistics? | BPI Consulting http://www.spcforexcel.com/are-skewness-and-kurtosis-useful...
3 of 6 10/09/2014 2:14 PM

The skewness of the data is 0.007. The kurtosis is 0.03. Both values are close to 0 as you would expect for a
normal distribution. These two numbers represent the "true" value for the skewness and kurtosis since they
were calculated from all the data. In real life, you don't know the real skewness and kurtosis because you
have to sample the process. This is where the problem begins for skewness and kurtosis. Sample size has a
big impact on the results.

Impact of Sample Size on Skewness and Kurtosis
The 5,000 point data set above was used to explore what happens to skewness and kurtosis based on
sample size. For example, suppose we wanted to determine the skewness and kurtosis for a sample size of
50. 50 results were randomly selected from the data set above and the two statistics calculated. The results
are shown in the table below.

Sample Size Skewness Kurtosis
5 1.983 3.974
10 -0.078 -1.468
15 -0.384 0.127
25 -0.356 -0.025
50 -0.169 -0.752
75 -0.489 0.615
100 -0.346 0.671
250 0.089 0.061
500 0.186 0.232
750 -0.02 0.042
1000 -0.138 0.062
1250 0.085 0.079
1500 -0.017 0.001
2000 -0.059 -0.009
2500 0.037 0.096
3000 0.009 0.005
3500 -0.015 0.004
4000 -0.015 -0.009
4500 0.009 0.036
5000 0.007 0.030

Notice how much different the results are when the sample size is small compared to the "true" skewness and
kurtosis for the 5,000 results. For a sample size of 25, the skewness was -.356 compared to the true value of
0.007 while the kurtosis was -0.025. Both signs are opposite of the true values which would lead to wrong
conclusions about the shape of the distribution. There appears to be a lot of variation in the results based on
sample size. The two charts below show how the skewness and kurtosis changed with sample size.

Are the Skewness and Kurtosis Useful Statistics? | BPI Consulting http://www.spcforexcel.com/are-skewness-and-kurtosis-useful...
4 of 6 10/09/2014 2:14 PM

30 samples is a common number used for process capability studies. A subgroup size of 30 was randomly
selected from the data set. This was repeated 100 times. The skewness varied from -1.327 to 1.275 while
the kurtosis varied from -1.12 to 2.978. What kind of decisions can you make about the shape of the
distribution when the skewness and kurtosis vary so much? Essentially, no decisions.

Conclusions
The skewness and kurtosis statistics appear to be very dependent on the sample size. The table above
shows the variation. In fact, even several hundred data points didn't give very good estimates of the true
kurtosis and skewness. Smaller sample sizes can give results that are very misleading. Dr Wheeler said it
correctly in his book mentioned above:
"In short, skewness and kurtosis are practically worthless. Shewhart made this observation in his first book.
The statistics for skewness and kurtosis simply do not provide any useful information beyond that already
given by the measures of location and dispersion."
Walter Shewhart was the "Father" of SPC. So, don't put much emphasis on skewness and kurtosis values
you may see. And remember, the more data you have, the better you can describe the shape of the
distribution.

To download the workbook containing the macro and results that generated the above tables, please
click here.

Attachment Size
Data for Kurtosis and Skewness.xls 718.5 KB
Quick Links
Preview Upcoming Release of SPC for Excel Version 5
Visit our home page
SPC for Excel Software
Online Videos of How the SPC for Excel Software Works
Measurement Systems Analysis (Gage R&R)
Customer Complaint SPC Software
SPC Training
SPC Consulting
Special Offers
Ordering Information
Thanks so much for reading our publication. We hope you find it informative and useful. Happy charting and
may the data always support your position.
Sincerely,
Dr. Bill McNeese
BPI Consulting, LLC
Are the Skewness and Kurtosis Useful Statistics? | BPI Consulting http://www.spcforexcel.com/are-skewness-and-kurtosis-useful...
5 of 6 10/09/2014 2:14 PM
Home SPC for Excel Software SPC Training SPC Consulting SPC Knowledge Base Other Software How to Order About Us Shopping Cart
My Account Login
Copyright 2014 BPI Consulting, LLC. All Rights Reserved.
Site developed and hosted by ELF Computer Consultants.
Are the Skewness and Kurtosis Useful Statistics? | BPI Consulting http://www.spcforexcel.com/are-skewness-and-kurtosis-useful...
6 of 6 10/09/2014 2:14 PM

You might also like