
A Statistical Analysis of Boston's 2015 Record Snowfall
More than 90 inches of snow fell in Boston in the winter of 2015, but how rare was this
occurrence? This article explores the rarity of the 2015 Boston snowfall amount in terms of
Sigma Levels and examines University of Oklahoma meteorologist Sam Lillo's estimate of
the likelihood of this event occurring.
By Dr. Diane Evans, Six Sigma Black Belt and Professor of Engineering Management at
Rose-Hulman Institute of Technology, and Thomas Foulkes, National Science Foundation
Graduate Research Fellow in the Electrical and Computer Engineering Department at the
University of Illinois at Urbana-Champaign


Section 1. Meteorologist's Analysis of Boston's Historic Snowfall in The Washington Post
The amount of snow that fell in Boston in the winter of 2015 was anything but normal. When
the photo displayed in Figure 1 was taken, an overwhelming 90 inches (7.5 feet) of snow had
already fallen.

Figure 1. A woman shovels snow on Joy Street during a blizzard in Boston on February 15,
2015. [Taylor, 2015]
Following this historic snowfall of 94.4 inches in a 30-day period in 2015, Sam Lillo, a
University of Oklahoma meteorologist, analyzed historical weather data from the Boston
area from as far back as 1938 in order to determine the rarity of this event. As outlined in
Attachment A [Agency, 2015] and discussed in Section 2 of this paper, Lillo developed a
simulated set of one million hypothetical Boston winters by sampling, with replacement,
snowfall amounts gathered over 30-day periods. Eric Holthaus, a journalist with The
Washington Post, reported that Lillo's results indicated that a 30-day stretch of snowfall like
that of January 24 to February 22, 2015 should occur only about once every 26,315 years in
Boston, as displayed in Figure 2. [Holthaus, 2015] In Six Sigma terminology, the likelihood
of this 30-day snowfall amount occurring is 38 out of a million.

Figure 2. Sam Lillo's simulated 30-day Boston snowfall amounts. The previous record of
58.8 inches was set in 1978. [Holthaus, 2015] The simulated (not actual) record using this
hypothetical data is 119.7 inches.
As Six Sigma practitioners, we found this analysis appealing because of the suggested rarity
of the event. Since the 1980s, industries across the world (e.g., manufacturing, healthcare,
management, academia) have been measuring the quality of processes and products based
on Six Sigma methodology. [Deming, 1986] In simple terms, when a company makes a
part for a product, it needs to meet customer specifications. If a part does not meet the
specifications for its intended use, then the part is considered defective. In order to reduce
the number of defective parts, the company's goal is to center a part's dimension on the
targeted mean and to reduce its spread, or standard deviation, about the mean. In doing so,
parts that do not meet specifications, i.e., defective parts, are rare events.
The Sigma Level of a process indicates how well it meets its set specification limits. As is
well-known in Six Sigma methodology, a process with Sigma Level 6 is expected to have
only 3.4 defects per million (DPM). In other words, only 3.4 parts out of a million are
expected to fall beyond the nearest specification limit. It is important to note that the 3.4
DPM figure is calculated assuming that the process is normally distributed and that its mean
shifts 1.5 standard deviations over time. If the process is normally distributed without this
shift, then the process has a long-term Sigma Level of 7.5. If a process is not normally distributed
and does not shift, like Boston snowfall amounts, then its Sigma Level is the number of
standard deviations from its mean to its closest specification limit.
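To make the defects-per-million bookkeeping concrete, the following minimal sketch (ours, assuming SciPy is available) reproduces the 3.4 DPM figure from a normal model with a 1.5 standard deviation mean shift.

```python
from scipy.stats import norm

def defects_per_million(sigma_level, mean_shift=1.5):
    """One-sided defect rate (parts per million) for a normally distributed
    process whose mean has drifted `mean_shift` standard deviations toward
    the nearest specification limit."""
    return norm.sf(sigma_level - mean_shift) * 1e6

print(defects_per_million(6.0))                # ~3.4 DPM: the familiar Six Sigma figure
print(defects_per_million(6.0, mean_shift=0))  # ~0.001 DPM with no mean shift
```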
The purpose of this paper is to explore the rarity of the 2015 Boston snowfall amount in
terms of Sigma Levels and to examine Lillo's estimate of the likelihood of this event
occurring. To assess Lillo's findings, we obtained snowfall amounts at a specified Boston
location from 1891 to 2015 via the National Oceanic and Atmospheric Administration
(NOAA) for comparison with his simulated data.

Section 2. Recreating the Simulated Data


On March 15, 2015, the cumulative Boston snowfall of 108.6 inches surpassed the previous
Boston record of 107.6 inches set in the winter of 1996. In Figure 3, a graphical display of
Boston snow statistics from 1938 to 2015 illustrates the quick rise in snowfall amounts in
2015 as compared to record-setting snowfalls in the years 1996, 1994, and 1948. Included in
the figure is the annual average Boston snowfall through early June. The final tally on
Boston's brutal 2015 snowfall clocked in at 110 inches. [Basu, 2015] The condensed
timeframe in which this historic and devastating snow accumulated created a public safety
crisis, resulting in dozens of deaths and hundreds of injuries and causing tens of millions of
dollars in destruction. [Agency, 2015]

Figure 3. Graphical display of Boston's snow statistics since 1938 [@NWSBoston, 2015]
The dashed rectangular region inserted in the graphic highlights the 30 days of snowfall
from January 24 to February 22, 2015 resulting in 94.4 inches of snow. In order to obtain
hypothetical 30-day Boston snowfall amounts, Lillo first generated one million resampled
winters by:
... stitching together days sampled from past winters. A three-day period was chosen, to
represent the typical timescale of synoptic weather systems. In addition, to account for the
effect of long-term pattern forcing, the random selection of 3-day periods was weighted by
the correlation between consecutive periods. Anomalies tended to persist across multiple
periods, such that there's a better chance that a snowier than normal three days would
follow a similarly snowy three days. This is well observed (and in extreme cases, like this
year), so it's important to include in the simulation. [Agency, 2015]
After generating the one million resampled winters, Lillo recorded the snowiest 10-period
stretches, i.e., 30 days, from each winter. Percentile ranges of the resampled distribution
were compared to the distribution of observed winters to check the validity of the simulated
data. In simulating the winter snowfalls in this manner, Lillo had to assume that
consecutive winters and winter snow patterns within a particular year were independent and
identically distributed (IID). We recognize that these assumptions are not necessarily valid.
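For illustration only, the following is a highly simplified sketch of this style of block resampling. It is not Lillo's code: the persistence weighting is a crude stand-in for his correlation-based scheme, and the historical block totals shown are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_winter(block_pool, n_blocks=40, persistence=0.5):
    """Stitch together one hypothetical winter from historical 3-day block
    totals, biasing each draw toward blocks similar to the previous one."""
    blocks = np.asarray(block_pool, dtype=float)
    sim = [rng.choice(blocks)]
    for _ in range(n_blocks - 1):
        # Weight candidates by closeness to the previous block (crude persistence).
        weights = np.exp(-persistence * np.abs(blocks - sim[-1]))
        weights /= weights.sum()
        sim.append(rng.choice(blocks, p=weights))
    return np.array(sim)

def snowiest_30_days(winter):
    """Largest running total over 10 consecutive 3-day blocks (30 days)."""
    return max(winter[i:i + 10].sum() for i in range(len(winter) - 9))

# Illustrative only: made-up 3-day block totals (inches) and 10,000 resampled winters.
history = [0, 0, 1.2, 0, 3.5, 0.4, 0, 8.1, 2.2, 0, 0.5, 12.0, 0, 1.1]
maxima = [snowiest_30_days(simulate_winter(history)) for _ in range(10_000)]
```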
Since we were unable to obtain Lillo's simulated data and are using actual historical data
for our own Sigma Level calculations, we used a digitizer on Figure 2 to create an
approximate copy of his data for further analysis. Using Engauge Digitizer, an open source
program, a blue cross was placed at the top center of each vertical bar of the distribution, as
shown in Figure 4. Each blue cross was converted into an (x, y) coordinate pair and stored in
an Excel spreadsheet. The digitized x coordinates correspond to the x-axis values on the
given plot since only one blue cross was associated with each vertical bar. The digitized y
coordinates were first normalized* against the plot's peak or maximum value. Then, these
normalized values were multiplied by 40,000 in order to scale the curve to match Figure 2.
The digitized plot of the normalized data is also displayed in Figure 4.
*Normalize: Normalizing in this sense rescales the data to the unit interval.
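A minimal sketch of this rescaling step, with hypothetical file and column names, might look like the following:

```python
import pandas as pd

# Hypothetical file and column names -- the actual spreadsheet headers may differ.
digitized = pd.read_csv("digitized_lillo_histogram.csv")              # columns: x, y
digitized["y_normalized"] = digitized["y"] / digitized["y"].max()     # rescale to [0, 1]
digitized["Number of winters"] = digitized["y_normalized"] * 40_000   # match Figure 2's peak
```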

Figure 4. Digitizing Lillo's data with Engauge Digitizer (left) and the resulting digitized
plot (right).

Section 3. Fitting a Distribution to the Maximum 30-Day Snowfalls


Data values for Maximum 30-day snowfall (in) and Number of winters from the Excel
worksheet were pasted into two columns in Minitab. Histograms of the snowfall amounts
with overlaid probability plots were constructed to offer reasonable distributions to fit the
data. The following Minitab commands were used to construct the histograms in Figure 5.

1. Choose Graph > Histogram > With Fit.
2. Select the column Maximum 30-day snowfall (in) as the Graph Variable.
3. Click on the menu Data View, and select the Distribution tab. Check the box Fit
Distribution and select a distribution from the drop-down menu. Because of the positive
skew and several extreme values of snowfall amounts, the gamma and largest extreme
value distributions were used. Click OK.
4. Click on the menu Data Options, and select the Frequency tab. The frequency of the
maximum 30-day snowfall amounts is in the Minitab column Number of winters. Click
OK.
5. Click OK.

Figure 5. The gamma and largest extreme value distributions are fit to the maximum 30-day
snowfall amounts. Note that the sample size N is not one million since the digitized data is
not an exact replica of Lillo's one million data points.
The gamma distribution provides the best visual fit to the maximum 30-day snowfall
amounts. To statistically identify the best fit, we used Minitab's Individual Distribution
Identification. By default, an Anderson-Darling (AD) test for goodness-of-fit is performed
and the numerical results are displayed with the graph. The Anderson-Darling procedure is
a general test to compare the fit of an observed cumulative distribution function to an
expected cumulative distribution function. This test gives more weight to the tails than the
Kolmogorov-Smirnov test, which is important because of the rare 2015 snowfall amount.
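To make the comparison concrete, the following is a rough sketch (ours, not Minitab's implementation) of the AD statistic computed directly from its definition for a fitted gamma distribution; the `snowfalls` array is a placeholder for the scaled snowfall amounts.

```python
import numpy as np
from scipy import stats

def anderson_darling(data, dist):
    """A^2 statistic comparing the empirical CDF of `data` with the CDF of a
    frozen SciPy distribution `dist`; smaller values indicate a closer fit."""
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    cdf = np.clip(dist.cdf(x), 1e-12, 1 - 1e-12)   # guard the logarithms
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1])))

# Example usage (placeholder data array `snowfalls`):
# shape, loc, scale = stats.gamma.fit(snowfalls, floc=0)
# a_squared = anderson_darling(snowfalls, stats.gamma(shape, loc=loc, scale=scale))
```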
Minitab could not construct probability plots of the data given the very large sample size of
nearly one million values. We scaled the frequency of the snowfalls to ten thousand data
points using their probability of occurrence. A column of the ten thousand snowfalls, called
Scaled Snowfall Amounts, was created and used to obtain probability plots for various
distributions. The following Minitab commands were used to obtain the AD test statistic,
p-value, and likelihood ratio test p-value for each distribution, as displayed in Figure 6, and
the probability plots in Figure 7.

1. Choose Stat > Quality Tools > Individual Distribution Identification.
2. In the Single Column textbox under the heading Data are arranged as, select the column
Scaled Snowfall Amounts (in). Below it, type 1 as the Subgroup size.
3. In order to have Minitab perform goodness-of-fit tests on the data for a variety of
distributions, select Use all distributions and transformations.
4. Click on Results, and select Goodness of fit tests and Distribution Parameters. Click OK.
(Selecting Distribution Parameters tells Minitab to estimate the parameters of the various
distributions.)
5. Click OK.

Figure 6. Minitab's goodness-of-fit test results, where AD is the value of the Anderson-Darling test statistic for a distribution, P is the p-value corresponding to the distribution,
and LRT P is the likelihood ratio test p-value. An asterisk indicates that a value cannot be
calculated.
For every 3-parameter distribution, except the Weibull distribution, there is no established
method for calculating its p-value, so the likelihood-ratio test (LRT) must be used. A
likelihood ratio test p-value (LRT P) less than a given significance level indicates that
adding the extra parameter significantly improves the distribution's fit. For example, the
LRT p-value of 0 for the 3-parameter gamma distribution indicates that the gamma
distribution with three parameters is a much better fit to the data than the gamma
distribution with only two parameters. Unfortunately, adding the third (threshold) parameter
to the gamma distribution shifts its lower bound slightly to the left of x = 0. Although the shift
is minor (approximately 1.262 units), it does have an effect on the right tail probability, which is the
main computational interest of this paper.

The extremely large sample size of the scaled data yielded p-values less than the typical
significance level of α = 0.05 for all of the distributions. Because overly large sample sizes
give distribution tests very high power, even small departures from a distribution
are considered statistically significant, resulting in minuscule p-values. In these large
sample size cases, we should not consider the p-values to be practically significant. Patrick
Runkel and Jim Frost advise in their blog posts Large Samples: Too Much of a Good
Thing? and How to Identify the Distribution of Your Data using Minitab, respectively, to
disregard the p-values obtained from the AD goodness-of-fit test and compare only AD
values for the distributions. [Runkel, 2012], [Frost, 2012] Lower AD values indicate better
fits. A visual inspection of probability plots combined with Anderson-Darling values can
help indicate whether a distribution is a good fit. The probability plots for several
distributions are displayed in Figure 7.

Figure 7. Probability plots of the scaled snowfall amounts for the normal, gamma, and
largest extreme value distributions. Clearly the normal distribution provides the worst fit
among the three distributions.
Of the fourteen distributions provided by Minitab to fit the snowfall data, the gamma
distribution was found to have the best fit, according to the probability plots and AD test
statistics, among distributions that model only positive x values.

Section 4. Determining Sigma Levels for the 2015 Boston Snowfall Using Lillo's Data
Let X represent the gamma random variable that best fits the maximum 2015 30-day
snowfall data. A gamma random variable X with positive shape parameter α and positive
scale parameter β has probability density function

f(x) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}, \qquad x > 0,

with expected value and variance

E(X) = \alpha\beta, \qquad \mathrm{Var}(X) = \alpha\beta^{2}.

The parameters for the distribution, shape α = 4.214 and scale β = 5.382, were computed
by Minitab's Individual Distribution Identification procedure. Using the gamma
distribution, the probability of a 30-day snowfall greater than 94.4 inches is

P(X > 94.4) = \int_{94.4}^{\infty} \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}\, dx \approx 3.675 \times 10^{-5}.
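As a cross-check outside Minitab, the same tail probability can be obtained from SciPy's gamma survival function (a sketch, using the parameter estimates reported above):

```python
from scipy.stats import gamma

# Shape and scale estimates from Minitab's Individual Distribution Identification.
p = gamma.sf(94.4, a=4.214, scale=5.382)   # P(X > 94.4)
print(p)                                   # ~3.7e-05, i.e. roughly 37 in a million
```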
This computation can be easily done in Minitab as well by using the following commands:

1. Choose Graph > Probability Distribution Plot.
2. Choose View Probability, then click OK.
3. From Distribution, select Gamma.
4. In Shape, type 4.214. In Scale, type 5.382.
5. Click the Shaded Area tab. Under Define Shaded Area By, choose X Value.
6. Click Right Tail. In X value, type 94.4. Click OK.

The resulting graphic is displayed in Figure 8.

Figure 8. The probability of a maximum 30-day snowfall exceeding 94.4 inches using the
best-fit gamma distribution and Lillo's simulated snowfalls.

In simulating the million hypothetical Bostonian winters, Lillo made the simplifying
assumption that the Boston climate is static. [Holthaus, 2015] Assuming a static climate
implies that the mean number of inches of snow received in 30-day periods each winter
does not shift over time. While we do not necessarily agree with this, we made the same
assumption in order to compare our results with Lillo's. Thus, in calculating the Sigma
Level for the 2015 30-day snowfall amount, we calculated the event's long-term Sigma
Level, as discussed in Section 1. The Sigma Level for the best-fit gamma distribution with
approximate mean E(X) = αβ = 4.214 × 5.382 ≈ 22.680 and variance Var(X) = αβ² =
4.214 × 5.382² ≈ 122.062 is

\text{Sigma Level} = \frac{94.4 - E(X)}{\sqrt{\mathrm{Var}(X)}} = \frac{94.4 - 22.680}{\sqrt{122.062}} \approx 6.49.
A long-term Sigma Level of 6.49 is equivalent to a short-term Sigma Level of 4.99
given the 1.5 standard deviation shift in the mean. Although the 2015 Boston snowfall is not
a true 6σ event according to Lillo's simulated data, the likelihood of Boston experiencing
this rare amount of snow in the next twenty-five centuries is near zero given its probability
of occurrence of 3.675 × 10⁻⁵.
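For readers who want to reproduce this arithmetic, a minimal sketch (in Python, using the fitted parameters from the text) is:

```python
import math

# Fitted gamma parameters from the text (shape alpha, scale beta).
alpha, beta = 4.214, 5.382
mean = alpha * beta                  # ~22.68 inches
std = math.sqrt(alpha * beta ** 2)   # ~11.05 inches

long_term_sigma = (94.4 - mean) / std      # ~6.49
short_term_sigma = long_term_sigma - 1.5   # ~4.99, per the 1.5-sigma convention
```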

Section 5. Determining Sigma Levels for the 2015 Boston Snowfall Using Actual
Historical Data
Instead of creating and analyzing simulated snowfalls as Lillo did, we used historical
Boston snowfall data to determine the rarity of the 2015 winter event. Seasonal snowfall
data was obtained for two Weather Service Forecast Offices (WSFO) from the National
Oceanic and Atmospheric Administration (NOAA): Boston WSFO AP and Boston Logan
Airport. NOAA defines one snowfall season as all snow received from July through the
following June. Thus, the 2014-2015 snowfall season, which we'll refer to as the 2015
season, includes monthly records from July 2014 to June 2015. While the Boston WSFO
AP station has snowfall amounts dating back to 1891, it does not have snowfall data
available for this station after 2012. The snowfall records for the Boston Logan Airport
station, which start in 1936, are in strong agreement with the Boston WSFO AP station and
provide measurements for the 2013, 2014, and 2015 snowfall seasons. By appending these
three snowfall seasonal records from the Boston Logan Airport station to the records from
the Boston WSFO AP station, a total of 125 years of historical Boston seasonal snowfalls
were gathered.
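A hedged sketch of how the two station records might be combined is shown below; the file names and column labels are purely illustrative, not NOAA's actual field names.

```python
import pandas as pd

# Hypothetical file names and column labels -- not NOAA's actual field names.
wsfo = pd.read_csv("boston_wsfo_ap_seasonal_snowfall.csv")    # 1891-2012 seasons
logan = pd.read_csv("boston_logan_seasonal_snowfall.csv")     # 1936-2015 seasons

# Append the 2013-2015 Logan seasons to the WSFO AP record: 125 seasons in total.
combined = pd.concat(
    [wsfo, logan[logan["season"].isin([2013, 2014, 2015])]],
    ignore_index=True,
)
```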
Unfortunately, NOAA does not provide daily snowfall totals for each year, only monthly
totals. Because the snowfall amounts for the specific period between January 24 and
February 22 could not be isolated, we combined the monthly snowfall totals for the entire
months of January and February. Although we cannot directly compare our results to
Lillo's, we can still study the rarity of the amount of snow received in these two months in
2015 compared to years dating back to 1891. A boxplot and histogram of the two-month
totals from 1891 to 2015 are displayed in Figure 9.

Figure 9. January and February Boston snowfall totals for years 1891 to 2015. The lone
right tail outlier (99.1 in) is year 2015.
For the 125 snowfalls, we again used Minitab's Individual Distribution Identification
procedure to compute the Anderson-Darling goodness-of-fit test statistics and their
respective p-values and likelihood ratio test p-values for a variety of distributions, as
displayed in Figure 10. Although the largest extreme value, gamma, and 3-parameter
loglogistic distributions have p-values greater than 0.250, both the largest extreme value
and 3-parameter loglogistic distributions assign positive probability to values of x less than 0.

Figure 10. The largest extreme value, gamma, and 3-parameter loglogistic distributions are
the best fits to January and February Boston historical snowfalls.

Figure 11 displays the gamma distribution with shape parameter α = 3.254 and scale
parameter β = 7.963 that Minitab identified as the best reasonable fit to the historical Boston
data. The parameters were determined by the Individual Distribution Identification
procedure. By including all of January and February, the snowfall total increased from the
30-day total of 94.4 inches to 99.1 inches, an increase of only 4.7 inches. Using the
best-fit gamma distribution, the probability of a snowfall greater than 99.1 inches occurring is
approximately 542 out of a million. The long-term Sigma Level is 5.1, while the short-term
Sigma Level is 3.6.

Figure 11. The probability of a maximum 60-day snowfall exceeding 99.1 inches using the
best-fit gamma distribution and historical Boston snowfalls.
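As with the simulated data, these figures can be cross-checked with a short SciPy sketch (ours, using the fitted parameters reported above):

```python
import math
from scipy.stats import gamma

alpha, beta = 3.254, 7.963                 # fitted shape and scale from the text
p = gamma.sf(99.1, a=alpha, scale=beta)    # ~5.4e-04, roughly 542 per million
mean = alpha * beta
std = math.sqrt(alpha * beta ** 2)
long_term_sigma = (99.1 - mean) / std      # ~5.1
short_term_sigma = long_term_sigma - 1.5   # ~3.6
```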
Recall that using Lillo's simulated data, the probability of a 30-day snowfall of at least 94.4
inches was 37 out of a million. Expanding the time period to all of January and February (59 or 60 days,
accounting for leap years) and using historical rather than simulated data increased the
probability to 542 out of a million. One obvious reason for the difference is the rarity of the
2015 30-day total as compared to the 2015 60-day total as displayed in Figure 12. The
figure shows that the winter of 2015 does not hold the record for the most snow in any 1, 2,
or 3 day period; however, it holds the record for all other periods of time. [Letham, 2015]
In 2015, Boston got 94.4 inches of snow in a 30-day time period, compared to the previous
record (set in 1978) of approximately 59 inches. In a 60-day time period, the difference in
snowfall totals between 2015 and the all-time record is not as large, as displayed by the
inserted dotted lines in the figure. Also, the maximum 60-day total snowfall in any year

may have occurred over a 60-day interval that does not start on January 1 and end on
February 28 or 29. The difference in rarity calculations for the simulated versus historical
data is also likely due to the questionable IID assumption used in creating the simulated
snowfalls, as well as the assumption of a static climate. However, even
when historical data is used to calculate the rarity of the 2015 Boston snowfall, the
likelihood of 542 out of a million is still a negligible amount!

Figure 12. The total snowfall amounts for 2015, 1978, and all-time record winters for
the number of days n varying from 1 to the entire winter. The all-time record line draws from
six previous winters (1978, 1994, 1996, 2003, 2005, and 2011) using data that goes back to
1893. [Letham, 2015]

Section 6. Conclusions and Future Work


On February 17, 2015, the website FiveThirtyEight, which uses statistics to tell compelling
stories, posted a time series-like chart of February snowfalls in Boston since 1891. [Enten,
2015] The chart, as displayed in Figure 13, provides a chronological view of the variability
in snowfalls since that time. In building probability models and calculating Sigma Levels in
Sections 4 and 5, we assumed that the conditions that resulted in the more typical snowfalls
were the same as the conditions that resulted in the more extreme events, such as in 2003,
2005, 2011, and 2015. This assumption is not necessarily true; in fact, it rarely is. When
data are not homogeneous, the fitted model will be incorrect and our predictions will be
erroneous. [Wheeler, 2013]

Figure 13. Time series-like chart of historical February snowfalls in Boston since 1891.
[Enten, 2015] Winter 2015's amount (over 58 inches) only accounts for the snowfall total
up to February 17, 2015.
The results calculated in this paper assume homogeneous annual snowfalls. Under that
assumption, the likelihood of the debilitating snowfall that occurred in Boston in 2015
ranges from 38 out of a million (using Lillo's simulated data) to 542 out of a million (using
historical data). The goals of this paper were to confirm Lillo's results and to calculate
similar snowfall likelihoods using historical data, which we did. Our future work is to
determine whether the Boston historical snowfall data is truly homogeneous. Walter
Shewhart, the inventor of the control chart, said the
following about data presentation: "Whenever an average, range, or histogram is used to
summarize the data, the summary should not mislead the user into taking any action that the
user would not take if the data were presented as a time series." [Wheeler, 2000] Our
follow-up work will be to examine the Boston snowfalls over time and look for patterns in
the historical data that may have had an effect on this unusual weather event. With recent
news coverage of El Niño and global warming, control charts may be able to provide
insight into regional climate change.

References
@NWSBoston. (2015, March 15). Retrieved from
https://twitter.com/NWSBoston/status/577460157615120384/photo/1?ref_src=twsrc%5Etfw

Agency, M. E. (2015, March 27). Attachment A: 2015 Severe Winter Weather Pattern
Impacts - Supplemental Information. Massachusetts, USA. Retrieved from
http://www.mass.gov/governor/docs/news/attachment-a-severe-winter-weather-pattern-impacts-supplemental-info.pdf

Basu, T. (2015, July 14). The Last Snow from Boston's Nightmare Winter Has Finally
Melted. TIME. Retrieved from http://time.com/3957842/boston-last-snow-melt/

Deming, W. E. (1986). Out of the Crisis: Quality, Productivity and Competitive Position.
Cambridge: Cambridge University Press.

Enten, H. (2015, February 17). Boston's Ridiculous February Snowfall in One Chart.
Retrieved from http://fivethirtyeight.com/datalab/bostons-ridiculous-february-snowfall-in-one-chart/

Erdman, J. (2015, March 23). weather.com. (The Weather Channel) Retrieved from
https://weather.com/news/news/new-england-boston-record-snow-tracker

Frost, J. (2012, March 8). How to Identify the Distribution of Your Data using Minitab.
(blog.minitab.com) Retrieved from
http://blog.minitab.com/blog/adventures-in-statistics/how-to-identify-the-distribution-of-your-data-using-minitab

Holthaus, E. (2015, February 25). Boston's astounding month of snow a 1-in-26,315 year
occurrence. The Washington Post. Retrieved from
https://www.washingtonpost.com/news/capital-weather-gang/wp/2015/02/25/bostons-astounding-month-of-snow-a-1-in-26315-year-occurrence/

Jędrzejewski-Szmek, Z., Muftakhidinov, B., Winchen, T., Trande, A., Lane, D., &
Weingrill, J. (n.d.). Engauge Digitizer Tool (Version 6.0). Retrieved from
http://markummitchell.github.io/engauge-digitizer/

Letham, B. (2015, March 15). Was 2015 Boston's worst winter yet? Boston. Retrieved
from http://lethalletham.com/posts/winter2015.html

Lillo, S. (2016, January 14). Personal Communication. University of Oklahoma
Meteorologist.

Runkel, P. (2012, June 4). Large Samples: Too Much of a Good Thing? Retrieved from
http://blog.minitab.com/blog/statistics-and-quality-data-analysis/large-samples-too-much-of-a-good-thing

Taylor, A. (author of article), Brian Snyder/Reuters (photographer). (2015, February 17).
What Record-Breaking Snow Really Looks Like. Retrieved from
http://www.theatlantic.com/photo/2015/02/winter-storms-bury-freeze-northeastern-us/385566/

Wheeler, D. J. (2013, June 4). Why We Keep Having 100-Year Floods. Quality Digest.
Retrieved from
http://www.qualitydigest.com/inside/quality-insider-column/why-we-keep-having-100-year-floods.html

Wheeler, D. J. (2000). Understanding Variation: The Key to Managing Chaos, 2nd Edition.
SPC Press.
