
Author's personal copy

International Journal of Forecasting 31 (2015) 910–915


A simple approach to projecting the electoral college

Joshua T. Putnam
Department of Government and Justice Studies, Appalachian State University, Boone, NC 28608, United States



Keywords: Electoral votes; Public opinion polling; Presidential elections; United States; Weighted average

The following research note examines the utility of a simpler method of projecting the winners of the various states within the United States Electoral College system. While more
advanced models may be able to center on state-level presidential winners earlier in an
election year, those models, among others, continue to be confounded by states where the
lead is small and/or not clear. This research will demonstrate that over the course of the
elections from 2000 to 2012, a simple weighted average identifies state winners as well as
the more complex models do.
© 2015 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.

1. Introduction
For all the hype, talk, campaigning, and money spent,
the 2012 United States presidential election was a rather
pedestrian affair. Through the lens of the media, every
tamale not properly shucked, every ill-advised tank-themed photo op taken, every Wendy's slogan borrowed
on the campaign trail, or every unearthed-turned-viral hidden video was the "game change" that was going to make
or break the election for one or both major party candidates at some point during the campaign season. Off that
roller coaster ride, however, the fundamentals (roughly,
presidential approval and some measure of the state of the
US economy) often chart a different, steadier path toward
election day (Abramowitz, 1988). In fact, in 2012, for all the
noise that the various campaign events represented, the
snapshot provided very early by the extant polling at the
state level held very steady throughout.
This was true across a number of forecasting models
that were employed, from very rudimentary polling averages to more complex Bayesian models, some of which
accounted for additional variables beyond the state-level
polling, while reducing the uncertainty of the predictions
through simulations (see Linzer, 2013). Across the array of


models, the picture was clear: the handful of swing states

all favored President Obama, with the exception of North
Carolina. In mapping out the Electoral College, the president had and held a 332-206 electoral vote advantage from
June through election day. The one piece of the map that
fluctuated across the various forecasts with any regularity
was Florida.
In that scenario, then (the one where Florida is the only
state to flip from one candidate to the other and, in some
cases, back again), those 29 electoral votes were neither
decisive nor determinative. They were superfluous to the
effort of Obama's reelection campaign to cobble together
a coalition of states that summed to 270 electoral votes
or more. In light of that fact, it was not much of a race
at all. The president had, and maintained throughout the
summer and fall campaign, leads in enough states to add
up to more than 300 electoral votes, depending on Florida.
In many respects, the 2012 US presidential election was a
steady state. Despite fluctuations in the state-level polling,
the picture always remained virtually the same in the aggregate. Obama always maintained some consistent cushion during the fall campaign.
However, that is not what was detailed through the
news media over the summer and into the fall of the
general election campaign. Nor did it follow the normal
pattern witnessed in the lead-up to the three previous


presidential elections of the 21st century. The 2012 campaign saw little movement from the summer onward in the
projected electoral vote count based on state-level polling.
The previous three cycles, on the other hand, ended in
November with different projections from those earlier in
the campaign, in June. The 2000–2008 cycles had higher
state-level polling volatility over the course of the campaign, but the movement across states was typically in the
same direction and was still limited to just the handful of
competitive states. To the extent that the electoral vote
count fluctuated, it was a function of the small number
of battleground states switching from one candidate's column to the other candidate's. That type of movement was
absent from the aggregation of 2012 polling. There were
fluctuations across the various survey snapshots taken in
2012, but those changes were muted relative to previous cycles.
Together, those four elections have provided forecasters with a wide array of conditions under which to project
electoral college outcomes. At their most basic, the four
elections are a reasonable cross-section of presidential
elections during the polarized era, given that, collectively,
they represent an open seat election following a two-term
Democratic incumbent (2000), one with a Republican incumbent (2004), an open seat contest following a term-limited Republican incumbent (2008), and a reelection
campaign involving a Democratic incumbent (2012). Not
only do the four elections account for different types of incumbency and partisan control of the White House over
this period, they also coincide with an era that has seen an
overall proliferation of publicly released polling data.
However, the publicly released polls can fluctuate
wildly over time and among the various polling firms surveying registered and likely voters across the country. At
a bare minimum, simple polling averages and other more
complex methods of aggregation and analysis can smooth
out the crests and troughs in the data to give a clearer, more
stable picture/forecast of where a presidential race stands
at any given point in time during an election year. This paper will establish the rationale behind one of these methods, the graduated weighted polling average (GWPA), and
describe its mechanics, discuss the reasons for inter-model
convergence with the GWPA in 2012, and, finally, examine the effectiveness of averages as a baseline presidential
election projection over the period 2000–2012.
2. A simple electoral college projection
The graduated weighted polling average (GWPA) works
on the assumption that survey data are an adequate proxy
of the state of the presidential race. Polls account for and
are a reasonable reflection of the oft-cited fundamentals in
the race for the White House, as well as of campaign effects
and other events. Furthermore, the GWPA treats all polling
data as "good" polling data; that is, good in the sense that
each new survey provides an update on the actual state
of the general election race in a given state at any point
in the campaign. The snapshots vary in their accuracy, but
are more powerful (more sound) when they are aggregated and averaged over time. On the whole, newer information is more valuable in the battleground states, but also


helps to identify which states are the swing states over the
course of any election cycle. For instance, in 2012, Michigan saw the average margin between the candidates widen
over time, making it less of a Romney target in the process.
Conversely, just four years earlier, North Carolina became
more competitive as the campaign continued, putting the
Tar Heel state on the Obama campaign radar.
Thus, any information is valuable to the graduated
weighted polling average. However, the recency of the data
is also of value. The timing serves as the only variable that
is accounted for in the projection model directly. The rest
are subsumed in the actual polling data or left unaccounted
for in the projection.
The polling data are paramount, and are only corrected
for by a decay function that attempts to account parsimoniously for how old the information is. In the context of the
graduated weighted polling average for a state, the candidates' levels of support (or vote shares) in the most recent poll are given a full, undiscounted weight, while an
older survey is multiplied by a decay function. That discount is calculated by dividing the day of the year in which
the poll was last in the field by the day of the year at which
the campaign currently is. For example, election day 2012
was on November 6, the 311th day of the year. A poll that
was last in the field in Arizona on January 9 (the ninth
day of the election year), as a Behavior Research Center
poll was, would be discounted significantly in the average,
due to its age on election day (de Berge, 2012a). That is, its
vote shares would have been multiplied by a decay factor of 0.029 (or
9 / 311). A poll closer to the day of the election, like the
early October survey of Arizona by the same firm, would
have been discounted in the GWPA at a less severe rate of
reduction. This later Behavior Research Center survey, being last in the field on the 284th day of the election year,
would have retained a little more than 91% of the value of
the candidates' vote shares in an election day average of
all weighted surveys (de Berge, 2012b).
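As a rough sketch of the mechanics just described (not the author's actual implementation), the weighting can be expressed in a few lines of Python; the poll figures below are hypothetical, not the cited Behavior Research Center results:

```python
def gwpa(polls, current_day):
    """Graduated weighted polling average for one candidate in one state.

    polls: list of (last_day_in_field, vote_share) pairs, with days
    counted from the start of the election year.
    Each poll's weight is last_day_in_field / current_day, so the most
    recent data carry close to full weight and older data decay
    gradually toward zero.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for last_day_in_field, vote_share in polls:
        weight = last_day_in_field / current_day
        weighted_sum += weight * vote_share
        total_weight += weight
    return weighted_sum / total_weight

# Averaged on election day 2012 (day 311): a day-9 poll retains only
# 9/311 = 2.9% of its weight, while a day-284 poll retains about 91.3%.
print(gwpa([(9, 42.0), (284, 45.0)], 311))
```

The average is therefore pulled almost entirely toward the newer survey, which is the slow, parsimonious decay the text describes.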
This is a very gradual decay function. In addition, the
discount is being utilized as part of an average. The former means that the vote shares in the polls decay very
slowly the older they get, while the latter indicates that,
in the absence of outliers, the aggregation of polling data
over time will translate into a slowly reacting gauge of aggregated public opinion on the race in a state. The slow
reaction time can be viewed as a drawback of the GWPA
approach, as it fails to keep pace with changes in the race.
From another perspective, however, the slow reaction time
can also mean that when a change does occur (specifically, a
state's average switching from favoring one candidate to
the other), it is evidence of a meaningful and lasting change
in the state of the overall race.
These conflicting perspectives highlight the advantages
and disadvantages of the GWPA relative to the extant,
alternative models. The 2012 presidential election cycle
offers one such lens through which to examine the various
prediction models. There are at least two overarching and
interrelated lessons that can be gleaned from 2012.
(1) First of all, despite the fact that a number of models
predicted the winner of the race for the White House
and the distribution of electoral college votes for each
candidate correctly early on, and showed little if any


fluctuation thereafter, that success seemingly had very

little to do with the methodology behind those models.
Each one was using polling data as either the sole basis
of its model or a major component of it. In many ways,
it was the aggregation of the polls themselves that was
prescient in terms of the outcome of the election.
In retrospect, even early aggregations of the available state-by-state survey data provided a clear picture in each state as to the ultimate outcome. As
early as June, the model constructed by Linzer found
the allocation of electoral college votes to be exactly
what it would end up on election day. Less than one
month later, the GWPA depicted the same distribution
through a much more simplistic lens: 332-206 in favor
of President Obama (Putnam, 2012). Over the course
of the summer and into the fall, that did not change:
not only were the aggregated polls right on the money,
they were consistent overall throughout the duration
of the contest, from the end of the Republican nomination race through to election day. To a great degree,
then, the prediction of the 2012 presidential general
election was about the polls.
(2) But where does that leave the models? One can
demonstrate that the polls were correct regardless of
the methodology behind the aggregation of the survey
data. Again, several models predicted the final electoral
vote distribution correctly, whether they were complex or more basic in approach (Jackman, 2012; Linzer,
2012; Putnam, 2012; Silver, 2012). Even those that
missed the mark picked up on the fact that Florida was
very closely divided (Wang, 2012). The certainty added
through the Monte Carlo simulations in the Wang
model indicated that neither candidate (Obama or
Romney) had a win probability in the Sunshine state
that was much higher than 50%. The few last-minute
polls that trickled in pushed most of the predictions
in the direction of Obama, but Florida was basically a
tie. The more basic averages had been there the entire summer, while some of the more involved models
had the Sunshine state drifting over into Romney territory in October, before reverting to the summer positioning as election day approached. Again, however,
compared to the graduated weighted polling averages,
those models were more adaptive to the smaller fluctuations in the polling that were taking place in Florida
and elsewhere throughout October.
Given the similarity of predictive abilities across
these models, the question that emerges is how much
the added complexity augments the ability to forecast
the outcome of a presidential election and/or the electoral college vote distribution. According to those two
measures, there was a certain degree of convergence
between the various forecasting models formulated
and publicly released during the 2012 cycle. In fact, a
simple unweighted average of the polls conducted in
each of the states throughout 2012 painted the same
picture: an Obama win and a 332-206 electoral college
vote distribution.
A better question may be why there was such a
convergence across such a varied array of forecasting models on those two measures. One hypothesis is

that the lack of variability in the forecasts may have

been attributable to the relative paucity of polling
variance within and across states relative to other presidential election cycles. There seems to be some superficial evidence of this in the fact that only one state
ever switched hands over the course of the campaign:
Florida. The shifts that were witnessed ended up being
more muted, and tended to follow the uniform swing
pattern in polling/vote shares across states. In other
words, there were small changes in the averages, but
such shifts were witnessed in similar ways across all
states rather than in a select group of states based on
regional differences or variations in competitiveness,
something that may be due to the nationalization of
presidential elections over time.
If the polling volatility was lacking, and thus the
state of the race was consistent over time, the inter-model convergence of forecasts may have been a phenomenon that was unique to the 2012 cycle. That is one
interpretation of why the wide range of models used,
from straight, unweighted averages of the polls in each
of the states to the more complex Bayesian analyses,
produced such similar results. However, it does not imply that the more complex forecasts added no value.
In fact, the utility of the more complex models is
twofold. Firstly, compared to the various polling averages, the Bayesian models incorporate new survey
data into a framework that is more reactive to changing information and more robust, in that, probabilistically, the uncertainty is reduced through repeated
simulations. Again, there was some variation in the
forecasts for Florida in 2012 across the more involved
models. Some of these models were on the right side
of what was a near-tie in the Sunshine state, while others were on the wrong side. This hints at some calibration issues over the series of models. Still, given the
increased volatility in the survey data over time, these
issues may actually have been less problematic in practice, if not nonexistent. The separation in that scenario
would potentially have been between the complex and
pedestrian models, rather than within the more complex ones. Thus, part of the value is in the reactiveness
of those models.
Secondly, the more involved models are better able
to take the forecasts a step beyond predicting an overall winner (a candidate with over 270 electoral college votes) and forecasting the electoral college vote
distribution. The methodology allows these models to
predict more closely the actual vote shares that the
candidates will receive in each state. The state-level
results of previous presidential elections are included
in several of these models as priors. Compared to
the graduated weighted polling average, the more
complex models reduce the vote share of undecided/other survey respondents more quickly as election day approaches. Due to the slower decay function,
the GWPA tends to lag behind those models. This is less
a matter of the undecided/other category than a result
of the effect of that vote share on the weighted average
of the candidate vote shares.
Together, the models along this spectrum (from the
simple average to the more complex Bayesian analyses)
performed well in the 2012 cycle. Certainly, it could be


argued that the polls told a consistent tale throughout the

year of the election. Furthermore, the value of the various models could be called into question. However, that
belies the fact that those models were largely consistent
throughout the general election phase of the 2012 presidential campaign.
By comparison, the graduated weighted polling average performed adequately overall. It predicted the winner
of the 2012 presidential election correctly, and predicted
the outcome correctly in all fifty states plus the District
of Columbia. However, as was mentioned previously, the
GWPA lagged behind the other models in terms of predicting the actual candidate vote shares. This is due to the
methodology. The GWPA is not designed to predict the last
of these three measures of success. In some ways, this calls
into question the robustness of the method behind the average, but it also sets the GWPA apart.
While the graduated weighted polling average may
have its limitations, it does serve as a baseline against
which other, more complex models can be compared.
From that perspective, the GWPA is akin to the Jacobson (1989) measure of congressional candidate/challenger
quality. The essence of the Jacobson measure of candidate/challenger quality is based on one simple question:
Has the challenger held elective office? Those who have,
have proven far more successful in winning elections than
those who have not. The resulting dichotomous variable
has been quite robust in models attempting to explain congressional electoral success. Other efforts to improve on
that robustness through the development of multi-point
indices have added to the explanatory value of the concept in some cases, but only marginally. In that case, the
baseline is a powerful enough measure on its own.
The graduated weighted polling average is not the Jacobson measure, and has not been tested on nearly as
many elections. However, its function is similar to that of
the latter parsimonious yet robust variable in the congressional elections literature. Again, the GWPA is a baseline
that the more complex models should be expected to improve upon. How much do they improve on that baseline,
though? In 2012, the GWPA was on a par with the other
models in terms of predicting the overall outcome and the
winner in each state, but such may not be the case in other
election cycles with different dynamics/conditions. For instance, in years when the polling volatility over time is low,
one would expect the various models to converge with the
baseline set in the averages. However, in years in which
there is an increased level of volatility in the survey data,
the expectation is that the more complex models would
outperform the simpler forecasts from the GWPA and similar methods.
3. Graduated weighted polling average, 2000–2012
Much of the explanation of the convergence between
forecasting models in 2012 is attributable to three things:
(1) polling volatility over time within elections, (2) the
amount of polling available, and (3) how closely competitive a state was. The polling volatility and the number of
polls affect when a correct prediction can be made, and
the level of competition impacts whether a correct prediction can be made. These are the factors that can plague


all of these models. To establish the graduated weighted

polling average as a baseline forecast and to determine its
strengths and weaknesses, it will be tested on each of the
four presidential elections from 2000 to 2012. Attention will
be paid to how reactive the GWPA was, how many states it
predicted correctly (and thus, how close the final electoral
college vote tally was), and whether it identified correctly
the tipping point state that would have put either candidate over the 270 electoral vote threshold.
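The tipping point calculation itself is simple to state: order the states projected for the winner from safest to most marginal, then accumulate electoral votes until the running total crosses 270. A minimal Python sketch, using hypothetical state data rather than any cycle's actual projections:

```python
def tipping_point(winner_states):
    """Identify the tipping point state for the projected winner.

    winner_states: list of (name, electoral_votes, margin) tuples for
    every state projected for the winner, where margin is the winner's
    projected lead in percentage points.
    """
    total = 0
    # Sort from safest to most marginal; the state that carries the
    # cumulative electoral vote count past 270 is the tipping point.
    for name, electoral_votes, margin in sorted(
            winner_states, key=lambda s: s[2], reverse=True):
        total += electoral_votes
        if total >= 270:
            return name
    return None  # projected winner falls short of 270

# Hypothetical four-state coalition: the 15-vote state pushes the
# running total from 260 to 275, so it is the tipping point.
states = [("A", 200, 20.0), ("B", 60, 10.0), ("C", 29, 0.5), ("D", 15, 5.0)]
print(tipping_point(states))  # prints "D"
```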
3.1. 2000
Bush v. Gore ended in a near-deadlock in Florida and
in the Electoral College, but the race for the White House
that year did not look so competitive at all times through
the general election phase of the campaign. At the point in
early June when the primary season was ending, Bush
held narrow leads in the GWPA in Illinois, Michigan, Oregon, Pennsylvania, Washington and Wisconsin, all of which
were states Clinton had carried twice in the immediately
preceding elections.
These Clinton-carried states accounted for 92 electoral
college votes. In mid-June, according to the average margins in the GWPA, George W. Bush had a lead of more than
5% in states totaling 239 electoral college votes, just 31 shy
of the 270 necessary to win the White House. Thus, five
months before election day, Bush needed only two of those
Clinton states to clear that threshold, in a best case scenario.
As election day drew closer, however, that significant
electoral college vote cushion disappeared. The 345-193
electoral college vote lead in June from the GWPA became
much tighter by election day. The polls had reflected this,
and the GWPA had likewise shown a Gore surge in the
averages. Given the survey data available on election day
2000, the GWPA projected a 272-266 electoral college vote
advantage and victory for George W. Bush. Collectively,
the projected averages identified the winner of the election correctly, and were within one electoral college vote of
predicting the electoral college vote distribution correctly.
However, this does mask the fact that the model missed
three states. Tennessee (11 electoral college votes) was incorrectly projected to be a Gore victory. Both New Mexico
(5 electoral college votes) and Oregon (7 electoral college
votes) were also wrongly pushed into the Bush column.
All three states were surveyed only sporadically, and New
Mexico and Oregon were also within one percentage point
in the actual balloting. Tennessee, Gore's home state,
moved late, and the scant polling available in the later period was not enough on election day to outweigh the earlier
polls in the GWPA. Despite those misses, though, the graduated weighted polling average method was also able to
pinpoint the tipping point state in the overall alignment of
states. Florida was positioned as the state that would have
nudged either candidate past the 270 electoral college vote
barrier both in the projection and on election day.
3.2. 2004
Four years later, some of the same patterns from 2000
re-emerged through the lens of the GWPA. The June snapshot showed the party of the incumbent administration to


be behind in enough battleground states to put its candidate, George W. Bush, well back in the early Electoral College projections. As had been the case in 2000, though, the
candidate who was trailing in June drew level with and then surpassed the opposing candidate as election day approached.
John Kerry clinched the Democratic nomination by
early March, and held a sizable lead over President Bush in
the predicted electoral college vote distribution by the end
of primary season in June. However, this tenuous advantage was built on very narrow polling average margins in
Florida and Ohio. To put it into perspective, the graduated
weighted polling average gave Kerry margins of just 0.05%
in Florida and 0.04% in Ohio at the beginning of June. The
306-232 edge that Kerry thus retained was quite vulnerable, and the 47 electoral votes of Florida and Ohio would
have swung the outcome toward President Bush.
As the November election day approached, though,
Kerrys marginal advantage waned. Florida, Ohio, Wisconsin and Iowa pushed into the Bush column, while only
New Mexico moved toward Kerry and into the Democratic
nominee's coalition of states. The movements of these five
states altered the June snapshot of the projected Electoral
College, and amounted to a net shift of 59 electoral college
votes between the candidates from June to November.
The graduated weighted polling averages projected the
winning candidate correctly overall and were within five
electoral college votes of predicting the electoral college
vote distribution correctly as well. This failure on the overall electoral college vote distribution was attributable to
the model incorrectly predicting New Mexico to Kerry and
Wisconsin to Bush. As was the case with the 2000 projection, the GWPA again missed on states that were decided
by less than one percentage point in terms of actual votes
cast. It missed the mark in New Mexico and Wisconsin, but
correctly predicted a narrow Bush win in Iowa. The GWPA
also correctly tabbed Ohio as the tipping point state in the
electoral college vote distribution.
3.3. 2008
As of the first week in June, when Barack Obama
clinched the Democratic nomination, John McCain was
running ahead in the polls and in the GWPA projection,
286-252. That early snapshot of the electoral college vote
distribution may have been a function of the extended
uncertainty surrounding the Democratic nomination, because once Obama was in place as the presumptive Democratic nominee, the picture of the race began to evolve.
As election day approached, the question became not who
would win, but to what extent Obama would win. The
graduated weighted polling average picked up on and reflected this shift in the race, predicting Obama wins in
states that Bush won twice, such as Florida, Nevada, Ohio
and Virginia.
The GWPA correctly predicted an Obama victory, identified Colorado as the tipping point state, and projected an
electoral college vote distribution of 338-200. Again, the
question as election day drew nearer concerned the extent
to which Obama would win in the Electoral College. The
GWPA identified the movement toward Obama, but failed
again in a couple of narrowly decided states, Indiana and

North Carolina. These three election cycles, though, make

it clear that the GWPA is more likely to mispredict states
when they are very closely contested. The GWPA was not
alone in missing on some of these. Both Linzer (2013) and
Silver (2008) missed on Indiana, though they were able to
utilize the direction of last minute data in North Carolina
to project an Obama win there.
Overall, the race in North Carolina in 2008 is a good
test of the value added by the more complex models. All
of these forecasting models are prone to failures when the
polling data are comparatively sporadic and/or when the
outcome approaches a 50/50 probability. The volatility of
the 2008 polling was also noteworthy relative to the more
stable data within and across states and polls in both 2000
and 2004.
3.4. 2012
The 2012 presidential election was anomalous relative
to the three prior cycles. There was little to no change in the
overall electoral college vote distribution over time, and
the statistical models all predicted an Obama victory, in
both June and November. Any models that did not forecast
a 332-206 breakdown of the electoral college votes occasionally indicated a 303-235 allocation, with Florida as the
only state to jump the partisan line from the Obama coalition of states to Romney's (see Wang, 2012). The GWPA did
not miss any states, but failed to identify the tipping point
state correctly, unlike in the three previous elections. Ohio
was predicted as the tipping point state by the GWPA, but
Colorado ended up as the state that put Obama over the
270 electoral college vote threshold.
3.5. Overall performance
The graduated weighted polling average performed
well across and within four very different election cycles.
In all four cases, the model identified the overall winner of
the Electoral College. Likewise, over the four elections, it
never predicted more than three states incorrectly in any
one election cycle. The seven states in total that were incorrectly predicted (three in 2000 and two each in 2004 and
2008) were all close contests decided by a fraction of a point, were late-moving states, and/or were polled
only sporadically compared to some of the other battleground states. All are factors that could hurt any model, but
should have a greater impact on the more simplistic models than on those that are more involved statistically. Simulations should reduce the uncertainty created by those
factors, to the extent that they are utilized in the more advanced models. The GWPA also identified the tipping point
state in the overall alignment of states correctly in three of
the four elections examined, only missing in 2012.
4. Discussion
The tie that binds all of these models (whether the
graduated weighted polling average or the more advanced
models) is a reliance on polling data, which raises a different question as our view shifts from reviewing how well
the models perform using survey data from past elections


to looking at 2016 and beyond. The quants won in 2012.

However, that victory was not without a wide-ranging and
fruitful discussion about the current accuracy of polling
and what it will look like in the future. This is important for
models that rely so heavily upon such information. The one
question that will continue to be worth asking is whether
response rates to public opinion polls will
continue to drop, and what impact that will have. If this
trend continues, there will almost certainly be a tipping
point at which phone-based polls begin to miss the mark
more consistently.
The bottom line remains: these projection models are
only as good as the polling data that go into them. If
garbage goes in, garbage is more likely to come out. On the
other hand, if the polling is accurate, then so too are the
projections. In the period 20002012, a simple average of
the available state-level polling, adjusted for the age of the
information, provides an important baseline against which
other forecasting models can be compared.
Acknowledgments
I would like to thank Paul-Henri Gurian, Michael Lewis-Beck, Drew Linzer and the three anonymous reviewers for
their helpful comments. I am also grateful to Miles Burns
for his assistance in collecting and coding the polling data
from the 2000 and 2004 presidential election cycles.


References
Abramowitz, A. I. (1988). An improved model for predicting the outcomes
in presidential elections. PS: Political Science and Politics, 21, 843–847.
de Berge, E. (2012a). Behavior Research Center's Rocky Mountain
Poll (January 13). Available at
de Berge, E. (2012b). Behavior Research Center's Rocky Mountain Poll (October 13). Available at
Jackman, S. (2012). Pollster predictions: 91.4% chance Obama wins,
303 or 332 EVs. Huffington Post Pollster. Available at http://
Jacobson, G. C. (1989). Strategic politicians and the dynamics of U.S.
House elections, 1946–1986. American Political Science Review, 83,
Linzer, D. (2012). Election day forecast: Obama 332, Romney 206.
Votamatic. Available at
Linzer, D. A. (2013). Dynamic Bayesian forecasting of presidential
elections in the states. Journal of the American Statistical Association,
108, 124–134.
Putnam, J. T. (2012). The electoral college map (7/17/12). Frontloading
HQ. Available at
Silver, N. (2008). Obama vs McCain: final pre-election projection. FiveThirtyEight. Available at
Silver, N. (2012). FiveThirtyEight's 2012 forecast. FiveThirtyEight. Available at
Wang, S. (2012). Today's electoral college map. Princeton Election Consortium. Available at