Professional Documents
Culture Documents
Pomona College
June 2017
Abstract
analysis that reveals that these correlations do in fact exist. The correlations
reveal that 1) basic voter characteristics have strong explanatory power, 2)
poverty and unemployment have a statistically significant near-zero effect
on voting behavior, and that 3) different immigrant characteristics have
conflicting impacts on the Republican presidential vote. I conclude that an
inexplicable Trump effect does not appear to exist, and that the 2016
election appears to be significantly understandable through regression
analysis.
1 Introduction
The 2016 presidential election, perhaps more so than others in recent memory,
was subject to unexpected turns, October surprises, and divergent political analyses.
Donald Trumps upset victory on election night was the perfect illustration of this, and
no one expected Donald Trump to winnot even Donald Trump himself.1 The question
that remains is, how did nearly every analyst and pundit miss it? As an economist and
a political scientist, it occurred to me to examine the numbers and decide whether they
were illogical. Moreover, I aimed to whether or not a Trump effect exists: some broad
characteristic of Donald Trumps victory that breaks all of the rules and is impervious
to explanation. I have written this paper to, among other things, test these broader
claims.
to be the most appropriate way to approach this question. Using county-level data, I
decided there were three broad categories that I would use to test for correlations: basic
voting characteristics, economic indicators, and migration flows. Instead of just using
the results of the 2016 presidential election, I decided to correlate to the change in
1
Jennifer Jacobs and Billy House, "Trump Says He Expected to Lose Election Because of Poll
Results," Bloomberg.com, December 13, 2016, accessed July 08, 2017,
https://www.bloomberg.com/news/articles/2016-12-14/trump-says-he-expected-to-lose-election-
because-of-poll-results.
1
voting shares from the previous election in 2012. This served as a better measure than
simply the 2016 voting results, because the coefficients of the corresponding independent
variables would now instead work as a referendum on both the outgoing Democratic
administration (based on the 2012 results) and the incoming Republican administration
(based on the 2016 results). I also decided to test the econometric model through two
different dimensionsevery county in the United States and the counties of the top ten
swing states. Doing so allowed me to better gauge the influence of these variables in
different contexts.
Basic voting characteristics serve as a foundation for the model, and include the
general voting descriptors most commonly associated with voting behavior. Without
clear correlations between these characteristics and presidential voting behavior, the
econometric model would have failed to be statistically significant with any other
further characteristics. Economic indicators also seemed like another set of variables
that would be appropriate to test. Much of the rhetoric of the 2016 presidential election
was on the economy, and I wondered the degree to which the state of the economy
would correlate to how Americans voted. Finally, in the face of the elections heightened
level of discussion around immigration, I was curious to test whether the change in the
countrys migration flows could be tied to a change in the 2016 Republican presidential
vote share when compared to 2012. This regression would help test whether a change in
immigration level resulted in higher or lower levels of support for Donald Trump.
2
The results for these regressions were highly encouraging and statistically
significant across different dimensions. The basic voting controls proved to be in line
with other results across the literature, and explained voter behavior to a large
statistical degree. Those results showed that race and education played a significant role
in the 2016 election. They revealed a tilt against the Republican presidential vote share
by a large and small margin, respectively. The economic indicators, when statistically
presidential vote. This was a surprising result, and made certain implications about the
effects of poverty and unemployment on voting behavior. Lastly, migration flows, when
statistically significant, showed three results, one of which was unexpected. First, the
regressions revealed that an increase in a countys domestic inflow was tied with a lower
migration was also tied to a lower share of Republican votes. Third, and most
results provide some preliminary insights into how different indicators reveal the
correlation between voter characteristics and the change in the 2016 Republican
presidential vote share from 2012. This analysis strongly proposes that the 2016
3
presidential election is not inexplicable and can in fact be understood fairly well through
econometric analysis. A Trump effect does not necessarily appear to exist. Through
migration flows, it becomes apparent that the election, like any other, can be explained
A great deal of literature has been written about the effect of economic
broadly known as economic voting, and in one authors words, analyzes voting as
define or examine that definition. Econometric analysis is not straightforward and there
are a variety of ways to measure political demand and regard economic grievances. This
definition, though, does generally describe the broad relationship that my paper
A common starting point in the economic voting literature is Ray Fairs The
Effect of Economic Events on Votes for President (1978). Fairs model attempts to
4
election. His paper uses various economic specifications and concludes that change in
real economic activity [does] appear to have an important effect on votes for president
(Fair, 1978). Bartels (1997) claims that this conclusion has a consensus, such that the
clearest and most significant implication of aggregate election analyses is that objective
economic conditions are the single most important influence upon [voting shares].
This is not to say that economic voting models are faultlessin fact, Fairs model, when
tested with various election results, often comes up short (Bartels, 1997). Some authors
inherent to the study of presidential elections (Leamer, 1978). In the face of this
probable uncertainty when studying economic voting, models must be thus guided by
(Bartels, 1997). Scholars can therefore take a variety of considerations into account
when building economic voting models. This latitude is helpful for exploring new
voting model, like others, must too define those parameters by which it treats political
2
A priori here refers to considerations that are derived from theoretical deduction as opposed to
statistical convention or practice.
5
I define political demand by directly measuring the county-level changes in
Republican presidential voting shares from 2012 to 2016. This definition is conventional
and appears in different variations through the literature. One study worth examining
defines political demand as the share of votes for President George W. Bush in the 2004
presidential election (DeSimone & LaFountain, 2007). While my definition for political
justification. Unlike political demand, there is no clear consensus definition for economic
grievances.
conditions), however, it appears to me that I must first justify the type of economic
data that my paper uses. In economic voting, there are two broad branches of data
methodologies that scholars have consistently used. One body of scholarship advocates
for the use of egocentric voting3, which uses survey data to capture voters personal
opinions about their economic conditions. Using this type of data, researchers can
measure economic activity by placing polls and asking voters questions about their
financial realities. The same paper I previously mentioned used egocentric voting to ask
voters whether they felt that they were better off, worse off, or equally well-off as they
had been four years ago (DeSimone & LaFountain, 2007). Egocentric voting presumes
that how voters feel about their own personal economic conditions is the greatest
3
In the literature, egocentric voting is also referred to as pocketbook voting.
6
determinant of how they vote. The other body of scholarship adheres to sociotropic
voting, which posits that instead, concrete economic conditions are better correlated to
actual voting behaviors. Sociotropic voting advocates for the use of tangible economic
(Kinder & Kiewiet, 1979). Other scholars also concur that voters are sophisticated
My paper and my model fit more comfortably within the realm of sociotropic
voting. This is for two reasons. First, Im skeptical about approaches that use egocentric
economic data, I dont believe that it reliably measures economic changes that voters
face. This is not to say that economic data is fine-grained enough to figure out which
voters are doing well or not, but I believe that hard economic indicators are still more
accurate than voter opinions that have been subjected to a variety of biases.4 Survey
data is too reliant on context and the underlying presumptions of voters, and suffers the
risk of skewed conclusions or the inability to replicate results. My paper stays away
from those types of inconsistencies and instead examines the existence of strictly
4
Bowman Cutter commented on an earlier draft of this claim.
7
that to a great degree reach beyond opinion bias. In the words of Brunner et al. (2008),
I seek not to rely on survey data, but rather examine the impact of economic
Having provided the definition my paper uses for political demand and my data
approach, I must now justify the variables that the paper uses to define economic
grievances. My chosen variables are greatly informed by Bloom & Price (1975), whose
variable dependent on party identification and the percent change of real per capita
income. This definition used isnt a perfect fit for my paper, but the emphasis on income
is certainly in line with my thinking. In this regard, the Bloom & Price (1975) paper
informs my model and provides a solid foundation to which I expand other a priori
conditions that I wish to test on voter behavior. In earlier terms, the Bloom & Price
(1975) focus on income is the consensus economic variable to which I add other unique
economic variables to test for correlations and expand the literature of economic voting.
This is not to say that the a priori variables that my paper examines are not
mainstreamin fact, they have mostly been included across other papers in the
literature, education is often linked to other economic markers (Bailey & Dynarski,
2011). Gelman & Su (2010) suggest that despite patterns we see for education levels,
8
voting by income remains strongly patterned along traditional lines, and should thus be
examined carefully. In terms of unemployment, the literature has linked higher levels to
higher levels of voter turnout and consequent voter share (Burden & Wichowsky, 2012).
conventional, the inclusion of migration flows does present a departure from mainstream
inquiry. Though scholars have recently begun to explore the potential effects of global
movement on voting, their current explorations have primarily concerned the movement
of goods instead of people. Jensen et al.s Winners and Losers in International Trade:
The Effects on U.S. Presidential Voting (2016) examines the employment effects of
expanding ports on changing vote shares with a strictly focus on trade flows. Similarly,
Trade Exposure (2016) considers labor markets with varying levels of trade exposure in
relation to voting behaviors. Though again, these models arent directly applicable to
Finally, it is worth noting that at the time of this writing, few papers have been
9
published on the 2016 presidential election. Though political scientists and economists
alike often wait until data is more complete and the picture is a little clearer, I believe
part of the pause can be attributed to Donald Trumps astounding upset. Mr. Trumps
victory stumped many existing economic and political conventions, and scholars are
currently deciding the degree to which Mr. Trumps win invalidates some of those
exploring an economic voting model with migration flows that might help explain the
increase in vote share for Mr. Trump that won him the election.
3 Econometric Strategy
2010 to 2015 that together build an approximate image of the correlations between
economic considerations and votes during the 2012 and 2016 presidential elections. The
(1)
This is the simplest form of my model, and best captures the broader theoretical
10
framework that my paper applies. It delineates the basic voting controls that the
presidential elections. These consensus economic considerations allow for the accurate
examination of a priori factors like migration flows. If the model did not include these
consensus factors, then the coefficients for the a priori factors might erroneously draw
the words of DeSimone & LaFountain (2007), I attempt to include controls that are
particularly linked to proxy for voter preferences and therefore absorb their spurious
elections often include the share of votes from the previous election as an independent
variable in their models. I have chosen not to do this. I argue that its inclusion does not
necessarily help expand the models true correlations but would instead skew the model
towards a disingenuously higher R-squared value. This skewed R-squared value would
convey more statistical significance than might plausibly exist, so to be more accurate I
Though Equation (1) is helpful for understanding the framework of my paper, its
clearly not complex enough to express the statistical nuance that regression analysis
requires. Equation (1a) is thus expanded to include the models specifications as written
11
(1a)
13(perc65plus) + 14(Pop_Density) +
each racial group by county. I have excluded White people, as they represent the largest
racial group and are part of the models constants. Bachelors_Perc is the average yearly
percent of people that hold Bachelors degrees in each county. Fem_Perc is the yearly
household income in each county. The variables perc017, perc1824, perc2539, perc4054,
and perc65plus are the percentages of people in each county aged 0 to 17, 18 to 24, 25
to 39, 40 to 54, and 65+. I have excluded people ages 55 to 64 as they represent one of
the largest population sizes and are also part of the models constants. Pop_Density is
each countys yearly population density, calculated by taking a countys population and
Having controlled for these basic voting variables, my model then expands to
12
framework. The models expansion is such that:
(2)
4(%Unemployment) +
The X variable represents the variables previously presented in Equation (1). The
other elements are my chosen variables to represent the potential economic conditions
that might be correlated to voting. A poverty variable is included to get a sense of the
districts wealth distribution in relation to voter behavior. Civilian labor force and the
unemployment rate are included to serve as parameters to measure the change in each
countys labor force. I have included both instead of just the unemployment rate
because the unemployment rate by itself can be a misleading measure of how many
people are not working. This oversight can be attributed to the measuring methodology
used by the Bureau of Labor Statistics (BLS) which does not continue to count people
in the unemployment pool if they have not found employment within 27 weeks.5 In the
face of a changed labor market after the financial crisis of 2007 and 2008, its becoming
clear that many Americans are receiving now sustaining themselves through income that
wouldnt necessarily qualify them as employed. Only including the unemployment rate,
5
"How the Government Measures Unemployment," U.S. Bureau of Labor Statistics, accessed
June 22, 2017, https://www.bls.gov/cps/cps_htgm.htm.
13
Again, an expanded form of Equation (2) that includes the models
(2a)
4(Unemployment_Rate) +
previously in Equation (1a). Poverty_Perc is the yearly percent of people in poverty per
county, LaborForce_Perc represents the yearly percent of people in the civilian labor
force per county, and Unemployment_Rate is the yearly unemployment rate by county
as calculated by BLS.
Apart from the consensus specifications explored in Equation (1) and the
economic indicators expressed in Equation (2), my paper also explores the correlation of
(3)
Like with Equation (2), I have kept the consensus indicators from Equation (1)
and consider them in concert with newly introduced migration variables. My model
14
considers domestic and international migration separately. Though the intuition might
be to perhaps add both flows into a single new yearly migration variable, such a method
international migration might have. By separating migration into two separate flows, I
which non-domestic immigrants are a part of the communities in these counties. To this
end, I include a foreign-born population variable to measure how many people per
county are not born in the United States. Though there is some risk of collinearity with
the foreign migration variable, I argue its worth including it to, again, potentially catch
separate correlations.
(3a)
4(ForeignBorn_Perc) +
15
Succinctly, this is expressed as
(4)
where X again represents the basic voting controls from Equation (1), Y represents the
economic conditions listed in Equation (2), and Z represents the migration flow
(4a)
4 Data Analysis
The 2016 presidential election results are drawn from the work of entrepreneur
Gary Hoover who compiled data sets from Michael Kearneys open source election night
results. The data presents the presidential voting results of roughly 3,100 counties and
16
county-equivalents in the United States. The data gathering process was
straightforwardKearney compiled the election night voting tallies from official sources
for the two major-party nominees, Sec. Hillary R. Clinton (D) and Mr. Donald J.
Trump (R), and their respective running mates. Because these are election night results,
they dont captivate the additional Clinton votes that led to her near three-million vote
surplus in the popular vote. These missed votes, however, dont change the essence of
the finalized results for the purposes of my paper. Clintons additional votes were
overwhelmingly from large counties that had already voted in her favor (e.g. Los
Angeles County). Because my paper weighs the data by county population size, my
paper is not significantly affected. The dataset remains a reliable representation of the
2016 presidential election results. Also included are the official results from the 2012
presidential election between Pres. Barack H. Obama (D) and Gov. W. Mitt Romney
(R). The availability of both 2012 and 2016 presidential election results allow me to
The various demographic datasets are compiled from the U.S. Department of
Agricultures Economic Research Service and the U.S. Census Bureaus 2010-2014
statistics such as population size, density, race, gender, education, and age. Economic
statistics such as median household income, poverty level, unemployment rate, civilian
labor force, and migration flows are also drawn from this dataset. Finally, the number of
17
average yearly number of foreign-born persons per county was provided by the
Of the original 3,112 counties in the unified dataset, approximately 400 were
inaccurate data across different statistics. Two states were removed because the county
data available for them was not reliableAlaska and Delaware. The rest of the
removals were done to balance the dataset. The United States has a significant number
of counties at the extremes of the population distribution, and their inclusion would
and lowest 5% of the population distribution. After these removals, the number of
counties observed was reduced to a final number of 2,784. I took an additional step to
limit the influence of extreme values across the dataset by weighting all summary
statistics (and later, all regression analysis) by population size. All statistics are rounded
18
Table 1. General characteristics by county.
2. General Demographics
Population Size 2,784 155,630 122,482 2,954 446,753
Population Density 2,784 336.7 663.9 0.474 10,214
Median Household Income 2,784 51,587 13,728 22,640 125,635
*
Observations have been weighted by county population size in 2015, with a
weighting value of 153,030,410.
**
Unless otherwise noted, values represent a yearly average or an
approximate estimate.
***
Change here is measured from the 2012 and 2016 presidential elections.
Table 1 above shows the basic voting controls from Equation (1). I have included
most general demographics but have withheld age and race to detail those in Tables 2
and 3. The variables presented have been weighed by county population, and thus, are
not face value numbers but what the numbers are when the population concentration
(or lack thereof) of a given characteristic is made neutral. The weighed voting total
results for the 2016 presidential election, for instance, are more consistent with the
electoral college result than with the popular vote tally. Both mechanisms operate
19
similarlythey decrease the weight of urban populations and increase the weight of
rural populations. This is done to better captivate the effect that adding a given number
of people with a certain characteristic might have on the voting share in a given county.
Indeed, Mr. Trump on average performs better than Sec. Clinton by county. A similar
conclusion can be drawn from the 2012 presidential election weighed values, which also
show Gov. Romney performing better per county. Though Gov. Romney lost both the
electoral college and the popular vote, his success with most of rural American shifts the
weighed totals in his favor. Republicans increased their presidential voting share from
2012 to 2016 by about 1.6%, which is again consistent with results weve seen.
median household income per county is nearly perfectly in line with the national average
of $51,587. The number of female-identifying persons per county is also near the
national average at 50%. It is worth mentioning that though the mean number of
bachelors degrees is around 25%, there is great variance per countysome counties
20
Table 2. Percentages of race by county.
*
Observations have been weighted by county population size in 2015, with a
weighting value of 153,030,410.
**
Names of categories taken from U.S. Census Bureau classifications.
Table 2 above shows the racial distribution that was omitted from Table 1. Non-
Hispanic White people are the most consistently represented racial category, with an
average of 71.6% per county when weighed by population size. Black or African-
American people are represented at 10.4%, Latinos at 9.2%, Asian people at 2%, Native
Americans or Alaska Natives at 1%, and Native Hawaiians or Pacific Islanders at 0.01%.
In several counties, there is great racial homogeneitythere are entire counties where
there are only White people, and others where Black Americans, Native American, and
21
Table 3. Age percentages by county.
*
Observations have been weighted by county
population size in 2015, with a weighting value of
153,030,410.
Table 3 above presents a similar breakdown like Table 2 but of ages. People aged 0 to
17 are most represented by county with a mean of 23%. Looking only at voting eligible
populations, however, the next largest group is people aged 40 to 54, at 20.2%. It is
noteworthy that counties where senior citizens retire to are represented well; the
maximum of the 65+ age range shows that there are counties where up to 50.9% of the
and 3, I also examine the economic sociotropic metrics from Equation (2) in Table 4.
22
Table 4. Economic sociotropic metrics by county.
*
Observations have been weighted by county population
size in 2015, with a weighting value of 153,030,410.
poverty and 47.7% are in the civilian labor force by county. Again, civilian labor force is
presented alongside the unemployment rate to provide a better measure of the status of
a countys workforce. An additional measure which I did not have access to, but that
would ultimately prove better than the official poverty rate would be the Census
Bureaus Supplemental Poverty Measure (SPM).6 The SPM addresses the shortcomings
of the official poverty measure and calculates cost-of-living in a more accurate manner.
6
US Census Bureau, Data Integration Division, "Poverty - Experimental Measures,"
Supplemental Poverty Measure Latest Research - U.S Census Bureau, accessed June 27, 2017,
https://www.census.gov/hhes/povmeas/methodology/supplemental/research.html.
23
Table 5. Migration flows by county.
*
Observations have been weighted by county population size in
2015, with a weighting value of 153,030,410.
Finally, my model examines the migration flows from Equation (3), represented
born. This statistic, however, has great variancethere exist counties without any
foreign-born population and on the other end, some with near 40.8%. This variance
0.07% change in domestic migration, some counties saw changes as varied as -6.7% and
21.4%. In contrast, there was much lower variance for international migration. Counties
only experienced variance between -0.12% and 1.67%. Though these statistics are
helpful to draw a general picture of migration flows, its important to keep in mind that
people move so continuously that it can be difficult to track who is arriving and who is
leaving.
24
5 Results
corresponding regressions on the complete dataset of all county data available. As for
the data analysis, I used analytic weights by population size to neutralize the influence
effects, I have also absorbed the influence of broader state correlations by running my
regressions with fixed effects. Further, I have used robust regressions instead of
Ordinary Least Squares (OLS) regressions. The OLS method is traditionally too
county-level data (which occasionally do not follow clear patterns), robust regressions
became the better technical choice. Finally, because my regressions evaluate twenty (20)
possess explanatory power, so they prove useful to determining whether the variables
25
Table 6. Regression results for every county in the dataset.
Robust t-statistics in parentheses. 48 categories of fixed effects absorbed. *** p<0.01, ** p<0.05, * p<0.1
26
The first regression corresponds to Equation (1) and concerns only the basic
voting controls. The variables in Equation (1), per the literature, have repeatedly been
proven to have a strong correlation to voting behavior. Regression (1), then, serves as a
base that shows the explanatory power of the consensus economic considerations that I
mentioned earlier in the paper. This base regression allows me to make a clear
distinction between those consensus variables and the a priori considerations that I have
additionally proposed. This clear distinction further allows me to measure the degree to
which the a priori variables, presented in Equations (2) and (3), possess explanatory
between the characteristics of voters and presidential voting behavior. Though the
literature has supported this conclusion at great length, it is important that the
econometric model I have built comes to this conclusion individually. Regression (1)s R-
Squared and adjusted R-Squared values are the evidence that support this conclusion.
While any R-Squared value above 0.5 would imply the existence of a somewhat
support for the significance of the relationship. Its adjusted R-Squared value comes to a
similar conclusionat 0.834, it only differentiates from the regular R-Squared value by
0.004. This implies that the independent variables I have included (the basic voter
27
Many of Regression (1)s independent variables have high t-stat values that
people, Latino people, and people with Bachelors degrees or higher result in a decrease
to the share of Republican presidential vote. These results are consistent with
implication to having such high R-Squared values for my base regression. The values
imply that most of the relationship can be correlated to the variables that I have
already presented, such that the additional a priori variables that I introduce are less
Regression (2) keeps the basic voting controls of Equation (1) but adds the
economic indicators from Equation (2). The addition of these economic indicators seems
and t-stats increase significantly. The coefficient for percent of Black people, for
instance, grows from -3.45 to -5.15, and its t-stat from -4.996 to -6.457. The addition of
models statistical correlation. Regression (2)s R-Squared value increases to 0.842 from
Regression (1)s 0.838. Its adjusted R-Squared value increases in parallelfrom 0.834 to
0.838. Different from Regression (1), however, is that three additional variables are now
28
statistically significant. The percent of female-identifying persons and the percent of
people in poverty are statistically significant at the 99% confidence interval, and the
Regression (3) keeps the basic voting controls from Equation (1) and adds the
migration flows from Equation (3). The addition of these migration flows in this case
also appears to make the econometric relationship stronger, but not as much as the
addition of the economic indicators did in Regression (2). Regression (3)s R-Squared
and adjusted R-Squared values increased to 0.389 and 0.835, respectively. Though this
means that Regression (3) has a slightly stronger correlation than Regression (1) by
0.001, it is ultimately weaker than Regression (2)s 0.842 and 0.838, respectively.
Further, almost all coefficients and t-stats in Regression (3) increase slightly from
Regression (1)s values, but not more than Regression (2)s. The exceptions are the
coefficient), and the percent of female-identifying persons and persons aged 18-24 (which
Regressions (1). In terms of the added migration flows, Regression (2) suggests that
Finally, Regression (4) keeps the basic voting controls from Equation (1), the
economic indicators from Equation (2), and the migration flows from Equation (3). The
29
econometric relationship here is as strong as Regression (2)s, with the same R-Squared
and adjusted R-Squared values of 0.842 and 0.838. This suggests that the addition of
migration flows in Regression (4) does little to either hinder or bolster the model with
economic indicators from Regression (2). Regression (4) does present some deviations in
female loses statistical significance and is now only significant to the 90% confidence
interval instead of the 1%. In contrast to Regression (3), Domestic migration is now no
longer statistically significant and the percent of foreign-born people is now significant
in the 90% confidence interval. With the complete model tested in Regression (4), it
becomes clear that Equation (2)s economic indicators add the greatest degree of
explanatory power to the basic voting controls and that Equation (3)s migration flows
2,784 counties over 48 states.7 This dataset is effective for conveying the broader
implications of the model and the general effects of the independent variables. While
this big picture is helpful, it is also worth exploring other, more specific datasets that
In the context of the last election, and given that the models dependent variable
7
As I mentioned in the Data Analysis section, the two states that are not counted and have
been removed for logistical purposes from the dataset are Delaware and Alaska.
30
is the change in the Republican presidential vote, it follows that we should also consider
a narrower dataset that examines states most likely to be impacted by marginal changes
in the vote. I would define these swing states as states that have been pivotal to the
results of the election such that 1) both candidates in recent elections have made
consistent efforts to win them and 2) they are traditionally won by a small margin. For
the purposes of my paper, I have chosen ten states that I think fit these qualifications:
Colorado, Florida, Iowa, North Carolina, New Hampshire, Ohio, Pennsylvania, Virginia,
Nevada, and Wisconsin. Coincidentally, these are the ten states that elsewhere in the
literature have also been defined as swing states (Jensen et al. 2012).
I have again run my models four regressions. Like before, I have used analytic
weights by population size and have run robust regressions with fixed effects. I have
again included both R-Squared and adjusted R-Squared values. This secondary analysis
regressed 665 observations left after removing the corresponding observations from the
original 2,784. The analysis can be seen on the following page, in Table 7:
31
Table 7. Regression results for ten swing states in the dataset.
Robust t-statistics in parentheses. 10 categories of fixed effects absorbed. *** p<0.01, ** p<0.05, * p<0.1
32
The first regression corresponds to Equation (1)s basic voting controls. The
statistical relationship between these controls and presidential voting behavior still
holds. Regression (1) exhibits consistently high R-Squared and adjusted R-Squared
values, at 0.825 and 0.819, respectively. In this secondary analysis, basic voting controls
again hold substantial explanatory power. Like with the first analysis, this level of
explanatory power supports the legitimacy of the values and relationships established by
Similar to Regression (1) in the first analysis, many independent variables hold
high coefficients and are statistically significant to the 99% confidence interval. Some
differences are apparent, however. In this regression, the percent of Native Americans or
persons aged 0-17, which was previously significant to the 99% confidence interval in
Table 6, is now only significant to the 90% confidence interval. Most of these
conclusions fall in line with other results in the literature, but their implications to
Regression (2), similar to Table 6, also gains significant explanatory power when
compared to Regression (1). The R-Squared and adjusted R-Squared values have shifted
from 0.825 and 0.819 in Regression (1) to 0.843 and 0.837 in Regression (2),
respectively. This is an increase of 0.018. While the statistical significance of the existing
basic voting controls remains consistent, the addition of economic indicators does
33
provide new insights. Both the percent of people in poverty and the percent of people
doesnt include economic indicators, but interestingly, it is the only one where median
household income is statistically significant to the 95% confidence interval. All other
variables hold consistent to previous regressions. The newly included migration flow
interval, and both the change in the percent of international migration and the percent
The final and fourth regression has very different results from Table 6. The
the 95% confidence interval, even though in the previous three regressions it was at the
99% confidence interval. Though median household income was briefly significant in
Regression (3), it once again loses significant in Regression (4). The economic indicators,
though slightly less statistically significant, mostly hold their values. The percentage of
people in poverty is now only significant to the 90% confidence interval, but the percent
of people unemployed remains consistent at the 99% confidence interval. For the
34
migration variables, there is a parallel decrease in statistical significance. The change in
the percent of domestic migration is significant to the 95% confidence interval, and both
the change in percent of international migration and the percent of foreign-born people
are now significant to the 90% confidence interval. Aside from that, it is significant that
all added a priori variablesboth the economic indicators from Equation (2) and the
Regression (4)s R-Squared and adjusted R-Squared values make this conclusion even
more significant. With values of 0.847 and 0.840, they represent the strongest
correlation that the model achieves in either Table 6 or 7. This suggests that at the
highest level of correlation, in the swing states, the a priori variables that I have
6 Discussion
they also exhibited high t-statistics that suggest the validity of their accompanying
For the sake of directness, I will be explicit about how to interpret the direct
results of the regressions. First, note that the models dependent variable is not the 2016
Republican presidential vote share, but instead, the change in the Republican
35
presidential vote share from the 2012 to the 2016 election. The coefficients of the
independent variables (IVs), then, are a measure of the degree to which the Republican
presidential vote would have decreased or increased in the context of how the party
significant coefficient of -2, it would mean that a 1% increase in that IV would suggest
that Mr. Trump would have performed 2% worse in the context of the results of the
2012 and 2016 elections. The model at no point claims to predict the 2016 presidential
vote share but instead attempts to approximate the hypothetical shifts in the vote that
they can be interpreted ceteris paribus.8 The model divides voter characteristics into
separate IVs, but in practice, this isnt actually possible with people. A person isnt just
their racethey also possess an assigned gender and an education level. The regressions,
then, assume that one could increase one IV without affecting the others. This is not
possible. For the sake of interpreting the results of the regressions, though, we will
assume that those influences are not disruptive enough to significantly skew the
8
Latin. All other things equal.
36
Table 6. Regression results for every county in the dataset.
The results for the basic voting controls demonstrated statistical significance for
econometric model captures the broader implications of voter behavior correctly. The
statistics wavering around -4.996 and -4.015 along with large coefficients wavering
around -3.45 and -4.15.9 These values remain fairly consistent throughout the other
three regressions. These values support conventional conclusions about the increase of
minority populations, in which their increased presence decreases the share of the
Republican presidential vote. As an aside, this is consistent with analyses of the 2016
presidential election which claim that Black turnout could have potentially swung the
election. I am not claiming that the regressions are one-to-one evidence of the veracity
of this claimafter all, my model weighs population evenly and absorbs fixed effects
but the implication is nonetheless present. The econometric models agreement with
broader election analyses is also present in the % with Bachelors degree or higher
variable. With neutral coefficients consistently around -0.35, this result also agrees that
college educated individuals in general did not swing dramatically towards Sec. Clinton.
The last statistically significant IV of the voting controls is % aged 0-17. These
coefficients are uncharacteristically large (ranging from -34.03 to -41.36) and dont
9
Respectively.
37
actually have a direct effect on voting behavior, as underage people cannot vote.
Though the independent variable implies the effects of an increase in families, I have
chosen not to pursue that line of thinking for fear of excessive extrapolation. Finally, the
regressions have been weighed by population size, thus, that coefficient has been
negated.
Beyond the basic voter controls, in terms of the economic indicators regressed in
Regression (2) and Regression (4), the model also sees some statistically significant
results. With t-statistics of 2.901 and 3.150, the regressions conclusively support that %
in Poverty has a 0.00 effect on the dependent variable. This is a noteworthy conclusion
that I did not expect. Because the economic voting literature links economic variables to
voting outcomes, I thought that it would follow that poverty would have some given
correlation. My regressions, however, conclusively revealed that this is not the case. One
might assume that this might be because people in poverty are heavily disenfranchised,
and do not often have means to vote. But that does not answer the question as to why
the size of a countys poor population does not affect how the county votes at all. After
all, wouldnt having more of a countys population in poverty affect voting shares in
some significant pattern? According to the regressions, it appears that it does not. This
the United States. Perhaps poverty is experienced differently in different parts of the
38
country, and there arent clear correlations to those experiences. This result merits
further study. Furthermore, it was also unexpected that the other economic indicator
variables were not statistically significant. Again, I wonder if the wide variety of
counties observed in the dataset produced noise that clouded these correlations.
Finally, the regressions revealed that the majority of the migration flow variables
were not statistically significant. In Regression (3), % Domestic Migration is the only
coefficient of -0.19. This result suggests that counties people are moving have a very
slight negative correlation to the change in Republican presidential vote share. The data
does not exist to differentiate the reasons why people are migrating to those counties,
but I would argue that they are probably economic. As the nature of the labor market
continues to change as it has over the past five or so post-recession years, it would make
with a coefficient of -0.06. This result, though perhaps a bit unreliable because of the
relatively small number of Foreign-born people in the United States, also proves
interesting. One of the reasons that I was interested in exploring migration flows was to
coefficient would corroborate the idea of an immigrant backlash that resulted in a tilt
39
towards Mr. Trump. It appears that the result does not, and that the coefficient is
mostly neutral with a very slight Democratic tilt. Perhaps the conventional wisdom that
counties with higher levels of immigrants correlate against the change in Republican
Though one would think that changing the dataset away from all counties in the
United States to all counties in ten swing states would potentially result in dramatically
different results, that does not turn out to be the case. Most results in Table 7 hold
consistent with Table 6, except for a few notable exceptions that do provide some
additional insights.
Of the basic voting controls, the % American Indian or Alaska Native variable
becomes statistically significant in the 99% confidence interval. Though I expected the
Native population to be too small to be statistically significant, it appears that isnt the
case with the swing state dataset. Surprisingly, the IV has a significant positive
correlation with the Republican presidential vote share. This goes against the
conventional wisdom that minority populations vote against Republican candidates. Its
possible that the population weighing that my regressions use allows this correlation to
be evident for the first time. I dont have a clear explanation for it, but it merits further
study.
40
Of the economic indicators, the % Unemployed variable is now statistically
significant with a near-zero effect. This is precisely like the % in Poverty variable, which
for both the full dataset and the swing state dataset, is statistically significant with a
coefficient of 0.00. It is surprising again that the regressions convey that there is great
certainty that these economic indicators have little to no effect. Perhaps economic
voting isnt as clear cut as my interpretation of the literature suggested it would be.
statistically significant with a coefficient of -2.37. This result is highly surprising, and is
correlates against the change in the share of the Republican presidential vote. This helps
address one of the central inquiries of my paper, which is the exploration of whether or
not immigrant flows contributed significantly towards the increase in Republican vote
share in 2016 and the victory of Mr. Trump. Again, many pundits suggested the idea of
an immigrant backlash which these results appear to argue against. More unexpected
still is the % Foreign-born variables 0.15 coefficient, which suggests a Republican tilt.
At first, I considered these two characteristics to be more closely related but perhaps
7 Conclusion
The first, and broadest conclusion that can be drawn from my econometric
analysis is that in concurrence with the literature before it, my paper supports the use
41
of regression analysis to correlate between the characteristics of voters and voting shares
in presidential elections. Though some of the finer points of the model I provide are
consistently large R-Squared and adjusted R-squared values that the regressions held
This is of course, not to say that econometric models at the scale and
methodology that Im working with are effective or reliable election prediction tools.
Even with adjusted R-Squared values consistently hovering above 0.80, the coefficients
still depict very rough relationships that are not completely precise. Presidential
elections are furthermore nearly always decided at the margins, and the margins often
included in the second and fourth regressions is one of the most surprising results from
economic voting. Had the economic indicator variables that I included not been
statistically significant in any form of the econometric model, that would have perhaps
been a more inconclusive result. It is, however, the strong statistical significance of these
voting models, the benefits of egocentric voting have become clearer. The egocentric
42
approach might move past the statistically noise more effectively to better understand
Ultimately, I still believe that these economic indicators have some correlations to
voting behavior, but that my models sociotropic approach was not effective in finding
them.
Finally, the migration flow results also present some interesting insights. The %
Domestic Migration variable suggests that areas where domestic peoples are moving to
more likely to have more Democratic inclination from their position in 2012. This metric
might serve as a proxy economic indicator, in which one might assume that counties
with migration influxes are doing well economically, and that this boon translates into
voting in approval of the existing Democratic administration. On the other hand, with a
of immigrants are entering counties in which areas of the country. Having separate
figures might reveal clearer correlations that my econometric model misses. The model
expect both of these variables to have similar inclinations, this does not turn out to be
coefficient that remains consistently around -2.37, and % Foreign-born around 0.15. Id
43
argue that this discrepancy between the two variables is evidence of two completely
represents the degree to which new immigrants are entering a county. In this sense, it
appears that counties that international immigrants are entering have a stronger
Democratic lean, but counties that are developing centralized and established immigrant
communities might be more comfortable with no strong inclination one way or another.
immigrant communities, with time and growth, become more politically balanced.
Holistically, the model and its regressions reveal that the 2016 presidential
share, though subject to interpretation, do exist. The existence of some foreign Trump
effect which cannot be clearly identified is mostly myth. At the very least, I can write
conclusively that any such effect would be substantially overpowered by other voting
behavior variables.
44
References
[1] Weatherford, M. 1983. Economic Voting and the "Symbolic Politics" Argument: A
Reinterpretation and Synthesis. The American Political Science Review, 77(1),
158-174. doi:10.2307/1956017
[3] Fair, R. 1978. The Effect of Economic Events on Votes for President. The Review of
Economics and Statistics, 60(2), 159-173. doi:10.2307/1924969
[4] Autor, D., Dorn, D., Hanson, G., & Majlesi, K. 2016. Importing Political
Polarization? The Electoral Consequences of Rising Trade Exposure. NBER
Working Paper Series. doi:10.3386/w22637
[5] Bailey, M., & Dynarski, S. 2011. Gains and Gaps: Changing Inequality in U.S.
College Entry and Completion. doi:10.3386/w17633.
[6] Bloom, H., & Price, D. 1975. Voter Response to Short-Run Economic Conditions:
The Asymmetric Effect of Prosperity and Recession. The American Political
Science Review 69, no. 4: 1240-254. doi:10.2307/1955284.
[7] Burden, B., & Wichowsky, A. 2012. Unemployment and Voter Turnout. APSA 2012
Annual Meeting Paper.
[8] DeSimone, J., & LaFountain, C. 2007. Still the Economy, Stupid: Economic Voting
in the 2004 Presidential Election. doi:10.3386/w13549.
[9] Kinder, D., & Kiewiet, D. 1979. Economic Discontent and Political Behavior: The
Role of Personal Grievances and Collective Economic Judgments in Congressional
Voting. American Journal of Political Science 23, no. 3: 495-527.
doi:10.2307/2111027.
[10] Holbrook, Thomas, and James C. Garand. 1996. Homo Economus? Economic
Information and Economic Voting. Political Research Quarterly 49, no. 2: 351.
doi:10.2307/448878.
45
[11] Rosenstone, S. 1982. Economic Adversity and Voter Turnout. American Journal
of Political Science 26, no. 1: 25-46. doi:10.2307/2110837.
[12] Jensen, Q., & Weymouth, S. 2016. Winners and Losers in International Trade: The
Effects on U.S. Presidential Voting. doi:10.3386/w21899.
[13] Gelman, A., & Su, Y. 2010. Voting by education in 2008. Chance 23, no. 8: 8.
doi:10.1007/s00144-010-0040-z.
[14] Brunner, E., Ross, S., & Washington, E. 2008. Economics and ideology: causal
evidence of the impact of economic conditions on support for redistribution and other
ballot proposals. Cambridge, MA: National Bureau of Economic Research.
http://www.nber.org/papers/w14091.pdf
46