
Free will meta-analysis 1


Moral Responsibility and Free Will: A Meta-Analysis

Adam Feltz
Michigan Technological University


Florian Cova
Swiss Centre for Affective Sciences, University of Geneva

Word Count: 7,717 (excluding notes, abstract, and references)

Address Correspondence to:

Adam Feltz
1400 Townsend Drive
Department of Cognitive and Learning Sciences
Michigan Technological University
Houghton, MI 49931

Fundamental beliefs about free will and moral responsibility are often thought to shape our
ability to have healthy relationships with others and ourselves. Emotional reactions have also
been shown to have an important and pervasive impact on judgments and behaviors. Recent
research suggests that emotional reactions play a prominent role in judgments about free will,
influencing judgments about determinism's relation to free will and moral responsibility.
However, the extent to which affect influences these judgments is unclear. We conducted a meta-
analysis to estimate the impact of affect. Our meta-analysis indicates that beliefs in free will are
largely robust to emotional reactions.

Many philosophers and psychologists hold that at least a minimal belief in free will is required
for us to have healthy relationships with others and ourselves. Free will may be necessary for
autonomy, creativity, desert, reactive attitudes, dignity, love, and friendship (Kane, 1996).
However, recent advances in psychology and neuroscience may pose some threats to a belief in
free will. This research suggests that many people appear to be unaware of some of the
neurological antecedents of their behavior (Bargh & Ferguson, 2000; Wegner & Wheatley, 1999;
Wegner, 2002; Wegner & Bargh, 1998; Libet, 1985). One worry is that if these results were to
become widely assimilated, then a belief in free will would be diminished and the desirable
behaviors associated with a belief in free will would also disappear or be dramatically reduced.
For example, in the absence of belief in free will, we may have difficulty maintaining
meaningful relationships with others and interpersonal conflicts may become more common
(Kane, 1996). Empirical research supports these worries to some extent, suggesting that beliefs in
free will are linked to judgments about punishment (Rakos, Laurene, Skala, & Slane, 2008;
Carey & Paulhus, 2013). Moreover, belief in free will has been argued to be an important factor
for many commonly desirable behaviors such as refraining from cheating, self-control, and job
performance (Vohs & Schooler, 2008; Baumeister et al., 2007; Baumeister, Masicampo, &
DeWall, 2009; Stillman et al., 2010) and has been shown to influence some of the neurological
antecedents of behavior mentioned above (Rigoni et al., 2011). For these reasons, some take it
that belief in free will is so important and engrained that if we were to find out that people really
are not free or morally responsible, we should leave people to their mistaken beliefs (Smilansky,
2002). Disabusing people of their mistaken belief would create a world in which nobody enjoys any
of these goods.
However, the extent to which advances in neuroscience and psychology call free will into
question or impact everyday conceptions of free will is still an open question (Roskies, 2006;
Mele, 2006, 2013). Belief in free will may be so engrained that it will be incredibly hard to
dislodge even in the face of extraordinary threats (Feltz, 2013; Feltz & Millan, in press). For
example, it appears as if many of the troubling findings from neuroscience have already been
assimilated in portions of the population. But this assimilation has not led to a reduction in
beliefs in the dualistic nature of humans or free will (O'Connor & Joffe, 2013). Thus, one
possibility is simply that our ordinary understanding of free will is such that it can easily
accommodate the findings of neuroscience, rather than being at odds with them.
Indeed, why should people be worried by the findings of neuroscience? One answer is
that they seem to promote a deterministic view of human behavior. Determinism has
traditionally been considered a threat to human freedom and moral responsibility. Determinism is
the thesis that whatever happens, including human behavior, is entirely caused by previous
events and the laws of nature (Mele, 2006). It means that whenever one acts, that action is
completely the product of the laws of nature and events that took place earlier in one's life, and
those events are in turn completely the product of earlier events, eventually reaching events that
happened long before the person who acted was born. However, it is not clear that determinism
prevents free will and moral responsibility, and philosophers have divergent opinions on that
matter. Compatibilists hold that free will and moral responsibility are compatible with
determinism. Incompatibilists hold that if determinism is true, then we cannot have free will or
be morally responsible for our actions. Thus, if people hold a compatibilist view of free will, it
could be that recent findings in neuroscience do not threaten their view of themselves as
free and morally responsible agents at all.
In recent years, theorists have increasingly made use of empirical methods to study
laypeople's conceptions and intuitions about free will and moral responsibility, with findings that
may appear somewhat contradictory. Sometimes people seem to have compatibilist intuitions
and sometimes they appear to have incompatibilist intuitions. For example, when participants
are asked the abstract question of whether somebody can be free and morally responsible in a
deterministic world, most people respond "no." However, if participants are asked whether a concretely
described person (e.g., John murdered his wife and children so he could be with his lover) can be
free and morally responsible in a deterministic world, most people respond "yes." To resolve this
apparent contradiction, some theorists have proposed that people's fundamental judgments about
free will and moral responsibility tend to be influenced by negative emotional reactions (Nichols
& Knobe, 2007). In this paper, we survey the results of 30 published and unpublished studies and
submit them to a meta-analysis in order to estimate the extent to which purported negative
emotional reactions influence judgments about the freedom and moral responsibility of agents
living in a deterministic universe. We conclude that negative emotional reactions have some
impact on judgments about free will and moral responsibility, but this effect is not large enough
to play the theoretical role theorists have attributed to it.

Note: "Intuition" is a term of art (see Feltz and Bishop, 2010). Here, we consider the intuition that p as an immediate
judgment that p.

Free will and affective reactions
Existing research concerning intuitions about determinism's relation to free will and moral
responsibility can be divided into two broad categories. A few works investigate intuitions
about particular claims or cases that help inform whether free will and moral responsibility are
compatible with determinism (e.g., the Principle of Alternative Possibilities or manipulation
cases) (Miller & Feltz, 2011; Sripada, 2012; Feltz, 2013; Cova, forthcoming). However, the
majority of existing studies try to determine directly whether laypeople's conceptions of free will
and moral responsibility are prima facie compatible with determinism. Some of these latter
studies involve investigating folk concepts of free will and moral responsibility (Monroe &
Malle, 2010; Stillman et al., 2011). Others address folk intuitions about the compatibility
question: the question of whether free will and moral responsibility are compatible with
determinism (Sommers, 2010; Kane, 1996). In this paper, we focus on the studies that address
the compatibility question.

Conflicting answers to the compatibility question: the abstract/concrete asymmetry
Do people have the intuition that an agent living in a deterministic universe can be free and
morally responsible (and therefore are natural compatibilists), or do they consider that
determinism precludes this agent's free will and moral responsibility (and therefore are natural
incompatibilists)? Initial investigations concerning the compatibility question seemed to favor
the conclusion that people are mostly natural compatibilists (Feltz et al., 2009). For example,
Nahmias and his colleagues (2005, 2006) gave participants vignettes describing agents living in
deterministic universes and performing particular actions (such as robbing a bank). They then
asked participants whether these agents had free will and were morally responsible for their
actions. In three different studies, a majority of participants answered that these agents had free
will and were morally responsible for their actions. Given these results, it would seem tempting
to conclude that participants tended to think that free will and moral responsibility are
compatible with determinism.

Note: Work using the empirical methods of the behavioral sciences to explore philosophically relevant beliefs sometimes
is called "experimental philosophy." For an overview of experimental philosophy, see Feltz (2009) and Cova

However, things are not that simple. Nichols and Knobe (2007) designed an experiment
in which participants were introduced to the following description of Universe A:

Imagine a universe (Universe A) in which everything that happens is completely caused by
whatever happened before it. This is true from the very beginning of the universe, so what
happened in the beginning of the universe caused what happened next, and so on right up until
the present. For example one day John decided to have French Fries at lunch. Like everything
else, this decision was completely caused by what happened before it. So, if everything in this
universe was exactly the same up until John made his decision, then it had to happen that John
would decide to have French Fries.

Participants were divided into two conditions. After reading the description, participants in the
concrete condition received the following additional paragraph and question:

In Universe A, a man named Bill has become attracted to his secretary, and he decides that
the only way to be with her is to kill his wife and 3 children. He knows that it is impossible to
escape from his house in the event of a fire. Before he leaves on a business trip, he sets up a
device in his basement that burns down the house and kills his family.
Is Bill fully morally responsible for killing his wife and children?

In this case, most participants (72%) answered that Bill was fully morally responsible for
killing his wife and children. These results are perfectly consistent with the hypothesis that most
laypeople are natural compatibilists. However, participants in the abstract condition did not
receive any additional paragraph but only the following question:

In Universe A, is it possible for a person to be morally responsible for their actions?

In this condition, most participants (86%) answered that it was not possible for a person to be
fully morally responsible. This pattern of responses conflicts with participants' answers in the
concrete condition. Let's call this the "abstract/concrete asymmetry." The abstract/concrete
asymmetry suggests that participants' answers in the abstract and concrete conditions are not
based on the same psychological processes (Sinnott-Armstrong, 2008; Weigel, 2011). However,
there is wide disagreement about what those processes are, and which of the two answers (if
any) should be considered as revealing participants' true conception of free will.

Competing accounts of the abstract/concrete asymmetry
One influential explanation of the abstract/concrete asymmetry is Nichols and Knobe's (2007)
affective performance error model. According to Nichols and Knobe (2007), the key difference
between the abstract and the concrete condition is the amount of emotional reaction generated by
the vignettes. The concrete condition, which depicts a horrendous murder, can be thought of as more
upsetting than the abstract condition. For Nichols and Knobe, this is what explains the different
intuitions between the abstract and the concrete conditions. While people tend to think that free
will and moral responsibility are incompatible with determinism (hence the answers in the
abstract condition), strong emotional responses can bias participants to attribute moral
responsibility in the concrete condition. Thus, results indicating that participants are natural
compatibilists would be performance errors on the part of participants. Since compatibilist intuitions
result from an error, they could not be used to infer what the folk truly think about the
relationship between moral responsibility and determinism.
Though influential, Nichols and Knobe's affective performance error model is not the
only available account of the abstract/concrete asymmetry. Another account simply relies on the
possibility that the abstract and the concrete conditions lead to different understandings and
interpretations of human agency in Universe A. Nahmias and Murray (2010) hold that most
people are natural compatibilists, and that they have the intuition that agents living in
deterministic universes can be free and morally responsible for their actions. However, they
stress that determinism should be distinguished from "bypassing." Bypassing occurs when
agents' mental states do not play a role in the production of those agents' actions, so that agents
will end up acting the way they do whether they want to or not. Clearly, one can think free will
and moral responsibility to be compatible with determinism while thinking that they are
incompatible with bypassing. Determinism does not entail that an agent's mental states are
bypassed or irrelevant to the production of the action. Based on this distinction, Nahmias and
Murray (2010) argue that Nichols and Knobe's depiction of Universe A (in which things had to
happen the way they did) could lead participants to understand that agents' mental states are
bypassed. This would explain why participants judge agents not to be morally responsible for
their actions in the abstract condition in spite of their natural tendency to consider free will and
moral responsibility to be compatible with determinism. However, in the concrete condition, it is
explicitly stated that the agent acts the way he does on the basis of his desires (e.g. he wants to
be with his secretary) and beliefs (e.g. he knows that it is impossible to escape from his house in
the event of a fire), which would lead participants to revise their interpretation and to understand
that agents in Universe A, though determined to act the way they do, are not bypassed. This
would explain why the agent is judged more morally responsible in the concrete condition.
Empirical evidence supports this hypothesis, suggesting that participants are indeed more likely
to consider that agents' mental states are bypassed in the abstract condition than in the concrete
condition (Murray & Nahmias, forthcoming; but see Rose & Nichols, in press).
Finally, a third account focuses on another particular feature of the concrete condition:
the fact that a norm is broken. According to the NBAR hypothesis (where NBAR stands for
Norm Broken, Agent Responsible; see Mandelbaum & Ripley, 2012), people are natural
incompatibilists. This natural tendency accounts for participants' answers in the abstract
condition. However, people also have the unconscious belief that whenever a norm is broken, an
agent is responsible for breaking the norm. In the concrete condition where a norm is broken,
this unconscious belief counters our natural tendency to judge free will and moral responsibility
to be incompatible with determinism, leading to an increase in judgments that the agent is
morally responsible.

Testing the affective performance error model: the high/low affect asymmetry
Thus, there are many possible accounts of the abstract/concrete asymmetry. What reasons do
Nichols and Knobe give us to prefer the affective performance error model? According to them,
their theory makes the following prediction: the same difference found between the abstract and
the concrete conditions can be found between two concrete conditions as long as affect is varied
in a similar way. To test this hypothesis, they ran a second study in which participants
received descriptions of Universes A and B. Then participants received only one of the
following two pairs of sentences:

Low Affect condition:
As he has done many times in the past, Mark arranges to cheat on his taxes. Is it
possible that Mark is fully morally responsible for cheating on his taxes?

High Affect condition:
As he has done many times in the past, Bill stalks and rapes a stranger. Is it possible
that Bill is fully morally responsible for raping the stranger?

For each condition, half of the participants were told that Mark (or Bill) lived in deterministic
Universe A, while the other half were told that he lived in indeterministic Universe B. Results
(presented in Table 1) suggest that Nichols and Knobe were right. More participants considered
the agent responsible in the high affect condition than in the low affect condition when the
action was set in deterministic Universe A.
--- Insert table 1 here ---
Let's call this peculiar pattern of responses the "high/low-affect asymmetry."
The high/low-affect asymmetry seems to provide support for Nichols and Knobe's account of the
abstract/concrete asymmetry. Indeed, it seems that the difference between the high and low
affect cases cannot be accounted for by the fact that the first makes it clearer that the agent acts on
the basis of his own desires and beliefs: in both cases, there are no direct references to the
agent's mental states. Nor does it seem that the difference can be explained by the fact that a
norm is broken in the first case and not in the second: in both cases, it is clear that a norm has
been broken. Rather, it seems that the only difference between the high and low affect cases is
that the first features a much more gruesome and indignation-arousing violation than the second.
The existence of this high/low-affect asymmetry thus lends important support to Nichols and
Knobe's affect-based account of the abstract/concrete asymmetry. The affect-based account has
led many to agree that emotional reactions could bias people's judgment about free will and
moral responsibility and to speculate about what that means for our ordinary conception of free
will (e.g. Nelkin, 2007; Vargas, 2009).

Replications in experimental philosophy of free will: the trouble with the high/low affect asymmetry
In a recent paper criticizing the methodological shortcomings of many empirical investigations
of folk intuitions about philosophically relevant topics, Woolfolk insisted that
replicability of research findings is a key to the establishment of scientifically sound inquiry
(2013, p. 84). There have been recent worries that most surprising results in psychology, and
particularly in social psychology, might not pass the test of replication, and these have led to a
demand for more replications (e.g., Young, 2012). For these reasons, scientific responsibility
counsels that before speculating on what could cause the effects we described, we should make
sure that these effects are robust.
To what extent have the results we described been replicated? In the previous section, we
have described and distinguished two different phenomena: the abstract/concrete asymmetry,
and the high/low-affect asymmetry. The abstract/concrete asymmetry (i.e., participants ascribe
less moral responsibility to agents living in a deterministic universe when the question is asked
abstractly rather than concretely) has been widely reproduced. It has been reproduced using
different descriptions of the deterministic universe (Cova et al., 2012), different concrete
vignettes (Nahmias et al., 2007; Murray & Nahmias, forthcoming), cross-culturally (Sarkissian
et al., 2010), and even when the agent is forced to act by a particular neurological condition (De
Brigard et al., 2009).
The abstract/concrete asymmetry thus seems to be a robust result that deserves an explanation.
However, the same cannot be said of the high/low-affect asymmetry. So far, the dramatic
difference Nichols and Knobe found between the low-affect and the high-affect cases has not
been properly replicated in a published study. The only published paper to directly attempt a
replication failed twice and found both times that participants gave mostly incompatibilist
answers in both cases (Feltz et al., 2009). A similar effect has been found by Cova and his
colleagues (2012), but instead of comparing Nichols and Knobe's low-affect and high-affect
cases, they compared the low-affect case to Nichols and Knobe's concrete case, which differs in
many respects from the low- and high-affect cases (for example, the concrete condition puts more
emphasis on the agent's desires and the role they play in the production of his action, which may
explain the difference between the two cases). Thus, it is not clear that the high/low-affect
asymmetry is robust or real (i.e., early findings may reflect Type I error).

Note: For example, Scaife and Webber (2013) found puzzling results about folk intuitions about intentional action, but
further studies repeatedly failed to replicate these results (Cova, in press). See also Sayedsayamdost (in prep) and
Christian Mott's replication page:
Replications.html, and the Psych File Drawer project:

This lack of replication is all the more worrying because there are reasons to doubt that
the impact of affective reactions can explain why most participants consider agents morally
responsible for their actions in the concrete cases. First, Nahmias and his colleagues used
concrete vignettes involving neutral actions such as going jogging (Nahmias et al., 2006) or
positive actions such as giving money to charity (Nahmias et al., 2006, 2007), and still found
that most participants judged the agent morally responsible for his actions. Second, in a recent
study, Cova et al. (2012) gave various concrete cases to patients suffering from a behavioral
variant of frontotemporal dementia, a neurodegenerative disease accompanied by a deficit in
emotional responses. Contrary to what the affective performance error model would have
predicted given their lack of emotional reactions, these patients were no more incompatibilist
than control participants and gave mostly compatibilist answers.
However, there are also reasons to think that affective reactions do have an influence on
judgments about free will and moral responsibility. First, one source of evidence comes from a
series of studies suggesting that extraverts are more likely than introverts to judge that an agent
living in a deterministic universe is morally responsible for his actions (Feltz & Cokely, 2009;
Schulz et al., 2011; Feltz, Perez, & Harris, 2012; Feltz & Millan, in press). A possible
explanation for this phenomenon is that extraverts are less likely to regulate their own emotions,
and thus are more susceptible to being influenced by the affective content of vignettes. Results from
neuroscience support this to some extent, suggesting that some individuals are more likely to
regulate emotional reactions than others (Ochsner & Gross, 2005; see also Smillie, 2013).
Second, Feltz and his colleagues (2012) found that the affective content of a vignette could
influence the type of explanation participants give for the agent's behavior. Participants faced
with a high-affect vignette were more likely to explain the agent's behavior in terms of the
agent's decision than participants reading a low-affect vignette. Given that the same study found
that participants explaining the agent's behavior in terms of decisions were also more likely to
perceive him as free and morally responsible, this suggests that the affective content of a
vignette can indeed influence participants' judgments by favoring one kind of explanation for
agents' behaviors over another. Third, other evidence comes from psychological studies
showing that inducing anger in participants can increase their propensity to punish and ascribe
moral responsibility to agents (Keltner et al., 1993; Tetlock et al., 1998).
Consequently, it is not clear whether negative affective reactions have an impact on
compatibilist or incompatibilist judgments. Even if affect does influence judgments, the
extent to which the abstract/concrete asymmetry can be explained by this impact remains unclear. To
help address these issues, and to determine whether the high/low-affect asymmetry is a genuine
and replicable effect, we conducted a meta-analysis.

Search Criteria
We used the following criteria for including studies in the meta-analysis: (1) the study included a
description of determinism. This narrowed the group of many possible studies because
determinism, as philosophers understand it, is a precise, technical concept. For example, this
criterion excluded a number of studies that used scenarios suggesting, but not explicitly
describing, determinism. It also excluded a number of studies where researchers inferred
compatibilist and incompatibilist judgments absent a description of determinism. (2) The study
manipulated the emotional content of the scenarios. Some studies only had high or low affect
scenarios. Because we were interested in the effect of affect, any study that did not manipulate
the emotional content of the scenario was excluded.
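
The two inclusion criteria amount to a simple conjunctive filter over candidate studies. The following is a minimal sketch of that screening logic, assuming each candidate has already been coded by a human reader; the class name `CandidateStudy`, its fields, and the example entries are hypothetical and are not drawn from the actual study list.

```python
from dataclasses import dataclass


@dataclass
class CandidateStudy:
    """A screened study; all fields are hypothetical coding decisions."""
    label: str
    describes_determinism: bool  # criterion (1): explicit description of determinism
    manipulates_affect: bool     # criterion (2): emotional content is manipulated


def meets_inclusion_criteria(study):
    """A study is included only if it satisfies both criteria."""
    return study.describes_determinism and study.manipulates_affect


# Made-up candidates for illustration only.
candidates = [
    CandidateStudy("hypothetical study A", True, True),
    CandidateStudy("hypothetical study B", True, False),   # only one affect level
    CandidateStudy("hypothetical study C", False, True),   # determinism only implied
]

included = [s.label for s in candidates if meets_inclusion_criteria(s)]
print(included)
```

Only the first made-up candidate passes, since the other two each fail one criterion.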

Search for Studies
The effect of affect was first identified by Nichols and Knobe (2007). Because they identified the
effect of interest, we first used a Google Scholar Cited Reference search to find all papers that
referenced Nichols and Knobe (2007) for possible inclusion in the meta-analysis. This method
returned 220 results. Computer-based database searches were also conducted. Databases included
in the search were PsycINFO and Philosopher's Index. The keywords "determinism," "free will," and
"emotion" were used in all database searches. This method returned a total of 3,254 results. The
ProQuest Psychology Journals, PsycARTICLES, Sage Full Text, Science Direct, Web of Science, and
Wiley Online databases were also searched, returning no new papers beyond those identified in
the PsycINFO and Philosopher's Index search. We also conducted a search for unpublished
studies by posting calls for unpublished studies on discipline-specific blogs. Additionally, we
contacted individuals who had conducted previous studies to see if they also had any
unpublished studies. Finally, we emailed relevant research groups. References of each paper that
met the inclusion criteria were searched for possible inclusion in the meta-analysis. The search
started in late 2012 and concluded in late 2013.
Results were then examined to determine if they met the two inclusion criteria. The
abstracts of the papers were read first. Then, if the abstract indicated that the two inclusion
criteria were likely to be met, the entire paper was read. Both authors agreed on which studies to
include; there were no disagreements. No results that did
not reference the original Nichols and Knobe paper met the inclusion criteria. Eleven published
studies met both inclusion criteria. We also included a number of unpublished studies (K = 19).
--- Insert table 2 here ---

Variables in Studies
Table 2 lists and includes a brief description of the 30 studies. There were a number of
differences between the 30 studies. First, two different experimental designs were used: either a
within-subjects design where participants received both high affect and low affect scenarios or a
between-subjects design where participants received only one scenario. Second, studies gathered
either categorical (yes/no) or continuous (Likert scale) data. Third, the number of questions
asked varied. Studies that gathered categorical data only asked one question (e.g., "Is it possible
for Bill to be fully morally responsible for cheating on his taxes?"). Studies that gathered
continuous data typically asked more than one question. However, since participants' answers to
multiple questions often had strong internal consistency, a composite score was often reported
(the mean of the responses). We used only composite scores from the studies that reported
continuous data in the meta-analyses. Fourth, there were three distinct types of scenarios used,
but all studies except one used the original Nichols and Knobe (2007) scenarios, the Nahmias, Coates, and
Kvaran (2007) scenarios, or a close variation of either. The remaining study used a variation of a
Nahmias, Morris, Nadelhoffer, & Turner (2006) scenario. Finally, the action in the high affect
case varied. In one high affect scenario, a man is described as stalking and raping a stranger. In
the other variation, a man is described as falling in love with his secretary and deciding that the only way to be
with her is to kill his wife and children, which he does.

To give a sense of the overall data, all effect sizes were converted to a form comparable to the
standardized mean difference and are included in the funnel plot in Figure 1. This overall
analysis needs to be interpreted with caution because it is not necessarily permissible to combine
different types of data from different kinds of experimental designs in the same meta-analysis
(see below). A visual inspection of the funnel plot gave no indication of publication bias.
This was likely because a relatively large number of unpublished studies were included. For all
meta-analyses, we used the methods described in Lipsey & Wilson (2001). For the overall
meta-analysis, within-subjects (proportion gain) and between-subjects (odds-ratio) effect sizes for
dichotomous data were converted to a form (logit d) similar to the standardized mean difference
for comparison. Because larger sample sizes tend to be more representative of the population, we
weighted the effect size of each study by the inverse variance weight. This method gives
studies with a larger sample size more statistical weight than studies with smaller sample
sizes in the meta-analysis. For the overall meta-analysis, a random-effects model was used
because homogeneity was rejected (Q(29) = 49.78, p = .009). The model revealed a small, statistically
significant standardized mean difference of 0.15 (95% CI: 0.08, 0.22), Z = 4.12, p < .001. The test
for homogeneity also suggested that there were important differences between types of studies.
Because the studies differed in empirically and conceptually important ways, subsequent
analyses were performed.
--- Insert figure 1 here ---
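
The conversion, weighting, and pooling steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis script used for the reported results: the function names `logit_d_from_counts` and `pool_effect_sizes` and all example numbers are hypothetical; the between-study variance is estimated with a method-of-moments (DerSimonian-Laird) estimator, which is assumed here and may differ from the exact procedure used; and only the odds-ratio conversion for between-subjects dichotomous data is shown.

```python
import math


def logit_d_from_counts(yes_high, no_high, yes_low, no_low):
    """Convert a 2x2 table of yes/no counts into logit d and its variance.

    Assumes all four cell counts are non-zero (no continuity correction).
    """
    log_odds_ratio = math.log((yes_high * no_low) / (no_high * yes_low))
    var_log_odds_ratio = 1 / yes_high + 1 / no_high + 1 / yes_low + 1 / no_low
    scale = math.sqrt(3) / math.pi  # rescales the logistic to unit variance
    return log_odds_ratio * scale, var_log_odds_ratio * scale ** 2


def pool_effect_sizes(effects, variances):
    """Inverse-variance pooling with a homogeneity test statistic Q.

    Returns the random-effects mean, its standard error, a 95% CI,
    the Z statistic, Q, and the between-study variance estimate.
    """
    weights = [1 / v for v in variances]  # inverse variance weights
    sum_w = sum(weights)
    fixed_mean = sum(w * e for w, e in zip(weights, effects)) / sum_w

    # Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (e - fixed_mean) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1

    # Method-of-moments estimate of between-study variance (assumed here).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau_squared = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each study's sampling variance.
    re_weights = [1 / (v + tau_squared) for v in variances]
    sum_re = sum(re_weights)
    mean = sum(w * e for w, e in zip(re_weights, effects)) / sum_re
    se = math.sqrt(1 / sum_re)
    return {
        "mean": mean,
        "se": se,
        "ci_low": mean - 1.96 * se,
        "ci_high": mean + 1.96 * se,
        "z": mean / se,
        "q": q,
        "df": df,
        "tau_squared": tau_squared,
    }


if __name__ == "__main__":
    # Made-up effect sizes and variances, for illustration only.
    print(pool_effect_sizes([0.1, 0.3, 0.5], [0.01, 0.01, 0.01]))
```

When Q exceeds its degrees of freedom, the between-study variance estimate is positive and every study's weight shrinks, which widens the confidence interval relative to a fixed-effect analysis.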
It is controversial whether combining data from different experimental designs (e.g.,
between-subjects and within-subjects) is legitimate.
However, we performed a meta-analysis

The inverse variance weight is the inverse of the squared standard error value (Lipsey & Wilson, 2001, p. 36).
For example, the inverse variance weight for the standardized mean gain is 1/(Standard error of the standardized
mean difference).
There is no clear answer as to when combining data from different kinds of designs is permissible. For example, Lipsey
and Wilson unequivocally state that the standardized mean gain effect size statistic is different from the standardized
mean difference effect size statistic: "Comparison of the previous effect size statistics with those [for] the standardized
mean difference should make it evident that they cannot be expected to yield comparable values. It follows that
these two effect size statistics should not be mixed in the same meta-analysis" (2001, p. 45). Others are less
concerned with the differences, at least in practice (e.g., Eagly, Makhijani, & Klonsky, 1992). Others opt for a more
moderate position, stating that in some instances data from different designs can be compared, but only if some
conditions are met. These conditions are that (1) the effect sizes can be put meaningfully in the same metric, (2) the designs
do not generate relevantly different biases, and (3) the designs estimate the effects with acceptably similar
precision (Morris & DeShon, 2002). The current studies fairly clearly fail condition 2, since the within-subjects
design is likely to generate a bias toward consistency, especially in the short time frame in which participants were
asked to respond. It is questionable whether the current studies satisfy conditions 1 and 3. Morris and DeShon
combining each type of data from both types of experimental designs (i.e., categorical data
(studies 1-11) and continuous data (studies 12-30)). Each of these meta-analyses used the data
from the overall meta-analysis above. The meta-analysis combining all categorical data (studies
1-11) indicated that homogeneity should be rejected (Q (10) = 19.53, p = .03). A random effect
model indicated that the overall effect size was small, .1 (95% CI: -.02, .22), and not significant
(Z = 1.64, p = .1). However, caution must be taken in interpreting the result from this meta-analysis
of categorical data. The rejection of homogeneity suggested that there was more variability in the
effect sizes than could be expected from subject-level sampling error alone. This variability
could in principle be accounted for. One obvious difference among the studies was the
experimental design. An analogue of the analysis of variance (ANOVA) was performed using
the experimental design as the moderator variable to test whether differences were a function of
experimental design (Lipsey & Wilson, 2001). They were (Q (1) = 9.1, p = .003). Therefore,
there are good conceptual and empirical reasons not to include all categorical data in the same
meta-analysis.
For these reasons, two separate meta-analyses were performed for the experiments that
gathered categorical data. Within-subjects categorical data studies (studies 1-4) were analyzed as
one group, and between-subjects categorical data studies (studies 5-11) were meta-analyzed as a
separate group. Inverse variance weights were used in the meta-analyses. For within-subjects
categorical data studies, a fixed effect model of the proportion gain effect size was used
because homogeneity could not be rejected (Q (3) = 0.04, p = .99). The mean
effect size (proportion gain) was .002 (95% CI: -0.44, 0.44) and was not statistically significant
(Z = .009, p = .99). For between-subjects categorical data studies, the odds-ratio effect sizes were
converted to their natural logarithms. A fixed effect model was used because homogeneity could
not be rejected (Q (6) = 5.6, p = 0.47). The mean effect size (odds-ratio) was small, 1.70 (95%
CI: .97, 2.61), but was statistically significant (Z = 2.4, p = .02).
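The log transformation of odds ratios mentioned above can be illustrated with a short sketch. The counts are hypothetical, the function names are not from the paper, and the variance formula is the standard large-sample one for a log odds ratio from a 2x2 table, assumed here to match the procedure the text refers to.

```python
import math


def log_odds_ratio(yes_high, no_high, yes_low, no_low):
    """Return the log odds ratio and its sampling variance for a 2x2 table."""
    odds_ratio = (yes_high * no_low) / (no_high * yes_low)
    variance = 1 / yes_high + 1 / no_high + 1 / yes_low + 1 / no_low
    return math.log(odds_ratio), variance


def back_transform(mean_log_or, standard_error, z_crit=1.96):
    """Convert a pooled log odds ratio and its CI back to the odds-ratio metric."""
    lower = mean_log_or - z_crit * standard_error
    upper = mean_log_or + z_crit * standard_error
    return math.exp(mean_log_or), (math.exp(lower), math.exp(upper))


# Hypothetical counts: 20 of 30 answer "yes" in high affect, 10 of 30 in low affect
log_or, var = log_odds_ratio(20, 10, 10, 20)
weight = 1 / var  # inverse variance weight for this study
print(round(log_or, 3), round(var, 3), round(weight, 3))
```

Pooling is done on the log scale because the log odds ratio is approximately symmetric around zero; the pooled value and its confidence limits are exponentiated only for reporting.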
--- Insert figure 2 here ---
--- Insert table 3 here ---
--- Insert table 4 here ---

(2002) also offer an empirical method to determine the acceptability of combining data across different designs. If
the analogue of the ANOVA that uses the type of design as a moderator is significant, then those data should not be
combined into the same meta-analysis.
An overall meta-analysis was also conducted combining all continuous data from each
experimental design (studies 12-30). This meta-analysis indicated that homogeneity should be
rejected (Q (18) = 32.59, p = .02). A random effect model indicated that the overall effect size was
small, .15 (95% CI: .07, .22), and significant (Z = 3.62, p < .01). The rejection of homogeneity
again suggested variability that could in principle be accounted for by the differences in
experimental designs. The analogue of the ANOVA indicated that the experimental design was a
factor in the difference (Q (1) = 6.81, p = .001). Again, there were good conceptual and empirical
reasons not to include all continuous data in the same meta-analysis.
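The moderator test used here (the analogue of the ANOVA) partitions the total homogeneity statistic into a within-group and a between-group part. The sketch below illustrates that partition under fixed effect weighting; the group labels, effect sizes, variances, and function names are hypothetical and are not the data analyzed in the paper.

```python
import math


def q_statistic(effects, variances):
    """Weighted sum of squared deviations from the inverse-variance mean."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    return sum(w * (es - mean) ** 2 for w, es in zip(weights, effects))


def q_between(groups):
    """groups maps a design label to a pair (effects, variances)."""
    all_effects = [es for effects, _ in groups.values() for es in effects]
    all_variances = [v for _, variances in groups.values() for v in variances]
    q_total = q_statistic(all_effects, all_variances)
    q_within = sum(q_statistic(e, v) for e, v in groups.values())
    return q_total - q_within, len(groups) - 1  # (Q between, df)


def chi2_p_one_df(q):
    """Upper-tail p-value for a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(max(q, 0.0) / 2.0))


# Hypothetical data: two designs with two studies each
groups = {
    "within": ([0.0, 0.0], [0.1, 0.1]),
    "between": ([0.4, 0.4], [0.1, 0.1]),
}
qb, df = q_between(groups)
print(qb, df, chi2_p_one_df(qb))
```

With two design types the between-group statistic has 1 degree of freedom, which is why the helper for the p-value only handles that case.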
Within-subjects continuous data studies (studies 12-16) and between-subjects
continuous data studies (studies 17-30) were analyzed as two separate groups. Means and
standard deviations are reported in Tables 5 and 6 (see Figure 3 for a forest plot). Standardized
mean differences (between-subjects continuous data studies), standardized gain scores
(within-subjects continuous data studies), and inverse variance weights were calculated. For
within-subjects continuous data studies, a random effect model was used because homogeneity
was rejected (Q (4) = 10.32, p = .04). Given the low number of studies analyzed and the absence
of a priori predictions about the source of the heterogeneity, we assumed that the variability
beyond subject-level sampling error was random. Hence, we did not attempt to identify the source of
the heterogeneity. However, it is possible that there is some identifiable source that can account
for the heterogeneity. The random effect model suggested that the mean effect size (standardized
mean gain) was small, .08 (95% CI: -.02, .17), and not statistically significant (Z = 1.52, p = .13).
For between-subjects, continuous data studies, a fixed effect model was used because
homogeneity could not be rejected (Q (13) = 15.46, p = 0.28). The mean effect size (standardized
mean difference) was small 0.22 (95% CI: .12, .32) but was statistically significant Z = 4.47, p <
--- Insert figure 3 here ---
--- Insert table 5 here ---
--- Insert table 6 here ---

It was an interesting artifact that many of the studies that gathered categorical data used
Nichols & Knobe's (2007) scenarios or a variation of those scenarios while many of those that
gathered continuous data used Nahmias, Coates, & Kvaran's (2007) scenarios or a variation of
them. Those that gathered categorical data tended to use "rape" as the high affect action and those
that gathered continuous data tended to use the "killing the wife" action. We conducted an
analysis of these different groups to see if either of these two factors accounted for differences in
the mean effect size. We divided Studies 5 and 7-11 (categorical between-subjects studies using
"rape") into one group and Studies 17-29 (continuous between-subjects studies using "kill") into a
separate group. Effect sizes based on categorical data (odds-ratios) were converted to a form
comparable to the standardized mean difference (logit d). Inverse variance weights were
calculated. Not surprisingly, homogeneity could not be rejected (Q (18) = 25.57, p = .1). A
fixed effect model determined that the overall mean effect size (standardized mean difference)
was statistically significant, .24 (95% CI: 0.15, 0.33), Z = 5.33, p < .001. Tests of homogeneity
for small sample sizes are fairly insensitive (Lipsey & Wilson, 2001). For this reason and
because the "kill" and "rape" actions could generate differences in effect sizes, we tried to model
the possible differences between different types of scenarios and actions. To do so, we used the
analogue of the ANOVA with the scenario type as the moderator variable. The result of the
analysis was not significant (Q (1) = 1.41, p = .24), suggesting that the data are consistent with
the scenario and action type measuring the same things.
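The conversion of odds ratios to a metric comparable with the standardized mean difference can be sketched as follows. This assumes that "logit d" refers to the common logit transformation, in which the log odds ratio is multiplied by sqrt(3)/pi (equivalently, divided by about 1.81); the function name is hypothetical, and the variance passed in is an arbitrary illustrative value.

```python
import math

# Scaling constant for the logit method: sqrt(3) / pi, about 0.55
LOGIT_TO_D = math.sqrt(3) / math.pi


def odds_ratio_to_d(odds_ratio, log_or_variance=None):
    """Convert an odds ratio (and optionally the variance of its log) to logit d."""
    d = math.log(odds_ratio) * LOGIT_TO_D
    if log_or_variance is None:
        return d, None
    # The variance scales by the square of the same constant
    return d, log_or_variance * LOGIT_TO_D ** 2


# Illustration: an odds ratio of 1.70 with a made-up log-odds variance of 0.3
d, d_variance = odds_ratio_to_d(1.70, 0.3)
print(round(d, 3), round(d_variance, 3))
```

Once both kinds of effect size are on this common scale, they can be weighted and pooled in the same way as standardized mean differences.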

We meta-analyzed 30 studies to help determine the strength of the effect of affect. We only used
studies that provided participants with a description of determinism and varied the emotional
content of the actions performed under those descriptions. The meta-analysis suggested that
overall, affect does play a role in judgments about freedom and moral responsibility, but the
effect size is relatively small (Cohen, 1988). But this overall effect size must be interpreted with
caution because it is not necessarily legitimate to compare effect sizes from different
experimental designs. Further analyses revealed that the experimental design is a likely source of
heterogeneity. The effect of affect is more pronounced in between-subjects than in within-
subjects designs. Finally, there were no reliable overall differences between the scenarios used
by Nichols and Knobe (2007) and those used by Nahmias, Coates, and Kvaran (2007).

Consequences for the Affective Performance Error Model
The results of our meta-analysis suggest that, if there is a high/low-affect asymmetry, it is a very
small one. Given that the existence of the high/low-affect asymmetry was the main evidence in
favor of Nichols and Knobe's (2007) affective performance error model, our findings seem to
undercut the main reason to endorse this model. Granted, one might construct a modest version
of Nichols and Knobe's model, according to which affect has some effect on intuitions about free
will and moral responsibility, and the result of our meta-analysis would support this model.
However, we should remember that the affective performance error model was initially meant to
explain the difference between abstract and concrete cases. Against this account, the results of
our meta-analysis suggest that affect accounts for a small amount of the total variance.
An effect that explains 1% of the variance is hardly more than trivial in size (Cohen, 1988). That means
that many people likely are not misapplying their concept of free will and moral responsibility in
high affect cases (or that they misapply the concept in some other way). Rather, these judgments
are remarkably stable even in the cases where we find conventionally statistically significant
differences between high and low affect cases (e.g., in between-subjects, dichotomous studies,
52% thought the person was free or morally responsible in high affect cases versus 37% in low
affect cases; means of continuous data suggested there were seldom qualitative shifts). The
existence of other theoretical accounts for the difference between concrete and abstract cases and
the empirical data to support them suggest that the difference between concrete and abstract
cases is multi-factorial. While affect is one factor, it is not the most important factor. So, even if
affect is one factor, other accounts (e.g., bypassing accounts) of the abstract-concrete difference
are likely to be "better" in the sense of explaining a greater amount of the variance. It remains to be
seen how much of the variance these other factors can account for.
These results help inform a number of different issues in the empirical investigation of
free will. First, our meta-analysis indicates why the effect has sometimes been difficult to
replicate. This is a timely issue since there is quite a bit of debate about the replicability of some
studies (Young, 2012). If one wants the best chance of replicating the effect of affect, one
should use only a between-subjects design. However, using a between-subjects design is not free of
problems. The amount of variance explained by affect is relatively small, accounting for little
more than 1% of the total variance. One explanation for why the effect has been difficult to
replicate is that the sample size needed to reliably detect the difference would have to be
relatively large. To find this effect 80% of the time with a conventional standard for statistical
significance (p = .05) in a between-subjects study, there would have to be 393 participants in each
condition, for a total of 786 participants (Cohen, 1988). As such, without an adequately powered
experiment, failures to replicate would be common.
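The sample size figure above can be reproduced approximately with a normal-approximation power formula. This is a sketch under assumptions the text does not state explicitly: a two-tailed test, equal group sizes, and a standardized mean difference of d = 0.20 (Cohen's conventional "small" effect). The function names are hypothetical. The sketch also shows the usual conversion from d to the proportion of variance explained.

```python
import math
from statistics import NormalDist


def variance_explained(d):
    """Convert a standardized mean difference to r squared (equal group sizes assumed)."""
    r = d / math.sqrt(d * d + 4.0)
    return r * r


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-tailed, two-group comparison."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1.0 - alpha / 2.0)  # critical value of the test
    z_power = normal.inv_cdf(power)              # quantile for the desired power
    return math.ceil(2.0 * ((z_alpha + z_power) / d) ** 2)


n = n_per_group(0.20)
print(n, 2 * n, round(variance_explained(0.20), 4))
```

Under these assumptions `n_per_group(0.20)` comes to 393 per condition, or 786 participants in total, and `variance_explained(0.20)` is just under 1%. With the d of .22 reported for between-subjects continuous data studies, the same conversion gives roughly 1.2%.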
Given the relatively small effect size, it is not surprising that the effect of affect can be
substantially reduced or eliminated in some circumstances. For example, the effect can be
eliminated in a within-subjects design. It is not puzzling how this effect is eliminated. People
likely feel compelled to be consistent across scenarios when reading both of them in a relatively
short period of time, so they tend to answer the same way in both the high and low
affect cases. There is also a lack of order effects in people's judgments. If the affective performance
error model is correct, we should expect there to be an order effect where people who receive the
high affect case first should give stronger compatibilist judgments than when the high affect case
is presented second. We do not find that general pattern (see Feltz, Cokely, & Nadelhoffer,
The lack of an order effect is explainable because of the small effect of affect. People's
judgments are just not likely to change dramatically between high and low affect cases.

Consequences for competing accounts of the abstract/concrete asymmetry
Not only was the high/low-affect asymmetry the best available evidence for Nichols and
Knobe's affective performance error model, it was also the best argument against competing
models. Affect seemed to be able to dramatically shift participants' answers without changing
whether a norm was broken or whether the vignette implied the agent's mental states had a role
in bringing about his behavior. However, since the high/low-affect asymmetry appears to be negligible
compared to the abstract/concrete asymmetry, the results of our meta-analysis suggest that
competing accounts are likely in a better position to account for the abstract/concrete difference
and cannot be rejected on the sole basis of the high/low-affect asymmetry.
More particularly, because our results suggest that participants' intuitions about free will
and moral responsibility are robust and not easily influenced by affect, they are in line with
research suggesting that, rather than being torn between opposing conceptions of free will, many
people have a compatibilist conception of free will. According to this conception, agents are free
and morally responsible as long as they are able to act on the basis of reasons and deliberation

It should be noted that Cova et al. (2012) obtained results that fit this pattern. However, order effects failed to
reach statistical significance.
(Nahmias & Murray, 2010; Monroe & Malle, 2010; Stillman et al., 2011; Cova & Kitano, in
press). On these compatibilist approaches, apparent inconsistencies in people's intuitions can be
explained away by pointing at methodological issues in experimental design. For example, it is
because the abstract condition is easily interpreted as implying that people cannot act on the
basis of their reasons that people give apparently incompatibilist answers. However, these
answers are not genuine incompatibilist answers, since both compatibilists and incompatibilists
would agree that free will and moral responsibility are not possible in a universe in which agents'
reasons do not have an influence upon their behavior. Unlike the abstract/concrete
asymmetry, the high/low-affect asymmetry was a significant obstacle to these compatibilist
approaches because the high and low affect cases give central causal roles to the agent's mental
states in the production of the actions. By showing that the high/low-affect asymmetry is a very
small effect, the results of our meta-analysis lend further support to compatibilist approaches.

Possible accounts for the high/low-affect asymmetry
Finally, one can wonder about the source of the high/low-affect asymmetry. Though the results
of our meta-analysis suggest that the effect of affect is small, they also confirm that there is such
an effect, and that it deserves an explanation. Some of the attempts to explain the
abstract/concrete difference could be adapted to explain the effect of affect. Accounts in terms of
norm violations can rely on the fact that cheating on one's taxes breaks a less serious norm than
raping someone (Mandelbaum & Ripley, 2012). Bypassing accounts can hypothesize that
affective reactions counter the incorrect interpretation of determinism (Nahmias & Murray,
2010). How much of the effect of affect these alternative accounts can explain remains to be
seen.
Regardless of the correct account, individual differences are likely to be important. To
date, almost all the studies conducted in the empirical investigation of free will have focused on
overall or group means. This focus is likely to mask subtle but important differences, which
contributes to an inaccurate account of the cognitive mechanisms generating free will and moral
responsibility intuitions. To illustrate, extraverts are more likely than introverts to have
compatibilist intuitions across a number of different determinism scenarios (Feltz & Cokely,
2009; Feltz, Perez, & Harris, 2012; Feltz & Millan, in press) even in different languages and
cultures (Cokely & Feltz, 2009; Schulz, Cokely, & Feltz, 2011). It stands to reason that introverts
are likely to be incompatibilists in both high and low affect cases whereas those who are more
extraverted will be more likely to be influenced by the affective content of the scenario. An
individual differences approach can help explain why affect has such a weak effect: it only
influences a certain number of individuals (Feltz & Cokely, in press). What cognitive processes
are involved in generating compatibilist or incompatibilist judgments for introverts and
extraverts is a separate issue. Importantly, an individual differences approach can help inform
and provide a more accurate picture of the cognitive mechanisms involved, for whom, and when
(Cronbach, 1957; Cokely & Kelley, 2009).
The neglect of individual differences highlights shortcomings of a substantial portion of
previous empirical research in free will. For example, it is unclear whether some of the
inconsistent effects found in judgments about free will and moral responsibility arise because
those factors influence everybody uniformly or because they only influence some people some of the
time. To illustrate, Vohs and Schooler (2008) have presented evidence that increasing disbelief
in free will increases cheating behavior. There are different, mutually exclusive ways that this
effect could come about. For example, it is possible that everybody is influenced systematically
and by roughly the same amount. Or, it could be that some people are influenced and other
people are not influenced. Or, it could be that some people are extremely influenced and others
are slightly influenced in the opposite direction. Any of these three models could be true, but not all
three can be true. Which model is correct helps inform exactly how problematic increasing
disbelief in free will might be. If only a relatively small number of people are responsible for the
effect, it is seemingly less problematic than if everybody is influenced. Or, at a minimum, the
implications are different (e.g., regulating that small group responsible for the effect versus
regulating everybody). Accurately modeling the effect of increasing disbelief in free will and the
implications of that effect therefore hinges critically on determining which model best estimates
reality. The same lessons apply to understanding the high/low affect and concrete/abstract
asymmetries. Not attending to and controlling for individual differences results in the inability to
discriminate among these different models and a decreased ability to determine practical
consequences. Models that do not adequately take into account individual differences run the risk
of being fictions: fictions that could be deleterious or call for unnecessary action.

Following Nichols and Knobe's (2007) early findings, it has been widely assumed that affective
reactions could dramatically shift people's intuitions about free will and moral responsibility.
Though there has been wide disagreement about the source and consequences of this effect, the
existence of this effect itself was never questioned, despite a lack of replication. In this paper, we
investigated the existence and the size of this effect through a meta-analysis of 30 published and
unpublished studies. As a result, we found that there was indeed an effect of affect on intuitions
about free will and moral responsibility but that this effect was much smaller than traditionally
conceived and, as such, did not have the implication it was thought to have.
These are important lessons not just for the experimental exploration of free will and
moral responsibility. They are also good lessons for those engaging in the empirical investigation
of philosophically relevant beliefs in general. Those engaged in the empirical investigation of
philosophically relevant beliefs often purport to be investigating fundamental, traditional
philosophical issues but with new and in some ways improved methods. With these new
methods, they often think that their views are more firmly rooted in things that matter (e.g.,
everyday conceptions or intuitions) than their armchair counterparts. But our meta-analysis
indicates that this assertion may not be warranted for many theorists because (a) often effect
sizes are not reported, and the importance of findings is difficult if not impossible to interpret
without effect sizes, and (b) often the effect sizes reported are trivial. As such, if we engage in
(a) or (b), we run the risk of theorizing in ways that are little better than the ways of traditional
armchair philosophical methods. In other words, those theories may not in fact be rooted in
things that matter. Even when better methods are used, there is the risk of committing conceptual
mistakes such as not appreciating the contribution of individual differences. Good science often
requires good philosophy. Without high levels of statistical and methodological rigor (e.g.,
reporting and interpreting data in the light of effect sizes), we cannot know if the theories
forwarded are rooted in things that matter. Without conceptual rigor, we cannot begin to tap the
theoretical and systematizing power that traditional analytic philosophy affords.


Bargh, J. A., & Ferguson, M. J. (2000). Beyond behaviorism: On the automaticity of higher mental
processes. Psychological Bulletin, 126, 925-945.
Baumeister, R., Masicampo, E.J., & DeWall, C. (2009). Prosocial benefits of feeling free:
Disbelief in free will increases aggression and reduces helpfulness. Personality and
Social Psychology Bulletin, 35, 260-268.
Baumeister, R., Sparks, E., Stillman, T., & Vohs, K. (2007). Free will in consumer behavior:
Self-control, ego depletion, and choice. Journal of Consumer Psychology, 18, 4-13.
Carey, J., & Paulhus, D. (2013). Worldview implications of believing in free will and/or
determinism: Politics, morality, and punitiveness. Journal of Personality, 81, 130-141.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Second Edition.
Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
Cokely, E.T., & Feltz, A. (2009). Adaptive variation in judgment and philosophical intuition.
Consciousness and Cognition, 18, 355-357.
Cokely, E. T., & Kelley, C. M. (2009). Cognitive abilities and superior decision making under
risk: A protocol analysis and process model evaluation. Judgment and Decision Making,
4, 20-33.
Cova, F. (2011a) Qu'en pensez-vous ? Une introduction à la philosophie expérimentale. Paris:
Cova, F. (2011b). Neuroscience et droit pénal : le déterminisme peut-il sauver la conception
utilitariste de la peine ? Klesis, 21, 32-77.
Cova, F., Bertoux, M., Bourgeois-Gironde, S., & Dubois, B. (2012) Judgments about moral
responsibility and determinism in patients with behavioural variant of frontotemporal
dementia: still compatibilists. Consciousness and Cognition, 21, 851-864.
Cova, F. (in press) Unconsidered intentional action: an assessment of Scaife and Webber's
Consideration Hypothesis. Journal of Moral Philosophy.
Cova, F. (forthcoming) Frankfurt-style cases user manual: Why Frankfurt-style enabling cases
do not necessitate tech support. Ethical Theory and Moral Practice.
Cova, F. & Kitano, Y. (in press) Experimental philosophy and the compatibility of free will and
determinism: a survey. Annals of the Japan Association for Philosophy of Science.
Cronbach, L. (1957). The two disciplines of scientific psychology. American Psychologist, 12,
De Brigard, F., Mandelbaum, E., & Ripley, D. (2009) Responsibility and the brain sciences.
Ethical Theory and Moral Practice, 12, 511-524.
Eagly, A., Makhijani, M., & Klonsky, B. (1992). Gender in the evaluation of leaders: A meta-
analysis. Psychological Bulletin, 111, 3-22.
Feltz, A. (2009) Experimental philosophy. Analyse & Kritik, 31, 201-219.
Feltz, A. (2013) Pereboom and premises: Asking the right questions in the experimental
philosophy of free will. Consciousness and Cognition, 22, 53-63.
Feltz, A., & Bishop, M. (2010). The proper role of intuitions in epistemology. In M. Milkowski
& K. Talmont-Kaminski (Eds.), Beyond Description: Normativity in Naturalized
Philosophy (pp. 101-122). London: College Publications.
Feltz, A., & Cokely, E. (in press). Predicting philosophical disagreement. Philosophy Compass.
Feltz, A., & Cokely, E. (2009). Do judgments about free will and moral responsibility depend on
who you are? Personality differences in intuitions about compatibilism and
incompatibilism. Consciousness and Cognition, 18, 342-350.
Feltz, A., Cokely, E., & Nadelhoffer, T. (2009). Natural compatibilism versus natural
incompatibilism: back to the drawing board. Mind & Language, 24, 1-23.
Feltz, A., & Millan, M. (in press). An error theory for compatibilist judgments. Philosophical
Feltz, A., Perez, A., & Harris, M. (2012) Free will, causes, and decisions: individual differences
in written reports. The Journal of Consciousness Studies, 19, 166-189.
Kane, R. (1996). The Significance of Free Will. New York: Oxford University Press.
Keltner, D., Ellsworth, P., & Edwards, K. (1993). Beyond simple pessimism: effects of sadness
and anger on social perception. Journal of Personality and Social Psychology, 64, 740-
Knobe, J., & Nichols, S. (2008). An experimental philosophy manifesto. In J. Knobe & S.
Nichols (Eds.), Experimental Philosophy (pp. 3-14). New York: Oxford University Press.
Libet, B. (1985). Unconscious cerebral initiative and the role of conscious will in voluntary
action. Behavioral and Brain Sciences, 8, 529-566.
Mandelbaum, E., & Ripley, D. (2012). Explaining the Abstract/Concrete paradoxes in moral
psychology: the NBAR hypothesis. Review of Philosophy and Psychology, 3, 351-368.
Mele, A. R. (2006). Free will and luck. Oxford ; New York: Oxford University Press.
Mele, A. R. (2013). A Dialogue on Free Will and Science. New York: Oxford University Press.
Miller, J., & Feltz, A. (2011). Frankfurt and the folk: an empirical investigation. Consciousness
and Cognition, 20, 401-414.
Monroe, A., & Malle, B. (2010). From uncaused will to conscious choice: the need to study, not
speculate about people's folk concept of free will. Review of Philosophy and Psychology,
1, 211-224.
Morris, S., & DeShon, R. (2002). Combining effect size estimates in a meta-analysis with
repeated measures and independent-groups designs. Psychological Methods, 7, 105-125.
Murray, D., & Nahmias, E. (forthcoming). Explaining away incompatibilist intuitions.
Philosophy and Phenomenological Research.
Nadelhoffer, T., Kvaran, T., & Nahmias, E. (2009). Temperament and intuition: A commentary on Feltz
and Cokely. Consciousness and Cognition, 18, 351-355.
Nahmias, E., Coates, D., & Kvaran, T. (2007). Free will, moral responsibility, and mechanism:
experiments on folk intuitions. Midwest Studies in Philosophy, 31, 214-242.
Nahmias, E., Morris, S., Nadelhoffer, T., & Turner, J. (2005) Surveying freedom: folk intuitions
about free will and moral responsibility. Philosophical Psychology, 18, 28-53.
Nahmias, E., Morris, S., Nadelhoffer, T., & Turner, J. (2006) Is compatibilism intuitive?
Philosophy and Phenomenological Research, 73, 561-584.
Nahmias, E., & Murray, D. (2010). Experimental philosophy on free will: An error theory for
incompatibilist intuitions. In J. Aguilar, A. Buckareff & K. Frankish (Eds.) New Waves in
Philosophy of Action. London: Palgrave-Macmillan.
Nelkin, D. K. (2007). Do we have a coherent set of intuitions about moral responsibility?
Midwest Studies in Philosophy, 31(1), 243-259.
Nichols, S. & Knobe, J. (2007). Moral responsibility and determinism: the cognitive science of
folk intuition. Noûs, 41, 663-685.
Ochsner, K., & Gross, J. (2005). The cognitive control of emotion. Trends in Cognitive Sciences,
9, 242-249.
O'Connor, C., & Joffe, H. (2013). How has neuroscience affected lay understandings of
personhood? A review of the evidence. Public Understanding of Science, 22, 254-268.
Rakos, R., Laurene, K., Skala, S., & Slane, S. (2008). Belief in free will: Measurement and
conceptualization innovations. Behavior and Social Issues, 17, 20-39.
Rigoni, D., Kuehn, S., Sartori, G., & Brass, M. (2011). Inducing disbelief in free will alters brain
correlates of preconscious motor preparation: The brain minds whether we believe in free
will or not. Psychological Science, 22, 613-618.
Rose, D., & Nichols, S. (in press) The lesson of bypassing. Review of Philosophy and
Roskies, A. (2006). Neuroscientific challenges to free will and responsibility. Trends in
Cognitive Sciences, 10, 419-423.
Sarkissian, H., Chatterjee, A., De Brigard, F., Knobe, J., Nichols, S., & Sirker, S. (2010). Is
belief in free will a cultural universal? Mind & Language, 26, 346-358.
Scaife, R., & Webber, J. (2013). Intentional side-effects of action. Journal of Moral Philosophy,
10, 179-203.
Schulz, E., Cokely, E., & Feltz, A. (2011). Persistent bias in expert judgments about free will and
moral responsibility: a test of the expertise defense. Consciousness and Cognition, 20, 401-
Seyedsayamodst, H. (in prep). On normativity and epistemic intuitions: Failure to detect
differences between socioeconomic groups.
Sinnott-Armstrong, W. (2008). Abstract + Concrete = Paradox. In J. Knobe & S. Nichols (Eds.)
Experimental Philosophy (pp. 209-230). New York : Oxford University Press.
Smillie, L. (2013). Extraversion and reward processing. Current Directions in Psychological
Science, 22, 167-172.
Sommers, T. (2010). Experimental philosophy and free will. Philosophy Compass, 5, 192-212.
Sripada, C. (2012) What makes a manipulated agent unfree? Philosophy and Phenomenological
Research, 85, 563-593.
Stillman, T., Baumeister, R., & Mele, A. (2011) Free will in everyday life: autobiographical
accounts of free and unfree actions. Philosophical Psychology, 24, 381-394.
Stillman, T., Baumeister, R., Vohs, K., Lambert, N., Fincham, F., & Brewer, L. (2010). Personal
philosophy and personal achievement: Belief in free will predicts better job performance.
Social Psychology and Personality Science, 1, 43-50.
Vargas, M. (2005). The revisionist's guide to responsibility. Philosophical Studies, 125, 399-
Vargas, M. (2006). Philosophy and the folk: on some implications of experimental work for
philosophical debates on free will. Journal of Cognition and Culture, 6, 239-254.
Vargas, M. (2009). Revisionism about free will: a statement & defense. Philosophical studies,
144(1), 45-62.
Vohs, K. D., & Schooler, J. W. (2008). The value of believing in free will - Encouraging a belief in
determinism increases cheating. Psychological Science, 19, 49-54.
Wegner, D. (2002). The Illusion of Conscious Will. Cambridge: MIT Press.
Wegner, D., & Bargh, J. (1998). Control and automaticity in social life. In D. Gilbert, S. Fiske, &
G. Lindzey (eds), Handbook of Social Psychology
Wegner, D., & Wheatley, T. (1999). Apparent mental causation: Sources of the experience of
will. American Psychologist, 54, 480-492.
Weigel, C. (2011). Distance, anger, freedom: an account of the role of abstraction in compatibilist
and incompatibilist intuitions. Philosophical Psychology, 24, 803-823.
Woolfolk, R. (2013). Experimental philosophy: a methodological critique. Metaphilosophy, 44,
Young, E. (2012). Replication studies: Bad copy. Nature, 485, 298-300.

Table 1: Results from Nichols and Knobe (2007)

                        Agent in indeterministic    Agent in deterministic
High Affect condition   95%                         64%
Low Affect condition    89%                         23%

Table 2: Description of Studies

Study  Author                               Data         Design   Scenario  Questions  Emotion
1      Feltz, Cokely, & Nadelhoffer (2009)  Categorical  Within   N&K       1          Rape
2      Feltz, Cokely, & Nadelhoffer (2009)  Categorical  Within   N&K       1          Rape
3      Feltz, Cokely, & Nadelhoffer (2009)  Categorical  Within   N&K       1          Rape
4      Feltz, Cokely, & Nadelhoffer (2009)  Categorical  Within   N&K       1          Rape
5      Nichols & Knobe (2007)               Categorical  Between  N&K       1          Rape
6      Cova (unpublished)                   Categorical  Between  N&K       1          Kill
7      Cova (unpublished)                   Categorical  Between  N&K       1          Rape
8      Cova (unpublished)                   Categorical  Between  N&K       1          Rape
9      Cova (unpublished)                   Categorical  Between  N&K       1          Rape
10     Cova (unpublished)                   Categorical  Between  N&K       1          Rape
11     Cova (unpublished)                   Categorical  Between  N&K       1          Rape
12     Feltz & Cokely (unpublished)         Continuous   Within   NMNT      3          Kill
13     Feltz & Millan (unpublished)         Continuous   Within   NCK       3          Kill
14     Feltz & Millan (unpublished)         Continuous   Within   NCK       3          Kill
15     Feltz & Cokely (unpublished)         Continuous   Within   NCK       5          Kill
16     Cova (2012)                          Continuous   Within   N&K       1          Kill
17     Feltz, Perez, & Harris (2012)        Continuous   Between  NCK       3          Kill
18     Feltz, Perez, & Harris (2012)        Continuous   Between  NCK       3          Kill
19     Feltz, Perez, & Harris (2012)        Continuous   Between  NCK       3          Kill
20     Feltz & Cokely (unpublished)         Continuous   Between  NCK       2          Kill
21     Feltz & Millan (unpublished)         Continuous   Between  NCK       3          Kill
22     Feltz & Millan (unpublished)         Continuous   Between  NCK       3          Kill
23     Feltz & Millan (unpublished)         Continuous   Between  NCK       3          Kill
24     Feltz & Millan (unpublished)         Continuous   Between  NCK       3          Kill
25     Feltz & Millan (unpublished)         Continuous   Between  NCK       3          Kill
26     Feltz & Millan (unpublished)         Continuous   Between  NCK       3          Kill
27     Feltz & Millan (unpublished)         Continuous   Between  NCK       3          Kill
28     Nahmias, Coates, & Kvaran (2007)     Continuous   Between  NCK       3          Kill
29     Nahmias, Coates, & Kvaran (2007)     Continuous   Between  NCK       3          Kill
30     Cova (unpublished)                   Continuous   Between  N&K       1          Rape

Table 3: Dichotomous Within-Subjects

Study                                   N                            Lower 95%  Upper 95%
1. Feltz, Cokely, & Nadelhoffer (2009)  52   35/17   39/13   -0.08   -0.33      0.40
2. Feltz, Cokely, & Nadelhoffer (2009)  108  55/53   51/57   0.04    -0.33      0.40
3. Feltz, Cokely, & Nadelhoffer (2009)  65   22/43   22/43   0.00    -0.43      0.43
4. Feltz, Cokely, & Nadelhoffer (2009)  110  38/72   35/75   0.03    -0.31      0.37


Table 4: Dichotomous Between-Subjects

Study                       N                            Lower 95%  Upper 95%
5. Nichols & Knobe (2007)   44   14/8    5/17    5.95    1.59       22.33
6. Cova (unpublished)       60   11/19   9/21    1.35    0.46       3.97
7. Cova (unpublished)       60   11/19   9/21    1.35    0.46       3.97
8. Cova (unpublished)       60   12/18   11/19   1.15    0.41       3.26
9. Cova (unpublished)       60   10/20   8/22    1.38    0.45       4.17
10. Cova (unpublished)      60   19/11   16/14   1.51    0.54       4.24
11. Cova (unpublished)      60   27/2    23/7    4.11    0.78       21.76

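The dichotomous between-subjects entries pair two sets of counts with an effect size and a 95% interval. As a minimal sketch, assuming the effect size is an odds ratio and the interval is the usual large-sample (log-scale) Wald interval, the computation could look like the following. The function name `odds_ratio_ci` and the reading of the two count columns as two response counts per condition are assumptions for illustration, not something the table itself states. For study 5 (Nichols & Knobe, 2007), counts of 14/8 and 5/17 give values close to those reported for that study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and log-scale Wald interval for a 2x2 table.

    a, b: the two counts in the first condition (assumed layout)
    c, d: the two counts in the second condition (assumed layout)
    z:    normal quantile; 1.96 gives an approximate 95% interval
    """
    odds_ratio = (a * d) / (b * c)
    # Large-sample standard error of log(odds ratio).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Counts for study 5: 14/8 in one condition, 5/17 in the other.
or_, lo, hi = odds_ratio_ci(14, 8, 5, 17)
print(f"OR = {or_:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

This approximation requires all four cells to be non-zero and becomes unreliable for very small counts, which may be one reason the intervals for the smaller cells are so wide.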
Table 5: Continuous Within-Subjects

Study                             N    Mean High  SD    Mean Low  SD    ... of Means  Mean Gain  Lower 95%  Upper 95%
12. Feltz & Cokely (unpublished)  38   3.04       1.82  3.16      2.04  0.86          -0.06      -0.20      0.07
13. Feltz & Millan (unpublished)  123  3.63       1.69  3.47      1.71  0.82          0.09       -0.01      0.20
14. Feltz & Millan (unpublished)  126  3.41       1.84  3.19      1.82  0.86          0.12       0.03       0.21
15. Feltz & Cokely (unpublished)  33   3.33       1.9   3.29      2.12  0.9           0.03       -0.09      0.13
16. Cova (2012)                   20   3.63       1.96  4.37      1.92  0.62          0.33       0.13       0.96

Table 6: Continuous Between-Subjects

Study                                 High N  High Mean  High SD  Low N  Low Mean  Low SD  Mean Difference  Lower 95%  Upper 95%
17. Feltz, Harris, & Perez (2012)     23      3.19       1.84     40     3.31      1.39    -0.19            -0.69      0.31
18. Feltz, Harris, & Perez (2012)     30      3.73       1.3      61     3.06      2.01    0.37             0.00       0.73
19. Feltz, Harris, & Perez (2012)     32      3.67       1.78     44     4.77      2.02    0.47             0.01       0.93
20. Feltz & Cokely (unpublished)      49      3.32       1.48     44     3.1       1.94    0.13             -0.28      0.34
21. Feltz & Millan (unpublished)      61      3.68       1.33     76     3.11      1.79    0.34             0.00       0.68
22. Feltz & Millan (unpublished)      64      3.93       1.49     68     3.82      1.66    0.08             -0.26      0.42
23. Feltz & Millan (unpublished)      63      3.63       1.71     61     3.09      1.77    0.31             -0.04      0.66
24. Feltz & Millan (unpublished)      48      3.32       1.72     43     3.39      1.81    -0.13            -0.36      0.23
25. Feltz & Millan (unpublished)      122     3.74       1.62     131    3.06      1.8     0.40             0.13       0.63
26. Feltz & Millan (unpublished)      68      3.31       1.79     63     3.36      1.79    0.08             -0.26      0.42
27. Feltz & Millan (unpublished)      70      3.33       1.7      34     3.67      1.31    -0.09            -0.44      0.27
28. Nahmias, Coates, & Kvaran (2007)  111     4.46       1.48     38     3.93      1.31    0.34             0.02       0.66
29. Nahmias, Coates, & Kvaran (2007)  108     4.13       1.48     48     3.63      1.6     0.33             -0.01      0.67
30. Cova (unpublished)                30      -0.33      2.3      30     -1.2      2.38    0.27             -0.24      0.77

Figure 1: Funnel Plot and 95% Confidence Interval Estimate

[Plot not reproduced. Axis label: Standardized Effect Size and 95% CI; axis ticks run from -2 to 2 in steps of 0.5.]
Figure 2: Forest Plot, Effect Sizes Converted to be Comparable to Cohen's d