Personality and Social Psychology Review

14(2) 238–257
© 2010 by the Society for Personality and Social Psychology, Inc.
Reprints and permission: http://www.sagepub.com/journalsPermissions.nav
DOI: 10.1177/1088868309352251
http://pspr.sagepub.com

The Truth About the Truth: A Meta-Analytic Review of the Truth Effect

Alice Dechêne¹, Christoph Stahl², Jochim Hansen¹, and Michaela Wänke¹

Abstract
Repetition has been shown to increase subjective truth ratings of trivia statements. This truth effect can be measured in
two ways: (a) as the increase in subjective truth from the first to the second encounter (within-items criterion) and (b) as the
difference in truth ratings between repeated and other new statements (between-items criterion). Qualitative differences are
assumed between the processes underlying both criteria. A meta-analysis of the truth effect was conducted that compared
the two criteria. In all, 51 studies of the repetition-induced truth effect were included in the analysis. Results indicate that the
between-items effect is larger than the within-items effect. Moderator analyses reveal that several moderators affect both
effects differentially. This lends support to the notion that different psychological comparison processes may underlie the two
effects. The results are discussed within the processing fluency account of the truth effect.

Keywords
truth effect, meta-analysis, processing fluency

¹University of Basel, Basel, Switzerland
²University of Freiburg, Freiburg, Germany

Corresponding Author:
Alice Dechêne, University of Basel, Department of Psychology, Missionsstrasse 60/62, CH-4055 Basel, Switzerland
Email: alice.dechene@unibas.ch

When presented with an unfamiliar statement, such as “The zipper was invented in Norway,” most people do not know whether it is actually true or not (i.e., the statement is ambiguous).¹ To nevertheless judge a statement’s truth, people tend to use heuristic cues. Such heuristics make use of attributes of the statement’s source (e.g., the source’s level of expertise on the subject matter), attributes of the context in which it is presented (e.g., at a scientific conference), and attributes of the statement itself. In particular, it has been shown that people tend to trust a statement more if it had been encountered before: After reading “The zipper was invented in Norway” for a second time, judgments of the statement’s truth or validity typically increase. This so-called truth effect has been the subject of extended study in psychological (e.g., Arkes, Boehm, & Xu, 1991; Bacon, 1979; Begg, Anas, & Farinacci, 1992; Hasher, Goldstein, & Toppino, 1977) and consumer research (e.g., Hawkins & Hoch, 1992; Law, Hawkins, & Craik, 1998; Roggeveen & Johar, 2002, 2007). It is important to understand this effect because people’s trust in statements’ truth may affect behavior related to those statements (e.g., marketing claims, health beliefs).

The present article presents a meta-analytic review of research on the truth effect that aims at integrating past research and providing a common basis for systematic future research. In addition, it seeks to open new avenues for research. It attempts to do so by focusing on two theoretically important factors that have rarely been investigated so far, namely, the distinction between two components of the truth effect (i.e., the within-items component and the between-items component) and the context in which statements are judged. The theoretical analysis in this review is based on the current consensus that the truth effect is mediated by the metacognitive experience of processing fluency (Begg et al., 1992; Reber & Schwarz, 1999; Unkelbach, 2007; Whittlesea, 1993; for a review, see Alter & Oppenheimer, 2009). In short, processing a repeated statement is experienced as unexpectedly fluent (in other words, the processing fluency is experienced as discrepant from a comparison standard), and this discrepancy subsequently affects judgments of truth (e.g., Hansen, Dechêne, & Wänke, 2008; Whittlesea & Williams, 1998, 2000, 2001a, 2001b). Here we focus on the prediction that the comparison standard (and, thereby, the truth effect) should vary with the context in which judgments are made (e.g., Dechêne, Stahl, Hansen, & Wänke, 2009; Whittlesea & Leboe, 2003).

We distinguish between two types of judgment contexts that differ with regard to the variability of the stimuli with which participants are confronted: First, truth judgments can be
collected in a homogeneous context in which a homogeneous set of repeated items is judged. To assess the effect of repetition, these judgments are compared to those collected for the same set of statements at their first encounter; we call this the within-items criterion. Second, truth judgments can be collected in a heterogeneous context in which a mixture of repeated and nonrepeated statements is judged. To assess the effect of repetition, judgments of a set of repeated statements are compared to judgments of a different set of nonrepeated statements; we call this the between-items criterion. To shed light on contextual influences on the truth effect, the meta-analysis separately evaluates the between- and within-items criteria, both with regard to the magnitude of the effect and with regard to influences of potential moderating variables discussed in the literature.

In the remainder of this introduction, the literature on the truth effect is briefly reviewed. The discussion of mediating variables focuses on processing fluency and the important role of the comparison standard. The distinction between contexts is elaborated, and hypotheses for a comparison of the two criteria are derived. Before turning to the meta-analysis, potential moderating variables are addressed and the focus of the present review is summarized.

The Truth Effect

In a typical study on the truth effect, a set of ambiguous statements (i.e., statements whose truth status is unknown to participants, as established by pretests) is presented to participants in a first session. In a second session, some of the statements are presented and judged again, often interspersed with new statements that have not been presented before. Repeated presentation of an ambiguous statement increases the probability that it will be judged as true (e.g., Bacon, 1979; Hasher et al., 1977; Schwartz, 1982). This effect belongs to a family of well-known effects of repetition in psychology, such as the mere exposure effect (Bornstein, 1989; Zajonc, 1968) and the false-fame effect (Jacoby, Kelley, Brown, & Jasechko, 1989).

The finding that repeated statements are believed more than new ones (e.g., Hasher et al., 1977) has been obtained under many conditions. The effect has been demonstrated for trivia statements (Bacon, 1979), product-related claims (Hawkins & Hoch, 1992; Law et al., 1998; Roggeveen & Johar, 2002, 2007), and even opinion statements (Arkes, Hackett, & Boehm, 1989). It occurs for orally presented statements (Gigerenzer, 1984; Hasher et al., 1977) and written statements (e.g., Arkes et al., 1989; Schwartz, 1982) as well as with a delay of minutes (Arkes et al., 1989; Begg & Armour, 1991; Begg, Armour, & Kerr, 1985; Begg et al., 1992; Schwartz, 1982) or weeks (Bacon, 1979; Gigerenzer, 1984; Hasher et al., 1977) between the repetitions of the statements. The effect has been shown with different presentation times of each statement (5 s to 12 s; Gigerenzer, 1984), and it occurs when the truth of each statement is rated after each repetition (Hasher et al., 1977) or when the rating takes place only after the final repetition (Schwartz, 1982). The effect generalizes to nonlaboratory settings (Gigerenzer, 1984), it has been shown that one exposure to a statement is sufficient to produce the effect (Arkes et al., 1991), and it seems to be robust against feedback after a longer delay (Brown & Nix, 1996; Skurnik, Yoon, Park, & Schwarz, 2005). It occurs equally for actually true and actually false statements (Brown & Nix, 1996). Moreover, even repeated statements from a noncredible source are judged as more probably true as compared to new statements (Begg et al., 1992). Overall, the truth effect appears to be very robust. The only constraint seems to be that the statements have to be ambiguous, that is, participants have to be uncertain about their truth status because otherwise the statements’ truthfulness will be judged on the basis of their knowledge and not on the basis of fluency.

Mediating Variables: The Role of Memory Processes. By definition, the repetition-based truth effect is mediated by memory processes. Different memory processes have been discussed, for example, memory for stimulus frequency (Hasher et al., 1977), explicit recognition (Bacon, 1979; Hawkins & Hoch, 1992), familiarity (Begg et al., 1992; Schwartz, 1982), and, more recently, processing fluency (e.g., Begg et al., 1992; Reber & Schwarz, 1999; Unkelbach, 2006, 2007). Although these suggested mediators are all based on memory, they differ with regard to important qualities. A first important distinction is that between source memory and item memory: On one hand, if we remember hearing a statement from a particular source (e.g., from a distinguished expert), this information provides referential validity and likely increases our judgment of the statement’s truth, provided the source is trustworthy (e.g., Brown & Nix, 1996). The role of referential validity is limited in most studies because statements are usually pretested to be unfamiliar to participants. On the other hand, just remembering having heard the statement itself before (i.e., without source information) can increase truth judgments, too (Arkes et al., 1991): Remembering previous occurrences of a statement conveys what can be called convergent validity.²

The second important distinction is that between explicit and implicit memory processes: Although explicit recognition of a statement is clearly necessary for referential or convergent validity effects to occur, a nonreferential truth effect can consistently be observed even in the absence of explicit recollection (Bacon, 1979); it is this nonreferential, implicit part of the truth effect that is thought to be driven by processing fluency (Begg et al., 1992; Reber & Schwarz, 1999; Unkelbach, 2007). Importantly, although source and even item memory provide a rational basis for the truth effect, this is not the case for the fluency-based part of the effect, which is why it has been termed the illusory truth effect (Begg et al., 1992).
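
As a concrete illustration of the within-items and between-items criteria defined in the introduction, the following Python sketch computes both raw differences from a handful of ratings. It is only a sketch under stated assumptions: the ratings are invented, the 7-point scale direction (7 = true) is chosen for convenience, and the function names are hypothetical rather than taken from any study reviewed here.

```python
from statistics import mean


def within_items_effect(first_ratings, repeated_ratings):
    """Mean rating of the repeated items at their second encounter minus
    the mean rating of the same items at their first encounter."""
    return mean(repeated_ratings) - mean(first_ratings)


def between_items_effect(repeated_ratings, new_ratings):
    """Mean Session 2 rating of repeated items minus the mean Session 2
    rating of a different set of items that were not seen before."""
    return mean(repeated_ratings) - mean(new_ratings)


# Invented ratings on a 7-point scale (1 = false, 7 = true); assumption only.
session1_repeated_items = [4, 3, 4, 5, 4]  # first encounter
session2_repeated_items = [5, 4, 5, 5, 4]  # same items, second encounter
session2_new_items = [4, 3, 3, 4, 4]       # different items, Session 2 only

within = within_items_effect(session1_repeated_items, session2_repeated_items)
between = between_items_effect(session2_repeated_items, session2_new_items)
print(f"within-items difference:  {within:.2f}")
print(f"between-items difference: {between:.2f}")
```

Both functions use the same Session 2 ratings of the repeated items; they differ only in the baseline they subtract (the same items at Session 1 versus different, new items at Session 2). The invented numbers carry no evidential weight and are not meant to mirror any reported result.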
Processing Fluency and the Truth Effect. Processing fluency is defined as the metacognitive experience of ease during information processing; it may be elicited, for example, by linguistic ease, perceptual ease, semantic priming, or retrieval ease (Alter & Oppenheimer, 2009; Whittlesea, 1993). For instance, processing fluency can be increased by improving the visual contrast with which a stimulus is presented on a computer screen: A higher contrast makes it easier for participants to perceive the stimulus. This perceptual ease (or, more generally, processing fluency) may subsequently affect judgments; for example, it may lead individuals to rate statements more probably true when presented in high as compared to low visual contrast (Reber & Schwarz, 1999). In addition to truth judgments, the fluency experience has been shown to affect a broad variety of domains (cf. Alter & Oppenheimer, 2009), including confidence (e.g., Simmons & Nelson, 2006), familiarity judgments (e.g., Whittlesea, 1993), and attractiveness (Reber, Winkielman, & Schwarz, 1998). Taken together, repetition affects judgments of a statement’s truth value by increasing processing fluency: A statement that is processed more fluently is typically more likely to be judged true (e.g., Begg et al., 1992).³

For fluency to inform judgments, one has to form an idea of how fluent a given experience was; this implies that a comparison must be made with a norm or standard (Whittlesea & Leboe, 2003). Depending on the standard, a given experience is then interpreted as relatively fluent (i.e., if it exceeds the standard) or as relatively disfluent (i.e., if it falls below the standard). In other words, for processing fluency to affect judgments, it needs to be experienced as discrepant from a comparison standard (Hansen & Wänke, 2008; Westerman, 2008; Whittlesea & Williams, 1998, 2000, 2001a, 2001b; Willems & Van der Linden, 2006). The experience of discrepancy has been shown to play an important role in effects of repetition on truth judgments (Dechêne et al., 2009; Hansen et al., 2008). Despite its theoretical importance, not much is known about how comparison standards are chosen or about the variables that affect this choice. What we do know is that the comparison standard appears to depend on features of the stimuli and of the context in which stimuli are encountered (e.g., Whittlesea & Leboe, 2003). Recent findings suggest that a standard may be based, for instance, on an expectation held in mind about how easily a stimulus will be processed (Hansen et al., 2008; Hansen & Wänke, 2008; Whittlesea & Leboe, 2003; Whittlesea & Williams, 1998, 2000, 2001a, 2001b) or on the average processing fluency of other stimuli in the same context (Dechêne et al., 2009; Whittlesea & Leboe, 2003).

At this point, the mediation of the repetition-based truth effect by processing fluency can be summarized as follows: In a first step, fluency is enhanced by repeated processing; in a second step, the enhanced processing fluency is experienced as discrepant from a comparison standard; in a third step, the experienced discrepancy informs truth judgments.

Contextual Influences on the Comparison Standard. In the present review, we argue that the truth effect may be affected by the judgment context via its influence on the construction of a comparison standard. We distinguish between two different contexts in which the critical truth judgments are collected: a homogeneous and a heterogeneous context. Below, it is outlined how these contexts may differentially affect the comparison standard and, thereby, the truth effect.

The heterogeneous context. Here, truth judgments are collected for a heterogeneous mix of statements, consisting of a set of repeated and a different set of nonrepeated statements. The truth effect is assessed using the between-items criterion: Truth judgments for the critical set of repeated statements are compared to those for a different set of statements that were not encountered before. In the heterogeneous context, because of the mixture of repeated and nonrepeated statements, there is considerable variability in processing fluency. This variability has at least two consequences: First, participants may notice that some statements are more fluent than others, and they may be more likely to use this information as a heuristic cue to inform their judgments. Second, and most importantly, the heterogeneous context provides a situation in which a useful and precise comparison standard can be easily construed on the fly based on the statements’ average fluency (Whittlesea & Leboe, 2003). With the average fluency as a comparison standard, repeated statements are likely to be perceived as above average whereas nonrepeated statements are perceived as below average. Such an average standard is the most efficient way to classify items as fluent versus disfluent, as repeated versus new, or as more versus less likely true. In addition, the contrast in fluency between fluent and disfluent stimuli may lead to a shift of judgments of nonrepeated items toward the “false” end of the scale. We call this the negative truth effect. If such an effect were to exist, it would inflate the between-items criterion but would not affect the within-items criterion (i.e., because the latter does not consider judgments of nonrepeated statements).

The homogeneous context. In the homogeneous context, truth judgments are collected for a homogeneous set of repeated statements only. The truth effect is assessed using the within-items criterion: Truth judgments are compared to those made to the same set of statements at the first encounter. In contrast to the heterogeneous context, there is little variability in processing fluency in the homogeneous context. As a consequence, processing fluency conveys no information and is less likely to be recruited as a cue by participants (Whittlesea & Leboe, 2003).

Furthermore, if fluency is nevertheless used, a sensible comparison standard is not easily construed. Using average fluency as a standard is obviously not useful here; it would render all statements equally nondiscrepant from the standard. Global standards or expectancies form other possible bases for a comparison standard (e.g., Hansen et al., 2008; Whittlesea & Leboe, 2003): A global expectancy may be
formed on the basis of previous experience but will likely lead to a rather diffuse comparison standard. Perhaps the most useful expectancy to be recruited is based on the specific statements’ processing fluency at previous encounters; it might lead to a less diffuse standard but depends on participants’ ability to retrieve their processing fluency experiences from the first session. It is unclear whether and how fluency experiences are stored in memory; given that fluency is a rather subtle perceptual signal, memory for this signal is likely to decline substantially as a function of intersession delay. Other standards may also be used, but such standards are likely to be idiosyncratic and more variable across participants as they are less restrained by the context. Because the homogeneous context provides no information about the relevant range of processing fluency, the comparison standards are likely to be more variable on two levels: Within participants, they may rely on more diffuse prior expectancies; furthermore, expectancies are more likely to differ between participants. As a result, the use of fluency is rendered both less likely and less efficient.

Thus, although the truth effect is driven by fluency discrepancy in both contexts, the comparison standards may differ across contexts: In the absence of relevant contextual information in the homogeneous context, the truth effect in this context may, for example, be based on an individual’s rather diffuse global fluency expectancies. In contrast, in the heterogeneous context, the truth effect is likely to be based on a more specific standard computed on the fly from the information provided by that context. From these differences, three predictions can be derived that will be the focus of the meta-analysis. First, the truth effects in the two contexts should differ in magnitude (i.e., the effect should be larger in the heterogeneous than in the homogeneous context); second, they might be affected differently by moderating variables (e.g., by moderators that affect the construction of the comparison standard). A third prediction is that a negative truth effect should be observed in the heterogeneous context: Nonrepeated statements should be less likely to be judged true when judged in the context of repeated statements as compared to when judged in the context of other nonrepeated statements. These predictions are evaluated in the meta-analysis by comparing the between-items and within-items criteria.

Note that the postulated links between the within-items criterion and the homogeneous context, and between the between-items criterion and the heterogeneous context, are not perfect: Although a truth effect in a homogeneous context is necessarily measured using the within-items criterion, a truth effect obtained in a heterogeneous context can often be measured using either criterion. Specifically, the within-items criterion can also be computed for the heterogeneous context (provided that first-session truth ratings are available). In fact, a large proportion of the within-items effect sizes in the meta-analysis are from heterogeneous contexts. This implies that the meta-analytic results obtained for the within-items effect were also, to a substantial extent, driven by the heterogeneous context. Hence, for separating the effects of the different contexts, the present comparison between the two criteria must be seen as a rough initial assessment that should be followed and complemented by more targeted experimental research (e.g., Dechêne et al., 2009).

Moderating Variables

A variety of moderators of the truth effect have been investigated. First, as mentioned above, the truth effect is observed only for ambiguous statements; it disappears when the actual truth status is known. Evidence for this comes from findings that the truth effect disappears when feedback about the actual truth status is given (Brown & Nix, 1996). However, when a delay is introduced between feedback and judgment, causing memory for the actual truth status to decline, the effect reappears.

Several attributes of the statements’ source also moderate the truth effect. Overall, repeated statements from credible sources are believed more, as compared to repeated statements from noncredible sources (when source credibility is remembered). However, repeated statements from noncredible sources are still believed more, as compared to new statements from those sources (Begg et al., 1992). This result indicates that source credibility moderates the size of the effect, but it does not completely eradicate the truth effect. In addition, statements are believed more when the statements’ source is misattributed as originating from outside the experimental setting, which may be interpreted as an external validity cue. But even in this case, repeated statements that are attributed to the artificial laboratory situation are still believed more than nonrepeated statements (Arkes et al., 1989; Law et al., 1998). Similar to source credibility, source variability has been shown to influence the truth effect, but only when participants are aware of the variability in sources (Roggeveen & Johar, 2002). The effect appears to be more pronounced when the statements’ source is not correctly remembered but the statements’ content is (Law & Hawkins, 1997). Finally, older adults, who tend to have impaired source memory, are especially susceptible to the truth effect (Law et al., 1998; Skurnik et al., 2005; for differing results, see Parks & Toth, 2006). Thus, a disconnection of content and source memory seems to provide beneficial conditions for the occurrence of the truth effect. This can be seen as functional; for instance, it is plausible that familiar or recognized information from a source that is not recalled (and thus not evaluable) may be interpreted as a part of general semantic knowledge and is thereby interpreted as probably true without further evaluation (Law & Hawkins, 1997).

Taken together, research on the influence of feedback, source characteristics, and age of participants on the truth effect suggests ways in which different memory processes are involved in the truth effect, above and beyond the explicit
use of recognition and the implicit use of cognitive feelings of discrepant fluency (both of which are enhanced by repetition). Perhaps most importantly, imprecise source memory may support the effect. Therefore, repetition may affect truth ratings in two opposite ways: First, repetition enhances the cognitive processing fluency of statements; second, repetition also enhances the probability that both a statement’s content and its source are stored in memory.

Focus of the Meta-Analysis

As discussed above, the nonreferential part of the repetition-based truth effect is the product of two processes: An increase in fluency and the subsequent experience of this increased fluency as discrepant from a comparison standard. Importantly, the second process is likely affected by the (homogeneous or heterogeneous) judgment context; this contextual effect is reflected in differences between the within- and between-items criteria. It is an inherent limitation of the meta-analytic method that these measures do not represent process-pure indicators of the above context effects, and the present meta-analysis cannot replace controlled experiments of such context effects. Nevertheless, by focusing on the differences between the within-items and the between-items components of the truth effect, the present review can provide an initial assessment of the role of the comparison standard as well as contextual moderation of the truth effect, and it hopes to stimulate experimental research in this direction.

The present review separately analyzes the between-items and within-items truth effects and investigates possible differential effects of procedural and participant variables on both components. First, we investigate the overall effect of repetition on subjective truth separately for both components. Next, we examine whether the effect of repetition on both components of subjective truth judgments is enhanced or diminished by a number of procedural and participant moderators that are of potential theoretical interest. Moderators include characteristics of stimuli, presentation, and measurement of the truth effect as well as levels of processing, the number of repetitions, and participants’ age.

Method

Literature Search. We retrieved published articles, book chapters, and dissertations through an extensive search in PsycINFO, PubMed, and Web of Science, the main databases for psychological literature. Furthermore, we searched in ProQuest, the main database for doctoral dissertations, Google Scholar, and two business literature databases (Business Source Premier and EconLit) because a considerable amount of research on the truth effect has been published in consumer and marketing journals. The following keywords were used: truth effect, illusory truth, and truth judgment(s). We also searched Web of Science for articles that cited the first truth effect article, by Hasher et al. (1977). We asked for unpublished data through inquiries using the mailing lists of the Society for Personality and Social Psychology and the European Association of Social Psychology (the former European Association of Experimental Social Psychology).

Criteria for Inclusion. We applied the following criteria to determine the eligibility of each study for inclusion in the meta-analysis:

1. Studies that investigated the effect of repetition on subjective ratings of truth were included.
2. Studies that reported means or proportions of subjective truth ratings at least of new and repeated statements in Session 2 (between-items criterion) or of repeated statements at their first and second (or more) presentation or judgment (within-items criterion) were included. Studies that did not report means or proportions of subjective truth judgments were excluded (k = 13).

Furthermore, we avoided duplication by excluding data that were reported in previously published work and that were thus already included in the meta-analysis (k = 2). All studies had an experimental or at least a quasi-experimental design. After application of the exclusion criteria, 51 independent studies matched the criteria and therefore were included in the analysis.

Coding of Study Characteristics. Appropriate studies were coded by the first and the third author, using a data coding form. Both authors were familiar with the truth effect literature and discussed various preliminary versions of the coding form before starting the coding. The topics of the coded study characteristics can be grouped into the following areas: (a) study and sample descriptors (e.g., year and type of publication, type of sample, and proportion of male participants), (b) research design descriptors (e.g., the moderators investigated), (c) procedure descriptors (e.g., delay between assessments, characteristics of data collection), and (d) data-level information (e.g., type of data reported). The relevant variables are described in more detail below. The interrater agreement was high: Of 456 ratings, 6 were divergent. These differences were resolved through discussion.

Study and sample descriptors. Although we had no hypotheses about influences of publication year, publication type, gender, and sample size on the truth effect, we coded these variables for descriptive reasons. Studies in journal articles (k = 42) and book chapters (k = 1) were published between 1977 and 2006. Unpublished studies (doctoral dissertations, k = 4; master’s theses, k = 1; and other manuscripts, k = 3) were from 1998 to 2007. The sample sizes ranged from n = 12 to n = 145 participants (M = 46.78, Mdn = 40, SD = 25.59). The gender of participants was rarely reported.

Furthermore, we recorded the type of sample used in the studies (2 representative samples, 43 convenience samples
consisting of university students, and 6 quasi-experimental studies that compared older and younger adults). Participants’ mean age was reported only in studies that investigated the effect of age (M_old = 72.86, SD = 5.09; M_young = 22.54, SD = 4.11). Given the large proportion of student samples, it can be assumed that participants’ mean age was similar to that of the young participants in studies on age effects.

Research design descriptors. Categorizing the study designs involved coding the type of moderator (e.g., level of processing, age) that was investigated in a study. If the information given did not allow for a definite coding judgment or the study did not involve a moderator, the data were coded as missing. In addition, we coded the type of dependent variable used to measure the truth judgments (a 7-point Likert-type scale was most frequently used; anchored either with 1 = true and 7 = false or with 1 = false and 7 = true).

Procedure descriptors. Technical characteristics of the study procedures were coded. These included the number of measurements, the time of the measurements (e.g., at Sessions 1 and 2, only Session 2), the number of presentations of the items, the modality of data collection (e.g., paper and pencil, computer), the presentation mode of the items (e.g., visual or auditory), and the response mode (e.g., written form, orally). Furthermore, the delay between two sessions was coded (e.g., < 30 min, 1 day, 1 week) as well as the duration of presentation of one item, if this information was available. There was always a category to indicate if there was no information available or a different procedure was used.

In general, the coded variables can be categorized as either procedural (e.g., stimulus variables, measurement variables) or participant (e.g., age of participants, level of processing) variables. At least one effect was coded from each study (within-items criterion and/or between-items criterion). If a study systematically investigated a variable of interest in this meta-analysis, one effect for each level was coded if the manipulation was between participants. We did not include effects from manipulations that were implemented within participants because they violate the assumption of independence of the effect sizes. In this case, we collapsed over manipulations (see Hedges & Olkin, 1985; Rosenthal & Rubin, 1986).

Meta-Analytic Procedure.
Calculating effect sizes. We used the standardized mean difference to estimate the influence of repetition on truth

We refrained from computing effect sizes from F tests, t tests, and p values because of the repeated measures designs typically used in truth effect experiments. The test statistics of repeated measures designs use error terms that are affected by the correlation between the measures: The larger the correlation between the measures, the smaller the error (and the larger the test statistic). If an effect size is computed from such a statistic without taking the correlation between the measures into account, the effect size will be overestimated (Dunlap, Cortina, Vaslow, & Burke, 1996). Because none of the studies reported the correlations, rendering adequate corrections impossible, we used only the reported means and standard deviations for the computation of the effect sizes. Although this may imply somewhat less precise estimates, it avoids potentially severe biases.

Twenty-one studies provided standard deviations for the reported means; seven studies reported a range of standard deviations. In the latter case, we computed the pooled standard deviations from the range. Where no standard deviations were provided, we chose to impute the pooled standard deviation from an overall estimate that was obtained from those studies in which standard deviations were reported or could be extracted.

A positive effect size indicates that truth ratings increase with repetition. For the computation of the within-items criterion effect size, we subtracted the mean of the first truth rating from the mean of the repeated truth rating. We obtained k = 32 independent effect sizes for the within-items criterion. For the between-items criterion, we subtracted the mean truth rating of new statements from the mean truth rating of repeated statements. We obtained k = 70 independent effect sizes for the between-items criterion.

We applied a small-sample correction on the effect sizes, using the following formula (Hedges & Olkin, 1985):

\[ d_{\text{unbiased}} = g \times \left( 1 - \frac{3}{4(N - 2) - 1} \right) \qquad (3) \]

The resulting effect size d is unbiased by sample size and can be interpreted following the conventions of Cohen (1988) with d = .2 indicating a small effect, d = .5 indicating a medium effect, and d = .8 indicating a large effect.

Calculation of average effect sizes. In computing average
judgments. The effect size d is a scale-free measure of the effect sizes, individual study effect sizes were weighted by
difference of two group means (Cohen, 1988). We calculated the inverse of their variance. This procedure lends greater
the effect sizes by using Hedges’s formula based on means weight to studies with more precise effect size measures (e.g.,
and pooled standard deviations: because of larger samples, stricter experimental control, etc.)
and thereby ensures that the meta-analytic results are not
g = (M1 – M2)/SDpooled. (1) easily distorted by outcomes from a single small study.
Tests for moderation. We used homogeneity analyses to test
The pooled standard deviation was computed as, for moderation (Cooper & Hedges, 1994; Hedges & Olkin,
1985). In these analyses, the amount of variance in an observed
( n1 − 1) SD12 + ( n 2 −1) SD22
SDSDpooled =
pooled
= . (2) set of effect sizes is compared to the amount of variance that
n+n−2
would be expected by sampling error alone. The null hypothesis

Table 1. Overall Meta-Analytic Results for Within- and Between-Items Criterion

95% Confidence Interval

Criterion k d Low Estimate High Estimate z Qw

Within items 32 .39*** (.39)*** 0.32 (0.30) 0.47 (0.49) 10.33 (8.08) 47.21*
Between items 70 .49*** (.50)*** 0.45 (0.43) 0.55 (0.57) 19.67 (13.41) 142.04***
Note: Fixed-effects estimates are presented outside parentheses, and random-effects estimates are within parentheses.
*p < .05. ***p < .0001.
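
As an illustration of how the quantities reported in Table 1 can be obtained, the following Python sketch implements Equations 1 to 3 and a fixed-effects (inverse-variance) summary with the homogeneity statistic Qw. It is not the SPSS macro used for the analyses reported here. The function names and all input values are hypothetical, the sampling variance of each effect size is treated as a given input because the text does not spell out a variance formula, and the 1.96 multiplier assumes a normal approximation for the 95% confidence interval.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two sets of ratings (Equation 2)."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def hedges_g(m1, m2, sd_pooled):
    """Standardized mean difference (Equation 1)."""
    return (m1 - m2) / sd_pooled


def small_sample_correction(g, n_total):
    """Small-sample correction of g (Equation 3); n_total is N."""
    return g * (1 - 3 / (4 * (n_total - 2) - 1))


def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean effect size with 95% CI, z, and Q_w.

    The variances are assumed to be known sampling variances of the
    individual effect sizes (an assumption of this sketch).
    """
    weights = [1.0 / v for v in variances]
    w_sum = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    # Q_w: weighted squared deviations of each effect size from the mean
    q_w = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    return {
        "d": mean,
        "ci_low": mean - 1.96 * se,
        "ci_high": mean + 1.96 * se,
        "z": mean / se,
        "Q_w": q_w,
        "df": len(effects) - 1,
    }


if __name__ == "__main__":
    # Hypothetical study: repeated vs. new statements, 20 ratings each
    sd = pooled_sd(20, 1.2, 20, 1.1)
    d = small_sample_correction(hedges_g(4.6, 4.1, sd), 40)
    # Hypothetical effect sizes and sampling variances for three studies
    print(fixed_effect_summary([d, 0.35, 0.52], [0.10, 0.05, 0.08]))
```

Under the null hypothesis of homogeneity, the returned Q_w would be compared against a chi-square distribution with the returned df (k – 1).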

of homogeneity (i.e., no variation across effect sizes) is tested using a within-class goodness-of-fit statistic, Qw, which has an approximate chi-square distribution with k – 1 degrees of freedom, where k equals the number of effect sizes (Hedges & Olkin, 1985). A significant Qw statistic indicates heterogeneity, that is, a systematic variation among effect sizes; such a finding would suggest that other variables moderate the effect (Cooper, 1998).

Homogeneity analyses can be used in a similar way to test whether groups of average effect sizes vary more than predicted by sampling error. The null hypothesis of no variation across groups can be tested by computing a between-class goodness-of-fit statistic, Qb. It has a chi-square distribution with j – 1 degrees of freedom, where j equals the number of tested groups. If the Qb statistic is significant, average effect sizes vary between the categories of the moderator more than predicted by sampling error. This procedure is similar to an analysis of variance that tests for group mean differences or to a multiple regression model that tests for linear effects.

Fixed versus random error models. It is debated whether a fixed-effects (FE) model or a random-effects (RE) model is more adequate for meta-analyses of effect sizes (e.g., Schmidt, Oh, & Hays, 2009). On one hand, FE models are more powerful and therefore more likely to detect effects of moderators. On the other hand, they tend to be more liberal and reject the null hypothesis more often than indicated by the nominal alpha level when the true effect is zero. FE models are thought to be appropriate when an inference is made only about the sample of studies under investigation, whereas RE models are the appropriate method when the goal is to generalize the findings across all possible studies. We fitted both models in the present study for the following reasons. First, there is no consensus yet as to which method should generally be preferred. Second, where the results of both analyses diverge, both can be important: We are interested in the effects of a given moderator on the present sample of studies but also whether the effects of that moderator can be generalized.

Analyses followed the methods introduced by Hedges (1983; Hedges & Olkin, 1985; Hedges & Vevea, 1998; also see Raudenbush, 1994), as implemented for SPSS by Lipsey and Wilson (2001; also see http://mason.gmu.edu/~dwilsonb/ma.html).

Results

Prior to analyses, we checked the distribution of uncorrected effect sizes for outliers. No outliers were indicated by the box plot. We also checked the distribution of uncorrected effect sizes for possible publication bias. As indicated by funnel plots for both the within-items and the between-items criteria, effect sizes are distributed symmetrically, and thus publication bias does not seem to be a problem.

Overall Analyses

In a first step, analyses of the truth effect for all independent effect sizes (32 within items, 70 between items) that were retrieved from 51 studies were separately performed for both components of the effect, collapsing across procedural and participant variables.

Results of overall analyses for both criteria show that both weighted-average ds differ from zero (see Table 1). Following the conventions of Cohen (1988), both components of the truth effect are medium-sized effects under both FE and RE assumptions: The effects ranged from d = –.28 to .89 for the within-items criterion and from d = –.18 to 1.43 for the between-items criterion.

Descriptively, the within-items effect was smaller than the between-items effect, and we tested whether this difference was substantial. Note that, when computed for the same study, the within- and the between-items effects are not independent in two ways: (a) The data are based on the same sample because of the repeated measures designs and (b) the mean subjective truth rating of repeated items in Session 2 is used to compute both effect sizes. To avoid a violation of independence, we chose to compare effect sizes of the within-items criterion with effect sizes of the between-items criterion only for those studies that did not report a within-items criterion. Thus, we excluded between-items effect sizes from analyses that were derived from studies that reported both criteria. Results of the inverse variance weighted one-way ANOVA with an FE model show that, in fact, the within-items criterion is smaller (FE: dwithin = .39; 95% confidence interval [CI] = 0.32, 0.47; k = 32) than the between-items criterion (dbetween = .52; 95% CI = 0.46, 0.58; k = 41), Qb(df = 1) = 6.21, p = .01. When an RE model was applied, the difference between both

Table 2. Results of Moderator Analysis Examining the Effect of Change of Wording on Both Components of the Truth Effect

95% Confidence Interval

Type of Repetition k d Low Estimate High Estimate Qb(df = 1)

Within-items criterion < 1 (< 1)


Verbatim repetition 30 .39*** (.39)*** 0.32 (0.29) 0.47 (0.49)
Gist repetition 2 .40† (.40) -0.44 (-0.09) 0.85 (0.89)
Between-items criterion 13.35** (5.95)*
Verbatim repetition 64 .53*** (.53)*** 0.47 (0.46) 0.58 (0.60)
Gist repetition 6 .21* (.22)† 0.05 (-0.02) 0.37 (0.46)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.

†p < .10. *p < .05. **p < .001. ***p < .0001.
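
The between-class statistic Qb reported in moderator tables such as Table 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure implemented in the SPSS macros: the function names, the grouping structure, and the input values in the test data are hypothetical, and the p value helper uses a closed form that holds only for a chi-square statistic with exactly 1 degree of freedom.

```python
import math


def q_between(groups):
    """Between-class homogeneity statistic Q_b.

    groups maps a category label to a pair (effects, variances), where
    variances are the assumed sampling variances of the effect sizes.
    Returns (q_b, df, group_means).
    """
    stats = {}
    for label, (effects, variances) in groups.items():
        weights = [1.0 / v for v in variances]
        w_sum = sum(weights)
        mean = sum(w * d for w, d in zip(weights, effects)) / w_sum
        stats[label] = (mean, w_sum)
    total_w = sum(w for _, w in stats.values())
    grand_mean = sum(m * w for m, w in stats.values()) / total_w
    # Weighted squared deviations of the group means from the grand mean
    q_b = sum(w * (m - grand_mean) ** 2 for m, w in stats.values())
    means = {label: m for label, (m, _) in stats.items()}
    return q_b, len(groups) - 1, means


def chi2_p_df1(q):
    """Upper-tail p value of a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(q / 2.0))


if __name__ == "__main__":
    # Hypothetical effect sizes and variances for two moderator categories
    example = {
        "verbatim": ([0.55, 0.48, 0.60], [0.04, 0.05, 0.06]),
        "gist": ([0.20, 0.25], [0.05, 0.07]),
    }
    q_b, df, means = q_between(example)
    print(q_b, df, means, chi2_p_df1(q_b))
```

Applied to the reported values Qb(df = 1) = 6.21 and Qb(df = 1) = 3.29, the helper `chi2_p_df1` gives p values of approximately .01 and .07, in line with the values stated in the text.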

criteria was only marginally significant (RE: dwithin = .39; 95% CI = 0.28, 0.51; k = 32; dbetween = .53; 95% CI = 0.44, 0.62; k = 41), Qb(df = 1) = 3.29, p = .07. This indicates that, in fact, in the given set of studies both criteria differ in magnitude. As the results from the RE model suggest, however, this result should probably not be generalized to all possible studies on the truth effect. Also note that the sample of effect sizes of the between-items criterion used in this comparison differed from the overall set and that the effect size of this subsample was greater than that of the overall sample in both models; perhaps those studies that did not examine the within-items criterion systematically differ from those that did (see Wilson, 2003, for a discussion of confounds in meta-analysis).

We applied homogeneity analyses on both criteria. The results indicate heterogeneity for both the within-items and the between-items criteria (see Table 1). Thus, it is likely that both components of the effect are moderated by other variables. We performed all moderator analyses separately for both criteria. Analyses are grouped into (a) procedural variables that represent characteristics of the stimuli, the presentation, and measurement variables and (b) participant variables that include the age of the sample, the level of processing, and the influence of further (i.e., more than one) repetitions on the truth effect. For some participant variables, there was too little information given in the studies to include them in the analysis (e.g., gender of participants, individual differences).

Moderator Analyses

Stimulus Variables. Two possible procedural moderators can be characterized as variables of the stimuli: (a) whether statements are repeated verbatim or only the gist of a statement is repeated and (b) whether the proportion of critical items in the overall pool of presented items in the studies influences the effect. Both are of theoretical interest to understand the mediating processes of fluency and discrepancy on the occurrence of the truth effect. For instance, a gist repetition may affect the fluency of a statement (i.e., it is experienced as relatively disfluent as compared to a verbatim repetition). Similarly, a higher proportion of repeated items in a set of statements may reduce the experienced discrepancy against relatively disfluent statements and thereby may reduce the magnitude of the effect.

Verbatim versus gist repetition. We examined whether the statements in experiments were repeated verbatim or whether the gist of the statement was repeated. For example, Arkes and colleagues (1991, Experiment 2) presented text passages about China in a first session and repeated passages containing the same facts as in the text presented first (i.e., they used a gist repetition). As a result, the China-related statements were rated more probably true as compared to a control condition that had not seen the text about China before.

Moderator analyses for changes in wording (gist vs. verbatim repetition) from Session 1 to Session 2 were separately performed for the within-items and the between-items criterion; results are shown in Table 2. Unfortunately, only two effect sizes were available in the gist group for the within-items criterion; results of the Qb statistic on the within-items criterion can therefore not be reasonably interpreted. We present the results for descriptive reasons only. For the between-items criterion, effects of gist repetition were significantly smaller than those of verbatim repetition under both an FE and an RE model. Interestingly, in the RE analysis, the truth effect from gist repetition did not reach significance. The results suggest that repeating a statement in a different wording than in the first presentation diminishes the truth effect or may even eliminate it.

Proportion of critical items. We examined whether the proportion of critical (i.e., repeated) items in the total number of items judged affects the size of the truth effect. Although this proportion varied enormously (i.e., between 18% and 100%), it was 50% in more than half of the cases. As the discrepancy between fluent versus disfluent stimuli is discussed as an important mediator, the proportion of fluent versus disfluent stimuli may be relevant (Dechêne et al., 2009; Hansen et al., 2008). For instance, a higher proportion of disfluent (i.e., nonrepeated) stimuli may lead to greater truth effects by enhancing or strengthening the experienced discrepancy to

Table 3. Results of Moderator Analysis on Statements’ Presentation Time for Within- and Between-Items Criterion

95% Confidence Interval


Presentation Time
per Statement k d Low Estimate High Estimate Qb(df = 2)

Within-items criterion 3.78 (2.85)


≤ 8 s 3 .55*** (.55)** 0.29 (0.25) 0.82 (0.85)
> 8 s 4 .38*** (.38)** 0.20 (0.16) 0.56 (0.61)
Participant paced 17 .28*** (.27)*** 0.17 (0.07) 0.39 (0.41)
Between-items criterion 5.25† (1.30)
≤ 8 s 5 .49*** (.51)** 0.27 (0.21) 0.71 (0.82)
> 8 s 27 .54*** (.53)*** 0.46 (0.41) 0.62 (0.65)
Participant paced 16 .38*** (.41)*** 0.27 (0.24) 0.49 (0.58)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.

†p < .10. **p < .001. ***p < .0001.

fluent (i.e., repeated) stimuli. Regression analyses revealed that the proportion of critical items affects neither the within-items criterion, FE: Qb(df = 1) < 1, RE: Qb(df = 1) < 1, nor the between-items criterion, FE: Qb(df = 1) < 1, RE: Qb(df = 1) < 1. However, given the reduced range of variability, this null finding should not be overinterpreted.

Presentation Variables. Presentation variables include the presentation time per statement, the delay between Sessions 1 and 2, the modality of the presentation (i.e., whether statements were presented to the participants visually, auditorily, or in both modes), and whether the repeated statements in Session 2 were presented to participants homogeneously (i.e., in a context without new statements) or heterogeneously (i.e., interspersed with new statements).

Presentation time per statement. In all, 22 studies did not specify the presentation time per statement and therefore were excluded from analysis; for the within-items criterion, k = 24 were included, and for the between-items criterion, k = 48 were included. We decided to group the available presentation times in three categories because this yielded meaningful and sufficiently large categories: less than or equal to 8 s per statement, more than 8 s per statement, and participant paced. Presentation time per statement did not affect either criterion in FE and RE models (see Table 3). We conducted follow-up analyses for the within-items criterion and tested whether a presentation time of less than or equal to 8 s produced significantly greater effects as compared to presentation times of more than 8 s. No differences were found using FE or RE models, FE: Qb(df = 1) = 1.08, p = .29; RE: Qb(df = 1) = 1.08, p = .29. This result corresponds with the finding of Gigerenzer (1984), who experimentally varied whether each statement had to be judged within 5 s versus 10 s and found no difference between these intervals. On a descriptive level, participant-paced presentation seems to produce the smallest effects.

Delay between sessions. Brown and Nix (1996) and Gigerenzer (1984) investigated whether different delays between the sessions affect the truth effect. Gigerenzer varied the delay between sessions (1 vs. 2 weeks). No influence of intersession interval was found. Similarly, Brown and Nix varied whether Session 2 was administered 1 week, 1 month, or 3 months after Session 1; they found no difference in the truth effect for true items. Repeated false items, however, were rated less true than new (also false) items after a 1-week interval; no such difference was found for the 1-month or the 3-month conditions. Thus, experimental evidence so far shows no clear pattern for the influence of intersession interval on the truth effect. Along the same lines, many studies that did not directly examine the influence of the intersession interval obtained the truth effect with delays ranging from no more than minutes (e.g., Arkes et al., 1989; Begg & Armour, 1991; Schwartz, 1982) to several weeks (e.g., Bacon, 1979; Hasher et al., 1977).

For 20 studies, the length of the delay between the sessions was not specified, and therefore these effect sizes were excluded from the analysis. For the between-items criterion, 51 cases were included (k = 19 excluded); for the within-items criterion, 30 cases were included (k = 2 excluded). As shown in Table 4, neither criterion was moderated by the delay between the sessions in either model, between-items criterion FE: Qb(df = 2) = 0.66, p = .72; RE: Qb(df = 2) = 0.35, p = .84; within-items criterion FE: Qb(df = 2) = 3.45, p = .18; RE: Qb(df = 2) = 2.75, p = .25. On a descriptive level, the effect size of the within-items criterion was very small when Session 2 was administered on the same day as Session 1. For practical reasons, researchers often tend to realize Session 2 either within the same day or within a week after Session 1. Thus, we conducted additional analyses that compared effect sizes of within-day studies with those of the within-week type. As suggested by a marginally significant effect in the fixed-error model, administering both sessions within the same day (dwithin = .25; 95% CI = 0.07, 0.43; p < .05) tends to yield a smaller effect, as compared to when the second session is administered at least 1 day after the first (dwithin = .44; 95% CI = 0.31, 0.67; p < .0001), FE: Qb(df = 2) = 2.91, p = .08. However, this effect does not replicate in a random-error model (within day: dwithin = .25; 95% CI = 0.07, 0.43, p = .01;

Table 4. Results of Moderator Analysis of Delay Between Sessions on Within- and Between-Items Criterion

95% Confidence Interval

Session 2 Administered k d Low Estimate High Estimate Qb(df = 2)

Within-items criterion 3.44 (2.74)


Within day 9 .25* (.24)* 0.07 (0.04) 0.43 (0.46)
Within week 11 .44*** (.45)*** 0.31 (0.29) 0.57 (0.61)
Longer delay 10 .44*** (.45)*** 0.32 (0.28) 0.56 (0.61)
Between-items criterion < 1 (< 1)
Within day 25 .48*** (.49)*** 0.39 (0.37) 0.57 (0.62)
Within week 14 .43*** (.44)*** 0.32 (0.28) 0.54 (0.59)
Longer delay 12 .48*** (.49)*** 0.36 (0.32) 0.59 (0.65)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
*p < .05. ***p < .0001.

Table 5. Results of Moderator Analysis on Presentation Modality for Within- and Between-Items Criterion

95% Confidence Interval

Presentation Mode k d Low Estimate High Estimate Qb(df = 2)

Within-items criterion < 1 (< 1)


Visual 21 .43*** (.43)*** 0.32 (0.32) 0.53 (0.53)
Auditory 5 .43** (.43)** 0.22 (0.22) 0.63 (0.63)
Mixed 2 .40** (.40)** 0.17 (0.17) 0.63 (0.63)
Between-items criterion 2.66 (1.01)
Visual 38 .51*** (.52)*** 0.44 (0.42) 0.58 (0.61)
Auditory 24 .51*** (.52)*** 0.43 (0.39) 0.61 (0.64)
Mixed 4 .68*** (.67)*** 0.48 (0.38) 0.87 (0.95)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses. The fixed-effect and the random-effect
model revealed the same results in the analysis of the within-items criterion because the random error variance component was zero.
**p < .001. ***p < .0001.
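
The note to Table 5 points out that FE and RE results coincide when the random error variance component is zero. The following Python sketch illustrates why. The text does not state which estimator of the variance component was used in the SPSS macros; the DerSimonian-Laird method-of-moments estimator shown here is one common choice and is assumed purely for illustration. The function name and all input values are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects mean effect size using a method-of-moments tau^2.

    variances are the assumed sampling variances of the effect sizes.
    Returns (tau2, re_mean, re_se).
    """
    weights = [1.0 / v for v in variances]
    w_sum = sum(weights)
    fe_mean = sum(w * d for w, d in zip(weights, effects)) / w_sum
    q = sum(w * (d - fe_mean) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    c = w_sum - sum(w ** 2 for w in weights) / w_sum
    # The variance component is truncated at zero: if Q does not exceed
    # its degrees of freedom, no between-study variance is estimated.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_sum = sum(re_weights)
    re_mean = sum(w * d for w, d in zip(re_weights, effects)) / re_sum
    re_se = math.sqrt(1.0 / re_sum)
    return tau2, re_mean, re_se


if __name__ == "__main__":
    # Hypothetical heterogeneous set: tau^2 > 0, so the RE interval widens
    print(dersimonian_laird([0.10, 0.45, 0.80], [0.02, 0.03, 0.02]))
    # Hypothetical homogeneous set: tau^2 = 0, so RE equals FE
    print(dersimonian_laird([0.42, 0.43, 0.44], [0.02, 0.03, 0.02]))
```

When the estimated variance component is zero, the random-effects weights reduce to the fixed-effects weights, so both models return the same mean and standard error.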

within week: dwithin = .45; 95% CI = 0.30, 0.59, p = .0001), Qb(df = 2) = 2.57, p = .11. A tendency toward a smaller within-items effect for statements that were shown on the same day could be explained by a higher proportion of participants who remembered their first truth judgment for the respective statements; they might not have shown a strong increase in subjective truth for the second judgment because of a desire for consistency.

Modality of presentation. The truth effect has been reported with visual (e.g., Arkes et al., 1989; Begg et al., 1992; Hawkins & Hoch, 1992), with auditory (Gigerenzer, 1984; Hasher et al., 1977), and with mixed (i.e., visual and auditory; Bacon, 1979; Begg & Armour, 1991) presentations. It is as yet unclear whether modality of presentation systematically influences the truth effect. Four studies did not specify the modality of presentation and therefore were excluded from analyses, leaving k = 28 effect sizes for the analysis of the within-items criterion and k = 66 for the analysis of the between-items criterion. Results (see Table 5) show that the truth effect is not affected by modality (the within-items criterion results for mixed presentation have to be interpreted with caution because of the small number of effect sizes).

Homogeneous versus heterogeneous context. In most studies the truth effect has been examined in heterogeneous lists of mixed and repeated statements; it has rarely been studied in homogeneous lists (Dechêne et al., 2009; Schwartz, 1982). Results of the moderator analysis on the available effect sizes of these different list types are presented in Table 6. Unfortunately, Qb statistics for the between-items criterion could not be computed because only one case of a homogeneous list was reported. More studies are certainly needed to investigate the truth effect under homogeneous presentation conditions. The results for the within-items criterion suggest that the truth effect may be reduced in a homogeneous context. This finding provides initial support for the important role of context in determining comparison standards (Dechêne et al., 2009; Hansen et al., 2008).

Measurement Variables.

Scale. We examined whether the scale used to measure the truth ratings influences the truth effect. Most studies (k = 26) used a 7-point Likert-type scale with values ranging from false (1) to true (7). Ten studies used a 7-point scale with values from true (1) to false (7). Three studies used a 6-point scale (higher values indicated higher truth ratings). Five

Table 6. Results of Moderator Analysis of Homogenous Versus Heterogeneous Presentation for Within- and Between-Items Criterion

95% Confidence Interval

List Type k d Low Estimate High Estimate Qb(df = 1)

Within-items criterion 5.65* (4.44)*


Heterogeneous 29 .42*** (.43)*** 0.35 (0.33) 0.50 (0.52)
Homogeneous 3 .08 (.07) -0.20 (-0.24) 0.35 (0.39)

95% Confidence Interval

List Type k d Low Estimate High Estimate Qb

Between-items criterion —
Heterogeneous 65 .53*** (.51)*** 0.48 (0.44) 0.58 (0.58)
Homogeneous 1 -.09 (-.09) -0.62 (-0.76) 0.43 (0.55)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
*p < .05. ***p < .0001.

Table 7. Results of Moderator Analysis of Scale Type on Within- and Between-Items Criterion

95% Confidence Interval

Scale k d Low Estimate High Estimate Qb(df = 2)

Within-items criterion 7.94* (6.00)*


1–7 (false–true) 23 .33*** (.33)*** 0.24 (0.22) 0.43 (0.44)
1–7 (true–false) 2 .40** (.41)** 0.18 (0.12) 0.63 (0.69)
1–6 (false–true) 6 .62*** (.62)*** 0.44 (0.42) 0.79 (0.82)
16 cm (false–true) —
Dichotomous —

95% Confidence Interval

Scale k d Low Estimate High Estimate Qb(df = 4)

Between-items criterion 24.50** (11.27)*


1–7 (false–true) 40 .39*** (.41)*** 0.32 (0.32) 0.45 (0.49)
1–7 (true–false) 12 .64*** (.65)*** 0.53 (0.49) 0.74 (0.79)
1–6 (false–true) 6 .66*** (.65)*** 0.48 (0.43) 0.83 (0.88)
16 cm (false–true) 5 .66*** (.65)*** 0.49 (0.42) 0.83 (0.89)
Dichotomous 7 .54*** (.53)** 0.32 (0.27) 0.75 (0.79)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
*p < .05. **p < .001. ***p < .0001.

studies used a continuous 16 cm scale with higher values indicating higher truth ratings. Finally, 6 studies used a dichotomous measure. Only a single study used a 9-point scale (false to true); it was not included in this analysis. Results (see Table 7) suggest that the scale used to measure truth ratings influences both components of the truth effect. This finding was replicated in FE and RE models. For the within-items criterion, results suggest that a 6-point scale evokes a larger effect as compared to a 7-point scale. This is indeed the case, as indicated by a follow-up test, FE: Qb(df = 1) = 7.65, p < .05; RE: Qb(df = 1) = 5.75, p < .05.

Results for the between-items criterion suggest that the widely used 7-point scale from false to true evokes smaller effects than do the other scales. A follow-up analysis revealed that this difference was significant, FE: Qb(df = 1) = 23.57, p < .001; RE: Qb(df = 1) = 10.52, p = .001. Effect sizes obtained using dichotomous responses do not differ from those collected on the other scales, FE: Qb(df = 1) < 1; RE: Qb(df = 1) < 1.

It has been discussed whether scales with a midpoint (i.e., odd scales) are superior to scales without one (i.e., even scales), or vice versa (e.g., Krosnick, 1991; Krosnick & Fabrigar, 1997). Even scales force the participant to choose a clear direction of judgment, whereas odd scales allow participants to choose an indifference point. Thus, we were interested to determine whether the truth effect can be captured better by an even scale that may foster the use of fluency

Table 8. Results of Moderator Analysis of Scale Type (Odd vs. Even) for Within- and Between-Items Criterion

95% Confidence Interval

Scale k d Low Estimate High Estimate Qb(df = 1)

Within-items criterion 7.61* (5.91)*


Odd 26 .35*** (.34)*** 0.26 (0.25) 0.43 (0.44)
Even 6 .62*** (.62)*** 0.45 (0.42) 0.79 (0.82)
Between-items criterion 4.03* (1.65)
Odd 52 .46*** (.47)*** 0.40 (0.38) 0.52 (0.55)
Even 13 .61*** (.60)*** 0.47 (0.42) 0.75 (0.78)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
*p < .05. ***p < .0001.

Table 9. Results of Moderator Analysis of Data Collection Modality for Within- and Between-Items Criterion

95% Confidence Interval

Data collection modality k d Low Estimate High Estimate Qb(df = 1)

Within-items criterion < 1 (< 1)


Paper and pencil 21 .42*** (.43)*** 0.33 (0.31) 0.54 (0.55)
Computer 2 .27 (.21) -0.32 (-0.34) 0.73 (0.53)
Between-items criterion 17.74*** (7.21)*
Paper and pencil 42 .59*** (.59)*** 0.53 (0.51) 0.66 (0.69)
Computer 14 .30*** (.33)** 0.18 (0.16) 0.43 (0.50)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
*p < .05. **p < .001. ***p < .0001.

more than an odd scale. We grouped effect sizes into the two categories (i.e., both variants of the 7-point scale and the 9-point scale were classified as odd; the 6-point scale and the dichotomous responses were classified as even; the 16 cm scale was excluded because it cannot be categorized as odd vs. even). Results (see Table 8) show that measuring truth judgments with an even scale yields greater effects on both criteria. Note that, in the RE model, the scales do not differ significantly for the between-items effect.

Modality of experiment and data collection. We examined whether the data collection mode, that is, paper and pencil (k = 30 studies) versus computer (k = 11 studies), influences the truth effect. Ten studies did not report mode of data collection, and it was not possible to infer the mode of data collection that was used; these studies were excluded from analysis. Unfortunately, there were again too few within-items effect sizes in one condition (computer collection); the results are given for descriptive reasons. Results of both models (see Table 9) show that the modality of data collection affects the between-items effect: Conducting a truth effect experiment on the computer produces smaller effects than with paper and pencil. The reasons for this effect are unclear, and we can only speculate that there may be confounds with restricted presentation times in this analysis or differences in the presentation of statements (i.e., sequentially vs. all statements on the same sheet) that may foster or diminish the experience of discrepant fluency. However, other reasons are also possible, and further research is needed on this point.

Other Variables. Beyond procedural aspects, other variables may also affect the size of the truth effect and may function differentially on the processes that are involved in the occurrence of the within- and the between-items effect. Research has concentrated on cognitive processing styles and memory processes. Level of cognitive processing was mainly examined by manipulation of high versus low involvement or cognitive load. Participants' age was investigated because it may indirectly reflect the influence of memory on the truth effect because older adults tend to have impaired memory capacities. Furthermore, we conducted analyses on the influence of more than one repetition of statements on the truth effect. Other variables, such as individual judgment or processing styles (e.g., need for cognition, faith in intuition), were examined so infrequently that we were not able to investigate them here.

Level of Processing. An important question in research on the truth effect is whether the effect is moderated by level of processing. Hawkins and Hoch (1992) manipulated in Session 1 whether participants had to judge comprehensibility (i.e., a low involvement manipulation) or subjective truth (i.e., high involvement) of consumer-related claims and found that the truth effect (indicated by the between-items

Table 10. Results of Moderator Analysis of Level of Processing at Session 1 for Between-Items Criterion

95% Confidence Interval

Level of Processing k d Low Estimate High Estimate Qb(df = 1)

Between-items criterion 12.88** (4.35)*


High 51 .44*** (.45)*** 0.37 (0.37) 0.49 (0.54)
Low 19 .63*** (.62)*** 0.54 (0.49) 0.72 (0.75)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
*p < .05. **p < .001. ***p < .0001.

Table 11. Results of Moderator Analysis of Age of Participants for Between-Items Criterion

95% Confidence Interval

Age k d Low Estimate High Estimate Qb(df = 1)

Between-items criterion 1.51 (0.79)


Young 66 .49*** (.49)*** 0.44 (0.42) 0.54 (0.57)
Old 4 .64*** (.64)** 0.41 (0.33) 0.88 (0.96)
Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
**p < .001. ***p < .0001.

criterion) was larger under the low involvement condition (also see Hawkins, Hoch, & Meyers-Levy, 2001). Following this classification, the task of performing truth judgments at Session 1 was coded as a high level of processing, whereas merely reading the statements, or performing other superficial tasks, was coded as a low level of processing. A levels of processing effect is supported by the results of the moderator analyses (see Table 10), showing that the truth effect on the between-items criterion is increased with low as compared to high levels of processing at Session 1. Note that this moderator analysis cannot be performed for the within-items criterion because (with one exception) all studies implemented the level of processing manipulation in Session 1 and did not obtain the necessary Session 1 truth ratings for the low level of processing condition.

Age of Participants. The mean age was M = 22.54 (SD = 4.11) for young participants and M = 72.86 (SD = 5.09) for older participants. Note that only studies that directly investigated the influence of age on the truth effect reported the age of the sample. However, most studies use student samples, suggesting a mean age between 18 and 30 years; thus, we included these studies in the group of younger participants. Law and colleagues (1998) found that older adults were particularly susceptible to the truth effect, especially after a longer delay (Skurnik et al., 2005). However, there are also findings that show no differences in the truth effect between old and young participants in a fluency manipulation (Parks & Toth, 2006). Data were not available for the within-items criterion. Descriptively, the truth effect was larger for older adults (see Table 11); however, results of the moderator analysis suggest that the truth effect is not dependent on participants’ age.

Further Repetitions. Hasher and colleagues (1977) tested whether, after an initial repetition, truth ratings continue to be affected by further repetitions. Unfortunately, very few effect sizes were available for both criteria, and therefore the Qw statistic cannot be reasonably interpreted (within-items criterion) or even computed (between-items criterion). Nevertheless, there appears to be a tendency for an effect of further repetitions on the between-items criterion: Mean effect sizes of a second repetition were not different from zero for the within-items criterion but were significantly different from zero for the between-items criterion (see Table 12). Also, descriptively, the effects of a second and third repetition were very small for the within-items criterion; in contrast, the magnitude of the between-items criterion remained constant beyond the first repetition.

Summary

Overall, the truth effect was of medium size. The effect was moderated by response format and presentation duration but not by modality of presentation or the proportion of critical (i.e., repeated) statements. The truth effect was smallest on the widely used 7-point Likert-type scale (1 = false, 7 = true), and it was larger when an even scale was used (d = .61) as compared to an odd scale (d = .46). The effect tended to be moderated by statements’ presentation time, such that participant-paced presentation tended to yield smaller effects as compared to experimenter-paced presentation. The truth
Table 12. Overall Analyses of Effect Sizes for Second and Third Repetition for Within- and Between-Items Criterion

                           k    d                 95% CI Low       95% CI High    z              Qw
Within-items criterion
  Second repetition        7    .16† (.17)†       –0.01 (–0.03)    0.33 (0.37)    1.93 (1.65)    8.52
  Third repetition         2    .16 (.16)         –0.08 (–0.08)    0.41 (0.41)    1.28 (1.28)    0.02
Between-items criterion
  Second repetition        6    .44*** (.48)**    0.26 (0.22)      0.63 (0.73)    4.76 (3.68)    9.01
  Third repetition         1    .68** (—)         0.35 (—)         1.02 (—)       4.04 (—)       —

Note: Fixed-effects values are presented outside parentheses, and random-effects values are within parentheses.
†p < .10. **p < .001. ***p < .0001.
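
The z values and significance markers in Table 12 can be checked roughly against the printed estimates and intervals. The sketch below is an illustration under stated assumptions, not the authors' computation: it assumes symmetric normal-theory 95% intervals, backs the standard error out of the interval width, and tests each mean effect size against zero with a two-sided normal test. The function name is hypothetical, and because the table entries are rounded, the recomputed z values will differ somewhat from the printed ones.

```python
import math

Z_95 = 1.96  # assumed normal quantile behind the printed 95% intervals


def z_test_from_ci(d, low, high):
    """Return (z, two-sided p) for H0: d = 0, with the SE backed out of a 95% CI."""
    se = (high - low) / (2 * Z_95)
    z = d / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Fixed-effects rows for the second repetition in Table 12 (rounded values as printed)
z_within, p_within = z_test_from_ci(0.16, -0.01, 0.33)
z_between, p_between = z_test_from_ci(0.44, 0.26, 0.63)

print(f"within-items:  z = {z_within:.2f}, p = {p_within:.3f}")
print(f"between-items: z = {z_between:.2f}, p = {p_between:.6f}")
```

Under these assumptions the within-items effect of a second repetition falls short of the conventional .05 threshold, whereas the between-items effect clears it comfortably, which matches the pattern of markers in the table.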

effect did not depend on the modality (auditory, visual, or mixed) of the statements’ presentation. The proportion of repeated items on the overall set of judged statements also did not influence the effect.

A few additional moderators could be investigated only for between-items effect sizes (there were too few relevant within-items effect sizes). The truth effect was greater under lower (d = .63) as compared to higher (d = .44) levels of processing at Session 1. The effect was further moderated by the data collection modality: A paper and pencil procedure (d = .59) yielded a larger effect as compared to a computer-based procedure (d = .30).

Last but not least, the truth effect was larger for the between-items (d = .49) than for the within-items criterion (d = .39), a finding that is consistent with the notion that they may reflect different underlying processes. This notion is further supported by moderator analyses: As summarized below, the within-items and the between-items criteria were differently affected by a subset of the investigated moderators.

Within-Items Criterion. The within-items truth effect was moderated by the delay between sessions: When Session 2 was administered on the same day as Session 1, a significantly smaller effect (d = .25) was observed, as compared to when a longer delay was used (d = .44). The within-items truth effect was not affected by a verbatim (d = .39) versus gist (d = .40) repetition. A within-items truth effect was not found with a homogeneous test list (d = .08), but only with a heterogeneous list of old and new statements in Session 2 (d = .42). Descriptively, additional repetitions (i.e., two or three repetitions) did not reveal a considerable truth effect on the within-items criterion (ds = .16).

Between-Items Criterion. The results obtained for the between-items criterion differed from those obtained for the within-items criterion with regard to the two moderators of delay and verbatim versus gist repetition. In contrast to the within-items effect, the delay between sessions did not influence the between-items effect. Also, in contrast to the within-items effect, the between-items effect was moderated by verbatim versus gist repetition: The effect was larger with a verbatim (d = .53) as compared to a gist (d = .21) repetition. Finally, and again in contrast to the within-items criterion, the effect of the second and subsequent repetitions on the between-items criterion was still of considerable size (d = .44), descriptively, and comparable to the effect of the first repetition.

Discussion

In summary, some general conclusions can be drawn. First, the truth effect is of medium size (with CIs ranging between d = .32 and d = .55, depending on the choice of criterion and type of analysis). Second, the effect is robust against variations in presentation duration and modality, the proportion of repeated statements, and participants’ age. Third, the effect is smaller for scales that offer an indifference point (i.e., scales with an odd number of points). These findings are consistent with the widely held view that repetition, largely via the subtle cues provided by increased processing fluency, affects truth judgments in a wide range of situations. Before we turn to comparing the within- and between-items effects, some additional findings are discussed: the levels of processing effect and the role of interindividual differences.

Levels of Processing

The between-items truth effect was larger under low as compared to high levels of processing; that is, shallower cognitive processing during the statement’s first encounter led to a more pronounced truth effect. Hawkins and Hoch (1992) argued that high levels of processing foster more elaborative thoughts about the statements; such elaborations are likely to have a stronger influence on subsequent truth judgments than low-level heuristic cues such as fluency. In contrast, people who engaged in shallower cognitive processing are less likely to elaborate; their judgments tend to be based more on peripheral heuristic cues such as a statement’s familiarity (e.g., Hawkins & Hoch, 1992; Hawkins et al., 2001) or, we might add, its fluency. Note that the present results are restricted to effects of level of processing at the first encounter of a statement; there is a paucity of research on the influences of level of processing at the time of judgment.
The levels of processing effect is reminiscent of research on the “Spinozan” account introduced by Gilbert and colleagues (Gilbert, Krull, & Malone, 1990; Gilbert, Tafarodi, & Malone, 1993). This account holds that “seeing is believing”: When information is encoded, it is automatically and effortlessly “tagged” as true; elaborative processes are required to modify this tag. If elaboration is not possible (for example, under time pressure or cognitive load), people tend to encode false information as true, even if it was explicitly identified as false (Gilbert et al., 1993). Thus, similar to the present result that low levels of processing facilitate the truth effect, Gilbert and colleagues have shown that conditions of effortless or superficial cognitive processing facilitate the acceptance of new information as true.

Note that the Spinozan account does not distinguish between effects on encoding versus judgment because it is not concerned with the effects of repetition. To investigate levels of processing effects on the experience (and use) of fluency during truth judgments, future research should concentrate on the influence of level of processing at the time of the judgment.

Individual Differences

As revealed by the present meta-analysis, there is a lack of research on the influence of individual differences on the truth effect. It is plausible that individuals’ predispositions such as general skepticism (e.g., Hurtt, 1999; Obermiller & Spangenberg, 1998) or the tendency toward an intuitional–experiential thinking style (e.g., Epstein, Pacini, Denes-Raj, & Heier, 1996) may influence their susceptibility to the truth effect: Individuals with a highly intuitive style of thinking may be more sensitive to their metacognitive experiences, such as processing fluency, and may therefore exhibit a stronger truth effect. In contrast, people with a greater tendency to display general skepticism may be less susceptible to the truth effect. Given that the present meta-analytic review has firmly established the existence of a substantial, robust, medium-sized effect, future research focusing on individual differences is undoubtedly worthwhile.

Two Components of the Truth Effect

We now turn to discussing the differences between the between- and within-items effects in terms of contextual effects on the comparison standard. But first, let us again point out an important limitation of the meta-analytic findings: Almost all studies implemented a heterogeneous context with both repeated and nonrepeated statements, from which both the between-items and most of the within-items effect sizes were computed for the present analyses; only a small number of studies used the homogeneous context (for these studies, only the within-items criterion was computed). As a consequence, there is considerable overlap between the processes that determine the between- and within-items effects; they cannot be interpreted as reflecting distinct cognitive processes. As we pointed out above, it is not possible to purely disentangle underlying cognitive processes using the meta-analytic method. Such an effort to disentangle the underlying processes requires the use of controlled experiments, preferably in combination with comprehensive formal models of the truth effect (e.g., Unkelbach & Stahl, 2008).

Despite these limitations, the present distinction between the two criteria makes an important first contribution to research on context effects. This is because, first, the overlap between the cognitive processes involved in the between- and within-items effects is not perfect, and important theoretical distinctions remain such as the prediction of a negative truth effect for disfluent items. Second, despite considerable overlap, important empirical differences were observed. As predicted, the between- and within-items effects were of different magnitude, they were affected by different moderators, and a tendency toward a negative truth effect was found.

Dissociations. Support for the notion of different underlying processes comes from important empirical dissociations between the within-items criterion and the between-items criterion. Three main dissociations emerged: First, the between-items effect was of greater magnitude than the within-items effect. This observation is consistent with the above analysis that judgments of nonrepeated (and therefore disfluent) items are driven toward the “false” end of the scale in a heterogeneous context. Obviously, this line of reasoning implies that when judgments for nonrepeated statements are compared between the first and second sessions, we should observe a negative effect, and this negative effect should account for the difference in magnitude between the two criteria. We conducted such an analysis on the eligible set of effect sizes (k = 29) and obtained the predicted negative effect of d = –.07 (with CIs ranging from –.14 to .01). Although this effect is not significantly different from zero because of the smaller number of eligible effect sizes, it is of comparable magnitude to the difference between the between- and within-items effects (see Table 1). We conclude that the negative truth effect accounts for much, but not all, of the difference between the between- and within-items effect sizes.

Second, increasing the delay between sessions increases the within-items effect but not the between-items effect. This suggests that, with increasing delay, the attribution of fluency to the statement’s previous encounter (i.e., as familiarity) is less likely to be successful, perhaps because of decreasing source memory accuracy. Instead, the experienced fluency is then attributed to a statement’s truth, in the sense of convergent validity. Alternatively, a shorter delay might help participants remember their first-session responses to the statements, and a desire for behavioral consistency (i.e., a tendency to repeat the first-session response) might then diminish the truth effect. Other explanations are possible, but all explanations have in common that delay moderates the within-items truth
effect via its influence on memory processes other than those that support processing fluency. If delay affected fluency directly, such an effect should also be reflected in the between-items criterion. However, a moderation by delay was not observed for the between-items effect. This suggests that delay does not affect fluency directly and that the between-items effect is less susceptible to explicit memory processes such as those discussed above. Instead, the fact that the between-items effect is independent of delay suggests that it may come about mainly because of the differential fluency experiences provided in a heterogeneous context.

This is consistent with a third difference between the two criteria that emerged with regard to the moderating role of verbatim versus gist repetition. The moderator did not affect the within-items criterion; if, as suggested above, the within-items effect relies more on explicit memory processes, this could explain why in this case a repetition of a statement’s gist (its conceptual meaning) is sufficient and why an additional repetition of the exact wording is not beneficial. The between-items criterion, in contrast, exhibited larger effects for verbatim repetition than gist repetition, suggesting that the underlying processes are low-level perceptual processes typical of implicit memory effects (for an overview, see Roediger, 1990).

In sum, moderation analyses yielded different patterns for the between- and within-items effects. The findings are consistent with the notion that there are different processes underlying the two criteria: Although both an increase in fluency because of repetition and an experience of a discrepancy with a comparison standard play a role in both criteria, the discrepancy experience in the heterogeneous context is also boosted by the difference between relatively fluent (i.e., repeated) and relatively disfluent (i.e., nonrepeated) statements at the time of judgment; this difference is fully reflected only by the between-items criterion. As most of the studies employed heterogeneous contexts, the truth effect in these studies was influenced by a discrepancy between repeated and nonrepeated statements, which also increased the magnitude of the within-items effect sizes computed for these studies. Future research on the truth effect should carefully distinguish between the effects of repetition itself and the effects of the presence of nonrepeated, disfluent items at test. Moreover, research should concentrate on both of the underlying cognitive processes (the increase in fluency and the experience of discrepancy) separately.

The Discrepancy-Attribution Hypothesis. The present results suggest that the roles of discrepancy and of the comparison standard in fluency judgments require further attention. According to the discrepancy-attribution account by Whittlesea and Williams (1998, 2000, 2001a, 2001b), feelings of familiarity result when the processing of a stimulus is experienced as unexpectedly fluent (i.e., more fluent than expected on the basis of a comparison standard), and this feeling is attributed to a source in the past. Thus, if a surprisingly fluent processing experience is interpreted as resulting from an event in the past (i.e., a prior encounter with that stimulus), a feeling of familiarity is elicited. Also well demonstrated is the influence of processing fluency on other memory-based judgments such as feelings of remembering (e.g., Jacoby & Whitehouse, 1989) and recall (e.g., Roediger & McDermott, 1995). Furthermore, there is evidence that the discrepancy-attribution account generalizes to recognition (Westerman, 2008).

Beyond memory-based judgments, the discrepancy-attribution account can explain the present findings on judgments of truth as well as those of other fluency-based judgments such as liking judgments (Alter & Oppenheimer, 2009; Willems & Van der Linden, 2006). But despite the success of the discrepancy-attribution account in explaining a wide variety of judgment effects, little is known about the experience of discrepancy (and, for that matter, about the attribution process). Investigations into these processes are necessary to fill this void. The same is true for the comparison standard that plays an important role in the discrepancy-attribution hypothesis (cf. Whittlesea & Leboe, 2003). We distinguished here between a comparison standard based on internal expectancies (as in the homogeneous context) and one generated on the fly from external stimuli (as in the heterogeneous context). The results suggest that there may be substantial differences between the two types of comparison standards; for instance, comparison standards based on internal expectancies are likely to depend on individual characteristics to a greater degree than standards based on external stimuli. More research is needed that further investigates the effects of this and other factors on the comparison standard and, thereby, on the role of fluency in judgments.

Relation to Other Fluency-Based Effects: Mere Exposure

The truth effect is one of a class of fluency-based effects. Another prominent example of this class is the mere exposure effect (e.g., Bornstein, 1989; Bornstein & D’Agostino, 1992, 1994; Zajonc, 1968). Here, instead of judgments of truth, the experience of fluency informs judgments of liking. According to the processing fluency/attribution model (Bornstein & D’Agostino, 1992, 1994), prior exposure facilitates subsequent retrieval from memory, and this experience of ease is interpreted as positive. As a consequence, repeated stimuli are rated more favorably than new stimuli. Just as in the case of the truth effect, corresponding results were obtained using perceptual fluency manipulations: Highly visible stimuli are preferred over less visible stimuli (Reber et al., 1998; Winkielman & Cacioppo, 2001).

According to the hedonic fluency hypothesis (Reber, Schwarz, & Winkielman, 2004), easily processed stimuli are generally preferred because they automatically elicit positive affect. However, it remains to be shown whether positive affect is also involved in the emergence of the truth effect.
On one hand, there is evidence for a link between positive feelings and truth judgments (Garcia-Marques, Mackie, Claypool, & Garcia-Marques, 2004). On the other hand, recent evidence suggests that, when positivity is dissociated from fluency, it is fluency, not positivity, that affects truth (Unkelbach, Bayer, Alves, Koch, & Stahl, 2009); furthermore, it has been shown that the interpretation of fluency as “true” is learned (Unkelbach, 2007) and involves controlled cognitive processing, at least to some extent (Unkelbach & Stahl, 2008). Therefore, it is not yet clear whether and how a direct positive–true link exists and the extent to which this possible association is established automatically or by controlled processing. Thus, the role of affect in the truth effect is not fully understood, and future research on the truth effect and fluency effects in general is needed to resolve this question.

The present results show that the truth effect and the mere exposure effect share some moderating variables. Some of the results obtained by Bornstein (1989) in a seminal meta-analysis on the mere exposure effect correspond with our findings on the truth effect: Both effects occur independently from exposure duration (except for a tendency toward a smaller truth effect for participant-paced presentation that is not easily interpreted in terms of duration). Furthermore, both effects are moderated by list type: They are found mainly under heterogeneous presentation conditions. The latter finding may stimulate research on the similarity of contextual influences on the fluency experience in the truth and mere exposure effects (e.g., Dechêne et al., 2009; Willems & Van der Linden, 2006).

Somewhat in contrast to the mere exposure effect, we found no clear influence of further repetitions on the truth effect. (Note, however, that the effect of a second repetition on the between-items criterion is consistent with the mere exposure literature that also used this criterion.) The lack of a clear effect may be a result of the small number of studies available for that specific analysis. It may also be argued that a judgment about truth is qualitatively different from a liking judgment: Whereas a liking judgment is by definition based on subjective experiences, a truth judgment can always be seen against the background of an objective value. This may lead to a ceiling effect in judgments because a statement cannot be “truer” than “definitely true” by definition. In either case, beyond noting important similarities, future research should focus on qualitative differences between different types of fluency-based judgments (for an overview of similarities, see Alter & Oppenheimer, 2009).

Methodological Note

The present observations suggest some obvious methodological recommendations. First, researchers interested in effects of processing fluency should select the critical comparison (between items or within items) in a principled manner, based on theoretical considerations. Second, a principled choice of judgment context and study design is probably even more important: Researchers should be sensitive to the possible contextual effects on the comparison standard that drives fluency effects on judgments. More generally speaking, preferring a within-participants over a between-participants design may not only affect statistical power but also affect the context in which observations are made, which in turn may substantially alter the psychological processes involved. Systematic experimental investigations of this possibility are highly desirable, and not only in the truth effect domain.

Practical Relevance

Effects of repetition have received high prominence in the fields of advertising and persuasion (e.g., Batra & Ray, 1986; Holden & Vanhuele, 1999; Nordhielm, 2002; Roggeveen & Johar, 2002, 2007; Schumann, Petty, & Clemons, 1990). Here, communication effectiveness is the focus of research on applications of repetition, and repetition effects are seen as based on two factors or phases that influence message response (Berlyne, 1970): The first is a “wear-in” phase of positive habituation and increasingly positive responses. It may be followed by a “wear-out” phase, characterized by too much repetition, onset of tedium, and therefore decreasing message effectiveness. The findings of the present analysis mainly contribute to our understanding of the “wear-in” phase because data are scarce for the effects of more than one repetition (although there is a tendency for the truth effect to wear off after one or two repetitions).

An important factor for advertising effectiveness is the consumers’ level of processing (e.g., Anand & Sternthal, 1990; Greenwald & Leavitt, 1984; Hawkins & Hoch, 1992). This factor is thought to influence whether people encode surface features of stimuli (i.e., low level of processing) or whether they evaluate the semantic content of a stimulus (i.e., higher level of processing). The present analysis showed that lower levels of processing reveal greater effects of repetition on truth judgments; that is, a low level of processing at the time of initial encoding leads to a more pronounced truth effect. Thus, in the case of an advertising strategy based on repetition, it is beneficial for advertisers when consumers engage in shallower processing. Similar results have been obtained for liking judgments on advertisements (Nordhielm, 2002). Although the present meta-analysis did not obtain a sufficient number of effect sizes to investigate this factor, source credibility may also be highly influential (e.g., Petty & Cacioppo, 1986). More research is needed to investigate the relative roles and the interplay of levels of processing at initial encoding versus at judgment.

A related point is the strategy of advertising variation (e.g., Schumann et al., 1990): This approach holds it to be beneficial when an advertisement’s content varies over repetitions. However, the present results suggest that an exact, verbatim repetition is more effective in changing consumers’ belief in a marketing claim. The present findings suggest that perhaps
a combination of repetition of core elements (i.e., to benefit from implicit memory effects) and variation of peripheral elements (i.e., to counter wear-out effects of tedium) may be most effective. Finally, the finding that the truth effect is largely independent of duration of stimulus exposure, delay, and other variables leads to the conclusion that it is a reliable phenomenon with relevant practical impact on advertising and persuasion processes.

Conclusions

Two main conclusions can be drawn. First, the present review has even more firmly established the existence of the truth effect: Repeated presentation increases participants’ subjective judgments of a statement’s truth. The effect is of medium size and occurs under a variety of conditions. Second, two components of this effect can be distinguished, both theoretically and empirically: On a theoretical level, the psychological processes underlying the truth effect differ between homogeneous and heterogeneous measurement contexts; empirically, the within-items effect and the between-items effect were of different magnitude, and they were differently affected by moderating variables. These conclusions support an important role of fluency (cf. Alter & Oppenheimer, 2009), and they suggest that fluency effects are moderated by the context in which judgments are performed. The next generation of research on fluency effects on judgments should focus on such contextual moderation.

Authors’ Note

Alice Dechêne and Christoph Stahl equally contributed to this work.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interests with respect to the authorship and/or publication of this article.

Financial Disclosure/Funding

The authors received no financial support for the research and/or authorship of this article.

Notes

1. To the best of our knowledge, the zipper was invented in Switzerland, and thus the statement, “The zipper was invented in Norway,” is false.
2. This distinction between referential validity and convergent validity is mirrored in remember versus know judgments regarding specific memories (Gardiner & Richardson-Klavehn, 2000).
3. There is theoretical debate as to whether the process by which fluency affects truth judgments is one of misattribution (e.g., Whittlesea & Williams, 1998) or whether it represents a valid use of environmental cues (e.g., Hertwig, Herzog, Schooler, & Reimer, 2008; Unkelbach, 2007). This distinction is not in the focus of the present review, and we therefore remain neutral in this debate.

References

References marked with an asterisk indicate studies included in the meta-analysis.

Alter, A. L., & Oppenheimer, D. M. (2009). Using the tribes of fluency to form a metacognitive nation. Personality and Social Psychology Review, 13, 219-235.
Anand, P., & Sternthal, B. (1990). Ease of message processing as a moderator of repetition effects in advertising. Journal of Marketing Research, 26, 345-353.
*Arkes, H. R., Boehm, L. E., & Xu, G. (1991). Determinants of judged validity. Journal of Experimental Social Psychology, 27, 576-605.
*Arkes, H. R., Hackett, C., & Boehm, L. (1989). The generality of the relation between familiarity and judged validity. Journal of Behavioral Decision Making, 2, 81-94.
*Bacon, F. T. (1979). Credibility of repeated statements: Memory for trivia. Journal of Experimental Psychology: Human Learning and Memory, 5, 241-252.
Batra, R., & Ray, M. L. (1986). Situational effects of advertising repetition: The moderating influence of motivation, ability, and opportunity to respond. Journal of Consumer Research, 12, 432-445.
*Begg, I., Anas, A., & Farinacci, S. (1992). Dissociation of processes in belief: Source recollection, statement familiarity, and the illusion of truth. Journal of Experimental Psychology: General, 121, 446-458.
*Begg, I., & Armour, V. (1991). Repetition and the ring of truth: Biasing comments. Canadian Journal of Behavioural Science, 23, 195-213.
*Begg, I., Armour, V., & Kerr, T. (1985). On believing what we remember. Canadian Journal of Behavioural Science, 17, 199-214.
Berlyne, D. E. (1970). Novelty, complexity, and hedonic value. Perception and Psychophysics, 8, 279-286.
Bornstein, R. F. (1989). Exposure and affect: Overview and meta-analysis of research, 1968-1987. Psychological Bulletin, 106, 265-289.
Bornstein, R. F., & D’Agostino, P. R. (1992). Stimulus recognition and the mere exposure effect. Journal of Personality and Social Psychology, 63, 545-552.
Bornstein, R. F., & D’Agostino, P. R. (1994). The attribution and discounting of perceptual fluency: Preliminary tests of a perceptual fluency/attributional model of the mere exposure effect. Social Cognition, 12, 103-128.
*Brown, A. S., & Nix, L. A. (1996). Turning lies into truths: Referential validation of falsehoods. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1088-1100.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Cooper, H. (1998). Synthesizing research: A guide for literature reviews (3rd ed.). Thousand Oaks, CA: Sage.
Cooper, H., & Hedges, L. V. (1994). The handbook of research synthesis. New York: Russell Sage.
*Dechêne, A. (2005). Hängt die Wahrheit von der Formulierung ab? Einflüsse von syntaktischer Veränderung und Verarbeitungstiefe
auf den truth effect [Influence of syntactical changes and level of processing on the truth effect]. Unpublished master’s thesis, University of Basel, Switzerland.
*Dechêne, A., Stahl, C., Hansen, J., & Wänke, M. (2009). Mix me a list: Context moderates the truth effect and the mere exposure effect. Journal of Experimental Social Psychology, 45, 1117-1122.
*Dechêne, A., & Wänke, M. (2006a). [Truth effect and faith in intuition]. Unpublished raw data.
*Dechêne, A., & Wänke, M. (2006b). [Truth judgments under processing constraints]. Unpublished raw data.
*Doland, C. A. (1999). Repeating is believing: An investigation of the illusory truth effect. Dissertation Abstracts International, 60(4-B), pp. 1879. (UMI No. 9927617).
Dunlap, W. P., Cortina, J. M., Vaslow, J. B., & Burke, M. J. (1996). Meta-analysis of experiments with matched groups or repeated measure designs. Psychological Methods, 1, 170-177.
Epstein, S., Pacini, R., Denes-Raj, V., & Heier, H. (1996). Individual differences in intuitive–experiential and analytical–rational thinking styles. Journal of Personality and Social Psychology, 71, 390-405.
Garcia-Marques, T., Mackie, D. M., Claypool, H. M., & Garcia-Marques, L. (2004). Positivity can cue familiarity. Personality and Social Psychology Bulletin, 30, 585-593.
Gardiner, J. M., & Richardson-Klavehn, A. (2000). Remembering and knowing. In E. Tulving & F. I. M. Craik (Eds.), The Oxford handbook of memory (pp. 229-244). New York: Oxford University Press.
*Gigerenzer, G. (1984). External validity of laboratory experiments: The frequency-validity relationship. American Journal of Psychology, 97, 185-195.
Gilbert, D. T., Krull, D. S., & Malone, P. S. (1990). Unbelieving the unbelievable: Some problems in the rejection of false information. Journal of Personality and Social Psychology, 59, 601-613.
Gilbert, D. T., Tafarodi, R. W., & Malone, P. S. (1993). You can’t not believe everything you read. Journal of Personality and Social Psychology, 65, 221-233.
Greenwald, A., & Leavitt, C. (1984). Audience involvement in advertising: Four levels. Journal of Consumer Research, 11, 581-591.
Hansen, J., Dechêne, A., & Wänke, M. (2008). Discrepant fluency increases subjective truth. Journal of Experimental Social Psychology, 44, 687-691.
Hansen, J., & Wänke, M. (2008). It’s the difference that counts: Expectancy/experience discrepancy moderates the use of ease of retrieval in attitude judgments. Social Cognition, 26, 447-468.
*Hasher, L., Goldstein, D., & Toppino, T. (1977). Frequency and the conference of referential validity. Journal of Verbal Learning and Verbal Behavior, 16, 107-112.
*Hawkins, S. A., & Hoch, S. J. (1992). Low-involvement learning: Memory without evaluation. Journal of Consumer Research, 19, 212-225.
*Hawkins, S. A., Hoch, S. J., & Meyers-Levy, J. (2001). Low-involvement learning: Repetition and coherence in familiarity and belief. Journal of Consumer Psychology, 11, 1-11.
Hedges, L. V. (1983). A random effects model for effect sizes. Psychological Bulletin, 93, 388-395.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Hedges, L. V., & Vevea, J. D. (1998). Fixed- and random-effects models in meta-analysis. Psychological Methods, 3, 486-504.
Hertwig, R., Herzog, S. M., Schooler, L. J., & Reimer, T. (2008). Fluency heuristic: A model of how the mind exploits a by-product of information retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 1191-1206.
Holden, S. J. S., & Vanhuele, M. (1999). Know the name, forget the exposure: Brand familiarity versus memory of exposure context. Psychology and Marketing, 16, 479-496.
Hurtt, R. K. (1999). Sceptical about scepticism: Instrument development and experimental validation. Salt Lake City: University of Utah.
Jacoby, L. L., Kelley, C., Brown, J., & Jasechko, J. (1989). Becoming famous over night: Limits on the ability to avoid unconscious influences of the past. Journal of Personality and Social Psychology, 56, 326-338.
Jacoby, L. L., & Whitehouse, K. (1989). An illusion of memory: False recognition influenced by unconscious perception. Journal of Experimental Psychology: General, 118, 126-135.
*Kim, C. (2002). The role of individual differences in general scepticism in the illusory truth effect. Dissertation Abstracts International, 63(05), 1913A. (UMI No. 3053399).
Krosnick, J. A. (1991). The stability of political preferences: Comparisons of symbolic and non-symbolic attitudes. American Journal of Political Science, 35, 547-576.
Krosnick, J. A., & Fabrigar, L. R. (1997). Designing rating scales for effective measurement in surveys. In L. Lyberg, P. Biemer, M. Collins, E. de Leeuw, C. Dippo, N. Schwarz, & D. Trewin (Eds.), Survey measurement and process quality (pp. 141-164). New York: John Wiley.
Law, S. (1999). Investigating the truth effect in young and elderly consumers: The role of recognition and source memory. Dissertation Abstracts International, 60(1-A), Jul 1999, pp. 0191. (UMI No. AAMNQ35219).
*Law, S., & Hawkins, S. A. (1997). Advertising repetition and consumer beliefs: The role of source memory. In W. D. Wells (Ed.), Measuring advertising effectiveness: Advertising and consumer psychology (pp. 67-75). Mahwah, NJ: Lawrence Erlbaum.
*Law, S., Hawkins, S. A., & Craik, F. I. M. (1998). Repetition-induced belief in the elderly: Rehabilitating age-related memory deficits. Journal of Consumer Research, 25, 91-107.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.
*Mitchell, J. P., Dodson, C. S., & Schacter, D. L. (2005). fMRI evidence for the role of recollection in suppressing misattribution errors: The illusory truth effect. Journal of Cognitive Neuroscience, 17, 800-810.
*Mitchell, J. P., Sullivan, A. L., Schacter, D. L., & Budson, A. E. (2006). Misattribution errors in Alzheimer’s disease: The illusory truth effect. Neuropsychology, 20, 185-192.
*Mutter, S. A., Lindsey, S. E., & Pliske, R. M. (1995). Aging and credibility judgment. Aging and Cognition, 2, 89-107.
Nordhielm, C. L. (2002). The influence of level of processing on advertising repetition effects. Journal of Consumer Research, 29, 371-382.
Obermiller, C., & Spangenberg, E. R. (1998). Development of a scale to measure consumer skepticism toward advertising. Journal of Consumer Psychology, 7, 159-186.
Parks, C. M., & Toth, J. P. (2006). Fluency, familiarity, aging, and the illusion of truth. Aging, Neuropsychology, and Cognition, 13, 225-253.
Petty, R. E., & Cacioppo, J. T. (1986). The elaboration likelihood model of persuasion. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 19, pp. 123-205). New York: Academic Press.
Raudenbush, S. W. (1994). Random effect models. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 301-321). New York: Russell Sage.
Reber, R., & Schwarz, N. (1999). Effects of perceptual fluency on judgments of truth. Consciousness and Cognition, 8, 338-342.
Reber, R., Schwarz, N., & Winkielman, P. (2004). Processing fluency and aesthetic pleasure: Is beauty in the perceiver’s processing experience? Personality and Social Psychology Review, 8, 364-382.
Reber, R., Winkielman, P., & Schwarz, N. (1998). Effects of perceptual fluency on affective judgments. Psychological Science, 9, 45-48.
Roediger, H. L., III. (1990). Implicit memory: Retention without thinking. American Psychologist, 45, 1043-1056.
Roediger, H. L., III, & McDermott, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 803-814.
*Roggeveen, A. L., & Johar, G. V. (2002). Perceived source variability versus familiarity: Testing competing explanations for the truth effect. Journal of Consumer Psychology, 12, 81-91.
Roggeveen, A. L., & Johar, G. V. (2007). Changing false beliefs from repeated advertising: The role of claim-refutation alignment. Journal of Consumer Psychology, 17, 118-127.
Rosenthal, R., & Rubin, D. B. (1986). Meta-analytic procedures for combining studies with multiple effect sizes. Psychological Bulletin, 99, 400-406.
Schmidt, F. L., Oh, I., & Hayes, T. L. (2009). Fixed versus random effects models in meta-analysis: Model properties and an empirical comparison of differences in results. British Journal of Mathematical and Statistical Psychology, 13, 97-128.
*Schwartz, M. (1982). Repetition and rated truth value of statements. American Journal of Psychology, 95, 393-407.
Schumann, D. W., Petty, R. E., & Clemons, D. S. (1990). Predicting the effectiveness of different strategies of advertising variation: A test of the repetition-advertising hypothesis. Journal of Consumer Research, 17, 192-202.
Simmons, J. P., & Nelson, L. D. (2006). Intuitive confidence: Choosing between intuitive and nonintuitive alternatives. Journal of Experimental Psychology: General, 135, 409-428.
Skurnik, I., Yoon, C., Park, D. C., & Schwarz, N. (2005). How warnings about false claims become recommendations. Journal of Consumer Research, 31, 713-724.
*Toppino, T. C., & Brochin, H. A. (1989). Learning from tests: The case of true-false examinations. Journal of Educational Research, 83, 119-124.
Unkelbach, C. (2006). The learned interpretation of cognitive fluency. Psychological Science, 12, 339-345.
Unkelbach, C. (2007). Reversing the truth effect: Learning the interpretation of processing fluency in judgements of truth. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 219-230.
Unkelbach, C., Bayer, M., Alves, H., Koch, A., & Stahl, C. (2009). Fluency and positivity as possible causes of the truth effect. Manuscript submitted for publication.
Unkelbach, C., & Stahl, C. (2008). A multinomial modelling approach to dissociate different components of the truth effect. Consciousness and Cognition, 18, 22-38.
Westerman, D. (2008). Relative fluency and illusions of recognition memory. Psychonomic Bulletin & Review, 15, 1196-1200.
Whittlesea, B. W. A. (1993). Illusions of familiarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 1235-1253.
Whittlesea, B. W. A., & Leboe, J. P. (2003). Two fluency heuristics (and how to tell them apart). Journal of Memory and Language, 49, 62-79.
Whittlesea, B. W. A., & Williams, L. D. (1998). Why do strangers feel familiar, but friends don’t? A discrepancy-attribution account of feelings of familiarity. Acta Psychologica, 98, 141-165.
Whittlesea, B. W. A., & Williams, L. D. (2000). The source of feelings of familiarity: The discrepancy-attribution hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 547-565.
Whittlesea, B. W. A., & Williams, L. D. (2001a). The discrepancy-attribution hypothesis: I. The heuristic basis of feeling of familiarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 3-13.
Whittlesea, B. W. A., & Williams, L. D. (2001b). The discrepancy-attribution hypothesis: II. Expectation, uncertainty, surprise, and feelings of familiarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 14-33.
Willems, S., & Van der Linden, M. (2006). Mere exposure effect: A consequence of direct and indirect fluency-preference links. Consciousness and Cognition, 15, 323-341.
Wilson, M. W. (2003). Those confounded moderators in meta-analysis: Good, bad, and ugly. Annals of the American Academy of Political and Social Science, 587, 69-81.
Winkielman, P., & Cacioppo, J. T. (2001). Mind at ease puts a smile on the face: Psychophysiological evidence that processing facilitation elicits positive affect. Journal of Personality and Social Psychology, 81, 989-1000.
Zajonc, R. B. (1968). Attitudinal effects of mere exposure. Journal of Personality and Social Psychology Monograph Supplement, 9, 1-27.
