
Auditing with Data and Analytics:

External Reviewers’ Judgments of Audit Quality and Effort

Scott A. Emett*
Arizona State University
scottemett@asu.edu

Steven E. Kaplan
Arizona State University
Steve.Kaplan@asu.edu

Elaine G. Mauldin
University of Missouri-Columbia
mauldin@missouri.edu

Jeffrey S. Pickerd
The University of Mississippi
jspicker@olemiss.edu

Running Head: Auditing with Data and Analytics: External Reviewers’ Judgments of Audit
Quality and Effort

*Corresponding author. We are grateful for the generous research support received from the
American Institute of Certified Public Accountants and its members who participated in the
study. We gratefully acknowledge the valuable feedback received from Michelle Craig and Carl
Mayes from the AICPA, Kendall Bowlin, Kate Brightbill, Jonathan Chipman, Dennis Eggett,
Devon Erickson, Steve Glover, Steve Kachelmeier, Lisa Koonce, Bob Libby, Christy Nielson,
Sam Pimmentel, Steve Salterio, Chad Simon, Scott Vandervelde, Brian White, David Wood, and
workshop participants at Brigham Young University, Nanyang Technological University, Utah
State University, University of Texas at Austin, and the 2020 Spark Conference.
Auditing with Data and Analytics:
External Reviewers’ Judgments of Audit Quality and Effort

ABSTRACT
Audit firms hesitate to take full advantage of data and analytics (D&A) audit approaches because
they lack certainty about how external reviewers evaluate those approaches. We propose that
external reviewers use an effort heuristic when evaluating audit quality, judging less effortful
audit procedures as lower quality, which could shape how external reviewers evaluate D&A
audit procedures. We conduct two experiments in which experienced external reviewers evaluate
one set of audit procedures (D&A or traditional) within an engagement review, while holding
constant the procedures’ level of assurance. Our first experiment provides evidence that external
reviewers rely on an effort heuristic when evaluating D&A audit procedures—they perceive
D&A audit procedures as lower in quality than traditional audit procedures because they
perceive them to be less effortful. Our second experiment confirms these results and evaluates a
theory-based intervention that reduces reviewers’ reliance on the effort heuristic, causing them to
judge quality similarly across D&A and traditional audit procedures.
Keywords: Data and analytics; external reviewers; audit quality; effort heuristic
JEL Classifications: M41; M42
1. Introduction

We examine whether external reviewers use an effort heuristic when evaluating data and

analytics (D&A) audit procedures, causing them to perceive audit procedures using D&A as

lower quality than traditional audit procedures. 1 Understanding whether external reviewers

perceive D&A as lower quality than traditional audit procedures is an important research

question because auditors have expressed concern about how external reviewers evaluate D&A

audit approaches, which could slow adoption of innovations that support audit effectiveness and

efficiency (Cao et al. 2022; Austin et al. 2021; Eilifsen et al. 2020; Kang et al. 2020). 2

Regulators themselves say they recognize the value of D&A approaches and indicate that current

standards should not impede their adoption. Instead, they attribute lack of wider

adoption, in part, to audit firms’ risk aversion (PCAOB 2019, 2021; Austin et al. 2021; Christ et

al. 2021). Thus, auditors and regulators express starkly different views. Bolstering auditors’

concerns, recent PCAOB inspection observations highlight deficiencies in the use of

“technology-based” audit tools (PCAOB 2022).

A rich literature finds practicing auditors often rely on heuristics that generally operate

outside of conscious awareness (e.g., Griffith et al. 2016). External reviewers are typically

practicing or former auditors, so they are likely subject to these same heuristics. Thus, external

reviewers may use heuristics to evaluate D&A audit approaches in ways they do not intend and

of which they are not consciously aware. Such a dynamic could account for these different views

of regulators and auditors. Figure 1 illustrates the theoretical framework, based on the effort

1
D&A tools include a variety of technology-based tools and techniques, including 100 percent population testing,
visualization, text mining, predictive analytics, and artificial intelligence. We examine one common form of D&A
tools, 100 percent population testing.
2
We use the term external reviewer to refer to an individual conducting an independent post-audit review of audit
quality. In practice, internal quality reviewers, AICPA peer reviewers, and PCAOB inspectors conduct forms of
external review.

heuristic, that guides our predictions and experimental tests. Prior psychology research shows

that, when quality is difficult to assess, individuals tend to use effort as a heuristic cue of quality

(e.g., see Kruger et al. 2004; Morales 2005; Kim and Labroo 2011; Franco-Watkins et al. 2013;

Schrift et al. 2016). When adopting the effort heuristic, people assume that less effort results in

lower quality. D&A approaches rely on D&A tools that typically require less manual effort from

the engagement team than traditional approaches (e.g., see EY 2017; KPMG 2017). Accordingly,

we first predict that, holding constant the level of assurance provided by audit procedures,

external reviewers will evaluate D&A as lower quality than traditional audit procedures.

Prior research also provides evidence that people rely more (less) on the effort heuristic

when they are primed to believe that effort is (is not) essential to quality (Cho and Schwarz

2008; Schrift et al. 2016; Cheng et al. 2017). We therefore predict that external reviewers will

evaluate D&A as lower quality than traditional audit procedures more when primed with a theory

of audit quality that emphasizes the importance of audit effort (“effort-is-essential” prime) than

when primed with a theory of audit quality that emphasizes how audit effort can be substituted

with other factors like audit execution (“effort-can-be-substituted” prime).

We design two complementary experiments to test these predictions. Experiment 1 uses a

1x2 between-participants design to test our first prediction. Participants assume the role of an

external reviewer conducting an evaluation of a single set of audit procedures within an AICPA

peer engagement review. 3 We manipulate audit approach between-participants (D&A procedures

or traditional procedures). In the D&A procedures condition, the audit team used D&A tools to

3
We use an AICPA peer review setting rather than a PCAOB inspection setting for our experiment because many of
our participants have experience as peer reviewers. Approximately 21,400 audit firms are currently enrolled in the
AICPA peer review program and receive peer reviews once every three years (AICPA 2022a). Section 5 discusses
interviews we conducted with current and former PCAOB inspectors. Those interviews provide evidence that our
theoretical framework (Figure 1) likely also applies to the PCAOB inspection setting.

identify all three-way match exceptions in the population of the company’s sales transactions.

Consistent with auditors’ use of D&A procedures, there were too many exceptions to examine

individually, and consequently, the audit team manually tested a sample of the exceptions to

identify misstatements (AICPA 2017; Barr-Pulliam et al. 2020). In the traditional procedures

condition, the audit team used traditional procedures to manually test a sample from the

population of transactions to identify exceptions with misstatements. Both approaches projected

misstatement to the population of transactions using statistical sampling tools. Importantly, we

hold the population of transactions and the actual frequency of exceptions and errors constant,

and we provide participants the same 95% confidence interval for the misstatement range in each

condition. Thus, the level of assurance is equivalent in both approaches (see online Appendix 1).

Results from 60 audit partners and senior managers with review experience show that

external reviewers judge the D&A as lower quality than the traditional audit procedures. Further

analyses find that external reviewers perceive that D&A takes less effort than traditional audit

procedures, and their perceptions of less effort in turn produce their judgments of lower audit

procedure quality. Additional analyses rule out other cognitive mechanisms that could explain

these results. Because D&A tools identify all exceptions, external reviewers could assess lower

quality simply because D&A procedures make the number of total and unexamined exceptions

more salient. Analyses provide evidence that perceived audit quality is driven by perceived effort

and not by the salience of total or unexamined exceptions. In a similar manner, we also rule out

perceptions of audit procedure risk and general technology beliefs as alternative explanations. To

more firmly rule out these and other potential alternative explanations, we designed Experiment

2 to provide more direct tests of the effort heuristic on perceived audit quality.

Experiment 2 uses a 2x2 between-participants design to further test our theory and avoid

potential correlated omitted variable problems in Experiment 1. We use the task and

manipulation of audit approach (D&A procedures or traditional procedures) from Experiment 1.

Our second manipulated variable evaluates a theory-based priming intervention designed to

reduce external reviewers’ reliance on perceived effort when assessing audit quality. Before

beginning the engagement review, participants read an excerpt from a fictional speech given by

the global head of audit quality for a large, international audit firm. We manipulate the prime

within the speech by emphasizing either the importance of audit effort (“effort-is-essential”) or

how audit effort can be substituted with audit execution (“effort-can-be-substituted”).

Importantly, the speech does not mention an audit approach or related concepts.

Results from 98 very experienced auditors (92 percent audit partners) with external

review experience show that participants primed with the “effort-is-essential” speech make

judgments consistent with those in Experiment 1, again judging the D&A as lower quality than

the traditional procedures. In contrast, participants primed with the “effort-can-be-substituted”

speech do not display this pattern, instead judging the D&A as similar in quality to the traditional

procedures. Further analyses find participants in both priming conditions perceive that D&A

takes less effort than traditional procedures. However, only participants in the “effort-is-

essential” priming condition rely on the effort heuristic, using effort as a signal of audit

procedure quality. Overall, our results provide evidence that external reviewers naturally rely on

the effort heuristic to judge audit procedure quality and that priming an alternative theory of

audit quality reduces their reliance on the effort heuristic.

Our study contributes to research on D&A procedures and practice. Our theory-based

evidence, including ruling out plausible alternative mechanisms, suggests regulatory scrutiny of

D&A audit procedures may stem from external reviewers relying on the effort heuristic, rather

than the quality of the procedures themselves. This evidence helps to account for the different

views expressed by auditors and regulators over how external reviewers evaluate D&A

approaches (Austin et al. 2021; Christ et al. 2021). Thus, regulators could provide guidance or

standards that explicitly address quality criteria for D&A audit approaches and provide training

for external reviewers related to the effort heuristic. In addition, given the tendency for external

reviewers to use the effort heuristic, auditors, in their interactions with external reviewers, could

consider highlighting all their efforts devoted to developing and using D&A audit approaches.

Our study also contributes to research by demonstrating that highly experienced external

reviewers are susceptible to the effort heuristic even when provided with the underlying evidence

showing how the auditor reached a conclusion.

2. Institutional background and hypotheses development

D&A and traditional audit approaches

Audit firms now use numerous D&A tools in audits (Deloitte 2016; EY 2017; KPMG 2017). We

study D&A tools used in substantive tests of revenue transactions, where D&A audit procedures

can effectively and efficiently replace traditional audit procedures. Historically, auditors have

used sampling procedures for substantive tests of large populations of transactions (Elder et al.

2013). In these traditional audit approaches, auditors (1) use sampling techniques to select a

subset of transactions from the population, (2) examine each transaction in the sample and

identify exceptions, (3) perform further tests of identified exceptions to identify misstatement, if

any, and (4) extrapolate misstatements to the population of transactions. Traditional audit

approaches typically involve substantial manual effort by the engagement team to identify and

test exceptions.

In contrast, in this setting, D&A audit approaches use D&A tools to initially identify all

exceptions in the population of transactions (e.g., EY 2017; KPMG 2017). However, D&A tools

often identify a large number of exceptions such that performing further tests to identify

misstatements on all identified exceptions would be cost-prohibitive (AICPA 2017; Barr-Pulliam

et al. 2020). When this occurs, audit regulators and guidance from the profession encourage

auditors to perform further tests on a sample of the identified exceptions (AICPA 2017). 4 Thus,

in these D&A audit approaches, auditors (1) use technology-based tools to identify all exceptions

in the population, (2) use sampling techniques to select a subset of exceptions, (3) perform

further tests to identify misstatement, if any, and (4) extrapolate misstatement to the population

of exceptions. D&A audit approaches tend to involve less manual effort by the engagement team

than traditional approaches because D&A tools, not manual procedures, identify exceptions and

then only exceptions are sampled for further tests. 5 For a given population of transactions with a

given number of exceptions and errors, D&A audit approaches can achieve the same level of

assurance as traditional audit approaches with a smaller sample (see online Appendix 1 for a

mathematical demonstration based on our experimental materials).
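To build intuition for why the smaller sample can deliver equivalent assurance, the following back-of-the-envelope sketch in Python compares the sampling error of the two designs using the figures from our experimental materials. It is only an illustration under textbook assumptions (simple random sampling, normal approximation, no finite-population correction), not the formal demonstration in online Appendix 1, and the per-exception misstatement standard deviation (sd_err) is a hypothetical input.

    import math

    POP = 52_600      # transactions in the population (from the experimental materials)
    EXC = 16_306      # exceptions the D&A tools identify (from the experimental materials)
    p = EXC / POP     # exception rate, roughly 31%
    mean_err = 510.0  # average misstatement per exception (from the experimental materials)
    sd_err = 400.0    # hypothetical s.d. of misstatement among exceptions
    z = 1.96          # 95% confidence

    # D&A design: the number of exceptions is known exactly, so sampling error
    # comes only from estimating the mean misstatement per exception.
    n_dna = 36
    se_dna = EXC * sd_err / math.sqrt(n_dna)

    # Traditional design: the exception rate AND the mean misstatement are both
    # estimated from the sample, so two error components combine (delta method).
    n_trad = 155
    k = round(p * n_trad)  # expected exceptions in the sample, about 48
    se_rate = POP * mean_err * math.sqrt(p * (1 - p) / n_trad)
    se_mean = POP * p * sd_err / math.sqrt(k)
    se_trad = math.sqrt(se_rate ** 2 + se_mean ** 2)

    point = EXC * mean_err  # both designs share this point estimate, about $8.3 million
    for label, se in (("D&A, n = 36", se_dna), ("traditional, n = 155", se_trad)):
        print(f"{label}: ${point:,.0f} +/- ${z * se:,.0f}")

With these inputs, the 36-item D&A sample already produces an interval at least as tight as the 155-item traditional sample, because knowing the exact exception count eliminates one source of sampling error.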

Initially, audit research sought to understand the uses and challenges of D&A approaches

at a theoretical level (e.g., Alles 2015; Brown-Liburd et al. 2015; Cao et al. 2015; Gepp et al.

2018). More recent research provides qualitative field evidence on how D&A audit approaches

are perceived by auditors, peer reviewers, and standard-setters (Walker and Brown-Liburd 2019;

Austin et al. 2021; Christ et al. 2021). In addition, experimental research focuses on how D&A

4
We conducted unstructured interviews with managers and partners from three Big 4 audit firms about D&A audit
approaches. These interviews confirmed that all three firms have established audit methodology that allows auditors
to sample identified exceptions and perform further tests on the sample.
5
We note that audit firms spend significant resources (e.g. financial resources, professional expertise, time, etc.) to
develop and test D&A audit approaches (Kapoor 2020). However, individuals tend to ignore the effort involved in
the development and implementation of an automated system (Naquin and Kurtzberg 2004; Waytz et al. 2014). See
Section 5 for further discussion.

tools influence auditor judgments (Anderson et al. 2021; Bibler et al. 2023; Commerford et al.

2022; Peters 2023; Barr-Pulliam et al. 2020; Koreff and Perreault 2023) and on how auditors’

use of D&A tools influences investor, manager, juror, and peer reviewer judgments (Ballou et al.

2021; Barr-Pulliam et al. 2022; Kipp et al. 2020).

The qualitative and experimental research on peer reviewers most closely relates to our

study. As noted in the introduction, qualitative field interviews with auditors, managers, and

regulators provide evidence that (1) auditors hesitate to take full advantage of D&A approaches

due to uncertainty about how external reviewers evaluate such approaches and (2) some auditors

express frustration that external reviewers have restricted or penalized such approaches (Austin

et al. 2021; Christ et al. 2021; Eilifsen et al. 2020). For example, one audit partner cited in Austin

et al. (2021, 1911) notes, “The firm is pushing the use of analytics, but there is still hesitation

because of impact on inspections.” A PCAOB regulator quoted in Austin et al. (2021, 1912)

confirms this dynamic: “Many of the partners have gotten really frustrated [with PCAOB

scrutiny of D&A approaches] … and so they’ve just kind of thrown in the towel and said we’ll

do substantive tests of detail.” Consistent with external reviewer scrutiny of D&A approaches,

the PCAOB recently singled out deficiencies in “technology-based tools” when providing an

overview of inspection observations (PCAOB 2022).

Yet, regulators indicate current standards should not penalize or impede adoption of

D&A approaches, and findings from Ballou et al. (2021) suggest that external reviewers do not

intend to judge D&A as lower quality than traditional audit approaches. We complement and

extend this research by directly testing a theory-based explanation for why external reviewers

may sometimes judge D&A as lower quality than traditional audit approaches and a priming

intervention to mitigate the bias.

External reviews of audit quality

The PCAOB and AICPA both conduct external reviews that require independent reviewers to

evaluate audit quality and issue reports to stakeholders (Lennox and Pittman 2010). Currently,

the AICPA (PCAOB) conducts external reviews for private (public) company audits. In addition,

firms themselves conduct internal quality control reviews that are external to the audit

engagement (Houston and Stefaniak 2013; Stefaniak et al. 2017).

Prior research provides evidence that external review programs have a powerful influence

on audit practice (Hanlon and Shroff 2022). Audit firms can either gain or lose clients based on

the outcomes of AICPA or PCAOB external reviews (e.g., Hilary and Lennox 2005; Aobdia and

Shroff 2017; Aobdia 2018). Accordingly, audit firms increase audit effort and other compliance

activities to placate external reviewers (e.g., Defond and Lennox 2017; Hanlon and Shroff 2022;

Stefaniak et al. 2017; Aobdia 2018; Johnson et al. 2019; Shefchik Bhaskar 2019; Westermann et

al. 2019). In general, this research suggests that external review programs increase audit quality

(e.g., Carcello et al. 2011; Carson et al. 2021; Lamoreaux 2016; Defond and Lennox 2017;

Krishnan et al. 2017).

However, external review programs can result in perverse effects, shaping audit practice

in ways that do not improve audit quality. For example, auditors modify audit procedures to

increase compliance with PCAOB inspections, even when auditors believe such modifications

will reduce audit quality (Austin et al. 2021; Glover et al. 2019; Johnson et al. 2019;

Westermann et al. 2019). When anticipating PCAOB inspections, auditors tend to modify audits

in ways that impair performance on lower-risk audits (Shefchik Bhaskar 2019). Similarly, Austin

et al. (2021, 1910) indicate that “our auditor interviewees … report altering their procedures such

that they perform one set of traditional audit tests for PCAOB inspections and another set of

value-added data analytic tests.” Thus, auditors sometimes respond to the prospect of external

reviews in ways that do not increase audit quality and may increase audit costs.

Theoretical framework and hypotheses development

Figure 1 illustrates the theoretical framework, based on the effort heuristic, that guides our

predictions. We first discuss the effort heuristic, leading to Hypothesis 1. Then, we discuss a

priming intervention to reduce the effect of the effort heuristic, leading to Hypothesis 2.

The effort heuristic and audit quality

Psychology research finds that effort exerted on a task or product impacts how people perceive

the task or product (see review by Inzlicht et al. 2018). Seminal work on cognitive dissonance

(e.g., Aronson and Mills 1959; Festinger and Carlsmith 1959) and the sunk cost fallacy (Thaler

1980; Arkes and Blumer 1985) find that individuals value objects more when they exert effort to

obtain or produce them. For instance, research on the “IKEA effect” demonstrates that people

tend to value products more when they assemble the products themselves (Sarstedt et al. 2016).

Other research on the effort heuristic suggests people not only value the effort they exert,

but also value the effort other people exert. Specifically, Kruger et al. (2004) propose that when

quality is difficult to assess, people use perceived effort as a heuristic cue of quality. For

example, self-identified art experts judge paintings they believe artists spent more time

producing as higher in quality and more valuable than identical paintings they believe artists

spent less time producing (Kruger et al. 2004). As with other heuristics, people can arrive at

optimal decisions in many contexts using the effort heuristic, but the effort heuristic leads to

suboptimal decision making in other contexts (e.g., Kahneman et al. 1982). People use the effort

heuristic in a variety of contexts, including consumer choice (e.g., Morales 2005; Kim and

Labroo 2011; Schrift et al. 2016) and negotiation (e.g., Franco-Watkins et al. 2013).

Prior audit research does not explicitly test whether auditors, external reviewers, or other

stakeholders use an effort heuristic when evaluating audit quality. Audit research has

documented that audit effort—measured using audit hours and audit fees—is positively

associated with various measures of audit quality (Caramanis and Lennox 2008; Lobo and Zhao

2013). The PCAOB has proposed using “audit hours”, “audit fees”, and “effort” as indicators of

audit quality (PCAOB 2015, 13).

D&A tools identify exceptions, reducing the manual effort engagement team members

need to exert on sampling and related tasks. We therefore expect that external reviewers will

perceive that D&A takes less effort by the engagement team than traditional audit procedures

(Link 1 in Figure 1). Audit quality is difficult to assess, and judging audit quality can be

especially difficult for external reviewers, who do not directly observe the engagement team’s

behavior. Consequently, we expect they will use the effort heuristic and judge D&A as lower in

quality than traditional procedures (Link 2 in Figure 1). This leads to the following prediction:

HYPOTHESIS 1. Holding constant the level of assurance provided by audit procedures,
external reviewers will judge D&A as lower in quality than traditional procedures.

We note that whether Hypothesis 1 will be supported is not without tension. While the research

in other settings discussed above provides support for the effort heuristic, the effort heuristic may

not generalize to the external review process. Judging the quality of an audit procedure follows a

process that is far less subjective than judging the quality of art. As an example, audit reviewers

have access to an “audit trail” that documents how the auditor reached a conclusion. Audit

reviewers can review these documents to determine whether the audit conclusions are

appropriate. In our setting, these documents show that D&A and traditional procedures are each

appropriate and provide the same level of assurance. To the extent that audit reviewers rely on

these supporting documents and conclusions, they are less likely to rely on the effort

heuristic to judge audit quality. In addition, cognitive mechanisms other than the effort heuristic,

such as the salience of exceptions or variations in risk and technology perceptions, could also

lead to perceptions of lower quality.

Priming intervention to reduce use of the effort heuristic

Schrift et al. (2016) argue that the “denying the antecedent” fallacy in conditional reasoning

underlies the effort heuristic. Specifically, believing that effort yields quality generally causes

people to conclude that a lack of effort yields a lack of quality. This reasoning is flawed because

other factors exist that can substitute for effort in producing quality. People frequently fall prey

to this fallacy because effort is a salient factor producing quality, while factors that substitute for

effort in producing quality are often not salient without being explicitly primed.

Prior research provides evidence that a variety of factors moderate the extent to which

people focus on effort when evaluating quality. Some of this research focuses on dispositional

variables that moderate use of the effort heuristic. For example, Cheng et al. (2017) provide

evidence that people who subscribe to the Protestant Work Ethic (PWE)—a core value that

emphasizes the importance of hard work—are more likely than those who do not subscribe to

PWE to use the effort heuristic in consumer choice settings. Other research focuses on situational

variables that moderate use of the effort heuristic. For example, Morales (2005) provides

evidence that consumers cease using the effort heuristic when they perceive that companies are

strategically signaling effort to persuade consumers. Thus, consumers do not reward effort that is

perceived to be strategic rather than genuine.

Another stream of research provides evidence that priming people to think about

substitutes for effort can reduce reliance on the effort heuristic (Cho and Schwarz 2008; Schrift

et al. 2016). For example, Cho and Schwarz (2008) ask participants to assess the value of

paintings after priming them with a speech that either focuses on how effort is essential to

producing good art (“effort-is-essential”) or focuses on how effort can be substituted by

creativity and talent to produce good art (“effort-can-be-substituted”). The “effort-is-essential”

speech emphasizes that “the most important thing that all great artists had in common was their

persistent effort and enduring hard work” and downplays the importance of innate talent (Cho

and Schwarz 2008, 210). In contrast, the “effort-can-be-substituted” speech downplays the

importance of effort and emphasizes how effort can be substituted with talent and creativity,

arguing “Without their talent, they would have never been able to create such influential

masterpieces only in a matter of days or sometimes even hours” (Cho and Schwarz 2008, 210).

The authors find that the “effort-is-essential” prime reinforces the effort heuristic while the

“effort-can-be-substituted” prime counteracts the effort heuristic, such that participants assign

similar valuations to “high effort” and “low effort” paintings after reading that prime.

Similarly, Schrift et al. (2016, 810) ask participants to read short statements that either

focus on the importance of effort (“A person who is willing and able to work hard and invest a

lot of effort will generate positive outcomes and success in life.”) or short statements

downplaying the importance of effort and emphasizing how effort can be substituted by other

factors (“Sometimes in life, we encounter extremely good opportunities that generate positive

outcomes even without working hard and investing too much effort.”). The authors find that

participants are more likely to use the effort heuristic when primed with “effort-is-essential”

statements than when primed with “effort-can-be-substituted” statements.

Guided by this research, we expect that priming external reviewers with different theories

of audit quality will moderate the extent to which they rely on the effort heuristic when

evaluating audit quality (Link 3 in Figure 1). Specifically, we expect external reviewers will use

the effort heuristic to judge audit quality when primed with theories of audit quality that

emphasize the importance of effort (“effort-is-essential”), but less so when primed with theories

of audit quality that emphasize how effort can be substituted with other factors like execution

(“effort-can-be-substituted”). We therefore make the following interaction prediction:

HYPOTHESIS 2. External reviewers will judge D&A as lower quality than traditional audit
procedures more when primed with an “effort-is-essential” theory of audit quality
than when primed with an “effort-can-be-substituted” theory of audit quality.

We note that whether Hypothesis 2 will be supported is not without tension, for at least three

reasons. First, external reviewers have many years of audit experience, and their views about

audit effort and audit quality may therefore be less malleable than the views of the undergraduate

student participants in the studies discussed above. If so, priming different theories of audit

quality may not affect the views of external reviewers. Second, research provides evidence that

people sometimes become more entrenched in their current belief when presented with views

that contradict their own (Byrne and Hart 2009). If that is the case in our setting, priming

different theories of audit quality could backfire, resulting in stronger belief that effort is linked

to quality. Third, alternative explanations other than effort, such as the salience of unexamined

exceptions, could drive external reviewer perceptions of audit quality. If so, a theory-based prime

related to effort is unlikely to change perceptions of audit quality.

3. Experiment one

Method

Participants

We recruited participants with experience evaluating the work of other audit engagement teams. 6

We partnered with the AICPA’s Assurance Research Advisory Group (ARAG) to recruit AICPA

6
The corresponding author’s Institutional Review Board approved the experimental instruments for Experiments 1
and 2.

members who are audit partners or senior managers with experience as peer reviewers in the

AICPA’s peer review program or quality control reviewers within their firm. 7 After thoroughly

reviewing our experimental instrument, ARAG sent our final instrument to qualified participants.

Seventy-one participants completed the study. Due to the importance of key details in different

phases of the experiment, we emphasized to participants the importance of “completing the task

in one sitting.” We therefore excluded 11 participants who spent more than twenty-four hours

completing the study (about eighteen days, on average). The remaining 60 participants spent, on

average, about twenty-nine minutes completing the study. 8

Participants averaged about 17 years of audit experience and included 37 audit partners,

22 senior managers, and one manager. Sixteen participants were AICPA peer reviewers, and 42

participants were engagement quality control reviewers within their firms. 9 Thus, participants

had the necessary expertise to assume the role of an external reviewer in our experiment (Libby

et al. 2002). See Table 1 for a summary of participants’ demographic characteristics. 10

Experimental design

We employed a 1 x 2 between-participants design, manipulating audit approach (D&A

procedures vs. traditional procedures). Because we seek to understand how external reviewers

evaluate two different but statistically equivalent audit approaches (i.e., the level of assurance

7
We also contacted former PCAOB inspectors to inquire about the possibility of inspectors participating in our first
experiment, but these conversations revealed that hurdles would make it impossible to secure their participation.
8
Including these participants adds noise to our reported analyses. Specifically, although the tests reported in Figure
2 continue to provide evidence of the effort heuristic using a 95% confidence interval, the t-test reported in Table 2
is not significant for Quality at conventional levels of significance. These analyses validate our ex-ante concern
about piecemeal completion of the experiment and justify our request for “completing the task in one sitting.”
9
We did not ask participants to indicate their level of experience with public clients. Because the AICPA recruited
participants with peer review experience, we conjecture that most of our participants work primarily with private
clients. Section 5 of the paper discusses why our results likely generalize to public company external reviews.
10
T-tests reveal that none of these demographic characteristics significantly differ across experimental conditions
(all p-values > 0.10), consistent with successful random assignment. Seventeen participants report neither peer
review experience nor quality control review experience, despite the AICPA identifying them as having such
experience. Removing these participants only strengthens our results (i.e., test of Hypothesis 1 in Table 2: p=0.011).
For completeness, we include them in the results reported below.

provided by each audit procedure was the same), we focused on the audit quality production

process described as “implementation of audit tests by engagement team personnel” (Francis

2011, 126). Thus, we informed participants that they would evaluate the quality of a single set of

substantive audit procedures in the role of an AICPA peer reviewer participating in an

engagement review of a fictional audit firm, PEKD.

Participants read detailed information about one set of substantive audit procedures used

in the audit with workpapers documenting the procedures. We held constant information about

revenue, materiality, exception rate, error rate, and total misstatement. In both conditions, the set

of audit procedures consisted of a three-way match procedure (between the order, shipping

documents, and invoice) to test the occurrence, accuracy, and valuation of sales revenue. In both

conditions, the audit procedure identified a 0.6% overstatement in revenue, which the audit team

proposed as an audit adjustment but which the client was unwilling to accept. The audit team

ultimately provided a clean (i.e., unqualified) audit opinion because the proposed adjustment,

combined with other proposed adjustments, was less than the materiality threshold for the

financial statements as a whole. Two highly experienced audit practitioners, including the former

head of assurance for a large global audit firm, reviewed the experimental materials and found

them realistic and appropriate for our study.

Participants assigned to the D&A procedures (see online Appendix 2 for excerpt from

experimental materials including statistical illustration) read that the audit team used D&A tools

to conduct the three-way match procedure on every sales transaction for the year (n =52,600).

The D&A tools identified 16,306 exceptions in the population of sales transactions (frequency of

exceptions equals 31%). 11 Because the D&A tools identified a large number of exceptions, the

11
Our design, using a high but not unreasonable exception rate with false positives, is consistent with prior research
and private conversations with audit practitioners (see Barr-Pulliam et al. 2020). As an explanation for this
exception rate, participants were told that “audit testing revealed that the client’s customers often changed their
orders after submitting purchase orders but before receiving shipment”. We hold this rate and explanation constant
across conditions. Thus, it is unlikely that the level of this rate could explain our evidence.
audit team performed further testing on a sample of the identified exceptions. Using the firm’s

statistical methodology, the audit team randomly selected 36 exception transactions for further

testing. The audit team then extrapolated misstatement (average of $510) to the entire population

of exceptions using a 95% confidence interval. 12

Participants assigned to the traditional procedures (see online Appendix 3 for excerpt

from experimental materials including statistical illustration) read that the audit team took a

random sample of all sales transactions for the year (n=52,600) to conduct the three-way match

procedure. Using the firm’s statistical methodology, the audit team randomly selected 155

transactions for testing, identified 48 exceptions (frequency of exceptions equals 31%) and tested

these further for misstatement. The audit team then extrapolated misstatement (average of $510)

to the entire population of transactions using a 95% confidence interval.

We note that sample size differs across conditions (36 in the D&A vs. 155 in the

traditional condition). We selected these sample sizes to hold constant the statistical level of

assurance across conditions (see online Appendix 1 for mathematical calculation), while

maintaining sample sizes in both conditions that are not unreasonable based on prior research

(e.g., Durney et al. 2014). The experimental design reflects the fact that D&A approaches (which

sample only exceptions) can achieve the same level of assurance as traditional approaches

(which sample populations) with a smaller sample requiring less manual effort by the

engagement team. The audit approach manipulation does not mention effort in either condition,

allowing participants to make their own inferences about effort and its role on audit quality.
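As a rough arithmetic check using the figures above (assuming a simple mean-per-unit projection; the firm's statistical methodology shown in online Appendices 1-3 may round differently), both conditions project essentially the same total misstatement:

\text{D\&A: } \widehat{M} = 16{,}306 \times \$510 \approx \$8.32 \text{ million}

\text{Traditional: } \widehat{M} = 52{,}600 \times \tfrac{48}{155} \times \$510 \approx \$8.31 \text{ million}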

12
We designed the experimental materials such that the midpoint of the confidence interval is below materiality for
the audit but the upper bound of the confidence interval is above materiality. We made this design choice to avoid
ceiling effects for our audit quality and supporting measures.

Audit quality, audit effort, and alternative cognitive mechanism measures

Given our focus on the audit production process and our instructions to participants that they

were to evaluate the quality of a single set of substantive audit procedures, our primary

dependent variable, Quality, represents a process measure. Similar to Ballou et al. (2021), we

asked participants to assess “the quality of the audit procedures performed” on a zero (very low

quality) to six (very high quality) scale.

We recognize that the multi-dimensional nature of audit quality includes not only

processes, but also inputs, outputs, and post-opinion factors (Francis 2011; Knechel et al. 2013;

PCAOB 2015; Christensen et al. 2016). The experimental materials held inputs and audit team

characteristics constant across conditions and did not include information specifically about post-

opinion factors. For completeness, we asked questions related to these indirect factors. The most

common output indicator of low audit quality is financial statement restatements (Knechel et al.

2013; Christensen et al. 2016). Therefore, the output measure, Failed, asked participants to

assess the likelihood that the auditors “failed to prevent a material misstatement” in the

company’s revenue transactions on a zero (not very likely) to six (very likely) scale. One of the

commonly used post-opinion indicators of low audit quality is civil litigation against auditors

(Francis 2011; Knechel et al. 2013). For example, malpractice claims, mostly stemming from

litigation, are positively associated with the number of weaknesses in AICPA peer review reports

(Casterella et al. 2009). Therefore, the post-opinion measure, Liability, asked participants about

the “risk of legal liability for the auditors” on a zero (no risk) to 100 (high risk) scale. In contrast

to our primary dependent measure Quality, both Failed and Liability represent measures of audit

quality that depend, in part, on the client’s financial statements and other extraneous factors.

Thus, they are indirect measures of audit process quality.

To gain insight on the effort heuristic, we asked participants to assess audit effort

associated with the audit procedures performed, Effort, on a zero (very low effort) to six (very

high effort) scale and how much additional effort the audit team should have invested to achieve

acceptable levels of assurance, Additional Effort, on a zero (no more effort) to six (a great deal

more effort) scale. To provide evidence on the intended D&A dynamic, we also asked

participants to assess the efficiency associated with the audit procedures performed, Efficiency,

on a zero (very low efficiency) to six (very high efficiency) scale.

To test alternative theoretical explanations, we asked participants three sets of additional

questions. The first set of questions measures salience of the exceptions. Because D&A audit

procedures concretely identify all exceptions, but do not follow up on all of them, the

salience of exceptions could drive perceptions of lower audit quality. We asked participants to

estimate the total number of exceptions (Exceptions) and to estimate the number of exceptions

they believe that the audit team did not follow up on and directly test (Unexamined Exceptions).

Although participants could acquire this information from the experimental materials, we asked

these brief questions about their beliefs to gauge the salience of the information to participants.

The second set of questions measures participants’ perceptions about the risk associated

with the audit procedures. Based on Slovic’s (1987) behavioral risk model, we asked four

questions that assess participants’ perceptions about different aspects of risk. These questions

asked participants to rate: (1) their concern/worry about the risks associated with the audit

procedures (Risk-Concern); (2) the severity of the risks associated with the audit procedures

(Risk-Severity); (3) whether the risks associated with the audit procedures are new or old (Risk-

New); and (4) the extent to which the risks associated with the audit procedures are known

precisely (Risk-Known).

The third set of questions measures individual beliefs about technology innovation. Prior

research suggests personal willingness to innovate with technology represents an individual

difference that can influence decision-making. We asked four questions that capture one’s

personal willingness to innovate with technology (Agarwal and Prasad 1998). 13 We create a

composite measure, Innovativeness, based on the average of these four measures. Finally,

participants also responded to manipulation checks and demographic questions.
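For reference, the reliability statistics we report for our multi-item measures (see footnotes 17 and 18) are Cronbach's alpha. For a composite of k items with item variances \sigma_i^2 and total-score variance \sigma_X^2, the standard form is

\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right),

and the standardized variant is \alpha_{std} = \frac{k\bar{r}}{1 + (k-1)\bar{r}}, where \bar{r} is the average inter-item correlation.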

Results

Over 85 percent of participants correctly answered manipulation checks asking whether the audit

team used data analytics to electronically identify the exceptions in the population or identified

exceptions in a random sample of the population. Excluding participants who answered

incorrectly generally does not affect the inferences we draw. 14 For completeness, we report

analyses below including all 60 participants.

Tests of Hypothesis 1 and the effort heuristic mechanism

Table 2 reports the results of two-sample t-tests, with Audit Approach as the independent

variable and Quality and each of the other measures as the dependent variable. Participants judge

Quality lower in the D&A procedures than in the traditional procedures (2.44 vs. 2.93, t = 1.68, p

= 0.049), consistent with Hypothesis 1. The other, indirect, audit quality variables are generally

also consistent with Hypothesis 1. Results for Failed are directionally consistent with our

prediction but not significant (3.91 vs. 3.61, t = 0.72, p = 0.237), while results for Liability are

significant as well as directionally consistent (71.00 vs. 53.00, t = 3.28, p = 0.001). Untabulated

13
The questions were: If I heard about a new technology, I would look for ways to experiment with it; Among my
peers, I am usually the first to try out new technologies; In general, I am hesitant to try out new technologies; I like
to experiment with new technologies.
14
Specifically, when excluding these participants, the tests reported in Figure 2 continue to provide evidence of the
effort heuristic using a 95% confidence interval. The t-test reported in Table 2 for Quality becomes slightly less
significant when removing these participants (p = 0.059), perhaps due to lower statistical power.

results from an overall MANOVA test that includes the primary dependent variable, Quality, as

well as measures more tangential to audit process quality (Failed, Liability, Additional Effort)

reveal a significant overall effect of audit approach (F = 2.76, p = 0.018, one-tailed).

Results for audit effort measures are also consistent with our theoretical model where

external reviewers perceive D&A takes less effort than traditional audit procedures. Effort is

lower for D&A than for traditional audit procedures (2.47 vs. 3.86, t = 4.32, p < 0.001). Additional

Effort is higher for D&A than for traditional audit procedures (4.38 vs. 3.75, t = 1.80, p = 0.039).

Further, Efficiency is higher for D&A than for traditional audit procedures, consistent with one of

the intended outcomes of D&A (4.38 vs. 2.39, t = 6.21, p < 0.001).

Figure 2, panel A illustrates a statistical diagram that tests the effort heuristic mechanism.

As seen in path 1 of Figure 2, participants perceive that the D&A procedures entail less Effort

than the traditional procedures. 15 Path 2 of Figure 2 provides evidence that participants who

perceive lower Effort in turn perceive lower Quality. Figure 2, panel B tests for mediation using

Hayes’ (2018) bootstrapping approach (Model 4). Using a 95% confidence interval, we find that

Effort mediates the relationship between Audit Approach and Quality. Overall, these results

suggest that external reviewers judge lower audit procedure quality in the D&A approach

because they rely on an effort heuristic to judge audit procedure quality.
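To make the mediation test concrete, the following sketch illustrates the bootstrap logic of Hayes' (2018) Model 4 as applied in Figure 2, panel B. The data file and column names are hypothetical placeholders rather than our instrument's variable names; mediation is inferred when the bootstrap confidence interval for the indirect effect a*b excludes zero.

    import numpy as np
    import pandas as pd

    def indirect_effect(df):
        # Path 1 (a): Audit Approach (0 = traditional, 1 = D&A) -> Effort
        a = np.polyfit(df["approach"], df["effort"], 1)[0]
        # Path 2 (b): Effort -> Quality, controlling for Audit Approach
        X = np.column_stack([np.ones(len(df)), df["approach"], df["effort"]])
        b = np.linalg.lstsq(X, df["quality"], rcond=None)[0][2]
        return a * b

    data = pd.read_csv("experiment1.csv")  # hypothetical file: approach, effort, quality
    boot = [indirect_effect(data.sample(frac=1, replace=True)) for _ in range(5000)]
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"95% bootstrap CI for the indirect effect: [{lo:.3f}, {hi:.3f}]")

The same machinery extends to the parallel two-mediator analyses reported below by estimating a and b paths for each mediator and bootstrapping the difference between the two indirect effects.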

Alternative cognitive mechanisms

Although we hold constant the frequency (and number) of exceptions across conditions, the

D&A condition concretely identifies all 16,306 exceptions while the traditional condition

concretely identifies only 48 exceptions (out of the sample of 155 transactions). Astute external

15
Replacing Effort with a composite of our audit effort measures (Effort, the inverse of Efficiency and the inverse of
Additional Effort) yields identical inferences on these analyses, suggesting that these measures capture similar
constructs. For expositional simplicity, we focus on Effort throughout the text.

reviewers in the traditional condition should consider the number of extrapolated exceptions

when judging audit quality, not just those that were concretely identified in the sample.

Nevertheless, the D&A condition increases the salience of exceptions and the salience of known

exceptions that the engagement team did not directly investigate for error (i.e., “unexamined

exceptions”). Indeed, Table 2 results show higher mean values for Exceptions and Unexamined

Exceptions in the D&A than the traditional procedures (p = 0.007 and 0.010, respectively). 16

We conduct analyses to examine whether variations in the salience of exceptions or

unexamined exceptions across conditions help to explain our results for Hypothesis 1.

Specifically, we conduct two mediation analyses (see online Appendix 5) that include both

Effort (our predicted mediator) and Exceptions or Unexamined Exceptions (the potentially

competing mechanisms proposed above) as mediators of the relationship between audit approach

and Quality (Hayes 2018, Model 4). Using a 95% confidence interval, these analyses reveal that

Effort but not Exceptions or Unexamined Exceptions mediates the relationship between Audit

Approach and Quality (95% CI for Exceptions, Lower Limit: -0.15, Upper Limit: 0.25; 95% CI

for Unexamined Exceptions, Lower Limit: -0.22, Upper Limit: 0.21). In addition, these analyses

reveal that the indirect effect of Effort is stronger than the indirect effect of Exceptions (95% CI,

Lower Limit: -1.38, Upper Limit: -0.32) and stronger than the indirect effect of Unexamined

Exceptions (95% CI, Lower Limit: -1.35, Upper Limit: -0.25). These results suggest that the

salience of exceptions does not account for our audit quality results for Hypothesis 1.

Next, we consider participants’ beliefs about the risk in D&A compared to traditional

audit procedures. Table 2 results show participants perceive the D&A procedures as riskier than

16
Interestingly, participants across all conditions tend to underestimate Exceptions and Unexamined Exceptions.
Because we wanted to capture salience, participants were not able to review the experimental materials when
answering the Exceptions and Unexamined Exceptions questions, so it is perhaps not surprising that participants did
not answer these questions completely accurately.

the traditional procedures for three of the four risk measures. Specifically, participants perceive

more severe and newer risks in the D&A than in the traditional procedures (p < 0.05), and are

more concerned/worried about the risks in the D&A than in the traditional procedures (p =

0.087). To provide assurance that perceptions of risk do not drive our results, we examine

whether any of the four risk measures mediate the relationship between audit approach and

perceived audit procedure quality (untabulated). The indirect effects of each of the four measures

are insignificant, and consequently, we do not consider these measures further. 17

Finally, to provide assurance that perceptions of technology do not drive our results, we

also compare Innovativeness across conditions and find that it does not differ

(p > 0.10). 18 We conduct an ANOVA with Quality as the dependent variable and Audit

Approach, Innovativeness, and their interaction as independent variables (untabulated). We

continue to find a significant main effect of Audit Approach (p = 0.011). We also find some

evidence that support for Hypothesis 1 is stronger for participants with lower Innovativeness scores

(interaction p = 0.05). 19 In an analysis examining whether Innovativeness mediates the

relationship between audit approach and perceived audit quality, we find that the indirect effect of

Innovativeness is not significant.

4. Experiment two

We designed Experiment 2 to provide a stronger test of the effort heuristic mechanism.

Experiment 1 uses a simple 1 x 2 between-participants design to provide initial evidence on the

17
We note that the four risk measures employ different scales. Likely as a result, standardized Cronbach’s alpha is
0.38 for these measures, indicating low reliability. We therefore do not create a composite measure of risk.
18
Cronbach’s alpha is 0.85 for the four measures included in Innovativeness, indicating reasonably strong reliability.
19
Specifically, a spotlight analysis at one standard deviation below (above) the median of Innovativeness finds that
participants judge D&A lower (similar) in quality than traditional procedures. Thus, Hypothesis 1 results appear to
be driven by participants with lower Innovativeness scores. However, including Innovativeness as a covariate in our
tests of Hypothesis 1 only strengthens our results (p= 0.044). We do not tabulate this analysis in accordance with
best practices for reporting experimental results (Simmons et al. 2011).

effort heuristic and to rule out some potential alternative explanations. However, Experiment 1

has several limitations, including the correlational nature of the approach, which raises the

possibility that an omitted variable drives the observed relationship (for more details, see

Spencer et al. 2005; Griffith et al. 2016; Emett 2019; Asay et al. 2022). Experiment 2 tests the

effort heuristic through a “moderation-of-process” design, an approach that Asay et al. (2022)

argue may be more valuable than mediation to identify a theorized causal mechanism. Thus,

Experiment 2 is designed to test a theory-based intervention intended to reduce external

reviewers’ reliance on the effort heuristic.

Method

Participants

As in Experiment 1, we again partnered with the AICPA’s ARAG to recruit AICPA members

with experience evaluating the work of other audit engagement teams. The AICPA also reviewed

all experimental materials for Experiment 2 prior to recruiting participants. One hundred and

nine participants completed the study. Similar to Experiment 1, we emphasized to participants

the importance of “completing the task in one sitting”. We therefore excluded eight participants

who spent more than twenty-four hours completing the study. These excluded participants

spent almost thirteen days, on average, completing the study. 20 The remaining 101

participants spent, on average, about forty-one minutes completing the study. We also excluded

three participants who explicitly stated they could not complete our priming manipulation. 21 This

leaves us with a final sample of 98 participants.

20
Including these participants does not alter the inferences we draw from our test of Hypothesis 2 in Table 3 (p =
0.028) or our tests of moderated mediation in Figure 3 (i.e., we continue to find evidence of moderated mediation
using a 95% confidence interval).
21
The priming manipulation asks participants to describe a past audit experience consistent with their prime. The
three excluded participants stated: “Can’t think of examples off hand,” “I cannot think of any at this time,” and “I
cannot think of an example,” respectively. Including these participants does not alter the inferences we draw from
our test of Hypothesis 2 in Table 3 (p = 0.023) or our tests of moderated mediation in Figure 3 (i.e., we continue to
find evidence of moderated mediation using a 95% confidence interval).
As shown in Table 1, participants reported having about 24 years of audit experience and

included 90 audit partners, five senior managers, and three managers. Fifty participants reported

experience as AICPA peer reviewers or PCAOB inspectors, and eighty-three participants

reported experience as engagement quality control reviewers within their firms. 22

Experimental design, dependent variables, and post-experiment questions

We employed a 2 x 2 between-participants experimental design, manipulating the audit approach

(D&A procedures vs. traditional procedures) and the audit quality prime (effort-is-essential vs.

effort-can-be-substituted). We kept the experimental instructions, background, and the audit

approach manipulation exactly as in Experiment 1.

We modeled our audit quality prime manipulation after the one used by Cho and Schwarz

(2008) where the primes either promoted a theory focused on effort (“effort-is-essential”) or a

theory focused on how effort can be substituted by other factors (“effort-can-be-substituted”).

Before beginning the engagement review task, we asked participants to read an excerpt from a

fictional speech given by “the global head of audit quality for a large, international audit firm.”

We vary the content of this speech across conditions to implement our priming manipulation. After reading the speech, we also asked

participants to describe a past audit experience that corresponds to the priming condition they

received. We used both the speech excerpt and memory prompt to strengthen our manipulation.

Given the time constraints faced by this scarce and valuable participant group, we did not

provide a filler task between the priming manipulation and the main review task. We took other

steps, described subsequently, to mitigate concerns that participants could infer the purpose of

22
Untabulated analyses reveal that, with the exception of familiarity with D&A tools, none of the demographic
variables are significantly associated with our manipulations or their interaction (all p-values > 0.10). Including
familiarity with D&A tools as a covariate in our tests of Hypothesis 2 does not affect the inferences we draw from
our test of Hypothesis 2 in Table 3 (p = 0.019).

the manipulation and would feel compelled to answer in line with inferred expectations (i.e.,

experimental demand).

In the effort-is-essential prime, similar to Cho and Schwarz (2008), the speaker emphasizes the importance of audit effort for audit quality by stating, “the single most important factor that drives audit quality is audit effort—the sheer amount of work that auditors dedicate to planning, fieldwork, and review” (see online Appendix 4, panel A). 23 The accompanying memory prompt requested, “In one or two sentences describe an experience where your effort on an audit procedure improved audit quality.”

In the effort-can-be-substituted prime, similar to Cho and Schwarz (2008), the speaker emphasizes how effort can be substituted by execution, stating: “On the highest-quality engagements, audit teams find ways to reduce audit effort, but not quality, by improving audit execution” (see online Appendix 4, panel B). The accompanying memory prompt requested, “In one or two sentences describe an experience where your effort on an audit procedure did not improve audit quality.” Importantly, the speaker does not mention technology or D&A tools in either priming condition.

We ask the same question to measure audit procedure quality as in Experiment 1,

Quality. However, because Experiment 2 focuses more narrowly on testing the effort heuristic

using a “moderation-of-process” design, we ask only a subset of the other questions from Experiment 1. In particular, we ask participants the Failed, Liability, Effort, Efficiency, and Innovativeness questions. 24

23 Regulators and academics regularly promote the idea that audit effort is a crucial input into high-quality audits (Shibano 1990; Matsumura and Tucker 1992; Lobo and Zhao 2013; PCAOB 2015; Xiao et al. 2020; Christensen et al. 2021). We relied upon these sources as well as press articles in constructing these speeches. Additionally, we confirmed with current and former PCAOB inspectors, some of whom are now audit quality leaders, that our “effort-is-essential” speech was consistent with the messaging they would expect to see put out by firms.

Results

Almost ninety percent of participants correctly answered manipulation checks probing whether

they attended to the audit approach manipulation. Excluding participants who failed the

manipulation check does not change the inferences we draw from the entire sample of 98

participants. 25 Therefore, we do not exclude these participants from our analyses. Overall, we

believe participants attended to and understood the manipulations.

Tests of hypothesis 2 and effort heuristic mechanism

We test Hypothesis 2 by conducting a 2 x 2 ANOVA with audit approach, audit quality prime, and their interaction as independent variables and Quality as the dependent variable. We report descriptive statistics in Table 3 panel A and plot mean values in Figure 3.
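
For concreteness, the following minimal sketch (illustrative only and not part of the study’s materials; the file name and the columns approach, prime, and quality are hypothetical) shows how such a 2 x 2 ANOVA can be estimated:

```python
# Illustrative sketch of the 2 x 2 ANOVA for Quality described above.
# "experiment2.csv" and the column names are hypothetical, not the study's materials.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.read_csv("experiment2.csv")  # one row per participant

# Regress Quality on the two manipulated factors and their interaction.
model = ols("quality ~ C(approach) * C(prime)", data=df).fit()

# ANOVA table with both main effects and the Audit Approach x Prime
# interaction, paralleling the layout of Table 3 panel B.
print(sm.stats.anova_lm(model, typ=2))
```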

Table 3 panel B reports that the predicted interaction is significant (p = 0.018) in the ANOVA analysis. Table 3 panel C reports simple effects tests revealing that participants judge D&A procedures lower in quality than traditional procedures in the effort-is-essential prime condition (t = 2.36, p = 0.010), consistent with our Hypothesis 1 results. In contrast, participants do not judge D&A procedures lower in quality than traditional procedures in the effort-can-be-substituted prime condition (t

= -0.69, p = 0.491), consistent with Hypothesis 2. Additional (untabulated) simple effects tests reveal that the effort-can-be-substituted prime serves both to increase the perceived quality of D&A audit procedures (t = 1.47, p = 0.072) and to decrease the perceived quality of traditional audit procedures (t = -1.55, p = 0.063). 26 These results are consistent with the effort heuristic causing external reviewers to evaluate D&A (traditional) audit procedures as lower (higher) quality than they would without an effort heuristic, supporting Hypothesis 2.

24 The wording for Failed differs slightly across Experiment 1 and Experiment 2 (“failed to prevent” in Experiment 1 vs. “failed to detect” in Experiment 2). We made this design choice because we conjectured that the double negative in the first version inhibited comprehension in Experiment 1, leading to the weaker results for Failed.

25 We did not include a traditional manipulation check for the audit quality prime manipulation due to the overt, active nature of that manipulation. Instead, two independent reviewers coded whether participants’ open-ended responses to the experiential prompt at the end of the manipulation detailed an experience consistent with the question in their condition. Only two participants were coded inconsistent with their condition; removing them from the reported analyses only strengthens our test of Hypothesis 2 in Table 3 (p = 0.013) and does not alter our tests of moderated mediation in Figure 4 (i.e., we continue to find evidence of moderated mediation using a 95% confidence interval). This evidence suggests that the audit quality prime manipulation was successful.

26 The unsigned difference between these two effects is not significant, suggesting that our priming manipulation does not primarily affect perceptions of one type of audit procedure over the other (D&A vs. traditional).

Figure 4 tests the theoretical process that underlies our Hypothesis 2 prediction. Figure 4 panel A illustrates a statistical diagram. As seen in Indirect Link 1, participants perceive that the D&A procedures take less Effort than the traditional procedures across both audit quality prime conditions (p < 0.01). However, the diagram reveals that the relationship between Effort and Quality (Indirect Link 2) is moderated by the audit quality prime manipulation (p < 0.01), consistent with Hypothesis 2. Finally, panel A reveals that the audit quality prime manipulation does not moderate the direct effect of audit approach on Quality (p = 0.28), suggesting our Hypothesis 2 results are explained by the effort heuristic mechanism and not some other mechanism. 27

27 We examine the open-ended responses from the memory prompt to test whether the effort-is-essential prime inadvertently introduced a pro-human bias to participants. Two independent coders, blind to condition, coded whether any of the open-ended responses reflect a pro-human bias. Neither coder identified any response that reflects a pro-human bias. We further use these open-ended responses to test whether participants perceived either of the primes as unrealistic. An independent coder read these responses and determined whether any indicated disagreement with the effort-is-essential or effort-can-be-substituted primes. The coder identified only one response indicating disagreement with the manipulation. Our results yield identical inferences when excluding this participant from our analyses.

Figure 4 panel B tests for moderated mediation using Hayes’ (2018) bootstrapping approach (Model 15). The audit quality prime manipulation is intended to influence the relationship between Effort and Quality (Indirect Link 2) and not the relationship between Audit Approach and Effort (Indirect Link 1). As such, we place the moderator on the Indirect Link 2 path. Consistent with our intention, the audit quality prime manipulation does not significantly affect perceptions of Effort, either as a main effect or in interaction with our audit approach manipulation. Using a 95% confidence interval, we find that Effort mediates the relationship

between audit approach and Quality in the effort-is-essential but not in the effort-can-be-

substituted prime condition. Further, an overall test of moderated mediation provides evidence of

a significantly stronger indirect effect of Effort in the effort-is-essential prime condition than in

the effort-can-be-substituted prime condition. These results suggest external reviewers rely on

the effort heuristic to judge audit quality when receiving an effort-is-essential prime, but not

when receiving an effort-can-be-substituted prime.
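
The sketch below illustrates the bootstrapping logic of this analysis. It is a rough analog of Hayes’ (2018) Model 15 rather than the PROCESS macro itself, and it assumes hypothetical column names with approach and prime coded as 0/1 indicators:

```python
# Rough analog of the Model 15 bootstrap: the conditional indirect effect of
# Audit Approach on Quality through Effort, with the prime moderating the
# Effort-to-Quality path. Column names and 0/1 coding are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def conditional_indirect(df: pd.DataFrame, prime_value: int, n_boot: int = 5000):
    rng = np.random.default_rng(2018)
    estimates = []
    for _ in range(n_boot):
        boot = df.sample(len(df), replace=True,
                         random_state=int(rng.integers(0, 2**31 - 1)))
        # a-path: approach -> effort (not moderated by the prime)
        a = smf.ols("effort ~ approach", data=boot).fit().params["approach"]
        # b-path and direct path, both moderated by the prime (Model 15 structure)
        m = smf.ols("quality ~ approach * prime + effort * prime", data=boot).fit()
        b = m.params["effort"] + m.params["effort:prime"] * prime_value
        estimates.append(a * b)
    lo, hi = np.percentile(estimates, [2.5, 97.5])  # 95% percentile CI
    return float(np.mean(estimates)), (lo, hi)
```

Contrasting conditional_indirect(df, 1) with conditional_indirect(df, 0) then compares the indirect effect under the two primes.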

Regarding the other questions we asked, untabulated results for Failed are directionally consistent but not significant (F = 1.12, p = 0.147), consistent with Experiment 1. Results for Liability are also consistent with our Hypothesis 2 prediction (interaction F = 4.15, p = 0.022). Specifically, participants perceive more litigation risk in the D&A procedures than in the traditional procedures in the effort-is-essential prime condition (t = 2.56, p = 0.006) but not in the effort-can-be-substituted prime condition (t = -0.36, p = 0.718). Finally, Innovativeness is not significantly associated with Quality, either as a main effect or in interaction with our manipulated variables. Furthermore, our manipulated variables do not significantly affect Innovativeness, either as main effects or through their interaction.

5. Discussion

Across two experiments, we investigated how external reviewers evaluate the quality of D&A

and traditional audit procedures. Our first experiment tested whether external reviewers judge

D&A audit procedures as lower in quality than traditional audit procedures and explored whether

the effort heuristic explains this difference better than alternative explanations, including the

salience of (unexamined) exceptions, perceived risk, and technology beliefs. Consistent with the effort heuristic, we find that external reviewers judge D&A procedures to be of lower audit procedure quality than traditional procedures because they perceive that auditors use less effort in D&A procedures.

Our second experiment provides a stronger test of the effort heuristic explanation and

evaluates a theory-based intervention designed to reduce the extent to which external reviewers

rely on the heuristic. 28 We find external reviewers change how they evaluate the quality of D&A

and traditional procedures based on the audit quality prime they receive. Participants receiving

an effort-is-essential prime again rely on the effort heuristic, judging D&A procedures as lower

in quality than traditional procedures. In contrast, participants receiving an effort-can-be-

substituted prime do not rely on the effort heuristic, judging D&A procedures as similar in

quality to traditional procedures.

28 We designed the two experiments to complement each other. Experiment 2 uses a moderation-of-process design (see Asay et al. 2022), allaying concerns about correlated omitted variables in Experiment 1. Experiment 1 does not employ obtrusive primes, allaying concerns about experimental demand in Experiment 2. Consistent evidence across the two experiments, including the theoretical process, lends strong support to our conclusions while ruling out alternative explanations.

Generalizability to PCAOB inspection settings

The AICPA ARAG provided participants, most of whom do not have PCAOB inspection

experience. The AICPA’s consequential regulatory role in private company audits makes the

AICPA external review setting important in its own right (AICPA 2022a; Hilary and Lennox

2005). Nevertheless, our choice of participants could raise questions about the extent to which

our results generalize to PCAOB inspection settings. AICPA peer reviewers and PCAOB

inspectors perform a similar task and the theory we examine is general and psychological in

nature, influencing the judgments and decisions of most people, including self-professed subject experts (e.g., see Lennox and Pittman 2010 and Kruger et al. 2004). Thus, we expect our theoretical model likely generalizes to PCAOB inspection settings.

To gain insight into whether our theoretical model does indeed generalize to PCAOB inspection settings, we interviewed four current or former PCAOB inspectors, who on average have 6.63 years of inspection experience at the PCAOB. Our interviews focused on the first two

links in our theoretical model (Figure 1). Three interviewees indicated they believe D&A audit

procedures reduce audit effort (Link 1). For example, one interviewee pointed out that data

analytics “should result in less manual effort compared to a traditional approach.” All four

interviewees noted that they consider audit effort when conducting inspections and relate audit

effort to audit quality (Link 2). For example, one interviewee noted: “If you are putting in

maximum effort, you would get a maximum [quality] outcome…If you have a high level of

effort in your planning, high level of effort in the execution, you should have an audit that is of a

higher quality.” Overall, these interview results provide evidence that the key theoretical

constructs operating in our experiments also operate within PCAOB inspection settings. Thus,

we expect that any differences between AICPA and PCAOB settings are unlikely to alter the

direction of the effects we examine (Libby et al. 2002).

Future research

Our study also suggests extensions for future research. First, our theoretical model could

apply to other audit or accounting contexts or other measures of effort (e.g., audit hours). For

example, regulatory friction created by the effort heuristic could impede the adoption of other

effort-reducing D&A audit innovations, such as large language models and robotic process

automation. As another example, auditors may also fall prey to the effort heuristic when

performing their own supervisory reviews of audit procedures.

Second, prior research recognizes and provides evidence that external reviewers (e.g.,

peer reviewers and PCAOB inspectors) play a powerful role in shaping audit practice, but we

know little about the factors that influence their judgments (e.g., see Hilary and Lennox 2005;

DeFond and Lennox 2017; Aobdia 2018; Johnson et al. 2019; Shefchik Bhaskar 2019;

Westermann et al. 2019). We provide evidence that the effort heuristic is one influential factor,

but future research can explore other factors that influence external reviewers’ judgments.

Third, we test a theory-based priming intervention that reduces reliance on the effort

heuristic. Future research could examine other theory-based interventions that could also

eliminate the effect we observe in Experiment 1. For example, future research could examine

whether external reviewers perceive D&A procedures as less effortful than traditional procedures

in the presence of interventions that increase the salience of the firm’s efforts to design, test, and

implement D&A tools.

Fourth, in both experiments we gathered participants’ willingness to innovate with

technology, Innovativeness, as an individual difference that can influence judgments and

decisions. Results for this variable differ across experiments, with high Innovativeness reducing

the effort heuristic in Experiment 1, but not in Experiment 2. Future research could more directly

study how individuals’ willingness to use technology affects subsequent judgments of quality.

Finally, our study was conducted in the US. We encourage future research to explore

whether our findings generalize to settings outside the United States. Prior research finds

country-level variations in economic, regulatory, audit market, and sociological factors impact

financial reporting quality and audit fees (Isidro et al. 2020). These factors, particularly sociological ones, could change external reviewers’ judgments in different countries (Kleinman et al. 2014). Specifically, culture has been shown to influence perceptions of technology (Huang et al. 2019; Srite 2006), perceptions of effort (Ko et al. 2015), and even perceptions of quality (Yavas and Rezayat 2003).

Public policy and practice implications

Our study contributes to a growing body of literature on D&A audit approaches by providing

theory-based empirical evidence on how external reviewers in practice evaluate D&A audit

procedures relative to traditional audit procedures. Our results are consistent with auditor

concerns that the external review process could unintentionally impede adoption of D&A audit

approaches. Thus, our results have important public policy and practice implications.

For public policy, our results first suggest that regulators could provide additional

training to external reviewers on evaluating audit evidence obtained using D&A audit

approaches. Second, because use of D&A audit approaches affects external reviewers’

evaluation of specific audit procedures, they could also potentially affect external reviewers’

choice of engagements for inspection, under a risk-based approach. In other words, to the extent

that external reviewers view D&A approaches less favorably, audits and/or component of audits

using these approaches are more likely to be selected for review. Regulators should consider

addressing this concern by proactively developing guidance and standards that more explicitly

detail the quality criteria that external reviewers should use when evaluating D&A audit

approaches. 29 For audit practice, auditors should be aware of the tendency for external reviewers to use the effort heuristic and, consequently, should consider addressing this tendency by highlighting to external reviewers the effort they devote to developing and using D&A audit approaches.

29 Both the AICPA and PCAOB have begun considering whether additional guidance and standards are needed regarding the auditor’s use of technology, but neither has put forward a standard and both are still gathering information (AICPA 2022b; PCAOB 2019). To help them in their efforts, we shared the findings of our study with the AICPA, which helped sponsor this study and which shared our findings with the Auditing Standards Board and its peer review teams.

References

Agarwal, R., and J. Prasad. 1998. A conceptual and operational definition of personal
innovativeness in the domain of information technology. Information Systems Research 9
(2): 204-215.

AICPA. 2017. Guide to Data Analytics. Hoboken, N.J.: John Wiley & Sons.

AICPA. 2022a. Peer Review Program. Annual Report on Oversight.
https://us.aicpa.org/content/dam/aicpa/interestareas/peerreview/resources/transparency/downloadabledocuments/56175896-annual-report-oversight-2021.pdf, accessed April 14, 2023.

AICPA. 2022b. ASB Strategy Work Plan.
https://us.aicpa.org/content/dam/aicpa/research/standards/auditattest/asb/downloadabledocuments/2022-2023-asb-strategy-work-plan.pdf, accessed April 14, 2023.

Alles, M. G. 2015. Drivers of the use and facilitators and obstacles of the evolution of big data
by the audit profession. Accounting Horizons 29 (2): 439-49.

Anderson, S. B., J. L. Hobson, and M. E. Peecher. 2021. The joint effects of rich data
visualization and audit procedure categorization on auditor judgment. Working Paper,
University of Illinois at Urbana-Champaign.

Aobdia, D. 2018. The impact of the PCAOB individual engagement inspection process—
Preliminary evidence. The Accounting Review 93 (4): 53-80.

Aobdia, D., and N. Shroff. 2017. Regulatory oversight and auditor market share. Journal of
Accounting and Economics 63 (2-3): 262-87.

Arkes, H. R., and C. Blumer. 1985. The psychology of sunk cost. Organizational Behavior and
Human Decision Processes 35 (1): 124-40.

Aronson, E., and J. Mills. 1959. The effect of severity of initiation on liking for a group. The
Journal of Abnormal and Social Psychology 59 (2): 177-81.

Asay, H. S., R. Guggenmos, K. Kadous, L. L. Koonce, and R. Libby. 2022. Theory testing and
process evidence in accounting experiments. The Accounting Review 97 (6): 23-43.

Austin, A., T. Carpenter, M. H. Christ, and C. Nielson. 2021. The data analytics journey:
Interactions among auditors, managers, regulation, and technology. Contemporary
Accounting Research 38 (3): 1888-1924.

Ballou, B., J. H. Grenier, and A. Reffett. 2021. Stakeholder perceptions of data and analytic
based auditing techniques. Accounting Horizons 35 (3): 47-68.

Barr-Pulliam, D., J. F. Brazel, J. McCallen, and K. Walker. 2020. Data analytics and skeptical
actions: The countervailing effects of false positives and consistent rewards for
skepticism. Working paper, University of Louisville, North Carolina State University,
University of Georgia, and University of Wisconsin-Madison.

Barr-Pulliam, D., H. L. Brown-Liburd, and K. A. Sanderson. 2022. The effects of the internal
control opinion and use of audit data analytics on perceptions of audit quality, assurance,
and auditor negligence. Auditing: A Journal of Practice and Theory 41 (1): 25-48.

Brown-Liburd, H., H. Issa, and D. Lombardi. 2015. Behavioral implications of big data's impact
on audit judgment and decision making and future research directions. Accounting
Horizons 29 (2): 451-68.

Bibler, S., T. Carpenter, M. Christ, and A. Gold. 2023. Thinking outside of the box: Engaging
auditors’ innovation mindset to improve auditors’ fraud actions in a data-analytic
environment. Working paper, Vrije Universiteit Amsterdam and University of Georgia.

Byrne, S., and P. S. Hart. 2009. The boomerang effect: A synthesis of findings and a preliminary
theoretical framework. Annals of the International Communication Association 33 (1): 3-37.

Cao, M., R. Chychyla, and T. Stewart. 2015. Big Data analytics in financial statement
audits. Accounting Horizons 29 (2): 423-29.

Cao, T., R. Duh, H. Tan, and T. Xu. 2022. Enhancing auditors’ reliance on data analytics under
inspection risk using fixed and growth mindsets. The Accounting Review 97 (3): 131-53.

Caramanis, C., and C. Lennox. 2008. Audit effort and earnings management. Journal of
Accounting and Economics 45 (1): 116-38.

Carcello, J. V., C. Hollingsworth, and S. A. Mastrolia. 2011. The effect of PCAOB inspections
on Big 4 audit quality. Research in Accounting Regulation 23 (2): 85-96.

Carson, E., P. Lamoreaux, R. Simnett, U. Thurheimer, and A. Vanstraelen. 2021. Establishment
of national public audit oversight boards and audit quality. Working paper, UNSW
Sydney, Arizona State University, and Maastricht University.

Casterella, J. R., K. N. Jensen, and W. R. Knechel. 2009. Is self-regulated peer review effective
at signaling audit quality? The Accounting Review 84 (3): 713-35.

Cheng, Y., A. Mukhopadhyay, and R. Y. Schrift. 2017. Do costly options lead to better
outcomes? How the protestant work ethic influences the cost-benefit heuristic in goal
pursuit. Journal of Marketing Research 54 (4): 636-49.

Cho, H., and N. Schwarz. 2008. Of great art and untalented artists: Effort information and the
flexible construction of judgmental heuristics. Journal of Consumer Psychology 18 (3):
205-11.

Christ, M. H., S. A. Emett, S. L. Summers, and D. A. Wood. 2021. Prepare for takeoff:
Improving asset measurement and audit quality with drone-enabled inventory audit
procedures. Review of Accounting Studies 26: 1323-43.

Christensen, B. E., S. M. Glover, T. C. Omer, and M. K. Shelley. 2016. Understanding audit
quality: Insights from audit professionals and investors. Contemporary Accounting
Research 33 (4): 1648-84.

Christensen, B. E., N. J. Newton, and M. S. Wilkins. 2021. Archival evidence on the audit
process: Determinants and consequences of interim effort. Contemporary Accounting
Research 38 (2): 942-73.

Commerford, B. P., S. A. Dennis, J. R. Joe, and J. Ulla. 2022. Man versus machine: Complex
estimates and auditor reliance on artificial intelligence. Journal of Accounting Research
60 (1): 171-201.

Davidson, R. A., and W. E. Gist. 1996. Empirical evidence on the functional relation between
audit planning and total audit effort. Journal of Accounting Research 34 (1): 111-24.

DeFond, M. L., and C. S. Lennox. 2017. Do PCAOB inspections improve the quality of internal
control audits? Journal of Accounting Research 55 (3): 591-627.

Deloitte. 2016. 2016 Global Impact Report,
https://www2.deloitte.com/content/dam/Deloitte/global/Documents/global-report/Deloitte-2016-Global-Impact-Report.pdf, accessed September 8, 2021.

Durney, M., R. J. Elder, and S. M. Glover. 2014. Field data on accounting error rates and audit
sampling. Auditing: A Journal of Practice & Theory 33 (2): 79-110.

Eilifsen, A., F. Kinserdal, W. F. Messier, Jr., and T. E. McKee. 2020. An exploratory study into
the use of audit data analytics on audit engagements. Accounting Horizons 34 (4): 7-102.

Elder, R. J., A. D. Akresh, S. M. Glover, J. L. Higgs, and J. Liljegren. 2013. Audit sampling
research: A synthesis and implications for future research. Auditing: A Journal of
Practice & Theory 32 (1): 99-129.

Emett, S. A. 2019. Investor reaction to disclosure of past performance and future plans. The
Accounting Review 94 (5): 165-88.

EY. 2017. How Big Data and Analytics are Transforming the Audit,
http://www.ey.com/gl/en/services/assurance/ey-reporting-issue-9-how-big-data-and-
analytics-are-transforming-the-audit#item1, accessed September 8, 2021.

Festinger, L., and J. M. Carlsmith. 1959. Cognitive consequences of forced compliance. The
Journal of Abnormal and Social Psychology 58 (2): 203.

Francis, J. R. 2011. A framework for understanding and researching audit quality. Auditing: A
Journal of Practice & Theory 30 (2): 125-52.

Franco‐Watkins, A. M., B. D. Edwards, and R. E. Acuff Jr. 2013. Effort and fairness in
bargaining games. Journal of Behavioral Decision Making 26 (1): 79-90.

Gepp, A., M. K. Linnenluecke, T. J. O’Neill, and T. Smith. 2018. Big data techniques in auditing
research and practice: Current trends and future opportunities. Journal of Accounting
Literature 40: 102-15.

Glover, S. M., M. H. Taylor, and Y.-J Wu. 2019. Mind the Gap: Why do experts have
differences of opinion regarding the sufficiency of audit evidence supporting complex
fair value measurements? Contemporary Accounting Research 36 (3): 1417-60.

Griffith, E. E., K. Kadous, and D. Young 2016. How insights from the “new” JDM research can
improve auditor judgment: Fundamental research questions and methodological
advice. Auditing: A Journal of Practice & Theory 35 (2): 1-22.

Hanlon, M., and N. Shroff. 2022. Insights into auditor public oversight boards: Whether, how,
and why they “work”. Journal of Accounting and Economics 74 (1): 1-26.

Hayes, A.F. 2018. Introduction to Mediation, Moderation, and Conditional Process Analysis: A
Regression-Based Approach, 2nd Edition. New York: The Guilford Press.

Hilary, G., and C. Lennox. 2005. The credibility of self-regulation: Evidence from the
accounting profession's peer review program. Journal of Accounting and Economics 40
(1): 211-29.

Houston, R. W., and C. M. Stefaniak. 2013. Audit partner perceptions of post-audit review
mechanisms: An examination of internal quality reviews and PCAOB inspections.
Accounting Horizons 27 (1): 23-49.

Huang, F., T. Teo, J. C. Sánchez-Prieto, F. J. García-Peñalvo, and S. Olmos-Migueláñez. 2019.
Cultural values and technology adoption: A model comparison with university
teachers from China and Spain. Computers & Education 133: 69-81.

Inzlicht, M., A. Shenhav, and C. Y. Olivola. 2018. The effort paradox: Effort is both costly and
valued. Trends in Cognitive Sciences 22 (4): 337-49.

Isidro, H., D. Nanda, and P. D. Wysocki. 2020. On the relation between financial reporting
quality and country attributes: Research challenges and opportunities. The Accounting
Review 95 (3): 279-314.

Johnson, L. M., M. B. Keune, and J. Winchel. 2019. US auditors' perceptions of the PCAOB
inspection process: A behavioral examination. Contemporary Accounting Research 36
(3):1540-74.

Kahneman, D., P. Slovic, and A. Tversky. (Eds.). 1982. Judgment Under Uncertainty: Heuristics
and Biases. Cambridge MA: Cambridge University Press.

Kang, Y. J., M. D. Piercey, and A. Trotman. 2020. Does an audit judgment rule increase or
decrease auditors’ use of innovative audit procedures? Contemporary Accounting
Research 37 (1): 297-321.

Kapoor, M. 2020. Big Four invest billions in tech, reshaping their identities. Bloomberg.
https://news.bloombergtax.com/financial-accounting/big-four-invest-billions-in-tech-
reshaping-their-identities

Kim, S., and A. A. Labroo. 2011. From inherent value to incentive value: When and why
pointless effort enhances consumer preference. Journal of Consumer Research 38 (4):
712-42.

Ko, D., Y. Seo, and S. U. Jung. 2015. Examining the effect of cultural congruence, processing
fluency, and uncertainty avoidance in online purchase decisions in the US and
Korea. Marketing Letters 26 (3): 377-390.

Kipp, P., M. Curtis, and Z. Li. 2020. The attenuating effect of intelligent agents and agent
autonomy on managers’ ability to diffuse responsibility for and engage in earnings
management. Accounting Horizons 34 (4): 143-64.

Knechel, W. R., G. V. Krishnan, M. Pevzner, L. B. Shefchik, and U. K. Velury. 2013. Audit
quality: Insights from the academic literature. Auditing: A Journal of Practice & Theory
32 (Supplement 1): 385-421.

Koreff, J., and S. Perreault. 2023. Is sophistication always better? Can perceived data analytic
tool sophistication lead to biased judgments? Journal of Emerging Technologies in
Accounting 20 (1): 91-110.

KPMG. 2017. Audit data & analytics, https://home.kpmg.com/xx/en/home/services/audit/audit-
data-analytics.html, accessed September 8, 2021.

Krishnan, J., J. Krishnan, and H. Song. 2017. PCAOB international inspections and audit
quality. The Accounting Review 92 (5): 143-66.

Kruger, J., D. Wirtz, L. Van Boven, and T. W. Altermatt. 2004. The effort heuristic. Journal of
Experimental Social Psychology 40 (1): 91-8.

Lamoreaux, P. T. 2016. Does PCAOB inspection access improve audit quality? An examination
of foreign firms listed in the United States. Journal of Accounting and Economics 61 (2-
3): 313-37.

Lennox, C., and J. Pittman. 2010. Auditing the auditors: Evidence on the recent reforms to the
external monitoring of audit firms. Journal of Accounting and Economics 49 (1-2): 84-
103.

Libby, R., R. Bloomfield, and M. W. Nelson. 2002. Experimental research in financial
accounting. Accounting, Organizations and Society 27 (8): 775-810.

Lobo, G. J., and Y. Zhao. 2013. Relation between audit effort and financial report misstatements:
Evidence from quarterly and annual restatements. The Accounting Review 88 (4): 1385-
1412.

Matsumura, E. M., and R. R. Tucker. 1992. Fraud detection: A theoretical foundation. The
Accounting Review 67 (4): 753-82.

Morales, A. C. 2005. Giving firms an “E” for effort: Consumer response to high-effort firms.
Journal of Consumer Research 31: 806-12.

Naquin, C. E., and T. R. Kurtzberg. 2004. Human reactions to technological failure: How
accidents rooted in technology vs. human error influence judgments of organizational
accountability. Organizational Behavior and Human Decision Processes 93: 129-41.

PCAOB. 2015. Concept Release on Audit Quality Indicators. Release No. 2015-005.
Washington, DC: PCAOB.

PCAOB. 2019. Changes in Use of Data and Technology in the Conduct of Audits.
https://pcaobus.org/Standards/research-standard-setting-projects/Pages/data-
technology.aspx, accessed September 8, 2021.

PCAOB. 2021. Spotlight: Data and Technology Research Project Update. May. Washington
D.C.: PCAOB.

PCAOB. 2022. Spotlight: Staff Update and Preview of 2021 Inspection Observations. December.
Washington D.C.: PCAOB.

Peters, C. 2023. Auditor automation usage and professional skepticism. Working paper, Tillburg
University.

Sarstedt, M., D. Neubert, and K. Barth. 2016. The IKEA effect. A conceptual replication.
Journal of Marketing Behavior 2 (4): 307-12.

Schrift, R. Y., R. Kivetz, and O. Netzer. 2016. Complicating decisions: The work ethic heuristic
and the construction of effortful decisions. Journal of Experimental Psychology:
General 145 (7): 807.

Shefchik Bhaskar, L. 2019. How do risk-based inspections impact auditor behavior?
Experimental evidence on the PCAOB's process. The Accounting Review 95 (4): 103-26.

Shibano, T. 1990. Assessing audit risk from errors and irregularities. Journal of Accounting
Research 28: 110-40.

Simmons, J. P., L. D. Nelson, and U. Simonsohn. 2011. False-positive psychology: Undisclosed
flexibility in data collection and analysis allows presenting anything as significant.
Psychological Science 22 (11): 1359-66.

Slovic, P. 1987. Perception of risk. Science 236 (4799): 280-85.

Spencer, S. J., M. P. Zanna, and G. T. Fong. 2005. Establishing a causal chain: Why experiments
are often more effective than mediational analyses in examining psychological processes.
Journal of Personality and Social Psychology 89 (6): 845-51.

Srite, M. 2006. Culture as an explanation of technology acceptance differences: An empirical
investigation of Chinese and US users. Australasian Journal of Information Systems 14 (1).

Stefaniak, C. M., R. W. Houston, and D. M. Brandon. 2017. Investigating inspection risk: An
analysis of PCAOB inspections and internal quality reviews. Auditing: A Journal of
Practice & Theory 36 (1): 151-68.

Thaler, R. 1980. Toward a positive theory of consumer choice. Journal of Economic Behavior &
Organization 1 (1): 39-60.

Walker, K., and H. Brown-Liburd. 2019. The emergence of data analytics in auditing:
Perspectives from internal and external auditors through the lens of institutional theory.
Working paper, University of Wisconsin-Madison and Rutgers University.

Waytz, A., J. Heafner, and N. Epley. 2014. The mind in the machine: Anthropomorphism
increases trust in an autonomous vehicle. Journal of Experimental Social Psychology 52:
113-17.

Westermann, K. D., J. Cohen, and G. Trompeter. 2019. PCAOB inspections: Public accounting
firms on “trial.” Contemporary Accounting Research 36 (2): 694-731.

Xiao, T., C. Geng, and C. Yuan. 2020. How audit effort affects audit quality: An audit process
and audit output perspective. China Journal of Accounting Research 13 (1): 109-27.

Yavas, B. F., and F. Rezayat. 2003. The impact of culture on managerial perceptions of quality.
International Journal of Cross Cultural Management 3 (2): 213-234.

Figure 1 Theoretical Framework

Audit Approach (D&A vs. Traditional) --Link 1--> Perceived Audit Effort --Link 2--> Perceived Audit Quality
Theory of Audit Quality --Link 3--> moderates Link 2
[Links 1 and 2 together form the “Effort Heuristic”]

Notes: Figure 1 illustrates the theoretical framework that guides our predictions. We expect that external reviewers
will perceive D&A audit procedures as less effortful than traditional audit procedures (Link 1). Guided by research
on the “effort heuristic”, we expect that external reviewers will therefore judge D&A audit procedures as lower in
quality than traditional audit procedures (Link 2). However, we expect this dynamic will be moderated when
external reviewers are primed with different theories of audit quality (Link 3). Specifically, we expect that external
reviewers will judge D&A audit procedures as lower in quality than traditional audit procedures more when primed
with a theory of audit quality that emphasizes the importance of audit effort (“effort-is-essential”) than when primed
with a theory of audit quality that emphasizes how effort can be substituted with other factors like execution
(“effort-can-be-substituted”). Experiment 1 tests Link 1 and Link 2 of this framework without priming participants
with a theory of audit quality. Experiment 2 primes participants with a theory of audit quality and tests all three links
of the theoretical framework. See Section 2 of the paper for more details about the theoretical framework.

Figure 2 Statistical Diagram and Process Analysis - Experiment 1

Panel A: Diagram of Indirect Effect

Audit Approach --Indirect Path 1 (β = -1.39, p < 0.01)--> Perceived Audit Effort (Effort)
Perceived Audit Effort (Effort) --Indirect Path 2 (β = 0.51, p < 0.01)--> Perceived Audit Quality (Quality)
Audit Approach --Direct Effect (β = 0.21, p = 0.46)--> Perceived Audit Quality (Quality)

Panel B: Analysis of Indirect Effect

Path     Indirect Effect   95% Lower CI   95% Upper CI
Effort   -0.70             -1.24          -0.31

Notes: Figure 2 illustrates a statistical diagram and analysis from a mediation analysis of Experiment 1 (see Hayes
2018). In the experiment, 60 partners and senior managers with external review experience assume the role of
external reviewer, review workpapers from an audit engagement, and make judgments about the quality of the audit
procedures employed in the engagement. Holding constant the level of assurance provided by the audit procedures,
we manipulate Audit Approach as whether the audit procedures incorporate data and analytics tools (D&A
procedures) or not (traditional procedures). See “Method” in Section 3 and online Appendix 2 for detailed
descriptions. Bolded p-values in the diagram indicate one-tailed p-values, given directional predictions.
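For intuition, the indirect effect in panel B follows the product-of-coefficients logic of mediation analysis: Indirect Path 1 times Indirect Path 2 gives -1.39 × 0.51 ≈ -0.71, in line with the tabulated point estimate of -0.70 (the difference reflects rounding of the displayed coefficients).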

Figure 3 Experiment 2 Results

[Line graph: mean Quality (y-axis, 1.00 to 4.00) by Audit Quality Prime (x-axis: Effort-is-Essential Prime vs. Effort-Can-Be-Substituted Prime), plotted separately for D&A Procedures and Traditional Procedures.]

Notes: Figure 3 illustrates results for Experiment 2. In the experiment, 90 partners, 5 senior managers, and 3
managers with external review experience assume the role of external reviewer, review workpapers from an audit
engagement, and make judgments about the quality of the audit procedures employed in the engagement. We
manipulate whether the audit procedures incorporate data and analytics tools (D&A procedures) or not (traditional
procedures). See online Appendix 2 for excerpts. Before performing this task, participants read a speech that either
emphasizes the importance of manual effort in audits (Effort-is-Essential Prime) or that manual effort can be
substituted by audit execution (Effort-Can-Be-Substituted Prime) (see online Appendix 4 for excerpts).

Figure 4 Statistical Diagram and Process Analysis - Experiment 2

Panel A: Diagram of Conditional Indirect Effect

Audit Approach --Indirect Link 1 (β = -0.90, p < 0.01)--> Perceived Audit Effort (Effort)
Perceived Audit Effort (Effort) --Indirect Link 2 (β = 0.06, p = 0.64)--> Perceived Audit Quality (Quality)
Audit Approach --Direct Effect (β = 0.31, p = 0.40)--> Perceived Audit Quality (Quality)
Primed Theory of Audit Quality moderates the direct path (Conditional Direct Effect: β = -0.56, p = 0.28) and the Effort-to-Quality path (Conditional Indirect Effect: β = 0.52, p < 0.01)

Panel B: Analysis of Conditional Indirect Effect

Indirect Effect of Effort Conditional on Theory of Audit Quality Prime

Primed Theory of Audit Quality     Indirect Effect   95% Lower CI   95% Upper CI
Effort-is-Essential Prime          -0.53             -0.95          -0.19
Effort-Can-Be-Substituted Prime    -0.06             -0.38          0.23

Index of Moderated Mediation

Index   95% Lower CI   95% Upper CI
-0.47   -0.98          -0.10

Notes: Figure 4 illustrates a statistical diagram and process analysis from moderated mediation analyses of
Experiment 2 (see Hayes 2018). In the experiment, 90 partners, 5 senior managers, and 3 managers with external
review experience assume the role of external reviewer, review workpapers from an audit engagement, and make
judgments about the quality of the audit procedures employed in the engagement. We manipulate Audit Approach as
whether the audit procedures incorporate data and analytics tools (D&A procedures) or not (traditional procedures).
See online Appendix 2 for excerpts. Before performing this task, participants read a speech that either emphasizes
the importance of manual effort in audits (Effort-is-Essential Prime) or that manual effort can be substituted by audit
execution (Effort-Can-Be-Substituted Prime) (see online Appendix 4 for excerpts). Bolded p-values in the diagram
indicate one-tailed p-values, given directional predictions.
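For intuition, the conditional estimates in panel B follow from the panel A coefficients under the assumption that the prime indicator equals 0 for the effort-can-be-substituted condition and 1 for the effort-is-essential condition: the Effort-to-Quality slope is 0.06 under the substituted prime and 0.06 + 0.52 = 0.58 under the essential prime, yielding indirect effects of -0.90 × 0.06 ≈ -0.06 and -0.90 × 0.58 ≈ -0.52, with an index of moderated mediation of -0.90 × 0.52 ≈ -0.47 (differences from the tabulated values reflect rounding of the displayed coefficients).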

TABLE 1
Demographic information

Experiment 1 Experiment 2
Demographic Variable (n=60) (n=98)
Years of Audit Experience 16.88 years 23.89 years

Firm Position:
Partner/ Director n=37 (61.67%) n=90 (91.84%)
Senior Manager n=22 (36.67%) n=5 (5.10%)
Manager n=1 (1.67%) n=3 (3.06%)

External Review Experience:
AICPA Peer Review n=16 (26.67%) n=50 (51.02%)
Firm Quality Control Review n=42 (70.00%) n=83 (84.69%)

Familiarity with D&A tools 3.32 3.77
(0=Not at all; 6=Very)

Innovativeness 4.30 4.19
(0=Low; 6=High)

Notes: Table 1 presents demographic information for participants in Experiment 1 and Experiment 2. Familiarity
with D&A tools was captured by asking participants to rate their “familiarity with data and analytic tools that are
used in audits”, on a 7-point Likert scale ranging from 0 (Not at all familiar) to 6 (Very familiar). Innovativeness is
captured through a scale validated by Agarwal and Prasad [1998]. Participants rate their agreement with the
following statements: “If I heard about a new technology, I would look for ways to experiment with it.” “Among my
peers, I am usually the first to try out new technologies.” “In general, I am hesitant to try out new technologies.” “I
like to experiment with new technologies.” We create a composite measure, Innovativeness, by averaging these four
measures after reverse-coding responses to the third statement.
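
The composite’s construction can be illustrated as follows (a sketch; innov1 through innov4 are hypothetical column names for the four items):

```python
# Illustrative construction of the Innovativeness composite described above.
# innov1..innov4 are hypothetical names for the four Agarwal and Prasad (1998)
# items, each scored on the 0-6 scale.
import pandas as pd

def innovativeness(df: pd.DataFrame) -> pd.Series:
    items = df[["innov1", "innov2", "innov3", "innov4"]].copy()
    items["innov3"] = 6 - items["innov3"]  # reverse-code the "hesitant" item
    return items.mean(axis=1)              # average the four items
```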

TABLE 2
Experiment 1 results
                              Mean (std dev)
Variable            D&A Procedures (n=32)   Traditional Procedures (n=28)   t-Statistic   p-value*
Audit Quality Variables
Quality 2.44 (1.01) 2.93 (1.25) 1.68 0.049
Failed 3.91 (1.55) 3.61 (1.66) 0.72 0.237
Liability 71.00 (18.24) 53.00 (24.12) 3.28 0.001
Audit Effort Measures
Effort 2.47 (1.11) 3.86 (1.38) 4.32 <0.001
Additional Effort 4.38 (1.31) 3.75 (1.38) 1.80 0.039
Efficiency 4.38 (1.24) 2.39 (1.23) 6.21 <0.001
Alternative Cognitive Mechanisms
Exceptions 14,242 (5,016) 10,375 (6,610) 2.53 0.007
Unexamined
Exceptions 13,602 (6,787) 9,254 (6,444) 2.42 0.010
Risk-Concern 68.34 (20.85) 60.00 (25.99) 1.38 0.087
Risk-Severity 68.31 (21.83) 57.50 (26.11) 1.75 0.043
Risk-New 38.75 (27.80) 20.54 (18.53) 2.94 0.002
Risk-Known 45.78 (24.22) 51.93 (23.85) 0.99 0.164
Innovativeness 4.23 (0.99) 4.38 (1.05) 0.57 0.57

Notes: Table 2 presents descriptive statistics and two-sample t-tests for Experiment 1. In the experiment, 60
partners and senior managers with external review experience assume the role of external reviewer, review
workpapers from an audit engagement, and make judgments about the quality of the audit procedures employed
in the engagement. Holding constant the level of assurance provided by the audit procedures, we manipulate
Audit Approach as whether the audit procedures incorporate data and analytics tools (D&A procedures) or not
(traditional procedures). See Method in Section 3 and online Appendix 2 for detailed descriptions. See Method
in Section 3 for other variable definitions.

*Reported p-values are one-tailed, given directional predictions.
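
As a numerical check, the Quality row can be reproduced from the reported summary statistics alone (a sketch assuming a pooled-variance t-test; the one-tailed p-value is half the two-sided value):

```python
# Reproducing the Table 2 Quality t-test from its reported means, SDs, and cell sizes.
from scipy import stats

t, p_two_sided = stats.ttest_ind_from_stats(
    mean1=2.44, std1=1.01, nobs1=32,  # D&A procedures
    mean2=2.93, std2=1.25, nobs2=28,  # traditional procedures
    equal_var=True,
)
print(abs(t), p_two_sided / 2)  # |t| ≈ 1.68, one-tailed p ≈ 0.049
```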

TABLE 3
Experiment 2 results
Panel A: Mean (std dev) for Quality and Effort
                              Primed Theory of Audit Quality
                          Effort-is-Essential Prime    Effort-Can-Be-Substituted Prime       Total
Audit Approach            Quality      Effort          Quality      Effort            Quality      Effort
D&A Procedures            2.32 (1.31)  3.48 (1.12)     2.88 (1.23)  4.04 (1.33)       2.59 (1.29)  3.76 (1.25)
                          n=25         n=25            n=24         n=24              n=49         n=49

Traditional Procedures    3.19 (1.41)  4.54 (1.42)     2.61 (1.31)  4.78 (1.35)       2.92 (1.38)  4.65 (1.38)
                          n=26         n=26            n=23         n=23              n=49         n=49

Total                     2.76 (1.42)  4.02 (1.38)     2.74 (1.26)  4.40 (1.38)       2.76 (1.34)  4.20 (1.38)
                          n=51         n=51            n=47         n=47              n=98         n=98

Panel B: Analysis of variance for Quality


Source d.f. M.S. F-Statistic p-value
Audit Approach 1 2.24 1.29 0.259
Primed Theory of Audit Quality 1 0.01 0.00 0.957
Audit Approach * Primed Theory of Audit Quality (Hypothesis 2) 1 7.92 4.55 0.018
Error 94 1.74

Panel C: Follow-up simple effects


Test d.f. t-statistic p-value
Effect of Audit Approach Under Effort-is-Essential Prime 94 2.36 0.010
Effect of Audit Approach Under Effort-Can-Be-Substituted Prime 94 -0.69 0.491

Notes: Table 3 presents descriptive statistics and analyses for Experiment 2. In the experiment, 90 partners, 5 senior
managers, and 3 managers with external review experience assume the role of external reviewer, review workpapers
from an audit engagement, and make judgments about the quality of the audit procedures employed in the
engagement. We manipulate Audit Approach as whether the audit procedures incorporate data and analytics tools
(D&A procedures) or not (traditional procedures). See online Appendix 2 for excerpts. Before performing this task,
participants read a speech that either emphasizes the importance of manual effort in audits (Effort-is-Essential
Prime) or that manual effort can be substituted by audit execution (Effort-Can-Be-Substituted Prime) (see online
Appendix 4 for excerpts). Quality is measured by asking participants to assess “the quality of the audit procedures
performed” (0-very low quality, 6-very high quality). Effort is measured by asking participants to assess how much
effort the audit team invested in performing the audit procedures (0=very little effort, 6=a great deal of effort).
Bolded p-values are one-tailed equivalents, given directional predictions.
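
The panel C simple effects can likewise be reconstructed from the panel A cell statistics and the panel B error term (a sketch using the pooled MSE of 1.74 on 94 degrees of freedom):

```python
# Reconstructing a Table 3 panel C simple effect from the cell means in panel A
# and the pooled ANOVA error term in panel B (MSE = 1.74, df = 94).
import numpy as np
from scipy import stats

def simple_effect(mean1, n1, mean2, n2, mse=1.74, df=94):
    se = np.sqrt(mse * (1 / n1 + 1 / n2))  # standard error from the pooled MSE
    t = (mean1 - mean2) / se
    return t, stats.t.sf(abs(t), df)       # one-tailed p-value

# Effect of Audit Approach under the effort-is-essential prime:
t, p = simple_effect(3.19, 26, 2.32, 25)   # traditional vs. D&A: t ≈ 2.36, p ≈ 0.010
```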
