REVIEWER ACKNOWLEDGMENTS
EDITOR’S COMMENTS
The Editor and Associate Editors at AABR would like to thank the many excellent
reviewers who have volunteered their time and expertise to make this an outstand-
ing publication. Publishing quality papers in a timely manner would not be possible
without their efforts.
Elizabeth Dreike Almer
Portland State University, USA

John C. Anderson
San Diego State University, USA

Philip R. Beaulieu
University of Calgary, Canada

Jean Bedard
Northeastern University, USA

James Bierstaker
University of Massachusetts, Boston, USA

Dennis M. Bline
Bryant College, USA

Robert H. Chenhall
Monash University, Australia

Freddie Choo
San Francisco State University, USA

Christie L. Comunale
Long Island University – C.W. Post Campus, USA

Charles Cullinan
Bryant College, USA

Elizabeth Davis
Baylor University, USA

Roger Debreceny
Nanyang Technological University, Singapore

William N. Dilla
Iowa State University, USA

Alan S. Dunk
University of Tasmania, Australia

Jennifer D. Goodwin
University of Queensland, Australia

Glen Gray
California State University, Northridge, USA

Heather Hermanson
Kennesaw State University, USA

Mary Callahan Hill
Kennesaw State University, USA

Karen L. Hooks
Florida Atlantic University, USA

James E. Hunton
Bentley College, USA

Mike Kirschenheiter
Columbia University, USA

Stacy Kovar
Kansas State University, USA
Vicky Arnold
Editor
EDITORIAL POLICY AND
SUBMISSION GUIDELINES
MANUSCRIPT SUBMISSION
Manuscripts should be forwarded to the editor, Vicky Arnold, at
Vicky.Arnold@business.uconn.edu via e-mail. All text, tables, and figures should be
incorporated into a Word document prior to submission. The manuscript should also
include a title page containing the names and addresses of all authors and a concise
abstract. Also, include a separate Word document with any experimental materials
or survey instruments. If you are unable to submit electronically, please forward
the manuscript along with the experimental materials to the following address:
For Journals
Dunn, C. L., & Gerard, G. J. (2001). Auditor efficiency and effectiveness with
diagrammatic and linguistic conceptual model representations. International
Journal of Accounting Information Systems, 2(3), 1–40.
For Books
For a Thesis
Thorne, L., Massey, D. W., & Magnan, M. (2000). Insights into selection-
socialization in the audit profession: An examination of the moral reasoning of
public accountants in the United States and Canada. Working paper: York Univer-
sity, North York, Ontario.
ABSTRACT
This study models auditors’ professional commitment as the product of
socialization forces operating within the public accounting profession. The
results of a structural equation analysis from a sample of 349 auditors
representing international, national and regional firms indicate that firm size
is inversely related to professional commitment. Furthermore, the findings
indicate that a strong relationship exists between an auditor’s political
ideology and professional commitment. Politically conservative auditors,
reflecting the dominant ideology in public accounting, reported significantly
higher professional commitment than politically liberal auditors.
INTRODUCTION
The accounting scandals that have marked the dawn of the 21st century, such
as Enron, MCI, and Global Crossing, have damaged the credibility of the audit
report and the reputation of the public accounting industry. Perhaps more than
ever, commitment to the ideals and standards of the auditing profession is vital
Firm Size
Pratt and Beaulieu (1992) asserted that differences in firm size proxy for differ-
ences in culture. They concluded that larger firms have more rigid control systems
than smaller firms, resulting in the large firms being more structured and mecha-
nistic than the smaller firms. Wheeler et al. (1987) found that the nature of the work
environment, the organizational structure, performance evaluations, compensation
and promotion procedures in large firms differed substantially from those of
smaller firms. Goetz et al. (1991) contended that the more structured and bureau-
cratic environment of larger firms resulted in less individual voice in determining
rules of conduct within the firm. Ponemon (1992) claims that such a strong firm
culture effectively results in the organization weeding out those persons who fail
to conform.
These factors imply that the loyalty of accountants in the larger firms must
be first to the organization and then to the profession. Goetz et al. (1991)
support this premise and assert that because smaller firms have “less stand-alone
credibility” than do larger firms, practitioners in the smaller firms need the
profession more than practitioners in the larger firms. Larger firms are more
visible and prestigious, conferring on their members an identity separate from
the profession. This suggests that auditors in smaller firms may identify more
readily with the profession, vis-à-vis the organization, than auditors in larger firms
and correspondingly develop a greater sense of commitment to the profession.
H1. Firm size is inversely related to auditors’ professional commitment.
A Structural Equation Model of Auditors’ Professional Commitment 7
Political Ideology
Socialization encourages persons “to become similar to their profession, not only
as it is embodied by other organizational members, but also as it is defined by
the profession’s espoused ideals” (Fogerty, 1992, p. 139). This description of
the socialization process implies the existence of a prototypic public accountant
embodying desirable characteristics, values and attitudes. The more effective the
socialization processes, the greater the correspondence between the prototype
and the professional member. Some values and attitudes (e.g. commitment,
identification) may be more readily influenced and inculcated by the
socialization process than others (e.g. religious preferences). It is also possible that
some prototypic characteristics are not amenable to socialization (e.g. gender,
race).
A particularly appropriate theory for examining the influence of prototypes
on socialization processes in the auditing profession is self-categorization theory
(SCT) (Chatman et al., 1998; Hogg & Terry, 2000; Tajfel & Turner, 1985).2 SCT
focuses on the process whereby individuals define their self-concept in relation
to their membership in social groups. Prototype-based comparison, whereby
social categorization of the individual into favorable in-group or unfavorable
out-group membership occurs, “lies at the heart” of SCT processes (Hogg &
Terry, 2000, p. 122). Prototypes are cognitive representations of the defining
and stereotypical features of in-groups, embodying exemplary or ideal types
and capturing characteristics that differentiate them from other groups. These
characteristics include demographic attributes, behaviors, attitudes and values.
Critical to the notion of prototypes is that they accentuate similarities within and
differences between groups (Hogg & Terry, 2000). For example, because the pro-
totypical partner in public accounting is male, an in-group characteristic may be
masculinity and an out-group characteristic femininity (Maupin, 1993; Maupin &
Lehman, 1994).3
Prototype-based self-categorization is relevant for modeling professional
commitment as a socialization process directed towards cultivating professional
values (Jeffery & Weatherholt, 1996; Larson, 1977) for several reasons. First, in-
group members, reflecting prototypic characteristics, are more likely to cooperate
with each other and to compete with out-group members (Chatman et al., 1998).
Second, in-group members are likely to receive favorable treatment compared to
out-group members (Ashforth & Mael, 1989). This favoritism may be reflected
in work assignments, performance evaluations, receipt of voluntary mentoring, or
through informal signals of preference relative to out-group members. As a result,
in-group members are likely to maintain more favorable attitudes towards their
profession and be more readily socialized than out-group members. Third, SCT
implies that a prototypically homogeneous audit profession is likely to develop,
8 JOHN T. SWEENEY, JEFFREY J. QUIRIN AND DANN G. FISHER
Control Paths
Prior research has documented that partners in public accounting are typically
male (Hooks & Cheramy, 1994; Hull & Umansky, 1997) and, on average,
have developed to the conventional level of moral reasoning (Sweeney, 1995).
Researchers have suggested that masculinity (Maupin, 1993; Maupin & Lehman,
1994) and conventional moral reasoning (Ponemon, 1992) represent prototypes
in public accounting. Since the influence of both gender and moral reasoning on
professional commitment has been examined in prior research, these variables
are included as control paths in the model of professional commitment.
Although the literature suggests that gender barriers in public accounting may
preclude women from attaining the same level of commitment to the profession as
men (Maupin, 1993; Maupin & Lehman, 1994), the results of empirical research
have been equivocal. Gaffney et al. (1993) found that family obligations increased
the professional commitment of men in public accounting but had no effect on
women’s professional commitment. Street et al. (1993), after controlling for
positional level, did not find a difference in professional commitment between
female and male public accountants.
Covaleski et al. (1998) contend that although women may have “broken the
glass ceiling” to attaining partnership in Big 6 firms, there is still a paucity of
high-level female partners. Women who are unable or unwilling to adopt the
masculine characteristics required by the male-dominated culture of public accounting
may encounter obstacles in making partner (Maupin & Lehman, 1994). Given
the predominance of male partners and the difficulties that women may
encounter in adopting in-group male qualities, women in public accounting may
represent an out-group and have correspondingly less professional commitment
than men.
H3. Male auditors will have greater professional commitment than will female
auditors.
Ethics researchers in accounting have consistently found that the ethical devel-
opment of auditors, as measured by the P score of the Defining Issues Test
(DIT) (Rest, 1986, 1993), most commonly reflected conventional reasoning
and was inversely related to positional level (Lampe & Finn, 1992; Ponemon
& Gabhart, 1993; Shaub, 1994). This result seemingly contradicts Kohlberg’s
(1969) moral development theory, which holds that development is sequential
and progressive but not regressive. Ponemon (1992) contended that the inverse
relationship between P scores and rank in public accounting organizations was
the result of a selection-socialization process whereby firms prefer to hire and
then promote individuals with a shared set of ethical values and beliefs. He found
METHOD
Sample
Staff 23 22 55 100
Senior 15 10 63 88
Supervisor 11 8 19 38
Manager 10 14 39 63
Partner 29 9 22 60
Totals 88 63 198 349
Measures
Professional commitment (PC) was measured with the 15-item scale adapted by
Aranya et al. (1981) from the Porter et al. (1974) organizational commitment
questionnaire. This scale has been utilized extensively by accounting researchers
to measure professional commitment (Aranya et al., 1982; Gaffney et al., 1993;
Harrell et al., 1986; Jeffery & Weatherholt, 1996; Street et al., 1993). Researchers
have indicated that the scale has good internal consistency, with Cronbach’s
alpha reported in the high 0.80s (Aranya et al., 1981; Aranya & Ferris, 1984;
Bline et al., 1991).
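The internal-consistency statistic cited above can be illustrated with a minimal sketch. Cronbach's alpha for a k-item scale is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The function name and the three-item toy responses below are hypothetical and are not data from the study, which used a 15-item scale.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose rows (respondents) into columns (items).
    items = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 respondents x 3 items (illustration only).
toy = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

Items that move together across respondents push the total-score variance well above the summed item variances, which is what drives alpha toward 1.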
Bline et al. (1991), in an extensive examination of the psychometric properties
of the professional commitment questionnaire, report that the scale measures a
construct distinct from organizational commitment. Their tests indicated that the
professional commitment scale has adequate reliability and validity. Furthermore,
the professional commitment construct correlated positively with job satisfaction
and negatively with intent to leave the profession. Other accounting researchers
have reported negative correlations between the professional commitment scale
and organizational-professional conflict (Aranya et al., 1981; Harrell et al., 1986)
and positive correlations with favorable work attitudes in public accounting
(Aranya et al., 1982).6
Ethical development was measured by the sample respondents’ P score
from the 6-story DIT (Rest, 1979, 1986, 1993). The P score is a continuous
measure, ranging from 0 to 95, reflecting the relative importance a subject gives
to principled moral reasoning in resolving moral dilemmas (Rest et al., 1997,
p. 498). Rest (1993) reports an average P score of 45 for college graduates,
although accounting researchers have generally found that public accountants
score lower than adults from the general population at similar educational levels
(Ponemon, 1992; Sweeney, 1995). Rest (1986, pp. 176–179) contends that the P
score correlates most strongly with educational level but only weakly with gender,
intelligence and ethnic background. Gender, however, appears to have a stronger
influence on accountants’ P scores than it does in the general population, with
females attaining significantly higher scores (Bernardi & Arnold, 1997; Enyon
et al., 1997; Shaub, 1994; Sweeney, 1995).
The DIT has been subjected to extensive reliability and validity tests with
generally good results (Rest, 1979, 1986; Rest et al., 1999). Some researchers
(Emler et al., 1983), however, contend that the DIT contains a political bias. In
studies with accounting subjects, Sweeney and Fisher (1998, 1999) found that the
DIT contained an imbedded political content that tended to overstate the scores
of political liberals and to understate the scores of political conservatives. They
suggest that researchers utilizing the DIT control for subjects’ political ideology in
order to more clearly interpret the relationship between P scores and the variable
of interest.
Subjects indicated their political ideology in response to the following
question: “Regarding important social and political issues, would you classify your
opinion or perspective as primarily conservative or liberal?” Forcing subjects to
identify their positions as primarily liberal or conservative is consistent with prior
research (Sweeney, 1995) and eliminates the ambiguity of a political “moderate”
classification.
EMPIRICAL RESULTS
Correlations
Structural equation modeling was used to evaluate the proposed hypotheses. The
structural equation model utilized to test the hypotheses corresponds to the model
in Fig. 1. Each link between the variables in Fig. 1 has a path coefficient that
measures the impact of the antecedent variable in explaining the variance in the
outcome variable. For example, the path coefficient for the link between political
ideology and P score indicates the increase in P score, measured in standard
deviations, associated with a one standard deviation increase in political ideology.
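The standard-deviation interpretation of a path coefficient can be sketched for the simplest case of one antecedent and one outcome, where the standardized coefficient is the raw slope rescaled by the ratio of the two standard deviations (and equals the Pearson correlation). This is an illustration under stated assumptions, not the estimation procedure used in the study: in the full model the paths are estimated jointly, and the variable names and values below are hypothetical.

```python
from statistics import mean, stdev


def standardized_path(x, y):
    """Slope of y on x expressed in standard-deviation units."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    raw_slope = sxy / sxx
    # Rescale: a one-SD change in x corresponds to this many SDs of y.
    return raw_slope * stdev(x) / stdev(y)


# Hypothetical data: a 0/1 ideology code and made-up P scores.
ideology = [0, 0, 1, 1, 1, 0]
p_score = [30, 36, 42, 48, 45, 33]

beta = standardized_path(ideology, p_score)
# Changing the measurement units of y leaves the coefficient unchanged.
beta_rescaled = standardized_path(ideology, [10 * v for v in p_score])
print(round(beta, 3), round(beta_rescaled, 3))
```

Because the coefficient is unit-free, paths involving differently scaled variables can be compared directly.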
The goal of structural equation modeling is to evaluate whether associations
proposed in theory, or in prior research, fit the present data set. Evidence of proper
fit is provided by various other fit indices. However, measures of proper fit can
be problematic since several of the commonly used fit indices are sample size
dependent. For this reason, multiple measures of overall model fit are reported in
this study.
       (1)        (2)        (3)        (4)        (5)        (6)
(1)    1.000
(2)   −0.246**    1.000
(3)   −0.132**    0.087      1.000
(4)   −0.017     −0.080      0.194**    1.000
(5)    0.234**   −0.146**   −0.046     −0.105*     1.000
(6)   −0.116*     0.054      0.055      0.205**   −0.353**    1.000
N = 349.
* p < 0.05 (one-tailed significance).
** p < 0.01 (one-tailed significance).
The Normed Fit Index (NFI) (Bentler & Bonett, 1980) has an index range
from 0 to 1, with values over 0.9 indicating a good fit. This index may be viewed
as the percentage of observed-measure covariation explained by a given model.
The disadvantage of the NFI is that it can underestimate goodness-of-fit in small
samples. Bentler’s (1990) revised Normed Comparative Fit Index (CFI) is based
upon the Bentler and Bonett (1980) NFI but with a correction for sample-size
dependency. CFI values always lie between 0 and 1, with values over 0.9 indicating
a relatively good fit (Bentler, 1990). Finally, the Adjusted Goodness of Fit Index
(AGFI), devised by Joreskog and Sorbom (1984), is an additional fit index that
ranges from 0 to 1, with values above 0.9 indicating acceptable fit. Specifically,
in addition to the traditional Goodness of Fit Index (GFI), the Adjusted Goodness
of Fit Index (AGFI), the Normed Fit Index (NFI), and the Comparative Fit Index
(CFI) are reported in this study. This lends some assurance that the measures of
fit produced are not spurious.
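The two incremental indices described above can be sketched from their standard definitions. Both compare the chi-square of the fitted model with that of a baseline (null) model, and the CFI additionally subtracts the degrees of freedom from each chi-square before forming the ratio. The chi-square values below are hypothetical and are not results from this study.

```python
def nfi(chi2_model, chi2_null):
    """Normed Fit Index: proportional drop in chi-square relative to the null model."""
    return (chi2_null - chi2_model) / chi2_null


def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative Fit Index, based on chi-square minus degrees of freedom."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    if d_null == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / d_null


# Hypothetical values for illustration only.
example_nfi = nfi(chi2_model=12.0, chi2_null=200.0)
example_cfi = cfi(chi2_model=12.0, df_model=6, chi2_null=200.0, df_null=15)
print(round(example_nfi, 3), round(example_cfi, 3))
```

With these made-up inputs both indices exceed the 0.9 threshold discussed in the text; the `max` terms are what keep the CFI within the 0 to 1 range.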
Figurative depictions of the results of the structural equation analysis are
presented in Fig. 2. With GFI, AGFI, NFI, and CFI values exceeding 0.9 in
all instances, the theoretical model appears to provide a very good fit with the
dataset.
Tabular results of the structural equation analysis including a listing of each
hypothesis and its corresponding path coefficient are presented in Table 3.
Consistent with the relatively high model fit indices, results in Table 3 indicate
that an overwhelming majority of the associations hypothesized in the current
study and suggested by prior literature were significant, providing further support
for the proposed theoretical model of professional commitment.
Tests of Hypotheses
Additional Analysis
conservative auditors have higher commitment than liberal auditors for every cell
containing at least two liberal auditors.
An objective of socialization is to ensure that management promotes those
individuals who reflect the culture and values of the organization (Fogerty, 1992;
Kanter, 1977; Ponemon, 1992). If conservative ideology is a strongly held value
in the culture of public accounting, then politically conservative auditors should
perceive greater opportunities for advancement than politically liberal auditors.
To provide further evidence of the socializing influence of political ideology
in public accounting, subjects who were not partners were asked to respond
to the following question: “Please indicate what you believe are your chances
(likelihood) of making partner in your present firm.” The Likert response scale for
the question ranged from 1 (very low) to 7 (very high). Conservative auditors, on
average, perceived their opportunities for advancement to partner as significantly
greater than liberal auditors (3.68 vs. 2.96; p < 0.0003).
NOTES
1. Former Securities and Exchange Commission (SEC) Chairman Arthur Levitt questioned
whether the expansion into more lucrative services compromises the traditional audit
function (Covaleski, 1999). Suggesting that the audit has merely become a conduit for
selling other services, Levitt contends that auditors may not be sufficiently committed to
societal expectations and professional standards.
2. SCT is an extension of social identity theory (SIT) (Ashforth & Mael, 1989; Brown,
2000; Tajfel & Turner, 1985). SIT maintains that one’s social identity is derived primarily
from group membership, that people strive to maintain a positive identity, and that this
positive identity largely results from favorable comparisons between relevant in-groups
and out-groups (Ashforth & Mael, 1989).
3. Fogerty (2000, p. 13) described the socializing influence of prototypes in public
accounting firms when he stated: “Experienced organizational members selectively provide
reinforcement, communicate the approved range for action, and serve as examples of
achievement.”
4. An individualist orientation supports the notion of capitalism in viewing people as
independent economic actors, as opposed to a collectivist orientation that is more aligned
with a socialist perspective (Burns, 1992, p. 352).
5. After controlling for political ideology and gender, Sweeney (1995) did not find a
significant relationship between rank and DIT P scores. Therefore, we do not control for
the influence of rank on ethical development.
6. Dwyer et al. (2000) examined the dimensionality of the Aranya et al. (1981) pro-
fessional commitment scale with a broad sample of practicing accountants and concluded
that the 15-item scale could be parsimoniously reduced to a five-item measure. In light of
this research, we performed a principal components, orthogonal rotation factor analysis of
the instrument. Results of the factor analysis indicated that 14 of the 15 items possessed
loadings of 0.40 or greater on a single factor. Item 7 of the instrument, which possessed a
loading of 0.15, was the lone item not contributing to the factor. The resulting eigenvalue
for the 14-item factor was 5.49. The Cronbach alpha for the 15-item measure was 0.88.
Supplemental analyses utilizing the reduced 5-item scale from Dwyer et al. (2000) were also
performed and the results were essentially identical to those incorporating the full scale.
ACKNOWLEDGMENTS
We gratefully acknowledge the helpful comments of the participants in the 2001
Annual Meeting of the Accounting, Behavior & Organizations Section, the 2002
Critical Perspectives in Accounting Conference, and the accounting research work-
shops at the Australian National University and at Washington State University.
REFERENCES
Adler, A., & Aranya, A. N. (1984). Comparison of the work needs, attitudes and preferences of profes-
sional accountants at different career stages. Journal of Vocational Behavior (August), 45–57.
Fogerty, T. J. (2000). Socialization and organizational outcomes in large public accounting firms.
Journal of Managerial Issues, 12(Spring), 12–33.
Gaffney, M. A., McEwen, R. A., & Welsh, M. J. (1993). Gender effects on commitment of public
accountants: A test of competing sociological models. Advances in Public Interest Accounting,
5, 45–73.
Goetz, J. F., Morrow, P. C., & McElroy, J. C. (1991). The effect of accounting firm size and member
rank on professionalism. Accounting, Organizations and Society, 16, 159–165.
Harrell, A., Chewning, E., & Taylor, M. (1986). Organizational-professional conflict and the job
satisfaction and the turnover intentions of internal auditors. Auditing: A Journal of Practice
and Theory, 5(Spring), 109–121.
Hogg, M. A., & Terry, D. J. (2000). Social identity and self-categorization processes in organizational
contexts. Academy of Management Review, 25(1), 121–140.
Hooks, K. L., & Cheramy, S. J. (1994). Facts and myths about women CPAs. Journal of Accountancy,
178(October), 79–86.
Hull, R. P., & Umansky, P. H. (1997). An examination of gender stereotyping as an explanation for
vertical job segregation in public accounting. Accounting, Organizations and Society, 22(6),
507–528.
Jeffery, C., & Weatherholt, N. (1996). Ethical development, professional commitment, and rule obser-
vance attitudes: A study of CPAs and corporate accountants. Behavioral Research in Accounting,
8, 8–31.
Joreskog, K., & Sorbom, D. (1984). LISREL – VI users guide (4th ed.). Mooresville, IN: Scientific
Software.
Kanter, R. (1977). Men and women of the corporation. New York: Basic Books.
Kohlberg, L. (1969). Stage and sequence: The cognitive developmental approach to socialization. In:
D. A. Goslin (Ed.), Handbook of Socialization Theory and Research (pp. 347–480). Chicago:
Rand McNally.
Lampe, J., & Finn, D. (1992). A model of auditors’ ethical decision process. Auditing: A Journal of
Practice & Theory (Suppl.), 1–21.
Larson, M. S. (1977). Rise of professionalism: A sociological analysis. Berkeley: University of California
Press.
Maupin, R. J. (1993). How can women’s lack of upward mobility in accounting organizations be
explained? Group and Organization Management, 18(June), 132–152.
Maupin, R. J., & Lehman, C. R. (1994). Talking heads: Stereotypes, status, sex-roles and satisfaction
of female and male auditors. Accounting, Organizations and Society, 19, 427–437.
Norris, D. R., & Niebuhr, R. E. (1983). Professionalism, organizational commitment and job satisfaction
in an accounting organization. Accounting, Organizations and Society, 9, 49–59.
Ponemon, L. A. (1992). Ethical reasoning and selection-socialization in accounting. Accounting,
Organizations and Society, 17, 239–258.
Ponemon, L. A., & Gabhart, D. (1993). Ethical reasoning in accounting and auditing. Vancouver,
Canada: Canadian General Accountants’ Research Foundation.
Porter, L. W., Steers, R. M., Mowday, R. T., & Boulian, P. V. (1974). Organizational commitment,
job satisfaction, and turnover among psychiatric technicians. Journal of Applied Psychology,
59(October), 603–609.
Pratt, J., & Beaulieu, P. (1992). Organizational culture in public accounting: Size, technology, rank,
and functional area. Accounting, Organizations and Society, 17, 667–684.
Rest, J. R. (1979). Development in judging moral issues. Minneapolis, MN: University of Minnesota
Press.
Rest, J. R. (1986). Moral development: Advances in research and theory. New York: Praeger Press.
Rest, J. R. (1993). Guide for the defining issues test. Version 1.3. Minneapolis, MN: University of
Minnesota.
Rest, J., Narvaez, D., Bebeau, M. J., & Thoma, S. J. (1999). Postconventional moral thinking: A
neo-Kohlbergian approach. New Jersey: Lawrence Erlbaum Associates.
Rest, J., Thoma, S. J., & Edwards, L. (1997). Designing and validating a measure of moral judgment:
Stage preferences and stage consistency approaches. Journal of Educational Psychology, 89(1),
5–28.
Schroeder, R. G., & Imdieke, L. F. (1977). Local-cosmopolitan and bureaucratic perceptions in public
accounting firms. Accounting, Organizations and Society, 1, 39–45.
Shaub, M. (1994). An analysis of factors affecting the cognitive moral development of auditors and
auditing students. Journal of Accounting Education, 12, 1–26.
Shaub, M., Finn, D., & Munter, P. (1993). The effects of auditors’ ethical orientation on commitment
and ethical sensitivity. Behavioral Research in Accounting, 5, 145–169.
Siegal, P., Blank, M., & Rigsby, J. (1991). Socialization of the accounting professional: Evidence of the
effect of educational structure on subsequent auditor retention and advancement. Accounting,
Auditing and Accountability Journal, 4, 58–70.
Sorenson, J. E. (1967). Professional and bureaucratic organization in the public accounting firm. The
Accounting Review, 42(July), 553–565.
Sorenson, J. E., & Sorenson, T. C. (1974). The conflict of professionals in bureaucratic organizations.
Administrative Science Quarterly (March), 98–106.
Street, D. L., Schroeder, R. G., & Schwartz, B. (1993). The central life interests and organizational
professional commitment of men and women employed by public accounting firms. Advances
in Public Interest Accounting, 5, 201–229.
Sweeney, J. T. (1995). The moral expertise of auditors: An explanatory analysis. Research on Account-
ing Ethics, 1, 213–234.
Sweeney, J. T., & Fisher, D. G. (1998). An examination of the validity of a new measure of moral
judgment. Behavioral Research in Accounting, 10, 138–158.
Sweeney, J. T., & Fisher, D. G. (1999). Politics, faking, and self-presentation: How valid is the P score
of the Defining Issues Test? Research on Accounting Ethics, 5, 51–75.
Tajfel, H., & Turner, J. C. (1985). The social identity theory of intergroup behavior. In: S. Worchel &
W. G. Austin (Eds), Psychology of Intergroup Relations (2nd ed., pp. 7–24). Chicago: Nelson-
Hall.
Watts, R. L., & Zimmerman, J. L. (1986). Positive accounting theory. Englewood Cliffs, NJ: Prentice-
Hall.
Wheeler, R., Felsig, R. M., & Reilly, T. (1987). Large or small CPA firms: A practitioner’s perspective.
CPA Journal (April), 29–33.
AN ANALYSIS OF GROUP
INFLUENCES ON GOING
CONCERN AUDITOR JUDGMENTS
ABSTRACT
Findings that the processing of audit evidence results in judgment bias
may be an artifact of studying individual decision-making.
Building on work that suggests important differences between individual
and group decision-making, this paper evaluates decision-making attributes
of audit groups. Experienced auditors from offices of Big-Five firms in the
U.S. served as the participants in an experiment involving the going concern
judgment. Results show that recency does affect the judgments of individual
auditors but disappears as an important effect when groups make judgments.
Group responses are less extreme and exhibit greater confidence than those
of individuals.
INTRODUCTION
The descriptive theory of belief updating proposed by Hogarth and Einhorn (1992)
posits that the order in which evidence is received has a significant and predictable
influence on a person’s final judgment. Most of the attention generated by this
discovery has focused on recency effects. Recency refers to the tendency to
place a greater weight on evidence received later in a sequence. Accordingly, an
over-reliance on information presented last may occur. A number of experimental
studies utilizing various conditions suggest that significant recency effects exist in
accountants’ and auditors’ belief revisions (e.g. Asare, 1992; Ashton & Ashton,
1988; Dillard et al., 1991; Pei et al., 1992; Trotman & Wright, 1996; Tubbs
et al., 1990). However, recent research has questioned the prevalence of recency
in auditing. Cushing and Ahlawat (1996) suggested that such effects may not be
common in audit practice. Other studies also have produced evidence that recency
effects do not always occur, or occur only under certain circumstances (Kennedy,
1993; Messier & Tubbs, 1994; Trotman & Wright, 1996).
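The recency effect described above can be illustrated with a small sketch of sequential belief updating. It assumes the commonly cited evaluation-mode form of the belief-adjustment model, in which each piece of evidence moves the current belief by an amount that depends on the belief held just before it arrives: negative evidence is weighted by the current belief, positive evidence by its complement. The function names, sensitivity parameters, and evidence strengths are hypothetical values chosen for illustration, not estimates from any study cited here.

```python
def update_belief(prior, evidence, alpha=0.5, beta=0.5):
    """One updating step; prior is in [0, 1], evidence is in [-1, 1]."""
    if evidence <= 0:
        weight = alpha * prior          # negative evidence: scaled by current belief
    else:
        weight = beta * (1.0 - prior)   # positive evidence: scaled by room left to grow
    return prior + weight * evidence


def final_belief(prior, sequence, alpha=0.5, beta=0.5):
    """Apply the evidence items one at a time, in the order received."""
    belief = prior
    for evidence in sequence:
        belief = update_belief(belief, evidence, alpha, beta)
    return belief


# Same two items of evidence, opposite orders (hypothetical strengths).
positive_then_negative = final_belief(0.5, [0.6, -0.6])
negative_then_positive = final_belief(0.5, [-0.6, 0.6])
print(round(positive_then_negative, 3), round(negative_then_positive, 3))
```

Although both sequences contain identical evidence, the final belief is pulled toward whichever item arrived last, which is the order effect at issue in the studies cited.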
This paper builds on the growing recognition that contextual factors (e.g.
accountability, cognitive involvement, experience, and task realism) might
mitigate judgment bias in audit judgment. Another potential factor is group
influence. Many auditing situations involve either formal or informal group
consultation (Gibbins & Emby, 1985). For example, a team of audit staff and
seniors typically conducts audit fieldwork. The group expands as managers and
partners review this work prior to the issuance of an audit report. However, the
growing recognition that cognitive heuristics and biases in auditors’ judgments
can lead to different outcomes, including different types of audit reports (e.g.
Asare, 1992), has developed with little consideration of group influences.
This research investigates the potential for group processes to overcome
weaknesses in accountants’ judgment. In addition to the recency bias, this paper
also examines the related attributes of decision confidence and belief revision
that vary between audit groups and individual auditors. This research finds
fundamental differences between groups and individuals in their exposure to
recency effects, the nature of their belief revision processes, and their confidence in
decisions. Four sections follow. The first reviews the literature on
group decision-making and judgment biases as a prelude
to stating the research hypotheses. The second describes the empirical study.
The last two sections present the results and discuss their implications and
limitations.
The unique condition of the group in business settings has been studied for some
time. Early studies measured the impact of social cues and interpersonal opinions
on performance and cognitive investment (Weiss & Shaw, 1979; White et al.,
1977). As this area matured, interactive effects between group conditions and
individual attributes were recognized (e.g. Vance & Biddle, 1985). Apart from
An Analysis of Group Influences on Going Concern Auditor Judgments 29
these more generic aspects, groups also were found to influence decision-making.
Although individuals come to the group with some degree of pre-discussion
preferences and unique decision-relevant information that continue to influence
group decisions (Winquist & Larson, 1998), the group resists reduction to the
sum of its members. Groups are believed to produce substantively different
decisions than individuals (Hill, 1982; Miner, 1984). The improved accuracy of
groups that has been reported in many areas may be attributable not only to the
increased perspectives contributed by members, but also to the heightened caution
as consensus processes tend to eschew extreme solutions (Myers & Lamm,
1976). Although the balance of evidence suggests net gains for group decisions
over those of individuals, a full explanation of their origin remains elusive. The
extent to which groups are effective at reducing the random error associated with
individual choice may depend on the effectiveness with which feedback can be
incorporated. Group advantages may also center on the reduction of individual
variability. However, the importance of these conditions varies with the context
of the decision.
Hypotheses
The studies discussed above suggest that the tools that enhance cognitive involve-
ment can mitigate order effects. Group decision-making can serve to enhance
effort and involvement. Group assistance can also be useful in lessening task
demands. Groups have collective experience to draw from, whereas individuals
work alone. Studies in social psychology have found that livelier interaction
among group members was associated with superior performance (e.g. Valacich
& Schwenk, 1995). Interacting groups also reduced belief perseverance (Wright
et al., 1990). These findings suggest that the interaction process itself may have
a positive effect on judgment.
Two aspects of group process could contribute to superior performance. The
group tends to broaden the information set that is brought to bear upon a choice
(Stasser, 1992). This information set includes perspectives on what factual data
means and what limitations it possesses. Group processes also reduce individual
inconsistency or extremity (Schultz & Reckers, 1981). As information exchange
between members occurs, group interaction becomes a “corrective function” when
individual members have initially incomplete or biased information (Stasser &
Titus, 1985) and are encouraged to alter opinions in order to reach a collective
judgment (Stasser & Davis, 1981).
The complexities of some audits make group processes even more salient.
Auditors are aware of the importance of group work and the need to share and
integrate expertise (Schultz & Reckers, 1981). The audit requires considerable
knowledge about industries and competitive factors in order to ascertain the con-
sequences of account balance fluctuations. Fisher and Ellis (1990) suggested that
social pressures created by the group interaction process would moderate extreme
or divergent views held by group members as they work to accommodate each
other’s views. In an audit setting, groups may be useful in preventing anecdotal
experience about certain business conditions from being overly generalized.
32 SUNITA S. AHLAWAT AND TIMOTHY J. FOGARTY
about their decision because it takes into account a wide set of perspectives on
importance. Lower confidence would be inconsistent with the social pressures
that support the participatory consensus formation around the group’s choice. As
such, the group interaction process may lead to higher group confidence compared
to the individual members’ pre-group confidence (Sniezek & Henry, 1989, 1990).
The greater confidence may also reflect individuals’ recognition that groups can
potentially recognize, evaluate, and process more information than individuals.3
In an accounting study, Bloomfield et al. (1996) showed that interaction that
inspired group confidence contributed to group performance. In a different vein,
Allwood and Granhag (1996) found that groups inspired not only confidence,
but also realistic confidence.
The level of confidence is particularly important for the going concern decision
made by auditors. The evaluation of business survival is inherently oriented toward
the future and therefore is more uncertain than most auditing decisions. Since the
going concern decision has distinct adverse consequences for the client, high levels
of confidence are called for to withstand the client resistance that is likely to result.
Accordingly, the following hypothesis will be considered:
H2. Audit groups will exhibit greater confidence than individual auditors about
going concern decisions.
Research over the last thirty years has identified many reasons to depart from
the belief that the direction of influence in decision-making is symmetrical.
Human beings are not bound to strict mathematical consistency when dealing
with information that points to one conclusion relative to information that leads
to an opposite result. Pivoting around a baseline (zero), positive and
negative cues of equal magnitude have often been shown to be processed in
qualitatively different ways. However, the reasons that individuals are influenced
by these frames of reference are imperfectly understood (Newman, 1980).
If group-based reasoning is capable of integrating more information and wider
perspectives, it also may be capable of altering the tendency to treat categories of
cues in ways that are inconsistent with Bayesian logic. The more varied experiences
available to the group as input to their decision may work against the tendency
to over-weigh the negative or the positive. If framing effects are psychological in
nature, forcing them into open discussion may have the effect of exposing their
inconsistency. In other words, there may be more balance in how groups react to
positive and negative types of information than there would be in how individuals
react to that same information.
Auditing has been described as the attempt to confirm a series of interrelated
hypotheses about the client’s accounting records (Church & Schneider, 1993).
Evidence that the accounts are correct as stated therefore can be logically
THE EXPERIMENT
An experiment was designed to test the hypotheses in a context where auditors
are asked to evaluate a client’s ability to continue as a going concern. This type
of context has been employed frequently in prior studies of recency effects in
audit judgment. The specific task in the experiment involves making a series of
judgments about a firm’s going-concern status and a recommendation about the
type of audit report to issue.
The experiment was conducted in the offices of the participating international
public accounting firm over a four-week period. In each office, arrangements were
made for subjects to participate as individuals or as members of three-person
groups. Judgments were made privately by individuals or collaboratively in
groups. Although the assignment of participants to conditions was random, group
composition was subject to member availability at the pre-established time for
the exercise.4 The only qualifying stipulation was that participants were primarily
engaged in the auditing activities of the firm and that they had at least two years
of experience. A researcher distributed and collected all materials in person. For
groups, the researcher was present outside the meeting room for the duration of
the deliberations. Individuals completed the task in their offices, but without the
physical proximity of the researcher.
Each participant was provided with case material. Although each member of the
group was given a copy of the case, groups were instructed to respond collectively
on a single response sheet. Group members were encouraged to discuss the case
prior to reaching a consensus. Each group designated one member to record the
group response.
A cover letter accompanying the case materials suggested that the task should
take about 60 minutes to complete. Whereas letters to groups emphasized the
importance of working collectively, letters to individuals stressed the need for
independent work. Both types of letters asked participants to proceed through the
materials in one sitting. All participants were guaranteed anonymity, assured that
there were no right or wrong answers, and told that most of the questions dealt
with matters of professional judgment.
Participants were asked to read the case assuming that they were performing a
review of preliminary results from the current year’s audit engagement. The case
was previewed for realism and relevance by audit professionals other than the
participants and was revised in accordance with their suggestions.
The experimental materials consisted of a set of instructions and a case
booklet. The case booklet contained background and financial information for a
hypothetical client. The background information included a detailed description
of the industry and the company, its operations, economic environment, and the type
of audit opinion it had received in the last two years. The financial information
comprised audited financial statements for the past three years and the current
year. This information included the balance sheet, income statement, selected
financial ratios, footnotes, statement of changes in financial position, and schedule
of working capital changes. The experimental materials were designed to create
a case in which the audit decision was not an obvious unqualified or modified
(going concern) opinion.
Figure 1 depicts the sequence of procedures required of the auditors for the
experiment. The case consisted of four tasks. Participants were asked to complete
each task in the order given to capture belief revision. They were instructed to
Fig. 1. Procedure for the Experiment.
return the task to the appropriately labeled envelope, and to seal the envelope at
the end of each task.
In Task 1, participants were first asked to provide their general threshold level
for substantial doubt, such that a modified audit opinion would be recommended
for any entity whose likelihood for continued existence fell below the threshold
level. This established, in quantified terms, participants’ baseline threshold for
substantial doubt before they considered the hypothetical client in particular.
Group members had to agree to a single baseline. The scale used for pinpointing
participants’ threshold levels ranged from 0 to 100, with endpoints labeled
“certain not to continue” (0) and “certain to continue” (100).
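The threshold rule described above can be written as a one-line comparison. The following is a minimal sketch and not part of the study's materials: the function name `implied_opinion`, the return labels, and the example values are hypothetical, and the Disclaimer category is ignored for simplicity.

```python
def implied_opinion(likelihood, threshold):
    """Return the opinion implied by a decision-maker's own threshold.

    Both arguments are on the 0-100 scale described in the text, where
    0 = "certain not to continue" and 100 = "certain to continue".
    The function name and return labels are illustrative assumptions.
    """
    for value in (likelihood, threshold):
        if not 0 <= value <= 100:
            raise ValueError("values must lie on the 0-100 scale")
    # A modified opinion is indicated when the assessed likelihood of
    # continued existence falls below the substantial-doubt threshold.
    return "Modified" if likelihood < threshold else "Unqualified"


# Hypothetical example: threshold of 50, assessed likelihood of 40.
print(implied_opinion(40, 50))  # Modified
```

A comparison of this kind is presumably what underlies the contrast between the opinion chosen and the opinion indicated by the threshold.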
Participants then dealt with case-specific questions. They were asked to: (1)
assess the likelihood of the client’s continued existence through the end of the
current fiscal year; (2) recommend the type of audit report to be issued; and (3)
indicate their confidence in the audit report recommended. The same 0–100 scale,
with end points labeled “certain not to continue” (0) and “certain to continue” (100),
was used for this and each subsequent likelihood judgment. A similar 0–100 scale
with end points labeled “not confident at all” (0) and “very confident” (100) was
used to elicit participants’ confidence level. The audit report categories were
Unqualified, Modified, and Disclaimer. Under U.S. auditing standards, the modified
opinion would be appropriate if there were substantial doubt about the entity’s
continuation (AU 341, AICPA, 1990). At this point, participants did not know that
they would receive additional information or have an opportunity to revise their
previous judgments. In addition to familiarizing the participants with the client’s
overall operations and financial conditions, Task 1 allowed them to set their own
decisional anchor points.
Task 2 of the case sequentially presented six additional pieces of evidence.
Three of the evidence items were classified as “Contrary” with regard to the going
concern status of the hypothetical company. Contrary information is defined as
any evidence or issue that raises doubts about the entity’s ability to continue in
existence. Specifically, the contrary items related to: (1) the upcoming expiration
of a patent that had consistently generated approximately 25% of total sales;
(2) the departure of one of the company’s key sales executives; and (3) the
non-renewal of the company’s line of credit. The other three evidentiary items
could be considered “Mitigating” in nature, since they might quell traditional
auditor going concern doubts. The mitigating factors were: (1) the receipt of a
favorable marketing research report on a new product line; (2) the successful
deferment of an account payable over a three-year period; and (3) a successfully
concluded contract negotiation with an employee labor union. Following the
presentation of each of these pieces of evidence, participants were asked to
provide a revised assessment of the likelihood that the client would continue in
existence through the end of the current fiscal year. After providing the last of these
assessments, participants were again asked to recommend the type of audit report
to be issued and to indicate their confidence in the appropriateness of that report.
The six items were presented in two orders. In the condition labeled MMMCCC
on Fig. 1, the three mitigating factors (MMM) were presented first, followed by the
three pieces of contrary information (CCC). The order of evidence was reversed
in the second condition, labeled CCCMMM. The variation in the order of cues
was the recency manipulation. Each of these items was presented on a new page
contained in an envelope. Participants were asked to complete, and then seal, a new
assessment on the 0–100 scale of the hypothetical company’s continuation as a going
concern before examining the next item of evidence. After the last piece of evidence
was revealed, participants were again asked about their confidence in the opinion
type they recommended, with a question identical to that used in Task 1.
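The two presentation orders can be sketched as a small data structure. This is an illustration only: the item wording is abbreviated from the text, the names `CONTRARY`, `MITIGATING`, and `evidence_sequence` are hypothetical, and the order of items within each block follows their order of mention, which is an assumption.

```python
# Evidence items as described in the text. The within-block order shown
# here follows the order of mention and is an assumption.
CONTRARY = [
    "upcoming patent expiration (about 25% of total sales)",
    "departure of a key sales executive",
    "non-renewal of the line of credit",
]
MITIGATING = [
    "favorable marketing research report on a new product line",
    "deferment of an account payable over a three-year period",
    "concluded contract negotiation with the labor union",
]


def evidence_sequence(condition):
    """Map a condition label such as "MMMCCC" to (type, item) pairs."""
    pools = {"C": iter(CONTRARY), "M": iter(MITIGATING)}
    return [(code, next(pools[code])) for code in condition]


# One likelihood judgment (J1 through J6) follows each item in turn.
for position, (code, item) in enumerate(evidence_sequence("CCCMMM"), start=1):
    print(f"J{position}: {code} - {item}")
```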
Task 3 of the case required all participants to complete a questionnaire regarding
their background and auditing experience. Since these questions concerned their
individual attributes, all participants, even those who had worked in groups for
Tasks 1 and 2, were asked to work alone on Task 3.
Task 4 obtained data for a manipulation check. Nine pieces of evidence (includ-
ing the six items presented in the experiment) were used to check respondents’
perceptions. They were asked to classify these nine items as contrary, mitigating,
or neither, in relation to a going concern question. Individuals who had worked
in groups for Task 1 and Task 2 also performed this task collectively in keeping
with the intent to study the difference between groups and individuals.5
Participants
                                                    Group                            Individual
Experience (in years)               Manager    9.09 (1.38)    9.30 (3.09)      8.23 (1.09)    7.67 (1.40)
                                    Senior     3.36 (0.85)    3.15 (0.99)      –              –
No. of audits since working         Manager    106.4 (73.99)  139.5 (122.91)   102.3 (69.63)  105.67 (70.12)
  as an auditor                     Senior     22.3 (15.13)   26.4 (17.27)     –              –
No. of audits in which an opinion   Manager    3.27 (1.79)    3.60 (3.89)      0.38 (2.87)    2.47 (1.99)
  other than unqualified was issued Senior b   0.68 (0.99)    1.15 (1.27)      –              –
a Respondents in the CCCMMM (MMMCCC) condition received three items of contrary (mitigating)
evidence, followed by three items of mitigating (contrary) evidence.
b 7 of 49 managers and 21 of 42 seniors indicated they had not been on any engagements in which the
Most participants indicated that, as members of audit teams, they had been
involved in engagements in which an opinion other than unqualified was either
seriously considered (81 of 91), or actually issued (63 of 91). This suggests that
participants were familiar with non-standard audit reports in the “real world” of
audit practice. Of the 63 who had been on audits in which a going concern opinion
was issued, 42 were managers and 21 were seniors.
Experiment Design
RESULTS
Descriptive Results
The results of the manipulation check in Task 4 were very satisfactory. Participants
overwhelmingly reacted in the expected direction. Only 3 (1.02%) of the 294
possible cases (6 items each from 28 individuals and 21 groups) were incorrectly
classified. Despite this small misclassification, participants always revised
their probability assessments in the expected direction (downward in response to
contrary information and upward in response to mitigating factors) as the evidence
was presented during Task 2.
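The misclassification rate follows directly from the counts given above. A minimal sketch of the arithmetic (the variable names are illustrative):

```python
items_scored = 6      # experimental items counted in the manipulation check
individuals = 28      # individual decision units
groups = 21           # group decision units
misclassified = 3     # items classified in the unexpected direction

possible_cases = items_scored * (individuals + groups)
error_rate = misclassified / possible_cases

print(possible_cases)        # 294
print(f"{error_rate:.2%}")   # 1.02%
```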
The average likelihood judgments (J0–J6) are reported in Table 2. The average
initial judgment (J0) by individuals (68.92 points) and groups (69.05 points) was
not significantly different (p > 0.10). Table 2 shows how each subsequent
informational unit altered the progressive going concern estimation in the predicted
direction. The average downward belief revision for contrary information was
39.16 points. The average upward belief revision was 15.82 points for mitigating
information. This magnitude difference is consistent with prior findings that
auditors are particularly sensitive to disconfirming evidence (Ashton & Ashton,
1988; McMillan & White, 1993). The average downward revision for contrary
information was less for groups (31 points) than for individuals (45 points).
Similarly, the average upward revision for mitigating information was 11 points
for groups and 19 points for individuals. Consistent with the literature that suggests
that groups function to taper extreme member positions, group responses were less
polarized than individual responses in both the positive and the negative direction
in this audit context.
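The average revisions quoted above can be approximately recovered from the cell means in Table 2. The sketch below rests on two assumptions that the text does not state explicitly: each revision is the cumulative change over a block of three items (J0 to J3, or J3 to J6), and cells are weighted by their sample sizes. The function names are illustrative.

```python
# Cell means from Table 2: (n, J0, J3, J6) for each treatment condition.
# J3 is the judgment after the first block of three items, J6 after the second.
cells = {
    ("group", "CCCMMM"): (11, 69.54, 38.82, 51.36),
    ("group", "MMMCCC"): (10, 68.50, 78.50, 47.00),
    ("individual", "CCCMMM"): (13, 66.92, 23.46, 52.84),
    ("individual", "MMMCCC"): (15, 70.67, 81.00, 34.27),
}


def block_revisions(order, j0, j3, j6):
    """Return (downward, upward) cumulative revisions as positive magnitudes."""
    first, second = j3 - j0, j6 - j3
    if order == "CCCMMM":       # contrary block first, mitigating block second
        return -first, second
    return -second, first       # MMMCCC: mitigating first, contrary second


def weighted_means(unit=None):
    """n-weighted mean (downward, upward) revision, optionally for one unit type."""
    n_total = down = up = 0.0
    for (cell_unit, order), (n, j0, j3, j6) in cells.items():
        if unit is not None and cell_unit != unit:
            continue
        d, m = block_revisions(order, j0, j3, j6)
        n_total += n
        down += n * d
        up += n * m
    return down / n_total, up / n_total


for label in (None, "group", "individual"):
    down, up = weighted_means(label)
    print(label or "all", round(down, 2), round(up, 2))
```

Under these assumptions the overall means come out at roughly 39.16 (downward) and 15.81 (upward), with about 31 and 11 for groups and about 45 and 19 for individuals, which agrees with the reported figures up to rounding of the cell means.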
Tests of Hypotheses
The first hypothesis specified that the groups would exhibit weaker recency effects
than individuals. A 2 (decision unit) × 2 (order) ANOVA was conducted with
percent change cumulative belief revision, (J6 − J0)/J0, as the dependent variable.
Table 2. Descriptive Information: Analysis of Belief Assessments by Treatment Conditions.
Mean (Standard Deviation) of Initial (J0) and Revised (J1 Through J6) Likelihood Assessments
Treatment Conditions          J0             J1             J2             J3             J4             J5             J6
Group (N = 11) CCCMMM         69.54 (22.63)  59.54 (24.54)  47.72 (26.77)  38.82 (27.76)  42.73 (26.49)  45.91 (25.18)  51.36 (23.88)
Group (N = 10) MMMCCC         68.50 (16.67)  71.50 (18.86)  73.50 (15.47)  78.50 (14.35)  64.50 (17.55)  54.20 (22.75)  47.00 (21.24)
Individual (N = 13) CCCMMM    66.92 (21.27)  41.92 (22.03)  34.85 (16.62)  23.46 (16.88)  36.92 (16.40)  52.07 (20.38)  52.84 (18.76)
Individual (N = 15) MMMCCC    70.67 (14.12)  76.53 (14.89)  76.73 (13.23)  81.00 (8.70)   60.00 (8.45)   41.80 (15.36)  34.27 (14.28)
a Respondents in the CCCMMM (MMMCCC) condition received three items of contrary (mitigating) evidence, followed by three items of mitigating
(contrary) evidence.
The results are presented in Panel A of Table 3. The significance of the order
variable (F = 9.085, p < 0.01) shows that recency effects are present in auditors’
going concern decisions. More importantly, however, the results reveal a significant
interaction (F = 5.43, p < 0.05) between order and decision unit. This result
suggests that judgments were not only influenced by the order in which evidence
was evaluated, but also by whether judgments were made individually or in groups.
The decision unit does not have a direct effect and is important only in terms of
altering the impact of order effects. This suggests that groups act as a “debiaser”
in eliminating recency in auditor going concern judgments. H1 is supported.
Another test of recency among individual auditors shows that individuals in
the MMMCCC condition made a greater average downward adjustment in their
going-concern likelihood judgments (from 70.67 to 34.27, a change of 36.40
points) than individuals in the CCCMMM condition (from 66.92 to 52.84, a change of 14.08
points). This difference in average belief-revisions was significant (t = 3.96,
p < 0.001). In contrast to the individual results, likelihood judgments of audit
groups exhibited no recency. Here, the average downward adjustment was 21.50
points (from 68.50 to 47.00) for the MMMCCC condition, and an almost identical
18.18 points (from 69.54 to 51.36) for the CCCMMM condition. This difference
was not significant (t = 0.47, p > 0.65).8 Hence, as expected, groups mitigated
the recency effect. These results also support H1.
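The downward adjustments cited in this comparison can be checked against the J0 and J6 cell means in Table 2. The following is a minimal sketch with illustrative function names. Note that the percent change computed here uses cell means, whereas the ANOVA's dependent variable would be computed per decision unit, so `percent_change` only approximates that measure.

```python
# (J0, J6) cell means from Table 2.
means = {
    ("individual", "MMMCCC"): (70.67, 34.27),
    ("individual", "CCCMMM"): (66.92, 52.84),
    ("group", "MMMCCC"): (68.50, 47.00),
    ("group", "CCCMMM"): (69.54, 51.36),
}


def downward_adjustment(j0, j6):
    """Raw cumulative revision, expressed as a positive drop in points."""
    return j0 - j6


def percent_change(j0, j6):
    """(J6 - J0) / J0, computed on cell means rather than per decision unit."""
    return (j6 - j0) / j0


def order_gap(unit):
    """Difference in downward adjustment between the two order conditions."""
    return (downward_adjustment(*means[(unit, "MMMCCC")])
            - downward_adjustment(*means[(unit, "CCCMMM")]))


for (unit, order), (j0, j6) in means.items():
    print(unit, order,
          round(downward_adjustment(j0, j6), 2),
          f"{percent_change(j0, j6):.1%}")

# A recency effect shows up as a large gap between the two orders.
print("individual gap:", round(order_gap("individual"), 2))
print("group gap:", round(order_gap("group"), 2))
```

The gap between orders is about 22 points for individuals and about 3 points for groups, which is the pattern the t-tests above evaluate.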
The second hypothesis asserted a relationship between decision unit and going
concern judgment confidence. Specifically, audit groups were predicted to have
greater confidence in their going concern decisions. For these purposes, decision
confidence at the end of the case was used as the dependent variable. Final
confidence is important because it reflects the processing of all the information
in the case, either by groups or individually. Table 4 offers an ANOVA to test the
second hypothesis. Information order and decision unit are included as possible
effects upon final confidence consistent with H2. The significance of decision unit
at p < 0.05 suggests that groups have higher levels of confidence.9 Neither the
order effect nor the interaction between order and decision unit was significant,
which suggests that only the structure of the decision-making unit influenced
confidence.
Although H2 pertains to the existence of group differences, the change in
confidence that occurred during the experiment was also considered. Groups
exhibited significantly higher initial confidence than individuals (t = 2.27,
p < 0.03). A 2 × 2 ANCOVA with final confidence as the dependent variable,
initial confidence as the covariate, and decision unit and order as the independent
variables was conducted. In results not shown, the initial confidence covariate
was significant (p < 0.05). Neither of the two main effects nor their interaction
was significant. This suggests that the differential confidence in the final
decision was driven by the initial differences, and not by the differential pro-
cessing of information. Nonetheless, groups maintained a significant difference
in confidence over individuals throughout the entire process of belief revision.
Groups begin more confidently and stay that way, as further information is made
known about relevant events. However, the group does not progressively become
significantly more confident. The confidence difference appears to adhere to
the mere existence of the group, rather than its continued information handling
abilities.
The final hypothesis concerns different processing by groups and individuals
of the contrary and mitigating information. In the test of H3, the six
opportunities provided to participants to revise their probability beliefs were
divided into contrary and mitigating types. As shown in Table 5, there
is a significant difference between individual and group responses to contrary
information (t = 3.13, p < 0.01), with individuals reacting more severely. This
is consistent with H3a. No significant differences exist between audit groups
and individual auditors when presented with mitigating information (t = 1.57,
p > 0.12). This does not support H3b.
Other Analyses
In Hypothesis H1, the dependent variable was the revision of the assessment of the
likelihood that the client firm will continue as a going concern. As Asare (1992)
points out, it is also important to learn whether the differences in audit judgments
induced by the recency effect are likely to lead to differences in substantive audit
decisions. Accordingly, an additional analysis was performed to examine whether
judgment differences were sufficient to influence the audit report decisions in this
particular case setting.
Table 6 reports the recommended audit opinion of participants in each of the
four treatment conditions, both at the initial stage (Task 1) of the experiment, and
after reviewing all six additional items of information (the conclusion of Task 2).
Since none of the groups or individuals selected the “disclaimer of an opinion”
recommendation at any point in the experiment, the audit opinion variable was
binary. At the initial point, individuals are no more likely than groups to recommend a
modified opinion (χ² = 0.92, p > 0.50). However, individuals show a stronger
tendency to switch to a modified opinion during the course of the case. When final
decisions are considered, individuals are more likely than groups to recommend a
                               Initial                   Final
Condition              N   Unqualified  Modified   Unqualified  Modified
Individual CCCMMM     13        8           5            4          9
Individual MMMCCC     15       11           4            2         13
Groups CCCMMM         11        9           2            6          5
Groups MMMCCC         10        7           3            4          6
Total                 49       35          14           16         33
Panel B: Opinion Chosen vis-à-vis Opinion Indicated by Threshold
Initial Final
modified opinion (χ² = 5.029, p < 0.05). In results not shown, individuals in the
CCCMMM condition tended to recommend more unqualified and fewer modified
opinions than individuals in the MMMCCC condition at the end of the experiment.
This comparison, however, is not significant (χ² = 2.24, p > 0.05). A comparison
of the distribution of final recommended opinions to the distribution of initial
opinions shows that 4 of 8 individuals in the CCCMMM condition changed their
recommendation from unqualified to modified, while 9 of 11 in the MMMCCC
condition changed from unqualified to modified. A much less severe pattern
existed for groups. Only 6 of 21 groups (3 in each order condition) changed
their recommendation from unqualified to modified. However, neither of these
comparisons is significant (χ² = 0.962, p > 0.05 and χ² = 0.829, p > 0.05 for
individuals and groups, respectively). Contrary to the expected effect of recency
on audit opinions, the number of modified opinions increased in both individual
and group CCCMMM conditions. Although revisions of belief toward modified
opinions may align with the aforementioned heightened sensitivity of auditors
to adverse news, these results also suggest possible differences between binary
(unqualified, modified) and continuous (percentage probability) outcomes.10
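The switching counts discussed above can be derived from the Table 6 counts. The sketch below assumes that the five numbers in each row are the cell size followed by the initial unqualified, initial modified, final unqualified, and final modified counts (an ordering inferred from the text), and that no decision unit moved from modified back to unqualified, so that the net drop in unqualified opinions equals the number of switches. The names are illustrative.

```python
# Table 6 counts per condition:
# (n, initial unqualified, initial modified, final unqualified, final modified)
table6 = {
    ("individual", "CCCMMM"): (13, 8, 5, 4, 9),
    ("individual", "MMMCCC"): (15, 11, 4, 2, 13),
    ("group", "CCCMMM"): (11, 9, 2, 6, 5),
    ("group", "MMMCCC"): (10, 7, 3, 4, 6),
}


def net_switches(row):
    """Net number of units moving from unqualified to modified."""
    n, init_unq, init_mod, final_unq, final_mod = row
    if init_unq + init_mod != n or final_unq + final_mod != n:
        raise ValueError("row counts do not sum to n")
    return init_unq - final_unq


def final_modified_share(unit):
    """Share of a decision-unit type recommending a modified opinion at the end."""
    rows = [row for (u, _), row in table6.items() if u == unit]
    return sum(r[4] for r in rows) / sum(r[0] for r in rows)


for key, row in table6.items():
    print(key, net_switches(row), "of", row[1], "initially unqualified")

print(f"individuals: {final_modified_share('individual'):.0%} modified at the end")
print(f"groups: {final_modified_share('group'):.0%} modified at the end")
```

The sketch does not attempt to reproduce the reported test statistics, because the text does not specify how they were computed.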
groups tend to sustain, but not significantly increase, their confidence advantage
over individuals. This suggests that the advantages of the group mode in an audit
setting occur early in the deliberative process. The fact that the confidence of
groups did not increase over time also may indicate that this collective mode is
not necessarily prone to overconfidence.
The results suggest that one of the main differences that groups may offer is
their willingness to reduce extreme reactions to particular pieces of information
that push toward extreme solutions. In the going concern situation, further evidence
of financial distress would logically make the going concern question more salient.
However, the contribution toward this conclusion for groups is relatively small.
Groups appear to be more willing to suspend judgment or to put each additional
piece of information in a broader context. Individuals demonstrate more sensitivity
to “bad” news by making larger belief revisions. This difference between groups
and individuals is not observed for information that tended to lessen the going
concern problem. Individuals did not react more strongly to facts that suggested
that the hypothetical business would remain financially viable. Further research is
needed to test possible reasons that the two decision units processed good news
and bad news differently.
The results should redirect the attention of auditing organizations and academic
accountants to group dynamics. Groups appear to process information in ways less
affected by its order. Groups are also more confident about decisions and less likely
to overreact to “bad” news about a client. Auditing firms should be comfortable
with the ability of groups to avoid recency bias but somewhat concerned about
their possible tendency to react too little to going concern issues. In light of recent
sudden corporate bankruptcies, the latter tendency needs to be guarded against.
This research did not attempt to evaluate the importance of degrees of
confidence. The superior confidence of groups does not necessarily imply that
groups made more technically correct decisions about the going concern status of
the hypothetical client. The hypothetical nature of the client prevents any proof
of superiority. A necessary prelude to the confidence that constituents might
have about auditing outcomes is the confidence that auditors themselves have
in auditing inputs. Nonetheless, subsequent research should be directed at the
specific value of confidence in auditing judgments.
The findings of this study are subject to certain limitations. One stems from
the unavailability of data regarding the extent to which group members actually
had experience working together on previous engagements. The effectiveness
of group processes may depend on such experience, as individuals learn to
systematically respect or discount the judgments of others. The importance of
working histories of groups may not be as high in auditing as in other business
settings. As firms get larger and centralize control over their human resources,
individual assignments become less predictable and stable. No attention was given
to hierarchical differences among the participants who were assigned to groups.
In the attempt to ensure sufficient going concern expertise, auditors of different
ranks were mixed in the groups. No evidence exists on the question of whether
participants of higher rank dominated group decisions. A more systematic attempt
to isolate the power of more highly ranked individuals would have been necessary
to shed light on this question.11 Another potential limitation stems from the
fact that auditors in the group condition are more experienced than auditors in
the individual condition. Although the groups also included auditors with less
experience than those who worked individually, an experience effect may have
resulted if the more experienced group member dominated the group decisions.
NOTES
1. The professional nature of the work mitigates the fact that these groups often consist
of individuals at different levels within the organization. However, the empirical regularities
created by this professionalism need further investigation.
2. The expected ability of groups to make better-informed decisions does not take
into account situations where individuals first make judgments and then enter groups for
the reevaluation of the decision. This may cause groups to move towards more extreme
positions, as shown by Marxen (1990).
3. Group confidence might be lowered by cases where individuals strongly disagreed
with group positions. Therefore, the expectation that group confidence will be higher than
individual confidence implicitly asserts that these situations will be rare. This study does
not measure the degree to which satisfaction is related to confidence.
4. Group composition could be very important to the dynamics of group decision-
making. Since this research could not tightly control the composition dimension,
interpersonal issues such as charisma and persuasiveness could not be measured. On
more objective dimensions such as experience and rank, a suitable mixture of people was
achieved. See Table 1 and the discussion of participants in the Results section.
5. The researcher did not inquire about the decision processes of the groups after the
experiment was completed. Investigating this in a way that did it justice would require
another study.
6. This choice on group composition creates an alternative interpretation about the extent
of influence lower level employees can have on higher ones. See Graen and Uhl-Bien (1995).
7. The measure J0–J6 was also examined in raw change terms. Since the
substantive results did not differ, these results are not shown.
8. Other tests were conducted to clarify the interpretation of the results presented in
Table 3. An ANCOVA with experience as the covariate (p > 0.05) was considered. A
significant order/decision-unit interaction (F = 4.674, p < 0.05) again resulted. This
suggests that these findings are not attributable to an experience effect. Another analysis
used J6 as the dependent variable, J0 as the covariate, and order and decision-unit as the
independent variables. This model captures belief revision in a different way by more
explicitly controlling for the initial anchoring point (J0). It also shows results similar
An Analysis of Group Influences on Going Concern Auditor Judgments 49
to those that are reported above. Specifically, the interaction between order effects and
decision-unit was significant (F = 4.35, p < 0.05). Another covariate that could be
important is the threshold for substantial doubt. The point at which the decision-maker is
confronted with a reportable going concern issue may present a matter independent from
the quantifiable belief revision variable. Using the probability estimate for this general
threshold specifically collected from the participants in Task 1 as a covariate, the order
effect/decision-unit interaction term was again significant (F = 4.89, p < 0.05). The
results support the first hypothesis: audit groups making going concern
decisions are less prone than individual auditors to recency effects.
9. As shown in Panel B of Table 4, this relationship was also analyzed using t-tests.
The results show that the difference between the final confidence of individuals (71.25)
and groups (80.23) was significant (t = 2.25, p < 0.03). This result is consistent with the
expectation in H2.
10. The bottom portion of Table 6 reports whether the participants’ recommended opinions
were consistent with the final probability ratings (J6) and with their initial threshold
judgment, which was provided at the beginning of Task 1 apart from the consideration of
case materials. An auditor’s opinion type decision was considered consistent if the
likelihood rating was below the threshold judgment and a modified report was chosen.
Consistency could also be achieved with the recommendation that the opinion be
unqualified if likelihood was above the given threshold. Table 6 reports the results of
these comparisons. In total, only 7% (3 of 42) of the group recommendations of audit
opinions were inconsistent. Nearly twice that share, 14% (8 of 56), of the individual
recommendations were inconsistent. An even more telling pattern emerges when initial and
final likelihood positions are differentiated. Groups become more consistent with their
original threshold over time. Initially, 90% of the groups are consistent. This increases
to 95% after the last piece of information has been processed. Individuals become less
consistent: the percentage of individuals who are consistent falls from 89% to 82% over
the course of the decision-making.
11. Conversations with practitioners about this did not reveal any consistent practice.
Some firms had a more hierarchical approach than others almost to the point of resting
this decision on the engagement partner after the other auditors had collected the relevant
information and suggested an outcome. Other firms had a more participatory process
wherein the decision cascaded from the lower levels to the top.
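The percentages in note 10 can be checked with a few lines of arithmetic. The sketch below is illustrative only. The totals (3 of 42 and 8 of 56) come from the note, but the counts of 21 groups and 28 individuals, each compared with the threshold at the initial and final judgment, are an assumption inferred from those totals and are not stated in the note.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

# Totals reported in note 10.
group_inconsistent = pct(3, 42)
individual_inconsistent = pct(8, 56)
print(group_inconsistent, individual_inconsistent)

# ASSUMPTION (not stated in the note): 21 groups and 28 individuals, each
# compared with the threshold at the initial and at the final judgment.
N_GROUPS, N_INDIVIDUALS = 21, 28
group_consistent = {"initial": 19, "final": 20}        # hypothetical counts
individual_consistent = {"initial": 25, "final": 23}   # hypothetical counts

for stage in ("initial", "final"):
    print(stage,
          pct(group_consistent[stage], N_GROUPS),
          pct(individual_consistent[stage], N_INDIVIDUALS))
```

Under these assumed counts, the stage-level percentages (90% and 95% for groups, 89% and 82% for individuals) and the pooled inconsistency counts (3 of 42 and 8 of 56) agree with the figures reported in the note.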
REFERENCES
Allwood, C. M., & Granhag, P. (1996). Realism in confidence judgments as a function of working in
dyads or alone. Organizational Behavior and Human Decision Processes, 64, 277–289.
American Institute of Certified Public Accountants (1990). Statement on auditing standards No. 59:
The auditor’s consideration of an entity’s ability to continue as a going concern. (AU 341) New
York, NY: AICPA.
Anderson, C. A., & Sechler, E. (1986). Effects of explanation and counter-explanation on the develop-
ment and use of social theories. Journal of Personality and Social Psychology, 50, 24–34.
Asare, S. K. (1992). The auditor’s going-concern decision: Interaction of task variables and the
sequential processing of evidence. The Accounting Review, 67, 379–393.
Ashton, A. H., & Ashton, R. (1988). Sequential belief revision in auditing. The Accounting Review,
63, 623–641.
50 SUNITA S. AHLAWAT AND TIMOTHY J. FOGARTY
Bloomfield, R., Libby, R., & Nelson, M. (1996). Communication of confidence as a determinant
of group judgment accuracy. Organizational Behavior and Human Decision Processes, 6,
287–300.
Chow, C., McNamee, A., & Plumlee, D. (1987). Practitioners’ perceptions of audit step difficulty and
criticalness: Implications for audit research. Auditing: A Journal of Practice and Theory, 6,
123–133.
Church, B. (1991). An examination of the effect that commitment to a hypothesis has on auditors’
evaluations of confirming and disconfirming evidence. Contemporary Accounting Research, 7,
513–534.
Church, B., & Schneider, A. (1993). Auditor generation of diagnostic hypotheses in response to a
superior’s suggestion: Influence effects. Contemporary Accounting Research, 10, 333–350.
Cushing, B., & Ahlawat, S. (1996). Mitigation of recency bias in audit judgment: The effect of docu-
mentation. Auditing: A Journal of Practice & Theory, 16, 134–146.
Dillard, J. N., Kauffman, N., & Spires, E. (1991). Evidence order and belief revision in management
accounting decisions. Accounting, Organizations and Society, 7, 619–633.
Fisher, B. A., & Ellis, D. (1990). Small group decision-making: Communication and the group process.
New York, NY: McGraw-Hill.
Gibbins, M., & Emby, C. (1985). Evidence on the nature of professional judgment in public accounting.
In: A. R. Abdel-khalik & I. Solomon (Eds), Auditing Research Symposium (pp. 181–212).
Champaign, IL: University of Illinois.
Graen, G. B., & Uhl-Bien, M. (1995). Relationship-based approach to leadership: Development of
leader-member exchange (LMX) theory of leadership over 25 years: Applying a multi-level
multi-domain perspective. Leadership Quarterly, 6, 219–247.
Hill, G. W. (1982). Group versus individual performance: Are n + 1 heads better than one?
Psychological Bulletin, 91, 517–539.
Hogarth, R. M., & Einhorn, H. (1992). Order effects in belief updating: The belief adjustment model.
Cognitive Psychology, 24, 1–55.
Kennedy, J. (1993). Debiasing audit judgment with accountability: A framework and experimental
results. Journal of Accounting Research, 31, 231–245.
Luus, C. A. E., & Wells, G. (1994). The malleability of eyewitness confidence: Co-witness and perse-
verance effects. Journal of Applied Psychology, 79, 714–723.
Marxen, D. (1990). A behavioral investigation of time budget preparation in a competitive audit envi-
ronment. Accounting Horizons, 4, 47–57.
McMillan, J., & White, R. (1993). Auditors’ belief revisions and evidence search: The effect of
hypothesis frame, confirmation bias, and professional skepticism. The Accounting Review, 68,
443–465.
Messier, W., & Tubbs, R. (1994). Mitigating recency effects in belief revision: The impact of audit
experience and the review process. Auditing: A Journal of Practice & Theory, 14, 57–72.
Miner, F. (1984). Group versus individual decision-making: An investigation of performance
measures, decision strategies and process. Organizational Behavior and Human Performance, 39,
112–124.
Myers, D., & Lamm, H. (1976). The group polarization phenomenon. Psychological Bulletin, 82,
602–627.
Newman, D. (1980). Prospect theory: Implications for information evaluation. Accounting, Organiza-
tions and Society, 5, 217–230.
Pei, B. K., Reed, S., & Koch, B. (1992). Auditor belief revisions in a performance auditing setting:
An application of the belief-adjustment model. Accounting, Organizations, and Society, 17,
169–183.
Reckers, P. M. J., & Schultz, J. (1993). The effect of fraud signals, evidence order, and group-assisted
counsel on independent auditor judgment. Behavioral Research in Accounting, 5, 124–144.
Schultz, J. J., & Reckers, P. (1981). The impact of group processing on selected audit disclosure
decisions. Journal of Accounting Research, 19, 482–501.
Sniezek, J. A., & Henry, R. A. (1989). Accuracy and confidence in group judgment. Organizational
Behavior and Human Decision Processes, 43, 1–28.
Sniezek, J. A., & Henry, R. (1990). Revision, weighting, and commitment in consensus group judgment.
Organizational Behavior and Human Decision Processes, 45, 66–84.
Solomon, I. (1987). Multi-auditor judgment/decision-making research. Journal of Accounting
Literature, 6, 1–25.
Stasser, G. (1992). Information salience and the discovery of hidden profiles by decision-making
groups? A “thought experiment”. Organizational Behavior and Human Decision Processes,
52, 156–181.
Stasser, G., & Davis, J. (1981). Group decision-making and social influence: A social interaction
sequence model. Psychological Review, 88, 523–551.
Stasser, G., & Titus, W. (1985). Pooling of unshared information in group decision-making: Biased
information sampling during discussion. Journal of Personality and Social Psychology, 48,
1467–1478.
Tetlock, P. (1983). Accountability and the perseverance of first impressions. Social Psychology
Quarterly, 46, 285–292.
Trotman, K., & Wright, A. (1996). Recency effects: Task complexity, decision-mode, and task-specific
experience. Behavioral Research in Accounting, 8, 175–193.
Tubbs, R., Messier, W., Jr., & Knechel, W. (1990). Recency effects in the auditor’s belief-revision
process. The Accounting Review, 65, 452–460.
Valacich, J. S., & Schwenk, C. (1995). Devil’s advocacy and dialectical inquiry effects on face-to-face
and computer-mediated group decision-making. Organizational Behavior and Human Decision
Processes, 63, 158–173.
Vance, R., & Biddle, T. (1985). Task experience and social cues: Interactive effects on attitudinal
reaction. Organizational Behavior and Human Performance, 35, 252–265.
Weiss, H., & Shaw, J. (1979). Social influences in judgments about task. Organizational Behavior and
Human Performance, 24, 126–140.
White, S., Mitchell, T., & Bell, C. (1977). Goal setting, evaluation apprehension and social cues as
determinants of job performance and job satisfaction in a simulated organization. Journal of
Applied Psychology, 52, 665–673.
Winquist, J., & Larson, J. (1998). Information pooling: When it impacts group decision-making. Journal
of Personality and Social Psychology, 74, 371–378.
Wright, E., Luus, C., & Christie, S. (1990). Does group discussion facilitate the use of consensus
information in making causal attribution? Journal of Personality and Social Psychology, 59,
261–269.
Zarnoth, P., & Sniezek, J. (1997). The social influence of confidence in group decision-making. Journal
of Experimental Social Psychology, 33, 345–367.
INVESTIGATING ERROR PROJECTION
AMONG STATE AUDITORS: THE
IMPACT OF INTENTIONAL AND
SYSTEMATIC MISSTATEMENTS
ABSTRACT
This paper investigates state auditors’ decisions regarding the isolation
or projection of sample misstatements to underlying sample populations.
Seventy-eight state auditors completed four treatment cases that incor-
porate the complete 2 × 2 manipulation of intentional/unintentional and
systematic/non-systematic misstatements in different case scenarios, enabling
a test of the independent variables both across and within case scenarios.
The results indicate that both across and within case scenarios, auditors
tend to project systematic misstatements more often than they project
non-systematic misstatements. However, the auditors’ isolation/projection
decisions are generally not influenced by whether the sample misstatements
are intentional or unintentional.
INTRODUCTION
In 2000, state and local governments in the U.S. generated over $1.2 trillion
in revenues; they also spent over $1.1 trillion, accounting for over 9% of the
U.S. gross domestic product (28% of gross domestic product when the federal
government is included) (OMB, 2001). The magnitude of this economic activity
accentuates the need for proper oversight of the sources and uses of the funds,
including the audits of state and local governments. Despite the extent to which
state and local government activity impacts the economy, relatively little behav-
ioral auditing research has been conducted on the effectiveness and efficiency of
the auditors employed by these entities. This study addresses the issue of auditor
effectiveness by empirically testing the professional judgments of state auditors
in a context-rich environment; specifically, it examines the subjective assessment
of sample evidence by state auditors.
Sampling is one area where the evaluation of evidence may be largely
affected by subjective differences in auditors’ judgments. The auditing standards
explicitly state that in a variables sampling context, the “auditor should project
the misstatement results of the sample to the items from which the sample was
selected” (AICPA, 2001, AU§350.26).1 However, in addition to the quantitative
task of projecting sample misstatements, the standards also note that auditors
should consider the qualitative aspects of the misstatements (AICPA, 2001,
AU§350.27), and that the “actions that might be taken in light of the nature and
cause of particular misstatements” are left to the discretion of the auditor (AICPA,
2001, AU§350.06). Thus, some discord exists as to whether misstatements should
always be projected; and if not, under what conditions they should be isolated. If
an auditor inappropriately isolates misstatements found in a sample, the likelihood
of a non-representative, or biased, estimate of the account balance being tested
increases. More specifically, failure to project sample misstatements generally
results in an underestimation of the aggregate misstatement in the underlying
population, thereby increasing the auditor’s risk of incorrect acceptance. In the
case of state auditors, this implies a failure to satisfy an essential element of public
control and accountability.
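The underestimation described above can be illustrated with a small worked example. The sketch below assumes a simple ratio projection (the sample misstatement rate applied to the population book value), which is only one of several possible projection methods. All dollar amounts and the function name are hypothetical and do not come from the study.

```python
def project_misstatement(sample_misstatement, sample_book_value, population_book_value):
    """Ratio projection (an assumed method): scale the sample misstatement
    by the ratio of population book value to sample book value."""
    return sample_misstatement * population_book_value / sample_book_value

# Hypothetical figures, for illustration only.
population_book_value = 1_000_000   # recorded account balance
sample_book_value = 100_000         # recorded value of the items sampled
sample_misstatement = 2_000         # overstatement found in the sample

projected = project_misstatement(
    sample_misstatement, sample_book_value, population_book_value
)
isolated = sample_misstatement      # isolation treats the misstatement as unique

print(f"Projected misstatement: {projected:,.0f}")
print(f"Isolated misstatement:  {isolated:,.0f}")
print(f"Difference:             {projected - isolated:,.0f}")
```

In this hypothetical case, isolating the misstatement recognizes only the amount found in the sample, while projection attributes a proportional amount to the whole account. The difference is the potential underestimation if the misstatement is in fact representative of the population.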
The extent to which state auditors do not project sample misstatements of
account balances and the potential consequences of inappropriately isolating
misstatements are important research topics. State auditors often conduct financial
statement audits, the results of which are used in a variety of ways, including the
allocation of resources among programs and personnel, monitoring compliance
with fiscal laws, and even bond ratings. This study focuses on non-sampling
risk,2 and extends existing literature in three ways. The first contribution is the
The propensity of auditors to isolate rather than project sample misstatements can
occur when auditors assume that the misstatements do not exist elsewhere in the
population. One explanation for the lack of projection is that the auditors view the
misstatements as being unique or unusual and, therefore, not truly representative
of the underlying population being tested. Empirical evidence indicates that
the uniqueness perception of misstatements is highly significant in determining
a more conservative approach than isolation; this is consistent with the findings
of Dusenbury et al. (1994) who found that intentional misstatements, in the
presence of containment information, were more likely to be projected than
unintentional misstatements. Increased attention to intentional misstatements may
be particularly warranted in the case of governmental auditors since generally
accepted governmental auditing standards state that the threshold for audit risk
may be lower in governmental audits than in audits of commercial entities (GAO,
1994, §4.9), and because various legal and regulatory requirements faced by
governmental auditors may require reporting on any intentional misstatement,
regardless of its materiality. Thus, the following research hypothesis is proposed:
H1. The propensity of state auditors to project intentional sample misstatements
to the underlying population being tested will be greater than their propensity
to project unintentional misstatements.
EXPERIMENTAL METHODOLOGY
Experimental Task
To test the hypotheses, a series of sampling cases (see Appendix) that incorporated
the experimental manipulations was developed. These cases enabled both across-scenario
and within-scenario analysis. Burgstahler and Jiambalvo’s (1986)
cases served as a basis for comparability with other studies; however, precise
60 JOHN T. REISCH, KAREN S. McKENZIE AND ALAN H. FRIEDBERG
replication of cases used in these other studies was not tractable given the
manipulations employed. In addition, although comparisons can be made between
the results of this study and other studies on sample projections, the pressures and
incentives encountered by governmental auditors are believed to vary from those
encountered by external auditors. Until research addresses environmental factors,
perceptions of differences are all that distinguish the governmental auditor from
the non-governmental auditor.
The experimental instrument included four treatment cases, each representing
a different account balance (e.g. sales and accounts receivable) and sampling
situation; thus, we are able to capture the four treatments of the 2 × 2 design in
which we manipulate two independent variables: (1) the type of misstatement as
either intentional or unintentional (INT); and (2) the nature of the misstatement
in terms of potential recurrence, operationalized as being either systematic or
non-systematic (SYS).4 Each of the four cases should be projected according to
the guidance provided by SAS No. 39, “Audit Sampling.” When noted in the cases,
the employees responsible for the misstatements were intentionally kept at lower
levels (e.g. clerical employees or warehouse workers) to reduce the saliency of the
individuals involved, especially for manipulations containing intentional misstate-
ments. The dollar amounts of the misstatements were made immaterial since most
misstatements discovered by auditors are not individually material (Elder & Allen,
1998). In addition, keeping the materiality of the misstatements constant (i.e.
immaterial) enhanced control over the manipulations being tested by minimizing
potential confounding effects from the materiality of the misstatements.
The cases used were pretested at a chapter meeting of the Institute of Internal
Auditors. Most of the internal auditors in this chapter were governmental auditors,
and no feedback was received that indicated a problem understanding any of
the case scenarios. In addition, the results of the pretest suggested that the
experimental manipulations worked as intended.
This study is based on a repeated measure design in which each subject received
every possible combination of the 2 × 2 manipulation of INT and SYS. Each
combination was incorporated randomly into one of the four different treatment
scenarios. Each case scenario in the experimental instrument was included on an
individual page and participants were requested not to return to a scenario after it
was completed. The presentation of the case scenarios was randomized to minimize
potential order effects. After reading each scenario, participants made a decision as
to whether they would project the sample to the account population being tested or
isolate the sample result from the population. Subjects were then asked to complete
a ten-point Likert-type scale, which measured their comfort level with the decision.
Subjects were instructed to consider each scenario independently and assume that
the samples were selected at random from the populations being tested.
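The assignment procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' actual procedure. The scenario labels follow the caption of Fig. 1, the function name `build_instrument` is hypothetical, and full randomization is assumed even though the study distributed a limited number of instrument versions.

```python
import random
from itertools import product

# Scenario labels taken from the Fig. 1 caption.
SCENARIOS = ["sales", "inventory", "receivables", "unknown receipts"]

# The four cells of the 2 x 2 design (INT x SYS).
TREATMENTS = list(product(["intentional", "unintentional"],
                          ["systematic", "non-systematic"]))

def build_instrument(rng):
    """Return one participant's instrument: a randomly ordered list of
    (scenario, (INT level, SYS level)) pairs in which every scenario and
    every treatment cell appears exactly once."""
    order = SCENARIOS[:]
    rng.shuffle(order)          # randomize presentation order
    cells = TREATMENTS[:]
    rng.shuffle(cells)          # randomize which cell goes with which scenario
    return list(zip(order, cells))

# Example: print one instrument for a single hypothetical participant.
for page, (scenario, (int_level, sys_level)) in enumerate(
        build_instrument(random.Random(0)), start=1):
    print(page, scenario, int_level, sys_level)
```

Shuffling the scenarios and the treatment cells independently means that each participant sees all four cells of the design, while the pairing of a cell with a particular scenario varies across participants.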
Fig. 1. Illustration of Data Analysis Across Cases (Direct Comparison of the Four
Experimental Manipulations Without Taking the Individual Case Scenarios into Considera-
tion). The Two Manipulated Variables were: (1) Intentional or Unintentional Misstatement;
and (2) Systematic or Non-systematic Misstatement. The Four Case Scenarios Involved
Misstatements in Sales, Inventory, Receivables, and Unknown Receipts.
The differences in the auditors’ decisions are analyzed both across all cases and
by individual case, as illustrated in Fig. 1. The analysis across the four treatment
cases tests the experimental conditions of the 2 × 2 manipulation of INT and SYS,
with each subject receiving one each of the four conditions in the 2 × 2 design.
In the analysis by individual case, all like cases (e.g. all sales scenarios, Case 1 in
Appendix) received by the participating auditors are tested in the aggregate.
A major difference between the experimental design used in this study and
the research designs of most other studies investigating the isolation/projection
decisions of auditors (e.g. Burgstahler & Jiambalvo, 1986; Dusenbury et al.,
1994; Hermanson, 1997) is that in this study, the effects of the two independent
variables are isolated in the four combinations of the 2 × 2 design. The other
studies tested factors that affect auditors’ decisions by observing differences
Subjects
Central           15    CPA      48
North Central     11    CGFM     21
Northeast          7    CIA       3
Northwest         15    Other     7
South             15    None     13
Southeast         20
Total             78
Panel B: Means
Frequency of sampling
    % of audit engagements                       79.9    25.8
When sampling is used
    (1) Frequency of statistical sampling (%)    26.3    30.7
    (2) Frequency of judgmental sampling (%)     73.3    30.7
a States categorized according to Webster’s College Dictionary (1991).
b Does not total 78 (the number of subjects) since some subjects hold more than one certification.
Manipulation Checks
The data are first analyzed across cases with all of the cases collapsed into a
single group; that is, the individual case scenarios are ignored yielding a repeated
measures design in which each subject receives all four treatment combinations of
the 2 × 2 manipulation. Since each scenario is expected to elicit the same response,
multiple scenarios are used so the results are not too dependent on any one particular
scenario. To the extent that one or more of the scenarios would produce a different
response in the dependent variable (the decision of isolating or projecting the
sample misstatement), the analysis would be biased against finding a significant
result; thus, collapsing the scenarios into a single group is a conservative approach
to the data analysis.
Table 2 presents the results of the logistic regression with all of the cases col-
lapsed into one group. Panel A shows the results using the manipulated variables
INT and SYS. In Panel B, the manipulated explanatory variables INT and SYS
have been replaced in the logistic regression model with INTCK and SYSCK,
the subjects’ assessments of whether the misstatements were intentional or
systematic, respectively. In both across-treatment models, the chi-square statistics
are significant at the 0.01 level, suggesting the models are good predictors of
the auditors’ propensity to isolate or project sample misstatements. In addition,
goodness of fit for the logistic regression models was assessed using the c statistic,
which is somewhat analogous to the coefficient of multiple correlation (Kane
et al., 1996).7 In both models, the c statistic is greater than 0.65.
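The c statistic follows directly from the definition given in note 7. The sketch below pairs every observation with one response against every observation with the other response and counts concordant, discordant, and tied pairs. The data are hypothetical and the function name `c_statistic` is an illustrative choice, not taken from the study or from any statistical package.

```python
from itertools import product

def c_statistic(observed, predicted):
    """Concordance statistic as defined in note 7:
    c = (nc + 0.5 * (t - nc - nd)) / t,
    where t counts pairs of observations with different observed responses."""
    events = [p for y, p in zip(observed, predicted) if y == 1]
    non_events = [p for y, p in zip(observed, predicted) if y == 0]
    t = len(events) * len(non_events)
    if t == 0:
        raise ValueError("need at least one observation of each response")
    # Concordant: the observation with response 1 has the higher predicted probability.
    nc = sum(1 for e, n in product(events, non_events) if e > n)
    # Discordant: the observation with response 1 has the lower predicted probability.
    nd = sum(1 for e, n in product(events, non_events) if e < n)
    ties = t - nc - nd
    return (nc + 0.5 * ties) / t

# Hypothetical data: 1 = project, 0 = isolate, with model-predicted probabilities.
observed = [1, 1, 1, 0, 0, 1, 0, 0]
predicted = [0.9, 0.7, 0.4, 0.4, 0.2, 0.6, 0.7, 0.1]
print(c_statistic(observed, predicted))  # 0.8125
```

For these hypothetical data there are 16 pairs, of which 12 are concordant, 2 are discordant, and 2 are tied, so c = (12 + 1)/16 = 0.8125. A value of 0.5 corresponds to a model that cannot distinguish the two responses at all.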
The analysis across treatments does not support H1 regardless of whether INT
or INTCK is included in the regression models, indicating that no difference exists
in the auditors’ isolation/projection decisions whether the sample misstatements
are intentional or not.
The independent variable SYS, which is used to operationalize the manipulation
of systematic misstatements, is highly significant as indicated in Table 2. As a
Table 4 presents logistic regression results for each case treatment rather than for a
single aggregate group across the treatment cases. Each case and each treatment
Overall, the analysis of the individual case scenarios indicates that auditors’
isolation/projection decisions are not significantly affected by whether sample
misstatements are intentional or not; thus, the within-case analysis does not
support H1. In addition, the results suggest that auditors are more likely to
project systematic sample misstatements to the underlying population than they
are non-systematic misstatements. While the finding largely holds in
the logistic regression models containing the variable SYS, the results are even
stronger when the subjects’ assessments of systematic misstatements (SYSCK) are
included in the regression models.
Discussion of Findings
misstatements (SYSCK). This is evidenced by the overall higher odds ratios and
Wald chi-square values in Tables 2 and 4.
CONCLUDING COMMENTS
A limitation of this study is the manner in which the instruments were distributed
to the subjects. As noted previously, the instruments were sent to the audit
directors of the participating state audit departments, which could have prevented
a random distribution of the instrument to the state auditors in each location; that
is, the audit directors may have selected the most diligent auditors in the office
to complete the task rather than distributing the instrument randomly across all
auditors in the office, limiting the external validity of the study.
To reduce potential confounding factors, subjects were told that none of the mis-
statements presented in the case scenarios were material, and subjects were faced
with a dichotomous decision task – to project or isolate the sample misstatements.
The participants were not given the opportunity to contain the misstatements. The
containment process has been posited as an explanation for choosing isolation over
projection of misstatements (Dusenbury et al., 1994; Wheeler et al., 1997), and is
a common practice among external auditors (Elder & Allen, 1998). Although this
study was not designed to test containment effects, very few of the participants’
comments suggested their decisions were based on perceived containment, or lack
thereof. Of all the subjects’ comments, only a few expressed a desire for contain-
ment information. The lack of options available to the subjects may have weakened
the generalizability of the study, and leaves an avenue open for future research.
Although the systematic misstatements in the four cases are all likely to be
repeated because of some characteristic(s) associated with a transaction or class
of transactions, they are not operationalized in the same manner.
For example, in Case 2 (inventory), the systematic manipulation is operationalized
as a control issue over inventory, indicated by whether or not there were past inventory
problems, whereas in Case 3 (receivables), the systematic manipulation is also a control issue,
but it involves a computer system malfunction. The lack of uniformity in the opera-
tionalization of systematic misstatements is a weakness of the study. However, the
results were analyzed using both the initial manipulations of systematic misstate-
ments and the participating auditors’ assessments of whether the misstatements
were systematic, and in both analyses, the systematic manipulations are almost
all highly significant in explaining the auditors’ isolation/projection decisions.
Future research could address how auditors recognize and interpret the systematic
nature of misstatements and how that affects the auditors’ decision processes.
The case scenarios were set up in random order to minimize potential order
effects. Once the order of the scenarios had been selected for each participant, a
specific manipulation of the two independent variables was assigned to each of the
four treatment cases in a manner that ensured every participant received each of the
four combinations of the 2 × 2 design (as illustrated in Fig. 1). While the process
of randomizing the research instrument in this manner should have minimized any
order effects, our ability to test for order effects was limited given that 28 different
combinations of the research instruments were distributed. Tests conducted that
compared the results of the different instrument combinations did not indicate the
presence of any order effects. In addition, ad hoc measures were developed that
compared the decisions among the different instruments in multiple ways (e.g.
compared the results based on which case was presented first without regard
to the remaining order). These tests are admittedly imprecise; however, no effects
resulting from the order of the case presentation were noted and the randomization
of both the cases and treatment combinations should have minimized potential
order effects. Nevertheless, the low power of the tests for order effects is a limitation
of the study.
Finally, the use of state auditors as the subject pool limits the comparability of
this study to others that used non-governmental auditors as subjects. While both
governmental and non-governmental auditors must decide whether to isolate or
project sample misstatements to the population being evaluated, the experimental
manipulations may have affected the state auditors’ isolation/projection decisions
differently than they would have affected non-governmental auditors. Future
research should investigate the differences in audit environments between gov-
ernmental and non-governmental employers and the impact of those differences
on the actions of the auditors.
NOTES
1. Generally accepted governmental auditing standards (GAGAS) incorporate AICPA
standards relevant to financial statement audits unless the General Accounting Office (GAO)
excludes them by formal announcement (GAO, 1994, p. 32).
2. Auditing standards (AICPA, 2001, AU§350.11) divide the risk that a sample may be
non-representative of the population into sampling risk and non-sampling risk. Sampling
risk is the inherent risk of sampling that arises simply because less than the entire population
is examined. Non-sampling risk consists of risks not due to the sample selected but instead
involves risks associated with evaluating the sample, such as an auditor’s failure to recognize
exceptions in the sample selected, and the auditor’s inappropriate or ineffective application
of audit procedures.
3. Only one study conducted on auditors’ isolation/projection decisions (Wheeler et al.,
1997) used a complete design. They used a full 3 × 2 design to test the impact of containment
information on auditors’ sampling decisions and did not test either factor (intentional or
systematic misstatement) investigated in this study. In addition, Wheeler et al. used a single
case scenario in their study whereas we use four different case scenarios.
4. In this study, the delineation between systematic and non-systematic may be more
precisely described as “more systematic” and “less systematic,” because almost every
misstatement will have certain characteristics that could be construed as systematic.
5. Analyses of the data excluding the seven auditors who may potentially lack a
background in accounting indicate no significant differences from the reported results.
6. Because the manipulation of whether a misstatement is intentional is rather well-
defined, analyses were also conducted that excluded the participants that initially failed
the INT manipulation check. The results were substantially similar to those presented in
the paper.
7. The c statistic is derived by comparing the number of paired responses (of
observed and predicted responses) in the data set. It is defined by the equation:
c = (n_c + 0.5(t − n_c − n_d))/t, where t is the total number of pairs with different responses,
n_c is the number of concordant response pairs, n_d is the number of discordant response
pairs, and t − n_c − n_d is the number of ties between the response pairs.
8. The odds ratio is calculated by exponentiating the parameter estimate (variable
coefficient), that is, by raising e, the base of the natural logarithm, to the power of
the estimate (Stokes et al., 1995). For example, if the parameter estimate
is 1.2528, then the odds ratio is 3.50 (e^1.2528 = 3.500).
9. The data were also analyzed using repeated measures analysis of variance (ANOVA)
by combining the auditors’ isolate/project decision with the comfort of their decision. The
resulting analysis yielded very similar results to the logistic regression presented.
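The calculations described in Notes 7 and 8 can be illustrated with a short sketch. The function names `c_statistic` and `odds_ratio` and the five toy responses are assumptions introduced here for illustration only; the formula for c and the worked estimate 1.2528 are the only parts taken from the notes.

```python
import math


def c_statistic(observed, predicted):
    """Concordance index c = (nc + 0.5 * (t - nc - nd)) / t, as in Note 7.

    observed  -- 0/1 responses (hypothetical coding: 1 = event, 0 = non-event)
    predicted -- predicted probabilities of the event, one per response
    """
    events = [p for y, p in zip(observed, predicted) if y == 1]
    non_events = [p for y, p in zip(observed, predicted) if y == 0]
    t = len(events) * len(non_events)  # pairs with different observed responses
    if t == 0:
        raise ValueError("need at least one observation of each response")
    # Concordant: the event has the higher predicted probability.
    nc = sum(1 for pe in events for pn in non_events if pe > pn)
    # Discordant: the event has the lower predicted probability.
    nd = sum(1 for pe in events for pn in non_events if pe < pn)
    ties = t - nc - nd
    return (nc + 0.5 * ties) / t


def odds_ratio(estimate):
    """Odds ratio for a logistic regression coefficient, as in Note 8."""
    return math.exp(estimate)


if __name__ == "__main__":
    # Toy data (not from the study): 6 pairs, 5 concordant, 0 discordant, 1 tie.
    observed = [1, 1, 0, 0, 1]
    predicted = [0.9, 0.4, 0.4, 0.2, 0.7]
    print(c_statistic(observed, predicted))
    print(odds_ratio(1.2528))
```

A value of c near 0.5 indicates predictions no better than chance, while values near 1 indicate that observed events almost always receive the higher predicted probability.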
ACKNOWLEDGMENTS
We appreciate the helpful comments of Richard Dusenbury, Randy Elder,
David Gilbertson, Julia Higgs, Bill Hopwood, Dennis O’Reilly, Steve Wheeler,
participants at the 1999 Southeast Regional AAA and 2000 Auditing Section
Midyear meetings, two anonymous reviewers, and the editor.
REFERENCES
Akresh, A., & Tatum, K. (1988). Audit sampling – dealing with the problems. Journal of Accountancy
(December), 58–64.
Investigating Error Projection Among State Auditors 75
American Institute of Certified Public Accountants (AICPA) (2001). AICPA Professional Standards
as of June 30th, 2000 (Vol. 1). New York, NY: AICPA.
Anderson, B. H., & Maletta, M. (1994). Auditor attendance to negative and positive informa-
tion: The effect on experience-related differences. Behavioral Research in Accounting (6),
1–20.
Ashton, A. H. (1991). Experience and error frequency knowledge as potential determinants of audit
expertise. The Accounting Review (April), 218–239.
Ashton, A. H., & Ashton, R. H. (1988). Sequential belief revision in auditing. The Accounting Review
(October), 623–641.
Burgstahler, D., & Jiambalvo, J. (1986). Sample error characteristics and projection of error to audit
populations. The Accounting Review (April), 233–248.
Dusenbury, R., Reimers, J., & Wheeler, S. (1994). The effect of containment information and error
frequency on projection of sample errors to audit populations. The Accounting Review (January),
257–264.
Elder, R. S., & Allen, R. D. (1998). An empirical investigation of the auditor’s decision to project
errors. Auditing: A Journal of Practice and Theory (Fall), 71–87.
General Accounting Office (GAO) (1994). Government Auditing Standards: 1994 Revision.
Washington, DC: Comptroller General of the United States.
Green, S. L. (1992). Behavioral research in governmental and nonprofit accounting: An assessment of
the past and suggestions for the future. Research in Governmental and Non-profit Accounting
(7), 53–78.
Hermanson, H. M. (1997). The effects of audit structure and experience on auditors’ decisions to isolate
errors. Behavioral Research in Accounting, Suppl. (9), 76–93.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cog-
nitive Psychology (July), 430–454.
Kane, G. D., Richardson, F. M., & Graybill, P. (1996). Recession-induced stress and the prediction of
corporate failure. Contemporary Accounting Research, 13(2), 631–642.
Kida, T. (1984). The impact of hypothesis-testing strategies on auditors’ use of judgment data. Journal
of Accounting Research (Spring), 332–340.
Libby, R. (1985). Availability and the generation of hypotheses in analytical review. Journal of
Accounting Research (Autumn), 648–667.
Libby, R., & Frederick, D. M. (1990). Experience and the ability to explain audit findings. Journal of
Accounting Research (Autumn), 348–367.
MacDonald, E. (2000). ‘What’s Wevenue?’ Auditors Miss a Fraud and SEC tries to put them out of
business – scam at California Micro was well-hidden, says lawyer for Coopers duo – CFO’s
misleading resume. Wall Street Journal (January 6), A1.
Office of Management and Budget (OMB) (2001). A citizens’ guide to the federal budget, fiscal year
2002. Washington, DC: U.S. Government Printing Office.
Random House Webster’s College Dictionary (1991). New York, NY: McGraw-Hill.
Stokes, M. E., Davis, C. S., & Koch, G. G. (1995). Categorical data analysis using the SAS system.
Cary, NC: SAS Institute.
Trotman, K. T., & Sng, J. (1989). The effect of hypothesis framing, prior expectations and cue
diagnosticity on auditors’ information choice. Accounting, Organizations and Society, 14(5/6),
565–576.
Wheeler, S., Dusenbury, R., & Reimers, J. (1997). Projecting sample misstatement to audit popu-
lations: Theoretical, professional, and empirical considerations. Decision Sciences (Spring),
261–278.
APPENDIX
The treatment cases included in the experimental instrument are given below.
Cases 1–4 represent the four combinations of the complete 2 × 2 design that
tests two sample misstatement manipulations: intentional or unintentional mis-
statement and systematic or non-systematic misstatement. The unintentional and
non-systematic misstatement manipulations are italicized first, followed by the
manipulations for intentional and systematic misstatements that are also italicized
but in parentheses.
Case 1 (Sales)
Sales Account No. 77491 was understated by $945.16. It was determined that
a temporary clerical employee, who worked during a two week period in April,
mistakenly (deliberately) misfooted sales invoices for the account. The client’s
controller indicated that this was the only temporary employee (one of 25
temporary employees) used to process sales transactions.
Case 2 (Inventory)
During a physical inventory observation, it was discovered that inventory item No.
245-0672 (cleaning chemicals) was understated by 23 items valued at $50 each.
Further investigation revealed that a warehouse employee temporarily placed
the items in the breakroom to restock the company’s supplies closet (temporarily
placed the items in the breakroom with the intent of taking them home for his per-
sonal use) (the breakroom is adjacent to the company’s supplies closet). A review
of the company’s internal audit workpapers for the last two years, which report
on periodic surprise inventory test counts, revealed no similar instances (revealed
several similar instances) in which inventory was improperly segregated by
warehouse workers.
Case 3 (Receivables)
Receivables Account No. 16788 was overstated by $59. The misstatement was
discovered when the auditor compared the price on the selected sales invoice to the
client’s approved master price list in effect at the date of the sale. An investigation
into the matter revealed that a salesperson overcharged the customer for the item
when she inadvertently read the price of the next item on the master price list (to
increase her sales commission). The client’s accounting system was temporarily
down when the item was ordered and the transaction had to be manually processed.
When the system is operating, it cannot process transactions (it allows overrides
of transactions) if the price of the item is not within the approved master price
range. It was estimated that the system was down 3–5% of the time during the year.
ABSTRACT
Data were collected from loan officers using a computerized process-tracing
program to help shed some light on how source credibility impacts the
judgments made by loan officers. Loan officers did not structure loans more
restrictively regardless of whether they were in the positive or negative char-
acter condition or whether they approved or denied the loan. Negative source
credibility affected decision process effort but did not produce the tradeoff
between loan approval and loan structure that is suggested in the literature.
Although significantly more (fewer) loans were denied when character in-
formation was negative (positive), a majority of loan officers in the negative
character condition approved the loan. While most loan officers were aware
of negative source credibility, they did not react by denying loans or adjusting
loan structure.
INTRODUCTION
While many agree that source credibility is important to lending decisions, how
negative source credibility impacts lender decisions is less understood. Some sug-
gest that loan structure (i.e. collateral and covenants) can be used to compensate
for negative source credibility (e.g. Mather, 1999; Oldham, 1998), while others
maintain that loan officers should not trade off perceived weaknesses in source
credibility with tighter structure. The risk of attempting to counterbalance flawed
character with loan structure is too great; a safer approach would be to avoid a
business relationship rather than to trust the applicant’s financial representations (e.g.
Pace & Simonson, 1977).
Research on whether lenders compensate for perceived weakness in source
credibility by imposing tighter loan structure requires joint study of loan approval
and loan structure decisions. However, the literature on how loan officers react to
negative source credibility has focused on loan approval (e.g. Beaulieu, 1994) or
loan structure (Mather, 1999), but not both. Thus, the primary contribution of this
paper is to determine whether the tradeoff exists.
Source credibility was manipulated in the experiment to be either positive
(suggesting a credible source) or negative (suggesting a non-credible source).
Data were collected using a computerized process-tracing program, which
collected information on decision effort, perceptions of the credibility of projected
accounting information, loan approval/denial, and loan structure. The results
indicate that loan officers will deny loans to less credible clients rather than
restructure the conditions of the loan, and that they will not structure loans more
restrictively regardless of whether they were in the positive or negative character
condition or whether they approved or denied the loan.
In capital markets, source credibility refers to whether managers who direct the
preparation of financial statements inspire belief in the statements. Source credibility
is particularly important in today’s environment, as a number of prominent
companies, including several of their CEOs and CFOs, were accused of falsifying
documents and manipulating accounting information to hide poor financial results.
Source credibility is distinct from credibility conferred by attestation services
offered by external auditors. While both forms of credibility are important,
source credibility, which has received relatively little attention in finance and
accounting literature, is the focus of this paper. In a post-Enron world, new
research will likely address interactions between source credibility and attestation
services.
Source credibility is important whenever resource providers lack complete in-
formation and must rely on others to provide fair and accurate information. Source
How Does Negative Source Credibility Affect Commercial Lenders’ Decisions 81
treated as a more subtle, complex, and practical issue than is done in most prior
research and participants are given more freedom to judge source credibility.
Prior research in lending has found that source credibility affects loan approval
(Beaulieu, 1994) and loan structure (Mather, 1999) judgments. However, loan
approval and loan structure are not separate, independent judgments even though
they have been examined separately. To more completely understand the effect of
source credibility on lending decisions, both judgments need to be simultaneously
examined. Doing so provides a more comprehensive understanding of how
loan officers react to negative source credibility, and in particular, whether
they compensate for negative source credibility by restrictive loan structure or
whether they simply deny the loan request at the outset. This is our basic research
question.
This research question is important because it focuses on shortcomings in the
current literature and helps resolve the debate on how loan officers react to negative
source credibility. Framing the research question in terms of a tradeoff between
loan approval and structure allows us to investigate whether Mather’s findings
(1999) that source credibility affects loan structure would hold if loan officers
were permitted to deny loans. Similarly, while Beaulieu (1994) documented that
more loan officers denied loan applicants with negative source credibility than
those with positive source credibility, there is no evidence whether loan candidates
with negative source credibility who were approved received more restrictive loan
structures than those who were denied or those who had positive source credibility.
If lenders do not structure loans more restrictively for approved candidates with negative
source credibility, then those candidates face no consequence
that would protect lenders.
Commercial lending experts recommend that loan officers evaluate source credi-
bility, in the form of a character judgment, as soon as contact with a prospective
borrower has been made. If character is not of sufficient quality, then analyzing
credit further or considering alternative loan structures may not be worthwhile.
This preliminary character judgment is the first hurdle of lending (McDonald &
McKinley, 1981; Pace & Simonson, 1977). Stephens (1980) confirmed that loan
officers want information about the applicant before examining the details of the
loan. This position can also be inferred from Eisenreich (1981, p. 9):
Since the majority of information will come from the borrower . . . the lender must have confi-
dence in the raw material of the judgment. If not or if critical facts cannot be verified, the lender
cannot make the decision. It would be a gamble rather than a calculated risk.
The above direct quote conflicts with the advice offered by other commercial
lenders cited earlier (Eisenreich, 1981; McDonald & McKinley, 1981; Pace &
Simonson, 1977). It seems to advocate both screening borrowers of questionable
credibility and using loan structure to work with them. While prudence suggests that this
should be the exception rather than the rule, loan officers may use the exception to
rationalize loan approvals.
Which reaction is more likely to occur is an open issue. Beaulieu (1994)
found that character had a significant main effect on loan officers’ loan decisions
(approval or denial) and that it interacted with accounting information to affect
both decisions and estimates of risk of nonpayment. Specifically, loan decisions
and risk estimates responded significantly to a change in the strength of ac-
counting information when character was positive, but not when it was negative.
Participants in Beaulieu’s study were told to assume, in a loan application case,
that structure of the proposed loan would be determined by the bank’s policy
at competitive terms and that collateral would be available to meet the bank’s
guidelines for that type of loan. They had no opportunity to adjust loan covenants
or collateral. In contrast, Mather (1999) instructed his subjects that loans had
already been approved, so that only the loan structure task was required. Under
these conditions, Mather found that loan officers set more restrictive loan structure
when credibility was unknown than when it was positive.
An objective of the current study is to help to resolve the debate by providing
evidence as to whether lenders simply deny a loan (H1) consistent with Beaulieu
(1994) or select collateral and covenant levels to compensate for weaknesses in
source credibility (H2) consistent with Mather (1999). Essentially, H1 and H2
are competing hypotheses. Because the guidance in the literature is at odds, the
hypotheses are stated in the null form.
84 PHILIP R. BEAULIEU AND ANDREW J. ROSMAN
H1. The proportion of loan officers who approve loans will not differ between
borrowers with positive character and borrowers with negative
character.
H2. There will be no difference in proposed loan structure between loan officers
receiving negative and positive character information.
Process Effort
Loan officers make a critical decision regarding how much effort to expend
when they evaluate a loan candidate. Rosman and Bedard (1999) find evidence
that lenders will structure loans more restrictively when they expend less effort.
However, Rosman and Bedard do not consider the relationship between effort
and loan structure restrictiveness in light of weaknesses in a potential borrower’s
character.
When character is perceived to be weak but not entirely non-credible, the lender
may pour more effort into the file to check on the initial negative impression
of character and to relate character judgments to other information provided,
especially accounting information. This possibility is motivated by the fact that
initial impressions of character and personality can be incorrect (Korem, 1997).
That is, loan officers may consider approving a loan if no aspect of presentation
in the financial statements encourages caution, even though assessments of
management’s credibility raise doubts about their character.2 Increasing decision
effort in such situations reduces concerns raised by initial negative character
judgments that do not push loan officers past a threshold where they feel that they
must deny loans. Increased processing effort, as a response to negative (but not
extremely so) character information, is consistent with Shaub’s (1996) finding
that auditors lacking trust in a client will recommend more work in their audit
plans. It is also consistent with Beaulieu (2001), in which recommended evidence
collection was negatively related to a CFO’s integrity.
The other option available to loan officers when character judgments are
sufficiently negative is to deny loans because such credits do not clear the
“first hurdle” of commercial lending (Pace & Simonson, 1977). This implies
that information processing will be terminated quickly when the character of
borrowers is so negative that they are considered non-credible.
Options one and two (checking initial impressions of character and relating
it to accounting information, and denying loans without checking) require more
and less processing effort, respectively, than an average or baseline credit with
positive character information. It may not be obvious to loan officers whether the
METHOD
Procedure
Decision process and outcome data were collected using Search Monitor, which is a
computerized process-tracing program (Biggs et al., 1993; Brucks, 1988; Rosman
& Bedard, 1999). Search Monitor is interactive, menu-driven software that presents
case materials to participants and captures a complete trace of selected processes
including cue acquisition, acquisition order, and time to examine cues.
Subjects were advised at the beginning of the Search Monitor task that a
commercial loan applicant was seeking a loan package that included short- and
medium-term financing. The case used in this study integrates the case materials
used by Beaulieu (1994), which validated the source credibility measures, and
Rosman and Bedard (1999), which validated the realism of the lending task and
related measures.
The loan applicant, a manufacturer of chemical products, was briefly described,
including the contact person with the firm, its CFO. Further information about the
firm was accessed via a menu having six categories of financial and qualitative
data: profitability, inventory turnover, liquidity, and financial leverage & capital
structure (financial); and management and industry & product (qualitative). Each
of the four categories of financial data consisted of three ratios (and the dollar
values of numerators and denominators), divided into historical (years −2, −1
and 0) and projected (years +1 and +2) information. Case information indicated
that the historical information was given a clean audit opinion, while no opinion
had been expressed regarding the projected figures.
For example, the following menu was presented to participants who selected
profitability information.
(1) Historical net income/average equity
(2) Projected net income/average equity
(3) Historical net income/average total assets
(4) Projected net income/average total assets
(5) Historical net income/net sales
(6) Projected net income/net sales
The order of the six cues was randomized differently each time a participant
returned to the menu. Participants could move both within each of the six
categories of information and between categories as they wished. When they
indicated that they had finished selecting and viewing information, they were
given a series of screens to register their recommendations about the loan.
Approval or denial of the loan was requested, assuming an interest rate set at one
percentage point above prime, followed by loan structure recommendations.3
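The menu behavior and process trace described above can be sketched as follows. This is a minimal illustration, not Search Monitor's implementation, which the paper does not describe: the names `present_menu` and `TraceLog`, the injectable clock, and the data layout are all assumptions made here. Only the six profitability cues, the re-randomized menu order, and the recorded quantities (cue acquired, acquisition order, viewing time) come from the text.

```python
import random
import time

# The six profitability cues listed in the text.
PROFITABILITY_CUES = [
    "Historical net income/average equity",
    "Projected net income/average equity",
    "Historical net income/average total assets",
    "Projected net income/average total assets",
    "Historical net income/net sales",
    "Projected net income/net sales",
]


def present_menu(cues, rng):
    """Return the cues in a fresh random order for one visit to the menu."""
    order = list(cues)  # copy, so the master list is never reordered
    rng.shuffle(order)
    return order


class TraceLog:
    """Hypothetical process trace: which cue was viewed, in what order, for how long."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.records = []  # (cue, seconds viewed), in acquisition order

    def view(self, cue, show):
        """Display a cue via `show` and record how long the display call took."""
        start = self._clock()
        show(cue)
        self.records.append((cue, self._clock() - start))


if __name__ == "__main__":
    rng = random.Random()
    log = TraceLog()
    menu = present_menu(PROFITABILITY_CUES, rng)
    log.view(menu[0], print)  # the participant opens the first cue shown
    print(len(log.records), "cue(s) acquired")
```

Calling `present_menu` on each return to the menu gives a different ordering per visit, while the log preserves the participant's own acquisition order independently of how the menu happened to be shuffled.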
Participants who recommended denial were told that although they did not
recommend approval, they had been asked to provide input on how to structure the
loan in the event that the loan committee recommended approval. This step was
necessary so that H2 could be examined. That is, even if a loan did not pass the initial
character judgment hurdle (see Pace & Simonson, 1977, discussed earlier), this
step ensures a test of the tradeoff between structure and character that is suggested
by some of the literature, including positive accounting theory. Combined, H1 and
H2 provide a stronger test of the two competing points of view that have been
expressed in the literature.
Four loan structure recommendations were requested (see below). Twelve
responses were provided for each, corresponding to ranges of percentages that
varied, depending on the item.4,5
(1) Percentage of loan principal for which an equivalent amount of assets will be
collateralized.
(2) Level of profitability (ratio of net income to average equity) to be maintained.
(3) Level of liquidity (ratio of cash flows to fixed cash commitments) to be main-
tained.
(4) Level of leverage (ratio of total liabilities to equity) to be maintained.
The loan structure recommendations were followed by a question asking
participants to indicate confidence in their structure judgments on a nine-point
scale. Finally, two questions asked participants to rate the credibility of historical
financial information and management’s financial projections, also on nine-point
scales.
The character information used in this study was adapted from Beaulieu
(1994), which contains a complete description of the development and valida-
tion process. As shown in Table 1, character was manipulated between-subjects in
two places in the Search Monitor program. First, either positive or negative charac-
ter information regarding the CFO was provided in an introductory screen and was
seen by all participants in either condition of the experiment. Second, participants
could select more information about the CFO via the management information
menu. Those selecting the additional information received either a positive or
negative description, depending on the condition to which they had been assigned.
Table 1

Introductory screen (viewed by all participants)

Positive character condition: When you visited the business, the CFO had available all the
documentation that he had promised to provide. Among the items you examined during your
initial evaluation are the following. The loan application stated that the firm had not been
a defendant in legal actions in the last three years. A background check confirmed this. At
your meeting, you said that a decision on the loan would be made within two weeks. The CFO
accepted this time frame, and did not urge you to reach a decision earlier.

Negative character condition: When you visited the business, the CFO did not have available
all the documentation that he had promised to provide. However, the following information
did become available to you during your initial evaluation. The loan application stated that
the firm had not been a defendant in legal actions in the last three years. A background
check showed that a former senior officer of the firm has filed a wrongful dismissal suit.
The suit has recently been settled out of court. At your meeting, you said that a decision
on the loan would be made within two weeks. The CFO accepted this time frame and did not
urge you to reach a decision earlier.

Management menu, CFO (viewed only if selected)

Positive character condition: The CFO’s work history is provided, then the following: At
your first meeting the CFO answered your questions patiently, and volunteered additional
information. He is an active member of several local community service organizations.

Negative character condition: The CFO’s work history is provided, then the following:
During your meeting with him, Mr. Butler ignored your suggestions for improving his firm’s
operations and said that he did not need business advice. He has changed the firm’s public
accountant twice in the last five years. Disagreements with the former public accountants
were reported.

a The sentences in italics were rated as neutral, not providing information about character,
in Beaulieu (1994). They were not written in italics in Search Monitor.
Participants
RESULTS
Manipulation Check
The potential for source credibility to impact the perception of the credibility of
accounting projections is important because projected accounting information is a
standard component of loan applications (Danos et al., 1989), and is not audited.
This type of credibility judgment differs from other credibility judgments that
are made in equity markets, because the latter are objective assessments of the
accuracy of management forecasts (e.g. Hirst et al., 1999). In contrast, source
credibility in the lending context is a subjective consideration of the prior behavior
of management that is made because there is no objective public record of man-
agement forecast accuracy. The credibility of projected unaudited information is a
judgment that precedes loan approval and loan structure and is used to assess the
success of the manipulation.
Mean source credibility ratings of projected accounting information were
evaluated on a nine-point scale (1 = low, 9 = high). Subjects rated the credibility
of projected information to be higher in the positive condition than in the negative
condition (5.43 vs. 4.18, t = 1.63, p = 0.06, one-tailed). Credibility of the
historical, audited financial information was also judged on the same nine-point
scale. The mean ratings were 6.27 in the negative character condition and 7.14
in the positive condition (t = −1.13, p = 0.27).6 Therefore, any effects of
the manipulation of information about the CFO’s character on loan decisions,
structure recommendations, and processing effort result from changes in the
credibility of projected, rather than historical, accounting information.
Hypothesis Testing
H1 investigated whether loan officers would simply deny loans if they become
sufficiently concerned about character and source credibility. All loan officers
given the positive character information about the CFO approved the loan (100%
of 14), as did 8 of the 11 given the negative version (73%). The χ² statistic is 4.34
(p = 0.037). Thus, the null hypothesis is rejected. H2 investigated whether loan
officers would adjust loan structure to compensate for negative source credibility.
Table 2 reports the four mean loan structure recommendations (collateral and
Percentage of principal collateralized 10.3 11.0 −0.66 (0.53) 10.3 10.9 −0.66 (0.53)
and 10.5, respectively (F = 1.55, p = 0.490). Thus, the variance of effort choices
was similar with respect to the quantity of information examined, but not with
respect to time spent examining it.
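The chi-square statistic reported for H1 (4.34, p = 0.037) can be recomputed from the approval counts given under Hypothesis Testing: 14 approvals and 0 denials in the positive condition, 8 approvals and 3 denials in the negative condition. The sketch below is illustrative; the function name is an assumption, and the paper does not say which software or which variant of the test was used. The uncorrected Pearson statistic with one degree of freedom appears to match the reported figures.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


if __name__ == "__main__":
    # Rows: positive, negative character condition; columns: approved, denied.
    chi2, p = pearson_chi_square_2x2(14, 0, 8, 3)
    print(round(chi2, 2), round(p, 3))
```

Two of the four expected cell counts in this table fall below 5 (about 1.68 and 1.32 denials), so the chi-square approximation is rough at this sample size.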
loan officers react to negative source credibility, and they do so by denying loans,
while the majority do not react in terms of the final decisions to approve a loan
or to structure it restrictively. In short, proportionally few loan officers reacted to
negative source credibility, but when they did, they denied loans rather than accept
the loan and handle their concerns with loan structure.
In hindsight, these results mirror the reaction of the stock analyst community
to Enron. Those analysts who doubted Enron a year before its bankruptcy were
few and far between, but they did so by using their assessment of source cred-
ibility as the lens through which to analyze the numbers. Enron’s management
was notorious for dealing arrogantly with analysts and being unable to produce
financial information. This created an environment of distrust in which patterns of
transactions that were questionable could be pieced together. The advice of one
analyst who sold Enron stock short was simple: “Test what a company says; don’t
take it at face value.” In other words, it is necessary to assess the credibility of the
source of the information in order to be able to understand the information itself
(Bailey, 2001, p. F1).
As is true of experimental research, the ability to generalize results both
to other tasks and other financial statement users (in commercial lending and
elsewhere) is limited. In particular, although the indicators of character used in
this experiment have been validated in other research (Beaulieu, 1994, 1996),
subtle changes in the apparent financial strength of firms, task or context may
encourage financial statement users to select other signals of source credibility.
Other sources of credibility, especially external audits, may become relatively
more or less important, depending upon task and context. For example, concerns
about accounting for intangible assets may upset the current balance of users’
reliance upon source credibility vs. credibility derived from audits. Our objective
is to encourage thought and research about this balance, and about the type of
credibility information that different users employ.
NOTES
1. Hirst et al. (1999) did not explain to participants how forecast accuracy was calculated.
2. An example of a presentation that encourages caution is writing off all bad debts in a
single period, making it difficult to chart profitability (Ruth, 1987).
3. We do not examine pricing, that is, charging interest sufficiently above prime rates to
accommodate even the worst credit risks. It is difficult for loan officers in the United States
to price-protect themselves, because the commercial lending market is very competitive
and there is as little as a two-point spread separating prime from high-risk borrowers
(Emmanuel, 1989).
4. Consistent with Rosman and Bedard (1999), collateral was represented to the lenders
on a 12-point scale, which ranged from “0%” to “more than 100%” in 10% increments.
Profitability ranged from “0%” to “more than 50%” of the ratio of net income to average
equity, identified in 5% increments. Liquidity ranged from “0%” to “more than 150%” of
the ratio of cash flows to fixed cash commitments, in 15% increments. Leverage ranged
from “0%” to “more than 70%” of the ratio of total liabilities to equity, in 7% increments.
The upper bounds differ due to variation in the normal range of these ratios. The leverage
covenant was converted to a revised measure (i.e. 13 − x, where “x” is the value selected
by the participant) so that the direction of each scale was similar.
5. In contrast, Mather (1999) asked subjects to make judgments as to the number of
covenants they would seek and how tightly they would be imposed. However, the nature of
the covenants was not specified.
6. A potential concern regarding the experiment is that some participants may not have
seen all of the character information. As explained in Table 1, two facts in each condition
of the experiment were viewed only if selected. If a number of participants did not select
the additional screen about the CFO, the strength of the character manipulation would not
have been consistent. Ten of the 11 participants in the negative condition and 13 of 14 in
the positive condition accessed the optional CFO information. In total, 23 of 25 participants
investigated the CFO, evidence that the character manipulation was consistent across con-
ditions, and that character and source credibility were important to the participants. Both
participants who did not access the additional character information, one in the negative
condition and one in the positive condition, approved the loan.
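The scale construction described in Note 4 can be made concrete with a short Python sketch. It is an illustration only: the function names are hypothetical, and it assumes that scale positions are numbered 1 through 12, which the note implies through the 13 − x conversion but does not state outright.

```python
def scale_labels(step_pct, n_points=12):
    """Build labels for an n-point covenant scale: 0%, step, ..., 'more than <top>%'."""
    top = step_pct * (n_points - 2)  # last exact value before the open-ended category
    labels = [f"{step_pct * i}%" for i in range(n_points - 1)]
    labels.append(f"more than {top}%")
    return labels


def reverse_code(x, n_points=12):
    """Reverse a scale position so that all scales run in the same direction (13 - x)."""
    return (n_points + 1) - x


# Increments reported in Note 4 for each covenant
collateral = scale_labels(10)     # 0% ... more than 100%
profitability = scale_labels(5)   # 0% ... more than 50%
liquidity = scale_labels(15)      # 0% ... more than 150%
leverage = scale_labels(7)        # 0% ... more than 70%

# A participant selecting position 3 on the leverage scale is recoded as 10
print(leverage[3 - 1], "->", reverse_code(3))
```

Under these assumptions each scale has exactly 12 positions, and the reversal maps position 1 to 12 and position 12 to 1.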
ACKNOWLEDGMENTS
The authors thank Jean Bedard, Karla Johnstone, Marlys Lipe, Inshik Seol, Kathy
Wilkicki and two anonymous reviewers.
REFERENCES
Bailey, S. (2001). Right on the money. The Boston Globe (December 5th), F1.
Beach, L. R., Mitchell, T., Deaton, M., & Prothero, J. (1978). Information relevance, content and source
credibility in the revision of opinions. Organizational Behavior and Human Performance, 21,
1–16.
Beaulieu, P. (1994). Commercial lenders’ use of accounting information in interaction with source
credibility. Contemporary Accounting Research, 10(Spring), 557–585.
Beaulieu, P. (1996). A note on the role of memory in commercial loan officers’ use of accounting and
character information. Accounting, Organizations and Society, 21(August), 515–528.
Beaulieu, P. (2001). The effects of judgments of new clients’ integrity upon risk judgments, audit
evidence, and fees. Auditing: A Journal of Practice & Theory (Fall), 85–99.
Biggs, S., Rosman, A., & Sergenian, G. (1993). Methodological issues in judgment and decision-
making research: Concurrent verbal protocol validity and simultaneous trace of process. Journal
of Behavioral Decision Making, 6, 187–206.
Brucks, M. (1988). Search monitor: An approach for computer-controlled experiments involving con-
sumer information search. Journal of Consumer Research, 15, 117–121.
Coleman, D., & Irving, G. (1997). The influence of source credibility attributions on expectancy theory
predictions of organizational choice. Canadian Journal of Behavioural Science, 29(April),
122–131.
Danos, P., Holt, D., & Imhoff, E. (1989). The use of accounting information in bank lending decisions.
Accounting, Organizations and Society, 14, 235–246.
Eisenreich, D. (1981). Credit analysis: Tying it all together – Part I. Journal of Commercial Bank
Lending (December), 2–13.
Emmanuel, C. (1989). Limiting exposure to fraudulent financial reporting. The Journal of Commercial
Bank Lending (September), 16–27.
Gotlieb, J., & Sarel, D. (1991). Comparative advertising effectiveness: The role of involvement and
source credibility. Journal of Advertising, 20(1), 38–45.
Grewal, D., Gotlieb, J., & Marmorstein, H. (1994). The moderating effects of message framing and
source credibility on the perceived price-risk relationship. Journal of Consumer Research,
21(June), 145–153.
Hirst, D. E., Koonce, L., & Miller, J. (1999). The joint effect of management’s forecast accuracy and
the form of its financial forecasts on investor judgment. Journal of Accounting Research, 37,
101–123.
Kelley, H. (1972). Attribution in social interaction. Morristown, NJ: General Learning Press.
Korem, D. (1997). The art of profiling: Reading people right the first time. Richardson, TX: International
Focus Press.
Maines, L. (1990). The effect of forecast redundancy on judgments of a consensus forecast’s expected
accuracy. Journal of Accounting Research, 28(Suppl.), 29–47.
Mather, P. (1999). Financial covenants and related contracting processes in the Australian private debt
market: An experimental study. Accounting and Business Research, 30(1), 29–42.
McDonald, J., & McKinley, J. (1981). Corporate banking: A practical approach to lending.
Washington, DC: American Bankers Association.
Oldham, J. (1998). The “killer” character component. The Secured Lender, 54(November/December),
62–66.
Pace, E., & Simonson, D. (1977). The four hurdles of lending. The Journal of Commercial Bank
Lending (March), 10–15.
Rosman, A., & Bedard, J. (1999). Lenders’ strategy selection in loan structure decisions. Journal of
Business Research, 83–94.
Ruth, G. (1987). Commercial lending. Washington, DC: American Bankers Association.
Shaub, M. (1996). Trust and suspicion: The effects of situational and dispositional factors on auditors’
trust of clients. Behavioral Research in Accounting, 8, 154–174.
Stephens, R. (1980). Uses of financial information in bank lending decisions. Ann Arbor, MI: UMI
Research Press.
Watts, R., & Zimmerman, J. (1986). Positive accounting theory. Englewood Cliffs, NJ: Prentice-Hall.
EARNINGS MANAGEMENT AND
FRAMING: THE SPECIFIC CASE
OF OBSOLETE INVENTORY
ABSTRACT
Recent events have shown that earnings management is a significant problem
in the business world and that the culture in place in many organizations may
encourage managers to manipulate earnings. While prior research has shown
that earnings management exists at the corporate level, it has not examined
whether managers at the divisional level are motivated to manage earnings.
The purpose of this study is to examine whether divisional managers will be
more inclined to manage earnings in order to maximize personal wealth. The
secondary research objective is to examine whether the information frame
will impact discretionary management accounting decisions. Members of
the Institute of Management Accountants participated in an earnings man-
agement study in which two conditions were manipulated. First, the annual
compensation of subjects was contingent on whether target income was met
or not met. Second, information about a potentially obsolete inventory item
was framed as either positive or negative. Subjects were asked the likelihood
they would write off the potentially obsolete inventory. Research findings
support the earnings management hypothesis and indicate that managers
are less likely to write off obsolete inventory when their compensation is
impacted by the write-off. Study results also reveal that the manner in which
1. INTRODUCTION
Arthur Levitt, while chair of the Securities and Exchange Commission (SEC),
announced a focus on firms that manage earnings (Levitt, 1998). He unfolded an
action plan to address earnings management. Initiatives included better accounting
practices, standards and interpretative guidelines, stricter SEC focus on earnings
management, a review of audit practices, and a call for a cultural change in the
business world regarding the acceptance of earnings manipulations. While the
SEC can address most of these concerns with better standards and practices,
changing the culture of business is more complex. It involves changing the behav-
ior of individuals. Research needs to be conducted that addresses why individuals
manage earnings. Such research is important to future accounting practices.
The purpose of this study was threefold. First, earnings management was
experimentally examined in a managerial accounting setting. Previous empirical
research has examined earnings management at the corporate level indirectly
through the analysis of financial results.1 Researchers typically study discretionary
management decisions (e.g. write-downs of impaired assets) via publicly available
information and infer whether earnings management has occurred based on a
comparison of actual financial results to some expectation (Rees et al., 1996;
Zucca & Campbell, 1992). Rather than taking this approach in identifying
earnings management behavior, this study behaviorally examines whether bonus
plans influence managers’ decisions.
The second purpose of the study was to investigate earnings management at
the divisional level rather than the overall corporate view, looking at what occurs
within the firm.2 A survey by Buck Consultants of Fortune 1000 companies
found that 61% of U.S. companies offer variable compensation plans below the
executive level, and another 27% are considering them (Wilson, 2001). This
increase in bonus type plans creates greater incentive for earnings management.
Earnings management occurs at the corporate level due, in part, to managers’
efforts to achieve incentive compensation based targets (Watts & Zimmerman,
1978). Schipper (1989) states, “Clearly, compensation schemes and divisional
managers’ private information create a potential incentive to manipulate internal
managerial accounting reports.” If performance of managers at the lower levels
of the firm is also measured based on these types of targets, then the possibility
exists that earnings management could occur at these levels. Managers could use
various means to manipulate earnings, from writing off low value inventory items
to controlling the timing of shipments to customers. The outcome of some of
these methods could be “buried” in the results of normal operations and therefore
might not be obvious at the corporate level. Alternatively, the consolidation of
this manipulated divisional income could result in significantly greater earnings
management at the corporate level than previously estimated. This division
level earnings management could be a potential intervening variable, which
has led to conflicting results in at least one published earnings management
study.3
The third purpose was to examine the effects of framing on earnings man-
agement. Subjects were presented with information pertinent to a discretionary
managerial decision from both a negative and positive viewpoint. Kahneman
and Tversky (1979) theorize that the way information is framed can impact
decision-making. This study looks at the potential impact of the information
frame on the decision to write off inventory.
The results support both the earnings management and framing hypotheses.
Findings suggest that management accountants are more apt to write off inventory
when: (1) their personal wealth is unaffected; and (2) information is framed
negatively. An important contribution of this research is the fact that information
framing can have an impact on the earnings management decision. The probability
of writing off inventory was higher, although not significantly so, for participants with
negatively framed information, even though their personal wealth decreased, than
for those with positively framed information who were not eligible for a bonus. The
management accounting implication of these results is that managers’ decisions
could be influenced by the way information is presented.
This paper is organized in the following manner. Background and hypotheses
are developed and presented in Section 2. The research design and methodologies
used to test the hypotheses are presented in Section 3. Results are shown in
Section 4, and Section 5 presents contributions and implications for further
study.
these decisions affect a firm’s cash flows and reported net income. The following
hypotheses examine the decision making behavior of managers.
Fig. 1. Predicted Outcomes Based on Bonus Maximization Theory. Note: Adapted from
Healy (1985).
Prospect theory (Kahneman & Tversky, 1979) proposes that information presen-
tation impacts the editing process involved in decision making. Subtle changes
in the wording of facts of a situation can alter an individual’s reference point
(the point at which a decision is made), and ultimately their final decision. For
example, stating probabilities as a 25% chance of gain (positive frame) versus a
75% chance of loss (negative frame) has been found to affect decision-making
(Kahneman & Tversky, 1984). This framing of information can directly impact
the decision by altering the context or frame of reference in a way that is irrelevant,
sometimes leading to sub-optimal decisions.
Some framing research in accounting has occurred in auditing. Shields et al.
(1987) examined the effects of framing on an auditor’s uncertainty judgments
of account valuation. The sample space for accounts was framed as either book
value misstatements or audit values. They found no effect of the frame on the
auditor’s judgments of account accuracy. However, Ayers and Kaplan (1993)
found auditors exhibited confirming tendencies when assigned a misstatement
(non-misstatement) frame by selecting more misstatement (non-misstatement)
cues as relevant to explaining financial statement ratios. Beeler and Hunton (2002)
found evidence that the existence of non-audit revenues creates a predecisional
distortion of client related information, thereby suggesting a potential impairment
of independence.
Framing research has also examined managerial accounting issues. Lipe
(1993) studied framing in a variance investigation decision and the
subsequent performance evaluation of the investigation manager. She found that
when investigation expenditures were framed as a cost, managers were evaluated
more favorably than when the same expenditures were framed as a loss. In
another study, Rutledge (1995) examined the interaction between recency effects
and framing. He found that recency effects might be tempered by the framing of
decision relevant information.
The above research indicates that framing may impact accounting decisions.
The way managers perceive information may influence their propensity to manage
earnings, leading to the following hypothesis:
H2. Managers will be more likely to take income-decreasing action when
relevant decision information is framed in a negative manner than when that
information is framed in a positive manner.
Presentation framing may also impact the decision to write off an inventory
item. This study uses a potentially obsolete inventory item to operationalize the
accounting decision. If information regarding an inventory item is presented in
100 MARYBETH M. MURPHY AND JOANNE P. HEALY
a positive manner, the manager is expected to be less likely to classify the item
as obsolete and write it off. Conversely, when information about the item is
presented from a negative viewpoint, the item is more likely to appear obsolete and
be written off.
The intent of the study was to determine if earnings management would occur
in a bonus situation and whether the frame would influence a manager’s decision
process. Members of the Institute of Management Accountants (IMA) were chosen
as subjects, since these individuals are typically in managerial positions that involve
decisions such as writing off obsolete inventory. A randomly selected sample of
1000 actively employed members was obtained from the IMA.5
was set at $1,502,000, just above the threshold, so that an inventory write-off
would reduce net income below the threshold, eliminating the manager’s bonus.
With the negative initial reference point, a statement was included implying that
early results indicate that actual income will be lower than budgeted income.
Net income was set at $1,400,000, well below the threshold, to remove any
possibility that the threshold could be achieved. This operationalization simulated
a situation where the manager had the opportunity to reduce current year’s
earnings with no impact on personal wealth and improve prospects for subsequent
years.
The italicized line in the second paragraph of the scenario shown in the Appendix
indicates where these statements were placed in the survey instrument with the
negative initial reference point shown in brackets. The first independent variable,
INCOME, was a result of the manipulation of this initial reference point. Analysis
of this variable was conducted as a between-subjects design.
managers across the firm (all trying to achieve income targets) decide to write off
small valued inventory items, the write-offs have the potential to be material in
the aggregate. In addition, they are probably the most common method of writing
off inventory (Hepworth, 1953). Therefore, immaterial inventory write-offs were
selected as the discretionary decision in this study.
To operationalize this inventory write-off, the value of the inventory item
involved was set at $15,000. A number of factors were considered when setting the
dollar level of the potential discretionary decision. The amount was set at about
4% of inventory, considered immaterial in value. An immaterial value was chosen
for a number of reasons. First, if written off, the amount would be buried in cost
of goods sold. Therefore it would not be obvious to outsiders, and probably not be
detected by auditors. These decisions would be the type described by Hepworth
(1953). Second, most managers would be expected to act conservatively and
write off the amount. Therefore, differences in decision-making would basically
be due to bonus implications, information framing, or both. In addition to
materiality considerations, if the write-off takes place, the amount is large
enough to cause the income level to fall below budget expectations for those
receiving information that income was above the threshold. Obviously, for those
below the threshold, the write-off would have no impact on bonuses this year.
in the scenario. These payoffs are based on the bonus equal to 0.5% of plant net
income ($7,500 if budgeted net income is met). This dollar value was selected to
approximate bonus compensation for plant controllers.6
The only subjects to receive a bonus in the current year were those with income
greater than budget who did not write off the inventory. Since their expected
payoff is $7,510 ($1,502,000 at 0.5%), this group had the greatest opportunity
cost from writing off the inventory. Based on this payoff, these subjects were
expected to be the least likely to write off inventory, in accordance with the bonus
maximization theory.
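The materiality and payoff figures above can be checked with a small worked example. The Python sketch below is illustrative rather than part of the study: the names are hypothetical, the dollar amounts are those given in the text and the Appendix scenario, and the bonus rule follows the scenario's wording that the budget must be achieved or exceeded.

```python
# Figures taken from the scenario; all names are illustrative only.
BUDGETED_NET_INCOME = 1_500_000   # bonus threshold
WRITE_OFF = 15_000                # value of the potentially obsolete item
TOTAL_INVENTORY = 350_000         # plant inventory, per the Appendix scenario
BONUS_PER_THOUSAND = 5            # 0.5%, expressed per thousand to stay in integers


def bonus(net_income):
    """Bonus is 0.5% of net income, paid only if the budget is met or exceeded."""
    if net_income < BUDGETED_NET_INCOME:
        return 0
    return net_income * BONUS_PER_THOUSAND // 1000


def payoffs(estimated_net_income):
    """Bonus with and without the write-off, and the bonus forgone by writing off."""
    keep = bonus(estimated_net_income)
    write_off = bonus(estimated_net_income - WRITE_OFF)
    return {"keep": keep, "write_off": write_off, "opportunity_cost": keep - write_off}


# Write-off as a percentage of total inventory (the "about 4%" in the text)
materiality_pct = round(100 * WRITE_OFF / TOTAL_INVENTORY, 1)

above = payoffs(1_502_000)   # income target met before the write-off
below = payoffs(1_400_000)   # income target not met

print("materiality (% of inventory):", materiality_pct)
print("target met:    ", above)
print("target not met:", below)
```

Only the group whose estimated income exceeds the budget gives up a bonus ($7,510) by writing off the item. For the group below the threshold, the write-off leaves the current-year bonus at zero either way.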
3.3. Pretesting
Prior to mailing the survey instruments, two pretests were conducted to provide
evidence for content validity as well as to improve the experimental task. The
first pretest was conducted during the monthly meeting of a local IMA chapter.
Comments provided by the participants were incorporated to improve the scenario
before the second pretest was undertaken. In general, participants of the initial
test found the scenario to be incomplete. Specifically, they requested additional
information on the relationship between the value of the write-off and total
inventory. Participants also inquired if the parts could be resold as replacement
parts. A line was added that indicated that no such market existed. The second
pretest was conducted during the monthly meeting of another IMA chapter five
months later. Again, all comments were considered and minor grammatical
changes were made to the experiment. The responses from these 38 pretest
participants have not been included in the final sample.
3.4. Procedure
Dillman’s “Total Design Method” (1978) was employed in the design and mailing
of the questionnaires. Each envelope and cover letter was printed with the
individual’s name and address to make the request more personal. Questionnaires
were numerically coded to determine which subjects had responded to the
mailing. The cover letter indicated that this coding was for mailing purposes
only and individual responses would not be associated with names of subjects.
Participants were asked to complete the experiment and were provided with a
stamped, self-addressed envelope for its return.
4. RESULTS
Table 1 indicates the number of responses. There were 242 (24.2%) responses
from the initial mailing. A second mailing was sent to the non-respondents; an
additional 149 questionnaires were returned, increasing the overall response rate
to 39.1%. Of the total 391 responses received, 27 were returned either unanswered
or incomplete. Another 48 were from non-accountants. The remaining 316 were
used for the analyses.
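The response figures reported above can be reconciled with a few lines of arithmetic. The Python sketch below is illustrative; the variable names are hypothetical and the counts are those stated in the text.

```python
# Counts as reported in the text; variable names are illustrative.
sample_size = 1000            # questionnaires mailed
first_mailing = 242           # responses to the initial mailing
second_mailing = 149          # additional responses after the follow-up
unanswered_or_incomplete = 27
non_accountants = 48

total_responses = first_mailing + second_mailing
initial_rate_pct = round(100 * first_mailing / sample_size, 1)
overall_rate_pct = round(100 * total_responses / sample_size, 1)
usable = total_responses - unanswered_or_incomplete - non_accountants

print(total_responses, initial_rate_pct, overall_rate_pct, usable)
```

The totals agree with the text: 391 responses, an overall response rate of 39.1%, and 316 usable questionnaires.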
Tests for non-response bias were conducted on the final sample of 316 participants.
Mean responses to the participants’ probability of write-off question from the first
mailing were compared to those of the second. Kruskal–Wallis tests indicated no
significant differences between the two mailings (p > χ² = 0.236). t-Tests were
conducted for years of experience, number of certifications, firm type, type of
degree, and years on the current job. No significant differences were noted.
The manipulation of the earnings management situation was tested utilizing the
response to the question, “Did you achieve the operating budget prior to the
inventory write-off decision?” This ensured that the subjects knew the position
of estimated net income relative to the budget. Approximately 86% of the
respondents answered the manipulation check for the operating budget correctly.7
The success of the manipulation of the inventory frame was confirmed by anal-
yzing the subjects’ response to the following question, “How risky do you feel it
is for the inventory to remain on the books?” Subjects were asked to respond on a
7 point Likert-type scale with “Very Risky” and “Not Risky” at opposite ends. A
Mann–Whitney test found significant differences between the means of the positive
(3.88) and negative (3.49) inventory frames at the 5% probability level indicating
that the frame manipulation had succeeded ( p = 0.0458).
4.4. Demographics
Table 2 provides overall information about the respondents in this study. The
respondents held positions in a fairly diverse range of industries, with
46% of respondents employed by manufacturing firms. Subjects employed by
service-oriented firms composed the next largest group (12.7%), followed by those
from public accounting firms (10.4%). The remaining subjects (30.0%) worked
in a variety of environments from banking, retailing, non-profits, consulting, to
distribution.
The respondents were well educated with over 97.5% holding a bachelor’s
degree. An additional 38.6% held advanced degrees. More than half the group
possessed some form of certification. Thirty-nine percent held one certification,
while 13.5% had obtained two or more. The most common certifications were the
Certified Public Accountant (CPA) and the Certified Management Accountant
(CMA).
income target was met, the probability of writing off inventory was 57.5% for
the positive inventory frame compared to 67.64% for the negative frame. The
average likelihood of write-off when the income target was met was 63.1%. In
the condition where income targets were not met, respondents who received the
positive inventory frame indicated that there was a 63.13% likelihood that they
would write off inventory, whereas those with the negative frame indicated a 76.88% likelihood.
The average likelihood of write-off when the income target was not met was
69.9%. The results can also be viewed from the inventory frame. For the positive
frame, the average likelihood of inventory write-off was 60.5%. For the negative
frame, the average likelihood of write-off was 73.1%.
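The likelihoods reported above form a 2 × 2 layout of income condition by inventory frame. The Python sketch below simply tabulates the figures given in the text and checks that each reported average lies between the two cell means it summarizes. The names are hypothetical, and because cell sizes are not reported here, the averages are compared with the cell means rather than recomputed from them.

```python
# Mean likelihood of write-off (%) by condition, as reported in the text.
cell_means = {
    ("met", "positive"): 57.5,
    ("met", "negative"): 67.64,
    ("not met", "positive"): 63.13,
    ("not met", "negative"): 76.88,
}

# Reported averages for each level of each factor.
reported_income_avg = {"met": 63.1, "not met": 69.9}
reported_frame_avg = {"positive": 60.5, "negative": 73.1}


def between(value, a, b):
    """True if value lies within the closed interval spanned by a and b."""
    return min(a, b) <= value <= max(a, b)


# Each reported average should fall between the two cell means it summarizes.
income_ok = {
    inc: between(avg, cell_means[(inc, "positive")], cell_means[(inc, "negative")])
    for inc, avg in reported_income_avg.items()
}
frame_ok = {
    frm: between(avg, cell_means[("met", frm)], cell_means[("not met", frm)])
    for frm, avg in reported_frame_avg.items()
}

# Difference between negative and positive frames within each income condition.
frame_effect = {
    inc: round(cell_means[(inc, "negative")] - cell_means[(inc, "positive")], 2)
    for inc in ("met", "not met")
}

for inc in ("met", "not met"):
    print(f"{inc:8s} positive={cell_means[(inc, 'positive')]:6.2f} "
          f"negative={cell_means[(inc, 'negative')]:6.2f} "
          f"difference={frame_effect[inc]:5.2f}")
```

In both income conditions the negative frame is associated with a higher stated likelihood of write-off, which is the direction predicted for the framing manipulation.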
information is framed negatively, managers are more likely to write off inventory
even when there would be an adverse effect on their income, than when informa-
tion is framed positively and their income would be unaffected by their decision.
However, since the Kruskal–Wallis z-value comparing the difference between
the means of these two groups was not significant (z = 1.2152), H3c is not
supported.
4.8. Discussion
bonus by taking the write-off (mean = 67.64), than when the information is
framed positively and there is no chance the executive would receive a bonus
(mean = 63.13). While the difference in results is not statistically significant
and could be the result of random fluctuation, it does have interesting behavioral
implications. The differences in information frames were designed to be subtle
and not necessarily meant to mislead the reader. If these small changes in
information presentation could yield differences in decision making in a situation
where outcomes were more or less expected, how might the information frame
affect other, less obvious decisions? This suggests that the impact of
framing could possibly be important in other accounting decisions as well.
Managers receive and communicate information about decisions every day. If the
frame impacts the decision making in such a seemingly predictable decision as in
the earnings management situation, it has the potential to impact other decisions.
Are managers aware of the potential impact of framing on their decision making?
What should they be alert to in the decision-making analysis?
Another interesting outcome was the effect of the frame on the write-off decision.
It is interesting to note how something as simple and as indirect as the frame of
the information presentation (inventory frame) could have a significant impact on
results. These results raise the question of what other behavioral factors could
influence managers’ decisions to manage earnings and provide a basis for future
research into the effects of framing on discretionary managerial decisions.
NOTES
1. Burgstahler and Dichev (1997), Wu (1997), Cahan et al. (1997), Rees et al. (1996),
Healy (1996), Amir and Livnat (1996), Bernard and Skinner (1996), Dechow et al. (1996),
Dechow et al. (1995) are just a few of the most recent examples.
2. Schipper (1989) states that although there is a potential incentive for earnings
management at the divisional level, research in that area is “sparse to non-existent.”
3. White (1970) found no evidence of earnings management.
4. Watts and Zimmerman (1978) suggest that political costs and debt violations also
affect managers’ motivations to manipulate earnings. These factors would most likely
impact earnings management at the corporate level. The current research examines earnings
management at the plant level, and does not explicitly test for these other factors.
5. Student members or members reporting their employment status as retired were
excluded from the population.
6. Based on 1998 salaries and total compensation reported by Schroeder and Reichardt
(1999).
7. ANOVA tests were conducted on the sample excluding those individuals who
answered this question incorrectly. Results did not differ greatly from the entire test
sample. The p-value for the variable INCOME was p = 0.0005 for this group and 0.0010
for the full sample; for INVENTORY p = 0.0671 and 0.0627 respectively.
ACKNOWLEDGMENTS
We would especially like to thank Elizabeth Cole, Tim Fogarty, Pete Poznanski,
Ray Stephens and Linda Zucca for their helpful comments and assistance,
and the Institute of Management Accountants for its support. We
gratefully acknowledge the financial support received from the Research Council
of Kent State University.
REFERENCES
Amir, E., & Livnat, J. (1996). Multiperiod analysis of adoption motives: The case of SFAS No. 106.
The Accounting Review, 71(4), 539–553.
Ayers, S., & Kaplan, S. E. (1993). An examination of the effect of hypothesis framing on auditors’
information choices in an analytical task. Abacus, 29(2), 113–131.
Barnea, A., Ronen, J., & Sadan, S. (1976). Classificatory smoothing of income with extraordinary
items. The Accounting Review, 52(2), 110–122.
Beeler, J. D., & Hunton, J. E. (2002). Contingent economic rents: Insidious threats to audit indepen-
dence. Advances in Accounting Behavioral Research, 5, 21–50.
Bernard, V. L., & Skinner, D. J. (1996). What motivates managers’ choice of discretionary accruals?
Journal of Accounting and Economics, 22(1–3), 313–325.
Burgstahler, D., & Dichev, I. (1997). Earnings management to avoid earnings decreases and losses.
Journal of Accounting and Economics, 24(1), 99–126.
Cahan, S. F., Chavis, B. M., & Elemendorf, R. G. (1997). Earnings management of chemical firms in
response to political costs from environmental legislation. Journal of Accounting, Auditing &
Finance, 12(1), 37–65.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1995). Detecting earnings management. The Accounting
Review, 70(2), 193–225.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1996). Causes and consequences of earnings
manipulations: An analysis of firms subject to enforcement actions by the SEC. Contempo-
rary Accounting Research, 13(1), 1–36.
Dillman, D. A. (1978). Mail and telephone surveys – The total design method. New York, NY: Wiley.
Elliott, J. A., & Shaw, W. H. (1988). Write offs as accounting procedures to manage earnings. Journal
of Accounting Research, 26(Suppl.), 91–119.
Financial Accounting Standards Board (1992). Original pronouncements – accounting standards –
Volume II. Norwalk, CT.
Guidry, F., Leone, A. J., & Rock, S. (1999). Earnings-based bonus plans and earnings management by
business unit managers. Journal of Accounting and Economics, 26(1–3), 113–142.
Healy, P. M. (1985). The effect of bonus schemes on accounting decisions. Journal of Accounting &
Economics, 7(1–3), 85–107.
Healy, P. M. (1996). Discussion of a market-based evaluation of discretionary accrual models. Journal
of Accounting Research, 34(3), 107–115.
Hepworth, S. R. (1953). Smoothing periodic income. The Accounting Review (January), 32–39.
Johnson, P. E., Jamal, K., & Berryman, R. G. (1991). Effects of framing on auditor decisions. Organi-
zational Behavior and Human Decision Processes, 53(2), 75–105.
Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Economet-
rica, 47(2), 263–291.
Kahneman, D., & Tversky, A. (1984). Choices, values and frames. American Psychologist (April),
341–350.
Levitt, A. (1998). The numbers game (September 28th). New York, NY: NYU Center for Law and
Business.
Lipe, M. G. (1993). Analyzing the variance investigation decision: The effects of outcomes, mental
accounting and framing. The Accounting Review, 68(4), 748–764.
McNichols, M., & Wilson, G. P. (1988). Evidence of earnings management from the provision for bad
debts. Journal of Accounting Research, 26(Suppl.), 1–31.
O’Clock, P., & Devine, K. (1995). An investigation of framing and firm size on the auditor’s going
concern decision. Accounting and Business Research, 25(99), 197–201.
Puto, C. P. (1987). The framing of buying decisions. Journal of Consumer Research, 14(3), 301–315.
Rees, L., Gill, S., & Gore, R. (1996). An investigation of asset write-downs and concurrent abnormal
accruals. Journal of Accounting Research, 34(3), 157–169.
Ronen, J., & Sadan, S. (1975). Classificatory smoothing: Alternative income models. Journal of
Accounting Research, 3(4), 133–149.
Rutledge, R. W. (1995). The ability to moderate recency effects through framing of management
accounting information. Journal of Mathematical Economics, 11(2), 27–40.
Schipper, K. (1989). Commentary on earnings management. Accounting Horizons, 3(4), 91–102.
Schroeder, D., & Reichardt, K. (1999). IMA 98 Salary Guide. Strategic Finance, 8(20), 28–41.
Shields, M. D., Solomon, I., & Waller, W. S. (1987). Effects of alternative sample space representation
on the accuracy of auditor’s uncertainty judgments. Accounting, Organizations and Society,
12(4), 375–385.
Walsh, P., Craig, R., & Clarke, F. (1991). Big bath accounting using extraordinary items adjustments:
Australian empirical evidence. Journal of Business Finance and Accounting, 18(2), 173–189.
Watts, R., & Zimmerman, J. (1978). Towards a positive theory of the determination of accounting
standards. Accounting Review, 53(1), 112–134.
White, G. E. (1970). Discretionary accounting decisions and income normalization. Journal of
Accounting Research, 8(2), 260–273.
Wilson, T. B. (2001). What’s hot and what’s not: Key trends in total compensation. Compensation &
Benefits Management, 17(2), 45–50.
Wu, Y. W. (1997). Management buyouts and earnings management. Journal of Accounting, Auditing,
and Finance, 12(4), 373–389.
Zucca, L. J., & Campbell, D. R. (1992). A closer look at discretionary write-downs of impaired assets.
Accounting Horizons, 6(3), 30–41.
APPENDIX
Scenario
You are the plant accountant for a Cleveland area plant of the Spring Wire Company.
The responsibilities of your position include the processing of payroll, payments
to vendors (accounts payable), inventory accounting, preparation of budget and
estimates, and analysis of actual plant operating results. All members of the plant
staff (including yourself) are given a bonus contingent on achieving or exceeding
the plant’s operating budgeted net income of $1,500,000. If the budgeted operating
income is achieved, 0.5% of the current year’s net income will be paid to you in the
form of a bonus. (e.g. if net income is $1,510,000, your bonus would be $7,550.)
It is January 1, and you have received estimated net income for the year of
$1,502,000 [$1,400,000]. In past years, these early results have proved to be
accurate, with few unexpected adjustments made after this date.
You have one last chance to review the status of your inventory that was taken on
December 31st to determine if any potentially obsolete inventory items should be
written off. You are presented with the following information from the Inventory
and Materials Manager (also a staff manager) concerning the inventory item in
question.
Part Number PX23415 is sold to computer manufacturers. It has a current inven-
tory of 5,000 units on hand with a total current inventory value of $15,000. Your
plant’s total inventory including Part Number PX23415 is $350,000. The demand
for this product is 15% of last year’s demand [Industry sales of this product have
demonstrated an 85% decline in both volume and dollar amounts in the last year].
The inventory turnover ratio for this item has declined substantially from the
prior year. Of the original market for the product, about 20% of your competitors
remain [Approximately 80% of your competitors in the market for this product
have ceased production and sales]. No sales occurred during the months of
November or December for your company. Because of the nature of this product,
the potential for this part to be sold in the replacement parts market does not exist.
(1) Please indicate the percent probability in your opinion that this inventory will
be sold. (0–100%)
(2) Please indicate the percent probability that you would write off Part Number
PX23415 from inventory. (0–100%)
For questions 3 through 5, place an “X” on the box that best indicates your opinion.
(3) How risky do you feel it is for the inventory to remain on the books?
Very Risky Not Risky
(1) (2) (3) (4) (5) (6) (7)
(4) Indicate on the scale below your perception of what is occurring in the
marketplace to the demand for this part?
Significantly Decreased No Change
(1) (2) (3) (4) (5) (6) (7)
(5) Would you consider the bonus an important component of your income?
Very Important Not Important
(1) (2) (3) (4) (5) (6) (7)
(6) Did you achieve the operating budget prior to the inventory write-off decision?
Yes or No
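The bonus rule in the scenario above can be checked with a few lines of arithmetic. The sketch below is a hypothetical illustration and is not part of the instrument: the function name `bonus` is an assumption, and it assumes that writing off the $15,000 carrying value of Part Number PX23415 reduces net income dollar for dollar (taxes and other adjustments are ignored).

```python
BUDGET = 1_500_000    # budgeted operating net income, from the scenario
BONUS_RATE = 0.005    # 0.5% of the current year's net income
WRITE_OFF = 15_000    # inventory value of Part Number PX23415

def bonus(net_income: float) -> float:
    """Bonus under the scenario's rule: paid only if the budget is met."""
    return BONUS_RATE * net_income if net_income >= BUDGET else 0.0

# Compare the two estimated-income conditions before and after a write-off
# (assumption: the write-off lowers net income by its full amount).
for estimate in (1_502_000, 1_400_000):
    before = bonus(estimate)
    after = bonus(estimate - WRITE_OFF)
    print(f"estimate {estimate:,}: bonus {before:,.0f} -> {after:,.0f} after write-off")
```

Under these assumptions, the write-off changes the bonus only in the condition where estimated income narrowly exceeds the budget; in the other condition the bonus is zero either way.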
THE EFFECTS OF INCENTIVE
STRUCTURE AND GOAL DIFFICULTY
ON TIME PLANNING DECISIONS
WITHIN A BALANCED SCORECARD
FRAMEWORK
ABSTRACT
Recent innovations in management control systems, such as the Balanced
Scorecard System, reflect today’s complex business environment by accounting
for performance in multiple areas. When individuals must allocate their
time among multiple areas that compete for it, the manner in
which incentives are structured is hypothesized to influence their decisions
differently depending on goal difficulty. A decision-making experiment was
conducted to test this proposition. When incentives were structured so that
each area of the Balanced Scorecard was rewarded separately, challenging
goals received more planned attention than easy or unattainable goals,
consistent with previous findings. When incentives were structured so that goals
in all areas had to be achieved together, the influence of goal difficulty on
the time planning decision diverged from previous findings: areas
having unattainable goals received the same planned attention as areas
having challenging goals. The results suggest that companies must consider
how performance is rewarded within a Balanced Scorecard framework.
INTRODUCTION
This study is motivated by today’s competitive business environment that requires
individuals to give their attention to many areas, all of which compete for their time.
Recent innovations in management accounting control systems, such as Kaplan
and Norton’s (1992, 1996) Balanced Scorecard, reflect this situation and attempt
to influence individuals to balance their time among multiple areas through the
establishment of goals, incentives, and accounting systems. While a great deal of
research has been conducted regarding the effects of incentives and goal difficulty
in relation to a single task (cf. Bonner, Hastie, Sprinkle & Young, 2000; Camerer
& Hogarth, 1999; Cameron & Pierce, 1994; Jenkins, Gupta, Mitra & Shaw, 1998;
Wood & Locke, 1990), very little is known about the effects of these variables
on behavior in relation to accomplishing multiple tasks (Ashford & Northcraft,
in press; Locke & Latham, 1990) as addressed by the Balanced Scorecard.
Research into the effects of incentives and goal difficulty on behavior within
a Balanced Scorecard framework is needed for several reasons. Foremost is the
fact that the kinds of incentive structures that are possible when multiple tasks are
involved have received scant attention in the literature. For instance, incentives
associated with a Balanced Scorecard can be structured so that rewards are
received only after meeting the goals in all areas. Or, Balanced Scorecard areas
can be decoupled so that rewards are provided after meeting goals associated with
individual areas. Furthermore, achieving the goals in one area may be easy while
it may be very challenging in another. The combinations of these possibilities add
a level of complexity to the Balanced Scorecard environment that prior research
has largely overlooked. For these reasons, Ashford and Northcraft
(in press) call for more research into decision-making when multiple tasks
compete for an individual’s time and attention. The use of the Balanced Scorecard
as a management tool has increased the need for this research.
Naylor, Pritchard and Illgen (1980) posit a theory, hereafter NPI theory,
suggesting that when individuals are faced with multiple objectives, how they
allocate their time among the areas that compete for their time is more important
to achieving overall satisfactory results than the total amount of time spent
working on all of their goals. This distinction has been termed direction of effort
versus level of effort (Blau, 1986, 1993). Because the many studies that examine
goal difficulty and incentives typically use only a single goal and single task,
they address only level of effort. The effects of incentives and goal difficulty on
direction of effort remain largely unexplored.1
As a practical issue, organizations would benefit from a better understanding
of how incentives and goal difficulty interact to influence how individuals expect
to use their time among their areas of responsibility. The Balanced Scorecard
System (Kaplan & Norton, 1992, 1996) is based on the premise that overall
performance is improved when goals in all areas are reached together. Failure
on one dimension cannot be completely compensated by success in others.
Conceptually then, organizations may desire to reward individuals only when
they achieve satisfactory performance in all of the Balanced Scorecard areas.
One finding of goal research, however, is that while challenging goals generally
motivate more effort than easy goals (Wood & Locke, 1990), unattainable goals
often do not and can sometimes have large negative consequences (Fatseas &
Hirst, 1992; Lee, Locke & Phan, 1997; Mowen, Middlemist & Luther, 1981;
Wright, 1992). This being the case, basing rewards on areas coupled together via
a comprehensive control system may produce unintended consequences when
information suggests that goals in one or more areas are unattainable. Research
is clearly needed to answer these types of practical questions.
A theoretical justification for this study is that, for Balanced Scorecard systems
to work, they must affect the plans of individuals. Without premeditated, goal-
directed planning, individuals do not control their environments but are controlled
by them. This notion is consistent with the idea that closely related constructs like
goal commitment, goal motivation, and intentions affect goal-related performance
(Locke, Latham & Erez, 1988). If the Balanced Scorecard does not motivate
individuals sufficiently to alter their plans about where they will spend their time,
it is difficult to argue that they are committed to it (cf. Naylor & Illgen, 1984, p. 98),
and achieving its objectives is unlikely. Hence, this study builds on the theoretical
foundation from prior studies by looking at the time planning decisions of
the subjects.
Studies that examine the effects of planning and intentions on performance
generally conclude that these variables have a stronger effect than most other
variables. For instance, Chesney and Locke (1991) find that identifying an
appropriate strategy for completing a complex task in the initial planning stage
has a greater effect on performance than does goal difficulty. Earley, Wojnaroski
and Prest (1987) find that planning is positively associated with performance
in both the laboratory and the field. In a study by Cotton and Tuttle (1986),
intentions predicted subsequent behavior more reliably than any other variable
they identified in the literature. McAllister, Mitchell and Beach (1979) find
that individuals who planned to spend more time on a task actually did spend
more time on it and thus conclude that intentions are positively related to
performance.
Also from a theoretical viewpoint, this research extends the findings of many
goal studies that employ tasks having a production-line orientation to a context that
more closely resembles those encountered by individuals in management roles.
Managers operate in environments that inherently place many demands on their
124 BRAD TUTTLE AND MARK J. ULLRICH
time at once. Although prior goal research has examined complex as well as simple
tasks (Chesney & Locke, 1991; Wood & Locke, 1990; Wood, Mento & Locke,
1987), subjects have typically worked towards only a single goal. Settings charac-
terized by single objectives are more characteristic of unskilled or process-oriented
jobs – not management level positions. On the other hand, the task of allocating
one’s individual time and attention between various demands is highly consonant
with what managers do. That is, a manager’s time is his most valuable and scarce
resource and how that resource is allocated likely makes the most difference
to what gets accomplished (Miodonski, 1999; Plack, 2000). Few studies have
looked into factors that influence time allocation between tasks in a managerial
context.
Investigating the difficulty of the goal is also important in a Balanced Scorecard
framework. Information about goal difficulty is an integral, if not a necessary,
component to the successful achievement of most important goals (Wood &
Locke, 1990) and is a major rationale for the existence of the Balanced Scorecard.
Simply put, having a “goal” without also having the ability to assess one’s position
relative to it is not much of a goal. Nevertheless, only a very small portion
of the goal literature examines behavior in a setting where information about the
level of goal difficulty in one area permits subjects to shift their time to or from
other relevant, work-related areas. Yet, this is exactly what is possible within a
Balanced Scorecard system.
and, therefore, are less motivating than easy or challenging goals (Fatseas &
Hirst, 1992; Lee, Locke & Phan, 1997; Mowen et al., 1981; Wright, 1992). Erez,
Gopher and Arzi (1990) partially extend these conclusions to multiple tasks and
find that proportionately more attention is allocated to more difficult tasks. To the
extent that these findings generalize to time planning decisions by individuals,
they suggest that individuals will plan to spend more time on challenging goals
and less time on easy or unattainable goals.
Bonner, Hastie, Sprinkle and Young (2000) refer to incentives, within the context
of a management control system, as the presence or absence of motivators linked
to performance. They differentiate incentives from incentive type, which refers to
how pay is tied to performance and provide the following major categories: flat
rate, piece rate, variable ratio, quota (or goal), and tournament. As such, incentive
type refers to how incentives are tied to performance that is generally associated
with a single task. This differs from incentive structure, which is used in this
paper to refer to the way incentives are structured between tasks as in the multiple
areas of the Balanced Scorecard. An incentive structure may consist of just one
incentive type or of multiple incentive types across various performance measures
associated with different tasks and goals.
Organizations often implement monetary incentives to motivate goal congruent
behavior. These incentives are designed to motivate individuals to increase their
goal-related effort by making the goal more attractive to attain (Vroom, 1964), by
reinforcing performance (Komaki, Coombs & Schepman, 1996), by motivating
individuals to set more or higher goals (Wright, 1991) or by increasing the
acceptance of difficult goals (Locke, Latham & Erez, 1988). Given the importance
of time planning to goal accomplishment, incentives should motivate individuals
to plan sufficient time to meet their goals. Evidence shows that performance-based
incentives increase the amount of time individuals spend on a task (Awasthi
& Pratt, 1990; Libby & Lipe, 1992; Sprinkle, 2000; Stone & Zeibart, 1995; Tuttle
& Burton, 1999; Tuttle & Harrell, 2001). Some research, however, suggests that the
relationship between incentives and behavior is not direct but is contingent upon
the type of incentive being offered (Bonner et al., 2000) and the difficulty of the goal
(Wright, 1991).
Using NPI theory as a framework, Wright (1991) suggests that goal difficulty
and the structure of incentives interact to determine effort. Wright argues that
incentives will have a negative effect when effort is costly and does not result
in extrinsic rewards. To illustrate, consider the case where an individual is paid
incentives are based on goal attainment in each area separately than when
incentives are based on goal attainment for all areas as a set.
H4. Shifts in time from Balanced Scorecard areas that information indicates are
unattainable to areas that information indicates are challenging will be greater
when incentives are based on goal attainment in each area separately than when
incentives are based on goal attainment for all areas as a set.
Note that H1 and H2 combine to suggest that a manager’s time allocation will
follow an inverted U-shaped function in relation to goal difficulty for a single
Balanced Scorecard area. Furthermore, as a result of shifting time to and from
competing areas, the time allocated to these other areas will resemble a righted
U-shaped function in relation to the goal difficulty of the single target area (holding
goal difficulty constant for the other competing areas). Figure 1 expresses this
relationship and is consistent with several models of motivation beginning as
early as Atkinson (1958). H3 and H4 imply that both U-shaped functions will be
flatter when incentives are provided only when goals are achieved in all areas in
comparison to when incentives are based on goal attainment for each area of the
Balanced Scorecard individually.
METHOD
Experimental Design
As shown in the Appendix, all participants were projected into the role of a
unit-level manager who was to plan his/her time among four areas corresponding
to a typical Balanced Scorecard: Customer, Financial, Internal Business, and
Learning & Growth. Example performance measures for each area were presented
along with information about goal difficulty. These four goals and associated
performance measures were derived from the Balanced Scorecard used by Mobil
Oil’s domestic marketing and oil refining division (Kaplan, 1997a, b). All subjects
received the same four areas and goals.
The participants were informed that they were being considered for a promotion
and that the corporation offered a performance bonus of up to 20% of their salary,
both of which were linked to their goals. About half of the participants were
informed that the likelihood of promotion and bonus depended upon “how many
goals you achieve” while the remaining participants were informed that their
promotion and bonus depended upon “achieving all four goals together.” This
constituted the incentive structure manipulation with one group’s bonus based
on achieving goals in individual areas and the other group’s bonus based on
achieving goals in the entire set of Balanced Scorecard areas. Thus, all subjects
were provided with a possibility to achieve the same reward; only the manner in
which the incentive was structured varied between groups.
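The contrast between the two incentive structures can be made concrete with a small sketch. The payout rules below are assumptions made for illustration only: the case materials state that the bonus is "up to 20%" of salary and depends either on "how many goals you achieve" or on "achieving all four goals together", so the proportional payout in `bonus_separate`, the function names, and the salary figure are all hypothetical.

```python
MAX_BONUS_RATE = 0.20  # "up to 20%" of salary, from the case materials
AREAS = ("Customer", "Financial", "Internal Business", "Learning & Growth")

def bonus_separate(achieved: set[str], salary: float) -> float:
    """Assumed rule: each achieved goal earns an equal share of the maximum bonus."""
    return salary * MAX_BONUS_RATE * len(achieved & set(AREAS)) / len(AREAS)

def bonus_set(achieved: set[str], salary: float) -> float:
    """Assumed rule: the bonus is paid only if all four goals are achieved."""
    return salary * MAX_BONUS_RATE if set(AREAS) <= achieved else 0.0

# A manager who misses only the Customer goal (hypothetical salary of 100,000).
met = {"Financial", "Internal Business", "Learning & Growth"}
print(bonus_separate(met, 100_000))
print(bonus_set(met, 100_000))
```

Under the assumed all-or-nothing rule, missing a single goal removes the payoff for effort in every area, whereas the assumed separate rule still rewards the goals that were met.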
Goal difficulty was manipulated for the Customer area by providing the subjects
with “reliable feedback suggesting that the Customer goal is” easily attainable,
challenging but attainable, or not attainable, depending on their experimental
condition. This resulted in three conditions: Easy, Challenging, and Unattainable.
Goal difficulty was held constant for the other three areas (Financial, Internal
Business, and Learning & Growth) at a “challenging but attainable level” for all
participants.
All participants were informed that they could work as many hours per week
as they wished and that they were free to allocate their work hours as they saw
fit except that they must spend 15 hours per week on tasks unrelated to their four
goals.2 The rest of their time at work was to be devoted to achieving the goals in the
four areas. The participants were then asked how they would allocate their hours
at work to achieve the goals in each of the four Balanced Scorecard areas. Thus,
the hours-per-week the subjects intended to work were collected for each goal
resulting in planned time to spend on Customer, Financial, Internal Business, and
Learning & Growth goals. The sum of these four responses is the total goal related
hours per week. To measure the relative amounts of time allocated to achieving the
various goals, the difference in time allocated to the (manipulated) customer area
and the average time allocated to the three other areas was computed. Positive
numbers indicate that more time was allocated to the customer goal than the average time
allocated to the other three goals. Negative numbers indicate that more time was allocated to the
three competing goals, on average, than to the customer goal. Hence, this measure reflects the
relative emphasis that the subjects placed on the manipulated goal in comparison
to other challenging goals that are competing for their time.3
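The relative-emphasis measure described above can be written out as a short sketch. The function name and the planned hours are hypothetical; only the formula (Customer hours minus the mean of the hours planned for the other three areas) and the total of the four responses come from the text.

```python
from statistics import mean

OTHER_AREAS = ("Financial", "Internal Business", "Learning & Growth")

def relative_emphasis(hours: dict[str, float]) -> float:
    """Customer hours minus the average hours planned for the other three areas."""
    return hours["Customer"] - mean(hours[area] for area in OTHER_AREAS)

# Hypothetical plan for one participant (hours per week on each goal).
plan = {"Customer": 12, "Financial": 9, "Internal Business": 10, "Learning & Growth": 8}
total_goal_hours = sum(plan.values())  # total goal-related hours per week
print(total_goal_hours, relative_emphasis(plan))
```

A positive value means the participant plans more time for the Customer goal than for the average competing goal; a negative value means the opposite.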
After the dependent measures were collected, subjects responded to a goal
difficulty manipulation check in which they selected the goal difficulty information
that they received in the case from among the three possibilities. Likewise, in
order to check the incentive manipulation, the participants selected the incentive
manipulation statement that they received in the case. Next, the participants
were asked two questions regarding the valence of the incentives and their
effort-to-performance expectancy. The valence question asked how attractive the
bonus and promotion were using a nine-point Likert scale anchored by “1 = very
unattractive” and “9 = very attractive.” The effort-to-performance expectancy
question asked the subjects to rate how likely they would be to accomplish all four
goals if they exerted maximum effort using a nine-point Likert scale anchored by
“1 = very unlikely” and “9 = very likely.”4
Finally, the participants were asked to provide demographic information. The
data were gathered during regularly scheduled classes. Participation was voluntary
and anonymous and the experiment took about 15 minutes to complete.5
Participants
RESULTS
Preliminary Analysis
conditions (all p-values > 0.10) for any of the three categorical demographic
variables: gender, educational degree program, and current compensation plan.
Separate 2 × 3 ANOVAs were conducted for each continuous demographic
variable: the number of years of work experience, the maximum number of
individuals supervised, and whether compensation at work was contingent upon
achieving a goal or goals in each of the four business areas. Incentive structure at
two levels and goal difficulty at three levels served as the independent variables.
The ANOVA results show no significant differences (all p > 0.10) for any
continuous demographic variable across cells. Thus, results from the analysis
of the demographic data suggest that randomization was effective and that the
subjects are homogeneous across treatment conditions.
The attractiveness of the incentives and the expectancy of accomplishing
the goals in all four areas (given maximum effort) should not differ between
incentive structures. The data support this proposition in that the attractiveness
of the incentives based on each area separately (mean = 8.11) does not differ
from the attractiveness of the incentives based on achieving the goals in all areas
(mean = 8.28, t = 1.0441, p = 0.2980). Incentive structure was not predicted to
affect goal challenge but to interact with goal challenge to affect motivation. Con-
sistent with this notion, the expectancy of accomplishing the goals in all four areas
when incentives were based on each area (mean = 6.94) does not differ signifi-
cantly from when incentives were based on all areas (mean = 7.24, t = 0.8874,
p = 0.3762).
The expectancy of accomplishing all goals, however, should differ by goal
difficulty so that the expectancy should decrease with goal difficulty. The results
generally support this proposition in that easy and challenging goals (means 7.84
and 7.81, respectively) are seen as more likely to be accomplished (p = 0.0001)
than unattainable goals (mean = 5.78).
The total amount of time the subjects planned to work in a week, as shown in
Table 1, was not affected by incentives or goal difficulty. The finding that subjects
do not adjust their workweek for incentives or goal difficulty is consistent with
Naylor, Pritchard and Illgen (1980) who assert that total work effort is stable
across most conditions other than those associated with individual differences.
Hypotheses Testing
The first hypothesis predicts that individuals will shift time from goals that
information indicates are easy to goals that information indicates are challenging.
Recall that goal difficulty was manipulated only for the customer goal and that
goal difficulty was held constant (i.e. challenging but attainable) for the other
three goals. To measure the relative amounts of time allocated to achieving the
various goals, the difference in time allocated to the (manipulated) customer
area and the average time allocated to the three other areas was computed.
The hypothesis predicts that the difference in time allocations should be larger
(more positive) when the customer goal is challenging compared to when
it is easy.
The hypothesis was tested using a 2 × 2 ANOVA with the difference in time
allocated between the customer goal and the average of the other three goals as the
dependent measure. Goal difficulty (easy versus challenging but attainable) and
incentive structure (separate versus set) served as the independent variables. As
can be seen from Panel A in Table 2, the main effect for goal difficulty is highly
significant (F = 33.82, p = 0.0001). When the customer goal is challenging, then
all four goals are challenging. In the situation where all goals are challenging,
Panel B of Table 2 shows that the subjects allocated more time to the customer
area than to the other areas (mean difference = +1.77 hours) possibly reflecting
a bias towards taking care of customers or a belief that this area requires a greater
time commitment. In contrast, when the customer goal is easy, the subjects shift
the time they plan to spend toward accomplishing the three other challenging goals
(mean difference = −3.10 hours). Hypothesis 1 is strongly supported.
The second hypothesis predicts that individuals will shift time from areas whose
goals information indicates are unattainable to areas whose goals information
indicates are challenging. The second hypothesis was tested in a like manner
to H1 using a 2 × 2 ANOVA with the difference in time allocated between the
customer area and the average of the other three areas as the dependent measure.
For this test, goal difficulty (challenging versus unattainable) and incentive
structure (separate versus set) served as the independent variables. As can be seen
from Panel A in Table 3, the main effect for goal difficulty is highly significant
(F = 10.14, p = 0.0019). This result is modified, however, by a significant goal
difficulty by incentive interaction as discussed below.
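The 2 × 2 tests reported above partition the variance in the time-difference measure into two main effects and an interaction. The sketch below shows that partition for a balanced between-subjects design in plain Python. It is an illustration under stated assumptions, not the authors' analysis: the function name, the equal cell sizes, and every data value are hypothetical, and a study with unequal cell sizes would need a different sums-of-squares procedure.

```python
from itertools import product
from statistics import mean

def two_way_anova(cells: dict[tuple[str, str], list[float]]) -> dict[str, float]:
    """F statistics for a balanced design; cell keys are (difficulty, incentive)."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    assert all(len(v) == n for v in cells.values()), "design must be balanced"

    grand = mean(x for v in cells.values() for x in v)
    a_mean = {a: mean(x for b in b_levels for x in cells[a, b]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[a, b]) for b in b_levels}
    cell_mean = {k: mean(v) for k, v in cells.items()}

    # Sums of squares for each main effect, the interaction, and error.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[a, b] - a_mean[a] - b_mean[b] + grand) ** 2
        for a, b in product(a_levels, b_levels)
    )
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    ms_error = ss_error / (len(a_levels) * len(b_levels) * (n - 1))
    return {
        "difficulty": (ss_a / df_a) / ms_error,
        "incentive": (ss_b / df_b) / ms_error,
        "interaction": (ss_ab / (df_a * df_b)) / ms_error,
    }

# Hypothetical time-difference scores (hours), three subjects per cell.
cells = {
    ("challenging", "separate"): [3.0, 2.0, 4.0],
    ("challenging", "set"): [1.0, 2.0, 0.0],
    ("unattainable", "separate"): [-4.0, -3.0, -5.0],
    ("unattainable", "set"): [1.0, 0.0, 2.0],
}
f_stats = two_way_anova(cells)
print(f_stats)
```

Converting each F statistic to a p-value requires the F distribution, which the Python standard library does not provide; a statistics package would be needed for that step.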
Overall results for H1 and H2 support the prediction that subjects shift the
time they are willing to spend from one area of responsibility to another due to
goal difficulty as described in Fig. 1. Figure 2 shows the results from the study
in the same graphic form as Fig. 1. Recall that the information indicated that all
non-customer goals (i.e. goals from competing areas) are challenging. As can be
seen, individuals react to goal difficulty information by shifting their time from
areas associated with easy goals to those associated with challenging goals, and
from unattainable goals to challenging goals in a manner that supports our overall
prediction.
H3 and H4 suggest that incentive structure modifies the relationship between
goal difficulty and planned time leading to the prediction that the interaction terms
reported in Tables 2 and 3 should be significant. As can be seen in Panel A of
Table 2, incentive structure does not interact with goal difficulty (F = 0.01,
p = 0.9243) thus failing to support H3. Hence, the data do not suggest that
incentive structure modifies the amount of time subjects plan to spend on Balanced
Scorecard areas associated with easy versus challenging goals.
Panel B of Table 3 shows that when incentives are based on each goal separately,
information indicating that the customer goal is unattainable caused individuals
those areas differ in terms of whether their associated goals are unattainable versus
challenging or easy.
Supplemental Analysis
Predicting separate differential effects for the manipulations on the time allocated
to each of the three non-customer goals is not possible. Nevertheless, in the spirit
of the study’s main premise that individuals consider all areas of the Balanced
Scorecard together as they plan their time, supplemental analysis of these data
is reported. Panel A of Table 4 shows the results of a MANOVA in which hours
allocated to the Financial goal, the Internal Business goal, and the Learning and
Growth goal are dependent variables with goal constituting a within subject vari-
able. Customer goal difficulty at three levels (easy, challenging, and unattainable)
and incentive structure at two levels (separate versus set) served as the independent
variables. As can be seen from the table, the analysis shows a three-way interaction
between goal, customer goal difficulty, and incentive structure (F(4, 318) = 2.44,
p = 0.0468) making the interpretation of other effects difficult.
Some insights into the interaction of these variables are possible by examining
the mean hours allocated towards attaining each of the three non-customer
goals, as well as hours allocated to the customer goal, as shown in Panel B of
Table 4. Consider first the case in which incentives are based on achieving each
goal separately. Here, the time allocated to the customer (manipulated) goal
follows the predicted inverted-U shaped pattern (Fig. 1) and the time allocated to
each competing goal generally follows the predicted righted-U shaped pattern. As
hours are shifted to and from the customer goal according to its difficulty, the hours
are spread relatively consistently across the three competing goals. In contrast,
consider the case in which incentives are based on achieving all goals as a set. Here,
no perceptible difference in time allocation occurs between the challenging and
unattainable conditions across any of the four goals. That is, when the customer
goal is challenging, the subjects allocated 10.1 hours to this goal, and they allocated a similar
10.3 hours when the goal is unattainable. When the customer goal is challenging
versus unattainable, hours allocated to the other three goals correspond closely
as well: financial goal = 8.6 and 9.6; internal business goal = 9.9 and 9.9; and
learning and growth goal = 8.6 and 8.9, respectively. These observations suggest
that the pattern of results shown in Fig. 2 is driven by the condition in which
incentives reward each goal separately – a conclusion that is consistent with H4.
This incentive structure more closely resembles those used in the prior studies
upon which the predictions were based (in contrast to incentives in which rewards
are received only after achieving an entire set of distinct, competing goals).
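The set-condition means quoted above can be compared goal by goal. The sketch below only restates figures given in the text for the case in which incentives reward all goals as a set; the variable names are assumptions.

```python
# Mean planned hours when incentives reward all goals as a set (from the text).
set_means = {
    "challenging": {"Customer": 10.1, "Financial": 8.6,
                    "Internal Business": 9.9, "Learning & Growth": 8.6},
    "unattainable": {"Customer": 10.3, "Financial": 9.6,
                     "Internal Business": 9.9, "Learning & Growth": 8.9},
}

# Change in mean hours per goal when the customer goal moves from
# challenging to unattainable.
shifts = {
    goal: set_means["unattainable"][goal] - set_means["challenging"][goal]
    for goal in set_means["challenging"]
}
for goal, shift in shifts.items():
    print(f"{goal}: {shift:+.1f} hours")
```

On these figures, no goal's mean allocation moves by more than about one hour between the two conditions, and the largest change is for the financial goal.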
[Table 4. Time Allocated to Balanced Scorecard Areas Other than the Customer Area. Panel A: MANOVA (Source, df, F, p). Panel B: mean hours for the Customer, Financial, Internal Business, and Learning & Growth goals.]
DISCUSSION
Some strengths and limitations to the study should be mentioned before discussing
its findings. The study was conducted in the laboratory using a written exercise
designed to capture the essentials of managers’ time allocation decisions. As such,
care should be taken when extrapolating the results to other contexts and situations.
On the other hand, the study employs a strong design that contributes to its internal
validity and allows us to examine the proposed causal relationships. It also benefits
from a high level of experimenter control and uses a task that corresponds more
closely to the kinds of tasks performed by managers than many of the previous
goal studies. Furthermore, the materials that the subjects used are based on the
Balanced Scorecard of an actual company. These factors increase the study’s
external validity.
This is one of very few studies to examine the effects of goal difficulty and
the effects of incentive structure in a Balanced Scorecard context where multiple
demands vie for the subjects’ time. Based on Naylor, Pritchard and Illgen’s (1980)
NPI theory, we predicted that subjects would shift their time between areas based on
the goal difficulty information they received and as influenced by their incentives.
These predictions were generally supported. Although we found considerable
support that incentive structure and goal difficulty affect how individuals allocate
their time between areas, we found no evidence that either influences the total
amount of time the subjects said they would work to achieve satisfactory results in
all the Balanced Scorecard areas. This also supports NPI theory’s assumption that
in a work-related situation, people do not change their total level of effort except
under very unusual circumstances. Rather, individuals shift their time from easy goals
to more challenging goals and from unattainable goals to challenging goals.
These findings suggest that organizations should consider incentives and
management control variables such as goal difficulty information as ways to
change or refocus individuals’ time and not as ways to induce more effort. This has
implications for the kinds of effort attributions that are sometimes made during
performance evaluation. Evaluators should be careful about attributing negative
performance to a lack of effort unless they have first ruled out misdirected effort.
The finding also highlights the importance of timely and accurate information if
individuals are to direct their time appropriately. The findings further imply
that individuals are sensitive to variables that are under organizational control and
therefore open to manipulation. Organizations could encourage appropriate
goal-directed behavior by making sure that their incentives and reporting systems
focus individuals’ time on important organizational goals.
One of the major insights of the study is that individuals react differently to
goal difficulty under different incentive structures. When a particular goal is
easy compared to challenging, the time allocated to achieving goals that are
competing for the manager’s attention does not differ according to incentive
structure. However, when information indicates that one goal is unattainable,
the incentive structure makes a substantial difference. When monetary incentives
are based on the extent to which the subjects met each goal individually, the
subjects shifted approximately 6.51 hours from the area with unattainable goals
to alternative areas. However, when the monetary incentives are provided only
upon achieving the goals in all areas, the subjects did not plan to shift any hours
from the area with unattainable goals to alternative areas. Envisioning situations
in which either result is desirable is certainly possible. If goals are somewhat
arbitrary, in that just missing a goal is still beneficial, then basing rewards on
individual goal achievement could be counterproductive. Once missing a goal
becomes obvious, individuals will dramatically decrease their planned effort in
that area and redirect it towards meeting challenging but still attainable goals.
On the other hand, there are situations in which organizations want to discourage
individuals from working on unattainable goals. In this case, they should base
monetary rewards on attaining individual goals rather than all goals.
We note that individuals will plan to spend time working on a goal despite
receiving reliable information that the goal is unattainable. This suggests that
individuals consider more than goal difficulty when planning their time. For
instance, individuals may continue to be psychologically committed to goals
that they have previously accepted despite receiving negative goal difficulty
information. In addition, they may feel a need to justify their actions and believe
that missing a goal is easier to justify if effort has been expended than if one
quits altogether. They may also wish to come as close as possible to achieving
the goal in order to preserve their reputations as best they can – coming close
may not be viewed as badly as being way off the mark. Also, individuals know
that in most cases, goal achievement in future periods is tied to the level of effort
exerted this period. Hence, they may be reluctant to completely cease working on
an unattainable goal in order to avoid beginning in a hopeless situation the next
period. These conjectures are fruitful topics for future research.
Together, the findings from the study strongly suggest that when multiple
areas compete for attention, as in the Balanced Scorecard, the way incentives
are structured influences how individuals plan their time between areas rather
than their total level of effort. We have argued that planning one’s time to be
successful in multiple areas is a crucial aspect of what individuals, and particularly
managers, do. For these reasons, this study represents an important contribution
to knowledge about ways incentives can be structured in a Balanced Scorecard
framework to help organizations achieve their goals. We hope others will find
the approach taken by this study useful in examining these issues.
NOTES
1. Effort includes both time and intensity components; however, Larson and Callahan
(1990) argue that individuals are more likely to differentially allocate their time than vary
their intensity between tasks. They argue that individuals “groove in” to an overall level of
intensity, which they strive to maintain over time.
2. In pretests, subjects were concerned about duties other than those directly tied to
Balanced Scorecard areas. Inclusion of the 15 hours per week on tasks unrelated to their
four goals controls for differences in the amount of time that subjects would otherwise
have assumed needed to be spent on these tasks.
3. One reviewer suggested analyzing the data using proportions rather than difference
scores. The results are equivalent using either method (cf. Tuttle & Harrell, 2001).
4. The manipulation checks were presented with the original case materials, and
subjects were asked not to look back. A stronger test would have been to administer the
post-experimental materials separately from the case.
5. A small number of students, whom we did not count, chose not to participate. No
monetary incentive was provided.
ACKNOWLEDGMENTS
The authors would like to thank workshop participants at the University of Utah
and the University of South Carolina for their helpful comments.
REFERENCES
Ajzen, I. (1987). Attitudes, traits, and actions: Dispositional prediction of behavior in personality
and social psychology. In: L. Berkowitz (Ed.), Advances in Experimental Social Psychology
(Vol. 20, pp. 1–63). San Diego, CA: Academic Press.
Ajzen, I., & Madden, T. J. (1986). Prediction of goal-directed behavior: Attitudes, intentions,
and perceived behavioral control. Journal of Experimental Social Psychology, 22(5),
453–474.
Anthony, R., & Govindarajan, V. (1998). Management control systems. Homewood, IL: Irwin/McGraw-
Hill.
Ashford, & Northcraft, G. (2002). Robbing Peter to pay Paul: Feedback environments and enacted
priorities in response to competing task demands. Human Resource Management Review,
forthcoming.
Atkinson, J. W. (1958). Motives in fantasy, action, and society: A method of assessment and study.
Princeton, NJ: Van Nostrand.
Awasthi, V., & Pratt, J. (1990). The effects of monetary incentives on effort and decision performance:
The role of cognitive characteristics. The Accounting Review, 65(4), 797–811.
Blau, G. (1986). The relationship of management level to effort level, direction of effort, and managerial
performance. Journal of Vocational Behavior, 29, 226–239.
Blau, G. (1993). Operationalizing direction and level of effort and testing their relationship to individual
job performance. Organizational Behavior and Human Decision Processes, 55, 152–170.
Bonner, S. E., Hastie, R., Sprinkle, G. B., & Young, S. M. (2000). A review of the effects of financial
incentives on performance in laboratory tasks: Implications for management accounting.
Journal of Management Accounting Research, 12, 19–64.
Camerer, C. F., & Hogarth, R. M. (1999). The effects of financial incentives in experiments: A review
and capital-labor-production framework. Journal of Risk and Uncertainty, 19(1–3), 7–42.
Cameron, J., & Pierce, W. D. (1994). Reinforcement, reward, and intrinsic motivation: A meta-analysis.
Review of Educational Research, 64, 363–423.
Chesney, A. A., & Locke, E. A. (1991). Relationships among goal difficulty, business strategies, and
performance on a complex management simulation task. Academy of Management Journal,
34(2), 400–424.
Cotton, J. L., & Tuttle, J. M. (1986). Employee turnover: A meta-analysis and review with implications
for research. Academy of Management Review, 11(1), 55–70.
Covey, S. R. (1989). The seven habits of highly effective people: Restoring the character ethic. New
York, NY: Simon and Schuster.
Early, P. C., Wojnaroski, P., & Prest, W. (1987). Task planning and energy expended: Exploration of
how goals influence performance. Journal of Applied Psychology, 72, 107–114.
Erez, M., Gopher, D., & Arzi, N. (1990). Effects of goal difficulty, self-set goals, and monetary
rewards on dual task performance. Organizational Behavior & Human Decision Processes,
47(2), 247–270.
Fatseas, V. A., & Hirst, M. K. (1992). Incentive effects of assigned goals and compensation schemes
on budgetary performance. Accounting and Business Research, 22(88), 347–355.
Gollwitzer, P. M., & Bargh, J. A. (1996). The psychology of action. New York, NY: Guilford Press.
Jenkins, G. D., Gupta, N., Mitra, A., & Shaw, J. D. (1998). Are financial incentives related to performance?
A meta-analytic review of empirical research. Journal of Applied Psychology, 83(5),
777–787.
Kaplan, R. S. (1997a). Mobil USM&R (A): Linking the balanced scorecard. Boston, MA: Harvard
Business School Publishing.
Kaplan, R. S. (1997b). Mobil USM&R (B): New England sales and distribution. Boston, MA: Harvard
Business School Publishing.
Kaplan, R. S., & Norton, D. P. (1992). The balanced scorecard: Measures that drive performance.
Harvard Business Review (January–February), 71–79.
Kaplan, R. S., & Norton, D. P. (1996). Translating strategy into action: The balanced scorecard. Boston,
MA: Harvard Business School Publishing.
Komaki, J. L., Coombs, T., & Schepman, S. (1996). Motivational implications of reinforcement theory.
In: R. M. Steers, L. W. Porter & G. A. Bigley (Eds), Motivation and Leadership at Work (pp.
34–52). New York, NY: McGraw-Hill.
Larson, J. R., Jr., & Callahan, C. (1990). Performance monitoring: How it affects work productivity.
Journal of Applied Psychology, 75(5), 530–538.
Lee, T. W., Locke, E. A., & Phan, S. H. (1997). Explaining the assigned goal-incentive interaction:
The role of self-efficacy and personal goals. Journal of Management, 23(4), 541–559.
Libby, R., & Lipe, M. G. (1992). Incentives, effort, and the cognitive processes involved in accounting-
related judgments. Journal of Accounting Research, 30(2), 249–273.
Locke, E. A., & Latham, G. P. (1990). A theory of goal setting and task performance. Englewood Cliffs,
NJ: Prentice-Hall.
Locke, E. A., Latham, G. P., & Erez, M. (1988). The determinants of goal acceptance and commitment.
Academy of Management Review, 13, 23–39.
McAllister, D. W., Mitchell, T. R., & Beach, L. R. (1979). The contingency model for the selection
of decision strategies: An empirical test of the effects of significance, accountability, and
reversibility. Organizational Behavior and Human Decision Processes, 24(2), 228–244.
Miodonski, B. (1999). Time management is key to juggling multiple jobs. Contractor, 46(2), 5.
Mowen, J., Middlemist, R., & Luther, D. (1981). Joint effects of assigned goal level and incentive
structure on task performance: A laboratory study. Journal of Applied Psychology, 66, 598–603.
Naylor, J., & Illgen, D. (1984). Goal setting: A theoretical analysis of a motivation technology. Research
in Organizational Behavior, 6, 95–140.
Naylor, J., Pritchard, R., & Illgen, D. (1980). A theory of behavior in organizations. New York, NY:
Academic Press.
Plack, H. (2000). Managing time can be crucial. Baltimore Business Journal, 17(40), 27.
Sprinkle, G. B. (2000). The effect of incentive contracts on learning and performance. The Accounting
Review, 75(3), 299–326.
Stone, D. N., & Zeibart, D. A. (1995). A model of financial incentive effects in decision making.
Organizational Behavior and Human Decision Processes, 61(3), 250–261.
Tuttle, B., & Burton, F. G. (1999). The effects of a modest incentive on information overload in an
investment analysis task. Accounting, Organizations and Society, 24, 673–687.
Tuttle, B., & Harrell, A. M. (2001). The impact of unit goal priorities, economic incentives, and interim
feedback on the planned effort of information systems professionals. Journal of Information
Systems, 15(2), 81–98.
Vroom, V. H. (1964). Work and motivation. New York, NY: Wiley.
Wood, R. E., & Locke, E. A. (1990). Goal setting and strategy effects on complex tasks. Research in
Organizational Behavior, 12, 73–109.
Wood, R. E., Mento, A. J., & Locke, E. A. (1987). Task complexity as a moderator of goal effects: A
meta-analysis. Journal of Applied Psychology, 72(3), 416–425.
Wright, P. M. (1991). Goals as mediators of the relationship between monetary incentives and performance:
A review and NPI theory examination. Human Resource Management Review, 1(1),
1–22.
Wright, P. M. (1992). An examination of the relationships among monetary incentives, goal level, goal
commitment, and performance. Journal of Management, 18(4), 677–693.
APPENDIX
Sample Decision Case
Columbia Corporation
Assume that you are a unit level manager employed by the Columbia Corporation.
Columbia’s senior management has identified a competitive strategy that is linked
to goals in four important business areas. All unit managers have the same goals. In
addition, performance measures were developed for each business area as follows:
Notice that you have received reliable interim feedback suggesting that the Customer
goal is easily attainable and that the other three goals are challenging but
attainable.
Bonus and Promotion: Two items are of particular interest. First, a division
manager is retiring and you are being considered for his replacement. Second,
Columbia provides a performance bonus of up to 20% of your salary.
Both your promotion and bonus depend on how many goals you achieve. The
more goals you achieve the greater your bonus and likelihood of promotion.
Decision: Like most managers, assume that you can work as many hours as you
want and you can allocate the hours as you see fit. Further, assume that during the
next performance evaluation period, you must spend 15 hours per week working on
administrative and other responsibilities that are not directly related to achieving
your goals in the four business areas (e.g. personnel issues, travel). Also, assume
that you will devote all your remaining work time towards achieving your goals in
the four business areas. Given the information in the case, please indicate below
how you would allocate your hours at work to achieve the goals in each business
area:
Customer Hours/week
Financial Hours/week
Internal business Hours/week
Learning & growth Hours/week
Administrative & other 15 Hours/week
Total work hours Hours/week
THE EFFECT OF FAIRNESS IN
CONTRACTING ON THE CREATION
OF BUDGETARY SLACK
Theresa Libby
ABSTRACT
This paper explores the relationship between fairness in contracting and
the creation of budgetary slack. A laboratory experiment was performed
in which privately informed subjects were compensated under either a
truth-inducing or slack-inducing incentive contract. Contracting processes
were either fair or unfair as defined by procedural justice theory (Leventhal,
1980; Lind & Tyler, 1988). Under the slack-inducing contract, subjects
exposed to the fair contracting process created significantly less slack than
subjects exposed to the unfair contracting process. Slack created by subjects
compensated under the truth-inducing contract was low and insensitive to
the fairness or unfairness of the contracting process employed.
INTRODUCTION
In large, decentralized organizations, accounting information often forms the basis
for budget estimates used in strategic planning, in coordinating work between
organizational divisions, and in setting targets used in performance evaluation
(Merchant, 1985). The accuracy of budget estimates is key to the effectiveness of
these short-run and long-run planning activities. Even so, prior research indicates
budget estimates are rarely accurate (Otley, 1985). The lack of accuracy of budget
estimates may be the result of the manager’s inability to forecast accurately
operational input-output relationships due to uncertainty inherent in the task.
In addition, the organization may operate in an environment characterized by
uncertainty. The manager may respond by building a buffer against uncertainty in
the environment or in the task into his or her budget estimate (Davila & Wouters,
2000).1
Alternatively, inaccuracy in budget estimates may be motivated by budget-
constrained performance evaluation and reward systems (Jensen, 2001). Results
of several studies in the accounting literature indicate that budget-constrained
performance evaluation systems that emphasize variances in budget-to-actual
results lead to budget gaming (Bart, 1988; Hopwood, 1972; Merchant, 1985;
Walker & Johnson, 1999). One form of budget gaming that has been the focus of
significant study is the creation of budgetary slack (Young & Lewis, 1995).
Budgetary slack is defined as the intentional incorporation of budget amounts
that make the budget easier to attain (Dunk, 1993). Budgetary slack is created
when managers build excess resources into their budgets or knowingly understate
their productive capabilities (Baiman & Evans, 1983; Young & Lewis, 1995).
Budgetary slack is often manifested through overstated expenses or understated
revenues and production plans (Kren & Liao, 1988).
While budgetary slack may play a positive role by facilitating flexibility in
dealing with uncertainty (Cyert & March, 1963; Van der Stede, 2000), this paper
focuses on the alternative negative role budgetary slack plays when budgets
are used to set targets for performance evaluation.2 Budgetary slack created
when budget estimates are intentionally set at a level that is easy to attain can
be detrimental to management control system effectiveness, especially when
responsibility center managers are held accountable for meeting budget targets
and these targets are used to coordinate activities between organizational divisions
and to compensate managers for high performance.
According to Jensen (2001) and Murphy (2000), the typical pay-for-
performance compensation contract includes a fixed salary plus a bonus
increasing in performance above a pre-specified budget target. When a manager
is compensated under this type of contract, holds private information about the
productive capability of his/her division and participates in setting his/her own
budget target, incentives for slack creation exist. Consequently, this type of
contract has been labeled slack-inducing (Waller, 1988). A significant stream
of research has developed using the agency framework to test the ability of
other forms of budget-based incentive contracts to encourage managers to reveal
their private information while limiting the amount of budgetary slack managers
create (Baiman, 1982). These types of contracts have been labeled truth-inducing
[...]
accurate budget targets (Demski & Feltham, 1978; Melumad & Reichelstein, 1989;
Namazi, 1985). A major concern of this literature is that participation in setting
budget targets allows for information sharing, but also increases the potential for
the creation of budgetary slack if managers are then compensated based on meeting
or exceeding the budget that was participatively set (Antle & Eppen, 1985). Truth-
inducing contracts have been constructed to address this problem.
Truth-inducing contracts impose a penalty for misrepresentation, usually
scaled by the difference between budgeted and actual performance, providing an
incentive for subordinates to reveal their private information through the budget
targets they set (Kirby et al., 1991; Weitzman, 1976). The particular form of
truth-inducing contract studied here was developed theoretically by Reichelstein
and Osband (1984) and adapted to the budgeting context by Kirby et al. (1991).
The contract was further adapted by Kirby (1992) to a context in which the
manager selects a budget target and focuses effort on maximizing output to meet
or exceed that target. The contract is of the following form:
$$H(A, B) = v(B) + w(B)(A - B)$$
subject to
$v(B)$ is increasing and convex ($v' > 0$, $v'' > 0$) and $w(B) = v'(B)$ for all $B$.
In this context, H(A, B) represents the manager’s total compensation, B repre-
sents the productivity estimate (or budget target) for the period, and A represents
the actual level of productivity for the period. The manager’s total compensation
(H) is therefore made up of an ex ante payment, v(B), and a bonus or penalty,
w(B)(A − B), whose value depends on the variance between budget and actual
performance.
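A short derivation shows why tying the bonus rate to the slope of the ex ante payment induces truthful budgets. The sketch assumes that the manager knows actual productivity A with certainty when choosing B and that v is twice differentiable; both are simplifying assumptions made here for illustration and are not conditions stated in the text.

```latex
\begin{align*}
\frac{\partial H}{\partial B}
  &= v'(B) + w'(B)\,(A - B) - w(B) \\
  &= v''(B)\,(A - B)
  \qquad \text{because } w(B) = v'(B) \text{ and hence } w'(B) = v''(B).
\end{align*}
```

Because v is convex, the derivative is positive when B < A and negative when B > A, so total compensation is highest at B = A, where H(A, A) = v(A). Under these assumptions, any budget other than the manager's true expected output lowers pay.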
The truth-inducing properties of this contract have been tested empirically by
Kirby (1992), Reichelstein (1992), and Chow et al. (2000). While the theoretical
design of the contract relies on the assumption that managers are strict utility
maximizers, Kirby (1992) finds the contract maintains its truth-inducing properties
even when this assumption is relaxed. Reichelstein (1992) reports a successful
application of this contract form by the German Department of Defense. Finally,
Chow et al. (2000) experimentally test several mechanisms designed to motivate
truthful upward communication of private information including this contract
form. They find this truth-inducing contract led to significantly less misrepresentation
of private information than a slack-inducing linear profit sharing scheme.
Accordingly, in the context of the current study, individuals compensated under
this form of truth-inducing contract are expected to create a relatively low amount
of budgetary slack. This prediction is stated formally as follows:
H1. Individuals compensated under a truth-inducing contract will create less
budgetary slack than individuals compensated under a slack-inducing contract.
Social exchange theory defines behavior in terms of two types of exchange, eco-
nomic exchange and social exchange (Blau, 1964). Economic exchange motivates
behavior intended to fulfil the formal economic employment contract. Employers
offer a “fair day’s pay” and expect employees to provide a “fair day’s work.” Social
exchange, on the other hand, is based on a psychological or implicit contract
that defines obligations on the part of both the organization and the employee
(Rousseau & Parks, 1993). Employees may go beyond the specific duties laid out
in the employment contract if they feel the organization “values their contributions
and cares for their general well-being” (Eisenberger et al., 1990, p. 51).
The division of employee behavior into these two separate but related categories
mirrors the theoretical predictions of organizational justice theory (OJT).
OJT suggests individuals’ overall perceptions of fairness in an organizational
setting are based on the combination of judgments about the fairness of the actual
amount of resources allocated by the organization to subordinates, known as distributive
fairness, and judgments about the fairness of the processes used to make
allocation decisions, known as procedural fairness (Folger & Cropanzano, 1998).
[...]
budgeting processes are unfair; that is, employees who perceive budgeting
processes to be unfair will reciprocate this unfair treatment by acting in their own
rather than the organization’s best interests by creating a relatively high amount
of budgetary slack.
Penalty-framed truth-inducing contracts tend to be completely specified in
economic terms at the beginning of the period due to difficulties in enforcing
the penalty after the fact (Luft, 1994). As a result, a shorter-term economic
exchange relationship between the individual and the organization may become
salient under the penalty-framed truth-inducing contract. If so, procedural fairness
becomes less important and employees may then focus on the economic benefits
obtainable in the current period to a greater degree than consideration of any
future benefits that may accrue. Fehr and Gachter (2002) refer to this effect
as a “crowding out” of agents’ incentives to voluntarily cooperate. Results of
their experimental study indicate that incentive contracts that include a penalty
for shirking (i.e. the agent provides less than the agreed upon level of effort)
are less efficient than a fixed-fee contract because they discourage agents from
focusing on the longer term employment relationship and therefore, reduce the
agent’s interest in reciprocating fair treatment. Thus, individuals compensated
under the penalty-framed truth-inducing contract may not respond to the fairness
or unfairness of budgeting processes when selecting a budget target because
economic incentives imbedded in the contract will be most salient to them.
In summary, this review of the literature implies that the relation between
fairness in contracting and budgetary slack creation is moderated by the form
of budget-based incentive contract employed. That is, fairness in contracting
will influence the amount of budgetary slack individuals create when they are
compensated under a slack-inducing, but not a truth-inducing incentive contract.
This line of reasoning leads to the following hypothesis:
Participants
Experimental Task
Contract Type
Incentive contracts used to compensate subjects were based on the slack-inducing
contractual form used by Waller and Chow (1985) and the truth-inducing contractual
form developed by Kirby et al. (1991). Subjects earned tickets that were
entered in a raffle for one of twelve cash prizes of $150. The more tickets subjects
earned, the greater their chance to win one of these prizes. The two types of
incentive contracts were operationalized as follows:5
Slack-inducing contract
Truth-inducing contract
$$\text{Payment} = \frac{(\text{Budget})^2}{100} + \frac{2(\text{Budget})}{100}\,(\text{Actual} - \text{Budget})$$
Subjects assigned the truth-inducing contract were provided a table in which the
total compensation under this contract for various pairs of budgeted and actual
outcomes was calculated. This table is reproduced in Appendix A. All subjects
were given sample budget and actual amounts and asked to calculate the related
compensation that would be received. They then checked these calculations to
ensure that they understood the relationship between their payment, their budget
and their actual performance.
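A table of the kind given to subjects can be reproduced with a few lines of code. The sketch below is illustrative only: the function names and the grid of output levels are hypothetical, and only the truth-inducing payment formula is taken from the text.

```python
def truth_inducing_payment(budget: float, actual: float) -> float:
    """Tickets earned: Budget^2/100 + (2*Budget/100) * (Actual - Budget)."""
    return budget ** 2 / 100 + (2 * budget / 100) * (actual - budget)


def payment_table(budgets, actuals):
    """Return {(budget, actual): payment} for every budget/actual pair."""
    return {(b, a): truth_inducing_payment(b, a) for b in budgets for a in actuals}


def best_budget(actual, budgets):
    """Budget that maximizes payment when actual output is known in advance."""
    return max(budgets, key=lambda b: truth_inducing_payment(b, actual))


if __name__ == "__main__":
    levels = range(10, 60, 10)  # hypothetical output levels (words translated)
    table = payment_table(levels, levels)
    for a in levels:
        row = "  ".join(f"{table[(b, a)]:6.2f}" for b in levels)
        print(f"actual={a:2d}: {row}")
```

For a fixed actual output, the payment simplifies to (2 × Budget × Actual − Budget²)/100, which peaks where Budget equals Actual, so a subject who can predict their output gains nothing by understating it.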
Experimental Procedures
Subjects first completed a five-minute practice period to become familiar with the
translation task. They earned a piece rate of one raffle ticket for every three words
correctly translated. At the end of this practice period, the subjects verified their
work and calculated the number of words that they had correctly translated in the
practice period.7 After practicing the task and being informed of the probability
distribution of words of different lengths, but before experimental manipulations
were introduced, subjects recorded their best estimate of next period performance;
that is, their best estimate of the number of words they expected to be able to
translate if given another five minutes in which to work. Subjects placed this
completed Best Estimate of Production sheet in an envelope and sealed it.
Subjects kept this sealed envelope with them until the experiment was complete
and consequently, this information was unknown to the researcher until subjects
had completed the experiment.
Subjects then read a description of the incentive contract under which they
were to work and the information about the fair or unfair contracting process.
They provided the researcher, acting as the division manager, with the budget
they wished to use in calculating the number of tickets earned in the work
period. Subjects were told the budget would also be used by the division
manager to co-ordinate production between divisions. Information asymmetry
was controlled at a relatively high level by informing all subjects they were new
to the organization and their manager was therefore unsure of their productive
capability and did not have access to the Best Estimate of Production forms.
Subjects wrote down their budgets and then performed the translation task for
five minutes.
The third part of the experiment involved filling out a post-experimental
questionnaire. The experimental materials were then collected and one week
later, subjects received a performance report and the tickets that they had earned.
Tickets were collected and placed in a container from which one of the subjects
drew a winning ticket in each group. A cash prize of $150 was paid to the winning
subject in each group and the goals of the experiment were discussed.8 These
experimental procedures are summarized in Fig. 1.
The dependent variable was the amount of budgetary slack subjects created.
Slack was measured as the difference between the best estimate of next period
performance subjects provided before they were given contract and process
information (i.e. prior to the introduction of the experimental manipulations) and
the budget subjects set after the incentive contract and process information was
provided to them. The pre-manipulation best estimate of next period performance
proxied for subjects’ private information about their own productive capability.
This information was known only to the subject until the experiment was
complete. Budget slack should therefore represent the intentional understatement
of subjects’ productive capabilities motivated by the budget-based incentive
contract and/or the contracting process employed. This method of measuring
budget slack is similar to the method used in prior experimental studies, including
Young (1985), Waller (1988), and Chow et al. (1988).
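The slack measure described above is a simple difference score. The following sketch assumes the sign convention that positive slack means the submitted budget falls below the sealed best estimate; the function names and the sample records are hypothetical and are not data from the study.

```python
def budgetary_slack(best_estimate: float, budget: float) -> float:
    """Slack = pre-manipulation best estimate minus the budget later submitted.

    Assumed sign convention: a positive value means the subject understated
    productive capability when setting the budget.
    """
    return best_estimate - budget


def mean_slack(records):
    """Average slack over an iterable of (best_estimate, budget) pairs."""
    slacks = [budgetary_slack(estimate, budget) for estimate, budget in records]
    return sum(slacks) / len(slacks)


if __name__ == "__main__":
    # Hypothetical subjects: (best estimate, submitted budget) in words translated.
    sample = [(40, 32), (30, 30), (25, 27)]
    print(mean_slack(sample))
```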
RESULTS
Manipulation Check for Contracting Process
To ensure subjects assigned the scenarios designed to represent fair and unfair
incentive contracting processes actually perceived these processes to be fair or
unfair respectively, subjects were asked to answer the following questions on
a scale of one (completely unfair) to five (completely fair): “How fair would
you judge the procedures used to set the formula on which your earnings were
based?” and “How fair would you judge the process of setting the budget used
to calculate your earnings?” These questions were based on measures reported
in Tyler and Lind (1992).9 Each subject’s score was their mean score across the
two questions included in the scale. The overall mean score on this scale was 3.60
(std. dev. = 0.81, Cronbach’s alpha = 0.67). Means and standard deviations for
perceived fairness of the contracting process are presented in Table 1, Panel A.
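The scale score and its reliability can be computed as sketched below. This is a minimal illustration using the standard Cronbach's alpha formula and sample variances; the function names and the responses are hypothetical and do not reproduce the study's data.

```python
from statistics import variance


def scale_scores(item1, item2):
    """Per-subject scale score: the mean of the two fairness items."""
    return [(a + b) / 2 for a, b in zip(item1, item2)]


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one item's responses across subjects.
    alpha = k/(k-1) * (1 - sum of item variances / variance of summed scores)
    """
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


if __name__ == "__main__":
    # Hypothetical responses on the one-to-five fairness scale.
    q1 = [4, 3, 5, 2, 4]
    q2 = [5, 3, 4, 2, 3]
    print(scale_scores(q1, q2))
    print(round(cronbach_alpha([q1, q2]), 2))
```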
A 2 × 2 analysis of variance was performed on subjects’ perceptions of the
fairness of the contracting process (see Table 1, Panel B). Results indicated a
significant difference in subjects’ perceptions of the fairness of the contracting
process depending on whether they read the scenario describing the contracting
process designed to be fair or unfair, F(1, 138) = 5.75 (p < 0.05), but perceptions
of fairness did not differ depending on contract type, F(1, 138) = 0.38, or the
contract by process interaction, F(1, 138) = 1.08, indicating subjects responded
to the manipulation of process fairness as expected.10
In the fair contracting process condition, perceived fairness may also have
manifested itself as a felt social pressure to adhere to the existing norms or culture
of fairness within this organizational division (Naumann & Bennett, 2000).
This perspective may be implied in subjects’ responses to a post-experimental
question asking them to rate the fairness of the work environment. A 2 × 2
analysis of variance indicated a significant main effect of contracting process on
subjects’ evaluation of the fairness of the work environment, F(1, 138) = 54.58
(p < 0.001), with subjects in the fair process condition rating the work
environment as significantly fairer (mean = 3.34, std. dev. = 1.03) than subjects
in the unfair process condition (mean = 2.35, std. dev. = 1.08). No significant
differences were found based on contract form or the contract by process
interaction.
The Effect of Fairness in Contracting on the Creation of Budgetary Slack 159
Hypothesis Tests
DISCUSSION
This study explores the relationship between fair and unfair contracting processes,
budget-based compensation contracts, and the creation of budgetary slack. Prior
research examines the effectiveness of a variety of forms of truth-inducing
contracts in reducing budgetary slack. The current study contributes to the
literature by examining the effectiveness of two specific contract forms when
between fair contracting process and the creation of budgetary slack in the
incentive-contracting setting studied here.
NOTES
subjects would have conserved energy by not performing the task and taking the fixed
portion of the payment available under each of the incentive schemes. No subjects took
this strategy, indicating that the compensation scheme was motivational and that subjects
viewed the target as difficult, but attainable.
9. This scale also includes questions about outcome fairness. The outcome-related
questions were adapted from Tyler and Lind (1992) as “How would you judge the formula
itself that will be used to calculate your earnings for the work period?” and “How fair would
you judge the budget itself?” Perceptions of outcome fairness did not differ depending
on contract type, F(1, 138) = 2.08, process, F(1, 138) = 1.50, or the contract by process
interaction, F(1, 138) = 0.06.
10. As an additional check on subjects’ perceptions of fairness, subjects were asked to
answer the following question: Think about the information you received about the nego-
tiation process between the workers and managers in this organization that was involved
in setting the earnings formula. On a scale of 1 to 5, where 1 means completely unfair
and 5 means completely fair, how fair would you judge the negotiation process? Results
of a 2 × 2 analysis of variance indicated a significant main effect of contracting process
on subjects’ evaluation of the fairness of the negotiation process, F(1, 138) = 39.67,
p < 0.001, with subjects in the fair process condition rating the negotiation process
as fairer (mean = 3.71, std. dev. = 0.86) than subjects in the unfair process condition
(mean = 2.68, std. dev. = 1.09). No significant differences were found based on contract
form or the contract by process interaction.
ACKNOWLEDGMENTS
I would like to thank John Waterhouse, Bill Scott, Duane Kennedy, and Jane
Webster for their guidance in the development and execution of this project. I also
wish to thank Glenn Feltham, Joseph Fisher, Kathryn Kadous, Kevin Kelloway,
Robert Mathieu, Don Moser, Steve Salterio, participants at the 1999 Management
Accounting Research Conference, and the accounting research workshops at
HEC (Montreal) and the University of Alberta for their many helpful comments
and suggestions. I gratefully acknowledge the School of Accountancy, University
of Waterloo and CGA-Canada for their financial support of this project. Data
available from the author upon request.
REFERENCES
Antle, R., & Eppen, G. D. (1985). Capital rationing and organizational slack in capital budgeting.
Management Science (Feb), 163–174.
Atkinson, A. (1985). Truth-inducing schemes in budgeting and resource allocation. Cost & Management
(May/June), 38–42.
Baiman, S. (1982). Agency research in management accounting: A survey. Journal of Accounting
Literature, 1, 154–213.
Baiman, S., & Evans, J. H. (1983). Pre-decision information and participative management control
systems. Journal of Accounting Research, 21, 371–395.
Baker, G. P., Jensen, M. C., & Murphy, K. J. (1988). Compensation and incentives: Practice vs. theory.
Journal of Finance, 43(3), 593–617.
Bart, C. (1988). Budgeting gamesmanship. Academy of Management Executive, 285–294.
Blau, P. (1964). Exchange and power in social life. New York, NY: Wiley.
Chow, C. W. (1983). The effects of job standard tightness and compensation scheme on performance:
An exploration of linkages. The Accounting Review, 58, 667–685.
Chow, C. W., Cooper, J. C., & Waller, W. S. (1988). Participative budgeting effects of truth inducing
pay schemes. The Accounting Review, 63, 111–123.
Chow, C. W., Hwang, R. N., & Liao, W. (2000). Motivating truthful upward communication of private
information: An experimental study of mechanisms from theory and practice. Abacus, 36(2),
160–179.
Cropanzano, R., & Folger, R. (1991). Procedural justice and worker motivation. In: R. M. Staw &
L. W. Porter (Eds), Motivation and Work Behavior (5th ed., pp. 131–143). New York, NY:
McGraw-Hill.
Cyert, R. M., & March, J. G. (1963). A behavioral theory of the firm. Englewood Cliffs, NJ: Prentice-
Hall.
Davila, T., & Wouters, M. (2000). Meeting budgets: Budget emphasis and the release of budgetary
slack. Working Paper: Stanford University, Stanford, CA.
Demski, J., & Feltham, G. (1978). Economic incentives in budgetary control systems. The Accounting
Review, 53, 336–359.
Dunk, A. S. (1993). The effect of budget emphasis and information asymmetry on the relation between
budgetary participation and slack. The Accounting Review, 68(2), 400–410.
Ehlen, C. R., & Welker, R. B. (1996). Procedural fairness in the peer and quality review programs.
Auditing: A Journal of Practice and Theory, 15(1), 38–52.
Eisenberger, R., Fasolo, P., & Davis-LaMastro, V. (1990). Perceived organizational support and
employee diligence, commitment and innovation. Journal of Applied Psychology, 75,
51–59.
Evans, J. H., Hannan, R. L., Krishnan, R., & Moser, D. V. (2001). Honesty in managerial reporting.
The Accounting Review, 76(4).
Evans, J. H., & Sridhar, S. S. (1996). Multiple control systems, accrual accounting, and earnings
management. Journal of Accounting Research, 34(1), 45–65.
Fehr, E., & Gachter, A. (2002). Do incentive contracts crowd out voluntary cooperation? Institute for
Empirical Research in Economics, Working Paper No. 34, University of Zurich.
Fehr, E., Klein, A., & Schmidt, K. M. (2001). Fairness, incentives and contractual incompleteness.
CESifo Working Paper No. 445: Center for Economic Studies, Munich.
Folger, R. (1977). Distributive and procedural justice: Combined impact of voice and improvement on
experienced inequity. Journal of Personality and Social Psychology, 35, 108–119.
Folger, R., & Cropanzano, R. (1998). Organizational justice and human resource management. Thou-
sand Oaks, CA: Sage Publications.
Greenberg, J. (1986). Determinants of perceived fairness of performance evaluations. Journal of Applied
Psychology, 71(2), 340–342.
Greenberg, J. (1987). Reactions to procedural injustice in payment distributions: Do the means justify
the ends? Journal of Applied Psychology, 72(1), 55–61.
Hopwood, A. G. (1972). An empirical study of the role of accounting data in performance evaluation.
Journal of Accounting Research, 10, 156–182.
Hunton, J. (1996). Involving information system users in defining system requirements: The influence
of procedural justice perceptions on user attitudes and performance. Decision Sciences, 27(4),
647–671.
Hunton, J., & Beeler, J. D. (1997). Effects of user participation in systems development: A longitudinal
field experiment. MIS Quarterly, 21(4), 359–388.
Hunton, J., & Gibson, D. (1999). Soliciting user-input during the development of an accounting infor-
mation system: Investigating the efficacy of group discussion. Accounting, Organizations and
Society, 24, 597–618.
Jennergren, L. P. (1980). On the design of incentives in Soviet firms: A survey of some research.
Management Science (Feb), 193–197.
Jensen, M. C. (2001). Corporate budgeting is broken – Let’s fix it. Harvard Business Review, 79(10),
94–101.
Kim, W. C., & Mauborgne, R. A. (1993). Procedural justice, attitudes, and subsidiary top-management
compliance with multinationals’ corporate strategic decisions. Academy of Management
Journal, 36(3), 502–526.
Kirby, A. J. (1992). Incentive compensation schemes: Experimental calibration of the rationality
hypothesis. Contemporary Accounting Research, 8, 374–408.
Kirby, A. J., Reichelstein, S., Sen, P. K., & Paik, T. (1991). Participation, slack, and budget-based
performance evaluation. Journal of Accounting Research, 29, 109–128.
Kren, L. (1992). Budgetary participation and managerial performance: The impact of information and
environmental volatility. The Accounting Review, 67(3), 511–526.
Kren, L., & Liao, W. M. (1988). The role of accounting information in the control of organizations: A
review of the evidence. Journal of Accounting Literature, 7, 280–309.
Lawrence, P. R., & Lorsch, J. W. (1967). Organization and environment: Managing differen-
tiation and integration. Boston: Graduate School of Business Administration, Harvard
University.
Leventhal, G. S. (1980). What should be done with equity theory? In: K. J. Gergen, M. S. Greenberg
& R. H. Willis (Eds), Social Exchange: Advances in Theory and Research. NY: Plenum Press.
Libby, T. (1999). The influence of voice and explanation on performance in a participative budgeting
setting. Accounting, Organizations and Society, 24(2), 125–138.
Lind, E. A., Kanfer, R., & Earley, P. C. (1990). Voice, control and procedural justice: Instrumental and
non-instrumental concerns in fairness judgments. Journal of Personality and Social Psychology,
59(5), 952–959.
Lind, E. A., & Tyler, T. R. (1988). The social psychology of procedural justice. NY: Plenum.
Lindquist, T. M. (1995). Fairness as an antecedent to participative budgeting: Examining the effects of
distributive justice, procedural justice and referent cognitions on satisfaction and performance.
Journal of Management Accounting Research, 7, 122–147.
Loeb, M., & Magat, W. (1978). Soviet success indicators and the evaluation of divisional management.
Journal of Accounting Research (Spring), 103–121.
Luft, J. (1994). Bonus and penalty incentives: Contract choice by employees. Journal of Accounting
and Economics, 18, 181–206.
Magner, N., & Welker, R. B. (1994). Responsibility center managers’ reactions to justice in budgetary
resource allocation. Advances in Management Accounting (Vol. 3, pp. 237–253). Greenwich,
CT: JAI Press.
Melumad, N. D., & Reichelstein, S. (1989). Value of communication in agencies. Journal of Economic
Theory, 47, 334–368.
Merchant, K. A. (1985). Budgeting and the propensity to create budgetary slack. Accounting, Organi-
zations and Society, 10(2), 201–210.
Moorman, R. H., Blakely, G. L., & Niehoff, B. P. (1998). Does perceived organizational support mediate
the relationship between procedural justice and organizational citizenship behavior? Academy
of Management Journal, 41, 351–368.
Moser, D. V., Evans, J. H., III, & Kim, C. K. (1995). The effects of horizontal and exchange inequity
on tax reporting decisions. The Accounting Review, 70(4), 619–634.
Murphy, K. J. (2000). Performance standards in incentive contracts. Journal of Accounting and Eco-
nomics, 30(3), 245–278.
Namazi, M. (1985). Theoretical developments of principal-agent employment contract in accounting:
The state of the art. Journal of Accounting Literature, 4, 113–163.
Naumann, S. E., & Bennett, N. (2000). A case for procedural justice climate: Development and test of
a multilevel model. Academy of Management Journal, 43(5), 881–889.
Otley, D. T. (1985). The accuracy of budgetary estimates: Some statistical evidence. Journal of Business
Finance and Accounting, 12(3), 415–425.
Reichelstein, S. (1992). Constructing incentive schemes for government contracts: An application of
agency theory. The Accounting Review, 67, 712–731.
Reichelstein, S., & Osband, K. (1984). Incentives in government contracts. Journal of Public Eco-
nomics, 24, 257–270.
Rousseau, D. M., & Parks, J. M. (1993). The contracts of individuals and organizations. In: L. L.
Cummings & B. M. Staw (Eds), Research in Organizational Behavior (Vol. 15). JAI Press.
Thompson, J. D. (1967). Organizations in action. New York: McGraw-Hill.
Tushman, M. L., & Nadler, D. A. (1978). Information processing as an integrating concept in organi-
zational design. Academy of Management Review, 3(3), 613.
Tyler, T. R. (1989). The quality of dispute resolution processes and outcomes: Measurement problems
and possibilities. Denver University Law Review, 66, 419–436.
Tyler, T. R., & Lind, E. A. (1992). A relational model of authority in groups. In: L. Berkowitz (Ed.),
Advances in Experimental Social Psychology (Vol. 25, pp. 115–191). Academic Press.
Van der Stede, W. A. (2000). The relationship between two consequences of budgetary controls:
Budgetary slack creation and managerial short-term orientation. Accounting, Organizations
and Society, 25(6), 609–622.
Waller, W. S. (1988). Slack in participative budgeting: The joint effects of a truth-inducing pay scheme
and risk preferences. Accounting, Organizations and Society, 87–98.
Waller, W. S., & Chow, C. W. (1985). The self-selection and effort effects of standard-based employ-
ment contracts: A framework and some empirical evidence. The Accounting Review, 60(3),
458–476.
Walker, K. B., & Johnson, E. N. (1999). The effects of budget-based incentive compensation scheme
on the budgeting behavior of managers and subordinates. Journal of Management Accounting
Research, 11, 1–28.
Weitzman, M. (1976). The new Soviet incentive model. Bell Journal of Economics (Spring),
251–257.
Young, S. M. (1985). Participative budgeting: The effects of risk aversion and asymmetric information
on budgetary slack. Journal of Accounting Research, 23(2), 829–842.
Young, S. M., & Lewis, B. (1995). Experimental incentive contracting research in management
accounting. In: R. H. Ashton & A. H. Ashton (Eds), Judgment and Decision-making
Research in Accounting and Auditing (pp. 55–75). Cambridge, NY: Cambridge University
Press.
APPENDIX A
Sample Payments Under the Truth-inducing Contract
Cells of the table below represent the number of tickets earned under different
combinations of budgeted and actual performance. Diagonal cells were shaded
to emphasize that the maximum payments would be earned when the budgets
subjects selected were equal to their actual performance.
APPENDIX B
Fair and Unfair Contracting Process Scenarios
Fair Process:
You have learned that your supervisor has held this supervisory position for many
years. You have also noted that your supervisor appears to be very popular with
your co-workers. Your supervisor’s philosophy is that the employees of the division
are the experts when it comes to the work that they do and that much can be learned
from listening to their suggestions.
The formula that is used to calculate your earnings, as was described above, is a
relatively new innovation within this division. The form of the contract was agreed
upon approximately one year ago based on negotiations between representatives
of the employee group and management. The management negotiation group was
headed by your supervisor.
Although you have been told that the negotiation process led to a degree of tension
between your co-workers and your supervisor, your co-workers seem to be fully
supportive of the contract as it now stands. You have been told by one of your
co-workers, whose opinion you respect, that this is mainly due to strong commu-
nication between the employee and management groups during the negotiation
process.
You have also noticed that the majority of your co-workers with whom you have
talked about the negotiation process feel that the management team was sincerely
interested in their opinions about the earnings formula. Before the formula was
finalized, the management team performed an informal poll of the employees who
would be affected by it and found that the majority supported it. Whenever an issue
came up on which there was disagreement, the worker and management groups
were able to talk out their differences and come to a satisfactory solution, although
the management group also offered to allow any unresolved issues to be passed on
to an objective third-party decision maker.
Many of the employees of this division have held positions within the division
for many years. While increasing their overall pay is, of course, very important to
your co-workers, providing accurate budgets to management and increasing overall
production efficiency in order to ensure the long-term survival of the organization
also seems to be high on their list of priorities. You have heard four or five of
them say that they would have to be given a pretty large raise in pay before they
would be willing to move to a job in another division mainly due to the positive
atmosphere between employees and managers in this division.
Unfair Process:
You have learned that your supervisor has held this supervisory position for many
years. You have also noted that, although your co-workers are polite and do as the
supervisor asks, he does not seem to be very popular with them. The supervisor’s
philosophy is that employees should work hard to receive higher pay and leave
all other decisions to him. You have been told that the supervisor feels that his
long-term position as supervisor of the division makes him the best judge of how
the work should be done and he is not really interested in receiving feedback or
suggestions from the employees that he supervises.
The formula that is used to calculate your earnings, as was described above, is a
relatively new innovation within this division. The form of the contract was agreed
upon approximately one year ago based on negotiations between representatives
of the employee group and management. The management negotiation group was
headed by your supervisor.
You have been told that the negotiation process led to a great deal of tension between
your co-workers and your supervisor. Your co-workers seem to be quite bitter about
the contract as it now stands. You have been told by one of your co-workers, whose
opinion you respect, that this is mainly due to the lack of communication between
the employee and management groups during the negotiation process.
You have also noticed that the majority of your co-workers with whom you have
talked about the negotiation process feel that the management team appeared to
be completely uninterested in their opinions about the earnings formula. Before
the formula was finalized, the employee group suggested that an informal poll be
taken of employees who would be affected by it to measure their degree of support.
This suggestion was ignored by the management group. Whenever an issue came
up on which there was disagreement, the worker and manager groups found it
difficult to come to a satisfactory solution and generally, the solution was imposed
by the person in charge of the management group, who happens to have been your
supervisor.
Many of the employees of this division have held positions within the division for
only a year or two. Receiving the highest possible earnings at the end of each work
period seems to be of utmost importance to your co-workers. You have heard four
or five of them say that they view their position as only a “stepping stone” to a
better position within another division of the organization. Increasing production
efficiency and the long-term health of the division by providing accurate budgets
to management does not seem to be high on their list of priorities. A few of your
co-workers have commented that they would not have to be given a very large raise
in pay, or any raise at all, to convince them to move to a job in another division
of the organization where the atmosphere between the workers and the supervisor
was more positive.
A TOBIT ANALYSIS OF ACCOUNTING
FACULTY PUBLISHING
PRODUCTIVITY IN AUSTRALIAN AND
NEW ZEALAND UNIVERSITIES
ABSTRACT
This study examines the research behavior of Australian and New Zealand
accounting faculty to determine the characteristics that influence research
productivity. University reputations are integrally linked with research
performance and determining the qualities that predict research behavior
may be of particular value in the selection and recruitment process. The
study finds that two key factors significantly impact performance: holding
a Ph.D. and having an academe-oriented rather than profession-oriented
background. These results may be interpreted as affirming the U.S. model
of developing specialist academic researchers through doctoral education
programs rather than employing faculty with strong professional experience.
1. INTRODUCTION
Research is an integral function of any university and a key determinant of
academic reputation (Baden-Fuller et al., 2000). Primarily, a university’s research
The effectiveness or efficiency of a faculty is indeed difficult to measure, and I would not deny
the important faculty function performed by the non-writing researcher. However, I think he
is likely to be a relative “rare bird.” For the great majority of faculty members, it seems to me
that we must continue to emphasize the place of research and publication in their programs.
Only by this procedure can we hope to have accounting remain a vital and stimulating force
in business education and management.
Understanding the factors that impact the level of research (measured in publica-
tions) achieved by faculty members is very important from a university perspective.
This is particularly relevant when recruiting new and inexperienced faculty, where
the existing faculty must rely on indicators of possible future publication success
rather than on an observed publication stream. The issue is even more salient given
that “Most academics publish very little, or not at all” (Demski & Zimmerman,
2000, p. 346). However, research studies in accounting investigating factors that
indicate future publishing output levels are relatively limited (e.g. Cargile &
Bublitz, 1986; Gee & Gray, 1989; Gray & Helliar, 1994; Maranto & Streuly, 1994).
These studies suggest that various factors, such as the institutional setting of the
researcher and possessing a Ph.D., impact the level of research output. A difficulty
with conducting this form of research is that only a relatively small number of
the factors that are likely to influence publication output are “observable” (e.g.
research interests, Ph.D. qualification). Other relevant factors are likely to be more
difficult to accurately measure (e.g. ability, ambition) (Gray & Helliar, 1994).
This study examines the research behavior of Australian and New Zealand
accounting faculty to determine the characteristics that influence research
productivity. In essence, the study asks what factors will predict the desired
research behavior, namely papers published in quality academic journals. It builds
on the work of Wilkinson and Durden (1998) and Durden et al. (1999) who
measured research outputs in an attempt to enable comparisons of performances
across universities. Those studies served to provide a basis for ranking university
departments, but did not seek to explain in any comprehensive sense the observed
differences between individual faculty performances. This study develops a Tobit
model to explain publishing output behavior. The findings indicate that two key
factors contribute to publishing performance – holding a Ph.D. qualification
and having an academe-orientation and background rather than an extensive
professional background. Other indicators of publishing productivity were having
stated research interests in the financial accounting, managerial accounting and
auditing fields. This may also reflect a bias in the higher-ranked journals toward
these areas of interest. That is to say, researchers may focus their research efforts in
financial accounting, managerial accounting and auditing because the more highly
ranked journals are more open to accepting research in these areas than in newer
subdisciplines. This is consistent with Daigle and Arnold’s (2000) suggestion that
many of the accounting information systems researchers are forced to develop
and promote research interests in other subdisciplines because research in these
other areas (financial, managerial and tax accounting) is more likely to result in
the highly-ranked journal publications required for tenure purposes.
The remainder of the paper is organized as follows. Section 2 develops the
hypotheses in the context of the extant literature. Section 3 outlines the model
development and data analysis. Results are shown in Section 4 and conclusions
and limitations are discussed in Section 5.
2. HYPOTHESIS DEVELOPMENT
Based on an analysis of prior literature, several important characteristics appear to
impact research output. First, possessing a Ph.D. impacts research productivity.
Since the Ph.D. comprises by definition an intensive research preparation process,
a positive relationship likely exists between research productivity and possession
of a Ph.D. degree. Arguments about the importance of the Ph.D. are often based
on theories of human capital (Long et al., 1998; Maranto & Streuly, 1994). In this
sense the Ph.D. provides students with higher levels of intellectual capital which
should result in higher levels of research output and career success. This may exist
among graduates from a range of Ph.D. programs rather than being restricted only
to those with high status academic origins (Long et al., 1998). Other research has
also indicated an association between holding a Ph.D. and research productivity
(Gray et al., 1987; Gray & Helliar, 1994). The Australian and New Zealand
context provides an opportunity to further explore the role of the Ph.D. because
only a relatively small proportion of faculty in these two countries hold a Ph.D.
At the time this study was undertaken only 24% of faculty members were Ph.D.
qualified. H1, in the alternate form, is as follows:
176 B. R. WILKINSON, C. H. DURDEN AND K. J. WILKINSON
Table 1. Ratings Derived from Zeff (1996) for Journals in the Sample.

Journal                                                        Rating
Abacus                                                           12
Accounting and business research                                 12
Accounting and finance                                            7
Accounting, auditing and accountability journal                   8
Accounting forum                                                  2
Accounting historians journal                                    10
Accounting history                                                0
Accounting horizons                                              12
Accounting, organizations and society                            12
The Accounting review                                            12
Advances in accounting                                            7
Advances in international accounting                              6
Advances in management accounting                                 3
Asian review of accounting                                        0
Auditing: A journal of theory and practice                       10
Australian accounting review                                      3
Behavioral research in accounting                                 7
British accounting review                                        11
Contemporary accounting research                                 11
Financial accountability and management                           6
The international journal of accounting                          11
Issues in accounting education                                   11
Journal of accounting and economics                              12
Journal of accounting and public policy                          11
Journal of accounting, auditing and finance                      11
Journal of accounting education                                   6
Journal of accounting research                                   12
Journal of business finance and accounting                       12
Journal of cost management                                        9
Journal of international accounting, auditing and taxation        3
Journal of international financial management and accounting      7
Management accounting research                                    7
Pacific accounting review                                         3
Research in accounting in emerging economies                      1
where:
years employed = years employed at current institution;
financial, managerial, auditing, tax, theory, education and other = stated faculty
areas of research interest;
U.S. qualified = a dummy variable taking the value of 1 if the highest educa-
tional qualification is from a U.S. institution and zero otherwise;
Ph.D. = a dummy variable taking the value of 1 if a Ph.D., DBA or D.Phil.
qualification is held and zero otherwise;
Membership = a dummy variable taking the value of 1 if professional body
membership is held and zero otherwise;
Academe = a dummy variable taking the value of 1 if the individual has less
than 5 years of experience outside academe and zero otherwise.
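The variable definitions above can be sketched as a small coding routine. This is an illustration only: the `Faculty` record, its field names, and the assumption that each research interest area enters as a 0/1 indicator are hypothetical and are not taken from the authors' data files.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Research interest areas listed in the variable definitions.
INTEREST_AREAS = ("financial", "managerial", "auditing", "tax",
                  "theory", "education", "other")

# Qualifications counted as a doctorate for the Ph.D. dummy.
DOCTORATES = {"Ph.D.", "DBA", "D.Phil."}


@dataclass
class Faculty:
    """Hypothetical raw record for one faculty member."""
    years_employed: int            # years at current institution
    interests: Set[str]            # stated research interests
    highest_degree_country: str    # e.g. "US", "AU", "NZ"
    doctorate: Optional[str]       # e.g. "Ph.D.", or None
    professional_member: bool
    years_outside_academe: int


def design_row(f: Faculty) -> dict:
    """Code one faculty record into the regressors defined in the text."""
    row = {"years_employed": f.years_employed}
    for area in INTEREST_AREAS:
        # Assumption: each interest area is a separate 0/1 indicator.
        row[area] = 1 if area in f.interests else 0
    row["us_qualified"] = 1 if f.highest_degree_country == "US" else 0
    row["phd"] = 1 if f.doctorate in DOCTORATES else 0
    row["membership"] = 1 if f.professional_member else 0
    # Academe = 1 when experience outside academe is less than 5 years.
    row["academe"] = 1 if f.years_outside_academe < 5 else 0
    return row


example = Faculty(5, {"financial"}, "US", None, False, 2)
row = design_row(example)
```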
Since a large number of faculty members had no publications during the period
of measurement, there is a high proportion of zeros in the weighted publications
measure. Thus, while data for the independent variables is available, the data for the
dependent variable is of a censored nature. One possibility would be to estimate the
model via OLS using only those faculty for whom the dependent variable is non-
zero. However, as noted by Judge et al. (1988), this results in biased and inconsistent
estimators. A more appropriate approach is to estimate a Tobit regression model.
McDonald and Moffitt (1980, p. 318) identify the Tobit model as assuming “an
underlying, stochastic index equal to ($X_t\beta + u_t$) which is observed only when it is
positive, and hence qualifies as an unobserved, latent variable.” They express the
stochastic model as follows:
$$
Y_t = \begin{cases} X_t\beta + u_t & \text{if } X_t\beta + u_t > 0 \\ 0 & \text{if } X_t\beta + u_t \le 0 \end{cases}
$$
where: $t = 1, 2, \ldots, N$.
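To make the censoring in this model concrete, the following sketch writes out the standard Tobit log-likelihood for data left-censored at zero, with a normally distributed error term of standard deviation sigma. It is not the SAS procedure used in the study; the function names, the two-regressor design (intercept plus a Ph.D. dummy), and the data values are hypothetical.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def tobit_loglik(beta, sigma, X, y) -> float:
    """Log-likelihood of a Tobit model censored from below at zero."""
    total = 0.0
    for x_t, y_t in zip(X, y):
        index = sum(b * x for b, x in zip(beta, x_t))  # X_t * beta
        if y_t > 0:
            # Uncensored observation: normal density of the residual.
            total += math.log(norm_pdf((y_t - index) / sigma) / sigma)
        else:
            # Censored observation: probability the latent index is <= 0.
            total += math.log(norm_cdf(-index / sigma))
    return total


# Hypothetical data: columns are (intercept, Ph.D. dummy);
# y is a weighted publication score with zeros for non-publishers.
X = [(1, 0), (1, 0), (1, 1), (1, 1)]
y = [0.0, 3.0, 12.0, 0.0]

ll_positive_phd = tobit_loglik((1.0, 4.0), 5.0, X, y)
ll_negative_phd = tobit_loglik((1.0, -4.0), 5.0, X, y)
```

Estimation amounts to choosing beta and sigma to maximize this function. In the toy data, the parameter vector with a positive Ph.D. coefficient yields the higher log-likelihood of the two candidates.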
A Tobit model is estimated via the SAS LIFEREG procedure, using the normal
probability distribution for the error term. As noted by McDonald and Moffitt
(1980), the estimated regression parameters cannot be interpreted in the usual
sense. They will, however, enable us to ascertain the independent variables that
significantly impact publishing performance. As noted later in the paper, the Tobit
this study calls into question the extent to which such individuals will be likely
to achieve quality research outputs, a critical determinant of a university’s
reputation.
H2 (years of employment at current institution) and H3 (membership of
professional body) were not supported. The failure of professional membership
to explain productivity may be related to the fact that a high number of faculty
hold such membership. Faculty may derive significant benefits from such
membership (for example, insurance benefits) such that even faculty with a low
professional interest may maintain membership. The insignificance of years of
employment is surprising but may indicate that faculty with a strong research
interest maintain that interest over time and may derive sufficient reward within
their own institutions (Gray & Helliar, 1994).
Also of note were the significant coefficients for faculty research interests in
financial accounting, managerial accounting and auditing. This may reflect a bias
in the higher-ranked journals toward these areas of interest (Hasselback et al.,
2000). Some concerns could be raised with respect to the poor performance of
faculty with stated tax research interests. Here, there is a negative, though not
significant, relationship between an expressed interest in taxation and weighted
publications. This may reflect the tendency, particularly in Australia, for tax
faculty to be concentrated in law/business law disciplines. Tax publishing has
accordingly trended toward legal based research rather than empirical accounting
research. This is consistent with comments by Schulman et al. (1996) concerning
the low level of empirical research into the policy implications of tax integration,
a reform that has been implemented in Australia, New Zealand, Canada and the
U.K., along with a range of other countries outside the U.S. The holding of U.S.
qualifications was also non-significant. This may be the result of the low levels of
individuals holding such qualifications (30 out of 716 faculty).7
As noted earlier, the Tobit model parameters cannot be interpreted in the
same manner as those derived from ordinary least squares. However, the Tobit
model can be used to estimate the probability that an individual with a given
set of characteristics will publish at a certain level. In fact, an entire probability
distribution can be developed for an individual with a given set of characteristics.
For example, consider an individual with 5 years of employment at their current
institution who has a stated interest in financial accounting, is not a member
of a professional organization, has less than five years’ experience
outside the academic environment, and is U.S. qualified with no Ph.D. For such an
individual, the probability distribution shown in Table 5, row 1 would arise.8 If, by way of contrast, an
equivalent individual with respect to the stated characteristics is considered but
who also possesses a Ph.D., the probability distribution shown in Table 5, row
2 arises. Thus, the model predicts a higher probability of increased publishing
performance across the board, and a much reduced probability of having no
publications for an individual with a Ph.D. relative to one without.
NOTES
1. Employment background was coded as “professional” for individuals with 5 years or
more experience in a non-academic role, and as “academic” for those with less than 5 years’
experience outside academe.
2. This assessment ignores migration of U.S. citizens already holding Ph.D.
qualifications to Australia and New Zealand, about which no a priori belief is held. Further, the
study uses “highest qualification from U.S.” rather than Ph.D. specifically. A subsequent
test using only U.S. Ph.D. qualification resulted in no qualitative differences.
3. The directory also included part-time doctoral teaching assistants and assistant
lecturers, neither of whom would be considered permanent faculty; both groups were excluded
accordingly. Tutors were also excluded on the basis that their role is explicitly teaching
based, and on the grounds that they also tend not to be regarded as permanent faculty.
4. Although the directory is primarily accounting specific, it does include some non-
accounting faculty. Where possible, such faculty were identified and eliminated based on
qualifications, teaching responsibilities and research interests. It is possible, however, that
in some instances non-accounting faculty may not have been identifiable as such and hence
were included. For example, finance faculty listed in the directory that held professional
accounting memberships might not have been clearly distinguishable from accounting
faculty. It is likely, however, that most departments registered only accounting faculty
in the directory and that most non-accounting faculty that were included were identified
and deleted.
5. Limited other deletions were made including the deletion of a dean. Details can be
found in Wilkinson and Durden (1998) and in Durden et al. (1999).
6. Only published articles were included. Hence, published book reports and monographs
were excluded from the study.
7. As a further check, this was restricted to U.S. Ph.D. qualifications. The estimated
coefficient was negative but not significant and there was no qualitative change in the other
estimated coefficients.
8. Probabilities are calculated as follows: $P(\text{publications} \le WP) = P(Z \le (WP - X_t\beta)/\sigma)$.
For example, the probability that the individual in Table 5 without a Ph.D. will
have zero publications is:
$P(\text{publications} = 0) = P(Z < (0 - (-1.66455 + 0.00279 \times 5\,(\text{YEARS})$
$+ 0.54979\,(\text{FINANCIAL}) + 0.11835\,(\text{U.S. QUALIFIED})$
$+ 0.54851\,(\text{ACADEMIC})))/1.15438)$, or $P(Z < 0.376) = 0.647$.
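The arithmetic in note 8 can be checked with a few lines of code. The sketch below is illustrative only: the coefficients and the scale parameter are those quoted in the note, while the names `normal_cdf`, `LATENT_INDEX`, and `prob_at_most` are introduced here for convenience and are not part of the original analysis.

```python
import math

def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Coefficients and scale as quoted in note 8.
INTERCEPT = -1.66455
SIGMA = 1.15438
LATENT_INDEX = (INTERCEPT
                + 0.00279 * 5   # YEARS = 5
                + 0.54979       # FINANCIAL = 1
                + 0.11835       # U.S. QUALIFIED = 1
                + 0.54851)      # ACADEMIC = 1

def prob_at_most(wp, index=LATENT_INDEX, sigma=SIGMA):
    """P(publications <= wp) = P(Z <= (wp - X_t*beta) / sigma)."""
    return normal_cdf((wp - index) / sigma)

z0 = (0 - LATENT_INDEX) / SIGMA
print(round(z0, 3), round(prob_at_most(0), 3))
```

Evaluating `prob_at_most` at successive publication levels traces out a cumulative distribution for this individual; the values for other profiles in Table 5 would require coefficients that are not reproduced in the note.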
ACKNOWLEDGMENTS
The authors wish to thank Peter Westfall for his assistance with the methodological
development. We also thank the editor, Vicky Arnold, and an anonymous reviewer
for helpful comments and suggestions in revising the paper.
REFERENCES
Abdolmohammadi, M. J., Menon, K., Oliver, T. W., & Umapathy, S. (1985). The role of the doctoral
dissertation in accounting research careers. Issues in Accounting Education, 3, 59–76.
Baden-Fuller, C., Ravazzolo, F., & Schweizer, T. (2000). Making and measuring reputations: The
research rankings of European business schools. Long Range Planning, 33(5), 621–650.
Bairam, E. I. (1996). Research productivity in New Zealand university economics departments,
1988–1995. New Zealand Economics Papers, 30, 229–241.
Beresford, D. R. (2001). Guest editorial: If I could do it over again . . .. The CPA Journal, 71(7), 80.
Blaxter, L., Hughes, C., & Tight, M. (1998). Writing on academic careers. Studies in Higher Education,
23(3), 281–295.
Brinn, T., Jones, M. J., & Pendlebury, M. (1996). U.K. accountants’ perceptions of research journal
quality. Accounting and Business Research, 26(3), 265–278.
Brown, L. D., & Huefner, R. J. (1994). The familiarity with and perceived quality of accounting
journals: Views of senior accounting faculty in leading U.S. MBA programs. Contemporary
Accounting Research, 11(1), 223–250.
Cargile, B. R., & Bublitz, B. (1986). Factors contributing to published research by accounting faculties.
The Accounting Review, 61(1), 158–178.
Daigle, R., & Arnold, V. (2000). An analysis of the research productivity of AIS faculty. International
Journal of Accounting Information Systems, 1, 106–122.
Davidson, S. (1957). Research and publication by the accounting faculty. The Accounting Review,
32(1), 114–118.
Demski, J. S., & Zimmerman, J. L. (2000). On Research vs. Teaching: A long-term perspective.
Accounting Horizons, 14(4), 343–352.
186 B. R. WILKINSON, C. H. DURDEN AND K. J. WILKINSON
Doyle, J. R., & Arthurs, A. J. (1995). Judging the quality of research in business schools: The U.K. as
a case study. Omega International Journal of Management Science, 23(3), 257–270.
Durden, C. H., Wilkinson, B. R., & Wilkinson, K. J. (1999). Publishing productivity of Australian
accounting ‘units’ based on current faculty composition. Pacific Accounting Review, 11(1),
1–27.
Englebrecht, T. D., Govind, S. I., & Patterson, D. M. (1994). An empirical investigation of the
publication productivity of promoted accounting faculty. Accounting Horizons, 8(1), 45–68.
Gee, K. P., & Gray, R. H. (1989). Consistency and stability of U.K. academic publication output criteria
in accounting. British Accounting Review, 21(1), 43–54.
Gray, R. H., Haslam, J., & Prodham, B. K. (1987). Academic departments of accounting in the U.K.:
A note on publication output. British Accounting Review, 19(1), 53–71.
Gray, R., & Helliar, C. (1994). U.K. accounting academics and publication: An exploration of
observable variables associated with publication output. British Accounting Review, 26(3),
235–254.
Hasselback, J. R., Reinstein, A., & Schwan, E. S. (2000). Benchmarks for evaluating the research
productivity of accounting faculty. Journal of Accounting Education, 18(2), 79–97.
Hull, R. P., & Wright, G. B. (1990). Faculty perceptions of journal quality: An update. Accounting
Horizons, 4(1), 77–97.
Imhoff, E. A. (1988). Planning academic accounting careers. Issues in Accounting Education, 3(2),
286–301.
Judge, G. G., Hill, R. C., Griffiths, W. E., Lutkepohl, H., & Lee, T.-C. (1988). Introduction to the theory
and practice of econometrics (2nd ed.). New York: Wiley.
Long, R. G., Bowers, W. P., Barnett, T., & White, M. C. (1998). Research productivity of graduates
in management: Effects of academic origin and academic affiliation. Academy of Management
Journal, 41(6), 704–714.
Maranto, C. L., & Streuly, C. A. (1994). The determinants of accounting professors’ publishing
productivity – The early career. Contemporary Accounting Research, 10(2), 387–407.
Mautz, R. K. (1988). Editorial: Fifty years of accounting. Accounting Horizons, 2(1), 126–129.
McDonald, J. R., & Moffitt, R. A. (1980). The uses of Tobit analysis. Review of Economics and
Statistics, 62(2), 318–321.
Meyer, M. J., & Titard, P. L. (2000). Those who can . . . teach. Journal of Accountancy, 190(1), 49–58.
Newell, G., Langsam, S., & Kreuze, J. (1996). Accounting faculty profiles: Demographics and
perceptions of academia. Journal of Education for Business, 72(2), 87–94.
Nobes, C. W. (1985). International variations in perceptions of accounting journals. The Accounting
Review, 60(4), 702–705.
Otley, D. (2002). British research in accounting and finance (1996–2000): The 2001 research
assessment exercise. British Accounting Review, 34(4), 387–417.
Schulman, C. G., Thomas, D. W., Sellers, K. F., & Kennedy, D. B. (1996). Effects of tax integration
and capital gains tax on corporate leverage. National Tax Journal, 46(1), 31–54.
Wiley, J. (1998). Jacaranda Wiley directory of accounting: 1998–1999. Brisbane, Australia: Jacaranda
Wiley.
Wilkinson, B. R., & Durden, C. H. (1998). A study of accounting faculty publishing productivity in
New Zealand. Pacific Accounting Review, 10(2), 75–95.
Zeff, S. A. (1996). A study of academic research journals in accounting. Accounting Horizons, 10(3),
158–177.
Zivney, T. L., Bertin, W. J., & Gavin, T. A. (1995). A comprehensive examination of faculty publishing.
Issues in Accounting Education, 10(1), 1–25.
CLASSIFICATION OF CUSTOMIZED
ASSURANCE SERVICES BY DECISION
MAKERS: THE CASE OF SysTrust™
Philip R. Beaulieu
ABSTRACT
When decision makers encounter new assurance services that can be
customized for individual clients, they must include them in their pre-existing
categorization of assurance, a cognitive task known as postclassification.
This paper draws upon three literatures (classification research in
accounting, theory of assurance, and cognitive psychology) in order to suggest how
this task might be modeled and studied empirically, using the example of
SysTrust™. The role of a necessary condition for successful postclassification
called the category use effect (Ross, 2000), in which decision makers are
reminded of pre-existing categories when they learn to use new categories,
is explained.
1. INTRODUCTION
New forms of assurance1 provided by public accountants have proliferated in the
last decade due to both supply and demand factors. On the supply side, public
accounting firms have sought to generate revenue in growth areas of assurance and
related consulting activities because growth opportunities in the mature market
for traditional financial statement assurance are limited. Demand for innovation in
assurance stems partly from technological innovation, which has led to concerns
a going-concern firm. Recall was also the dependent variable used by Moeckel
(1990) and Libby and Trotman (1993); see Libby (1995) for a review of research
in auditing involving knowledge structures and memory.
Memory models and recall have been featured in research concerning external
users of financial statements, although less has been done in this area than in
auditing. Beaulieu (1996) posited that commercial loan officers use a classification
system based on the Five Cs of Credit (character, capacity, capital, conditions
and collateral), a classification system used to teach loan officers to process
information and make loan decisions. Greater recall of decision-consistent character
and accounting (capacity and capital) information than decision-inconsistent
information provided evidence that the classification system resided in long-term
memory and biased recall in favor of decision-consistent information. Another
example of research involving users of financial statements is Kida et al. (1998),
who proposed that managers making stock investment and financial difficulty
decisions encode (classify) accounting information according to affect, a positive
or negative response to numerical data. Recall and decision results supported the
existence of an affect-based classification system in long-term memory. A fair
question to ask is whether auditing and accounting classification systems really
exist in the minds of auditors and financial statement users, psychologically and
neurologically, or whether they exist only as conventions that are convenient for
research purposes. Cohen (2000, p. 2) proposed two types of evidential
requirements – logical and behavioral – for hierarchical classification systems, in which
“lower level items inherit the properties of higher level items.” Logical evidence
requires a convincing argument that a hierarchy is more efficient than alternative
methods of organizing and accessing knowledge. The argument for an efficient
system asserts that it enables economical storage and access to information,
and that “representation of factual knowledge at different levels of generality
facilitates the identification of useful analogies” (p. 5). Behavioral evidence
consists of experiments in which different hierarchical levels are presented,
causing effects in response times, error rates, and quality of responses.3 Bonner
et al. (1997) illustrate how these two criteria can be used to evaluate potential
classification systems.
Bonner et al. (1997) studied how accounting students learn to estimate the
frequency of financial statement errors. Subjects in their experiment were taught
either: (1) the relationship between financial statement errors and three categories
of transaction cycles: sales and receipts, inventory/purchases, and investments;
or (2) the relationship between errors and three categories of audit objectives:
proper cutoff, validity, and valuation. Subjects then observed a sequence of errors
and finally were asked for frequency estimates. The first hypothesis of Bonner
et al. (p. 391) was:
Subjects receiving transaction cycle (audit objective) category instruction prior to experiencing
frequencies will make frequency estimates which more closely reflect experienced error
frequencies for transaction cycle (audit objective) categories than for audit objective (transaction
cycle) categories.
The term assurance services was not part of the auditing lexicon prior to the
1990s. For example, the classic book on the philosophy of auditing by Mautz and
Sharaf (1961) does not mention levels or categories of audit services in any of its
eight postulates of auditing or five primary concepts of auditing (evidence, due
audit care, fair presentation, independence and ethical conduct), let alone mention
assurance services. The term appeared in auditing textbooks after the AICPA
definition in 1996 – a year later in the case of Arens and Loebbecke (1997).
Around that time, audit partners began calling themselves assurance partners.
The meaning of new concepts is adjusted by usage until a generally accepted
meaning is established. The most relevant examples for the purposes of this paper
are the concepts of review and compilation services defined in 1978 by the AICPA.
Statement on Standards for Accounting and Review Services (SSARS) No. 1 stated
that in a review engagement, the CPA’s report would indicate “limited assurance,”
or negative assurance, that nothing came to the attention of the CPA indicating a
material misstatement (Kinney, 2000). A compilation was defined as providing no
opinion and no assurance regarding departures from GAAP, although the CPA is
still associated with the financial statements and has some responsibility (Kinney,
2000). Research regarding the financial statement users most affected by this
classification system, commercial lenders, has provided mixed evidence on their
understanding and use of reviews and compilations. Bandyopadhyay and Francis
(1995) found that loan officers’ interest rate recommendations and loan decisions
were affected by the level of attestation (including audit, review, and compilation).
Martin et al. (1988) reported that lenders do not generally differentiate between
audits and reviews, but their acceptance of compilations depends on a number of
factors, including the level of owners’ equity and term of the loan. Johnson et al.
(1983) found that level of attestation (audit, review, compilation, and no
attestation) did not affect loan decisions; Wright and Davidson (2000) similarly found no
effect on loan risk assessments.
In the United States, a gap between users’ and practitioners’ expectations of
audits led to the adoption of many Statements on Auditing Standards (SAS),
including SAS Nos 52–60, as well as SAS No. 82 on consideration of fraud in
a financial statement audit. Thus, in addition to the research conducted between
1983 and 2000 on financial statement users’ perceptions of audit, review, and
compilation services, other papers addressed the expectations gap related solely to
audit-level attestation. Some of this research suggests that expectations gap
standards might effectively narrow the gap (e.g. Bamber & Stratton, 1997; Campbell &
Mutchler, 1988; Jennings et al., 1993; Kinney & Nelson, 1996). However, a paper
by Houston and Taylor (1999) on WebTrust indicated that users of that assurance
service incorrectly inferred that additional assurance regarding product quality
was provided.
Although the research cited in the preceding paragraphs offers the hope that
users can be educated in order to calibrate their expectations of assurance services
consistently with practitioners, it also discourages the assumption that decision
makers have any particular classification system in mind. To be conservative, this
paper will assume nothing about the classification hierarchies that decision makers
might have adopted since 1996 to accommodate customized assurance. Instead,
two theoretical classification systems, those of the AICPA (2000) and Kinney (2000), will
be examined for their potential in assisting decision makers to classify customized
assurance efficiently.
In addition to defining assurance services in terms of improvements to the
quality and context of information, the Special Committee (AICPA, 2000)
related them to attestation and consulting services in a framework of categories.
Attestation is a subcategory of assurance with detailed standards, whereas there is
some overlap between the categories of assurance and consulting activities. The
primary distinction between assurance and consulting is the goal of the service;
assurance improves decision-makers’ output indirectly, through provision of better
information, whereas consulting aims to aid decision makers directly through
research and findings. The AICPA’s positioning of the assurance, attestation, and
management consulting categories is shown in Fig. 1. Essential features of these
categories are described in Table 1; the hierarchical relationship between
attestation and assurance is evident in the table. For example, the objective of assurance
is better decision making, which subsumes the narrower objective of attestation,
reliable information. The level of assurance is defined as examination, review, or
agreed-upon procedures in the attestation category, but the assurance category is
flexible with regard to levels, which may range from explicit assurance about the
usefulness of information for specific purposes to implicit assurance resulting from
CPA involvement.
The test of logical evidence advocated by Cohen (2000) requires that the
hierarchical system in Fig. 1 be more efficient than alternative classification
systems in terms of information storage and access, and identification of useful
analogies. The system is economical in that there are just seven categories at
three levels; a hierarchy that could accommodate the complexity and variety of
assurance services in fewer categories is hard to conceive. The attestation category
is parsimonious because when decision makers encounter a service that they
expect is attestation, they only have to consider coding it as audit examination,
review, or agreed-upon procedures. The system might help decision makers think
Fig. 2. Information Quality Assurance Services (Adapted from Kinney, 2000, p. 12).
Source: This figure is reproduced from Information Quality Assurance and Internal Control for
Management Decision Making (2000, Irwin/McGraw-Hill) by W. Kinney and is reproduced with
permission of The McGraw-Hill Companies.
to their individual circumstances, we proceed in the next section to show how they
could be revised to include SysTrust™, an example of customized assurance.
The four principles used to judge whether a system is reliable, mentioned in the
above quotation, are defined as follows (AICPA/CICA, 2000, pp. 11–13).
Availability: The system is available for operation and use at times set forth in service-level
statements or agreements.
Security: The system is protected against unauthorized physical and logical access. This
principle also addresses privacy concerns related to use of confidential information.
Maintainability: The system can be updated when required in a manner that continues to provide
for system availability, security, and integrity.
The examples that follow this quote are reporting on selected SysTrust™ prin-
ciples, engagements for systems in the preimplementation phase, agreed-upon
procedures engagements, and consulting engagements (review level assurance is
not allowed). Thus, Exposure Draft Version 2.0 enables practitioners to customize
SysTrust™ assurance in several ways to meet specific client needs, but these
adjustments require a great deal of diligence on the part of decision makers to
understand. First, management defines the boundaries of the system in question
in a System Description attached to the management assertion regarding the
effectiveness of its controls, which in turn is attached to the assurance report.
Management can choose to define a system in any way it sees fit; the system
might be narrowly defined, as in the case of a data center, or broadly defined, as
in the case of an outsourced finance and accounting function or ERP system.
A second significant aspect of customization is that Version 2.0 allows reporting
on any one of the four SysTrust™ principles. Thus, an engagement could address
only the integrity principle, and provide no assurance regarding availability,
security, or maintainability. The accountant’s report would list all four principles
and state that integrity is the sole principle covered, but it would be left to
decision makers to search for a definition of the integrity principle. As defined
by SysTrust™, integrity consists of complete, accurate, timely, and authorized
processing, but the auditor’s report refers the user to the AICPA (or CICA) Web
site for the definition; it does not appear in the report itself.
Customization under SysTrust™ (Version 2.0) extends even further than the
definition of system boundaries and reporting on selected principles. There can
also be engagements for systems in the pre-implementation phase, i.e. systems that
have not yet been placed in operation. Here, the practitioner tests the suitability
of the design of controls at a point in time, rather than the operating effectiveness
of controls for a period of time, as is the case for other SysTrust™ reports. For
pre-implementation phase engagements, the system description attached to the
practitioner’s report would require additional detail, such as the version of the
system and “other appropriate identifiers” (AICPA/CICA, 2000, p. 20).
There are few limits to customization of assurance under the proposed
SysTrust™, making it relatively difficult to perceive as a single product. However,
it has been trademarked and servicemarked in the United States and Canada, and
the brand appears in independent accountants’ or auditors’ reports, as in the phrase
“SysTrust™ Principles and Criteria.” SysTrust™ users have some alternatives
when they consider how to integrate it into their existing conceptual frameworks,
a process called postclassification in the cognitive psychology literature (Ross,
1999). They range from creating a single category for SysTrust™, with features
of all customized options attached, to creating many SysTrust™ categories under
pre-existing assurance categories with features matching customization. These
choices are considered below in three possible postclassifications, two using the
AICPA’s classification system and one based on Kinney’s (2000) hierarchy.
Figure 3 revises Fig. 1 (the AICPA system) by including a category for
SysTrust™ that spans the attestation and management consulting categories, and
the attestation subcategories of audit and agreed-upon procedures (excluding
review), as defined by Exposure Draft Version 2.0. It might be a challenge for
decision makers to add the SysTrust™ category because it intersects different
levels and types of assurance, but at least there are concrete reference points
(audit, agreed-upon procedures, and consulting) in the classification system. This
postclassification is also likely to foster the brand-name awareness of SysTrust™
among decision makers that the AICPA desires by creating a single category for it.
A difficulty with this postclassification is that the subject matter and customization
features of SysTrust™, such as reporting on selected system reliability principles,
are not primary identifiers of the category.
An alternative postclassification to that shown in Fig. 3 would be to create three
separate SysTrust™ subcategories for audit examination, agreed-upon procedures,
and management consulting, as shown in Fig. 4. This option allows decision
makers to compare the subject matter of SysTrust™ with other forms of assurance,
matched according to level of assurance. For example, within the category of
audit examination, the categories of financial statements and SysTrust™ explicitly
recognize that assertions regarding financial information and systems are involved.
However, breaking SysTrust™ down into three subcategories of other concepts
might sacrifice brand recognition among decision makers, and since SysTrust™
is distributed among several subcategories, increase the cognitive effort required
to classify each new SysTrust™ engagement. Using the single-category approach
(Fig. 3), more effort is likely expended initially in identifying the breadth of the
category, but less effort might be needed to store and access new information
once postclassification is complete.
Postclassification of SysTrust™ according to the Kinney (2000) system is
pictured in Fig. 5, which is restricted to the reliability improvement category,
the relevant portion of Fig. 2. SysTrust™ would be excluded from the category
of audits of financial statements and would constitute a subcategory of “audits”
of internal control quality, business processes, etc. The meaning of “audits” in
quotations would necessarily expand to include both true audits and quasi-audits.
There is less emphasis on levels of assurance at the top of the Kinney hierarchy
than in the AICPA’s classification system, so decision makers would be required
to recognize them at a lower point in the hierarchy, perhaps constructing
subcategories of SysTrust™ for audit, agreed-upon procedures and consulting (not
shown in Fig. 5). Kinney’s system is similar to the single-category approach based
on the AICPA’s system in that there is a relatively high initial postclassification
cost in creating a comprehensive category having many features. The cost may
be even greater under Kinney’s system because analogs of financial statement
assurance levels are further removed from SysTrust™. The advantage of Kinney’s
classification system is that decision makers could quickly classify SysTrust™ as
an assurance service that improves reliability of business processes (systems).
task separately for each disease. A symptom would be scored correct if it did
diagnose the disease, and incorrect if it indicated the other disease or was one of
the four symptoms that did not diagnose either disease. In the condition where
classification by disease was required in the learning of use phase, the ratio of
correct to incorrect relevant-use symptoms listed (0.80) was significantly higher
than the corresponding ratio for irrelevant-use symptoms (0.58). In the condition
where classification was not required during learning of use, the ratios of correct to
incorrect symptoms were lower and did not differ significantly between relevant-use
(ratio = 0.40) and irrelevant-use (ratio = 0.38) symptoms. This result indicates a
category use effect; using the disease categories while learning subclassifications
of the system – relevant-use versus irrelevant-use symptoms – improved the ability
of subjects to list symptoms in general, but particularly symptoms critical to the
subclassification being learned. The critical condition for the category use effect to
occur, identified in this experiment, is that the original categories must be activated
so that they can be revised. In plain language, people must be reminded of original
categories while they learn to use new, related categories in order for their use of
the original categories to be changed in the correct or intended manner.
In four other experiments, Ross (2000) ruled out alternative explanations of
the category use effect and found that it applied to a reverse-order task, in which
subjects were given one or two symptoms and asked to name the disease most
likely for a patient with the symptom(s). In other research, Ross found that the
category use effect applies to a problem-solving task in which formulas must be
learned (Ross, 1999). In short, the effect is robust across variables, measures, and
tasks, although the experiment described above is most relevant to the task of
learning features of customized assurance reports.
Applied to assurance services, the category use effect requires that decision
makers be reminded of initial categories of assurance as they encounter new
services, including SysTrust™. The AICPA assumes that initial categories will be
related to the CPA brand name in some fashion (refer to the quote in Section 3).
In the AICPA’s initial classification system (Fig. 1), the relevant categories are
attestation, including the subcategories of audit examination and agreed-upon
procedures, and management consulting. Presumably, this reminder would heighten
awareness among decision makers of the customization inherent in SysTrust™
with regard to level of attestation, regardless of whether postclassification involved
single (Fig. 3) or multiple (Fig. 4) categories for SysTrust™. If decision makers
were taught Kinney’s (2000) classification system initially (Fig. 2), the one
essential category would be “audits” of internal control quality, business process,
etc., because SysTrust™ is entirely contained in that category. However, reminders
about three higher levels of the hierarchy – audits of financial statements, reliability
improvement, and relevance improvement – may possibly help decision makers
Classification of Customized Assurance Services by Decision Makers 209
6. RESEARCH IMPLICATIONS
Behavioral evidence supporting the psychological existence of any classification
system will most likely be found in controlled experiments similar to those of Ross (2000).
More importantly, if the category use effect is studied, then this methodology must
be used. This section begins with a detailed explanation of one possible experiment,
then considers variations in the design.
An experiment very similar to the one by Ross (2000), summarized in Table 2,
would involve classification of assurance engagements instead of diseases. The
initial classification learning task would consist of learning relevant features of
an assurance classification system as applied to traditional financial statement
assurance, such as the level of assurance implied by each category. An associated
characteristic would be the user’s risk level, the risk that “an assertion accompa-
nied by a favorable attest report is materially misstated” (Kinney, 2000, p. 270).
Risk level might be rated on a four-point scale including low, medium, high, and
210 PHILIP R. BEAULIEU
very high. In the case of the AICPA’s system (Fig. 1), audit examination would be
labeled as low risk, review as low to moderate, and agreed-upon procedures as low
to very high (Kinney, 2000). After being taught the classification system, subjects
would be given a series of two-part engagement descriptions (corresponding to
patients in Ross, 2000), the first part containing a description of the firm, its indus-
try, its general financial position and performance, and the second part consisting
of an independent accountant’s report. Firms would be described as belonging to
different industries, and their financial condition would vary, so that there would
be some uncertainty regarding the risk of using the accounting information for an
investment decision. Subjects would be asked to make an investment decision and
rate user’s risk for each firm until a criterion level of agreement with the classifica-
tion system’s ratings was achieved, similar to criterion achievement in diagnosis
in Ross (2000).
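The criterion check in this learning task can be sketched in code. The following is a minimal illustration only, not part of the proposed instrument: the ordinal coding of the four-point scale, the treatment of "moderate" as equivalent to "medium", and the 0.9 agreement threshold are all assumptions introduced here, and the function names are hypothetical.

```python
# Ordinal coding of the four-point user's-risk scale (this coding is an assumption).
RISK_SCALE = {"low": 1, "medium": 2, "high": 3, "very high": 4}

# Intended risk ranges under the AICPA system (Fig. 1), as described in the text.
# "Moderate" in the text is treated here as "medium" (an assumption).
INTENDED_RANGE = {
    "audit examination": ("low", "low"),
    "review": ("low", "medium"),
    "agreed-upon procedures": ("low", "very high"),
}


def agrees(engagement_type, rating):
    """Return True if a subject's rating falls inside the intended range."""
    lowest, highest = INTENDED_RANGE[engagement_type]
    return RISK_SCALE[lowest] <= RISK_SCALE[rating] <= RISK_SCALE[highest]


def criterion_reached(responses, threshold=0.9):
    """Check a block of (engagement_type, rating) responses against a criterion.

    The threshold of 0.9 is hypothetical; the text does not specify a level.
    """
    if not responses:
        return False
    hits = sum(agrees(kind, rating) for kind, rating in responses)
    return hits / len(responses) >= threshold
```

Under this sketch, a subject would continue to receive engagement descriptions until `criterion_reached` returns `True` for a block of cases. Note that because agreed-upon procedures span the whole scale, any rating of such an engagement counts as agreement, so a stricter design might teach narrower ranges for specific engagements.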
In the learning of use task, where the manipulation would occur, all subjects
would learn the essential features of SysTrust™, such as the decision situations
in which it could be used, levels of assurance, and customization options. Next,
they would all see cases similar to those seen in the first part of the experiment,
except that these would describe various fictitious SysTrust™ engagements with
different customization features. However, only subjects in the treatment group
would be asked to classify each SysTrust™ case according to the initial assurance
classification system and would be able to see a picture of the entire system (e.g.
Fig. 1), perhaps with some description of the categories. This classification task
would likely be more difficult than the corresponding task in Ross (2000) because
it is a challenge to perceive relationships between SysTrust™ engagements
addressing the reliability of systems and traditional forms of assurance concerning
accounting information used in investment decisions. No criterion level of
“achievement” would be sought at this point; the purpose is to demonstrate to
subjects through experience the similarities and differences between SysTrust™
and all other assurance services, and the range of engagements possible within
SysTrust™. Subjects in the control group would read the same descriptions of
SysTrust™ engagements in order to show them specific examples of the service,
but they would not be required to classify the engagements according to the initial
system and would not see a picture of it.
Still in the learning of use stage of the experiment, subjects would be taught
a postclassification of assurance services that includes SysTrust™, for instance
AICPA option 1 as pictured in Fig. 3. Subjects would be shown where in the
revised system various types of engagements would be placed, according to levels
of assurance and customization provided. The final task of the experiment would
require subjects to read examples of SysTrust™ engagements (including the
independent accountant’s reports), rate the reliability of the systems described,
and rate the user’s risk for each of them. The treatment group would observe the
entire postclassification system (e.g. Fig. 3), whereas the control group would
be shown only the part of the postclassification system showing SysTrust™. In
the case of Fig. 3, this would be only the oval containing SysTrust™ and the
subcategories of audit examination, agreed-upon procedures, and management
consulting. The categories of review, attestation, compilation, and assurance in
the initial classification system would not be shown to the control group.
In this design, judgments of user’s risk replace disease symptoms listed (Ross,
2000) as the dependent variable, but the design follows Ross (2000) in all other
respects as closely as possible. Evidence of a category use effect would be that
subjects in the treatment group rate user’s risk closer to the levels intended by
the AICPA (e.g. consistently low risk for audit examination engagements) than
subjects in the control group do. The effect would have two causes. First, the
treatment group would have attempted to classify SysTrust™ with the initial
system and been prompted to recall the levels of user’s risk associated with
analogous assurance services. Second, they would have been able to observe the
entire postclassification system, not just SysTrust™ categories, when making
user’s risk judgments. Stated as a formal hypothesis, the category use effect
would predict:
H1. Decision makers asked to classify SysTrust™ engagements according to
an initial classification system when learning to use SysTrust™, and to use a
complete postclassification system when asked to rate the user’s risk of systems,
will give ratings closer to those intended by the AICPA than decision makers
not asked to use an initial classification system, or a complete postclassification
system, during learning of use tasks.
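One way the "closeness" of ratings in H1 might be operationalized is as the mean absolute deviation of each group's ratings from the intended level. The sketch below assumes an ordinal coding of the four-point scale and uses made-up ratings purely for illustration; they are not data from any study, and the function name is hypothetical.

```python
from statistics import mean

# Ordinal coding of the four-point user's-risk scale (this coding is an assumption).
RISK_SCALE = {"low": 1, "medium": 2, "high": 3, "very high": 4}


def mean_abs_deviation(ratings, intended="low"):
    """Average distance, in scale points, between ratings and the intended level."""
    target = RISK_SCALE[intended]
    return mean(abs(RISK_SCALE[rating] - target) for rating in ratings)


# Illustrative, invented ratings of audit-examination-level SysTrust engagements,
# for which the intended user's risk is "low".
treatment = ["low", "low", "medium", "low"]
control = ["medium", "high", "low", "medium"]

deviation_treatment = mean_abs_deviation(treatment)
deviation_control = mean_abs_deviation(control)

# H1 predicts a smaller deviation in the treatment group.
h1_direction_holds = deviation_treatment < deviation_control
```

A comparison of this kind shows only the direction of the difference; an actual test of H1 would require an appropriate significance test on the subjects' deviations, which is not shown here.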
The experiment described above could provide behavioral evidence of a category
use effect, but it would not identify a more (or the most) efficient assurance
classification system because subjects are required to learn only one system in the
classification learning task. In the absence of a compelling logical argument, as
required by Cohen (2000), that there exists an assurance classification system
that is more efficient than alternative methods of organizing and accessing
knowledge, additional behavioral evidence is needed to address the question of
cognitive efficiency. To answer this question, the research could be extended
by teaching another classification system, such as Kinney (2000, Fig. 2), in
the classification learning task and comparing user’s risk judgments to results
given the AICPA’s system. Other dependent variables measuring cognitive
efficiency, for instance recall of the detailed information about the customized
features of individual SysTrust™ engagements, could be added to the design.
If one wished to predict that a relatively conceptual classification system is
more efficient than a concrete system, then one possible alternative hypothesis
would be:
H2. Among decision makers given the opportunity to use complete initial
assurance classification systems when learning postclassification systems
that include SysTrust™ , those using systems based on Kinney (2000) will
later recall more information about SysTrust™ engagements than those using
AICPA-based systems.
Designing research in assurance classification is inherently more complex than
the experiments conducted by Ross (2000). Not only are there alternative initial
classification systems, there are alternative postclassification systems even for the
same initial system, as shown in Figs 3 and 4. Before participating, subjects
might (e.g. commercial loan officers) or might not (e.g. students who have not
taken an auditing course) have already learned an assurance classification system.
Those in the former group might find it difficult to ignore their preconceptions
if they contradict what is taught in the classification learning task. One means
of dealing with pre-existing classification systems among user groups would be
to survey them regarding concepts such as levels of assurance and to adjust the
systems taught in experiments for the results.
Research could be extended beyond the strictly cognitive domain by intro-
ducing a dependent variable not used in extant classification research – price.
Experimental markets could be employed to measure the willingness of subjects
to pay for customized assurance services, although some abstraction from the
details of specific services such as SysTrust™ might be necessary. A category use
effect would be evident if subjects who were reminded in some way of an original,
generic assurance classification system as they learned to use customized assur-
ance were willing to pay more for a customized service than subjects not reminded
of the initial system. This result would certainly please the AICPA and lend
support to their hope that the CPA brand can be extended to a broader spectrum
of assurance services.
Finally, research could be extended to other customized assurance services
offered under the AICPA brand, such as ElderCare. Services offered by other
providers could also be included. For example, experiments using recall, user’s risk
or price as dependent variables could require subjects to classify websites having
either WebTrust™ or BBB Online seals as to the level of assurance provided. Re-
gardless of the assurance services and dependent variables addressed, the difficulty
remains that an initial classification system must either be assumed or taught to
participants in the research. Consensus is more likely with relatively homogeneous
user groups. Thus, a sample composed of either commercial loan officers or
institutional investors will be more likely to share a common classification system
NOTES
1. Assurance is defined in this paper following the AICPA (2000, p. 1): “independent
professional services that improve the quality of information, or its context, for
decision makers.” Attestation, including audits, is a subcategory of assurance (see Fig. 1),
and at times in the paper attestation services will be referred to as a type of assurance.
2. The fields of auditing and decision-making uses of accounting information are most
relevant to this paper, but cognitive models of classification also appear in accounting
education literature, e.g. Butler and Mautz (1996) and Bagranoff et al. (1994). There
is considerable discussion of ontologies in information systems literature concerning
databases and artificial intelligence, e.g. Dahlgren (1995), Parsons and Wand (1997),
Terenziani (1995), and Wand and Wang (1996). However, much of this work is based on
culture and language, for example in reproducing users’ classifications in artificial systems,
rather than psychology and cognition (the focus of this paper).
3. Cohen (2000) discusses two other types of evidence that are less relevant to this paper.
Neuropsychological evidence shows that damage affects hierarchical levels differently.
Ontogenetic evidence shows that children acquire some hierarchical levels before others.
4. Boritz (2001) pointed out that SysTrust™ assurance does not pertain to system
reliability itself – it pertains to effectiveness of controls over principles. He questioned
whether this could cause an expectations gap.
ACKNOWLEDGMENTS
Thanks to Karla Johnstone, Janet Morrill, Steve Salterio, Mike Stein, Michael
Wright, and two anonymous reviewers.
REFERENCES
AICPA (1996). Report of the AICPA Special Committee on Assurance Services. http://www.
aicpa.org/assurance/index.htm
AICPA (2000). Assurance Services – Definition and Interpretive Commentary. http://www.aicpa.
org/assurance/scas/comstud/defincom/index.htm
AICPA/CICA (2000). SysTrust™ Principles and Criteria for Systems Reliability Exposure Draft,
Version 2.0. http://www.aicpa.org or http://www.cica.ca
Arens, A., & Loebbecke, J. (1997). Auditing: An integrated approach (8th ed.). Upper Saddle River,
NJ: Prentice-Hall.
Bagranoff, N., Houghton, K., & Hronsky, J. (1994). The structure of meaning in accounting: A cross-
cultural experiment. Behavioral Research in Accounting (Suppl.), 35–57.
Bamber, E. M., & Stratton, R. (1997). The information content of the uncertainty-modified audit report:
Evidence from bank loan officers. Accounting Horizons (June), 1–11.
Bandyopadhyay, S., & Francis, J. (1995). The economic effect of differing levels of auditor assurance
on bankers’ lending decisions. Canadian Journal of Administrative Sciences, 12, 238–249.
Beaulieu, P. (1996). A note on the role of memory in commercial loan officers’ use of accounting and
character information. Accounting, Organizations and Society (August), 515–528.
Bonner, S., Libby, R., & Nelson, M. (1997). Audit category knowledge as a precondition to learning
from experience. Accounting, Organizations and Society (July), 387–410.
Boritz, J. E. (2001). Information systems assurance. In: V. Arnold & S. G. Sutton (Eds), Research-
ing Accounting as an Information Systems Discipline. Sarasota, FL: American Accounting
Association (forthcoming).
Butler, J., & Mautz, R. D., Jr. (1996). Multimedia presentations and learning: A laboratory experiment.
Issues in Accounting Education (Fall), 259–280.
Campbell, J., & Mutchler, J. (1988). The expectations gap and going-concern uncertainties. Accounting
Horizons (March), 42–49.
Choo, F., & Trotman, K. (1991). The relationship between knowledge structure and judgments for
experienced and inexperienced auditors. The Accounting Review (July), 464–485.
Cohen, G. (2000). Hierarchical models in cognition: Do they have psychological reality? European
Journal of Cognitive Psychology, 12(1), 1–36.
Dahlgren, K. (1995). A linguistic ontology. International Journal of Human-Computer Studies,
43(5–6), 809–818.
Frederick, D., Heiman-Hoffman, V., & Libby, R. (1994). The structure of auditors’ knowledge of
financial statement errors. Auditing: A Journal of Practice and Theory (Spring), 1–21.
Houston, R., & Taylor, G. (1999). Consumer perceptions of CPA WebTrust assurances: Evidence of
an expectation gap. International Journal of Auditing, 3, 89–105.
Jennings, M., Kneer, D., & Reckers, P. (1993). The significance of audit decision aids and precase
jurists’ attitudes on perceptions of audit firm culpability and liability. Contemporary Accounting
Research (Spring), 489–507.
Johnson, D., Pany, K., & White, R. (1983). Audit reports and the loan decision: Actions and perceptions.
Auditing: A Journal of Practice and Theory (Spring), 38–51.
Johnson, K., & Mervis, C. (1997). Effects of varying levels of expertise on the basic level of
categorization. Journal of Experimental Psychology (September), 248–277.
Kida, T., Smith, J., & Maletta, M. (1998). The effects of encoded memory traces for numerical data on
accounting decision making. Accounting, Organizations and Society (July/August), 451–466.
Kinney, W. (2000). Information quality assurance and internal control for management decision mak-
ing. Boston: Irwin/McGraw-Hill.
Kinney, W., & Nelson, M. (1996). Outcome information and the “expectation gap”: The case of loss
contingencies. Journal of Accounting Research (Autumn), 281–294.
Libby, R. (1995). The role of knowledge and memory in audit judgment. In: R. Ashton & A. H.
Ashton (Eds), Judgment and Decision-Making Research in Accounting and Auditing.
Cambridge: Cambridge University Press.
Libby, R., & Trotman, K. (1993). The review process as a control for differential recall of evidence in
auditor judgments. Accounting, Organizations and Society (August), 559–574.
Lymer, A., Debreceny, R., Gray, G., & Rahman, A. (1999). Business reporting on the internet
(Discussion Paper). London: International Accounting Standards Committee (IASC).
Malt, B., Ross, B., & Murphy, G. (1995). Category coherence in cross-cultural perspective. Cognitive
Psychology, 29, 85–148.
Martin, C., Handorf, W., & Clewell, W. (1988). Small business lending and levels of report assurance.
Akron Business and Economic Review (Summer), 69–84.
Mautz, R., & Sharaf, H. A. (1961). The philosophy of auditing. Sarasota, FL: American Accounting
Association.
Moeckel, C. (1990). The effect of experience on auditors’ memory traces. Journal of Accounting
Research (Autumn), 368–387.
Nelson, M., Libby, R., & Bonner, S. (1995). Knowledge structures and the estimation of conditional
probabilities in audit planning. Accounting Review (January), 27–47.
Osherson, D., Smith, E., Wilkie, O., Lopez, A., & Shafir, E. (1990). Category-based induction. Psy-
chological Review, 97, 185–200.
Parsons, J., & Wand, Y. (1997). Choosing classes in conceptual modeling. Communications of the
ACM, 40(6), 63–69.
Rosch, E., Mervis, C., Gray, W., Johnson, D., & Boyes-Braem, P. (1976). Basic objects in natural
categories. Cognitive Psychology, 8(3), 382–439.
Ross, B. (1996). Category representations and the effects of interacting with instances. Journal of
Experimental Psychology: Learning, Memory and Cognition, 22, 1249–1265.
Ross, B. (1997). The use of categories affects classification. Journal of Memory and Language,
37(August), 240–267.
Ross, B. (1999). Postclassification category use: The effects of learning to use categories after learning
to classify. Journal of Experimental Psychology: Learning, Memory and Cognition, 25(May),
743–757.
Ross, B. (2000). The effects of category use on learned categories. Memory and Cognition, 28(January),
51–63.
Terenziani, P. (1995). Towards a causal ontology coping with the temporal constraints between causes
and effects. International Journal of Human-Computer Studies, 43(5–6), 847–863.
Wand, Y., & Wang, R. (1996). Anchoring data quality dimensions in ontological foundations. Com-
munications of the ACM, 39(11), 86–95.
Wright, M., & Davidson, R. (2000). The effect of auditor attestation and tolerance for ambiguity on
commercial lending decisions. Auditing: A Journal of Practice and Theory (Fall), 67–81.