
This article was downloaded by: [128.122.253.212] On: 20 May 2015, At: 03:21


Publisher: Institute for Operations Research and the Management Sciences (INFORMS)
INFORMS is located in Maryland, USA

Management Science
Publication details, including instructions for authors and subscription information:
http://pubsonline.informs.org

Inventory Record Inaccuracy: An Empirical Analysis


Nicole DeHoratius, Ananth Raman

To cite this article:


Nicole DeHoratius, Ananth Raman (2008) Inventory Record Inaccuracy: An Empirical Analysis. Management Science 54(4):627-641. http://dx.doi.org/10.1287/mnsc.1070.0789

Full terms and conditions of use: http://pubsonline.informs.org/page/terms-and-conditions


Copyright © 2008, INFORMS


MANAGEMENT SCIENCE informs ®

Vol. 54, No. 4, April 2008, pp. 627–641 doi 10.1287/mnsc.1070.0789


issn 0025-1909, eissn 1526-5501, © 2008 INFORMS

Inventory Record Inaccuracy: An Empirical Analysis


Nicole DeHoratius
Downloaded from informs.org by [128.122.253.212] on 20 May 2015, at 03:21 . For personal use only, all rights reserved.

Graduate School of Business, The University of Chicago, Chicago, Illinois 60637, nicole.dehoratius@chicagogsb.edu

Ananth Raman
Harvard Business School, Boston, Massachusetts 02163, araman@hbs.edu

Traditional inventory models, with a few exceptions, do not account for the existence of inventory record
inaccuracy (IRI), and those that do treat IRI as random. This study explores IRI observed both within and
across product categories and retail stores. Examining nearly 370,000 inventory records from 37 stores of one
retailer, we find 65% to be inaccurate. We characterize the distribution of IRI and show, using hierarchical linear
modeling (HLM), that 26.4% of the total variance in IRI lies between product categories and that 2.7% lies
between stores. We identify several factors that mitigate record inaccuracy, such as inventory auditing practices,
and several factors that exacerbate record inaccuracy, such as the complexity of the store environment and
the distribution structure. Collectively, these covariates explain 67.6% and 69.0% of the variance in IRI across
stores and product categories, respectively. Our findings underscore the need to design processes to reduce the
occurrence of IRI and highlight factors that can be incorporated into inventory planning tools developed to
account for its presence.
Key words: execution; information technology; inventory control; record inaccuracy; retail; supply chains
History: Accepted by Wallace J. Hopp, operations and supply chain management; received September 6, 2004.
This paper was with the authors 1 year and 5 months for 3 revisions.

1. Introduction

Accurate inventory records are critical to the performance of retail organizations. U.S. retailers spend 1% of annual sales, approximately $30 billion per year, on automated decision support tools that use recorded inventory quantities to forecast demand, plan product assortments, and replenish store shelves (Steidtmann 1999). Inaccurate inventory records compromise the value of these tools. Take, for example, an automated replenishment system designed to trigger an order when the on-shelf quantity of a particular item reaches a pre-specified level. If the recorded inventory quantity does not match the quantity present on the store shelf, this system will either order when an order is unnecessary or fail to order when it should. Consequently, despite having implemented a decision support tool to prevent excess inventory and stockouts, retailers with inventory record inaccuracy (IRI) may find themselves short of some items and over on others.

We find IRI at a leading retail chain to be both pervasive (present across multiple items, product categories and stores) and substantial in magnitude. This retail chain, hereafter called Gamma, is a large public retailer with annual sales of approximately $10 billion. It has highly modern operations including electronic point-of-sale scanning and automated replenishment. Despite its technological sophistication, we observe inaccuracies in 65% of the 369,567 inventory records we collected from 37 of Gamma’s U.S.-based stores. The absolute difference between the inventory quantity recorded in Gamma’s automated tracking system and the actual inventory quantity found on the store shelf averaged nearly five units per stock keeping unit (SKU), approximately 35% of the average number of units per SKU found on the store shelf.

How do such inaccuracies arise? Numerous opportunities exist for recorded quantities to diverge from actual quantities as items move physically through the retail supply chain and recorded quantities are automatically updated by the inventory tracking system. Commonly cited problems include selling and restocking errors, replenishment errors, database errors, poor or incomplete data synchronization, and counting errors, as well as customer and employee theft (Hollinger and Langton 2003, BearingPoint 2002, Raman et al. 2001, Kurt Salmon Associates 2000).

Inventory records can, for example, become inaccurate if a salesperson mis-scans items during the checkout process. When faced with a customer buying multiple, identically priced items, a salesperson might scan one item multiple times rather than each item separately (BearingPoint 2002, Raman et al. 2001). The recorded quantity of each item should have been depleted by one unit. Instead, the recorded quantity of one item has been depleted multiple times, resulting in a mismatch between the recorded and actual inventory. Similarly, IRI can result from restocking
errors should a salesperson fail to properly process a returned or exchanged item.

In addition to these selling and restocking errors, replenishment errors can cause IRI (Bernard 1985, Raman et al. 2001). Typically, store employees do not scan each item delivered from the distribution center (DC) into store inventory upon receipt. Instead, store employees verify whether the expected number of pallets or cases has been received. Store inventory records are automatically updated based on the store order with the assumption that the order has been filled correctly. If a DC employee incorrectly fills a store order by accidentally picking and shipping the wrong item or the wrong quantity of the correct item, the quantity received will not match the quantity ordered for a given SKU. Consequently, store inventory records will be inaccurate.

Database errors and poor data synchronization can also result in IRI. For example, if the retailer’s master database file incorrectly states that the case pack quantity is twelve units when it is actually six units, every time a case is shipped to a store the automated inventory tracking system will add twelve units to the on-hand quantity of that item at that store when, in reality, only six units have been shipped (Fisher et al. 2000). Discrepancies between recorded and actual inventory quantities can also be caused by poor data synchronization or time lags between the flow of material and information (Iglehart and Morey 1972, Bernard 1985). If the quantity of inventory found in the record is compared to the quantity of inventory actually present in the store before the record has been properly updated to reflect any receipts that occurred in a previous period, the recorded quantity will appear to be inaccurate.

IRI may also result from poor inventory counts (Woolsey 1977, Iglehart and Morey 1972, Rinehart 1960). One common mistake made during manual inventory counts includes a “unit of measure” error. Here the individual conducting the audit mistakenly counts a set, a product consisting of multiple units (e.g., one toothbrush and toothpaste tube that are merchandised together), as two pieces rather than as a single piece (Bernard 1985).

And, finally, inventory theft is also a source of IRI in retail organizations. Theft by store employees or customers leads to a mismatch between recorded and actual inventory quantities. Theft can also occur at the DC or when items are in transit to the store. In such cases, the quantity received does not match the quantity ordered. Because most retailers update their store inventory levels based on what is ordered and not by scanning each item upon its receipt, this theft can result in IRI.

We provide evidence that, even in a high-performing retail organization, discrepancies between recorded and actual inventory quantities do occur and are substantial in magnitude. Research suggests that even small discrepancies can result in substantial lost sales, missed service level targets, and suboptimal retail performance (DeHoratius et al. 2008, Kang and Gershwin 2005, Fleisch and Tellkamp 2005). We estimate the potential lost revenue due to IRI at one Gamma store to be nearly $1,350 per week (see e-companion¹ Appendix A). Aggregated across each week of operation and each North American store, the revenue lost by Gamma to IRI is more than 1% of its North American retail sales and more than 3% of its gross profit.

¹ An electronic companion to this paper is available as part of the online version that can be found at http://mansci.journal.informs.org/.

We also identify factors that mitigate as well as contribute to IRI both within and across stores. We find that IRI, measured as the absolute difference between the recorded and actual inventory quantity, is associated with factors such as an item’s annual selling quantity, its cost, and the frequency of an audit, defined as the number of days since the previous physical audit. While several of these factors have been identified as drivers of IRI in other contexts (e.g., item cost and quantity sold), we extend these findings to the retail context. IRI is also associated with factors such as inventory density, defined as the total number of units found in a store’s selling area, product variety, defined as the number of different merchandise categories within a store (Levy and Weitz 2001), and the distribution structure (whether or not it was shipped from a Gamma-owned DC). These factors have never been linked to IRI. The linking of these variables to IRI reveals it to be not merely an inventory control problem, but rather a larger organizational problem.

The rest of this paper is organized as follows. We review the literature in §2 and develop our hypotheses in §3. In §4, we describe our research setting and the data collected for the empirical analysis presented in §5. We conclude with a discussion of our results, study limitations, and future research opportunities in §6.

2. Related Literature

Rinehart (1960) is the first researcher to identify IRI as a potential obstacle to operational performance. In his work with a government supply facility, Rinehart (1960) documents substantial discrepancies between recorded and actual inventory quantities. Discrepancies exist in 0–50% of the items audited across different inventory classifications. Record inaccuracy has, he notes, a “deleterious effect” on operations because it prevents firms from achieving the benefits that
result from using “theoretically optimal methods of inventory control” (p. 543).

Iglehart and Morey (1972) determine the optimal buffer stock and frequency of inventory counts needed to prevent shortages caused by IRI. They calibrate their analytical model using data from Emma (1966), who documents inaccuracies in 25% of a Naval supply depot’s audited records. Morey (1985) suggests that firms not only adjust the timing of inventory counts and buffer stock calculations, but also alter ordering policies, to accommodate IRI.

Among the IRI prevention techniques suggested by Ernst et al. (1993) is the use of control charts. Using samples from two finished goods warehouses and one pharmaceutical firm to test their methodology, they find that recorded inventory quantities fail to agree with actual inventory quantities anywhere from 83.7% to 95.5% of the time. Millet (1994) focuses on incentive design as a means for improving IRI. He recommends rewarding efforts that produce higher accuracy (e.g., more frequent counting of expensive, critical, and fast-moving items) and reports a 32% improvement in the percentage of accurate items after implementation. However, Millet (1994) does not empirically test plausible alternative hypotheses for the observed improvements.

Sheppard and Brown (1993) audit a sample of items within a manufacturing plant on two separate occasions, finding inaccuracies in 26% of the items audited initially and then 22% of the items audited subsequently. Unlike previous work on IRI, Sheppard and Brown (1993) actually test empirically several product-specific factors hypothesized to be associated with IRI. Like them, we report the magnitude of, and test several product-specific factors hypothesized to be associated with, IRI. We turn to existing research on IRI and inventory auditing in an attempt to understand why levels of IRI might vary for items with different characteristics. Product characteristics such as the value of an item, its transaction frequency, its annual dollar volume, and audit frequency are often named, but seldom empirically tested, drivers of IRI (Tallman 1976, Neeley 1983, Flores and Whybark 1986, Young and Nie 1992).

Our work differs from Sheppard and Brown (1993) in two key ways. First, we document the magnitude of IRI in retailing, a context not yet examined by previous work on this topic. Second, we investigate IRI across multiple sites (e.g., stores) of the same firm instead of looking only within one site of a single firm. Thus, we attempt to explain not only what accounts for the variability in IRI across items within an individual site but also what might explain the variability in IRI across multiple sites.

The variation in IRI across different stores of a single retail chain has yet to be examined. Kang and Gershwin (2005) note inaccuracies in 51% of the records used by one retail firm and claim that the proportion of inaccurate records ranges from 30% to 80% across stores, but they do not investigate empirically the source of this variability. Variability in operating performance of different sites of the same firm is not uncommon. Existing research in manufacturing suggests that operational performance (e.g., quality, flexibility, productivity) differs even across sites that employ similar technologies and processes and offer identical employee incentives and training (Krafcik 1988, Garvin 1988, Clark and Hayes 1988, Upton 1997). We draw upon several factors cited by this stream of research to generate hypotheses about the variation in IRI that one might expect across different stores of the same retail chain.

3. Definition of IRI and Hypotheses

3.1. Definition of IRI

Schrady (1970) comprehensively reviews, and highlights the advantages and disadvantages of, plausible IRI measures. Researchers have operationalized the concept of IRI in a variety of ways. Some studies have employed a binary measure whereby recorded quantities are either in agreement with actual inventory quantities or they are not (Raman et al. 2001). Still others have used a binary measure where accuracy is defined as within a pre-specified tolerance (Rinehart 1960, Sheppard and Brown 1993). In both Rinehart (1960) and Sheppard and Brown (1993), a record is inaccurate if the discrepancy between the recorded and actual inventory quantity is 1% or more of the recorded balance.

Binary measures of record inaccuracy are useful for determining the likelihood of discrepancies, but they do not provide any information on their magnitude. Our objective is to identify the determinants of IRI, and as such we are concerned about the magnitude of the discrepancy between recorded and actual inventory quantities. We therefore define IRI as the absolute difference between the recorded and actual inventory quantity, and the hypotheses generated below focus on what drives variability in this measure.

Absolute deviation, in addition to being a simple measure that is used in practice to track IRI, has the following nice property. It captures the uncertainty of the inventory management process in a single measure because it reflects both the mean and the spread of the discrepancy distribution. Thus, it is akin to the way in which management scholars and practitioners evaluate forecasts: mean absolute deviation (MAD). MAD has long been used in inventory management (Brown 1967, Nahmias 2001) as a measure of accuracy, particularly in forecasting. Like our measure of IRI, MAD is simple, is used by practitioners, and captures
in a single measure two key elements: the mean and the spread (e.g., standard deviation) of the discrepancy distribution. Unlike our measure of IRI, MAD is the mean of a set of many absolute deviations.

Two advantages exist to using this absolute deviation definition of IRI compared to a normalized measure of IRI (e.g., sales-weighted absolute deviation). One, our measure provides greater flexibility than a normalized variable. Normalization predetermines the relationship between the discrepancy between recorded and actual inventory levels and the variable used for normalization. Regressing this normalized variable on other independent variables assumes that the discrepancy between recorded and actual inventory levels is a linear function of the variable used for normalization. This is a restrictive assumption and need not always be true. Two, our measure is easier to interpret. If we normalize IRI we are unable to clearly interpret the significance and the direction of any estimated regression coefficients. Does the independent variable impact the numerator of the normalized outcome, the denominator, or possibly both? It is appropriate therefore to use variables that one might consider important for normalization as independent variables to control for such variables and achieve a similar effect to normalization.

3.2. Theoretical Development

To generate our hypotheses, we draw upon several streams of research. Research on inventory management in general and inventory auditing in particular suggests that the dollar value of an item, its transaction frequency, the dollar volume, and the time elapsed between audits are potential drivers of IRI. We use transaction cost economics to evaluate whether the distribution structure might account for differences in IRI across items. To account for the variability in IRI across stores, we draw on factors shown to explain differences in operating performance across manufacturing plants such as product variety and inventory levels. Product variety and inventory levels are analogous to environmental complexity in information processing theory (Flynn and Flynn 1999, Vachon and Klassen 2002), specifically the branch of this theory that addresses human reliability and the environmental conditions under which individuals are more or less likely to make mistakes.

3.2.1. Inventory Auditing and Record Inaccuracy. Maintaining inventory accuracy through periodic auditing is widely practiced throughout industry and is an extensively studied topic among operations researchers (see, for example, Tallman 1976, Neeley 1983, Flores and Whybark 1986, Young and Nie 1992). These audits can take the form of periodic physical audits where every item in a pre-specified location is counted on the same day or the form of cycle counts where items are counted on a regularly defined schedule throughout the year. The purpose of these audits is to identify items where the recorded inventory quantity is not in agreement with the actual inventory quantity. Discrepancies are then reconciled, and, where possible, root-cause analysis is conducted to eliminate the source of these discrepancies.

In this research, two factors are commonly assumed to impact IRI: the activity of the item (Rinehart 1960, Tallman 1976, Rout 1976, Johnson et al. 1981, Bernard 1985, Backes 1980, Vollmann et al. 2005) and the time elapsed since that item was last audited (Neeley 1983, Young and Nie 1992). Moreover, this work often suggests that efforts to reduce or eliminate record inaccuracy should not be applied evenly across all items. Instead, items deemed most important by the firm (e.g., expensive or high dollar-volume items) should receive differential treatment such that they are monitored more frequently or more extensively (see, for example, Brooks and Wilson 1993, Flores and Whybark 1986). We discuss each of these factors and their expected relationship to IRI in detail below.

Transaction Frequency. According to Bernard (1985) and Arnold and Chapman (2004), the inventory records of more active items, or items that have the greatest number of transactions processed against them, are more likely to diverge from the actual quantity found at the time of an audit. We hypothesize that annual quantity sold, a proxy for transaction frequency, is positively associated with our measure of IRI. The more units per year an item sells, the greater its IRI and vice versa. See e-companion Appendix B for the derivation of this relationship. Note that this derivation, and therefore this hypothesis, might not hold with alternative definitions of IRI.

Hypothesis 1. IRI is positively associated with the annual selling quantity of an item.

Audit Frequency. Backes (1980) and Flores and Whybark (1987), in their work on manufacturing inventory, argue that error accumulates in inventory records until such time as the records are updated by a physical inventory audit. Neeley (1983) maintains that, aside from eliminating the source of the error, increasing the frequency of inventory audits is the primary means of minimizing IRI in manufacturers’ warehouses. Morey and Dittman (1986), concerned that the more time that has elapsed between audits the more likely record inaccuracies will occur, develop a framework for determining the optimal timing of audits needed to correct the inaccuracies and attain a predetermined level of accuracy. We therefore expect items that have been audited more frequently to have lower levels of inaccuracy.

Hypothesis 2. IRI decreases with increases in audit frequency.
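Hypotheses 1 and 2 both rest on the idea that recording errors accumulate with each transaction until an audit resets the record. The following is a minimal sketch of that intuition only; it is not the derivation in e-companion Appendix B, which is not reproduced here. It assumes a toy error model that does not come from the paper: each transaction is recorded correctly with probability 1 - error_rate and otherwise shifts the signed discrepancy by one unit up or down with equal probability. The function name, the error model, and the value of ERROR_RATE are all hypothetical.

```python
from collections import defaultdict


def expected_abs_discrepancy(n_transactions: int, error_rate: float) -> float:
    """Exact E|recorded - actual| after n transactions under the toy model.

    Assumption (illustrative, not from the paper): each transaction is
    recorded correctly with probability 1 - error_rate, and otherwise
    shifts the signed discrepancy by +1 or -1 unit with equal probability.
    """
    dist = {0: 1.0}  # signed discrepancy -> probability; zero right after an audit
    for _ in range(n_transactions):
        nxt = defaultdict(float)
        for d, p in dist.items():
            nxt[d] += p * (1.0 - error_rate)    # no recording error
            nxt[d + 1] += p * error_rate / 2.0  # record overstated by one unit
            nxt[d - 1] += p * error_rate / 2.0  # record understated by one unit
        dist = nxt
    # IRI in the sense of Section 3.1 (absolute difference), taken in expectation.
    return sum(abs(d) * p for d, p in dist.items())


if __name__ == "__main__":
    ERROR_RATE = 0.02  # hypothetical per-transaction error probability
    for n in (0, 50, 200, 800):
        print(n, round(expected_abs_discrepancy(n, ERROR_RATE), 3))
```

Under this assumed model the expected absolute discrepancy rises with the number of transactions processed since the last audit, so both a higher selling quantity and a longer interval between audits increase it. A different error model, or a different definition of IRI, need not behave the same way.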
Inventory Classifications. Researchers discussing different auditing procedures often highlight the need to classify different items based on their relative importance to the firm such that items that are more important receive greater attention. Two definitions of relative importance emerge from this literature: items of high value (Hanna and Newman 2001, Brooks and Wilson 1993, Tallman 1976) and high dollar-volume items (Arnold and Chapman 2004, Flores and Whybark 1986). Based on these definitions, managers can classify items into A, B, or C items using the Pareto principle. Items classified as A items are then scheduled to be audited more frequently than B items, B items more frequently than C items, and so on. Inventory auditing is labor intensive, and thus a firm must make choices on how best to allocate this resource. The implicit belief is that the differential treatment given to expensive items and/or high dollar-volume items results in greater record accuracy for these items.

Moreover, items of high value may be monitored more closely throughout the retail supply chain as part of a retailer’s theft-prevention efforts. Inventory theft, or shrinkage, has a direct effect on store operating profits, and retailers take great care to prevent such losses (Hollinger and Langton 2003). For example, high-value items may be shipped in special cartons, placed into locked cages in the backroom prior to restocking the selling floor, or audited at multiple stages of the distribution process. Store managers, who are often evaluated in part on their ability to reduce inventory shrink (DeHoratius and Raman 2007), pay particular attention to reducing the theft of high-value items because the shrinkage of expensive products affects store profitability more than the shrinkage of less expensive products. Consequently, store managers may induce store employees to exert more effort when checking the shipments of expensive products or while monitoring inventory levels of these expensive products.

Given these auditing and theft-prevention practices common among retailers, we expect lower levels of record inaccuracy among high-value items. We also posit that inventory records of high dollar-volume items are more accurate than those of low dollar-volume items.

Hypothesis 3. IRI is negatively associated with the cost of the item.

Hypothesis 4. IRI is negatively associated with the dollar volume of an item.

3.2.2. Distribution Structure. Retailers employ two different shipping methods when distributing inventory to stores (Levy and Weitz 2001). Retailers may receive inventory into their own DCs and then distribute inventory to stores. Alternatively, retailers can have vendors ship directly to stores. Differences in the way inventory is handled during this process, due to differences in governance, may account for differences in IRI.

Transaction cost economics suggests that, in the absence of efficiency distortions related to internal organization, vertical integration leads to efficient and effective coordination through a number of mechanisms, including an increased likelihood of physical proximity, established patterns of communication, and greater willingness to cooperate with other members of the same group (Williamson 1975, Coase 1937, Alchian and Demsetz 1972). Unlike the retail-owned DC, the vendor is not owned by the retail chain and thus agency problems exist. Thus, one might expect fewer inventory discrepancies among items shipped through retail-owned DCs.

Yet the presence of internal procurement bias, internal expansion bias, persistence, or communication distortion might cause one to expect the opposite (i.e., more errors among items shipped from retail-owned DCs). Research by Baker et al. (2002) argues that under some circumstances an upstream party will refuse to take any costly action pertaining to quality control (e.g., invest in the reduction of inventory errors) when the downstream party is not paying a bonus for such quality. Because stores are unable to reward (or penalize) the retail-owned DC for accurate (or inaccurate) fulfillment of store orders, one might expect greater inventory discrepancies among items shipped through retail-owned DCs. Vendors, on the other hand, are motivated to expend more than the minimal amount of effort on ensuring quality performance (e.g., accurate shipments) given that the store has alternative sourcing options. Incentives are misaligned when retail stores receive inventory from the retail-owned DC because the DC does not have an outside option to sell its goods and thus expends only the minimum effort to ensure quality. Furthermore, stores are unable to source from other DCs in the face of poor quality.

We therefore hypothesize that IRI is correlated with the way in which an item is shipped to the store.

Hypothesis 5. IRI is associated with the distribution structure used.

3.2.3. Environmental Complexity and Record Inaccuracy. The concept of bounded rationality, the notion that decision makers have limitations on their capacity for processing information introduced by Simon (1955), has led to a stream of research known as information processing theory. Information processing has been studied at the cognitive level (e.g., Payne et al. 1993, Anderson 2004), within organizations (e.g., Galbraith 1973, Tushman and Nadler 1978), and
across organizations (e.g., Bensaou and Venkatraman 1995).

Information processing theory suggests that when individuals execute a task they are constantly making trade-offs between effort and accuracy (Payne et al. 1993). Properties of the task, individual, and task environment influence the likelihood of errors for a given level of effort. For example, the more complex the task, the more errors one would expect. Such errors are exacerbated (or mitigated), according to human error analysis or human reliability analysis, by specific individual characteristics. These individual characteristics include the individual’s emotional state, intelligence, motivation, and skill level (see, for example, Miller and Swain 1987). Familiarity with a task reduces the number of errors an individual is likely to make (Newell and Rosenbloom 1981, Delaney et al. 1998), as does the existence of established organizational routines and practices (Regans et al. 2005).

Environmental complexity, however, makes a task more difficult to execute (Swain and Guttman 1983). In retailing, employees perform multiple tasks including moving, storing, picking, monitoring, and eventually selling store inventory. These employees determine both how much effort to allocate to each task and how accurate to be. We assume, all else being equal, that retail employees prefer more accuracy and less effort when executing any task. Yet, the effort–accuracy level required to execute a task depends on the task environment. Retail employees operating in stores with more complex task environments are expected to make more errors for a given level of effort.

We define environmental complexity in the retail environment along two dimensions: inventory density and product variety. Stores with greater inventory density, defined as the total number of units available in a store’s selling area, as well as stores with greater product variety, defined as the number of different merchandise categories within a store, have

“holes” in the shelf will appear. Identifying stockouts through visible inspection is therefore more difficult, less reliable, and more time consuming.

Numerous researchers have addressed the adverse impact of high levels of inventory in the manufacturing context. Schonberger (1982), in his work on JIT, noted that process problems are more difficult to identify in the presence of excess inventory. Krafcik (1988) identified plants with better operational performance as ones with lower inventory levels and lean production systems. Hayes and Clark (1986) show that plants with lower work-in-process, and therefore less operational complexity and confusion, exhibit better performance. Lieberman and Demeester (1999) find an inverse relationship between inventory and productivity among Japanese automobile manufacturers. Bertrand and Van Ooijen (2002) argue that high inventory levels create additional employee workload and hence a high-pressure work environment conducive to greater levels of human error. As detailed above, inventory density plays a similar role in the retail context. We therefore hypothesize that inventory density, as a measure of environmental complexity, is positively correlated with IRI.

Hypothesis 6. High (low) levels of inventory density are associated with high (low) levels of IRI.

Product Variety. In contexts other than retailing, substantial empirical evidence exists suggesting that variety negatively impacts numerous measures of operational performance. Srinivasan et al. (1994) demonstrate that shipment discrepancy rates increase in the face of increasing variety in manufacturing. MacDuffie et al. (1996) find that plants with greater product variety have lower labor productivity, and Anderson (1995) reveals a positive correlation between product variety and manufacturing overhead costs. Rommel et al. (1995) identify the best performing mid-sized manufacturers as ones with a narrow range of products. Fisher and Ittner (1999) find a posi-
more complex task environments than other stores. tive relationship between product variety and rework
We describe each of these dimensions below. rates in automobile assembly. They posit that such
Inventory Density. Retail stores are becoming in- rework is caused by the increased likelihood that
creasingly crowded as merchandisers stock more workers will choose the wrong parts. Increased prod-
inventory in a fixed amount of square footage uct variety complicates the parts supply process by
(Ketzenberg et al. 2000). Crowding in a retail store requiring additional coordination and effort to get the
makes it difficult for store employees to periodically right part to the right place. Hayes and Clark (1986)
monitor store inventory for anomalies such as stock- and Skinner (1974) find that greater variety leads to
outs. Retailers often use stockouts to detect discrepan- greater complexity and confusion, which, in turn, lead
cies between recorded and actual inventories (Raman to lower quality in manufacturing plants.
and Zotteri 2000). When stockouts are easily visi- Employees working in stores with greater prod-
ble, store employees can readily check the recorded uct variety face additional environmental complex-
inventory for these items, spot mistakes, and request ity. Among the problems occasioned by increased
an update of the recorded inventory level. Crowd- product variety is difficulty differentiating, during the
ing, however, makes the identification of stocked- store checkout, audit, or order fulfillment processes,
out items more difficult because it is less likely that items that are nearly identical. The detrimental effects
of similarity on human error are well known (Reason 1990). Such similarity may increase the likelihood of mis-scanning at the store because salespeople may scan one item multiple times without recognizing that the customer is purchasing multiple products that differ from each other. It may also increase the likelihood of DC mis-shipments to stores because warehouse employees may incorrectly pick and ship one item in lieu of another. Either of these actions can cause store inventory records to differ from the actual inventory at the store. Thus, we expect stores with greater product variety to exhibit greater IRI.

Hypothesis 7. High (low) levels of product variety are associated with high (low) levels of IRI.

4. Methods and Measures

4.1. Research Setting
We examine IRI at 37 stores owned and operated by Gamma Corporation, a large public retailer with annual sales of approximately $10 billion. Gamma is currently the largest operator of superstores in the world for its specific retail category. Those familiar with retail consider Gamma an extremely successful retailer that excels both operationally and in its use of technology. Consequently, Gamma not only controls the largest market share for its particular retail category, but its stock growth consistently outperformed that of the Standard & Poor's retail index during the period of our study.

4.2. Data Collection and Measures
The data we collected at Gamma consist of the results of the 1999 physical inventory audit conducted in 37 randomly selected Gamma stores. Gamma conducts periodic physical inventory audits at every store. Audit teams scan each item in the store using handheld scanning devices during a period when the store is closed to customers. Store managers and, at times, management from the corporate office are present to ensure timely and accurate counts. Upon completion of the audit, Gamma records two corresponding unit quantities, one representing the "system count" (i.e., the number of inventory units for each SKU recorded to be on-hand at a particular store), the other the "actual count" (i.e., the number of inventory units for each SKU actually present in the store at the time of the audit). Our measure of IRI, defined in detail in §3.1, is the absolute difference between the system and the actual count recorded during the audit.

We did not collect audit data from every Gamma store but rather sampled at random from among those stores that had been opened for at least 2 years to eliminate any operational complexities that might exist in the ramp-up of a new store and to ensure that the store was viable. We selected twelve stores served by each of two DCs operated by Gamma and thirteen stores served by a third Gamma-operated DC. Collectively, these audit results provide data for each item in the 37 chosen stores, amounting to a total of 369,567 observations. Note that we sampled stores and collected all observations for each store sampled.

In addition to the data gathered from the physical inventory audit that provided us with our measure of IRI, we also obtained data characterizing each item corresponding with our hypotheses. Specifically, we gathered the price Gamma paid for an item (ITEM COST), the annual selling quantity per store during the year preceding the audit for an item (QUANTITY SOLD),² and an indicator variable denoting whether an item had been shipped to the store from one of Gamma's DCs (VENDOR = 0) or directly from the vendor (VENDOR = 1). Gamma's distribution structure is consistent across the retail chain such that each store receives the same items from vendors and the same items from Gamma-owned DCs.

² We acknowledge that our measure of transaction frequency captures sales during an entire year whereas our measure of IRI captures errors that have accumulated during varying time periods (e.g., times since the store was last audited). Observations of quantity sold during the period between audits were not available to us. Nevertheless, we believe annual quantity sold to be a good proxy for transaction frequency. We do not expect there to be much variability in sales across items within a store during different times of the year. Unlike a department store, which may sell swimsuits in summer and coats in winter, Gamma's store assortment is relatively consistent from one season to the next, and thus any changes in sales during different periods are not likely to impact items differentially.

To evaluate the variation in IRI across stores, we first coded each observation so we could identify the store from which it was taken. We then collected data on each store, specifically its inventory density, its product variety, and the frequency of inventory audits. Inventory density, DENSITY, is the total number of units in a store divided by that store's selling area (units per square foot). Product variety, VARIETY, is the number of different merchandise categories within a store. An alternative measure of VARIETY in retailing is the number of SKUs per store (Lindsley et al. 1991). These two measures of variety are highly correlated with each other (0.75, p < 0.0001), and our results, presented in §5.4, remained unchanged when using this alternative measure. Audit frequency, DAYS, measures the number of days between audits for a given store. Audit frequency decreases with increases in DAYS.

4.3. Controls
We coded each store by the region in which it was located (e.g., Northeast, Mid-Atlantic, and West). Our aim is to control for any unmeasured regional differences that might exist. We propose several regional
Table 1   Descriptive Statistics and Correlations

Variable name                  Mean (s.d.)       Min.      Max.       (1)        (2)        (3)        (4)

SKU level (level 1), n = 369,567
1. IRI                         4.98 (28.54)      0.00      6,988      –
2. ITEM COST                   20.43 (66.85)     0.01      3,474      −0.04***   –
3. QUANTITY SOLD               71.47 (318.60)    0.00      40,229     0.26***    −0.13***   –
4. VENDOR                      0.01 (0.11)       0         1          −0.02***   0.08***    −0.02***   –

Store level (level 2), n = 37
1. Avg. store IRI              4.94 (1.27)       2.37      7.79       –
2. DENSITY                     8.28 (1.62)       5.43      11.36      0.67***    –
3. VARIETY                     352.78 (5.66)     327       360        0.31~      −0.04      –
4. DAYS                        342.27 (111.23)   105.00    523.00     0.28~      0.30~      0.01       –

Product category level (level 2), n = 68
1. Avg. product category IRI   7.02 (23.62)      0.46      196.70     –

~ p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001.
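To make the IRI measure summarized in Table 1 concrete, the following minimal sketch computes the absolute and signed discrepancies between system and actual counts. The record layout, the function names, and the four sample records are hypothetical illustrations; they are not Gamma data and are not taken from the study's own analysis code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AuditRecord:
    """One SKU-store audit line (hypothetical layout, not Gamma's schema)."""
    sku: str
    system_count: int  # units the inventory system reports on hand
    actual_count: int  # units physically counted during the audit


def iri(record: AuditRecord) -> int:
    """Absolute difference between the system count and the actual count."""
    return abs(record.system_count - record.actual_count)


def summarize(records: list[AuditRecord]) -> dict[str, float]:
    """Descriptive statistics analogous to those reported for IRI."""
    signed = [r.system_count - r.actual_count for r in records]
    inaccurate = [d for d in signed if d != 0]
    return {
        "mean_iri": mean(abs(d) for d in signed),
        "share_inaccurate": len(inaccurate) / len(records),
        "mean_signed_discrepancy": mean(signed),
        "share_positive_among_inaccurate": (
            sum(d > 0 for d in inaccurate) / len(inaccurate) if inaccurate else 0.0
        ),
    }


# Made-up records for illustration only.
sample = [
    AuditRecord("A", 10, 10),
    AuditRecord("B", 5, 3),
    AuditRecord("C", 2, 4),
    AuditRecord("D", 7, 6),
]
stats = summarize(sample)
```

Under this sign convention, a positive mean signed discrepancy means that the system records more inventory than the audit finds, which is the direction the article reports for its sample.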

differences that may impact IRI in §6. Although we would have liked to explore some of these regional differences more thoroughly, we are limited by the small number of regions, three, in which Gamma stores are clustered.

We also obtained a "merchandising hierarchy" that Gamma uses to classify every SKU into sixty-eight product categories. Each SKU can be described by one and only one product category. For example, a SKU representing an IBM T-42 Thinkpad would be classified as a laptop computer.³ We use this classification to control for the lack of independence that may exist within product categories. For example, store or DC employees may be more likely to confuse items that are similar (e.g., Coca-Cola and Diet Coke) than they are to confuse items that are dissimilar. Thus, it is important to account for this lack of independence within product categories. We also use this classification to control for any unmeasured effect on IRI that is common to all items in a particular product category. Because of the nature of our agreement with Gamma, we cannot identify each product category, nor can we collect data specific to product categories. Thus, even though one could hypothesize that certain product category differences might impact IRI (see §6), we were able to include only product category codes in our analysis.

³ The definition of product categories differs from our measure of variety, merchandise categories, in the following way. Product categories consist of numerous merchandise categories.

5. Estimation and Results

5.1. Descriptive Statistics
Table 1 presents our descriptive statistics and within-level correlations. We find that IRI ranges from zero to 6,988 units and averages 4.98 units. We also find variability in IRI across stores and product categories. The average IRI by store ranges from a minimum of 2.4 units per item to a maximum of 7.8 units per item. The average IRI within a product category ranges from a minimum of 0.46 units per item to a maximum of 197 units per item.

A histogram of IRI reveals that 65% of the inventory records in our sample are inaccurate (Figure 1). Twenty-two percent of the records have a discrepancy of one unit, 10% have a discrepancy of two units, and so on. Approximately 17% of the records we collected have discrepancies of seven units or more. In other words, these records have discrepancies that were 50% or more of the average on-hand quantity present in Gamma stores. We found positive discrepancies in 59% of the inaccurate records observed. Upon averaging the raw difference (as opposed to the absolute difference) between system and actual inventory levels, we found that items had, on average, a discrepancy of 0.12 units, implying that, on average, more inventory was recorded in the system than was actually found on the selling floor. In an effort to illuminate as much detail as possible about our chosen research site, a retail chain whose name we have to disguise, we supplement the descriptive statistics presented here in e-companion Appendix C.

5.2. Variable Transformations
Before testing our hypotheses, we transform several of our variables (i.e., IRI, ITEM COST, and QUANTITY SOLD) into their natural log. The primary reason for transforming our variables in this way was to linearize the regression model (Kleinbaum et al. 1998). Analysis of the raw data suggests a nonlinear relationship between the outcome variable, IRI, and each of the predictors. We find the log transformation to be the most effective at inducing linearity. Moreover, several of the variables in their raw form have standard deviations that are large relative to their means. Afifi and Clark (1997) recommend transforming such
Figure 1   Histogram of the Absolute Difference Between Recorded and Actual Inventory Levels (n = 369,567)
[Histogram not reproduced. Vertical axis: percentage of records (%). Horizontal axis: discrepancy in units, with the largest bins labeled 11–50, 51–100, 101–150, 151–200, 201–250, 251–300, 301–350, 351–400, 401–450, 451–500, and 501+.]
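Section 5.2 log-transforms IRI and quantity sold after adding one unit, because both variables can be zero and the natural log of zero is undefined. A minimal sketch of that transformation follows. The function name and the sample values are hypothetical, and the alternative offsets of 0.5 and 0.33 simply mirror the robustness check mentioned in the text.

```python
import math


def log_transform(value: float, offset: float = 1.0) -> float:
    """Natural log of (value + offset); the offset keeps zero values defined."""
    if value + offset <= 0:
        raise ValueError("value + offset must be positive")
    return math.log(value + offset)


# Made-up IRI values, including zero (an accurate record).
raw_iri = [0, 1, 2, 7, 150]

# Baseline transformation with an offset of one unit.
baseline = [log_transform(v) for v in raw_iri]

# Alternative offsets for a robustness comparison.
alternatives = {
    offset: [log_transform(v, offset) for v in raw_iri]
    for offset in (0.5, 0.33)
}
```

With an offset of one, an accurate record (IRI of zero) maps to exactly zero on the log scale, which keeps the transformed outcome non-negative.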

variables to induce normality prior to model estimation. Because our measure of IRI and quantity sold can be zero, and the natural log of zero is undefined, we add one unit to each observed unit of error and quantity sold. We find no substantial differences in our results presented in §5.4 when adding numbers less than one, specifically 0.5 and 0.33.

5.3. Hierarchical Linear Modeling
We employ hierarchical linear modeling (HLM) to analyze these data (Raudenbush and Bryk 2002). One advantage of HLM is that it allows for unbalanced data, an important benefit in our research setting where the number of observations drawn from each store differs. The number of observations from each store ranges from 6,791 to 11,843 (see e-companion Appendix C, Table C2). However, the primary benefit of HLM is that it takes into account the multi-level structure of our data (Raudenbush and Bryk 2002) and the lack of independence between observations at different levels (Seltzer 1994). In our data, neither the observations drawn from the same store nor those represented by the same product category can be assumed to be independent. Thus we use a cross-classified HLM model that allows observations to be nested within two higher-level categories, stores and product categories. Cross-classified HLM differs from more traditional HLM techniques in that it allows for observations to be nested within two different higher-level classifications.

The hierarchical or nested nature of these data means that SKUs existing within stores, or within product categories, will tend to be more similar to each other than SKUs randomly sampled from the entire population. Thus, observations within these clusters of stores or product categories are not fully independent. Traditional OLS approaches, including hierarchical regression modeling (HRM) and analysis of variance (ANOVA), require independence of observations as a primary assumption for the analysis and would produce incorrect standard errors (Raudenbush and Bryk 2002). HLM, on the other hand, assumes a correlation between the residuals from observations within the same store or within the same product category.

5.4. Hypothesis Testing
We test our hypotheses concerning the SKU and store-level determinants of IRI in several steps. Our approach was similar to Liao and Chuang (2004) and Ang et al. (2002), where we first estimate a null model (Model 1, Table 2) that had no predictors at the SKU level (Level 1), the store level (Level 2a), or the product category level (Level 2b). Equation (1) summarizes this null model. We then include each of our predictors sequentially, resulting in a full model (Model 10, Table 2) described by Equation (2) below.

To estimate these HLM models we use Proc Mixed in SAS v.9.1 and employed a full maximum likelihood (ML) estimation technique. The goodness-of-fit statistics computed with ML describe the fit of the entire model and thus can be used to test hypotheses about any type of parameter (e.g., fixed effect or variance component). See Singer and Willett (2003) for a detailed description of the advantages and disadvantages of different estimation techniques when executing multi-level models.

5.4.1. Null Model. The null model described below is equivalent to a two-way analysis of variance with random row and column effects and within-cell deviation. With this model, we partition the variance in IRI into three components: the variance in IRI associated with differences across stores, the variance in IRI associated with differences across product categories, and the variance in IRI within a given store and product category combination.

$$\mathrm{IRI}_{ijk} = \theta_0 + b_{00j} + c_{00k} + e_{ijk} \qquad (1)$$

where $\mathrm{IRI}_{ijk}$ is the record inaccuracy of item $i$ ($i = 1, \ldots, n_{jk}$) in product category $j$ ($j = 1, \ldots, 68$) and store $k$ ($k = 1, \ldots, 37$). $\theta_0$ is a fixed intercept parameter. The random main effect of product category $j$ is $b_{00j} \sim N(0, \tau_{b00})$. The random main effect of store $k$ is $c_{00k} \sim N(0, \tau_{c00})$. The random item effect is $e_{ijk} \sim N(0, \sigma^2)$. $\tau_{b00}$, $\tau_{c00}$, and $\sigma^2$ define the variance in IRI between product categories, stores, and items, respectively.

We found that 70.9% of the variance in IRI exists across items, 26.4% across product categories, and
Table 2   Hierarchical Linear Modeling Results for IRI

Variable            Model 1     Model 2     Model 3     Model 4     Model 5     Model 6     Model 7     Model 8     Model 9     Model 10

Fixed effects
Intercept           0.961***    0.523***    0.770***    0.677***    0.676***    0.677***    0.770***    0.442**     −2.638*     −2.811*
                    (0.076)     (0.069)     (0.051)     (0.050)     (0.050)     (0.050)     (0.055)     (0.015)     (1.257)     (1.15)
QUANTITY SOLD                   0.160***    0.150***    0.185***    0.186***    0.185***    0.185***    0.185***    0.185***    0.185***
                                (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.013)     (0.001)     (0.001)     (0.001)
ITEM COST                                   −0.108***   −0.045***   −0.0448***  −0.045***   −0.045***   −0.045***   −0.045***   −0.045***
                                            (0.001)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)
DOLLAR VOLUME                                           −0.028***   −0.028***   −0.027***   −0.027***   −0.027***   −0.027***   −0.027***
                                                        (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.001)     (0.001)
VENDOR                                                              −0.1098***  −0.292***   −0.292***   −0.292***   −0.292***   −0.292***
                                                                    (0.015)     (0.047)     (0.047)     (0.047)     (0.047)     (0.047)
VENDOR COST                                                                     0.050***    0.050***    0.050***    0.050***    0.050***
                                                                                (0.012)     (0.012)     (0.012)     (0.012)     (0.012)
REGION ONE                                                                                  −0.265***   −0.190**    −0.156**    −0.139**
                                                                                            (0.052)     (0.058)     (0.056)     (0.051)
REGION TWO                                                                                  0.000       0.037       0.062       0.092*
                                                                                            (0.053)     (0.052)     (0.050)     (0.046)
DENSITY                                                                                                 0.034*      0.040**     0.033**
                                                                                                        (0.015)     (0.014)     (0.013)
VARIETY                                                                                                             0.009*      0.009**
                                                                                                                    (0.003)     (0.003)
DAYS                                                                                                                            0.000**
                                                                                                                                (0.000)¥

Random effects
Store               0.034***    0.033***    0.032***    0.032***    0.032***    0.032***    0.017***    0.015***    0.013***    0.011***
                    (0.008)     (0.008)     (0.008)     (0.008)     (0.008)     (0.008)     (0.004)     (0.004)     (0.003)     (0.003)
Category            0.332***    0.263***    0.110***    0.104***    0.104***    0.103***    0.103***    0.103***    0.103***    0.103***
                    (0.065)     (0.055)     (0.028)     (0.027)     (0.027)     (0.027)     (0.027)     (0.027)     (0.027)     (0.027)
Residual            0.891***    0.839***    0.826***    0.821***    0.821***    0.821***    0.821***    0.821***    0.821***    0.821***
                    (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)     (0.002)

−2 Log likelihood   1,006,833.0 984,673.9   978,762.5   976,592.9   976,538.5   976,521.9   976,496.2   976,490.5   976,484.2   976,476.4

Notes. Standard errors are shown in parentheses. IRI, QUANTITY SOLD, and ITEM COST are all transformed into their natural log. ¥ DAYS coefficient is 4.6 × 10⁻⁴, and its standard error is 1.7 × 10⁻⁴.
* p < 0.05, ** p < 0.01, *** p < 0.001.
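As a reading aid for Table 2, the sketch below recomputes three quantities from the table: the share of variance at each level in the null model (Model 1), the proportional reduction in each variance component between Model 1 and Model 10, and a likelihood-ratio statistic for two nested models from the −2 log-likelihood row. The inputs assume the decimal placement 0.034, 0.332, 0.891 and so on, the helper names are hypothetical, and the one-degree-of-freedom chi-square p-value is a textbook calculation rather than output from the SAS runs described in the article. Model 10 adds a single predictor (DAYS) to Model 9, which is why one degree of freedom is used.

```python
import math

# Variance components (Store, Category, Residual) as read from Table 2.
MODEL_1 = {"store": 0.034, "category": 0.332, "residual": 0.891}
MODEL_10 = {"store": 0.011, "category": 0.103, "residual": 0.821}

# -2 log-likelihood values for two nested models (Models 9 and 10).
NEG2LL_MODEL_9 = 976_484.2
NEG2LL_MODEL_10 = 976_476.4


def variance_shares(components):
    """Share of total variance attributable to each level."""
    total = sum(components.values())
    return {level: var / total for level, var in components.items()}


def variance_explained(null, full):
    """Proportional reduction in each variance component and in the total."""
    result = {level: (null[level] - full[level]) / null[level] for level in null}
    result["total"] = (sum(null.values()) - sum(full.values())) / sum(null.values())
    return result


def lr_test_one_df(neg2ll_reduced, neg2ll_full):
    """Likelihood-ratio statistic and chi-square p-value for one added parameter."""
    statistic = neg2ll_reduced - neg2ll_full
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


shares = variance_shares(MODEL_1)
explained = variance_explained(MODEL_1, MODEL_10)
lr_stat, lr_p = lr_test_one_df(NEG2LL_MODEL_9, NEG2LL_MODEL_10)

print({k: round(100 * v, 1) for k, v in shares.items()})
print({k: round(100 * v, 1) for k, v in explained.items()})
```

The shares round to 2.7% (stores), 26.4% (product categories), and 70.9% (items), and the reductions round to 67.6% (stores), 69.0% (product categories), and 25.6% (total), which agree with the percentages reported in the text.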

2.7% across stores (Model 1, Table 2). We observe significant variation in IRI across stores after controlling for variance in IRI across product categories ($\tau_{c00} = 0.034$, $p < 0.0001$). The presence of significant between-store variance in IRI allows us to test whether this variance is associated with select store characteristics, as hypothesized.

5.4.2. Conditional Models. We introduce predictors across columns in Table 2 in a stepwise fashion, introducing the item-level predictors first followed by store-level predictors. We report the proportion of variance explained by adding each hypothesized predictor in Table D1 of the e-companion. We assess model significance by examining the difference in negative log-likelihood between two nested models.

First, we specify models in which we add each item-level variable iteratively (Models 2–5, Table 2) to test our hypotheses about the relationship between an item's value, its annual selling quantity, its dollar-volume, and the distribution structure. For completeness, we test for other plausible interaction terms and find that a significant interaction term exists between the distribution structure and an item's value (Model 6, Table 2).

Next, we include store-level predictors (Models 7–10, Table 2). With these models, we assess the main effects of the hypothesized store-level factors such as inventory density, product variety, and audit frequency on IRI. We assume that the effects of our item characteristics on IRI are fixed across stores and product categories, rather than randomly varying, due to insufficient cell size among some store–product category combinations (Raudenbush and Bryk 2002). Model 10 (Table 2) is our complete model including
all predictors and controls. Equation (2) summarizes our conditional model with every store- and item-level predictor.

$$
\begin{aligned}
\mathrm{IRI}_{ijk} ={}& \theta_0 + b_{00j} + c_{00k} + e_{ijk} \\
&+ \pi_1 (\mathrm{QUANTITY\_SOLD}_{ijk}) + \pi_2 (\mathrm{ITEM\_COST}_{ijk}) + \pi_3 (\mathrm{DOLLAR\_VOLUME}_{ijk}) \\
&+ \pi_4 (\mathrm{VENDOR}_{i}) + \pi_5 (\mathrm{VENDOR\_COST}_{ijk}) \\
&+ \beta_{001} (\mathrm{REGION\_ONE}_{k}) + \beta_{002} (\mathrm{REGION\_TWO}_{k}) \\
&+ \beta_{003} (\mathrm{DENSITY}_{k}) + \beta_{004} (\mathrm{VARIETY}_{k}) + \beta_{005} (\mathrm{DAYS}_{k}) \qquad (2)
\end{aligned}
$$

where $\pi_1$–$\pi_5$ are the fixed item-level coefficients and $\beta_{001}$–$\beta_{005}$ are the fixed store-level coefficients.

Because results are generally stable from one model to the next, we focus on the results from Model 10 (Table 2). With this model, we are able to explain 25.6% of the total variance in IRI. Collectively, our item- and store-level predictors explain 67.6% of the variance in IRI across stores and 69.0% of the variance in IRI across product categories. We find that IRI is indeed associated with inventory management policies, distribution structure, and environmental complexity.⁴

⁴ See e-companion Appendix D for additional details on our estimation method, the robustness of our results using other estimation methods, and our findings pertaining to collinearity and aberrant data points. We replicate the analysis presented in this section using alternate measures of IRI in e-companion Appendix E. We also estimate our models using two subsets of the existing data set, namely, one subset including observations with only positive discrepancies and one subset with only negative discrepancies, see e-companion Appendix F.

Inventory Management. Hypotheses 1–4 predict that inventory auditing policies will be associated with IRI. Specifically, we posit that four factors pertaining to inventory auditing (e.g., an item's cost, its annual selling quantity, its dollar volume, and the frequency with which it is audited) are associated with IRI. Each of these factors is indeed significantly associated with IRI.

In support of Hypothesis 1, we find a positive main effect of quantity sold on IRI ($\pi_1 = 0.185$, $t = 142.58$). On average, the record inaccuracy for fast-moving, defined hereafter as the 90th percentile of annual selling quantity (or QUANTITY SOLD = 147), Gamma-shipped items is 2.1 times that of slow-moving, hereafter defined as the 10th percentile of annual selling quantity (or QUANTITY SOLD = 1), Gamma-shipped items, controlling for all other predictors at their mean.⁵

⁵ We have chosen to interpret our findings for "Gamma-shipped" items (i.e., VENDOR = 0) and for an "average region" (i.e., REGION = (−0.139 + 0.092 + 0)/3 = −0.016). All other independent variables are controlled at their mean; see Table 1 for these values.

Hypothesis 2 predicts that items contained in stores that are audited more frequently will exhibit lower levels of IRI. We find, in support of this hypothesis, that the days lapsed between audits is significantly and positively associated with IRI ($\beta_{005} = 4.6 \times 10^{-4}$, $t = 2.96$). Items within stores audited annually (DAYS = 360) have, on average, 12% greater record inaccuracy than items within stores audited semi-annually (DAYS = 180), controlling for all other predictors at their mean.

Confirming Hypothesis 3, we find that the coefficient on the main effect of item cost is both significant and negative ($\pi_2 = -0.045$, $t = -23.44$). The IRI of expensive, hereafter defined as the 90th percentile of item cost (or ITEM COST = $41.84), Gamma-shipped items is, on average, 57% less than inexpensive, hereafter defined as the 10th percentile of item cost (or ITEM COST = $0.98), Gamma-shipped items, controlling for all other predictors at their mean.

In support of Hypothesis 4, we find dollar-volume to be significantly and negatively associated with IRI ($\pi_3 = -0.027$, $t = -46.38$). Because dollar-volume is calculated by multiplying the cost of an item and its annual selling quantity, its significance has some implications for the main effect of quantity sold and item cost on IRI. We can no longer claim that magnitude of the effect of item cost on IRI or the magnitude of the effect of quantity sold on IRI is constant. Instead, the effect of item cost on IRI depends on the annual selling quantity of the item, and the effect of quantity sold on IRI depends on the cost of the item. Specifically, the effect of quantity sold on IRI is greater for inexpensive items than for expensive items. And the effect of item cost on IRI is more substantial for fast-moving items than it is for slow-moving ones. We find, on average, the record inaccuracy of fast-moving inexpensive items to be more than double (150% greater) that of fast-moving expensive items, controlling for all other variables at their mean. However, the record inaccuracy of slow-moving inexpensive items is only 56% greater, on average, than that of slow-moving expensive ones, controlling for all other variables at their mean. Similarly, when comparing the IRI of inexpensive fast-moving items to that of inexpensive slow-moving items, we find a greater difference in IRI than when comparing the IRI of expensive fast-moving items to that of expensive slow-moving items. Controlling for all other factors at their mean, the IRI for inexpensive fast-moving items
is nearly four times greater, on average, than the IRI for inexpensive slow-moving items and the IRI for expensive fast-moving items is only two times greater, on average, than the IRI for expensive slow-moving items, controlling for all other factors at their mean.

Distribution Structure. The distribution structure is significantly associated with IRI as posited in Hypothesis 5 ($\pi_4 = -0.292$, $t = -6.2$). The coefficient is negative, suggesting that items shipped direct from the vendor to Gamma retail stores are more accurate than those shipped from Gamma-owned DCs. This supports the proposition of Baker et al. (2002) that external suppliers have more incentive to invest in process quality than do retail-owned DCs. The IRI of a Gamma-shipped item is 25% greater, on average, than a similar vendor-shipped item, controlling for all other factors at their mean.

We also test for the presence of interaction terms between an item's cost and the distribution structure. The presence of a significant positive coefficient on the interaction between an item's cost and the distribution structure ($\pi_5 = 0.05$, $t = 4.07$) reveals that the impact of distribution structure on IRI is not the same across all items. Instead, the more expensive the item, the less difference in IRI exists between Gamma-shipped and vendor-shipped items. We find the IRI for expensive Gamma-shipped items to be 19% greater, on average, than for similar vendor-shipped items, controlling for all other factors at their mean. This gap increases to 47%, on average, for inexpensive items, holding other factors at their mean.

Environmental Complexity. Hypotheses 6 and 7 posit that stores will differ in their record accuracy depending on the degree of complexity of the work environment. The two measures of complexity we use are inventory density and product variety. We find both inventory density and product variety to be significant and positively associated with IRI ($\beta_{003} = 0.033$, $t = 2.80$; $\beta_{004} = 0.009$, $t = 2.98$; respectively). Items within stores that have high levels of inventory density (DENSITY = 11, the 90th percentile) exhibit record inaccuracy levels 25% greater, on average, than items within stores with low levels of inventory density (DENSITY = 6.5, the 10th percentile), controlling for all other factors at their mean. Similarly, items within stores that have high levels of product variety (VARIETY = 358, the 90th percentile) exhibit record inaccuracy levels 13% greater, on average, than items within stores with low levels of product variety (VARIETY = 349, the 10th percentile), controlling for all other factors at their mean.

6. Discussion
We provide evidence that the problem of record inaccuracy is substantial in magnitude in the retail con-

considerable portion of firm profit. The extent of the IRI problem at Gamma, a well run retailer, suggests that IRI exists in other retail contexts. The adverse effect of IRI is not limited to the operational performance of the retail chain. It may also impact the performance of upstream supply chain partners. Inaccurate retail sales and inventory data can confound manufacturers' efforts to manage retail inventory levels under vendor-managed inventory programs and render ineffective collaborative planning, forecasting, and replenishment programs. Such efforts, whereby retailers and manufacturers share sales and inventory data to optimize the supply chain, are often touted as a solution to the bullwhip effect (Lee et al. 1997). Many manufacturers even incorporate retailer sales and inventory data into their production planning process.

Thus, IRI is an operational problem that warrants greater attention from management scholars and managers. Innovative solutions to this problem are needed. Such solutions may include novel IRI prevention or correction techniques, changes in the design of store operating environments to make it more difficult for such discrepancies to arise, or recommendations for attaining error-free compliance among store and DC workers executing replenishment tasks. Retail managers can benchmark across stores to determine the practices that lead to low levels of inaccuracy or across vendors to identify quality control techniques employed in the order fulfillment process.

Management scholars may also be able to design robust decision support tools to account for the existence of IRI. Traditional inventory and assortment planning models assume that inventory positions are known with certainty and only account for the uncertainty in lead times or demand. Our research, by providing a detailed, empirical characterization of IRI in the retail context, can inform researchers wishing to incorporate error characteristics into robust planning models (see, for example, DeHoratius et al. 2008, Kök and Shang 2007).

Several limitations to our study exist. First, we study several stores of the same firm. Advantages of field research within one organization include the use of detailed, firm-specific data and a deep understanding of the study context gained through observations and interview. Moreover, in testing our hypotheses, we could control for firm-specific factors that influence IRI such as firm ownership, information technology, and incentive design. However, we are unable to claim statistically that our results will hold beyond Gamma. Our choice of a random effects model does allow us to generalize to the entire population of Gamma stores, but we would require data from several firms to make any claims beyond Gamma. That
text. And the resulting lost sales can amount to a said, we do have anecdotal evidence that leads us to
believe that the observed level of IRI is not unusual and that our hypothesized relationships will hold beyond Gamma. Specifically, results from this study have been presented to retail executives on numerous occasions, and they have consistently acknowledged our findings. Nevertheless, future research should provide additional evidence from other firms regarding the incidence, magnitude, and drivers of IRI in other retailing contexts.

In addition to concerns over the external validity of our study, we also recognize several threats to its internal validity. We are unable to control for all factors that might impact the variation in IRI across stores and product categories. It is plausible, for example, that employee tenure and training, store manager practices, as well as other unmeasured factors may influence store-level IRI. Variation in IRI across items and across product categories may be influenced by the cubic volume of the item because bulky items are more difficult to steal and/or lose than compact items, by their case pack size (single item shipment vs. multiple item shipment), or other unmeasured factors. Future research opportunities include exploring the impact of these additional factors while controlling for the variables included in this study.

Crime, regional management, and DC practices are other factors that may impact IRI. Crime in the form of theft influences store error in that it generates a discrepancy between recorded and actual inventory quantities for the stolen item. Similarly, regional management may influence IRI by enforcing different inventory management practices (e.g., management of one region could monitor store activities such as checking of stocked-out items differently than another region). Finally, the capability of each of the management teams running the DCs affiliated with each region may differ. Perhaps one DC developed its own unique policies and practices that prevent common errors in store order fulfillment that led to lower IRI among the stores it served. To what extent do DC processes play a significant role in ensuring the accuracy of store records? Future research on these factors may offer valuable insights into the practical problem of IRI in the retail context.

Further research opportunities include linking IRI to financial or non-financial performance measures. Do stores with higher levels of IRI exhibit better or worse performance (e.g., store profit, customer satisfaction)? What actions can a retailer take to reduce IRI, how much will those actions cost, and will the gains from improved IRI offset these costs? Several of these questions have been explored through simulation (DeHoratius et al. 2008, Kang and Gershwin 2005, Kök and Shang 2007, Fleisch and Tellkamp 2005) but have yet to be addressed empirically. Field research that could identify the drivers of improved IRI over time would be of immense interest to academics and practitioners alike.

7. Electronic Companion
An electronic companion to this paper is available as part of the online version that can be found at http://mansci.journal.informs.org/.

Acknowledgments
The authors thank the many Gamma employees who participated in this study. This paper benefited from discussions with Carliss Baldwin, Jim Bettman, Vishal Gaur, Saravanan Kesavan, Laura Kornish, Richard Lai, Adam Mersereau, Steve Miller, Chris Nachtsheim, Terri Pigott, Linus Schrage, Bill Simpson, Judy Singer, and Sarah Woolverton. The authors also acknowledge the substantial contributions of the associate editor and three anonymous referees.

References
Afifi, A. A., V. Clark. 1997. Computer-Aided Multivariate Analysis. Chapman and Hall, Boca Raton, FL.
Alchian, A. A., H. Demsetz. 1972. Production, information costs, and economic organization. Econom. Rev. 62 777–795.
Anderson, J. R. 2004. Cognitive Psychology and Its Implications. Worth Publishers, New York.
Anderson, S. 1995. Measuring the impact of product mix heterogeneity on manufacturing overhead cost. Accounting Rev. 70 363–388.
Ang, S., S. Slaughter, K. Y. Ng. 2002. Human capital and institutional determinants of information technology compensation. Management Sci. 48 1427–1445.
Arnold, J. R. T., S. N. Chapman. 2004. Introduction to Materials Management. Prentice Hall, Upper Saddle River, NJ.
Backes, R. W. 1980. Cycle counting—A better method for achieving accurate inventory records. Production Inventory Management 21 36–44.
Baker, G., R. Gibbons, K. J. Murphy. 2002. Relational contracts and the theory of the firm. Quart. J. Econom. 117 39–84.
BearingPoint. 2002. Inventory management 2002. Chain Store Age 79 3A–6A.
Bensaou, M., N. Venkatraman. 1995. Configurations of interorganizational relationships: A comparison between U.S. and Japanese automakers. Management Sci. 41 1471–1492.
Bernard, P. A. 1985. Cycle counting. Production Inventory Management 26 27–41.
Bertrand, J. W. M., H. P. G. Van Ooijen. 2002. Workload based order release and productivity: A missing link. Production Planning Control 13 665–678.
Brooks, R. B., L. W. Wilson. 1993. Inventory Record Accuracy. Oliver Wight Publications, Essex Junction, VT.
Brown, R. G. 1967. Decision Rules for Inventory Management. Holt, Rinehart, and Winston, New York.
Clark, K. B., R. H. Hayes. 1988. Recapturing America’s manufacturing heritage. Calif. Management Rev. 30 9–33.
Coase, R. H. 1937. The nature of the firm. Economica 4 386–405.
DeHoratius, N., A. J. Mersereau, L. Schrage. 2008. Retail inventory management when records are inaccurate. Manufacturing Service Oper. Management. Forthcoming.
DeHoratius, N., A. Raman. 2007. Store manager incentive design and retail performance: An exploratory investigation. Manufacturing Service Oper. Management 9 518–534.
Delaney, P. F., L. M. Reder, J. J. Staszewski, F. E. Ritter. 1998. The strategy-specific nature of improvement: The power law applies by strategy within task. Psych. Sci. 9 1–7.
Emma, C. K. 1966. Observations on physical inventory and stock record error. Interim Report 1, Department of Navy Supply Systems Command, Mechanicsburg, PA.
Ernst, R., J. Guerrero, A. Roshwalb. 1993. A quality control approach for monitoring inventory stock levels. J. Oper. Res. Soc. 44 1115–1127.
Fisher, M. L., C. Ittner. 1999. The impact of product variety on automobile assembly operations: Empirical evidence and simulation analysis. Management Sci. 45 771–786.
Fisher, M. L., A. Raman, A. S. McClelland. 2000. Rocket science retailing is almost here—Are you ready? Harvard Bus. Rev. 78 115–124.
Fleisch, E., C. Tellkamp. 2005. Inventory inaccuracy and supply chain performance: A simulation study of a retail supply chain. Internat. J. Production Econom. 95 373–385.
Flores, B. E., D. C. Whybark. 1986. Multiple criteria ABC analysis. Internat. J. Oper. Production Management 6 36–46.
Flores, B. E., D. C. Whybark. 1987. Implementing multiple criteria ABC analysis. J. Oper. Management 7 79–85.
Flynn, B., E. J. Flynn. 1999. Information-processing alternatives for coping with manufacturing environment complexity. Decision Sci. 30 1021–1052.
Galbraith, J. 1973. Designing Complex Organizations. Addison-Wesley Publishing, Menlo Park, CA.
Garvin, D. A. 1988. Managing Quality. The Free Press, New York.
Hanna, M. D., W. R. Newman. 2001. Operations Management. Prentice Hall, Upper Saddle River, NJ.
Hayes, R. H., K. B. Clark. 1986. Why some factories are more productive than others. Harvard Bus. Rev. 64 66–73.
Hollinger, R. C., L. Langton. 2003. National Retail Security Survey. http://www.crim.ufl.edu/research/srp/srp.html.
Iglehart, D. L., R. C. Morey. 1972. Inventory systems with imperfect asset information. Management Sci. 18 B388–B394.
Johnson, J. R., R. A. Leitch, J. Neter. 1981. Characteristics of errors in accounts receivable and inventory audits. Accounting Rev. 56 270–293.
Kang, Y., S. B. Gershwin. 2005. Information inaccuracy in inventory systems: Stock loss and stockout. IIE Trans. 37 843–59.
Ketzenberg, M., R. Metters, V. Vargas. 2000. Inventory policy for dense retail outlets. J. Oper. Management 18 303–316.
Kleinbaum, D. G., L. L. Kupper, K. E. Muller, A. Nizam. 1998. Applied Regression Analysis and Other Multivariable Methods. Duxbury Press, Pacific Grove, CA.
Kök, A. G., K. Shang. 2007. Inspection and replenishment policies for systems with inventory record inaccuracy. Manufacturing Service Oper. Management 9 185–205.
Krafcik, J. F. 1988. Triumph of the lean production system. Sloan Management Rev. 30 41–51.
Kurt Salmon Associates. 2000. Inventory management. Chain Store Age 79 3B–6B.
Lee, H. L., V. Padmanabhan, S. Whang. 1997. The bullwhip effect in supply chains. Sloan Management Rev. 38 93–102.
Levy, M., B. A. Weitz. 2001. Retailing Management. Irwin McGraw-Hill, Boston.
Liao, H., A. Chuang. 2004. A multilevel investigation of factors influencing employee service performance and customer outcomes. Acad. Management J. 47 41–58.
Lieberman, M. B., L. Demeester. 1999. Inventory reduction and productivity growth: Linkages in the Japanese automotive industry. Management Sci. 45 466–485.
Lindsley, W. B., J. D. Blackburn, T. Elrod. 1991. Time and product variety competition in the book distribution industry. J. Oper. Management 10 344–362.
MacDuffie, J. P., K. Sethuraman, M. L. Fisher. 1996. Product variety and manufacturing performance. Management Sci. 42 350–369.
Miller, D. P., A. D. Swain. 1987. Human error and human reliability. G. Salvendy, ed. Handbook of Human Factors. John Wiley & Sons, New York, 219–250.
Millet, I. 1994. A Novena to St. Anthony, or how to find inventory by not looking. Interfaces 24 69–75.
Morey, R. C. 1985. Estimating service level impacts from changes in cycle count, buffer stock, or corrective action. J. Oper. Management 5 411–418.
Morey, R. C., D. A. Dittman. 1986. Optimal timing of account audits in internal control. Management Sci. 32 272–283.
Nahmias, S. 2001. Production and Operations Analysis. Irwin, Chicago.
Neeley, P. S. 1983. A framework for cycle counting. Production Inventory Management 24 23–32.
Newell, A., P. S. Rosenbloom. 1981. Mechanisms of skill acquisition and the law of practice. J. R. Anderson, ed. Cognitive Skills and Their Acquisition. Lawrence Erlbaum, Hillsdale, NJ, 1–55.
Payne, J. W., J. R. Bettman, E. J. Johnson. 1993. The Adaptive Decision Maker. Cambridge University Press, Cambridge.
Raman, A., G. Zotteri. 2000. Estimating retail demand and lost sales. Working paper, Harvard Business School, Boston.
Raman, A., N. DeHoratius, Z. Ton. 2001. Execution. Calif. Management Rev. 43 136–154.
Raudenbush, S. W., A. S. Bryk. 2002. Hierarchical Linear Models. Sage Publishing, Thousand Oaks, CA.
Reason, J. 1990. Human Error. Cambridge University Press, Cambridge, UK.
Regans, R., L. Argote, D. Brooks. 2005. Individual experience and experience working together. Management Sci. 51 869–881.
Rinehart, R. F. 1960. Effects and causes of discrepancies in supply operations. Oper. Res. 8 543–564.
Rommel, G., J. Kluge, R. Kempis, R. Diederichs, F. Bruck. 1995. Simplicity Wins: How Germany’s Mid-Sized Industrial Companies Succeed. Harvard Business School Press, Boston.
Rout, W. 1976. That damn storeroom. Production Inventory Management 17 22–29.
Schonberger, R. J. 1982. Japanese Manufacturing Techniques. Free Press, New York.
Schrady, D. A. 1970. Operational definitions of inventory record accuracy. Naval Res. Logist. Quart. 17 133–142.
Seltzer, M. H. 1994. Studying variation in program success. Eval. Rev. 18 342–361.
Sheppard, G. M., K. A. Brown. 1993. Predicting inventory record-keeping errors with discriminant analysis: A field experiment. Internat. J. Production Econom. 32 39–51.
Simon, H. A. 1955. A behavioral model of rational choice. Quart. J. Econom. 69 99–118.
Singer, J. D., J. B. Willett. 2003. Applied Longitudinal Data Analysis. Oxford University Press, New York.
Skinner, W. 1974. The focused factory. Harvard Bus. Rev. 52 113–120.
Srinivasan, K., S. Kekre, T. Mukhopadhyay. 1994. Impact of electronic data interchange technology on JIT shipments. Management Sci. 40 1291–1304.
Steidtmann, C. 1999. The new retail technology. Discount Merchandiser 39 23–24.
Swain, A. D., H. E. Guttmann. 1983. Handbook of Human Reliability Analysis with Emphasis on Nuclear Power Plant Applications. NUREG/CR-1278, U.S. Nuclear Regulatory Commission, Washington, D.C.
Tallman, J. 1976. A practical approach to installing a cycle inventory program. Production Inventory Management 17 1–16.
Tushman, M. L., D. A. Nadler. 1978. Information processing as an integrating concept in organizational design. Acad. Management Rev. 3 613–624.
Upton, D. M. 1997. Process range in manufacturing. Management Sci. 43 1079–1092.
Vachon, S., R. D. Klassen. 2002. An exploratory investigation of the effects of supply chain complexity on delivery performance. IEEE Trans. Engrg. Management 49 218–230.
Vollmann, T. E., W. L. Berry, D. C. Whybark, F. R. Jacobs. 2005. Manufacturing Planning Control. Irwin McGraw-Hill, Boston.
Williamson, O. 1975. Markets and Hierarchies. The Free Press, New York.
Woolsey, G. 1977. The warehouse model that couldn’t be and the inventory that couldn’t be zero. Interfaces 7 14–17.
Young, S. T., W. D. Nie. 1992. A cycle-count model considering inventory policy and record variance. Production Inventory Management J. 33 11–16.
