
INFORMATION LIFE CYCLE MANAGEMENT
A new framework, created to enhance information life cycle management by clarifying the relationship between value-at-risk and total cost of ownership, helps IT managers build a winning storage environment.

By Paul P. Tallon and Richard Scannell

In the aftermath of the "IT Doesn't Matter" debate [3], a consensus has emerged that if
corporate IT assets such as hardware, software, and networks are susceptible to replication
by competitors, a competitive advantage can only
come from the information created by such assets
[11]. Meanwhile, a precipitous decline in data
storage costs of 45% per annum, on a cost-per-gigabyte basis [6], has satiated firms' desire to
capture information on multiple business transactions and relationships. Moreover, the
much anticipated adoption of RFID and recent regulatory reforms such as SEC Rule
17a-4, mandating the retention of electronic communications records (email, VoIP, and
IM) in securities trading firms for up to three years, are likely to accelerate the pace of
data accumulation even further. In information-rich sectors such as credit card lending,
retail, and health care, data growth has already begun to outstrip the decline in hardware costs, prompting a net increase in storage spending [8]. With storage now consuming 12% to 15% of IT budgets, CIOs fear that further increases could erode strategic
IT spending [4].
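The arithmetic behind this squeeze can be made concrete. The sketch below assumes, purely for illustration, that storage spending is stored volume multiplied by cost per gigabyte; that simplification and the function names are not taken from the article. Under it, a 45% annual fall in unit cost [6] keeps spending flat only while data grows by no more than about 82% per year.

```python
def breakeven_growth(cost_decline):
    """Annual data growth rate at which spending stays flat, given a
    fractional annual decline in cost per gigabyte (e.g. 0.45 for 45%)."""
    return 1.0 / (1.0 - cost_decline) - 1.0


def spending_change(data_growth, cost_decline):
    """Fractional change in spending when volume grows by `data_growth`
    while unit cost falls by `cost_decline` (assumes spend = volume * unit cost)."""
    return (1.0 + data_growth) * (1.0 - cost_decline) - 1.0


if __name__ == "__main__":
    # Growth needed just to hold spending level under a 45% unit-cost decline.
    print(f"Break-even data growth: {breakeven_growth(0.45):.1%}")
    # Hypothetical firm whose data doubles in a year (100% growth).
    print(f"Spending change if data doubles: {spending_change(1.0, 0.45):+.1%}")
```

A firm whose data doubles each year would, under these assumptions, see spending rise by roughly a tenth despite the falling unit cost.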
Faced with higher storage costs and burgeoning data growth, the concept of information life cycle management (ILM) has emerged to help management understand their

COMMUNICATIONS OF THE ACM November 2007/Vol. 50, No. 11

65

information needs and to structure their storage spending in a way that meets those needs. The underlying premise of ILM is that information follows a natural life cycle from capture through application and decline (see the table here). At each point in its life cycle, the issue for ILM is to identify the value of its information and how best to protect that information from loss and corruption. In this way, storage is like an insurance policy whose cost mirrors the value of the underlying asset and the risk that value will decline due to adverse events.

Table. Understanding information life cycle management (ILM).

Stage of ILM     Information Value   Storage Architecture   Storage Concerns and Issues
1. Capture       Low                 Near-line storage      Ownership of data; creation policies
2. Application   High                Primary storage        Available capacity; access and performance; backup and recovery times; efficiency and utilization; security and access
3. Decline       Medium to Low       Tape, optical          Retention policies; security and access

Unfortunately, the complex task of valuing information has forced firms to apply ILM using cost criteria alone. Thus, firms tend to spend more on fault-tolerant hardware and backup monitoring for data in regular use, the perception being that frequent use implies higher value. Data that is no longer in use or that is perceived to have lower value is archived onto inexpensive media such as magnetic tape or is deleted outright. Intel, for example, uses a 35-day email retention policy for its employees; after this period, email messages are automatically deleted regardless of their perceived value to the end user or the firm as a whole.

CONSIDERATION OF RISK
A problem with this cost-centric approach is its failure to consider risk. In the event of a systems or media failure, time to recovery and point of recovery (that is, the age of the last backup) may vary widely. For example, hot sites routinely provide synchronous mirroring of data in real time but this entails a much greater level of investment than RAID devices or periodic backing up to tape. Delayed recovery may not be an issue for low-value data such as payroll records or social calendars but for critical data such as stop loss orders in a brokerage firm, any delay could prove embarrassing and lead to severe financial penalties. From a legal viewpoint, there is also a possibility that courts will order that electronic records be provided to opposing counsel within a certain time frame; failure to comply with such orders can prove costly [12].

If risk is overlooked, firms have no way of knowing if they are spending too much or not enough on protecting their data. In an era of regulatory oversight and paranoia over data loss, firms are unlikely to be risk seekers. Storing high-value data on unreliable, albeit less expensive, media constitutes a risk that even the most reckless firms are unlikely to accept. For risk-averse firms, a desire to have all data recoverable in real time is impractical and cost prohibitive, especially if the volume of data is expected to rise sharply. Risk neutrality represents a compromise where firms are neither exposed to inordinate levels of uninsured risk nor spending vast sums on storage solutions to eliminate risk entirely. Risk neutrality does not mean that firms are indifferent to risk; rather, they are cost-effectively insured against all known risks.

In theory, risk neutrality assumes perfect knowledge of all adverse events that could put the value of information at risk. For example, firms must know the probability of calamitous events such as terrorist attacks or natural disasters and the probability that the storage systems and backup media will perform as expected. In practice, bounded rationality and less-than-perfect foresight mean that all risks will never be fully quantifiable [2]. The best that a firm can do is to review historical outage patterns and extrapolate to a level of storage spending that protects against as many future adverse events as possible. In reality, this task is rarely performed with any appreciable degree of accuracy and so, adapting from a similar problem facing fund management firms, we outline a framework that balances the risk associated with data loss or corruption against storage spending that is meant to prevent or contain such adverse events.

VALUE-AT-RISK
The task of measuring the risk to information value and deciding how additional storage spending may be able to reduce that risk to more tolerable levels is not unlike the task facing fund managers within the financial services sector. Fund managers know that fund values will vary based on market conditions. Managers are willing to accept some losses but only within predefined limits. To help establish this limit, fund managers apply value-at-risk (VaR), a measure that summarizes the worst loss over a target horizon with a given level of confidence [9]. For example, using historical data, managers may determine that with 99% confidence, the worst percentage loss a fund is likely to suffer is 5% or $5M on a $100M fund. If, on any given day, the fund value falls by more than 5%, managers may opt to use a hedging
strategy to guard against further losses. VaR, expressed in absolute or relative percentage terms, acts as a trigger for corrective action but it also shows how much a firm can spend to protect itself if VaR is exceeded. For example, if a 6% decline in the value of a fund creates a $1M loss above what is expected at a 99% VaR level, fund managers know they can spend up to $1M to neutralize this loss. In practice, it is impossible to insure against all adverse market events but setting a VaR level at 95% or above gives managers an opportunity to identify how the most severe market perturbations might damage their portfolio.

To see how VaR can be adapted to a storage environment and used to understand the risks facing a firm and the value of its information, consider Figure 1 showing a typical distribution of storage-related risks. Recognizing that valuing adverse events is not an exact science (for instance, it can be difficult to accurately assess how much an hour of CRM downtime costs a firm), it is nevertheless possible to create a probability distribution of storage-related events and their cost to the firm from backup and restore logs, help desk tickets, and end-user surveys. As seen on the left side of the diagram, most events are of minor significance and have no lasting effects on the firm; accidental deletions and restoring earlier versions of files are typical examples. Meanwhile, other events can disrupt business activities, leading to losses in the form of missed sales, court-imposed fines, or expedited data recovery fees. For example, in 2005, Morgan Stanley's failure to report email messages as part of an investor lawsuit led to a jury award of $1.45 billion (this was reversed on appeal in early 2007) [12]. Lastly, a handful of events can be catastrophic if they directly threaten the survival of the firm. Events such as the terrorist attacks of September 11, 2001 or natural disasters have been known to destroy data centers (as happened in New Orleans following Hurricane Katrina), meaning that firms may have to bear significant additional cost in re-creating files from transaction-level data.

[Figure 1. Probability distribution of adverse storage events. Vertical axis: frequency. Horizontal axis: extent of loss (low, medium, high), with the corresponding business consequences (minor inconvenience, business disruption, catastrophic repercussions). VaR is marked at the 95% confidence level; 5% of storage-related events will exceed VaR.]

Using this distribution, firms can compute VaR at, for example, the 95% confidence level. For the time period covered by this data, each firm must then decide whether its level of storage investment is reasonable, given the worst-case scenario that VaR represents. A firm that previously tried to reduce its storage spending by migrating data to less expensive media or using less frequent backups may find that VaR has jumped as end users face longer recovery times and greater disruption. On the other hand, a firm may find that an earlier decision to increase spending by adopting more reliable technology or pursuing a strict backup regimen has contributed to a decline in VaR in the current period. In a financial services setting, the VaR on an investment can be altered with hedging strategies but only at a cost to the firm [9]. In the case of information value, VaR can be manipulated through storage spending. As firms assess the risk to their information from adverse events such as hardware failure or data corruption, their goal must be to link their current storage spending to VaR in order to decide if their current spending is too low, in which case VaR is dangerously high, or if spending is excessive, in which case VaR is unnecessarily low.

LINKING VAR TO STORAGE COSTS
Since many firms have felt compelled to pursue ILM on the basis of cost criteria alone, total cost of ownership (TCO) is a standard metric for evaluating storage environments. Fixed and variable costs are accumulated for a defined period of time and divided by throughput, number of users, data center capacity, or footprint to yield a measure of cost utilization. Although hardware costs are less than 30% of TCO [1], vendors continue to market hardware as a way to lower TCO while, in reality, service or labor costs (least impacted by innovation) are the primary factors in TCO. Chargeback systems routinely use TCO to assign storage costs to end users and so these costs are identifiable with some degree of accuracy.

Relating storage TCO to VaR is complex. The critical task is to determine how VaR responds to a change in TCO. For firms to be fully insured against adverse events, the goal is to uncover a level of TCO where a marginal increase in storage spending (due to greater use of skilled labor, more frequent backups, increased monitoring, or more fault-tolerant hardware) is matched exactly by a marginal decrease in VaR. As firms expand their storage costs, VaR declines
consistent with faster recovery times and a reduction in the level of risk associated with systems outages. Since storage costs tend to increase in large rather than small dollar increments, the result is a downward sloping step-function linking VaR to TCO, as shown in Figure 2. If TCO is low, reflecting an environment where storage costs have been excessively reduced, an increase in TCO will help to reduce VaR to more manageable levels. However, consistent with the law of diminishing marginal returns, at some point the marginal benefit from greater spending will be negligible. Beyond this point, spending is simply wasteful. The likelihood of unplanned events means that some risks will remain and so, in absolute terms, VaR will approach a minimum (non-zero) threshold. Equally, TCO can never fall to zero due to certain fixed costs associated with embedded systems (for example, PC hard disks).

[Figure 2. Understanding information life cycle management (ILM). Vertical axis: value at risk. Horizontal axis: TCO. The downward sloping step-function runs from a region of excessive VaR (low TCO) to a region of excessive TCO (low VaR), with a marked point at which a marginal increase in TCO matches a marginal decline in VaR.]

To create this curve in reality, firms begin by plotting VaR (taken from their risk data in Figure 1) against their existing TCO. This yields a single point on the curve. Next, firms engage in a series of what-if exercises by asking how VaR might have changed if storage spending had been increased or decreased by a certain amount. This may seem hypothetical but, in reality, firms may already be doing this exercise when investigating how certain severe outages occurred and how they can be prevented in future. For example, firms may discover that a certain percentage increase in storage spending would have prevented or limited the most financially punitive outages, essentially reducing VaR. However, to fully appreciate the inverse link between VaR and TCO, it is not enough to ask what can happen to VaR if TCO is increased. It is also essential to ask how much higher VaR might be if TCO were reduced. This hypothetical situation may seem unusual but high TCO could mean that firms have overinsured themselves against very small risks.

DYNAMIC INFORMATION VALUE
Even after a firm uncovers an optimal balance between VaR and TCO, this balance can become distorted by changes in information value. ILM recognizes that information can increase and decrease in value over time, often in dramatic fashion. For example, in the airline industry, the value of a passenger manifest will fall to zero the instant that a flight has landed safely. Equally, in the pharmaceutical industry, the value of clinical trial data will increase once a drug application moves to the next stage of FDA approval, although the data itself is unchanged. Consequently, VaR will fluctuate based on where information is in its natural life cycle and how quickly it moves through that life cycle.

Similarly, legislation and the threat of lawsuits have altered the value of archival data to the extent that penalties and fines can be imposed if data is lost or unavailable to investigators. Hence, archival data may retroactively increase in value if a firm receives a court order to hand over its data, as was the case with Morgan Stanley where the discovery of 1,600 undocumented backup tapes was seen as evidence of an attempted cover-up [12].

As a consequence of an increase in the value of information, the curve linking VaR and TCO will shift up and out, away from the origin. As seen in Figure 3, if a firm maintains the same level of TCO as before, making no changes to its storage architecture or storage practices, VaR will increase because of an increase in risk. The probability of a system outage may not have changed but the cost associated with an outage will increase commensurate with an upward shift in information value. To re-establish equilibrium between VaR and TCO, a firm must either increase spending around its existing systems (for example, by expanding the frequency and scope of data backups or improving service and support) or transfer the information to a safer and more secure set of technologies.

When information falls in value as it nears the end of its useful life, the curve will shift down and in. If a firm maintains the same level of storage spending as before, it will have overinsured itself against relatively minor risks. The firm will still need to protect its data but not with the same level of spending as before. The simplest solution is to reduce TCO by transferring the data to less expensive media. This will allow VaR to increase to the point where VaR and TCO are again in equilibrium.

IMPLEMENTING ILM IN PRACTICE
Despite its intuitive appeal, ILM remains challenged by the complex nature of information value. At one extreme, all data is valuable when viewed through
the eyes of end users who feel their data must be secured at all costs. Rather than haphazardly throwing money at an ever-increasing mountain of data, estimated to be increasing by two exabytes (10^18 bytes) annually or 400MB for each of the earth's five billion inhabitants [10], firms can use an analysis of VaR to obtain an objective overview of different instances when data was unavailable or when users were impacted by system failures.

[Figure 3. Responding to change in information value. Vertical axis: value at risk. Horizontal axis: TCO. After an increase in information value, constant TCO leads to excessive VaR; after a decrease in information value, constant TCO will lead to unnecessarily low VaR.]

ILM tries to match storage system capabilities with information value but as information value is resistant to measurement, erring on the side of increased storage spending constitutes the lesser of two evils. VaR, meanwhile, can be derived with some degree of accuracy on the basis that it is easier, for instance, to estimate the cost of an hour of CRM downtime than to accurately predict the value of a CRM application over its entire lifespan. Even if VaR and information value correlate, VaR is not a proxy for information value. It is true that some firms may invest in storage systems in order to improve information accessibility, accuracy, and relevance [5] (ultimately seeking to boost sales or to enhance customer service and support [7]) but few look to storage systems as a competitive differentiator when the underlying hardware is easily replicable.

Storage remains an unavoidable cost of doing business. As such, VaR recognizes that storage is meant to protect data from adverse events that could give rise to financially damaging business disruptions or worse [12]. In practice, differences in information value across applications such as payroll (low value), email (mid-level value), and CRM (high value) are managed using different hardware tiers; each tier offers a specific service level that matches the information value. Thus, high-value data is assigned to a premium tier where TCO and service levels are high while low-value data is assigned to a basic tier where TCO is lower.

To implement ILM is to determine a level of storage spending that fully insures firms against the consequences of data loss, corruption, or inaccessibility. If storage spending is seen as the premium on an information insurance policy, VaR represents the deductible on that policy. Perceptually, falling hardware costs have created a false belief that a firm is implementing ILM if it spends more on storage and saves all its data. Without some consideration of VaR, however, firms have no appreciation of the risks they face if their data is lost or corrupted. Consequently, VaR can significantly improve the implementation of ILM.

CONCLUSION
Faced with rapidly expanding mountains of data and new government regulation, IT managers are using ILM to bring order to a storage domain that has previously been ignored because of its low strategic value. If data and information are essential to a firm's strategic positioning, storage must be seen in a new light. In this article, we introduced a framework that enhances ILM by clarifying the relationship between VaR and TCO, a relationship that is often in flux because of the dynamic nature of information value and firms' growing desire to capture data on multiple aspects of their business. By considering this framework and VaR in particular, we argue that IT managers can forge an effective and secure storage environment.

REFERENCES
1. Allen, N. Don't waste your storage dollars: What you need to know. Gartner Group Research Report COM-13-1217, 2001.
2. Bernstein, P. Against the Gods: The Remarkable Story of Risk. John Wiley, New York, NY, 1998.
3. Carr, N. IT doesn't matter. Harvard Business Review 81, 5 (2003), 41-49.
4. CIO Insight. Is your IT budget being spent effectively? Feb. 2005, 67-75.
5. Davenport, T. Information Ecology. Oxford University Press, New York, NY, 1997.
6. Gilheany, S. The Decline of Magnetic Disk Storage Cost over the Next 25 Years. Berghell Associates, 2004; www.berghell.com/whitepapers/Storage%20Costs.pdf.
7. Glazer, R. Measuring the value of information: The information-intensive organization. IBM Systems Journal 32, 1 (1993), 99-110.
8. Goodwin, P. Enterprise SAN-attached storage: Market overview. Meta Group Report, 2003.
9. Jorion, P. Value at Risk: The New Benchmark for Managing Financial Risk, Second Edition. McGraw Hill, NY, 2001.
10. Lyman, P. and Varian, H. How Much Information? UC, Berkeley, School of Information Management and Systems, 2003; www.sims.berkeley.edu/research/projects/how-much-info-2003/.
11. Mata, F., Fuerst, W., and Barney, J. Information technology and sustained competitive advantage: A resource-based analysis. MIS Quarterly 19, 4 (1995), 487-505.
12. Wall Street Journal. How Morgan Stanley botched a big case by fumbling emails. May 16, 2005.

Paul P. Tallon (paul.tallon@bc.edu) is an assistant professor of Information Systems at the Wallace E. Carroll School of Management, Boston College, MA.
Richard Scannell (rscannell@glasshouse.com) is senior vice president, Sales and Marketing at GlassHouse Technologies, Inc., Framingham, MA.

© 2007 ACM 0001-0782/07/1100 $5.00
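The VaR calculation described under VALUE-AT-RISK and Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the nearest-rank quantile, the function names empirical_var and excess_over_var, and the sample loss figures are all assumptions introduced here. In practice the cost of each event would be estimated from backup and restore logs, help desk tickets, and end-user surveys.

```python
import math


def empirical_var(losses, confidence=0.95):
    """Return the loss that `confidence` of recorded events did not exceed.

    Uses the nearest-rank quantile (an assumed choice; other quantile
    conventions give slightly different answers on small samples).
    """
    if not losses:
        raise ValueError("at least one loss observation is required")
    if not 0.0 < confidence <= 1.0:
        raise ValueError("confidence must be in (0, 1]")
    ordered = sorted(losses)
    # round() guards against floating-point noise in confidence * n
    rank = max(1, math.ceil(round(confidence * len(ordered), 9)))
    return ordered[rank - 1]


def excess_over_var(loss, var):
    """Portion of a loss above VaR: the most it is worth spending to neutralize it."""
    return max(0, loss - var)


if __name__ == "__main__":
    # Hypothetical cost per storage event, in $K: mostly minor, a few severe.
    event_losses = [1, 1, 2, 2, 2, 3, 3, 4, 5, 5,
                    6, 8, 10, 12, 15, 20, 40, 75, 150, 900]
    print("95% VaR ($K):", empirical_var(event_losses, 0.95))

    # The article's fund example: a 99% VaR of 5% on a $100M fund is $5M,
    # so a 6% decline ($6M) exceeds VaR by $1M.
    print("Excess over VaR ($):", excess_over_var(6_000_000, 5_000_000))
```

With 20 recorded events, one event (5%) lies beyond the 95% VaR, which mirrors the tail marked in Figure 1.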
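The what-if exercise under LINKING VAR TO STORAGE COSTS, and the curve shifts under DYNAMIC INFORMATION VALUE, can be sketched the same way. Everything specific here is hypothetical: the scenario figures, the function names balance_point and rescale_var, and the simplifying assumption that VaR moves in proportion to information value. The sketch walks up the step function of (TCO, VaR) estimates and stops at the last step where the fall in VaR at least covers the rise in TCO.

```python
def balance_point(scenarios):
    """Return the (tco, var) scenario after which extra spending no longer pays.

    `scenarios` is a list of what-if estimates (tco, var) in the same units.
    Each step up in TCO is accepted only while the drop in VaR is at least
    as large as the added cost.
    """
    ordered = sorted(scenarios)  # ascending TCO
    if not ordered:
        raise ValueError("at least one scenario is required")
    best = ordered[0]
    for (tco_lo, var_lo), (tco_hi, var_hi) in zip(ordered, ordered[1:]):
        if var_lo - var_hi >= tco_hi - tco_lo:
            best = (tco_hi, var_hi)
        else:
            break  # diminishing returns: the next step costs more than it saves
    return best


def rescale_var(scenarios, value_factor):
    """Assumed model: VaR at every TCO level scales with information value."""
    return [(tco, var * value_factor) for tco, var in scenarios]


if __name__ == "__main__":
    # Hypothetical what-if estimates in $K: (annual storage TCO, 95% VaR).
    what_if = [(200, 900), (300, 600), (400, 450), (500, 400), (600, 390)]
    print("Current balance point:", balance_point(what_if))
    # Information doubles in value: the curve shifts up and out.
    print("After value doubles:", balance_point(rescale_var(what_if, 2.0)))
    # Information halves in value: the curve shifts down and in.
    print("After value halves:", balance_point(rescale_var(what_if, 0.5)))
```

In this made-up data the balance point moves to a higher TCO when information value rises and to a lower TCO when it falls, which is the direction of adjustment the article describes for Figure 3.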