Information Life Cycle Management
By Paul P. Tallon and Richard Scannell
MANAGEMENT
A new framework, created to enhance information life cycle management by clarifying the relationship between value-at-risk and total cost of ownership, helps IT managers build a winning storage environment.
Stage of ILM      Information Value   Storage Architecture   Storage Concerns and Issues
1. Capture        Low                 Near-line storage      Ownership of data; Creation policies
2. Application    High                Primary storage        Available capacity; Access and performance;
                                                             Backup and recovery times; Efficiency and
                                                             utilization; Security and access
3. Decline        Medium to Low       Tape, optical          Retention policies; Security and access

Understanding information life cycle management (ILM).

information needs and to structure their storage spending in a way that meets those needs. The underlying premise of ILM is that information follows a natural life cycle from capture through application and decline (see the table here). At each point in its life cycle, the issue for ILM is to identify the value of its information and how best to protect that information from loss and corruption. In this way, storage is like an insurance policy whose cost mirrors the value of the underlying asset and the risk that value will decline due to adverse events.

Unfortunately, the complex task of valuing information has forced firms to apply ILM using cost criteria alone. Thus, firms tend to spend more on fault-tolerant hardware and backup monitoring for data in regular use, the perception being that frequent use implies higher value. Data that is no longer in use or that is perceived to have lower value is archived onto inexpensive media such as magnetic tape or is deleted outright. Intel, for example, uses a 35-day email retention policy for its employees; after this period, email messages are automatically deleted regardless of their perceived value to the end user or the firm as a whole.

CONSIDERATION OF RISK

A problem with this cost-centric approach is its failure to consider risk. In the event of a systems or media failure, time to recovery and point of recovery (that is, the age of the last backup) may vary widely. For example, hot sites routinely provide synchronous mirroring of data in real time, but this entails a much greater level of investment than RAID devices or periodic backing up to tape. Delayed recovery may not be an issue for low-value data such as payroll records or social calendars, but for critical data such as stop-loss orders in a brokerage firm, any delay could prove embarrassing and lead to severe financial penalties. From a legal viewpoint, there is also a possibility that courts will order that electronic records be provided to opposing counsel within a certain time frame; failure to comply with such orders can prove costly [12].

If risk is overlooked, firms have no way of knowing if they are spending too much or not enough on protecting their data. In an era of regulatory oversight and paranoia over data loss, firms are unlikely to be risk seekers. Storing high-value data on unreliable, albeit less expensive, media constitutes a risk that even the most reckless firms are unlikely to accept. For risk-averse firms, a desire to have all data recoverable in real time is impractical and cost prohibitive, especially if the volume of data is expected to rise sharply. Risk neutrality represents a compromise where firms are neither exposed to inordinate levels of uninsured risk nor spending vast sums on storage solutions to eliminate risk entirely. Risk neutrality does not mean that firms are indifferent to risk; rather, they are cost-effectively insured against all known risks.

In theory, risk neutrality assumes perfect knowledge of all adverse events that could put the value of information at risk. For example, firms must know the probability of calamitous events such as terrorist attacks or natural disasters, and the probability that the storage systems and backup media will perform as expected. In practice, bounded rationality and less-than-perfect foresight mean that all risks will never be fully quantifiable [2]. The best that a firm can do is to review historical outage patterns and extrapolate to a level of storage spending that protects against as many future adverse events as possible. In reality, this task is rarely performed with any appreciable degree of accuracy and so, adapting from a similar problem facing fund management firms, we outline a framework that balances the risk associated with data loss or corruption against the storage spending that is meant to prevent or contain such adverse events.

VALUE-AT-RISK

The task of measuring the risk to information value and deciding how additional storage spending may be able to reduce that risk to more tolerable levels is not unlike the task facing fund managers within the financial services sector. Fund managers know that fund values will vary based on market conditions. Managers are willing to accept some losses, but only within predefined limits. To help establish this limit, fund managers apply value-at-risk (VaR), a measure that summarizes the worst loss over a target horizon with a given level of confidence [9]. For example, using historical data, managers may determine that with 99% confidence, the worst percentage loss a fund is likely to suffer is 5%, or $5M on a $100M fund. If, on any given day, the fund value falls by more than 5%, managers may opt to use a hedging
strategy to guard against further losses. VaR, expressed in absolute or relative percentage terms, acts as a trigger for corrective action, but it also shows how much a firm can spend to protect itself if VaR is exceeded. For example, if a 6% decline in the value of a fund creates a $1M loss above what is expected at a 99% VaR level, fund managers know they can spend up to $1M to neutralize this loss. In practice, it is impossible to insure against all adverse market events, but setting a VaR level at 95% or above gives managers an opportunity to identify how the most severe market perturbations might damage their portfolio.

To see how VaR can be adapted to a storage environment and used to understand the risks facing a firm and the value of its information, consider Figure 1, showing a typical distribution of storage-related risks.

[Figure 1. Probability distribution of adverse storage events. The vertical axis shows frequency; the horizontal axis shows the extent of loss, from low (minor inconvenience) through medium (business disruption) to high (catastrophic repercussions). At the 95% confidence level, 5% of storage-related events will exceed VaR.]

Recognizing that valuing adverse events is not an exact science (for instance, it can be difficult to accurately assess how much an hour of CRM downtime costs a firm), it is nevertheless possible to create a probability distribution of storage-related events and their cost to the firm from backup and restore logs, help desk tickets, and end-user surveys. As seen on the left side of the diagram, most events are of minor significance and have no lasting effects on the firm; accidental deletions and restoring earlier versions of files are typical examples. Meanwhile, other events can disrupt business activities, leading to losses in the form of missed sales, court-imposed fines, or expedited data recovery fees. For example, in 2005, Morgan Stanley's failure to report email messages as part of an investor lawsuit led to a jury award of $1.45 billion (this was reversed on appeal in early 2007) [12]. Lastly, a handful of events can be catastrophic if they directly threaten the survival of the firm. Events such as the terrorist attacks of September 11, 2001, or natural disasters have been known to destroy data centers (as happened in New Orleans following Hurricane Katrina), meaning that firms may have to bear significant additional cost in re-creating files from transaction-level data.

Using this distribution, firms can compute VaR at, for example, the 95% confidence level. For the time period covered by this data, each firm must then decide whether its level of storage investment is reasonable, given the worst-case scenario that VaR represents. A firm that previously tried to reduce its storage spending by migrating data to less expensive media or using less frequent backups may find that VaR has jumped as end users face longer recovery times and greater disruption. On the other hand, a firm may find that an earlier decision to increase spending by adopting more reliable technology or to pursue a strict backup regimen has contributed to a decline in VaR in the current period. In a financial services setting, the VaR on an investment can be altered with hedging strategies, but only at a cost to the firm [9]. In the case of information value, VaR can be manipulated through storage spending. As firms assess the risk to their information from adverse events such as hardware failure or data corruption, their goal must be to link their current storage spending to VaR in order to decide if their current spending is too low, in which case VaR is dangerously high, or if spending is excessive, in which case VaR is unnecessarily low.

LINKING VAR TO STORAGE COSTS

Since many firms have felt compelled to pursue ILM on the basis of cost criteria alone, total cost of ownership (TCO) is a standard metric for evaluating storage environments. Fixed and variable costs are accumulated for a defined period of time and divided by throughput, number of users, data center capacity, or footprint to yield a measure of cost utilization. Although hardware costs are less than 30% of TCO [1], vendors continue to market hardware as a way to lower TCO while, in reality, service or labor costs (least impacted by innovation) are the primary factors in TCO. Chargeback systems routinely use TCO to assign storage costs to end users, and so these costs are identifiable with some degree of accuracy.

Relating storage TCO to VaR is complex. The critical task is to determine how VaR responds to a change in TCO. For firms to be fully insured against adverse events, the goal is to uncover a level of TCO where a marginal increase in storage spending (due to greater use of skilled labor, more frequent backups, increased monitoring, or more fault-tolerant hardware) is matched exactly by a marginal decrease in VaR. As firms expand their storage costs, VaR declines
COMMUNICATIONS OF THE ACM November 2007/Vol. 50, No. 11
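To make the calculation concrete, the empirical VaR computation described in the article (build a distribution of storage-event losses, then read off the worst loss not exceeded at the chosen confidence level) can be sketched as follows. The event costs, dollar units, and the nearest-rank percentile method are invented for illustration and are not drawn from the article:

```python
import math

# Hypothetical event log: cost (in $K) of each adverse storage event recorded
# over the period -- mostly minor (accidental deletions, single-file restores),
# a few disruptive, one near-catastrophic. All values are invented.
event_losses = [0.2, 0.3, 0.5, 0.5, 1, 1, 1, 2, 2, 3,
                3, 4, 5, 6, 8, 10, 15, 40, 120, 750]

def empirical_var(losses, confidence=0.95):
    """Empirical VaR via the nearest-rank percentile: the smallest recorded
    loss that at least `confidence` of all events do not exceed."""
    ordered = sorted(losses)
    rank = math.ceil(confidence * len(ordered))  # 1-based nearest-rank index
    return ordered[rank - 1]

var_95 = empirical_var(event_losses)                # worst loss at 95% confidence
beyond = [x for x in event_losses if x > var_95]    # the 5% tail of events

print(f"95% VaR: ${var_95}K; events exceeding it: {beyond}")
# Mirroring the fund example in the text, the amount by which the worst
# observed event exceeds VaR bounds the additional storage spending the firm
# could justify to neutralize that excess loss.
```

In practice, a firm would populate `event_losses` from the backup and restore logs, help desk tickets, and end-user surveys the article identifies as the raw material for the distribution.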