Storage Basics: An Introduction to the Fundamentals of Storage Technology
Contents
Section 1 – The information society: saving data and knowledge at new levels
Section 2 – Tiered storage: intelligent information management in the company
Section 3 – Online storage: disks and reliability
Section 4 – Storage networks: spoilt for choice
Section 5 – Backup & Restore: an unloved compulsory exercise
Section 6 – Storage management: making complex storage networks manageable
Section 7 – Virtualization: some catching up is necessary regarding storage topics
Section 8 – The storage strategy of Fujitsu Siemens Computers and its partners
Forecast: Future storage trends
Glossary
As well as articles about storage from our partners Brocade, CA, EMC, NetApp, Sun and Symantec
Storage Basics
An introduction to the fundamentals
of storage technology
January 2009
Copyright
Fujitsu Siemens Computers 2009
Text, editing, production: ZAZAmedia / Hartmut Wiehr
Printed in Germany.
Published by
Fujitsu Siemens Computers GmbH
Mies-van-der-Rohe-Strasse 8
80807 Munich, Germany
Contact
www.fujitsu-siemens.com/contact
All rights reserved. Delivery options and technical specifications are subject to change. The names reproduced in this document may be trademarks whose use by third parties for their own purposes could violate the rights of the owners.
Contents
Preface
Section 1 – The information society: saving data and knowledge at new levels
Section 2 – Tiered storage: intelligent information management in the company
Section 3 – Online storage: disks and reliability
Section 4 – Storage networks: spoilt for choice
Section 5 – Backup & Restore: an unloved compulsory exercise
Section 6 – Storage management: making complex storage networks manageable
Section 7 – Virtualization: some catching up is necessary regarding storage topics
Section 8 – The storage strategy of Fujitsu Siemens Computers and its partners
Forecast: Future storage trends
Remarks
Glossary
Partners
Preface
Dear Reader,
Coming to grips with the ever-growing flood of data is still the greatest challenge as far as storage is concerned. This statement no longer applies only to major customers and their data centers, but also affects SME businesses and even smaller companies. Coming to grips means secure data storage and accessibility with due regard to agreed quality standards (service levels) and at a reasonable cost (CAPEX and OPEX). Many of the new technologies, such as ILM, deduplication, SSD, virtualization and thin provisioning, regardless of whether they are already established or still just hype, promise to help here.
We advise our customers to develop a storage solution that suits them from this variety of established and new technologies. We have a comprehensive portfolio of best-in-class products in this regard, but it is neither our intention nor are we able to do everything ourselves. The way in which we monitor technology helps us to make the right choices here. We have developed strategic partnerships with technology leaders in storage; we integrate their products into solutions and provide the appropriate services.
If we see that our customers have problems for which there is no suitable solution on the market, we then develop our own products and solutions. The CentricStor solutions are prime examples of this.
In addition, we have begun not only to develop and implement storage solutions at our customers, but also to operate them ourselves on our customers' behalf. Fujitsu Siemens Computers is set to increase its investment in this managed storage business. Our managed storage customers receive an invoice and the pertinent report every month. The advantages are quite obvious: transparency as regards costs and performance, improved efficiency and return on investment. This also gives us the advantage that we are becoming better at selecting and developing products that enable storage solutions to be run on a cost-optimized basis and at defined quality standards.
Thanks to this strategy Fujitsu Siemens Computers has also developed into one of
the most successful providers of storage solutions in Europe.
The aim of this book is to provide you with an introduction to storage technologies and storage networks, and to highlight the value they add for your company. We also introduce some of our partners in the storage sector and provide a comprehensive overview of the current storage product portfolio of Fujitsu Siemens Computers. Our products, and the associated services delivered by us or our certified partners, are the basis for storage solutions that help you contend with the growing flood of data!
Yours, Helmut Beck
Vice President Storage
Fujitsu Siemens Computers
Section 1
In 2008 the World Wide Web celebrated its 15th birthday, and Google, the Internet search engine company, turned 10. Such milestones represent a new element in the history of computer technology: nowadays, anyone can simply click a mouse and browse through a vast, almost indescribable amount of information and entertainment data at any time and anywhere. Search engines such as Google bring a certain amount of order to this knowledge chaos, and it was this particular invention by two Stanford students that revolutionized access to global knowledge within a very short time. [1]
Accessing data quickly assumes that the data can be read and is available somewhere on the World Wide Web. Its readability is based on file formats, classifications and index or metadata, which must be defined before the data can be accessed by the search engine. Its existence is based on the fact that it is stored somewhere on electromagnetic storage systems. That is only possible using state-of-the-art technology which has been developed over the last 60 to 70 years, i.e. information technology (IT).
Digitizing knowledge
The means and technologies used by man to document experience and knowledge, either for a short time or indeed long-term, have developed greatly throughout history, whereby the term knowledge is used here without any value attached to it, i.e. without an assessment of the content. However, the reason why humans or social groups pass on information is still the same.
And it is the advance of computers and the World Wide Web which has now moved society away from an industrial society to one based on information. The production of goods in the classical sense has moved more into the background, and it is now services and information which have established themselves as products. Distributing information and entertainment has become a separate, profitable business and has constantly changed the way in which society interacts [4].
More and more digital information is entered, saved and provided via networks. On the one hand, this growth is based on the amount of accrued company data. Accounting was always at the center of automation developments: some of the oldest clay tablets found in Persia contained book-keeping records, and the first computers were predominantly used for book-keeping purposes. Today IT has spread throughout companies like wildfire. More and more operational areas are now electronic, and each new sector produces an increasing amount of digital data. Depending on the business model, some companies today exist only virtually within computers, and that includes many Internet businesses. Classic companies started with solutions for ERP (Enterprise Resource Planning) and moved via CRM (Customer Relationship Management), SCM (Supply Chain Management), Data Warehousing and Business Intelligence to new areas in Web 2.0 and social media. It is frequently production-related and commercial data that is entered the most and processed in applications from SAP, Oracle or Microsoft.
This structured data, which is systematically stored in database fields, can be easily accessed via queries. Its evaluation and interpretation have become very complex due to the sheer quantity and constant expansion. This is why Data Warehousing exists, where the collected data is sorted and edited according to business intelligence criteria. For example, airline companies want to know, for marketing and flight organization purposes, how often their customers fly, their destinations and also their chosen class, i.e. business or economy. Databases alone do not provide such interpretations.
[Figure: orders of magnitude of digital storage, from megabyte (MB) and gigabyte (GB) via terabyte (TB) and petabyte (PB) to exabyte (EB). Source: Horison Information Strategies; UC Berkeley study "How Much Information?"; IDC]
The so-called Web 2.0 with its new interaction options for network participants, for example YouTube, MySpace, LinkedIn or Xing, will also result in huge data quantities being stored by the providers responsible. Most of today's blade servers and storage arrays are being shipped to such companies. This development will increase as new technologies expand, such as leased software (Software as a Service, SaaS) or cloud computing, where the user accesses programs and data stored in giant data centers somewhere in the Internet cloud. Medium-sized companies and start-ups will enjoy low-priced options that enable them to use such a sophisticated infrastructure.
Amazon, with its large data centers, is renting out computing and storage capacity to external customers. Of course, the appropriate network bandwidths must exist and the provider must be 100% reliable. It is clear that new technologies which are first used in the consumer environment will expand into the world of business IT. Risk analyses for particular situations are essential, especially when looking at security topics and cost savings.
The options of our information society have simply not yet been exhausted. New processes for transferring knowledge and storing information have joined the existing procedures [5]. Information technology has enormous potential but, just like all technical progress before it, it is simply a tool for specific purposes. It all depends on how it is used and for which objectives. First of all, it has to be ensured that data storage itself becomes more reliable. The storage example shows both the technological opportunities as well as the restrictions. And that is why, particularly in this environment, there is a whole range of new fundamental inventions and gradual improvements.
Section 2
Almost all the forecasts and estimates about rapid data and storage growth have so far proven to be true, despite the skepticism of many analysts and market observers. Such estimates have often proven to be even too conservative. In particular, recent years have shown that, in addition to company IT and its storage requirements, other groups in society now digitize their analog information. These include film and video, voice and music recordings, medical x-rays (medical imaging), TV cameras in major cities or at security-conscious locations (security & surveillance), as well as the conversion from analog to digital radio in police, fire brigade and rescue services. An additional factor is the so-called social networks (social communities), such as YouTube or Facebook, with their enormous amounts of photo and video data.
Digital information is being saved everywhere to a great extent, including in data centers which are expanding daily. But is all this data really worth saving? Do all those holiday, birthday and family snapshots, which used to bore us all at family slideshow evenings, really have to be saved on state-of-the-art technology for ever and ever? Does a company really have to save everything electronically, beyond what legal requirements specify?
According to IDC, the growth of all non-structured data (file-based), which is increasingly collected outside the company, will exceed that of structured data (block-based) for the first time in 2008. The balance within the company between structured data entered in databases or ERP applications and non-structured data resulting from e-mails (including attachments), Office files, presentations, videos and so on has also shifted.
[Figure: block-based vs. file-based data volumes in petabytes, 2005–2008. Source: IDC 2007. According to IDC, in 2008 file-based data will for the first time experience stronger growth than block-based data.]
The problem of retrieving structured data via database queries and interpreting it via Data Warehousing or Business Intelligence has today basically been solved, although there is no comparable solution for non-structured data [1]. The same cannot be said, however, for the actual storage of such data quantities: the main type of storage, i.e. fast access on expensive hard disks, is limited and assigned only to really business-critical applications.
Moving data
Moving data records and files from servers and applications to less performant storage areas is necessary for several reasons:
– New data materializes every day, indeed every hour, in business-critical applications. Booking, purchasing and sales data must all remain in direct access, yet after a certain period it becomes unimportant and even out-of-date. It then has to be moved for space reasons, i.e. from primary online storage to slower and lower-priced disk systems (nearline or secondary storage).
– Other files, such as presentations, are not necessarily business-critical, yet have to be stored nearby (nearline), as such files are often modified and used again and again. Such data is also a typical example of data that is saved several times and used or read by numerous employees. The analysts of the Enterprise Strategy Group (ESG) have developed the iceberg model for this: there are two main types of data, dynamic data which is continually changed (the visible part of the data iceberg) and permanent data which is static or fixed and will not be changed any more (the invisible part) [2].
– Legal regulations, or just careful commercial thinking, require long-term data storage without the contents having to be constantly available during daily business. Such data can move to an archive in one form or another. In the early days it was stored on tape; nowadays there are also so-called Virtual Tape Libraries, which simulate tape storage in disk systems. This is a third type of storage, but here the data is a long way from its original starting-point, the servers and primary disk storage units.
– Last but not least: data loss must be avoided, depending on the value of the data, by making immediate or delayed copies which can be recovered as required (backup, restore, snapshot, CDP/Continuous Data Protection). These procedures are complemented by up-to-date deduplication concepts, which fish out duplicate copies during the backup process, although only on the backup media and not on the primary or secondary disk storage units. More information about this complex topic can be found in section 5.
Even if a company only uses one aspect of these procedures, it is still using a tiered data backup system, as data is being moved, even if only from the server to a backup medium, and even though such a minimal approach would be considered inadequate from a professional storage viewpoint.
[Figure: the stored data moves increasingly out of direct access through the various tiers involved: from servers and fast primary storage via slower storage to backup mechanisms and archiving.]
… using software to classify data. EMC bought Documentum, a manufacturer of a document management solution (DMS), in order to set up packages for both its old and new customers. ILM can be seen as a continuation of HSM in the world of Unix and Windows. ILM manages data from its origin to its archiving or deletion and stores it on data media of varying speed and power, depending on its individual value. This type of storage management is based on HSM technology and uses company policies to establish an optimal match between data values and the respective storage subsystems. Even if companies are often not aware of it, they all practice some form of ILM. Even those who keep their data for a year or even longer on high-performance and expensive online storage have made a decision regarding the assumed value of their data. But whether such a decision can be justified is not clear, as in the meantime the data could have been saved on cheaper data media.
A similar approach is Tiered Storage, which is basically the same as HSM and ILM but looks more at the combination of IT infrastructure (the hardware basis) and data contents. The stored data moves increasingly out of direct access through the various tiers involved: from servers and fast primary storage (online storage for data that requires immediate business access) via slower storage (nearline storage for data that is only required occasionally) to backup mechanisms and archiving. Such a structure based on the value of the data exists in every company, whatever they may call it.
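How such a tiering policy might look in practice can be sketched in a few lines of Python. This is a minimal illustration only, not a real policy engine; the tier names, age thresholds and file attributes are invented for the example:

from dataclasses import dataclass

@dataclass
class FileInfo:
    name: str
    days_since_access: int
    business_critical: bool

def assign_tier(f: FileInfo) -> str:
    # Map a file to a storage tier according to a simple company policy.
    if f.business_critical and f.days_since_access < 30:
        return "tier 1: online storage (FC/SAS disk)"
    if f.days_since_access < 90:
        return "tier 2: nearline storage (SATA disk)"
    if f.days_since_access < 365:
        return "tier 3: backup (virtual tape library)"
    return "tier 4: archive (magnetic tape)"

files = [FileInfo("orders.db", 1, True),
         FileInfo("slides.ppt", 45, False),
         FileInfo("report_2005.pdf", 900, False)]
for f in files:
    print(f.name, "->", assign_tier(f))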
The recommendation of the SNIA Data Management Forum is to add intelligence to these tiering efforts by integrating them into a broader ILM-based practice. More information at www.snia.org/dmf.
HSM and ILM can be regarded as a high-level, indeed strategic approach which establishes the data backup stages and criteria in a justified sequence. Many manufacturers selling HSM or ILM promised their customers that they would, above all, reduce their storage costs. This refers to the classic storage processes, such as data saving, backup, restore and disaster recovery, which must be classified so that the data can be saved, according to its value for the company, on a storage medium of appropriate price and power. That is more easily said than done: how can someone decide which data should be stored at which stage, on which medium and for how long? [4]
The traditional data hierarchy was split in two: data saved on hard disks with direct and fast access, and backup or archive data saved on cheap tape, which is not in direct access and thus partially bunkered away somewhere. Those who select HSM or ILM as their strategy want to move away from this old concept and save data according to usage, i.e. its direct significance for the business process. Those who plan such a step with specific criteria can save money immediately [5].
Even if this approach is not always accepted [6], HSM and ILM have had an effect: Tiered Storage is today seen by companies as quite normal. A real hierarchy now almost completely dominates the world of storage. The two tiers have now become four or five and, in an ideal situation, they exactly reflect the value of the data on each tier against the corresponding costs. In other words, the hierarchy reaches from expensive primary storage (Fibre Channel and SAS disks) down to less expensive secondary storage (disk-to-disk, D2D) on SATA disks, which still has to be accessed via servers or applications, and then down to different forms of backup: either backup storage on cheap SATA disks, which takes over the function of the older type of tape backup (Virtual Tape Libraries, VTL), or classic backup and archive on magnetic tapes.
Tier 3: Sensitive data, about 25% of the data, moderate response times, SATA disk, IP-SAN (iSCSI), virtual tape libraries, MAID, periodic disk-to-disk-to-tape backups, 99.9% availability, recovery time objective: minutes, retention period: years.
Tier 4: Non-critical data, about 40% of the data, tape, FC-SAN or IP-SAN (iSCSI), 99.0% availability, recovery time objective: hours/days, retention period: unlimited.
HSM, ILM or Tiered Storage require clear-cut and continuous data classification. With non-structured data in particular, this can often only be handled manually, which in turn is much too expensive. The price of the equivalent software on the market, such as Data …
[Figure: tiered storage hierarchy with the key ILM components, a policy engine and a data mover. Mission-critical OLTP data resides on enterprise-class primary disk (99.999% availability, very high I/O and throughput, no scheduled downtime, recovery time objective in milliseconds, protected by mirroring, replication and CDP with synchronous and asynchronous remote copies). Vital, sensitive application data follows at 99.99% availability. Operational, fixed-content and backup/recovery reference data sits on secondary storage such as SATA disk and virtual tape (99.9% availability, less than 5 hours scheduled downtime per year, point-in-time snapshots and deduplication, recovery in seconds). Reference and archive data, e.g. video, medical and government-regulated fixed content, goes to tape, VTLs, SATA/JBOD and MAID for long-term retention (99.0% availability, 10 or more hours scheduled downtime per year, recovery in hours to days, remastering). The amount of data per tier grows, and the probability of reuse and the value of the data fall, with the average days since creation (from 0 to 30+ days).]
It is increasingly important to understand that the value of data changes throughout its lifetime.
Therefore, where data should optimally reside and how it should be managed changes during its
lifespan.
Section 3
In the early days of electronic data processing, which took place exclusively on mainframes, punch cards were originally used to save data. This was followed in 1952 by magnetic tapes. Both methods were based on binary data (consisting of two digits: 1 or 0), either by punching or not punching a hole in paper cards or by magnetizing or not magnetizing spots on the tape. This form of storage is still fundamental today, as computers can only handle information which has been reduced or converted to such a binary system. In other words: this numbering system consists of just a 1 and a 0, because the heart of the computer, namely the processor, can only operate on such a basis. This means in turn that data storage on this basis is inherently somewhat insecure.
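As a small illustration of this binary principle, the following Python snippet shows how a piece of text is reduced to the ones and zeros that are ultimately represented as punched holes or magnetized spots (a sketch only; real media add encoding and error-correction layers on top):

# Every stored character is ultimately a pattern of two states: 1 or 0.
for ch in "IT":
    bits = format(ord(ch), "08b")  # 8-bit binary code of the character
    print(ch, "->", bits)          # I -> 01001001, T -> 01010100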
Magnetic tapes were fast and could save what was in those days a large amount of data, namely five megabytes (= 5 million characters, corresponding to several books or the complete works of Shakespeare; see the overview in section 1). But as early as 1956 an alternative data medium started to appear: the magnetic disk, the forerunner of the modern hard disk. The first data medium of this type consisted of a stack of 51 disks with a diameter of 60 cm (IBM RAMAC). Hard disks as known today have several rotating platters arranged above each other in an airtight housing. In contrast to magnetic tapes, where data is written and read sequentially, which slows down access and retrieval, the data is written to coated disks. A read/write head is moved by a motor above these disks and can skip to any position required. In contrast to sequential storage, this type of access is known as random access and is much faster.
[Figure: worldwide hard disk shipments by interface (Parallel SCSI, SAS, Fibre Channel, ATA/SATA) up to 2009, in percent. According to Gartner Dataquest, the SAS interface is evolving into the number one disk technology, while the previously dominant Parallel SCSI is sinking into insignificance.]
The first drives had no electronics of their own and were completely managed by an external controller. Today's standard hard disks are based on standards such as IDE/EIDE (Integrated Drive Electronics and Enhanced Integrated Drive Electronics) and ATA (Advanced Technology Attachment), which come from the consumer sector, or are known as SCSI disks (Small Computer Systems Interface), specially developed for enterprise use. Many different devices could be connected to a SCSI controller, from hard disks to scanners. The parallel data transfer rate was much higher than with the previous sequential transport methods. Since 2001 the ATA development SATA (Serial Advanced Technology Attachment) has become more widespread; here the data is no longer transferred in parallel but serially.
SATA hard disks now provide competition for Fibre Channel hard disks, as today they have a high degree of reliability and have fallen in price. Fibre Channel technology as a whole has been regarded as particularly powerful for enterprises since the introduction of storage area networks (SANs); FC disks read reliably and quickly. SAS disks (Serial Attached SCSI) today play a significant part in this professional sector, as they are gradually replacing SCSI disks. As they are compatible with SATA, both can be installed together in a joint array, which can result in tier 1 and tier 2 being combined within one single device.
[Table: hard disk types compared. Source: Horison Information Strategies]
Optimized for: Fibre Channel, SAS and SCSI – online storage and transaction data; SATA – low-end file storage.
Rotation speed: 10,000/15,000 rpm for Fibre Channel, SAS and SCSI; 7,200 rpm for SATA.
Seek time: 3–4.5 ms for Fibre Channel and SAS; 3–4 ms for SCSI; 8–10 ms for SATA.
Typical average access time: 5.5–7.5 ms for Fibre Channel, SAS and SCSI; 13–15 ms for SATA.
Power-on time (hours x days): 24 x 7 for Fibre Channel, SAS and SCSI; 10 x 5 for SATA.
I/O duty cycle: high for Fibre Channel, SAS and SCSI; low for SATA.
MTBF: more than 1.4 million hours for Fibre Channel, SAS and SCSI; 600,000 hours for SATA.
Maximum bus speed: 4 Gbit/s for Fibre Channel; 3 Gbit/s for SAS; 3.2 Gbit/s for SCSI.
Interactive error management: yes for Fibre Channel, SAS and SCSI; no for SATA.
The technical options of the various hard disk types have not yet been fully exploited, and SATA will probably expand further in professional storage, although it will in turn be replaced by SAS in nearline storage. The advantage of Fibre Channel is that, in addition to better equipment with internal microprocessors for mechanical and error control, it can be positioned far away from other devices in the storage network (up to 10 kilometers, whereas SCSI only reaches 25 meters). This was decisive in setting up Storage Area Networks (SANs) since the end of the nineties, as decentralized locations, such as buildings on extensive factory premises or within a town, could be connected to each other via the storage network. The use of IP protocols for storage networks has since extended the range of FC and SCSI/SAS hard disks to cover very large, global distances.
Only powerful FC, SAS and SCSI disks are used in online storage as part of the data and storage hierarchy [1]. Solid State Disks (SSDs), already installed by some manufacturers in their storage systems, are becoming significant as a kind of second cache (RAM) due to their high access rates. As they have no mechanical parts, they have a lifespan that is longer than that of classic hard disks. But they too reach their end: the SSD lifecycle comes to a conclusion after 10,000 to 1,000,000 write accesses [2], according to manufacturer specifications.
[Figure: annualized failure rate (AFR, in percent, up to 8%) of hard disks by age, from 3 months to 5 years. Not only the age of the disks is reflected in the annual failure rates, but also the different disk types.]
Even if the disks last somewhat longer than a three-year cycle in particular situations, it is advisable to make the change and recopy the data in good time. How significant are the (constantly falling) price of new disks and the administration hours involved in comparison to a self-inflicted data catastrophe whose costs could damage a company beyond belief?
[Table: milestones in hard disk capacity]
Company   Model               Year   Formatted capacity
IBM       350 RAMAC           1956   4.4 MB
IBM       1301                1962   21.6 MB
IBM       2302-3              1965   112 MB
STC       8800 Super Disk     1975   880 MB
IBM       3380                1981   1,260 MB
Seagate   Barracuda 180       2001   181.6 GB
HGST      7K500               2005   500 GB
HGST      7K1000              2007   1 TB
Seagate   Barracuda 7200.11   2008   1.5 TB
But what would happen if, during recovery from a disk failure, the spare disk or another disk in the array were also to fail? For such cases RAID 6 keeps a second parity calculation ready, which becomes active when a second disk fails. In such a situation, the performance of the controller drops by more than 30% compared with a simple RAID 5 failure.
Most manufacturers recommend a specific RAID configuration for their systems or applications. For example, Oracle recommends a combination of RAID 1 and 5 for its database in order to increase performance. RAID 3 is more suitable for video streaming, and NetApp has selected RAID 4 for its NAS filers, as very fast read/write actions can occur on the disks [4].
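The parity principle behind RAID 5, and in doubled form behind RAID 6, can be illustrated in a few lines of Python: the parity block is the bitwise XOR of the data blocks, and any single lost block can be recomputed from the surviving blocks. This is a simplified sketch; real controllers stripe data and rotate parity across the disks:

from functools import reduce

def xor_blocks(blocks):
    # Bitwise XOR over equally sized byte blocks, the basis of RAID 5 parity.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

disks = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks on three disks
parity = xor_blocks(disks)            # parity block on a fourth disk

# Disk 2 fails: its block is reconstructed from the other disks plus parity.
recovered = xor_blocks([disks[0], disks[2], parity])
assert recovered == disks[1]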
Storage networks must also be protected against misuse. Today's SAN and NAS infrastructures unfortunately have only rudimentary security mechanisms, both at the Fibre Channel level and on an iSCSI basis. They thus frequently do not meet the security policy requirements of the company's IT.
Zoning in FC-SAN switches ensures that access control is enforced for each storage system. This zoning can be run on a hardware or software basis. Soft zoning means that devices only receive information about those systems with which they are meant to exchange data. Hard zoning means that the hardware checks all packets and forwards them only to the permitted addresses. LUN masking is the function in an FC-SAN which makes only those storage areas visible to an application which the latter needs to carry out its tasks.
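A toy model may make the effect of soft zoning and LUN masking clearer. All device names and LUN numbers here are invented; real zoning is configured in the switch and LUN masking in the storage array, not in application code:

# Soft zoning: an initiator only learns about targets that share a zone with it.
zones = {
    "zone_db":   {"server_db", "array_port_1"},
    "zone_mail": {"server_mail", "array_port_2"},
}

def visible_targets(initiator):
    return {member
            for zone in zones.values() if initiator in zone
            for member in zone if member.startswith("array_")}

print(visible_targets("server_db"))       # {'array_port_1'}

# LUN masking: within a visible port, only selected LUNs are exposed to a host.
lun_masks = {("array_port_1", "server_db"): {0, 1}}

def visible_luns(port, host):
    return lun_masks.get((port, host), set())

print(visible_luns("array_port_1", "server_db"))   # {0, 1}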
With IP-SANs on an iSCSI basis, IPsec is used for authentication and for securing data streams, for example via encryption [5].
Mirroring (the RAID 1 principle, where one disk is an exact mirror of another) can also be applied to an entire storage system. An identical server can be positioned at a second location, possibly several kilometers away, and a storage landscape can be set up there as a contingency data center. All the data is constantly transferred from location A to location B so that the same data exists at both locations. In the event of a catastrophe, productive IT, including the stored data, is switched from A to B. As everything is redundant and mirrored, IT operations can be continued.
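Reduced to its essence, such a synchronous mirror acknowledges a write only after both locations have stored it, so both sites always hold identical data. A toy sketch, with two in-memory dictionaries standing in for the storage systems at locations A and B:

class SynchronousMirror:
    def __init__(self):
        self.location_a = {}   # primary data center
        self.location_b = {}   # contingency data center, kilometers away

    def write(self, block_id, data):
        self.location_a[block_id] = data   # write at the primary site
        self.location_b[block_id] = data   # replicate before acknowledging
        return "ack"                       # confirmed only after both writes

mirror = SynchronousMirror()
mirror.write("block-17", b"booking data")
assert mirror.location_a == mirror.location_b   # both sites are identical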
The software elements of data backup are logically based on processes for backup and restore, Continuous Data Protection (CDP) and snapshots. (See section 5 for more details.) All these processes have their origin in the basic problem of storing data on electromagnetic media, such as hard disk or tape (and likewise DVD or Blu-ray): despite all the advantages of such technology, the stored data could suddenly disappear into thin air! An old medium such as paper can be longer lasting, and proven methods exist against dangers such as fire or natural catastrophes. Protecting electronic data media on a permanent basis against failure or damage, however, remains a tricky and never-ending story to which IT managers must give their full attention. There is no silver bullet.
Section 4
Hard disks that are installed in servers and PCs, or are directly connected to the servers in storage arrays, are still the most widespread structure in small to medium-sized companies; this setup is known as Direct Attached Storage (DAS). Small to medium-sized businesses have discovered the productive power of IT and use it for their business processes. At the same time, however, their financial resources limit their investments in an IT infrastructure of their own.
[Figure: distribution of storage topologies, with SAN at around 60%. Source: IDC, 2008. Even though SAN determines the topology of storage systems as a whole, the share of DAS is still large, particularly in small and medium-sized businesses.]
Furthermore, fewer experts are available, and those cannot specialize in the same way as their colleagues in large-scale companies, who each only have to support a small part of the IT. Consequently, small to medium-sized businesses do not follow every trend and concentrate on the basics. Although keeping a DAS structure does not meet the state of the art of current storage technology, it can be used as the starting point for a gradual transformation.
But what does DAS really mean? Users who connect one or more storage arrays per server have a dedicated, exclusive storage array for precisely the application that is installed on that server. This silo structure may become somewhat complex with time and use a great deal of space and energy in the server room or data center, but it is ultimately easy to monitor and manage. The disadvantage is obvious: if not split into several partitions, each directly attached storage unit is, just like the individual server, not used to full capacity. In the server, the superfluous capacity and computing power is held in reserve for peaks on special occasions, such as internal monthly billing or external accesses for web-based purchase orders in the pre-Christmas period; in storage, it is reserved for the corresponding write and read operations. In other words, there is a gap between the investment made and the benefit achieved. According to analysts, individual servers only run at about 15 to 20% capacity, and for storage the average value is about 40%.
In other companies, where a client/server infrastructure was implemented via the internal network, storage structures were provided in which several server units had access to the same storage arrays, but separately for mainframes and open systems (Unix computers, and later also Windows servers). However, the amounts of data to be moved in the local area network (LAN) became increasingly large, which was to the detriment of the transfer speed and caused data losses. Networks based on the Internet Protocol (IP), which had originally only been developed for the transport of messages [1], thus reached the limits of their capacity.
[Figure: PC clients connect via an Ethernet LAN to servers, which in turn access disk storage and tape storage over a storage area network. A storage area network (SAN) constitutes a separate storage infrastructure which is intended solely for data transport.]
In an FC network, special switches are given the task of connecting storage arrays with servers and also with each other. A switch works as a kind of multiple socket, to which various devices can be connected. [2] In contrast to the widespread image of Fibre Channel as difficult to set up and manage, specialists describe it as easy to handle. This is, for example, the opinion of Mario Vosschmidt, Technical Consultant with the American IT manufacturer LSI.
In its early days this Fibre Channel architecture was particularly linked with the Californian company Brocade, which was founded in 1995. Today, this company is the market leader in FC switches, which work as the nerve center of a SAN and were equipped with more intelligence in the course of their development. This means that such switches can take on tasks within the network, such as zoning or virtualization. With their help it is possible to set up a fabric, a structure that forms the core of a SAN.
One particular aspect is the configuration of different storage zones (zoning). The administrator can define which devices and data are, and which are not, to be connected with each other. This serves to protect against unauthorized access both inside and outside a company. If a SAN is extended, additional switches and zones can be set up, depending on the availability of ports (connections for cables). The name Director has become widely accepted for larger FC switches with at least 128 ports. Brocade has taken over several providers (McData, CNT and Inrange) who were greatly involved with directors [3]. The intention of the manufacturer with these purchases was to strengthen its market position vis-à-vis Cisco.
Cisco, the worldwide leader in Ethernet switches, has also had Fibre Channel solutions in its portfolio for several years and has thus positioned itself as a competitor to Brocade in Fibre Channel. Not for the first time in the history of information technology are the cards being re-shuffled between the companies involved, a recurring development that is being accelerated by a forthcoming new technology: currently, Fibre Channel over Ethernet (FCoE) is an attempt to bring together the separate networks for message transport (Ethernet or TCP/IP) and data storage (Fibre Channel and iSCSI) to again form a common network. IP-SANs on an iSCSI basis would already be in a position to do this, but the communication and storage transport networks are mostly kept separate for performance reasons.
A new FCoE network calls for standards and agreement between the various providers. However, before this is finally the case, hard-fought conflicts over market positioning will rage. Every manufacturer wants to be involved in FCoE, even if it means switching over almost completely to new products. Some providers have obviously still not forgotten that a previous rival technology of Ethernet, named Token Ring, lost the race because the manufacturers behind it concentrated too much on their core product and thus ultimately did not keep up with the competition [4].
The historical achievement of FC-SANs, which are the prevailing storage infrastructure in large companies and institutions today, consists in providing efficient, fast transport services that are less susceptible to errors. Although the technology is simple in comparison with a classic network, problems still frequently occur in practice because server and network administrators are too unfamiliar with data storage. Compared with Ethernet, Fibre Channel has ultimately remained a niche technology, in which there is still a lack of standards in many places. And over the last few years false expectations have in part been raised, because the obstacles (and prices) for FC training courses were set too high. The storage arrays attached to the SAN are mostly managed via the tools supplied by the manufacturers, which in turn only require a short familiarization period and are directly supported by the suppliers [5].
[Figure: in DAS, application, file system and disk storage reside together on one system; in NAS, the network sits between the application and the file system with its disk storage; in a SAN, the network sits between the file system and the disk storage. Source: Fujitsu Siemens Computers. Every topology pursues a different concept, but the goal is the same: protecting application data.]
[Table: SAN vs. NAS compared. SAN: Fibre Channel – complex, expensive, a closed but secure system; optimized for drives; carries all types of data; shares drives and resources; provides storage for servers (data center); supports all drive types. NAS: IP – simple implementation, cost-effective, an open system in which security must be borne in mind; TCP/IP is fast but carries very high overhead (up to 40% of net throughput); simple implementation, open and fast communication over long distances; carries only files; shares files and stored content; provides storage for clients (workgroups); supports only disks.]
… so the files and their contents and structures cannot be directly accessed, either. All PC users know that their data is archived in certain files and folders and thus has a logical structure. They also know that their data is ultimately scattered over the hard disk: after lengthy use the files are fragmented (distributed, torn apart), because with each storage process free blocks are occupied first, regardless of the context of the file content. Consequently, the operating system needs an increasingly long time to open files. First of all, the various blocks have to be found on the physical level and consolidated into one entity that is visible to the user. By using the Defragment command, the suffering Windows user puts things in order again on the hard disk, at least for a while.
In a NAS the focus is placed on the network functions [6] and less on the performance of the hard disks used. Many users consider it a lower-cost alternative to a SAN. Which version a company decides in favor of depends on a great many factors, some of them perhaps very individual. David Hitz, one of the founders and now Executive Vice President of Engineering at NetApp, expressed a frank opinion in an interview: NAS and SAN are like two flavors of the same ice cream. NAS is chocolate-flavored and SAN is strawberry-flavored. And everything the customer needs to know about the two technologies is only that both systems can be used at any time for data storage. What intelligent person would be disturbed by the fact that someone does not like chocolate-flavored ice cream, but prefers strawberry? [7] This somewhat flippant statement can also be interpreted in such a way that companies with SAN and NAS have two storage architectures to choose from, which can be individually adapted depending on their requirements. No-one needs to have any reservations.
[Table: DAS, NAS and SAN compared. DAS: not networked; up to 15 devices; copper cabling; low performance; poor scalability; maximum distance 25 m; SCSI protocol. NAS: networked; copper cabling; high performance; moderate scalability; Ethernet. SAN: networked; up to 126 devices; fiber-optic (glass) cabling; very high performance; very good scalability; distances up to 10 km; FCP.]
A third version has been under discussion for some years now: iSCSI networks for storage units (also known as IP-SANs) evidently overcame their lengthy introductory phase a year ago and have achieved significant sales figures. The attraction of this architecture is its ability to use the existing TCP/IP infrastructure for data storage. This makes the installation and maintenance of a second infrastructure (set up only for storage) superfluous, and administrators can fall back upon their existing IP know-how. In practice, however, there have been greater obstacles in integrating the various tasks of the LAN and the iSCSI storage network. Nevertheless, new prospects result from the new transfer speed of 10 Gbit/s for Ethernet, because this technology is currently faster than Fibre Channel, which at present only reaches 8 Gbit/s. However, customers incur additional costs due to the new cabling that becomes necessary. In the meantime, it is generally assumed that an iSCSI infrastructure is mainly suited to small and medium-sized companies and has found its true position there.
Section 5
The penetration of society and the economy by IT has only just begun. More and more parts of daily life, from communication and information procurement right through to healthcare, are dominated by IT systems, and economic processes today depend on electronic support in almost all branches of industry and in companies of all sizes. This interdependence of business processes and electronically generated and processed information makes it absolutely necessary for companies of all magnitudes to ensure secure data storage.
[Figure: causes of data loss. Hardware or system malfunction: 44%; human error: 32%; software corruption or program malfunction: 14%; computer viruses: 7%; natural disasters: 3%. Source: Fujitsu Siemens Computers. Software errors and viruses are relatively rarely the cause of data loss; most faults are caused by the hardware or system, followed by human error.]
In addition to the aforesaid external influences, the direct reasons for data loss are software errors, which impair data integrity or can even cause entire systems to fail, as well as various hardware errors, which can range from power supply units and processors [3] via hard disks [4] to other components, and even to redundant assemblies (on a double or multiple basis) such as hard disk arrays. Added to these are user and operating errors, which even proficient administrators can make, although a great deal in this regard disappears under a dense veil of silence. Which IT department and which company gladly admits to having done something wrong?
Quite apart from looming disasters, it is frequently these technical errors of everyday IT, or simply the end of the life-span of the components and media used, whose sudden expiry can also mean the demise of the data stored upon them. A separate branch of industry, which has become greatly consolidated over the course of the last few years, looks after the recovery of data from storage systems of various types [5].
In order to specify the durability of hard disks, their capacity and expected life-span are expressed in the term MTBF (Mean Time Between Failures), which is an extrapolated value for the probable failure of a drive. Many manufacturers specify values of up to one million hours and more for high-end drives, which would mean a biblical life expectancy of 114 years (= one million hours). The underlying tests assume a very high number of parallel disks, on the basis of which possible failure rates are calculated. However, the implied optimal conditions are in practice the exception, so real failure rates can be much higher. Redundancy through disk arrays (see section 3) and sophisticated backup mechanisms have to guard against this. Since the term MTBF has, on account of its inaccuracy, increasingly come under criticism, other units of measurement, for example AFR (Annualized Failure Rate), are used today. AFR is established in the same way as the MTBF, but specifies the anticipated annual failure rate as a percentage of the installed number of disks. If 8.7 out of 1,000 disks fail in a year, the annual failure rate or AFR is 0.87% [6].
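The arithmetic behind these two metrics is simple enough to reproduce (a sketch in Python, using the figures from the text):

HOURS_PER_YEAR = 8760

def mtbf_to_afr_percent(mtbf_hours):
    # Approximate annualized failure rate implied by an MTBF value.
    return HOURS_PER_YEAR / mtbf_hours * 100

print(round(1_000_000 / HOURS_PER_YEAR))        # 114 years of "life expectancy"
print(round(mtbf_to_afr_percent(1_000_000), 2)) # 0.88% AFR for 1 million hours MTBF

failed, installed = 8.7, 1000                   # the example from the text
print(failed / installed * 100)                 # 0.87% AFR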
The average life-span of hard disks is at present three to five years, in individual cases even longer. Companies should only rely on longer periods if they use automatic error controls with direct notification of the manufacturer's service, which, depending on the stipulated Service Level Agreement (SLA), ensures a replacement before the disk finally fails. As capacity increases, disks become more and more inexpensive (in summer 2008 Seagate announced a 1.5 TB disk), and the magnetic tapes that are still used in storage are becoming more efficient. For example, their throughput has in the meantime risen to more than 500 MB/s, while capacities are also clearly on the increase and now lie at about 1 TB (LTO-4). The life-span of magnetic tapes is specified as up to 30 years for DLT/SDLT and LTO and is thus clearly beyond that of hard disks and solid state disks [7].
[Figure: RPO and RTO surrounding a disaster event; the gap between them is narrowing. RPO = Recovery Point Objective (data loss), RTO = Recovery Time Objective (downtime). Source: Fujitsu Siemens Computers. RPO: the amount of data which has to be recovered after a data outage in order to be able to resume business as usual (measured in time). RTO: the maximum recovery time which can be tolerated before business must be resumed.]
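Expressed in code, RPO and RTO are simply two time spans around the disaster event (a sketch with invented timestamps):

from datetime import datetime

last_backup = datetime(2009, 1, 12, 2, 0)    # last successful backup
disaster    = datetime(2009, 1, 12, 14, 30)  # the outage occurs
back_online = datetime(2009, 1, 12, 18, 30)  # business resumes

print("RPO (data loss):", disaster - last_backup)   # 12:30:00
print("RTO (downtime): ", back_online - disaster)   # 4:00:00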
On the target
Deduplication takes place on the storage medium itself, which helps keep the data volume to be stored to a minimum, but the entire storage process is extended as a result.
At source
In addition to the low data volume, the other advantage here is that the reduced quantity of data can be transferred more quickly. This is relevant for branch offices, because in some cases only analog data links are available there. (A minimal sketch of the underlying mechanism follows below.)
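The mechanism is the same in both cases: data blocks are fingerprinted, and a block whose fingerprint is already known is stored only once and referenced thereafter. A minimal sketch of hash-based block deduplication (simplified; real products segment streams into variable-length chunks and handle collisions and garbage collection):

import hashlib

store = {}        # fingerprint -> unique block (the deduplicated pool)
stream = []       # the logical data stream, as a list of fingerprints

def dedup_write(block: bytes):
    fp = hashlib.sha256(block).hexdigest()
    if fp not in store:
        store[fp] = block   # physically store only previously unseen blocks
    stream.append(fp)       # the logical stream always grows

for block in [b"mail body", b"attachment", b"mail body"]:
    dedup_write(block)

print(len(stream), "blocks written,", len(store), "blocks stored")  # 3 written, 2 stored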
[Table: service classes and their service levels. Class 1 (Mission Critical): > 99.99% availability, downtime < 1 h/year or < 1 h/month, recovery time objective < 15 minutes, recovery point objective of seconds. Class 2 (Business Critical): 99.9% availability, downtime < 10 h/year or < 2 h/month, recovery time objective < 1 hour, recovery point objective of seconds. Class 3 (Business Important): 99% availability, downtime < 8 h/month, recovery time objective of hours, recovery point objective < 4 hours. Class 4 (Non Critical): 97% availability, no commitment or only intermittent operation, recovery time objective of up to 96 hours, recovery point objective of 24 to 48 hours. Further targets per class range from 97%, 95% and 90% down to unspecified, and from < 30 minutes via < 45 minutes and < 2 hours to not specified.]
Unlike a backup, which is intended for restore, in an archive you should be able to find data relatively easily and quickly without having to perform a restore.
The copies of the original data mirrored onto the backup media are usually only stored for a short period of time and are permanently replaced by more up-to-date copies from subsequent backup runs. As they are not actually intended for use, only a few specific requirements arise, such as currency or completeness. In the event of a disaster, all information from before the oldest existing backup and after the most recent one is irretrievably lost.
However, these data backups also do not need to be kept for a long time, because the copies are only needed in the event of a disaster. You will seldom have to restore a data backup that is a few months old.
Archiving is an altogether different matter. This storage procedure, which takes place when information leaves the productive process, is about preparing data on a separate medium and making it available for later use. Archiving media are not put to one side like backup tapes, whose use is only considered in extreme cases. Whoever archives has reuse at some later point in time as his objective.
Archiving can be done of one's own free will (the bandwidth ranges from precautionary storage for as yet unforeseeable purposes right through to sheer acquisitiveness) or because legislation, banks or other institutions have issued mandatory regulations or at least recommend archiving. Whoever does not want to use the archived data later, and also does not have to comply with regulations concerning any possible resubmission, should consider deletion, which saves resources, space and costs.
Section 6
The confusion in storage management software can possibly be overcome by keeping to the tools supplied by the manufacturers. Integrated configuration tools, web tools or component managers help at the start but do not cover the entire planning of the overall architecture and its elements. The large storage management suites for administering the entire IT infrastructure, as offered by some manufacturers, require a great deal of experience and are thus, on account of their complexity, only feasible for large-scale companies.
Planning and monitoring of storage installations should be set up systematically right from the beginning and include a constantly updated record of all phases, stages and modifications. This is necessary as otherwise chaos would ensue when staff change jobs. The IT Infrastructure Library (ITIL), with its standard requirements and procedures, is an important aid in documenting such processes. Such standards are also of assistance in discussions between manufacturers and customers or during company mergers,
when different IT worlds have to be united. ITIL comprises a series of publications which are aimed at helping a company to structure its IT and business processes.
For example, Change Management describes the procedure for the continuous management of modifications, which replaces simple but irregularly updated Excel tables and which is based on ITIL and the management tools supplied by the respective manufacturer. It is thus possible to avoid, right from the start, any storage wilderness with all types of products that are only flimsily interconnected. Such a lack of clarity results in errors and failures, and thus extra overnight work for employees. However, if standard solutions or Storage out of the Box are used, overall management is much simpler because suitable software is already in place.
Medium-sized companies have fewer financial resources, which can result in them using quick but not fully tested implementations for long periods, longer than is financially viable. It is no coincidence that it is these companies which choose technologies that they know, or at least seem to know: DAS is well established among SMEs today, and iSCSI is widespread here thanks to its close relationship with LAN architecture. As long as these are well-tested and proven solutions, such cautious behavior is certainly not wrong; it just makes the company less flexible than its larger competitors. The latter can afford well-trained employees and longer planning phases, which enables them to test new and more effective technologies for a longer period and then apply them productively. This particularly applies to information management, which requires both investment and know-how. Medium-sized customers can usually not afford most of the tools used in this sector. Their purchase would also hardly be sensible when compared with the data quantities that have to be managed.
Data management is thus frequently without good planning and basically chaotic. As a range of Windows licenses already exists, together with unused servers, non-structured data such as Office documents tends to be saved in a non-systematic manner, with all the ugly consequences regarding classification, indices and search options for such documents. Larger companies are one step further and provide dedicated servers in a SAN. As a SAN has already been configured, it probably also holds data which could be saved more cheaply on other storage levels with somewhat less performance; that would be chaos at a higher level. When the IT department realizes the mix-up, it usually adds its own negative element: additional NAS filers are now procured so that the non-structured data can be saved suitably. And then at some point the question arises about merging or integrating the various stand-alone storage solutions, in which sometimes the block level (SAN) and sometimes the file level (NAS) is the dominant element. These, too, have their own solutions, which in turn require more investment and basically add yet another layer of complexity to the overall storage architecture.
Storage Management
(1) Storage Resource Management: SRM (Storage Resource Management) initiatives began in earnest in the late 1990s. This was a Unix, Windows and later a Linux
market which held great promise 2-4 years ago but has faded in recent years. There
were several reasons why the 20+ SRM companies faded and lost momentum:
1. SRM products had a hard time moving from a reactive, reporting tool to a proactive tool that could make decisions and take actions based on user-defined
policies,
2. SRM products were mainly homogeneous, thus failing to provide support for
heterogeneous environments, and
3. SRM products only dealt with disk space allocation and lacked any insight into
disk performance issues. SRM users were worn down with all the alerts and
decisions that they had to perform manually.
Today's reality is that organizations will need to integrate a variety of vendor and homegrown tools. Storage organizations must accept that the structure of storage is going to be split up by vendor and type of array and that, organizationally, minimizing the number of vendors and storage pools is one way to reduce storage administration overheads.
(Fred Moore, Horison Information Strategies)
(2) Information Management
(Information Lifecycle Management) The discipline and function of oversight and
control of information resources.
Information management services: The processes associated with managing information as it progresses through various lifecycle states associated with a Business
Process. These services exploit information about data content and relationships in
making decisions. Examples include records management and content management
applications.
(SNIA Dictionary)
The consequence is that, due to such structures, storage management becomes difficult, as problems, failures, hardware and software faults cannot be immediately identified. Counter-measures are usually taken too late in such situations. However, monitoring, reporting and constant error analysis could be implemented with software support, whereby such tools are usually supplied by the manufacturers. If many different components are used, storage management is often faced with the problem of controlling such heterogeneity.
[Figure: IDC taxonomy of the storage market. Storage hardware comprises storage systems (disk systems, tape libraries, optical jukeboxes) and storage mechanisms (HDDs, tape drives, optical drives), plus removable media. Storage software comprises data protection and recovery, storage management, archive and HSM, storage replication, file systems, storage device management and other storage software. Storage services, the storage infrastructure services, comprise consulting, implementation, management and support. Source: IDC. According to IDC, storage can be divided into three main groups, to which various fields are assigned; storage management is only one of many areas.]
Among the elements of management are the sufficient supply of storage to users, departments and applications, as well as provisioning. The latter is understood as providing, and if necessary procuring, the required storage on time: sufficient storage arrays and hard disks must be available to meet all requirements. If, for example, seasonal peaks such as the Christmas business must be taken into account, this would mean low system usage during the other periods, which could lie 20 to 30% below the actual capacity. The technology known as Thin Provisioning has been developed to counteract this problem. These are procedures which plan the flexible, changing assignment of storage to various applications: capacities are assigned and withdrawn as required [1].
Stranded storage is storage which was originally provided for specific applications or users but never used after all. The objective of thin provisioning is basically to put stranded storage capacities back to use. Thin provisioning also uses virtualization [2], via which different physical hard disks and disk arrays are combined into logical or virtual units. This enables a very exact assignment of storage space, without having to consider the maximum and minimum physical disk limits.
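The mechanics can be sketched in a few lines of Python (a deliberately simplified model for illustration only; the class and all names are invented here and do not correspond to any particular product):

```python
# Simplified model of a thin-provisioned storage pool (illustrative only).
class ThinPool:
    def __init__(self, physical_gb):
        self.physical_gb = physical_gb      # real disk capacity in the pool
        self.used_gb = 0                    # capacity actually consumed
        self.volumes = {}                   # volume name -> promised logical size

    def create_volume(self, name, logical_gb):
        # The promised (logical) size may exceed remaining physical capacity:
        # capacity is only consumed when data is actually written.
        self.volumes[name] = logical_gb

    def write(self, name, gb):
        # Physical capacity is allocated on demand, at write time.
        if self.used_gb + gb > self.physical_gb:
            raise RuntimeError("pool exhausted - administrator must add disks")
        self.used_gb += gb

pool = ThinPool(physical_gb=1000)
pool.create_volume("mail", logical_gb=800)   # promised sizes exceed ...
pool.create_volume("sql", logical_gb=800)    # ... physical capacity (1600 > 1000)
pool.write("mail", 120)
pool.write("sql", 200)
print(pool.used_gb, "GB of", pool.physical_gb, "GB physically in use")
```

The point of the model is that the sum of the promised logical sizes may exceed the physical pool; monitoring the actually used capacity tells the administrator when real disks must be added.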
The IT department should remain independent, in order to free itself of any blame and to be able to point to the manufacturer who has forced the user to adopt a specific SAN, NAS or iSCSI infrastructure. One Fujitsu Siemens customer has gone its own way and deliberately selected and configured separate stand-alone solutions for different architectures: a SAN for mainframes including mirroring to a second (backup) data center, a virtualization solution for SQL and mail servers based on Windows and Linux, and finally NAS for FlexFrame for SAP with Fujitsu Siemens servers and NetApp storage. All three sectors are managed separately; the extra effort this requires guarantees that other areas are not automatically affected in an error situation. In other words: the often vilified silo solutions can certainly make sense in certain customer situations, just like DAS systems.
This approach can also be seen in the use of software: it is often the case that too many tools from various sources are used instead of relying on a single provider. Whereas storage systems such as SAN, NAS or IP-SAN can be configured with the supplied software, a central tool should be used for the subsequent administration, thus enabling end-to-end management. Almost all storage manufacturers today offer appropriate programs.
IP-SANs build on familiar Ethernet technology, but it would be a mistake to believe that storage data can simply be moved around on top of the existing network paths. A separate IP infrastructure with the attached storage systems must be set up in order to have a working IP-SAN [3].
Storage management itself sits above these transfer techniques and consists, first of all, of the tools supplied by the manufacturers, also known as element managers. A web server is installed for easier operation so that administrators can access the tools via a browser. More complex networks are monitored and controlled via monitoring services which connect many storage components such as HBAs (Host Bus Adapters), switches and storage systems via interfaces. Some examples are ControlCenter from EMC, BrightStor SRM from CA or SANPoint Control from Symantec.
In the past it was frequently the case that storage equipment from various manufacturers did not understand each other in the network, as they either did not have suitable interfaces (APIs = Application Programming Interfaces) or were not compatible with those from other manufacturers. In order to make storage networks user-friendly for companies, the SNIA (Storage Networking Industry Association, which includes almost all the storage manufacturers) organized a management initiative to create a standard for all devices. Many years of work were spent by the SNIA boards before they submitted a standardization proposal known as the Storage Management Initiative Specification (SMI-S). The manufacturers now work much closer together, swap information about their interfaces and ensure mutual licensing. The communication interfaces covered by SMI-S now often have priority over the proprietary APIs.
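To illustrate what a standardized management interface looks like in practice: SMI-S is based on CIM/WBEM, so a management tool can, in principle, enumerate the systems an SMI-S provider exposes with a generic WBEM client. The following sketch uses the open-source pywbem library; the host, credentials and property shown are placeholders, and the classes a given array actually registers vary by vendor and SMI-S version.

```python
# Sketch: enumerate storage systems via a vendor's SMI-S (CIM/WBEM) provider.
# Host, port and credentials below are placeholders for illustration.
import pywbem

conn = pywbem.WBEMConnection("https://smi-provider.example.com:5989",
                             ("admin", "secret"))

# SMI-S providers typically advertise managed systems in the interop namespace.
for system in conn.EnumerateInstances("CIM_ComputerSystem", namespace="interop"):
    print(system["ElementName"])
```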
According to Frank Bunn, SNIA Europe and storage specialist at Symantec, SMI-S has an important long-term influence on storage management: "SMI-S is not an overnight solution. It is a constant process which began as early as 2002. SMI-S creates better interoperability. It enables the standardized management of different products and provides a consistent view of the SAN and NAS environment. Users are often very enthusiastic because they can finally see their entire storage environment reflected in SMI-S. Customers often do not even know which storage equipment they have, let alone the quantity. But that is just the first step. The second step can greatly facilitate the management of storage systems." [4]
As far as Bunn was concerned, the subject of SAN was previously predominantly controlled by the larger companies. Small and medium-sized companies were more skeptical, according to the motto "too complex, too expensive and doesn't work anyway". Bunn: "And they were right more often than not. However, SMI-S makes SAN management much easier. Partners who are not complete SAN specialists can thus implement and support storage networks. SMI-S versions 1.1 and 1.2 take Fibre Channel SANs as well as NAS and iSCSI into consideration, which greatly expands the environment for integrators."
However, the standardization process has not been completed despite the many years of effort. Not every manufacturer implements the adapted interfaces in its devices in a way that is compatible with those of other providers. The mutual licensing process often takes much longer than the subject matter itself really requires. Furthermore, various SMI-S versions are in use. These circumstances have resulted in user acceptance not being particularly high.
Section 7
According to IDC, companies are holding back from using virtualization in their storage environment as they do not regard it as absolutely essential. Storage is often regarded as an unavoidable necessity for data storage, backup or archiving, one that does not contribute in any way towards increasing productivity or improving operational procedures. Yet the experience gathered from successful server virtualization can also be applied to storage. Many users follow exactly this procedure in practice: they expand virtualization to other sectors on a step-by-step basis.
Virtual storage space can help companies use existing storage resources efficiently as well as centralize and simplify administration. It was the x86 sector which drove virtualization forward, as average system utilization there was very poor. To Dr. Joseph Reger, CTO (Chief Technology Officer) at Fujitsu Siemens Computers, this is all very obvious: "Approximately 10% utilization is simply a bad value compared to existing systems such as mainframes or Unix platforms." [1] Of all the options available to improve the situation, hardware virtualization is the best: "The reason is that the superior layers no longer need to take care of this," says Reger. By superior layers he means everything beyond the hardware layer, i.e. operating systems and applications, which also benefit from virtualization.
Reger continues his explanation: "By simulating various machines, pieces of hardware that do not even exist, the result is a peculiar layer of transparency. With this technology operating systems and applications need not even know that they are running in a virtualized environment. Consequently, the average degree of utilization was dramatically increased and a great deal of money was saved." [2]
According to Reger there are, in principle, three large areas in which virtualization technologies can be applied today: hardware virtualization, operating system virtualization and application virtualization. "With the first group it is a matter of pretending as if we had more hardware than is actually available. This applies to servers and storage in the same way. You simulate virtual levels of hardware that do not even exist. Physically, the hardware exists once only; virtually, however, more is made of it. This means that the operating system does not know it is running on virtual instances: from the view of the operating system there really are ten different servers or storage arrays. If the operating system is virtualized, this means that the application thinks there are several instances of the operating system, whereas in actual fact only one is running." [3] By means of thin provisioning it is, for example, possible to have more logical address space available in a storage array than is physically available.
Storage virtualization is in part misused as a term by the IT industry. It originally meant the mapping of storage resources for the servers or applications involved, i.e. the consolidation or regrouping of physical storage units to form logical ones. Today it is usually used for the allocation of arbitrary storage resources, including data replication mechanisms. This covers terms and technologies such as volume management, virtual disk volumes, file systems, virtual tape, virtual ports or virtual SANs. What they all have in common is the approach of separating the physical view from the logical one, i.e. splitting physical storage into partitions or consolidating several physical hard disks into one or even several logical units.
When dealing with virtualization within the storage network itself (SAN device virtualization), i.e. virtualization at the switch infrastructure level, a distinction should be made between three different approaches:
1) With so-called in-band virtualization, the control entity for the data connection, the metadata and the data transport itself are on the same appliance. The scaling of these solutions is determined by the transport performance of this appliance. The providers FalconStor and DataCore were among the first manufacturers to offer such solutions.
2) With out-of-band virtualization, a single appliance takes care of only the metadata and of controlling the data path, while the hosts and servers organize the transport of the storage data to and from the storage devices.
3) The third approach consists of separating the control entity from the data path, which is done by an intelligent network device. This technology is known as Split Path Architecture for Intelligent Devices (SPAID). Switch manufacturers like Brocade and Cisco provide suitable devices for this purpose. The separation of the instances performed here results in increased data transport speed and enables the concept to scale.
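What all three approaches share is a metadata map that translates a virtual block address into a physical array and address; they differ only in which component holds the map and which component carries the data. A minimal sketch of such a map (invented names and granularity, not any real product's format):

```python
# Sketch: the metadata at the heart of SAN virtualization - a map from
# virtual extents to (physical array, physical extent). In an in-band
# appliance this table and the data path live on the same box; with
# out-of-band and SPAID designs, only the lookup is handled centrally
# and the data flows directly between host and storage.
EXTENT = 1024 * 1024  # 1 MiB extents (illustrative granularity)

mapping = {
    ("vdisk1", 0): ("arrayA", 4711),   # virtual extent 0 -> array A, extent 4711
    ("vdisk1", 1): ("arrayB", 17),     # neighbouring extents may sit on
    ("vdisk1", 2): ("arrayA", 4712),   # different physical arrays
}

def resolve(vdisk, byte_offset):
    """Translate a virtual address into a physical array and byte address."""
    extent, offset = divmod(byte_offset, EXTENT)
    array, phys_extent = mapping[(vdisk, extent)]
    return array, phys_extent * EXTENT + offset

print(resolve("vdisk1", 1 * EXTENT + 512))  # -> ('arrayB', 17 MiB + 512)
```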
Such virtualization solutions normally have two goals. The first goal is to remove the constraints of a particular storage array and/or manufacturer. The second goal is to provide manufacturer-independent data services, such as pool building (consolidation of storage capacities of the same service quality) and data replication services such as snapshots, remote mirroring or disaster recovery.
Something all these virtualization solutions have in common is that they permit a coordinated selection of the optimal storage arrays for a certain task. In this way, storage resources can be made available at will and changed dynamically, independently of the storage array. Only the basic configuration of the elements to be virtualized is still performed by the proprietary management applications; the users therefore have to use this element manager together with the selected virtualization software.
The majority of users of virtualization solutions use them as a means of improving storage management. Virtualization increases the users' freedom of choice: solutions from several storage array manufacturers can be used together in a pool. Storage systems are frequently already completely partitioned during first installation and are afterwards only managed via virtualization: "The approach has proved successful for years, particularly for very dynamic environments, for example with service providers or users with a large number of small application islands, such as in public administration. Wherever high scaling is required for thousands of LUNs and hundreds of servers, the use of split-path technology is absolutely necessary." [4]
In contrast to server virtualization, storage virtualization has not yet made its breakthrough, nor is a market standard becoming apparent. This is certainly also due to the fact that with the LUN concept every SAN storage array already provides rudimentary hard disk virtualization. However, increasing importance is attached to online storage virtualization in virtualized server environments [5].
It is a different story entirely with file-based storage systems, the NAS systems. There
are a number of very promising approaches for file virtualization, but a market standard
has not established itself here, either [6].
The most progress can be seen in storage virtualization with magnetic tapes. When it comes to backup on tape, Virtual Tape Libraries (VTL) are at present best practice. CentricStor VT is currently the leading virtual tape product in data centers.
Section 8
Until recently every application had its own fixed, allocated infrastructure: a collection of servers, network components and storage systems. But now the aim is to allocate only those infrastructure resources that the applications actually require. This objective is achieved via virtualization: large resource pools exist which can then be used dynamically depending on the requirements involved. Fujitsu Siemens Computers is moving in this direction as part of its Dynamic Infrastructure strategy, of which storage is an integral part.
Terri McClure, an analyst working for the Enterprise Strategy Group (ESG), concluded the following in a report: "The recently launched CentricStor FS from Fujitsu Siemens Computers is a file-based high-end storage system via which scaling is possible at a very fine level, thus fulfilling capacity, availability and performance requirements as defined by file server consolidation initiatives and Web 2.0 applications. The use of standard components and the excellent cluster features make CentricStor FS a scalable, easy-to-manage file storage solution with a low starting price that has been specially designed for the real world of increased file quantities." [2]
FibreCAT SX systems are user-friendly and easy to put into operation. Administration is uncomplicated thanks to the intuitive web interface, and the systems are suitable for a wide range of applications.
The analyst Hamish Macarthur from Macarthur Stroud International said the following regarding the FibreCAT SX series: "Managing and protecting the information assets of an organization is critical in today's markets. The systems in which the data is resident must be secure, reliable and easy to manage. The FibreCAT SX range supports reliable primary storage as well as the need for faster backup and recovery. The new arrays, with management tools included, will be a sound investment to meet the business and compliance requirements of small, medium and large organizations."
Managed Storage
Growth in storage means above all an immense increase in data quantities which must all be managed, saved, provided and stored. The demand for online storage capacity increases, as does the demand for backup storage volumes. No limits to such growth rates are appearing on the horizon.
Against this backdrop SAP asked itself whether they wanted to continue managing
the required storage volumes themselves or to practice what their own hosting experts
recommend to their customers, namely outsource the work that is not part of their core
competence and concentrate on the important elements essential for their core business.
SAP managers thus placed the operation and support of their process-supporting storage infrastructure into the skilled hands of external specialists. SAP has found such a competent partner in Fujitsu Siemens Computers.
Fujitsu Siemens Computers took on the role of general contractor for SAP and the entire responsibility for providing online storage capacity for data backup: 4 petabytes monthly at the start, later moving up to more than 200 terabytes daily. Furthermore, appropriate reserve capacities were provided in order to meet any additional requirements in time.
Fujitsu Siemens Computers thus supports one of the largest Managed Storage projects in Europe and also manages the cooperation activities with the strategic partners involved, namely EMC and NetApp, which provide products in the SAN and NAS environment and whose specialists are involved in the corresponding service sector.
Customers benefit even more from the fact that Fujitsu Siemens Computers works together very closely with many partners on storage matters. One example of such excellent cooperation can be seen in the quotes from the CEOs of our partners EMC and NetApp:
Joe Tucci, EMC President & CEO: "The combination of EMC's networked storage solutions and the server-based solutions from Fujitsu Siemens Computers creates a wide-ranging offer of end-to-end infrastructure solutions that meet the requirements of our customers. EMC solutions play a central role in the Fujitsu Siemens Computers vision of a Dynamic Data Center, and we will continue to concentrate our joint operations on offering our customers the most comprehensive solution portfolio available on the market."
Dan Warmenhoven, CEO at NetApp: "The strategic partnership with Fujitsu Siemens Computers has contributed a lot to our success and is still growing in the EMEA region. Our jointly developed solution FlexFrame for mySAP Business Suite, the implementation of a fast backup solution for Oracle and the Center of Excellence which has been set up with Oracle are just some of the excellent examples that have resulted from our cooperation so far."
Forecast
Storage traffic over Ethernet can be handled in exactly the same way as Fibre Channel (FCoE). In short: 10 Gb Ethernet is available, and 40 Gb as well as 100 Gb are on the horizon. However, there are also plans to extend Fibre Channel beyond 8 Gbit/s, even up to 16 Gbit/s. A conversion to FCoE requires a high amount of infrastructure investment, which in due course will be balanced by the savings from operating a single type of network.
Remarks
Section 1
[1] The triumph of large numbers; 10 years of Google; Neue Zürcher Zeitung, April 25, 2008.
[2] Details about typeface development from Charles Panati, The Browser's Book of Beginnings: Origins of Everything Under, and Including, the Sun, New York 1998, page 67 ff.
[3] Der Spiegel, edition dated 11. 8. 2008, cover feature "Addicted to data", page 88.
[4] Many authors have written about the positive and negative effects of this development. For example, Neil Postman, Joseph Weizenbaum, Nicholas Negroponte or Nicholas Carr. See the interview in Hendrik Leitner / Hartmut Wiehr, Die andere Seite der IT: Business-Transformation durch Services und dynamische Infrastruktur, Munich 2006, page 215 ff.
[5] Cf. The following passage from an interview in the German weekly Die Zeit.
Question: Which development has changed the most the way we handle knowledge in
recent years?
Answer: Two. Firstly, hard disks now hardly cost anything. It is no longer a utopian idea to have man's entire published works on disk. Secondly, nobody in the world is more than a day's walk from an Internet café. We now have the communication infrastructure to provide the world's great libraries to youngsters in Uganda or the poorer areas of the USA or Germany. (Interview with Brewster Kahle, Director of the Internet Archive in San Francisco;
Die Zeit, 17. 1. 2008)
Section 2
[1] The fact that major storage manufacturers have been buying up companies, such as Cognos,
Business Objects, Hyperion, Documentum or FileNet, that developed software for document
management (DMS) or business intelligence (BI) proves that storage hardware and the criteria for stored data are merging. It can also be seen as an attempt to integrate classic storage
equipment with ILM or HSM.
[2] Steve Duplessie, File Infrastructure Requirements in the Internet Computing Era, Enterprise
Strategy Group (ESG), July 2008, Page 5.
[3] Fred Moore / Horison Information Strategies, Storage Spectrum (2007), Page 76.
[4] Every fourth user has implemented ILM at a certain point or in a certain area but only
3.5% use such solutions throughout the company. Users, who are seriously looking at such
a topic, must realize that ILM cannot be simply bought as a single product nor implemented via a one-off project. Wolfram Funk, Experton Group (quoted in the German magazine
Computerwoche 46/2006, Page 40)
[5] See also the interview with Michael Peterson in http://www.searchstorage.de/topics/
archiving/e-mail/articles/107136/.
[6] Cf. for example Dan Warmenhoven, NetApp CEO: "My view (regarding ILM) is that a work-intensive data management process should not be set up if there is an automated one. And ILM is very work-intensive. (…) The user wants to migrate his data from expensive storage forms to cheaper ones. And he requires an online archive in order to retrieve the data quickly and also for compliance reasons. NetApp talks of data archiving and migration. Calling all that ILM is confusing." (Interview in Computerwoche, 9. 11. 2007).
Section 3
[1] Cf. Hartmut Wiehr, Disk systems technology and products, in iX extra, edition 12/2007
(free download: http://www.heise.de/ix/extra/2007/ie0712.pdf).
[2] Cf. the forecast in this book, page 69.
[3] There is an early warning system in professional disk systems. Disks report during running
operations as to whether they are going to fail. Drives, which have not been in use for a
while, are addressed periodically to make sure that they still work. Do they react immediately
to a signal or only with the retries? The latter would suggest that the magnetic interface is
no longer 100%. All the corresponding information is collected and one of the spare disks is
triggered by the system if certain threshold values are exceeded. Or else a message is sent to
the service team.
[4] ICP Vortex has provided a white paper about the various RAID levels. The document is in German and can be downloaded from: http://vortex.de/NR/rdonlyres/82BA6504-885D-444E-AC71-7AC570CF56A3/0/raid_d.pdf.
[5] See Günter Thome / Wolfgang Sollbach, Fundamentals of Information Lifecycle Management, Berlin Heidelberg 2007, page 214 ff.
Section 4
[1] For the history of the Ethernet protocol and the Internet, which were developed on behalf
of the US Army and were to ensure the transportation of messages in the event of war see
http://www.informatik.uni-bremen.de/grp/unitel/referat/timeline/timeline-2.html.
[2] For more information about SANs and Fibre Channel see http://www.snia.org/education/
tutorials/ and http://www.fibrechannel.org/.
[3] Seen from a historical viewpoint, CNT first took over its competitor Inrange, but was then
itself purchased by McData shortly afterwards. Brocade now unites all the former FC switch
providers against Cisco, the company with the largest market force in the network sector.
Seen globally, Brocade is still the market leader for FC switches, whereas Cisco has continuously extended its market share, particularly in USA.
[4] Good interpretations of the development of the IT industry are to be found in Paul E. Ceruzzi's A History of Modern Computing, Cambridge and London 2003, and in Clayton M. Christensen's The Innovator's Dilemma, New York 2000.
[5] Cf. Mario Vosschmidt/Hartmut Wiehr, Storage networks and their instruments of administration; in: iX, vol. 8/2008, page 122 ff.
[6] A good overview is to be found in the brochure NetApp Products & Solutions.
[7] Interview with David Hitz in project 57 Journal for Business Computing and Technology,
Special 01/05, page 39 ff. Cf. also the interview with NetApp CEO Dan Warmenhoven in
Computerzeitung, issue 38/2008, page 10.
Section 5
[1] Many storage manufacturers supply their own backup programs together with their hardware. EMC, a partner of Fujitsu Siemens Computers, has had a complete solution (NetWorker) in its portfolio since the takeover of Legato in 2003. Of the independent providers, CA with
ARCserve and Symantec with NetBackup are particularly worth mentioning. CommVault and
BakBone can frequently be found in the Windows environment.
[2] See in this respect Dave Russell / Carolyn DiCenzo, MarketScope for Enterprise Backup/Recovery Software 2008, April 2008.
[3] An example of error corrections is ECC (Error Correction Code). The correction is used to
detect errors during the storage and transfer of data. The errors are then automatically
remedied in the second step.
[4] Hard disks with S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) have a
function that constantly monitors them, with technical values, such as temperature, startup
time or track reliability of the read and write heads being controlled.
[5] Specialist service providers, such as Kroll Ontrack or Seagate Services, can often recover hard
disks damaged by fire and water and thus save the lost data.
[6] See Hartmut Wiehr, Begrenzte Lebensdauer (= Limited life-span): Disks, Status Quo and
Trends; in iX extra, issue 12/2007; download: http://www.heise.de/ix/extra/2007/ie0712.pdf.
Other terms and contexts relating to this topic are explained there.
[7] For more details see www.speicherguide.de: Focus on tape drives and tapes (http://www.
speicherguide.de/magazin/bandlaufwerke.asp?mtyp=&lv=200). The durability of DVDs and
Blu-ray Disc (BD) is subject to very large fluctuations. The market is broad, particularly in the consumer segment. A good overview of the durability of these optical media with due regard to their suitability for archiving purposes can be found in c't Magazin für Computertechnik, issue 16/2008, page 116 ff.
[8] As early as 1987 StorageTek presented the 4400 Automated Cartridge System (ACS), which
was the basis for the 4400 Tape Libraries (Nearline) which were introduced in 1988. The large
PowderHorn Libraries introduced in 1993 were a further development of this technology,
with which very fast access to tape cartridges is possible. A great many PowderHorns are
still in use today, and after StorageTek was taken over by Sun in 2005 the new owner had
to repeatedly extend the maintenance cycles on account of the pressure exerted by major
customers. These customers saw no reason to phase out their tried-and-trusted tape libraries and replace them with follow-up models.
[9] Cf. The Top 10 Storage Inventions of All Time, in Byteandswitch, June 16, 2008.
[10] Hartmut Wiehr, Dedup turns backup inside out, in Computerzeitung, issue 28/2008,
page 10.
Section 6
[1] Fujitsu Siemens Computers has developed two dynamic infrastructure solutions FlexFrame
for SAP and FlexFrame for Oracle which combine storage, server and network resources on a
single platform. Resources can thus be assigned and moved in run mode in a virtual environment and depending on current requirements.
[2] See section 7 for more about virtualization.
[3] See Mario Vosschmidt / Hartmut Wiehr, Gut eingebunden: Speichernetze und ihre Verwaltungsinstrumente (Well incorporated: storage networks and their administration tools), in: iX Magazin für professionelle Informationstechnik, issue 8/2008, page 123.
[4] Published in tecchannel, 26th January 2006. For more information, see the article SMI-S
is holding the storage networking industry together (Mit SMI-S hat sich die Storage-Networking-Industrie ihren inneren Zusammenhalt gegeben) www.searchstorage.de (September 5, 2007).
Section 7
[1] Virtualization drives IT industrialization forward: Interview by Hartmut Wiehr with Dr. Joseph Reger, at www.searchstorage.de, April 10, 2004: http://www.searchstorage.de/topics/
rz-techniken/virtuelle-systeme/articles/117015/.
[2] Ibid.
[3] Ibid.
[4] Mario Vosschmidt / Hartmut Wiehr, Gut eingebunden: Speichernetze und ihre Verwaltungsinstrumente (Well incorporated: storage networks and their administration tools), in: iX Magazin für professionelle Informationstechnik, issue 8/2008, page 124.
[5] LUN masking (also LUN mapping) means that only the storage area it needs to perform its
work is allocated and made visible to an application. As a result of this segmentation general
access to certain storage areas is prohibited, which at the same time increases the security
of all the applications. With SAN zoning the same principle is applied to the division of a
network into virtual subnetworks so that servers of the one zone cannot access storage
systems of another zone.
[6] Steve Duplessie, File Infrastructure Requirements in the Internet Computing Era, Enterprise
Strategy Group (ESG), July 2008.
Section 8
[1] Josh Krischer, Krischer & Associates, CentricStor Virtual Tape: the Swiss Army Knife for data
protection, September 2008
[2] Terri McClure, Enterprise Strategy Group (ESG), CentricStor FS von Fujitsu Siemens Computers, July 2008
Glossary
A
Asynchronous Replication
After data has been written to the primary storage site, new writes to that site can be accepted without having to wait for the secondary (remote) storage site to also finish its writes. Asynchronous replication does not have the latency impact that synchronous replication does, but has the disadvantage of incurring data loss should the primary site fail before the data has been written to the secondary site.
B
Backup/Restore
A two-step process. Information is first copied to non-volatile disk or tape media. In the event of computer problems (such as disk drive failures, power outages, or virus infection) resulting in data loss or damage to the original data, the copy is subsequently retrieved and restored to a functional system.
Block Data
Raw data which does not have a file structure imposed on it. Database applications such as Microsoft SQL Server and Microsoft Exchange Server transfer data in blocks. Block transfer is the most efficient way to write to disk.
Business Continuity
The ability of an organization to continue to function even after a disastrous event, accomplished through the deployment of redundant hardware and software, the use of fault-tolerant systems, as well as a solid backup and recovery strategy.
C
Compliance
In data storage terminology, the word compliance refers to industry-wide government regulations and rules that cite how data is managed and the need for organizations to be in compliance with those regulations. Compliance has become a major concern for organizations and businesses, due largely to increasing regulatory requirements which often require organizations to invest in new technologies in order to address compliance issues.
D
Data Deduplication
Deduplication technology segments the
incoming data stream, uniquely identifies
these data segments, and then compares them
to segments previously stored. If an incoming
data segment is a duplicate of what has already
been stored, the segment is not stored again
but a reference is created for it (pointer). This
process operates at a very low level of granularity or atomic level to identify as much
redundancy as possible. The trade-offs in this process are the computational effort needed to identify and compare segments in exchange for the capacity savings achieved.
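As a hedged illustration of the principle described above, the following toy model uses fixed-size segments and SHA-256 fingerprints (real products use finer, often variable-size segmenting and their own fingerprinting schemes):

```python
# Toy model of segment-based deduplication with fixed-size chunks.
import hashlib

store = {}      # fingerprint -> stored segment
recipe = []     # sequence of fingerprints ("pointers") describing the stream

def ingest(data, segment_size=4096):
    for i in range(0, len(data), segment_size):
        segment = data[i:i + segment_size]
        fp = hashlib.sha256(segment).hexdigest()
        if fp not in store:          # new segment: store it once
            store[fp] = segment
        recipe.append(fp)            # duplicates keep only a reference

ingest(b"A" * 8192 + b"B" * 4096 + b"A" * 4096)
print(len(recipe), "segments referenced,", len(store), "actually stored")
# -> 4 segments referenced, 2 actually stored
```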
E
Ethernet
Local area network (LAN) topology commonly
operating at 10 megabits per second (Mbps)
over various physical media such as coaxial
cable, shielded or unshielded twisted pair, and
fiber optics. Future plans call for 1, 10 and 100
gigabit Ethernet versions. Ethernet standards
are maintained by the IEEE 802.3 committee.
F
Failover
In the event of a physical disruption to a network component, data is immediately rerouted
to an alternate path so that services remain
uninterrupted. Failover applies both to clustering and to multiple paths to storage. In the
case of clustering, one or more services (such as Exchange) are moved over to a standby server
in the event of a failure. In the case of multiple
paths to storage, a path failure results in data
being rerouted to a different physical connection to the storage.
Fault Tolerance
Fault tolerance is the ability of computer
hardware or software to ensure data integrity
when hardware failures occur. Fault-tolerant
features appear in many server operating systems and include mirrored volumes, RAID
volumes, and server clusters.
File Data
Data which has an associated file system.
Fibre Channel (FC)
A high-speed interconnect used in storage
area networks (SANs) to connect servers to
shared storage. Fibre Channel components
include HBAs, hubs, switches, and cabling. The
term Fibre Channel also refers to the storage
protocol.
Fibre Channel over Ethernet (FCoE)
A technology that encapsulates Fibre Channel
frames in Ethernet frames, allowing FC traffic
to be transported over Ethernet networks.
Standards are being worked on in various standardization committees. Products have been announced for 2009 or 2010. FCoE could be an alternative to
classical Fibre Channel technology.
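A rough sketch of the encapsulation idea (heavily simplified; the real FCoE frame format, standardized in FC-BB-5, adds version fields, SOF/EOF delimiters and padding):

```python
# Sketch: the idea behind FCoE - an unmodified Fibre Channel frame is
# carried as the payload of an Ethernet frame (EtherType 0x8906).
# Header layout is simplified for illustration.
import struct

def fcoe_frame(dst_mac, src_mac, fc_frame):
    ethertype = struct.pack("!H", 0x8906)   # EtherType registered for FCoE
    return dst_mac + src_mac + ethertype + fc_frame

frame = fcoe_frame(b"\x0e\xfc\x00\x00\x00\x01",  # illustrative MAC addresses
                   b"\x0e\xfc\x00\x00\x00\x02",
                   b"<encapsulated FC frame bytes>")
print(len(frame), "bytes on the wire")
```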
G
Global File System
In some configurations, as with clusters or
multiple NAS boxes, it is useful to have a means
to make the file systems on multiple servers or
devices look like a single file system. A global
or dispersed file system would enable storage
administrators to globally build or make
changes to file systems. To date this remains
an emerging technology.
H
High Availability
A continuously available computer system is
characterized as having essentially no downtime in any given year. A system with 99.999%
availability experiences only about five minutes of downtime. In contrast, a high availability system is defined as having 99.9% uptime,
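The downtime figures quoted here follow directly from the percentages; a quick check:

```python
# Quick check of the availability figures quoted above.
minutes_per_year = 365 * 24 * 60

for availability in (0.99999, 0.999):
    downtime = (1 - availability) * minutes_per_year
    print(f"{availability:.3%} availability -> {downtime:.0f} minutes downtime/year")
# 99.999% -> ~5 minutes; 99.9% -> ~526 minutes (almost nine hours)
```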
I
ILM (Information Lifecycle Management)
The process of managing information growth,
storage, and retrieval over time, based on its
value to the organization. Sometimes referred
to as data lifecycle management.
iSCSI (Internet SCSI)
A protocol that enables transport of block data
over IP networks, without the need for a specialized network infrastructure, such as Fibre
Channel.
ITIL (Information Technology Infrastructure
Library)
ITIL refers to a documentation of best practice
for IT Service Management. Used by many
hundreds of organisations around the world, a
whole ITIL philosophy has grown up around
the guidance contained within the ITIL books
and the supporting professional qualification
scheme. ITIL consists of a series of books giving
guidance on the provision of quality IT services, and on the accommodation and environmental facilities needed to support IT. ITIL has
been developed in recognition of organisations growing dependency on IT and embodies
best practices for IT Service Management. ITIL
is often implemented when different enterprises work together, and can also facilitate
mergers and acquisitions.
J
JBOD (Just a Bunch of Disks)
As the name suggests, a group of disks housed
in its own box; JBOD differs from RAID in not
having any storage controller intelligence or
data redundancy capabilities.
L
LAN
Local Area Network. Hardware and software
involved in connecting personal computers
and peripherals within close geographic confines, usually within a building, or adjacent
buildings.
Load Balancing
Referring to the ability to redistribute load
(read/write requests) to an alternate path
between server and storage device, load balancing helps to maintain high performance
networking.
LTO
Linear Tape Open. A family of half-inch open tape technologies; the Ultrium format is a cartridge format targeted at ultra-high capacity requirements.
LUN (Logical Unit Number)
A logical unit is a conceptual division (a subunit) of a storage disk or a set of disks. Logical
units can directly correspond to a volume drive
(for example, C: can be a logical unit). Each
logical unit has an address, known as the logical unit number (LUN), which allows it to be
uniquely identified.
80
LUN Masking
A method to restrict server access to storage
not specifically allocated to that server. LUN
masking is similar to zoning, but is implemented in the storage array, not the switch.
M
MAN
Metropolitan Area Network. A network capable
of high-speed communications over distances
up to about 80 kilometers.
Metadata
The information associated with a file but separate from the data in the file; required to
identify data in the file and its physical location on a disk.
Mirroring
A disk data redundancy technique in which
data is recorded identically and either synchronously or asynchronously on multiple
separate disks to protect data from disk failures. When the primary disk is off-line, the
alternate takes over, providing continuous
access to data. Normally used for mission-critical data, mirroring is classified as a RAID 1 configuration and doubles disk costs.
N
NAS (Network Attached Storage)
A NAS device is a server that runs an operating
system specifically designed for handling files
(rather than block data). Network-attached
storage is accessible directly on the local area
network (LAN) through LAN protocols such as
TCP/IP. Compare to DAS and SAN.
P
Partition
A partition is the portion of a physical disk or
LUN that functions as though it were a physically separate disk. Once the partition is created, it must be formatted and assigned a drive
letter before data can be stored on it.
Port
The physical connection point on computers,
switches, storage arrays, etc, which is used to
connect to other devices on a network. Ports
on a Fibre Channel network are identified by
their Worldwide Port Name (WWPN) IDs; on
iSCSI networks, ports are commonly given an
iSCSI name. Not to be confused with TCP/IP
ports, which are used as virtual addresses
assigned to each IP address.
R
RAID (Redundant Array of Independent
Disks)
A way of storing the same data over multiple
physical disks to ensure that if a hard disk fails
a redundant copy of the data can be accessed
instead. Example schemes include mirroring
and RAID 5.
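The redundancy in parity-based schemes such as RAID 5 rests on the XOR operation; a toy sketch of how a lost block is reconstructed (illustrative block sizes, not a real on-disk layout):

```python
# Sketch: the XOR parity idea behind RAID 5 - the parity block is the XOR
# of the data blocks, so any single lost block can be reconstructed.
blocks = [b"\x10\x20", b"\x03\x04", b"\xff\x00"]   # data blocks on three disks

def xor_blocks(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

parity = b"\x00\x00"
for blk in blocks:
    parity = xor_blocks(parity, blk)

# Disk 1 fails: rebuild its block from the surviving blocks and the parity.
rebuilt = xor_blocks(xor_blocks(blocks[0], blocks[2]), parity)
assert rebuilt == blocks[1]
print("reconstructed block:", rebuilt)
```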
Redundancy
The duplication of information or hardware
equipment components to ensure that should
a primary resource fail, a secondary resource
can take over its function.
Replication
Replication is the process of duplicating mission critical data from one highly available site
to another. The replication process can be synchronous or asynchronous; duplicates are
known as clones, point-in-time copies, or
snapshots, depending on the type of copy
being made.
S
SAN (Storage Area Network)
A storage area network (SAN) is a specialized
network that provides access to high performance and highly available storage subsystems
using block storage protocols. The SAN is made
up of specific devices, such as host bus adapters (HBAs) in the host servers, switches that
help route storage traffic, and disk storage
subsystems. The main characteristic of a SAN
is that the storage subsystems are generally
available to multiple hosts at the same time,
which makes them scalable and flexible. Compare with NAS and DAS.
SAS/SATA
SAS: Serial Attached SCSI. While SATA (Serial
ATA) is designed for desktops, making it a good
choice in storage environments requiring configuration simplicity or optimal cost/capacity,
SAS delivers the high performance, scalability
and reliability required for mainstream servers
and enterprise storage.
SCSI (Small Computer System Interface)
A set of standards allowing computers to communicate with attached devices, such as storage devices (disk drives, tape libraries etc) and
printers. SCSI also refers to a parallel interconnect technology which implements the SCSI
protocol. SCSI is available in two flavours: Parallel SCSI and Serial Attached SCSI. Parallel
SCSI has been the standard in connectivity for
more than 20 years, and is known for its stability and reliability. Serial Attached SCSI (SAS) is
the newest generation of SCSI, supporting both Serial ATA (SATA) and SAS drives.
Snapshot
A virtual copy of a device or filesystem. Snapshots imitate the way a file or device looked at
the precise time the snapshot was taken. It is
not a copy of the data, only a picture in time of
how the data was organized. Snapshots can be
taken according to a scheduled time and provide a consistent view of a filesystem or device
for a backup and recovery program to work
from.
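One common way to achieve such a "picture in time" without copying the data is copy-on-write; a toy model (illustrative only, not any particular product's implementation):

```python
# Toy copy-on-write snapshot: the snapshot stores only blocks that change
# after it was taken; unchanged blocks are read from the live volume.
volume = {0: "boot", 1: "data-v1", 2: "logs"}   # block -> contents
snapshot_saved = {}                              # blocks preserved on first overwrite

def write_block(block, contents):
    if block in volume and block not in snapshot_saved:
        snapshot_saved[block] = volume[block]    # preserve the old data once
    volume[block] = contents

def read_snapshot(block):
    # The snapshot view: preserved old blocks first, live volume otherwise.
    return snapshot_saved.get(block, volume.get(block))

write_block(1, "data-v2")
print(volume[1], read_snapshot(1))   # -> data-v2 data-v1
```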
Solid State Disk (SSD)
A solid state disk is a high-performance plug-and-play storage device that contains no moving parts. SSD components include either DRAM or EEPROM memory boards, a memory bus board, a CPU, and a battery card. Because they contain their own CPUs to manage data storage, they are a lot faster than conventional rotating hard disks and therefore produce the highest possible I/O rates. SSDs are most effective for server applications and server systems where I/O response time is crucial. Data stored in DRAM-based SSDs is volatile, which is why a battery card protects it against power loss.
T
Tape Library
In data storage, a tape library is a collection of
magnetic tape cartridges and tape drives. An
automated tape library is a hardware device
that contains multiple tape drives for reading
and writing data, access ports for entering and
removing tapes and a robotic device for
mounting and dismounting the tape cartridges
without human intervention. To mount means
to make a group of files in a file system structure accessible to a user or user group.
Target
A target is the device to which the initiator
sends data. Most commonly the target is the
storage array, but the term also applies to
bridges, tape libraries, tape drives or other
devices.
TCP/IP
Transmission Control Protocol/Internet Protocol. A set of transport and network layer protocols developed under the auspices of the U.S.
Department of Defense. Has emerged as the
de-facto standard for communications among
Unix systems, particularly over Ethernet.
Thin Provisioning
Thin provisioning is most commonly used in
centralized large storage systems such as SANs
and also in storage virtualization environments
where administrators plan for both current
and future storage requirements and often
over-purchase capacity, which can result in
wasted storage. Since thin provisioning is
designed to allocate exactly what is needed,
exactly when it is needed, it removes the element of paid-for but wasted storage capacity.
Additionally, as more storage is needed additional volumes can be attached to the existing
consolidated storage system.
Tiered Storage
Data is stored according to its intended use.
For instance, data intended for restoration in
the event of data loss or corruption is stored
locally, for fast recovery. Data required to be retained long term, for example for compliance reasons, can be moved to less expensive storage tiers.
V
VTL (Virtual Tape Library)
Refers to an intelligent disk-based library that
emulates traditional tape devices and tape formats. Acting like a tape library with the performance of modern disk drives, data is deposited onto disk drives just as it would onto a
tape library, only faster. Virtual tape backup
solutions can be used as a secondary backup
stage on the way to tape, or as their own
standalone tape library solution. A VTL generally consists of a Virtual Tape appliance or
server, and software which emulates traditional tape devices and formats.
Virtualization
In storage, virtualization is a means by which
multiple physical storage devices are viewed as
a single logical unit. Virtualization can be
accomplished in-band (in the data path) or out-of-band. Out-of-band virtualization does
not compete for host resources, and can virtualize storage resources irrespective of whether
they are DAS, NAS or SAN.
Volume
A volume is an area of storage on a hard disk. A
volume is formatted by using a file system, and
typically has a drive letter assigned to it. A single hard disk can have multiple volumes, and
volumes can also span multiple disks.
Z
Zoning
A method used to restrict server access to storage resources that are not allocated to that
server. Zoning is similar to LUN masking, but is
implemented in the switch and operates on
the basis of port identification (either port
numbers on the switch or by WWPN of the
attached initiators and targets).
(Sources: Adaptec, Fujitsu Siemens Computers,
Horison Information Strategies, Microsoft,
SNIA, Webopedia, ZAZAmedia)
Information infrastructures
in enterprises
The principle of profitable trading increasingly demands efficient handling of information in the enterprise, especially when you consider that the volume of information is growing by an average of 60 percent per year. With its solutions, EMC is striving to make optimal use of this capital as well as to protect, manage, store and archive it. EMC thereby rings in a paradigm change by moving the information itself, rather than the applications, into the center of the infrastructure. As a result, the demands on the infrastructure must be focused on the paths the information takes in the enterprise: from creation, capture and utilization through to archiving and deletion. The optimum strategy for setting up an information infrastructure includes intelligent data storage, protection against data loss or misuse, optimization of IT management and services, and utilization of the value-added potential of information. Alongside the top priority of cost reduction, enterprises mainly want to improve their compliance with all legal requirements and enhance support of their business processes. Business demands on IT are therefore given noticeably higher priority than technological goals such as better data security or better structured data.
[Figure: Tiered storage before and after: production data is distributed across Tier 1, Tier 2 and Tier 3.]
[Figure: Active archiving before and after: in many e-mail or file system environments, more than 75 percent of the data is not modified, which makes it an ideal candidate for active archiving.]
[Figure: Backup data before and after: backup data is moved to Tier 3.]
[Figure: Snapshots and clones before and after: snapshots provide up to 10 times reduction in the capacity required for local replication.]
EMC Storage integration with Microsoft is implemented via Fujitsu Siemens Computers BladeFrame technology
Common solution for Grid Computing based on Fujitsu Siemens Computers
PRIMERGY server systems
OEM and reseller agreements for EMC Networker
For more information on EMC solutions, visit www.emc.com
Introduction
In January 2008, Brocade introduced 8 Gbit/sec capabilities for the Brocade 48000 Director and the new Brocade DCX Backbone platform. Brocade is expanding this leadership position with the introduction of an entire family of 8 Gbit/sec switch products targeting a range of data center environments, from the enterprise to Small and Medium Business (SMB). In addition, Brocade is launching 8 Gbit/sec Host Bus Adapters (HBAs), providing the industry's first end-to-end 8 Gbit/sec solution for SMB to enterprise customers. These high-performance solutions are driven by a new family of Brocade 8 Gbit/sec ASICs, which process and route data with much higher levels of efficiency. In addition to doubling performance throughput, these new ASICs offer new capabilities that align with growing data center requirements for IT process automation, energy efficiency, and reduced Operating Expenses (OpEx). Steady increases in performance and functionality have been the hallmark of Fibre Channel evolution over the past decade. With the periodic doubling of transport speed from 1 to 2 Gbit/sec and from 2 to 4 Gbit/sec, storage administrators have quickly exploited the new performance capabilities and advanced features to build more optimized storage networks.
With the introduction of Brocade 8 Gbit/sec switches and HBAs, it is now possible to
fully integrate advanced functionality that extends from the fabric all the way to the
server platform. In trying to decide where enhanced performance and capabilities can
be applied in your own environment, consider the following:
Storage Growth. Storage Area Network (SAN) storage capacity has dramatically
increased year over year in almost all data centers. As SAN storage grows, so do the
fabrics that interconnect storage with servers.
Large Fabrics. As fabrics grow, more Inter-Switch Links (ISLs) are used to keep pace
with storage and server scaling.
Higher Levels of Performance. In large-scale data centers, moving SAN bandwidth-intensive hosts to 8 Gbit/sec connectivity enables the servers to achieve higher levels
of performance using fewer HBAs and a smaller cabling infrastructure.
Server Virtualization. Hosting multiple operating system instances on a single host
platform dramatically increases storage I/O demands, which in turn drives up host
SAN throughput.
Tiered Services. In a shared environment, in which IT may be using chargeback to
serve internal customers, a tiered services model requires the ability to specify service levels for hosted applications and to monitor and manage these services end to
end, all capabilities of Brocade 8 Gbit/sec solutions.
Backup. Large amounts of traffic to tape or disk during backups require the fastest
SAN speeds possible to fit within backup windows.
Operational Flexibility. While not all hosts, storage, and ISLs currently require maximum speed capability, it is much easier to architect data center fabrics when high-speed ports are available.
Investment Protection. Existing SANs can be significantly enhanced with new capabilities enabled by 8 Gbit/sec port speed. Integrated Routing and Adaptive Networking services are compatible with legacy SAN equipment, extending their Return on
Investment (ROI) as data center fabrics scale.
Data centers may have some or all of these needs today. Although meeting these
needs may not require an immediate upgrade to 8 Gbit/sec for all storage applications,
future plans for expansion, virtualization, and fabric scaling will make acquiring 8 Gbit/
sec capabilities today a safe and well-founded decision. As fabrics scale, for example,
only half the number of ISLs is required with 8 Gbit/sec links compared with 4 Gbit/sec links. Likewise, the ISL oversubscription ratio is halved by upgrading the ISLs from 4 to 8 Gbit/sec while using the same number of links.
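The arithmetic behind these claims is simple enough to check; a small sketch with illustrative numbers (not taken from any particular installation):

```python
# Illustrative arithmetic: ISL oversubscription ratio in a two-switch fabric.
def oversubscription(host_ports, host_speed_gbit, isls, isl_speed_gbit):
    """Aggregate host bandwidth divided by aggregate inter-switch bandwidth."""
    return (host_ports * host_speed_gbit) / (isls * isl_speed_gbit)

# 32 hosts at 4 Gbit/sec behind 4 ISLs:
print(oversubscription(32, 4, 4, 4))   # 8.0 : 1 with 4 Gbit/sec ISLs
print(oversubscription(32, 4, 4, 8))   # 4.0 : 1 after upgrading the ISLs to 8 Gbit/sec
```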
At long distances, 8 Gbit/sec can earn a very fast ROI compared to 4 Gbit/sec, due to
the high cost of dark fiber or WDM links. Almost all of these native FC extension links
support 8 Gbit/sec speeds, so utilization can be doubled on links that usually cost thousands, if not tens of thousands, of dollars per month. This can quickly justify the equipment cost for the increased speed capability. Building a high-performance foundation
that provides the flexibility to selectively deploy 8 Gbit/sec as needed simplifies data
center fabric management and accommodates the inevitable growth in applications
and data over time.
Brocades new family of 8 Gbit/sec switches supports the rapidly growing data center
by delivering 8 Gbit/sec performance on every port with no oversubscription. A completely non-oversubscribed switching architecture enhances server scalability by enabling the rapid growth of virtual servers without compromising data center performance.
Integrated Routing (IR) fabric service is a new option on the Brocade DCX Backbone
and Brocade 5300 and 5100 Switches with the release of Fabric OS (FOS) 6.1. As of FOS
6.1, IR can be activated on FC8 port blades with up to 128 IR ports per Brocade DCX
chassis. (When there are two Brocade DCX chassis connected via Inter-Chassis Links, a
total of 256 IR ports are available.) No additional hardware is required to enable per-port Fibre Channel Routing; only an optional IR software license is required. IR can be
enabled on the maximum number of ports on the Brocade 5300 (80 ports) and Brocade
5100 (40 ports) via user configuration. Brocade 8 Gbit/sec HBA ASICs support a maximum of 500,000 I/Os per second (IOPS) per port (more than 1 million IOPS on a dual-port HBA) to free up
the host processors and meet virtualization productivity goals. In the future, two 8
Gbit/sec HBA ports will be able to be combined into a single, ultra-high-speed 16 Gbit/
sec connection using Brocade ISL Trunking technology, which balances traffic flows at
the frame level. Currently, the benefits of Brocade 8 Gbit/sec switching technology are
extended directly to VMs via N_Port ID Virtualization (NPIV), so that special Brocade
features, such as Top Talkers and QoS Traffic Prioritization, can be applied to individual
VMs. This end-to-end fabric and host integration is unique to Brocade and offers the
industry's highest I/O performance for virtualized environments.
Brocade 8 Gbit/sec HBAs complement industry-leading performance with advanced
storage functionality to further streamline virtualized server operations. To meet regulatory compliance requirements, for example, Brocade 8 Gbit/sec HBAs implement the
industry standard Fibre Channel Security Protocol (FC-SP) and will support in-flight
data encryption for secure network transactions.
In addition, the new Brocade fabric service, Adaptive Networking, provides configurable Quality of Service (QoS) for each VM. With the increasing use of VM mobility to
shift application workloads from one hardware platform to another, conventional networking methods are no longer sufficient. Brocade meets the needs of more dynamic
virtualized environments by providing an integrated fabric and HBA solution that can
selectively deploy security and QoS to VM-hosted applications as required.
Brocade supports the transition to 8 Gbit/sec with the release of Fabric OS 6.1 and a full family of new
switches and HBAs for end-to-end connectivity in the data center:
Brocade 815 (single port) and 825 (dual port) HBAs
Brocade 300 Switch with 8, 16, and 24 ports
Brocade 5100 Switch with 24, 32, and 40 ports
Brocade 5300 Switch with 48, 64, and 80 ports
FC8-16, FC8-32, and FC8-48 port blades for the Brocade 48000 Director
Brocade 8 Gbit/sec switches comply with industry standards, and fabrics with 4 and
8 Gbit/sec devices interoperate seamlessly. Visit the Brocade Web site for data sheets
describing these products: www.brocade.com
Conclusion
The speed increase in Brocade switching platforms is one of many advantages of Brocade's next-generation ASIC family. Higher speed in the data center brings the immediate benefit of higher-performing ISLs and increased scalability; since ISL performance
is doubled, more ports can be used for servers and storage. In addition, 8 Gbit/sec is
needed for server virtualization, scaling of fabrics, backups, and high-performance
computing requirements. New capabilities, such as Adaptive Networking and Integrated
Routing, plus the enhanced power efficiencies of the new switch platforms are also
important drivers for adoption of 8 Gbit/sec technology. Every data center user has or
will have these needs in the future, and as data center plans are developed, Brocades
integrated end-to-end 8 Gbit/sec solution provides the broadest choice of capabilities
with the highest performance and efficiency.
NetApp
Innovative solutions for storage
and data management
NetApp embodies innovative storage and data management with excellent cost efficiency. The commitment to simplicity, innovation and the success of its customers has enabled the company to become one of the fastest growing storage and data management manufacturers. The wide-ranging solution portfolio for server-to-storage virtualization, business applications, data security and much more has persuaded customers worldwide to opt for NetApp.
NetApp ensures that your business-critical data is constantly available and can also simplify your business processes. Based on the motto "Go further, faster", NetApp helps companies to be successful. The storage requirement for company data will continue to grow fast in the coming years. This presents IT managers with the challenge of purchasing an ever increasing quantity of storage equipment and also having to manage those devices. With the help of its powerful unified storage architecture, NetApp helps companies overcome these challenges efficiently: extremely low operating costs (TCO), very fast backup and restore processes, high availability levels, consolidation and virtualization options as well as simplified, easy management of the entire storage environment are behind the NetApp slogan "Go further, faster".
Solutions
Microsoft, VMware, Oracle and SAP are important strategic NetApp partners, and NetApp has developed a wide range of tools for their database and application software.
Together with providers, such as VMware, NetApp offers solutions and best practices
for developing a virtualized infrastructure from servers to storage that provide a
number of advantages:
Scalable and consistent I/O performance for all ESX protocols (NFS, iSCSI and FC)
Flexible, fast, simple and low-priced provisioning and data management solutions
First-class virtualized storage solution for thin provisioning in heterogeneous storage environments
NetApp deduplication in the ESX environment
NetApp deduplication is one of the fundamental components of the Data ONTAP operating system. The elimination of redundant data objects and exclusive referencing to the original object permits more efficient use of the existing storage.
SnapManager for virtualized infrastructures (VI)
SnapManager for VI provides customers with an automated solution for backing up and
restoring virtual machines in a VMware ESX environment. The two main advantages of
this solution are:
The backups created using NetApp Snapshot technology use only a fraction of the storage space that traditional systems would require; the sketch below illustrates the principle.
The system performance of the ESX environment, and thus of the applications, is hardly impaired by the SnapManager backup and restore processes.
More than 5,000 customers worldwide (as of March 2008) already benefit from the advantages of a VMware solution with NetApp storage.
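Why is a snapshot backup so cheap? Because taking a snapshot copies only the map of block references, never the blocks themselves, so extra space is consumed only by data written after the snapshot. The following Python miniature illustrates this copy-on-write idea in generic form; the class and method names are our own invention, not the Data ONTAP interface.

    # Hypothetical copy-on-write snapshot in miniature (illustrative only).
    class Volume:
        def __init__(self):
            self.block_map = {}   # logical block number -> block data
            self.snapshots = {}   # snapshot name -> frozen reference map

        def write(self, lbn, data):
            self.block_map[lbn] = data

        def snapshot(self, name):
            # Shallow copy: unchanged blocks stay shared with the live volume.
            self.snapshots[name] = dict(self.block_map)

        def read(self, lbn, snapshot=None):
            source = self.snapshots[snapshot] if snapshot else self.block_map
            return source[lbn]

    vol = Volume()
    vol.write(0, b"original")
    vol.snapshot("backup-monday")
    vol.write(0, b"changed")              # live data moves on...
    assert vol.read(0) == b"changed"
    assert vol.read(0, snapshot="backup-monday") == b"original"  # ...backup stays intact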
Support for a virtual desktop infrastructure
In addition to server virtualization, the VMware Virtual Desktop Infrastructure (VDI) offers a further resource-saving virtualization technology. Application environments no longer run on the user's desktop but in virtual machines in the data center.
NetApp FlexClone can configure thousands of such virtual machines within minutes. Data deduplication enables storage capacity savings of approximately 90%.
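An illustrative calculation with assumed figures: 1,000 virtual desktops with a 10 GB system image each would nominally occupy 10 TB; because the operating system and application blocks are largely identical across the clones, deduplication at the quoted rate of roughly 90% shrinks the physical footprint to about 1 TB.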
NetApp solutions for SAP
As a worldwide technology partner of SAP, NetApp has a successful history of developing solutions which significantly simplify SAP data management. As one of the founding members of the Adaptive Computing Initiative for SAP, NetApp has been awarded numerous certificates for the compatibility of its storage solutions and is on the SAP compliance list for the SAP Adaptive Computing Services for Unix, Linux and Windows platforms.
NetApp won the SAP Pinnacle Award for Technical Innovation and Cooperation in the Adaptive Computing NetWeaver Innovation category for FlexFrame™ for mySAP Business Suite™, a joint development with Fujitsu Siemens Computers. Integral components are NetApp system cloning and backup/recovery scenarios.
The NetApp Unified Storage model provides SAN/IP SAN and NAS connections with block and file access methods within a single storage architecture.
Data management solutions such as FlexClone™ are used to clone SAP production systems within a few minutes, without affecting performance and without any additional initial storage requirement. This significantly simplifies the addition and management of systems for QA, test, development, reporting, interfaces and training.
The combination of NetApp Snapshot™ and SnapRestore provides SAP customers with fast, simple backup and restore processes for several TB of SAP data as well as efficient, simple upgrades and migrations of SAP systems. NetApp Adaptive Computing solutions enable SAP customers to react dynamically, flexibly and economically to business requirements.
NetApp also offers the following for those companies using SAP:
A comprehensive range of products for Windows, Unix and Linux environments with
unified NAS/SAN storage solutions.
ILM solutions: storage consolidation, backup and recovery, archiving and compliance
via ArchiveLink™ and/or WebDAV
High-availability and disaster recovery solutions as well as data encryption
SnapManager for SAP: this SAP-certified solution simplifies the creation of application-consistent Snapshot copies, automates error-free data restores and permits application-specific disaster recovery. Clones of the SAP database can also be created automatically.
A worldwide support agreement between NetApp and SAP ensures that the customer has 24x7 SAP infrastructure support.
Further information about this and other NetApp solutions can be found at www.netapp.com
CA RECOVERY MANAGEMENT
Benefits
The solution can be seamlessly integrated into existing IT management solutions so that Enterprise IT Management is simplified and extended.
CA Recovery Management:
a complete data protection solution
CA Recovery Management offers the comprehensive and integrated data protection and recovery functions which your company requires. Robust and proven technologies are used which are connected via one simplified interface. These technologies provide multi-level data protection which can be aligned to your company's targets, requirements and guidelines and which covers numerous hardware and software platforms.
Sun Microsystems
Markets
Sun Microsystems is a system provider that develops both hardware and software. Since its software development concentrates on resolving system-related tasks or setting strategically important milestones in line with Sun's vision, Sun does not compete with the developers of application programs. On the contrary, firm partnerships exist with numerous renowned software manufacturers in order to develop offerings together. In this way, customers retain their freedom, because they can decide in favor of the best solution on the market. To ensure that systems are integrated at an early stage, Sun Microsystems has set up a number of partner programs, which have pioneered both industry-specific and task-specific methods. There is a solution portfolio of almost 13,000 commercial and technical applications for Sun systems on the SPARC/Solaris platform.
In addition to the partnerships with independent software manufacturers, Sun is
very much committed to long-term sales partnerships with innovative distributors and
resellers. These partnerships have enabled fast and competent solutions for end customers on a widespread basis.
Highlights
Flexible scaling, for which only the capacity actually used is charged.
Cost savings through less space and lower power consumption.
Simpler storage management through partitioning and sharing.
Innovative technologies, risk reduction and development of new opportunities.
Support and services for successful installation, optimization and maintenance.
Highlights
Data consolidation: Excellent scaling as well as support for mixed media.
Shared resources: Designed for use with mainframes, Unix, Linux and Windows.
Higher availability: Upgrades without downtime, redundant operation.
Higher throughput: High performance in both throughput and capacity.
Simple scaling: Growth according to your requirements, without any downtime.
Low space requirements: High density of media slots, optimal utilization.
For more information on Sun solutions, visit www.sun.com
Symantec:
Confidence in a connected world
Symantec is a global leader in providing security, storage and systems management solutions to help businesses and consumers secure and manage their information. Headquartered in Cupertino, California, Symantec has operations in more than 40 countries.
Market Categories
Consumer Products; Security and Compliance; Storage and Availability Management;
Symantec Global Services.
Symantec's leaders bring decades of diverse experience and a history of success.
Combining business acumen with technical savvy, these executives guide more than
17,500 talented employees to create innovative products and solutions that enable
customers around the world to have confidence in their infrastructure, information and
interactions.