Section I

Storage System
In This Section
Chapter 1: Introduction to Information Storage
and Management
Chapter 2: Storage System Environment
Chapter 3: Data Protection: RAID
Chapter 4: Intelligent Storage Systems
Chapter 1
Introduction to Information Storage and
Management
Key Concepts
• Data and Information
• Structured and Unstructured Data
• Storage Technology Architectures
• Core Elements of a Data Center
• Information Management
• Information Lifecycle Management
Information is increasingly important in our
daily lives. We have become information
dependents of the twenty-first century, living in
an on-command, on-demand world, which means
that we need information when and where it is
required. We access the Internet every day to
perform searches, participate in social
networking, send and receive e-mails, share
pictures and videos, and use scores of other
applications. Equipped with a growing number
of content-generating devices, individuals now
create more information than businesses do.
Information created by individuals
gains value when shared with others. When
created, information resides locally on devices
such as cell phones, cameras, and laptops. To
be shared, this information needs to be uploaded
via networks to data centers. It is interesting to
note that while the majority of information is
created by individuals, it is stored and managed
by a relatively small number of organizations.
Figure 1-1 depicts this virtuous cycle of
information.
The importance, dependency, and volume of
information for the business world also continue
to grow at astounding rates. Businesses depend
on fast and reliable access to information critical
to their success. Some of the business
applications that process information include
airline reservations, telephone billing systems,
e-commerce, ATMs, product designs, inventory
management, e-mail archives, Web portals,
patient records, credit cards, life sciences, and
global capital markets.
The increasing criticality of information to
businesses has amplified the challenges in
protecting and managing the data. The volume
of data that businesses must manage has driven
strategies to classify data according to its value
and create rules for the treatment of this data
over its lifecycle. These strategies not only
provide financial and regulatory benefits at the
business level, but also manageability benefits
at operational levels to the organization.
Data centers now view information storage as
one of their core elements, along with
applications, databases, operating systems, and
networks. Storage technology continues to
evolve with technical advancements offering
increasingly higher levels of availability,
security, scalability, performance, integrity,
capacity, and manageability.
Figure 1-1: Virtuous cycle of information

This chapter describes the evolution of
information storage architecture from simple
direct-attached models to complex networked
topologies. It introduces the information
lifecycle management (ILM) strategy, which
aligns the information technology (IT)
infrastructure with business priorities.
1.1 Information Storage
Businesses use data to derive information that is
critical to their day-to-day operations. Storage is
a repository that enables users to store and
retrieve this digital data.
1.1.1 Data
Data is a collection of raw facts from which
conclusions may be drawn. Handwritten letters,
a printed book, a family photograph, a movie on
video tape, printed and duly signed copies of
mortgage papers, a bank's ledgers, and an
account holder's passbooks are all examples of
data.
Before the advent of computers, the procedures
and methods adopted for data creation and
sharing were limited to fewer forms, such as
paper and film. Today, the same data can be
converted into more convenient forms such as
an e-mail message, an e-book, a bitmapped
image, or a digital movie. This data can be
generated using a computer and stored in strings
of 0s and 1s, as shown in Figure 1-2. Data in this
form is called digital data and is accessible by
the user only after it is processed by a computer.
Figure 1-2: Digital data
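To make the idea of data stored as strings of 0s and 1s concrete, the following minimal Python sketch converts a short text into its bit pattern and back. The function names to_bits and from_bits are illustrative only, and UTF-8 is an assumed character encoding; neither comes from the text above.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of text as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Rebuild the text from space-separated 8-bit groups."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


if __name__ == "__main__":
    encoded = to_bits("Hi")
    print(encoded)             # 01001000 01101001
    print(from_bits(encoded))  # Hi
```

The round trip illustrates why digital data is meaningful to a user only after a computer has processed it: the stored form is just the bit string.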

With the advancement of computer and
communication technologies, the rate of data
generation and sharing has increased
exponentially. The following is a list of some of
the factors that have contributed to the growth
of digital data:
• Increase in data processing capabilities:
Modern-day computers provide a significant
increase in processing and storage
capabilities. This enables the conversion of
various types of content and media from
conventional forms to digital formats.
• Lower cost of digital storage: Technological
advances and the decreasing cost of storage
devices have provided low-cost solutions
and encouraged the development of less
expensive data storage devices. This cost
benefit has increased the rate at which data
is being generated and stored.
• Affordable and faster communication
technology: Digital data can now be shared
much faster than with traditional
approaches. A handwritten letter may take a
week to reach its destination, whereas it only
takes a few seconds for an e-mail message to
reach its recipient.
Inexpensive and easier ways to create, collect,
and store all types of data, coupled with
increasing individual and business needs, have
led to accelerated data growth, popularly termed
the data explosion. Data has different purposes
and criticality, so both individuals and
businesses have contributed in varied
proportions to this data explosion.
The importance and the criticality of data vary
with time. Most of the data created holds
significance in the short term but becomes less
valuable over time. This governs the type of
data storage solutions used. Individuals store
data on a variety of storage devices, such as
hard disks, CDs, DVDs, or Universal Serial Bus
(USB) flash drives.

Examples of Research and Business Data
• Seismology: Involves collecting data
related to various sources and
parameters of earthquakes, and other
relevant data that needs to be processed
to derive meaningful information.
• Product data: Includes data related to
various aspects of a product, such as
inventory, description, pricing,
availability, and sales.
• Customer data: A combination of data
related to a company's customers, such
as order details, shipping addresses, and
purchase history.
• Medical data: Data related to the health
care industry, such as patient history,
radiological images, details of
medication and other treatment, and
insurance information.
Businesses generate vast amounts of data and
then extract meaningful information from this
data to derive economic benefits. Therefore,
businesses need to maintain data and ensure its
availability over a longer period. Furthermore,
the data can vary in criticality and may require
special handling. For example, legal and
regulatory requirements mandate that banks
maintain account information for their
customers accurately and securely. Some
businesses handle data for millions of
customers and must ensure the security and
integrity of that data over a long period of time.
This requires
high-capacity storage devices with enhanced
security features that can retain data for a long
period.
1.1.2 Types of Data
Data can be classified as structured or
unstructured (see Figure 1-3) based on how it is
stored and managed. Structured data is
organized in rows and columns in a rigidly
defined format so that applications can retrieve
and process it efficiently. Structured data is
typically stored using a database management
system (DBMS).
Data is unstructured if its elements cannot be
stored in rows and columns, and is therefore
difficult to query and retrieve by business
applications. For example, customer contacts
may be stored in various forms such as sticky
notes, e-mail messages, business cards, or even
digital format files such as .doc, .txt, and .pdf.
Due to its unstructured nature, this data is
difficult to retrieve using a customer relationship
management application. Unstructured data may
not have the required components to identify
itself uniquely for any type of processing or
interpretation. Businesses are primarily
concerned with managing unstructured data
because over 80 percent of enterprise data is
unstructured and requires significant storage
space and effort to manage.
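The contrast between the two types can be illustrated with a short Python sketch that uses the standard sqlite3 module as a stand-in for a DBMS. The table layout, the sample contacts, the free-form notes, and the helper names are all hypothetical and serve only to show why rows and columns are easier to query than free-form content.

```python
import sqlite3

# Structured data: contacts held in rows and columns (hypothetical sample rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, city TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        ("Ana Ruiz", "Boston", "555-0101"),
        ("Li Wei", "Austin", "555-0102"),
        ("Sam Hill", "Boston", "555-0103"),
    ],
)


def contacts_in_city(city: str) -> list:
    """Exact lookup on a defined column, which a DBMS answers directly."""
    rows = conn.execute(
        "SELECT name FROM contacts WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]


# Unstructured data: the same kind of facts buried in free-form notes.
notes = [
    "Met Ana Ruiz at the Boston expo, call 555-0101",
    "Li Wei (Austin office?) prefers e-mail",
    "business card: S. Hill, somewhere in MA",
]


def notes_mentioning(term: str) -> list:
    """Best-effort text search; it misses anything worded differently."""
    return [note for note in notes if term.lower() in note.lower()]


if __name__ == "__main__":
    print(contacts_in_city("Boston"))   # both Boston contacts
    print(notes_mentioning("Boston"))   # only the note that spells out the city
```

The structured query finds both Boston contacts, while the text search over the notes finds only one, because the third note never names the city.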
1.1.3 Information
Data, whether structured or unstructured, does
not fulfill any purpose for individuals or
businesses unless it is presented in a meaningful
form. Businesses need to analyze data for it to
be of value. Information is the intelligence and
knowledge derived from data.
Businesses analyze raw data in order to identify
meaningful trends. On the basis of these trends,
a company can plan or modify its strategy. For
example, a retailer identifies customers'
preferred products and brand names by
analyzing their purchase patterns, and then
maintains an inventory of those products.
Effective data analysis not only extends its
benefits to existing businesses, but also creates
the potential for new business opportunities by
using the information in creative ways. A job
portal is an example. In order to reach a wider
set of prospective employers, job seekers post
their resumes on various websites offering job
search facilities. These websites collect the
resumes and post them on centrally accessible
locations for prospective employers. In addition,
companies post available positions on job search
sites. Job-matching software matches keywords
from resumes to keywords in job postings. In
this manner, the job search engine uses data and
turns it into information for employers and job
seekers.
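The keyword matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, the scoring rule (the share of a posting's keywords found in a resume), and the function names are hypothetical, and real job-matching software is far more elaborate.

```python
import re

# Hypothetical stop words that carry no meaning for matching.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def keywords(text: str) -> set:
    """Lowercase the text and keep its distinct non-stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


def match_score(resume: str, posting: str) -> float:
    """Fraction of the posting's keywords that also appear in the resume."""
    wanted = keywords(posting)
    if not wanted:
        return 0.0
    return len(wanted & keywords(resume)) / len(wanted)


def rank_resumes(resumes: dict, posting: str) -> list:
    """Return candidate names ordered from best to worst match."""
    return sorted(
        resumes,
        key=lambda name: match_score(resumes[name], posting),
        reverse=True,
    )


if __name__ == "__main__":
    posting = "Storage administrator with SAN experience"
    resumes = {
        "candidate_a": "SAN and NAS storage administrator",
        "candidate_b": "Java developer",
    }
    print(rank_resumes(resumes, posting))  # candidate_a ranks first
```

Here the raw resumes and postings are the data, and the ranked list is the information derived from them.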
Figure 1-3: Types of data

Because information is critical to the success of
a business, there is an ever-present concern
about its availability and protection. Legal,
regulatory, and contractual obligations
regarding the availability and protection of data
only add to these concerns. Outages in key
industries, such as financial services,
telecommunications, manufacturing, retail, and
energy, cost millions of U.S. dollars per hour.
1.1.4 Storage
Data created by individuals or businesses must
be stored so that it is easily accessible for
further processing. In a computing environment,
devices designed for storing data are termed
storage devices or simply storage. The type of
storage used varies based on the type of data
and the rate at which it is created and used.
Devices such as memory in a cell phone or
digital camera, DVDs, CD-ROMs, and hard
disks in personal computers are examples of
storage devices.
Businesses have several options available for
storing data, including internal hard disks,
external disk arrays, and tapes.
1.2 Evolution of Storage Technology and
Architecture
Historically, organizations had centralized
computers (mainframe) and information storage
devices (tape reels and disk packs) in their data
center. The evolution of open systems and the
affordability and ease of deployment that they
offer made it possible for business
units/departments to have their own servers and
storage. In earlier implementations of open
systems, the storage was typically internal to the
server.
The proliferation of departmental servers in an
enterprise resulted in unprotected, unmanaged,
fragmented islands of information and increased
operating cost. Originally, there were very
limited policies and processes for managing
these servers and the data created. To overcome
these challenges, storage technology evolved
from non-intelligent internal storage to
intelligent networked storage (see Figure 1-4).
Highlights of this technology evolution include:
• Redundant Array of Independent Disks
(RAID): This technology was developed to
address the cost, performance, and
availability requirements of data. It
continues to evolve today and is used in all
storage architectures such as DAS, SAN,
and so on.
• Direct-attached storage (DAS): This type of
storage connects directly to a server (host)
or a group of servers in a cluster. Storage
can be either internal or external to the
server. External DAS alleviated the
challenges of limited internal storage
capacity.
• Storage area network (SAN): This is a
dedicated, high-performance Fibre Channel
(FC) network to facilitate block-level
communication between servers and storage.
Storage is partitioned and assigned to a
server for accessing its data. SAN offers
scalability, availability, performance, and
cost benefits compared to DAS.
• Network-attached storage (NAS): This is
dedicated storage for file serving
applications. Unlike a SAN, it connects to
an existing communication network (LAN)
and provides file access to heterogeneous
clients. Because it is purpose-built for
providing storage to file server applications,
it offers higher scalability, availability,
performance, and cost benefits compared to
general-purpose file servers.
• Internet Protocol SAN (IP-SAN): One of the
latest evolutions in storage architecture,
IP-SAN is a convergence of technologies used
in SAN and NAS. IP-SAN provides
block-level communication across a local or wide
area network (LAN or WAN), resulting in
greater consolidation and availability of
data.
Figure 1-4: Evolution of storage architectures

Storage technology and architecture continue
to evolve, which enables organizations to
consolidate, protect, optimize, and leverage their
data to achieve the highest return on information
assets.
1.3 Data Center Infrastructure
Organizations maintain data centers to provide
centralized data processing capabilities across
the enterprise. Data centers store and manage
large amounts of mission-critical data. The data
center infrastructure includes computers, storage
systems, network devices, dedicated power
backups, and environmental controls (such as
air conditioning and fire suppression).
Large organizations often maintain more than
one data center to distribute data processing
workloads and provide backups in the event of a
disaster. The storage requirements of a data
center are met by a combination of various
storage architectures.
1.3.1 Core Elements
Five core elements are essential for the basic
functionality of a data center:
• Application: An application is a computer
program that provides the logic for
computing operations. Applications, such as
an order processing system, can be layered
on a database, which in turn uses operating
system services to perform read/write
operations to storage devices.
• Database: More commonly, a database
management system (DBMS) provides a
structured way to store data in logically
organized tables that are interrelated. A
DBMS optimizes the storage and retrieval of
data.
• Server and operating system: A computing
platform that runs applications and
databases.
• Network: A data path that facilitates
communication between clients and servers
or between servers and storage.
• Storage array: A device that stores data
persistently for subsequent use.
These core elements are typically viewed and
managed as separate entities, but all the
elements must work together to address data
processing requirements.
Figure 1-5 shows an example of an order
processing system that involves the five core
elements of a data center and illustrates their
functionality in a business process.
Figure 1-5: Example of an order processing
system

1.3.2 Key Requirements for Data Center
Elements
Uninterrupted operation of data centers is
critical to the survival and success of a business.
It is necessary to have a reliable infrastructure
that ensures data is accessible at all times. While
the requirements, shown in Figure 1-6, are
applicable to all elements of the data center
infrastructure, our focus here is on storage
systems. The various technologies and solutions
to meet these requirements are covered in this
book.
Figure 1-6: Key characteristics of data center
elements

• Availability: All data center elements should
be designed to ensure accessibility. The
inability of users to access data can have a
significant negative impact on a business.
• Security: Policies, procedures, and proper
integration of the data center core elements
that will prevent unauthorized access to
information must be established. In addition
to the security measures for client access,
specific mechanisms must enable servers to
access only their allocated resources on
storage arrays.
• Scalability: Data center operations should be
able to allocate additional processing
capabilities or storage on demand, without
interrupting business operations. Business
growth often requires deploying more
servers, new applications, and additional
databases. The storage solution should be
able to grow with the business.
• Performance: All the core elements of the
data center should be able to provide
optimal performance and service all
processing requests at high speed. The
infrastructure should be able to support
performance requirements.
• Data integrity: Data integrity refers to
mechanisms such as error correction codes
or parity bits, which ensure that data is
written to disk exactly as it was received.
Any variation in data during its retrieval
implies corruption, which may affect the
operations of the organization.
• Capacity: Data center operations require
adequate resources to store and process large
amounts of data efficiently. When capacity
requirements increase, the data center must
be able to provide additional capacity
without interrupting availability, or, at the
very least, with minimal disruption.
Capacity may be managed by reallocation of
existing resources, rather than by adding
new resources.
• Manageability: A data center should
perform all operations and activities in the
most efficient manner. Manageability can be
achieved through automation and the
reduction of human (manual) intervention in
common tasks.
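As an illustration of the parity bits mentioned under data integrity, the following Python sketch computes a single even-parity bit over a block of bytes and rechecks it on read. The function names and the sample block are hypothetical. A single parity bit can only detect an odd number of flipped bits and cannot correct them; error correction codes are more capable, and real storage devices use far stronger schemes than this sketch.

```python
def parity_bit(data: bytes) -> int:
    """Even-parity bit: 1 if the data holds an odd number of 1 bits, else 0."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def is_intact(data: bytes, stored_parity: int) -> bool:
    """Compare the parity recomputed on read with the parity saved on write."""
    return parity_bit(data) == stored_parity


if __name__ == "__main__":
    block = b"order-42"
    saved = parity_bit(block)                    # computed at write time
    # Simulate corruption by flipping the lowest bit of the first byte.
    corrupted = bytes([block[0] ^ 0b00000001]) + block[1:]
    print(is_intact(block, saved))      # True
    print(is_intact(corrupted, saved))  # False
```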
1.3.3 Managing Storage Infrastructure
Managing a modern, complex data center
involves many tasks. Key management activities
include:
• Monitoring is the continuous collection of
information and the review of the entire data
center infrastructure. The aspects of a data
center that are monitored include security,
performance, accessibility, and capacity.
• Reporting is done periodically on resource
performance, capacity, and utilization.
Reporting tasks help to establish business
justifications and chargeback of costs
associated with data center operations.
• Provisioning is the process of providing the
hardware, software, and other resources
needed to run a data center. Provisioning
activities include capacity and resource
planning. Capacity planning ensures that the
user's and the application's future needs will
be addressed in the most cost-effective and
controlled manner. Resource planning is the
process of evaluating and identifying
required resources, such as personnel, the
facility (site), and the technology. Resource
planning ensures that adequate resources are
available to meet user and application
requirements.
For example, the utilization of an application's
allocated storage capacity may be monitored. As
soon as utilization of the storage capacity
reaches a critical value, additional storage
capacity may be provisioned to the application.
If utilization of the storage capacity is properly
monitored and reported, business growth can be
understood and future capacity requirements can
be anticipated. This helps to frame a proactive
data management policy.
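The monitoring and provisioning example can be sketched in Python as follows. The 80 percent critical value, the weekly sampling interval, the straight-line growth forecast, and all function names are assumptions made for illustration; they are not prescribed by the text.

```python
from typing import List, Optional


def utilization(used_gb: float, allocated_gb: float) -> float:
    """Share of the allocated capacity that is currently in use."""
    return used_gb / allocated_gb


def needs_provisioning(used_gb: float, allocated_gb: float,
                       critical: float = 0.80) -> bool:
    """True once utilization reaches the (assumed) critical value."""
    return utilization(used_gb, allocated_gb) >= critical


def weeks_until_full(weekly_used_gb: List[float],
                     allocated_gb: float) -> Optional[float]:
    """Straight-line forecast from weekly usage samples, oldest first.

    Returns None when there are too few samples or usage is not growing.
    """
    if len(weekly_used_gb) < 2:
        return None
    growth = (weekly_used_gb[-1] - weekly_used_gb[0]) / (len(weekly_used_gb) - 1)
    if growth <= 0:
        return None
    return (allocated_gb - weekly_used_gb[-1]) / growth


if __name__ == "__main__":
    samples = [400.0, 450.0, 500.0]             # reported usage per week
    print(needs_provisioning(samples[-1], 1000.0))  # False: 50% utilized
    print(weeks_until_full(samples, 1000.0))        # 10.0
```

The first function pair corresponds to reactive provisioning at a critical value, and the forecast corresponds to anticipating future capacity requirements from reported utilization.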
1.4 Key Challenges in Managing Information
In order to frame an effective information
management policy, businesses need to consider
the following key challenges of information
management:
• Exploding digital universe: The rate of
information growth is increasing
exponentially. Duplication of data to ensure
high availability and repurposing has also
contributed to the multifold increase of
information growth.
• Increasing dependency on information: The
strategic use of information plays an
important role in determining the success of
a business and provides competitive
advantages in the marketplace.
• Changing value of information: Information
that is valuable today may become less
important tomorrow. The value of
information often changes over time.
Framing a policy to meet these challenges
involves understanding the value of information
over its lifecycle.
1.5 Information Lifecycle
The information lifecycle is the "change in the
value of information" over time. When data is
first created, it often has the highest value and is
used frequently. As data ages, it is accessed less
frequently and is of less value to the
organization. Understanding the information
lifecycle helps to deploy appropriate storage
infrastructure, according to the changing value
of information.
For example, in a sales order application, the
value of the information changes from the time
the order is placed until the time that the
warranty becomes void (see Figure 1-7). The
value of the information is highest when a
company receives a new sales order and
processes it to deliver the product. After order
fulfillment, the customer or order data need not
be available for real-time access. The company
can transfer this data to less expensive
secondary storage with lower accessibility and
availability requirements unless or until a
warranty claim or another event triggers its
need. After the warranty becomes void, the
company can archive or dispose of data to create
space for other high-value information.
Figure 1-7: Changing value of sales order
information
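The sales order example can be expressed as a small placement rule. The Python sketch below is one possible reading of that example: the storage class names, the claim_open flag, and the function name are hypothetical, and treating an open warranty claim as a reason to bring data back to primary storage is an assumption drawn from the phrase "unless or until a warranty claim or another event triggers its need."

```python
from datetime import date
from typing import Optional


def storage_class(today: date, fulfilled_on: Optional[date],
                  warranty_ends: date, claim_open: bool = False) -> str:
    """Pick a storage class for a sales order based on its lifecycle stage."""
    # New or in-process orders are at their highest value.
    if fulfilled_on is None or today < fulfilled_on:
        return "primary"
    # Once the warranty is void, the data can be archived or disposed of.
    if today > warranty_ends:
        return "archive"
    # Fulfilled and under warranty: secondary, unless a claim needs the data.
    return "primary" if claim_open else "secondary"


if __name__ == "__main__":
    fulfilled = date(2024, 1, 10)
    warranty = date(2025, 1, 10)
    for day in (date(2024, 1, 5), date(2024, 6, 1), date(2025, 2, 1)):
        print(day, storage_class(day, fulfilled, warranty))
```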

1.5.1 Information Lifecycle Management
Today's business requires data to be protected
and available 24×7. Data centers can
accomplish this with the optimal and
appropriate use of storage infrastructure. An
effective information management policy is
required to support this infrastructure and
leverage its benefits.
Information lifecycle management (ILM) is a
proactive strategy that enables an IT
organization to effectively manage the data
throughout its lifecycle, based on predefined
business policies. This allows an IT
organization to optimize the storage
infrastructure for maximum return on
investment. An ILM strategy should include the
following characteristics:
• Business-centric: It should be integrated
with key processes, applications, and
initiatives of the business to meet both
current and future growth in information.
• Centrally managed: All the information
assets of a business should be under the
purview of the ILM strategy.
• Policy-based: The implementation of ILM
should not be restricted to a few
departments. ILM should be implemented as
a policy and encompass all business
applications, processes, and resources.
• Heterogeneous: An ILM strategy should
take into account all types of storage
platforms and operating systems.
• Optimized: Because the value of
information varies, an ILM strategy should
consider the different storage requirements
and allocate storage resources based on the
information's value to the business.

Tiered Storage
Tiered storage is an approach to define different
storage levels in order to reduce total storage
cost. Each tier has different levels of protection,
performance, data access frequency, and other
considerations. Information is stored and moved
between different tiers based on its value over
time. For example, mission-critical, most
accessed information may be stored on Tier 1
storage, which consists of high-performance
media with the highest level of protection.
Moderately accessed and other important data is
stored on Tier 2 storage, which may be on less
expensive media with moderate performance
and protection. Rarely accessed or event-specific
information may be stored on lower
tiers of storage.
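A tiering policy of this kind can be sketched as a simple classification function. In the Python sketch below, the DataSet fields, the access-count thresholds, and the decision to place any mission-critical data on Tier 1 regardless of access count are illustrative assumptions; an actual policy would come from the organization's own business rules.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    accesses_per_month: int
    mission_critical: bool = False


def assign_tier(item: DataSet, high: int = 1000, medium: int = 50) -> int:
    """Map a data set to a storage tier using assumed access thresholds."""
    if item.mission_critical or item.accesses_per_month >= high:
        return 1   # high-performance media, highest protection
    if item.accesses_per_month >= medium:
        return 2   # less expensive media, moderate performance
    return 3       # rarely accessed or event-specific data


if __name__ == "__main__":
    catalog = [
        DataSet("open orders", 5000, mission_critical=True),
        DataSet("last quarter invoices", 200),
        DataSet("2015 audit files", 2),
    ]
    for item in catalog:
        print(item.name, "-> Tier", assign_tier(item))
```

Re-running the classification as access counts change is what moves information between tiers over time.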
1.5.2 ILM Implementation
The process of developing an ILM strategy
includes four activities, namely classifying,
implementing, managing, and organizing:
• Classifying data and applications on the
basis of business rules and policies to enable
differentiated treatment of information
• Implementing policies by using information
management tools, starting from the creation
of data and ending with its disposal
• Managing the environment by using
integrated tools to reduce operational
complexity
• Organizing storage resources in tiers to align
the resources with data classes, and storing
information in the right type of
infrastructure based on the information's
current value
Implementing ILM across an enterprise is an
ongoing process. Figure 1-8 illustrates a
three-step road map to enterprise-wide ILM.
Steps 1 and 2 are aimed at implementing ILM in
a limited way across a few enterprise-critical
applications. In Step 1, the goal is to implement
a storage networking environment. Storage
architectures offer varying levels of protection
and performance, and this acts as a foundation
for future policy-based information management
in Steps 2 and 3. The value of tiered storage
platforms can be exploited by allocating
appropriate storage resources to the applications
based on the value of the information processed.
Step 2 takes ILM to the next level, with detailed
application or data classification and linkage of
the storage infrastructure to business policies.
These classifications and the resultant policies
can be automatically executed using tools for
one or more applications, resulting in better
management and optimal allocation of storage
resources.
Step 3 of the implementation is to automate
more of the application or data classification
and policy management activities in order to
scale to a wider set of enterprise applications.
Figure 1-8: Implementation of ILM

1.5.3 ILM Benefits
Implementing an ILM strategy has the following
key benefits that directly address the challenges
of information management:
• Improved utilization by using tiered storage
platforms and increased visibility of all
enterprise information.
• Simplified management by integrating
process steps and interfaces with individual
tools and by increasing automation.
• A wider range of options for backup and
recovery to balance the need for business
continuity.
• Maintaining compliance by knowing what
data needs to be protected for what length of
time.
• Lower Total Cost of Ownership (TCO) by
aligning the infrastructure and management
costs with information value. As a result,
resources are not wasted, and complexity is
not introduced by managing low-value data
at the expense of high-value data.
Summary
This chapter described the importance of data,
information, and storage infrastructure. Meeting
today's storage needs begins with understanding
the type of data, its value, and key management
requirements of a storage system.
This chapter also emphasized the importance of
the ILM strategy, which businesses are adopting
to manage information effectively across the
enterprise. ILM is enabling businesses to gain
competitive advantage by classifying,
protecting, and leveraging information.
The evolution of storage architectures and the
core elements of a data center covered in this
chapter provided the foundation for information
storage. The next chapter discusses the storage
system environment.
Exercises
1. A hospital uses an application that stores
patient X-ray data in the form of large
binary objects in an Oracle database. The
application is hosted on a UNIX server, and
the hospital staff accesses the X-ray records
through a Gigabit Ethernet backbone.
A storage array provides storage to the UNIX
server, which has 6 terabytes of usable
capacity.
   • Explain the core elements of the data
     center. What are the typical challenges
     the storage management team may face
     in meeting the service-level demands of
     the hospital staff?
   • Describe how the value of this patient
     data might change over time.
2. An engineering design department of a large
company maintains over 600,000
engineering drawings that its designers
access and reuse in their current projects,
modifying or updating them as required. The
design team wants instant access to the
drawings for its current projects, but is
currently constrained by an infrastructure
that is not able to scale to meet the response
time requirements. The team has classified
the drawings as "most frequently accessed,"
"frequently accessed," "occasionally
accessed," and "archive."
   • Suggest a strategy for the design department
     that optimizes the storage infrastructure
     by using ILM.
   • Explain how you will use "tiered
     storage" based on access frequency.
   • Detail the hardware and software
     components you will need to implement
     your strategy.
   • Research products and solutions
     currently available to meet the solution
     you are proposing.
3. The marketing department at a mid-size firm
is expanding. New hires are being added to
the department and they are given network
access to the department's files. IT has given
marketing a networked drive on the LAN,
but it keeps reaching capacity every third
week. Current capacity is 500 gigabytes
(and growing), with hundreds of files. Users
are complaining about LAN response times
and capacity. As the IT manager, what could
you recommend to improve the situation?
4. A large company is considering a storage
infrastructure, one that is scalable and
provides high availability. More
importantly, the company also needs
performance for its mission-critical
applications. Which storage topology would
you recommend (SAN, NAS, IP-SAN) and
why?
Chapter 2
Storage System Environment
Key Concepts
• Host, Connectivity, and Storage
• Block-Level and File-Level Access
• File System and Volume Manager
• Storage Media and Devices
• Disk Components
• Zoned Bit Recording
• Logical Block Addressing
• Little's Law and the Utilization Law
Storage, as one of the core elements of a data
center, is recognized as a distinct resource, and it
needs focus and specialization for its
implementation and management. Data
flows from an application to storage through
various components collectively referred to as a
storage system environment. The three main
components in this environment are the host,
connectivity, and storage. These entities, along
with their physical and logical components,
facilitate data access.
This chapter details the storage system
environment and focuses primarily on storage. It
provides details on various hardware
components of a disk drive, disk geometry, and
the fundamental laws that govern disk
performance. The connectivity between the host
and storage facilitated by bus technology and
interface protocols is also explained.
This chapter provides an understanding of
various logical components of hosts such as file
systems, volume managers, and operating
systems, and their role in the storage system
environment.
2.1 Components of a Storage System
Environment
The three main components in a storage system
environment (the host, connectivity, and
storage) are described in this section.
2.1.1 Host
Users store and retrieve data through
applications. The computers on which these
applications run are referred to as hosts. Hosts
can range from simple laptops to complex
clusters of servers. A host consists of physical
components (hardware devices) that
communicate with one another using logical
components (software and protocols). Access to
data and the overall performance of the storage
system environment depend on both the
physical and logical components of a host. The
logical components of the host are detailed in
Section 2.5 of this chapter.
Physical Components
A host has three key physical components:
• Central processing unit (CPU)
• Storage, such as internal memory and disk
devices
• Input/Output (I/O) devices
The physical components communicate with
one another by using a communication pathway
called a bus. A bus connects the CPU to other
components, such as storage and I/O devices.
C!&
The CPU consists of four main components:
• Arithmetic Logic Unit (ALU): This is the
fundamental building block of the CPU. It
performs arithmetical and logical operations
such as addition, subtraction, and Boolean
functions (AND, OR, and NOT).
• Control Unit: A digital circuit that controls
CPU operations and coordinates the
functionality of the CPU.
• Register: A collection of high-speed storage
locations. The registers store intermediate
data that is required by the CPU to execute
an instruction and provide fast access
because of their proximity to the ALU.
CPUs typically have a small number of
registers.
• Level 1 (L1) cache: Found on modern-day
CPUs, it holds data and program instructions
that are likely to be needed by the CPU in
the near future. The L1 cache is slower than
registers, but provides more storage space.
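The ALU operations listed above can be illustrated with a small sketch. It assumes an 8-bit word and a hypothetical function name (`alu`); neither comes from the text, and real ALUs are hardware circuits rather than software.

```python
MASK = 0xFF  # assumption: an 8-bit word, chosen only for illustration


def alu(op: str, a: int, b: int = 0) -> int:
    """Apply one arithmetic or Boolean operation and truncate to 8 bits."""
    operations = {
        "ADD": lambda x, y: x + y,
        "SUB": lambda x, y: x - y,
        "AND": lambda x, y: x & y,
        "OR": lambda x, y: x | y,
        "NOT": lambda x, y: ~x,  # unary: the second operand is ignored
    }
    return operations[op](a, b) & MASK


print(alu("ADD", 200, 100))        # 300 does not fit in 8 bits, wraps to 44
print(alu("AND", 0b1100, 0b1010))  # 8
print(alu("NOT", 0b00001111))      # 240
```

The mask models the fixed register width: results that do not fit in the word are truncated, which is why 200 + 100 wraps around.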
Storage
Memory and storage media are used to store
data, either persistently or temporarily. Memory
modules are implemented using semiconductor
chips, whereas storage devices use either
magnetic or optical media. Memory modules
enable data access at a higher speed than the
storage media. Generally, there are two types of
memory on a host:
• Random Access Memory (RAM): This
allows direct access to any memory location
and can have data written into it or read
from it. RAM is volatile; this type of
memory requires a constant supply of power
to maintain memory cell content. Data is
erased when the system's power is turned
off or interrupted.
• Read-Only Memory (ROM): Non-volatile
and only allows data to be read from it.
ROM holds data for execution of internal
routines, such as system startup.
Storage devices are less expensive than
semiconductor memory. Examples of storage
devices are as follows:
• Hard disk (magnetic)
• CD-ROM or DVD-ROM (optical)
• Floppy disk (magnetic)
• Tape drive (magnetic)
I/O Devices
I/O devices enable sending and receiving data to
and from a host. This communication may be
one of the following types:
• User to host communications: Handled by
basic I/O devices, such as the keyboard,
mouse, and monitor. These devices enable
users to enter data and view the results of
operations.
• Host to host communications: Enabled using
devices such as a Network Interface Card
(NIC) or modem.
• Host to storage device communications:
Handled by a Host Bus Adapter (HBA).
An HBA is an application-specific integrated
circuit (ASIC) board that performs I/O
interface functions between the host and the
storage, relieving the CPU of additional
I/O processing workload. HBAs also
provide connectivity outlets known as ports
to connect the host to the storage device. A
host may have multiple HBAs.
2.1.2 Connectivity
Connectivity refers to the interconnection
between hosts or between a host and any other
peripheral devices, such as printers or storage
devices. The discussion here focuses on the
connectivity between the host and the storage
device. The components of connectivity in a
storage system environment can be classified as
physical and logical. The physical components
are the hardware elements that connect the host
to storage, and the logical components of
connectivity are the protocols used for
communication between the host and storage.
The communication protocols are covered in
Chapter 5.
Physical Components of Connectivity
The three physical components of connectivity
between the host and storage are Bus, Port, and
Cable (Figure 2-1).
Figure 2-1: Physical components of connectivity
The bus is the collection of paths that facilitates
data transmission from one part of a computer to
another, such as from the CPU to the memory.
The port is a specialized outlet that enables
connectivity between the host and external
devices. Cables connect hosts to internal or
external devices using copper or fiber optic
media.
Physical components communicate across a bus
by sending bits (control, data, and address bits)
between devices. These bits are transmitted
through the bus in either of the following ways:
• Serially: Bits are transmitted sequentially
along a single path. This transmission can be
unidirectional or bidirectional.
• In parallel: Bits are transmitted along
multiple paths simultaneously. Parallel
transmission can also be bidirectional.
The size of a bus, known as its width,
determines the amount of data that can be
transmitted through the bus at one time. The
width of a bus can be compared to the number
of lanes on a highway. For example, a 32-bit
bus can transmit 32 bits of data and a 64-bit bus
can transmit 64 bits of data simultaneously.
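The effect of bus width can be sketched with a short calculation. This is an idealized model: it assumes the bus moves exactly its width in bits on every clock cycle and ignores protocol overhead. The function name and the 4 KB payload are illustrative choices, not values from the text; a width of 1 stands in for a single serial path.

```python
import math


def transfer_cycles(payload_bits: int, bus_width_bits: int) -> int:
    """Clock cycles needed if the bus moves bus_width_bits per cycle."""
    return math.ceil(payload_bits / bus_width_bits)


payload = 4096 * 8  # a hypothetical 4 KB block, expressed in bits

for width in (1, 32, 64):
    print(f"{width:>2}-bit path: {transfer_cycles(payload, width)} cycles")
```

Under these assumptions, doubling the width from 32 to 64 bits halves the number of cycles, which is the point of the highway-lanes comparison.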
Every bus has a clock speed measured in MHz
(megahertz). Together with the bus width, the
clock speed determines the data transfer rate
between the end points of the bus. A fast
bus allows faster transfer of data, which enables
applications to run faster.
Buses, as conduits of data transfer on the
computer system, can be classified as follows:
• System bus: The bus that carries data from
the processor to memory.
• Local or I/O bus: A high-speed pathway that
connects directly to the processor and carries
data between the peripheral devices, such as
storage devices, and the processor.
Logical Components of Connectivity
The popular interface protocol used for the local
bus to connect to a peripheral device is
Peripheral Component Interconnect (PCI). The
interface protocols that connect to disk systems
are Integrated Device Electronics/Advanced
Technology Attachment (IDE/ATA) and Small
Computer System Interface (SCSI).
PCI
PCI is a specification that standardizes how PCI
expansion cards, such as network cards or
modems, exchange information with the CPU.
PCI provides the interconnection between the
CPU and attached devices. The plug-and-play
functionality of PCI enables the host to easily
recognize and configure new cards and devices.
The width of a PCI bus can be 32 bits or 64 bits.
A 32-bit PCI bus can provide a throughput of
133 MB/s. PCI Express is an enhanced version
of the PCI bus with considerably higher throughput
and clock speed.
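The 133 MB/s figure can be reproduced with a rough calculation. The sketch assumes the conventional PCI clock of about 33.33 MHz, which the text does not state, and one transfer per clock cycle. The function name is hypothetical, and the result is a theoretical peak rather than a sustained rate.

```python
def peak_throughput_mb_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak in MB/s, assuming one transfer per clock cycle."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz  # MHz = millions of cycles/second


PCI_CLOCK_MHZ = 33.33  # assumption: conventional PCI clock, not from the text

print(round(peak_throughput_mb_s(32, PCI_CLOCK_MHZ)))  # 133
print(round(peak_throughput_mb_s(64, PCI_CLOCK_MHZ)))  # 267
```

The same formula shows why both a wider bus and a faster clock raise throughput: each is a direct multiplier on the peak rate.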
IDE/ATA
IDE/ATA is the most popular interface protocol
used on modern disks. This protocol offers
excellent performance at relatively low cost.
Details of IDE/ATA are provided in Chapter 5.
SCSI
SCSI has emerged as a preferred protocol in
high-end computers. This interface is far less
commonly used than IDE/ATA on personal
computers due to its higher cost. SCSI was
initially used as a parallel interface, enabling the
connection of devices to a host. SCSI has been
enhanced and now includes a wide variety of
related technologies and standards. Chapter 5
provides details of SCSI.
2.1.3 Storage
The storage device is the most important
component in the storage system environment.
A storage device uses magnetic, optical, or solid
state media. Disks, tapes, and diskettes use
magnetic media. A CD-ROM is an example of a
storage device that uses optical media, and a
removable flash memory card is an example of
solid state media.
Tapes are a popular storage medium used for
backup because of their relatively low cost. In
the past, data centers hosted a large number of
tape drives and processed several thousand reels
of tape. However, tape has the following
limitations:
• Data is stored on the tape linearly along the
length of the tape. Search and retrieval of
data is done sequentially, invariably taking
several seconds to access the data. As a
result, random data access is slow and time
consuming. This limits tapes as a viable
option for applications that require real-time,
rapid access to data.
• In a shared computing environment, data
stored on tape cannot be accessed by
multiple applications simultaneously,
restricting its use to one application at a
time.
• On a tape drive, the read/write head touches
the tape surface, so the tape degrades or
wears out after repeated use.
• The storage and retrieval requirements of
data from tape and the overhead associated
with managing tape media are significant.
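The cost of random access on a sequential medium can be illustrated with a toy model. The sketch treats the tape as a numbered row of blocks and counts how many blocks the drive must wind past to serve a list of requests. The block numbers and the function name are hypothetical, and the model ignores real drive behavior such as load time and streaming speed.

```python
def tape_blocks_traversed(requests: list[int], start: int = 0) -> int:
    """Total blocks wound past when serving requests in the given order."""
    position = start
    total = 0
    for target in requests:
        total += abs(target - position)  # tape must pass every block between
        position = target
    return total


requests = [900, 10, 500, 20]  # hypothetical block numbers, random order

print(tape_blocks_traversed(requests))          # 2760
print(tape_blocks_traversed(sorted(requests)))  # 900
```

In this model the same four blocks cost about three times as much winding when requested in scattered order, which is why tape suits sequential workloads such as backup better than applications that need rapid random access.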
In spite of its limitations, tape is widely
deployed for its cost effectiveness and mobility.
Continued development of tape technology is
resulting in high-capacity media and high-speed
drives. Modern tape libraries come with
additional memory (cache) and/or disk drives
to increase data throughput. With these and
added intelligence, today's tapes are part of an
end-to-end data management solution,
especially as a low-cost solution for storing
infrequently accessed data and as long-term data
storage.
Optical disk storage is popular in small,
single-user computing environments. It is frequently
used by individuals to store photos or as a
backup medium on personal/laptop computers.
It is also used as a distribution medium for
single applications, such as games, or as a
means of transferring small amounts of data
from one self-contained system to another.
Optical disks have limited capacity and speed,
which limits the use of optical media as a
business data storage solution.