
Contents

UNIT-I
1. FUNDAMENTALS OF INFORMATION STORAGE AND MANAGEMENT
1.1 Information Storage
1.2 Data Proliferation
1.3 Evolution of Storage Technology and Architecture
1.4 Overview of Storage Infrastructure Components
1.5 Information Lifecycle Management (ILM)
1.6 Data Categorization
Review Points
Review Questions

UNIT-II
2. STORAGE SYSTEM ENVIRONMENT ARCHITECTURE
Introduction of Storage System Architecture
2.1 Intelligent Disk Subsystem
2.2 Contrast of Integrated vs. Modular Arrays
2.3 Architecture of Intelligent Disk Subsystems
2.4 Disk Physical Storage
2.5 Disk Properties
2.6 Disk Performance
2.7 Disk Specifications
2.8 RAID
2.9 Hot Spares
2.10 Components of an Intelligent Storage System
2.11 Data Mapping
Review Points
Review Questions

UNIT-III
3. INTRODUCTION TO NETWORKED STORAGE SYSTEMS
3.1 Just a Bunch of Disks (JBOD)
3.2 Direct Attached Storage (DAS)
3.3 Storage Area Network (SAN)
3.4 Network Attached Storage (NAS)
3.5 Content Addressable Storage (CAS)
3.6 Comparison Among SAN, NAS, DAS, and CAS
Review Points
Review Questions
Unit-1

FUNDAMENTALS OF INFORMATION STORAGE AND MANAGEMENT

• Information Storage
• Data Proliferation
• Evolution of Storage Technology and Architecture
• Overview of Storage Infrastructure Components
• Information Lifecycle Management (ILM)

INFORMATION STORAGE AND MANAGEMENT

Information storage is a central pillar of information technology. A huge amount of high-quality digital information is being created every moment by both individual and corporate consumers of IT. This information needs to be stored, protected, optimized, and managed.
Information is also very important in everyday life, and over time we have become dependent on it. There are many ways to get information from different sources, and the internet is one of them. We access the internet daily to send and receive e-mails, download videos and pictures, and use many other applications. This means that everyone creates information in daily life, and once created, that information needs to be stored. Previously, information storage was seen as only a bunch of disks or tapes attached to the back of the computer to store data. This type of storage can hold only a small amount of data. When the amount of information is very large, it needs to be stored and managed by specialized organizations that can provide a variety of solutions for storing, managing, connecting, protecting, securing, sharing, and optimizing digital information. Figure 1.1 describes the cycle of information.

Figure 1.1. Cycle of Information (information creator, storage device, user)

When a user requests information, information creators such as laptops, cell phones, and cameras create it. If the amount of data is very small, it can be stored in the device's own memory, but if the amount of data is very large, it needs to be stored in a larger device such as a centralized storage device, where the data is stored and processed for future use. When this stored data is needed again, it can be accessed from the centralized data storage. As the criticality of information increases, the challenge of protecting and managing the data also increases. The major question is how to manage critical data such as railway reservation, airline reservation, and telephone billing information. The second question is how to secure this data. The solution is a data center, which manages and protects data by classifying it and setting rules for its treatment. With the help of data centers, a high level of availability, security, and manageability can be provided for the data.

1.1 INFORMATION STORAGE


The business world also faces a variety of problems because of the importance and volume of information and its dependence on it. Businesses depend on fast and reliable access to information. Well-known business applications that process information, such as airline reservations, telephone billing systems, e-commerce, ATMs, product designs, inventory management, Web portals, patient records, credit cards, life sciences, and global capital markets, require better management of information.
As the criticality of information increases, businesses suffer when their data is poorly protected and managed. Small businesses lose billions of rupees each year due to data loss caused by hardware problems, system failures, human error, and software corruption, so data protection is necessary. Providing data protection is not enough, however: quick and easy access to the right information is also very important.
As time goes on, the criticality of data increases and the data needs to be cared for. Existing systems hold a variety of data. Banks, for example, hold the account information of their customers very securely and accurately. Some businesses handle data for millions of customers and ensure the security and integrity of that data over a long period of time. This requires high-capacity storage devices that can retain data for a long period. In today's increasingly wired business world, the best way to recover from data loss of any kind is to keep backup data in a physically separate location, on storage devices housed in a data center.
A data storage device is a device for recording (storing) information (data). Recording can be done using virtually any form of energy, from manual muscle power in handwriting, to acoustic vibrations in phonographic recording, to electromagnetic energy modulating magnetic tape and optical discs. A storage device may hold information, process information, or both. A device that only holds information is a recording medium. Devices that process information (data storage equipment) may either access a separate portable (removable) recording medium or a permanent component to store or retrieve information. Data storage is therefore responsible for storing digital data and keeping it safe.
Data
Data is a collection of raw facts, such as values or measurements. It can be numbers, words, measurements, observations, or even just descriptions of things. A printed paper, a movie on DVD, a bank's ledgers, and an account holder's passbooks are all examples of data. The different forms of data are given below:
• In computing, data is information that has been translated into a form that is more convenient to move or process. Relative to today's computers and transmission media, data is information converted into binary digital form.
• In computer component interconnection and network communication, data is often distinguished from "control information," "control bits," and similar terms to identify the main content of a transmission unit.
• In telecommunications, data sometimes means digital-encoded information to distinguish it from analog-encoded information such as conventional telephone voice calls. In general, "analog" or voice transmission requires a dedicated continual connection for the duration of a related series of transmissions. Data transmission can often be sent with intermittent connections in packets that arrive in piecemeal fashion.
• In database management systems, data files are the files that store the database information, whereas other files, such as index files and data dictionaries, store administrative information, known as metadata.
• Generally, in science, data is a gathered body of facts.
• Data can exist in a variety of forms - as numbers or text on pieces of paper, as bits and bytes stored in electronic memory, or as facts stored in a person's mind.
• Strictly speaking, data is the plural of datum, a single piece of information. In practice, however, people use data as both the singular and plural form of the word.
• The term data is often used to distinguish binary machine-readable information from textual human-readable information. For example, some applications make a distinction between data files (files that contain binary data) and text files (files that contain ASCII data).
Before computers were invented, the procedures and methods adopted for creating and storing data were limited to a few forms, such as paper and film. Nowadays, the same type of data can be converted into different forms such as an e-book, a bitmapped image, a video, or an e-mail message. This data can be generated by a computer and is stored as strings of 0s and 1s. Data in this form is called digital data, and it becomes accessible to the user after being processed by a computer. Figure 1.2 shows the different forms of data.
With the arrival of computers and communication technologies, the rate of data generation and sharing has increased. Because the majority of data is now created on computers, most data can be regarded as digital data. The important factors that drive the growth of digital data are given below:
• Increase in data processing capabilities
• Lower cost of digital storage
• Faster communication technology

Figure 1.2. Digital Data Conversion Process (a book, an image, a video, and an e-mail converted into digital data, that is, strings of 0s and 1s)
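
The conversion into strings of 0s and 1s can be illustrated with a minimal Python sketch. The function names `to_bits` and `from_bits` are hypothetical and chosen only for this illustration, and ASCII is assumed as the text encoding.

```python
def to_bits(data: bytes) -> str:
    """Return the bytes as space-separated groups of 8 binary digits."""
    return " ".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    """Rebuild the original bytes from the 0/1 string produced by to_bits."""
    return bytes(int(group, 2) for group in bits.split())


if __name__ == "__main__":
    message = "e-mail"                       # human-readable text
    digital = to_bits(message.encode("ascii"))
    print(digital)                           # the same text as 0s and 1s
    print(from_bits(digital).decode("ascii"))
```

The round trip shows that the text and its binary form carry the same content. Only the representation changes.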

Types of data
Data can be divided into two types: structured and unstructured. On the web, structured data is data about content objects, such as people, reviews, products, and companies, that is expressed in a universal format. The benefit of structured data is that it is universally understood by computers, namely search engines, and can therefore be more efficiently organized for human readers. More generally, the term structured data refers to data that is identifiable because it is organized in a structure. The most common form of structured data, or structured data records, is a database where specific information is stored in columns and rows. Structured data is also searchable by data type within content.

Unstructured data (or unstructured information) refers to information that either does not have a pre-defined data model or does not fit well into relational tables. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. This results in irregularities and ambiguities that make it difficult to understand using traditional computer programs, as compared to data stored in fielded form in databases or annotated (semantically tagged) in documents. Unstructured data (information that lies outside of the databases where business intelligence is usually stored) represents the largest, most current, and fastest growing source of information available to businesses and governments worldwide. Businesses find unstructured data a major problem to manage because more than 80 percent of data is unstructured and it requires more storage space and effort to maintain. Figure 1.3 describes the types of data.

Figure 1.3. Types of Data (Structured and Unstructured)
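
The difference between the two types can be sketched in Python under a few assumptions. The table `customers`, its rows, and the free-text note below are made-up illustration data, and the helper names are hypothetical. The structured half uses the standard `sqlite3` module and the unstructured half uses regular expressions from `re`.

```python
import re
import sqlite3


def query_structured(min_balance: float) -> list[str]:
    """Structured data: rows and columns that can be queried by field."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, city TEXT, balance REAL)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [("Asha", "Pune", 1200.0), ("Ravi", "Delhi", 300.5)],  # sample rows
    )
    rows = conn.execute(
        "SELECT name FROM customers WHERE balance > ? ORDER BY name",
        (min_balance,),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


def scan_unstructured(note: str) -> tuple[list[str], list[str]]:
    """Unstructured data: facts must be guessed from free text by pattern."""
    dates = re.findall(r"\b\d{2}/\d{2}/\d{4}\b", note)
    numbers = re.findall(r"\b\d+\b", note)  # also matches pieces of the date
    return dates, numbers


NOTE = "Met Asha on 12/03/2011; she asked about order 4471 and a refund of 250 rupees."

if __name__ == "__main__":
    print(query_structured(1000))
    print(scan_unstructured(NOTE))
```

The query on the table is exact because every value sits in a named column. The scan of the note is ambiguous: the number pattern also picks up the parts of the date, which is the kind of irregularity the text describes.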

Information
Information, in its most restricted technical sense, is a sequence of symbols that can be interpreted as a message. Information can be recorded as signs, or transmitted as signals. Information is any kind of event that affects the state of a dynamic system. Conceptually, information is the message (utterance or expression) being conveyed. Moreover, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, representation, and especially entropy.
Information is data that has been verified to be accurate and timely, is specific and organized for a purpose, is presented within a context that gives it meaning and relevance, and can lead to an increase in understanding and a decrease in uncertainty. The value of information lies solely in its ability to affect a behavior, decision, or outcome. A piece of information is considered valueless if, after receiving it, things remain unchanged. Information is stimuli that have meaning in some context for the receiver. When information is entered into and stored in a computer, it is generally referred to as data. After processing (such as formatting and printing), output data can again be perceived as information.
Information is very important from a business point of view, so it plays a very important role in everyday life.
Storage
Data created by a user or a business needs to be stored somewhere so that it is easily accessible for further processing. In a computer, storage is the place where data is held in an electromagnetic or optical form for access by a computer processor. There are two general usages:
• Storage is frequently used to mean the devices and data connected to the computer through input/output operations - that is, hard disk and tape systems and other forms of storage that don't include computer memory and other in-computer storage. For the enterprise, the options for this kind of storage are of much greater variety and expense than those related to memory.
• In a more formal usage, storage has been divided into primary storage, which holds data in memory (sometimes called random access memory or RAM), and secondary storage, which holds data on hard disks, tapes, CD/DVD, and other devices requiring input/output operations.
Storage is a term used to describe any location where information can be held permanently or temporarily for later use. A computer commonly has two storage types: internal and external. For example, a hard drive is an internal storage device, and a floppy disk drive is an external removable storage device. Below are examples of forms of storage used on a computer:
• Floppy diskette
• CD-ROM disc
• CD-R disc
• CD-RW disc
• DVD-R, DVD+R, DVD-RW, and DVD+RW discs
• Jump drive and USB flash drive
• Hard drive

1.2 DATA PROLIFERATION


Data proliferation refers to the very large amount of data, structured and unstructured, that businesses and governments continue to generate at an unprecedented rate and the usability problems that result from attempting to store and manage that data. While originally pertaining to problems associated with paper documentation, data proliferation has become a major problem in primary and secondary data storage on computers.
While digital storage has become cheaper, the associated costs, from raw power to maintenance and from metadata to search engines, have not kept up with the proliferation of data. Although the power required to maintain a unit of data has fallen, the cost of facilities which house the digital storage has tended to rise.

1.2.1 Problems caused by Data Proliferation


The problem of data proliferation is affecting all areas of commerce as the result of the
availability of relatively inexpensive data storage devices. This has made it very easy to
dump data into secondary storage immediately after its window of usability has passed.
This masks problems that could gravely affect the profitability of businesses and the efficient
functioning of health services, police and security forces, local and national governments,
and many other types of organization.

1.2.2 Data proliferation is problematic for several reasons


• Difficulty when trying to find and retrieve information. In large networks of primary and secondary data storage, problems finding electronic data are analogous to problems finding hard copy data.
• Data loss and legal liability when data is disorganized, not properly replicated, or cannot be found in a timely manner.
• Increased manpower requirements to manage increasingly chaotic data storage resources.
• Slower networks and application performance due to excess traffic as users search and search again for the material they need.
• High cost in terms of the energy resources required to operate storage hardware.

unr.,r,
1.2.3 Possible Solutions

Information is power, but an abundance of data is of little use if it is scattered across desktops, schools, campuses, and formats, especially if you can't easily retrieve or consolidate it. To improve your business results, you need the ability to analyze diverse information sets, regardless of where they originate.

Some proposed solutions to this problem are given here:

• Applications that make better use of modern technology
• Reductions in duplicate data (especially as caused by data movement)
• Improvement of metadata structures
• Improvement of file and storage transfer structures
• User education and discipline
• The implementation of Information Lifecycle Management solutions to eliminate low-value information as early as possible before putting the rest into actively managed long-term storage in which it can be quickly and cheaply accessed.
• Highly scalable data warehousing.
• Disaster Recovery System.
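
One of the solutions listed above, reducing duplicate data, can be sketched as follows. This is a minimal illustration rather than a production deduplication tool: the function name `find_duplicates` is hypothetical, whole files are read into memory, and files are assumed to be duplicates when their SHA-256 digests match.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents have the same SHA-256 digest."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Only digests shared by more than one file indicate duplicate content.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates(Path("some/directory"))` returns one list per set of identical files. An administrator could then keep one copy from each list and remove or archive the rest.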

1.3 EVOLUTION OF STORAGE TECHNOLOGY AND ARCHITECTURE


Storage systems are built by taking capable storage devices such as hard disk drives (HDDs) and adding layers of hardware and software to obtain higher reliability, higher performance, and simpler management. Storage systems are sometimes referred to as storage subsystems or storage devices.
The first data storage system was introduced by IBM in 1956. Initially, storage systems were just the HDD, but over time they have developed to include advanced technology that adds considerable value to the HDD. To understand the evolution of storage systems, it is important to observe the evolution of the HDD.
In general, organizations had centralized computers and information storage devices in their data center, and the storage devices were typically an internal part of the servers. Originally, there were very limited policies and processes for managing the servers and the data created, which produced unmanaged, unprotected, and fragmented islands of information. To overcome these challenges, storage technology evolved from non-intelligent internal storage to intelligent networked storage.
Network storage enables us to store data in such a way that clients on different systems, possibly running different operating systems, can access the storage. Data storage has evolved through a number of phases. The evolution has been driven by new needs demanding more and more widely available storage and by significant increases in storage capacity and network speed. In the mainframe environments of the 1970s, data was stored on hardware physically separate from the actual processing unit. The stored data was still only accessible through the mainframe system. Then along came mini-computers, and the storage picture remained pretty much the same: the mini's processor was in one box with its memory, and the disk drives were in different physical boxes. With PC servers, storage devices initially were in the same enclosure that housed the CPU, memory, and peripherals. Eventually, SCSI systems emerged in which the storage could be directly cabled to the PC enclosure via specialized adaptors and cables. The problem came as we needed larger and larger amounts of storage. For example, what happens to a PC server when we need more storage drives than will fit inside the box? From these ideas, network storage was born. Figure 1.4 shows the evolution of storage.

Figure 1.4. Architecture of Storage Evolution


The technologies used so far are discussed here:

Direct Attached Storage (DAS)


DAS describes a server or workstation where all the storage devices are directly attached to the host system. A familiar example of DAS is an IDE drive inside a desktop computer. Because there are so many computer systems in use, DAS is the most common method of storing data for computer systems. A directly attached storage is a storage subsystem that is directly attached to a server or workstation using a cable. It can be a hard disk built directly into a computer system or a disk shelf with multiple disks attached by means of external cabling. Unlike local hard disks, disk shelves require separate management. In some cases, storage shelves can be connected to multiple servers so the data or the disks can be shared.

Redundant Arrays of Independent Disks (RAID)


RAID systems were developed to provide storage with various forms of fault tolerance. RAID fault tolerance is used for storage inside the PC enclosure and in external storage enclosures. RAID supports a set of levels to provide its fault tolerance. RAID is a technology that combines multiple small, independent disk drives into an array that looks like a single, big disk drive to the system. Simply putting n disk drives together results in a system with a failure rate that is n times the failure rate of a single disk. This high failure rate makes a plain disk array impractical for addressing the high reliability and large capacity needs of enterprise storage, which is why RAID adds redundancy so that the array can survive the failure of a disk.
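
The failure-rate argument above can be worked through numerically. The sketch below assumes that disks fail independently at a constant rate and that an array without redundancy fails as soon as any one disk fails. The function names and the figure of 500,000 hours are hypothetical and used only for illustration.

```python
def array_failure_rate(disk_failure_rate: float, n_disks: int) -> float:
    """Failure rate of n disks with no redundancy: n times one disk's rate."""
    return n_disks * disk_failure_rate


def array_mttf_hours(disk_mttf_hours: float, n_disks: int) -> float:
    """Mean time to failure of the array: one disk's MTTF divided by n."""
    return disk_mttf_hours / n_disks


if __name__ == "__main__":
    disk_mttf = 500_000.0  # assumed MTTF of a single disk, in hours
    for n in (1, 10, 100):
        print(n, "disks:", array_mttf_hours(disk_mttf, n), "hours")
```

Under these assumptions, an array of 100 such disks would be expected to lose a disk about every 5,000 hours, which is why an array without redundancy is not reliable enough on its own.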

Network Attached Storage (NAS)


Network-attached storage (NAS) is file-level computer data storage connected to a computer network providing data access to heterogeneous clients. NAS not only operates as a file server, but is specialized for this task by its hardware, software, or the configuration of those elements. NAS systems are networked appliances which contain one or more hard drives, often arranged into logical, redundant storage containers or RAID arrays. Network-attached storage removes the responsibility of file serving from other servers on the network. NAS systems typically provide access to files using standard Ethernet and network file sharing protocols such as NFS.

Storage Area Network (SAN)


A SAN is a network of storage devices that are connected to each other and to a server, or cluster of servers, which act as access points to the storage. SANs use special switches as a mechanism to connect the devices and to provide connectivity points. SANs enable devices on different networks to communicate with each other, which offers significant advantages. With a properly configured SAN, you can back up every piece of data on your network without having to 'pollute' the standard network infrastructure with gigabytes of data.
Internet Protocol SAN (IP-SAN)
One of the latest evolutions in storage architecture, IP-SAN is a convergence of technologies used in SAN and NAS. IP-SAN provides block-level communication across a local or wide area network (LAN or WAN), resulting in greater consolidation and availability of data.
Storage technology continues to improve, which helps organizations with consolidation, protection, and optimization.

1.4 OVERVIEW OF STORAGE INFRASTRUCTURE COMPONENTS


The major work of any storage infrastructure is to store, process, and manage data. To fulfill this task, it needs help from other elements, which are called the components of the storage infrastructure. A data center, for example, has its own infrastructure, such as computers, servers, storage devices, network devices, environmental controls such as air conditioners, and power backup devices. There can be other equipment to support the overall process, but here we explain the main elements or components of storage infrastructure.

Core Elements
The basic core elements of a storage infrastructure are given below:
Application: An application is the use of a technology, system, or product. The term application is a shorter form of application program. An application program is a program designed to perform a specific function directly for the user or, in some cases, for another application program. With the help of an application program, we can instruct the storage server to start a backup process or a storage device to store data.
Network: The network is the intermediate layer that connects the components of a storage infrastructure. With the help of a network, two or more computers can communicate with each other. In a storage infrastructure, it provides the connection between the client and the storage server and between the storage server and the storage devices.
Server and Operating System: These provide a computing platform and handle the running of application programs.
Database: A database is a collection of information that is organized so that it can easily be accessed, managed, and updated. A Database Management System (DBMS) plays an important role in these operations by providing a structured, logical way to store data, which gives more flexibility in storing and fetching data.
Storage Array: A storage array is a combination of many storage devices of different kinds, used to store the different types of digital data provided by the storage server. The main aim of a storage array is to store digital data and protect it for future use.
The architecture, which shows how all the components work together, is given below (see Figure 1.5):

Figure 1.5. Architecture of Storage Components (a client running applications, a storage server running the OS/DBMS, and a storage array storing data, linked by a network and a storage network)
The process that describes how all the components work is given below:
(1) The application program is installed on the client side, and when it runs it passes the information to the storage server.
(2) A network such as a LAN provides the connection between the client and the storage server.
(3) The storage server holds the data in a database managed by the DBMS, and all running processes are handled by an operating system.
(4) The storage network, another type of network, provides the link between the storage server and the storage array.
(5) The storage array receives the data from outside and stores it for future use.
For example, a customer places an order through the application user interface (AUI) of the application software on the client computer. The client connects to the server over the LAN and accesses the DBMS located on the server to update the relevant information, such as customer name, address, payment method, products ordered, and quantity ordered. The DBMS uses the server operating system to read and write this data to the database located on the physical disks in the storage array. The storage network provides the communication link between the server and the storage array and transports the read and write commands between them. After receiving the read and write commands from the server, the storage array performs the necessary operations to store the data on physical disks.
Storage Planning - When, Where, How?
The storage industry has been trying hard to move away from the scattered environment and to create a single pool of data that can be accessed by all. Instead of spreading horizontally, effort has been made to have multiple layers of data storage based on the behavior and importance of data.

1.4.1 What Is Storage Infrastructure?


One of the key challenges a storage administrator has today is managing the non-database file systems. These file systems can be located on any number of storage devices such as:
• SAN-Attached Storage (SAN): an architecture for attaching remote computer storage devices (such as disk arrays, tape libraries, and optical jukeboxes) to servers in such a way that the devices appear as locally attached to the operating system. SANs used to require connections via Fibre Channel (an expensive networking technology), but major vendors now also offer connections via iSCSI.
• Network-Attached Storage (NAS): uses file-based protocols such as SMB/NFS/AFS, where it is clear that the storage is remote and computers request a portion of an abstract file rather than a disk block.
• Direct Attached Storage (DAS): is made of a data storage device (e.g., a number of hard disks) connected directly to a computer through a host bus adapter. A DAS device can be shared between multiple computers, as long as it provides multiple interfaces (ports) that allow concurrent and direct access.

1.4.2 Data storage components


Storage components are the core of every storage infrastructure. Although high-level tasks like data classification or storage virtualization are getting most of the attention today, it's important to note that storage components, such as disk drives, RAID technology, disk arrays, and even network fabric hardware, are evolving at a tremendous pace, enabling vast storage capacities while still maintaining necessary service levels for network users. The choice of hard discs can have a deep impact on the capacity, performance, and long-term reliability of any storage infrastructure. But it's unwise to trust valuable data to any single point of failure, so hard discs are combined into groups that can boost performance and offer redundancy in the event of disc faults. At an even higher level, those arrays must be integrated into the storage infrastructure, combining storage with network technologies to make data available to users over a LAN or WAN.
◄◄ The lowest level: Hard discs: Hard discs are random-access storage mechanisms
that transfer data to spinning platters (a.k.a. discs) coated with extremely sensitive
magnetic media. Magnetic read/write heads step across the radius of each platter in
set increments, forming concentric circles of data dubbed "tracks." Hard disc capacity
is loosely defined by the quality of the magnetic media (bits per inch) and the
number of tracks. Some of today's hard drives can deliver up to 750 GB of capacity.
Capacity is also influenced by specific drive technologies including perpendicular
recording, which fits more magnetic points into the same physical disc area.
◄◄ Grouping the discs: RAID: Hard discs are electromechanical devices and their
working life is finite. Media faults, mechanical wear and electronic failures can all
cause problems that render drive contents inaccessible. This is unacceptable for any
organization, so tactics are often implemented to protect against failure. One of the
most common data protection tactics is arranging groups of discs into arrays. This is
known as a RAID.
RAID implementations typically offer two benefits: data redundancy and enhanced
performance. Redundancy is achieved by copying data to two or more discs, so that
when a fault occurs on one hard disc, duplicate data on another can be used instead.
In many cases file contents are also spanned (or striped) across multiple hard discs.
This improves performance because the various parts of a file can be accessed on
multiple discs simultaneously, rather than waiting for a complete file to be accessed
from a single disc. RAID can be implemented in a variety of schemes, each with its
own designation:
RAID-0: disc striping is used to improve storage performance, but there is no
redundancy.
RAID-1: disc mirroring offers disc-to-disc redundancy, but capacity is reduced and
performance is only marginally enhanced.
RAID-5: parity information is spread throughout the disc group, improving read
performance and allowing data for a failed drive to be reconstructed once the failed
drive is replaced.
RAID-6: multiple parity schemes are spread throughout the disc group, allowing
data for up to two simultaneously failed drives to be reconstructed once the failed
drive(s) are replaced.
There are additional levels, but these four are the most common and widely used. It
is also possible to mix RAID levels in order to obtain greater benefits. Combinations
are typically denoted with two digits. For example, RAID-50 is a combination of
RAID-5 and RAID-0, sometimes noted as RAID-5+0. As another example, RAID-10
is actually RAID-1 and RAID-0 implemented together, RAID-1+0.
◄◄ A closer look at storage arrays: Of course, there are many ways to group hard
discs, and enterprise storage can easily involve dozens to hundreds of discs arranged
into storage arrays. The very largest arrays can store hundreds of terabytes or even
petabytes of data. The most basic expression of disc grouping is JBOD. This is simply
the accumulation of pure capacity, and doesn't offer any redundancy or performance
benefits. For example, putting five 200 GB drives in a JBOD arrangement simply
yields 1 TB of unprotected storage.
◄◄ Getting storage on the network: Of course, storage is useless for network users if
it cannot be accessed. The two principal means of attaching storage systems to the
network are NAS and SAN. NAS boxes are storage devices behind an Ethernet interface,
effectively connecting discs to the network through a single IP address. NAS deployments
are typically straightforward and management is light, so new NAS devices can easily be
added as more storage is needed. The downside to NAS is performance: storage
traffic must compete for NAS access across the Ethernet cable. But NAS access is
often superior to disc access at a local server.
The SAN overcomes common server and NAS performance limitations by creating a
subnetwork of storage devices interconnected through a switched fabric such as FC or
iSCSI (Internet SCSI, also called SCSI-over-IP). Both FC and iSCSI approaches make
any storage device visible from any host, and offer much more availability for
corporate data. FC is costlier, but offers optimum performance, while iSCSI is
cheaper, but somewhat slower. Consequently, FC is found in the enterprise and
iSCSI commonly appears in small and mid-sized businesses. However, SAN
deployments are more costly to implement (in terms of switches, cabling and host
bus adapters) and demand far more management effort.
Another trend in network storage components is the addition of intelligent features
at the fabric, often implemented at the switch. These features include storage
virtualization, data migration and replication, backup and restoration capability,
better interoperability between storage components, as well as uniform storage
provisioning and management.
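
The RAID levels and the JBOD example described earlier in this section can be illustrated
with a short Python sketch. It is a simplified model, not a description of any vendor's
implementation: it assumes equally sized discs, treats RAID-1 as two-way mirrored pairs,
and uses byte-wise XOR as the parity function (a common choice for single parity, although
the text does not name one). The names `usable_capacity_gb` and `xor_blocks` are
hypothetical.

```python
from functools import reduce
from operator import xor


def usable_capacity_gb(level: str, disks: int, disk_gb: float) -> float:
    """Usable capacity of equally sized discs under a simplified model."""
    if disks < 1:
        raise ValueError("at least one disc is required")
    if level in ("JBOD", "RAID-0"):
        return disks * disk_gb              # all capacity usable, no redundancy
    if level == "RAID-1":
        if disks % 2:
            raise ValueError("mirrored pairs need an even number of discs")
        return disks * disk_gb / 2          # assumption: two-way mirrored pairs
    if level == "RAID-5":
        if disks < 3:
            raise ValueError("RAID-5 needs at least three discs")
        return (disks - 1) * disk_gb        # one disc's worth of parity
    if level == "RAID-6":
        if disks < 4:
            raise ValueError("RAID-6 needs at least four discs")
        return (disks - 2) * disk_gb        # two discs' worth of parity
    raise ValueError(f"unknown level: {level}")


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block.
# (A real RAID-5 rotates the parity block across the discs.)
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second block and rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])

print(usable_capacity_gb("JBOD", 5, 200))    # five 200 GB drives, as in the text
print(usable_capacity_gb("RAID-5", 5, 200))
print(rebuilt == data[1])
```

Under these assumptions, five 200 GB drives give 1000 GB as a JBOD but 800 GB as RAID-5,
which is the price paid for being able to rebuild one failed drive.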

1.5 INFORMATION LIFECYCLE MANAGEMENT (ILM)


Information today comes in a wide variety of types; for example, it could be an email
message, a photograph or an order in an Online Transaction Processing system. Once
you know the type of data and how it will be used, you already have an understanding
of what its evolution and final destiny is likely to be. The challenge now before all
organizations is to understand how their data evolves, determine how it grows, monitor
how its usage changes over time, and decide how long it should survive. Information
Lifecycle Management (ILM) is designed to address these issues, with a combination of
processes, policies, software and hardware so that the appropriate technology can be used
for each phase of the lifecycle of the data.

THE LIFECYCLE OF DATA


An analysis of your data will most likely reveal that initially it is accessed, and maybe
updated, on a very frequent basis. As the age of the data increases, its access frequency
diminishes to a negligible level, if it is accessed at all. Therefore, most organizations find
themselves in the situation where most of their users are accessing the current data, and
very few users are accessing the older data. Thus, data can be described as being active,
less active, historical or ready to be archived.

Active → Less Active → Historical → Archive

Figure 1.6 (a) Data Lifecycle

With so much data being held, the data will be moved to different physical locations
during its lifetime. This is because, depending on where it is in its lifecycle, it needs to be
located on the most appropriate storage device.
ILM answers the following questions so that enterprises can understand how data
should be managed and where data should ideally reside during its existence. In
particular, the probability of reuse of data has historically been one of the most meaningful
metrics for understanding optimal data placement. Understanding what happens to data
throughout its lifetime is becoming an increasingly important aspect of effective data
management.
What happens to data as it ages?
Does usage decline as data ages?
Does the value of data change (increase or decrease) as it ages?
Why are we keeping more data longer than ever before?
What conditions indicate when data should be retired?
Do storage management requirements change as data goes through its life cycle?
If data is the most valuable asset of so many businesses, why do we know so little about it?
ILM is the process of managing the placement and movement of data on storage devices
as it is generated, replicated, electronically distributed, protected, archived, and ultimately
retired. ILM stores, indexes, searches, retrieves, and copies data according to the way it is
used during its life span, including its ultimate destruction or deletion. To do this, ILM is
composed of an interconnected set of processes, storage components, and data and storage
management applications. An ILM solution must link the adaptive storage infrastructure
and associated storage management applications to the enterprise applications and business
processes. This linkage is defined by policies that are determined by how the data is created,
stored, and accessed when needed, and that meet the pre-defined service-level requirements.
These service-level requirements should automatically change depending on how the data
is used and how this usage can change over time.
A key goal of ILM is to ensure that data is always stored on media that have the capabilities
required to deliver the Quality of Service (QoS) and other attributes required at each
stage of its life cycle. By doing so, ILM helps organizations optimize their systems and
storage infrastructures. Not all information created has equal importance to an enterprise,
nor are access requirements for all information the same. For these and other reasons,
ILM includes the capability to classify the information according to its importance to the
organizations that use it, and can then manage its placement on storage based on its
classification. For example, critical data used to run the business must be accessible at all
times and requires the highest level of protection (with the fastest recovery capabilities)
and performance. ILM might place such data on mirrored volumes contained in high
performance, highly available disk arrays. The main goal of ILM is to manage the data
life cycle. Figure 1.6 (a) represents the life cycle of data.

The vast amount of content that is created, dynamically accessed and regularly modified is
shown at the left in Figure 1.6 (b). It is classified for ILM purposes as operational information
since it is created, modified, and used for day-to-day operations. Over time, operational
information becomes stable, and is not modified. In addition, the frequency with which
information is accessed and modified changes over time. When this occurs, ILM manages
and distributes the data to appropriate storage, based on policies that are derived from
associated usage and business value attributes, to reflect the changed use patterns of the
information. When information becomes static (no longer modified), it is reclassified as
reference information. Reference information is typically used for data mining, regulatory
compliance, legal, and other purposes. Another aspect that ILM must consider is the fact
that information used by different applications spends different amounts of time in different
parts of the life cycle. For example, email spends very little time in the "create" phase. After
the email is sent, it enters the "manage" and "distribute" phases. On the other hand, an
On Line Transaction Processing (OLTP) database application may spend much of its life in
the "create" phase, as records are continually being created and modified.

[Figure: life cycle phases Generate, Create, Manage, Distribute. Operational information
(email, database, ERP, medical imaging, digital content, custom applications) moves over
time to reference information (data mining, regulatory compliance, legal discovery).]

Figure 1.6 (b) Life Cycle of Data

ILM is not really about storage; it is about intelligent workflow and business process
management. It is as much about automating the searching and indexing, categorizing,
and managing of the information as it is about the storing and archiving of it. Importantly,
ILM is about creating policies that can fit the information flow into the business environment
as part of an overall architecture. By its nature, ILM is most effectively implemented using
an application-specific solution approach that includes management components such as
policy engines and schedulers, and application-specific components like data movement
directors.


ILM follows four main steps to complete its task:
◄◄ Define the Data Classes
◄◄ Create Storage Tiers for the Data Classes
◄◄ Create Data Access and Migration Policies
◄◄ Define and Enforce Compliance Policies

Define the Data Classes

The first step of ILM is to look at all the data in the organization: what type of data is
it, and where is it stored? Then determine:
◄◄ which data is important, where it is and what needs to be retained
◄◄ how this data flows within the organization
◄◄ what happens to this data over time and whether it is still needed
◄◄ the degree of data availability and protection that is needed
◄◄ data retention for legal and business requirements
Once there is an understanding of how the data will be used, it can then be classified on
this basis. The most common type of classification is by age or date, but other types are
possible, such as by product or privacy, or a hybrid classification could be used, such as
privacy and age. Once the data has been classified, the policies that will be applied to the
data will depend upon its class.
In order to treat the data classes differently, the data needs to be physically separated.
When information is first created it is often frequently accessed, but then over time it may
be referenced very infrequently. For instance, when customers place an order, they
regularly look at it to see its status and whether it has been shipped. But once it arrives, they
may never reference that order again. This order would also be included in regular reports
that are run to see what goods are being ordered, but, over time, it would not figure in
any of the reports and may only be referenced in the future if someone does a detailed
analysis that involves this data.
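
As a rough illustration of classification by age, the sketch below maps a record's age to
one of the four lifecycle stages named earlier (active, less active, historical, archive).
The thresholds in `STAGE_LIMITS` and the function name `lifecycle_stage` are hypothetical;
a real policy would take its limits from business and legal requirements rather than from
fixed numbers like these.

```python
from datetime import date

# Hypothetical upper age limits in days for each stage.
STAGE_LIMITS = [
    (90, "active"),
    (365, "less active"),
    (3 * 365, "historical"),
]


def lifecycle_stage(created: date, today: date) -> str:
    """Classify a record by age alone; anything older than all limits is archived."""
    age_days = (today - created).days
    if age_days < 0:
        raise ValueError("creation date lies in the future")
    for limit, stage in STAGE_LIMITS:
        if age_days <= limit:
            return stage
    return "archive"


print(lifecycle_stage(date(2024, 1, 1), date(2024, 1, 31)))
print(lifecycle_stage(date(2020, 1, 1), date(2024, 1, 1)))
```

A hybrid classification, such as privacy and age, could be modeled by returning a pair of
labels instead of a single stage.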

Create Storage Tiers for the Data Classes

What is Tiered Storage?


Tiering means establishing a hierarchy of storage systems based on service requirements
(performance, business continuity, security, protection, retention, and compliance) and
cost. Traditionally, data was first saved on server-attached hard disks (Direct Attached
Storage = DAS) and kept there for a while for fast data access; the second step was to move
the data to a tape library. Backup and archive data was also saved on tapes and kept at a safe
place, usually away from the company premises - at least that was the theory. But if the
data was required again due to a recovery situation, it first had to be transported back and
then loaded into the productive system - a process which could often last for hours or
even days. Further developments in disk and array technology have today resulted in a
tiered storage model which comprises at least three or four classes:
◄◄ Tier 0: Fast data storage (Flash Memory or Solid State Disk) is used to ensure that
data can be accessed very quickly. For example, Solid State Disks (SSD), as very
expensive cache storage, have been on offer for years from specialist companies.
◄◄ Tier 1: Mission critical data (such as revenue data), making up about 15% of all
data, very fast response time, FC or SAS disk, FC-SAN, data mirroring, local and
remote replication, automatic failover, 99.999% availability, recovery time objective:
immediate, retention period: hours.
◄◄ Tier 2: Vital data, approx. 20% of data, less critical data but fast response time, FC or
SAS disk, FC-SAN or IP-SAN (iSCSI), point-in-time copies, 99.99% availability,
recovery time objective: seconds, retention period: days.
◄◄ Tier 3: Sensitive data, about 25% of data, moderate response times, SATA disk,
IP-SAN (iSCSI), virtual tape libraries, MAID, disk-to-disk-to-tape periodical backups,
99.9% availability, recovery time objective: minutes, retention period: years.
◄◄ Tier 4: Non-critical data, ca. 40% of the data, tape, FC-SAN or IP-SAN (iSCSI),
99.0% availability, recovery time objective: hours/days, retention period: unlimited.
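
To make the availability figures in the tier list more tangible, the following sketch
converts each percentage into the maximum downtime it allows per year. It assumes a
365-day year and that availability is measured over the whole year; the names
`TIER_AVAILABILITY` and `max_downtime_minutes` are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored

# Availability targets quoted in the tier list above (Tier 0 has none stated).
TIER_AVAILABILITY = {1: 99.999, 2: 99.99, 3: 99.9, 4: 99.0}


def max_downtime_minutes(availability_percent: float) -> float:
    """Maximum downtime per year, in minutes, permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


for tier, percent in TIER_AVAILABILITY.items():
    minutes = max_downtime_minutes(percent)
    print(f"Tier {tier}: {percent}% allows about {minutes:.1f} minutes per year")
```

Under these assumptions, 99.999% availability leaves a little over five minutes of downtime
per year, while 99.0% leaves about 87.6 hours, which is why the recovery time objectives
differ so much between Tier 1 and Tier 4.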

Create Data Access and Migration Policies


The next step is to specify who can access the data and the operations they may perform
and how to move the data during its lifetime.

Managing Access to Data


Regulatory requirements are beginning to place exacting demands on how data can be
accessed. A security policy implemented via a virtual private database determines exactly
which data can be seen. For example, authorized users could see the information for the
quarters (Q1, Q2, Q3 and Q4), but only special users could view the historical data. Using
this approach, the data is still available to those who need access to it, but for the vast
majority of users it is now invisible and therefore is not included in or accessed by any of
their queries. A security policy is defined at the database level and it is transparently applied
to all database users. The benefit of this approach is that it provides a secure and controlled
environment for accessing the data, which cannot be overridden and can be implemented
without requiring any application changes. In addition, read-only table spaces can be defined,
which ensures that the data cannot change.
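
The effect of such a row-level security policy can be simulated in a few lines of Python.
This is only an illustration of the idea, not the virtual private database API of any
database product: the roles `"authorized"` and `"special"`, the `Order` record and the
rule that ordinary users see only the current year's quarters are all assumptions made
for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    year: int
    quarter: str  # "Q1" to "Q4"


def visible_orders(orders, role: str, current_year: int):
    """Return the rows a user may see; the filter is applied for every query."""
    if role == "special":
        return list(orders)                                    # historical data included
    if role == "authorized":
        return [o for o in orders if o.year == current_year]   # current quarters only
    return []                                                  # unknown roles see nothing


orders = [Order(1, 2023, "Q4"), Order(2, 2024, "Q1"), Order(3, 2024, "Q2")]
print([o.order_id for o in visible_orders(orders, "authorized", 2024)])
print([o.order_id for o in visible_orders(orders, "special", 2024)])
```

Because the filter sits between the user and the data, application queries do not need to
change, which mirrors the benefit described above.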

Migrate Data between Classes


During the lifecycle of the data it will be necessary to move it at various times, and this
occurs for a variety of reasons, such as:
◄◄ For performance, only a limited number of orders are held on the high performance
disks.
◄◄ Data is no longer frequently accessed and is using valuable high performance storage
and needs to be moved to a low-cost storage device.
◄◄ Legal requirements demand that the information is always available for a given
period of time, and it needs to be held safely at the lowest possible cost.
Whenever data is moved from its original source, it is very important to ensure that
the process selected adheres to any regulatory requirements, for example that the data
cannot be altered, is secure from unauthorized access, is easily readable and is stored in
an approved location.
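
The reasons for migration listed above can be turned into a simple decision rule. The sketch
below is one possible reading, under stated assumptions: tiers are numbered as in the tier
list with 4 as the cheapest, data counts as idle after a hypothetical 180 days without
access, and idle data that is kept only because of a legal retention deadline goes straight
to the cheapest tier. The names `DataSet` and `migration_target` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

LOWEST_COST_TIER = 4  # assumption: tier numbering follows the tier list, 4 = cheapest


@dataclass
class DataSet:
    name: str
    tier: int                            # current storage tier
    last_access: date
    retain_until: Optional[date] = None  # legal retention deadline, if any


def migration_target(ds: DataSet, today: date, idle_days: int = 180) -> Optional[int]:
    """Return the tier the data set should move to, or None to leave it in place."""
    idle = (today - ds.last_access).days
    if idle <= idle_days or ds.tier >= LOWEST_COST_TIER:
        return None
    if ds.retain_until is not None and today <= ds.retain_until:
        # Kept only for legal reasons: hold it at the lowest possible cost.
        return LOWEST_COST_TIER
    return ds.tier + 1                   # otherwise demote one tier at a time


today = date(2024, 1, 1)
print(migration_target(DataSet("orders_2022", 1, date(2023, 1, 1)), today))
print(migration_target(DataSet("contracts", 2, date(2022, 1, 1), date(2030, 1, 1)), today))
```

A production policy engine would also have to verify the regulatory conditions mentioned
above (immutability, access control, approved location) before and after each move; the
sketch only decides where the data should go.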

Define and Enforce Compliance Policies

The last element of an ILM environment is the creation of compliance policies. When data
is decentralized and fragmented, these policies have to be defined and enforced in every
data location, which could easily result in a compliance policy being overlooked. When
defining compliance policies there are five areas to consider:
◄◄ Retention
◄◄ Immutability
◄◄ Privacy
◄◄ Auditing
◄◄ Expiration
The retention policy will describe how the data is to be retained, for how long it must be
kept and what happens at life end. An example of a retention policy is that a record must be
stored in its original form, no modifications are allowed, it must be kept for a specified
number of years, and then it may be deleted. Immutability is concerned with proving
to an external party that data is complete and has not been modified. Cryptographic
signatures can be created and held either inside or outside of the database, to show that
data has not been altered or tampered with in any way. With so much data being retained
today, it is extremely important to maintain privacy of data at all times. Access to data can
be strictly controlled using security policies defined using Virtual Private Database (VPD),
which define exactly which information a user may see. Maintained at the database level,
these policies cannot be violated by anyone. The database also has its own auditing capability
to track all access and changes to data. Auditing can be defined at the table level or via
fine-grained auditing, which specifies the criteria for when an audit record should be generated,
such as: someone tried to change a salary or attempted to alter a processed order. Data
may expire for business or regulatory reasons and then needs to be removed from the database.
Since this can involve removing vast quantities of data, the database can remove it very quickly
and efficiently by simply dropping the partition that contains the information identified
for removal.
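
Two of these areas, immutability and retention, lend themselves to a small sketch. The
example below uses Python's standard `hmac` and `hashlib` modules to show how a stored
fingerprint can reveal later modification, and a date check to decide when a record may be
deleted. It rests on several assumptions: an HMAC (a keyed hash) stands in for the
cryptographic signature mentioned above, the key is held outside the database, and the
seven-year period in the usage lines is only an example value. The function names are
hypothetical.

```python
import hashlib
import hmac
from datetime import date


def sign_record(record: bytes, key: bytes) -> str:
    """Compute a keyed fingerprint (HMAC-SHA256) of the record as stored."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()


def is_unmodified(record: bytes, key: bytes, signature: str) -> bool:
    """Check a record against the fingerprint taken when it was first stored."""
    return hmac.compare_digest(sign_record(record, key), signature)


def may_delete(created: date, retention_years: int, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    try:
        expiry = created.replace(year=created.year + retention_years)
    except ValueError:  # created on 29 February, target year is not a leap year
        expiry = created.replace(year=created.year + retention_years, day=28)
    return today >= expiry


key = b"example-key-held-outside-the-database"
original = b"order 42: 10 units shipped"
signature = sign_record(original, key)

print(is_unmodified(original, key, signature))
print(is_unmodified(b"order 42: 99 units shipped", key, signature))
print(may_delete(date(2015, 3, 1), 7, date(2022, 3, 1)))
```

Proving integrity to an external party would normally call for a public-key signature, so
that the verifier does not need the secret key; the HMAC is used here only because it is
available in the standard library.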
In summary, an ILM solution must consider:

◄◄ An adaptive storage infrastructure that supports different classes of storage based on
how the data is used during its life cycle. The storage hardware and software utilized
as part of an overall ILM solution must support various requirements, such as availability,
performance, data protection and recovery, security, and cost.
◄◄ Enterprise storage applications that are designed to simplify the management of
complex storage infrastructures.
◄◄ The linking of the business-critical applications and the overall business processes to
the adaptive storage infrastructure to ensure that the right data is available anywhere,
at any time, according to its business relevance, that is, its usage and the value the
business derives from it at any point in time.

1.5.1 The Problem
The basic concept of ILM has been a standard operating procedure in the mainframe
world for several years; outside the mainframe world, ILM is far less mature. In the open
systems world, most IT organizations have some procedures in place (often in the form of
custom scripts and manual administration tasks) to manage the information produced by
applications and users. Unfortunately, these procedures typically suffer from the following
problems:

Lack of integration
Few IT organizations have a holistic approach to information management; procedures
tend to be application- and operation-specific, using different tools with different
management interfaces for each task. This heterogeneity can make it difficult to document
procedures and ensure that the right procedures are in place for every application.

Insufficient distinction between information types


Not all information is of equal value, and the procedures for tasks such as backup and
disaster recovery should be driven by the value of the information. Today's data centers
typically implement one of two classes of ILM: a combination of off-site disaster recovery,
extensive archiving, and comprehensive backup; or a bare-minimum tape backup. This
binary approach can lead to excessive IT costs because administrators tend to err on the
side of caution and overprotect information that may have little residual value.

Growing complexity in data and information management


As organizations have come to rely on computing infrastructures for a growing number of
day-to-day functions, effective information management has become increasingly important.

Growth of digital information storage


New applications, new regulations, and increased use of existing applications are leading
to rapid growth in capacity requirements.

Performance problems caused by information overload


As the amount of information stored by an application grows, the performance of the
application typically degrades, and administrative operations such as backups and disaster
recovery also take longer.

1.5.2 Recommended Solution


The illustration below details the ILM Process model, defining the actions that can be taken
on information at any one time, the options available while taking those actions and the
path an individual should follow to ensure the information remains secure throughout its
lifetime from creation to deletion.

[Figure: ILM Process model. Actions: Creation, Update, Deletion, Transfer, Storage.
Supporting stages: Information Classification, Impact Sensitivity Categorization,
Access Control Policy.]


Creation
A lot of information is created as part of the everyday business process within organizations.
Upon creation, the creator must consider the content of the information they are creating
and make a decision as to whether or not it requires access control. If not, then its lifecycle
can continue without applying the ILM Process. If it does, the Information Classification,
Impact Sensitivity Categorization and Access Control Policy Definition stages of the process
must be performed.

Storage
Information must be stored appropriately to reflect the information protection requirements
defined by the Information Classification and Impact Sensitivity Categorization stages.
Encryption is not mandated, but if the information is highly classified or has high impact
sensitivity then confidentiality assurance must be considered. The physical location of the
storage devices and the encryption of information are example considerations.
Information Sharing
Information shared in collaborative, de-perimeterized environments is subject to a different
threat model from information shared locally. In light of this, the Information Classification,
Impact Sensitivity Categorization and Access Control Policy Definition stages of the process
must be performed before sharing the information, even if this has already been performed
on creation of the information.

Data Transfer
The information transfer protocol between collaborating parties should take account of
the information protection requirements as defined by the Information Classification and
Impact Sensitivity Categorization stages. Encryption is not mandated, but if the information
is highly classified or has high impact sensitivity then confidentiality assurance must be
considered. Endpoint compliance and the encryption of information in transit are example
considerations.

Update
If the information is updated, the updater must consider the content of the information
they are adding or updating and make a decision, using the Information Classification Scheme
and Impact Sensitivity Categorization processes, as to whether or not the changes present
additional and/or modified information protection requirements. Changes in Access Control
Policy and data transfer security are examples of such protection requirements that may
change due to modified information content.

Deletion
Deletion of the information should reflect its classification and impact sensitivity labels. If
the information is labeled as having to be securely destroyed then just placing it in the
system "Trash" is not acceptable.

Information Classification
Information that is shared in de-perimeterized environments must be accurately labeled
with information protection requirements according to the sensitivity of the content within
the information resource in terms of a risk-based assessment of the business impact of an
incident or threat.
The information creator or individual intending to share the information in a collaborative,
de-perimeterized environment must consider the threat to the business if the information
were accessed and/or modified by individuals with an identity outside of a particular domain.
This could be the internal organization, the external business community or named specific
individuals, details of which must be specified in the Information Classification Scheme for
the organization.
Information handling requirements, legality and temporal aspects must also be considered
at the Information Classification phase. Secure destruction of information, clear ownership
rights, changes in classification after a particular date and time, and corporate governance
constraints are examples of the detail that should be evident from the labeling process. If
the information requires classification according to the Information Classification Scheme,
the information must be correctly labeled.
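
A label of the kind described above could be represented as a small data structure. The
sketch below is hypothetical throughout: the field names, the classification values and the
two deletion methods are illustrative choices, not part of any Information Classification
Scheme named in the text. It covers three of the details mentioned: ownership, a change of
classification after a given date and time, and secure destruction.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class InformationLabel:
    classification: str                       # e.g. "restricted"; values are hypothetical
    owner: str                                # clear ownership rights
    secure_destroy: bool = False              # must the information be securely destroyed?
    reclassify_at: Optional[datetime] = None  # temporal aspect of the label
    reclassify_to: Optional[str] = None

    def effective_classification(self, now: datetime) -> str:
        """Classification that applies at a given moment."""
        if self.reclassify_at and self.reclassify_to and now >= self.reclassify_at:
            return self.reclassify_to
        return self.classification


def deletion_method(label: InformationLabel) -> str:
    """Placing securely-destroy information in the system trash is not acceptable."""
    return "secure-erase" if label.secure_destroy else "trash"


label = InformationLabel(
    classification="restricted",
    owner="finance",
    secure_destroy=True,
    reclassify_at=datetime(2025, 1, 1),
    reclassify_to="public",
)
print(label.effective_classification(datetime(2024, 6, 1)))
print(label.effective_classification(datetime(2025, 6, 1)))
print(deletion_method(label))
```

Keeping the handling requirements on the label itself means that storage, transfer and
deletion steps can each consult the same record instead of re-deriving the rules.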

Impact Sensitivity Categorization
Information must be labeled with an Impact Sensitivity level based on the measures of
Confidentiality, Integrity, Authenticity and Availability required to adequately protect the
information in use and in transit. A six-level Impact Sensitivity scale represents the
impact magnitude should the protection measures not be effectively deployed.
The creator, or the individual intending to share the information in a collaborative,
de-perimeterized environment, must conduct an Impact Sensitivity analysis to determine
the controls required to maintain information assurance in that environment
in relation to the Confidentiality, Integrity and Availability requirements of the information.

Access Control
Authentication and Authorization should be applied to principals requesting access to
information.
Appropriate access control technology should then be used to enforce the authorization
response. The de-perimeterization issue makes many of the current technologies that rely
on perimeter security to enforce controls inappropriate for use in an organization.
Access Controls should be reflective of and responsive to the information security
requirements defined in the Information Classification and Impact Sensitivity Categorization
stages.

1.5.3 ILM Benefits

The major benefits of ILM are given below:
◄◄ Improved usage of existing information reduces process cycles and increases
employee efficiency.
◄◄ Costs of the IT infrastructure can be reduced by optimizing the storage environment.
◄◄ Laws and regulations are more easily adhered to due to the relevant policies and a
coordinated infrastructure.
◄◄ The utilization of business data is improved.
◄◄ A variety of options for data backup is provided.

1.6 Data Categorization

DEFINITION: The categorization of stored data for its most effective and efficient use.
Data can be classified according to its critical value or how often it needs to be accessed,
with the most critical or often-used data stored on faster media while other data can
be stored on slower (and less expensive) media.

Why data Categorization?
Data classification is the act of placing data into categories that will dictate the level of
internal controls to protect that data against theft, compromise, and inappropriate use.
Information security is best managed when data is classified and the risks associated with
each category are uniform and understood.
Data classification is an essential part of audit and compliance activities at any organization,
whether in the public or private sector.
All members of any organization have a responsibility to protect the confidentiality, integrity,
and availability of data generated, accessed, modified, transmitted and stored or used by
that organization. Data owned, used, created or maintained by any organization is classified
into the following three categories:
~ Public
~ Official Use Only
~ Confidential
PUBLIC DATA
Public data is information that may or must be open to the general public. It is defined as
information with no existing local, national or international legal restrictions on access or
usage. Public data, while subject to organization disclosure rules, is available to all members
of the organization and to all individuals and entities external to the organization. Some
examples of Public Data include:
~ Publicly posted press releases
~ Publicly posted interactive organization maps, newsletters, newspapers and magazines

OFFICIAL USE ONLY DATA


Official Use Only Data is information that must be guarded due to proprietary, ethical, or
privacy considerations, and must be protected from unauthorized access, modification,
transmission, storage or other use. This classification applies even though there may not
be a civil statute requiring this protection. Official Use Only Data is information that is
restricted to members of the organization who have a legitimate purpose for accessing
such data. Some examples of Official Use Only Data include:
~ Employment data
~ Organization partner or sponsor information where no more restrictive confidentiality
agreement exists
~ Internal telephone books and directories
Official Use Only data:
~ Must be protected to prevent loss, theft, unauthorized access and/or unauthorized
disclosure.
~ Must be stored in a closed container (i.e. file cabinet, closed office, or department
where physical controls are in place to prevent disclosure) when not in use.
~ Must not be posted on any public website.

CONFIDENTIAL DATA


Confidential Data is information protected by statutes, regulations, organization policies or
contractual language. Managers may also designate data as Confidential. Confidential
Data may be disclosed to individuals on a need-to-know basis only. Some examples of
Confidential Data include:
~ Medical records
~ Social Security Numbers
~ Personnel and/or payroll records
~ Bank account numbers and other personal financial information
~ Any data identified by government regulation to be treated as confidential, or sealed
by order of a court of competent jurisdiction
Confidential data:
~ When stored in an electronic format, must be protected with strong passwords and
stored on servers that have protection and encryption measures provided by DivIT
in order to protect against loss, theft, unauthorized access and unauthorized disclosure.
~ Must not be disclosed to parties without explicit management authorization.
~ Must be stored only in a locked drawer or room or an area where access is controlled
by a guard, cipher lock, and/or card reader, or that otherwise has sufficient physical
access control measures to afford adequate protection and prevent unauthorized
access by members of the public, visitors, or other persons without a need-to-know.
~ When sent via fax must be sent only to a previously established and used address or
one that has been verified as using a secured location.
~ Must not be posted on any public website.
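
The three categories and some of their handling rules can be summarized in a short Python sketch. The names `Category`, `may_access` and `may_post_publicly` are hypothetical, and the rules are deliberately simplified from the lists above (for example, management authorization and physical storage requirements are not modelled).

```python
from enum import IntEnum


class Category(IntEnum):
    """The three data categories, ordered from least to most restricted."""
    PUBLIC = 0
    OFFICIAL_USE_ONLY = 1
    CONFIDENTIAL = 2


def may_post_publicly(category: Category) -> bool:
    """Only Public data may appear on a public website."""
    return category is Category.PUBLIC


def may_access(category: Category, is_member: bool,
               has_legitimate_purpose: bool, need_to_know: bool) -> bool:
    """Simplified access rule for each category (illustrative assumption)."""
    if category is Category.PUBLIC:
        return True  # open to members and to external individuals
    if category is Category.OFFICIAL_USE_ONLY:
        # restricted to organization members with a legitimate purpose
        return is_member and has_legitimate_purpose
    # Confidential: disclosed on a need-to-know basis only
    return need_to_know


if __name__ == "__main__":
    print(may_access(Category.OFFICIAL_USE_ONLY, True, True, False))
    print(may_access(Category.CONFIDENTIAL, True, True, False))
```

Ordering the categories as integers is a design choice of this sketch; it makes it easy to compare restriction levels, for instance when the most restrictive category of several combined data sets must be applied.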

1.6.1 Goals of Data Categorization


~ Identify WHAT information exists and WHO needs it.
~ Understand how valuable the information is to each of the individuals,
groups and business processes that require it.
~ Provide a system for protecting information.

Steps for data categorization


~ Step 1: Determine the need and/or requirements for data categorization.
~ Step 2: Determine the roles involved in data categorization.
~ Step 3: Determine the institution's categorization levels.
~ Step 4: Determine the methodology and procedures for categorization of data.
~ Step 5: Determine and review other information security processes impacted by
data categorization.

1.6.2 Architecture for Data Categorization


The Data Reference Model (DRM) provides a standard means by which data should be
described, categorized, and shared. These are reflected within each of the DRM's three
standardization areas:
~ Categorization of data
~ Exchange of data
~ Structure of data
Information sharing can be enabled through the common categorization and structure of
data. By understanding the business context of data, DRM users will be able to communicate
more accurately about the content and purpose of the data they require.
This improved communication on the content and purpose of data will improve the ability
to share information throughout the government.

Figure 1.8. DRM Structure
~ Categorization of Data: The DRM establishes an approach to the categorization
of data through the use of a concept called Business Context. The business context
represents the general business purpose of the data. The business context uses the
FEA Business Reference Model (BRM) as its categorization taxonomy.
~ Exchange of Data: The exchange of data is enabled by the DRM's standard
message structure, called the Information Exchange Package. The information
exchange package represents an actual set of data that is requested or produced from
one unit of work to another. The information exchange package makes use of the
DRM's ability to both categorize and structure data.

Categorization of Data
The DRM provides a standard approach to the categorization of data. To do this, the DRM
establishes the business context of a given set of data. The business context is described
in terms of a Subject Area and a Super Type, and uses the FEA Business Reference Model
(BRM) as its categorization taxonomy.

Exchange of Data
The exchange of data can be enabled through the DRM's information exchange package
concept. The Information Exchange Package represents the actual message or
combination of data that is exchanged between users of the data. The information exchange
package brings the business context and data element (described in the structure of data
section) together to define how a common transaction (the exchange of information and
data) might appear. Exhibit D illustrates the information exchange package concept. Future
volumes of the DRM will continue to expand on the definition and scope of the information
exchange package.
Note: The information exchange package can apply to data that is transmitted or to data that
is shared or retrieved.

Structure of Data
The DRM uses the Data Element to describe the structure of data within a given business
context. The structure of data is comprised of three elements adapted from the ISO/IEC
11179 standard. The Data Object is the set of ideas, abstractions, or things that can be
identified with explicit business meaning and whose properties and behavior follow the same
business rules. In the context of population health management, for example, the data
object could be a vaccination. To further define the data object, the DRM's approach uses a
Data Property and a Data Representation. The data property describes the data element.
In the population health management example, the data property could be name, weight,
potency, etc. (of a vaccine). The data representation is the value type of the data object. For
example, representations could be plain text, integers, whole numbers, etc.
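
The relationship between the data element and the information exchange package can be sketched in Python as follows. The class and field names are hypothetical choices made for illustration; the DRM text describes the concepts but does not prescribe any such code, and the example values reuse the vaccination example above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElement:
    """One data element: object, property and representation (names assumed)."""
    data_object: str          # e.g. "vaccination"
    data_property: str        # e.g. "potency"
    data_representation: str  # value type, e.g. "integer" or "plain text"


@dataclass(frozen=True)
class InformationExchangePackage:
    """Brings a business context and data elements together for an exchange."""
    business_context: str
    elements: Tuple[DataElement, ...]


def describe(element: DataElement) -> str:
    """Render a data element as 'object.property: representation'."""
    return (f"{element.data_object}.{element.data_property}: "
            f"{element.data_representation}")


if __name__ == "__main__":
    package = InformationExchangePackage(
        business_context="population health management",
        elements=(
            DataElement("vaccination", "name", "plain text"),
            DataElement("vaccination", "potency", "integer"),
        ),
    )
    for element in package.elements:
        print(package.business_context, "|", describe(element))
```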

1.6.3 Benefits of Data Categorization


~ Provides a clear picture of the categories of data that exist in the organization.
~ Enables the design and development of a shared grouping of clients for each category
of data.
~ Once data categorization has been determined, the appropriate control activities can
be established.
~ Control activities are the policies, procedures and practices that are put into place to
ensure that business objectives are achieved and risk mitigation strategies are carried
out.
~ Without reliable information systems and effective IT control activities, public companies
would not be able to generate accurate reports.
~ Demonstrates the economic value of data to the business.
~ Eliminates misuse and theft of data and reduces the associated costs.

Review Points
■ Information storage is a central pillar of information technology. A huge amount of high-quality
digital information is being created every moment by individual and corporate consumers
of IT.
■ Data proliferation refers to the very large amount of data, structured and unstructured, that businesses
and governments continue to generate at an unprecedented rate, and to the usability problems that
result from attempting to store and manage that data.
■ DAS describes a server or workstation where all the storage devices are directly attached to the
system. A familiar example of DAS is an IDE drive inside a desktop computer.
■ Storage Area Network (SAN) is a network of storage devices that are connected to each other and to
a server, or cluster of servers, which act as access points to the storage.
■ Internet Protocol SAN (IP-SAN) is a convergence of technologies used in SAN and NAS. IP-SAN
provides block-level communication across a local or wide area network (LAN or WAN), resulting in
greater consolidation and availability of data.
■ SAN-Attached Storage (SAN) is an architecture to attach remote computer storage devices (such as
disk arrays, tape libraries, and optical jukeboxes) to servers in such a way that the devices appear as
locally attached to the operating system.
■ Network-Attached Storage (NAS) uses file-based protocols such as SMB/NFS/AFS where it is clear
that the storage is remote, and computers request a portion of an abstract file rather than a disk
block.
■ Direct Attached Storage (DAS) is made of a data storage device (e.g. a number of hard disks)
connected directly to a computer through a host bus adapter.
■ RAID is a technology to combine multiple small, independent disk drives into an array that looks like
a single, big disk drive to the system.
■ RAID systems were developed to provide storage with various forms of fault tolerance. RAID fault
tolerance is used for storage inside the PC enclosure and in external storage enclosures.
■ Network-attached storage (NAS) is file-level computer data storage connected to a computer network
providing data access to heterogeneous clients.
■ RAID is a data protection tactic of arranging groups of disks into arrays.
■ RAID-0 is disk striping, which is used to improve storage performance, but there is no redundancy.
■ RAID-1 is disk mirroring, which offers disk-to-disk redundancy, but capacity is reduced and performance
is only marginally enhanced.
■ In RAID-5, parity information is spread throughout the disk group, improving read performance and
allowing data for a failed drive to be reconstructed once the failed drive is replaced.
■ In RAID-6, multiple parity schemes are spread throughout the disk group, allowing data for up to two
simultaneously failed drives to be reconstructed once the failed drive(s) are replaced.
■ ILM is the process of managing the placement and movement of data on storage devices as it is
generated, replicated, electronically distributed, protected, archived, and ultimately retired.
■ ILM is about creating policies that can fit the information flow into the business environment as part
of an overall architecture. By its nature, ILM is most effectively implemented using an
application-specific solution approach.
■ The major benefits of an ILM are improved usage of existing information, which reduces process cycles
and increases employee efficiency; reduced costs of the IT infrastructure; easier adherence to laws and
regulations; improved utilization of business data; and a variety of options for backup of data.
■ Tier 0 is fast data storage (Flash Memory or Solid State Disk), used to ensure that data can be
accessed very quickly.
■ Tier 1 is mission-critical data (such as revenue data), making up about 15% of all data: very fast
response time, FC or SAS disk, FC-SAN, data mirroring, local and remote replication, automatic
failover, 99.999% availability, recovery time objective: immediate, retention period: hours.
■ Tier 2 is vital data, approx. 20% of data: less critical data but fast response time, FC or SAS disk,
FC-SAN or IP-SAN (iSCSI), point-in-time copies, 99.99% availability, recovery time objective:
seconds, retention period: days.
■ Tier 3 is sensitive data, about 25% of data: moderate response times, SATA disk, IP-SAN (iSCSI),
virtual tape libraries, MAID, disk-to-disk-to-tape periodical backups, 99.9% availability, recovery
time objective: minutes, retention period: years.
■ Tier 4 is non-critical data, ca. 40% of the data: tape, FC-SAN or IP-SAN (iSCSI), 99.0%
availability, recovery time objective: hours/days, retention period: unlimited.
■ Data classification is the act of placing data into categories that will dictate the level of internal
controls to protect that data against theft, compromise, and inappropriate use.
■ Data categorization provides a clear picture of the categories of data that exist in the organization.
■ Without reliable information systems and effective IT control activities, public companies would not
be able to generate accurate reports.
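
The availability percentages quoted for Tiers 1 to 4 in the review points above are easier to compare when converted into permitted downtime per year. The following Python sketch does that arithmetic; it assumes a 365-day year, and the function name is a hypothetical choice for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def annual_downtime_minutes(availability_percent: float) -> float:
    """Maximum downtime per year (in minutes) allowed by an availability figure."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Availability figures taken from the tier descriptions
    tiers = {"Tier 1": 99.999, "Tier 2": 99.99, "Tier 3": 99.9, "Tier 4": 99.0}
    for tier, availability in tiers.items():
        minutes = annual_downtime_minutes(availability)
        print(f"{tier}: {availability}% -> {minutes:.1f} minutes per year")
```

Under this assumption, 99.999% availability allows roughly 5.3 minutes of downtime per year, 99.99% about 52.6 minutes, 99.9% about 8.8 hours, and 99.0% about 3.65 days.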

Review Questions
1. Define data. Also discuss data proliferation in detail.
2. Briefly explain the various storage technologies and their architecture.
3. Discuss storage infrastructure.
4. What is the information cycle? Explain with an example.
5. Discuss information lifecycle management and its characteristics.
6. Discuss the RAID concept.
7. Why implement ILM? Comment.
8. What is Tiered Storage?
9. Define and enforce Compliance Policies.
10. Discuss data categorization.

□□□
