
SNIA-SA 110
Essentials of Storage Networking
Chapter 5
Data Management
Version 1.1

© COPYRIGHTED 2004
Coverage

1. What is Data Management?

2. Data Management and ILM

3. Data Management and Data Protection

4. Data Management for Disaster Recovery and Business Continuity

Section 1: What is Data Management?

What is Data Management?
• In the Storage Domain: (defined by SNIA)
• Data management is the management, control, and
operation of data services
• Data Services are the processes and practices of data
handling, retention, protection, movement, distribution, and
accessibility
• Examples: backup and recovery, archive, replication,
DR, HSM

Data Management Services
• In the context of Information Lifecycle Management

• The control of data from the time it is created until it no
longer exists.
• Data Management Services are not in the data path; rather,
they provide control of, or utilize, data in the delivery of their
services.
• This includes services such as data movement, data
redundancy, and data deletion

Section 2: Data Management and ILM

Information Lifecycle
Management - Vision
• ILM VISION:

• A new set of management practices based on aligning the
business value of information to the most appropriate and
cost-effective infrastructure

Reference: SNIA DMF ILM-Initiative

ILM - Definition
• Information Lifecycle Management is comprised of the policies,
processes, practices, and tools used to align the business value
of information with the most appropriate and cost effective IT
infrastructure from the time information is conceived through its
final disposition.
• Information is aligned with business processes through
management of policies and service levels associated with
applications, metadata, information, and data.

Reference: SNIA DMF ILM-Initiative

Information Lifecycle
Management - Principles
• ILM PRINCIPLES:

• Information is data that is exchanged, expressed, or represented
within a context such as an application or a process.
• ILM aligns business processes with IT solutions through definition
of appropriate service levels and policies.
• ILM spans data management and information management
services.
• ILM spans the storage, compute, and network infrastructures.

Reference: SNIA DMF ILM-Initiative

High-Level ILM Vision –
An ILM Framework for the Datacenter
• The ILM framework provides management and control of the IT
Infrastructure abstracted in terms that are relevant to Business
Requirements in the management of Information used by a
Business Process

Reference: SNIA DMF ILM-Initiative

ILM and Data Management
• Relationship:
• ILM is a management practice that sets Service Level
Objectives and policies for information and uses data
services to enact those policies
• To achieve the interoperability goal of ILM, we have to first
define and develop interoperability for the elements of data
management
• SNIA SMI-S

Section 3: Data Management and
Data Protection
2.1. Backup and Recovery
2.2. Snapshot
2.3. Replication Techniques, and Mirroring Concepts
2.4. High-Availability

2.1. Backup and Recovery

Backup Defined
SNIA defines:
“Backup is a collection of data stored on (usually removable)
non-volatile storage media for purposes of recovery in case the
original copy of data is lost or becomes inaccessible. Also called
backup copy. To be useful for recovery, a backup must be made
by copying the source data image when it is in a consistent
state.”

Enterprise Backup
Architecture
• Backup client
• Any computer with data to back up
• Backup servers
• Copy data to backup media and maintain the historical
information
• Backup storage units
• Tapes, magnetic disks, optical disks

[Figure: Backup components. A backup client and a backup server with its catalog, connected over the LAN and the SAN to tape storage.]

Backup Components
• Hardware
• Host for Backup Server
• Software
• Backup software
• e.g., Veritas NetBackup
• e.g., CA BrightStor ARCserve Backup
• e.g., Legato NetWorker

Backup Techniques
• Full Backup
• Incremental Backup
• Differential Backup

Full Backup
• What
• Is a copy of all the data selected for backup, whether or not
it has changed since the preceding backup

Incremental Backup
• What
• Is a copy of only the files that changed since the preceding
backup
• Impacts
• Backup
• Copies only a small percentage of the data if the data does not change much
• Faster backup time
• Restore
• Restoration is tedious because the base full backup must
be restored first, followed by each subsequent incremental
backup in order (see example)
• Longer restoration time

Restoration of Incremental
Backup Example:
• Disaster – Friday
• Restore Monday full backup first
• Restore all incrementals through Thursday
• Longer restoration
[Figure: Monday full backup A, followed by incrementals B (Tuesday), C (Wednesday), and D (Thursday). Restoration applies A, B, C, D in order.]

Incremental Backup
• Types
• Block Level
[Figure: Block-level incremental backup, showing Table 1 and Table 2, their snapshot copy, the backup manager, and the changed blocks.]

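Differential Backup
• Types
• Cumulative

Example
• Disaster – Friday
• Restore Monday
• Restore Thursday
[Figure: Monday full backup A, followed by differentials B (Tuesday), C (Wednesday), and D (Thursday). Restoration applies only A and D.]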
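
The two restoration examples (A, B, C, D for incremental; A, D for differential) can be expressed as a small selection routine. This is a minimal sketch, not a description of any backup product: the `Backup` record, the `restore_chain` function, and the day numbering (0 = Monday) are hypothetical names chosen for illustration, and the sketch assumes each week uses either incrementals or differentials after the full backup, not a mix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    label: str  # label used in the slide figures, e.g. "A"
    day: int    # 0 = Monday, 1 = Tuesday, ... (hypothetical numbering)
    kind: str   # "full", "incremental", or "differential"


def restore_chain(backups, disaster_day):
    """Return the backups to restore, in order, after a disaster on disaster_day."""
    usable = sorted((b for b in backups if b.day < disaster_day), key=lambda b: b.day)
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup exists before the disaster")
    base = fulls[-1]  # most recent full backup
    chain = [base]
    for b in usable:
        if b.day <= base.day:
            continue
        if b.kind == "incremental":
            chain.append(b)    # every incremental since the full is needed
        elif b.kind == "differential":
            chain = [base, b]  # the newest differential replaces older ones
    return chain


incremental_week = [
    Backup("A", 0, "full"),
    Backup("B", 1, "incremental"),
    Backup("C", 2, "incremental"),
    Backup("D", 3, "incremental"),
]
differential_week = [
    Backup("A", 0, "full"),
    Backup("B", 1, "differential"),
    Backup("C", 2, "differential"),
    Backup("D", 3, "differential"),
]

FRIDAY = 4
incremental_labels = [b.label for b in restore_chain(incremental_week, FRIDAY)]
differential_labels = [b.label for b in restore_chain(differential_week, FRIDAY)]
```

With a Friday disaster, the incremental schedule needs four restore steps and the differential schedule needs two, which is the trade-off the slides describe: incrementals are faster to take, differentials are faster to restore.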
Remote Backup
• What is remote backup?

• Remote backup is the use of a long-distance link to enable
backup to a remote site. Normal backup techniques are used,
with the difference being that backup tapes and media are
stored far away from the servers being backed up. This ensures
the safety of the data in the case of a geographically limited disaster.

Remote Backup
• Functions of Remote Backup
• Works like regular data backup software
• Sends backups over the Internet, regular phone lines, or other
network connections to a backup server
• Backs up data at any time
• Constantly reevaluates the computer system and adds files
to be backed up as needed
• Encrypts data for security
• Automatically stores this valuable data at more than one site

Remote Backup
• How does Remote Backup work?
• Install the Remote Backup client
• When the scheduled time is reached, Remote Backup will “wake
up”
• It determines which files need to be backed up
• It determines what kind of backup to perform
• It then compresses those files into archives
• These archives are then encrypted
Remote Backup
Diagram of Remote Backup
[Figure: Servers, FC disk, and tapes on a local SAN, connected through gateways over a WAN to a remote SAN with tapes.]
2.2. Snapshot

Snapshots

Snapshot:
A snapshot is a point-in-time view of the data that is created by
saving the original data to a repository whenever data in the base
volume is overwritten. The technique that allows a snapshot to be
created instantaneously is copy-on-write.
The snapshot process creates an empty repository that holds the
original values of data that later changes in the base volume after
the time of snapshot creation.

Snapshots
[Figure: The base volume holds blocks A and B', where B' has overwritten B. The repository holds the original block B. The snapshot image presents A and B.]

Snapshots

(cont’d of snapshot)
The primary purpose of a snapshot is to facilitate
non-disruptive backups. The snapshot image becomes the source
of the backup/restore information. A common reason for
restoring information is user error. It is easier and faster to reinstate
selected files using snapshots. Snapshot images also provide a
convenient source for testing and training environments and for
data mining purposes. Traditional methods of finding data and
duplicating it may prove to be costly and time consuming.

Snapshots
There are 2 snapshot techniques:
1. Copy-on-write
2. Split-Mirror

1. Copy-On-Write
Whenever a copy of data is requested, the disk
subsystem sets up a second pointer (snapshot index)
and represents it as a new copy. This snapshot
index is empty by default.

Snapshots
How does it work?
A snapshot is a logical copy of data that is created by
saving the original data to a snapshot index, which is
updated whenever data in the base volume is updated. The
snapshot process creates an empty snapshot index, which then
holds the original values of data that later changes in the base
volume after the time of snapshot creation.

Further details:
The snapshot is actually seen by combining the base volume
data with the snapshot index containing the original data at the
moment the snapshot was taken. Copy-on-write technology
enables the instantaneous nature of the snapshot, while only
requiring a fraction of the base volume disk space.
Snapshots
(cont’d)
Copy-on-write provides efficiency by requiring only a fraction of the
base volume disk space. The average disk space requirement for a
snapshot copy is about 10%-20% of the base volume space. Actual
space depends on how long the snapshot is active and how many
writes are being made to the base volume. Copy-on-write
technology is efficient to use except in a heavy write environment or
when the copy is required to be active for a long period of time.
Copy-on-write technology is effective as a backup source image.
Because the required disk space is less than that of a full volume copy,
periodic snapshots can be made throughout the day as copy points to
reference in the event of restoration.
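
The copy-on-write mechanism described above can be modelled in a few lines. This is a toy sketch under simplifying assumptions: the volume is a Python list of blocks, and the `Volume` and `Snapshot` classes and their method names are hypothetical, not any vendor's interface. It shows why the repository starts empty, why it grows only as the base volume is overwritten, and how the snapshot image is assembled from the base volume plus the repository.

```python
class Volume:
    """A base volume modelled as a list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def create_snapshot(self):
        # Creating a snapshot copies no data, which is why it is instantaneous.
        snapshot = Snapshot(self)
        self.snapshots.append(snapshot)
        return snapshot

    def write(self, index, value):
        # Copy-on-write: save the original block to each snapshot's
        # repository before it is overwritten in the base volume.
        for snapshot in self.snapshots:
            snapshot.preserve(index, self.blocks[index])
        self.blocks[index] = value


class Snapshot:
    """A point-in-time view: repository blocks override base volume blocks."""

    def __init__(self, volume):
        self.volume = volume
        self.repository = {}  # empty at creation time

    def preserve(self, index, original):
        # Only the first overwrite after the snapshot matters.
        self.repository.setdefault(index, original)

    def read(self, index):
        if index in self.repository:
            return self.repository[index]
        return self.volume.blocks[index]

    def image(self):
        return [self.read(i) for i in range(len(self.volume.blocks))]


volume = Volume(["A", "B"])
snapshot = volume.create_snapshot()
volume.write(1, "B'")
```

After the write, the base volume holds A and B', the repository holds only the original B, and the snapshot image still reads as A and B. The repository's size depends on how many distinct blocks are overwritten while the snapshot is active, which is why heavy-write workloads and long-lived snapshots erode the space advantage.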

Snapshots
Snapshot type: Server-based software
  Pros: Tightly integrated with backup application
  Cons: Operating system dependency. Snapshot index updates must communicate with the server, which decreases performance.
  Products: Legato NetWorker; Veritas NetBackup FlashBackup; Computer Associates BrightStor; IBM Tivoli

Snapshot type: Storage-based
  Pros: Improves efficiency of managing snapshots. Supports multiple operating systems with partitioned storage.
  Cons: Typically no write capability; mostly used for backup. Application integration. Specific approach unique to each storage vendor.
  Products: Compaq snap and clone; IBM FlashCopy for FAStT; LSI Logic Snapshot; StorageTek D1xx series and V960 SVA; CLARiiON SnapView (all typically allow writes)

Snapshot type: Switch or device based
  Pros: Heterogeneous storage system support
  Cons: Immature products and limitations from various device support. Application integration.
  Products: DataCore: SANsymphony, snapshot option; FalconStor: IPStor snapshot copy option; StoreAge: MultiView
Snapshots
2. Split-Mirror
Split-mirror technology is used to maintain 2 or more up-to-date
full copies of the data. Every write request to the original data is
automatically duplicated to the other mirrors or copies of that data.
The mirrors may be contained in the same subsystem or be spread
between different subsystems, although these typically must be of the
same subsystem model.
The primary purpose of split-mirror technology is disaster
recovery. To protect against a system failure, mirrors must be written
between 2 subsystems that are an appropriate distance apart, so that
a disaster does not affect both systems at the same time. Often 2
subsystems are mirrored and sit in the same data center. This guards
against hardware failure. The greater the distance, the greater the
delay and the lower the performance. Asynchronous modes
of data transfer are available to accommodate wide-area distances.

Snapshots
(cont’d)
Split-mirror technology provides real-time redundancy; while the
mirror is active, it is not a frozen image or snapshot. The mirror can be
temporarily suspended, which is called a split mirror, to create a
snapshot or point-in-time copy. The disk subsystem is told to
temporarily stop making updates to the mirrored copy so the data is
frozen at the point of the suspension. The split mirror can then be
used for the backup process.
With the split capability, mirrors create an instant copy, or snapshot,
of the data. A full data copy is available; usually a third mirror copy is
established for the purpose of splitting.
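
The split operation can be illustrated with a toy model. This is a sketch under stated assumptions, not a vendor implementation: the `MirroredVolume` class and its methods are hypothetical, each mirror is a full in-memory copy, the last mirror is the one reserved for splitting, and resynchronisation is modelled as a plain full copy from the primary.

```python
class MirroredVolume:
    """Keeps several full copies of the data and can split one off."""

    def __init__(self, blocks, copies=3):
        self.mirrors = [list(blocks) for _ in range(copies)]
        self.split_index = None  # index of the suspended mirror, if any

    def write(self, index, value):
        # Every write goes to all mirrors that are still active.
        for i, mirror in enumerate(self.mirrors):
            if i != self.split_index:
                mirror[index] = value

    def split_mirror(self):
        # Suspend updates to the last mirror; its contents are now frozen.
        self.split_index = len(self.mirrors) - 1
        return self.mirrors[self.split_index]

    def resync(self):
        # Assumption: rejoin by copying the primary in full.
        if self.split_index is None:
            return
        self.mirrors[self.split_index] = list(self.mirrors[0])
        self.split_index = None


volume = MirroredVolume(["A", "B"])
volume.write(0, "A1")
frozen = volume.split_mirror()  # point-in-time copy for backup
volume.write(1, "B1")           # production continues on the other mirrors
```

The frozen copy keeps the state at the moment of the split while the primary and secondary continue to receive writes, which matches the three-copy arrangement of two real-time copies plus one point-in-time copy.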

Snapshots
The splitting process requires 3 entire copies of the data volume to
provide protection and to allow continuous processing for backup and
other development needs. There are primary and secondary real-time
copies, and a tertiary point-in-time copy of the data. Data can be updated
for development or training purposes.
Products that utilize split-mirror:
EMC TimeFinder
Hitachi InstantSplit
HP SureStore Business Copy
Sun StorEdge Instant Image
Xistech

Snapshots

Copy-on-write vs. split-mirror


Copy-on-write:
• is an efficient method to create an image
• utilizes less disk space, so multiple copies can be kept as
restore points or for other purposes
• minimizes disk space considerations
• saves time and data with multiple roll-back copies

Snapshots

Split-mirror:
• can be costly
• may be appropriate protection from disk subsystem failures
and disasters
• is effective for rapidly churning data
• is effective for copies that will be utilized for a long time

2.3. Replication Techniques and Mirroring Concepts

Replication Techniques
• Synchronous Replication
• Asynchronous Replication

Synchronous Replication

• This technique is closely related to the traditional RAID-1 mirror
implementation within the storage array.
• With this implementation, the source can be up to about 100 km
from the target and replication can still be done. However, the
application is forced to wait for acknowledgement from the
remote site before the write completes.

Synchronous Replication
• The application server must wait for steps 1 through 6 to
complete, which usually takes about 1 millisecond.
[Figure: Synchronous replication. An application server and two mirroring agents, each with its own storage; the write path is numbered in steps 1 to 6.]

Asynchronous Replication

Asynchronous Replication:
This technique allows the replication process to be
separated from the local write so that the application server does not
suffer the performance penalty. Although this is fast, the
remote copy is not guaranteed to be a perfect copy of the source.

Asynchronous Replication
• The application server waits only for steps 1 through 3, which is
faster than synchronous replication.
[Figure: Asynchronous replication. An application server and two mirroring agents, each with its own storage; the write path is numbered in steps 1 to 6.]

Replication Comparison

Feature                        Synchronous   Asynchronous
Distance limit                 100 km        None
Remote data in lockstep        Yes           No
Use low-bandwidth connection   No            Yes
Instant failover ability       Yes           No
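
The distance row of this comparison follows from simple arithmetic on what the application has to wait for. The sketch below is a rough model under stated assumptions, not a measurement: it assumes a signal delay of about 5 microseconds per kilometre in optical fibre, treats the local write, the round trip, and the remote write as strictly sequential, and uses hypothetical function names and example timings.

```python
# Assumption: light travels through optical fibre at roughly 200,000 km/s,
# which is about 5 microseconds per kilometre.
FIBER_DELAY_US_PER_KM = 5.0


def sync_write_latency_us(local_write_us, remote_write_us, distance_km):
    """Time the application waits when it needs the remote acknowledgement."""
    round_trip_us = 2 * distance_km * FIBER_DELAY_US_PER_KM
    return local_write_us + round_trip_us + remote_write_us


def async_write_latency_us(local_write_us):
    """Time the application waits when only the local write is acknowledged."""
    return local_write_us


# Hypothetical example: 100 microsecond writes at each end, sites 100 km apart.
sync_us = sync_write_latency_us(100, 100, 100)
async_us = async_write_latency_us(100)
```

At 100 km the round trip alone adds about 1 millisecond to every write, and that cost grows linearly with distance. This is why synchronous replication is quoted with a distance limit while asynchronous replication is not.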

Mirroring Concepts
1. Disk Mirroring
Disk mirroring is a technique in which data is written to 2 duplicate disks
simultaneously. This way, if one of the disk drives fails, the
system can instantly switch to the other disk without any loss of data
or service. Disk mirroring is commonly used in on-line database
systems where it is critical that the data be accessible at all times.

[Figure: The same data written to each of the mirrored disks.]

Different levels of disk-mirrored protection can be created by
mirroring at different levels of the server's hardware:
remote, power domain, bus, IOP, or disk unit.
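
The failover behaviour of a mirrored pair can be sketched as follows. This is a toy illustration with hypothetical class names (`Disk`, `MirroredPair`, `DiskFailure`); failure is simulated with a flag, and the sketch covers only disk-unit level mirroring, not the bus, IOP, or remote levels.

```python
class DiskFailure(Exception):
    """Raised when a simulated disk cannot service a request."""


class Disk:
    def __init__(self):
        self.blocks = {}
        self.failed = False

    def write(self, index, data):
        if self.failed:
            raise DiskFailure("disk has failed")
        self.blocks[index] = data

    def read(self, index):
        if self.failed:
            raise DiskFailure("disk has failed")
        return self.blocks[index]


class MirroredPair:
    """Writes go to both disks; reads fall back to the surviving disk."""

    def __init__(self):
        self.disks = [Disk(), Disk()]

    def write(self, index, data):
        written = 0
        for disk in self.disks:
            try:
                disk.write(index, data)
                written += 1
            except DiskFailure:
                continue
        if written == 0:
            raise DiskFailure("both disks have failed")

    def read(self, index):
        for disk in self.disks:
            try:
                return disk.read(index)
            except DiskFailure:
                continue
        raise DiskFailure("both disks have failed")


pair = MirroredPair()
pair.write(0, "x")
pair.disks[0].failed = True  # simulate losing one drive
value_after_failure = pair.read(0)
```

The read still succeeds after one drive is lost because the second disk holds an identical copy. Mirroring at higher levels (controller, IOP, bus) applies the same idea to more of the hardware path.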
Disk Mirroring
[Figure: Mirrored hardware path from the bus through the input/output processor (IOP) and controller to two disk units.]
2.4. High-Availability

High Availability
• What Is High Availability?
• High availability means minimal downtime for applications to
ensure business continuity. A highly available system must be
resilient to unexpected hardware and software failures and must
handle concerns such as disaster recovery.

The Advantages and
Disadvantages
• Advantages of high availability:
• Reliability (alternate paths)
• Performance - multiple levels of redundancy reduce
downtime
• Integrity - no single point of failure

• Disadvantages of high availability:
• Increased complexity
• Costly

HA Applications
• Redundant and Failover Paths
• Clustering

High Availability – Failover Path

• Zero Downtime
• Failover features
[Figure: A Sun Blade 150 host connected to an HDS 9910 storage array.]

High Availability - Clustering
Clustering: a collection of the same kind of objects
Benefits:
• Functions as a single system in case of failure
• High availability
• Reliability
• More servers can share the SAN
• Promotes remote disaster recovery capabilities
Types of cluster topologies:
1. 1+1 Cluster: often used with 2 nodes
2. N+1 Cluster: there is one standby server for all the servers in
the SAN environment
3. N+N Cluster: there is a standby for each server in the environment
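
The difference between the three topologies comes down to how many standby servers are provisioned for a given number of active servers. A minimal sketch, with a hypothetical function name and the topology given as a string:

```python
def standby_servers(topology, active_servers):
    """Return how many standby servers a cluster topology provisions."""
    if active_servers < 1:
        raise ValueError("a cluster needs at least one active server")
    if topology == "1+1":
        if active_servers != 1:
            raise ValueError("a 1+1 cluster has exactly one active server")
        return 1
    if topology == "N+1":
        return 1               # one standby shared by all active servers
    if topology == "N+N":
        return active_servers  # one standby per active server
    raise ValueError(f"unknown topology: {topology}")


shared_standby = standby_servers("N+1", 4)
dedicated_standbys = standby_servers("N+N", 4)
```

For four active servers, N+1 adds one machine and N+N adds four. N+1 is cheaper but covers only one failure at a time, while N+N can cover a failure of every active server.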

High Availability – Clustering
[Figure: 1 + 1 clustering, showing two Sun Blade 150 servers.]
[Figure: N + N clustering, showing four Sun Blade 150 servers.]
[Figure: N + 1 clustering, showing four Sun Blade 150 servers.]

Section 4: Data Management for
Disaster Recovery and Business
Continuity
• What is Disaster Recovery?

• Terms in Disaster Recovery

• Tier Levels of Disaster Recovery

• Disaster Recovery Solutions

Disaster Recovery
• What is Disaster Recovery?

• Process of restoring operations of a business or organization
in the event of a catastrophe.

• Data must be available at all times.

• Most recent data must be recovered quickly with minimum
manual intervention.

The Goal
• The goal is to continue business operations after a loss of use
of all or part of a data center

• Business Continuity Planning:

• Business continuity planning is not just recovering technology
• It is about maintaining, resuming, and recovering the
business.

Business Continuity Components
• Data Center Recovery Alternatives
• Backup recovery facilities
• Geographic diversity
• Backup and storage strategies
• Data file backup
• Software Backup
• Offsite storage
• Site relocation
• Post Disaster Communication

Terms in Disaster Recovery
• Recovery Point Objective (RPO)
• The point in time to which application data must be recovered to
resume business transactions
• Traditionally, RPO is measured in hours, but newer technology
shortens RPO to seconds, minimizing loss of data, revenue, and customers.

• Recovery Time Objective (RTO)
• The maximum elapsed time after a disaster within which data must
be recovered to resume business.
• Traditional methods have an RTO of days or weeks, but newer
technology shortens RTO to minutes.
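
Both objectives can be checked against an actual incident with simple time arithmetic. The sketch below is illustrative: the function names and the example timestamps are hypothetical, and it takes the data-loss window to be the time between the last usable backup and the disaster, and the downtime to be the time between the disaster and the resumption of service.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup, disaster, service_restored):
    """Return (data_loss_window, downtime) for one incident."""
    data_loss_window = disaster - last_backup  # compare against the RPO
    downtime = service_restored - disaster     # compare against the RTO
    return data_loss_window, downtime


def meets_objectives(last_backup, disaster, service_restored, rpo, rto):
    """True when both the RPO and the RTO were met for this incident."""
    data_loss_window, downtime = recovery_metrics(
        last_backup, disaster, service_restored
    )
    return data_loss_window <= rpo and downtime <= rto


# Hypothetical incident: nightly backup at 02:00, disaster at 14:00,
# service restored at 02:00 the next day.
last_backup = datetime(2004, 1, 5, 2, 0)
disaster = datetime(2004, 1, 5, 14, 0)
service_restored = datetime(2004, 1, 6, 2, 0)

data_loss_window, downtime = recovery_metrics(last_backup, disaster, service_restored)
within_targets = meets_objectives(
    last_backup, disaster, service_restored,
    rpo=timedelta(hours=24), rto=timedelta(hours=8),
)
```

In this example 12 hours of data are lost, which is inside a 24-hour RPO, but the 12-hour outage exceeds an 8-hour RTO, so the incident fails its objectives. Shortening the backup interval improves the first number; faster restore infrastructure improves the second.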

Remote/Local
Backup and Recovery
[Figure: Recovery Point Objective. A timeline marking the backup and the disaster, with the RPO between them.]
[Figure: Recovery Time Objective. A timeline marking the disaster and the recovery, with the RTO between them.]

Tier Levels of Disaster Recovery
Solutions

Reference: The IBM TotalStorage Solutions Handbook

Tier Levels of Disaster Recovery
Solutions (cont…)
• Tier 0 - No off-site data
• Businesses with a Tier 0 Disaster Recovery solution have no
Disaster Recovery Plan.
• Tier 1 - Data backup with no Hot-Site
• Businesses that use Tier 1 Disaster Recovery solutions back
up their data at an off-site facility (PTAM - Pickup Truck
Access Method).
• Depending on how often backups are made, they are
prepared to accept several hours to days of data loss, but
their backups are secure off-site. However, this Tier lacks
the systems on which to restore data.

Tier Levels of Disaster Recovery
Solutions (cont…)
• Tier 2 - Data Backup with a Hot-site
• Make regular backups on tape.
• This is combined with an off-site facility and infrastructure
(known as a hot-site) in which to restore systems from those
tapes in the event of a disaster.
• This Tier of solution will still result in the need to recreate
several hours' to days' worth of data, but it is less unpredictable
in recovery time.

• Tier 3 - Electronic vaulting
• Tier 3 solutions utilize components of Tier 2. Additionally,
some mission-critical data is electronically vaulted.
• This electronically vaulted data is typically more current than
that which is shipped via PTAM. As a result there is less data
recreation or loss after a disaster occurs.

Tier Levels of Disaster Recovery
Solutions (cont…)
• Tier 4 - Point-in-time Copies
• Used by businesses that require both greater data currency
and faster recovery than users of lower Tiers.
• Incorporate more disk-based solutions.
• Several hours of data loss is still possible, but it is easier to
make such point-in-time copies with greater frequency than
data can be replicated in the lower tiers.
• Tier 5 - Transaction Integrity
• Tier 5 solutions are used by businesses with a requirement
for consistency of data between production and recovery
data centers (software two site, two phase commit).
• There is little to no data loss in such solutions. However, the
presence of this functionality is entirely dependent on the
application in use.

Tier Levels of Disaster Recovery
Solutions (cont…)
• Tier 6 - Zero or little data loss
• Maintain the highest levels of data currency.
• Used by businesses with little or no tolerance for data loss
and who need to restore data to applications rapidly.
• These solutions have no dependence on the applications to
provide data consistency.
• Tier 7 - Highly automated, business integrated solution
• Include all the major components being used for a Tier 6
solution with the additional integration of automation.
• This allows a Tier 7 solution to ensure consistency of data
above that which is granted by Tier 6 solutions.
• Additionally, recovery of the applications is automated,
allowing for restoration of systems and applications much
faster and more reliably than would be possible through
manual Disaster Recovery procedures.

Disaster Recovery Solutions
• Important SAN solutions for DR are:

• Remote Data Mirroring

• Remote Data Replication

• Electronic Tape Vaulting

Remote Data Mirroring
• What is Data Mirroring?
• Method of replicating data: having more than one copy of data.
• Benefits
• Reduces downtime after a disaster
• Multiple copies are available.
[Figure: Data mirroring. Servers at the primary data center mirror data over the SAN to the remote DR center.]

Remote Data Replication
Method of periodically replicating specific data or file systems.

Two modes available:
• Synchronous and Asynchronous.

In Data Replication, both original data and mirrored
volume can be seen.
[Figure: Data replication. Servers at the primary site and the disaster recovery site, linked by DWDM equipment over a WAN.]
Electronic Tape Vaulting
• Enables remote tape backup through WAN
[Figure: Tape vaulting. A backup server and tape at the primary site connect through routers over a WAN to a backup server and tape at the DR site.]
Class Discussion
Team A
• Describe data management concepts (Backup & Recovery,
Information Lifecycle)
Team B
• Given a scenario, identify the advantages and disadvantages of
replication, snapshot and split mirror disk backup techniques
Team C
• Identify how the emergence of SMI-S policy-based management can
be an advantage to data and storage management

References
• SNIA – Data Management Forum: Vision & Directions
• SNIA DMF ILM-Initiative
• The IBM TotalStorage Solutions Handbook
