Oracle Databases on EMC Symmetrix Storage Systems
Version 1.3
Yaron Dar
Copyright © 2008, 2009, 2010, 2011 EMC Corporation. All rights reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS
PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.
For the most up-to-date regulatory document for your product line, go to the Technical Documentation and
Advisories section on EMC Powerlink.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
All other trademarks used herein are the property of their respective owners.
H2603.3
Figures
Title Page
1 Oracle Systems Architecture......................................................................... 27
2 Physical data elements in an Oracle configuration ................................... 30
3 Relationship between data blocks, extents, and segments....................... 32
4 Oracle two-node RAC configuration........................................................... 37
5 Symmetrix VMAX logical diagram ............................................................. 47
6 Basic synchronous SRDF configuration ...................................................... 54
7 SRDF consistency group ............................................................................... 57
8 SRDF establish and restore control operations .......................................... 63
9 SRDF failover and failback control operations .......................................... 65
10 Geographically distributed four-node EMC SRDF/CE clusters............. 67
11 EMC Symmetrix configured with standard volumes and BCVs ............ 69
12 ECA consistent split across multiple database-associated hosts............. 73
13 ECA consistent split on a local Symmetrix system ................................... 74
14 Creating a copy session using the symclone command ........................... 77
15 TimeFinder/Snap copy of a standard device to a VDEV......................... 80
16 SRM commands.............................................................................................. 82
17 EMC Storage Viewer...................................................................................... 87
18 PowerPath/VE vStorage API for multipathing plug-in........................... 91
19 Output of rpowermt display command on a Symmetrix VMAX device ............... 94
20 Device ownership in vCenter Server........................................................... 95
21 Virtual Provisioning components .............................................................. 101
22 Virtual LUN eligibility tables ..................................................................... 103
23 Copying a cold (shutdown) Oracle database with TimeFinder/Mirror ............... 112
24 Copying a cold Oracle database with TimeFinder/Clone ..................... 114
25 Copying a cold Oracle database with TimeFinder/Snap....................... 116
26 Copying a running Oracle database with TimeFinder/Mirror............. 119
27 Copying a running Oracle database with TimeFinder/Clone .............. 121
28 Copying a running Oracle database with TimeFinder/Snap................ 123
Tables
Title Page
1 Oracle background processes ........................................................................ 28
2 SYMCLI base commands ............................................................................... 49
3 TimeFinder device type summary................................................................ 79
4 Data object SRM commands .......................................................................... 83
5 Data object mapping commands .................................................................. 83
6 File system SRM commands to examine file system mapping ................ 84
7 File system SRM command to examine logical volume mapping ........... 85
8 SRM statistics command ................................................................................ 85
9 Comparison of database cloning technologies ......................................... 154
10 Database cloning requirements and solutions .......................................... 154
11 Background processes for managing a Data Guard environment......... 280
12 Initialization parameters .............................................................................. 331
13 Background processes for managing a Data Guard environment......... 353
14 FAST VP Oracle test environment .............................................................. 390
15 Initial tier allocation for test cases with shared ASM disk group .......... 391
16 FINDB initial tier allocation......................................................................... 393
17 Initial AWR report for FINDB ..................................................................... 393
18 Oracle database tier allocations-initial and FAST VP enabled ............... 395
19 FAST VP enabled database response time from the AWR report ......... 395
20 FINDB and HRDB initial storage tier allocation....................................... 397
21 Initial AWR report for FINDB ..................................................................... 397
22 FAST VP enabled database transaction rate changes .............................. 399
23 Initial tier allocation for a test case with independent ASM disk groups ............... 399
24 Initial AWR report for CRMDB and SUPCHDB....................................... 401
25 FAST VP enabled AWR report for CRMDB and SUPCHDB .................... 402
26 Storage tier allocation changes during the FAST VP-enabled run ........ 403
27 Test configuration ......................................................................................... 408
28 Storage and ASM configuration for each test database........................... 409
29 Database storage placement (initial) and workload profile.................... 409
30 Initial Oracle AWR report inspection (db file sequential read).............. 410
31 Initial FAST performance analysis results................................................. 416
32 Results after FAST migration of DB3 to Flash .......................................... 417
33 ASM disk groups and Symmetrix device and composite groups ........ 444
34 Test hardware ................................................................................................ 464
Conventions used in this document
EMC uses the following conventions for special notices.
Note: A note presents information that is important, but not hazard-related.
IMPORTANT
An important notice contains information essential to operation of
the software or hardware.
Typographical conventions
EMC uses the following type style conventions in this document:
Normal Used in running (nonprocedural) text for:
• Names of interface elements (such as names of windows,
dialog boxes, buttons, fields, and menus)
• Names of resources, attributes, pools, Boolean expressions,
buttons, DQL statements, keywords, clauses, environment
variables, functions, utilities
• URLs, pathnames, filenames, directory names, computer
names, filenames, links, groups, service keys, file systems,
notifications
Bold Used in running (nonprocedural) text for:
• Names of commands, daemons, options, programs, processes,
services, applications, utilities, kernels, notifications, system
calls, man pages
Used in procedures for:
• Names of interface elements (such as names of windows,
dialog boxes, buttons, fields, and menus)
• What user specifically selects, clicks, presses, or types
Italic Used in all text (including procedures) for:
• Full titles of publications referenced in text
• Emphasis (for example a new term)
• Variables
Introduction
The Oracle RDBMS on open systems first became available in 1979
and has steadily grown to become the market share leader in
enterprise database solutions. With a wide variety of features and
functionality, Oracle provides a stable platform for handling
concurrent, read-consistent access to a customer's application data.
Oracle Database 10g and 11g, the latest releases of the Oracle RDBMS,
have introduced a variety of new and enhanced features over
previous versions of the database. Among these are:
◆ Increased self-management through features such as Automatic
Undo Management, Oracle managed files, and mean time to
recovery enhancements.
◆ Improved toolsets and utilities such as Recovery Manager
(RMAN), Oracle Data Guard, and Oracle Enterprise Manager
(OEM).
◆ Introduction of Automatic Storage Management (ASM).
◆ Enhancements to Oracle Real Application Clusters.
◆ Introduction of Database Resource Manager.
◆ Enhancements to Oracle Flashback capabilities.
◆ Introduction of Oracle VM server virtualization.
Oracle's architectural robustness, scalability, and availability
functions have positioned it as a cornerstone in many customers'
enterprise system infrastructures. A large number of EMC®
customers use Oracle in open-systems environments to support large,
mission-critical business applications.
Oracle overview
The Oracle RDBMS can be configured in multiple ways. The
requirement for 24x7 operations, replication and disaster recovery,
and the capacity of the host(s) that will contain the Oracle instance(s)
will, in part, determine how the Oracle environment must be
architected.
[Figure 1: Oracle systems architecture. The System Global Area (SGA) contains the Shared Pool, DB Block Buffers, Redo Log Buffers, and Data Dictionary; around it run the background processes PMON, SMON, CKPT, DBWn, LGWR, and ARCn, plus the Snnn server processes with their PGA. The active redo logs and archive logs are shown on disk.]
The System Global Area (SGA) contains the basic memory structures
that an Oracle database instance requires to function. The SGA
contains memory structures such as the Buffer Cache (shared area for
users to read or write Oracle data blocks), Redo Log Buffer (circular
buffer for the Oracle logs), Shared Pool (including user SQL and
PL/SQL code, data dictionary, and more), Large Pool, and others.
Oracle overview 27
Oracle on Open Systems
Process Description

DBWn (Database Writer): Writes data from the buffer cache to the datafiles on disk. Up to 20 database writer processes can be started per Oracle instance. The number of writers can be controlled manually with the DB_WRITER_PROCESSES init.ora parameter; if it is not specified, Oracle determines the number of writers automatically.

LGWR (Log Writer): Manages the redo log buffer, writing data from the buffer to the redo logs on disk. Log writer writes to the logs whenever one of these four scenarios occurs:
• A user commits a transaction
• Every three seconds
• When the redo buffer is one-third full
• When DB writer needs to write dirty blocks whose redo is still in the redo buffer

ARCn (Database Archiver): Copies the redo logs to one or more log directories when a log switch occurs. The ARCn process is only started if the database is in ARCHIVELOG mode and automatic archiving is enabled. Up to 10 archive processes can be started per Oracle instance, controlled by the init.ora parameter LOG_ARCHIVE_MAX_PROCESSES.

CKPT (Checkpoint): When the Oracle system performs a checkpoint, DBWn needs to destage data to disk. The CKPT process updates the data file headers accordingly.

PMON (Process Monitor): Cleans up after a user process fails. The process frees up resources including database locks and the buffer cache blocks of the failed process.

Snnn (Server processes): Connect user processes to the database instance. Server processes can be either dedicated or shared, depending on user requirements and the amount of host memory available.
[Figure 2: Physical data elements in an Oracle configuration: the SYSTEM tablespace, control files (CNTL 1, CNTL 2), redo logs (REDO1, REDO2), archive logs (ARCH 14 through ARCH 16), and the Oracle binaries.]
Oracle redo logs contain data and undo changes. All changes to the database are written to the redo logs, unless logging is explicitly disabled for database objects that allow it, such as user tables. Two or more redo logs are configured, and normally the logs are multiplexed to prevent data loss in the event that database recovery is required.
Archive logs are offloaded copies of the redo logs and are normally
required for recovering an Oracle database. Archive logs can be
multiplexed, both locally and remotely.
Oracle binaries are the executables and libraries used to initiate the
Oracle instance. Along with the binaries, Oracle uses many other
files to manage and monitor the database. These files include the
initialization parameter file (init<sid>.ora), server parameter file
(SPFILE), alert log, and trace files.
Figure 3 shows the relationship between the data blocks, extents, and
segments.
[Figure 3: Relationship between data blocks, extents, and segments: a segment of 1,920 KB composed of two 960 KB extents.]
Storage management
Standard Oracle backup/restore, disaster recovery, and cloning
methods can be difficult to manage and time-consuming. EMC
Symmetrix® provides many alternatives or solutions that make these
operations easy to manage, fast, and very scalable. In addition, EMC
developed many best practices that increase Oracle performance and
high availability when using Symmetrix storage arrays.
[Figure 4: Oracle two-node RAC configuration: each node runs its own instance (SGA and binaries) against the SYSTEM, DATA, and INDEX tablespaces on shared storage.]
Install base
With more than 55,000 mutual customers, EMC and Oracle are
recognized as the leaders in automated networked storage and
enterprise software, respectively. The EMC Symmetrix VMAX and
DMX offer the highest levels of performance, scalability and
availability along with industry-leading software for successfully
managing and maintaining complex Oracle database environments.
In addition, EMC IT has one of the largest deployments of Oracle
Applications in the world, with over 35,000 named users and over
3,500 concurrent users at peak periods. Oracle IT also uses both
CLARiiON® and Symmetrix storage extensively.
Joint engineering
Engineers for EMC and Oracle continue to work together to develop
integrated solutions, document best practices, and ensure
interoperability for customers deploying Oracle databases in EMC
Symmetrix VMAX and DMX storage environments. Key EMC
technologies such as TimeFinder and SRDF have been certified
through Oracle's Storage Certification Program (OSCP). Although
Oracle has phased out the OSCP as these technologies matured,
engineering efforts continue between the two companies to ensure
successful integration of each company's products. With each major
technology change or new product line, EMC briefs Oracle Engineering
on the changes and the two companies jointly review best practices.
EMC publishes many of these technology and deployment best
practices as joint-logo papers; the presence of the Oracle logo reflects
the strong communication and relationship between the companies.
Introduction
EMC provides many hardware and software products that support
Oracle environments on Symmetrix systems. This chapter provides a
technical overview of the EMC products referenced in this document.
The following products, which are highlighted and discussed, were
used and/or tested with VMware Infrastructure deployed on EMC
Symmetrix.
EMC offers an extensive product line of high-end storage solutions
targeted to meet the requirements of mission-critical databases and
applications. The Symmetrix product line includes the DMX Direct
Matrix Architecture™ series and the VMAX Virtual Matrix™ series.
EMC Symmetrix is a fully redundant, high-availability storage
processor, providing nondisruptive component replacements and
code upgrades. The Symmetrix system features high levels of
performance, data integrity, reliability, and availability.
EMC Enginuity™ Operating Environment — Enginuity enables
interoperation between the latest Symmetrix platforms and previous
generations of Symmetrix systems and enables them to connect to a
large number of server types, operating systems and storage software
products, and a broad selection of network connectivity elements and
other devices, ranging from HBAs and drivers to switches and tape
systems.
EMC Solutions Enabler — Solutions Enabler is a package that
contains the SYMAPI runtime libraries and the SYMCLI command
line interface. SYMAPI provides the interface to the EMC Enginuity
operating environment. SYMCLI is a set of commands that can be
invoked from the command line or within scripts. These commands
can be used to monitor device configuration and status, and to
perform control operations on devices and data objects within a
storage complex.
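As an illustration of basic SYMCLI usage (the Symmetrix ID shown is hypothetical), the host's view of the storage can be discovered and listed as follows:

symcfg discover
symdev list -sid 1234

The symcfg discover command builds the SYMAPI configuration database on the host, and symdev list displays the Symmetrix devices visible to it.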
EMC Symmetrix Remote Data Facility (SRDF) — SRDF is a
business continuity software solution that replicates and maintains a
mirror image of data at the storage block level in a remote Symmetrix
system. The SRDF component extends the basic SYMCLI command
set of Solutions Enabler to include commands that specifically
manage SRDF.
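For example, assuming a device group named MyDevGrp has been defined (as in the examples later in this document), the state of its SRDF pairs can be examined with:

symrdf -g MyDevGrp query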
Introduction 43
EMC Foundation Products
[Figure 6: Basic synchronous SRDF configuration: a server writes to the source Symmetrix system, which mirrors the data to the target Symmetrix system over ESCON, Fibre Channel, or GigE links at distances of less than 200 km.]
SRDF benefits
SRDF offers the following features and benefits:
◆ High data availability
◆ High performance
◆ Flexible configurations
◆ Host and application software transparency
◆ Automatic recovery from a component or link failure
◆ Significantly reduced recovery time after a disaster
◆ Increased integrity of recovery procedures
◆ Reduced backup and recovery costs
◆ Reduced disaster recovery complexity, planning, testing, etc.
◆ Support for business continuity across and between multiple
databases on multiple servers and Symmetrix systems
[Figure 7: SRDF consistency group: RDF-ECA protects the R1/R2 device pairs holding DBMS data (X), application data (Y), and logs (Z) across multiple hosts and Symmetrix systems; the consistency group is defined through the host component and the Symmetrix Control Facility.]
SRDF terminology
This section describes various terms related to SRDF operations.
Update operation
The update operation allows users to resynchronize the R1s after a
failover while continuing to run application and database services
on the R2s. This function helps reduce the amount of time that a
failback to the R1 side takes. The update operation is a subset of
the failover/failback functionality. Practical uses of the R1
update operation usually involve situations in which the R1
becomes almost synchronized with the R2 data before a failback,
while the R2 side is still online to its host. The -until option, when
used with update, specifies the target number of invalid tracks that
are allowed to be out of sync before resynchronization to the R1
completes.
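For example, the following command (the invalid-track threshold of 1,000 is arbitrary) updates the R1 side of all SRDF pairs in the device group MyDevGrp until fewer than 1,000 invalid tracks remain out of sync:

symrdf -g MyDevGrp update -until 1000 -noprompt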
Concurrent SRDF
Concurrent SRDF means having two target R2 devices configured as
concurrent mirrors of one source R1 device. Using a Concurrent
SRDF pair allows the creation of two copies of the same data at two
remote locations. When the two R2 devices are split from their
source R1 device, each target site copy of the application can be
accessed independently.
R1/R2 swap
Swapping R1/R2 devices of an SRDF pair causes the source R1
device to become a target R2 device and vice versa. Swapping SRDF
devices allows the R2 site to take over operations while retaining a
remote mirror on the original source site. Swapping is especially
useful after failing over an application from the R1 site to the R2 site.
SRDF swapping is available with Enginuity version 5567 or later.
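For example, provided the SRDF pairs are in a state that permits swapping (such as suspended or failed over), the personalities of all pairs in the device group MyDevGrp could be swapped with a command of the form:

symrdf -g MyDevGrp swap -noprompt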
Data Mobility
Data mobility is an SRDF configuration that restricts SRDF devices to
operating only in adaptive copy mode. This is a lower-cost licensing
option that is typically used for data migrations. It allows data to be
transferred in adaptive copy mode from source to target, and is not
designed as a solution for DR requirements unless used in
combination with TimeFinder.
Dynamic SRDF
Dynamic SRDF allows the creation of SRDF pairs from non-SRDF
devices while the Symmetrix system is in operation. Historically,
source and target SRDF device pairing has been static and changes
required assistance from EMC personnel. This feature provides
greater flexibility in deciding where to copy protected data.
Dynamic RA groups can be created in a SRDF switched fabric
environment. An RA group represents a logical connection between
two Symmetrix systems. Historically, RA groups were limited to
those static RA groups defined at configuration time. However, RA
groups can now be created, modified, and deleted while the
Symmetrix system is in operation. This provides greater flexibility in
forming SRDF-pair-associated links.
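As a sketch, a dynamic SRDF pair could be created from a device-pairing file; the file name, Symmetrix ID, and RA group number below are hypothetical:

symrdf createpair -file pairs.txt -sid 1234 -rdfg 2 -type R1 -establish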
◆ Failover switches data processing from the source (R1) side to the
target (R2) side. The source side volumes (R1), if still available,
are write-disabled.
◆ Failback switches data processing from the target (R2) side to the
source (R1) side. The target side volumes (R2), if still available,
are write-disabled.
[Figure: SRDF establish and failover control operations, showing the direction of data flow between the R1 and R2 devices in each state.]
Scheduled maintenance or storage system problems can disrupt
access to production data at the source site. In this case, a failover
operation can be initiated from either host to make the R2 device
read/write-enabled to its host. Before issuing the failover, all
application services on the R1 volumes must be stopped. This is
because the failover operation makes the R1 volumes read-only.
The following command initiates a failover on all SRDF pairs in the
device group named MyDevGrp:
symrdf –g MyDevGrp failover –noprompt
Failback
To resume normal operations on the R1 side, a failback (R1 device
takeover) operation is initiated. This means read/write operations on
the R2 device must be stopped, and read/write operations on the R1
device must be started. When the failback command is initiated,
the R2 becomes read-only to its host, while the R1 becomes
read/write-enabled to its host. The following command performs a
failback operation on all SRDF pairs in the device group named
MyDevGrp:
symrdf –g MyDevGrp failback -noprompt
The SRDF pair must already be in one of the following states for the
failback operation to succeed:
◆ Failed over
◆ Suspended and write-disabled at the source
◆ Suspended and not ready at the source
◆ R1 Updated
◆ R1 UpdInProg
The failback operation:
◆ Write-enables the R1 devices.
◆ Performs a track table merge to discard changes on the R1s.
◆ Transfers the changes on the R2s.
◆ Resumes traffic on the SRDF links.
◆ Write-disables the R2 volumes.
[Figure 10: Geographically distributed four-node EMC SRDF/CE clusters: primary and secondary site nodes connected over the enterprise LAN/WAN, with R1 devices mirrored to R2 devices through SRDF.]
EMC TimeFinder
The SYMCLI TimeFinder component extends the basic SYMCLI
command set to include TimeFinder or business continuity
commands that allow control operations on device pairs within a
local replication environment. This section specifically describes the
functionality of:
◆ TimeFinder/Mirror — General monitor and control operations
for business continuance volumes (BCV)
◆ TimeFinder/CG — Consistency groups
◆ TimeFinder/Clone — Clone copy sessions
◆ TimeFinder/Snap — Snap copy sessions
Commands such as symmir and symbcv perform a wide spectrum
of monitor and control operations on standard/BCV device pairs
within a TimeFinder/Mirror environment. The TimeFinder/Clone
command, symclone, creates a point-in-time copy of a source device
on nonstandard device pairs (such as standard/standard,
BCV/BCV). The TimeFinder/Snap command, symsnap, creates
virtual device copy sessions between a source device and multiple
virtual target devices. These virtual devices only store pointers to
changed data blocks from the source device, rather than a full copy of
the data. Each product requires a specific license for monitoring and
control operations.
Configuring and controlling remote BCV pairs requires EMC SRDF
business continuity software discussed previously. The combination
of TimeFinder with SRDF provides for multiple local and remote
copies of production data.
Figure 11 illustrates application usage for a TimeFinder/Mirror
configuration in a Symmetrix system.
Note: When BCVs are established, they are inaccessible to any host.
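For example, a full synchronization of all standard/BCV pairs in the device group MyDevGrp can be initiated with:

symmir -g MyDevGrp establish -full -noprompt

The pairs can then be monitored with symmir -g MyDevGrp query until they reach the Synchronized state.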
Regular split
A regular split is the type of split that has existed for
TimeFinder/Mirror since its inception. With a regular split (before
Enginuity version 5568), I/O activity from the production hosts to a
standard volume was not accepted until it was split from its BCV
pair. Therefore, applications attempting to access the standard or the
BCV would experience a short wait during a regular split. Once the
split was complete, no further overhead was incurred.
Beginning with Enginuity version 5568, any split operation is
executed as an instant split. A regular split is still valid for earlier
Enginuity versions and for existing applications that perform regular
split operations; however, with Enginuity version 5568 those
applications actually perform an instant split.
With Enginuity versions 5x66 and 5x67, an instant split is performed
by specifying the –instant option on the command line. Since version
5568, this option is no longer required to trigger an instant split,
because instant split mode has become the default behavior. It is still
beneficial to supply the –instant flag with later Enginuity versions,
however; without it, SYMCLI waits for the background split to
complete before returning.
Instant split
An instant split shortens the wait period during a split by
dividing the process into a foreground split and a background
split. During an instant split, the system executes the foreground
split almost instantaneously and returns a successful status to the
host. This instantaneous execution allows minimal I/O disruptions to
the production volumes. Furthermore, the BCVs are accessible to the
hosts as soon as the foreground process is complete. The background
split continues to split the BCV pair until it is complete. When the
-instant option is included or defaulted, SYMCLI returns
immediately after the foreground split, allowing other operations
while the BCV pair is splitting in the background.
The following operation performs an instant split on all BCV pairs
in MyDevGrp, and allows SYMCLI to return to the server process
while the background split is in progress:
symmir -g MyDevGrp split –instant –noprompt
[Figure 12: ECA consistent split across multiple database-associated hosts: a controlling host running SYMAPI issues an ECA consistent split against the standard/BCV pairs in device group prodgrp used by the database servers.]
[Figure 13: ECA consistent split on a local Symmetrix system.]
TimeFinder/Clone operations
Symmetrix TimeFinder/Clone operations using SYMCLI can create
up to 16 copies from a source device onto target devices. Unlike
TimeFinder/Mirror, TimeFinder/Clone does not require the
traditional standard-to-BCV device pairing. Instead,
TimeFinder/Clone allows any combination of source and target
devices. For example, a BCV can be used as the source device, while
another BCV can be used as the target device. Any combination of
source and target devices can be used. Additionally,
TimeFinder/Clone does not use the traditional mirror positions the
way that TimeFinder/Mirror does. Because of this,
TimeFinder/Clone is a useful option when more than three copies of
a source device are desired.
Normally, one of the three copies is used to protect the data against
hardware failure.
The source and target devices must be the same emulation type (FBA
or CKD). The target device must be equal in size to the source device.
Clone copies of striped or concatenated metavolumes can also be
created providing the source and target metavolumes are identical in
configuration. Once activated, the target device can be instantly
accessed by a target’s host, even before the data is fully copied to the
target device.
TimeFinder/Clone copies are appropriate in situations where
multiple copies of production data are needed for testing, backups, or
report generation. Clone copies can also be used to reduce disk
contention and improve data access speed by assigning users to
copies of data rather than accessing the one production copy. A single
source device may maintain as many as 16 relationships that can be a
combination of BCVs, clones and snaps.
[Figure 14: Creating a copy session using the symclone command: a server running SYMCLI (1) creates and (2) activates a clone session from source device DEV 001 to target device DEV 005, which the target host then accesses.]
The activation of a clone enables the copying of the data. The data
may start copying immediately if the –copy keyword is used. If the
–copy keyword is not used, tracks are only copied when they are
accessed from the target volume or when they are changed on the
source volume.
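For example, a clone session that begins copying data in the background as soon as it is activated could be created with:

symclone -g MyDevGrp create -copy -noprompt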
Activation of the clone session established in the previous create
command can be accomplished using the following command.
symclone –g MyDevGrp activate -noprompt
Solutions Enabler 7.1 and Enginuity 5874 SR1 introduce the ability to
clone from thick to thin devices using TimeFinder/Clone. Thick-to-thin
TimeFinder/Clone allows application data to be moved from
standard Symmetrix volumes to virtually provisioned storage within
the same array. For some workloads, virtually provisioned volumes
offer advantages in allocation utilization, ease of use, and
performance through automatic wide striping. Thick-to-thin
TimeFinder/Clone provides an easy way to move such workloads
onto virtually provisioned devices.
TimeFinder/Snap operations
Symmetrix arrays provide another technique to create copies of
application data. The functionality, called TimeFinder/Snap, allows
users to make pointer-based, space-saving copies of data
simultaneously on multiple target devices from a single source
device. The data is available for access instantly. TimeFinder/Snap
allows data to be copied from a single source device to as many as 128
target devices. A source device can be either a Symmetrix standard
device or a BCV device controlled by TimeFinder/Mirror, with the
exception being a BCV working in clone emulation mode. The target
device is a Symmetrix virtual device (VDEV) that consumes
negligible physical storage through the use of pointers to track
changed data.
The VDEV is a host-addressable Symmetrix device with special
attributes created when the Symmetrix system is configured.
However, unlike a BCV which contains a full volume of data, a VDEV
is a logical-image device that offers a space-saving way to create
instant, point-in-time copies of volumes. Any update to a source
device after its activation with a virtual device causes the pre-update
image of the changed tracks to be copied to a save device. The virtual
device’s indirect pointer is then updated to point to the original track
data on the save device, preserving a point-in-time image of the
volume. TimeFinder/Snap uses this copy-on-first-write technique to
conserve disk space, since only changes to tracks on the source cause
any incremental storage to be consumed.
The symsnap create and symsnap activate commands are
used to create a source/target snap pair.
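For example, assuming a device group MyDevGrp containing the standard devices and their associated virtual devices:

symsnap -g MyDevGrp create -noprompt
symsnap -g MyDevGrp activate -noprompt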
Device Description

Virtual device: A logical-image device that saves disk space through the use of pointers to track data that is immediately accessible after activation. Snapping data to a virtual device uses a copy-on-first-write technique.

Save device: A device that is not host-accessible but is accessed only through the virtual devices that point to it. Save devices provide a pool of physical space to store the snap copy data to which the virtual devices point.

BCV: A full-volume mirror that has valid data after fully synchronizing with its source device. It is accessible only when split from the source device that it is mirroring.
[Figure: TimeFinder/Snap operation. The controlling host issues I/O to the source device; the VDEV holds pointers to the original data, and pre-update images of changed tracks are copied to the save device accessed by the target host (ICO-IMG-000491)]
Note: The acronym for EMC Storage Resource Management (SRM) can be
easily confused with the acronym for VMware Site Recovery Manager. To
avoid any confusion, this document always refers to VMware Site Recovery
Manager as VMware SRM.
[Figure: SRM database mapping and split sequence: (1) SYMCLI mapping command, (2) SYMCLI invokes database APIs to identify devices, (3) database objects mapped between database metadata and the SYMCLI database, (4) TimeFinder split of the Data and Log BCVs under PowerPath or ECA (ICO-IMG-000011)]
EMC Solutions Enabler with a valid license for TimeFinder and SRM
is installed on the host. In addition, the host must have PowerPath
or use ECA, and must run a supported DBMS. As discussed in
“TimeFinder split operations” on page 70, when splitting a BCV, the
system must perform housekeeping tasks that may require a few
seconds on a busy Symmetrix system. These tasks involve a series of
steps (shown in Figure 16 on page 82) that result in the separation of
the BCV from its paired standard device:
1. Using the SRM base mapping commands, first query the
Symmetrix system to display the logical-to-physical mapping
information about any physical device, logical volume, file,
directory, and/or file system.
2. Using the database mapping command, query the Symmetrix to
display physical and logical database information.
3. Next, use the database mapping command to translate:
• The devices of a specified database into a device group or a
consistency group, or
• The devices of a specified table space into a device group or a
consistency group.
4. Split the BCV from the standard device.
show: Shows information about a database object (tablespace, table,
file, or schema of a database; file, segment, or table of a specified
tablespace or schema).
tbs2dg: Translates the devices of a specified tablespace into a device
group. Only database data files are translated.
Table 7 lists the SYMCLI commands that can be used to examine the
logical volume mapping.
A typical view of the Storage Viewer for vSphere Client can be seen in
Figure 17.
EMC PowerPath
EMC PowerPath is host-based software that works with networked
storage systems to intelligently manage I/O paths. PowerPath
manages multiple paths to a storage array. Supporting multiple paths
enables recovery from path failure because PowerPath automatically
detects path failures and redirects I/O to other available paths.
PowerPath also uses sophisticated algorithms to provide dynamic
load balancing for several kinds of path management policies that the
user can set. With the help of PowerPath, systems administrators are
able to ensure that applications on the host have highly available
access to storage and perform optimally at all times.
A key feature of path management in PowerPath is dynamic,
multipath load balancing. Without PowerPath, an administrator must
statically load balance paths to logical devices to improve
performance. For example, based on current usage, the administrator
might configure three heavily used logical devices on one path, seven
moderately used logical devices on a second path, and 20 lightly used
logical devices on a third path. As I/O patterns change, these
statically configured paths may become unbalanced, causing
performance to suffer. The administrator must then reconfigure the
paths, and continue to reconfigure them as I/O traffic between the
host and the storage system shifts in response to usage changes.
Designed to use all paths concurrently, PowerPath distributes I/O
requests to a logical device across all available paths, rather than
requiring a single path to bear the entire I/O burden. PowerPath can
distribute the I/O for all logical devices over all paths shared by
those logical devices, so that all paths are equally burdened.
PowerPath load balances I/O on a host-by-host basis, and maintains
statistics on all I/O for all paths. For each I/O request, PowerPath
intelligently chooses the least-burdened available path, depending on
the load-balancing and failover policy in effect. In addition to
improving I/O performance, dynamic load balancing reduces
management time and downtime because administrators no longer
need to manage paths across logical devices. With PowerPath,
configurations of paths and policies for an individual device can be
changed dynamically, taking effect immediately, without any
disruption to the applications.
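The difference between static path assignment and per-request selection of the least-burdened path can be illustrated with a small sketch. The names and the completion pattern here are hypothetical; this is not PowerPath's actual algorithm:

```python
# Illustrative sketch of dynamic multipath load balancing: each I/O
# request is routed to whichever path currently has the fewest
# outstanding I/Os, so no single path bears the entire burden.

def least_burdened(paths):
    # Choose the path with the fewest outstanding I/Os right now.
    return min(paths, key=lambda p: p["pending"])

paths = [{"name": f"path{i}", "pending": 0, "served": 0} for i in range(3)]

# Nine requests arrive; two of every three complete immediately, the
# rest stay outstanding, steering new I/O away from the busy paths.
for i in range(9):
    p = least_burdened(paths)
    p["pending"] += 1
    p["served"] += 1
    if i % 3 != 0:           # this I/O completes right away
        p["pending"] -= 1

print([p["served"] for p in paths])   # [3, 3, 3]: work spreads evenly
```

With a static assignment, the three lingering I/Os could all pile onto one path; selecting per request keeps the queues balanced as traffic shifts.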
PowerPath/VE
EMC PowerPath/VE delivers PowerPath Multipathing features to
optimize VMware vSphere virtual environments. With
PowerPath/VE, you can standardize path management across
heterogeneous physical and virtual environments. PowerPath/VE
enables you to automate optimal server, storage, and path utilization
in a dynamic virtual environment. With hyper-consolidation, a
virtual environment may have hundreds or even thousands of
independent virtual machines running, including virtual machines
with varying levels of I/O intensity. I/O-intensive applications can
disrupt I/O from other applications. Before PowerPath/VE became
available, load balancing on an ESX host had to be configured
manually to correct for this. Manual load-balancing operations that
ensure all virtual machines receive their required response times are
time-consuming and logistically difficult to achieve.
PowerPath/VE works with VMware ESX and ESXi as a multipathing
plug-in (MPP) that provides enhanced path management capabilities
to ESX and ESXi hosts. PowerPath/VE is supported with vSphere
(ESX4) only. Previous versions of ESX do not have the PSA, which is
required by PowerPath/VE.
PowerPath/VE installs as a kernel module on the vSphere host. It
plugs in to the vSphere I/O stack framework to bring the advanced
multipathing capabilities of PowerPath (dynamic load balancing and
automatic failover) to the VMware vSphere platform (Figure 18 on
page 91).
PowerPath/VE features
PowerPath/VE provides the following features:
◆ Dynamic load balancing - PowerPath is designed to use all paths
at all times. PowerPath distributes I/O requests to a logical
device across all available paths, rather than requiring a single
path to bear the entire I/O burden.
◆ Auto-restore of paths - Periodic auto-restore reassigns logical
devices when restoring paths from a failed state. Once restored,
the paths automatically rebalance the I/O across all active
channels.
◆ Device prioritization - Setting a high priority for a single or
several devices improves their I/O performance at the expense of
the remaining devices, while otherwise maintaining the best
possible load balancing across all paths. This is especially useful
when there are multiple virtual machines on a host with varying
application performance and availability requirements.
◆ Automated performance optimization - PowerPath/VE
automatically identifies the type of storage array and sets the
highest performing optimization mode by default. For
Symmetrix, the mode is SymmOpt (Symmetrix Optimized).
◆ Dynamic path failover and path recovery - If a path fails,
PowerPath/VE redistributes I/O traffic from that path to
functioning paths. PowerPath/VE stops sending I/O to the failed
path and checks for an active alternate path. If an active path is
available, PowerPath/VE redirects I/O along that path.
PowerPath/VE can compensate for multiple faults in the I/O
channel (for example, HBAs, fiber-optic cables, Fibre Channel
switch, storage array port).
PowerPath/VE management
PowerPath/VE uses a command set, called rpowermt, to monitor,
manage, and configure PowerPath/VE for vSphere. The syntax,
arguments, and options are very similar to the traditional powermt
commands used on all the other PowerPath Multipathing supported
operating system platforms. There is one significant difference in
that rpowermt is a remote management tool.
Not all vSphere installations have a service console interface. In
order to manage an ESXi host, customers have the option to use
vCenter Server or vCLI (also referred to as VMware Remote Tools) on
a remote server. PowerPath/VE for vSphere uses the rpowermt
command line utility for both ESX and ESXi. PowerPath/VE for
vSphere cannot be managed on the ESX host itself. There is neither a
local nor remote GUI for PowerPath on ESX.
Administrators must designate a Guest OS or a physical machine to
manage one or multiple ESX hosts. rpowermt is supported on
Windows 2003 (32-bit) and Red Hat 5 Update 2 (64-bit).
When the vSphere host server is connected to the Symmetrix system,
the PowerPath/VE kernel module running on the vSphere host will
associate all paths to each device presented from the array and
associate a pseudo device name (as discussed earlier). An example of
this is shown in Figure 15 on page 80, which shows the output of
rpowermt display host=x.x.x.x dev=emcpower0. Note in the output
that the device has four paths and displays the optimization mode
(SymmOpt = Symmetrix optimization).
Thin device
A thin device is a host-accessible device that has no storage directly
associated with it. Thin devices have pre-configured sizes and appear
to the host to have exactly that capacity. Storage is allocated in
chunks when a block is written to for the first time. Reads of chunks
that have not yet been allocated return zeros to the host.
Data device
Data devices are specially configured devices within the Symmetrix
that serve as containers for the written-to blocks of thin devices. Any
number of data devices may make up a data device pool. Blocks are
allocated to the thin devices from the pool on a round-robin basis.
The allocation unit is 768 KB.
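A toy model of these two device types illustrates on-first-write allocation, round-robin placement across a pool, and zero-filled reads of unallocated chunks. The classes and dictionaries here are invented simplifications, not array firmware:

```python
# Conceptual sketch of Virtual Provisioning: thin devices allocate
# 768 KB chunks from a shared pool of data devices on first write,
# round-robin across the pool; unallocated reads return zeros.

CHUNK = 768 * 1024  # allocation unit in bytes

class ThinPool:
    def __init__(self, n_data_devices):
        self.devices = [[] for _ in range(n_data_devices)]  # chunks per data device
        self.next_dev = 0

    def allocate(self, owner, chunk_no):
        # Round-robin placement across the data devices in the pool.
        dev = self.next_dev
        self.devices[dev].append((owner, chunk_no))
        self.next_dev = (self.next_dev + 1) % len(self.devices)
        return dev

class ThinDevice:
    def __init__(self, name, pool):
        self.name, self.pool = name, pool
        self.chunks = {}  # chunk_no -> [data_device, backing bytes]

    def write(self, offset, data):
        chunk_no = offset // CHUNK
        if chunk_no not in self.chunks:      # first write triggers allocation
            dev = self.pool.allocate(self.name, chunk_no)
            self.chunks[chunk_no] = [dev, bytearray(CHUNK)]
        start = offset % CHUNK
        self.chunks[chunk_no][1][start:start + len(data)] = data

    def read(self, offset, length):
        chunk_no = offset // CHUNK
        if chunk_no not in self.chunks:      # never written: zeros, no storage
            return bytes(length)
        start = offset % CHUNK
        return bytes(self.chunks[chunk_no][1][start:start + length])

pool = ThinPool(2)
tdev = ThinDevice("tdev0", pool)             # appears fully sized to the host
tdev.write(0, b"oracle")
print(tdev.read(0, 6))                       # b'oracle'
print(tdev.read(CHUNK * 5, 4))               # unallocated chunk reads as zeros
print(len(tdev.chunks))                      # only 1 chunk actually consumed
```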
Figure 21 on page 101 depicts the components of a Virtual
Provisioning configuration:
[Figure 21: Virtual Provisioning components. Thin devices draw their storage from pools (Pool A, Pool B) of data devices (ICO-IMG-000493)]
[Table: supported Virtual LUN migration combinations across drive types (Flash, Fibre Channel, SATA) and protection types (RAID 1, RAID 6, unprotected) (ICO-IMG-000754)]
The Virtual LUN feature allows a device to transition from one
protection type to another while servers and their associated
applications and Symmetrix software are accessing the device.
The Virtual LUN feature offers customers the ability to effectively
utilize SATA storage - a much cheaper, yet reliable, form of high
capacity storage. It also facilitates fluid movement of data across the
various storage tiers present within the subsystem - the realization of
true "tiered storage in the box." Thus, Symmetrix VMAX becomes the
first enterprise storage subsystem to offer a comprehensive "tiered
storage in the box," ILM capability that complements the customer's
tiering initiatives. Customers can now achieve varied
cost/performance profiles by moving lower priority application data
to less expensive storage, or conversely, moving higher priority or
critical application data to higher performing storage as their needs
dictate.
Specific use cases for customer applications enable the moving of
data volumes transparently from tier to tier based on changing
performance (moving to faster or slower disks) or availability
requirements (changing RAID protection on the array). This
migration can be performed transparently without interrupting those
applications or host systems utilizing the array volumes and with
only a minimal impact to performance during the migration.
The following sample commands show how to move two LUNs of a
host environment from RAID 6 drives on Fibre Channel 15k rpm
drives to Enterprise Flash drives. The symmigrate command,
introduced in EMC Solutions Enabler 7.0, is used to perform the
migration. The source Symmetrix hypervolume numbers are
200 and 201, and the target Symmetrix hypervolumes on the
Enterprise Flash drives are A00 and A01.
1. A file (migrate.ctl) is created that contains the two LUNs to be
migrated. The file has the following content:
200 A00
201 A01
The two host accessible LUNs are migrated without having to impact
application or server availability.
Overview
There are many choices when cloning databases with EMC
array-based replication software. Each software product has differing
characteristics that affect the final deployment. A thorough
understanding of the options available leads to an optimal replication
choice.
An Oracle database can be in one of three data states when it is being
copied:
◆ Shutdown
◆ Processing normally
◆ Conditioned using hot-backup mode
Depending on the data state of the database at the time it is copied,
the database copy may be restartable or recoverable. This section
begins with a discussion of recoverable and restartable database
clones. It then describes various approaches to data replication using
EMC software products and how the replication techniques is used in
combination with the different database data states to facilitate the
database cloning process. Following that, database clone usage
considerations are discussed along with descriptions of the
procedures used to deploy database clones across various
operating-systems platforms.
Overview 109
Creating Oracle Database Clones
3. When the database is deactivated, split the BCV mirrors using the
following command:
symmir -g device_group split -noprompt
[Figure: cold TimeFinder/Snap of a database. The controlling host issues I/O to the standard device; the VDEV keeps pointers to the original data, and changed tracks are copied on first write to the save device presented to the target host (ICO-IMG-000506)]
2. Once the create operation has completed, shut down the database
to make a cold TimeFinder/Snap of the DBMS. Execute the
following Oracle commands:
sqlplus "/ as sysdba"
SQL> shutdown immediate;
[Figure: TimeFinder/Snap copy-on-first-write with standard device, VDEV, and save device (ICO-IMG-000508)]
Alternatively, with Oracle10g, the entire database can be put into hot
backup mode with:
sqlplus "/ as sysdba"
SQL> alter system archive log current;
SQL> alter database begin backup;
When these commands are issued, data blocks for the tablespaces are
flushed to disk and the datafile headers are updated with the last
SCN. Further updates of the SCN to the datafile headers are not
performed. When these files are copied, the nonupdated SCN in the
datafile headers signifies to the database that recovery is required.
The log file switch command is used to ensure that the marker
indicating that the tablespaces have been taken out of hot backup
mode is found in an archive log.
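The effect of freezing the datafile header SCN can be modeled with a small sketch. The Database class and its single header SCN are invented simplifications of Oracle's per-datafile checkpoint structures:

```python
# Toy model of why a hot-backup copy is marked as needing recovery:
# datafile header checkpoint SCNs freeze at BEGIN BACKUP, so a copy
# taken during backup mode lags the control file's current SCN.

class Database:
    def __init__(self):
        self.current_scn = 100
        self.datafile_header_scn = 100
        self.backup_mode = False

    def begin_backup(self):
        # Flush and stamp the headers once, then stop advancing them.
        self.datafile_header_scn = self.current_scn
        self.backup_mode = True

    def commit_work(self):
        self.current_scn += 1
        if not self.backup_mode:
            self.datafile_header_scn = self.current_scn  # checkpoints advance headers

    def needs_recovery(self):
        # A restored copy compares header SCNs against the control file.
        return self.datafile_header_scn < self.current_scn

db = Database()
db.begin_backup()
for _ in range(3):
    db.commit_work()          # updates continue during the backup window
print(db.datafile_header_scn, db.current_scn)   # 100 103
print(db.needs_recovery())                      # True: apply redo to catch up
```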
5. After tablespaces are taken out of hot backup mode and a log
switch is performed, split the Log BCV devices from their source
volumes:
symmir -g log_group split -noprompt
5. After the tablespaces are taken out of hot backup mode and a log
switch is performed, activate the log clone devices:
symclone -g log_group activate -noprompt
[Figure: hot backup TimeFinder/Snap sequence with standard device, VDEV, and save device (ICO-IMG-000510)]
5. After the database is taken out of hot backup mode and a log
switch is performed, activate the Log snap devices:
symsnap -g log_group activate -noprompt
[Figure: hot backup sequence across the Data, Log, and Archive standard devices and their BCVs (ICO-IMG-000511)]
Host considerations
One of the primary considerations when starting a copy of an Oracle
database is whether to present it back to the same host or mount the
database on another host. While it is significantly simpler to restart a
database on a secondary host, it is still possible to restart a copy of the
database on the same host with only a few extra steps. The extra
steps required to mount a database copy back to the same host
(mounting the copied volumes, changing the mount points, and
relocating the datafiles) are described next.
MAXDATAFILES 30
MAXINSTANCES 2
MAXLOGHISTORY 224
LOGFILE
GROUP 1 (
'/oracle/oradata/test/oraredo1a.dbf',
'/oracle/oradata/test/oraredo2a.dbf'
) SIZE 10M,
GROUP 2 (
'/oracle/oradata/test/oraredo1b.dbf',
'/oracle/oradata/test/oraredo2b.dbf'
) SIZE 10M,
GROUP 3 (
'/oracle/oradata/test/oraredo1c.dbf',
'/oracle/oradata/test/oraredo2c.dbf'
) SIZE 10M
-- STANDBY LOGFILE
DATAFILE
'/oracle/oradata/test/orasys.dbf',
'/oracle/oradata/test/oraundo.dbf',
'/oracle/oradata/test/orausers.dbf'
CHARACTER SET US7ASCII
;
# Recovery is required if any of the datafiles are restored
# backups, or if the last shutdown was not normal or
# immediate.
RECOVER DATABASE
# Database can now be opened normally.
ALTER DATABASE OPEN;
# Commands to add tempfiles to temporary tablespaces.
# Online tempfiles have complete space information.
# Other tempfiles may require adjustment.
ALTER TABLESPACE TEMP_TS ADD TEMPFILE
'/oracle/oradata/test/oratest.dbf'
SIZE 524288000 REUSE AUTOEXTEND OFF;
# End of tempfile additions.
#
# Set #2. RESETLOGS case
#
# The following commands will create a new control file and
# use it to open the database. The contents of online logs
# will be lost and all backups will be invalidated. Use this
# only if online logs are damaged.
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "TEST" RESETLOGS
NOARCHIVELOG
-- SET STANDBY TO MAXIMIZE PERFORMANCE
MAXLOGFILES 16
MAXLOGMEMBERS 2
MAXDATAFILES 30
MAXINSTANCES 2
MAXLOGHISTORY 224
LOGFILE
GROUP 1 (
'/oracle/oradata/test/oraredo1a.dbf',
'/oracle/oradata/test/oraredo2a.dbf'
) SIZE 10M,
GROUP 2 (
'/oracle/oradata/test/oraredo1b.dbf',
'/oracle/oradata/test/oraredo2b.dbf'
) SIZE 10M,
GROUP 3 (
'/oracle/oradata/test/oraredo1c.dbf',
'/oracle/oradata/test/oraredo2c.dbf'
) SIZE 10M
-- STANDBY LOGFILE
DATAFILE
'/oracle/oradata/test/orasys.dbf',
'/oracle/oradata/test/oraundo.dbf',
'/oracle/oradata/test/orausers.dbf'
CHARACTER SET US7ASCII
;
# Recovery is required if any of the datafiles are restored
# backups, or if the last shutdown was not normal or
# immediate.
RECOVER DATABASE USING BACKUP CONTROLFILE
# Database can now be opened zeroing the online logs.
ALTER DATABASE OPEN RESETLOGS;
# Commands to add tempfiles to temporary tablespaces.
# Online tempfiles have complete space information.
# Other tempfiles may require adjustment.
ALTER TABLESPACE TEMP_TS ADD TEMPFILE
'/oracle/oradata/test/oratest.dbf'
SIZE 524288000 REUSE AUTOEXTEND OFF;
# End of tempfile additions.
#
After deciding whether to open the database with RESETLOGS and
editing the file appropriately, the datafile locations can be changed.
When the script is run, the instance searches the new locations for
the Oracle datafiles.
sqlplus "/ as sysdba"
SQL> @create_control
This creates the new control file and opens the database, relocating
the datafiles to the newly specified locations.
2. After the appropriate devices are available to the host, make the
operating system aware of the devices. In addition, import the
volume or disk groups and mount any file systems. This is
operating-system dependent and is discussed in Appendix C,
“Related Host Operation.”
3. Since the database was shut down when the copy was made, no
special processing is required to restart the database. Start the
database as follows:
sqlplus "/ as sysdba"
SQL> startup;
2. After the appropriate devices are available to the host, make the
operating system aware of the devices. In addition, import the
volume or disk groups and mount any file systems. This is
operating-system dependent and is discussed in Appendix C,
“Related Host Operation.”
3. Since the database was shut down when the copy was made, no
special processing is required to restart the database. The
following is used to start the database:
sqlplus "/ as sysdba"
SQL> startup mount;
SQL> recover database;
TABLESPACE_NAME
-------------------------------
SYSTEM
SYSAUX
TEMP1
UNDO1
USERS1
OWNER
---------------
DEV1
USER1
These owners (schemas) need to be verified on the
target side:
SELECT username
FROM dba_users;
USERNAME
---------------
SYS
SYSTEM
In this case, the DEV1 user exists but the USER1 user does not.
The USER1 user must be created with the command:
CREATE USER user1
IDENTIFIED BY user1;
NAME            VALUE
--------------- --------
db_block_size   8192
SELECT *
FROM transport_set_violations;
VIOLATIONS
-------------------------------------------------------
CONSTRAINT FK_SALES_ORDER_DEPT between table DEV1.SALES
in tablespace DATA1 and table DEV2.ORDER_DEPT in
tablespace DATA2
TABLESPACE_NAME FILE_NAME
--------------- --------------------------------
DATA1 d:\oracle\oradata\db1\data1.dbf
INDEX1 d:\oracle\oradata\db1\index1.dbf
In this case, both required datafiles are on the d:\ drive. This
volume will be identified and replicated using TimeFinder. Note
that careful database layout planning is critical when TimeFinder
is used for replication. First, create a device group for the
standard device used by the d:\ drive and a BCV that will be
used for the new e:\ drive. Appendix B, “Sample SYMCLI Group
Creation Commands,”provides examples of creating device
groups.
4. After creating the device group, establish the BCV to the standard
device:
symmir -g device_group establish -full -noprompt
symmir -g device_group verify -i 30
5. After the BCV is fully synchronized with the standard device, the
devices can be split, since the tablespaces on the device are in
read-only mode.
symmir -g device_group split -noprompt
file = d:\oracle\exp\meta1.dmp
tablespaces = (data1,index1)
tts_owners = (dev1,dev2)
Alternatively, with Data Pump in Oracle10g:
IMPDP system/manager
DUMPFILE = meta1.dmp
DIRECTORY = d:\oracle\exp\
TRANSPORT_DATAFILES =
e:\oracle\oradata\db1\data1.dbf,
e:\oracle\oradata\db1\index1.dbf
Overview
Cross-platform transportable tablespaces enable data from an Oracle
database running on one operating system to be cloned and
presented to another database running on a different platform.
Differences in Oracle datafiles across operating systems are a
function of the byte ordering, or "endianness," of the files. The
endian format of the datafiles is classified as either "big endian" or
"little endian" (in "big endian," the first byte is the most
significant, while in "little endian" the first byte is the least
significant). If two operating systems both use "big endian" byte
ordering, the files can be transferred between operating systems and
used successfully in an Oracle database (through a feature such as
transportable tablespaces). For source and target operating systems
with different byte ordering, a process to convert the datafiles from
one "endianness" to another is required.
Oracle uses an RMAN option to convert a data file from "big endian"
to "little endian" and vice versa. First, the "endianness" of the source
and target operating systems must be identified. If different, then the
datafiles are read and converted by RMAN. Upon completion, the
"endianness" of the datafiles is converted to the format needed in the
new environment. The process of converting the cloned datafiles
occurs either on the source database host before copying to the new
environment or once it is received on the target host. Other than this
conversion process, the steps for cross-platform transportable
tablespaces are the same as for normal transportable tablespaces.
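The byte-ordering difference itself is easy to demonstrate. The following standalone sketch (not RMAN's implementation) shows the same 32-bit value stored in both formats and the byte swap that converts between them:

```python
# Illustration of the byte-ordering difference that RMAN's CONVERT
# step resolves when moving datafiles between platforms.

import struct

value = 0x11223344
big = struct.pack(">I", value)      # big endian: most significant byte first
little = struct.pack("<I", value)   # little endian: least significant byte first

print(big.hex())      # 11223344
print(little.hex())   # 44332211

# Converting a field between platforms amounts to reversing its byte
# order (done for real by RMAN across the whole datafile, not by hand):
converted = big[::-1]
assert converted == little
assert struct.unpack("<I", converted)[0] == value
```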
[Table 10 excerpt: number of simultaneous copies supported by each replication option]
The following are examples of some of the choices you might make
for database cloning based on the information in Table 10.
More than two simultaneous copies need to be made, and the copies
will live for up to a month: TimeFinder/Clone.
Multiple copies are being made, some with production mount, and
the copies are reused in a cycle, expiring the oldest first:
Replication Manager.
Introduction
As a part of normal day-to-day operations, the DBA creates backup
procedures that run one or more times a day to protect the database
against errors. Errors can originate from many sources (such as
software, hardware, user, and so on) and it is the responsibility of the
DBA to provide error recovery strategies that can recover the
database to a point of consistency and also minimize the loss of
transactional data. Ideally, this backup process should be simple,
efficient, and fast.
Today, the DBA is challenged to design a backup (and recovery)
strategy to meet the ever-increasing demands for availability that can
also manage extremely large databases efficiently while minimizing
the burden on servers, backup systems, and operations staff.
This section describes how the DBA can leverage EMC technology in
a backup strategy to:
◆ Reduce production impact of performing backups.
◆ Create consistent point-in-time backup images.
◆ Create restartable or recoverable database backup images.
◆ Enhance Oracle's RMAN backup utility.
Before covering these capabilities, it is necessary to review some
terminology and also to look at best practices for Oracle database
layouts that can facilitate and enhance the backup and restore
process.
[Figure: Oracle database layout on a Symmetrix array, separating data, index, undo, and archive log volumes (ICO-IMG-000512)]
Making a hot copy of the database is now the standard, but this
method has its own challenges. How can a consistent copy of the
database and supporting files be made when they are changing
throughout the duration of the backup? What exactly is the content of
the tape backup at completion? The reality is that the tape data is a
"fuzzy image" of the disk data, and considerable expertise is required
to restore the database back to a database point of consistency.
Online backups are made when the database is running in log
archival mode. While there are performance considerations for
running in archive log mode, the overhead associated with it is
generally small compared with the enhanced capabilities and
increased data protection afforded by running in it. Except in cases
such as large data warehouses where backups are unnecessary, or in
other relatively obscure cases, archive log mode is generally
considered a best practice for all Oracle database environments.
Whether the database copy is restartable or recoverable depends on
the state of the database at the time the copy was made. Chapter 5,
“Restoring and Recovering Oracle Databases,” covers the restore of
the database.
The following sections describe how to make a copy of the database
using three different EMC technologies with the database in the three
different states described in the prior paragraph.
The primary method of creating copies of an Oracle database is
through the use of the EMC local replication product TimeFinder.
TimeFinder is also used by Replication Manager to make database
copies. Replication Manager facilitates the automation and
management of database copies.
The TimeFinder family consists of two base products and several
component options. TimeFinder/Mirror, TimeFinder/Clone and
TimeFinder/Snap were discussed in general terms in Chapter 2,
“EMC Foundation Products.” In this chapter, they are used in a
database backup context.
3. When the database is shut down, split the BCV mirrors using the
following command:
symmir -g device_group split -noprompt
[Figure: cold TimeFinder/Snap of a database with standard device, VDEV, and save device (ICO-IMG-000506)]
2. Once the create operation has completed, shut down the database
in order to make a cold TimeFinder/Snap of the DBMS. Execute
the following Oracle commands:
sqlplus "/ as sysdba"
SQL> shutdown immediate;
[Figure: TimeFinder/Snap copy-on-first-write with standard device, VDEV, and save device (ICO-IMG-000508)]
Alternatively, with Oracle10g, the entire database can be put into hot
backup mode with:
sqlplus "/ as sysdba"
SQL> alter system archive log current;
SQL> alter database begin backup;
When these commands are issued, data blocks for the tablespaces are
flushed to disk and the datafile headers are updated with the last
checkpoint SCN. Further updates of the checkpoint SCN to the data
file headers are not performed while in this mode. When these files
are copied, the nonupdated SCN in the datafile headers signifies to
the database that recovery is required.
The log file switch command is used to ensure that the marker
indicating that the tablespaces are taken out of hot backup mode is
found in an archive log. Switching the log automatically ensures that
this record is found in a written archive log.
5. After the tablespaces are taken out of hot backup mode and a log
switch is performed, split the Log BCV devices from their source
volumes:
symmir -g log_group split -noprompt
5. After the tablespaces are taken out of hot backup mode and a log
switch is performed, activate the Log clone devices:
symclone -g log_group activate -noprompt
[Figure: hot backup TimeFinder/Snap sequence with standard device, VDEV, and save device (ICO-IMG-000510)]
5. After the database is taken out of hot backup mode and a log
switch is performed, activate the Log snap devices:
symsnap -g log_group activate -noprompt
Note: Regardless of the tool used to create the backup copy and regardless of
the state of the database at the time the copy was created, the backup process
is the same, except as noted in the next section.
[Figure: hot backup sequence across the Data, Log, and Archive standard devices and their BCVs (ICO-IMG-000511)]
Introduction
Recovery of a production database is an event that all DBAs hope is
never required. Nevertheless, DBAs must be prepared for unforeseen
events such as media failures or user errors requiring database
recovery operations. The keys to a successful database recovery
include the following:
◆ Identifying database recovery time objectives
◆ Planning the appropriate recovery strategy based upon the
backup type (full, incremental)
◆ Documenting the recovery procedures
◆ Validating the recovery process
Oracle recovery depends on the backup methodology used. With the
appropriate backup procedures in place, an Oracle database is
recovered to any point in time between the end of the backup and the
point of failure using a combination of backed up data files and
Oracle recovery structures including the control files, the archive
logs, and the redo logs. Recovery typically involves copying the
previously backed up files into their appropriate locations and, if
necessary, performing recovery operations to ensure that the
database is recovered to the appropriate point in time and is
consistent.
The following sections examine both traditional (user-managed) and
RMAN Oracle database recoveries. This chapter assumes that EMC
technology is used in the backup process as described in Chapter 4,
“Backing Up Oracle Environments.” Thus, this chapter directly
matches the sections of that chapter.
Crash recovery
A critical component of all ACID-compliant (Atomicity Consistency
Isolation Durability) databases is the ability to perform crash
recovery to a consistent database state after a failure. Power failures
on the host are a primary cause of databases going down
inadvertently and requiring crash recovery. Other situations where
crash recovery procedures are needed include databases shut down
with the "abort" option and database images created using a
consistent split mechanism.
Crash recovery is an example of using the database restart process,
where the implicit application of database logs during normal
initialization occurs. Crash recovery is a database-driven recovery
mechanism; it is not initiated by a DBA. Whenever the database is
started, Oracle verifies that the database is in a consistent state. It
does this by reading information out of the control file and verifying
the database was previously shut down cleanly. It also determines the
latest checkpoint system change number (SCN) in the control file and
verifies that each datafile is current by comparing the SCN in each
data file header. In the event that a crash occurred and recovery is
required, the database automatically determines which log
information needs to be applied. The latest redo log is read and
change information from them is applied to the database files, rolling
forward any transactions that were committed but not applied to the
database files. Then, any transaction information written to the
datafiles, but not committed, are rolled back using data in the undo
logs.
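The startup check described above can be sketched as a simple model (illustrative only; the function name and inputs are hypothetical and do not reflect Oracle internals):

```python
# Illustrative model of the startup consistency check: the control file
# records the latest checkpoint SCN and whether the last shutdown was
# clean; each datafile header records the SCN it was checkpointed at.

def needs_crash_recovery(clean_shutdown, checkpoint_scn, datafile_scns):
    """Return True if the instance must perform crash recovery on startup."""
    if not clean_shutdown:
        return True
    # A datafile whose header SCN does not match the checkpoint SCN is stale.
    return any(scn != checkpoint_scn for scn in datafile_scns)

# A cleanly shut down database with current datafiles needs no recovery.
print(needs_crash_recovery(True, 5000, [5000, 5000]))   # False
# After an abnormal termination, recovery is always required.
print(needs_crash_recovery(False, 5000, [5000, 4990]))  # True
```

The real mechanism compares considerably more state, but the decision reduces to the same comparison of recorded SCNs against the checkpoint.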
Media recovery
Media recovery is another type of Oracle recovery mechanism.
Unlike crash recovery, however, media recovery is always
user-invoked, and either user-managed or RMAN recovery may be
used. Media recovery rolls forward changes made to datafiles that
were restored from disk or tape because of their loss or corruption.
Unlike crash recovery, which uses only the online redo log files,
media recovery uses both the online redo logs and the archived log
files during the recovery process.
The granularity of a media recovery depends on the requirements of
the DBA. It can be performed for an entire database, for a single
tablespace, or even for a single datafile. The process involves
restoring a copy of a valid backed up image of the required data
structure (database, tablespace, datafile) and using Oracle standard
recovery methods to roll forward the database to the point in time of
the failure by applying change information found in the archived and
online redo log files. Oracle uses SCNs to determine the last changes
applied to the data files involved. It then uses information in the
control files that specifies which SCNs are contained in each of the
archive logs to determine where to start the recovery process.
Changes are then applied to appropriate datafiles to roll them
forward to the point of the last transaction in the logs.
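The archive-log selection step can be sketched as follows (an illustrative model, assuming each log's SCN range is known from the control file; the function name is hypothetical):

```python
# Sketch of choosing which archive logs to apply during media recovery.
# The control file records the SCN range covered by each archived log;
# any log whose range extends past the restored datafile's SCN must be
# applied, in sequence order, to roll the datafile forward.

def logs_to_apply(datafile_scn, archive_logs):
    """archive_logs: (first_scn, next_scn) pairs in sequence order.
    Return the logs containing changes after the restored datafile's SCN."""
    return [(lo, hi) for (lo, hi) in archive_logs if hi > datafile_scn]

# A datafile restored at SCN 1200 needs the logs covering 1000-1500 and
# 1500-2000; the log ending at SCN 1000 holds no changes to apply.
print(logs_to_apply(1200, [(500, 1000), (1000, 1500), (1500, 2000)]))
```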
Media recovery is the predominant Oracle recovery mechanism.
Media recovery is also used as a part of replicating Oracle databases
for business continuity or disaster recovery purposes. Further details
of the media recovery process are in the following sections.
Complete recovery
Complete recovery is the primary method of recovering an Oracle
database. It is the process of recovering a database to the latest point
in time (just before the database failure) without the loss of
committed transactions. The complete recovery process involves
restoring a part or all of the database data files from a backup image
on tape or disk, and then reading and applying all transactions
subsequent to the completion of the database backup from the
archived and online log files. After restarting the database, crash
recovery is performed to make the database transactionally
consistent for continued user transactional processing.
The processes needed to perform complete recovery of the database
are detailed in the following sections.
Incomplete recovery
Oracle sometimes refers to incomplete database recovery as a
point-in-time recovery. Incomplete recovery is similar to complete
recovery in the process used to bring the database back to a
transactionally consistent state. However, instead of rolling the
database forward to the last available transaction, roll-forward
procedures are halted at a user-defined prior point. This is typically
done to recover a database prior to a point of user error such as the
deletion of a table, undesired deletion or modification of customer
data, or rollback of an unfinished batch update. In addition,
incomplete recovery is also performed when recovery is required, but
there are missing or corrupted archive logs. Incomplete recovery
always incurs some data loss.
Typically, incomplete recovery operations are performed on the entire
database since Oracle needs all database files to be consistent with
one another. However, an option called Tablespace Point-in-Time
Recovery (usually abbreviated TSPITR), which allows a single
tablespace to be only partially recovered, is also available. This
recovery method, in Oracle10g, uses the transportable tablespace
feature described in Section 3.8. The Oracle documentation Oracle
Database Backup and Recovery Advanced Users Guide provides
additional information on TSPITR.
This first case is depicted in Figure 44, where both the volumes
containing the datafiles and the database recovery structures (archive
logs, redo logs, and control files) are restored.
Prior to any disk-based restore using EMC technology, the database
must be shut down, and file systems unmounted. The operating
system should have nothing in its memory that reflects the content of
the database file structures.
In the example that follows, the data_group device group holds all
Symmetrix volumes containing Oracle tablespaces. The log_group
group has volumes containing the Oracle recovery structures (the
archive logs, redo logs, and control files). The following steps
describe the process needed to restore the database image from the
BCVs:
1. Verify the state of the BCVs. All volumes in the Symmetrix device
group should be in a split state. The following commands identify
the state of the BCVs for each of the device groups:
symmir -g data_group query
symmir -g log_group query
3. After the primary database has shut down, unmount the file
system (if used) to ensure that nothing remains in cache. This
action is operating-system dependent.
4. Once the primary database has shut down successfully and the
file system is unmounted, initiate the BCV restore process. In this
example, both the data_group and log_group device groups are
restored, indicating a point-in-time recovery. If an incomplete or
complete recovery is required, only the data_group device group
would be restored. Execute the following TimeFinder/Mirror
SYMCLI commands:
symmir -g data_group restore -nop
symmir -g log_group restore -nop
5. Once the BCV restore process has been initiated, the production
database copy is ready for recovery operations. It is possible to
start the recovery process even though the data is still being
restored from the BCV to the production devices. Any tracks
needed, but not restored, will be pulled directly from the BCV
device. It is recommended, however, that the restore operation
completes and the BCVs are split from the standard devices
before the source database is started and recovery (if required) is
initiated.
6. After the restore process completes, split the BCVs from the
standard devices with the following commands:
symmir -g data_group split -nop
symmir -g log_group split -nop
symmir -g data_group query
symmir -g log_group query
In the example that follows, the data_group device group holds all
Symmetrix volumes containing Oracle tablespaces. The log_group
group has volumes containing the Oracle recovery structures (the
archive logs, redo logs, and control files). Follow these steps to restore
the database image from the BCV clone devices:
1. Verify the state of the clone devices. Volumes in the Symmetrix
device group should be in an active state, although the
relationship between the source and target volumes may have
terminated. The following commands identify the state of the
clones for each of the device groups (the -multi flag is used to
show all relationships available):
symclone -g data_group query -multi
symclone -g log_group query -multi
3. After the primary database has shut down, unmount the file
system (if used) to ensure that nothing remains in server cache.
This action is operating-system dependent.
4. Initiate the clone restore process. In this example, both the
data_group and log_group device groups are restored, indicating a
point-in-time recovery. If an incomplete or complete recovery is
required, only the data_group device group would be restored.
Execute the following TimeFinder/Clone SYMCLI commands:
symclone -g data_group restore -nop
symclone -g log_group restore -nop
symclone -g data_group query -multi
symclone -g log_group query -multi
(Figures ICO-IMG-000515 and ICO-IMG-000516: TimeFinder/Snap
restore, in which data copied to the save area is restored to the
standard devices.)
3. After the primary database shuts down, unmount the file system
(if used) to ensure that nothing remains in cache. This action is
operating-system dependent.
4. Once the file systems are unmounted, initiate the snap restore
process. In this example, both the data_group and log_group device
groups are restored, indicating a point-in-time recovery. If an
incomplete or complete recovery is required, only the data_group
device group would be restored. Execute the following
TimeFinder/Snap SYMCLI commands:
symsnap -g data_group restore -nop
symsnap -g log_group restore -nop
6. When the snap restore process is initiated, both the snap device
and the source are set to a Not Ready status (that is, they are
offline to host activity). Once the restore operation commences,
the source device is set to a Ready state. Upon completion of the
restore process, terminate the restore operations as follows:
symsnap -g data_group terminate -restored -noprompt
symsnap -g log_group terminate -restored -noprompt
symsnap -g data_group query
symsnap -g log_group query
Note: Terminating the restore session does not terminate the underlying
snap session.
(Figure ICO-IMG-000517: recovering the production database from the
BCVs; the data, log, and archive standard devices are shown with the
data BCV.)
Note: It is also possible to simply restart the database as shown in the next
section.
or
SQL> recover database until time timestamp;
Oracle Flashback
Oracle Flashback is a technology that helps DBAs recover from user
errors to the database. Initial Flashback functionality was provided in
Oracle9i but was greatly enhanced in Oracle10g. Flashback retains
undo data in the form of flashback logs. Flashback logs containing
undo information are periodically written by the database in order
for the various types of Flashback to work.
Each type of Flashback relies on undo data being written to the flash
recovery area. The flash recovery area is a file system Oracle uses to
retain the flashback logs, archive logs, backups, and other
recovery-related files.
Some of the ways Flashback helps DBAs recover from user errors are:
◆ Flashback Query
◆ Flashback Version Query
◆ Flashback Transaction Query
◆ Flashback Table
◆ Flashback Drop
◆ Flashback Database
Each of these recovery methods is described in the following sections.
Flashback configuration
Flashback is enabled in a database by creating a flash recovery area
for the Flashback logs to be retained, and by enabling Flashback
logging. Flashback allows the database to be flashed back to any
point in time. However, the Flashback logs represent discrete
database points in time, and as such, ARCHIVELOG mode must also
be enabled for the database. Archive log information is used in
conjunction with the flashback logs to re-create any given database
point-in-time state desired.
The default flash recovery area is defined by the Oracle
initialization parameter DB_RECOVERY_FILE_DEST. It is important
to set this parameter to the location of a directory that can hold the
flashback logs. The required size of this file system depends on how
far back a user may want to flash the database back and on the rate
of change in the database.
Flashback Query
Flashback Query returns query results as they appeared at a previous
point in time. For example, if a user erroneously deleted a selection
of rows from a table, Flashback Query allows that user to query the
table as it existed before the deletion.
The following is an example of the Flashback Query functionality:
SELECT first_name, last_name
FROM emp
AS OF TIMESTAMP
TO_TIMESTAMP('2005-11-25 11:00:00', 'YYYY-MM-DD HH:MI:SS');
Flashback Table
Flashback Table returns a table back into the state that it was at a
specified time. It is particularly useful in that this change can be made
while the database is up and running. The following is an example of
the Flashback Table functionality:
FLASHBACK TABLE emp
TO TIMESTAMP
TO_TIMESTAMP('2005-11-26 10:30:00', 'YYYY-MM-DD HH:MI:SS');
An SCN can also be used:
FLASHBACK TABLE emp
TO SCN 54395;
Flashback Drop
If a table is dropped inadvertently using a DROP TABLE
command, Flashback Drop can reverse the process, re-enabling access
to the dropped table. As long as space is available, the DROP TABLE
command does not delete data in the tablespace data files. Instead,
the table data is retained (in Oracle's "recycle bin") and the table is
renamed to an internally system-defined name. If the table is needed,
Oracle can bring back the table by renaming it with its old name.
The following shows an example of a table being dropped and then
brought back using the FLASHBACK TABLE command.
1. Determine the tables owned by the currently connected user:
SQL> SELECT * FROM tab;
Flashback Database
Flashback Database logically recovers the entire database to a
previous point in time. A database can be rolled back in time to the
point before a user error, such as a batch update or a set of
transactions, logically corrupted the database. The database can be
rolled back to a particular SCN, redo log sequence number, or
timestamp. The following is the syntax of the FLASHBACK
DATABASE command:
4. Open the database for use. To make the database consistent, open
the database as follows:
SQL> alter database open resetlogs;
After opening the database with the resetlogs option,
immediately perform a full database backup.
Introduction
A critical part of managing a database is planning for unexpected loss
of data. The loss can occur from a disaster such as a fire or flood or it
can come from hardware or software failures. It can even come
through human error or malicious intent. In each instance, the
database must be restored to some usable point before application
services can resume.
The effectiveness of any plan for restart or recovery involves
answering the following questions:
◆ How much downtime is acceptable to the business?
◆ How much data loss is acceptable to the business?
◆ How complex is the solution?
◆ Does the solution accommodate the data architecture?
◆ How much does the solution cost?
◆ What disasters does the solution protect against?
◆ Is there protection against logical corruption?
◆ Is there protection against physical corruption?
◆ Is the database restartable or recoverable?
◆ Can the solution be tested?
◆ If failover happens, will failback work?
All restart and recovery plans include a replication component. In its
simplest form, the replication process may be as easy as making a
tape copy of the database and application. In a more sophisticated
form, it could be real-time replication of all changed data to some
remote location. Remote replication of data has its own challenges
centered around:
◆ Distance
◆ Propagation delay (latency)
◆ Network infrastructure
◆ Data loss
This section provides an introduction to the spectrum of disaster
recovery and disaster restart solutions for Oracle databases on EMC
Symmetrix arrays.
Definitions
In the following sections, the terms dependent-write consistency,
database restart, database recovery, and roll-forward recovery are used. A
clear definition of these terms is required to understand the context of
this section.
Dependent-write consistency
A dependent-write I/O is one that cannot be issued until a related
predecessor I/O has completed. Dependent-write consistency is a
data state where data integrity is guaranteed by dependent-write
I/Os embedded in application logic. Database management systems
are good examples of the practice of dependent-write consistency.
Database management systems must protect against abnormal
termination in order to recover from it successfully. The most
common technique used is to guarantee that a dependent-write
cannot be issued until a predecessor write has completed. Typically
the dependent-write is a data or index write while the predecessor
write is a write to the log. Because the write to the log must be
completed prior to issuing the dependent-write, the application
thread is synchronous to the log write (that is, it waits for that write to
complete prior to continuing). The result of this strategy is a
dependent-write consistent database.
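The ordering rule can be sketched in a few lines (a minimal single-threaded model; names such as `commit` and `journal` are illustrative, not a real DBMS I/O layer):

```python
# Minimal sketch of dependent-write ordering: the data/index write is
# not issued until the predecessor log write has completed.

journal = []

def write_log(record):
    journal.append(("log", record))   # predecessor write completes first

def write_data(record):
    journal.append(("data", record))  # dependent write follows

def commit(record):
    write_log(record)   # synchronous: returns only when the log is durable
    write_data(record)  # dependent write is issued only afterwards

commit("txn-1")
# The log entry always precedes its dependent data write, so any
# point-in-time image of `journal` is dependent-write consistent.
print(journal)  # [('log', 'txn-1'), ('data', 'txn-1')]
```

Because the application thread waits on the log write, no image of the storage can ever contain a data write whose log record is missing, which is exactly the property a consistent split or remote replica relies on.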
Database restart
Database restart is the implicit application of database logs during
the database's normal initialization process to ensure a
transactionally consistent data state.
If a database is shut down normally, the process of getting to a point
of consistency during restart requires minimal work. If the database
abnormally terminates, then the restart process will take longer
depending on the number and size of in-flight transactions at the
time of termination. An image of the database created by using EMC
consistency technology while it is running, without conditioning the
database, will be in a dependent-write consistent data state, which is
similar to that created by a local power failure. This is also known as
a DBMS restartable image. The restart of this image transforms it to a
transactionally consistent data state.
Definitions 231
Understanding Oracle Disaster Restart & Disaster Recovery
Database recovery
Database recovery is the process of rebuilding a database from a
backup image, and then explicitly applying subsequent logs to roll
forward the data state to a designated point of consistency. Database
recovery is only possible with databases configured with archive
logging.
A recoverable Oracle database copy can be taken in one of three
ways:
◆ With the database shut down and copying the database
components using external tools
◆ With the database running using the Oracle backup utility
Recovery Manager (RMAN)
◆ With the database in hot backup mode and copying the database
using external tools
Roll-forward recovery
With some databases, it may be possible to take a DBMS restartable
image of the database, and apply subsequent archive logs, to roll
forward the database to a point in time after the image was created.
This means that the image created can be used in a backup strategy in
combination with archive logs. At the time of printing, a DBMS
restartable image of Oracle cannot use subsequent logs to roll
forward transactions. In most cases, during a disaster, the storage
array image at the remote site will be an Oracle DBMS restartable
image and cannot have archive logs applied to it.
Operational complexity
The operational complexity of a DR solution may be the most critical
factor in determining the success or failure of a DR activity. The
complexity of a DR solution can be considered as three separate
phases.
1. Initial configuration and setup of the implementation
2. Maintenance and management of the running solution
3. Execution of the DR plan in the event of a disaster
While initial configuration complexity and running complexity can
be a demand on human resources, the third phase, execution of the
plan, is where automation and simplicity must be the focus. When a
disaster is declared, key personnel may be unavailable in addition to
the loss of servers, storage, networks, buildings, and so on. If the
complexity of the DR solution is such that skilled personnel with an
Production impact
Some DR solutions delay the host activity while taking actions to
propagate the changed data to another location. This action only
affects write activity and, although the introduced delay may be only
on the order of a few milliseconds, it can impact response time in a
high-write environment. Synchronous solutions introduce delay into
write transactions at the source site; asynchronous solutions do not.
operational functions like power on and off. Ideally, this server could
have some usage such as running development or test databases and
applications. Some DR solutions require more target server activity
and some require none.
Bandwidth requirements
One of the largest costs for DR is in provisioning bandwidth for the
solution. Bandwidth costs are an operational expense; this makes
solutions that have reduced bandwidth requirements very attractive
to customers. It is important to recognize in advance the bandwidth
consumption of a given solution to be able to anticipate the running
costs. Incorrect provisioning of bandwidth for DR solutions can have
an adverse effect on production performance and can invalidate the
overall solution.
Federated consistency
Databases are rarely isolated islands of information with no
interaction or integration with other applications or databases. Most
commonly, databases are loosely and/or tightly coupled to other
databases using triggers, database links, and stored procedures. Some
databases provide information downstream for other databases using
information distribution middleware; other databases receive feeds
and inbound data from message queues and EDI transactions. The
result can be a complex interwoven architecture with multiple
interrelationships. This is referred to as a federated database
architecture.
With a federated database architecture, making a DR copy of a single
database without regard to other components invites consistency
issues and creates logical data integrity problems. All components in
a federated architecture need to be recovered or restarted to the same
dependent-write consistent point of time to avoid these problems.
It is possible then that point database solutions for DR, such as log
shipping, do not provide the required business point of consistency
in a federated database architecture. Federated consistency solutions
guarantee that all components, databases, applications, middleware,
flat files, and such are recovered or restarted to the same
dependent-write consistent point in time.
Cost
The cost of doing DR can be justified by comparing it to the cost of
not doing it. What does it cost the business when the database and
application systems are unavailable to users? For some companies,
this is easily measurable, and revenue loss can be calculated per hour
of downtime or per hour of data loss.
Whatever the business, the DR cost is going to be an extra expense
item and, in many cases, with little in return. The costs include, but
are not limited to:
◆ Hardware (storage, servers and maintenance)
◆ Software licenses and maintenance
◆ Facility leasing/purchase
◆ Utilities
◆ Network infrastructure
◆ Personnel
Tape-based solutions
This section discusses the following tape-based solutions:
◆ “Tape-based disaster recovery” on page 239
◆ “Tape-based disaster restart” on page 239
Propagation delay
Electronic operations execute at the speed of light. The speed of light
in a vacuum is 186,000 miles per second; through glass (in the case of
fiber-optic media) it is slower, approximately 115,000 miles per
second.
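These speeds translate into propagation delay as follows (a back-of-the-envelope sketch; a synchronous remote write incurs at least one round trip, before any protocol or equipment overhead):

```python
# Back-of-the-envelope propagation delay through optical fiber.
SPEED_IN_GLASS_MPS = 115_000  # miles per second, approximate

def round_trip_ms(distance_miles):
    """One round trip (out and back) in milliseconds, propagation only."""
    return 2 * distance_miles / SPEED_IN_GLASS_MPS * 1000

# Sites ~100 miles apart add roughly 1.7 ms of propagation delay to
# every synchronous write, before protocol and equipment overhead.
print(round(round_trip_ms(100), 2))  # 1.74
```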
Bandwidth requirements
All remote replication solutions have some bandwidth requirements
because the changes from the source site must be propagated to the
target site. The more changes there are, the greater the bandwidth
that is needed. It is the change rate and replication methodology that
determine the bandwidth requirement, not necessarily the size of the
database.
Data compression can help reduce the quantity of data transmitted
and therefore the size of the "pipe" required. Certain network devices,
like switches and routers, provide native compression, some by
software and some by hardware. GigE directors provide native
compression in a DMX to DMX SRDF pairing. The amount of
compression achieved depends on the type of data being
compressed. Typical character and numeric database data
compresses at about a 2-to-1 ratio. A good way to estimate how the
data will compress is to assess how much tape space is required to
store the database during a full-backup process. Tape drives perform
hardware compression on the data prior to writing it. For instance, if
a 300 GB database takes 200 GB of space on tape, the compression
ratio is 1.5 to 1.
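The estimate described above reduces to a quick calculation (a sketch only; the 30 MB/s change rate is an assumed figure for illustration, and real link sizing relies on analysis tools rather than this arithmetic):

```python
# Estimating compressibility from a full backup to tape: tape drives
# compress in hardware, so tape usage approximates the compressed size
# of the database.

def compression_ratio(db_size_gb, tape_used_gb):
    return db_size_gb / tape_used_gb

def required_link_rate(change_rate_mb_s, ratio):
    """Rough link sizing: raw change rate divided by the compression ratio."""
    return change_rate_mb_s / ratio

ratio = compression_ratio(300, 200)   # the 300 GB / 200 GB example
print(ratio)                          # 1.5
print(required_link_rate(30, ratio))  # assumed 30 MB/s change rate -> 20.0
```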
For most customers, a major consideration in the disaster recovery
design is cost. It is important to recognize that some components of
the end solution represent a capital expenditure and some an
operational expenditure. Bandwidth costs are operational expenses
and thus any reduction in this area, even at the cost of some capital
expense, is highly desirable.
Network infrastructure
The choice of channel extension equipment, network protocols,
switches, routers, and such, ultimately determines the operational
characteristics of the solution. EMC has a proprietary "BC Design
Tool" to assist customers in analysis of the source systems and to
determine the required network infrastructure to support a remote
replication solution.
Method of instantiation
In all remote replication solutions, a common requirement is for an
initial, consistent copy of the complete database to be replicated to
the remote site. The initial copy from source to target is called
instantiation of the database at the remote site. Following instantiation,
only the changes made at the source site are replicated. For large
databases, sending only the changes after the initial copy is the only
practical and cost-effective solution for remote database replication.
In some solutions, instantiation of the database at the remote site uses
a process similar to the one that replicates the changes. Some
solutions do not even provide for instantiation at the remote site (log
shipping for instance). In all cases it is critical to understand the pros
and cons of the complete solution.
Method of reinstantiation
Some methods of remote replication require periodic refreshing of the
remote system with a full copy of the database. This is called
reinstantiation. Technologies such as log shipping frequently require
this since not all activity on the production database may be
represented in the log. In these cases, the disaster recovery plan must
account for reinstantiation and also for the fact that there may be a
disaster during the refresh. The business objectives of RPO and RTO
must likewise be met under those circumstances.
Locality of reference
Locality of reference is a factor that needs to be measured to
understand if there will be a reduction of bandwidth consumption
when any form of asynchronous transmission is used. Locality of
reference is a measurement of how much write activity on the source
is skewed. For instance, a high locality of reference application may
make many updates to a few tables in the database, whereas a low
locality of reference application rarely updates the same rows in the
same tables during a given time period. While the activity on the
tables may have a low locality of reference, the write activity into an
index might be clustered when inserted rows have the same or
similar index column values. This renders a high locality of reference
on the index components.
In some asynchronous replication solutions, updates are "batched"
into periods of time and sent to the remote site to be applied. In a
given batch, only the last image of a given row/block is replicated to
the remote site. So, for highly skewed application writes, this results
in bandwidth savings. Generally, the greater the time period of
batched updates, the greater the savings on bandwidth.
Log-shipping technologies do not consider locality of reference. For
example, a row updated 100 times, is transmitted 100 times to the
remote site, whether the solution is synchronous or asynchronous.
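The difference can be sketched as follows (an illustrative model of write folding within one batch interval; the function name and block images are hypothetical):

```python
# Sketch of write folding in batched asynchronous replication: within
# one batch interval, only the last image of each block is sent, whereas
# log shipping transmits every individual update.

def folded_batch(writes):
    """writes: sequence of (block_id, image). Keep the last image per block."""
    last = {}
    for block, image in writes:
        last[block] = image  # a later write to the same block replaces it
    return last

writes = [(7, "v1"), (7, "v2"), (9, "a"), (7, "v3")]
print(len(writes))                # log shipping sends all 4 updates
print(len(folded_batch(writes)))  # a folded batch sends only 2 blocks
print(folded_batch(writes))       # {7: 'v3', 9: 'a'}
```

With highly skewed (high locality of reference) write activity, the folded batch is far smaller than the raw update stream, which is the source of the bandwidth savings described above.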
Failback operations
If there is the slightest chance that failover to the DR site may be
required, then there is a 100 percent chance that failback to the
primary site also will be required, unless the primary site is lost
permanently. The DR architecture should be designed to make
failback simple, efficient, and low risk. If failback is not planned for,
there may be no reasonable or acceptable way to move the processing
from the DR site, where the applications may be running on tier 2
servers and tier 2 networks, and so forth, back to the production site.
In a perfect world, the DR process should be tested once a quarter,
with database and application services fully failed over to the DR site.
The integrity of the application and database must be verified at the
remote site to ensure that all required data was copied successfully.
Ideally, production services are brought up at the DR site as the
ultimate test. This means production data is maintained on the DR
site, requiring a failback when the DR test is completed. While this is
not always
possible, it is the ultimate test of a DR solution. It not only validates
the DR process, but also trains the staff on managing the DR process
should a catastrophic failure occur. The downside for this approach is
that duplicate sets of servers and storage need to be present to make
an effective and meaningful test. This tends to be an expensive
proposition.
SQL> shutdown
SQL> exit
2. Move the log files using O/S commands from the old location to
the new location:
mv /oracle/oldlogs/log1a.rdo /oracle/newlogs/log1a.rdo
mv /oracle/oldlogs/log1b.rdo /oracle/newlogs/log1b.rdo
4. Instruct the source Symmetrix array to send all the tracks on the
source site to the target site using the current mode:
symrdf -g device_group establish -full -noprompt
Note: There is no requirement for a host at the remote site during the
synchronous replication. The target Symmetrix array itself manages the
in-bound writes and updates the appropriate volumes in the array.
At this point, the host can issue the necessary commands to access the
disks. For instance, on a UNIX host, import the volume group,
activate the logical volumes, fsck the file systems and mount them.
Once the data is available to the host, the database can restart. The
database will perform an implicit recovery when restarted.
Transactions that were committed, but not completed, are rolled
forward and completed using the information in the redo logs.
Transactions that have updates applied to the database, but were not
committed, are rolled back. The result is a transactionally consistent
database.
Rolling disaster
Protection against a rolling disaster is required when the data for a
database resides on more than one Symmetrix array or multiple RA
groups. Figure 53 on page 254 depicts a dependent-write I/O
sequence where a predecessor log write is happening prior to a page
flush from a database buffer pool. The log device and data device are
on different Symmetrix arrays with different replication paths.
Figure 53 demonstrates how rolling disasters can affect this
dependent-write sequence.
(Figure 53, ICO-IMG-000519: a rolling disaster in which data writes
run ahead of log writes because the log and data devices replicate
over different paths; X = application data, Y = DBMS data, Z = logs.)
(Figure ICO-IMG-000520: SRDF consistency group protection; when a
rolling disaster begins, the ConGroup definition (X, Y, Z), enforced
through Solutions Enabler SCF/SYMAPI and IOS/PowerPath on each
host, suspends all R1/R2 relationships together, leaving a DBMS
restartable copy at the target; X = application data, Y = DBMS data,
Z = logs.)
(Figure ICO-IMG-000521: synchronous SRDF replication of the Oracle
data files, redo logs, and archive logs from the source to the target
array.)
2. Add to the consistency group the R1 devices 121 and 12f from
Symmetrix with ID 111, and R1 devices 135 and 136 from
Symmetrix with ID 222:
symcg -cg device_group add dev 121 -sid 111
symcg -cg device_group add dev 12f -sid 111
symcg -cg device_group add dev 135 -sid 222
symcg -cg device_group add dev 136 -sid 222
Note: There is no requirement for a host at the remote site during the
synchronous replication. The target Symmetrix array manages the in-bound
writes and updates the appropriate disks in the array.
SRDF/A
SRDF/A, or asynchronous SRDF, is a method of replicating
production data changes from one Symmetrix array to another using
delta set technology. Delta sets are the collection of changed blocks
grouped together by a time interval configured at the source site. The
default time interval is 30 seconds. The delta sets are then transmitted
from the source site to the target site in the order created. SRDF/A
preserves dependent-write consistency of the database at all times at
the remote site.
The distance between the source and target Symmetrix arrays is
unlimited and there is no host impact. Writes are acknowledged
immediately when they hit the cache of the source Symmetrix array.
SRDF/A is only available on the DMX family of Symmetrix arrays.
Figure 56 shows the process.
[Figure 56: The SRDF/A delta-set process - cycle N is captured on
the R1 devices while cycle N-1 is transmitted across the link and
cycle N-2 is applied to the R2 devices. ICO-IMG-000522]
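The delta-set mechanism described above can be sketched as follows. This is a hypothetical illustration of the bookkeeping, not the Enginuity implementation: changed blocks are grouped into numbered sets by capture interval, later writes to the same block within a cycle supersede earlier ones, and the sets are applied strictly in creation order.

```python
# Hypothetical sketch of SRDF/A delta-set grouping. Changed blocks are
# collected into numbered delta sets by a capture interval (default
# 30 seconds) and applied at the target in creation order, which is
# what preserves dependent-write consistency.

INTERVAL = 30  # seconds per capture cycle (the SRDF/A default)

def group_into_delta_sets(writes):
    """writes: list of (timestamp_sec, block, data) tuples."""
    delta_sets = {}
    for ts, block, data in writes:
        cycle = int(ts // INTERVAL)  # which capture cycle this write falls in
        # a later write to the same block within a cycle supersedes the earlier one
        delta_sets.setdefault(cycle, {})[block] = data
    # transmit/apply the sets in the order they were created
    return [delta_sets[c] for c in sorted(delta_sets)]

writes = [(1, "A", "a1"), (12, "B", "b1"), (29, "A", "a2"), (35, "C", "c1")]
print(group_into_delta_sets(writes))
# [{'A': 'a2', 'B': 'b1'}, {'C': 'c1'}]
```

Note that folding multiple writes to the same block into one entry per cycle is also why SRDF/A can use less bandwidth than synchronous replication of every write.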
SRDF/A 261
Understanding Oracle Disaster Restart & Disaster Recovery
2. Add to the device group the R1 devices 121 and 12f from the
Symmetrix array with ID 111, and R1 devices 135 and 136 from
the Symmetrix array with ID 222:
symld -g device_group add dev 121 -sid 111
symld -g device_group add dev 12f -sid 111
symld -g device_group add dev 135 -sid 222
symld -g device_group add dev 136 -sid 222
4. Instruct the source Symmetrix array to send all the tracks at the
source site to the target site using the current mode:
symrdf -g device_group establish -full -noprompt
Note: There is no requirement for a host at the remote site during the
asynchronous replication. The target Symmetrix array manages the
in-bound writes and updates the appropriate disks in the array.
At this point, the host can issue the necessary commands to access the
disks. For instance, on a UNIX host, import the volume group,
activate the logical volumes, fsck the file systems, and mount them.
Once the data is available to the host, the database can be restarted.
The database will perform crash recovery when restarted.
Transactions committed, but not completed, are rolled forward and
completed using the information in the redo logs. Transactions with
updates applied to the database, but not committed, are rolled back.
The result is a transactionally consistent database.
[Figure: SRDF/AR single-hop process - Oracle data on STD devices is
copied to BCV/R1 devices, replicated by SRDF to R2 devices, and
preserved in gold-copy BCVs at the target site. ICO-IMG-000523]
SRDF/AR multihop
SRDF/AR multihop is an architecture that allows long-distance
replication with zero seconds of data loss through use of a bunker
Symmetrix array. Production data is replicated synchronously to the
bunker Symmetrix array, which is within 200 km of the production
Symmetrix array allowing synchronous replication, but also far
enough away that potential disasters at the primary site may not
affect it. Typically, the bunker Symmetrix array is placed in a
hardened computing facility.
BCVs in the bunker frame are periodically synchronized to the R2s
and then consistently split to provide a dependent-write-consistent
point-in-time image of the data. These bunker BCVs also
have an R1 personality, which means that SRDF in adaptive copy
mode can be used to replicate the data from the bunker array to the
target site. Since the BCVs are not changing, the replication can be
completed in a finite length of time. The replication time depends on
the size of the "pipe" between the bunker location and the DR
location, the distance between the two locations, the quantity of
changed data, and the locality of reference of the changed data. On
the remote Symmetrix array, another BCV copy of the data is made
using the R2s. This is because the next SRDF/AR iteration replaces
the R2 image, in a nonordered fashion, and if a disaster were to occur
while the R2s were synchronizing, there would not be a valid copy of
the data at the DR site. The BCV copy of the data in the remote
Symmetrix array is commonly called the "gold" copy of the data. The
whole process then repeats.
[Figure: SRDF/AR multihop - production R1 devices replicate
synchronously to bunker R2 devices; bunker BCV/R1 devices replicate
via adaptive-copy SRDF to the DR-site R2 devices, which are copied
to gold-copy BCVs.]
Log-shipping considerations
When considering a log shipping strategy it is important to
understand:
◆ What log shipping covers.
◆ What log shipping does not cover.
◆ Server requirements.
◆ How to instantiate and reinstantiate the database.
◆ How failback works.
◆ Federated consistency requirements.
◆ How much data will be lost in the event of a disaster.
◆ Manageability of the solution.
◆ Scalability of the solution.
Log-shipping limitations
Log shipping transfers only the changes happening to the database
that are written into the redo logs and then copied to an archive log.
Consequently, operations happening in the database not written to
the redo logs do not get shipped to the remote site. To ensure that all
transactions are written to the redo logs, run the following command:
alter database force logging;
Log shipping is a database-centric strategy: it does not address
changes that occur outside of the database. Such changes include,
but are not limited to, the following:
◆ Application files and binaries
◆ Database configuration files
◆ Database binaries
◆ OS changes
◆ Flat files
To sustain a working environment at the DR site, there are several
procedures required to keep these objects up to date.
Server requirements
Log shipping requires a server at the remote DR site to receive and
apply the logs to the standby database. It may be possible to offset
this cost by using the server for other functions when it is not being
used for DR. Database licensing fees for the standby database may
also apply.
[Figure: Log shipping - data files, redo logs, and other data reside
on the production host; archive logs are shipped to the standby site
and applied to the standby database, while the contents of the
active redo logs are not transferred. ICO-IMG-000525]
Note: Users need to update their catalog entries for the production
database to connect to the new location, or the IP address of the
standby server needs to be updated to match that of the failed
production server.
[Figure: Oracle Data Guard log transport - redo generated on the
primary is archived by ARCn (LOG_ARCHIVE_DEST_1) and shipped to the
standby host, where MRP or LSP applies it from the standby redo logs
or archive logs. ICO-IMG-000528]
The following processes participate in Data Guard log transport and
apply:
◆ LGWR (Log Writer) - Sends redo log information from the primary
host to the standby host via Oracle Net. LGWR can be configured to
send data to standby redo logs on the standby host for synchronous
operations.
◆ ARCn (Archiver) - Sends primary database archive logs to the
standby host. This process is used primarily in configurations that
do not use standby redo logs and are configured for asynchronous
operations.
◆ RFS (Remote File Server) - Receives log data from the primary
LGWR or ARCn processes and writes it on the standby site to either
the standby redo logs or the archive logs. This process is
configured on the standby host when Data Guard is implemented.
◆ FAL (Fetch Archive Log) - Manages the retrieval of corrupted or
missing archive logs from the primary to the standby host.
◆ MRP (Managed Recovery) - Used by a physical standby database to
apply logs retrieved from either the standby redo logs or local
copies of the archive logs.
◆ LSP (Logical Standby) - Used by a logical standby database to
apply logs retrieved from either the standby redo logs or local
copies of the archive logs.
◆ LNS (Network Server) - Enables asynchronous writes to the standby
site using the LGWR process and standby redo logs.
Overview
Running database solutions put DR resources to active use. Instead
of having the database and server sit idle waiting for a disaster to
occur, the idea of having the database running and serving a useful
purpose at the DR site is attractive. Also,
active databases at the target site minimize the recovery time
required to have an application available in the event of a failure of
the primary. The problem is that hardware, server, and database
replication-level solutions typically require exclusive access to the
database, not allowing users to access the target database. The
solutions presented in this section perform replication at the
application layer and therefore allow user access even when the
database is being updated by the replication process.
In addition to an Oracle Data Guard logical standby database, which
can function as a running database while log information is being
applied to it, Oracle has two other methods of synchronizing data
between disparate running databases. These running database
solutions are Oracle's Advanced Replication and Oracle Streams,
which are described at a high level in the following sections.
Advanced Replication
Advanced Replication is one method of replicating objects between
Oracle databases. Advanced Replication is similar to Oracle's
previous Snapshot technology, where changes to the underlying
tables were tracked internally within Oracle and used to provide a list
of necessary rows to be sent to a remote location when a refresh of the
remote object was requested. Instead of snapshots, Oracle now uses
materialized views to track and replicate changes. Materialized views
are a complete or partial copy of a target table from a single point in
time.
Oracle Streams
Streams is Oracle's distributed transaction solution for propagating
table, schema, or entire database changes to one or many other Oracle
databases. Streams uses the concept of change records from the
source database, which are used to asynchronously distribute
changes to one or more target databases. Both DML and DDL
changes can be propagated between the source and target databases.
Queues on the source and target databases are used to manage
change propagation between the databases.
Introduction
Monitoring and managing database performance should be a
continuous process in all Oracle environments. Establishing baselines
and collecting database performance statistics for comparison against
them are important for monitoring performance trends and maintaining
a smoothly running system. The following section discusses the
performance stack and how database performance should be
managed in general. Subsequent sections discuss Symmetrix DMX
layout and configuration issues to help ensure the database meets the
required performance levels.
[Figure: The performance stack - typical problem areas at each layer:
Application - poorly written application, inefficient code;
SQL statements - SQL logic errors, missing indexes;
DB engine - database resource contention;
Operating system - file system parameter settings, kernel tuning,
I/O distribution;
Storage system - storage allocation errors, volume contention.
ICO-IMG-000040]
Front-end connectivity
Optimizing front-end connectivity requires an understanding of the
number and size of I/Os, both reads and writes, which will be sent
between the hosts and the Symmetrix DMX array. There are
limitations to the amount of I/O that each front-end director port,
each front-end director processor, and each front-end director board
can handle. Additionally, SAN fan-out counts (that is, the number of
hosts that can be attached through a Fibre Channel switch to a single
front-end port) need to be carefully managed.
A key concern when optimizing front-end performance is
determining which of the following I/O characteristics is more
important in the customer's environment:
[Figure: Relative front-end performance by block size (512 bytes to
64 KB) - small blocks favor I/Os per second, large blocks favor MB
per second. ICO-IMG-000042]
Configuring the host to send larger I/O sizes for DSS applications
can increase the overall throughput (MB/s) from the front-end
directors on the DMX. Database block sizes are generally larger
(16 KB or even 32 KB) for DSS applications. Sizing the host I/O as a
power-of-two multiple of DB_BLOCK_SIZE and tuning
DB_FILE_MULTIBLOCK_READ_COUNT appropriately is important for
maximizing performance in a customer's Oracle environment.
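A quick illustration of this sizing rule, with assumed values (the Oracle parameter in question is DB_FILE_MULTIBLOCK_READ_COUNT):

```python
# Hypothetical sizing check: Oracle's largest multiblock read is about
# DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT bytes; keeping the
# ratio a power of two aligns host I/O with database blocks and array
# track sizes.

def multiblock_io_bytes(db_block_size, multiblock_read_count):
    return db_block_size * multiblock_read_count

def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0

db_block_size = 16 * 1024   # 16 KB DSS-style block size (assumed)
mbrc = 8                    # power-of-two multiblock read count (assumed)

io = multiblock_io_bytes(db_block_size, mbrc)
print(io // 1024, "KB per multiblock read")  # 128 KB per multiblock read
assert is_power_of_two(io // db_block_size)
```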
Currently, each Fibre Channel port on the Symmetrix DMX is
theoretically capable of 200 MB/s of throughput. In practice, however,
the throughput available per port is significantly less and depends on
the I/O size and on the shared utilization of the port and processor
on the director. Increasing the size of the I/O from the host
perspective decreases the number of IOPS that can be performed, but
increases the overall throughput (MB/s) of the port. As such,
increasing the I/O block size on the host is beneficial for overall
performance in a DSS environment. Limiting total throughput to a
fraction of the theoretical maximum (100 to 120 MB/s is a good "rule
of thumb") will ensure that enough bandwidth is available for
connectivity between the Symmetrix DMX and the host.
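The rule of thumb above reduces to simple arithmetic: per-port throughput is IOPS times I/O size. A sketch with assumed workload numbers:

```python
# Rough front-end bandwidth check (assumed numbers): throughput per
# port is IOPS * I/O size, and the text suggests budgeting 100-120 MB/s
# of a 200 MB/s theoretical Fibre Channel port.

PORT_BUDGET_MB_S = 110  # conservative "rule of thumb" target (assumed)

def throughput_mb_s(iops, io_size_kb):
    return iops * io_size_kb / 1024.0

# Larger I/Os mean fewer IOPS but more MB/s through the same port:
print(throughput_mb_s(12800, 8))   # 8 KB OLTP-style I/O -> 100.0 MB/s
print(throughput_mb_s(800, 128))   # 128 KB DSS-style I/O -> 100.0 MB/s
assert throughput_mb_s(800, 128) <= PORT_BUDGET_MB_S
```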
Symmetrix cache
The Symmetrix cache plays a key role in improving I/O performance
in the storage subsystem. The cache improves performance by
allowing write acknowledgements to be returned to a host when data
is received in solid-state cache, rather than being fully destaged to the
physical disk drives. Additionally, reads benefit from cache when
sequential requests from the host allow follow-on reads to be
prestaged in cache. The following briefly describes how the
Symmetrix cache is used for writes and reads, and then discusses
performance considerations for it.
At some point, Enginuity will destage the write to physical disk. The
decision of when to destage is based on overall system load, physical
disk activity, read operations to the physical disk, and availability
of cache.
Cache is used to service the write operation to optimize the
performance of the host system. As write operations to cache are
significantly faster than physical writes to disk media, the write is
reported as complete to the host operating system much earlier.
Battery backup and priority destage functions within the Symmetrix
ensure that no data loss occurs in the event of system power failure.
If the write operation to a given disk is delayed due to higher priority
operations (read activity is one such operation), the write-pending
slot remains in cache for longer time periods. Cache slots are
allocated as needed to a volume for this purpose. Enginuity
calculates thresholds for allocations to limit the saturation of cache by
a single hypervolume. These limits are referred to as write-pending
limits.
Cache allocations are made on a per-hypervolume basis. As
write-pending thresholds are reached, additional allocations may
occur, as well as reprioritization of write activity. As a result, write
operations to the physical disks may increase in priority to ensure
that excessive cache allocations do not occur. This is discussed in
more detail in the next section.
Thus, the cache enables buffering of writes and allows for a steady
stream of write activity to service the destaging of write operations
from a host. In a "bursty" write environment, this serves to even out
the write activity. Should the write activity constantly exceed the low
write priority to the physical disk, Enginuity will raise the priority of
write operations to attempt to meet the write demand. Ultimately,
should write load from the host exceed the physical disk ability to
write, the volume maximum write-pending limit may be reached. In
this condition, new cache slots only will be allocated for writes to a
particular volume once a currently allocated slot is freed by destaging
it to disk. This condition, if reached, may severely impact write
operations to a single hypervolume.
Note: In the DMX-3, the cache slot size increases from 32 KB to 64 KB. Sectors
also increase from 4 KB to 8 KB.
From this, we see devices 434 and 435 have reached the device
write-pending limit of 14,157. Further analysis of the cause of the
excessive writes, and of methods to alleviate this performance
bottleneck on these devices, should be performed.
Alternatively, Performance Manager may be used to determine the
device write-pending limit, and whether device limits are being
reached. Figure 64 on page 305 is a Performance Manager view
displaying both the device write-pending limits and device
write-pending counts for a given device, in this example Symmetrix
device 055. For the Symmetrix in this example, the write-pending
slots per device was 9,776 and thus the max write-pending limit was
29,328 slots (3 * 9776). In general, a distinct flat line in such graphs
indicates that a limit is reached.
[Figure 64: Performance Manager view of Symmetrix device 055 - the
device write-pending count plotted against the maximum write-pending
threshold; a flat line at the threshold indicates the limit is being
reached. ICO-IMG-000043]
[Figure: Performance Manager view of write-pending counts for
devices 00E, 00F, and 011.]
[Figure: Metavolume versus hypervolume read and write I/Os per
second and transactions per second. ICO-IMG-000039]
Note that the number of cache boards also has a minor effect on
performance. When comparing Symmetrix DMX arrays with the
same amount of cache, increasing the number of boards (for example,
four cache boards with 16 GB each as opposed to two cache boards
with 32 GB each) has a small positive effect on performance in
DSS applications. This is due to the increased number of paths
between front-end directors and cache, and has the effect of
improving overall throughput. However, configuring additional
boards is only helpful in high-throughput environments such as DSS
applications. For OLTP workloads, where IOPS are more critical,
additional cache directors provide no added performance benefits.
This is because the number of IOPS per port or director is limited by
the processing power of CPUs on each board.
Back-end considerations
Back-end considerations are typically the most important part of
optimizing performance on the Symmetrix DMX. Advances in disk
technologies have not kept up with performance increases in other
parts of the storage array such as director and bandwidth (that is,
Direct Matrix versus Bus) performance. Disk-access speeds have
increased by a factor of three to seven in the last decade while other
components have easily increased one to three orders of magnitude.
As such, most performance bottlenecks in the Symmetrix DMX are
attributable to physical spindle limitations.
An important consideration for back-end performance is the number
of physical spindles available to handle the anticipated I/O load.
Each disk is capable of a limited number of operations. Algorithms in
the Symmetrix DMX Enginuity operating environment optimize
I/Os to the disks. Although this helps to reduce the number of reads
and writes to disk, access to disk, particularly for random reads, is
still a requirement. If an insufficient number of physical disks are
available to handle the anticipated I/O workload, performance will
suffer. It is critical to determine the number of spindles required for
an Oracle database implementation based on I/O performance
requirements, and not solely on the physical space considerations.
To reduce or eliminate back-end performance issues on the
Symmetrix DMX, carefully spread access to the disks across as many
back-end directors and physical spindles as possible. EMC has long
recommended for data placement of application data to "go wide
before going deep." This means that performance is improved by
spreading data across the back-end directors and disks, rather than
allocating individual applications to specific physical spindles.
Significant attention should be given to balancing the I/O on the
physical spindles. Understanding the I/O characteristics of each
datafile and separating high application I/O volumes on separate
physical disks will minimize contention and improve performance.
Implementing Symmetrix Optimizer may also help to reduce I/O
contention between hypervolumes on a physical spindle. Symmetrix
Optimizer identifies I/O contention on individual hypervolumes and
nondisruptively moves one of the hypers to a new location on
another disk. Symmetrix Optimizer is an invaluable tool in helping to
reduce contention on physical spindles should workload
requirements change in an environment.
Configuration recommendations
Key recommendations for configuring the Symmetrix DMX for
optimal performance include the following:
◆ Understand the I/O — It is critical to understand the
characteristics of the database I/O, including the number, type (read
or write), size, location (that is, data files, logs), and
sequentiality of the I/Os. Empirical data or estimates are needed to
assist in planning.
◆ Physical spindles — The number of disk drives in the DMX
should first be determined by calculating the number of I/Os
required, rather than solely based on the physical space needs.
The key is to ensure that the front-end needs of the applications
can be satisfied by the flow of data from the back end.
◆ Spread out the I/O — Both reads and writes should be spread
across the physical resources (front-end and back-end ports,
physical spindles, hypervolumes) of the DMX. This helps to
prevent bottlenecks such as hitting port or spindle I/O limits, or
reaching write-pending limits on a hypervolume.
◆ Bandwidth — A key consideration when configuring
connectivity between a host and the Symmetrix DMX is the
expected bandwidth required to support database activity. This
requires an understanding of the size and number of I/Os
between the host and the Symmetrix system. Connectivity
considerations for both the number of HBAs and Symmetrix
front-end ports is required.
RAID considerations
For years, Oracle has recommended that all database storage be
mirrored; their philosophy of stripe and mirror everywhere (SAME)
is well known in the Oracle technical community. While laying out
databases using SAME may provide optimal performance in most
circumstances, in some situations acceptable data performance (IOPS
or throughput) can be achieved by implementing more economical
RAID configurations such as RAID 5. Before discussing RAID
recommendations for Oracle, a definition of each RAID type available
in the Symmetrix DMX is required.
Types of RAID
The following RAID configurations are available on the Symmetrix
DMX:
◆ Unprotected - This configuration is not typically used in a
Symmetrix DMX environment for production volumes. BCVs
and occasionally R2 devices (used as target devices for SRDF) can
be configured as unprotected volumes.
◆ RAID 1 - These are mirrored devices and are the most common
RAID type in a Symmetrix DMX. Mirrored devices require writes
to both physical spindles. However, intelligent algorithms in the
Enginuity operating environment can use both copies of the data
to satisfy read requests not in the cache of the Symmetrix DMX.
RAID 1 offers optimal availability and performance, but at an
increased cost over other RAID protection options.
◆ RAID 5 - A relatively recent addition to Symmetrix data
protection (Enginuity 5670+), RAID 5 stripes parity information
across all volumes in the RAID group. RAID 5 offers good
performance and availability, at a decreased cost. Data is striped
using a stripe width of four tracks (128 KB on DMX-2 and 256 KB
on DMX-3). RAID 5 is configured either as RAID 5 3+1 (75%
usable) or RAID 5 7+1 (87.5% usable) configurations. Figure 67
shows the configuration for 3+1 RAID 5 while Figure 68 on
page 313 shows how a random write in a RAID 5 environment is
performed.
[Figure 67: RAID 5 3+1 configuration - data and parity striped in
four-track stripes across the four members. ICO-IMG-000083]
[Figure 68: A random write in a RAID 5 environment - new host data is
written to a cache data slot, XORed with the old data and old parity
to produce new parity in a cache parity slot, and the new data and
new parity are then destaged to disk. ICO-IMG-000045]
[Figure: Host writes placed into cache data slots before destage.
ICO-IMG-000527]
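The usable-capacity percentages quoted for each protection type reduce to data members divided by total members:

```python
# Usable-capacity arithmetic for the RAID levels above: usable
# fraction = data members / total members. RAID 1 mirrors give 50%,
# RAID 5 3+1 gives 75%, and RAID 5 7+1 gives 87.5%.

def usable_fraction(data_members, parity_or_mirror_members):
    total = data_members + parity_or_mirror_members
    return data_members / total

print(usable_fraction(1, 1))  # RAID 1:     0.5
print(usable_fraction(3, 1))  # RAID 5 3+1: 0.75
print(usable_fraction(7, 1))  # RAID 5 7+1: 0.875
```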
RAID recommendations
Oracle has long recommended RAID 1 over RAID 5 for database
implementations. This was largely attributed to RAID 5's historically
poor performance versus RAID 1 (due to software-implemented
RAID schemes) and also to high disk drive failure rates that
caused RAID 5 performance degradation after failures and during
rebuilds. However, disk drives and RAID 5 in general have seen
significant optimizations and improvements since Oracle initially
recommended avoiding RAID 5. In the Symmetrix DMX, Oracle
databases can be deployed on RAID 5 protected disks for all but the
most I/O-intensive applications. Databases used for
test, development, QA, or reporting are likely candidates for using
RAID 5 protected volumes.
Another potential candidate for deployment on RAID 5 storage is
DSS applications. In many DSS environments, read performance
greatly outweighs the need for rapid writes. This is because data
warehouses typically perform loads off-hours or infrequently (once a
week or month); read performance in the form of database user
queries is significantly more important. Since there is no RAID
penalty for RAID 5 read performance, only write performance, these
types of applications are generally good candidates for RAID 5
storage deployments. Conversely, production OLTP applications
typically require small random writes to the database, and as such,
are generally more suited to RAID 1 storage.
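The write penalty behind this recommendation can be sketched with the usual accounting (an assumed simple model that ignores cache coalescing and full-stripe writes): each RAID 1 host write costs two disk writes, while each RAID 5 random host write costs four disk I/Os (read old data, read old parity, write new data, write new parity).

```python
# Assumed back-end accounting for random host writes: RAID 1 mirrors
# each write to 2 disks; a RAID 5 read-modify-write costs 4 disk I/Os
# (read data, read parity, write data, write parity). Reads carry no
# RAID penalty in either case.

def backend_ios_per_write(raid):
    return {"raid1": 2, "raid5": 4}[raid]

def backend_write_load(host_write_iops, raid):
    """Disk I/Os per second generated on the back end by host writes."""
    return host_write_iops * backend_ios_per_write(raid)

print(backend_write_load(500, "raid1"))  # 1000 disk I/Os per second
print(backend_write_load(500, "raid5"))  # 2000 disk I/Os per second
```

This is why write-heavy OLTP workloads tend toward RAID 1 while read-dominated DSS workloads tolerate RAID 5 well.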
Symmetrix metavolumes
Individual Symmetrix hypervolumes of the same RAID type (RAID
1, RAID 5) may be combined together to form a virtualized device
called a Symmetrix metavolume. Metavolumes are created for a
number of reasons including:
◆ A desire to create devices that are greater than the largest
hypervolume available (in 5670 and 5671 Enginuity operating
environments, this is currently just under 31 GB per
hypervolume).
Host-based striping
Host-based striping is configured through the Logical Volume
Manager used on most open-systems hosts. For example, in an
HP-UX environment, striping is configured when logical volumes are
created in a volume group as shown below:
lvcreate -i 4 -I 64 -L 1024 -n stripevol activevg
In this case, the striped volume is called stripevol (using the -n
flag), is created on the volume group activevg, is of volume size
1 GB (-L 1024), uses a stripe size of 64 KB (-I 64), and is striped
across four physical volumes (-i 4). The specifics of striping data
at the host level
are operating-system-dependent.
Two important things to consider when creating host-based striping
are the number of disks to configure in a stripe set and an appropriate
stripe size. While no definitive answer can be given that optimizes
these settings for any given configuration, the following are general
guidelines to use when creating host-based stripes:
◆ Ensure that the stripe size used is a power of two multiple of the
track size configured on the Symmetrix DMX (that is, a multiple
of 32 KB on DMX-2 and 64 KB on DMX-3), the database, and host
I/Os. Alignment of database blocks, Symmetrix tracks, host I/O
size, and the stripe size can have considerable impact on database
performance. Typical stripe sizes are 64 KB to 256 KB, although
the stripe size can be as high as 512 KB or even 1 MB.
◆ Multiples of 4 physical devices for the stripe width are generally
recommended, although this may be increased to 8 or 16 as
required for LUN presentation or SAN configuration restrictions
as needed. Care should be taken with RAID 5 metavolumes to
ensure that members do not end up on the same physical
spindles (a phenomenon known as vertical striping), as this may
affect performance. In general, RAID 5 metavolumes are not
recommended.
◆ When configuring in an SRDF environment, smaller stripe sizes
(32 KB for example), particularly for the redo logs, are
recommended. This is to enhance performance in synchronous
SRDF environments due to the limit of having only one
outstanding I/O per hypervolume on the link.
◆ Data alignment (along block boundaries) can play a significant
role in performance, particularly in Windows environments.
Refer to operating-system-specific documentation to learn how to
align data blocks from the host along Symmetrix DMX track
boundaries.
◆ Ensure that volumes used in the same stripe set are located on
different physical spindles. Using volumes from the same
physicals reduces the performance benefits of using striping. An
exception to this rule is when RAID 5 devices are used in DSS
environments.
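The power-of-two alignment guideline above can be checked mechanically; a small sketch:

```python
# Alignment check for the guidelines above: a host stripe size should
# be a power-of-two multiple of the Symmetrix track size (32 KB on
# DMX-2, 64 KB on DMX-3).

def stripe_aligned(stripe_kb, track_kb):
    if stripe_kb % track_kb:
        return False
    ratio = stripe_kb // track_kb
    # a power of two has exactly one bit set
    return ratio > 0 and (ratio & (ratio - 1)) == 0

print(stripe_aligned(64, 32))   # True  (64 KB stripe on DMX-2)
print(stripe_aligned(256, 64))  # True  (256 KB stripe on DMX-3)
print(stripe_aligned(96, 32))   # False (3x the track size)
```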
Striping recommendations
Determining the appropriate striping method depends on many
factors. In general, striping is a tradeoff between manageability and
performance. With host-based striping, CPU cycles are used to
manage the stripes; Symmetrix metavolumes require no host cycles
to stripe the data. This small performance decrease in host-based
striping is offset by the fact that each device in a striped volume
group maintains an I/O queue, thereby increasing performance over
a Symmetrix metavolume, which only has a single I/O queue on the
host.
Recent tests show that striping at the host level provides somewhat
better performance than comparable Symmetrix-based striping, and
is generally recommended if performance is paramount. Host-based
striping is also recommended with environments using synchronous
SRDF, since stripe sizes in the host can be tuned to smaller increments
than are currently available with Symmetrix metavolumes, thereby
increasing performance.
Management considerations generally favor Symmetrix-based
metavolumes over host-based stripes. In many environments,
customers have achieved high-performance back-end layouts on the
Symmetrix system by allocating all of the storage as four-way striped
metavolumes. The advantage of this is any volume selected for host
data is always striped, with reduced chances for contention on any
given physical spindle. Storage subsequently added to a host volume
group is also striped, since it too is configured as a metavolume.
Management of added storage to an
existing volume group using host-based striping may be significantly
more difficult, requiring in some cases a full backup, reconfiguration
of the volume group, and restore of the data to successfully expand
the stripe.
An alternative in Oracle environments gaining popularity recently is
the combined use of both host-based and array-based striping.
Known as double striping or a plaid, this configuration utilizes
striped metavolumes in the Symmetrix array, which are then
presented to a volume group and striped at the host level. This has
many advantages in database environments where read access is
small and highly random in nature. Since I/O patterns are pseudo
random, access to data is spread across a large quantity of physical
spindles, thereby decreasing the probability of contention on any
given disk. Double striping, in some cases, can interfere with data
prefetching at the Symmetrix DMX level when large, sequential data
reads are predominant. This configuration may be inappropriate for
DSS workloads.
Another method of double striping the data is through the use of
Symmetrix metavolumes and RAID 5. A RAID 5 hypervolume stripes
data across either four or eight physical disks using a stripe size of
four tracks (128 KB for DMX-2 or 256 KB for DMX-3). Striped
metavolumes stripe data across two or more hypers using a stripe
size of two cylinders (960 KB in DMX-2 or 1920 KB in DMX-3). When
using striped metavolumes with RAID 5 devices, ensure that
members do not end up on the same physical spindles, as this will
adversely affect performance. In many cases however, double
striping using this method also may affect prefetching for long,
sequential reads. As such, using striped metavolumes is generally not
recommended in DSS environments. Instead, if metavolumes are
needed for LUN presentation reasons, concatenated metavolumes on
the same physical spindles are recommended.
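The stripe sizes quoted above are consistent with a 15-tracks-per-cylinder hypervolume geometry; that track count is an assumption inferred from the numbers in the text:

```python
# Checking the stripe-size figures quoted above, assuming 15 tracks
# per cylinder: a RAID 5 stripe is four tracks, and a striped-
# metavolume stripe is two cylinders.

TRACKS_PER_CYLINDER = 15  # assumed DMX hypervolume geometry

def raid5_stripe_kb(track_kb):
    return 4 * track_kb  # four-track RAID 5 stripe

def meta_stripe_kb(track_kb):
    return 2 * TRACKS_PER_CYLINDER * track_kb  # two-cylinder meta stripe

print(raid5_stripe_kb(32), raid5_stripe_kb(64))  # 128 256  (DMX-2, DMX-3)
print(meta_stripe_kb(32), meta_stripe_kb(64))    # 960 1920 (DMX-2, DMX-3)
```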
The decision of whether to use host-based, array-based, or double
striping in a storage environment has elicited considerable fervor on
all sides of the argument. While each configuration has positive and
negative factors, the important thing is to ensure that some form of
striping is used for the storage layout. The appropriate layer for disk
striping can have a significant impact on the overall performance and
manageability of the database system. Deciding which form of
striping to use depends on the specific nature and requirements of the
database environment in which it is configured.
With the advent of RAID 5 data protection in the Symmetrix DMX, an
additional option of triple striping data using RAID 5, host-based
striping, and metavolumes combined is now available. However,
triple striping increases data layout complexity, and in testing has
shown no performance benefits over other forms of striping. In fact, it
is shown to be detrimental to performance and as such, is not
recommended in any Symmetrix DMX configuration.
◆ Rotational Speed - Delay occurs because the platter must rotate
underneath the head to position the data to be accessed.
Rotational speeds for spindles in the Symmetrix DMX range from
7,200 to 15,000 rpm. The average rotational delay is the time it
takes for half of a revolution of the disk; for a 15,000 rpm drive,
this is about 2 milliseconds.
◆ Interface Speed - A measure of the transfer rate from the drive
into the Symmetrix cache. It is important to ensure that the
transfer rate between the drive and cache is greater than the rate
at which the drive delivers data. The delay this introduces is
typically very small, on the order of a fraction of a millisecond.
◆ Areal Density - A measure of the number of bits of data that fit on
a given surface area of the disk. The greater the density, the more
data per second can be read from the disk as it passes under the
disk head.
◆ Cache Capacity and Algorithms - Newer disk drives have
improved read and write algorithms, as well as cache, in order to
improve the transfer of data in and out of the drive and to make
parity calculations for RAID 5.
Delay caused by the movement of the disk head across the platter
surface is called seek time. The time associated with a data track
rotating to the required location under the disk head is referred to as
rotational delay. The cache capacity on the drive, disk algorithms,
interface speed, and areal density (or zoned bit recording) combine
to determine the disk transfer time. Therefore, the time taken to
complete an I/O (or disk latency) consists of three elements: seek
time, rotational delay, and transfer time.
Data transfer times are typically on the order of fractions of a
millisecond and as such, rotational delays and delays due to
repositioning the actuator heads are the primary sources of latency on
a physical spindle. Additionally, rotational speeds of disk drives have
increased from top speeds of 7,200 rpm up to 15,000 rpm, but still
average on the order of a few milliseconds. The seek time continues
to be the largest source of latency in disk assemblies when using the
entire disk.
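The latency arithmetic above can be sketched directly. Only the rotational delay follows from the rpm; the seek and transfer figures in the example are illustrative assumptions:

```python
def avg_rotational_delay_ms(rpm):
    # average rotational delay = time for half a revolution
    return 0.5 * 60_000.0 / rpm

def disk_latency_ms(seek_ms, rpm, transfer_ms):
    # total I/O latency = seek time + rotational delay + transfer time
    return seek_ms + avg_rotational_delay_ms(rpm) + transfer_ms

print(avg_rotational_delay_ms(15_000))    # 2.0 ms, as stated in the text
print(disk_latency_ms(3.5, 15_000, 0.4))  # assumed 3.5 ms seek, 0.4 ms transfer
```

As the numbers show, seek time and rotational delay dominate the fractional-millisecond transfer time.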
Transfer delays are lengthened in the inner parts of the drive; more
data can be read per second on the outer parts of the drive than by
data located on the inner regions. Therefore, performance is
significantly better on the outer parts of the disk; improvements of
more than 50 percent can be realized on the outer cylinders of a
physical spindle. This performance differential typically leads
customers to place high-I/O objects on the outer portions of the
drive.
While placing high I/O objects such as redo logs on the outer edges
of the spindles has merit, performance differences across the drives
inside the Symmetrix DMX are significantly smaller than the
stand-alone disk characteristics would attest. Enginuity operating
(Figure: factors in physical disk performance include actuator positioning, cache and algorithms, and interface speed.)
Hypervolume contention
Disk drives can receive only a limited number of read or write I/Os
before performance degradation occurs. While disk improvements
and cache, both on the physical drives and in disk arrays, have
improved disk read and write performance, the physical devices can
still become a critical bottleneck in Oracle database environments.
Eliminating contention on the physical spindles is a key factor in
ensuring maximum Oracle performance on Symmetrix DMX arrays.
Contention can occur on a physical spindle when I/O (read or write)
to one or more hypervolumes exceeds the I/O capacity of the disk.
While contention on a physical spindle is undesirable, this type of
contention can be rectified by migrating high-I/O data onto other
devices with lower utilization. BCVs can either share physical
spindles with the production volumes or be isolated on separate
physical disks. There are pros and cons to each of these solutions;
the optimal solution generally depends on the anticipated workload.
The primary benefit of spreading BCVs across all physical spindles is
performance. Spreading I/Os across more spindles reduces the risk
of bottlenecks on the physical disks. Workloads that use BCVs, such
as backups and reporting databases, may generate high I/O rates.
Spreading this workload across more physical spindles may
significantly improve performance in these environments.
The main drawbacks to spreading BCVs across all spindles in the
Symmetrix system are:
◆ Synchronization may cause spindle contention during
resynchronization.
◆ BCV workloads may negatively impact production database
performance.
When resynchronizing the BCVs, data is read from the production
hypers and copied into cache. From there it is destaged to the BCVs.
When the physical disks share production and BCVs, the
synchronization rates can be greatly reduced because of increased
seek times due to the conflict between reading from one part of the
disk and writing to another. The other drawback to sharing physical
disks is the increased workload on the spindles that may impact
performance on the production volumes. Sharing the spindles
increases the chance that contention may arise, decreasing database
performance.
Determining the appropriate location for BCVs (either sharing the
same physical spindles or isolated on their own disks) depends on
customer preference and workload. In general, BCVs should share
the same physical spindles. However, in cases where the BCV
synchronization and utilization may negatively impact applications
(for example, databases that run 24x7 with high I/O requirements), it
may be beneficial for the BCVs to be isolated on their own physical
disks.
DB_BLOCK_BUFFERS: Specifies the number of data "pages" available in
host memory for data pulled from disk. Typically, the more block
buffers available in memory, the better the potential performance of
the database.

DB_BLOCK_SIZE: Determines the size of the data pages Oracle stores in
memory and on disk. For DSS applications, larger block sizes such as
16 KB (or 32 KB where available) improve data throughput, while for
OLTP applications a 4 KB or 8 KB block size may be more appropriate.

DB_FILE_MULTIBLOCK_READ_COUNT: Specifies the maximum number of blocks
that can be read in a single sequential read I/O. For OLTP
environments, this parameter should be set to a low value (4 or 8,
for example). For DSS environments, where long sequential data scans
are normal, it should be increased to match the maximum host I/O size
(or more) to optimize throughput.

DB_WRITER_PROCESSES: Specifies the number of DBWR processes initially
started for the database. Increasing the number of DBWR processes can
improve writes to disk through multiplexing if multiple CPUs are
available in the host.

DBWR_IO_SLAVES: Configures multiple I/O server processes for the DBW0
process. This parameter is only used on single-CPU servers where only
a single DBWR process is enabled. Configuring I/O slaves can improve
write performance to disk by multiplexing the writes.

DISK_ASYNCH_IO: Controls whether I/O to Oracle structures such as
data files, log files, and control files is performed asynchronously.
If asynchronous I/O is available on the host platform, asynchronous
I/O to the datafiles has a positive effect on I/O performance.

LOG_BUFFER: Specifies the size of the redo log buffer. Increasing the
size of this buffer can decrease the frequency of required writes to
disk.

LOG_CHECKPOINT_INTERVAL: Specifies the number of redo log blocks that
can be written before a checkpoint must be performed. This affects
performance, since a checkpoint requires that data be written to disk
to ensure consistency. Frequent checkpoints reduce the amount of
recovery needed if a crash occurs but can also be detrimental to
Oracle performance.

SORT_AREA_SIZE: Specifies the maximum amount of memory that Oracle
will use to perform sort operations. Increasing this parameter
decreases the likelihood that a sort will be performed in a temporary
tablespace on disk. However, it also increases the memory
requirements on the host.
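As a hedged illustration, these parameters might appear together in an init.ora fragment as follows; the values are placeholders for a DSS-leaning workload, not recommendations, and should be tuned to the specific environment:

```
# Illustrative init.ora fragment (placeholder values only; tune to the workload)
db_block_buffers = 8192                # data "pages" cached in host memory
db_block_size = 16384                  # 16 KB blocks suit DSS-style throughput
db_file_multiblock_read_count = 32     # raised for long sequential scans
db_writer_processes = 4                # multiple DBWR processes on a multi-CPU host
# dbwr_io_slaves = 4                   # single-CPU alternative to multiple DBWRs
disk_asynch_io = TRUE                  # use async I/O where the platform supports it
log_buffer = 1048576                   # larger redo buffer, fewer forced writes
log_checkpoint_interval = 10000        # redo blocks between checkpoints
sort_area_size = 1048576               # in-memory sort space per session
```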
◆ How many data files for the database will be created? Which have
the highest I/O activity?
◆ What are the availability requirements for the database?
◆ Will a cluster be deployed? How many nodes? Single instance?
RAC?
◆ How many data paths are required from the host to the storage
array? Will multipathing software be used?
◆ How will the host connect to the storage array (direct attach,
SAN)?
◆ Which is more important: IOPS or throughput? How much of
each are anticipated?
◆ What kind of database is planned (DSS, OLTP, a combination of
the two)?
◆ What types of I/Os are anticipated from the database (long
sequential reads, small bursts of write activity, a mix of reads and
writes)?
◆ How will backups be handled? Will replication (host or storage
based) be used?
Answers to these questions determine the configuration and layout
of the proposed database. The key to the layout process is a complete
understanding of the characteristics and requirements of the database
to be implemented. Of particular importance when planning a database
layout are the I/O characteristics of the various database objects.
This information is collected and documented, and constitutes the key
deliverable for the next phase of the database layout on a Symmetrix
project.
In some cases, the databases to be deployed already exist in a
production environment. In these cases, it is easy to understand the
I/O characteristics of the various underlying database structures
(tablespaces, data files, tables, and so on). Various tools for gathering
performance statistics include Oracle StatsPack, EMC Performance
Manager, host-based utilities including sar and iostat, and third-party
analyzers (such as Quest Central). Performance statistics such as
reads and writes are determined for database objects. These statistics
are then used to determine the required number of physical spindles,
the number of I/O paths between the host and the storage array, and
the Symmetrix configuration.
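For instance, a back-of-the-envelope spindle count can be derived from the collected statistics. The per-disk IOPS figure and the RAID 5 write penalty below are illustrative assumptions, not Symmetrix specifications:

```python
import math

def spindles_needed(read_iops, write_iops, per_disk_iops=180, raid5_write_penalty=4):
    """Rough spindle sizing: RAID 5 turns each host write into ~4 back-end I/Os."""
    backend_iops = read_iops + write_iops * raid5_write_penalty
    return math.ceil(backend_iops / per_disk_iops)

# a measured workload of 3,600 host reads/s and 400 host writes/s
print(spindles_needed(3600, 400))  # 29
```

The same arithmetic extends naturally to sizing the number of front-end paths once the anticipated throughput per path is known.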
Implementation
The implementation phase takes the planned database layout from
the preceding step and implements it into the customer's
environment. The host is presented with the documented storage.
Volume groups and file systems are created as required and database
elements are initialized as planned. This phase of the process is
normally short and relatively straightforward to complete if the
prior steps were performed and documented well.
Data Protection
(Sample output of the symchksum list command, titled "DEVICES WITH CHECKSUM EXTENTS," showing for each device its device name, Symmetrix device number, number of extents, block size, device type, and the action-check flags.)
Note: When FF or power down occurs, extents are lost. Run the symchksum
enable command again.
Disabling checksum
The symchksum disable command understands the Oracle
database structure. The feature can be disabled for tablespaces,
control files, redo logs, or the entire database.
The symchksum disable command also is used on a device basis.
This capability is not normally used, but is provided in the event the
tablespace was dropped before EMC Double Checksum was disabled
for that object.
When the disable action is specified for a Symmetrix device, the
-force flag is required. Disabling extents in this way can cause a
mapped tablespace or database to be only partially protected,
therefore, use this option with caution. All the extents monitored for
checksum errors on the specified Symmetrix device will be disabled.
Why generic?
Generic SafeWrite is deemed generic because the checks performed to
ensure complete data are application independent. For instance,
Generic SafeWrite will not perform any Oracle- or Exchange-specific
checksums to verify data integrity. It is important to note that for
Oracle, EMC Double Checksum for Oracle provides a rich set of
checks which can be natively performed by the Symmetrix array. For
more information on EMC Double Checksum for Oracle, consult
“Implementing EMC Double Checksum for Oracle” on page 342.
Note: It is always a best practice to separate the location of database files and
log files for a given database onto unique devices. There are cases, however,
where the datafile and log file may share the same device. In this case, it is
still possible to have GSW enabled; however, there will be a performance
impact to the log writes that may impact application performance.
Performance considerations
Performance testing was done with Microsoft Exchange, Microsoft
SQL Server and Oracle on standard devices, and in the case of
Microsoft Exchange, also on SRDF/S and SRDF/A devices. For the
Microsoft SQL Server and Oracle performance tests, a TPCC
To enable Checksum on the extents of all the devices that define the
current database instance and then to phone home on error, enter:
symchksum enable -type Oracle -phone_home
To enable Checksum on the extents of all the devices that define the
tablespace and then to log on error, enter:
symchksum enable -type Oracle -tbs SYSTEM
Overview
The EMC Symmetrix VMAX series with Enginuity is the newest
addition to the Symmetrix product family. Built on the strategy of
simple, intelligent, modular storage, it incorporates a new scalable
Virtual Matrix interconnect that connects all shared resources across
all VMAX Engines, allowing the storage array to grow seamlessly
and cost-effectively from an entry-level configuration into the
world’s largest storage system. The Symmetrix VMAX provides
improved performance and scalability for demanding enterprise
storage environments while maintaining support for EMC’s broad
portfolio of platform software offerings.
EMC Symmetrix VMAX delivers enhanced capability and flexibility
for deploying Oracle databases throughout the entire range of
business applications, from mission-critical applications to test and
development. In order to support this wide range of performance
and reliability at minimum cost, Symmetrix VMAX arrays support
multiple drive technologies that include Enterprise Flash Drives
(EFDs), Fibre Channel (FC) drives, both 10k rpm and 15k rpm, and
7,200 rpm SATA drives. In addition, various RAID protection
mechanisms are allowed that affect the performance, availability, and
economic impact of a given Oracle system deployed on a Symmetrix
VMAX array.
As companies increase deployment of multiple drive and protection
types in their high-end storage arrays, storage and database
administrators are challenged to select the correct storage
configuration for each application. Often, a single storage tier is
selected for all data in a given database, effectively placing both
active and idle data portions on fast FC drives. This approach is
expensive and inefficient, because infrequently accessed data will
reside unnecessarily on high-performance drives.
Alternatively, making use of high-density low-cost SATA drives for
the less active data, FC drives for the medium active data, and EFDs
for the very active data enables efficient use of storage resources, and
reduces overall cost and the number of drives necessary. This, in turn,
also helps to reduce energy requirements and floor space, allowing
the business to grow more rapidly.
Database systems, due to the nature of the applications that they
service, tend to direct the most significant workloads to a relatively
small subset of the data stored within the database and the rest of the
database is less frequently accessed. The imbalance of I/O load
Storage Tiering—Virtual LUN and FAST
number of drives, and improve the total cost of ownership (TCO) and
ROI. FAST VP enables users to achieve these objectives while
simplifying storage management.
This chapter describes Symmetrix Virtual Provisioning, a tiered
storage architecture approach for Oracle databases, and the way in
which devices can be moved nondisruptively, using Virtual LUN, FAST
(for traditional thick devices), or FAST VP (for virtually
provisioned devices), in order to put the right data on the right
storage tier at the right time.
Introduction
Symmetrix Virtual Provisioning, the Symmetrix implementation of
what is commonly known in the industry as “thin provisioning,”
enables users to simplify storage management and increase capacity
utilization by sharing storage among multiple applications and only
allocating storage as needed from a shared “virtual pool” of physical
disks.
Symmetrix thin devices are logical devices that can be used in many
of the same ways that Symmetrix standard devices have traditionally
been used. Unlike traditional Symmetrix devices, thin devices do not
need to have physical storage preallocated at the time the device is
created and presented to a host (although in many cases customers
interested only in wide striping and ease of management choose to
fully preallocate the thin devices). A thin device is not usable until it
has been bound to a shared storage pool known as a thin pool.
Multiple thin devices may be bound to any given thin pool. The thin
pool is comprised of devices called data devices that provide the
actual physical storage to support the thin device allocations.
When a write is performed to a part of any thin device for which
physical storage has not yet been allocated, the Symmetrix allocates
physical storage from the thin pool for that portion of the thin device
only. The Symmetrix operating environment, Enginuity, satisfies the
requirement by providing a block of storage from the thin pool called
a thin device extent. This approach reduces the amount of storage
that is actually consumed.
The minimum amount of physical storage that can be reserved at a
time for the dedicated use of a thin device is referred to as a data
device extent. The data device extent is allocated from any one of the
data devices in the associated thin pool. Figure 73 shows an
example: thin devices associated with thin pool A and three thin
devices associated with thin pool B, with the data extents for the
thin devices distributed across the various data devices in each
pool.
(Figure 73: thin devices bound to thin pools A and B, with their data extents spread across each pool's data devices.)
The way thin extents are allocated across the data devices results in a
form of striping in the thin pool. The more data devices in the thin
pool (and the associated physical drives behind them), the wider
striping will be, creating an even I/O distribution across the thin
pool. Wide striping simplifies storage management by reducing the
time required for planning and execution of data layout.
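A minimal sketch of this effect follows; the round-robin placement is an illustrative simplification, not Enginuity's actual allocator:

```python
from collections import Counter

def allocate_extents(n_extents, data_devices):
    """Place thin device extents round-robin across the pool's data devices."""
    return Counter(data_devices[i % len(data_devices)] for i in range(n_extents))

# 120 thin device extents over 8 data devices land evenly: 15 extents per device
counts = allocate_extents(120, [f"data_dev_{i}" for i in range(8)])
print(set(counts.values()))  # {15}
```

Adding more data devices to the pool widens the stripe the same way: the extents simply spread across more devices (and the physical drives behind them).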
(metadata) to each initialized block. This will cause the thin pool to
allocate the amount of space that is being initialized by the database.
As database files are added, more space will be allocated in the pool.
Due to Oracle file initialization, and in order to get the most benefit
from a Virtual Provisioning infrastructure, a strategy for sizing files,
pools, and devices should be developed in accordance with
application and storage management needs. Some strategy options
are explained next.
Oversubscription
An oversubscription strategy is based on using thin devices with a
total capacity greater than the physical storage in the pools that they
are bound to. This can increase capacity utilization by sharing storage
among applications, thereby reducing the amount of allocated but
unused space. The thin devices each seem to be a full-size device to
the application, while in fact the thin pool cannot accommodate the
total LUNs’ capacity. Since Oracle database files initialize their space
even though they are still empty, it is recommended that instead of
creating very large data files that remain largely empty for most of
their lifetime, smaller data files should be considered to
accommodate near-term data growth. As they fill up over time, their
size can be increased, or more data files added, in conjunction with
the capacity increase of the thin pool. The Oracle auto-extend feature
can be used for simplicity of management, or DBAs may prefer to use
manual file size management or addition.
An oversubscription strategy is recommended for database
environments when database growth is controlled, and thin pools can
be actively monitored and their size increased when necessary in a
timely manner.
Undersubscription
An undersubscription strategy is based on using thin devices with a
total capacity smaller than the physical storage in the pools that they
are bound to. This approach does not necessarily improve storage
capacity utilization but still makes use of wide striping, thin pool
sharing, and other benefits of Virtual Provisioning. In this case the
data files can be sized to make immediate use of the full thin device
size, or alternatively, auto-extend or manual file management can be
used.
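The distinction between the two strategies can be expressed as a simple ratio; this is a sketch, and real monitoring would also track actual pool allocation over time:

```python
def subscription_ratio(thin_device_gb, pool_gb):
    """Ratio > 1 means oversubscribed: presented capacity exceeds pool capacity."""
    return sum(thin_device_gb) / pool_gb

print(subscription_ratio([500, 500, 500], 1000))  # 1.5 -> oversubscribed
print(subscription_ratio([200, 200], 1000))       # 0.4 -> undersubscribed
```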
them into one pool, the pool has eight RAID 5 devices of four drives
each. If one of the drives in this pool fails, you are not losing one
drive from a pool of 32 drives; rather, you are losing one drive from
one of the eight RAID-protected data devices and that RAID group
can continue to service read and write requests, in degraded mode,
without data loss. Also, as with any RAID group, with a failed drive
Enginuity will immediately invoke a hot sparing operation to restore
the RAID group to its normal state. While this RAID group is
rebuilding, any of the other RAID groups in the thin pool can have a
drive failure and there is still no loss of data. In this example, with
eight RAID groups in the pool there can be one failed drive in each
RAID group in the pool without data loss. In this manner data stored
in the thin pool is no more vulnerable to data loss than any other data
stored on similarly configured RAID devices. Therefore, a protection
of RAID 1 or RAID 5 for thin pools is acceptable for most applications
and RAID 6 is only required in situations where additional parity
protection is warranted.
The number of thin pools is affected by a few factors. The first is the
choice of drive type and RAID protection. Each thin pool is a group of
data devices sharing the same drive type and RAID protection. For
example, a thin pool that consists of multiple RAID 5 protected data
devices based on 15k rpm FC disks can host the Oracle data files,
providing a good balance of capacity and performance. However, very
often the redo logs that take relatively small capacity are best
protected using RAID 1 and therefore another thin pool containing
RAID 1 protected data devices can be used. In order to ensure
sufficient spindles behind the redo logs the same set of physical
drives that is used for the RAID 5 pool can also be used for the RAID
1 thin pool. Such sharing at the physical drive level, but separation at
the thin pool level, allows efficient use of drive capacity without
compromising on the RAID protection choice. Oracle Fast Recovery
Area (FRA), for example, can be placed in a RAID 6 protected SATA
drive’s thin pool.
Therefore the choice of the appropriate drive technology and RAID
protection is the first factor in determining the number of thin pools.
The other factor has to do with the business owners. When
applications share thin pools they are bound to the same set of data
devices and spindles, and they share the same overall thin pool
capacity and performance. If business owners require their own
control over thin pool management they will likely need a separate
set of thin pools based on their needs. In general, however, for ease of
requirements of the +GRID ASM disk group are tiny, very small
devices can be provisioned (High redundancy implies three
copies/mirrors and therefore a minimum of three devices is
required).
◆ +DATA, +LOG: While separating data and log files to two
different ASM disk groups is optional, EMC recommends it in the
following cases:
• When TimeFinder is used to create a clone (or snap) that is a
valid backup image of the database. The TimeFinder/Clone
image can serve as a source for RMAN backup to tape, and/or
can be opened for reporting (read-only), and so on. However
the importance of such a clone image is that it is a valid full
backup image of the database. If the database requires media
recovery, restoring the TimeFinder/Clone back to production
takes only seconds, regardless of the database size. This is a
huge saving in RTO: within a few seconds, archive logs can start
being applied as part of the media recovery roll forward. When such
a clone does not exist, the initial backup
set has to be first restored from tape/VTL prior to applying
any archive log, which can add a significant amount of time to
recovery operations. Therefore, when TimeFinder is used to
create a backup image of the database, in order for the restore
to not overwrite the online logs, they should be placed in
separate devices and a separate ASM disk group.
• Another reason for separation of data from log files is
performance and availability. Redo log writes are synchronous
and must complete in the least amount of time. By placing them
on separate storage devices, the commit writes do not have to
share the LUN I/O queue with large asynchronous buffer cache
checkpoint I/Os. Having the logs on their own devices also makes
it possible to use one RAID protection for the data files (such as
RAID 5) and another for the logs (such as RAID 1).
◆ +TEMP: When storage replication technology is used for disaster
recovery, like SRDF/S, it is possible to save bandwidth by not
replicating temp files. Since temp files are not part of a recovery
operation and quick to add, having them on separate devices
allows bandwidth saving, but adds to the operations of bringing
up the database after failover. While it is not required to separate
temp files, it is an option and the DBA may choose to do it
anyway for performance isolation reasons if that is their best
practice.
◆ +FRA: Fast Recovery Area typically hosts the archive logs and
sometimes flashback logs and backup sets. Since the I/O
operations to FRA are typically sequential writes, it is usually
sufficient to have it located on a lower tier such as SATA drives. It
is also an Oracle recommendation to have FRA as a separate disk
group from the rest of the database to avoid keeping the database
files and archive logs or backup sets (that protect them) together.
(Figure: the ‘+Sales’ ASM disk group with 20 x 50 GB ASM members.)
Figure 75 Migration of ASM members from FC to EFDs using Enhanced Virtual LUN
technology
The target devices for the migration can be chosen from configured
space or new devices can be automatically configured by migrating
to unconfigured space.
Steps:
1. Migrating device 790 from a RAID 1 (FC) to RAID 5 (3+1) on EFD
configured as 1FD7.
Steps:
1. Migrating device 790 from a RAID 1 (FC) to RAID 5 (EFD) pool.
2. Configuration lock is taken.
3. The RAID 5 mirror is created from unconfigured space and added
as the secondary mirror.
4. Configuration lock is released.
5. The secondary mirror is synchronized from the primary mirror.
6. Once synchronization is done, the configuration lock is taken
again.
7. Primary and secondary roles are switched and the original
primary mirror is detached from the source and moved to the
target device 1FD7.
8. The original primary mirror on RAID 1 (FC) is deleted.
9. Configuration lock is released.
FAST VP Elements
FAST VP has three main elements—storage tiers, storage groups, and
FAST policies—as shown in Figure 78.
FAST VP architecture
There are two components of FAST VP: Symmetrix microcode and
the FAST controller.
The file system will traditionally host multiple data files, each
containing database objects in which some will tend to be more active
than others as discussed earlier, creating I/O access skewing at a
sub-LUN level.
Since such events are usually short term and touch each dataset
only once, it is unlikely (and not desirable) for FAST VP to migrate
data at the same time; it is best to simply let the storage handle
the workload appropriately. If the event is expected to last a
longer period of time (such as hours or days), then FAST VP, being a
reactive mechanism, will actively optimize the storage allocation as
it does natively.
Test environment
This section describes the hardware, software, and database
configuration used for Oracle databases and FAST VP test cases as
seen in Table 14.
(Table 14 highlights: Enginuity 5875; EFD tier of 8 x 400 GB drives.)
Table 15 Initial tier allocation for test cases with shared ASM disk group
Databases: FINDB & HRDB. ASM disk group: +DATA. Thin devices: 12 x 100 GB. Storage group: DATA_SG. RAID protection: RAID 5. Thin pools (tier, initial allocation): FC_Pool (FC, 100%), EFD_Pool (EFD, 0%), SATA_Pool (SATA, 0%).
One server was used for this test. Each of the Oracle databases was
identical in size (about 600 GB) and designed for an
industry-standard OLTP workload. However, during this test one
database had high activity whereas the other database remained idle
to provide a simple example of the behavior of FAST VP.
Initial allocation for FINDB (600 GB) and HRDB (600 GB) on +DATA/DATA_SG: EFD 0% (0), FC 100% (1.2 TB), SATA 0% (0).
Figure 83 Storage tier allocation changes during the FAST VP test for FINDB
Table 19 FAST VP enabled database response time from the AWR report
Test Case 2: Oracle databases sharing the ASM disk group and FAST policy
Oracle ASM makes it easy to provision and share devices across
multiple databases. The databases, running different workloads, can
share the ASM disk group for ease of manageability and
provisioning. Multiple databases can share the Symmetrix thin pools
for ease of provisioning, wide striping, and manageability at the
storage level as well. This section describes the test case in which a
FAST VP policy is applied to the storage group associated with the
shared ASM disk group. At the end of the run we can see improved
transaction rates and response times of both databases, and very
efficient usage of the available tiers.
Initial allocation for FINDB (600 GB) and HRDB (600 GB) on +DATA (1.2 TB): EFD 0% (0), FC 100% (1.2 TB), SATA 0% (0).
Figure 85 Storage tier changes during FAST VP enabled run on two databases
Test Case 3: Oracle databases on separate ASM disk groups and FAST policies
Not all the databases have the same I/O profile or SLA requirements
and may also warrant different data protection policies. By deploying
the databases with different profiles on separate ASM disk groups,
administrators can achieve the desired I/O performance and ease of
manageability. On the storage side these ASM disk groups will be on
separate storage groups to allow for definition of FAST VP policies
appropriate for the desired performance. This section describes a use
case with two Oracle databases with different I/O profiles on
separate ASM disk groups and independent FAST policies.
The hardware configuration of this test was the same as the previous
two use cases (as shown in Table 14). This test configuration had two
Oracle databases—CRMDB (CRM) and SUPCHDB (Supply
Chain)—on separate ASM disk groups, storage groups, and FAST VP
policies, as shown in Table 23.
Table 23 Initial tier allocation for a test case with independent ASM disk groups
The Symmetrix VMAX array had a mix of storage tiers: EFD, FC, and
SATA. One server was used for this test. Each of the Oracle databases
was identical in size (about 600 GB) and designed for an
industry-standard OLTP workload.
The Oracle databases CRMDB and SUPCHDB used independent
ASM disk groups based on thin devices that were initially bound to
FC_Pool (FC tier).
The CRMDB database in this configuration was part of a customer
relationship management system that was critical to the business. To
achieve higher performance, the FAST VP policy “GoldPolicy” was
defined to make use of all three available storage tiers, and the
storage group OraDevices_C1 was associated with the policy.
The SUPCHDB database was important to the business and had
adequate performance characteristics; the business would benefit if
that performance level could be maintained at lower cost. To meet this
goal, the FAST VP policy “SilverPolicy” was defined to make use of
only the FC and SATA tiers, and the storage group OraDevices_S1
was associated with the policy.
(AWR table columns for CRMDB and SUPCHDB: Event, Waits, Time (s), Avg wait (s), %DB time, Wait class)
For SUPCHDB, our goal was to lower the cost while maintaining or
improving the performance. The FAST VP Silver policy was defined
to allocate the extents across FC and SATA drives to achieve this goal.
The Silver policy allows a maximum of 50 percent allocation on the
FC tier and up to 100 percent allocation on the SATA tier.
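To make the cap arithmetic concrete, here is a minimal shell sketch that checks a hypothetical allocation against the Silver policy caps; the allocation percentages are illustrative assumptions, not measured values from the test.

```shell
# Hypothetical per-tier allocation (percent of storage group capacity)
# after a FAST VP rebalance; illustrative values only.
fc_alloc=45
sata_alloc=55
# SilverPolicy caps from the text: max 50% on FC, up to 100% on SATA.
fc_max=50
sata_max=100
compliant=yes
[ "$fc_alloc" -le "$fc_max" ] || compliant=no
[ "$sata_alloc" -le "$sata_max" ] || compliant=no
echo "$compliant"
```

A storage group whose FC allocation crept above 50 percent would be out of compliance, and the FAST VP compliance algorithm would move extents down to the SATA tier.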
Running the database workload after enabling the FAST VP policy
The database workload was repeated after enabling the FAST VP
policy. FAST VP collected statistics, analyzed them, and performed
the extent movements following the performance and compliance
algorithms. The AWR reports for both databases were generated to
review the I/O response times as shown in Table 25.
(Table 25: AWR table columns for CRMDB and SUPCHDB: Event, Waits, Time (s), Avg wait (s), %DB time, Wait class; chart ICO-IMG-000923)
Table 26 Storage tier allocation changes during the FAST VP-enabled run
Introduction
Businesses use multiple databases in environments that serve DSS
and OLTP application workloads. Even though multiple levels of
cache exist in the database I/O stack, including host cache, database
server cache, and Symmetrix cache, disk response time is at times
critical to application performance. Selecting the correct storage class
for the various database objects is a challenge, and a storage selection
that works in one situation may not be optimal in others. Jobs
executed at periodic intervals or on an ad hoc basis, such as
quarter-end batch jobs, demand a high degree of performance and
availability and make disk selection and data placement even more
challenging. As the size and number of databases grow, analyzing
the performance of the various databases, identifying bottlenecks,
and selecting the right storage tier for each of them turns into a
daunting task.
Introduced in the Enginuity 5874 Q4 2009 service release, EMC
Symmetrix VMAX Fully Automated Storage Tiering (FAST) is
Symmetrix software that utilizes intelligent algorithms to
continuously analyze device I/O activity and generate plans for
moving and swapping devices for the purposes of allocating or
re-allocating application data across different performance storage
tiers within a Symmetrix array. FAST proactively monitors workloads
at the Symmetrix device (LUN) level in order to identify “busy”
devices that would benefit from being moved to higher-performing
drives such as EFD. FAST will also identify less “busy” devices that
could be relocated to higher-capacity, more cost-effective storage such
as SATA drives without altering performance.
Time windows can be defined to specify when FAST should collect
performance statistics (upon which the analysis to determine the
appropriate storage tier for a device is based), and when FAST should
perform the configuration changes necessary to move devices
between storage tiers. Movement is based on user-defined storage
tiers and FAST Policies.
FAST configuration
FAST configuration involves three components:
◆ Storage Groups
A Storage Group is a logical grouping of Symmetrix devices.
Storage Groups are shared between FAST and Auto-provisioning
Groups; however, a Symmetrix device may only belong to one
Storage Group that is under FAST control. A Symmetrix VMAX
storage array supports up to 8,192 Storage Groups associated with
FAST Policies.
◆ Storage Tiers
Storage tiers are a combination of a drive technology (for
example, EFD, FC 15k rpm, or SATA) and a RAID protection type
(for example, RAID 1, RAID 5 (3+1), RAID 5 (7+1), or RAID 6 (6+2)).
There are two types of storage tiers: static and dynamic. A static
type contains explicitly specified Symmetrix device groups, while
a dynamic type will automatically contain all Symmetrix disk
ASM disk group of each database that was moved between the
storage tiers. The +REDO and +TEMP disk groups remained on 15k
rpm drives, and FRA on SATA drives.
The first database, DB1, started on FC 15k rpm drives but was
designed to simulate a low I/O activity database that has very few
users, low importance to the business, and is a candidate to move to a
lower storage tier, or “down-tier.” The DB1 database could be one
that was once active but is now being replaced by a new application.
The second database, DB2, was designed to simulate a medium active
database that was initially deployed on SATA drives, but its activity
level and importance to the business are increasing and it is a
candidate to be moved to a higher storage tier, or “up-tier.” The last
database, DB3, started on FC 15k rpm drives and was designed to
simulate the high I/O activity level of a mission-critical application
with many users and is a candidate to up-tier from FC 15k rpm to
EFD.
The test configuration details are provided in Table 27.
Each of the three databases was using the ASM disk group
configuration as shown in Table 28.
(Table 28 columns: ASM disk groups, Number of LUNs, Size (GB), Total (GB), RAID)
Table 29 shows the initial storage drive types and count behind each
of the +DATA ASM disk groups at the beginning of the tests. It also
shows the OLTP workload and potential business goals for each
database.
(Table 29 columns: Database, Number of physical drives, Drive type, Workload, Business goal)
Figure 77 on page 379 shows the logical FAST profile we used for
database 3, or DB3. In this case, while we have three drive types in
the Symmetrix VMAX—EFD, FC 15k rpm, and SATA drives—we do
not want DB3 to reside on SATA, so we could simply not include a
SATA tier in the policy. However, including it and setting the
allowable percentage to 0 percent has the same effect.
[Figure 77, ICO-IMG-000782: logical FAST profile matching storage classes to service level objectives. Tiers: Type 1, 400 GB EFD, RAID 5 (3+1); Type 2, 300 GB 15K FC, RAID 5 (3+1); Type 3, 1 TB SATA, RAID 5 (3+1); associated storage groups DB3_SG (policy DB3_FP, up to 100% on EFD and FC, 0% on SATA) and DB2_SG]
Table 30 Initial Oracle AWR report inspection (db file sequential read)
Based on these results we can see that DB1 is mainly busy waiting for
random read I/O (“db file sequential read” Oracle event refers to
host random I/O). A wait time of 5 ms is very good; however, this
[Figure ICO-IMG-000783: DB2 on SATA]
FAST control can contain the same devices. In Figure 90 we can see
how the devices of ASM disk group +DATA, of database 3 (DB3), are
placed into a Storage Group that can later be assigned a FAST Policy.
As shown in Figure 90, FAST configuration parameters are specified.
The user approval mode is chosen.
ICO-IMG-000784
Figure 92 shows provisioning the target storage tier for the FAST
policies.
ICO-IMG-000786
When creating FAST policies, the Storage Groups prepared earlier for
FAST control are assigned the storage tiers they can be allocated on,
and the capacity percentage the Storage Group is allowed on each of
them.
The last screen in the wizard is a summary and approval of the
changes. Additional modifications to FAST configuration and
settings can be done using Solutions Enabler or SMC directly, without
accessing the wizard again. Solutions Enabler uses the “symfast”
command line syntax, and SMC uses the FAST tab.
The following example shows how FAST can be used to migrate data
for DB3 to the appropriate storage tier. The DB3 Storage Group
properties box has three tabs—General, Devices, and FAST
Compliance. The Devices tab shows the 10 Symmetrix devices that
belong to the +DATA ASM disk group, contain the DB3 data files,
and comprise the DB3_SG Storage Group. The FAST
Compliance tab shows what tiers of storage this Storage Group may
reside in. In this case we have defined the FC storage tier as the place
where the drives are now and the EFD storage tier is where FAST
may choose to move this ASM disk group. Note that there is no
option for a SATA storage tier for the DB3 Storage Group. This will
prohibit FAST from ever recommending a down-tier of DB3 to SATA.
ICO-IMG-000787
The final step of the process is to associate the Storage Group with the
FAST tiers and define a policy to manage FAST behavior. In our case
we have one Storage Group (DB3_SG), two FAST tiers (EFD and FC),
and one FAST Policy (Figure 94 on page 416). The FAST Policy allows
for up to 100 percent of the Storage Group to reside on the Flash
storage tier and allows for 100 percent of DB3 to reside on FC. Since
there is no SATA storage tier defined for DB3, a third storage tier
option does not exist. By allowing up to 100 percent of the DB3
Storage Group to reside on EFD, we expected that if FAST was going
to move any DB3 LUNs to EFD it would move them all, because they
all have the same I/O profile and there is ample capacity available
on that storage tier to accommodate the full capacity of those ASM
disk group devices.
(Figures ICO-IMG-000791, ICO-IMG-000790)
(Table columns: Number of physical drives, Drive type, Avg. txn/min, % Change)
[Figure ICO-IMG-000789: DB2 on SATA, DB3 on EFD]
Conclusion
Symmetrix Virtual Provisioning offers great value to Oracle
environments with improved performance and ease of management
due to wide striping and higher capacity utilization. Oracle ASM and
Symmetrix Virtual Provisioning complement each other very well.
With a broad range of data protection mechanisms and tighter
integration between Symmetrix and Oracle now available even for
thin devices, adoption of Virtual Provisioning for Oracle
environments is very desirable.
With the Enginuity 5874 Q4 2009 service release enhancements made
to Virtual LUN migration and the introduction of FAST technology,
data center administrators are now able to dynamically manage data
placement in a Symmetrix array to maximize performance and
minimize costs. Introduced with Symmetrix Enginuity 5875 in Q1
2011, FAST VP improves storage utilization in Oracle environments
and optimizes database performance by making effective use of
multiple storage tiers at a lower overall cost of ownership when
using Symmetrix Thin Provisioning. In a multi-tiered Oracle
storage configuration, moving highly accessed volumes from FC
drives to EFDs can help administrators maintain or improve
performance and free up FC drives for other uses. Moving active
drives from SATA to FC drives improves performance and allows for
increased application activity. Moving lightly accessed volumes from
FC to SATA improves utilization and drives down cost. This volume-
or sub-LUN-level movement can be done nondisruptively on a
Symmetrix VMAX using Virtual LUN, FAST, and FAST VP
capabilities.
[Figure: SAN connectivity from RAC1_HBAs and RAC2_HBAs through the SAN to storage ports 07E:1 and 10E:1]
can be created out of that clone for test, development, and reporting
instances. When SRDF/A is used any remote TimeFinder operation
should use the consistent split feature to coordinate the replica with
SRDF/A cycle switching. The use cases in this appendix illustrate
some of the basic Oracle business continuity operations that
TimeFinder and SRDF can perform together.
SRDF/Synchronous mode
SRDF/S is used to create a no-data-loss solution for committed
transactions. It provides the ability to replicate multiple databases'
and applications' data remotely while guaranteeing that the data on
the source and target devices is exactly the same. SRDF/S can protect
single or multiple source Symmetrix storage arrays with synchronous
replication.
With SRDF/S synchronous replication, shown in Figure 99, each I/O
from the local host to the source R1 devices is first written to the local
Symmetrix cache (1) and then sent over the SRDF links to the
remote Symmetrix unit (2). Once the remote Symmetrix unit
acknowledges that it received the I/O in its cache successfully (3), the
I/O is acknowledged to the local host (4). Synchronous mode
guarantees that the remote image is an exact duplicate of the source
R1 device's data.
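Because the host acknowledgment (4) waits for the full sequence, the R1 write response time is the local cache write plus one SRDF link round trip. A back-of-the-envelope shell sketch, with latency figures that are assumptions for illustration only:

```shell
# Illustrative latencies in microseconds (assumed, not measured).
local_cache_write=200   # step 1: I/O lands in local Symmetrix cache
link_round_trip=1000    # steps 2-3: send to remote cache, receive ack
# Step 4: only after the remote ack is the I/O acknowledged to the host.
host_response=$((local_cache_write + link_round_trip))
echo "$host_response"
```

This is why SRDF/S is distance-limited: the link round trip grows with distance and is paid on every write.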
Note: SRDF Adaptive Copy replication is not supported for database restart
or database recovery solutions with Oracle databases. Using SRDF Adaptive
Copy replication by itself for disaster protection of Oracle databases will lead
to a corrupt and unusable remote database.
SRDF topologies
SRDF can be configured in many topologies other than a single SRDF
source and target, satisfying different needs for high availability and
disaster restart. It can use a single target or two concurrent targets; it
can provide a combination of synchronous and asynchronous
replication; and it can provide a three-site solution that allows no
data loss over very long distances, and more. Some of the basic
topologies that can be used with SRDF are shown in the following
section.
Concurrent SRDF
SRDF allows simultaneous replication of single R1 source devices to
up to two target devices using multiple SRDF links. All SRDF links
can operate in either Synchronous or Asynchronous mode or one or
more links can utilize Adaptive Copy mode for efficient utilization of
available bandwidth on that link. This topology allows simultaneous
data protection over short and long distances as shown in Figure 102.
Cascaded SRDF
SRDF allows cascaded configurations in which data is propagated
from one Symmetrix to the next. This configuration requires
Synchronous mode for the first SRDF leg and Asynchronous or
Adaptive Copy modes for the next. As shown in Figure 103, this
topology provides remote replications over greater distances with
varying degree of bandwidth utilization and none to limited data loss
(depends on the choice of SRDF modes and disaster type).
SRDF/Star
SRDF/Star is a two- or three-site protection topology where data is
replicated from source Site A to two other Symmetrix systems
simultaneously (Site B and Site C). The data remains protected even
in case one target site (B or C) goes down. If site A (the primary site)
goes down, the customer can choose where to come up (site B or C)
based on SRDF/Star information. If the storage data in the other
surviving site is more current then changes will be incrementally sent
to the surviving site that will come up. For protection and
compliance, remote replications can start immediately to the new DR
site. For example, as shown in Figure 105, if database operations
resume in Site C, data will be sent first from Site B to create a no data
loss solution, and then Site B will become the new DR target.
SRDF/Star is highly flexible and can change modes and topology to
achieve the best protection for each disaster scenario. For a full
description of the product, refer to the SRDF product guide.
(Table columns: ASM diskgroups, Database devices, Recovery Device Groups (DG), Restart Device Groups (DG), SRDF Consistency Group (CG))
(together with control files), log files, and archive logs each had
their own DG, allowing the replica of each to take place at slightly
different times as shown in the recovery use cases. For example, if
a valid datafile's backup replica should be restored to production,
and the production logs are intact, by separating the datafiles and
logs to their own DG and ASM diskgroups, such a restore won't
compromise the logs and full database recovery would be
possible. For a restart solution, a single DG was used that
includes all data (control) and log files, allowing them to be split
consistently creating a restartable and consistent replica.
◆ Note that TimeFinder operations can span Symmetrix arrays.
When that is the case instead of a device group (DG) a composite
group (CG) should be used, following the exact same best
practices as shown for the DG in this paper.
◆ It is recommended to issue TimeFinder and SRDF commands
from a management (or the target) host rather than the database
production host. The reason is that in rare cases, when consistent
split is used under heavy write activity, Symmetrix management
commands may be queued behind database writes, interfering
with completion of the replication, and the replica will be deemed
invalid.
◆ It is recommended to use Symmetrix Generic Name Services
(GNS) and allow them to be replicated to the SRDF targets. GNS
manages all the DG and CG definitions in the array and can
replicate them to the SRDF target so the management host issuing
TimeFinder and SRDF commands will be able to operate on the
same CG and DG as the source (without having to re-create
them).
◆ For the sake of simplicity the use cases assume that GNS is used
and replicated remotely. When remote TimeFinder or SRDF
operations are used, they are issued on the target host. It is also
possible to issue remote TimeFinder and SRDF commands from
the local management host using the -rdf flag; however it requires
the SRDF links to be functional.
◆ Note that remote TimeFinder replica creation from an SRDF/A
target should always use the -consistent flag to coordinate
SRDF/A cycle switching with the TimeFinder operation and
simply put, guarantee that the replica is consistent.
High-level steps
1. Place the database in hot backup mode.
2. Activate the DATA_DG clone (with -consistent since ASM is
used).
3. End hot backup mode.
4. Archive the current log.
5. Copy two backup control files to the FRA ASM diskgroup.
6. Activate the ARCHIVE_DG clone (with -consistent since ASM is
used).
7. Optionally mount the clone devices on a backup host and
perform RMAN backup.
Detailed steps
On the production host
1. Place the production database in hot backup mode.
# export ORACLE_SID=RACDB1
# sqlplus "/ as sysdba"
SQL> alter database begin backup;
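The detailed commands for high-level steps 2 through 4 can be sketched as follows, mirroring the symclone and SQL*Plus syntax used elsewhere in this appendix; the group name DATA_DG is taken from the high-level list above, so treat the exact commands as a sketch:

```shell
# 2. Activate the DATA_DG clone; -consistent is required since ASM is used.
symclone -dg DATA_DG -tgt -consistent activate
# 3. End hot backup mode (on the production host):
#      SQL> alter database end backup;
# 4. Archive the current log:
#      SQL> alter system archive log current;
```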
5. Create two backup control files and place them in the FRA
diskgroup for convenience (RMAN syntax is shown, although
SQL can be used as well). One will be used to mount the database
for RMAN backup; the other will be saved with the backup set.
RMAN>run {
allocate channel ctl_file type disk;
copy current controlfile to
'+FRA/control_file/control_start';
copy current controlfile to
'+FRA/control_file/control_bak';
release channel ctl_file;
}
# export ORACLE_SID=CLONE_DB
# sqlplus "/ as sysdba"
SQL> startup mount
9. Back up the database with RMAN from the backup host. The
control file copy that was not used to mount the instance
(control_bak) should be part of the backup set. The control_start file
should not be backed up because the SCN will be updated when
the database is mounted for backup.
RMAN>run {allocate channel t1 type disk;
backup format 'ctl%d%s%p%t'
controlfilecopy '+FRA/control_file/control_bak';
backup full format 'db%d%s%p%t' database;
backup format 'al%d%s%p%t' archivelog all;
release channel t1;
}
Note: The format specifier %d is for the database name, %t for a 4-byte
timestamp, %s for the backup set number, and %p for the backup piece number.
High-level steps
1. Shut down production database and ASM instances.
2. Restore the DATA_DG clone (split afterwards).
3. Start ASM.
4. Mount the database.
5. Perform database recovery and open the database.
Detailed steps
On the production host
1. Shut down any production database and ASM instances (if still
running).
# export ORACLE_SID=RACDB1
# sqlplus "/ as sysdba"
SQL> shutdown abort
# export ORACLE_SID=+ASM1
# sqlplus "/ as sysdba"
SQL> shutdown abort
3. Start the ASM instance (follow the same activities as in Use Case
1, step 7).
4. Mount the database (follow the same activities as in Use Case 1,
step 8).
5. Recover and open the production database. Use resetlogs if
incomplete recovery was performed.
# export ORACLE_SID=RACDB1
High-level steps
1. Activate the DB_DG clone (with -consistent to create restartable
replica).
2. Start the ASM instance.
3. Start the database instance.
4. Optionally, refresh the clone replica from production at a later
time.
Detailed steps
On the target host
1. Activate the TimeFinder/Clone DB_DG replica. The clone replica
includes all data, control, and log files. Use -consistent to make
sure the replica maintains dependent write consistency and
therefore a valid restartable replica from which Oracle can simply
perform crash recovery.
# symclone -dg DB_DG -tgt -consistent activate
Note: Follow the same target host prerequisites as in Use Case 1 prior
to step 7.
# export ORACLE_SID=+ASM
# sqlplus "/ as sysdba"
SQL> startup
At this point the clone database is opened and available for user
connections.
4. Optionally, it is easy and fast to refresh the TimeFinder replica
from production as TimeFinder/Clone operations are incremental
as long as the clone session is not terminated. Once the clone
session is reactivated, the target devices are available
immediately for use, even if background copy is still taking place.
1. Shut down the clone database instance since it needs to be
refreshed:
SQL> shutdown abort
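The remaining refresh commands can be sketched as follows; the recreate/activate sequence is a sketch based on standard TimeFinder/Clone incremental refresh behavior and should be verified against the TimeFinder product guide:

```shell
# 2. Refresh the clone incrementally (the clone session was not terminated).
symclone -dg DB_DG -tgt recreate
# 3. Reactivate with -consistent; the targets are usable immediately,
#    even while the background copy is still running.
symclone -dg DB_DG -tgt -consistent activate
```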
High-level steps
1. Perform initial synchronization of SRDF in Adaptive Copy mode.
2. Once the SRDF target is close enough to the source, change the
replication mode to SRDF/S or SRDF/A.
3. Enable SRDF consistency.
Detailed steps
1. Perform initial synchronization of SRDF in Adaptive Copy mode.
Repeat this step or use the skew parameter until the SRDF target
is close enough to the source.
# symrdf -cg ALL_CG set mode acp_wp [skew <number>]
# symrdf -cg ALL_CG establish
2. Once the SRDF target is close enough to the source change the
replication mode to SRDF/S or SRDF/A.
1. For SRDF/S, set protection mode to sync:
# symrdf -cg ALL_CG set mode sync
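The SRDF/A alternative for step 2 and the consistency step 3 can be sketched as follows; the async and enable actions mirror standard symrdf syntax but should be treated as a sketch and verified against the SRDF CLI guide:

```shell
# 2. For SRDF/A, set the replication mode to async instead:
symrdf -cg ALL_CG set mode async
# 3. Enable SRDF consistency protection for the group:
symrdf -cg ALL_CG enable
```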
High-level steps
1. Activate the remote DB_DG clone (use -consistent to create
restartable replica).
2. Start the remote ASM instance.
3. Start the remote database instance.
4. Optionally, refresh the remote clone replica from production
(SRDF targets) at a later time.
Detailed steps
On the target host
1. Activate the TimeFinder/Clone DB_DG remote replica. The clone
replica includes all data, control, and log files. Use -consistent to
make sure the replica maintains dependent write consistency and
therefore a valid restartable replica from which Oracle can simply
perform crash recovery.
# symclone -dg DB_DG -tgt -consistent activate
Note: Follow the same target host prerequisites as in Use Case 1 prior
to step 7.
2. Start the ASM instance. Follow the same activities as in Use Case
3 step 2.
3. Start the database instance. Follow the same activities as in Use
Case 3 step 3.
At this point the clone database is opened and available for user
connections.
4. Optionally, to refresh the database clone follow the same activities
as in Use Case 3 step 4.
Note: For SRDF/A, the SRDF checkpoint command will return control to the
user only after the source device content has reached the SRDF target devices
(SRDF simply waits two delta sets). This is useful, for example, when
production is placed in hot backup mode before the remote clone is taken.
High-level steps
1. Place the database in hot backup mode.
2. If using SRDF/A, perform SRDF checkpoint (no action required
for SRDF/S).
3. Activate a remote DATA_DG clone (with -consistent if SRDF/A
and/or ASM are used).
4. End hot backup mode.
5. Archive the current log.
6. Copy two backup control files to the FRA ASM diskgroup.
7. If using SRDF/A then perform SRDF checkpoint (no action
required for SRDF/S).
8. Activate the remote ARCHIVE_DG clone (with -consistent if
SRDF/A and/or ASM is used).
9. Optionally mount the remote clone devices on the backup host
and perform RMAN backup.
Detailed steps
On the production host
1. Place production in hot backup mode. Follow the same activities
as in Use Case 1 step 1.
2. If SRDF/A is used then an SRDF checkpoint command will make
sure the SRDF target has the datafiles in backup mode as well.
# symrdf -cg ALL_CG checkpoint
High-level steps
1. Shut down production database and ASM instances.
2. Restore the remote DATA_DG clone (split afterwards). Restore
SRDF in parallel.
3. Start ASM.
4. Mount the database.
5. Perform database recovery (possibly while the TimeFinder and
SRDF restore are still taking place) and open the database.
Detailed steps
On the production host
1. Shut down any production database and ASM instances (if still
running). Follow the same activities as in Use Case 2 step 1.
2. Restore the remote TimeFinder/Clone replica to the SRDF target
devices, then restore SRDF. If SRDF is still replicating from source
to target, stop the replication first. Then start the TimeFinder
restore and, once it has started, start the SRDF restore in parallel.
In some cases the distance is long, the bandwidth is limited, and
many changes have to be restored. In these cases it might make
more sense to change SRDF mode to Adaptive Copy first until the
differences are small before placing it again in SRDF/S or
SRDF/A mode.
# symrdf -cg ALL_CG split
# symclone -dg DATA_DG -tgt restore [-force]
# symrdf -cg ALL_CG restore
High-level steps
1. Shut down production database and ASM instances.
2. Restore the most recent DATA_DG clone (split afterwards).
3. Start ASM.
4. Mount the database.
5. Perform database full or incomplete recovery (possibly while the
TimeFinder background restore is still taking place).
Detailed steps
1. Shut down any production database and ASM instances (if still
running). Follow the same activities as mentioned in Use Case 2
step 1.
2. Restore the most recent DATA_DG TimeFinder replica. Follow
the same activities as mentioned in Use Case 2 step 2.
3. Start the ASM instance (follow the same activities as in Use Case 1
step 7).
4. Mount the database (follow the same activities as in Use case 1
step 8).
5. Perform database recovery based on one of the following options.
Note: It might be necessary to point to the location of the online redo logs
or archive logs if the recovery process didn't locate them automatically
(common in RAC implementations with multiple online or archive log
locations). The goal is to fully apply any necessary archive logs as well as the
online logs.
set serveroutput on
declare
scn number(12) := 0;
scnmax number(12) := 0;
begin
for f in (select * from v$datafile) loop
scn := dbms_backup_restore.scandatafile(f.file#);
dbms_output.put_line('File ' || f.file# || ' absolute fuzzy scn = ' || scn);
if scn > scnmax then scnmax := scn; end if;
end loop;
dbms_output.put_line('Maximum absolute fuzzy scn = ' || scnmax);
end;
/
Conclusion
Symmetrix VMAX is a new offering in the Symmetrix product line
with enhanced scalability, performance, availability, and security
features, allowing Oracle databases and applications to be deployed
rapidly and with ease.
With the introduction of Enterprise Flash Drives, and together with
Fibre Channel and SATA drives, Symmetrix provides a consolidation
platform covering performance, capacity, and cost requirements of
small and large databases. The correct use of storage tiers, together
with the ability to move data seamlessly between tiers, allows
customers to place their most active data on the fastest tiers, and their
less active data on high-density, low-cost media like SATA drives.
Features such as Autoprovisioning allow ease of storage provisioning
to Oracle databases, clusters, and physical or virtual server farms.
TimeFinder and SRDF technologies simplify high availability and
disaster protection of Oracle databases and applications, and provide
the required level of scalability from the smallest to the largest
databases. SRDF and TimeFinder are easy to deploy and very well
integrated with Oracle products like Automatic Storage Management
(ASM), RMAN, Grid Control, and more. The ability to offload
backups from production, rapidly restore backup images, or create
restartable database clones enhances the Oracle user experience and
data availability.
Oracle and EMC have been investing in an engineering partnership
to innovate and integrate both technologies since 1995. The
integrated solutions increase database availability, enhance disaster
recovery strategy, reduce backup impact on production, minimize
cost, and improve storage utilization across a single database instance
or RAC environments.
Test setup
Figure 106 on page 464 depicts the test setup containing Oracle RAC
on the production site and associated TimeFinder/Clone and SRDF
devices for local and remote replication.
(Test hosts: Local “Production” Host, RAC Node 1: Dell, Red Hat Enterprise Linux 5.0, Oracle 11g release 1 (11.1.0.6.0); Local “Production” Host, RAC Node 2: Dell, Red Hat Enterprise Linux 5.0, Oracle 11g release 1 (11.1.0.6.0); Remote “Target” Host: Dell, Red Hat Enterprise Linux 5.0, Oracle 11g release 1 (11.1.0.6.0))
This example shows how to build and populate a device group and a
composite group for TimeFinder/Clone usage:
Device group:
1. Create the device group device_group:
symdg create device_group -type regular
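Step 2, adding the standard (source) devices to the device group, can be sketched as follows; the device numbers here are placeholders introduced for illustration, not values from the original example:

```shell
# 2. Add the standard devices that hold the database to the group.
#    Replace 007 and 008 with the Symmetrix device numbers of the
#    source LUNs.
symld -g device_group add dev 007
symld -g device_group add dev 008
```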
3. Add the target clone devices to the group. The targets for the
clones can be standard devices or BCV devices. In this example,
BCV devices are used. The number of BCV devices should be the
same as the number of standard devices, and the same size or
larger than the paired standard device. The device serial numbers
of the BCVs used in the example are 00C, 00D, 063, 064, and 065.
symbcv -g device_group associate dev 00C
symbcv -g device_group associate dev 00D
symbcv -g device_group associate dev 063
symbcv -g device_group associate dev 064
symbcv -g device_group associate dev 065
Composite group:
1. Create the composite group device_group:
symcg create device_group -type regular
3. Add the target for the clones to the device group. In this example,
BCV devices are added to the composite group to simplify the
later symclone commands. The number of BCV devices should be
the same as the number of standard devices and the same size.
The device serial numbers of the BCVs used in the example are
00C, 00D, 063, 064, and 065.
symbcv -cg device_group associate dev 00C -sid 123
symbcv -cg device_group associate dev 00D -sid 123
symbcv -cg device_group associate dev 063 -sid 456
symbcv -cg device_group associate dev 064 -sid 456
symbcv -cg device_group associate dev 065 -sid 456
Overview
Previous sections demonstrated methods of creating a database
copy using storage-based replication techniques. While in some
cases, customers create one or more storage-based database
copies of the database as "gold" copies (copies that are left in a
pristine state on the array), in most cases they want to present
copied devices to a host for backups, reporting, and other
business continuity processes. Mounting storage-replicated
copies of the database requires additional array-based,
SAN-based (if applicable), and host-based steps, including LUN
presentation and masking, host device recognition, and
importing of the logical groupings of devices so that the
operating system and logical volume manager recognize the data
on the devices. Copies of the database can be presented to a new
host or presented back to the same host that sees the source
database. The following sections describe the host-specific
considerations for these processes.
Whether using SRDF, TimeFinder, or Replication Manager to
create a copy of the database, there are six essential requirements
for presenting the replicated devices and making the copies
available to a host. They include:
◆ Verifying that the devices are presented to the appropriate
front-end directors in the BIN file.
◆ Verifying zoning and LUN presentation through the SAN are
configured (if needed).
◆ Editing configuration information to allow the devices to be seen
on the host.
◆ Scanning for the devices on the SCSI paths.
◆ Creating special files (UNIX) or assigning drive letters
(Windows).
◆ Making the devices ready for use.
The following sections briefly discuss these steps at a high level.
SAN considerations
Hosts can be attached to a Symmetrix DMX either by direct
connectivity (FC-AL, iSCSI, ESCON, or FICON), or through a
SAN using Fibre Channel (FC-SW). When using direct-connect,
all LUNs presented to a front-end port are presented to the host.
In the case of a SAN, additional steps must be considered. These
include zoning, which is a means of enabling security on the
switch, and LUN masking, which restricts hosts to seeing only the
devices they are meant to see. There are also HBA-specific SAN
settings that must be configured on the hosts.
SAN zoning is a means of restricting FC devices (for example,
HBAs and Symmetrix front-end FC director ports) from accessing
all other devices on the fabric. It prevents FC devices from
accessing unauthorized or unwanted LUNs. In essence, it
establishes relationships between HBAs and FC ports using
World Wide Names (WWNs). WWNs are unique hardware
identifiers for FC devices. In most configurations, a one-to-one
relationship (the zone) is established between an HBA and FC
port, restricting other HBAs (or FC ports) from accessing the
LUNs presented down the port. This simplifies configuration of
shared SAN access and provides protection against other hosts
gaining shared access to the LUNs.
In addition to zoning, LUN masking, which on the Symmetrix
array is called Volume Logix™, can also be used to restrict hosts to
see only specified devices down a shared FC director port. SANs
are designed to increase connectivity to storage arrays such as the
Symmetrix. Without Volume Logix, all LUNs presented down a
FC port would be available to all hosts that are zoned to the
front-end port, potentially compromising both data integrity and
security.
The combination of zoning and Volume Logix, when configured
correctly for a customer's environment, ensures that each host
only sees the LUNs designated for it. They ensure data integrity
and security, and also simplify the management of the SAN
environment. There are many tools available to configure zoning
and LUN masking.
AIX considerations
When presenting copies of devices in an AIX environment to a
host different from the one the production copy is running on, the
first step is to scan the SCSI bus, which allows AIX to recognize
the new devices. The following demonstrates the steps needed for
the host to discover and verify the disks, bring the new devices
under PowerPath control if necessary, import the volume groups,
and mount the file systems (if applicable).
1. Before presenting the new devices, it is useful to run the
following commands and save the information to compare to
after the devices are presented:
lsdev -Cc disk
lspv
syminq
4. The next step is for the target host to recognize the new devices.
The following command scans the SCSI buses and examines all
adapters and devices presented to the target system:
cfgmgr -v
Once the devices are discovered by AIX, the next step is to import
the volume groups. The key is to keep track of the PVIDs on the
source system. The PVID is the physical volume identifier that
uniquely identifies a volume across multiple AIX systems. When
the volume is first included in a volume group, the PVID is
assigned based on the host serial number and the timestamp. In
this way, no two volumes should ever get the same PVID.
However, array-based replication technologies copy everything
on the disk, including the PVID.
7. On the production host, use the lspv command to list the physical
volumes. Locate the PVID of any disk in the volume group being
replicated. On the secondary host, run lspv as well. Locate the
hdisk that corresponds to the PVID noted in the first step.
Suppose the disk has the designation hdisk33. The volume group
can now be imported using the command:
importvg -y vol_grp hdisk33
10. The first time this procedure is performed, create mount points
for the file systems if raw volumes are not used. The mount
points should be made the same as the mount points for the
production file systems.
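The before/after comparison in these steps can be scripted. Below is a minimal sketch, assuming the lspv output is saved to files before and after the rescan; the sample listings, file names, and hdisk numbers are illustrative, not taken from the original procedure:

```shell
# Hypothetical sketch: detect the hdisks that appear only after cfgmgr -v,
# by comparing saved lspv listings. The sample data stands in for real
# lspv output (hdisk name, PVID, volume group, state).
printf 'hdisk0 00c8f6a1b2c3d4e5 rootvg active\n' >  /tmp/lspv.before
printf 'hdisk1 00c8f6a100112233 appvg active\n'  >> /tmp/lspv.before

# ... devices are presented and "cfgmgr -v" is run here ...

cp /tmp/lspv.before /tmp/lspv.after
printf 'hdisk33 00c8f6a100112233 None\n' >> /tmp/lspv.after

# Lines present only in the "after" listing are the newly discovered disks.
NEW_DISKS=$(comm -13 /tmp/lspv.before /tmp/lspv.after | awk '{print $1}')
echo "$NEW_DISKS"

# The PVID (second column) of a new hdisk matches the PVID of the source
# disk, so the copied volume group could then be imported, for example:
#   importvg -y vol_grp hdisk33
```

Note that comm requires sorted input; real lspv output may need to be passed through sort first.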
HP-UX considerations
When presenting clone devices in an HP-UX environment to a
host different from the one the production copy is running on,
initial planning and documentation of the source host
environment is first required. The following demonstrates the
steps needed for the target host to discover and verify the disks,
bring the new devices under PowerPath control if necessary,
import the volume groups and mount the file systems (if
applicable).
1. Before presenting the new devices, it is useful to run the
following commands on the target host and save the information
to compare to output taken after the devices are presented:
vgdisplay -v | grep "Name"   (List all volume groups)
syminq                       (Find the Symmetrix volume for each c#t#d#)
3. Create map files for each volume group to replicate. The Volume
Group Reserve Area (VGRA) on disk contains descriptor
information about all physical and logical volumes that make up
a volume group. This information is used when a volume group
is imported to another host. However, logical volume names are
not stored on disk. When a volume group is imported, the host
assigns a default logical volume name. To ensure that the logical
volume names are imported correctly, a map file generated on the
source is created for each volume group and used on the target
host when the group is imported.
vgexport -v -p -m /tmp/vol_grp.map vol_grp
7. Create device special files for the volumes presented to the host:
insf -e
12. Import the volume groups onto the target host. Volume group
information from the source host is stored in the Volume Group
Reserve Area (VGRA) on each volume presented to the target
host. Volume groups are imported by specifying a volume group
name that is not already in use on the target.
vgimport -v -m vg_map_file vol_grp /dev/rdsk/c#t#d#
[/dev/rdsk/c#t#d#]
14. Once the volume groups are activated, mount on the target any
file systems from the source host. These file systems may require
a file system check using fsck as well. Add an entry to /etc/fstab
for each file system.
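Generating the vgimport command lines for each volume group can also be scripted. The sketch below builds them from a simple mapping file of volume groups and their target-host c#t#d# devices; the file format, paths, and names (vg_map.txt, vol_grp) are assumptions for illustration only:

```shell
# Hypothetical mapping file: one volume group per line, followed by its
# target-host c#t#d# device names (derived from syminq/symmir output).
printf 'vol_grp c10t0d1 c10t0d2\n' > /tmp/vg_map.txt

# Build a vgimport command per volume group, referencing the map file
# created earlier on the source host with "vgexport -v -p -m".
CMDS=$(awk '{
    printf "vgimport -v -m /tmp/%s.map %s", $1, $1
    for (i = 2; i <= NF; i++) printf " /dev/rdsk/%s", $i
    printf "\n"
}' /tmp/vg_map.txt)
echo "$CMDS"
```

Each generated line corresponds to the manual vgimport shown in step 12.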
Linux considerations
Enterprise releases of Linux from Red Hat and SuSE provide a
logical volume manager for grouping and managing storage.
However, it is not common to use the logical volume manager on
Linux. The technique used to present and use a copy of an Oracle
database on a different host depends on whether or not the
logical volume manager is used on the production host. To access
the copy of the database on a secondary host, follow these steps:
1. Create a mapping of the devices that contain the database to file
systems. This mapping information is used on the secondary
host. The mapping can be performed by using the information in
the /etc/fstab file and/or the output from the df command.
In addition, if the production host does not use logical volume
manager, the output from syminq and
symmir/symclone/symsnap command is required to associate
the operating-system device names (/dev/sd<x>) with
Symmetrix device numbers on the secondary host.
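The step-1 mapping can be captured with a short script. The sketch below parses df -P style output into a device-to-mount-point file; the sample output, mount points, and file names are illustrative assumptions:

```shell
# Hypothetical sketch: record which /dev/sd<x> device backs each database
# file system, from "df -P" style output captured on the production host.
cat > /tmp/df.out <<'EOF'
Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/sda1         10475520  5120000   5355520      49% /
/dev/sdc1         52403200 41922560  10480640      80% /oradata
/dev/sdd1         10475520  2095104   8380416      20% /oralogs
EOF

# Keep only the database devices and their mount points; this mapping is
# later combined with syminq output on the secondary host.
awk 'NR > 1 && ($NF == "/oradata" || $NF == "/oralogs") {print $1, $NF}' \
    /tmp/df.out > /tmp/db_map.txt
cat /tmp/db_map.txt
```
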
2. Unlike other UNIX operating systems, Linux does not provide a
single utility to rescan the SCSI bus; several methods can be used
to make the host discover changes to the storage environment.
Once the new devices are visible, the pvscan command displays
all devices that are initialized but do not yet belong to a volume
group; it should display all members of the volume groups that
constitute the copy of the database. The vgimport command
imports the new devices and creates the appropriate LVM
structures needed to access the data, and the imported volume
group is then activated with:
vgchange -a y volume_group_name
If LVM is not used, this step can be skipped.
6. Once the volume groups, if any, are activated, mount on the target
any file systems from the source host. If logical volume manager
is not being used, execute syminq on the secondary host. The
output documents the relationship between the operating system
device names (/dev/sd<x>) and the Symmetrix device numbers
associated with the copy of the database. The output from step 1
can be then used to determine the devices and the file systems
that need to be mounted on the secondary host.
These file systems may require a file system check (using fsck)
before they can be mounted. If one does not already exist, add an
entry to /etc/fstab for each file system.
Solaris considerations
When presenting replicated devices in a Solaris environment to a
different host from the one production is running on, the first step
is to scan the SCSI bus which allows the secondary Solaris system
to recognize the new devices. The following steps cause the host
to discover and verify the disks, bring the new devices under
PowerPath control if necessary, import the disk groups, start the
logical volumes, and mount the file systems (if applicable). The
following commands assume that VERITAS Volume Manager
(VxVM) is used for logical volume management.
1. Before presenting the new devices, run the following commands
and save the information for comparison after the devices are
presented:
vxdisk list
vxprint -ht
syminq
Commands that document which physical devices the Oracle disk
group uses, and that show the relationships between the disks,
should be run on the host prior to making any device changes.
This is a precaution only, to document the environment should it
require a manual restore later.
vxdg list     (List all the disk groups)
vxdisk list   (List all the disks and associated groups)
syminq        (Find Symmetrix volume numbers for each Oracle disk)
4. The next step is for the target host to recognize the new devices.
The following command scans the SCSI buses, examines all
adapters and devices presented to the target system, and builds
the information into the /dev directory for all LUNs found:
drvconfig; devlinks; disks
6. After the operating system can see the new devices, VERITAS
must discover them. To make VERITAS discover the new devices,
enter:
vxdctl enable
8. Once VERITAS has found the devices, import the disk groups.
The disk group name is stored in the private area of the disk. To
import the disk group, enter:
vxdg -C import diskgroup
Use the -C flag to override the host ownership flag on the disk.
The ownership flag on the disk indicates the disk group is online
to another host. When this ownership bit is not set, the vxdctl
enable command actually performs the import when it finds the
new disks.
9. Run the following command to verify that the disk group
imported correctly:
vxdg list
11. For every logical volume in the volume group, fsck must be run
to fix any incomplete file system transactions:
fsck -F vxfs /dev/vx/dsk/diskgroup/lvolname
12. Mount the file systems. If the UID and GIDs are not the same
between the two hosts, run the chown command to change the
ownerships of the logical volumes to the DBA user and group
that administers the server:
chown dbaadmin:dbagroup /dev/vx/dsk/diskgroup/lvolname
chown dbaadmin:dbagroup
/dev/vx/rdsk/diskgroup/lvolname
13. The first time this procedure is performed, create mount points
for the file systems, if raw volumes are not used. The mount
points should be made the same as the mount points for the
production file systems.
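The Solaris steps above rely on reading vxdisk output to identify the disks in each group. A minimal sketch of that parse is shown below; the sample "vxdisk list" output and the disk group name (oradg) are illustrative, not from the original:

```shell
# Hypothetical sketch: list the disks belonging to one VxVM disk group by
# parsing "vxdisk list" style output. Sample data stands in for real output.
cat > /tmp/vxdisk.out <<'EOF'
DEVICE       TYPE            DISK         GROUP        STATUS
c1t0d0s2     auto:sliced     disk01       oradg        online
c1t1d0s2     auto:sliced     disk02       oradg        online
c1t2d0s2     auto:sliced     -            -            online invalid
EOF

# Select only the devices whose GROUP column matches the target group.
DG=oradg
DISKS=$(awk -v dg="$DG" 'NR > 1 && $4 == dg {print $1}' /tmp/vxdisk.out)
echo "$DISKS"
```

This kind of listing, saved before and after the import, documents which devices each disk group occupies.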
Windows considerations
To facilitate the management of volumes, especially those of a
transient nature such as BCVs, EMC provides the Symmetrix
Integration Utility (SIU). SIU provides the necessary functions to scan
for, register, mount, and unmount BCV devices.
The SIU unmount command removes the volume from its drive letter
and discards the Windows cache associated with the volume. If any
running application maintains an open handle to the volume, SIU will
fail and report an error. The administrator should ensure that no
applications are using any data from the required volume; proceeding
with an unmount while processes have open handles is not recommended.
The SIU can identify those processes that maintain open handles to
the specified drive, using the following command:
symntctl openhandle -drive W:
AIX considerations
When presenting database copies back to the same host in an AIX
environment, one must deal with the fact that the OS now sees the
source disk and an identical copy of the source disk. This is because
the replication process copies not only the data part of the disk, but
also the system part, which is known as the Volume Group
Descriptor Area (VGDA). The VGDA contains the physical volume
identifier (PVID) of the disk, which must be unique on a given AIX
system.
The issue with duplicate PVIDs prevents a successful import of the
copied volume group and has the potential to corrupt the source
volume group. Fortunately, AIX provides a way to circumvent this
limitation. AIX 4.3.3 SP8 and later provides the recreatevg command
to rebuild the volume group from a supplied set of hdisks or
powerdisks. Use syminq to determine the hdisks or powerdisks that
belong to the volume group copy. Then, issue either of the two
commands:
recreatevg -y replicavg_name -l lvrename.cfg hdisk##
hdisk## hdisk## ...
recreatevg -y replicavg_name -l lvrename.cfg hdiskpower##
hdiskpower## hdiskpower## ...
where the ## represents the disk numbers of the disks in the volume
group. The recreatevg command gives each volume in the set of
volumes a new PVID, and also imports and activates the volume
group.
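Assembling the recreatevg command line from the syminq-derived disk list can be scripted. The sketch below is illustrative only; the hdisk numbers, volume group name, and rename file follow the example above but are assumptions:

```shell
# Hypothetical sketch: build the recreatevg command from the list of BCV
# hdisks identified with syminq. The disk names here are placeholders.
BCV_DISKS="hdisk33 hdisk34 hdisk35"

CMD="recreatevg -y replicavg_name -l lvrename.cfg"
for D in $BCV_DISKS; do
    CMD="$CMD $D"           # append each copied hdisk to the command
done
echo "$CMD"
# The assembled command would then be executed on the AIX host, where
# recreatevg assigns new PVIDs and imports and activates the volume group.
```
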
HP-UX considerations
Presenting database copies in an HP-UX environment to the same
host as the production copy is nearly identical to the process used for
presenting the copy to a different host. The primary differences are
the need to use a different name for the volume groups and the need
to change the volume group IDs on the disks.
1. Before presenting the new devices, it is useful to run the
following commands on the target host and save the information
for comparison with output taken after the devices are presented:
vgdisplay -v | grep "Name"   (List all volume groups)
syminq                       (Find the Symmetrix volume for each c#t#d#)
7. Create device special files for the volumes presented to the host:
insf -e
10. Once the devices are found by HP-UX, identify them with their
associated volume groups from the source host so that they can
be imported successfully. When using the vgimport command,
specify all of the devices for the volume group to be imported.
Since the target and LUN designations for the target devices are
different from the source volumes, the exact devices must be
identified using the syminq and symmir output. Source volume
group devices can be associated with Symmetrix source devices
through syminq output. Then the Symmetrix device pairings from
the source to target hosts are found from the symmir device
group output. Finally, the Symmetrix target volume to target host
device pairings are made through the syminq output from the
target host.
11. Change the volume group identifiers (VGIDs) on each set of
devices making up each volume group. For each volume group,
change the VGID on each device using the following:
vgchgid /dev/rdsk/c#t#d# [/dev/rdsk/c#t#d#] . . .
12. After changing the VGIDs for the devices in each volume group,
create the volume group structures needed to successfully import
the volume groups onto the new host. A directory and group file
for each volume group must be created before the volume group
is imported. Ensure each volume group has a unique minor
number and is given a new name.
ls -l /dev/*/group   (Identify used minor numbers)
mkdir /dev/newvol_grp
mknod /dev/newvol_grp/group c 64 0xminor#0000   (minor# must be unique)
13. Import the volume groups onto the target host. Volume group
information from the source host is stored in the VGRA on each
volume presented to the target host. Volume groups are imported
by specifying a volume group name that is not already in use on
the target.
vgimport -v -m vg_map_file vol_grp /dev/rdsk/c#t#d#
[/dev/rdsk/c#t#d#]
15. Once the volume groups are activated, mount on the target any
file systems from the source host. These file systems may require
a file system check using fsck as well. An entry should be made
to /etc/fstab for each file system.
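Choosing a unique minor number for each new group file (step 12) can be automated. Below is a minimal sketch that parses "ls -l /dev/*/group" style output; the sample listing is illustrative, and for simplicity the sketch assumes fewer than ten volume groups (decimal minor digits only):

```shell
# Hypothetical sketch: pick the next unused volume group minor number
# from "ls -l /dev/*/group" style output. Sample listing is illustrative.
cat > /tmp/groups.out <<'EOF'
crw-r----- 1 root sys 64 0x000000 Jan  1 00:00 /dev/vg00/group
crw-r----- 1 root sys 64 0x010000 Jan  1 00:00 /dev/vg01/group
crw-r----- 1 root sys 64 0x030000 Jan  1 00:00 /dev/vg03/group
EOF

# The minor number is the first pair of digits after "0x"; take the
# highest one in use and add 1 (assumes decimal digits, i.e. < 10 groups).
LAST=$(awk '{sub(/^0x/, "", $6); print substr($6, 1, 2)}' /tmp/groups.out | \
       sort -n | tail -1)
NEXT=$(printf '%02d' $(( ${LAST#0} + 1 )))
echo "$NEXT"

# The new group file would then be created with, for example:
#   mkdir /dev/newvol_grp
#   mknod /dev/newvol_grp/group c 64 0x${NEXT}0000
```
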
Linux considerations
Presenting database copies back to the same Linux host is possible
only if the production volumes are not under the control of the logical
volume manager. The Linux logical volume manager does not
have a utility, such as vgchgid on HP-UX, to modify the UUID
(universally unique identifier) written in the private area of the
disk.
For an Oracle database not under LVM management, the procedure to
import and access a copy of the production data on the same host is
similar to the process for presenting the copy to a different host. The
following steps are required:
1. Execute syminq and symmir/symclone/symsnap to determine
the relationship between the Linux device name (/dev/sd<x>),
the Symmetrix device numbers that contain the production data,
and the Symmetrix device numbers that hold the copy of the
production data. In addition, note the mount points for the
production devices as listed in /etc/fstab and the output from the
command df.
2. Initiate the scan of SCSI bus by running the following command
as root:
echo "scsi scan-new-devices" > /proc/scsi/scsi
Solaris considerations
Presenting database copies to a Solaris host using VERITAS volume
manager where the host can see the individual volumes from the
source volume group is not supported other than with Replication
Manager. Replication Manager provides "production host" mount
capability for VERITAS.
The problem is that the VERITAS Private Area on both the source and
target volumes is identical. A vxdctl enable finds both volumes and
gets confused as to which are the source and target.
To get around this problem, the copied volume needs to be processed
with a vxdisk init command. This re-creates the private area. Then, a
vxmake using a map file from the source volume created with a
vxprint -hvmpsQq -g dggroup can be used to rebuild the volume
group structure after all the c#t#d# numbers are changed from the
source disks to the target disks. This process is risky and difficult to
script and maintain and is not recommended by EMC.
Windows considerations
The only difference for Windows when bringing back copies of
volumes to the same Windows server is that duplicate volumes or
volumes that appear to be duplicates are not supported in a cluster
configuration.
#!/bin/ksh
############################################################
# Define Variables
############################################################
ORACLE_SID=oratest
export ORACLE_SID
ORACLE_HOME=/oracle/oracle10g
export ORACLE_HOME
SCR_DIR=/opt/emc/scripts
CLI_DIR=/usr/symcli/bin
DATA_DG=data_dg
LOG_DG=log_dg
#############################################################
############################################################
# Establish the BCVs for each device group
############################################################
${SCR_DIR}/establish.ksh
RETURN=$?
if [ $RETURN != 0 ]; then
exit 1
fi
############################################################
# Get the tablespace names using sqlplus
############################################################
su - oracle -c ${SCR_DIR}/get_tablespaces_sub.ksh
RETURN=$?
if [ $RETURN != 0 ]; then
exit 2
fi
############################################################
# Put the tablespaces into hot backup mode
############################################################
su - oracle -c ${SCR_DIR}/begin_hot_backup_sub.ksh
############################################################
# Split the DATA_DG device group
############################################################
${SCR_DIR}/split_data.ksh
RETURN=$?
if [ $RETURN != 0 ]; then
exit 3
fi
############################################################
# Take the tablespaces out of hot backup mode
############################################################
su - oracle -c ${SCR_DIR}/end_hot_backup_sub.ksh
############################################################
# Split the LOG_DG device group
############################################################
${SCR_DIR}/split_log.ksh
RETURN=$?
if [ $RETURN != 0 ]; then
exit 4
fi
#!/bin/ksh
############################################################
# establish.ksh
# This script initiates a BCV establish for the $DATA_DG
# and $LOG_DG device groups on the Production Host.
############################################################
############################################################
# Define Variables
############################################################
CLI_BIN=/usr/symcli/bin
DATA_DG=data_dg
LOG_DG=log_dg
############################################################
# Establish the DATA_DG and LOG_DG device groups
############################################################
${CLI_BIN}/symmir -g ${DATA_DG} establish -noprompt
${CLI_BIN}/symmir -g ${LOG_DG} establish -noprompt
############################################################
# Cycle ${CLI_BIN}/symmir query for status
############################################################
RETURN=0
while [ $RETURN = 0 ]; do
${CLI_BIN}/symmir -g ${LOG_DG} query | grep SyncInProg \
> /dev/null
RETURN=$?
REMAINING=`${CLI_BIN}/symmir -g ${LOG_DG} query | grep MB | \
awk '{print $3}'`
echo "$REMAINING MBs remain to be established."
echo
sleep 10
done
RETURN=0
while [ $RETURN = 0 ]; do
${CLI_BIN}/symmir -g ${DATA_DG} query | grep SyncInProg \
> /dev/null
RETURN=$?
REMAINING=`${CLI_BIN}/symmir -g ${DATA_DG} query | grep MB | \
awk '{print $3}'`
echo "$REMAINING MBs remain to be established."
echo
sleep 10
done
exit 0
=================================================================
#!/bin/ksh
############################################################
# get_tablespaces_sub.ksh
# This script queries the Oracle database and returns with
# a list of tablespaces which is then used to identify
# which tablespaces need to be placed into hotbackup mode.
############################################################
############################################################
# Define Variables
############################################################
SCR_DIR=/opt/emc/scripts
############################################################
# Get the tablespace name using sqlplus
############################################################
############################################################
# Remove extraneous text from spool file
############################################################
> ${SCR_DIR}/tablespaces.txt
############################################################
# Verify the creation of the tablespace file
############################################################
if [ ! -s ${SCR_DIR}/tablespaces.txt ]; then
exit 1
fi
exit 0
=================================================================
#!/bin/ksh
############################################################
# begin_hot_backup_sub.ksh
# This script places the oracle database into hot backup
# mode.
############################################################
############################################################
# Define Variables
############################################################
SCR_DIR=/opt/emc/scripts
############################################################
# Do a log switch
############################################################
############################################################
# Put all tablespaces into hot backup mode
############################################################
TABLESPACE_LIST=`cat ${SCR_DIR}/tablespaces.txt`
exit 0
=================================================================
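The sqlplus step inside begin_hot_backup_sub.ksh is not shown above. The sketch below is a hedged illustration of how the tablespace list could be turned into BEGIN BACKUP statements; the paths and generated SQL are assumptions based on the surrounding scripts, not the original code:

```shell
# Hypothetical sketch: generate the ALTER TABLESPACE ... BEGIN BACKUP
# statements that begin_hot_backup_sub.ksh would feed to sqlplus.
printf 'SYSTEM\nUSERS\nDATA01\n' > /tmp/tablespaces.txt  # stand-in for
                                                         # ${SCR_DIR}/tablespaces.txt

SQL=$(awk '{printf "ALTER TABLESPACE %s BEGIN BACKUP;\n", $1}' \
      /tmp/tablespaces.txt)
echo "$SQL"

# The generated statements would then be piped to sqlplus, for example:
#   echo "$SQL" | sqlplus -s "/ as sysdba"
# end_hot_backup_sub.ksh would generate the matching END BACKUP statements.
```
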
#!/bin/ksh
############################################################
# split_data.ksh
# This script initiates a Split for the $DATA_DG Device
# group on the Production Host.
############################################################
############################################################
# Define Variables
############################################################
CLI_BIN=/usr/symcli/bin
DATA_DG=data_dg
############################################################
# Split the DATA_DG device group
############################################################
${CLI_BIN}/symmir -g ${DATA_DG} split -noprompt
############################################################
# Cycle ${CLI_BIN}/symmir query for status
############################################################
RETURN=0
while [ $RETURN = 0 ]; do
${CLI_BIN}/symmir -g ${DATA_DG} query | grep SplitInProg \
> /dev/null
RETURN=$?
REMAINING=`${CLI_BIN}/symmir -g ${DATA_DG} query | grep MB | \
awk '{print $3}'`
echo "$REMAINING MBs remain to be split."
echo
sleep 5
done
exit 0
=================================================================
#!/bin/ksh
############################################################
# end_hot_backup_sub.ksh
# This script ends hot backup mode for the Oracle
# database. The script is initiated by the main hot
# backup script.
############################################################
############################################################
# Define Variables
############################################################
SCR_DIR=/opt/emc/scripts
###########################################################
# Take all tablespaces out of hotbackup mode
############################################################
TABLESPACE_LIST=`cat ${SCR_DIR}/tablespaces.txt`
############################################################
# Do a log switch
############################################################
exit 0
=================================================================
#!/bin/ksh
############################################################
# split_log.ksh
# This script initiates a Split for the $LOG_DG Device
# group on the Production Host.
############################################################
############################################################
# Define Variables
############################################################
CLI_BIN=/usr/symcli/bin
LOG_DG=log_dg
############################################################
# Split the LOG_DG device group
############################################################
${CLI_BIN}/symmir -g ${LOG_DG} split -noprompt
############################################################
# Cycle ${CLI_BIN}/symmir query for status
############################################################
RETURN=0
while [ $RETURN = 0 ]; do
${CLI_BIN}/symmir -g ${LOG_DG} query | grep SplitInProg \
> /dev/null
RETURN=$?
REMAINING=`${CLI_BIN}/symmir -g ${LOG_DG} query | grep MB | \
awk '{print $3}'`
echo "$REMAINING MBs remain to be split."
echo
sleep 5
done
exit 0
=================================================================
Solutions Enabler Command Line Interface (CLI) for FAST VP Operations and
Monitoring
Overview
This appendix describes the Solutions Enabler command line
interface (CLI) commands that can be used to configure and monitor
FAST VP operations. All such operations can also be executed using
the Symmetrix Management Console (SMC) GUI. Although there are
command line counterparts for the majority of the SMC-based
operations, the focus here is to show only some basic tasks for which
operators may want to use the CLI.
Enabling FAST
Operation: Enable or disable FAST operations.
Command:
symfast -sid <Symm ID> enable/disable
}
Pool Bound Thin Devices(20):   <== Number of bound thin devices (TDEV) in the thin pool
{
-----------------------------------------------------------------------
            Pool              Pool                Total
Sym        Total    Subs   Allocated           Written
Dev       Tracks     (%)      Tracks    (%)     Tracks    (%)   Status
-----------------------------------------------------------------------
0162     1650000       5     1010940     61    1291842     78   Bound
...
The output shows that Symmetrix thin device 0162 has thin device
extents spread across data devices in FC_Pool, EFD_Pool, and SATA_Pool.
Legend:
Flags: (E)mulation : A = AS400, F = FBA, 8 = CKD3380, 9 = CKD3390
       (M)ultipool : X = multi-pool allocations, . = single pool allocation
--------------------------------------------------------------------
                                 I     Logical Capacities (GB)
           Target                n   --------------------------------
Tier Name  Tech  Protection      c    Enabled     Free       Used
---------- ----  ------------    -   --------  --------  --------
Legend:
Inc Type : S = Static, D = Dynamic