Abstract
This white paper provides details on best practices for backup, recovery, and
replication of Oracle databases with Dell EMC® VMAX® All Flash storage arrays.
H14232.1
This document is not intended for audiences in China, Hong Kong, Taiwan, and
Macao.
WHITE PAPER
Copyright
The information in this publication is provided as is. Dell Inc. makes no representations or warranties of any kind with respect
to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular
purpose.
Use, copying, and distribution of any software described in this publication requires an applicable software license.
Copyright © 2018 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, Dell EMC and other trademarks are
trademarks of Dell Inc. or its subsidiaries. Intel, the Intel logo, the Intel Inside logo and Xeon are trademarks of Intel
Corporation in the U.S. and/or other countries. Other trademarks may be the property of their respective owners. Published
in the USA January 2018 White Paper H14232.1
Dell Inc. believes the information in this document is accurate as of its publication date. The information is subject to change
without notice.
2 Oracle Database Backup, Recovery, and Replication Best Practices with VMAX All Flash
White Paper
Chapter 1: Executive Summary
Executive overview
Many applications are required to be fully operational 24x7x365, even as their data
continues to grow. At the same time, their RTO and RPO requirements are becoming
more stringent. As a result, there is an increasing demand for faster and more efficient
data protection.
Traditional backup methods cannot satisfy this demand because of the long duration and
host overhead required to create full backups. More importantly, during recovery, the
recovery process itself (rolling transactions forward) cannot start until the initial image of
the database is fully restored, which can take many hours.
This has led many data centers to use storage snapshots for more efficient protection.
Dell EMC SnapVX snapshots take seconds to create or restore, regardless of database
size. During restore the data is made available immediately to the user, even while
remaining changes are copied in the background.
SnapVX eliminates both the long copy times associated with backups and the huge RTOs
associated with a database restore. These problems plague host-based backup solutions
designed for medium and large mission-critical databases.
SnapVX also allows fast creation of database replicas for testing, development, reporting,
staging, making gold copies, and more. All SnapVX replicas are consistent by default,
allowing the creation of replicas in seconds, while the production database is active.
With SRDF/Metro, both source and target devices are writable and in sync, enabling an
Oracle extended RAC solution. SRDF/Metro changes the data protection framework from
a failover solution to a continuous availability solution. It allows Oracle databases and
applications to continue operations through many possible disasters, including host,
network, SAN, or even storage unavailability.
Audience

This white paper is intended for database administrators, system administrators, storage
administrators, and system architects who are responsible for implementing Oracle
database backups and replications with VMAX All Flash storage systems. Readers should
have some familiarity with Oracle databases and VMAX storage arrays.
Terminology
Table 1 explains important terms used in this paper.
Table 1. Terminology
Oracle Automatic Storage Management (ASM): Oracle ASM is a robust volume manager
and a virtual file system that Oracle databases can use to store database files. ASM can
be configured as a single-server or clustered solution, can provide mirroring, allows for
online storage migrations, and much more.

Oracle Real Application Clusters (RAC): Oracle RAC is a clustered version of the Oracle
database based on a comprehensive high-availability stack that can be used as the
foundation of a database cloud system as well as a shared infrastructure, ensuring high
availability, scalability, and agility for applications.

Restartable vs. recoverable database: Oracle distinguishes between a restartable and a
recoverable state of the database. A restartable state requires all log, data, and control
files to be consistent (see 'Storage consistent replications' in this table). For example,
after a server crash, a database shutdown abort, or a consistent snapshot, the database
will be in a restartable state. Oracle can simply be started, performing automatic
crash/instance recovery to the point in time just before the snapshot or crash took place.
A recoverable state, on the other hand, requires re-applying transaction logs to the data
files (often from the archive logs) before the database can be opened.

Rolling disasters: Rolling disasters is a term used when a first disaster disrupts the
normal database protection strategy and is followed by a second disaster, for example,
the dropping of remote replications followed by a later loss of the production site, or silent
corruption at the remote database followed by the loss of the production site.

RTO and RPO: Recovery Time Objective (RTO) refers to the time it takes to recover a
database after a failure. Recovery Point Objective (RPO) refers to the amount of data
loss after the recovery completes, where RPO=0 means no loss of committed
transactions.

Storage consistent replications: Storage consistent replications refer to storage
replications (local or remote) in which the replica maintains write-order fidelity. That
means that for any two dependent I/Os, such as a log write followed by a data update,
either both will be included in the replica, or only the first. SnapVX replicas are always
consistent, and when performed correctly (including all log, data, and control files), the
database replica is restartable. Starting with Oracle Database 11gR2, Oracle allows
database recovery from storage consistent replications without the use of hot-backup
mode (details in Oracle support note 604683.1). The feature became integrated with
Oracle Database 12c and is called Oracle Storage Snapshot Optimization.

Dell EMC ProtectPoint: ProtectPoint is a product that directly integrates Data Domain
with VMAX to provide a very fast backup and restore solution for Oracle databases,
including those residing in ASM. It can leverage SnapVX technology to send just the
changed data directly to Data Domain with each database backup, or restore just the
changes. It does not require host resources for either operation.

VMAX HYPERMAX OS: HYPERMAX OS is the industry's first open converged storage
hypervisor and operating system. It enables VMAX to embed storage infrastructure
services like cloud access, data mobility, and data protection directly on the array. This
delivers new levels of data center efficiency and consolidation by reducing footprint and
energy requirements.
VMAX storage group (SG): A collection of host-addressable VMAX devices. An SG can
be used to present devices to hosts (LUN masking), manage grouping of devices for
SnapVX and SRDF® operations, monitor performance, and more.

VMAX composite or consistency group (CG): A collection of host-addressable VMAX
devices. A CG can manage consistent local replications with SnapVX when the
application storage devices are spread across multiple VMAX arrays; in this case it is
referred to as a composite group. A CG can also manage consistent remote replications
with SRDF when the application storage devices are spread across multiple arrays or
SRDF groups; in this case it is referred to as a consistency group.

VMAX TimeFinder SnapVX: TimeFinder SnapVX is the latest generation of TimeFinder
local replication software, offering high-scale, in-memory, pointer-based, consistent
snapshots.

VMAX SRDF: SRDF (Symmetrix Remote Data Facility) is the VMAX remote replication
technology, which allows batch transfers of data as well as synchronous, asynchronous,
active/active, and cascaded replications between multiple VMAX arrays. SRDF is tightly
integrated with SnapVX to allow utilizing snapshots at the local or remote arrays.
Chapter 2: Product Overview
In 2016, Dell EMC announced the VMAX All Flash 250F, 450F, and 850F storage arrays.
In May 2017, Dell EMC introduced VMAX 950F, which replaces the VMAX 450F and
850F, and provides higher performance at a similar cost.
VMAX All Flash arrays, as shown in Figure 1, provide a combination of ease of use,
scalability, high performance, and a robust set of data services that makes them an ideal
choice for database deployments.
Figure 1. VMAX All Flash 950F (left) and 250F (right) storage arrays
VMAX All Flash benefits

VMAX All Flash storage arrays provide the following benefits:

Ease of use—Uses virtual provisioning to create new storage devices in seconds. All
VMAX devices are thin, consuming only the storage capacity that is actually written to,
which increases storage efficiency without compromising performance. VMAX devices
are grouped into storage groups and managed as a unit for operations such as:
device masking to hosts; performance monitoring; local and remote replications;
compression; and host I/O limits. In addition, you can manage VMAX devices by using
Unisphere for VMAX, Solutions Enabler CLI, or REST APIs.
High performance—Designed for high performance and low latency, VMAX arrays
scale from one up to eight engines. Each engine consists of dual directors, where
each director includes two-socket Intel CPUs, front-end and back-end connectivity,
hardware compression module, InfiniBand internal fabric, and a large mirrored and
persistent cache.
All writes are acknowledged to the host as soon as they are registered with the VMAX
cache [1]. Writes are de-staged to flash only later, perhaps after multiple database updates.
Reads also benefit from the VMAX large cache. When a read is requested for data
that is not already in cache, FlashBoost technology delivers the I/O directly from the
back-end (flash) to the front-end (host). Reads are only later staged in the cache for
possible future access. VMAX also excels in servicing high bandwidth sequential
workloads that leverage pre-fetch algorithms, optimized writes, and fast front-end and
back-end interfaces.
Data services—Offers a strong set of data services. It natively protects all data with
T10-DIF from the moment data enters the array until it leaves (including replications).
With SnapVX and SRDF, VMAX provides many topologies for consistent local and
remote replications. Dell EMC ProtectPoint™ provides an integration with Data
Domain™, and Dell EMC CloudArray™ provides cloud gateways. Other VMAX data
services include Data at Rest Encryption (D@RE), Quality of Service (QoS) [2],
compression, the “call home” support feature, non-disruptive upgrades (NDU),
non-disruptive migrations (NDM), and more. In virtual environments, VMAX also supports
VMware vStorage APIs for Array Integration (VAAI) primitives such as write-same and
xcopy.
NOTE: While the topic is not covered in this paper, you can also purchase VMAX as part of a
Converged Infrastructure (CI). For details, refer to Dell EMC VxBlock System 740, and Dell EMC
Ready Bundles for Oracle.
The replicated devices can contain the database data, Oracle home directories, data that
is external to the database (e.g. image files), message queues, and so on.
[1] VMAX All Flash cache is large (from 512 GB to 16 TB, based on configuration), mirrored, and
persistent due to the vault module that protects the cache content in case of power failure and
restores the cache when the system comes back up.
[2] Two separate features support VMAX QoS. The first relates to host I/O limits, which enable
placing IOPS or bandwidth limits on “noisy neighbor” applications (sets of devices) such as test/dev
environments. The second relates to slowing down the copy rate for local or remote replications.
The following list describes the main SnapVX characteristics related to native snapshots:
SnapVX snapshots are always space-efficient, as they are simply a set of pointers
that reference the original data while it is unmodified, or the snapshot's own preserved
version of the data once the source data is modified after the snapshot was taken.
Multiple snapshots of the same data save both storage and memory by pointing to the
same locations (tracks).
SnapVX snapshots are targetless, meaning they cannot be used directly.
Instead, snapshots can be restored back to the source devices, or linked to
another set of target devices that match the source devices in size. The target
devices can be host-accessible. A relink operation refreshes the target devices with
new snapshot data.
Snapshot operations are performed on a group of devices. This group is defined by
using either a text file with device IDs, a ‘device-group’ (DG), ‘composite-group’
(CG), a ‘storage group’ (SG), or simply specifying the devices. The recommended
way is to use a storage group (SG).
Snapshots are taken using the establish command. When establishing a snapshot,
provide a snapshot name, and optionally set an expiration time. Each snapshot
has a generation number which is incremented if the same snapshot name is used.
Generation 0 is always the latest snapshot. The snapshot time is listed together
with the snapshots, adjusted to the local time-zone.
Snapshot operations take seconds to complete, regardless of the size of the data.
For that reason, creating a snapshot of a large database is very fast. When a
snapshot is restored, the operation also takes seconds and as soon as it is done,
the source devices reflect the snapshot data. If necessary, a background copy will
take place and prioritize any requests to tracks that weren’t already copied.
Similarly, a link operation takes seconds to complete and when it is done, the target
devices reflect the snapshot data.
For legacy reasons, the SnapVX link operation can use the option “-copy”, which
creates a full-copy clone. The outcome is that the original data is duplicated within
the array and the target devices point to the copy. This behavior is not
recommended with All Flash arrays due to the inefficiencies involved in the copy
operation and the additional capacity utilized by the copy. Full copies do not improve
performance or data resiliency.
Defining phase: initially, when a snapshot is linked to target devices, accessing their
data is achieved indirectly by using the snapshot pointers. As part of the
background operation that takes place during the link, the target devices' pointers are
changed to point directly to the data. When this process ends, the snapshot is in a
defined state, and the target devices become a stand-alone image of the
snapshot data, regardless of whether the snapshot used '-copy' or not, and regardless
of whether the snapshot is unlinked or terminated. Unless '-copy' was used, both the
source and linked-target devices point to shared data, creating dedupe-like efficiencies.
Linked-target devices cannot restore any changes directly to the source devices.
Instead, a new snapshot can be taken from the target devices and linked back to
the original source devices. In this way, SnapVX allows an unlimited number of
cascaded snapshots.
Snapshots are protected. That means that even if a snapshot is restored and the
source devices are modified, or the snapshot is linked and the target devices are
modified, the snapshot itself remains intact and can be reused over and over with the
same original data. Optionally, snapshots can be secured. A secured snapshot cannot
be terminated by users before its retention period expires.
SnapVX snapshots are always consistent. That means that snapshot creation
always maintains write-order fidelity. This allows easy creation of restartable
database copies, or Oracle database recoverable backup copies based on Oracle
Storage Snapshot Optimization.
Source devices can have up to 256 snapshots that can be linked to up to 1,024
targets, providing very high scalability.
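As a rough sketch (not a definitive procedure), the snapshot operations described above map to Solutions Enabler commands along these lines; the storage group, link-target group, and snapshot names are illustrative assumptions:

```shell
# Create a consistent snapshot of all devices in the storage group,
# with an optional expiration (time-to-live) of 2 days
symsnapvx -sg oracle_sg establish -name daily_snap -ttl -delta 2

# List snapshots; generation 0 is always the latest
symsnapvx -sg oracle_sg list

# Link the snapshot to a matching set of target devices (no-copy by default)
symsnapvx -sg oracle_sg -lnsg mount_sg -snapshot_name daily_snap link

# Refresh the target devices with data from a newer snapshot
symsnapvx -sg oracle_sg -lnsg mount_sg -snapshot_name daily_snap relink

# Restore the snapshot back to the source devices
symsnapvx -sg oracle_sg -snapshot_name daily_snap restore
```

Each of these operations completes in seconds, with any required copying handled in the background as described above.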
For more information on SnapVX, refer to: Dell EMC HYPERMAX OS TimeFinder local
replication technical note and the EMC Solutions Enabler CLI Guides.
SRDF Adaptive Copy (SRDF/ACP) mode allows bulk transfers of data between
source and target devices without write-order fidelity and without write performance
impact to source devices. Use SRDF/ACP during data migrations or to
resynchronize an SRDF target when many changes are owed to the target site. Set
SRDF/ACP to perform a one-time transfer, or to continuously send changes in bulk
until a specified delta between source and target remains. Once the delta is small
enough, change the SRDF mode to another mode, such as SRDF/S or SRDF/A.
SRDF/Metro is an extension of SRDF/S. With SRDF/Metro, devices from both
source and target arrays are in sync, and can perform both reads and writes
(active/active topology). To the host, SRDF/Metro makes the source and replicated
devices seem identical by giving them the same SCSI identity. As a result, the host
software (usually a cluster) can benefit from high-availability across distance,
avoiding most of the added complexity of setting up geo-clusters. If one of the
arrays becomes unavailable, the cluster software will automatically failover to the
surviving site (Oracle RAC reconfiguration) and database operations continue from
there without user intervention or downtime. SRDF/Metro is a great option to create
an Oracle Extended Cluster without added complexity to the cluster configuration.
SRDF groups

SRDF devices are configured in groups and managed together as follows:

An SRDF Consistency Group is an SRDF group for which consistency has been
enabled. Consistency can be enabled for either synchronous or asynchronous
replication modes.
An SRDF consistency group maintains write-order fidelity (also called dependent-write
consistency) to make sure that the target devices always provide a restartable
replica of the source application.
NOTE: Even when consistency is enabled, the remote devices may not yet be consistent
while the SRDF state is sync-in-progress. This happens when SRDF initial synchronization is
taking place, before it enters a ‘consistent’ or ‘synchronized’ replication state.
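As a hedged sketch of enabling consistency protection on an SRDF group with Solutions Enabler, assuming SRDF/A; the storage group name and SRDF group number are illustrative:

```shell
# Set the SRDF group to asynchronous replication mode
symrdf -sg oracle_sg -rdfg 10 set mode async

# Enable consistency protection so the target maintains write-order fidelity
symrdf -sg oracle_sg -rdfg 10 enable

# Verify the replication state; 'Consistent' indicates a restartable target
symrdf -sg oracle_sg -rdfg 10 verify -consistent
```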
As a result, the Oracle block structure or data can be either incorrect or incomplete, and
often neither the database nor the user will know about it until a database read of the
affected block takes place, which can be minutes, hours, days, or perhaps even longer
afterward.
To avoid silent corruptions, VMAX utilizes a SCSI standard called T10-PI (Protection
Information), which is sometimes referred to as T10-DIF (Data Integrity Field). With DIF,
the 512-byte disk sector is extended to 520 bytes, adding 8 bytes of protection
information to each such block. The protection information includes three parts: a 16-bit
guard tag, which is used for a CRC check; a 32-bit reference tag, which is used to validate
the correct address (location) of the block; and an application tag, which can be used in
different ways but is currently mostly ignored.
Internally, VMAX utilizes DIF extensively from the moment the I/O arrives at the array and
as it passes through the different emulations, such as front end, memory, and back end
(disk). It is important to realize that VMAX uses DIF for all replications, including local and
remote, to validate that the data is replicated accurately.
Externally, VMAX can work with other layers that support external DIF, such as the HBAs,
the Linux kernel, and even Oracle ASM. In this way, the protection is extended between
the host and the VMAX storage for all reads and writes, including storage replications.
For more information about Oracle involvement in supporting external DIF see:
https://oss.oracle.com/~mkp/docs/OOW2011-DI.pdf
For supported configurations with external DIF, refer to the Dell EMC eLab Navigator,
VMAX All Flash/VMAX3 Features Simple Support Matrix.
Chapter 3: Use Cases Considerations, Lab Configuration, And VMAX Device Identification
If snapshots are used as part of a disaster protection strategy then the frequency of
creating snapshots can be determined based on the RTO and RPO needs.
Copy vs. no-copy snapshot target

SnapVX snapshots cannot be directly accessed by a host. They can be either restored to
the source devices or linked to up to 1,024 sets of target devices. When linking any
snapshot to target devices, SnapVX allows using the copy or no-copy option, where
no-copy is the default.
No-copy option: No-copy linked targets remain space efficient by sharing pointers with
the production devices and the snapshot. Only changes to either the linked targets or
the production devices consume additional storage capacity. It is important to know that
no-copy linked targets retain their data even after they are unlinked. This requires them
to first be in the 'defined' state, meaning that the target devices' pointers point directly
to the storage data and no longer use indirect pointers via the snapshot.
Copy option: Alternatively, the linked-targets can have their own full copy of the data,
and will not be sharing pointers with the production devices and snapshot. The copy
option is not recommended for VMAX All Flash because it consumes a lot more capacity
without providing performance or resiliency advantages over no-copy linked targets. It is
mainly used for legacy operations or with products such as ProtectPoint, where the target
devices are actually Data Domain encapsulated devices.
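The two link modes can be sketched as follows; the storage group and snapshot names are illustrative:

```shell
# Default no-copy link: space efficient, shares pointers with the snapshot
symsnapvx -sg prod_sg -lnsg mount_sg -snapshot_name nightly link

# Full-copy link: duplicates the data within the array (generally not
# recommended on All Flash, except for cases such as ProtectPoint with
# Data Domain encapsulated devices)
symsnapvx -sg prod_sg -lnsg mount_sg -snapshot_name nightly link -copy
```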
Oracle database restartable, recoverable, and hybrid snapshots

SnapVX creates consistent snapshots by default, which are well-suited for a database
restart solution. Simply open a restartable database replica. It will then perform crash or
instance recovery, just as if the server had rebooted or the DBA had performed a shutdown
abort. To achieve a restartable solution, all data, control, and redo log files must participate
in the consistent snapshot. Archive logs are not required and are not used during database
crash/instance recovery. Restartable database snapshots are covered in Chapter 4.
SnapVX can also create recoverable replicas. A recoverable database replica can perform
database recovery to a desired point in time using archive and redo logs. Oracle
Database 12c enhanced the ability to create a recoverable database solution based on
storage snapshots with Oracle Storage Snapshot Optimization.
For a recoverable snapshot that will be recovered on the production host and therefore
relies on the available redo logs and archive logs, the snapshot can include just the data
files. However, if the snapshot will be used on another host (such as when using linked
targets and presenting them to a mount host), take an additional snapshot of the archive
logs, following the best practice described in Chapter 5.
Redo logs are not required for a recoverable snapshot and are not part of a roll forward,
since the redo logs in the snapshot will never include the latest transactions. However, the
redo logs may optionally be included so that the DBA doesn’t have to create the +REDO
ASM disk group from scratch on the mount host. Redo logs can also be used for creating
a hybrid replica as explained in the next paragraph.
To create a hybrid replica that can be used for both recovery and restart, include all data,
control, and redo logs in the first snapshot (or SRDF session), and archive logs in a
second snapshot (or SRDF session), following the best practice for recoverable database
replicas.
If a restartable solution is chosen, the archive log replica is not needed, but it can be used
on the mount host if the DBA wants a +FRA ASM disk group identical to production's
available after the database restart takes place. If a recoverable solution is chosen, the
replica of the redo logs is not needed (especially if restoring back to production, to
avoid overwriting production's redo logs). However, on a mount host, the DBA may want a
+REDO ASM disk group identical to production's available after the database recovery
takes place. The use cases in this paper always create a snapshot with both +REDO and
+DATA included, to allow the greater flexibility of a hybrid replica.
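The two-snapshot sequence for a hybrid replica can be sketched as follows; the storage group and snapshot names are illustrative assumptions:

```shell
# 1. Consistent snapshot of data, control, and redo log devices
#    (a parent SG containing both +DATA and +REDO storage groups)
symsnapvx -sg oracle_db_sg establish -name db_snap

# 2. On the database host, archive the current redo log so the archive
#    logs cover the snapshot time:
#      SQL> alter system archive log current;

# 3. Snapshot of the archive log (+FRA) devices, taken after the log switch
symsnapvx -sg oracle_fra_sg establish -name arch_snap
```

Taken together, the first snapshot supports a restart solution on its own, while the pair supports recovery on a mount host.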
RMAN and storage replications

Oracle Recovery Manager (RMAN) is tightly integrated with the Oracle database. It can
perform host-based backups and restores on its own, but it can also work very effectively
with storage snapshots.
RMAN backups can be performed from VMAX snapshots that are mounted to a mount
host, sending the backup to a target outside the VMAX, such as Data Domain. Since
RMAN doesn’t depend on which host it operates from, it can later restore that backup
directly to production.
RMAN incremental backups can continue to leverage Oracle Block Change Tracking,
even if the backup was offloaded to the mount host. RMAN can also use the mount host
to validate the database integrity.
Restore optimizations are realized when combining RMAN with storage snapshots. Once
we restore a recoverable snapshot to production, RMAN can use it to finish the database
recovery operations on that image, combining the power of RMAN with storage
snapshots.
RMAN can also leverage the storage snapshot as a copy of production. Mount the
snapshot to the production host with a new location (for instance, a new ASM disk group
name). Once RMAN catalogs it, the snapshot can be used to quickly recover any
corruptions in the production database.
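As a sketch of this approach, assuming the snapshot was mounted to the production host under a renamed disk group (+DATA2 and the path are illustrative names, not from the original document):

```
RMAN> catalog start with '+DATA2/PROD/DATAFILE/' noprompt;
RMAN> switch database to copy;
RMAN> recover database;
```

Once the snapshot copies are cataloged, RMAN treats them as ordinary datafile copies, so standard RMAN recovery commands apply.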
Storage snapshots host user

Typically, DBAs execute Oracle SQL and RMAN commands, and storage admins
execute storage management operations (such as SnapVX or SRDF commands). This
type of role management and security segregation is common in large organizations
where each group manages their respective infrastructure with a high level of expertise.
There are reasons to merge these roles to some extent. For example, allow the database
backup operator to have controlled access to both Oracle SQL and SnapVX commands
so they can create their own backups, leveraging storage snapshots. Use VMAX Access
Controls (ACLs) to allow the backup manager limited control of a defined set of devices
and operations, tied to a specific backup host.
It goes beyond the scope of this paper to discuss the configuration and usage of VMAX
ACLs; however, it is important to mention that Solutions Enabler can be installed for a
non-root user, and together with ACLs, allows the storage admins to offload such backup
operations to the backup admin.
Snapshot time and clock differences

When performing media recovery, Oracle looks for either the end hot-backup mode
marker in the archive logs, or for the user to supply the 'snapshot time' during the media
recovery, which is the time the snapshot of the data files was created.
View the snapshot time by listing the snapshots, as shown below. However, keep in mind
that the storage management software shows the times adjusted to its own clock and
time zone. If that clock exactly matches the production database server's clock (for
example, when using NTP), then the listed times can be used as the 'snapshot time'.
Alternatively, during the backup, include the database server time in the snapshot name.
----------------------------------------------------------------------------
Sym Num Flags
Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp
----- -------------------------------- ---- ------- ------------------------
00067 database_20171025-160003 1 .X.. .. Wed Oct 25 16:00:03 2017
database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017
database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017
00068 database_20171025-160003 1 .X.. .. Wed Oct 25 16:00:03 2017
database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017
database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017
...
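To embed the database server time in the snapshot name, as suggested above, a minimal shell sketch; the storage group name in the comment is an example:

```shell
# Build a snapshot name that carries the database server's local time,
# so media recovery can use it as the 'snapshot time' regardless of the
# storage array's clock or time zone.
SNAP_NAME="database_$(date +%Y%m%d-%H%M%S)"
echo "$SNAP_NAME"
# The snapshot itself would then be created with, for example:
#   symsnapvx -sg oracle_data_sg establish -name "$SNAP_NAME"
```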
During the media recovery Oracle inspects the data file headers (list them using the
following query), and compares the last checkpoint time to the snapshot-time.
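A query along the following lines lists the data file headers' checkpoint times (the column selection and formatting here are illustrative):

```sql
SELECT file#,
       checkpoint_change#,
       TO_CHAR(checkpoint_time, 'YYYY-MM-DD HH24:MI:SS') AS checkpoint_time
FROM   v$datafile_header
ORDER  BY file#;
```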
Oracle expects that the last checkpoint took place prior to the snapshot time. If the
checkpoint time is later than the snapshot time, Oracle produces an ORA-19839 error.
To avoid this situation, make sure that the snapshot time is accurate and fits the clock and
time zone of the database server from which the snapshot was created.
Tests show that in some cases (unrelated to storage snapshots), the file headers'
checkpoint time is a minute or two ahead of the actual database server clock (or
'sysdate'). While this seems like an Oracle bug (a case was opened), and the chance
of a database checkpoint occurring just before a snapshot is slim, it is described here in
case it occurs at a customer site.
If you receive an ORA-19839 message, and the file headers' checkpoint_time shows a
minute or two ahead of the snapshot time, use the checkpoint_time from the file
headers as the 'snapshot time' in the media recovery. It only means that Oracle will
require slightly more recovery before the database can be opened.
Define at least three ASM disk groups (and matching VMAX storage groups) for
maximum flexibility: +DATA (data and control files), +REDO (redo logs), and +FRA
(archive logs). A parent storage group that includes both +DATA and +REDO is
recommended, and is used for creating restartable replicas.
The separation of data, redo, and archive log files allows backup and restore of only the appropriate file types at the appropriate time. For example, Oracle backup procedures require the archive logs to be replicated at a later time than the data files. Also, during restore, if the redo logs are still available on the production host, we can restore only the data files without overwriting the production redo logs.
ASM, ASMlib, and ASM Filter Driver
Another aspect of ASM is that it can be used without additional drivers, pointing directly to the storage devices. Alternatively, it can utilize ASMlib, an optional driver that places labels on the storage devices, which ASM then uses when creating the disk groups; or it can
use ASM Filter Driver (AFD), which also provides its own device labels (and other
functionality).
From the perspective of VMAX storage replications (SRDF and SnapVX), it does not matter whether ASM, ASMlib, or AFD is used. What is important is to be consistent: for example, if AFD is used on production, it should also be used on the mount host. The examples in this paper use AFD. If ASMlib is used instead, additional ASMlib-specific operations are required on the mount host, such as an ASMlib disk scan and, when necessary, renaming the ASMlib labels.
Since +GRID does not contain any user data, and since GI setup contains host specific
information, do not include +GRID in the replications (SnapVX or SRDF). Instead, pre-
install GI on the mount host(s) with its own +GRID. When the replicated ASM disk groups
are made visible to the mount host(s), they can simply be mounted into the existing
cluster.
If the production database uses RAC, the database can be started on the mount host(s) in either clustered or non-clustered mode. Because Oracle RAC uses shared storage, all data must be visible to all nodes and is therefore part of the replication, regardless of how the database is started on the target.
If production is not clustered, there will not be a readily available cluster waiting on the mount host to mount the replicated ASM disk groups. Instead, if it is not already running, start the ASM software stack once the replica is made available, using the 'crsctl start has' command.
Oracle uses Fast Recovery Area (FRA) as the location of the flashback log. In this paper
we created an ASM disk group called +FRA for the Fast Recovery Area and we assume
that the archive logs are sent there. While archive log destination can be any ASM disk
group (with default to the FRA), flashback logs always go to the FRA. Typically, a very
large capacity is required for flashback logs, even with a relatively small retention time.
When Oracle Flashback Database is enabled, the restartable replica that contains the data, redo, and control files has to be extended to include the flashback logs in the same storage replica (SnapVX or SRDF). This ensures that the latest past images of database blocks are consistent with the data files. That means restartable snapshots have to include +DATA, +REDO, and +FRA (not only +DATA and +REDO).
In that case, the DBA may want to consider separating the ASM disk groups of the archive
logs from the flashback logs. For example, create a disk group called +ARCH and send
the archive logs there while flashback logs go to +FRA. By doing so, the recoverable use
cases described in this paper will be possible, as they require a snapshot of the archive
logs to occur after a snapshot of the data files.
NOTE: When more than one SRDF group participates in the replications, a CG has to be created
and used to enable consistency.
In a similar way, SRDF/S can also benefit from enabling consistency. However, unlike SRDF/A, SRDF/S does not allow enabling consistency at the SG level even when a single SRDF group is used; it requires a CG instead. For simplicity, the SRDF/S examples in this paper manage replications using an SG; however, it is a best practice to enable consistency for both SRDF/S and SRDF/A. In the SRDF/S case, this means using a CG to manage the replications, even if a single SRDF group is used.
If Oracle backup is offloaded to the target storage array, the archive logs are
needed there. In this case, the archive logs can use the same or a different SRDF
group. However, the replication mode (Sync or Async) should match that of the
data files. That means that if database_sg is replicated in Sync mode, fra_sg should
also be replicated in Sync mode, so that regardless of where the database is
started (local or remote), all the appropriate archive logs are available.
If the DBA wants a +FRA ASM disk group that is identical to the production
database’s at the target site, this can be accommodated. While +FRA can be
created separately on the target array (saving replication bandwidth), the DBA may
prefer to prevent any differences by simply replicating the production database’s
+FRA.
As discussed earlier, if Oracle Flashback Database will be used at the remote site,
then both flashback and archive logs need to be replicated together with the other
database files. The DBA can still decide to keep the archive logs in another ASM
disk group to allow for remote backups.
Another reason to replicate the temp files is that although they don’t contain user data or
participate in a recovery or restart solution, Oracle will be looking for them when it
attempts to open the database at the target site. So that database operations are not
delayed, it is best to include them in the replications together with database_sg.
One of SRDF's strengths is its ability to create a consistency group across a group of such databases, external files, and message queues, as long as they all reside on VMAX storage devices. This is very powerful because in a true disaster, not all systems crash at exactly the same time. As a result, solutions that cannot maintain consistency across databases may spend a huge amount of time after the disaster reconciling dependencies, owed transactions and their order, and message queues between databases before the databases can be accessed by users.
SRDF consistency groups can include all such related databases, applications, message
queues, and external files, making all these related components consistent with each
other, so after a disaster, simple restart operations take place and user operations resume
quickly.
In that case, Oracle Home should be installed on a VMAX device and should be included
in the replications. It can have its own SRDF group or can be joined with the data files.
For example, if SRDF replication was interrupted (planned or unplanned) and changes accumulated on the source array, then once synchronization resumes, and until the target
array reaches a synchronized state (SRDF/S) or a consistent state (SRDF/A), the target database image is not usable. For that reason, it is a best practice to take a gold copy snapshot at the target site before such resynchronization starts. This gold copy preserves the last valid remote image of the database as a safety measure until the target is in sync again.
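As a sketch, the gold copy can be taken with SnapVX on the remote array before resuming SRDF. The array ID, storage group, and snapshot name below are assumptions, and the command is echoed as a dry run:

```shell
# Dry-run sketch: gold-copy snapshot of the SRDF target devices before
# resynchronization (names are assumptions; drop the echo to execute
# on the remote storage management host).
SID=047
GOLD_CMD="symsnapvx -sid $SID -sg database_tgt_sg -name gold_copy establish"
echo "$GOLD_CMD"
```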
From a storage replication perspective, both SnapVX and SRDF replicate the source data to the target accurately. As mentioned in the section VMAX and T10-DIF protection from silent corruptions, VMAX uses T10-DIF. That means the I/O is vulnerable to corruption only while it travels from the source database to the storage; after that, it is protected by VMAX, including during replications. In other words, only one vulnerability path exists. If external T10-DIF is added, even that vulnerability path is eliminated.
A log shipping solution has two active databases – the source, and the standby (target).
Although the log records shipped to the standby are validated before they are applied,
once they are applied, the database changes are going through the I/O path just like on
the source database. In other words, a log shipping solution has two vulnerability paths
where silent corruptions may occur. Of course if VMAX is used for both, external T10-DIF
can be enabled for both.
Therefore, without external T10-DIF (which makes both solutions resilient), a log shipping solution has twice the vulnerability of VMAX replications. The slight difference is that a VMAX replica will be identical to the source (including pre-existing corruptions, if there are any), whereas a log shipping replica can introduce new corruptions, due to the I/O exposure while the standby writes to its data files.
Silent data corruptions are not discovered until the data is read, and that can be a long
time after it was written. In a ‘rolling disasters’ case, first corruptions are introduced to the
replication target (storage replication or log shipping replication), and then the production
database is lost. To avoid this, with either replication technology it is a good practice to
check for database corruptions periodically. With storage replications, either the source or
target databases can be validated, as described in Database integrity validation on a
mount host. In the log shipping solution, both the production and standby databases
should be tested (as different silent corruptions may exist in each).
To summarize, VMAX replications can actually be considered safer than log shipping
replications. The choice of storage replications or log shipping should be driven by
business needs. Very often, both types are used in parallel. Remember that a big
advantage for SRDF is its ability to create a consistency group across a group of related
databases and applications, including external files and message queues. The ability to
perform restart operations after a disaster where everything is consistent, instead of
reconciling between databases, saves time and reduces complexity.
Lab configuration
The following tables show the environment used to test and demonstrate the use cases described in the following chapters. Table 1 shows the VMAX storage environment, Table 2 shows the host environment, and Table 3 shows the Oracle ASM and VMAX storage groups configuration.
HYPERMAX OS 5977.1125
4 Any Oracle 12c feature or best practice in this paper is applicable to both Oracle Database release 12.1 and release 12.2. Solutions based on hot-backup mode fit older Oracle releases as well.
DB Size: 1.2 TB
Figure 3 shows the overall test configuration used for the local replications use cases.
A 2-node Oracle 12.2 RAC database ran on the local array, a VMAX 950F (SID 048). ASM was configured with a +GRID disk group for Grid Infrastructure, with normal redundancy and no user data; as such, it was not part of the replications. The other ASM disk groups (+DATA, +REDO, +FRA) used external redundancy and were matched with VMAX storage groups (data_sg, redo_sg, and fra_sg). A parent storage group, database_sg, contained both data_sg and redo_sg.
The +GRID ASM disk group was pre-configured on the VMAX devices of the mount hosts and was not based on the production snapshots. Once the snapshot target devices were made available to the mount host, their ASM disk groups were simply mounted into the pre-configured cluster.
Figure 4 shows the overall test configuration used for the remote replications test cases.
A 2-node Oracle 12.2 RAC database was running on the local array, a VMAX 950F (SID 048). ASM was configured with a +GRID disk group for Grid Infrastructure (GI), with normal redundancy and no user data; as such, it was not part of the replications. The other ASM disk groups (+DATA, +REDO, +FRA) used external redundancy and were matched with VMAX storage groups (data_sg, redo_sg, and fra_sg). A parent storage group, database_sg, contained both data_sg and redo_sg.
The remote array, a VMAX 950F (SID 047), was configured with its own +GRID ASM disk group. Once the SRDF target devices or the remote snapshot target devices were made available, ASM was simply able to mount these disk groups and use them.
There are two storage management hosts – local and remote. While SRDF is connected,
each storage management host can issue commands to either local or remote arrays.
However, it is best to have a storage management host prepared in each site in case a
disaster occurs and the links between the arrays are not operational.
NOTE: To make use of Solutions Enabler CLI, a storage management host (or vApp) is required.
However, if only Unisphere or REST APIs are used then the VMAX embedded management
module can be used.
Linux aliases
Two Linux aliases are used in the examples to change the Oracle user environment variables between the database and ASM.
‘TODB’ is a Linux alias that sets the Oracle user environment variables ORACLE_HOME and ORACLE_SID to those of the database.
‘TOGRID’ is a Linux alias that sets the Oracle user environment variables ORACLE_HOME and ORACLE_SID to those of the Grid Infrastructure (ASM).
We used the TOGRID or TODB aliases prior to executing commands associated with ASM or the database.
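The aliases can be sketched as follows; the ORACLE_HOME paths and SIDs are assumptions and must match the actual installation:

```shell
# Sketch of the two aliases (paths and SIDs are assumptions; adjust per site).
shopt -s expand_aliases   # needed for alias expansion in non-interactive shells
alias TODB='export ORACLE_HOME=/u01/app/oracle/product/12.2.0/dbhome_1 ORACLE_SID=slob1'
alias TOGRID='export ORACLE_HOME=/u01/app/12.2.0/grid ORACLE_SID=+ASM1'
TOGRID            # switch the environment to Grid Infrastructure (ASM)
echo "$ORACLE_SID"
```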
Sometimes it is necessary to match the VMAX device IDs of a storage group (SG) to the device presentation on the production or mount database servers; for example, when creating text files with device names in order to perform an ASM disk group rename.
In the following example, we want to match the data_mount_sg VMAX device IDs with the devices on the database host.
First, identify the storage device IDs that are part of data_mount_sg. To find the device
IDs of the storage group, use the Unisphere interface, or use the following command from
the storage management host:
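The original command is not reproduced here; as a sketch (the array ID is an assumption), a Solutions Enabler symsg listing shows the devices contained in the SG. The command is echoed as a dry run; drop the echo to execute:

```shell
# Dry-run sketch: list the devices of data_mount_sg from the management host
# (array ID is an assumption; drop the echo to execute).
SHOW_CMD="symsg -sid 048 show data_mount_sg"
echo "$SHOW_CMD"
```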
To match storage and host devices you can use the scsi_id command, a PowerPath
command, or an inq (inquiry) command, as explained in these sections.
Using scsi_id command
The scsi_id Linux command is part of the ‘sg3-utils’ module. If that module is not already installed on the database servers, you can add it (‘yum install sg3-utils’). In the following list, the three digits of the storage serial ID (for example, 048) are followed by the device ID (for example, 066, 075, etc.). The command can be adjusted to /dev/mapper/*p1 if you are using native multipathing.
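A loop of this kind can be sketched as follows (the device-name pattern and the scsi_id path are assumptions; adjust the glob to /dev/mapper/*p1 for native multipathing):

```shell
# Hedged sketch: print each candidate device with its SCSI WWID; the VMAX
# serial digits and device ID can then be read from the WWID.
OUT=/tmp/vmax_wwids.txt
: > "$OUT"
for dev in /dev/emcpower*1; do
    [ -e "$dev" ] || continue              # skip when no such devices exist
    printf '%-22s %s\n' "$dev" "$(/usr/lib/udev/scsi_id -g -u "$dev")" >> "$OUT"
done
cat "$OUT"
```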
Using PowerPath commands
PowerPath commands can be executed on the database server to list the host device presentation for each storage device, as shown below:
Using Inq command
Another option is to download the free, stand-alone Inquiry binary from the Dell EMC ftp website and use it on the database server to list the devices, as shown below:
Chapter 4: Restartable Database Snapshots
NOTE: Snapshots are protected. That means a snapshot of the production database can be restored over and over, for example during a patch update that fails multiple times. It also means that if the snapshot is used as a new copy of the database, and that copy is modified (for example, sensitive data is masked), the snapshot's original data remains intact and can be used to create more copies of the original database. If a copy of the masked database is desired, a new snapshot of the target devices holding the masked database can be created and used as a source for other database copies.
Requirements
To satisfy the requirements for a restartable snapshot, it has to include all redo logs, control files, and data files, and it has to be taken in a single storage-consistent snapshot operation. Note that native SnapVX snapshots are always consistent, even if the snapshot includes devices spread across multiple arrays.
Other files such as temp files or archive logs are not required. Consider including them if
the snapshot purpose is to create a database copy, which may require its own temp files
and/or archive logs, or if they are mixed with the other files and will be included in the
snapshot anyway (e.g. temp files sharing devices with data files).
The following steps show how to create a valid restartable database snapshot.
1. Identify the storage group that contains all the data files, control files, and redo logs. Ideally, data files and redo logs each have their own ASM disk group and matching storage group (for example, data_sg for the +DATA ASM disk group, and redo_sg for the +REDO ASM disk group). In that case, a parent SG such as ‘database_sg’ may already exist or can be added, containing data_sg and redo_sg.
5 As soon as a snapshot is restored, its data is available to the source devices, even as background copy of the changed data takes place. If a host I/O requests data that has not been copied yet, that copy is prioritized.
SQL> insert into testTbl values (2, 'After first snapshot taken');
SQL> commit;
NOTE: When the same storage group (SG) and snapshot name are used to create additional
snapshots, a new snapshot generation is created, where generation 0 always points to the latest
snapshot. When snapshots are listed, the date/time information of each generation is shown.
SQL> insert into testTbl values (3, 'After second snapshot taken');
For this purpose, we created a set of matching target devices and placed them in SGs similar to production's SGs, with the word ‘mount’ in the SG names (since we present them to another host, which we refer to as the ‘mount’ host). For the following examples, we created redo_mount_sg, data_mount_sg, and a parent SG that included both, called database_mount_sg.
It is important to consider zoning and LUN masking, which are the operations of making
devices visible to hosts. In this example, the mount host is pre-zoned to the storage array,
and the target devices are placed in a masking view and made visible to the mount host,
even before the snapshot is linked.
Remember that if the devices are presented to the mount host for the first time, their partitions only become visible once the snapshot is linked to the target devices. This is no longer a consideration once the snapshot is refreshed (relinked), as by then the partitions are already known to the mount host, with the correct permissions.
When deciding which snapshot generation to use, it is best to first list the snapshot generations (use Unisphere, or the ‘-detail’ option of the ‘symsnapvx list’ command) and choose the appropriate snapshot. When linking a snapshot to target devices, if a generation number is not specified, gen 0 (the latest snapshot) is assumed. The CLI command to link a snapshot to target devices is shown below:
If the target SG already has a snapshot linked to it, there is no need to ‘unlink’ it prior to
refreshing the target SG with another snapshot. Simply relink the new snapshot with the
same command as above, using the ‘relink’ option instead of ‘link’.
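Both commands can be sketched as dry runs (the array ID, storage group, and snapshot names are assumptions; drop the echo to execute):

```shell
# Dry-run sketch: link a snapshot to the mount SG, and the matching relink
# used to refresh an already-linked target (names are assumptions).
SID=048
LINK_CMD="symsnapvx -sid $SID -sg database_sg -lnsg database_mount_sg -snapshot_name database_snap link"
RELINK_CMD="symsnapvx -sid $SID -sg database_sg -lnsg database_mount_sg -snapshot_name database_snap relink"
echo "$LINK_CMD"
echo "$RELINK_CMD"
```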
Procedure
This procedure shows how to link a snapshot to target devices using CLI. Then, start the target database and inspect the data.
1. Choose a snapshot to link. By listing the snapshots with -detail flag, each
generation and its date/time is shown.
(Output truncated: ‘symsnapvx list -detail’ shows each snapshot generation with its timestamp, size in GBs, and expiration date.)
2. Link the snapshot to the target devices. We use generation 1, which is the first
database snapshot we took in the previous section.
3. The target host should already be zoned and masked to the target devices. If this
is the first time a snapshot is made visible to the target host, you should reboot or
rescan the SCSI bus to make sure the devices and their partitions are recognized
by the mount host. Give the partitions (if used) or devices (otherwise) Oracle
permissions.
4. Log in to the ASM instance on the target host. The ASM disk groups on the target
devices should be in the unmounted state. Mount them using ‘asmcmd’ or SQL,
such as the following example.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit
Production
NAME STATE
------------------------------ -----------
REDO DISMOUNTED
DATA DISMOUNTED
GRID MOUNTED
Diskgroup altered.
Diskgroup altered.
NAME STATE
------------------------------ -----------
GRID MOUNTED
DATA MOUNTED
REDO MOUNTED
5. Log in to the database instance on the mount host, and simply start the database.
Do not perform any media recovery. During this step Oracle performs crash or
instance recovery.
SQL> startup
ORACLE instance started.
Optional: If archive log mode is not necessary (or +FRA is not available) on the
mount host, the following example shows how to disable archiving before opening
the database.
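The optional step can be sketched with standard Oracle SQL; the database must be mounted (not open) when switching to NOARCHIVELOG, and the script path is illustrative:

```shell
# Sketch (standard Oracle syntax): disable archiving before opening the
# database on the mount host; written to a file here for review.
cat <<'SQL' > /tmp/noarchivelog.sql
startup mount
alter database noarchivelog;
alter database open;
SQL
# Run on the mount host, for example:
#   sqlplus -s "/ as sysdba" @/tmp/noarchivelog.sql
cat /tmp/noarchivelog.sql
```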
6. Inspect the data in the test table. Since we used generation 1, which was the first
snapshot, the data shows the table’s record from that time.
ID STEP
---------- --------------------------------------------------
1 Before snapshots taken
1. Before linking a different snapshot to the target SG, first, bring down the database
and ASM disk groups on the mount host, as the target devices’ data is about to
be changed.
NOTE: If the target database is RAC, make sure to shut down all the instances and dismount the relevant ASM disk groups on all nodes.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit
Production
2. Choose a snapshot to link. By listing the snapshots with -detail flag, each
generation of a specific snapshot_name and its date/time is shown.
(Output truncated: ‘symsnapvx list -detail’ shows each generation of the snapshot_name with its timestamp, size in GBs, and expiration date.)
3. Link the snapshot to the target devices. This time we’ll use generation 0, which is
the most recent snapshot (the second snapshot we took in the previous example).
NOTE: There is no need to terminate the previous snapshot first. Use the ‘relink’ option.
There is no need to choose ‘-gen 0’ because it is the default.
NOTE: As before, the target host should already be zoned and masked to the target devices.
4. Log in to the ASM instance on the target host. The ASM disk groups on the target
devices should be visible, though in an unmounted state. Mount them as follows.
NAME STATE
------------------------------ -----------
REDO DISMOUNTED
DATA DISMOUNTED
GRID MOUNTED
SQL> alter diskgroup data mount;
Diskgroup altered.
Diskgroup altered.
NAME STATE
------------------------------ -----------
GRID MOUNTED
DATA MOUNTED
REDO MOUNTED
5. Log in to the database instance on the mount host. Start the database. Do not
perform any media recovery. During this step Oracle performs crash or instance
recovery.
SQL> startup
ORACLE instance started.
Alternatively, if, for example, archive logs were enabled on production but are not
needed on the mount host, disable archiving prior to opening the database on the
mount host:
Database altered.
Database altered.
6. Inspect the data in the test table. Since we used generation 0, which was the
second snapshot, the data shows the table’s records from that time.
ID STEP
---------- ----------------------------------------
1 Before snapshots taken
2 After first snapshot taken
Procedure
This procedure explains how to link a snapshot to target devices using CLI. Afterwards, we start the target database using the new file location and inspect the data.
NOTE: The initial steps are identical to the previous use case and therefore are not shown in
detail. Make sure the target database is down and the appropriate ASM disk groups (+DATA and
+REDO) are unmounted.
1. Choose a snapshot to link. By listing the snapshots with the -detail flag, each
generation and its date/time is shown.
2. Link the snapshot to the target devices. In this example, we use generation 0, which is the latest database snapshot we took.
3. The mount host should already be zoned and masked to the target devices. If this
is the first time a snapshot is made visible to the mount host, reboot it or rescan
the SCSI bus online to make sure the devices and their partitions are recognized
by the host. Make sure the partitions (if used) or devices (otherwise) receive
Oracle permissions.
4. Rename the ASM disk groups.
Now that the target devices are visible to the mount host, create the new ASM
disk groups’ locations. If ASMlib is used, rename the ASMlib labels first, before
the disk groups can be renamed.
In this example, we’ll change the ASM disk group names from +DATA to
+DATA_ENV_1, and from +REDO, to +REDO_ENV_1.
To rename a disk group, ASM uses a text file that contains a list of the devices along with the old and the new disk group names.
Since AFD labels were used when the disk groups were created, the labels can
be used in the text file as well. However, as can be seen in the +REDO rename
example below, as part of the rename execution, ASM changes the text file from
the labels back to the actual device names. For that reason, if the text file is going
to be used more than once, save a copy of it beforehand.
NOTE: The renamedg command requires the ASM disk_string parameter. It can be listed by using
the command: ‘asmcmd dsget’ or by looking at the ASM init.ora parameter ASM_DISKSTRING.
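The invocation that produced the output below can be reconstructed from the parameters echoed in that output; treat this as a sketch and run it on the mount host as the Grid Infrastructure owner:

```shell
# Dry-run sketch of the renamedg invocation, reconstructed from the
# parameters echoed in the output (drop the echo to execute).
RENAME_CMD="renamedg dgname=DATA newdgname=DATA_ENV_1 config=./ora_asm_rename_data.txt asm_diskstring='/dev/emc*1,AFD:*'"
echo "$RENAME_CMD"
```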
Parsing parameters..
renamedg operation: dgname=DATA newdgname=DATA_ENV_1
config=./ora_asm_rename_data.txt asm_diskstring=/dev/emc*1,AFD:*
Executing phase 1
Discovering the group
Checking for hearbeat...
Re-discovering the group
Generating configuration file..
Completed phase 1
Executing phase 2
Completed phase 2
Parsing parameters..
renamedg operation: dgname=REDO newdgname=REDO_ENV_1
config=./ora_asm_rename_redo.txt asm_diskstring=/dev/emc*1,AFD:*
Executing phase 1
Discovering the group
Checking for hearbeat...
Re-discovering the group
Generating configuration file..
Completed phase 1
Executing phase 2
Completed phase 2
## NOTICE THAT THE AFD LABELS IN THE TEXT FILES WERE CHANGED TO DEVICES
6. On the mount host, mount the ASM disk groups with their new names.
NAME STATE
------------------------------ -----------
DATA_ENV_1 DISMOUNTED
REDO_ENV_1 DISMOUNTED
GRID MOUNTED
Diskgroup altered.
Diskgroup altered.
7. On the mount host, update the file names to their new location.
a. Update the control file location to use the new ASM disk group name.
At the end of this step, make sure the database is able to mount.
If using a pfile, update the init.ora parameters with the new ASM disk group names.
b. Update Oracle files with their new location (database should be in mounted
state):
An example script is shown below. It assumes the ASM disk groups are
renamed from +DATA to +DATA_<NEW_NAME>, and +REDO to
+REDO_<NEW_NAME>. Update as necessary.
$ vi ora_rename_files.sh
export NEW_NAME=ENV_1
sqlplus -s "/ as sysdba" << EOF2
startup mount;
set linesize 132 pagesize 0 heading off feedback off verify off termout
off echo off
spool /tmp/ora_rename_redofile.sql
select 'alter database rename file ''' || member || ''' to ''' ||
member || ''';' from v\$logfile;
spool off;
spool /tmp/ora_rename_datafile.sql
select 'alter database rename file ''' || name || ''' to ''' || name ||
''';' from v\$datafile;
spool off;
spool /tmp/ora_rename_tempfile.sql
select 'alter database rename file ''' || name || ''' to ''' || name ||
''';' from v\$tempfile;
spool off;
quit;
EOF2
sed "s/+DATA/+DATA_$NEW_NAME/2" /tmp/ora_rename_datafile.sql >
/tmp/ora_rename_data.sql
sed "s/+REDO/+REDO_$NEW_NAME/2" /tmp/ora_rename_redofile.sql >
/tmp/ora_rename_redo.sql
sed "s/+DATA/+DATA_$NEW_NAME/2" /tmp/ora_rename_tempfile.sql >
/tmp/ora_rename_temp.sql
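The trick in the script above is the ‘2’ flag on each sed substitution: only the second occurrence of the disk group name on a line (the rename target) is changed, leaving the source path intact. A quick demonstration on a sample line (the file path is illustrative):

```shell
# Demonstrate the sed '2' flag: only the second +DATA occurrence (the rename
# target) gets the new suffix; the source path is left unchanged.
NEW_NAME=ENV_1
line="alter database rename file '+DATA/slob/datafile/users.259.1' to '+DATA/slob/datafile/users.259.1';"
renamed=$(echo "$line" | sed "s/+DATA/+DATA_$NEW_NAME/2")
echo "$renamed"
```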
10. Inspect the data in the test table. Since we used generation 0, which was the
second snapshot, the data shows the table’s records from that time.
ID STEP
---------- ----------------------------------------
1 Before snapshots taken
2 After first snapshot taken
11. If you are also changing the DBID and DBNAME, the database must first be opened and shut down cleanly. Therefore, this step can only take place after step 9, where the database is opened (so that it can then be shut down cleanly).
In addition, the NID utility requires the database to be mounted in exclusive mode.
After the NID utility runs successfully, update spfile or pfile with the new DBNAME
and ORACLE_SID for any relevant database parameters before restarting the
database.
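A minimal sketch of the NID invocation (the new database name is an assumption; connect with the database mounted in exclusive mode):

```shell
# Dry-run sketch: change DBID and DBNAME with the NID utility
# (new name is an assumption; drop the echo to execute).
NEW_DBNAME=slobenv1
NID_CMD="nid target=/ dbname=$NEW_DBNAME"
echo "$NID_CMD"
```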
...
Control Files in database:
+DATA_ENV_1/cntrlslob.dbf
As before, no media recovery is performed; the database is simply started after the snapshot is restored. Because no resetlogs operation is involved, all prior backups remain valid. This is simply a fast and safe way of creating a short-term gold copy of the production database without wasting capacity or time. The snapshot restore operation itself takes seconds, even as background copying of changed data continues. If the host requests data that is still being copied, that copy is prioritized. As previously discussed, snapshots are protected, and therefore the snapshot can be used over and over again. Make sure the snapshot finishes the background copy before reconnecting users to the production database at full scale.
Procedure This section explains how to restore a snapshot to the production database devices
using CLI. We then start the database and inspect the data.
1. Before restoring the snapshot, bring down the database and the ASM disk
groups on the production host, as their data is about to be refreshed.
NOTE: If the production database is clustered, make sure to shut down and dismount the ASM
disk groups and instances on all nodes.
2. Choose a snapshot to restore. Listing the snapshots with the -detail flag shows
each generation and its date/time (output truncated):
-------------------------------------------------------------------------------------------------------------
                                                                                  Total
Dev   Snapshot Name                Gen FLRG TS Snapshot Timestamp        (GBs)    (GBs) Expiration Date
...
                                                                    ---------- ----------
                                                                        2059.9     1772.1
3. Restore the snapshot. In this case, we restore the latest snapshot (generation 0).
Because generation 0 is the default, there is no need to specify the generation in
the command.
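The restore command may look like the following sketch; the SG name and snapshot name are assumptions based on this paper's examples:

```shell
# Restore the latest generation (0) of the snapshot back to the source devices.
# Generation 0 is the default, so -generation is omitted.
symsnapvx -sg database_sg -snapshot_name database_20171025-095033 restore
```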
4. Remount the ASM disk groups.
5. Start the production database. Do not perform any media recovery; during this
step, Oracle performs crash/instance recovery.
6. Inspect the data in the test table. Since we used generation 0, which was the
latest snapshot, the data shows the table’s records from that time.
ID STEP
---------- ----------------------------------------
1 Before snapshots taken
2 After first snapshot taken
Chapter 5: Recoverable Database Snapshots
databases when relying on RMAN alone. SnapVX or SRDF can help shorten this
time by leveraging storage-based local or remote replications, making it easy to
refresh the target whenever necessary.
1. For Oracle databases prior to 12c: place the database in hot-backup mode.
(Oracle 12c databases can leverage the Storage Snapshot Optimization feature
and don’t require hot-backup mode).
2. Create a snapshot containing all Oracle data files (+DATA).
NOTE: Although it isn't a requirement, in our snapshot we include both the +DATA and +REDO
ASM disk groups, using the parent SG: database_sg. That allows us to use this snapshot as both
recoverable and restartable.
Additional notes:
Unlike the restart use cases, the recovery use case requires that the Oracle data
files and redo logs be separated onto different storage devices and ASM disk groups.
The reason is that in the case of database recovery, only the +DATA snapshot is
restored. We don't want to overwrite the redo logs with the +REDO snapshot in
case the current production redo logs survived and can be used for full
recovery. By separating data files and redo logs onto different devices and ASM disk
groups, we can restore only the data files without overwriting the redo logs.
In the example below, we create the snapshot using the parent SG (database_sg),
and therefore include both data files and redo logs. This does not conflict with the
previous point: although the snapshot is performed on the parent SG, in case of a
restore we restore only the child SG, that is, just +DATA. So why did we create a
snapshot with both +DATA and +REDO? Because this snapshot serves as a valid
source for both a recoverable and a restartable solution. If a restartable option is
not desirable, change the process in step 2 above to include only the +DATA ASM
disk group (data_sg).
When the command ‘alter system switch logfile’ is executed, Oracle switches the
log file. When it does so, it must flush the dirty buffers associated with the
previous log to disk. In a clustered environment, that can create a storm of
writes that may affect database performance. If that is the case, consider using the
FAST_START_MTTR_TARGET init.ora parameter. Tuned correctly, Oracle limits
the amount of dirty buffers in cache without affecting database performance, so
when the logs switch, the amount of writes is not overwhelming. Some customers
prefer to manually switch logs at the different cluster nodes, one at a time. Our
recommendation is to use FAST_START_MTTR_TARGET to avoid adding
operational overhead.
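As a sketch, the parameter can be set dynamically; the 300-second target below is an illustrative value, not a recommendation:

```sql
-- Target a crash recovery time of 300 seconds; Oracle then paces
-- incremental checkpoints to limit dirty buffers in cache.
SQL> alter system set fast_start_mttr_target=300 scope=both sid='*';
```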
SQL> create table testTbl (Id int, Step varchar(255)) tablespace slob;
SQL> insert into testTbl values (1, 'Before +DATA & +REDO snapshot');
SQL> commit;
2. Perform this step only if hot-backup mode is used (databases pre-12c), to begin
hot backup mode.
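For pre-12c databases, hot-backup mode is typically entered with the following standard command (shown here as an assumed example):

```sql
SQL> alter database begin backup;
```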
3. To create a database snapshot that is only recoverable, include only the data
files: data_sg. For a database snapshot that is both recoverable and restartable,
include both data and redo logs together: database_sg.
NOTE: To simplify finding the snapshot time when hot-backup mode is not used, we
included the production host date/time in the snapshot name using ‘date’ command.
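A snapshot creation command might look like the following sketch; the SG name and the 2-day TTL are assumptions:

```shell
# Create a SnapVX snapshot of both +DATA and +REDO via the parent SG,
# embedding the production host date/time in the snapshot name.
# The snapshot auto-expires after 2 days (-ttl -delta 2).
symsnapvx -sg database_sg -name database_$(date +%Y%m%d-%H%M%S) establish -ttl -delta 2
```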
4. Perform this step only if hot-backup mode is used (databases pre-12c), to end
hot-backup mode.
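The corresponding standard command to end hot-backup mode (an assumed example):

```sql
SQL> alter database end backup;
```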
5. Perform this step only if RMAN incremental backups are offloaded to the mount
host. In that case, the BCT file version must be switched manually on the
production host (see details of this use case later in the section RMAN backup
offload to a mount host), just like RMAN would have done automatically at the end
of the backup if it was performed from the production host.
Make sure BCT is enabled, then switch its version.
6. For demonstration purposes, insert another known record after the first snapshot.
SQL> insert into testTbl values (2, 'After +DATA & +REDO snapshot');
SQL> commit;
8. Create a snapshot with the archive logs (ASM +FRA disk group, or fra_sg SG).
This snapshot includes sufficient archives to recover the database so it can open.
9. For demonstration purposes, insert the last known record for this test.
10. To inspect the snapshots created, use the appropriate level of detail:
symsnapvx list
symsnapvx -sg <sg_name> list
symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -detail
symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -summary
As before, it is important to consider zoning and LUN masking operations, which make
devices visible to hosts. In this example, the mount host is pre-zoned to the storage array.
The target devices are placed in a masking view and made visible to the mount host, even
before the snapshot is linked. Remember that if partitions are used they will only become
visible to the mount host once the snapshot is linked, and at that time will require Oracle
permissions. This is no longer a consideration if the snapshot is refreshed (relinked) as by
then the partitions will already be set on the mount host with the correct permissions.
When the same snapshot_name is used, list the snapshots based on the SG and
snapshot name to choose the appropriate generation to link. When each snapshot name
is unique, use just the SG name to list the snapshots. Once a source and a target SG are
linked, there is no need to terminate the link in order to relink another snapshot; just use
the ‘relink’ option in the syntax.
Procedure This section explains how to link a snapshot to target devices using CLI. Then, we recover
the target database and inspect the data.
1. If the target storage groups are in use, make sure to shut down the database on
the mount host, and dismount the appropriate ASM disk groups prior to the link
operation. If the mount host uses RAC, make sure all nodes are included.
[oracle@dsib0057 slob]$ TODB
[oracle@dsib0057 slob]$ sqlplus "/ as sysdba"
SQL> shutdown immediate;
[oracle@dsib0057 slob]$ TOGRID
[oracle@dsib0057 ~]$ asmcmd umount data
[oracle@dsib0057 ~]$ asmcmd umount redo
[oracle@dsib0057 ~]$ asmcmd umount fra
2. Choose a snapshot to link by first listing the snapshots for each storage group
(database_sg, and fra_sg), then link it to the matching SG, providing the desired
snapshot name. If a different snapshot was previously linked between the SGs,
use ‘relink’ in the syntax instead of ‘link’.
----------------------------------------------------------------------------
Sym Num Flags
Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp
----- -------------------------------- ---- ------- ------------------------
00067 database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017
database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017
00068 database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017
database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017
00069 database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017
...
# symsnapvx -sg database_sg -lnsg database_mount_sg link -snapshot_name
database_20171025-095033
3. The mount host should already be zoned and masked to the target devices. If this
is the first time a snapshot is made visible to the mount host, reboot it or rescan
the SCSI bus online to make sure the devices and their partitions are recognized
by the host and have Oracle permissions.
4. Log in to the ASM instance on the mount host. The ASM disk groups on the target
devices should be visible, though in an unmounted state. Mount them.
NAME STATE
------------------------------ -----------
GRID MOUNTED
DATA MOUNTED
FRA MOUNTED
REDO MOUNTED
Before we start this use case, make sure that the database is in a mounted state. Make
sure all ASM disk groups (+DATA, +REDO, and +FRA) are available from the two
snapshots that were linked to target devices, and visible to the mount host.
Procedure This section describes how to open the database in read-only mode:
1. If ‘hot-backup’ mode was used during the backup, skip this step. Otherwise, a
‘snapshot time’ is used during the media recovery. To identify the correct
snapshot time to use, we compare the snapshot time from ‘symsnapvx list’
with the checkpoint time from the data file headers, and use the later of the two,
as shown below.
a. Inspect the snapshot times based on the ‘symsnapvx list’ command. When
the storage management clock is identical to the database server clock, both
timestamps match: the one listed by the command, and the one we
added to the snapshot name. In this example they match (if they didn’t, we
would use the one from the snapshot name, as it came from the database
server).
The snapshot that is currently linked to the target SG has an ‘X’ under
the L in the ‘FLRG’ flags.
----------------------------------------------------------------------------
Sym Num Flags
Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp
----- -------------------------------- ---- ------- ------------------------
00067 database_20171025-160003 1 .X.. .. Wed Oct 25 16:00:03 2017
database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017
database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017
00068 database_20171025-160003 1 .X.. .. Wed Oct 25 16:00:03 2017
database_20171025-095033 1 .... .. Wed Oct 25 09:50:33 2017
database_20171024-155406 1 .... .. Tue Oct 24 15:54:04 2017
...
b. Compare the snapshot time from above with the data file headers’
checkpoint_time. In the next step, use whichever time is later (in
other words, we want to make sure the data files are recovered to a point
beyond both the snapshot and checkpoint times).
Check the data files’ header checkpoint time. You can run this command on
the mount host, since the database is in a mounted state.
$ cat ./ora_checkpoint_time.sh
#!/bin/bash
set -x
sqlplus "/ as sysdba" << EOF
column name format a50
set linesize 132
select name, checkpoint_change#, to_char(checkpoint_time, 'YYYY-MM-DD
HH24:MI:SS') checkpoint_time from v\$datafile_header;
quit;
EOF
$ ./ora_checkpoint_time.sh
6 rows selected.
As shown, the file header checkpoint time is older than the symsnapvx output
time so we’ll use the latter.
2. Connect to the database and perform the minimum media recovery necessary to
open the database in read-only mode.
If hot-backup mode was not used during the backup, add the ‘snapshot time…’
clause to the recover command.
Make sure to use the ‘using backup controlfile’ syntax (otherwise Oracle will not
apply enough archive logs).
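A recovery command combining both requirements might look like this sketch; the timestamp is illustrative and should be the later of the snapshot and checkpoint times identified in step 1 (verify the exact syntax against your Oracle version):

```sql
SQL> recover automatic database using backup controlfile until cancel
     snapshot time '2017-10-25 16:00:03';
SQL> alter database open read only;
```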
Database altered.
ID STEP
---------- ------------------------------
1 Before +DATA & +REDO snapshot
2 After +DATA & +REDO snapshot
As shown, the database opened cleanly and both the first record, which was in
the data files at the time of the snapshot, and the record from after the snapshot,
which was in the archive logs, are present.
Using DB Verify DB Verify runs against data files only; it does not test control files or redo logs. It can
run against the files only while the database is not open, which makes it a good use case
for storage snapshots. The following steps explain how to use DB Verify to test the
database for corruption:
1. The database on the mount host can be either offline or in a mounted state. It must
not be open, to avoid any changes to the data files while they are being tested.
The following example creates a script that validates the data files. It can be
modified to run a few validations simultaneously to allow more parallelism.
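A minimal sketch of such a script, assuming ASM-resident data files and an illustrative password (dbv needs a userid to read files stored inside ASM):

```shell
#!/bin/bash
# Generate one dbv command per data file, then run them sequentially.
# The sys password and 8K block size are assumptions; adjust as needed.
sqlplus -s "/ as sysdba" << EOF > /tmp/dbv_validate.sh
set heading off feedback off pagesize 0 linesize 200
select 'dbv userid=sys/password file=' || name || ' blocksize=8192'
from v\$datafile;
quit;
EOF
sh /tmp/dbv_validate.sh
```

To add parallelism, the generated dbv commands could be run in background groups instead of sequentially.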
Using RMAN validate RMAN can run against the full database, including control files, redo logs, and data files.
It is beyond the scope of this paper to cover RMAN; however, here are the basic steps to
test a database using RMAN database validation:
1. Mount the database on the mount host so RMAN can connect to it.
2. Connect to the database from RMAN and perform the validation.
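As a sketch, a full-database validation including logical corruption checks could be run as follows (using the document's heredoc style):

```shell
# Validate all data files, control files, and the server parameter file,
# including logical block checks.
rman target / << EOF
validate check logical database;
quit;
EOF
```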
...
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
6 OK 0 473280 158597120 26808371
File Name: +DATA/SLOB/DATAFILE/slob.263.953031317
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 157286558
Index 0 329088
Other 0 508194
1. The backup process competes with the production host workload for resources
such as CPU, memory, and I/Os. By offloading the backup process to a mount
host, it will not compete with the production database host for these resources.
2. Performing backups directly from the production host means that if a recovery is
needed, the database image will first need to be restored before the recovery
operations can take place. Typical production databases can contain many
terabytes of data, which means that the initial restore operation can take a very
long time.
When using SnapVX to offload the backup to a mount host, a readily available
snapshot that is a valid backup image of the database is created first. This snapshot is
linked to the target devices and mounted on the mount host. Therefore, an RMAN backup
taking place from the mount host does not compete for host resources with the production
database.
Second, the snapshot itself can be restored in seconds to the production host, and
recovery operations can resume immediately. This is a huge saving in recovery time
compared to host-based backups.
It is also important to remember that RMAN is only concerned with the DBID and the
location of the database files. As such, RMAN can perform backups from a mount host and
recovery from the production host. For that reason, it is a good practice to use an RMAN
catalog (rather than a local control file) to keep track of backups. RMAN can connect to its
catalog over the network regardless of whether it is running on the mount host or on production.
Alternatively, Oracle provides a bitmap that keeps track of what blocks have changed. The
bitmap is called Block Change Tracking, or BCT. When BCT is enabled, RMAN will
attempt to use it to make incremental backups much faster and more efficient.
A BCT is a file that can be stored externally or within ASM (but not in the Oracle
database). When enabled, the default file location is based on the init.ora
parameter DB_CREATE_FILE_DEST. Enable it using the following SQL command
(database can be open):
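The enabling command is a standard Oracle SQL statement; the explicit ASM destination in the second form is an assumption:

```sql
SQL> alter database enable block change tracking;
-- or, to place the BCT file explicitly inside ASM:
SQL> alter database enable block change tracking using file '+DATA';
```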
By default, a BCT file tracks eight versions, where each version resets the block
change information. As such, if more than seven incremental (level 1) backups are
performed prior to a new full (level 0), the BCT file won’t be able to provide RMAN
with sufficient information for efficient incremental backup and RMAN will revert to
performing a full database scan. The init.ora parameter ‘_bct_bitmaps_per_file’
can be set to a value greater than eight if that is a concern.
When RMAN performs the backup from production, it automatically switches the
BCT file version. When RMAN backup is offloaded to the mount host, the DBA will
execute the BCT switch on the production host manually, using the following
command:
Procedure These steps explain how to perform an RMAN backup offload to a mount host:
1. Preparation: use an RMAN catalog; otherwise any backup information will be
stored in the mount host database control file and will be lost with each
snapshot. First, register the database from the primary (production
database) ahead of time.
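Registration from the production host might look like the following sketch; the catalog connect string matches the example used later in this chapter:

```shell
# One-time registration of the production database in the RMAN catalog.
rman target / catalog rco@catdb << EOF
register database;
quit;
EOF
```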
2. The mount host database state should be ‘mounted’ prior to performing the
RMAN backup.
3. On the mount host, connect with RMAN to the database and catalog, and perform
the appropriate backup. This paper doesn’t cover the specifics of RMAN backups
but here is a simple example of a level 0 (full) backup:
time rman target / catalog rco@catdb msglog /tmp/rman.log append << EOF
run{
allocate channel ch1 device type disk format '+BACKUP';
allocate channel ch2 device type disk format '+BACKUP';
allocate channel ch3 device type disk format '+BACKUP';
allocate channel ch4 device type disk format '+BACKUP';
backup incremental level 0 database tag 'incr lvl 0' section size
300G;
}
quit;
EOF
run{
allocate channel ch1 device type disk format '+BACKUP';
allocate channel ch2 device type disk format '+BACKUP';
allocate channel ch3 device type disk format '+BACKUP';
allocate channel ch4 device type disk format '+BACKUP';
backup incremental level 1 cumulative database tag 'incr lvl 1'
section size 300G;
}
quit
EOF
4. To see if the BCT file was used, execute the following query:
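An assumed form of that query, built on the v$backup_datafile columns described below:

```sql
SQL> select file#, datafile_blocks, blocks_read,
            round(blocks_read/datafile_blocks*100, 2) pct_read,
            blocks, used_change_tracking
     from v$backup_datafile
     order by file#, completion_time;
```

A low pct_read together with USED_CHANGE_TRACKING = YES indicates the BCT file allowed RMAN to skip most unchanged blocks.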
Blocks read (‘BLOCKS_READ’) indicates how much data was read by RMAN as
part of the backup. Data file blocks (‘DATAFILE_BLOCKS’) indicates the number
of blocks in each data file, and blocks (‘BLOCKS’) indicates how many blocks
were actually written as part of the backup.
NOTE: Only +DATA is made visible to the production host. The assumption is that +REDO
and +FRA on the production host are intact and the corruption is found in the data files.
Be sure to read all the steps first. Especially pay attention to the steps leading to
preparing the text file to rename the ASM disk group from the snapshot target devices
described in steps 5 and 6.
Procedure To conduct an RMAN minor recovery of the production database using a snapshot, follow
these steps:
1. If the target SGs were previously mounted to a mount host, shut down the mount
host database and dismount the ASM disk groups.
NOTE: If the mount host database is RAC, make sure to shut down and dismount the
ASM disk groups and instances on all nodes.
2. On the production host, identify the corruption type and location. For
demonstration purposes, we corrupted a database block.6
3. Choose the appropriate recoverable snapshot and link it to the target SG.
----------------------------------------------------------------------------
Sym Num Flags
Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp
----- -------------------------------- ---- ------- ------------------------
00067 database_20171030-120916 1 .... .. Mon Oct 30 12:09:17 2017
database_20171030-102525 1 .... .. Mon Oct 30 10:25:25 2017
database_20171029-121717 1 .X.. .. Sun Oct 29 12:17:15 2017
database_20171029-121519 1 .... .. Sun Oct 29 12:15:18 2017
00068 database_20171030-120916 1 .... .. Mon Oct 30 12:09:17 2017
database_20171030-102525 1 .... .. Mon Oct 30 10:25:25 2017
database_20171029-121717 1 .X.. .. Sun Oct 29 12:17:15 2017
database_20171029-121519 1 .... .. Sun Oct 29 12:15:18 2017
00069 database_20171030-120916 1 .... .. Mon Oct 30 12:09:17 2017
...
6 The method to deliberately corrupt a database block in ASM is introduced in this blog.
4. VMAX uses initiator groups and masking views to make devices visible to hosts. If
the database_mount_sg SG was visible to the mount host (based on the
‘rac_mount_mv’ masking view), remove that masking view. Instead, create a new
masking view making data_mount_sg visible to the production host.
Only the child storage group data_mount_sg is made visible to production.
Database_mount_sg, which includes both data_sg and redo_sg, is not
made visible to production.
Symmetrix ID : 000197700048
Symmetrix ID : 000197700048
If this is the first time the data_mount_sg devices are made visible to the
production host, you may need to reboot or rescan the SCSI bus online so that
the host is aware of the devices, and can identify their partitions and associate
Oracle permissions to them.
If you reboot, ASM will not mount the +DATA disk group since it sees both
the original devices and the snapshot target devices. This ASM feature
protects its data and makes this procedure safe to follow. If that happened due to
a reboot, simply remount the production +DATA ASM disk group after the
snapshot ASM disk group was renamed, as described in the next step.
5. To rename the ASM disk group based on the data_mount_sg SG, prepare a text
file containing the snapshot devices from data_mount_sg as they appear on the
production host. This file is then used to rename the ASM disk group.
# cat asm_rename_to_snap_data.txt
/dev/emcpowercc1 DATA SNAP_DATA
/dev/emcpowerci1 DATA SNAP_DATA
/dev/emcpowerbx1 DATA SNAP_DATA
/dev/emcpowerbs1 DATA SNAP_DATA
/dev/emcpowerbt1 DATA SNAP_DATA
/dev/emcpowerbv1 DATA SNAP_DATA
/dev/emcpowerbq1 DATA SNAP_DATA
/dev/emcpowerce1 DATA SNAP_DATA
/dev/emcpowercj1 DATA SNAP_DATA
/dev/emcpowerbw1 DATA SNAP_DATA
/dev/emcpowerck1 DATA SNAP_DATA
/dev/emcpowercb1 DATA SNAP_DATA
/dev/emcpowerby1 DATA SNAP_DATA
/dev/emcpowercg1 DATA SNAP_DATA
/dev/emcpowercd1 DATA SNAP_DATA
/dev/emcpowercn1 DATA SNAP_DATA
6. Run the ASM disk group rename command on the production host using the text
file.
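Based on the output below, the command likely takes this form (flags are inferred and should be verified against your Grid Infrastructure version):

```shell
# Rename disk group DATA to SNAP_DATA on the snapshot target devices,
# using the device list prepared in step 5.
renamedg phase=both dgname=DATA newdgname=SNAP_DATA \
    config=./asm_rename_to_snap_data.txt asm_diskstring='AFD:*' verbose=true
```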
Parsing parameters..
renamedg operation: phase=two dgname=DATA newdgname=SNAP_DATA
config=./asm_rename_to_snap_data.txt asm_diskstring=AFD:*
Executing phase 2
Completed phase 2
7. On the production host, mount the renamed disk group (and the original +DATA
disk group if it is not already mounted). Open the production database if it isn’t
already opened.
8. RMAN can catalog the whole +SNAP_DATA ASM disk group, or specific files or
directories within the ASM disk group. Once it does, it becomes aware of that
backup image and can use it to recover the production database.
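A sketch of cataloging the entire renamed disk group; the path is an assumption based on the SLOB database used in this paper's examples:

```shell
# Make RMAN aware of the backup image on the renamed disk group.
rman target / << EOF
catalog start with '+SNAP_DATA/SLOB/DATAFILE/' noprompt;
quit;
EOF
```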
9. Perform RMAN recovery based on the situation. In the example from step 2, there
was a single-block corruption in data file 7, block 154. Verify that the corruption
was fixed.
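For the single-block corruption in this example, a block-level recovery sketch could be (Oracle 12c syntax; verify before use):

```shell
# Recover the corrupted block from the cataloged image, then re-validate it.
rman target / << EOF
recover datafile 7 block 154;
validate datafile 7 block 154;
quit;
EOF
```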
10. Once recovery operations are complete, dismount the +SNAP_DATA disk group
from the production host and remove the masking view rac_snap_mv. Optionally,
recreate the rac_mount_mv if the target SG should be made visible to the mount
host again.
Note that only the data files portion of the snapshot is restored. The assumption is
that +REDO and +FRA on the production host are intact. If that’s not the case, they can
be restored as well. To restore only data, use the child SG: ‘data_sg’. To restore both data
and redo use the parent SG: ‘database_sg’. To restore FRA use the ‘fra_sg’ snapshot.
Be sure to read all the steps first. Especially make sure that production redo logs are not
overwritten by mistake by the snapshot restore.
Procedure Follow these steps to conduct a production restore from a recoverable snapshot:
2. Shut down the production database and dismount the ASM disk group that will be
restored. Other disk groups can stay online. In this example, only +DATA is
restored.
NOTE: If the target database is RAC, be sure to shut down and dismount the ASM disk
groups and instances on all nodes.
NOTE: Make sure only +DATA is dismounted and not +REDO or +FRA, assuming they survived
the disaster.
3. List the snapshots and restore the desired snapshot. Note that we use the
‘data_sg’ SG.
----------------------------------------------------------------------------
Data copy from the snapshot may proceed in the background and can be
monitored using the command below.
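The monitoring command might look like this sketch (the 60-second interval flag is an assumption):

```shell
# List the restored snapshot session with per-device remaining GBs,
# refreshing every 60 seconds until the background copy completes.
symsnapvx -sg data_sg list -restored -detail -gb -i 60
```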
-----------------------------------------------------------------------------------------
Sym Flgs Remaining Done
Dev Snapshot Name Gen F Snapshot Timestamp (GBs) (%)
----- -------------------------------- ---- ---- ------------------------ ---------- ----
00067 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.1 33
00068 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 68.2 31
00069 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.1 33
0006A database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.0 34
0006B database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 67.7 32
0006C database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.2 33
0006D database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.0 33
0006E database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 65.6 34
0006F database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.6 33
00070 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.3 33
00071 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 67.0 32
00072 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 65.7 34
00073 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.6 33
00074 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 65.9 34
00075 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 67.4 32
00076 database_20171031-111302 0 . Tue Oct 31 11:13:03 2017 66.5 33
----------
1064.1
...
In most cases, the DBA can proceed with the recovery operations without waiting
for the background copy to complete (assuming the storage utilization is not too
high). However, allow the background copy to finish before opening the database
to user access at scale.
5. Mount the +DATA disk group on the production host.
NAME STATE
------------------------------ -----------
DATA MOUNTED
FRA MOUNTED
GRID MOUNTED
REDO MOUNTED
6. Mount the database and perform media recovery. If hot-backup mode was not
used when the snapshot was created, use the ‘snapshot time’ syntax, similar to
the previous use case (Opening a recoverable database on a mount host), except
that this time the recovery takes place on the production database.
Database altered.
SQL> select * from testTbl;
ID STEP
---------- ----------------------------------------
1 Before +DATA & +REDO snapshot
2 After +DATA & +REDO snapshot
a. If the online redo logs are not available, you can open the database with
resetlogs.
Database altered.
b. If the online redo logs are available, apply the latest redo logs, as shown in
the following example.
Statement processed
RMAN> quit
ID STEP
---------- ----------------------------------------
1 Before +DATA & +REDO snapshot
2 After +DATA & +REDO snapshot
3 After +FRA snapshot
8. After confirming that the restored snapshot has finished any background copy,
terminate the restore session by specifying the ‘-restored’ option (the snapshot
itself is not terminated, only the restore session).
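The terminate step could look like the following sketch; the snapshot name matches the restore example above:

```shell
# Terminate only the restore session; the snapshot itself remains protected.
symsnapvx -sg data_sg -snapshot_name database_20171031-111302 terminate -restored
```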
9. The database is now available for all operations and all nodes can be brought
online. If the database was opened with resetlogs, create a new recoverable
backup image immediately as the new backup base.
This use case shows how to instantiate the standby database using SnapVX. A similar
operation can take place using SRDF leveraging SnapVX remotely, or the SRDF R2
devices directly.
Important: Unlike the database backup and recovery use cases discussed earlier, a standby
database in managed recovery is not well integrated with the ‘snapshot time’ alternative to
hot-backup mode, even with Oracle 12c.
For that reason, if using hot-backup mode, the process of instantiating the standby is simpler. If
using 'snapshot time', the instantiation is done in two parts: first, recover the target
database using 'snapshot time' as if it were a normal backup image; second, once the
database is recovered past the snapshot time, convert the image into a standby database.
Procedure
To prepare to instantiate an Oracle standby database using VMAX replications, follow
these steps:
2. Configure the production database with standby redo logs (in case of a role
switch). Update the sizing below as appropriate.
##########################################
# For Standby DB
db_unique_name=hopkinton # Primary unique name
control_files=('+DATA/cntrlSLOB.dbf') # Primary controlfile location
#db_unique_name=austin # Standby unique name
#control_files=('+DATA/austin.ctl') # Standby controlfile location
LOG_ARCHIVE_DEST_1=
'LOCATION=USE_DB_RECOVERY_FILE_DEST
VALID_FOR=(ALL_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=hopkinton'
LOG_ARCHIVE_DEST_2=
'SERVICE=austin ASYNC
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
DB_UNIQUE_NAME=austin'
REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE
LOG_ARCHIVE_FORMAT=%t_%s_%r.arc
FAL_SERVER=austin
DB_FILE_NAME_CONVERT='/austin/','/hopkinton/'
LOG_FILE_NAME_CONVERT='/austin/','/hopkinton/'
STANDBY_FILE_MANAGEMENT=AUTO
##########################################
...
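The standby redo logs themselves (step 2) can be created on the primary with SQL
similar to the following sketch; the thread, group numbers, and size are illustrative,
and Oracle's guideline is one more standby log group per thread than online log groups,
sized the same as the online logs:
SQL> alter database add standby logfile thread 1 group 11 ('+REDO') size 1g;
SQL> alter database add standby logfile thread 1 group 12 ('+REDO') size 1g;
SQL> alter database add standby logfile thread 1 group 13 ('+REDO') size 1g;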
austin =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = dsib0057)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = austin)
)
)
hopkinton =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = dsib0144)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = hopkinton)
)
)
6. Copy the production database’s init.ora to the standby site and make any
appropriate changes, as shown below.
##########################################
# For Standby DB
#db_unique_name=hopkinton # Primary unique name
#control_files=('+DATA/cntrlSLOB.dbf') # Primary controlfile location
db_unique_name=austin # Standby unique name
control_files=('+DATA/austin.ctl') # Standby controlfile location
LOG_ARCHIVE_DEST_1=
'LOCATION=USE_DB_RECOVERY_FILE_DEST
VALID_FOR=(ALL_LOGFILES,ALL_ROLES)
DB_UNIQUE_NAME=austin'
LOG_ARCHIVE_DEST_2=
'SERVICE=austin ASYNC
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
DB_UNIQUE_NAME=hopkinton'
REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE
LOG_ARCHIVE_FORMAT=%t_%s_%r.arc
FAL_SERVER=hopkinton
DB_FILE_NAME_CONVERT='/hopkinton/','/austin/'
LOG_FILE_NAME_CONVERT='/hopkinton/','/austin/'
STANDBY_FILE_MANAGEMENT=AUTO
##########################################
Using hot-backup mode
Follow these steps to use hot-backup mode to create a replica for the standby database:
1. On the standby database host, make sure the database instances are shut down.
Dismount the ASM disk groups +DATA, +REDO, and +FRA.
6. On the production host, end backup mode, then switch logs and archive.
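Step 6 can be sketched with SQL along these lines:
SQL> alter database end backup;
SQL> alter system switch logfile;
SQL> alter system archive log current;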
SQL> insert into testTbl values (2, 'After +DATA, +REDO, and +FRA
snapshot');
SQL> commit;
4. To test the standby database, add some records to the production database.
SQL> insert into testTbl values (3, 'After standby managed recovery
started');
SQL> commit;
SQL> alter system switch logfile;
SQL> alter system archive log current;
5. Open the standby database in read-only (managed recovery) mode, and inspect
the data to see if the updates from production are arriving. Note that it can take
some time for updates to appear in the standby, depending on how many logs still
need to be shipped and applied.
Database altered.
ID STEP
---------- --------------------------------------------------
1 Before +DATA, +REDO, and +FRA snapshot
2 After +DATA, +REDO, and +FRA snapshot
3 After standby managed recovery started
Database altered.
OPEN_MODE
--------------------
READ ONLY WITH APPLY
2. For demonstration purposes, SLOB was used to simulate OLTP user transactions
on the production host. Add some records to the test table.
3. Create a standby control file and place it in the +DATA ASM disk group, since
we’ll be replicating that disk group to the standby site.
4. Create snapshots of just the database_sg (+DATA and +REDO). Do not use hot-
backup mode.
SQL> insert into testTbl values (2, 'After +DATA and +REDO snapshot');
SQL> commit;
As mentioned earlier, when using ‘snapshot time’ there are two steps: first perform a
manual media recovery on the standby host, using the ‘snapshot time’ syntax. Once the
database can be opened in read-only mode (when enough recovery has been performed),
convert the replica to a standby database. Following is the detailed description of these
steps.
2. On the standby host, use the password file copied from production.
4. On the standby host, perform manual media recovery with the available archives
using the ‘snapshot time’ syntax.
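As a sketch, the manual media recovery and the subsequent conversion look like the
following; the timestamp is illustrative, and the exact 'snapshot time' clause should be
verified against the Oracle 12c documentation:
SQL> recover database until cancel snapshot time '2017-11-21 06:47:46';
SQL> alter database convert to physical standby;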
ID STEP
---------- --------------------------------------------------
1 Before +DATA, +REDO, and +FRA snapshot
2 After +DATA and +REDO snapshot
SQL> insert into testTbl values (3, 'After standby managed recovery
started');
SQL> commit;
SQL> alter system switch logfile;
SQL> alter system archive log current;
4. Open the standby database and inspect the data. It could take some time until the
latest production data is shown in the standby, based on how many transactions
need to be recovered first.
Database altered.
ID STEP
---------- --------------------------------------------------
1 Before +DATA, +REDO, and +FRA snapshot
2 After +DATA and +REDO snapshot
3 After standby managed recovery started
Database altered.
OPEN_MODE
--------------------
READ ONLY WITH APPLY
Chapter 6: Remote Replications with SRDF
Therefore, the target of both SRDF/S and SRDF/A is restartable, though with SRDF/A it
has a slight lag. That lag can increase during peak loads as data makes its way to the
remote site, and shrink back to the default 15 seconds afterwards. If a disaster hits the
source database, Oracle can start from the target array(s) as if the database went through
shutdown-abort or a server crash. It will simply restart, performing instance or crash
recovery using only data, control, and redo log files (no archive logs are used).
In addition, SRDF is tightly integrated with SnapVX. As a result, SnapVX can create
recoverable or restartable snapshots at the remote site without interrupting SRDF
replications, allowing all the use cases discussed previously for snapshots to be executed
from either the source or the target arrays while SRDF/S or SRDF/A are used.
Following are key use cases for creating remote replications with SRDF.
Include data files and archive logs. Most often, SRDF replications focus on DR
(Disaster Restart), and therefore will already include all data, log, and control files.
In order to extend the use case to support recovery, the archive logs are simply
added to the remote replications (for example, the fra_sg SRDF group is added).
A remote recoverable image is created using SnapVX at the remote array, without
interrupting SRDF replications.
When SRDF/A is used, an SRDF ‘checkpoint’ command is issued prior to each
SnapVX establish command (snapshot creation). SRDF checkpoint makes sure
that data in the local array reaches the remote array. This is important in a recovery
use case.
For example, if a hot backup mode is used, the remote snapshot must be taken
after the database was placed in backup mode. Issuing the checkpoint command
prior to creating the remote snapshot ensures that the R2 devices contain the
backup mode state. Similarly, after issuing the ‘archive log current’ command, the
SRDF checkpoint command ensures that the R2 devices contain these latest
archive logs prior to the remote snapshot of the fra_sg.
SRDF management using an SG or a CG
SRDF has many topologies that satisfy different disaster protection and replications
strategies. Their management aspects are covered in the Solutions Enabler SRDF
Product Guide.
The configuration and examples used in this paper assume a basic environment with a
single source array, a single target array, and a single SRDF group for each Storage
Group (SG). As a result, this paper uses SGs to manage the SRDF replications. This can
only be done when there is a single SRDF group associated with each SG.
In an environment where database devices are spread across multiple arrays (and
SGs), or where multiple SGs must be replicated consistently, a Consistency Group
(CG) is created and SRDF is managed using the CG, not the SG.
While the examples in this paper do not cover the use of CGs or more complex
SRDF topologies, from an Oracle ASM and database perspective, the operations
remain the same. However, the SRDF management commands may change based
on the topology and use of CGs versus SGs.
NOTE: See additional considerations for local and remote replications in Chapter 3.
Procedure
Execute all storage commands from the local storage management host.
1. To set up replications between the matching storage groups on the local and
remote arrays, first create an SRDF group. An SRDF group declares which of
the SRDF adapters (RAs) and ports of each array participate in the replications. It
also lets you provide an SRDF group number and label for ease of management.
To create the SRDF group follow these sub-steps:
a. List the SRDF adapters and ports on the local (048) and remote (047) arrays.
S Y M M E T R I X R D F D I R E C T O R S
S Y M M E T R I X R D F D I R E C T O R S
9 - - - Online PendOn
RF-2H 8 000197700048 2 (01) 2 (01) Online Online
8 000197700048 42 (29) 4 (03) Online Online
9 - - - Online PendOn
b. Choose the appropriate ports on each array and use them to create the
SRDF group. Both VMAX arrays have director 1H port 8 and 2H port 8
available, as shown in the previous step, and SRDF group number 10 is not
already in use.
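For example, a dynamic SRDF group can be created with a command similar to the
following; the label is arbitrary, the director ports match the choices above, and the
exact addgrp syntax should be confirmed for your Solutions Enabler version:
# symrdf addgrp -label ora_rdf -rdfg 10 -sid 048 -dir 1H:8,2H:8 -remote_rdfg 10 -remote_sid 047 -remote_dir 1H:8,2H:8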
2. Create a replication session between the local and remote SGs using the
newly created SRDF group.
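A sketch of the createpair operation, using this paper's SG names (confirm the syntax
for your Solutions Enabler version):
# symrdf createpair -sid 048 -sg database_sg -remote_sg database_sg -rdfg 10 -type R1 -establish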
4. Eventually, either all the data is copied and the SRDF pairs' state shows
'Synchronized', or the number of invalid tracks between the source and target arrays
shrinks sufficiently. At that time, change the SRDF mode to either Sync or Async
so that the target devices can be consistent with the source.
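For example, for synchronous replication:
# symrdf -sid 048 -sg database_sg -rdfg 10 set mode sync
or, for asynchronous replication with consistency protection:
# symrdf -sid 048 -sg database_sg -rdfg 10 set mode async
# symrdf -sid 048 -sg database_sg -rdfg 10 enable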
NOTE: Only SRDF/S or SRDF/A (and their variants, such as SRDF/A MSC, cascaded SRDF,
or SRDF/STAR) are valid consistent replications for Oracle. SRDF ACP does not keep the
target consistent with the source and is only meant for data refresh. Once SRDF mode has
changed to Sync or Async, the target devices are only consistent when their state is no
longer ‘SyncInProg’ and rather ‘Synchronized’ (for SRDF/S) or ‘Consistent’ (for SRDF/A).
NOTE: Although the example above uses an SG, it is recommended to enable consistency even
for SRDF/S, which requires the use of a CG instead of an SG.
NOTE: As shown above, SRDF/A allows enabling consistency using SG when a single SRDF
group is used. Otherwise, SRDF/A also requires the use of CG.
The example below shows SRDF/S mode and the devices in ‘Synchronized’
state:
Another way to view the replication state is using the ‘symrdf query’ command,
as shown below (SRDF/A example):
Wait until source and target devices have little to no difference, then:
The next step is to bring up the remote ASM disk groups and database associated with
the R2 (remote) devices. If the R2 devices are already visible to the remote database
server(s) then this operation can take place immediately. However, if the remote database
server(s) are using the remote database snapshots, first change the remote servers to
point to the R2 devices directly.
Finally, either resume replications in the opposite direction, or once the local array is
available again, failback to the local site and resume replications again from there to the
remote site.
Changing the target devices' state to read-writable (RW)
This section describes two scenarios: in the first scenario, both sites are reachable and
SRDF replications did not stop. In the second scenario, only the remote site is available
and SRDF stopped replicating.
If the local site is reachable and SRDF replications did not stop
If SRDF replications are still ongoing, 'split' SRDF to stop the replications and make
the target devices RW. If the production database is still running on the local array,
consider whether it should be brought down first (for example, if production operations
are moved to the remote site, then production should be shut down first on the local
array). However, if the production database remains operational on the local site, and
if the business wants to access the R2 devices directly, split SRDF without bringing
down the local production database.
2. If ASM +FRA disk groups were replicated in a different SRDF group, split that
group as well.
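A sketch of the split commands; the fra_sg SRDF group number (11) is an assumption
for illustration:
# symrdf -sid 048 -sg database_sg -rdfg 10 split
# symrdf -sid 048 -sg fra_sg -rdfg 11 split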
If SRDF/A was configured with the Transmit Idle setting (default), then the SRDF state
will show as 'TransIdle', which means SRDF is waiting for the last cycle to arrive from
the R1 devices. If SRDF/S was configured (or if SRDF/A mode was used but Transmit Idle
was disabled), then the SRDF state will show as 'Partitioned'. Each case is described in
the following sections.
If fra_sg was replicated in its own SRDF group, repeat the steps for fra_sg.
At the end of this step, the remote SRDF devices are in RW state.
SRDF state is ‘Partitioned’
If the SRDF state is ‘Partitioned’ and the R2 devices are still write disabled (WD), then
they cannot be used by Oracle yet. Perform a failover operation on the R2 devices to
make them read-writable (RW).
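For example, executed from the remote storage management host (a sketch):
# symrdf -sid 047 -sg database_sg -rdfg 10 failover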
If fra_sg was replicated in its own SRDF group repeat the steps for fra_sg.
At the end of this step, the remote SRDF devices are in RW state.
Making sure the R2 devices are visible to the remote servers
Depending on how the business uses the remote servers during normal operations, they
may be accessing the remote snapshots and not the R2 devices. In that case, shut down
the database running from the remote snapshots and change the servers' masking view to
point to the R2 devices instead. This section shows how to change the remote server
masking views from the remote snapshot target devices (database_mount_sg) to point
directly to the R2 devices (database_sg).
Symmetrix ID : 000197700047
Symmetrix ID : 000197700047
Resuming SRDF replications
There are two options to resume replications: the first is to fail back to the local site
as soon as it becomes available again and resume replications from there. The other is to
resume operations at the remote site, switching the replication direction so that SRDF
replicates from the remote to the local site. These two scenarios are described below.
The initial state for either scenario is that both remote and local arrays are available and
connected by SRDF. In that case the SRDF state will change from ‘Partitioned’ to ‘Split’.
In this example, the R2 devices are in the RW state and the database is running at the
remote site.
Switching SRDF replications to replicate from the remote to the local site
1. Swap personality, making the remote site devices R1, and the local site devices
R2.
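For example, executed from the remote storage management host (a sketch; depending on
the state of invalid tracks, additional options such as '-refresh' may be required):
# symrdf -sid 047 -sg database_sg -rdfg 10 swap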
NOTE: After the ‘swap’, the remote array (047) will be shown as the ‘Source (R1)’ instead of the
‘Target (R2)’.
2. Resume replications from the remote to the local site. If time has passed and
many changes have accumulated on the remote array, use adaptive copy mode
to do a batch update. Once the data differences between R1 and R2 devices are
small enough (either completely synchronized, or only a small delta remains
stable between batch updates), change the mode to sync or async.
The RDF Set 'ACp Disk Mode ON' operation successfully executed
for storage group 'database_sg'.
NOTE: Repeat the same operation for fra_sg if it was replicated in a different SRDF group.
Remember to enable consistency on both SRDF groups.
NOTE: In the example below, execute SRDF commands from the local site, which is available.
1. Database operations may be running at the remote site. First, update the R1
devices with the R2 data without disturbing remote database operations.
Repeat the ‘update’ command as necessary until the differences between R2 and
R1 devices are sufficiently low.
2. When ready to proceed with replications from the local site, if Oracle was running
at the remote site, bring down the database and dismount the associated
ASM disk groups.
Use SRDF ‘failback’ to resume replications again from local to remote site.
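These two steps can be sketched as follows, executed from the local storage management
host:
# symrdf -sid 048 -sg database_sg -rdfg 10 update
(repeat the update until the differences are low, then shut down the remote database)
# symrdf -sid 048 -sg database_sg -rdfg 10 failback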
NOTE: After the 'failback', the local array (048) will show itself as the 'Source (R1)' again.
NOTE: Repeat the same operations for fra_sg if it was replicated in a different SRDF group.
If fra_sg is required at the remote site (that is, if the remote database copies require the
use of archive logs) then make fra_sg part of the replication so it can be snapped as well.
The SRDF remote device state should be either ‘Synchronized’ (for SRDF/S), or
‘Consistent’ (for SRDF/A) before taking the snapshot.
Creating a remote database snapshot
The steps to create a remote restartable copy of the database are identical to creating
such a copy from the local array. The only difference is that the snapshot is created from
the remote array and linked to target devices in the remote array. The database copy is
accessed by the remote servers.
While Solutions Enabler SnapVX commands to the remote array can be executed from
either the local or remote storage management hosts, it is simpler to execute them from
the remote storage management host to avoid any confusion with the local array.
Procedure
This example shows how to create a remote restartable snapshot using CLI.
1. To demonstrate what data is preserved in the different scenarios, use a test table
in the production database on the local array.
2. To simulate on-going database activity, start the SLOB OLTP workload in the
background.
3. Insert a known record into the test table before taking the snapshot.
NOTE: The '-sid' option is used in the example to make sure the snapshot is created at the 047
(remote) array. If the commands are executed from the local array (048) storage management
host, then the '-remote' flag is needed in the command syntax.
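For example, from the remote storage management host (a sketch using this paper's
naming):
# symsnapvx -sid 047 -sg database_sg -name database_snap establish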
SQL> insert into testTbl values (2, 'After first snapshot taken');
SQL> commit;
NOTE: When the same SG and snapshot name are used to create additional snapshots, a new
snapshot generation is created, where generation 0 always points to the latest snapshot. When
snapshots are listed, the date/time information of each generation is shown.
SQL> insert into testTbl values (3, 'After second snapshot taken');
Create a masking view so the remote servers can access the snapshot target devices
(database_mount_sg, and fra_mount_sg if fra_sg was replicated).
Remember that +FRA is not required for a restartable solution, although you may prefer to
include it so the replicated database can have a place to write archive logs. On the other
hand, if you prefer to open the replica without archive logs, or to have a different +FRA on
the remote mount host, then there is no need to create a snapshot of fra_sg. In the
example below we’ll create the fra_sg snapshot.
Creating remote target devices
This example shows how to create the remote target devices and add them to a masking
view:
## Create SG’s
# symsg create data_mount_sg
# symsg create redo_mount_sg
# symsg create fra_mount_sg
## Populate SG’s
# symsg -sg data_mount_sg addall -devs 19A:1A9
# symsg -sg redo_mount_sg addall -devs 1AA:1B1
# symsg -sg fra_mount_sg add dev 1B7
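The mount SGs can then be grouped under a parent SG and exposed through a masking view
along these lines; the parent SG, initiator-group, and port-group names are assumptions
for illustration:
## Create parent SG and masking view
# symsg create database_mount_sg
# symsg -sg database_mount_sg add sg data_mount_sg,redo_mount_sg,fra_mount_sg
# symaccess -sid 047 create view -name mount_mv -sg database_mount_sg -ig mount_ig -pg mount_pg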
Symmetrix ID : 000197700047
As in the local array use case, if using RAC, install Grid Infrastructure ahead of time
locally for the remote servers instead of replicating it from the local array. In the example
above, we can see the grid_mv masking view which was created ahead of time and used
for +GRID ASM disk group during GI installation.
Linking snapshots to target devices using CLI
This example shows how to link a snapshot to target devices using CLI. Afterwards, we'll
start the target database and inspect the data.
1. Choose a snapshot generation ID to link. By listing the snapshots with the -detail
flag, each generation and its date/time is shown.
-------------------------------------------------------------------------------------------------------------
Total
Dev Snapshot Name Gen FLRG TS Snapshot Timestamp (GBs) (GBs) Expiration
Date
...
database_snap 3 .... .. Tue Nov 21 06:47:46 2017 0.6 0.0 NA
---------- ----------
2312.3 120.0
Flags:
2. Link the snapshot to the target devices using generation 1, which is the first
database snapshot from the previous example.
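For example (a sketch; generation 1 is the older of the two snapshots in the listing
above):
# symsnapvx -sid 047 -sg database_sg -snapshot_name database_snap -gen 1 -lnsg database_mount_sg link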
4. Make sure the target host is zoned and masked to the target devices. If this is the
first time a snapshot is made visible to the target host, reboot the host or rescan
the SCSI bus online to make sure the devices and their partitions are seen by the
host. Make sure the partitions (if used) or devices (otherwise) receive Oracle
permissions.
5. Log in to the ASM instance on the target host. Make sure that the ASM disk
groups on the target devices are visible and in the unmounted state, then mount
them.
6. Log in to the database instance on the mount host, and simply start the database.
Do not perform any media recovery. During this step Oracle performs crash or
instance recovery.
SQL> startup
ORACLE instance started.
Optional: If archive log mode is not necessary (or +FRA is not available) on the
mount host, the following steps show how to disable archiving before opening the
database.
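A sketch of disabling archiving before the open:
SQL> startup mount
SQL> alter database noarchivelog;
SQL> alter database open;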
7. Inspect the data in the test table. Since we used generation 1, which was the first
snapshot, the data in the table reflects the records from before that snapshot.
ID STEP
---------- --------------------------------------------------
1 Before snapshots taken
1. Before linking a different snapshot generation to the target SG, bring down the
database and ASM disk groups on the mount host, because the target devices’
data is about to be refreshed.
NOTE: If the target database is RAC, make sure to shut down all the instances and
dismount the relevant ASM disk groups on all nodes.
2. Choose a snapshot generation ID to link. By listing the snapshots with the -detail
flag, each generation and its date/time is shown.
3. Link the appropriate snapshot to the target devices. This time we’ll use generation
0, which is the second (latest) snapshot we took in the previous example.
NOTE: We use 'relink' in the syntax. There is no need to terminate the previous
snapshot; simply relink using the new generation ID. There is no need to specify
'-gen 0', as it is the default.
5. As before, the target host should already be zoned and masked to the target
devices; no action is needed.
6. Log in to the ASM instance on the target host. The ASM disk groups on the target
devices should be visible, though in unmounted state. Mount them.
7. Log in to the database instance on the target host. Start the database. Do not
perform any media recovery. During this step Oracle performs crash recovery.
8. Inspect the data in the test table. Since we used generation 0, which was the
second snapshot, the data in the table reflects the records from just before that
snapshot.
ID STEP
---------- ----------------------------------------
1 Before snapshots taken
2 After first snapshot taken
NOTE: Make sure that the SRDF remote device state is either ‘Synchronized’ (for SRDF/S), or
‘Consistent’ (for SRDF/A) before taking the snapshot.
1. To demonstrate what data is preserved in the different scenarios, use a test table
with known records inserted before or after specific steps. To simulate user
workload during the tests, run the SLOB OLTP benchmark on the source
clustered database.
SQL> create table testTbl (Id int, Step varchar(255)) tablespace slob;
SQL> insert into testTbl values (1, 'Before +DATA & +REDO snapshot');
SQL> commit;
2. Only if hot-backup mode is used (databases pre-12c), begin hot-backup mode.
IMPORTANT: If hot-backup mode is used in the production database, and the replications are
asynchronous (SRDF/A), then execute the ‘symrdf checkpoint’ command before creating the
remote database_sg snapshot. This ensures that the begin backup mode state is replicated to the
remote array before creating the remote snapshot. If hot-backup is not used (Oracle 12c),
executing ‘symrdf checkpoint’ is not needed as there is nothing to wait for.
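A sketch of the sequence when SRDF/A and hot-backup mode are both in use:
SQL> alter database begin backup;
# symrdf -sid 048 -sg database_sg -rdfg 10 checkpoint
# symsnapvx -sid 047 -sg database_sg -name database_snap establish
SQL> alter database end backup;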
4. Only if hot-backup mode is used (databases pre-12c), end hot backup mode.
5. For demonstration purposes, insert another known record into the production
database after the first remote snapshot.
SQL> insert into testTbl values (2, 'After +DATA & +REDO snapshot');
SQL> commit;
7. Perform this step only if RMAN incremental backups are offloaded to the mount
host. In that case, the BCT file version must be switched manually on the
production host, just like RMAN would have done automatically at the end of the
backup if it was performed from the production host.
Make sure BCT is enabled, then switch its version.
8. Create the archive log snapshot: a snapshot of the ASM +FRA disk group (the
fra_sg SG). This snapshot includes sufficient archives to recover the database so
it can be opened.
IMPORTANT: If SRDF/A is used, the 'symrdf checkpoint' command must be executed before
creating the remote +FRA snapshot, whether or not hot-backup mode was used, to make sure the
latest archive logs (after the log switch on production) have arrived at the remote array
before the remote snapshot is created.
Optionally, verify that the devices are in ‘synchronized’ state for SRDF/S
replications, or ‘consistent’, for SRDF/A replications.
9. For demonstration purposes, insert the last known record for this test.
10. To inspect the snapshots created, use the appropriate level of detail:
symsnapvx list
symsnapvx -sg <sg_name> list
symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -detail
symsnapvx -sg <sg_name> -snapshot_name <snapshot_name> list -gb -summary
The remote SnapVX restore, together with the SRDF restore, work in parallel to restore
the data as fast as possible to the production database at the local site.
Note that only the data files portion of the snapshot is restored to the production
database. The assumption is that production database +REDO and +FRA are intact. If
that’s not the case, they can be restored as well, although overwriting the last production
redo logs means limited data loss. To restore only data, use the child SG: ‘data_sg’. To
restore both data and redo, use the parent SG: ‘database_sg’. To restore FRA, use the
‘fra_sg’ snapshot.
Be sure to read all the steps first, and especially make sure that the production redo
logs are not overwritten by mistake by the snapshot restore.
Procedure
Follow these steps to restore a local production database from a remote recoverable
snapshot.
2. Shut down the production database (if it is still running) and dismount the ASM
disk group that will be restored. Other disk groups can stay online. In this case we
restore only +DATA.
NOTE: If the target database is RAC, make sure to shut down the instances and dismount
the ASM disk groups on all nodes.
NOTE: Make sure only +DATA is dismounted and not +REDO or +FRA, assuming they survived
the disaster.
b. Stop SRDF replication for database_sg. The '-force' flag is required for
SRDF/A. If consistency is enabled, disable it, because we will restore only
data_sg, which is part of database_sg. Change the SRDF mode to adaptive
copy.
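A hedged sketch of these SRDF commands follows. The RDF group number (10) is an assumption for illustration; the run() helper echoes instead of executing:

```shell
run() { echo "+ $*"; }

# Disable SRDF consistency protection first, because only part of
# database_sg (data_sg) will be restored (RDF group 10 is an assumption):
run symrdf -sg database_sg -rdfg 10 disable -force

# Suspend replication and switch to adaptive copy mode:
run symrdf -sg database_sg -rdfg 10 suspend -force
run symrdf -sg database_sg -rdfg 10 set mode acp_disk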
3. List the remote snapshots (array ID 047) and restore the desired one. Note that
we use the 'data_sg' SG for the restore so that we do not overwrite redo_sg of the
production database.
NOTE: It is important to start with the SnapVX restore, followed by the SRDF restore, so that
they can work in parallel. If the SRDF restore is performed first, the SnapVX restore cannot
take place until the SRDF restore finishes or is stopped.
----------------------------------------------------------------------------
Sym Num Flags
Dev Snapshot Name Gens FLRG TS Last Snapshot Timestamp
----- -------------------------------- ---- ------- ------------------------
00178 database_2017-11-22_11-27-29 1 .X.. .. Wed Nov 22 11:27:47 2017
database_2017-11-22_11-26-04 1 .... .. Wed Nov 22 11:26:22 2017
database_2017-11-21_16-13-25 1 .... .. Tue Nov 21 16:13:42 2017
database_2017-11-21_16-03-13 1 .... .. Tue Nov 21 16:03:30 2017
database_2017-11-21_15-14-52 1 .... .. Tue Nov 21 15:15:09 2017
00179 database_2017-11-22_11-27-29 1 .X.. .. Wed Nov 22 11:27:47 2017
database_2017-11-22_11-26-04 1 .... .. Wed Nov 22 11:26:22 2017
database_2017-11-21_16-13-25 1 .... .. Tue Nov 21 16:13:42 2017
database_2017-11-21_16-03-13 1 .... .. Tue Nov 21 16:03:30 2017
database_2017-11-21_15-14-52 1 .... .. Tue Nov 21 15:15:09 2017
...
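Based on the listing above, the restore itself might look like the following sketch. The array ID (047) and snapshot name come from this example; the run() helper echoes only:

```shell
run() { echo "+ $*"; }

# Restore the chosen remote snapshot using the child SG data_sg, so the
# production redo logs are not overwritten:
run symsnapvx -sid 047 -sg data_sg \
    -snapshot_name database_2017-11-22_11-26-04 restore
```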
4. As soon as the remote SnapVX restore operation starts, start restoring SRDF.
Make sure to only restore data_sg in both cases.
5. Data copy from both the remote snapshot and SRDF will take place in parallel.
Use the following commands to track the progress.
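A sketch of steps 4 and 5, again with an assumed RDF group of 10 and the run() helper echoing instead of executing:

```shell
run() { echo "+ $*"; }

# Start the SRDF restore for data_sg only (RDF group 10 is an assumption):
run symrdf -sg data_sg -rdfg 10 restore

# Track both copy operations until they complete; the snapshot list
# flags show the restore state:
run symrdf -sg data_sg -rdfg 10 query
run symsnapvx -sid 047 -sg data_sg list -detail
```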
6. Once both the SnapVX and SRDF restores are done, verify the completion.
NAME STATE
------------------------------ -----------
DATA MOUNTED
FRA MOUNTED
GRID MOUNTED
REDO MOUNTED
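The MOUNTED states above follow the remount of the restored disk group. A per-node sketch of that remount and check (the run() helper echoes only):

```shell
run() { echo "+ $*"; }

# Remount the restored +DATA disk group (repeat on every RAC node):
run asmcmd mount DATA

# List the mounted disk groups; all four should now appear:
run asmcmd lsdg
```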
9. Mount the database and perform media recovery. If hot-backup mode was not used
when the snapshot was created, use the 'snapshot time' syntax, similar to the
earlier use case Opening a recoverable database on a mount host; only this
time, the recovery takes place on the production host rather than the mount host.
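A sketch of that recovery, assuming the snapshot was taken at 2017-11-22 11:26:04 without hot-backup mode. The SQL is only echoed here; paste it into SQL*Plus on the production host and adjust the timestamp to your own snapshot:

```shell
# Store the recovery SQL and print it (timestamp is this example's
# snapshot time, not a universal value):
RECOVERY_SQL=$(cat <<'EOF'
STARTUP MOUNT;
RECOVER AUTOMATIC DATABASE UNTIL CANCEL SNAPSHOT TIME '2017-11-22 11:26:04';
ALTER DATABASE OPEN RESETLOGS;
EOF
)
echo "$RECOVERY_SQL"
```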
Database altered.
SQL> select * from testTbl;
ID STEP
---------- ----------------------------------------
1 Before +DATA & +REDO snapshot
2 After +DATA & +REDO snapshot
If the production database redo logs are not available, you can open the database
with RESETLOGS.
Database altered.
If the production redo logs are available, apply the latest redo log transactions using
RMAN, as shown in the following example.
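With intact redo logs the recovery is complete rather than point-in-time. A hedged sketch of such an RMAN session (echoed here, not executed):

```shell
# Store the RMAN commands and print them; run them in an RMAN session
# connected as target on the production host:
RMAN_CMDS=$(cat <<'EOF'
RECOVER DATABASE;
ALTER DATABASE OPEN;
EOF
)
echo "$RMAN_CMDS"
```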
Statement processed
RMAN> quit
10. We can now see that the latest transactions are visible.
ID STEP
---------- ----------------------------------------
1 Before +DATA & +REDO snapshot
2 After +DATA & +REDO snapshot
3 After +FRA snapshot
11. The database is now available for all operations and all nodes can be brought
online. If the database was opened with resetlogs, create a new recoverable
backup image immediately as the new backup base.
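Creating that new backup baseline can be sketched as follows. The timestamped snapshot name follows this paper's convention; the run() helper echoes only:

```shell
run() { echo "+ $*"; }

# Take a fresh snapshot of the whole database as the new backup baseline:
run symsnapvx -sg database_sg \
    -name "database_$(date +%Y-%m-%d_%H-%M-%S)" establish
```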
Chapter 7: Summary and Conclusion
Summary
Business continuity, disaster protection, and continuous availability are important topics
for which every mission-critical database environment must have a strategy. By
understanding the benefits that VMAX All Flash, SnapVX, and SRDF provide, you can
develop a strong strategy not only to protect the primary databases, but also to allow fast
and efficient creation of copies for purposes such as testing, development, reporting, data
validation, and backup offloading.