Professional Documents
Culture Documents
by Daniel T. Liu
Introduction
One of the biggest responsibilities for a database administrator is provide high availability and reduce
unplanned downtime for a database. However, this has become a major challenge as our database size
increased so dramatically over the years and our critical business information system requires 24x7 uptime.
In an unplanned downtime when a terabyte database was corrupted, it may take hours, even days to
restore such a database. To minimize downtime and avoid data loss, we need a standby database that can
Oracle9i Data Guard technology meets such a challenge. Oracle version 7.3 was the first release to support
standby database, however, the process of transferring redo logs was manual. The standby database has no
other use until it takes the role of the primary database. Oracle8i introduced the concept of automatic
shipping and application of redo log files from the primary site to the standby site. It also allows the standby
database to be opened for read only while the recovering process is stopped. Oracle9i release 1 introduces
the new concept of protection mode, preventing the primary and the standby database from diverging. It
also introduces Data Guard broker, an interface to manage the Data guard environment. Oracle9i release 2
This article provides an overview of Oracle9i Data Guard technology. It offers an introduction to the basic
concepts and architectures of Data Guard. It discusses the selection of several of data protection mode,
steps to setup a Data Guard environment, and steps to perform failover and switchover operations. It also
Oracle9i Data Guard is the management, monitoring, and automation software that work with a production
database and one or more standby databases to protect data against failures, errors, and corruption that
Primary database: A primary database is a production database. The primary database is used to create a
standby database. Every standby database is associated with one and only one primary database.
Standby database: A physical or logical standby database is a database replica created from a backup of a
primary database.
block-for-block basis. It is updated by performing recovery from redo logs generated from the
primary database.
Log transport services: Log transport services control the automated transfer of archived redo from the
Network configuration: The primary database is connected to one or more remote standby database via
Oracle Net.
Log apply services: Log apply services apply the archived redo logs to the standby database.
Data guard broker: Data Guard Broker is the management and monitoring component with which you
configure, control, and monitor a fault tolerant system consisting of a primary database protected by one or
Figure 1
A database can operate in one of the two mutually exclusive roles: primary or standby database.
Failover
During a failover, one of the standby databases takes the primary database role.
Switchover
In Oracle9i, primary and standby database can continue to alternate roles. The primary database can switch
the role to a standby database; and one of the standby databases can switch roles to become the primary.
Data Guard Manger is a GUI version of Data Guard broker interface that allows you to automate many of the
It is an alternative interface to using the Data Guard Manger. It is useful if you want to use the broker from
batch programs or scripts. You can perform most of the activities required to manage and monitor the Data
$ dgmgrl
Note: The use of an SPFILE is required with Oracle9i Release 2 when using a Data Guard Broker
Configuration.
Process Architecture
Physical Standby Processes Architecture
The log transport services and log apply services use the following processes to ship and apply redo logs to
writes to the online redo logs. The archiver process (ARCH) creates a copy of the online redo logs, and
writes to the local archive destination. Depending on the configuration, the archiver process or log writer
process can also transmit redo logs to standby database. When using the log writer process, you can specify
synchronous or asynchronous network transmission of redo logs to remote destinations. Data Guard
achieves synchronous network I/O using LGWR process. Data Guard achieves asynchronous network I/O
using LGWR network server process (LNS). These network severs processes are deployed by
On the standby database site, the remote file server process (RFS) receives archived redo logs from the
primary database. The primary site launches the RFS process during the first log transfer. The redo logs
information received by the RFS process can be stored as either standby redo logs or archived redo logs.
Data Guard introduces the concept of standby redo logs (separate pool of log file groups). Standby redo logs
must be archived by the ARCH process to the standby archived destination before the managed recovery
The fetch archive log (FAL) client is the MRP process. The fetch archive log (FAL) server is a foreground
process that runs on the primary database and services the fetch archive log requests coming from the FAL
client. A separate FAL server is created for each incoming FAL client.
When using Data Guard broker (dg_broker_start = true), the monitor agent process named Data Guard
Broker Monitor (DMON) is running on every site (primary and standby) and maintain a two-way
communication.
Figure 2
The major difference between the logical and physical standby database architectures is in its log apply
services.
The logical standby process (LSP) is the coordinator process for two groups of parallel execution process
(PX) that work concurrently to read, prepare, build, and apply completed SQL transactions from the archived
redo logs sent from the primary database. The first group of PX processes read log files and extract the SQL
statements by using LogMiner technology; the second group of PX processes apply these extracted SQL
transactions to the logical standby database. The mining and applying process occurs in parallel. Logical
standby database does not use standby online redo logs. Logical standby database does not have FAL
capabilities in Oracle9i. All gaps are resolved by the proactive gap resolution mechanism running on the
Figure 3
Note: Logical Standby database is an Oracle9i Release 2 feature. In 9.2, the LGWR SYNC actually does use
the LNS as well. Only SYNC=NOPARALLEL goes directly from the LGWR. The default SYNC mode is
SYNC=PARALLEL.
Depending on the business requirement, you can set Data Guard in different protection modes.
Guaranteed protection: The standby database cannot diverge from the primary database and no data can
be lost. A transaction is not committed on the primary database until it has been confirmed that the
transaction data is available on at least one standby database. When operating in this mode, it provides the
highest degree of data availability. However, it could adversely affect primary database performance.
Instant protection: The standby database could temporarily diverge from the primary database. However,
the standby database will be synchronized after the failover process, no data will be lost.
Rapid protection: The log writer process transmits redo logs to the standby site. The primary database
continues its operation without regard to the database availability on the standby database. There is risk to
Delayed protection: The archiver process transmits the redo logs to the standby sites. This is the only
Mode Log Writing Network Trans Disk Write Redo Log Reception Failure Resolution
Process Mode Option Option Option
Guaranteed LGWR SYNC AFFIRM Standby redo logs Protect
Instant LGWR SYNC AFFIRM Standby redo logs Unprotect
Rapid LGWR ASYNC NOAFFIRM Standby redo logs Unprotect
Delayed ARCH ASYNC NOAFFIRM Archived redo logs Unprotect
Note: Oracle recommends Standby Redo Logs on all of the top three modes.
Maximum Protection: It offers the highest level of data availability for the primary database. Redo records
are synchronously transmitted from the primary database to the standby database using LGWR process.
Transaction is not committed on the primary database until it has been confirmed that the transaction data
is available on at least one standby database. This mode is usually configured with at least two standby
databases. If all standby databases become unavailable, it may result in primary instance shutdown. This
ensures that no data is lost when the primary database loses contact with all the standby databases.
Standby online redo logs are required in this mode. Therefore, logical standby database cannot participate in
Maximum Availability: It offers the next highest level of data availability for the primary database. Redo
records are synchronously transmitted from the primary database to the standby database using LGWR
process. The transaction is not complete on the primary database until it has been confirmed that the
transaction data is available on the standby database. If standby database becomes unavailable, it will not
shut down the primary database. Instead, the protection mode is temporarily switched to maximum
performance mode until the fault has been corrected and the standby database will re-synchronize with the
primary database. This protection mode supports both physical and logical standby databases, and only
Maximum Performance: It is the default protection mode. It offers slightly less primary database
protection than maximum availability mode but with higher performance. Redo logs are asynchronously
shipped from the primary database to the standby database using either LGWR or ARCH process. When
operating in this mode, the primary database continues its transaction processing without regard to data
availability on any standby databases and there is little or no effect on performance. This protection mode is
similar to the combination of 9iR1's Instance, Rapid, and Delay modes. It supports both physical and logical
standby databases.
The best way to understand Data Guard implementation is to setup one manually.
For simple illustration, a hypothetical Data Guard environment is given (see table below).
2. One primary database instance called prod_01 on host server_01; one physical
5. The purpose of TNS entry prod_01 and prod_02 are used for LGWR/ARCH process
to ship redo logs to the standby site, and for FAL process to fetch redo logs from the primary
site.
6. Since Data Guard broker is not used here, we set dg_broker_start to false.
7. The protection mode is set to best performance. Therefore, only local archive
(log_archive_dest_2) is set to optional for LGWR process, with network transmission method of
8. The standby site is not using standby online redo logs. Therefore, the redo log
Figure 4
The following eight steps show how to set up a Data Guard environment:
Setup the init.ora file for both primary and standby databases.
Setup the listener.ora file for both primary and standby databases.
cp /u02/oradata/prod/* /u03/backup/prod/
enabled.
rcp u01/app/oracle/admin/prod/ctl/stbycf.ctl
server_02:/u01/app/oracle/admin/prod/ctl/control01.ctl
Step 5: Start the Listeners on Both Primary and Standby Site
Start the primary database listener.
Connect as sysdba.
Issue the following command to bring the standby database in managed recover
mode.
SQL> alter database recover managed standby database disconnect from session;
Step 8: Monitor the Log Transport Services and Log Apply Services
Check the standby alert log file to see if the new logs have applied to the standby
database.
standby database (prod_02) becomes the new primary database. It is possible to have data loss.
In 9.0.1, since you do not have Standby Redo Log files, you issue the following command on the standby
The ACTIVATE STANDBY DATABASE clause automatically creates online redo logs. It also performed a reset
logs operation. New logs generated from the new primary database (prod_02) cannot be applied to the old
In 9.2.0, you can gracefully Failover even without standby redo log files. Issue the following command on
SQL> alter database recover managed standby database skip standby logfiles;
This will apply all available redo and make the standby available to become a Primary. Complete the
operation by switching the standby over to the primary role with the following command:
The old primary (prod_01) has to be discarded and can not be used as the new standby database. You need
to create a new standby database by backing up the new primary and restore it on host server_01. The time
to create a new standby database exposes the risk of having no standby database for protection.
After failover operation, you need to modify TNS entry for 'prod' to point to the new instance and host name
Switchover Steps
Unlike failover, a switchover operation is a planned operation. All the archive logs required bringing the
standby to the primary's point in time need to be available. The primary database's online redo logs also
must be available and intact. During switchover operation, primary and standby databases switch roles. The
old standby database (prod_02) becomes the new primary, and the old primary (prod_01) becomes the new
standby database.
DATABASE_ROLE SWITCHOVER_STATUS
------------------------- -----------------------------------
PRIMARY TO STANDBY
Add the following two parameters. These two parameters can also be set
fal_server = "prod_02"
fal_client = "prod_01"
DATABASE_ROLE SWITCHOVER_STATUS
------------------------- -----------------------------------
PHYSICAL STANDBY TO PRIMARY
fal_server = "prod_01"
fal_client = "prod_02"
SQL> startup;
Step 5: Add Temp Tablespace
Change the TNS entry on all application hosts to point to the new primary
Prod =
(description =
(address = (protocol = tcp) (host = server_02) (port = 1522)
(connect_data = (sid = prod_02))
)
Implementation Tips
Tip #1: Primary Online Redo Logs — The number of redo groups and the size of redo logs are two key
factors in configuring online redo logs. In general, you try to create the fewest groups possible without
hampering the log writer process's ability to write redo log information. In a Data Guard environment, LGWR
process may take longer to write to the remote standby sites, you may need to add additional groups to
guarantee that a recycled group is always available to the log writer process. Otherwise, you may receive
incomplete logs on the standby sites. The size of redo log is determined by the amount of transaction
needed to be applied to a standby database during database failover operation. A small size of redo will
minimize the standby database lag time; however, it may cause more frequent log switches and require
more redo groups for log switches to occur smoothly. On the other hand, large size redo logs may require
few groups and less log switches, but it may increase standby database lag time and potential for more data
loss. The best way to determine if the current configuration is satisfactory is to examine the contents of the
log writer process's trace file and the database's alert log.
For example, the following message from the alert log may indicate a need for more log groups: ORA-
Tip #2: Standby Online Redo Logs vs. Standby Archived Redo Logs — Online redo logs transferred
from the primary database are stored as either standby redo logs or archived redo logs. Which redo log
process
Can place on raw devices
protection
Tip #3: Enforce Logging — Enforce Logging is a new feature in Oracle9i Release 2, it is recommended that
you set the FORCE LOGGING clause to force redo log to be generated for individual database objects set to
FORCE_LOGGING
--------------
NO
Tip #4: RMAN Backup — A failover operation reset logs for the new primary. If you use RMAN to backup
your database, you need to create a new incarnation of the target database. Otherwise, your RMAN backup
will fail.
Tip #5: Disable Log Transport Services When Standby Database is Down — When a standby
database or host is down for maintenance, it is advisable to temporarily disable the log transport services
for that site. Especially during a heavily transaction period, one behavior observed in Oracle9i R1 is that
when one of the standby database is down for maintenance, it can temporarily freeze the primary database
even the data protection mode is set to rapid mode. To avoid such problem, you can issue this command on
Tip #6: Standby Database Upgrade — Steps to upgrade standby database to newer database version:
Tip #7: Data Guard Broker — Oracle9i Release 1 broker configuration supported only one primary site
and one physical standby site. The first release of broker is not so user-friendly with limited features.
Oracle9i Release 2 broker has made great improvements. The new configuration now support up to nine
standby sites (including logical standby database). Both Data Guard Manager and CLI support switchover
and failover operations. You must upgrade to Oracle Enterprise Manager Release 9.2 to managed broker
Tip #8: Using 'Delay' Option to Protect Logical Physical Corruptions — You may utilize the delay
option (if you have multiple standby sites) to prevent physical/logical corruption of your primary. For
instance, your standby #1 may not have 'Delay' on to be your disaster recovery standby database.
However, you may opt to implement a delay of minutes or hours on your standby #2 to allow recover from
Tip #9: Always Monitor Log Apply Services and Check Alert Log File for Errors — If you are not
using Data Guard broker, here is a script to help you to monitor your standby database recover process:
$ cat ckalertlog.sh
####################################################################
## ckalertlog.sh ##
####################################################################
#!/bin/ksh
export EDITOR=vi
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/9.2.0
export ORACLE_HOME LD_LIBRARY_PATH=$ORACLE_HOME/lib
export TNS_ADMIN=/var/opt/oracle
export ORATAB=/var/opt/oracle/oratab
PATH=$PATH:$ORACLE_HOME:$ORACLE_HOME/bin:/usr/ccs/bin:/bin:/usr/bin:/usr/sbin:/
sbin:/usr/openwin/bin:/opt/bin:.; export PATH
DBALIST="primary.dba@company.com,another.dba@company.com";export
do
cd $ORACLE_BASE/admin/$SID/bdump
if [ -f alert_${SID}.log ]
then
mv alert_${SID}Log alert_work.log
touch alert_${SID}Log
fi
then
fi
rm -f alert.err
rm -f alert_work.log
done
#########################################################
#########################################################
Conclusion
This paper provides an overview of Oracle9i Data Guard technology. The paper offers an introduction to the
basic concepts and architectures of Data Guard. It reviews different data protection modes. It discusses the
following implementation steps: planning for higher availability, creating the standby database environment,
setting up the log transport services, managing the log apply services, and administrating the Data Guard
environment. It also shows steps to perform switchover and failover operations, along with some
implementation tips. By implementing Oracle9i Data Guard technology, organizations will achieve higher