Dataguard Questions
DMON?
The Data Guard broker monitor process. DMON is started when the Data Guard broker is
started. It is the main broker process: it coordinates all broker actions and
maintains the broker configuration files. The process is enabled or disabled with
the DG_BROKER_START parameter.
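As a minimal sketch of how the broker (and with it DMON) is switched on, assuming an spfile is in use and the session is connected with SYSDBA privileges, the parameter can be set and then checked:

```sql
-- Start the Data Guard broker on this instance; DMON starts with it.
-- SCOPE = BOTH assumes the instance was started from an spfile.
ALTER SYSTEM SET DG_BROKER_START = TRUE SCOPE = BOTH;

-- Confirm the current setting.
SELECT NAME, VALUE
FROM   V$PARAMETER
WHERE  NAME = 'dg_broker_start';
```

The same statement with FALSE stops the broker on that instance.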
DRCn
These network receiver processes establish the connection from the source database's
NSVn process. When the broker needs to send something (e.g. data or SQL) between
databases, it uses this NSV-to-DRC connection. These connections are started as
needed.
LNS?
In Data Guard, the LNS process performs the actual network I/O and waits for each
network I/O to complete. Each LNS has a user-configurable buffer that accepts
outbound redo data from the LGWR process. The NET_TIMEOUT attribute is used only
when the LGWR process transmits redo data using an LGWR Network Server (LNS) process.
MRP?
Managed Recovery Process (MRP)
In a Data Guard environment, the managed recovery process applies archived redo
logs to the standby database.
RFS?
Remote File Server process (RFS)
In a Data Guard environment, the remote file server process runs on the standby
database and receives archived redo logs from the primary database.
FAL?
Fetch Archive Log (FAL) Server
Services requests for archive redo logs from FAL clients running on multiple
standby databases. Multiple FAL servers can be run on a primary database, one for
each FAL request.
Data Guard is basically "ship redo, then apply redo"; redo is the information
needed to recover a database transaction.
A production database, referred to as the primary database, transmits redo to one or
more independent replicas, referred to as standby databases.
Standby databases are in a continuous state of recovery, validating and applying
redo to stay synchronized with the primary database.
A standby database will also automatically resynchronize if it becomes temporarily
disconnected from the primary due to power outages, network problems, etc.
Firstly, redo transport services transmit redo data from the primary to the
standby as it is generated,
secondly, apply services apply the redo data and update the standby database files,
thirdly, independently of Data Guard, the database writer process updates the primary
database files, and
lastly, Data Guard automatically resynchronizes the standby database after power or
network outages using redo data that has been archived at the primary.
===================================================================================
===========================================
2.What is Max Availability , Max Performance and Max Protection?
Max Availability:
=================
Its first priority is availability and its second priority is zero data loss
protection, so it requires SYNC redo transport. If the standby server is
unavailable, the primary waits for the time specified in the NET_TIMEOUT attribute
before giving up on the standby and allowing the primary to continue processing.
Once the connection has been re-established, the primary automatically
resynchronizes the standby database.
When NET_TIMEOUT expires, the LGWR process disconnects from the LNS process,
acknowledges the commit and proceeds without the standby. Processing continues
until the current online redo log (ORL) is complete and LGWR cycles into a new ORL.
At that point a new LNS process is started and an attempt is made to connect to the
standby server.
If it succeeds, the new ORL is sent as normal; if not, LGWR disconnects again
until the next log switch.
The whole process repeats at every log switch until the standby database becomes
available. In the background, if any archive logs have been created during this
time, the ARCH process continually pings the standby database, waiting for it to
come online.
Max Performance:
=================
This mode requires ASYNC redo transport so that the LGWR process never waits for
acknowledgement from the standby database.
Also note that Oracle no longer recommends the ARCH transport method, which was
used for maximum performance in previous releases.
Max Protection:
=================
The priority for this mode is data protection, even to the point that it will
affect the primary database.
This mode uses SYNC redo transport, and the primary will not issue a commit
acknowledgement to the application
unless it receives an acknowledgement from at least one standby database. If no
standby acknowledges, the primary will stall and eventually abort,
preventing any unprotected commits from occurring. This guarantees complete data
protection.
In this setup it is advised to have two separate standby databases at different
locations with no single points of failure (SPOFs);
they should not use the same network infrastructure, as this would be a SPOF.
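The three modes above map onto a redo transport setting plus a protection mode setting on the primary. A minimal sketch, assuming a hypothetical standby whose Oracle Net service name and DB_UNIQUE_NAME are both `stby`, and a NET_TIMEOUT of 30 seconds chosen only for illustration; depending on the release, raising the mode may require the primary to be mounted rather than open:

```sql
-- Hypothetical destination: 'stby' is a placeholder service name / DB_UNIQUE_NAME.
-- SYNC AFFIRM is the transport required by Max Availability and Max Protection.
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2 =
  'SERVICE=stby SYNC AFFIRM NET_TIMEOUT=30 VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=stby'
  SCOPE = BOTH;

-- Set the protection mode (run on the primary).
ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE AVAILABILITY;

-- The other two modes use the same statement:
--   ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;
--   ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PROTECTION;

-- Compare the configured mode with the level currently being achieved.
SELECT PROTECTION_MODE, PROTECTION_LEVEL FROM V$DATABASE;
```

A PROTECTION_LEVEL lower than PROTECTION_MODE is one sign that the standby is currently not acknowledging redo.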
===================================================================================
===========================================
3.Does the standby database create redo log files?
4.What is physical standby?
5.What is logical standby?
===================================================================================
===========================================
6.How to check the gaps in archivelog shipment?
v$archived_log
v$log_history
v$archive_gap
NET_TIMEOUT
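A short sketch of how the views listed above are typically queried for gaps; the column aliases are illustrative only:

```sql
-- Run on the standby: one row per thread that currently has a gap.
-- No rows returned means no gap has been detected.
SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE#
FROM   V$ARCHIVE_GAP;

-- Run on the primary: highest sequence archived per thread, for comparison
-- with what the standby has received.
SELECT THREAD#, MAX(SEQUENCE#) AS LAST_ARCHIVED
FROM   V$ARCHIVED_LOG
GROUP  BY THREAD#;
```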
===================================================================================
===========================================
9.What is the Data Guard broker?
A distributed management tool that centralizes management of a Data Guard
configuration; it is driven from the DGMGRL command line.
The broker maintains the configuration files, which include profiles for all
databases.
Changes can be propagated to all databases within the configuration.
The broker also includes commands to start an observer,
the process that monitors the status of a Data Guard configuration and executes an
automatic failover.
You might think that the Data Guard broker is a single point of failure, but that
is incorrect:
broker processes are background processes that exist on each database in the
configuration and communicate with each other.
If the system to which you are attached fails, you simply attach to another
database within the configuration and resume management from there.
===================================================================================
===========================================
10.What are fast-start failover and switchover in Data Guard?
===================================================================================
===========================================
11.List the views that are used to check Data Guard related
information.
Issue the following query to show information about the protection mode, the
protection level, the role of the database, and switchover status:
SELECT DATABASE_ROLE, DB_UNIQUE_NAME INSTANCE, OPEN_MODE, PROTECTION_MODE,
PROTECTION_LEVEL, SWITCHOVER_STATUS FROM V$DATABASE;
On the standby database, query the V$ARCHIVED_LOG view to identify existing files
in the archived redo log.
SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;
Or
SELECT THREAD#, MAX(SEQUENCE#) AS "LAST_APPLIED_LOG" FROM V$LOG_HISTORY GROUP BY
THREAD#;
On the standby database, query the V$ARCHIVED_LOG view to verify the archived redo
log files were applied.
SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;
Query the physical standby database to monitor Redo Apply and redo transport
services activity at the standby site.
SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY;
To determine if real-time apply is enabled, query the RECOVERY_MODE column of the
V$ARCHIVE_DEST_STATUS view.
SELECT RECOVERY_MODE FROM V$ARCHIVE_DEST_STATUS;
The V$DATAGUARD_STATUS fixed view displays events that would typically be triggered
by any message to the alert log or server process trace files.
SELECT MESSAGE FROM V$DATAGUARD_STATUS;
Determining Which Log Files Were Not Received by the Standby Site.
SELECT LOCAL.THREAD#, LOCAL.SEQUENCE# FROM (SELECT THREAD#, SEQUENCE# FROM
V$ARCHIVED_LOG WHERE DEST_ID=1) LOCAL WHERE LOCAL.SEQUENCE# NOT IN (SELECT
SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND THREAD# = LOCAL.THREAD#);
If a delayed apply has been specified or an archive log is missing then switchover
may take longer than expected.
Check v$managed_standby
select process, status, sequence# from v$managed_standby;
OR alternatively:
select name, applied from v$archived_log;
===================================================================================
===========================================
12.What is the observer?
A process that monitors the status of a Data Guard configuration and executes an
automatic failover.
One point to mention concerns the split-brain scenario, where the primary and
standby both think that they are the primary database.
With Data Guard fast-start failover, a failed primary cannot open without first
receiving permission from the Data Guard observer process.
The observer will know that a failover has occurred and will refuse to allow the
original primary to open.
The observer will automatically reinstate the failed primary as a standby for the
new primary database, making
it impossible to have a split-brain condition.
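Whether fast-start failover is enabled and whether an observer is currently connected can be checked from SQL as well as from DGMGRL. A minimal sketch using the fast-start failover columns of V$DATABASE:

```sql
-- Run on the primary or the target standby.
-- FS_FAILOVER_OBSERVER_PRESENT shows whether an observer is connected to this database.
SELECT FS_FAILOVER_STATUS,
       FS_FAILOVER_CURRENT_TARGET,
       FS_FAILOVER_THRESHOLD,
       FS_FAILOVER_OBSERVER_PRESENT,
       FS_FAILOVER_OBSERVER_HOST
FROM   V$DATABASE;
```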
===================================================================================
===========================================
13.Database performance is poor after migrating to a Data Guard environment?
Check the tuning part below.
===================================================================================
===========================================
14.What is the difference between Active Data Guard and the logical standby
implementation of 10g Data Guard?
===================================================================================
===========================================
15.What are the uses of Oracle Data Guard?
Disaster recovery.
===================================================================================
===========================================
16.What is Redo Transport Services?
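Redo transport services ship redo from the primary to the standby, using the LNS
process on the primary and the RFS process on the standby.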
===================================================================================
===========================================
17.What is apply services?
--There are two methods of applying redo: Redo Apply (physical standby) and SQL
Apply (logical standby).
SQL Apply uses the logical standby process (LSP) to coordinate the apply of changes
to the standby database.
It reads the SRL and "mines" the redo by converting it to logical change records,
then builds SQL transactions and
applies the SQL to the standby database. Because there are more moving parts, it
requires more CPU, memory and I/O than Redo Apply.
SQL Apply does not support all data types,
such as XML in object-relational format and Oracle-supplied types such as Oracle
Spatial, Oracle interMedia and Oracle Text.
The benefit of SQL Apply is that the database is open read-write while apply is
active.
While you cannot make any changes to the replicated data, you can insert,
modify and delete data in local tables and schemas that have been added to the
database,
and you can even create materialized views and local indexes. This makes it ideal
for reporting tools, etc.
-A standby database that is open read-write while SQL Apply is active
-A guard setting that prevents modification of data that is being maintained by
SQL Apply
-Able to execute rolling database upgrades beginning with Oracle Database 11g using
the KEEP IDENTITY clause
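The two apply methods are started with different statements on the standby. A minimal sketch, assuming standby redo logs are configured so that real-time apply is possible:

```sql
-- Physical standby: start Redo Apply in the background, applying redo
-- from the standby redo logs as it arrives (real-time apply).
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE
  USING CURRENT LOGFILE DISCONNECT FROM SESSION;

-- Physical standby: stop Redo Apply.
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

-- Logical standby: start SQL Apply with real-time apply.
ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;

-- Logical standby: stop SQL Apply.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;

-- Logical standby: protect only the data maintained by SQL Apply,
-- leaving local tables and schemas writable, then confirm the setting.
ALTER DATABASE GUARD STANDBY;
SELECT GUARD_STATUS FROM V$DATABASE;
```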
Gap resolution when the LNS process has ceased transmitting redo to the standby
database (network issues):
The primary database continues writing to the current log file.
Data Guard uses an ARCH process on the primary database to continuously ping the
standby database during the outage.
The ARCH process queries the standby control file (via the RFS process) to determine
the last complete log file.
The ARCH process then transmits the missing files to the standby database using
additional ARCH processes.
At the next log switch, LNS attempts to connect to the standby database again and,
once it succeeds, begins transmitting the current redo while the ARCH processes
resolve the gap in the background.
Once the standby apply process has caught up to the current redo logs, it
automatically transitions out of reading the archived redo logs and
into reading the current SRL.
===================================================================================
===========================================
This setup depends heavily on network performance and can have a dramatic impact
on the primary database;
network latency has a big impact on response times.
The impact can be seen in the wait event "LNS wait on SENDREQ" found in the
v$system_event dynamic performance view.
LNS sends redo through Oracle Net Services to the RFS process on the standby
database. RFS writes the redo to disk and, once it receives write confirmation from
the disk, sends an ACK to LNS; only then does the transaction commit.
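A quick way to look at that wait event on the primary, as a minimal sketch; the LIKE pattern is simply a convenient filter that also picks up any other LNS wait events present:

```sql
-- Run on the primary. TIME_WAITED and AVERAGE_WAIT are reported in
-- hundredths of a second; values are cumulative since instance startup.
SELECT EVENT, TOTAL_WAITS, TIME_WAITED, AVERAGE_WAIT
FROM   V$SYSTEM_EVENT
WHERE  EVENT LIKE 'LNS wait on%';
```

Sampling the query twice and comparing the deltas gives a better picture than the cumulative totals.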
===================================================================================
===========================================
21.what is asynchronous transport?
Asynchronous transport (ASYNC) eliminates the requirement that the LGWR waits for
an acknowledgment from the LNS,
creating a "near zero" performance impact on the primary database regardless of the
distance between the primary and the standby locations.
The LGWR will continue to acknowledge commit success even if limited bandwidth
prevents the redo of previous transactions
from being sent to the standby database immediately. If the LNS is unable to keep
pace and the log buffer is recycled before the redo is sent to the standby,
the LNS automatically transitions to reading and sending from the log file instead
of the log buffer in the SGA.
Once the LNS has caught up, it switches back to reading directly from the
buffer in the SGA.
The log buffer hit ratio is tracked via the view X$LOGBUF_READHIST (11g). A low hit
ratio indicates that the LNS is reading from the log file instead of the log
buffer;
if this happens, try increasing the log buffer size.
select bufsize, rdmemblks, rddiskblks, hitrate from x$logbuf_readhist;
The drawback of ASYNC is the increased potential for data loss: if a failure
destroys the primary database before the transport lag is reduced to zero,
any committed transactions that are part of the transport lag are lost. So again,
make sure that the network bandwidth is adequate and that you get the
lowest latency possible.
===================================================================================
===========================================
22.How to determine if Redo Apply has recovered all redo that has been received
from the primary? (Performance)
To detect whether a gap really exists, run the same query on both databases and
compare the results.
Also compare the redo generation rate and the recovery rate using AWR reports or
V$SYSSTAT on both databases.
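One possible comparison, as a minimal sketch; the notes do not say which query was intended, so the queries below are an assumption based on standard views:

```sql
-- On the standby: how far transport and apply are behind the primary.
SELECT NAME, VALUE, TIME_COMPUTED
FROM   V$DATAGUARD_STATS
WHERE  NAME IN ('transport lag', 'apply lag');

-- On the primary: last sequence archived per thread.
SELECT THREAD#, MAX(SEQUENCE#) AS LAST_ARCHIVED
FROM   V$ARCHIVED_LOG
GROUP  BY THREAD#;

-- On the standby: last sequence applied per thread.
SELECT THREAD#, MAX(SEQUENCE#) AS LAST_APPLIED
FROM   V$ARCHIVED_LOG
WHERE  APPLIED = 'YES'
GROUP  BY THREAD#;
```

If LAST_APPLIED on the standby matches LAST_ARCHIVED on the primary for every thread, all archived redo has been applied.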
===================================================================================
===========================================
23.How to improve the performance of media recovery in standby database?
---By default, media recovery uses CPU_COUNT to determine the number of
processes to use for recovery operations.
---Set PARALLEL_EXECUTION_MESSAGE_SIZE = 65535. The message size parameter is used
by all parallel query operations and takes memory from the shared pool (check the
shared pool size).
---Ensure that Oracle can use asynchronous I/O. By default, the Oracle database is
configured for asynchronous I/O.
---Set the initialization parameter DISK_ASYNCH_IO=TRUE.
---If asynchronous I/O is not available, consider using the DBWR_IO_SLAVES
parameter to simulate asynchronous I/O.
---Set the DB_WRITER_PROCESSES parameter to a value greater than 1 when
asynchronous I/O is available. This applies only if you have sufficient CPU and
I/O bandwidth.
---Increase the buffer cache if "free buffer waits" shows up as a significant
database wait event in V$SYSTEM_EVENT.
---Increase the size of the primary database's online redo logs and the standby
database's standby redo logs to reduce the number of times a full checkpoint is
performed.
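Before changing any of the parameters above, it helps to see their current values on the standby. A minimal read-only sketch; `db_cache_size` and `log_buffer` are included here as an assumption about which sizing parameters are relevant:

```sql
-- Current values of the parameters discussed in the checklist.
SELECT NAME, VALUE
FROM   V$PARAMETER
WHERE  NAME IN ('cpu_count',
                'parallel_execution_message_size',
                'disk_asynch_io',
                'dbwr_io_slaves',
                'db_writer_processes',
                'db_cache_size',
                'log_buffer');

-- Check whether free buffer waits are significant on this instance.
SELECT EVENT, TOTAL_WAITS, TIME_WAITED
FROM   V$SYSTEM_EVENT
WHERE  EVENT = 'free buffer waits';
```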
Standby database:
------------------
---A media recovery instance is very write and update intensive. It needs less
overall CPU resources but equal or greater I/O and memory capacity.
---Monitor the MRP process (the MRP0 PID is found in V$MANAGED_STANDBY).
---less no of CPU utilization during distinct rows have changed.
---Timed_Statistics=TRUE
---sar command
---vmstat command
---AWR reports from the primary database
---Output from the following query from V$SYSSTAT run on a 60-second interval on
the standby database:
select name, value, to_char(sysdate, 'DD-MON-YYYY HH:MI:SS')
from v$sysstat where value > 0 order by name;
---Output from the following query from V$SYSTEM_EVENT run on a 60-second interval
on the standby database:
---Output from the following query on V$RECOVERY_PROGRESS run once at the
completion of the media recovery operation on the standby:
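The queries for the last two items are not given in these notes. A minimal sketch of what they could look like, using only standard columns of the two views; the item names in the second query are the apply-rate rows commonly reported by V$RECOVERY_PROGRESS and are an assumption here:

```sql
-- V$SYSTEM_EVENT, sampled every 60 seconds on the standby.
SELECT EVENT, TOTAL_WAITS, TIME_WAITED,
       TO_CHAR(SYSDATE, 'DD-MON-YYYY HH24:MI:SS') AS SAMPLED_AT
FROM   V$SYSTEM_EVENT
WHERE  TIME_WAITED > 0
ORDER  BY TIME_WAITED DESC;

-- V$RECOVERY_PROGRESS, run once when media recovery completes.
SELECT TO_CHAR(START_TIME, 'DD-MON-YYYY HH24:MI:SS') AS START_TIME,
       ITEM, UNITS, SOFAR
FROM   V$RECOVERY_PROGRESS
WHERE  ITEM IN ('Active Apply Rate', 'Average Apply Rate', 'Redo Applied');
```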
===================================================================================
===========================================
24.
===================================================================================
====================================