@RobertPBialek doag2017
Who Am I
3 23.11.2017 Trivadis DOAG17: Oracle Database Service High Availability with Data Guard?
With over 600 specialists and IT experts in your region.
Technology on its own won't help you.
You need to know how to use it properly.
Database Service High Availability – Goal
Consider the whole SW/HW stack. Find the best cost/risk ratio.
[Figure: cost/risk diagram. Labels: Effort, Downtime Costs, Complexity, Availability, Best cost/risk ratio. Stack components: Storage, Server(s), Database, Application, Clients.]
Database Service High Availability – Options
Data Replication – used mostly for data, rather than service, high availability:
– Data Guard (Fast-Start Failover/Global Data Services).
– GoldenGate (Global Data Services).
– Other replication technologies.
Agenda
1. Introduction
2. Configuration
3. Special Cases
4. Conclusions
Introduction
Database Service HA with Data Guard? – Introduction
Data Guard: yes, it can also be used for service high availability:
– Planned downtimes – manual switchover.
– Unplanned downtimes – fast-start failover or manual failover.
[Figure: FSFO configuration with a primary and a standby database.]
Why might we consider Data Guard for service high availability:
– Less complex than a cluster installation.
– Infrastructure requirements not that high (even local storage is sufficient).
– Not subject to additional license fees (EE license assumed).
– Additionally, many other advantages: data high availability, snapshot standby,
potentially rolling upgrade capability, ...
Database Service HA with Data Guard? – Big Picture
[Figure: database clients connect to the RW service on the primary; transparency is required on failover/switchover. A master observer is required; backup observers are optional (12.2). Standbys: the primary target failover standby, plus candidate target failover standbys (optional, 12.2).]
Database Service HA with Data Guard? – Monitoring
[Figure: observer monitoring loop. The observer pings the primary and the target standby and sleeps about 3 sec. between pings. Other labels: Connect, Logoff, Reconnect interval expired, Failover condition detected.]
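The ping-and-sleep cycle on this slide can be sketched as a small simulation. This is a minimal sketch, not the observer's actual implementation: the function name, the list-of-booleans ping model, and the 30-second threshold are assumptions made for illustration. Only the roughly 3-second interval comes from the slide.

```python
from typing import Optional, Sequence


def detect_failover_condition(
    ping_ok: Sequence[bool],
    ping_interval_s: float = 3.0,  # slide: the observer sleeps ~3 sec. between pings
    threshold_s: float = 30.0,     # assumption: stands in for FastStartFailoverThreshold
) -> Optional[float]:
    """Return the simulated time (in seconds) at which the primary has been
    unreachable for at least threshold_s, or None if that never happens."""
    first_failure_at: Optional[float] = None
    for i, ok in enumerate(ping_ok):
        now = i * ping_interval_s
        if ok:
            first_failure_at = None      # contact restored, reset the outage clock
            continue
        if first_failure_at is None:
            first_failure_at = now       # first missed ping of this outage
        if now - first_failure_at >= threshold_s:
            return now                   # failover condition detected
    return None
```

With two successful pings followed by eleven failed ones, the sketch reports the condition at second 36. A single successful ping in between resets the clock, which is why short network blips do not trigger a failover in this model.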
Configuration
Data Guard Protection Modes with FSFO – Prerequisites
Per protection mode (three columns on the slide):
– FSFO: Guaranteed zero data loss.
– FSFO: Data loss possible.
– FSFO: Guaranteed zero data loss.
Fast-Start Failover – Observer (1)
Monitoring component, initiates a failover procedure.
In 12.2, up to 3 observers (in background) can be started:
– One master and up to two backup (standby) observers.
DGMGRL> START OBSERVER OBS1.TRIVADIS.COM IN BACKGROUND
FILE IS '$ADMIN_SID/fsfo_$ORACLE_SID.dat'
LOGFILE IS '$ADMIN_SID/fsfo_$ORACLE_SID.log'
CONNECT IDENTIFIER IS <Alias1>.TRIVADIS.COM;
Oracle wallet required.
[Figure: observer pinging the primary and the target failover standby.]
In older releases:
– Only one running observer (HA needs to be addressed).
nohup dgmgrl -logfile $ADMIN_SID/fsfo_$ORACLE_SID.log <<EOD &
CONNECT $CONNECT_DATA
START OBSERVER FILE='$ADMIN_SID/fsfo_$ORACLE_SID.dat';
EOD
Fast-Start Failover – Observer (2)
Fast-start failover is initiated by the master observer to the target standby database, if one of the following conditions is detected:
– The observer and the target standby database cannot reach the primary database (default: ObserverOverride='FALSE').
[Figure: observer pinging the primary and the target failover standby.]
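The connectivity condition above can be written as a small decision function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the FastStartFailoverThreshold timer is not modelled, and the `observer_override=True` branch follows Oracle's documented behaviour for ObserverOverride (the observer's lost connection alone is sufficient), which the slide itself does not spell out.

```python
def failover_condition_met(
    observer_sees_primary: bool,
    standby_sees_primary: bool,
    observer_override: bool = False,  # slide default: ObserverOverride='FALSE'
) -> bool:
    """Return True if the connectivity condition for fast-start failover holds."""
    if observer_sees_primary:
        return False                  # the observer still reaches the primary
    if observer_override:
        return True                   # TRUE: the observer's view alone is enough
    return not standby_sees_primary   # FALSE: the standby must have lost contact too
```

With the default setting, a primary that is cut off from the observer but still ships redo to the standby is not failed over, which matches the private redo network case later in the deck.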
Data Guard: Example Role-Based Services
Client-Side Configuration – Main Problems To Address
[Figure: two problem cases, labelled CASE 1 and CASE 2.]
New Oracle Net Session – Connect Timeout (1)
[Figure: client connecting through Oracle Net to two listeners (LSNR); steps 1 to 3, including the TCP three-way handshake.]
sqlnet.ora parameters (OCI, ODP.net)
– Applies to each IP that a host name resolves to!
– All Oracle client versions supported.
TCP.CONNECT_TIMEOUT=3 #default 60 sec.
SQLNET.OUTBOUND_CONNECT_TIMEOUT=5 #no default
For clients >=11.2:
OLTP.trivadis.com =
  (DESCRIPTION =
    (FAILOVER=ON)(LOAD_BALANCE=OFF)
    (CONNECT_TIMEOUT=5)(RETRY_COUNT=3)(RETRY_DELAY=1)(TRANSPORT_CONNECT_TIMEOUT=3)
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = italy)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = sweden)(PORT = 1521)))
    (CONNECT_DATA = (SERVICE_NAME = OLTP_RW.trivadis.com)))
Callout: Introduced in 12.1.0.2
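To get a feeling for how long a client may wait before giving up, the values in the example entry can be combined into a rough worst-case estimate. This is a back-of-the-envelope sketch, not Oracle Net's exact algorithm: it assumes every TCP connect attempt runs into TRANSPORT_CONNECT_TIMEOUT, that the address list is traversed once plus RETRY_COUNT more times, and that each host name resolves to a single IP. The function name is hypothetical.

```python
def worst_case_connect_seconds(
    addresses: int = 2,                       # italy and sweden in the example entry
    ips_per_host: int = 1,                    # assumption: one IP per host name
    transport_connect_timeout_s: float = 3.0,
    retry_count: int = 3,
    retry_delay_s: float = 1.0,
) -> float:
    """Estimate the time until a connect attempt fails when no address answers."""
    # One traversal of the address list: every IP times out at the TCP level.
    one_pass = addresses * ips_per_host * transport_connect_timeout_s
    # Assumption: the initial pass plus RETRY_COUNT retries.
    passes = 1 + retry_count
    # RETRY_DELAY is waited before each retry, so retry_count times in total.
    return passes * one_pass + retry_count * retry_delay_s
```

With the slide's values the estimate is 27 seconds. If each host name resolved to two IPs, it would grow to 51 seconds, which is why the "applies to each IP" remark matters.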
New Oracle Net Session – Connect Timeout (2)
JDBC Thin clients can alternatively use the following driver property (value in ms):
– Overrides CONNECT_TIMEOUT from the address description parameters.
Properties prop = new Properties();
prop.put(oracle.net.ns.SQLnetDef.TCP_CONNTIMEOUT_STR, ""+3000);
ods.setConnectionProperties(prop);
Established Oracle Net Session – Re-Connect Timeout
[Figure: established session; steps 2 to 4: timeout, then client failover.]
Using the following parameters is not a good idea:
SQLNET.RECV_TIMEOUT=30 #no default value, OCI driver
SQLNET.SEND_TIMEOUT=30 #no default value, OCI driver
Better solution:
– If possible use: Fast Application Notification/Fast Connection Failover.
– Tuning the OS kernel parameter tcp_retries2 might also be an alternative.
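Why tcp_retries2 matters can be shown with a short calculation. The sketch below assumes Linux, where the retransmission timeout doubles after each unacknowledged retransmission and is capped at 120 seconds. It also assumes the timeout starts at the 200 ms minimum; on a real connection the starting value depends on the measured round-trip time, so the result is a lower bound. The function name is hypothetical.

```python
def tcp_retries2_timeout_seconds(
    retries2: int = 15,          # Linux default for net.ipv4.tcp_retries2
    rto_initial_s: float = 0.2,  # assumption: RTO starts at the 200 ms minimum
    rto_max_s: float = 120.0,    # Linux upper bound for the RTO
) -> float:
    """Approximate time until an established connection with unacknowledged
    data is declared dead."""
    total = 0.0
    rto = rto_initial_s
    for _ in range(retries2 + 1):
        total += min(rto, rto_max_s)  # wait one (capped) timeout
        rto *= 2                      # exponential backoff
    return total
```

Under these assumptions the default of 15 gives roughly 924.6 seconds, i.e. more than 15 minutes before a client notices a dead primary, while a value of 5 gives about 12.6 seconds.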
Client HA Features – Overview
Special Cases with FSFO: Candidate Targets Standby
Starting with 12.2, many candidate fast-start failover target databases can be
specified, but switchover or FSFO works only to the current target standby database.
DGMGRL> EDIT DATABASE db_site1 SET PROPERTY FastStartFailoverTarget = 'db_site2,db_site3';
The current target depends on many conditions.
Threshold: 60 seconds
Target: db_site2
Candidate Targets: db_site2,db_site3
Observers: (*) obs1.trivadis.com
obs2.trivadis.com
[Figure: db_site1, db_site2 and db_site3, with FSFO and Switchover labels.]
Special Cases: Switchover – DelayMins >0
Version 11.2.0.X: recovery re-started with NODELAY option.
Version 12.1.0.X: recovery waits until DelayMins reached!
– OPEN_MODE
• Primary – CLOSED BY SWITCHOVER
• Standby – MOUNTED
– Application RW service outage within DelayMins time-frame! Application connect attempts fail with ORA-16456: switchover to standby in progress or completed
Version 12.2.0.1: Switchover is not possible.
Error: ORA-16672: switchover not permitted to standby database with non-zero DelayMins
Failed.
Special Cases with FSFO: Master Observer Failure
[Figure: master observer and backup observers.]
Special Cases with FSFO: Failover Target Failure
After about 10 sec. the target failover standby is changed, i.e. a candidate target failover standby is promoted to the current target role.
Permission granted to the primary database for target switch.
The primary database returned to SYNC/NOT LAGGING state with the standby database db_site3.
[Figure: observer with db_site1, db_site2 and db_site3.]
Note: to perform the target failover standby change, the primary database and the
master observer need to be available!
– If the master observer fails at the same time:
LGWR: FSFO SetState("UNSYNC", 0x2) operation requires an ack
Primary database will shutdown within 30 seconds if permission
is not granted from Observer or FSFO target standby to proceed
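The outcomes described on this slide can be summarised in a small sketch. It is an illustration only: the function name, the outcome labels, and the rule "promote the first remaining candidate in list order" are assumptions. The deck states elsewhere that the current target depends on many conditions, so the real selection logic is more involved.

```python
from typing import List, Optional, Tuple


def handle_target_failure(
    candidates: List[str],
    failed_target: str,
    primary_up: bool = True,
    master_observer_up: bool = True,
) -> Tuple[str, Optional[str]]:
    """Return (outcome, new_target) after the current target standby fails."""
    if not primary_up:
        # The target change needs the primary database to be available.
        return ("no_target_change", None)
    if not master_observer_up:
        # Nobody can grant permission: the log warns of a primary shutdown
        # within 30 seconds unless the observer or the target comes back.
        return ("primary_shutdown_pending", None)
    remaining = [c for c in candidates if c != failed_target]
    if not remaining:
        return ("no_candidate_left", None)
    # Assumption: the first remaining candidate is promoted.
    return ("target_changed", remaining[0])
```

For the configuration shown in the deck, `handle_target_failure(["db_site2", "db_site3"], "db_site2")` promotes db_site3, in line with the log excerpt that reports the primary returning to SYNC with db_site3.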
Special Cases with FSFO: Master Observer/Primary
If the primary and the master observer fail:
– No failover is initiated to a candidate standby.
– From a backup observer log file:
Ready to failover check on standby returned RFS_NON_MSTOB.
Command READY_TO_FSFO to thread S024 returned status=0
Fast-Start Failover is not possible because this observer is not the master.
[Figure: master observer and backup observers with db_site1, db_site2 and db_site3.]
Special Cases with FSFO: Private Redo Network
If the public network on the primary server fails:
– Broker configuration property: ObserverOverride=FALSE.
– No failover (HB over private network still works!).
Fast-Start Failover is not possible because primary last contacted the standby within FastStartFailoverThreshold seconds
[Figure: master observer on the public network; heartbeat (HB) between primary and standby over the private redo network.]
Master Observer, Primary/Standby DB: Location? (1)
For DR HA service protection, do not place the primary and the master observer in
the same data center!
No automatic failover!
To relocate: disable and enable FSFO.
Automatic failover!
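The placement rule on these slides can be checked with a small sketch that asks what happens when a whole data center is lost. The function name, the role keys, and the site names are hypothetical. The logic only encodes what the deck states: without a surviving master observer (or target standby) there is no automatic failover.

```python
from typing import Dict


def outcome_after_site_loss(placement: Dict[str, str], failed_site: str) -> str:
    """placement maps the roles 'primary', 'target_standby' and
    'master_observer' to the data center each one runs in."""
    lost = {role for role, site in placement.items() if site == failed_site}
    if "primary" not in lost:
        return "primary survives"
    if "master_observer" in lost or "target_standby" in lost:
        return "no automatic failover"
    return "automatic failover"
```

Placing the master observer next to the primary, for example `{"primary": "dc1", "master_observer": "dc1", "target_standby": "dc2"}`, yields "no automatic failover" when dc1 is lost. Moving the observer to dc2 or to a third site yields "automatic failover".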
Conclusions (1)
Can Data Guard be a good solution for database service high availability?
– Yes, with a fast-start failover configuration.
– However, it is not a replacement for a cluster but rather an alternative.
– Careful business requirements analysis is necessary.
Advantages:
– It offers good service high availability, in addition to excellent data high
availability and some other features.
– Fairly simple solution (setup and operation).
– Not subject to additional license fees (EE license assumed).
– Infrastructure requirements are not as high as for a cluster.
– Most client HA features can be used the same way as with a cluster.
Conclusions (2)
Disadvantages:
– Component placement is critical and requires customized monitoring scripts.
– Some technical restrictions like network latencies (SYNC), flashback database or
force logging might limit Data Guard in this area.
– Re-connect timeouts without FAN/FCF (no VIPs).
Trivadis @ DOAG 2017
#opencompany
Booth: 3rd Floor – next to the escalator