Safe Harbor Statement: 1 © 2019 Oracle

Safe harbor statement
The following is intended to outline our general product direction. It is intended for information
purposes only, and may not be incorporated into any contract. It is not a commitment to deliver
any material, code, or functionality, and should not be relied upon in making purchasing
decisions.
The development, release, timing, and pricing of any features or functionality described for
Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
1 © 2019 Oracle
High Availability and Disaster Recovery
Level 300
Flavio Pereira
Changbin Gong
Oracle Cloud Infrastructure
October 2019
Objectives
After completing this lesson, you should be able to:

• Describe High Availability and Disaster Recovery
• How to Leverage OCI for HA and DR
• HA and DR features for OCI
• High Availability and disaster Recover scenarios
High Availability
High Availability Concepts
• Computing environments configured to provide
nearly full-time availability are known as high
availability systems
• Such systems typically have redundant hardware

and software that makes the system available
despite failures.
• Well-designed high availability systems avoid

having single points-of-failure
• When failures occur, the failover process moves

processing performed by the failed component to
the backup component
• The more transparent that failover is to users, the

higher the availability of the system.
Availability Domains
• Availability domains are isolated from each other, fault tolerant, and very unlikely to fail simultaneously.
• Because availability domains do not share physical infrastructure, such as power or cooling, or the
internal availability domain network, a failure that impacts one availability domain is unlikely to impact the
availability of the others.
ORACLE CLOUD INFRASTRUCTURE (REGION)
Availability Domain 1 Availability Domain 2 Availability Domain 3
Subnet A Regional Subnet C
Regional Subnet B
Fault Domains
• Fault Domains (FD) enable you to distribute your instances so that they are not on the same physical
hardware within a single AD. Each AD will have 3 FDs.
• Fault domains provide high availability for application resources within an availability domain by protecting
against unexpected hardware failures and maintenance updates on the compute hardware.
Availability Domain 1 Availability Domain 2 Availability Domain

3
Subnet A Subnet B
FD01 FD02 FD01 FD02
FD03 FD03
Avoid single points-of-failure
One of the key principles of designing high availability solutions is to avoid single point of failure. We
recommend designing your architecture to deploy instances that perform the same tasks in different fault
domains for one AD regions and if possible, in different availability domains for multiple AD regions. This
design removes a single point of failure by introducing redundancy.
ORACLE CLOUD INFRASTRUCTURE (ONE AD REGION) ORACLE CLOUD INFRASTRUCTURE (MULTIPLE AD REGION)
Availability Domain Availability Domain 1 Availability Domain Availability Domain 3
2
Subnet A Subnet C Regional Subnet A Subnet C

FD01 FD03 FD03 FD01
Subnet B Regional Subnet B
FD01 FD02 FD02 FD03 FD01

Regional and AD Specific Subnets
• Each subnet in a VCN exists in a single availability domain (AD Specific Subnets) or in multiple availability
domains (Regional Subnets) and consists of a contiguous range of IP addresses that do not overlap with
other subnets in the cloud network.
• You can not change the size of the subnet after it is created, so it's important to think about the size you
need before creating subnets
ORACLE CLOUD DATA CENTER REGION ORACLE CLOUD DATA CENTER REGION
AVAILABILITY DOMAIN-1 AVAILABILITY DOMAIN-2 AVAILABILITY DOMAIN-1 AVAILABILITY DOMAIN-2
SUBNET A, SUBNET B,
10.0.1.0/24 10.0.2.0/24 SUBNET A, 10.0.1.0/24
VCN, 10.0.0.0/16 VCN, 10.0.0.0/16

Load Balancer
VCN AVAILABILITY DOMAIN-1 AVAILABILITY DOMAIN-2
• Load Balancer: Load Balancing service

Public IP address
improves resource utilization, facilitates
Listener scaling, and helps ensure high availability.
It supports routing incoming requests to
Load Balancer Load Balancer Pair Load Balancer various backend sets based on virtual
(Active) (Failover)
REGIONAL SUBNET 1 hostname, path route rules, or
combination of both. (Public and Private
LB)
• NOTE: Private and Public Load Balancer is

Backend Set only Highly-available within an AD for
Backend Servers Backend Servers
single AD regions.
REGIONAL SUBNET 2
Virtual IP
AD-1 AD-2
Regional Subnet • Virtual IP: A Compute instance can be

10.0.1.0/24
assigned a secondary private IP
address. If the VM1 has problems, the
VIP-2
VIP-2
IP-1
IP-1
primary primary Virtual IP (VIP-2) will be reassign to VM2

VNIC1 VNIC1
instance in the same subnet to achieve
instance failover.
primary primary
Heartbeat
Communication
VM1 VM2
Compute
Depending on your system or application requirements, you can implement this architecture redundancy in
either standby or active mode:
• Standby mode: a secondary or standby component runs side-by-side with the primary component.
When the primary component fails, the standby component takes over. Standby mode is typically used
for applications that need to maintain their states.
Availability Domain 1 Availability Domain Availability Domain 3

2
Subnet A
Active Standby
Compute
• Active/Active mode: all components are actively participating in performing the same tasks. When one
of the components fails, the related tasks are simply distributed to another component. Active mode is
typically used for stateless applications.

2
Subnet A
Active Active
Compute – Auto Scaling
• Avoid single point of failure
• Enables automatic adjustments for the number of Compute instances in an instance pool based on
performance metrics. For instance,
• CPU utilization
• Memory utilization
• Recommend to attaching a load balancer to the instance pool which has autoscaling configured
Instance Pool before scale Instance Pool after scale
Scaling Rule
Minimum Size
If CPU or Memory > 70% add 2 Instances Initial Size
If CPU or Memory < 70% remove 2 instances
Initial Size Maximum Size
High Availability for OCI – Connectivity
Highly available, fault-tolerant network connections are key to a well-architected system. You can choose to
implement IPSec VPN connections to connect your data center to OCI or FastConnect which provides
higher-bandwidth options and a more reliable and consistent networking experience compared to internet-
based connections:
• IPSec VPN: DRG has multiple VPN endpoints so that each IPSec VPN connection consists of multiple
redundant IPSec tunnels that use static routes to route traffic. To ensure high availability, you must set up
VPN connection availability within your internal network to use either path when needed.
• FastConnect: You can either connect directly to OCI routers in provider points-of-presence (POPs) or use
one of Oracle’s many partners to connect from POPs around the world to their OCI Networking resources.
Oracle provides features that allow you to build fault-tolerant connections, including multiple POPs per
region and multiple FastConnect routers per POP.
IPsec VPN Redundancy Models (Multiple CPE)
Availability Domain 1
Subnet A
10.0.30.0/24
CPE
Transit Availability Domain 2
Subnet B
POP 10.0.40.0/24
CPE
Availability Domain 3
Subnet C
10.0.50.0/24
Transit
POP
Redundant FastConnect
Public Internet IPsec VPN CONNECTION
VIRTUAL CIRCUIT #1 PRIVATE SUBNET

EDGE EDGE FASTCONNECT LOCATION
10.2.2.0/24
1 AVAILABILITY DOMAIN 1
PROVIDER
CUSTOMER CPE DRG
NETWORK
NETWORK
10.0.0.0/16 VIRTUAL CIRCUIT #2
EDGE EDGE
FASTCONNECT LOCATION
2
DST IP: 0.0.0.0/0 PRIVATE SUBNET

Public Internet 10.2.3.0/24
IGW AVAILABILITY DOMAIN 2
VCN
REGION
Storage
• Object Storage: Object Storage was designed to be highly durable. Multiple copies of the data are stored
across servers in the availability domains. Data integrity is actively monitored using checksums. Corrupt
data is auto detected and auto healed from redundant copies. Any loss of data redundancy is actively
managed by recreating a copy of the data
Availability Domain 1 Availability Domain 2 Availability Domain 3
Storage Server Storage Server Storage Server

Storage
• Block Volume: policy-based backups to perform automatic, scheduled backups and retain them based
on a backup policy. You can restore backup across availability domains

2
Subnet A Subnet B
Server Server
Block
Block
Storage
Storage
(Backup)
(Restore)
Storage
• File Storage: Durable, scalable, enterprise-grade network file system Ideal for Enterprise applications
that need shared files (NAS)

2
Subnet A
Server Rsync Server Server
File File
Storage Storage
Disaster Recovery
Disaster Recovery Terminology
• Disaster recovery (DR) involves a set of policies,

tools and procedures to enable the recovery or
continuation of vital technology infrastructure and
systems
• Disaster recovery should indicate the key metrics

of recovery point objective (RPO) and recovery
time objective (RTO)
• In many cases, an organization may elect to use

an outsourced disaster recovery provider to
provide a stand-by site and systems rather than
using their own remote facilities
Disaster Recovery RTO and RPO
RPO RTO
Disaster
Transactions Lost
Down Time
Disaster Recovery Options
Active/Active
Replicate data and

services into OCI
Standby ready to take over.
Replicate data with

< 2 hours 15 min
Backup and Restore minimum running
services in OCI.
Backup of on-premises
data to OCI to use in a
DR event
12 hours 4 hours
$$$
24 hours 24 hours
$$
RPO
$
RTO
$ Cost
Disaster Recovery for OCI
• Regions are completely independent of other regions and can be separated by vast distances—across
countries or even continents.
• You can also deploy applications in different regions to:

- mitigate the risk of region-wide events, such as large weather systems or earthquakes
- meet varying requirements for legal jurisdictions, tax domains, and other business or social criteria
ORACLE CLOUD INFRASTRUCTURE (REGION 1) ORACLE CLOUD INFRASTRUCTURE (REGION 2)
AD1 AD2 AD3 AD1 AD2 AD3

Disaster Recovery using multiple regions
• You can connect Regions using Remote VCN Peering.
• Using internal backbone, traffic never leaves Oracle Network.
ORACLE CLOUD INFRASTRUCTURE (REGION 1) ORACLE CLOUD INFRASTRUCTURE (REGION 2)
AD1 AD2 AD3 AD1 AD2 AD3

Disaster Recovery using multiple regions
• Cross region Block volume backup copy:

• By copying block volume backups to
another region at regular intervals, it makes
it easier to rebuild applications and data in
the destination region if a region-wide
disaster occurs in the source region.
• Migration and expansion:
• To easily migrate and expand your
applications to another region.
DNS traffic management
• STEERING POLICIES
• A framework to define the traffic management behavior for your zones. Steering policies contain
rules that help to intelligently serve DNS answers.
• ATTACHMENTS
• Allows you to link a steering policy to your zones. An attachment of a steering policy to a zone
occludes all records at its domain that are of a covered record type, constructing DNS responses
from its steering policy rather than from those domain's records. A domain can have at most one
attachment covering any given record type.
• RULES
• The guidelines steering policies use to filter answers based on the properties of a DNS request, such
as the requests geo-location or the health of your endpoints.
• ANSWERS
• Answers contain the DNS record data and metadata to be processed in a steering policy.
Failover
A -> B Failover
Primary asset is monitored

Outage from multiple points via
Oracle Health Checks
Traffic is automatically
Primary Cloud directed to a different
endpoint as soon as service
fails to respond
User
Recursive OCI DNS Monitoring is powered by
Server Oracle Health Checks
Available
Redundant Cloud
Backup and Restore Architecture
ON-PREMISES ORACLE CLOUD INFRASTRUCTURE (REGION)
Web Server Buckets

Web Server
Object
Storage
Buckets
Back Up/
Restore
System
AD1 AD2 AD3
Database
NFS
Storage
SAN
Gateway
Standby Architecture
DNS
AD1 AD2 AD3
Web Server Web Server

Virtual Cloud
Network
Web Servers
Buckets
Object
Storage
Database Buckets
Database VPN
Block
Storage
SAN
Storage
Gateway
Active/Active Architecture
DNS
AD1 AD2 AD3
Web Server Web Server

Virtual Cloud
Network
Load File Storage
Balancer
VPN
Buckets
Object
Web Server Web Server Storage
Buckets
Database FastConnect
Database Database
SAN Block Block

Storage
Storage Storage
Gateway
Database Strategies for DR
• Active Data Guard

• Provides data protection and availability for Oracle Database in a simple and economical
manner by maintaining an exact physical replica of the production copy at a remote
location that is open read-only while replication is active.
• GoldenGate
• Enables advanced logical replication that supports multi-master replication, hub and
spoke deployment, and data transformation.
• Provides customers flexible options to address the complete range of replication
requirements, including heterogeneous hardware platforms.
Oracle Cloud always free tier:
oracle.com/cloud/free/
OCI training and certification:

oracle.com/cloud/iaas/training
oracle.com/cloud/iaas/training/certification
education.oracle.com/oracle-certification-path/pFamily_647
OCI hands-on labs:

ocitraining.qloudable.com/provider/oracle
Oracle learning library videos on YouTube:

youtube.com/user/OracleLearning

Safe Harbor Statement: 1 © 2019 Oracle

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Safe Harbor Statement: 1 © 2019 Oracle

Uploaded by

Copyright:

Available Formats

Safe harbor statement

After completing this lesson, you should be able to:

• Such systems typically have redundant hardware

• Well-designed high availability systems avoid

• When failures occur, the failover process moves

• The more transparent that failover is to users, the

ORACLE CLOUD INFRASTRUCTURE (REGION)

Availability Domain 1 Availability Domain 2 Availability Domain 3

Subnet A Regional Subnet C

ORACLE CLOUD INFRASTRUCTURE (REGION)

Availability Domain 1 Availability Domain 2 Availability Domain

FD01 FD02 FD01 FD02

Subnet A Subnet C Regional Subnet A Subnet C

Subnet B Regional Subnet B

FD01 FD02 FD02 FD03 FD01

AVAILABILITY DOMAIN-1 AVAILABILITY DOMAIN-2 AVAILABILITY DOMAIN-1 AVAILABILITY DOMAIN-2

VCN, 10.0.0.0/16 VCN, 10.0.0.0/16

• Load Balancer: Load Balancing service

• NOTE: Private and Public Load Balancer is

Regional Subnet • Virtual IP: A Compute instance can be

primary primary Virtual IP (VIP-2) will be reassign to VM2

ORACLE CLOUD INFRASTRUCTURE (REGION)

Availability Domain 1 Availability Domain Availability Domain 3

ORACLE CLOUD INFRASTRUCTURE (REGION)

Availability Domain 1 Availability Domain Availability Domain 3

Instance Pool before scale Instance Pool after scale

Public Internet IPsec VPN CONNECTION

VIRTUAL CIRCUIT #1 PRIVATE SUBNET

DST IP: 0.0.0.0/0 PRIVATE SUBNET

ORACLE CLOUD INFRASTRUCTURE (REGION)

Availability Domain 1 Availability Domain 2 Availability Domain 3

Storage Server Storage Server Storage Server

ORACLE CLOUD INFRASTRUCTURE (REGION)

Availability Domain 1 Availability Domain Availability Domain 3

ORACLE CLOUD INFRASTRUCTURE (REGION)

Availability Domain 1 Availability Domain Availability Domain 3

Server Rsync Server Server

• Disaster recovery (DR) involves a set of policies,

• Disaster recovery should indicate the key metrics

• In many cases, an organization may elect to use

Replicate data and

Replicate data with

• You can also deploy applications in different regions to:

ORACLE CLOUD INFRASTRUCTURE (REGION 1) ORACLE CLOUD INFRASTRUCTURE (REGION 2)

AD1 AD2 AD3 AD1 AD2 AD3

ORACLE CLOUD INFRASTRUCTURE (REGION 1) ORACLE CLOUD INFRASTRUCTURE (REGION 2)

AD1 AD2 AD3 AD1 AD2 AD3

• Cross region Block volume backup copy:

Primary asset is monitored

ON-PREMISES ORACLE CLOUD INFRASTRUCTURE (REGION)

Web Server Buckets

AD1 AD2 AD3

Web Server Web Server

AD1 AD2 AD3

Web Server Web Server

SAN Block Block

• Active Data Guard

OCI training and certification:

OCI hands-on labs:

Oracle learning library videos on YouTube:

You might also like