(HyperMetro)
Bill.LiangzhiQiang@huawei.com
Storage MM Director
Security Level:
Contents
1 Introduction
3 Comparison
4 Key technology
2 Huawei Confidential
Importance of Business Continuity to IT Systems
Source: Network Computing, the Meta Group and Contingency Planning Research
International Standards About DR Construction
RPO: Recovery Point Objective (amount of data lost because of downtime)
RTO: Recovery Time Objective (duration of downtime)
International Standard: Share78, UK
Tier 7: Zero data loss and automated service recovery
RPO: 0
RTO: < 15 minutes
BC&DR Solutions: Active-Active / Active-Passive / Geo-redundant 3DC / Private cloud solution
Active-Active Solution Ensures 24/7 Business Continuity
[Figure: DR level scale from 1 to 7 comparing three solutions: Local Backup Solutions (single DC), Active-Passive Solution (active-passive DCs, Site 1 and Site 2), and Active-Active Solution (active-active DCs, Site 1 and Site 2). FusionSphere appears in each.]
Definition of Active-Active Storage
Definition
An active-active storage solution consists of two storage systems that provide consistent data copies in real time. Both
copies are accessible to a host at the same time, and the failure of either copy does not affect services. The two
storage systems can be deployed at two data centers to form an active-active data center solution, together with an
active-active design of the upper-layer applications and the network layer.
End-to-End Physical Architecture of Active-Active DR Solution
[Figure: DC A and DC B, at most 100 km apart, connected by raw optical fiber.]
Network layer (DC outlet, core layer, aggregation layer, access layer)
• Active-Active network layer
• High-reliability, optimized Layer-2 interconnect
• Optimum access path
Application layer (FusionSphere at both DCs)
• Active-Active application layer
• Oracle RAC, VMware, and FusionSphere cross-DC high availability, load balancing, and migration scheduling
Storage layer
• Active-active access, zero data loss
Networking Requirement:
• RTT < 1 ms
• Bandwidth > 2 Gbps
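The networking limits on this slide (distance at most 100 km, RTT below 1 ms, bandwidth above 2 Gbps) can be checked against measured link values. The following is a minimal sketch, not a Huawei tool: the names `LinkMeasurement` and `check_link` are hypothetical, and the measurements are assumed to come from your own network monitoring.

```python
from dataclasses import dataclass
from typing import List

# Limits taken from the slide; the constant names are illustrative.
MAX_DISTANCE_KM = 100.0
MAX_RTT_MS = 1.0
MIN_BANDWIDTH_GBPS = 2.0


@dataclass
class LinkMeasurement:
    """Measured properties of the inter-DC link (hypothetical structure)."""
    distance_km: float
    rtt_ms: float
    bandwidth_gbps: float


def check_link(m: LinkMeasurement) -> List[str]:
    """Return a list of violated requirements; an empty list means the link qualifies."""
    problems = []
    if m.distance_km > MAX_DISTANCE_KM:
        problems.append("distance exceeds 100 km")
    if m.rtt_ms >= MAX_RTT_MS:
        problems.append("RTT is not below 1 ms")
    if m.bandwidth_gbps <= MIN_BANDWIDTH_GBPS:
        problems.append("bandwidth is not above 2 Gbps")
    return problems


# Example: a 60 km link with 0.8 ms RTT and 10 Gbps passes all three checks.
good_link = check_link(LinkMeasurement(distance_km=60, rtt_ms=0.8, bandwidth_gbps=10))
```

Note that the RTT and bandwidth limits are strict inequalities on the slide, so a link measuring exactly 1 ms is reported as a violation.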
Active-Active DR Solution –All Flash
High performance active-active: ~300K IOPS @ 1 ms latency (Dorado6000 V3 dual-controller)
• Active-Active replication
• Load balanced
• Gateway-free
• Improved reliability
• Simplified management
• Reduced cost
• RPO = 0, RTO ≈ 0
[Figure: Production center 1, Production center 2, and DR center.]
Active-Active DR Solution –Unified Storage
[Figure: DC A and DC B, each with production storage serving SAN and NAS to a host application cluster over IP/FC. The two storage systems are linked for real-time data mirroring and for dual-write heartbeat and configuration, and both connect over IP to a Quorum/VM.]
Working principle
Two storage systems are deployed in DC A and DC B, providing read and write services in active-active mode. Write I/O is mirrored in real time between the two storage systems to ensure data consistency. Both SAN and NAS are supported. No data is lost if either storage system fails.
Highlights
• Recommended distance: < 100 km
• Active-active, RPO = 0, RTO ≈ 0
• No gateway devices, which simplifies networks, saves costs, and eliminates gateway-caused latency
• Dual arbitration modes improve reliability
• Supports replication between different models of Huawei storage, saving investment
• Supports smooth upgrade from single-site to active-active and from active-active to geo-redundant solution without service interruption
• Supports IP or FC links for intra-city interconnection and IP networks for arbitration links
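The working principle on this slide (write I/O mirrored in real time, either copy readable, no data loss if one storage system fails) can be illustrated with a small model. This is a sketch under simplifying assumptions, not HyperMetro's implementation: `StorageArray` and `mirrored_write` are hypothetical names, the arrays are in-memory dictionaries, and failure handling during a write (which the real product resolves through arbitration) is left out.

```python
class StorageArray:
    """In-memory stand-in for one storage system (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

    def write(self, lba, data):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        self.blocks[lba] = data

    def read(self, lba):
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return self.blocks[lba]


def mirrored_write(local, remote, lba, data):
    """Write to both arrays and acknowledge the host only after both succeed."""
    local.write(lba, data)
    remote.write(lba, data)
    return "ack"


dc_a = StorageArray("DC A")
dc_b = StorageArray("DC B")
status = mirrored_write(dc_a, dc_b, lba=42, data=b"payload")

# Simulate the loss of DC A: the acknowledged write is still readable from DC B.
dc_a.online = False
surviving_copy = dc_b.read(42)
```

Because the host sees the acknowledgement only after both copies are written, any acknowledged write survives the failure of one site, which is what RPO = 0 means here.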
High Availability Design of HyperMetro
• Gateway-free: No extra gateway device is needed, which reduces points of failure and networking complexity and provides higher reliability.
• Cross-site bad block repair: When bad blocks cannot be repaired within the array, they are repaired automatically by reading the data from the remote storage system. Service access is not affected.
• Dual arbitration: Quorum server mode and static priority mode are both provided, with automatic switchover between the two. If the quorum device is faulty, static priority mode ensures service continuity.
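The cross-site bad block repair described on this slide (read the data from the remote storage system when a local bad block cannot be repaired, without affecting service) can be sketched as a read path with a fallback. The code below is an illustration under assumptions: `Array`, `BadBlockError`, and `read_with_remote_repair` are hypothetical names, and a bad block is modeled simply as an address in a set.

```python
class BadBlockError(Exception):
    """Raised when a local block cannot be read (illustrative)."""


class Array:
    """In-memory stand-in for one storage system."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.bad = set()  # addresses of blocks that fail local reads

    def read_local(self, lba):
        if lba in self.bad:
            raise BadBlockError(lba)
        return self.blocks[lba]

    def rewrite(self, lba, data):
        """Overwrite the block with good data and clear its bad-block mark."""
        self.blocks[lba] = data
        self.bad.discard(lba)


def read_with_remote_repair(local, remote, lba):
    """Serve the read locally; on a bad block, fetch the remote copy and repair."""
    try:
        return local.read_local(lba)
    except BadBlockError:
        data = remote.read_local(lba)
        local.rewrite(lba, data)
        return data


local = Array({7: b"good"})
remote = Array({7: b"good"})
local.bad.add(7)  # simulate an unrepairable local bad block
result = read_with_remote_repair(local, remote, 7)
```

The host still receives its data from the single read call, and later reads of the same address are served locally again.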
High Performance Design of HyperMetro
• Gateway-free design: Avoids the gateway bottleneck and shortens the I/O path, reducing latency by 1 to 1.5 ms.
• FastWrite: Write Command and Data Transfer are combined into one transmission. The latency of cross-site write I/O interactions is reduced by half.
• Optimistic lock: More than 99% of host I/Os have no concurrent write conflicts, so optimistic locks are taken locally, reducing inter-array interaction.
High Flexibility Design of HyperMetro
• Smooth upgrade: Supports upgrade from single-site to active-active and from active-active to geo-redundant solution without service interruption.
• IP or FC replication link: The replication link can use either an IP or an FC network. Only one type of network is needed, so no complicated network design is required.
• Automatic self-healing: Supports automatic self-healing after failure recovery, reducing manual operations.
• Online maintenance: Supports online version upgrade and online capacity expansion of active-active LUNs.
• Single-site configuration: Creation and operation are configured at a single site.
“Never-down” Data Solution Deployed by Yahoo Japan
[Figure: Production Center 1 and Production Center 2, each with OceanStor V5, in an active-active configuration. Applications shown: Flight mgmt., Baggage processing, Stand allocation, Asset mgmt., Geographic information, CRM, SQL.]
Huawei's all-flash active-active storage solution
Active-Active Solutions in the Industry
[Figure: three active-active architectures built from storage controllers, one of them with gateways in front of the controllers.]
• Key points: non-gateway / device isolation / loose coupling
• Key points: gateway / data-level mirroring / tight coupling
• Key points: non-gateway / data-level mirroring / tight coupling
FastWrite — Higher Dual-Write Performance
[Figure: write I/O exchanged between two storage systems 100 km apart over FC/IP, for the common solution and for FastWrite. Labels shown include 1. Write Command, 2. Ready, RTT-2, and 8. Status Good.]
Common solution: A write I/O involves two interactions between the two storage systems, namely Write Command and Data Transfer. A write over a 100 km link therefore takes two RTTs.
FastWrite: The protocol is optimized to combine Write Command and Data Transfer into one transmission. The number of cross-site write I/O interactions is reduced by half, so a write over a 100 km link takes only one RTT.
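The effect of halving the round trips can be estimated with simple arithmetic. The sketch below assumes a one-way propagation delay of roughly 5 microseconds per kilometer of optical fiber (a common approximation, not a figure from the slides) and ignores switching and processing time; the function names are illustrative.

```python
# Approximate one-way propagation delay in optical fiber (assumption).
FIBER_DELAY_US_PER_KM = 5.0


def rtt_ms(distance_km):
    """Round-trip propagation time over the fiber, in milliseconds."""
    return 2 * distance_km * FIBER_DELAY_US_PER_KM / 1000.0


def cross_site_write_ms(distance_km, round_trips):
    """Propagation time spent on cross-site interactions for one write."""
    return round_trips * rtt_ms(distance_km)


# Common solution: Write Command and Data Transfer are separate round trips.
common_ms = cross_site_write_ms(100, round_trips=2)

# FastWrite: both are combined into a single round trip.
fastwrite_ms = cross_site_write_ms(100, round_trips=1)
```

Under these assumptions a 100 km link costs about 2 ms of propagation time per write with the common protocol and about 1 ms with FastWrite.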
Optimistic Lock Optimization (Write Process)
Write process with distributed lock: Latency = t1 + t2 + t3
Write process with optimistic lock: Latency = t1 + t3
[Figure: in both processes a host cluster issues Write IO to the storage systems; t3 is marked in each diagram.]
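The two latency formulas can be compared numerically. The slide does not define t1, t2, and t3, so the sketch below treats them as opaque step times in milliseconds with made-up values. It also assumes, purely for illustration, that a write that does hit a conflict pays the full distributed-lock latency; the conflict rate of 1% follows the "more than 99% of host I/Os have no concurrent write conflicts" statement made elsewhere in the deck.

```python
def distributed_lock_latency(t1, t2, t3):
    """Latency of the write process with a distributed lock."""
    return t1 + t2 + t3


def optimistic_lock_latency(t1, t3):
    """Latency of the write process with an optimistic lock (no conflict)."""
    return t1 + t3


def expected_optimistic_latency(t1, t2, t3, conflict_rate):
    """Average latency, assuming conflicting writes fall back to the slow path."""
    fast = optimistic_lock_latency(t1, t3)
    slow = distributed_lock_latency(t1, t2, t3)
    return (1 - conflict_rate) * fast + conflict_rate * slow


# Hypothetical step times in milliseconds.
t1, t2, t3 = 0.2, 1.0, 1.0

always_locked = distributed_lock_latency(t1, t2, t3)
no_conflict = optimistic_lock_latency(t1, t3)
average = expected_optimistic_latency(t1, t2, t3, conflict_rate=0.01)
```

With these example values the optimistic path saves the whole t2 term on conflict-free writes, and the average stays close to t1 + t3 because conflicts are rare.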
HyperMetro Arbitration Design
Arbitration Design
[Figure: Storage A and Storage B form a storage resource pool, with one site marked as the preferred site. Both connect over IP to a Primary Quorum Device and a Secondary Quorum Device at a third-place site.]
• The quorum device is deployed at a third-place site, in a different fault domain from the two active-active DCs.
• Two quorum servers are supported to avoid a single point of failure.
Note: The two quorum servers work in active/standby mode. Only one quorum server is in effect at a time.
• Deploy the quorum device at the preferred site.
• Use static priority mode when no quorum device is available.
Note: If the preferred site fails, services will be interrupted.
• Quorum device: A physical server or a virtual server is supported, and two quorum servers can be deployed.
• Quorum link: IP addresses must be reachable.
• Arbitration mode: Both quorum server mode and static priority mode are offered.
• Arbitration granularity: Arbitration is performed per LUN pair or per consistency group.
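The two arbitration modes can be illustrated with a small decision function. This is a simplified sketch, not Huawei's arbitration algorithm: the names `Site` and `arbitrate` are hypothetical, the function handles one LUN pair or consistency group at a time, and the tie-break in quorum server mode (the preferred site wins when both sites reach the quorum) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str
    alive: bool
    reaches_quorum: bool


def arbitrate(a: Site, b: Site, link_up: bool,
              quorum_available: bool, preferred: str) -> List[str]:
    """Return the names of the sites that keep serving I/O."""
    # Healthy state: both sites serve I/O in active-active mode.
    if a.alive and b.alive and link_up:
        return [a.name, b.name]

    if quorum_available:
        # Quorum server mode: only a live site that reaches the quorum may win.
        candidates = [s for s in (a, b) if s.alive and s.reaches_quorum]
        if not candidates:
            return []
        for s in candidates:
            if s.name == preferred:
                return [s.name]
        return [candidates[0].name]

    # Static priority mode: only the preferred site may continue.
    pref = a if a.name == preferred else b
    return [pref.name] if pref.alive else []


site_a = Site("A", alive=True, reaches_quorum=True)
site_b = Site("B", alive=True, reaches_quorum=True)
dead_a = Site("A", alive=False, reaches_quorum=False)

healthy = arbitrate(site_a, site_b, link_up=True, quorum_available=True, preferred="A")
split = arbitrate(site_a, site_b, link_up=False, quorum_available=True, preferred="A")
failover = arbitrate(dead_a, site_b, link_up=False, quorum_available=True, preferred="A")
static_loss = arbitrate(dead_a, site_b, link_up=False, quorum_available=False, preferred="A")
```

The last case mirrors the note above: in static priority mode, losing the preferred site leaves no site serving I/O, whereas in quorum server mode the surviving site takes over.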
Cross-Site Bad Block Repair
Working principle
Summary
Active Active DR
Thank you.
Bring digital to every person, home, and organization for a fully connected, intelligent world.