
OceanStor Dorado 5000 V6

HANDS-ON / KNOWLEDGE TRANSFER


OceanStor Dorado V6
Storage Systems
Introduction
Challenges to Traditional Storage
Databases
1. It is hard to achieve satisfactory performance. Performance is improved only by piling up storage devices and disks.
2. The long latency of traditional storage results in high host CPU consumption. Therefore, multiple databases are required, pushing up license fees and maintenance costs.
3. Copying data in databases for development and tests is inefficient and usually requires adjusting the time window.

VDI
1. A single storage system usually supports fewer than 100 desktops due to limited performance and capacity.
2. The system is prone to boot, login, and antivirus storms.
3. It takes a long time to deploy desktops.
4. Desktop applications are slow to respond.

To cope with these challenges, storage vendors have launched various all-flash storage products.
Product Positioning

3x improvement in application performance

99.9999% availability

75% OPEX reduction

OceanStor Dorado V6: lightning fast, rock solid
Specifications
Models: OceanStor Dorado 3000 V6 | Dorado 5000 V6 | Dorado 6000 V6 | Dorado 8000 V6 | Dorado 18000 V6

Maximum number of controllers: 16* / 32*
Maximum cache (dual controllers, expanding with the number of controllers): 192 GB–1536 GB / 256 GB–4 TB / 1 TB–8 TB / 512 GB–16 TB / 512 GB–32 TB
Supported interface protocols: FC and iSCSI
Front-end port types: 8/16/32 Gbit/s FC/FC-NVMe, 10/25/40/100 Gbit/s Ethernet
Back-end port types: SAS 3.0; NVMe over Fabric and SAS 3.0
Maximum number of hot-swappable I/O modules per controller: 6 / 12 / 12 / 28 / 28
Maximum number of front-end ports per controller enclosure: 40 / 48 / 56 / 104 / 104
Maximum number of SSDs: 1200 / 1600 / 2400 / 3200 / 6400
Supported SSDs: 960 GB/1.92 TB/3.84 TB/7.68 TB SAS SSDs; 960 GB/1.92 TB/3.84 TB/7.68 TB/15.36 TB/30.72 TB SAS SSDs; 1.92 TB/3.84 TB/7.68 TB/15.36 TB palm NVMe SSDs

Values separated by slashes are listed in model order (3000 / 5000 / 6000 / 8000 / 18000 V6); where fewer values are given, they span several models.
Application Scenario — Databases
(Diagram: Oracle, SQL Server, and DB2 database servers connected to an OceanStor Dorado V6 storage system.)

Customer Benefits

1. Provides high performance for OLTP databases (I/O size: 8 KB, read/write ratio: 7:3, 280,000 IOPS at 1 ms latency). With inline compression enabled, it delivers a compression ratio of nearly 2:1.
2. Delivers stable performance at a latency shorter than 1 ms, meeting the performance SLA.
3. Creates database copies quickly to meet development and test requirements, without impacting performance.
Application Scenario — VDI
Customer Benefits

1. Supports 2000 virtual desktops, and a maximum of 5000 desktops with inline deduplication and compression enabled.
2. Prevents boot and login storms. With 500 users, the total boot time is less than 6.5 minutes and the boot time per user is shortened to seconds.
3. Substantially reduces the time required for deploying desktops. You only need to install or upgrade applications on one VM and then clone the operations to other VMs.
4. The average response time is 0.5s when View Planner is used to simulate user operations on 1000 full-clone or linked-clone desktops.
Typical Network
⚫ Multi-link dual-switch network
Physical Architecture of the Controller Enclosure (Dorado 6000 V6)

No. | Name
1 | Subrack
2 | BBU
3 | Controller
4 | Power module
5 | Management module
6 | Interface module
Modules in the Controller Enclosure of Dorado 5000 V6 and Dorado 3000 V6 (SAS)

2.5-inch disk
⚫ 12 Gbit/s SAS SSD
⚫ 960 GB/1.92 TB/3.84 TB/7.68 TB/15.36 TB/30.72 TB SSD
Note: 900 GB/1.8 TB/3.6 TB SSDs are only used as spare parts or for expansion.

Interface module
⚫ Two interface module slots per controller
⚫ Hot swappable
⚫ Types: 12 Gbit/s SAS, SmartIO (8/16/32 Gbit/s FC, 10GE, 25GE, 10 Gbit/s FCoE), 40GE, 100GE, 56 Gbit/s InfiniBand, GE, 10GE (electrical)
⚫ Up to 175 SSDs per engine (with 6 x 25 SAS SSD disk enclosures, single-uplink networking)

Power-BBU-fan integrated module
⚫ 1+1 redundancy
⚫ Up to 94% power conversion efficiency
⚫ 100 V to 240 V AC, -48 V/-60 V DC, and 240 V high-voltage DC

Onboard ports
⚫ Serial port, maintenance port, and management network port
⚫ Onboard interfaces: SmartIO (8/16 Gbit/s FC, 10GE, 10 Gbit/s FCoE)
networking)
Physical Architecture of SAS and
NVMe Disk Enclosures for Dorado V6

No. | Name
1 | Subrack
2 | Disk module
3 | Power module
4 | Expansion module

2 U SAS Disk Enclosure Architecture (25 Slots)

2.5-inch disk
⚫ 12 Gbit/s SAS SSD
⚫ 960 GB/1.92 TB/3.84 TB/7.68 TB/15.36 TB/30.72 TB SSD
Note: 600 GB/900 GB/1.8 TB/3.6 TB SSDs are only used as spare parts or for expansion.

Expansion module
⚫ Dual expansion modules
⚫ Two 12 Gbit/s SAS ports

600 W power module
⚫ 1+1 redundancy
⚫ Built-in fan modules (1+1)
⚫ 100 V to 240 V AC, -48 V/-60 V DC, and 240 V high-voltage DC
• In new systems, disks with N*960 GB (N = 1, 2, 4, 8, 16) capacity specifications are used.
• N*900 GB SAS SSDs and N*1 TB (N =1, 2, 4) NVMe SSDs are only used for capacity expansion of
systems earlier than C30.
• 30.72 TB SSDs are supported after 2019-01-30.
Dorado 5000/6000 V6 (SAS) Scale-up
Dorado 5000/6000 V6 (SAS) Four-Controller Scale-up
SmartIO Interface Module
➢ Provides four 8/16/32 Gbit/s FC, 25GE, or 10GE ports.
➢ This new module (Hi1822) has the following enhancements compared with the old one (Hi1821):
✓ Supports FastWrite in FC mode and TOE.
✓ Increases the port rate to 32 Gbit/s FC and 25GE.

Note:
1. Physical form: The module uses the Hi1822 chip. It has two different structures (back-end X8 and back-end X16), which require different SBOMs.
2. The X8 module is only used on Dorado 5000 V6 (NVMe).
3. The X16 module is used on Dorado 3000 V6, Dorado 5000 V6 Enhanced Edition, Dorado 6000 V6 Enhanced Edition, and Dorado 18000 V6.
* Neither module can be used on Dorado C30 or earlier. The new modules are designed to prevent incorrect insertion.

No. | Name
1 | Power indicator/Hot Swap button
2 | 8 Gbit/s, 16 Gbit/s, or 32 Gbit/s Fibre Channel, 10GE, or 25GE ports
3 | Port Link/Active/Mode indicator
4 | Port mode silkscreen
5 | Module handle

Power indicator/Hot Swap button
• Steady green: The interface module is running properly.
• Blinking green: The interface module has received a hot swap request.
• Steady red: The interface module is faulty.
• Off: The interface module is powered off.

Port Link/Active/Mode indicator
• Blinking blue slowly (twice per second): The port is working in FC mode and is not connected.
• Blinking blue quickly (10 times per second): The port is working in FC mode and is transmitting data.
• Steady blue: The port is working in FC mode and is connected, but is not transmitting data.
• Blinking green slowly (twice per second): The port is working in ETH mode and is not connected.
• Blinking green quickly (10 times per second): The port is working in ETH mode and is transmitting data.
• Steady green: The port is working in ETH mode and is connected, but is not transmitting data.

* A new-generation SmartIO interface module using the Hi1822 chip.
* The figure shows a SmartIO interface module with 8 Gbit/s FC ports. The silkscreen varies with the port rate and protocol.
40GE/100GE Interface Module
➢ Provides two 100GE/40GE ports.
➢ Supports TOE.

Note:
1. Physical form: This module uses the Hi1822 chip. It has two different structures (back-end X8 and back-end X16), which require different SBOMs.
2. The X8 module is only used on Dorado 5000 V6 (NVMe).
3. The X16 module is used on Dorado 3000 V6, Dorado 5000 V6 Enhanced Edition, Dorado 6000 V6 Enhanced Edition, and Dorado 18000 V6.
* Neither module can be used on Dorado C30 or earlier. The new modules are designed to prevent incorrect insertion.

No. | Name
1 | Power indicator/Hot Swap button
2 | 2 x ETH ports
3 | Link/Active indicator
4 | Module handle/silkscreen

Power indicator
• Steady green: The interface module is running properly.
• Blinking green: The interface module has received a hot swap request.
• Steady red: The interface module is faulty.
• Off: The interface module is powered off.

Link/Active indicator
• Steady on: The port is properly connected to an application server but is not transmitting data.
• Blinking: The port is transmitting data.
• Off: The port is not connected.

* The figure shows a 100GE module. The silkscreen varies with the port rate and protocol.
PCIe Scale-Out Interface Module
1 Power indicator/Hot Swap button
2 Link/Speed indicator of a PCIe port
3 PCIe port
4 Module handle

Power indicator/Hot Swap button
• Steady green: The interface module is working correctly.
• Blinking green: The interface module has received a hot swap request.
• Steady red: The module is faulty.
• Off: The interface module is powered off or hot swappable.

Link/Speed indicator of a PCIe port
• Steady blue: The data transfer rate between the PCIe port and the data switch is 8 Gbit/s.
• Steady green: The data transfer rate between the PCIe port and the data switch is 2.5 Gbit/s or 5 Gbit/s.
• Steady red: The port is faulty.
• Off: The link to the port is down.
PCIe Scale-up Card
PCIe scale-up cables are mini SAS HD x 8 cables (mini SAS HD x 4 on both ends) and provide 64 Gbit/s of bandwidth.

1 | Power indicator/Hot Swap button
2 | PCIe ports (two in a group)
3 | Link/Speed indicator of a PCIe port
4 | Module handle

Power indicator/Hot Swap button
• Steady green: The interface module is working correctly.
• Blinking green: The interface module has received a hot swap request.
• Steady red: The module is faulty.
• Off: The interface module is powered off or hot swappable.

Link/Speed indicator of a PCIe port
• Steady blue: The data transfer rate between the PCIe port and the data switch is 8 Gbit/s.
• Steady green: The data transfer rate between the PCIe port and the data switch is 2.5 Gbit/s or 5 Gbit/s.
• Steady red: The port is faulty.
• Off: The link of the port is down.
Cables of Dorado V6

⚫ Ground cables
⚫ AC power cables
⚫ DC power cables
⚫ PDU power cables
⚫ SAS electrical cables
⚫ PCIe electrical cables
⚫ Network cables
⚫ Optical fiber cables
⚫ Serial cables
⚫ FDR cables
Software Architecture
Fully balanced active-active architecture
FlashLink: RAID-TP Tolerates
Simultaneous Failure of Three Disks
(Figure: conventional RAID tolerates simultaneous failure of two disks, while RAID-TP tolerates simultaneous failure of three disks, doubling reliability.)

RAID-TP is recommended when the capacity of SSDs is greater than or equal to 8 TB, because this improves system reliability.
FlashLink: Global Garbage Collection
(Figure: invalid data accumulates in a CKG; valid data is moved to an idle CKG and the original CKG is released.)

1. New data is written to new locations, and the original data is marked invalid.
2. After the amount of garbage reaches the threshold, valid data is migrated to a new stripe.
3. The original CKG is released.

FlashLink: Hot/Cold Data
Partitioning + I/O Priority
Hot/cold partitions and I/O priority adjustment

I/O type | Priority without adjustment | Priority with adjustment
Data read/write | 1 | 1
Advanced features | 1 | 2
Cache batch write | 1 | 3
Disk reconstruction | 1 | 4
Garbage collection | 1 | 5

Hot/cold partitions
⚫ Controllers automatically detect data layouts inside SSDs.
⚫ Partitioning of hot and cold data is implemented within the controller and SSDs simultaneously.
⚫ Hot and cold data is laid out sequentially in different partitions, effectively reducing the amount of garbage inside SSDs.

I/O priority adjustment
⚫ I/O priorities are dynamically adjusted within the controller and SSDs based on the service status.
⚫ The priorities of garbage collection I/Os are automatically controlled to trigger garbage collection on demand.
⚫ Service data reads and writes are always responded to with the highest priority.
Key Design Points: Global Wear
Leveling and Anti-Wear Leveling
(Figure: lifespan of SSD#1–SSD#6 under global wear leveling versus global anti-wear leveling, with the threshold at which anti-wear leveling is enabled.)

⚫ Global wear leveling: Data is evenly distributed across all SSDs based on LBAs/fingerprints using FlashLink.
⚫ Global anti-wear leveling: When anti-wear leveling is enabled, a specific SSD is selected to carry more data using FlashLink.

Benefits:
⚫ Global wear leveling enhances overall SSD reliability.
⚫ Global anti-wear leveling avoids simultaneous failure of multiple disks.
Key Features: Global Inline
Deduplication and Compression
(Figure: 8 KB data blocks pass through global inline deduplication against a fingerprint pool and inline compression in the engine before being written to the storage pool.)

Global inline deduplication
⚫ A weak hash plus byte-by-byte comparison ensures precise deduplication.

Inline compression
⚫ Optimized LZ4 algorithm.
⚫ Enhancement in C00: an optimized ZSTD algorithm improves the compression ratio.
⚫ Byte alignment in data compaction and DIF rearrangement increase the compression ratio by 15% to 35%.
Key Features: Multiple Disk Domains
Concept
SSDs can be grouped into multiple disk domains. Faults in one disk domain do not affect services in the other disk domains, isolating different types of services or services from different vStores. With the same number of SSDs, the probability that two SSDs fail simultaneously in the same domain is lower with multiple disk domains than with a single domain. Multiple disk domains therefore reduce the risk of data loss caused by the failure of multiple disks.

(Figure: Hosts 1–4 access LUNs through Controller A and Controller B; the SSDs in the disk enclosure are grouped into disk domains 1–4.)

Technical Highlights
1. One engine can manage up to four disk domains. A disk domain can consist of SSDs owned by two engines. The RAID level of each disk domain can be specified.
2. Disk domains are physically isolated and must each be configured with independent hot spare space.
3. If a disk domain is faulty, services in other disk domains are not affected.

Application Scenarios
1. vStore isolation: Different disk domains can be created for various hosts or vStores, implementing physical isolation.
2. Data reliability improvement: Given the same number of SSDs, the probability that two or three SSDs fail simultaneously in the same domain is lower with multiple disk domains than with a single domain.

• On a storage system, all disk domains consist entirely of SSDs owned by two controllers or SSDs owned by four controllers.
• For example, if a disk domain that consists of SSDs owned by four controllers has been configured, new disk domains must also consist of SSDs owned by four controllers.
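The reliability claim above can be checked with a small back-of-the-envelope calculation. The sketch below is not Huawei code; the 100-SSD and 4-domain split is an assumed example. It estimates the probability that two simultaneous disk failures land in the same disk domain.

```python
# Quick sanity check (illustrative only): probability that two random disk
# failures fall into the same disk domain.
from math import comb

def same_domain_probability(total_disks: int, domains: int) -> float:
    """Probability that 2 simultaneously failed disks share one domain."""
    per_domain = total_disks // domains
    return domains * comb(per_domain, 2) / comb(total_disks, 2)

print(same_domain_probability(100, 1))   # 1.0   -> any double failure hits the single domain
print(same_domain_probability(100, 4))   # ~0.24 -> most double failures are split across domains
```

With four domains of 25 SSDs each, only about a quarter of double failures land in one domain, which is where the "lower probability of two SSDs failing simultaneously in multiple disk domains" comes from.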
Key Features: Internal Key Management
Internal Key Manager is a built-in key management application in Huawei OceanStor Dorado V6 all-flash storage systems. It is designed based on the NIST SP 800-57 best practices and manages the lifecycle of authentication keys (AKs) for encrypted disks.

Application scenarios
Internal Key Manager is recommended if FIPS 140-2 is not required and the key management system is used only by the storage systems in a data center.

Advantages over external key management
It is easy to deploy, configure, and manage. There is no need to deploy an independent key management system.

(Figure: plaintext data flows from the host through the switch to LUNs and file systems in the OceanStor Dorado V6 pool; the Internal Key Manager delivers AKs through the BDM to the self-encrypting drives (SEDs).)
OceanStor Dorado V6
Storage Systems
Initial Configuration
Initialization Process

Change management network port IP addresses.

Apply for and activate licenses.

Log in to DeviceManager.

Start the initial configuration wizard.

Configure authorized IP addresses.

Configure security policies.

Configure alarm policies.


Changing Management Network
Port IP Addresses (1)
DeviceManager makes it easy to modify IP addresses of management ports so the
ports can be connected to user networks during system initialization.

⚫ Prerequisites
The temporary maintenance terminal used for the initial configuration is connected
to the storage device's management port, and the maintenance terminal IP address
and management port's default IP address are on the same network segment.

Choose System > Controller Enclosure. Click to switch to the rear view of the
controller enclosure and click a management port to modify.

Note
⚫ The default IP address of the management network port on management module 0 is 192.168.128.101 and that on management module 1 is 192.168.128.102. The default subnet mask is 255.255.0.0.
⚫ Management network port IP addresses and internal heartbeat IP addresses must reside on different network segments. Otherwise, route conflicts will occur. The default internal heartbeat IP addresses are 127.127.127.10 and 127.127.127.11, and the default subnet mask is 255.255.255.0. In a dual-controller storage system, IP addresses on the 127.127.127.XXX network segment cannot be used.
⚫ Management network port IP addresses and the maintenance network port IP address must reside on different network segments. Otherwise, route conflicts will occur. The default maintenance network port IP address is 172.31.128.101 or 172.31.128.102, and the default subnet mask is 255.255.0.0. Therefore, IP addresses on the 172.31.XXX.XXX network segment cannot be allocated to management network ports. You are advised to connect management network ports to the management network only.
⚫ By default, management network port IP addresses and service network port IP addresses must reside on different network segments.
Changing Management Network
Port IP Addresses (2)
You can also log in to the storage system using the serial port. After using serial
cables to connect a maintenance terminal to a controller enclosure, run the
change system management_ip command to change management network
port IP addresses. For example, set the IPv4 address of the management
network port on management module 0 to 172.16.190.2, subnet
mask to 255.255.0.0, and gateway address to 172.16.0.1.
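If this step is scripted, the command can be sent over the serial console, for example with the pyserial library. The sketch below is only an illustration: the serial device path, baud rate, login prompts, credentials, and the exact parameter names of change system management_ip are assumptions, so check the CLI reference of your system version before using anything like it.

```python
# Illustrative only: sending the management-IP change over a serial console
# using pyserial (pip install pyserial). Device path, baud rate, login order,
# and the CLI parameter names below are assumptions, not verified syntax.
import serial
import time

def send(console: serial.Serial, line: str, wait: float = 1.0) -> str:
    """Send one line to the console and return whatever output is waiting."""
    console.write((line + "\r\n").encode())
    time.sleep(wait)
    return console.read(console.in_waiting or 1).decode(errors="ignore")

with serial.Serial("/dev/ttyUSB0", baudrate=115200, timeout=2) as console:
    print(send(console, "admin"))           # user name (assumed prompt order)
    print(send(console, "Admin@storage"))   # password placeholder
    # The command name comes from this document; the parameter syntax is assumed.
    print(send(console, "change system management_ip management_module_id=0 "
                        "ipv4_address=172.16.190.2 mask=255.255.0.0 "
                        "gateway=172.16.0.1", wait=3.0))
```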

Note
⚫ The same IP address planning rules apply as described in the note on the previous page.
Applying for a License
Item | Description
GTS permission for the ESDP (applicable to Huawei service engineers) | Users who have GTS permission can apply for licenses in Entitlement Activation mode. If you do not have GTS permission, click Permission Application in the left navigation tree of the ESDP home page and complete the permission application.
ASP or Guest permission for the ESDP (applicable to Huawei partners or end users) | Users who have ASP or Guest permission can apply for licenses in Password Activation mode. Click Register Now on the ESDP home page and fill in the required information.
Equipment serial number (ESN) | An ESN is a character string that uniquely identifies a device. Licenses must be activated for each device. You can obtain the ESN in any of the following ways: check the ESN on the mounting ear of the front panel of the device; on the DeviceManager home page, choose Basic Information > SN; or log in to the CLI and run the show system general command to view the value of SN.
Applying for a License (Entitlement
Activation)
Applying for a License (Password
Activation)
Importing and Activating a License
After you obtain a license file, you need to upload and activate it before you can use
the value-added features.
Introduction to DeviceManager
⚫ DeviceManager is integrated storage management software developed by Huawei. It comes pre-installed in storage systems from the factory.
⚫ You can log in to DeviceManager from any maintenance terminal connected to a storage system by entering the management network port IP address of the storage system and a local or domain user name in a browser.

Note: You can download DeviceManager demos of various versions at http://support.huawei.com/enterprise/.
Checking Interoperability Before
Logging In to DeviceManager
⚫ The following table lists the operating systems and browsers compatible with DeviceManager.

Operating System | Operating System Version | Fully Compatible Browsers | Partially Compatible Browsers
Windows | Windows 7 Professional (32-bit/64-bit) | Internet Explorer 10 to 11, Firefox 25 to 52, Chrome 27 to 57 | Internet Explorer 9
Windows | Windows Server 2012 and Windows 8 | Internet Explorer 10 to 11, Firefox 25 to 52, Chrome 27 to 57 | -
Windows | Windows 8.1 | Internet Explorer 11, Firefox 25 to 52, Chrome 27 to 57 | -
Mac OS | Mac OS X 10.5+ | Safari 5.5 to 9.0 | -
Linux | Ubuntu 11 | Firefox 25 to 52 | -
DeviceManager Interface
Storage system status
Initial Configuration Wizard
Alarm Settings — Email Notification
This function allows you to send
alarm notification emails of the
specified severities to specified
recipients' email addresses.

Choose Settings > Alarm Settings > Email Notification and configure the notification settings.
Alarm Settings — SMS Notification
This function allows you to send alarm notifications of the specified severities to specified recipients' phone numbers.
Alarm Settings — Trap IP Address
This function allows you to send alarm notifications to specified network
management systems or storage devices.
Alarm Settings — Syslog
Notification
This function allows you to send alarms and events of specified severities
from devices with specified addresses to the Syslog server.

Choose Settings > Alarm Settings > Syslog Notification and configure the Syslog notification function.
OceanStor Dorado V6
Storage Systems
Operation and
Maintenance
Security Configuration
Management — Domain Authentication
DeviceManager allows users to log in to the storage system by using Lightweight Directory Access Protocol (LDAP) server authentication, so that user information can be managed centrally.

(Figure: users log in to the Dorado V6 storage system, which authenticates them against a domain authentication server.)
Security Configuration
Management — Authorized IP Addresses
To prevent unauthorized IP addresses from accessing DeviceManager, specify the authorized IP addresses that can access the storage device from DeviceManager. After the IP address security rules are enabled, DeviceManager is accessible only to the authorized IP addresses or IP address segments.

(Figure: login attempts from IP addresses that are not authorized, or not in the authorized IP address segment, are rejected by the Dorado V6 storage system.)
Alarm Management — Severity
The following slides present the alarm mechanism, alarm notification methods, and alarm dump, helping you manage and clear alarms.
Alarm Management — Checking Alarms
Alarm Management — Checking Alarms
Detailed descriptions and troubleshooting suggestions are provided for each alarm in the list, facilitating fault rectification.
Performance Management

Performance Management — View Analysis

Performance Management — View Dashboard

On DeviceManager, you can view various performance monitoring data.


Performance Management — Checking
the Service Life of SSDs
On DeviceManager, you can check the service life of SSDs.

Performance Management —
SystemReporter
SystemReporter is a performance analysis tool for storage systems. It provides functions
such as real-time monitoring and trend analysis by collecting, archiving, analyzing, and
forecasting data. By using SystemReporter, users can easily check storage system
performance and tune performance in a timely manner. SystemReporter is installed on
servers and supports the following operating systems.

Performance Management —
SystemReporter
On SystemReporter, you can view real-time and historical performance
monitoring data.

Viewing Basic Information
On the DeviceManager home page, you can view basic information of the storage
system, including health status, alarms, system capacity, and performance. This
information helps you prepare for device management and maintenance.

Viewing Power Consumption
Information
Power consumption indicates how much power a storage system consumes per
unit time. You can view the total power consumption of a storage device or its
power consumption on a specified date.

Checking Device Running Status —
Disk Enclosure/Controller Enclosure

Parameter | Description
Health status | • Normal: The enclosure is functioning and running normally. • Faulty: The enclosure is abnormal.
Running status | Online or offline
Checking Device Running Status —
Controller

Parameter | Description
Health status | • Normal: The controller is functioning and running normally. • Faulty: The controller is abnormal.
Running status | Online or offline
Checking Device Running Status —
Power Module

Parameter | Description
Health status | • Normal: The power module is functioning and running normally. • Faulty: The power module is abnormal. • No input: The power module is in position but is not providing power.
Running status | Online or offline
Checking Device Running Status —
Controller Enclosure BBU

Parameter | Description
Health status | • Normal: The controller enclosure BBU is functioning and running normally. • Faulty: The controller enclosure BBU is abnormal. • Insufficient power: The BBU has insufficient power but other parameters are normal.
Running status | Online, charging, or discharging
Checking Device Running Status —
Fan Module

Parameter | Description
Health status | • Normal: The fan module is functioning and running normally. • Faulty: The fan module is abnormal.
Running status | Online or offline
Checking Device Running Status —
Disk

Parameter | Description
Health status | • Normal: The disk is functioning and running normally. • Faulty: The disk is abnormal. • Failing: The disk is failing and needs to be replaced soon.
Running status | Online or offline
Checking Device Running Status —
Host Port

Parameter | Description
Health status | • Normal: The host port is functioning and running normally. • Faulty: The host port is abnormal.
Running status | Link up or link down
Checking Device Running Status —
Interface Module

Parameter | Description
Health status | • Normal: The interface module is functioning and running normally. • Faulty: The interface module is abnormal.
Running status | Running or powered off
Checking Service Running Status —
Disk Domain

Parameter | Description
Health status | • Normal: The disk domain is functioning and running normally. • Degraded: The disk domain is functioning normally, but performance is not optimal. • Faulty: The disk domain is abnormal.
Running status | Online, reconstruction, precopy, deleting, or offline
Checking Service Running Status —
Storage Pool

Parameter | Description
Health status | • Normal: The storage pool is functioning and running normally. • Degraded: The storage pool is functioning normally, but performance is not optimal. • Faulty: The storage pool is abnormal.
Running status | Online, reconstruction, precopy, deleting, or offline
Checking Service Running Status —
LUN

Parameter | Description
Health status | • Normal: The LUN is functioning and running normally. • Faulty: The LUN is abnormal.
Running status | Online, deleting, or offline
Checking Service Running Status —
Host

Parameter | Description
Status | • Normal: The host is functioning and running normally. • Faulty: The host is abnormal.
Checking Service Running Status —
Remote Replication Pair

Parameter | Description
Health status | • Normal: All pairs are functioning and running normally. • Faulty: One or more of the pairs are abnormal.
Running status | Normal, synchronizing, to be recovered, interrupted, split, or invalid
Checking Service Running Status —
Remote Replication Consistency Group

Parameter | Description
Health status | • Normal: All pairs in the consistency group are functioning and running normally. • Faulty: One or more pairs in the consistency group are abnormal.
Running status | Normal, synchronizing, to be recovered, interrupted, split, or invalid
Checking Service Running Status —
Snapshot

Parameter | Description
Health status | • Normal: The snapshot is functioning and running normally. • Faulty: The snapshot is abnormal.
Running status | Active, inactive, deleting, or rolling back
Inspecting Storage Device Status
You can use SmartKit to make inspection policies and inspect devices to check
device running status in a timely manner.

Powering Storage Devices On or Off
— Powering On a Device
The correct power-on sequence is as follows:
1. Switch on the external power supplies of all devices.
2. Press the power button on the controller enclosure.
3. Switch on the Ethernet or Fibre Channel switches (if they are configured but not yet powered on).
4. Switch on the application servers (if they are not yet powered on).
Powering Storage Devices On or Off
— Powering Off a Device
The correct power-off sequence is as follows:
1. Stop all services on the storage device.
2. Hold down the power button for 5 seconds to
power off the controller enclosure or perform
power-off operations on DeviceManager.
3. Disconnect the controller enclosure and disk
enclosures from their external power supplies.

Powering Storage Devices On or Off
— Restarting a Storage Device
Exercise caution when you restart the storage device
as doing so interrupts the services running on the
device.

Powering Storage Devices On or Off
— Powering On an Interface Module
If you want to enable interface modules that have
been powered off, power on them on DeviceManager.

Powering Storage Devices On or Off
— Powering Off an Interface Module
Before replacing an interface module, power it off.
Collection and Recovery of Storage
System Information
After a fault occurs, collect the basic information, fault
information, and storage device information, and send
it to maintenance engineers. This helps maintenance
engineers quickly locate and rectify the fault. Note
that the information collection operations described
here must be authorized by customers in advance.

Exporting System Data
The system data to be exported using DeviceManager includes
running data, system logs, and disk logs.
• Running data indicates the real-time status of a storage system, such as the configuration information of LUNs. Running data files are in *.txt format.
• System logs record information about the running data, events, and
debugging operations on a storage system and can be used to analyze
the status of the storage system. A system log file is in *.tgz format.
• A DHA runtime log is the daily runtime log of a disk. It mainly includes
daily disk health status, I/O information, and disk life span. A DHA
runtime log file is in *.tgz format.
• An HSSD log is the working log of an HSSD, such as the S.M.A.R.T. information of a disk. An HSSD log file is in *.tgz format.
Exporting Alarms and Events
Alarms and events record the faults and events that occur during
storage system operation. When the storage device is faulty, view
the alarms and events to locate and rectify the fault.
On DeviceManager, you can specify the severity and time of
alarms and events to export.
➢ On the Current Alarms page, critical alarms, major alarms, and
warnings are displayed.
➢ On the All Events page, alarms of all severities are displayed. Alarms on the Current Alarms tab are exported to All Events.
Quick Maintenance Process
The following flowchart shows how to quickly maintain a storage system.

View the status of indicators on the front and rear panels of devices in the
storage system to check for hardware faults.
On the Home page of DeviceManager, you can know the basic information,
alarms, system capacity trend, and performance of the storage system.

Check the operation of the storage system through DeviceManager to get


real-time and historical statuses of the storage system service. When a fault
occurs, you can rectify the fault in a timely manner, avoiding service
interruption and data loss.
When a fault occurs in the storage system, DeviceManager automatically
determines the severity of the fault. Then it sends an alarm to the
maintenance engineer so that the engineer can rectify the fault in a timely
manner, avoiding a service interruption and data loss.

Checking Service Status
The following table describes the check items.
Item | Abnormal Status | Common Cause | Recommended Action
Disk domain | The Health Status is Degraded or Faulty. | The disk domain is faulty or degraded. | Reinsert disk modules that are not secured in the disk slots or replace the faulty disk modules.
Storage pool | The Health Status is Degraded or Faulty. | The storage pool is faulty or degraded. | Reinsert disk modules that are not secured in the disk slots or replace the faulty disk modules.
LUN | The Health Status is Faulty. | The associated LUN is faulty. | Follow the instructions regarding LUN alarms to handle the alarms.
Snapshot | The Health Status is Faulty. | The source LUN is abnormal. | Follow the instructions regarding snapshot alarms to handle the alarms.
Remote replication | The Health Status is Faulty. | The primary or secondary LUN is abnormal. | Follow the instructions regarding remote replication alarms to handle the alarms.
Remote replication | The Health Status is Faulty. | Links between storage systems are abnormal. | Check whether the cable connecting to the remote storage array is loose or damaged.
Checking Storage System
Performance
The following table describes the check items.
Item (a) | Abnormal Status | Common Cause | Recommended Action (b)
Block bandwidth (MB/s) | The bandwidth is lower than the minimum bandwidth of a single link. | The transmission rate of the storage system does not match that of the application server or switch. | Adjust the transmission rate of the related port on the server or switch.
Total IOPS (IO/s) | The throughput is low or 0. | The link between the storage system and the application server or switch is abnormal. | Check the cable connection between the storage system and the application server or switch.

a: This table only lists recommended items. Determine whether to enable other items based on the storage system status. Enabling too many items may cause a slight degradation of performance in the processing of storage system services.
b: For some faults, the system displays alarms with IDs and recommended actions. Troubleshoot such faults by following the instructions.
COFFEE BREAK
RESUMING AT 16:30
OceanStor SmartKit
Introduction
SmartKit Introduction
⚫ A portable toolbox for Huawei IT
service engineers.
⚫ Provides a unified desktop
management platform for IT tools.
The built-in ToolStore allows quick
download, installation, and
upgrade of tools.
⚫ Includes various tools required for
deployment, maintenance, and
upgrade of IT devices. These tools
can be used for device O&M,
improving work efficiency and
simplifying operations.
Information Collection Tool – Process
1. Adding devices: Add the devices whose information you want to collect.
2. Setting collection items: Select the desired collection items. Information for the selected items will be collected.
3. Selecting devices: Select the devices whose information you want to collect.
4. Changing the directory: Select the directory for saving the information.
5. Collecting information: By running commands on the devices, the tool collects and packages the information with one click.
6. Completing information collection: After information collection is complete, you can click Open Directory to view the collected information.
Information Collection Tool –
Adding Devices
Information Collection Tool – Setting
Collection Items
Information Collection Tool – Setting
the Directory
Information Collection Tool –
Collecting Information
Information Collection Tool –
Completing Information Collection
InfoGrab – Process
1. Creating a task: Create an information collection task.
2. Adding devices: Add devices for information collection. The devices can be hosts, databases, and switches.
3. Setting collection items: Select the desired collection items. Information for the selected items will be collected.
4. Setting the directory for saving the result: Select the directory for saving the collection result.
5. Collecting information: By running commands on the devices, the tool collects and packages the information with one click.
6. Completing information collection: After InfoGrab collects the information, you can click View Result to view the collected information.
InfoGrab – Creating a Task (Real-
time Collection)
InfoGrab – Creating a Task (Periodic Collection)
InfoGrab – Adding Devices
InfoGrab – Adding Devices
InfoGrab – Setting Collection Items
InfoGrab – Setting the Directory for
Saving the Result
InfoGrab – Collecting Information
Inspection Tool – Process
1. Selecting the inspection type: Select the inspection type for specific scenarios.
2. Selecting devices: Select the devices that you want to inspect.
3. Selecting check items: Select the items that you want to inspect.
4. Setting a check policy: Set the directory for saving the inspection report.
5. Performing the inspection: The tool runs commands on the arrays to inspect them.
6. Completing the inspection: After the inspection, you can click Open the result directory to view the result. If you fail to view the result, click the related message box to collect information.
Inspection Tool – Selecting the
Inspection Type
Inspection Tool – Selecting Devices
Inspection Tool – Selecting Check Items
Inspection Tool – Setting a Check Policy
Inspection Tool – Starting Inspection
Upgrade Tool – Process
1. Setting upgrade information: Set a path for saving the upgrade package, a path for saving backup data, and an upgrade mode (online or offline). Online upgrade is recommended.
2. Importing the upgrade package: Click Perform Upgrade and enter the Upgrade Package Import process to upload the upgrade package to the array.
3. Performing the pre-upgrade check: Check that the device meets the upgrade requirements and view the errors, repair suggestions, and handling operations based on the check result.
4. Backing up data: Back up the system configuration data and the license.
5. Performing the upgrade: Upgrade the device. The upgrade progress and periodic steps are displayed.
6. Verifying the upgrade: Check the status of the upgraded device.


Upgrade Tool – Setting Upgrade
Information
Upgrade Tool – Importing the
Upgrade Package
Upgrade Tool – Performing Pre-
upgrade Check
Upgrade Tool – Backing Up Data
Upgrade Tool – Performing the Upgrade
Upgrade Tool – Verifying the Upgrade
Patch Tool – Process

1. Selecting devices: Select a device for patch installation and a patch installation mode. You can select devices of the same model and version to install the patch in a batch.
2. Selecting patches: Select a local patch installation package.
3. Installing patches: This operation involves importing the patch installation package, checking before the installation, installing the patch, and verifying the patch installation.
Patch Tool – Selecting Devices
Patch Tool – Selecting Devices
Patch Tool – Selecting Devices
Patch Tool – Selecting a Patch
Patch Tool – Installing the Patch
OceanStor Dorado V6
Storage Systems
Storage Pool
Basic Storage Pool Concepts
⚫ A disk domain consists of different disks and does not have a
RAID configuration. Disk domains provide basic storage
resources for storage pools. Disks within a disk domain belong
to the same failure domain.

⚫ A storage pool consists of disks of specified types and has a


specified RAID configuration. Storage pools are containers of
storage resources visible to users, created based on disk
domains.

⚫ The maximum number of disk domains and storage pools that


can be created in a storage system is the same as the
maximum number of engines in the system.
Basic Storage Pool Services – Disk
Selection
⚫ Each disk is divided into chunks (CKs) of a certain size.
⚫ Each chunk group (CKG) consists of CKs from different disks in the same engine
and the same domain. CKs form a CKG based on a specific RAID configuration.

⚫ CKs are selected for a CKG based on wear leveling and anti-wear leveling
algorithms. The algorithms select CKs based on the capacity and degree of wear,
ensuring SSDs are used evenly and the risk of failure is mitigated.

(Figure: within a disk domain, CKs from different disks are combined into CKGs.)
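As a rough illustration of how CKs from different disks can be combined into a CKG, here is a minimal sketch. It is not the FlashLink implementation; the disk count, chunk counts, and scoring rule are assumed example values.

```python
# Minimal sketch (not FlashLink): pick one free CK from each of N+M distinct
# disks to form a CKG, preferring disks with more free chunks and less wear.
from dataclasses import dataclass, field

@dataclass
class Disk:
    disk_id: int
    wear: float                      # 0.0 (new) .. 1.0 (worn out)
    free_chunks: list = field(default_factory=list)

def select_ckg(disks, members_needed):
    """Return (disk_id, chunk_id) pairs, one per distinct disk."""
    candidates = [d for d in disks if d.free_chunks]
    # Favour disks with plenty of free space and low wear.
    candidates.sort(key=lambda d: (-len(d.free_chunks), d.wear))
    if len(candidates) < members_needed:
        raise RuntimeError("not enough disks for the requested RAID width")
    return [(d.disk_id, d.free_chunks.pop()) for d in candidates[:members_needed]]

disks = [Disk(i, wear=0.1 * i, free_chunks=list(range(10))) for i in range(8)]
print(select_ckg(disks, members_needed=5))   # e.g. a 4+1 CKG spread across 5 disks
```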
Basic Storage Pool Services – Wear
Leveling
The lifespan of SSDs is determined
by the degree of wear. When SSDs
are selected unevenly, that is,
when a few SSDs are used
repeatedly, those SSDs experience
wear at a faster rate, as a result of
which the overall reliability of the
array is reduced. In this case, the
wear leveling algorithm ensures
even use of SSDs to prolong usage
and reliability.
Basic Storage Pool Services – Anti-
Wear Leveling
When many SSDs approach the wear threshold at the same time, several of them can fail together, so the number of faulty disks may exceed the number of redundant ones and cause data loss in the array. The anti-wear leveling algorithm deliberately directs more wear to selected heavily worn SSDs so that SSDs reach end of life one after another rather than simultaneously, reducing the uncertainty of concurrent failures.
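A minimal sketch of the two placement policies described above (illustrative only; the real algorithms are internal to FlashLink and the threshold value is an assumption): wear leveling prefers the least-worn SSD, while anti-wear leveling, once the wear threshold is crossed, concentrates writes on the most-worn SSD so that SSDs do not reach end of life at the same time.

```python
# Illustrative only: choosing a target SSD under wear leveling vs. anti-wear
# leveling. "wear" is the fraction of rated program/erase cycles consumed.
ANTI_WEAR_THRESHOLD = 0.70   # assumed example threshold

def pick_ssd(ssd_wear: dict) -> str:
    """ssd_wear maps SSD name -> wear level in [0, 1]; return the write target."""
    if max(ssd_wear.values()) >= ANTI_WEAR_THRESHOLD:
        # Anti-wear leveling: push the most-worn SSD ahead so failures
        # are staggered instead of simultaneous.
        return max(ssd_wear, key=ssd_wear.get)
    # Wear leveling: spread writes onto the least-worn SSD.
    return min(ssd_wear, key=ssd_wear.get)

print(pick_ssd({"SSD#1": 0.30, "SSD#2": 0.35, "SSD#3": 0.28}))  # -> SSD#3
print(pick_ssd({"SSD#1": 0.72, "SSD#2": 0.71, "SSD#3": 0.70}))  # -> SSD#1
```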
Basic Storage Pool Services – RAID
2.0+ Technology Overview
RAID 2.0+ technology dynamically selects the number of data columns (N) in a
CKG according to the number of disks in the disk domain (N is a fixed value when
RAID 2.0+ technology is not used), and keeps the number of parity columns (M)
unchanged, improving reliability and space utilization.
⚫ How RAID 2.0+ technology works:
 When the number of disks increases, more data columns are selected to form a CKG,
improving the space utilization rate (N/(N+M)).
 When the number of disks decreases, the number of data columns in the new CKG is
decreased but the number of parity columns is kept unchanged. In this case, data will
not be lost when the number of damaged disks is the same as or less than that of
parity columns in the new CKG.

⚫ Restriction: The value of N + M must be greater than or equal to 5 but less


than or equal to 25.
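The column-count rule above can be captured in a few lines. The sketch below is only an illustration of picking the number of data columns N for a given usable disk count and parity count M under the stated 5 ≤ N + M ≤ 25 restriction; the real selection logic also considers hot spare space and other factors.

```python
# Illustration of the RAID 2.0+ restriction 5 <= N + M <= 25: pick the number
# of data columns N for a CKG given the usable disk count and parity count M.
def data_columns(usable_disks: int, parity_columns: int) -> int:
    max_total, min_total = 25, 5
    n = min(usable_disks, max_total) - parity_columns
    if n + parity_columns < min_total:
        raise ValueError("too few disks for this RAID configuration")
    return n

print(data_columns(usable_disks=9, parity_columns=2))    # 7  -> RAID 6 stripe as 7+2
print(data_columns(usable_disks=40, parity_columns=3))   # 22 -> RAID-TP stripe as 22+3
```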
Basic Storage Pool Services – RAID 2.0+
Technology Principles
⚫ Adding new disks: After new disks are added and RAID 2.0+ is applied, the number of data columns in the new CKG is automatically increased. The new disks are divided into CKs of a certain size, and the new CKs are allocated to the new CKG.
⚫ Handling a faulty disk: When RAID 2.0+ is applied to a faulty disk, the number of data columns in the new CKG is automatically reduced, and CKs from the damaged disk are not allocated to the new CKG.

(Figure: an old CKG and a new CKG with more data columns after disks are added, and an old CKG and a new CKG with fewer data columns after a disk fails; the number of parity columns stays the same.)
Basic Storage Pool Services – RAID
Algorithm
Compared with traditional RAID 5 and RAID 6, which support one and two parity
columns respectively, Dorado's new RAID-TP algorithm supports three parity

columns, safeguarding data even when three disks fail.

(Figure: a RAID-TP stripe across Disk 0–Disk 6 with data columns D0–D3 and parity columns P, Q, and R.)
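To make the idea of three independent parity columns concrete, here is a simplified sketch of RAID-6-style triple-parity encoding over GF(2^8) using the generators 1, 2, and 4. It is not Huawei's RAID-TP implementation: real products use carefully chosen coefficients plus matrix-based reconstruction, while this sketch only computes P, Q, and R for a stripe.

```python
# Simplified triple-parity sketch over GF(2^8) (polynomial 0x11D).
# P, Q, R are computed column-wise from data chunks D0..Dn-1 using the
# generators 1, 2 and 4; this is NOT Huawei's proprietary RAID-TP code.
def gf_mul(a: int, b: int) -> int:
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1D
        b >>= 1
    return p

def gf_pow(base: int, exp: int) -> int:
    result = 1
    for _ in range(exp):
        result = gf_mul(result, base)
    return result

def triple_parity(data_chunks):
    """data_chunks: list of equal-length byte strings (the D columns)."""
    size = len(data_chunks[0])
    p, q, r = bytearray(size), bytearray(size), bytearray(size)
    for i, chunk in enumerate(data_chunks):
        gq, gr = gf_pow(2, i), gf_pow(4, i)
        for j, byte in enumerate(chunk):
            p[j] ^= byte                     # P: plain XOR parity
            q[j] ^= gf_mul(byte, gq)         # Q: weighted by powers of 2
            r[j] ^= gf_mul(byte, gr)         # R: weighted by powers of 4
    return bytes(p), bytes(q), bytes(r)

d = [b"D0D0", b"D1D1", b"D2D2", b"D3D3"]     # four data columns, as in the stripe above
P, Q, R = triple_parity(d)
print(P.hex(), Q.hex(), R.hex())
```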
Basic Storage Pool Services –
Overview of Deduplication
⚫ Dorado supports global deduplication within disk domains,
determining repeated data at 4 KB or 8 KB granularity, helping avoid
duplicates and unnecessary space usage.

⚫ Mechanisms like weak hash algorithm and byte-by-byte comparison


help pinpoint repeated data for clearing, reducing the possibility of
hash conflicts.

⚫ Deduplication has a positive effect on disk efficiency after data is


evenly distributed to SSDs based on fingerprints.
Basic Storage Pool Services –
Deduplication Principles
(Figure: when repeated data D1 is written, a new mapping item pointing to the existing fingerprint F1 is added instead of writing the data again. With deduplication, the mapping table records the mapping from the logical block address (LBA) to the fingerprint index; without deduplication, it records the mapping from the LBA to the data address.)

How deduplication works:
1. A data fingerprint is calculated and forwarded to the fingerprint-owning controller. Repeated data is identified by querying the global fingerprint table.
2. Data whose fingerprint matches that of stored data is compared byte by byte. If its bytes are the same as those of the stored data, the flagged data is declared to be repeated data.
3. A mapping item for the repeated data (LBA3 to F1) is added to the fingerprint index, and the reference count of the index is increased.
4. The fingerprint index is forwarded to the owning controller of mapping items to insert the mapping item (LBA3 to F1) into the mapping table.
5. A 'write successful' response is returned.
Note: Non-deduplicated data is written to the disk under the owning controller of mapping items, and the item (LBA3 to D1) is inserted into the mapping table.
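The write path described above can be summarized in a small sketch (illustrative only, not the array's code): a weak hash serves as the fingerprint, a byte-by-byte comparison guards against hash collisions, and reference counts track how many LBAs point at each stored block.

```python
# Illustrative deduplicating write path: weak hash fingerprint, byte-by-byte
# verification, reference counting.  Not the storage array's implementation.
import zlib

fingerprint_table = {}   # fingerprint -> (data_bytes, reference_count)
mapping_table = {}       # LBA -> fingerprint

def dedup_write(lba: int, block: bytes) -> None:
    fp = zlib.adler32(block)                 # weak hash used as the fingerprint
    if fp in fingerprint_table:
        stored, refs = fingerprint_table[fp]
        if stored == block:                  # byte-by-byte comparison
            fingerprint_table[fp] = (stored, refs + 1)
            mapping_table[lba] = fp          # only a new mapping item is added
            return
    fingerprint_table[fp] = (block, 1)       # new data: store it and its fingerprint
    mapping_table[lba] = fp

dedup_write(1, b"A" * 8192)
dedup_write(3, b"A" * 8192)                  # repeated data: reference count -> 2
print(fingerprint_table[zlib.adler32(b"A" * 8192)][1])   # 2
```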
Basic Storage Pool Services –
Compression Principles
In Dorado systems, user data is compressed at a granularity of 4 KB or 8 KB. Post-
compression, pieces of data are stored at smaller granularities and assembled so that they can
be written to CKGs in a compacted and sequential manner to save space.

8 KB data blocks are used as an example.

(Figure: user data D1 (8 KB) is compressed to 4 KB, D2 (8 KB) to 2 KB, and D3 (8 KB) to 2 KB; the 4 KB + 2 KB + 2 KB results are stored to the disk in a compacted manner.)
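A toy version of this compact-and-append layout might look like the sketch below; zlib from the Python standard library stands in for the LZ4/ZSTD algorithms mentioned earlier, and block sizes and the CKG abstraction are simplified.

```python
# Toy compress-and-compact writer: 8 KB user blocks are compressed and
# appended back to back, and an index records (offset, length) per block.
import zlib

def compact_write(blocks):
    ckg = bytearray()
    index = []                       # per block: (offset, compressed_length)
    for block in blocks:
        compressed = zlib.compress(block, level=1)
        index.append((len(ckg), len(compressed)))
        ckg += compressed            # sequential, compacted layout
    return bytes(ckg), index

def compact_read(ckg, index, block_no):
    offset, length = index[block_no]
    return zlib.decompress(ckg[offset:offset + length])

blocks = [b"D1" * 4096, b"D2" * 4096, b"D3" * 4096]   # three 8 KB blocks
ckg, index = compact_write(blocks)
assert compact_read(ckg, index, 2) == blocks[2]
print(len(ckg), index)               # compacted size and per-block extents
```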


Basic Storage Pool Services –
Garbage Collection Overview
⚫ ROW
All data and metadata in a disk domain are written into data blocks in redirect on write
(ROW) mode. Overwrite is not performed in CKGs.

⚫ Garbage collection
To meet the requirements of ROW on space for writing new data, valid data in old CKGs is
migrated. After migration, data in old CKGs is completely erased. In this way, the space for
ROW writes can be managed and provided.
Basic Storage Pool Services –
Garbage Collection Principles
(Figure: valid data in old CKGs is migrated to a new CKG, after which the old CKGs are released.)
How garbage collection works:


1. Valid data written in current CKGs is transferred to a new CKG.
2. Old CKGs are released and released CKs are erased.
3. Released CKs are thus open for allocation to form CKGs for writing new data.
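A minimal sketch of this collect-migrate-release cycle is shown below. It is illustrative only: the real system works on CKs/CKGs with ROW metadata, and the garbage threshold used here is an assumed example value.

```python
# Minimal garbage-collection sketch: when a CKG's garbage ratio crosses a
# threshold, its valid blocks are rewritten into a fresh CKG and the old
# CKG is released for reuse.
GARBAGE_THRESHOLD = 0.5              # assumed example threshold

def collect(ckgs, free_ckgs):
    """ckgs: list of dicts {"valid": [...], "invalid": [...]}."""
    for ckg in list(ckgs):
        total = len(ckg["valid"]) + len(ckg["invalid"])
        if total and len(ckg["invalid"]) / total >= GARBAGE_THRESHOLD:
            new_ckg = {"valid": list(ckg["valid"]), "invalid": []}
            ckgs.append(new_ckg)             # 1. migrate valid data to a new stripe
            ckgs.remove(ckg)                 # 2. release the original CKG
            free_ckgs.append("erased CKG")   # 3. space available for ROW writes
    return ckgs, free_ckgs

pool = [{"valid": ["a", "b"], "invalid": ["x", "y", "z"]}]
print(collect(pool, []))
```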
Basic Storage Pool Services –
Overview of Reconstruction
⚫ When the number of damaged disks does not exceed that of redundant disks,
damaged data blocks within faulty or long-removed disks can be recovered
using the RAID algorithm. Damaged data is then written to new data blocks or
CKGs. Data reconstruction is performed based on RAID of CKGs, parity columns,
and normal data columns, resulting in recovery of data redundancy capacity.

⚫ Reconstruction is classified into common reconstruction and migration


reconstruction.

 During common reconstruction, recovered data is written to newly selected


CKs.

 During migration reconstruction, recovered data is written to new CKGs.


Basic Storage Pool Services –
Common Reconstruction Principles

(Figure: a CKG across Disk 0–Disk 5 with columns D0, D1, D2, P, and Q; the damaged D2 is rebuilt onto a new CK on another disk.)
When a disk is faulty, a new CK is selected from another disk outside the
affected CKG. The data within the damaged CK is then calculated based
on RAID parity data to reconstruct it.
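For a single lost column, the rebuild is a plain XOR of the surviving columns, as in the sketch below (illustrative only; multi-column failures such as those covered by RAID-TP need Reed-Solomon-style decoding).

```python
# Rebuild one lost data column of a stripe by XOR-ing the surviving data
# columns with the P parity column (single-failure case only).
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def rebuild_lost_column(surviving_columns):
    """surviving_columns: the remaining data columns plus the P column."""
    result = surviving_columns[0]
    for column in surviving_columns[1:]:
        result = xor_bytes(result, column)
    return result

d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
p = xor_bytes(xor_bytes(d0, d1), d2)           # parity written at stripe creation
assert rebuild_lost_column([d0, d1, p]) == d2  # D2's disk failed: recompute D2
```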
Basic Storage Pool Services –
Migration Reconstruction Principles
(Figure: CKG0 spans Disk 0–Disk 5 with columns D0, D1, D2, D3, P, and Q. After D2 is damaged, CKG0 is shrunk to D0, D1, and D3 with recalculated parity (D0 + D1 + D3 => P' + Q'), and the recovered D2 is migrated to CKG1, which has columns D0, D1, D2, P, and Q.)

1. The number of data columns is reduced, and new


parity columns are recalculated for CKG0.
2. Damaged data D2 is then migrated to CKG1.
Basic Storage Pool Services – Pre-
copy Technology
⚫ Scenario
Data can still be accessed even when a disk slows or is about to fail. However,
writing data on such a disk may accelerate damage or result in poor
performance. Therefore, the at-risk data needs to be migrated preemptively,
after which the disk can be removed.

⚫ Difference between pre-copy and reconstruction

During a pre-copy, data on the source disk remains accessible and is preferentially read from that disk, relieving the pressure of reading other back-end disks. During reconstruction, by contrast, the data has to be regenerated from parity while the CKG is in a degraded state.
Configuration Operation – Provisioning
Configuration Operation – Creating a
Disk Domain
(Screenshots: numbered steps for creating a disk domain in DeviceManager.)

You can select controller enclosures for a disk domain (by default, one controller enclosure is selected).
Configuration Operation – Creating a
Storage Pool
(Screenshots: numbered steps for creating a storage pool in DeviceManager.)
Configuration Operation – Creating
a LUN and LUN Group
(Screenshots: numbered steps for creating a LUN and a LUN group in DeviceManager.)

A LUN group can contain one or multiple LUNs. A maximum of 4096 LUNs can be added to a LUN group. A LUN can be added to a maximum of 64 LUN groups.
Configuration Operation – Creating
a Host
Hosts can be created manually, in batches, or through automatic
scanning. This page describes how to create a host manually.

(Screenshots: numbered steps for creating a host in DeviceManager.)

On Fibre Channel networks, choose FC from the Initiator Type drop-down list. On iSCSI networks, choose iSCSI from the Initiator Type drop-down list. On IB networks, choose IB from the Initiator Type drop-down list.
Configuration Operation – Creating
a Host Group
(Screenshots: numbered steps for creating a host group in DeviceManager.)
Configuration Operation – Creating
a Port Group (Optional)
A port group is a logical combination of multiple physical ports and a mode for
use of specific ports by the storage system. A port group establishes a mapping
relationship between storage resources (LUNs) and servers.

(Screenshots: numbered steps for creating a port group in DeviceManager.)
Configuration Operation – Creating
a Mapping View

(Screenshots: numbered steps for creating a mapping view in DeviceManager.)
OceanStor Dorado V6
Storage Systems
SmartThin
Terminology

Term | Definition/Description
SmartThin | A thin provisioning mechanism that offers on-demand allocation of storage space.
Thin LUN | A logical disk that can be accessed by hosts. A thin LUN dynamically obtains storage resources from the storage pool according to the actual capacity requirements of users.
Mapping table | Data that records the mapping between the logical addresses of a thin LUN and the actual data locations.
Overview
⚫ Definition
✓ SmartThin enables on-demand space allocation. Storage space is not all allocated in advance. Dorado V6 does not support thick LUNs.

⚫ Highlights
✓ Provides a storage management
approach that enables on-demand
storage resource allocation.
✓ Provides thin LUNs and allocates
physical storage space based on
user needs.
✓ Reduces resource consumption.
License Requirements
⚫ SmartThin is a value-added feature which requires a
license to be purchased.

⚫ In the license file, SmartThin is displayed for Name.


Thin LUN
⚫ A thin LUN is a logical disk that can be accessed by hosts. The thin LUN dynamically obtains storage resources from the storage pool according to the actual capacity requirements of users.
✓ Data collection: In terms of a storage system, a thin LUN is a LUN that can be mapped to a host.
✓ Fully usable: A thin LUN can be read and written.
✓ Dynamic allocation: Resources are allocated only when data is written.

(Figure: capacity is allocated from the storage system to the host volume on demand; the allocated capacity equals the actual capacity used by the user.)
Storage Virtualization
⚫ Capacity-on-write (COW): Storage space is allocated by engines upon data writes based on load balancing rules.
⚫ Direct-on-time: Data reads from and writes to a thin LUN are redirected to the actual storage location.

(Figure: the host writes to a thin LUN; space is allocated from the physical storage space (a RAID 5 storage pool) upon data writes, and I/O is redirected to the actual storage location.)
Application Type
When creating a LUN, you can select the application type of the service.
The application type includes the application request size, as well as
SmartCompression and SmartDedupe attributes. LUNs are created based
on application types. The system automatically sets parameters to
provide optimal performance for services.
Capacity-on-Write
⚫ A write request to a thin LUN will trigger space allocation.
⚫ If the available space of a thin LUN is smaller than the threshold,
the thin LUN applies for more space from the storage pool.

(Figure: if space is already allocated, a write request is served directly by the thin LUN; if not, the thin LUN first allocates space from the storage pool and then writes the data.)
Direct-on-Time
Capacity-on-write stores data in random areas. For this reason,
the direct-on-time technology is required to redirect requests
when thin LUNs are accessed.

(Figure: for a read request, allocated space is redirected to the actual data while unallocated space returns zeros; for a write request, unallocated space is allocated first and allocated space is redirected.)
Mapping Table
A mapping table shows the mapping relationship of thin LUN data. Each mapping entry is referred to as a pointer.
✓ The left part of a mapping entry is the logical address, which is used as the search key.
✓ The right part of a mapping entry records the address of the resource block.
✓ Entries in the mapping table can be added, searched, or deleted.

(Figure: mapping entries such as 1→7, 3→5, and 6→8 are added to, searched in, and deleted from the mapping table.)

The mapping table shows where the actual data of a thin LUN is located.
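A compact sketch of this mapping-table behaviour, combining capacity-on-write allocation with direct-on-time redirection, is shown below. It is illustrative only; the 8 KB grain size and the dict/list data structures are assumptions, not the array's internal format.

```python
# Illustrative thin LUN: a dict-based mapping table from logical grain number
# to physical grain number.  Writes allocate on demand (capacity-on-write);
# reads are redirected, and unmapped grains return zeros (direct-on-time).
GRAIN = 8192                         # assumed 8 KB grain size

class ThinLUN:
    def __init__(self):
        self.mapping = {}            # logical grain -> physical grain
        self.pool = []               # physical grains (backing store)

    def write(self, logical_grain: int, data: bytes) -> None:
        if logical_grain not in self.mapping:        # allocate only on first write
            self.mapping[logical_grain] = len(self.pool)
            self.pool.append(bytearray(GRAIN))
        physical = self.mapping[logical_grain]
        self.pool[physical][: len(data)] = data

    def read(self, logical_grain: int) -> bytes:
        if logical_grain not in self.mapping:
            return bytes(GRAIN)                      # unallocated: return zeros
        return bytes(self.pool[self.mapping[logical_grain]])

lun = ThinLUN()
lun.write(1000, b"hello")
print(lun.read(1000)[:5], len(lun.pool))   # b'hello' 1  -> only one grain allocated
print(lun.read(5)[:5])                     # b'\x00\x00\x00\x00\x00'
```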
Reading Data from a Thin LUN

1. The system receives a read request.
2. The system queries the mapping table.
3. The system redirects the request.
4. The system reads the data.

(Figure: logical data blocks are translated through the mapping table to the data in physical space; unmapped blocks return 0.)
Writing Data to a Thin LUN
1. The system receives a write request.
2. The system queries the mapping table.
3. The system redirects the request.
4. The system writes the data.

(Figure: new data a' for logical block a is written to the mapped physical space, and the mapping table is updated.)
Using SmartThin
The procedure for using SmartThin is similar to that for using RAID
groups and thick LUNs:
1. Select disks and create a disk domain using the disks.
2. Create a storage pool.
3. Create a thin LUN.
4. Map the thin LUN to the host for data read and write or create
value-added services such as remote replication and snapshot on the
thin LUN.

(Figure: host volumes 1–3 consume capacity that is allocated on demand from the storage pool.)
Typical Application Scenarios

1. SmartThin can help core service systems that have


demanding requirements on business continuity, such as
bank transaction systems, to expand system capacity online.
2. For services where the growth of application system data is
hard to evaluate accurately, such as email services and web
disk services, SmartThin can assist with on-demand physical
space allocation, preventing wasted space.
3. For mixed services that have diverse storage requirements,
such as carriers' services, SmartThin can assist with physical
space contention, achieving the optimized space
configuration.
SmartThin Configuration Process
1. Check the SmartThin license.
2. Select disks and create a disk domain.
3. Select a RAID level and create a storage pool.
4. Create a thin LUN.
Checking the SmartThin License
1. Check whether the SmartThin license is valid.
2. If it is not, import and activate the SmartThin license.
3. Enable SmartThin.
Checking the SmartThin License
Importing the SmartThin License
Creating a Disk Domain
Creating a Storage Pool
Creating a Thin LUN
Modifying the Owning Controller of
a Thin LUN
Expanding the Capacity of a Thin
LUN
Deleting a Thin LUN

Before deleting a thin LUN,


delete the mapping and
value-added configurations
from the thin LUN.
Deleting a Storage Pool

Before deleting a
storage pool, delete
all LUNs from the
storage pool.
Deleting a Disk Domain

Before deleting a disk


domain, delete all
storage pools from
the disk domain.
Common Faults and Solutions
⚫ Symptom: The capacity of a thin LUN is not fully used, but an
alarm is displayed indicating that the storage pool capacity is
insufficient.
⚫ Cause: The storage pool capacity is used up. The thin LUN
capacity is not the actual allocated capacity. Add disks into the
storage pool to expand the storage pool capacity.

⚫ Symptom: Data is continuously written to a thin LUN, but the


free thin LUN capacity does not change.
⚫ Cause: The data is written to storage locations that were
allocated earlier. The storage space will not be allocated again,
so the thin LUN capacity will not change.
OceanStor Dorado V6
Storage Systems
SmartDedupe&
SmartCompression
Background and Definition – SmartDedupe
⚫ Deduplication: A technology for saving storage space. Duplicate data can
occupy a lot of disk space, reducing efficiency. The goal of storage-based data
deduplication is to inspect large blocks of data and identify duplicate data
blocks larger than 1 KB in order to store only one copy. Deduplication is widely
used in network disks, emails, disk backup media devices, and other areas.
⚫ Deduplication Types
 Inline deduplication: Data is deduplicated when written to the storage
media.
 Post-processing deduplication: After data is written to the storage
media, it is read from the media and deduplicated.
 Fixed-length deduplication: Data is divided into blocks of fixed
granularity and then deduplicated.
 Variable-length deduplication: Data is divided into blocks of different
sizes based on the content. This kind of deduplication is mainly used in backup scenarios.
Background and Definition – SmartCompression
⚫ Compression: In computer science and information theory, data compression, also known as source coding, is the process of encoding information using fewer bits than the original representation.
⚫ Compression Types:
 Inline compression: Data is compressed when written to the storage
media.
 Post-processing compression: After data is written to the storage media,
it is read from the media and compressed.
 Software compression: the process of executing the compression
algorithm using the system CPU.
 Hardware compression: The compression algorithm is logically
integrated into the hardware device, such as FPGA and ASIC. Then the
hardware device can provide the compression capability.
 Lossy compression: After lossy compression, data cannot be recovered
to the original status. This kind of compression is used to process
audio, video, and image data.
 Lossless compression: After lossless compression, data can be recovered
to the original status completely.
Objectives and Benefit
⚫ Less data storage space
 SmartDedupe and SmartCompression, when used separately or together,
can effectively reduce redundant data and data storage space.
⚫ Lower purchasing cost of the storage system
 Data occupies less space, so fewer storage devices are required to meet
future data retention requirements.
⚫ Lower TCO
 Fewer storage devices require less management personnel input. O&M
costs of equipment rooms, power consumption, and refrigeration also
decrease.
⚫ Prolonged service life of SSDs
 The SmartDedupe and SmartCompression features reduce the amount
of data that is written into SSDs and data write count, reducing the
wear on SSDs and prolonging their service life.
License Requirements
⚫ SmartDedupe and SmartCompression are two value-added features, and each requires a license.
⚫ In license files, the feature name of SmartDedupe is SmartDedupe (for LUN).
⚫ In license files, the feature name of SmartCompression is SmartCompression (for LUN).
How to Perform Deduplication?
1. Divide the data to be deduplicated into blocks (Block 0, Block 1, Block 2, ...).
2. Calculate fingerprints (FP0, FP1, FP2, ...) of the data blocks.
3. Check whether each fingerprint is in the fingerprint library.
4. Blocks whose fingerprints already exist are old blocks; the other blocks are new ones.
5. For old blocks, if their fingerprints are found in the library, add one to their reference count and return the existing data address. New blocks are written into the storage space.
6. Add the mapping between the fingerprint and address of each new block to the fingerprint library.
(Fingerprint library example: FP0 -> dataAddr0, rc 1; FP1 -> dataAddr1, rc 2; FP2 -> dataAddr2, rc 1.)
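The fingerprint-library lookup described above can be sketched in a few lines of Python. This is an illustrative sketch only (hash choice, names, and structures are assumptions, not the product implementation); note that Dorado V6 uses a weak hash plus byte-by-byte comparison, while the sketch uses a strong hash for brevity.

import hashlib

BLOCK = 8 * 1024                        # 8 KB deduplication granularity
fingerprint_table = {}                  # fingerprint -> {"addr": int, "rc": int}
storage = []                            # physical block store; index acts as the data address

def write_dedup(data):
    addresses = []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK].ljust(BLOCK, b"\0")
        fp = hashlib.sha256(block).hexdigest()          # fingerprint of the block
        entry = fingerprint_table.get(fp)
        if entry:                                       # old block: add a reference
            entry["rc"] += 1
            addresses.append(entry["addr"])
        else:                                           # new block: write data, record mapping
            storage.append(block)
            addr = len(storage) - 1
            fingerprint_table[fp] = {"addr": addr, "rc": 1}
            addresses.append(addr)
    return addresses

a = write_dedup(b"A" * BLOCK + b"B" * BLOCK)
b = write_dedup(b"A" * BLOCK)            # duplicate block: only a reference is added
assert b[0] == a[0] and len(storage) == 2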

How to Perform Compression?


A compression window slides backward over the data. Starting from the current character, the algorithm searches the window for the longest matching character string and outputs an (offset in the window, longest match length) pair; characters with no match are output as literals.
Example: for the data "abcdefg abc hj abchj", the repeated strings are exported after compression as references such as (0,3) hj (2,5) instead of being stored again.
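The window-matching idea above is the basis of LZ77-style dictionary compression. The following Python sketch is illustrative only (window size, minimum match length, and the token format are assumptions, not the product algorithm); it emits (offset, length) tokens for repeated strings and literals otherwise.

def lz77_compress(data, window=32):
    """Illustrative LZ77-style compressor: returns a list of
    ('match', offset, length) and ('lit', byte) tokens."""
    out, i = [], 0
    while i < len(data):
        start = max(0, i - window)
        best_off, best_len = 0, 0
        for j in range(start, i):                 # search the window for the longest match
            length = 0
            while (i + length < len(data)
                   and j + length < i             # match stays inside the lookback window
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        if best_len >= 3:                         # only encode matches worth referencing
            out.append(("match", best_off, best_len))
            i += best_len
        else:
            out.append(("lit", data[i]))
            i += 1
    return out

tokens = lz77_compress(b"abcdefg abc hj abchj")
print(tokens)   # the repeated "abc" substrings come out as ('match', offset, length) tokens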
When to Perform Deduplication? –
Inline vs. Post-Processing
Post-processing: Deduplicate Inline: Deduplicate and
and compress data after it is compress data before it
written to disks. is written to disks.

Compared with post-processing, inline deduplication and compression reduce operations on SSDs, especially write operations, which extends the service life of SSDs. That is why vendors use inline deduplication and compression in all-flash arrays.
How to Perform SmartDedupe&SmartCompression
on Dorado V6?
(Figure: applications App0, App1, and App2 access LUN0, LUN1, and LUN2; for each LUN you can enable only deduplication, only compression, or both deduplication and compression.)
You can choose whether to enable deduplication and compression when creating LUNs. When deduplication and compression are both enabled, data is deduplicated and then compressed.
How to Perform SmartDedupe&SmartCompression
on Dorado V6? (Fixed-Length Chunking)
Flowchart:
1. Write host data to the cache and return a success message to the host.
2. During flushing, divide the data into fixed-length data blocks.
3. Use the weak hash algorithm to calculate fingerprints of the data blocks.
4. Check whether the same fingerprint exists in the fingerprint table.
   - If not, compress the data block, write it to the SSD pool, and add the mapping between the fingerprint and the data to the fingerprint table.
   - If yes, obtain the address of the existing data from the fingerprint table, read it from the SSD pool (decompressing it if it is a compressed block), and compare the data byte by byte. If the data is completely the same, add one to the reference count; otherwise, treat the block as new data, compress it, and write it.
(Figure: a LUN is divided into 8 KB blocks from LBA 0 to LBA max based on the LBA address.)
⚫ Divide LUNs into blocks of a fixed length based on the LBA address.
⚫ The default deduplication and compression granularity of Dorado V6 is 8 KB. For deduplication, the granularity can be configured to 4 KB or 8 KB. For compression, the granularity can be configured to 4 KB, 8 KB, 16 KB, or 32 KB. The granularity cannot be changed after it is set.
⚫ The 8 KB granularity is used as an example. If the address range of a write operation is LBA 31 (15.5 KB) to LBA 33 (16.5 KB), the data needs to be divided into two data blocks. First create the two 8 KB blocks, and then deduplicate and compress them.
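The LBA-to-block alignment in the last bullet can be checked with a small calculation. The sketch below is illustrative (it assumes 512-byte LBAs, which matches LBA 31 = 15.5 KB in the example) and simply lists the 8 KB blocks that an I/O range touches.

LBA_SIZE = 512                 # assumption: 512-byte logical blocks (LBA 31 = 15.5 KB)
CHUNK = 8 * 1024               # 8 KB deduplication/compression granularity

def chunks_for_io(start_lba, end_lba):
    """Return the indices of the fixed-length 8 KB chunks covered by an I/O."""
    start_byte = start_lba * LBA_SIZE
    end_byte = (end_lba + 1) * LBA_SIZE          # end LBA is inclusive
    first = start_byte // CHUNK
    last = (end_byte - 1) // CHUNK
    return list(range(first, last + 1))

# A write to LBA 31 (15.5 KB) through LBA 33 (16.5 KB) spans two 8 KB chunks:
print(chunks_for_io(31, 33))   # -> [1, 2]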
How to Perform SmartDedupe&SmartCompression
on Dorado V6?
1. Write the host data to the data cache of the LUN.
2. Return a write success message to the host.
3. Divide the data into blocks of the fixed length during flushing.
4. Use the weak hash algorithm to calculate fingerprints.
5. Check whether the same fingerprint exists in the fingerprint library.
6. If the fingerprints are the same, read the existing data through its address and compare the data byte by byte. If the results are the same, add one to the reference count.
7. If the fingerprints are different, or the byte-by-byte comparison results are different, compress this data block.
8. Write the compressed data to SSDs, save the mapping between the fingerprint and the data address on SSDs to the fingerprint table, and set the reference count to 1.
(Fingerprint library example: FP0 -> dataAddr0, rc 3; FP1 -> dataAddr1, rc 2; ...)
What Are Characteristics of Dorado V6
SmartDedupe&SmartCompression?
1. Provides inline SmartDedupe&SmartCompression function
– Inline deduplication and compression reduce the number of writes and the amount of data written to SSDs, extending the service life of SSDs.
2. SmartDedupe&SmartCompression can be configured when LUNs are created.
– Deduplication and compression functions can be configured for different scenarios. The optimal
reduction result can be achieved with less impact on the system.

3. Deduplication supports 4 KB or 8 KB granularities and compression supports 4 KB, 8 KB, 16 KB, or 32 KB granularities.
– Different deduplication and compression granularities can be configured for different applications to
achieve the optimal reduction result under the best configurations.

4. An industry-standard compression algorithm is used to ensure high reliability of data.
– An algorithm that is widely used throughout the industry and has been verified in real-world scenarios is used to ensure data reliability during compression.
5. The weak hash algorithm and byte-by-byte comparison are used to ensure safe deduplication.
– The byte-by-byte comparison eliminates the risk of hash collisions, so only truly identical blocks are deduplicated.
Application Scenarios
Data compression occupies extra CPU resources: the more data the storage system processes, the higher the overhead. The scenarios below are examples where compression is particularly effective; this does not mean that deduplication has no effect in those scenarios.
1. Database: Databases are the best application scenario for data compression. A large amount of data needs to be saved in a database, and many users can save more than 65% of storage space with only a slight impact on storage system performance.
2. File service: The file service is also a common application scenario for data compression. In file systems that are busy 50% of the time and whose data compressibility is 50%, enabling the compression function only slightly decreases the input/output operations per second (IOPS).
3. Engineering, seismic, and geological data: The characteristics of engineering, seismic, and geological data are similar to those of database backups. The data is saved in the same format, but data similarity is low, so the compression function can save storage space.
Application Scenarios Where
Deduplication Interworks with
Compression
Deduplication interworks with compression to effectively save storage space.
Scenarios:
1. VDI/VSI scenarios
2. Data test or development systems
3. File service systems
4. Engineering data systems
Application Restrictions of
Deduplication and Compression
Interworking
The amount of storage space saved by deduplication and compression depends on the data type. In actual application scenarios, the following restrictions apply:
1. Deduplication and compression are not recommended for non-repetitive archive data, for example, image files and encrypted data.
2. Deduplication and compression are not recommended for data that has already been compressed or encrypted by hardware devices or applications (including backup and archive applications).
How to Configure
SmartDedupe&SmartCompression
on Dorado V6?

⚫ You can choose whether to enable SmartDedupe&SmartCompression when creating LUNs. You cannot change the settings or disable the function once it is enabled.
Deduplication and Compression Ratio in Typical
Scenarios
Note: Based on a survey of over 20 industry participants and end users in October 2014. D = deduplication, C = compression.
⚫ Media (C): 1.2 to 1.5:1
⚫ Data warehouse (C): 2 to 4:1
⚫ High-performance computing / analytics (C): 3 to 4:1
⚫ OLTP transactions (D, C): 3 to 4:1
⚫ Email (D, C): 4 to 6:1
⚫ VSI (D, C): 5 to 8:1
⚫ VDI (D, C): 7 to 12:1
Note: Databases and analytics often have application-level data reduction.
Average data reduction ratio: 3 to 5:1
(Figure: workloads range from latency-intensive to IOPS-intensive to bandwidth-intensive and are plotted against data reduction ratio, from low to high.)
Viewing the Result of
SmartDedupe&SmartCompression
⚫ Deduplication ratio: Total
amount of data written to the
LUN whose deduplication is
enabled/total amount of data
after deduplication, which
reflects the effect of
deduplication.
⚫ Compression ratio: Total
amount of data written to the
LUN whose compression is
enabled/total amount of data
after compression, which
reflects the effect of
compression.
⚫ Data reduction ratio: Total
amount of data written to the
system/occupied disk space
(excluding metadata).
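As a worked example of these three ratios, the short calculation below uses made-up numbers (100 GB written, halved by deduplication and halved again by compression) and assumes deduplication runs before compression, as described earlier for Dorado V6.

written = 100.0                 # GB written by hosts to LUNs with both features enabled
after_dedupe = 50.0             # GB remaining after deduplication
after_compression = 25.0        # GB actually stored on disk (excluding metadata)

dedupe_ratio = written / after_dedupe                  # 2:1
compression_ratio = after_dedupe / after_compression   # 2:1
data_reduction_ratio = written / after_compression     # 4:1
print(dedupe_ratio, compression_ratio, data_reduction_ratio)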
OceanStor Dorado
V6 Storage Systems
HyperClone
Background and Definition
⚫ By generating multiple physical copies of a source LUN or
snapshot, HyperClone allows multiple services to access
these copies concurrently without affecting data in the
source LUN or snapshot.

⚫ Splitting a clone generates a full physical copy of the source LUN or snapshot data at the point in time at which the clone was created, without interrupting services. After the clone is split, reads and writes on the physical copy have no impact on the source LUN or snapshot data.
Purpose and Benefits
⚫ 1. Clones can serve as data sources for backup and
archiving.

⚫ 2. A clone can be generated instantly without affecting host services. It is a full data duplicate of the source LUN
or snapshot at a specific point in time. Data in a clone is
physically isolated from the data in the source LUN or
snapshot.
License Requirements
⚫ HyperClone is a value-added feature that requires a
license.

⚫ In the license file, its feature name is HyperClone.
Terminology
⚫ Source volume: The volume that stores the source data to be cloned. It is presented as a LUN or snapshot to users.
⚫ Clone volume: A logical data duplicate that is generated after a clone is created for a source volume. It is presented as a clone LUN to users.
⚫ Redirect on write (ROW): ROW is a core snapshot technology. When data is changed, the storage system writes the new data to a new location and directs a pointer for the modified data block to the new location. The old data then serves as snapshot data.
⚫ Clone split: Clone split generates a full physical copy of the data that a clone shares with the source LUN or snapshot.
Working Principles
⚫ Definition: A clone is a copy of source data at a particular point in time. It can be split from the source
data and function as a complete physical data copy. A clone can serve as a data backup and is accessible
to hosts.

Characteristics:
✓ Quick clone generation: A storage system can generate a clone within several seconds. A clone can be
read and written immediately after being created. Users can configure deduplication and compression
attributes for a clone.

✓ Online splitting: A split can be performed to cancel the association between source and clone LUNs
without interrupting services. After a split, any later changes to the data on the clone LUN will not affect
the data on the source LUN.

(Figure: creating a clone and then splitting the clone; after the split, the clone holds a full copy of the source data.)
Key Technology: Creating a Clone
1. After a clone LUN has been created, it shares the data of its source LUN if no changes are made to the data on either LUN. A snapshot ensures data consistency at the point in time at which the clone is created.
2. When an application server reads data from the clone LUN, it actually reads the source LUN's data.
3. HyperMetro cannot be implemented on a clone LUN before it is split.
(Figure: the source LUN, snapshot, and clone LUN share data blocks A, B, C, and D.)
Key Technology: Reading and
Writing a Clone LUN
1. When an application server writes new data to an existing data block in the source LUN, the storage system allocates new storage space for the new data instead of overwriting the data in the existing storage space.
2. When an application server writes new data to an existing data block in the clone LUN, the storage system allocates new storage space for the new data instead of overwriting the data in the existing storage space.
(Figure: new data A1 is written for the source LUN and D1 for the clone LUN, while the shared blocks A, B, C, and D remain unchanged.)
Key Technology: Splitting a Clone
(1/3)
1. When a clone LUN is split, the storage system copies the data that the clone LUN shares with the source LUN to new data blocks, and retains the new data that has already been written to the clone LUN.
2. When a host writes new data to a clone LUN during clone splitting, the data is written to both the source and target storage spaces of the split task. If a split is canceled before it is complete, the data in the source space is retained, but the data in the target space is cleared.
(Figure: shared blocks A, B, and C are copied to A', B', and C' on the clone LUN, which also retains its own new data D1.)
Key Technology: Splitting a Clone
(2/3)
1. After splitting is complete, the association between the source and clone LUNs is canceled and the clone LUN becomes an independent physical copy.
2. After the split, the storage system automatically reclaims the snapshot data on which the clone depended.
(Figure: the clone LUN now holds its own copies A', B', C' plus its new data D1.)
Key Technology: Splitting a Clone
(3/3)
1. After splitting is complete, the association between the source and clone LUNs is canceled and the clone LUN becomes an independent physical copy.
2. After being split, the clone LUN has the same properties as a common LUN and supports all replication services. The capacity of a clone LUN is equal to the copied data volume.
Application Scenarios
Data backup, application testing, and data analysis

(Figure: a source LUN has three clone LUNs; an application test server, a data analysis server, and a backup server each issue read and write I/Os to their own clone LUN without affecting the source LUN.)
Creating a Clone (for a LUN)
Creating a Clone (for a Snapshot)
Creating a Clone
Querying a Clone
Querying a Clone
Splitting a Clone
Stopping Splitting a Clone
Deleting a Clone
OceanStor Dorado V6
Storage Systems
HyperSnap Introduction
Background and Definition
⚫ A snapshot is a mirror of a data set at a specific point in
time. It can also be called an instant copy. The snapshot
itself is a complete usable copy of the data set.

⚫ A snapshot is defined by the Storage Networking Industry Association (SNIA) as a fully usable copy of a defined
collection of data that contains an image of the data as it
appeared at the point in time at which the copy was
initiated. A snapshot can be a duplicate or replicate of data.
Purpose and Benefits
1. Snapshots can serve as data sources for backup and
archiving.
2. Data duplicates can be flexibly generated at multiple
points in time, enabling fast data recovery if necessary.

3. A snapshot is instantaneously generated without interrupting host services. It is a data duplicate of the
source data at a specific point in time.
License Requirements

⚫ Snapshot is a value-added feature that requires a license.
⚫ In the license file, the feature name of snapshot is HyperSnap.
Terminology
⚫ Source volume: The volume that stores the source data for which snapshots are generated. It is presented as a LUN to users.
⚫ Snapshot volume: A logical data duplicate that is generated after a snapshot is created for a source volume. It is presented as a snapshot LUN to users.
⚫ Redirect on write (ROW): In overwrite scenarios, space is reallocated; the new data is written to a new location, and the original space is released after the host's write succeeds.
⚫ Snapshot rollback: The data of a snapshot LUN is copied to the source volume. In this way, the data of the source volume can be recovered to the data at the point in time at which the snapshot LUN was generated.
⚫ Inactive: A state of a snapshot in which the snapshot is unavailable. The opposite state is Activated.
Working Principles
⚫ Definition: A snapshot is a point-in-time copy of source data. A snapshot
serves as a data backup and is accessible to hosts.
The snapshot technology has the following features:
✓ Quick snapshot generation: A storage system can generate a virtual snapshot
within several seconds.
✓ Minimal storage space consumption: A snapshot is not a full physical data
copy. Therefore, even if the amount of source data is large, a snapshot
occupies only a small amount of storage space.

(Figure: a snapshot taken at 08:00 AM preserves the source volume's data image; by 09:00 AM some source blocks have changed, but the snapshot still presents the 08:00 AM data.)
Working Principles — Lossless Performance
(Figure: the source LUN mapping table maps L0->P0 through L4->P4, then L2->P5 and L2->P7 after rewrites; the snapshot mapping tables record L0->P6 and L2->P8. All of these addresses point into the same SSD storage space P0 to P8.)
This feature does not affect the read/write performance of the source LUN, and snapshot performance is comparable to source LUN performance.
1. Data written to L2 of the source LUN is written directly to a new space, P5. The original space P2 is still referenced by the snapshot.
2. Data written to L0 of snapshot 1 is written directly to a new space, P6, bringing no additional read and write overhead.
3. Data written to L2 of the source LUN again is written directly to a new space, P7. The original space P5 is released because it is not referenced by any snapshot.
4. Recreate and activate snapshot 2.
5. Data written to L2 of snapshot 2 is written directly to a new space, P8, bringing no additional read and write overhead.
Working Principles — Snapshot Rollback
1. Create and activate a snapshot.
2. Data is written properly (for example, Data4 is written).
3. Data is damaged: Data4 is unintentionally deleted, overwritten, or infected with a virus.
4. Use the snapshot data to restore Data4. When the snapshot rollback is complete, the data is restored.
Working Principles — Snapshot Cascading
and Cross-Level Rollback

1. Snapshot cascading is to create a child snapshot for a parent snapshot. The child snapshot shares the data of its parent snapshot.
2. Cross-level rollback indicates that snapshots that share a source volume can be rolled back to each other regardless of their cascading levels.
(Figure: a source volume has Snapshot0 at 08:00 and Snapshot1 at 09:00; Snapshot1 in turn has cascaded snapshots Snapshot1.snapshot0 at 10:00 and Snapshot1.snapshot1 at 11:00.)
Working Principles — Timing Snapshots

1. Two timing policies are supported: a fixed interval, or a fixed point in time every day or every week.
2. The system supports 512
different timing schedules.
Each schedule supports
128 source LUNs. Each
source LUN in each
schedule supports 256
timing snapshots. Each
LUN supports only one
timing schedule.
Key Technologies — ROW
(Figure: redirect-on-write. The source volume mapping table maps logical addresses L0 to L4 to physical addresses P0 to P4. When L2 is overwritten, the new data is redirected to a new physical address, P5, and the snapshot mapping table keeps the old mapping L2->P2. L: logical address; P: physical address.)
Key Technologies — Snapshot Duplicate
⚫ How can I obtain multiple copies of data that is generated
based on the same snapshot?

Snapshots are virtual and can be replicated within a short time.
(Figure: the source volume's 8:00 snapshot is duplicated into multiple 8:00 snapshot copies.)
Key Technologies — Restore on Write

⚫ How can data be instantly restored?
(Figure: the 08:00 snapshot is rolled back to the source volume.)
During the rollback, when the host writes to an address, the snapshot data for that address is copied to the source volume first and the host write is then applied.
If there are no access requests, data on the snapshot is rolled back to the source volume in sequence.
Key Technologies — Reading a Snapshot
1. Receive a read request.
2. Generate the address index (key) specific to the request.
3. Obtain the data from the pool based on the corresponding address index (0,0) in the snapshot's mapping table. If no data is available there, obtain it through the origin volume's mapping table.
(Figure: the origin volume's mapping table and the snapshot's mapping table both map keys to disk offsets in the pool.)
Key Technologies — Writing a Snapshot
1. Receive a write request (new data a').
2. Generate a key, reallocate space (0,1), and write the new data to the pool.
3. Reclaim the unreferenced space (0,0).
(Figure: the snapshot's mapping table entry for key 0 is updated from disk offset 0 to disk offset 1; the old data a at offset 0 is reclaimed and a' is stored at offset 1.)
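A minimal sketch of the ROW read/write path described in these two slides (names and structures are assumptions, not product code): the snapshot keeps its own sparse mapping table and falls back to the source volume's table on reads, while writes allocate new pool space and release the old, unreferenced location.

pool = {}                        # disk offset -> data
next_offset = 0

def alloc(data):
    global next_offset
    pool[next_offset] = data
    next_offset += 1
    return next_offset - 1

source_map = {}                  # source volume: key -> disk offset
snap_map = {}                    # snapshot: key -> disk offset (sparse)

def snap_read(key):
    # Use the snapshot's own mapping if present, else fall back to the source volume.
    off = snap_map.get(key, source_map.get(key))
    return pool.get(off)

def snap_write(key, data):
    old = snap_map.get(key)
    snap_map[key] = alloc(data)          # redirect the write to newly allocated space
    if old is not None and old not in source_map.values():
        del pool[old]                    # reclaim the old, unreferenced location

source_map[0] = alloc(b"a")              # source volume holds 'a' at key 0
print(snap_read(0))                      # b'a'  (shared with the source volume)
snap_write(0, b"a'")
print(snap_read(0), pool[source_map[0]]) # b"a'" for the snapshot, b'a' still on the source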
Application Scenarios
The following figure shows how HyperSnap and BCManager work
jointly to implement data backup and recovery.

(Figure: BCManager delivers a snapshot policy to the storage system, which generates snapshots. Rollback provides quick restoration of the source volume; reactivating a snapshot provides quick synchronization of data changes made to the source LUN.)
Application Scenarios
Continuous data protection
(Figure: snapshots of a source LUN are taken at 01:00:00, 02:00:00, 03:00:00, and 04:00:00.)
If a source LUN covered by continuous data protection is damaged, the source LUN's data can be restored to any point in time preserved by snapshots.
Application Scenarios
Re-purposing of data

(Figure: a snapshot is created on the source LUN, and snapshot duplicates are created from it. The duplicates are read for report generation, data tests, data analysis, and decision support.)
Going to the HyperSnap Configuration Page
Creating a Snapshot
Rolling Back a Snapshot
Reactivating a Snapshot
Creating a Snapshot Copy
Deactivating a Snapshot
Deleting a Snapshot
Configuring a Timing Snapshot Schedule
State Transition Diagram
(Figure: snapshot states and transitions. Creating and activating a snapshot places it in the activated state, from which a snapshot duplicate can be created. Rolling back a snapshot enters the rollback state; when the rollback is completed or stopped, the snapshot returns to the activated state. Deactivating a snapshot enters the deactivated state, and a snapshot can then be deleted.)
OceanStor Dorado V6
Storage Systems
HyperReplication
Feature Overview
⚫ Remote replication: Remote replication is the core technology in disaster recovery (DR) backup. It can be used for remote data synchronization and DR. Remote replication allows you to remotely maintain one or multiple data copies on a storage system at another site. If a disaster occurs at one site, the data copies at the other site are not affected and can be used for DR.
⚫ Synchronous remote replication: Data is synchronized in real time to fully protect data consistency and minimize data loss in the event of a disaster.
⚫ Asynchronous remote replication: Data is synchronized periodically to minimize the service performance deterioration caused by the long latency of long-distance data transmission.
⚫ Remote replication consistency group: A consistency group is a collection of multiple remote replication sessions that ensures data consistency in scenarios where a host writes data to multiple LUNs on a single storage system. After data is written to a consistency group at the primary site, all data in the consistency group is simultaneously copied to the secondary LUNs using the synchronization function of the consistency group. This ensures the integrity and availability of the data used for backup and DR.
Feature Overview
⚫ Purpose and Benefits

⚫ Remote backup and recovery: recovers service data by using backup data at the remote end. Benefit: avoids data loss after the data at the primary site becomes invalid.
⚫ Business continuity support: quickly switches services from the primary site to the secondary site, thereby ensuring business continuity in case of a disaster. Benefit: avoids losses caused by service interruption after the primary site becomes faulty.
⚫ Disaster recovery: recovers data at the primary site by using backup data at the secondary site in case of a disaster. Benefit: avoids losses caused by service data loss or long recovery duration after a disaster.
Working Principles — Synchronous
Remote Replication
(Figure: a host writes to the primary storage system, which replicates the write to the secondary storage system over remote replication links.)
1. The host writes data block N to the primary storage system, which records the difference in the LOG.
2. Data block N is written to the primary cache and, at the same time, sent over the remote replication links to the secondary cache. Each cache returns a write I/O result.
3. The secondary storage system returns the write I/O result to the primary storage system.
4. The primary storage system deletes the log if all writes are successful, or saves the log to the DCL upon any write failure.
5. The primary storage system returns the write I/O result to the host.
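A minimal sketch of this synchronous write path, under the assumption that both caches must acknowledge before the host is answered (illustrative names, not product code):

primary_cache, secondary_cache = {}, {}
dcl = set()                      # data change log: blocks to resynchronize later

def write_cache(cache, block_id, data):
    cache[block_id] = data
    return True

def sync_replicated_write(block_id, data):
    log = block_id                              # 1. record the difference in the log
    primary_ok = write_cache(primary_cache, block_id, data)      # 2. local write
    secondary_ok = write_cache(secondary_cache, block_id, data)  # 2. remote write
    if primary_ok and secondary_ok:
        pass                                    # 4. all writes succeeded: drop the log
    else:
        dcl.add(log)                            # 4. any failure: keep the difference in the DCL
    return primary_ok                           # 5. return the write result to the host

sync_replicated_write(42, b"N")
assert primary_cache[42] == secondary_cache[42] == b"N" and not dcl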


Working Principles — Asynchronous
Remote Replication
(Figure: a host writes to the primary storage system; data is periodically synchronized to the secondary storage system over remote replication links, using snapshots of the primary and secondary LUNs and the DCL difference bitmap.)
1. The host writes data block N to the primary storage system.
2. The data block is written to the primary cache, the difference is recorded in the DCL, and the write I/O result is returned to the host.
3. When a synchronization period starts, the primary LUN snapshot and the secondary LUN snapshot are activated.
4. The incremental data recorded in the DCL is synchronized to the secondary cache.
5. The difference in the DCL is cleared.
6. The primary LUN snapshot and the secondary LUN snapshot are stopped.
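A condensed sketch of one asynchronous replication period (illustrative names, not product code): host writes are acknowledged locally and tracked in the DCL, and each period a frozen view of the primary LUN is used to ship only the changed blocks.

primary_lun, secondary_lun = {}, {}
dcl = set()                                  # blocks changed since the last period

def host_write(block_id, data):
    primary_lun[block_id] = data             # write locally, acknowledge immediately
    dcl.add(block_id)                        # record the difference in the DCL

def sync_period():
    frozen = dict(primary_lun)               # snapshot of the primary LUN for this period
    changed = set(dcl)
    dcl.clear()                              # new host writes go into the next period
    for block_id in changed:                 # replicate only the incremental data
        secondary_lun[block_id] = frozen[block_id]

host_write(1, b"A"); host_write(2, b"B")
sync_period()
assert secondary_lun == {1: b"A", 2: b"B"}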
Comparison Between Synchronous and
Asynchronous Remote Replication
⚫ Data synchronization period: synchronous replication synchronizes in real time; asynchronous replication synchronizes periodically.
⚫ Data amount of each synchronization: with synchronous replication, the primary and secondary LUNs are kept synchronized in real time; with asynchronous replication, it depends on the amount of data changed on the primary LUN within a synchronization period.
⚫ Impact on the primary LUN: large for synchronous replication; small for asynchronous replication.
⚫ RPO (data loss): 0 for synchronous replication; for asynchronous replication, it depends on the amount of data changed on the primary LUN within a synchronization period (minimum period: 3s).
⚫ Applicable to: intra-city backup and DR for synchronous replication; inter-city backup and DR for asynchronous replication.
⚫ Number of secondary LUNs supported by a pair: 1 for both synchronous and asynchronous replication.
Remote Replication — Consistency Group
The example compares three remote replication sessions (01, 02, and 03) between a primary and a secondary storage system, with and without consistency group 01.
1. Initial status: Sessions 01, 02, and 03 replicate primary LUNs 01, 02, and 03 to secondary LUNs 01, 02, and 03. In one case no consistency group is used; in the other, the three sessions belong to consistency group 01.
2. Data replication: Remote replication session 02 fails while sessions 01 and 03 succeed.
   - Without a consistency group, the data in the primary and secondary storage systems becomes inconsistent.
   - With a consistency group, sessions 01 and 03 in consistency group 01 are stopped and wait for fault recovery, so the secondary LUNs stay at a consistent point in time.
3. Data recovery: After a disaster occurs, the secondary storage system is used for data recovery.
   - Without a consistency group, the data on the secondary LUNs is invalid for recovery because it is not data of the same point in time.
   - With a consistency group, the data on the secondary LUNs is valid for data recovery.
Application Scenario 1: Centralized
Disaster Backup
(Figure: multiple service sites replicate to a central backup site. Service site 01 uses synchronous remote replication (session 01) from primary LUN 01 to secondary LUN 01; service site 02 through service site n use asynchronous remote replication (sessions 02 to n), with snapshots, from primary LUNs 02 to n to secondary LUNs 02 to n.)
Application Scenario 2: Two-Site
Active-Active Service Continuity
Key Technologies
⚫ Multi-Point-in-Time Caching Technology
⚫ Secondary-LUN Write Protection Cancelation Technology
(Secondary LUNs Writable)

⚫ Multi-Link Redundancy Technology
⚫ Variable-Granularity Small DCL Bitmap Technology
Multi-Point-in-Time Caching
Technology —Second-Level RPO
One consistency point is created every 3s at minimum.
(Figure: the production center's primary LUN cache and the disaster recovery center's secondary LUN cache each hold time slices, T1/T2 and P1/P2, linked by asynchronous remote replication.)
1. When a replication period starts, new time slices (T2 and P2) are generated in the primary and secondary LUN caches respectively.
2. New data from the host is cached in T2 of the primary LUN.
3. The host receives a write success response.
4. Data in T1 is replicated to P2 of the secondary LUN.
5. Both LUNs flush the received data to disks.
• Data is read directly from the cache, so the latency is short.
• The snapshot does not require real-time data updates based on copy-on-write (COW). The synchronization has a minor impact on performance, and the synchronization period is shortened to 3s.
Secondary-LUN Write Protection
Cancelation
Definition
With this technology, the secondary LUN is able to receive data from the host. If the primary LUN becomes faulty, the administrator can cancel secondary LUN write protection to make the secondary LUN writable. This enables the secondary-end storage system to take over host services and ensure service continuity.
Application Scenarios
➢ Users need the secondary LUN for data analysis and mining without affecting services on the primary LUN.
➢ The DR storage array needs to take over services upon a fault in the production storage array, but a primary/secondary switchover cannot be completed normally.
Advantages
This technology accelerates service recovery. In addition, after the secondary LUN is read and written, an incremental synchronization can be performed, enabling services to be switched back rapidly after disaster recovery.
(Figure: the production center and DR center, each with an OceanStor Dorado V6, are connected over a WAN for synchronous/asynchronous replication. When the primary end sends a disaster message, the secondary host reads and writes the DR data.)
Multi-Link Redundancy Technology

(Figure: controllers A and B of Engine0 and Engine1 at both sites are interconnected through redundant iSCSI and FC replication links.)
Multi-Link Redundancy Technology
⚫ Specifications:
 Each controller provides a maximum of 8 links for supporting
remote replication.

⚫ Characteristics:
 Links have a mutually redundant relationship. As long as one link is
available, the replication service will run smoothly.

 The load is balanced among multiple links, with the optimal paths
always preferred.
Variable-Granularity Small DCL
Bitmap Technology
⚫ Context:
DCLs are logs recording differentiated data. Their chunk granularity is 64 KB. In
the event that small-granularity (< 64 KB) I/Os require chunk replication, small
bitmap technology is used. A 64-KB chunk is divided into 4-KB chunks to
record data differences, with the query-returned chunk granularity being 4 KB
x N (N ranges from 1 to 16). That is, N pieces of differentiated data with
consecutive addresses are combined as a chunk.
(Figure: a 64 KB chunk is divided into sixteen 4 KB sub-chunks, numbered 0 to 15.)
⚫ Advantages:
1. Reduces the amount of replicated data, shortens synchronization
duration, and improves replication performance.
2. Mitigates data loss and lowers RPO.
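The combining rule described above (N consecutive changed 4 KB sub-chunks reported as one 4 KB x N extent) can be sketched as follows; the bitmap layout and names are illustrative assumptions, not the product data structure.

SUB = 4 * 1024                      # 4 KB sub-chunk granularity
bitmap = [False] * 16               # 16 x 4 KB = one 64 KB DCL chunk

def mark_dirty(offset, length):
    """Mark the 4 KB sub-chunks touched by a small write as changed."""
    for i in range(offset // SUB, (offset + length - 1) // SUB + 1):
        bitmap[i] = True

def dirty_extents():
    """Combine consecutive changed sub-chunks into (offset, length) extents of 4 KB x N."""
    extents, i = [], 0
    while i < 16:
        if bitmap[i]:
            start = i
            while i < 16 and bitmap[i]:
                i += 1
            extents.append((start * SUB, (i - start) * SUB))
        else:
            i += 1
    return extents

mark_dirty(0, 6 * 1024)             # a 6 KB write dirties sub-chunks 0 and 1
mark_dirty(40 * 1024, 4 * 1024)     # a 4 KB write dirties sub-chunk 10
print(dirty_extents())              # [(0, 8192), (40960, 4096)] -> only 12 KB is replicated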
Application Planning of Remote
Replication
⚫ Mirrors data from the production center to the disaster recovery center.
⚫ Enables the disaster recovery center to take over services in case of a disaster
in the production center.
⚫ Restores data to the production center after the production center recovers.

(Figure: the production center and the disaster recovery center each run an OceanStor storage system on an IP SAN/FC SAN; remote data mirroring runs over the IP/FC network between them.)
Typical Networking and Connections
for Remote Replication
Synchronous Remote Replication's Bandwidth and Network Requirements

For synchronous remote replication, a write success response is returned only after the data in each write request has been written to both the primary and secondary sites. If the primary site is far away from the secondary site, the write latency of foreground applications becomes quite long, affecting foreground services.
⚫ Normally, the DR distance over a metropolitan area network (MAN) is less than 200 km.
⚫ The minimum connection bandwidth must be at least 64 Mbit/s.
⚫ The transmission latency must be less than 1 ms (one-way).
⚫ The actual network bandwidth must be larger than the maximum write I/O bandwidth.
Typical Networking and Connections
for Remote Replication
Asynchronous Remote Replication's Bandwidth and Network Requirements

For asynchronous remote replication, the write latency of foreground applications is independent of the distance between the primary and secondary sites. As a result, asynchronous remote replication is applied in disaster recovery scenarios where the primary and secondary sites are far away from each other or the network bandwidth is limited. No specific distance requirements are imposed for WAN disaster recovery.
⚫ The minimum connection bandwidth must be at least 10 Mbit/s (two-way).
⚫ The transmission latency must be less than 50 ms (one-way).
⚫ The actual network bandwidth must be larger than the average write I/O bandwidth.
Typical Networking and Connections
for Remote Replication

(Figure: controllers A and B of Engine0 and Engine1 at both sites are interconnected through redundant iSCSI and FC replication links.)
Typical Networking and Connections
for Remote Replication
(Figure: the production center and DR center each connect hosts to an OceanStor storage system over a SAN; the replication data flow for synchronous/asynchronous replication runs between the two storage systems over a LAN/WAN.)
Deployment and Configurations
Start
1. Check whether the asynchronous remote replication function is available: check the license files for remote replication.
2. Set up a connection between the primary and secondary storage systems: add a remote device. Manage routes if you want to connect the primary and secondary storage systems through iSCSI host ports when the host ports of the storage systems are in different network segments.
3. Create a remote replication session: create an asynchronous remote replication session.
4. (Optional) Create a consistency group: to ensure time-consistent LUN data across multiple remote replication pairs, create a consistency group.
End
Configuring Remote Replication —
Checking the License Files
Configuring Remote Replication —
Adding a Remote Device
Configuring Remote Replication —
Creating a Remote Replication
Configuring Remote Replication —
Creating a Remote Replication
Configuring Remote Replication —
Setting Attributes
OceanStor Dorado V6
Storage Systems
HyperMetro Introduction
Background
(Figure: traditional active-passive storage versus active-active data centers. In both cases FusionSphere runs across data center A and data center B; with Huawei HyperMetro, both data centers actively carry services.)
• If the production center is affected by a disaster such as a power failure, fire, flood, or earthquake, you must switch services from the production center to the disaster recovery (DR) center. Services are interrupted for a long time and service continuity cannot be ensured.
• The DR center remains idle most of the time, wasting resources.
Definition of HyperMetro

HyperMetro is Huawei's active-active storage solution. Two data centers serve as a backup for each other, and both of them are continuously running. If one data center fails, services are automatically switched to the other one.
Networking Overview
1. Network of hosts and storage arrays: a
network through which hosts can read
data from or write data to storage
arrays

2. Active-active replication network: a
network that supports data
synchronization and heartbeat
information synchronization between
storage arrays

3. Same-city network between data
centers: a network that synchronizes
data between data centers
4. Quorum network: a network through
which arbitration information is sent
from the quorum server to arrays
HyperMetro Arbitration Mechanism
1. Quorum server mode
• Application scenario: A third-place quorum server is deployed.
• Working principle: If heartbeat communication between the two storage arrays fails, each storage array sends an arbitration request to the quorum server. The storage array that wins arbitration continues providing services while the storage array that loses arbitration stops providing services. The preferred site takes precedence in arbitration.
2. Static priority mode
• Application scenario: The third-place quorum server is faulty.
• Working principle: If heartbeat communication between the two storage arrays fails, the storage array that is preset with arbitration precedence continues providing services.
(Figure: storage arrays A and B share a storage resource pool; when the replication link between them fails, each contacts the quorum server, and arbitration favors the preferred site.)

• When the quorum server fails, HyperMetro automatically enters static priority mode. The two arrays still work normally.
• When communication between arrays A and B fails, the preferred site continues
working while the array at the non-preferred site stops working.
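A minimal sketch of the two arbitration modes described above (purely illustrative; names and return values are assumptions, not product logic):

def arbitrate(heartbeat_ok, quorum_server_ok, quorum_winner, preferred):
    """Return the set of arrays that keep serving I/O after a heartbeat check."""
    if heartbeat_ok:
        return {"A", "B"}                      # both arrays keep running
    if quorum_server_ok:
        return {quorum_winner}                 # quorum server mode: the winner keeps serving
    return {preferred}                         # static priority mode: the preferred site keeps serving

print(arbitrate(True,  True,  "A", "A"))   # {'A', 'B'}
print(arbitrate(False, True,  "B", "A"))   # {'B'}  the quorum server picked array B
print(arbitrate(False, False, None, "A"))  # {'A'}  fall back to the preferred (static priority) site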
Why Are Arbitration and Dual-Arbitration
Needed?
Without arbitration (HyperMetro link failure or device fault):
➢ If the quorum server is not deployed and the communication between the two arrays fails, the following conditions may occur:
• Both storage arrays A and B keep providing services, so a split-brain occurs.
• Both arrays A and B stop providing services, so services are interrupted.
With a single quorum server:
➢ As a component in the active-active solution, the quorum server itself may have reliability risks.
➢ If the quorum server fails and the communication between the two storage arrays also fails, a split-brain may occur or services may be interrupted.
Why Are Arbitration and Dual-Arbitration
Needed?
Dual-arbitration

(Figure: arrays A and B are served by an active quorum server and a standby quorum server; the diagrams show failures of the active quorum server, of its links, and of array A.)
➢ If the active quorum server fails, storage arrays A and B negotiate to switch arbitration
to the standby quorum server. If storage array A fails later, the standby quorum server
implements arbitration.
➢ If links between the active quorum server and storage array B are down, storage arrays
A and B negotiate to switch arbitration to the standby quorum server. If storage array A
fails later, the standby quorum server implements arbitration.
Arbitration Policies in Static Priority Mode
HyperMetro
No. 1: The link between the two storage arrays breaks down. HyperMetro pair running status: To be synchronized. Arbitration result: LUNs of array A run services and LUNs of array B stop.
No. 2: The storage array in data center B (non-preferred site) malfunctions. Running status: To be synchronized. Arbitration result: LUNs of array A run services and LUNs of array B stop.
No. 3: The storage array in data center A (preferred site) malfunctions. Running status: To be synchronized. Arbitration result: LUNs on both arrays stop. You must forcibly start the storage array in data center B to enable it to provide services for hosts.
(In the diagrams, the black line between the two data centers indicates the HyperMetro replication network.)
Arbitration Policies in Quorum Server Mode
HyperMetro HyperMetro
The cases, their HyperMetro pair running status, and arbitration results are as follows:
1. Running status Normal: LUNs of arrays A and B keep running services. The arbitration mode of HyperMetro automatically changes from quorum server mode to static priority mode.
2. Running status Normal: LUNs of arrays A and B keep running services.
3. Running status To be synchronized: LUNs of array A stop and LUNs of array B keep running services.
4. Running status To be synchronized: If data center A is the preferred one, LUNs of array A continue running services while LUNs of array B stop running services.
5. Running status To be synchronized: Simultaneous failure: LUNs of both arrays A and B stop. You must forcibly start the storage array in data center B to enable it to provide services for hosts.
6. Running status To be synchronized: LUNs of array A stop and LUNs of array B keep running services.
7. Running status To be synchronized: Simultaneous failure: LUNs of both arrays A and B stop. You must forcibly start the storage array in data center B to enable it to provide services for hosts.
8. Running status To be synchronized: Two faults with an interval greater than 20s: LUNs of array A keep running services while LUNs of array B stop. Simultaneous failure or two faults with an interval smaller than 20s: LUNs on both arrays stop; you must forcibly start the storage array in data center A or B to enable it to provide services for hosts.
9. Running status Normal: LUNs of arrays A and B keep running services.
(In the diagrams, the black line between the two data centers indicates the HyperMetro replication network.)
HyperMetro Dual-Write Process
HyperMetro dual-write process
◼ Dual-write of I/Os, ensuring real-time data
consistency
1. A host delivers a write I/O to the
HyperMetro management module.
2. A log is recorded.
3. The HyperMetro management module
concurrently writes the write I/O to both
the local cache and remote cache.
4. The local cache and remote cache return
the write I/O result to the HyperMetro
management module.
5. The storage array returns the write I/O
result to the application host after
receiving the feedback from the local
cache and remote cache.
◼ Differential data recording upon the
breakdown of a single storage array
1. If a storage array breaks down, data is
written into the other storage array that
is working properly and data changes are
recorded in a data change log (DCL). After
the storage array is recovered and
connected to the system again, the data
changes in the DCL are written into the
storage array in incremental mode.
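A condensed sketch of the dual-write path and the DCL fallback described above (illustrative names, not product code):

local_cache, remote_cache = {}, {}
dcl = set()                                # data change log for the failed side

def dual_write(block_id, data, remote_up=True):
    local_cache[block_id] = data           # write to the local cache
    if remote_up:
        remote_cache[block_id] = data      # concurrently write to the remote cache
    else:
        dcl.add(block_id)                  # remote array is down: record the change in the DCL
    return "ok"                            # acknowledge the host once the surviving writes complete

def resync_after_recovery():
    for block_id in sorted(dcl):           # incremental resynchronization from the DCL
        remote_cache[block_id] = local_cache[block_id]
    dcl.clear()

dual_write(1, b"x")                        # normal dual-write
dual_write(2, b"y", remote_up=False)       # remote array down: change tracked in the DCL
resync_after_recovery()
assert remote_cache == local_cache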
HyperMetro I/O Read Process
HyperMetro read I/O process
◼ I/Os read locally during normal
operations and remotely during a
switchover failure
1. A host delivers a read I/O request to the
HyperMetro management module.
2. HyperMetro enables the local storage array
to respond to the read I/O request of the
host.
3. If the local storage array is operating
properly, it returns data to the HyperMetro
management module.
4. If the local storage array is working
improperly, HyperMetro enables the host to
read data from the remote storage array
through the HyperMetro management
module. The remote storage array returns
data to the HyperMetro management
module, which then sends data to the host.
5. The read I/O request of the host is
processed successfully.
FastWrite — Dual-Write Performance
Tuning
(Figure: a cross-site write exchange between site A and site B over 100 km of 8 Gbit/s Fibre Channel/10GE. Traditional solution: Write Command, Transfer Ready, Data Transfer, Transfer Ready, Status Good, requiring two round trips (RTT-1 and RTT-2). FastWrite: the command and data are sent together and only Status Good is returned, requiring one round trip.)
⚫ Traditional solution: Write I/Os experience two interactions between the two sites (write command and data transfer). On a 100 km transfer link this costs twice the round trip time (RTT).
⚫ FastWrite: A private protocol is used to combine the two interactions (write command and data transfer). The cross-site write I/O interactions are reduced by 50%. On a 100 km transfer link only one RTT is needed, improving service performance by 25%.
Multipathing Routing Algorithm
Optimization — Host Data Access Optimization
(Figure: in a local HA, short-distance deployment, the HyperMetro LUN is accessed in load balancing mode; in a site A/site B, long-distance deployment, it is accessed in preferred array mode.)

Load balancing mode (applicable to local Preferred storage array mode (applicable to
HA scenarios) same-city active-active storage scenarios)
⚫ Cross-array I/O load balancing is achieved in this ⚫ This mode greatly reduces cross-site accesses and the
mode. transfer latency.
⚫ This mode is applicable to short-distance deployment ⚫ This mode is applicable to long-distance deployment
scenarios such as the same equipment room. scenarios.
⚫ I/Os are delivered to two storage arrays and storage ⚫ In UltraPath, the hosts at site A are specified to access
resources are fully utilized, improving performance. the storage array at site A first and the hosts at site B
are specified to access the storage array at site B first.
I/Os are only delivered to the preferred storage array.
Thin Copy — Quick Initialization/Incremental
Data Synchronization
Traditional data synchronization solution Huawei thin copy solution
(Figure: site A and site B storage each hold non-zero blocks A to H and I to L plus a region of all-zero pages. The traditional solution fully copies the non-zero blocks and the 12 all-zero blocks; the thin copy solution fully copies only the non-zero blocks and sends one command for the all-zero blocks instead of copying them.)
◼ Traditional data synchronization: When data is synchronized, all-zero data is not identified and all data blocks are copied one by one. Initial data synchronization occupies a large bandwidth and data transfer takes a long time.
◼ Thin copy solution: When data is synchronized, all-zero data is intelligently identified and only a specifier is transferred; the zero data itself is not transferred. Therefore, the initial data synchronization time is reduced by 90%, and the occupied link bandwidth is lowered by 90%.
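A tiny sketch of the zero-detection idea (illustrative only; the wire format and names are assumptions, not the product protocol):

BLOCK = 8 * 1024

def initial_sync(blocks):
    """Yield what would be sent over the replication link for each block."""
    for index, block in enumerate(blocks):
        if block.count(0) == len(block):            # all-zero page detected
            yield ("zero_page", index)              # send only a small specifier
        else:
            yield ("data", index, block)            # non-zero blocks are copied in full

source = [b"\x00" * BLOCK, b"ABCD".ljust(BLOCK, b"\x00"), b"\x00" * BLOCK]
sent = list(initial_sync(source))
print([item[0] for item in sent])                   # ['zero_page', 'data', 'zero_page']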
States of a HyperMetro Pair
(State transition diagram: a pair in the Normal state moves to Paused by a Pause operation or to To be synchronized by a Fault event. A Synchronize operation on a Paused or To be synchronized pair moves it to Synchronizing, and completing synchronization returns it to Normal. A Force start operation on a Paused or To be synchronized pair moves it to Force Start.)
HyperMetro Management Operation Prerequisites


⚫ Synchronizing a HyperMetro pair: The pair/consistency group (CG) running status is Paused, To be synchronized, or Force Start, and the links between the devices are normal.
⚫ Suspending a HyperMetro pair: The pair/CG running status is Normal or Synchronizing, and the links between the devices are normal.
⚫ Performing a prior/non-prior switchover: The pair/CG running status is Normal, Synchronizing, Paused, To be synchronized, or Force Start, and the links between the devices are normal.
⚫ Forcibly enabling a HyperMetro pair: The pair/CG running status is Paused or To be synchronized, the local resource data is unreadable and unwritable, and the links between the storage arrays are disconnected. Note: To ensure data security, stop the service hosts before forcibly enabling a HyperMetro pair. Start the hosts and services after the HyperMetro pair is started.
⚫ Deleting a HyperMetro pair: The pair/CG running status is Paused, To be synchronized, or Force Start.
HyperMetro Without a Consistency Group

For associated LUNs, data may be invalid if a HyperMetro consistency group is not used.
HyperMetro With a Consistency Group
For associated LUNs, a HyperMetro consistency group effectively prevents data loss.
Impacts and Restrictions (1)
1. Capacity requirements
Reserve 1% of the LUN capacity in the storage pool where the LUN resides when applying HyperMetro to
the LUN.

2. Relationship between LUNs used by HyperMetro and the LUNs of other value-added features
Whether the local/remote LUN of HyperMetro can also be used by another value-added feature depends on whether HyperMetro is configured before or after that feature:
⚫ Source LUN of a snapshot: Yes (HyperMetro configured before), Yes (after)
⚫ Snapshot LUN: No (before), No (after)
⚫ Source LUN of a clone: Yes (before), Yes (after)
⚫ Clone LUN: No (before), No (after)
⚫ Primary LUN of HyperReplication: Yes (before), Yes (after)
⚫ Secondary LUN of HyperReplication: No (before), No (after)
⚫ Source LUN of SmartMigration: No (before), No (after)
⚫ Target LUN of SmartMigration: No (before), No (after)
⚫ Mapping: local LUN Yes, remote LUN No (before); Yes (after)
⚫ SmartCompression: Yes (before), Yes (after)
⚫ SmartDedupe: Yes (before), Yes (after)
⚫ SmartThin: Yes (before), Yes (after)
⚫ SmartVirtualization: No (before), No (after)
Impacts and Restrictions (2)
3. Application restrictions
(1) After HyperMetro is configured for a LUN (a remote or local LUN), this LUN cannot be mapped to the
local storage system for a takeover.
(2) An iSCSI host port used by HyperMetro cannot be bound to an Ethernet port; otherwise, the active-active services may fail.
(3) Ports of the active-active replication network and the host-to-storage network must be physically
isolated and cannot be the same.
(4) After a HyperMetro pair is deleted, you are not advised to map the two LUNs of the deleted
HyperMetro pair to the same host.

4. Device requirements
⚫ Quorum server: The arbitration software can be deployed on either a physical machine or a VM. If quorum servers are deployed on VMs, the VMs can only use local disks of the servers, or LUNs independent of the active-active storage systems, as system disks or data disks.
⚫ Arbitration mode: You can only create an active-active relationship between two storage systems of the same model. The HyperMetro license must be available for the storage arrays in both data centers. The version of the storage arrays must be C00 or later.
Installation Process

1. Prepare for the installation, check the installation environment, and unpack and check the devices.
2. Install devices in DC A, DC B, and the third-place quorum site. See Device Layout in a Cabinet.
3. Establish links among the network between hosts and storage arrays, the active-active replication network, the same-city network, and the arbitration network. Connect the cables as planned.
4. Ensure that all devices and their hardware are properly installed. Power on the devices.
5. Configure the IP address of a management network port. Apply for and import a license.
Configuration Process

Start
1. Configure Fibre Channel switches.
2. Configure Ethernet switches.
3. Configure the arbitration software.
4. Configure SAN HyperMetro.
5. Configure UltraPath.
6. Configure the virtualization platform or physical machines.
End
Configuring Fibre Channel Switches (1)
• License requirements for cross-DC networking:
✓ Each Fibre Channel switch must have an E-Port cascading license.
✓ If the network distance is greater than 10 km, a long-distance transmission license must be provided.
✓ (Optional) Each switch is configured with a trunking license.
• Requirements for general configurations:
✓ Fibre Channel switches must have unique domain IDs.
• Link aggregation configuration:
✓ If aggregation is enabled, loads on ports are balanced by frame. If aggregation is disabled, loads on ports are balanced by session.
✓ Configure the ports to be trunked in the same port group. On a switch, ports 0 to 7 form one port group, ports 8 to 15 form another, and ports 16 to 24 form another.
✓ Ports involved in trunking must have the same configuration.
✓ The length difference of the optical fibers used in trunking cannot exceed 30 m; otherwise, performance deteriorates. Trunking fails when the difference of fiber lengths exceeds 400 m.
(Figure: hosts and storage arrays in DC1 and DC2 connect to Fibre Channel switches with domain IDs 1 to 4; the switches are cascaded between the two data centers.)
Configuring Fibre Channel Switches (2)
• Long-distance transmission configuration
✓ For switch SNS 2224 and later models, run portdporttest to check the one-way network distance and RTT.
switch:admin> portdporttest --show 1
D-Port Information:
===================
Port: 1
Remote WWNN: 10:00:00:05:33:81:43:00
Remote port index: 2
Roundtrip link latency: 277 nano-seconds
Estimated cable distance: 3 meters
✓ Cascading port configuration
o If network distance is smaller than 1 km, port L0 mode is used by default. If network distance is
greater than 1 km, modify the port mode.
o If network distance is within 10 km, LE mode is recommended.
o If network distance is greater than 10 km, LS mode is recommended. Set the buffer of LS to twice
the value of the actual network distance.
Note: Links are down if you modify the mode of the expansion port.
For a port that has enabled long-distance transmission, set a fixed rate for the port.
switch:admin> portcfgspeed -i port_index -f speed_level
✓ If the DWDM devices are involved, disable the QoS and buffer credit recovery functions of the
corresponding ports.
switch:admin> portcfgqos --disable [slot/]port
switch:admin> portcfgcreditrecovery --disable [slot/]port
For details about configurations in long-distance transmission, see the following document.
http://support.huawei.com/enterprise/en/doc/DOC1000153327
Configuring Ethernet Switches

• Different services are isolated by VLANs.
• Core switches are used to configure a CSS loop-free Ethernet.
• Access switches are used to configure an iStack loop-free Ethernet.
Configuring Arbitration Software
• Preparations
✓ When configuring the IP address of the quorum port on the quorum server, set
the IP address to take effect upon system startup.
The SUSE system is used as an example. Set STARTMODE to auto.
✓ Enable the firewall and configure a port ID for the firewall of the quorum server.
If a VM is used to deploy arbitration software, enable the firewall port of the
physical machine that deploys the VM.
The SUSE system is used as an example. In any directory of the quorum server OS, run vi /etc/sysconfig/SuSEfirewall2 to open the firewall configuration file. Add port number 30002 to the FW_SERVICES_EXT_TCP configuration item.
Configuring SAN HyperMetro
The HyperMetro configuration procedure is as follows:
1. Add a remote device.
2. Create a quorum server (local) and a quorum server (remote).
3. Create a HyperMetro domain.
4. Create a HyperMetro pair.
5. (Optional) Create a HyperMetro consistency group.
6. Map LUNs to a host (local or remote).
Configuring SAN HyperMetro — Adding
a Remote Device
⚫ Select FC or IP links for the remote device.
• When the network distance exceeds 25 km, enable the FastWrite function of the replication links.
✓ Fibre Channel links: Run the change port fc fc_port_id=XXX fast_write_enable=yes command to enable the FastWrite function of Fibre Channel ports, where fc_port_id can be obtained by running the show port general command.
✓ iSCSI links: Run the change remote_device link link_type=iSCSI link_id=XXX fast_write_enable=yes command to enable the FastWrite function of iSCSI ports, where link_id can be obtained by running the show remote_device link command.
Configuring SAN HyperMetro — Creating a
Quorum Server
⚫ After the quorum server is created, verify that its Running Status is Connected.
Configuring SAN HyperMetro — Creating a
HyperMetro Domain
1. If you select Configure Later, the arbitration mode of the HyperMetro domain is Static Priority Mode. If you select Quorum Server, the arbitration mode of the HyperMetro domain is Quorum Server Mode.
2. If a quorum server has been created, it is listed in this area.
3. If no quorum server has been created, click Create Quorum Server.
Configuring SAN HyperMetro — Creating a
HyperMetro Pair
1. Select a local LUN.
2. Select remote LUN resources. The system automatically selects LUNs with the same capacity as the local LUNs.
3. You can create multiple pairs in a batch.
4. Possible values of the synchronization speed are Low, Medium, High, and Highest. The default value is Medium.
5. Select a recovery policy. Possible policies are Automatic and Manual.
6. Select an initial synchronization mode. The default mode is automatic, and the HyperMetro pair is in the Synchronizing state after being created. If you select the second option, the HyperMetro pair is in the Paused state after being created and you can manually resume it. If you select the third option, the HyperMetro pair is in the Normal state after being created; this option is recommended for scenarios where there is no data on the primary LUNs.
Configuring SAN HyperMetro – Creating
a HyperMetro Consistency Group
1. This operation is optional. You can add pairs to the consistency group during the pair creation process.
2. If any HyperMetro consistency group exists, it is listed here. The HyperMetro consistency group must be in the Paused state, with data synchronized from the local to the remote storage array.
3. If no HyperMetro consistency group exists, click Create HyperMetro Consistency Group to create one.
4. Enter the name of the consistency group.
5. Possible values of the synchronization speed are Low, Medium, High, and Highest. The default value is Medium.
6. Select a recovery policy. Possible policies are Automatic and Manual.
Note: If you have selected a HyperMetro consistency group here, you do not need to create one later.
Configuring UltraPath Policies —
Windows/Linux/AIX/Solaris
Huawei UltraPath provides two working modes for HyperMetro: Priority and Balance. The Priority mode is recommended. UltraPath uses the Priority mode by default and designates the array with the largest serial number (SN) as the primary array. In practice, you may need to change the primary array on some hosts so that load is balanced across the two arrays.
The Windows/Linux/AIX/Solaris operating systems are used as examples:
1. Query the array IDs.
2. Set the HyperMetro working mode to priority and the ID of the primary array to 0.
3. Query the VLUN information. Confirm that the working mode is "read write within primary array" and that the SN of the primary array is correct.
4. Repeat steps 1 to 3 on each host.
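A minimal command-line sketch of these steps through the upadm console of UltraPath (the array ID 0 is an example; confirm the IDs and SNs actually returned on each host, and check the exact parameter names in the UltraPath user guide for your OS and version):
upadm show array
upadm set hypermetro workingmode=priority primary_array_id=0
upadm show vlun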
Configuring UltraPath Policies — vSphere
If ESXi hosts are deployed in a cluster, configure the APD-to-PDL function as follows:
• Configure Huawei UltraPath.
• Run the esxcli upadm set apdtopdl -m on command.
• Run the esxcli upadm show upconfig command to view the configuration result. If the APD to PDL Mode value is on, the APD-to-PDL function of the ESXi hosts is successfully enabled.
Configuring the Virtualization Platform
— VMware Configuration Requirements
Mandatory configuration items:
✓ Deploy ESXi hosts across data centers in an HA cluster.
Configure the cluster with the HA advanced parameter: for VMware vSphere 5.0 U1 and later versions, set the das.maskCleanShutdownEnabled = True parameter.
✓ VM service networks and vMotion networks require L2 interworking between data
centers.
✓ Configure all ESXi hosts with the required advanced parameters and the UltraPath apdtopdl switch (see the UltraPath configuration for vSphere above).
Recommended configuration items:
✓ The vMotion network, service network, and management network must be configured as
different VLANs to avoid network interference.
✓ The management network includes the vCenter Server management node and ESXi hosts that are not accessible to external applications.
✓ The service network is divided into VLANs based on service requirements to ensure
logical isolation and control broadcast domains.
✓ In a single cluster, the number of hosts does not exceed 16. If the number of hosts
exceeds 16, you are advised to use the hosts to create multiple clusters across data
centers.
✓ A DRS group must be configured to ensure that VMs can be recovered first in the local
data center in the event of the breakdown of a single host.
Configuring the Virtualization Platform — FusionSphere Configuration Requirements
Mandatory configuration items:
✓ Deploy CNA hosts across data centers in a cluster.
✓ HA is enabled to ensure that VMs can restart and recover when the hosts where the VMs
reside are faulty.
✓ VM service networks require L2 interworking between data centers.
✓ Both data centers are configured with a VRM in active and standby mode, using the local
disks.
✓ Select FusionSphere V100R005C10U1 and later and choose Huawei UltraPath for
multipathing software.
Recommended configuration items:
✓ Computing resource scheduling must be enabled to ensure that VMs can be recovered first
in the local data center in the event of the breakdown of a single host.
✓ The VM hot migration network, service network, and management network must be
configured as different VLANs to avoid network interference.
✓ The management network includes the VRM management node and CNA hosts that are
not accessible to external applications.
✓ The service network is divided into VLANs based on different services to ensure logical
isolation and control broadcast domains.
Configuring the Virtualization Platform
— Hyper-V Configuration Requirements
Mandatory configuration items:
✓ Perform the following operations to set the timeout parameter of Hyper-V clusters' quorum disks to 60 seconds (20 seconds by default):
Open PowerShell and run Get-Cluster | fl *.
Check whether the QuorumArbitrationTimeMax parameter value is 60. If not, go to the next step.
Run (Get-Cluster cluster_name).QuorumArbitrationTimeMax=60.
Configuring Physical Machines
⚫ Windows clusters:
✓ Perform the following operations to set the timeout parameter of the clusters' quorum disks to 60 seconds (20 seconds by default):
Open PowerShell and run Get-Cluster | fl *.
Check whether the QuorumArbitrationTimeMax parameter value is 60. If not, go to the next step.
Run (Get-Cluster cluster_name).QuorumArbitrationTimeMax=60.
⚫ Oracle RAC clusters:
✓ Oracle RAC clusters are deployed in Automatic Storage Management (ASM) mode. You are advised to use the External redundancy mode.
✓ You are advised to store the arbitration file, redo log file, system data file, user data file, and archive log file in different ASM disk groups.
✓ You are advised to create three redo log groups for each thread. The size of a redo log must allow a log switchover every 15 to 30 minutes.
OceanStor Dorado V6
Storage System
SmartMigration
Feature Description
⚫ Background
With the evolution of storage technologies, the need for service migration
arises as a result of storage system upgrade or storage resource reallocation.
Mission-critical services, in particular, must be migrated without being
interrupted. Service migration may take place either within a storage system
or between storage systems.

⚫ Definition
SmartMigration, a key service migration technology, migrates host services
from a source LUN to a target LUN without interrupting these services and
then enables the target LUN to take over services from the source LUN after
replication is complete. After the service migration is complete, all service-
related data has been replicated from the source LUN to the target LUN.
Feature Description
Characteristics | Description
Reliable service continuity | SmartMigration tasks are executed without interrupting host services, preventing any loss caused by service interruption during service migration.
Stable data consistency | After a SmartMigration task starts, all data is replicated from the source LUN to the target LUN. During the migration, I/Os delivered by hosts are sent to both the source and target LUNs using dual-write, ensuring data consistency between the source and target LUNs and preventing data loss.
Service migration between heterogeneous storage systems | In addition to service migration within a storage system, SmartMigration also supports service migration between a Huawei storage system and a compatible heterogeneous storage system.
Working Principles
SmartMigration is leveraged to adjust service performance or upgrade storage systems by
migrating services between LUNs.
Implementation of SmartMigration has two stages:
⚫ Service data synchronization
When a SmartMigration task is created initially, data on the source LUN is synchronized to the target
LUN. During the synchronization, the host writes data to the source as well as to the target LUN in real
time without interrupting host services.
⚫ LUN information exchange
After all data on the source LUN is synchronized to the target LUN, the information of the source LUN and target LUN is exchanged and the relationship between the two LUNs is terminated. Host services are not interrupted and the host keeps writing data to the source LUN. In this way, the target LUN can replace the source LUN to carry host services.
Working Principles
1. The host delivers an I/O write request to the SmartMigration module of the storage system.
2. The SmartMigration module writes the data to the source LUN and target LUN and records this
write operation to the log.
3. The source LUN and target LUN return the data write result to the SmartMigration module.
4. The SmartMigration module determines whether to clear the data change log (DCL) based on the data write result.
5. A write success acknowledgment is returned to the host.
[Figure: dual-write I/O flow among the host, the SmartMigration (LM) module, the DCL/log, and the source and target LUNs in the storage system.]
Working Principles
⚫ In a storage system, each LUN and its corresponding data volume has a unique identifier, namely, LUN ID
and data volume ID. A LUN corresponds to a data volume. The former is a logical concept whereas the
latter is a physical concept. LUN information exchange changes the mapping relationship between a LUN
and a data volume. That is, without changing the source LUN ID and target LUN ID, data volume IDs are
exchanged between a source LUN and a target LUN. As a result, the source LUN ID corresponds to the
target data volume ID, and the target LUN ID corresponds to the source data volume ID.
SmartMigration Consistent Splitting
⚫ Consistent splitting of SmartMigration enables simultaneous splitting on multiple related LUNs. As a
result, data consistency can be ensured and services of the target LUN are not affected. After
SmartMigration pairs are split, the data written to the source LUN by the host is not synchronized to the
target LUN.
SmartMigration State Transition
[Figure: state transition diagram. After initial creation, a pair enters the Queuing state if the number of copy tasks has reached the maximum, or starts synchronization and enters the Synchronizing state if it has not; Synchronizing changes to Normal when synchronization completes; a disconnection changes the state to Interrupted, and the pair recovers after fault rectification; splitting changes the state to Migrated.]
1. Synchronizing: Data on the source LUN is being synchronized to the target LUN.
2. Normal: Data synchronization between the source LUN and the target LUN is complete.
3. Queuing: The pair is waiting in a queue.
4. Interrupted: The replication relationship between the source LUN and the target LUN is interrupted due to I/O errors.
5. Migrated: Data synchronization between the source LUN and the target LUN is complete and the splitting is complete.
Storage System Replacement
⚫ When users plan to upgrade their storage systems, for example, to replace heterogeneous storage systems that are compatible with Huawei's new storage systems, they can deploy SmartMigration together with SmartVirtualization to migrate service data from the original storage systems to the new storage systems while ensuring data consistency.
[Figure: data is migrated from an external LUN on the old storage system to an eDevLUN on the new storage system.]
Service Performance Adjustment
⚫ To enhance the reliability of services on a LUN with a low-reliability
RAID level, you can deploy SmartMigration to migrate the services to
a LUN with a high-reliability RAID level. If services do not need high
reliability, you can migrate them to a low-reliability LUN.
[Figure: services are migrated from a source LUN using a RAID 5 policy to a target LUN using a RAID 6 policy.]
Impact and Restrictions
Impact on performance
⚫ When SmartMigration is in use, operations such as data migration and dual-write consume
CPU resources, increasing the access latency and decreasing the write bandwidth.
 During the migration, enabling SmartMigration increases the average latency of the source LUN by no more than 30%, and the average total latency is no more than 2 ms.
 When SmartMigration is enabled and the target LUN is faulty, the latency of the source LUN increases by no more than 15% (data is written only to the source LUN) and the average total latency is no more than 1.5 ms.
⚫ You are advised to use the moderate migration speed to perform migration in common
scenarios. The impact of migration on host performance increases as the migration speed
increases. Therefore, users can reduce the impact of SmartMigration on host performance
by dynamically adjusting the migration speed.
Impact and Restrictions
Restrictions
⚫ The capacity of the target LUN must not be smaller than that of the source LUN.
⚫ Neither the source nor the target LUN is used by any value-added feature.
⚫ The source and target LUNs belong to the same controller.
⚫ The target LUN cannot be mapped to the host.
Configurations
Checking SmartMigration Licenses
Creating SmartMigration Tasks
Creating SmartMigration Tasks –
Migration Speed Settings
OceanStor Dorado V6
Storage Systems
SmartQoS
Introduction to SmartQoS
SmartQoS helps set upper limits on IOPS or bandwidth for certain applications. Based on these upper limits, SmartQoS can accurately control the performance of these applications, thereby preventing them from contending for storage resources with critical applications. It provides the following functions:
⚫ Assigns storage resources to critical applications on a preferential basis in the event of resource shortages, in order to meet specific service level requirements across scenarios.
⚫ Limits the resources allocated to non-critical applications to ensure better performance of critical applications.
I/O Priority Scheduling
⚫ I/O priority scheduling is based on LUN
priority, or more specifically, the importance
of applications processed by each LUN.
⚫ This function schedules storage system
resources, such as computing and bandwidth
resources. This ensures that storage systems
give priority to resource allocation requests
initiated by high-priority applications. Thus,
resource shortages do not affect the ability
of high-priority applications to meet their
service level requirements.
I/O Traffic Control
[Figure: before traffic control, the performance of common applications keeps growing and affects critical applications; after the traffic control policy is enabled, the performance of common applications is limited so that critical applications are not affected and overall performance stays stable.]
⚫ I/O traffic control restricts the performance of non-critical applications by limiting their IOPS or bandwidth, thereby preventing them from affecting critical applications. I/O traffic control is implemented based on hierarchical management, objective distribution, and traffic control management.
I/O Traffic Control: Hierarchical
Management
SmartQoS supports both normal and hierarchical policies.
⚫ Normal policy: controls the traffic from a single application to LUNs or snapshots.
⚫ Hierarchical policy: controls the traffic from a group of applications to LUNs or snapshots. Hierarchical policies can be supplemented by normal policies.
I/O Traffic Control: Objective
Distribution
⚫ All LUNs in a SmartQoS traffic control policy share a specified traffic control objective. The SmartQoS module periodically collects performance data and requirement data of all LUNs in the policy, and distributes the traffic control objective to each LUN using a distribution algorithm.
⚫ Currently, Huawei employs a tuned weighted max-min fairness algorithm. As shown in the flowchart, the algorithm collects information, identifies distribution objects and adds weights, calculates a midpoint value between the maximum and minimum values, and then calculates the final results.
I/O Traffic Control: Traffic Control
Management
⚫ Traffic control management
is implemented based on I/O
queue management, token
allocation, and dequeue
control.
⚫ I/O queue management uses
a token mechanism to
allocate storage resources. A
high number of tokens
indicates correspondingly
high resource allocation for
the respective I/O queue.
Application Scenarios
SmartQoS dynamically allocates storage resources to
prevent non-critical applications from contending for
storage resources, thereby ensuring optimal performance
of critical applications. It is used mainly for:
⚫ Preventing mutual impact between applications
⚫ Ensuring the performance of critical applications in a multi-
application system
Scenario 1: Preventing Mutual Impact
Between Applications
Since storage systems are now designed with increasingly large capacities,
multiple applications are commonly deployed on single storage systems. This
practice simplifies the storage system architecture, but also causes applications
to contend for resources, which may adversely affect the performance of each
application. SmartQoS allows specification of performance objectives for each
application to ensure the performance of critical applications.
Users can create a traffic control policy to limit the performance of non-critical
applications.
Scenario 1: Preventing Mutual Impact
Between Applications
Application Type | I/O Characteristic | Peak Hours of Operation
OLTP | Random small I/Os, typically measured in IOPS | 08:00 to 00:00
Archive and backup | Sequential large I/Os, typically measured in bandwidth | 00:00 to 08:00
⚫ Online transaction processing (OLTP) applications are critical, time-sensitive applications.
⚫ Archive and backup applications involve large amounts of data and are latency-tolerant.
⚫ OLTP applications run mainly from 08:00 to 00:00. Archive and backup applications run mainly from 00:00 to 08:00.
You can create two SmartQoS policies for these two types of
applications:
⚫ SmartQoS policy A: Limits the bandwidth (for example, ≤ 50 MB/s)
for archive and backup applications to reserve sufficient system
resources for OLTP applications from 08:00 to 00:00.
⚫ SmartQoS policy B: Limits the IOPS (for example, ≤ 200) for OLTP
applications to reserve sufficient system resources for archive and
backup applications from 00:00 to 08:00.
Scenario 2: Ensuring the Performance of
Critical Applications in a Multi-Application
System
Users can configure higher priorities for critical applications to enable
preferential allocation of resources when the system is overloaded with
applications. This practice is more suitable for scenarios featuring varied
importance levels rather than a specific performance objective.
[Figure: priorities are configured for applications — critical applications are set to high and important applications to medium — so that critical applications receive resources first while overall performance is maintained.]
SmartQoS Portal
On OceanStor DeviceManager, choose Provisioning > Resource
Performance Tuning > SmartQoS.
Configuring the I/O Priority
Configure the I/O priority for a LUN based on the importance of
applications processed by the LUN. The three I/O priorities are
Low, Medium, and High.
Creating a SmartQoS Policy (1)
Step 1: On the Traffic Control tab, click Create. Specify the policy
name and type in the displayed dialog box.
Creating a SmartQoS Policy (2)
Step 2: Set the control objective.
 Do not set the control objective to too small a value. The value displayed in the
following figure is provided as an example. A big difference between the value
and the actual service load leads to high latency, which may adversely affect
host services and other services such as HyperMetro and HyperReplication.
Creating a SmartQoS Policy (3)
Step 3: Set the time period for which the policy comes into
effect.
Creating a SmartQoS Policy (4)
Step 4: Add LUNs to the policy.
Creating a SmartQoS Policy (5)
Step 5: Confirm the parameter settings and click Finish.
Creating a SmartQoS Policy (6)
Step 6: On the Traffic Control tab, you can view basic
information about all policies. There are three activity states
for policies: Unactivated, Idle, and Running.
[Figure: policy state transitions. Activating an Unactivated policy makes it Idle or Running; deactivating a Running or Idle policy makes it Unactivated; a policy switches from Idle to Running when its execution time starts and back to Idle when the execution time ends.]
Activating or Deactivating a
SmartQoS Policy
For unactivated SmartQoS policies, you can activate them and
add or remove LUNs.
For activated SmartQoS policies, you can deactivate them and
add or remove LUNs.
Deleting a SmartQoS Policy
You can directly delete an unactivated SmartQoS policy.
Activated policies must be deactivated before being deleted.
Modifying the Properties of a
SmartQoS Policy (1)
⚫ You can modify the properties of an activated SmartQoS
policy.
 Do not set the control objective to too small a value. The value displayed in the
following figure is provided as an example. A big difference between the value and
the actual service load leads to high latency, which may adversely affect host
services and other services such as HyperMetro and HyperReplication.
Modifying the Properties of a
SmartQoS Policy (2)
You can modify the properties of an activated SmartQoS
policy.
Huawei UltraPath
Training (Entry-Level)
Positioning of Multipathing Software – What
Is Multipathing Software Capable of?
[Figure: a server connected to a storage array without multipathing software (a single HBA and a single path) versus with multipathing software (two HBAs and redundant paths).]
⚫ Basic function: eliminating single points of failure.
A single point of failure (SPOF) means that a fault at a certain point of a network may cause the entire network to break down. To prevent single points of failure, high-reliability systems implement redundant backup for devices that may suffer single points of failure and adopt a cross-cable connection method to achieve optimal reliability. Moreover, redundant paths help achieve higher performance.
Positioning of Multipathing Software – What
Else can Multipathing Software Do?
[Figure: without multipathing software, a single link becomes an I/O bottleneck; with multipathing software, link loads are balanced and overall performance is doubled.]
⚫ Basic function: load balancing.
Load balancing is another critical function of multipathing software. With load balancing, the system can use the bandwidth of multiple links, improving overall throughput. Common load balancing algorithms include round-robin, minimum queue depth, and minimum task.
Positioning of Multipathing Software – What
else can Multipathing Software Do?
[Figure: UltraPath runs on the host between the vdisk layer and the HBAs, and connects through the SAN to storage controllers A and B.]
⚫ Positioning: UltraPath is a type of filter driver software running in the host kernel. It can intercept and process disk creation/deletion and I/O delivery of operating systems. Multipathing software ensures reliable use of redundant paths. If a path fails or cannot meet the performance requirement, the multipathing software automatically and transparently transfers I/Os to other available paths to ensure that I/Os are transmitted effectively and reliably. As shown in the figure, multipathing software can handle many faults, such as HBA faults, link faults, and controller faults.
Basic Function | Importance Degree | Overview
Failover | High | If a path is faulty, I/Os on the path are automatically transferred to another available path.
Failback | High | After the faulty path recovers, I/Os are automatically transferred back to it.
Load balancing | High | The bandwidths of multiple links are used, improving the overall system throughput.
Overview of Mainstream
Multipathing Software
Windows Linux AIX VMware ESX Solaris
Built in
MPIO DM-Multipath MPIO NMP STMS
OS
UltraPath PCM (based on
Huawei UltraPath UltraPath UltraPath UltraPath
MPIO)
EMC PowerPath PowerPath PowerPath PowerPath PowerPath
SDDDSM (based on
DM-Multipath SDDPCM (based on MPIO) NMP SDD
IBM1 MPIO)
RDAC RDAC RDAC VERITAS DMP
SecurePath SecurePath SecurePath STMS
HP2 DSM (based on
HP-DM HP-PCM (based on MPIO) NMP DMP
MPIO)
HDS HDLM HDLM HDLM HDLM HDLM
DSM (based on
NetApp DM-Multipath MPIO NMP STMS
MPIO)
Veritas DMP DMP DMP DMP DMP

Currently, multipathing solutions provided by storage vendors are classified into three types:
1. Use self-developed multipathing software, for example, EMC PowerPath, HDS HDLM, and Huawei UltraPath.
2. Provide storage adaptation plug-ins based on the multipathing framework of operating systems, for example, IBM and HP.
3. Use native multipathing software of operating systems (generally used by A-A arrays or A-A/A arrays supporting ALUA).
Currently, Windows and Linux are the most mainstream operating systems for x86 servers, AIX is the most mainstream in
minicomputers, and VMware ESX in virtualization platforms.
Native multipathing software of operating systems (often called MPIO) supports the failover and load balancing functions
and can cope with scenarios that have moderate requirements on reliability. Multipathing software developed by storage
vendors is more professional and delivers better reliability, performance, maintainability, and storage adaptation.
Overview of Huawei UltraPath
⚫ Introduction: a multipathing software program installed on hosts to improve service performance and availability.
⚫ Functions: runs in kernel mode of the operating system as a driver; controls access to storage devices; selects paths between hosts and storage devices; improves the reliability of the paths between hosts and storage devices; supports querying and setting the operating parameters of the driver software in user mode of the operating system.
⚫ Environment: has different installation programs or scripts for different operating systems.
Overview of Huawei UltraPath
[Figure 1: OSs supported by UltraPath — Windows, Linux, AIX, ESX, and Solaris are supported; HP-UX and other OSs are not supported. Figure 2: Architecture of UltraPath — self-developed for some OSs and based on the OS's own multipathing framework for others.]
Remarks
⚫ UltraPath for AIX is based on the MPIO framework built into the OS, and provides the Path Control Module (PCM).
⚫ UltraPath for ESX is based on the PSA framework built into the OS, and provides the MPP module (a multipathing plug-in).
Major Functions of UltraPath
⚫ Virtual LUN generation: Virtual LUNs mask physical LUNs and are visible to upper-layer users. Read/write operations are performed on virtual LUNs.
⚫ Optimal path selection: The path to the owning controller of a LUN is used to achieve the best performance.
⚫ Support for mainstream application software: mainstream clustering software (MSCS, VCS, HACMP, Oracle RAC, and so on) and mainstream database software (Oracle, DB2, MySQL, Sybase, Informix, and so on).
⚫ Failover: When a link becomes faulty, failover occurs, preventing service interruption.
⚫ Failback: After the link recovers, failback occurs immediately without manual intervention or service interruption.
⚫ I/O load balancing: Multiple paths are automatically selected to deliver I/Os, improving I/O performance. Paths are selected based on the path workload.
Redundancy Solution — Without
Multipathing Software
[Figure: a server with a single HBA connected to the storage array over a single link to external storage.]
Redundancy Solution — Without
Multipathing Software
⚫ With only a single link, services are interrupted immediately after the link becomes faulty.
[Figure: the single link between the server's HBA and the storage array fails, interrupting services.]
Redundancy Solution — Multipathing
⚫ Redundant links are established to prevent single points of failure.
[Figure: a server running multipathing software with two HBAs connected to the storage array over redundant links.]
Redundancy Solution — Multipathing
⚫ A better network: the standard dual-switch network.
[Figure: the server's two HBAs connect to the storage array through a dual-switch network.]
Redundancy Solution — Multipathing
⚫ Even with multipathing, services are interrupted immediately if the server itself fails.
[Figure: a single server running multipathing software fails, interrupting services despite redundant links to the storage array.]
Redundancy Solution — Multipathing
+ Cluster
⚫ A second server is added for redundancy and backup.
[Figure: two servers, each running multipathing software and cluster software (WSFC, VCS, and so on), connected to the storage array.]
Redundancy Solution — Multipathing
+ Cluster
⚫ Services are still interrupted immediately if the storage array itself becomes faulty.
[Figure: two clustered servers with multipathing software connected to a single storage array that has failed.]
Redundancy Solution — Multipathing +
Cluster + Active-Active = High Availability
⚫ A second storage array is added for redundancy and backup. Multipathing + cluster + active-active storage = high availability.
[Figure: two clustered servers running multipathing software, each connected to two active-active storage arrays.]
Native Multipathing Software of
Operating Systems — Windows
⚫ Microsoft Multipath I/O (MPIO) is a framework that allows storage vendors to develop multipathing solutions containing the hardware-specific information needed to optimize connectivity with their storage arrays. MPIO can also be used independently, helping implement load balancing among paths, path selection, and failover between storage devices and hosts.
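For reference, a quick way to inspect what MPIO has claimed on a Windows host is the built-in mpclaim tool (shown only as an illustration; see the Microsoft documentation for the full option set):
mpclaim -s -d       Lists MPIO disks and their load balancing policies.
mpclaim -s -d 0     Shows the path details of MPIO disk 0 (the disk number is an example).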
Native Multipathing Software of
Operating Systems — Linux
⚫ Device Mapper Multipath (DM-Multipath) can configure multiple
I/O links between servers and storage arrays as an independent
device. These I/O links are physical SAN links composed of
different cables, switches, and controllers. DM-Multipath
aggregates these links to form a new device.
⚫ DM-Multipath delivers the following functions:
1. Failover and failback
2. I/O traffic load balancing
3. Disk virtualization
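As a quick illustration on a Linux host (package and service names can differ slightly between distributions), the multipath service can be enabled and the aggregated devices inspected as follows:
systemctl enable --now multipathd     Starts the DM-Multipath daemon and enables it at boot.
multipath -ll                         Lists the aggregated multipath devices and the state of each path.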
Native Multipathing Software of
Operating Systems — ESXi
⚫ By default, ESXi provides a native multipathing plug-in (NMP)
module which is scalable.
⚫ Generally speaking, VMware NMP supports all the storage arrays
listed on the VMware storage HCL and provides default path
selection algorithms based on the array type. The storage array
type plug-in (SATP) is responsible for path failover of specific
storage arrays. The path selection plug-in (PSP) is responsible for
selecting physical paths for sending I/O requests to storage
arrays. SATP and PSP are sub-plug-ins of the NMP module.
⚫ In ESXi, the SATP appropriate for storage arrays is installed
automatically. You do not need to obtain or download any SATP.
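For reference, the NMP, SATP, and PSP assignments on an ESXi host can be inspected with standard esxcli commands (output varies by ESXi version):
esxcli storage nmp satp list
esxcli storage nmp psp list
esxcli storage nmp device list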
Comparison Between Huawei UltraPath and
Multipathing Software from Competitors —
Basic Functions
Importance
PowerPath Windows MPIO DM-Multipath AIX MPIO UltraPath
Degree
Failover High Supported Supported Supported Supported Supported
Failback High Supported Supported Supported Supported Supported
Optimal- Supported
Supported Supported
controller High Supported Note: using Supported
Note: using ALUA Note: using ALUA
identification ALUA

Medium
round-robin
Except round-
least-io round robin
robin, all other
least-block Least Queue round robin round-robin
Load balancing algorithms do Adaptive round robin
Depth Queue-length min-queue-depth
algorithm not differ much Weighted Paths
CLARiiON optimization Least Block Service-time min-task
in their actual
Symmetrix optimization Weighted Paths
performances.
Stream I/O

Supported Supported
Load balancing
Note: Path group Note: Path group
based on path High Supported Supported Supported
identification identification
groups
through ALUA through ALUA
Comparison Between Huawei UltraPath
and Multipathing Software from
Competitors — Advanced Functions
Importance Windows DM-
PowerPath AIX MPIO UltraPath
Degree MPIO Multipath
(All Paths
High
Down) Supported by some Not
Note: reliability Supported Not supported Supported
APD platforms supported
function
protection
Supported
Note: Paths
Isolation of
High cannot be
intermittent Not Not
Note: reliability Not supported restored Supported
ly faulty supported supported
function automatically
paths
after the
isolation.
Supported by some
platforms Supported
Note: The isolation is Note: Isolation
Isolation of achieved by the algorithms are
High autostandby function. different for different
links that Not Not
Note: reliability Only one isolation Not supported types of faults, and a
have bit supported supported
function algorithm can be special recovery test
errors
used and paths will mechanism is
be recovered after a provided.
fixed period of time.
Comparison Between Huawei UltraPath
and Multipathing Software from
Competitors — Advanced Functions
Importance Windows
PowerPath DM-Multipath AIX MPIO UltraPath
Degree MPIO
Pushes information to the
array and provides
High
Path exception Not Not Not centralized alarming.
Note: reliability Supported
alarming supported supported supported Multiple types of path alarms
function
are supported: path failure
and no redundant controllers.
Low
Note: After
multipathing
GUI centralized software is Supported Not Not Not
Not supported
management installed, this PowerPath Viewer supported supported supported
management is
rarely needed.
Path management
insight provides
Medium monitoring from multiple
Statistics of IOPS and
Path performance Note: It is used dimensions: Not Not Not
1. IOPS, bandwidth, bandwidths are collected
monitoring to diagnose supported supported supported
and latency based on read/write requests.
problems.
2. I/O size
3. read/write requests
Comparison Between Huawei UltraPath
and Multipathing Software from
Competitors — Advanced Functions
Importance Windows DM-
PowerPath AIX MPIO UltraPath
Degree MPIO Multipath
Supported
Note:
Medium
Identifies the
Note: Without this I/Os will
controller that I/Os will drop I/Os will drop to zero
function, services drop to zero
Smooth online is about to go to zero during during the upgrade
are not interrupted Supported during the
upgrade of arrays offline through the upgrade process.
but only upgrade
ALUA and process.
temporarily process.
switches over
congested.
the controller
in advance.
Disable paths using one
Medium of the following methods:
Manually disabling Can disable paths
Note: Without this 1. Disabling a specified
paths (used for based on HBA
function, services Can disable controller
smoothly ports and Not
are not interrupted Not supported logical paths 2. Disabling a specified
transferring services controller ports that supported
but only only. physical path which is
before replacing a correspond to the
temporarily identified by the HBA
component) faulty components.
congested. port plus target port
ID.
Medium Support VIS active-active
Remote active- Active-active Not
Note: Applies to Not supported Not supported and self-developed
active DC solution VPLEX supported supported
special scenarios. active-active mode.
Automatic host Not
Medium Supported Not supported Not supported Supported
registration supported
Comparison Between Huawei
UltraPath and Multipathing Software of
Competitors — DFX
Importance
PowerPath Windows MPIO DM-Multipath AIX MPIO UltraPath
Degree
Automatic environment
Additional tools
dependency check during Low Not supported Not supported Not supported Supported
need to be used
installation
Automatic environment
parameter configuration Low Not supported Not supported Not supported Not supported Supported
during installation
N/A
N/A
N/A Note: bound with the
Note: bound with the Supported by some
Silent installation Low Supported Note: bound with the operating system
operating system platforms
operating system version version
version

N/A
N/A
N/A Note: bound with the
Note: bound with the Supported by some
No reboot upgrade (NRU) High Supported Note: bound with the operating system
operating system platforms
operating system version version
version

N/A N/A
N/A
Note: bound with the Note: bound with the Supported by some
Non-interruptive upgrade High Not supported Note: bound with the
operating system operating system platforms
operating system version
version version
Multi-platform unified user
Medium Supported Not supported Not supported Not supported Supported
interface
Automatic storage Manual configuration Manual configuration Manual configuration
Low Supported Supported
identification required required required
Supported
Note: supported
Co-existence with third-
High Supported Supported Supported Supported theoretically, with the
party multipathing software
need to verify the
specific version
Comparison Between UltraPath and Native
Multipathing Software of Operating Systems—
Overview
Fault Source and Symptom UltraPath Multipathing Software Built in OSs
Components are faulty and cannot receive or
✓ Isolate the faulty path. ✓ Isolate the faulty path.
send any signal.
Connections are not stable because cables Cannot isolate the faulty path permanently.
✓ Isolate the faulty path permanently.
are not firmly connected to ports. Performance deteriorates intermittently.
Fault Signals of optical fibers or modules are weak, Cannot isolate the faulty path permanently.
✓ Isolate the faulty path permanently.
Symptom causing packet loss or error packets. Performance deteriorates intermittently.
Cannot isolate the path permanently.
The transmission delay is long. ✓ Isolate the path.
Performance deteriorates intermittently.
Cannot isolate the faulty path permanently.
Components are reset repeatedly. ✓ Isolate the faulty path permanently.
Performance deteriorates intermittently.
Host HBAs ✓ Isolate the faulty path. ✓ Isolate the faulty path.
Optical fiber ✓ Isolate the faulty path. ✓ Isolate the faulty path.
Switch ✓ Isolate the faulty path. ✓ Isolate the faulty path.
Fault
Storage controller ✓ Isolate the faulty path. ✓ Isolate the faulty path.
Source
Interface module ✓ Isolate the faulty path. ✓ Isolate the faulty path.
Channel within a storage controller to access Cannot handle the problem perfectly. Services
✓ Isolate the faulty path.
LUNs may be interrupted.

The fault symptoms and sources that UltraPath can handle are five times
and 1.2 times, respectively, as many as the native multipathing software
of operating systems can handle. The comprehensive coverage increases
6-fold.
Comparison Between UltraPath and Multipathing
Software from Competitors — Overview
IBM/HP/
Field Function Item Huawei EMC HDS
NetApp
Some operating systems only
I/O load balancing Supported Supported Supported support the round-robin
Performance algorithm.
Performance consumption of software stacks Relatively large Relatively large Relatively small Small
Isolation of intermittently faulty links Supported Not supported Supported Not supported
Isolation of links that have bit errors Supported Supported Not supported Supported by AIX only
Reliability
Duration of I/O suspension in a path fault 1s to 2s (except AIX) 1s to 60s 1s to 60s 1s to 60s
Duration of I/O suspension in the case of timeout ≥ 30s ≥ 30s ≥ 30s ≥ 30s
Basic
Path performance monitoring Supported Supported Supported Not supported
services
Management and Path topology query Supported Supported Supported Not supported
maintenance Disabling paths/Standby Disabling is supported only. Supported Disabling is supported only. Disabling is supported only.
Log audit Supported Supported Supported Not supported
Supported by mainstream operating
SAN-Boot Supported Supported Supported
systems
Interoperability
Operating system Mainstream operating systems supported Supported Supported Supported
Virtualization platforms of OS vendors Supported Supported Supported N/A
Optimization of active-active path selection algorithm Supported Supported Supported Not supported
Performance
NVMe Not supported Not supported Not supported Not supported
APD retry Supported Supported Not supported Supported by Linux only
Reactive autorestore (the software test dead paths
Supported by AIX only Supported Not supported Supported by AIX only
Reliability when no path is available for I/O flows)
No I/O interruption when components are replaced
Supported (online array upgrade) Not supported Supported Not supported
proactively
GUI centralized management Not supported Supported Supported Not supported
SNMP trap
Messages are sent to the array for SNMP trap
Event and alarm Syslog Not supported
Advanced unified alarms. Syslog
SCOM
services
Automatic host registration Supported Supported Not supported Not supported

Management and Supported by some Supported by some operating


Installation/Upgrade without restarting the system Supported by some operating systems N/A
maintenance operating systems systems
Non-interruptive upgrade Supported Not supported Not supported N/A
Hot patching Not supported Not supported Not supported Not supported
Supported by mainstream operating
Silent installation Supported Supported N/A
systems
Batch deployment and upgrade Not supported Not supported Not supported N/A
Interoperability Heterogeneous storage Not supported Supported Not supported Supported
Basic UltraPath Configuration Guide
— Windows
The following table describes frequently used commands for configuring UltraPath.
Command Description
show iostat Queries the performance statistics of a specified storage system.
show upconfig Queries UltraPath configuration information.
show version Queries the version of UltraPath.
show path Queries the working condition of specific or all physical paths.
show alarmenable Checks whether the host pushes alarms.
show path_reliability_enable Checks whether UltraPath path degradation is enabled.
show event Queries key event information.
show array Queries information about specific or all storage systems connected to the application server.
show vlun Queries virtual LUNs mapped from the storage system to the application server.
set ied_recovery_time Sets the I/O discrete error path recovery time. The default value is recommended.
set sdd_recovery_time Sets the recovery time of a latency-sensitive path. The default value is recommended.
set sdd_threshold Sets the threshold of switching a latency-sensitive path. The default value is recommended.
set ifd_time Sets the time window for intermittent path failure statistics. The default value is recommended.
set ifd_threshold Sets the intermittent path failure isolation threshold. The default value is recommended.
set ifd_recovery_time Sets the intermittent path failure recovery time. The default value is recommended.
set hld_time Sets the threshold of determining a high-latency path. The default value is recommended.
set phypathnormal Sets a degraded path to the normal status.

Note: For details about command usage, see the user guide of UltraPath for the operating system.
For details about how to obtain the document, see Basic UltraPath Installation, Uninstallation, and
Upgrade.
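For example, a routine health check on a host could chain a few of the query commands above through the upadm console (a sketch; the command prefix and output format follow the UltraPath user guide for your OS and version):
upadm show version
upadm show upconfig
upadm show path
upadm show vlun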
Basic UltraPath Configuration Guide
— Windows
The following table describes frequently used commands for configuring UltraPath.
Command Description
set tpgstate Enable or disable the controller modules of the specified storage system.
set pathstate Enable or disable the specified physical path.
set workingmode Sets the working mode of UltraPath to load balancing between controllers or within a controller.
set loadbalancemode Sets the load balancing mode of UltraPath.
set luntrespass Sets the policy of switching over the working controller for LUNs. The default value is recommended.
set failbackdelaytime Sets the failback interval. The default value is recommended.
set ioretry Sets the number and interval of I/O retries. The default values are recommended.
set iosuspensiontime Sets the I/O suspension time. The default value is recommended.
set alarmenable Sets whether the host pushes alarms. The default value is recommended.
set path_reliability_enable Sets whether UltraPath path degradation is enabled. The default value is recommended.
set ied_min_io Sets the minimum number of I/Os for I/O discrete error isolation. The default value is recommended.
set ied_threshold Sets the I/O discrete error isolation threshold (ratio). The default value is recommended.
set ied_time Sets the time window for I/O discrete error isolation statistics. The default value is recommended.
set tod_recovery_time Sets the I/O timeout path recovery time. The default value is recommended.
set tod_threshold Sets the I/O timeout isolation threshold (times). The default value is recommended.
set tod_time Sets the time window for I/O timeout isolation statistics. The default value is recommended.
set hld_threshold Sets the high-latency path isolation threshold. The default value is recommended.

Note: For details about command usage, see the user guide of UltraPath for the operating system. For details about
how to obtain the document, see Basic UltraPath Installation, Uninstallation, and Upgrade.
Basic UltraPath Configuration Guide
— Windows
The following table describes frequently used commands for configuring UltraPath.
Command Description
set hld_recovery_time Sets the high latency path recovery time. The default value is recommended.
set faulty_path_check_interval Sets the faulty path routine test interval. The default value is recommended.
set idle_path_check_interval Sets the idle path routine test interval. The default value is recommended.
set max_io_retry_timeout Sets the timeout threshold of retrying an I/O. The default value is recommended.
Sets the number of I/Os consecutively delivered in load balancing mode. The default value is
set lb_io_threshold
recommended.
set hypermetro workingmode Sets the HyperMetro working mode. The default value is recommended.
Sets the size of slices during load balancing across HyperMetro arrays. The default value is
set hypermetro split_size
recommended.
clear upconfig Deletes UltraPath configuration information from virtual LUNs or the storage system.
clear obsolete_path Delete information about unused physical paths.
check status Checks the UltraPath status.
start pathcheck Checks the working status of the specified physical path.
Checks whether the configuration of LUNs' working controller is optimal and starts working
start rebalancelun
controller switchover if necessary.
start migration Switches the host I/O path to the target or source array.
start iosuspension Suspends I/Os to the specified LUN.
stop iosuspension Stops I/O suspension of a specified virtual LUN.

Note: For details about command usage, see the user guide of UltraPath for the operating system. For details
about how to obtain the document, see Basic UltraPath Installation, Uninstallation, and Upgrade.
UltraPath Parameter Settings in
Typical Application Scenarios
In most scenarios, the default settings of UltraPath are recommended. In some scenarios, you can configure UltraPath as instructed by the following:
 upadm set workingmode={0|1}
⚫ Specifies the load balancing mode at the storage controller level. 0 indicates inter-controller load balancing; 1 indicates load balancing within a controller.
⚫ The default setting is load balancing within a controller. UltraPath selects paths to deliver I/Os based on the owning controller of each LUN.
⚫ When the inter-controller load balancing mode is used, UltraPath delivers I/Os to all paths. This increases latency due to transmission of I/Os between controllers.
Typical Scenario | Recommended Configuration
The transmission paths between hosts and storage arrays become a performance bottleneck. | 0 (inter-controller load balancing)
Other scenarios | 1 (default setting, load balancing within a controller)
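For instance, to switch a host to inter-controller load balancing and confirm the change (a sketch based on the commands above):
upadm set workingmode=0
upadm show upconfig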
UltraPath Parameter Settings in
Typical Application Scenarios
 upadm set loadbalancemode={round-robin|min-queue-depth|min-task}
⚫ Sets the load balancing algorithm at the link level. The value can be round-
robin, min-queue-depth, and min-task.
⚫ The default algorithm is min-queue-depth. UltraPath selects the path that
has the least number of I/Os from all available paths to deliver I/Os.
⚫ When round-robin is used, UltraPath selects all available paths between the
application server and storage arrays one by one to deliver I/Os.
⚫ When min-task is used, UltraPath selects the path that has the least I/O
data volume from all available paths to deliver I/Os.

Typical Scenario | Recommended Configuration
The service I/O models delivered by hosts have small differences and I/Os need to be balanced on each path. | round-robin
The service I/Os delivered by hosts are large data blocks. | min-task
Other scenarios | min-queue-depth (default)
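For example, to switch to min-task on a host that mainly delivers large-block I/Os and confirm the change (a sketch based on the command above):
upadm set loadbalancemode=min-task
upadm show upconfig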
Limitations and Restrictions of
Huawei UltraPath
Operating System | Restrictions and Limitations
AIX | Only the native FC HBAs of AIX application servers can be used. Applicable to the following versions only: 5300-03 and later, 6100-00 and later, 7100-00 and later.
Solaris | SAN boot is not supported.
AIX/Solaris | iSCSI connections are not supported.
Linux/Windows | The native failover function of the HBAs must be disabled.
AIX/Windows/Solaris/ESX/Linux | A LUN cannot be mapped to a host through HBAs of different models or from different vendors. A LUN cannot be mapped to a host using Fibre Channel and iSCSI simultaneously.
Linux | Disk UUIDs instead of drive letters are recommended for mounting file systems, to avoid the impact of drive letter changes (see the example after this table).
AIX/Windows/ESX | On virtualization platforms, if multipathing software has been installed and has taken effect on the host, and LUNs have been allocated to VMs through Raw Device Mapping (RDM) or pass-through mode, UltraPath cannot be installed on the VMs.
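A minimal illustration of the UUID-based mount recommendation for Linux (the device name, mount point, and filesystem type are placeholders):
blkid /dev/sdb1                                        Queries the filesystem UUID of the device.
UUID=<filesystem-uuid>  /data  ext4  defaults  0 2     Example /etc/fstab entry that mounts by UUID instead of by device name.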
OceanStor Dorado
V6 Storage Systems
Product Upgrade
Upgrade Method — Online Upgrade
⚫ An online upgrade is highly reliable. Controllers are upgraded in batches without interrupting ongoing services. It is applicable to scenarios where it is essential that services are not interrupted.
The following table describes the default batch policy for upgrading OceanStor Dorado5000 V6, Dorado6000 V6, and Dorado18000 V6 online.
Controller Quantity | Primary Controller | First Batch | Second Batch
2 | 0A | 0B | 0A
2 | 0B | 0A | 0B
4 | 0A, 1A | 0B, 1B | 0A, 1A
4 | 0B, 1B | 0A, 1A | 0B, 1B
4 (high-end) | 0A, 0C | 0B, 0D | 0A, 0C
4 (high-end) | 0B, 0D | 0A, 0C | 0B, 0D
6 | 0A, 1A | 0B, 1B, 2B | 0A, 1A, 2A
6 | 0B, 1B | 0A, 1A, 2A | 0B, 1B, 2B
8 | 0A, 1A | 0B, 1B, 2B, 3B | 0A, 1A, 2A, 3A
8 | 0B, 1B | 0A, 1A, 2A, 3A | 0B, 1B, 2B, 3B
8 (high-end) | 0A, 0C | 0B, 0D, 1B, 1D | 0A, 0C, 1A, 1C
8 (high-end) | 0B, 0D | 0A, 0C, 1A, 1C | 0B, 0D, 1B, 1D
Reliability of Online Upgrade — Batch
Upgrade Ensures Business Continuity
⚫ Upgrade in batches: When the software of the controllers at one end is upgraded, services on those controllers are taken over by the peer end. The system automatically detects and upgrades the firmware that needs to be upgraded. After the upgrade, the controllers are restarted and powered on, and services are switched back.
⚫ Retry supported: If an upgrade step fails due to a hardware or software bug, you can perform the upgrade again after the bug is fixed.
⚫ Rollback supported: If the retry still fails, you can roll back the system to the source version.
[Figure: first- and second-batch upgrades across controllers 0A/0B and 1A/1B. Step 1: switch over services (retry supported); Step 2: upgrade firmware (retry supported); Step 3: upgrade software (retry and rollback supported); Step 4: restart the system for the upgrade to take effect (retry and rollback supported).]
Reliability of Online Upgrade — Fast Service
Switchover by Port Failover
Before a controller is restarted during an online upgrade, its services are quickly taken over by the ports on the peer
controller. The host detects an intermittent disconnection and then re-establishes the link quickly, reducing the I/O
impact duration to between 3 and 5 seconds.
Principles
[Figure: applications on a host running UltraPath connect through an HBA and a switch to ports P0 and P1 on storage controllers A and B; the figure highlights the service layers affected by the upgrade.]
① Before upgrading controller B, the system disconnects port P1 of controller B.
② The system quickly creates port P1 of controller B on controller A.
③ Controller A registers port P1 on the switch. The switch broadcasts port P1 to the host.
④ The HBA detects that the P1->P1 link is disconnected and attempts to reconnect the link.
⑤ The P1->P1 link is re-established on controller A, and the host continues to deliver I/Os.
⑥ The system starts to upgrade controller B.
Prerequisites
1. The switch must support port failover (NPIV).
2. The network between the host and the storage array must be symmetrical. (Controllers A and B are connected to the same host and are in the same switching network.)
3. The HBA has no compatibility issues. Ensure that the connection can be set up again after port failover.
Networking Requirements for Port
Failover
⚫ Fully symmetric networking:
1. A host port is connected to both controller A and controller B.
2. A host port is connected to both controller A and controller B via the same number of links.
3. The array ports connected to a host port are symmetrical (the slot number and port number are the same).
⚫ Partially symmetric networking:
1. A host port is connected to both controller A and controller B.
2. A host port is connected to both controller A and controller B via the same number of links.
[Figure: examples of fully symmetric and partially symmetric networking between the host, the switch, and ports P0 to P3 on controllers A and B.]
Upgrade Method — Offline Upgrade
⚫ If the upgrade is performed offline, you must stop host applications before
upgrading controller software. During an offline upgrade, all controllers
are upgraded simultaneously, shortening the upgrade time. Because all
host services are stopped before the upgrade, data loss is reduced in the
upgrade process.
Impact on Services
⚫ Online upgrade
During an online upgrade of controller software, the controller restarts and its
services are taken over by other normal controllers. The read and write IOPS
decreases by 10% to 20%. It is recommended that you perform online upgrades in
off-peak hours.

⚫ Offline upgrade
You must stop host services before performing an offline upgrade of controller
software.
Preparations Before an Upgrade
⚫ Obtain upgrade reference documents.
⚫ Obtain software and related tools.
⚫ Perform a site survey before an upgrade.
Site Survey — Array Upgrade
Evaluation
⚫ Array upgrade evaluation checks the array health status before the upgrade, preventing interference caused by potential errors. Ensure that all check items are passed before performing subsequent operations. If you want to upgrade the system forcibly, ensure that you understand the risks and accept the possible consequences.
⚫ In most cases, you do not need to collect array and host information or evaluate compatibility if all the array evaluation items are passed. The actual situation depends on the array evaluation result. If the array information collection, host information collection, or compatibility analysis item becomes unavailable, the array upgrade evaluation is successful and you can skip these items.
Site Survey — Array Information
Collection
⚫ This operation collects array logs for cause analysis if specific upgrade evaluation items failed.
⚫ If all upgrade evaluation items are passed, this operation becomes unavailable on the GUI and you can skip it.
Site Survey — Host Information
Collection
⚫ This operation collects host HBA and multipathing information for compatibility evaluation if the port failover criteria are not met and Huawei UltraPath is not installed on the host.
⚫ If the host compatibility and HBA check items are passed in the array upgrade evaluation, this operation becomes unavailable on the GUI and you can skip it.
Site Survey — Host Compatibility
Evaluation

⚫ This operation evaluates the host compatibility based on the


collected information if the port failover criteria are not met and
Huawei UltraPath is not installed on the host.
⚫ If the host compatibility and HBA check items are passed in the
array upgrade evaluation, this operation becomes unavailable on
the GUI and you can skip it.
Upgrade Procedure — Entering the
Upgrade Page
⚫ Open OceanStor SmartKit. Click Scenario-based Task. Choose
Upgrade/Patch > Device Upgrade.
Upgrade Procedure — Setting
Upgrade Policies
⚫ Click Set Upgrade Policy to add the device, select the upgrade package, set the
upgrade mode, and select the backup path for the configuration data.
Upgrade Procedure — Array
Upgrade Evaluation
⚫ If the upgrade is to be performed more than one day after the site survey is
complete, you must perform an array upgrade evaluation again to ensure reliability.

⚫ You can skip this operation if either of the following conditions is met:
⚫ The upgrade is performed on the same day when the array upgrade evaluation is passed.
⚫ The failed check items have been rectified, the array and host service configurations are not changed,
and the networking is not changed after the evaluation.
Upgrade Procedure — Array
Upgrade
Prerequisites
⚫ All the evaluation and check items in the site survey have been passed.
⚫ If you perform an offline upgrade, all services have been stopped properly.
⚫ If the site survey and upgrade are performed on different days, an array upgrade evaluation has been conducted
again and all check items have been passed.
If you ignore the failed check items and want to upgrade the system forcibly, ensure that you understand the risks and accept the possible
consequences.
Upgrade Procedure — Solving
Upgrade Faults
⚫ If a fault occurs during the upgrade, the upgrade stops and can be retried or rolled
back after manual rectification and confirmation.

⚫ As shown in the figure, the status of the upgrade process is Paused. You can
click Details. In the Details window, select Retry or Roll Back.
Upgrade Procedure — Upgrading
SystemReporter
Prerequisites

⚫ SystemReporter has been installed.

It is recommended that the SystemReporter version be consistent with that in the storage array's version mapping
table. If the array is upgraded, SystemReporter must be upgraded as well. Otherwise, SystemReporter may not
monitor the performance statistics of the array.

Upgrade SystemReporter by following instructions in the OceanStor Dorado5000 V6, Dorado6000 V6, and
Dorado18000 V6 Storage Systems C30SPC100 SystemReporter Upgrade Guide.
Upgrade Procedure — Verifying
Upgrade Results
⚫ Checking system status
 Check the system status using an inspection tool and ensure that the system is not
affected during the upgrade.
⚫ Restarting the value-added services
 If value-added services (such as HyperSnap, HyperMetro, and HyperReplication) are
suspended, stopped, or split before the upgrade, restore them to their original states after
the upgrade.
Rollback Procedure Upon an
Upgrade Failure
⚫ Rollback after an upgrade failure
If a fault occurs during a controller software upgrade, the software is rolled back to the source
version according to the specified rollback policy.

⚫ Rollback policy
 Online upgrade: If a system is not upgraded in the last batch of the upgrade, a rollback
must be performed by maintenance engineers. If a system is upgraded in the last batch of
the upgrade, do not perform a rollback. Instead, solve the problem by following the instructions in
the troubleshooting guide.
 Offline upgrade: If the number of controllers that fail an upgrade equals or exceeds 50%
of the total controller quantity, the upgrade stops and must be retried or rolled back
manually by maintenance engineers. If the number of controllers that fail an upgrade is
smaller than 50% of the total controller quantity, the upgrade can be retried or ignored and
a rollback is not required (see the sketch below).
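The 50% rule for offline upgrades can be expressed as a simple decision check. The following is a minimal sketch; the function and variable names are illustrative and not part of any Huawei tool.

    # Minimal sketch of the offline-upgrade rollback rule described above.
    def offline_upgrade_action(failed_controllers: int, total_controllers: int) -> str:
        """Return the recommended action when controllers fail an offline upgrade."""
        if total_controllers <= 0:
            raise ValueError("total_controllers must be positive")
        if failed_controllers >= total_controllers / 2:
            # 50% or more failed: the upgrade stops; maintenance engineers must
            # retry or roll back manually.
            return "stop: manual retry or rollback by maintenance engineers"
        # Fewer than 50% failed: retry or ignore; no rollback is required.
        return "retry or ignore: rollback not required"

    print(offline_upgrade_action(failed_controllers=1, total_controllers=4))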
Version Downgrade and Use
Scenarios
⚫ Version downgrade
In some cases, the controller software has to be downgraded to the source version even after
a successful upgrade.

⚫ Possible causes for a version downgrade


 Services cannot be recovered after the upgrade.
 System performance deteriorates after the upgrade.
 Configuration data is lost and cannot be retrieved after the upgrade.

⚫ Version downgrade method


 Run the create upgrade session command in developer mode in the CLI. Downgrade
operations are similar to those of an offline upgrade.

 If a downgrade is needed, contact Huawei technical support to evaluate the operation and
obtain the downgrade guide.
Precautions Before an Upgrade
⚫ Before an online upgrade, the available links between the storage system and a host must meet the
following requirements:

◆ At least one available link exists between controller A or C of each engine and the host.
◆ At least one available link exists between controller B or D of each engine and the host.
If your live network does not meet the preceding networking requirements, it is strongly recommended
that you modify your networking mode and then perform an online upgrade. If your networking mode
cannot be modified, adjust the batch upgrade sequence and then perform an online upgrade under
guidance of Huawei technical support engineers.

⚫ Before the upgrade, ensure that the target storage system version is compatible with other management
software of the customer, such as OceanStor BCManager.

⚫ Before the upgrade, ensure that all controllers on at least one engine have links to external LUNs.

⚫ If a local array has replication links to a remote array, you cannot configure the remote array (for example,
creating or deleting the remote array, or adding or removing replication links) if only the local array is
upgraded. Existing configurations are not affected and services can run normally.

⚫ Before an online upgrade, close all DeviceManager pages and do not log in to DeviceManager during the
upgrade.

⚫ If the array has four controllers and its source version is C01SPC100, access the array using the IP
address of the CTE0.SMM0.MGMT port when performing the upgrade.
Precautions During an Upgrade
⚫ Do not configure the storage system.
⚫ Prevent other users who will not perform the upgrade from logging in to
the storage system.
⚫ Do not perform hardware operations (such as removing or inserting
interface modules, power modules in expansion enclosures, or disks).

⚫ Do not use DeviceManager or CLI to deliver configuration or query


commands.
⚫ Ensure persistent power supply.
⚫ Ensure that the network is working properly.
Precautions After an Upgrade
⚫ If specific alarm IDs cannot be found in the document of the target version, query the alarm IDs in the
document of the source version. These alarms do not exist in the target version and must be cleared
manually.
⚫ After the upgrade is complete and the browser is restarted on the maintenance terminal, clear all
cached data from the browser. For example, if you use Internet Explorer, choose Tools > Internet
Options > General > Browsing history and click Delete. In the dialog box that is displayed, clear
the cached data as prompted. Then log in to DeviceManager.
⚫ If the DeviceManager digital certificate or private key imported before the upgrade does not work, use
the backup digital certificate and private key to start background services.
⚫ If a local array has replication links to a remote array and both arrays are upgraded successfully, you
must be authenticated again before configuring the remote array (for example, adding or removing
replication links, or deleting the remote array).
◆ On the CLI, run change remote_device user_password remote_device_id=*
remote_user=mm_user to reset the password for logging in to the remote device.
◆ On DeviceManager, reset the password for logging in to the remote device after the system
prompts an incorrect password.
OceanStor Dorado V6
Storage Systems
Performance and Tuning
Performance Tuning Guideline
1. Service performance is limited by the performance bottlenecks in a system. Each service system has its own bottlenecks in different service scenarios.
2. All optimization methods have restrictions. Optimization beyond the actual requirements wastes time and money, so analyze the performance tuning cost first.
[Figure: performance tuning sequence — architecture optimization, hardware upgrade, code optimization, configuration optimization; from highest cost and most effective to lowest cost and least effective]
System Workflow and Bottlenecks
[Figure: end-to-end I/O stack and the typical indicators of a bottleneck at each layer]
⚫ Server (application — OLTP, OLAP, multimedia; data container — database, file system; operating system; volume management (LVM); block device layer; multipath software; HBA card and driver):
 CPU: about 90% usage, many software interruptions, frequent context switches, long I/O waiting time, great queue depth, high task processing delay.
 Memory: about 80% usage, a large amount of page swapping, low memory hit ratio.
 Multipath/HBA: queue depth < 5, bandwidth usage about 80%.
⚫ Switching devices: bandwidth usage about 80%, frequent retransmission and bit errors.
⚫ Storage subsystem (front-end channel, cache, LUN, RAID, back-end channel, disks):
 Front- and back-end channel usage about 80%, mirroring channel usage about 80%, CPU usage about 80%, disk usage about 80%.
 I/O latency of OLTP services > 5 ms, disk I/O queue depth > 10.
System Tuning Workflow
Preparations:
1. Know your data: data volume, randomness, read/write ratio.
2. Know your applications: the application configuration's impact on data, and the application pressure.
3. Know the tuning objectives: the indicators to be optimized and the objective for each indicator.
4. Back up the service system: service data is crucial and system tuning has risks.
System tuning:
1. Monitor and analyze performance data based on the service process: host performance indicators, the storage I/O process, and network latency.
2. Find the performance bottlenecks and analyze the causes: analyze relevant data based on the situation (detailed data may be required). All of the storage systems, hosts, and networks can cause performance bottlenecks.
3. Optimize one configuration of the system at a time.
4. Check whether the objectives are fulfilled. If not, repeat the tuning steps.
Common Terms
Slow disk: A disk that responds slowly to I/Os, resulting in reduced read/write performance.
Dirty data: Temporary cache data that has not been written to disks.
Deduplication: Deletes duplicate data and leaves only one copy of the data to be stored.
Write amplification: An unexpected phenomenon in SSDs where the actual volume of data written to SSDs is multiple times greater than the data volume intended to be written.
Garbage collection: Copies the valid data in a block to another blank block and erases the original block.
OP space: The over-provisioning (OP) space is reserved on SSDs and cannot be used by users. Its capacity is determined by the controller.
Introduction to Performance Indicators
⚫ IOPS: I/Os per second. Indicates the number of I/Os that a storage device can process each second.
⚫ Bandwidth: unit MB/s. Indicates the volume of data that a storage device can process each second.
⚫ Response time: the processing time of an I/O after it is delivered, in ms. Common indicators are the average response time and the maximum response time.
⚫ Fluctuation rate: its maximum value, minimum value, and mean square error are measured. Common calculation formula: mean square error/average value x 100% (see the sketch below).
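The fluctuation-rate formula can be computed directly from a series of samples. The following is a minimal sketch; here "mean square error" is interpreted as the standard deviation of the samples, which is an assumption — adjust it if your monitoring tool defines the term differently.

    from statistics import mean, pstdev

    def fluctuation_rate(samples):
        """Return the fluctuation rate (%) of a series of latency or IOPS samples."""
        avg = mean(samples)
        return pstdev(samples) / avg * 100 if avg else 0.0

    latency_ms = [0.9, 1.1, 1.0, 1.3, 0.8, 1.0]   # example response-time samples
    print(f"max={max(latency_ms)} ms, min={min(latency_ms)} ms, "
          f"fluctuation={fluctuation_rate(latency_ms):.1f}%")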
Performance Requirements of Various Service Types
⚫ Service systems carry various applications. They can be classified into the following categories based on their I/O characteristics and performance requirements.
OLTP: small blocks (generally 2-8 KB), random access, 20%-60% writes, high concurrency. Performance requirement: high IOPS, low latency.
OLAP: large blocks (generally 64-512 KB), multi-channel sequential access, > 90% reads. Performance requirement: large bandwidth.
Virtual desktop: small blocks (generally < 64 KB), random access, > 80% reads. Performance requirement: high IOPS.
What Are Performance Problems?
1. Performance fluctuates greatly.
2. Performance degrades significantly after a system upgrade.
3. Performance cannot meet service requirements.
4. I/O latency is high and the service response is slow.
Performance Tuning Guideline for
Storage Systems
Step 1 Ensure that the system operating environment is correct and stable.

Step 2 Confirm that I/Os have reached the front end of the storage system
and that the performance bottleneck is on the storage system.

Step 3 Verify that the storage system configurations provide the optimal
performance for the current types of services.

Step 4 Locate and eliminate the bottleneck on the storage system by using
command lines and tools.
Hardware's Impact on Performance
⚫ CPU
⚫ Front-end host port
⚫ Back-end port and disk
CPU
When a CPU works at a low frequency, it provides lower performance than when working at a high frequency. In a light-load
test, for example, the dd command, a single file copy, or a single IOmeter test, the CPU lowers its frequency and performance decreases. Therefore,
before conducting a low-load performance test, it is recommended that you run change cpu frequency in developer mode to
disable CPU underclocking.
When the CPU usage rises, the system scheduling latency increases, thus increasing the I/O latency.
The CPU usage of a storage system varies greatly with I/O models and networking modes. For example:
• Write I/Os consume more CPU resources than read I/Os.
• Random I/Os consume more CPU resources than sequential I/Os.
• IOPS-sensitive services consume more CPU resources than bandwidth-sensitive services.
• iSCSI networks consume more CPU resources than Fibre Channel networks.
You can use SystemReporter, DeviceManager, or the CLI to query the CPU usage of the current controller.
To monitor performance on DeviceManager, choose Monitor > Performance Monitoring, select the desired controller, and
query the statistical indicators.
Front-end Host Port
Front-end host ports process host I/Os. Analyzing the performance factors
of front-end ports helps identify potential performance bottlenecks in a
storage system.

• Before analyzing the performance of front-end host ports, confirm the locations
of interface modules and the number, statuses, and speeds of connected ports.
You can use DeviceManager or the CLI to query information about front-end host
ports.
• If performance fluctuates frequently or declines unexpectedly, front-end host
ports or links may be abnormal. You can use DeviceManager or the inspection
report to check whether the front-end host ports have bit errors.
• Key performance indicators of front-end host ports include the average read I/O
response time, average write I/O response time, average I/O size, IOPS, and
bandwidth. You can use SystemReporter or the CLI to query these indicators.
Back-end Ports and Disks
• Back-end ports are SAS ports that connect a controller enclosure to a disk
enclosure and provide a channel for reading/writing data from/to disks. Back-end
SAS ports' impact on performance typically lies in disk enclosure loops. Currently,
OceanStor Dorado6000 V6 supports 12 Gbit/s SAS ports.
• A single SAS port provides limited bandwidth. The bandwidth supported by the
SAS ports in a loop must be higher than the total bandwidth of all disks in the disk
enclosures that compose the loop. In addition, as the number of disk enclosures in
a loop grows, the latency caused by expansion links increases, affecting back-end
I/O latency and IOPS. Considering these situations, when there are sufficient SAS
ports, disk enclosures should be evenly distributed to multiple loops.
• Due to the global application of the deduplication and compression technologies
and changes in the pool subsystem architecture, OceanStor Dorado6000 V6
currently supports only one disk domain and one storage pool. You do not need to
consider disk selection in disk domains for bandwidth-intensive services (to avoid
dual-port access and disk selection from different engines). However, you still need
to avoid using disks of different capacities or speeds in a disk domain to prevent
bottlenecks caused by single disks.
Impact of Storage Configurations on Performance
⚫ RAID level
⚫ Number of member disks
⚫ Write policy
⚫ Cache watermark
⚫ LUN ownership
⚫ Deduplication and compression
RAID Levels – RAID5, RAID6, RAID-TP
[Figure: a queue of compressed data grains D0-D14 (8 KB granularity, sizes from 1 KB to 8 KB) being written to the stripes of a chunk group with two parity chunks. Dn indicates the data carried by the n-th I/O request.]
➢ Full-stripe write
✓ All the stripes in a chunk group are modified. Parity data is calculated from newly written data.
✓ New data and its metadata are written to a new position. Old data in the original position becomes garbage data and will be reclaimed by the storage pool via garbage collection.
➢ Zero padding
✓ If a stripe is not full when the waiting time expires, all zeros will be written to empty data grains to pad the stripe. Then the system calculates the parity data and writes the stripe to disks.
Description:
1. When writing data to a stripe, the system processes all requests in a queue at a time. For example, in the 1 KB queue, D0 to D14 are processed at once.
2. Each chunk provides 8 KB of space for each stripe. When the 8 KB space on a chunk is full, data is written to the next chunk in the same stripe regardless of the deduplication granularity.
3. Because the data of an I/O request is represented by a data grain, the sizes of data grains vary. A data grain cannot be stored on different disks. For example, after the first 1 KB of data of D5 is written to the first stripe, the remaining 6 KB of data of D5 must be written to the same chunk in the next stripe. The system responds to the D5 I/O request after both stripes have been written to disk.
4. If a stripe is not full when the waiting time expires, the stripe is padded with all 0s and then written to disks.
RAID Levels – RAID5, RAID6, RAID-TP
➢ RAID uses the Huawei-developed Erasure Code
technology. Erasure Code can add m copies of parity
data to n copies of original data to form n + m protection.
You can use any n copies of data to restore the original
data.
➢ RAID5, RAID6, and RAID-TP have one, two, and three
copies of parity data respectively, tolerating one, two, and
three failed disks respectively.
➢ The current version of OceanStor Dorado V6 uses RAID6
by default. You can select the RAID level based on your
requirements on performance, reliability, and space
utilization.

Read performance: RAID5 = RAID6 = RAID-TP
Write performance: RAID5 > RAID6 > RAID-TP
Reliability: RAID5 < RAID6 < RAID-TP
Write amplification: RAID5 < RAID6 < RAID-TP
Space utilization: RAID5 > RAID6 > RAID-TP (see the sketch below)
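Space utilization follows directly from the n + m protection described above: RAID5, RAID6, and RAID-TP add m = 1, 2, and 3 parity copies to n data copies, so the usable ratio is n/(n + m). A minimal sketch (the 22-column chunk group used in the example is only an illustrative value):

    PARITY_COPIES = {"RAID5": 1, "RAID6": 2, "RAID-TP": 3}

    def space_utilization(raid_level: str, data_columns: int) -> float:
        """Usable fraction of a chunk group with the given number of data columns."""
        m = PARITY_COPIES[raid_level]
        return data_columns / (data_columns + m)

    for level in PARITY_COPIES:
        print(f"{level}: {space_utilization(level, 22):.1%} usable")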
Impact of Storage Configurations on Performance
⚫ RAID level
⚫ Number of member disks
⚫ Write policy
⚫ Cache watermark
⚫ LUN ownership
⚫ Deduplication and compression
Relationship Between Performance and the Number of Member Disks
➢ An SSD can carry only a certain number of random I/Os. This depends on its capacity, type of chip, chip manufacturer, firmware version, and OP space.
➢ If the storage system provides sufficient front-end capability, the performance of random read/write services can be improved by adding member disks to a RAID group so that more disks will share the I/O requests.
➢ For random read/write services, a disk supports 5,000 to 12,000 IOPS. For bandwidth-intensive services, a disk supports 120 MB/s of bandwidth (see the sizing sketch below).
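A rough sizing sketch based on the per-disk figures quoted above; the exact per-disk values depend on the SSD model, so both constants are assumptions to be replaced with measured numbers.

    import math

    def disks_for_random_iops(target_iops: float, iops_per_disk: float = 5000) -> int:
        """Conservative estimate using the lower end of the 5,000-12,000 IOPS range."""
        return math.ceil(target_iops / iops_per_disk)

    def disks_for_bandwidth(target_mbps: float, mbps_per_disk: float = 120) -> int:
        return math.ceil(target_mbps / mbps_per_disk)

    print(disks_for_random_iops(280_000))   # example OLTP IOPS target
    print(disks_for_bandwidth(6_000))       # example bandwidth target in MB/s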
Impact of Storage Configurations on Performance
⚫ RAID level
⚫ Number of member disks
⚫ Write policy
⚫ Cache watermark
⚫ LUN ownership
⚫ Deduplication and compression
Write Policy
⚫ There are three cache write policies: write through, write back with mirroring, and write back without mirroring.
[Figure: cache mirroring — the host writes to controllers A and B; each controller holds its own write cache (Data1 on controller A, Data2 on controller B) plus a mirror copy of the peer's data received over the mirror channel, and flushes through RAID to the disks]
Write back without mirroring is not recommended because the data will not have dual protection.
Write Policy
⚫ Select the write policy based on your requirements on performance and reliability.
Write through: high reliability, low performance.
Write back with mirroring: medium reliability, medium performance.
Write back without mirroring: low reliability, high performance.
⚫ Write through is not recommended in the current version.
Impact of Storage Configurations on Performance
⚫ RAID level
⚫ Number of member disks
⚫ Write policy
⚫ Cache watermark
⚫ LUN ownership
⚫ Deduplication and compression
Cache Watermark
When the write policy is write back, the cache uses the high and low watermarks to control the storage capacity and flushing rate for dirty data.
Write cache behavior:
⚫ Above the high watermark — high flushing rate: the flush thread flushes data to disks until the data volume falls below the low watermark.
⚫ Between the high and low watermarks — medium flushing rate: the flush thread flushes a chunk of data to disks immediately.
⚫ Below the low watermark — low flushing rate: the flush thread flushes a chunk of data to disks if no I/O is received for 5 seconds.
Note: A chunk is the granularity at which data is flushed to disks.
Cache Watermark Feature
➢ When the data volume in the cache is lower than or equal to the low watermark, there is
only a low probability that the data will be flushed to disks.

➢ The time for I/Os to stay in the cache largely depends on the value of the low watermark. A
higher low watermark will provide more opportunities for I/Os in the cache to be
consolidated, improving the random write performance.

➢ The default low watermark is 20%. To process multi-channel small sequential I/Os and
OLTP services in the SPC-1 model, you can increase the low watermark, for example, to
40% or 50%.
Impact of Storage Configurations on Performance
⚫ RAID level
⚫ Number of member disks
⚫ Write policy
⚫ Cache watermark
⚫ LUN ownership
⚫ Deduplication and compression
LUN Ownership – Accessing a LUN Through the Owning Controller
⚫ When the host accesses LUN 1 (owned by controller A), it delivers the access request through controller A.
⚫ When the host accesses LUN 2 (owned by controller B), it delivers the access request to controller A. Controller A then forwards the request to controller B via the mirror channel between them.
[Figure: a host connected to controller A; LUN 1 is owned by controller A and LUN 2 by controller B]
Impact of Storage Configurations on Performance
⚫ RAID level
⚫ Number of member disks
⚫ Write policy
⚫ Cache watermark
⚫ LUN ownership
⚫ Deduplication and compression
Deduplication and Compression
⚫ Deduplication and compression can effectively improve space
utilization, reduce the amount of data written to disks, and extend
the lifespan of SSDs. However, they will consume additional
computing resources.
⚫ The deduplication and compression ratio depends on the
characteristics of user data.

Deduplication and compression disabled: high performance, low space utilization, short disk lifespan.
Deduplication and compression enabled: lower performance, high space utilization, long disk lifespan.
Test Tool Type
⚫ I/O test tool
⚫ Service test tool
⚫ Benchmark test tool
Locating Performance Problems
➢ Before performance tuning, you must determine whether the bottleneck is located in hosts, network links, or storage devices.
Two typical symptoms are analyzed: latency problems and fluctuation problems.
Locating Performance Problems – Latency Problems
➢ Check whether the latency in the storage system is normal.
⚫ Latency in the storage system is high: the bottleneck is in the storage system.
⚫ Latency in the storage system is normal or low: the bottleneck is in the host or link.
➢ If a host is directly connected to a storage system via a Fibre Channel network and the host has no bottleneck, the difference between the latencies on the host and storage system is 100-200 μs. In other scenarios, the latencies must be calculated based on actual configurations. (A decision sketch follows.)
➢ If the bottleneck is in the host or links, check for host and link faults as well as test tool configurations.
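A minimal decision sketch for the rules above. The thresholds are illustrative assumptions: judge "high" array latency against your own service target (for example, a 1 ms OLTP target), and the 0.2 ms host-to-array gap applies to the direct Fibre Channel case described above.

    def locate_latency_bottleneck(host_latency_ms: float,
                                  array_latency_ms: float,
                                  target_ms: float = 1.0) -> str:
        """Decide whether the latency bottleneck is in the array or in the host/link."""
        if array_latency_ms > target_ms:
            return "storage system"            # latency inside the array is high
        # For a direct Fibre Channel connection, the host-side latency should be
        # only about 0.1-0.2 ms higher than the array-side latency.
        if host_latency_ms - array_latency_ms > 0.2:
            return "host or link"
        return "no obvious bottleneck on this path"

    print(locate_latency_bottleneck(host_latency_ms=2.5, array_latency_ms=0.4))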
Locating Performance Problems – Fluctuation Problems
➢ Changes in the I/O size, read/write ratio, concurrency, and latency will cause performance fluctuation in the storage system.
➢ Concurrency = IOPS x Latency
➢ Locating the source of fluctuation (see the sketch below):
⚫ I/O size, read/write ratio, or concurrency fluctuating and latency fluctuating: host or link.
⚫ I/O size, read/write ratio, and concurrency stable but latency fluctuating: storage system.
⚫ I/O size, read/write ratio, or concurrency fluctuating and latency stable: host or link.
⚫ Everything stable: host or link.
[Flowchart: if the I/O size, concurrency, or read/write ratio is fluctuating, the host or link performance is unstable; otherwise, if the latency is fluctuating, the storage system is unstable — check garbage collection and QoS configurations that may cause fluctuation; if neither is fluctuating, check the host and link configurations.]
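The relationship Concurrency = IOPS x Latency and the decision rules above can be sketched as follows; the inputs are simply whether the I/O model and the latency are fluctuating.

    def concurrency(iops: float, latency_s: float) -> float:
        """Outstanding I/Os implied by measured IOPS and latency."""
        return iops * latency_s

    def fluctuation_source(io_model_fluctuating: bool, latency_fluctuating: bool) -> str:
        if io_model_fluctuating:
            return "host or link (unstable workload)"
        if latency_fluctuating:
            return "storage system (check garbage collection and QoS settings)"
        return "host or link (check host and link configurations)"

    print(concurrency(iops=100_000, latency_s=0.001))   # 100 outstanding I/Os
    print(fluctuation_source(io_model_fluctuating=False, latency_fluctuating=True))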
Configuration Optimization Guideline
[Figure: the end-to-end I/O stack from the server (application, data container, operating system, LVM, block device layer, multipath software, HBA) through the switching devices to the storage subsystem (front-end channel, cache, LUN, RAID, back-end channel, disk)]
1. Streamline the logical modules based on the I/O process and performance requirements to minimize resource consumption by unnecessary operations.
2. Identify I/O hot spots and properly allocate hardware resources.
3. Ensure that the I/O size, concurrency, and stripes are aligned along the entire data storage process, minimizing unnecessary I/Os.
4. Make full use of the cache to consolidate and schedule data and improve the memory hit ratio.
Data Container Performance Tuning – Database
Tablespace: Allocate as many storage resources as possible for hotspot areas. Select Big File or Small File based on actual requirements.
Cache: Use about 80% of the host memory as the database cache.
Data block: OLTP: 4 KB or 8 KB; OLAP: 32 KB.
Prefetch window: Aligned with the ASM, LVM, or LUN stripe. 512 KB or 1 MB is recommended.
Index: Delete unnecessary indexes. Select B-tree or bitmap.
Partition: Partition a disk when it has more than 100 million records. Use range, list, and hash partitioning based on requirements.
Number of flush processes: Ensure that no process is waiting for free cache buffers.
Log file: 32 MB to 128 MB, five per instance.
Data Container Performance Tuning – File System
⚫ The file system container processes the operations on files or directories delivered by upper-layer modules.
⚫ Select an appropriate file system (a selection sketch follows).
 File systems are classified into log and non-log file systems.
Small files, random access (database server, mail server, small e-commerce system, finance system): Ext3, Reiserfs.
Large files, multi-channel sequential reads (video server): XFS.
Large files, multi-channel sequential writes (video surveillance system): XFS.
Number of server CPUs ≤ 8: Ext3, Reiserfs. Number of server CPUs > 8: XFS.
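A small helper mirroring the selection guidance above. The mapping is only a rule of thumb; validate it against the actual workload and the operating systems you support.

    def pick_filesystem(large_files_sequential: bool, server_cpus: int) -> str:
        """Suggest a file system from the workload pattern and server CPU count."""
        if large_files_sequential or server_cpus > 8:
            return "XFS"
        return "Ext3 or Reiserfs"

    print(pick_filesystem(large_files_sequential=True, server_cpus=4))    # XFS
    print(pick_filesystem(large_files_sequential=False, server_cpus=8))   # Ext3 or Reiserfs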
Data Container Performance Tuning – File System
⚫ Adjust file system parameter settings.
[Figure: three test charts — (1) log optimization test: reiserfs transactions/s (1-4 KB files) before and after separating the journal area, compared with xfs; (2) transactions/s of reiserfs (ordered and writeback log modes) and ext3 (ordered) at read ratios from 100% to 0%; (3) atime vs noatime (1-4 KB): transactions/s before and after disabling access-time logging during file reads]
Operating System Performance
Tuning – Volume Management
Module
➢ When creating an LVM volume, ensure that all LUNs have:
✓ The same stripes
✓ The same capacity
✓ The same number of disks
✓ The same RAID level and different owning controllers
✓ The same stripe size which equals the LVM stripe unit to achieve
load balancing
Operating System Performance Tuning – Block Management Module
➢ The block device module is the core of I/O processing in an operating system. It offers various performance tuning parameters:
✓ I/O alignment
✓ I/O size alignment
✓ Start position alignment
✓ Prefetch window adjustment
✓ I/O scheduling policy adjustment
[Figure: IOPS and average response time (ms) of OLTP applications before and after I/O alignment]
Operating System Performance Tuning – Multipath and HBA Modules
⚫ The HBA module delivers I/Os to storage devices. Pay attention to the following indicators (the values are for an 8 Gbit/s Fibre Channel HBA).
Maximum number of concurrent requests: the maximum number of I/Os that an HBA can deliver in one period. This parameter is adjustable (via the Execution Throttle parameter); you are advised to set it to the maximum value to prevent I/O congestion on the HBA. The maximum number of concurrent I/Os is 256 on a single HBA port.
Maximum I/O size: the maximum I/O size that an HBA can deliver without splitting the I/O. Usually 1 MB; the value can be adjusted by the Frame Size parameter.
Maximum bandwidth: the maximum bandwidth of a single HBA port. You can add HBAs and network ports based on your actual storage bandwidth. The one-way bandwidth is about 750 MB/s on a single HBA port.
Maximum IOPS: the maximum IOPS of a single HBA port. You can add HBAs and network ports based on your actual storage IOPS requirement. The IOPS is about 100,000 on a single HBA port.
Operating System Performance Tuning – Multipath and HBA Modules
➢ The multipath module controls access to storage devices by selecting paths between servers and the storage devices, improving path reliability and performance.
➢ Common multipath policies are as follows.
ROUND_ROBIN: static load balancing. I/Os are delivered to the optimal paths in turn to reduce the I/O workload on a single path. Suitable for applications with light I/O load.
Minimum queue length: dynamic load balancing. I/Os are delivered to the path that has the least number of outstanding I/Os. Suitable for applications with heavy I/O load and requiring low I/O latency, for example, OLTP applications.
Minimum data volume: dynamic load balancing. I/Os are delivered to the path that has the minimum amount of data. Suitable for applications with heavy I/O load and requiring large bandwidth, for example, OLAP and multimedia applications.
Performance Tuning Overview for Storage Systems
Cache write policy: use write back unless otherwise required. Adjust the cache high/low watermarks based on actual requirements.
RAID level: the default value is RAID6. Use RAID5 if you require higher performance or space utilization. Use RAID-TP if you require higher reliability.
Deduplication and compression: use them based on customer requirements or data characteristics.
Performance Tuning Overview for
Storage Systems
➢ Reconfigure the network switching devices between storage
devices and servers to ensure network privacy.
✓ To prevent the network between storage devices and servers from
becoming a bottleneck or being interfered by other services, use
direct connection or a private network to ensure performance.

✓ If only a limited number of switches are available, configure zones or


VLANs on them to ensure logical isolation of network connections.
Storage System Performance Tuning – Disk
⚫ Number of disks required by OLTP applications
OLTP applications require a large number of disks for load sharing. The number can be estimated as follows (see the sketch after this list):
 Collect the performance data of the servers' block device layer and the storage devices' front end. Calculate the maximum read and write IOPS outputs (corresponding to physical reads and writes).
 Collect the disk performance data on the storage devices and calculate the IOPS of a single disk.
 If the latency, especially the read latency, on a disk cannot meet performance requirements, increase the number of member disks. If the queue depth on a disk exceeds 32, the system latency increases, and you also need to add member disks.
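A sketch of the estimation steps above, based on measured values. All inputs are example numbers: the peak read/write IOPS observed at the block device layer / array front end and the IOPS a single disk sustained in the same test.

    import math

    def required_member_disks(peak_read_iops: float, peak_write_iops: float,
                              iops_per_disk: float) -> int:
        """Disks needed so that the measured peak IOPS is shared across members."""
        return math.ceil((peak_read_iops + peak_write_iops) / iops_per_disk)

    def needs_more_disks(disk_queue_depth: float, disk_read_latency_ms: float,
                         latency_target_ms: float) -> bool:
        # Add member disks if the per-disk queue depth exceeds 32 or the disk
        # read latency misses the performance target.
        return disk_queue_depth > 32 or disk_read_latency_ms > latency_target_ms

    print(required_member_disks(180_000, 70_000, iops_per_disk=9_000))
    print(needs_more_disks(disk_queue_depth=40, disk_read_latency_ms=3.0, latency_target_ms=1.0))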
Flowchart for Troubleshooting Storage Performance Problems
Read and write performance problem:
⚫ Storage system fault — LUN write through is used (battery failure, power supply failure, or only one controller is working).
⚫ Link problem — iSCSI link or FC link.
⚫ Other problems — a slow disk exists, the LUN is formatting, RAID groups or LUNs are improperly configured, random small I/Os, or dual-port access.
Troubleshooting Procedure
⚫ Procedure
 Check alarms and events of the storage system: whether an overload alarm is generated; whether a fault alarm is generated.
 Check the hardware operating status: whether a slow disk exists; whether the working rates of network ports are normal; whether the power supply status is normal.
 Check the link status: whether the connections are correct; whether FC links are normal; whether the upper limit of the link performance is reached.
 Check LUN parameter settings: whether the owning controller is correct; whether deduplication and compression are correctly set; whether the prefetch policy and write policy are correct.
 Check the service model and load on the storage system: whether the service model is correct; whether the service load fluctuates; whether the latency meets the requirements; whether the upper limit of the storage system performance is reached.
Checking for Link Bottlenecks
⚫ A link has its upper limits for IOPS and bandwidth. If the actual IOPS or bandwidth exceeds the upper limit, the latency will increase and the performance will fluctuate. The upper limits of typical links (per port) are as follows; a check sketch follows the list.
4G FC: 50,000 IOPS, 370 MB/s
8G FC: 100,000 IOPS, 750 MB/s
16G FC: 200,000 IOPS, 1500 MB/s
10GE: 75,000 IOPS, 600 MB/s
⚫ Generally, the IOPS will not reach the upper limit of a link. However, a high IOPS may cause high usage on a single CPU core, especially the cores handling HBA card interrupts.
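A sketch that compares the measured per-port load against the single-port ceilings listed above; the limit values come straight from the list.

    LINK_LIMITS = {            # link type: (IOPS, bandwidth in MB/s) per port
        "4G FC":  (50_000, 370),
        "8G FC":  (100_000, 750),
        "16G FC": (200_000, 1500),
        "10GE":   (75_000, 600),
    }

    def is_link_bottleneck(link_type: str, port_iops: float, port_mbps: float) -> bool:
        """True if a single port exceeds its IOPS or bandwidth ceiling."""
        max_iops, max_mbps = LINK_LIMITS[link_type]
        return port_iops > max_iops or port_mbps > max_mbps

    print(is_link_bottleneck("8G FC", port_iops=60_000, port_mbps=800))   # True (bandwidth-bound)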
Flowchart for Troubleshooting Network Performance Problems
Network performance problem:
⚫ Network bandwidth — check the iSCSI network connectivity and bandwidth; check the FC network connectivity and bandwidth.
⚫ Network path — check path connectivity using the uptool; adjust the multipathing algorithm; check whether failover is disabled in the cluster environment.
Troubleshooting Methods
➢ The following methods are available for checking network
bandwidth:
✓ For an iSCSI network, use a host to ping the service ports in the
storage system to check for network latency and packet loss.
✓ For a Fibre Channel network, run showfreeport to check the host
port connectivity and then run showfps to view the port rate.

✓ For a Fibre Channel network, you can check bit errors of ports on the
ISM. This helps you determine whether the performance problem is
caused by bit errors.
Troubleshooting Methods
➢ The following methods are available for checking network paths:
✓ Run upadmin show path to check the number of paths between the
host and the storage system and their connectivity.
✓ If multiple paths exist between the host and the storage system, you
can adjust the multipathing algorithm on the host to improve storage
performance.

✓ Check whether the failover function of multipath is disabled on the


host. If it is enabled, path failover will affect the read/write
performance.
Flowchart for Troubleshooting Host Performance Problems
Host performance problem:
⚫ HBA parameter settings.
⚫ Windows host — check the data read/write performance using the performance monitoring tool.
⚫ Linux host — run sar to check the CPU and memory usage; run iostat to check the storage resource usage.
Hosts' Impact on SAN Performance

➢ HBA card
✓ Maximum size of a single request
✓ Maximum number of concurrent requests
✓ HBA driver
Methods for Troubleshooting Host
Performance Problems
➢ Querying Windows host performance
 To check the performance of a Windows host, first collect the performance
monitoring information to confirm the current I/O performance. On the
desktop, choose Start > Run and type Perfmon. You can create a counter log
and select Counter to view the I/O performance.
iostat Command
➢ Sequential services:
✓ rkb/s and wkb/s should reach the theoretical bandwidths.
✓ avgrq-sz should be equal to the block size of upper-layer services.
✓ %util should be close to 100%.
➢ Random services:
✓ r/s and w/s should be equal to the theoretical IOPS.
✓ avgqu-sz should reach a proper value.
✓ await should be less than 30 ms.
(A check sketch follows.)
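A sketch that applies the rules of thumb above to metrics already read from an iostat -x sample; parsing the command output is deliberately left out, and the expected values are inputs you supply.

    def check_random_workload(r_s, w_s, expected_iops, await_ms):
        """Flag deviations from the random-workload rules of thumb."""
        issues = []
        if (r_s + w_s) < expected_iops:
            issues.append("r/s + w/s below the theoretical IOPS")
        if await_ms >= 30:
            issues.append("await is 30 ms or more")
        return issues or ["looks healthy"]

    def check_sequential_workload(rkb_s, wkb_s, expected_kb_s, util_pct):
        """Flag deviations from the sequential-workload rules of thumb."""
        issues = []
        if (rkb_s + wkb_s) < expected_kb_s:
            issues.append("rkb/s + wkb/s below the theoretical bandwidth")
        if util_pct < 90:
            issues.append("%util not close to 100%")
        return issues or ["looks healthy"]

    print(check_random_workload(r_s=7000, w_s=3000, expected_iops=12000, await_ms=35))
    print(check_sequential_workload(rkb_s=400_000, wkb_s=100_000, expected_kb_s=600_000, util_pct=95))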
top Command
✓ load average in the first line shows the average numbers of processes in the
running queue in the past 1, 5, and 15 minutes, respectively.
✓ The second line shows the numbers of processes in various states.
✓ The third line shows the CPU usage.
✓ The fourth line shows the physical memory usage.
✓ The fifth line shows the swapping usage.
top Command
✓ You can press 1 on the keyboard to query the usage of each CPU core.
✓ us indicates the usage of CPU resources in the user state.
✓ sy indicates the usage of CPU resources in the kernel state.
✓ si indicates the CPU usage of software interrupts, which is related to the HBA
card.
✓ id indicates the idle CPU ratio. The CPU may become the bottleneck if the
value is lower than 10%.
OceanStor Dorado V6
Storage System
Troubleshooting
Troubleshooting Principles and
Methods
⚫ Troubleshooting procedure
Troubleshoot faults by following the troubleshooting procedure.

⚫ Basic principles
Help users quickly exclude useless information and locate faults.

⚫ Alarm analysis
Describe how to analyze alarms for troubleshooting a fault.

⚫ Replacement
Describe how to troubleshoot a fault by replacing components of a
storage system.
Troubleshooting Principles and Methods
— Troubleshooting Procedure
⚫ Troubleshooting procedure
Troubleshooting Principles and Methods
— Troubleshooting Procedure
⚫ Required fault information is as follows.
Basic information:
• Device serial number and version: provide the serial number and version of the storage device.
• Customer information: provide the customer's contact person and contact information.
Fault information:
• Fault occurrence time: record the time when the fault occurs.
• Symptom: record details about the fault symptom, such as the content of error messages and event notifications.
• Operations performed before the fault occurs: record operations that are performed before the fault occurs.
• Operations performed after the fault occurs: record operations that are performed after the fault occurs and before the fault is reported to maintenance personnel.
Storage device information:
• Hardware module configuration: record the configuration of hardware modules of the storage devices.
• Indicator status: record the status of indicators on the storage devices. Pay attention to indicators that are steady orange or red.
• Storage system data: manually export operation data and system logs of the storage devices.
• Alarms and logs: manually export alarms and logs of the storage devices.
Networking information:
• Connection mode: describe how the application server and storage devices are connected, such as in Fibre Channel or iSCSI networking mode.
• Switch model: if any switches exist on the network, record the switch model.
• Switch diagnosis information: manually export switch diagnosis information, including startup configurations, current configurations, interface information, time, and system versions.
• Network topology: describe the topology or diagram of the network between the application server and storage devices.
• IP address: if the application server is connected to storage devices over an iSCSI network, describe the IP address planning rules or provide the IP address allocation list.
Application server information:
• OS version: record the type and version of the operating system that the application server runs.
• Port rates: record the port rate of the application server that is connected to storage devices.
• Operating system logs: view and export the operating system log.
Troubleshooting Methods and
Principles — Basic Principles
⚫ Analyze external factors and then internal factors.
 External factor failures include failures in optical fibers, optical cables,
power supplies, and customer's devices.

 Internal factors include disk, controller, and interface module issues.


⚫ Analyze alarms with higher severities and then alarms with lower severities.
The alarm severity sequence from high to low is critical alarms, major alarms, and warnings.
⚫ Analyze common alarms and then uncommon alarms.
When analyzing an event, confirm whether it is an uncommon or common fault and then
determine its impact. Determine whether the fault occurred on only one
component or on multiple components.
Troubleshooting Methods and
Principles — Alarm Analysis
⚫ Overview
Typically, when a system is faulty, many alarms are generated. By
viewing alarm information and analyzing performance data, the type
and location of the fault can be determined roughly.
⚫ Application scenarios
If alarm information can be collected, the alarm analysis method can
be used to locate any faults.
⚫ Summary
By analyzing alarms, you can locate a fault or its cause. You can also
use the alarm analysis method along with other methods to locate a
fault.
Troubleshooting Principles and
Methods — Replacement
⚫ Overview
A fault can be located and rectified by replacing components
suspected to be faulty.
⚫ Application scenarios
This method typically enables engineers to quickly locate faulty
components during hardware troubleshooting. The limitation of this
method is that you must prepare spare parts in advance. Therefore,
you need to make full preparations.
⚫ Summary
Advantages of the replacement method are accurate fault location
and moderate requirements on maintenance personnel.
Common Faults
⚫ Hardware module faults
⚫ Basic storage service faults
⚫ Value-added service faults
⚫ Management software faults
⚫ UltraPath faults
Common Faults — Hardware
Module Faults
⚫ Introduction
Typically, when a hardware module is faulty, its indicator becomes abnormal.
⚫ Common faults
1. The disk enclosure is faulty.
2. The expansion module is faulty.
3. The power module is faulty.
4. The interface module is faulty.
⚫ Common troubleshooting method
When hardware becomes faulty, an alarm is generated. You can view the
alarm information to locate the faulty hardware and then replace the faulty
hardware with a new one using the FRU replacement tool of OceanStor
SmartKit or remove the faulty hardware and insert it again.
Common Faults — Basic Storage
Service Faults
⚫ Introduction
Alarms are generated when basic storage service faults occur.
⚫ Common faults
1. The Fibre Channel link is down.
2. The iSCSI link is down.
3. The storage device fails to be logged in after CHAP authentication
is canceled.
⚫ Common troubleshooting method
Alarms are generated when basic storage service faults occur. You can
clear the alarms by taking recommended actions in the alarm details.
Common Faults — Value-added
Service Faults
⚫ Introduction
Alarms are generated when value-added
service faults occur.
⚫ Common faults
1. Inconsistent number of replication links between storage systems
2. Storage pool offline, LUN fault, or remote replication failure
3. Heterogeneous arrays' link down
For example:
1. Are the configurations of the primary storage device consistent with those of the secondary storage device? Are the storage devices connected by a single link?
2. Is the link down between the primary and secondary storage devices?
3. Reset and restart the service.
⚫ Common troubleshooting method
Alarms are generated when value-added service faults occur. You can clear the alarms
by taking recommended actions in the alarm details.
Common Faults — Management
Software Faults
⚫ Introduction
Storage systems cannot be managed or
maintained.
⚫ Common faults
1. Failure of activating login through a serial port
2. Failure of logging in to OceanStor DeviceManager
3. Incorrect display of OceanStor DeviceManager or SystemReporter
For example:
1. Check whether multiple serial port sessions are logged in at the same time.
2. Check whether the baud rate is correct. Note: typical serial port parameter settings are a baud rate of 115200, 8 data bits, no parity, and 1 stop bit.
3. If it is a Windows-based host, check whether the COM port is occupied.
⚫ Common troubleshooting method
1. The preceding faults are typically
caused by incorrect serial cable
connection or serial port parameter
settings. You can reinsert the serial
cable or reset serial port parameters.
2. If a browser incompatibility issue occurs,
select a browser of a specified version
or reset the browser.
Common Faults — UltraPath Faults
⚫ Introduction
UltraPath malfunctions lead to storage
performance deterioration.
⚫ Common faults
1. An application server fails to load UltraPath after being restarted.
2. A SUSE application server fails to discover multiple paths.
3. A blue screen is displayed when UltraPath is installed on a Windows operating system.
At the same time, check whether:
1. Links are faulty.
2. Switches are faulty.
3. Controllers are faulty.
⚫ Common troubleshooting method
The typical cause is that UltraPath is blocked because the server startup items do not include the
UltraPath information or the HBA driver has a failover function. To resolve the problem, unblock
UltraPath.
Case Study — BBU Faults
⚫ Symptom
On DeviceManager, a BBU's Health Status is Faulty.

The BBU Running/Alarm indicator on the storage device is steady red.

⚫ Alarm information
On the Alarms and Events page of DeviceManager, choose the Current Alarms tab page.
The alarm BBU IS Faulty is displayed.

⚫ Possible causes
The BBU is faulty. In the event of a power failure, the cached data cannot be completely flushed
to the coffer disks, which may result in data loss.
Case Study — BBU Faults

⚫ Fault diagnosis
Case Study — UltraPath Failures
⚫ Symptom
UltraPath installed on an application server is automatically isolated by antivirus software.
As a result, it cannot be used.

⚫ Possible causes

The antivirus software mistakenly takes UltraPath as a virus and therefore, isolates it.
⚫ Recommended actions
1. On the management page of the antivirus software, add UltraPath as a piece of
trusted software.

2. Restart the antivirus software.


⚫ Suggestion and summary
Disable the antivirus software before installing UltraPath on an application server. After
UltraPath is installed, enable the antivirus software and set UltraPath to trusted software.
Case Study — Storage Pool Failures
⚫ Symptom
Services are interrupted, information of storage pool faults and LUN faults is generated,
and alarms of disk failure or disk removal are reported.

⚫ Impact

The storage pool is degraded or fails, and some or all storage services are interrupted.
Host services are interrupted.
⚫ Possible causes

Dual or multiple disks fail, or the disk is faulty.


⚫ Fault diagnosis
Determine whether the alarms of disk failures or disk removal are reported before the
information of storage pool faults and LUN faults is generated, and then determine whether the
storage pool failure is caused by those disk faults.
Case Study — Host I/O
Performance Deterioration
⚫ Symptom
Services time out or are responded to slowly (interrupted).
⚫ Impact
The I/O response time is prolonged as host I/O performance decreases, remarkably reducing
system performance.
⚫ Possible causes
The host is not configured with a redundant path, the controller is faulty, or the controller
cache enters the write through mode.
⚫ Fault diagnosis
Based on the alarm information of storage devices, determine whether the host I/O
performance deterioration is caused by arrays, and whether the controller is faulty or the
write through mode of the controller cache occurs together with disk fault alarms.
After ruling out array factors, you need to log in to the host to check whether single-path
configurations exist.
Case Study — Failure of
Detecting LUNs by an
Application Server
⚫ Symptom
An application server fails to discover LUNs that are mapped to a storage system.
⚫ Impact
If an application server fails to discover LUNs, the application server cannot use the
storage resources.
⚫ Possible causes
Common reasons for a LUN discovery failure on an application server:
The storage pool fails.
The link is abnormal.
The device file is lost (applicable to a UNIX- or Linux-based device).
The dynamic detection mechanism of the application server (running Mac OS X) causes a
failure in detecting LUNs.
The application server (running HP-UX) does not have a LUN with ID 0.
The application server (running Solaris 9) automatically stops scanning for LUNs.
Case Study — Failure of Detecting
LUNs by an Application Server
⚫ Fault diagnosis
Case Study — Controller Failure
in a Non-UltraPath Environment
⚫ Symptom

The indicators on controller A are normal, but the indicators on controller B are turned off.
The application servers connecting to controller B fail to send read/write requests to the
storage system. As a result, the system services are interrupted. On the Performance
monitoring page of DeviceManager, the host port write I/O traffic or read I/O traffic on the
controller B is 0.

⚫ Impact
If a controller is faulty and host services are interrupted when UltraPath is not installed,
you can manually switch the host services to another functional controller.

⚫ Possible causes

The controller is faulty.


Case Study — Controller Failure
in a Non-UltraPath Environment
⚫ Suggestion and summary
To completely resolve the fault,
you are advised to:
1. Install UltraPath on
application servers.
2. Replace the faulty
controller.
3. Upgrade the storage
system.
4. Send the collected log
information to Huawei
technical support engineers
so that they can proceed to
the next step.
Case Study — Fibre Channel Link Faults
⚫ Symptom
Log in to DeviceManager. In the rear view of the storage device, click the interface
module in the red square and check the Fibre Channel host port information. The Fibre
Channel port with Running Status set to Link Down is displayed.

The link indicator of the Fibre Channel host port is steady red or off.

⚫ Alarm information
On the Alarms and Events page of DeviceManager, choose the Current Alarms tab page.
The Link to the Host Port Is Down alarm may be displayed.

⚫ Impact
An unavailable Fibre Channel link causes a link down failure, service interruption, and data
loss between the application server and the storage system.
Case Study — Fibre Channel Link
Faults
⚫ Possible causes
✓ The optical module is faulty.
✓ The optical module is incompatible with the host port.
✓ The rate of the optical module is different from that of the host port.
✓ The optical fiber is poorly connected or faulty.
✓ The port rate of the storage system is different from that of its peer end.
 On a direct connection network, the Working Rate of the Fibre Channel host port
is different from that of the Fibre Channel host bus adapter (HBA) on the
application server.

 On a switch connection network, the Working Rate of the switch is different


from that of the Fibre Channel host port or of the Fibre Channel HBA on the
application server.
Case Study — Fibre Channel Link
Faults
⚫ Fault diagnosis
Case Study — Inconsistent Number of
Replication Links Between Storage Systems
⚫ Symptom
When configuring links between two storage systems:
1. Log in to DeviceManager on each of the two storage systems and choose Data
Protection > Remote Device. On the displayed page, select the remote device and
view its replication links.

2. The numbers of replication links are different on the two storage systems. For
example, one storage system has two replication links whereas its peer storage
system has only one replication link.

⚫ Possible causes
1. The primary controller on the local storage system was powered off in the process of
creating a remote device.

2. The primary controller on the local storage system was powered off in the process of
adding a link to the remote device.
Case Study — Inconsistent Number of
Replication Links Between Storage Systems
⚫ Fault diagnosis
OceanStor 5300 V5
HANDS-ON / KNOWLEDGE TRANSFER
OceanStor V5
Converged Storage
Systems Product
Introduction
Product Positioning (1/2)
OceanStor V5 converged storage systems: converged storage, high-density virtualization, tiered storage, and data disaster recovery.
⚫ Brand-new hardware architecture delivering industry-leading performance and specifications
⚫ Convergence of SAN and NAS
⚫ Outstanding scalability and reliability
Key capabilities:
⚫ Up to eight controllers, with IP Scale-out and load balancing
⚫ Inline deduplication and compression for higher storage resource utilization
⚫ Wide channel: the latest 16 Gbit/s Fibre Channel, 12 Gbit/s SAS, and PCIe 3.0
⚫ Virtualization: block-level virtualization, heterogeneous virtualization, and computing virtualization
⚫ High specifications: large capacity, high cache speed, and a large number of ports
Product Positioning (2/2)
⚫ High-end storage (functions first; large enterprises) — 18800 V5, 18800F V5, 18500 V5, 18500F V5, 6800 V5, 6800F V5
Product features: unified storage, outstanding performance, excellent capacity expansion, high efficiency, flash optimization.
Application scenarios: large-scale consolidation, layer-1 application virtualization, mixed workloads, multiple application programs, high-performance application programs.
⚫ Mid-range storage (functions and prices balanced for medium enterprises; price first for small enterprises) — 5800 V5, 5600 V5, 5500 V5, 5300 V5, 5800F V5, 5600F V5, 5500F V5, 5300F V5
Product features: unified storage, stable performance, good capacity expansion, good capacity optimization, efficiency service, flash optimization.
Application scenarios: enterprise application programs (Oracle databases/emails/SAP), storage consolidation, server virtualization, advanced storage tiering, data protection, file sharing.
⚫ Entry-level storage — 2600 V3, 2200 V3, 2600F V3
Product features: good performance and large capacity, ease of use, cost-effectiveness.
Application scenarios: basic consolidation, Microsoft application programs, entry-level server virtualization, iSCSI SAN, video surveillance.
High-Performance Applications

Hotspot data flow


High-Availability Applications
High-Density and Multi-Service Applications (1)
High-Density and Multi-Service Applications (2)
Product Models (5300/5500/5600/5800/6800 V5)
⚫ OceanStor V5 converged storage systems adopt the PANGEA hardware platform.
 5300 V5/5500 V5 Elite/5500 V5: disk and controller integrated (2 U)
 5600 V5/5800 V5: disk and controller separated (3 U independent engine)
 6800 V5: disk and controller separated (6 U independent engine)
 Active-active controllers
[Figure: 5300 V5/5500 V5 Elite/5500 V5, 5600 V5/5800 V5, and 6800 V5 enclosures]
Difference in V5 as compared with V3:
1. 5300 V5/5500 V5 Elite uses ARM CPUs. 5500/5600/5800/6800 V5 uses the new-generation Intel CPUs.
2. 6800 V5 uses 12-port SAS back-end interface modules instead of 4-port SAS back-end interface modules.
Product Features
⚫ High performance
 PCIe 3.0 high-speed bus and SAS 3.0 high-speed I/O channel
⚫ Flexible scalability
 Hot-swappable I/O interface modules
 Support for 4 interface modules and 2 onboard interface modules (2 U)
 Support for 16 interface modules (3 U)
 Support for 22 interface modules (6 U)
⚫ Robust reliability
 Full redundancy design
 Built-in BBU + data coffer
 Various data protection technologies
⚫ Energy saving
 Intelligent CPU frequency control
 Delicate fan speed control
Difference in V5 as compared with V3:
A 6 U enclosure supports up to 22 interface modules.
2 U Controller Enclosure Architecture
[Figure: 2 U controller enclosure subsystems and channels]
⚫ Service subsystem: interface modules A0, A1, B1, and B0 connected to controller modules A and B over 8 x PCIe GEN3 links; controllers A and B are interconnected by an 8 x PCIe GEN3 mirror channel.
⚫ Disk subsystem: disks 0 to 24.
⚫ Electromechanical subsystem: power/BBU/fan modules 0 and 1 supplying 12 V power.
⚫ Channels: service channel, management channel, and power supply.
Difference in V5 as compared with V3:
1. The above figure shows the 5500 V5. For 5300 V5/5500 V5 Elite, BBUs are integrated on the controller modules.
2. 5300 V5/5500 V5 Elite uses ARM CPUs and 4 x PCIe GEN3 as mirror channels.
5300 V5/5500 V5 Elite Controller Enclosure
⚫ Power/BBU/Fan module: 1+1; AC, –48 V DC, and 240 V DC
⚫ SAS expansion ports: two onboard SAS expansion ports per controller
⚫ Onboard ports: four GE ports per controller
⚫ Interface modules: two slots for hot-swappable I/O modules per controller, which can house up to one back-end SAS module
 Port types: 8 Gbit/s or 16 Gbit/s Fibre Channel, GE, 10GE electrical, 10 Gbit/s FCoE (VN2VF), 10GE optical, and 12 Gbit/s SAS
Difference between V3 and V5:
5300 V5/5500 V5 Elite uses ARM CPUs and does not support 56 Gbit/s IB or 10 Gbit/s FCoE (VN2VN) modules.
5300F/5500/5500F V5 Controller Enclosure
(Front Panel)

1 Coffer disk label 2 Disk module handle


3 Disk module latch 4 Information plate (with ESN)
5 ID display of the controller enclosure 6 Power indicator/Power button

6
5500 V5 Controller Enclosure

1 Disk module 2 Coffer disk label


3 Information plate (with ESN) 4 ID Display of the controller enclosure

5 Power indicator/Power button 6 Disk module latch

7
5300F/5500/5500F V5 Controller Enclosure (Rear Panel)
⚫ Power-BBU-fan module: 1+1; up to 94% power conversion efficiency; independent BBUs; –48 V and 240 V DC power
⚫ SAS expansion ports: two SAS expansion ports for each controller
⚫ USB port: one USB port for each controller (reserved)
⚫ Serial port, configuration network port, and management network port
⚫ Onboard ports: four SmartIO ports per controller (8 Gbit/s or 16 Gbit/s Fibre Channel, 10GE, or 10 Gbit/s FCoE (VN2VF))
⚫ Interface modules: two hot-swappable interface module slots for each controller
 Rich port types: 8 Gbit/s Fibre Channel, 16 Gbit/s Fibre Channel, GE, 10GE electrical, 10 Gbit/s FCoE (VN2VF), 10GE optical, 12 Gbit/s SAS expansion, 10 Gbit/s FCoE (VN2VN), and 56 Gbit/s IB ports
2 U 2.5-Inch Disk Enclosure
⚫ 2.5-inch disk unit: no disk connector; supports 12 Gbit/s SAS disks, NL-SAS disks, and 12 Gbit/s SAS SSDs
⚫ Expansion module: dual expansion modules; 12 Gbit/s SAS uplink and downlink
⚫ 600 W power module: 1+1; DC/AC power supplies
⚫ Fan modules integrated into the disk enclosure (1+1 fan redundancy, namely, each power module equipped with one fan module)
1 Serial port 2 Mini SAS HD expansion port 3 Disk enclosure ID display
Difference in V5 as compared with V3:
SSD, SAS disk, and NL-SAS disk units support only the 12 Gbit/s rate.
Smart IO Interface Module
1 Power indicator/Hot Swap button
2 16 Gbit/s Fibre Channel, 8 Gbit/s Fibre Channel, 10GE, 10 Gbit/s FCoE, or iWARP (Scale-Out) port
3 Port Link/Active/Mode indicator
4 Module handle
5 Port working mode silkscreen
No. 1 — Power indicator:
Steady green: The interface module is running properly.
Blinking green: The interface module receives a hot swap request.
Steady red: The interface module is faulty.
Off: The interface module is powered off.
No. 3 — Port Link/Active/Mode indicator:
Blinking blue slowly: The port is working in FC mode and is not connected.
Blinking blue quickly: The port is working in FC mode and is transmitting data.
Steady blue: The port is working in FC mode and is connected, but is not transmitting data.
Blinking green slowly: The port is working in 10GE/FCoE/iWARP mode and is not connected.
Blinking green quickly: The port is working in 10GE/FCoE/iWARP mode and is transmitting data.
Steady green: The port is working in 10GE/FCoE/iWARP mode and is connected, but is not transmitting data.
Onboard SmartIO Interface Module
1 16 Gbit/s Fibre Channel, 8 Gbit/s Fibre Channel, 10GE, or 10 Gbit/s FCoE port
2 Port Link/Active/Mode indicator
3 Module handle
4 Port working mode silkscreen
Port Link/Active/Mode indicator:
Blinking blue slowly: The port is working in FC mode and is not connected.
Blinking blue quickly: The port is working in FC mode and is transmitting data.
Steady blue: The port is working in FC mode and is connected, but is not transmitting data.
Blinking green slowly: The port is working in 10GE/FCoE mode and is not connected.
Blinking green quickly: The port is working in 10GE/FCoE mode and is transmitting data.
Steady green: The port is working in 10GE/FCoE mode and is connected, but is not transmitting data.
8 Gbit/s Fibre Channel High-Density Interface Module
1 Power indicator/Hot Swap button
2 8 Gbit/s Fibre Channel port
3 Port Link/Active indicator
4 Module handle/Silkscreen
No. 1 — Power indicator/Hot Swap button:
Steady green: The interface module is running properly.
Blinking green: The interface module receives a hot swap request.
Steady red: The interface module is faulty.
Off: The interface module is not powered on or is hot-swappable.
No. 3 — Port Link/Active indicator:
Steady blue: Data is being transmitted between the storage system and the application server at a rate of 8 Gbit/s.
Blinking blue: Data is being transferred.
Steady green: Data is being transmitted between the storage system and the application server at a rate of 4 Gbit/s or 2 Gbit/s.
Blinking green: Data is being transmitted.
Steady red: The port is faulty.
Off: The port link is down.
16 Gbit/s Fibre Channel High-Density Interface Module
1 Power indicator/Hot Swap button
2 Handle
3 16 Gbit/s Fibre Channel port
4 Port Link/Active indicator
No. 1 — Power indicator/Hot Swap button:
Steady green: The interface module is running properly.
Blinking green: The interface module receives a hot swap request.
Steady red: The interface module is faulty.
Off: The interface module is not powered on or is hot-swappable.
No. 4 — Port Link/Active indicator:
Steady blue: Data is being transmitted between the storage system and the application server at a rate of 16 Gbit/s.
Blinking blue: Data is being transferred.
Steady green: Data is being transmitted between the storage system and the application server at a rate of 8 Gbit/s, 4 Gbit/s, or 2 Gbit/s.
Blinking green: Data is being transmitted.
Steady red: The port is faulty.
Off: The port link is down.
Difference in V5 as compared with V3:
16 Gbit/s Fibre Channel high-density interface modules are used in V5. 16 Gbit/s and 8 Gbit/s Fibre Channel high-density interface modules have similar appearances and can be distinguished by the labels on their handles. Each type of interface module can house only its matching optical modules.
8 x 8 Gbit/s Fibre Channel High-Density Interface Module
Difference in V5 as compared with V3:
16 Gbit/s Fibre Channel high-density interface modules are used in V5. 16 Gbit/s and 8 Gbit/s Fibre Channel high-density interface modules have similar appearances and can be distinguished by the labels on their handles. Each type of interface module can house only its matching optical modules.
10GE Electrical Interface Module
1 Power indicator/Hot Swap button
2 10 Gbit/s Ethernet port
3 Port Link/Active indicator
4 Port speed indicator
5 Module handle
No. 1 — Power indicator/Hot Swap button:
Steady green: The interface module is working properly.
Blinking green: The interface module receives a hot swap request.
Steady red: The interface module is faulty.
Off: The interface module is powered off.
No. 3 — Port Link/Active indicator:
Steady green: The link to the application server is normal.
Blinking green: Data is being transferred.
Off: The link to the application server is down or no link exists.
No. 4 — Port speed indicator:
Steady orange: The data transfer rate between the storage system and the application server is 10 Gbit/s.
Off: The data transfer rate between the storage system and the application server is less than 10 Gbit/s.
56 Gbit/s IB Interface Module
1 Power indicator/Hot Swap button
2 4 lane x 14 Gbit/s IB electrical port
3 Port Link indicator
4 Port Active indicator
5 Module handle/Silkscreen
No. 1 — Power indicator/Hot Swap button:
Steady green: The interface module is working properly.
Blinking green: There is a hot swap request to the module.
Steady red: The module is faulty.
Off: The interface module is powered off or hot-swappable.
No. 3 — Port Link indicator:
Steady green: The port is connected properly.
Off: The port link is down.
No. 4 — Port Active indicator:
Steady yellow: Data is being transmitted.
Off: No data is being transmitted.
Overview of OceanStor V5 Software Features
⚫ 5300 V5/5500 V5/5600 V5/5800 V5/6800 V5
 SAN: supported; NAS: supported
 Smart series: SmartThin, SmartQoS, SmartMotion, SmartPartition, SmartCache, SmartCompression, SmartDedupe, SmartMulti-Tenant, SmartTier, SmartVirtualization, SmartMigration, SmartErase, SmartQuota
 Hyper series: HyperSnap, HyperReplication, HyperClone, HyperMetro, HyperCopy, HyperMirror, HyperLock, HyperVault
⚫ 5300F V5/5500F V5/5600F V5/5800F V5/6800F V5
 SAN: supported; NAS: supported
 Smart series: SmartThin, SmartQoS, SmartMotion, SmartPartition, SmartCompression, SmartDedupe, SmartMulti-Tenant, SmartVirtualization, SmartMigration, SmartErase, SmartQuota
 Hyper series: HyperSnap, HyperReplication, HyperClone, HyperMetro, HyperCopy, HyperMirror, HyperLock, HyperVault
⚫ Note:
 Smart and Hyper series software in boldface supports SAN and NAS, blue supports SAN only, and red supports NAS only.
 5300F V5/5500F V5/5600F V5/5800F V5/6800F V5 does not support SmartTier or SmartCache.
SAN+NAS Converged Architecture
Traditional storage systems (separate SAN and NAS devices) vs. OceanStor V5 (NAS, SAN, or NAS+SAN in one system)
⚫ Traditional storage systems:
 Two storage systems are required to provide SAN and NAS services.
 The efficiency of databases and file sharing services cannot be maximized.
⚫ OceanStor V5:
 Block- and file-level data storage is unified, requiring no additional file engines, reducing purchasing costs by 15% and decreasing power consumption.
 Underlying storage resource pools provide both SAN and NAS, ensuring that database and file sharing services are equally efficient.
Integrated and Unified Storage Architecture
(Diagram: OceanStor OS SAN and NAS parallel architecture — the file service (CIFS/NFS) with file semantics and the block service (FC/iSCSI) with LUN semantics sit side by side above shared cache, object/volume layers, system control, and a common storage pool.)
⚫ Parallel: NAS and SAN software protocol stacks are parallel. File systems adopt ROW, and thin LUNs and thick LUNs adopt COW, adapting to different application scenarios.
⚫ Converged: NAS and SAN are converged on the resource allocation and management planes, disk blocks are allocated based on the RAID 2.0 architecture, and cache resources are shared, improving resource utilization.
Software Feature Deployment
(Diagram of the software stack:
Host side: multipathing (failover, failback) and application software (disk guard, host agent).
Management software: GUI/CLI/SNMP; OMM (alarm, log, performance statistics).
Protocol layer: NAS protocols (NFS/CIFS) and SAN protocols (FC/iSCSI/SCSI).
Replication: snapshot, clone, volume mirroring, LUN copy, and remote replication.
Space management: volume management, object management, cache, QoS, object/volume.
Storage pool: RAID 2.0, storage resource management, tiered storage; logical disk; internal disk and heterogeneous LUN.
System control and public mechanism: initialization, configuration change, transaction, system exception, system resources, unified thread, memory management.
Device management: power supply, battery, fan, temperature, controller enclosure, disk enclosure, port, link/channel.
Device driver and OS: FC/SAS/iSCSI drivers, kernel, BSP, BIOS, PCIe, BMC, SES.)
Software Architecture (1)
⚫ Protocol layer (NAS and SAN protocols)
 Processes NAS and SAN interface protocols.

⚫ Replication layer
 Implements value-added replication features for LUNs and file systems, including
HyperReplication, HyperClone, and HyperMirror.

⚫ Space management layer


 Manages underlying space for file systems and LUNs.

 Implements space allocation mechanism in COW and ROW mode.

⚫ Storage pool
 Divides space provided by physical disks into fine-grained blocks so that services are
distributed to all disks, bringing disk performance into full play.

 Improves disk reconstruction speed and shortens reconstruction time.

 Facilitates tiered storage.

1
Software Architecture (2)
⚫ Management software (GUI/CLI/SNMP)
 Enables users to manage storage devices using the GUI and CLI.

⚫ OMM
 Collects and dumps alarms, logs, and performance statistics of storage devices.

⚫ System control
 Manages storage clusters.

 Implements processes such as storage device initialization and power-off, and handles


faults on the control plane.

⚫ Device management
 Monitors and manages storage device hardware, such as fans, power supplies,
controller enclosures, and disk enclosures.

⚫ Device driver/OS
 Provides basic OSs and hardware drivers.

2
Block Virtualization (1)
(Diagram: within a disk domain, each disk is divided into chunks (CKs); CKs from different disks form chunk groups (CKGs); CKGs are divided into extents; and a LUN is composed of extents.)
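The layering shown above (disk → CK → CKG → extent → LUN) can be sketched in a few lines of Python. This is only an illustration of the mapping; the chunk size, CKG width, and extent count are assumed values, not the product's real parameters.

# Illustrative sketch of RAID 2.0+ style block virtualization (assumed sizes):
# disk -> CK (chunk) -> CKG (chunk group across different disks) -> extent -> LUN.
CK_SIZE_MB = 64        # assumed chunk size
CKG_WIDTH = 4          # assumed number of CKs (from different disks) per CKG
EXTENTS_PER_CKG = 8    # assumed number of extents carved from one CKG

def build_ckgs(num_disks, cks_per_disk, width=CKG_WIDTH):
    """Group CKs so that each CKG takes one CK from `width` different disks."""
    ckgs = []
    for ck_index in range(cks_per_disk):
        row = [(disk, ck_index) for disk in range(num_disks)]
        for start in range(0, num_disks - width + 1, width):
            ckgs.append(row[start:start + width])
    return ckgs

def allocate_lun(ckgs, size_mb):
    """Compose a LUN from extents taken off successive CKGs."""
    extent_mb = CK_SIZE_MB * CKG_WIDTH // EXTENTS_PER_CKG
    needed = -(-size_mb // extent_mb)                      # ceiling division
    all_extents = [(ckg_id, e) for ckg_id in range(len(ckgs))
                   for e in range(EXTENTS_PER_CKG)]
    return all_extents[:needed]

ckgs = build_ckgs(num_disks=8, cks_per_disk=16)
lun = allocate_lun(ckgs, size_mb=1024)
print(len(ckgs), "CKGs in the disk domain;", len(lun), "extents in the 1 GB LUN")

Because every CKG draws its CKs from different disks, a LUN's extents end up spread across the whole disk domain, which is what lets services and reconstruction use all disks in parallel.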
Block Virtualization (2)
The following figure shows how application servers use storage space.
(Diagram: within the storage system, a storage pool provides space and hot spare blocks; Host 1 through Host 4 are exposed through mapping views 1–4 to application servers running Windows, Linux, UNIX, and VMs respectively.)
Configuration for Different RAID Levels
The supported configurations are the same on 5300 V5/5500 V5/5600 V5/5800 V5/6800 V5 and 18500/18800 V5.
⚫ RAID 3: typical configuration 2D+1P, 4D+1P, 8D+1P; flexible configuration 2D+1P to 13D+1P
⚫ RAID 5: typical configuration 2D+1P, 4D+1P, 8D+1P; flexible configuration 2D+1P to 13D+1P
⚫ RAID 6: typical configuration 2D+2P, 4D+2P, 8D+2P, 16D+2P; flexible configuration 2D+2P to 26D+2P
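As a quick illustration of the D+P notation, the sketch below computes the share of raw capacity left for data in a few of the configurations above (simple arithmetic only; it ignores hot spare space, metadata, and how RAID 2.0+ actually distributes chunks).

# Usable-capacity ratio of a D+P configuration: data columns / total columns.
def usable_ratio(d, p):
    return d / (d + p)

for label, d, p in (("RAID 5, 4D+1P", 4, 1), ("RAID 5, 8D+1P", 8, 1),
                    ("RAID 6, 8D+2P", 8, 2), ("RAID 6, 16D+2P", 16, 2)):
    print(f"{label}: {usable_ratio(d, p):.0%} of raw capacity holds data")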
SAN Host Multipathing

Path Failover and Failback UltraPath – Self-developed Multipath Software


Physical path
✓ Failover: If a primary path fails, services on the
I/O path
primary path are switched to a backup path to
prevent service interruption due to single point of
HBA1 HBA2 failure.
✓ Failback: After the primary path is recovered, the

×
services fail back from the backup path to the
primary path.
✓ Load balancing: UltraPath can balance I/Os on
paths, evenly distributing loads on hosts.
✓ UltraPath can quickly isolate intermittently
Controller A Controller B interrupted links and links that have bit errors,
ensuring the latency of key applications.
✓ Online upgrade reduces the service downtime.
✓ Path performance statistics
(Figure labels: LUN 0, LUN 1, LUN 2, LUN 3 behind controllers A and B)
✓ In cooperation with the array, the host path can be automatically detected and path fault alarms can be automatically sent.
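The failover, failback, and load-balancing behaviour listed above can be pictured with a small path-selection sketch. This is a simplified illustration under assumed path names, not UltraPath's actual algorithm.

# Simplified multipathing sketch: round-robin load balancing over healthy
# primary paths, failover to backup paths, failback when a primary recovers.
from itertools import cycle

class PathSelector:
    def __init__(self, primary_paths, backup_paths):
        self.primary = list(primary_paths)
        self.backup = list(backup_paths)
        self.failed = set()
        self._rr, self._rr_src = None, None

    def next_path(self):
        healthy = [p for p in self.primary if p not in self.failed] \
                  or [p for p in self.backup if p not in self.failed]
        if not healthy:
            raise IOError("no healthy path available")
        if self._rr_src != tuple(healthy):          # rebuild the round-robin iterator
            self._rr_src, self._rr = tuple(healthy), cycle(healthy)
        return next(self._rr)

    def report_failure(self, path):                 # failover
        self.failed.add(path); self._rr_src = None

    def report_recovery(self, path):                # failback
        self.failed.discard(path); self._rr_src = None

sel = PathSelector(["HBA1->CtrlA"], ["HBA2->CtrlB"])
print(sel.next_path())                 # HBA1->CtrlA (primary path)
sel.report_failure("HBA1->CtrlA")
print(sel.next_path())                 # HBA2->CtrlB (failover to the backup path)
sel.report_recovery("HBA1->CtrlA")
print(sel.next_path())                 # HBA1->CtrlA (failback to the primary path)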
NAS IP Address Failover

Bond port failover


Ethernet port failover

VLAN IP address failover

0
FC/iSCSI Port Failover
(Diagrams: the original service switching method and the port failover solution, both assuming that controller A restarts. Logical and physical links run from the host's upper-layer application, OS, multipathing layer, and HBAs through GE/FC switches to the iSCSI/FC ports: IP1/WWPN1 on controller A and IP2/WWPN2 on controller B.)
Principles of the original service switching method:
(1) Controller A restarts during an upgrade or due to a fault.
(2) The HBAs detect that I/Os to controller A time out (30 seconds by default).
(3) The multipathing software receives the link fault report from the HBAs and switches over I/O paths.
(4) The I/O paths are switched to controller B.
Principles of the port failover solution:
(1) Controller A restarts during an upgrade or due to a fault.
(2) iSCSI IP1 fails over to controller B and sends an ARP message to the switch to perform IP address failover.
(3) WWPN1 fails over to controller B and is re-created.
(4) The HBAs re-establish links (in less than 5 seconds).
Introduction to Highly Reliable Coffer Disks
⚫ Coffer disks consist of the first four disks and system disks. They are used to
save system configurations and dirty data.
⚫ The first four disks are organized into RAID 1 groups to ensure high reliability of
data. System disks of controller A and B back up each other.
⚫ System disks save system configurations and dirty data during power failure.
(Diagram: deployment of coffer disks in a single engine — controllers A and B each contain system disks ssd0 and ssd1, and the first four disks (disk0–disk3) sit in the disk enclosure. The system disk partitions include OS, VAULT, DB, LOGZONE, and SYSPOOL (CCDB) areas; the first-four-disk partitions include DB, LOGZONE, and SYSPOOL (CCDB) areas.)
Data Protection
(Diagrams: persistent cache — cache pages A/A* and B/B* are mirrored between controllers A and B; data protection during power failure — when a power failure occurs, dirty cache data is written to the system (coffer) disks; memory protection during a controller reset — the mirrored copy on the peer controller protects the data.)
OceanStor V5&18000 V5 Converged Storage Systems V500R007 - CIFS
Overview
⚫ Barry Feigenbaum originally designed Server Message Block (SMB) at IBM with the aim of turning DOS "Interrupt 13" local file access into a networked file system. SMB is used for sharing files, printers, and serial ports among computers.
⚫ In 1996, Microsoft renamed SMB to Common Internet File System (CIFS) and added many new functions. Now CIFS is a collective name for SMB that includes SMB1, SMB2, and SMB3.
⚫ SMB is a client/server and request/response protocol. (Figure: a CIFS client and a CIFS server communicating over the network.)
Overview
Since being developed in 1988, SMB has been available in multiple versions (SMB1 in 1988, SMB2 in 2007, and SMB3 in 2012).
(Timeline: named as SMB → Samba implemented → SMB2 redefined and new features added → SMB2.1 introduced → SMB3.0 (or SMB2.2) introduced)
Disadvantages of SMB1
⚫ Poor scalability: As a product of the DOS era, SMB1's WAN speed was lower than 10 Mbit/s, and it allowed few files to be opened, few shares, and few users.
⚫ Poor security: Previously, security was not a priority in the development of SMB1. Although digital signing was added in Windows 2000, the MD5 algorithm was not that secure and was cracked later.
⚫ Complex operations: After continuous evolution over 20 years, SMB1 has up to 13 sub-versions and more than 100 commands (even 14 read commands).
⚫ Poor performance: SMB1 has only 1/3 of the WAN speed of SMB2.0.
SMB1.x was too old and needed a complete change.
Changes in SMB2
After nearly 20 years of effort, Microsoft finally redefined its SMB architecture and launched SMB2.0.
Comparison (SMB1 vs. SMB2):
⚫ OS bit mode (user, file, share): 16-bit vs. 32- or 64-bit
⚫ Number of sub-versions: 13 vs. 2
⚫ Number of commands: more than 100 vs. 19
⚫ Signature algorithm: MD5 vs. SHA-256
⚫ LEASE support: not supported vs. supported
⚫ Preference: low vs. high
⚫ Number of bottom-layer transmission protocols: 4 vs. 2
⚫ Applicable to high-latency networks: not applicable vs. applicable
⚫ Flow control support: not supported vs. supported
SMB2 is faster, more secure, simpler, and more scalable.
SMB3 Feature: Transparent Failover
Homedir Features
⚫ Homedir can be regarded as a share. Its difference from a common share is
that access to a homedir share is actually access to a user's private directory.
Like a common share, a homedir can be created, deleted, modified, queried,
configured with a share privilege, or enabled/disabled.
⚫ Homedir has the following features:
✓ Allows a customer to manage different users' services separately by
dividing different users' home directories to different file systems.
✓ Allows a user to access one or more home directories by the configured
share name(s), and to switch between the multiple home directories by
share name. (Providing multiple home directories for a single user enables
better homedir scalability for this user.)
✓ Like common shares, allows all share-related features to be
enabled/disabled, thereby enabling control over the users' access to
homedir services.
✓ Offers AutoCreate for mapping rules, preventing administrators from
creating homedir directories separately for each CIFS user and thereby
reducing the administrator's O&M load.
MMC Features
⚫ Microsoft Management Console (MMC) is the management console in Windows. It provides a unified, standardized management interface for Windows administrators to manage hardware, software, and network components.
⚫ In medium-and large-scale NAS networking scenarios, there may be multiple NAS
servers. If the NAS administrator had to log in to each single NAS server for daily
management, that would be very time consuming. To address this issue and
improve the management efficiency, the MMC provides a centralized
management platform to manage all NAS servers in a unified manner.
⚫ The MMC communicates with storage systems using the standard MSRPC (over
SMB1 or SMB2) protocol. The MMC workflow is as follows:
Client Server
CIFS Share
Management

Local User/Group
Management

MSRPC
MMC MS-RPC Processing Module SMB Session
Management

SMB OpenFile
Management
GNS Features
⚫ Global namespace (GNS) is a file virtualization technology, aggregating different file systems and providing a unified access namespace. GNS allows clients to access files even without knowing the locations of the discrete files, just like accessing websites without needing to know their IP addresses. It
also enables administrators to manage data on geographically scattered
heterogeneous devices using a unified console.
⚫ In OceanStor V5 storage, GNS is implemented as a CIFS share. The CIFS
protocol provides global root nodes to which each individual file system can
be aggregated, thereby presenting a unified view (based on file system
names). By accessing a GNS share, users can view all created file systems.
⚫ In actual use, GNS shares are nearly the same as common shares. Better than
common shares, the GNS share function provides a global unified view for
storage administrators, facilitating their daily maintenance and management.
⚫ By accessing a GNS share, you can view and access all created file systems. If
a service access node is not a home node for the file system, the file system
will forward the I/Os from this access node, compromising system
performance. To avoid the performance compromise, you can enable the GNS
forwarding function to ensure that the service access node is always a home
node of the file system.
Version Requirements on CIFS Clients
Negotiated SMB version by client/server OS:
⚫ Windows 8 / Windows Server 2012 with Windows 8 / Windows Server 2012: SMB 3.0; with Windows 7 / Windows Server 2008 R2: SMB 2.1; with Windows Vista / Windows Server 2008: SMB 2.0; with previous versions of Windows: SMB 1.0
⚫ Windows 7 / Windows Server 2008 R2 with Windows 7 / Windows Server 2008 R2: SMB 2.1; with Windows Vista / Windows Server 2008: SMB 2.0; with previous versions of Windows: SMB 1.0
⚫ Windows Vista / Windows Server 2008 with Windows Vista / Windows Server 2008: SMB 2.0; with previous versions of Windows: SMB 1.0
⚫ Previous versions of Windows with any OS: SMB 1.0
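The table boils down to one rule: both sides speak the highest SMB dialect they have in common. A minimal sketch of that rule, with the per-OS dialect sets taken from the table above (this is not the storage system's negotiation code):

# Negotiated SMB version = highest dialect supported by both client and server.
SUPPORT = {
    "Windows 8 / Windows Server 2012": {"1.0", "2.0", "2.1", "3.0"},
    "Windows 7 / Windows Server 2008 R2": {"1.0", "2.0", "2.1"},
    "Windows Vista / Windows Server 2008": {"1.0", "2.0"},
    "Previous versions of Windows": {"1.0"},
}

def negotiate(client_os, server_os):
    common = SUPPORT[client_os] & SUPPORT[server_os]
    return "SMB " + max(common)          # dialect strings compare correctly here

print(negotiate("Windows 8 / Windows Server 2012",
                "Windows 7 / Windows Server 2008 R2"))    # SMB 2.1
print(negotiate("Windows Vista / Windows Server 2008",
                "Previous versions of Windows"))          # SMB 1.0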
Working Principles
SMB2 message exchange between client and server:
1. Protocol negotiation (protocol handshake): SMB2 NEGOTIATE Request / Response
2. Session setup (security authentication): SMB2 SESSION_SETUP Request / Response
3. Tree connect (connection to a share): SMB2 TREE_CONNECT Request / Response
4. File operations (network file operations)
5. Tree disconnect (disconnection): SMB2 TREE_DISCONNECT Request / Response
CIFS
Authentication:
NTLM and
Kerberos

NTLM Kerberos
Typical Application
Scenarios
CIFS is mainly applied in file share scenarios, typically enterprise file servers
and media assets:

⚫ File Share Service


CIFS is commonly used to provide file share service for users (for example, file share on
enterprise file servers and in the media assets industry).
Typical Application Scenarios
File share service: enterprise file servers and media assets
(Figure: enterprise office clients access the NAS service over the LAN; Windows-based management covers performance monitoring, service management, share management, and user management; a DNS server and an AD server provide name resolution and authentication. Authentication traffic, management traffic, and service data flow between these components.)
Configuring
CIFS
⚫Creating a User

⚫ Creating a Share

⚫ Setting the Share Permission

⚫ Adding a Server to an AD Domain

⚫ Creating a GNS Share

⚫ Creating a Homedir Share


Creating a
User
Click Create:
Creating a
Share
Setting the Share
Permission
Adding a Server to an AD
Domain
Creating a GNS
Share
Creating a
Homedir Share
OceanStor V5&18000 V5
Converged Storage Systems
NFS Introduction
Position
ing
NFS
⚫ Network File System (NFS)
⚫ NFS is a commonly used IETF network file sharing protocol in UNIX-like system
environments such as Linux, UNIX, AIX, HP-UX, and Mac OS X.

Positioning
⚫ Functions as a network file storage system in UNIX-like system environments such as Linux,
UNIX, AIX, HP-UX, and Mac OS X.

⚫ Simplifies the access of remote files by application programs.


⚫ Supports diskless workstations.
Positioning –
Example Sharing the file system with
other computers using NFS

/mnt/nfs120

/home/wenhai/tmp/d01

Accessing the
file systems of
other
computers
using NFS
Accessing the file systemsof
other computers using NFS
Working
Principles
User and
application
File system

File operation request and response


NFS client NFS server

NFS request and response


Client RPC RPC request and Server RPC
response

Theoretically, RPC data can be transmitted over IP/Ethernet or


IP/InfiniBand, as well as RDMA/IP/Ethernet or RDMA/IP/InfiniBand.
As of July 2015, OceanStor V5 does not support NFS over RDMA.
Working Principles – Protocol Stack
OSI model layer → protocol:
⚫ Application layer: NFS, file system mounting, RPC port mapping, and NIS
⚫ Presentation layer: XDR
⚫ Session layer: RPC
⚫ Transport layer: TCP and UDP
⚫ Network layer: IP
⚫ Data link and physical layers: Ethernet, IB, or other communication media supporting IP
Working Principles – NFS V3
(Diagram: the NFS client talks over RPC to the port mapping service (PORTMAP), the NLM service process, the mount service process, and the NFS service process on the server; the NFS service process accesses the file system.)
⚫ Network file access is implemented based on the NFS V3 protocol.
⚫ Multiple RPC servers and clients are required.
⚫ Multiple TCP socket ports are required.
⚫ The three service layers conform to different protocol standards.
Working Principles – NFS V4.0
(Diagram: the NFS client talks over RPC to a single NFS service process on the server, which accesses the file system.)
⚫ Network file access is implemented based on the NFS V4.0 protocol.
⚫ Only one pair of RPC server and client is required.
⚫ Only one TCP socket port is required.
Software
Architecture
iSCSI/FC/FCOE NFS/CIFS/FTP/HTTP

Block service File service

Storage pool RAID 2.0+

Disk management is based on disk domains.


Space management is based on storage pools and RAID
2.0+.
Block storage and file storage services are provided
based on the disk and space management.

3
Software Architecture – Unified Storage
⚫ The following list provides compatibility information about basic NFS connectivity:
 Ubuntu 12.04 LTS
 HP-UX 11i v2, HP-UX 11i v3
 Red Hat Enterprise Linux 5, Red Hat Enterprise Linux 6
 SUSE Linux Enterprise Server 10, SUSE Linux Enterprise Server 11
 Asianux 3.0, Asianux 4.0, Asianux 4.2
 AIX 5.3 TL12, AIX 6.1 TL5, AIX 7.1 TL0
 Mac OS X 10.6, Mac OS X 10.7, Mac OS X 10.8
⚫ For details about compatibility information, visit
http://3ms.huawei.com/mm/docMaintain/mmMaintain.do?method=showMMDetail&f_id=STR15073109310058.
Feature Description – Basic Networking Mode
⚫ NFS is one of the two most commonly used network sharing protocols.
⚫ NFS applies to UNIX-like system environments such as Linux, UNIX, AIX, HP-UX, and Mac OS X.
⚫ NFS is widely used in cloud computing and databases.
(Figure: a host running the NFS client connects over the network to the NFS server on the unified storage system.)
⚫ Market requirements: high performance, robust reliability, flexible scalability, and easy management.
⚫ Competitive analysis: all enterprise-level NAS products support NFS.
Feature Description – UNIX User Permission Control
⚫ Three security modes, including UNIX, NIS, and LDAP, are supported.
⚫ The following figure shows the UNIX security mode.
(Figure: a host running the NFS client connects over the network to the NFS server on the unified storage system.)
 User information is presented using UID and GID in the UNIX system environment.
 Identity authentication and permission verification are performed in the same way as in the local security mode.
Feature Description – NIS User Permission Control
⚫ The following figure shows the NIS security mode.
(Figure: a host running the NFS client and the unified storage system both connect to the NIS server over the network.)
 The unified storage device and the host must join the NIS domain.
 User information is presented using user names and group names in the NIS domain.
 Identity authentication and permission verification are performed by the NIS server.
Feature Description – LDAP User Permission Control
⚫ The following figure shows the LDAP security mode.
(Figure: a host running the NFS client and the unified storage system both connect to the LDAP server over the network.)
 The unified storage device and the host must join the LDAP domain.
 User information is presented using user names and group names in the LDAP domain.
 Identity authentication and permission verification are performed by the LDAP server.
NFS
Benefits
• Functions as a network file storage system in UNIX-like system
environments such as Linux, UNIX, AIX, HP-UX, and Mac OS X. With
NFS, users can access files in other systems like accessing local files.

• Supports diskless workstations, reducing network costs.

• Simplifies the access of remote files by application programs. No


special processes need to be invoked to access remote files.

0
Feature Description – Audit Log
⚫ NFS V3/V4 supports audit logs.
⚫ NFS audit logs are used by customers to perform secondary audits as well as real-time background monitoring and data analysis for the system.

Audit server

Host
NFS client
Network
⚫ Unified storage

NFS server

1. Administrators can dynamically configure NFS log audit rules in a


granularity of the share and operation.
2. When an NFS client accesses a shared file, operations that meet the rules
are recorded in operation logs.
3. Connections to the customer's external audit log server are allowed for
the second audit.

1
Feature Description –
Global Namespace
⚫ The NFS protocol provides a global access root node /. Each independent file
system can be aggregated to the virtual root node. You can use an NFS host to
access the / directory to view the unified directory structure.

NFS server
/

FS01 FS02 FS03 FS04

DIR1 QT1

1. Administrators can dynamically create, modify, and


query the NFS global namespace share.
2. Each tenant can create only one global namespace share.
3. By accessing the global namespace, an NFS client can
easily view all independent file systems that it has the
permission to access.

2
NFS Advantages
⚫ Scalability: NFS is a standard industry protocol. It (from V2 to V4, which includes 4.1, pNFS, and 4.2) is widely used in UNIX-like system environments such as Linux, UNIX, AIX, HP-UX, and Mac OS X.

⚫ Reliability: NFS adopts a reliability design based on standard


specifications.

⚫ Performance: NFS is widely used in the high-performance computing


field.

3
NFS Share
Configurations
⚫ Configuring permission

⚫ Creating an NFS share

⚫ Setting share permission

5
Configuring
Permission – LDAP
Step 1: Go to the LDAP Domain Settings page.
Step 2: Set related parameters. Primary IP address, Port, Protocol, and Base DN are

Domain Settings
mandatory. Other parameters are optional.
Step 3: After completing the settings, click Save.

6
Configuring
Permission – NIS
Step 1: Go to the NIS Domain Settings page.
Step 2: Set Domain Name and Primary IP address.
Domain Settings
Step 3: After completing the settings, click Save.

7
Creating an NFS
Share
Step 1: Select a file system and create an NFS share as prompted.
If you want to share a quota tree, select a quota tree.
If you want to specify extra information about the NFS share to be created, enter the
information in Description.
Step 2: After completing the settings, click Next.

8
Setting
Permission (1)
Step 1: Click Add to set access permission for clients to access the NFS share.

9
Setting Permission (2)
Step 2: Select a client type.
Step 3: Set Name or IP Address. If you set Type to Host, enter the host name or IP address. If you set Type to Network Group, enter the network group name or IP address. Symbol * indicates any host name or IP address. For example, 192.168.* indicates any IP address between 192.168.0.0 and 192.168.255.255.
Step 4: Select share permission.
Step 5: Click OK.
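The wildcard rule in step 3 (for example, 192.168.* covering 192.168.0.0 to 192.168.255.255) behaves like ordinary pattern matching. A small illustration using Python's fnmatch module, not the storage system's own matcher:

# Illustrates how a client rule such as "192.168.*" matches client addresses.
from fnmatch import fnmatch

def client_matches(client_ip, rule):
    return fnmatch(client_ip, rule)

for ip in ("192.168.0.10", "192.168.255.255", "10.0.0.1"):
    print(ip, "->", client_matches(ip, "192.168.*"))
# 192.168.0.10 -> True, 192.168.255.255 -> True, 10.0.0.1 -> False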

0
Setting Permission (3)
Step 6: In the client list, select a client to assign the client the access permission for the NFS share. In the following figure, symbol * indicates that any host or IP address has only READ permission.
Step 7: Click Next to complete the permission settings.
1
Completing the
NFS Share
Configuration
Click Finish to complete the NFS share configuration. The execution result will be
displayed.

2
Background Web File Sharing
⚫ Unified storage in the background of web servers
(Figure: web servers running the NFS client sit behind a firewall on the external network; over the internal IP network they mount the NFS server on the unified storage system, and an NIS server provides user information.)
The following provides a method for clients to mount NFS:


linux-yuyo:/home/a/tmp # mount -t nfs 129.88.22.101:/nfsshare d01
linux-yuyo:/home/a/tmp # mount
...
129.88.22.101:/nfsshare on /home/a/tmp/d01 type nfs
(rw,addr=129.88.22.101)
linux-yuyo:/home/a/tmp #

4
Database File Storage
⚫ Database files are stored on NFS shares.
(Figure: database servers running the NFS client connect over the internal IP network to the NFS server on the unified storage system; an NIS server provides user information.)
Oracle databases have a built-in NFS client to move database storage space to the
shared space on the NFS server.
The NFS client improves database performance.

5
Cloud Computing Shared Storage
⚫ Cloud computing uses the NFS server for internal shared storage.
(Figure: cloud computing servers running the NFS client sit behind a firewall on the external network and mount the NFS server on the unified storage system over the internal IP network.)
VMware optimizes the NFS client and moves virtual machine storage space to the shared
space on the NFS server. The NFS client optimized based on cloud computing provides higher
performance and reliability.

6
Common Problems in NFS
Applications
⚫ The NFS client runs in a system using a 32-bit CPU.
 Because the NFS server uses a 64-bit CPU, the NFS running in a system using a 32-bit
CPU may fail to process 64-bit file data from the NFS server. As a result, applications
cannot access files normally.
 However, some new operating systems and applications can enable 32-bit CPUs to
process data from the NFS server using a 64-bit CPU.

⚫ A firewall is deployed between the NFS client and NFS server.


 In such a case, you need to open a port required by the NFS protocol on the firewall.

⚫ Applications that originally use local file systems need to be migrated to NFS
storage.
 Some special functions of local file systems are not supported by NFS. In such a case,
tests must be performed to check whether those applications can run on NFS.

7
OceanStor V5&18000
V5 Converged
Storage Systems
FTP Introduction
Software Introduction
— Protocol
File Transfer Protocol (FTP) is used to control bidirectional file transfer on the
Internet. It is also an application. FTP applications vary according to different
operating systems. These applications use the same protocol to transfer files.

FTP is usually used for downloading and uploading files. You can download files
from remote servers to your computers or upload files from your computers to
remote servers. That is, you can use client programs to download files from or
upload files to remote servers.
Software
Architecture
FTP is an application-layer protocol in the TCP/IP protocol family. It uses two types of TCP
connections: control connection and data connection. Its software architecture is as follows:

TCP/IP communication service

Control channel Data channel

(Figure: the TCP/IP communication service provides a control channel and a data channel; a listening process spawns user working processes 1 to n, supported by file management, configuration management, and process management.)
Overvi
ew
⚫ FTP is a common protocol used to transfer files between remote servers and local hosts
over IP networks. Before the World Wide Web (WWW) appeared, users used command lines to
transfer files, and the most commonly used file transfer application was FTP. Although most
users now use email and the web to transfer files, FTP is still widely used.
⚫ The FTP protocol is an application-layer protocol in the TCP/IP protocol family. TCP port 20
is used to transfer data and TCP port 21 is used to transfer control messages. Basic FTP
operations are described in RFC959.
⚫ FTP provides two file transferring modes:
⚫ Binary mode: Program files (such as .app, .bin, and .btm files) are transferred in binary
mode.
⚫ ASCII mode: Text files (such as .txt, .bat, and .cfg files) are transferred in ASCII mode.
⚫ FTP can work in either of the following modes:
⚫ Active mode (PORT): In this mode, the FTP server sends requests to set up data connections. This mode does not work if FTP clients are protected by firewalls (for example, FTP clients residing on private networks).
⚫ Passive mode (PASV): In this mode, FTP clients send requests to set up data connections. This mode does not work if the FTP server forbids FTP clients from connecting to its ports whose port numbers are higher than 1024.
⚫ The methods of setting up control links in PORT and PASV modes are the same, but those of setting up data links are different. Since the two methods have their own advantages and disadvantages, choose one of them to set up data links based on the networking environment.
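To make the two modes concrete, here is an example session using Python's standard ftplib client. The host name, credentials, and file name are placeholders; ftplib defaults to passive (PASV) mode, and set_pasv(False) switches it to active (PORT) mode.

# Example FTP session with Python's standard ftplib (placeholder host/credentials).
from ftplib import FTP

with FTP() as ftp:
    ftp.connect("ftp.example.com", 21)       # control connection on TCP port 21
    ftp.login("user", "password")
    ftp.set_pasv(True)                       # PASV: the client opens the data connection
    # ftp.set_pasv(False)                    # PORT: the server connects back for data
    ftp.retrlines("LIST")                    # directory listing over a data connection
    with open("report.txt", "rb") as f:
        ftp.storbinary("STOR report.txt", f) # upload in binary mode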
Restricted
Scenarios
Since FTP transfers files in plaintext, the data that is transferred and the user
name and password used for authentication can be obtained by methods such as
the packet capture. Therefore, FTP is restricted in scenarios that require high
security, such as a scenario where confidential files are transferred.
Active Mode of the FTP
Server (1)
An FTP client sends a PORT command to inform the FTP server of the IP address and temporary port
used to receive the data connection setup request sent by the FTP server from port 20. Since the FTP
server sends the data connection setup request, the FTP server works in PORT mode. For example, as
shown in the following figure, the FTP client uses temporary port 30000 and IP address 192.168.10.50
to receive the data connection setup request.

Scenario 1 Setting up a control connection in PORT mode

192.168.10.200
192.168.10.50
SYN

ACK + SYN 21
ACK

FTP server
FTP client Control connection
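For reference, the PORT command sent in this exchange encodes the client's IP address and data port as six comma-separated values h1,h2,h3,h4,p1,p2, where the data port equals p1 x 256 + p2 (RFC 959). A small sketch using the figure's values (192.168.10.50, port 30000):

# PORT argument encoding: h1,h2,h3,h4,p1,p2 with data port = p1*256 + p2.
def port_argument(ip, port):
    return ",".join(ip.split(".") + [str(port // 256), str(port % 256)])

def parse_port_argument(arg):
    parts = arg.split(",")
    return ".".join(parts[:4]), int(parts[4]) * 256 + int(parts[5])

arg = port_argument("192.168.10.50", 30000)
print("PORT", arg)                 # PORT 192,168,10,50,117,48
print(parse_port_argument(arg))    # ('192.168.10.50', 30000)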
Active Mode of the FTP
Server (2)
A data connection will be set up after a control connection is set up. If the file list on the FTP server can be
viewed on the FTP client, the data connection is set up successfully. If directory listing times out, the data
connection fails to be set up.

Scenario 1 Setting up a data connection in PORT mode

192.168.10.200
192.168.10.50
SYN

30000
ACK + SYN 20
ACK

FTP server
FTP client Data connection
Passive Mode of the FTP
Server (1)
An FTP client uses the PASV command to notify the FTP server that the FTP client will initiate the data connection. The FTP server then informs the FTP client of the temporary port and IP address used to receive the data connection setup request. For example, as shown in the following figure, the FTP server uses temporary port 30000 and IP address 192.168.10.200 to receive the data connection setup request from the FTP client. The FTP client then sends the request to port 30000 at IP address 192.168.10.200. Since the FTP server passively receives the data connection setup request, the FTP server works in PASV mode.

Scenario 1: Setting up a control connection in PASV mode

192.168.10.200
192.168.10.50 SYN

ACK + SYN 21
ACK

FTP server
FTP client Control connection
Passive Mode of the FTP
Server (2)
If the file list on the FTP server can be viewed on the FTP client, the data connection is set up successfully. If
directory listing times out, the data connection fails to be set up.

Scenario 1 Setting up a data connection mode in PASV mode

192.168.10.200
192.168.10.50 SYN

ACK + SYN 30000


ACK

FTP server
FTP client Data connection
Scenario — Setting Up a
Server for Sharing Learning
Materials
1. Background
Employees in a small company often use chat tools to transfer files for sharing some
learning materials. However, these learning materials are saved on the computers of
different employees. Obtaining and searching files as well as updating files that have
been shared are inconvenient.

2. Solution
Use an FTP server as a learning material sharing server, create an FTP account for
each employee in the company, and enable the employees to share the same
directory. When an employee wants to share learning materials, the employee can
use the FTP uploading function to upload materials to the FTP server. In this way,
other employees can download and update the materials on the FTP server
anytime. The FTP server enables employees to easily share, obtain, and accumulate
learning materials.
Enabling the FTP
Service
1. On DeviceManager, configure global parameters for and enable the FTP service.
Creating a
User
2. Create a local authentication user.
Creating a
Share Path
3. Create a file system as the FTP share path.
Creating an FTP
Share
4. Create an FTP share.
Selecting a File
System
5. Select a file system as the FTP share path.
Selecting a
User
6. Select a user to create the FTP share.
Reading the Warning Message
7. Carefully read the content of the Warning dialog box and select I
have read and understood the consequences associated with
performing this operation. Then you can use an FTP client to log in.

1
OceanStor V5&18000 V5
Converged Storage Systems
SmartQuota Introduction
Method to Manage and Control Resources
⚫ Limit the resources occupied by single directories, users, and user groups, to prevent some users from occupying excessive storage resources.
⚫ Notify users about the resources they occupy by alarm or event.
(Figure: host I/O flows to a NAS share on the storage system.)
Terminology
⚫ Quota tree: Quota trees are special directories of file systems.
⚫ Root quota tree: Root quota trees are root directories of file systems. User quotas, group quotas, and resource limits for users can be configured on root quota trees.
⚫ Soft quota: When the resources used by a user exceed the soft quota, an alarm is reported; the alarm is cleared when the used resources fall below the soft quota.
⚫ Hard quota: The hard quota is the maximum amount of resources available to a user.
Usage of Quota Trees
⚫ V5 series allow users to configure quotas on quota trees (special level-1 directories created by management commands).
⚫ Quota trees record information about resource occupation and quota limitation metadata.
⚫ Resource occupation is updated and quotas are checked during I/O operations.
Resource Occupation (1)
⚫ Resource occupation of directories (statistic values of directory quotas): the storage capacity and number of files of the whole quota tree.
⚫ Resource occupation of users/user groups (statistic values of user/user group quotas): in a quota tree, the quota consumed by a user equals the storage capacity of the files created by that user.
Resource Occupation (2)
Example: Quota Tree 1 contains the following files:
 confFile.conf (2 MB, usr 3, grp 5)
 run.dat (1 MB, usr 3, grp 8)
 doc (0 B, usr 4, grp 8)
 doc/study.doc (5 MB, usr 7, grp 9)
Resulting statistics for Quota Tree 1:
 Directory: capacity 8 MB, file quantity 4
 User 3: 3 MB, 2 files; user 4: 0 B, 1 file; user 7: 5 MB, 1 file
 User group 5: 2 MB, 1 file; group 8: 1 MB, 2 files; group 9: 5 MB, 1 file
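The per-directory, per-user, and per-group figures above can be reproduced with a small aggregation sketch (illustrative only; the file names and sizes are the ones listed in the example, and the doc directory counts as one 0 MB object).

# Reproduces the usage statistics of the Quota Tree 1 example (sizes in MB).
from collections import defaultdict

entries = [  # (name, size_mb, uid, gid)
    ("confFile.conf", 2, 3, 5),
    ("run.dat",       1, 3, 8),
    ("doc",           0, 4, 8),
    ("doc/study.doc", 5, 7, 9),
]

capacity = sum(size for _, size, _, _ in entries)           # 8 MB
file_count = len(entries)                                   # 4 objects
by_user = defaultdict(lambda: [0, 0])                       # uid -> [MB, count]
by_group = defaultdict(lambda: [0, 0])                      # gid -> [MB, count]
for _, size, uid, gid in entries:
    by_user[uid][0] += size;  by_user[uid][1] += 1
    by_group[gid][0] += size; by_group[gid][1] += 1

print(f"Quota Tree 1: {capacity} MB, {file_count} objects")
print("users: ", dict(by_user))    # {3: [3, 2], 4: [0, 1], 7: [5, 1]}
print("groups:", dict(by_group))   # {5: [2, 1], 8: [1, 2], 9: [5, 1]}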
Enabling the Switch of a Quota Tree
⚫ Enabling the quota switch of an empty quota tree: no initialization is needed; resource occupation is then kept up to date by I/Os.
⚫ Enabling the quota switch of a non-empty quota tree: initialization runs as a background scanning task plus I/O updates, after which resource occupation is kept up to date by I/Os.
1. A background task scans the quota tree for files and subdirectories and updates the resources occupied by it.
2. If I/O requests are delivered during the scanning and the target file has already been scanned, the occupation is updated.
3. After the scanning, the switch of the quota tree is enabled.
Quota Limitations (1)
Root quota tree (file system root directory, quota tree 0) vs. other quota trees:
⚫ Directory quota: not supported on the root quota tree; supported on other quota trees
⚫ Default directory quota: supported on both
⚫ User quota: supported on both
⚫ Default user quota: supported on both
⚫ User group quota: supported on both
⚫ Default user group quota: supported on both
Quota Limitations (2)
⚫ Configuration items: space soft quota, space hard quota, file quantity soft quota, and file quantity hard quota.
⚫ A soft quota cannot exceed its related hard quota. At least one item must be configured.
Example configuration (space soft / space hard / file quantity soft / file quantity hard):
 Directory (private): 6 MB / 10 MB / – / –
 User 3: 4 MB / 5 MB / 5K / 6K
 User 4: – / – / 1K / 2K
 User group 8: 1 MB / – / 2K / –
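The two configuration rules above (a soft quota must not exceed its related hard quota, and at least one of the four items must be set) can be expressed as a small validation sketch; this is illustrative, not the product's parameter check.

# Validates a quota configuration: soft <= hard, and at least one item set.
def validate_quota(space_soft=None, space_hard=None,
                   file_soft=None, file_hard=None):
    items = (space_soft, space_hard, file_soft, file_hard)
    if all(v is None for v in items):
        raise ValueError("at least one quota item must be configured")
    if None not in (space_soft, space_hard) and space_soft > space_hard:
        raise ValueError("space soft quota exceeds space hard quota")
    if None not in (file_soft, file_hard) and file_soft > file_hard:
        raise ValueError("file quantity soft quota exceeds file quantity hard quota")
    return True

validate_quota(space_soft=4, space_hard=5, file_soft=5000, file_hard=6000)  # user 3
validate_quota(file_soft=1000, file_hard=2000)                              # user 4
validate_quota(space_soft=1, file_soft=2000)                                # user group 8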
Quota Check During the I/O Operation
⚫ For each write I/O, the system checks whether the space that will be used by the write I/O would exceed the hard quota.
⚫ If the hard quota would be exceeded, the write I/O is rejected, with a message displayed indicating insufficient space.
⚫ After the check is passed, the resources to be used are added to the used resources.
⚫ If the total used resources after the addition exceed the soft quota, an alarm is reported.
⚫ When data or files are successfully deleted and the used resources fall below 90% of the soft quota, the alarm is cleared.
(Flowchart: the protocol server passes the I/O to the file system; the file system checks used + delta < hard quota before writing to cache.)
Resource Occupation Update and Quota Check During the I/O Operation
(Flowchart, reconstructed:
1. An I/O arrives. If the quota switch is not enabled, the I/O is written without updating resource occupation or checking quotas.
2. If the quota switch is enabled but no quota limitation has been set, the resource occupation is updated to used + delta and the I/O is written.
3. If a quota limitation has been set, the system checks whether used + delta is within the limitation. If not, an I/O error is returned; if yes, the resource occupation is updated to used + delta and the I/O is written.)
Soft Quota Alarm and Hard Quota Event
⚫ Below 90% of the soft quota: an I/O operation succeeds, and any insufficient-resource alarm is cleared.
⚫ Between the soft quota and the hard quota: an I/O operation succeeds, and an insufficient-resource alarm is sent.
⚫ At the hard quota: an I/O operation fails, an insufficient-space error is returned, and an excessive-resource-occupation event is sent.
(Figure: resource occupation increases from left to right past the 90%-of-soft-quota mark, the soft quota limit, and the hard quota limit.)
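Putting the last two slides together: a write is rejected if used + delta would exceed the hard quota, an alarm is raised once usage passes the soft quota, and the alarm clears when usage drops below 90% of the soft quota. A minimal sketch of that behaviour (assumed units and field names, not the product's code):

# Sketch of the quota check and alarm behaviour described above (sizes in MB).
class QuotaState:
    def __init__(self, soft, hard):
        self.soft, self.hard, self.used, self.alarm = soft, hard, 0, False

    def write(self, delta):
        if self.used + delta > self.hard:                 # hard quota check
            raise IOError("insufficient space: hard quota would be exceeded")
        self.used += delta
        if self.used > self.soft:                         # soft quota alarm
            self.alarm = True
        return "written"

    def delete(self, delta):
        self.used = max(0, self.used - delta)
        if self.alarm and self.used < 0.9 * self.soft:    # alarm clears below 90%
            self.alarm = False

q = QuotaState(soft=6, hard=10)      # e.g. the 6 MB soft / 10 MB hard directory quota
q.write(5);  print(q.used, q.alarm)  # 5 False
q.write(2);  print(q.used, q.alarm)  # 7 True   (soft quota exceeded -> alarm raised)
q.delete(3); print(q.used, q.alarm)  # 4 False  (below 90% of soft quota -> cleared)
try:
    q.write(7)                       # 4 + 7 > 10 -> rejected
except IOError as err:
    print(err)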
Using Directory Quotas to Control Resources
⚫ The exclusive directory of manager A limits the resources available to the manager.
⚫ The project team A directory limits the resources available to the team (engineer A, engineer B).
⚫ The sales department directory limits the resources available to the department (salesperson A, salesperson B).
You can plan different quota trees for different departments or individuals of an enterprise. In this way, you only need to configure the directory quota of each quota tree to limit the resources occupied by each user.
(Figure: engineers, salespeople, and manager A access their respective quota trees through a share.)
Flexible Restrictions on Resource Occupation
Share is the shared directory (quota tree 0) of the R&D department:
1. Set the quota for quota tree 0 to limit the resources available to the R&D department.
2. Set the quota for manager A to limit the resources available to manager A.
3. Set the quota for project group G/E to limit the resources available to each group.
Within a quota tree, the administrator can set quotas for the corresponding department, and for users and groups of the department. In this way, different users may occupy different amounts of resources.
(Figure: engineers G1/G2 belong to owning user group G, engineers E1/E2 belong to owning user group E, and manager A accesses the share directly.)
Introduction
Configuration workflow:
⚫ Step 1 – Quota tree management: create a quota tree; delete a quota tree; change a quota tree name; enable/disable the switch of a quota tree; batch query quota trees.
⚫ Step 2 – Directory quota management: create a directory quota; delete or modify a directory quota; report/batch report.
⚫ Step 3 – User/user group quota management: create authentication users/user groups; modify, query, and delete authentication users/user groups; create a host user/user group; modify, query, and delete a host user/user group; create a user/user group quota; delete and modify a user/user group quota; report/batch report.
⚫ Step 4 – Sharing management: create a quota tree share; delete, modify, and query a quota tree share.
Creating a
Quota Tree
Modifying a
Quota Tree
Creating a Directory
Quota (1)
Creating a Directory
Quota (2)
Checking Space Soft Quota
Alarms (1)
Checking Space Soft Quota
Alarms (2)
Checking Space Hard Quota
Alarms (1)
Checking Space Hard Quota
Alarms (2)
Creating a Local
Authentication User/User
Group
Creating a User/User Group
Quota (1)
Creating a User/User Group
Quota (2)
Creating a Host
User/User Group Specify the user ID/group ID of
a host user/user group. This ID
must be the same as that of
authentication user/user group
of the device.
Checking File Quantity Soft
Quota Events
Checking File
Quantity Hard Quota
Events
Modifying a
Quota
Deleting a
Quota
Deleting a
Quota Tree
Summary
1. Basic principles: Restrict the space or number of files that can be used by a user or user group. A quota takes effect by comparing used + delta with the quota limit.
2. Typical scenarios: Limit the resources occupied by an organization or user to prevent excessive occupation of resources.
3. Configuration management: quota tree management and quota management.