OceanStor Dorado V6 — Lightning fast, rock solid
⚫ 3x improvement in application performance
⚫ 99.9999% availability
⚫ 75% OPEX reduction
Specifications
Model | OceanStor Dorado 3000 V6 | 5000 V6 | 6000 V6 | 8000 V6 | 18000 V6
Maximum number of controllers | 16* / 32*
Maximum cache (dual controllers, expanding with the number of controllers) | 192 GB–1536 GB | 256 GB–4 TB | 1 TB–8 TB | 512 GB–16 TB | 512 GB–32 TB
Supported interface protocols | FC and iSCSI
Front-end port types | 8/16/32 Gbit/s FC/FC-NVMe, 10/25/40/100 Gbit/s Ethernet
Back-end port types | SAS 3.0; NVMe over Fabrics and SAS 3.0
Maximum number of hot-swappable I/O modules per controller | 6 | 12 | 12 | 28 | 28
Maximum number of front-end ports per controller enclosure | 40 | 48 | 56 | 104 | 104
Supported SSDs | 960 GB/1.92 TB/3.84 TB/7.68 TB/15.36 TB/30.72 TB SAS SSDs; 1.92 TB/3.84 TB/7.68 TB/15.36 TB palm-sized NVMe SSDs
Application Scenario — Databases
Typical workloads: Oracle, SQL Server, and DB2 database servers.
Customer Benefits
Application Scenario — VDI
Customer Benefits
OceanStor Dorado V6
Typical Network
⚫ Multi-link dual-switch network
Physical Architecture of the Controller Enclosure (Dorado6000 V6)
No. Name
1 Subrack
2 BBU
3 Controller
4 Power module
5 Management module
6 Interface module
Modules in the Controller Enclosure of Dorado5000 V6 and Dorado3000 V6
2.5-inch SAS disk
⚫ 12 Gbit/s SAS SSD
⚫ 960 GB/1.92 TB/3.84 TB/7.68
TB/15.36 TB/30.72 TB SSD
Note:
900 GB/1.8 TB/3.6 TB SSDs are only used
as spare parts or for expansion.
No. Name
1 Subrack
2 Disk module
3 Power module
4 Expansion module
2 U SAS Disk Enclosure Architecture (25 Slots)
2.5-inch disk
⚫ 12 Gbit/s SAS SSD
⚫ 960 GB/1.92 TB/3.84 TB/7.68 TB/15.36 TB/30.72 TB SSD
Note:
600 GB/900 GB/1.8 TB/3.6 TB SSDs are only used as spare parts or for expansion.
Expansion module
⚫ Dual expansion modules
⚫ Two 12 Gbit/s SAS ports
• In new systems, disks with N*960 GB (N = 1, 2, 4, 8, 16) capacity specifications are used.
• N*900 GB SAS SSDs and N*1 TB (N =1, 2, 4) NVMe SSDs are only used for capacity expansion of
systems earlier than C30.
• 30.72 TB SSDs are supported after 2019-01-30.
Dorado5000/6000 V6 (SAS) Scale-up
Dorado5000/6000 V6 (SAS) 4 Controllers Scale-up
SmartIO Interface Module
➢ Provides four 8/16/32 Gbit/s FC, 25GE, or 10GE ports.
➢ This new module (Hi1822) has the following enhancements as compared with the old ones (Hi1821):
✓ Further supports FastWrite in FC mode and TOE.
✓ Improves the port rate: 32 Gbit/s FC and 25GE.
Note:
1. Physical form: The module uses the Hi1822 chip. It has two different structures (back-end X8 and back-end X16), which require different SBOMs.
2. The X8 module is only used on Dorado5000 V6 (NVMe).
3. The X16 module is used on Dorado3000 V6, Dorado5000 V6 Enhanced Edition, Dorado6000 V6 Enhanced Edition, and Dorado18000 V6.
* Neither module can be used on Dorado C30 or earlier. The new modules are designed in such a way as to prevent incorrect insertion.
Ground cables
DC power cables
Serial cables
FDR cables
Software Architecture
All balanced Active-Active architecture
FlashLink: RAID-TP Tolerates
Simultaneous Failure of Three Disks
(Figure: reliability doubled — tolerance of three-disk failures compared with two-disk failures; CKG and idle CKG layout.)
1. New data is written to new locations. The original data is set to invalid state.
2. After the amount of garbage reaches the threshold, valid data is migrated to a new stripe.
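The two steps above amount to redirect-on-write plus threshold-triggered garbage collection. The following minimal Python sketch (class names and threshold values are illustrative only, not the product's internal design) shows the idea: overwrites invalidate old chunks, and stripes whose valid-data ratio falls below a threshold are emptied by rewriting their valid data elsewhere.

```python
# A minimal, hypothetical sketch of redirect-on-write (ROW) with
# threshold-triggered garbage collection. All names are illustrative.

class RowPool:
    def __init__(self, stripe_size=4, gc_threshold=0.5):
        self.stripe_size = stripe_size      # chunks per stripe (CKG)
        self.gc_threshold = gc_threshold    # valid-data ratio that triggers GC
        self.stripes = [[]]                 # each stripe: list of (lba, data) or None
        self.index = {}                     # lba -> (stripe_no, slot)

    def _append(self, lba, data):
        if len(self.stripes[-1]) == self.stripe_size:
            self.stripes.append([])         # open a new stripe
        self.stripes[-1].append((lba, data))
        self.index[lba] = (len(self.stripes) - 1, len(self.stripes[-1]) - 1)

    def write(self, lba, data):
        old = self.index.pop(lba, None)     # 1. the old copy becomes invalid (garbage)
        if old is not None:
            s, i = old
            self.stripes[s][i] = None
        self._append(lba, data)             # 2. new data goes to a new location

    def garbage_collect(self):
        # Migrate valid data out of stripes whose garbage exceeds the threshold.
        for s, stripe in enumerate(self.stripes[:-1]):
            valid = [e for e in stripe if e is not None]
            if stripe and len(valid) / len(stripe) <= self.gc_threshold:
                for lba, data in valid:
                    self.write(lba, data)   # rewrite valid data to a new stripe
                self.stripes[s] = []        # the old stripe is erased and reused

pool = RowPool()
for n in range(12):
    pool.write(n % 3, f"v{n}")              # overwrite three LBAs repeatedly
pool.garbage_collect()
print([len([e for e in s if e]) for s in pool.stripes])  # valid chunks per stripe
```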
(Figure: I/O priority adjustment. Without dynamic adjustment, advanced features, cache batch writes, disk reconstruction, and garbage collection all run at Priority 1. With dynamic adjustment, advanced features run at Priority 2, cache batch writes at Priority 3, disk reconstruction at Priority 4, and garbage collection at Priority 5.)
⚫ Controllers automatically detect data layouts inside SSDs. Partitioning of hot and cold data is implemented within the controller and SSDs simultaneously, and the sequential layout of hot and cold data in different partitions effectively reduces the amount of garbage inside SSDs.
⚫ The I/O priorities are dynamically adjusted within the controller and SSDs based on the service status. The priorities of garbage collection I/Os are automatically controlled to trigger garbage collection on demand, and service data reads/writes are always responded to with the highest priority (see the sketch below).
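As a rough illustration of the dynamic-priority idea described above, the hypothetical scheduler below always dispatches host I/O first and pushes garbage collection further back when the service load is high; the priority values and names are assumptions, not Huawei's implementation.

```python
# Hypothetical sketch of dynamic I/O priority adjustment.
import heapq
import itertools

class IoScheduler:
    def __init__(self):
        self._queue = []
        self._seq = itertools.count()       # keeps FIFO order within a priority
        # Lower number = higher priority; host I/O is always priority 1.
        self.priority = {"host_io": 1, "advanced_feature": 2,
                         "cache_batch_write": 3, "reconstruction": 4,
                         "garbage_collection": 5}

    def adjust_for_load(self, busy):
        # Under heavy service load, push garbage collection further back;
        # when the system is idle, let it run right behind host I/O.
        self.priority["garbage_collection"] = 5 if busy else 2

    def submit(self, kind, request):
        heapq.heappush(self._queue, (self.priority[kind], next(self._seq), request))

    def next_request(self):
        return heapq.heappop(self._queue)[2] if self._queue else None

sched = IoScheduler()
sched.adjust_for_load(busy=True)
sched.submit("garbage_collection", "gc stripe 7")
sched.submit("host_io", "read LBA 42")
print(sched.next_request())                 # host I/O is dispatched first
```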
Key Design Points: Global Wear
Leveling and Anti-Wear Leveling
(Figure: SSD lifespan distribution with and without global anti-wear leveling; the threshold marks the point at which global anti-wear leveling is enabled.)
Benefits:
⚫ Global wear leveling enhances general SSD reliability.
⚫ Global anti-wear leveling avoids simultaneous failure of disks.
Key Features: Global Inline
Deduplication and Compression
(Figure: 8 KB data blocks pass from the engine through the fingerprint pool into the storage pool.)
⚫ Inline compression: optimized LZ4 algorithm.
⚫ Enhancement in C00: optimized ZSTD algorithm, improving the compression ratio; byte-alignment in data compaction and DIF rearrangement, increasing the compression ratio by 15% to 35%.
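To make the inline-compression step concrete, here is an illustrative sketch. The array uses optimized LZ4/ZSTD; this example uses zlib from the Python standard library as a stand-in so it runs without extra packages, and the block size follows the 8 KB granularity mentioned above.

```python
# Illustrative sketch of an inline compression step (zlib stands in for LZ4/ZSTD).
import zlib

BLOCK = 8 * 1024                                  # 8 KB data block

def compress_block(block: bytes) -> bytes:
    assert len(block) == BLOCK
    packed = zlib.compress(block, level=6)
    # Store the compressed form only if it actually saves space.
    return packed if len(packed) < len(block) else block

data = (b"database page " * 600)[:BLOCK]          # repetitive, compressible data
out = compress_block(data)
print(f"{len(data)} B -> {len(out)} B, ratio {len(data) / len(out):.1f}:1")
```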
Key Features: Multiple Disk Domains
Concept
SSDs can be grouped into multiple disk domains. Faults in one disk domain do not affect services in the other disk domains, isolating different types of services or services from different vStores. With the same number of SSDs, the possibility that two SSDs fail simultaneously in multiple disk domains is lower than it would be in a single domain. Multiple disk domains therefore reduce the risk of data loss caused by the failure of multiple disks.
(Figure: Working principle of multiple disk domains. Hosts 1–4 access LUNs through controllers A and B; the disk enclosure is divided into disk domains 1–4.)
Technical Highlights
1. One engine can manage up to four disk domains. A disk domain can consist of SSDs owned by two engines. The RAID level of each disk domain can be specified.
2. Disk domains are physically isolated and must each be configured with independent hot spare space.
3. If a disk domain is faulty, services in other disk domains are not affected.
Application Scenarios
1. vStore isolation: Different disk domains can be created for various hosts or vStores, implementing physical isolation.
2. Data reliability improvement: Given the same number of SSDs, the possibility that two or three SSDs fail simultaneously in multiple disk domains is lower than it would be in a single domain (see the sketch below).
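The reliability argument in scenario 2 can be illustrated with a back-of-the-envelope calculation. The numbers below (48 SSDs, an assumed 1% independent failure probability per SSD in the exposure window) are purely illustrative and not product figures.

```python
# Illustrative comparison: probability of two or more failures landing in the
# same domain, for one large domain vs. four smaller domains.
from math import comb

def p_two_or_more_failures(disks, p):
    """Probability of at least two failed disks in one domain of `disks` SSDs."""
    p0 = (1 - p) ** disks
    p1 = comb(disks, 1) * p * (1 - p) ** (disks - 1)
    return 1 - p0 - p1

p = 0.01                       # assumed failure probability per SSD in the window
single = p_two_or_more_failures(48, p)
# Four domains of 12 SSDs: data is at risk only if some one domain has >= 2 failures.
split = 1 - (1 - p_two_or_more_failures(12, p)) ** 4
print(f"single 48-disk domain: {single:.4f}, four 12-disk domains: {split:.4f}")
```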
Log in to DeviceManager.
⚫ Prerequisites
The temporary maintenance terminal used for the initial configuration is connected
to the storage device's management port, and the maintenance terminal IP address
and management port's default IP address are on the same network segment.
Choose System > Controller Enclosure. Click to switch to the rear view of the
controller enclosure and click a management port to modify.
Note
⚫ The default IP address of the management network port on management module 0 is 192.168.128.101 and that on management
module 1 is 192.168.128.102. The default subnet mask is 255.255.0.0.
⚫ Management network port IP addresses and internal heartbeat IP addresses must reside on different network segments. Otherwise,
route conflicts will occur. The default internal heartbeat IP addresses are 127.127.127.10 and 127.127.127.11, and the default subnet mask is 255.255.255.0. In a dual-controller storage system, IP addresses on the 127.127.127.XXX network segment cannot be used.
⚫ Management network port IP addresses and the maintenance network port IP address must reside on different network segments.
Otherwise, route conflicts will occur. The default maintenance network port IP address is 172.31.128.101 or 172.31.128.102, and the
default subnet mask is 255.255.0.0. Therefore, IP addresses on the 172.31.XXX.XXX network segment cannot be allocated to
management network ports. You are advised to connect management network ports to the management network only.
⚫ By default, management network port IP addresses and service network port IP addresses must reside on different network segments.
Changing Management Network
Port IP Addresses (2)
You can also log in to the storage system using the serial port. After using serial
cables to connect a maintenance terminal to a controller enclosure, run the
change system management_ip command to change management network
port IP addresses. For example, set the IPv4 address of the management
network port on management module 0 to 172.16.190.2, subnet
mask to 255.255.0.0, and gateway address to 172.16.0.1.
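The segment rules in the notes above can be checked mechanically. The following small helper, written only for illustration and not part of the storage CLI, verifies that a candidate management IP address does not fall on the internal heartbeat or maintenance segments described above.

```python
# Hypothetical planning check for management IP addresses.
import ipaddress

RESERVED = [ipaddress.ip_network("127.127.127.0/24"),   # internal heartbeat segment
            ipaddress.ip_network("172.31.0.0/16")]      # maintenance port segment

def management_ip_ok(candidate: str) -> bool:
    ip = ipaddress.ip_address(candidate)
    return not any(ip in net for net in RESERVED)

print(management_ip_ok("172.16.190.2"))    # True  - the example address above
print(management_ip_ok("172.31.128.200"))  # False - collides with the maintenance segment
```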
Applying for a License
Item | Description
GTS permission for the ESDP (applicable to Huawei service engineers) | Users who have GTS permission can apply for licenses in Entitlement Activation mode. If you do not have GTS permission, click Permission Application in the left navigation tree of the ESDP home page and complete the permission application.
ASP or Guest permission for the ESDP (applicable to Huawei partners or end users) | Users who have ASP or Guest permission can apply for licenses in Password Activation mode. Click Register Now on the ESDP home page and fill in the required information.
Equipment serial number (ESN) | An ESN is a character string that uniquely identifies a device. Licenses must be activated for each device. You can obtain the ESN in any of the following ways:
• Check the ESN on the mounting ear of the front panel of the device.
• On the DeviceManager home page, choose Basic Information > SN.
• Log in to the CLI and run the show system general command to view the value of SN.
Applying for a License (Entitlement
Activation)
Applying for a License (Password
Activation)
Importing and Activating a License
After you obtain a license file, you need to upload and activate it before you can use
the value-added features.
Introduction to DeviceManager
⚫ DeviceManager is integrated storage management software developed by Huawei. It comes installed in storage systems from the factory.
⚫ You can log in to DeviceManager on any maintenance terminal connected
to a storage system by entering the management network port IP address
of the storage system and the local or domain user name in a browser.
Operating System | Compatible Browsers
Windows 7 Professional (32-bit/64-bit) | Internet Explorer 9 to 11, Firefox 25 to 52, Chrome 27 to 57
Windows Server 2012 and Windows 8 | Internet Explorer 10 to 11, Firefox 25 to 52, Chrome 27 to 57
Windows 8.1 | Internet Explorer 11, Firefox 25 to 52, Chrome 27 to 57
Security Configuration Management — Authorized IP Addresses
To prevent unauthorized IP addresses from accessing DeviceManager, specify the authorized IP addresses that can access the storage device from DeviceManager. After the IP address security rules are enabled, DeviceManager is accessible only to the authorized IP addresses or IP address segment.
(Figure: users log in to the Dorado V6 storage system; access is rejected if the user's IP address is not an authorized IP address or not in the authorized IP address segment.)
Alarm Management — Severity
The following slides present the alarming mechanism, alarm notification
methods, and alarm dump for you to better manage and clear alarms.
Alarm Management — Checking Alarms
Detailed descriptions and troubleshooting suggestions are provided for each alarm in the list for convenient fault rectification.
Performance Management
Performance Management — View Analysis
Performance Management — View Dashboard
Performance Management —
SystemReporter
SystemReporter is a performance analysis tool for storage systems. It provides functions
such as real-time monitoring and trend analysis by collecting, archiving, analyzing, and
forecasting data. By using SystemReporter, users can easily check storage system
performance and tune performance in a timely manner. SystemReporter is installed on
servers and supports the following operating systems.
Performance Management —
SystemReporter
On SystemReporter, you can view real-time and historical performance
monitoring data.
Viewing Basic Information
On the DeviceManager home page, you can view basic information of the storage
system, including health status, alarms, system capacity, and performance. This
information helps you prepare for device management and maintenance.
Viewing Power Consumption
Information
Power consumption indicates how much power a storage system consumes per
unit time. You can view the total power consumption of a storage device or its
power consumption on a specified date.
Checking Device Running Status —
Disk Enclosure/Controller Enclosure
Parameter Description
Health status •Normal: The enclosure is functioning and running normally.
•Faulty: The enclosure is abnormal.
Running status Online or offline
Checking Device Running Status —
Controller
Parameter Description
Health status • Normal: The controller is functioning and running normally.
• Faulty: The controller is abnormal.
Running status Online or offline
Checking Device Running Status —
Power Module
Parameter Description
Health status • Normal: The power module is functioning and running normally.
• Faulty: The power module is abnormal.
• No input: The power module is in position but is not providing power.
Running status Online or offline
Checking Device Running Status —
Controller Enclosure BBU
Parameter Description
Health status •Normal: The controller enclosure BBU is functioning and running normally.
•Faulty: The controller enclosure BBU is abnormal.
•Insufficient power: The BBU has insufficient power but other parameters are
normal.
Running status Online, charging, or discharging
Checking Device Running Status —
Fan Module
Parameter Description
Health status • Normal: The fan module is functioning and running normally.
•Faulty: The fan module is abnormal.
Running status Online or offline
Checking Device Running Status —
Disk
Parameter Description
Health status • Normal: The disk is functioning and running normally.
• Faulty: The disk is abnormal.
• Failing: The disk is failing and needs to be replaced soon.
Running status Online or offline
Checking Device Running Status —
Host Port
Parameter Description
Health status •Normal: The host port is functioning and running normally.
•Faulty: The host port is abnormal.
Running status Link up or link down
Checking Device Running Status —
Interface Module
Parameter Description
Health status • Normal: The interface module is functioning and running normally.
• Faulty: The interface module is abnormal.
Running status Running or powered off
Checking Service Running Status —
Disk Domain
Parameter Description
Health status •Normal: The disk domain is functioning and running normally.
•Degraded: The disk domain is functioning normally, but performance is not
optimal.
•Faulty: The disk domain is abnormal.
Running status Online, reconstruction, precopy, deleting, or offline
Checking Service Running Status —
Storage Pool
Parameter Description
Health status • Normal: The storage pool is functioning and running normally.
•Degraded: The storage pool is functioning normally, but performance is not
optimal.
•Faulty: The storage pool is abnormal.
Running status Online, reconstruction, precopy, deleting, or offline
Checking Service Running Status —
LUN
Parameter Description
Health status •Normal: The LUN is functioning and running normally.
•Faulty: The LUN is abnormal.
Running status Online, deleting, or offline
Checking Service Running Status —
Host
Parameter Description
Status •Normal: The host is functioning and running normally.
•Faulty: The host is abnormal.
Checking Service Running Status —
Remote Replication Pair
Parameter Description
Health status •Normal: All pairs are functioning and running normally.
•Faulty: One or more of the pairs are abnormal.
Running status •Normal, synchronizing, to be recovered, interrupted, split, or invalid
Checking Service Running Status —
Remote Replication Consistency Group
Parameter Description
Health status •Normal: All pairs in the consistency group are functioning and running
normally.
• Faulty: One or more pairs in the consistency group are abnormal.
Running status • Normal, synchronizing, to be recovered, interrupted, split, or invalid
Checking Service Running Status —
Snapshot
Parameter Description
Health status •Normal: The snapshot is functioning and running normally.
•Faulty: The snapshot is abnormal.
Running status Active, inactive, deleting, or rolling back
Inspecting Storage Device Status
You can use SmartKit to make inspection policies and inspect devices to check
device running status in a timely manner.
Powering Storage Devices On or Off
— Powering On a Device
The correct power-on sequence is as follows:
1. Switch on the external power supplies of all devices.
2. Press the power button on the controller enclosure.
3. Switch on Ethernet switches or Fibre Channel switches (if the Ethernet or Fibre Channel switches are configured but not yet powered on).
4. Switch on application servers (if the application servers are not yet powered on).
Powering Storage Devices On or Off
— Powering Off a Device
The correct power-off sequence is as follows:
1. Stop all services on the storage device.
2. Hold down the power button for 5 seconds to
power off the controller enclosure or perform
power-off operations on DeviceManager.
3. Disconnect the controller enclosure and disk
enclosures from their external power supplies.
Powering Storage Devices On or Off
— Restarting a Storage Device
Exercise caution when you restart the storage device
as doing so interrupts the services running on the
device.
Powering Storage Devices On or Off
— Powering On an Interface Module
If you want to enable interface modules that have
been powered off, power on them on DeviceManager.
Powering Storage Devices On or Off
— Powering Off an Interface Module
Before replacing an interface module, power it off.
Collection and Recovery of Storage
System Information
After a fault occurs, collect the basic information, fault
information, and storage device information, and send
it to maintenance engineers. This helps maintenance
engineers quickly locate and rectify the fault. Note
that the information collection operations described
here must be authorized by customers in advance.
Exporting System Data
The system data to be exported using DeviceManager includes
running data, system logs, and disk logs.
• Running data indicates the real-time status of a storage system, such as the configuration information of LUNs. Running data files are in *.txt format.
• System logs record information about the running data, events, and
debugging operations on a storage system and can be used to analyze
the status of the storage system. A system log file is in *.tgz format.
• A DHA runtime log is the daily runtime log of a disk. It mainly includes
daily disk health status, I/O information, and disk life span. A DHA
runtime log file is in *.tgz format.
• An HSSD log is the working log of HSSD, such as the S.M.A.R.T
information of a disk. An HSSD log file is in *.tgz format.
Exporting Alarms and Events
Alarms and events record the faults and events that occur during
storage system operation. When the storage device is faulty, view
the alarms and events to locate and rectify the fault.
On DeviceManager, you can specify the severity and time of
alarms and events to export.
➢ On the Current Alarms page, critical alarms, major alarms, and
warnings are displayed.
➢On the All Events page, alarms of all severities are displayed.
Alarms on the Current Alarms tab are exported to All Events.
Quick Maintenance Process
The following flowchart shows how to quickly maintain a storage system.
View the status of indicators on the front and rear panels of devices in the
storage system to check for hardware faults.
On the Home page of DeviceManager, you can know the basic information,
alarms, system capacity trend, and performance of the storage system.
Checking Service Status
The following table describes the check items.
Item | Abnormal Status | Common Cause | Recommended Action
Snapshot | The Health Status is Faulty. | The source LUN is abnormal. | Follow the instructions regarding snapshot alarms to handle the alarms.
Checking Storage System
Performance
The following table describes the check items.
Item (a) | Abnormal Status | Common Cause | Recommended Action (b)
Block bandwidth (MB/s) | The bandwidth is lower than the minimum bandwidth of a single link. | The transmission rate of the storage system does not match that of the application server or switch. | Adjust the transmission rate of the related port on the server or switch.
Total IOPS (IO/s) | The throughput is low or 0. | The link between the storage system and the application server or switch is abnormal. | Check the cable connection between the storage system and the application server or switch.
a: This table only lists recommended items. Determine whether to enable other items based on the storage system status. Enabling too many items may cause a slight degradation of performance in the processing of storage system services.
b: For some faults, the system displays alarms with IDs and recommended actions. Troubleshoot such faults by following the instructions.
COFFEE BREAK
Resuming at 16:30
OceanStor SmartKit
Introduction
SmartKit Introduction
⚫ A portable toolbox for Huawei IT
service engineers.
⚫ Provides a unified desktop
management platform for IT tools.
The built-in ToolStore allows quick
download, installation, and
upgrade of tools.
⚫ Includes various tools required for
deployment, maintenance, and
upgrade of IT devices. These tools
can be used for device O&M,
improving work efficiency and
simplifying operations.
Information Collection Tool — Process
⚫ Adding devices: Add devices whose information you want to collect.
⚫ Setting a check policy: Set the directory for saving the inspection report.
⚫ Selecting patches: Select a local patch installation package.
⚫ CKs are selected for a CKG based on wear leveling and anti-wear leveling
algorithms. The algorithms select CKs based on the capacity and degree of wear,
ensuring SSDs are used evenly and the risk of failure is mitigated.
(Figure: CKs from the SSDs in a disk domain are grouped into CKGs.)
Basic Storage Pool Services – Wear
Leveling
The lifespan of SSDs is determined
by the degree of wear. When SSDs
are selected unevenly, that is,
when a few SSDs are used
repeatedly, those SSDs experience
wear at a faster rate, as a result of
which the overall reliability of the
array is reduced. In this case, the
wear leveling algorithm ensures
even use of SSDs to prolong usage
and reliability.
Basic Storage Pool Services – Anti-
Wear Leveling
When the degree of wear exceeds the threshold, SSD failures become likely. If several SSDs reach this point at the same time, the number of faulty disks can exceed the redundancy, causing data loss in the array. The anti-wear leveling algorithm deliberately directs additional wear to the most-worn SSDs so that they reach end of life one at a time, reducing failure uncertainty.
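A conceptual sketch of how CK selection could combine both behaviours is shown below; the threshold, ranking rules, and data structures are assumptions for illustration, not Huawei's actual algorithm.

```python
# Conceptual sketch of wear leveling and anti-wear leveling during CK selection.
from dataclasses import dataclass

@dataclass
class Ssd:
    name: str
    free_cks: int      # free chunks
    wear: float        # 0.0 (new) .. 1.0 (worn out)

ANTI_WEAR_THRESHOLD = 0.9

def pick_ssds_for_ckg(ssds, members):
    over = [d for d in ssds if d.wear >= ANTI_WEAR_THRESHOLD and d.free_cks > 0]
    if over:
        # Anti-wear leveling: concentrate wear on the single most-worn disk.
        ranked = sorted(over, key=lambda d: -d.wear)[:1]
        rest = sorted((d for d in ssds if d not in ranked and d.free_cks > 0),
                      key=lambda d: (d.wear, -d.free_cks))
        return (ranked + rest)[:members]
    # Wear leveling: spread CKs over the least-worn disks with the most free space.
    return sorted((d for d in ssds if d.free_cks > 0),
                  key=lambda d: (d.wear, -d.free_cks))[:members]

ssds = [Ssd("SSD0", 100, 0.2), Ssd("SSD1", 80, 0.5),
        Ssd("SSD2", 120, 0.1), Ssd("SSD3", 90, 0.92)]
print([d.name for d in pick_ssds_for_ckg(ssds, members=3)])
```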
Basic Storage Pool Services – RAID
2.0+ Technology Overview
RAID 2.0+ technology dynamically selects the number of data columns (N) in a
CKG according to the number of disks in the disk domain (N is a fixed value when
RAID 2.0+ technology is not used), and keeps the number of parity columns (M)
unchanged, improving reliability and space utilization.
⚫ How RAID 2.0+ technology works:
When the number of disks increases, more data columns are selected to form a CKG,
improving the space utilization rate (N/(N+M)).
When the number of disks decreases, the number of data columns in the new CKG is
decreased but the number of parity columns is kept unchanged. In this case, data will
not be lost when the number of damaged disks is the same as or less than that of
parity columns in the new CKG.
(Figure: a CKG stripe with data columns D0–D3 and parity columns P, Q, R.)
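The space-utilization point can be checked numerically; the stripe widths below are examples only, not product values.

```python
# Quick numeric illustration of utilization N/(N+M) with a fixed parity count M.
def utilization(n_data: int, m_parity: int) -> float:
    return n_data / (n_data + m_parity)

M = 2                                    # e.g. two parity columns are kept
for n in (4, 8, 14, 22):
    print(f"N={n:2d}, M={M}: utilization = {utilization(n, M):.1%}")
```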
Basic Storage Pool Services –
Overview of Deduplication
⚫ Dorado supports global deduplication within disk domains,
determining repeated data at 4 KB or 8 KB granularity, helping avoid
duplicates and unnecessary space usage.
(Figure: during deduplication, the mapping table records the mapping from the logical block address (LBA) to the fingerprint index, so duplicate blocks in the user data share a single stored copy.)
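A minimal sketch of fingerprint-based deduplication follows; SHA-256 stands in for the array's fingerprint function, and the structures are illustrative only.

```python
# Minimal sketch of inline deduplication on fixed-size blocks.
import hashlib

BLOCK = 8 * 1024

class DedupStore:
    def __init__(self):
        self.fingerprints = {}   # fingerprint -> physical block id
        self.blocks = []         # physical block storage
        self.lba_map = {}        # LBA -> fingerprint (the mapping table)

    def write(self, lba, block: bytes):
        fp = hashlib.sha256(block).hexdigest()
        if fp not in self.fingerprints:          # new data: store it once
            self.blocks.append(block)
            self.fingerprints[fp] = len(self.blocks) - 1
        self.lba_map[lba] = fp                   # duplicates just add a pointer

    def read(self, lba) -> bytes:
        return self.blocks[self.fingerprints[self.lba_map[lba]]]

store = DedupStore()
golden_image = bytes(BLOCK)                      # e.g. identical VDI blocks
for lba in range(100):
    store.write(lba, golden_image)
print(len(store.blocks), "physical block(s) stored for 100 logical writes")
```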
⚫ Garbage collection
To meet the requirements of ROW on space for writing new data, valid data in old CKGs is
migrated. After migration, data in old CKGs is completely erased. In this way, the space for
ROW writes can be managed and provided.
Basic Storage Pool Services –
Garbage Collection Principles
(Figure: valid data from CKG 0 and CKG 1 (D0, D1, D2, P, Q) is migrated to a new CKG 2; the old CKGs are then erased and reclaimed.)
When a disk is faulty, a new CK is selected from another disk outside the
affected CKG. The data within the damaged CK is then calculated based
on RAID parity data to reconstruct it.
Basic Storage Pool Services –
Migration Reconstruction Principles
(Figure: CKG0 spans disks 0–5 with chunks D0, D1, D2, D3, P, Q; after a disk failure, D0+D1+D3 => P'+Q'; CKG1 holds D0, D1, D2, P, Q.)
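For intuition, the sketch below reconstructs a lost chunk with single (XOR) parity; RAID 6 and RAID-TP add Reed-Solomon parity columns so that two or three simultaneous losses can be recovered, which is omitted here.

```python
# Illustrative reconstruction of a lost chunk from the survivors and XOR parity.
def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

d0, d1, d2, d3 = (bytes([n]) * 8 for n in (1, 2, 3, 4))   # tiny example chunks
p = xor_blocks([d0, d1, d2, d3])                          # parity chunk

lost = d2                                                  # the disk holding D2 fails
rebuilt = xor_blocks([d0, d1, d3, p])                      # rebuild onto a new CK
assert rebuilt == lost
print("D2 reconstructed:", rebuilt.hex())
```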
Configuration Operation – Creating
a LUN and LUN Group
A LUN group can contain one or multiple
LUNs. A maximum of 4096 LUNs can be
added to a LUN group. A LUN can be added
to a maximum of 64 LUN groups.
Configuration Operation – Creating
a Host
Hosts can be created manually, in batches, or through automatic
scanning. This page describes how to create a host manually.
Configuration Operation – Creating
a Port Group (Optional)
A port group is a logical combination of multiple physical ports and a mode for
use of specific ports by the storage system. A port group establishes a mapping
relationship between storage resources (LUNs) and servers.
Configuration Operation – Creating
a Mapping View
OceanStor Dorado V6
Storage Systems
SmartThin
Terminology
Term Definition/Description
⚫ Highlights
✓ Provides a storage management
approach that enables on-demand
storage resource allocation.
✓ Provides thin LUNs and allocates
physical storage space based on
user needs.
✓ Reduces resource consumption.
License Requirements
⚫ SmartThin is a value-added feature which requires a
license to be purchased.
Capacity allocation
Storage Virtualization
⚫ Capacity-on-write (COW): Storage space is allocated by
engines upon data writes based on load balancing rules.
(Figure: the host issues writes; space is allocated upon data writes (COW), and requests are redirected to the actual storage location.)
Application Type
When creating a LUN, you can select the application type of the service.
The application type includes the application request size, as well as
SmartCompression and SmartDedupe attributes. LUNs are created based
on application types. The system automatically sets parameters to
provide optimal performance for services.
Capacity-on-Write
⚫ A write request to a thin LUN will trigger space allocation.
⚫ If the available space of a thin LUN is smaller than the threshold,
the thin LUN applies for more space from the storage pool.
(Figure: a computer issues write requests to a thin LUN backed by a storage pool. If space is already allocated, the data is written directly. If space is not allocated, the thin LUN first applies for space from the storage pool (1. Allocate space) and then writes the data (2. Write data).)
Direct-on-Time
Capacity-on-write stores data in random areas. For this reason,
the direct-on-time technology is required to redirect requests
when thin LUNs are accessed.
(Figure: read and write requests from a computer to a thin LUN backed by a storage pool. Read request: if space is allocated, the request is redirected to the data; if not, zeros are returned. Write request: if space is not allocated, space is allocated first; if allocated, the request is redirected to the data.)
Mapping Table
A mapping table shows the mapping relationship of thin LUN data.
Each mapping entry is referred to as a pointer.
✓ The left mapping entry is the logical address, which is used as the search key value.
✓ The right mapping entry records the address of the resource block.
✓ Entries in the mapping table can be added or deleted.
(Figure: add, search, and delete operations on mapping entries such as 1→7, 3→5, and 6→8.)
The mapping table shows where the actual data of a thin LUN is.
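The mapping table, capacity-on-write, and direct-on-time behaviour described above can be sketched as follows; the block size, pool layout, and names are illustrative assumptions.

```python
# Minimal sketch of a thin LUN's mapping table with capacity-on-write and
# direct-on-time behaviour: writes allocate a resource block on first touch,
# reads of unallocated addresses return zeros.
BLOCK = 8 * 1024

class ThinLun:
    def __init__(self, pool):
        self.pool = pool            # shared storage pool (a list of blocks)
        self.mapping = {}           # logical block address -> pool index

    def write(self, lba, data: bytes):
        if lba not in self.mapping:             # capacity-on-write: allocate now
            self.pool.append(bytearray(BLOCK))
            self.mapping[lba] = len(self.pool) - 1
        self.pool[self.mapping[lba]][:len(data)] = data   # redirect to actual location

    def read(self, lba) -> bytes:
        if lba not in self.mapping:             # unallocated space reads as zeros
            return bytes(BLOCK)
        return bytes(self.pool[self.mapping[lba]])

pool = []
lun = ThinLun(pool)
lun.write(5, b"hello")
print(lun.read(5)[:5], len(pool), "block(s) actually allocated")
print(lun.read(0)[:5])                           # b'\x00\x00\x00\x00\x00'
```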
Reading Data from a Thin LUN
(Figure: the read flow queries the mapping table; data that maps to a physical space is read from that location (4. Reads data).)
Writing Data to a Thin LUN
1. Receives a write request.
2. Queries the mapping table.
3. Redirects the request.
4. Writes data.
(Figure: the new data a' that maps to the logical address is located through the mapping table and written to physical space.)
Using SmartThin
The procedure for using SmartThin is similar to that for using RAID
groups and thick LUNs:
1. Select disks and create a disk domain using the disks.
2. Create a storage pool.
3. Create a thin LUN.
4. Map the thin LUN to the host for data read and write or create
value-added services such as remote replication and snapshot on the
thin LUN.
(Figure: capacity is allocated to host volumes on demand.)
Checking the SmartThin License
(Flowchart: Start → Is the SmartThin license valid? If no, import and activate the SmartThin license. → Enable SmartThin → End.)
Checking the SmartThin License
Importing the SmartThin License
Creating a Disk Domain
Creating a Storage Pool
Creating a Thin LUN
Modifying the Owning Controller of
a Thin LUN
Expanding the Capacity of a Thin
LUN
Deleting a Thin LUN
Before deleting a
storage pool, delete
all LUNs from the
storage pool.
Deleting a Disk Domain
(Figure: deduplication and compression flow — 2. Calculate fingerprints (FP0, FP1, FP2) of data blocks; 6. Add the mapping between the fingerprint and address of the new block to the fingerprint library.)
Deduplication and compression can be enabled per application: deduplication only, compression only, or both.
Scenarios:
1. VDI/VSI scenarios
(Figure: typical data reduction ratios by workload (DW, media, transactions, email, OLTP, VSI, VDI), ranging from roughly 2:1–4:1 up to 4:1–6:1.)
Clone volume A logical data duplicate that is generated after a clone is generated
for a source volume. It is presented as a clone LUN to users.
Redirect on write ROW is a core snapshot technology. When data is changed, the
(ROW) storage system writes the new data to a new location and directs a
pointer for the modified data block to the new location. The old data
then serves as snapshot data.
Clone split Clone split generates a full physical copy of data that a clone shares
with the source LUN or snapshot.
Working Principles
⚫ Definition: A clone is a copy of source data at a particular point in time. It can be split from the source
data and function as a complete physical data copy. A clone can serve as a data backup and is accessible
to hosts.
Characteristics:
✓ Quick clone generation: A storage system can generate a clone within several seconds. A clone can be
read and written immediately after being created. Users can configure deduplication and compression
attributes for a clone.
✓ Online splitting: A split can be performed to cancel the association between source and clone LUNs
without interrupting services. After a split, any later changes to the data on the clone LUN will not affect
the data on the source LUN.
(Figure: data blocks a–l are shared with the source LUN when the clone is created and become an independent full copy after the clone is split.)
Key Technology: Creating a Clone
1. After a clone LUN has been created, it shares the data of its source LUN if no changes are made to the data on either LUN. A snapshot ensures data consistency at the point in time at which the clone is created.
2. When an application server reads data from the clone LUN, it actually reads the source LUN's data.
3. HyperMetro cannot be implemented on a clone LUN before it is split.
(Figure: the source LUN, snapshot, and clone LUN share data blocks A, B, C, and D.)
Key Technology: Reading and Writing a Clone LUN
1. When an application server writes new data to an existing data block in the source LUN, the storage system allocates new storage space for the new data instead of overwriting the data in the existing storage space.
(Figure: read and write I/Os from the application server and the backup server to the source LUN, the snapshot, and the clone LUN.)
Creating a Clone (for a LUN)
Creating a Clone (for a Snapshot)
Creating a Clone
Querying a Clone
Querying a Clone
Splitting a Clone
Stopping Splitting a Clone
Deleting a Clone
OceanStor Dorado V6
Storage Systems
HyperSnap Introduction
Background and Definition
⚫ A snapshot is a mirror of a data set at a specific point in
time. It can also be called an instant copy. The snapshot
itself is a complete usable copy of the data set.
(Figure: a snapshot taken at 08:00 AM preserves the original blocks a–l; after the source data changes at 09:00 AM (e → m, i → n), the snapshot still presents the 08:00 AM data.)
Working Principles — Lossless Performance
(Figure: redirect-on-write keeps performance lossless. Write to the source LUN (L2->P5); write to the source LUN again (L2->P7); write to snapshot 1 (L0->P6); write to snapshot 2 (L2->P8). During a rollback, snapshot data is used to restore the source LUN; once the rollback is complete, the data is restored.)
Working Principles — Snapshot Cascading and Cross-Level Rollback
1. Snapshot cascading is to create a child snapshot for a parent snapshot. The child snapshot shares the data of its parent snapshot.
2. Cross-level rollback indicates that snapshots that share a source volume can be rolled back to each other regardless of their cascading levels.
(Figure: a source volume with Snapshot0 at 08:00 and Snapshot1 at 09:00; Snapshot1.snapshot0 at 10:00 and Snapshot1.snapshot1 at 11:00 are cascaded child snapshots of Snapshot1.)
Working Principles — Timing Snapshots
1. Redirect_On_Write
(Figure: redirect-on-write mapping tables, where L is a logical address and P a physical address. The source volume maps L0->P0, L1->P1, L2->P2, L4->P4; after new writes, L2 is redirected to P5 while the snapshot keeps L2->P2. Data blocks A–F occupy physical locations P0–P7.)
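A compact sketch of the redirect-on-write mechanism shown in the figure: the snapshot is simply a frozen copy of the L→P mapping table, and later source writes point the logical address at a new physical block. The structures are illustrative only.

```python
# Conceptual sketch of redirect-on-write snapshots.
class Volume:
    def __init__(self):
        self.phys = []                 # physical blocks P0, P1, ...
        self.table = {}                # logical address L -> physical index P

    def write(self, l, data):
        self.phys.append(data)                 # new data goes to a new location
        self.table[l] = len(self.phys) - 1     # the source mapping is redirected

    def snapshot(self):
        return dict(self.table)                # a snapshot = frozen mapping table

    def read(self, l, table=None):
        table = self.table if table is None else table
        return self.phys[table[l]]

vol = Volume()
for l, data in enumerate(["A", "B", "C"]):
    vol.write(l, data)
snap_0800 = vol.snapshot()                     # the 08:00 snapshot keeps L2 -> old P
vol.write(2, "C'")                             # the source now maps L2 to a new P
print(vol.read(2), vol.read(2, snap_0800))     # C'  C
```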
Key Technologies — Snapshot Duplicate
⚫ How can I obtain multiple copies of data that is generated
based on the same snapshot?
(Figure: snapshot duplicates created from a source volume's snapshot provide multiple copies of the same point-in-time data; the 08:00 snapshot can also be rolled back to the source volume.)
During the rollback, the host writes data to the source volume
after snapshot data is copied to the source volume.
If there are no access requests, data on the snapshot is rolled
back to the source volume in sequence.
Key Technologies — Reading a Snapshot
1. Receive a read request.
2. Generate the address index (key) specific to the request.
3. Obtain the data based on the corresponding address index (0,0) from the pool. If no data is available, obtain it from the source LUN.
(Figure: the origin volume's mapping table and the snapshot's mapping table map keys to disk offsets; data block a resides in the pool.)
Key Technologies — Writing a Snapshot
(Figure: new data a' is redirected to a new location while the snapshot retains the original block.)
(Figure: delivering a snapshot policy generates snapshots at 01:00:00, 02:00:00, 03:00:00, and 04:00:00; a snapshot can be rolled back or reactivated.)
If a source LUN covered by continuous data protection is damaged, the source LUN's data can be restored to any point in time preserved by snapshots.
Application Scenarios
(Figure: re-purposing of data — create a snapshot, create snapshot duplicates, and read the snapshot duplicates. A snapshot can also be rolled back; the rollback completes or is stopped.)
State Transition Diagram
(Diagram: creating a snapshot leads to the activation state, from which snapshot duplicates can be created; deactivating a snapshot leads to the deactivation state, from which the snapshot can be deleted.)
OceanStor Dorado V6
Storage Systems
HyperReplication
Feature Overview
Term | Definition
Remote replication | Remote replication is the core technology in disaster recovery (DR) backup. It can be used for remote data synchronization and DR. Remote replication allows you to remotely maintain one or multiple data copies at a storage system at another site. In case a disaster occurs at one site, data copies at the other site are not affected and can be used for DR.
(Figure: synchronous remote replication write flow, steps 1–5, between the host, the primary cache and primary LUN on the primary storage system, and the secondary cache and secondary LUN on the secondary storage system, over remote replication links.)
(Figure: asynchronous remote replication write flow. 1. The host writes data block N to the primary cache and the write result is returned. 2. The difference is recorded in the DCL (difference bitmap). 3. The primary and secondary LUN snapshots are activated. 4. Incremental data is synchronized to the secondary cache over the remote replication links; the log is deleted if all writes are successful. 6. The primary and secondary LUN snapshots are stopped.)
Comparison Between Synchronous and
Asynchronous Remote Replication
Item | Synchronous Remote Replication | Asynchronous Remote Replication
Data synchronization period | In real time | Periodic
Data amount of each synchronization | The primary and secondary LUNs are kept synchronized in real time. | Depends on the amount of data changed on the primary LUN in a synchronization period.
Impact on the primary LUN | Large | Small
(Figure: consistency group example.
1. Initial status: remote replication sessions 01, 02, and 03 pair primary LUN01/02/03 with secondary LUN01/02/03; in one case no consistency group is used, in the other the sessions belong to consistency group 01.
2. Data replication: remote replication session 02 fails, so the data status of the LUNs diverges.
3. Data recovery: after a disaster occurs, the secondary storage system is used for data recovery. Without a consistency group, data on the primary LUN is invalid because it is not data of the same point in time; with consistency group 01, data on the primary LUN is valid for data recovery.)
Application Scenario 1: Centralized
Disaster Backup
(Figure: service sites 01 to n replicate to a central backup site. Remote replication session 01 pairs primary LUN 01 with secondary LUN 01 synchronously; sessions 02 to n pair primary LUNs 02 to n with secondary LUNs 02 to n asynchronously, using snapshots 02 to n.)
Application Scenario 2: Two-Site
Active-Active Service Continuity
Key Technologies
⚫ Multi-Point-in-Time Caching Technology
⚫ Secondary-LUN Write Protection Cancelation Technology
(Secondary LUNs Writable)
Application Scenarios
➢ Users need the secondary LUN for data analysis and mining without affecting services on the primary LUN.
➢ The DR storage array needs to take over services upon a fault in the production storage array, but a primary/secondary switchover cannot be completed normally.
(Figure: two OceanStor Dorado V6 arrays connected over SAN and WAN with synchronous/asynchronous replication; the primary end sends a disaster message, and the secondary host reads and writes DR data.)
Advantages
This technology accelerates service recovery. In addition, after the secondary LUN is read and written, an incremental synchronization can be performed, enabling services to be switched back rapidly after a disaster recovery.
Multi-Link Redundancy Technology
(Figure: controllers A and B in Engine0 and Engine1 at both sites, connected through redundant iSCSI and FC links.)
Multi-Link Redundancy Technology
⚫ Specifications:
Each controller provides a maximum of 8 links for supporting
remote replication.
⚫ Characteristics:
Links have a mutually redundant relationship. As long as one link is
available, the replication service will run smoothly.
The load is balanced among multiple links, with the optimal paths
always preferred.
Variable-Granularity Small DCL
Bitmap Technology
⚫ Context:
DCLs are logs recording differentiated data. Their chunk granularity is 64 KB. In
the event that small-granularity (< 64 KB) I/Os require chunk replication, small
bitmap technology is used. A 64-KB chunk is divided into 4-KB chunks to
record data differences, with the query-returned chunk granularity being 4 KB
x N (N ranges from 1 to 16). That is, N pieces of differentiated data with
consecutive addresses are combined as a chunk.
(Figure: a 64 KB chunk divided into sixteen 4 KB sub-chunks, numbered 0 to 15.)
⚫ Advantages:
1. Reduces the amount of replicated data, shortens synchronization
duration, and improves replication performance.
2. Mitigates data loss and lowers RPO.
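To make the bitmap mechanics concrete, the sketch below tracks a 64 KB chunk as sixteen 4 KB sub-chunks and merges consecutive dirty sub-chunks into extents for replication; the helper names and data layout are assumptions for illustration.

```python
# Illustrative sketch of the variable-granularity small DCL bitmap.
SUB = 4 * 1024          # 4 KB sub-chunk
PER_CHUNK = 16          # 16 sub-chunks per 64 KB chunk

def mark_dirty(bitmap, offset, length):
    """Set the 4 KB sub-chunk bits touched by a small write inside the chunk."""
    first, last = offset // SUB, (offset + length - 1) // SUB
    for i in range(first, last + 1):
        bitmap[i] = True

def dirty_extents(bitmap):
    """Merge consecutive dirty sub-chunks into (offset, length) extents (4 KB x N)."""
    extents, start = [], None
    for i, dirty in enumerate(list(bitmap) + [False]):   # sentinel to flush the run
        if dirty and start is None:
            start = i
        elif not dirty and start is not None:
            extents.append((start * SUB, (i - start) * SUB))
            start = None
    return extents

bitmap = [False] * PER_CHUNK
mark_dirty(bitmap, offset=6 * 1024, length=3 * 1024)     # a small 3 KB write
mark_dirty(bitmap, offset=20 * 1024, length=4 * 1024)
print(dirty_extents(bitmap))   # only ~12 KB is replicated instead of 64 KB
```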
Application Planning of Remote
Replication
⚫ Mirrors data from the production center to the disaster recovery center.
⚫ Enables the disaster recovery center to take over services in case of a disaster
in the production center.
⚫ Restores data to the production center after the production center recovers.
(Figure: remote data mirroring between the OceanStor storage system in the production center and the OceanStor storage system in the disaster recovery center.)
Typical Networking and Connections
for Remote Replication
Synchronous Remote Replication's Bandwidth and Network Requirements
(Figure: controllers A and B in Engine0 and Engine1 at both sites, connected over iSCSI and FC replication links.)
Typical Networking and Connections
for Remote Replication
(Figure: the production center and the DR center are connected through their SANs over a LAN/WAN; the replication data flow uses synchronous/asynchronous replication.)
End
(Figure: FusionSphere clusters at both sites.)
• When the quorum server fails, HyperMetro automatically enters static priority mode. The two arrays still work normally.
• When communication between arrays A and B fails, the preferred site continues
working while the array at the non-preferred site stops working.
Why Are Arbitration and Dual-Arbitration
Needed?
(Figure: failure scenarios without arbitration, with a single quorum server, and with active and standby quorum servers — HyperMetro link failures and device faults on arrays A and B.)
➢ If the active quorum server fails, storage arrays A and B negotiate to switch arbitration
to the standby quorum server. If storage array A fails later, the standby quorum server
implements arbitration.
➢ If links between the active quorum server and storage array B are down, storage arrays
A and B negotiate to switch arbitration to the standby quorum server. If storage array A
fails later, the standby quorum server implements arbitration.
Arbitration Policies in Static Priority Mode
No. | Fault Type | HyperMetro Pair Running Status | Arbitration Result
1 | The link between the two storage arrays breaks down. | To be synchronized | LUNs of array A run services and LUNs of array B stop.
2 | The storage array in data center B (non-preferred site) malfunctions. | To be synchronized | LUNs of array A run services and LUNs of array B stop.
3 | The storage array in data center A (preferred site) malfunctions. | To be synchronized | LUNs on both arrays stop. You must forcibly start the storage array in data center B to enable it to provide services for hosts.
The black line between two data centers indicates the HyperMetro replication network.
Arbitration Policies in Quorum Server Mode
(Table: for each fault scenario in quorum server mode, the No., Diagram, HyperMetro Pair Running Status, and Arbitration Result are listed.)
(Figure: a 100 km, 8 Gbit/s Fibre Channel/10GE link. Traditional flow: 1. Write Command, 2. Transfer Ready, data transfer, 8. Status Good — two cross-site exchanges. FastWrite flow: the write command and data are combined, so only one exchange is needed.)
⚫ Traditional solution: Write I/Os require two interactions between the two sites (write command and data transfer), so a 100 km transfer link costs two round trip times (RTT).
⚫ FastWrite: A private protocol combines the two interactions (write command and data transfer), reducing cross-site write I/O interactions by 50%. The 100 km transfer link then costs only one RTT, improving service performance by about 25%.
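A rough latency calculation for the 100 km example, assuming about 5 µs of one-way propagation delay per kilometre of fibre (an assumption, not a measured figure):

```python
# Back-of-the-envelope link-latency arithmetic for the FastWrite comparison.
ONE_WAY_US_PER_KM = 5
distance_km = 100

rtt_ms = 2 * distance_km * ONE_WAY_US_PER_KM / 1000      # ~1 ms per round trip
traditional = 2 * rtt_ms                                  # write cmd + data transfer
fastwrite = 1 * rtt_ms                                    # combined in one exchange
print(f"RTT: {rtt_ms:.1f} ms, traditional: {traditional:.1f} ms, "
      f"FastWrite: {fastwrite:.1f} ms "
      f"({(1 - fastwrite / traditional):.0%} less time on the cross-site link)")
```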
Multipathing Routing Algorithm
Optimization — Host Data Access Optimization
(Figure: local HA with short-distance deployment versus sites A and B with long-distance deployment.)
Load balancing mode (applicable to local HA scenarios)
⚫ Cross-array I/O load balancing is achieved in this mode.
⚫ This mode is applicable to short-distance deployment scenarios such as the same equipment room.
⚫ I/Os are delivered to both storage arrays and storage resources are fully utilized, improving performance.
Preferred storage array mode (applicable to same-city active-active storage scenarios)
⚫ This mode greatly reduces cross-site accesses and the transfer latency.
⚫ This mode is applicable to long-distance deployment scenarios.
⚫ In UltraPath, the hosts at site A are specified to access the storage array at site A first and the hosts at site B are specified to access the storage array at site B first. I/Os are only delivered to the preferred storage array.
Thin Copy — Quick Initialization/Incremental
Data Synchronization
(Figure: comparison of the traditional data synchronization solution and the Huawei thin copy solution between site A storage and site B storage. The traditional solution fully copies every block, including zero blocks; the thin copy solution copies only the allocated data blocks and sends a single command for the zero blocks.)
Force Start (Operation)
(Figure: a force start operation is performed on one array in each scenario.)
For associated LUNs, data may be invalid if a HyperMetro consistency group is not used.
HyperMetro With a Consistency Group
For associated LUNs, a HyperMetro consistency group effectively prevents data loss.
Impacts and Restrictions (1)
1. Capacity requirements
Reserve 1% of the LUN capacity in the storage pool where the LUN resides when applying HyperMetro to
the LUN.
2. Relationship between LUNs used by HyperMetro and the LUNs of other value-added
features
LUN of Other Value-Added Features | HyperMetro Configured Before Other Features | HyperMetro Configured After Other Features
(Both columns refer to the local/remote LUN of HyperMetro.)
Source LUN of a snapshot | Yes | Yes
Snapshot LUN | No | No
Source LUN of a clone | Yes | Yes
Clone LUN | No | No
Primary LUN of HyperReplication | Yes | Yes
Secondary LUN of HyperReplication | No | No
Source LUN of SmartMigration | No | No
Target LUN of SmartMigration | No | No
Mapping | Yes | Local LUN: Yes; Remote LUN: No
SmartCompression | Yes | Yes
SmartDedupe | Yes | Yes
SmartThin | Yes | Yes
SmartVirtualization | No | No
Impacts and Restrictions (2)
3. Application restrictions
(1) After HyperMetro is configured for a LUN (a remote or local LUN), this LUN cannot be mapped to the
local storage system for a takeover.
(2) This iSCSI host port cannot be bound to an Ethernet port; otherwise, the active-active services may fail.
(3) Ports of the active-active replication network and the host-to-storage network must be physically
isolated and cannot be the same.
(4) After a HyperMetro pair is deleted, you are not advised to map the two LUNs of the deleted
HyperMetro pair to the same host.
4. Device requirements
• You can only create an active-active relationship between two storage systems of the same model.
• The HyperMetro license must be available for the storage arrays in both data centers.
• The version of the storage arrays must be C00 or later.
Installation Process
Establish links among the network between hosts and storage arrays,
active-active replication network, same-city network, and the arbitration
network. Connect the cables as planned.
Ensure that all devices and their hardware are properly installed. Power on
the devices.
(Flowchart: Start → Add a remote device → Create a quorum server (local) → Create a quorum server (remote) → Create a HyperMetro domain → Create a HyperMetro pair → (Optional) Create a HyperMetro consistency group → Map LUNs to a host (local or remote).)
Configuring SAN HyperMetro — Adding
a Remote Device
Select FC or IP.
• When the network distance exceeds 25 km, enable the FastWrite function of the replication links.
✓ Fibre Channel links: Run the change port fc fc_port_id=XXX fast_write_enable=yes command to enable the FastWrite function of Fibre Channel ports, where fc_port_id can be obtained by running the show port general command.
✓ iSCSI links: Run the change remote_device link link_type=iSCSI link_id=XXX fast_write_enable=yes command to enable the FastWrite function of iSCSI ports, where link_id can be obtained by running the show remote_device link command.
Configuring SAN HyperMetro — Creating a
Quorum Server
Running Status:
Connected
Configuring SAN HyperMetro — Creating a
HyperMetro Domain
Note: If you have selected a HyperMetro consistency group here, you do not need to create one later.
Configuring UltraPath Policies —
Windows/Linux/AIX/Solaris
Huawei UltraPath provides two working modes for HyperMetro: Priority and Balance. You are advised to select the Priority mode and specify the primary array for each host so that load is balanced across the two arrays. By default, Huawei UltraPath works in Priority mode and takes the array with the largest serial number (SN) as the primary array; in practical applications, modify the primary array setting to achieve load balancing.
If ESXi hosts are deployed in a cluster, configure the APD to PDL function.
• Configure Huawei UltraPath.
• Run the esxcli upadm set apdtopdl -m on command.
• Run the esxcli upadm show upconfig command to view the configuration result.
If the APD to PDL Mode value is on, the APD to PDL function of ESXi hosts is
successfully enabled.
Configuring the Virtualization Platform
— VMware Configuration Requirements
Mandatory configuration items:
✓ Deploy ESXi hosts across data centers in an HA cluster.
Configure the cluster with HA advanced parameter. For VMware vSphere 5.0 u1 and later
versions, set the das.maskCleanShutdownEnabled = True parameter.
✓ VM service networks and vMotion networks require L2 interworking between data
centers.
✓ Configure all ESXi hosts with the following advanced parameters and the UltraPath apdtopdl switch.
Recommended configuration items:
✓ The vMotion network, service network, and management network must be configured as
different VLANs to avoid network interference.
✓ The management network includes the vCenter Server management node and ESXi hosts that are not accessible to external applications.
✓ The service network is divided into VLANs based on service requirements to ensure
logical isolation and control broadcast domains.
✓ In a single cluster, the number of hosts does not exceed 16. If the number of hosts
exceeds 16, you are advised to use the hosts to create multiple clusters across data
centers.
✓ A DRS group must be configured to ensure that VMs can be recovered first in the local
data center in the event of the breakdown of a single host.
Configuring the Virtualization Platform
— vSphere Configuration Requirements
Mandatory configuration items:
✓ Deploy CNA hosts across data centers in a cluster.
✓ HA is enabled to ensure that VMs can restart and recover when the hosts where the VMs
reside are faulty.
✓ VM service networks require L2 interworking between data centers.
✓ Both data centers are configured with a VRM in active and standby mode, using the local
disks.
✓ Select FusionSphere V100R005C10U1 and later and choose Huawei UltraPath for
multipathing software.
Recommended configuration items:
✓ Computing resource scheduling must be enabled to ensure that VMs can be recovered first
in the local data center in the event of the breakdown of a single host.
✓ The VM hot migration network, service network, and management network must be
configured as different VLANs to avoid network interference.
✓ The management network includes the VRM management node and CNA hosts that are
not accessible to external applications.
✓ The service network is divided into VLANs based on different services to ensure logical
isolation and control broadcast domains.
Configuring the Virtualization Platform
— Hyper-V Configuration Requirements
Windows clusters:
✓ Perform the following operations to set the timeout parameter of
clusters' quorum disks to 60 seconds (20 seconds by default):
Open PowerShell and run Get-Cluster | fl *.
Check whether the QuorumArbitrationTimemax parameter value is 60. If not,
go to the next step.
Run (Get-Cluster cluster_name).QuorumArbitrationTimemax=60.
⚫ Oracle RAC clusters:
✓ Oracle RAC clusters are deployed in Automatic Storage Management (ASM)
mode. You are advised to use the External redundancy mode.
✓ You are advised to store the arbitration file, redo log file, system
data file, user data file, and archive log file in different ASM disk
groups.
✓ You are advised to create three redo log groups for each thread. The
size of a redo log must allow a log switchover every 15 to 30 minutes.
OceanStor Dorado V6
Storage System
SmartMigration
Feature Description
⚫ Background
With the evolution of storage technologies, the need for service migration
arises as a result of storage system upgrade or storage resource reallocation.
Mission-critical services, in particular, must be migrated without being
interrupted. Service migration may take place either within a storage system
or between storage systems.
⚫ Definition
SmartMigration, a key service migration technology, migrates host services
from a source LUN to a target LUN without interrupting these services and
then enables the target LUN to take over services from the source LUN after
replication is complete. After the service migration is complete, all service-
related data has been replicated from the source LUN to the target LUN.
Feature Description
Characteristics Description
SmartMigration tasks are executed without interrupting host services,
Reliable service continuity preventing any loss caused by service interruption during service
migration.
After a SmartMigration task starts, all data is replicated from the source
LUN to the target LUN. During the migration, I/Os delivered by hosts will
Stable data consistency be sent to both the source and target LUNs using dual-write, ensuring
data consistency between the source and target LUNs and preventing
data loss.
3. The source LUN and target LUN return the data write result to the SmartMigration module.
4. The SmartMigration module determines whether to clear the DCL based on the data write result.
(Figure: dual-write flow, steps 1–5, among the host, the SmartMigration (LM) module, the DCL log, and the source and target LUNs in the storage system.)
Working Principles
⚫ In a storage system, each LUN and its corresponding data volume has a unique identifier, namely, LUN ID
and data volume ID. A LUN corresponds to a data volume. The former is a logical concept whereas the
latter is a physical concept. LUN information exchange changes the mapping relationship between a LUN
and a data volume. That is, without changing the source LUN ID and target LUN ID, data volume IDs are
exchanged between a source LUN and a target LUN. As a result, the source LUN ID corresponds to the
target data volume ID, and the target LUN ID corresponds to the source data volume ID.
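A minimal sketch of that exchange is shown below; the structures are illustrative and not the product's metadata format.

```python
# Sketch of the LUN information exchange: host-visible LUN IDs stay the same
# while the underlying data-volume IDs are swapped.
class Lun:
    def __init__(self, lun_id, data_volume_id):
        self.lun_id = lun_id                    # logical, host-visible identifier
        self.data_volume_id = data_volume_id    # physical data volume behind it

def exchange_data_volumes(source: Lun, target: Lun):
    """Swap only the data-volume IDs; the LUN IDs are untouched."""
    source.data_volume_id, target.data_volume_id = (
        target.data_volume_id, source.data_volume_id)

src = Lun(lun_id="LUN_0001", data_volume_id="DV_A")
tgt = Lun(lun_id="LUN_0002", data_volume_id="DV_B")
exchange_data_volumes(src, tgt)
print(src.lun_id, "->", src.data_volume_id)     # LUN_0001 -> DV_B
print(tgt.lun_id, "->", tgt.data_volume_id)     # LUN_0002 -> DV_A
```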
SmartMigration Consistent Splitting
⚫ Consistent splitting of SmartMigration enables simultaneous splitting on multiple related LUNs. As a
result, data consistency can be ensured and services of the target LUN are not affected. After
SmartMigration pairs are split, the data written to the source LUN by the host is not synchronized to the
target LUN.
SmartMigration State Transition
(Figure: state transition diagram covering initial creation, queuing, synchronization start, disconnection, fault rectification, and splitting.)
1. Synchronizing: Data on the source LUN is being synchronized to the target LUN.
2. Normal: Data synchronization between the source LUN and the target LUN is complete.
3. Queuing: The pair is waiting in a queue because the number of copy tasks has reached the maximum value; synchronization starts once the number of copy tasks falls below the maximum.
4. Interrupted: The replication relationship between the source LUN and the target LUN is interrupted due to I/O errors.
5. Migrated: Data synchronization between the source LUN and the target LUN is complete and the splitting is complete.
Storage System Replacement
⚫ When users plan to upgrade their storage systems, for example, to replace heterogeneous storage, SmartMigration migrates services from the source LUN to the target LUN without interrupting them.
(Figure: service migration from a source LUN using a RAID 5 policy to a target LUN using a RAID 6 policy.)
Impact and Restrictions
Impact on performance
⚫ When SmartMigration is in use, operations such as data migration and dual-write consume
CPU resources, increasing the access latency and decreasing the write bandwidth.
During the migration, enabling SmartMigration increases the average latency of the
source LUN by no more than 30% and the average total latency is no more than 2 ms.
When SmartMigration is enabled and the target LUN is faulty, the latency of the
source LUN increases by no more than 15% in the case of writing data to only the
source LUN and the average total latency is no more than 1.5 ms.
⚫ You are advised to use the moderate migration speed to perform migration in common
scenarios. The impact of migration on host performance increases as the migration speed
increases. Therefore, users can reduce the impact of SmartMigration on host performance
by dynamically adjusting the migration speed.
Impact and Restrictions
Restrictions
⚫ The capacity of the target LUN must not be smaller than that of the source LUN.
⚫ Neither the source LUN nor the target LUN can be used by any value-added feature.
⚫ The source and target LUNs must belong to the same controller.
(Figure: the performance of common applications is limited to avoid affecting critical applications while maintaining overall performance.)
Users can create a traffic control policy to limit the performance of non-critical
applications.
Scenario 1: Preventing Mutual Impact
Between Applications
Application Type | I/O Characteristic | Peak Hours of Operation
Archive and backup | Sequential large I/Os, typically measured in bandwidth | 00:00 to 08:00
(Figure: priorities are configured for applications, for example critical applications (high) and important applications (medium), to maintain overall performance.)
SmartQoS Portal
On OceanStor DeviceManager, choose Provisioning > Resource
Performance Tuning > SmartQoS.
Configuring the I/O Priority
Configure the I/O priority for a LUN based on the importance of
applications processed by the LUN. The three I/O priorities are
Low, Medium, and High.
Creating a SmartQoS Policy (1)
Step 1: On the Traffic Control tab, click Create. Specify the policy
name and type in the displayed dialog box.
Creating a SmartQoS Policy (2)
Step 2: Set the control objective.
Do not set the control objective to too small a value. The value displayed in the
following figure is provided as an example. A big difference between the value
and the actual service load leads to high latency, which may adversely affect
host services and other services such as HyperMetro and HyperReplication.
Creating a SmartQoS Policy (3)
Step 3: Set the time period for which the policy comes into
effect.
Creating a SmartQoS Policy (4)
Step 4: Add LUNs to the policy.
Creating a SmartQoS Policy (5)
Step 5: Confirm the parameter settings and click Finish.
Creating a SmartQoS Policy (6)
Step 6: On the Traffic Control tab, you can view basic
information about all policies. There are three activity states
for policies: Unactivated, Idle, and Running.
(Figure: Activate and Deactivate operations switch a policy among the Unactivated, Running, and Idle states.)
Single point of failure
A single point of failure (SPOF) means that a fault at a single point of a network may cause the entire network to break down. To prevent single points of failure, high-reliability systems implement redundant backup for devices that may suffer single points of failure and adopt cross cable connections to achieve optimal reliability. Moreover, redundant paths help achieve higher performance.
(Figure: a server with two HBAs cross-connected to two storage arrays presenting multiple LUNs.)
Positioning of Multipathing Software – What Else Can Multipathing Software Do?
Load balancing
Load balancing is another critical function of multipathing software. With load balancing, the system can use the bandwidth of multiple links, improving overall throughput. Common load balancing algorithms include round-robin, minimum queue depth, and minimum task.
(Figure: with a single link the I/O load bottlenecks on that link; with balanced link loads the usable performance is doubled.)
Positioning of Multipathing Software – What Else Can Multipathing Software Do?
Positioning
UltraPath is a type of filter driver software running in the host kernel. It can intercept and process disk creation/deletion and I/O delivery of operating systems. Multipathing software ensures reliable utilization of redundant paths. If a path fails or cannot meet the performance requirement, multipathing software automatically and transparently transfers I/Os to other available paths to ensure that I/Os are transmitted effectively and reliably. As shown in the figure, multipathing software can handle many faults such as HBA faults, link faults, and controller faults.
(Figure: an application on a server issues I/Os to a vdisk; UltraPath and the HBAs forward them through the SAN to storage controllers A and B.)
Currently, multipathing solutions provided by storage vendors are classified into three types:
1. Use self-developed multipathing software, for example, EMC PowerPath, HDS HDLM, and Huawei UltraPath.
2. Provide storage adaptation plug-ins based on the multipathing framework of operating systems, for example, IBM and HP.
3. Use native multipathing software of operating systems (generally used by A-A arrays or A-A/A arrays supporting ALUA).
Currently, Windows and Linux are the most mainstream operating systems for x86 servers, AIX is the most mainstream in
minicomputers, and VMware ESX in virtualization platforms.
Native multipathing software of operating systems (often called MPIO) supports the failover and load balancing functions
and can cope with scenarios that have moderate requirements on reliability. Multipathing software developed by storage
vendors is more professional and delivers better reliability, performance, maintainability, and storage adaptation.
Overview of Huawei UltraPath
Introduction: a multipathing software program installed on hosts to improve service performance and availability.
(Table: supported operating systems. UltraPath is self-developed for some operating systems, such as Windows, Linux, and Solaris, and is based on the OS multipathing framework for others, such as AIX and ESX; other operating systems, such as HP-UX, are not supported.)
Remarks
⚫ UltraPath for AIX is based on the MPIO framework built in the OS, and provides the Path …
Redundancy Solution — Without Multipathing Software
(Figure: a server with a single HBA and a single link to external storage, which is a single point of failure.)
Redundancy Solution — Multipathing
Redundant links are established between the server and the storage array to prevent single points of failure, and multipathing software on the server manages the redundant paths.
(Figures: a server running multipathing software connected to the storage array through redundant HBAs and links.)
Redundancy Solution — Multipathing + Cluster
(Figure: two servers, each running multipathing software, form a cluster in front of a shared storage array.)
Redundancy Solution — Multipathing + Cluster + Active-Active = High Availability
Arrays are added for redundancy and backup. Cluster software (such as WSFC or VCS) runs on each server. Multipathing + cluster + active-active = high availability.
(Figure: two servers with multipathing and cluster software connected to two active-active storage arrays.)
(Table fragment, from the comparison of basic load balancing functions across PowerPath, Windows MPIO, DM-Multipath, AIX MPIO, and UltraPath:)
Load balancing algorithm (importance: Medium): the products offer algorithms such as round-robin, least-io, least-block, adaptive, CLARiiON/Symmetrix optimization, Stream I/O, Least Queue Depth, Weighted Paths, Least Block, queue-length, service-time, min-queue-depth, and min-task. Except for round-robin, the load balancing algorithms do not differ much in their actual performance.
Load balancing based on path groups (importance: High): supported by several of the products, with path groups identified through ALUA on some of them.
Comparison Between Huawei UltraPath and Multipathing Software from Competitors — Advanced Functions
(Columns: Importance Degree, PowerPath, Windows MPIO, DM-Multipath, AIX MPIO, UltraPath.)
APD (All Paths Down) protection, importance High (reliability function): supported by PowerPath on some platforms and supported by UltraPath; largely not supported by the native OS multipathing stacks.
Isolation of intermittently faulty paths, importance High (reliability function): supported by PowerPath, but isolated paths cannot be restored automatically after the isolation; not supported by Windows MPIO, DM-Multipath, or AIX MPIO; supported by UltraPath.
Isolation of links that have bit errors, importance High (reliability function): supported by PowerPath on some platforms through the autostandby function, with only one isolation algorithm and paths recovered after a fixed period of time; not supported by Windows MPIO, DM-Multipath, or AIX MPIO; supported by UltraPath, with different isolation algorithms for different types of faults and a special recovery test mechanism.
Comparison Between Huawei UltraPath and Multipathing Software from Competitors — Advanced Functions (continued)
Path exception alarming, importance High (reliability function): supported by PowerPath; not supported by Windows MPIO, DM-Multipath, or AIX MPIO; UltraPath pushes information to the array, provides centralized alarming, and supports multiple types of path alarms such as path failure and no redundant controllers.
GUI centralized management, importance Low (after multipathing software is installed, this kind of management is rarely needed): supported by PowerPath (PowerPath Viewer); not supported by Windows MPIO, DM-Multipath, AIX MPIO, or UltraPath.
Path performance monitoring, importance Medium (used to diagnose problems): PowerPath provides monitoring from multiple dimensions (IOPS, bandwidth, and latency; I/O size; read/write requests); not supported by Windows MPIO, DM-Multipath, or AIX MPIO; UltraPath collects IOPS and bandwidth statistics based on read/write requests.
Comparison Between Huawei UltraPath and Multipathing Software from Competitors — Advanced Functions (continued)
Smooth online upgrade of arrays, importance Medium (without this function, services are not interrupted but only temporarily congested): supported by PowerPath; with Windows MPIO, DM-Multipath, and AIX MPIO, I/Os drop to zero during the upgrade process; supported by UltraPath, which identifies through ALUA the controller that is about to go offline and switches over the controller in advance.
Manually disabling paths (used for smoothly transferring services before replacing a component), importance Medium (same note as above): support in the competing products ranges from not supported, to disabling logical paths only, to disabling paths based on the HBA ports and controller ports that correspond to the faulty components; UltraPath can disable paths either by disabling a specified controller or by disabling a specified physical path identified by the HBA port plus the target port ID.
Remote active-active DC solution, importance Medium (applies to special scenarios): PowerPath supports active-active with VPLEX; not supported by Windows MPIO, DM-Multipath, or AIX MPIO; UltraPath supports the VIS active-active and self-developed active-active modes.
Automatic host registration, importance Medium: supported by PowerPath and UltraPath; not supported by Windows MPIO, DM-Multipath, or AIX MPIO.
Comparison Between Huawei UltraPath and Multipathing Software of Competitors — DFX
(Columns: Importance Degree, PowerPath, Windows MPIO, DM-Multipath, AIX MPIO, UltraPath.)
Automatic environment dependency check during installation, Low: PowerPath requires additional tools; not supported by Windows MPIO, DM-Multipath, or AIX MPIO; supported by UltraPath.
Automatic environment parameter configuration during installation, Low: not supported by PowerPath, Windows MPIO, DM-Multipath, or AIX MPIO; supported by UltraPath.
Silent installation, Low: supported by PowerPath on some platforms; N/A for Windows MPIO, DM-Multipath, and AIX MPIO (bound with the operating system version); supported by UltraPath.
No reboot upgrade (NRU), High: supported by PowerPath on some platforms; N/A for Windows MPIO, DM-Multipath, and AIX MPIO (bound with the operating system version); supported by UltraPath.
Non-interruptive upgrade, High: not supported by PowerPath; N/A for Windows MPIO, DM-Multipath, and AIX MPIO (bound with the operating system version); supported by UltraPath on some platforms.
Multi-platform unified user interface, Medium: supported by PowerPath; not supported by Windows MPIO, DM-Multipath, or AIX MPIO; supported by UltraPath.
Automatic storage identification, Low: supported by PowerPath; manual configuration required for Windows MPIO, DM-Multipath, and AIX MPIO; supported by UltraPath.
Co-existence with third-party multipathing software, High: supported by PowerPath, Windows MPIO, DM-Multipath, and AIX MPIO; supported theoretically by UltraPath, with the need to verify the specific version.
Comparison Between UltraPath and Native
Multipathing Software of Operating Systems—
Overview
Fault Source and Symptom | UltraPath | Multipathing Software Built in OSs
Fault symptom: components are faulty and cannot receive or send any signal. | Isolates the faulty path. | Isolates the faulty path.
Fault symptom: connections are not stable because cables are not firmly connected to ports. | Isolates the faulty path permanently. | Cannot isolate the faulty path permanently; performance deteriorates intermittently.
Fault symptom: signals of optical fibers or modules are weak, causing packet loss or error packets. | Isolates the faulty path permanently. | Cannot isolate the faulty path permanently; performance deteriorates intermittently.
Fault symptom: the transmission delay is long. | Isolates the path. | Cannot isolate the path permanently; performance deteriorates intermittently.
Fault symptom: components are reset repeatedly. | Isolates the faulty path permanently. | Cannot isolate the faulty path permanently; performance deteriorates intermittently.
Fault source: host HBAs, optical fibers, switches, storage controllers, or interface modules. | Isolates the faulty path. | Isolates the faulty path.
Fault source: channel within a storage controller used to access LUNs. | Isolates the faulty path. | Cannot handle the problem perfectly; services may be interrupted.
The fault symptoms and sources that UltraPath can handle are five times
and 1.2 times, respectively, as many as the native multipathing software
of operating systems can handle. The comprehensive coverage increases
6-fold.
Comparison Between UltraPath and Multipathing
Software from Competitors — Overview
Field | Function Item | Huawei | EMC | HDS | IBM/HP/NetApp
Basic services:
Performance | I/O load balancing | Supported | Supported | Supported | Some operating systems only support the round-robin algorithm.
Performance | Performance consumption of software stacks | Relatively large | Relatively large | Relatively small | Small
Reliability | Isolation of intermittently faulty links | Supported | Not supported | Supported | Not supported
Reliability | Isolation of links that have bit errors | Supported | Supported | Not supported | Supported by AIX only
Reliability | Duration of I/O suspension in a path fault | 1s to 2s (except AIX) | 1s to 60s | 1s to 60s | 1s to 60s
Reliability | Duration of I/O suspension in the case of timeout | ≥ 30s | ≥ 30s | ≥ 30s | ≥ 30s
Management and maintenance | Path performance monitoring | Supported | Supported | Supported | Not supported
Management and maintenance | Path topology query | Supported | Supported | Supported | Not supported
Management and maintenance | Disabling paths/Standby | Disabling is supported only. | Supported | Disabling is supported only. | Disabling is supported only.
Management and maintenance | Log audit | Supported | Supported | Supported | Not supported
Interoperability | SAN-Boot | Supported by mainstream operating systems | Supported | Supported | Supported
Interoperability | Operating system | Mainstream operating systems supported | Supported | Supported | Supported
Interoperability | Virtualization platforms of OS vendors | Supported | Supported | Supported | N/A
Advanced services:
Performance | Optimization of active-active path selection algorithm | Supported | Supported | Supported | Not supported
Performance | NVMe | Not supported | Not supported | Not supported | Not supported
Reliability | APD retry | Supported | Supported | Not supported | Supported by Linux only
Reliability | Reactive autorestore (the software tests dead paths when no path is available for I/O flows) | Supported by AIX only | Supported | Not supported | Supported by AIX only
Reliability | No I/O interruption when components are replaced proactively | Supported (online array upgrade) | Not supported | Supported | Not supported
Management and maintenance | GUI centralized management | Not supported | Supported | Supported | Not supported
Event and alarm | Unified alarming | Messages are sent to the array for unified alarms. | SNMP trap, Syslog, SCOM | SNMP trap, Syslog | Not supported
Automatic host registration | Supported | Supported | Not supported | Not supported
Note: For details about command usage, see the user guide of UltraPath for the operating system.
For details about how to obtain the document, see Basic UltraPath Installation, Uninstallation, and
Upgrade.
Basic UltraPath Configuration Guide
— Windows
The following table describes frequently used commands for configuring UltraPath.
Command Description
set tpgstate Enable or disable the controller modules of the specified storage system.
set pathstate Enable or disable the specified physical path.
set workingmode Sets the working mode of UltraPath to load balancing between controllers or within a controller.
set loadbalancemode Sets the load balancing mode of UltraPath.
set luntrespass Sets the policy of switching over the working controller for LUNs. The default value is recommended.
set failbackdelaytime Sets the failback interval. The default value is recommended.
set ioretry Sets the number and interval of I/O retries. The default values are recommended.
set iosuspensiontime Sets the I/O suspension time. The default value is recommended.
set alarmenable Sets whether the host pushes alarms. The default value is recommended.
set path_reliability_enable Sets whether UltraPath path degradation is enabled. The default value is recommended.
set ied_min_io Sets the minimum number of I/Os for I/O discrete error isolation. The default value is recommended.
set ied_threshold Sets the I/O discrete error isolation threshold (ratio). The default value is recommended.
set ied_time Sets the time window for I/O discrete error isolation statistics. The default value is recommended.
set tod_recovery_time Sets the I/O timeout path recovery time. The default value is recommended.
set tod_threshold Sets the I/O timeout isolation threshold (times). The default value is recommended.
set tod_time Sets the time window for I/O timeout isolation statistics. The default value is recommended.
set hld_threshold Sets the high-latency path isolation threshold. The default value is recommended.
Note: For details about command usage, see the user guide of UltraPath for the operating system. For details about
how to obtain the document, see Basic UltraPath Installation, Uninstallation, and Upgrade.
Basic UltraPath Configuration Guide
— Windows
The following table describes frequently used commands for configuring UltraPath.
Command | Description
set hld_recovery_time | Sets the high-latency path recovery time. The default value is recommended.
set faulty_path_check_interval | Sets the faulty path routine test interval. The default value is recommended.
set idle_path_check_interval | Sets the idle path routine test interval. The default value is recommended.
set max_io_retry_timeout | Sets the timeout threshold of retrying an I/O. The default value is recommended.
set lb_io_threshold | Sets the number of I/Os consecutively delivered in load balancing mode. The default value is recommended.
set hypermetro workingmode | Sets the HyperMetro working mode. The default value is recommended.
set hypermetro split_size | Sets the size of slices during load balancing across HyperMetro arrays. The default value is recommended.
clear upconfig | Deletes UltraPath configuration information from virtual LUNs or the storage system.
clear obsolete_path | Deletes information about unused physical paths.
check status | Checks the UltraPath status.
start pathcheck | Checks the working status of the specified physical path.
start rebalancelun | Checks whether the configuration of LUNs' working controller is optimal and starts working controller switchover if necessary.
start migration | Switches the host I/O path to the target or source array.
start iosuspension | Suspends I/Os to the specified LUN.
stop iosuspension | Stops I/O suspension of a specified virtual LUN.
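For example, combining commands from the tables above with the workingmode syntax shown later in this guide (the exact CLI prefix and option format may vary with the UltraPath version, so treat this as a sketch):
upadm check status          # check the overall UltraPath status
upadm set workingmode=0     # assumed here to select inter-controller load balancing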
Note: For details about command usage, see the user guide of UltraPath for the operating system. For details
about how to obtain the document, see Basic UltraPath Installation, Uninstallation, and Upgrade.
UltraPath Parameter Settings in
Typical Application Scenarios
In most scenarios, the default settings of UltraPath are recommended. In some scenarios, you can configure UltraPath as instructed by the following:
upadm set workingmode={0|1}
⚫ It specifies the load balancing mode at the storage controller level. 0 indicates inter-controller load balancing and 1 indicates load balancing within a controller.
Prerequisites
1. The switch must support port failover (NPIV).
2. The network between the host and the storage array must be symmetrical. (Controllers A and B are connected to the same host and are in the same switching network.)
3. The HBA has no compatibility issue. Ensure that the connection can be set up again after port failover.
(Figure: host ports connected through a switch to ports P0 and P1 on controllers A and B of the storage array.)
Networking Requirements for Port Failover
Fully symmetric networking:
1. A host port is connected to both controller A and controller B.
2. A host port is connected to both controller A and controller B via the same number of links.
3. The array ports connected to a host port are symmetrical (the slot number and port number are the same).
Partially symmetric networking:
1. A host port is connected to both controller A and controller B.
2. A host port is connected to both controller A and controller B via the same number of links.
(Figure: host, switch, and array ports P0 to P3 on controllers A and B for the fully symmetric and partially symmetric cases.)
Upgrade Method — Offline Upgrade
⚫ If the upgrade is performed offline, you must stop host applications before upgrading the controller software. During an offline upgrade, all controllers are upgraded simultaneously, shortening the upgrade time. Because all host services are stopped before the upgrade, the risk of data loss during the upgrade is reduced.
Impact on Services
⚫ Online upgrade
During an online upgrade of controller software, the controller restarts and its services are taken over by other normal controllers. The read and write IOPS decrease by 10% to 20%. It is recommended that you perform online upgrades in off-peak hours.
⚫ Offline upgrade
You must stop host services before performing an offline upgrade of controller
software.
Preparations Before an Upgrade
⚫ Obtain upgrade reference documents.
⚫ Obtain software and related tools.
⚫ Array upgrade evaluation checks the array health status before the upgrade,
preventing interference caused by potential errors. Ensure that all check items are
passed before performing subsequent operations. If you want to upgrade the system
forcibly, ensure that you understand the risks and accept the possible consequences.
⚫ In most cases, you do not need to collect array and host information or evaluate
compatibility if all the array evaluation items are passed. The actual situation depends
on the array evaluation result. If the array information collection, host information collection, or compatibility analysis items become unavailable, it means the array upgrade evaluation has succeeded and you can skip these items.
Site Survey — Array Information
Collection
⚫ You can skip this operation if either of the following conditions is met:
⚫ The upgrade is performed on the same day when the array upgrade evaluation is passed.
⚫ The failed check items have been rectified, the array and host service configurations are not changed,
and the networking is not changed after the evaluation.
Upgrade Procedure — Array
Upgrade
Prerequisites
⚫ All the evaluation and check items in the site survey have been passed.
If you ignore the failed check items and want to upgrade the system forcibly, ensure that you understand the risks and accept the possible
consequences.
Upgrade Procedure — Solving
Upgrade Faults
⚫ If a fault occurs during the upgrade, the upgrade stops and can be retried or rolled
back after manual rectification and confirmation.
⚫ As shown in the figure, the status of the upgrade process is Paused. You can
click Details. In the Details window, select Retry or Roll Back.
Upgrade Procedure — Upgrading
SystemReporter
Prerequisites
It is recommended that the SystemReporter version be consistent with that in the storage array's version mapping
table. If the array is upgraded, SystemReporter must be upgraded as well. Otherwise, SystemReporter may not
monitor the performance statistics of the array.
Upgrade SystemReporter by following instructions in the OceanStor Dorado5000 V6, Dorado6000 V6, and
Dorado18000 V6 Storage Systems C30SPC100 SystemReporter Upgrade Guide.
Upgrade Procedure — Verifying
Upgrade Results
⚫ Checking system status
Check the system status using an inspection tool and ensure that the system status is not affected during the upgrade.
⚫ Rollback policy
Online upgrade: If the system is not being upgraded in the last batch of the upgrade, a rollback must be performed by maintenance engineers. If the system is being upgraded in the last batch of the upgrade, do not perform a rollback. Instead, solve the problem following the instructions in troubleshooting.
Offline upgrade: If the number of controllers that fail the upgrade equals or exceeds 50% of the total controller quantity, the upgrade stops and must be retried or rolled back manually by maintenance engineers. If the number of controllers that fail the upgrade is smaller than 50% of the total controller quantity, the upgrade can be retried or ignored and a rollback is not required.
Version Downgrade and Use
Scenarios
⚫ Version downgrade
In some cases, the controller software has to be downgraded to the source version even after
a successful upgrade.
If a downgrade is needed, contact Huawei technical support to evaluate the operation and
obtain the downgrade guide.
Precautions Before an Upgrade
⚫ Before an online upgrade, the available links between the storage system and a host must meet the
following requirements:
◆ At least one available link exists between controller A or C of each engine and the host.
◆ At least one available link exists between controller B or D of each engine and the host.
If your live network does not meet the preceding networking requirements, it is strongly recommended
that you modify your networking mode and then perform an online upgrade. If your networking mode
cannot be modified, adjust the batch upgrade sequence and then perform an online upgrade under
guidance of Huawei technical support engineers.
⚫ Before the upgrade, ensure that the target storage system version is compatible with other management
software of the customer, such as OceanStor BCManager.
⚫ Before the upgrade, ensure that all controllers on at least one engine have links to external LUNs.
⚫ If a local array has replication links to a remote array, you cannot configure the remote array (for example,
creating or deleting the remote array, or adding or removing replication links) if only the local array is
upgraded. Existing configurations are not affected and services can run normally.
⚫ Before an online upgrade, close all DeviceManager pages and do not log in to DeviceManager during the upgrade.
⚫ If the array has four controllers and its source version is C01SPC100, access the array using the IP
address of the CTE0.SMM0.MGMT port when performing the upgrade.
Precautions During an Upgrade
⚫ Do not configure the storage system.
⚫ Prevent other users who will not perform the upgrade from logging in to
the storage system.
⚫ Do not perform hardware operations (such as removing or inserting
interface modules, power modules in expansion enclosures, or disks).
Step 2 Confirm that I/Os have reached the front end of the storage system
and that the performance bottleneck is on the storage system.
Step 3 Verify that the storage system configurations provide the optimal
performance for the current types of services.
Step 4 Locate and eliminate the bottleneck on the storage system by using
command lines and tools.
Hardware's Impact on Performance
CPU
• Before analyzing the performance of front-end host ports, confirm the locations
of interface modules and the number, statuses, and speeds of connected ports.
You can use DeviceManager or the CLI to query information about front-end host
ports.
• If performance fluctuates frequently or declines unexpectedly, front-end host
ports or links may be abnormal. You can use DeviceManager or the inspection
report to check whether the front-end host ports have bit errors.
• Key performance indicators of front-end host ports include the average read I/O
response time, average write I/O response time, average I/O size, IOPS, and
bandwidth. You can use SystemReporter or the CLI to query these indicators.
Back-end Ports and Disks
• Back-end ports are SAS ports that connect a controller enclosure to a disk
enclosure and provide a channel for reading/writing data from/to disks. Back-end
SAS ports' impact on performance typically lies in disk enclosure loops. Currently,
OceanStor Dorado6000 V6 supports 12 Gbit/s SAS ports.
• A single SAS port provides limited bandwidth. The bandwidth supported by the
SAS ports in a loop must be higher than the total bandwidth of all disks in the disk
enclosures that compose the loop. In addition, as the number of disk enclosures in
a loop grows, the latency caused by expansion links increases, affecting back-end
I/O latency and IOPS. Considering these situations, when there are sufficient SAS
ports, disk enclosures should be evenly distributed to multiple loops.
• Due to the global application of the deduplication and compression technologies
and changes in the pool subsystem architecture, OceanStor Dorado6000 V6
currently supports only one disk domain and one storage pool. You do not need to
consider disk selection in disk domains for bandwidth-intensive services (to avoid
dual-port access and disk selection from different engines). However, you still need
to avoid using disks of different capacities or speeds in a disk domain to prevent
bottlenecks caused by single disks.
Impact of Storage Configurations on
Performance
Factors: RAID level, number of member disks, write policy, cache watermark, LUN ownership, deduplication and compression
RAID Levels – RAID5, RAID6, RAID-TP
(Figure: queue after compression, using 8 KB grains as an example; a full-stripe write is assembled across member disks D0 to D14.)
Relationship Between Performance
and the Number of Member Disks
➢ An SSD can carry only a certain number of random I/Os. This depends on its capacity,
type of chip, chip manufacturer, firmware version, and OP space.
➢ For random read/write services, a disk supports 5,000 to 12,000 IOPS. For bandwidth-
intensive services, a disk supports 120 MB/s bandwidth.
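As a rough illustration using these assumed per-disk figures, a disk domain with 20 such SSDs would offer on the order of 100,000 to 240,000 random IOPS and roughly 2.4 GB/s of sequential bandwidth before RAID write penalties, deduplication/compression overhead, and cache effects are taken into account.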
Impact of Storage Configurations on
Performance
Write Policy
⚫ There are three cache write policies: write through, write back
with mirroring, and write back without mirroring.
Mirroring
Read cache Read cache
RAID RAID
Write back without mirroring is not recommended
because the data will not have dual protection.
Disk Disk
Write Policy
⚫ Select the write policy based on your requirements on
performance and reliability.
Cache Watermark
When the write policy is write back, the cache uses the high
and low watermarks to control the storage capacity and
flushing rate for dirty data.
(Figure: write cache with high and low watermarks. When the dirty data volume exceeds the high watermark, the flushing rate increases; the flush thread flushes data to disks until the data volume falls below the low watermark.)
➢ The time for I/Os to stay in the cache largely depends on the value of the low watermark. A
higher low watermark will provide more opportunities for I/Os in the cache to be
consolidated, improving the random write performance.
➢ The default low watermark is 20%. To process multi-channel small sequential I/Os and
OLTP services in the SPC-1 model, you can increase the low watermark, for example, to
40% or 50%.
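As a simple illustration (assuming, hypothetically, 64 GB of write cache): with the default 20% low watermark, flushing continues until dirty data drops below about 12.8 GB; raising the low watermark to 40% lets up to about 25.6 GB of dirty data stay in the cache, giving small sequential and random writes more opportunity to be consolidated before they are flushed.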
Impact of Storage Configurations on
Performance
LUN Ownership – Accessing a LUN
Through the Owning Controller
⚫ When the host accesses LUN 1, it delivers the access request through controller A, the owning controller of the LUN.
Deduplication and Compression
⚫ Deduplication and compression can effectively improve space
utilization, reduce the amount of data written to disks, and extend
the lifespan of SSDs. However, they will consume additional
computing resources.
⚫ The deduplication and compression ratio depends on the
characteristics of user data.
(Figure: trade-off of enabling deduplication and compression across performance, space utilization, and disk lifespan.)
Performance problems fall into two categories: latency problems and fluctuation problems.
Locating Performance Problems –
Latency Problems
➢ Check whether the latency in the storage system is normal.
(Flowchart: if the latency inside the storage system is high, the bottleneck is in the storage system; analyze the I/O size, read/write ratio, and concurrency. If the latency inside the storage system is stable, the bottleneck is on the host or link; check the host and link configurations. If the latency is fluctuating, check garbage collection and QoS configurations that may cause fluctuation.)
Configuration Optimization Guideline
(Figure: the end-to-end I/O path from the application (OLTP, OLAP, multimedia), through the data container (database, file system), operating system (volume management/LVM, block device layer, multipath software, HBA driver and HBA card), and switching devices, to the storage subsystem (front-end channel, cache, LUN, RAID, back-end channel, disk), with the server's and storage system's CPU and memory implementing the function of each module.)
1. Streamline the logical modules based on the I/O process and performance requirements to minimize resource consumption by unnecessary operations.
2. Identify I/O hot spots and properly allocate hardware resources.
3. Ensure that the I/O size, concurrency, and stripes are aligned along the entire data storage process, minimizing unnecessary I/Os.
4. Make full use of the cache to consolidate and schedule data and improve the memory hit ratio.
Data Container Performance Tuning
– Database
Item | Recommendation
Tablespace | Allocate as many storage resources as possible for hotspot areas. Select Big File or Small File based on actual requirements.
Cache | Use about 80% of the host memory as the database cache.
Data block | OLTP: 4 KB or 8 KB; OLAP: 32 KB
Prefetch | Align with the ASM, LVM, or LUN stripe. A 512 KB or 1 MB prefetch window is recommended.
Index | Delete unnecessary indexes. Select B-tree or bitmap.
Partition | Partition a disk when it has more than 100 million records. Use range, list, and hash partitioning based on requirements.
Number of flush processes | Ensure that no free cache is waiting.
Log file | 32 MB to 128 MB, five per instance
Data Container Performance Tuning
– File System
⚫ The file system container processes the operations on files or
directories delivered by upper layer modules.
⚫ Select an appropriate file system.
File systems are classified into log and non-log file systems.
Service Scenario | Service | Applicable File System
Small file, random access | Database server, mail server, small e-commerce system, finance system | Ext3, ReiserFS
Large file, multi-channel sequential read | Video server | XFS
Large file, multi-channel sequential write | Video surveillance system | XFS
⚫ Adjust file system parameter settings.
(Charts: XFS transactions per second at different read ratios; the effect of the atime versus noatime mount options; performance in ext3 ordered mode.)
✓ I/O alignment
✓ I/O size alignment
✓ Start position alignment
✓ Prefetch window adjustment
✓ I/O scheduling policy adjustment
(Chart: IOPS and average response time of OLTP applications before and after I/O alignment.)
Operating System Performance Tuning –
Multipath and HBA Modules
⚫ The HBA module delivers I/Os to storage devices. Pay attention to the
following indicators.
Performance Indicator | Description | Performance of an 8 Gbit/s Fibre Channel HBA
Maximum number of concurrent requests | Indicates the maximum number of I/Os that an HBA can deliver in one period. This parameter is adjustable; you are advised to set it to the maximum value to prevent I/O congestion on the HBA. | The maximum number of concurrent I/Os is 256 on a single HBA port. The value can be adjusted by the Execution Throttle parameter.
Maximum I/O size | Indicates the maximum I/O size that an HBA can deliver without splitting the I/O. | Usually 1 MB. The value can be adjusted by the Frame Size parameter.
Maximum bandwidth | Indicates the maximum bandwidth of a single HBA port. You can add HBAs and network ports based on your actual storage bandwidth requirement. | The one-way bandwidth is about 750 MB/s on a single HBA port.
Maximum IOPS | Indicates the maximum IOPS of a single HBA port. You can add HBAs and network ports based on your actual storage IOPS requirement. | The IOPS is 100,000 on a single HBA port.
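As a quick sanity check based on the figures above, a bandwidth-intensive host that needs about 3 GB/s would require at least four 8 Gbit/s FC ports (about 750 MB/s one-way each), and the same four ports would top out near 400,000 IOPS (about 100,000 per port) before the HBAs themselves become the bottleneck.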
Operating System Performance Tuning –
Multipath and HBA Modules
➢ The multipath module controls access to storage devices by managing the paths between servers and the storage devices, improving path reliability and performance.
➢ Common multipath policies are as follows.
Routing Policy | Description | Application Scenario
ROUND_ROBIN | Static load balancing. I/Os are delivered to the optimal path in turn to reduce the I/O workload on a single path. | Applications with light I/O load
Minimum queue length | Dynamic load balancing. I/Os are delivered to the path that has the least number of I/Os. | Applications with heavy I/O load and requiring low I/O latency, for example, OLTP applications
Minimum data volume | Dynamic load balancing. I/Os are delivered to the path that has the minimum amount of data. | Applications with heavy I/O load and requiring large bandwidth, for example, OLAP and multimedia applications
Performance Tuning Overview for
Storage Systems
Policy | Recommendation
Cache write policy | Use write back unless otherwise required. Adjust the cache high/low watermarks based on actual requirements.
RAID level | The default value is RAID 6. Use RAID 5 if you require higher performance or space usage. Use RAID-TP if you require higher reliability.
Deduplication and compression | Use them based on customer requirements or data characteristics.
Performance Tuning Overview for
Storage Systems
➢ Reconfigure the network switching devices between storage
devices and servers to ensure network privacy.
✓ To prevent the network between storage devices and servers from
becoming a bottleneck or being interfered by other services, use
direct connection or a private network to ensure performance.
(Figure: a read and write performance problem can be caused by a link problem on the iSCSI or FC link, or because a slow disk exists.)
⚫ Generally, the IOPS will not reach the upper limit of a link. However, a
high IOPS may cause high usage on a single CPU core, especially the
cores for HBA card interrupts.
Flowchart for Troubleshooting
Network Performance Problems
(Flowchart: for a network performance problem, check the network bandwidth; on an iSCSI network, check the iSCSI network connectivity and bandwidth; on an FC network, check the FC network connectivity and bandwidth.)
✓ For a Fibre Channel network, you can check bit errors of ports on the
ISM. This helps you determine whether the performance problem is
caused by bit errors.
Troubleshooting Methods
➢ The following methods are available for checking network paths:
✓ Run upadmin show path to check the number of paths between the
host and the storage system and their connectivity.
✓ If multiple paths exist between the host and the storage system, you
can adjust the multipathing algorithm on the host to improve storage
performance.
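For example (upadmin is the command referenced above; the output format depends on the UltraPath version):
upadmin show path    # list the paths between the host and the storage system and their status
# If several paths are available, the load balancing mode can then be adjusted with the
# "set loadbalancemode" command described in the UltraPath configuration tables earlier.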
✓ On a Linux host, run iostat to check the storage resource usage.
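A typical invocation on a Linux host (column names follow the sysstat iostat utility; the thresholds to watch depend on the workload):
iostat -x 2 5    # extended device statistics, five samples at two-second intervals
# Watch r/s and w/s (IOPS), rkB/s and wkB/s (throughput), await (average latency in ms),
# and %util (device busy percentage) for the LUN devices presented by the array.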
Hosts' Impact on SAN Performance
➢ HBA card
✓ Maximum size of a single request
✓ Maximum number of concurrent requests
✓ HBA driver
Methods for Troubleshooting Host
Performance Problems
➢ Querying Windows host performance
To check the performance of a Windows host, first collect the performance
monitoring information to confirm the current I/O performance. On the
desktop, choose Start > Run and type Perfmon. You can create a counter log
and select Counter to view the I/O performance.
iostat Command
⚫ Basic principles
Help users quickly exclude useless information and locate faults.
⚫ Alarm analysis
Describe how to analyze alarms for troubleshooting a fault.
⚫ Replacement
Describe how to troubleshoot a fault by replacing components of a
storage system.
Troubleshooting Principles and Methods
— Troubleshooting Procedure
⚫ Troubleshooting procedure
Troubleshooting Principles and Methods
— Troubleshooting Procedure
⚫ Required fault information is as follows.
Category | Name | Remarks
Basic information | Device serial number and version | Provide the serial number and version of the storage device.
Basic information | Customer information | Provide the customer's contact person and contact information.
Fault information | Fault occurrence time | Record the time when the fault occurs.
Fault information | Symptom | Record details about the fault symptom, such as the content of error messages and event notifications.
Fault information | Operations performed before a fault occurs | Record operations that are performed before the fault occurs.
Fault information | Operations performed after a fault occurs | Record operations that are performed after the fault occurs and before the fault is reported to maintenance personnel.
Storage device information | Hardware module configuration | Record the configuration of hardware modules of storage devices.
Storage device information | Indicator status | Record the status of indicators on storage devices. Pay attention to indicators that are steady orange or red.
Storage device information | Storage system data | Manually export operation data and system logs of storage devices.
Storage device information | Alarms and logs | Manually export alarms and logs of storage devices.
Networking information | Connection mode | Describe how an application server and storage devices are connected, such as in Fibre Channel networking mode or iSCSI networking mode.
Networking information | Switch model | If any switches exist on the network, record the switch model.
Networking information | Switch diagnosis information | Manually export switch diagnosis information, including startup configurations, current configurations, interface information, time, and system versions.
Networking information | Network topology | Describe the topology or diagram of the network between an application server and storage devices.
Networking information | IP address | If an application server is connected to storage devices over an iSCSI network, describe IP address planning rules or provide the IP address allocation list.
Application server information | OS version | Record the type and version of the operating system that an application server runs.
Application server information | Port rates | Record the port rate of an application server that is connected to storage devices.
Application server information | Operating system logs | View and export the operating system log.
Troubleshooting Methods and
Principles — Basic Principles
⚫ Analyze external factors and then internal factors.
External factor failures include failures in optical fibers, optical cables,
power supplies, and customer's devices.
⚫ Alarm information
On the Alarms and Events page of DeviceManager, choose the Current Alarms tab page.
The alarm BBU IS Faulty is displayed.
⚫ Possible causes
Therefore, the cached data cannot be completely flushed into coffer disks, resulting in
data loss.
Case Study — BBU Faults
⚫ Fault diagnosis
Case Study — UltraPath Failures
⚫ Symptom
UltraPath installed on an application server is automatically isolated by antivirus software.
As a result, it cannot be used.
⚫ Possible causes
The antivirus software mistakenly takes UltraPath as a virus and therefore, isolates it.
⚫ Recommended actions
1. On the management page of the antivirus software, add UltraPath as a piece of
trusted software.
⚫ Impact
The storage pool is degraded or fails, and some or all storage services are interrupted.
Host services are interrupted.
⚫ Possible causes
The indicators on controller A are normal, but the indicators on controller B are turned off.
The application servers connecting to controller B fail to send read/write requests to the
storage system. As a result, the system services are interrupted. On the Performance
monitoring page of DeviceManager, the host port write I/O traffic or read I/O traffic on the
controller B is 0.
⚫ Impact
If a controller is faulty and host services are interrupted when UltraPath is not installed,
you can manually switch the host services to another functional controller.
⚫ Symptom
The link indicator of the Fibre Channel host port is steady red or off.
⚫ Alarm information
On the Alarms and Events page of DeviceManager, choose the Current Alarms tab page.
The Link to the Host Port Is Down alarm may be displayed.
⚫ Impact
An unavailable Fibre Channel link causes a link down failure, service interruption, and data
loss between the application server and the storage system.
Case Study — Fibre Channel Link
Faults
⚫ Possible causes
✓ The optical module is faulty.
✓ The optical module is incompatible with the host port.
✓ The rate of the optical module is different from that of the host port.
✓ The optical fiber is poorly connected or faulty.
✓ The port rate of the storage system is different from that of its peer end.
On a direct connection network, the Working Rate of the Fibre Channel host port
is different from that of the Fibre Channel host bus adapter (HBA) on the
application server.
2. The numbers of replication links are different on the two storage systems. For
example, one storage system has two replication links whereas its peer storage
system has only one replication link.
⚫ Possible causes
1. The primary controller on the local storage system was powered off in the process of
creating a remote device.
2. The primary controller on the local storage system was powered off in the process of
adding a link to the remote device.
Case Study — Inconsistent Number of
Replication Links Between Storage Systems
⚫ Fault diagnosis
OceanStor 5300 V5
HANDS-ON / KNOWLEDGE TRANSFER
OceanStor V5
Converged Storage
Systems Product
Introduction
Product Positioning (1/2)
OceanStor V5 converged storage systems target converged storage, high-density virtualization, tiered storage, and data disaster recovery scenarios.
⚫ Brand-new hardware architecture delivering industry-leading performance and specifications
⚫ Convergence of SAN and NAS
⚫ Outstanding scalability and reliability
✓ Up to eight controllers, with IP scale-out and load balancing
✓ Inline deduplication and compression for higher storage resource utilization
✓ Wide channel: latest 16 Gbit/s Fibre Channel, 12 Gbit/s SAS, and PCIe 3.0
✓ Virtualization: block-level virtualization, heterogeneous virtualization, and computing virtualization
✓ High specifications: large capacity, high cache speed, and a large number of ports
Product Positioning (2/2)
Product Features
⚫ High performance: PCIe 3.0 high-speed bus and SAS 3.0 high-speed I/O channel
⚫ Flexible scalability: hot-swappable I/O interface modules; support for 4 interface modules and 2 onboard interface modules (2 U); support for 16 interface modules (3 U)
⚫ Robust reliability: full redundancy design; built-in BBU + data coffer; various data protection technologies
⚫ Energy saving: intelligent CPU frequency control; delicate fan speed control
2 U Controller Enclosure Architecture
(Figure: the service subsystem comprises interface modules A0, A1, B1, and B0 connected to controller modules A and B over 8 x PCIe GEN3 links; the disk subsystem attaches to the controllers; the electromechanical subsystem supplies 12 V power.)
5300 V5/5500 V5 Elite Controller Enclosure
5300F/5500/5500F V5 Controller Enclosure (Front Panel)
5500 V5 Controller Enclosure
5300F/5500/5500F V5 Controller Enclosure (Rear Panel)
(Figure: rear panel, including the serial port.)
2 U 2.5-Inch Disk Enclosure
Expansion module
⚫ Dual expansion modules
(Figure callouts: 1 serial port, 2 Mini SAS HD expansion port (downlink), 3 disk enclosure ID display; power module equipped with one fan module.)
⚫ DC/AC power supplies
Difference in V5 as compared with V3:
SSD, SAS disk, and NL-SAS disk units support only the 12 Gbit/s rate.
Smart IO Interface Module
1 Power indicator/Hot Swap button
2 16 Gbit/s Fibre Channel, 8 Gbit/s Fibre Channel, 10GE, 10 Gbit/s FCoE, or iWARP (Scale-Out) port
3 Port Link/Active/Mode indicator
4 Module handle
5 Port working mode silkscreen
No. | Indicator | Description
1 | Power indicator | Steady green: The interface module is running properly. Blinking green: The interface module receives a hot swap request. Steady red: The interface module is faulty. Off: The interface module is powered off.
3 | Port Link/Active/Mode indicator | Blinking blue slowly: The port is working in FC mode and is not connected. Blinking blue quickly: The port is working in FC mode and is transmitting data. Steady blue: The port is working in FC mode and is connected, but is not transmitting data. Blinking green slowly: The port is working in 10GE/FCoE/iWARP mode and is not connected. Blinking green quickly: The port is working in 10GE/FCoE/iWARP mode and is transmitting data. Steady green: The port is working in 10GE/FCoE/iWARP mode and is connected, but is not transmitting data.
Onboard SmartIO Interface Module
1 16 Gbit/s Fibre Channel, 8 Gbit/s Fibre Channel, 10 GE, or 10 Gbit/s FCoE port
2 Port Link/Active/Mode indicator
3 Module handle
4 Port working mode silkscreen
Indicator | Description
Port Link/Active/Mode indicator | Blinking blue slowly: The port is working in FC mode and is not connected. Blinking blue quickly: The port is working in FC mode and is transmitting data. Steady blue: The port is working in FC mode and is connected, but is not transmitting data. Blinking green slowly: The port is working in 10GE/FCoE mode and is not connected. Blinking green quickly: The port is working in 10GE/FCoE mode and is transmitting data. Steady green: The port is working in 10GE/FCoE mode and is connected, but is not transmitting data.
8 Gbit/s Fibre Channel High-Density Interface Module
1 Power indicator/Hot Swap button
2 8 Gbit/s Fibre Channel port
16 Gbit/s Fibre Channel High-Density Interface Module
1 Power indicator/Hot Swap button
2 Handle
3 16 Gbit/s Fibre Channel port
4 Port Link/Active indicator
No. | Indicator | Status Description
1 | Power indicator/Hot Swap button | Steady green: The interface module is running properly. Blinking green: The interface module receives a hot swap request. Steady red: The interface module is faulty. Off: The interface module is not powered on or is hot-swappable.
4 | Port Link/Active indicator | Steady blue: Data is being transmitted between the storage system and the application server at a rate of 16 Gbit/s. Blinking blue: Data is being transferred. Steady green: Data is being transmitted between the storage system and the application server at a rate of 8 Gbit/s, 4 Gbit/s, or 2 Gbit/s. Blinking green: Data is being transmitted. Steady red: The port is faulty. Off: The port link is down.
8 x 8 Gbit/s Fibre Channel High-Density Interface Module
10GE Electrical Interface Module
56 Gbit/s IB Interface Module
5 Module handle/Silkscreen
Smart series (two product columns in the original table):
Column 1: SmartThin, SmartQoS, SmartMotion, SmartPartition, SmartCache, SmartCompression, SmartDedupe, SmartMulti-Tenant, SmartTier, SmartVirtualization, SmartMigration, SmartErase, SmartQuota
Column 2: SmartThin, SmartQoS, SmartMotion, SmartPartition, SmartCompression, SmartDedupe, SmartMulti-Tenant, SmartVirtualization, SmartMigration, SmartErase, SmartQuota
SAN+NAS Converged Architecture
Separate SAN and NAS:
⚫ Two storage systems are required to provide SAN and NAS services.
⚫ The efficiency of databases and file sharing …
Converged NAS, SAN, or NAS+SAN on one system:
⚫ Block- and file-level data storage is unified, requiring no additional file engines, reducing purchasing costs by 15% and decreasing power …
Integrated and Unified Storage Architecture
Software Feature Deployment
(Figure: software stack. On the host: multipathing (failover, failback) and application software (disk guard, host agent). On the array: NAS protocols (NFS/CIFS) and SAN protocols (FC/iSCSI/SCSI); management software (GUI/CLI/SNMP) and OMM (alarm, log, performance statistics); replication features (snapshot, clone, volume mirroring, LUN copy, and remote replication); volume management (object management, cache, object/volume change, transaction, QoS); storage pool (RAID 2.0, storage resource management, tiered storage, logical disk, internal disk, heterogeneous LUN); system control (initialization, configuration, system exception, system resources, unified thread, memory management); device management (power supply, battery, fan, temperature, controller enclosure, disk enclosure, port, link/channel); device driver and OS (FC/SAS/iSCSI, kernel, BSP, BIOS, PCIe, BMC, SES).)
Software Architecture (1)
⚫ Protocol layer (NAS and SAN protocols)
Processes NAS and SAN interface protocols.
⚫ Replication layer
Implements value-added replication features for LUNs and file systems, including HyperReplication, HyperClone, and HyperMirror.
⚫ Storage pool
Divides the space provided by physical disks into fine-grained blocks so that services are distributed to all disks, bringing disk performance into full play.
Software Architecture (2)
⚫ Management software (GUI/CLI/SNMP)
Enables users to manage storage devices using the GUI and CLI.
⚫ OMM
Collects and dumps alarms, logs, and performance statistics of storage devices.
⚫ System control
Manages storage clusters.
⚫ Device management
Monitors and manages storage device hardware, such as fans, power supplies,
controller enclosures, and disk enclosures.
⚫ Device driver/OS
Provides basic OSs and hardware drivers.
Block Virtualization (1)
(Figure: disks in a disk domain are divided into chunks (CKs); CKs are grouped into chunk groups (CKGs); CKGs are divided into extents, and extents make up LUNs.)
Block Virtualization (2)
The following figure shows how application servers use storage space.
(Figure: LUNs are mapped to application servers through mapping views. Host 1 (a Windows application server) uses mapping view 1, host 2 (a Linux application server) uses mapping view 2, and host 3 (a VM application server) uses mapping view 3. Hot spare blocks are reserved in the storage pool.)
Configuration for Different RAID Levels
(Table: RAID level and number of disks for the 5300 V5/5500 V5/5600 V5/5800 V5/6800 V5 models and for the 18500/18800 V5 models.)
SAN Host Multipathing
✓ Failover and failback: when a path fails, services are switched to a backup path; after the fault is rectified, services fail back from the backup path to the primary path.
✓ Load balancing: UltraPath can balance I/Os across paths, evenly distributing loads on hosts.
✓ UltraPath can quickly isolate intermittently interrupted links and links that have bit errors, ensuring the latency of key applications.
✓ Online upgrade reduces the service downtime.
✓ Path performance statistics
✓ In cooperation with the array, the host path can be automatically detected and path fault alarms can be automatically sent.
(Figure: a host connected to controllers A and B accesses LUN0 to LUN3 over multiple paths.)
NAS IP Address Failover
FC/iSCSI Port Failover
Original service switching method (assume that controller A restarts):
(1) Controller A restarts during an upgrade or due to a fault.
(2) The HBAs detect that I/Os to controller A time out (30 seconds by default).
(3) The multipathing software receives the link fault report from the HBAs and switches over I/O paths.
(4) The I/O paths are switched to controller B.
Port failover solution (assume that controller A restarts):
(1) Controller A restarts during an upgrade or due to a fault.
(2) iSCSI IP1 fails over to controller B and sends an ARP message to the switch to perform IP address failover.
(3) WWPN1 fails over to controller B and is re-created.
(4) The HBAs re-establish links (less than 5 seconds).
(Figure: a host with an OS, upper-layer application, and multipathing connected through GE and FC switches to the array; logical and physical links before and after failover.)
Introduction to Highly Reliable Coffer Disks
⚫ Coffer disks consist of the first four disks and system disks. They are used to
save system configurations and dirty data.
⚫ The first four disks are organized into RAID 1 groups to ensure high data reliability.
The system disks of controllers A and B back up each other.
⚫ System disks save system configurations and dirty data during power failures.
[Figure: controller A and controller B, each with an OS partition and a DB (configuration database) partition on its system disk.]
Data Protection
[Figure: when a power failure occurs, the cached data copies in the controllers (A/A* and B/B*) are flushed to the system (coffer) disks of both controllers, so that dirty data survives the outage.]
OceanStor V5 & 18000 V5 Converged Storage Systems V500R007 - CIFS
Overview
⚫ Barry Feigenbaum originally designed Server Message Block (SMB) at IBM with the
aim of turning DOS "Interrupt 13" local file access into a networked file system.
SMB is used for sharing files, printers, and serial ports among computers.
Comparison of SMB1 and SMB2:
Performance: SMB1 is poor; it achieves only 1/3 of the WAN speed of SMB 2.0.
Number of sub-versions: SMB1 has 13, SMB2 has 2.
Number of bottom-layer transmission protocols: SMB1 uses 4, SMB2 uses 2.
Features
⚫ The Microsoft Management Console (MMC) provides a unified, standardized management
interface for Windows administrators to manage hardware, software, and network components.
⚫ In medium- and large-scale NAS networking scenarios, there may be multiple NAS
servers. If the NAS administrator had to log in to each NAS server individually for daily
management, it would be very time consuming. To address this issue and improve
management efficiency, the MMC provides a centralized platform that manages all NAS
servers in a unified manner.
⚫ The MMC communicates with storage systems using the standard MSRPC (over
SMB1 or SMB2) protocol. The MMC workflow is as follows:
[Figure: MMC workflow. The MMC client connects over MSRPC to the MS-RPC processing module on the server, which invokes CIFS share management, local user/group management, SMB session management, and SMB open-file management.]
GNS
⚫ Global namespace (GNS) is a file virtualization technology that aggregates
different file systems and provides a unified access namespace. GNS allows
clients to access files without knowing the locations of the discrete files, just
as web sites can be accessed without knowing their IP addresses. It also
enables administrators to manage data on geographically scattered
heterogeneous devices using a unified console.
⚫ In OceanStor V5 storage, GNS is implemented as a CIFS share. The CIFS
protocol provides global root nodes to which each individual file system can
be aggregated, thereby presenting a unified view (based on file system
names). By accessing a GNS share, users can view all created file systems.
⚫ In actual use, GNS shares are nearly the same as common shares. Unlike
common shares, however, the GNS share function provides a global unified view for
storage administrators, facilitating their daily maintenance and management.
⚫ By accessing a GNS share, you can view and access all created file systems. If
a service access node is not a home node of a file system, the file system
will forward the I/Os from this access node, compromising system
performance. To avoid this performance penalty, you can enable the GNS
forwarding function to ensure that the service access node is always a home
node of the file system.
Version Requirements on CIFS Clients
Client/Server OS | Windows 8 / Windows Server 2012 | Windows 7 / Windows Server 2008 R2 | Windows Vista / Windows Server 2008 | Previous versions of Windows
Windows 7 / Windows Server 2008 R2 | SMB 2.1 | SMB 2.1 | SMB 2.0 | SMB 1.0
Windows Vista / Windows Server 2008 | SMB 2.0 | SMB 2.0 | SMB 2.0 | SMB 1.0
Previous versions of Windows | SMB 1.0 | SMB 1.0 | SMB 1.0 | SMB 1.0
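The table follows a simple rule that can be sketched in a few lines of Python: the negotiated dialect is the highest SMB version supported by both sides. This is a simplified model (the real negotiation happens in the protocol's Negotiate exchange), and the maximum dialects below are taken only from the versions listed in the table.

```python
# Simplified model of the table above: the negotiated SMB dialect is the highest
# version supported by both client and server (values as listed in the table).
MAX_DIALECT = {
    "Windows 8 / Windows Server 2012":     (2, 1),
    "Windows 7 / Windows Server 2008 R2":  (2, 1),
    "Windows Vista / Windows Server 2008": (2, 0),
    "Previous versions of Windows":        (1, 0),
}

def negotiate(client: str, server: str) -> str:
    major, minor = min(MAX_DIALECT[client], MAX_DIALECT[server])
    return f"SMB {major}.{minor}"

print(negotiate("Windows 7 / Windows Server 2008 R2",
                "Windows Vista / Windows Server 2008"))   # SMB 2.0
print(negotiate("Previous versions of Windows",
                "Windows 8 / Windows Server 2012"))       # SMB 1.0
```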
Working Principles
[Figure: the client translates local file operations into network file operations that are sent to the server over CIFS; the server authenticates the client using NTLM or Kerberos before serving the requests.]
Typical Application
Scenarios
CIFS is mainly applied in file share scenarios, typically enterprise file servers
and media assets:
[Figure: enterprise office network. Windows clients on the LAN access the NAS service over IP; a DNS/AD server handles authentication traffic; authentication traffic, management traffic, and service data travel on separate paths.]
Configuring
CIFS
⚫ Creating a User
⚫ Creating a Share
Positioning
⚫ Functions as a network file storage system in UNIX-like system environments such as Linux,
UNIX, AIX, HP-UX, and Mac OS X.
[Figure: an NFS client mounts a directory exported by a remote server (for example, /home/wenhai/tmp/d01) to a local mount point such as /mnt/nfs120, accessing the file systems of other computers using NFS.]
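A minimal sketch of what the client side of the figure looks like in practice, assuming a Linux client with root privileges and an NFS client package (for example nfs-utils) installed; the server IP address is a placeholder, and the export and mount point are taken from the figure.

```python
# Minimal sketch: mounting the NFS export from the figure on a Linux client.
# Requires root privileges and an installed NFS client; the server IP is a placeholder.
import subprocess

server_export = "192.168.10.100:/home/wenhai/tmp/d01"   # placeholder server address
mount_point = "/mnt/nfs120"

subprocess.run(["mkdir", "-p", mount_point], check=True)
subprocess.run(["mount", "-t", "nfs", server_export, mount_point], check=True)

# Confirm the mount by printing the file system type and size.
print(subprocess.run(["df", "-hT", mount_point], capture_output=True, text=True).stdout)
```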
Working Principles
[Figure: on the client, requests from users and applications pass through the file system and the IP network layer to the server, where the NFS service process, together with the NLM, mount, and portmap services, accesses the server's file system.]
Working Principles - NFS V4.0
[Figure: in NFS v4.0, the NFS client communicates directly with a single NFS service process on the server.]
Software Architecture
[Figure: the unified software architecture serves block protocols (iSCSI/FC/FCoE) and file protocols (NFS/CIFS/FTP/HTTP).]
Software Architecture - Unified Storage
⚫ The following list shows the client operating systems verified for basic NFS connectivity
with the V5 series:
Ubuntu 12.04 LTS
HP-UX 11i V2, HP-UX 11i V5
Red Hat Enterprise Linux 5, Red Hat Enterprise Linux 6
SUSE Linux Enterprise Server 10, SUSE Linux Enterprise Server 11
Asianux 3.0, Asianux 4.0, Asianux 4.2
AIX 5.3 TL12, AIX 6.1 TL5, AIX 7.1 TL0
Mac OS X 10.6, Mac OS X 10.7, Mac OS X 10.8
⚫ For details about compatibility information, visit
http://3ms.huawei.com/mm/docMaintain/mmMaintain.do?method=showMMDetail&f_id=STR15073109310058.
Feature Description - Basic Networking Mode
NFS is one of the two most commonly used network sharing protocols.
NFS applies to UNIX-like system environments such as Linux, UNIX, AIX, HP-UX, and Mac OS X.
Competitive analysis:
All enterprise-level NAS products support NFS.
Feature Description - UNIX User Permission Control
⚫ Three security modes, including UNIX, NIS, and LDAP, are supported.
⚫ The following figure shows the UNIX security mode.
[Figure: NFS client and NFS server in the UNIX security mode.]
User information is represented using UIDs and GIDs in the UNIX system environment.
Identity authentication and permission verification are performed in the same way as in
the local security mode.
Feature Description - NIS User Permission
⚫ The following figure shows the NIS security mode.
[Figure: NFS client and NFS server (unified storage) in the NIS security mode.]
The unified storage device and the host must join the NIS domain.
User information is represented using user names and group names in the NIS domain.
Identity authentication and permission verification are performed by the NIS server.
Feature Description - LDAP User Permission
⚫ The following figure shows the LDAP security mode.
[Figure: NFS client and NFS server (unified storage) in the LDAP security mode.]
The unified storage device and the host must join the LDAP domain.
User information is represented using user names and group names in the LDAP domain.
Identity authentication and permission verification are performed by the LDAP server.
NFS Benefits
• Functions as a network file storage system in UNIX-like system
environments such as Linux, UNIX, AIX, HP-UX, and Mac OS X. With
NFS, users can access files in other systems as if they were local files.
Feature Description - Audit Log
⚫ NFS V3/V4 supports audit logs.
⚫ NFS audit logs allow customers to perform secondary audits as well as real-time
background monitoring and data analysis of the system.
[Figure: the NFS client on a host accesses the unified storage (NFS server) over the network; audit records are sent to an audit server.]
Feature Description –
Global Namespace
⚫ The NFS protocol provides a global access root node /. Each independent file
system can be aggregated to the virtual root node. You can use an NFS host to
access the / directory to view the unified directory structure.
[Figure: NFS server global namespace, with nodes such as DIR1 and QT1 aggregated under the / root directory.]
NFS Advantages
⚫ Scalability: NFS is a standard industry protocol. All of its versions (from V2 to V4,
including 4.1, pNFS, and 4.2) are widely used in UNIX-like system
environments such as Linux, UNIX, AIX, HP-UX, and Mac OS X.
NFS Share
Configurations
⚫ Configuring permission
Configuring Permission - LDAP Domain Settings
Step 1: Go to the LDAP Domain Settings page.
Step 2: Set related parameters. Primary IP Address, Port, Protocol, and Base DN are
mandatory. Other parameters are optional.
Step 3: After completing the settings, click Save.
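Before saving the LDAP parameters, it can be useful to confirm that the directory actually answers queries. The sketch below is not part of the DeviceManager workflow; it assumes the ldap3 Python package, and the server address, bind account, and base DN are placeholders matching the mandatory parameters above.

```python
# Minimal connectivity check for the LDAP parameters used above.
# The server address, port, bind DN, and base DN below are placeholders.
from ldap3 import Server, Connection, ALL

server = Server("192.168.10.30", port=389, get_info=ALL)   # Primary IP Address, Port
conn = Connection(server,
                  user="cn=admin,dc=example,dc=com",        # bind account (placeholder)
                  password="secret",
                  auto_bind=True)                           # raises an exception on failure

# Search the Base DN to confirm that the directory answers queries.
conn.search("dc=example,dc=com", "(objectClass=posixAccount)",
            attributes=["uid", "gidNumber"])
for entry in conn.entries:
    print(entry.uid, entry.gidNumber)
conn.unbind()
```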
Configuring Permission - NIS Domain Settings
Step 1: Go to the NIS Domain Settings page.
Step 2: Set Domain Name and Primary IP Address.
Step 3: After completing the settings, click Save.
Creating an NFS Share
Step 1: Select a file system and create an NFS share as prompted.
If you want to share a quota tree, select a quota tree.
If you want to specify extra information about the NFS share to be created, enter the
information in Description.
Step 2: After completing the settings, click Next.
Setting
Permission (1)
Step 1: Click Add to set access permission for clients to access the NFS share.
Setting Permission (2)
Step 2: Select a client type.
Step 3: Set Name or IP Address. If you set Type to Host, enter the host name or IP address. If
you set Type to Network Group, enter the network group name or IP address. The symbol *
indicates any host name or IP address. For example, 192.168.* indicates any IP address
between 192.168.0.0 and 192.168.255.255.
Step 4: Select share permission.
Step 5: Click OK.
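The wildcard rule in Step 3 can be pictured with a small, hypothetical helper (not part of the storage system): 192.168.* matches any address from 192.168.0.0 to 192.168.255.255, and * alone matches any host name or IP address.

```python
# Hypothetical helper illustrating the wildcard rule from Step 3:
# "192.168.*" matches any IP address between 192.168.0.0 and 192.168.255.255,
# and "*" alone matches any host name or IP address.
from fnmatch import fnmatch

def client_matches(pattern: str, client: str) -> bool:
    """Return True if a client host name or IP address matches the share rule."""
    return fnmatch(client, pattern)

assert client_matches("192.168.*", "192.168.7.42")
assert client_matches("*", "nfs-client01.example.com")
assert not client_matches("192.168.*", "10.0.0.5")
```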
Setting Permission (3)
Step 6: In the client list, select a client to assign it the access permission for the NFS
share. In the following figure, the symbol * indicates that any host or IP address has only
READ permission.
Step 7: Click Next to complete the permission settings.
Completing the
NFS Share
Configuration
Click Finish to complete the NFS share configuration. The execution result will be
displayed.
Background Web File Sharing
⚫ Unified storage serves as back-end storage for web servers.
[Figure: web servers access the unified storage (NFS server) over an IP network.]
Database File Storage
⚫ Database files are stored on NFS shares.
[Figure: database servers access NFS shares on the unified storage.]
Oracle databases have a built-in NFS client that moves database storage space to the
shared space on the NFS server. This built-in NFS client improves database performance.
Cloud Computing Shared Storage
⚫ Cloud computing platforms use the NFS server for internal shared storage.
[Figure: cloud computing servers (NFS clients) access the unified storage (NFS server) over an internal IP network; external access passes through a firewall.]
VMware optimizes the NFS client and moves virtual machine storage space to the shared
space on the NFS server. The NFS client optimized for cloud computing provides higher
performance and reliability.
Common Problems in NFS
Applications
⚫ The NFS client runs in a system using a 32-bit CPU.
Because the NFS server uses a 64-bit CPU, an NFS client running on a 32-bit
system may fail to process 64-bit file data from the NFS server. As a result, applications
cannot access files normally.
However, some newer operating systems and applications enable 32-bit clients to
process data from an NFS server that uses a 64-bit CPU.
⚫ Applications that originally use local file systems need to be migrated to NFS
storage.
Some special functions of local file systems are not supported by NFS. In such cases,
tests must be performed to check whether those applications can run on NFS.
OceanStor V5&18000
V5 Converged
Storage Systems
FTP Introduction
Software Introduction - Protocol
File Transfer Protocol (FTP) is used to control bidirectional file transfer on the
Internet. It is also an application. FTP applications vary with the operating
system, but they all use the same protocol to transfer files.
FTP is usually used for downloading and uploading files. You can download files
from remote servers to your computer or upload files from your computer to
remote servers. That is, you use client programs to download files from or
upload files to remote servers.
Software
Architecture
FTP is an application-layer protocol in the TCP/IP protocol family. It uses two types of TCP
connections: control connection and data connection. Its software architecture is as follows:
[Figure: FTP server software architecture. A listening process accepts incoming connections and creates one user working process per session (user working processes 1 to n); file management, configuration management, and process management modules support these processes.]
Overview
⚫ FTP is a common protocol used to transfer files between remote servers and local hosts
over IP networks. Before the World Wide Web (WWW) appeared, users used command lines
to transfer files, and the most commonly used file transfer application was FTP. Although
most users now use email and the web to transfer files, FTP is still widely used.
⚫ The FTP protocol is an application-layer protocol in the TCP/IP protocol family. TCP port 20
is used to transfer data and TCP port 21 is used to transfer control messages. Basic FTP
operations are described in RFC 959.
⚫ FTP provides two file transfer modes:
⚫ Binary mode: Program files (such as .app, .bin, and .btm files) are transferred in binary
mode.
⚫ ASCII mode: Text files (such as .txt, .bat, and .cfg files) are transferred in ASCII mode.
⚫ FTP can work in either of the following modes:
⚫ Active mode (PORT): In this mode, the FTP server initiates the data connection. This mode
does not work if the FTP clients are protected by firewalls (for example, if the FTP clients
reside on private networks).
⚫ Passive mode (PASV): In this mode, FTP clients initiate the data connection. This mode does
not work if the FTP server forbids FTP clients from connecting to its ports whose port
numbers are higher than 1024.
⚫ The method of setting up the control connection is the same in PORT and PASV modes, but
the methods of setting up data connections differ. Each method has its advantages and
disadvantages, so choose one of them based on the networking environment.
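The choice between the two data-connection modes is visible directly in a client. The sketch below uses Python's standard ftplib module; the server address and credentials are placeholders, and set_pasv() switches between the passive (PASV, the library default) and active (PORT) modes described above.

```python
# Minimal sketch: choosing between passive (PASV) and active (PORT) mode
# with Python's standard ftplib. Server address and credentials are placeholders.
from ftplib import FTP

with FTP() as ftp:
    ftp.connect("192.168.10.200", 21)     # control connection on TCP port 21
    ftp.login("ftpuser", "password")

    ftp.set_pasv(True)                    # PASV: the client opens the data connection (default)
    ftp.retrlines("LIST")                 # directory listing travels over a data connection

    ftp.set_pasv(False)                   # PORT: the server opens the data connection from port 20
    ftp.retrlines("LIST")
```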
Restricted
Scenarios
Since FTP transfers files in plaintext, the transferred data and the user
name and password used for authentication can be obtained by methods such as
packet capture. Therefore, FTP is restricted in scenarios that require high
security, such as scenarios where confidential files are transferred.
Active Mode of the FTP
Server (1)
An FTP client sends a PORT command to inform the FTP server of the IP address and temporary port
used to receive the data connection setup request sent by the FTP server from port 20. Since the FTP
server sends the data connection setup request, the FTP server works in PORT mode. For example, as
shown in the following figure, the FTP client uses temporary port 30000 and IP address 192.168.10.50
to receive the data connection setup request.
[Figure: control connection setup. The FTP client (192.168.10.50) completes a TCP three-way handshake (SYN, SYN+ACK, ACK) with port 21 on the FTP server (192.168.10.200).]
Active Mode of the FTP
Server (2)
A data connection will be set up after a control connection is set up. If the file list on the FTP server can be
viewed on the FTP client, the data connection is set up successfully. If directory listing times out, the data
connection fails to be set up.
[Figure: data connection setup in active mode. The FTP server (192.168.10.200) initiates the TCP handshake from port 20 to the client's temporary port 30000 (192.168.10.50).]
Passive Mode of the FTP
Server (1)
An FTP client uses a PASV command to notify the FTP server that the client will initiate the data connection.
The FTP server then replies with the temporary port and IP address on which it will receive the data
connection setup request. For example, as shown in the following figure, the FTP server uses temporary
port 30000 and IP address 192.168.10.200 to receive the data connection setup request from the FTP client.
The FTP client then sends the request to port 30000 at IP address 192.168.10.200. Since the FTP server
passively receives the data connection setup request, the FTP server works in PASV mode.
[Figure: control connection setup. The FTP client (192.168.10.50) completes a TCP three-way handshake with port 21 on the FTP server (192.168.10.200).]
Passive Mode of the FTP
Server (2)
If the file list on the FTP server can be viewed on the FTP client, the data connection is set up successfully. If
directory listing times out, the data connection fails to be set up.
[Figure: data connection setup in passive mode. The FTP client (192.168.10.50) initiates the TCP handshake to the server's temporary port 30000 (192.168.10.200).]
Scenario — Setting Up a
Server for Sharing Learning
Materials
1. Background
Employees in a small company often use chat tools to transfer files for sharing some
learning materials. However, these learning materials are saved on the computers of
different employees. Obtaining and searching files as well as updating files that have
been shared are inconvenient.
2. Solution
Use an FTP server as a learning material sharing server, create an FTP account for
each employee in the company, and enable the employees to share the same
directory. When an employee wants to share learning materials, the employee can
use the FTP uploading function to upload materials to the FTP server. In this way,
other employees can download and updating the materials on the FTP server
anytime. The FTP server enables employees to easily share, obtain, and accumulate
learning materials.
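As a sketch of the upload step in this scenario, the following uses Python's standard ftplib; the server address, employee account, shared directory, and file name are placeholders invented for the example.

```python
# Minimal sketch: an employee uploads a learning document to the shared FTP directory.
# Server address, account, shared directory, and file name are placeholders.
from ftplib import FTP

with FTP() as ftp:
    ftp.connect("192.168.10.200", 21)
    ftp.login("employee01", "password")
    ftp.cwd("/share/learning")                       # the common shared directory

    with open("storage_basics.pdf", "rb") as f:      # binary mode for program/PDF files
        ftp.storbinary("STOR storage_basics.pdf", f)

    ftp.retrlines("LIST")                            # confirm the upload by listing the directory
```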
Enabling the FTP
Service
1. On DeviceManager, configure global parameters for and enable the FTP service.
Creating a
User
2. Create a local authentication user.
Creating a
Share Path
3. Create a file system as the FTP share path.
Creating an FTP
Share
4. Create an FTP share.
Selecting a File
System
5. Select a file system as the FTP share path.
Selecting a
User
6. Select a user to create the FTP share.
Reading the Warning Message
7. Carefully read the content of the Warning dialog box and select I
have read and understood the consequences associated with
performing this operation. Then you can use an FTP client to log in.
OceanStor V5&18000 V5
Converged Storage Systems
SmartQuota Introduction
Method to Manage and Control Resources
⚫ Limit the resources occupied by single directories, users, and user groups
to prevent some users from occupying excessive storage resources.
⚫ Notify users about the resources they have occupied by alarm or event.
[Figure: host I/O arriving through a NAS share is subject to quota control.]
Terminology
Term | Description
Quota tree | Quota trees are special directories of file systems.
Root quota tree | Root quota trees are root directories of file systems. User quotas, group quotas, and resource limits for users can be configured on root quota trees.
Soft quota | When the resources used by a user exceed the soft quota, an alarm is reported; the alarm is cleared when the used resources fall below the soft quota.
Hard quota | The hard quota is the maximum amount of resources available to a user.
Usage of Quota Tree
The V5 series allows users to configure quotas on quota trees (special level-1
directories created by management commands).
Example quota tree (directory capacity 8 MB, file quantity 4):
| ---- confFile.conf (2 MB, usr 3, grp 5)
| ---- run.dat (1 MB, usr 3, grp 8)
| ---- doc (0 B, usr 4, grp 8)
|      | ---- study.doc (5 MB, usr 7, grp 9)
Per-user usage: user 3: 3 MB, 2 files; user 4: 0 B, 1 file; user 7: 5 MB, 1 file.
Per-group usage: group 5: 2 MB, 1 file; group 8: 1 MB, 2 files; group 9: 5 MB, 1 file.
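The per-directory, per-user, and per-group figures in the example above can be reproduced with a small aggregation sketch (this is only an illustration, not the array's implementation):

```python
# Illustrative sketch: aggregate quota-tree usage per directory, user, and group,
# reproducing the figures from the example above.
from collections import defaultdict

# (name, size_in_bytes, uid, gid) for each file in the quota tree
files = [
    ("confFile.conf", 2 * 1024**2, 3, 5),
    ("run.dat",       1 * 1024**2, 3, 8),
    ("doc",           0,           4, 8),
    ("study.doc",     5 * 1024**2, 7, 9),
]

total_bytes = sum(size for _, size, _, _ in files)
total_files = len(files)

user_usage = defaultdict(lambda: [0, 0])
group_usage = defaultdict(lambda: [0, 0])
for _, size, uid, gid in files:
    user_usage[uid][0] += size
    user_usage[uid][1] += 1
    group_usage[gid][0] += size
    group_usage[gid][1] += 1

print(f"directory: {total_bytes // 1024**2} MB, {total_files} files")   # 8 MB, 4 files
for uid, (size, count) in sorted(user_usage.items()):
    print(f"user {uid}: {size // 1024**2} MB, {count} file(s)")
for gid, (size, count) in sorted(group_usage.items()):
    print(f"group {gid}: {size // 1024**2} MB, {count} file(s)")
```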
Enabling the Switch of a Quota Tree
Switch status Initialization: usage is updated by running a background scanning task plus I/O updates.
Switch status On: usage is updated by I/O updates only.
Enabling a quota switch of a non-empty quota tree:
1. Run a background task to scan the quota tree for files and subdirectories and
update the resources occupied by it.
2. During the scanning, I/O requests are still delivered. If a target file has already
been scanned, update its usage.
3. After the scanning, enable the switch of the quota tree.
Quota Limitations (1)
Quota type | Root quota tree (file system root directory, quota tree 0) | Other quota trees
Directory quota | X | O
Default directory quota | O | O
User quota | O | O
Default user quota | O | O
User group quota | O | O
Default user group quota | O | O
(O: supported; X: not supported)
Quota Limitations (2)
⚫ Configuration items: space soft quota, space hard quota, file quantity soft quota, and
file quantity hard quota.
⚫ A soft quota cannot exceed its related hard quota. At least one item must be configured.
[Figure: for each I/O that is written, the system checks whether used + delta is within the limit and branches to the handling described on the next slide.]
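The check in the figure can be sketched conceptually as follows (this is not the array's code): a write of delta bytes is rejected if it would exceed the hard quota, raises an alarm once it crosses the soft quota, and the alarm clears when usage falls back below the soft quota.

```python
# Conceptual sketch of the quota check shown above (not the array's implementation).
from dataclasses import dataclass

@dataclass
class SpaceQuota:
    soft: int           # bytes; alarm threshold
    hard: int           # bytes; absolute limit
    used: int = 0
    alarm: bool = False

    def write(self, delta: int) -> bool:
        """Apply a write of `delta` bytes; return False if the hard quota would be exceeded."""
        if self.used + delta > self.hard:
            return False                     # hard quota exceeded: the I/O fails
        self.used += delta
        if self.used > self.soft:
            self.alarm = True                # soft quota crossed: report an alarm, I/O still succeeds
        return True

    def release(self, delta: int) -> None:
        """Free `delta` bytes (e.g. a file is deleted); clear the alarm below the soft quota."""
        self.used = max(0, self.used - delta)
        if self.used <= self.soft:
            self.alarm = False

q = SpaceQuota(soft=80 * 1024**2, hard=100 * 1024**2)
assert q.write(90 * 1024**2) and q.alarm         # crosses the soft quota -> alarm raised
assert not q.write(20 * 1024**2)                 # would exceed the hard quota -> I/O fails
q.release(30 * 1024**2)
assert not q.alarm                               # usage back below the soft quota -> alarm cleared
```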
Soft Quota Alarm and Hard Quota Event
[Figure: if used + delta is within the limit, (1) the I/O operation succeeds and (2) any insufficient-resource alarms are cleared.]
[Figure: resource occupation by different users, such as Manager A, Engineer A, Engineer B, Salesperson A, and Salesperson B.]
You can plan different quota trees for different departments or individuals of an enterprise. In
this way, you only need to configure the directory quota of each quota tree to limit the
resources occupied by each user.
Flexible Restrictions on NAS Resource Occupations
Share is the shared directory (quota tree 0) of the R&D department:
1. Set the quota for quota tree 0 to limit the resources available to the R&D department.
2. Set the quota for manager A to limit the resources available to manager A.
3. Set the quota for project group G/E to limit the resources available to the group.
[Figure: the Share directory with per-user and per-group quotas, for example for Manager A.]
Configuration Management
⚫ Create a directory quota; delete or modify a directory quota; report/batch report.
⚫ Create a host user/user group; modify, query, and delete a host user/user group.
⚫ Create a quota tree share; delete, modify, and query a quota tree share.