V300R006C20
Upgrade Guide
Issue 06
Date 2019-06-21
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees
or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Website: https://e.huawei.com
Overview
This document describes how to upgrade OceanStor 2000 V3 series storage systems, answers
frequently asked questions (FAQs), and provides common troubleshooting methods.
The following table lists supported product models.
Intended Audience
This document is intended for upgrade engineers. The upgrade engineers are required to have
the following experience and skills:
Be familiar with the current networking and versions of related network elements.
Be familiar with device operation and maintenance (O&M) and have device maintenance
experience.
Symbol Conventions
The symbols that may be found in this document are defined in the following table.
Symbol Description
Change History
Issue Date Description
Contents
7 Troubleshooting
7.1 Common Troubleshooting Methods for an Upgrade Failure
7.2 The Preparation for the Online Upgrade Fails
7.3 Failed to Notify the System of Starting Upgrade
7.4 After an Offline Upgrade Is Complete, the Host Fails to Detect LUNs and Services Cannot Be Resumed
7.5 A Controller Hardware Fault Occurred During an Upgrade. As a Result, the Upgrade Failed
7.6 Restoring VM Services Fails After an Upgrade
7.7 After the Upgrade Is Complete, the Drive Letter Mappings Change
7.8 Heterogeneous Links for Which CHAP Is Enabled Cannot Recover Automatically
7.9 Abnormal Controller Reset Occurs During an Offline Upgrade; Need Manual Repair
7.10 The Upgrade Is Interrupted Due to a Version Verification Failure Occurred After the Local Storage Array Restarts During an Upgrade on a Non-standard Heterogeneous Network
7.11 Failed to Continue Upgrading Enclosure Firmware After Spare Parts Replacement. Try Again
7.12 After an Array Is Upgraded, Status of the Physical Path and Logical Path Displayed on UltraPath Installed on the Host Is Inconsistent
7.13 Using the Maintenance Network Port for Upgrade Results In an Upgrade Package Uploading Failure
7.14 Failed to Copy the Upgrade Package Due to the Upgrade By Using an IPv6 Address
7.15 Upgrade Progress on the Upgrade Tool Is Not Refreshed Within 30 Minutes But No Upgrade Failure Is Displayed
7.16 Synchronization Fails to Start in Value-Added Services After the Upgrade
8 FAQs
8.1 What Are the Common Methods of Logging In to and Out of a Storage Array?
8.2 Method of Querying and Changing the SSH Port Number
8.3 How Can I Collect the Storage Array Information If the System Is Abnormal?
8.4 The Method of Handling Upgrade Exceptions
8.5 How Can I Ignore the Failed Step During the Upgrade and Proceed with the Subsequent Procedure
8.6 Description and Usage of Upgrade Batch Customization
8.7 How Can I Modify the Upgrade Configuration on the SmartKit?
8.8 What Can I Do If License File Backup Fails in a Scenario Where the Primary Controller Is Not Connected and SmartKit Is Used for the Upgrade?
8.9 How Can I Upgrade a Non-commercial Version to a Commercial One?
8.10 Forcibly Upgrading the System
8.11 How Can I Check Dual Links of a Disk?
8.12 What Can I Do If Multiple Check Items Fail During a Pre-upgrade Check?
8.13 How Do I Query Major or Critical Alarms That Have Been Deleted Manually?
8.14 What Is Upgrade Pause and How Can I Configure It?
8.15 How Can I Confirm that Port Failover Requirements Are Met
8.16 How Can I Disable the Port Failover Function
8.17 How Can I Make the Read-only Function Take Effect on a NAS Remote Replication Pair's Secondary Array After the Pair Is Upgraded Successfully
8.18 How Can I Make the Read-only Function Take Effect on a HyperVault Pair's Secondary Array After the Pair Is Upgraded Successfully
8.19 What Can I Do If the Tool Displays a Message Indicating That the Front-End Link Redundancy Check Fails During the Array Upgrade?
8.20 How Do I Check Whether Host Links Meet Upgrade Requirements in a Customized Batch Upgrade?
8.21 How Can I Check the Compatibility of Hosts Running on Non-mainstream Operating Systems?
8.22 How Can I Ignore the Pre-Upgrade Check Items and Continue the Upgrade?
8.23 What Can I Do When the Host HBA or iSCSI Timeout Parameter Does Not Pass the Check and Cannot be Modified?
8.24 In Which Scenarios Can I Install the Patch During the Upgrade?
8.25 How Can I Modify the Windows Remote Desktop Port of the SVP?
9 Appendix
9.1 Checking Host Status
9.1.1 Testing Host Connectivity
9.1.2 Checking Host Multipathing Link Status
9.1.2.1 VMware ESX
9.1.2.1.1 Link Redundancy Check with Huawei UltraPath
9.1.2.1.2 Link Redundancy Check with VMware Multipathing Software
9.1.2.2 Windows
9.1.2.2.1 Link Redundancy Check with Huawei UltraPath
9.1.2.3 Linux
9.1.2.3.1 Link Redundancy Check with Huawei UltraPath
9.1.2.3.2 Link Redundancy Check with Linux Multipathing Software
9.1.2.4 Solaris
9.1.2.4.1 Link Redundancy Check with Huawei UltraPath
9.1.2.4.2 Link Redundancy Check with Solaris Multipathing Software
9.1.2.5 HP-UX
9.1.2.5.1 Link Redundancy Check with HP-UX 11i v1/v2 PVlinks
9.1.2.5.2 Link Redundancy Check with HP-UX 11i v3 NMP
9.1.2.6 AIX
9.1.3 Checking Risky Emulex HBA Driver Versions in a Fibre Channel Network
9.1.4 Checking the Oracle Heartbeat Parameter
9.1.5 Checking the HBA Timeout Parameter in a Fibre Channel Network
9.1.5.1 VMware ESX
9.1.5.2 Windows
9.1.5.3 Linux
9.1.5.3.1 SUSE 10
9.1.5.3.2 SUSE 11 and SUSE 12
9.1.5.3.3 CentOS 5.11
9.1.5.3.4 CentOS 6.5 and CentOS 7.2
9.1.5.3.5 Red Hat 5.10 and Red Hat 5.6
9.1.5.3.6 Red Hat 6.8 and Red Hat 7.1
9.1.5.3.7 Red Hat 7.3
9.1.5.3.8 OEL 5.11 and OEL 6.1
9.1.5.3.9 OEL 7.2
9.1.5.3.10 Ubuntu 12.04, Ubuntu 14.10, and Ubuntu 16.04
9.1.5.4 Solaris
9.1.5.5 HP-UX
9.1.5.5.1 HP-UX 11i v1 and v2
9.1.5.5.2 HP-UX 11i v3
9.1.5.6 AIX
9.1.6 Checking the Initiator Timeout Parameter in an iSCSI Network
9.1.6.1 VMware ESX
9.1.6.2 Windows
9.1.6.3 Linux
9.2 Checking the System Status Before the Upgrade
9.3 Obtaining the License File for the Target Version
9.4 Upgrading Storage Arrays in the HyperMetro Solution
9.4.1 Upgrading Storage Arrays in the SAN HyperMetro Solution
9.4.1.1 Upgrading by Suspending HyperMetro Relationships
9.4.1.1.1 Upgrade Schemes
9.4.1.1.2 Upgrade Process
9.4.1.1.3 Pre-upgrade Checklist
9.4.1.1.4 Performing the Upgrade
9.4.1.1.5 Verifying the Upgrade
9.4.1.2 Upgrading Without Suspending HyperMetro Relationships
9.4.1.2.1 Upgrade Schemes
9.4.1.2.2 Upgrade Process
This chapter describes the upgrade schemes, the requirements on the source version, and the
impact of and precautions for the upgrade.
1. A rolling upgrade is recommended regardless of whether host services are stopped. Use a
parallel upgrade only when no service is configured.
2. Rolling upgrade is the mode previously called online upgrade, and parallel upgrade is the mode
previously called offline upgrade. In this section, descriptions of the online upgrade apply to the
rolling upgrade, and descriptions of the offline upgrade apply to the parallel upgrade.
1. If the current version is V300R006C10 or later, refer to 8.15 How Can I Confirm that Port
Failover Requirements Are Met to check whether the networking meets the port failover
requirements. When the requirements are met, you do not need to collect host information or
assess host compatibility before the upgrade, and the upgrade depends less on host configuration.
If the requirements are not met, you are advised to adjust the networking accordingly.
2. If the port failover networking requirements are not met, refer to 4.6 Compatibility Evaluation
to assess host compatibility before performing an online upgrade. You can perform the online
upgrade only after all items pass the check.
Rolling upgrade is a highly reliable and available mode that upgrades controllers in batches
without interrupting services. Use it in scenarios where services cannot be interrupted.
Before the upgrade, ensure that the upgrade package supports the upgrade from the current
version to the package version.
Table 1-1 lists the default batch upgrade policies for the OceanStor 2200 V3 and 2600 V3 series.
Table 1-1 Default batch upgrade policies for the OceanStor 2200 V3 and 2600 V3 series
Number of Controllers | Primary Controller | First Batch | Second Batch
2 controllers | 0A | 0B | 0A
2 controllers | 0B | 0A | 0B
4 controllers | 0A\1A | 0B, 1B | 0A, 1A
4 controllers | 0B\1B | 0A, 1A | 0B, 1B
8 controllers | 0A\1A\2A\3A | 0B, 1B, 2B, 3B | 0A, 1A, 2A, 3A
8 controllers | 0B\1B\2B\3B | 0A, 1A, 2A, 3A | 0B, 1B, 2B, 3B
A storage system that can be upgraded in online mode contains multiple engines. Each engine
contains two or four controllers, namely, controllers A and B or controllers A, B, C, and D. A
controller is named in digit+letter format. The digit indicates the ID of an engine in the cluster and
the letter indicates the ID of a controller in the engine. For example, 0A indicates controller A in
engine 0.
You can customize the upgrade batches when performing an online upgrade. For details about how
to customize the upgrade batches, see 8.6 Description and Usage of Upgrade Batch Customization.
After 4 Site Survey Before an Upgrade is complete, see 9.4 Upgrading Storage Arrays in the
HyperMetro Solution in the appendix for details about how to upgrade storage arrays in the
HyperMetro solution.
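The default policy in Table 1-1 can be sketched in a few lines of Python. This is an illustration only, not the logic SmartKit uses: it assumes two controllers per engine (A and B), as in the table, and infers the rule that the controllers on the side opposite the primary controller are upgraded first. The function name default_batches is hypothetical.

```python
from typing import List, Tuple


def default_batches(controllers: List[str], primary: str) -> Tuple[List[str], List[str]]:
    """Split controllers into (first_batch, second_batch) following Table 1-1.

    Assumption inferred from the table: controllers whose letter differs from
    the primary controller's letter form the first batch, and the primary
    controller's own side forms the second batch.
    """
    if primary not in controllers:
        raise ValueError(f"primary controller {primary!r} is not in the controller list")
    primary_letter = primary[-1]  # "0A" -> "A": the letter is the controller ID in the engine
    first = sorted(c for c in controllers if c[-1] != primary_letter)
    second = sorted(c for c in controllers if c[-1] == primary_letter)
    return first, second


four_controllers = ["0A", "0B", "1A", "1B"]
print(default_batches(four_controllers, "1A"))  # (['0B', '1B'], ['0A', '1A'])
```

Engines with four controllers (A, B, C, and D) are not covered by Table 1-1 and are therefore not handled by this sketch.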
NFSv3, NFSv4 on the versions after V300R006C00, and SMB 3.0 can ensure continuous
services during an online upgrade, but SMB 3.0 requires Failover to be enabled manually. If
you have any questions, contact technical support engineers.
Some file sharing protocols (such as SMB 1.0, SMB 2.0, and NFSv4 on the versions
before V300R006C00) have limitations and cannot ensure continuous services during an
online upgrade. If these protocols are in use, do not perform an online upgrade. Perform an
offline upgrade instead.
Impact on Services
Rolling Upgrade
During an online upgrade of controller software, controllers to be upgraded restart and
their services are switched to other normal controllers. The read and write performance
may deteriorate by 10% to 20%. You are advised to perform online upgrades in off-peak
hours.
Parallel Upgrade
You must stop host services before performing an offline upgrade of controller software.
1.4 Precautions
This section describes the upgrade precautions that you need to take.
To find the required protocols and ports, click the icon in the upper right corner of SmartKit
to open Help and see the Operating Environment Requirements section.
1. Figure 2-1 shows the upgrade process for common storage arrays (without the HyperMetro
configuration). For storage arrays in the HyperMetro solution, after preparing for the
upgrade, see section 9.4 Upgrading Storage Arrays in the HyperMetro Solution.
2. Rolling upgrade is the mode previously called online upgrade, and parallel upgrade is the
mode previously called offline upgrade. In this section, descriptions of the online upgrade
apply to the rolling upgrade, and descriptions of the offline upgrade apply to the parallel
upgrade.
This document describes a complete upgrade process. You can determine the actual process according to
the situations on the live network.
Procedure | When to Perform | Estimated Time | Description
Preparing for the Upgrade | One day before the upgrade | 30 minutes | For details, see 3 Preparing for the Upgrade.
Site Survey Before an Upgrade | After Preparing for the Upgrade and before the controller upgrade | 30 minutes | For details, see 4 Site Survey Before an Upgrade.
Stopping Services Before the Offline Upgrade | Before the controller upgrade | Depends on the customer's business volume | For details, see 5.1 Stopping Services Before the Offline Upgrade.
Installing Patch Before Upgrade | Before Array Upgrade Evaluation | 5 minutes | For details, see 5.2 Installing the Patch Before the Upgrade.
Setting the Upgrade Policy | Before Assessing the Array Upgrade | 5 minutes | For details, see 5.3 Setting the Upgrade Policy.
Assessing the Array Upgrade | Before rebooting controllers | 5 minutes | For details, see 5.4 Assessing the Array Upgrade.
Upgrading Controller Software | - | 60 to 180 minutes | Upgrading controller software includes the controller restart and the evaluation performed before the restart. Upgrade time is also determined by the upgrade mode and the number of disk enclosures in the storage system. For details, see 5.6 Upgrading Controller Software.
Patch Installation After the Upgrade | After upgrading controller software | 10 minutes | For details, see 5.7 Patch Installation After the Upgrade.
Upgrading OceanStor SystemReporter | After the controller software upgrade is complete | 20 minutes | If the customer has installed OceanStor SystemReporter, upgrade it to the matched version. For details, see 5.8 Upgrading OceanStor SystemReporter.
Upgrading the Antivirus Agent | After the controller software upgrade is complete | 10 minutes | For details, see 5.9 Upgrading the Antivirus Agent.
Verifying the Upgrade | After the upgrade | 30 minutes | For details, see 6 Verifying the Upgrade.
The time required for each procedure is an estimate and for reference only. The operation time actually
used depends on onsite conditions.
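For planning a maintenance window, the estimates in the table above can be added up. The following Python sketch is only a convenience calculation. It assumes that every listed procedure is performed once and excludes stopping services, whose duration depends on the customer's business volume.

```python
# (procedure, minimum minutes, maximum minutes), copied from the table above.
# "Stopping Services Before the Offline Upgrade" is omitted because its
# duration depends on the customer's business volume.
STEPS = [
    ("Preparing for the Upgrade", 30, 30),
    ("Site Survey Before an Upgrade", 30, 30),
    ("Installing Patch Before Upgrade", 5, 5),
    ("Setting the Upgrade Policy", 5, 5),
    ("Assessing the Array Upgrade", 5, 5),
    ("Upgrading Controller Software", 60, 180),
    ("Patch Installation After the Upgrade", 10, 10),
    ("Upgrading OceanStor SystemReporter", 20, 20),
    ("Upgrading the Antivirus Agent", 10, 10),
    ("Verifying the Upgrade", 30, 30),
]


def total_minutes(steps):
    """Return the (minimum, maximum) total duration in minutes."""
    return sum(low for _, low, _ in steps), sum(high for _, _, high in steps)


low, high = total_minutes(STEPS)
print(f"Estimated total: {low} to {high} minutes")  # Estimated total: 205 to 325 minutes
```

Remove the entries that do not apply on site (for example, SystemReporter or the Antivirus Agent if they are not installed) before relying on the total.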
You must apply for permission to obtain documents and software from the website.
Procedure
Step 1 Go to http://support.huawei.com/enterprise/en/category/enterprise-storage-pid-21430818.
Then, select a matched product and version.
Step 2 Register if you are visiting the website for the first time, or go to the next step if you have
already registered.
Step 3 Go to Entry-level Storage. Select the related product to enter the product page or directly
search for your desired documents.
If you cannot visit the link directly, go to http://support.huawei.com/enterprise/ and click Login. Enter
the user name and password, and then choose Support > Enterprise Storage > Entry-level Storage.
Step 4 View the related product documents by entering the related product page in the product
catalog or search document or software titles to find desired documents or software.
----End
idAbsPath=fixnode01%7C7919749%7C7941815%7C250389224%7C21462742%7C21538251
V300R005C00SPC300 | V300R005C00SPH308 or later | - | https://support.huawei.com/enterprise/en/enterprise-storage/oceanstor-2600-v3-pid-21538251/software/23468099?idAbsPath=fixnode01%7C7919749%7C7941815%7C250389224%7C21462742%7C21538251
1. After the patch version is confirmed, download the patch package and related patch installation
guide from the download path listed in the preceding table.
Huawei does not provide the SSH and SFTP tools due to copyrights. You can obtain the tools by
yourself and consult with Huawei if necessary.
You are advised to download the SSH tool from the official site of PuTTY and download the SFTP
tool from the official site of Core FTP.
You must apply for permission to obtain tools and software from the website.
Figure 1-2 Downloading the upgrade or patch package and digital certificate
Step 4 If the target version has a hot patch package, download the latest patch package of the
target version and install it after the system is upgraded to the target version. The name of a
hot patch package ends with SPH0XX, for example, a hot patch package for V300R006C20.
----End
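When comparing a downloaded package name with the target version, it can help to split the version string into its base version and suffix. The Python sketch below assumes the pattern seen in the version strings in this document (V300R006C20, V300R005C00SPC300, V300R005C00SPH308), where SPH marks a hot patch. The pattern and the function names are illustrative assumptions, not an official naming specification.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: V<3 digits>R<3 digits>C<2 digits>, optionally followed by
# SPC<3 digits> or SPH<3 digits>. Derived only from examples in this document.
_VERSION_RE = re.compile(r"^V(\d{3})R(\d{3})C(\d{2})(?:(SPC|SPH)(\d{3}))?$")


class Version(NamedTuple):
    base: str                     # for example "V300R005C00"
    suffix_type: Optional[str]    # "SPC", "SPH", or None
    suffix_number: Optional[int]  # for example 308, or None


def parse_version(text: str) -> Version:
    """Parse a version string such as 'V300R005C00SPH308'."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    v, r, c, kind, number = match.groups()
    return Version(f"V{v}R{r}C{c}", kind, int(number) if number else None)


def is_hot_patch_for(patch: str, target: str) -> bool:
    """True if 'patch' is an SPH version with the same base version as 'target'."""
    parsed = parse_version(patch)
    return parsed.suffix_type == "SPH" and parsed.base == parse_version(target).base


print(parse_version("V300R005C00SPH308"))
print(is_hot_patch_for("V300R005C00SPH308", "V300R005C00SPC300"))
```

A check like this only confirms that the names are consistent. The patch installation guide remains the reference for which patch applies to which version.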
Procedure
Step 1 Double-click PGPVerify.exe to open the OpenPGP signature verification tool. Click Select
Public Key. See Figure 3-4.
Step 2 Select the KEYS file and click Single Verify. See Figure 3-5.
Step 3 Select the digital certificate used with the upgrade package and check upgrade package
integrity. If PASS is displayed in the Results column and the entry is green, as shown in
Figure 3-6, the upgrade package passes the integrity check. Otherwise, check that the
downloaded upgrade package and digital certificate are correct.
For details about how to obtain the digital certificate used with the upgrade package, see 3.4.2
Downloading Tools and Software Packages. Ensure that the digital certificate and the upgrade
package reside in the same directory.
----End
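On a maintenance terminal where PGPVerify.exe is not available, the same kind of OpenPGP check can usually be done with GnuPG. The Python sketch below is an alternative to, not a description of, the PGPVerify tool. It assumes that GnuPG is installed, that the digital certificate is a detached OpenPGP signature file, and that the KEYS file contains the public key. All file names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def gpg_commands(keys_file: str, signature: str, package: str) -> List[List[str]]:
    """Build the GnuPG commands: import the public key, then verify the signature."""
    # The guide requires the certificate and the package to be in the same directory.
    if Path(signature).parent != Path(package).parent:
        raise ValueError("signature and package must reside in the same directory")
    return [
        ["gpg", "--import", str(keys_file)],
        ["gpg", "--verify", str(signature), str(package)],
    ]


def verify_package(keys_file: str, signature: str, package: str) -> bool:
    """Return True only if every gpg command exits with status 0.

    Note: 'gpg --import' adds the key to the current user's keyring.
    """
    for command in gpg_commands(keys_file, signature, package):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return False
    return True
```

For example, verify_package("KEYS", "upgrade_package.tgz.asc", "upgrade_package.tgz") returns False if the signature does not match, which corresponds to a result other than PASS in the tool.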
This section describes the necessary site survey operations before an upgrade.
1. The site survey before an upgrade checks arrays and host services to ensure that the
upgrade is successful. Perform the operations following instructions in this section.
2. If you cannot use the tool provided in this chapter to perform the check, log in to each host
and perform a manual check by referring to section 9.1 Checking Host Status.
The pre-upgrade site survey consists of upgrade tool installation, inspection, upgrade policy
setting, array information collection, array upgrade evaluation, compatibility evaluation, and
log analysis, as shown in the following table. All of these steps except log analysis are
performed in the pre-upgrade survey.
Item | Description | Estimated Time
Upgrade Tool Installation | Install SmartKit for the site survey before the upgrade and device upgrade. See 4.1 Installing the Upgrade Tool. | 10 minutes
Upgrade Policy Setting | Select the device to be upgraded and set the upgrade policy. See 4.2 Upgrade Policy Setting | 5 minutes
Inspection | Check the hardware, software, value-added services, and alarms of devices. See 4.3 Inspection. | 10 minutes
Array Information Collection | Use the information collection tool to collect information about array running logs for health status evaluation. See 4.4 Array Information Collection | 5 minutes for dual controllers
Array Upgrade Evaluation | Check array health status. This operation is automatically | 5 minutes for dual controllers
Step 5 Click Function Management, as shown in Figure 4-2. Select Storage under Product Field
in the Filter area and click Install, as shown in Figure 4-3.
Step 6 After the installation is complete, choose Home > Upgrade/Patch Installation to perform
the site survey before upgrade and device upgrade.
----End
Perform the procedure in this section only after the procedure in section 4.1.1 Installing
SmartKit Online is complete.
Procedure
Step 1 Choose Home > Function Management, and click Export, as shown in Figure 4-4. Select
the functions to be exported and click Export, as shown in Figure 4-5.
Step 2 After the export is complete, click View to open the exported function package, as shown in
Figure 4-6.
Step 3 Copy the function package exported in Step 1 and the latest SmartKit installation package to
the host or laptop where the pre-upgrade site survey or upgrade is to be performed.
Step 4 Decompress the SmartKit installation package and install the latest SmartKit. On the
Function Management page, click Import, as shown in Figure 4-7. Select the function
package copied in Step 3 and click OK, as shown in Figure 4-8.
Step 5 After the import is successful, corresponding functions are available on the Home page.
----End
Step 3 On the Set Upgrade Policy page, click Add Device(A), as shown in Figure 4-11 and
Figure 4-12.
Step 4 Enter the management IP address of the device and click Next, as shown in Figure 4-13.
Step 5 Enter the user name and password, then click Finish, as shown in Figure 4-14.
Set Port to the default value 22. Do not select Need Debugging Password.
Step 6 The dialog box indicating that the fingerprint of the device is not registered is displayed. Click
OK to register the device, as shown in Figure 4-15.
Step 7 Select your desired device and click Next, as shown in Figure 4-16.
Step 8 On the Set Strategy page, select upgrade mode, enter the target version and click Finish, as
shown in Figure 4-17.
Step 9 In the Set Upgrade Policy window, select the desired device and click OK, as shown in
Figure 4-18.
----End
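If adding the device fails, a quick way to rule out basic connectivity problems is to test whether the SSH port (22 by default, as set in Step 5) is reachable from the maintenance terminal. The following Python sketch only opens and closes a TCP connection. It does not log in to the array, and the function name is hypothetical.

```python
import socket


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Call it with the management IP address entered in Step 4, for example ssh_port_reachable("192.0.2.10"), where the address shown is a placeholder. A False result can mean that the address is wrong, that the SSH port has been changed (see 8.2 Method of Querying and Changing the SSH Port Number), or that a firewall blocks the connection.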
4.3 Inspection
The verification is successful only when all evaluation items pass.
Procedure
Step 1 On the Pre-upgrade Site Survey page, select Inspection, as shown in Figure 4-19.
Step 3 The inspection takes about 5 minutes to complete, as shown in Figure 4-22.
Step 4 After the inspection is complete, click Passed or Not passed to view the inspection details, as
shown in Figure 4-23.
Step 5 Click Original Information to view error details, as shown in Figure 4-24.
Step 7 Click the wizard in the left window to view details, as shown in Figure 4-26.
----End
If all items in 4.5 Array Upgrade Evaluation pass the check, the array information collection is
unavailable on the GUI. You can skip this section.
Procedure
Step 1 On the Pre-upgrade Site Survey page, select Array Information Collection, as shown in
Figure 4-27.
Step 2 On the Information Collection page, click Collect, as shown in Figure 4-28.
Step 3 After information has been collected, click Open Directory to view the collected
information, as shown in Figure 4-29.
----End
1. Array upgrade evaluation checks the array health status before the upgrade, preventing
adverse impact on the upgrade. You must ensure that all items pass the check before
performing subsequent upgrade operations. If you perform an upgrade forcibly, you must
confirm the risks and accept the consequences associated with performing the upgrade.
2. The array upgrade evaluation also determines whether the Array Information
Collection and Compatibility Evaluation steps are required. If either of these
steps is unavailable on the page, the array upgrade evaluation has determined that the
step is not needed, and you can skip it. Otherwise, perform the step. If
Compatibility Evaluation fails or is not supported, contact Huawei technical support.
3. Rolling upgrade refers to the mode originally called online upgrade, and parallel
upgrade refers to the mode originally called offline upgrade. In this section, the
descriptions of online upgrade apply to rolling upgrade, and the descriptions of
offline upgrade apply to parallel upgrade.
Procedure
Step 1 On the Pre-upgrade Site Survey page, select Array Upgrade Evaluation, as shown in
Figure 4-30.
Step 2 On the Upgrade Evaluation Tool page, select the successfully added device and click
Execute, as shown in Figure 4-31.
Step 3 After the upgrade evaluation is completed, click the items that do not pass the check, as
shown in Figure 4-32. In the Details dialog box that is displayed, follow the instructions in
Reference for troubleshooting until the Evaluation Result is Passed.
If the status of value-added services fails to pass the check, adjust the status by referring to related
descriptions in 9.2 Checking the System Status Before the Upgrade.
If some check items fail to pass the check after troubleshooting or cannot be rectified, proceed
with the upgrade only after the technical support engineers evaluate the risks and confirm that
the risks are acceptable.
----End
If all items mentioned in section 4.5 Array Upgrade Evaluation pass evaluation, Compatibility
Evaluation is unavailable on the Scenario-based Task tab. This indicates that Compatibility
Evaluation is not required, and you can skip this section.
Procedure
Step 1 On the Pre-upgrade Site Survey page, click Compatibility Evaluation, as shown in Figure
4-33.
Step 2 The system automatically queries the host information of the current device, as shown in
Figure 4-34.
Step 3 After the query is complete, click Add. In the dialog box that is displayed, enter the IP
address of the device and click Next, as shown in Figure 4-35.
Step 4 Enter the user name and password, and then click Finish, as shown in Figure 4-36.
Step 5 Select the desired device and click Execute, as shown in Figure 4-37.
Step 6 The compatibility evaluation then completes automatically, as shown in Figure 4-38.
Step 8 Click the navigation tree on the left for detailed information, as shown in Figure 4-40.
If you want to customize the upgrade batches, the check result of the path redundancy check item in
the host compatibility evaluation tool cannot be used as the final result. For details about how to
check whether the paths meet the online upgrade conditions, refer to 8.20 How Do I Check Whether
Host Links Meet Upgrade Requirements in a Customized Batch Upgrade?.
If the host operating system is not VMware, Windows, Linux, Solaris, HP-UX, or AIX,
the host compatibility evaluation tool cannot be used for evaluation. For details about how
to check the compatibility of hosts that run non-mainstream operating systems, refer to 8.21 How Can I
Check the Compatibility of Hosts Running on Non-mainstream Operating Systems?.
If the host HBA timeout period or iSCSI initiator timeout period fails to pass the check but the fault
cannot be rectified according to the handling suggestion, rectify the fault by referring to 8.23 What
Can I Do When the Host HBA or iSCSI Timeout Parameter Does Not Pass the Check and Cannot be
Modified?.
----End
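The operating system restriction in the notes above can be expressed as a simple lookup. The following is a minimal Python sketch, assuming the host operating system name is already known as a string; the function name tool_supports_host_os is hypothetical and is not part of SmartKit.

```python
# Operating systems that the notes above list as supported by the
# host compatibility evaluation tool.
SUPPORTED_HOST_OS = {"vmware", "windows", "linux", "solaris", "hp-ux", "aix"}


def tool_supports_host_os(os_name):
    """Return True if the host OS is one the evaluation tool can assess.

    os_name is a free-form name such as "Linux" or "HP-UX" (hypothetical
    input format); matching is case-insensitive.
    """
    return os_name.strip().lower() in SUPPORTED_HOST_OS
```

For any host where this returns False, follow 8.21 instead of relying on the tool.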
Procedure
Step 1 Enter http://support.eservice.huawei.com/#/Welcome in a browser and log in to the eService home page.
A user name and password are required to log in to eService. Contact technical support
engineers to obtain them.
Step 2 Choose Troubleshooting > Storage Fault Diagnosis, as shown in Figure 4-41.
Step 3 Click New Analysis. The page for selecting or uploading logs is displayed, as shown in
Figure 4-42.
Step 4 On the Fault Diagnosis page, click Select File to select log files, as shown in Figure 4-43.
Step 5 Select the log file package to be uploaded, as shown in Figure 4-44.
The log files uploaded are compressed package files obtained from 4.4 Array Information Collection.
Step 6 After the log file package is selected, click OK, as shown in Figure 4-45. Wait until the logs
are uploaded and analyzed, as shown in Figure 4-46.
After the log file is uploaded for the first time, set the product model and version information as
prompted.
Step 7 After the analysis is complete, click View to view the log analysis result, as shown in Figure
4-47.
In the log analysis result, you only need to focus on the faults and risks in the last two months.
Step 8 If any fault or risk exists, rectify the fault as prompted. If the fault persists, contact Huawei
technical support.
If any fault or risk is found, do not perform the subsequent upgrade operations until the fault
is rectified or Huawei technical support engineers advise you to proceed. Otherwise, the upgrade may fail or
services may be interrupted.
----End
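The note in Step 7 limits attention to faults and risks from the last two months. The following is a minimal Python sketch of that filter, assuming the findings have been copied out of the analysis result by hand as (date, description) pairs. The 60-day window is an approximation of "two months" chosen for illustration, and the function name recent_findings is hypothetical.

```python
from datetime import date, timedelta

# Assumption: "the last two months" is approximated as 60 days.
RECENT_WINDOW_DAYS = 60


def recent_findings(findings, today):
    """Keep only findings dated within the recent window.

    findings: list of (date, description) tuples (hypothetical layout).
    today: the date on which the analysis result is reviewed.
    """
    cutoff = today - timedelta(days=RECENT_WINDOW_DAYS)
    return [item for item in findings if item[0] >= cutoff]
```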
Procedure
Figure 5-1 shows the flowchart for stopping host services.
To ensure that I/O from the host to the storage device is completely interrupted, remove the service
cables between the host and the storage device.
For an offline upgrade, you must stop services. For an online upgrade, skip this section.
Customers' maintenance engineers are responsible for performing this task. Huawei technical
support will provide assistance if necessary.
If the host application contains VM services, ensure that the disk type of the VM is persistent.
Otherwise, service data of the host will be lost after the VM is restarted.
When you create a disk on VMware or FusionCompute VMs, you can set the disk mode to
persistent or non-persistent.
Persistent: All operations on the disk are written to the disk.
Non-persistent: Operations on the disk are not written to the disk. Instead, they are
written to a cache file that resides in the same datastore as the disk. After the VMs are
shut down, all operations on the disk are lost.
Therefore, data on a non-persistent disk is lost after the VMs are shut down. This applies to
FusionCompute V100R003 and later.
Oracle Database
Step 1 In Windows and Linux operating systems, run the shutdown normal command in
SQL*Plus to stop Oracle database services. If no application is running, you can run the
shutdown immediate command.
Step 2 Stop Oracle services in the Windows and Linux operating systems.
----End
Step 2 Enter the cmd view. Run the cd \ command to switch the working directory to the directory
where sync.exe resides.
Step 3 Run the sync.exe command to write all memory data into disks of the storage array.
Step 4 If you want to write memory data into a specific disk, run the sync.exe -r drive letter
command (for example: sync.exe -r F:).
----End
If a host cannot be disconnected from the storage device, perform the following operations to confirm
that host I/O has stopped.
1. Ask the customer's service engineer to check whether all host applications at the service layer are
stopped. If you are not sure, go to the next step.
2. On the CLI, run the change performance statistic_enable enabled=yes command to enable
performance monitoring in OceanStor DeviceManager. Then observe whether the IOPS of each
controller is zero. Observe this for about five minutes.
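The check in item 2 amounts to confirming that every controller reports zero IOPS for the whole observation window. The following is a minimal Python sketch, assuming the per-controller IOPS readings have been noted down (manually or by a script) into a dictionary. The function name host_io_stopped and the data layout are hypothetical and are not part of DeviceManager or the CLI.

```python
def host_io_stopped(samples):
    """Return True if every controller reported 0 IOPS in every sample.

    samples: dict mapping a controller ID (for example "0A") to the list of
    IOPS readings taken during the observation window (about five minutes).
    """
    if not samples:
        # No data was collected, so nothing can be concluded.
        return False
    return all(
        len(readings) > 0 and all(value == 0 for value in readings)
        for readings in samples.values()
    )
```

A single non-zero reading on any controller means host I/O is still reaching the array.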
Step 2 Install the patch by following instructions in the corresponding patch installation guide.
If no patch version is found for the source version in 3.3 Confirming the Patch Version Before the
Upgrade, no patch needs to be installed.
----End
Step 3 On the Select Device page, click Add Device, as shown in Figure 5-4.
Step 4 On the Basic Information page, enter the device management IP address and click Next, as
shown in Figure 5-5.
Step 5 Enter required device information in Username and Password, and click Finish, as shown in
Figure 5-6.
The default value of the port is 22. Do not select Need Debugging Password.
Step 6 A dialog box is displayed indicating that the server fingerprint is not registered. Click OK to
register it, as shown in Figure 5-7.
Step 7 Select an added device and click Next, as shown in Figure 5-8.
Step 8 On the Upgrade Settings page, click Browse. In the Browse dialog box that is displayed,
select an upgrade package and click OK, as shown in Figure 5-9.
Step 9 On the Upgrade Settings page, click Browse on the right side of Select Patch Package
(Optional). In the Browse dialog box that is displayed, select a patch package and click OK,
as shown in Figure 5-10.
Step 10 Go back to the Upgrade Settings page, select Online for Upgrade Mode, and click Finish,
as shown in Figure 5-11.
For details about the difference between Online and Offline, see section 1.1 Upgrade Schemes.
Step 11 In the dialog box that is displayed, click Cancel, as shown in Figure 5-12.
Step 12 Click Browse on the right of Directory saving the backup data and report to select a
backup path and click Save, as shown in Figure 5-13.
Step 13 Go to the Set Upgrade Policy page, select the added device, select the confirmation item, and
click OK to complete the upgrade policy setting, as shown in Figure 5-14.
----End
1. If the upgrade is to be performed more than one day after the site survey is complete, perform
the array upgrade evaluation again before the upgrade to improve the reliability of the upgrade.
2. You can skip this section if either of the following conditions is met:
On the day when the upgrade is performed, the array upgrade evaluation is completed and all items
pass the check in the site survey scenario before the upgrade.
Items that failed to pass the check are resolved, the array and host services do not change
configurations, and the host networking does not change after the evaluation.
Procedure
Step 1 Click Array Upgrade Evaluation, as shown in Figure 5-15.
Step 2 Go to the Upgrade Evaluation Tool page, select a device to be upgraded for evaluation, and
click Execute, as shown in Figure 5-16.
Step 3 After the upgrade evaluation is completed, click the items that do not pass the check, as
shown in Figure 5-17. In the Details dialog box that is displayed, follow the instructions in
Reference for troubleshooting until the Evaluation Result is Passed.
If a patch is not installed, as shown in Figure 5-18, perform Step 4 to Step 10.
Step 5 On the Patch Tool page, click Select patch, as shown in Figure 5-20.
Step 6 In the Patch Details dialog box, download the required patch package and save it to a local
directory, as shown in Figure 5-21.
Step 7 Click Modify, select the desired patch package, and click OK, as shown in Figure 5-22.
Step 8 In the Information dialog box that is displayed, click OK, as shown in Figure 5-23.
Step 9 On the Patch Tool page, click OK, as shown in Figure 5-24.
----End
Prerequisites
1. The current version of the storage array is V3R6C00SPC100 or earlier. Otherwise, skip
this section.
2. You have completed the checks and evaluation mentioned in chapter 4 Site Survey
Before an Upgrade, and all checks and evaluations have passed.
3. If the upgrade is performed in offline mode, ensure that you have stopped services before
the upgrade by following instructions in section 5.1 Stopping Services Before the Offline
Upgrade.
4. You have installed the specific patch by following instructions in section 5.2 Installing
the Patch Before the Upgrade.
5. If site survey and controller reboot are not performed on the same day, you have to check
the device and environment status again by following instructions in section 4.5 Array
Upgrade Evaluation, and ensure that all check items pass the check.
If you want to ignore the failed check items and forcibly perform an upgrade, ensure that you
clearly understand the risks of those failed check items and that the risks are acceptable.
Procedure
Step 1 On the Device Upgrade page, click Controller Rebooting, as shown in Figure 5-26.
Step 2 In the Restart Controller window, select the device to be rebooted and click Restart, as
shown in Figure 5-27.
Step 3 In the Select Restart Mode dialog box, select the reboot mode and click OK, as shown in
Figure 5-28.
Restart controllers in two batches (recommended for standard front-end networking): This mode is
recommended in the case of the default batch upgrade sequence (for details about the batch upgrade
sequence, see section 1.1 Upgrade Schemes). In this mode, controllers are rebooted in two batches in
sequence, and it takes about 20 minutes.
Restart controllers one by one: This mode needs to be selected in the case of the user-defined batch
upgrade sequence. In this mode, controllers are rebooted one by one. According to the number of
controllers, the total time required for rebooting all controllers is: Number of controllers x 10
minutes.
Step 4 In the Confirm Restart dialog box, type the password, select I have read the previous
information and understood consequences of the operation. and click OK, as shown in
Figure 5-29.
Step 5 The reboot process starts, and the follow-up processes are automatically executed. Wait for
the processes to finish, as shown in Figure 5-30.
If a check fails before the reboot, you can ignore the check and continue the reboot only after
technical support engineers have evaluated the failure. For details, see 8.22 How Can I Ignore the
Pre-Upgrade Check Items and Continue the Upgrade?.
----End
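The durations quoted for the two reboot modes in Step 3 can be turned into a rough planning estimate. The following is a minimal Python sketch. The mode names "two_batches" and "one_by_one" and the function name are hypothetical labels chosen for illustration, and the figures are the approximate values stated in this section, not guarantees.

```python
MINUTES_TWO_BATCHES = 20      # "about 20 minutes" for the two-batch mode
MINUTES_PER_CONTROLLER = 10   # per controller in the one-by-one mode


def estimated_reboot_minutes(mode, controller_count):
    """Estimate the total controller reboot time in minutes."""
    if controller_count < 1:
        raise ValueError("controller_count must be at least 1")
    if mode == "two_batches":
        return MINUTES_TWO_BATCHES
    if mode == "one_by_one":
        # Number of controllers x 10 minutes
        return controller_count * MINUTES_PER_CONTROLLER
    raise ValueError("unknown reboot mode: %r" % (mode,))
```

For example, a four-controller system rebooted one by one needs roughly 4 x 10 = 40 minutes, compared with about 20 minutes in two batches.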
Prerequisites
1. You have checked and assessed items in the pre-upgrade site survey according to 4 Site
Survey Before an Upgrade. You have ensured that all items pass the check and
evaluation.
2. For an offline upgrade, you must ensure that services are stopped before the offline
upgrade according to 5.1 Stopping Services Before the Offline Upgrade.
3. You must ensure that the patch has been installed before the upgrade by referring to 5.2
Installing the Patch Before the Upgrade.
4. If the site survey and the controller software upgrade are performed on different days,
you must check the device and environment status again, and ensure that
all items pass the check according to 5.4 Assessing the Array Upgrade.
If you want to ignore the failed check items and forcibly perform an upgrade, ensure that you
clearly understand the risks of the failed check items and that the risks are acceptable.
5. Before the upgrade, close all OceanStor DeviceManager pages and CLI windows.
During the upgrade, do not log in to OceanStor DeviceManager or the CLI.
Procedure
1. Rolling upgrade refers to the mode originally called online upgrade, and parallel upgrade refers to
the mode originally called offline upgrade. In this section, the descriptions of online upgrade apply
to rolling upgrade, and the descriptions of offline upgrade apply to parallel upgrade.
2. The following steps use the upgrade settings for Online mode as an example. The
actual settings are subject to site conditions.
Step 1 On the Device Upgrade page, click Storage Array Upgrade to start an upgrade, as shown in
Figure 5-31.
Step 2 Click Perform Upgrade to start the upgrade, as shown in Figure 5-32.
Step 3 The Upgrade Confirm dialog box is displayed. After reading the confirmation, select the
check box, and click OK to start the upgrade, as shown in Figure 5-33.
Step 4 The system automatically imports the upgrade package, as shown in Figure 5-34.
If the upgrade package fails to be uploaded during the upgrade using the maintenance network port,
refer to 7.13 Using the Maintenance Network Port for Upgrade Results In an Upgrade Package
Uploading Failure to rectify the fault.
If the upgrade package fails to be uploaded during the upgrade using the IPv6 address, refer to 7.14
Failed to Copy the Upgrade Package Due to the Upgrade By Using an IPv6 Address to rectify the
fault.
Step 5 After the upgrade package is imported, the system automatically starts to perform a
pre-upgrade check, as shown in Figure 5-35. For details about check items, see appendix 9.2
Checking the System Status Before the Upgrade.
The pre-upgrade check mainly covers items directly related to the device and its basic health
status, such as upgrade package compatibility, device service load, host link redundancy,
and device alarms. Some check items are confirmed in the array upgrade assessment whereas some
items must be checked after the upgrade package is imported.
If the front-end link redundancy check fails, refer to 8.19 What Can I Do If the Tool Displays a
Message Indicating That the Front-End Link Redundancy Check Fails During the Array Upgrade? to
rectify the fault.
If an item fails to pass the pre-upgrade check after you rectify the fault by referring to the handling
suggestions for the check item, contact technical support engineers to evaluate whether the item can
be ignored. If the item can be ignored, refer to 8.22 How Can I Ignore the Pre-Upgrade Check Items
and Continue the Upgrade? to ignore the pre-upgrade check item and continue the upgrade. If the
item cannot be ignored, do not continue the upgrade.
Step 6 After the pre-upgrade check is complete, the system automatically backs up data, as shown in
Figure 5-36.
Step 7 After the data is backed up, the system automatically performs the upgrade, as shown in
Figure 5-37.
If the upgrade progress is not refreshed within 30 minutes and no message is displayed indicating that
the upgrade fails, refer to 7.15 Upgrade Progress on the Upgrade Tool Is Not Refreshed Within 30
Minutes But No Upgrade Failure Is Displayed to rectify the fault.
Step 8 After the upgrade is complete, the system automatically performs post-upgrade verification,
as shown in Figure 5-38.
Check the post-upgrade verification result carefully. Clear each warning item and check for exceptions
during the upgrade.
Step 9 Check whether the target device version is consistent with the upgrade package version and
confirm that the upgrade is complete, as shown in Figure 5-39.
----End
It is recommended that the version of SystemReporter complies with the version mapping
of the storage array. If the storage array is upgraded, SystemReporter needs to be upgraded
as well. Otherwise, SystemReporter cannot monitor the performance data of the storage
array.
If the source version of the storage array is V300R003C10 or earlier, upgrade
SystemReporter within 24 hours after the storage array is upgraded. Otherwise, data generated
during the 24 hours may be lost.
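Whether the 24-hour rule applies depends on comparing the source version with V300R003C10. The following is a minimal Python sketch of such a comparison. It assumes that version strings follow the VxxxRxxxCxx pattern used in this document, that versions are ordered by their V, R, and C numbers in turn, and that any SPC patch suffix can be ignored. These are assumptions made for illustration, not a documented Huawei rule, and the function names are hypothetical.

```python
import re

# Matches strings such as "V300R003C10" or "V300R006C00SPC100".
_VERSION_PATTERN = re.compile(r"^V(\d+)R(\d+)C(\d+)(?:SPC\d+)?$")


def parse_version(text):
    """Return the (V, R, C) numbers of a version string as a tuple of ints."""
    match = _VERSION_PATTERN.match(text.strip().upper())
    if match is None:
        raise ValueError("unrecognized version string: %r" % (text,))
    return tuple(int(part) for part in match.groups())


def is_at_or_before(version, limit="V300R003C10"):
    """Return True if version is the limit version or an earlier one."""
    return parse_version(version) <= parse_version(limit)
```

If is_at_or_before returns True for the source version, plan the SystemReporter upgrade inside the 24-hour window.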
Prerequisites
OceanStor SystemReporter has been installed.
Procedure
Step 1 Upgrade SystemReporter. For details, see the OceanStor 2000, 5000 and 6000 V3 Series V300R006C30 SystemReporter
Upgrade Guide. For details about how to obtain the guide, see 3.2 Obtaining Other Upgrade
Reference Documents.
----End
Procedure
Step 1 After the storage system is upgraded, run the show alarm command in the CLI to check
whether the alarm whose ID is 0x100F011D004D exists.
If yes, perform Step 2 and Step 3 to upgrade the antivirus agent.
If no, the antivirus agent does not need an upgrade.
Step 2 Run the show antivirus_server general command in the CLI to obtain the IP address of the
antivirus server.
Step 3 Download the latest version of the antivirus agent and configure it again. You are advised to
download the antivirus agent that corresponds to the latest version of the storage system after
upgrading the storage system. After downloading the new version of the antivirus agent,
reconfigure it by following Figure 1 Configuration process after the antivirus agent is upgraded
and reinstalled, so that the original file antivirus service continues to work normally.
Figure 1-1 Configuration process after the antivirus agent is upgraded and reinstalled
Table 1-1 Configuration steps after the antivirus agent is upgraded and reinstalled
Configuration Procedure | Description
Install the antivirus agent. | Under the Settings tab page of the DeviceManager management interface, download the antivirus agent that matches the storage system and install it on the antivirus server.
Configure the antivirus agent. | Configuration of the antivirus agent includes setting and starting an account, configuring antivirus-related ports, and restarting the antivirus agent.
Check the configuration of the antivirus software. | Check whether the antivirus software has a correct scanning policy.
Reset the pre-shared key. | After the antivirus agent is reinstalled, the storage system restores the default pre-shared key. To enable mutual authentication between the storage system and the antivirus agent, reset the pre-shared key on the storage system and the antivirus server respectively.
For details about how to install and configure an antivirus agent, see section 3.9 "File Antivirus" in the
Security Configuration Guide.
----End
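The decision in Step 1 depends only on whether the alarm ID appears in the output of show alarm. The following is a minimal Python sketch, assuming the command output has been saved as plain text. The exact layout of that output is not described in this guide, so the function performs only a case-insensitive search for the ID. The function name is hypothetical.

```python
# Alarm ID named in Step 1.
ANTIVIRUS_ALARM_ID = "0x100F011D004D"


def antivirus_agent_needs_upgrade(show_alarm_output):
    """Return True if the saved `show alarm` text contains the alarm ID."""
    return ANTIVIRUS_ALARM_ID.lower() in show_alarm_output.lower()
```

If the function returns True, continue with Step 2 and Step 3. Otherwise the antivirus agent does not need an upgrade.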
If a remote device is configured between two disk arrays and only one of the arrays
has been upgraded successfully, you cannot configure remote devices (including
creating or deleting remote devices, or adding or removing replication links) between the two
disk arrays while they run different versions. The existing configuration is not affected and services
can run normally.
After the upgrade, use a browser to visit OceanStor DeviceManager and press
Ctrl+Shift+Delete to clear all browser cache.
Prerequisites
Before the upgrade, you have paused, stopped, or split the following value-added services:
snapshot, clone, LUN copy, LUN migration, HyperMirror, HyperMetro, and remote
replication.
Procedure
Step 1 Log in to DeviceManager and select Data Protection on the home page, as shown in Figure
6-1.
Step 2 On the Data Protection page, go to the corresponding value-added service management
interface, and then restart the value-added service, as shown in Figure 6-2.
1. After the upgrade, restart only the value-added services that were stopped before the upgrade.
If other ongoing services are to be restarted, evaluate the related impact
again. If the value-added services fail to be restarted, see section 7.16 Synchronization Fails to Start in
Value-Added Services After the Upgrade.
2. For the HyperMetro service, refer to 9.4 Upgrading Storage Arrays in the HyperMetro Solution.
----End
7 Troubleshooting
Symptom
1. If step Online Upgrade Preparation fails due to a failed port failover check, refer to section 7.2
The Preparation for the Online Upgrade Fails for troubleshooting.
2. If step Notify system start upgrade fails, refer to section 7.3 Failed to Notify the System of Starting
Upgrade for troubleshooting.
3. If other exceptions occur during the upgrade, refer to common troubleshooting methods for upgrade
failures in this section.
If an exception occurs during the upgrade, the status bar of the upgrade tool displays Paused
or Failed. The following provides details:
1. The status bar of the upgrade tool displays Paused:
Alarm Information
None.
Possible Cause
1. Processing fails during the upgrade (probably because of busy services).
2. A controller (the controller that is being upgraded or primary cluster) is reset during the
upgrade.
3. The firmware upgrade fails.
Recommended Actions
Step 1 Use the SSH client software to log in to each controller and check whether all the controllers
can be connected. If yes, go to the next step; if no, go to Step 12.
Step 2 Check whether the message "System is upgrading" is displayed each time you connect to a
controller. If yes, go to Step 3; if no, go to Step 12.
Step 3 Run the minisystem command to enter the minisystem mode. Run the showsystrace
command in minisystem mode to check whether the status of any task is running or failed, or
whether non-0 tasks exist in the FailCnt column. If no, go to the next step; if yes, go to Step
12.
Step 4 Run the showsysstatus command to check whether all controller IDs are listed in the id
column and the status of every controller is normal. If yes, go to the next step; if no, go to
Step 12.
Step 5 Click Details on the status bar of the upgrade tool. Check whether one of the following sets of
options appears in the dialog box that is displayed: Retry, Roll Back, and Close (as shown in Figure 7-4);
Retry, Ignore, and Close; or Retry, Terminate, and Close. If yes, go to the
next step; if no, go to Step 7.
Step 6 In the dialog box shown in Figure 7-4, you are advised to select Retry. If the retry fails, select Roll
Back, Ignore, Manual repair, or Terminate. The upgrade tool then continues the upgrade
process. Observe the progress to determine the execution result, or go to Step 11.
Step 7 After step 4, run the exit command to exit the minisystem mode. In upgrade mode, run the
show upgrade status command to view the Status column. If Fault is displayed, go to Step
8; if Upgrade failed is displayed, go to Step 12.
Step 8 Run the show upgrade package command to check whether the name and IP address of each
controller can be viewed. If no, go to Step 12; if yes, check whether the current versions of all
the controllers are the same, as shown in Figure 7-5. If no, go to Step 9; if yes, go to Step 12.
Step 9 Remove and insert the controllers that failed to be upgraded. (For example, after a fault occurs
during an upgrade from version A to version B, the version of controllers 0B and 0D is
version B while the version of controllers 0A and 0C is version A. 0A and 0C failed to be
upgraded.) Power on the controllers and perform steps 1, 2, 3, 4, and 8. Confirm that all the
controllers are normal and of the same version before going to Step 10.
1. To remove and insert a controller, run the cmm.sh -c setmrfi and rebootsys commands in
minisystem mode.
2. If multiple controllers fail, remove and insert the controllers at the same time, or remove and insert
the next controller after one controller is normal.
Step 10 Run the change upgrade flow resume_type=repair command. If the command starts executing,
go to Step 11. If it fails, go to Step 12.
Step 11 Run the show upgrade status command to view the progress. When Upgrade Succeed
appears in the Status column or 100% appears in the Percent column, the action has succeeded.
Go to Step 12.
Step 12 If a task exists whose Status is failed or whose FailCnt value is not 0, run the showsystrace
FlowId command to view the failed step and contact Huawei technical support engineers to
analyze the cause. Otherwise, go to Step 13.
Step 13 Log in to OceanStor DeviceManager and choose Settings > Export Data to export all system
logs of the controllers. Then contact Huawei technical support engineers for help.
----End
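Two of the checks in the preceding procedure are simple comparisons: the task check in Step 3 and the version comparison in Step 8 and Step 9. The following is a minimal Python sketch of both, assuming the relevant columns of the showsystrace and show upgrade package output have been transcribed by hand into Python structures. The dictionary keys "status" and "fail_cnt" and the function names are hypothetical. The sketch does not parse real command output.

```python
def trace_has_problem(tasks):
    """Step 3: True if any task is running or failed, or has a non-zero FailCnt.

    tasks: list of dicts with the hypothetical keys "status" and "fail_cnt".
    """
    return any(
        task["status"].strip().lower() in {"running", "failed"}
        or int(task["fail_cnt"]) != 0
        for task in tasks
    )


def controllers_on_other_version(versions, target_version):
    """Step 9: IDs of controllers whose current version is not the target.

    versions: dict mapping a controller ID (for example "0A") to its version.
    """
    return sorted(
        controller_id
        for controller_id, version in versions.items()
        if version != target_version
    )
```

With the example from Step 9 (0B and 0D on version B, 0A and 0C on version A), controllers_on_other_version returns 0A and 0C, which are the controllers to remove and insert.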
Symptom
During the online upgrade, the preparation for the online upgrade failed. The cause is
displayed as The port failover check fails. Click Details and the upgrade process is paused.
Figure 1-1 Failure of the preparation for the online upgrade due to a failed port failover check
Alarm Information
None.
Possible Causes
1. The network connection between the array and host is faulty. As a result, the links
between the array and host are not redundant.
2. After the controller is restarted, the connection status of the front-end port on the
controller is abnormal. As a result, the failed-over port cannot be failed back.
3. After the port failover function is enabled on the restarted controller, the connection to
the host fails due to a bug in the HBA driver on the host. As a result, the links between
the storage array and host are not redundant.
Recommended Actions
Step 1 Use the SSH client software to log in to each controller in sequence. Enter the developer
mode.
Step 2 Run the debug command in developer mode to enter the debug mode.
Step 3 Run the eam showfctgtlink and eam showiscsitgtlink commands on each controller.
Step 4 Compare the number of host links queried on each controller. If the link quantities are
different, check whether the network connections between the host and array are normal. If
not, rectify the fault and try again. If the network connections are normal, go to the next step.
Step 5 Exit from the debug mode to the developer mode. Run the show failover_path general
command.
Step 6 As shown in Figure 7-8, if there are ports in the Failed-over or Taking over state, some ports
are not failed back or fail to fail back. In this case, go to the next step. If no port is in the
Failed-over or Taking over state, go to Step 10.
Determine whether a failback failure occurs based on the upgrade stage and status:
1. When a batch upgrade is complete and the next batch upgrade of the controller starts, the preparation
for the online upgrade fails and ports in the Failed-over and Taking over states are displayed,
indicating that the port failback failure occurs after the former upgrade and restarting.
2. As shown in Figure 7-8, if controller A is upgraded and controller B starts to upgrade, the
preparation for the online upgrade fails, signifying that a logical port whose source port is on
controller A is not failed back.
Step 7 In developer mode, run the following command: change logical_port failback
service_type=SAN controller=<Controller ID>.
Controller ID: Indicates the controller whose logical port needs to be failed back. As shown in Figure 7-8,
controller 0B needs to be failed back. The Failover Status of CTE0.B is Taking over, meaning that
the port carries a logical port that is failed over from another controller and requires failback.
Step 8 After the failback command is executed manually, wait for five seconds and run the show
failover_path general command to check whether all ports are in the Idle state. If yes, the
failback is successful. Otherwise, go to the next step.
Step 9 Check whether a port fault occurs on the current device. After ensuring that all the ports are
normal, manually run the failback command again. If the fault persists, contact technical support
engineers.
Step 10 Log in to the service host and check whether the links between the host and array are normal.
Use the UltraPath software in Linux as an example. Run the upadmin show path command
to check whether a faulty path exists. If two faulty paths exist as shown in Figure 7-9, go to
the next step.
Step 11 Enter cat /sys/class/fc_remote_ports/rport- and then press Tab to view the number of links.
Step 13 Remove and then insert the optical cable on the array or host (the interval between removing
and inserting is longer than 1 minute) to restore connections.
Step 14 After removing and inserting the cable, repeat Step 10 through Step 12 to check whether the
connection is restored. If the fault persists, contact technical support engineers.
----End
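The comparisons in Step 4, Step 6, and Step 8 of the preceding procedure can be sketched as follows. This is a minimal Python illustration, assuming the link counts and the Failover Status values have been transcribed by hand from the command output into dictionaries. The function names and data layout are hypothetical, and the sketch does not run or parse any array command.

```python
# Failover states that Step 6 treats as "not failed back".
PENDING_FAILBACK_STATES = {"failed-over", "taking over"}


def link_counts_differ(link_counts):
    """Step 4: True if controllers report different numbers of host links.

    link_counts: dict mapping a controller ID to its number of host links.
    """
    return len(set(link_counts.values())) > 1


def ports_needing_failback(port_states):
    """Step 6 and Step 8: ports still in the Failed-over or Taking over state.

    port_states: dict mapping a port name (for example "CTE0.B") to its
    Failover Status. An empty result means every listed port is failed back.
    """
    return sorted(
        port
        for port, state in port_states.items()
        if state.strip().lower() in PENDING_FAILBACK_STATES
    )
```

An empty list from ports_needing_failback after the manual failback corresponds to all ports being in the Idle state in Step 8.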
Symptom
1. The upgrade tool fails to notify the system of starting upgrade, as shown in Figure 7-12.
Alarm Information
None.
Possible Causes
1. If the source version is V300R006C30 or later, the primary and secondary ends of the
HyperMetro pair or remote replication pair cannot be upgraded at the same time.
2. If the version is earlier than V300R006C30, the system may fail to notify service
modules of starting upgrade.
Recommended Actions
Step 1 Use the SSH client software to log in to the active controller of the current cluster.
Step 2 Check whether a "System is upgrading" message is displayed. If yes, go to the next step. If
not, contact Huawei technical support engineers.
Step 3 Run the minisystem command to enter the minisystem mode and run the showsystrace
command in minisystem mode.
Step 4 Check whether Success is displayed under the Status column in the
SET_UPDATE_STATUS step and whether the FailCnt column is 0. If not, the system fails
to be notified of starting upgrade, as shown in Figure 7-13.
Step 5 Run the showsystrace 112 command. If the value of Failed Action Name is Upgrade:
NtfRssPre, the value-added services fail to start upgrade. The possible cause is that the
primary and secondary ends of the HyperMetro pair or remote replication pair cannot be
upgraded at the same time.
----End
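The decision in Step 4 can be expressed as a small Python check. This is a sketch under stated assumptions: the showsystrace rows are assumed to have been copied by hand into dictionaries with the hypothetical keys step, status, and fail_cnt, which mirror the Status and FailCnt columns named above.

```python
def notify_succeeded(steps):
    """Return True if the SET_UPDATE_STATUS step reports Success with FailCnt 0.

    steps: list of dicts with the hypothetical keys 'step', 'status' and
    'fail_cnt', transcribed from the showsystrace output.
    """
    for row in steps:
        if row["step"] == "SET_UPDATE_STATUS":
            return row["status"] == "Success" and int(row["fail_cnt"]) == 0
    # The step is missing from the trace: treat the notification as failed.
    return False


trace = [{"step": "SET_UPDATE_STATUS", "status": "Failed", "fail_cnt": "1"}]
if not notify_succeeded(trace):
    print("The system failed to be notified of starting upgrade; continue with Step 5.")
```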
Symptom
1. Customer services cannot be resumed.
2. The host fails to discover LUNs on one or more controllers.
Alarm Information
None.
Possible Causes
The storage device and host are not properly connected.
Recommended Actions
1. Check whether the device is correctly connected to the host. If the connection is correct,
restart the controller.
2. If the fault persists, contact Huawei technical support.
Symptom
A controller hardware fault occurred during an upgrade. As a result, the upgrade
failed.
Alarm Information
None.
Possible Causes
The controller hardware is faulty.
Recommended Actions
Step 1 In SmartKit, find out the ID of the controller that fails the upgrade. In Figure 1 Failed
upgrade, the ID is 0B. Based on the ID, find out the controller that fails the upgrade. (0B
indicates controller B in enclosure 0.)
Step 2 Remove the controller that fails the upgrade and insert a spare controller.
Step 3 Check whether the controller replacement is successful as follows:
1. Log in to OceanStor DeviceManager and view the device. Figure 2 Checking the
status of the device shows an example of a spare controller that is in normal state.
2. Check that you can successfully log in to the CLI of the spare controller, as shown in
Figure 7-16.
3. Check whether the version information of the spare controller is correct, as shown in
Figure 7-17. If Current Version and History Version of the spare controller are the
target and source versions respectively, the controller replacement is successful.
If the controller replacement is not successful, remove and insert the spare controller again. Then, repeat
step 3 or contact R&D engineers.
Step 4 In SmartKit, upgrade the faulty controller until the upgrade is successful. You can refer to 8.4
The Method of Handling Upgrade Exceptions in 8 FAQs.
If the preparation for the online upgrade of the first or second batch of controllers fails and a
controller is replaced, you can only retry the upgrade.
If the offline upgrade preparation fails to be executed and controllers are replaced, you can only
perform the upgrade again.
----End
Symptom
After an upgrade, VM information cannot be queried.
After an upgrade, VMs are restarted successfully but ISV services on the VMs cannot be
restarted.
Alarm Information
None.
Possible Causes
Restarting storage LUNs fails.
Recommended Actions
If VM information is lost, contact Huawei technical support engineers.
If ISV services on the VMs cannot be started, contact technical support engineers of the ISV.
Symptom
After an upgrade is complete, the drive letter mappings of the host have changed.
Alarm Information
None.
Possible Causes
None.
Recommended Actions
Windows-based host
Rectify incorrect drive letters according to the drive letter mapping table. The procedure
is as follows:
Step 1 Right-click Computer and choose Manage from the shortcut menu. In the dialog box that is
displayed, select Disk Management.
Step 2 In the upper left corner of the page, right-click the corresponding drive letter and choose
Change Drive Letter and Paths, as shown in Figure 7-18.
Step 3 Click Change. The Change Drive Letter or Path for X: dialog box is displayed (X indicates
a drive letter), as shown in Figure 7-19.
Step 4 Specify a new drive letter and click OK to confirm the modification.
----End
Linux-based host
Restart the host.
Symptom
On the CLI, status of all links with CHAP enabled is Link Down after a rollback in case of an
offline upgrade failure, as shown in Figure 7-20.
Alarm Information
None.
Possible Causes
Links cannot recover automatically because iSCSI configurations fail to be restored.
Recommended Actions
Step 1 On the CLI, view information about all heterogeneous links, as shown in Figure 7-20.
Step 2 On the CLI, delete all heterogeneous links for which CHAP Enabled is Yes, as shown in
Figure 7-21.
Step 3 On OceanStor DeviceManager, click Add Remote Device. Re-configure heterogeneous links
for which CHAP is enabled as instructed, as shown in Figure 7-22.
----End
Symptom
If an offline upgrade fails due to an abnormal controller reset during the preparation for the
upgrade, or after the preparation and before the controllers are restarted, do not click Retry or
Rollback. Perform a forcible reset first, and click Retry or Rollback only after the controller is
added to the cluster.
Alarm Information
None.
Possible Causes
During an offline upgrade, a controller is reset abnormally before the planned controller reset,
and the state of the controller does not recover.
Recommended Actions
Step 1 Log in to any normal controller, enter the minisystem mode, and run the command shown in
Figure 7-23 to forcibly reset the storage system. Then follow the command prompts.
Step 2 If the state of the storage system still does not recover, contact Huawei R&D engineers.
----End
Symptom
On a non-standard heterogeneous network (for example, controller A of the local storage array
is connected to that of a heterogeneous storage array through a link, and controller B of the
local storage array is connected to that of the heterogeneous storage array through a link), if
both storage arrays are upgraded simultaneously, a version verification failure may occur after
the local storage array restarts. On a standard network, each controller of the local storage
array has at least one available link to each controller of the heterogeneous storage array.
Alarm Information
None.
Possible Causes
The heterogeneous network is a non-standard network. When both local and heterogeneous
storage arrays are upgraded, one controller of the heterogeneous storage array resets, so the
heterogeneous LUN has only one link to the local storage array. Consequently, the link check
of the heterogeneous LUN fails during the upgrade check on the local storage array, causing
the upgrade interruption.
Recommended Actions
Step 1 After the heterogeneous storage array completes the upgrade and returns to normal, click
Details on the Upgrade page of SmartKit, and click Retry in the dialog box that is displayed.
Step 2 If the local storage array still fails to be upgraded, contact technical support engineers.
----End
Symptom
During the online or offline upgrade, the pre-upgrade preparations are suspended to perform
spare parts replacement. After the Retry button is clicked, the upgrade failure of the replaced
controller firmware is reported during the system upgrade.
Alarm Information
None.
Possible Causes
Enclosure firmware of the replaced controller is not uploaded. As a result, the enclosure
version does not update to the latest one during the enclosure upgrade and the version fails to
be verified.
Recommended Actions
Step 1 After the part is replaced and the firmware upgrade failure is reported, click Details on the
Upgrade page of SmartKit. In the dialog box that is displayed, click Retry.
Step 2 If the array still cannot be upgraded, contact Huawei technical support engineers.
----End
Symptom
After an array is upgraded, the physical path status and logical path status displayed on the
UltraPath installed on the host are faulty and normal, respectively.
Alarm Information
None
Possible Causes
Before the controller is restarted following an upgrade, the controller instructs UltraPath to set
the logical path to faulty. After the controller restarts, UltraPath detects that the physical path
is disconnected and sets the physical path to faulty. After the controller is restarted, the status
of the logical path becomes normal, but the physical path status is not refreshed.
This problem does not affect the host I/O, but only the status is displayed incorrectly.
Recommended Actions
Enable the in-band command of the mapping view on the array and scan for disks on the host
to update the path information.
Step 1 Log in to the host and run the upadmin show path (for Linux hosts) or esxcli upadm show
path (for ESX hosts) command to view information about the initiator whose physical path
status is faulty. The following uses Linux as an example. As shown in the following figure,
the path whose Path ID is 2 is faulty and its initiator is
iqn.1996-04.de.suse:01:1f7fc639cedc.
Step 2 Use an SSH tool, such as Xshell 5, PuTTY 0.63, SecureCRT 6.7, or one of their later
versions, to log in as user admin (the default password is Admin@storage) to the
management network port on the storage device. Then, the admin view of the CLI is
displayed.
Step 3 Log in to the array and run the following commands to query the mapping view associated
with the host initiator:
1. Run the show initiator command to check Host ID corresponding to the host initiator.
As shown in the following figure, the host initiator is
iqn.1996-04.de.suse:01:1f7fc639cedc and the corresponding Host ID is 10.
2. Run the show host host_group host_id=xx command (xx is the Host ID found in the
previous step) to find the corresponding Host Group ID. As shown in the following
figure, Host Group ID is 9.
3. Run the show host_group mapping_view host_group_id=xx command (xx is the Host
Group ID found in the previous step) to find the corresponding mapping view. As
shown in the following figure, the mapping view is MappingView004.
Step 4 On the array, enable the in-band command of the mapping view found in step 3.
1. Run the show mapping_view general command to check whether the in-band command
is enabled. As shown in the following figure, the in-band command of
MappingView004 is Disable.
3. Run the show mapping_view general command to check whether the in-band command
is enabled. As shown in the following figure, the in-band command of
MappingView004 is Enable.
Step 5 On the host, run the upRescan (for Linux hosts) or esxcfg rescan -A (for ESX hosts)
command to scan for disks. The following uses Linux as an example:
Before scanning for disks, the disk information is as follows:
Step 6 Run the upadmin show path (for Linux hosts) or esxcli upadm show path (for ESX hosts)
command to check whether the corresponding physical path status is normal. Using a Linux
host as an example, the status of the physical path whose Path ID is 2 changes to normal.
Step 7 On the array, disable the mapping view's command device to restore the environment.
1. Run the change mapping_view mapping_view_id=xx command_device=disable
command (xx indicates the mapping view ID of the command device to be disabled, for
example, 4), as shown in the following figure.
2. Run the show mapping_view general command to check whether the in-band command
is enabled. As shown in the following figure, the in-band command of
MappingView004 is Disable.
Step 8 On the host, run the upRescan (for Linux hosts) or esxcfg rescan -A (for ESX hosts)
command to restore the environment. Using a Linux host as an example, after the command
for scanning for disks is executed, the 16 KB command device disappears.
----End
Symptom
When OceanStor SmartKit is used for an upgrade, information indicating that the uploading
of the upgrade package fails and the upgrade cannot be continued is displayed.
Alarm Information
None
Possible Causes
Involved versions do not support the upgrade method by connecting to the maintenance
network port.
Recommended Actions
Step 1 Check whether the management network port of the array (see the marking in the following
figure) is connected by a network cable. If not, connect it using a network cable.
Step 2 Use an SSH tool, such as Xshell 5, PuTTY 0.63, SecureCRT 6.7, or one of their later
versions, to log in as user admin (the default password is Admin@storage) to the
management network port on the storage device. Then, the CLI is displayed.
Step 3 In the admin view, run the show port general command to check IP addresses of ports whose
type is Maintenance Port. If you use these IP addresses in OceanStor SmartKit for upgrade,
this problem occurs, as shown in the following figure.
Step 4 If OceanStor SmartKit uses the IP addresses of ports whose type is Maintenance Port for the
upgrade, run the show port general command in the admin view to check the IP addresses of
ports whose type is Management Port. When OceanStor SmartKit is used for the upgrade,
replace the IP addresses of the ports whose type is Maintenance Port with the IP addresses
of the ports whose type is Management Port, as shown in the following figure (the IP
addresses in the figure are examples).
----End
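The address selection in Steps 3 and 4 can be illustrated with a short Python sketch. It assumes the port type and IP address of each row in the show port general output have been copied into dictionaries with the hypothetical keys type and ip; the addresses below come from the documentation range 192.0.2.0/24 and are examples only.

```python
def management_ips(ports):
    """Return the IP addresses of ports whose type is Management Port."""
    return [p["ip"] for p in ports if p["type"] == "Management Port"]


def uses_maintenance_port(ip, ports):
    """Return True if the given IP address belongs to a Maintenance Port."""
    return any(p["ip"] == ip and p["type"] == "Maintenance Port" for p in ports)


# Hypothetical rows transcribed from `show port general`.
ports = [
    {"type": "Maintenance Port", "ip": "192.0.2.10"},
    {"type": "Management Port", "ip": "192.0.2.20"},
]

configured_ip = "192.0.2.10"  # address currently entered in SmartKit (example)
if uses_maintenance_port(configured_ip, ports):
    print("Replace", configured_ip, "with one of:", management_ips(ports))
```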
Symptom
When OceanStor SmartKit is used to perform an upgrade, copying the upgrade package in
online mode fails and consequently the upgrade cannot be continued.
Alarm Information
None
Possible Causes
An error occurs when the non-cluster primary IPv6 address is parsed. If the IPv6 address is
entered on the upgrade tool, the upgrade package fails to be copied.
Recommended Actions
If possible, replace the IPv6 address with an IPv4 address for the upgrade. Otherwise, perform the
following steps.
Step 1 In the admin view, run the show port general command to query the IPv6 address of
management port ID of the storage array.
Step 2 Use the IPv6 address obtained in step 1 to perform the upgrade.
----End
Symptom
The upgrade progress on the upgrade tool is not refreshed within 30 minutes, but no upgrade
failure is displayed.
Alarm Information
None
Possible Causes
The upgrade tool cannot communicate with the array.
Recommended Actions
Step 1 Use the SSH software to connect to any management IP address of the array and log in to the
CLI as user admin.
Step 2 If the upgrade view is displayed after the CLI is displayed and the status is System is
upgrading, the system is being upgraded. Run the show upgrade status command to view
the upgrade progress and status.
Step 3 If the admin view is displayed after the CLI is displayed, run the change user_mode
current_mode user_mode=developer command to go to the developer view and run the
show upgrade status command to check the upgrade progress and status. If the status is
Upgrade Succeed and Percent is 100, the upgrade is successful. Otherwise, the upgrade fails.
Step 4 If the upgrade fails, contact Huawei technical support.
----End
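The decision logic of Steps 2 to 4 can be summarized in Python. This is a sketch only: the view name and the Status and Percent values are assumed to be read manually from the CLI session, and the function name classify_upgrade is hypothetical.

```python
def classify_upgrade(view, status, percent):
    """Classify the upgrade state following Steps 2 to 4.

    view:    "upgrade" or "admin", the CLI view displayed after login
    status:  the status text shown by `show upgrade status`
    percent: the Percent value shown by `show upgrade status`
    """
    if view == "upgrade":
        # Step 2: the upgrade view means the system is still being upgraded.
        return "in progress"
    if status == "Upgrade Succeed" and int(percent) == 100:
        # Step 3: both conditions must hold for a successful upgrade.
        return "succeeded"
    # Step 4: anything else is treated as a failure.
    return "failed"


print(classify_upgrade("admin", "Upgrade Succeed", 100))
```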
Symptom
After the system is upgraded to V300R006C30 or later, synchronization fails to start in
value-added services, such as LUN copy, LUN clone, volume mirroring, and remote replication. An
error message similar to "The vStore IDs of the primary and secondary LUNs differ" is
returned. For remote replication, the message "The system is busy" or "An internal error
occurs" is returned.
Note: For details about the LUN HyperMetro, see section 9.4.3.2 What Can I Do when the
HyperMetro Consistency Group Fails to Be Synchronized After the Upgrade?.
Alarm Information
Operation logs about synchronization failures exist.
Possible Causes
1. In LUN clone, LUN copy, and volume mirroring, the primary and secondary LUNs belong
to different vStores before the upgrade (in V300R003C20SPC200,
V300R005C00SPC300, V300R006C00SPC100, V300R006C10SPC100, or
V300R006C20). After the upgrade to a version (V300R006C30 or later) in which
value-added services support multiple vStores, the vStore consistency of the primary and
secondary LUNs is checked. If the vStores are inconsistent, synchronization is not
allowed.
2. For a remote replication consistency group, the vStore consistency between member pairs and
the consistency group, and between pairs, is checked on the primary and secondary ends
respectively. If the vStores are inconsistent, synchronization is not allowed. You need to
manually modify the vStore attributes to ensure vStore consistency before starting synchronization.
Recommended Actions
For LUN clone: The error code 0x4040373A is returned during synchronization.
Solution: Modify the vStore attributes of one or more LUNs to ensure that all LUNs
have the same vStore.
Run the following command in developer mode: change lun lun_id=x vstore_id=x.
For LUN copy: The error code 0x4000DB13 is returned during synchronization.
Solution: Modify the vStore attributes of one or more LUNs to ensure that all LUNs
have the same vStore.
Run the following command in developer mode: change lun lun_id=x vstore_id=x.
For volume mirroring: The error code 0x40024728 is returned during synchronization.
Solution: Modify the vStore attributes of one or more LUNs to ensure that all LUNs
have the same vStore.
Run the following command in developer mode: change lun lun_id=x vstore_id=x.
For LUN remote replication: its member LUNs belong to vStores and remote replication
pairs are added to a consistency group. The group belongs to a default vStore because
vStores are not supported by LUN remote replication in the pre-upgrade version. After
the upgrade to the version (V300R006C30 and later) where multi-vStores are supported
by HyperMetro, the vStore consistency of the consistency groups and member pairs will
be checked during synchronization. Because the vStores differ, synchronization is not
allowed, and the error code 0x40001C6D or the message "The system is busy" is returned.
Solution:
Method 1: Change the vStores of all member LUNs to the system vStore. Run the
following command in developer mode: change lun lun_id=x vstore_id=0.
Method 2: Remove all remote replication pairs, delete the consistency group, create a
consistency group for the specified vStore, and add the remote replication pairs to the
consistency group. Note: If the vStores of remote replication pairs' LUNs are
inconsistent, use method 1 to change the vStore IDs of all remote replication pairs to the
same one. Use SSH to connect to a management IP address of the storage array and log
in to the CLI as user admin.
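The vStore alignment described in the solutions above can be sketched in Python. The sketch only builds the change lun lun_id=x vstore_id=x command strings from this section; it does not connect to the array. The mapping of LUN IDs to vStore IDs is assumed to be collected manually, and picking the most common vStore as the target when none is given is a choice made for this illustration, not a rule from this guide.

```python
from collections import Counter


def vstore_fix_commands(lun_vstores, target_vstore=None):
    """Build the CLI commands that move every LUN into one vStore.

    lun_vstores:   dict mapping LUN ID -> current vStore ID
    target_vstore: vStore ID to align to; if omitted, the most common
                   vStore among the LUNs is used (illustrative choice).
    """
    if not lun_vstores:
        return []
    if target_vstore is None:
        target_vstore = Counter(lun_vstores.values()).most_common(1)[0][0]
    return [
        f"change lun lun_id={lun} vstore_id={target_vstore}"
        for lun, vstore in sorted(lun_vstores.items())
        if vstore != target_vstore
    ]


# Example with hypothetical LUN and vStore IDs.
for cmd in vstore_fix_commands({1: 0, 2: 3, 5: 3}):
    print(cmd)
```

Passing target_vstore=0 corresponds to Method 1 for remote replication, where all member LUNs are moved to the system vStore.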
8 FAQs
Answer
Using the SSH tool to log in to and out of a storage array (with PuTTY as an example)
Step 1 Open PuTTY, enter the IP address of the storage array in Host Name, and click Open to log
in to the storage array, as shown in Figure 8-1.
Step 2 Enter the user name and password of the storage array. The default user name and password
are admin and Admin@storage respectively. Figure 2 Logging in to the storage array shows
a successful login.
Step 3 Run the exit command to log out of the storage array, as shown in Figure 8-3.
----End
The web browser may display a message indicating that there is a problem with the security certificate.
You only need to confirm that the IP address is correct and continue to access the storage array.
Step 2 Enter the user name and password, and click Log In, as shown in Figure 8-4.
Step 3 The user interface of OceanStor DeviceManager is displayed, as shown in Figure 8-5.
Step 4 Click the log out button as shown in Figure 8-6 and click OK to log out of OceanStor
DeviceManager.
----End
Using an SFTP tool to log in to and out of a storage array (with WinSCP as an example)
Step 1 Open WinSCP. Set Host name to the IP address of the storage array to which you want to log
in. Set User name and Password to the user name and password of the storage array. The
user name is the administrator account, and the password is that of the administrator account.
The default user name and password are admin and Admin@storage respectively. Click
Login, as shown in Figure 8-7.
Step 2 After you log in to the storage array, the local directory is displayed on the left and the device
directory is displayed on the right, as shown in Figure 8-8.
Step 3 Click the close button in the upper right corner to log out of the storage array, as shown in
Figure 8-9.
----End
Answer
Step 1 Use the CLI to log in to the storage array and run show system server_port
server_name=SSH to query the SSH port number. The red box in Figure 8-10 shows an
example of the SSH port number.
Step 2 Use the CLI to log in to the storage array and run change system server_port
server_name=SSH port_num=** to change the SSH port number, as shown in Figure 8-11.
In the command, port_num indicates the new SSH port number that you want to use.
----End
Answer
If the system is in an abnormal state, you must contact Huawei technical support engineers to
collect storage array information. Figure 1 Device status shows that the system is normal.
Answer
If nodes fail to be upgraded, the upgrade is automatically terminated. Options including
rollback, retry, continue, and terminate are provided.
In the dialog box that is displayed, available options, such as Retry, are displayed, as shown
in Figure 3 Clicking Retry.
Confirm your choice and click OK, as shown in Figure 4 Clicking OK.
When a node fails to be upgraded and the upgrade process is suspended, you can roll back the upgrade,
perform the upgrade again, or ignore the upgrade failure on the CLI if the SmartKit is unable to connect
to devices or is disabled.
1. Run the show upgrade status command to check the current upgrade status. The upgrade status can
be Suspended Before Continue, Suspended Before Rollback, or Suspended Before Terminate.
2. After the upgrade status is confirmed, Huawei R&D engineers locate the causes of the upgrade
failure. Then R&D engineers instruct operators to troubleshoot the upgrade failure.
If the upgrade status is Suspended Before Continue, select Continue or Retry.
If the upgrade status is Suspended Before Rollback, select Roll back or Retry.
If the upgrade status is Suspended Before Terminate, select Retry.
3. Run the change upgrade flow resume_type=? command on the CLI. Five options are available:
continue, rollback, retry, terminate, and repair. For example, change upgrade flow
resume_type=retry indicates that you need to perform the upgrade again. For more information
about the parameters, run the help upgrade command on the CLI.
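The relationship between the suspended status and the permitted actions can be captured in a small Python lookup. This is an illustrative sketch: it only encodes the three status-to-action rules listed above and builds the change upgrade flow command string. It does not replace the evaluation by Huawei R&D engineers, and the function name resume_command is hypothetical.

```python
# Permitted resume types per upgrade status, as listed in item 2 above.
ALLOWED_RESUME_TYPES = {
    "Suspended Before Continue": ("continue", "retry"),
    "Suspended Before Rollback": ("rollback", "retry"),
    "Suspended Before Terminate": ("retry",),
}


def resume_command(status, choice):
    """Return the CLI command for a resume type, or raise ValueError."""
    allowed = ALLOWED_RESUME_TYPES.get(status)
    if allowed is None:
        raise ValueError(f"unexpected upgrade status: {status!r}")
    if choice not in allowed:
        raise ValueError(f"{choice!r} is not permitted when status is {status!r}")
    return f"change upgrade flow resume_type={choice}"


print(resume_command("Suspended Before Rollback", "retry"))
```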
Answer
If the source version is V300R006C10 or later, you can ignore the failed step and continue the
upgrade.
If you ignore the failed step and continue the upgrade, the risks vary with the failure and must
be evaluated by Huawei technical support engineers. Continue the upgrade only when the risks
are acceptable. Do not perform any operations without evaluation.
1. If you ignore "upgrading PCIe switches" and continue the upgrade, the versions of the
switches and controllers may be incompatible, which may trigger unknown exceptions and
risks.
2. If you ignore "preparing for the online upgrade" and continue the upgrade, services may be
interrupted.
3. If you ignore "upgrading the system" and continue the upgrade, the system software or
some firmware may not be upgraded and become incompatible, and other unknown
exceptions and risks are triggered.
4. If you ignore "rebooting the system" and continue the upgrade, the upgrade may fail again
or services may be interrupted.
5. If you ignore "verifying the system version after rebooting" and continue the upgrade, the
upgrade may fail again or services may be interrupted.
Figure 1-1 Failure of preparing for the upgrade due to a failed front-end redundant link check
The above preparation fails because no redundant link exists between the host and disk array during the
upgrade of node 1 (controller 0B). If the upgrade continues, services will be interrupted. If the
evaluation shows that the service interruption is acceptable, you can skip this step and proceed with the
upgrade.
Step 1 Use the SSH client software to log in to a controller and run the change user_mode
current_mode user_mode=developer command to enter the developer view, as shown in Figure 8-18.
Step 2 Run the minisystem command to enter the minisystem view, as shown in Figure 8-19.
Step 3 Run the upgrade.sh -i [nodeId] command to ignore the upgrade failure of nodeId and
continue the upgrade. In this example, run the upgrade.sh -i 1 command to ignore the failure
of node 1 (controller 0B), as shown in Figure 8-20.
Step 4 Choose Status > Details in the Upgrade page and then click Retry to continue the upgrade,
as shown in Figure 8-21.
----End
The above preparation fails because the upgrade system has timed out during the upgrade of node 2
(controller 0C). If the upgrade continues, services will be interrupted. If the evaluation shows that
upgrade system timeout is acceptable, you can skip this step and proceed with the upgrade.
Step 2 Click Ignore in Details to ignore the upgrade failure of the node and continue the
upgrade, as shown in Figure 8-23.
Step 3 Enter the user name and password, confirm the agreement, and then click OK to continue the
upgrade, as shown in Figure 8-24.
----End
Answer
To meet customer needs, you can perform an online upgrade in customized upgrade
batches. That is, you can divide an upgrade into several batches and specify the controllers to
be upgraded in each batch.
When determining upgrade batches, you must consider the specific network environment. If
you want to employ customized upgrade batches, contact technical support engineers to determine a
batching strategy.
The master node must be upgraded in the last batch. In the CLI, run show controller general to
identify the master node.
Controllers A and B cannot be upgraded in the same batch, and controllers C and D cannot be
upgraded in the same batch.
Controllers are named in the format of digit + letter. The digit indicates the ID of the engine in the
cluster. The letter indicates the controller in the engine. For example, 0A indicates controller A in engine
0.
3. Use SmartKit to perform an online upgrade. During the upgrade, SmartKit determines
whether UpgradeFile.xml exists. If UpgradeFile.xml exists, SmartKit performs the
upgrade based on the batch information in the file. For details about how to use SmartKit
to perform an online upgrade, see 5.6 Upgrading Controller Software.
4. Restart SmartKit for the change to take effect.
After completing the upgrade, delete UpgradeFile.xml to prevent any adverse impact on the
next upgrade.
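The batching rules stated above (the master node in the last batch, controllers A and B never in the same batch, controllers C and D never in the same batch) can be checked with a short Python sketch before a batch plan is discussed with technical support engineers. The function name check_batches is hypothetical, and the check applies the A/B and C/D rules literally to controller letters without comparing engine IDs, which is the stricter reading of the rule. It does not read or write UpgradeFile.xml.

```python
def check_batches(batches, master):
    """Return a list of rule violations for a proposed batch plan.

    batches: list of batches, each a list of controller names such as "0A"
             (digit = engine ID, letter = controller in the engine)
    master:  name of the master node, as identified with
             `show controller general`
    """
    errors = []
    if not batches or master not in batches[-1]:
        errors.append(f"master node {master} is not in the last batch")
    for index, batch in enumerate(batches, start=1):
        letters = {name[-1].upper() for name in batch}
        for first, second in (("A", "B"), ("C", "D")):
            if first in letters and second in letters:
                errors.append(
                    f"batch {index}: controllers {first} and {second} "
                    "are in the same batch"
                )
    return errors


# Hypothetical two-controller plan with 0A as the master node.
print(check_batches([["0B"], ["0A"]], master="0A"))
```

An empty list means that none of the three rules is violated; it does not mean the plan suits the network environment.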
Answer
1. Click the arrow in the red box, as shown in Figure 1 Clicking arrow.
2. In the dialog box that is displayed, select Upgrade configuration, as shown in Figure 2
Selecting upgrade configuration.
3. Click Modify in the red box as shown in Figure 3 Clicking Modify. Then you can
modify the upgrade configuration.
Answer
In the data backup phase of the upgrade, if the License file backup fails, the upgrade process
continues and is not affected.
This approach is highly risky. It is used to format system startup partitions, reinstall the
operating system, and clear the existing service configuration data. Do not perform it without
the guidance of R&D engineers.
Answer
Step 1 Log in to the controller. On the CLI, run change user_mode current_mode
user_mode=developer to enter the developer mode, as shown in Figure 8-28.
In the preceding command, ip indicates the IP address of the host. user and password are the user name
and password of the host. db_file can be customized, but the file name extension must be .dat, for
example: XX.dat. Both SFTP and FTP are supported.
Step 3 In developer mode, run the clear configuration_data action=reboot command, wait for the
system to clear configuration, and reboot the system to make the clearing take effect, as
shown in Figure 8-30.
Figure 1-1 Clearing the configuration and rebooting the system to make the clearing take effect
Step 4 On the CLI, run minisystem to enter the minisystem mode. Run sys.sh clearnode -bf, as
shown in Figure 8-31.
Step 5 Before uploading the upgrade package, run the free -m command in the minisystem to query
the remaining memory of the system. Ensure that the remaining memory is 100 MB larger
than the size of the upgrade package, as shown in Figure 8-32.
Step 9 After the setupsystem command succeeds, go to the next step, as shown in Figure 8-34.
Figure 1-1 Rebooting the system for the change to take effect
To re-upload an upgrade package or cancel the upgrade, you need to delete the upgrade
package. Run rm /home/permitdir/*.tgz to delete the package. Then, run ls /home/permitdir/
to verify whether the package is deleted, as shown in Figure 8-36.
----End
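The memory check in Step 5 (remaining memory at least 100 MB larger than the upgrade package) can be sketched in Python. The parser assumes that the free -m output in the minisystem follows the common layout in which the Mem: row lists total, used, and free in that order; this layout is an assumption and should be compared with the actual output in Figure 8-32. The sample numbers are invented for illustration.

```python
def free_mb_from_output(text):
    """Extract the 'free' column (MB) from the Mem: row of `free -m` output.

    Assumes the column order total, used, free on the Mem: row.
    Returns None if no Mem: row is found.
    """
    for line in text.splitlines():
        if line.strip().startswith("Mem:"):
            return int(line.split()[3])
    return None


def enough_memory(free_mb, package_bytes, margin_mb=100):
    """Return True if free memory exceeds the package size by the margin."""
    package_mb = package_bytes / (1024 * 1024)
    return free_mb >= package_mb + margin_mb


sample = """\
             total       used       free     shared    buffers     cached
Mem:          7982       6120       1862          0        210       1400
"""
free_mb = free_mb_from_output(sample)
print(enough_memory(free_mb, package_bytes=900 * 1024 * 1024))
```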
This approach is highly risky. Do not perform it without the guidance of R&D engineers.
Answer
Step 1 Log in to either controller of one engine. On the CLI of the minisystem, run upgradesystem.
Upload the upgrade package and enter the package name as prompted, as shown in
Figure 8-37.
Step 2 Use an FTP/SFTP tool to upload the upgrade package to the /home/permitdir/update_disk
directory of the controller. Then, enter the package name and press Enter. Enter y to start the
upgrade, as shown in Figure 8-38.
Step 3 If the system has multiple engines, repeat steps 1 and 2 on each engine.
Step 4 In developer mode, run the export configuration_data ip=x.xx.xx.xx user=xxx
password=xxx db_file=XX.dat protocol=SFTP command, as shown in Figure 8-39.
In the preceding command, ip indicates the IP address of the host. user and password are the user name
and password of the host. db_file can be customized, but the file name extension must be .dat, for
example: XX.dat. Both SFTP and FTP are supported.
Step 5 In developer mode, run the clear configuration_data action=reboot command, wait for the
system to clear configuration, and reboot the system to make the clearing take effect, as
shown in Figure 8-40.
Figure 1-1 Clearing the configuration and rebooting the system to make the clearing take effect
----End
To re-upload an upgrade package or cancel the upgrade, you need to delete the upgrade
package. Run upgradesystem and rm /startup_disk/image/update/* to delete the upgrade
package. Then, run ls /startup_disk/image/update/ to verify whether the package is deleted, as
shown in Figure 8-41.
Answer
1. Check whether there is a single-link alarm. If there is, recover the failed link based on the alarm.
2. If there is no single-link alarm, run the show disk general |filterColumn include
columnList=ID,Multipathing command on the CLI to check disk links, as shown in
Figure 8-42. If Multipathing is A, B for a disk, its links are correct. Otherwise, locate the faulty
disks and clear the single-link alarms.
In the figure above, two disks (DAE000.11 and DAE000.12) are in the single-link state. A single-link
alarm of the two disks (DAE000.11 and DAE000.12) is reported on the DeviceManager. Rectify the
fault according to the alarm clearing suggestion.
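The link check above can be sketched in Python. The sketch assumes the ID and Multipathing columns have been copied by hand from the command output into (disk ID, multipathing) pairs; it treats any value other than both A and B as a single-link state, following the rule stated above. The function name single_link_disks is hypothetical.

```python
def single_link_disks(rows):
    """Return the IDs of disks whose Multipathing value is not both A and B.

    rows: list of (disk_id, multipathing) pairs, for example
          ("DAE000.11", "A"), transcribed from the CLI output.
    """
    faulty = []
    for disk_id, multipathing in rows:
        paths = {p.strip().upper() for p in multipathing.split(",") if p.strip()}
        if paths != {"A", "B"}:
            faulty.append(disk_id)
    return faulty


rows = [("DAE000.10", "A, B"), ("DAE000.11", "A"), ("DAE000.12", "B")]
print(single_link_disks(rows))
```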
Answer
The operation might fail if the user who has logged in to the storage system has already opened
four debug windows. Run the following commands to locate the problem.
Log in to the storage system on the CLI. Run the change user_mode current_mode
user_mode=developer command to go to the developer mode. Run the debug command to
check whether the storage system can enter the debug mode.
If the storage system can enter the debug mode as shown in Figure 8-43, the failure is not
caused by debug windows. Contact Huawei technical support engineers to handle the
problem.
Figure 1-1 The storage system can enter the debug mode
If the storage system cannot enter the debug mode as shown in Figure 8-44, the failure is
caused by debug windows. Close all the debug windows by running the exit command, and
then try to execute the upgrade on SmartKit again. If the upgrade fails again, contact Huawei
technical support engineers to handle the problem.
Figure 1-2 The storage system cannot enter the debug mode
Answer
Step 1 Run the show event | filterRow column=ID predict=equal_to value=0x200F01040042 command to
filter all operation logs generated when the user clears alarms, as shown in Figure 8-45.
Figure 1-1 Filtering all operation logs generated when the user clears alarms
Step 2 Based on the sequence of alarm deletion obtained in step 1, run the show event sequence
command to view the sequence numbers of the alarms, as shown in Figure 8-46.
Figure 8-46 Viewing deleted alarms using the operation logs of alarm deletion
Step 3 Based on the sequence numbers obtained in step 2, run the command to view details of the
alarms. Confirm alarms of major or critical level only, as shown in Figure 8-47.
----End
Answer
What is upgrade pause?
This function is used to pause an upgrade after certain steps. The upgrade continues when
some operations are performed manually. For example, when the first batch of controllers is
upgraded, you need to confirm whether all links between hosts and arrays are restored
manually and then continue to upgrade the next batch of controllers. The pause points can be
configured.
When the upgrade is paused, you can click Continue to resume the upgrade or Roll Back to
roll it back, as prompted. For details, see the following figure.
Step 3 In SmartKit, click Modify on the Set Upgrade Policy page. In the displayed Modify dialog
box, select Enable Upgrade Pause, as shown in the following figure.
Step 4 If you use SmartKit to implement an online upgrade, the upgrade process will be paused as
configured.
----End
Answer
You can check whether the conditions of port failover are met based on the following
procedure:
Step 1 Use the SSH client software to log in to a controller and run the change user_mode
current_mode user_mode=developer command to enter the developer view, as shown in
Figure 8-51.
In developer mode, run the show upgrade port_failover_switch (V300R006C20 and later
versions) command to check whether the port failover function is enabled, as shown in Figure
8-53.
Step 3 By default, the port failover function is enabled. If the port failover function is disabled (as
shown in Figure 8-54), run the change logical_port failover_switch service_type=SAN
switch=on (V300R006C10SPC100 and earlier versions) command in developer mode to
enable the function, as shown in Figure 8-55.
By default, the port failover function is enabled. If the port failover function is disabled (as
shown in Figure 8-56), run the change upgrade port_failover_switch switch=on
(V300R006C20 and later versions) command in developer mode to enable the function, as
shown in Figure 8-57.
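The version-dependent choice between the two enable commands above can be sketched as follows. The sketch assumes that version strings of the form VxxxRxxxCxx[SPCxxx] compare field by field, with a missing SPC field treated as 0. The helper names are hypothetical, and versions that fall between the two ranges named above return None because this guide does not state which command applies to them.

```python
import re

VERSION_RE = re.compile(r"^V(\d+)R(\d+)C(\d+)(?:SPC(\d+))?")


def parse_version(version):
    """Turn a string such as V300R006C10SPC100 into a comparable tuple."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    v, r, c, spc = match.groups()
    return (int(v), int(r), int(c), int(spc or 0))  # assumption: no SPC means 0


def enable_failover_command(version):
    """Pick the developer-mode command that enables port failover."""
    parsed = parse_version(version)
    if parsed <= parse_version("V300R006C10SPC100"):
        return "change logical_port failover_switch service_type=SAN switch=on"
    if parsed >= parse_version("V300R006C20"):
        return "change upgrade port_failover_switch switch=on"
    return None  # not covered by either range in this guide


print(enable_failover_command("V300R006C20"))
```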
Step 4 After the port failover function is enabled, run the test logical_port failover
service_type=SAN command in developer mode to check whether the networking mode
meets the upgrade prerequisites.
1. If the system has IB and FCoE ports that are configured with host services, port failover is not
supported, and an error message is displayed during the network check.
2. Batch upgrades are performed according to a default sequence that cannot be changed. If you change
the sequence, an error message will be displayed during the network check.
3. If the system has Fibre Channel and iSCSI ports that are configured with host services and the
network check fails, check the networking mode of the storage system against the following standard
networking requirements for port failover.
----End
2. The Fibre Channel ports on controllers A and B of the storage system must support
NPIV (such as the Fibre Channel port on the SmartIO interface module in Fibre Channel
mode, 8 x 8 Gbit/s Fibre Channel high-density interface module, and onboard SmartIO
interface module in Fibre Channel mode).
3. The Fibre Channel ports on controllers A and B must be connected to the same switching
network.
4. The NPIV function of the switches is enabled.
2. The Fibre Channel ports on controllers A and B of the storage system must support
NPIV (such as the Fibre Channel ports on the SmartIO interface module in Fibre
Channel mode, 8 x 8 Gbit/s Fibre Channel high-density interface module, and onboard
SmartIO interface module in Fibre Channel mode).
3. The Fibre Channel ports on controllers A and B must be connected to the same switching
network.
4. The NPIV function of the switches is enabled.
Figure 8-59 Dual-switch network (weak symmetric) for Fibre Channel port failover
In Figure 8-59, the two ports on controllers A and B connected using yellow cables are non-
strong-symmetric ports, and the two ports on controllers A and B connected using green
cables are strong symmetric ports.
Perform the following steps to add the two non-strong-symmetric ports to a SAN failover
group, thereby meeting port failover requirements:
Step 1 Run the create failover_group general name=? service_type=SAN command to create a
SAN failover group.
Figure 8-61 Weak symmetric networking mode for iSCSI port failover
In Figure 8-61, the two ports on controllers A and B connected using green cables are non-
strong-symmetric ports, and the two ports on controllers A and B connected using yellow
cables are strong symmetric ports.
Perform the following steps to add the two non-strong-symmetric ports to a SAN failover
group, thereby meeting port failover requirements:
Step 1 Run the create failover_group general name=? service_type=SAN command to create a
SAN failover group.
In developer mode, run the test logical_port failover service_type=SAN command to check
whether the networking mode meets upgrade requirements.
----End
Answer
Step 1 Use the SSH client software to log in to a controller and run the change user_mode
current_mode user_mode=developer command to enter the developer view, as shown in
Figure 8-62.
In developer mode, run the show upgrade port_failover_switch (V300R006C20 and later
versions) command to check whether the port failover function is enabled, as shown in Figure
8-64.
----End
Answer
After the upgrade is complete, use the SSH client software to log in to any controller and run
the change remote_replication synchronize remote_replication_id=****** command to
restart the synchronization. In this way, the read-only function of the secondary array takes
effect, as shown in Figure 8-67.
----End
Answer
After the upgrade is complete, use the SSH client software to log in to any controller and run
the change hypervault start hypervault_id=******** action_type=remote command to
restart remote backup. In this way, the read-only function of the secondary array takes effect,
as shown in Figure 8-68.
----End
Question
What can I do if the system displays a message indicating that the front-end link redundancy
check fails during the pre-upgrade check or online controller upgrade preparation?
Answer
The following are possible causes of the failure to pass the front-end link redundancy check:
1. The links between the host and each controller of the disk array do not meet the online upgrade
requirements. If the upgrade continues, host services will be interrupted during the upgrade.
2. In some specific scenarios, for example, when the host has no I/O, the links between the host and
controller are not automatically restored. As a result, the link redundancy does not meet the
upgrade requirements after the controller is upgraded and restarted.
Step 1 Choose DeviceManager > Monitoring > Alarms and Events, and check whether there is an
alarm indicating that there is no redundant path from the host to the disk array.
If yes, clear the alarm by taking recommended actions in alarm details.
If not, go to Step 2.
Step 2 Choose DeviceManager > Provisioning > Host. Traverse all hosts to check whether there
are online initiators.
With the default upgrade batch, if the host paths meet the following requirements, the host paths are
redundant. Otherwise, the host paths are not redundant.
At least one available link exists between the host and a batch of controllers with even IDs
(controllers XA and XC in a single-engine quad-controller device and controller XA in a single-
engine dual-controller device).
At least one available link exists between the host and a batch of controllers with odd IDs
(controllers XB and XD in a single-engine quad-controller device, and controller XB in a single-
engine dual-controller device).
Take a single-engine quad-controller storage device with controller 0A/0B/0C/0D as an example:
You can perform an online upgrade if the following networking mode is used: 0A and 0B, 0A and
0D, 0B and 0C, or 0C and 0D have links, or three or more controllers have links.
You cannot perform an online upgrade if the following networking mode is used: only 0A and 0C
have links, only 0B and 0D have links, or there is only one link between the host
and all controllers.
During a customized batch upgrade, the host paths must meet the requirement that two or more
batches of controllers have available links to perform an online upgrade.
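The default-batch rule above can be sketched as a small Python check. It is an illustration only and assumes controller names end in the letter A, B, C, or D, as in 0A; the function name is hypothetical.

```python
EVEN_BATCH = {"A", "C"}  # controllers XA and XC (even IDs)
ODD_BATCH = {"B", "D"}   # controllers XB and XD (odd IDs)


def default_batches_redundant(linked_controllers):
    """True if the host has a link to the even batch and to the odd batch.

    linked_controllers: names such as "0A" of the controllers to which the
    host currently has at least one available link.
    """
    suffixes = {name[-1].upper() for name in linked_controllers}
    return bool(suffixes & EVEN_BATCH) and bool(suffixes & ODD_BATCH)


print(default_batches_redundant({"0A", "0D"}))  # one even, one odd controller
print(default_batches_redundant({"0A", "0C"}))  # both controllers in the even batch
```

The first call reports redundant paths and the second does not, because 0A and 0C restart in the same batch.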
Step 4 If the host paths are not redundant, perform the following operations:
1. Check whether the connection between the host and controller is normal. If the cable is
loose or removed, reinsert it.
2. Choose DeviceManager > Home > System to check whether controller service ports are
normal.
3. Log in to the host and check whether paths are normal by using the multipathing
software.
Step 5 Rectify the fault that causes the non-redundant paths. Then, perform the upgrade again. If the
fault cannot be rectified, contact technical support engineers.
----End
Question
The upgrade evaluation tool supports only the default batch upgrade (two batches) in the link
redundancy check of hosts. If the upgrade batch needs to be customized, the check result of
the upgrade evaluation tool may be incorrect. How do I deal with this?
Answer
In the case of customized upgrade batches, you need to manually check whether the host links
are redundant regardless of whether the front-end link redundancy check item of the upgrade
evaluation tool passes the check.
1. Link redundancy rule: After each batch of controllers is upgraded and restarted, the host
still has at least one available link.
2. Link redundancy check method: Take a Linux host installed with UltraPath as an
example. For other hosts or other multipathing software, see section Checking Host
Multipathing Link Status.
Step 1 Log in to a Linux host and run the upadm show version command to view the UltraPath
version, as shown in Figure 8-71.
Step 2 If the UltraPath version is 5.01.017 or earlier, run the upadm show array command to query
information about all disk arrays managed by UltraPath and obtain the value of Array ID in
the first column. If the UltraPath version is later than 5.01.017, go to Step 6.
Step 3 Run the upadm show lun array=<Array ID> command with the obtained Array ID to view
the path information of all LUNs, as shown in Figure 8-73.
Step 4 Check the Controller XX information in the preceding step, such as Controller 0A and
Controller 0B. Check the values of NumLunObjects in the command output. If the values
are greater than 0 and the value of DevState is OPTIMAL, there are available links on the
controller.
Step 5 Based on all available links, check whether at least one available link exists in every
controller in each batch to be upgraded.
Step 6 If the UltraPath version is later than 5.01.017, run the upadmin show vlun command (when
the UltraPath version is 8.01.051 or earlier) or the upadmin show vlun type=all command (when the
UltraPath version is later than 8.01.051) to query information about all LUNs managed by
UltraPath for Linux and to obtain values in the Vlun ID column.
Step 7 Run the upadm show vlun id=<Vlun ID> command (when the UltraPath version is 8.01.051 or
earlier) or the upadm show vlun id=<Vlun ID> type=all command (when the UltraPath version is
later than 8.01.051) to query path information about all LUNs.
Step 8 Check vlun paths and check whether at least one available link exists in every controller in
each batch to be upgraded.
----End
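The rule in Step 4 (NumLunObjects greater than 0 and DevState equal to OPTIMAL) can be sketched as follows. The sketch does not parse real upadm output: it assumes the values have already been copied into dictionaries, and the key names and sample rows are hypothetical.

```python
def controllers_with_links(controller_rows):
    """Return the controllers that have available links according to Step 4."""
    available = []
    for row in controller_rows:
        has_luns = row["NumLunObjects"] > 0
        is_optimal = row["DevState"].strip().upper() == "OPTIMAL"
        if has_luns and is_optimal:
            available.append(row["Controller"])
    return available


# Hypothetical values copied by hand from the command output.
rows = [
    {"Controller": "0A", "NumLunObjects": 4, "DevState": "OPTIMAL"},
    {"Controller": "0B", "NumLunObjects": 0, "DevState": "OPTIMAL"},
]
print(controllers_with_links(rows))
```

The resulting list is the input for the batch check in Step 5.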
Question
The current host compatibility evaluation tool can only evaluate hosts running on
VMware/Windows/Linux/Solaris/HP-UX/AIX. If the host connected to the disk array does
not run on these mainstream operating systems, how do I evaluate the host compatibility?
Answer
If the operating system of the host is not VMware/Windows/Linux/Solaris/HP-UX/AIX,
perform the following operations to check the host:
1. Perform basic connectivity check by referring to section Testing Host Connectivity.
2. If the Oracle database software is installed on the host, check the Oracle heartbeat
timeout parameters by referring to section Checking the Oracle Heartbeat Parameter.
3. Check whether the connection between the host and storage controller meets the online
upgrade requirements:
a. At least one available link exists between the host and a batch of controllers with
even IDs (controllers XA and XC in a single-engine quad-controller device and
controller XA in a single-engine dual-controller device).
b. At least one available link exists between the host and a batch of controllers with
odd IDs (controllers XB and XD in a single-engine quad-controller device, and
controller XB in a single-engine dual-controller device).
Question
Before the upgrade, if the controller restart tool is used to restart controllers or the disk array
upgrade tool is used to upgrade the disk array, a check item may fail. If Huawei technical
support engineers determine after analysis that the check item can be ignored, how do I ignore it?
Answer
Before ignoring a check item, ensure that the check item is evaluated by Huawei technical
support engineers. In addition, ensure that the check item does not affect host services or
cause upgrade failure after being ignored. Do not ignore it without permission. Otherwise,
unexpected serious results may occur.
Step 1 Go to the \tools\ArrayUpgrade directory in the tool installation directory. Open the
ignoreitem.txt file, as shown in Figure 8-76.
Find the option corresponding to the failed item and change the parameter to yes, as shown in
Figure 8-77.
Figure 1-1 Going to the installation directory of the tool for restarting controllers
Step 2 On the tool page for restarting controllers, click Retry, as shown in Figure 8-78.
Step 3 After the retry, the check item can be ignored. Click Ignore to continue the restart process.
----End
Answer
If UltraPath 8.0.0.0 or later is not installed on the host, the host I/O path switchover during
the controller software upgrade depends on the timeout mechanism of the Fibre Channel HBA
or iSCSI initiator when a controller is reset. How long I/Os are affected depends
on the timeout parameter. If the timeout parameter is set to a large value, the host software
may be unable to tolerate the I/O timeout. As a result, services on the host software may be
interrupted for a duration approximately equal to the value of the HBA or iSCSI timeout
parameter.
If the HBA or iSCSI timeout parameter cannot be checked or modified after the check,
confirm the following before continuing the upgrade:
1. A storage controller failure test has been performed with the current HBA configuration.
The test showed that, when multiple controllers are faulty, host software services are not
interrupted or the I/O impact on the host software is acceptable.
2. If the Oracle software is installed on the host, the heartbeat parameters have been checked
by referring to Checking the Oracle Heartbeat Parameter. If the Oracle ASM parameter
did not meet the online upgrade requirements, it has been modified according to
the modification method.
If the Oracle software is installed on the host, you must perform operations indicated in
section Checking the Oracle Heartbeat Parameter.
If the Oracle software is not installed on the host but you cannot be sure whether the I/O
timeout during the upgrade affects host services, it is strongly recommended that you
modify the HBA or iSCSI timeout parameters by referring to the troubleshooting cases of
the host compatibility evaluation tool.
Answer
The patch installed during the upgrade resolves upgrade failures caused by the target
version. Install patches that do not address upgrade issues after the upgrade is complete.
Answer
Step 1 Log in to the SVP host as svp_user (use SSH to connect to the IP address of SVP
management port 20).
Step 2 Run the su root command to switch to user root. The default password is Admin@12#$.
Step 3 Run the iptables -t nat -L command to check the current NAT mapping configuration. Find
the line number of to:172.17.126.12:3389. (For example, the line number is 1 and the
mapping port is tcp dpt:ms-wbt-server in the following figure.)
Step 4 Run the iptables -t nat -R PREROUTING [line number] -d [IP address of the SVP
management port] -p tcp --dport [port number to be changed into] -j DNAT --to
172.17.126.12:3389 command. The following figure uses the target port number 22111 as an
example.
Step 5 Run the iptables -t nat -L command to check whether the setting is successful. (For example,
if the following figure is displayed, the port number is changed into 22111 successfully.)
Step 6 Use the new port number to log in to the server. If the login fails, go to step 7.
Step 7 Check the firewall configuration by running iptables -L. In INPUT, locate the line number of
the rule for the mapping port (tcp dpt:ms-wbt-server) obtained in step 3. (For example, the
line number is 10 in the following figure.)
Step 8 Run the iptables -R INPUT [line number] -p tcp --dport [port number to be changed into] -
j ACCEPT command.
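The commands in Step 4 and Step 8 can be assembled as in the following sketch. It only builds the argument lists and does not run iptables; the function names are hypothetical, and 192.0.2.10 is a placeholder for the IP address of the SVP management port.

```python
SVP_TARGET = "172.17.126.12:3389"  # DNAT target quoted in Step 3 and Step 4


def _check_port(port):
    """Validate a TCP port number and return it as a string."""
    if not 1 <= int(port) <= 65535:
        raise ValueError(f"invalid TCP port: {port}")
    return str(int(port))


def nat_replace_rule(line_number, svp_ip, new_port):
    """Arguments for the Step 4 command (replace a PREROUTING rule)."""
    return ["iptables", "-t", "nat", "-R", "PREROUTING", str(line_number),
            "-d", svp_ip, "-p", "tcp", "--dport", _check_port(new_port),
            "-j", "DNAT", "--to", SVP_TARGET]


def input_replace_rule(line_number, new_port):
    """Arguments for the Step 8 command (replace an INPUT rule)."""
    return ["iptables", "-R", "INPUT", str(line_number),
            "-p", "tcp", "--dport", _check_port(new_port), "-j", "ACCEPT"]


# Line numbers 1 and 10 and port 22111 are the example values used in the steps.
print(" ".join(nat_replace_rule(1, "192.0.2.10", 22111)))
print(" ".join(input_replace_rule(10, 22111)))
```

Printing the commands before running them makes it easier to confirm the line number against the iptables -t nat -L and iptables -L output.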
9 Appendix
You can adjust the configuration by referring to the corresponding operating system configuration guide.
Cause
In the online upgrade of storage devices, controllers are upgraded in batches. By default,
controllers are upgraded in two batches by controller ID (for example, 0, 1, 2, or 3, which can
be controllers 0A, 0B, 0C, and 0D in a single-engine quad-controller device). One batch is
controllers with odd IDs and the other batch is controllers with even IDs. During the upgrade
and restart of each batch of controllers, there must be available links between the other
controllers and hosts to avoid host I/O interruption. Therefore, the links between the host
multipathing software and controllers must meet the following redundancy requirements:
1. At least one available link exists between the host and a batch of controllers with even
IDs (controllers XA and XC in a single-engine quad-controller device and controller XA
in a single-engine dual-controller device).
2. At least one available link exists between the host and a batch of controllers with odd
IDs (controllers XB and XD in a single-engine quad-controller device, and controller XB
in a single-engine dual-controller device).
Take a single-engine quad-controller storage device with controller 0A/0B/0C/0D as an
example:
You can perform an online upgrade if the following networking mode is used: 0A and
0B, 0A and 0D, 0B and 0C, or 0C and 0D have links, or three or more controllers have
links.
You cannot perform an online upgrade if the following networking mode is used: only 0A and
0C have links, only 0B and 0D have links, or there is only one
link between the host and all controllers.
1. The preceding networking requirements are the minimum networking requirements for an online
upgrade. To improve service reliability during the online upgrade, it is strongly recommended that
there be available links between the host and each controller or there be available links between the
host and a batch of controllers with odd IDs and between the host and a batch of controllers with
even IDs in every engine.
2. To ensure that each LUN mapped to the host has redundant links, this section describes how to
check link redundancy based on each LUN mapped to the host. If there are so many LUNs that the
LUNs cannot be checked one by one, you can randomly select some LUNs to check their link
redundancy after there are redundant links between the host and controllers.
3. The link redundancy check method described in this section applies only to Huawei UltraPath and
multipathing software delivered by the system. If third-party multipathing software is used, perform
a check by referring to the link check method provided by the third-party multipathing software and
preceding networking requirements.
4. If a customized batch upgrade is performed, ensure that there are still available links between the
host and controllers in other batches after each batch of controllers are upgraded and restarted. For
example, if controllers are upgraded one by one, ensure that available links exist between the host
and at least two controllers.
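The rule in note 4 for customized batches can be sketched as follows: for every batch, the host must still have a link to at least one controller outside that batch. The function name and the batch layout in the example are hypothetical.

```python
def host_survives_all_batches(linked_controllers, batches):
    """True if, while any one batch restarts, the host keeps at least one link.

    linked_controllers: controllers to which the host has an available link.
    batches: list of controller groups, in the order they are upgraded.
    """
    linked = set(linked_controllers)
    # An empty remainder means the host loses every link during that batch.
    return all(linked - set(batch) for batch in batches)


one_by_one = [["0A"], ["0B"], ["0C"], ["0D"]]
print(host_survives_all_batches({"0A"}, one_by_one))
print(host_survives_all_batches({"0A", "0C"}, one_by_one))
```

With controllers upgraded one by one, a host linked only to 0A fails the check, while a host linked to any two controllers passes, which matches the example in note 4.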
Step 2 Run the esxcli upadm show vlun command to view the Vlun ID field, as shown in
Figure 9-2. If the Vlun information is not displayed, UltraPath does not take over any LUNs. In this
case, refer to the link redundancy check method provided by VMware multipathing software.
Otherwise, go to the next step.
Step 3 Run the esxcli upadm show vlun -l <Vlun ID> command (when the UltraPath version is
8.01.051 or earlier), as shown in Figure 9-3, or the esxcli upadm show vlun -l <Vlun ID> -t all
command (when the UltraPath version is later than 8.01.051), as shown in Figure 9-4, to query
information about all LUN paths.
Step 4 View the vlun path information to check whether there are available paths in the Normal state
to a batch of controllers with odd IDs (controllers XB and XD in a single-engine quad-
controller device and controller XB in a single-engine dual-controller device) and to a batch
of controllers with even IDs (controllers XA and XC in a single-engine quad-controller device
and controller XA in a single-engine dual-controller device). In Figure 9-5, both controllers
0A and 0B have two paths, meeting the online upgrade requirements.
----End
Step 2 Run the esxcfg-mpath -b -d DeviceName command to obtain the path information of a
specified LUN. Obtain the last column of the command output (WWPN of the array controller
port).
Step 3 Convert each WWPN to a binary value, check the values of the 53rd to 56th bits, and
convert them into a decimal number. The resulting number indicates the controller ID, such as
controller 1 in Figure 9-8.
Step 4 Based on the calculated controller ID, check whether each LUN has at least one available path
to a batch of controllers with even IDs and a batch of controllers with odd IDs. If yes, an
online upgrade can be performed; otherwise, an online upgrade is unavailable.
----End
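The WWPN conversion in Step 3 can be sketched as follows. The guide does not state from which end the bits are numbered, so the sketch assumes by default that bit 1 is the most significant bit and offers a flag for the opposite convention; confirm the convention against the figure before relying on the result. The function name and the sample WWPN are hypothetical.

```python
def controller_id_from_wwpn(wwpn, count_from_msb=True):
    """Extract bits 53 to 56 of a 64-bit WWPN and return them as an integer."""
    digits = wwpn.lower().replace("0x", "").replace(":", "")
    if len(digits) != 16:
        raise ValueError("expected a 64-bit WWPN (16 hexadecimal digits)")
    bits = format(int(digits, 16), "064b")
    if count_from_msb:
        field = bits[52:56]  # assumption: bit 1 is the leftmost bit
    else:
        field = bits[8:12]   # alternative: bit 1 is the rightmost bit
    return int(field, 2)


# Made-up WWPN used only to exercise the function.
print(controller_id_from_wwpn("0000000000000100"))
```

Under the default assumption the four bits correspond to the 14th hexadecimal digit of the WWPN, so the conversion can also be read directly from the hexadecimal string.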
9.1.2.2 Windows
This section describes how to check multipathing link redundancy on a Windows host.
Step 2 If the UltraPath version is 8.01.051 or earlier, run the upadm show vlun command, as shown
in Figure 9-10. If the UltraPath version is later than 8.01.051, run the upadm show vlun
type=all command, as shown in Figure 9-11. Check Vlun ID in the first column.
Step 3 Run the upadm show vlun id=<Vlun ID> command (when the UltraPath version is 8.01.051
or earlier) or the upadm show vlun id=<Vlun ID> type=all command (when the UltraPath version
is later than 8.01.051) to query path information about all LUNs.
Step 4 View the vlun path information to check whether there are available paths in the Normal state
to a batch of controllers with odd IDs (controllers XB and XD in a single-engine quad-
controller device and controller XB in a single-engine dual-controller device) and to a batch
of controllers with even IDs (controllers XA and XC in a single-engine quad-controller device
and controller XA in a single-engine dual-controller device). If both planes have available
links, an online upgrade can be performed. Otherwise, an online upgrade is unavailable.
----End
9.1.2.3 Linux
This section describes how to check multipathing link redundancy on a Linux host.
Step 2 If the UltraPath version is 5.01.017 or earlier, run the upadm show array command to query
information about all disk arrays managed by UltraPath and obtain the value of Array ID in
the first column. If the UltraPath version is later than 5.01.017, go to Step 6.
Step 3 Run the upadm show lun array=<Array ID> command with the obtained Array ID to view
the path information of all LUNs, as shown in Figure 9-14.
Step 4 Check the Controller XX information in the preceding step, such as Controller 0A and
Controller 0B. Check the values of NumLunObjects in the command output. If the values
are greater than 0 and the value of DevState is OPTIMAL, there are available links on the
controller.
Step 5 Based on all available links, check whether there are available links to a batch of controllers
with odd IDs (controllers XB and XD in a single-engine quad-controller device and controller
XB in a single-engine dual-controller device) and to a batch of controllers with even IDs
(controllers XA and XC in a single-engine quad-controller device and controller XA in a
single-engine dual-controller device). If both planes have available links, an online upgrade
can be performed. Otherwise, an online upgrade is unavailable.
Step 6 If the UltraPath version is later than 5.01.017, run the upadmin show vlun command (when
the UltraPath version is 8.01.051 or earlier) or upadmin show vlun type=all (when the
UltraPath version is later than 8.01.051) to query information about all LUNs managed by
UltraPath for Linux and to obtain values in the Vlun ID column.
Step 7 Run the upadmin show vlun id=<Vlun ID> command (when the UltraPath version is 8.01.051
or earlier) or the upadmin show vlun id=<Vlun ID> type=all command (when the UltraPath version
is later than 8.01.051) to query path information about all LUNs.
Step 8 View the vlun path information to check whether there are available paths in the Normal state
to a batch of controllers with odd IDs (controllers XB and XD in a single-engine quad-
controller device and controller XB in a single-engine dual-controller device) and to a batch
of controllers with even IDs (controllers XA and XC in a single-engine quad-controller device
and controller XA in a single-engine dual-controller device). If both planes have available
links, an online upgrade can be performed. Otherwise, an online upgrade is unavailable.
----End
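The version branching in Step 2 and Step 6 can be sketched as follows. The sketch assumes UltraPath version strings are dot-separated numbers that compare field by field; the helper names are hypothetical.

```python
def version_tuple(version):
    """Turn a string such as 8.01.051 into a comparable tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def lun_listing_command(ultrapath_version):
    """Pick the first query command for the installed UltraPath version."""
    current = version_tuple(ultrapath_version)
    if current <= version_tuple("5.01.017"):
        return "upadm show array"
    if current <= version_tuple("8.01.051"):
        return "upadmin show vlun"
    return "upadmin show vlun type=all"


print(lun_listing_command("8.01.051"))
```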
Step 2 Check whether there are disks mapped by Huawei storage devices based on the information
displayed in the third column. The following types of disks belong to Huawei storage devices
(case insensitive): Huawei|huasy|symantec|hs|eisoo|udsafe|marstor|sanm|anystor|sugon|
netposa. If no Huawei disk exists, skip this section.
Step 3 After obtaining the Huawei disk list, classify the disks according to the first three digits in the
first column. As shown in Figure 9-16, there are four types of disks: 0:0:0, 0:0:1, 1:0:0, and
1:0:1, indicating the port number of different targets reported by the storage device.
Step 4 Run the cat /sys/class/fc_transport/targetX\:X\:X/port_name or cat
/sys/class/fc_transport/targetX:X:X/port_name command (X:X:X is the first three digits in
the first column obtained in the previous step, for example, 0:0:0, 0:0:1, 1:0:0, or 1:0:1) to
query the array controllers whose targets are connected to the host. The query result is a
hexadecimal port name (WWPN), as shown in Figure 9-17.
Step 5 Convert each port name (WWPN) into a binary value, check the values of the 53rd to 56th
bits, and convert the values into a decimal number. The converted number indicates the
controller ID, such as controller 0 in Figure 9-18.
Step 6 Based on the calculated controller ID, check whether each LUN has at least one available path
to a batch of controllers with even IDs and a batch of controllers with odd IDs. If yes, an
online upgrade can be performed; otherwise, an online upgrade is unavailable.
----End
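The grouping in Step 3 and the file paths in Step 4 can be sketched as follows. The sketch assumes the first column has already been collected as strings of the form [H:C:T:L]; the bracketed format, the sample addresses, and the function name are assumptions.

```python
def target_port_name_paths(scsi_addresses):
    """Group addresses by their first three fields and build the sysfs paths."""
    targets = set()
    for address in scsi_addresses:
        fields = address.strip("[]").split(":")
        targets.add(":".join(fields[:3]))  # keep H:C:T, drop the LUN number
    return [f"/sys/class/fc_transport/target{t}/port_name" for t in sorted(targets)]


# Hypothetical first-column values for Huawei disks.
addresses = ["[0:0:0:1]", "[0:0:0:2]", "[0:0:1:1]", "[1:0:0:1]", "[1:0:1:1]"]
for path in target_port_name_paths(addresses):
    print(path)
```

Each printed path is a file to read with cat, as described in Step 4.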
9.1.2.4 Solaris
This section describes how to check multipathing link redundancy on a host running Solaris.
Step 2 Run upadm show vlun or upadmin show vlun (when the UltraPath version is 8.01.051 or
earlier), or upadm show vlun type=all or upadmin show vlun type=all (when the UltraPath
version is later than 8.01.051) to query all vlun information managed by UltraPath. Obtain
Vlun ID in the first column, as shown in Figure 9-20.
If the message "can't find any vlun" is displayed, UltraPath does not manage any Huawei disks. In this
case, skip this section.
Step 3 Run upadm show vlun id=<Vlun ID> or upadmin show vlun id=<VLun ID> (when the
UltraPath version is 8.01.051 or earlier), or upadm show vlun id=<Vlun ID> type=all or
upadmin show vlun id=<VLun ID> type=all (when the UltraPath version is later than
8.01.051) to view the path information of all LUNs, as shown in Figure 9-21.
Step 4 View the vlun path information to check whether there are available paths in the Normal state
to a batch of controllers with odd IDs (controllers XB and XD in a single-engine quad-
controller device and controller XB in a single-engine dual-controller device) and to a batch
of controllers with even IDs (controllers XA and XC in a single-engine quad-controller device
and controller XA in a single-engine dual-controller device). If both planes have available
links, an online upgrade can be performed. Otherwise, an online upgrade is unavailable.
----End
If the command output is empty, no disk managed by STMS exists. In this case, skip this section.
Step 2 For each STMS disk, run the mpathadm show lu /dev/rdsk/XXXX command to check
whether Vendor indicates Huawei disks, as shown in Figure 9-23.
1. If the Vendor field is one of the following fields (case-insensitive), the disk is from Huawei:
Huawei|huasy|symantec|hs|eisoo|udsafe|marstor|sanm|anystor|sugon|netposa.
2. If none of the disks are Huawei disks, skip this section.
Step 3 As shown in Figure 9-23, view each Target Port Name under Paths, convert each port name
(WWPN) into a binary value, check the values of the 53rd to 56th bits, and convert the values
into a decimal number. The converted number indicates the controller ID, such as controller 0
in Figure 9-24.
Step 4 Based on the calculated controller ID, check whether each LUN has at least one available path
to a batch of controllers with even IDs and a batch of controllers with odd IDs. If yes, an
online upgrade can be performed; otherwise, an online upgrade is unavailable.
----End
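The vendor check in Step 2 can be sketched as a case-insensitive lookup against the identifiers listed in the note. The function name is hypothetical; the identifier list is copied from this section.

```python
HUAWEI_VENDOR_IDS = (
    "huawei", "huasy", "symantec", "hs", "eisoo", "udsafe",
    "marstor", "sanm", "anystor", "sugon", "netposa",
)


def is_huawei_disk(vendor_field):
    """True if the Vendor field matches one of the listed identifiers."""
    return vendor_field.strip().lower() in HUAWEI_VENDOR_IDS


print(is_huawei_disk("HUAWEI"))
print(is_huawei_disk("OtherVendor"))
```

The same lookup applies wherever this identifier list appears in the guide, for example to the Description field on HP-UX.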
9.1.2.5 HP-UX
This section describes how to check multipathing link redundancy in an HP-UX system.
Huawei UltraPath is not supported in HP-UX. This section describes how to check link
redundancy with HP-UX multipathing software.
Step 2 If the version is HP-UX 11i v1/v2, that is, B.11.11 or B.11.23, go to the next step. Otherwise,
skip this section.
Step 3 Run the ioscan -fnC fc command to obtain the H/W Path list of the current HP-UX HBA, as
shown in Figure 9-26.
Step 4 Run the ioscan -funC disk command to obtain the mapped disk device list. The obtained
values of H/W Path are the same as those obtained in the previous step and Description
contains HUAWEI device information, for example: /dev/dsk/c23t0d2, as shown in Figure
9-27.
Step 5 After obtaining the device information, run the pvdisplay xxx| grep 'PV Name' command. In
the command, xxx indicates the device name obtained in the previous step, for example,
/dev/dsk/c23t0d2. If the command output does not contain Alternate Link, the disk has no
redundant links and no further check is required.
Step 6 Run the fcmsutil /dev/xxx get remote all | grep 'Target Port World Wide Name' command
with the driver information obtained in Step 3. In the command, xxx indicates the driver name
and is used to query information about the array targets connected to the port, as shown in
Figure 9-29.
Step 7 As shown in Figure 9-29, query each Target Port World Wide Name, convert each port
name (WWPN) into a binary value, check the values of the 53rd to 56th bits, and convert the
values into a decimal number. The converted number indicates the controller ID, such as
controller 0 in Figure 9-30.
Step 8 Based on the calculated controller ID, check whether each LUN has at least one available path
to a batch of controllers with even IDs and a batch of controllers with odd IDs. If yes, an
online upgrade can be performed; otherwise, an online upgrade is unavailable.
----End
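The final check in Step 8 can be sketched as follows, once the controller IDs have been calculated for the paths of each LUN. The dictionary layout, the function name, and the second device name in the example are hypothetical.

```python
def luns_without_redundancy(controller_ids_by_lun):
    """Return the LUNs that lack a path to the even batch or to the odd batch."""
    failing = []
    for lun, controller_ids in controller_ids_by_lun.items():
        has_even = any(cid % 2 == 0 for cid in controller_ids)
        has_odd = any(cid % 2 == 1 for cid in controller_ids)
        if not (has_even and has_odd):
            failing.append(lun)
    return sorted(failing)


paths = {
    "/dev/dsk/c23t0d2": [0, 1],  # paths to controllers 0 and 1
    "/dev/dsk/c23t0d3": [0, 2],  # hypothetical device with even controllers only
}
print(luns_without_redundancy(paths))
```

An empty result means every LUN has a path to both batches, so an online upgrade can be performed.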
Step 2 If the system is HP-UX 11i v3 (B.11.31), run the scsimgr get_attr -a leg_mpath_enable
command to check whether NMP is installed and enabled, as shown in Figure 9-32. If both
current and default are true, NMP is installed and enabled. Go to the next step; otherwise,
end the check.
Step 3 Run the ioscan -funNC disk command to obtain information about all disks on the host.
HUAWEI and CLAIMED disks are obtained based on Description and S/W State, as
shown in Figure 9-33.
If the Description field matches one of the following strings (case-insensitive), the disk is from Huawei:
Huawei|huasy|symantec|hs|eisoo|udsafe|marstor|sanm|anystor|sugon|netposa.
If none of the disks is a Huawei disk, skip this section.
Step 4 Run the ioscan -P health | grep lunpath command to obtain all online LUN WWPNs, as
shown in Figure 9-34.
Step 5 Run the scsimgr lun_map -D xxx command (xxx is the HUAWEI and CLAIMED disk
obtained in step 3) to obtain the disk path information, as shown in the following figure.
Step 6 View each LUN path. For every path whose State and Last Open or Close state are
ACTIVE, convert the WWN in its Hardware path into a binary value. Check the values of
the 53rd to 56th bits and convert them into a decimal number. The number indicates the
controller ID, such as controller 0 in Figure 9-36.
Step 7 Based on the calculated controller ID, check whether each LUN has at least one available path
to a batch of controllers with even IDs and a batch of controllers with odd IDs. If yes, an
online upgrade can be performed; otherwise, an online upgrade is unavailable.
----End
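The final redundancy decision (at least one path to an even-numbered controller and one to an odd-numbered controller for each LUN) can be sketched as follows. This is an illustrative bash helper under the assumption that the controller IDs of one LUN's active paths have already been calculated and are passed as decimal integers. The function name has_even_and_odd is hypothetical.

```shell
# Sketch: succeed (exit status 0) only if the given controller IDs of one LUN
# include at least one even ID and at least one odd ID.
# ASSUMPTION: every argument is a non-negative decimal integer.
has_even_and_odd() {
    local id even=0 odd=0
    for id in "$@"; do
        if (( id % 2 == 0 )); then
            even=1
        else
            odd=1
        fi
    done
    (( even == 1 && odd == 1 ))
}
```

A call such as has_even_and_odd 0 1 && echo "online upgrade possible for this LUN" must be repeated for every LUN, because the check in the procedure is per LUN.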
9.1.2.6 AIX
This section describes how to check multipathing link redundancy in an AIX system. In an
AIX system, the same link redundancy check method is used in UltraPath and AIX
multipathing software.
Step 1 Run the lsdev -Cc disk | grep FC command on the host to obtain the hdiskx disks taken over
by the host, as shown in Figure 9-37.
Step 2 Run the lscfg -vpl hdisk1 command (hdisk1 is taken as an example) to obtain disk
information. Check whether the disk is a Huawei disk based on the Manufacturer field and
record the Huawei disk information, as shown in Figure 9-38.
If the Manufacturer field matches one of the following strings (case-insensitive), the disk is from Huawei:
Huawei|huasy|symantec|hs|eisoo|udsafe|marstor|sanm|anystor|sugon|netposa.
If none of the disks is a Huawei disk, skip this section.
Step 4 Convert the name (WWPN) of each port whose status is Enabled into a binary value, check
the values of the 53rd to 56th bits, and convert them into a decimal number. The
converted number indicates the controller ID, such as controller 0 in Figure 9-40.
Step 5 Based on the calculated controller ID, check whether each LUN has at least one available path
to a batch of controllers with even IDs and a batch of controllers with odd IDs. If yes, an
online upgrade can be performed; otherwise, an online upgrade is unavailable.
----End
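The vendor check on the Manufacturer field can be sketched as a bash helper. This is a sketch under one assumption: the whole field, after trimming surrounding whitespace, must equal one of the listed strings (a substring match would make the short entry hs match unrelated vendors). The function name is_huawei_vendor is hypothetical.

```shell
# Sketch: succeed if a vendor field equals one of the strings listed in the
# guide, ignoring case and surrounding whitespace.
# ASSUMPTION: whole-field comparison, not substring matching.
is_huawei_vendor() {
    local field vendor
    field="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"
    for vendor in huawei huasy symantec hs eisoo udsafe marstor sanm \
                  anystor sugon netposa; do
        [ "$field" = "$vendor" ] && return 0
    done
    return 1
}
```

For example, is_huawei_vendor "HUAWEI" succeeds, while is_huawei_vendor "IBM" fails.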
Cause
In a VMware ESX system that uses an Emulex HBA on a Fibre Channel network, known problems in
some HBA driver versions prevent the host HBA from reporting link interruption events to the
upper-layer multipathing software when controller links are interrupted at the start of a
storage controller upgrade. As a result, the host multipathing link fails to be switched over,
or the host HBA fails to connect to the controllers after they restart and reconnect to the
host.
Check Method
Step 1 If the host does not communicate with the storage device through Fibre Channel links, the
check is passed. Otherwise, go to the next step.
Step 2 Use the SSH client tool to log in to the VMware ESX system and enter the CLI.
Step 3 Run the esxcli storage core adapter list command to check whether Link State is link-up
and Description contains Emulex items, as shown in Figure 9-41.
Step 4 If no Emulex item is displayed in Description or the value of Link State is not link-up,
no Emulex HBA is installed or enabled. Skip the following steps. Otherwise, go to the
next step.
Step 5 Check the value of LPexxxxx under Description in the previous step, which indicates the
HBA model, such as LPe11000 in step 3.
Step 6 Run the esxcfg-module -i lpfc | grep -i version command to query the Emulex driver version,
for example, Version: 11.1.0.6-1vmw.650.0.0.4564106, where Version: xx.x.x.x indicates the
HBA driver version.
In the preceding information, lpfc in esxcfg-module -i lpfc indicates the driver queried in Step 3.
Step 7 Run the vmware -i or vmware -v command to view the current VMware system version, for
example: VMware ESXi 5.5.0 Update 2.
Step 8 Compare the obtained VMware system version, HBA model, and driver version with the
entries in Table 9-1. If the information matches, risks exist.
Step 9 If the operating system, HBA model, and driver version match the information in the
preceding table, a link switchover or setup failure may occur due to HBA driver bugs during
the upgrade. As a result, host services are interrupted.
----End
Recommended Actions
You are advised to log in to the VMware official website:
http://www.vmware.com/resources/compatibility/search.php and download the driver of the
latest version to upgrade the HBA.
Cause
This risk exists when the Oracle database deployed on the host uses the Automatic Storage
Management (ASM) mode to manage disks and the Oracle version is 11.2.0.3 to 12.1.0.1. In these
versions, the default value of the ASM disk heartbeat parameter is 15 seconds (120 seconds by
default for other versions). If the I/O timeout exceeds 15 seconds due to controller service
switchover during the upgrade, the ASM disk is dismounted from the database. As a result, the
database breaks down.
Oracle official statement: https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=533716944762019&id=1581684.1&_afrWindowMode=0&_adf.ctrl-state=1dkv36pc56_4
This section describes how to check whether the Oracle ASM disk heartbeat parameters need
to be modified based on the Oracle version, configuration, host multipathing, and storage
system version.
If the HBA or iSCSI timeout parameters are modified by referring to 9.1.5 Checking the HBA Timeout
Parameter in a Fibre Channel Network or 9.1.6 Checking the Initiator Timeout Parameter in an iSCSI
Network, and the ASM disk timeout parameter is not modified after Oracle database installation, skip
this section.
Check Method
Step 1 The Oracle database is installed on the host. Check whether ASM disk groups are used to
manage database disks.
Other hosts: Run the ps -ef |grep pmon command. If the command output contains +ASM1, ASM
disk groups are used to manage disks.
Disks in an Oracle 11g/12c database can be managed by a file system or by ASM.
Step 2 If the Oracle database is not installed or ASM disk groups are not used to manage disks, the
check is passed. Otherwise, go to the next step.
Step 3 If the Oracle database uses ASM disk groups to manage disks but the Oracle database version
is not from 11.2.0.3 (included) to 12.1.0.1 (included), the check is passed. Otherwise, go to
the next step.
Note: You need to set environment variables in Windows for ASM administrator to access the
database. The path specified by ORACLE_HOME is the installation path. You need to find the
installation path of the Oracle database before setting the path.
Checking the database version information on other hosts
Step 4 Check the disk type. If the disk type is not high/normal, the check is passed. Otherwise, go to
the next step.
In ASM disk management, the following types of disks are managed: high, normal, and extern.
Step 5 Check the ASM disk timeout parameter. If the value is greater than or equal to 120 seconds,
the check is passed. Otherwise, go to the next step.
Step 6 Use the SSH client software to log in to the host where the Oracle database is installed. On the
CLI, run the upadm show version command to check whether UltraPath is installed and the
version of the multipathing software.
For details about how to check the UltraPath version, see the methods for each operating system in
Checking Host Multipathing Link Status.
Step 7 Use the SSH client software to log in to the storage system associated with the Oracle
database and run the show system general command to check the storage system version.
Run the show lun general | filterColumn include columnList=ID command to query the number of
storage LUNs.
Step 8 If the operating system, multipathing software type and version, storage system version of the
host where Oracle exists, and number of LUNs meet the following requirements, the check is
passed and no further check is required; otherwise, go to the next step.
Table 1-1 Version mapping without Oracle ASM timeout parameter modification
Host Multipathing and Version | Storage System Version | Number of Storage LUNs | Operating System Type of the Host Where the Oracle Database Exists | Recommended Actions
Step 9 If the version mapping does not match that in Table 9-2, modify the ASM timeout parameter
by referring to Modifying the Oracle ASM Timeout Parameter or modify the HBA/iSCSI
timeout parameter by referring to 9.1.5 Checking the HBA Timeout Parameter in a Fibre
Channel Network or 9.1.6 Checking the Initiator Timeout Parameter in an iSCSI Network.
Change the HBA/iSCSI timeout duration or Oracle ASM disk timeout duration of the host. You need to
modify only one of the two parameters.
Step 10 After the Oracle ASM disk timeout duration is modified, restart the Oracle database for the
modification to take effect (if the Oracle database is deployed in an RAC cluster, change the
disk timeout duration on each node). The host HBA timeout duration can be modified in some
hosts online but other hosts need to be restarted for the modification to take effect. Therefore,
use a solution that has minor impact on service continuity after evaluation.
Step 11 After the storage device is upgraded, the modified parameters can remain.
----End
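The version test in Step 3 (11.2.0.3 to 12.1.0.1, both included) can be sketched in bash as a field-by-field numeric comparison. This is a minimal sketch: the function names version_le and asm_heartbeat_check_needed are hypothetical, and the input is assumed to be a purely numeric dotted version string such as 11.2.0.4.

```shell
# Sketch: version_le A B succeeds if dotted version A <= B, comparing the
# fields numerically from left to right (missing fields count as 0).
version_le() {
    local IFS=.
    local -a a=($1) b=($2)
    local i x y
    for (( i = 0; i < ${#a[@]} || i < ${#b[@]}; i++ )); do
        x=${a[i]:-0}
        y=${b[i]:-0}
        (( 10#$x < 10#$y )) && return 0
        (( 10#$x > 10#$y )) && return 1
    done
    return 0
}

# Succeeds if the Oracle version lies in the affected range, that is, the
# check must continue with the next step instead of passing immediately.
asm_heartbeat_check_needed() {
    version_le 11.2.0.3 "$1" && version_le "$1" 12.1.0.1
}
```

For example, asm_heartbeat_check_needed 11.2.0.4 succeeds, while asm_heartbeat_check_needed 12.2.0.1 fails, which corresponds to "the check is passed" in Step 3.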
Step 2 Run the ls -l /proc/xx/fd command to view the user directory opened by the process ID (xx
indicates the system process ID recorded in the previous step). Check whether the command
output contains the ASM instance name and record the path before dbs (the path is the
ORACLE_HOME of grid).
Step 3 Switch to the ASM instance administrator, set environment variables ORACLE_HOME and
ORACLE_SID, and log in to the ASM instance by entering sqlplus / as sysasm to collect or
modify the ASM heartbeat parameter.
Step 4 Modify the _asm_hbeatiowait parameter by running alter system set "_asm_hbeatiowait"
=120 scope=spfile sid='*';.
Step 5 (Optional) Switch to the database administrator, log in to the database as an Oracle DBA user
(entering sqlplus / as sysdba), and stop the database instance.
Step 6 Go to the ORACLE_HOME directory of grid as user root and run the crsctl stop crs
command to stop the cluster service. In a single-node system, run the crsctl stop has
command instead.
Step 7 Go to the ORACLE_HOME directory of grid as user root and run the crsctl start crs
command to start the CRS service. In a single-node system, run the crsctl start has command
to start the CRS service.
Step 8 Go to the ORACLE_HOME directory of grid as user root and wait for 5 minutes to check
whether the database starts.
Step 9 Repeat steps 1 to 3 to log in to the ASM instance and check whether the modification of
parameter _asm_hbeatiowait takes effect by running the following command: select
a.ksppinm name, b.ksppstvl value, a.ksppdesc describe from x$ksppi a, x$ksppcv b
where a.inst_id = userenv('instance') and b.inst_id = userenv('instance') and a.indx =
b.indx and a.ksppinm like '\_asm_hbeatio%' escape '\' ;.
Step 10 Repeat the preceding operations on the other Oracle node.
----End
This section describes the timeout parameter check performed when the Oracle heartbeat parameter
check fails. If the Oracle database is not installed on the host, you do not need to perform the timeout
parameter check.
Cause
1. If the host is not installed with UltraPath 8.0.0.0 or later, during the controller software
upgrade, the host I/O path switchover depends on the timeout mechanism of the Fibre
Channel HBA or iSCSI initiator during controller reset. How long I/Os are affected
depends on the timeout parameter. If the timeout parameter is set to a large value, the
host software may be unable to tolerate the resulting I/O timeout, and services on the host
software may be interrupted.
1. If the host and storage device do not communicate with each other through a Fibre Channel network,
skip this section.
2. If a storage controller failure test has already been performed with the current HBA
configuration and, when multiple controllers are faulty, host software services are not
interrupted or the I/O impact on the host software is acceptable, you can skip this section.
Check Method
Step 1 Log in to the VMware ESX system and run the esxcli storage core adapter list command to
obtain the HBA information. QLogic and Emulex HBAs are used as an example, as shown in
Figure 9-49.
For non-QLogic or Emulex HBAs, query the HBAs according to the method provided by the HBA
vendor.
Step 2 In the command output of the previous step, record Driver (such as qlxx for QLogic
HBAs and lpfc for Emulex HBAs), for example, Driver: lpfc.
Step 3 For a QLogic HBA, run the esxcli system module parameters list -m Driver| grep
qlport_down_retry command. For an Emulex HBA, run the esxcli system module
parameters list -m Driver| grep lpfc_devloss_tmo command to query the HBA timeout
duration, as shown in Figure 9-50.
1. If nothing is displayed after int, the timeout parameter is set to the default 10 seconds.
2. In the command, Driver indicates the driver information obtained in Step 2.
Step 4 If the timeout duration queried in the previous step is longer than 10 seconds, change the
timeout duration to 10 seconds to reduce the impact on host I/Os during the upgrade.
----End
Recommended Actions
Step 1 For a QLogic HBA, run the esxcli system module parameters set -p
"qlport_down_retry=10" -m qlnativefc command to modify the timeout parameter.
Step 2 For an Emulex HBA, run the esxcli system module parameters set -p
"lpfc_devloss_tmo=10" -m lpfc command to modify the timeout parameter.
For Emulex and QLogic HBAs, the command for modifying the timeout parameter takes effect on all
HBA ports that use the same driver. If multiple HBAs of the same type are installed, you only
need to run the command once.
Step 3 After the modification, restart the VMware ESX system to make the timeout parameter take
effect.
----End
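The pairing of driver and timeout parameter used in the Check Method (qlport_down_retry for QLogic drivers, lpfc_devloss_tmo for the Emulex lpfc driver) can be sketched as a bash helper that only prints the command to run and changes nothing itself. This is a sketch: the function name hba_timeout_set_command is hypothetical, and treating every driver name that starts with ql as a QLogic driver is an assumption based on the qlxx notation in Step 2.

```shell
# Sketch: print (do not execute) the esxcli command that sets the HBA timeout
# for a given driver name. The default timeout of 10 seconds follows the guide.
# ASSUMPTION: driver names starting with "ql" belong to QLogic HBAs.
hba_timeout_set_command() {
    local driver="$1" seconds="${2:-10}" param
    case "$driver" in
        lpfc*) param="lpfc_devloss_tmo" ;;    # Emulex
        ql*)   param="qlport_down_retry" ;;   # QLogic
        *)     echo "unsupported driver: $driver" >&2; return 1 ;;
    esac
    printf 'esxcli system module parameters set -p "%s=%s" -m %s\n' \
        "$param" "$seconds" "$driver"
}
```

Printing the command first makes it possible to review it against the driver name recorded in the Check Method before running it on the ESX host.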
9.1.5.2 Windows
This section describes how to check and rectify the HBA timeout duration in a Windows Fibre
Channel network.
Check Method
Step 1 In a Windows environment, use the management tool provided by the HBA vendor to query
the HBA timeout duration. The following steps describe how to query the timeout duration of
QLogic and Emulex HBAs.
Step 2 For a QLogic HBA, install Fibre Channel Information Tool (fcinfo). The tool can be
downloaded from Microsoft's official website. After installing fcinfo, open the CMD CLI and
run fcinfo to get the HBA type, as shown in Figure 9-51.
Step 3 In the QLogic area on the Downloads page of QLogic's official website (see Figure 9-52),
select the corresponding HBA model and operating system, and click Go, as shown in Figure
9-53.
Step 4 Click Previously released versions under the Management Tools area.
Step 6 Install QConvergeConsole and view and set the timeout parameter, as shown in Figure 9-56.
Figure 1-1 Viewing and setting the timeout parameter for a QLogic HBA
Check whether Port Down Retry Count and Link Down Timeout are set to 10. If the two
parameters are not 10, set them to 10 and save the settings before you perform an online
upgrade.
If the QConvergeConsole tool cannot be used after being installed, download and install SANsurfer FC
HBA Manager in step 5, and then open the QConvergeConsole tool again.
Step 7 For an Emulex HBA, install OneCommand Manager. The software can be downloaded from
Broadcom's official website.
Step 8 Download OneCommand Manager based on the operating system type, as shown in Figure 9-58.
Step 9 Use OneCommand Manager to view and set the timeout parameter, as shown in Figure 9-59.
Figure 1-1 Viewing and setting the timeout parameter for an Emulex HBA
Check whether LinkTimeOut and NodeTimeOut are set to 10. If the two parameters are not
10, set them to 10 and save the settings before you perform an online upgrade.
----End
Recommended Actions
You are advised to set LinkTimeOut and NodeTimeOut to 10 and then perform an online
upgrade.
9.1.5.3 Linux
This section describes how to check and rectify the HBA timeout parameter in Linux.
1. The check methods described in this section cover only some operating systems. For operating
systems that are not listed, perform the check by referring to this section. If there is any
difference, refer to the documents provided by the HBA vendor.
2. The check methods described in this chapter cover only Emulex and QLogic HBAs. If other HBAs
are used, perform operations by referring to the documents provided by the HBA vendor.
3. If the HBA timeout parameter is not set to 5, you are advised to change the value to 5 to reduce the
impact on host I/Os during the upgrade.
9.1.5.3.1 SUSE 10
Check Method
Step 1 After logging in to the device, run the cat /etc/*-release command to obtain the operating
system version. If the operating system is SUSE Linux Enterprise Server 10, go to step 2.
Step 3 Run the cat /sys/class/fc_host/hostX/port_state command to query the port status. X in hostX
indicates the HBA port number. host* can also be used to query information about all the
hosts in Online state.
Step 4 Run the cat /sys/class/scsi_host/host*/model*name command to query the HBA model of all
hosts based on the hostX information recorded in step 3. Check whether the HBA comes from
Emulex or QLogic.
Step 5 Collect timeout parameters by running commands based on the HBA model.
If the HBA model is Emulex (starting with LPe), use the following method to query the
timeout parameters:
j. Run the cat /sys/class/scsi_host/host*/lpfc_devloss_tmo command to query the
timeout duration of all Fibre Channel ports.
If the HBA model is QLogic (starting with QLE), first download and install the
QConvergeConsole CLI plug-in from the QLogic official website, and then use the following
method to query the timeout parameter:
k. Run the qaucli command that can be executed only after the plug-in is installed.
After the qaucli command is executed, if the command output contains Please
Enter Selection, the command is executed successfully. Go to the next step.
n. On the page shown in the following figure, enter the port number of each HBA, for
example, 1, 2, 3, and 4.
ii. Enter 1 (Display HBA Parameters) to query HBA information. The value of
Link Down Timeout (seconds) is the HBA timeout duration.
iii. Enter 0 to go back to the previous directory. Repeat the steps to query
information about the next port. After all the ports are queried, input quit.
----End
Recommended Actions
1. For an Emulex HBA, run the echo X > /sys/class/scsi_host/hostY/lpfc_devloss_tmo
command to modify the HBA timeout parameters. X indicates the timeout parameter and
Y indicates the Fibre Channel port number. In the following figure, set the timeout
interval to 5 seconds.
s. Enter the number of the port for which you want to modify the timeout setting. Port
1 is used as an example here.
u. Set the timeout parameter. In the following figure, the timeout parameter is set to 15
seconds.
Check Method
Step 1 After logging in to the device, run the cat /etc/*-release command to obtain the operating
system version. If the operating system is SUSE Linux Enterprise Server 11, go to step 2.
linux:~ # cat /etc/*-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 3
LSB_VERSION="core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-
x86_64:core-3.2-x86_64:core-4.0-x86_64"
Step 3 Run the cat /sys/class/fc_host/hostX/port_state command to query the port status. X in hostX
indicates the HBA port number. Record information about all the hostX in Online state.
Step 4 Based on the hostX information recorded in step 3, run the cat
/sys/class/fc_host/host3/dev_loss_tmo command to query the HBA timeout duration of all
hosts, as shown in the following figure.
cat /sys/class/fc_host/host3/dev_loss_tmo
45
----End
Recommended Actions
Run the echo X > /sys/class/fc_host/hostY/dev_loss_tmo command to modify the HBA
timeout parameter, where X indicates the timeout parameter and Y indicates the Fibre Channel
port number. In the following figure, set the timeout parameter to 5 seconds.
echo 5 > /sys/class/fc_host/host3/dev_loss_tmo
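Steps 3 and 4 of the check (find the ports in Online state, then read dev_loss_tmo for each) can be combined in one pass. The following bash function is a minimal sketch: the name report_dev_loss_tmo is hypothetical, and the optional directory argument exists only so that the function can be tried against a copy of the sysfs tree. It only reads values.

```shell
# Sketch: print "<hostN> <dev_loss_tmo>" for every FC port in Online state.
# The first argument is the fc_host directory (default: /sys/class/fc_host).
report_dev_loss_tmo() {
    local root="${1:-/sys/class/fc_host}" host state tmo
    for host in "$root"/host*; do
        [ -r "$host/port_state" ] || continue     # skips an unmatched glob too
        [ -r "$host/dev_loss_tmo" ] || continue
        state="$(cat "$host/port_state")"
        [ "$state" = "Online" ] || continue       # ignore ports that are not Online
        tmo="$(cat "$host/dev_loss_tmo")"
        echo "${host##*/} $tmo"
    done
}
```

For the example shown above, the output would contain a line such as host3 45, which indicates that the value differs from the recommended 5 seconds.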
Check Method
Step 1 After logging in to the device, run the cat /etc/*-release command to obtain the operating
system version. If the operating system is CentOS release 5.11, go to step 2.
[root@centos5 host0]# cat /etc/*-release
CentOS release 5.11 (Final)
Step 3 Run the cat /sys/class/fc_host/hostX/port_state command to query the port status. X in hostX
indicates the HBA port number. Record information about all the hostX in Online state.
Step 4 Run the cat /sys/class/scsi_host/hostX/model*name command to query the model of every
HBA port and record the results. X in hostX indicates the HBA port number. Or you can
replace hostX with host* to query the models of all HBA ports. HBA models include Emulex
and QLogic.
Step 5 Collect timeout parameters by running commands based on the HBA model.
If the HBA model is Emulex (starting with LPe), use the following method to query the
timeout parameter:
w. Collect a list of all Emulex HBA ports. Run the cat
/sys/class/scsi_host/hostX/lpfc_devloss_tmo command to obtain the timeout
duration of every HBA port one by one. X in hostX indicates the HBA port number.
Or you can replace hostX with host* to query the timeout duration of all the Fibre
Channel ports.
If the HBA model is QLogic (starting with QLE), use the following method to query the
timeout parameters:
y. Run the qaucli command that can be executed only after the plug-in is installed. If
Please Enter Selection can be seen in the command output, the command has been
executed successfully. Then go to the next step; otherwise, the timeout value is
Unknown and the collection ends.
z. When the command output ends with Please Enter Selection, enter 2 (Adapter
Configuration).
bb. Obtain the following information from the command output, including the HBA
model, port, and WWPN. (Information about ports in Link Down state does not
need to be collected).
cc. Perform the following steps for every port (ports 1, 2, 3, 4). Port 1 is used as an
example here:
i. Input the port number 1.
ii. Enter 1 (Display HBA Parameters) to query HBA information. The value of
Link Down Timeout (seconds) is the HBA timeout duration.
iii. Enter 0 to go back to the previous directory. Repeat the steps to query
information about the next port. After all the ports are queried, input quit.
----End
gg. Enter the number of the port for which you want to modify the timeout setting. Port
1 is used as an example here.
ii. Set the timeout parameter. In the following figure, the timeout parameter is set to 15
seconds.
Check Method
Step 1 After logging in to the device, run the cat /etc/*-release command to obtain the operating
system version. If the operating system is CentOS release 6.5, go to step 2.
cat /etc/*-release
CentOS release 6.5 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphi
CentOS release 6.5 (Final)
CentOS release 6.5 (Final)
[root@centos6 ~]#
Step 3 Run the cat /sys/class/fc_host/hostX/port_state command to query the port status. X in hostX
indicates the HBA port number. Record information about all the hostX in Online state.
Step 4 Run the cat /sys/class/scsi_host/hostX/model*name command to query the model of every
HBA port and record the results. X in hostX indicates the HBA port number. Alternatively,
replace hostX with host* to query the models of all HBA ports.
[root@centos6 ~]# cat /sys/class/scsi_host/host3/model*name
QLE2462
[root@centos6 ~]# cat /sys/class/scsi_host/host9/model*name
LPe11002
Step 5 Run the cat /sys/class/fc_host/hostX/dev_loss_tmo command to query the timeout setting of
every HBA port and record the results. X in hostX indicates the HBA port number. Or you can
replace hostX with host* to query the timeout settings of all HBA ports.
cat /sys/class/fc_host/host9/dev_loss_tmo
30
[root@centos6 ~]# cat /sys/class/fc_host/host3/dev_loss_tmo
45
----End
Recommended Actions
Run the echo X > /sys/class/fc_host/hostY/dev_loss_tmo command to modify the HBA
timeout parameter, where X indicates the timeout parameter and Y indicates the Fibre Channel
port number. In the following figure, set the timeout parameter to 5 seconds.
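The echo command above can be applied to every port in Online state with a short loop. This is a sketch only: the function name set_dev_loss_tmo is hypothetical, the optional directory argument exists so that the loop can be tried on a copy of the sysfs tree, and on a real host the function has to be run as root.

```shell
# Sketch: write the given timeout (in seconds) to dev_loss_tmo of every FC
# port in Online state and print the value read back afterwards.
# Second argument: fc_host directory (default: /sys/class/fc_host).
set_dev_loss_tmo() {
    local seconds="$1" root="${2:-/sys/class/fc_host}" host
    for host in "$root"/host*; do
        [ -w "$host/dev_loss_tmo" ] || continue
        [ "$(cat "$host/port_state" 2>/dev/null)" = "Online" ] || continue
        echo "$seconds" > "$host/dev_loss_tmo"
        echo "${host##*/}: dev_loss_tmo=$(cat "$host/dev_loss_tmo")"
    done
}
```

For example, set_dev_loss_tmo 5 sets the recommended value on all Online ports and prints one confirmation line per port.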
9.1.5.4 Solaris
This section describes how to check and rectify HBA timeout parameters in Solaris.
This section uses Oracle Solaris 10 and Oracle Solaris 11 as examples. For other operating system
versions, refer to this section. If there is any difference, refer to the documents provided by
the HBA vendor.
Check Method
Step 1 Log in to the Solaris OS as an administrator.
Step 2 Run the fcinfo hba-port command to obtain information about the HBA port whose State is
Online, as shown in Figure 9-64.
Step 3 Check the Manufacturer field to obtain the HBA vendor (Emulex or QLogic).
Step 4 If an Emulex HBA exists, run the cat /kernel/drv/emlxs.conf |grep linkup-delay command
to obtain the value of linkup-delay, that is, the value of the Emulex HBA timeout parameter.
Step 5 If a QLogic HBA exists, go to the QLogic official website to download and install the
QConvergeConsole CLI plug-in.
Step 6 After the QConvergeConsole CLI plug-in is installed, run the qaucli command, as shown in
Figure 9-65.
Step 8 Enter 3 (HBA Parameters), as shown in Figure 9-67. The QLogic HBA of the current system
is displayed.
Step 9 Enter the serial number of each HBA in sequence, such as 3. On the basic HBA information
page, enter 1 to display the HBA details, as shown in Figure 9-68.
----End
Recommended Actions
Step 1 For an Emulex HBA, change the value of linkup-delay in /kernel/drv/emlxs.conf to 5.
Step 2 If the HBA model is QLogic, first download and install the QConvergeConsole CLI plug-in
from the QLogic official website, and then use the following method to query the timeout
parameters:
1. Run the qaucli command that can be executed only after the plug-in is installed. After
the qaucli command is executed, if the command output contains Please Enter
Selection, the command is executed successfully. Go to the next step.
4. On the page shown in the following figure, enter the port number of each HBA, for
example, 1, 2, 3, and 4.
ll. Enter 1 (Display HBA Parameters) to query HBA information. The value of Link
Down Timeout (seconds) is the HBA timeout duration.
mm. Enter 0 to go back to the previous directory. Repeat the steps to query information
about the next port. After all the ports are queried, input quit.
----End
9.1.5.5 HP-UX
This section describes how to check and rectify the HBA timeout parameters in HP-UX. In
HP-UX, the timeout parameter of each LUN is queried instead.
Step 2 Run the ioscan -funC disk command to obtain the mapped disk device list. Match the current
H/W Path with H/W Path obtained in Step 1 and obtain the corresponding disk, for example,
/dev/dsk/c23t0d1.
Step 3 Run the vgdisplay -v | grep 'PV Name' command. Disks displayed in the command output
have PVs created.
Step 4 Run the pvdisplay /dev/dsk/c23t0d1 command to obtain the value of IO Timeout (Seconds).
The value is 30 seconds by default.
----End
Recommended Actions
Run the pvchange -t X Y command to modify the LUN timeout parameter. X indicates the
timeout parameter, and Y indicates the obtained disk. Check whether the modification is
successful. As shown in the following figure, set the timeout parameter to 5 seconds.
Check Method
Step 1 Run the ioscan -P health -C disk command to obtain the H/W Path values of the disks in
online state (such as 64000/0xfa00/0x1).
Step 2 Run the scsimgr lun_map -H 64000/0xfa00/0x20 | grep 'SCSI transport protocol'
command to check whether SCSI transport protocol is fibre_channel. If two records are
displayed, there are two paths.
If SCSI transport protocol is sas, it is a DVD-ROM or other kinds of SAS interface devices; if SCSI
transport protocol is sata, it is a local disk.
Step 3 Based on the value in the H/W Path column obtained in step 2 and fibre_channel, run the
scsimgr get_attr -H 64000/0xfa00/0x20 -a path_fail_secs command to obtain the current
value, which is the timeout duration of the LUN mapped to the HBA.
----End
Recommended Actions
Run the scsimgr set_attr -H 64000/0xfa00/0x20 -a path_fail_secs=X command to modify
the timeout setting of an HBA. X indicates the timeout threshold that you want to set. Then
run the scsimgr get_attr command to check whether the modification is successful. The
value of current in the command output is the timeout threshold in effect.
# scsimgr set_attr -H 64000/0xfa00/0x20 -a path_fail_secs=5
Value of attribute path_fail_secs set successfully
# scsimgr get_attr -H 64000/0xfa00/0x20 -a path_fail_secs
name = path_fail_secs
current = 5
default = 120
saved =
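When many LUNs have to be verified, the current value can be extracted from the get_attr output shown above instead of being read by eye. The following bash function is a minimal sketch that assumes the output format is exactly the "name = value" layout of the example. The function name scsimgr_current_value is hypothetical, and the function reads the command output from standard input.

```shell
# Sketch: read "scsimgr get_attr" output on stdin and print the value of the
# "current" line. ASSUMPTION: lines have the form "<name> = <value>" as in
# the example output of this section.
scsimgr_current_value() {
    awk -F'=' '$1 ~ /^[ \t]*current[ \t]*$/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim blanks around the value
        print $2
        exit
    }'
}
```

On the host, the function would be used as scsimgr get_attr -H 64000/0xfa00/0x20 -a path_fail_secs | scsimgr_current_value, and the result compared with the value that was set.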
9.1.5.6 AIX
This section describes how to check and rectify AIX HBA timeout parameters.
This operation may require restarting hosts or stopping services. Exercise caution before
performing this operation.
Check Method
Step 1 Run the lspath command on the AIX host. The command output shows how many ports
on the HBA are connected to the disk array. As shown in Figure 9-73, two Fibre Channel
ports on the host are connected to the disk array.
Step 2 Run the lsattr -El fscsi* command to check whether the attributes of each connected port are
modified, as shown in Figure 9-74. In the following figure, the value of dyntrk is yes and the
value of fc_err_recov is fast_fail, indicating that the port attribute value has been changed.
Otherwise, the attributes are not modified.
If DAS networking is used, set the value of fc_err_recov to delayed_fail and that of dyntrk to no.
In SAN networking mode, change the value of fc_err_recov to fast_fail and that of dyntrk to yes.
IBM has configuration requirements on the Fibre Channel network of an AIX system, that is, the
HBA attributes must be modified. This has nothing to do with the storage system.
----End
Recommended Actions
If attributes of all ports have been modified, skip the following steps.
Running the attribute modification command one time modifies attributes of only one port. You must
modify the attributes of all ports connected to the storage array.
If the host can be restarted, perform the following operations:
nn. Run the chdev -l fscsi* -a dyntrk=yes -P command to modify HBA attributes.
oo. Restart the host for the settings to take effect.
If services cannot be interrupted and multiple ports on the host are connected to the
storage array, perform the following operations (assuming that ports fscsi0 and fscsi1 are
connected to the storage array):
pp. Run rmdev -l fscsi0 -R to set the status of fscsi0 to defined.
qq. Run chdev -l fscsi0 -a dyntrk=yes and chdev -l fscsi0 -a fc_err_recov=fast_fail to
modify the attributes of fscsi0.
rr. Run cfgmgr -l fcs0 to establish a link to fscsi0.
ss. Run lspath to confirm that the link to fscsi0 is recovered.
tt. Run rmdev -l fscsi1 -R.
uu. Run chdev -l fscsi1 -a dyntrk=yes and chdev -l fscsi1 -a fc_err_recov=fast_fail to
modify the attributes of fscsi1.
vv. Run cfgmgr -l fcs1.
ww. Run lspath to confirm that the link to fscsi1 is recovered.
If services cannot be interrupted and only one port on the host is connected to the storage
array, port attributes cannot be modified. Negotiate with the customer over service
suspension.
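The port-by-port sequence for the case where services cannot be interrupted can be generated for any number of ports. The following bash function is a sketch that only prints the commands in the order given above, so that they can be reviewed and then run one port at a time. The function name aix_rolling_commands is hypothetical, and deriving the adapter name fcsN from the port name fscsiN is an assumption taken from the fscsi0/fcs0 and fscsi1/fcs1 pairs in the example. Confirm the actual pairing on the host.

```shell
# Sketch: print (do not execute) the per-port command sequence from the guide.
# ASSUMPTION: port fscsiN belongs to adapter fcsN, as in the guide's example.
aix_rolling_commands() {
    local port n
    for port in "$@"; do
        n="${port#fscsi}"                        # fscsi0 -> 0
        echo "rmdev -l $port -R"                 # set the port to defined
        echo "chdev -l $port -a dyntrk=yes"
        echo "chdev -l $port -a fc_err_recov=fast_fail"
        echo "cfgmgr -l fcs$n"                   # re-establish the link
        echo "lspath"                            # confirm the path recovered
    done
}
```

For example, aix_rolling_commands fscsi0 fscsi1 prints the ten commands for both ports. The lspath check after each port matters, because the next port should not be taken down until the paths of the previous one have recovered.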
This section describes the timeout parameter check performed when the Oracle heartbeat parameter
check fails. If the Oracle database is not installed on the host, you do not need to perform the timeout
parameter check.
Cause
1. If the host is not installed with UltraPath 8.0.0.0 or later, during the controller software
upgrade, the host I/O path switchover depends on the timeout mechanism of the Fibre
Channel HBA or iSCSI initiator during controller reset. How long I/Os are affected
depends on the timeout parameter. If the timeout parameter is set to a large value, the
host applications may be unable to tolerate the resulting I/O timeout, and services on the
host applications may be interrupted.
1. If the host and storage device do not communicate with each other through an iSCSI network, skip
this section.
2. If a storage controller failure test has already been performed with the current iSCSI
configuration parameters, and host application services were not interrupted when multiple
controllers failed (or the I/O impact on host applications was acceptable), skip this
section.
This operation may require restarting hosts or stopping services. Exercise caution before
performing this operation.
Check Method
Step 1 Log in to the VMware ESX system and run the esxcfg-scsidevs -a command to check the
iSCSI initiator name. As shown in Figure 9-75, the iSCSI initiator is vmhba35.
If multiple online iSCSI vmhba adapters exist, check and modify their settings one by one.
Step 2 Run the esxcli iscsi adapter param get -A vmhba35 command (vmhba35 is the iSCSI
initiator name queried in step 1) to check the values of NoopOutInterval, NoopOutTimeout,
and RecoveryTimeout, as shown in Figure 9-76.
Figure 9-76 Checking the current parameter settings of the iSCSI vmhba adapter
Recommended Actions
Step 1 Run the following commands to modify the parameter settings of the iSCSI vmhba adapter
(the values specified in the following commands are the optimal values obtained from tests):
esxcli iscsi adapter param set -A vmhba35 -k NoopOutInterval -v 1
esxcli iscsi adapter param set -A vmhba35 -k NoopOutTimeout -v 10
esxcli iscsi adapter param set -A vmhba35 -k RecoveryTimeout -v 1
1. The preceding parameters are recommended values. Set them based on the site
requirements.
2. After the upgrade, restore the values of the timeout parameters to the original values.
Step 2 Restart the ESXi host. The modification takes effect only after the restart.
Step 3 Check whether the settings take effect after the ESXi host is started.
----End
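Because every online iSCSI vmhba adapter must be modified and the original values must be restored after the upgrade, it can help to generate the commands from one table of settings. The following is a minimal sketch; the function name and the dictionary layout are illustrative assumptions, while the command form and the three values are the ones listed in Step 1.

```python
# Values recommended in Step 1 of the procedure above.
RECOMMENDED = {"NoopOutInterval": 1, "NoopOutTimeout": 10, "RecoveryTimeout": 1}


def esxi_set_commands(settings):
    """Build one 'param set' command per adapter and parameter.

    settings maps an adapter name (for example vmhba35) to a dict of
    parameter name -> value. Passing the values recorded before the change
    produces the commands that restore them after the upgrade.
    """
    commands = []
    for adapter, params in settings.items():
        for key, value in params.items():
            commands.append(
                f"esxcli iscsi adapter param set -A {adapter} -k {key} -v {value}"
            )
    return commands


if __name__ == "__main__":
    for command in esxi_set_commands({"vmhba35": RECOMMENDED}):
        print(command)
```

Record the values reported in the check step for each adapter before applying the new ones, and feed that record to the same function after the upgrade to restore them.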
9.1.6.2 Windows
This section describes how to check and modify the iSCSI initiator timeout value of a
Windows host.
This operation may require restarting hosts or stopping services. Exercise caution before
performing this operation.
Check Method
Step 1 Open the CLI and run the following commands to check the timeout parameters of the iSCSI
initiator.
iscsiConfig get timeout value
iscsiConfig get linkdowntime value
Step 2 If the timeout parameter is not set to 5, you are advised to set this parameter to 5 to reduce the
impact on I/Os during the upgrade.
----End
Recommended Actions
Step 1 If the values of the timeout parameters are not 5, run the following commands to set the
values to 5:
iscsiConfig set timeout 5
iscsiConfig set linkdowntime 5
Step 2 If the iscsiConfig command cannot be executed using the CLI, manually open registry key
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318},
find the numbered subkeys marked with a plus sign (+), and click the Parameters subkey of each
to check whether LinkDownTime and MaxRequestHoldTime are set to 5. If not, set the parameters
to 5, as shown in Figure 9-77.
Figure 9-77 Querying and setting timeout parameters through the registry
After the upgrade, restore the values of the timeout parameters to the original values.
9.1.6.3 Linux
This section describes how to check and modify the iSCSI initiator timeout value of a Linux
host.
Check Method
Step 1 Run iscsiadm -m node -p iSCSI service IP address | grep replacement_timeout, as shown
in Figure 9-78. Check the timeout parameters of all iSCSI service IP addresses.
Step 2 If the timeout parameter is not set to 1, you are advised to set this parameter to 1 to reduce the
impact on I/Os during the upgrade.
----End
Recommended Actions
Run the iscsiadm -m node -o update -n node.session.timeo.replacement_timeout -v 1
command to modify the configuration, as shown in Figure 9-79.
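When several iSCSI service IP addresses are configured, the check in Step 1 can be scripted by parsing the grep output. The sketch below assumes the usual open-iscsi line form "node.session.timeo.replacement_timeout = 120"; the function name is hypothetical, and the target value 1 is the one recommended above.

```python
def timeouts_needing_update(grep_output, target=1):
    """Return the replacement_timeout values that differ from the target value."""
    mismatches = []
    for line in grep_output.splitlines():
        if "replacement_timeout" not in line:
            continue
        _, separator, value = line.partition("=")
        if not separator:
            continue  # not a "name = value" line, ignore it
        timeout = int(value.strip())
        if timeout != target:
            mismatches.append(timeout)
    return mismatches


if __name__ == "__main__":
    sample = (
        "node.session.timeo.replacement_timeout = 120\n"
        "node.session.timeo.replacement_timeout = 1\n"
    )
    print(timeouts_needing_update(sample))
```

An empty result means that every node record already uses the target value and no modification is needed.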
If the check fails or you need to perform the check again, click Perform Upgrade.
If a failed item exists, do not perform an upgrade. You need to rectify the system fault and
perform the pre-upgrade check again.
For details about how to resolve the failed items, see Table 9-3.
upgrade requirements.
Check item: Disk usage
Succeeded: The disk usage is lower than the threshold. Recommended action: None.
Failed: The disk usage is higher than the threshold. Recommended action: Check whether
background disks are being formatted. If yes, wait until the formatting is complete. If no,
reduce the service load and perform the upgrade again.
Check item: System software compatibility
Succeeded: The system software version in the memory is compatible with that on the disks.
Recommended action: None.
Failed: The system software version in the memory is not compatible with that on the disks.
Recommended action: Contact technical support engineers.
Check item: Whether all engines are in dual-controller mode
Succeeded: All engines are in dual-controller mode. Recommended action: None.
Failed: Not all engines are in dual-controller mode. Recommended action: Contact technical
support engineers.
Check item: Whether online disk diagnosis is on
Succeeded: No disks are being diagnosed. Recommended action: None.
Failed: One or more disks are being diagnosed. Recommended action: Wait 10 minutes and retry.
If the retry fails, contact technical support engineers.
Check item: Pool status
Succeeded: The health status and running status of each pool are Normal and Online,
respectively. Recommended action: None.
Failed: The pool status is abnormal. Recommended action: Contact technical support engineers.
Check item: Workload on front-end ports
Succeeded: The workload on each front-end port (Fibre Channel, Eth, and FCoE) meets
requirements. (The occupied bandwidth does not exceed 80% of the theoretical value.)
Recommended action: None.
Failed: The workload on front-end ports does not meet requirements. Recommended action:
Contact technical support engineers.
For details about how to upgrade SAN HyperMetro arrays, see section 9.4.1.
For details about how to upgrade NAS HyperMetro arrays, see section 9.4.2.
If the HyperMetro arrays are members of DR Star, you are advised to upgrade the secondary end of the
asynchronous remote replication first. The upgrade sequence of the HyperMetro sites is subject to the
HyperMetro upgrade guide.
Table 1-1 Mappings between the HyperMetro solution upgrade and storage array upgrade
Upgrade without interrupting host services (online upgrade mode of storage devices)
Application scenario:
Host services are not interrupted during the upgrade. There are two available methods:
Method 1: More controllers provide host services during the upgrade, improving performance
and reliability.
Method 2: The upgrade procedure is simpler and the upgrade duration is shorter.
Method 1
Procedure:
1. Upgrade UltraPath.
2. Select a storage device that you want to upgrade. Suspend its services and
HyperMetro pairs.
3. Upgrade the quorum server.
4. Disable the working controller switchover function of LUNs.
5. Upgrade the storage array whose services are suspended in online mode.
6. Start HyperMetro synchronization. After the synchronization is complete,
the HyperMetro pairs are in the normal state.
7. Select the other storage array that you want to upgrade. Suspend its
services and HyperMetro pairs.
8. Upgrade the storage device (the other one) online whose services are
suspended.
9. Start HyperMetro synchronization after the storage arrays are upgraded.
After the synchronization is complete, the HyperMetro pairs are in the
normal state.
Method 2
Procedure:
1. Upgrade UltraPath.
2. Select a storage device that you want to upgrade. Suspend its services and
HyperMetro pairs.
3. Upgrade the quorum server.
4. Disable the working controller switchover function of LUNs.
Procedure
Step 1 Check whether the network is a standard HyperMetro network. If the network is not a
standard HyperMetro network, modify the network before performing an upgrade.
For details about the networking standards, go to SAN HyperMetro networking standard.
Step 2 In OceanStor DeviceManager, verify that the status of HyperMetro consistency groups and
pairs is normal, as shown in Figure 9-82 and Figure 9-83.
Figure 1-1 Checking the HyperMetro LUN ID on a storage array installed with UltraPath
Step 4 Check whether the paths between active-active VLUNs and storage arrays are normal.
Take a Linux host as an example. Run the upadmin show vlun id=ID type=hypermetro
command on the host CLI to check whether the paths between active-active VLUNs and
storage arrays are normal, as shown in Figure 9-85. On an AIX, Solaris, or Windows host,
run the upadm show vlun id=ID type=hypermetro command instead. On a vSphere host, run the
esxcli upadm show vlun -l ID -t hypermetro command.
Check whether the pre-upgrade check is passed. If any exception occurs, contact engineers for
help.
----End
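The path check in Step 4 uses a different command on each host type. A small lookup such as the following can avoid mix-ups when many hosts are checked. It is a sketch only: the function name is hypothetical, and it assumes upadmin on Linux and upadm on AIX, Solaris, and Windows, in line with the UltraPath CLI names used elsewhere in this guide.

```python
def show_vlun_command(os_name, vlun_id):
    """Return the UltraPath command that shows the paths of a HyperMetro VLUN."""
    os_key = os_name.strip().lower()
    if os_key == "linux":
        return f"upadmin show vlun id={vlun_id} type=hypermetro"
    if os_key in {"aix", "solaris", "windows"}:
        return f"upadm show vlun id={vlun_id} type=hypermetro"
    if os_key == "vsphere":
        return f"esxcli upadm show vlun -l {vlun_id} -t hypermetro"
    raise ValueError(f"unsupported host type: {os_name}")


if __name__ == "__main__":
    for host_type in ("Linux", "AIX", "vSphere"):
        print(host_type, "->", show_vlun_command(host_type, 1))
```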
Procedure
Step 1 If the UltraPath version installed on the host is V100R800C20 or earlier versions, upgrade the
version of OceanStor UltraPath by referring to the upgrade guide of OceanStor UltraPath.
The upgrade guide of OceanStor UltraPath can be obtained from OceanStor UltraPath Upgrade Guide.
If you cannot access the link, visit http://support.huawei.com/enterprise/. Click Login and enter your
user name and password. After login, choose Technical Support > Cloud Storage > Tools and
Platform > UltraPath.
Step 2 Log in to the OceanStor DeviceManager of the storage array that you want to upgrade. Open
the HyperMetro management page.
Step 3 On the HyperMetro Consistency Group tab page, select all HyperMetro consistency groups
for which Local Resource Role is Preferred, expand More, and click Pause. In the
displayed Suspend HyperMetro consistency group dialog box, select Preferred, and click
OK. Perform similar operations on the HyperMetro Pair tab page. See Figure 9-86 and
Figure 9-87.
Step 4 On the HyperMetro Consistency Group tab page, select all HyperMetro consistency groups
for which Local Resource Role is Non-Preferred, expand More, and click Pause.
In the displayed Suspend HyperMetro consistency group dialog box, select Non-Preferred,
and click OK. Perform similar operations on the HyperMetro Pair tab page. See
Figure 9-88 and Figure 9-89.
The suspension operation is different based on the local resource role of HyperMetro pairs,
as shown in Step 3 and Step 4. If the preferred sites of HyperMetro pairs are not on the
same end, you need to stop some LUNs in the preferred site and non-preferred site, to
prevent service interruption on the host during the upgrade.
If the HyperMetro pair belongs to the HyperMetro consistency group, you can only
operate the HyperMetro consistency group.
Step 5 On the HyperMetro management page, verify that the local resources of all HyperMetro
pairs cannot be accessed, as shown in Figure 9-90. If any local resource can still be
accessed, return to Step 3.
Step 6 Click quorum server software to obtain the quorum server software. In the product catalog,
select the desired product to go to the corresponding product page and view related
information. Download the upgrade package specific to your product.
If the link cannot be accessed, visit http://support.huawei.com/enterprise/. Click Login and enter your
user name and password, and choose Technical Support > Enterprise Storage.
Step 7 Upload the upgrade package to the quorum server. Go to the upgrade package path, and run
the ./quorum_server.sh -upgrade command to upgrade the software, as shown in Figure 9-91.
Before upgrading the quorum server software, ensure that the quorum server has sufficient storage space
(greater than or equal to 10 GB). Otherwise, the upgrade may fail due to insufficient space. If the
upgrade fails, run the ./quorum_server.sh -install command to reinstall the quorum server software.
During the quorum server software upgrade, a quorum link interruption alarm may be
generated and remain for about one minute on storage arrays. The alarm automatically
disappears after the quorum server is upgraded.
Step 8 Disable the working controller switchover function of LUNs (perform this step on OceanStor
V300R003C00SPC100 storage systems. Later versions do not require this step).
Operations in AIX/Linux/Solaris/Windows: Run the upadmin command (upadm command
for AIX/Solaris/Windows) on a host to go to the CLI of UltraPath. Run the set
luntrespass={ on | off } [ array_id=ID | vlun_id={ ID | ID1,ID2... | ID1-ID2 } ] command to
disable the working controller switchover function of LUNs. Then run the show upconfig
[ array_id=ID | vlun_id=ID ] command to confirm the function is disabled successfully, as
shown in Figure 9-92.
Operations in vSphere: Run the esxcli upadm command on a host to log in to the CLI of
vSphere. Run the set luntrespass [ -a array-id | -l vlun-id ] -m off command to disable the
working controller switchover function of LUNs. Then run the show upconfig [ -a array-id |
-l vlun-id ] command to confirm the function is disabled successfully.
Step 9 Upgrade the controller software of the storage array where the services are suspended (local
resources cannot be accessed). See chapter 5 Performing the Upgrade.
During the upgrade of the array controller, the remote resource name of the HyperMetro
arrays is displayed as -- and the connection status is disconnected. After the array controller is
upgraded, the name and status of the remote resource return to normal.
Step 10 After the array controller software is upgraded successfully, log in to HyperMetro
management page of DeviceManager, and start synchronization of all HyperMetro
consistency groups and pairs, as shown in Figure 9-93 and Figure 9-94.
HyperMetro data synchronization prolongs the upgrade duration but does not affect
services.
After an upgrade from a version earlier than V300R006C30 to V300R006C30 or later, the
HyperMetro consistency group may fail to be synchronized. For details, see section 9.4.3.2
What Can I Do when the HyperMetro Consistency Group Fails to Be Synchronized After
the Upgrade?.
Step 11 After all HyperMetro consistency groups and pairs are synchronized, the pair running status
becomes normal, as shown in Figure 9-95.
Step 12 Log in to OceanStor DeviceManager of the other storage array that you want to upgrade.
Open the HyperMetro management page, and suspend the local host access to all HyperMetro
consistency groups and HyperMetro pairs.
Step 13 On the HyperMetro Consistency Group tab page, select all HyperMetro consistency groups
for which Local Resource Role is Preferred, expand More, and click Pause. In the
displayed Suspend HyperMetro consistency group dialog box, select Preferred, and click
OK. Perform similar operations on the HyperMetro Pair tab page. See Figure 9-96 and
Figure 9-97.
Step 14 On the HyperMetro Consistency Group tab page, select all HyperMetro consistency groups
for which Local Resource Role is Non-Preferred, expand More, and click Pause.
In the displayed Suspend HyperMetro consistency group dialog box, select Non-Preferred,
and click OK. Perform similar operations on the HyperMetro Pair tab page. See
Figure 9-98 and Figure 9-99.
The suspension operation is different based on the local resource role of HyperMetro pairs,
as shown in Step 13 and Step 14. If the preferred sites of HyperMetro pairs are not on the
same end, you need to stop some LUNs in the preferred site and non-preferred site, to
prevent service interruption on the host during the upgrade.
If the HyperMetro pair belongs to the HyperMetro consistency group, you can only
operate the HyperMetro consistency group.
Step 15 On the HyperMetro management page, verify that the local resources of all HyperMetro
pairs cannot be accessed, as shown in Figure 9-100. If any local resource can still be
accessed, return to Step 13.
Step 16 Upgrade the controller software of the storage array where the services are suspended (local
resources cannot be accessed). See chapter 5 Performing the Upgrade.
Step 17 After both arrays are upgraded successfully, start HyperMetro consistency group and pair
synchronization, as shown in Figure 9-101 and Figure 9-102.
After an upgrade from a version earlier than V300R006C30 to V300R006C30 or later, the
HyperMetro consistency group may fail to be synchronized. For details, see section 9.4.3.2
What Can I Do when the HyperMetro Consistency Group Fails to Be Synchronized After the
Upgrade?.
Step 18 Enable the working controller switchover function of LUNs (note that this step is performed
only when Step 8 is performed).
Operations in AIX/Linux/Solaris/Windows: Run the upadmin command (upadm command
for AIX/Solaris/Windows) on a host to go to the CLI of UltraPath. Run the set
luntrespass={ on | off } [ array_id=ID | vlun_id={ ID | ID1,ID2... | ID1-ID2 } ] command to
enable the working controller switchover function of LUNs. Then run the show upconfig
[ array_id=ID | vlun_id=ID ] command to confirm the function is enabled successfully, as
shown in Figure 9-103.
Operations in vSphere: Run the esxcli upadm command on a host to log in to the CLI of
vSphere. Run the set luntrespass [ -a array-id | -l vlun-id ] -m on command to enable the
working controller switchover function of LUNs. Then run the show upconfig [ -a array-id |
-l vlun-id ] command to confirm the function is enabled successfully.
----End
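The disable and enable operations in Step 8 and Step 18 differ only in the on/off value and in the platform syntax. The following sketch builds the command text for both cases. The function name is hypothetical; for vSphere it assumes that the set luntrespass subcommand is appended to esxcli upadm on one line, and for the other platforms it returns the command to enter inside the UltraPath CLI.

```python
def luntrespass_command(platform, mode, array_id=None, vlun_id=None):
    """Build the command that switches the working controller switchover function."""
    if mode not in ("on", "off"):
        raise ValueError("mode must be 'on' or 'off'")
    if array_id is not None and vlun_id is not None:
        raise ValueError("specify array_id or vlun_id, not both")
    if platform.strip().lower() == "vsphere":
        # Assumption: one-line form 'esxcli upadm set luntrespass ... -m <mode>'.
        parts = ["esxcli upadm set luntrespass"]
        if array_id is not None:
            parts.append(f"-a {array_id}")
        if vlun_id is not None:
            parts.append(f"-l {vlun_id}")
        parts.append(f"-m {mode}")
    else:
        # Entered inside the UltraPath CLI (upadmin on Linux, upadm elsewhere).
        parts = [f"set luntrespass={mode}"]
        if array_id is not None:
            parts.append(f"array_id={array_id}")
        if vlun_id is not None:
            parts.append(f"vlun_id={vlun_id}")
    return " ".join(parts)


if __name__ == "__main__":
    print(luntrespass_command("linux", "off", array_id=0))    # before the upgrade
    print(luntrespass_command("vsphere", "on", array_id=0))   # after the upgrade
```

After running either command, run show upconfig with the same array or VLUN selector to confirm the result, as the steps above describe.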
Step 3 Run the esxcli upadm show version command to check the UltraPath version. If the system
displays a message indicating that the command does not exist, UltraPath is not installed. In
this case, end the operation. If the UltraPath version is 8.06.061 or later, proceed to the next
step. Otherwise, end the operation.
[root@localhost:~] esxcli upadm show version
Software Version : 8.06.061
Driver Version : 8.06.061
[root@localhost:~]
Step 4 Run the esxcli upadm show upconfig | grep "APD to PDL Mode" command to check
whether the apdtopdl switch on UltraPath is turned on. If the switch is turned off, end the
operation.
The command output of esxcli upadm show upconfig | grep "APD to PDL Mode" shows
that the apdtopdl switch on UltraPath is turned on.
[root@localhost:~] esxcli upadm show upconfig | grep "APD to PDL Mode"
APD to PDL Mode : on
[root@localhost:~]
Step 5 Run the esxcli upadm set apdtopdl -m off command to turn off the apdtopdl switch.
[root@localhost:~] esxcli upadm set apdtopdl -m off
Succeeded in executing the command.
[root@localhost:~]
Step 6 Run the esxcli upadm show upconfig | grep "APD to PDL Mode" command to ensure that
the apdtopdl switch on UltraPath is turned off.
[root@localhost:~] esxcli upadm show upconfig | grep "APD to PDL Mode"
APD to PDL Mode : off
[root@localhost:~]
----End
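The decision in Step 3 and Step 4 (continue only if UltraPath is 8.06.061 or later and the apdtopdl switch is on) can be sketched as a parser over the two command outputs shown above. The function names are hypothetical, and comparing the dotted version field by field as integers is an assumption about how UltraPath versions are ordered.

```python
import re


def parse_software_version(version_output):
    """Extract 'Software Version' as a tuple of integers, or None if absent."""
    match = re.search(r"Software Version\s*:\s*(\d+(?:\.\d+)*)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def apdtopdl_must_be_turned_off(version_output, upconfig_output, minimum=(8, 6, 61)):
    """Return True only if the version is new enough and APD to PDL Mode is on."""
    version = parse_software_version(version_output)
    if version is None or version < minimum:
        return False  # UltraPath missing or too old: end the operation
    match = re.search(r"APD to PDL Mode\s*:\s*(\w+)", upconfig_output)
    return match is not None and match.group(1).lower() == "on"


if __name__ == "__main__":
    version_text = "Software Version : 8.06.061\nDriver Version : 8.06.061\n"
    print(apdtopdl_must_be_turned_off(version_text, "APD to PDL Mode : on"))
```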
Table 1-1 Mappings between the HyperMetro solution upgrade and storage array upgrade
Upgrade Application scenario:
If storage arrays are upgraded offline, suspend HyperMetro relationships by referring to 9.4.1.1
Upgrading by Suspending HyperMetro Relationships.
Procedure
Step 1 If the UltraPath version installed on the host is V100R800C20 or earlier versions, upgrade the
version of OceanStor UltraPath by referring to the upgrade guide of OceanStor UltraPath.
The upgrade guide of OceanStor UltraPath can be obtained from OceanStor UltraPath Upgrade Guide.
If you cannot access the link, visit http://support.huawei.com/enterprise/. Click Login and enter your
user name and password. After login, choose Technical Support > Cloud Storage > Tools and
Platform > UltraPath.
Step 2 Click quorum server software to obtain the quorum server software. In the product catalog,
select the desired product to go to the corresponding product page and view related
information. Download the upgrade package specific to your product.
If the link cannot be accessed, visit http://support.huawei.com/enterprise/. Click Login and enter your
user name and password, and choose Technical Support > Enterprise Storage.
Step 3 Upload the upgrade package to the quorum server. Go to the upgrade package path, and run
the ./quorum_server.sh -upgrade command to upgrade the software, as shown in Figure 9-105.
During the quorum server software upgrade, a quorum link interruption alarm may be generated
and remain for about one minute on storage arrays. The alarm automatically disappears after
the quorum server is upgraded.
Step 4 Log in to OceanStor DeviceManager of the disk array to be upgraded. The HyperMetro
management page is displayed.
Step 5 Verify that Pair Running Status of all HyperMetro consistency groups and pairs is
Normal, as shown in Figure 9-106. Then upgrade the controller software of the array. For
details, see 5 Performing the Upgrade.
Step 6 Log in to OceanStor DeviceManager of the other HyperMetro array to be upgraded. The
HyperMetro management page is displayed. Verify that Pair Running Status of all HyperMetro
consistency groups and pairs is Normal. Then upgrade the controller software of the array.
For details, see 5 Performing the Upgrade.
Step 7 After both arrays are upgraded successfully, if Pair Running Status of the HyperMetro
consistency groups and pairs is in the disconnected state (such as Paused, To Be
Synchronized, or Forcibly Started), start HyperMetro consistency group and pair
synchronization, as shown in Figure 9-107 and Figure 9-108.
----End
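The rule in Step 7 (synchronize only the consistency groups and pairs whose Pair Running Status is Paused, To Be Synchronized, or Forcibly Started) can be expressed as a simple filter. The sketch below assumes the statuses have been copied from DeviceManager into a dictionary by hand; the function name and the pair names are illustrative only.

```python
# Statuses that Step 7 lists as examples of the disconnected state.
NEEDS_SYNC = {"Paused", "To Be Synchronized", "Forcibly Started"}


def pairs_to_synchronize(pair_status):
    """Return the names of groups or pairs whose status requires synchronization."""
    return sorted(name for name, status in pair_status.items() if status in NEEDS_SYNC)


if __name__ == "__main__":
    observed = {
        "pair_01": "Normal",
        "pair_02": "Paused",
        "group_A": "To Be Synchronized",
    }
    print(pairs_to_synchronize(observed))
```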
Table 1-1 Mappings between the HyperMetro solution upgrade and storage array upgrade
Upgrade without interrupting services (online upgrade mode of storage devices)
Application scenario:
Host services are not interrupted during the upgrade. There are two available methods:
Method 1: More controllers provide host services during the upgrade, improving performance
and reliability.
Method 2: The upgrade procedure is simpler and the upgrade duration is shorter.
Method 1
Procedure:
1. Suspend HyperMetro pairs.
2. Upgrade the quorum server.
3. Upgrade the storage array whose services are suspended in online mode.
4. Start HyperMetro synchronization. After all HyperMetro pairs under the
vStore pair are synchronized, the HyperMetro pairs are in the normal
state.
5. Check the HyperMetro vStore pair.
6. Perform a primary/secondary switchover for the HyperMetro vStore pair.
7. Suspend HyperMetro pairs.
8. Upgrade the storage device (the other one) online whose services are
suspended.
9. Start HyperMetro synchronization after the storage arrays are upgraded.
After all HyperMetro pairs under the vStore pair are synchronized, the
HyperMetro pairs are in the normal state.
Method 2
Procedure:
1. Suspend HyperMetro pairs.
2. Upgrade the quorum server.
3. Upgrade the storage array (local resources cannot be accessed) whose
services are suspended in online mode and then the other storage array in
the HyperMetro pair in online mode.
4. Start HyperMetro synchronization after the storage arrays are upgraded.
After all HyperMetro pairs under the vStore pair are synchronized, the
HyperMetro pairs are in the normal state.
Upgrade without interrupting services (offline upgrade of storage devices)
Application scenario:
Host services cannot be interrupted during the upgrade and only the
HyperMetro services are configured on the storage.
Procedure:
1. Suspend HyperMetro pairs.
2. Upgrade the quorum server.
3. Upgrade the storage array whose services are suspended in offline mode.
4. Start HyperMetro synchronization. After all HyperMetro pairs under the
vStore pair are synchronized, the HyperMetro pairs are in the normal
state.
5. Check the HyperMetro vStore pair.
Procedure
Step 1 Check whether the network is a standard HyperMetro network. If the network is not a
standard HyperMetro network, modify the network before performing an upgrade.
For details about the networking standards, go to NAS HyperMetro networking standard.
If the link cannot be accessed, visit
http://support.huawei.com/hedex/hdx.do?docid=EDOC1100020508&lang=en and check the networking
standard by referring to the NAS HyperMetro Deployment Guide > Installation and
Configuration > Planning > Networking Planning.
Step 2 In OceanStor DeviceManager, verify that the running status and configuration status of
HyperMetro vStore pairs and pairs are normal, as shown in Figure 9-110.
Figure 1-1 HyperMetro vStore pairs and HyperMetro pairs in the normal running status and
configuration status
----End
Procedure
Step 1 Log in to OceanStor DeviceManager of the storage array to be upgraded, and choose Data
Protection > HyperMetro > HyperMetro vStore Pair.
If storage arrays are configured with both NAS HyperMetro and SAN HyperMetro, select the
storage array that is to be upgraded first according to the SAN HyperMetro upgrade plan.
Ensure that both the SAN HyperMetro and NAS HyperMetro upgrade conditions are met during the
upgrade.
Step 2 Check whether Activation Status of all HyperMetro vStore pairs in the local storage array is
Passive, as shown in Figure 9-112. If not, go to Step 3 to switch the local end to the passive
end. If yes, go to Step 4.
If the vStore pair of the local storage array is in the activated state, perform Step 3 to
switch the local vStore pair to the passive state. Ensure that no host services are running
during the upgrade.
NFSv3 and NFSv4 support uninterrupted service switching.
Step 3 If a HyperMetro vStore pair is activated on the local storage array, select the
corresponding vStore pair and click Check, as shown in Figure 9-113. If no abnormal check
item is found, click Primary/Secondary Switchover, as shown in Figure 9-114.
Figure 1-2 Performing primary and secondary switchover of a HyperMetro vStore pair
Step 4 On the HyperMetro vStore Pair management page, check whether Activation Status of all
HyperMetro vStore pairs in the local storage array is Passive, as shown in Figure 9-115. If no,
go to Step 2.
Step 5 Select a vStore pair, select all HyperMetro pairs under the vStore pair, and click Pause. Select
the other vStore pair and do the same, as shown in Figure 9-116.
Step 6 On the HyperMetro vStore Pair management page, check whether Local Resource Host
Access Status of all HyperMetro vStore pairs is Access denied, as shown in Figure 9-117.
If not, return to Step 5.
If the storage array is configured with NAS HyperMetro and SAN HyperMetro, stop SAN
HyperMetro pairs by referring to 9.4.1 to ensure that the Local Resource Host Access Status
of SAN HyperMetro pairs is Access Denied.
Step 7 Click quorum server software to obtain the quorum server software. In the product catalog,
select the desired product to go to the corresponding product page and view related
information. Download the upgrade package specific to your product.
If the link cannot be accessed, visit http://support.huawei.com/enterprise/. Click Login and enter your
user name and password, and choose Technical Support > Enterprise Storage.
Step 8 Upload the upgrade package to the quorum server. Go to the upgrade package path, and run
the ./quorum_server.sh -upgrade command to upgrade the software, as shown in Figure 9-118.
During the quorum server software upgrade, a quorum link interruption alarm may be generated
and remain for about one minute on storage arrays. The alarm automatically disappears after
the quorum server is upgraded.
Step 9 Upgrade the controller software of the storage array where the services are suspended (local
resources cannot be accessed). See chapter 5 Performing the Upgrade.
Step 10 After the controller software is successfully upgraded, log in to OceanStor DeviceManager of
the other storage array to be upgraded. Choose HyperMetro > HyperMetro vStore Pair. On
the page that is displayed, select the vStore pairs in sequence and synchronize all HyperMetro
pairs in vStore pairs, as shown in Figure 9-119.
HyperMetro data synchronization prolongs the upgrade duration but does not affect services.
Step 11 After all HyperMetro pairs are synchronized, Pair Running Status is Normal and
Configuration Status is Normal, as shown in Figure 9-120.
Figure 1-1 HyperMetro vStore pairs and HyperMetro pairs in the normal state
Step 12 Check the HyperMetro vStore pair. If all check items are normal, as shown in Figure 9-121,
perform a primary/secondary switchover for the HyperMetro vStore pair, as shown in Figure
9-122.
Figure 1-2 Performing primary and secondary switchover of a HyperMetro vStore pair
Step 13 On the HyperMetro vStore Pair management page, check whether Activation Status of all
HyperMetro vStore pairs in the local storage array is Passive, as shown in Figure 9-123. If no,
go to Step 11.
Step 14 After the primary and secondary switchover of the vStore pair is complete, if Pair Running
Status and Configuration Synchronization of all HyperMetro pairs in the vStore pair are
normal, suspend all the HyperMetro pairs, as shown in Figure 9-124.
Step 15 On the HyperMetro vStore Pair management page, check whether Local Resource Host
Access Status of all HyperMetro vStore pairs is Access denied. If no, go to Step 13, as
shown in Figure 9-125.
If the storage array is configured with both NAS HyperMetro and SAN HyperMetro, and the SAN
HyperMetro arrays are upgraded by suspending HyperMetro relationships, stop SAN HyperMetro
pairs by referring to 9.4.1 Upgrading Storage Arrays in the SAN HyperMetro Solution to
ensure that the Local Resource Host Access Status of SAN HyperMetro pairs is Access denied.
Step 16 Upgrade the controller software of the storage array where the services are suspended (local
resources cannot be accessed). See chapter 5 Performing the Upgrade.
During the upgrade of the array controller, the remote resource name of the NAS HyperMetro
is displayed as -- and the Link Status is Link down. After the array controller is upgraded,
the name and status of the remote resource return to normal.
Step 17 After both arrays are successfully upgraded, start HyperMetro pair synchronization in all
vStore pairs, as shown in Figure 9-126.
----End
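Before each controller software upgrade in this procedure, two conditions must hold for every HyperMetro vStore pair on the array: Activation Status is Passive and Local Resource Host Access Status is Access denied. The following sketch lists the vStore pairs that still block the upgrade. The function name, the record layout, and the sample status values other than Passive and Access denied are illustrative assumptions; the statuses are expected to be copied from DeviceManager.

```python
def blocking_vstore_pairs(vstore_pairs):
    """Return the names of vStore pairs that are not yet ready for the upgrade."""
    blocked = []
    for pair in vstore_pairs:
        passive = pair["activation_status"].casefold() == "passive"
        denied = pair["host_access"].casefold() == "access denied"
        if not (passive and denied):
            blocked.append(pair["name"])
    return blocked


if __name__ == "__main__":
    observed = [
        {"name": "vs_pair_0", "activation_status": "Passive", "host_access": "Access denied"},
        # "Active" and "Read/Write" are placeholder values for the not-ready case.
        {"name": "vs_pair_1", "activation_status": "Active", "host_access": "Read/Write"},
    ]
    print(blocking_vstore_pairs(observed))
```

An empty list corresponds to the state in which the controller software of the local array can be upgraded.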
Figure 1-1 HyperMetro vStore pairs and HyperMetro pairs in the normal state
Step 2 Check HyperMetro vStore pairs. No abnormal check item exists, as shown in Figure 9-128.
----End
9.4.3 Troubleshooting
This chapter describes how to troubleshoot upgrade faults in the HyperMetro solution.
For details, see chapter 7 Troubleshooting.
Question
What can I do if the HyperMetro is disconnected and cannot be recovered after the offline
upgrade is complete?
Answer
The possible cause is that the HyperMetro pair was not suspended before the array controller
software was upgraded in offline mode, and both storage arrays of the HyperMetro pair were
upgraded simultaneously. As a result, the arrays cannot be accessed after the
upgrade is complete. The check method is as follows:
Step 1 Log in to OceanStor DeviceManager of the storage array. On the HyperMetro management
page, check whether the Local Resource Host Access Status of the HyperMetro pair is
Access denied, as shown in Figure 9-129. Log in to the HyperMetro management page of the
remote storage array and check whether the Local Resource Host Access Status of the
HyperMetro pair is also Access denied. The preceding information indicates that active-
active bidirectional access is disabled.
Step 2 After confirming that the HyperMetro pair cannot be accessed, click More and choose
Force Start from the drop-down list to forcibly start the HyperMetro pair, as shown in
Figure 9-130.
Before forcibly starting a HyperMetro pair, ensure that the host services of the
HyperMetro pair are stopped and that the Local Resource Host Access Status of
HyperMetro pairs of both arrays is Access denied.
If the HyperMetro pair belongs to the HyperMetro consistency group, you can only
operate the HyperMetro consistency group.
You can only forcibly start the vStore pair to which the NAS HyperMetro pair belongs.
Step 3 After forcibly starting a HyperMetro pair, synchronize the HyperMetro pair, as shown in
Figure 9-131. After the synchronization is complete, check whether Pair Running Status of
the HyperMetro pair is Normal, as shown in Figure 9-132.
----End
Question
If the consistency group fails to be synchronized during the online upgrade or after the
upgrade, error code 0x403c01b6 is returned, indicating that the vStore IDs of the HyperMetro
consistency group and its HyperMetro pairs vary. If you start synchronization at one end that
is not upgraded, the message "The system is busy" or "An internal error occurs" is returned.
Answer
The possible cause is as follows: Before the upgrade, the array runs a version in which
HyperMetro does not support multi-vStores (V300R003C20SPC200, V300R005C00SPC300,
V300R006C00SPC100, V300R006C10SPC100, or V300R006C20), the HyperMetro LUNs belong to a
vStore, and the HyperMetro pairs are added to a consistency group that belongs to the
default vStore. After the upgrade to a version in which HyperMetro supports multi-vStores
(V300R006C30 or later), the vStore consistency of consistency groups and their member pairs
is checked during synchronization. Because the HyperMetro LUNs and the consistency group
belong to different vStores, the synchronization is not allowed.
Solution: Remove all HyperMetro pairs from the consistency group and delete the consistency
group. Synchronize and suspend each pair in sequence. After the upgrade is complete, create a
consistency group and add all HyperMetro pairs to the consistency group.
Step 1 Log in to OceanStor DeviceManager of the storage array and check whether the member
LUNs of the HyperMetro pairs belong to vStores. If both vStore ID and vStore name are --,
the failure does not exist. Otherwise, contact technical support engineers.
Step 3 Delete the HyperMetro consistency group on the HyperMetro Consistency Group page.
----End
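The consistency rule described in the answer (a consistency group and all of its member pairs must report the same vStore ID) can be illustrated with a short check. This is a sketch of the rule only, not of the array's internal check: the function name is hypothetical, the IDs are assumed to be copied from DeviceManager, and "--" stands for a LUN or group that does not belong to any vStore, as in Step 1.

```python
def mismatched_pairs(group_vstore_id, pair_vstore_ids):
    """Return the member pairs whose vStore ID differs from that of the group."""
    return sorted(
        name for name, vstore_id in pair_vstore_ids.items()
        if vstore_id != group_vstore_id
    )


if __name__ == "__main__":
    # The group is in the default vStore ("--"), but one member LUN is in vStore 1.
    members = {"pair_01": "--", "pair_02": "1"}
    print(mismatched_pairs("--", members))
```

If the list is not empty, synchronization of the consistency group can be expected to fail in the way described in the question, and the solution above applies.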
Downgrade Description
After an upgrade is complete, if you want to downgrade the current version to the pre-upgrade
version, contact Huawei technical support engineers to assess the risks and obtain the
Storage System Downgrade Guide.
1. Before you downgrade the current version to the pre-upgrade version, contact Huawei
technical support engineers to assess the risks, because the earlier version may be
incompatible with data in the new version.
2. Perform checks and operations by strictly following instructions in the Storage System
Downgrade Guide.