
Atlas 800 Training Server

1.0.7 to 1.0.10

NPU Driver and Firmware


Installation and Upgrade Guide
(Model 9010)
Issue 05
Date 2022-03-07

HUAWEI TECHNOLOGIES CO., LTD.


Copyright © Huawei Technologies Co., Ltd. 2022. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior
written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.

Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees
or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China

Website: https://e.huawei.com

Issue 05 (2022-03-07) Copyright © Huawei Technologies Co., Ltd. i


Atlas 800 Training Server
NPU Driver and Firmware Installation and Upgrade
Guide (Model 9010) About This Document

About This Document

Overview
This document describes how to install and upgrade the driver and firmware
packages and provides FAQs and troubleshooting methods.

Intended Audience
This document is intended for upgrade engineers who must:

● Be familiar with the product networking and related network element (NE)
versions.
● Have device maintenance experience and be familiar with device operation
and maintenance.

Symbol Conventions
The symbols that may be found in this document are defined as follows.

Symbol    Description

DANGER    Indicates a hazard with a high level of risk which, if not
          avoided, will result in death or serious injury.

WARNING   Indicates a hazard with a medium level of risk which, if not
          avoided, could result in death or serious injury.

CAUTION   Indicates a hazard with a low level of risk which, if not
          avoided, could result in minor or moderate injury.

NOTICE    Indicates a potentially hazardous situation which, if not
          avoided, could result in equipment damage, data loss,
          performance deterioration, or unanticipated results.
          NOTICE is used to address practices not related to personal
          injury.

NOTE      Supplements the important information in the main text.
          NOTE is used to address information not related to personal
          injury, equipment damage, and environment deterioration.


Change History
Issue  Release Date  Description

05     2022-03-07    This issue is the fifth official release.

04     2021-06-03    Modified 1.2.1 Obtaining Software Packages, 1.2.3
                     Checking the OS Requirements and Environment, and
                     2.2 Preparations for Upgrade.

03     2021-04-30    Modified 1.1 Before You Start, 2.2 Preparations for
                     Upgrade, and 2.4.2 Using apt-get.
                     Added 1.2.5 (Optional) Repacking a Driver Package.

02     2021-02-02    Modified 1.2 Preparations for Installation.
                     Added 3.7 Driver 20.2.0 or Later Failed to Be Rolled
                     Back to an Earlier Version.

01     2020-07-17    This issue is the first official release.


Contents

About This Document

1 Installation and Maintenance
1.1 Before You Start
1.2 Preparations for Installation
1.2.1 Obtaining Software Packages
1.2.2 Verifying Software Package Integrity
1.2.3 Checking the OS Requirements and Environment
1.2.4 Creating a Running User
1.2.5 (Optional) Repacking a Driver Package
1.3 Installing the Driver and Firmware
1.4 Uninstalling the Driver and Firmware

2 Upgrade
2.1 Before You Start
2.2 Preparations for Upgrade
2.3 Upgrading Components (.run)
2.3.1 Upgrading the Ascend 910 AI Processor Firmware
2.3.2 Upgrading the Ascend 910 AI Processor Driver
2.4 Upgrading Components (.deb)
2.4.1 Using dpkg
2.4.2 Using apt-get

3 FAQs
3.1 Driver Source Code Compilation
3.2 Software Package Unavailable
3.3 How Do I Check Whether the Device Is Running Properly
3.4 BC_Linux Image Source Configuration
3.5 Inconsistent Driver and Firmware Versions Due to Incorrect Upgrade Sequence
3.6 How Do I Handle a .deb Package Installation or Upgrade Failure?
3.7 Driver 20.2.0 or Later Failed to Be Rolled Back to an Earlier Version

A Appendixes
A.1 Parameters
A.2 Scripts
A.3 Tools

1 Installation and Maintenance

1.1 Before You Start


1.2 Preparations for Installation
1.3 Installing the Driver and Firmware
1.4 Uninstalling the Driver and Firmware

1.1 Before You Start


● For servers with Ascend 910 Pro B NPUs, choose version 20.2.0.SPC300 or later
when installing the NPU driver for the first time. The NPU firmware uses the
version that is released with the driver. The NPU driver or firmware version
cannot be rolled back to version 20.2.0.SPC200 or earlier.
● To query the NPU processor name from the host, log in to the host and run
npu-smi info. To query the NPU name from the iBMC, log in to the iBMC WebUI,
choose System > System Info > Processors, and view the NPU model information.

1.2 Preparations for Installation

1.2.1 Obtaining Software Packages


Before the installation, obtain the software packages based on the operating
system of the operating environment. Table 1-1 lists the software packages.


Table 1-1 Software packages

Component         OS                      Software Package                                   User

Firmware package  Ubuntu 18.04            A800-9010-npu-firmware_<version>.run               root
                  CentOS 7.6              A800-9010-npu-firmware_<version>.deb
                  Debian 9.9
                  CentOS 8.2
                  BC_Linux 7.6
                  Debian 10.0

Driver package    Ubuntu 18.04 (x86_64)   A800-9010-npu-driver_x.x.x_ubuntu18.04-x86_64.run  root

Driver package    CentOS 7.6 (x86_64)     A800-9010-npu-driver_x.x.x_centos7.6-x86_64.run    root

Driver package    Debian 9.9 (x86_64)     A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run    root
                                          A800-9010-npu-driver_x.x.x_debian9.9-x86_64.deb

Driver package    BC_Linux 7.6 (x86_64)   A800-9010-npu-driver_x.x.x_linux-x86_64.run        root

Driver package    CentOS 8.2 (x86_64)     A800-9010-npu-driver_x.x.x_linux-x86_64.run        root

Driver package    Debian 10.0 (x86_64)    A800-9010-npu-driver_x.x.x_debian10.0-x86_64.run   root
                                          A800-9010-npu-driver_x.x.x_debian10.0-x86_64.deb

NOTE

● Debian 10.0 is supported only by version 21.0.rc1 or later.
● x.x.x indicates the version number.
● The A800-9010-npu-driver_x.x.x_linux-x86_64.run package is compatible with all
operating systems.


Procedure
Step 1 Visit the A800-9010 page.
Step 2 Select the target version A800-9010 X.X.X.
For details about the mapping between the firmware, driver, and CANN, see
CANN Version Mapping.

Step 3 Click the download icons next to a software package (for example,
A800-9010-npu-driver_x.x.x_ubuntu18.04-x86_64.run) to obtain the software
package and digital signature file.

----End

1.2.2 Verifying Software Package Integrity


To prevent a software package from being maliciously tampered with during
transmission or storage, download the corresponding digital signature file for
integrity verification when downloading the software package.
After the software package is downloaded from the Support website, verify its PGP
digital signature by referring to the OpenPGP Signature Verification Guide. If the
verification fails, do not use the software package. Contact Huawei technical
support.
Before a software package is used for installation or upgrade, its digital signature
also needs to be verified by referring to the OpenPGP Signature Verification Guide
to ensure that the software package has not been tampered with.
For carrier users, visit http://support.huawei.com/carrier/digitalSignatureAction.
For enterprise users, visit
https://support.huawei.com/enterprise/zh/tool/pgp-verify-TL1000000054.
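
The check described above can be sketched as follows. This is only an illustration: it assumes GnuPG (gpg) is installed, that the Huawei public key has already been imported as described in the OpenPGP Signature Verification Guide, and that the signature file is named after the package with an .asc suffix. The suffix and the function name verify_package are assumptions, not taken from this guide.

```shell
#!/bin/sh
# Verify a detached OpenPGP signature for a downloaded package.
# Usage: verify_package <package> [signature]
# Assumption: the signature file defaults to "<package>.asc".
verify_package() {
    vp_pkg=$1
    vp_sig=${2:-$1.asc}
    if [ ! -f "$vp_pkg" ] || [ ! -f "$vp_sig" ]; then
        echo "missing package or signature file" >&2
        return 2
    fi
    # gpg exits non-zero when the signature does not match the package.
    if gpg --verify "$vp_sig" "$vp_pkg"; then
        echo "signature OK: $vp_pkg"
    else
        echo "signature verification FAILED: $vp_pkg" >&2
        return 1
    fi
}
```

If the function reports a failure, do not use the package, as stated above.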

1.2.3 Checking the OS Requirements and Environment


Table 1-2 and Table 1-3 list the default operating systems and kernels used by
the software packages.


Table 1-2 Default OS versions of the binary driver packages

Hardware Form            Host OS Version  Default Host OS Kernel Version  GCC Version
                                          in a Software Package

x86_64 + Atlas 800 9010  Ubuntu 18.04     4.15.0-45-generic               7.5.0

x86_64 + Atlas 800 9010  CentOS 7.6       3.10.0-957.el7                  -

x86_64 + Atlas 800 9010  Debian 9.9       4.9.0-9-amd64                   6.3.0

x86_64 + Atlas 800 9010  Debian 10.0      4.19.0-5-amd64                  8.3.0

NOTE

● Ubuntu 18.04: If the kernel version does not match the OS version, install
Dynamic Kernel Module Support (DKMS) first. For details about how to install
DKMS, see 3.1 Driver Source Code Compilation.
● Debian 10.0 is supported only by version 21.0.rc1 or later.

Table 1-3 OS versions compatible with the general driver packages

Hardware Form            Host OS Version  Default Host OS Kernel Version  GCC Version
                                          in a Software Package

x86_64 + Atlas 800 9010  CentOS 8.2       4.18.X                          8.3.1

x86_64 + Atlas 800 9010  BC_Linux 7.6     4.19                            4.8.5

NOTE

For CentOS 8.2, upgrading to 5.6.14 is supported.

Checking the Linux OS Version


Run the uname -m && cat /etc/*release command to query the OS version and
architecture.


The OS version and architecture must comply with Table 1-2 or Table 1-3.
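
As a rough sketch of this check, the function below prints the architecture together with the OS name and version so that they can be compared with the tables. It assumes the host provides an os-release file (one of the files matched by /etc/*release); the function name os_summary is hypothetical.

```shell
#!/bin/sh
# Print "<arch> <ID> <VERSION_ID>" from an os-release style file.
# Usage: os_summary [file]   (default: /etc/os-release)
os_summary() {
    os_file=${1:-/etc/os-release}
    [ -r "$os_file" ] || { echo "cannot read $os_file" >&2; return 1; }
    # Source the file in a subshell so its variables do not leak.
    (
        . "$os_file"
        echo "$(uname -m) ${ID:-unknown} ${VERSION_ID:-unknown}"
    )
}
```

Running os_summary without arguments reads /etc/os-release on the current host.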

Check Items Specific to General Driver Packages


● Run the make -v command to check whether the Make tool has been
installed. If the Make version is displayed, Make has been installed.
● The driver can be successfully installed if either of the following check items is
met:
– Installation dependencies such as DKMS have been installed in the
system. For details, see 3.1 Driver Source Code Compilation.
– Run the ls /lib/modules/`uname -r`/build command to check whether
the default kernel source code path /lib/modules/`uname -r`/build
exists.

▪ If it exists, the kernel source code in this path is automatically used to
compile the driver.
▪ If it does not exist, you can provide the source code path during
installation. For details, see Step 5 in 1.3 Installing the Driver and
Firmware.
● For CentOS 8.2+x86, run the rpm -qa | grep elfutils-libelf-devel command
to check whether elfutils-libelf-devel has been installed. If its information is
displayed, elfutils-libelf-devel has been installed.
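
The check items above could be collected in a small helper such as the one below. It is a sketch under assumptions: the function names are hypothetical, and the presence of DKMS is tested only by looking for a dkms command on PATH, which may not cover every installation.

```shell
#!/bin/sh
# Print the default kernel source path if it exists.
# Usage: kernel_build_dir [kernel_release] [modules_root]
kernel_build_dir() {
    kb_rel=${1:-$(uname -r)}
    kb_root=${2:-/lib/modules}
    kb_dir=$kb_root/$kb_rel/build
    if [ -e "$kb_dir" ]; then
        echo "$kb_dir"
    else
        echo "kernel source path not found: $kb_dir" >&2
        return 1
    fi
}

# Summarize the check items for a general driver package.
check_general_driver_env() {
    if make -v >/dev/null 2>&1; then
        echo "make: installed"
    else
        echo "make: MISSING"
    fi
    # Assumption: DKMS is detected by the presence of the dkms command.
    if command -v dkms >/dev/null 2>&1; then
        echo "dkms: installed"
    else
        echo "dkms: not found"
    fi
    if kernel_build_dir >/dev/null 2>&1; then
        echo "kernel source path: present"
    else
        echo "kernel source path: absent"
    fi
}
```

Calling check_general_driver_env prints one line per check item. The driver can be compiled if DKMS or the kernel source path is reported as present.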

Checking the Linux OS Kernel Version


Run the uname -r command to query the kernel version of the host OS.

● If the binary driver package is used for installation, the kernel version of the
current host OS must be the same as that required in Table 1-2. If the kernel
versions are inconsistent, use either of the following methods:
– Recompile the source code. For details, see 3.1 Driver Source Code
Compilation.
– Check whether the software package has been installed by referring to
Checking Whether a Component Has Been Installed in the Operating
Environment. If no, update the system kernel. If yes, uninstall the
software package and then update the kernel.
● If the general driver package is used for installation, the kernel version of the
current host OS must be the same as that required in Table 1-3. Otherwise,
the driver package may fail to be installed or functions may be affected.
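
For the binary driver packages, the comparison can be sketched as an exact string match against the kernel version listed in Table 1-2. The function name kernel_matches is hypothetical. An exact match suits Table 1-2 only, because Table 1-3 lists version families such as 4.18.X rather than full release strings.

```shell
#!/bin/sh
# Compare a kernel release string with the value a binary driver package expects.
# Usage: kernel_matches <expected> [actual]   (actual defaults to `uname -r`)
kernel_matches() {
    km_expected=$1
    km_actual=${2:-$(uname -r)}
    if [ "$km_actual" = "$km_expected" ]; then
        echo "kernel $km_actual matches"
    else
        echo "kernel $km_actual differs from expected $km_expected" >&2
        return 1
    fi
}
```

For example, on Ubuntu 18.04 the call would be kernel_matches 4.15.0-45-generic.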

Checking Whether a Component Has Been Installed in the Operating Environment
Before updating the system kernel, ensure that no component is installed in the
current system. Otherwise, the software packages fail to start after the kernel
is upgraded. If this problem occurs, see 3.2 Software Package Unavailable.
Run the lsmod | grep drv command to check whether a component has been
installed.

● If no information is output, no software package has been installed. You can
directly upgrade the kernel.
● If software information is output, software packages have been installed. In
this case, uninstall the software packages and then upgrade the system kernel.
For details, see 1.4 Uninstalling the Driver and Firmware.
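
The decision above can be wrapped in a small filter that reads the lsmod output. This is a minimal sketch; the function name component_status and the message texts are illustrative only.

```shell
#!/bin/sh
# Read `lsmod` output on stdin and report whether any line contains "drv".
# Usage: lsmod | component_status
# Returns 1 when matching modules are loaded, 0 otherwise.
component_status() {
    cs_hits=$(grep drv || true)
    if [ -n "$cs_hits" ]; then
        echo "driver modules loaded; uninstall the software packages before updating the kernel:"
        echo "$cs_hits"
        return 1
    fi
    echo "no driver modules loaded; the kernel can be updated"
}
```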

1.2.4 Creating a Running User


● For installation as the root user:
After performing installation as the root user, you must switch to a non-root
user for execution. Therefore, you need to create a running user before the
installation.
– If the created user is HwHiAiUser, you can install the software packages
without specifying a running user, because HwHiAiUser is the default
running user.
– If the created user is not HwHiAiUser, you need to specify the running user
using the --install-username=username --install-usergroup=usergroup
parameters when installing the software packages.
● For installation as a non-root user:
In this scenario, the installation and running users must be the same.
– If a non-root user exists, you do not need to create one.
– If you want to use a new non-root user, run the following commands as the
root user:
i. Create a non-root user:
groupadd usergroup
useradd -g usergroup -d /home/username -m username
ii. Set the password of the non-root user:
passwd username

NOTE

● The running user is specified for the driver component. A running user cannot
be specified for firmware. The running user of firmware is the same as that of
the driver.
● In the preceding commands, replace username and usergroup with the actual
user name and user group.
● Permission control may pose security risks. Therefore, you are advised to create a
running user that does not belong to the root user group.
● After the HwHiAiUser user is created, do not disable the login authentication function
of the user.
● The password validity period is 90 days. You can change the validity period in the
/etc/login.defs file or run the chage command to set the validity period. For details, see
Setting the User Account Validity Period.
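
The user creation commands above could be made repeatable with a helper like the following sketch. The function name ensure_running_user and the DRY_RUN switch are hypothetical additions; the groupadd and useradd invocations are the ones shown above.

```shell
#!/bin/sh
# Create a running user and its group only if they do not exist yet.
# Usage: ensure_running_user <username> <usergroup>
# Set DRY_RUN=1 to print the commands instead of executing them (no root needed).
ensure_running_user() {
    ru_user=$1
    ru_group=$2
    ru_run=
    [ "${DRY_RUN:-0}" = 1 ] && ru_run=echo

    # Create the group when /etc/group has no entry for it.
    if ! grep -q "^${ru_group}:" /etc/group 2>/dev/null; then
        $ru_run groupadd "$ru_group"
    fi
    # Create the user with a home directory when the account is unknown.
    if ! id -u "$ru_user" >/dev/null 2>&1; then
        $ru_run useradd -g "$ru_group" -d "/home/$ru_user" -m "$ru_user"
    fi
}
```

Run it as the root user, then set the password interactively with passwd username as described above.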

Setting the User Account Validity Period


Run the chage command to set the validity period of a user account for security
purposes.

Command:

chage [-m mindays] [-M maxdays] [-d lastday] [-I inactive] [-E expiredate] [-W warndays] user

Table 1-4 describes the parameters.


Table 1-4 Parameter description

Parameter  Description

-m         Minimum validity period (days) of a password. The value 0
           indicates that the password can be changed at any time.

-M         Maximum validity period (days) of a password. If this
           parameter is set to -1, this check item is ignored and the user
           password will not expire. This poses security risks. Therefore,
           exercise caution when setting this parameter to -1.

-d         Date when the password was last changed.

-I         Number of days of inactivity allowed after a password has
           expired before the user account is locked.

-E         Date when the user account expires. The user account is
           unavailable when the account validity period has expired.

-W         Number of days in advance users are notified that their
           passwords are about to expire.

-l         Lists the user name and password validity information. This
           information helps non-privileged users to determine when to
           change their passwords.

NOTE

● Table 1-4 lists only common parameters. You can run the chage --help command to
display detailed parameter description.
● The date is in the format of YYYY-MM-DD. For example, chage -E 2019-12-01 test
indicates that user test will expire on December 1, 2019.
● If User is not specified, the default user root will be used.

For example, to set the expiration date of the user account test to December
31, 2019, run the following command:
chage -E 2019-12-31 test

1.2.5 (Optional) Repacking a Driver Package


Repack a driver package that meets the requirements of the target OS version,
kernel version, and GCC version.
A driver package can be repacked using any of the following methods:
1. Directly repacking the driver package
2. Decompressing the .run package and repacking the driver package
3. Using a repacking tool to repack the driver package


CAUTION

You are advised to use the first two methods.


● If the source code in the .run package does not need to be modified, you can
directly repack the driver package. For details, see Directly Repacking the
Driver Package.
● If the driver source code needs to be modified, decompress the .run package
and repack the driver package. For details, see Decompressing the .run
Package and Repacking the Driver Package.

Prerequisites
● The GCC tool and kernel source code corresponding to the target OS and its
kernel version exist. For details, see 1.2.3 Checking the OS Requirements
and Environment.
● The open-source tool Makeself v2.4.0 has been downloaded from
https://github.com/megastep/makeself or
https://github.com/megastep/makeself/releases/tag/release-2.4.0.
● Run the which pigz command to check whether the pigz tool has been
installed. If the location of the pigz tool is displayed, it has been installed.
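
The tool checks in the prerequisites can be sketched as a single helper that lists whatever is missing from PATH. The function name missing_tools is hypothetical, and the tool list passed to it is up to the caller.

```shell
#!/bin/sh
# Print the names of the given tools that are not found on PATH.
# Usage: missing_tools <tool>...
# Returns 1 if at least one tool is missing, 0 otherwise.
missing_tools() {
    mt_status=0
    for mt_tool in "$@"; do
        if ! command -v "$mt_tool" >/dev/null 2>&1; then
            echo "$mt_tool"
            mt_status=1
        fi
    done
    return "$mt_status"
}
```

For the repacking prerequisites, a call such as missing_tools gcc make pigz prints nothing when all three tools are available.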

Directly Repacking the Driver Package


Step 1 Obtain the general driver package. For details, see 1.2.1 Obtaining Software
Packages.
Step 2 Upload the general driver package to a directory in the operating environment, for
example, /opt.
Step 3 Run the following command to go to the directory:
cd /opt
Step 4 Run the following command to add the execute permission for the general driver
package:
Command: chmod +x *.run
Example: chmod +x A800-9010-npu-driver_x.x.x_linux-x86_64.run
Step 5 Run the following command as the root user to repack the driver package:
Command: ./*.run --repack [package_name]
Example: ./A800-9010-npu-driver_x.x.x_linux-x86_64.run --repack A800-9010-npu-driver-repack.run

NOTE

If package_name is left blank, a file named original_driver_package_name-custom.run is
generated in the current path.

Step 6 Install the driver. For details, see 1.3 Installing the Driver and Firmware.

----End


Decompressing the .run Package and Repacking the Driver Package


Step 1 Obtain the general driver package. For details, see 1.2.1 Obtaining Software
Packages.
Step 2 Upload the general driver package to a directory in the operating environment, for
example, /opt.
Step 3 Run the following command to go to the directory:
cd /opt
Step 4 Run the following command to add the execute permission for the general driver
package:
Command: chmod +x *.run
Example: chmod +x A800-9010-npu-driver_x.x.x_linux-x86_64.run
Step 5 Run the following command to decompress the general driver package to a
specified directory:
Command: ./*.run --noexec --extract=specified_directory
Example: ./A800-9010-npu-driver_x.x.x_linux-x86_64.run --noexec --extract=./tmp
Step 6 Run the following command as the root user to repack the driver package:
Command: ./*.run --repack-path=<path> [package_name]
Example: ./A800-9010-npu-driver_x.x.x_linux-x86_64.run --repack-path=tmp/ A800-9010-npu-driver-repack.run

NOTE

If package_name is left blank, a file named original_driver_package_name-custom.run is
generated in the current path.

Step 7 Install the driver. For details, see 1.3 Installing the Driver and Firmware.

----End
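
The decompress-and-repack procedure above can be sketched as one function that chains the three commands. The function name repack_from_dir and the DRY_RUN switch are hypothetical; the package options are the ones used in the steps above. If the driver source code must be edited between decompression and repacking, run the commands separately instead.

```shell
#!/bin/sh
# Extract a general driver package and repack it from the extracted tree.
# Usage: repack_from_dir <package.run> <extract_dir> <output.run>
# Run from the directory that contains the package, as the root user.
# Set DRY_RUN=1 to print the commands instead of executing them.
repack_from_dir() {
    rp_pkg=$1
    rp_dir=$2
    rp_out=$3
    rp_run=
    [ "${DRY_RUN:-0}" = 1 ] && rp_run=echo

    # Stop at the first failing step and pass its exit status to the caller.
    $rp_run chmod +x "$rp_pkg" || return
    $rp_run "./$rp_pkg" --noexec --extract="$rp_dir" || return
    $rp_run "./$rp_pkg" --repack-path="$rp_dir" "$rp_out"
}
```

With DRY_RUN=1 the function only prints the three commands, which is a convenient way to review them before running anything as root.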

Using a Repacking Tool to Repack the Driver Package


Step 1 Obtain the general driver package and repacking tool package. For details, see
1.2.1 Obtaining Software Packages.
Step 2 Upload the general driver package, repacking tool package, and open-source
software package makeself-release-2.4.0.zip to the same directory in the
operating environment, for example, in the /opt directory.
Step 3 Run the following command to go to the directory:
cd /opt
Step 4 Run the following command to add the execute permission for the general driver
package:
Command: chmod +x *.run
Example: chmod +x A800-9010-npu-driver_x.x.x_linux-x86_64.run
Step 5 Run the following command to decompress the general driver package to a
specified directory:
Command: ./*.run --noexec --extract=specified_directory
Example: ./A800-9010-npu-driver_x.x.x_linux-x86_64.run --noexec --extract=./tmp
Step 6 Run the following command to decompress the repacking tool package to the
current directory:
Command: tar -xzvf repacking_tool_package_name --strip-components 1
Example: tar -xzvf A800-9010-npu-driver-repack-tools-x.x.x.tar.gz --strip-components 1
Step 7 Run the following command to decompress the software package
makeself-release-2.4.0.zip:
unzip makeself-release-2.4.0.zip
Step 8 Run the following command to build a new driver package:
Command: bash build.sh makeself.sh_path decompression_path_of_the_general_driver_package output_package_name
Example: bash build.sh ./makeself-release-2.4.0/makeself.sh tmp/ A800-9010-npu-driver-repack.run
Step 9 Install the driver. For details, see 1.3 Installing the Driver and Firmware.

----End

1.3 Installing the Driver and Firmware


Prerequisites
The preparations for the installation are complete. For details, see 1.2
Preparations for Installation.

Procedure
Install the driver package first and then the firmware package. The procedures
for installing the two .run packages are the same. Replace the asterisks (*) in
the package names based on the actual situation.

Step 1 Log in to the operating environment as user root and upload the *.run packages
to any directory, for example, /opt, in the operating environment.
Step 2 Grant the execute permission on the software packages to the installation user.
Run the ls -l command in the directory where the software package is stored to
check whether the installation user has the permission to execute the file. If the
installation user does not have the permission, run the following command:
chmod +x *.run


Step 3 Run the following command to check the consistency and integrity of the software
package installation file:
./*.run --check
Step 4 Run the following command to perform installation:
● If the installation path of the software package has been specified, for
example, /test/HiAI/, run the ./*.run --full --install-path=/test/HiAI/
command.
● If the software package is installed in the default installation directory, run
the ./*.run --full command.
NOTE

– The installation path is specified for the driver component. An installation
path cannot be specified for firmware. The installation path of firmware is the
same as that of the driver.
– If the driver is installed in a specified path:
▪ If the specified path does not exist, it is automatically created during
the installation. If the path has multiple levels of directories, the
directory is automatically created only when only the last level of
directory does not exist.
▪ If the specified path already exists:
○ If the owner of all levels of directories in the path is the root user,
ensure that the permission on all levels of directories is at least 755.
If this requirement is not met, run the chmod 755 path command to
change the permission on the path.
○ If the owner of any level of directory in the path is not the root user,
change the owner to the root user and ensure that the permission on all
levels of directories is 755.
To change the path owner to root, run the chown root:group_name path
command.
– If the root user is specified as the running user, the --install-for-all parameter
must be included in the installation command as follows. In this scenario, security
risks may exist.
--install-username=root --install-usergroup=root --install-for-all
– Default installation path: /usr/local/Ascend
– Installation log path: /var/log/ascend_seclog/ascend_install.log
– Driver/Firmware installation path, installation command, and user
information: /etc/ascend_install.info
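
The combinations of installation parameters described in this step can be sketched with a helper that assembles the argument list. The function name install_args is hypothetical. It uses only the options named in this guide, and it assumes that the running-user options can be combined with --full on the same command line.

```shell
#!/bin/sh
# Print the installer arguments for a driver package.
# Usage: install_args [install_path] [username] [usergroup]
# Empty arguments fall back to the installer defaults.
install_args() {
    ia_path=$1
    ia_user=$2
    ia_group=$3
    ia_args="--full"
    [ -n "$ia_path" ] && ia_args="$ia_args --install-path=$ia_path"
    if [ -n "$ia_user" ]; then
        ia_args="$ia_args --install-username=$ia_user --install-usergroup=$ia_group"
        # Running as root requires --install-for-all and may pose security risks.
        [ "$ia_user" = root ] && ia_args="$ia_args --install-for-all"
    fi
    echo "$ia_args"
}
```

For example, ./A800-9010-npu-driver_x.x.x_linux-x86_64.run $(install_args /test/HiAI/) expands to the --full --install-path=/test/HiAI/ form shown above.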

Step 5 For a general driver package, if the following information is displayed, DKMS
is not installed and the default kernel source code path /lib/modules/`uname -r`/
build does not exist. Enter the following information as prompted:
[WARNING]rebuild ko has something wrong, detail in /var/log/ascend_seclog/ascend_rebuild.log
Do you want to try build driver after input kernel absolute path? [y/n]:

If you want to continue the installation, enter y.


When the following information is displayed, enter the actual path of the kernel
source code, for example, /lib/modules/`uname -r`/build-bak:
Please input your kernel absolute path:

Press Enter to continue the installation.


NOTE

● If DKMS and related components such as kernel-header and kernel-devel have been
installed, the system automatically compiles and installs the DKMS driver.
● If DKMS has not been installed but the default kernel source code path /lib/modules/
`uname -r`/build already exists, the kernel is automatically used for driver compilation.

Step 6 If information similar to the following is displayed, the installation is successful:


● Driver:
Driver package install success! Reboot needed for installation/upgrade to take effect!

● Firmware:
Firmware package install success! Reboot needed for installation/upgrade to take effect!

Step 7 Restart the operating environment.


Step 8 Check the version of the installed driver.
In the software package installation path, for example, the default path of the
root user /usr/local/Ascend/{package_name}, run the following command to
check whether the target version is correct:
cat version.info
Version=1.73.T105.0.B050
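
The version check can also be scripted. The following is a minimal sketch, assuming that version.info contains a single Version= line as in the example output above. The function names are illustrative only.

```shell
#!/bin/sh
# Print the value of the Version= line in a version.info file.
read_version() {
    sed -n 's/^Version=//p' "$1"
}

# Succeed only if the installed version equals the expected version.
# Both values are supplied by the caller.
version_matches() {
    [ "$(read_version "$1")" = "$2" ]
}

# Example (path and version are placeholders for your environment):
#   version_matches /usr/local/Ascend/driver/version.info 1.73.T105.0.B050 \
#       && echo "target version installed"
```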

Step 9 Check the version of the installed NPU firmware.

/usr/local/Ascend/driver/tools/upgrade-tool --device_index -1 --component -1 --version
Get component version(1.73.5.0.b050) succeed for deviceId(0), componentType(0).
{"device_id":0, "component":nve, "version":1.73.5.0.b050}
Get component version(1.73.5.0.b050) succeed for deviceId(0), componentType(3).
{"device_id":0, "component":uefi, "version":1.73.5.0.b050}
Get component version(1.73.5.0.b050) succeed for deviceId(0), componentType(8).
{"device_id":0, "component":imu, "version":1.73.5.0.b050}
Get component version(1.73.105.0.b050) succeed for deviceId(0), componentType(9).
{"device_id":0, "component":imp, "version":1.73.105.0.b050}
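
To compare component versions across devices, the command output can be reduced to one line per component. The following is a minimal sketch, assuming the output lines have exactly the form shown above; other tool versions may format them differently. The function name is illustrative.

```shell
#!/bin/sh
# Read upgrade-tool --version output on stdin and print
# "device_id component version" for each result line.
component_versions() {
    sed -n 's/^{"device_id":\([0-9]*\), "component":\([a-z]*\), "version":\([^}]*\)}$/\1 \2 \3/p'
}

# Example:
#   /usr/local/Ascend/driver/tools/upgrade-tool --device_index -1 \
#       --component -1 --version | component_versions
```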

Step 10 Run the npu-smi info command to check whether the NPU tool is successfully
installed.
The installation is successful if the following information is displayed. Otherwise,
the installation fails. Contact Huawei technical support.


NOTE

● In the command output, the field after npu-smi is the NPU tool version, and the field
after Version: is the NPU driver version.
● For details about how to use other commands, see Atlas 800 Training Server npu-smi
Command Reference (Model 9010).

----End

Important Notes
● After the .run installation package is used, do not manually set the
environment variable export LD_LIBRARY_PATH to the original SO file of
the .rar package. Otherwise, the tool in the .run installation package may
connect to the dynamic library of the earlier version. The third-party library
file path and non-run installation package release library file path are not
affected.
● Observe the following when viewing logs: The log time is the system time.
The time on the device is the same as that on the host. You can run the date
command to change the time on the host.
For example, to set the system time to 17:55:55, run the date -s 17:55:55
command.

1.4 Uninstalling the Driver and Firmware


Procedure
The driver and firmware can be uninstalled in any sequence. Replace the asterisks
(*) in the software packages based on the actual situation.


Step 1 Log in to the operating environment as user root.


Step 2 Two uninstallation methods are supported. Users can select either one based on
the actual situation.
● To use the software package for uninstallation, run the following command in
the directory where the .run package is stored, for example, /opt:
./*.run --uninstall
● Run the following command in any directory for uninstallation:
bash {install_path}/{package_name}/script/uninstall.sh
NOTE

Replace install_path with the actual installation path, package_name with the actual
package name, the firmware package with the actual firmware, and the driver
package with actual driver.
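
The second method can be wrapped so that a wrong path is caught before anything runs. The following is a minimal sketch; the function names are illustrative, and the example values /usr/local/Ascend and driver are the defaults mentioned elsewhere in this guide.

```shell
#!/bin/sh
# Build the path of the uninstall script from {install_path} and {package_name}.
uninstall_script() {
    echo "$1/$2/script/uninstall.sh"
}

# Run the uninstall script only if it exists; return 1 otherwise.
run_uninstall() {
    script=$(uninstall_script "$1" "$2")
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    bash "$script"
}

# Example (as root): run_uninstall /usr/local/Ascend driver
```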

Step 3 If no error information is displayed during the uninstallation, the uninstallation is
successful. Determine whether to reboot the server based on the displayed
information to complete the uninstallation.

----End


2 Upgrade

2.1 Before You Start
2.2 Preparations for Upgrade
2.3 Upgrading Components (.run)
2.4 Upgrading Components (.deb)

2.1 Before You Start


Update Impacts
Do not perform other maintenance operations during the upgrade.

During the Atlas 800 training server (model 9010) software upgrade, the system
needs to be reset, which interrupts services.

Precautions
Table 2-1 lists the precautions for the upgrade of the Atlas 800 training server
(model 9010).

Table 2-1 Precautions for the upgrade

No.  Description

1    Before the upgrade, read this document carefully to ensure that you have
     learned all the content. For any problems or suggestions pertaining to the
     document, contact Huawei technical support.

2    To reduce the impact on services, switch services to other nodes or
     perform the upgrade during off-peak hours.

3    After the upgrade, ensure that the versions of the components are
     consistent.

4    Ensure that the system is running properly and the firmware upgrade tool
     and its dependent drivers are correctly loaded. Ensure that the operating
     environment is working properly. Otherwise, reset the system before
     performing the upgrade. For details, see 3.3 How Do I Check Whether the
     Device Is Running Properly.

5    Do not modify the /etc/ascend_install.info file unless necessary.
     Otherwise, system functions will be unavailable.

6    When upgrading the DEB packages from 20.0.0, 20.1.0, and patch versions to
     20.2.0 or a later version in default mode (not installed by specified
     users), uninstall all NPU-related DEB packages and then reinstall them.
     You can view the list of installed software in the
     /etc/ascend_install.info file.

7    After the driver version rollback of a server with Ascend 910 Pro B as the
     NPU, determine whether the device is available based on whether the NPU
     firmware version can be queried. For details about how to query the NPU
     firmware version, see Step 9.

Version Requirements
You are advised to use the driver version and firmware version in the same
software version list of Atlas 800 training server (model 9010) to ensure that they
match each other.

Update Process
Upgrade software packages in the sequence of firmware > driver.

2.2 Preparations for Upgrade


Performing a Pre-upgrade Check
Perform a check according to Table 2-2 and record the check results.


Table 2-2 Pre-upgrade checklist


No.  Item                         Criteria

1    Check the software version.  1. Query and record the version of Atlas 800 training server
                                     (model 9010) in the current system.
                                  2. Determine the target version.

2    System status                Check alarms of the Atlas 800 training server (model 9010).
                                  ● If no active alarm exists, perform the upgrade.
                                  ● If there are active alarms, contact Huawei technical support
                                    to confirm the alarms.

Table 2-3 lists the software versions supported by different processors.

Table 2-3 Software version mapping


Processor          Type                                     Applicable Version

Ascend 910 A       Atlas 800 training server (model 9010)   20.0.0, 20.1.0, 20.2.0
Ascend 910 B       Atlas 800 training server (model 9010)   20.0.0, 20.1.0, 20.2.0
Ascend 910 Pro B   Atlas 800 training server (model 9010)   20.2.0.SPC300

Obtaining the Upgrade Packages


To obtain an upgrade package, perform the following steps:
1. Log in to the A800-9010.
2. Select the target version A800-9010 X.X.X.
For details about the mapping between the firmware, driver, and CANN, see
CANN Version Mapping.
3. On the page of the required version, download the related upgrade packages
to the client (local PC). Table 2-4 describes the upgrade packages.

NOTE

● In the following sections, x.x.x indicates the software version.


● The A800-9010-npu-driver_x.x.x_linux-x86_64.run is compatible with all operating
systems.


Table 2-4 Upgrade package information


Hardware Form: Atlas 800 training server (model 9010): x86_64

Firmware Package:
A800-9010-npu-firmware_<version>.run
A800-9010-npu-firmware_<version>.deb
NOTE: Processor firmware upgrade package of the Atlas 800 training server (model 9010).

Host OS Version   OS Kernel Version       Driver Package Name

Debian 9.9        4.9.0-9-amd64           A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run
                                          A800-9010-npu-driver_x.x.x_debian9.9-x86_64.deb
CentOS 7.6        3.10.0-957.el7.x86_64   A800-9010-npu-driver_x.x.x_centos7.6-x86_64.run
Ubuntu 18.04      4.15.0-45-generic (1)   A800-9010-npu-driver_x.x.x_ubuntu18.04-x86_64.run
CentOS 8.2        4.18.X (2)              A800-9010-npu-driver_x.x.x_linux-x86_64.run
Debian 10.0 (3)   kernel4.19.0-5-amd64    A800-9010-npu-driver_x.x.x_debian10.0-x86_64.run
                                          A800-9010-npu-driver_x.x.x_debian10.0-x86_64.deb
BC_Linux 7.6      4.19                    A800-9010-npu-driver_x.x.x_linux-x86_64.run

NOTE
(1) If the kernel version does not match the OS version, install DKMS first. For
    details about how to install DKMS, see 3.1 Driver Source Code Compilation.
(2) Upgrading to 5.6.14 is supported.
(3) Debian 10.0 is supported only by version 21.0.rc1 or later.

Checking the Software Package Integrity


To prevent a software package from being maliciously tampered with during
transfer or storage, also download the corresponding digital signature file
when you download the software package, and use it for integrity verification.
After the software package is downloaded from the Huawei Support website,
verify its PGP digital signature. See OpenPGP Signature Verification Guide. If the
verification fails, do not use the software package, and contact Huawei technical
support engineers.
Before a software package is used in installation or update, its digital signature
also needs to be verified according to OpenPGP Signature Verification Guide to
ensure that the software package is not tampered with.
For carriers: http://support.huawei.com/carrier/digitalSignatureAction
For enterprise users: https://support.huawei.com/enterprise/en/tool/pgp-verify-TL1000000054
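
The verification itself is described in OpenPGP Signature Verification Guide. As a rough illustration only, the following sketch wraps the standard gpg --verify command. It assumes that the Huawei public key has already been imported as that guide describes and that the signature file is named after the package with an .asc suffix; both are assumptions, not statements from this guide.

```shell
#!/bin/sh
# verify_package PACKAGE [SIGNATURE]
# SIGNATURE defaults to PACKAGE.asc (assumed naming convention).
verify_package() {
    pkg=$1
    sig=${2:-$1.asc}
    if [ ! -f "$pkg" ] || [ ! -f "$sig" ]; then
        echo "missing package or signature file" >&2
        return 2
    fi
    # gpg exits with a non-zero status if the signature does not verify.
    gpg --verify "$sig" "$pkg"
}

# Example (file name is a placeholder):
#   verify_package /opt/A800-9010-npu-firmware_x.x.x.run || echo "do not use this package"
```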

2.3 Upgrading Components (.run)


Upgrade components by upgrading software packages in the sequence of
firmware > driver.


2.3.1 Upgrading the Ascend 910 AI Processor Firmware


The Ascend 910 AI processor firmware upgrade is supported by the Atlas 800
training server (model 9010). This section uses the A800-9010-npu-
firmware_<version>.run package of the Atlas 800 training server (model 9010) as
an example to describe how to upgrade the firmware.

Procedure
Step 1 Obtain the A800-9010-npu-firmware_<version>.run package. For details, see 2.2
Preparations for Upgrade.

Step 2 Log in to the Atlas 800 training server (model 9010) as the root user.

Step 3 Upload the A800-9010-npu-firmware_<version>.run package to any directory on
the Linux OS, for example, /opt.

Step 4 Go to the directory where the A800-9010-npu-firmware_<version>.run package
is stored, for example, /opt.

cd /opt

Step 5 Run the following command to change the permission on the
A800-9010-npu-firmware_<version>.run package:

chmod u+x A800-9010-npu-firmware_<version>.run

Step 6 Run the ./A800-9010-npu-firmware_<version>.run --check command to check
the consistency and integrity of the .run installation package.

Step 7 Upgrade the firmware.

Run the ./A800-9010-npu-firmware_<version>.run --upgrade command to
upgrade the firmware.

If information similar to the following is displayed, the upgrade is successful:

Firmware package install success! Reboot needed for installation/upgrade to take effect!

NOTE

● In the package name, <version> indicates the firmware version.


● The logs generated during the installation are recorded in the /var/log/ascend_seclog/
ascend_install.log file. You can run the vim /var/log/ascend_seclog/ascend_install.log
command to open the log file.

Step 8 Restart the system.

reboot

Step 9 Check the target version.

In the installation directory, run the following command to check whether the
target version is correct:

cat version.info


NOTE

The default installation path is /usr/local/Ascend/firmware.

----End
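
Steps 4 to 7 of the preceding procedure can be chained so that the upgrade runs only if the integrity check passes. The following is a minimal sketch; the function name is illustrative, and the directory and package name are passed in by the caller.

```shell
#!/bin/sh
# upgrade_run_package DIR PACKAGE
# Runs chmod, --check, and --upgrade in order and stops at the first failure.
# The body runs in a subshell so the caller's working directory is unchanged.
upgrade_run_package() (
    cd "$1" &&
    chmod u+x "$2" &&
    "./$2" --check &&
    "./$2" --upgrade
)

# Example (as root; replace x.x.x with the actual version):
#   upgrade_run_package /opt A800-9010-npu-firmware_x.x.x.run && reboot
```

The same sequence applies to the driver .run package described in the next section.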

2.3.2 Upgrading the Ascend 910 AI Processor Driver


Scenario
This section describes how to upgrade the driver of the Ascend 910 AI Processor in
the Atlas 800 training server (model 9010).
The .run upgrade package supports one-click upgrade. This section uses the
A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run package of the Atlas 800
training server (model 9010) as an example. The specific operations are subject to
the driver package of the host system.

NOTE

Upgrading the driver does not change the username and password of the system.

Impact on the System


During the driver upgrade of the Atlas 800 training server (model 9010), the
system needs to be reset, which will interrupt services. To reduce the impact on
services, switch the services to other devices before the upgrade.

Procedure
Step 1 Obtain the driver package A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run.
For details, see 2.2 Preparations for Upgrade.
Step 2 Log in to the Atlas 800 training server (model 9010) as the root user.
Step 3 Upload the A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run package to any
directory on the Linux OS, for example, /opt.
Step 4 Go to the directory where the A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run
package is stored, for example, /opt.
cd /opt
Step 5 Run the following command to change the permission on the A800-9010-npu-
driver_x.x.x_debian9.9-x86_64.run package:
chmod u+x A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run
Step 6 Run the ./A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run --check command
to check the consistency and integrity of the .run package.
Step 7 Upgrade the driver.
You can run the ./A800-9010-npu-driver_x.x.x_debian9.9-x86_64.run --upgrade
command to upgrade the driver.
If information similar to the following is displayed, the upgrade is successful:


Driver package install success! Reboot needed for installation/upgrade to take effect!

NOTE

● During the driver upgrade, the dynamic library libdcmi.so and header file
dcmi_interface_api.h are copied to the /usr/local/dcmi/ directory.
● For details about how to resolve problems that may occur during the driver upgrade,
see 3 FAQs.
● During the driver upgrade, the log information about the Ascend 910 driver is generated
in the /var/log/ascend_seclog/ascend_install.log file.

Step 8 Restart the system.

reboot

Step 9 Check the target version.

In the installation directory, run the following command to check whether the
target version is correct:

cat version.info

NOTE

The default installation path is /usr/local/Ascend/driver.


● If you cannot log in to the host OS after the upgrade, contact Huawei
technical support.
● If the target version is not the correct version or the upgrade fails, perform
the upgrade again. If the upgrade still fails, record the fault information and
the operations you have performed, and contact Huawei technical support.

----End

2.4 Upgrading Components (.deb)

2.4.1 Using dpkg


Installation or Upgrade
Install or upgrade the .deb driver and firmware packages in the local source.
Replace <version> with the actual version number in the following commands.

Install or upgrade the driver and then the firmware:

dpkg -i A800-9010-npu-driver_<version>_debian9.9-x86_64.deb
dpkg -i A800-9010-npu-firmware_<version>.deb

Uninstalling the Driver and Firmware


Uninstall the firmware and then the driver:

dpkg -r ascend910-firmware
dpkg -r ascend910-driver


NOTE

If the system reports that a software package needs to be reinstalled when you uninstall
the firmware and driver, the earlier installation of that package failed. In this case,
reinstall the software package and ensure that the installation is successful. If the
software package fails to be installed again, rectify the fault by referring to 3.6 How Do I
Handle a .deb Package Installation or Upgrade Failure?.

Querying the Installation Information about the .deb Driver and Firmware
Packages
dpkg -l | grep Ascend
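
To list only the package names and versions, the dpkg -l output can be filtered by the package names used in the uninstallation commands above. The following is a minimal sketch, assuming the standard dpkg -l column layout (status, name, version, architecture, description). The function name is illustrative.

```shell
#!/bin/sh
# Read dpkg -l output on stdin and print "name version" for each
# package whose name starts with ascend910-.
ascend_packages() {
    awk '$2 ~ /^ascend910-/ {print $2, $3}'
}

# Example: dpkg -l | ascend_packages
```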

2.4.2 Using apt-get


NOTE

The firmware package has an internal dependency on the driver package because the
firmware upgrade uses the upgrade tool in the driver package.

Installation or Upgrade
Install or upgrade the .deb driver and firmware packages in the local source:
● Install or upgrade the driver and then the firmware:
apt-get install ascend910-driver
apt-get install ascend910-firmware
● Install or upgrade the firmware:
apt-get install ascend910-firmware
NOTE

The firmware depends on the driver. This command installs the driver first and then
the firmware.

Uninstalling the Driver and Firmware


● Uninstall the firmware and then the driver:
apt-get remove ascend910-firmware
apt-get remove ascend910-driver
NOTE

If the system reports that a software package needs to be reinstalled when you
uninstall the firmware and driver, the earlier installation of that package failed. In
this case, reinstall the software package and ensure that the installation is
successful. If the software package fails to be installed again, rectify the fault by
referring to 3.6 How Do I Handle a .deb Package Installation or Upgrade Failure?.
● Uninstall the driver.
apt-get remove ascend910-driver
NOTE

The firmware depends on the driver. This command uninstalls the firmware first and
then the driver.


Re-installation
apt-get --reinstall install ascend910-firmware
apt-get --reinstall install ascend910-driver

Rolling Back to the Source Version


apt-get install ascend910-firmware=1.73.1405.6.50
apt-get --reinstall install ascend910-driver=20.0.rc1.spc100

Querying the Installation Information about the .deb Driver and Firmware
Packages
Run either of the following commands:
apt list ascend*
dpkg -l | grep Ascend

Default Installation
The default user is root. You can run the Installation or Upgrade commands
directly without extra operations.

(Optional) To allow any non-root account to use Ascend AI Processors, set the
following environment variable before the installation:

export ASCEND_INSTALL_FOR_ALL=true

NOTE

If this variable is set when the installation or upgrade command is run, all users have the
same permissions as the installation group on the directories and files created by the
installation program. Make sure the security risks are considered before you set this
variable.

Specifying an Installation User


NOTE

Before running the installation or upgrade command, set the environment variables.

Step 1 (Optional) To allow any non-root account to use Ascend AI Processors, set the
following environment variable before the installation:

export ASCEND_INSTALL_FOR_ALL=true

NOTE

If this variable is set when the installation or upgrade command is run, all users have the
same permissions as the installation group on the directories and files created by the
installation program. Make sure the security risks are considered before you set this
variable.

Step 2 Set environment variables.

1. Set environment variables in export mode. In this mode, the environment
variables take effect immediately but are valid only in the current window.
export ASCEND_USER_GROUP=usergroup
export ASCEND_USER_NAME=username
2. Modify the ~/.bashrc file to set the permanent environment variables.
a. Run the vi ~/.bashrc command in any directory as the installation user to
open the .bashrc file and append the preceding lines to the file.
b. Run the :wq! command to save the file and exit.
c. Run the source ~/.bashrc command for the modification to take effect
immediately.
Step 3 Run the following commands to create the user group and the user, and to add
the user to the group:
groupadd ${ASCEND_USER_GROUP}
useradd -g ${ASCEND_USER_GROUP} -d /home/${ASCEND_USER_NAME} -m ${ASCEND_USER_NAME} -s /bin/bash
Step 4 Run the Installation or Upgrade commands.
NOTE

Run the following commands to clear the environment variables related to the installation:
unset ASCEND_USER_NAME
unset ASCEND_USER_GROUP
unset ASCEND_INSTALL_FOR_ALL

----End
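
If the group or user may already exist, creating them unconditionally fails. The following is a minimal sketch of an idempotent variant of Step 3, assuming ASCEND_USER_GROUP and ASCEND_USER_NAME are already exported as in Step 2. The function names are illustrative, and getent is assumed to be available on the host.

```shell
#!/bin/sh
# Succeed if the named group or user already exists.
group_exists() { getent group "$1" >/dev/null 2>&1; }
user_exists()  { id -u "$1" >/dev/null 2>&1; }

# Create the group and the user only if they are missing (run as root).
ensure_install_user() {
    group_exists "$ASCEND_USER_GROUP" || groupadd "$ASCEND_USER_GROUP" || return 1
    user_exists "$ASCEND_USER_NAME" || useradd -g "$ASCEND_USER_GROUP" \
        -d "/home/$ASCEND_USER_NAME" -m -s /bin/bash "$ASCEND_USER_NAME"
}

# Example (as root, after Step 2): ensure_install_user
```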


3 FAQs

3.1 Driver Source Code Compilation


3.2 Software Package Unavailable
3.3 How Do I Check Whether the Device Is Running Properly
3.4 BC_Linux Image Source Configuration
3.5 Inconsistent Driver and Firmware Versions Due to Incorrect Upgrade Sequence
3.6 How Do I Handle a .deb Package Installation or Upgrade Failure?
3.7 Driver 20.2.0 or Later Failed to Be Rolled Back to an Earlier Version

3.1 Driver Source Code Compilation


The driver source code can be compiled automatically or manually. The automatic
compilation is implemented by the DKMS tool. You need to install DKMS first.
Switch to the root user before performing the following operations.

Prerequisites
Run the following command to check whether the software packages, such as
DKMS, GCC, and linux-header, have been installed:
● Ubuntu/Debian:
dpkg-query -s dkms (Run this command to query DKMS only when you want
to use the automatic compilation mode.)
dpkg-query -s gcc
dpkg-query -s linux-headers-$(uname -r)
● CentOS/BC_Linux:
rpm -qa | grep dkms (Run this command to query DKMS only when you
want to use the automatic compilation mode.)
rpm -qa | grep gcc
rpm -qa | grep kernel-*-headers-$(uname -r)
rpm -qa | grep kernel-*-devel-$(uname -r)


NOTE

● For Ubuntu or Debian, you need to install the DKMS, GCC, and linux-header software
packages. If the software packages are not installed, obtain them from the official
websites and install them.
● For CentOS or BC_Linux, you need to install the DKMS, GCC, kernel-*-headers, and
kernel-*-devel software packages. If the software packages are not installed, obtain
them from the official websites and install them.
● You are advised to install the dkms-2.6.1-1.el7.noarch.rpm software package.
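
The individual queries above can be combined into one check that prints only the packages that are missing. The following is a minimal sketch; the function name is illustrative, and the query command is passed in so that the same helper works with dpkg-query -s or rpm -q.

```shell
#!/bin/sh
# missing_packages "QUERY_CMD" PKG...
# Prints each package for which QUERY_CMD PKG returns a non-zero status.
missing_packages() {
    check=$1
    shift
    for p in "$@"; do
        # $check is intentionally unquoted so "dpkg-query -s" splits into words.
        $check "$p" >/dev/null 2>&1 || echo "$p"
    done
}

# Ubuntu/Debian:
#   missing_packages "dpkg-query -s" dkms gcc "linux-headers-$(uname -r)"
# CentOS/BC_Linux (kernel header package names vary; check them with the
# rpm -qa | grep commands listed above):
#   missing_packages "rpm -q" dkms gcc
```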

Automatic Compilation
During the installation of the .run software package, if the kernel version is
inconsistent with the driver image version in the software package, compilation of
the driver source code is performed automatically before the software package
installation.

For details about how to install the .run software package, see 1.3 Installing the
Driver and Firmware.

Manual Compilation
During the installation, if the kernel version is consistent with that of the Driver
image in the .run package, automatic compilation of the Driver source code will
not be triggered. In the subsequent use, however, if the kernel is updated, you
need to manually execute the script for compiling the Driver source code in the
installation path of the .run software package.

● Find the script in RUN Installation Path/driver/script.


● Two scripts are provided:
– run_driver_dkms_install.sh
NOTE

This script compiles the Driver source code and adds the code to the DKMS
framework. After the script is executed, the Driver image linked to the .run
package is automatically updated.
– run_driver_dkms_uninstall.sh
NOTE

This script uninstalls the Driver source code and removes the code from the
DKMS framework. The removal does not affect the installed .run package, but
automatic compilation is no longer supported after the removal.

3.2 Software Package Unavailable


If the kernel is updated in the environment where the software package is
installed, the environment will fail to be restarted. The problem persists after a
new software package is installed. The cause is that the .ko driver has been
packed to the rootfs of the kernel during the installation of the software package.


Figure 3-1 A software package is unavailable.

Solution
Step 1 Manually uninstall the .ko driver.
1. Query the .ko driver list.
lsmod|grep drv
2. Uninstall all the queried .ko drivers. Separate .ko drivers with spaces in the
uninstallation command. For example, to uninstall .ko drivers ko1 and ko2,
run the following command:
rmmod ko1 ko2
Step 2 Pack the rootfs.
dracut --force
Step 3 Restart the environment.
reboot

----End
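
The module list for Step 1 can be built from the lsmod output instead of being typed by hand. The following is a minimal sketch that applies the same drv filter as the query above. The function name is illustrative, and the sketch does not order modules by dependency, so rmmod may need to be repeated if one module is still in use by another.

```shell
#!/bin/sh
# Read lsmod output on stdin and print the names of modules
# whose line contains "drv", one per line.
drv_modules() {
    grep drv | awk '{print $1}'
}

# Example (as root):
#   rmmod $(lsmod | drv_modules)
```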

3.3 How Do I Check Whether the Device Is Running Properly
Step 1 Log in to the operating environment as user root and query the installation path
of a software package.
cat /etc/ascend_install.info
Information similar to the following is displayed.
Driver_Install_Path_Param=/usr/local/Ascend

Step 2 Go to the driver installation path and use the upgrade-tool to check the version
of the file system running on the device.


cd /usr/local/Ascend/driver/tools/
./upgrade-tool --device_index -1 --system_version
If the query is successful, the device is started properly and the information similar
to the following is displayed.

----End
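
The two steps above can be combined so that the tool path follows the recorded installation path rather than the default. The following is a minimal sketch, assuming /etc/ascend_install.info contains a Driver_Install_Path_Param= line as shown in Step 1. The function names are illustrative.

```shell
#!/bin/sh
# Print the driver installation path recorded in the install info file.
# The file defaults to /etc/ascend_install.info.
driver_install_path() {
    sed -n 's/^Driver_Install_Path_Param=//p' "${1:-/etc/ascend_install.info}"
}

# Print the full path of upgrade-tool under that installation path.
upgrade_tool_path() {
    echo "$(driver_install_path "$1")/driver/tools/upgrade-tool"
}

# Example (as root):
#   "$(upgrade_tool_path)" --device_index -1 --system_version
```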

3.4 BC_Linux Image Source Configuration


If a driver fails to be installed and the system displays a message indicating that
the dkms and kernel-headers are missing, configure the image source. The
following uses BC_Linux as an example.

Procedure
Step 1 Log in to the operating environment as user root.
Step 2 Save the original .repo file as a backup file.
mv /etc/yum.repos.d/BCLinux-Base.repo /etc/yum.repos.d/BCLinux-Base.repo.bak
Step 3 On the KVM, mount the OS ISO file.
1. Log in to the remote virtual console.

2. Click on the toolbar.


The virtual CD/DVD-ROM drive toolbar is displayed, as shown in Figure 3-2.

Figure 3-2 Virtual CD/DVD-ROM drive toolbar

3. Select Directory and click Browse.


4. Select the folder where the downloaded ISO file is located.
5. Click Insert in the virtual CD/DVD-ROM drive toolbar.
Step 4 Open the CLI.
Step 5 Mount the ISO file to a specified directory, for example, /mnt.
mount /dev/cdrom /mnt
Step 6 Edit the /etc/yum.repos.d/iso.repo file.
vi /etc/yum.repos.d/iso.repo


Press i to edit the file and add the following content to the file:
[base-local]
name=iso
baseurl=file:///mnt
enabled=1
gpgcheck=0

After the modification is complete, press Esc to exit editing mode and enter :wq!
to save the modification and exit.
Step 7 Update the Yum source.
yum clean all
yum makecache
Step 8 View the .repo list.
yum repolist
If the following information is displayed, the local source is configured
successfully:

----End
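
The repository file from Step 6 can also be written without an interactive editor. The following is a minimal sketch; the function name is illustrative. It uses the enabled key, which is the spelling that yum recognizes, and takes the target file and the mount point as arguments.

```shell
#!/bin/sh
# write_iso_repo REPO_FILE MOUNT_POINT
# Writes a local repository definition that points at the mounted ISO.
write_iso_repo() {
    cat > "$1" <<EOF
[base-local]
name=iso
baseurl=file://$2
enabled=1
gpgcheck=0
EOF
}

# Example (as root, after mounting the ISO on /mnt):
#   write_iso_repo /etc/yum.repos.d/iso.repo /mnt
#   yum clean all && yum makecache
```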

3.5 Inconsistent Driver and Firmware Versions Due to Incorrect Upgrade Sequence
After the driver and firmware are installed, the Ascend 910 device information
cannot be obtained by running the npu-smi info command. The possible cause is
that the driver and firmware are not upgraded in the sequence specified in the
upgrade guide. As a result, the driver and firmware versions are inconsistent. Use
the following method to resolve the problem.

Checking the Version of the Firmware Installed in the System


Step 1 Obtain the Ascend 910 device information in the system.
[root@localhost ~]# lspci | grep d801
01:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)
02:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)
41:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)
42:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)
81:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)
82:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)
c1:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)
c2:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)

Step 2 Obtain the firmware version of a specified Ascend 910 device.


[root@localhost ~]# lspci -s 42:00.0 -xxx
42:00.0 Processing accelerators: Huawei Technologies Co., Ltd. Device d801 (rev 20)


00: e5 19 01 d8 46 05 10 00 20 00 00 12 08 00 80 00
10: 0c 00 00 00 08 28 00 00 04 00 00 c0 00 00 00 00
20: 0c 00 00 00 00 28 00 00 00 00 00 00 00 02 00 01
30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
40: 10 a0 02 00 e2 8f 64 00 3f 21 09 00 04 f9 43 00
50: 08 00 04 01 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 10 00 10 00 00 00 00 00 1e 1e 8f 01
70: 04 00 1f 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 03 a0 00 00 00 00 00 00
a0: 11 b0 ff 83 00 00 01 00 00 40 01 00 00 00 00 00
b0: 01 b8 03 00 08 00 00 00 09 c8 10 01 02 00 00 00
c0: 50 b0 11 00 75 00 00 00 09 dc 14 02 02 00 00 00
d0: 16 b1 11 00 00 75 00 00 16 b1 11 00 09 ec 10 03
e0: 02 00 00 00 00 75 00 00 00 00 00 00 09 00 10 04
f0: 02 00 00 00 00 30 00 00 80 00 00 00 00 00 00 00

NOTE

In the preceding example, the red information in the d0 address indicates that the firmware
version is C75B116.

----End
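
To repeat Step 2 for every device found in Step 1, the bus addresses can be extracted from the lspci output. The following is a minimal sketch, assuming the device lines have the form shown in Step 1. The function name is illustrative.

```shell
#!/bin/sh
# Read lspci output on stdin and print the bus address (first column)
# of every line that mentions device ID d801.
ascend_bdfs() {
    awk '/d801/ {print $1}'
}

# Example: dump the configuration space of each Ascend 910 device.
#   for bdf in $(lspci | ascend_bdfs); do lspci -s "$bdf" -xxx; done
```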

Checking the Version of the Driver Installed in the System


In the software package installation path, for example, the default path of the
root user /usr/local/Ascend/driver, run the following command to check whether
the target version is correct:

cat version.info
Version=1.75.T11.0.B116

NOTE

In the preceding example, the version is C75B116.
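
For comparison with the firmware version, the Version= value can be shortened in the same way as in the example. The following sketch infers the rule from that single example (the letter C, then the second dot-separated field, then the last field); treat the rule as an assumption rather than a documented format.

```shell
#!/bin/sh
# Convert a version such as 1.75.T11.0.B116 to the short form C75B116.
# Rule inferred from one example: "C" + second field + last field.
short_version() {
    echo "$1" | awk -F. '{print "C" $2 $NF}'
}

# Example: short_version "$(sed -n 's/^Version=//p' version.info)"
```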

Solution
If the preceding command output indicates that the driver version is inconsistent
with the firmware version, perform the following operations:

1. Upgrade the driver to the same version as the firmware version.


2. Restart the host system.
3. Upgrade the firmware to the target version.
4. Upgrade the driver to the target version.
5. Restart the host system.

3.6 How Do I Handle a .deb Package Installation or Upgrade Failure?
Symptom
If a .deb package fails to be installed or upgraded, residual cache information
causes subsequent installation or upgrade attempts to fail.


Solution
If the .deb package fails to be installed or upgraded, you need to manually delete
the cache information. The procedure is as follows:

Step 1 Delete the configuration file list of the failed package (the ascend910-driver
package is used as an example):
rm /var/lib/dpkg/info/ascend910-driver*

Step 2 Run the following forcible deletion command:


dpkg --remove --force-remove-reinstreq ascend910-driver

Step 3 Delete the residual files in the directory of the software that fails to be installed.
(The directory in the example is the default installation directory. Replace it with
the actual installation directory.)
rm -rf /usr/local/Ascend/driver

----End
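
The three steps above can be collected into one script. The following is a minimal sketch, not a vendor-provided tool: the function names run and clean_failed_deb and the DRY_RUN switch are hypothetical. Because the commands delete files irreversibly, the sketch only prints them unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of Steps 1-3 for a failed .deb package.
# Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: package name, $2: installation directory of the failed software
clean_failed_deb() {
    pkg=$1
    dir=$2
    # Step 1: delete the configuration file list of the failed package.
    # The glob is expanded by the shell before run() is called.
    run rm /var/lib/dpkg/info/"$pkg"*
    # Step 2: forcibly remove the package record.
    run dpkg --remove --force-remove-reinstreq "$pkg"
    # Step 3: delete the residual files in the installation directory.
    run rm -rf "$dir"
}

clean_failed_deb ascend910-driver /usr/local/Ascend/driver
```

Review the printed commands first, then rerun as the root user with DRY_RUN=0 and the actual installation directory.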

3.7 Driver 20.2.0 or Later Failed to Be Rolled Back to an Earlier Version
If the message "Device_images_crl_check failed" is displayed when a driver is rolled
back from 20.2.0 or later to an earlier version, perform the following operations:

Step 1 Log in to the operating environment as the root user and run the following
command to uninstall the driver. The default installation path
/usr/local/Ascend/driver/ is used as an example.
[root@localhost ~]# /usr/local/Ascend/driver/script/uninstall.sh

Step 2 If the system prompts you to restart the system for the uninstallation to take
effect, run the following command to restart the system. Skip this step if a message
is displayed indicating that the uninstallation takes effect immediately.
[root@localhost ~]# reboot

Step 3 Run the following command to delete the ascend_check directory from the root
path:
[root@localhost ~]# rm -rf /root/ascend_check

Step 4 Perform the rollback again.

----End

A Appendixes

A.1 Parameters
Description
One-click installation is supported in the command line. You can select parameters
as required to complete the installation. All parameters are optional.

Installation command format: ./*.run [options]

For details, see Table A-1.

NOTICE

If a parameter displayed by running the ./*.run --help command is not
described in the following table, the parameter is reserved or applies to other chip
versions. You do not need to pay attention to it.

Table A-1 Parameters supported by the installation packages

Parameter Description

--run Indicates the running mode, which installs only the files
required in a running scenario.

--devel Indicates the development mode, which contains the
header files that users need to use for development.
The firmware subpackage does not support this
parameter.

--full Full mode: installs all files.

--docker Docker mode: applies to the driver only. Other
components are installed in full mode by default.


--install-username=<username> Initial installation: You can specify the running user name.
Otherwise, HwHiAiUser is used by default.
Overwrite: The user name used in the last installation is adopted.
NOTE
● For security purposes, you are advised not to specify the root user.
● The user name and user group cannot be specified during
firmware installation. The user name and user group of the
driver are used.
● This parameter must be used together with
--install-usergroup=<usergroup>, and the value of username must be
the same as that of the created user (1.2.4 Creating a
Running User).

--install-usergroup=<usergroup> Initial installation: You can specify the running user group
name. Otherwise, HwHiAiUser is used by default.
Overwrite: The user group name used in the last installation is adopted.
NOTE
This parameter must be used together with
--install-username=<username>, and the value of usergroup must be the
same as that of the created user group (1.2.4 Creating a
Running User).

--install-path=<path> Specifies the installation directory. If the directory is not
specified, the default installation path /usr/local/Ascend
is used.
The running user must have the read and write
permissions on the specified installation path.
NOTE
The firmware installation path cannot be specified. The driver
installation path is shared.

--uninstall Uninstalls.

--noexec Does not run the installation script. This parameter is used
together with the --extract=path parameter.
Format: --noexec --extract=path

--extract=path Decompresses the installation package to a specified
directory.

--upgrade Performs upgrade, which takes effect immediately. The
upgrade can be performed only in the path where a
software package is stored.

--help/-h Displays the help information.

--check Checks the integrity of a software package.

--version Displays the version information.


--tar arg1 [arg2 ...] Runs the tar command on the software package. Use the
arguments following tar as the command arguments. For
example, the --tar xvf command indicates that the .run
package will be decompressed to the current directory.

--list Lists the files in a software package.

--info Displays detailed information of a package.

--quiet Indicates the silent installation, which skips interactive
messages.

--install-for-all If this parameter is included in the installation or upgrade
command, the permission of the directories and files is
changed to the group permission.
This parameter must be used together with one of
--run, --devel, --full, and --upgrade, for example,
./*.run --full --install-for-all.
NOTE
The firmware does not support this parameter.

--repack [package_name] Builds a new driver package.
package_name indicates the name of the newly built .run
driver package.
NOTE
If package_name is left blank, a file named <original driver
package name>-custom.run is generated in the current path.

--repack-path=<path> [package_name] Builds a new driver package in the specified path.
● path indicates the directory where the original .run
package is extracted.
● package_name indicates the name of the newly
built .run driver package.
NOTE
If package_name is left blank, a file named <original driver
package name>-custom.run is generated in the current path.

Example:

● Installation in full mode
– Unspecified installation directory: ./*.run --full
– Specified installation directory: ./*.run --full --install-path=<installation_path>
● Installation in Docker mode (applies to the driver only)
– Unspecified installation directory: ./*.run --docker
– Specified installation directory: ./*.run --docker --install-path=<installation_path>
● Installation in run mode
– Unspecified installation directory: ./*.run --run
– Specified installation directory: ./*.run --run --install-path=<installation_path>
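
Table A-1 states two pairing rules that are easy to overlook when assembling a command line: --install-username and --install-usergroup must be used together, and --install-for-all must be combined with one of --run, --devel, --full, and --upgrade. The following is a minimal sketch of a pre-check for these two rules. The function name check_run_options is hypothetical, and the sketch does not invoke any installation package.

```shell
#!/bin/sh
# Check the pairing rules from Table A-1 before invoking a .run package.
# Prints "ok" or an error reason; does not run any installer.
check_run_options() {
    has_user=0; has_group=0; has_all=0; has_mode=0
    for opt in "$@"; do
        case $opt in
            --install-username=*)  has_user=1 ;;
            --install-usergroup=*) has_group=1 ;;
            --install-for-all)     has_all=1 ;;
            --run|--devel|--full|--upgrade) has_mode=1 ;;
        esac
    done
    # Rule 1: user name and user group must be specified together.
    if [ "$has_user" != "$has_group" ]; then
        echo "error: --install-username and --install-usergroup must be used together"
        return 1
    fi
    # Rule 2: --install-for-all needs an installation or upgrade mode.
    if [ "$has_all" = 1 ] && [ "$has_mode" = 0 ]; then
        echo "error: --install-for-all requires --run, --devel, --full, or --upgrade"
        return 1
    fi
    echo "ok"
}

check_run_options --full --install-username=HwHiAiUser --install-usergroup=HwHiAiUser
check_run_options --install-for-all || true
```

The first call passes both rules. The second call reports an error because no installation mode is given.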

A.2 Scripts
The scripts in Table A-2 are automatically called to install, uninstall, and update
the .run software package. The script storage path is subject to the path specified
during the installation. The following uses the default path /usr/local/Ascend as
an example.

Table A-2 Scripts


Name                          Path                              Description

run_firmware_install.sh       /usr/local/Ascend/firmware/       Installs firmware.
run_firmware_uninstall.sh     /usr/local/Ascend/firmware/       Uninstalls firmware.
install_common_parser.sh      /usr/local/Ascend/firmware/       Parses the filelist.csv file.
install.sh                    /usr/local/Ascend/firmware/       Entry script for installing the firmware installation package.
host_sys_stop.sh              /usr/local/Ascend/                Stops the background processes.
host_sys_init.sh              /usr/local/Ascend/                Starts the background processes.
host_server_setup.sh          /usr/local/Ascend/                Obtains the system configuration information and invokes the host_sys_init.sh script.
host_servers_remove.sh        /usr/local/Ascend/                Removes startup configuration including host_sys_init.sh.
install.sh                    /usr/local/Ascend/driver/script/  Entry script for installing the driver installation package.
run_driver_dkms_install.sh    /usr/local/Ascend/driver/script/  Loads modules to DKMS.
run_driver_dkms_uninstall.sh  /usr/local/Ascend/driver/script/  Uninstalls modules from DKMS.
run_driver_install.sh         /usr/local/Ascend/driver/script/  Installs, uninstalls, or updates (in immediate or delayed mode) driver.
run_driver_uninstall.sh       /usr/local/Ascend/driver/script/  Uninstalls driver.
run_driver_tool_install.sh    /usr/local/Ascend/driver/script/  Installs the tool.
run_driver_tool_uninstall.sh  /usr/local/Ascend/driver/script/  Uninstalls the tool.
install_common_parser.sh      /usr/local/Ascend/driver/script/  Parses the filelist.csv file.
install_npudrv.sh             /usr/local/Ascend/driver/tools/   Script for privileged container, updating driver of the host in the container.

A.3 Tools
The following describes how to run the tools described in Table A-3.

Step 1 Log in to the host server as the running user (HwHiAiUser by default) specified
during runfile installation.
Step 2 Go to the tool directory, for example, /usr/local/Ascend/driver/tools. The
command format is: cd /usr/local/Ascend/driver/tools.
Step 3 Run the commands in Table A-3 to call the tools. The following uses the default
installation path of the driver as an example.

Table A-3 Related tools


Name Path Function Command

upgrade-tool  /usr/local/Ascend/driver/tools  Queries the firmware version and updates firmware.
NOTE
Run the following commands in the /usr/local/Ascend/firmware/tools directory.
cd /usr/local/Ascend/firmware/tools
dev_id = -1 indicates all devices.
● Lists all devices:
/usr/local/Ascend/driver/tools/upgrade-tool --mini_devices
● Obtains the version of a specified device:
/usr/local/Ascend/driver/tools/upgrade-tool --device_index <dev_id> --system_version
● Obtains the component information about a specified device:
/usr/local/Ascend/driver/tools/upgrade-tool --device_index <dev_id> --components
● Queries the version of a device component:
/usr/local/Ascend/driver/tools/upgrade-tool --device_index <dev_id> --component <type> --version
● Queries the device status:
/usr/local/Ascend/driver/tools/upgrade-tool --device_index <dev_id> --status
● Performs hot reset on the device:
/usr/local/Ascend/driver/tools/upgrade-tool --device_index <dev_id> --hot_reset
dev_id = -1 indicates that all devices are hot reset.
Hot reset of a single device is not allowed.

NOTE
– Before performing a hot reset, stop the services.
– In the current version, only the ARM architecture supports hot reset.
● Checks if a device is a physical device:
/usr/local/Ascend/driver/tools/upgrade-tool --device_index <dev_id> --phymachflag
NOTE
In the current version, only the firmware of a physical device can be upgraded.
● Updates the firmware of a device:
/usr/local/Ascend/driver/tools/upgrade-tool --device_index <dev_id> --component <type> --path <firmware_path>
where,
– --mini_devices: lists all devices.
– --device_index: device ID. The value can be 0–7 or -1. Values 0–7
indicate device IDs. The value -1 indicates all devices.
– --system_version: system version.
– --components: lists all valid components.
– --component: component name. When upgrading a single component, you
need to specify the component name. Currently, AI CPU can only be
upgraded separately. To upgrade all components, enter -1. To upgrade
all components and reset the password, enter 9.


– --version: obtains the component version.
– --status: obtains the device status.
– --path: relative path of the firmware package. To upgrade all
components, set this option to --path ./conf/upgrade.cfg. To upgrade a
single component, for example, nve.bin, set this option to
--path ../image/nve.bin.
– --hot_reset: hot resets the devices.
– --phymachflag: queries if a device is a physical device. If the device
is not a physical device, the firmware of the device cannot be upgraded.
– --async: supports asynchronous upgrade. That is, after an upgrade
request from the host is received, the device returns a success response
to the host. To query the upgrade result, use the --status option.
– --help: displays the help information.

hccn_tool  /usr/local/Ascend/driver/tools  Cluster network tool, which can be executed only by user root.
For details, see Ascend 910 HCCN Tool API Reference.

----End
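
The per-device queries in Table A-3 can be repeated for each device ID. The following is a minimal sketch that uses only the --device_index and --system_version options listed above and the device ID range 0–7. The TOOL and DRY_RUN variables and the function name query_versions are hypothetical. By default the commands are only printed, so the sketch can be reviewed on a host without the tool installed.

```shell
#!/bin/sh
# Print (or run) the version query for device IDs 0-7, the range given for
# --device_index in Table A-3. Commands are only printed unless DRY_RUN=0.
TOOL=${TOOL:-/usr/local/Ascend/driver/tools/upgrade-tool}
DRY_RUN=${DRY_RUN:-1}

query_versions() {
    dev_id=0
    while [ "$dev_id" -le 7 ]; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$TOOL --device_index $dev_id --system_version"
        else
            "$TOOL" --device_index "$dev_id" --system_version
        fi
        dev_id=$((dev_id + 1))
    done
}

query_versions
```

Whether all eight device IDs are present depends on the server configuration; list the devices with --mini_devices first and adjust the range if needed.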
