Professional Documents
Culture Documents
NIST SP 800-223
Yang Guo
Ramaswamy Chandramouli
Lowell Wofford
Rickey Gregg
Gary Key
Antwan Clark
Catherine Hinton
Andrew Prout
Albert Reuther
Ryan Adamson
Aron Warren
Purushotham Bangalore
Erik Deumens
Csilla Farkas
February 2024
Certain commercial entities, equipment, or materials may be identified in this document in order to describe an
experimental procedure or concept adequately. Such identification is not intended to imply recommendation or
endorsement by the National Institute of Standards and Technology (NIST), nor is it intended to imply that the
entities, materials, or equipment are necessarily the best available for the purpose.
There may be references in this publication to other publications currently under development by NIST in
accordance with its assigned statutory responsibilities. The information in this publication, including concepts and
methodologies, may be used by federal agencies even before the completion of such companion publications.
Thus, until each publication is completed, current requirements, guidelines, and procedures, where they exist,
remain operative. For planning and transition purposes, federal agencies may wish to closely follow the
development of these new publications by NIST.
Organizations are encouraged to review all draft publications during public comment periods and provide feedback
to NIST. Many NIST cybersecurity publications, other than the ones noted above, are available at
https://csrc.nist.gov/publications.
Authority
This publication has been developed by NIST in accordance with its statutory responsibilities under the Federal
Information Security Modernization Act (FISMA) of 2014, 44 U.S.C. § 3551 et seq., Public Law (P.L.) 113-283. NIST is
responsible for developing information security standards and guidelines, including minimum requirements for
federal information systems, but such standards and guidelines shall not apply to national security systems
without the express approval of appropriate federal officials exercising policy authority over such systems. This
guideline is consistent with the requirements of the Office of Management and Budget (OMB) Circular A-130.
Nothing in this publication should be taken to contradict the standards and guidelines made mandatory and
binding on federal agencies by the Secretary of Commerce under statutory authority. Nor should these guidelines
be interpreted as altering or superseding the existing authorities of the Secretary of Commerce, Director of the
OMB, or any other federal official. This publication may be used by nongovernmental organizations on a voluntary
basis and is not subject to copyright in the United States. Attribution would, however, be appreciated by NIST.
Publication History
Approved by the NIST Editorial Review Board on 2024-02-02.
Contact Information
sp800-223-comments@list.nist.gov
Additional Information
Additional information about this publication is available at https://csrc.nist.gov/pubs/sp/800/223/final, including
related content, potential updates, and document history.
All comments are subject to release under the Freedom of Information Act (FOIA).
NIST SP 800-223 High-Performance Computing Security
February 2024
Abstract
Security is essential component of high-performance computing (HPC). HPC systems often
differ based on the evolution of their system designs, the applications they run, and the
missions they support. An HPC system may also have its own unique security requirements,
follow different security guidance, and require tailored security solutions. Their complexity and
uniqueness impede the sharing of security solutions and knowledge. This NIST Special
Publication aims to standardize and facilitate the information and knowledge-sharing of HPC
security using an HPC system reference architecture and key components as the basis of an HPC
system lexicon. This publication also analyzes HPC threats, considers current HPC security
postures and challenges, and makes best-practice recommendations.
Keywords
high-performance computing; HPC reference architecture; HPC security; HPC security posture;
HPC threat analysis; security guidance.
i
NIST SP 800-223 High-Performance Computing Security
February 2024
ii
NIST SP 800-223 High-Performance Computing Security
February 2024
Table of Contents
1. Introduction ...................................................................................................................................1
2. HPC System Reference Architecture and Main Components ............................................................2
5. Conclusions ..................................................................................................................................19
iii
NIST SP 800-223 High-Performance Computing Security
February 2024
References.......................................................................................................................................20
Appendix A. HPC Architecture Variants.............................................................................................24
List of Figures
Fig. 1. HPC System Reference Architecture..........................................................................................3
iv
NIST SP 800-223 High-Performance Computing Security
February 2024
1. Introduction
In 2015, Executive Order 13702 established the National Strategic Computing Initiative (NSCI) to
maximize the benefits of high-performance computing (HPC) for economic competitiveness and
scientific discovery. The ability to process large volumes of data and perform complex
calculations at high speeds is a key part of the Nation’s vision for maintaining its global
competitive edge.
Security is essential to achieving the anticipated benefits of HPC. HPC systems bear some
resemblance to enterprise IT computing, which allows for the effective application of traditional
IT security solutions. However, they also have significant differences. An HPC system is designed
to maximize performance, so its architecture, hardware components, software stacks, and
working environment are very different from enterprise IT. Additionally, most HPC systems
serve multiple projects and user communities, leading to more complex security concerns than
those encountered in enterprise systems. As such, security solutions must be tailored to the
HPC system’s requirements. Furthermore, HPC systems are often different from one another
due to the evolution of their system designs, the applications they run, and the missions they
support. An HPC system frequently has its own unique method of applying security
requirements and may follow different security guidance, which can impede the sharing of
security solutions and knowledge.
This NIST Special Publication aims to standardize and facilitate the sharing of HPC security
information and knowledge through the development of an HPC system reference architecture
and key components, which are introduced as the basics of the HPC system lexicon. The
reference architecture divides an HPC system into four function zones. A zone-based HPC
reference architecture captures the most common features across the majority of HPC systems
and segues into HPC system threat analysis. Key HPC security characteristics and use
requirements are laid out alongside the major threats faced by the system and individual
function zones. HPC security postures, challenges, recommendations, and some currently
recognized variants are also included. This publication is intended to be a conceptual guide, not
a checklist of requirements.
Finally, emerging technologies — such as novel networking technology, data storage solutions,
hardware acceleration technology, and others — have the potential to influence the HPC
reference architecture and security posture. Consequently, this document will be updated as
needed.
1
NIST SP 800-223 High-Performance Computing Security
February 2024
2
NIST SP 800-223 High-Performance Computing Security
February 2024
3
NIST SP 800-223 High-Performance Computing Security
February 2024
optimize network performance. The popular HPC interconnect networking includes InfiniBand
[2], Omni-Path [3], Slingshot [4], Ethernet, and others.
A high-performance computing zone typically utilizes non-high-performance communication
networks, like ethernet, as cluster internal networks that connect the high-performance
computing zone with the management zone and access zone for traffic associated with
maintenance activities as opposed to HPC traffic.
4
NIST SP 800-223 High-Performance Computing Security
February 2024
Both GPFS and Lustre-based PFSs are prone to performance degradation when a certain
capacity threshold is reached. These file systems may be regularly pruned of unwanted files
with a strategy decided by the file system administrators. Some deployments will delete files
older than a certain age, which forces HPC users to transfer job output to a longer-term file
system, such as campaign storage. PFSs tends to be somewhat unreliable, depending on the
types of activities being performed by running jobs and users. Because these are distributed file
systems, file-system software must solve distributed locking of files to ensure deterministic file
updates when multiple clients are writing to the same file at once. Additionally, PFSs are
susceptible to denial-of-service conditions even during legitimate user operations, such as
listing a directory with millions of files or applications that perform poor file locking semantics.
5
NIST SP 800-223 High-Performance Computing Security
February 2024
• Remote-shared burst buffer architectures: Burst buffers are shared between multiple
HPC compute nodes that are hosted on an I/O node [8]. This is advantageous for
facilitating the development, deployment, and maintenance of these architectures.
There are also HPCs that can contain mixed burst buffer intermediate storage architectures,
which combine the strengths of node-local and remote-shared burst buffer architectures.
6
NIST SP 800-223 High-Performance Computing Security
February 2024
separate security posture helps with risk assessment and the selection of controls to secure the
management zone and manage the risk.
7
NIST SP 800-223 High-Performance Computing Security
February 2024
8
NIST SP 800-223 High-Performance Computing Security
February 2024
Vendor software includes low-level system tools to facilitate the running of other software. For
instance, remote direct memory access, inter-node memory sharing, performance counters,
temperature and power telemetry, and debugging are all vendor-provided software.
Users can choose specific versions of installed vendor and site-provided software libraries by
manipulating environment variables. Tools such as wrapper scripts or module files are usually
provided to help users find and choose which versions of installed software to use.
9
NIST SP 800-223 High-Performance Computing Security
February 2024
10
NIST SP 800-223 High-Performance Computing Security
February 2024
11
NIST SP 800-223 High-Performance Computing Security
February 2024
The management zone may also be implemented as a service running on a cloud via
virtualization technologies. In such cases, the risks associated with the cloud also apply to the
management zone.
12
NIST SP 800-223 High-Performance Computing Security
February 2024
Some organizations offer backup services using their own HPC systems, but these systems may
be in the same geographic locations and subject to the same environmental threats.
Finally, the data stored in a HPC system may contain sensitive information, such as personally
identifiable information (PII), patient health information (PHI), controlled unclassified
information (CUI), and more. Such data may require compliance with security standards, such
as HIPAA [39] and NIST SP 800-171 [40]. To protect this data, a range of privacy-preserving
technologies are available, such as data anonymization, obfuscation, randomization, masking,
and differential privacy. However, applying these technologies in a way that maintains HPC
usability and performance can be challenging.
13
NIST SP 800-223 High-Performance Computing Security
February 2024
14
NIST SP 800-223 High-Performance Computing Security
February 2024
network address translation (NAT) [46,3] or a proxy [47] can be installed to allow users on a
private network to access the internet and download new versions of software or share
software data. However, a NAT and Squid proxy can also be security risks that demand extra
caution and mitigation considerations.
Employing multiple physical networks is a common and effective means for access control and
fault tolerance, and it is highly recommended.
15
NIST SP 800-223 High-Performance Computing Security
February 2024
system that can remotely mount a file system will have complete control over all file system
data. This access could be used to gain privileges elsewhere within other zones of the HPC
security enclave.
Hashing is another technique for monitoring data integrity. Data files can be hashed at the
beginning to acquire hashing keys. A file is not modified if its hashing key remains the same.
Parallel file systems maintain many types of metadata (e.g., user ID, group ID, modification
time, checksum, etc.). Hashing metadata is another way to check whether a file has been
modified.
Periodically scanning file systems for malware is a proven technique for monitoring data
integrity. However, scanning an HPC system for malware is challenging. HPC data storage can
easily contain a petabyte or more of data. Existing malware scanning tools are mainly designed
for a single machine or laptop. They are not efficient or fast enough to scan large HPC data
storage systems. Furthermore, the scanning operation can adversely affect the performance of
running jobs. HPC systems may consider alternate implementation options for malware
scanning to alleviate some of the stated impacts. For example, conduct scanning at write, and
conduct subsequent batch scanning at read. Protection against ransomware attacks on HPC
data is especially important. It is recommended to conduct risk assessments of ransomware
attacks, and plan and implement mechanisms to protect the HPC system from ransomware
attacks, as applicable [48,40,40].
Protecting data integrity is vital to HPC security. Granular data access provides the best
protection and is highly recommended when possible.
16
NIST SP 800-223 High-Performance Computing Security
February 2024
escalation. Additionally, many orchestration and runtime tools expose APIs that must be
appropriately isolated and secured. However, when carefully implemented, container isolation
can reduce risk by restricting a container from performing disallowed operations. For instance,
data access to a container can be limited to the least required filesystem mounts, network
access can be limited through network namespaces, and tools like seccomp [35] can be used to
restrict containers from making disallowed system calls. 1
1 SP 800-190, Application Container Security Guide [65], offers a comprehensive guide on container security.
17
NIST SP 800-223 High-Performance Computing Security
February 2024
A Security Technical Implementation Guide (STIG) [51] is a configuration standard and offers a
security baseline that reflects security guidance requirements. The security checking tool can
measure how well the STIG is satisfied. However, available STIGs are typically written for
servers or desktops rather than for HPC systems. In addition, security baseline checking tools
developed for commodity operating systems and applications require customizations to run on
HPC. The Lawrence Livermore National Laboratory [52], Sandia National Laboratories [53], and
the Los Alamos National Lab [54] have collaborated with the Defense Information Systems
Agency (DISA) [55] to develop the TOSS 4 STIG [56], which is geared toward HPC systems. Still, a
more general STIG library and corresponding security checking tools are desirable to handle
diverse subsystems and components inside an HPC system.
18
NIST SP 800-223 High-Performance Computing Security
February 2024
5. Conclusions
Securing HPC systems is challenging due to their size; performance requirements; diverse and
complex hardware, software, and applications; varying security requirements; and the nature
of shared resources. The security tools suitable for HPC are inadequate, and current standards
and guidelines on HPC security best practices are lacking. The continuous evolution of HPC
systems makes the task of securing them even more difficult.
This Special Publication aims to set a foundation for standardizing and facilitating HPC security
information and knowledge-sharing. A zone-based HPC system reference architecture is
introduced to serve as a foundation for a system lexicon and captures common features across
the majority of HPC systems. HPC system threat analysis is discussed, security postures and
challenges are considered, and recommendations are made.
19
NIST SP 800-223 High-Performance Computing Security
February 2024
References
20
NIST SP 800-223 High-Performance Computing Security
February 2024
21
NIST SP 800-223 High-Performance Computing Security
February 2024
22
NIST SP 800-223 High-Performance Computing Security
February 2024
23
NIST SP 800-223 High-Performance Computing Security
February 2024
24
NIST SP 800-223 High-Performance Computing Security
February 2024
virtualization components can expose data and pose privilege escalation risks. Untrusted VMs
can also introduce supply chain risks into the HPC environment.
However, a well-architected virtualized HPC environment may offer security advantages. For
example, many virtualization cluster management and orchestration platforms provide
enforceable configurations and policies. Virtualized platforms with SDN can also provide the
ability to place individual multi-node HPC workloads in isolated compute zones, providing finer
grained trust domains than traditional HPC architecture.
Containerized HPC environments utilize container and orchestration technologies to virtualize
HPC environments. Many considerations applicable to VM-based virtualized HPC environments
also apply to containerized HPC environments. The primary distinction lies in the placement of
the abstraction layer. For virtualized environments, the operating system is abstracted from the
hardware. In containerized environments, the applications are abstracted from (i.e., parts of)
the operating system kernel, but containers share a running host operating system kernel.
Containerized HPC environments are distinct from containerized HPC software (see Sec. 2.1.6)
in that the underlying HPC architecture is run as orchestrated containers rather than
applications within containers on top of a traditional HPC architecture. The same considerations
for container supply chains, container runtime, and container orchestrator security apply.
25
NIST SP 800-223 High-Performance Computing Security
February 2024
networks. This linkage should be managed in a secure and isolated fashion using either
dedicated links or cryptographically isolated networking, such as VPNs.
26