Sun HPC ClusterTools™ Software: Ubiquitous Parallel Computing
Ferhat Hatay, Rolf vandeVaart, Josh Simons
Sun Microsystems, Inc.
Sun HPC ClusterTools Software is a complete software environment to support parallel high performance and technical computing applications ubiquitously over single as well as networked Sun computer system solutions. Sun HPC ClusterTools Software has long offered industry standard programming interfaces along with support for multiple communication protocols and mixed-parallel programming paradigms. Consequently, the Sun high performance computing (HPC) user community has developed and deployed their parallel applications without being land-locked in superficial technology enclaves. The newly released Sun HPC ClusterTools 4 Software introduces an open and expandable architecture with the new Loadable Protocol Modules enabling the seamless integration of existing and future network protocols. Parallel user applications remain unaffected with no performance compromises throughout.
The potential power and economic appeal of parallel computing can hardly be questioned. However, the wide acceptance of parallel computing in research and development as well as in commercial computing circles has awaited the maturation and stabilization of a robust software development environment. The Message Passing Interface (MPI) standard was developed in the early 1990s by a forum of hardware and software vendors and the developer community. They created a platform-independent programming paradigm that could realistically achieve scalable high performance on different system architectures. Over the years, the MPI standard has achieved unprecedented industry-wide adoption among hardware and software vendors, as well as scientific and engineering application developers. The original MPI functionality was later extended through the additional features introduced by the MPI-2 standard. Most recently, mixed programming techniques have been gathering interest, in which MPI distributed parallel techniques are used in conjunction with thread-parallel and shared-memory parallel techniques such as those employed in the OpenMP standard.
Sun readily provides high performance and technical computing (HPTC) customers with powerful and scalable Symmetric Multiprocessing (SMP) compute servers with up to 64 CPUs. HPTC clusters of Sun SMP systems enabled by Sun HPC ClusterTools Software provide further scalability and performance levels to meet the most challenging high performance computing demands of research, development, design, and production communities involved in the art and science of computational modeling in their respective fields. Sun HPC ClusterTools Software provides a complete development environment for the analysis, debugging, and performance monitoring of parallel MPI applications on Sun system solutions, which are based on the UltraSPARC™ processor architecture and the Solaris™ Operating Environment. Sun HPC ClusterTools Software enables the deployment of distributed parallel applications with continuous and portable scalability from one to thousands of processes. The components included in the Sun HPC ClusterTools Software package are:
• Sun Cluster Runtime Environment (CRE)
• Sun MPI communication library
• Prism™ graphical programming environment
• Sun Scalable Scientific Subroutine Library (S3L)
• Sun Parallel File System (PFS)
The Sun MPI library is the core component of the Sun HPC ClusterTools toolkit. It is a highly optimized, native MPI implementation that includes most of the extended MPI-2 standard. A notable exception is MPI one-sided communication between SMP nodes, which will appear in a later release. The Sun MPI library includes a full, native implementation of the MPI I/O part of the MPI-2 standard extensions, which allows access to the Sun Parallel File System from Sun MPI applications. The thread-safety feature of Sun MPI allows hybrid programming techniques to be applied within a single application.
In this paper, the key features in the recently released Sun HPC ClusterTools 4 Software are discussed. The outstanding capabilities of Sun MPI, Sun's native implementation of the MPI standard, are also covered for the benefit of parallel application developers who are making strategic choices for programming paradigms, system architectures, and high performance clustering infrastructures. In addition to providing the source code under the Sun Community Source License (SCSL) program, Sun HPC ClusterTools 4 Software also introduces a new open and expandable architecture framework to support network protocols of the present and future within one robust software development environment. This paper is an introduction to the new architecture as well as an invitation to the high performance computing developer community for collaboration.
2. What is new in Sun HPC ClusterTools 4 Software?
Sun HPC ClusterTools 4 Software is the release that follows Sun HPC ClusterTools 3.1 Software. The key new features in Sun HPC ClusterTools 4 Software include:
• Scalability to 2048 processes and 64 nodes (SMPs).
Sun HPC ClusterTools 4 supports the execution of parallel MPI applications with up to 2048 processes, running across up to 64 computational nodes where each node can be an SMP of any size.
• Support for systems.
The Sun S3L (Scalable Scientific Subroutine Library) provides a set of parallel and scalable functions widely used in scientific and engineering computing. The Sun S3L library now supports both the UltraSPARC III and UltraSPARC II processor architectures in the same library, transparently to user applications.
• Support for dynamically spawned MPI processes in the Sun Prism graphical programming environment.
The spawn feature of the MPI-2 standard allows programmers to create sets of related processes. Sun HPC ClusterTools 4 includes a new feature that allows developers to debug codes that use the process-spawning functionality (MPI_Comm_spawn) of the MPI-2 standard.
• Support for Loadable Protocol Modules.
Sun HPC ClusterTools Software supports multiple communication protocols to achieve scalable performance on parallel high performance computing applications. With the modularity introduced by Loadable Protocol Modules, new network and communication fabrics are supported natively by the Sun MPI framework without requiring a recompilation or relinking of the Sun MPI library or the existing parallel user applications.
• Improved security, error logging, and core file handling services in Sun CRE.
Sun CRE is a cluster administration and job launching facility. With the Sun HPC ClusterTools 4 Software release, the errors detected by Sun CRE can now be localized to the node level with messages and recommended actions for appropriate owners. Sun CRE, like all components in Sun HPC ClusterTools 4 Software, is scalable to 2048 processes over 64 nodes with fast parallel job startup and shutdown operations. New adaptive time-out functionality in Sun CRE scales time-outs with job size for improved fault and error identification.
• Infrastructure to support the use of the Remote Shared Memory (RSM) protocol within Sun MPI over the next generation Sun interconnect hardware.
The RSM protocol provides direct and low-latency memory-to-memory communication between cluster nodes over RSM-capable interconnects. Applications demanding a low-latency communication framework will benefit from the direct remote memory access capability in a high performance cluster over RSM-capable interconnects. Sun HPC ClusterTools 4 Software provides built-in support from the Sun MPI library for next generation Sun interconnect hardware through the Loadable Protocol Module technology.
• Sun HPC ClusterTools 4 / 3.1 software coexistence.
Users are able to have both Sun HPC ClusterTools 3.1 software and Sun HPC ClusterTools 4 software installed on the same system and can run either version under the Solaris 8 Operating Environment. Consequently, customers can provisionally upgrade a cluster to Sun HPC ClusterTools 4 software while keeping the option of returning to Sun HPC ClusterTools 3.1 software if desired. User applications that are compiled and linked with the Sun HPC ClusterTools 3.1 release will work under the new Sun HPC ClusterTools 4 Software without any changes.
• Enhancements and additions to the Sun S3L parallel performance library.
The Sun S3L library now features new direct and iterative sparse matrix solvers, linear programming and equity option pricing functions, as well as additional mathematical transformations.
• Free Web download.
Sun HPC ClusterTools 4 Software is now available for free download directly from the Web for unlimited use.
3. Thread-safety: support for multiple programming paradigms
Two primary high performance computing programming models are supported in the Sun environment: a single-process model and a multiprocess model. The single-process model includes all types of multithreaded applications. These may be automatically parallelized by Sun's high performance compilers using parallelization directives (e.g., OpenMP) or explicitly parallelized with user-inserted Solaris or POSIX threads. The multiprocess model, which is the topic of this paper, supports the MPI standard for parallel applications that run both on single SMPs and on clusters of SMPs or thin nodes. It should be noted that a third, hybrid model is also supported: the mixing of threads and MPI parallelism to create applications that use MPI for communication between cooperating processes and threads within each process. Such codes may make the most efficient use of the capabilities of individual SMP nodes in the high performance cluster environment. A hybrid parallel programming paradigm can only be considered in a thread-safe framework. Sun MPI is fully thread-safe, with locking pushed as low as possible within the implementation to allow for concurrency: multiple user threads may be active within the library at any given time. In addition, both 32- and 64-bit versions of the library are included with Sun HPC ClusterTools, and Sun MPI programs may be debugged and tuned with the Prism parallel development environment. Figure 1 illustrates the relationship between the components of the Sun HPC ClusterTools Software and other software development products available from Sun.
Figure 1 Sun HPC ClusterTools Software in context
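The idea of pushing locks as low as possible, so that several user threads can be active inside a communication library at once, can be illustrated with a minimal Python sketch. It is not Sun MPI code and says nothing about how Sun MPI is implemented internally; the `Mailboxes` class, its methods, and `run_threads` are hypothetical names introduced only for this illustration. The sketch uses one lock per destination queue instead of a single library-wide lock, so threads sending to different destinations do not block each other.

```python
import threading
from collections import deque


class Mailboxes:
    """Toy message store (hypothetical, not a Sun MPI interface).

    Each destination rank has its own queue and its own lock, so two
    threads only contend when they target the same destination.
    """

    def __init__(self, n_ranks):
        self._queues = [deque() for _ in range(n_ranks)]
        self._locks = [threading.Lock() for _ in range(n_ranks)]

    def send(self, dest, msg):
        # Only the queue for this destination is locked.
        with self._locks[dest]:
            self._queues[dest].append(msg)

    def drain(self, dest):
        # Remove and return every message queued for one destination.
        with self._locks[dest]:
            items = list(self._queues[dest])
            self._queues[dest].clear()
        return items


def run_threads(boxes, n_threads, msgs_per_thread, n_ranks):
    """Start several user threads that all send concurrently."""

    def worker(tid):
        for i in range(msgs_per_thread):
            boxes.send((tid + i) % n_ranks, (tid, i))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    boxes = Mailboxes(n_ranks=4)
    run_threads(boxes, n_threads=8, msgs_per_thread=1000, n_ranks=4)
    print(sum(len(boxes.drain(d)) for d in range(4)))
```

In a real hybrid application the threads would typically come from OpenMP or POSIX threads inside each MPI process; the sketch only shows why fine-grained locking keeps concurrent callers from serializing on a single lock.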
4. Support for multiple network protocols: open and expandable Loadable Protocol Module Architecture
Sun MPI, which currently supports applications spanning up to 64 compute nodes and consuming up to 2048 processes, includes extensive optimizations for running both within and between SMP nodes. Versions of Sun HPC ClusterTools Software prior to release 4 included code for communication between processes across TCP networks, shared memory interfaces (SHM), and remote shared memory (RSM). To accommodate communication across a wider variety of networks and interfaces, Sun HPC ClusterTools 4 has extracted the communications code into individual Loadable Protocol Modules that can be called by the main Sun MPI library. Each protocol module enables Sun HPC ClusterTools communication across a network with a different communications protocol. Protocol modules offer several advantages over embedded code:
• New communications protocols can be developed and supported without the need for a release of the Sun MPI library and the Sun CRE software.
• No recompilation or relinking is necessary for existing Sun MPI parallel applications to use the new and additional dynamic Loadable Protocol Modules.
• Individual protocol modules can be patched without patching the entire Sun MPI library.
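The advantages listed above follow from resolving protocol modules by name at run time rather than linking them into the library. The following Python sketch illustrates that general mechanism under stated assumptions; it does not reproduce the actual Loadable Protocol Module interface. The module names (`pm_shm`, `pm_tcp`, `pm_myrinet`), the `PROTOCOL_NAME` attribute, and the `send` function are all hypothetical. `install_fake_module` stands in for placing a new module file on the system.

```python
import importlib
import sys
import types


def install_fake_module(name):
    """Simulate delivering a new protocol module (hypothetical layout).

    A real module would be a separate file; here one is built in memory
    so that the example stays self-contained.
    """
    mod = types.ModuleType(name)
    mod.PROTOCOL_NAME = name.rsplit("_", 1)[-1]

    def send(dest, payload):
        # Placeholder transfer: report what would be sent and where.
        return f"{mod.PROTOCOL_NAME}->{dest}:{len(payload)}B"

    mod.send = send
    sys.modules[name] = mod


def load_protocol_modules(names):
    """Load whichever of the named modules are present at run time."""
    loaded = {}
    for name in names:
        try:
            mod = importlib.import_module(name)
        except ImportError:
            continue  # an absent module is skipped, not treated as fatal
        loaded[mod.PROTOCOL_NAME] = mod
    return loaded


if __name__ == "__main__":
    wanted = ["pm_shm", "pm_tcp", "pm_myrinet"]
    install_fake_module("pm_shm")
    install_fake_module("pm_tcp")
    print(sorted(load_protocol_modules(wanted)))

    # A third-party module appears later; the loader itself is unchanged.
    install_fake_module("pm_myrinet")
    print(sorted(load_protocol_modules(wanted)))
```

Because the loader only depends on a name and a small agreed interface, a module can be added or replaced without touching the loader or the code that calls it, which is the property the three points above describe.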
The architecture of the Sun MPI library is shown in Figure 2. The lowest level of the library, called the Protocol Module Layer, includes support for several communication mechanisms. When communication between two processes in a Sun MPI job is initiated, the library chooses the most efficient transport pathway between those two processes. For example, two processes colocated on a single SMP node will communicate via shared memory segments under control of the shared-memory protocol module, whereas two processes on different nodes will use the protocol module corresponding to the best available network connection between those nodes. These connection decisions are made automatically on a pairwise basis at run time.
The TCP protocol module allows Sun MPI jobs to run across any available TCP-capable interconnects. While this allows the use of numerous commodity interconnects offering a wide variety of interconnect bandwidths, it does not address the issue of low-latency communication that is of interest for many classes of HPTC applications. Sun HPC ClusterTools 4 Software includes a Remote Shared Memory (RSM) protocol module for low-latency communications. The RSM protocol module addresses the latency issue by allowing nodes connected with capable interconnects to bypass the Solaris Operating Environment and perform MPI data transfers with user-space load and store operations. The open Dynamic Loadable Protocol Module architecture allows the development and integration of other low-latency protocol modules within the Sun MPI framework.
Figure 2 Sun MPI architecture
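The pairwise, run-time choice of transport described in this section can be sketched as a small Python model. This is an illustration only, not the selection logic used by Sun MPI: the preference order in `PREFERENCE`, the `Process` record, and the `node_networks` mapping (which interconnect protocols each node offers) are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed preference order for inter-node transports: lowest latency first.
PREFERENCE = ("rsm", "tcp")


@dataclass(frozen=True)
class Process:
    rank: int
    node: str


def choose_protocol(a, b, node_networks):
    """Pick the transport for one pair of processes."""
    if a.node == b.node:
        return "shm"  # colocated processes use shared memory
    # Otherwise use the best protocol that both nodes have in common.
    common = node_networks[a.node] & node_networks[b.node]
    for name in PREFERENCE:
        if name in common:
            return name
    raise RuntimeError(f"no common network between {a.node} and {b.node}")


def connection_table(procs, node_networks):
    """Decide the transport independently for every pair of ranks."""
    return {
        (a.rank, b.rank): choose_protocol(a, b, node_networks)
        for i, a in enumerate(procs)
        for b in procs[i + 1:]
    }


if __name__ == "__main__":
    procs = [Process(0, "n0"), Process(1, "n0"),
             Process(2, "n1"), Process(3, "n2")]
    nets = {"n0": {"rsm", "tcp"}, "n1": {"rsm", "tcp"}, "n2": {"tcp"}}
    for pair, proto in sorted(connection_table(procs, nets).items()):
        print(pair, proto)
```

In this model a single job can mix transports: ranks 0 and 1 share a node and use shared memory, ranks on the two RSM-capable nodes use RSM, and any pair involving the TCP-only node falls back to TCP.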
5. Sun HPC ClusterTools Sun Community Source License (SCSL) program
In addition to being available as a supported product, the source code for the Sun HPC ClusterTools Software has been available to the HPTC community through Sun's Community Source License (SCSL) program since November 1999. This mechanism provides free access to source code to those individuals and organizations who wish to experiment with or contribute to the development of these tools. In the new release 4, Sun HPC ClusterTools Software introduces an open software architecture to support different network protocols natively under the ubiquitous Sun MPI framework, as explained in the previous section. In addition to the standard protocol modules for shared memory, remote shared memory, and TCP, Sun MPI supports the concept of dynamically Loadable Protocol Modules. Under the Sun HPC ClusterTools Software SCSL program, third-party interconnect vendors can develop Loadable Protocol Modules and provide native support for their communication hardware and software within the Sun MPI framework. The open architecture of Loadable Protocol Modules enables such partners to integrate their hardware and low-level interconnect software with the Sun MPI library to offer the full capabilities of their interconnect under the Sun HPTC software stack. Users of Sun HPTC solutions, in return, experience a stable and continuous software environment that protects their existing software investments while allowing upgrades of their cluster interconnects with new solutions.
6. Myrinet2000 Loadable Protocol Module Development
Myricom, Inc. has been developing a dynamically Loadable Protocol Module under the Sun Community Source License program for their latest Myrinet2000 interconnect. Because these modules are loaded at run time by the standard Sun MPI library, users of such third-party modules do not need access to Sun HPC ClusterTools source code to make use of this capability. Furthermore, ISV applications already compiled and certified with the current version of Sun HPC ClusterTools may also take advantage of such third-party interconnects without recompilation or relinking.
The Sun HPC ClusterTools toolkit is a full-featured suite that supports the development and the execution of high performance, distributed-memory, parallel applications across all Sun system solutions. The toolkit currently supports applications that span up to 64 SMP nodes and that contain up to 2048 processes. Sun's MPI implementation is shared-memory aware, thread-safe, and supports a low-latency RSM communication protocol over capable interconnects. Scalable, parallel I/O is supported from MPI applications. In addition, the Sun HPC ClusterTools Software includes a robust debugging, performance analysis, and data visualization environment. In the new release 4, Sun HPC ClusterTools Software introduces an open software architecture to support multiple network protocols natively under the ubiquitous Sun MPI framework.
Copyright © 2001 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, California 94303, U.S.A. All rights reserved. Sun Microsystems, Inc. has intellectual property rights relating to technology embodied in this product. In particular, and without limitation, these intellectual property rights may include one or more of the U.S. patents listed at http://www.sun.com/patents and one or more additional patents or pending patent applications in the U.S. and other countries. This product is distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party technology is copyrighted and licensed from Sun suppliers. Sun, Sun Microsystems, the Sun logo, Solaris, Sun HPC ClusterTools, and Prism are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon architecture developed by Sun Microsystems, Inc.
8. Further Information
The following references provide additional technical detail on a number of the capabilities covered in this paper. Information about the Sun HPC ClusterTools Software product can be found at this location:
Sun HPC ClusterTools Software can be downloaded for unlimited use from:
http://www.sun.com/software/hpc/tryandbuy.html
The following page contains a selection of HPC-related whitepapers, including a technical whitepaper on Sun PFS and additional information on S3L:
http://www.sun.com/software/solutions/hpc/docs/
The Sun Community Source web site for the Sun HPC ClusterTools toolkit provides information on how to join the Sun HPC ClusterTools community and how to download the source code:
http://www.sun.com/software/solutions/hpc/communitysource
Sun's primary site for online documentation, which contains the full manual set for the Sun HPC ClusterTools Software, is: