
Massively Parallel Processing Supercomputers

Distributed-memory Massively Parallel Processor (MPP) systems based on commodity chips have become increasingly important to supercomputing because of their low price/performance ratios and because systems of this kind can offer extremely large memories. For suitable problems, and in particular for very memory-intensive problems, these systems are capable of much higher computational performance, and they can address certain Grand Challenge and Capability computing problems that cannot be done on PVP systems in any reasonable time frame.
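The defining property of the distributed-memory model described above is that each processor owns a private slice of the data, and any sharing happens through explicit messages rather than a shared address space. A minimal, single-process Python sketch of a 1-D domain decomposition with halo exchange (all names hypothetical; a real MPP code would use a message-passing library such as MPI):

```python
# Schematic sketch of distributed-memory computing: each "processor"
# holds only its own slice of the global array, plus one-cell halos
# that must be filled by explicit exchange with its neighbors.

def decompose(data, nprocs):
    """Split a global list into per-processor local slices."""
    n = len(data) // nprocs
    return [data[i * n:(i + 1) * n] for i in range(nprocs)]

def halo_exchange(slices):
    """Return (left_halo, right_halo) for each slice, i.e. the
    boundary values a neighbor would send as a message."""
    halos = []
    for rank in range(len(slices)):
        left = slices[rank - 1][-1] if rank > 0 else 0.0
        right = slices[rank + 1][0] if rank < len(slices) - 1 else 0.0
        halos.append((left, right))
    return halos

def smooth_step(slices):
    """One 3-point averaging step, done independently on each slice
    after the halo exchange -- no shared memory is ever touched."""
    halos = halo_exchange(slices)
    out = []
    for (left, right), local in zip(halos, slices):
        ext = [left] + local + [right]
        out.append([(ext[i - 1] + ext[i] + ext[i + 1]) / 3.0
                    for i in range(1, len(ext) - 1)])
    return out

global_data = [float(i) for i in range(16)]
slices = decompose(global_data, 4)   # 4 "processors", 4 cells each
result = smooth_step(slices)
```

The sketch also shows why restructuring is needed: the decomposition and the halo exchange are the programmer's responsibility, which is exactly the work a PVP code does not have to do.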
The HPCAC T3E MPP is one of the largest configurations available anywhere. The system
installed in July of 1997 is a 512 processor T3E-900, with 1.5 Terabytes of disk and 132 GB of
memory. At the time of delivery, it was the largest I/O configuration ever built. This system has
a peak performance of 900 Mflops per processor for a theoretical peak of slightly less than 0.5
Teraflops. The system has a measured performance of more than 0.25 Teraflops (265 Gflops on
512 processors). It is allocated to Grand Challenge and very large capability problems. In FY 98, the HPCAC will provide roughly 5 times as much computing resource on the MPP systems as exists on the PVP systems. Currently the MPP user base is smaller, and many applications will require restructuring to efficiently use these MPP systems.
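The performance figures quoted above are easy to check: 512 processors at 900 Mflops each give a theoretical peak just under half a Teraflop, and the measured 265 Gflops is a bit more than half of that peak. A quick arithmetic check, using only numbers taken from the text:

```python
# Peak and measured performance of the 512-processor T3E-900,
# using the figures quoted in the text.
processors = 512
peak_per_proc_mflops = 900.0

peak_gflops = processors * peak_per_proc_mflops / 1000.0   # 460.8 Gflops
peak_tflops = peak_gflops / 1000.0          # slightly under 0.5 Tflops

measured_gflops = 265.0
efficiency = measured_gflops / peak_gflops  # fraction of peak sustained
```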

There also exist Shared Memory Processor (SMP) systems that are not Parallel Vector machines, such as the IBM SP, the SGI Origin, and the Sun HPC series. NERSC currently has prototype systems using this architecture (a pair of Sun Enterprise 4000 systems) and in FY 98 will receive a 64-node Origin 2000 to evaluate the state of a hybrid architecture for clustering SMPs. Indeed, the J-90 complex is already an early implementation of this style of computing. It is believed that these systems will provide both fine- and coarse-grain parallelism in a single application.
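The two levels of parallelism mentioned above can be sketched in outline: coarse-grain parallelism distributes independent blocks of work across SMP nodes, while fine-grain parallelism splits the loop inside one block across the CPUs of a node. The Python sketch below is purely illustrative (the node and thread structure is the point, not real speedup, which Python threads would not provide); all names are hypothetical:

```python
# Hypothetical sketch of hybrid parallelism on clustered SMPs:
# the coarse-grain level iterates over blocks that could each live
# on a different SMP node; the fine-grain level splits one block's
# loop across a node's CPUs, modeled here with a thread pool.
from concurrent.futures import ThreadPoolExecutor

def fine_grain_sum(block, nthreads=2):
    """Fine-grain level: split one block's loop across threads."""
    chunk = max(1, len(block) // nthreads)
    pieces = [block[i:i + chunk] for i in range(0, len(block), chunk)]
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return sum(pool.map(sum, pieces))

def coarse_grain_sum(blocks):
    """Coarse-grain level: each block could be assigned to a
    different SMP node; here they are processed in turn."""
    return [fine_grain_sum(b) for b in blocks]

blocks = [list(range(10)), list(range(10, 20)), list(range(20, 30))]
partial_sums = coarse_grain_sum(blocks)   # one result per "node"
total = sum(partial_sums)
```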

Future directions of this system include the incorporation of clustered SMPs. This could range from clustering with custom switches within an MPP to clustering with commodity network components. The number of processors per SMP is also a technical challenge: whether there will be a small number (2, 4, or 8) or a large number (32 or 64) of processors sharing local memory within an SMP node. One path, small SMPs clustered with specialized connections, could be viewed as an evolution of today's MPP architecture. The other path, large SMPs clustered with
commodity or custom networks, could be an evolution of today's SMP architecture. Adding the
complexity of providing distributed memory access in a way that enables vector computing
creates a large range of possible paths.

NERSC has begun exploring these directions in preparation for the next major acquisition in
1999, and the one in 2001/2002. These acquisitions could lead to a system with hundreds of
processors that has Teraflops of measured performance in 1999, and 10s of Teraflops in 2002.
In all these scenarios, the programming methods differ from those for the PVP or current SMP systems. In order to solve increasingly demanding problems that track the computational price/performance curve, HPCAC will have to work with clients and providers to develop system and programming-environment software that allows applications to capture a significant part of the
aggregated computational power these new systems provide. New models of computing services
are also needed to fully realize the potential of the new technology.

The Massively Parallel Processor

The development of parallel processing, with the attendant technology of advanced software engineering,
VLSI circuits, and artificial intelligence, now allows high-performance computer systems to reach the speeds
necessary to meet the challenge of future complex scientific and commercial applications. This collection of
articles documents the design of one such computer, a single instruction, multiple data stream (SIMD) class
supercomputer with 16,384 processing units capable of over 6 billion 8-bit operations per second. It
provides a complete description of the Massively Parallel Processor (MPP), including discussions of hardware
and software with special emphasis on applications, algorithms, and programming.

This system with its massively parallel hardware and advanced software is on the cutting edge of parallel
processing research, making possible AI, database, and image processing applications that were once
thought to be inconceivable. The massively parallel processor represents the first step toward the large-scale
parallelism needed in the computers of tomorrow. Originally built for a variety of image-processing
tasks, it is fully programmable and applicable to any problem with sizeable data demands.

Contents: "History of the MPP," D. Schaefer; "Data Structures for Implementing the Classy Algorithm on the
MPP," R. White; "Inversion of Positive Definite Matrices on the MPP," R. White; "LANDSAT-4 Thematic
Mapper Data Processing with the MPP," R. O. Faiss; "Fluid Dynamics Modeling," E. J. Gallopoulos; "Database
Management," E. Davis; "List Based Processing on the MPP," J. L. Potter; "The Massively Parallel Processor
System Overview," K. E. Batcher; "Array Unit," K. E. Batcher; "Array Control Unit," K. E. Batcher; "Staging
Memory," K. E. Batcher; "PE Design," J. Burkley; "Programming the MPP," J. L. Potter; "Parallel Pascal and
the MPP," A. P. Reeves; "MPP System Software," K. E. Batcher; "MPP Program Development and Simulation,"
E. J. Gallopoulos.
