QsNet contributions:
* Integration of the virtual-address spaces.
* Network fault detection and fault tolerance.
QsNet I Components:
* Elan Network Interface
* Elite Switch
* Global Virtual Memory
* Network Fault Detection and Fault Tolerance
Elan4
Packet Generation
For receiving:
* Elan4 performs a master PCI-X read.
* Node CPUs can issue writes directly.
For sending:
* For large messages, the Elan4 DMA engine can read from PCI-X.
* For small messages, Elan4 uses the STEN (short transaction engine) processor.
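The split between the two send paths can be illustrated with a small sketch. It is not Quadrics code: the function name `choose_send_path` and the 64-byte cutoff are assumptions made for illustration, since the slides do not say where "small" ends.

```python
# Hypothetical cutoff: the slides do not state the size at which a
# message stops being "small", so this value is illustrative only.
SMALL_MESSAGE_LIMIT_BYTES = 64


def choose_send_path(message_bytes, small_limit=SMALL_MESSAGE_LIMIT_BYTES):
    """Return which Elan4 unit would handle a send of the given size.

    Small messages go to the STEN processor; larger ones are read
    from host memory over PCI-X by the DMA engine.
    """
    if message_bytes < 0:
        raise ValueError("message size cannot be negative")
    return "STEN" if message_bytes <= small_limit else "DMA"


if __name__ == "__main__":
    for size in (8, 64, 4096):
        print(size, "bytes ->", choose_send_path(size))
```

The point of the split is that a short message is cheaper to push directly than to describe to a DMA engine, while a long message benefits from the engine fetching it without involving the CPU.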
Elite 4
* 8 QsNet II links
* 2 virtual channels
* Broadcast to a range of outputs
* Full automatic error detection / recovery
* Arbitration based on age of packet
* Two levels of priority
* Adaptive routing support
* Unblocked latency of ~20 ns
* Trace route transaction for interrogating the network
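Age-based arbitration with two priority levels can be sketched as below. This is a toy model, not the Elite4 logic: the `Packet` type, the `age` unit, and the rule that priority is compared before age are all assumptions, because the slides list the two features without saying how they interact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    name: str
    high_priority: bool  # two priority levels: high or low
    age: int             # time spent waiting (hypothetical unit)


def arbitrate(candidates):
    """Pick the packet that wins an output port.

    Assumed rule: a high-priority packet beats a low-priority one,
    and among equal priorities the oldest packet wins.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda p: (p.high_priority, p.age))


if __name__ == "__main__":
    waiting = [
        Packet("a", high_priority=False, age=90),
        Packet("b", high_priority=True, age=10),
        Packet("c", high_priority=True, age=30),
    ]
    print(arbitrate(waiting).name)
```

Favouring older packets bounds how long any one packet can be starved when several inputs contend for the same output.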
Implementation
Thunder, a 4096-way Itanium 2 system at Lawrence Livermore National Laboratory, was at the time the most powerful system in the US and the second most powerful system in the world. Quadrics hardware and software are an integral part of many of the leading production HPC clusters.
Advantages
* Full, pageable 64-bit virtual memory support
* Multiple, virtual, programmable network interfaces
* Ultra-low latency short messaging
* Optimized support for scalable global operations
* Ability to scale the number of network connections with the number of CPUs for SMP nodes
* Proven scalability to many thousands of processors
Conclusion
The QsNet generations provide a high-performance data network for HPC applications. Quadrics has increased link speeds significantly in each generation and has enhanced both adapter and switch functionality. Using high-radix routers reduces the number of components required, which in turn reduces the cost and complexity of routing data. Features such as hardware barrier and broadcast enhance the applicability to high-end HPC applications.
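The effect of router radix on component count can be checked with a short calculation. The sketch assumes a k-ary n-tree fat tree, in which k^n nodes are connected by n * k^(n-1) switches, each with k down links and k up links. The function names and the radix values compared are illustrative and are not taken from the slides.

```python
def fat_tree_levels(nodes, k):
    """Smallest n such that a k-ary n-tree (k**n leaves) fits `nodes`."""
    levels, capacity = 1, k
    while capacity < nodes:
        capacity *= k
        levels += 1
    return levels


def fat_tree_switches(nodes, k):
    """Switch count of the smallest k-ary n-tree holding `nodes` nodes.

    A k-ary n-tree has n levels of k**(n-1) switches each.
    """
    n = fat_tree_levels(nodes, k)
    return n * k ** (n - 1)


if __name__ == "__main__":
    for k in (4, 16):
        print(f"k={k}: levels={fat_tree_levels(4096, k)}, "
              f"switches={fat_tree_switches(4096, k)}")
```

For 4096 nodes, moving from k = 4 to k = 16 halves the number of levels (6 to 3) and cuts the switch count from 6144 to 768, at the price of each switch needing more ports.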
References
* The Quadrics Network (QsNet): High-Performance Clustering Technology, Fabrizio Petrini, Wu-chun Feng, Los Alamos National Laboratory, IEEE Computer Society, 2001.
* QsNet II: Defining High-Performance Network Design, Jon Beecroft, David Addison, Pacific Northwest National Laboratory, IEEE Computer Society, 2005, pp. 34-47.
* QsNet III: An Adaptively Routed Network for High Performance Computing, Duncan Roweth, Trevor Jones, Quadrics Limited, Bristol, UK, IEEE Computer Society, 2008, pp. 157-164.