
Understanding an HPC Cluster and Its Components

HPC Solutions Group © 2012, Centre for Development of Advanced Computing, Pune
Agenda

o Improving computational performance

o Introduction to cluster computing

o Components of a cluster

o Types of clusters

o Question and answer

It’s only human to “want more”

o Users are unhappy with the computational power at their disposal

o People wish to compress files and folders faster

o Quicker and broader weather forecasting

o More accurate predictions for financial & scientific applications

How to increase computational performance?

o Work harder - use faster & advanced hardware

o Work smarter - optimize the software subsystem

! ! ! Still unhappy ! ! !

o Get help - processing in parallel is the way to go

Processing in Parallel

Single Processor Parallelism

o Use more pipelines

o Use more cores on the same die

o Use advanced instruction set extensions

Multi-Processor Parallelism - Shared memory

o Use multiple processors to shorten program run time

o Divide the entire computation among the processors
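
Dividing a computation among processors can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and threads are used because they share memory within one process. In CPython the global interpreter lock limits the speed-up for CPU-bound work, so treat this as a picture of the decomposition rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def parallel_sum_of_squares(values, workers=4):
    """Each worker sums the squares of one chunk; the partial sums are then combined."""
    chunks = split_evenly(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(lambda chunk: sum(v * v for v in chunk), chunks)
        return sum(partial_sums)


if __name__ == "__main__":
    data = list(range(1000))
    print(parallel_sum_of_squares(data))
```

The same split, compute, combine pattern carries over to distributed-memory clusters, where the chunks are sent to other nodes instead of other threads.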

Processing in Parallel - mammoth style

Multi-computer systems - Distributed memory

o Uses several independent computers

o Strong interconnection network

o e.g. Cluster computers

What is a cluster?

o Group of loosely coupled commodity computers

o Work together to achieve the same goal

o Maintain a Single System Image

o Good computational performance and reliability

o Good for compute-intensive, data- or I/O-intensive, and transaction-intensive applications

o Can be homogeneous or heterogeneous

o Cost-effective

Components of a cluster computer

Software

Hardware

Network

Storage

Cluster software

Software

o Cluster-compatible OS

o Clustering software to generate and maintain a cluster

o Cluster-aware applications

Cluster hardware

o Uses commodity components

o Advanced processors with large cache

o High speed memory

o Advanced chipsets

o Faster I/O subsystem

Cluster network

o High speed

o Low-latency

o Scalable

o Reliable

o Data throughput

o Accelerators

Cluster storage

Scratch disk, storage disk, tape storage

o Linear scaling

o Extreme bandwidth & I/O

o Hierarchical Storage Management

o Single storage pool

o Reliable

Types of cluster computers

o LBC: Load balancing cluster

o HTC: High throughput computing cluster

o HAC: High availability computing cluster

o HPC: High performance computing cluster

Overview of Load balancing cluster

Overview of HTC cluster

o Focuses on completing as many computations as possible

o Makes use of all available computing platforms

o May have a high-latency network

o May have heterogeneous nodes

o Opportunistic type

Overview of HAC cluster

Uncertainty

o Failure to prepare is preparing to fail.

Mantra for High availability

o Always avoid Single Points Of Failure (SPOF).

o Provide redundancy for critical systems and equipment

Availability factor

High Availability

“a system or component that is continuously operational”

Factors affecting High Availability

o MTBF (Mean Time Between Failures)

o MTTR (Mean Time To Repair)

o Outage

o Downtime (planned / unplanned)
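
One common way to relate MTBF and MTTR to availability is the steady-state ratio MTBF / (MTBF + MTTR). The sketch below uses that textbook formula. The function names and the figures in the usage lines are illustrative assumptions, not values from these slides, and the redundancy helper assumes that components fail independently.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the system is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(avail, hours_per_year=8760):
    """Expected downtime per year implied by a given availability."""
    return (1.0 - avail) * hours_per_year


def redundant_availability(*component_availabilities):
    """Availability of redundant components, assuming independent failures.

    The group is down only when every component is down at the same time.
    """
    all_down = 1.0
    for a in component_availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


if __name__ == "__main__":
    single = availability(mtbf_hours=990, mttr_hours=10)   # illustrative numbers
    paired = redundant_availability(single, single)        # two redundant nodes
    print(f"single node: {single:.4f}, {downtime_hours_per_year(single):.1f} h/year down")
    print(f"redundant pair: {paired:.6f}, {downtime_hours_per_year(paired):.2f} h/year down")
```

Lengthening MTBF or shortening MTTR both raise availability, and redundancy raises it further by removing single points of failure.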


Building an HPC Cluster

Components of HPC

o Head node / Master

o Compute node / Slave

o Cluster software

o Cluster interconnect

o Cluster Storage


Parts list for a simple 1+1 HPC Cluster

Minimum Requirements

o A Master node

o A Slave node

o Interconnection network

o Instant Cluster – Live CD

Master Node

o 1 Ethernet port

o CDROM Drive

o 256 MB RAM

Slave Nodes

o 1 Ethernet port

o CDROM Drive

o 256 MB RAM

Network

o Crossover network cable

Components of HPC - Cluster software

o Master – Slave arrangement

o Message passing libraries

o openMosix

o Peer to peer process migration

o Dynamic load balancing

clusterknoppix.sw.be
Steps in Building a Manual Cluster

o Identify two or more healthy machines

o Install a suitable Linux distribution and packages on the nodes

o Name the nodes appropriately and modify /etc/hosts on all nodes

o Create a local area network and validate it

o Identify a master node

o Configure the SSH and NFS daemons on the master

o Configure the SSH daemon and the NFS mount on the slaves

o Create identical user names, passwords and UIDs on all nodes
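
The naming step can be sketched as a small helper that produces the same /etc/hosts entries for every node. The IP addresses, the "master" name, and the "node" prefix are hypothetical placeholders. The helper only builds the lines and does not write to /etc/hosts.

```python
def hosts_entries(master_ip, node_ips, master_name="master", node_prefix="node"):
    """Build /etc/hosts lines ("IP<TAB>hostname") for one master and its slave nodes."""
    lines = [f"{master_ip}\t{master_name}"]
    for index, ip in enumerate(node_ips, start=1):
        # Zero-padded names (node01, node02, ...) keep listings sorted.
        lines.append(f"{ip}\t{node_prefix}{index:02d}")
    return lines


if __name__ == "__main__":
    # Hypothetical private addresses for a 1+2 cluster.
    for line in hosts_entries("192.168.1.1", ["192.168.1.2", "192.168.1.3"]):
        print(line)
```

Appending the same lines to /etc/hosts on every node lets the nodes reach each other by name without a DNS server.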

Steps in Building a Manual Cluster (cont.)

o Copy SSH keys for passwordless authentication

o Install MPICH on the master (use the rsh=ssh flag).

o Update the 'machines' file and copy MPICH to all nodes

o Install SGE, PBS, etc. for job scheduling

o Install Ganglia to monitor the cluster

o Export an NFS partition from the master and mount it on the nodes.

o Building a large cluster manually is a no-no
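
The 'machines' file update could be scripted along these lines. The sketch assumes the classic MPICH convention of one host per line, with an optional ":n" suffix for the number of processors. The host names are hypothetical, and the exact file name and location depend on the MPICH version installed.

```python
def machines_file(nodes):
    """Render a machines file from a mapping of host name to processor count.

    Assumed format: one host per line, written as "host" for a single
    processor or "host:n" for n processors.
    """
    lines = []
    for host, cpus in nodes.items():
        lines.append(host if cpus == 1 else f"{host}:{cpus}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical 1+2 cluster: a single-CPU master and two dual-CPU slaves.
    print(machines_file({"master": 1, "node01": 2, "node02": 2}), end="")
```

Generating the file from a single node list keeps it consistent with /etc/hosts as nodes are added, which is part of why large clusters are usually built with tooling rather than by hand.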

Building an experimental cluster using Rocks

Blueprint for HAC cluster

Overview of HPC cluster

o Uses advanced processors with large cache

o Uses commodity components

o High speed memory

o Advanced chipsets

o Faster I/O subsystem

o High speed network

o Distributed HPC

o Class 1 / Class 2

Blueprint for HPC cluster

System Name : Tianhe-1A
Site : National Supercomputing Center in Tianjin
Cores : 186368
Memory : 229376 GB
Rmax (GFlops) : 2566000
Rpeak (GFlops) : 4701000
Operating System : Linux
Processor : Intel EM64T Xeon X56xx (Westmere-EP) 2930 MHz (11.72 GFlops)

List      Rank   Rmax (GFlops)
11/2010   1      2.566e+06
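
The Rmax and Rpeak figures above can be compared directly. The ratio Rmax / Rpeak is commonly read as the fraction of theoretical peak that the benchmark run actually sustained. The function name below is a hypothetical helper; the two numbers are taken from the slide.

```python
def linpack_efficiency(rmax_gflops, rpeak_gflops):
    """Fraction of theoretical peak performance achieved in the benchmark run."""
    return rmax_gflops / rpeak_gflops


# Tianhe-1A figures from the slide (GFlops).
tianhe_1a_efficiency = linpack_efficiency(2566000, 4701000)

if __name__ == "__main__":
    print(f"Tianhe-1A efficiency: {tianhe_1a_efficiency:.1%}")
```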

K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect - Japan
Site: RIKEN Advanced Institute for Computational Science (AICS)
System: K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect
Vendor: Fujitsu

List      Rank   Total Cores   Rmax (GFlops)   Rpeak (GFlops)   Power (kW)
11/2011   1      705024        10510000.00     11280384.00      12659.89
06/2011   1      548352        8162000.00      8773632.00       9898.56
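
A few derived figures make the two K computer entries easier to compare: peak performance per core, the fraction of peak sustained, and performance per kilowatt. The sketch below computes them from the table values. The dictionary keys and the function name are illustrative choices, not part of the source.

```python
# Values copied from the two list entries above (GFlops and kW).
k_computer_runs = [
    {"list": "11/2011", "cores": 705024, "rmax": 10510000.0,
     "rpeak": 11280384.0, "power_kw": 12659.89},
    {"list": "06/2011", "cores": 548352, "rmax": 8162000.0,
     "rpeak": 8773632.0, "power_kw": 9898.56},
]


def derived_metrics(run):
    """Return per-core peak, sustained fraction of peak, and GFlops per kW."""
    return {
        "peak_gflops_per_core": run["rpeak"] / run["cores"],
        "efficiency": run["rmax"] / run["rpeak"],
        "gflops_per_kw": run["rmax"] / run["power_kw"],
    }


if __name__ == "__main__":
    for run in k_computer_runs:
        m = derived_metrics(run)
        print(f'{run["list"]}: {m["peak_gflops_per_core"]:.1f} GFlops/core peak, '
              f'{m["efficiency"]:.1%} of peak, {m["gflops_per_kw"]:.1f} GFlops/kW')
```

Both entries work out to the same peak per core, so the growth between the two lists comes from the added cores rather than from faster ones.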
