Cluster Computing
______________________________________________________________________________
Components of a Cluster
The main components of a cluster are the personal computers and the interconnection network. The computers can be built from commercial off-the-shelf (COTS) components and are available economically.
The interconnection network can be an ATM (Asynchronous Transfer Mode) ring, which guarantees a fast and effective connection, or a Fast Ethernet connection, which is commonly available. Gigabit Ethernet, which provides speeds up to 1000 Mbps, and Myrinet, a commercial interconnection network with high speed and low latency, are also viable options.
For high-end scientific clustering, however, a variety of network interface cards designed specifically for clustering is available. These include Myricom's Myrinet, Giganet's cLAN and the IEEE 1596 standard Scalable Coherent Interface (SCI). The function of these cards is not only to provide high bandwidth between the nodes of the cluster but also to reduce the latency (the time it takes to send messages). Low latency is crucial to the performance of applications that exchange many short messages between nodes.
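For a rough sense of why latency matters as much as bandwidth, the time to deliver a message can be estimated as the latency plus the message size divided by the bandwidth. The sketch below uses the Giganet and SCI figures given below; the Fast Ethernet latency of about 100 microseconds is only an illustrative assumption.

    #include <stdio.h>

    /* Estimated one-way time to move a message of `bytes` bytes:
     * latency (seconds) + bytes / bandwidth (bytes per second).        */
    static double transfer_time(double latency_s, double bandwidth_Bps, double bytes)
    {
        return latency_s + bytes / bandwidth_Bps;
    }

    int main(void)
    {
        double msg = 1024.0;   /* a 1 KB message */

        /* Giganet and SCI figures follow the text; the Fast Ethernet
         * latency is only an assumed, illustrative value.              */
        printf("Fast Ethernet: %.1f us\n", 1e6 * transfer_time(100e-6, 12.5e6, msg));
        printf("Giganet cLAN : %.1f us\n", 1e6 * transfer_time(7e-6,   125e6,  msg));
        printf("IEEE SCI     : %.1f us\n", 1e6 * transfer_time(2.5e-6, 400e6,  msg));
        return 0;
    }

For a 1 KB message the latency term dominates on Fast Ethernet, which is why the specialised cards concentrate on reducing it.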
Giganet
Giganet is the first vendor of Virtual Interface (VI) architecture cards for the Linux platform, with its cLAN cards and switches. The VI architecture is a platform-neutral software and hardware system that Intel has been promoting to create clusters. It uses its own network communications protocol rather than IP to exchange data directly between the servers, and it is not intended to be a WAN-routable system. The future of VI now lies in the ongoing work of the System I/O Group, which is itself a merger of the Next-Generation I/O group led by Intel and the Future I/O Group led by IBM and Compaq. Giganet's products currently offer 1 Gbps unidirectional communication between the nodes at minimum latencies of 7 microseconds.
IEEE SCI
The IEEE standard SCI has even lower latencies (under 2.5 microseconds), and it can run at 400 MB per second (3.2 Gbps) in each direction. SCI is a ring-topology-based networking system, unlike the star topology of Ethernet, which makes communication between nodes faster on a larger scale. Even more useful is a torus topology network, with many rings between the nodes. A two-dimensional torus can be pictured as a grid of n by m nodes with a ring network at every row and every column. A three-dimensional torus is similar, with a 3D cubic grid of nodes that also has rings at every level. Massively parallel supercomputing systems use such topologies to provide the shortest paths for communication between hundreds or thousands of nodes.
The limiting factor in most of these systems is not the operating system or the network interface but the server's internal PCI bus. The basic 32-bit, 33-MHz PCI common in nearly all desktop PCs and most low-end servers offers only 133 MB per second (about 1 Gbps), stunting the power of these cards. Some costly high-end servers, such as the Compaq ProLiant 6500 and the IBM Netfinity 7000 series, have 64-bit, 66-MHz PCI buses that run at four times that speed. Unfortunately, the paradox is that more organizations use the low-end systems, and thus most clusters remain limited by the PCI bus rather than by the network itself.
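As a small illustration of the torus layout described above, the sketch below computes the ring neighbours of a node in an n-by-m two-dimensional torus; the row/column numbering is assumed purely for the example.

    #include <stdio.h>

    /* Neighbours of node (row, col) in an n-by-m 2D torus: each row and each
     * column forms a ring, so indices wrap around at the edges.             */
    static void torus_neighbours(int n, int m, int row, int col)
    {
        int up    = (row - 1 + n) % n;   /* ring along the column */
        int down  = (row + 1) % n;
        int left  = (col - 1 + m) % m;   /* ring along the row    */
        int right = (col + 1) % m;

        printf("node (%d,%d): up (%d,%d), down (%d,%d), left (%d,%d), right (%d,%d)\n",
               row, col, up, col, down, col, row, left, row, right);
    }

    int main(void)
    {
        /* 4 x 4 torus: corner node (0,0) wraps around to (3,0) and (0,3). */
        torus_neighbours(4, 4, 0, 0);
        return 0;
    }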
Based on how the nodes are exposed to the outside network, clusters fall into two categories:
o Close Clusters
o Open Clusters
Close Clusters
They hide most of the cluster behind the gateway node. Consequently, they need fewer IP addresses and provide better security. They are well suited to computing tasks.
Open Clusters
All nodes can be seen from outside, and hence they need more IP addresses and raise more security concerns. However, they are more flexible and are used for Internet/web/information server tasks.
[Figure: Close cluster - compute nodes on a private high-speed network, with gateway and front-end nodes on a service network linking them to the external network.]
[Figure: Open cluster - the front-end and all compute nodes attach directly to the external network.]
Security Considerations
Special considerations are involved when completing the
implementation of a cluster. Even with the queue system and
parallel environment, extra services are required for a cluster to
function as a multi-user computational platform. These services
include the well-known network services NFS, NIS and rsh.
NFS allows cluster nodes to share user home directories as well
as installation files for the queue system and parallel
environment. NIS provides correct file and process ownership
across all the cluster nodes from the single source on the master
machine. Although these services are significant components of
a cluster, such services create numerous vulnerabilities. Thus, it
would be insecure to have cluster nodes function on an open
network. For these reasons, computational cluster nodes usually
reside on private networks, often accessible for users only
through a firewall gateway. In most cases, the firewall is
configured on the master node using ipchains or iptables.
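As an illustration, a minimal iptables setup of this kind on the master node might look like the sketch below; the interface names and the 192.168.1.0/24 private network are assumed purely for the example.

    # eth0 faces the public network, eth1 the private cluster network (assumed names).
    iptables -P INPUT DROP
    iptables -P FORWARD DROP
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A INPUT -i eth1 -j ACCEPT                       # trust the private cluster side
    iptables -A INPUT -i eth0 -p tcp --dport 22 -j ACCEPT     # allow SSH logins from outside
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    # let compute nodes reach the outside world through the master
    echo 1 > /proc/sys/net/ipv4/ip_forward
    iptables -t nat -A POSTROUTING -o eth0 -s 192.168.1.0/24 -j MASQUERADE
    iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
    iptables -A FORWARD -i eth0 -o eth1 -m state --state ESTABLISHED,RELATED -j ACCEPT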
Beowulf Cluster
Basically, the Beowulf architecture is a multi-computer architecture used for parallel computation. Beowulf clusters are therefore primarily meant for processor-intensive, number-crunching applications and not for storage applications. A Beowulf cluster consists of a server node that controls the functioning of many client nodes, connected together by Ethernet or some other network of switches or hubs. One good feature of Beowulf is that all of the system's components are available off the shelf; no special hardware is required to implement it. It also uses commodity software - most often Linux - and other commonly available components such as Parallel Virtual Machine (PVM) and the Message Passing Interface (MPI).
Besides serving all the client nodes in the Beowulf cluster, the server node also acts as a gateway to external users and passes files to the Beowulf system. The server is also used to drive the console of the system, from which the various parameters and configuration can be monitored. In some cases, especially in very large Beowulf configurations, there is more than one server node, possibly with further nodes dedicated to particular tasks such as consoles or monitoring stations.
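As a small sketch of the kind of program such a cluster runs, the following minimal MPI example simply reports each process's rank; it assumes an MPI implementation is installed on the nodes.

    #include <stdio.h>
    #include <mpi.h>

    /* Each process on the cluster prints its rank, the total number of
     * processes, and the node it is running on.                         */
    int main(int argc, char **argv)
    {
        int rank, size, name_len;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &name_len);

        printf("process %d of %d running on %s\n", rank, size, name);

        MPI_Finalize();
        return 0;
    }

Compiled with mpicc and started with mpirun, one copy of the program runs on each participating node, which is the usage pattern the MPI and PVM libraries mentioned above support.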
PLAPACK
PLAPACK is an MPI-based Parallel Linear Algebra Package that provides an infrastructure for building parallel dense linear algebra libraries. PLAPACK offers three distinctive features:
o Physically based matrix distribution
o An API for querying matrices and vectors
o A programming interface that allows an object-oriented style of programming
ScaLAPACK
ScaLAPACK is a library of high-performance linear algebra routines for distributed-memory MIMD computers. It contains routines for solving systems of linear equations. Most machine dependencies are limited to two standard libraries: the PBLAS, or Parallel Basic Linear Algebra Subprograms, and the BLACS, or Basic Linear Algebra Communication Subprograms. LAPACK and ScaLAPACK will run on any system where the PBLAS and the BLACS are available.
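As a rough sketch of the BLACS layer that ScaLAPACK builds on, the example below creates a process grid and reports each process's coordinates. The C prototypes are declared by hand, as is common practice since BLACS installations do not always ship a C header; the fixed two-row grid is an assumption made for the example.

    #include <stdio.h>

    /* Typical C-interface prototypes for the BLACS (declared by hand). */
    void Cblacs_pinfo(int *mypnum, int *nprocs);
    void Cblacs_get(int icontxt, int what, int *val);
    void Cblacs_gridinit(int *icontxt, char *order, int nprow, int npcol);
    void Cblacs_gridinfo(int icontxt, int *nprow, int *npcol, int *myrow, int *mycol);
    void Cblacs_gridexit(int icontxt);
    void Cblacs_exit(int done);

    int main(void)
    {
        int iam, nprocs, ctxt, nprow, npcol, myrow, mycol;

        Cblacs_pinfo(&iam, &nprocs);      /* who am I, how many processes?   */
        nprow = 2;                        /* assume a grid with two rows     */
        npcol = nprocs / nprow;

        Cblacs_get(-1, 0, &ctxt);         /* obtain a default system context */
        Cblacs_gridinit(&ctxt, "Row", nprow, npcol);
        Cblacs_gridinfo(ctxt, &nprow, &npcol, &myrow, &mycol);

        printf("process %d of %d sits at grid position (%d,%d)\n",
               iam, nprocs, myrow, mycol);

        if (myrow >= 0)                   /* processes outside the grid report -1 */
            Cblacs_gridexit(ctxt);
        Cblacs_exit(0);
        return 0;
    }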
Diskless Cluster
In a diskless cluster, a node boots without a local disk in three steps: the server first assigns the node an IP address, the node then loads its operating system kernel over TFTP, and finally the server supplies the operating system itself (typically as a network-mounted root file system).
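A minimal sketch of the first two steps, assuming the ISC DHCP server and a TFTP-served pxelinux.0 boot loader (the addresses and file names are only illustrative):

    # /etc/dhcpd.conf on the server (illustrative addresses)
    subnet 192.168.1.0 netmask 255.255.255.0 {
        range 192.168.1.100 192.168.1.200;   # addresses handed to booting nodes
        option routers 192.168.1.1;
        next-server 192.168.1.1;             # TFTP server that holds the kernel
        filename "pxelinux.0";               # boot loader fetched over TFTP
    }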
Related Links:
http://stonesoup.esd.ornl.gov/
http://extremelinux.esd.ornl.gov/
http://www.beowulf.org/
http://www.cacr.caltech.edu/research/beowulf/
http://beowulf-underground.org/