1.1 Overview

A computer cluster is a group of loosely coupled computers that work together closely, so that in many respects they can be viewed as a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.

All clusters basically fall into two broad categories: High Availability (HA) and High Performance Computing (HPC). HA clusters strive to provide extremely reliable services. HPC is a cluster configuration designed to provide greater computational power than one computer alone could provide. Organizations such as the National Oceanic and Atmospheric Administration are able to use clusters to forecast trends in potentially deadly weather conditions.

1.2 Relevance of the Project in the Present Scenario

An eternal struggle in any IT department is finding a method to squeeze the maximum processing power out of a limited budget. Today more than ever, enterprises require enormous processing power in order to manage their desktop applications, databases and knowledge management. Many business processes are extremely heavy users of IT resources, and yet IT budgets struggle to keep pace with the ever growing demand for yet more power. Some of the largest enterprises require the processing power of a supercomputer. Unfortunately, few enterprises can rustle up supercomputer-sized IT budgets. While there have been significant advances in mainframe computing in recent years, investment in a single large source of processing power may not be the most cost-effective or flexible solution.

1.3 Importance of the Project

Clusters are deployed to improve performance over that of a single computer. They provide a much more cost-effective solution than single computers of comparable speed or availability. They are scalable, i.e. while mainframe computers have a fixed processing capacity, computer clusters can be easily expanded as requirements change by adding additional nodes to the network. When a mainframe computer fails, the entire system fails. However, if a node in a computer cluster fails, its operations can be simply transferred to another node within the cluster, ensuring that there is no interruption in service.

1.4 Need of the Project

Instead of investing in a single mainframe, many enterprises are now choosing to invest their IT budgets in computer clusters, networks of high-powered, low-cost desktop computers that provide an attractive alternative to mainframe computers.

1.5 History of Clusters

The history of cluster computing is best captured by a footnote in Greg Pfister's In Search of Clusters: "Virtually every press release from DEC mentioning clusters says 'DEC, who invented clusters...'. IBM did not invent them either. Customers invented clusters, as soon as they could not fit all their work on one computer, or needed a backup. The date of the first is unknown, but it would be surprising if it was not in the 1960s, or even late 1950s."

The formal engineering basis of cluster computing as a means of doing parallel work of any sort was arguably invented by Gene Amdahl of IBM, who in 1967 published what has come to be regarded as the seminal paper on parallel processing: Amdahl's Law. Amdahl's Law describes mathematically the speedup one can expect from parallelizing any given otherwise serially performed task on a parallel architecture. This article defined the engineering basis for both multiprocessor computing and cluster computing, where the primary differentiator is whether the interprocessor communications are supported "inside" the computer (on, for example, a customized internal communications bus or network) or "outside" the computer on a commodity network.

Consequently, the history of early computer clusters is more or less directly tied into the history of early networks, as one of the primary motivations for the development of a network was to link computing resources, creating a de facto computer cluster. Packet switching networks were conceptually invented by the RAND corporation in 1962. Using the concept of a packet switched network, the ARPANET project succeeded in creating in 1969 what was arguably the world's first commodity-network based computer cluster by linking four different computer centers (each of which was something of a "cluster" in its own right, but probably not a commodity cluster). The ARPANET project grew into the Internet, which can be thought of as "the mother of all computer clusters" (as the union of nearly all of the compute resources, including clusters, that happen to be connected). It also established the paradigm in use by all computer clusters in the world today: the use of packet-switched networks to perform interprocessor communications between processor (sets) located in otherwise disconnected frames.

The development of customer-built and research clusters proceeded hand in hand with that of both networks and the Unix operating system from the early 1970s, as both TCP/IP and the Xerox PARC project created and formalized protocols for network-based communications. The Hydra operating system was built for a cluster of DEC PDP-11 minicomputers called C.mmp at Carnegie Mellon University in 1971. However, it was not until circa 1983 that the protocols and tools for easily doing remote job distribution and file sharing were defined (largely within the context of BSD Unix, as implemented by Sun Microsystems) and hence became generally available commercially, along with a shared filesystem.

The first commercial clustering product was ARCnet, developed by Datapoint in 1977. ARCnet was not a commercial success, and clustering per se did not really take off until Digital Equipment Corporation released their VAXcluster product in 1984 for the VAX/VMS operating system. The ARCnet and VAXcluster products not only supported parallel computing, but also shared file systems and peripheral devices. The idea was to provide the advantages of parallel processing while maintaining data reliability and uniqueness. VAXcluster, now VMScluster, is still available on OpenVMS systems from HP running on Alpha and Itanium systems.
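Amdahl's Law, cited earlier in this section, is usually stated as S(N) = 1 / ((1 - p) + p / N), where p is the fraction of the work that can run in parallel and N is the number of processors. The sketch below only illustrates that formula. The function name amdahl_speedup and the 95% parallel fraction are assumptions chosen for the example, not figures from this project.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of a task
    is spread evenly over `nodes` processors (Amdahl's Law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; only the parallel part shrinks with nodes.
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the run time can be parallelized.
    for n in (1, 2, 4, 16, 64, 1024):
        print(f"{n:5d} nodes -> speedup {amdahl_speedup(0.95, n):6.2f}")
```

Under this assumption the speedup can never exceed 1 / 0.05 = 20, however many nodes are added, which is why the serial portion of a workload matters as much as the size of the cluster.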

Two other noteworthy early commercial clusters were the Tandem Himalaya (a circa 1994 high-availability product) and the IBM S/390 Parallel Sysplex (also circa 1994, primarily for business use).

No history of commodity computer clusters would be complete without noting the pivotal role played by the development of Parallel Virtual Machine (PVM) software in 1989. This open source software based on TCP/IP communications enabled the instant creation of a virtual supercomputer, a high performance compute cluster, made out of any TCP/IP connected systems. Free form heterogeneous clusters built on top of this model rapidly achieved total throughput in FLOPS that greatly exceeded that available even with the most expensive "big iron" supercomputers. PVM and the advent of inexpensive networked PCs led, in 1993, to a NASA project to build supercomputers out of commodity clusters. In 1995 the Beowulf cluster was invented: a cluster built on top of a commodity network for the specific purpose of "being a supercomputer" capable of performing tightly coupled parallel HPC computations. This spurred the independent development of grid computing as a named entity, although Grid-style clustering had been around at least as long as the Unix operating system and the Arpanet, whether or not it, or the clusters that used it, were named.

1.6 Objectives

A cluster needs to achieve four basic objectives:

• Processing Power: The parallel processing power of a high-performance cluster can, in many cases, prove more cost effective than a mainframe with similar power. This reduced price per unit of power enables enterprises to get a greater Return on Investment from their IT budget.

• Availability: When a mainframe computer fails, the entire system fails. However, if a node in a computer cluster fails, its operations can be simply transferred to another node within the cluster, ensuring that there is no interruption in service.

• Scalability: The greatest advantage of computer clusters is the scalability they offer. While mainframe computers have a fixed processing capacity, computer clusters can be easily expanded as requirements change by adding additional nodes to the network.

• Compatibility: The cluster should be able to work properly on every type of network, such as a LAN or a wireless LAN.
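The availability objective above, moving a failed node's operations to another node, can be sketched in a few lines. This is a minimal illustration only: the function fail_over, the node names and the job names are all hypothetical, and a real cluster would also need failure detection, shared storage and protection against two nodes running the same job.

```python
from collections import deque


def fail_over(assignments: dict[str, list[str]], failed_node: str) -> dict[str, list[str]]:
    """Return a new assignment map in which the failed node's jobs
    are spread round-robin across the surviving nodes."""
    if failed_node not in assignments:
        raise KeyError(f"unknown node: {failed_node}")
    # Copy the surviving nodes' job lists so the input is left untouched.
    survivors = {
        node: list(jobs)
        for node, jobs in assignments.items()
        if node != failed_node
    }
    if not survivors:
        raise RuntimeError("no surviving node to take over the work")
    targets = deque(sorted(survivors))
    for job in assignments[failed_node]:
        survivors[targets[0]].append(job)
        targets.rotate(-1)  # the next job goes to the next surviving node
    return survivors


if __name__ == "__main__":
    # Hypothetical three-node cluster.
    cluster = {"node-a": ["web"], "node-b": ["db", "cache"], "node-c": ["batch"]}
    print(fail_over(cluster, "node-b"))
```

The same map also hints at the scalability objective: adding a node is just adding a key, after which new jobs can be assigned to it.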

