
GRID COMPUTING

Grid computing is a system of interconnected computers in which the machines use their
resources collectively. A grid usually consists of one main computer that distributes
information and tasks to a group of networked computers in order to accomplish a common goal. Grid
computing is often used to complete complicated or tedious mathematical or scientific
calculations.
Grid computing applies the resources of many computers in a network to a single problem at the
same time, usually a scientific or technical problem that requires a great number of
processing cycles or access to large amounts of data. It can be seen as a computer
network in which each computer's resources are shared with every other computer in the system.
Processing power, memory and data storage are all community resources that authorized users
can tap into and leverage for specific tasks. A grid computing system can be as simple as a
collection of similar computers running the same operating system or as complex as internetworked systems comprising every computer platform you can think of.
The grid computing concept isn't a new one. It's a special kind of distributed computing. In
distributed computing, different computers within the same network share one or more resources.
In the ideal grid computing system, every resource is shared, turning a computer network into a
powerful supercomputer. With the right user interface, accessing a grid computing system would
look no different than accessing a local machine's resources. Every authorized computer would
have access to enormous processing power and storage capacity.
Though the concept isn't new, it's also not yet perfected. Computer scientists, programmers and
engineers are still working on creating, establishing and implementing standards and protocols.
Right now, many existing grid computing systems rely on proprietary software and tools. Once
people agree upon a reliable set of standards and protocols, it will be easier and more efficient
for organizations to adopt the grid computing model.

Grid computing requires the use of software that can divide and farm out pieces of a program to
as many as several thousand computers. Grid computing can be thought of as distributed and
large-scale cluster computing and as a form of network-distributed parallel processing. It can be
confined to the network of computer workstations within a corporation or it can be a public
collaboration (in which case it is also sometimes known as a form of peer-to-peer computing).
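
As a rough illustration of this divide-and-farm-out idea, the following Python sketch splits a workload into chunks and hands each chunk to a separate worker process. A real grid scheduler would dispatch the chunks to remote machines rather than local processes; the function names and the toy computation here are purely illustrative assumptions.

# Minimal sketch of the divide-and-farm-out idea, using local worker
# processes as stand-ins for grid nodes. All names are illustrative.
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk):
    # Stand-in for the real computation a grid node would perform.
    return sum(x * x for x in chunk)

def split(work, n_chunks):
    # Divide the overall workload into roughly equal pieces.
    size = max(1, len(work) // n_chunks)
    return [work[i:i + size] for i in range(0, len(work), size)]

if __name__ == "__main__":
    work = list(range(1_000_000))
    chunks = split(work, n_chunks=8)
    # Farm the chunks out to the workers and gather the partial results.
    with ProcessPoolExecutor() as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    print(sum(partial_results))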

A number of corporations, professional groups, university consortiums, and other groups have
developed or are developing frameworks and software for managing grid computing projects.
The European Union (EU) is sponsoring a project for a grid for high-energy physics, earth
observation, and biology applications. In the United States, the National Technology Grid is
prototyping a computational grid for infrastructure and an access grid for people. Sun
Microsystems offers Grid Engine software. Described as a distributed resource management
(DRM) tool, Grid Engine allows engineers at companies like Sony and Synopsys to pool the
computer cycles on up to 80 workstations at a time. (At this scale, grid computing can be seen as
a more extreme case of load balancing.)

Grid computing appears to be a promising trend for three reasons: (1) it makes more
cost-effective use of a given amount of computer resources, (2) it offers a way to solve problems that
can't be approached without an enormous amount of computing power, and (3) it
suggests that the resources of many computers can be cooperatively, and perhaps synergistically,
harnessed and managed as a collaboration toward a common objective. In some grid computing
systems, the computers may collaborate rather than being directed by one managing computer.
One likely area for the use of grid computing is pervasive computing applications, those in
which computers pervade our environment without our necessarily being aware of them.

TYPES OF GRID


Computational grids: used to allocate resources specifically for computing
power.
Scavenging grids: used to find and harvest machine cycles from idle servers and
desktop computers for use in resource-intensive tasks (a minimal sketch of this idea follows this list).
Data grids: provide a unified interface to all data repositories in an organization,
through which data can be queried, managed and secured.
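
As referenced in the scavenging-grids item above, the sketch below illustrates the cycle-scavenging idea: run grid work only when the local machine looks idle. The load-average check, the threshold and the function names are assumptions for illustration, not part of any particular grid product.

# Illustrative sketch of cycle scavenging: use a desktop only when it is idle.
import os
import time

IDLE_LOAD_THRESHOLD = 0.2  # assumed threshold: low 1-minute load average means "idle"

def machine_is_idle():
    load_1min, _, _ = os.getloadavg()  # Unix-only; a real scavenger would be cross-platform
    return load_1min < IDLE_LOAD_THRESHOLD

def run_grid_task():
    # Stand-in for fetching and executing one work unit from the grid.
    print("running a borrowed work unit")

while True:
    if machine_is_idle():
        run_grid_task()
    else:
        print("owner is active; backing off")
    time.sleep(60)  # re-check periodically either way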

EVOLUTION OF GRID COMPUTING


The term Grid computing originated in the early 1990s as a metaphor for making computer
power as easy to access as an electric power grid, in Ian Foster and Carl Kesselman's seminal
work, "The Grid: Blueprint for a New Computing Infrastructure". CPU scavenging and volunteer
computing were popularized beginning in 1997 by distributed.net and later in 1999 by
SETI@home to harness the power of networked PCs worldwide in order to solve CPU-intensive
research problems.
The ideas of the grid (including those from distributed computing, object oriented programming,
web services and others) were brought together by Ian Foster, Carl Kesselman and Steve Tuecke,
widely regarded as the "fathers of the grid." They led the effort to create the Globus Toolkit
incorporating not just computation management but also storage management, security
provisioning, data movement, monitoring and a toolkit for developing additional services based
on the same infrastructure, including agreement negotiation, notification mechanisms, trigger
services and information aggregation.

THE EVOLUTION OF THE GRID: THE FIRST GENERATION


The early Grid efforts started as projects to link supercomputing sites; at this time this approach
was known as metacomputing. The origin of the term is believed to have been the CASA project,
one of several US gigabit testbeds in operation around 1989. Larry Smarr, the former NCSA Director, is
generally credited with popularizing the term thereafter. The early to mid-1990s marked the
emergence of the early metacomputing or grid environments. Typically, the objective of these
early metacomputing projects was to provide computational resources to a range of high
performance applications. Two representative projects in the vanguard of this type of technology
were FAFNER [FAFNER] and I-WAY. These projects differed in many ways, but both had to
overcome a number of similar hurdles, including communications, resource management, and
the manipulation of remote data, in order to work efficiently and effectively. The two projects
also attempted to provide metacomputing resources from opposite ends of the computing
spectrum: whereas FAFNER was capable of running on any workstation with more than 4
Mbytes of memory, I-WAY was a means of unifying the resources of large US supercomputing
centers.

THE EVOLUTION OF THE GRID: THE SECOND GENERATION


The emphasis of the early efforts in grid computing was in part driven by the need to link a
number of US national supercomputing centers. The I-WAY project successfully achieved this
goal. Today the grid infrastructure is capable of binding together more than just a few specialized
supercomputing centers. A number of key enablers have helped make the Grid more ubiquitous,
including the take-up of high-bandwidth network technologies and the adoption of standards,
allowing the Grid to be viewed as a viable distributed infrastructure on a global scale that can
support diverse applications requiring large-scale computation and data. This vision of the Grid
is what we regard as the second generation, typified by many of today's grid
applications.
There are three main issues that have to be confronted:
Heterogeneity: a Grid involves a multiplicity of resources that are heterogeneous in nature and
might span numerous administrative domains across a potentially global expanse. As any cluster
manager knows, their only truly homogeneous cluster is their first one!
Scalability: a Grid might grow from a few resources to millions. This raises the problem of
potential performance degradation as the size of a Grid increases. Consequently, applications that
require a large number of geographically distributed resources must be designed to be latency
tolerant and to exploit the locality of accessed resources. Furthermore, increasing scale also
involves crossing an increasing number of organizational boundaries, which emphasizes
heterogeneity and the need to address authentication and trust issues. Larger scale applications
may also result from the composition of other applications, which increases the intellectual
complexity of systems.

Adaptability: in a Grid, a resource failure is the rule, not the exception. In fact, with so many
resources in a grid, the probability of some resource failing is naturally high. Resource managers
or applications must tailor their behavior dynamically so that they can extract the maximum
performance from the available resources and services.
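
To see why failure becomes the rule at scale, a back-of-the-envelope calculation helps: if each resource is independently available 99.9% of the time (an illustrative figure, not a measurement), the chance that every resource is up at once collapses as the grid grows. The snippet below just evaluates that arithmetic.

# Illustrative arithmetic: probability that every resource is simultaneously
# available, assuming independent 99.9% per-resource uptime (an assumption).
p_single = 0.999
for n in (10, 100, 1_000, 10_000):
    p_all_up = p_single ** n
    print(f"{n:>6} resources: P(all up) = {p_all_up:.5f}")
# With 10,000 such resources, P(all up) is roughly 0.000045, so some resource
# is almost always down and applications must adapt to that fact.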

Middleware is generally considered to be the layer of software sandwiched between the
operating system and applications, providing a variety of services required by an application to
function correctly. Recently, middleware has re-emerged as a means of integrating software
applications running in distributed heterogeneous environments. In a Grid, the middleware is
used to hide the heterogeneous nature and provide users and applications with a homogeneous
and seamless environment by providing a set of standardized interfaces to a variety of services.
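
One way to picture what such middleware does is a single, standardized job-submission interface with interchangeable back ends. The class and method names below are hypothetical; the sketch only shows the idea of hiding heterogeneity behind a common API, not any real middleware product.

# Hypothetical sketch of a middleware-style uniform interface: applications
# call submit() the same way regardless of which resource sits underneath.
from abc import ABC, abstractmethod

class JobBackend(ABC):
    @abstractmethod
    def submit(self, executable: str, args: list) -> str:
        """Submit a job and return an opaque job identifier."""

class LocalClusterBackend(JobBackend):
    def submit(self, executable, args):
        # In reality this would talk to a batch scheduler on the local cluster.
        return f"cluster-job:{executable}"

class RemoteSiteBackend(JobBackend):
    def submit(self, executable, args):
        # In reality this would contact a remote site's job service.
        return f"remote-job:{executable}"

def run_everywhere(backends, executable, args):
    # The application code is identical for every kind of resource.
    return [b.submit(executable, args) for b in backends]

print(run_everywhere([LocalClusterBackend(), RemoteSiteBackend()], "simulate", ["--steps", "100"]))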

Setting and using standards is also key to tackling heterogeneity. Systems use varying standards
and system APIs, resulting in the need to port services and applications to the plethora of
computer systems used in a grid environment. As a general principle, agreed interchange formats
help reduce complexity, because only n converters are needed to enable n components to interoperate
via one standard, as opposed to n(n-1) converters for them to interoperate with each other pairwise.
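
A quick count makes the saving concrete; the snippet below simply evaluates the two expressions for a few values of n.

# Converters needed for n components: n with one agreed standard,
# n*(n-1) if every ordered pair needs its own converter.
for n in (5, 10, 50):
    print(n, "components:", n, "converters via a standard vs", n * (n - 1), "pairwise")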

THE EVOLUTION OF THE GRID: THE THIRD GENERATION

The second generation provided the interoperability that was essential to achieve large-scale
computation. As further grid solutions were explored, other aspects of the engineering of the
Grid became apparent. In order to build new grid applications it was desirable to be able to reuse
existing components and information resources, and to assemble these components in a flexible
manner. The solutions involved increasing adoption of a service-oriented model and increasing
attention to metadata; these are two key characteristics of third-generation systems. In fact, the
service-oriented approach itself has implications for the information fabric: the flexible assembly
of grid resources into a grid application requires information about the functionality, availability
and interfaces of the various components, and this information must have an agreed
interpretation which can be processed by machine. For further discussion of the service oriented
approach see the companion Semantic Grid paper.
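
As an illustration of machine-processable information about a component, a grid service might be described by a small metadata record like the one below. The field names and values are hypothetical and are not drawn from any particular standard.

# Hypothetical, machine-processable description of a grid component: enough
# for a broker to decide whether and how to assemble it into an application.
service_description = {
    "name": "protein-folding-simulator",          # illustrative service name
    "functionality": "molecular dynamics simulation",
    "interface": {
        "operation": "submit_job",
        "inputs": ["structure_file", "num_steps"],
        "outputs": ["trajectory_file"],
    },
    "availability": {"status": "up", "max_concurrent_jobs": 32},
}

# A (very naive) matchmaker can now select components by declared functionality.
def matches(description, required_functionality):
    return required_functionality in description["functionality"]

print(matches(service_description, "molecular dynamics"))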

Whereas the Grid had traditionally been described in terms of large scale data and computation,
the shift in focus in the third generation was apparent from new descriptions.

In particular, the terms distributed collaboration and virtual organization were adopted in the
anatomy paper. The third generation is a more holistic view of grid computing and can be said
to address the infrastructure for e-Science, a term which reminds us of the requirements (of
doing new science, and of the e-Scientist) rather than of the enabling technology. As Fox notes, the
anticipated use of massively parallel computing facilities is only part of the picture that has
emerged: there are also a lot more users; hence loosely coupled distributed computing has not
been dominated by deployment of massively parallel machines.

There is a strong sense of automation in third generation systems; for example, when humans can
no longer deal with the scale and heterogeneity but delegate to processes to do so (e.g. through
scripting), which leads to autonomy within the systems. This implies a need for coordination,
which, in turn, needs to be specified programmatically at various levels including process
descriptions. Similarly, the increased likelihood of failure implies a need for automatic recovery:
configuration and repair cannot remain manual tasks. These requirements resemble the self-organizing and self-healing properties of biological systems, and have been termed autonomic, after
the autonomic nervous system. According to the definition in [Autonomic], an autonomic system
has the following eight properties:

1. Needs detailed knowledge of its components and status;
2. Must configure and reconfigure itself dynamically;
3. Seeks to optimize its behavior to achieve its goal;
4. Is able to recover from malfunction;
5. Protects itself against attack;
6. Is aware of its environment;
7. Implements open standards;
8. Makes optimized use of resources.

Third-generation grid systems now under development are beginning to exhibit many of these
features.
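
The kind of automation described above is often pictured as a monitor-analyse-act loop. The snippet below is a toy version of such a loop in which a component that stops responding is restarted without human intervention; the component names, the health check and the failure rate are assumptions made purely for illustration.

# Toy autonomic control loop: monitor components, detect malfunction, and
# reconfigure (here, restart) automatically. Names and checks are illustrative.
import random
import time

components = {"scheduler": True, "data-mover": True, "monitor": True}

def is_healthy(name):
    # Stand-in health check; a real system would probe the actual service.
    return random.random() > 0.1   # ~10% chance a check reports a failure

def restart(name):
    print(f"restarting {name}")
    components[name] = True

for _ in range(5):                 # a few iterations of the loop for illustration
    for name in components:
        if not is_healthy(name):
            components[name] = False
            restart(name)          # recover from malfunction without operator action
    time.sleep(1)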

RELATIONSHIP BETWEEN GRID COMPUTING AND SUPERCOMPUTERS

BENEFITS OF GRID COMPUTING


Grid computing has proved beneficial in many ways; some of these benefits are:
1. Exploitation of Under-Utilized Resources: Under-utilized resources are exploited by
running existing applications on different machines, exploiting idle time on other
machines, and aggregating unused disk drive capacity.

2. Reduces Computational Time: Computational time is reduced for complex numerical
and data analysis problems.
3. Provides Information Access: Information accessibility maximizes the exploitation of
existing data assets by providing unified data access, even when querying non-standard data formats.
4. Reduces cost by optimizing existing IT infrastructure: The grid facilitates cost reduction
by optimizing the use of existing IT infrastructure investments and by enabling data
sharing and distributed workflow across partners, thereby enabling faster design
processes.
5. Provides access to parallel CPU capacity: Grid computing offers access to
large-scale parallel computation to enhance performance in computationally intensive
applications.

6. Offers improved reliability: Grid technology offers an alternative approach to achieving
improved reliability. Parallelization can boost reliability by having multiple copies of
important jobs run concurrently on separate machines on the grid; their results can then be
checked for any kind of inconsistency, such as failures, data corruption and tampering
(a minimal sketch of this idea follows this list).
7. Provision of resource balancing: The grid offers good resource balancing measures that
can handle occasional peak loads, job prioritization, and job scheduling.
8. Effective management of resources: With grid technology, an organization's management
can easily visualize resource capacity and utilization to effectively control expenditure
on computing resources across a larger organization.
9. Interoperability of virtual organizations: The grid offers collaboration facilities and
interoperability between different virtual organizations by allowing the sharing and
interoperation of the heterogeneous resources available.
10. Access to additional resources: The grid offers access to other specialized devices such
as cameras and embedded systems.
11. Harnessing heterogeneous systems together: Grid computing can be used to harness
heterogeneous systems together into a mega-computer, thereby applying greater computational
power to a task.
12. Grid virtualization: Grid computing offers grid virtualization, thereby enabling a single
local computer to undertake powerful applications.
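
As promised under benefit 6, here is a minimal sketch of reliability through redundant execution: the same job runs on several simulated machines and the results are cross-checked by majority vote. The job, the failure model and the function names are invented purely for illustration.

# Illustrative sketch of benefit 6: run copies of a job on several machines
# and cross-check the results to detect failures, corruption or tampering.
import random
from collections import Counter

def run_on_machine(job_input):
    # Stand-in for executing the job on one grid machine; occasionally it
    # returns a corrupted answer to simulate a fault.
    correct = sum(job_input)
    return correct if random.random() > 0.05 else correct + random.randint(1, 9)

def reliable_result(job_input, copies=5):
    results = [run_on_machine(job_input) for _ in range(copies)]
    value, votes = Counter(results).most_common(1)[0]
    if votes <= copies // 2:
        raise RuntimeError("no majority: results too inconsistent to trust")
    return value

print(reliable_result(list(range(100))))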
