Grid computing is a set of interconnected computer systems in which the machines share resources collectively. It usually consists of one main computer that distributes information and tasks to a group of networked computers to accomplish a common goal, and it is often used to complete complicated or tedious mathematical or scientific calculations.
Grid computing is the application of the resources of many computers in a network to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. Grid computing is a computer network in which each computer's resources are shared with every other computer in the system. Processing power, memory and data storage are all community resources that authorized users can tap into and leverage for specific tasks. A grid computing system can be as simple as a collection of similar computers running the same operating system or as complex as internetworked systems composed of every computer platform you can think of.
The grid computing concept isn't a new one. It's a special kind of distributed computing. In
distributed computing, different computers within the same network share one or more resources.
In the ideal grid computing system, every resource is shared, turning a computer network into a
powerful supercomputer. With the right user interface, accessing a grid computing system would
look no different than accessing a local machine's resources. Every authorized computer would
have access to enormous processing power and storage capacity.
Though the concept isn't new, it's also not yet perfected. Computer scientists, programmers and
engineers are still working on creating, establishing and implementing standards and protocols.
Right now, many existing grid computer systems rely on proprietary software and tools. Once
people agree upon a reliable set of standards and protocols, it will be easier and more efficient
for organizations to adopt the grid computing model.
Grid computing requires the use of software that can divide and farm out pieces of a program to
as many as several thousand computers. Grid computing can be thought of as distributed and
large-scale cluster computing and as a form of network-distributed parallel processing. It can be
confined to the network of computer workstations within a corporation or it can be a public
collaboration (in which case it is also sometimes known as a form of peer-to-peer computing).
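The divide-and-farm-out pattern at the heart of this can be sketched in miniature. The fragment below is an illustrative sketch only: a local process pool stands in for the thousands of networked machines that real grid middleware would manage, and the piece-splitting and worker function are hypothetical.

    # Minimal sketch of the divide-and-farm-out pattern behind grid computing.
    # A local process pool stands in for remote grid nodes; real middleware
    # would dispatch these pieces to networked machines instead.
    from concurrent.futures import ProcessPoolExecutor

    def process_piece(piece):
        # Placeholder for the computation one node would perform, e.g.
        # analysing one slice of a large scientific data set.
        return sum(x * x for x in piece)

    def split(data, n_pieces):
        # Divide the problem into independent pieces of roughly equal size.
        size = max(1, len(data) // n_pieces)
        return [data[i:i + size] for i in range(0, len(data), size)]

    if __name__ == "__main__":
        data = list(range(1_000_000))
        pieces = split(data, n_pieces=8)
        with ProcessPoolExecutor() as pool:      # stand-in for grid nodes
            partial_results = list(pool.map(process_piece, pieces))
        print(sum(partial_results))              # recombine the partial answers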
A number of corporations, professional groups, university consortiums, and other groups have
developed or are developing frameworks and software for managing grid computing projects.
The European Union (EU) is sponsoring a project for a grid for high-energy physics, earth observation, and biology applications. In the United States, the National Technology Grid is
prototyping a computational grid for infrastructure and an access grid for people. Sun
Microsystems offers Grid Engine software. Described as a distributed resource management
(DRM) tool, Grid Engine allows engineers at companies like Sony and Synopsys to pool the
computer cycles on up to 80 workstations at a time. (At this scale, grid computing can be seen as
a more extreme case of load balancing.)
Grid computing appears to be a promising trend for three reasons: (1) it makes more cost-effective use of a given amount of computer resources, (2) it offers a way to solve problems that can't be approached without an enormous amount of computing power, and (3) it suggests that the resources of many computers can be cooperatively, and perhaps synergistically, harnessed and managed as a collaboration toward a common objective. In some grid computing systems, the computers may collaborate rather than being directed by one managing computer. One likely area for the use of grid computing will be pervasive computing applications: those in which computers pervade our environment without our necessarily being aware of them.
Adaptability: in a Grid, a resource failure is the rule, not the exception. In fact, with so many
resources in a grid, the probability of some resource failing is naturally high. Resource managers
or applications must tailor their behavior dynamically so that they can extract the maximum
performance from the available resources and services.
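To make that concrete, the failure handling of a resource manager might look something like the sketch below. This is a minimal illustration under stated assumptions, not a real scheduler: submit_to, the node names and the failure rate are all invented, and retrying each task on another resource is just one simple adaptive policy.

    import random

    random.seed(1)  # deterministic demo: node-a will fail, node-b will succeed

    def submit_to(resource, task):
        # Hypothetical stand-in for dispatching a task to a grid resource.
        # In a grid, failure is the rule, so we simulate it explicitly.
        if random.random() < 0.3:
            raise RuntimeError(f"{resource} failed")
        return f"result of {task} on {resource}"

    def run_with_failover(task, resources):
        # Adapt to failures dynamically: reschedule the task on another
        # resource instead of giving up when one machine goes away.
        for resource in resources:
            try:
                return submit_to(resource, task)
            except RuntimeError:
                continue  # try the next available resource
        raise RuntimeError(f"no resource could run {task}")

    print(run_with_failover("task-42", ["node-a", "node-b", "node-c"]))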
Setting and using standards is also key to tackling heterogeneity. Systems use varying standards
and system APIs, resulting in the need to port services and applications to the plethora of
computer systems used in a grid environment. As a general principle, agreed interchange formats help reduce complexity: only n converters are needed to enable n components to interoperate via one standard, as opposed to the n(n-1) converters needed for each component to interoperate directly with every other. With ten components, for example, that is ten converters rather than ninety.
The second generation provided the interoperability that was essential to achieve large-scale
computation. As further grid solutions were explored, other aspects of the engineering of the
Grid became apparent. In order to build new grid applications it was desirable to be able to reuse
existing components and information resources, and to assemble these components in a flexible
manner. The solutions involved increasing adoption of a service-oriented model and increasing attention to metadata; these are two key characteristics of third generation systems. In fact the service-oriented approach itself has implications for the information fabric: the flexible assembly of grid resources into a grid application requires information about the functionality, availability and interfaces of the various components, and this information must have an agreed interpretation that can be processed by machine. For further discussion of the service-oriented approach see the companion Semantic Grid paper.
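The kind of machine-processable description this implies can be illustrated with a small sketch. The record layout and field names below are hypothetical rather than any real grid standard; they simply show how declared functionality, availability and interface information might drive the flexible selection of a component.

    from dataclasses import dataclass

    @dataclass
    class ServiceDescription:
        # Hypothetical metadata record for one grid service: what it does,
        # whether it is currently available, and the interface it exposes.
        name: str
        functionality: str       # e.g. "sequence-alignment"
        available: bool
        operations: tuple        # names of the operations it offers

    def find_service(registry, needed):
        # Flexible assembly: choose a component by its described
        # functionality and availability, not by hard-coding a machine.
        for service in registry:
            if service.functionality == needed and service.available:
                return service
        return None

    registry = [
        ServiceDescription("align-1", "sequence-alignment", False, ("run",)),
        ServiceDescription("align-2", "sequence-alignment", True, ("run", "status")),
    ]
    print(find_service(registry, "sequence-alignment").name)  # -> align-2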
Whereas the Grid had traditionally been described in terms of large scale data and computation,
the shift in focus in the third generation was apparent from new descriptions.
In particular, the terms distributed collaboration and virtual organization were adopted in the
anatomy paper. The third generation takes a more holistic view of grid computing and can be said to address the infrastructure for e-Science, a term which reminds us of the requirements (of doing new science, and of the e-Scientist) rather than of the enabling technology. As Fox notes, the anticipated use of massively parallel computing facilities is only part of the picture that has emerged: there are also many more users, and hence loosely coupled distributed computing has not been dominated by the deployment of massively parallel machines.
There is a strong sense of automation in third generation systems: when humans can no longer deal with the scale and heterogeneity themselves, they delegate to automated processes (e.g. through scripting), which leads to autonomy within the systems. This implies a need for coordination, which, in turn, needs to be specified programmatically at various levels, including process descriptions. Similarly, the increased likelihood of failure implies a need for automatic recovery: configuration and repair cannot remain manual tasks. These requirements resemble the self-organizing and self-healing properties of biological systems, and have been termed autonomic after the autonomic nervous system. According to the definition in [Autonomic], an autonomic system has the following eight properties: it knows itself in detail (its components, their status and capacity); it configures and reconfigures itself under varying and unpredictable conditions; it continually seeks to optimize its own operation; it recovers from events that cause parts of it to malfunction; it protects itself against attack; it is aware of its environment and adapts its behaviour accordingly; it functions in a heterogeneous world and implements open standards; and it anticipates the resources users need while keeping its own complexity hidden.
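In software terms, several of these properties come down to a closed monitoring loop that detects malfunction and repairs it without an operator. The sketch below is a hypothetical illustration of such a loop; probe, restart and the component names are invented for the example and belong to no real grid toolkit.

    import random
    import time

    random.seed(7)  # deterministic demo run

    def probe(component):
        # Hypothetical health check (self-knowledge of component status);
        # occasional malfunctions are simulated here.
        return random.random() > 0.2

    def restart(component):
        # Hypothetical repair action: self-healing instead of manual repair.
        print(f"restarting {component}")

    def autonomic_loop(components, cycles=5):
        # Monitor, detect and recover automatically, with no human in the loop.
        for _ in range(cycles):
            for component in components:
                if not probe(component):
                    restart(component)
            time.sleep(0.1)  # wait for the next monitoring cycle

    autonomic_loop(["scheduler", "replica-catalogue", "node-17"])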
The third generation grid systems now under development are beginning to exhibit many of these
features.