Professional Documents
Culture Documents
net/publication/215757945
Conference Paper in The International Journal of High Performance Computing Applications · August 2001
DOI: 10.1177/109434200101500302 · Source: DBLP
CITATIONS READS
6,206 4,849
3 authors, including:
All content following this page was uploaded by Ian Foster on 29 October 2018.
Published by:
http://www.sagepublications.com
Additional services and information for International Journal of High Performance Computing Applications can be found at:
Subscriptions: http://hpc.sagepub.com/subscriptions
Reprints: http://www.sagepub.com/journalsReprints.nav
Permissions: http://www.sagepub.com/journalsPermissions.nav
Citations: http://hpc.sagepub.com/content/15/3/200.refs.html
What is This?
1 Introduction
The term Grid was coined in the mid-1990s to denote a
proposed distributed computing infrastructure for ad-
THE ANATOMY OF THE GRID: vanced science and engineering (Foster and Kesselman,
ENABLING SCALABLE VIRTUAL 1998a). Considerable progress has since been made on
ORGANIZATIONS the construction of such an infrastructure (e.g., Beiriger
et al., 2000; Brunett et al., 1998; Johnston, Gannon, and
Nitzberg, 1999; Stevens et al., 1997), but the term Grid
1
Ian Foster has also been conflated, at least in popular perception, to
2
Carl Kesselman embrace everything from advanced networking to artifi-
3
Steven Tuecke cial intelligence. One might wonder whether the term has
any real substance and meaning. Is there really a distinct
Grid problem and hence a need for new Grid technolo-
Summary
gies? If so, what is the nature of these technologies, and
“Grid” computing has emerged as an important new field, what is their domain of applicability? While numerous
distinguished from conventional distributed computing by
groups have interest in Grid concepts and share, to a sig-
its focus on large-scale resource sharing, innovative appli-
cations, and, in some cases, high performance orienta- nificant extent, a common vision of Grid architecture, we
tion. In this article, the authors define this new field. First, do not see consensus on the answers to these questions.
they review the “Grid problem,” which is defined as flexible, Our purpose in this article is to argue that the Grid con-
secure, coordinated resource sharing among dynamic cept is indeed motivated by a real and specific problem
collections of individuals, institutions, and resources—
and that there is an emerging, well-defined Grid technol-
what is referred to as virtual organizations. In such set-
tings, unique authentication, authorization, resource ac- ogy base that addresses significant aspects of this prob-
cess, resource discovery, and other challenges are en- lem. In the process, we develop a detailed architecture
countered. It is this class of problem that is addressed by and road map for current and future Grid technologies.
Grid technologies. Next, the authors present an extensible Furthermore, we assert that while Grid technologies are
and open Grid architecture, in which protocols, services,
currently distinct from other major technology trends,
application programming interfaces, and software devel-
opment kits are categorized according to their roles in en- such as Internet, enterprise, distributed, and peer-to-peer
abling resource sharing. The authors describe require- computing, these other trends can benefit significantly
ments that they believe any such mechanisms must satisfy from growing into the problem space addressed by Grid
and discuss the importance of defining a compact set of technologies.
intergrid protocols to enable interoperability among differ-
The real and specific problem that underlies the Grid
ent Grid systems. Finally, the authors discuss how Grid
technologies relate to other contemporary technologies, concept is coordinated resource sharing and problem
including enterprise integration, application service pro- solving in dynamic, multi-institutional virtual organiza-
vider, storage service provider, and peer-to-peer comput- tions. The sharing that we are concerned with is not pri-
ing. They maintain that Grid concepts and technologies marily file exchange but rather direct access to comput-
complement and have much to contribute to these other
ers, software, data, and other resources, as is required by a
approaches.
range of collaborative problem-solving and resource-
brokering strategies emerging in industry, science, and
engineering. This sharing is, necessarily, highly con-
trolled, with resource providers and consumers defining
clearly and carefully just what is shared, who is allowed to
1
MATHEMATICS AND COMPUTER SCIENCE DIVISION, AR-
Address reprint requests to Ian Foster, Mathematics and Com- GONNE NATIONAL LABORATORY, ARGONNE, ILLINOIS, AND
puter Science Division, Argonne National Laboratory, 9700 S. DEPARTMENT OF COMPUTER SCIENCE, UNIVERSITY OF
Cass Avenue, MCS/221, Argonne, IL 60439; phone: (630) 252- CHICAGO
4619; fax: (630) 252-9556; e-mail: foster@mcs.anl.gov. 2
INFORMATION SCIENCES INSTITUTE, UNIVERSITY OF
SOUTHERN CALIFORNIA
The International Journal of High Performance Computing Applications, 3
MATHEMATICS AND COMPUTER SCIENCE DIVISION, AR-
Volume 15, No. 3, Fall 2001, pp. 200-222
2001 Sage Publications
GONNE NATIONAL LABORATORY, ARGONNE, ILLINOIS
Application
Collective
Resource
Transport
Connectivity
Internet
Fabric Link
Fig. 2 The layered Grid architecture and its relationship to the Internet protocol (IP) architecture. Because the IP architecture ex-
tends from network to application, there is a mapping from Grid layers into Internet layers.
The resource layer builds on connectivity-layer commu- • A Grid resource information protocol (GRIP, currently
nication and authentication protocols to define protocols based on the lightweight directory access protocol
(and APIs and SDKs) for the secure negotiation, initia- [LDAP]) is used to define a standard resource informa-
tion, monitoring, control, accounting, and payment of tion protocol and associated information model. An
sharing operations on individual resources. Resource- associated soft-state resource registration protocol, the
layer implementations of these protocols call fabric-layer Grid resource registration protocol (GRRP), is used to
functions to access and control local resources. Resource- register resources with Grid index information servers,
layer protocols are concerned entirely with individual re- discussed in the next section (Czajkowski et al., 2001).
sources and hence ignore issues of global state and atomic • The HTTP-based Grid resource access and manage-
actions across distributed collections; such issues are the ment (GRAM) protocol is used for allocation of com-
concern of the collective layer, discussed next. putational resources and for monitoring and control of
Two primary classes of resource-layer protocols can computation on those resources (Czajkowski et al.,
be distinguished: 1998).
• An extended version of the file transfer protocol,
• Information protocols are used to obtain information GridFTP, is a management protocol for data access;
about the structure and state of a resource, for example, extensions include use of connectivity-layer security
its configuration, current load, and usage policy (e.g., protocols, partial file access, and management of par-
cost). allelism for high-speed transfers (Allcock et al., 2001).
• Management protocols are used to negotiate access to FTP is adopted as a base data transfer protocol because
a shared resource, specifying, for example, resource of its support for third-party transfers and because its
requirements (including advanced reservation and separate control and data channels facilitate the imple-
quality of service) and the operation(s) to be per- mentation of sophisticated servers.
formed, such as process creation, or data access. Since • LDAP is also used as a catalog access protocol.
management protocols are responsible for instant-
iating sharing relationships, they must serve as a “pol- The Globus Toolkit defines client-side C and Java APIs
icy application point,” ensuring that the requested pro- and SDKs for each of these protocols. Server-side SDKs
tocol operations are consistent with the policy under and servers are also provided for each protocol to facili-
which the resource is to be shared. Issues that must be tate the integration of various resources (computational,
considered include accounting and payment. A proto- storage, network) into the Grid. For example, the Grid re-
col may also support monitoring the status of an opera- source information service (GRIS) implements server-
tion and controlling (e.g., terminating) the operation. side LDAP functionality, with callouts allowing for publi-
cation of arbitrary resource information (Czajkowski
et al., 2001). An important server-side element of the
Collective Layer
Co-reservation Service
Co-Allocation API & SDK
Resource Mgmt API & SDK
Resource Layer Resource Mgmt Protocol
Fig. 3 Collective- and resource-layer protocols, services, application programming interfaces (APIs), and software develop-
ment kits (SDKs) can be combined in a variety of ways to deliver functionality to applications
Connectivity APIs
Connectivity Protocols
Fabric
Fig. 4 Application programming interfaces (APIs) are implemented by software development kits (SDKs), which in turn use Grid
protocols to interact with network services that provide capabilities to the end user. Higher level SDKs can provide functionality
that is not directly mapped to a specific protocol but may combine protocol operations with calls to additional APIs as well as im-
plement local functionality. Solid lines represent a direct call; dashed lines represent protocol interactions.
4.5 APPLICATIONS
Collective Solver coupler, distributed data archiver Checkpointing, job management, failover, staging
(application specific)
Collective (generic) Resource discovery, resource brokering, system monitoring, community authorization,
certificate revocation
Resource Access to computation; access to data; access to information about system structure, state,
performance
Connectivity Communication (Internet protocol), service discovery (DNS), authentication, authorization,
delegation
Fabric Storage systems, computers, networks, code repositories, catalogs
9 Conclusions
We have provided in this article a concise statement of the
“Grid problem,” which we define as controlled and coordi-
nated resource sharing and resource use in dynamic, scal-
An important property of protocols is that they admit to multiple • The generic security service (GSS) API (Linn, 2000) defines
implementations: two endpoints need only implement the same standard functions for verifying the identity of communicat-
protocol to be able to communicate. Standard protocols are thus ing parties, encrypting messages, and so forth.