Grid computing

Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files. What distinguishes grid computing from conventional high performance computing systems such as cluster computing is that grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. Although a grid can be dedicated to a specialized application, it is more common that a single grid will be used for a variety of different purposes. Grids are often constructed with the aid of general-purpose grid software libraries known as middleware. Grid size can vary by a considerable amount. Grids are a form of distributed computing whereby a “super virtual computer” is composed of many networked loosely coupled computers acting together to perform very large tasks. Furthermore, “distributed” or “grid” computing, in general, is a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus.
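
The idea of many loosely coupled computers acting together on one very large task can be illustrated with a minimal sketch. It assumes the task splits into work units that need no data from each other; worker threads stand in for grid nodes, and the names `split`, `work_unit`, and `run_on_grid` are hypothetical, not part of any grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut `data` into at most `parts` contiguous, independent work units."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def work_unit(chunk):
    """Work done by one node; it needs nothing from any other node."""
    return sum(x * x for x in chunk)


def run_on_grid(data, nodes=4):
    """Farm the units out to `nodes` workers and combine the partial results."""
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        return sum(pool.map(work_unit, split(data, nodes)))
```

On a real grid the units would travel over a network to separate machines, so this pattern pays off only when each unit is large compared with the cost of shipping it.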

A grid combines computers from multiple administrative domains to reach a common goal[1] or to solve a single task, and may then disappear just as quickly. One of the main strategies of grid computing is to use middleware to divide and apportion pieces of a program among several computers, sometimes up to many thousands. Grid computing involves computation in a distributed fashion, which may also involve the aggregation of large-scale cluster computing-based systems. The size of a grid may vary from small (confined to a network of computer workstations within a corporation, for example) to large, public collaborations across many companies and networks. "The notion of a confined grid may also be known as an intranodes cooperation whilst the notion of a larger, wider grid may thus refer to an internodes cooperation". This technology has been applied to computationally intensive scientific, mathematical, and academic problems through volunteer computing, and it is used in
commercial enterprises for such diverse applications as drug discovery, economic forecasting, seismic analysis, and back office data processing in support for e-commerce and Web services.

Comparison of grids and conventional supercomputers

“Distributed” or “grid” computing in general is a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a network (private, public or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus. The primary advantage of distributed computing is that each node can be purchased as commodity hardware, which, when combined, can produce a similar computing resource as a multiprocessor supercomputer, but at a lower cost. This is due to the economies of scale of producing commodity hardware, compared to the lower efficiency of designing and constructing a small number of custom supercomputers. The primary performance disadvantage is that the various processors and local storage areas do not have high-speed connections. This arrangement is thus well-suited to applications in which multiple parallel computations can take place independently, without the need to communicate intermediate results between processors. The high-end scalability of geographically dispersed grids is generally favorable, due to the low need for connectivity between nodes relative to the capacity of the public Internet.

INTRODUCTION

The popularity of the Internet as well as the availability of powerful computers and high-speed network technologies as low-cost commodity components is changing the way we use computers today. These technology opportunities have led to the possibility of using distributed computers as a single, unified computing resource, leading to what is popularly known as Grid computing. The term Grid is chosen as an analogy to a power Grid that provides consistent, pervasive, dependable, transparent access to electricity irrespective of its source. A detailed analysis of this analogy can be found in. This new approach to network computing is known by several names, such as metacomputing, scalable computing, global computing, Internet computing, and more recently peer-to-peer (P2P) computing.

Figure 1: Towards grid computing: a conceptual view

Grids enable the sharing, selection, and aggregation of a wide variety of resources including supercomputers, storage systems, data sources, and specialized devices that are geographically distributed and owned by different organizations for solving large-scale computational and data intensive problems in science, engineering, and commerce. Thus creating virtual organizations and enterprises as a temporary
alliance of enterprises or organizations that come together to share resources and skills, core competencies, or resources in order to better respond to business opportunities or large-scale application processing requirements, and whose cooperation is supported by computer networks.

The concept of Grid computing started as a project to link geographically dispersed supercomputers, but now it has grown far beyond its original intent. The Grid infrastructure can benefit many applications, including collaborative engineering, data exploration, high-throughput computing, and distributed supercomputing.

A Grid can be viewed as a seamless, integrated computational and collaborative environment. The users interact with the Grid resource broker to solve problems, which in turn performs resource discovery, scheduling, and the processing of application jobs on the distributed Grid resources. From the end-user point of view, Grids can be used to provide the following types of services.
• Computational services. These are concerned with providing secure services for executing application jobs on distributed computational resources individually or collectively. Resource brokers provide the services for collective use of distributed resources. A Grid providing computational services is often called a computational Grid. Some examples of computational Grids are: NASA IPG, the World Wide Grid, and the NSF TeraGrid.
• Data services. These are concerned with providing secure access to distributed datasets and their management. To provide scalable storage and access to the data sets, they may be replicated, catalogued, and even different datasets stored in different locations to create an illusion of mass storage. The processing of datasets is carried out using computational Grid services and such a combination is commonly called data Grids. Sample applications that need such services for management, sharing, and processing of large datasets are high-energy physics and accessing distributed chemical databases for drug design.
• Application services. These are concerned with application management and providing access to remote software and libraries transparently. The emerging technologies such as Web services are expected to play a leading role in defining application services. They build on computational and data services provided by the Grid. An example system that can be used to develop such services is NetSolve.
• Information services. These are concerned with the extraction and presentation of data with meaning by using the services of computational, data, and/or application services. The low-level details handled by this are the way that information is represented, stored, accessed, shared, and
maintained. Given its key role in many scientific endeavors, the Web is the obvious point of departure for this level.
• Knowledge services. These are concerned with the way that knowledge is acquired, used, retrieved, published, and maintained to assist users in achieving their particular goals and objectives. Knowledge is understood as information applied to achieve a goal, solve a problem, or execute a decision. An example of this is data mining for automatically building new knowledge.

To build a Grid, the development and deployment of a number of services is required. These include security, information, directory, resource allocation, and payment mechanisms in an open environment and high-level services for application development, execution management, resource aggregation, and scheduling.

Grid applications (typically multidisciplinary and large-scale processing applications) often couple resources that cannot be replicated at a single site, or which may be globally located for other practical reasons. These are some of the driving forces behind the foundation of global Grids. In this light, the Grid allows users to solve larger or new problems by pooling together resources that could not be easily coupled before. Hence, the Grid is not only a computing infrastructure for large applications; it is a technology that can bond and unify remote and diverse distributed resources ranging from meteorological sensors to data vaults and from parallel supercomputers to personal digital organizers. As such, it will provide pervasive services to all users that need them.

Grid computing can provide many benefits not available with traditional computing models:
• Better utilization of resources: Grid computing uses distributed resources more efficiently and delivers more usable computing power. This can decrease time-to-market, allow for innovation, or enable additional testing and simulation for improved product quality. By employing existing resources, grid computing helps protect IT investments, containing costs while providing more capacity.
• Increased user productivity: By providing transparent access to resources, work can be completed more quickly. Users gain additional productivity as they can focus on design and development rather than wasting valuable time hunting for resources and manually scheduling and managing large numbers of jobs.
• Scalability: Grids can grow seamlessly over time, allowing many thousands of processors to be integrated into one cluster. Components can be updated independently and additional resources can be added as needed, reducing large one-time expenses.
• Flexibility: Grid computing provides computing power where it is needed most, helping to better meet dynamically changing workloads. Grids can contain
heterogeneous compute nodes, allowing resources to be added and removed as needs dictate.

Levels of Deployment

Grid computing can be divided into three logical levels of deployment: Cluster Grids, Enterprise Grids, and Global Grids.
• Cluster Grids. The simplest form of a grid, a Cluster Grid consists of multiple systems interconnected through a network. Cluster Grids may contain distributed workstations and servers, as well as centralized resources in a datacenter environment. Typically owned and used by a single project or department, Cluster Grids support both high throughput and high performance jobs. Common examples of the Cluster Grid architecture include compute farms, groups of multi-processor HPC systems, Beowulf clusters, and networks of workstations (NOW).
• Enterprise Grids. As capacity needs increase, multiple Cluster Grids can be combined into an Enterprise Grid. Enterprise Grids enable multiple projects or departments to share computing resources in a cooperative way. Enterprise Grids typically contain resources from multiple administrative domains, but are located in the same geographic location.
• Global Grids. Global Grids are a collection of Enterprise Grids, all of which have agreed upon global usage policies and protocols, but not necessarily the same implementation. Computing resources may be geographically dispersed, connecting sites around the globe. Designed to support and address the needs of multiple sites and organizations sharing resources, Global Grids provide the power of distributed resources to users anywhere in the world.

Figure 2: Three levels of grid computing: cluster, enterprise and global grid

GRID CONSTRUCTION: GENERAL PRINCIPLES

This section briefly highlights some of the general principles that underlie the construction of the Grid. In particular, the idealized design features that are required by a Grid to provide users with a seamless computing environment are discussed. Four main aspects characterize a Grid.
• Multiple administrative domains and autonomy. Grid resources are geographically distributed across multiple administrative domains and owned by
different organizations. The autonomy of resource owners needs to be honored along with their local resource management and usage policies.
• Heterogeneity. A Grid involves a multiplicity of resources that are heterogeneous in nature and will encompass a vast range of technologies.
• Scalability. A Grid might grow from a few integrated resources to millions. This raises the problem of potential performance degradation as the size of Grids increases. Consequently, applications that require a large number of geographically located resources must be designed to be latency and bandwidth tolerant.
• Dynamicity or adaptability. In a Grid, resource failure is the rule rather than the exception. In fact, with so many resources in a Grid, the probability of some resource failing is high. Resource managers or applications must tailor their behavior dynamically and use the available resources and services efficiently and effectively.

Design Features

The following are the main design features required by a Grid environment.
• Administrative hierarchy. An administrative hierarchy is the way that each Grid environment divides itself up to cope with a potentially global extent. The administrative hierarchy determines how administrative information flows through the Grid.
• Communication services. The communication needs of applications using a Grid environment are diverse, ranging from reliable point-to-point to unreliable multicast communications. The communications infrastructure needs to support protocols that are used for bulk-data transport, streaming data, group communications, and those used by distributed objects. The network services used also provide the Grid with important QoS parameters such as latency, bandwidth, reliability, fault-tolerance, and jitter control.
• Information services. A Grid is a dynamic environment where the location and types of services available are constantly changing. A major goal is to make all resources accessible to any process in the system, without regard to the relative location of the resource user. It is necessary to provide mechanisms to enable a rich environment in which information is readily obtained by requesting services. The Grid information (registration and directory) services components provide the mechanisms for registering and obtaining information about the Grid structure, resources, services, and status.
• Naming services. In a Grid, like in any distributed system, names are used to refer to a wide variety of objects such as computers, services, or data objects. The
naming service provides a uniform name space across the complete Grid environment. Typical naming services are provided by the international X.500 naming scheme or DNS, the Internet's scheme.
• Distributed file systems and caching. Distributed applications, more often than not, require access to files distributed among many servers. A distributed file system is therefore a key component in a distributed system. From an application's point of view it is important that a distributed file system can provide a uniform global namespace, support a range of file I/O protocols, require little or no program modification, and provide means that enable performance optimizations to be implemented, such as the usage of caches.
• Security and authorization. Any distributed system involves all four aspects of security: confidentiality, integrity, authentication, and accountability. Security within a Grid environment is a complex issue requiring diverse resources autonomously administered to interact in a manner that does not impact the usability of the resources or introduce security holes/lapses in individual systems or the environments as a whole. A security infrastructure is the key to the success or failure of a Grid environment.
• System status and fault tolerance. To provide a reliable and robust environment it is important that a means of monitoring resources and applications is provided. To accomplish this task, tools that monitor resources and applications need to be deployed.
• Resource management and scheduling. The management of processor time, memory, network, storage, and other components in a Grid is clearly very important. The overall aim is to efficiently and effectively schedule the applications that need to utilize the available resources in the Grid computing environment. From a user's point of view, resource management and scheduling should be transparent, their interaction with it being confined to a manipulating mechanism for submitting their application. It is important in a Grid that a resource management and scheduling service can interact with those that may be installed locally.
• Computational economy and resource trading. As a Grid is constructed by coupling resources distributed across various organizations and administrative domains that may be owned by different organizations, it is essential to support mechanisms and policies that help regulate resource supply and demand. An economic approach is one means of managing resources in a complex and decentralized manner. This approach provides incentives for resource owners and users to be part of the Grid and to develop and use strategies that help maximize their objectives.

• Programming tools and paradigms. Grid applications (multi-disciplinary applications) couple resources that cannot be replicated at a single site, or that may be globally located for other practical reasons. A Grid should include interfaces, APIs, utilities, and tools to provide a rich development environment. Common scientific languages such as C, C++, and Fortran should be available, as should application-level interfaces such as MPI and PVM. A variety of programming paradigms should be supported, such as message passing or distributed shared memory. In addition, a suite of numerical and other commonly used libraries should be available.
• User and administrative GUI. The interfaces to the services and resources available should be intuitive and easy to use. In addition, they should work on a range of different platforms and operating systems. They also need to take advantage of Web technologies to offer a view of portal supercomputing. The Web-centric approach to access supercomputing resources should enable users to access any resource from anywhere over any platform at any time. That means the users should be allowed to submit their jobs to computational resources through a Web interface from any of the accessible platforms such as PCs, laptops, or Personal Digital Assistants, thus supporting ubiquitous access to the Grid. The provision of access to scientific applications through the Web (e.g. RWCP's parallel protein information analysis system) leads to the creation of science portals.

GRID ARCHITECTURE

Our goal in describing our Grid architecture is not to provide a complete enumeration of all required protocols (and services, APIs, and SDKs) but rather to identify requirements for general classes of component. The result is an extensible, open architectural structure within which can be placed solutions to key virtual organization (VO) requirements. Our architecture and the subsequent discussion organize components into layers, as shown in Figure. Components within each layer share common characteristics but can build on capabilities and behaviors provided by any lower layer.

In specifying the various layers of the Grid architecture, we follow the principles of the hourglass model. The narrow neck of the hourglass defines a small set of core abstractions and protocols (e.g., TCP and HTTP in the Internet), onto which many different high-level behaviors can be mapped (the top of the hourglass), and which themselves can be mapped onto many different underlying technologies (the base of the hourglass). By definition, the number of protocols defined at the neck must be small. In our architecture, the neck of the hourglass consists of Resource and Connectivity protocols, which facilitate the sharing of individual resources. Protocols at these layers are designed so that they can be implemented on top of a diverse range of resource types, defined at the Fabric layer, and can in turn be used to construct a wide range of global services and application-specific behaviors at the Collective layer, so called because they involve the coordinated (collective) use of multiple resources.
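
The hourglass idea can be made concrete with a minimal sketch. It assumes a deliberately tiny "neck" of two calls; the interface `GridResource`, the adapters `ClusterNode` and `Workstation`, and the function `broker` are all hypothetical names chosen for illustration and do not correspond to any real Grid toolkit.

```python
from typing import Protocol, Sequence


class GridResource(Protocol):
    """Hypothetical 'neck' interface: the only calls higher layers may rely on."""

    def describe(self) -> dict:
        """Information call: report the resource's name and free CPUs."""
        ...

    def submit(self, task: str) -> str:
        """Management call: accept a task and return a status string."""
        ...


class ClusterNode:
    """Fabric-level adapter for a (pretend) batch cluster."""

    def __init__(self, name: str, free_cpus: int) -> None:
        self.name, self.free_cpus = name, free_cpus

    def describe(self) -> dict:
        return {"name": self.name, "free_cpus": self.free_cpus}

    def submit(self, task: str) -> str:
        return f"{self.name}: queued {task}"


class Workstation:
    """Fabric-level adapter for a (pretend) desktop machine."""

    def __init__(self, name: str) -> None:
        self.name = name

    def describe(self) -> dict:
        return {"name": self.name, "free_cpus": 1}

    def submit(self, task: str) -> str:
        return f"{self.name}: started {task}"


def broker(resources: Sequence[GridResource], task: str) -> str:
    """Collective-level behaviour written only against the neck interface."""
    best = max(resources, key=lambda r: r.describe()["free_cpus"])
    return best.submit(task)
```

Because `broker` touches only `describe` and `submit`, a new kind of resource can join by implementing those two calls, and new high-level behaviors can be added without changing any adapter. That is the property the narrow neck is meant to provide.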

Figure 3: The layered Grid architecture and its relationship to the Internet protocol architecture. Because the Internet protocol architecture extends from network to application, there is a mapping from Grid layers into Internet layers.

Fabric: Interfaces to Local Control

The Grid Fabric layer provides the resources to which shared access is mediated by Grid protocols: for example, computational resources, storage systems, catalogs, network resources, and sensors. A resource may be a logical entity, such as a distributed file system, computer cluster, or distributed computer pool; in such cases, a resource implementation may involve internal protocols (e.g., the NFS storage access protocol or a cluster resource management system's process management protocol), but these are not the concern of Grid architecture.

Fabric components implement the local, resource-specific operations that occur on specific resources (whether physical or logical) as a result of sharing operations at higher levels. There is thus a tight and subtle interdependence between the functions implemented at the Fabric level, on the one hand, and the sharing operations supported, on the other. Richer Fabric functionality enables more sophisticated sharing operations; at the same time, if we place few demands on Fabric elements, then deployment of Grid infrastructure is simplified. For example, resource-level support for advance reservations makes it possible for higher-level services to aggregate (coschedule) resources in interesting ways that would otherwise be impossible to achieve. However, as in practice few resources support advance reservation out of the box, a requirement for advance reservation increases the cost of incorporating new resources into a Grid.

Experience suggests that at a minimum, resources should implement enquiry mechanisms that permit discovery of their structure, state, and capabilities (e.g., whether they support advance reservation) on the one hand, and resource management mechanisms that provide some control of delivered quality of service, on the other. The following brief and partial list provides a resource-specific characterization of capabilities.
• Computational resources: Mechanisms are required for starting programs and for monitoring and controlling the execution of the resulting processes. Management mechanisms that allow control over the resources allocated to processes are useful, as are advance reservation mechanisms. Enquiry functions are needed for determining hardware and software characteristics as well as relevant state information such as current load and queue state in the case of scheduler-managed resources.
• Storage resources: Mechanisms are required for putting and getting files. Third-party and high-performance (e.g., striped) transfers are useful. So are mechanisms for reading and writing subsets of a file and/or executing remote data selection or reduction functions. Management mechanisms that allow control over the resources allocated to data transfers (space, disk bandwidth, network bandwidth, CPU) are useful, as are advance reservation mechanisms. Enquiry functions are needed for determining hardware and software characteristics as well as relevant load information such as available space and bandwidth utilization.

• Network resources: Management mechanisms that provide control over the resources allocated to network transfers (e.g., prioritization, reservation) can be useful. Enquiry functions should be provided to determine network characteristics and load.
• Code repositories: This specialized form of storage resource requires mechanisms for managing versioned source and object code: for example, a control system such as CVS.
• Catalogs: This specialized form of storage resource requires mechanisms for implementing catalog query and update operations: for example, a relational database.

Connectivity: Communicating Easily and Securely

The Connectivity layer defines core communication and authentication protocols required for Grid-specific network transactions. Communication protocols enable the exchange of data between Fabric layer resources. Authentication protocols build on communication services to provide cryptographically secure mechanisms for verifying the identity of users and resources.

Communication requirements include transport, routing, and naming. While alternatives certainly exist, we assume here that these protocols are drawn from the TCP/IP protocol stack: specifically, the Internet (IP and ICMP), transport (TCP, UDP), and application (DNS, OSPF, RSVP, etc.) layers of the Internet layered protocol architecture. This is not to say that in the future, Grid communications will not demand new protocols that take into account particular types of network dynamics.

With respect to security aspects of the Connectivity layer, we observe that the complexity of the security problem makes it important that any solutions be based on existing standards whenever possible. As with communication, many of the security standards developed within the context of the Internet protocol suite are applicable.

Authentication solutions for VO environments should have the following characteristics:
• Single sign on. Users must be able to log on (authenticate) just once and then have access to multiple Grid resources defined in the Fabric layer, without further user intervention.
• Delegation. A user must be able to endow a program with the ability to run on that user's behalf, so that the program is able to access the resources on which the user is authorized. The program should (optionally) also be able to conditionally delegate a subset of its rights to another program (sometimes referred to as restricted delegation).
• Integration with various local security solutions: Each site or resource provider may employ any of a variety of local security solutions, including Kerberos and Unix security. Grid security solutions must be able to interoperate with these various local solutions. They cannot, realistically, require wholesale replacement
of local security solutions but rather must allow mapping into the local environment.
• User-based trust relationships: In order for a user to use resources from multiple providers together, the security system must not require each of the resource providers to cooperate or interact with each other in configuring the security environment. For example, if a user has the right to use sites A and B, the user should be able to use sites A and B together without requiring that A's and B's security administrators interact.

Resource: Sharing Single Resources

The Resource layer builds on Connectivity layer communication and authentication protocols to define protocols (and APIs and SDKs) for the secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources. Resource layer implementations of these protocols call Fabric layer functions to access and control local resources. Resource layer protocols are concerned entirely with individual resources and hence ignore issues of global state and atomic actions across distributed collections; such issues are the concern of the Collective layer discussed next.

Two primary classes of Resource layer protocols can be distinguished:
• Information protocols are used to obtain information about the structure and state of a resource, for example, its configuration, current load, and usage policy (e.g., cost).
• Management protocols are used to negotiate access to a shared resource, specifying, for example, resource requirements (including advanced reservation and quality of service) and the operation(s) to be performed, such as process creation, or data access. Since management protocols are responsible for instantiating sharing relationships, they must serve as a policy application point, ensuring that the requested protocol operations are consistent with the policy under which the resource is to be shared. Issues that must be considered include accounting and payment. A protocol may also support monitoring the status of an operation and controlling (for example, terminating) the operation.

Collective: Coordinating Multiple Resources

While the Resource layer is focused on interactions with a single resource, the next layer in the architecture contains protocols and services (and APIs and SDKs) that are not associated with any one specific resource but rather are global in nature and capture interactions across collections of resources. For this reason, we refer to the next layer of the architecture as the Collective layer. Because Collective components build on the narrow Resource and Connectivity layer neck in the protocol hourglass, they can implement a wide variety of sharing behaviors without placing new requirements on the resources being shared. For example:
• Directory services allow VO participants to discover the existence and/or properties of VO resources. A directory service may allow its users to query for resources by name and/or by attributes such as type, availability, or load. Resource-level GRRP and GRIP protocols are used to construct directories.

Co-allocation, scheduling, and brokering services allow VO participants to request the allocation of one or more resources for a specific purpose and the scheduling of tasks on the appropriate resources. Examples include AppLeS, Condor-G, Nimrod-G, and the DRM broker.

Monitoring and diagnostics services support the monitoring of VO resources for failure, adversarial attack (intrusion detection), overload, and so forth.

Data replication services support the management of VO storage (and perhaps also network and computing) resources to maximize data access performance with respect to metrics such as response time, reliability, and cost.

Grid-enabled programming systems enable familiar programming models to be used in Grid environments, using various Grid services to address resource discovery, security, resource allocation, and other concerns. Examples include Grid-enabled implementations of the Message Passing Interface and manager-worker frameworks.

Workload management systems and collaboration frameworks, also known as problem solving environments (PSEs), provide for the description, use, and management of multistep, asynchronous, multi-component workflows.

Software discovery services discover and select the best software implementation and execution platform based on the parameters of the problem being solved. Examples include NetSolve and Ninf.

Community authorization servers enforce community policies governing resource access, generating capabilities that community members can use to access community resources. These servers provide a global policy enforcement service by building on resource information, and resource management protocols (in the Resource layer) and security protocols in the Connectivity layer. Akenti addresses some of these issues.

Community accounting and payment services gather resource usage information for the purpose of accounting, payment, and/or limiting of resource usage by community members.

Collaboratory services support the coordinated exchange of information within potentially large user communities, whether synchronously or asynchronously. Examples are CAVERNsoft, Access Grid, and commodity groupware systems.

These examples illustrate the wide variety of Collective layer protocols and services that are encountered in practice. Notice that while Resource layer protocols must be general in nature and are widely deployed, Collective layer protocols span the spectrum from general purpose to highly application or domain specific, with the latter existing perhaps only within specific VOs.

Collective functions can be implemented as persistent services, with associated protocols, or as SDKs (with associated APIs) designed to be linked with applications. In both cases, their implementation can build on Resource layer (or other Collective layer) protocols and APIs. For example, the figure shows a Collective co-allocation API and SDK (the middle tier) that uses a Resource layer management protocol to manipulate underlying resources. Above this, we define a co-reservation service protocol and implement a co-reservation service that speaks this protocol, calling the co-allocation API to implement co-allocation operations and perhaps providing additional functionality, such as authorization, fault tolerance, and logging. An application might then use the co-reservation service protocol to request end-to-end network reservations.
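The layering in the co-reservation example can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not any Grid toolkit's API: the names `Resource`, `co_allocate`, and `CoReservationService` are hypothetical, the Resource layer management protocol is reduced to in-memory `reserve` and `release` calls, and the "additional functionality" is limited to a simple authorization check and a log.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for one resource reached via a Resource layer management protocol."""
    name: str
    capacity: int
    reserved: int = 0

    def reserve(self, amount: int) -> bool:
        # Refuse the request if it would exceed this resource's capacity.
        if self.reserved + amount > self.capacity:
            return False
        self.reserved += amount
        return True

    def release(self, amount: int) -> None:
        self.reserved = max(0, self.reserved - amount)


def co_allocate(requests):
    """Collective-layer co-allocation (assumed all-or-nothing semantics).

    `requests` is a list of (Resource, amount) pairs. Either every
    reservation succeeds, or all partial reservations are rolled back.
    """
    granted = []
    for resource, amount in requests:
        if resource.reserve(amount):
            granted.append((resource, amount))
        else:
            for held, held_amount in granted:
                held.release(held_amount)
            return False
    return True


class CoReservationService:
    """Hypothetical service layered above the co-allocation API.

    Adds authorization and logging on top of co_allocate().
    """

    def __init__(self, authorized_users):
        self.authorized_users = set(authorized_users)
        self.log = []

    def request(self, user, requests) -> bool:
        if user not in self.authorized_users:
            self.log.append((user, "denied"))
            return False
        ok = co_allocate(requests)
        self.log.append((user, "granted" if ok else "failed"))
        return ok
```

An application would call `CoReservationService.request` with one `(Resource, amount)` pair per network segment. The all-or-nothing rollback is a design choice made for this sketch; the text does not prescribe how a real co-allocator handles partial failure.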

Figure 4. Collective and Resource layer protocols, services, APIs, and SDKs can be combined in a variety of ways to deliver functionality to applications.

Collective components may be tailored to the requirements of a specific user community, VO, or application domain, for example, an SDK that implements an application-specific coherency protocol, or a co-reservation service for a specific set of network resources. Other Collective components can be more general-purpose, for example, a replication service that manages an international collection of storage systems for multiple communities, or a directory service designed to enable the discovery of VOs. In general, the larger the target user community, the more important it is that a Collective component's protocol(s) and API(s) be standards based.

• Applications

The final layer in our Grid architecture comprises the user applications that operate within a VO environment. The figure illustrates an application programmer's view of Grid architecture. Applications are constructed in terms of, and by calling upon, services defined at any layer. At each layer, we have well-defined protocols that provide access to some useful service: resource management, data access, resource discovery, and so forth. At each layer, APIs may also be defined whose implementation (ideally provided by third-party SDKs) exchanges protocol messages with the appropriate service(s) to perform desired actions.

Figure 5. APIs are implemented by software development kits (SDKs), which in turn use Grid protocols to interact with network services that provide capabilities to the end user. Higher level SDKs can provide functionality that is not directly mapped to a specific protocol, but may combine protocol operations with calls to additional APIs as well as implement local functionality. Solid lines represent a direct call; dashed lines, protocol interactions.

We emphasize that what we label "applications" and show in a single layer in Figure 4 may in practice call upon sophisticated frameworks and libraries (e.g., the Common Component Architecture, SciRun, CORBA, Cactus, workflow systems) and feature much internal structure that would, if captured in our figure, expand it out to many times its current size. These frameworks may themselves define protocols, services, and/or APIs (e.g., the Simple Workflow Access Protocol). However, these issues are beyond the scope of this article, which addresses only the most fundamental protocols and services required in a Grid.

GRID APPLICATIONS

• Distributed Supercomputing

Distributed supercomputing applications use grids to aggregate substantial computational resources in order to tackle problems that cannot be solved on a single system. Depending on the grid on which we are working, these aggregated resources might comprise the majority of the supercomputers in the country or simply all of the workstations within a company. Here are some contemporary examples:

Distributed interactive simulation (DIS) is a technique used for training and planning in the military.

• High-Throughput Computing

In high-throughput computing, the grid is used to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles (often from idle workstations) to work. The result may be, as in distributed supercomputing, the focusing of available resources on a single problem, but the quasi-independent nature of the tasks involved leads to very different types of problems and problem-solving methods. Here are some examples:

Platform Computing Corporation reports that the microprocessor manufacturer Advanced Micro Devices used high-throughput computing techniques to exploit over a thousand computers during the peak design phases of their K6 and K7 microprocessors. These computers are located on the desktops of AMD engineers at a number of AMD sites and were used for design verification only when not in use by engineers.

The Condor system from the University of Wisconsin is used to manage pools of hundreds of workstations at universities and laboratories around the world. These resources have been used for studies as diverse as molecular simulations of liquid crystals, studies of ground penetrating radar, and the design of diesel engines.

More loosely organized efforts have harnessed tens of thousands of computers distributed worldwide to tackle hard cryptographic problems.

• On-Demand Computing

On-demand applications use grid capabilities to meet short-term requirements for resources that cannot be cost-effectively or conveniently located locally. These resources may be computation, software, data repositories, specialized sensors, and so on. In contrast to distributed supercomputing applications, these applications are often driven by cost-performance concerns rather than absolute performance. For example:

The NEOS and NetSolve network-enhanced numerical solver systems allow users to couple remote software and resources into desktop applications, dispatching to remote servers calculations that are computationally demanding or that require specialized software.

A computer-enhanced MRI machine and scanning tunneling microscope (STM) developed at the National Center for Supercomputing Applications use supercomputers to achieve real-time image processing. The result is a significant enhancement in the ability to understand what we are seeing and, in the case of the microscope, to steer the instrument.

A system developed at the Aerospace Corporation for processing of data from meteorological satellites uses dynamically acquired supercomputer resources to deliver the results of a cloud detection algorithm to remote meteorologists in quasi real time.

The challenging issues in on-demand applications derive primarily from the dynamic nature of resource requirements and the potentially large populations of users and resources. These issues include resource location, scheduling, code management, configuration, fault tolerance, security, and payment mechanisms.
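The high-throughput pattern described earlier, in which independent tasks are handed to workstations only while their owners are not using them, can be sketched in a few lines of Python. This is a simplified illustration, not how Condor or any other scheduler is implemented: the function name `schedule_on_idle` and the representation of a workstation as a name mapped to an idle flag are assumptions made for the example.

```python
from collections import deque


def schedule_on_idle(tasks, workstations):
    """Assign queued independent tasks to currently idle workstations.

    tasks:        iterable of task identifiers, in submission order.
    workstations: dict mapping workstation name -> True if idle
                  (hypothetical flag meaning the owner is not using it).

    Returns (assignments, remaining): a dict of workstation -> task,
    and the list of tasks still waiting for a free machine.
    """
    queue = deque(tasks)
    assignments = {}
    for name, idle in workstations.items():
        if not queue:
            break  # nothing left to schedule
        if idle:
            # Tasks are independent, so any idle machine can take the next one.
            assignments[name] = queue.popleft()
    return assignments, list(queue)
```

Because the tasks do not communicate with one another, the scheduler needs no co-scheduling or synchronization logic. It can be rerun whenever a workstation becomes idle or is reclaimed by its owner.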

• Data-Intensive Computing

In data-intensive applications, the focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases. This synthesis process is often computationally and communication intensive as well.

Future high-energy physics experiments will generate terabytes of data per day, or around a petabyte per year. The complex queries used to detect "interesting" events may need to access large fractions of this data. The scientific collaborators who will access this data are widely distributed, and hence the data systems in which data is placed are likely to be distributed as well.

The Digital Sky Survey will, ultimately, make many terabytes of astronomical photographic data available in numerous network-accessible databases. This facility enables new approaches to astronomical research based on distributed analysis, assuming that appropriate computational grid facilities exist.

Modern meteorological forecasting systems make extensive use of data assimilation to incorporate remote satellite observations. The complete process involves the movement and processing of many gigabytes of data.

Challenging issues in data-intensive applications are the scheduling and configuration of complex, high-volume data flows through multiple levels of hierarchy.

• Collaborative Computing

Collaborative applications are concerned primarily with enabling and enhancing human-to-human interactions. Such applications are often structured in terms of a virtual shared space. Many collaborative applications are concerned with enabling the shared use of computational resources such as data archives and simulations; in this case, they also have characteristics of the other application classes just described. For example:

The BoilerMaker system developed at Argonne National Laboratory allows multiple users to collaborate on the design of emission control systems in industrial incinerators. The different users interact with each other and with a simulation of the incinerator.

The CAVE5D system supports remote, collaborative exploration of large geophysical data sets and the models that generate them, for example, a coupled physical/biological model of the Chesapeake Bay.

The NICE system developed at the University of Illinois at Chicago allows children to participate in the creation and maintenance of realistic virtual worlds, for entertainment and education.

Challenging aspects of collaborative applications from a grid architecture perspective are the real-time requirements imposed by human perceptual capabilities and the rich variety of interactions that can take place.

We conclude this section with three general observations. First, we note that even in this brief survey we see a tremendous variety of already successful applications. This rich set has been developed despite the significant difficulties faced by programmers developing grid applications in the absence of a mature grid infrastructure. As grids evolve, we expect the range and sophistication of applications to increase dramatically. Second, we observe that almost all of the applications demonstrate a tremendous appetite for computational resources (CPU, memory, disk, etc.) that cannot be met in a timely fashion by expected growth in single-system performance. This emphasizes the importance of grid technologies as a means of sharing computation as well as a data access and communication medium. Third, we see that many of the applications are interactive, or depend on tight synchronization with computational components, and hence depend on the availability of a grid infrastructure able to provide robust performance guarantees.

CONCLUSIONS AND FUTURE TRENDS

There are currently a large number of projects and a diverse range of new and emerging Grid developmental approaches being pursued. These systems range from Grid frameworks to application testbeds, and from collaborative environments to batch submission mechanisms. It is difficult to predict the future in a field such as information technology where the technological advances are moving very rapidly. Hence, it is not an easy task to forecast what will become the 'dominant' Grid approach. Windows of opportunity for ideas and products seem to open and close in the 'blink of an eye'. However, some trends are evident. One of those is growing interest in the use of Java and Web services for network computing.

BIBLIOGRAPHY

1. Ian Foster. The Grid: A New Infrastructure for 21st Century Science. Physics Today.
2. Foster, I., Kesselman, C., and Tuecke, S. The Anatomy of the Grid: Enabling Scalable Virtual Organizations. International Journal of High Performance Computing Applications.
3. Mark Baker, Rajkumar Buyya. Grids and Grid technologies for wide-area distributed computing. SP&E.
4. Foster, I. and Kesselman, C., editors (1999). The Grid: Blueprint for a New Computing. Morgan Kaufmann, San Francisco, Calif.
5. http://www.globus.

CONTENTS

1. INTRODUCTION
• Benefits of Grid Computing
• Levels of Deployment
2. GRID CONSTRUCTION: GENERAL PRINCIPLES
• Design Features
3. GRID ARCHITECTURE
• Fabric: Interfaces to Local Control
• Connectivity: Communicating Easily and Securely
• Resource: Sharing Single Resources
• Collective: Coordinating Multiple Resources
• Applications
4. GRID APPLICATIONS
• Distributed Supercomputing
• High-Throughput Computing
• On-Demand Computing
• Data-Intensive Computing
• Collaborative Computing
5. CONCLUSIONS AND FUTURE TRENDS
6. BIBLIOGRAPHY

Reference: http://www.
