GRID COMPUTING
Outline
Introduction to Grid Computing
Methods of Grid computing
Grid Middleware
Grid Architecture
Grid Applications
Related topics on Grid
Grid Computing
Grid computing is a form of distributed computing whereby a "super and
virtual computer" is composed of a cluster of networked, loosely coupled
computers, acting in concert to perform very large tasks.
Grid computing (Foster and Kesselman, 1999) is a growing technology
that facilitates the execution of large-scale, resource-intensive
applications on geographically distributed computing resources.
Facilitates flexible, secure, coordinated large-scale resource sharing
among dynamic collections of individuals, institutions, and resources.
Enables communities (“virtual organizations”) to share geographically
distributed resources as they pursue common goals.
Ian Foster and Carl Kesselman
Introduction
Since its introduction, the concept of grid
computing has acquired great popularity, even
greater than the Web itself had at its beginning.
The concept has found its place not only within
numerous science projects (e.g., in medicine)
but also in various commercial applications.
What is Grid Computing?
Grid computing is a type of data management
and computing infrastructure designed primarily
to support scientific research but, as said in the
introduction, also used in various commercial
applications, business research, entertainment,
and by governments of different countries.
Criteria for a Grid:
Coordinates resources that are not subject to
centralized control.
Uses standard, open, general-purpose protocols and
interfaces.
Delivers nontrivial qualities of service.
Benefits
Exploit Underutilized resources
Resource load Balancing
Virtualize resources across an enterprise
Data Grids, Compute Grids
Enable collaboration for virtual organizations
Who can use grid computing
Governments and International
Organizations
The military
Teachers and educators
Businesses
Grid Applications
Data and computationally intensive applications:
This technology has been applied to computationally intensive scientific,
mathematical, and academic problems such as drug discovery, economic
forecasting, and seismic analysis, and to back-office data processing in
support of e-commerce.
A chemist may utilize hundreds of processors to screen thousands of
compounds per hour.
Teams of engineers worldwide pool resources to analyze terabytes of
structural data.
Meteorologists seek to visualize and analyze petabytes of climate data
with enormous computational demands.
Resource sharing
Computers, storage, sensors, networks, …
Sharing always conditional: issues of trust, policy, negotiation,
payment, …
Coordinated problem solving
distributed data analysis, computation, collaboration, …
Grid Computing Applications
One of the most tantalizing applications of radio
astronomy is the observation of radio signals as
part of the Search for Extraterrestrial Intelligence
(SETI).
The vast amount of computing capacity required
for SETI radio signal processing has led to a
unique grid computing concept that has now been
extended to many other applications.
Grid Topologies
• Intragrid
– Local grid within an organisation
– Trust based on personal contracts
• Extragrid
– Resources of a consortium of organisations
connected through a (Virtual) Private Network
– Trust based on Business to Business contracts
• Intergrid
– Global sharing of resources through the internet
– Trust based on certification
TYPES OF GRID
• Computational Grid
• Scavenging Grid
• Data Grid
Computational Grid
• A computational grid is focused on
setting aside resources specifically
for computing power.
• In this type of grid, most of the
machines are high-performance servers.
Computational Grid
“A computational grid is a hardware and software infrastructure
that provides dependable, consistent, pervasive, and
inexpensive access to high-end computational capabilities.”
“The Grid: Blueprint for a New Computing Infrastructure”,
Kesselman & Foster
Example : Science Grid (US Department of Energy)
Scavenging Grid
A scavenging grid is most commonly used with
large numbers of desktop machines.
Machines are scavenged for available CPU
cycles and other resources.
Owners of the desktop machines are usually
given control over when their resources are
available to participate in the grid.
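The scavenging policy above can be illustrated with a minimal sketch. The `Desktop` record, the `min_idle` threshold, and the machine names are hypothetical and chosen only for illustration. A machine is scavenged only when its owner has opted in and it currently has spare CPU cycles.

```python
from dataclasses import dataclass


@dataclass
class Desktop:
    name: str
    owner_allows_sharing: bool  # the owner's opt-in switch (assumed field)
    cpu_idle_fraction: float    # 0.0 = fully busy, 1.0 = fully idle


def scavengeable(machines, min_idle=0.8):
    """Return the names of machines whose spare cycles may be used now."""
    return [
        m.name
        for m in machines
        if m.owner_allows_sharing and m.cpu_idle_fraction >= min_idle
    ]


desktops = [
    Desktop("pc-a", True, 0.95),   # shared and idle: eligible
    Desktop("pc-b", False, 0.99),  # idle, but the owner has opted out
    Desktop("pc-c", True, 0.30),   # shared, but currently busy
]
print(scavengeable(desktops))
```

The owner's switch is checked before the idle test, so an owner who opts out is never scavenged, however idle the machine is.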
Data Grid
A data grid is responsible for housing and
providing access to data across multiple
organizations.
Users are not concerned with where this data
is located as long as they have access to the
data.
Data Grid
A data grid is a grid computing system that deals with data
— the controlled sharing and management of large
amounts of distributed data.
Data Grid is the storage component of a grid environment.
Scientific and engineering applications require access to
large amounts of data, and often this data is widely
distributed. A data grid provides seamless access to the
local or remote data required to complete compute intensive
calculations.
Example :
Biomedical Informatics Research Network (BIRN),
the Southern California Earthquake Center (SCEC).
Methods of Grid Computing
Distributed Supercomputing
High-Throughput Computing
On-Demand Computing
Data-Intensive Computing
Collaborative Computing
Logistical Networking
Distributed Supercomputing
Combining multiple high-capacity resources on
a computational grid into a single, virtual
distributed supercomputer.
Tackle problems that cannot be solved on a
single system.
High-Throughput Computing
Uses the grid to schedule large numbers of
loosely coupled or independent tasks, with the
goal of putting unused processor cycles to
work.
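As a rough single-machine analogy of high-throughput computing, the sketch below farms a batch of independent tasks out to a pool of workers. The `screen_compound` function and its scoring rule are hypothetical stand-ins for a real task. A real grid scheduler would dispatch to remote machines rather than local threads.

```python
from concurrent.futures import ThreadPoolExecutor


def screen_compound(compound_id):
    """Hypothetical independent task: score one compound."""
    score = (compound_id * 37) % 100  # stand-in for a real computation
    return compound_id, score > 90


def run_batch(compound_ids, workers=4):
    """Run all tasks on a worker pool and return the ids that passed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps input order; the tasks share no state,
        # so they can run in any order on any worker.
        results = list(pool.map(screen_compound, compound_ids))
    return [cid for cid, hit in results if hit]


hits = run_batch(range(20))
print(hits)
```

Because the tasks do not communicate, throughput scales with the number of workers available, which is why idle processors are a good fit for this style of computing.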
On-Demand Computing
Uses grid capabilities to meet short-term
requirements for resources that are not
locally accessible.
Models real-time computing demands.
Collaborative Computing
Concerned primarily with enabling and
enhancing human-to-human interactions.
Applications are often structured in terms of a
virtual shared space.
Data-Intensive Computing
The focus is on synthesizing new information
from data that is maintained in geographically
distributed repositories, digital libraries, and
databases.
Particularly useful for distributed data mining.
Logistical Networking
Logistical networks focus on exposing storage
resources inside networks by optimizing the global
scheduling of data transport and data storage.
Contrasts with traditional networking, which does
not explicitly model storage resources in the
network.
Provides high-level services for Grid applications.
Called "logistical" because of the analogy it bears
with systems of warehouses, depots, and
distribution channels.
P2P Computing vs Grid Computing
Differ in target communities.
Grid systems deal with a more complex,
more powerful, more diverse, and more highly
interconnected set of resources than
P2P systems.
VO
A typical view of Grid environment
User: A user sends a computation- or data-intensive application to Global Grids in order to speed up the execution of the application.
Resource Broker: A resource broker distributes the jobs in an application to the Grid resources based on the user’s QoS requirements and the details of available Grid resources for further execution.
Grid Information Service: The Grid Information Service system collects the details of the available Grid resources and passes the information to the resource broker.
Grid Resources: Grid resources (cluster, PC, supercomputer, database, instruments, etc.) in the Global Grid execute the user jobs.
[Figure: numbered arrows 1 to 4 link the User, Resource Broker, Grid Information Service, and Grid Resources; arrow labels: Grid application, Details of Grid resources, Computational jobs, Processed jobs, Computation result]
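The broker's matching step can be sketched as follows. The `Resource` and `Job` records, the deadline-based QoS rule, and the cheapest-feasible policy are all assumptions made for illustration and are not the algorithm of any particular broker. Queueing and contention between jobs are ignored.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    mips: int            # advertised speed, as reported by the information service
    cost_per_job: float  # price charged per job (assumed pricing model)


@dataclass
class Job:
    job_id: str
    length_mi: int     # job size in millions of instructions
    deadline_s: float  # the user's QoS requirement


def broker(jobs, resources):
    """Map each job to the cheapest resource able to meet its deadline."""
    plan = {}
    for job in jobs:
        feasible = [r for r in resources
                    if job.length_mi / r.mips <= job.deadline_s]
        if feasible:
            plan[job.job_id] = min(feasible, key=lambda r: r.cost_per_job).name
        else:
            plan[job.job_id] = None  # no resource satisfies the QoS requirement
    return plan


resources = [Resource("cluster", 4000, 5.0), Resource("pc", 500, 1.0)]
jobs = [Job("j1", 2000, 10.0), Job("j2", 20000, 10.0), Job("j3", 80000, 10.0)]
plan = broker(jobs, resources)
print(plan)
```

Here the small job goes to the cheap PC, the medium job needs the cluster to meet its deadline, and the largest job cannot be placed at all.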
Grid Middleware
Grids are typically managed by gridware,
a special type of middleware that enables sharing and manages grid
components based on user requirements and resource
attributes (e.g., capacity, performance).
Software that connects other software components or
applications to provide the following functions:
Run applications on suitable available resources
– Brokering, Scheduling
Provide uniform, high-level access to resources
– Semantic interfaces
– Web Services, Service Oriented Architectures
Address inter-domain issues of security, policy, etc.
– Federated Identities
Provide application-level status
monitoring and control
Middleware systems
Globus – University of Chicago
Condor – University of Wisconsin – high-throughput
computing
Legion – University of Virginia – virtual workspaces,
collaborative computing
IBP (Internet Backplane Protocol) – University of Tennessee –
logistical networking
NetSolve – solving scientific problems in
heterogeneous environments – high-throughput and
data-intensive computing
Two Key Grid Computing Groups
The Globus Alliance ([Link])
Composed of people from:
Argonne National Labs, University of Chicago, University of
Southern California Information Sciences Institute, University
of Edinburgh and others.
OGSA/I standards initially proposed by the Globus Group
The Global Grid Forum ([Link])
Heavy involvement of Academic Groups and Industry
(e.g. IBM Grid Computing, HP, United Devices, Oracle, UK e-
Science Programme, US DOE, US NSF, Indiana University,
and many others)
Process
Meets three times annually
Solicits involvement from industry, research groups, and
academics
Some of the Major Grid Projects
Name | URL/Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | [Link]; European Union | Create technology for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus Toolkit™
Fusion Collaboratory | [Link]; DOE Office of Science | Create a national computational collaboratory for fusion research
Globus Project™ | [Link]; DARPA, DOE, NSF, NASA, Microsoft | Research on Grid technologies; development and support of Globus Toolkit™; application and deployment
GridLab | [Link]; European Union | Grid technologies and applications
GridPP | [Link]; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | [Link]; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education
Grid Architecture
The Hourglass Model
Focus on architecture issues
Propose set of core services as basic infrastructure
Used to construct high-level, domain-specific solutions (diverse)
Design principles
Keep participation cost low
Enable local control
Support for adaptation
“IP hourglass” model
[Figure: hourglass diagram with Applications at the top, diverse global services below them, core services at the narrow neck, and Local OS at the base]
Grid architecture
Fabric layer: Provides the resources to which shared access is
mediated by Grid protocols.
Connectivity layer: Defines the core communication and
authentication protocols required for grid-specific network
functions.
Resource layer: Defines protocols, APIs, and SDKs for secure
negotiation, initiation, monitoring, control, accounting, and
payment of sharing operations on individual resources.
Collective Layer: Contains protocols and services that capture
interactions among a collection of resources.
Application Layer: These are user applications that operate
within VO environment.
Layered Grid Architecture
(By Analogy to Internet Architecture)
Grid layer | Role | Internet Protocol Architecture layer
Application | | Application
Collective | “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services | Application
Resource | “Sharing single resources”: negotiating access, controlling use | Application
Connectivity | “Talking to things”: communication (Internet protocols) & security | Transport, Internet
Fabric | “Controlling things locally”: access to, & control of, resources | Link
Example:
Data Grid Architecture
App: Discipline-specific Data Grid application
Collective (App): Coherency control, replica selection, task management,
virtual data catalog, virtual data code catalog, …
Collective (Generic): Replica catalog, replica management, co-allocation,
certificate authorities, metadata catalogs, …
Resource: Access to data, access to computers, access to network
performance data, …
Connect: Communication, service discovery (DNS), authentication,
authorization, delegation
Fabric: Storage systems, clusters, networks, network caches, …
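Two of the collective-layer services named above, the replica catalog and replica selection, can be sketched in a few lines. The logical file name, the site names, and the latency figures are invented for illustration. A real selector would use measured network performance data from the resource layer.

```python
# Replica catalog: logical file name -> sites holding a physical copy
# (all names and numbers below are hypothetical)
replica_catalog = {
    "lfn:climate-1999": ["site-a", "site-b", "site-c"],
}

# Assumed network performance data, in milliseconds
latency_ms = {"site-a": 120, "site-b": 35, "site-c": 80}


def select_replica(lfn, catalog, latency):
    """Pick the replica site with the lowest known latency."""
    sites = [s for s in catalog.get(lfn, []) if s in latency]
    if not sites:
        raise LookupError(f"no reachable replica for {lfn}")
    return min(sites, key=latency.get)


print(select_replica("lfn:climate-1999", replica_catalog, latency_ms))
```

The application asks only for the logical name. Which physical copy is read is decided by the selection service.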
Advantages
Increased user productivity: By providing
transparent access to resources, work can be
completed more quickly.
Scalability: Grids can grow seamlessly over time,
allowing many thousands of processors to be
integrated into one cluster.
Flexibility: Grid computing provides computing
power where it is needed most, helping
to better meet dynamically changing workloads.
Disadvantages
1) Memory-hungry applications that cannot take advantage
of MPI may have to run on a large SMP machine.
2) A fast interconnect between compute resources may be
needed (gigabit Ethernet at a minimum; InfiniBand for
MPI-intensive applications).
3) Some applications may need to be tweaked to take full
advantage of the new model.
Disadvantages…
4) Licensing across many servers may make grid
deployment cost-prohibitive for some applications.
Vendors are starting to be more flexible with
environments like this.
Areas already taking good advantage of
grid computing include bioinformatics,
cheminformatics, oil & drilling, and financial
applications.
Grid Computing Software
Interface
•Brief introduction to Globus
•Executing a simple job on command line
•Executing program through a Grid portal
Slides for Grid Computing: Techniques and Applications by Barry Wilkinson, Chapman & Hall/CRC press, © 2009.
Chapter 1, pp 19-28. For educational use only. All rights reserved. Aug 24, 2009
Grid computing infrastructure
(middleware) software
Primary objective:
To provide a seamless environment in which users
can access distributed resources.
Grid computing infrastructure software
Key aspects include:
Secure envelope over all transactions
Single sign-on - being able to access all available
resources and run jobs without having to supply
additional passwords or account information.
Data management tools
Information services providing characteristics of
resources and their status (including dynamic load)
APIs and services that enable applications
themselves to take advantage of Grid platform
Convenient user interface
Globus Project
Open source software toolkit developed for
Grid computing.
One of the most influential projects
Roots in the I-WAY experiment.
Work started in 1996.
Four versions developed to present time.
Reference implementations of Grid computing
standards.
De facto standard for Grid computing.
Globus
A “toolkit” of services and packages for
creating the basic grid computing
infrastructure
Higher level tools added to this infrastructure
Version 4 is web-services based
Some non-web services code exists from
earlier versions (legacy) or where not
appropriate (for efficiency, etc.).
Some Globus toolkit versions
(approximate time line)
Fig. 1.5
Globus Toolkit
Five major parts:
Common run time
- Libraries and services
Security
- Components to provide secure access
Execution management
- Executing, monitoring and management of jobs
Data Management
- Discovery, access, and transfer of data
Information
- Discovery and monitoring of resources and services
Some basic Globus components
GSI Grid Security Infrastructure
– Provides a security envelope around Grid resources
– Uses public key cryptography
GRAM (Globus/Grid Resource Allocation
Management)
– Globus’ basic execution management component
– Used to issue and manage jobs
MDS (Monitoring and Discovery Service)
– To discover resources and their status
GridFTP
– For transferring files between resources
Security
Has to cross administrative domains.
Need agreed mechanisms and standards.
Focus on Internet security mechanisms,
modified to handle the special needs of
Grid computing.
Security
Distributed resources must be protected from unauthorized
access.
GSI (Grid Security Infrastructure) -- Globus components for
creating a security envelope.
Requires each user to be authenticated (their identity proved).
Uses public key cryptography (basis of Internet security)
Each user must possess a so-called (digital) certificate, signed
by a trusted certificate authority.
Users will also need to be able to give their authority to Grid
components to act on their behalf.
Users generally will also need accounts on the resources they
intend to use (authorization).
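The two checks described on this slide, authentication followed by authorization, can be illustrated with a toy model. This is not GSI. The HMAC used here is only a standard-library stand-in for a certificate authority's public-key signature, and `CA_KEY`, the subject names, and the `local_accounts` mapping are all hypothetical.

```python
import hashlib
import hmac

CA_KEY = b"demo-ca-secret"  # stand-in for the CA's signing key (assumption)


def ca_sign(subject):
    """Toy 'signature' by the trusted certificate authority."""
    return hmac.new(CA_KEY, subject.encode(), hashlib.sha256).hexdigest()


def authenticate(cert):
    """Step 1: is the certificate signed by the trusted CA?"""
    return hmac.compare_digest(cert["signature"], ca_sign(cert["subject"]))


# Step 2 data: certificate subject -> local account on this resource
local_accounts = {"/O=Grid/CN=Alice": "alice"}


def authorize(cert):
    """Return the local account to run as, or None if access is refused."""
    if not authenticate(cert):
        return None
    return local_accounts.get(cert["subject"])


alice = {"subject": "/O=Grid/CN=Alice", "signature": ca_sign("/O=Grid/CN=Alice")}
bob = {"subject": "/O=Grid/CN=Bob", "signature": ca_sign("/O=Grid/CN=Bob")}
forged = {"subject": "/O=Grid/CN=Alice", "signature": "0" * 64}

print(authorize(alice), authorize(bob), authorize(forged))
```

Bob is authenticated but holds no account on this resource, so he is refused. This shows why a valid certificate alone is not enough to gain access.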
Resource Discovery
Still primitive and an area of research, but the
ideal is to be able to submit a job and have
the system find the best grid resources for
that job across the whole grid.
Resource Discovery
Basic Globus component called MDS (Monitoring
and Discovery System).
Users might access MDS to discover status of
compute resources. In practice, users often
know what resources are there but not dynamic
load.
MDS might be used by other Grid components
such as schedulers.
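The role of an information service can be sketched with a small in-memory registry. This is not the MDS interface. The `InfoService` class, its method names, and the record fields are hypothetical. Resources publish static attributes and their current load, and a user or scheduler queries for fresh entries ordered by load.

```python
import time


class InfoService:
    """Toy registry of resource characteristics and dynamic status."""

    def __init__(self):
        self._records = {}

    def publish(self, name, cpus, load, now=None):
        """Called by a resource to report its size and current load."""
        stamp = time.time() if now is None else now
        self._records[name] = {"cpus": cpus, "load": load, "updated": stamp}

    def query(self, min_cpus=1, max_age_s=60.0, now=None):
        """Return names of fresh, large-enough resources, least loaded first."""
        now = time.time() if now is None else now
        fresh = [
            (name, rec) for name, rec in self._records.items()
            if rec["cpus"] >= min_cpus and now - rec["updated"] <= max_age_s
        ]
        return [name for name, rec in sorted(fresh, key=lambda item: item[1]["load"])]


mds = InfoService()
mds.publish("a", cpus=8, load=0.9, now=100)
mds.publish("b", cpus=16, load=0.2, now=100)
mds.publish("c", cpus=32, load=0.1, now=0)  # stale report
print(mds.query(min_cpus=8, now=120))
```

Stale reports are dropped because dynamic load, the part users typically do not know, is only useful while it is current.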
Executing a Job
Next user typically would want to submit a job.
Basic Globus component for running a job is
GRAM (Globus or Grid Resource Allocation
Management).
Command-line interface
Grid computing environments mostly Linux-based and originally
accessed through a command line.
Once you have established your security credentials, to run a
simple job you might issue GRAM command:
globusrun-ws -submit -c prog1
where prog1 is executable of job.
Executable needs to be present on compute resource that is to
execute it.
Above command does not specify compute resource and hence
computer executing globusrun-ws command will execute prog1.
Executing a Job (continued)
May be necessary beforehand to transfer files
to resources and afterwards to transfer files
to other locations including back to user.
User might use data management component
called GridFTP for that.
GridFTP command to transfer files
globus-url-copy \
gsi[Link]
prog1out \
[Link]
First argument is source location and second argument is
destination location.
In the above case, the file
[Link]/~abw/prog1out
is transferred to
home/abw/
on the local computer.
User employing Globus services and facilities
Fig. 1.6
Grid Portal
The command-line interface is a very primitive way of interacting
with Grid resources. A web-based interface, called a Grid portal, is
more desirable. The UNC–Charlotte course Grid portal is based upon
the GridSphere Grid portal toolkit.
Fig. 1.7
Before users can log on, they need a user name and
password for portal.
They must have user “credentials” and accounts on
the resources they wish to access.
In UNC–Charlotte course portal, PURSe (Portal-
based User Registration Service) portlet used to
facilitate setup procedures.
Reached by selecting “Register” tab.
User enters required information (name, email
address, institution, etc.) which is forwarded to Grid
system administrator to set up accounts and
credentials.
PURSe registration portlet
Fig. 1.8
Registration activities
Fig. 1.9
Once logged into Grid portal, user will see a number
of tabs across top, which enable user to perform
many basic tasks.
Grid information tab
Proxies
To use many services, you are required to have a
proxy certificate (a proxy).
Proxies are part of Grid security infrastructure,
discussed later in course.
Proxy is an electronic document that enables
resources to be accessed on user’s behalf.
It is very convenient to use a credential management
service called MyProxy to hold proxies.
Usually, GridSphere automatically obtains a proxy
from the MyProxy server for you when you log in.
Proxy management tab
File management tab
Batch job submission tab
CONCLUSION
Grid computing introduces a new concept to IT
infrastructures because it supports distributed
computing over a network of heterogeneous
resources and is enabled by open standards.