
Resource Discovery and Workflow
Security in Grid Computing
Lecture 1
Introduction
• A Grid system is a seamless, integrated computational and collaborative
environment; the points below give a high-level view of the activities within it.
• Grids are composed of large-scale, geographically distributed platforms
working together.
• The resource requesters (users or systems) interact with the Grid resource
broker to solve problems; the broker in turn performs resource discovery,
scheduling, and processing of application jobs on the distributed Grid
resources.
• A core part of the grid system is the Resource Broker.
• The Resource Broker plays a significant role in discovering the resources before
other grid resource management schemes are applied.
• It performs various processes such as discovering, scheduling, allocating and
evaluating.
Grid resource broker

• Traditionally, a user has two basic options when submitting a job to a
resource.
• The most basic one is to use command-line tools and connect manually
to each particular resource, first to test its availability and then to
submit the job; the other is to use an HLL (High Level Language) to make the
job submission.
• The resource broker removes the direct interactions the user makes with
resources in the Grid by providing easy, user-friendly mechanisms for job
submission.
• The Broker therefore takes charge of the communication between the user
(resource requester) and the resources residing in the grid.
Grid resource broker
• The Grid Resource Broker (GRB) is a grid portal that allows trusted
users to create and handle computational/data grids on the fly
exploiting a simple and friendly web-based GUI.
• GRB provides automatic discovery of resources matching the user’s
criteria, selection and scheduling on behalf of the user.

[Figure: User Browser/GUI, Resource Broker, GRIS]

• Some typical subcomponents of a resource broker are: resource finder,
resource filter, resource scheduler, resource matcher, etc.
Resource Discovery
• In Grid computing, resource discovery is the process of finding the best candidate
among the resources that meet the time, cost and efficiency requirements at an
optimum level.
• It involves locating resources that satisfy a set of requirements given in a
query.
• Resource discovery is the first and most important process.
• Due to the heterogeneity of resources (storage devices, operating systems,
communication devices, bandwidth, simulation, software, etc.), it is always a
challenge to discover the right resource at the right time at the optimum cost.
• Another challenge is that resources are spread across organizations, each with
its own resource management policies and different access and cost models.
• To handle the different resource tasks such as allocation, assignment,
authentication, authorization, etc. efficiently, good resource management
planning is needed.
Resource Discovery
• Resource discovery is conducted by the Grid resource broker (see slide 4).
• The goal of resource discovery is to identify a list of authenticated
resources that are available for job submission.
• In order to cope with the dynamic nature of the Grid, a scheduler needs to
have some way of incorporating dynamic state information about the
available resources into its decision-making process.
• A scheduler should always know what resources it can access, how busy
they are, how long it takes to communicate with them and how long it
takes for them to communicate with each other. From this it calculates the
Total Time to Deliver (TTD) and identifies the Grid node with the shortest TTD.
• With this information, the scheduler optimizes the scheduling of jobs to
make more efficient and effective use of the available resources.
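The slides do not define how TTD is computed, so the following is only a minimal sketch under an assumed model: TTD = communication time + queue wait + execution time. The `NodeState` fields, the node names and all numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class NodeState:
    name: str
    queue_wait_s: float     # how busy the node is (expected wait before the job starts)
    bandwidth_mb_s: float   # link speed between the scheduler and the node
    latency_s: float        # communication delay to the node
    exec_time_s: float      # estimated run time of the job on this node

def total_time_to_deliver(node: NodeState, input_mb: float) -> float:
    """Assumed TTD model: communication + queueing + execution."""
    transfer_s = node.latency_s + input_mb / node.bandwidth_mb_s
    return transfer_s + node.queue_wait_s + node.exec_time_s

def pick_node(nodes, input_mb):
    """Return the node with the shortest TTD for a job with `input_mb` of input."""
    return min(nodes, key=lambda n: total_time_to_deliver(n, input_mb))

nodes = [
    NodeState("node-a", queue_wait_s=120.0, bandwidth_mb_s=100.0, latency_s=0.02, exec_time_s=300.0),
    NodeState("node-b", queue_wait_s=10.0, bandwidth_mb_s=10.0, latency_s=0.10, exec_time_s=350.0),
]
best = pick_node(nodes, input_mb=500.0)
print(best.name)
```

Here the slower but idle node wins because the queue wait on the faster node dominates, which is the kind of trade-off dynamic state information lets a scheduler see.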
Resource Discovery Process
[Figure: two-step query example.
1. "Tell me about the computing resources belonging to the HPC Lab that are
uniprocessor Linux workstations, with low CPU load and available
memory < 250 Mbyte." GRIS returns: LFC.
2. "Tell me about the features of 'LFC'."]
Resource Discovery Process

• Generally, a node submits a job and the grid finds appropriate
resources, in principle anywhere, to complete it.
• A user at a node provides a search query containing a set of keywords
corresponding to the resource being searched for (ontological description
of the resources).
• A resource discovery protocol is then used to forward the search query to
other nodes and search for the resource on those nodes.
• When a resource is located at a node, a success message is sent to the
node that originated the search query.
Resource Discovery Process
Specific steps :
• A node, searching for resources, submits a request containing an
ontological description of a target resource.
• To distribute the request over the network,
communication and routing protocols need to be employed.
• Each node receiving the request compares the incoming request
against its ontology, by applying some criteria for matching.
• The matching evaluation depends on the expressiveness of the ontological
description and on the desired level of accuracy.
• Replying to the requesting node, each node returns a list of candidate resource
descriptions, together with their respective matching values.
• The initiating node selects the relevant resource descriptions based on the
received candidate descriptions and their matching values.
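The steps above can be sketched as follows. The slides leave the matching criterion open, so this sketch assumes the simplest possible one: descriptions are keyword sets and the matching value is their Jaccard similarity. The function names, threshold and resource names are all hypothetical.

```python
def matching_value(request: set, description: set) -> float:
    """Jaccard similarity: shared keywords / all keywords (assumed criterion)."""
    if not request and not description:
        return 0.0
    return len(request & description) / len(request | description)

def node_reply(request, local_resources, threshold=0.5):
    """What one node returns: candidate descriptions with their matching values."""
    reply = []
    for name, description in local_resources.items():
        value = matching_value(request, description)
        if value >= threshold:
            reply.append((name, value))
    return reply

def select_resources(replies, top_k=1):
    """Run by the initiating node over the replies from all nodes."""
    candidates = [c for reply in replies for c in reply]
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]

request = {"linux", "uniprocessor", "low-load"}
node1 = {"ws-01": {"linux", "uniprocessor", "low-load"},
         "ws-02": {"linux", "multiprocessor"}}
node2 = {"ws-07": {"linux", "uniprocessor", "high-load"}}

replies = [node_reply(request, node1), node_reply(request, node2)]
selected = select_resources(replies)
print(selected)
```

A richer ontology would replace the keyword sets and the similarity function, but the request, reply and selection roles stay the same.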
Resource Discovery Approaches
Centralized Approach
• The resource discovery approaches in grid environments fall into one of the
following categories: centralized, hierarchical and P2P.
• In the centralized approach, one or a limited number of central servers index the
resource information of the entire grid.
• The resource information about all other nodes and instances is located at a
central point, that can be reached by all service or resource requester
instances in the environment.
• All the resource providers periodically update and register their dynamic or
static information in the central repository.
• The central information service is the only entity in the system which can
process the queries, initiated by resource requesters, for matching to the
resources available.
Centralized Approach
• Each grid node includes some resources to share.
• The node publishes its resource information toward the central
server.
• Also, once the resource information of the grid node is changed,
for instance when its resource is allocated to a requesting
application, the new information of the resource will be sent by
the node directly to the central server.
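A minimal in-memory sketch of this publish/update/query cycle is shown below. The `CentralIndex` class, its method names and the attribute keys are assumptions made for illustration, not part of any grid middleware.

```python
class CentralIndex:
    """Toy central information service: the only place queries are answered."""

    def __init__(self):
        self._resources = {}   # node name -> latest published attributes

    def publish(self, node, attributes):
        """Register or update a node's resource information."""
        self._resources[node] = dict(attributes)

    def query(self, **requirements):
        """Return the nodes whose attributes equal every requirement."""
        return sorted(
            node for node, attrs in self._resources.items()
            if all(attrs.get(k) == v for k, v in requirements.items())
        )

index = CentralIndex()
index.publish("node-1", {"os": "linux", "free": True})
index.publish("node-2", {"os": "linux", "free": True})
before = index.query(os="linux", free=True)

# node-1 is allocated to an application and sends its new state to the server
index.publish("node-1", {"os": "linux", "free": False})
after = index.query(os="linux", free=True)
print(before, after)
```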
Hierarchical Approach
• In the hierarchical (tree) approach, the resource information is partially
distributed to multiple locations in a hierarchical fashion.
• In the hierarchical tree, grid nodes publish their resource information to
their parent nodes located at the upper level.
• Also, each grid node handles the queries related to its descendants.
• In some cases, grid nodes do not maintain the metadata of all their
descendants because of security considerations or memory-space
limitations.
• The hierarchical approach is prevalent in many current resource
management systems such as Globus Toolkit, gLite, etc.
Hierarchical Approach
• As indicated in the figure, index servers at level 1 of the hierarchy are
directly connected to the resource owner nodes.
• They gather the resource information and forward it to their parent to be
indexed.
• Thus, the upper-level indexing nodes are busier and contain the
information about more resources in the network.
• They have a more important role in the resource discovery system and
incur higher maintenance costs, especially in dynamic conditions.
Hierarchical Approach
• Security should be applied on upper-level indexing nodes to protect
them from being bombarded with messages sent by descendant nodes.
• One strategy used is that each node knows only its direct
parent and that parent's address.
• This approach has limited scalability, as requests trickle up through the
root server, which can easily become a bottleneck, and the system consequently
suffers from poor fault tolerance.
• Also, the loss of a node in the higher level of the architecture causes
the loss of an entire subtree.
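The hierarchical scheme can be sketched as a tree of index nodes in which each publication is forwarded upward. This is an illustrative model only; `IndexNode` and the resource names are hypothetical, and every node here indexes its whole subtree, which the slides note is not always the case.

```python
class IndexNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.index = {}   # resource name -> owner, for everything in this subtree

    def publish(self, resource, owner):
        """Record the resource here and forward it to the parent to be indexed."""
        self.index[resource] = owner
        if self.parent is not None:
            self.parent.publish(resource, owner)

    def lookup(self, resource):
        """Answer from the local index, otherwise pass the query up the tree."""
        if resource in self.index:
            return self.index[resource]
        if self.parent is not None:
            return self.parent.lookup(resource)
        return None

root = IndexNode("root")
left = IndexNode("level1-left", parent=root)
right = IndexNode("level1-right", parent=root)

left.publish("cpu-cluster", owner="owner-A")
right.publish("storage-array", owner="owner-B")

print(left.lookup("storage-array"))        # resolved via the root
print(len(left.index), len(root.index))    # the root indexes more than its children
```

Any query that crosses subtrees has to pass through `root`, which is why the root becomes both the bottleneck and the weak point for fault tolerance.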
P2P Approach
• In a P2P-based approach, the structure dictates a distributed and cooperative
network design without central supervision, and the information about
resources is not kept on a central server.
• Each node has the role of both client and server, acts autonomously
and is self-organized.
• Grid applications send their resource query message to their neighbours
(Peers).
• The neighbours subsequently forward the received queries to appropriate
neighbours in order to reach the required resource location.
P2P Approach
• The concept of P2P introduces many significant advantages in different
aspects of resource discovery, including:
• Scalability (due to collaborative resource sharing between peers),
• Reliability (e.g., fault tolerance) (due to the essential equality of
peers),
• Robustness (due to self-organization against peer or system
failures).
Resource Discovery and Workflow
Security in Grid Computing
Lecture 2
P2P Approach: Security
• The P2P approach envisages that every Virtual Organization (VO) maintains its
own Grid information services and makes them available as part of a P2P-based
network.
• Upon receiving a request from a grid node, the resource brokers in the various
VOs initiate a resource look-up query message, which is forwarded in the
P2P network using the flooding technique.
• Flooding is a simple routing technique in computer networks in which a source
or node sends packets through every outgoing link.
• Flooding, which is similar to broadcasting, occurs when source packets (without
routing data) are transmitted to all attached network nodes.
P2P Approach: Security

• However, this approach generates a large volume of network messages
and may overwhelm the peers.
• To avoid this, a Time to Live (TTL) field is associated with every
message, i.e. the peers stop forwarding a query message once the
TTL expires.
• To an extent, this approach can limit the network message traffic,
and could help mitigate DDoS attacks.
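A small simulation makes the effect of the TTL concrete. This is a sketch under simplifying assumptions: the overlay is a static adjacency list, each hop decrements the TTL by one, and a peer drops a query it has already seen. The function and peer names are hypothetical.

```python
from collections import deque

def flood_query(neighbours, resources, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    found = []
    messages = 0
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if wanted in resources.get(peer, ()):
            found.append(peer)
        if remaining == 0:
            continue                 # TTL expired: stop forwarding
        for nxt in neighbours[peer]:
            messages += 1            # one message per outgoing link
            if nxt not in seen:      # duplicates are dropped on arrival
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return found, messages

# A chain of four peers: a - b - c - d, with the resource held by d.
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
resources = {"d": {"gpu"}}

print(flood_query(neighbours, resources, "a", "gpu", ttl=2))
print(flood_query(neighbours, resources, "a", "gpu", ttl=3))
```

With a TTL of 2 the query dies one hop short of the resource; raising it to 3 finds the resource at the cost of more messages, which is the trade-off the TTL field controls.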
Security In Grid

• Further, the nature of a distributed system raises serious challenges in
the domains of security and trust management.
• A Grid system requires solutions that can:
• Preserve the privacy of participants
• Ensure the authenticity of participants
• Provide robust authorization
• Securely route messages between distributed services, and minimise
loss to the system due to malicious participants.
Security In Grid

• The privacy of the participants can be ensured through secret key-based
symmetric cryptographic algorithms such as 3DES.
• These secret keys must be securely generated and distributed in the system;
key management mechanisms such as public key algorithms (including DH and RSA)
or Kerberos (a trusted third party) can be used.
• The communication channel is secured using TLS.
• Authentication of the participants can be achieved through trust enforcement
mechanisms such as X.509 certificates (Public Key Infrastructure) , Kerberos
(third party authentication) and SSH.
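As an illustration of how a public key algorithm can distribute a secret key, the sketch below runs a Diffie-Hellman (DH) exchange with toy parameters. The modulus, the base and the helper names are assumptions chosen for readability; they are far too weak for real use, where standardized groups and a vetted cryptographic library would be used instead.

```python
import hashlib
import secrets

P = 2**127 - 1   # a known Mersenne prime, used here only as a toy modulus
G = 3            # toy base

def keypair():
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2   # value in [2, P-2]
    public = pow(G, private, P)
    return private, public

def shared_key(my_private, their_public):
    """Derive a 32-byte symmetric key from the DH shared secret."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()

alice_private, alice_public = keypair()
bob_private, bob_public = keypair()

# Only the public values cross the network; both sides derive the same key.
alice_key = shared_key(alice_private, bob_public)
bob_key = shared_key(bob_private, alice_public)
print(alice_key == bob_key)
```

The derived key could then be used with a symmetric cipher to protect the participants' traffic. Plain DH does not authenticate the peers, which is why it is combined with mechanisms such as certificates.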
Security In Grid: The X.509 Certificate for Authentication
• X.509 certificates involve the use of a CA, which issues certificates to Grid
users (Grid nodes).
• Each grid user (grid node) should trust the CA (which issued the
certificate).
• X.509 Certificate contains information about the CA as well as the grid
node.
• Key information in an X.509 certificate: subject’s name, public key, name
of the CA, encryption algorithms, etc.
• Certificates are also signed by the CA.
Security In Grid: The X.509 Certificate

• Authentication based on X.509 certificates requires a trusted Certifying
Authority (CA) in the system.
• A Grid participant presents an X.509 certificate and proves possession of the
associated private key in order to authenticate itself with a remote service.
• A system can have a single CA, which is trusted by all the participants.
• However, the single-CA approach has limited scalability.
• An alternative is to have multiple CAs combine to form a
trust chain.
• In this case, a certificate signed by any CA in the system has global validity.
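The idea of a trust chain can be sketched by following issuer links until a trusted CA is reached. This toy model only walks names; it deliberately omits signature verification, validity periods and revocation, all of which a real X.509 validator must check. The `Cert` type and all names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str

def chain_to_trusted_root(cert, certs_by_subject, trusted_roots, max_depth=10):
    """Return the subjects from `cert` up to a trusted CA, or None if none is reached."""
    path = [cert.subject]
    current = cert
    for _ in range(max_depth):
        if current.issuer in trusted_roots:
            return path + [current.issuer]
        parent = certs_by_subject.get(current.issuer)
        if parent is None or parent.subject in path:
            return None   # unknown issuer, or a loop in the chain
        path.append(parent.subject)
        current = parent
    return None

trusted_roots = {"Root CA"}
certs_by_subject = {"Site CA": Cert("Site CA", issuer="Root CA")}

node_cert = Cert("node-17", issuer="Site CA")
rogue_cert = Cert("node-x", issuer="Rogue CA")

print(chain_to_trusted_root(node_cert, certs_by_subject, trusted_roots))
print(chain_to_trusted_root(rogue_cert, certs_by_subject, trusted_roots))
```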
Security In Grid: Single Sign on (SSO)

• SSO in UNICORE is implemented through MyProxy.
• MyProxy is open source software widely used in grid portals for
managing X.509 public key infrastructure (PKI) security credentials
(certificates and private keys) to help realize SSO.
• It provides a solution for delegating credentials to a grid portal, allowing
the portal to authenticate to grid services on the user’s behalf.
• The working mechanism of MyProxy with a grid portal can be described
as follows.
• First, users need to store their grid credentials on a MyProxy server
that the portal can use.
Security In Grid : Secure Shell (SSH)
• SSH facilitates secure communication between systems.
• It replaces telnet, remote shell (rsh) and remote login (rlogin).
• The SSH-based authentication scheme is comparatively easier to
implement, as it does not require trusted third-party certification
as X.509 does.
• However, it does not allow the creation of a dynamic trust chain
(similar to CAS); refer to slide 33.
• If a participant’s private key is compromised, every holder of the
corresponding public key must be informed about this event.
Security In Grid : Kerberos

• The Kerberos protocol uses strong cryptography so that a client
can prove its identity to a server (and vice versa) across an insecure
network connection.
• After a client and server have used Kerberos to prove their identities,
they can also encrypt all of their communications to assure privacy
and data integrity as they go about their business.
Security In Grid : Kerberos
How it works
• The client requests an authentication ticket (TGT) from the Key
Distribution Center (KDC).
• The KDC verifies the credentials and sends back an encrypted TGT and
session key.
• The TGT is encrypted using the Ticket Granting Service (TGS) secret
key.
• The client stores the TGT, and when it expires the local session
manager will request another TGT (this process is transparent to the
user).
Security In Grid : Kerberos
(SPN = unique identifier of a service instance.)
If the client is requesting access to a service or other resource on the
network, this is the process:
• The client sends the current TGT to the TGS with the Service Principal
Name (SPN) of the resource the client wants to access.
• The KDC verifies the TGT of the user and that the user has access to
the service.
• The TGS sends a valid session key for the service to the client.
• The client forwards the session key to the service to prove the user has
access, and the service grants access.
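The ticket flow above can be sketched in a heavily simplified form. Real Kerberos encrypts tickets and also exchanges session keys and authenticators; this sketch replaces all of that with an HMAC seal, which only shows that a ticket issued under a secret key cannot be forged or altered by the client. The keys, the SPN and every function name are hypothetical.

```python
import hashlib
import hmac
import json

def seal(key: bytes, payload: dict) -> dict:
    """Attach an HMAC tag so only a holder of `key` can create or alter the ticket."""
    body = json.dumps(payload, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}

def unseal(key: bytes, ticket: dict):
    """Return the payload if the tag verifies under `key`, otherwise None."""
    expected = hmac.new(key, ticket["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ticket["tag"]):
        return None
    return json.loads(ticket["body"])

TGS_KEY = b"tgs-secret-key"   # known only to the KDC/TGS
SERVICE_KEYS = {"http/portal.example.org": b"portal-secret-key"}

def issue_tgt(user, now, lifetime=3600):
    """KDC step: hand the client a TGT sealed under the TGS secret key."""
    return seal(TGS_KEY, {"user": user, "expires": now + lifetime})

def issue_service_ticket(tgt, spn, now):
    """TGS step: verify the TGT, then seal a ticket under the service's key."""
    claims = unseal(TGS_KEY, tgt)
    if claims is None or claims["expires"] <= now or spn not in SERVICE_KEYS:
        return None
    return seal(SERVICE_KEYS[spn], {"user": claims["user"], "spn": spn})

def service_accepts(ticket, spn):
    """Service step: grant access only if the ticket verifies under its own key."""
    claims = unseal(SERVICE_KEYS[spn], ticket)
    return claims is not None and claims["spn"] == spn

spn = "http/portal.example.org"
tgt = issue_tgt("alice", now=0)
ticket = issue_service_ticket(tgt, spn, now=10)
print(service_accepts(ticket, spn))
```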
Security In Grid : ACL
• The resource sharing policy takes into account the participant’s identity
and membership of groups or virtual organisations.
• A Globus-based grid installation defines the access control list using a Grid
map file.
• This file simply maintains a list of the distinguished names of the Grid
users and the equivalent local user account names that they are to be
mapped to. (It can also be used for peer identification purposes.)
• Access control to a resource is then left up to the local operating system
and application access control mechanisms.
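A minimal parser for such a mapping file is sketched below, assuming the common layout of one quoted distinguished name followed by a local account name (or a comma-separated list of them) per line. The sample entries and function names are hypothetical.

```python
import shlex

def parse_gridmap(text):
    """Map each distinguished name to its list of local account names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)   # handles the quoted DN containing spaces
        if len(parts) != 2:
            continue                # skip malformed lines
        dn, accounts = parts
        mapping[dn] = accounts.split(",")
    return mapping

def local_account(mapping, dn):
    """Return the first mapped local account, or None if the DN is not listed."""
    accounts = mapping.get(dn)
    return accounts[0] if accounts else None

SAMPLE = '''
# hypothetical entries
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=John Smith" jsmith,griduser
'''

gridmap = parse_gridmap(SAMPLE)
print(local_account(gridmap, "/O=Grid/OU=Example/CN=Jane Doe"))
print(local_account(gridmap, "/O=Grid/OU=Example/CN=Unknown"))
```

A user whose DN is not listed gets no local account, so the job is refused before the local operating system's own access control is ever consulted.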
Security In Grid : The Communication Channel
• Transport layer security protocols such as TLS and SSL are
used for message encryption and integrity checking as messages are
transported from one host to another on the Internet.
• Authorization deals with the verification of an action that a
participant is allowed to undertake after a successful
authentication.
• In a grid, site owners have the privilege to control how their
resources are shared among the participants.
Security in Grid: Grid Security Infrastructure (GSI)

• GSI provides secure communications for all the higher-level grid services
• Secure Authentication and Authorization
• The job would not be permitted to execute if the user has no access to
that resource.

• Authentication ensures you are whom you claim to be (ID card, fingerprint,
passport, username/password)
• Authorization controls what you are permitted to do (Run a job, read or write
a file)

Grid Security Infrastructure - GSI
• User authenticates once – then can perform many tasks
• The primary motivations behind the GSI are:
• The need for secure communication (authenticated and perhaps
confidential) between elements of a computational Grid.
• The need to support security across organizational boundaries, thus
prohibiting a centrally-managed security system.
• The need to support "single sign-on" for users of the Grid, including
delegation of credentials for computations that involve multiple
resources and/or sites.
Grid Security Infrastructure - GSI
• The Grid Security Infrastructure (GSI) implementation of PKI supports
dynamic trust chain creation through the Community Authorization Service
(CAS) (for details see Paper 1).
• This is based on the policy that two peers bearing proxy certificates signed
by the same user will inherently trust each other.
• A Kerberos-based implementation has significant shortcomings, as it requires
synchronous communication with the ticket granting server in order to set up
communication between a client and a server.
• If the ticket granting server goes offline or has a security breach, then there is
no way the system can operate.
• In the case of an X.509-based implementation, a CA can certify the credentials
offline.
• Having said that, the majority of implementations do rely on centralised trust
enforcement entities such as a CA or a ticket granting authority.
Other Grid Security Approaches
• Reliability: Reliability is one of the fundamental goals of any
distributed system.
• Usually hardware reliability is achieved through redundancy of
equipment.
• In a grid, the underlying software technology offers more than
hardware-based reliability.
• The grid management software resubmits a job to alternate
machines in case of failures, or in some cases multiple instances of a
critical job are executed on different machines.
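The resubmission behaviour can be sketched as a simple failover loop. The function names are hypothetical and the "machines" are simulated by a function that fails on one node; a real grid manager would also handle timeouts, retries and job state.

```python
def run_with_resubmission(job, machines, run):
    """Try `job` on each machine in turn; return (machine, result) for the first success."""
    errors = {}
    for machine in machines:
        try:
            return machine, run(job, machine)
        except RuntimeError as exc:
            errors[machine] = str(exc)   # remember the failure, then resubmit elsewhere
    raise RuntimeError(f"job failed on all machines: {errors}")

def fake_run(job, machine):
    """Stand-in for real job execution: node-1 is simulated as being down."""
    if machine == "node-1":
        raise RuntimeError("node-1 is down")
    return f"{job} done on {machine}"

outcome = run_with_resubmission("job-42", ["node-1", "node-2"], fake_run)
print(outcome)
```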
Other Grid Security Approaches
• Failure Management: In the grid environment the intra-node or
inter-node resources may fail.
• Therefore the information services have to be robust in case of
resource failures, or even the failure of information service components.
• Robust here means that the failure of any component should not
prevent information about other components of the system from being
obtained.
• The end user should still receive information, even if it is partial or
inconsistent.
Other Grid Security Approaches
How to ensure robustness of information services?
• First, the information services should be as decentralized and distributed
as possible, and the information providers should be in the
proximity of the entities they describe.
• This will increase the likelihood of obtaining information about the available
resources.
• Secondly, the components of information services should be developed
on the assumption that failure is the rule rather than the exception.
• This means not only ensuring that failed services or resources do not
interfere with or halt other functions, but also that timely
information about the failure is made available.
[Figure: Users in VO-A and VO-B have access to partially overlapping resources.
When VO-B is split by network failure, it should operate as two disjoint
fragments. Source: Czajkowski et al. (1998).]
Grid Security is Crucial

Why?
• Problems being solved might be sensitive
• Resources are typically valuable

• Also, the resources involved are generally located across a heterogeneous
assortment of domains, each with their own policies, procedures, and
security mechanisms.
• Additionally, the security implementation must be broadly available and
applicable, and must rely on standard, well-tested, well-understood
protocols that are integrated with a wide variety of tools.

Resource Discovery and Workflow
Security in Grid Computing
Lecture 3
Grid Workflow

Grid Work Flow : Definition

• The automation of a business process, in whole or in part, where
documents, information or tasks are passed from one
participant to another to be processed, according to a set of
procedural rules.
WFMS And WF Engine
• Workflow Management System (WFMS)
• “A system that defines, creates and manages the execution of
workflows through the use of software, running on one or more
workflow engines, which is able to interpret the process definition,
interact with workflow participants and, where required, invoke the use
of IT tools and applications.”
• Workflow Engine.
• “A software service or "engine" that provides the run time execution
environment for a process instance.”
Security in Workflows
• A Workflow Management System (WFMS) is often used to support the
automated execution of business processes.
• Nowadays the WWW provides new opportunities for performing such
business processes through web services.
• A standard for specifying such workflow processes is the Web Services
Business Process Execution Language (WSBPEL).
• A web service workflow can be defined as a set of interacting web services
or a web service composition, in which it is determined which web services
participate in the process, the order of their interactions and which data is
transferred during the process.
Security in Workflows
• Web service compositions (interacting web services) are used to automate
the coordination between participating “partners” thereby increasing the
efficiency of the whole process.
• There exist two different types of interactions between the single web
services in a workflow:
• Service orchestration (centralized) refers to those workflows, in which there exists one
central service that receives the client requests, makes the required data transformations
and invokes the component web services.
• Service choreography (decentralized) refers to the workflows in which there are multiple
engines, each executing a composite web service specification (a small part of the
original composite web service specification but complete in itself) at distributed
locations.
Security in Workflow
• In the decentralised model, the workflow engines communicate directly with each
other (rather than through a central coordinator) to transfer data and control
when necessary in an asynchronous manner.
• In the centralized system model there are one or more workflow engines, each
of which is able to interact with workflow participants.
• On the one hand, centralized systems provide centralized monitoring and
auditing, simpler synchronization mechanisms and overall design simplicity.
• On the other hand, the centralized approach to web service compositions
suffers from some drawbacks.
• Centralized service architectures are designed as classical client/server
applications in which the server provides most of the functionality of the system
while the computational potential at the client side is barely used.
Security in Workflow
• Systems with client/server architecture are normally vulnerable to server
failures.
• From a security perspective a central server introduces a single point of
failure where a number of different partners share one system which is under
the control of one single partner.
• In contrast to the centralized approach, in a decentralized workflow each
partner can be aware of the actual state of the workflow and its involvement
in the workflow.
• The decentralized WFMS should be able to distribute the tasks to the
appropriate partners, and ensure specified task dependencies by sending the
tasks to the predetermined partners only when all prerequisite conditions are
satisfied.
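The dependency rule in the last bullet can be sketched with a topological ordering: a task is sent to its predetermined partner only once all its prerequisites are done. The task names, partner names and the `dispatch_order` helper are hypothetical; `graphlib` is part of the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter

def dispatch_order(dependencies, assigned_partner):
    """Return (task, partner) pairs in an order that respects all prerequisites."""
    sorter = TopologicalSorter(dependencies)   # maps task -> set of prerequisite tasks
    sorter.prepare()
    log = []
    while sorter.is_active():
        # get_ready() only yields tasks whose prerequisites are all marked done
        for task in sorted(sorter.get_ready()):
            log.append((task, assigned_partner[task]))   # "send" task to its partner
            sorter.done(task)
    return log

dependencies = {
    "ship": {"pay", "pack"},
    "pay": {"order"},
    "pack": {"order"},
}
assigned_partner = {
    "order": "shop", "pay": "bank", "pack": "warehouse", "ship": "courier",
}

for task, partner in dispatch_order(dependencies, assigned_partner):
    print(task, "->", partner)
```

In a truly decentralized WFMS each partner would evaluate this readiness condition locally; the single loop here only illustrates the ordering constraint.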
Security in Workflow
• However, decentralized execution of inter-organizational workflows may
raise a number of security issues, including integrity, non-repudiation and
confidentiality.
• Through a composition approach, each web service can access
only the information needed for the correct execution of the
invoked operations, and an execution proof of the fulfilled
assignments can be provided.
• At the end of the workflow, the integrity and authenticity of the provided
execution proof can be verified by a central verification unit.
• PKI and Kerberos could take care of authentication and integrity of the
web services.
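The slides do not say how the execution proof is constructed, so the following is one possible sketch under stated assumptions: each partner extends a running proof with an HMAC over the previous proof and its task, and the central verification unit, which is assumed to know every partner's key, recomputes the chain. The keys, partner names and function names are hypothetical. Because the keys are shared with the verifier, this gives integrity and authenticity but not non-repudiation, which would need digital signatures.

```python
import hashlib
import hmac

START = "workflow-start"

def extend_proof(previous: str, partner_key: bytes, task: str) -> str:
    """A partner binds its task to everything that was executed before it."""
    message = (previous + "|" + task).encode()
    return hmac.new(partner_key, message, hashlib.sha256).hexdigest()

def verify_proof(steps, keys, final_proof) -> bool:
    """Central verification unit: recompute the chain for the expected steps."""
    proof = START
    for partner, task in steps:
        proof = extend_proof(proof, keys[partner], task)
    return hmac.compare_digest(proof, final_proof)

keys = {"bank": b"bank-key", "warehouse": b"warehouse-key"}
steps = [("bank", "pay"), ("warehouse", "pack")]

# Each partner extends the proof as the workflow executes.
proof = START
for partner, task in steps:
    proof = extend_proof(proof, keys[partner], task)

print(verify_proof(steps, keys, proof))
print(verify_proof(list(reversed(steps)), keys, proof))
```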
