
CSI INSTITUTE OF TECHNOLOGY, THOVALAI

DEPARTMENT OF COMPUTER SCIENCE AND ENGG.


NOTES OF LESSON

SUB. NAME: GRID AND CLOUD COMPUTING    SUB. CODE: CS7603

UNIT-I
1. SCALABLE COMPUTING OVER THE INTERNET

Introduction:
The term 'computing' has undergone many advancements due to the evolution of architectures and platforms over the past two decades. The basic purpose of computing is to provide a faster execution platform for large and complex tasks with high performance and low cost. There are many well-known techniques, such as serial computing, centralized computing, parallel computing, distributed computing, grid computing, and cloud computing.

1.1. The Age of Internet Computing

Billions of people use the Internet every day. As a result, supercomputer sites and large
data centers must provide high-performance computing services to huge numbers of Internet
users concurrently. Because of this high demand, high-performance computing (HPC) alone is no longer optimal for measuring system performance. The emergence of computing clouds instead demands high-throughput computing (HTC) systems built with parallel and distributed computing technologies.

1.1.1 The Platform Evolution

Computer technology has gone through five generations of development, with each
generation lasting from 10 to 20 years. Successive generations overlap by about 10 years.

Evolution of Hardware

1960-80: IBM mainframes (IBM 360)
1970-80: Minicomputers (VAX)
1970-90: Personal computers with VLSI microprocessors
1980-90: Supercomputers with massively parallel processors
1990-2000: Portable computers with wired or wireless devices
2000 onwards: Laptops, tablets, and smartphones with built-in advanced wireless technology
The general computing trend is to leverage shared web resources and massive amounts of
data over the Internet. Figure 1.1 illustrates the evolution of HPC and HTC systems. On the HPC
side, supercomputers (massively parallel processors, or MPPs) are gradually being replaced by clusters of cooperative computers out of a desire to share computing resources. A cluster is often a collection of homogeneous compute nodes that are physically connected in close range to one another.

On the HTC side, peer-to-peer (P2P) networks are formed for distributed file sharing and content delivery applications. In a P2P network, peers share computing and data resources with one another; a peer is simply a computer connected to the Internet.

1.1.2 High-Performance Computing

The HPC environment delivers a huge amount of compute power over a short period of time. HPC performance is measured in terms of floating-point operations per second (FLOPS). HPC is used to execute a small number of jobs at a high speed of execution; it is also characterized by the number of instructions executed per second. Some applications where HPC is used are military sensors, manufacturing processes, chemical reactors, etc.
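To make the FLOPS metric concrete, the following is a minimal Python sketch (not part of the original notes; it assumes NumPy is installed) that estimates the floating-point rate a machine achieves on a dense matrix multiplication:

import time
import numpy as np

n = 2048
a = np.random.rand(n, n)
b = np.random.rand(n, n)

start = time.perf_counter()
c = a @ b  # dense matrix multiply: roughly 2*n**3 floating-point operations
elapsed = time.perf_counter() - start

flops = 2 * n ** 3 / elapsed
print(f"Achieved rate: {flops / 1e9:.1f} GFLOPS in {elapsed:.3f} s")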

1.1.3 High-Throughput Computing

HTC is defined as the number of tasks that are executed per unit time. HTC deals with completing a large number of jobs over a long period of time, rather than with the speed of individual jobs.
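In the same spirit, throughput can be measured as jobs completed per unit time. Below is an illustrative sketch using only the Python standard library; the job function is a placeholder standing in for any small, independent job:

import time
from concurrent.futures import ProcessPoolExecutor

def job(i):
    # Placeholder workload standing in for one small, independent job.
    return sum(k * k for k in range(100_000))

if __name__ == "__main__":
    n_jobs = 200
    start = time.perf_counter()
    with ProcessPoolExecutor() as pool:
        list(pool.map(job, range(n_jobs)))  # HTC view: many jobs, measured together
    elapsed = time.perf_counter() - start
    print(f"Throughput: {n_jobs / elapsed:.1f} jobs/second")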
1.1.4 Computing Paradigms

Centralized Computing

In centralized computing, all computer resources (processors, memory, and storage) are centralized in one physical system. All resources are fully shared and tightly coupled within one integrated OS. Many data centers and supercomputers are centralized systems, but they are used in parallel, distributed, and cloud computing applications.

Parallel Computing

In parallel computing, a complex and large problem is broken into discrete parts that are solved concurrently. It uses multiple processors or computers working together on a common task. All processors are either tightly coupled with centralized shared memory or loosely coupled with distributed memory.
A computer system capable of parallel computing is known as a parallel computer.
Programs running on a parallel computer are called parallel programs.
The process of writing parallel programs is often referred to as parallel programming.
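A minimal example of the idea (an illustrative sketch, not from the notes): one large problem, summing a big list, is broken into parts that worker processes solve concurrently, each with its own private memory:

from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker process solves one discrete part of the problem.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n_parts = 4
    size = len(data) // n_parts
    chunks = [data[i * size:(i + 1) * size] for i in range(n_parts)]
    with Pool(n_parts) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total)  # same result as sum(data), computed in parallel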

Distributed computing

A distributed system is a collection of independent computers, each having its own private memory, interconnected via a computer network and capable of collaborating on a task.
Distributed computing is computing performed in a distributed system.
Information exchange in a distributed system is done by message passing.
A computer program that runs in a distributed system is known as a distributed program.
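As an illustration of message passing (a self-contained sketch, not from the notes), two parties exchange messages over a network socket; in this toy version both ends run on one machine, with the loopback address standing in for a remote node:

import socket
import threading

HOST, PORT = "127.0.0.1", 50007  # loopback; stands in for another node on the network

srv = socket.socket()
srv.bind((HOST, PORT))
srv.listen(1)

def server():
    conn, _ = srv.accept()
    with conn:
        msg = conn.recv(1024).decode()
        conn.sendall(("ack: " + msg).encode())  # the reply is also a message

t = threading.Thread(target=server)
t.start()

with socket.socket() as c:
    c.connect((HOST, PORT))
    c.sendall(b"task result 42")  # information exchange by message passing, not shared memory
    print(c.recv(1024).decode())

t.join()
srv.close()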

Grid computing

Grid computing couples the resources of numerous computers in a network to work on a single problem at the same time.

Cloud computing

Cloud computing is a large-scale distributed computing paradigm in which a pool of virtualized, dynamically scalable, managed computing power, storage, platforms, and services is delivered on an on-demand basis to external users over the Internet.

Concurrent Computing

It is the combination of parallel and distributed computing.


Ubiquitous computing

Ubiquitous computing means computing with pervasive devices at any place and at any time using wired or wireless communication.

1.1.5 Degrees of Parallelism

Bit-level parallelism (BLP) : It converts bit-serial processing to word-level processing .

Instruction-level parallelism (ILP): The processor executes multiple instructions simultaneously rather than only one instruction at a time.
Eg: Multithreading, pipelining

Data-level parallelism (DLP): It executes vector or array types of instructions (see the vectorization sketch after this list).

Task-level parallelism (TLP): Running multiple tasks at a time.
Eg: Multicore processors.

Job-level parallelism (JLP): JLP handles multiple jobs in a distributed fashion.
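The sketch below (illustrative only, assuming NumPy is installed) contrasts element-by-element processing with a data-parallel vector operation, which is the essence of DLP:

import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.arange(1_000_000, dtype=np.float64)

# Scalar view: processing one element at a time (what DLP avoids).
c_scalar = [a[i] + b[i] for i in range(10)]  # only the first 10, for illustration

# Data-level parallelism: one vector operation applied to whole arrays at once.
c_vector = a + b
print(c_vector[:10])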

1.1.6 Utility Computing

It is a business model in which customers receive computing resources from a paid service provider.
Eg: Grid and cloud service providers
Internet of Things:

IoT is a networked collection of everyday objects such as computers, sensors, humans, etc. In IoT, a tag is attached to everyday objects; with the help of RFID, sensors, or GPS, these tags are monitored or accessed.
IPv6 helps to distinguish all objects on Earth. Research shows that every human being is surrounded by 1,000 to 5,000 objects, so IoT needs to be designed to track on the order of 100 trillion objects. To reduce the complexity of identification, search, and storage, filters are used to fine-grain the objects.
In IoT, communications are of three types:
1. H2H (Human to Human)
2. H2T (Human to Things)
3. T2T (Things to Things)

1.1.7 Cyber-Physical Systems

A cyber-physical system (CPS) is the interaction between computational processes and the
physical world. A CPS integrates “cyber” (heterogeneous, asynchronous) with “physical”
(concurrent and information-dense) objects. A CPS merges the “3C” technologies of
computation, communication, and control into an intelligent closed feedback system between the
physical world and the information world.
Eg: virtual reality (VR) applications
Evolution of Distributed Computing

In the initial stage of computing, standalone computers were used to solve large, complex tasks in a sequential manner, called serial computing. In serial computing, large problems were divided into a number of smaller tasks, which were solved serially or sequentially on standalone computers. The limitations of serial computing were slower computing performance and long waiting times; these limitations motivated the centralized, parallel, distributed, grid, and cloud paradigms described above.

2. TECHNOLOGIES FOR NETWORK-BASED SYSTEMS

2.1 Multicore CPUs and Multithreading Technologies

The growth of computing and networking technology has led to the development of HPC and HTC systems. Processor speed is measured in millions of instructions per second (MIPS) and network bandwidth is measured in megabits per second (Mbps) or gigabits per second (Gbps). The unit GE refers to 1 Gbps Ethernet bandwidth.
Processor speed ranges from 1 MIPS for the VAX 780 in 1978 to 1,800 MIPS for the Intel Pentium 4 in 2002, up to a 22,000 MIPS peak for the Sun Niagara 2 in 2008. The clock rate for these processors increased from 10 MHz for the Intel 286 to 4 GHz for the Pentium 4, and now exceeds 5 GHz.
Advanced CPUs or microprocessor chips are developed using a multicore architecture with dual, quad, six, or more processing cores. These processors exploit ILP, the degree, on average, to which the instructions of a program can be executed in parallel. ILP mechanisms include multiple-issue superscalar architecture, dynamic branch prediction, and speculative execution.
2.1.1 Architecture of multicore processor

Multicore processors consist of several cores in a single processor, each with its own private cache (L1 cache).
Multiple cores are placed on the same chip with an L2 cache that is shared by all cores.
The shared L2 cache connects in turn to the next level of the memory hierarchy (an L3 cache or main memory).
The Level 1 cache, or primary cache, is inside the CPU and is used for temporary storage of instructions.
Data is organized in blocks of 32 bytes. The primary cache is the fastest form of storage because it is built into the chip with zero wait states.
Examples of multicore and multithreaded CPUs are the Intel i7, Xeon, AMD Opteron, Sun Niagara, IBM Power 6, and X Cell processors.
2.2 Multithreading Technology
A thread is a segment or part of a program or process. Multithreading means executing multiple threads concurrently.

Consider five categories of processors:

 a four-issue superscalar processor,
 a fine-grain multithreaded processor,
 a coarse-grain multithreaded processor,
 a two-core CMP (chip multiprocessor), and
 a simultaneous multithreaded (SMT) processor.

To perform multithreading operations in these processors, the threads, represented as Thread 1 to Thread 5, are assigned to four pipelined data paths.

1. In the four-issue superscalar processor, instructions from the same thread are executed.
2. The fine-grain multithreaded processor switches the execution of instructions from different threads each cycle.
3. The coarse-grain multithreaded processor executes many instructions of the same thread for a few cycles before switching to another thread.
4. The two-core CMP executes instructions from different threads on separate cores.
5. The simultaneous multithreaded (SMT) processor schedules instructions from different threads in the same cycle.
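A minimal Python illustration of multithreading as such (a sketch, not tied to any particular processor category above): several threads of one process run concurrently, so their outputs may interleave in any order:

import threading

def worker(tid):
    # Each thread is a segment of the program executing concurrently.
    print(f"thread {tid} running")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(1, 6)]  # Thread 1..5
for t in threads:
    t.start()
for t in threads:
    t.join()
print("all threads finished")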
2.3 GPU Computing to Exascale
GPU stands for Graphics Processing Unit. It is a graphics coprocessor or accelerator
mounted on a computer’s graphics card or video card. A GPU offloads the CPU when it performs
complex graphics tasks in video editing applications. The world's first GPU was the GeForce 256, marketed by NVIDIA. These GPU chips can process a minimum of 10 million polygons per second. A modern NVIDIA GPU can have 128 cores on a single chip, and each core can handle 8 threads of instructions; i.e., 1,024 threads can be executed concurrently on a single chip. Modern GPUs are used in HPC systems to power supercomputers with massive parallelism at the multicore and multithreading levels.
Uses: Conventional GPUs are used in mobile phones, game consoles, embedded systems, servers, etc.

2.3.1 GPU Programming Model


Figure 1.7 shows the interaction between a CPU and GPU in performing parallel
execution of floating-point operations concurrently. The CPU is the ordinary multicore processor
with limited parallelism. The GPU has a many-core architecture that has hundreds of simple
processing cores organized as multiprocessors. Each core can have one or more threads. The
CPU instructs the GPU to perform massive data processing. The bandwidth must be matched
between the on-board main memory and the on-chip GPU memory.

Power Efficiency in GPU: A CPU chip consumes about 2 nJ per instruction, whereas a GPU consumes about 200 pJ per instruction, one tenth of the CPU figure. So the power consumed by a GPU per instruction is much lower than that of a CPU.
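The CPU-GPU interaction described above can be sketched with the CuPy library (an assumption: CuPy and a CUDA-capable NVIDIA GPU are available; this example is not part of the original notes). The CPU prepares data in host memory, hands it to the GPU for massive data processing, and copies the result back over the CPU-GPU memory interface:

import numpy as np
import cupy as cp  # assumes the CuPy package and a CUDA-capable GPU

# CPU side: prepare data in host (on-board) memory.
a_host = np.random.rand(4096, 4096).astype(np.float32)

# CPU instructs the GPU: copy to device memory and compute there.
a_dev = cp.asarray(a_host)
b_dev = a_dev @ a_dev  # executed by hundreds of GPU cores in parallel

# Copy the result back across the CPU-GPU memory interface.
b_host = cp.asnumpy(b_dev)
print(b_host.shape)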

2.4 Memory, Storage, and Wide-Area Networking


2.4.1 Memory Technology
DRAM chip capacity has grown from 16 KB to 64 GB. Due to the memory-wall problem, memory access time has not improved at the same rate.
Hard disk capacity has increased from 260 MB to 500 GB.
2.4.2 Disks and Storage Technology

The storage capacity of disk arrays has exceeded 3 TB. Flash memory and solid-state drives (SSDs) can handle 300,000 to 1 million write cycles per block, so they are used in HPC and HTC systems.

2.4.3 System-Area Interconnects

The nodes in small clusters are interconnected by an Ethernet switch or a local area
network (LAN). LAN is used to connect client hosts to big servers. A storage area network
(SAN) is used to connect servers to network storage such as disk arrays. Network attached
storage (NAS) connects client hosts directly to the disk arrays. All three types of networks often
appear in a large cluster built with commercial network components.

2.5 Virtual Machines and Virtualization Middleware


Virtualization
Virtualization is the process of creating a virtual version of an operating system, storage, or network resources.
Virtual Machines
 It is the software implementation of a computing environment in which an operating system or programs can be installed and run.
 The VM emulates a physical computing environment, but its requests for CPU, memory, network, and other resources are managed by a virtualization layer, which translates these requests to the underlying physical hardware.
 VMs are created within the virtualization layer, which is called a hypervisor.
 A hypervisor, also called a virtual machine monitor (VMM), is a program that allows multiple operating systems to share a single hardware host.

Host OS: The OS loaded on the physical machine is called the host OS.
Guest OS: A guest OS is an operating system installed in a virtual machine.
2.5.1 VM architecture
The host machine is equipped with the physical hardware, e.g., an x86 architecture desktop running its installed Windows OS, as shown in part (a) of the figure. The three VM architectures are:

1. Native VM
2. Hosted VM
3. Dual-mode VM

In a native VM, the virtual machine monitor (VMM) runs directly on the host machine's hardware.

In a hosted VM, the VMM runs on top of the host operating system.

In a dual-mode VM, the VMM runs partly at the hardware (supervisor) level and partly at the host OS (user) level.

2.5.2 VM Primitive Operations


The VM primitive operations are:
1. VM multiplexing
2. VM suspension (storage)
3. VM provisioning (resume)
4. VM live migration

1. VM multiplexing: Virtual machines are multiplexed onto the same hardware machine; that is, multiple virtual machines are created on one physical machine.
2. VM suspension (storage): A VM can be suspended and stored in stable storage for future use.
3. VM provisioning (resume): A suspended VM can be resumed or provisioned on a new hardware platform.
4. VM live migration: A VM can be migrated or moved from one hardware platform to another.
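These primitives can be driven programmatically. Below is a hedged sketch using the libvirt Python bindings; it assumes the libvirt-python package, a local hypervisor reachable at qemu:///system, and an existing VM named "vm1", all of which are hypothetical here:

import libvirt  # assumes the libvirt-python package and a local hypervisor

conn = libvirt.open("qemu:///system")  # connect to the hypervisor (VMM)
dom = conn.lookupByName("vm1")         # "vm1" is a hypothetical VM name

dom.suspend()                          # pause the VM in memory
dom.resume()                           # resume it

dom.save("/tmp/vm1.state")             # VM suspension: store state for future use
conn.restore("/tmp/vm1.state")         # VM provisioning: resume from the stored state

conn.close()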

2.6 Data Center Virtualization for Cloud Computing

A large data center may be built with thousands of servers; smaller data centers are typically built with hundreds. The cost to build and maintain data center servers has increased over the years. About 30 percent of data center costs goes toward purchasing IT equipment (such as servers and disks), 33 percent is attributed to the chiller, 18 percent to the uninterruptible power supply (UPS), 9 percent to computer room air conditioning (CRAC), and the remaining 7 percent to power distribution, lighting, and transformer costs.
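As a worked example (the $10 million annual budget is hypothetical, not from the notes), those percentages translate into dollar figures as follows:

budget = 10_000_000  # hypothetical annual data center cost, in dollars
breakdown = {
    "IT equipment (servers, disks)": 0.30,
    "Chiller": 0.33,
    "UPS": 0.18,
    "CRAC": 0.09,
    "Power distribution, lighting, transformers": 0.07,
}
for item, share in breakdown.items():
    print(f"{item:45s} ${budget * share:>12,.0f}")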

3.Clusters of Cooperative Computers

Distributed and cloud computing systems are built over a large number of autonomous computer nodes. These nodes are interconnected by SANs, LANs, or WANs in a hierarchical manner. A LAN switch can easily connect hundreds of machines as a working cluster. A WAN can connect many local clusters to form a very large cluster of clusters, called a massive system. Massive systems are classified into four groups: clusters, P2P networks, computing grids, and Internet clouds.
A cluster is a collection of interconnected stand-alone computers which work together collectively and cooperatively as a single integrated computing resource.

3.1 Cluster Architecture

Here, S0 to Sn refer to a cluster of servers interconnected by a SAN or LAN with shared I/O devices and disk arrays. To build a larger cluster with more nodes, the interconnection network can be built with multiple levels of Gigabit Ethernet, Myrinet, or InfiniBand switches. Through hierarchical construction using a SAN, LAN, or WAN, one can build scalable clusters with an increasing number of nodes. The cluster acts as a single computer attached to the Internet and is connected to it via a virtual private network (VPN) gateway. The gateway IP address is used to identify the cluster. The system image of a computer is decided by the way the OS manages the shared cluster resources. Most clusters have loosely coupled node computers; all resources of a server node are managed by its own OS. Thus, most clusters have multiple system images as a result of having many autonomous nodes under different OS control.

Single-System Image
A single-system image (SSI) means merging multiple system images into one: an illusion created by software or hardware that presents a collection of resources as one integrated, powerful resource. SSI makes the cluster appear like a single machine to the user.

Design issues of a cluster

1. Scalable performance
2. Single-system image
3. Cluster job management
4. Fault tolerance and recovery
4.Grid Computing Infrastructures

Computational Grids:
 A computing grid offers an infrastructure that couples computers, software/middleware, special instruments, people, and sensors together.
 The goal of grid computing is to provide fast solutions for large scale computing
problems.
 The grid is often constructed across LAN, WAN, or Internet backbone networks at
a regional, national, or global scale.
 They can also be viewed as virtual platforms to support virtual organizations. The
computers used in a grid are primarily workstations, servers, clusters, and
supercomputers.
 Personal computers, laptops, and PDAs can be used as access devices to a grid
system.
 Figure below shows an example computational grid built over multiple resource
sites owned by different organizations.
 The resource sites offer complementary computing resources, including
workstations, large servers, a mesh of processors, and Linux clusters to satisfy a
chain of computational needs. The grid is built across various IP broadband
networks including LANs and WANs already used by enterprises or organizations
over the Internet.
 The grid is presented to users as an integrated resource pool, as shown in the upper half of the figure.

Grid Families
The classification of grid systems is:
1. Computational or data grids
2. P2P grids
5.Cloud Computing Over the Internet
Cloud computing is a large-scale distributed computing paradigm in which a pool of virtualized, dynamically scalable, managed computing power, storage, platforms, and services is delivered on an on-demand basis to external users over the Internet.
Characteristics of cloud computing
1. Rapid Elasticity
2. Dynamic load balancing
3. Pay as you go model
4. Supports virtualized resources
5. On-demand self service
Internet Clouds
 Cloud computing applies a virtualized platform with elastic resources on demand
by provisioning hardware, software, and data sets dynamically.
 The idea is to move desktop computing to a service-oriented platform using server
clusters and huge databases at data centers.
 Cloud computing leverages its low cost and simplicity to benefit both users and
providers. Machine virtualization has enabled such cost effectiveness.
Cloud service models
The cloud computing service models are of three types. They are:
1. Infrastructure as a Service (IaaS)
2. Platform as a Service (PaaS)
3. Software as a Service (SaaS)

1. Infrastructure as a Service (IaaS):

IaaS delivers infrastructure on demand in the form of virtual hardware, storage, and networks. It delivers virtual machines with virtual OSes, virtual servers, virtual networks, and virtual storage to users over the Internet on an on-demand basis. Users can deploy and run their applications on virtual machine instances. The IaaS model comprises three further services:
* Storage as a Service
* Compute Instances as a Service
* Communication as a Service
Examples of IaaS providers are Amazon EC2, GoGrid, and Flexiscale (UK).
Public Cloud    Offered IaaS services
Amazon EC2      Each instance has 1-20 processors, 1.7-15 GB memory, and 160 GB-1.6 TB of storage
GoGrid          Each instance has 1-6 CPUs, 0.5-8 GB memory, and 30-48 GB of storage
The pricing model is defined in terms of dollars per hour.
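Provisioning an IaaS compute instance is typically a single API call. Below is a hedged sketch using the boto3 library for Amazon EC2 (it assumes AWS credentials are configured locally; the AMI ID is a placeholder and the instance type is just an example):

import boto3  # assumes AWS credentials are configured locally

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request one virtual machine instance; the ImageId below is a hypothetical AMI.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])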

2. Platform as a Service (PaaS)

Platform as a Service (PaaS) provides a platform and runtime environment that allow developers to build their applications and services over the Internet. It allows users to develop, run, and test their applications. It involves a runtime execution engine, databases, and middleware solutions. PaaS services hosted on the cloud can be accessed by users through a web browser.
Eg:
Cloud Name          Languages and development tools
Google App Engine   Python, Java, and Eclipse
Microsoft Azure     .NET, Visual Studio

3. Software as a Service (SaaS):

The Software as a Service (SaaS) model provides software to the user as a service on demand. On the customer side, there is no investment in software licenses. The software is delivered by service providers, on request, to users, who pay only a small subscription fee.
Eg: Google Gmail, Microsoft SharePoint
6.SERVICE ORIENTED ARCHITECTURE (SOA)

Service-Oriented Architecture is a collection of services that communicate with each other using service interfaces. A service is a function that is well defined, self-contained, and does not depend on the state of other services. In SOA, resources on a network are made available as independent services that can be accessed without knowledge of their underlying platform implementations. SOA provides methods for the design, deployment, and management of services that are accessible over the network and executable, so it is used to build grids and clouds.

The SOA architecture has three components, namely the service provider, the service consumer, and the service registry.

Figure: SOA components. The client (service consumer) and the server (service provider) communicate through middleware, with the service registry between them.
The service provider is responsible for publishing services into the registry and provides access to those services through APIs and interfaces for consumers.

The service consumer is responsible for invoking and accessing the services published by the service provider through standard interfaces and APIs.

The service registry stores the references of services published by providers and allows consumers to locate and access those services using the references.
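The three roles can be mimicked in a few lines of Python (an illustrative in-memory sketch, not a real SOA middleware stack; the service name and function are hypothetical):

# Service registry: stores references to published services.
registry = {}

# Service provider: publishes a well-defined, self-contained function.
def currency_convert(amount, rate):
    return amount * rate

registry["currency_convert"] = currency_convert  # publish into the registry

# Service consumer: locates the service by name and invokes it through the
# registry reference, without knowing how it is implemented.
service = registry["currency_convert"]
print(service(100, 0.92))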

Layered Architecture for Web Services and Grids

 In grids/web services, an entity is called a service.
 In Java, an entity is called a Java object.
 In CORBA, an entity is called a CORBA distributed object.
 On top of this OSI model, a base software environment is present: .NET or Apache Axis for web services, the Java Virtual Machine for Java, and a broker network for CORBA.

 Service interfaces are used to communicate between the upper layers of SOA and the base hosting environment.
 The Web Services Description Language (WSDL) is the interface for web services, the Java method interface corresponds to Java, and the CORBA Interface Definition Language (IDL) is the interface for CORBA. These interfaces are linked with customized, high-level communication systems: SOAP, RMI, and IIOP.
 These communication systems support remote procedure calls, fault recovery, and specialized routing.
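In practice, a consumer reads a service's WSDL and calls its operations over SOAP. A hedged sketch using the third-party zeep library (the WSDL URL and the Add operation are hypothetical placeholders):

from zeep import Client  # assumes the zeep package is installed

# The WSDL describes the service interface; this URL is a placeholder.
client = Client("http://example.com/service?wsdl")

# Invoke an operation declared in the WSDL; "Add" is hypothetical.
result = client.service.Add(2, 3)
print(result)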

Service discovery
 The CORBA Trading Service, UDDI (Universal Description, Discovery, and Integration), LDAP (Lightweight Directory Access Protocol), and ebXML (Electronic Business using eXtensible Markup Language) are examples of discovery and information services.
Service Management
Management services include service state and lifetime support; examples include the
CORBA Life Cycle and Persistent states, the different Enterprise JavaBeans models,
Jini’s lifetime model, and a suite of web services specifications.

The Evolution of SOA


 SOA applies to building grids, clouds, grids of clouds, clouds of grids, clouds of
clouds (also known as interclouds), and systems of systems in general.
 A large number of sensors provide data-collection services, denoted in the Figure
as SS (sensor service).
 A sensor can be a ZigBee device, a Bluetooth device, a WiFi access point, a personal computer, a GPS device, or a wireless phone, among other things. Raw data is collected by sensor services.
 All the SS devices interact with large or small computers, many forms of grids,
databases, the compute cloud, the storage cloud, the filter cloud, the discovery
cloud, and so on.
 Filter services (fs in the Figure) are used to eliminate unwanted raw data, in order
to respond to specific requests from the web, the grid, or web services.
 A collection of filter services forms a filter cloud.
 SOA aims to search for, or sort out, the useful data from the massive amounts of
raw data items. Processing this data will generate useful information, and
subsequently, the knowledge for our daily use. In fact, wisdom or intelligence is
sorted out of large knowledge bases.
 Finally, intelligent decisions are made based on both biological and machine wisdom. Most distributed systems require a web interface or portal.
 For raw data collected by a large number of sensors to be transformed into useful information or knowledge, the data stream may go through a sequence of compute, storage, filter, and discovery clouds. Finally, the inter-service messages converge at the portal, which is accessed by all users. Two example portals are OGCE and HUBzero, which use both web service (portlet) and Web 2.0 (gadget) technologies.
7.Elements of Grid Computing

Grid computing combines elements such as distributed computing, high-performance computing, and disposable computing, depending on the application of the technology and the scale of operation.

Grids can create a virtual supercomputer out of existing servers, workstations, and personal computers. The types of grids are:
Computational grids have huge computing power. They are meant to provide secure access to computational resources and perform processing of large computational problems.
Scavenging grids are commonly used to find and harvest machine cycles from idle servers and desktop computers for use in resource-intensive tasks.
Data grids are meant to provide storage space for grid applications. They provide support for data storage, data discovery, data locality, and data security.
Market-oriented grids deal with price setting and negotiation, grid economy management, and utility-driven scheduling and resource allocation.

The key components of grid computing include the following.

Resource management: a grid must be aware of what resources are available for
different tasks.
Security management: the grid needs to ensure that only authorized users can access and use the available resources.
Data management: data must be transported, cleansed, parceled and processed.
Services management: users and applications must be able to query the grid in an
effective and efficient manner.

The functional constituents or primary elements of a grid are,


 Resources
 grid portal
 Security infrastructure
 Resource Broker
 Scheduler
 Data Management
 Job and resource management

The grid resource is an entity that is to be shared; this includes computers, storage, data, and software.
The grid portal acts as a user interaction mechanism which provides interfaces for users to access grid applications and resources.
A grid security infrastructure provides various mechanisms for secure access to grid applications in terms of authentication, authorization, data confidentiality, data integrity, and availability.
The resource broker provides information about the available resources on the grid and the working status of these resources.
The grid scheduler is responsible for submitting a job for execution, monitoring the progress of active jobs, rescheduling suspended jobs, etc.
The data management framework is used for moving files and data to various nodes within the grid.
The job and resource management framework provides different services to launch a job on a particular resource, check the job's status, and retrieve the results when the job is complete.
8.Overview of Grid Architecture
The Grid Architecture consists of five layers. They are,
1. Fabric Layer
2. Connectivity Layer
3. Resource Layer
4. The Collective Layer
5. Application Layer

1.Fabric Layer
The Fabric layer, the bottom-most layer, is responsible for providing sharable resources such as computational resources, data storage, networks, catalogs, and other system resources. The data received at this layer can be transmitted directly to other computational nodes or stored in a database. The resources can be physical or logical by nature. Some examples of the logical resources of a grid computing environment are distributed file systems, computer clusters, distributed computer pools, software applications, and advanced forms of networking services.
2.Connectivity Layer
The Connectivity layer defines the core communication and authentication protocols required for grid-specific networking service transactions. At this layer, different communication protocols like IP and DNS are used for the exchange of data between the resource layer and the fabric layer. Authentication protocols are used for the identification of users. The Grid Security Infrastructure works at this layer, providing various security standards for authentication, authorization, data confidentiality, and integrity.

3.Resource Layer
 The Resource layer is responsible for providing protocols for resource publication, discovery, negotiation, allocation, monitoring, metering, accounting, and payment of individual resources.
 There are two primary classes of resource layer protocols. These protocols are key to the operations and integrity of any single resource. They are as follows:
Information Protocols: These protocols are used to get information about the structure
and the operational state of a single resource, including configuration, usage policies,
service-level agreements, and the state of the resource. In most situations, this
information is used to monitor the resource capabilities and availability constraints.
Management Protocols: The important functionalities provided by the management
protocols are:
 Negotiating access to a shared resource.
 Performing operation(s) on the resource, such as process creation or data access.
 Providing accounting and payment management functions on resource sharing.
 Monitoring the status of an operation, controlling the operation including
terminating the operation, and providing asynchronous notifications on operation
status.

4.The Collective Layer


The Collective layer is responsible for all global resource management and interaction with collections of resources. The collective layer services are Discovery Services; Co-allocation, Scheduling, and Brokering Services; Monitoring and Diagnostic Services; Data Replication Services; Workload Management Systems; Software Discovery Services; and Community Accounting and Payment Services.

5. Application Layer
This layer is responsible for providing different user applications with the help of APIs.
This layer also provides interfaces to the users to interact with the grid.
9. GRID STANDARDS
The development of grid computing requires various associated standards. The Global Grid Forum developed various grid standards to support interoperability between services. The popular grid standards are OGSA, OGSI, OGSA-DAI, WSRF, etc.
OGSA
The Open Grid Services Architecture (OGSA) is a standard provided by the Global Grid Forum to address the requirements of grid computing in an open and standard way. OGSA allows a system to perform a specific task or solve a challenging problem by using distributed resources over an interconnection network.

OGSI
The Open Grid Services Interface (OGSI) defines mechanisms for creating,
managing, and exchanging information among Grid services.
 A Grid service is a Web service that conforms to a set of interfaces and behaviors
that define how a client interacts with a Grid service.
 OGSI provides the Web Services Description Language (WSDL) definitions for key interfaces.

OGSA-DAI
The OGSA-DAI (data access and integration) project is concerned with
constructing middleware to assist with access and integration of data from separate data
sources via the grid.

GridFTP
 GridFTP is a secure and reliable data transfer protocol providing high performance
and optimized for wide-area networks that have high bandwidth.
 It is based upon the Internet FTP protocol and includes extensions that make it a
desirable tool in a grid environment.
 GridFTP uses basic Grid security on both control (command) and data channels.
Features include multiple data channels for parallel transfers, partial file transfers,
third-party transfers, and more.
 GridFTP can be used to move files (especially large files) across a network
efficiently and reliably. These files may include the executables required for an
application or data to be consumed or returned by an application.
 Higher level services, such as data replication services, could be built on top of
GridFTP.
WSRF
WSRF defines a set of specifications for defining the relationship between Web
services (that are normally stateless) and stateful resources.
Web service related standards: XML, WSDL, SOAP, and UDDI are the standards related to web services.
IMPORTANT 2 MARK AND 16 MARK QUESTIONS.

UNIT-1-Part A

Define cloud computing/ Grid computing/ Difference between grid and cloud computing/ Components of cloud computing/ HPC/ HTC/ Centralized computing/ Parallel computing/ Distributed computing/ Concurrent computing/ Ubiquitous computing/ Internet of Things (IoT)/ Degrees of parallelism/ Utility computing/ Multicore processors/ Multithreading/ GPU/ SAN/ NAS/ Virtualization/ Virtual machine/ Types of virtual machine architecture/ Bare metal devices/ Virtual machine primitive operations/ Massive systems/ Datacenter/ Cluster/ Single system image/ Computational grid/ Data grid/ Cloud service models/ SOA/ Types of grid/ OGSA/ OGSI/ Grid standards/ Web service specifications.

Part-B

1. Explain Grid computing infrastructure.

2. Explain Grid computing architecture

3. Explain service oriented architecture.

4. Explain the system models for distributed and cloud computing.

5. Explain technologies for network based systems.

6. What are the cloud service models? Explain.

7. Explain the cluster architecture.
