
Unit II CLOUD ENABLING TECHNOLOGIES

Service Oriented Architecture – REST and Systems of Systems – Web Services –
Publish Subscribe Model – Basics of Virtualization – Types of Virtualization –
Implementation Levels of Virtualization – Virtualization Structures – Tools and
Mechanisms – Virtualization of CPU – Memory – I/O Devices – Virtualization
Support and Disaster Recovery.

Objective
To appreciate the evolution of the cloud from existing technologies

Outcome
Learn the key enabling technologies that support the development of the cloud
Service Oriented Architecture
• In grids/web services, Java, and CORBA, an entity is, respectively, a service, a Java object, and a CORBA distributed object in a variety of languages.
• These architectures build on the traditional seven Open Systems Interconnection (OSI) layers that provide the base networking abstractions.
• On top of this we have a base software environment, which would be .NET or Apache Axis for web services, the Java Virtual Machine for Java, and a broker network for CORBA.
• On top of this base environment one would build a higher-level environment reflecting the special features of the distributed computing environment.
• This starts with entity interfaces and inter-entity communication, which rebuild the top four OSI layers, but at the entity level rather than the bit level.
Layered architecture for Web
services and Grids
• The entity interfaces correspond to the Web Services Description Language (WSDL), Java method, and CORBA interface definition language (IDL) specifications in these example distributed systems.
• These interfaces are linked with customized, high-level communication systems: SOAP, RMI, and IIOP.
• Often, these communication systems are built on message-oriented middleware (enterprise bus) infrastructure such as WebSphere MQ or Java Message Service (JMS), which provide rich functionality and support virtualization of routing, senders, and recipients.
• In the case of fault tolerance, the features in the Web Services Reliable Messaging (WSRM) framework mimic the OSI layer capability at the entity level.
Cont..
★ Security is a critical capability that either uses or implements the capabilities seen in
concepts such as Internet Protocol Security (IPsec) and secure sockets in the OSI
layers.

★ The mentioned language or interface terms form a collection of entity-level capabilities.

★ However, the distributed model has two critical advantages:

■ higher performance
■ cleaner separation of software functions, with clear software reuse and maintenance
Evolution of SOA
• SOA applies to building grids, clouds, grids of
clouds, clouds of grids, clouds of clouds (also
known as interclouds), and systems of
systems in general.
• A large number of sensors provide data-collection services, as SS (sensor service).
• A sensor can be a ZigBee device, a Bluetooth
device, a WiFi access point, a personal
computer, or a wireless phone, among other
things. Raw data is collected by sensor
services.
• All the SS devices interact with large or small
computers, many forms of grids, databases,
the compute cloud, the storage cloud, the
filter cloud, the discovery cloud, and so on.
• Filter services (fs) are used to eliminate
unwanted raw data, in order to respond to
specific requests from the web, the grid, or
web services. A collection of filter services
forms a filter cloud.
The World Wide Web Consortium (W3C) defines SOA as a form of distributed systems architecture
characterized by the following properties:

Logical view: The service is an abstracted, logical view of actual programs, databases, and business processes, defined in terms of what it does.

Message orientation: The service is defined in terms of the messages exchanged between requester and provider agents, not the properties of the agents themselves.

Description orientation: The semantics of a service should be documented, either directly or indirectly, by its description.

Granularity: Services tend to use a small number of operations with relatively large and complex messages.

Network orientation: Services tend to be oriented toward use over a network.

Platform-neutral: Messages are sent in a platform-neutral, standardized format delivered through the interfaces. XML is the most obvious format that meets this constraint.
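The platform-neutrality constraint can be sketched in a few lines: a message serialized as XML by one party can be reconstructed by any party with an XML parser, regardless of language or platform. The element names below (getUserRequest, userId) are invented for illustration; only Python's standard xml.etree library is used.

```python
import xml.etree.ElementTree as ET

# Build a platform-neutral XML message (element names are illustrative only).
msg = ET.Element("getUserRequest")
ET.SubElement(msg, "userId").text = "42"
wire = ET.tostring(msg, encoding="unicode")  # serialized form sent over the network

# Any receiver with an XML parser can reconstruct it, whatever its platform.
parsed = ET.fromstring(wire)
print(wire)                        # <getUserRequest><userId>42</userId></getUserRequest>
print(parsed.find("userId").text)  # 42
```

The same round trip would work between, say, a Java sender and a Python receiver, which is exactly what the platform-neutral property requires.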
Web Services and Tools
★ Loose coupling and support of heterogeneous implementations make services
more attractive than distributed objects
★ There are two choices of service architecture: web services or REST systems
★ Both web services and REST systems have very distinct approaches to building
reliable interoperable systems
★ In web services, one aims to fully specify all aspects of the service and its
environment
★ This specification is carried with communicated messages using Simple Object
Access Protocol (SOAP).
Cont..
★ This approach has mixed success as it has been hard to agree on key parts of the
protocol and even harder to efficiently implement the protocol by software such as
Apache Axis.

★ In the REST (Representational State Transfer) approach, one adopts simplicity as the universal principle and delegates most of the difficult problems to application software.

★ In a web services language, REST has minimal information in the header, and the
message body (that is opaque to generic message processing) carries all the needed
information.

★ REST can use XML schemas but not those that are part of SOAP; “XML over HTTP” is a
popular design choice in this regard
REST and Systems of Systems
• REST is a software architecture style for distributed systems,
particularly distributed hypermedia systems, such as the World Wide
Web.
• It has recently gained popularity among enterprises such as Google, Amazon, Yahoo!, and especially social networks such as Facebook and Twitter because of its simplicity and its ease of being published and consumed by clients.
• The REST architectural style is based on four principles:
• Resource Identification through URIs
• Uniform, Constrained Interface
• Self-Descriptive Message
• Stateless Interactions
What is a REST API?

An API, or application programming interface, is a set of rules that define how applications or devices can connect to and communicate with each other. A REST API is an API that conforms to the design principles of REST, or representational state transfer, architectural style. For this reason, REST APIs are sometimes referred to as RESTful APIs.

REST APIs provide a flexible, lightweight way to integrate applications, and have emerged as the most common method for connecting components in microservices architectures.
REST design principles

At the most basic level, an API is a mechanism that enables an application or service to
access a resource within another application or service. The application or service doing
the accessing is called the client, and the application or service containing the resource is
called the server.

Some APIs, such as SOAP or XML-RPC, impose a strict framework on developers. But REST APIs can be developed using virtually any programming language and support a variety of data formats. The only requirement is that they align with the following six REST design principles, also known as architectural constraints:
1.Uniform interface. All API requests for the same resource should look the same, no matter where the request comes
from. The REST API should ensure that the same piece of data, such as the name or email address of a user, belongs to
only one uniform resource identifier (URI). Resources shouldn’t be too large but should contain every piece of
information that the client might need.
2.Client-server decoupling. In REST API design, client and server applications must be completely independent of each other. The only information the client application should know is the URI of the requested resource; it can't interact with the server application in any other way. Similarly, a server application shouldn't modify the client application other than passing it the requested data via HTTP.
3.Statelessness. REST APIs are stateless, meaning that each request needs to include all the information necessary for
processing it. In other words, REST APIs do not require any server-side sessions. Server applications aren’t allowed to
store any data related to a client request.
4.Cacheability. When possible, resources should be cacheable on the client or server side. Server responses also need
to contain information about whether caching is allowed for the delivered resource. The goal is to improve
performance on the client side, while increasing scalability on the server side.
5.Layered system architecture. In REST APIs, the calls and responses go through different layers. As a rule of thumb,
don’t assume that the client and server applications connect directly to each other. There may be a number of different
intermediaries in the communication loop. REST APIs need to be designed so that neither the client nor the server can
tell whether it communicates with the end application or an intermediary.
6.Code on demand (optional). REST APIs usually send static resources, but in certain cases, responses can also
contain executable code (such as Java applets). In these cases, the code should only run on-demand.
How REST APIs work

REST APIs communicate via HTTP requests to perform standard database functions like
creating, reading, updating, and deleting records (also known as CRUD) within a resource.
For example, a REST API would use a GET request to retrieve a record, a POST request to
create one, a PUT request to update a record, and a DELETE request to delete one. All HTTP
methods can be used in API calls. A well-designed REST API is similar to a website running in
a web browser with built-in HTTP functionality.

The state of a resource at any particular instant, or timestamp, is known as the resource representation. This information can be delivered to a client in virtually any format, including JavaScript Object Notation (JSON), HTML, XML, Python, PHP, or plain text. JSON is popular because it's readable by both humans and machines, and it is programming-language-agnostic.

Request headers and parameters are also important in REST API calls because they include
important identifier information such as metadata, authorizations, uniform resource
identifiers (URIs), caching, cookies and more. Request headers and response headers, along
with conventional HTTP status codes, are used within well-designed REST APIs.
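The CRUD-to-HTTP mapping described above can be modeled with a tiny in-memory store. This is a sketch, not a real web server: the `handle` dispatcher, the `users` resource, and the chosen status codes are all illustrative, but they show how each HTTP method maps to one CRUD operation.

```python
# Minimal in-memory model of CRUD-over-HTTP routing (illustrative only;
# a real REST API would sit behind an HTTP server).
users = {}      # resource store: id -> representation
next_id = 1

def handle(method, resource_id=None, body=None):
    """Dispatch an HTTP-style request to the matching CRUD operation."""
    global next_id
    if method == "POST":                                   # Create
        users[next_id] = body
        next_id += 1
        return 201, users[next_id - 1]
    if method == "GET":                                    # Read
        return (200, users[resource_id]) if resource_id in users else (404, None)
    if method == "PUT":                                    # Update (full replace)
        users[resource_id] = body
        return 200, body
    if method == "DELETE":                                 # Delete
        return 204, users.pop(resource_id, None)
    return 405, None                                       # method not allowed

status, rec = handle("POST", body={"name": "Ada"})
print(status, rec)              # 201 {'name': 'Ada'}
print(handle("GET", 1))         # (200, {'name': 'Ada'})
print(handle("DELETE", 1)[0])   # 204
print(handle("GET", 1))         # (404, None) -- the record was deleted
```

JSON would be the natural wire format for the representations above; `json.dumps(rec)` produces the platform-neutral body a real API returns.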
Working of REST API
Example: RESTful Web Service in Amazon S3 Interface
• A good example of RESTful web service application in high-performance computing systems is the Amazon Simple Storage Service (S3) interface.
• Amazon S3 is data storage for Internet applications. It provides simple
web services to store and retrieve data from anywhere at any time
via the web.
• S3 keeps fundamental entities, “objects,” which are named pieces of
data accompanied by some metadata to be stored in containers
called “buckets,” each identified by a unique key.
• Buckets serve several purposes: They organize the Amazon S3
namespace at the highest level, identify the account responsible for
storage and data transfer charges, play a role in access control, and
serve as the unit of aggregation for usage reporting.
• Amazon S3 provides three types of resources: a list of user buckets, a
particular bucket, and a particular S3 object, accessible through
https://s3.amazonaws.com/{name-of-bucket}/{name-of-object}
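The three resource types map directly onto URL paths. The helper below is a hypothetical sketch (not part of any AWS SDK) showing how the RESTful naming scheme composes:

```python
# Sketch of Amazon S3's RESTful resource naming (the helper function is
# illustrative; it is not an AWS SDK call).
S3_ENDPOINT = "https://s3.amazonaws.com"

def s3_url(bucket=None, obj=None):
    """Return the URL for the service root, a bucket, or an object."""
    if bucket is None:
        return S3_ENDPOINT                      # list of user buckets
    if obj is None:
        return f"{S3_ENDPOINT}/{bucket}"        # a particular bucket
    return f"{S3_ENDPOINT}/{bucket}/{obj}"      # a particular S3 object

print(s3_url())                      # https://s3.amazonaws.com
print(s3_url("my-bucket"))           # https://s3.amazonaws.com/my-bucket
print(s3_url("my-bucket", "a.txt"))  # https://s3.amazonaws.com/my-bucket/a.txt
```

Because each resource has its own URI, the standard HTTP verbs (GET, PUT, DELETE) applied to these URLs are all a client needs to manage buckets and objects.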
<?xml version="1.0"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://www.w3.org/2001/12/soap-envelope"
    SOAP-ENV:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <SOAP-ENV:Body>
    <Demo.guru99WebService xmlns="http://tempuri.org/">
      <EmployeeID>int</EmployeeID>
    </Demo.guru99WebService>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

For comparison, a REST response body carrying data can be plain JSON, for example:
{"city":"Mumbai","state":"Maharastra"}
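Such an envelope can be processed with standard XML tooling. The sketch below embeds a minimal, well-formed envelope as a string (with an illustrative EmployeeID value of 42 in place of the `int` placeholder) and extracts the payload using namespace-aware lookups:

```python
import xml.etree.ElementTree as ET

# A minimal, well-formed SOAP-style envelope (values are illustrative).
envelope = """<?xml version="1.0"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://www.w3.org/2001/12/soap-envelope"
    SOAP-ENV:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <SOAP-ENV:Body>
    <Demo.guru99WebService xmlns="http://tempuri.org/">
      <EmployeeID>42</EmployeeID>
    </Demo.guru99WebService>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

root = ET.fromstring(envelope)
# Namespaced lookup: the Body uses the SOAP envelope namespace,
# the payload uses the tempuri.org default namespace.
ns = {"soap": "http://www.w3.org/2001/12/soap-envelope",
      "t": "http://tempuri.org/"}
emp_id = root.find("soap:Body/t:Demo.guru99WebService/t:EmployeeID", ns)
print(emp_id.text)  # 42
```

Note how much machinery the envelope carries compared with the one-line JSON body a REST service would return for the same data; this is the header-versus-body trade-off discussed earlier.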
Quiz on Service Oriented Architecture - Intro
https://forms.gle/MW3YbtfSV8SsVCx9A
Virtualization
• Virtualization is a computer architecture technology by which multiple virtual machines
(VMs) are multiplexed in the same hardware machine.
• The purpose of a VM is to enhance resource sharing by many users and improve
computer performance in terms of resource utilization and application flexibility.
• Hardware resources (CPU, memory, I/O devices, etc.) or software resources (operating
system and software libraries) can be virtualized in various functional layers.
• This virtualization technology has been revitalized as the demand for distributed and
cloud computing increased sharply in recent years
• The idea is to separate the hardware from the software to yield better system efficiency.
For example, computer users gained access to much enlarged memory space when the
concept of virtual memory was introduced.
• Similarly, virtualization techniques can be applied to enhance the use of compute
engines, networks, and storage.
• With sufficient storage, any computer platform can be installed in another host
computer, even if they use processors with different instruction sets and run with
distinct operating systems on the same hardware.
Levels of Virtualization Implementation
A traditional computer runs with a host operating system specially tailored for its hardware architecture.
• After virtualization, different user
applications managed by their own
operating systems (guest OS) can run
on the same hardware, independent
of the host OS.
• This is often done by adding additional
software, called a virtualization layer
• This virtualization layer is known as
hypervisor or virtual machine monitor
(VMM)
• The VMs are shown in the upper
boxes, where applications run with
their own guest OS over the
virtualized CPU, memory, and I/O
resources.
The virtualization software creates the abstraction of VMs by interposing a virtualization layer at various levels of a computer system.

Common virtualization layers include the instruction set architecture (ISA) level, hardware level, operating system level, library support level, and application level.
Instruction Set Architecture Level
• At the ISA level, virtualization is performed by emulating a given ISA by the ISA of
the host machine.
• For example, MIPS binary code can run on an x86-based host machine with the
help of ISA emulation.
• With this approach, it is possible to run a large amount of legacy binary code
written for various processors on any given new hardware host machine.
• Instruction set emulation leads to virtual ISAs created on any hardware machine.
• The basic emulation method is through code interpretation. An interpreter
program interprets the source instructions to target instructions one by one.
• A virtual instruction set architecture (V-ISA) thus requires adding a processor-specific software translation layer to the compiler.
Hardware Abstraction Level
• Hardware-level virtualization is performed right on top of the bare hardware.
• On the one hand, this approach generates a virtual hardware environment for a
VM.
• On the other hand, the process manages the underlying hardware through
virtualization.
• The idea is to virtualize a computer’s resources, such as its processors, memory,
and I/O devices.
• The intention is to upgrade the hardware utilization rate by multiple users
concurrently.
Operating System Level
• This refers to an abstraction layer between traditional OS and user applications.
• OS-level virtualization creates isolated containers on a single physical server, with OS instances that utilize the hardware and software in data centers.
• The containers behave like real servers.
• OS-level virtualization is commonly used in creating virtual hosting environments to
allocate hardware resources among a large number of mutually distrusting users.
Library Support Level
• Most applications use APIs exported by user-level libraries rather than using
lengthy system calls by the OS.
• Since most systems provide well-documented APIs, such an interface becomes
another candidate for virtualization.
• Virtualization with library interfaces is possible by controlling the communication
link between applications and the rest of a system through API hooks.
User-Application Level
• Virtualization at the application level virtualizes an application as a VM. On a
traditional OS, an application often runs as a process.
• Therefore, application-level virtualization is also known as process-level
virtualization.
• The most popular approach is to deploy high level language (HLL) VMs.
• In this scenario, the virtualization layer sits as an application program on top of
the operating system, and the layer exports an abstraction of a VM that can run
programs written and compiled to a particular abstract machine definition.
• Any program written in the HLL and compiled for this VM will be able to run on
it. The Microsoft .NET CLR and Java Virtual Machine (JVM) are two good
examples of this class of VM.
Google App Engine

https://cloud.google.com/appengine/docs/standard/python3/an-overview-of-app-engine

Reading – 15 mins
Discussion – 10 mins
Quiz on virtualization
https://forms.gle/5w7SErRbhNGCJwhn7
VIRTUALIZATION STRUCTURES/TOOLS AND MECHANISMS

• Before virtualization, the operating system manages the hardware.
• After virtualization, a virtualization layer is inserted between the hardware and the operating system.
• In such a case, the virtualization layer is responsible for converting portions of the
real hardware into virtual hardware. Therefore, different operating systems such
as Linux and Windows can run on the same physical machine, simultaneously.
• Depending on the position of the virtualization layer, there are several classes of
VM architectures,
• hypervisor architecture
• Paravirtualization
• host-based virtualization
The hypervisor is also known as the VMM (Virtual Machine Monitor).
Hypervisor and Xen Architecture
• The hypervisor supports hardware-level virtualization on bare metal devices like CPU,
memory, disk and network interfaces.
• The hypervisor software sits directly between the physical hardware and its OS.
• The hypervisor provides hypercalls for the guest OSes and applications.
• Depending on the functionality, a hypervisor can assume a micro-kernel architecture
like the Microsoft Hyper-V. Or it can assume a monolithic hypervisor architecture like
the VMware ESX for server virtualization
• A micro-kernel hypervisor includes only the basic and unchanging functions (such as
physical memory management and processor scheduling). The device drivers and other
changeable components are outside the hypervisor.
• A monolithic hypervisor implements all the aforementioned functions, including those
of the device drivers.
• Therefore, the size of the hypervisor code of a micro-kernel hypervisor is smaller than
that of a monolithic hypervisor.
• Essentially, a hypervisor must be able to convert physical devices into virtual resources
dedicated for the deployed VM to use.
The Xen Architecture
Xen is an open source hypervisor
program developed by Cambridge
University.
Xen is a microkernel hypervisor, which
separates the policy from the
mechanism. The Xen hypervisor
implements all the mechanisms, leaving
the policy to be handled by Domain 0
Xen does not include any device drivers
natively. It just provides a mechanism by
which a guest OS can have direct access
to the physical devices.
As a result, the size of the Xen hypervisor
is kept rather small.
Binary Translation with Full Virtualization
• Depending on implementation technologies, hardware virtualization can be classified into two categories: full virtualization and host-based virtualization.
(Figure: Binary Translation of Guest OS Requests Using a VMM)
• Full virtualization does not need to modify the
guest OS. It relies on binary translation to trap
and to virtualize the execution of certain
sensitive, nonvirtualizable instructions.
• The guest OSes and their applications consist of
noncritical and critical instructions.
• In a host-based system, both a host OS and a
guest OS are used. A virtualization software layer
is built between the host OS and guest OS.
• With full virtualization, noncritical instructions
run on the hardware directly while critical
instructions are discovered and replaced with
traps into the VMM to be emulated by software.
Para-Virtualization with Compiler Support
• Para-virtualization needs to modify
the guest operating systems.
• A para-virtualized VM provides
special APIs requiring substantial OS
modifications in user applications.
• Performance degradation is a critical
issue of a virtualized system.
• No one wants to use a VM if it is
much slower than using a physical
machine.
• para-virtualization attempts to
reduce the virtualization overhead,
and thus improve performance by
modifying only the guest OS kernel.
CPU Virtualization
• A VM is a duplicate of an existing computer system in which a majority of the VM
instructions are executed on the host processor in native mode.
• Thus, unprivileged instructions of VMs run directly on the host machine for higher
efficiency.
• Other critical instructions should be handled carefully for correctness and stability.
• The critical instructions are divided into three categories: privileged instructions,
control sensitive instructions, and behavior-sensitive instructions.
• Privileged instructions execute in a privileged mode and will be trapped if
executed outside this mode.
• Control-sensitive instructions attempt to change the configuration of resources
used.
• Behavior-sensitive instructions have different behaviors depending on the
configuration of resources, including the load and store operations over the virtual
memory
• A CPU architecture is virtualizable if it supports the ability to run the VM’s
privileged and unprivileged instructions in the CPU’s user mode while the VMM
runs in supervisor mode.
• When the privileged instructions including control- and behavior-sensitive
instructions of a VM are executed, they are trapped in the VMM.
• In this case, the VMM acts as a unified mediator for hardware access from
different VMs to guarantee the correctness and stability of the whole system.
• Not all CPU architectures are virtualizable.
• RISC CPU architectures can be naturally virtualized because all control- and
behavior-sensitive instructions are privileged instructions.
• x86 CPU architectures are not primarily designed to support virtualization. This is
because about 10 sensitive instructions, such as SGDT and SMSW, are not
privileged instructions. When these instructions execute in virtualization, they
cannot be trapped in the VMM.
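The trap-and-emulate behavior described above can be modeled in a few lines: unprivileged instructions "run" directly, while sensitive ones raise a trap that the VMM catches and emulates. The instruction names and the tiny VMM loop are purely illustrative, not a real monitor.

```python
# Toy model of trap-and-emulate CPU virtualization (all names illustrative).
SENSITIVE = {"SGDT", "SMSW", "HLT"}   # instructions the VMM must intercept

class Trap(Exception):
    """Raised when a VM executes a sensitive instruction in user mode."""

def vm_execute(instr):
    """Model of the CPU running one guest instruction in user mode."""
    if instr in SENSITIVE:
        raise Trap(instr)             # a virtualizable CPU traps to the VMM here
    return f"{instr} ran natively"    # unprivileged: runs on the host CPU

def vmm_run(program):
    """The VMM mediates: native execution when safe, emulation on trap."""
    log = []
    for instr in program:
        try:
            log.append(vm_execute(instr))
        except Trap as trap:
            log.append(f"{trap} emulated by VMM")
    return log

print(vmm_run(["ADD", "SGDT", "MOV"]))
# ['ADD ran natively', 'SGDT emulated by VMM', 'MOV ran natively']
```

On a fully virtualizable CPU every sensitive instruction traps this way; the x86 problem noted above is precisely that instructions such as SGDT do not trap, which is why binary translation or hardware extensions are needed.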
Hardware-Assisted CPU Virtualization
Intel’s VT-x technology is an example
of hardware-assisted virtualization
Intel calls the privilege level of x86
processors the VMX Root Mode.
In order to control the start and stop
of a VM and allocate a memory page
to maintain the CPU state for VMs, a
set of additional instructions is added.
Xen, VMware, and the Microsoft
Virtual PC all implement their
hypervisors by using the VT-x
technology.
Memory Virtualization
• Virtual memory virtualization is similar to the virtual memory support provided by
modern operating systems.
• In a traditional execution environment, the operating system maintains mappings
of virtual memory to machine memory using page tables, which is a one-stage
mapping from virtual memory to machine memory.
• All modern x86 CPUs include a memory management unit (MMU) and a
translation lookaside buffer (TLB) to optimize virtual memory performance.
• In a virtual execution environment, virtual memory virtualization involves sharing
the physical system memory in RAM and dynamically allocating it to the physical
memory of the VMs.
• That means a two-stage mapping process should be maintained by the guest OS and the VMM, respectively: virtual memory to physical memory, and physical memory to machine memory.
• MMU virtualization should be supported, and it is transparent to the guest OS.
• The guest OS continues to control the mapping of virtual addresses to the physical memory addresses of VMs.
• But the guest OS cannot directly access the actual machine memory. The VMM is responsible for mapping the guest physical memory to the actual machine memory.
• Since each page table of the guest OSes has a separate page table in the VMM corresponding to it, the VMM page table is called the shadow page table.
• Nested page tables add another layer of indirection to virtual memory. The MMU already handles virtual-to-physical translations as defined by the OS. Then the physical memory addresses are translated to machine addresses using another set of page tables defined by the hypervisor.
• Since modern operating systems maintain a set of page tables for every process, the shadow page tables will get flooded. Consequently, the performance overhead and cost of memory will be very high.
• VMware uses shadow page tables to perform virtual-memory-to-machine-memory address translation.
• Processors use TLB hardware to map the virtual memory directly to the machine memory to avoid the two levels of translation on every access.
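The two-stage mapping can be made concrete with dictionary-based page tables: the guest OS owns the first table, the VMM owns the second, and the shadow page table is simply their composition, cached so hardware can translate in one step. All page numbers below are invented for illustration.

```python
# Toy two-stage address translation (all mappings illustrative).
guest_page_table = {0: 7, 1: 3}   # guest virtual page  -> guest "physical" page
vmm_page_table   = {7: 21, 3: 9}  # guest physical page -> machine page

def translate(vpage):
    """Guest virtual -> guest physical (guest OS), then -> machine (VMM)."""
    gp = guest_page_table[vpage]
    return vmm_page_table[gp]

# The shadow page table caches the composed mapping so hardware can do
# virtual -> machine in one step, avoiding two lookups per access.
shadow_page_table = {v: vmm_page_table[p] for v, p in guest_page_table.items()}

print(translate(0))          # 21
print(shadow_page_table)     # {0: 21, 1: 9}
assert shadow_page_table[0] == translate(0)
```

The cost the text mentions is visible here: every time the guest OS updates its table, the composed shadow entry must be rebuilt, and one shadow table is needed per guest process.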
I/O Virtualization
• I/O virtualization involves
managing the routing of I/O
requests between virtual devices
and the shared physical
hardware.
• There are three ways to implement I/O virtualization: full device emulation, para-virtualization, and direct I/O.
• Full device emulation is the first approach for I/O virtualization.
• This approach emulates well-known, real-world devices.
• All the functions of a device or bus infrastructure, such as device enumeration,
identification, interrupts, and DMA, are replicated in software.
• This software is located in the VMM and acts as a virtual device.
• The I/O access requests of the guest OS are trapped in the VMM which interacts with the
I/O devices.
• The para-virtualization method of I/O virtualization is typically used in Xen.
• It is also known as the split driver model consisting of a frontend driver and a backend
driver.
• The frontend driver is running in Domain U and the backend driver is running in Domain 0.
• They interact with each other via a block of shared memory.
• The frontend driver manages the I/O requests of the guest OSes and the backend driver is
responsible for managing the real I/O devices and multiplexing the I/O data of different
VMs.
• Although para-I/O-virtualization achieves better device performance than full device
emulation, it comes with a higher CPU overhead.
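The split driver model can be sketched as two cooperating functions sharing a queue: frontends in guest domains enqueue requests, and the single backend in Domain 0 multiplexes them onto the real device. The Python `deque` here stands in for Xen's shared-memory ring; all names are illustrative.

```python
from collections import deque

# Toy split-driver model (the deque stands in for Xen's shared-memory ring).
shared_ring = deque()   # shared memory between frontend and backend drivers

def frontend_submit(domain, request):
    """Frontend driver in a guest domain (Domain U) queues an I/O request."""
    shared_ring.append((domain, request))

def backend_drain():
    """Backend driver in Domain 0 multiplexes requests onto the real device."""
    completed = []
    while shared_ring:
        domain, request = shared_ring.popleft()
        completed.append(f"dom{domain}: {request} -> device")
    return completed

frontend_submit(1, "read block 10")
frontend_submit(2, "write block 4")
print(backend_drain())
# ['dom1: read block 10 -> device', 'dom2: write block 4 -> device']
```

The single backend serving many frontends is what makes multiplexing possible, and also why the extra hop through Domain 0 costs CPU relative to full emulation inside the VMM.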
Virtual disaster recovery is a type of DR that typically involves replication and allows a user to fail over to
virtualized workloads.

For the most efficient virtual disaster recovery, an organization should copy virtual machine (VM) workloads off-site on a regular basis. Replication can essentially make a real-time copy of VMs in a separate location, thus strengthening DR.

Virtualization provides flexibility in disaster recovery. When servers are virtualized, they are containerized into
VMs, independent from the underlying hardware. Therefore, an organization does not need the same physical
servers at the primary site as at its secondary disaster recovery site.

Virtualized platforms typically provide high availability in the event of a failure. Virtualization helps meet
recovery time objectives (RTOs) and recovery point objectives (RPOs), as replication is done as frequently as
needed, especially for critical systems.

DR planning and failover testing is also simpler with virtualized workloads than with a physical setup, making
disaster recovery a more attainable process for organizations that may not have the funds or resources for
physical DR.

In addition, consolidating physical servers with virtualization saves money because the virtualized workloads
require less power, floor space and maintenance. However, replication can get expensive, depending on how
frequently it's done.
What is recovery time objective?

The recovery time objective (RTO) is the maximum tolerable length of time that a computer, system, network or
application can be down after a failure or disaster occurs.

The RTO is a function of the extent to which the interruption disrupts normal operations and the amount of
revenue lost per unit time because of the disaster.

These factors, in turn, depend on the affected equipment and application(s).

An RTO is measured in seconds, minutes, hours or days. It is an important consideration in a disaster recovery plan (DRP).

How do you calculate RTO?

Calculating recovery time objective is a multistep process that needs to be considered from several different viewpoints, including business impact analysis (BIA), DR strategy and business continuity planning. The key goal of an RTO is to determine what duration of time it will take in a recovery process after a major incident to resume normal business operations.

Calculating RTO requires determining how quickly the recovery process for a given application, service, system or data needs to happen after a major incident, based on the loss tolerance the organization has for that application, service, system or data as part of its BIA. Defining the loss tolerance involves how much operational time an organization can afford (or is willing) to lose after an incident before normal business operations must resume.
What is recovery point objective?
The recovery point objective (RPO) is the age of files that must be recovered from backup storage for normal operations to resume if a computer, system or network goes down as a result of a hardware, program or communications failure. The RPO is expressed backward in time -- that is, into the past -- from the instant at which the failure occurs, and can be specified in seconds, minutes, hours or days. It is an important consideration in a disaster recovery plan.

For example, if the RPO is one hour, admins must schedule backups for at least once per hour. In this case, external, redundant hard drives may prove to be the best disaster recovery platform. If the RPO is five days (120 hours), then backups must happen at intervals of 120 hours or fewer.
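The backup-scheduling rule in the example is simple arithmetic: the interval between backups must not exceed the RPO, because a failure just before the next backup loses the whole interval. A small sketch (function names are invented) makes the rule explicit.

```python
# RPO bounds the maximum allowed interval between backups (illustrative helper).
def worst_case_data_loss_hours(interval_hours):
    """A failure just before the next backup loses the whole interval."""
    return interval_hours

def schedule_ok(rpo_hours, interval_hours):
    """True when the backup interval respects the stated RPO."""
    return worst_case_data_loss_hours(interval_hours) <= rpo_hours

print(schedule_ok(rpo_hours=1, interval_hours=1))     # True: hourly backups suffice
print(schedule_ok(rpo_hours=1, interval_hours=2))     # False: up to 2 hours lost
print(schedule_ok(rpo_hours=120, interval_hours=24))  # True: daily backups are ample
```

The same reasoning drives cost decisions: a tighter RPO forces more frequent replication, which is why replication expense grows with how aggressively the business sets its recovery point.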

A virtual disaster recovery plan has many similarities to a traditional DR plan. An organization should:

•Decide which systems and data are the most critical for recovery, and document them.
•Get management support for the DR plan.
•Complete a risk assessment and business impact analysis to outline possible risks and
their potential impacts.
•Document steps needed for recovery.
•Define RTOs and RPOs.
•Test the plan.
RPO AND RTO
Virtualization Support and Disaster Recovery
Discussion

• How is failure handled in a traditional system/infrastructure?
• How does virtualization simplify disaster recovery?
• What are RTO and RPO?
Task
Presentation on 4 techniques for disaster recovery in AWS
Concepts to be known
• DISASTER RECOVERY: Process of resuming normal operations after a system has been down (access to data, networking, ...)

• RPO: The last working state of a system before a disaster (How much data loss can a business incur?)

• RTO: The time to recover after a disaster (How long can a system be down?)
AWS DISASTER RECOVERY OPTIONS
AWS offers 4 levels of DR options:
1. Backup & Restore
2. Pilot Light
3. Warm Standby
4. Hot Standby with Multi Site

These options are chosen based on the business's desired RPO and RTO.
