Cloud Computing

BITS Pilani AWS-Storage and Database Services


Hyderabad Campus
Cloud Computing Offerings

SS ZG527 AWS Storage and DB Services Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 2
Amazon Web Services
Amazon Web Services Cloud
• Provides highly reliable and scalable infrastructure for
deploying web-scale solutions
• Keeps support and administration costs minimal
• Offers more flexibility than owning infrastructure, whether on premises
or at a data centre facility

Infrastructure Services
• Elastic IP addresses: allocate a static IP address and
assign it to an instance.
• CloudWatch: monitors Amazon EC2 instances, giving
visibility into resource utilization, operational performance,
and overall demand patterns (including metrics such as
CPU utilization, disk reads and writes, and network traffic).
• Auto Scaling: automatically scales capacity when
conditions on the metrics that Amazon CloudWatch
collects are met.
• Elastic Load Balancing: distributes incoming traffic across
instances through an elastic load balancer.
• Amazon Elastic Block Store (EBS): volumes provide
network-attached persistent storage to Amazon EC2
instances.
• Amazon S3: a highly durable, distributed data store.
Through a simple web services interface, you can store and retrieve
large amounts of data as objects in buckets (containers) at
any time, using standard HTTP.
• Amazon SimpleDB: provides the core functionality of a
database, namely real-time lookup and simple querying of
structured data.
• Amazon Relational Database Service (RDS): provides an easy
way to set up, operate, and scale a relational database in
the cloud.
• Amazon Elastic MapReduce: provides a hosted Hadoop
framework.
• AWS Identity and Access Management (IAM): lets you create
multiple users with unique security credentials and
manage the permissions for each of these users.

Amazon Elastic Compute Cloud (Amazon EC2)

Features of Amazon EC2

Amazon EC2

Amazon Machine Image (AMI)

Types of AMI

Amazon EC2 Choices

Amazon EC2 Instance Types

Elastic IP Address

Auto Scaling

Elastic Load Balancing

Amazon VPC

Amazon Route 53

Security Groups

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html

Inbound
Source | Protocol | Port Range | Comments
0.0.0.0/0 | TCP | 80 | Allow inbound HTTP access from all IPv4 addresses
::/0 | TCP | 80 | Allow inbound HTTP access from all IPv6 addresses
0.0.0.0/0 | TCP | 443 | Allow inbound HTTPS access from all IPv4 addresses
::/0 | TCP | 443 | Allow inbound HTTPS access from all IPv6 addresses
Your network's public IPv4 address range | TCP | 22 | Allow inbound SSH access to Linux instances from IPv4 addresses in your network (over the Internet gateway)
Your network's public IPv4 address range | TCP | 3389 | Allow inbound RDP access to Windows instances from IPv4 addresses in your network (over the Internet gateway)

Outbound
Destination | Protocol | Port Range | Comments
The ID of the security group for your Microsoft SQL Server database servers | TCP | 1433 | Allow outbound Microsoft SQL Server access to instances in the specified security group
The ID of the security group for your MySQL database servers | TCP | 3306 | Allow outbound MySQL access to instances in the specified security group
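
The inbound rules above behave like an allow-list: a connection is admitted only when some rule matches its source address, protocol, and port. The following is a minimal Python sketch of that matching logic, not the AWS API. The `InboundRule` class, the `is_allowed` function, and the 203.0.113.0/24 range (standing in for "your network's public IPv4 address range") are illustrative assumptions.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    source: str    # CIDR block, e.g. "0.0.0.0/0"
    protocol: str  # e.g. "tcp"
    port: int


# Hypothetical range standing in for "your network's public IPv4 address range".
OFFICE_CIDR = "203.0.113.0/24"

RULES = [
    InboundRule("0.0.0.0/0", "tcp", 80),     # HTTP from all IPv4
    InboundRule("::/0", "tcp", 80),          # HTTP from all IPv6
    InboundRule("0.0.0.0/0", "tcp", 443),    # HTTPS from all IPv4
    InboundRule("::/0", "tcp", 443),         # HTTPS from all IPv6
    InboundRule(OFFICE_CIDR, "tcp", 22),     # SSH from your network only
    InboundRule(OFFICE_CIDR, "tcp", 3389),   # RDP from your network only
]


def is_allowed(src_ip: str, protocol: str, port: int, rules=RULES) -> bool:
    """Return True if any rule admits this source/protocol/port."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        net = ipaddress.ip_network(rule.source)
        if (rule.protocol == protocol and rule.port == port
                and addr.version == net.version and addr in net):
            return True
    # Nothing matched: inbound traffic is denied unless a rule allows it.
    return False
```

With these rules, `is_allowed("198.51.100.7", "tcp", 80)` is admitted by the first rule, while the same address on port 22 is rejected because it lies outside the assumed office range.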

Region versus Availability Zones

Regions

Amazon S3

Organization of Data in S3

Amazon S3 pricing
Storage class | Tier | Price per GB per month
S3 Standard | First 50 TB / month | $0.023
S3 Standard | Next 450 TB / month | $0.022
S3 Standard | Over 500 TB / month | $0.021
S3 Standard-Infrequent Access (S3 Standard-IA) | All storage / month | $0.0125
S3 One Zone-Infrequent Access (S3 One Zone-IA) | All storage / month | $0.01
S3 Glacier | All storage / month | $0.004
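
As a worked example of the tiered prices above, the sketch below estimates a monthly storage bill in Python. It covers storage charges only (requests and data transfer are ignored), and it assumes 1 TB = 1024 GB. The function and constant names are hypothetical and chosen for illustration.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in GB, price per GB per month); None marks the open-ended last tier.
STANDARD_TIERS = [
    (50 * GB_PER_TB, 0.023),
    (450 * GB_PER_TB, 0.022),
    (None, 0.021),
]

# Flat per-GB monthly prices for the other storage classes on the slide.
FLAT_PRICES = {
    "standard_ia": 0.0125,
    "one_zone_ia": 0.01,
    "glacier": 0.004,
}


def standard_monthly_cost(gb: float) -> float:
    """Walk the S3 Standard tiers, charging each slice at its own rate."""
    remaining = gb
    cost = 0.0
    for size, price in STANDARD_TIERS:
        if remaining <= 0:
            break
        in_tier = remaining if size is None else min(remaining, size)
        cost += in_tier * price
        remaining -= in_tier
    return cost


def flat_monthly_cost(gb: float, storage_class: str) -> float:
    """Single-rate classes: every GB costs the same."""
    return gb * FLAT_PRICES[storage_class]
```

For 60 TB in S3 Standard, the first 50 TB are charged at $0.023 per GB and the remaining 10 TB at $0.022 per GB, which is why the total is slightly lower than 60 TB at the top rate.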

Billions of Objects Stored

S3 Namespace

S3 API

Storage Resources

Elastic Block Storage
In the diagram, Volume 1 is shown at
three points in time. A snapshot is taken
of each of these three volume states.
• In State 1, the volume has 10 GiB of
data. Because Snap A is the first
snapshot taken of the volume, the entire
10 GiB of data must be copied.
• In State 2, the volume still contains 10
GiB of data, but 4 GiB have changed.
Snap B needs to copy and store only the
4 GiB that changed after Snap A was
taken. The other 6 GiB of unchanged
data, which are already copied and
stored in Snap A, are referenced by
Snap B rather than (again) copied. This
is indicated by the dashed arrow.
• In State 3, 2 GiB of data have been
added to the volume, for a total of 12
GiB. Snap C needs to copy the 2 GiB
that were added after Snap B was taken.
As shown by the dashed arrows, Snap C
also references 4 GiB of data stored in
Snap B, and 6 GiB of data stored in
Snap A.
• The total storage required for the three
snapshots is 16 GiB.
Source: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html
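
The 16 GiB total can be reproduced with a small Python model of incremental snapshots. This is a sketch for illustration only: it assumes the volume is divided into 1 GiB blocks, and the `take_snapshot` helper and the "v1"/"v2" version labels are hypothetical.

```python
def take_snapshot(volume, previous_view):
    """Store only blocks that are new or changed since the previous snapshot.

    volume and previous_view map block index -> block contents.
    Returns (blocks stored by this snapshot, full view of the volume).
    """
    stored = {blk: data for blk, data in volume.items()
              if previous_view.get(blk) != data}
    return stored, dict(volume)


# State 1: 10 GiB of data, modelled as ten 1 GiB blocks.
state1 = {blk: "v1" for blk in range(10)}
# State 2: still 10 GiB, but 4 GiB (blocks 0-3) have changed.
state2 = {**state1, **{blk: "v2" for blk in range(4)}}
# State 3: 2 GiB (blocks 10-11) added, for 12 GiB in total.
state3 = {**state2, 10: "v1", 11: "v1"}

view = {}
sizes = []
for state in (state1, state2, state3):
    stored, view = take_snapshot(state, view)
    sizes.append(len(stored))  # GiB copied by Snap A, B, C

total_gib = sum(sizes)
```

Here `sizes` holds the GiB copied by Snap A, Snap B, and Snap C in turn, and `total_gib` is their sum. Unchanged blocks are never copied again, which matches the references drawn as dashed arrows in the diagram.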
Instance Storage vs EBS Storage

Amazon Glacier

https://aws.amazon.com/glacier/

Amazon DynamoDB

Amazon CloudFront

How CloudFront Delivers Content

Bibliography
• Jayaswal K., Kallakurchi J., Houde D. J., and Shah D.
Cloud Computing Black Book. DreamTech Press; 2014.
• Dan C. Marinescu, Cloud Computing Theory and
Practice. Elsevier; 2013.
• Internet Resources.
• Recorded Lectures.

Cloud Computing
BITS Pilani
Hyderabad Campus
What is OpenStack?

OpenStack is a collection of open source
software projects that controls large pools of
compute, storage, and networking resources
throughout a datacentre, all managed through a
dashboard that gives administrators control
while empowering their users to provision
resources through a web interface.

- OpenStack.org

SS ZG527 OpenStack Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 2


Introduction to OpenStack

A free and open-source software platform for cloud
computing, mostly deployed as IaaS.
Began in 2010 as a joint project of Rackspace Hosting and
NASA.
Rackspace contributed its "Cloud Files" platform (code)
to power the Object Storage part of OpenStack.
NASA contributed its "Nebula" platform (code) to power
the Compute part.
As of 2016, more than 500 companies had joined the
project, including Intel, Cisco, HP, Dell, and
AMD.

7 Core Components of OpenStack

Nova – Compute Service
Swift – Storage Service
Glance – Imaging Service
Cinder – Block Storage Service
Keystone – Identity Service
Horizon – UI Service
Neutron (formerly Quantum) – Network Connectivity Service

Components of OpenStack

Nova – Compute Service:

Allows users to create, destroy, and
manage virtual machines using user-supplied
images.
It can work with widely available
virtualization technologies, as well as
bare metal and high-performance
computing (HPC) configurations.
Rackspace and HP provide commercial
compute services built on Nova, and it
is used internally at companies like
MercadoLibre and NASA.
It corresponds to Amazon EC2, and users
can use the OpenStack API or Amazon’s
EC2 API.
Uses Python and the Web Server Gateway
Interface (WSGI).

Nova – System Architecture

Nova is composed of multiple server processes, each
performing different functions.
– The user-facing interface is a REST API.
– Nova components communicate internally via an RPC message
passing mechanism.
The API servers process REST requests, which typically
involve database reads/writes.
Most of the major Nova components can be run on multiple
servers, and each has a manager that listens for RPC
messages.
Nova uses a central database that is (logically) shared
between all components.
A deployment sharding concept called cells is employed to
achieve horizontal scaling.

Nova - Components

• DB: SQL database for data storage.
• API: Component that receives HTTP
requests, converts commands, and
communicates with other components
via the oslo.messaging queue or
HTTP.
• Scheduler: Decides which host gets
each instance.
• Network: Manages IP forwarding,
bridges, and VLANs.
• Compute: Manages communication
with the hypervisor and virtual machines.
• Conductor: Handles requests that
need coordination (build/resize), acts
as a database proxy, or handles object
conversions.

Source: http://docs.openstack.org/developer/nova/architecture.html

Horizon – UI Service:

“Think simple” as my old master used to say - meaning
reduce the whole of its parts into the simplest terms, getting
back to first principles.
- Frank Lloyd Wright

It provides a modular web-based user interface for all the
OpenStack services.
The design accommodates third-party products and services,
such as billing, monitoring, and additional management tools.
With this web GUI, you can perform most operations on your
cloud, such as launching an instance, assigning IP addresses,
setting access controls, attaching volumes to VMs, and
maintenance.
The dashboard is also brandable for service providers and other
commercial vendors who want to make use of it.
Horizon – Values and
Architecture
Horizon holds several key values at the core of its design
and architecture:

Core Support: Out-of-the-box support for all core OpenStack projects.
– Ships with three central dashboards: a “User Dashboard”, a “System
Dashboard”, and a “Settings” dashboard.
– Ships with a set of API abstractions for the core OpenStack projects in order to
provide a consistent, stable set of reusable methods for developers.
Extensible: Anyone can add a new component as a “first-class
citizen”.
– A Horizon dashboard application is based around the Dashboard class, which
provides a consistent, extensible API and set of capabilities for both the core
OpenStack dashboard apps shipped with Horizon and third-party apps.
Usable: Providing an awesome interface that people want to use.

Horizon – Values and
Architecture
Manageable: The core codebase should be simple and easy to navigate.
– Within the application, a simple method is provided for registering a Panel
(a sub-navigation item); each panel contains the necessary logic (views, forms)
for that interface.
– This granular breakdown prevents files from becoming thousands of lines long
and makes code easy to find by correlating it directly to the navigation.
Consistent: Visual and interaction paradigms are maintained
throughout.
– Consistency can be maintained across applications by providing the necessary
core classes to build from, as well as a solid set of reusable templates and
additional tools (base form classes, base widget classes, template tags, and
perhaps even class-based views).
Stable: A reliable API with an emphasis on backwards-compatibility.
– By architecting around these core classes and reusable components, we create
an implicit contract that changes to these components will be made in the most
backwards-compatible ways whenever possible.

Storage Services:
Glance – Imaging Service:
– OpenStack Image (Glance) provides discovery,
registration, and delivery services for disk and server
images.
– The Image Service API provides a standard REST
interface for querying information about disk images
and lets clients stream the images to new servers.
– The disk images are most commonly used in
OpenStack Compute.
– Glance is built on the following guidelines:
• Component-based architecture: Quickly add new behaviors
• Highly available: Scale to very serious workloads
• Fault tolerant: Isolated processes avoid cascading failures
• Recoverable: Failures should be easy to diagnose, debug, and
rectify
• Open standards: Be a reference implementation for a community-driven
API
Glance - Architecture
• Has a client-server architecture that
provides a REST API through which users
send requests to the server.
• A Glance Domain Controller manages the
internal server operations, which are divided
into layers.
• All file (image data) operations are
performed using the glance_store library,
which is responsible for interaction with
external storage back ends and (or) local
file system(s).
• The glance_store library provides a
uniform interface to access the backend
stores.
• Glance uses a central database (Glance
DB) that is shared amongst all the
components in the system and is SQL-based
by default.

Source: http://docs.openstack.org/developer/glance/architecture.html

Storage Services continued…
Cinder – Block Storage Service:
– It provides persistent block-level storage devices for use with
OpenStack compute instances.
– Cinder can also be used independently of other OpenStack
services.
– Cinder permits organizations that deploy it to make available a
catalog of block-based storage devices with differing characteristics.
– Cinder also features basic storage capabilities such as snapshot
management and volume clones, which are often enhanced
through vendor-specific drivers.
– The physical storage media, whether disks or solid-state drives, can
be located within or directly attached to the Cinder server nodes, or
they can be located in external storage systems from third-party
vendors.
– Third-party storage vendors use Cinder's plug-in architecture to do
the necessary integration work.
– Similar to AWS EBS.

Storage Services continued…
Swift – Storage Service:
– A highly available, distributed, scalable and redundant
object storage system.
– Objects and files are written to multiple disk drives
spread throughout servers in the data center.
– Several companies provide commercial storage
services based on Swift, including KT,
Rackspace (from which Swift originated), and
Internap.
– It is also used internally at many large companies to
store their data.
– Similar to Amazon S3.

Swift – Architecture
– Proxy Server:
• Responsible for tying together the rest of the Swift architecture.
• For each request, it looks up the location of the account, container, or
object in the ring and routes the request accordingly.
• Also responsible for encoding and decoding object data.
• A large number of failures are also handled in the Proxy Server.
– For example, if a server is unavailable for an object PUT, it asks the ring for a
handoff server and routes there instead.
– The Ring:
• Represents a mapping between the names of entities stored on disk and
their physical location.
• Separate rings are maintained for accounts and containers, plus one object
ring per storage policy.
• Other components that need to perform any operation on an object,
container, or account interact with the appropriate ring to determine its
location in the cluster.
• The Ring maintains the mapping using zones, devices, partitions, and
replicas.
• Each partition in the ring is replicated, by default, 3 times across the cluster,
and the locations for a partition are stored in the mapping maintained by the
ring.
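
The idea of a ring (hash an entity's name to a partition, then look up the devices holding that partition's replicas) can be sketched in Python as follows. This is a simplified illustration, not Swift's ring implementation. The partition power, the device names, and the round-robin replica placement are assumptions; a real ring is built with zones and device weights taken into account.

```python
import hashlib
import struct

PART_POWER = 4   # assumption: 2**4 = 16 partitions; real clusters use far more
REPLICAS = 3     # default replica count mentioned in the slide
DEVICES = ["z1-sdb", "z2-sdb", "z3-sdb", "z4-sdb"]  # hypothetical devices, one per zone


def partition_for(account, container=None, obj=None):
    """Hash the entity path and keep the top PART_POWER bits as the partition."""
    path = "/" + "/".join(p for p in (account, container, obj) if p is not None)
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return struct.unpack_from(">I", digest)[0] >> (32 - PART_POWER)


def build_table():
    """Assign each partition's replicas to distinct devices (simple round-robin)."""
    parts = 2 ** PART_POWER
    return [[DEVICES[(p + r) % len(DEVICES)] for r in range(REPLICAS)]
            for p in range(parts)]


TABLE = build_table()


def devices_for(account, container=None, obj=None):
    """Look up which devices hold the replicas of this entity's partition."""
    return TABLE[partition_for(account, container, obj)]
```

A proxy-like caller would use `devices_for("AUTH_demo", "photos", "cat.jpg")` to decide where to route a request. The same name always maps to the same partition, so every component that consults the ring agrees on the location.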

Swift – Architecture continued…
– Storage Policies
• Storage Policies provide a way for object storage providers to
differentiate service levels, features and behaviors of a Swift
deployment.
• Each device in the system is assigned to one or more Storage Policies.
• Accomplished through the use of multiple object rings, where each
Storage Policy has an independent object ring, which may include a
subset of hardware implementing a particular differentiation.
– Object Server:
• The Object Server is a very simple blob storage server that can store,
retrieve and delete objects stored on local devices.
• Each object is stored using a path derived from the object name’s hash
and the operation’s timestamp.
– Container Server:
• Its primary job is to handle listings of objects.
– Replication:
• Designed to keep the system in a consistent state in the face of
temporary error conditions such as network outages or drive failures.
– Other components include updaters, auditors, and
reconstruction.

Keystone – Identity Service:

It provides authentication and authorization for all the
OpenStack services.
It supports multiple forms of authentication, including
standard username and password credentials, token-based
systems, and AWS-style (i.e. Amazon Web
Services) logins.
Users and third-party tools can programmatically determine
which resources they can access.
It can integrate with existing backend directory services like
LDAP (Lightweight Directory Access Protocol).

Neutron – Network Connectivity Service:
• Neutron is a system for managing networks and IP
addresses.
• It provides "network connectivity as a service" between
interface devices managed by other OpenStack services
(most likely Nova).
• It ensures the network is not a bottleneck or limiting factor
in a cloud deployment, and gives users self-service ability,
even over network configurations.
• It manages IP addresses, allowing for dedicated static IP
addresses or DHCP.
• The service works by allowing users to create their own
networks and then attach interfaces to them.
• OpenStack Network has a pluggable architecture to
support many popular networking vendors and
technologies.
Heat – OpenStack Orchestration:
• Used to create a human- and machine-accessible service for
managing the entire lifecycle of infrastructure and
applications within OpenStack clouds.
• It implements an orchestration engine to launch multiple
composite cloud applications based on templates in the
form of text files that can be treated like code.
• A Heat template describes the infrastructure for a cloud
application in a text file that is readable and writable by
humans.
• Infrastructure resources that can be described include:
servers, floating IPs, volumes, security groups, users, etc.

All Components Put Together:

Conceptual Architecture:

Two node architecture:

Two node architecture – Overview:
– Controller node:
• The basic controller node runs the Identity service, Image
Service, management portion of Compute, and the dashboard
necessary to launch a simple instance.
• It also includes supporting services such as a database,
message broker, and NTP.
• Optionally, the controller node also runs portions of Block
Storage, Object Storage, Database Service, Orchestration, and
Telemetry. These components provide additional features for
your environment.
– Compute node:
• Runs the hypervisor portion of Compute, which operates tenant
virtual machines or instances.
• By default, Compute uses KVM as the hypervisor. Compute also
provisions and operates tenant networks and implements security
groups.

Three node architecture

Three node architecture
Overview:
– Controller node:
• Runs the Identity service, Image Service, management portions of
Compute and Networking, Networking plug-in, and the dashboard.
• It also includes supporting services such as a database, message
broker, and Network Time Protocol (NTP).
• Optionally, the controller node also runs portions of Block Storage,
Object Storage, Database Service, Orchestration, and Telemetry.
These components provide additional features for your environment.
– Network node:
• Runs the Networking plug-in, layer 2 agent, and several layer 3
agents that provision and operate tenant networks.
• Layer 2 services include provisioning of virtual networks and tunnels.
Layer 3 services include routing, NAT, and DHCP.
• This node also handles external (internet) connectivity for tenant
virtual machines or instances.
– Compute node:
• Runs the hypervisor portion of Compute, which operates tenant virtual
machines or instances.
• By default Compute uses KVM as the hypervisor. The compute node
also runs the Networking plug-in and layer 2 agent which operate
tenant networks and implement security groups.
Bibliography
• Jayaswal K., Kallakurchi J., Houde D. J., and Shah D. Cloud Computing
Black Book. DreamTech Press; 2014.
• Recorded Lectures.
• http://docs.openstack.org/developer/openstack-projects.html
• https://en.wikipedia.org/wiki/OpenStack
• http://docs.openstack.org/liberty/install-guide-obs/overview.html
• https://www.ibm.com/blogs/cloud-computing/2014/07/openstack-in-a-day-for-under-20/
• http://www.unixarena.com/2015/08/openstack-architecture-and-components-overview.html
• http://openstack-admin-guide.readthedocs.io/zh_CN/latest/started/conceptual_architecture.html
• http://ken.pepple.info/openstack/2012/09/25/openstack-folsom-architecture/
• http://fosshelp.blogspot.in/2014/06/openstack-icehouse-multi-node-setup.html
• http://searchstorage.techtarget.com/definition/Cinder-OpenStack-Block-Storage
• http://docs.huihoo.com/openstack/docs.openstack.org/icehouse/install-guide/install/yum/content/ch_overview.html
Cloud Computing
BITS Pilani VM Provisioning and Migration
Hyderabad Campus
Objective

Virtual Machine Provisioning and Manageability
– VM Provisioning Process
VM Provisioning Using Templates
– Vagrant
Virtual Machine Migration Services
– Migration Techniques
SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 2
Virtual Machine Provisioning and
Manageability

Source: http://www.slideshare.net/mhajibaba/cloud-computing-principles-and-paradigms-5-virtual-machines-provisioning-and-migration-services

VM Provisioning Process
Provisioning a virtual machine or server can be explained and
illustrated as follows:

Source: http://docplayer.net/15384567-Cloud-computing-virtual-machines-provisioning-and-migration-services-mohamed-el-refaey.html

Steps to Provision VM

• Step 1: Select a server from a pool of available servers
(physical servers with enough capacity), along with the
appropriate OS template needed to provision the
virtual machine.
• Step 2: Load the appropriate software (the operating system
selected in the previous step, device drivers,
middleware, and the applications needed for the required
service).
• Step 3: Customize and configure the machine (e.g., IP
address, gateway), and configure the associated network
and storage resources.
• Step 4: Finally, the virtual server is ready to start with its
newly loaded software.
• These are the tasks performed by an
IT or data centre specialist to provision a particular
virtual machine.
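
The four steps above can be sketched as a small Python routine. This is a toy model for illustration, not the API of any virtualization platform. The `Host` and `VM` classes, the `provision` function, and all field names are hypothetical, and host capacity is reduced to free RAM for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    free_ram_gb: int


@dataclass
class VM:
    host: str
    template: str
    software: list = field(default_factory=list)
    network: dict = field(default_factory=dict)
    state: str = "defined"


def provision(pool, template, ram_gb, ip, gateway, packages):
    # Step 1: select a server with enough capacity, plus the OS template.
    host = next((h for h in pool if h.free_ram_gb >= ram_gb), None)
    if host is None:
        raise RuntimeError("no host with enough capacity")
    host.free_ram_gb -= ram_gb
    vm = VM(host=host.name, template=template)

    # Step 2: load the operating system from the template and the needed software.
    vm.software = [template, *packages]

    # Step 3: customize and configure the machine (IP address, gateway).
    vm.network = {"ip": ip, "gateway": gateway}

    # Step 4: the virtual server is ready to start.
    vm.state = "running"
    return vm
```

For example, with a pool of `Host("h1", 2)` and `Host("h2", 16)`, a request for 4 GB of RAM skips `h1` and lands on `h2`, whose free capacity then drops to 12 GB.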
Provisioning of VM (contd..)
• Server provisioning means defining a server’s configuration,
based on the organization’s requirements, in terms of its
hardware and software components (processor,
RAM, storage, networking, operating system,
applications, etc.).
• Virtual machines can be provisioned by manually
installing an operating system, by using a
preconfigured VM template, by cloning an existing
VM, or by importing a physical server or a virtual
server from another hosting platform.

Provisioning of VM (contd..)
• After a virtual machine has been created, either by virtualizing a
physical server or by building a new virtual server in
the virtual environment, a template can be created
from it.
• Most virtualization management vendors (VMware,
XenServer, etc.) give the data center’s
administrators the ability to perform such tasks
easily.
• Provisioning from a template is an invaluable
feature because it reduces the time required to
create a new virtual machine.

Provisioning of VM (contd..)
• Administrators can create different templates for different
purposes.
• For example, you can create a Windows Server 2003
template for the finance department, or a Red Hat Linux
template for the engineering department.
• This enables the administrator to quickly provision a
correctly configured virtual server/virtual machine on
demand.
• Examples of template-based provisioning tools:
– Vagrant, a provisioning tool driven by a Vagrantfile (template file).
– Heat, the orchestration tool of OpenStack (Heat templates in YAML
format).

VM Provisioning Using Vagrant
“Create and configure lightweight, reproducible, and
portable environments.”

What is Vagrant?
• A tool to build development environments based on
virtual machines.
• Focused on creating environments that are as similar as
possible, or identical, to production servers.
• Created by Mitchell Hashimoto and written in Ruby.
• Initially built on top of the VirtualBox API; today it also offers
VMware Fusion support (at $79 per license).

Use Cases of Vagrant

Virtualized development environment.
Built-in development environment sandboxing.
Multi-VM Host Environment
– Run a full production stack on a local machine for
testing
Package Virtual Environment
– Makes troubleshooting easier.
Exercise:
Create a virtual test lab on your machine using
Vagrant.

VM Migration Techniques
• Migration service, in the context of virtual machines, is the
process of moving a virtual machine from one host server
or storage location to another.
• Techniques of VM migration:
– Hot/Live Migration (real-time migration)
– Cold/Regular Migration
– Live Storage Migration
• All key machine components, such as CPU, storage
disks, networking, and memory, are completely
virtualized, so the entire state of a virtual
machine can be captured in a set of easily moved data
files.

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 11
Live Migration

 Can be defined as the movement of a virtual machine from
one physical host to another while it is powered on.
 When properly carried out, the process has no noticeable
effect from the end user's point of view (the switchover
takes a matter of milliseconds).
 Advantages:
• It facilitates proactive maintenance: when a failure is
anticipated, VMs can be moved off the host before service is disrupted.
• It can be used for load balancing.
 Examples:
• VMware vMotion
• Citrix XenServer XenMotion

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 12
Live Migration contd…
Assumptions:
 All storage resources are
separated from computing
resources.
 Storage devices of VMs
are attached over the network:
– NAS: NFS, CIFS
– SAN: Fibre Channel
– iSCSI, network block device
– DRBD network RAID
 Requires a high-quality
network connection
– Common L2 network (LAN)
– L3 re-routing

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 13
Live Migration contd…

Challenges of live migration:
 VMs have lots of state in memory
 Some VMs have soft real-time requirements:
– For example, web servers, databases, and game servers.
– Downtime needs to be minimized.
Relocation strategy:
1. Pre-migration process
2. Reservation process
3. Iterative pre-copy
4. Stop and copy
5. Commitment
6. Activation

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 14
Live Migration (Xen Hypervisor)
 Stage 0: Pre-Migration. An active virtual machine exists on the
physical host A
 Stage 1: Reservation. A request is issued to migrate an OS
from host A to host B (a precondition is that the necessary
resources exist on B, where a VM container of that size is reserved).
 Stage 2: Iterative Pre-Copy. During the first iteration, all pages
are transferred from A to B. Subsequent iterations copy only
those pages dirtied during the previous transfer phase.
 Stage 3: Stop-and-Copy. Running OS instance at A is
suspended, and its network traffic is redirected to B. CPU state
and any remaining inconsistent memory pages are then
transferred. At the end of this stage, there is a consistent
suspended copy of the VM at both A and B. The copy at A is
considered primary and is resumed in case of failure.

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 15
Live Migration (Xen Hypervisor)

 Stage 4: Commitment. Host B indicates to A that it has
successfully received a consistent OS image. Host A
acknowledges this message as a commitment of the
migration transaction. Host A may now discard the original
VM, and host B becomes the primary host.
 Stage 5: Activation. The migrated VM on B is now
activated. Post-migration code runs to reattach
device drivers to the new machine and advertise moved
IP addresses.
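
The iterative pre-copy and stop-and-copy stages can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Xen's actual algorithm: the class name, the parameters, and the rule that each round dirties a fixed percentage of the pages just sent are all hypothetical and chosen to keep the arithmetic easy to follow.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of iterative pre-copy; all names and numbers are illustrative assumptions. */
class PreCopySketch {

    /**
     * Returns the number of pages sent in each pre-copy round, followed by the
     * number of pages sent in the final stop-and-copy phase.
     *
     * dirtyPercent is a hypothetical knob: the share of the pages sent in one
     * round that the still-running VM dirties while that round is in flight.
     */
    static List<Integer> simulate(int totalPages, int dirtyPercent,
                                  int stopThreshold, int maxRounds) {
        List<Integer> sent = new ArrayList<>();
        int toSend = totalPages;                      // round 1 transfers every page
        for (int round = 1; round <= maxRounds && toSend > stopThreshold; round++) {
            sent.add(toSend);
            // Only pages dirtied during this round must be sent in the next one.
            toSend = (int) ((long) toSend * dirtyPercent / 100);
        }
        sent.add(toSend);                             // stop-and-copy: VM is suspended
        return sent;
    }

    public static void main(String[] args) {
        // 100,000 pages, 10% re-dirtied per round, suspend once 50 or fewer remain.
        System.out.println(simulate(100_000, 10, 50, 10));
        // prints [100000, 10000, 1000, 100, 10]
    }
}
```

The last entry is the amount of memory copied while the VM is suspended, which is what determines downtime. The maxRounds cap matters when the VM dirties pages as fast as they are sent, because the dirty set then never shrinks.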

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 16
Live Migration timeline

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 17
Regular/Cold Migration
 Cold migration is the migration of a powered-off
virtual machine.
 With cold migration, you have the option of
moving the associated disks from one data store
to another.
 The virtual machines are not required to be on a
shared storage.

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 22
Regular/Cold Migration
Cold Migration Process:
 The configuration files, including the NVRAM
file (BIOS settings), log files, as well as the disks of
the virtual machine, are moved from the source
host to the destination host’s associated storage
area.
 The virtual machine is registered with the new
host.
 After the migration is completed, the old version of
the virtual machine is deleted from the source
host.
 Example: VMware vSphere

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 23
Live Migration vs. Cold Migration

    Live Migration                          Cold Migration
1   Needs shared storage for the virtual    Does not require shared storage.
    machines in the server pool.
2   CPU compatibility checks are applied    No compatibility check is required.
    between the two hosts.

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 24
Live Storage Migration of Virtual
Machine
 Moves the virtual disks or configuration file
of a running virtual machine to a new data store without
any interruption in the availability of the virtual machine's
service. Example: VMware Storage vMotion.
 Migration of VM disk files within and across storage arrays
with no down time or disruption in service.
 Relocates VM disk files from one shared storage location
to another shared storage location with zero downtime,
continuous service availability and complete transaction
integrity.
 Benefits:
 Simplify storage array migration and storage upgrades.
 Dynamically optimize storage I/O performance.

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 25
How does it work?
 Completely transparent to the virtual machine or the end
user.
 Moves the “home directory” (configuration, swap and log
files) of the VM to the new location.
 Copies the contents of the entire VM storage disk file to
the destination storage host, leveraging “changed block
tracking” to maintain data integrity during the migration
process.
 The VM is quickly suspended and resumed so that it can
begin using the virtual machine home directory and disk
file on the destination data store location.
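
The role of changed block tracking can be shown with a toy model. This is a sketch under assumptions, not VMware's implementation: the class and method names are hypothetical, a disk is modelled as an int array, and guest writes are interleaved with the bulk copy by hand.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Toy model of a live disk copy with changed block tracking; names are hypothetical. */
class StorageMigrationSketch {
    private final int[] source;
    private final int[] destination;
    private final BitSet changed;   // blocks written since they were last copied

    StorageMigrationSketch(int[] sourceBlocks) {
        this.source = sourceBlocks.clone();
        this.destination = new int[sourceBlocks.length];
        this.changed = new BitSet(sourceBlocks.length);
    }

    /** A guest write during migration: update the source and remember the block. */
    void guestWrite(int block, int value) {
        source[block] = value;
        changed.set(block);
    }

    /** Bulk pass: copy one block while the VM keeps running. */
    void copyBlock(int block) {
        destination[block] = source[block];
        changed.clear(block);
    }

    /** Final pass, done while the VM is briefly suspended: re-copy changed blocks. */
    int resyncChangedBlocks() {
        int recopied = 0;
        for (int b = changed.nextSetBit(0); b >= 0; b = changed.nextSetBit(b + 1)) {
            destination[b] = source[b];
            recopied++;
        }
        changed.clear();
        return recopied;
    }

    boolean inSync() {
        return Arrays.equals(source, destination);
    }

    public static void main(String[] args) {
        StorageMigrationSketch m = new StorageMigrationSketch(new int[] {1, 2, 3, 4});
        m.copyBlock(0);
        m.copyBlock(1);
        m.guestWrite(0, 99);   // block 0 changes after it was already copied
        m.copyBlock(2);
        m.copyBlock(3);
        System.out.println("in sync before resync: " + m.inSync());
        System.out.println("blocks re-copied: " + m.resyncChangedBlocks());
        System.out.println("in sync after resync: " + m.inSync());
    }
}
```

Only the blocks flagged during the bulk pass are copied again, so the suspended window stays short even for a large disk.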

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 26
How does it work?

Source: http://www.suredatum.com/blog/oracle-licensing-the-vmotion-trap/

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 27
Migration of VMs to Alternate
Platforms
 A key advantage of virtualized data center technology is the
ability to migrate virtual machines from one platform to another.
 VMware Converter handles migrations between ESX
hosts, VMware Server, and VMware Workstation.
 VMware Converter can also import from other
virtualization platforms, such as Microsoft Virtual Server
machines.
 Example: VMware vCenter Converter

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 28
VM Provisioning and Migration in
Action
 ConVirt is an open-source framework for managing
open-source virtualization platforms such as Xen and KVM.
 With it, you can create and provision images, diagnose
performance problems, and balance load across the data
center.
 It covers the virtual machine lifecycle, including provisioning
and migration.
 https://www.convirture.com/products_opensource.php

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 29
Summary
Virtual Machine Provisioning and Manageability
– VM Provisioning Process
Virtual Machine Migration Services
– Migration techniques

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 30

SS ZG527 VM Provisioning and Migration Dr. S. Panda, CSIS, BITS Pilani, Hyderabad Campus 31
Cloud Computing
BITS Pilani PaaS
Hyderabad Campus
Agenda

o Introduction to PaaS
o Building blocks of PaaS
o Characteristics of PaaS
o Advantages and Risks
o PaaS Example – Windows Azure

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 2


Dependency on IaaS and PaaS

PaaS providers can assist developers from the conception of their
original ideas to the creation of applications, and through to testing and
deployment.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 3


Introduction to PaaS
• Platform as a Service, referred to as PaaS, is a category of
cloud computing that provides a platform and environment to
allow developers to build applications and services over the
internet.
• Platform as a Service allows users to create software
applications using tools supplied by the provider.
• PaaS services are hosted in the cloud and accessed by users
simply via their web browser.
• PaaS services can consist of preconfigured features that
customers can subscribe to; they can choose to include the
features that meet their requirements while discarding those
that do not.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 4


Characteristics of PaaS
• Services to develop, test, deploy, host and maintain applications in
the same integrated development environment, covering all the varying
services needed to fulfill the application development process

• Web-based user interface creation tools help to create, modify, test
and deploy different UI scenarios

• Multi-tenant architecture where multiple concurrent users utilize the
same development application

• Built-in scalability of deployed software, including load balancing and
failover

• Integration with web services and databases via common standards

• Support for development team collaboration; some PaaS solutions
include project planning and communication tools

• Tools to handle billing and subscription management

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 6


Advantages and Risks
Advantages
• Users don’t have to invest in physical infrastructure
• PaaS allows developers to frequently change or upgrade
operating system features. It also helps development teams
collaborate on projects.
• Makes development possible for ‘non-experts’
• Teams in various locations can work together
• Security is provided, including data security and backup and
recovery.
• Adaptability: features can be changed if circumstances
dictate that they should.
• Flexibility: customers can have control over the tools that are
installed within their platforms and can create a platform that
suits their specific requirements. They can ‘pick and choose’
the features they feel are necessary.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 8


Advantages and Risks
Risks
• Since users rely on a provider's infrastructure and software,
vendor lock-in can be an issue in PaaS environments.
• Other risks associated with PaaS are provider downtime or a
provider changing its development roadmap.
• If a provider stops supporting a certain programming
language, users may be forced to change their programming
language, or the provider itself. Both are difficult and
disruptive steps.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 9


PaaS Vendors
• Common PaaS vendors include Salesforce.com's
Force.com, which provides an enterprise customer
relationship management (CRM) platform.
• PaaS platforms for software development and
management include Appear IQ, Mendix, Amazon Web
Services (AWS) Elastic Beanstalk, Google App Engine
and Heroku.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 10


PaaS Example
• PaaS does not typically replace a business's entire
infrastructure. Instead, a business relies on PaaS providers
for key services, such as Java development or application
hosting.
• For example:
Deploying a typical business tool locally might require an IT
team to buy and install hardware, operating systems,
middleware (such as databases, Web servers and so on), and
the actual application; define user access or security; and
then add the application to existing systems management
or application performance monitoring (APM) tools. IT
teams must then maintain all of these resources over time.
• PaaS solution: A PaaS provider, however, supports all
the underlying computing and software; users only need to
log in and start using the platform, usually through a Web
browser interface.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 11


PaaS Example: Windows Azure

• Windows Azure is Microsoft's operating system for cloud computing.
• Windows Azure is intended to simplify IT management and minimize
up-front and ongoing expenses
• To this end, Azure was designed to facilitate the management of
scalable Web applications over the Internet.
• Windows Azure can be used to create, distribute and upgrade Web
applications without the need to maintain expensive, often
underutilized resources onsite.
• New Web services and applications can be written and debugged
with a minimum of overhead and personnel expense.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 12


PaaS Example: Windows Azure

• The Azure operating system is the central component of the
company's Azure Services Platform, which also includes
separate application, security, storage and virtualization
service layers and a desktop development environment.
• Windows Azure supports a wide variety of Microsoft and third-
party standards, protocols, programming languages and
platforms. Examples include XML (Extensible Markup
Language), REST (representational state transfer), SOAP
(Simple Object Access Protocol), Eclipse, Ruby, PHP and
Python.
• Although it faces steep competition from Amazon Web
Services (AWS), Microsoft Azure has managed to hold a
strong second place among cloud hosting platform
providers. http://azure.microsoft.com/en-us/

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 13




Windows Azure Runtime Environment
• The Windows Azure runtime environment provides a
scalable compute and storage hosting environment
along with management capabilities. It has three major
components: Compute, Storage and the Fabric
Controller

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 15


Windows Azure Runtime Environment
• The hosting environment of Azure is called the Fabric
Controller. It has a pool of individual systems connected on a
network and automatically manages resources by load
balancing and geo-replication. It manages the application
lifecycle without requiring the hosted apps to explicitly deal
with the scalability and availability requirements. Each
physical machine hosts an Azure agent that manages the
machine.
• The Azure Compute Service provides a Windows-based
environment to run applications written in the various
languages and technologies supported on the Windows
platform.
• The Windows Azure storage service provides scalable
storage, in multiple forms, for applications running on Windows
Azure. It stores binary and text data,
messages and structured data through features
called Blobs, Tables, Queues and Drives.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 16


Fabric Controller architecture

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 17


Windows Azure AppFabric

AppFabric allows on-premises
applications to interoperate
with applications hosted in
the cloud with secure
connectivity, messaging,
and identity management.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 18


Windows Azure AppFabric

It is a middleware platform that developers can use to bridge existing
applications/data to the cloud through secure, authenticated
connectivity across network boundaries.
Azure AppFabric consists of three main components:
– Service Bus provides secure messaging and connectivity
between cloud and on-premises applications and data.
– Access Control provides federated identity
management with standards-based identity providers,
including Microsoft's Active Directory, Yahoo!, Google and
Facebook. This gives users across these organizations a
single sign-on facility to access services hosted on Azure.
– Caching provides an in-memory, scalable, highly
available cache for application data.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 19


Azure Programming Model

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 20


Compute service with two Web roles
and two Worker roles

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 21


Azure Storage Services

• Blob service: For large binary and text data

• Azure Drives: To use as mounted file systems

• Table Service: For structured storage of non-relational data

• Queue Service: For message passing between components

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 28


Blob (Binary Large Object)
• It provides file-level storage and is similar to Amazon S3

• Applications deal with blobs as a whole, although they might
read/write parts of a blob

• Blobs are always stored under containers, which are similar
to AWS buckets

• Every storage account must have at least one container, and
containers can have blobs within them.

• Blob names can contain the directory separator
character ("/"); this lets developers create
hierarchical "file systems" similar to those on disks

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 29


Blob (Binary Large Object)
The blob service defines 3 kinds of blobs to store text and
binary data:
• Block blobs: used to store ordinary files up to 4.7 TB;
optimized for streaming. This was
the only blob type available in versions prior to 2009.
• Page blobs: used to hold random-access files up to 8 TB;
optimized for random read/write operations,
with the ability to write to a range of bytes
in a blob. Page blobs were introduced after 2009.
• Append blobs: optimized for append
operations only. These are used for things like logging
information to the same blob from multiple virtual
machines. Append blobs were introduced in 2015.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 30


Blob (Binary Large Object)

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 31


Azure Drive
• Used to mount an NTFS volume that an application can
access; similar to Amazon EBS.

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 32


Table Service
• Table service:
• Structured forms of storage using key-value pairs
• Uses a NoSQL model based on key-value pairs for querying
structured data that is not in a typical database
• These tables are not relational in nature, nor are table schemas
enforced by the Azure framework
• Data stored in Azure tables is partitioned horizontally and
distributed across storage nodes for optimized access.
• Every table has a property called the Partition Key, which defines
how data in the table is partitioned across storage nodes: rows
that have the same partition key are stored in the same partition.
• In addition, tables can also define Row Keys which are unique
within a partition and optimize access to a row within a partition.
• The pair {partition key, row key} uniquely identifies a row in a
table
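
The addressing scheme above can be sketched as a two-level map. This is an in-memory illustration only, not the Azure Table API: the class, the method names, and the sample keys are hypothetical, and each row is a schemaless map of properties to mirror the point that table schemas are not enforced.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of a table addressed by {partition key, row key}. */
class TableSketch {
    // partition key -> (row key -> schemaless set of properties)
    private final Map<String, TreeMap<String, Map<String, Object>>> partitions =
            new HashMap<>();

    /** Inserts or replaces the row identified by the given key pair. */
    void upsert(String partitionKey, String rowKey, Map<String, Object> properties) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>())
                  .put(rowKey, new HashMap<>(properties));
    }

    /** Point lookup: the pair {partition key, row key} identifies at most one row. */
    Map<String, Object> get(String partitionKey, String rowKey) {
        TreeMap<String, Map<String, Object>> rows = partitions.get(partitionKey);
        return rows == null ? null : rows.get(rowKey);
    }

    /** Rows sharing a partition key live together in one partition. */
    int rowsInPartition(String partitionKey) {
        TreeMap<String, Map<String, Object>> rows = partitions.get(partitionKey);
        return rows == null ? 0 : rows.size();
    }

    public static void main(String[] args) {
        TableSketch customers = new TableSketch();
        // Rows in the same table may carry different properties.
        customers.upsert("Hyderabad", "cust-001", Map.of("name", "Asha", "tier", "gold"));
        customers.upsert("Hyderabad", "cust-002", Map.of("name", "Ravi"));
        customers.upsert("Pilani", "cust-001", Map.of("name", "Meera", "age", 31));
        System.out.println(customers.get("Pilani", "cust-001"));
        System.out.println(customers.rowsInPartition("Hyderabad"));
    }
}
```

The same row key ("cust-001") appears in two partitions without conflict, because uniqueness is only required within a partition.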

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 33


Table Service

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 34


Table Service

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 35


Queue Service
• Provides reliable message delivery within and between services.

• A storage account can have an unlimited number of queues, and
each queue can store an unlimited number of messages.

• Queues are used by Web roles and Worker roles to communicate
within an application, and by separate applications to communicate
with each other.
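
The Web role / Worker role hand-off can be sketched with a standard Java queue. This is a local illustration under assumptions, not the Azure Queue service API: a LinkedBlockingQueue stands in for the cloud queue, a thread stands in for the Worker role, and the STOP marker and message names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy model: a web role hands work to a worker role through a queue. */
class QueueSketch {
    private static final String STOP = "__stop__";   // hypothetical shutdown marker

    static List<String> run(List<String> requests) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new ArrayList<>();

        // Worker role: takes messages off the queue and processes them in order.
        Thread worker = new Thread(() -> {
            try {
                for (String msg = queue.take(); !msg.equals(STOP); msg = queue.take()) {
                    processed.add("done:" + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            // Web role: enqueues one message per request and moves on.
            for (String request : requests) {
                queue.put(request);
            }
            queue.put(STOP);
            worker.join();   // after join(), the worker's results are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("resize-image-1", "resize-image-2")));
        // prints [done:resize-image-1, done:resize-image-2]
    }
}
```

The producer never calls the consumer directly, so either side can be scaled or restarted independently, which is the main reason for putting a queue between roles.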

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 36


5 Principles of Good UI by AWS

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 37


5 Principles of Good UI by AWS

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 38


THANK YOU !

SS ZG527 Intro to PaaS Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, 39


Cloud Computing
BITS Pilani Google App Engine
Hyderabad Campus
Objective

o Introduction to GAE
o Why Google App Engine ?
o Scalability
o Development Life Cycle of GAE
o GAE Services
o Programming Languages Supported
o GAE Example In JAVA using Eclipse

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 2
Google App Engine
GAE is part of Google Cloud and is a Platform as a Service
(PaaS) offering.
Use Google's infrastructure to build and host your web
applications.

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 3
Introduction to Google App Engine
• Google App Engine is a PaaS solution that enables users to
host their own applications on the same Google data center
infrastructure as Google Docs, Google Maps and other popular
Google services.

• It enables users to develop and host applications written using
Java, Python, Go, JRuby, JavaScript (Rhino), Scala, etc.

• The applications hosted on Google App Engine can scale
both in compute and storage just like other Google
products

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 4
Why Google App Engine ?

• Automatic scaling and load balancing
• Lower total cost of ownership
• Web administration console & diagnostic utilities
• Eases development & deployment of web
applications
• Multilanguage support (Java, Python, Go, PHP)
• Fully featured SDK for local development
• Rich set of Google APIs
• Secure environment (Sandbox)

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 5
Why Google App Engine ?

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 6
Scalability

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 7
Life of Request

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 8
Scalability

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 9
Application Life Cycle

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 10
Development Life Cycle

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 11
What does GAE Provide?

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 12
GAE Services

App Engine also provides a variety of services to perform common
operations when managing your application.
• URL Fetch: Lets the application access resources on
the internet, such as web services or data.
• Mail: Lets the application send e-mail messages using
Google infrastructure.
• Memcache:
• High-performance in-memory key-value storage.
• Can be used to store temporary data which doesn’t need to be
persisted.
• Images: manipulate images (resize, rotate, flip, crop)
• XMPP: instant messaging service
• Task Queue: message queue; allows integration with non-GAPPs
• Datastore: managing data objects
• Blobstore: large files, much larger than objects in Datastore;
use <key, object> to access
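
A common way to combine a temporary cache with persistent storage is the cache-aside pattern, sketched below. This is an assumption-laden illustration and does not use the App Engine Memcache or Datastore APIs: a HashMap stands in for Memcache, a caller-supplied function stands in for a slower Datastore read, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Cache-aside lookup; a HashMap stands in for Memcache (an assumption). */
class CacheAsideSketch {
    private final Map<String, String> cache = new HashMap<>();   // stand-in for Memcache
    private final Function<String, String> slowLoader;           // stand-in for a Datastore read
    private int loaderCalls = 0;

    CacheAsideSketch(Function<String, String> slowLoader) {
        this.slowLoader = slowLoader;
    }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;               // cache hit: no trip to persistent storage
        }
        loaderCalls++;
        String value = slowLoader.apply(key);
        cache.put(key, value);           // temporary copy; losing it is harmless
        return value;
    }

    /** Simulates the cache dropping an entry, which a real cache may do at any time. */
    void evict(String key) {
        cache.remove(key);
    }

    int loaderCalls() {
        return loaderCalls;
    }

    public static void main(String[] args) {
        CacheAsideSketch profiles = new CacheAsideSketch(id -> "profile-of-" + id);
        profiles.get("u1");
        profiles.get("u1");
        System.out.println("loader calls after two reads: " + profiles.loaderCalls());
        // prints loader calls after two reads: 1
    }
}
```

Because the authoritative copy stays in persistent storage, an eviction only costs one extra load; this is why Memcache suits data that does not need to be persisted.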

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 13
Limitations with free account on GAE

• FREE: All applications have a default quota
configuration, the "free quotas", which should allow for
roughly 5 million page views a month for an efficient
application. Each Google account can have up to 10 applications.

• PAY FOR MORE: As your application grows, it may need
a higher resource allocation than the default quota
configuration provides. You can purchase additional
computing resources by enabling billing for your
application. Billing enables developers to raise the limits
on all system resources and pay for even higher limits
on CPU, bandwidth, storage, and email usage

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 14
Programming Language Support

Java:
• App Engine runs Java apps on a Java 7 virtual
machine (Java 6 is currently supported as well).

• Uses the Java Servlet standard for web applications:
• WAR (Web Application ARchive) directory structure.
• Servlet classes
• Java Server Pages (JSP)
• Static and data files
• Deployment descriptor (web.xml)
• Other configuration files
• Getting started:
– https://developers.google.com/appengine/docs/java/gettingstarted/

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 15
Programming Language Support

Python:
• Uses WSGI (Web Server Gateway Interface) standard.
• Python applications can be written using:
• Webapp2 framework
• Django framework
• Any python code that uses the CGI (Common
Gateway Interface) standard.
• Getting started :
– https://developers.google.com/appengine/docs/python/gettingstartedpython27/

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 16
Programming Language Support

PHP (Experimental support):


• Local development servers are available to anyone for
developing and testing local applications.
• Only whitelisted applications can be deployed on
Google App Engine.
(https://gaeforphp.appspot.com/).
• Getting started:
https://developers.google.com/appengine/docs/php

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 17
Programming Language Support

Google’s Go:
• Go is Google's open-source programming
language.
• Tightly coupled with Google App Engine.
• Applications can be written using App Engine's Go SDK.
• Getting started:
https://developers.google.com/appengine/docs/go/overview

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 18
GAE Java Example using Eclipse

• Developing and deploying an app on Google App Engine
• Download and install Java EE

• Add plug-ins: Google plugin for eclipse(SDK)

• Create a new "Web Application Project"

• Configure the application

• Develop code

• Test in simulated App Engine environment

• Deploy to Google App Engine

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 19
GAE Java Example using Eclipse

Tools used :
• JDK 1.6
• Eclipse 3.7 + Google Plugin for Eclipse
• Google App Engine Java SDK 1.6.3.1

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 20
GAE Java Example using Eclipse

1. Create the GAE Project


• Once you have installed the GAE Plugin for Eclipse, a new icon (a blue
“g”) will be shown in the tool bar. Click it and a menu will be displayed.
• Select New Web Application Project.

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 21
GAE Java Example using Eclipse

1. Create the GAE Project


• A wizard appears; enter the information about your
project.

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 22
GAE Java Example using Eclipse

2. Configure
• The created project will have this structure.

The structure of the project is like a typical
Web project with some extra libraries and
an appengine-web.xml file.

As you can see, a Servlet is created. This
Servlet holds the logic for handling the
incoming requests to your application.

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 23
GAE Java Example using Eclipse

3. Code
• In this example we will return Hello world from GAE.
• Open the HelloWorldGAEServlet Class and add the
following code
import java.io.IOException;
import javax.servlet.http.*;

@SuppressWarnings("serial")
public class HelloWorldGAEServlet extends HttpServlet {
    // Handles HTTP GET requests and replies with a plain-text greeting.
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("text/plain");
        resp.getWriter().println("Hello world from GAE");
    }
}

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 24
GAE Java Example using Eclipse

4. Test Your Application Locally


• Right click on HelloWorldGAEServlet.java->Run As->Web
Application.

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 25
GAE Java Example using Eclipse

4. Test Your Application Locally


• Open a browser and go to http://localhost:8888/. The main Google App
Engine screen comes up.
• If you click on HelloWorldGAE you will see the greeting from the servlet
you just modified.

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 26
GAE Java Example using Eclipse

5. Deploy to Google App Engine


• Create an account at https://appengine.google.com/ and register your
application. Google will only let you register 10 applications, so be careful!
• Modify appengine-web.xml with the ID of your registered
application.

<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
    <application>Your_application_ID</application>
    <version>1</version>
    <!-- Allows App Engine to send multiple requests to one instance in parallel: -->
    <threadsafe>true</threadsafe>

    <!-- Configure java.util.logging -->
    <system-properties>
        <property name="java.util.logging.config.file" value="WEB-INF/logging.properties"/>
    </system-properties>
</appengine-web-app>

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 27
GAE Java Example using Eclipse

5. Deploy to Google App Engine


• Once you save the changes, click on the g icon in the tool bar and
select Deploy to App Engine
• Then log in with your Google account and the deployment will start.
• If everything is fine, the hello world web application will be deployed to
this URL – http://Your_application_ID.appspot.com/

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 28
GAE Java Example using Eclipse

6. Administer the App via Web Console

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 29
THANK YOU !

SS ZG527 GAE Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 30
BITS Pilani presentation
Software as a Service (SaaS)
BITS Pilani
Hyderabad Campus
SaaS (Software as a Service)
- No Worries - It's a Service

SS ZG527 Intro to SaaS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
Objectives

 Dependency on IaaS and PaaS

 Introduction to SaaS

 Pros and Cons of SaaS model

When do you opt for SaaS?
For better understanding

• Imagine a system
 where you don't have to buy new hardware or update
software
 where you pay nothing or pay as much as you use
 where everything is done as a service: Infrastructure,
computing, storage and usage
 where you don't worry about your resources spent on
Infrastructure security and operational security
 where you cut your IT spending
 where you have freedom of usage from anywhere with
internet connectivity
 which is eco-friendly

Dependency on IaaS and PaaS

SaaS - Definition
• The most complete cloud computing service model is one in
which the computing hardware and software, as well as the
solution itself, are provided by a vendor as a complete service
offering.
• SaaS is a model where an application is hosted in a remote
data center and provided as a service to customers across the
internet.
• In short, in the SaaS model software is deployed as a hosted
service and accessed over the Internet, as opposed to “On
Premise.”
• In this model, the provider takes care of all software
development, maintenance and upgrades.
• Salesforce.com is a common and popular example of a CRM
SaaS application.
 The applications are accessible from various client
devices through a web browser.

http://cloudcomputingwire.com
Is it customizable?
• Many people believe that SaaS software is not
customizable, and in many SaaS applications this is indeed
the case
- e.g., user-centric applications such as an office suite.
• Many other SaaS solutions expose Application Programming
Interfaces (API) to developers to allow them to create
custom composite applications
- Salesforce.com, Quicken.com, etc.
• So, SaaS does not necessarily mean that the software is
static or monolithic. Customers can configure user-specific
application parameters and settings.

SaaS – How is it delivered

• Network-based access to and management of
commercially available (i.e., not custom) software
application delivery
- that typically is closer to a one-to-many model
(single instance, multi-tenant architecture)
- than to a one-to-one model, including
architecture, pricing, partnering, and
management characteristics.

• Software as a service, and not software as a
product, delivered to home consumers and to small,
medium and large businesses.

SaaS – How is it delivered (1)

Source: wiki
An Analogy
Traditional: Build Your Own
On-Demand Utility: Subscribe, Plug In, Pay-per-Use

SaaS characteristics
• The software is available over the Internet globally through a
browser on demand.
• The typical license is subscription-based or usage-based and
is billed on a recurring basis.
• The software and the service are monitored and maintained
by the vendor, regardless of where all the different software
components are running.
• Reduced distribution, maintenance costs, and minimal end-user
system costs generally make SaaS applications cheaper
to use than their shrink-wrapped versions.
SaaS characteristics (contd..)

• Such applications feature automated upgrades, updates,
and patch management and much faster rollout of changes.
• SaaS applications often have a much lower barrier to
entry than their locally installed competitors, a known
recurring cost, and they scale on demand.
• All users have the same version of the software, so
each user's software is compatible with another's.
• SaaS supports multiple users and provides a shared
data model through a single-instance, multi-tenancy
model.

SaaS - Pros
 No large upfront costs - usually free trials
 Anywhere, anytime, anyone - mobility
 Stay focused on business processes
 Change software to an Operating Expense instead of a
Capital Purchase, making better accounting and budgeting
sense.
 Create a consistent application environment for all users
 No concerns for cross platform support
 Easy Access
 Reduced piracy of your software
 Lower Cost
 For an affordable monthly subscription
 Implementation fees are significantly lower
 Continuous Technology Enhancements

SaaS - Cons
 Initial time needed for licensing and agreements
 Trust, or the lack thereof, is the number one factor blocking the
adoption of software as a service (SaaS).
 Centralized control
 Possible erosion of customer privacy
 Absence of disconnected use
 Not suited to high volume data entry
 Broadband risk

SaaS Advantages

SaaS User Benefits

• Lower Cost of Ownership
- The software is paid for as it is consumed, with no large upfront cost for
a software license.
- Salesforce.com has a best-of-breed CRM system for $59.00 per
user per month, with no upfront cost.
- Since no hardware infrastructure, installation, maintenance, or
administration is needed, budgeting is easy.
- The software is available immediately upon purchase.

• Focus on Core Competency
- The IT savings on capital and effort allow the customer
to remain focused on their core competency and utilize
resources in more strategic areas.

SaaS User Benefits

• Access Anywhere
- Users can use their applications and access their data
anywhere, with an Internet connection and a computing device.
- This enhances the customer experience of the software and
makes it easier for users to get work done fast.
• Freedom to Choose (or Better Software)
- The pay-as-you-go (PAYG) nature of SaaS enables users to
select applications they wish to use and to stop using those
that no longer meet their needs.
- Ultimately, this freedom leads to better software applications
because vendors must be receptive to customer needs and
wants.

SaaS User Benefits

• New Application Types
- Since the barrier to use the software for the first time is low, it is
now feasible to develop applications that may have an occasional
use model.
• This would be impossible in the perpetual license model.
- If a high upfront cost were required the number of participants
would be much smaller.
• Faster Product Cycles
- Product releases are much more frequent, but contain fewer new
features than the typical releases in the perpetual license model
because the developer knows the environment in which the
software needs to run.
- This new process gets bug fixes out faster and allows users to digest
new features in smaller bites, which ultimately makes the users
more productive than they were under the previous model.

SaaS Vendor Benefits
• Increased Total Available Market
- Lower upfront costs and reduced infrastructure capital translate into
a much larger available market for the software vendor,
- because users that previously could not afford the software license or
lacked the skill to support the necessary infrastructure are potential
customers.
- A related benefit is that the decision maker for the purchase of a
SaaS application will be at a department level rather than the
enterprise level that is typical for the perpetual license model.
- This results in shorter sales cycles.
• Enhanced Competitive Differentiation
- The ability to deliver applications via the SaaS model enhances a
software company’s competitive differentiation.
- It also creates opportunities for new companies to compete effectively
with larger vendors.
- On the other hand, software companies will face ever-increasing
pressure from their competitors to move to the SaaS model.
- Those who lag behind will find it difficult to catch up as the software
industry continues to rapidly evolve.

SaaS Vendor Benefits

• Lower Development Costs & Quicker Time-to-Market
- The main saving is in testing (35%).
• Small and frequent releases: less to test.
• The application is developed to be deployed on a specific
hardware infrastructure, so there are far fewer possible
environments: less to test.
• This, in turn, provides the software developer with overall
lower development costs and quicker time-to-market.
• Effective Low Cost Marketing
- Between 1995 and today, buyers’ habits shifted from an
outbound world driven by field sales and print advertising to
an inbound world driven by Internet search.

SaaS Vendor Benefits

• Predictable Recurring Revenue
- Traditionally, software companies rely on one major release
every 12-18 months to fuel a revenue stream from the sale of
upgrades (long tail theory).
- In the SaaS model, the revenue is typically in the form of
Monthly Recurring Revenue (MRR).
• Improved Customer Relationships
- SaaS contributes to improved relationships between vendors
and customers.
• Protection of IP
- Difficult to obtain illegal copies.
- Price is low, making/getting illegal copies is totally
unnecessary.
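
The Monthly Recurring Revenue (MRR) idea can be illustrated with simple arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, the price of $59 per user per month is the Salesforce.com figure quoted on the User Benefits slide, and the user count of 100 is an assumed number.

```java
// Hypothetical illustration of Monthly Recurring Revenue (MRR) arithmetic.
// Amounts are kept in cents to avoid floating-point rounding.
final class RecurringRevenue {

    // MRR = number of subscribed users x monthly price per user.
    static long monthlyRecurringRevenueCents(int users, long pricePerUserCents) {
        return (long) users * pricePerUserCents;
    }

    // Revenue over a year if the MRR stays constant for 12 months.
    static long annualRunRateCents(long mrrCents) {
        return mrrCents * 12;
    }

    public static void main(String[] args) {
        // Assumed example: 100 users at $59.00 per user per month.
        long mrr = monthlyRecurringRevenueCents(100, 5900);
        System.out.println("MRR (cents): " + mrr);
        System.out.println("Annual run rate (cents): " + annualRunRateCents(mrr));
    }
}
```

Under these assumptions the vendor can forecast revenue from the subscriber count alone, which is what makes the revenue stream predictable compared with upgrade sales tied to a major release.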

Summary:

 Introduction to SaaS
 Pros and Cons of SaaS model

BITS Pilani presentation
SaaS Architecture
BITS Pilani
Hyderabad Campus
Objectives

 SaaS Architecture
 Applications of SaaS
 Traditional packaged Software Vs SaaS
 Examples of SaaS
 Case study

SS ZG527 SaaS Architecture Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
SaaS – Architecture
(Introduction)
Run by:
• Bandwidth technologies
- The cost of a PC has been reduced
significantly with more powerful computing.
- But the cost of application Software has not
followed.
• A normal scenario would require time-consuming and
expensive setup and maintenance.
• Licensing issues for business are contributing
significantly to the use of illegal software and
piracy.
SaaS Application Architecture (1)
- Scalable
- Multitenant efficient
- Configurable
• Scaling the application
• Maximizing concurrency, and using application
resources more efficiently.
• i.e. optimizing locking duration, statelessness,
sharing pooled resources such as threads and
network connections, caching reference data, and
partitioning large databases.

SaaS Application Architecture (2)
Multi-tenancy:
• An important architectural shift from designing
isolated, single-tenant applications.
• One application instance must be able to
accommodate users from multiple
companies at the same time.
• All of this must be transparent to the users.
• This requires an architecture that maximizes the
sharing of resources across tenants
while still being able to differentiate data belonging to
different customers.
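
A minimal sketch of the multi-tenancy idea, assuming a single shared in-memory store in which every record is keyed by a tenant ID. The names `TenantStore`, `put` and `get` are hypothetical and are not taken from any SaaS product; the point is only that one application instance shares its resources while keeping each tenant's data separate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: one shared store serving many tenants.
final class TenantStore {

    // A single shared structure; the outer key is the tenant ID,
    // so each tenant gets its own partition inside the same instance.
    private final Map<String, Map<String, String>> dataByTenant = new HashMap<>();

    // Store a value in the partition that belongs to the given tenant.
    void put(String tenantId, String key, String value) {
        dataByTenant.computeIfAbsent(tenantId, id -> new HashMap<>()).put(key, value);
    }

    // Read a value; a lookup only ever touches the caller's own partition.
    // Returns null if the tenant or the key is unknown.
    String get(String tenantId, String key) {
        Map<String, String> partition = dataByTenant.get(tenantId);
        return partition == null ? null : partition.get(key);
    }
}
```

Two tenants can store a value under the same key (for example `contact`) without seeing each other's data, because every read and write is scoped by the tenant ID.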
SaaS Application Architecture (3)
Configurable:
• A single application instance on a single server has to
accommodate users from several different companies at
once.
• To customize the application for one customer will change
the application for other customers as well.
• Traditionally customizing an application would mean code
changes for individual customer.
• Each customer uses metadata to configure the way the
application appears and behaves for its users.
• Configuring the application must be simple and easy for
customers, without incurring extra development or operation costs.
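
Metadata-driven configuration can be sketched as follows. This is an assumed, simplified model: the class `TenantConfig`, the setting names `theme` and `currency`, and their default values are all hypothetical. Each tenant overrides shared defaults through metadata, so no code change is needed for an individual customer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: per-tenant metadata overrides shared defaults.
final class TenantConfig {

    // Defaults shared by every tenant of the single application instance.
    private final Map<String, String> defaults = new HashMap<>();

    // Per-tenant metadata: tenant ID -> (setting -> value).
    private final Map<String, Map<String, String>> overrides = new HashMap<>();

    TenantConfig() {
        defaults.put("theme", "standard");
        defaults.put("currency", "USD");
    }

    // A tenant changes only its own metadata, never the shared code or defaults.
    void set(String tenantId, String setting, String value) {
        overrides.computeIfAbsent(tenantId, id -> new HashMap<>()).put(setting, value);
    }

    // Resolve a setting: the tenant's override if present, else the shared default.
    String get(String tenantId, String setting) {
        Map<String, String> own = overrides.get(tenantId);
        if (own != null && own.containsKey(setting)) {
            return own.get(setting);
        }
        return defaults.get(setting);
    }
}
```

Because a tenant's settings live in its own metadata, customizing the application for one customer leaves the behaviour seen by the other customers unchanged.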

SaaS Models

Example SaaS applications

 Salesforce.com

 Google Apps
 Gmail, Google Groups, Google Calendar, Talk, Docs, etc
 Google Apps Marketplace (Google apps for both free and for a fee)

 Microsoft Office 365


 Office 365 is a subscription-based online office and software plus
services suite which offers access to various services and software built
around the Microsoft Office platform

SaaS examples

Which applications are suitable?

• Any application can be deployed in this way. However,
communications over the Internet are not as fast as local
connections, so leave any high-volume data entry
applications on your internal LAN or WAN. All the rest
can go on the Internet under a SaaS approach.

Myths
• SaaS is still relatively new and untested.
• SaaS is just another version of the failed
Application Service Provider (ASP) and hosting
models of the past and will suffer the same fate
as its predecessors.
• SaaS only relieves companies of the upfront costs
of traditional software licenses.
• SaaS is only for small and mid-sized businesses
and will not be accepted by large-scale
organizations.
Traditional Web Application Architecture

Web Application Hosting on AWS

Multi-Tenancy in SaaS

3-Tier auto-scalable Web Application Architecture

ASP vs SaaS

ASP:
• Ownership: single tenant, with client-server architecture.
• Infrastructure: non-virtualized environment with direct attached
storage dedicated to the application.
• Not originally written to be Web-based and used over the
internet; hence, there is performance degradation.

SaaS:
• Ownership: multi-tenant, application hosted by the application
developer.
• Infrastructure: shared; virtualized servers, network and storage
systems form a resource pool.
• Built to be Web-based and used over the public internet.

Myths (contd..)

• SaaS only applies to applications such as CRM and
sales force automation.
• SaaS will only have a minor impact on the software
industry and will fade over time.
• It will be easy for the established software vendors to
offer SaaS and dominate this market.
• SaaS is only for corporate users.

Case study
Cloud computing for education

• Cloud computing can help in improving the reach of
quality online education:
- Cloud-based collaboration applications (e.g., online forums) can
help students discuss common problems and seek guidance
from experts.
- Universities, colleges and schools can use cloud-based information
management systems to:
  - Handle admissions
  - Improve administrative efficiency
  - Offer online and distance education programs
  - Conduct online exams, etc.
- Cloud-based online learning systems can provide access to high
quality educational material to students.
- Overall, cloud-based systems can help in cutting down the IT
infrastructure costs.
Case study (contd..)
Cloud computing for education

Fig. A generic use case of cloud for education


Applicability – Scenario 1
 Single-User software application
– Organize personal information
– Run on users’ own local computer
– Serves only one user at a time
– Inapplicable to SaaS model
• Data security issue
• Network performance issue
– Example: Microsoft Office suite

Applicability – Scenario 2
 Infrastructure Software
– Serves as the foundation for most other
enterprise software applications.
– Inapplicable to SaaS model
• Installation locally is required
• Forms the basis to run other applications
– Example: Windows XP, Oracle database

Applicability – Scenario 3

 Embedded Software
• Software components for embedded systems.
• Supports the functionality of the hardware device
• Inapplicable to SaaS model
- Embedded software and hardware are combined
and inseparable.
- Example: software embedded in ATMs,
cell phones, routers, medical equipment, etc.

Applicability – Scenario 4
 Enterprise Software Application
– Performs business functions
– Organize internal and external information
– Share data among internal and external users
– The most standard type of software applicable to
SaaS model
– Example: Salesforce.com CRM application, Siebel
On-demand application.

SaaS Example- Zoho Doc Writer

Case study- DOCUMENT SERVICES:
GOOGLE DOCS

• Cloud computing SaaS service designed to enable
uploading and sharing of documents as a persistent
repository of information.
• All the features of Google Docs can be accessed using a
Portal.
• The developer APIs enable the usage of
this cloud application from within other applications.
• Many such cloud services also exist, e.g.
www.dropbox.com, www.slideshare.com,
www.scribd.com

Google Docs Portal
Google Docs APIs

• Provides APIs that allow users to develop applications that
upload documents to the Google Docs service and share
documents.
• Google Data Protocol (GDP) provides a secure means for
new applications to let end users access and update the
data stored by many Google products.
• GDP uses HTTP GET and POST requests.
• Developers may use the protocol directly over HTTP, or
through the client libraries provided for the supported
programming languages.

Example:

• Example application that demonstrates several features
of document sharing <1. Book>.
• The application first uploads a document onto Google
Docs, and then shares the document to people on a
mailing list (Google groups id), which also sends an
email notifying those people about the document.

 Google Docs packages that one needs to import
are the following:
import com.google.common.*;
import com.google.gdata.util.*;
import com.google.gdata.client.uploader.*;
import com.google.gdata.data.docs.*;
import com.google.gdata.data.media.*;
import com.google.gdata.data.acl.*;

• A snippet of Java code that uploads a file without taking
care of upload errors is given below:

Java code that uploads a file to Google Docs
Embedding Google Docs in Other HTML Pages
 Consider a scenario where you have your own web page
and would like to embed Google Docs to use Google
Docs as a back-end store.
 Clicking on a link would display a document that is actually stored in
Google Docs.

Requirements:
• Google Docs APIs for upload
• The unique URL of the document to be inserted is
needed.
• To get this unique URL, the file needs to be published as
a web page.

HTML code similar to the following has to be inserted into your web page:

<iframe src="https://docs.google.com/document/d/1swzqklOR0jcVphTe0DBQ3NwNI8MDI17eB50aBYap3Kk/pub?embedded=true"></iframe>
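
If a page is generated on the server side, the same markup can be built from the ID of the published document. The helper below is a hypothetical sketch (the class and method names are not part of any Google API); it only reproduces the URL pattern shown above and assumes the caller passes a trusted ID that needs no escaping.

```java
// Hypothetical helper: builds the embed markup for a Google Docs document
// that has already been published to the web.
final class DocEmbed {

    // publishedDocId is the long identifier that appears between
    // "/document/d/" and "/pub" in the published URL.
    static String iframeFor(String publishedDocId) {
        return "<iframe src=\"https://docs.google.com/document/d/"
                + publishedDocId
                + "/pub?embedded=true\"></iframe>";
    }

    public static void main(String[] args) {
        // DOC_ID is a placeholder, not a real document.
        System.out.println(iframeFor("DOC_ID"));
    }
}
```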
Case study- Salesforce CRM

• Salesforce.com is a Customer Relationship
Management (CRM) solutions vendor

Salesforce benefits
Costs:
Receive a very reasonable non-profit rate ($300 per year per user license)

Marketing : Services
• Track all data related to a campaign (date, location, costs).
• Track email blasts internally and view an aggregated record of all
emails sent to a potential student
- Prevents excessive emails
• Track all potential lead data including survey data
• Calculate a Return On Investment
- Some marketing campaigns can be quite costly (print, radio etc.)
- Given the current times of economic hardship, it is crucial to
measure the effectiveness of marketing campaigns.
• Increase rate of conversion by capturing web-based inquiry data.
• Help keep our data clean with the integrated de-duplication tools.

Salesforce benefits

Email Blasts
Vertical Response
• Email Service Provider sends out mass emails to customers.
- This email provides basic contact information and a series of links,
including one back to a web-based inquiry form.
- Auto Response Rules:
• Example: creating an effective message back to the student
within minutes of clicking submit.
Survey Tools
Poltzer
• Integrated survey tool: great for creating basic surveys and having the
data automatically tracked in Salesforce.
- It can send a survey link out in your mass email and combine the two
great services.
• You can use your own external survey tool for more advanced surveys and
reporting, and import the data into Salesforce.

Marketing via Salesforce

1. Send email blasts / Online advertising
2. Include URL directing leads to online form
3. Online form collects data and tracks in Salesforce
4. Create an automated email notification which is immediately sent back to end user
5. Create an assignment rule that delegates a task & email notification to a co-worker to contact lead within minutes
SS ZG527 SaaS Architecture Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 4-Mar-17 40
Self Study Links

Google Docs:
Refer to the book Moving to the Cloud for:
• Printing the details of the uploaded file
• Handling errors while uploading
• Sharing the document with a mailing list

Salesforce CRM:
Refer to:
• Salesforce User Guide site: http://tinyurl.com/2ajcpgs
• Salesforce Blog: http://salesforceatrutgers-sci.blogspot.com

Summary: SaaS
• Software/Interface
- SaaS provides the users a complete software application or the
user interface to the application itself.
• Outsourced Management
- The cloud service provider manages the underlying cloud
infrastructure including servers, network, operating systems, storage
and application software,
- and the user is unaware of the underlying architecture of the cloud.
• Thin client interfaces
- Applications are provided to the user through a thin client interface (e.g.
a browser).
- SaaS applications are platform independent and can be accessed from
various client devices such as workstations, laptops, tablets and
smartphones, running on different operating systems.
• Ubiquitous Access
- Since the cloud service provider manages both the application and data,
the users are able to access the applications from anywhere.

Summary: SaaS
SaaS
Benefits
- Lower costs
- No infrastructure required
- Seamless upgrades
- Guaranteed performance
- Automated backups
- Easy data recovery
- Secure
- High adoption
- On-the-move access

Characteristics
- Multi-tenancy
- On-demand software
- Open integration protocols
- Social network integration

Adoption
- Individual users: High
- Small & medium enterprises: High
- Large organizations: High
- Government: Medium

Examples
- Google Apps
- Salesforce.com
- Facebook
- Zoho
- Dropbox
- Taleo
- Microsoft Office 365
- Linkedin
- Slideshare
- CareCloud

Bibliography

• Jayaswal K., Kallakurchi J., Houde D. J., and
Shah D. Cloud Computing Black Book.
DreamTech Press; 2014.
• Dinakar Sitaram and Geetha Manjunath,
Moving to the Cloud. Elsevier; 2012.
• Internet Resources.
• Recorded Lectures.

Cloud Computing
BITS Pilani Orchestration and Dockers
Hyderabad Campus

Cloud orchestration technologies


Background

• The three most common models of cloud services are
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)

BITS Pilani, Hyderabad Campus


Approach to Set Up an Environment Manually
• Wait for approval
• Buy the hardware
• Install the OS
• Connect to and configure the network
• Get an IP
• Allocate the storage
• Configure the security
• Deploy the database
• Connect to a back-end system
• Deploy the application on the server



Cloud Orchestration

• It is the end-to-end automation of the deployment of
services in a cloud environment.
• It is the automated arrangement, coordination, and
management of complex computer systems,
middleware, and services, all of which helps to
accelerate the delivery of IT services while reducing
costs.
• It is used to manage cloud infrastructure.



Aspects of Cloud Orchestration
• Resource orchestration, where resources are allocated
• Workload orchestration, where workloads are shared
between the resources
• Service orchestration, where services are deployed on
servers or cloud environments



Cloud Orchestration

• Cloud orchestration automates the services in all
types of clouds: public, private, and hybrid.



Why Orchestration

• As you saw in the previous section, the manual process
of setting up an environment involves multiple steps.
• Using an orchestrator tool, it's easy to quickly configure,
provision, deploy, and develop environments, integrate
service management, monitoring, backup, and security
services; all of these steps are repeatable.
• Orchestration enables you to make your products
available on a wider variety of cloud environments, which
enables users to deploy them more easily.
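
The idea of repeatable, automated steps can be sketched in a few lines of Java. This is a toy model only: the `Orchestrator` class and its methods are hypothetical and do not correspond to Chef, Puppet, Heat or any other tool named in these slides. It shows the manual checklist from the earlier slide expressed as an ordered list of actions that can be run again with the same result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy orchestrator: an ordered, repeatable list of setup steps.
final class Orchestrator {

    private final List<String> stepNames = new ArrayList<>();
    private final List<Runnable> stepActions = new ArrayList<>();

    // Register a named step; returns this so steps can be chained in order.
    Orchestrator step(String name, Runnable action) {
        stepNames.add(name);
        stepActions.add(action);
        return this;
    }

    // Execute every step in registration order and return the names of the
    // steps that ran, which serves as a simple execution log.
    List<String> run() {
        List<String> log = new ArrayList<>();
        for (int i = 0; i < stepNames.size(); i++) {
            stepActions.get(i).run();
            log.add(stepNames.get(i));
        }
        return log;
    }
}
```

A workflow would be declared once, for example `new Orchestrator().step("install OS", ...).step("configure network", ...).step("deploy application", ...)`, and then `run()` can be called for every new environment, which is what makes the process repeatable.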



Orchestration Tools

• Various tools and techniques are available for
orchestration, each of which is appropriate for specific
cases
• Chef and Puppet
• OpenStack
• Heat
• Juju
• Charms

• Docker



Chef and Puppet - Chef

• Chef is a powerful automation platform that transforms
complex infrastructures into code, bringing both servers
and services to life
• Chef uses cookbooks to determine how each node
should be configured.
• Cookbooks consist of multiple recipes
• A recipe is an automation script for a particular service that's written using the
Ruby language
• Chef client is an agent that runs on a node and
performs the actual tasks that configure it
• Chef can manage anything that can run the Chef client, like physical machines,
virtual machines, containers, or cloud-based instances
• The Chef server is the central repository for all
configuration data
Chef and Puppet - Chef (Contd.)

• The Chef client and Chef server communicate in a
secure manner using a combination of public and private
keys, which ensures that the Chef server responds only to
requests made by the chef-client.
• The chef-solo option installs a standalone client.



Chef and Puppet - Puppet

• Puppet is similar to Chef. It requires installation of a
master server and a client agent on target nodes, and
includes an option for a standalone client (equivalent to
chef-solo)
• Like Chef, Puppet comes with a paid Enterprise edition
that provides additional features like reporting and
orchestration/push deployment.
• Although Chef and Puppet perform the same basic
functions, they differ in their approach.
• Chef seems to be significantly more integrated and monolithic, whereas Puppet
consists of multiple services



Here's how the two platforms
stack up against one another
• Puppet is geared toward system admins who need to specify
configurations like dependencies, whereas Chef is for developers who
actually write the code for the deployment.
• Puppet depends much more on its own Domain Specific Language
(DSL) for defining the rules of configuration, whereas Chef's DSL is just
a supplement of Ruby, so most of Chef's recipes are written in standard
Ruby code.
• Chef has an omnibus (third-party) installer, which can make installation
much easier than that of Puppet.
• Chef is used mostly for OS-level automation, like deployment of
servers, patches, and fixes. Puppet is mostly for mid-level
automation, like installing databases and starting Apache.
• Chef seems to cater more toward developer-centric operations teams,
while Puppet is geared toward more traditional operations teams with
less Ruby programming experience.
• Once you get past its steep initial learning curve, Chef offers a lot more
power and flexibility than Puppet.



OpenStack

• OpenStack is a free, open source cloud computing
software platform that's primarily used as an
Infrastructure as a Service (IaaS) solution



OpenStack - Components

• Nova (compute)
• Cinder (block storage)
• Glance (image library)
• Swift (object storage)
• Neutron (network)
• Keystone (identity)
• Heat (orchestration tool)



Heat

• Heat is a pattern-based orchestration mechanism from
OpenStack that's known as the Orchestration for OpenStack
on OpenStack (OOO) project
• Heat provides a template-based orchestration for describing a
cloud application by executing appropriate OpenStack API
calls that generate running cloud applications
• The software integrates other core components of OpenStack
into a one-file template system
• The templates allow for the creation of most OpenStack
resource types (such as instances, floating IPs, volumes,
security groups, and users) as well as more advanced
functionality such as instance high availability, instance auto
scaling, and nested stacks.



How Heat Works



Heat Working

• Use Heat instead of writing a script that manages all of
the software in OpenStack (like setting up the servers,
adding volumes, managing networks, etc.)
• How?
• Create a Heat template that specifies what infrastructure you need.
• If changes to the existing service are required later on, modify the Heat
template and rerun it; the Heat engine then makes the necessary changes.
• When you are finished, you can clean up and release the resources so that they
can be used by anyone else who needs them.
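
The "modify the template and rerun it" step can be pictured as a comparison between what is running and what the template now asks for. The following is a minimal sketch of that idea only, not Heat's actual update logic; the function name, resource names, and properties are all hypothetical.

```python
def plan_update(current, desired):
    """Compare two {resource_name: properties} mappings and return the
    actions a declarative engine would need to take."""
    actions = []
    for name in sorted(desired.keys() - current.keys()):
        actions.append(("create", name))
    for name in sorted(current.keys() - desired.keys()):
        actions.append(("delete", name))
    for name in sorted(current.keys() & desired.keys()):
        if current[name] != desired[name]:
            actions.append(("update", name))
    return actions


# Hypothetical stack: the edited template adds a volume and resizes the server.
running = {"web_server": {"flavor": "m1.small"}, "net": {"cidr": "10.0.0.0/24"}}
edited = {
    "web_server": {"flavor": "m1.medium"},
    "net": {"cidr": "10.0.0.0/24"},
    "data_volume": {"size_gb": 10},
}
print(plan_update(running, edited))
```

Only the resources that differ are touched; the unchanged network is left alone, which is the practical benefit of describing the infrastructure declaratively.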



Heat Working (Contd.)
• As the previous figure shows, passing a Heat template
through the Heat engine creates a stack of the resources that are
specified in the Heat template.
• Heat sits on top of all the other OpenStack services in the
orchestration layer and talks to the APIs of all the other components.
• A Heat template generates a stack, which is the fundamental unit of
currency in Heat. You write a Heat template with a number of
resources in it, and each resource is an object in OpenStack with an
object ID. Heat creates those objects and keeps track of their IDs.
• You can also use a nested stack, which is a resource in a Heat
stack that points to another Heat stack. This is like a tree of stacks,
where the objects are related and their relationships can be inferred
from the Heat template. This nested feature enables independent
teams to work on Heat stacks and later merge them together.



Components of Heat & HOT

• The main component of Heat is the Heat engine, which
provides the orchestration functionality
• Heat Orchestration Templates (HOT) are native to
Heat and are expressed in YAML. These templates
consist of:
• Resources (mandatory) are the OpenStack objects that you need to
create, like server, volume, object storage, and network resources.
• Parameters (optional) denote the properties of the resources. Declaring
parameters can be more convenient than hard-coding the values.
• Outputs (optional) denote the output created after running the Heat template,
such as the IP address of the server.
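
To make the three sections concrete, here is a sketch of a small template. A real HOT file is written in YAML; the same structure is shown as a Python dict so that it can be checked in code. The resource name, parameter name, image, and flavor are hypothetical, and check_sections is an illustrative helper, not part of Heat.

```python
# A HOT template is normally a YAML file; this dict mirrors its layout.
template = {
    "heat_template_version": "2013-05-23",
    "parameters": {  # optional
        "image_name": {"type": "string", "default": "cirros"},
    },
    "resources": {  # mandatory
        "my_server": {
            "type": "OS::Nova::Server",
            "properties": {
                "image": {"get_param": "image_name"},
                "flavor": "m1.small",
            },
        },
    },
    "outputs": {  # optional
        "server_ip": {"value": {"get_attr": ["my_server", "first_address"]}},
    },
}


def check_sections(tpl):
    """Return a list of problems with the top-level layout of a template."""
    problems = []
    if not tpl.get("resources"):
        problems.append("missing mandatory 'resources' section")
    for res_name, res in tpl.get("resources", {}).items():
        for value in res.get("properties", {}).values():
            if isinstance(value, dict) and "get_param" in value:
                if value["get_param"] not in tpl.get("parameters", {}):
                    problems.append(
                        f"{res_name}: undeclared parameter {value['get_param']}"
                    )
    return problems


print(check_sections(template))
```

Declaring image_name as a parameter lets the same template be reused with a different image without editing the resource itself.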



Juju and Charm - Juju
• Juju is an open source automatic service orchestration management tool
developed by Canonical, the developers of the Ubuntu OS.
• It enables you to deploy, manage, and scale software and services on
a wide variety of cloud services and servers.
• Juju can significantly reduce the workload for deploying and configuring
a product's services.



Juju and Charm – Juju
Benefits
• Juju is the fastest way to model and deploy applications or solutions on all major
public clouds and containers.
• It helps to reduce deployment time from days to minutes.
• Juju works with existing configuration management tools, and can scale
workloads up or down very easily.
• No prior knowledge of the application stack is needed to deploy a Juju charm for
the product.
• Juju includes providers for all major public clouds, such as Amazon Web
Services, Azure, and HP as well as OpenStack, MAAS, and LXC containers.
• Juju can also be deployed on IBM SoftLayer using the manual provider
available in Juju, so anyone can use Juju with SoftLayer by provisioning the
machines manually and then telling Juju where those machines are.
• Local providers on LXC containers allow you to recreate a production-like
deployment environment on your own laptop.
• It also offers a quick and easy environment for testing deployments on a local
machine. Users can deploy entire cloud environments in seconds using bundles,
which can save a lot of time and effort.



Juju and Charm - Charms

• A charm is a set of scripts, which can be written in any language,
that run in response to certain events.
• Juju utilizes charms, which are open source tools that
simplify specific deployment and management tasks.
• After a service is deployed, Juju can define relationships between services and
expose some services to the outside world. Charms give Juju its power.
• They encapsulate application configurations, define how
services are deployed, how they connect to other services,
and how they are scaled.
• Charms define how services integrate, and how their service
units react to events in the distributed environment, as
orchestrated by Juju.
• Charms are easy to share, and there are hundreds of charms
already rated and reviewed in the Juju charm store



Juju and Charm -
Relationships
• Juju allows services to be instantly integrated via
relationships.
• Relationships allow the complexity of integrating services
to be abstracted from the user.
• Juju relationships are loosely typed definitions of how
services should interact with one another.
• These definitions are handled through an interface.
• Juju decides which services can be related based solely
on the interface names.
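
Matching on interface names alone can be sketched in a few lines. This is an illustration of the idea, not Juju's code: the charm metadata below is a hypothetical, simplified model in which each charm maps relation names to the interface it provides or requires.

```python
# Hypothetical charm metadata: relation name -> interface name.
charms = {
    "wordpress": {"requires": {"db": "mysql"}, "provides": {"website": "http"}},
    "mysql": {"requires": {}, "provides": {"db": "mysql"}},
    "haproxy": {"requires": {"reverseproxy": "http"}, "provides": {}},
}


def can_relate(requirer, provider):
    """Return the interface names on which requirer can be related to provider."""
    wanted = set(charms[requirer]["requires"].values())
    offered = set(charms[provider]["provides"].values())
    return sorted(wanted & offered)


print(can_relate("wordpress", "mysql"))
print(can_relate("haproxy", "wordpress"))
print(can_relate("mysql", "haproxy"))
```

Because only the names are compared, any charm that provides an interface with the matching name could stand in for another, which is what makes the definitions loosely typed.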



Juju- Charm - Features

• Some of the advanced features in Juju include:


• Juju Compose builds new charms from existing ones using a layering
approach, so that common tasks require much less rework. Features in
the lower layers are inherited by the new charms.
• Subordinate charms are related charms that can be grouped together
as subordinate or principal charms. A principal charm is the main charm
and a subordinate charm cannot stand alone, so it is deployed along
with the principal charm.
• Leadership hooks are automated mechanisms provided by Juju that
select a leader/master in a clustered environment.



Available tools

• CoreOS https://coreos.com/
• OpenShift https://www.openshift.com
• Docker https://www.docker.com/
• Kubernetes http://kubernetes.io/



Orchestration and Dockers
• The Challenge
• The Solution
• Why do Docker and containers matter?
• How do they work?



Market View: Evolution of IT



Challenges



Challenges



Results in an NXN Compatibility
Nightmare



Analogy: Cargo Transport Pre
- 1960



Also NXN Matrix



Solution: Intermodal Shipping
Container



Solution: This Eliminated the
NXN Problem



What is Docker?

• Docker is an application container technology.



Docker is a Shipping
Container System for Code



Or…..Simply



Docker Solves the NXN
Problem



Why Container Matters?



Docker

• Lightweight: containers running on a single machine
all share the same operating system kernel, so they start
instantly and make more efficient use of RAM. Images
are constructed from layered file systems so they can
share common files, which makes disk usage and image
downloads much more efficient.
• Open: Docker containers are based on open standards.
This allows them to run on all major Linux distributions
and Microsoft operating systems with support for every
infrastructure.
• Secure: containers isolate applications from each other
and the underlying infrastructure while providing an
added layer of protection for the application.
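
The saving from layered images can be shown with simple arithmetic. The sketch below is a toy model rather than Docker's storage driver: the image names, layer names, and sizes are made up, and a layer is treated as shared whenever two images list the same layer ID.

```python
# Hypothetical images, each an ordered list of (layer_id, size_mb) layers.
images = {
    "web": [("ubuntu-base", 120), ("python-runtime", 60), ("web-app", 15)],
    "worker": [("ubuntu-base", 120), ("python-runtime", 60), ("worker-app", 10)],
}


def disk_usage_mb(images):
    """Return (without_sharing, with_sharing) disk usage in MB."""
    naive = sum(size for layers in images.values() for _, size in layers)
    unique = {}
    for layers in images.values():
        for layer_id, size in layers:
            unique[layer_id] = size  # identical layers are stored only once
    return naive, sum(unique.values())


print(disk_usage_mb(images))
```

The same reasoning applies to downloads: a host that already holds the base layers only needs to fetch the small application layer of a new image.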



Docker: Container vs VM

• Each virtual machine includes the application, the
necessary binaries and libraries, and an entire guest
operating system, all of which may be tens of gigabytes
in size.
• Containers include the application and all of its
dependencies, but share the kernel with other containers
• They run as an isolated process in user space on the host operating system.
• They're also not tied to any specific infrastructure: Docker containers run on any
computer, on any infrastructure, and in any cloud.
• Key difference - while the hypervisor abstracts an entire
device, containers just abstract the operating system
kernel.
• This means that one thing hypervisors can do that containers can't is use
different operating systems or kernels



Docker- Container vs VM



Docker – Container vs VM



Why Docker is Lightweight?



Docker Benefits
• Faster delivery of applications: Docker helps throughout the development lifecycle. It
allows you to develop on local containers that contain your apps and services.
• Deploy and scale more easily: Docker's container-based platform allows for highly portable
workloads. They can run on a developer's local host, on physical or virtual machines in a data
center, or in the cloud. You can use Docker to quickly scale apps and services up or down.
• Achieve higher density and run more workloads: Docker is lightweight and fast. It provides a
viable, cost-effective alternative to hypervisor-based VMs. This is especially useful in high-density
environments, such as building your own cloud or Platform as a Service. However, it is also useful
for small- and medium-sized deployments where you want to get more out of the resources you
have.
• Eliminate environmental inconsistencies: By packaging the application together with its configs
and dependencies and shipping it as a container, the app will always work as designed,
whether locally or on another machine.
• Empower developer creativity: The isolation capabilities of Docker containers free
developers from having to use approved language stacks and tooling. Developers can use the
best language and tools for their application service without having to worry about causing
conflicts.
• Accelerate developer onboarding: Stop wasting hours trying to set up developer environments,
spin up new instances, and make copies of production code to run locally. With Docker, you can
easily take copies of your live environment and run them on any new endpoint that's running
Docker.



Dockerfile and Docker Hub

• A Dockerfile is a text document that contains all the
commands a user could call on the command line to
assemble an image. Using docker build, you can create
an automated build that executes several command-line
instructions in succession. Docker can build images
automatically by reading the instructions from a
Dockerfile.
• Docker Hub is a cloud-hosted service from Docker that
provides registry capabilities for public and private
content. It makes it easier for you to collaborate with the
broader Docker community or with your own team on
key content, or automate your application by building
workflows.



Docker Container Life Cycle



Docker File



Docker installation in Linux
1) sudo apt-get install apt-transport-https ca-certificates curl software-properties-common

2) curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

3) sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable"

4) sudo apt-get update

5) sudo apt-cache search docker-ce

6) sudo apt-get install docker-ce
Basic Docker commands
docker run – Runs a command in a new container.
docker start – Starts one or more stopped containers
docker stop – Stops one or more running containers
docker build – Builds an image from a Dockerfile
docker pull – Pulls an image or a repository from a registry
docker push – Pushes an image or a repository to a registry
docker export – Exports a container’s filesystem as a tar archive
docker exec – Runs a command in a running container
docker search – Searches the Docker Hub for images
docker attach – Attaches to a running container
docker commit – Creates a new image from a container’s changes





Contd.
1) The Docker command line interface is the client.
2) The Docker daemon acts as the server. It runs on the host system on which Docker
is installed.
>> The command service docker start starts the daemon. The daemon does all the heavy
lifting: it builds Docker images, pulls images from a registry, and runs containers
from images. You access the Docker daemon via the Docker client.

3) Docker images can be stored in Docker registries. Docker officially provides
public access to Docker Cloud and Docker Hub.

They contain pre-built Docker images. You may pull those images and run them
as containers on your Docker host.

We can also push our own applications as images to Docker Hub.



Image and Container
• Docker Image is the build component of Docker. It is a
read-only template. A docker image could contain an
Operating System, a Web Server, a Web Application, etc.
• Docker Container is the process that uses the template
(Docker Image) and runs. A docker container, like a
process, could be run, started, moved, stopped and
deleted. Each container is isolated from others and is a
secure application platform.
• Simply put, a container is a running instance of an image
• To turn an image into a container, the Docker engine takes
the image, adds a read-write file system on top and
initializes various settings including network ports,
container name, ID and resource limits.
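
The read-write layer on top of a read-only image behaves much like a stack of dictionaries in which only the top one accepts writes. The sketch below uses Python's ChainMap to imitate that behaviour; it is an analogy rather than how the Docker engine is implemented, and the file paths and contents are made up.

```python
from collections import ChainMap

# Read-only image content, modelled as path -> file content (hypothetical).
image = {"/etc/app.conf": "debug=false", "/app/main.py": "print('hi')"}


def start_container(image):
    """Return a container view: an empty writable layer on top of the image."""
    return ChainMap({}, image)


c1 = start_container(image)
c2 = start_container(image)
c1["/etc/app.conf"] = "debug=true"  # the write lands in c1's own layer only

print(c1["/etc/app.conf"])  # c1 sees its own modified copy
print(c2["/etc/app.conf"])  # c2 still sees the image's original file
print(image["/etc/app.conf"])  # the image itself is unchanged
```

Both containers read unmodified files straight from the shared image, which is why starting many containers from one image costs little extra disk space.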
Big Picture



References

• https://www.ibm.com/developerworks/cloud/library/cl-cloud-orchestration-technologies-trs/index.html
• https://docs.dockers.com/engine/installation/windows/



Cloud Computing (SS ZG527)
CS 8.1: DFS
Dr. Subhrakanta Panda
BITS Pilani, Hyderabad Campus
SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
Objectives

• Introduction to file system
• Distributed File System (DFS)
• Case Studies
  • GFS
  • HDFS
• Reading and writing files (HDFS)
• Cloud storage
File system

• A file system is a type of data store that can be used to
  • organize data (as files)
  • provide a means for applications to store, access,
    and modify data
• Windows makes use of the FAT, NTFS, exFAT and
ReFS file systems (the latter is only supported and
usable in Windows Server 2012)
• Ext, ext2, ext3, ext4 for Linux
• HFS, HFS+ for Mac OS
• GFS/HDFS
Distributed File System (DFS)?

Big data continues to grow.

In contrast to a local file system, a
distributed file system (DFS) can hold big
data and provide access to this data to
many clients distributed across a network.
NAS versus SAN
• Another term for DFS is network attached storage (NAS),
referring to attaching storage to network servers that provide
file systems
• A similar sounding term that refers to a very different
approach is storage area network (SAN)
  • SAN makes storage devices (not file systems) available over a network
Benefits of DFSs

DFSs provide:
1. File sharing over a network: without a DFS, we
would have to exchange files by e-mail or use
applications such as the Internet’s FTP.

2. Transparent file access: a user’s program
can access remote files as if they were local. The
remote files have no special APIs; they are
accessed just like local ones.

3. Easy file management: managing a DFS is
easier than managing multiple local file systems.
DFS Architectures

1. Client-Server Distributed File Systems.


2. Cluster-Based Distributed File Systems.
3. Symmetric Distributed File Systems.

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 9
Cluster-Based Distributed File Systems

• The cluster-based file system is a key component for
providing scalable data-intensive application performance.

• The cluster-based file system divides and distributes big
data, using file striping techniques, to allow
concurrent data accesses.

• The cluster-based file system could be either a cloud
computing or an HPC (high performance computing)
oriented distributed file system.
  • GFS, HDFS, S3, etc. are examples of cloud computing DFSs
  • Parallel Virtual File System (PVFS) and IBM’s General Parallel File System
    (GPFS) are examples of HPC DFSs
Google File System (GFS)
• The Google File System (GFS) is a cloud computing
based scalable DFS for large distributed data
intensive applications.
• GFS divides large files into multiple pieces called chunks
or blocks (by default 64 MB) and stores them on different
data servers.
  • This design is referred to as block-based design
• Each GFS chunk has a unique 64-bit identifier and
is stored as a file in the lower-layer local file system on
the data server.
• GFS distributes chunks across cluster data servers using
a random distribution policy.
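
The chunking and random placement described above can be sketched as follows. This is an illustration under simplifying assumptions, not GFS code: the function name and server names are hypothetical, a random 64-bit number stands in for the identifier that the real system assigns, and replication is ignored.

```python
import random

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB default chunk size


def split_into_chunks(file_size, servers, rng):
    """Return a list of (chunk_id, length, server) for a file of file_size bytes."""
    chunks = []
    offset = 0
    while offset < file_size:
        length = min(CHUNK_SIZE, file_size - offset)
        chunk_id = rng.getrandbits(64)  # stand-in for a unique 64-bit ID
        chunks.append((chunk_id, length, rng.choice(servers)))
        offset += length
    return chunks


layout = split_into_chunks(200 * 1024 * 1024, ["cs1", "cs2", "cs3"], random.Random(0))
for chunk_id, length, server in layout:
    print(f"{chunk_id:016x} {length // (1024 * 1024):3d} MB -> {server}")
```

A 200 MB file yields three full 64 MB chunks and one 8 MB chunk; only the last chunk of a file is smaller than the chunk size.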

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 11
GFS Random
Distribution Policy

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 12
How to replicate?

• Two machines in the same rack have more
bandwidth and lower latency between each
other than two machines in two different racks

• For every block of data, two copies will exist in
one rack, and another copy in a different rack.
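
The placement rule (two copies in one rack, a third in a different rack) can be sketched as below. The rack and server names are hypothetical, and a real system would also weigh free space and load, which this sketch ignores.

```python
import random


def place_replicas(racks, rng):
    """Pick three servers: two from one rack and one from a different rack.

    racks maps a rack name to its list of servers; at least two racks are
    needed, one of them with two or more servers.
    """
    big = [r for r in racks if len(racks[r]) >= 2]
    first_rack = rng.choice(big)
    other_rack = rng.choice([r for r in racks if r != first_rack])
    return rng.sample(racks[first_rack], 2) + [rng.choice(racks[other_rack])]


racks = {"rack1": ["a", "b", "c"], "rack2": ["d", "e"]}
replicas = place_replicas(racks, random.Random(1))
print(replicas)
```

Keeping two copies in one rack makes the second copy cheap to write, while the third copy in another rack survives the loss of a whole rack.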

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 13
Google File System
• GFS stores a huge number of files, totaling
many terabytes of data.
• Individual file characteristics:
– Very large, multiple gigabytes per file
– Files are updated by appending new
entries to the end (faster than overwriting
existing data)
– Files are virtually never modified (other
than by appends) and virtually never
deleted.
– Files are mostly read-only
GFS Architecture

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 15
Master and Chunk Server
Responsibilities

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 16
GFS
• Chunks are replicated within a cluster for fault
tolerance, using a primary/backup scheme.
• Periodically the master polls all its chunk
servers to find out which chunks each
one stores
– This means the master doesn’t need to know
each time a new server comes on board, when
servers crash, etc.
• Polling occurs often enough to guarantee that the
master’s information is “good enough”.
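
Rebuilding the location map from each poll can be sketched as follows. This is a simplified illustration, not GFS code: the function name, server names, and chunk IDs are hypothetical.

```python
def poll_chunk_servers(reports):
    """Build the master's chunk -> servers map from the latest poll.

    reports maps each chunk server that answered to the chunk IDs it holds.
    """
    locations = {}
    for server, chunk_ids in reports.items():
        for chunk_id in chunk_ids:
            locations.setdefault(chunk_id, set()).add(server)
    return locations


# Hypothetical poll results; "cs3" crashed before the second poll.
first = poll_chunk_servers({"cs1": ["c1", "c2"], "cs2": ["c2"], "cs3": ["c1"]})
second = poll_chunk_servers({"cs1": ["c1", "c2"], "cs2": ["c2"]})
print(first)
print(second)
```

Because the map is derived from what the servers report, a crashed server simply drops out of the next result without the master having to be told about the failure.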

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 17
HDFS??
• Hadoop's Distributed File System is designed to
reliably store very large files across machines in a
large cluster.
• It is inspired by the Google File System.
• Hadoop DFS stores each file as a sequence of
blocks; all blocks in a file except the last block are
the same size.
• Blocks belonging to a file are replicated for fault
tolerance. The block size and replication factor are
configurable per file. Files in HDFS are "write
once" and have strictly one writer at any time.
Hadoop File system

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 19
Hadoop Distributed File System – Goals:
• Store large data sets
• Cope with hardware failure
• Emphasize streaming data access
From GFS to HDFS
Terminology differences:
– GFS master = Hadoop namenode
– GFS chunkservers = Hadoop datanodes

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 21
HDFS Architecture

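[Figure: HDFS architecture. The application uses the HDFS client, which sends
(file name, block id) to the HDFS namenode and receives (block id, block location).
The namenode holds the file namespace (e.g. /foo/bar, block 3df2), sends
instructions to the datanodes, and receives datanode state. The client sends
(block id, byte range) to the HDFS datanodes and receives block data; each
datanode stores its blocks on a Linux file system.]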
Hadoop Server Roles

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 23
Source: http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network
Namenode Responsibilities

• Managing the file system namespace:
– Holds file/directory structure, metadata,
file-to-block mapping, access permissions,
etc.
• Coordinating file operations:
– Directs clients to datanodes for reads and
writes
– No data is moved through the namenode
• Maintaining overall health:
– Periodic communication with the datanodes
– Garbage collection
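
The split between metadata and data can be sketched with a toy namenode that only answers "where are the blocks?". This is an illustration, not Hadoop's API: the class, method names, block IDs, and datanode names are hypothetical.

```python
class NameNode:
    """Holds metadata only; file contents never pass through it."""

    def __init__(self):
        self.file_blocks = {}      # file path -> ordered list of block IDs
        self.block_locations = {}  # block ID -> list of datanode names

    def add_file(self, path, blocks):
        self.file_blocks[path] = [block_id for block_id, _ in blocks]
        for block_id, nodes in blocks:
            self.block_locations[block_id] = list(nodes)

    def locate(self, path):
        """Return [(block_id, datanodes)] so the client can read directly."""
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]


nn = NameNode()
nn.add_file(
    "/foo/bar",
    [("blk_1", ["dn1", "dn5", "dn6"]), ("blk_2", ["dn2", "dn5", "dn7"])],
)
for block_id, nodes in nn.locate("/foo/bar"):
    print(block_id, "->", nodes)
```

The client takes this answer and contacts the listed datanodes itself, so the namenode handles small lookups rather than bulk data transfer.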
SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 25
Writing files to HDFS

Source: http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network

• The Name Node is not in the data path. The Name Node only provides the map
of where data is and where data should go in the cluster
Preparing HDFS writes

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 27
Pipelined write

Data Nodes 1 & 5 pass data along
as it's received

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 28
Pipelined write (contd..)

The Client is ready to start the pipeline process again for the next block of data
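
The pipelined write can be sketched as a loop in which each packet of a block travels down the chain of datanodes. This is a sequential simulation for illustration only: the function name, the tiny packet size, and the datanode names are hypothetical, and acknowledgements and failures are left out.

```python
def pipelined_write(block, pipeline, storage, packet_size=4):
    """Send block through the datanodes in pipeline, packet by packet.

    storage maps a datanode name to the bytes it has stored so far.
    """
    for start in range(0, len(block), packet_size):
        packet = block[start:start + packet_size]
        # Each node stores the packet and passes it on to the next node.
        for node in pipeline:
            storage[node] = storage.get(node, b"") + packet
    return storage


store = pipelined_write(b"hello hdfs", ["dn1", "dn5", "dn6"], {})
print(store)
```

Once every node in the pipeline holds the whole block, the client can start the same process for the next block, possibly with a different set of datanodes.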
SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 29
Multi-block replication pipeline

Note: The initial node in the pipeline will vary for each block
SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 30
Client reading file from HDFS

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 31
Data Node reading file from HDFS

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 32
Name node, Data node (heart beat)

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 33
Single point of failure (name node)

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 34
Data Recovery

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 35
Overall effect?

• As intended, the file is spread in blocks across the
cluster of machines, with each machine holding a relatively
small part of the data.
• The more blocks that make up a file, the more
machines the data can potentially spread across.
• The more CPU cores and disk drives that hold a piece
of your data, the more parallel processing power you have and
the faster the results.
• This is the motivation behind building large, wide clusters:
to process more data, faster.

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 36
Scaling the cluster

SS ZG527 Intro to file system Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 37
Scaling the cluster (contd..)

Two approaches for scaling the cluster
• Wide
• Deep

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 38
Scaling the cluster (contd..)

Wide scaling:
• Cluster size increases by increasing the number of nodes
• Network needs to scale appropriately
Deep scaling:
• Instead of increasing the number of machines, you
increase the density of each machine,
i.e., the capacity of each node in terms of
CPUs, disk drives and RAM
• Each machine needs more network I/O (there are fewer machines)

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 39
What does it have to do with cloud computing?

• Data is at the heart of cloud computing services

• Need file systems that can fit the bill for
large, scalable hardware and software

• GFS/HDFS and similar distributed file systems
are now part and parcel of cloud computing
solutions.

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 40
Cloud???

• Cloud storage is a model of networked online
storage where data is stored in virtualized pools of
storage
• Companies operate large data centers, and people who
require their data to be hosted buy or lease storage
capacity from them
• Cloud storage services may be accessed through a web
service application programming interface (API), a cloud
storage gateway or through a Web-based user interface
• It is difficult to pin down a canonical definition of
cloud storage architecture, but object storage is
reasonably analogous

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 41
Reference

• Understanding Hadoop Clusters and the Network
http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 43
Summary

• Introduction to file system
• Distributed File System (DFS)
• Case Studies
  • GFS
  • HDFS
• Reading and writing files (HDFS)
• Cloud storage

SS ZG527 DFS Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 44
Cloud Computing
Multi-Tenancy
BITS Pilani, Hyderabad Campus
Objective

• Multi-Tenancy
• 4 levels of multi tenancy
• Multi-tenant models for cloud services
• Cloud Security Threats

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 2
Multitenancy
• Multi-tenancy refers to a principle in software architecture
where a single instance of the software runs on a server,
serving multiple client organizations (tenants).
• Multi-tenancy is different from multi-instance architecture
where separate software instances (or hardware systems) are
set up for different client organizations.
• Multi-tenancy is a critical technology that allows
one instance of an application to serve multiple
customers by sharing resources.
  • Multi: multiple, independent customers are served
  • Tenant: any legal entity that is responsible for data and is served on
    a contractual basis. The tenant is the contract signee.
  • Applies to: IaaS, PaaS, SaaS

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 3
Multitenancy
• Single tenant applications: lots of waste
[Figure: four groups of users, each served by its own App and Db instance]

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 4
Multitenancy
Multi-tenant applications:
[Figure: several groups of users served by one shared App and Db instance]

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 5
Monitor Multiple Customers Using Typical Infrastructure

[Figure: Customers A, B, and C each have a network with an agent. On the
service provider side, each customer has its own management station (Mgt),
database (DB), and workstation (WS).]

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 6
Multi-Tenant Network Monitoring Infrastructure

[Figure: Customers A, B, and C each have a network with an agent. All agents
report to a single shared management station with one DB and one workstation.]

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 7
Goals of Multi-tenancy

• Sharing – maximize the resource sharing across multiple
tenants.

• Isolation – hide each tenant's information and activity from the other tenants.
  • Execution – enforce security. Make sure one tenant
    can’t call another tenant’s executable logic.
  • Data – make sure one tenant can’t see another tenant’s data.
  • Performance – make sure performance is not
    affected by the existence of other tenants.

• Scale
– The server is distributed and can handle a larger load by
adding more nodes.
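
Data isolation on shared infrastructure usually comes down to scoping every access by a tenant identifier. The sketch below shows that idea with a single in-memory store; the class, method names, tenant IDs, and keys are hypothetical, and a real service would also authenticate the caller before trusting the tenant ID.

```python
class TenantStore:
    """One shared store; every read and write is scoped by tenant ID."""

    def __init__(self):
        self._rows = {}  # (tenant_id, key) -> value

    def put(self, tenant_id, key, value):
        self._rows[(tenant_id, key)] = value

    def get(self, tenant_id, key):
        # A tenant can only ever address rows stored under its own ID.
        return self._rows.get((tenant_id, key))

    def keys(self, tenant_id):
        return sorted(k for (t, k) in self._rows if t == tenant_id)


store = TenantStore()
store.put("tenant_a", "invoice-1", 100)
store.put("tenant_b", "invoice-1", 250)
print(store.get("tenant_a", "invoice-1"))
print(store.get("tenant_b", "invoice-1"))
print(store.get("tenant_c", "invoice-1"))
```

Both tenants use the same key and the same store instance (sharing), yet neither can read the other's value (isolation).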

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 8
Multi-tenants Deployment Modes
for Application Server

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 9
Multi-tenants Deployment Modes in
Data Centers

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 10
4 levels of multi tenancy

1. Ad-hoc/customizable instances
2. Configurable instances
3. Configurable multi–tenant efficient instances
4. Scalable, configurable, multi-tenant efficient
instances

• For any given resource in a cloud system, the
appropriate level could be selected.

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 11
4 levels of multi tenancy

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 12
Ad-hoc / Customizable Instances

• Each customer has their own custom version of
the software.
• Represents an enterprise data center where
there are multiple instances and versions of the
software.
• Each customer would have their own binaries,
as well as their own dedicated processes for
implementation of the application.
• Disadv:
- Difficulty in Management: Each customer would
need their own management support.
SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 13
Configurable Instances

• All customers share the same version of the
software (one copy for each customer)

• Adv:
- Easy Management: a single version of the software.

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 14
Configurable Multi-Tenant
Efficient Instances
• All customers share the same version of the
software (only a single copy among all
customers).

• Adv: Easy Management: only a single instance is running.

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 15
Scalable, Configurable, Multi-Tenant
Efficient Instances
• All customers share the same version of the
software (only a single copy among all customers).
• Software is hosted on a cluster of computers.
• Hence, the capacity of the system can
scale almost limitlessly.
• Thus, capacity can grow along with the number of
customers.
• Ex: Gmail, Yahoo Mail, etc.
• Disadv: Shared storage problem

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 16
Multi-tenant models for cloud
services
[Figure: tenants T1 and T2 are served under three models: IaaS (private
cloud/IT center), PaaS (development center), and SaaS (data center). In each
model, applications (AP) run on VMs with guest OSes on top of a hypervisor,
host OS, and hardware.]

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 17
Multi-tenancy Issues in the Cloud
• Conflict between tenants’ opposing goals
– Tenants share a pool of resources and have opposing
goals.
• How does multi-tenancy deal with conflict of interest?
– Can tenants get along together and ‘play nicely’ ?
– If they can’t, can we isolate them?
• How to provide separation between tenants?
• Cloud Computing brings new threats
• Multiple independent users share the same physical infrastructure.
- Thus an attacker can legitimately be on the same physical machine as the target.

Cloud Computing
BITS Pilani Cloud Security
Hyderabad Campus
Objectives

• Cloud Security
• Who is responsible for Managing Security
• Service Level Agreements: Lifecycle and Management
• Traditional approaches to SLO management
• Automated Policy based management

Cloud Security

• Data on computers is an extremely important aspect of modern life.
- Therefore various areas in security began to gain prominence.
• Furthermore, the internet took the world by storm, and there were many
examples of what could happen if there was insufficient security built
into applications developed for the internet.
• Network security measures are needed to protect data
during their transmission.

SS ZG527 Multitenancy, Security, and SLA Dr. S. Panda/Palak Sharma, CSIS, BITS Pilani, Hyderabad Campus 18-Mar-17 21
Different Aspects of Information
Security

Security Attacks:
Any action that compromises the security of information
owned by an organization.

Security Mechanisms:
A mechanism that is designed to detect, prevent or recover
from a security attack.

Security Services:
A service that enhances the security of data processing
system and the information transfers of an organization.

Some General Terms

 Availability: data/services can be accessed as desired

 Integrity: data has not been (maliciously) altered

 Confidentiality: no information has been inappropriately disclosed

 Authentication: user or data origin is properly identifiable

 Accountability: actions are traceable to those responsible

Security attacks

There are four categories of attacks


1. Interruption:
This is an attack on availability

2. Interception:
This is an attack on confidentiality

3. Modification:
This is an attack on integrity

4. Fabrication:
This is an attack on authenticity
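
Modification and fabrication attacks can be detected with a message authentication code. The following is a minimal sketch, assuming the sender and receiver already share a secret key; the key, the messages, and the helper names `sign` and `verify` are hypothetical and not part of the slides.

```python
import hashlib
import hmac


def sign(message: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag computed over the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, key: bytes, tag: str) -> bool:
    """True only if the tag matches: the message is unmodified (integrity)
    and was tagged by someone holding the key (authenticity)."""
    return hmac.compare_digest(sign(message, key), tag)


key = b"shared-secret"            # hypothetical key, for illustration only
tag = sign(b"transfer 100", key)

print(verify(b"transfer 100", key, tag))   # original message
print(verify(b"transfer 900", key, tag))   # modified message is rejected
```

A tag does nothing against interception or interruption; those attacks call for encryption and redundancy respectively.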

Passive Vs Active attacks
Passive:
Do not involve any modification to the contents of an original message.
E.g., an unauthorized party gains access to an asset
(unauthorized copying of files or programs).
[Figure: passive attacks are divided into release of contents and traffic analysis]

Active:
Contents of the original messages are modified in some
way, or a false message is created.
Security Attacks in Practice

• Application level attack: The attacker attempts to access the
information of a particular application or the application itself.

• Network level attack: Aims at reducing the capabilities of a
network by a number of possible means.

Security is the key inhibitor to cloud adoption

Companies are still afraid to use clouds

Recent Cloud attacks

• Running of “Zeus botnet controller” on an EC2 instance on
Amazon’s cloud infrastructure was reported in 2009.
• iCloud hack
• Sony Pictures
• Home Depot
• Anthem

Security and Privacy Issues in
Cloud Computing

• Infrastructure Security

• Data security and Storage security

• Identity and Access Management (IAM)

• Privacy

Infrastructure Security

• Network Level

• Host Level

• Application Level

The Network Level

 Ensuring confidentiality and integrity of your organization’s
data-in-transit to and from your public cloud provider.
 Ensuring proper access control (authentication,
authorization, and auditing) to whatever resources
you are using at your public cloud provider.
 Ensuring availability of the Internet-facing resources in a
public cloud that are being used by your organization, or
have been assigned to your organization by your
public cloud providers.

The Network Level -
Mitigation
Note that network-level risks exist regardless of
what aspects of “cloud computing” services are
being used.
The primary determinant of risk level is therefore not
which *aaS is being used, but rather whether your
organization intends to use, or is using, a public,
private, or hybrid cloud.

The Host Level

SaaS/PaaS
–Both the PaaS and SaaS platforms abstract and
hide the host OS from end users.
–Host security responsibilities are transferred to the
CSP (Cloud Service Provider).
• You do not have to worry about
protecting hosts.
–However, as a customer, you still own the risk of
managing information hosted in the cloud services.

More on attacks…

 Can one determine where in the cloud
infrastructure an instance is located?
 Can one easily determine if two instances are
co-resident on the same physical machine?
 Can an adversary launch instances that will be
co-resident with other user instances?
 Can an adversary exploit cross-VM
information leakage once co-resident?
 Answer: Yes to all

Who is responsible for managing security

Who is responsible for managing security
(contd..)

Cloud Security Issues

• Most security problems stem from:


– Loss of Control
• Take back control
– Data and apps may still need to be on the cloud
– But can they be managed in some way by the consumer?
– Lack of trust
• Increase trust (mechanisms)
– Technology
– Policy, regulation
– Contracts (incentives): topic of a future talk
– Multi-tenancy
• Private cloud
– Takes away the reasons to use a cloud in the first place
• VPC: it’s still not a separate system
• Strong separation
• These problems exist mainly in 3rd party management models
– Self-managed clouds still have security issues, but not related to above.

Loss of Control in the Cloud
Consumer’s loss of control
– Data, applications, resources are located with provider.
– User identity management is handled by the cloud
– User access control rules, security policies, and
enforcement are managed by the cloud provider
– Consumer relies on provider to ensure
• Data security and privacy
• Resource availability
• Monitoring and repairing of services/resources
Minimize Loss of Control in the Cloud
 Monitoring
 Utilizing different clouds
 Access control management
Lack of Trust in the Cloud
 Trusting a third party requires taking risk
 Defining trust and risk
– Opposite sides of the same coin (J. Camp)
– People only trust when it pays (Economist’s view)
– Need for trust arises only in risky situations
 Lack of trust here mostly means lack of accountability and
verifiability

Multi-tenancy Issues in the Cloud
 Conflict between tenants’ opposing goals
– Tenants share a pool of resources and have opposing
goals
 How does multi-tenancy deal with conflict of
interest?
– Can tenants get along together and ‘play nicely’ ?
– If they can’t, can we isolate them?
 How to provide separation between tenants?
 Who are my neighbors? What is their objective?

Minimize Multi-tenancy in
the Cloud
 Can’t really force the provider to accept fewer
tenants
– Can try to increase isolation between tenants
• Strong isolation techniques (VPC to some
degree)
• VM Side channel attacks (T. Ristenpart et
al.)
– Can try to increase trust in the tenants
• Who’s the insider, where’s the security
boundary? Who can I trust?
• Use SLAs to enforce trusted behavior
Threat Model
A threat model helps in analyzing a security
problem, designing mitigation strategies, and evaluating
solutions.
Steps:
– Identification: Identify attackers, assets, threats and other
components
– Ranking: Rank the threats
– Mitigation: Choose mitigation strategies
– Solution: Build solutions based on the strategies
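
The ranking step can be sketched as a simple risk score. This is an illustrative assumption rather than a method from the slides: each threat gets a hypothetical likelihood and impact on a 1 to 5 scale, and threats are ordered by their product.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self) -> int:
        # Simple multiplicative score; other weightings are possible.
        return self.likelihood * self.impact


def rank(threats):
    """Return threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.risk, reverse=True)


# Hypothetical scores, for illustration only.
threats = [
    Threat("Data loss", likelihood=2, impact=4),
    Threat("Account hijacking", likelihood=4, impact=5),
    Threat("Insider threat", likelihood=2, impact=5),
]

for t in rank(threats):
    print(f"{t.name}: {t.risk}")
```

Mitigation effort would then be spent on the top of the list first.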

Top 5 cloud security threats

1. Account Hijacking

2. Insufficient Due Diligence

3. Data Loss

4. Data Breach

5. Insider Threat

Top 5 cloud security threats
Account hijacking
• Multi-factor authentication
• Protect the global admin account

Insufficient Due Diligence


• Shadow IT (file sharing, social, collaboration, etc.)
• IT departments need to be shepherds
• Manageable services, access controls and encryption
• User controls
• Audit transparency

Data loss
• Accidental deletion
• Archiving service
• User / Admin level – recycle bin
• Can get data for a period of time
• Redundancy for natural disasters (geo-redundancy)

Top 5 cloud security threats
• Data breach
• Media breach
• Physical security
• Finding the data is like finding a needle in a haystack
• Encryption at rest
• Man in the middle
• Encryption in transit within and outside datacenters
• End-to-end encryption
• Message encryption

Top 5 cloud security threats

• Malicious Insider (Insider threat)


 Least privilege access to operators.
 Audit and monitor the admin accounts usage
closely.
 Elevations are granted with manual approval
and for a limited period.

Bibliography
• Buyya K. R., Broberg J., Goscinski A., Cloud
Computing Principles and Paradigms. Wiley;
2013.
• Recorded Lectures.
• Dinkar Sitaram, Geetha Manjunath, Moving to
the Cloud Developing Apps in the New World
of Cloud Computing;2012

BITS Pilani, Hyderabad Campus
Dr. Subhrakanta Panda

SS ZG527
Cloud Computing

CS 8.2
HADOOP
SS ZG527 HADOOP Dr. S. Panda/Mr.Rahul C.S. CSIS, BITS Pilani, Hyderabad Campus 2
Objectives

 Introduction to Hadoop
 MapReduce
 Understanding the various logical steps of MapReduce
 Exploring the word count Java program in detail
 Summary of MapReduce facts

Hadoop

Name of the elephant!!


Or have you thought of the
following?

• How do “big bazaar/more/D’Mart” target promotions guaranteed to make you buy?

• How can Airtel (4G) increase ad-campaign efficiency?

• What’s in your search? How is Google able to make such good predictions about your search?

• I have a huge amount of data (news sites, Twitter, blogs, feeds, forums, etc.). What do I do with it?

Big DATA ????

Wow, that’s so much DATA to process!

Exactly, and that is what we call “Big Data”.


• Hadoop is one of the best-known cloud platforms for big
data today
• It solves a specific class of data-crunching problems that
frequently comes up in the domain of Internet computing
and high-performance computing.
• Managing lots of information (growing by the day and
doubling by year)
• Working with many new types of data (totally unstructured)

In the 2009 Sort Benchmark, Hadoop set records for sorting
large data (500 GB in 59 seconds and 100 TB in 173 minutes).
It is designed to answer the question: “How to
process big data with reasonable cost and time?”

Super, so tell me more about Hadoop, the data cruncher.
Okay, okay… Sit tight.

Hadoop components and importance of
MapReduce
 Hadoop is optimized for batch-processing
applications, and scales to the number of
CPUs available in the cluster
 MapReduce is a fundamental building block
in Hadoop
 Provides Framework for Massive parallel
processing
 Provides scalability
 Programmer can focus on their program,
and the framework takes care of the
details of parallelization, fault-tolerance,
locality optimization, load balancing
 Paradigm shift: In MapReduce
programming model, computation goes to
data rather than data coming to program.
Processing takes place where data is.

Hadoop components
 Hive: provides a database query interface to Apache
Hadoop
 Pig: A high-level data-flow language and execution
framework for parallel computation.
 ZooKeeper: An open-source server that enables highly
reliable distributed coordination.
 HBase: A scalable, distributed database that supports
structured data storage for large tables.
 Mahout: A scalable machine learning and data mining
library.
 Sqoop: A tool designed to transfer data between Hadoop
and relational database servers.
Hadoop Framework
MapReduce (Data Processing Framework)
 MapReduce is a software framework for easily running applications that
process large amounts of data in parallel on large clusters (thousands of
nodes) of commodity hardware in a reliable and fault-tolerant manner

Suggested Reading

1. Hadoop framework - based on the white paper published by Google in 2004

2. "MapReduce: Simplified data processing on large clusters" by Jeffrey Dean and Sanjay Ghemawat

MapReduce??
 Origin from Google, [OSDI’04]
 A simple programming model - distributed programming
framework (works on divide and conquer)
 Used for processing and generating large data sets
 Functional model
 For large-scale data processing
– Exploits large set of commodity computers
– Executes process in distributed manner
– Offers high availability

Motivation
Lots of demand for very large-scale data processing
Certain common themes across these demands
– Lots of machines needed (scaling)
– Two basic operations on the input
• Map
• Reduce

Architecture overview

[Figure: Hadoop MapReduce architecture overview. Source: blog.raremile.com]
The Job Tracker:
 Central authority for the complete MapReduce cluster
and responsible for scheduling and monitoring
MapReduce jobs.
 Responds to client request for job submission and status.

The TaskTracker:
 Worker that accepts map and reduce tasks from the job
tracker, launches them, keeps track of their
progress, and reports it to the job tracker.
 Keeps track of the resource usage of tasks and kills
tasks that overshoot their memory limits.

Ref: Jeffrey Dean and Sanjay Ghemawat
Distributed Grep
[Figure: very big data is divided into splits; grep runs on each split in parallel to produce matches; cat concatenates them into all matches]
Distributed Word Count
[Figure: very big data is divided into splits; count runs on each split in parallel; merge combines the per-split counts into the merged count]
Map+Reduce
[Figure: very big data -> MAP -> REDUCE -> result]

Map:
– Accepts input key/value pair
– Emits intermediate key/value pair
Reduce:
– Accepts intermediate key/value pair
– Emits output key/value pair
Flow of MapReduce
1. Define Inputs

2. Define Map function

3. Define Combiner function

4. Define Reduce function

5. Define output

MapReduce Programming Model

Data type: key-value records

Map function:
(K_in, V_in) -> list(K_inter, V_inter)

Reduce function:
(K_inter, list(V_inter)) -> list(K_out, V_out)
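
The two signatures above can be exercised with a small in-memory simulation. This is a sketch for intuition only, not how Hadoop executes jobs: `run_mapreduce` and the sample functions are hypothetical names, and the shuffle is just a dictionary that groups intermediate values by key.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Apply map_fn to each (k_in, v_in), group by intermediate key,
    then apply reduce_fn to each (k_inter, list(v_inter))."""
    groups = defaultdict(list)
    for k_in, v_in in records:
        for k_inter, v_inter in map_fn(k_in, v_in):
            groups[k_inter].append(v_inter)      # shuffle: group by key
    output = []
    for k_inter in sorted(groups):               # reducers see keys in sorted order
        output.extend(reduce_fn(k_inter, groups[k_inter]))
    return output


def length_map(key, value):
    # (K_in, V_in) -> list(K_inter, V_inter): key each word by its length
    yield (len(value), value)


def collect_reduce(key, values):
    # (K_inter, list(V_inter)) -> list(K_out, V_out)
    yield (key, sorted(values))


records = [("a", "test"), ("b", "quick"), ("c", "data")]
result = run_mapreduce(records, length_map, collect_reduce)
print(result)
```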

Examples
let map(k,v) =emit (k.toUpper(), v.toUpper() )
– (“foo”, “bar”) -> (“FOO”,”BAR”)
– (“key2”,”data”) -> (“KEY2”,”DATA”)

let map(k,v)= foreach char c in v :emit (k,c)


– (“A”,”cats”)->(“A”,”c”),(“A”,”a”),(“A”,”t”),(“A”,”s”)
– (“B”,”hi”) ->(“B”,”h”), (“B”,”i”)

let map(k,v)= if (isPrime(v)) then emit (k,v)


– (“foo”,7) -> (“foo”,7)
– (“test”,10) -> (nothing)

let map(k,v)= emit(v.length,v)


– (“hi”,”test”)->(4,”test”)
– (“x”,”quick”) ->(5,”quick”)
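
The four map functions above translate almost directly into Python generators, where `yield` plays the role of `emit`. The function names below are illustrative choices, not names from the slides.

```python
def to_upper(k, v):
    # One output pair per input pair.
    yield (k.upper(), v.upper())


def explode(k, v):
    # One output pair per character of the value.
    for c in v:
        yield (k, c)


def is_prime(n):
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True


def keep_primes(k, v):
    # Zero or one output pair: map can act as a filter.
    if is_prime(v):
        yield (k, v)


def by_length(k, v):
    # The output key need not be related to the input key.
    yield (len(v), v)


print(list(to_upper("foo", "bar")))
print(list(explode("A", "cats")))
print(list(keep_primes("test", 10)))
print(list(by_length("x", "quick")))
```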

Example: Word Count

def mapper(line):
    for word in line.split():
        output(word, 1)

def reducer(key, values):
    output(key, sum(values))
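
A runnable single-process version of the same mapper and reducer is sketched below. It assumes `output` means "emit a pair" and replaces it with `yield`; the `word_count` driver, which sorts the pairs to imitate the shuffle, is a hypothetical helper added for illustration.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    for word in line.split():
        yield (word, 1)


def reducer(key, values):
    yield (key, sum(values))


def word_count(lines):
    # Sorting the intermediate pairs stands in for the shuffle/sort phase.
    pairs = sorted(pair for line in lines for pair in mapper(line))
    counts = {}
    for key, group in groupby(pairs, key=itemgetter(0)):
        for word, total in reducer(key, (v for _, v in group)):
            counts[word] = total
    return counts


lines = ["the quick brown fox", "the fox ate the mouse", "how now brown cow"]
counts = word_count(lines)
print(counts)
```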

Word Count Execution
[Figure: Input -> Split -> Map -> Reduce -> Output]
Input: "the quick brown fox. the fox ate the mouse. how now brown cow"
Split 1: "the quick brown fox" -> Map emits: the, 1; quick, 1; brown, 1; fox, 1
Split 2: "the fox ate the mouse" -> Map emits: the, 1; fox, 1; ate, 1; the, 1; mouse, 1
Split 3: "how now brown cow" -> Map emits: how, 1; now, 1; brown, 1; cow, 1
Reduce 1 output: brown, 2; fox, 2; how, 1; now, 1; the, 3
Reduce 2 output: ate, 1; cow, 1; mouse, 1; quick, 1
An Optimization: The Combiner

• Local reduce function for repeated keys produced by same map


• For associative ops. like sum, count, max
• Decreases amount of intermediate data

• Example: local counting for Word Count:

def combiner(key, values):
    output(key, sum(values))
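
The effect of a combiner on the amount of intermediate data can be seen on a single map task. This sketch assumes the combiner runs over all pairs produced by one map call; the `Counter`-based implementation is an illustrative choice, not Hadoop's.

```python
from collections import Counter


def mapper(line):
    return [(word, 1) for word in line.split()]


def combiner(pairs):
    """Locally sum the counts of repeated keys from one map task."""
    local = Counter()
    for word, n in pairs:
        local[word] += n
    return sorted(local.items())


raw = mapper("the fox ate the mouse")
combined = combiner(raw)

print(len(raw), raw)            # pairs sent to reducers without a combiner
print(len(combined), combined)  # pairs sent to reducers with a combiner
```

Here the repeated key "the" is collapsed into a single pair before anything crosses the network. This is only valid because addition is associative and commutative.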

Word Count with Combiner
[Figure: Input -> Map -> Reduce -> Output]
Input 1: "the quick brown fox" -> Map emits: the, 1; quick, 1; brown, 1; fox, 1
Input 2: "the fox ate the mouse" -> Map emits: the, 2; fox, 1; ate, 1; mouse, 1
Input 3: "how now brown cow" -> Map emits: how, 1; now, 1; brown, 1; cow, 1
Reduce 1 output: brown, 2; fox, 2; how, 1; now, 1; the, 3
Reduce 2 output: ate, 1; cow, 1; mouse, 1; quick, 1
Overall Word Count Execution (2)
Word Count implementation (java)
 Counts the number of occurrences of each word in the input files
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    // Emits (word, 1) for every token of every input line.
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    // Sums the counts received for each word.
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "wordcount");
        job.setJarByClass(WordCount.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setCombinerClass(Reduce.class); // the reducer doubles as the combiner
        job.setReducerClass(Reduce.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        // args[0] is the input directory, args[1] the output directory
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
} // WordCount class end
Running the application
Step 1: Compile WordCount.java and package the classes into a jar (wc.jar)
Step 2: Place the input files in the appropriate HDFS directory
– /user/CSE/wordcount/input/file01 (Hello WILP students)
– /user/CSE/wordcount/input/file02 (How are you! Bye for now)

Step 3: Run the application
$ bin/hadoop jar wc.jar WordCount /user/CSE/wordcount/input /user/CSE/wordcount/output

Output:
$ bin/hadoop fs -cat /user/CSE/wordcount/output/part-r-00000
Bye 1
Hello 1
How 1
WILP 1
are 1
for 1
now 1
students 1
you! 1

Word Count example code (java)
http://hadoop.apache.org/docs/stable/mapred_tutorial.html

http://wiki.apache.org/hadoop/WordCount

Challenges of Cloud Environment
Cheap nodes fail, especially when you have many
– Mean time between failures for 1 node = 3 years
– MTBF for 1000 nodes = 1 day
– Solution: Build fault tolerance into system

Commodity network = low bandwidth


– Solution: Push computation to the data

Programming distributed systems is hard


– Solution: Restricted programming model: users write
data-parallel “map” and “reduce” functions, system
handles work distribution and failures
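
The MTBF figures above follow from simple arithmetic. As a rough sketch, assume nodes fail independently at a constant rate, so the failure rate of the cluster is the sum of the node rates; the function name is hypothetical.

```python
def cluster_mtbf_days(node_mtbf_days: float, nodes: int) -> float:
    """Expected time between failures anywhere in the cluster,
    assuming independent nodes with a constant failure rate."""
    return node_mtbf_days / nodes


node_mtbf = 3 * 365  # one node: 3 years, in days
print(cluster_mtbf_days(node_mtbf, 1))      # single node
print(cluster_mtbf_days(node_mtbf, 1000))   # 1000 nodes: about one day
```

With 1000 nodes, some node is expected to fail roughly once a day, which is why fault tolerance has to be built into the system rather than treated as an exception.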

MapReduce Execution Details

• Mappers preferentially scheduled on same node or same rack as their input block
– Minimize network use to improve performance

• Mappers save outputs to local disk before serving to reducers
– Allows recovery if a reducer crashes

Fault Tolerance in MapReduce
1. If a task crashes:
– Retry on another node
• OK for a map because it had no dependencies
• OK for reduce because map outputs are on disk
– If the same task repeatedly fails, fail the job or
ignore that input block
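
The retry policy above can be sketched as a loop over candidate nodes. This is a toy model under stated assumptions: `run_with_retries`, `flaky_task`, the node names, and the limit of four attempts are all hypothetical, and a real scheduler would pick nodes dynamically.

```python
def run_with_retries(task, nodes, max_attempts=4):
    """Run the task on successive nodes; return (result, attempts used).
    If every attempt fails, give up and fail the job."""
    errors = []
    for attempt, node in enumerate(nodes[:max_attempts], start=1):
        try:
            return task(node), attempt
        except RuntimeError as exc:
            errors.append((node, str(exc)))   # remember the failure, try elsewhere
    raise RuntimeError(f"task failed on {len(errors)} node(s): {errors}")


def flaky_task(node):
    # Hypothetical task that crashes on two particular nodes.
    if node in {"node1", "node2"}:
        raise RuntimeError("crashed")
    return f"done on {node}"


print(run_with_retries(flaky_task, ["node1", "node2", "node3"]))
```

Re-running is safe because a map task depends only on its input block, and a reduce task can re-read the map outputs saved on disk.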

Fault Tolerance in MapReduce

2. If a node crashes:
– Relaunch its current tasks on other nodes
– Relaunch any maps the node previously ran
• Necessary because their output files were lost
along with the crashed node

Fault Tolerance in MapReduce

3. If a task is going slowly (straggler)

– Launch second copy of task on another node


– Take the output of whichever copy finishes first,
and kill the other one

• Critical for performance in large clusters (many possible causes of stragglers)

MapReduce (Map task)
What if data is not local?
MapReduce (Reduce task)
Examples
Inverted Index

• Input: (filename, text) records
• Output: list of files containing each word

• Map:
    for word in text.split():
        output(word, filename)

• Combine: uniquify filenames for each word

• Reduce:
    def reduce(word, filenames):
        output(word, sort(filenames))
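
A runnable single-process sketch of the inverted index follows. It folds the "uniquify" step into the reducer for brevity, and `build_index` is a hypothetical driver that stands in for the shuffle.

```python
from collections import defaultdict


def map_index(filename, text):
    for word in text.split():
        yield (word, filename)


def reduce_index(word, filenames):
    # set() removes duplicates; sorted() gives a stable file order.
    yield (word, sorted(set(filenames)))


def build_index(files):
    groups = defaultdict(list)
    for filename, text in files.items():
        for word, name in map_index(filename, text):
            groups[word].append(name)          # shuffle: group by word
    index = {}
    for word in sorted(groups):
        for key, names in reduce_index(word, groups[word]):
            index[key] = names
    return index


files = {
    "hamlet.txt": "to be or not to be",
    "12th.txt": "be not afraid of greatness",
}
index = build_index(files)
for word, names in index.items():
    print(word, names)
```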

Inverted Index Example
hamlet.txt: "to be or not to be"
12th.txt: "be not afraid of greatness"
Map output (hamlet.txt): to, hamlet.txt; be, hamlet.txt; or, hamlet.txt; not, hamlet.txt
Map output (12th.txt): be, 12th.txt; not, 12th.txt; afraid, 12th.txt; of, 12th.txt; greatness, 12th.txt
Reduce output:
afraid, (12th.txt)
be, (12th.txt, hamlet.txt)
greatness, (12th.txt)
not, (12th.txt, hamlet.txt)
of, (12th.txt)
or, (hamlet.txt)
to, (hamlet.txt)
Summary of MapReduce facts

 Number of maps: depends on the input data size, usually
10-100 per node.
setNumMapTasks(int) can be used as a hint to set it higher

 Number of reducers: it is legal to have zero reducers if
no reduction is desired
setNumReduceTasks(int)

 The Hadoop framework is written in Java, but it supports
streaming, which makes it possible to write MapReduce
programs in other languages such as C# (.NET)
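
With streaming, the mapper and reducer are ordinary programs that exchange tab-separated text lines. The sketch below models both as functions over iterables of lines so it can run without a cluster; in a real streaming job each would read `sys.stdin` and print to standard output. The function names are hypothetical.

```python
def stream_mapper(lines):
    """Emit one 'word<TAB>1' line per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def stream_reducer(sorted_lines):
    """Sum counts per word. Relies on the input being sorted by key,
    so all lines for one word arrive consecutively."""
    current, total = None, 0
    for line in sorted_lines:
        word, count = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                yield f"{current}\t{total}"
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield f"{current}\t{total}"


mapped = list(stream_mapper(["Hello WILP students", "Hello again"]))
reduced = list(stream_reducer(sorted(mapped)))   # sorted() stands in for the shuffle
print(reduced)
```

Unlike the Java reducer, a streaming reducer receives a flat stream of lines rather than one key with its list of values, so it has to detect key boundaries itself.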

Summary of MapReduce facts (contd..)

 MapReduce’s data-parallel programming model
hides complexity of distribution and fault tolerance

 Principal philosophies:
 Make it scale, so you can throw hardware at problems
 Make it cheap, saving hardware, programmer and
administration costs (but necessitating fault tolerance)

 Hive and Pig further simplify programming


 MapReduce is not suitable for all problems, but
when it works, it may save you a lot of time

Available solutions with cloud platforms

AWS (EMR)
Microsoft Azure (Azure HDInsight)
OpenStack (Sahara)

Summary
 Hadoop components and importance of
MapReduce
 Understanding MapReduce various logical steps
 Exploring the word count java program in detail
 Few examples
 Summary of MapReduce facts

Cloud Computing
BITS Pilani SLA
Hyderabad Campus
SLA – Service level Agreement
 Enterprises enter into a legal agreement – SLA (Service
Level Agreement) with the infrastructure service providers
to guarantee a minimum quality of service (QoS).
 General QoS Parameters:
– System CPU
– Data storage
– Network bandwidth

 SLA rules:
 The application’s server machine will be available for 99.9% of the key
business hours of the application’s end users, also called core time, and
85% of the non-core time.
 The service provider would respond to a reported issue in less than 10
minutes during the core time, but would respond in one hour during non-
core time.
 These SLAs are termed Infrastructure SLAs, and the
providers are called ASPs (Application Service Providers).
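
The availability rules above translate into a downtime budget. The sketch below assumes a hypothetical 30-day month with 10 core hours per day; those two numbers and the function name are illustrative, while the 99.9% and 85% targets come from the rules.

```python
def allowed_downtime_hours(total_hours: float, availability_pct: float) -> float:
    """Maximum downtime that still meets the availability target."""
    return total_hours * (1 - availability_pct / 100)


DAYS = 30                    # assumed billing period
CORE_HOURS_PER_DAY = 10      # assumed core time per day

core_hours = DAYS * CORE_HOURS_PER_DAY
non_core_hours = DAYS * (24 - CORE_HOURS_PER_DAY)

core_budget = allowed_downtime_hours(core_hours, 99.9)
non_core_budget = allowed_downtime_hours(non_core_hours, 85.0)

print(f"core time: {core_budget * 60:.0f} minutes of downtime allowed")
print(f"non-core time: {non_core_budget:.0f} hours of downtime allowed")
```

Under these assumptions the provider may be down for about 18 minutes of core time per month, but for 63 hours of non-core time.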
SLA – Service level Agreement
An implementation of an SLA should specify:
•Purpose: Objectives to achieve by using an SLA.
•Restrictions: Necessary steps or actions that need
to be taken to ensure that the requested level of
service is delivered.
•Validity Period: Period of time during which the
SLA is valid.
•Scope: Services that will be delivered to the
consumer and services that are outside the SLA.
•Parties: Any involved organizations or individuals
and their roles (e.g. provider, consumer).

SLA – Service level Agreement
An implementation of an SLA should specify:
•Service-Level Objectives (SLOs): Levels of
services on which both parties agree. These are
expressed by means of service-level indicators such
as availability, performance, and reliability.
•Penalties: The penalties that will occur if the
delivered service does not achieve the defined
SLOs.
•Optional Services: Services that are not
mandatory but might be required.
•Administration: Processes that are used to
guarantee that SLOs are achieved and the related
organization is responsible for controlling these
processes.
Example : Amazon EC2 SLA

Amazon EC2 SLA (contd..)

Amazon S3 SLA

Amazon S3 SLA

Microsoft Azure SLA

Microsoft Azure SLA

Key components of SLA:
 Service Level Parameter: Describes an
observable property of a service whose value is
measurable (reasonable, attainable, enforceable,
measurable).
 Metrics: These are definitions of values of service
properties that are measured from a service
providing system (server uptime is 98% for a
period of 10 weeks).
 Function: A function specifies how to compute a
metric’s value from the values of other metrics and
constants.
 Measurement directives: These specify how to
measure a metric.
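
The relationship between measurements, a function, and a metric can be sketched as follows. The measurement directive is assumed to be a periodic health probe that records up or down; the function derives the uptime metric from those samples. All names and the sample data are hypothetical.

```python
def uptime_pct(samples):
    """Function: compute the uptime metric (percent) from raw probe results.
    Each sample is True if the service answered the probe, False otherwise."""
    if not samples:
        raise ValueError("no measurements collected")
    return 100.0 * sum(samples) / len(samples)


def meets_slo(metric_value: float, target_pct: float) -> bool:
    """Compare the computed metric against the agreed service level."""
    return metric_value >= target_pct


# Hypothetical measurements: 100 probes, 2 of which failed.
samples = [True] * 98 + [False] * 2

metric = uptime_pct(samples)
print(metric, meets_slo(metric, 98.0), meets_slo(metric, 99.9))
```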
Types of SLA

 Infrastructure SLA: The infrastructure provider
manages and offers guarantees on availability of
the infrastructure, namely, server machine, power,
network connectivity, etc.
 Application SLA: In the application co-location
hosting model, the server capacity is available to
the applications.

Infrastructure SLA

Application SLA

Life Cycle of SLA
Five phases in SLA life cycle:
1. Contract definition
2. Publishing and discovery
3. Negotiation
4. Operationalization: SLA operation consists of
– SLA Monitoring
– SLA Enforcement
– SLA Accounting
5. De-commissioning

Life Cycle of SLA (contd..)
1. Contract definition: Service providers define a set of service
offerings and corresponding SLAs using standard templates.

2. Publishing and discovery: The service provider advertises these base
service offerings through standard publication media. Customers can
search the competing offerings and shortlist a few that fulfill their
requirements for further negotiation.

3. Negotiation:
- For a standard packaged application offered as a service, this
phase is automated.
- For customized applications hosted on cloud platforms, this phase
is manual.
- The service provider analyzes the application’s behavior with
respect to scalability and performance before agreeing on the
specification of the SLA.
- At the end of this phase, the SLA is mutually agreed upon by the
customer and the service provider.
Life Cycle of SLA (contd..)
4. Operationalization: SLA operation consists of:
• SLA monitoring – measure parameter values, calculate the metrics
defined as part of the SLA, and determine the deviation.
• SLA accounting – capture and archive SLA adherence for compliance.
Report the application’s actual performance against the performance
guaranteed in the SLA, together with the penalties paid for each SLA
violation.
• SLA enforcement – take appropriate action when runtime monitoring
detects an SLA violation, such as notifying the concerned parties and
charging the penalties.
5. De-commissioning:
- All activities performed under a particular SLA are terminated
when the hosting relationship between the service provider and
the service consumer has ended.
- The SLA specifies the terms and conditions of contract termination
and the situations under which the relationship between a service
provider and a service consumer can be considered legally ended.
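
The three operationalization activities can be sketched as a small tracker. This is a minimal illustration under stated assumptions: the class `SlaTracker`, its fields, the flat per-violation penalty, and the sample measurements are all hypothetical and do not come from any provider's SLA.

```python
from dataclasses import dataclass, field


@dataclass
class SlaTracker:
    """Tracks one metric against its guaranteed value (hypothetical model)."""
    guaranteed: float                 # e.g. minimum availability in percent
    penalty_per_violation: float      # assumed flat penalty per violation
    records: list = field(default_factory=list)   # accounting archive

    def monitor(self, measured: float) -> float:
        """Monitoring: return how far the metric falls below the guarantee."""
        return max(0.0, self.guaranteed - measured)

    def account(self, measured: float) -> dict:
        """Accounting: archive actual vs. guaranteed value and any penalty."""
        violated = self.monitor(measured) > 0
        record = {
            "measured": measured,
            "guaranteed": self.guaranteed,
            "violated": violated,
            "penalty": self.penalty_per_violation if violated else 0.0,
        }
        self.records.append(record)
        return record

    def enforce(self, record: dict) -> str:
        """Enforcement: choose the action for an archived record."""
        if record["violated"]:
            return f"notify parties; charge penalty {record['penalty']}"
        return "no action"


tracker = SlaTracker(guaranteed=99.9, penalty_per_violation=50.0)
for measured in (99.95, 99.5):
    print(tracker.enforce(tracker.account(measured)))
```

Keeping the archive in `records` is what lets the accounting step report adherence later, independently of the enforcement action taken at the time.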
SLA Management in Cloud
SLA management of applications hosted on cloud platforms
involves five phases:
• Feasibility Analysis
– Technical feasibility
– Infrastructure feasibility
– Financial feasibility
• On-Boarding of Application
– Moving an application to the hosting platform of the managed service
provider (MSP) is called on-boarding.
• Preproduction
– The application is hosted in a simulated production environment.
• Production
– The application is made accessible to its end users under the agreed SLA.
• Termination
– When the customer wishes to withdraw the hosted application and no
longer wants the MSP to manage its hosting, the termination activity
is initiated.
– On initiation of termination, all data related to the application are
transferred to the customer, and only the essential information is
retained for legal compliance.
TRADITIONAL APPROACHES TO SLO MANAGEMENT

• To provide guaranteed quality of service (QoS) for hosted web
applications, the following mechanisms are used:
– Load balancing techniques
– Admission control mechanisms

• Load balancing distributes the incoming requests across a set of
physical machines, each hosting a replica of an application.

Load Balancing Algorithms

Load Balancing Algorithms (contd..)
• Class-agnostic
– The front-end node is aware of neither the type of client from which
the request originates nor the category (e.g., browsing, selling,
payment) to which the request belongs.
• Class-aware
– The front-end node must additionally inspect the type of client making
the request and/or the type of service requested before deciding which
back-end node should service the request.
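
The difference between the two families can be sketched with a front-end routing function. This is an illustrative sketch only: the node names, the `category` field, and the choice of round-robin as the class-agnostic policy are assumptions, not something the slides prescribe.

```python
from itertools import cycle

BACKENDS = ["node-1", "node-2", "node-3"]   # hypothetical back-end nodes

# Class-agnostic: plain round-robin that never looks at the request itself.
_round_robin = cycle(BACKENDS)


def route_class_agnostic(request: dict) -> str:
    return next(_round_robin)


# Class-aware: inspect the request category before choosing a back-end node.
CATEGORY_POOLS = {
    "payment": ["node-1"],                  # assumed dedicated payment node
    "browsing": ["node-2", "node-3"],
}
_pool_cycles = {cat: cycle(nodes) for cat, nodes in CATEGORY_POOLS.items()}


def route_class_aware(request: dict) -> str:
    category = request.get("category")
    if category in _pool_cycles:
        return next(_pool_cycles[category])
    return route_class_agnostic(request)    # unknown category: fall back


requests = [{"category": "browsing"}, {"category": "payment"}, {"category": "browsing"}]
agnostic = [route_class_agnostic(r) for r in requests]
aware = [route_class_aware(r) for r in requests]
print(agnostic)   # ['node-1', 'node-2', 'node-3']
print(aware)      # ['node-2', 'node-1', 'node-3']
```

The class-aware router pays the cost of inspecting each request, but in return it can keep, say, payment traffic away from nodes that are busy with browsing.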

Admission Control

• Admission control algorithms decide which requests should be admitted
into the application server when the server experiences very heavy
load.
• During overload, the response time of all requests would degrade if
every arriving request were admitted. It is therefore preferable to
admit only a subset of requests, chosen so that the overall payoff is
high.
• The objective of admission control mechanisms is thus to police the
incoming requests and identify the subset that can be admitted when
the system is overloaded.

Admission Control Mechanisms

• Request-based admission control algorithms reject new requests if the
servers are running at capacity.
– Disadvantage: a client’s session may consist of multiple related
requests, so some requests of a session may be rejected even though
others in the same session are honored.
• Session-based admission control mechanisms try to ensure that longer
sessions are completed, and reject any new sessions.

• QoS-aware: requests from low-priority users and requests that are
likely to consume more system resources can be rejected.
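
The three admission policies can be contrasted in a toy controller. This is a simplified sketch under several assumptions: `AdmissionController`, the `capacity` counter, the `priority` labels, and `cost_limit` are hypothetical, and admitted requests never complete, so the server stays overloaded once capacity is reached.

```python
class AdmissionController:
    """Toy admission controller; capacity counts requests in service at once."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.in_service = 0
        self.admitted_sessions = set()

    def overloaded(self) -> bool:
        return self.in_service >= self.capacity

    def admit_request_based(self, session_id: str) -> bool:
        # Request-based: reject every new request once capacity is reached,
        # even if it belongs to a session that was served earlier.
        if self.overloaded():
            return False
        self.in_service += 1
        return True

    def admit_session_based(self, session_id: str) -> bool:
        # Session-based: under overload only NEW sessions are rejected;
        # requests of already admitted sessions are let through.
        if session_id not in self.admitted_sessions:
            if self.overloaded():
                return False
            self.admitted_sessions.add(session_id)
        self.in_service += 1
        return True

    def admit_qos_aware(self, priority: str, estimated_cost: int,
                        cost_limit: int = 5) -> bool:
        # QoS-aware: under overload, reject low-priority or expensive requests.
        if self.overloaded() and (priority == "low" or estimated_cost > cost_limit):
            return False
        self.in_service += 1
        return True


arrivals = ["s1", "s1", "s2", "s1"]          # session id of each arriving request

rb = AdmissionController(capacity=2)
request_based = [rb.admit_request_based(s) for s in arrivals]

sb = AdmissionController(capacity=2)
session_based = [sb.admit_session_based(s) for s in arrivals]

print(request_based)   # [True, True, False, False]
print(session_based)   # [True, True, False, True]
```

The last arrival shows the drawback of the request-based policy: the third request of session `s1` is dropped after two were served, whereas the session-based policy lets `s1` finish and turns away only the new session `s2`.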

Automated Policy-based Management

• The parameters often used to prioritize actions and resolve resource
contention are:
– The SLA class (Platinum, Gold, Silver, etc.) to which the application
belongs.
– The amount of penalty associated with an SLA breach.
– Whether the application is at the threshold of breaching the SLA.
– Whether the application has already breached the SLA.
– The number of applications belonging to the same customer that have
breached their SLA.
– The number of applications belonging to the same customer that are
about to breach their SLA.
– The type of action to be performed to rectify the situation.
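
One way such parameters could be combined is as a sort key that orders applications for corrective action. The ordering used below (breached first, then near-breach, then SLA class, then penalty) is purely an assumption for illustration; the slides list the parameters but do not define how they are weighted. The application names and values are hypothetical.

```python
from dataclasses import dataclass

SLA_CLASS_RANK = {"Platinum": 0, "Gold": 1, "Silver": 2}   # lower = more important


@dataclass
class AppStatus:
    name: str
    sla_class: str
    penalty: float        # penalty associated with a breach
    breached: bool        # SLA already breached
    near_breach: bool     # at the threshold of breaching


def priority_key(app: AppStatus) -> tuple:
    # Tuples sort element by element; False sorts before True, so breached
    # applications come first, then near-breach ones, then higher SLA
    # classes, then larger penalties.
    return (not app.breached, not app.near_breach,
            SLA_CLASS_RANK[app.sla_class], -app.penalty)


apps = [
    AppStatus("crm", "Silver", 100.0, breached=True, near_breach=False),
    AppStatus("shop", "Platinum", 500.0, breached=False, near_breach=True),
    AppStatus("wiki", "Gold", 50.0, breached=False, near_breach=False),
    AppStatus("pay", "Platinum", 900.0, breached=True, near_breach=False),
]
ordered = [a.name for a in sorted(apps, key=priority_key)]
print(ordered)   # ['pay', 'crm', 'shop', 'wiki']
```

A real policy engine would likely also weigh the per-customer breach counts and the cost of the corrective action, which this sketch leaves out.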

Policy based Management System

Policy based Management System (example)
Consider a policy-based management system used in a cloud computing
environment where a load of 80% is considered optimal for physical servers.
There are 3 physical machines (servers), A, B and C, each with a CPU and memory
capacity of 100 units. The data center hosting A, B and C follows a Green
Computing methodology to conserve power, meaning that physical machines are not
turned on until absolutely required. The following figure shows the resource
allocation to different virtual machines in the data center.
Figure: Resource allocation to VMs in a data center

Answer the following:

1. Is the current load optimal for all the physical machines? If not, what can be
done to enforce the optimal load policy?
2. How can you allocate 20 more units of memory to VM4 dynamically? Show the
resource allocation diagram after allocation.
3. If the data center receives a request to provision a new virtual machine (VM8)
with CPU and memory requirements of 70 units each, which physical machine
will be used?

a.

b.

c.
None of the physical machines A, B, and C has enough free resources to
provision the new VM (VM8), so a new physical machine, D, needs to be
switched on to host it.
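
The reasoning behind answer (c) can be sketched as a first-fit placement check against the 80% policy. The per-machine usage figures below are hypothetical placeholders, since the allocation figure is not reproduced here; only the capacity (100 units), the 80% threshold, and the VM8 demand (70 units each) come from the example. The function names are likewise assumptions.

```python
OPTIMAL_LOAD = 0.80   # policy from the example: 80% of capacity
CAPACITY = 100        # CPU and memory units per physical machine

# Hypothetical (cpu_used, mem_used) per machine; NOT the values from the figure.
machines = {"A": (60, 50), "B": (40, 70), "C": (30, 20)}


def fits(used: tuple, demand: tuple) -> bool:
    """True if adding the demand keeps both CPU and memory within the policy."""
    limit = OPTIMAL_LOAD * CAPACITY
    return all(u + d <= limit for u, d in zip(used, demand))


def place(machines: dict, demand: tuple, spare: str = "D") -> str:
    """First-fit placement; the spare machine is powered on only as a last resort."""
    for name, used in machines.items():
        if fits(used, demand):
            return name
    return spare


vm8_host = place(machines, demand=(70, 70))
small_vm_host = place(machines, demand=(10, 10))
print(vm8_host)        # D
print(small_vm_host)   # A
```

Because a 70-unit VM fits under the 80% limit only on a machine using at most 10 units of each resource, any realistic load on A, B and C forces the spare machine D to be switched on.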

Bibliography
• Buyya R., Broberg J., Goscinski A., Cloud Computing: Principles and
Paradigms. Wiley; 2013.
• Recorded Lectures.
• Dinkar Sitaram, Geetha Manjunath, Moving to the Cloud: Developing Apps
in the New World of Cloud Computing; 2012.
• Internet Sources.
Suggested Reading:
• Cloud Computing Black Book (Chapters 10, 11, and 18)
