
CloudStack Overview

Written by: Chiradeep Vittal, Alex Huang @ Citrix Revised by: Gavin Lee, Zhennan Sun @ TCloud Computing

Outline
Overview of CloudStack
Problem Definition
Feature set overview
Network
Storage
MS internals
System VMs
System Interactions
Roadmap
Comparisons

What is CloudStack?
Multi-tenant cloud orchestration platform
Turnkey
Hypervisor agnostic
Scalable
Secure
Open source, open standards
Deploys on premise or as a hosted solution
BSS, self service portal. (Not ASL)
Extensive networking service

Build your cloud the way the world's most successful clouds are built

Deliver cloud services faster and cheaper

CloudStack Supports Multiple Cloud Strategies


Private Clouds
On-premise Enterprise Cloud: dedicated resources; security & total control; internal network; managed by enterprise or 3rd party
Hosted Enterprise Cloud: dedicated resources; security; SLA bound; 3rd party owned and operated

Public Clouds
Multi-tenant Public Cloud: mix of shared and dedicated resources; elastic scaling; pay as you go; public internet, VPN access

CloudStack Provides On-demand Access to Infrastructure Through a Self-Service Portal


Org A Org B

Admin

Admin Users

Users

End User

Users

Compute
Admin

Network

Storage

Open Flexible Platform


Compute Hypervisor
XenServer VMware Oracle VM KVM Bare metal

Storage

Block & Object


Local Disk iSCSI
Primary Storage

Fiber Channel

NFS

Swift
Secondary Storage

Network

Network & Network Services


Connection Type Isolation Firewall Load balancer VPN

Problem Definition
Offer a scalable, flexible, manageable IAAS platform that follows established cloud computing paradigms

IAAS
Orchestrate physical and virtual resources to offer self-service infrastructure provisioning and monitoring

Scalable
1 -> N hypervisors / VMs / virtual resources
1 -> N end users

Flexible
Handle new physical resource types
Hypervisors, storage, networking

Add new APIs
Add new services
Add new network models

Problem Definition (cont'd)


Manageable
Hide complexity of underlying resources
Rich functional end-user and admin UI
Admin API to automate operations
Easy install, upgrade for small -> large clouds
Simple scaling, automated resilience

Established Paradigms
EC2 inspired
Semantic variations based on cloud provider needs, hypervisor capabilities

Feature Set Overview

Create Custom Virtual Machines via Service Offerings


Select Operating System Windows, Linux

Select Compute Offering CPU & RAM

Select Disk Offering Volume Size

Select Network Offering Network & Services

Create VM

Dashboard Provides Overview of Consumed Resources

Running, Stopped & Total VMs Public IPs Private networks Latest Events

Virtual Machine Management

Users

VM Operations

VM Access

VM Status

Change Service Offering

Start Stop Restart Destroy

CPU Utilized Network Read Network Writes

2 CPUs 1 GB RAM 20 GB 20 Mbps

4 CPUs 4 GB RAM 200 GB 100 Mbps

Volume & Snapshot Management


Add / Delete Volumes
VM 1
Volume

Create Templates from Volumes

Volume

Template

Schedule Snapshots

Hourly

Weekly Monthly

Now
Daily

View Snapshot History

Network & Network Services


Create Networks and attach VMs
Acquire public IP address for NAT & load balancing
Control traffic to VM using ingress and egress firewall rules
Set up rules to load balance traffic between VMs

CloudStack Deployment Architecture


CloudStack Management Server Zone 1 L3 core

Internet

Hypervisor is the basic unit of scale.
Cluster consists of one or more hosts of the same hypervisor
All hosts in a cluster have access to shared (primary) storage

Pod 1

Access Layer

Pod N

.
Cluster N

Secondary Storage

Pod is one or more clusters, usually with L2 switches.
Availability Zone has one or more pods, has access to secondary storage.
One or more zones represent cloud

.
Cluster 1 Host 1 Host 2
Primary Storage

CloudStack Cloud Architecture


[Figure: a cloud spanning several data centers, each hosting one or more zones]

CloudStack Cloud can have one or more Availability Zones (AZ).

Management Server Managing Multiple Zones


[Figure: a management server in Data Center 1 managing zones spread across several data centers]

Single Management Server can manage multiple zones
Zones can be geographically distributed, but low latency links are expected for better performance
Single MS node can manage up to 10K hosts.
Multiple MS nodes can be deployed as a cluster for scale or redundancy

Management Server Deployment Architecture


Single-node Deployment Multi-node Deployment

MS

User API
MS
MySQL DB

User API
Load Balancer MS

Admin API

Admin API
MS
MySQL DB Back Up Replication DB

Infrastructure Resources

MS is stateless.
MS can be deployed as physical server or VM
Single MS node can manage up to 10K hosts. Multiple nodes can be deployed for scale or redundancy
Commercial: RHEL 5.4+; FOSS: Ubuntu 10.04, Fedora 16

CloudStack Storage
Primary Storage
Configured at Cluster-level. Close to hosts for better performance
Stores all disk volumes for VMs in a cluster
Cluster can have one or more primary storages
Local disk, iSCSI, FC or NFS
Cluster 1 Host 1 Pod 1 L2 switch
Secondary Storage

L3 switch

Secondary Storage
Configured at Zone-level
Stores all Templates, ISOs and Snapshots
Zone can have one or more secondary storages
NFS, OpenStack Swift
Host 2

Primary Storage

Core CloudStack Components


Hosts
Servers onto which services will be provisioned
VM

Host
VM

Primary Storage
VM storage

Host
Primary Storage

Cluster
A grouping of hosts and their associated storage

Pod
Collection of clusters

Cluster
Secondary Storage Network

Network
Within the same L2 switch

Cluster

Secondary Storage
Template, snapshot and ISO storage

CloudStack Pod CloudStack Pod Zone

Zone
Collection of pods, network offerings and secondary storage

Management Server Farm


Responsible for all management and provisioning tasks

Understanding the Role of Storage and Templates


Primary Storage
Cluster level storage for VMs
Connected directly to hosts
NFS, iSCSI, FC and Local
Host Host
Primary Storage

Secondary Storage
Zone level storage for templates, ISOs and snapshots
NFS or OpenStack Swift via CloudStack System VM

Cluster Pod

Templates and ISOs


Imported into CloudStack
Can be private or public
Template

Secondary Storage

Zone

Provisioning Process
1. User Requests Instance
2. Provision Optional Network Services
3. Copy instance template from secondary storage to primary storage on appropriate cluster
4. Create any requested data volumes on primary storage for the cluster
5. Create instance
6. Start instance

Secondary Storage
Template
VM

Host Host
Primary Storage

Cluster Pod

Zone

Citrix XenServer
Integrates directly with XenServer Pool Master
Snapshots at host level
System VM control channel at host level
Network management is host level
CloudStack Manager

XenServer Pool Master Host XenServer Host XenServer Host XenServer Host XenServer Host XenServer Resource Pool

Oracle VM
Integrates with ovs-agent
Snapshots at host level
System VM control channel at host level
Network management is host level
Does not use OVM Manager
All templates must be from Oracle
CloudStack configures ocfs2 nodes
Requires helper cluster
XenServer, KVM or vSphere
OVS Agent

CloudStack Manager

OVM Host
OVS Agent

OVM Host
OVS Agent

OVM Host
OVS Agent

OVM Host

Red Hat Enterprise Linux (KVM)


Integrates with libvirt using Cloud Agent
Snapshots at host level
System VM control channel at host level
Network management is host level
Only RHEL 6, not RHEV
Also supports Ubuntu 10.04
Cloud Agent Libvirt

CloudStack Manager

KVM Host

Cloud Agent Libvirt

KVM Host

VMware vSphere
Integration through vCenter
System VM control channel via CloudStack private network
Snapshot and volume management via Secondary Storage VM
Networking via vSphere vSwitch
CloudStack Manager vSphere Host vCenter vSphere Host vSphere Cluster vSphere Host vSphere Host vSphere Host vSphere Cluster Data Center

Management Server Interaction with Hypervisors


Management Server

XAPI

HTTPS

vCenter XenServer ESX


XenServer: XS 5.6, 5.6FP1, 5.6 SP2, 6.0; Incremental Snapshots; VHD; NFS, iSCSI, FC & Local disk; Storage over-provisioning: NFS
ESX: ESX 4.1, 5.0 (coming); Full Snapshots; VMDK; NFS, iSCSI, FC & Local disk; Storage over-provisioning: NFS, iSCSI

Agent

Agent

KVM

OVM

RHEL 6.0, 6.1, 6.2 (coming); Full Snapshots (not live); QCOW2; NFS, iSCSI & FC; Storage over-provisioning: NFS

OVM 2.2; No Snapshots; RAW; NFS & iSCSI; No storage over-provisioning

Multi-tenancy & Account Management


Cloud
Domain Org A

Resources
VMs, IPs, Snapshots

Admin
Domain Reseller A

Domain is a unit of isolation that represents a customer org, business unit or a reseller
Domain can have arbitrary levels of subdomains
A Domain can have one or more accounts
An Account represents one or more users and is the basic unit of isolation
Admin can limit resources at the Account or Domain levels

Sub-Domain Org C

Admin

Resources
VMs, IPs, Snapshots

Admin
Account Group A


Account Group B

User 1 User 2

CloudStack Network

Network Terminology
Traffic type
Guest: The tenant network to which instances are attached
Storage: The physical network which connects the hypervisor to primary storage
Management: Control plane traffic between CloudStack management server and hypervisor clusters
Public: Outside the cloud [usually Internet]. Shared public VLANs trunked down to all hypervisors

Network type
Shared, same subnet for different users
Direct: 1 subnet
Direct tagged: VLAN, multiple subnets

Isolated, different subnet for different users
Virtual (tagged)

All traffic can be multiplexed on to the same underlying physical network using VLANs
Usually Management network is untagged
Storage network usually on separate NIC (or bond)

Admin informs CloudStack how to map these network types to the underlying physical network
Configure traffic labels on the hypervisor
Configure traffic labels on Admin UI

VM Instance
Choose the instantiated guest network
IP is arbitrary

Guest Network
Instance of Network Offering
Shared: created by Admin
Isolated: created and owned by user
One virtual router for one network
Cross pod, within Zone
VLAN id picked from the pool

Physical Network
Zone level
Defined by NIC
Assigned with traffic type (P, G, M, S)
Associated by label/vswitch name
Attached with device as service provider

Network Offering
Only for Guest traffic
Guest network type: Shared or Isolated
Defines a set of network services, such as DHCP, Firewall, VPN, NAT
Bandwidth

Tag

Physical Network
Operations Admin and Cloud API
CloudStack MS Cluster

Users

Router
MySQL

Load Balancer L3 Core Switch Access Layer Switches

Availability Zone

Servers

Pod 2

Pod 3

Pod N

Secondary Storage

Pod 1

Network Isolation

[Figure: Web VMs placed in a Web Security Group and DB VMs placed in a DB Security Group]

Network Isolation (Security Group, L3)


[Figure: Pods 1, 2 and 3, each behind an L2 switch (10.1.0.1, 10.1.8.1 and 10.1.16.1), connect through an L3 core switch and a load balancer to the public Internet. VMs of Guest 1 and Guest 2 are mixed within the pods, for example 10.1.0.2 to 10.1.0.4 in Pod 1 and 10.1.16.12, 10.1.16.21, 10.1.16.47 and 10.1.16.85 in Pod 3.]

Network Isolation (VLAN, L2)


[Figure: Pods K, M and N connect to the core (L3) network through access switches. Hypervisors in the clusters host tenant VMs (V) and tenant virtual routers (R); VLAN 101 traffic and VLAN 102 traffic are carried separately between them.]

Guest virtual network


[Figure: two guest virtual networks reach the public Internet through their own virtual routers. Each network uses 10.1.1.0/24 with gateway address 10.1.1.1, and each virtual router provides NAT, DHCP, Load Balancing and VPN. Guest 1 (VMs 1 to 4, guest addresses 10.1.1.2 to 10.1.1.5) uses public IPs 65.37.141.11 and 65.37.141.36; Guest 2 (VMs 1 to 3, guest addresses 10.1.1.2 to 10.1.1.4) uses public IPs 65.37.141.24 and 65.37.141.80.]

Guest Virtual Network With Physical Device


[Figure, two panels, both showing Guest VMs 1 to 4 on guest virtual network 10.1.1.1/8, VLAN 100, connected to the public network/Internet.
Panel "CS Virtual Router provides Network Services": the CS Virtual Router (gateway address 10.1.1.1, public IP 65.37.141.11) provides DHCP, DNS, NAT, Load Balancing and VPN.
Panel "External Devices provide Network Services": a Juniper SRX Firewall and a NetScaler Load Balancer (public IPs 65.37.141.111 and 65.37.141.112, private IPs 10.1.1.111 and 10.1.1.112) sit in front of the guest network, and the CS Virtual Router provides DHCP and DNS.]

Layer-3 Guest Network


[Figure, two panels: "Network Services Managed Externally" and "Network Services Managed by CS". Guest VMs 1 to 4, split between Security Group 1 and Security Group 2, sit on public network 65.11.0.0/16 (addresses 65.11.1.2 to 65.11.1.5) behind an L3 switch. A NetScaler Load Balancer provides EIP and ELB, and a CS Virtual Router provides DHCP and DNS.]

Multi-tier network
[Figure: three virtual networks (10.1.1.0/24 on VLAN 100, 10.1.2.0/24 on VLAN 1001, 10.1.3.0/24 on VLAN 141) host Web VMs 1 to 4, App VMs 1 and 2, and DB VM 1. A Juniper SRX firewall and a Netscaler load balancer connect the web tier to the public network/Internet. CS Virtual Routers provide DHCP, DNS and Userdata on the networks, with Source-NAT and VPN on one of them.]

Multi-tier unified [vision]


[Figure: customer premises connect over the Internet through an IPSec or SSL site-to-site VPN and a load balancer to a CS Virtual Router (or other device). Behind it sit Web VMs 1 to 4 on virtual network 10.1.1.0/24 (VLAN 100), App VMs 1 and 2 on 10.1.2.0/24 (VLAN 1001) and DB VM 1 on 10.1.3.0/24 (VLAN 141), plus a monitoring VLAN.]

Virtual Router Services
IPAM
DNS
LB [intra]
S-2-S VPN
Static Routes
ACLs
NAT, PF
FW [ingress & egress]
BGP

Multi-tier unified with SDN[vision]


[Figure: customer premises connect over the Internet through an IPSec or SSL site-to-site VPN and a load balancer virtual appliance to a CS Virtual Router (or other device). Behind it sit Web VMs 1 to 4 on overlay network 10.1.1.0/24, App VMs 1 and 2 on overlay network 10.1.2.0/24 and DB VM 1 on overlay network 10.1.3.0/24, plus a monitoring VLAN.]

Virtual Router Services
IPAM
DNS
LB [intra]
S-2-S VPN
Static Routes
ACLs
NAT, PF
FW [ingress & egress]
BGP

Network Offerings
Cloud provider defines the feature set for guest networks
Toggle features or service levels
Security groups on/off
Load balancer on/off
Load balancer software/hardware
VPN, firewall, port forwarding

User chooses network offering when creating network
Enables upgrade between network offerings
Default offerings built-in
For classic CloudStack networking

CloudStack Storage

Storage
Primary Storage
Zone-Level Layer 3 Switch Private Network

Pod 1 Pod-Level Layer-2 Switch

Pod 2

Pod N Scale-Out NFS

Block device to the VM
IOPs intensive
Accessible from host or cluster wide
Supports storage tiering

WORM Storage
Secondary Storage or Object Store for templates, ISO, and snapshot archiving
High capacity

Cluster 2

Computing Server 1

Primary Storage

Computing Server 2

Primary Storage Scale-Out NFS Primary Storage

Computing Server 3
Cluster 1

Computing Server 4

CloudStack manages the storage between the two to achieve maximum benefit and resiliency

Primary Storage Support Matrix


Type            XenServer   VMware      KVM
Local Disk      Supported   Supported   Supported
iSCSI           Supported   Supported   Not Supported
Fiber Channel   Supported   Supported   Not Supported
NFS             Supported   Supported   Supported

Storage Tagging
Supported via storage tags for primary storage
Specify a tag when adding a storage pool
Specify a tag when adding a disk offering
Only storage pools with the tag will be allocated for the volume
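The tag matching described above can be sketched as follows. This is a minimal illustration, not CloudStack code: the class and method names are hypothetical, and the rule that an offering without a tag matches every pool is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of storage-tag matching; not a CloudStack class.
class StorageTagSketch {
    // pools maps a pool name to the tags given when the pool was added.
    // Returns the pools that may hold a volume created from a disk offering
    // carrying offeringTag. Assumption: a null tag matches every pool.
    static List<String> eligiblePools(Map<String, Set<String>> pools, String offeringTag) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, Set<String>> pool : pools.entrySet()) {
            if (offeringTag == null || pool.getValue().contains(offeringTag)) {
                eligible.add(pool.getKey());
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> pools = new LinkedHashMap<>();
        pools.put("nfs-1", Set.of("bulk"));
        pools.put("iscsi-1", Set.of("ssd"));
        System.out.println(eligiblePools(pools, "ssd"));
    }
}
```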

WORM Storage
Write Once Read Many storage pattern is supported by two different storage types
Secondary Storage (NFS Server within an availability zone)
Object Store (Swift implementation for cross-zone)

Objective for WORM storage


High capacity, cheap storage
Easy to increase capacity

Used to store templates, ISOs, and snapshots

Snapshots
Snapshots are used as backups for DRS
Taken on the primary storage and moved to secondary storage
Supports individual snapshots and recurring snapshots
Full snapshots on VMware and KVM. Need help.
Incremental snapshots on XenServer
Allows backup network traffic to be specified in zone to segregate the backup network traffic from other network traffic types

MS Internals
Architecture
Workflow
High Availability
Scalability

Inside a Management Server

Cmds

cmd.execute()

Plugins Plugins Plugins

CS API

API Servlet

Async Job Queue Mgr

Services API

Responses

Kernel
Agent API (Commands)
Agent Manager Local

Resources

Or Remote

Hypervisor Native APIs

Network Device API

MySQL

Old Architecture
EC2 CloudStack

API Layer

Access Control
Virtual Machine Manager Console Proxy Manager

F5 Resource
NetScaler Resource

Agent Manager
XenServer Resource
KVM Resource
SRX Resource
Other Resources

Pros
Agile development for existing developers
Scales well horizontally
Cons
Monolithic
Difficult to educate new and third-party developers
Easy to introduce bugs

Async Job Manager

Snapshot Manager


Template Manager

Network Manager

Storage Manager

New Deployment Architecture


Scales horizontally to different pressure points
Automatically scales service VMs in zones to facilitate most efficient data path transfers
Fault isolation between API servers and Execution Servers and resources within zones

New Architecture API Server


UI Cloud Portal CLI Other Clients

REST API Server Pluggable API Engine


OAM&P API End User API EC2 API Other APIs Integration

Management Services - Resource management - Configuration - Additional operations added by third party -

ACL & Authentication - Accounts, Domains, and Projects - ACL, limits checking

Framework Job Queue Database Access Layer OSGi

API Server isolates integration code from Execution Server
API Server can horizontally scale to handle traffic
Easily adds other API compatibility
Easily exposes API needed by third party vendors

New Architecture Execution Server


Execution Server
Services API

Kernel
Drives long running VM operations
Syncs between resources managed and DB
Generates events

Plugins
Storage Handling
Network Handling
Deployment planning
Hypervisor Handling

Framework
Cluster Management
Job Management
Alert & Event Management
Database Access Layer
Messaging Layer
Component Framework (OSGi)
Transaction Management

Execution Server protected by job queue
Kernel kept small for stability. It only drives processes.
Plugins provide mappings of virtual entities to physical resources
Third party plugins to provide vendor differentiation in CloudStack
Communicates with resources within data center over message bus

New Architecture Resources


Agent
Hypervisor Resources Network Resources

Storage Resources

Image & Template Resources Snapshot Resources

Resources are carried in service VMs to be in close network proximity to the physical resources they manage
Easily scales to utilize the most abundant resource in data center (CPU & RAM)
Communicates with Execution Server over message bus (JSON)
Can be replicated for fault tolerance

UI

Cloud Portal

CLI

Other Clients

Management Server
REST API
OAM&P API Console Proxy Management Template Access HA Usage Calculations Additional Services End User API EC2 API Other APIs Pluggable Service API Engine Security Adapters Account Management Connectors
Plugin API

ACL & Authentication Accounts, Domains, and Projects ACL, limits checking Services API

Deployment Planning Network Configurations Network Elements Hypervisor Gurus

Kernel
Drives long running VM operations Syncs between resources managed and DB Generates events
Services API

Cluster Management

Resource Management

Job Management

Alert & Event Management

Database Access

Event Bus Message Bus Hypervisor Resources Network Resources Storage Resources Image Resources Snapshot Resources

Kernel Module
Understands how to orchestrate long running processes (e.g. VM starts, Snapshot copies, Template propagation)
Well defined process steps
Calls Plugin API to execute functionalities that it needs

Plugins
Various ways to add more capability to CloudStack
Implements clearly defined interfaces
All operations must be idempotent
All calls are at transaction boundaries
Compiles only against the Plugin API module

Anatomy of a Plugin

Rest API
Optional. Required only if the plugin needs to expose a configuration API to the admin.

ServerResource
Optional. Required if the plugin needs to be colocated with the resource
Implements translation layer to talk to resource
Communicates with server component via JSON

Plugin API

Implementation

Data Access Layer

Anatomy of a Plugin
Can be two jars: server component to be deployed on management server and an optional ServerResource component to be deployed colocated with the resource
Server component can implement multiple Plugin APIs to affect its feature
Can expose its own API through Pluggable Service so administrators can configure the plugin
As an example, OVS plugin actually implements both NetworkGuru and NetworkElement

Plugin Interfaces Available


NetworkGuru: Implements various network isolation technologies and IP address technologies
NetworkElement: Facilitates network services on network elements to support a VM (e.g. DNS, DHCP, LB, VPN, Port Forwarding, etc.)
DeploymentPlanner: Different algorithms to place a VM and volumes
Investigator: Ways to find out if a host is down or VM is down
Fencer: Ways to fence off a VM if the state is unknown
UserAuthenticator: Methods of authenticating a user
SecurityChecker: ACL access
HostAllocator: Provides different ways to allocate host
StoragePoolAllocator: Provides different ways to allocate volumes

Adding a Plugin to CloudStack


Components are configured through components.xml
Supports DAO, Manager, and Adapter patterns
Open to other component frameworks (OSGi a possibility)

Components.xml Example
<components.xml>
    <system-integrity-checker class="com.cloud.upgrade.DatabaseUpgradeChecker">
        <checker name="ManagementServerNode" class="com.cloud.cluster.ManagementServerNode"/>
        <checker name="EncryptionSecretKeyChecker" class="com.cloud.utils.crypt.EncryptionSecretKeyChecker"/>
        <checker name="DatabaseIntegrityChecker" class="com.cloud.upgrade.DatabaseIntegrityChecker"/>
        <checker name="DatabaseUpgradeChecker" class="com.cloud.upgrade.PremiumDatabaseUpgradeChecker"/>
    </system-integrity-checker>
    <interceptor library="com.cloud.configuration.DefaultInterceptorLibrary"/>
    <management-server class="com.cloud.server.ManagementServerExtImpl" library="com.cloud.configuration.PremiumComponentLibrary">
        <adapters key="com.cloud.storage.allocator.StoragePoolAllocator">
            <adapter name="LocalStorage" class="com.cloud.storage.allocator.LocalStoragePoolAllocator"/>
            <adapter name="Storage" class="com.cloud.storage.allocator.FirstFitStoragePoolAllocator"/>
        </adapters>
        <pluggableservice name="VirtualRouterElementService" key="com.cloud.network.element.VirtualRouterElementService" class="com.cloud.network.element.VirtualRouterElement"/>
    </management-server>
</components.xml>

ServerResource
Translation layer between CloudStack commands and resource API
May be co-located with resource
Has no access to DB
API defined in JSON messages

DAO
SQL generation done mostly in GenericDaoBase
Uses JPA annotations
Very little code to write for each individual DAO
Database Access Layer for Kernel
No support for more complicated features such as fetch strategy
Other modules are welcome to use other types of ORM, but we would like to hear about the preferred library. (Hibernate is out due to licensing issues)

Example DAO
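// ExampleVO.java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

// Value object mapped to the "example" table through JPA annotations
@Entity
@Table(name="example")
public class ExampleVO {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    @Column(name="id")
    long id;

    @Column(name="name")
    String name;

    @Column(name="value")
    String value;
}

// ExampleDao.java
import com.cloud.utils.db.GenericDao;

// Inherits the generic CRUD operations; no extra methods are needed
public interface ExampleDao extends GenericDao<ExampleVO, Long> {
}

// ExampleDaoImpl.java
import javax.ejb.Local;
import com.cloud.utils.db.GenericDaoBase;

// SQL generation is inherited from GenericDaoBase
@Local(value=ExampleDao.class)
public class ExampleDaoImpl extends GenericDaoBase<ExampleVO, Long> implements ExampleDao {
    protected ExampleDaoImpl() {
    }
}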

Sequence Flow for deploy VM


Participants: End User, Rest API, Security Checkers, User VM Mgr, Kernel, VirtualMachine Mgr, Network Mgr, Storage Mgr, Network Guru, Job Scheduling

Deploy VM
ACL Checks
Allocate Entity in CS
Allocate VM
Allocate NIC
Allocate IP
Allocate Volume
Schedules Deploy Job
Returns with job id, VM id

Query Job Result
Returns with job status
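The pattern in this flow, where the deploy call returns a job id at once and the client later queries the job result, can be sketched as below. This is a minimal illustration under stated assumptions: AsyncJobSketch, schedule, queryResult and the status strings are hypothetical names, not the CloudStack API.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of "schedule a job, return its id, poll for the result".
class AsyncJobSketch {
    private final ExecutorService jobThreads = Executors.newFixedThreadPool(2);
    private final Map<Long, Future<String>> jobs = new ConcurrentHashMap<>();
    private final AtomicLong nextJobId = new AtomicLong(1);

    // Queues the work and returns immediately with a job id.
    long schedule(Callable<String> work) {
        long jobId = nextJobId.getAndIncrement();
        jobs.put(jobId, jobThreads.submit(work));
        return jobId;
    }

    // Reports PENDING until the job thread has finished the work.
    String queryResult(long jobId) {
        Future<String> job = jobs.get(jobId);
        if (job == null) return "UNKNOWN_JOB";
        if (!job.isDone()) return "PENDING";
        try {
            return job.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "FAILED";
        } catch (ExecutionException e) {
            return "FAILED";
        }
    }

    // Demo helper: stop accepting jobs and wait for queued ones to finish.
    void drain() {
        jobThreads.shutdown();
        try {
            jobThreads.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static String demo() {
        AsyncJobSketch api = new AsyncJobSketch();
        long jobId = api.schedule(() -> "vm deployed");
        api.drain();
        return jobId + ":" + api.queryResult(jobId);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A real client would poll queryResult on an interval instead of draining the pool; drain exists here only to make the demo deterministic.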

Sequence Flow for deploy VM


Participants: Job Threads, Services API, User VM Mgr, VirtualMachine Mgr, Network Mgr, Storage Mgr, Network Guru, Network Element, Template Mgr, Deployment Planner, Server Resource

Start VM
Start User VM
Start VM
Get a Deployment Plan (Host and StoragePool)
Prepare Nics
Reserve resources for Nic
Notify that Nic is about to be started in network
Agent Calls
Prepare Volumes
Prepare template on Primary Storage
Agent Calls
Agent Start VM Call
Stores job result

High Availability

High Availability
Service Offering contains a flag for whether HA should be supported for the VM
Does not use the native HA capability of hypervisors for XenServer and KVM
Uses adapters to fine tune HA process

Triggering High Availability


VM HA is triggered via the following methods:
VM Sync detects out-of-band VM changes
Resource Management detects that a resource is unreachable and its state cannot be determined.
VM start/stop has been sent to the resource but the resource does not return
Details of how high availability is done are at
http://docs.cloudstack.org/CloudStack_Documentation/Design_Documents/CloudStack_High_Availability__Developer's_Guide

High Availability
Has VM changed since work scheduled?
Yes

Cancel Work

Investigation
Uses investigators to find out if VM is alive or down
Each investigator returns one of three states: Up, Down, Unknown

No

No

Investigation Needed?

Yes Up

Failure

Start VM
Down

Is VM Up or Down?

Up

Is hypervisor host Up or Down?

Success

Up Down Unknown

Unknown

Down

Fencing
Uses fencers to fence off the VM from accessing storage to ensure VM is not corrupted
Each Fencer returns one of three states:
Fenced
Unable to Fence
Don't know how to fence

Completed Work Has more Investigators ?


Yes

No

Reschedule Work

Yes

Fence off VM?

No

More Fencers?
Yes No

Restart
Restarts the VM
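One simplified reading of the investigation, fencing and restart stages is sketched below. It assumes that investigators are consulted in order until one gives a definite answer, that a VM known to be down is restarted directly, and that a VM in an unknown state is restarted only after some fencer reports Fenced. All type and method names are hypothetical; the deck's flowchart has more branches (for example the host check) than this sketch covers.

```java
import java.util.List;

// Hypothetical sketch of the HA decision; not CloudStack code.
class HaDecisionSketch {
    enum VmState { UP, DOWN, UNKNOWN }
    enum FenceResult { FENCED, UNABLE_TO_FENCE, DONT_KNOW_HOW_TO_FENCE }
    enum Action { COMPLETE_WORK, RESTART_VM, RESCHEDULE_WORK }

    // investigatorAnswers and fencerAnswers hold what each adapter would
    // return, in the order the adapters are consulted.
    static Action decide(List<VmState> investigatorAnswers, List<FenceResult> fencerAnswers) {
        VmState state = VmState.UNKNOWN;
        for (VmState answer : investigatorAnswers) {
            if (answer != VmState.UNKNOWN) {
                state = answer; // first definite answer wins
                break;
            }
        }
        if (state == VmState.UP) return Action.COMPLETE_WORK;
        if (state == VmState.DOWN) return Action.RESTART_VM;
        // State unknown: the VM must be fenced off before it may be restarted.
        for (FenceResult result : fencerAnswers) {
            if (result == FenceResult.FENCED) return Action.RESTART_VM;
        }
        return Action.RESCHEDULE_WORK;
    }

    public static void main(String[] args) {
        System.out.println(decide(List.of(VmState.UNKNOWN, VmState.DOWN), List.of()));
    }
}
```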

Scalability

Current Status
10k resources managed per management server node
Scales out horizontally (must disable stats collector)
Real production deployment of tens of thousands of resources
Internal testing with software simulators up to 30k physical resources with 300k VMs managed by 4 management server nodes
We believe we can at least double that scale per management server node

Balancing Incoming Requests


Each management server has two worker thread pools for incoming requests: effectively two servers in one.
Executor threads provided by tomcat
Job threads waiting on job queue

All incoming requests that require mostly DB operations are short in duration and are executed by executor threads, because incoming requests are already load balanced by the load balancer
All incoming requests needing resources, which often have long running durations, are checked against ACL by the executor threads and then queued and picked up by job threads.
The number of job threads is scaled to the number of DB connections available to the management server
Requests may take a long time depending on the constraints of the resources, but they don't fail.
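The split between the two pools can be sketched as follows. This is an illustration only: the calling thread stands in for a tomcat executor thread, the fixed pool stands in for the job threads, and the class name, method names and request strings are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of dispatching short and long requests differently.
class RequestDispatchSketch {
    private final ExecutorService jobThreads;

    RequestDispatchSketch(int dbConnections) {
        // Job pool sized to the number of available DB connections.
        jobThreads = Executors.newFixedThreadPool(dbConnections);
    }

    // Short, DB-only request: answered directly on the calling thread.
    String handleShort(String query) {
        return "result of " + query;
    }

    // Long-running request: ACL check on the calling thread, then queued
    // for a job thread to pick up.
    Future<String> handleLong(boolean aclAllowed, Callable<String> work) {
        if (!aclAllowed) throw new SecurityException("ACL check failed");
        return jobThreads.submit(work);
    }

    static String runDemo() {
        RequestDispatchSketch ms = new RequestDispatchSketch(4);
        try {
            String quick = ms.handleShort("list VMs");
            String slow = ms.handleLong(true, () -> "vm started").get();
            return quick + "; " + slow;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            ms.jobThreads.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```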

Comparison of two Approaches


Stats Collector collects capacity statistics
Fires every five minutes to collect stats about host CPU and memory capacity
Smart server and dumb client model: Resource only collects info and management server processes
Runs the same way on every management server

VM Sync
Fires every minute
Peer to peer model: Resource does a full sync on connection and delta syncs thereafter.
Management server relies on the resource for correct information.
Only runs against resources connected to the management server node

Numbers
Assume 10k hosts and 500k VMs (50 VMs per host)
Stats Collector
Fires off 10k requests every 5 minutes, or 33 requests a second.
Bad but not too bad: occupies 33 threads every second.
But just wait:
2 management servers: 66 requests
3 management servers: 99 requests

It gets worse as the number of management servers increases, because it does not auto-balance across management servers
Oh, but it gets worse still: the 10k hosts are now spread across 3 management servers. While 99 requests are generated, the number of threads involved is three-fold because requests need to be routed to the right management server.
It keeps the management server 20% busy even with no load from incoming requests
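The Stats Collector figures above follow from simple arithmetic, reproduced in the sketch below. The class and method names are hypothetical; the model assumes that every management server polls every host once per interval, which is the unbalanced behaviour the slide describes.

```java
// Hypothetical helper reproducing the slide's request-rate arithmetic.
class StatsCollectorLoad {
    // Requests per second (rounded down) when each management server
    // polls all hosts once per interval.
    static long requestsPerSecond(long hosts, long intervalSeconds, int managementServers) {
        return (hosts / intervalSeconds) * managementServers;
    }

    public static void main(String[] args) {
        for (int servers = 1; servers <= 3; servers++) {
            System.out.println(servers + " management server(s): "
                    + requestsPerSecond(10_000, 300, servers) + " requests/s");
        }
    }
}
```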

VM Sync
Fires off 1 request at resource connection to sync about 50 VMs
Then the resource pushes: it knows what it has pushed before and only pushes changes that are out-of-band.
So essentially no threads occupied for a much larger data set.
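The full-sync-then-delta model can be sketched as below. This is a minimal illustration, assuming the management server keeps a map from VM name to state for each connected resource; the class, method names and state strings are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the management server's view of one resource.
class VmSyncSketch {
    private final Map<String, String> knownStates = new HashMap<>();

    // On connection the resource reports every VM it runs; the server
    // replaces its whole view with that report.
    void fullSync(Map<String, String> report) {
        knownStates.clear();
        knownStates.putAll(report);
    }

    // Afterwards the resource pushes only what changed. Returns how many
    // entries actually differed from the server's view.
    int deltaSync(Map<String, String> changes) {
        int changed = 0;
        for (Map.Entry<String, String> change : changes.entrySet()) {
            String previous = knownStates.put(change.getKey(), change.getValue());
            if (!change.getValue().equals(previous)) changed++;
        }
        return changed;
    }

    String stateOf(String vm) {
        return knownStates.get(vm);
    }

    public static void main(String[] args) {
        VmSyncSketch view = new VmSyncSketch();
        view.fullSync(Map.of("vm1", "Running", "vm2", "Running"));
        view.deltaSync(Map.of("vm2", "Stopped"));
        System.out.println(view.stateOf("vm2"));
    }
}
```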

Resource Load Balancing


As management server is added into the cluster, resources are rebalanced seamlessly.
MS2 signals to MS1 to hand over a resource
MS1 waits for the commands on the resource to finish
MS1 holds further commands in a queue
MS1 signals to MS2 to take over
MS2 connects
MS2 signals to MS1 to complete transfer
MS1 discards its resource and flows the commands being held to MS2

Listeners are provided to business logic to listen on connection status and adjust work based on who's connected.
By only working on resources that are connected to the management server the process is on, work is auto-balanced between management servers.
Also reduces the message routing between the management servers.
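The hand-over of a resource from MS1 to MS2 can be modelled as below. This is a simplified sketch: commands are plain strings, in-flight commands are assumed to have finished before the hand-over begins, and every name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the hand-over steps; not CloudStack code.
class ResourceHandoverSketch {
    static class Server {
        final String name;
        final List<String> delivered = new ArrayList<>(); // commands sent to the resource
        final Deque<String> held = new ArrayDeque<>();     // commands parked during hand-over
        boolean ownsResource;
        boolean handingOver;

        Server(String name, boolean ownsResource) {
            this.name = name;
            this.ownsResource = ownsResource;
        }

        void submit(String command) {
            if (handingOver) {
                held.add(command);          // hold further commands in a queue
            } else if (ownsResource) {
                delivered.add(command);
            } else {
                throw new IllegalStateException(name + " does not own the resource");
            }
        }
    }

    // The current owner starts holding new commands.
    static void beginHandover(Server from) {
        from.handingOver = true;
    }

    // The new owner connects; the old owner discards the resource and
    // flows the held commands to the new owner.
    static void completeHandover(Server from, Server to) {
        to.ownsResource = true;
        from.ownsResource = false;
        from.handingOver = false;
        while (!from.held.isEmpty()) {
            to.submit(from.held.poll());
        }
    }

    public static void main(String[] args) {
        Server ms1 = new Server("MS1", true);
        Server ms2 = new Server("MS2", false);
        ms1.submit("start-vm");
        beginHandover(ms1);
        ms1.submit("stop-vm");
        completeHandover(ms1, ms2);
        System.out.println(ms1.delivered + " " + ms2.delivered);
    }
}
```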

CloudStack System VMs

CloudStack System VMs


System VMs optimize and scale the data path on behalf of CloudStack
Stateless, can be destroyed and recreated from database state
Highly Available
Communicates with Management Server over management network
Usually have 3 interfaces: control (link-local), mgmt and public

Console Proxy VM
Provides AJAX-style HTTP-only console viewer
Grabs VNC output from hypervisor
Scales out (more spawned) as load increases
Java-based server
Communicates with MS

Secondary Storage VM
Provides image (template) management services
Download from HTTP file share or Swift
Copy between zones
Scale out to handle multiple NFS mounts
Java-based server communicates with MS

CloudStack System VMs


Virtual Router VM
Provides multiple network services
IPAM (DHCP), DNS, NAT, Source NAT, Firewall, Port Forwarding, VPN
User-data, Meta-data, guest SSH keys and password change server
Redundancy via VRRP
MS configures VR over SSH
Proxied via the hypervisor on XS and KVM

System VM spec
Debian 6.0 ("Squeeze"), 2.6.32 kernel with the latest security patches from the Debian security APT repository.
No extraneous accounts
32-bit for enhanced performance on Xen/VMware
Only essential software packages are installed. Services such as printing, ftp, telnet, X, kudzu, dns, sendmail are not installed.
SSHd only listens on the private/link-local interface.
SSH port has been changed to a nonstandard port (3922).
SSH logins only using keys (keys are generated at install time and are unique for every customer)
pvops kernel with Xen paravirt drivers + KVM virtio drivers + VMware tools for optimum performance on all hypervisors.
Xen tools inclusion allows performance monitoring
Template is built from scratch and is not polluted with any old logs or history
Latest versions of haproxy, iptables, ipsec, apache from the Debian repository ensure improved security and speed
Latest version of jre from Sun/Oracle ensures improved security and speed

System VM cont'd
SSH keys and password are unique to cloud installation
Code can be patched by restarting system vm
Mounts a special ISO file with latest code at boot
If ISO contents differ, patch and reboot

Same system vm works on XS, KVM, VMWare


Bootstrap step for the cloud is to install the template for this system vm

Ready to be re-purposed for other specialized tasks

Interactions
OVM Cluster
vcenter

Primary Storage

Monitoring
End User UI Admin UI Domain Admin UI

CS API

Primary Storage vSphere Cluster

CS Admin &

End-user API

Clustered CloudStack CloudStack CloudStack Management Server

XS Cluster
XAPI

Primary Storage

JSON

Primary

KVM Cluster Storage


NetConf

Juniper SRX Cloud user {API client (Fog/etc)} Nitro API JSON JSON Console Console Proxy VM Proxy VM {Proxied} SSH Ajax Console HTTPS Router VM Router VM Router VM Sec. Storage Sec. Storage VM VM NFS Netscaler VNC

ec2 API
Cloud user {ec2 API client }

MySQL

NFS Server NFS

HTTP (Template Download) HTTP (Template Copy) HTTP (Swift)

Cloud user

CloudStack Roadmap

CloudStack Roadmap
2012
Feb Apr Jul Oct

2013
Feb

Acton
Swift Integration Support XenServer 6 Support Vsphere 5

Bonita

Burbank

Campo
AWS-style Regions IPv6 Resource Scaling Dedicated Resource Module Scalability (50K hosts) Plugin Architecture Hypervisor Enhancement

?
Hyper-V (win 8)

OpenvSwitch Support Inter-Vlan Routing VMWare Distributed vSwitch Support Multi-tier App Site-to-Site VPNs AWS-style tags VM Tiers

Netscaler Integration Cisco Nexus 1000v Support Refine Resource Upload Volume Management UI refinement LDAP/AD Authentication Clustered LVM support
