Linux Virtualization: From Xen to KVM

Gilad Ben-Yossef, CTO, Codefidence Ltd

What is Virtualization?
“Virtualization is a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications, or end users interact with those resources. This includes making a single physical resource (such as a server, an operating system, or storage device) appear to function as multiple logical resources.”
-- Virtualization entry in Wikipedia, slightly edited.
(C) 2007 Codefidence Ltd. 2

What is it good for?

● Isolate different applications and users on the same machine so they cannot interfere with each other.
● Consolidate many servers on a single machine.
● Run an operating system or other software built for one type of CPU on another kind of CPU.
● Easily and safely test software.
● Easily deploy software using virtual software appliances.

Agenda

● Explain basic virtualization terms and techniques.
● Discuss 7 different Open Source virtualization technologies.
● Explain how they work using the above terms.
● Maybe mention a couple of proprietary ones.
● Q&A


The ABC

● The native operating system running the virtualization software is called the Host.
  – Sometimes the virtualization software is itself the host.
  – The host has control of the real hardware.
● The virtualized OS is called a Guest.
● There can be many Guests on a single Host.
● Guests must not interfere with each other or with the host.

OS Level Virtualization

● One OS kernel provides an API that supports multiple user space “Virtual Environments” (VEs), within which Guests (user space only) run.
  – Can be thought of as chroot on steroids.
  – VEs are sometimes also called VPSs, Jails, Partitions or Containers.
● Open Source implementations include:
  – The Linux-VServer project.
  – OpenVZ, the core of SWsoft's Virtuozzo.

OS Level Virtualization Isolation

Each Guest in the VE has its own:
● Files
  – System libraries, applications, virtualized /proc and /sys.
● Users and groups
  – Each VE has its own root and other users and groups.
● Process tree
  – VE only sees its own processes. PIDs are virtualized.
● Network
  – VE has its own IP address, netfilter and routing rules.
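
The process tree isolation above can be pictured with a small toy model. This is only a sketch: the class and method names are made up for illustration and do not correspond to any Linux-VServer or OpenVZ interface. It shows each VE numbering its own processes from 1 while the host keeps the real PIDs.

```python
import itertools


class VirtualEnvironment:
    """Toy model of a VE's private process table (not a real kernel API)."""

    def __init__(self, name):
        self.name = name
        self._next_vpid = itertools.count(1)  # virtual PIDs start at 1 in every VE
        self._vpid_to_host = {}               # virtual PID -> host PID

    def add_process(self, host_pid):
        """Register a host process in this VE and return its virtual PID."""
        vpid = next(self._next_vpid)
        self._vpid_to_host[vpid] = host_pid
        return vpid

    def visible_pids(self):
        """PIDs a guest process would see: only this VE's own, virtualized."""
        return sorted(self._vpid_to_host)


web = VirtualEnvironment("web")
db = VirtualEnvironment("db")
web_init = web.add_process(host_pid=4101)   # host PIDs are arbitrary examples
db_init = db.add_process(host_pid=4230)
db.add_process(host_pid=4231)

print(web.visible_pids())  # [1]
print(db.visible_pids())   # [1, 2]
```

Both VEs have a process with PID 1, and neither can name a process that belongs to the other.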

Hardware Virtualization
● Create the illusion of separate hardware.
● Many flavors:
  – Full virtualization: virtualize all the hardware.
  – Native virtualization: virtualize just enough to isolate a native OS.
  – Para-virtualization: virtualize specialized hardware to run a modified OS.
  – In practice: some mix of the above.

Hardware Virtualization Terms

● The virtualization software is called:
  – Hypervisor
  – Virtual Machine Manager or VMM.
● The VMM or Hypervisor runs as part of the host OS, in tandem with it, or is itself the host.
● A virtual hardware instance is called a Virtual Machine or VM.
● The Guest OS runs inside a VM.

Full Virtualization

● Interpret the binary code of a program using an emulator that mimics the real hardware.
  – Just like Python or Perl, only the language is binary machine code.
● The emulated hardware may be a CPU or a peripheral device, such as a hard disk or NIC.
● All virtualization solutions use emulation to some degree.
  – A few use only emulation.
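
To make the interpreter idea concrete, here is a minimal sketch of an emulator loop. The four-instruction machine is invented for illustration and is far simpler than a real CPU; the point is only the fetch, decode and execute cycle done entirely in software.

```python
def emulate(program):
    """Interpret a toy instruction stream and return the final register file."""
    regs = {"r0": 0, "r1": 0}
    pc = 0  # emulated program counter
    while pc < len(program):
        op, *args = program[pc]
        if op == "mov":      # mov reg, immediate
            regs[args[0]] = args[1]
        elif op == "add":    # add dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "dec":    # dec reg
            regs[args[0]] -= 1
        elif op == "jnz":    # jump to target if reg != 0
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return regs


# Sum 3 + 2 + 1 into r0 using a countdown loop in r1.
program = [
    ("mov", "r0", 0),
    ("mov", "r1", 3),
    ("add", "r0", "r1"),
    ("dec", "r1"),
    ("jnz", "r1", 2),
]
result = emulate(program)
print(result)  # {'r0': 6, 'r1': 0}
```

Every guest instruction costs many host instructions (a dictionary lookup, a chain of comparisons), which is why pure interpretation is slow.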

● Bochs is a highly portable open source IA-32 (x86) PC emulator written in C++ that runs on most popular platforms.
● Written by Kevin Lawton.
● It includes emulation of the Intel x86 CPU, common I/O devices, and a custom BIOS.
● Bochs is very slow, but its BIOS is used by virtually all other Open Source virtualization projects.

Dynamic Re-Compilation

● One way to speed up emulation is to use Just In Time compilation techniques.
● The emulator translates a block of binary code to native binary code the first time it needs to run it.
● The emulator then keeps the translated block of code in its cache for later use.
● An order of magnitude faster than a simple interpreter.
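
A rough sketch of the translate-once, run-many idea follows. It is not how QEMU or any real emulator is implemented: the toy instruction set is invented, and "native code" is stood in for by a Python function generated with `exec`. What it does show is the translation cache keyed by block start address.

```python
translation_cache = {}   # block start address -> translated Python function
translations = 0         # how many times the translator actually ran


def translate_block(program, start):
    """Translate toy instructions from `start` up to the next branch into
    Python source, compile it once, and return the resulting function."""
    global translations
    translations += 1
    lines = ["def block(regs):"]
    pc = start
    while True:
        op, *args = program[pc]
        if op == "add":
            lines.append(f"    regs[{args[0]!r}] += regs[{args[1]!r}]")
        elif op == "dec":
            lines.append(f"    regs[{args[0]!r}] -= 1")
        elif op == "jnz":   # a branch ends the block; return the next address
            lines.append(f"    return {args[1]} if regs[{args[0]!r}] else {pc + 1}")
            break
        elif op == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["block"]


def run(program, regs):
    """Execute the program block by block, translating each block only once."""
    pc = 0
    while pc is not None:
        block = translation_cache.get(pc)
        if block is None:
            block = translation_cache[pc] = translate_block(program, pc)
        pc = block(regs)
    return regs


program = [
    ("add", "r0", "r1"),   # 0: r0 += r1
    ("dec", "r1"),         # 1: r1 -= 1
    ("jnz", "r1", 0),      # 2: loop while r1 != 0
    ("halt",),             # 3
]
regs = run(program, {"r0": 0, "r1": 1000})
print(regs["r0"], translations)  # 500500 2
```

The loop body runs 1000 times but is translated only once; the second translation is the final `halt` block.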

● QEMU is a generic and open source machine emulator and virtualizer.
● When used as a machine emulator, QEMU can run OSes and programs made for one machine (e.g. an ARM board) on a different machine (e.g. your own PC).
● Written by Fabrice Bellard of ffmpeg fame.
● By using dynamic translation, it achieves very good performance.

Full Virtualization Pros and Cons
Pros:
● Can emulate one type of CPU on another.
  – Say, MIPS on x86.
● Can add hooks for debugging and profiling.
● Can easily emulate access to non-existent hardware.

Cons:
● Slow.
● Wasteful when virtualizing a CPU on the same kind of CPU.
  – When running Windows on Linux x86, most translated code blocks look the same as the native code.

Native Virtualization

● If we are virtualizing the same CPU as the one we run on, we can run most code unmodified on the native processor.
● Page tables and segmentation are used to separate the virtual OS from the host.
● This involves a technique called “ring deprivileging”, or hardware assistance.

Ring Levels

● Modern CPUs support multiple levels of code privilege, known as Ring Levels.
● Only code running in the most privileged level can execute sensitive instructions.
● Intel CPUs support 4 ring levels:
  – 0 for supervisor mode, used by the kernel.
  – 3 for user mode, used by applications.
  – 1 and 2 are unused by Linux and most other OSes.

Native OS Ring Levels
Ring     Mode              Runs                 Space
Ring 3   User mode         Processes/Threads    User Space
Ring 2                     Unused
Ring 1                     Unused
Ring 0   Supervisor mode   Kernel               Kernel Space

De-Privileged OS Ring Levels
Ring     Mode              Runs                 Space
Ring 3   User mode         Processes/Threads    User Space
Ring 2                     Unused
Ring 1                     Kernel               Kernel Space
Ring 0   Supervisor mode   Hypervisor           VMM Space
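
With the guest kernel pushed out of ring 0, its privileged instructions fault and the hypervisor can emulate them. The following is a toy model of that trap-and-emulate flow, not real x86 behaviour: the instruction list, the exception class and the `Hypervisor` class are all illustrative assumptions.

```python
class GeneralProtectionFault(Exception):
    """Raised when privileged code runs outside ring 0 (toy model)."""


PRIVILEGED = {"cli", "sti"}   # toy subset: disable / enable interrupts


def cpu_execute(instruction, ring):
    """Model of the real CPU: privileged instructions fault unless in ring 0."""
    if instruction in PRIVILEGED and ring != 0:
        raise GeneralProtectionFault(instruction)
    return f"executed {instruction} natively"


class Hypervisor:
    """Runs in ring 0 and keeps a virtual interrupt flag for the guest."""

    def __init__(self):
        self.virtual_interrupts_enabled = True
        self.traps = 0

    def run_guest_instruction(self, instruction, guest_ring=1):
        try:
            return cpu_execute(instruction, guest_ring)
        except GeneralProtectionFault:
            # Trap: emulate the effect on the guest's virtual CPU state only.
            self.traps += 1
            self.virtual_interrupts_enabled = (instruction == "sti")
            return f"emulated {instruction} in the VMM"


vmm = Hypervisor()
log = [vmm.run_guest_instruction(i) for i in ("add", "cli", "add", "sti")]
print(log)
print(vmm.traps)  # 2
```

Ordinary instructions run at full speed; only the privileged ones pay for a round trip through the hypervisor.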

Problems with Ring De-Privileging

● Ring Aliasing
● Address-Space Compression
● Non-Faulting Access to Privileged State
● Adverse Impact on Guest System Calls
● Interrupt Virtualization
● Access to Hidden State
● Ring Compression
● Frequent Access to Privileged Resources
http://www.intel.com/technology/itj/2006/v10i3/1-hardware/3-software.htm

Run Time Code Translation

● Dynamically re-compile guest OS code to overcome the aforementioned problems.
● Slower than running native, faster than full virtualization.
● Basically it boils down to:
  – Run native when you can.
  – Change guest OS code in situ during run time when you cannot.
● Complicated and tricky, but it works.
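
The "run native when you can, patch when you cannot" rule can be sketched as below. This is a deliberately crude model: instructions are plain strings, the `call_vmm` marker is invented, and real products scan and rewrite machine code with far more care. `popf` is used as the example because on x86 it is a well-known instruction that does not fault when de-privileged.

```python
SENSITIVE = {"popf"}   # toy example of an instruction that misbehaves silently


def patch_in_place(code):
    """Rewrite sensitive instructions in the guest's code buffer, in situ.

    Returns the number of patched sites."""
    patched = 0
    for i, instruction in enumerate(code):
        if instruction in SENSITIVE:
            code[i] = ("call_vmm", instruction)   # hypothetical VMM call marker
            patched += 1
    return patched


def run(code, vmm_log):
    """Run the patched code: plain instructions go native, patched ones to the VMM.

    Returns how many instructions ran natively."""
    native = 0
    for instruction in code:
        if isinstance(instruction, tuple):   # patched site: hand over to the VMM
            vmm_log.append(instruction[1])
        else:
            native += 1
    return native


guest_code = ["mov", "add", "popf", "mov", "popf"]
patched = patch_in_place(guest_code)
vmm_log = []
native = run(guest_code, vmm_log)
print(patched, native, vmm_log)  # 2 3 ['popf', 'popf']
```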

● When used as a virtualizer, QEMU achieves near-native performance by executing the guest code directly on the host CPU.
● A host driver called the QEMU accelerator (also known as KQEMU) is needed in this case.
● The virtualizer mode requires that both the host and the guest machine use x86-compatible processors.

VirtualBox

● innotek VirtualBox is a general-purpose virtualizer for x86 hardware.
● VirtualBox runs on Windows, Linux and Macintosh hosts and supports a large number of guest operating systems.
● Available as an Open Source version and as a professional version that adds some features.
● Incorporates some code from QEMU to support emulation when needed.

Para-Virtualization

● Instead of changing the guest OS code dynamically at run time, why not change the source?
● Replace, in the source code, any problematic operation with a call to the Hypervisor.
● Easier to do with an Open Source operating system.
● Lowest virtualization overhead:
  – About 3% below native CPU performance.
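
As a sketch of what "replace the problematic operation with a call to the Hypervisor" means, consider a guest kernel that wants to map a page. All names here (`hypercall_map_page`, the guest IDs, the frame numbers) are hypothetical and are not Xen's actual hypercall interface; the model only shows the guest asking and the hypervisor validating.

```python
class Hypervisor:
    """Toy hypervisor that owns the real page tables (hypothetical API)."""

    def __init__(self, frames_by_guest):
        self.frames_by_guest = frames_by_guest   # guest id -> frames it may map
        self.page_tables = {}                    # (guest id, virtual page) -> frame

    def hypercall_map_page(self, guest_id, virtual_page, frame):
        # Validate the request so one guest cannot map another guest's memory.
        if frame not in self.frames_by_guest[guest_id]:
            raise PermissionError(f"guest {guest_id} does not own frame {frame}")
        self.page_tables[(guest_id, virtual_page)] = frame


class ParavirtualGuestKernel:
    """Guest kernel whose source was changed to call the hypervisor."""

    def __init__(self, guest_id, hypervisor):
        self.guest_id = guest_id
        self.hypervisor = hypervisor

    def map_page(self, virtual_page, frame):
        # A native kernel would write the page table entry directly here.
        self.hypervisor.hypercall_map_page(self.guest_id, virtual_page, frame)


toy_vmm = Hypervisor({"guest-a": {10, 11}, "guest-b": {20}})
kernel_a = ParavirtualGuestKernel("guest-a", toy_vmm)
kernel_a.map_page(virtual_page=0, frame=10)

try:
    kernel_a.map_page(virtual_page=1, frame=20)   # frame 20 belongs to guest-b
    rejected = False
except PermissionError:
    rejected = True

print(toy_vmm.page_tables, rejected)
```

No trap and no code scanning is needed: the guest cooperates, which is where the low overhead comes from.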

User Mode Linux

● User Mode Linux is a port of the Linux kernel to its own user space API.
● Makes the guest Linux kernel run as a process on Linux.
  – The guest processes are host processes that are controlled via the ptrace system call.
● Included in the vanilla kernel as a new pseudo-architecture (UM).
● Written by Jeff Dike.

● Xen is a para-virtualizing hypervisor.
● Xen originated as a research project at the University of Cambridge, led by Ian Pratt, senior lecturer at Cambridge and founder of XenSource, Inc.
● Xen has been integrated into recent SUSE and Red Hat releases.
● Incorporates some code from QEMU to support emulation when needed.

Xen 2.0 Architecture
[Figure: Xen 2.0 architecture. The Xen Virtual Machine Monitor runs directly on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE) and exposes a control interface, a safe hardware interface, event channels, a virtual CPU and a virtual MMU. VM0 runs the device manager and control software; VM1, VM2 and VM3 run unmodified user software. Each VM has its own guest OS (XenLinux or XenBSD); some guests carry back-end and native device drivers, the others use front-end device drivers.]

Hardware Virtual Machine

● Intel VT-x (Vanderpool) and AMD SVM (Pacifica) are extensions to x86/x86_64 processors to support virtualization.
● They add new instructions and CPU modes that make building Hypervisors easier.
● They introduce a new set of “non root” ring levels for virtual machines to run in.
● The Hypervisor (called VMM) runs in the “root” ring level.
● Privileged instructions executed in non-root mode trap to the VMM.
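
The resulting control flow is a loop in the VMM: enter the guest, run until something forces an exit, handle it, resume. The sketch below models only that loop. The function names and the choice of `hlt` and `out` as exiting instructions are assumptions made for illustration; which events actually cause an exit is configured by the VMM on real hardware.

```python
def run_in_non_root_mode(instructions, start):
    """Toy CPU: run guest code (at guest ring 0) until something causes a VM exit.

    Returns (index of the exiting instruction, exit reason), or (None, None)
    when the guest code runs to completion."""
    exiting = {"hlt", "out"}   # toy subset of instructions configured to exit
    for i in range(start, len(instructions)):
        if instructions[i] in exiting:
            return i, instructions[i]
    return None, None


def vmm_loop(instructions):
    """Root-mode loop: enter the guest, handle each exit, resume after it."""
    exits = []
    pc = 0
    while pc is not None:
        pc, reason = run_in_non_root_mode(instructions, pc)
        if pc is not None:
            exits.append(reason)   # emulate the instruction on behalf of the guest
            pc += 1                # then resume the guest after it
    return exits


exit_reasons = vmm_loop(["mov", "out", "add", "hlt", "mov"])
print(exit_reasons)  # ['out', 'hlt']
```

Unlike ring de-privileging, the guest kernel here believes it is in ring 0, so it needs neither patching nor source changes.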

Hardware VM Ring Levels
Ring       Mode              Runs                 Space
Ring 3     User mode         Processes/Threads    User Space
Ring 1,2                     Unused
Ring 0D    Supervisor mode   Kernel               Kernel Space
Ring 0P    Hypervisor mode   Hypervisor           VMM Space

Xen 3.0 Architecture
[Figure: Xen 3.0 architecture. The Xen Virtual Machine Monitor runs on the hardware (SMP, MMU, physical memory, Ethernet, SCSI/IDE), supports x86_32, x86_64, IA64 and VT-x, and exposes a control interface, a safe hardware interface, event channels, a virtual CPU and a virtual MMU. New in this picture are AGP, ACPI, PCI and SMP support. VM0 runs the device manager and control software; VM1, VM2 and VM3 run unmodified user software. Guests are XenLinux kernels with back-end, native or front-end device drivers, plus an unmodified guest OS (WinXP).]

This slide (C) XenSource / Ian Pratt, from the “Xen and the Art of Virtualization” talk.

KVM

● KVM (for Kernel-based Virtual Machine) is a Linux kernel infrastructure for supporting virtualization.
  – It is a device driver exposing the VT-x and SVM interfaces under Linux.
● Developed by Avi Kivity and sponsored by Qumranet.
● Used with a slightly modified QEMU.
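
Because KVM is a device driver, user space talks to it through a device node and ioctl calls. As a small, hedged illustration, the sketch below opens `/dev/kvm` and asks for the API version. It assumes a Linux host with the KVM driver loaded and sufficient permissions; the request number is `_IO(0xAE, 0x00)`, as defined for `KVM_GET_API_VERSION` in `<linux/kvm.h>`. The function name `kvm_api_version` is made up for this example, and the `fcntl` module is only available on Unix-like systems.

```python
import fcntl
import os

KVM_GET_API_VERSION = 0xAE00   # _IO(0xAE, 0x00) from <linux/kvm.h>


def kvm_api_version(path="/dev/kvm"):
    """Return the KVM API version, or None if KVM is not usable here."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None   # no KVM driver loaded, no permission, or not Linux
    try:
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None   # the node exists but does not answer the KVM ioctl
    finally:
        os.close(fd)


version = kvm_api_version()
print("KVM not available" if version is None else f"KVM API version {version}")
```

A full user of the interface, such as the modified QEMU, goes on from here to create a VM and virtual CPUs through further ioctl calls on the same device.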

KVM Architecture

[Figure: KVM architecture. Normal user processes run side by side with guests; each Guest OS runs on top of its own Qemu instance. All of them run on the Linux kernel, which contains the KVM driver.]

Any Questions?

Gilad Ben-Yossef
gilad@codefidence.com
http://codefidence.com
