
CLOUD COMPUTING

Assignment-1

Identify the sensitive or privileged instructions in the x86 architecture that become a concern for VMM construction,
analyze the problems these instructions pose for virtualization, and explain the solutions.

Submitted By,

ANCY EPHREM

Student ID: 2019HT66110


Introduction
The x86 architecture can be virtualized, but some of its sensitive instructions are not privileged, and these become a
concern in the construction of a VMM. In this paper, we identify these sensitive instructions in the x86 architecture,
analyze the problem that each of them poses for virtualization, and elaborate on the solutions that overcome it.

Virtualization

Virtualization is a technology that combines or divides computing resources to present one or many operating
environments. There are two broad categories of software virtualization. The first is system virtualization, in which
the virtualization software sits between the host hardware and the guest software. The second is process
virtualization, in which the virtualization software runs above the combination of OS and hardware and provides
compatibility only for user-level instructions.

The software responsible for system virtualization is called the Virtual Machine Monitor (VMM) or hypervisor.
The combination of OS and applications that runs on top of the virtualization software is called a guest or virtual
machine. A hypervisor or VMM allows multiple operating systems to run concurrently on the host hardware. It also
provides isolation between the different guests.

Hypervisors are of three types: native, hosted and hybrid. Native (bare-metal) hypervisors run directly on the
hardware and provide all the features needed by the guests. Hosted hypervisors run on top of an existing OS and
leverage the features of that underlying OS. Hybrid hypervisors run directly on the hardware but leverage the features
of an existing OS that runs as a guest.

TRAP AND EMULATE TECHNIQUE


Several techniques are used for hypervisor-based virtualization. Trap and emulate is the basic technique, used since
the early days of hypervisor-based virtualization. All three types of VMM operate in a similar manner. In each case,
the guest continues execution until it tries to access a shared physical resource of the hardware, such as an I/O
device, or until an interrupt is received. When this happens, the hypervisor regains control and mediates access to the
hardware or handles the interrupt.

To accomplish this, hypervisors rely on a feature of modern processors known as the privilege level or protection
ring. The basic idea is that all instructions that modify the physical hardware configuration are permitted only at the
highest privilege level. At lower levels, only a restricted set of instructions can be executed. The Intel x86
architecture has four rings, numbered 0 to 3. Programs executing in Ring 0 have the highest privileges and are allowed
to execute any instruction or access any physical resource, such as memory pages or I/O devices. Guests typically
execute in Ring 3. This is accomplished by setting the processor's CPL (Current Privilege Level) to 3 before starting
the execution of the guest. If the guest tries to access a protected resource such as an I/O device, a trap occurs and
the hypervisor regains control. The hypervisor then emulates the I/O operation for the guest. To emulate the I/O
operation, the hypervisor must maintain the state of the guest and its virtual resources.
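
The control flow described above can be sketched as a small simulation. This is a minimal illustration and not a real
hypervisor: the instruction names in PRIVILEGED, the VirtualState fields and the helper functions are all hypothetical,
chosen only to show a guest running at CPL 3, a trap on each privileged instruction, and emulation against per-guest
virtual state.

```python
from dataclasses import dataclass, field


class Trap(Exception):
    """Raised by the simulated CPU when a privileged instruction runs outside ring 0."""


# Hypothetical instruction names, used for illustration only.
PRIVILEGED = {"out", "cli", "sti"}


@dataclass
class VirtualState:
    """Per-guest state that the hypervisor maintains so it can emulate."""
    interrupts_enabled: bool = True
    io_log: list = field(default_factory=list)


def cpu_execute(instr, operand, cpl):
    """Simulated hardware: privileged instructions trap unless CPL is 0."""
    if instr in PRIVILEGED and cpl != 0:
        raise Trap(instr)
    # Non-privileged instructions run directly (modelled as a no-op here).


def emulate(instr, operand, vstate):
    """Hypervisor handler: apply the instruction's effect to the virtual state."""
    if instr == "out":
        vstate.io_log.append(operand)
    elif instr == "cli":
        vstate.interrupts_enabled = False
    elif instr == "sti":
        vstate.interrupts_enabled = True


def run_guest(program, vstate, guest_cpl=3):
    """Run the guest directly; on each trap, emulate and then resume."""
    traps = 0
    for instr, operand in program:
        try:
            cpu_execute(instr, operand, guest_cpl)
        except Trap:
            traps += 1
            emulate(instr, operand, vstate)
    return traps


state = VirtualState()
trap_count = run_guest(
    [("add", 1), ("cli", None), ("out", 0x41), ("sti", None)], state
)
```

Only the three privileged instructions reach the hypervisor in this sketch; the add instruction runs without any
intervention, which is where trap and emulate gets its efficiency.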

LIMITATIONS OF TRAP AND EMULATE TECHNIQUE


The trap and emulate technique has two major limitations. First, as with any emulation technique, it incurs some
performance overhead. Second, not all architectures are suitable for implementing trap and emulate virtualization.
Gerald J. Popek and Robert P. Goldberg gave a set of conditions under which a computer architecture supports
virtualization and allows a VMM to run efficiently. For an architecture to be virtualizable, its set of sensitive
instructions must be a subset of its privileged instructions.

Sensitive instructions are defined as those that are either behavior sensitive or control sensitive. Behavior
sensitive instructions are those whose behavior depends on the processor privilege level. If a guest running under a
hypervisor executes a behavior sensitive instruction, the result may differ from the result obtained when the guest
runs directly on the hardware at a higher privilege level. This can lead to errors in the execution of the guest.
Control sensitive instructions are those that change the processor privilege level, and therefore should be privileged
instructions.

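The problematic instructions are those that are sensitive, because they read or update the state of the machine, but
non-privileged, so they do not trap when executed at a lower privilege level. The sensitive, non-privileged
instructions of the x86 architecture are listed below.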

Group                                   Instructions
Access to interrupt flag                pushf, popf, iret
Visibility into segment descriptors     lar, verr, verw, lsl
Segment manipulation instructions       pop <seg>, push <seg>, mov <seg>
Read-only access to privileged state    sgdt, sldt, sidt, smsw
Interrupt and gate instructions         fcall, longjump, retfar, str, int <n>
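
The Popek and Goldberg condition can be checked mechanically as a subset test. The sketch below uses the instruction
groups from the table above. PRIVILEGED_SAMPLE is an illustrative, non-exhaustive sample of privileged x86
instructions, and treating those same instructions as sensitive is an assumption made only for this example.

```python
# Sensitive but non-privileged x86 instructions, grouped as in the table above.
SENSITIVE_NON_PRIVILEGED = {
    "Access to interrupt flag": {"pushf", "popf", "iret"},
    "Visibility into segment descriptors": {"lar", "verr", "verw", "lsl"},
    "Segment manipulation instructions": {"pop <seg>", "push <seg>", "mov <seg>"},
    "Read-only access to privileged state": {"sgdt", "sldt", "sidt", "smsw"},
    "Interrupt and gate instructions": {"fcall", "longjump", "retfar", "str", "int <n>"},
}

# Illustrative sample (not exhaustive) of privileged x86 instructions,
# i.e. instructions that fault when executed outside ring 0.
PRIVILEGED_SAMPLE = {"lgdt", "lidt", "lldt", "ltr", "lmsw"}


def violations(sensitive, privileged):
    """Sensitive instructions that would not trap when de-privileged."""
    return sensitive - privileged


def is_virtualizable(sensitive, privileged):
    """Popek-Goldberg condition: the sensitive set is a subset of the privileged set."""
    return not violations(sensitive, privileged)


# Assumption: the privileged sample is also sensitive, since those
# instructions change system configuration.
sensitive = set().union(PRIVILEGED_SAMPLE, *SENSITIVE_NON_PRIVILEGED.values())
offenders = violations(sensitive, PRIVILEGED_SAMPLE)
```

With these sets the check fails, and offenders contains exactly the instructions from the table, which is why classic
x86 does not satisfy the condition.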

Examples

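• popf executed outside the privileged mode does not update the interrupt flag (IF) and does not trap
– The VMM cannot detect when the guest disables interrupts
• push %cs lets the guest read the code segment selector (%cs) and learn its CPL
– A guest that expects to run at CPL 0 gets confused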

The first group of instructions manipulates the interrupt flag (%eflags.if) when executed in a privileged mode
(%cpl≤%eflags.iopl) but leaves the flag unchanged otherwise. Unfortunately, operating systems use these instructions to
alter the interrupt state, and because the hardware silently disregards the change to the interrupt flag instead of
trapping, a VMM using a trap-and-emulate approach cannot correctly track the interrupt state of the virtual machine.
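
The silent behavior of popf can be illustrated with a simplified model of the flag update. The sketch assumes
protected mode and models only two rules: IF changes only when CPL ≤ IOPL, and IOPL changes only at CPL 0. All other
flag bits and processor modes are ignored, and the function name popf and its parameters are chosen for this example.

```python
IF_BIT = 1 << 9            # interrupt flag is bit 9 of EFLAGS
IOPL_SHIFT = 12            # IOPL occupies bits 12-13 of EFLAGS
IOPL_MASK = 0b11 << IOPL_SHIFT


def popf(eflags, popped, cpl):
    """Return the new EFLAGS after popping `popped` at privilege level `cpl`.

    Simplified model: no exception is ever raised; disallowed changes
    are silently dropped.
    """
    iopl = (eflags & IOPL_MASK) >> IOPL_SHIFT
    new = popped
    if cpl > iopl:
        # Not privileged enough: keep the old IF, and do NOT trap.
        new = (new & ~IF_BIT) | (eflags & IF_BIT)
    if cpl != 0:
        # IOPL itself can only be changed at CPL 0.
        new = (new & ~IOPL_MASK) | (eflags & IOPL_MASK)
    return new


interrupts_on = IF_BIT                      # IF = 1, IOPL = 0
as_kernel = popf(interrupts_on, 0, cpl=0)   # guest OS running natively in ring 0
as_guest = popf(interrupts_on, 0, cpl=3)    # same code de-privileged to ring 3
```

In ring 0 the pop clears IF as the operating system intended. In ring 3 the same instruction leaves IF set and raises
nothing, so the hypervisor never learns that the guest tried to disable interrupts.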

The second group of instructions provides visibility into segment descriptors in the global or local descriptor table.
For de-privileging and protection reasons, the VMM needs to control the actual hardware segment descriptor tables.
When running directly in the virtual machine, these instructions would access the VMM’s tables (rather than the ones
managed by the operating system), thereby confusing the software.

The third group of instructions manipulates segment registers. This is problematic since the privilege level of the
processor is visible in the code segment register. For example, push %cs copies the %cpl as the lower 2 bits of the word
pushed onto the stack. Software in a virtual machine that expected to run at %cpl=0 could have unexpected behavior if
push %cs were to be issued directly on the CPU.
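
A short sketch of the selector layout shows why push %cs leaks the privilege level. An x86 segment selector holds a
descriptor index in bits 3 to 15, a table indicator in bit 2 and a privilege level in bits 0 and 1. The helper names
make_selector and cpl_from_cs are introduced here for illustration.

```python
PRIVILEGE_MASK = 0b11  # low 2 bits of a segment selector


def make_selector(index, table_indicator, privilege):
    """Build a 16-bit selector: index (bits 3-15), TI (bit 2), privilege (bits 0-1)."""
    return (index << 3) | (table_indicator << 2) | privilege


def cpl_from_cs(cs_selector):
    """What software learns from the value pushed by `push %cs`."""
    return cs_selector & PRIVILEGE_MASK


native_cs = make_selector(index=1, table_indicator=0, privilege=0)         # kernel at ring 0
deprivileged_cs = make_selector(index=1, table_indicator=0, privilege=3)   # same kernel at ring 3
```

The same descriptor index yields 0x08 when the kernel runs natively and 0x0B when it is de-privileged, so a guest that
inspects the pushed value can see that it is not running at ring 0.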

The fourth group of instructions provides read-only access to privileged registers such as %idtr. If executed
directly, such instructions return the address of the VMM structures, and not those specified by the virtual machine’s
operating system. Intel classifies these instructions as “only useful in operating-system software; however, they can be
used in application programs without causing an exception to be generated” [Intel Corporation 2010], an unfortunate
design choice when considering virtualization.

Finally, the x86 architecture has extensive support to allow controlled transitions between the various protection
rings using interrupts or call gates. These instructions also behave differently when de-privileged.

METHODS TO OVERCOME LIMITATIONS OF TRAP AND EMULATE TECHNIQUE


There are software and hardware techniques which can be used to overcome the limitations of trap and
emulate virtualization. Three main methods are listed below.

1. Parse the instruction stream and detect all sensitive instructions dynamically

✓ Interpretation (Bochs, JSLinux)
✓ Binary translation (VMware, QEMU)

2. Change the operating system

✓ Paravirtualization (Xen, L4, Denali, Hyper-V)

3. Make all sensitive instructions privileged

✓ Hardware supported virtualization (Xen, KVM, VMware) – Intel VT-x, AMD SVM

INTERPRETATION
This is one of the techniques used to parse the instruction stream and detect all sensitive instructions dynamically.
It involves a four-step cycle: fetching a source instruction, analyzing it, performing the required operation, and then
fetching the next source instruction.

A simple interpreter, referred to as decode-and-dispatch, operates by stepping through the source program instruction
by instruction, reading and modifying the source state. It is structured around a central loop that decodes an
instruction and then dispatches it to an interpretation routine, using a switch statement to call one of a number of
routines that emulate individual instructions. The central dispatch loop of a decode-and-dispatch interpreter contains a
number of branch instructions, and these branches tend to degrade performance.
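
A decode-and-dispatch loop can be sketched as follows. The four-opcode instruction set and the routine names are
invented for this example, and an if/elif chain stands in for the switch statement. A real x86 interpreter decodes
binary instructions and is far larger.

```python
# Interpretation routines: each emulates one (hypothetical) instruction
# against the guest's virtual state.
def do_load(state, operand):
    state["acc"] = operand


def do_add(state, operand):
    state["acc"] += operand


def do_cli(state, operand):
    # A sensitive instruction is emulated on virtual state, never run on the CPU.
    state["virtual_if"] = False


def interpret(program):
    """Central loop: fetch, decode, dispatch to a routine, advance."""
    state = {"acc": 0, "virtual_if": True}
    pc = 0
    while pc < len(program):
        opcode, operand = program[pc]       # fetch and decode
        if opcode == "load":                # dispatch ("switch")
            do_load(state, operand)
        elif opcode == "add":
            do_add(state, operand)
        elif opcode == "cli":
            do_cli(state, operand)
        elif opcode == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        pc += 1                             # move to the next source instruction
    return state


result = interpret(
    [("load", 2), ("add", 3), ("cli", None), ("halt", None), ("add", 100)]
)
```

Every instruction passes through the same chain of comparisons in the central loop, which corresponds to the branches
that the text identifies as the performance problem.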

To avoid some of the branches, a portion of the dispatch code can be appended (threaded) to the end of each
interpreter routine. To locate the interpreter routines, a dispatch table and a jump instruction are used when stepping
through the source program. This scheme is referred to as indirect threaded interpretation. It has two main drawbacks.
First, looking up the dispatch table causes overhead. Second, an interpreter routine is invoked every time the same
instruction is encountered, so the work of examining the instruction and extracting its various fields is always
repeated.
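
The idea of indirect threading can be approximated in a high-level language as shown below. This is only a sketch under
two stated assumptions: the instruction set and all names are invented, and because Python has no jump instruction, each
routine returns the next routine found through the dispatch table and a trivial loop performs the "jump". In a real
threaded interpreter the routine would jump there directly.

```python
def dispatch(vm):
    """Dispatch code that is appended to the end of every routine."""
    vm["pc"] += 1
    if vm["pc"] >= len(vm["program"]):
        return None
    opcode, operand = vm["program"][vm["pc"]]
    return DISPATCH_TABLE[opcode], operand   # table lookup, no chain of branches


def op_load(vm, operand):
    vm["acc"] = operand
    return dispatch(vm)      # threaded: the routine finds its own successor


def op_add(vm, operand):
    vm["acc"] += operand
    return dispatch(vm)


def op_halt(vm, operand):
    return None              # no successor: execution stops


# Hypothetical opcodes mapped to their interpreter routines.
DISPATCH_TABLE = {"load": op_load, "add": op_add, "halt": op_halt}


def run(program):
    vm = {"program": program, "pc": -1, "acc": 0}
    step = dispatch(vm)
    while step is not None:          # stands in for the jump instruction
        routine, operand = step
        step = routine(vm, operand)
    return vm["acc"]


total = run([("load", 2), ("add", 3), ("halt", None), ("add", 100)])
```

Both drawbacks named above are visible in the sketch: every step pays for a lookup in DISPATCH_TABLE, and the same
instruction is decoded again each time it is reached.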

BINARY TRANSLATION

In this technique, the hypervisor includes a binary translator that, at run time, replaces the sensitive but
non-privileged instructions with equivalent non-sensitive instructions, caches the result for future use, and leaves
non-sensitive instructions unchanged. The advantages of this technique are that it requires no hardware assistance and
no modifications to the guest OS, and that it preserves isolation and security. The main disadvantage is that, like
other just-in-time translation techniques, it incurs translation overhead at run time.
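
The translate-once, reuse-many-times behavior can be sketched with a small translation cache. The example works on
instruction mnemonics rather than real machine code, and the names SENSITIVE, translate_block, Translator and the
"call_vmm" marker are all hypothetical. It does not describe how VMware or QEMU implement translation.

```python
# A few sensitive, non-privileged instructions from the table earlier in the paper.
SENSITIVE = {"popf", "pushf", "sgdt", "smsw"}


def translate_block(block):
    """Rewrite one block of guest code.

    Sensitive instructions become calls into the VMM's emulation code;
    everything else is copied through unchanged.
    """
    return [
        ("call_vmm", instr) if instr in SENSITIVE else ("native", instr)
        for instr in block
    ]


class Translator:
    """Translate each block once and serve later executions from a cache."""

    def __init__(self):
        self.cache = {}
        self.translations = 0

    def get(self, address, block):
        if address not in self.cache:
            self.cache[address] = translate_block(block)
            self.translations += 1
        return self.cache[address]


translator = Translator()
guest_block = ["mov", "popf", "add"]
first = translator.get(0x1000, guest_block)
second = translator.get(0x1000, guest_block)   # served from the cache
```

The translation cost is paid on the first execution of a block only. Code that runs repeatedly, such as a loop body,
reuses the cached translation.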

PARAVIRTUALIZATION

In this method, the guest is modified so that it does not use the sensitive instructions but directly invokes
hypervisor APIs that provide an equivalent service. Paravirtualization is widely used to reduce the overheads
associated with I/O virtualization. The advantages of the technique are faster execution and lower virtualization
overhead. The disadvantage is that it is not hypervisor-independent, i.e. the modifications have to be carried out for
every hypervisor under which the guest could run. Paravirtualization therefore requires rewriting parts of the guest OS.
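
The substitution of a hypervisor call for a sensitive instruction can be sketched as follows. The Hypervisor class, the
hypercall name "set_interrupts" and the guest kernel class are hypothetical and do not correspond to the interface of
Xen or any other real hypervisor.

```python
class Hypervisor:
    """Hypothetical hypervisor exposing a tiny hypercall interface."""

    def __init__(self):
        self.virtual_if = {}    # per-guest virtual interrupt flag

    def hypercall(self, guest_id, name, *args):
        handlers = {"set_interrupts": self._set_interrupts}
        return handlers[name](guest_id, *args)

    def _set_interrupts(self, guest_id, enabled):
        self.virtual_if[guest_id] = enabled
        return enabled


class ParavirtualizedKernel:
    """Guest kernel whose source has been modified to avoid sensitive instructions."""

    def __init__(self, hypervisor, guest_id):
        self.hypervisor = hypervisor
        self.guest_id = guest_id

    def disable_interrupts(self):
        # An unmodified kernel would execute cli or popf here.
        self.hypervisor.hypercall(self.guest_id, "set_interrupts", False)

    def enable_interrupts(self):
        # An unmodified kernel would execute sti or popf here.
        self.hypervisor.hypercall(self.guest_id, "set_interrupts", True)


hv = Hypervisor()
kernel = ParavirtualizedKernel(hv, guest_id=1)
kernel.disable_interrupts()
```

Because the guest asks explicitly, the hypervisor always knows the guest's interrupt state without trapping or
translating anything. The cost is that the kernel now depends on this particular hypercall interface.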

HARDWARE SUPPORTED VIRTUALIZATION


Hardware vendors are rapidly embracing virtualization and developing new features to simplify virtualization
techniques. First-generation enhancements include Intel Virtualization Technology (VT-x) and AMD’s AMD-V, both of which
target privileged instructions with a new CPU execution mode that allows the VMM to run in a new root mode below
ring 0. As depicted in the figure below, privileged and sensitive calls are set to automatically trap to the
hypervisor, removing the need for either binary translation or paravirtualization.

VT-x provides hardware assists for virtualization by defining two modes for processor execution: VMX Root
Operation and VMX non-Root Operation. In each mode of operation, there are four privilege levels. Thus, the Hypervisor
can operate at ring 0 in VMX root operation and guests can operate at ring 0 in VMX non-root operation.

VT-x provides assistance for both responsibilities of the hypervisor, i.e. ensuring that sensitive instructions
execute correctly and keeping track of the state of the guest. To implement this functionality, VT-x uses a new data
structure called the Virtual Machine Control Structure (VMCS). The VMCS provides facilities for controlling the
execution of sensitive operations as well as for saving the state of guests. It also provides facilities for storing
the state of the hypervisor.

When a guest ceases execution and exits to the hypervisor (called VM Exit), the state of the guest is saved in the
VMCS and the state of the hypervisor is restored from the VMCS. The reverse process is followed when a guest is
dispatched for execution by the hypervisor (called VM Entry). This is how VT-x saves and restores the guest state.
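
The state switch on VM Entry and VM Exit can be modelled with a toy VMCS. This is a simplification for illustration:
the real VMCS has many more fields and is accessed through dedicated instructions, and the class names, the dictionary
layout and the register values below are assumptions made for this sketch. Here the hypervisor fills in host_state
itself before entering the guest.

```python
from dataclasses import dataclass, field


@dataclass
class VMCS:
    """Toy VMCS with one area for guest state and one for hypervisor (host) state."""
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)


class CPU:
    def __init__(self):
        self.regs = {}
        self.mode = "root"

    def vm_entry(self, vmcs):
        """Dispatch the guest: load its saved state and switch to non-root mode."""
        self.regs = dict(vmcs.guest_state)
        self.mode = "non-root"

    def vm_exit(self, vmcs):
        """Leave the guest: save its state, restore the hypervisor's state."""
        vmcs.guest_state = dict(self.regs)
        self.regs = dict(vmcs.host_state)
        self.mode = "root"


cpu = CPU()
vmcs = VMCS(
    guest_state={"rip": 0x1000, "if": True},
    host_state={"rip": 0xFFFF0000, "if": False},   # set up by the hypervisor
)

cpu.vm_entry(vmcs)
cpu.regs["rip"] = 0x1004      # the guest makes progress
cpu.vm_exit(vmcs)             # e.g. caused by a sensitive instruction
```

After the exit, the VMCS holds the guest's updated instruction pointer, so a later VM Entry resumes the guest exactly
where it stopped.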

Consider a case where the guest OS wishes to execute an instruction that masks all interrupts. This is governed by two
controls in the VMCS. If the external-interrupt exiting control is set, all external interrupts cause control to
transfer to the hypervisor. If the interrupt-window exiting control is set, control transfers to the hypervisor as soon
as the guest enables interrupts again. Thus, when an interrupt occurs, the hypervisor can use these controls to decide
whether to keep the interrupt pending or reflect it to the guest.
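
One possible way for a hypervisor to use the two controls is sketched below. The policy shown (reflect the interrupt if
the guest has interrupts enabled, otherwise queue it and request an exit when the guest re-enables them) is an
assumption for illustration, as are the class and function names. It is not taken from a specific hypervisor.

```python
from dataclasses import dataclass


@dataclass
class VmcsControls:
    """The two VMCS controls discussed above, modelled as plain booleans."""
    external_interrupt_exiting: bool = True
    interrupt_window_exiting: bool = False


def on_external_interrupt(vector, guest_if, controls, pending):
    """Runs in the hypervisor after an external interrupt caused a VM exit."""
    if guest_if:
        return "reflect to guest"
    # Guest has interrupts masked: hold the interrupt and ask for an exit
    # as soon as the guest enables interrupts again.
    pending.append(vector)
    controls.interrupt_window_exiting = True
    return "keep pending"


def on_interrupt_window_exit(controls, pending):
    """Runs when the guest has re-enabled interrupts; returns the vector to deliver."""
    vector = pending.pop(0) if pending else None
    controls.interrupt_window_exiting = bool(pending)
    return vector


controls = VmcsControls()
pending = []
decision = on_external_interrupt(32, False, controls, pending)
```

In this sketch the guest never sees an interrupt while it has them masked, which preserves the behavior it would observe
on real hardware.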

Conclusion
Software solutions such as interpretation, binary translation and paravirtualization overcome the challenges that the
x86 architecture poses for VMM construction. Hardware solutions such as Intel VT-x and AMD-V remove the dependence on
these software techniques and provide better efficiency.

References

1) Moving to the Cloud by Dinkar Sitaram and Geetha Manjunath (chapter 9)
2) Microsoft Virtualization Deep Dive: Current and Future Architecture, presentation by Shai Ofek
3) https://fawzi.wordpress.com/2009/05/24/virtualization-and-protection-rings-welcome-to-ring-1-part-i/
4) https://www.sciencedirect.com/topics/computer-science/hypervisor-based-virtualization
5) http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.423.4009&rep=rep1&type=pdf
6) https://my.eng.utah.edu/~cs5460/slides/virt-lecture1.pdf
7) https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/VMware_paravirtualization.pdf
8) https://web2.qatar.cmu.edu/~mhhammou/15319-s12/lectures/Lecture18_15319_MHH_26Mar_2012.pdf
