You are on page 1of 37

CPUVIRTL

CPU Virtualization

SCHOOL OF COMPUTING JEIAN


JEIAN LOUELL
LOUELL NUEVA
NUEVA
© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Attendance Reminders

PLEASE SCAN QR CODE FOR ATTENDANCE


SLACK

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Introduction to Computer Systems Organization

SCHOOL OF COMPUTING JEIAN


JEIAN LOUELL
LOUELL NUEVA
NUEVA
© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Computer Systems Organization

OUTPUT INPUT

Vestibulum nec congue Vestibulum nec congue


tempus lorem ipsum tempus lorem ipsum

Vestibulum nec
congue tempus

PROCESS

Vestibulum nec congue


tempus lorem ipsum

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
CPU or the Processor
The CPU carries out instructions and tells the
computer what to do [CU]

Performs arithmetic operations and data


manipulation [ALU]

Holds data and instructions which are in current use


[Memory Registers]

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Control Unit [CU]
The part of the computer which accesses
instructions in sequence, interprets them and then
directs their implementation - literally in control.

What it does:

● Directs the entire computer system to carry


out stored program instructions.
● Communicates with both the arithmetic and
main memory
● Uses the instructions contained in the
Instruction register to device which circuits
need to be activated
● Co-ordinates the activities of all peripheral
devices

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Arithmetic Logic Unit [ALU]
Arithmetic Logic Unit (ALU) The part of the CPU where arithmetic and
logic operations are performed. What it does: 1.Executes arithmetic
operations like addition, subtraction, multiplication and division
2.Executes logical operations which compare numbers, letters and
special characters (=, ) 3.Performs logic functions such as AND, OR and
NOT Registers: 1.Accumulator is used to accumulate results. It is the
place where the answers from many operations are stored temporarily
before being put out to the computer's memory 2.General-purpose
registers hold data on which operations are to be performed by the
arithmetic logic unit.

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Primary Memory

Primary Memory is a part of the computer that holds data


and instructions for processing When we load software from
any external storage, it is stored in the Primary Memory
There are two types of computer memory: RAM and ROM

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
RAM / ROM

Random Access Memory (RAM) Where the programs get


stored CPU fetches program instructions from the RAM CPU
stores calculation results here Can read or write to it When
computer is switched off RAM is lost Read Only Memory
(ROM) Only read this memory used for storing special sets of
boot up instructions. When computer is turned off ROM is
kept alive.

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Cache Memory

High speed Located in CPU itself Stores current data

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Data Bus

Information flows both ways between the microprocessor


and memory or I/O.

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Control Bus

There is no real control bus. Instead, the control bus is made


up of a number of single bit control signals.

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Microprocessor Initiated Operations

These are operations that the microprocessor itself starts.


These are usually one of 4 operations: – Memory Read –
Memory Write – I/O Read (Get data from an input device) –
I/O write (Send data to an output device)

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
The read operation

To read the contents of a memory location, the following


steps take place: The microprocessor places the 16-bit
address of the memory location on the address bus. The
microprocessor activates a control signal called “memory
read” which enables the memory chip. The memory decodes
the address and identifies the right location. The memory
places the contents on the data bus. The microprocessor
reads the value of the data bus after a certain amount of
time.

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Basic Introduction Interpretation

SCHOOL OF COMPUTING JEIAN


JEIAN LOUELL
LOUELL NUEVA
NUEVA
© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
FETCH

EX

DE
EC

CO
UT

DE
E

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Instruction Set Architecture

SCHOOL OF COMPUTING JEIAN


JEIAN LOUELL
LOUELL NUEVA
NUEVA
© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Emulate Fetch

➢ Positives
○ Easy to implement
○ Minimal complexity
➢ Negatives
○ Slow!

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Trap and Emulate

Guest app has to handle syscall/interrupt


● Special trap instr (int n), traps to VMM
● VMM doesn’t know how to handle trap
● VMM jumps to guest OS trap handler
● Trap handled by guest OS normally
Guest OS performs return from trap
● Privileged instr, traps to VMM
● VMM jumps to corresponding user process

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Trap and Emulate

Problems
● Guest OS may realize it is running at lower ● Some x86 instructions which change
privilege level hardware state (sensitive instructions) run in
○ Some registers in x86 reflect CPU both privileged and unprivileged modes
privilege level (code segment/CS) ○ Will behave differently when guest OS
○ Guest OS can read these values and get is in ring 0 vs in less privileged ring 1
offended! ○ OS behaves incorrectly in ring1, will not
trap to VMM

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Trap and Emulate

Problems
● Why these problems?
○ OSes not developed to run at a lower
privilege level
○ Instruction set architecture of x86 is not
easily virtualizable (x86 wasn’t designed
with virtualization in mind)

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Binary Translation

Binary translation is one specific approach to


implementing full virtualization that does not
require hardware virtualization features. It
involves examining the executable code of the
virtual guest for "unsafe" instructions, translating
these into "safe" equivalents, and then executing
the translated code.

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Hardware Assisted Virtualization

SCHOOL OF COMPUTING JEIAN


JEIAN LOUELL
LOUELL NUEVA
NUEVA
© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Hardware Assisted Virtualization

● Modern technique, after hardware support


for virtualization introduced in CPUs
○ Original x86 CPUs did not support
virtualization
○ Intel VT-X or AMD-V support is widely
available in modern systems
○ Special CPU mode of operation called
VMX mode for running VMs
● Many hypervisors use this H/W feature, e.g.,
QEMU/KVM in Linux

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Full Virtualization

SCHOOL OF COMPUTING JEIAN


JEIAN LOUELL
LOUELL NUEVA
NUEVA
© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Hardware Assisted Virtualization

● x86 and other hardware lacked virtualization


support.
● But cloud computing increased demand for
virtualization.
● VMWare workstation first to solve the
problem of virtualization existing operating
systems on x86 (basis for this lecture).
● Type 2 hypervisor based on
trap-and-emulate approach.

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Hardware Assisted Virtualization

● Key idea: dynamic (on a need basis) binary


(not source) translation of OS instructions
○ Problematic OS instructions translated
before execution
● Subsequently, hardware support for
virtualization (previous lecture)
○ Binary translation is higher overhead
than hardware-assisted virtualization
○ Used when hardware support not
available

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Memory Virtualization

SCHOOL OF COMPUTING JEIAN


JEIAN LOUELL
LOUELL NUEVA
NUEVA
© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved
Recap: Memory virtualization
problem
Guest Guest Host/Machine
Virtual Addresses Physical Addresses Physical Addresses
(GVA) Page (GPA) (HPA)
Guest Address = X’
“RAM”
Address = X
Code/dat Address = Y’
a Stack
Heap
(guest
Address = Y
user Address = Z’
process)
Address = Z Host RAM
Guest OS
Guest “RAM” is actually memory of the userspace hypervisor process running
on the host, which is mapped to host memory by the host’s page table
• Guest page table has GVA🡪GPA mapping
Recap: Techniques for memory
• Each guest OS thinks it has access to all RAM starting at address 0

virtualization
• VMM / Host OS has GPA🡪HPA mapping
• Hypervisor knows mapping between host virtual address (HVA) and guest physical address
(GPA), because it has setup its userspace memory as guest RAM
• Host OS knows mapping from HVA to HPA for all processes, so it knows GPA🡪HPA
• That is, guest “RAM” pages are userspace process pages that are distributed across host memory
and host OS knows the physical locations of these guest “RAM” pages
• Which page table should MMU use?
• Shadow paging: VMM creates a combined mapping GVA🡪HPA and MMU is given a pointer to this
page table
• Used in VMWare workstation (full virtualization)
• Extended page tables (EPT): MMU hardware is aware of virtualization, takes pointers to two
separate page tables
• Used in QEMU/KVM
Extended page tables
Guest page table Host page table/map
GVA 🡪 GPA GPA 🡪 HPA
CR3 EPTP

Special EPT
pointer register

MMU

• Page table walk by MMU: Start walking guest page table using GVA
• Guest PTE (for every level page table walk) gives GPA (cannot use GPA to access memory)
• Use GPA, walk host page table to find HPA, then access memory page, then next level access
• Every step in guest page table walk requires walking N-level host page table
• N-level page tables in guest/host result in page table walk of NXN memory accesses
Shadow page tables

Read only

Guest page table Host page table


GVA 🡪 GPA GPA 🡪 HPA

Shadow page table


GVA 🡪 HPA
CR3
Combined mapping
Changes to guest page table trap to VMM constructed by VMM
Shadow page table updated
Maintaining shadow page tables
• Guest writes to CR3, privileged operation traps to VMM
• VMM marks the guest page table pages as read-only
• VMM constructs shadow page table, sets CR3 to it
• Shadow page table can be built on demand
• Start with empty page table, add entries on page faults
• Guest changes page table, traps to VMM, shadow entry updated
• Guest OS keeps multiple page tables of active processes in memory
• On context switch, new page table used, but old page table still in memory
• What about shadow page tables? How many in memory?
• Many design choices exist
• VMM can discard old shadow page table on context switch, and rebuild it
later (overhead during context switch)
• VMM can maintain multiple shadow page tables of active processes (overhead
to track changes to all page table pages)
Demand Paging and Page Faults
Guest Guest Host/Machine
Virtual Addresses Physical Addresses Physical Addresses
(GVA) Page (GPA) (HPA)
Address = X’

Address = X
Address = Y’

Address = Y
• Guest OS may not assign a physical Address = Z’
frame to a page
• No entry in page table, page fault
Address = Z • VMM may not assign frames to all
• VM exit, VMM handles page fault,
guest “RAM” pages due to
injects fault into guest
memory pressure
• Guest assigns physical frame
• Page fault is handled by VMM
• Guest OS not aware of this fault
• “Hidden” page faults
Memory
• VMM reclaimsreclamation technqiues
memory from guest “RAM” under memory pressure
• Uncooperative swapping: VMM reclaims some guest “RAM” pages and
swaps them to disk
• Page fault and memory assigned when guest accesses the page
• May hurt performance, important pages may be swapped to disk
• Ballooning: VMM opens dummy device, requests pages from VM
(“inflating the balloon”), then swaps these pages out
• Since pages assigned to a device, pages not used by other processes in guest
• Cooperative, guest can assign free pages
• Lesser impact on performance, but reclamation takes time, requires guest changes
• Memory sharing: memory pages with identical content across VMs
(OS images, zero pages etc.) shared between VMs
• One physical frame mapped into page tables of multiple guests
• Scan pages periodically to compute and match hash-based similarity of pages
Summary
• Memory virtualization
• Extended page tables
• Shadow page tables
• Memory reclamation techniques
• Uncooperative swapping
• Ballooning
• Memory sharing

• “Memory Resource Management in VMware ESX Server”, Carl A. Waldspurger.


Thank you!

SCHOOL OF COMPUTING JEIAN LOUELL NUEVA


© iACADEMY - Jeian Louell Nueva - 2021 - All Rights Reserved

You might also like