
CS 523: ReVirt and virtual better than real

Sam King

Administrative
Newsgroup posting starts next week
Can sign up for presentations on wiki
Need a volunteer for first paper on 2/4
Will take into account first presentation

Administrative
Office hours daily
Reserve the right to not show up

I am horrible at email
But good at talking to people who come by my office
Try to have a lot of office hours to make it easier to talk to me

Administrative
Project groups
Requirements
Group size 2-3 students

Suggestions / hopes
At least one PhD student
At least one online student

Can work on your own research


For feedback, email a few sentences to the cs523 staff, or come by office hours

Discussion questions
One of the most important parts of your presentation
Three default questions
What did you like about the paper?
What did you dislike about the paper?
What future work did this inspire?

Other questions designed to spark a discussion


Most papers accepted / rejected based on opinions, rarely because of facts
In this class, learn more about how your classmates think

Quote
Intelligent and polite disagreement is what separates us from the politicians
-- Matt Hicks, CS523 Spring 08

How to read a paper.

When virtual is better than real and ReVirt


Description of a VMM
Why we read these papers
Deterministic replay
Small group discussions
Class-wide discussion

VMM basics
Definitions
VM - software abstraction of a real machine
Guest software - software running inside VM

VMM - thin layer of software


Host software/resources - underlying
E.g., host physical memory

Manages resources
Provides abstractions

How is this different than an operating system? Examples?

VMM architecture

[Figure: VMM architecture. Guest VMs (VM0, VMU), each with its own drivers, run above the VMM, which runs on the host hardware. Labels: Guest, Host, same interface.]

Workshop papers
No real implementation
Lots of ideas
Potential for impact

When virtual is better than real


Most cited work by Pete
One of the two main reasons why VMs are so popular today
The other is Mendel Rosenblum's work on making VMMs practical for x86

When virtual is better than real


Secure logging, ReVirt
5+ papers, 2 commercial products, 3+ PhD dissertations (including my own), running in VMware

Intrusion detection
Countless papers, sandboxing email, servers, etc.

Migration
3+ papers, a company (Moka5), Intel internet suspend/resume, used heavily in data centers

VMMs in general
Used heavily in data centers
Shipped with Windows 7 by default
Used for Windows XP mode

Side note
Originally, Pete thought that shadow security services were going to be the next big thing
Secure logging was second, but turned out to be the most influential

Note
This is where my historical context would have ended

Key points
VMMs are great for certain things
VMMs are not the solution to all problems

Using VMMs for services


Benefits
Modify hardware layer
Simple abstractions to work with
Not always true (e.g., migration, checkpoints)

Improved isolation
Should be done by OS, but done poorly today

Uses
Secure logging
Intrusion detection
VM migration

Poor uses of VMMs


Need to peer into the guest system
Semantic gap

Secure logging: deterministic replay


Use time-travel to recreate the past
Architecturally visible state transitions
Same starting state + same input => same state
E.g., hello world
Does the OS have to be the same?

E.g., "hello world, it's 1:35pm"

In general, need to log sources of nondeterminism, re-execute the rest
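The record/replay idea can be sketched with the clock example above. This is a minimal illustration, not ReVirt's mechanism: the `Recorder`, `Replayer`, and `hello` names are hypothetical, and the wall clock is the only nondeterministic input assumed here.

```python
import time


class Recorder:
    """Runs live and appends every nondeterministic value to a log."""

    def __init__(self):
        self.log = []

    def now(self):
        value = time.strftime("%I:%M%p")  # nondeterministic input: wall clock
        self.log.append(value)
        return value


class Replayer:
    """Feeds logged values back instead of consulting the real clock."""

    def __init__(self, log):
        self._values = iter(log)

    def now(self):
        return next(self._values)


def hello(env):
    # Deterministic once the values returned by env.now() are fixed.
    return "hello world, it's " + env.now()


recorder = Recorder()
original = hello(recorder)                 # recording run
replayed = hello(Replayer(recorder.log))   # replay run, possibly much later
```

Only the clock value is logged; the string concatenation is simply re-executed, which is why the log stays small.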

ReVirt
Uses VMM to record virtual machine
Most computation deterministic
Non-determinism: I/O, interrupts
Log values
Use performance counters

Uniprocessor only
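A toy sketch of why a counter is needed for interrupts: an asynchronous event has to be redelivered at the same point in the instruction stream, so the log stores both the value and the position. Everything here is an assumption for illustration (the `run` loop, the 10% arrival rate, the loop index standing in for a hardware performance counter); it is not ReVirt's implementation.

```python
import random


def run(steps, interrupt_source):
    """Toy uniprocessor guest: deterministic work plus asynchronous interrupts.

    interrupt_source(count) returns a payload to deliver before the
    instruction numbered `count`, or None when no interrupt is pending.
    """
    state = 0
    for count in range(steps):        # `count` stands in for a performance counter
        payload = interrupt_source(count)
        if payload is not None:
            state ^= payload          # the interrupt handler changes guest state
        state = (state * 31 + count) % 1_000_003   # deterministic instruction
    return state


def recording_source(log, rng):
    """Live run: interrupts arrive at random and are logged as (count, payload)."""
    def source(count):
        if rng.random() < 0.1:
            payload = rng.randrange(256)
            log.append((count, payload))
            return payload
        return None
    return source


def replaying_source(log):
    """Replay run: deliver each logged payload at its recorded count."""
    pending = dict(log)
    return lambda count: pending.get(count)


log = []
recorded_state = run(200, recording_source(log, random.Random()))
replayed_state = run(200, replaying_source(log))
```

The deterministic instructions are never logged; the log grows only with the number of interrupts.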

Replay a process -- recording

What are the inputs and sources of non-determinism?

Replay a process -- replaying

Software-only replay
Advantages
Simple and efficient
Closer to abstractions you care about
OS level: record/replay individual processes
VMM: encapsulates many processes

multi-thread on single core

Disadvantages
multi-thread on multi-core

Problem: race conditions


Memory
CPU0: Store X, Store Y
CPU1: Load X, Store Z

Problem: race conditions


Memory
CPU0: Store X, Store Y
CPU1: Load X, Store Z
The final state depends on the interleaving of the two processors' memory accesses
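The dependence on interleaving can be checked by brute force. The sketch below enumerates every order-preserving merge of the two instruction streams. The slide gives no values, so two assumptions are made: stores to X and Y write 1 over an initial 0, and CPU1's Store Z writes whatever Load X read.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def execute(schedule):
    memory = {"X": 0, "Y": 0, "Z": 0}
    register = 0                      # CPU1's private register
    for op in schedule:
        if op == "store_x":           # CPU0
            memory["X"] = 1
        elif op == "store_y":         # CPU0
            memory["Y"] = 1
        elif op == "load_x":          # CPU1
            register = memory["X"]
        elif op == "store_z":         # CPU1 writes the value it loaded
            memory["Z"] = register
    return memory


CPU0 = ["store_x", "store_y"]
CPU1 = ["load_x", "store_z"]

schedules = list(interleavings(CPU0, CPU1))
final_states = {tuple(sorted(execute(s).items())) for s in schedules}
```

Under these assumptions there are six schedules and two distinct final states: Z ends as 0 when Load X runs before Store X, and as 1 otherwise. Replay must reproduce which of the two happened.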

Problem: race conditions


Software only approaches would have to inspect all load and store instructions
Recent optimizations to improve on this
Still ongoing research

DeLorean: H/W support for multi-core replay

Memory operations at chunk commit
Only need to record chunk commit order
Works well with large chunk sizes
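A rough software model of the chunk-commit idea, under heavy simplification: each chunk is just a list of stores that become visible atomically, and the scheduler picks which processor commits next. The helper names (`make_chunks`, `run_chunks`) and the chunk contents are hypothetical; real hardware chunks contain arbitrary instructions.

```python
import random


def make_chunks(cpu_id, n):
    """Hypothetical workload: n chunks, each a list of (address, value) stores."""
    return [[(f"a{(cpu_id + k + i) % 4}", cpu_id * 100 + k) for i in range(3)]
            for k in range(n)]


def run_chunks(chunks_per_cpu, pick_next):
    """Commit chunks one at a time; return final memory and the commit order."""
    memory = {}
    cursors = {cpu: 0 for cpu in chunks_per_cpu}
    commit_order = []
    while True:
        ready = [cpu for cpu, c in cursors.items() if c < len(chunks_per_cpu[cpu])]
        if not ready:
            break
        cpu = pick_next(ready)
        for address, value in chunks_per_cpu[cpu][cursors[cpu]]:
            memory[address] = value   # the whole chunk becomes visible at commit
        cursors[cpu] += 1
        commit_order.append(cpu)      # the only thing that is logged
    return memory, commit_order


chunks = {0: make_chunks(0, 5), 1: make_chunks(1, 5)}

# Recording run: commit order is arbitrary.
recorded_memory, order = run_chunks(chunks, random.Random().choice)

# Replay run: force the logged commit order.
replay = iter(order)
replayed_memory, _ = run_chunks(chunks, lambda ready: next(replay))
```

The log has one entry per chunk (10 here) rather than one per memory operation (30 here), which is why larger chunks shrink the log.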

H/W to rec entire system

HW-based replay
Focus on mech. for proc. interactions
Full system
Cache coherence recording
Transaction or chunk based

Advantages
Efficient
Full system

Disadvantages
Need new hardware
Full system

Capo

Combines H/W and S/W replay
Key abstraction: replay sphere
SW: input non-determinism, HW: thread interleaving

Related work
Bressoud, T. C. and Schneider, F. B. Hypervisor-based fault tolerance. SOSP 95.
Xu, M., Bodik, R., and Hill, M. A "flight data recorder" for enabling full-system multiprocessor deterministic replay. ISCA 03.
Srinivasan, S. M., Kandula, S., Andrews, C. R., and Zhou, Y. Flashback: a lightweight extension for rollback and deterministic replay for software debugging. USENIX 04.
Dunlap, G. W., Lucchetti, D. G., Fetterman, M. A., and Chen, P. M. Execution replay of multiprocessor virtual machines. VEE 08.

Related work
Cully, B., Lefebvre, G., Meyer, D. T., Karollil, A., Feeley, M. J., Hutchinson, N. C., and Warfield, A. Remus: High Availability via Asynchronous Virtual Machine Replication. NSDI 08.
Montesinos, P., Hicks, M., King, S. T., and Torrellas, J. Capo: a software-hardware interface for practical deterministic multiprocessor replay. ASPLOS 09.
Park, S., Zhou, Y., Xiong, W., Yin, Z., Kaushik, R., Lee, K. H., and Lu, S. PRES: probabilistic replay with execution sketching on multiprocessors. SOSP 09.
Altekar, G. and Stoica, I. ODR: output-deterministic replay for multicore debugging. SOSP 09.

What did you like about the paper?
What did you dislike about the paper?
What future work did this inspire?
Can you create a complete VMM log from within the guest (or non-virtual) OS?
Will VMMs become the next OS?
Is smaller more secure?
Are VMMs more secure than OSes?
Are extra layers on computer systems good?
What will it take to make replay practical?
Are we going to have HW support for replay?