
Hypercall APIs Explained

Maryrita Steinhour
Program Manager
Windows Virtualization
Microsoft Corporation
Goals
Understand Windows hypervisor APIs
Apply this knowledge to build
solutions based on Windows
virtualization
Peek at Windows hypervisor future
hardware support
Agenda
Hypervisor Overview
Hypercall Overview
Making a Hypercall (or, What’s Under the
Wrapper?)
Hypercall Functions
Example: Inter-partition Communication
Future Support for Hardware Features
Windows Virtualization Architecture
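[Diagram: three partitions layered on the hypervisor and hardware]
Parent partition: Applications, Windows Server, WMI Provider, Virtualization Stack, IHV Drivers
Child partition: Applications, Operating System 1
Child partition: Applications, Operating System 2
All partitions run on the Windows hypervisor
The hypervisor runs on “Designed for Windows” Server Hardware
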
Partition Definitions
Partition
Basic unit of isolation supported by the hypervisor
Root partition
Controlling partition in which the virtualization stack
runs and which owns hardware devices
Parent partition
Manages resources for a set of child partitions
Child partition
Created by the parent partition
Guest operating systems, applications run in these
partitions
Virtualization Mappings
Real                                    Virtual
System                                  Partition
Logical processor                       Virtual processor
Advanced programmable interrupt         Virtual APIC + Synthetic Interrupt
  controller (APIC)                       Controller (SynIC)
Operating system (OS)                   Guest OS
Virtual address                         Guest virtual address (GVA)
Physical address ≡ System Physical      Guest physical address (GPA)
  Address (SPA)
Hypervisor Interface Mechanisms

Guests interacting with the hypervisor can use two types of mechanisms
Architecture-specific interfaces
Architected procedural interface
x64 Architecture-specific interfaces

Traditional mechanisms used by software to interact with the underlying processor
CPUID instruction
Used for static feature and version information
MSRs (model-specific registers)
Used for status and control values
Memory-mapped registers
Used for status and control values
Processor interrupts
Used for asynchronous events, notifications and
messages
Procedural Interface
Hypercalls
Procedural versus (mostly) informational
Guest requests action or information from
hypervisor
Processor activity
GPA management
Inter-partition messaging
Virtual interrupts
Partition control
Virtual processor control
Enlightenments
Windows Hypercall Interface
[Diagram: three partitions issuing hypercalls to the hypervisor]
Partitions 1 to 3: Applications on Operating System 1, Operating System 2, and Operating System 3
Hypercall: issued by a guest operating system to the Windows hypervisor
Windows hypervisor: hypercall processing
“Designed for Windows” Server Hardware

Types of Hypercalls
Simple hypercalls
Call once, operation complete
Rep hypercalls
“Repeating” hypercalls
Process lists of input or output elements
Simple Hypercalls
Perform a single atomic operation
Examples
HvGetLogicalProcessorRunTime
HvEnableTraceEvents
Rep Hypercalls
Act like a series of simple hypercalls
Perform multiple, independent atomic actions
Input includes
Rep count
Start index
If operation does not complete within 50 µs
Instruction pointer is not advanced
When execution resumes, restart at updated start index
Examples
HvGetVpRegisters
HvMapGpaPages
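The resume behavior of a rep hypercall can be pictured with a small simulation. This is only a toy model, not hypervisor code: `mock_rep_hypercall`, `run_all_reps`, and the `budget` parameter (standing in for the 50 µs limit) are hypothetical names invented for this sketch. On the native interface the reissue happens implicitly, because the instruction pointer is not advanced; the explicit loop below just makes that visible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a rep hypercall. It processes at most
 * `budget` elements per invocation, mimicking the time limit, and
 * records progress in *start_index.
 * Returns 1 when all reps are done, 0 when the call must be reissued. */
static int mock_rep_hypercall(const uint32_t *in, uint64_t *sum,
                              uint32_t rep_count, uint32_t *start_index,
                              uint32_t budget)
{
    assert(*start_index <= rep_count);
    while (*start_index < rep_count && budget > 0) {
        *sum += in[*start_index];   /* one independent "rep" */
        (*start_index)++;           /* progress survives the interruption */
        budget--;
    }
    return *start_index == rep_count;
}

/* Reissues the call until every rep has completed.
 * Returns the number of invocations that were needed. */
static uint32_t run_all_reps(const uint32_t *in, uint64_t *sum,
                             uint32_t rep_count, uint32_t budget)
{
    uint32_t start_index = 0;
    uint32_t calls = 0;
    int done;

    assert(budget > 0);             /* otherwise no progress is possible */
    do {
        done = mock_rep_hypercall(in, sum, rep_count, &start_index, budget);
        calls++;
    } while (!done);
    return calls;
}
```

Because each rep is independent and atomic, work finished before an interruption is never repeated; only the remaining elements are processed on the next pass.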
Native and Wrapper Interfaces
Native interface
Provided by the hypervisor
Low-level (assembler)
Caller must set up hypercall environment
Wrapper interface
Provided by a “wrapper library” that runs within the
guest
WinHv.sys on Windows
High-level (C-style) calling convention
Use THIS interface on Windows!
Making a Hypercall
User
Invokes the hypercall via a function call
Wrapper interface
Sets up the hypercall environment (once)
Sets up the parameters and registers for the call
Makes the hypercall
Decodes the output parameters and registers
User
Acts on the result of the hypercall
Invoking the Hypercall
Wrapper interface for sample hypercall
HvAssignWidgets

HV_STATUS
HvAssignWidgets(
__in HV_PARTITION_ID PartitionId,
__in UINT64 Flags,
__inout PUINT32 RepCount,
__inout PUINT32 StartIndex,
__in PCHV_WIDGET WidgetList
);
Set Up the Hypercall Environment

Make sure the hypervisor is present
Determine
Hypervisor version
Capabilities
Implementation recommendations
Report the guest OS’s identity
Set up and enable the hypercall page
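The first step, checking for the hypervisor, is typically done with CPUID. The sketch below decodes register values that the caller has already read with the CPUID instruction. It assumes the conventions documented for later released versions of the hypervisor (the hypervisor-present bit in CPUID leaf 1, ECX bit 31, and the vendor string "Microsoft Hv" in leaf 0x40000000); the 2006 preview interface may have differed. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: CPUID leaf 1, ECX bit 31 is the hypervisor-present bit. */
#define CPUID_LEAF1_ECX_HYPERVISOR_PRESENT (1u << 31)

/* Returns nonzero if ECX from CPUID leaf 1 reports a hypervisor. */
static int hv_present(uint32_t leaf1_ecx)
{
    return (leaf1_ecx & CPUID_LEAF1_ECX_HYPERVISOR_PRESENT) != 0;
}

/* Assembles the 12-character vendor string from EBX, ECX, EDX of
 * CPUID leaf 0x40000000. `out` must hold at least 13 bytes. */
static void hv_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx, char out[13])
{
    const uint32_t regs[3] = { ebx, ecx, edx };

    assert(out != NULL);
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)   /* least significant byte first */
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Assumption: the Windows hypervisor identifies itself as "Microsoft Hv". */
static int is_microsoft_hv(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char vendor[13];

    hv_vendor(ebx, ecx, edx, vendor);
    return strcmp(vendor, "Microsoft Hv") == 0;
}
```

Keeping the decoding separate from the CPUID instruction itself makes the logic portable across compilers, each of which exposes CPUID through a different intrinsic.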
Making the Hypercall
Set up the parameters
64-bit GPA (guest physical address) of the input and/or output
parameters
Hypercall control value
Bits    63-60     59-48        47-44     43-32      31-16     15-0
Field   reserved  Start index  reserved  Rep count  reserved  Call code

Issue a CALL to the beginning of the hypercall page
Decode the output
Hypercall status value
Bits    63-44     43-32          31-16     15-0
Field   reserved  Reps complete  reserved  Result
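The two bit layouts on this slide translate directly into shifts and masks. The following is a minimal sketch of packing the hypercall control value and unpacking the hypercall status value; the field positions come from the slide, while the macro and function names are hypothetical and not part of any wrapper library.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions taken from the slide's bit layouts. */
#define HV_REP_COUNT_SHIFT     32       /* control bits 43..32 */
#define HV_START_INDEX_SHIFT   48       /* control bits 59..48 */
#define HV_REPS_COMPLETE_SHIFT 32       /* status bits 43..32  */
#define HV_REP_FIELD_MAX       0xFFFu   /* both rep fields are 12 bits */

/* Packs call code, rep count, and start index; reserved bits stay 0. */
static uint64_t hv_make_control(uint16_t call_code, uint16_t rep_count,
                                uint16_t start_index)
{
    assert(rep_count <= HV_REP_FIELD_MAX);
    assert(start_index <= HV_REP_FIELD_MAX);
    return (uint64_t)call_code
         | ((uint64_t)rep_count << HV_REP_COUNT_SHIFT)
         | ((uint64_t)start_index << HV_START_INDEX_SHIFT);
}

/* Extracts the 16-bit result code (status bits 15..0). */
static uint16_t hv_status_result(uint64_t status)
{
    return (uint16_t)(status & 0xFFFFu);
}

/* Extracts the number of completed reps (status bits 43..32). */
static uint16_t hv_status_reps_complete(uint64_t status)
{
    return (uint16_t)((status >> HV_REPS_COMPLETE_SHIFT) & HV_REP_FIELD_MAX);
}
```

For a simple hypercall, the rep count and start index are both passed as zero, so the control value reduces to the call code.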
Act on the Result of the Hypercall

Check the result code
Some result codes are common to all hypercalls
Success
HV_STATUS_SUCCESS
Failure
HV_STATUS_INVALID_HYPERCALL_CODE
HV_STATUS_INVALID_HYPERCALL_INPUT
HV_STATUS_INVALID_ALIGNMENT
Refer to the individual hypercalls for details
of other, specific result codes
Partition Management
Create, delete, manage partition state
Partitions identified by a unique partition
ID
Child partitions are managed by the
parent partition
HvCreatePartition
HvDeletePartition
HvGetPartitionProperty
HvSetPartitionProperty
Physical Hardware Management
Hypervisor manages physical hardware
System physical address space
Logical processors
Local APICs
Constant-rate system counter
Hardware information comes from three sources
Boot-time input parameters
Dynamic discovery
Root partition input
HvSetLogicalProcessorRunTimeGroup
HvNotifyLogicalProcessorPowerState
Queries
HvGetLogicalProcessorRunTime
Resource Management
Memory for hypervisor partition-specific structures
comes from guest
Ensures that a single guest cannot monopolize system
resources
Hypervisor creates pools of SPA pages
Originally mapped in guest’s GPA space
Pages deposited or withdrawn
Deposited pages no longer accessible by guest
HvDepositMemory
HvWithdrawMemory
Query balance
HvGetMemoryBalance
Guest Physical Address Spaces

Single address space for the partition
GPA pages have one of three states
Mapped
Can be readable, writeable, executable
Inaccessible
Unmapped
Changing states
HvMapGpaPages
HvUnmapGpaPages
Intercepts
Parent partition may need to handle certain
situations on behalf of its child
Accesses to I/O Ports
Accesses to MSRs
Parent installs an intercept
HvInstallIntercept
Hypervisor suspends child virtual processor and
allows parent to handle
Execution resumed when parent’s work
complete
Virtual Processor Management
Virtual processor (Vp) identified by
Partition ID
Processor index
Vp state modeled by processor registers
Architected hardware registers
Hypervisor-defined MSRs
Queried and modified by the guest
HvGetVpRegisters
HvSetVpRegisters
Virtual MMU and Caching
Virtual memory management unit (MMU)
GVA → GPA mapping
Generally compatible with physical MMU
Certain control register bits not supported
CR0.CD (cache disable)
Certain MSRs not virtualized, or their scope changed
Page attribute table (PAT) MSR may be treated as a per-partition register, with updates visible to all Vps in the partition
Hypervisor honors guest cache attribute bits for pages mapped in guest’s GVA space
HvReadGpa
HvWriteGpa
Virtual Interrupt Control
Synthetic Interrupt Controller (SynIC)
Extension of virtualized local APIC
Used for virtualizing interrupt delivery to virtual processors
Has 16 synthetic local vector tables known as SINTx
Internal interrupts generated when
A Vp accesses the APIC interrupt command register
External interrupts generated when
A physical hardware device generates an interrupt
The hypervisor delivers a message
Another partition calls inter-partition communication hypercalls
HvPostMessage
HvSignalEvent
Inter-Partition Communication
Two types of inter-partition communication
Events
Messages
Both must be sent via a pre-allocated
connection
HvCreatePort
HvDeletePort
HvConnectPort
HvDisconnectPort
HvGetPortProperty
Inter-Partition Communication
Events
Use single-bit event flags
Lightweight signaling mechanism
Destination notified by
Bit ORed in SIEF (synthetic interrupt event
flag) page
Interrupt if bit was not previously set
HvSignalEvent
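The "OR the bit in, interrupt only if it was clear" rule can be sketched in a few lines. This is an illustrative model rather than the hypervisor's implementation: `sint_event_flags` and `signal_event` are hypothetical names, the 2048-flag size is taken from the "Send and Receive Events" slide, and a real implementation would need an atomic read-modify-write.

```c
#include <assert.h>
#include <stdint.h>

#define FLAGS_PER_SINT 2048                  /* per the later slide */
#define WORDS_PER_SINT (FLAGS_PER_SINT / 64)

/* Hypothetical model of one SINTx's event flags in the SIEF page. */
typedef struct {
    uint64_t bits[WORDS_PER_SINT];
} sint_event_flags;

/* ORs the flag into the page.
 * Returns 1 if the flag was newly set (an interrupt is warranted),
 * 0 if it was already set (no additional interrupt).
 * Note: not atomic; this sketch is single-threaded only. */
static int signal_event(sint_event_flags *f, uint32_t flag_number)
{
    assert(flag_number < FLAGS_PER_SINT);

    uint64_t mask = 1ull << (flag_number % 64);
    uint64_t *word = &f->bits[flag_number / 64];
    int was_set = (*word & mask) != 0;

    *word |= mask;
    return !was_set;
}
```

Suppressing the interrupt when the bit is already set is what keeps events lightweight: repeated signals coalesce until the receiver clears the flag.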
Inter-Partition Communication
Messages
Posted (sent asynchronously)
HvPostMessage
Sent through message buffers
Destination notified of message arrival by
interrupt
Partition Save and Restore
Parent can save the state of a child partition
HvSavePartitionState
Saved state may be
Migrated and restored on this or another system
HvRestorePartitionState
Checkpointed for subsequent restart
Summary state data also available
Determine whether complete saved state is
compatible with target hardware and hypervisor
Scheduler
Manage scheduler policy
HvSetPartitionProperty
Policy applies to all virtual processors in a partition
CPU Reserve
Sets lower boundary for the amount of CPU time available
Virtual processor can use less
CPU Cap
Sets upper boundary for the amount of CPU time available
Virtual processor can NOT use more
CPU Weight
Sets amount of CPU time available relative to other partitions’
virtual processors
Tracing
Allocate and enable tracing
Allocate trace buffer group
Local buffers
Associated with a logical processor
Global buffers
Shared among logical processors
HvAllocateTraceBufferGroup
Enable trace events
Diagnostic trace events
Debugging and performance analysis
Audit trace events
Security auditing
HvEnableTraceEvents
Tracing
Record and deallocate traces
When trace buffer is full (complete)
Hypervisor sends notification message to guest via
SynIC
Guest reads the contents of the buffer
Guest frees the buffer
HvCompleteTraceBuffer
HvFreeTraceBuffer
When done
Disable trace events
Deallocate trace buffer group
HvDeallocateTraceBufferGroup
Statistics
Performance statistics available in statistics
pages
One page per object
Global classes
Associated with the hypervisor
Hypervisor
Logical processors
Local classes
Associated with a partition
Partitions
Virtual processors
Guest can access via
HvMapStatsPage
HvUnmapStatsPage
Example
Inter-partition communication
Set up inter-partition communication
Send and receive messages
Send and receive events
Set Up Communication
Receiver or its parent partition creates a port
Sender or its parent partition creates a
connection to a port
Port is associated with
An SINTx number
A list of receiver virtual processors

[Diagram: a receiver partition holding the port and a sender partition holding the connection, both running above the hypervisor]
Event Interface
Hypercall
Sets up communication
Signals event by setting a bit
Interrupt
Posted (through SynIC) to notify receiver that
the bit is set
Send and Receive Events
Event posted with HvSignalEvent sets bit in
guest page
Guests map their per-VP SynIC event page into
their address space
[Diagram: the sender partition's connection leads through the hypervisor to the receiver partition's port and VP]
Receiver VP’s SynIC event page
SINT0 events: 2048 bits
SINT1 events: 2048 bits
…
SINT15 events: 2048 bits
Routing info from port
SINTx number
Valid receiver VPs
Message Interfaces
Hypercall
Sets up communication
Posts message
Interrupts
Posted (through SynIC) to notify recipient of
message arrival
Send Messages
HvPostMessage issued
Messages are copied to a per-receiver message buffer
Hypervisor queues messages for future delivery to
message page
[Diagram: a message travels from the sender partition's connection through the hypervisor to the receiver partition's port and VP]
Message page
Message slot 0: 256 bytes
Message slot 1: 256 bytes
…
Message slot 15: 256 bytes
Message buffers (hypervisor)
Buffer 1: busy
Buffer 2: busy
Buffer 3
…
Buffer n: free
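The message page dimensions on this slide (16 slots of 256 bytes) add up to exactly one 4 KiB page, which the sketch below checks at compile time. The delivery logic is a simplified model under stated assumptions: it assumes one slot per SINTx, and the `occupied` array, `receiver_vp`, `deliver_message`, and `consume_message` are hypothetical bookkeeping invented for illustration, not the hypervisor's data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MESSAGE_SLOTS     16
#define MESSAGE_SLOT_SIZE 256

typedef struct { uint8_t bytes[MESSAGE_SLOT_SIZE]; } message_slot;
typedef struct { message_slot slot[MESSAGE_SLOTS]; } message_page;

/* 16 slots x 256 bytes fill exactly one 4 KiB page. */
_Static_assert(sizeof(message_page) == 4096, "message page must be 4 KiB");

/* Hypothetical receiver state for this sketch. */
typedef struct {
    message_page page;
    uint8_t occupied[MESSAGE_SLOTS];   /* bookkeeping for the sketch only */
} receiver_vp;

/* Tries to move a queued message into the slot for `sint`.
 * Returns 1 if delivered, 0 if the slot is still occupied, in which
 * case the message stays queued in its buffer for future delivery. */
static int deliver_message(receiver_vp *vp, unsigned sint,
                           const void *payload, size_t len)
{
    assert(sint < MESSAGE_SLOTS);
    assert(len <= MESSAGE_SLOT_SIZE);
    if (vp->occupied[sint])
        return 0;
    memset(vp->page.slot[sint].bytes, 0, MESSAGE_SLOT_SIZE);
    memcpy(vp->page.slot[sint].bytes, payload, len);
    vp->occupied[sint] = 1;
    return 1;
}

/* Receiver marks the slot free once it has read the message. */
static void consume_message(receiver_vp *vp, unsigned sint)
{
    assert(sint < MESSAGE_SLOTS);
    vp->occupied[sint] = 0;
}
```

The separation between buffers and slots is why posting is asynchronous: the sender's call can return once the message is copied into a buffer, even if the receiver has not yet emptied its slot.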
Send Messages
Sender uses Post and Cancel hypercalls
Messages are copied to a per-receiver message buffer
Hypervisor queues messages for future delivery to
message page
[Diagram: the posted message now sits in a hypervisor message buffer, awaiting delivery to the receiver's message page]
Message page
Message slot 0: 256 bytes
Message slot 1: 256 bytes
…
Message slot 15: 256 bytes
Message buffers (hypervisor)
Buffer 1: busy
Buffer 2: busy
Buffer 3: message
…
Buffer n: free
Future Support For Hardware Features
Items on the following slides are hardware features for which we’re considering support
Some committed
Some still under investigation
Feedback is welcome
Committed Hardware Features
Authenticated (measured) launch of the hypervisor
“This IS the hypervisor you’re looking for …”
Uses secure hardware features
LT (LaGrande Technology) or SVM (Secure Virtual Machine)
TPM (Trusted Platform Module)
DMAr (Direct Memory Access remapping) protection of
the hypervisor
Provides additional protection for the hypervisor from guest
memory accesses for I/O operations
Uses AMD IOMMU (I/O Memory Management Unit)
Uses Intel VT-d (Virtualization Technology for Directed I/O)
Hardware Features Under Investigation

DMAr for partition-level device assignment of PCIe Bus/Device/Functions (BDFs)
Partial device assignment of multi-head devices that don’t expose separate BDFs
DMAr services for guests
Effectively a virtualized IOMMU
TPM guest-level virtualization
Guest access to TPM services
System management isolation mechanisms
Hardware-based multi-level page tables


Call To Action
Understand Windows hypervisor APIs
Apply this knowledge to build solutions
based on Windows virtualization
Review the session materials from related
sessions
Let us know your thoughts on our future
hardware support plans
Additional Resources
Related Sessions
BUS126 Windows Virtualization Strategy and Roadmap
VIR065 Microsoft Operating System Virtualization Strategy and Virtual Hard Disk Directions
VIR047 Hypervisor, Virtualization Stack, and Device Virtualization Architectures
VIR040 Device Virtualization Architecture
VIR043 How to Use WMI Interfaces with Windows Virtualization
VIR049 Inside Microsoft’s Network and Storage VSP/VSC
VIR124 Windows Virtualization Best Practices and Future Hardware Directions
VIR046 Hypercall APIs Explained


Additional Resources
Publications and Contact Information
Publications:
Presentations and future papers
http://www.microsoft.com/whdc/system/platform/virtual/default.mspx
Preview papers and specs
WinHEC Proceedings DVD
Windows Hypervisor Top Level Functional Specification
Windows Virtualization Glossary
Web resources
Windows Virtualization Team Blog
http://blogs.technet.com/virtualization
AMD I/O Virtualization Technology (IOMMU) Specification
http://developer.amd.com/documentation.aspx
Intel® Virtualization Technology for Directed I/O Architecture Specification
ftp://download.intel.com/technology/computing/vptech/Intel(r)_VT_for_Direct_IO.pdf
LaGrande Technology Preliminary Architecture Specification
http://www.intel.com/technology/security/
Trusted Computing Group
http://www.trustedcomputinggroup.org/home
E-mail comments to: msvirtex@microsoft.com
© 2006 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
