
An Architectural Perspective for Cloud Virtualization

Yiming Zhang, School of Computer, NUDT, Changsha, China (ymzhang@nudt.edu.cn)
Qiao Zhou, School of Computer, NUDT, Changsha, China (zhouqiao3518@sina.com)
Dongsheng Li, School of Computer, NUDT, Changsha, China (dsli@nudt.edu.cn)
Kai Yu, School of Computer, NUDT, Changsha, China (yukaigoforit@126.com)
Chengfei Zhang, School of Computer, NUDT, Changsha, China (chengfeizh@gmail.com)
Yaozheng Wang, School of Computer, NUDT, Changsha, China (wyaozheng@gmail.com)
Jinyan Wang, School of Computer, NUDT, Changsha, China (wjy0213@vip.qq.com)
Ping Zhong, Central South University, Changsha, China (ping.zhong@csu.edu.cn)
Yongqiang Xiong, MSRA, Beijing, China (yqx@microsoft.com)
Huaimin Wang, School of Computer, NUDT, Changsha, China (whm_w@163.com)

Abstract—Virtual machines (VMs) and processes are two important abstractions for cloud virtualization, where VMs usually install a complete operating system (OS) that runs user processes. Although they exist in different layers of the virtualization hierarchy, VMs and processes have overlapping functionalities. For example, both are intended to provide an execution abstraction (e.g., a physical/virtual memory address space), and they share the similar objectives of isolation, cooperation and scheduling. Currently, neither can provide the benefits of the other: VMs provide higher isolation, security and portability, while processes are more efficient and flexible, and easier to schedule and cooperate. This heavyweight virtualization architecture degrades both the efficiency and the security of cloud services.

There are two trends in cloud virtualization: the first is to enhance processes to achieve VM-like security, and the second is to reduce VMs to achieve process-like flexibility. Based on these observations, our vision is that in the near future VMs and processes might be fused into one new abstraction for cloud virtualization that embraces the best of both, providing VM-level isolation and security while preserving process-level efficiency and flexibility. We present a reference implementation, dubbed cKernel (customized kernel), for the new abstraction. Essentially, cKernel is a LibOS-based virtualization architecture, which (i) removes the traditional OS and process layers and retains only VMs, and (ii) takes the hypervisor as an OS and imitates processes' dynamic mapping mechanism for VMs' pages and libraries.

1. Introduction

Currently, virtual machines (VMs) and processes are two important abstractions for cloud virtualization. Hardware virtual machines [1] have been widely used to guarantee isolation [2, 3] and improve system reliability and security [4]. VMs usually emulate existing hardware architectures and run a complete operating system (OS) that provides the environment for executing various applications in the form of user processes.

Although they exist in different layers of the virtualization hierarchy, VMs and processes have overlapping functionalities. Both are intended to provide an execution abstraction (e.g., a memory address space), and they share the similar objectives of isolation, cooperation and scheduling. On the other hand, however, neither can provide the benefits of the other: VMs provide higher isolation, security and portability, while processes are more lightweight and easier to schedule and cooperate (with multi-process APIs like fork, IPC and signals). Therefore, both VMs and processes are needed by mainstream software stacks, which not only decreases efficiency but also increases vulnerability [4, 5].

Recently, commodity clouds (like Amazon's Elastic Compute Cloud [6] and Alibaba's Aliyun [7]) have completely changed the economics of large-scale computing [8], providing a public platform where tenants run their specific applications (e.g., a web server or a database server) on dedicated VMs. These highly-specialized VMs require only a very small subset of the overall functionality provided by a standard OS.

This has motivated two trends in cloud virtualization: the first is to enhance processes to achieve VM-like security, and the second is to reduce VMs to achieve process-like flexibility. For example, picoprocesses [9] augment processes to address the isolation and efficiency problems of conventional OS processes, and Unikernels [4] statically seal only the application binary and requisite libraries into a single bootable appliance image so that Unikernel VMs can be lightweight and flexible.

Based on these observations, our vision is that in the near future the two abstractions of VMs and processes might be fused into one new abstraction for cloud virtualization that embraces the best of both, providing VM-level isolation and security while preserving process-level efficiency and flexibility. We argue that a promising approach to this fusion is to start from the VM abstraction and add desirable process features following the library operating system (LibOS) [10] paradigm, which has been widely adopted in the virtualization literature [4, 11, 9, 12, 13, 14, 15, 16].
We present a reference implementation, dubbed cKernel (customized kernel), for the new abstraction. cKernel is essentially a LibOS-based virtualization architecture, which (i) removes the traditional OS and process layers and retains only VMs, and (ii) takes the hypervisor as an OS and imitates processes' dynamic mapping mechanism for VMs' pages and libraries. By drawing an analogy between conventional processes and VMs, cKernel has partially (i) realized the fork API for VMs, a basis for conventional multi-process programming abstractions, and (ii) supported dynamic library loading and linking for the minimized VMs.

The rest of this paper is organized as follows. Section 2 further introduces the background and motivation. Section 3 introduces a promising approach (cKernel) to realizing the fusion of processes and VMs. Section 4 presents cKernel's reference implementation. Section 5 introduces related work. Section 6 discusses future work. Finally, Section 7 concludes the paper.

Figure 1: Conventional VMs/processes vs. cKernel VM appliances.

2. Background

2.1. VMs vs. Processes

VMs rely on a virtual machine hypervisor, like Xen [1] or Linux KVM [17], to provide an application with all its dependencies in a virtualized, self-contained unit. As shown in Fig. 1 (left), the hardware dependency is provided through the abstraction of an emulated machine with a CPU kernel mode, multiple address spaces and virtual hardware devices, which in turn provides the software dependency by running a conventional operating system together with various libraries, perhaps slightly modified for para-virtualization [1]. VMs have been widely used for cloud computing since they provide desirable compatibility for major applications by reusing existing OS functions, including the support of various hardware.

However, running a conventional OS in a VM brings substantial overhead, since the guest OS may run many duplicated management processes, and each guest OS may demand gigabytes of disk storage and hundreds of megabytes of physical memory. This has recently motivated the revival of the LibOS for VM appliances [4, 18, 11]. For example, Unikernel [4] implements the OS functions (e.g., the networking system and the file system) as a set of libraries that can be separately linked into the VM appliance at compile time. The Unikernel (system libraries), application libraries and application binary are statically sealed into a single bootable image, which can run directly on the virtual hardware provided by the hypervisor. These VM-oriented LibOS designs effectively reduce the image size and memory footprint, taking the first step towards making a virtual machine more like a process.

Meanwhile, another trend is to improve processes with VM-like isolation and security. For instance, the picoprocess [9, 19] is a process-based isolation abstraction that can be viewed as a stripped-down virtual machine without an emulated CPU kernel mode, MMU or virtual devices. Picoprocess-oriented library OSs [12, 15, 13, 14, 16] provide an interface restricted to a narrowed set of host kernel ABIs [12], implementing the OS personalities as library functions and mapping high-level APIs onto a few interfaces to the host OS.

For example, Drawbridge [12] refactors Windows 7 to create a self-contained LibOS running in a picoprocess while still supporting rich desktop applications. Graphene [15] broadens the LibOS paradigm by supporting multi-process APIs in picoprocesses. Bascule [13] allows OS-independent extensions to be attached safely and efficiently at runtime. Tardigrade [14] utilizes picoprocesses to easily construct fault-tolerant services. And Haven [16] leverages the hardware protection of Intel Software Guard Extensions (SGX) to achieve shielded execution in picoprocesses against attacks from a malicious host. Similar designs for enhancing processes include Linux Container [20], Docker [21], OpenVZ [22], and VServer [23].

2.2. Vision

Currently, neither VMs nor processes can provide the benefits of the other, and thus both of them are needed by mainstream software stacks. This not only decreases the efficiency but also increases the vulnerability of cloud services. The two trends (enhancing processes to achieve VM-like security, and reducing VMs to achieve process-like flexibility) have motivated our vision that in the near future VMs and processes might be fused into one new abstraction for cloud virtualization, with the goal of making the best of both VMs and processes.

Considering the great success of VMs in security and backward compatibility, in this paper we discuss a promising approach to the fusion. We propose to start from the VM abstraction, strip off any functions unused by the accommodated applications, and add desirable process features like inter-process communication and dynamic library linking to the minimized VMs. We follow the technical and industrial trend of basing our design on the state-of-the-art LibOS paradigm, making virtual machines on the hypervisor comparable to processes on the OS, as depicted in Fig. 1 (right).
3. Fusion of VMs and Processes

This section introduces a promising approach (called cKernel) to realizing the new abstraction: a LibOS-based virtualization architecture that fuses VMs and processes and makes the best of both for the cloud.

3.1. cKernel VMs

Conventional general-purpose OSs are heavyweight and contain extensive features, so processes have been proposed to act as "lightweight VMs" that are cheaper to create, schedule and configure. In cloud environments, however, most cloud services run only a specific application in a single-purpose VM, which needs only a very small portion of the support a conventional OS provides. Therefore, we expect to reduce the OS and make the VM (even) as lightweight as a process.

Based on this observation, cKernel proposes to start from the VM abstraction and take minimalism to the extreme, sealing only the application binary into a single VM. Previous work [5] has shown that a VM can be booted and run in less than 3 milliseconds, which is comparable to process fork/exec on Ubuntu Linux (about 1 millisecond). This demonstrates the feasibility for cKernel to realize high-performance process-like VM creation and scheduling (such as VM fork, IPC-like inter-VM communication, and dynamic library linking) by taking the hypervisor as an OS. By drawing an analogy between VMs and processes, cKernel naturally relies on the hypervisor for multicore scheduling, following the multikernel mechanism [24] of running one VM per vCPU.

In the conventional virtualization architecture, threads have been proposed as "lightweight processes": multiple threads running in one process provide more efficient scheduling and communication. However, experience with many production systems has shown that threads are difficult to rein in large-scale distributed environments [25] due to problems like data races, deadlock and livelock, which are usually nondeterministic and counter-intuitive. On the other hand, in modern clouds one VM usually runs only one service instance, making threads less attractive than before. Therefore, cKernel adopts the simplified coroutine model [26] for concurrency, where many logical coroutines (each of which serves one client connection) run in the same VM/process, so as to achieve fast sequential performance while alleviating the problems of threads.
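To make the coroutine model concrete, the following minimal sketch (illustrative only, not cKernel code; it uses the portable POSIX ucontext API, and serve() is a stand-in for real connection handling) shows logical coroutines cooperatively yielding to a round-robin scheduler within a single VM/process:

    #include <stdio.h>
    #include <ucontext.h>

    /* One scheduler context plus one context per logical coroutine. */
    static ucontext_t sched_ctx, co_ctx[2];
    static char stacks[2][16384];

    /* Placeholder body: each coroutine "serves one connection",
     * yielding back to the scheduler after each step. */
    static void serve(int id) {
        for (int step = 0; step < 3; step++) {
            printf("coroutine %d: step %d\n", id, step);
            swapcontext(&co_ctx[id], &sched_ctx);  /* cooperative yield */
        }
    }

    int main(void) {
        for (int i = 0; i < 2; i++) {
            getcontext(&co_ctx[i]);
            co_ctx[i].uc_stack.ss_sp = stacks[i];
            co_ctx[i].uc_stack.ss_size = sizeof(stacks[i]);
            co_ctx[i].uc_link = &sched_ctx;        /* return here on exit */
            makecontext(&co_ctx[i], (void (*)(void))serve, 1, i);
        }
        /* Round-robin scheduler: no preemption, so no data races between
         * the sequential steps of different coroutines. */
        for (int round = 0; round < 3; round++)
            for (int i = 0; i < 2; i++)
                swapcontext(&sched_ctx, &co_ctx[i]);
        return 0;
    }

Because yields happen only at explicit points, the nondeterministic interleavings that plague threads are avoided while sequential performance is preserved.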
Next, we will in turn introduce how to add the two important features of conventional processes to cKernel VMs, namely, dynamic mapping of pages and dynamic linking of libraries.

3.2. Process-Like Fork of VMs

Since the process abstraction has been stripped off from the simplified cKernel architecture, cKernel VMs need to realize IPC (inter-process communication) to support conventional multi-process applications. cKernel lets a multi-process application keep the view that its processes run without any change, while these "processes" are actually cKernel VMs collaboratively running on the hypervisor.

The fork API is the basis for realizing IPC. It is straightforward to leverage the hypervisor's memory mapping functions (like Xen's page sharing mechanism) to implement fork. The fork API creates a child VM by duplicating the calling parent VM and returns the child's ID to the parent. cKernel needs to maintain all the information of the parent and child VMs, including the IDs, the offsets within shared pages, the references of the grant table, etc.
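From the application's perspective, the intended outcome is that the conventional POSIX multi-process idiom keeps working unchanged while each "process" is backed by a cKernel VM. The following standard POSIX snippet illustrates that idiom (the wait call shows parent/child synchronization and is not a claim about what the current cKernel prototype already supports):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        pid_t id = fork();          /* on cKernel: duplicates the calling VM */
        if (id == 0) {              /* child: conceptually a new cKernel VM  */
            printf("child running\n");
            _exit(0);
        } else if (id > 0) {        /* parent: receives the child's ID       */
            waitpid(id, NULL, 0);
            printf("child %d done\n", (int)id);
        } else {
            perror("fork");
        }
        return 0;
    }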
3.3. Process-Like Library Linking of VMs

Full VMs can easily realize dynamic library linking by leveraging their rich-featured OS functionalities. In the cKernel architecture, however, the minimized VMs are monolithically sealed at compile time, and thus libraries cannot be dynamically linked as before. Static library linking brings great inefficiency to cKernel. For example, the entire VM image has to be re-compiled and re-deployed to update any one of its libraries.

Therefore, it is necessary to design a dynamic LibOS for cKernel's minimized VMs, which supports mapping shared libraries (along with their states) onto the VMs' virtual address space at runtime. To achieve this goal, cKernel follows the "shell-core" mechanism [2], sealing only the application binary into a single VM (the shell) and dynamically linking libraries (the core) at runtime. It (i) adds a dynamic segment to the VM's virtual address space and (ii) dynamically loads and maps the requisite libraries, in a way similar to the dynamic linking of conventional processes by the language runtime (like glibc) [27].

Specifically, cKernel initializes a VM and then jumps to an entry function provided by the hypervisor. cKernel adopts a single virtual address space where the basic OS functions (e.g., bootstrap and memory allocation), the system libraries (including the language runtime), the application binary and libraries, and the global data structures co-locate to run without kernel/user space separation or process scheduling.
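For reference, the process-side mechanism that this design imitates is the runtime linker's dlopen/dlsym interface, by which a conventional process maps a shared library into its address space and resolves symbols at runtime. The snippet below is standard POSIX dlfcn code shown only as the analogy, not cKernel's interface (compile with -ldl on Linux):

    #include <stdio.h>
    #include <dlfcn.h>

    int main(void) {
        /* Map libm into this process's address space at runtime. */
        void *lib = dlopen("libm.so.6", RTLD_NOW);
        if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

        /* Resolve a symbol from the newly mapped library. */
        double (*cosine)(double) = (double (*)(double))dlsym(lib, "cos");
        if (cosine) printf("cos(0) = %f\n", cosine(0.0));

        dlclose(lib);
        return 0;
    }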
4. Reference Implementation

We present a reference implementation of cKernel by modifying MiniOS [11], a para-virtualized LibOS for user domains on Xen [1], where the application binary and necessary libraries are statically sealed into a single bootable image like Unikernel [4]. MiniOS has implemented the basic functionalities needed to run an appliance on Xen. It has a single address space and a simple scheduler without preemptive scheduling/threading.

cKernel leverages the GNU cross-compiling development tool chain (including the GCC compiler) to support source-code compatibility (mainly for C in our current prototype), so that a legacy Linux application can be re-compiled into a cKernel appliance that runs directly on Xen. It incorporates the lightweight RedHat Newlib [28] to provide the standard C library functions.

4.1. Fork

cKernel fork is designed to duplicate the VM that accommodates the caller application. When the fork API is invoked in the parent VM, cKernel uses inline assembly to get the values of the CPU registers. Dom0 is responsible for duplicating and starting the child VM.

The parent domain uses the grant table to selectively share its data with its child, including the stack, heap, .bss and data sections. This improves security by avoiding unauthorized access. For instance, the parent may choose not to share its stack if the child never accesses it, and in a key-value store application a child domain servicing a specific user should be given access to only that user's data. Writable data is shared in a copy-on-write mode.

The forked child VM can access the authorized data of the parent VM. The child and parent may also use shared pages to communicate with each other, i.e., realizing conventional inter-process communication APIs in the cKernel architecture, such as pipe, signal, and message queue. The Xen-based implementation of inter-VM page mapping will be studied in our future work.
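As a sketch of the selective-sharing step, assuming a MiniOS-style gnttab_grant_access primitive and a virt_to_mfn translation helper (both declared here with simplified signatures; the real bookkeeping of VM IDs, page offsets and grant references described above is richer), the parent could grant a child read-only access to one of its regions page by page:

    #include <stdint.h>

    /* Assumed MiniOS-style primitives (signatures simplified for this sketch). */
    typedef uint32_t grant_ref_t;
    typedef uint16_t domid_t;
    extern grant_ref_t gnttab_grant_access(domid_t domid, unsigned long frame,
                                           int readonly);
    extern unsigned long virt_to_mfn(void *va);

    #define PAGE_SIZE 4096UL

    /* Selectively grant the child read-only access to one parent region,
     * page by page; writable data would instead be shared copy-on-write. */
    void share_region_with_child(domid_t child, void *start, unsigned long len,
                                 grant_ref_t *refs) {
        unsigned long npages = (len + PAGE_SIZE - 1) / PAGE_SIZE;
        for (unsigned long i = 0; i < npages; i++) {
            void *page = (char *)start + i * PAGE_SIZE;
            refs[i] = gnttab_grant_access(child, virt_to_mfn(page), 1 /* RO */);
        }
    }

The returned grant references are part of the per-child state that cKernel must track so the child can later map the pages and the parent can revoke access.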
4.2. Dynamic Library Linking

The original design of MiniOS cannot support dynamic libraries. Fig. 2 shows the augmented memory layout of cKernel, which is divided into multiple subspaces for the regions of text and data, front driver info, grant table, external I/O, heap, and Xen-reserved.

Figure 2: cKernel memory layout. The blue region (the dynamic segment) is where dynamic libraries are loaded.

The text and data region contains (i) the kernel segment, including the pre-compiled basic OS functions, any static libraries, and the application binary; (ii) the dynamic segment, where all the dynamic libraries are loaded; and (iii) the structures needed by cKernel, such as the physical-to-machine (p2m) mapping table, arch-specific structures, page tables and stacks.

The front driver info region is for realizing the split driver model of Xen. The grant table region is used to share pages with dom0 and other domUs. The external I/O region maps the external devices into the virtual address space to facilitate I/O operations. The heap region is used to allocate memory, growing in 4KB page chunks. And the Xen-reserved region is used by Xen for hypervisor-level management and is not available to cKernel.

By emulating the procedure of dynamic mapping in UNIX processes that support shared libraries, we have implemented cKernel's dynamic mapping mechanism for shared libraries in the Xen control library (libxc), which is used by the upper-layer control tool stacks like XL (XM), Libvirt/VIRSH, and XAPI/XE.

After creating the domain boot loader, XL builds a para-virtualized domU, during which it (i) parses the kernel image file, (ii) initiates the boot memory, (iii) builds the image in memory, and (iv) boots up the image for the domU. We modify the third step (building the image) and add a function xc_dom_map_dyn(), which maps the dynamic libraries at runtime to the virtual addresses of the dynamic segment in the text and data region depicted in Fig. 2, as summarized in Algorithm 1.

Algorithm 1 Dynamic Library Linking
1: procedure XC_DOM_MAP_DYN
2:   Create domain boot loader
3:   Parse kernel image file
4:   Initiate boot memory
5:   Build image in memory
6:   Read dynamic lib info from program header table
7:   for each dynamic lib do
8:     Retrieve info of dynamic sections
9:     Load dynamic sections
10:    Relocate unresolved symbols
11:  end for
12:  Jump to the program entry
13: end procedure
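Step 6 of Algorithm 1 corresponds to standard ELF bookkeeping. The following minimal sketch of that step (plain <elf.h>, 64-bit; assuming the image is already mapped at img and, as a simplification, that stored addresses can be treated as image offsets) locates the PT_DYNAMIC segment and lists the DT_NEEDED libraries to be loaded and relocated:

    #include <elf.h>
    #include <stdio.h>

    /* Walk the program header table of a mapped ELF image, find PT_DYNAMIC,
     * and print each DT_NEEDED entry (the libraries to load and relocate). */
    void list_needed_libs(const unsigned char *img) {
        const Elf64_Ehdr *eh = (const Elf64_Ehdr *)img;
        const Elf64_Phdr *ph = (const Elf64_Phdr *)(img + eh->e_phoff);

        for (int i = 0; i < eh->e_phnum; i++) {
            if (ph[i].p_type != PT_DYNAMIC)
                continue;
            const Elf64_Dyn *dyn = (const Elf64_Dyn *)(img + ph[i].p_offset);

            /* First pass: find the string table that names the libraries
             * (treating the stored address as an offset, valid only for
             * images loaded at their link address). */
            const char *strtab = NULL;
            for (const Elf64_Dyn *d = dyn; d->d_tag != DT_NULL; d++)
                if (d->d_tag == DT_STRTAB)
                    strtab = (const char *)(img + d->d_un.d_ptr);

            /* Second pass: print each needed library. */
            for (const Elf64_Dyn *d = dyn; d->d_tag != DT_NULL; d++)
                if (d->d_tag == DT_NEEDED && strtab)
                    printf("needs: %s\n", strtab + d->d_un.d_val);
        }
    }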
5. Related Work & Discussion

This section discusses the work related to the fused virtualization architecture, including the virtualization techniques of virtual machines and containers (Section 5.1), modular kernels and library OSs (Section 5.2), and security issues (Section 5.3).

5.1. Virtual Machines & Containers

Currently, full virtual machine hypervisors like Xen [1] and Linux KVM [17] are the dominant virtualization technique, providing an application with all its dependencies on both hardware and software in a virtualized, self-contained unit [29].

The full-VM, rich-featured-OS architecture is heavyweight and inefficient. Therefore, researchers have proposed container-based virtualization techniques, such as picoprocesses [9], Linux Container [20], Docker [21], OpenVZ [22], and VServer [23]. A container is essentially an OS process plus its executing environment, so it is much smaller than a traditional full VM and boots up much faster.

The picoprocess [9] is a process-based isolation container. It is a highly restricted process that is prevented from making kernel calls, and it provides a single memory-isolated address space with strictly user-mode CPU execution and a narrow interface for implementing specific OS personalities [12].

Linux Container (LXC) [20] runs multiple isolated containers over a single Linux host, each of which creates its own virtual spaces for its processes. It uses Linux kernel cgroups [30] to multiplex and prioritize resources such as CPU cycles and memory, and it leverages Linux namespaces to isolate an application's view of users, processes, file systems and networking [31].

A Docker [21] container is made up of a base image plus multiple committed layers, and it also uses the resource isolation features of Linux to allow multiple independent containers to run within a single Linux host. Traditionally, Docker supports a single process only, so a separate process manager like supervisor is needed to run multiple processes in Docker.

Although containers are smaller and have better performance, full VMs are still dominant in public commodity clouds and production platforms due to their better isolation, security, portability and compatibility.

5.2. Modular Kernels & Library OSs

A microkernel [32] provides basic functions such as virtual memory management and interprocess communication (IPC), while many other functions, like file systems, drivers and protocols, run in user space. Microkernels have mainly argued for extensibility and flexibility [33, 34, 35, 36]. For example, the SPIN [33] microkernel allows applications to make policy decisions by safely downloading extensions into the kernel, and Scout [34], Vino [35] and Nemesis [36] efficiently realize communication, database and multimedia applications on top of microkernels, respectively.

The exokernel architecture [10] achieves higher flexibility than microkernels by providing application-level secure multiplexing of physical resources and enabling application-specific customization of conventional OS abstractions. A small exokernel securely exports all hardware resources through a low-level interface to untrusted library OSs, which use this interface to implement system functions. Exokernels provide applications with higher performance than traditional microkernel systems. For example, ExOS [10] and JOS [37] provide much more efficient application-level virtual memory and inter-process communication primitives, and SPACE [38] achieves better I/O performance while providing low-level kernel abstractions defined by the trap and architectural interface. Cache Kernel [39] provides a low-level kernel that supports multiple application-level OSs, and thus it can be viewed as a variation of the exokernel; the difference is that Cache Kernel mainly focuses on reliability rather than on secure multiplexing of physical resources.

With the proliferation of cloud computing and single-purpose VM appliances, Unikernels [4] target commodity clouds and compile the applications and libraries into a single bootable VM image that runs directly on a VM hypervisor (e.g., Xen). Mirage [4] is an OCaml [40] based unikernel system where the applications are written in high-level type-safe OCaml code and are statically sealed at compile time into VM appliances against modification, which provides better security than previous library OSs like Singularity [41]. Mirage eschews backward compatibility [12] to achieve higher performance, but it is nontrivial to rewrite the numerous legacy services and applications (written in C, Java, Python, etc.) in OCaml.

Jitsu [42] uses Unikernel to design a power-efficient and responsive platform for hosting cloud services in the edge network, providing a new Xen toolstack that satisfies the demands of secure multi-tenant isolation on resource-constrained embedded ARM devices.

MiniOS [11] designs and implements a unikernel-style library OS that runs as a para-virtualized guest OS within a Xen domain. MiniOS has better backward compatibility and supports some single-process applications written in C. However, the original MiniOS also statically seals a monolithic appliance and suffers from problems similar to Mirage's.

Libra [43] provides a library OS abstraction for the JVM directly over Xen and relies on a separate Xen domain to provide networking and storage, which is similar to the design of MiniOS.

Library OSs may also run in picoprocesses [12, 15, 13, 14, 16]. The picoprocess-based library OSs implement the OS personalities as library functions and map high-level APIs onto a few interfaces to the host OS. A picoprocess-based LibOS is more like a traditional OS process and naturally supports dynamic shared libraries. However, currently picoprocess-based library OSs run only a handful of custom applications on small research prototypes, mainly because they require much engineering effort to achieve the same security and compatibility as full VMs.

5.3. Security Issues

cKernel VMs are expected to provide network-facing services in the public multi-tenant cloud and to run inside full virtual machines above the Xen hypervisor layer; therefore we treat both the hypervisor and the control domain (dom0) as part of the trusted computing base (TCB). Software running in cKernel VMs is under potential threat both from other tenants in the cloud and from malicious hosts connected via the Internet. The full VM architecture allows cKernel to use the hypervisor as the guard for isolation against internal attacks from other tenants, and the use of secure protocols like SSL and SSH helps cKernel appliances trust external entities.
Some modern OSs apply type-safe languages (e.g., OCaml [4], Haskell [44], and C# [41]) to achieve type-safety. For example, Mirage [4] uses OCaml to implement a set of type-safe protocol libraries, and it provides static type-safety with a single address-space layout that is made immutable via a hypervisor extension. But there are still non-type-safe components (e.g., the garbage collector) in Mirage. Clearly, cKernel is written in C and thus is not type-safe. However, the cKernel architecture is orthogonal to the implementation language, and it is possible to implement a "type-safe" cKernel using type-safe languages. Note that to achieve complete type-safety, not only the OS and applications but also the hypervisor and dom0 must not allow the type-safety property to be violated.

Another approach to improving security is to reduce the TCB of a VM through disaggregation. For example, Anderson et al. propose to run security-sensitive components in separate Xen domains for MiniOS [11], leveraging the fully virtualized processors and the Trusted Computing Group's trusted platform module (TPM) provided by the hypervisor. Murray et al. implement "trusted virtualization" for Xen and improve the security of the virtual TPM by moving the domain builder, the most important privileged component, into a minimal trusted compartment [45]. Currently the TCB of cKernel includes the hypervisor, the privileged dom0 and the necessary management tools hosted on top of the full Linux OS running in dom0; therefore it is worth studying the disaggregation approach to reduce the TCB of cKernel in future work.

6. Future Work

The fusion of VMs and processes provides the cloud with a desirable virtualization architecture that embraces the best of both, but a lot of work remains to be done in this promising direction.

First, cKernel currently supports fork but not yet IPC APIs like pipe, signal and message queue, so it is urgent to implement the complete IPC abstractions for cKernel. It is also necessary to study the subtle differences between VMs and processes. For example, it is common for a web server to fork a worker process to service a user request, but a worker VM will have a new IP address and lose the connection to the user. The fork API could also be extended to support a spawn() primitive for fast booting of a large number of VM appliances in OpenStack.

Second, the current cKernel implementation relies on the (unmodified) shared memory mechanism of Xen, whose networking system, file system and threading model are simple and inefficient. This might result in relatively poor performance [46]. For example, the current networking implementation of Xen is insufficient, and we could adopt the ClickOS [46] modifications to improve it. It is also desirable to incorporate the musl library [47] instead of RedHat Newlib for better runtime performance.

Third, it is important to develop applications that demonstrate the new architecture. For example, it is interesting to build a cloud platform on which multiple large-scale cKernel applications are deployed securely and quickly. It is challenging to build a cKernel pool for fast VM booting, which accommodates sleeping (empty) VMs that can be woken up by dynamically linking them to the designated applications as shared libraries. It is also useful to build basic cloud services adopting the fused architecture, such as a large-scale persistent in-memory key-value store [48, 49].

Fourth, currently the customers of a cloud must trust the cloud provider with both the hypervisor and the cKernel VMs, while recent advances in hardware protection (Intel SGX [16]) provide the opportunity to shield cKernel VMs not only against unforeseen bugs of the compiler, runtime, hypervisor or toolchain, but also against attacks from malicious hosts and operators.

Last, it is also vital to implement the fused virtualization architecture for other VM hypervisors like Linux KVM and VMware vSphere.

7. Conclusion

In this paper we propose a new virtualization architecture that fuses the abstractions of VMs and processes for the cloud. We present a LibOS-based reference implementation (cKernel) by modifying MiniOS on top of Xen, which seals only the application binary into a bootable VM and supports dynamic library linking and a conventional process-like fork. The new virtualization architecture has a number of advantages. For instance, it (i) allows online library updates by leveraging dynamic library mapping, (ii) supports conventional multi-process applications by realizing VM fork, and (iii) improves the flexibility of VM management by emulating processes' scheduling and operations. Some key components of cKernel are already available as open source [50].

Acknowledgement

This work was supported in part by the National Basic Research Program of China (973) under Grant No. 2014CB340303 and the National Natural Science Foundation of China (NSFC) under Grant No. 61772541, 61379055 and 61379053. This work was performed when the first author was visiting the Computer Lab at the University of Cambridge between 2012 and 2013. The authors would like to thank Huiba Li, Ziyang Li, Yuxing Peng, Ling Liu, Jon Crowcroft, the iVCE Group, the NetOS group of the Computer Lab at the University of Cambridge, and the anonymous reviewers for their insightful comments.

References
[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization," ACM SIGOPS Operating Systems Review, vol. 37, no. 5, pp. 164–177, 2003.
[2] X. Lu, H. Wang, and J. Wang, "Internet-based virtual computing environment (iVCE): Concepts and architecture," Science in China Series F: Information Sciences, vol. 49, no. 6, pp. 681–701, 2006.
[3] X. Lu, H. Wang, J. Wang, and J. Xu, "Internet-based virtual computing environment: Beyond the data center as a computer," Future Generation Computer Systems, vol. 29, no. 1, pp. 309–322, 2013.
[4] A. Madhavapeddy, R. Mortier, S. Smith, S. Hand, and J. Crowcroft, "Unikernels: Library operating systems for the cloud," in ACM ASPLOS. ACM, 2013, pp. 461–472.
[5] F. Manco, C. Lupu, F. Schmidt, J. Mendes, S. Kuenzer, S. Sati, K. Yasukata, C. Raiciu, and F. Huici, "My VM is lighter (and safer) than your container," 2017.
[6] https://aws.amazon.com/ec2/.
[7] http://www.aliyun.com/.
[8] A. Madhavapeddy, R. Mortier, R. Sohan, T. Gazagnaire, S. Hand, T. Deegan, D. McAuley, and J. Crowcroft, "Turning down the LAMP: Software specialisation for the cloud," in Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing (HotCloud), vol. 10, 2010.
[9] J. R. Douceur, J. Elson, J. Howell, and J. R. Lorch, "Leveraging legacy code to deploy desktop applications on the web," in OSDI, vol. 8, 2008, pp. 339–354.
[10] D. R. Engler, M. F. Kaashoek et al., "Exokernel: An operating system architecture for application-level resource management," in Fifteenth ACM Symposium on Operating Systems Principles. ACM, 1995.
[11] M. J. Anderson, M. Moffie, C. I. Dalton et al., "Towards trustworthy virtualisation environments: Xen library OS security service infrastructure," HP Tech Report, pp. 88–111, 2007.
[12] D. E. Porter, S. Boyd-Wickizer, J. Howell, R. Olinsky, and G. C. Hunt, "Rethinking the library OS from the top down," ACM SIGPLAN Notices, vol. 46, no. 3, pp. 291–304, 2011.
[13] A. Baumann, D. Lee, P. Fonseca, L. Glendenning, J. R. Lorch, B. Bond, R. Olinsky, and G. C. Hunt, "Composing OS extensions safely and efficiently with Bascule," in Proceedings of the 8th ACM European Conference on Computer Systems. ACM, 2013, pp. 239–252.
[14] J. R. Lorch, A. Baumann, L. Glendenning, D. T. Meyer, and A. Warfield, "Tardigrade: Leveraging lightweight virtual machines to easily and efficiently construct fault-tolerant services," in 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), 2015.
[15] C.-C. Tsai, K. S. Arora, N. Bandi, B. Jain, W. Jannen, J. John, H. A. Kalodner, V. Kulkarni, D. Oliveira, and D. E. Porter, "Cooperation and security isolation of library OSes for multi-process applications," in Proceedings of the Ninth European Conference on Computer Systems. ACM, 2014.
[16] A. Baumann, M. Peinado, and G. Hunt, "Shielding applications from an untrusted cloud with Haven," in 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2014.
[17] http://www.linux-kvm.org/.
[18] http://osv.io/.
[19] J. Howell, B. Parno, and J. R. Douceur, "How to run POSIX apps in a minimal picoprocess," in USENIX Annual Technical Conference, 2013, pp. 321–332.
[20] https://linuxcontainers.org.
[21] https://www.docker.com.
[22] http://openvz.org.
[23] http://linux-vserver.org.
[24] A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, S. Peter, T. Roscoe, A. Schüpbach, and A. Singhania, "The multikernel: A new OS architecture for scalable multicore systems," in Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles. ACM, 2009, pp. 29–44.
[25] A. Ho, S. Smith, and S. Hand, "On deadlock, livelock, and forward progress," Technical Report, University of Cambridge, Computer Laboratory, May 2005.
[26] S. R. Maxwell, "Experiments with a coroutine execution model for genetic programming," in Proceedings of the First IEEE Conference on Evolutionary Computation, IEEE World Congress on Computational Intelligence. IEEE, 1994, pp. 413–417.
[27] Y. Zhang, D. Li, and Y. Xiong, "Enabling online library update for VM appliances in the cloud," NUDT, Tech. Rep., 2016.
[28] https://sourceware.org/newlib/.
[29] J. Rutkowska and R. Wojtczuk, "Qubes OS architecture," Invisible Things Lab Tech Rep, vol. 54, 2010.
[30] https://en.wikipedia.org/wiki/Cgroups.
[31] R. Rosen, "Resource management: Linux kernel namespaces and cgroups," Haifux, May 2013.
[32] P. B. Hansen, "The nucleus of a multiprogramming system," Communications of the ACM, vol. 13, no. 4, pp. 238–241, 1970.
[33] B. N. Bershad, S. Savage, P. Pardyak, E. G. Sirer, M. E. Fiuczynski, D. Becker, C. Chambers, and S. Eggers, "Extensibility, safety and performance in the SPIN operating system," in Fifteenth ACM Symposium on Operating Systems Principles. ACM, 1995.
[34] A. B. Montz, D. Mosberger, S. W. O'Malley, L. L. Peterson, and T. A. Proebsting, "Scout: A communications-oriented operating system," in HotOS. USENIX, 1995.
[35] C. Small and M. Seltzer, "VINO: An integrated platform for operating systems and database research," Harvard, Tech. Rep., 1994.
[36] I. M. Leslie, D. McAuley, R. Black, T. Roscoe, P. Barham, D. Evers, R. Fairbairns, and E. Hyden, "The design and implementation of an operating system to support distributed multimedia applications," IEEE Journal on Selected Areas in Communications, vol. 14, no. 7, pp. 1280–1297, 1996.
[37] http://pdos.csail.mit.edu/6.828.
[38] D. Probert, J. Bruno, and M. Karaorman, "SPACE: A new approach to operating system abstraction," in 1991 International Workshop on Object Orientation in Operating Systems. IEEE, 1991, pp. 133–137.
[39] D. R. Cheriton and K. J. Duda, "A caching model of operating system kernel functionality," in Proceedings of the 1st USENIX Conference on Operating Systems Design and Implementation. USENIX Association, 1994.
[40] https://ocaml.org.
[41] G. C. Hunt and J. R. Larus, "Singularity: Rethinking the software stack," ACM SIGOPS Operating Systems Review, vol. 41, no. 2, pp. 37–49, 2007.
[42] A. Madhavapeddy, T. Leonard, M. Skjegstad, T. Gazagnaire, D. Sheets, D. Scott, R. Mortier, A. Chaudhry, B. Singh, J. Ludlam et al., "Jitsu: Just-in-time summoning of unikernels," in 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), 2015.
[43] G. Ammons, J. Appavoo, M. Butrico, D. Da Silva, D. Grove, K. Kawachiya, O. Krieger, B. Rosenburg, E. Van Hensbergen, and R. W. Wisniewski, "Libra: A library operating system for a JVM in a virtualized execution environment," in VEE, vol. 7, 2007, pp. 44–54.
[44] T. Hallgren, M. P. Jones, R. Leslie, and A. Tolmach, "A principled approach to operating system construction in Haskell," in ACM SIGPLAN Notices, vol. 40, no. 9. ACM, 2005, pp. 116–128.
[45] D. G. Murray, G. Milos, and S. Hand, "Improving Xen security through disaggregation," in Proceedings of the Fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments. ACM, 2008, pp. 151–160.
[46] J. Martins, M. Ahmed, C. Raiciu, V. Olteanu, M. Honda, R. Bifulco, and F. Huici, "ClickOS and the art of network function virtualization," in 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI 14). USENIX Association, 2014, pp. 459–473.
[47] http://www.musl-libc.org.
[48] Y. Zhang, C. Guo, D. Li, R. Chu, H. Wu, and Y. Xiong, "CubicRing: Enabling one-hop failure detection and recovery for distributed in-memory storage systems," in 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), Oakland, CA, USA, 2015, pp. 529–542.
[49] J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, G. Parulkar, M. Rosenblum et al., "The case for RAMClouds: Scalable high-performance storage entirely in DRAM," ACM SIGOPS Operating Systems Review, vol. 43, no. 4, pp. 92–105, 2010.
[50] http://www.kylinx.com/.
