The Multi-Principal OS Construction of the Gazelle Web Browser

Helen J. Wang∗, Chris Grier†, Alexander Moshchuk‡, Samuel T. King†, Piali Choudhury∗, Herman Venter∗
∗Microsoft Research, †University of Illinois at Urbana-Champaign, ‡University of Washington
{helenw,pialic,hermanv}@microsoft.com, {grier,kingst}@uiuc.edu, anm@cs.washington.edu

Abstract

Original web browsers were applications designed to view static web content. As web sites evolved into dynamic web applications that compose content from multiple web sites, browsers have become multi-principal operating environments with resources shared among mutually distrusting web site principals. Nevertheless, no existing browsers, including new architectures like IE 8, Google Chrome, and OP, have a multi-principal operating system construction that gives a browser-based OS the exclusive control to manage the protection of all system resources among web site principals.

In this paper, we introduce Gazelle, a secure web browser constructed as a multi-principal OS. Gazelle’s browser kernel is an operating system that exclusively manages resource protection and sharing across web site principals. This construction exposes intricate design issues that no previous work has identified, such as cross-protection-domain display and events protection. We elaborate on these issues and provide comprehensive solutions.

Our prototype implementation and evaluation experience indicates that it is realistic to turn an existing browser into a multi-principal OS that yields significantly stronger security and robustness with acceptable performance.

1 Introduction

Web browsers have evolved into a multi-principal operating environment where a principal is a web site [43]. Similar to a multi-principal OS, recent proposals [12, 13, 23, 43, 46] and browsers like IE 8 [34] and Firefox 3 [16] advocate and support programmer abstractions for protection (e.g., <sandbox> in addition to <iframe> [43]) and cross-principal communication (e.g., PostMessage [24, 43]). Nevertheless, no existing browsers, including new architectures like IE 8 [25], Google Chrome [37], and OP [21], have a multi-principal OS construction that gives a browser-based OS, typically called the browser kernel, the exclusive control to manage the protection and fair sharing of all system resources among browser principals.

In this paper, we present a multi-principal OS construction of a secure web browser, called Gazelle. Gazelle’s browser kernel exclusively provides cross-principal protection and fair sharing of all system resources. In this paper, we focus only on resource protection in Gazelle.

In Gazelle, the browser kernel runs in a separate protection domain (an OS process in our implementation), interacts with the underlying OS directly, and exposes a set of system calls for web site principals. We use the same web site principal as defined in the same-origin policy (SOP), which is labeled by a web site’s origin, the triple of <protocol, domain name, port>. In this paper, we use “principal” and “origin” interchangeably. Unlike previous browsers, Gazelle puts web site principals into separate protection domains, completely segregating their access to all resources. Principals can communicate with one another only through the browser kernel using inter-process communication. Unlike all existing browsers except OP, our browser kernel offers the same protection to plugin content as to standard web content.

Such a multi-principal OS construction for a browser brings significant security and reliability benefits to the overall browser system: the compromise or failure of a principal affects that principal alone, leaving other principals and the browser kernel unaffected.

Although our architecture may seem to be a straightforward application of multi-principal OS construction to the browser setting, it exposes intricate problems that did not surface in previous work, including display protection and resource allocation in the face of cross-principal web service composition common on today’s web. We will detail our solutions to the former and leave the latter as future work.

We have built an Internet-Explorer-based prototype that demonstrates Gazelle’s multi-principal OS architecture and at the same time uses all the backward-compatible parsing, DOM management, and JavaScript interpretation that already exist in IE. Our prototype experience indicates that it is feasible to turn an existing browser into a multi-principal OS while leveraging its existing capabilities.

With our prototype, we successfully browsed 19 out of the top 20 Alexa-reported popular sites [5] that we tested. The performance of our prototype is acceptable, and a significant portion of the overhead comes from IE instrumentation, which can be eliminated in a production implementation.

We expect that the Gazelle architecture can be made fully backward compatible with today’s web. Neverthe-
less, it is interesting to investigate the compatibility cost of eliminating the insecure policies in today’s browsers. We give such a discussion based on a preliminary analysis in Section 9.

For the rest of the paper, we first give an in-depth comparison with related browser architectures in Section 2. We then describe Gazelle’s security model in Section 3. In Section 4, we present our architecture, its design rationale, and how we treat the subtle issue of legacy protection for cross-origin script source. In Section 5, we elaborate on the problem statement and design for cross-principal, cross-process display protection. We give a security analysis including a vulnerability study in Section 6. We describe our implementation in Section 7. We measure the performance of our prototype in Section 8. We discuss the tradeoffs of compatibility vs. security for a few browser policies in Section 9. Finally, we conclude and address future work in Section 10.

2 Related Work

In this section, we discuss related browser architectures and compare them with Gazelle.

2.1 Google Chrome and IE 8

In concurrent work, Reis et al. detailed the various process models supported by Google Chrome [37]: monolithic process, process-per-browsing-instance, process-per-site-instance, and process-per-site. A browsing instance contains all interconnected (or inter-referenced) windows including tabs, frames and subframes regardless of their origin. A site instance is a group of same-site pages within a browsing instance. A site is defined as a set of SOP origins that share a registry-controlled domain name: for example, attackerAd.socialnet.com, alice.profiles.socialnet.com, and socialnet.com share the same registry-controlled domain name socialnet.com, and are considered to be the same site or principal by Chrome. Chrome uses the process-per-site-instance model by default. Furthermore, Reis et al. [37] gave the caveats that Chrome’s current implementation does not support strict site isolation in the process-per-site-instance and process-per-site models: embedded principals, such as a nested iframe sourced at a different origin from the parent page, are placed in the same process as the parent page.

The monolithic and process-per-browsing-instance models in Chrome do not provide memory or other resource protection across multiple principals in a monolithic process or browser instance. The process-per-site model does not provide failure containment across site instances [37]. Chrome’s process-per-site-instance model is the closest to Gazelle’s two processes-per-principal-instance model, but with several crucial differences: (1) Chrome’s principal is site (see above) while Gazelle’s principal is the same as the SOP principal. (2) A web site principal and its embedded principals co-exist in the same process in Chrome, whereas Gazelle places them into separate protection domains. Pursuing this design led us to new research challenges including cross-principal display protection (Section 5). (3) Plugin content from different principals or sites share a plugin process in Chrome, but are placed into separate protection domains in Gazelle. (4) Chrome relies on its rendering processes to enforce the same-origin policy among the principals that co-exist in the same process. These differences indicate that in Chrome, cross-principal (or -site) protection takes place in its rendering processes and its plugin process, in addition to its browser kernel. In contrast, Gazelle’s browser kernel functions as an OS, managing cross-principal protection on all resources, including display.

IE 8 [25] uses OS processes to isolate tabs from one another. This granularity is insufficient since a user may browse multiple mutually distrusting sites in a single tab, and a web page may contain an iframe with content from an untrusted site (e.g., ads).

Fundamentally, Chrome and IE 8 have different goals from that of Gazelle. Their use of multiple processes is for failure containment across the user’s browsing sessions rather than for security. Their security goal is to protect the host machine from the browser and the web; this is achieved by process sandboxing [9]. Chrome and IE 8 achieved a good milestone in the evolution of the browser architecture design. Looking forward, as the world creates and migrates more data and functionality into the web and establishes the browser as a dominant application platform, it is critical for browser designers to think of browsers as operating systems and protect web site principals from one another in addition to the host machine. This is Gazelle’s goal.

2.2 Experimental browsers

The OP web browser [21] uses processes to isolate browser components (i.e., HTML engine, JavaScript interpreter, rendering engine) as well as pages of the same origin. In OP, intimate interactions between browser components, such as JavaScript interpreter and HTML engine, must use IPC and go through its browser kernel. The additional IPC cost does not add much benefit: isolating browser components within an instance of a web page provides no additional security protection. Furthermore, besides plugins, basic browser components are fate-shared in web page rendering: the failure of any one browser component results in most web pages not
functioning properly. Therefore, process isolation across these components does not provide any failure containment benefits either. Lastly, OP’s browser kernel does not provide all the cross-principal protection needed as an OS because it delegates display protection to its processes.

Tahoma [11] uses virtual machines to completely isolate (its own definition of) web applications, disallowing any communications between the VMs. A web application is specified in a manifest file provided to the virtual machine manager and typically contains a suite of web sites of possibly different domains. Consequently, Tahoma doesn’t provide protection to existing browser principals. In contrast, Gazelle’s browser kernel protects browser principals first hand.

The Building a Secure Web Browser project [27, 28] uses SubOS processes to isolate content downloading, display, and browser instances. SubOS processes are similar to Unix processes except that instead of a user ID, each process has a SubOS ID with OS support for isolation between objects with different SubOS IDs. SubOS instantiates a browser instance with a different SubOS process ID for each URL. This means that the principal in SubOS is labelled with the URL of a page (protocol, host name plus path) rather than the SOP origin as in Gazelle. Nevertheless, SubOS does not handle embedded principals, unlike Gazelle. Therefore, they also do not encounter the cross-principal display-sharing issue which we tackle in depth. SubOS’s principal model would also require all cross-page interactions that are common within a SOP origin to go through IPC, incurring significant performance cost for many web sites.

3 Security model

3.1 Background: security model in existing browsers

Today’s browsers have inconsistent access and protection model for various resources. These inconsistencies present significant hurdles for web programmers to build robust web services. In this section, we give a brief background on the relevant security policies in existing browsers. Michal Zalewski gives an excellent and perhaps the most complete description of existing browsers’ security model to date [48].

Script. The same-origin policy (SOP) [39] is the central security policy on today’s browsers. SOP governs how scripts access the HTML document tree and remote store. SOP defines the origin as the triple of <protocol, domain-name, port>. SOP mandates that two documents from different origins cannot access each other’s HTML documents using the Document Object Model (DOM), which is the platform- and language-neutral interface that allows scripts to dynamically access and update the content, structure and style of a document [14]. A script can access its document origin’s remote data store using the XMLHttpRequest object, which issues an asynchronous HTTP request to the remote server [45]. (XMLHttpRequest is the cornerstone of AJAX programming.) SOP allows a script to issue an XMLHttpRequest only to its enclosing page’s origin. A script executes as the principal of its enclosing page though its source code is not readable in a cross-origin fashion.

For example, an <iframe> with source http://a.com cannot access any HTML DOM elements from another <iframe> with source http://b.com and vice versa. http://a.com’s scripts (regardless of where the scripts are hosted) can issue XMLHttpRequests to only a.com. Furthermore, http://a.com and https://a.com are different origins because of the protocol difference.

Cookies. For cookie access, by default, the principal is the host name and path, but without the protocol [19, 32]. For example, if the page a.com/dir/1.html creates a cookie, then that cookie is accessible to a.com/dir/2.html and other pages from that directory and its subdirectories, but is not accessible to a.com/. Furthermore, https://a.com/ and http://a.com/ share the cookie store unless a cookie is marked with a “secure” flag. Non-HTTPS sites may still set secure cookies in some implementations, just not read them back [48]. A web programmer can make cookie access less restrictive by setting a cookie’s domain attribute to a postfix domain or the path name to be a prefix path. The browser ensures that a site can only set its own cookie and that a cookie is attached only to HTTP requests to that site.

The path-based security policy for cookies does not play well with SOP for scripts: scripts can gain access to all cookies belonging to a domain despite path restrictions.

Plugins. Current major browsers do not enforce any security on plugins and grant plugins access to the local operating system directly. The plugin content is subject to the security policies implemented in the plugin software rather than the browser.

3.2 Gazelle’s security model

Gazelle’s architecture is centered around protecting principals from one another by separating their respective resources into OS-enforced protection domains. Any sharing between two different principals must be explicit using cross-principal communication (or IPC) mediated by the browser kernel.

We use the same principal as the SOP, namely, the triple of <protocol, domain-name, port>. While it is tempting to have a more fine-grained principal,
we need to be concerned with co-existing with current browsers [29, 43]: the protection boundary of a more fine-grained principal, such as a path-based principal, would break down in existing browsers. It is unlikely that web programmers would write very different versions of the same service to accommodate different browsers; instead, they would forego the more fine-grained principal and have a single code base.

The resources that need to be protected across principals [43] are memory such as the DOM objects and script objects, persistent state such as cookies, display, and network communications.

We extend the same principal model to all content types except scripts and style sheets (Section 4): the elements created by <object>, <embed>, <img>, and certain types of <input>¹ are treated the same as an <iframe>: the origin of the included content labels the principal of the content. This means that we enforce SOP on plugin content². This is consistent with the existing movement in popular plugins like Adobe Flash Player [20]. Starting with Flash 7, Adobe Flash Player uses the exact domain match (as in SOP) rather than the earlier “superdomain” match (where www.adobe.com and store.adobe.com have the same origin) [2]; and starting with Flash 9, the default ActionScript behavior only allows access to same-origin HTML content unlike the earlier default that allows full cross-origin interactions [1].

¹ <input> can be used to include an image using a “src” attribute.
² OP [21] calls this plugin policy the provider domain policy.

Gazelle’s architecture naturally yields a security policy that partitions all system resources across the SOP principal boundaries. Such a policy offers consistency across various resources. This is unlike current browsers where the security policies vary for different resources. For example, cookies use a different principal than that of scripts (see the above section); descendant navigation policy [7, 8] also implicitly crosses the SOP principal boundary (more in Section 5.1).

It is feasible for Gazelle to enable the same security policies as the existing browsers and achieve backward compatibility through cross-principal communications. Nevertheless, it is interesting to investigate the tradeoffs between supporting backward compatibility and eliminating insecure policies in today’s browsers. We give a preliminary discussion on this in Section 9.

4 Architecture

4.1 Basic Architecture

Figure 1 shows our basic architecture. A principal is the unit of protection. Principals need to be completely isolated in resource access and usage. Any sharing must be made explicit. Just as in desktop applications, where instances of an application are run in separate processes for failure containment and independent resource allocation, a principal instance is the unit of failure containment and the unit of resource allocation. For example, navigating to the same URL in different tabs corresponds to two instances of the same principal; when a.com embeds two b.com iframes, the b.com iframes correspond to two instances of b.com. However, the frames that share the same origin as the host page are in the same principal instance as the host page by default, though we allow the host page to designate an embedded same-origin frame or object as a separate principal instance for independent resource allocation and failure containment. Principal instances are isolated for all runtime resources, but principal instances of the same principal share persistent state such as cookies and other local storage. Protection unit, resource allocation unit, and failure containment unit can each use a different mechanism depending on the system implementation. Because the implementation of our principal instances contains native code, we use OS processes for all three purposes.

Our principal instance is similar to Google Chrome’s site instance [37], but with two crucial differences: 1) Google Chrome considers the sites that share the same registry-controlled domain name to be from the same site, so ad.datacenter.com, user.datacenter.com, and datacenter.com are considered to be the same site and belong to the same principal. In contrast, we consider them as separate principals. 2) When a site, say a.com, embeds another principal’s content, say an <iframe> with source b.com, Google Chrome puts them into the same site instance. In contrast, we put them into separate principal instances.

The browser kernel runs in a separate protection domain and interposes between browser principals and the traditional OS. The browser kernel mediates the principals’ access to system resources and enforces security policies of the browser. Essentially, the browser kernel functions as an operating system to browser principals and manages the protection and sharing of system resources for them. The browser kernel also manages the browser chrome, such as the address bar and menus. The browser kernel receives all events generated by the underlying operating system including user events like mouse clicks or keyboard entries; these events are then dispatched to the appropriate principal instance. When the user navigates a window by clicking on a hyperlink that points to a URL at a different origin, the browser kernel creates the protection domain for the URL’s principal instance (if one doesn’t exist already) to render the target page, destroys the protection domain of the hyperlink’s host page, and re-allocates and re-initializes the window to the URL’s principal instance. The browser

Figure 1: The Gazelle architecture

Figure 2: Supporting legacy protection
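
As a rough illustration of the content-fetching policy that Figures 1 and 2 depict (Sections 4.1 and 4.2), the sketch below models the kernel-side checks for getSameOriginContent and getCrossOriginContent in Python. It is a minimal sketch under stated assumptions, not Gazelle's implementation: the names `Origin`, `ProcessKind`, `PolicyError`, the snake_case function names, the default-port table, and the list of script and style MIME types are all hypothetical. No network fetch is performed; each call only returns the decision the kernel would reach.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlsplit

# Assumed default ports, used when a URL omits the port (hypothetical table;
# other schemes are out of scope for this sketch and raise KeyError).
DEFAULT_PORTS = {"http": 80, "https": 443}


@dataclass(frozen=True)
class Origin:
    """SOP principal: the triple <protocol, domain-name, port>."""
    protocol: str
    domain: str
    port: int

    @staticmethod
    def of(url: str) -> "Origin":
        parts = urlsplit(url)
        scheme = parts.scheme.lower()
        port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
        return Origin(scheme, (parts.hostname or "").lower(), port)


class ProcessKind(Enum):
    """The two processes run per principal instance in the modified design."""
    BROWSER_INSTANCE = "browser"
    PLUGIN_INSTANCE = "plugin"


# Illustrative set of content types treated as script or style sheet.
SCRIPT_OR_STYLE = {"text/javascript", "application/javascript", "text/css"}


class PolicyError(Exception):
    """Raised when the kernel refuses a request."""


def get_same_origin_content(caller: Origin, url: str) -> str:
    # Any content type is allowed, but only from the caller's own origin.
    if Origin.of(url) != caller:
        raise PolicyError("URL is not same-origin with the issuing principal")
    return f"fetch {url}"


def get_cross_origin_content(kind: ProcessKind, url: str,
                             response_content_type: str) -> str:
    # Plugin instance processes may not issue this call at all.
    if kind is ProcessKind.PLUGIN_INSTANCE:
        raise PolicyError("plugin process cannot fetch cross-origin content")
    # The decision is based on the content-type header of the HTTP response.
    mime = response_content_type.split(";")[0].strip().lower()
    if mime not in SCRIPT_OR_STYLE:
        raise PolicyError("only script or style sheet content may cross origins")
    return f"fetch {url}"
```

For instance, with `a = Origin.of("http://a.com/index.html")`, a call to `get_same_origin_content(a, "http://b.com/x.png")` is refused, while `get_cross_origin_content(ProcessKind.BROWSER_INSTANCE, "http://b.com/lib.js", "text/javascript")` is allowed. Keeping both checks in one place mirrors the point made in Section 4.1 that all security decisions are made and enforced by the browser kernel alone.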

kernel is agnostic of DOM and content semantics and has a relatively simple logic.

The runtime of a principal instance performs content processing and is essentially an instance of today’s browser components including HTML and style sheet parser, JavaScript engine, layout renderer, and browser plugins. The only way for a principal instance to interact with system resources, such as networking, persistent state, and display, is to use browser kernel’s system calls. Principals can communicate with one another using message passing through the browser kernel, in the same fashion as inter-process communications (IPC).

It is necessary that the protection domain of a principal instance is a restricted or sandboxed OS process. The use of process guarantees the isolation of principals even in the face of attacks that exploit memory vulnerabilities. The process must be further restricted so that any interaction with system resources is limited to the browser kernel system calls. Native Client [47] and Xax [15] have established the feasibility of such process sandboxing.

This architecture can be efficient. By putting all browser components including plugins into one process, they can interact with one another through DOM intimately and efficiently as they do in existing browsers. This is unlike the OP browser’s approach [21] in which all browser components are separated into processes; chatty DOM interactions must be layered over IPCs through the OP browser kernel, incurring unnecessary overhead without added security.

Unlike all existing browsers except OP, this architecture can enforce browser security policies on plugins, namely, plugin content from different origins are segregated into different processes. Any plugin installed is unable to interact with the operating system and is only provided access to system resources subject to the browser kernel allowing that access. In this architecture, the payload that exploits plugin vulnerabilities will only compromise the principal with the same origin as the malicious plugin content, but not any other principals nor browser kernel.

The browser kernel supports the following system calls related to content fetching in this architecture (a more complete system call table is shown in Table 3):

• getSameOriginContent(URL): Fetch the content at URL that has the same origin as the issuing principal regardless of the content type.
• getCrossOriginContent(URL): Fetch the script or style sheet content from URL; URL may be from different origin than the issuing principal. The content type is determined by the content-type header of the HTTP response.
• delegate(URL, windowSpec): Delegate a display area to a different principal of URL and fetch the content for that principal.

The semantics of these system calls is that the browser kernel can return cross-origin script or style content to a principal based on the content-type header of the HTTP response, but returns other content if and only if the content has the same origin as the issuing principal, abiding by the same-origin policy. All the security decisions are made and enforced by the browser kernel alone.

4.2 Supporting Legacy Protection

The system call semantics in the basic architecture has one subtle issue: cross-origin script or style sheet sources are readable by the issuing principal, which does not conform with the existing SOP. The SOP dictates that a script can be executed in a cross-origin fashion, but the access to its source code is restricted to same origin only.

A key question to answer is whether a script should be processed in the protection domain of its
provider (indicated in “src”), in the same way as frames, or in the protection domain of the host page that embeds the script. To answer this question, we must examine the primary intent of the script element abstraction. Script is primarily a library abstraction (which is a necessary and useful abstraction) for web programmers to include in their sites and runs with the privilege of the includer sites [43]. This is in contrast with the frame abstractions: Programmers put content into cross-origin frames so that the content runs as the principal of its own provider and is protected from other principals. Therefore, a script should be handled by the protection domain of its includer.

In fact, it is a flaw of the existing SOP to offer protection for cross-origin script source. Evidence has shown that it is extremely dangerous to hide sensitive data inside a script [22]. Numerous browser vulnerabilities exist for failing to provide the protection.

Unfortunately, web sites that rely on cross-origin script source protection exist today. For example, GMail’s contact list is stored in a script file, at the time of writing. Furthermore, it is increasingly common for web programmers to adopt JavaScript Object Notation (JSON) [31] as the preferred data-interchange format. Web sites often demand such data to be same-origin access only. To prevent such data from being accidentally accessed through <script> (by a different origin), web programmers sometimes put “while (1);” prior to the data definition or put comments around the data so that accidental script inclusion would result in infinite loop execution or a no-op.

In light of the existing use, new browser architecture design must also offer the cross-origin script source protection. One way to do this is to strip all authentication-containing information, such as cookies and HTTP authentication headers, from the HTTP requests that retrieve cross-origin scripts so that the web servers will not supply authenticated data. The key problem with this approach is that it is not always clear what in an HTTP request may contain authentication information. For example, some cookies are used for authentication purposes and some are not. Stripping all cookies may impair functionality when some cookies are not used for authentication. In another example, a network may use IP addresses for authentication, which are impossible to strip out.

We address the cross-origin script source protection problem by modifying our architecture slightly, as shown in Figure 2. The modification is based on the following observation. Third-party plugin software vulnerabilities have surged recently [36]. Symantec reports that in 2007 alone there are 467 plugin vulnerabilities [42], which is about one order of magnitude higher than that of browser software. Clearly, plugin software should be trusted much less than browser software. Therefore, for protecting cross-origin script or style sheet source, we place more trust in the browser code and let the browser code retrieve and protect cross-origin script or style sheet sources: for each principal, we run browser code and plugin code in two separate processes. The plugin instance process cannot issue the getCrossOriginContent() and it can only interact with cross-origin scripts and style sheets through the browser instance process.

In this architecture, the quality of protecting cross-origin script and style-sheet source relies on the browser code quality. While this protection is not perfect with native browser code implementation, the architecture offers the same protection as OP, and stronger protection than the rest of existing browsers. The separation of browser code and plugin code into separate processes also improves reliability by containing plugin failures.

In recent work, Native Client [47] and Xax [15] have presented a plugin model that uses sandboxed processes to contain each browser principal’s plugin content. Their plugin model works perfectly in our browser architecture. We do not provide further discussions on plugins in our paper.

5 Cross-Principal, Cross-Process Display and Events Protection

Cross-principal service composition is a salient nature of the web and is commonly used in web applications. When building a browser as a multi-principal OS, this composition raises new challenges in display sharing and event dispatching: when a web site embeds a cross-origin frame (or objects, images), the involved principal instances share the display at the same time. Therefore, it is important that the browser kernel 1) discerns display and events ownership, 2) enforces that a principal instance can only draw in its own display areas, 3) dispatches UI events to only the principal instance with which the user is interacting. An additional challenge is that the browser kernel must accomplish these without access to any DOM semantics.

From a high level, in Gazelle principal instances are responsible for rendering content into bitmap objects, and our browser kernel manages these bitmap objects and chooses when and where to display them. Our architecture provides a clean separation between the act of rendering web content and the policies of how to display this content. This is a stark contrast to today’s browsers that intermingle these two functions, which has led to numerous security vulnerabilities [18, 44].

Our display management fundamentally differs from that of the traditional multi-user OSes, such as Unix and Windows. Traditional OSes offer no cross-principal dis-
play protection. In X, all the users who are authorized (through .Xauthority) to access the display can access one another’s display and events. Experimental OSes like EROS [41] have dealt with cross-principal display protection. However, the browser context presents new challenges that are absent in EROS, such as dual ownership of display and cross-principal transparent overlays.

5.1 Display Ownership and Access Control

We define window to be a unit of display allocation and delegation. Each window is allocated by a landlord principal instance or the browser kernel; and each window is delegated to (or rented to) a tenant principal instance. For example, when the web site a.com embeds a frame sourced at b.com, a.com allocates a window from its own display area and delegates the window to b.com; a.com is the landlord of the newly-created window, while b.com is the tenant of that window. The same kind of delegation happens when cross-origin object and image elements are embedded. The browser kernel allocates top-level windows (or tabs). When the user launches a site through address-bar entry, the browser kernel delegates the top-level window to the site, making the site a tenant. We decided against using “parent” and “child” terminologies because they only convey the window hierarchy, but not the principal instances involved. In contrast, “landlord” and “tenant” convey both semantics.

Window creation and delegation result in a delegate(URL, position, dimensions) system call. For each window, the browser kernel maintains the following state: its landlord, tenant, position, dimensions, pixels in the window, and the URL location of the window content. The browser kernel manages a three-dimensional display space where the position of a window also contains a stacking order value (toward the

                              Landlord   Tenant
position (x,y,z)              RW         -
dimensions (height, width)    RW         R
pixels                        -          RW
URL location                  W          RW

Table 1: Access control policy for a window’s landlord and tenant

• Drawing isolation: Pixels inside the window reflect the tenant’s private content and should not be accessible to the landlord. Therefore, the browser kernel enforces that only the tenant can draw within the window. (Nevertheless, a landlord can create overlapping windows delegated to different principal instances.)

• Navigation: Setting the URL location of a window navigates the window to a new site. Navigation is a fundamental element of any web application. Therefore, both the landlord and the tenant are allowed to set the URL location of the window. However, the landlord should not obtain the tenant’s navigation history that is private to the tenant. Therefore, the browser kernel prevents the landlord from reading the URL location. The tenant can read the URL location as long as it remains being the tenant. (When the window is navigated to a different principal, the old tenant will no longer be associated with the window and will not be able to access the window’s state.)

Table 1 summarizes the access control policies in the browser kernel. In existing browsers, these manipulation policies also vaguely exist. However, their logic is intermingled with the DOM logic and is implemented at the object property and method level of a number of DOM
browsing user). A landlord provides the stacking order
objects which all reside in the same protection domain
of all its delegated windows to the browser kernel. The
despite their origins. This had led to numerous vulnera-
stacking order is calculated based on the DOM hierarchy
bilities [18, 44]. In Gazelle, by separating these security
and the CSS z-index values of the windows.
policies from the DOM semantics and implementation,
Because a window is created by a landlord and occu-
and concentrating them inside the browser kernel we
pied by a tenant, the browser kernel must allow reason-
achieve more clarity in our policies and much stronger
able window interactions from both principal instances
robustness of our system construction.
without losing protection. When a landlord and its tenant
The browser kernel ensures that principal instances
are from different principals, the browser kernel provides
other than the landlord and the tenant cannot manipu-
access control as follows:
late any of the window states. This includes manipulat-
• Position and dimensions: When a landlord embeds ing the URL location for navigation. Here, we depart
a tenant’s content, the landlord should be able to re- from the existing descendant navigation policy in most
tain control on what gets displayed on the landlord’s of today’s browsers [7, 8]. Descendant navigation pol-
display and a tenant should not be able to reposition icy allows a landlord to navigate a window created by
or resize the window to interfere with the landlord’s its tenant even if the landlord and the tenant are different
display. Therefore, the browser kernel enforces that principals. This is flawed in that a tenant-created window
only the landlord of a window can change the posi- is a resource that belongs to the tenant and should not be
tion and the dimensions of a window. controllable by a different principal.
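The policy in Table 1 can be read as a lookup from a window attribute and a role to the accesses that role is granted. The following is a minimal sketch of that reading in Python, not Gazelle's actual code: the names POLICY, roles_of, and is_allowed, the dictionary representation of a window, and the use of plain strings for principals are all assumptions made for illustration.

```python
# Hypothetical encoding of Table 1: attribute -> role -> granted accesses.
# "R" = read, "W" = write, "" = no access.
POLICY = {
    "position":   {"landlord": "RW", "tenant": ""},
    "dimensions": {"landlord": "RW", "tenant": "R"},
    "pixels":     {"landlord": "",   "tenant": "RW"},
    "url":        {"landlord": "W",  "tenant": "RW"},
}


def roles_of(window, principal):
    """Return the roles that `principal` holds for `window`.

    A principal may be both landlord and tenant of the same window;
    any other principal holds no role at all.
    """
    return [role for role in ("landlord", "tenant") if window[role] == principal]


def is_allowed(window, principal, attribute, access):
    """Check whether `principal` may perform `access` ("R" or "W") on `attribute`."""
    return any(access in POLICY[attribute][role]
               for role in roles_of(window, principal))


# Illustrative window: a.com embeds a frame from b.com.
window = {"landlord": "a.com", "tenant": "b.com"}
print(is_allowed(window, "a.com", "url", "W"))     # landlord may navigate
print(is_allowed(window, "a.com", "url", "R"))     # but may not read the URL
print(is_allowed(window, "b.com", "pixels", "W"))  # tenant may draw
```

Keeping the whole policy in one table, rather than spreading checks across object properties and methods, mirrors the text's argument for concentrating these decisions in the browser kernel.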
Existing literature [7, 8] supports the descendant navigation policy with the following argument: since existing browsers allow the landlord to draw over the tenant, a landlord can simulate descendant navigation by overdrawing. Though overdrawing can visually simulate navigation, navigation is much more powerful than overdrawing because a landlord with such descendant navigation capability can interfere with the tenant’s operations. For example, a tenant may have a script interacting with one of its windows and then effecting changes to the tenant’s backend; navigating the tenant’s window requires just one line of JavaScript and could effect undesirable changes in the tenant’s backend. With overdrawing, a landlord can imitate a tenant’s content, but the landlord cannot send messages to the tenant’s backend in the name of the tenant.

5.2 Cross-Principal Events Protection

The browser kernel captures all events in the system and must accurately dispatch them to the right principal instance to achieve cross-principal event protection. Networking and persistent-state events are easy to dispatch. However, user interface events pose interesting challenges to the browser kernel in discerning event ownership, especially when dealing with overlapping, potentially transparent cross-origin windows: major browsers allow web pages to mix content from different origins along the z-axis where content can be occluded, either partially or completely, by cross-origin content. In addition, current standards allow web pages to make a frame or portions of their windows transparent, further blurring the lines between principals. Although these flexible mechanisms have a slew of legitimate uses, they can be used to fool users into thinking they are interacting with content from one origin when they are in fact interacting with content from a different origin. Zalewski [48] gave a taxonomy of “UI redressing” or clickjacking attacks, which illustrated some of the difficulties with current standards and how attackers can abuse these mechanisms.

To achieve cross-principal events protection, the browser kernel needs to determine the event owner, the principal instance to which the event is dispatched. There are two types of events for the currently active tab: stateless and stateful. The owner of a stateless event like a mouse event is the tenant of the window (or display area) on which the event takes place. The owner of a stateful event such as a key-press event is the tenant of the current in-focus window. The browser kernel interprets mouse clicks as focus-setting events and keeps track of the current in-focus window and its principal instance.

The key problem to solve then is to determine the window on which a stateless or focus-setting event takes place. We consider a determination to have high fidelity if the determined event owner corresponds to the user intent. Different window layout policies directly affect the fidelity of this determination. We elaborate on our explorations of three layout policies and their implications on fidelity.

Existing browsers’ policy. The layout policy in existing browsers is to draw windows according to the DOM hierarchy and the z-index values of the windows. Existing browsers then associate a stateless or focus-setting event with the window that has the highest stacking order. Today, most browsers permit page authors to set transparency on cross-origin windows [48]. This ability can result in poor fidelity in determining the event owner in the face of cross-principal transparent overlays. When there are transparent, cross-origin windows overlapping with one another, it is impossible for the browser kernel to interpret the user’s intent: the user is guided by what she sees on the screen; when two windows present a mixed view, some user interfaces visible to the user belong to one window, and yet some belong to another. The ability to overlay transparent cross-origin content can be extremely dangerous: a malicious site can make an iframe sourced at a legitimate site transparent and overlay it on top of the malicious site [48], fooling users into interacting with the legitimate site unintentionally.

2-D display delegation policy. This is a new layout policy that we have explored. In this policy, the display is managed as a two-dimensional space for the purpose of delegation. Once a landlord delegates a rectangular area to a tenant, the landlord cannot overdraw the area. Thus, no cross-principal content can be overlaid. Such a layout constraint enables perfect fidelity in determining an event ownership that corresponds to the user intent. It also yields better security, as it can prevent all UI redressing attacks except clickjacking [48]. Even clickjacking would be extremely difficult to launch with this policy on our system, since our cross-principal memory protection makes reading and writing the scrolling state of a window an exclusive right of the tenant of the window.

However, this policy can have a significant impact on backward compatibility. For example, a menu from a host page cannot be drawn over a nested cross-origin frame or object; many sites would face significant constraints with their own DOM-based pop-up windows created with divs and such (rather than using window.open or alert), which could overlay cross-origin frames or objects under existing browsers’ policy; and a cross-origin image cannot be used as a site’s background.

Opaque overlay policy. This policy retains existing browsers’ display management and layout policies as much as possible for backward compatibility (and additionally provides cross-principal events protection), but lets the browser kernel enforce the following layout invariant or constraint: for any two dynamic content-
containing windows (e.g., frames, objects) win1 and win2, win1 can overlay on win2 iff (Tenant_win1 == Tenant_win2) || (Tenant_win1 != Tenant_win2 && win1 is opaque). This policy effectively constrains a pixel to be associated with just one principal, making event owner determination trivial. This is in contrast with the existing browsers’ policy, where a pixel may be associated with more than one principal when there are transparent cross-principal overlays. This policy allows same-origin windows to transparently overlay one another. It also allows a page to use a cross-origin image (which is static content) as its background. Note that no principal instance other than the tenant of the window can set the background of a window, due to our memory protection across principal instances. So, it is impossible for a principal to fool the user by setting another principal’s background. The browser kernel associates a stateless event or a focus-setting event with the dynamic content-containing window that has the highest stacking order.

This policy eliminates the attack vector of overlaying a transparent victim page over an attacker page. However, by allowing overlapping opaque cross-principal frames or objects, it allows not only legitimate uses, such as those denied by the 2-D display delegation policy, but also allows an attacker page to cover up and expose selective areas of a nested cross-origin victim frame or object. The latter scenario can result in infidelity. We leave as future work the mitigation of such infidelity by determining how much of a principal’s content is exposed in an undisturbed fashion to the user when the user clicks on the page.

We implemented the opaque overlay policy in our prototype.

6 Security Analysis

In Gazelle, the trusted computing base encompasses the browser kernel and the underlying OS. If the browser kernel is compromised, the entire browser is compromised. If the underlying OS is compromised, the entire host system is compromised. If the DNS is compromised, all the non-HTTPS principals can be compromised. When the browser kernel, DNS, and the OS are intact, our architecture guarantees that the compromise of a principal instance does not give it any capabilities in addition to those already granted to it through the browser kernel system call interface (Section 4).

Next, we analyze Gazelle’s security over classes of browser vulnerabilities. We also make a comparison with popular browsers through a study of their past, known vulnerabilities.

• Cross-origin vulnerabilities:
  By separating principals into different protection domains and making any sharing explicit, we can much more easily eliminate cross-origin vulnerabilities. The only logic for which we need to ensure correctness is the origin determination in the browser kernel.

  This is unlike existing browsers, where origin validations and SOP enforcement are spread through the browser code base [10], and content from different principals coexists in shared memory. All of the cross-origin vulnerabilities illustrated in Chen et al. [10] simply do not exist in our system; no special logic is required to prevent them because all of those vulnerabilities exploit implicit sharing.

  Cross-origin script source can still be leaked in our architecture if a site can compromise its browser instance. Nevertheless, only that site’s browser instance is compromised, while other principals are intact, unlike all existing browsers except OP.

• Display vulnerabilities:
  The display is also a resource that Gazelle’s browser kernel protects across principals, unlike existing browsers (Section 5). Cross-principal display and events protection and access control are enforced in the browser kernel. This prevents a potentially compromised principal from hijacking the display and events that belong to another principal. Display hijacking vulnerabilities have manifested themselves in existing browsers [17, 26] that allow an attacker site to control another site’s window content.

• Plugin vulnerabilities:
  Third-party plugins have emerged as a significant source of vulnerabilities [36]. Unlike existing browsers, Gazelle’s design requires plugins to interact with system resources only by means of browser kernel system calls so that they are subject to our browser’s security policy. Plugins are contained inside sandboxed processes so that basic browser code doesn’t share fate with plugin code (Section 4). A compromised plugin affects the principal instance’s plugin process only, and not other principal instances nor the rest of the system. In contrast, in existing browsers except OP, a compromised plugin undermines the entire browser and often the host system as well.

  A DNS rebinding attack results in the browser labeling resources from different network hosts with a common origin. This allows an attacker to operate within SOP and access unauthorized resources [30]. Although Gazelle does not fundamentally address this vulnerability, the fact that plugins must interact with the network through browser kernel system
  calls defeats the multi-pin form of such attacks.

We analyzed the known vulnerabilities of two major browsers, Firefox 2 [3] and IE 7 [35], from their release through November 2008, as shown in Table 2. For both browsers, memory errors are a significant source of errors. Memory-related vulnerabilities are often exploited by maliciously crafted web pages to compromise the entire browser and often the host machine. In Gazelle, although the browser kernel is implemented with managed C# code, it uses native .NET libraries, such as network and display libraries; memory errors in those libraries could still cause memory-based attacks against the browser kernel. Memory attacks in principal instances are well-contained in their respective sandboxed processes.

                          IE 7   Firefox 2
Origin validation error      6          11
Memory error                38          25
GUI logic flaw               3          13
Others                       -          28
Total                       47          77

Table 2: Vulnerability Study for IE 7 and Firefox 2

Cross-origin vulnerabilities, or origin validation errors, constitute another significant share of vulnerabilities. They result from the implicit sharing across principals in existing browsers and can be much more easily eliminated in Gazelle because cross-principal protection is exclusively handled by the browser kernel and because of Gazelle’s use of sandboxed processes.

In IE 7, there are 3 GUI logic flaws which can be exploited to spoof the contents of the address bar. In Gazelle, the address bar UI is owned and controlled by our browser kernel. We anticipate that it will be much easier to apply code contracts [6] in the browser kernel than in a monolithic browser to eliminate many such vulnerabilities.

In addition, Firefox had other errors which didn’t map into these three categories, such as JavaScript privilege escalation, URL handling errors, and parsing problems. Since Gazelle enforces security properties in the browser kernel, any errors that manifest as the result of JavaScript handling and parsing are limited in their scope of exploit to the principal instance owning the page. URL handling errors could occur in our browser kernel as well.

7 Implementation

We have built a Gazelle prototype mostly as described in Section 4. We have not yet ported an existing plugin onto our system. Our prototype runs on Windows Vista with .NET framework 3.5 [4]. We next discuss the implementation of two major components shown in Figure 2: the browser kernel and the browser instance.

Browser Kernel. The browser kernel consists of approximately 5k lines of C# code. It communicates with principal instances using system calls and upcalls, which are implemented as asynchronous XML-based messages sent over named pipes. An overview of browser kernel system calls and upcalls is presented in Table 3. System calls are performed by the browser instance or plugins and sometimes include replies. Upcalls are messages from the browser kernel to the browser instance.

Display management is implemented as described in Section 5 using .NET’s Graphics and Bitmap libraries. Each browser instance provides the browser kernel with a bitmap for each window of its rendered content using a display system call; each change in rendered content results in a subsequent display call. For each top-level browsing window (or tab), the browser kernel maintains a stacking order and uses it to compose the various bitmaps belonging to a tab into a single master bitmap, which is then attached to the tab’s PictureBox form. This straightforward display implementation has numerous optimization opportunities, many of which have been thoroughly studied [33, 38, 40], and which are not the focus of our work.

Browser instance. Instead of undertaking the significant effort of writing our own HTML parser, renderer, and JavaScript engine, we borrow these components from Internet Explorer 7 in a way that does not compromise security. Relying on IE’s Trident renderer has the big benefit of inheriting IE’s page rendering compatibility and performance. In addition, such an implementation shows that it is realistic to adapt an existing browser to use Gazelle’s secure architecture.

In our implementation, each browser instance embeds a Trident WebBrowser control wrapped with an interposition layer which enforces Gazelle’s security properties. The interposition layer uses Trident’s COM interfaces, such as IWebBrowser2 or IWebBrowserEvents2, to hook sensitive operations, such as navigation or frame creation, and convert them into system calls to the browser kernel. Likewise, the interposition layer receives the browser kernel’s upcalls, such as keyboard or mouse events, and synthesizes them in the Trident instance.

For example, suppose a user navigates to a web page a.com, which embeds a cross-principal frame b.com. First, the browser kernel will fetch a.com’s HTML content, create a new a.com process with a Trident component, and pass the HTML to Trident for rendering. During the rendering process, we intercept the frame navigation event for b.com, determine that it is cross-principal, and cancel it. The frame’s DOM element in a.com’s DOM is left intact as a placeholder, making the interpo-
sition transparent to a.com. We extract the frame’s position, dimensions, and CSS properties from this element through DOM-related COM interfaces, and send this information in a delegate system call to the browser kernel to allow the landlord a.com to “rent out” part of its display area to the tenant b.com. The browser kernel then creates a new b.com process (with a new instance of Trident), and asks it to render b.com’s frame. For any rendered display updates for either a.com or b.com, our interposition code obtains a bitmap of display content from Trident using the IViewObject interface and sends it to the browser kernel for rendering.

Type     Call Name                                     Description
syscall  getSameOriginContent(URL)                     retrieves same origin content
syscall  getCrossOriginContent(URL)                    retrieves script or css content
syscall  delegate(URL, delegatedWindowSpec)            delegates screen area to a different principal
syscall  postMessage(windowID, msg, targetOrigin)      cross-frame messaging
syscall  display(windowID, bitmap)                     sets the display buffer for the window
syscall  back()                                        steps back in the window history
syscall  forward()                                     steps forward in the window history
syscall  navigate(windowID, URL)                       navigates a window to URL
syscall  createTopLevelWindow(URL)                     creates a new browser tab for the URL specified
syscall  changeWindow(windowID, position, size)        updates the location and size of a window
syscall  writePersistentState(type, state)             allows writing to origin-partitioned storage
syscall  readPersistentState(type)                     allows reading of origin-partitioned storage
syscall  lockPersistentState(type)                     locks one type of origin-partitioned storage
upcall   destroy(windowID)                             closes a browser instance
upcall   resize(windowID, windowSpec)                  changes the dimensions of the browser instance
upcall   createPlugin(windowID, URL, content)          creates a plugin instance
upcall   createDocument(windowID, URL, content)        creates a browser instance
upcall   sendEvent(windowID, eventInfo)                passes an event to the browser instance

Table 3: Some Gazelle System Calls

One intricacy we faced was in rerouting all network requests issued by Trident instances through the browser kernel. We found interposing on all types of fetches, including frame, script, and image requests, to be very challenging with the COM hooks currently exposed by Trident. Instead, our approach relies on a local web proxy, which runs alongside the browser kernel. We configure each Trident instance to use our proxy for all network requests, and the proxy converts each request into a corresponding system call to the browser kernel, which then enforces our security policy and completes the request.

One other implementation difficulty that we encountered was properly managing the layout of cross-origin images. It is easy to render a cross-origin image in a separate process, but difficult to extract the image’s correct layout information from the host page’s Trident instance. We anticipate this to be a surmountable implementation issue. In our current prototype, we are keeping cross-origin images in the same process as their host page for proper rendering of the pages.

Our interposition layer ensures that our Trident components are never trusted with sensitive operations, such as network access or display rendering. However, if a Trident renderer is compromised, it could bypass our interposition hooks and compromise other principals using the underlying OS’s APIs. To prevent this, we are in the process of implementing an OS-level sandboxing mechanism, which would prevent Trident from directly accessing sensitive OS APIs. The feasibility of such a browser sandbox has already been established in Xax [15] and Native Client [47].

To verify that such an implementation does not cause rendering problems with popular web content, we used our prototype to manually browse through the top 20 Alexa [5] web sites. We checked the correctness of Gazelle’s visual output against unmodified Internet Explorer and briefly verified page interactivity, for example by clicking on links. We found that 19 of 20 web sites rendered correctly. The remaining web site exposed a (fixable) bug in our interposition code, which caused it to load with incorrect layout. Two sites experienced crashes (due to more bugs) when trying to render embedded cross-principal <iframe>’s hosting ads. However, the crashes only affected the <iframe> processes; the main pages rendered correctly with the exception of small blank spaces in place of the failed <iframe>’s. This illustrates a desirable security property of our architecture, which prevents malicious or misbehaving cross-origin tenants from affecting their landlords or other principals.
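The a.com/b.com walk-through in this section hinges on one decision: whether an embedded frame is cross-principal, in which case the navigation is cancelled and replaced by a delegate system call. The following is a minimal sketch of that decision in Python, assuming that principals are compared as the <protocol, domain name, port> origin triple. The function names, the default-port table, and the returned tuples are hypothetical; they only stand in for the COM-based logic of the interposition layer, which the text does not spell out.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when a URL does not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the <protocol, domain name, port> triple that labels a principal."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def on_frame_navigation(host_url, frame_url, position, dimensions):
    """Decide what to do when a page at host_url embeds a frame at frame_url.

    Same-principal frames stay in the host's browser instance. For a
    cross-principal frame, the navigation is cancelled locally and the frame's
    geometry is handed to the browser kernel in a delegate call (represented
    here as a plain tuple rather than a real message).
    """
    if origin(host_url) == origin(frame_url):
        return ("render_in_process", frame_url)
    return ("delegate", frame_url, position, dimensions)


# Illustrative use: a.com embeds an ad frame from b.com at position (x, y, z).
print(on_frame_navigation("http://a.com/", "http://b.com/ad.html",
                          (10, 20, 1), (300, 250)))
```

Note that in this sketch two URLs on the same host but with different protocols (for example http and https) count as different principals, which follows directly from comparing the full triple.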
8 Evaluation

In this section, we measure the impact of our architecture on browser performance. All tests were performed on an Intel 3.00 GHz Core 2 Duo with 4 GB of RAM, running 32-bit Windows Vista with a gigabit Ethernet connection. To evaluate Gazelle’s performance, we measured page loading latencies, the memory footprint, and responsiveness of our prototype in comparison with IE7, a monolithic browser, and Google Chrome v1, a multi-process browser. We found that while Gazelle performs on par with commercial browsers when browsing within an origin, it introduces some overhead for cross-origin navigation and rendering embedded cross-origin principals (e.g., frames). Nevertheless, our main sources of overhead stem from our interposition layer, various initialization costs for new browser instances, and the unoptimized nature of our prototype. We point out simple optimizations that would eliminate much of the overhead along the way.

Page load latency. Table 4 shows the loading times for a series of browser operations a typical user might perform using our prototype, IE7, and Google Chrome. The operations are repeated one after another within the same browser. A web page’s loading time is defined as the time between pressing the “Go” button and seeing the fully-rendered web page. All operations include network latency.

                                 Gazelle               Internet Explorer 7   Google Chrome
                                 Time      Memory Used Time      Memory Used Time      Memory Used
1. Browser startup (no page)     668 ms    9 MB        635 ms    14 MB       500 ms    25 MB
2. New tab (blank page)          602 ms    14 MB       115 ms    0.7 MB      230 ms    1.8 MB
3. New tab (google.com)          939 ms    16 MB       499 ms    1.4 MB      480 ms    7.6 MB
4. Navigate from google.com      955 ms    6 MB        1139 ms   3.1 MB      1020 ms   1.4 MB
   to google.com/ads
5. Navigate to nytimes.com       5773 ms   88 MB       3213 ms   53 MB       3520 ms   19.4 MB
   (with a cross-origin frame)

Table 4: Loading times and memory overhead for a sequence of typical browser operations.

Operation 1 measures the time to launch the browser and is similar for all three browsers. Although Gazelle’s browser kernel is small and takes only 225 ms to start, Gazelle also initializes the local proxy subsystem (see Section 7), which takes an additional 443 ms. Operations 2 and 3 each carry an overhead of creating a new process in Gazelle and Chrome, but not IE7. Operation 4 reuses the same google.com process in Gazelle to render a same-origin page to which the user navigates via a link on google.com. Here, Gazelle is slightly faster than both IE7 and Chrome, possibly because Gazelle does not yet manage state such as browsing history between navigations. Finally, operation 5 causes Gazelle to create a new process for nytimes.com to render the popular news page (footnote 3). In addition, NYTimes contains an embedded cross-principal <iframe>, which triggers window delegation and another process creation event in Gazelle. Gazelle’s overall page load latency of 5773 ms includes the rendering times of both the main page and the embedded <iframe>, with the main page becoming visible and interactive to the user in 5085 ms.

Footnote 3: In contrast, Chrome reuses the tab’s old google.com process

Compared to both IE7 and Chrome, it is expected that Gazelle will have a performance overhead due to extra process creation costs, messaging overhead, and the overhead of our Trident interposition layer as well as Trident itself. Table 5 breaks down the major sources of overhead involved in rendering the three sites in Table 4.

Our Trident interposition layer is a big source of overhead, especially for larger sites like NYTimes.com, where it consumes 813 ms. Although we plan to optimize our use of Trident’s COM interfaces, we are also limited by the Trident host’s implementation of the hooks that we rely on, and by the COM layer which exposes these hooks. Nevertheless, we believe we could mitigate most of this latency if Trident were to provide us with a direct (non-COM) implementation for the small subset of its hooks that Gazelle requires.

Our local proxy implementation for network interposition constitutes another large source of overhead, for example 541 ms for NYTimes.com. Much of this overhead would disappear if Trident were to make direct network system calls to the browser kernel, rather than going through an extra proxy indirection. Another part of this overhead stems from the fact that the browser kernel currently releases web page data only when a whole network transfer finishes; instead, it could provide browser instances with chunks of data as soon as they arrive (e.g., by changing getContent system calls to the semantics of a UNIX read() system call), allowing them to better overlap network transfers with rendering.

Process creation is an expected source of overhead that
increases whenever sites embed cross-principal content, such as NYTimes’s cross-origin <iframe>. In addition, each process must instantiate and initialize a new Trident object, which is expensive. As an optimization, we could use a worker pool of a few processes that have been pre-initialized with Trident. This would save us 275 ms on NYTimes’s load time and 134 ms on google.com’s load time.

We encountered an unexpected performance hit when initializing the named pipes that we use to transfer system calls: a new process’s first write to a pipe stalls for a considerable time. This could be caused by initialization of an Interop layer between .NET and the native Win32 pipe interfaces, on which our implementation relies. We can avoid this overhead by either using an alternate implementation of a system call transfer mechanism, or pre-initializing named pipes in our worker pool. This would save us 439 ms in NYTimes’s render time.

Retrieving bitmap display updates from Trident and sending them to the browser kernel is expensive for large, complex sites such as NYTimes.com, where this takes 422 ms. Numerous optimizations are possible, including image compression, VNC-like selective transfers, and a more efficient bitmap sharing channel between Trident and the browser kernel. Our mechanism for transferring bitmap updates currently performs an inefficient .NET-based serialization of the image’s data (which takes 176 ms for NYTimes); passing this data directly would further improve performance.

Overall, we believe that with the above optimizations, Gazelle’s performance would be on par with production browsers like Chrome or IE8; for example, we anticipate that NYTimes.com could be rendered in about 3.6 s.

Memory overhead. As a baseline measurement, the browser kernel occupies around 9 MB of memory after a page load. This includes the user interface components of the browser to present the rendered page to the user and the buffers allocated for displaying the rendered page. Memory measurements do not include shared libraries used by multiple processes.

Table 4 shows the amount of memory for performing various browsing operations. For example, to open a new tab to a blank page, Gazelle consumes 14 MB, and to open a new tab for google.com, Gazelle consumes an additional 16 MB. Each empty browser instance uses

browser kernel detects a user event, it issues a sendEvent upcall to the destination principal’s browser instance. Such calls take only 2 ms on average to transfer, plus 1 ms to synthesize in Trident. User actions might lead to display updates; for example, a display update for google.com would incur an additional 77 ms. Most users should not perceive this overhead and will experience good responsiveness.

Process creation. In addition to latency and memory measurements, we also tested our prototype on the top 100 popular sites reported by Alexa [5] to provide an estimate of the number of processes created for different sites. Here, we place a cross-origin image into a separate process to evaluate our design. The number of processes created is determined by the use of different-origin content on sites, which is most commonly image content. For the top 100 sites, the median number of processes required to view a single page is 4, the minimum is 1, and the maximum is 28 (caused by skyrock.com, which uses an image farm). Although creation of many processes introduces additional latency and memory footprint, we did not experience difficulties when Gazelle created many processes during normal browsing. Our test machine easily handles a hundred running processes, which is enough to keep 25 average web sites open simultaneously.

9 Discussions on compatibility vs. security

While Gazelle’s architecture can be made fully backward compatible with today’s web, it is interesting to investigate the compatibility cost of eliminating the insecure policies in today’s browsers. We have considered several policies that differ from today’s browsers but offer better security. We conducted a preliminary study on their compatibility cost. This is by no means a conclusive or complete study, but only a first look at the topic.

We mostly used the data set of the front pages of the top 100 most popular web sites ranked by Alexa [5]. We used a combination of browser instrumentation with automatic script execution and manual inspection in our study. We consider any visual differences in the rendering of a web page to be a violation of compatibility. We discuss our findings below.

Subdomain treatment. Existing browsers and SOP
1.5MB of internal storage plus the memory required for make exceptions for subdomains (e.g., news.google.com
rendered content. Given our implementation, the lat- is a subdomain of google.com) [39]: a page can set
ter closely corresponds to Trident’s memory footprint, the document.domain property to suffixes of its do-
which at the minimum consists of 14MB for a blank main and assume that identity. This feature was one
page. In the case of NYTimes, our memory footprint of the few methods for cross-origin frames to communi-
further increases because of structures allocated by the cate before the advent of postMessage [25]. Changing
interposition layer, such as a local DOM cache. document.domain is a dangerous practice and violates
Responsiveness. We evaluated the response time of a the Principle of Least Privilege: Once a subdomain sets
user-generated event, such as a mouse click. When the its domain to a suffix, it has no control over which other
                                                                            Latency
Location          Overhead                                     blank site   google.com  nytimes.com
Overhead before rendering
Browser kernel    process creation                                  44 ms        40 ms        78 ms
Browser instance  creating interposed instances of Trident          94 ms        94 ms       197 ms
Browser instance  named pipe initialization                        137 ms       145 ms       439 ms
Overhead during rendering
Browser instance  proxy-based network interposition                  4 ms       134 ms       541 ms
Browser instance  other Trident interposition                      127 ms       122 ms       813 ms
Overhead after rendering
Browser instance  bitmap capture                                    13 ms        35 ms       196 ms
Browser instance  bitmap transfer                                   37 ms        67 ms       226 ms
Browser kernel    display rendering                                 10 ms        11 ms       101 ms

Table 5: A breakdown of Gazelle’s overheads involved in page rendering. Note that nytimes.com creates two processes
for itself and an <iframe>; the other two sites create one process.
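As a rough cross-check of Table 5, the per-phase and per-site sums can be computed with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the function names `phase_totals` and `site_totals` are our own, and the sums assume that the rows are additive, which the table itself does not state (nytimes.com, for instance, involves two processes).

```python
# Latencies in milliseconds, transcribed from Table 5.
# Each tuple is (blank site, google.com, nytimes.com).
TABLE5_MS = {
    "before rendering": {
        "process creation": (44, 40, 78),
        "creating interposed instances of Trident": (94, 94, 197),
        "named pipe initialization": (137, 145, 439),
    },
    "during rendering": {
        "proxy-based network interposition": (4, 134, 541),
        "other Trident interposition": (127, 122, 813),
    },
    "after rendering": {
        "bitmap capture": (13, 35, 196),
        "bitmap transfer": (37, 67, 226),
        "display rendering": (10, 11, 101),
    },
}
SITES = ("blank site", "google.com", "nytimes.com")


def phase_totals(table):
    """Sum the rows of each phase per site (hypothetical helper)."""
    return {
        phase: tuple(sum(col) for col in zip(*rows.values()))
        for phase, rows in table.items()
    }


def site_totals(table):
    """Sum all rows per site across the three phases (hypothetical helper)."""
    return tuple(sum(col) for col in zip(*phase_totals(table).values()))


if __name__ == "__main__":
    for phase, totals in phase_totals(TABLE5_MS).items():
        print(phase, dict(zip(SITES, totals)))
    print("all phases", dict(zip(SITES, site_totals(TABLE5_MS))))
```

Under the additivity assumption, the rows sum to 466 ms, 648 ms, and 2591 ms for the blank site, google.com, and nytimes.com respectively. The savings quoted in the text match individual rows: the worker-pool saving of 275 ms for NYTimes is process creation plus Trident instance creation (78 + 197), the 134 ms for google.com is 40 + 94, and the 422 ms bitmap cost for NYTimes is capture plus transfer (196 + 226).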

subdomains can access it. This is also observed by Zalewski [48]. Therefore, it would be more secure not to allow a subdomain to set document.domain.

Our experiments indicate that six of the top 100 Alexa sites set document.domain to a different origin, though restricting write access to document.domain might not actually break the operation of these web sites.

Mixed HTTPS and HTTP Content. When an HTTPS site embeds HTTP content, browsers typically warn users about the mixed content, since the HTTPS site’s content can resist a network attacker, but the embedded HTTP content could be compromised by a network attacker.

When an HTTPS site embeds other HTTP principals (through <iframe>, <object>, etc.), HTTPS principals and HTTP principals will have different protection domains and will not interfere with each other.

However, when an HTTPS site embeds a script or style sheet delivered with HTTP, existing browsers would allow the script to run with the HTTPS site’s privileges (after the user ignores the mixed content warning). This is dangerous because a network attacker can then compromise the HTTP-transmitted script and attack the HTTPS principal despite its intent of preventing network attackers. Therefore, a more secure policy is to deny rendering of HTTP-transmitted scripts or style sheets for an HTTPS principal. Instead of the Alexa top 100, we identified a few different sites that provide SSL sessions for parts of their web application: amazon.com, mail.google.com, mail.microsoft.com, blogger.com, and a few popular banking sites where we have existing accounts. This allows us to complete the login process during testing. These sites do not violate this policy. In addition, we have also gathered data from one of the authors’ browsing sessions over the course of a few months and found that out of 5,500 unique SSL URLs seen, fewer than two percent include HTTP scripts and CSS.

Layout policies. The opaque overlay policy allows only opaque (and not transparent) cross-origin frames or objects (Section 5.2). We test this policy with the top 100 Alexa sites by determining if any cross-origin frames or objects are overlapped with one another. We found that two out of 100 sites attempt to violate this policy. This policy does not generate rendering errors; instead, we convert transparent cross-origin elements to opaque elements when displaying content.

We also tested the 2D display delegation policy that we analyzed in Section 5.2. We found this policy to have higher compatibility cost than our opaque overlay policy: six of the top 100 sites attempt to violate this policy.

Sites that attempt to violate either policy have reduced functionality, and will render differently than what the web page author intends.

Plugins. Existing plugin software must be adapted (ported or binary-rewritten) to use browser kernel system calls to accomplish its tasks. Of the top 100 Alexa sites, 34 sites use Flash, but no sites use any other kinds of plugins. This indicates that porting or adapting Flash alone can address a significant portion of the plugin compatibility issue.

10 Concluding Remarks

We have presented Gazelle, the first web browser that qualifies as a multi-principal OS for web site principals. This is because Gazelle’s browser kernel exclusively manages resource protection, unlike all existing browsers which allow cross-principal protection logic to reside in the principal space. Gazelle enjoys the security and robustness benefit of a multi-principal OS: a compromise or failure of one principal leaves other principals and the browser kernel intact.
Our browser construction exposes challenging design issues that were not seen in previous work, such as providing legacy protection to cross-origin script source and cross-principal, cross-process display and event protection. We are the first to provide comprehensive solutions to them.

The implementation and evaluation of our IE-based prototype shows promise of a practical multi-principal OS-based browser in the real world.

In our future work, we are exploring the fair sharing of resources among web site principals in our browser kernel and a more in-depth study of the tradeoffs between compatibility and security in browser policy design.

11 Acknowledgements

We thank Spencer Low, David Ross, and Zhenbin Xu for giving us constant help and fruitful discussions. We thank Adam Barth and Charlie Reis for their detailed and insightful feedback on our paper. We also thank the following folks for their help: Barry Bond, Jeremy Condit, Rich Draves, David Driver, Jeremy Elson, Xiaofeng Fan, Manuel Fandrich, Cedric Fournet, Chris Hawblitzel, Jon Howell, Galen Hunt, Eric Lawrence, Jay Lorch, Rico Malvar, Wolfram Schulte, David Wagner, Chris Wilson, and Brian Zill. We also thank our paper shepherd Niels Provos for his feedback over our last revisions.

References

[1] Changes in allowScriptAccess default (Flash Player). http://www.adobe.com/go/kb403183.
[2] Developer center: Security changes in Flash Player 7. http://www.adobe.com/devnet/flash/articles/fplayer_security.html.
[3] Security advisories for Firefox 2.0. http://www.mozilla.org/security/known-vulnerabilities/firefox20.html.
[4] .NET Framework Developer Center, 2008. http://msdn.microsoft.com/en-us/netframework/default.aspx.
[5] Alexa, 2009. http://www.alexa.com/.
[6] M. Barnett, K. Rustan, M. Leino, and W. Schulte. The Spec# programming system: An overview. In LNCS, editor, CASSIS, volume 3362. Springer, 2004. http://research.microsoft.com/en-us/projects/specsharp/.
[7] A. Barth and C. Jackson. Protecting browsers from frame hijacking attacks, April 2008. http://crypto.stanford.edu/websec/frames/navigation/.
[8] A. Barth, C. Jackson, and J. C. Mitchell. Securing frame communication in browsers. In Proceedings of the 17th USENIX Security Symposium (USENIX Security), 2008.
[9] A. Barth, C. Jackson, C. Reis, and T. G. C. Team. The security architecture of the Chromium browser, 2008. http://crypto.stanford.edu/websec/chromium/chromium-security-architecture.pdf.
[10] S. Chen, D. Ross, and Y.-M. Wang. An Analysis of Browser Domain-Isolation Bugs and A Light-Weight Transparent Defense Mechanism. In Proceedings of the ACM Conference on Computer and Communications Security, 2007.
[11] R. S. Cox, J. G. Hansen, S. D. Gribble, and H. M. Levy. A Safety-Oriented Platform for Web Applications. In Proceedings of the IEEE Symposium on Security and Privacy, 2006.
[12] D. Crockford. JSONRequest. http://www.json.org/jsonrequest.html.
[13] D. Crockford. The Module Tag: A Proposed Solution to the Mashup Security Problem. http://www.json.org/module.html.
[14] Document Object Model. http://www.w3.org/DOM/.
[15] J. R. Douceur, J. Elson, J. Howell, and J. R. Lorch. Leveraging legacy code to deploy desktop applications on the web. In Proceedings of the Symposium on Operating Systems Design and Implementation, 2008.
[16] Firefox 3 for developers, 2008. https://developer.mozilla.org/en/Firefox_3_for_developers.
[17] Mozilla Browser and Mozilla Firefox Remote Window Hijacking Vulnerability, 2004. http://www.securityfocus.com/bid/11854/.
[18] Security Advisories for Firefox 2.0. http://www.mozilla.org/security/known-vulnerabilities/firefox20.html.
[19] D. Flanagan. JavaScript: The Definitive Guide. O’Reilly Media Inc., August 2006.
[20] Adobe Flash Player 9 Security, July 2008. http://www.adobe.com/devnet/flashplayer/articles/flash_player_9_security.pdf.
[21] C. Grier, S. Tang, and S. T. King. Secure web browsing with the OP web browser. In Proceedings of the 2008 IEEE Symposium on Security and Privacy, 2008.
[22] J. Grossman. Advanced Web Attack Techniques using GMail. http://jeremiahgrossman.blogspot.com/2006/01/advanced-web-attack-techniques-using.html.
[23] W. H. A. T. W. Group. Web Applications 1.0, February 2007. http://www.whatwg.org/specs/web-apps/current-work/.
[24] HTML 5 Editor’s Draft, October 2008. http://www.w3.org/html/wg/html5/.
[25] What’s New in Internet Explorer 8, 2008. http://msdn.microsoft.com/en-us/library/cc288472.aspx.
[26] Microsoft Internet Explorer Remote Window Hijacking Vulnerability, 2004. http://www.securityfocus.com/bid/11855.
[27] S. Ioannidis and S. M. Bellovin. Building a secure web browser. In Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference, 2001.
[28] S. Ioannidis, S. M. Bellovin, and J. M. Smith. Sub-operating systems: a new approach to application security. In Proceedings of the 10th workshop on ACM SIGOPS European workshop, pages 108–115, New York, NY, USA, 2002. ACM.
[29] C. Jackson and A. Barth. Beware of Finer-Grained Origins. In Web 2.0 Security and Privacy, May 2008.
[30] C. Jackson, A. Barth, A. Bortz, W. Shao, and D. Boneh. Protecting Browsers from DNS Rebinding Attacks. In Proceedings of ACM Conference on Computer and Communications Security, 2007.
[31] JavaScript Object Notation (JSON). http://www.json.org/.
[32] D. Kristol and L. Montulli. HTTP State Management Mechanism. IETF RFC 2965, October 2000.
[33] T. W. Mathers and S. P. Genoway. Windows NT Thin Client Solutions: Implementing Terminal Server and Citrix MetaFrame. Macmillan Technical Publishing, Indianapolis, IN, November 1998.
[34] IEBlog: IE8 Security Part V: Comprehensive Protection, 2008. http://blogs.msdn.com/ie/archive/2008/07/02/ie8-security-part-v-comprehensive-protection.aspx.
[35] Microsoft security bulletin. http://www.microsoft.com/technet/security/.
[36] Microsoft Security Intelligence Report, Volume 5, 2008. http://www.microsoft.com/security/portal/sir.aspx.
[37] C. Reis and S. D. Gribble. Isolating web programs in modern browser architectures. In Proceedings of Eurosys, 2009.
[38] T. Richardson, Q. Stafford-Fraser, K. R. Wood, and A. Hopper. Virtual network computing. IEEE Internet Computing, 2(1):33–38, 1998.
[39] J. Ruderman. The Same Origin Policy. http://www.mozilla.org/projects/security/components/same-origin.html.
[40] R. W. Scheifler and J. Gettys. The X window system. ACM Transactions on Graphics (TOG), 5(2):79–109, April 1986.
[41] J. S. Shapiro, J. Vanderburgh, E. Northup, and D. Chizmadia. Design of the EROS TrustedWindow system. In Usenix Security, 2004.
[42] Symantec Global Internet Security Threat Report: Trends for July - December 07, April 2008.
[43] H. J. Wang, X. Fan, J. Howell, and C. Jackson. Protection and Communication Abstractions in MashupOS. In ACM Symposium on Operating System Principles, October 2007.
[44] Cross-Domain Vulnerability In Microsoft Internet Explorer 6. http://cyberinsecure.com/cross-domain-vulnerability-in-microsoft-internet-explorer-6/.
[45] The XMLHttpRequest Object. http://www.w3.org/TR/XMLHttpRequest/.
[46] W3C XMLHttpRequest Level 2. http://dev.w3.org/2006/webapi/XMLHttpRequest-2/.
[47] B. Yee, D. Sehr, G. Dardyk, B. Chen, R. Muth, T. Ormandy, S. Okasaka, N. Narula, and N. Fullagar. Native client: A sandbox for portable, untrusted x86 native code. In Proceedings of the IEEE Symposium on Security and Privacy, May 2009.
[48] M. Zalewski. Browser security handbook, 2008. http://code.google.com/p/browsersec/wiki/Main.