You are on page 1of 11

contributed articles

DOI:10.1145/ 3338843
By using formality early in develop-
Exploiting a simple, expressive logic based ment, you can minimize the costs of
ambiguity and get feedback on your
on relations to describe designs and automate work by running analyses. The most
their analysis. popular approach to advocate this is
agile development, in which the formal
BY DANIEL JACKSON representation is code in a traditional
programming language and the analy-

Alloy:
sis is conventional unit testing.
As a language for exploring designs,
however, code is imperfect. It’s verbose
and often indirect, and it does not al-

A Language and
low partial descriptions in which some
details are left to be resolved later. And
testing, as a way to analyze designs,

Tool for Exploring


leaves much to be desired. It’s notori-
ously incomplete and burdensome,
since you need to write test cases ex-
plicitly. And it’s very difficult to use

Software Designs
code to articulate design without get-
ting mired in low-level details (such as
the choice of data representations).
An alternative, which has been ex-
plored since the 1970s, is to use a de-
sign language built not on convention-
al machine instructions but on logic.
Partiality is free because rather than
listing each step of a computation, you
write a logical constraint saying what’s
ALLOY IS A language and a toolkit for exploring the kinds true after, and that constraint can say
as little or as much as you please. To
of structures that arise in many software designs. This analyze such a language, you use spe-
article aims to give readers a flavor of Alloy in action, and cialized algorithms such as model
checkers or satisfiability solvers (more
some examples of its applications to date, thus giving a on these later). This usually requires
sense of how it can be used in software design work. much less effort than testing, since you
Software involves structures of many sorts:
key insights
architectures, database schemas, network topologies,
ontologies, and so on. When designing a software ˽˽ Using a simple logic of relations, Alloy
lets you model software designs that
system, you need to be able to express the structures involve complex, evolving structures.

essential to the design and to check that they have the ˽˽ Alloy’s tool uses SAT technology to
simulate designs and find subtle flaws,
properties you expect. and has been used in a wide variety
of applications from networking and
You can express a structure by sketching it on a security to critical systems.
napkin. That’s a good start, but it’s limited. Informal ˽˽ A key advantage of logical modeling is that
you can construct a design incrementally,
representations give inconsistent interpretations, and in an agile way, representing and
they cannot be analyzed mechanically. So people have analyzing only an essential subset of
the behavioral contraints.
turned to formal notations that define structure and ˽˽ Alloy is complementary to a class
behavior precisely and objectively, and that can exploit of tools called model checkers, and
is a valuable addition to the software
the power of computation. designer’s toolkit.

66 COMMUNICATIO NS O F TH E AC M | S EPTEM BER 201 9 | VO L . 62 | N O. 9


only need to express the property you tools that are smaller and simpler than exploring the space of possible states
want to check rather than a large col- the tool that finds the proof, you can be of a system, and if that space is large,
lection of test cases. And the analysis is confident the analysis is sound. they may require considerable compu-
much more complete than testing, be- On the down side, the combination tational resources (or may fail to termi-
cause it effectively covers all (or almost of an expressive logic and sound proof nate). The so-called “state explosion”
all) test cases that you could have writ- means that finding proofs cannot gen- problem arises because model check-
ten by hand. erally be automated. Theorem provers ers are often used to analyze designs
usually require considerable effort and involving components that run in par-
What Came Before: Theorem expertise from the user, often orders allel, resulting in an overall state space
Provers and Model Checkers of magnitude greater than the effort that grows exponentially with the num-
To understand Alloy, it helps to know of constructing a formal design in the ber of components.
a bit about the context in which it was first place. Moreover, failure to find a Alloy was inspired by the successes
developed and the tools that existed at proof does not mean that a proof does and limitations of model checkers.
the time. not exist, and theorem provers don’t For designs involving parallelism and
Theorem provers are mechanical aids provide counterexamples that explain simple state (comprising Boolean
for constructing mathematical proofs. concretely why a theorem is not valid. variables, bounded integers, enumer-
To apply a theorem prover to a software So theorem provers are not so useful ations and fixed-size arrays), model
design problem, you formulate some when the intended property does not checkers were ideal. They could eas-
intended property of the design, and hold—which unfortunately is the com- ily find subtle synchronization bugs
then attempt to prove the theorem that mon case in design work. that appeared only in rare scenarios
the property follows from the design. Model checkers revolutionized de- that involved long traces with mul-
Theorem provers tend to provide very sign analysis by providing exactly the tiple context switches, and therefore
rich logics, so they can usually express features theorem provers lacked. They eluded testing.
any property the designer might want, offer push-button automation, requir- For hardware designs, model check-
at least about states and state transi- ing the user to give only the design and ers were often a good match. But for
tions—more dynamic properties can property to be checked. They allow software designs they were less ideal.
IMAGE BY IDEA ST UDIO

require a temporal logic that theorem dynamic properties to be expressed Although some software design
provers don’t typically support directly. (through temporal logics), and gener- problems involve this kind of synchro-
Also, because they generate mathemat- ate counterexamples when properties nization, often the complexity arises
ical proofs, which can be checked by do not hold. Model checkers work by from the structure of the state itself. Early

SE PT E MB E R 2 0 1 9 | VO L. 6 2 | N O. 9 | C OM M U N IC AT ION S OF T HE ACM 67
contributed articles

model checkers (such as SMV9) had limit- sets of sets and relations over sets. This
ed expressiveness in this regard, and did changes how designs are modeled, but
not support rich structures such as trees, not what can be modeled; after all, re-
lists, tables, and graphs. lational databases have flourished de-
Explicit state model checkers, such
as SPIN,14 and later Java Pathfinder,37 Alloy was inspired spite being first order.
Taking advantage of this restriction,
allowed designs with rich state to be
modeled, but, despite providing sup-
by the successes Alloy’s operators are defined in a very
general way so that most expressions
port for temporal properties, gave little and limitations of can be written with just a few opera-
help for expressing structural ones. To
express reachability (for example, that
model checkers. tors. The key operator is relational join,
which in conventional mathematics
two social media users are connected only applies to binary relations, but in
by some path of friend edges), you Alloy works on relations of any arity. By
would typically need to code an explicit using a dot to represent the join opera-
search, which would have to be execut- tor, Alloy lets you write dereferencing
ed at every point at which the property expressions as you would in an object-
was needed. Also, explicit state model oriented programming language, but
checkers have limited support for par- gives these expressions a simple math-
tiality (since the model checker would ematical interpretation. So, as in Java,
have to conduct a costly search through given an employee e, a relation dept
possible next states to find one satisfy- that maps employees to departments,
ing the constraints). and a relation manager that maps de-
Particularly difficult for all model partments to their managers, e.dept.
checkers are the kinds of designs that manager would give the manager of
involve a configuration of elements in a e’s department. But unlike in Java, the
graph or tree structure. Many network expression will also work if e is a set of
protocols are designed to work irre- employees, or if dept can map an em-
spective of the initial configuration (or ployee to multiple departments, giving
of the configuration as it evolves), and the expected result—the set of manag-
exposing a flaw often involves not only ers of the set of departments that the
finding a behavior that breaks a prop- employees e belong to. The expression
erty but also finding a configuration in dept.manager is well defined too, mean-
which to execute it. ing the relation that maps employees
Even the few model checkers that to their managers. You can also navi-
can express rich structures are gener- gate backward—writing manager.m for
ally not up to this task. Enumerating the department(s) that e manages.
possible configurations is not feasi- (A note for readers interested in
ble, because the number of configu- language design: This flexibility is
rations grows super-exponentially: if achieved by treating all values as rela-
there are n nodes, there are 2n×n ways tions—a set being a relation with one
to connect them. column, and a scalar being a set with
one element—and defining a join op-
Alloy’s Innovations erator that applies uniformly over a
Alloy offered a new kind of design lan- pair of relations, irrespective of their
guage and analysis that was made pos- arity. In contrast, other languages tend
sible by three innovations: to have multiple operators, implicit
Relational logic. Alloy uses the same coercions, or overloading to accommo-
logic for describing designs and prop- date variants that Alloy unifies.)
erties. This logic combines the for-all Alloy was influenced also by model-
and exists-some quantifiers of first- ing languages such as UML. Like the
order logic with the operators of set class diagrams of UML, Alloy makes it
theory and relational calculus. easy to describe a universe of objects as
The idea of modeling software de- a classification tree, with each relation
signs with sets and relations had been defined over nodes in this tree. Alloy’s
pioneered in the Z language.32 Alloy dot operator was inspired in part by the
incorporated much of the power of Z, navigational expressions of the Object
while simplifying the logic to make it Constraint Language39 (OCL) of UML,
more tractable. but by defining the dot as relational
Alloy allows only first-order struc- join, Alloy dramatically simplifies the
tures, thus ruling out, for example, semantics of navigation.

68 COMMUNICATIO NS O F TH E AC M | S EPTEM BER 201 9 | VO L . 62 | N O. 9


contributed articles

Small scope analysis. Even plain sible values. A very small design might exposed several serious flaws in brows-
first-order logic (without relational op- have five such relations, giving (225)5 er security.1 While it cuts corners and
erators) is not decidable. This means possible states—about 1037 states. is unrealistic in some respects, it does
that no algorithm can exist that could Even checking a billion cases per sec- capture the spirit and style of the origi-
analyze a software design written com- ond, such an analysis would take many nal model, and is fairly representative of
pletely in a language like Alloy. So some- times the age of the universe. how Alloy is often used.
thing has to give. You could make the Alloy therefore does not perform an For those unfamiliar with browser
language decidable, but that would explicit search, but instead translates security, cross-site request forgery
cripple its expressive power and make the design problem to a satisfiability (CSRF) is a pernicious and subtle at-
it unable to express even the most basic problem whose variables are not rela- tack in which a malicious script run-
properties of structures (although excit- tions but simple bits. By flipping bits ning in a page the user has loaded
ing progress has been made recently in individually, a satisfiability (SAT) solver makes a hidden and unwanted re-
applying decidable fragments of first- can usually find a solution (if there is quest to a website for which the user
order logic to certain problems29). You one) or show that none exists by exam- is already authenticated. This may
could give up on automation, and re- ining only a tiny portion of the space. happen either because the user was
quire help from the user, but this elimi- Alloy’s analysis tool is essentially a enticed to load a page from a mali-
nates most of the benefit of an analysis compiler to SAT, which allows it to ex- cious server, or because a supposedly
tool—analysis is no longer a reward for ploit the latest advances in SAT solvers. safe server was the subject of a cross-
constructing a design model, but a ma- The success of SAT solvers has been a site scripting attack, and served a page
jor extra investment beyond modeling. remarkable story in computer science— containing a malicious script. Such a
The other option is to somehow theoreticians had shown that SAT was script can issue any request the user
limit the analysis. Prior to Alloy, two inherently intractable, but it turned out can issue; one of the first CSRF vulner-
approaches were popular. Abstraction that most of the cases that appeared in abilities to be discovered, for example,
reduces the analysis to a finite number practice could be solved efficiently. So allowed an attacker to change the de-
of cases by introducing abstract values SAT went from being the archetypal in- livery address for the user’s account
that each corresponds to an entire set soluble problem used to demonstrate in a DVD rental site. What makes
of real values. This often results in false the infeasibility of other problems to CSRF particularly insidious is that the
positives that are difficult to interpret. being a soluble problem that other prob- browser sends authentication creden-
In practice, picking the right abstrac- lems could be translated to. Alloy also tials stored as cookies spontaneously
tion calls for considerable ingenuity. applies a variety of tactics to reduce the when a request is issued, whether that
Simulation picks a finite number of cas- problem prior to solving, most notably request is made explicitly by the user
es, usually by random sampling, but it adding symmetry breaking constraints or programmatically by a script.
covers such a small part of the state that save the SAT solver from consider- One way to counter CSRF is to track
space that subtle flaws elude detection. ing cases equivalent to one another. the origins of all responses received
Alloy offered a new approach: run- from servers. In this example, the
ning all small tests. The designer spec- Example: Modeling Origins browser would mark the malicious
ifies a scope that bounds each of the To see Alloy in action, let’s explore the script as originating at the malicious or
types in the specification. A scope of design of an origin-tracking mechanism compromised server. The subsequent
five, for example, would include tests for Web browsers. The model shown request made by that script to the rental
involving at most five elements of each here is a toy version of a real model that site server—the target of the attack—
type: five network nodes, five packets,
five identifiers, and so on. Figure 1. Structure declarations.
The rationale for this is the small
1 abstract sig EndPoint { }
scope hypothesis, which asserts that
most bugs can be demonstrated with 2 sig Server extends EndPoint {
small counterexamples. If you test for 3 causes: set HTTPEvent
4 }
all small counterexamples, you are
likely to find any bug. Many Alloy case 5 sig Client extends EndPoint { }
studies have confirmed the hypothesis
6 abstract sig HTTPEvent {
by performing an analysis in a variety 7 from, to, origin: EndPoint
of scopes and showing, retrospectively, 8 }
that a small scope would have sufficed 9 sig Request extends HTTPEvent {
to find all the bugs discovered. 10 response: lone Response
Translation to SAT. Even with small 11 }
scopes, the state space of an Alloy 12 sig Response extends HTTPEvent {
model is fiendishly large. The state 13 embeds: set Request
comprises a collection of variables 14 }
whose values are relations. Just one bi- 15 sig Redirect extends Response {
nary relation in a scope of five has 5 × 16 }
5 = 25 possible edges, and thus 225 pos-

SE PT E MB E R 2 0 1 9 | VO L. 6 2 | N O. 9 | C OM M U N IC AT ION S OF T HE ACM 69
contributed articles

Figure 2. Data model from declarations. the value of s is some atomic identifier
representing a server object, and the
value of e is some atomic identifier rep-
resenting an event.
EndPoint Fields are declared in signatures to
allow a kind of object-oriented mind-
set. Alloy supports this by resolving field
names contextually (so that field names
need not be globally unique), and by al-
lowing “signature facts” (not used here)
Client Server from origin to that are implicitly scoped over the ele-
ments of a signature and their fields.
But don’t be misled into thinking there
causes is some kind of fancy object semantics
here. The signature structure is only a
convenience, and just introduces a set
HTTPEvent
and some relations.
The extends keyword defines one
signature as a subset of another. An
abstract signature has no elements
that do not belong to a child signa-
ture, and the extensions of a signature
are disjoint. So the declarations of
Response
EndPoint, Server, and Client imply the
set of endpoints is partitioned into
servers and clients: no server is also a
response client, and there is no endpoint that
embeds
is neither client nor server. A relation
defined over a set applies over its sub-
Redirect Request
sets too, so the declaration of from, for
example, which says that every HTTP
event is from a single endpoint, implies
the same is true for every request and
would be labeled as having this other model itself; and, response. (Alloy is best viewed as un-
origin. The target server can be configured ˲˲ An abstract style of modeling that typed. It turns out that conventional
so it only accepts requests that originate includes only those aspects essential to programming language types are far
directly from the user (for example, by the problem at hand. too restrictive for a modeling lan-
the user entering the URL for the request We start by declaring a collection of guage. Alloy thus allows expressions
in the address bar), or from the target signatures (as illustrated in Figure 1). such as HTTPEvent.response, denoting
server itself (for example, from a script A signature introduces a set of objects the set of responses to any events, but
embedded in a page it previously sent). and some fields that relate them to oth- its type checker rejects an expression
As always, the devil is in the details, and er objects. So Server, for example, will such as Request.embeds that always
we shall see that a plausible design of represent the set of server nodes, and denotes an empty set.12)
this mechanism turns out to be flawed. has a field causes that associates each The Alloy Analyzer can generate a
Some features to look out for in this server with the set of HTTP events that graphical representation of the sets
model, which distinguish Alloy from it causes. and relations from the signature decla-
many other approaches, include: Keywords (or their omission) indi- rations (see Figure 2); this is just an al-
˲˲ A rich structure of objects, classifi- cate the multiplicity of the relations ternative view and involves no analysis.
cation, and relationships; between objects: thus each HTTP event Moving to the substance of what the
˲˲ Constraints in a simple logic that has exactly one from endpoint, one to model actually means:
exploits the relations and sets of the endpoint, and one origin endpoint (line ˲˲ The from and to fields are just the
structure, avoiding the kind of low-lev- 7); each request has at most one re- source and destination of the event’s
el structures (arrays and indices espe- sponse (line 10, with lone being read as packet.
cially) that are often required in model “less than or equal to one”); and each ˲˲ For a response r, the expression
checkers; response embeds any number of re- r.embeds denotes a set of requests that
˲˲ Capturing dynamic behavior with- quests (line 13). are embedded as JavaScript in the re-
out any need for a built-in notion of Mathematically speaking, objects sponse; when that response is loaded
time or state; are just atomic identifiers without any into the browser, the requests are ex-
˲˲ Intended properties to check ex- internal structure. The causes relation ecuted spontaneously.
pressed in the same language as the includes tuples of the form (s, e) where ˲˲ A Redirect is a special kind of re-

70 COMMUNICATIO NS O F TH E ACM | S EPTEM BER 201 9 | VO L . 62 | N O. 9


contributed articles

sponse that indicates that a resource was from, and from the endpoint its from that server, or is embedded in a
has moved, and spontaneously issues request was to (line 23); and that a response that the server causes.
a request to a different server; this sec- request cannot be embedded in a re- ˲˲ The Origin fact describes the ori-
ond request is modeled as an embed- sponse to itself (line 24). Two expres- gin-tracking mechanism. Each con-
ded request in the redirect response. sions in these constraints merit ex- straint defines the origin of a differ-
˲˲ The origin of an event is a notion planation. The expression response.r ent kind of HTTP event. The first (line
computed by the browser as a means of exploits the flexibility of the join op- 31) says that every embedded request
preventing cross-site attacks. As we will erator to navigate backward from the e has the same origin as the response
see later, the idea is that a server may response r to the request it responds r that it is embedded in. The second
choose to reject an event unless it origi- to; it could equivalently be written (line 32) defines the origin of a re-
nated at that server or at a browser. r.~response using the transpose op- sponse: it says if the response is a redi-
˲˲ The cause of an event is not part erator ~. The expression r.^(response. rect, it has the same origin as the orig-
of the actual state of the mechanism. embeds) starts with the request r, and inal request, and otherwise its origin
It is introduced in order to express the then applies to it one or more naviga- is the server that the response came
essential design property: that an evil tions (using the closure operator ^) of from. The third (line 33) handles a re-
server cannot cause a client to send a following the response and mbeds re- quest that is not embedded: its origin
request to a good server. lations, as if we had written instead is the endpoint it comes from (which
Now let’s look at the constraints the infinite expression will usually be the browser).
(as shown in Figure 3). If there were Finally, EnforceOrigins is a predicate
no constraints, any behavior would be r.response.embeds that can be applied to a server, indicat-
possible; adding constraints restricts + r.response.embeds ing that it chooses to enforce the origin
the behavior to include only those that .response.embeds header, allowing incoming requests
are intended by design. + r.response.embeds only if they originate at that server, or
The constraints are grouped into .response.embeds at the client that sent the request.
separate named facts to make the mod- .response.embeds With all this in place—the struc-
el more understandable: +… ture of endpoints and messages, the
˲˲ The Directions fact contains two rules about how origins are computed
constraints. The first says that every re- defining the requests embedded in the and used, and the definition of causal-
quest is from, and every response is to, response to r, the requests embedded ity—we can define a design property to
a client; the second says that every re- in the response to the requests em- check (as illustrated in Figure 4).
quest is to, and every response is from, bedded in the response to r, and so on. The keyword check introduces a
a server. These kinds of constraints can (Equivalently, r.^p is the set of nodes command that can be executed. This
be written in many ways. Here I have reachable from r in the graph whose command instructs the Alloy Analyzer
chosen to use expressions denoting edges correspond to the relation p.) to search for a refutation for the given
sets of endpoints—for example, Re- ˲˲ The Causality fact defines the constraint. In this case, the constraint
quest.from for the set of endpoints that causes relation. It says that an event asserts the nonexistence of a cross-
requests are from, but could just as is caused by a server if and only if it is site request forgery attack; refuting
well have written a constraint like:
Figure 3. Fact and predicate declarations.
from in
Request -> Client + Response -> Server 17 fact Directions {
18 Request.from + Response.to in Client
19 Request.to + Response.from in Server
to say the from relation maps requests 20 }
to clients and responses to servers. Or
21 fact RequestResponse {
in a more familiar but less succinct 22 all r: Response | one response.r
style, we could have used quantifiers: 23 all r: Response | r.to = response.r.from and r.from = response.r.to
24 all r: Request | r not in r.^(response.embeds)
25 }
all r: Request | r.from in Client
all r: Response | r.from in Server 26 fact Causality {
27 all e: HTTPEvent, s: Server | e in s.causes iff
28 e.from = s or some r: Response | e in r.embeds and r in s.causes
(constraining only the range of the re- 29 }
lations, which is sufficient in this case
30 fact Origin {
since the declarations already con- 31 all r: Response, e: r.embeds | e.origin = r.origin
strain their domains). 32 all r: Response | r.origin = (r in Redirect implies response.r.origin else r.from)
˲˲ The RequestResponse fact defines 33 all r: Request | no embeds.r implies r.origin in r.from
34 }
some basic properties of how re-
quests and responses work: that every 35 pred EnforceOrigins (s: Server) {
response is associated with exactly 36 all r: Request | r.to = s implies r.origin = r.to or r.origin = r.from
37 }
one request (line 22); that every re-
sponse is to the endpoint its request

SE PT E MB E R 2 0 1 9 | VO L. 6 2 | N O. 9 | C OM M U N IC AT ION S OF T HE ACM 71
contributed articles

Figure 4. Check command. rect whose embedded request (Req2) is


received by the good server. (Note that
the numbering of objects is arbitrary:
38 check {
39 no good, bad: Server { Req1 actually happens before Req0.)
40 good.EnforceOrigins Looking at the server nodes and the
41 no r: Request | r.to = bad and r.origin in Client events they cause, we see that, as expect-
42 some r: Request | r.to = good and r in bad.causes
43 } ed, the good server caused the response
44 } for 5 to the first request, and the bad server
caused the redirect and its subsequent
embedded request. The problem is the
mismatch between cause and origin in
the last request (Req2): we can see that it
Figure 5. Counterexample for check of Figure 6. A bogus counterexample. was caused by the bad server, but it was
Figure 4.
labeled as originating at the good serv-
Server
er. In other words, the origin tracking
Req1
($bad, $good) design is allowing a cross-site request
from: Client Server0 forgery by incorrectly identifying the
origin: Client ($good)
to: Server0 ($good) causes origin of the request in the redirect.
The solution to this problem turns
response causes
out to be non-trivial. Updating the ori-
Resp gin header after each redirect would
from: Server ($bad, $good) causes
Resp origin: Server ($bad, $good) fail for websites that offer open redirec-
from: Server0 ($good) to: Client
origin: Server0 ($good) tion; a better solution is to list a chain
to: Client of endpoints in the origin header.1
response
embeds
embeds Agile Modeling
As mentioned earlier, our model is rep-
Req0 Req
from: Client Server1
($r) resentative of many Alloy models. But
from: Client
origin: Server0 ($good) ($bad)
origin: Server ($bad, $good)
the way I presented it was potentially
to: Server1 ($bad)
to: Server ($bad, $good) misleading. In practice, users of Alloy
response causes don’t construct a model in its entirety
and then check its properties. Instead,
they proceed in a more agile way, grow-
Redirect
from: Server1 ($bad) causes
(with a 2.6GHz i7 processor and 16GB ing the model and simulating and
origin: Server0 ($good) of RAM). checking it as they go.
to: Client
The counterexample can be dis- Take, for example, the constraint on
embeds
played in various ways—as text, a table, line 24 of Figure 3. Initially, I had not
or a graph whose appearance can be actually noticed the need for this con-
Req2 customized. I’ve chosen the graph op- straint. But when I ran the check for
($r) tion, and have selected which objects the first time (without this constraint),
from: Client
origin: Server0 ($good) are to appear as nodes (just the events the analyzer presented me with coun-
to: Server0 ($good)
and the servers), which relations are to terexamples such as the one shown in
appear as edges (those between events, Figure 6, in which the response to a re-
and causes), and I’ve picked colors for quest is the very response in which the
this will show that the origin mecha- the sets and relations. I have also cho- request is embedded!
nism is not designed correctly, and an sen to use the Skolem constants (wit- One way to build a model exploiting
attack is possible. nesses that the analyzer finds for the Alloy’s ability to express and analyze
The constraint says that there are no quantified variables) good and bad to very partial models is to add one con-
two servers, good and bad, such that the label the servers. straint at a time, exploring its effect.
good server enforces the origin header Reading the graph from the top, You do not need a property to check;
(line 40), there are no requests sent di- looking just at the large rectangles you can just ask for an instance of the
rectly to the bad server that originate representing the HTTP events, we see model satisfying all the constraints.
in the client (line 41), and yet there is a request (Req1) was sent from a cli- Doing this even before any explicit
some request to the good server that ent to the good server. The response constraints have been included is very
was caused by the bad server (line 42). (Resp) embeds a request (Req0) that is helpful. You can run just the data mod-
sent to the bad server; this is a cross- el by itself and see a series of instances
Analysis Results: Finding Bugs site request, which will not be rejected that satisfy the constraints implicit
The Alloy Analyzer finds a counterex- because the bad server accepts incom- in the declarations. Often doing this
ample (see Figure 5) almost instanta- ing requests irrespective of origin. The alone exposes some interesting issues.
neously—in 30ms on my 2012 MacBook bad server’s response is to send a redi- In this case, the first few instances in-

72 COMM UNICATIO NS O F THE ACM | S EPTEM BER 201 9 | VO L . 62 | N O. 9


contributed articles

clude examples with no HTTP events, arate, for example, the constraints that
and with requests and responses that model the setting and checking of the
are disconnected. origin field from those that describe
To get more representative in- what kinds of requests and responses
stances, you can specify an additional
constraint to be satisfied. For example, Like the class are possible.
Obviously, the less you assume
the command diagrams of UML, about the environment, the better,
since every assumption you make is a
run (some response) Alloy makes it easy risk (as it may turn out to be untrue).

will show instances in which the re-


to describe In our model, for example, we do not
require every request to have a re-
sponse relation has some tuples. The a universe sponse. It would be easy to do—just
first one generated (Figure 7) shows
a request with a response that is a re-
of objects as change the declaration of response
in line 10 of Figure 1 by dropping the
direct from the same source as the a classification tree, lone keyword—but would only make
request, and sent to an endpoint that
is also its origin, and it includes an with each relation the result of the analysis less general.
Likewise, the less you constrain the
orphaned redirect unrelated to any re- defined over mechanism, the better. Allowing mul-
quest! These anomalies immediately
suggest enrichments of the model. nodes in this tree. tiple behaviors gives implementation
freedom, which is especially impor-
When we developed Alloy, we un- tant in a distributed setting.
derestimated the value of this kind of Simulation matters for a more
simulation. As we experimented with profound reason. Verification—that
Alloy, however, we came to realize is, checking properties—is often
how helpful it is to have a tool that can overrated in its ability to prevent
generate provocative examples. These failure. As Christopher Alexander
examples invariably expose basic mis- explains,2 designed artifacts usually
understandings, not only about what’s fail to meet their purposes not be-
being modeled but also about which cause specifications are violated but
properties matter. It’s essential that because specifications are unknown.
Alloy provides this simulation for free: The “unknown unknowns” of a soft-
in particular, you do not need to for- ware design are invariably discovered
mulate anything like a test case, which when the design is finally deployed,
would defeat the whole point. but can often be exposed earlier by
Growing a model in a declarative simulation, especially in the hands
language like Alloy is very different of an imaginative designer.
from growing a program in a conven- Verification, in contrast, is too nar-
tional programming language. A pro- rowly focused to produce such discover-
gram starts with no behaviors at all, ies. This is not to say property checking
and as you add code, new behaviors is not useful—it’s especially valuable
become possible. With Alloy, it’s the when a property can be assured with
opposite. The empty model, since it high confidence using a tool such as
lacks any constraints, allows every pos- Alloy or a model checker or theorem
sible behavior; as you add constraints, prover (rather than by testing). But
behaviors are eliminated. its value is always contingent on the
This allows a powerful style of in- sufficiency of the property itself, and
cremental development in which you
only add constraints that are abso- Figure 7. A simulated instance.
lutely essential for the task at hand—
whether that is eliminating patho- Req
from: Client1
logical cases or ensuring a design origin: Client0
property holds. to: Client1
Typically a model includes both
a description of the mechanism be- response

ing designed and some assumptions


about the environment in which it op-
Redirect1 Redirect0
erates. Our example model does not from: Client1 from: Client1
origin: Client0 origin: Client0
separate these rigorously, but where to: Client0 to: Client0
brevity is not such a pressing concern,
it would be wise to do so. We could sep-

SE PT E MB E R 2 0 1 9 | VO L. 6 2 | N O. 9 | C OM M U N IC AT ION S OF T HE ACM 73
contributed articles

techniques that help you explore prop- Web security mechanisms, and then
erties have an important role to play. analyzed five different mechanisms,
including: WebAuth, a Web-based au-
Uses of Alloy thentication protocol based on Ker-
Hundreds of papers have reported on
applications of Alloy in a wide variety As we experimented beros deployed at several universities
including Stanford; HTML5 forms;
of settings. Here are some examples
to give a better idea of how Alloy has
with Alloy, the Cross-Origin Resource Sharing
protocol; and proposed designs for
been used: we came to realize using the referrer header and the ori-
Critical systems. A team at the Uni-
versity of Washington constructed a
how helpful gin header to foil cross-site attacks
(of which the last is the basis for the
dependability case18 for a neutron radio- it is to have example here). The base library was
therapy installation. The team devised an
ingenious technique for verifying prop-
a tool that can written in 2,000 lines of Alloy; the vari-
ous mechanisms required between 20
erties of code against specifications us- generate and 214 extra lines; and every bug was
ing lightweight, pluggable checkers.
The end-to-end dependability case provocative found within two minutes and a scope
of 8. Two previously known vulnerabil-
was assembled in Alloy from the code examples. ities were confirmed by the analysis,
specifications, properties of the equip- and three new ones discovered.
ment and environment, and the ex- Memory models. John Wickerson
pected properties, and then checked and his colleagues have shown that
using the Alloy Analyzer. The analysis four common tasks in the design of
found several safety-critical flaws in memory models—generating confor-
the latest version of the control soft- mance tests, comparing two memory
ware, which the researchers were able models, checking compiler optimiza-
to correct prior to its deployment. For tions, and checking compiler map-
a full description, see a recent research pings—can all be framed as constraint
report30 and additional information on satisfaction problems in Alloy.41 They
the project’s website.36 were able to reproduce automatically
Network protocols. Pamela Zave, a several results for C11 (the memory
researcher at AT&T, has been using Al- model introduced in 2011 for C and
loy for many years to construct and ana- C++) and common compiler optimiza-
lyze models of networking as well as tions associated with it, for the mem-
for designing a new unifying network ory models of the IBM Power and Intel
architecture. In a major case study, x86 chips, and for compiler mappings
she analyzed Chord, a distributed hash from OpenCL to AMD-style GPUs.
table for peer-to-peer applications. The They then used their technique to de-
original paper on Chord33—one of the velop and check a new memory model
most widely cited papers in computer for Nvidia GPUs.
science—notes that an innovation of Code verification. Alloy can also be
Chord was its relative simplicity, and used to verify code by translating the
consequently the confidence users body of a function into Alloy, and ask-
can have in its correctness. By model- ing it to find a behavior of the func-
ing and analyzing the protocol in Al- tion that violates its specification.
loy, Zave showed that the Chord pro- Greg Dennis built a tool called Forge
tocol was not, in fact, correct, and she that wraps Alloy so it can be applied
was able to develop a fixed version that directly to Java code annotated with
maintains its simplicity and elegance JML specifications. In a case study
while guaranteeing correct behav- application, 10 he checked a variety of
ior.43 Zave also used the explicit model implementations of the Java collec-
checker SPIN14 in this work, and wrote tions list interface, and found bugs
an insightful article explaining the rela- in one (a GNU Trove implementa-
tive merits of the two tools, and how she tion). Dennis also applied his tool
used them in tandem.42 to KOA, an electronic voting system
Web security. The demonstration ex- used in the Netherlands that was an-
ample of this article is drawn from notated with JML specifications and
a real study performed by a research had previously been analyzed with
group at UC Berkeley and Stanford.1 The a theorem-proving tool, and found
group constructed a library of Alloy several functions that did not satisfy
models to capture various aspects of their specifications.11

74 COM MUNICATIO NS O F TH E AC M | S EPTEM BER 201 9 | VO L . 62 | N O. 9


contributed articles

Civil engineering. In one of the Alloy Extensions No extension of Alloy, however, has yet
more innovative applications of Alloy, Many extensions to Alloy—both to addressed the problem of combining
John Baugh and his colleagues have the language and to the tool—have Alloy’s capacity for structural analysis
been applying Alloy to problems in been created. These offer a variety of with the ability of traditional model
large-scale physical simulation. They improvements in expressiveness, per- checkers to explore long traces, so Al-
designed an extension to ADCIRC— formance, and usability. For the most loy analyses are still typically limited to
an ocean circulation model widely part, these extensions have been mu- short traces.
used by the U.S. Army Corps of Engi- tually incompatible, but a new open Instance generation. The result of
neers and others for simulating hur- source effort is now working to consoli- an Alloy analysis is not one but an
ricane storm surge—that introduces a date them. There are too many efforts entire set of solutions to a constraint-
notion of subdomains to allow more to include here, so I focus on represen- solving problem, each of which rep-
localized computation of changes tatives of the main classes. resents either a positive example of
(and thus reduced overall computa- Higher-order solving. The Alloy Ana- a scenario, or a negative example,
tional effort). Their extension, which lyzer’s constraint-solving mechanism showing how the design fails to meet
has been incorporated into the of- cannot handle formulas with universal some property. The order in which
ficial ADCIRC release, was modeled quantifications over relations—that these appear is somewhat arbitrary,
and verified in Alloy.7 is, problems that reduce to “find some being determined both by how the
Alloy as a backend. Because Alloy relation P such that for every relation problem is encoded and the tactics
offers a small and expressive logic, Q …” This is exactly the form that many of the backend SAT solver. Since SAT
along with a powerful analyzer, it has synthesis problems take, in which the solvers tend to try false before true
been exploited as a backend in many relation P represents a structure to be values, the instances generated tend
different tools. Developers have often synthesized, such as the abstract syn- to be small ones—with few nodes and
used Alloy’s own engine, Kodkod,34 tax tree of a program, and the relation edges. This is often desirable, but is
directly, rather than the API of Alloy Q represents the state space over which not always ideal. Various extensions
itself, because it offers a simpler pro- certain behaviors are to be verified. Al- to the Alloy Analyzer provide more
grammatic interface with the ability loy*24 is an extension of Alloy that can control over the order in which in-
to set bounds on relations, improv- solve such formulas, by generalizing a stances appear. Aluminum28 presents
ing performance. Jasmin Blanchette’s tactic known as counterexample-guid- only minimal scenarios in which ev-
Nitpick tool,8 for example, uses Kod- ed inductive synthesis that has been ery relation tuple is needed to satisfy
kod to find counterexamples in Isa- widely used in synthesis engines. the constraints, and lets the user add
belle/HOL, saving the user the trouble Temporal logic. Alloy has no built-in new tuples, automatically compen-
of trying to prove a theorem that is not notion of time or dynamic behavior. sating with a (minimal) set of addi-
true, and the Margrave tool26 analyzes On the one hand, this is an asset, be- tional tuples required for consistency.
firewall configurations. Last year, a cause it keeps the language simple, Amalgam27 lets users ask about the
team from Princeton and Nvidia built and allows it to be used very flexibly. provenance of an instance, indicating
a tool that uses Alloy to synthesize se- I exploited this in the example model which sub formula is responsible for
curity attacks that exploit the Spectre here, where the flow of time is cap- requiring (or forbidding) a particular
and Meltdown vulnerabilities.35 tured in the response relation that tuple in the instance. Another exten-
Teaching. Alloy has been widely maps each request to its response. By sion21 of the Alloy Analyzer generates
taught in undergraduate and gradu- adding a signature for state, Alloy sup- minimal and maximal instances, and
ate courses for many years. At the Uni- ports the specification style common choosing a next instance that is as
versity of Minho in Portugal, Alcino in languages such as B, VDM, and Z; close to, or as far away from, the cur-
Cunha teaches an annual course on and by adding a signature for events, rent instance as possible.
formal methods using Alloy, and has Alloy allows analysis over traces that Better numerics. Alloy handles nu-
developed a Web interface to present can be visualized as a series of snap- merical operations by treating num-
students with Alloy exercises (which shots. On the other hand, it would bers as bit strings. This has the ad-
are then automatically checked). At often be preferable to have dynamic vantage of fitting into the SAT solving
Brown University, Tim Nelson teaches features built into the language. Elec- paradigm smoothly, and it allows a
Logic for Systems, which uses Alloy trum20 extends Alloy with a keyword good repertoire of integer operations.
for modeling and analysis of system var to indicate that a signature or field But the analysis scales poorly, making
designs, and has become one of the has a time-varying value, and with Alloy unsuitable for heavily numeric
most popular undergraduate classes. the quantifiers of linear temporal applications. The finite scopes of Alloy
Because the Alloy language is very logic (which fit elegantly with Alloy’s can also be an issue when a designer
close to a pure relational logic, it has traditional quantifiers). DynAlloy31 would like numbers to be unbounded.
also been popular in the teaching of offers similar functionality, but us- A possible solution is to replace the
discrete mathematics, for example, in ing dynamic logic instead, and is the SAT backend with an SMT backend
a course that Charles Wallace teaches basis of an impressive code analysis instead. This is challenging because
at Michigan Technological Univer- tool called TACO13 that outperforms SMT solvers have not traditionally
sity38 and appearing as a chapter in a Forge (mentioned earlier) by employ- supported relational operators. Nev-
popular textbook.15 ing domain-specific optimizations. ertheless, a team at the University of

SE PT E MB E R 2 0 1 9 | VO L. 6 2 | N O. 9 | C OM M U N IC AT ION S OF T HE ACM 75
contributed articles

Iowa has recently extended CVC4, a to write it; and to Devdatta Akhawe, knowledge for efficient model analysis. In Proceedings
of the 15th Intern. Symp. Automated Technology for
leading SMT solver, with a theory of Adam Barth, Peifung E. Lam, John Verification and Analysis. Springer, 2017, 344–362.
finite relations, and has promisingly Mitchell, and Dawn Song, whose work 23. Meng, B., Reynolds, A., Tinelli, C., and Barrett,
C. Relational Constraint Solving in SMT. In
demonstrated its application to some formed the basis of the example used Proceedings of the 26th Intern. Conf. Automated
Alloy problems.23 in the article. Thank you also to the Deduction. (Gothenburg, Sweden, 2017) L. de
Moura, ed. Springer.
Configurations. Many Alloy models many members of the Alloy communi- 24. Milicevic, A., Near, J.P., Kang E. and Jackson, D. Alloy*:
contain two loosely coupled parts, ty who have contributed to Alloy over A general-purpose higher-order relational constraint
solver. Formal Methods in System Design, 2017, 1–32.
one defining a configuration (say of a the years. 25. Montaghami, V. and Rayside, D. Extending alloy with
network) and the other the behavior partial instances. In Proceedings of the 3rd Intern.
Conf. Abstract State Machines, Alloy, B, VDM, and Z.
(say of sending packets). By iterating References 2012, 122–135.
1. Akhawe, D., Barth, A., Lam, P.E., Mitchell, J. and Song, 26. Nelson, T., Barratt, C., Dougherty, D.J., Fisler, K. and
through configurations and analyzing D. Towards a formal foundation of Web security. In Krishnamurthi, S. The Margrave tool for firewall
each independently, one can often Proceedings of the 23rd IEEE Computer Security analysis. In Proceedings of the 24th USENIX Large
Foundations Symp. Edinburgh, 2010, 290–304. Installation System Administration Conference (San
dramatically reduce analysis time.22 2. Alexander, C. Notes on the Synthesis of Form. Harvard Jose, CA, 2010).
University Press, 1964.
In some applications, a configuration 3. Alloy Tools; http://alloytools.org.
27. Nelson, T., Danas, N., Dougherty, D.J., and
Krishnamurthi, S. The Power of Why and Why Not:
is already fully or partially known, and 4. Alloy Models repository; https://github.com/ Enriching Scenario Exploration with Provenance.
AlloyTools/models
the goal is to complete the instance— 5. Alloy discussion forum; https://groups.google.com/
Joint European Software Engineering Conference and
ACM SIGSOFT Symposium on the Foundations of
in which case searching for the con- forum/#!forum/alloytools Software Engineering, 2017.
6. Barth, A., Jackson, C., and Mitchell, J.C. Robust
figuration is a wasted effort. Kodkod, defenses for cross-site request forgery. In
28. Nelson, T., Saghafi, S., Dougherty, D.J., Fisler, K., and
Krishnamurthi, S. Aluminum: Principled scenario
Alloy’s engine, allows the explicit def- Proceedings of the 15th ACM Conf. on Computer and exploration through minimality. In Proceedings of the
Communications Security. ACM, 2008, 75–88.
inition of a “partial instance” to sup- 7. Baugh, J. and Altuntas, A. Formal methods and finite
Intern. Conf. Software Engineering, 2013.
29. Padon, O., Losa, G., Sagiv, M. and Shoham S. Paxos
port this, but in Alloy itself, this no- element analysis of hurricane storm surge: A case made EPR: Decidable reasoning about distributed
study in software verification. Science of Computer protocols. In Proceedings of the OOPSLA 2017
tion is not well supported (and relies Programming 158 (2018), 100–121. (Vancouver, 2017).
on a heuristic for extracting partial 8. Blanchette, J. and Nipkow, T. Nitpick: A 30. Pernsteiner, S. et al. Investigating safety of a
counterexample generator for higher-order logic radiotherapy machine using system models with
instances from formulas in a certain based on a relational model finder. In Proceedings of pluggable checkers. Computer Aided Verification
form). Researchers have therefore the 1st Intern.Conf. Interactive Theorem Proving. M. LNCS 9780. Springer.
Kaufmann and L.C. Paulson, eds. LNCS 6172 (2010). 31. Regis, G. et al. DynAlloy Analyzer: A tool for the
proposed a language extension25 to Springer, 131–146. specification and analysis of Alloy models with
allow partial instances to be defined 9. Burch, J.R., Clarke, E.M., McMillan, K.L., Dill, D.L. and dynamic behaviour. In Proceedings of the 11th Joint
Hwang, L.J. Symbolic model checking: 10 states and Meeting on Foundations of Software Engineering. ACM,
directly in Alloy itself. beyond. In Proceedings of the 5th Annual Symp. Logic New York, NY, 2017, 969–973.
in Computer Science. (Philadelphia, PA, USA, June 32. Spivey, J.M. The Z Notation: A Reference Manual (2nd
4–7, 1990), 428–439. ed.), Prentice Hall, 1992.
How to Try Alloy 10. Dennis, G., Chang, F. and Jackson, D. Modular 33. Stoica, I. et al. Chord: A scalable peer-to-peer lookup
verification of code with SAT. In Proceedings of
The Alloy Analyzer3 is a free down- the Intern. Symp. Software Testing and Analysis.
protocol for Internet applications. IEEE/ACM Trans.
Networking 11, 1 (2003), 17–32.
load available for Mac, Windows, (Portland, ME, July 2006). 34. Torlak, E. and Jackson, D. Kodkod: A relational model
11. Dennis, G., Yessenov, K. and Jackson, D. Bounded
and Linux. The Alloy book16 provides verification of voting software. In Proceedings of the
finder. In Proceedings of the 13th Intern. Conf. Tools
and Algorithms for the Construction and Analysis of
a gentle introduction to relational 2nd IFIP Working Conf. Verified Software: Theories, Systems (Braga, Portugal, 2007), 632–647.
Tools, and Experiments. (Toronto, Canada, Oct. 2008).
logic and to the Alloy language, gives 12. Edwards, J., Jackson, D. and Torlak, E. A type system
35. Trippel, C., Lustig, D. and Martonosi, M.
MeltdownPrime and SpectrePrime: Automatically
many examples of Alloy models, and for object models. In Proceedings of the 12th ACM Synthesized Attacks Exploiting Invalidation-Based
SIGSOFT Intern. Symp. Foundations of Software Coherence Protocols. Feb. 2018; arX- iv:1802.03802.
includes a reference manual and a Engineering (Newport Beach, CA, USA, Oct. 31— Nov. 36. University of Washington. PLSE Neutrons;
comparison to other languages (both 6, 2004), 189–199. http:neutrons.uwplse.org/
13. Galeotti, J.P., Rosner, N., Lopez Pombo, C.G. and Frias, 37. Visser, W., Havelund, K., Brat, G., Park, S.J. and Lerda,
of which are available on the book’s M.F. TACO: Efficient SAT-based bounded verification F. Model checking programs. Automated Software
website17). The Alloy community using symmetry breaking and tight bounds. IEEE Engineering J. 10, 2 (Apr. 2003).
Trans. Softw. Eng. 39, 9 (Sept.2013), 1283–1307. 38. Wallace, C. Learning Discrete Structures Interactively
answers questions tagged with the 14. Holzmann, G.J. The Spin Model Checker: Primer and with Alloy. In Proceedings of the 49th ACM Tech.
keyword alloy on StackOverflow, and Reference Manual, Addison Wesley, 2003. Symp. Computer Science Education (Baltimore, MD,
15. Huth, M. and Ryan, M. Logic in Computer Science: Feb. 21–24, 2018), 1051–1051.
hosts a discussion forum.5 A variety of Modeling and Reasoning about Systems. Cambridge 39. Warmer, J.B. and Kleppe, A.G. The Object Constraint
University Press, 2004.
tutorials for learning Alloy are avail- 16. Jackson, D. Software Abstractions. MIT Press, Second
Language: Precise Modeling with UML. Addison-
Wesley, 1999.
able online too, as well as blog posts edition, 2012. 40. Wayne, H. Personal blog; https://www.hillel-wayne.com
17. Jackson, D. Software Abstractions; http://
with illustrative case studies and softwareabstractions.org.
41. Wickerson, J, Batty, M., Sorensen, T. and
Constantinides, G.A. Automatically comparing
examples (for example, Kriens 19 18. Jackson, D., Thomas, M. and Millett, L.I. eds. Software memory consistency models. In Proceedings of the
For Dependable Systems: Sufficient Evidence?
and Wayne 40). The model used in Committee on Certifiably Dependable Software
44th ACM SIGPLAN Symp. Principles of Programming
Languages (Paris, France, 2017), 190–204.
this article is available (along with Systems, Computer Science and Telecommunications 42. Zave, P. A practical comparison of Alloy and Spin.
Board, Division on Engineering and Physical Sciences.
its visualization theme) in the Alloy National Research Council of the National Academies.
Formal Aspects of Computing 27 (2015), 239–253.
43. Zave, P. Reasoning about identifier spaces: How
community’s model repository.4 The National Academies Press, Washington, D.C., 2007. to make Chord correct. IEEE Trans. Software
19. Kriens, P. JPMS, The Sequel; http://aqute. Engineering 43, 12 (Dec. 2017), 1144–1156.
biz/2017/06/14/jpms-the-sequel.html
Acknowledgments 20. Macedo, N., Brunel, J., Chemouil, D., Cunha, A. and
Kuperberg, D. Lightweight specification and analysis
I am very grateful to David Chemouil, of dynamic systems with rich configurations. In
Daniel Jackson is a professor of computer science and
Associate Director of the Computer Science and Artificial
Alcino Cunha, Peter Kriens, Shriram Proceedings of the 24th ACM SIGSOFT Intern. Symp.
Intelligence Laboratory at the Massachusetts Institute of
Foundations of Software Engineering (Seattle, WA,
Krishnamurthi, Jay Parlar, Emina Tor- Technology, Cambridge, MA, USA.
USA, 2016), 373–383.
lak, Hillel Wayne, Pamela Zave, and 21. Macedo, N., Cunha, A. and Guimaraes, T. Exploring
scenario exploration. Fundamental Approaches to
the anonymous reviewers, whose sug- Software Engineering. A. Egyed and I. Schaefer, eds.
Lecture Notes in Computer Science 9033. Springer,
gestions improved this article greatly; Berlin, Heidelberg.
to Moshe Vardi, who encouraged me 22. Macedo, N., Cunha, A. and Pessoa, E. Exploiting partial © 2019 ACM 0001-0782/19/9

76 COMM UNICATIO NS O F THE AC M | S EPTEM BER 201 9 | VO L . 62 | N O. 9

You might also like