You are on page 1of 9

No more, no less

A formal model for serverless computing

Maurizio Gabbrielli1 , Saverio Giallorenzo2 , Ivan Lanese1 , Fabrizio Montesi2 ,


Marco Peressotti2 , and Stefano Pio Zingaro1
1
INRIA, France / Università di Bologna, Italy
{maurizio.gabbrielli,ivan.lanese,stefanopio.zingaro}@unibo.it
2
University of Southern Denmark, Denmark
{saverio,fmontesi,peressotti}@imada.sdu.dk
arXiv:1903.07962v2 [cs.PL] 1 May 2019

Abstract Serverless computing, also known as Functions-as-a-Service,


is a recent paradigm aimed at simplifying the programming of cloud
applications. The idea is that developers design applications in terms
of functions, which are then deployed on a cloud infrastructure. The
infrastructure takes care of executing the functions whenever requested
by remote clients, dealing automatically with distribution and scaling
with respect to inbound traffic.
While vendors already support a variety of programming languages for
serverless computing (e.g. Go, Java, Javascript, Python), as far as we
know there is no reference model yet to formally reason on this paradigm.
In this paper, we propose the first core formal programming model for
serverless computing, which combines ideas from both the λ-calculus
(for functions) and the π-calculus (for communication). To illustrate
our proposal, we model a real-world serverless system. Thanks to our
model, we capture limitations of current vendors and formalise possible
amendments.

1 Introduction

Serverless computing [24], also known as Functions-as-a-Service, narrows the


development of Cloud applications to the definition and composition of stateless
functions, while the provider handles the deployment, scaling, and balancing
of the host infrastructure. Hence, although a bit of a misnomer — as servers
are of course involved — the “less” in serverless refers to the removal of some
server-related concerns, namely, their maintenance, scaling, and expenses related
to a sub-optimal management (e.g. idle servers). Essentially, serverless pushes to
the extreme the per-usage model of Cloud Computing: in serverless, users pay
only for the computing resources used at each function invocation. This is why
recent reports [18, 24] address serverless computing as the actual realisation of
the long-standing promise of the Cloud to deliver computation as a commodity.
AWS Lambda [4], launched in 2014, is the first and most widely-used serverless
implementation, however many players like Google, Microsoft, Apache, IBM,
and also open-source communities recently joined the serverless market [3, 16,
2 Gabbrielli et al.

19, 21, 22, 29]. Current serverless proposals support the definition of functions
— written in mainstream languages such as Go, Java, Javascript or Python —
activated by specific events in the system, like a user request to a web gateway,
the delivery of content from a message broker or a notification from a database.
The serverless infrastructure transparently handles the instantiation of functions,
as well as monitoring, logging, and fault tolerance.
Serverless offerings have become more and more common, yet the technology
is still in its infancy and presents limitations [6, 18, 24] which hinder its wide
adoption. For example, current serverless implementations favour operational
flexibility (asynchrony and scalability) over developer control (function composi-
tion). Concretely, they do not support the direct composition of functions, which
must call some stateful service in the infrastructure (e.g. a message broker) which
will take care of triggering an event bound to the callee. On the one hand, that
limitation is beneficial, since programmers must develop their functions as highly
fine-grained, re-usable components (reminiscent of service-oriented architectures
and microservices [12]). On the other hand, such openness and fine granularity
increases the complexity of the system: programmers cannot assume sequential
consistency or serialisability among their functions, which complicates reasoning
on the semantics of the transformations applied to the global state of their
architecture. This holds true also when estimating resource usage/costs, due to
the complexity of unfolding all possible concurrent computations.
The above criticisms pushed us to investigate a core calculus for serverless
computing, to reason on the paradigm, to model desirable features of future imple-
mentations, and to formalise guarantees over programs. In Section 2 we introduce
the Serverless Kernel Calculus (SKC); as far as we know, the first core formal
model for serverless computing. SKC combines ideas from both the λ-calculus (for
functions) and the π-calculus (for communication). In Section 2, we also extend
SKC to capture limitations of current serverless implementations. In Section 3 we
use our extension to model a real-world serverless architecture [1], implemented
on AWS Lambda. Finally, in Section 4 we discuss future developments of SKC.

2 A Serverless Kernel Calculus

Our kernel calculus defines a serverless architecture as a pair hS, Di, where S
is the system of running functions and D is a definition repository, containing
function definitions. The repository D is a partial function from function names
f to function bodies M . M includes function application (M M '), asynchronous
execution of new functions (async M ), function names f, and values V . Values
include variables x, λ-abstractions λx.M , named futures [5, 17, 32] c, and the unit
value (). A system S contains running functions c J M , where c will contain the
result of the computation of the function M . Systems can be composed in parallel
| and include the empty system 0. Futures can be restricted in systems via νc S.

S, S 0 ::= c J M | S | S 0 | νc S | 0 (Systems)
M, M 0 ::= M M ' | async M | f | V (Functions)
A formal model for serverless computing 3

D(f) = M
bβe brete
hE[(λx.M ) V ], Di −−→ hE[M {V /x}], Di hE[f], Di −−→ hE[M ], Di
c∈/ fn(M )
basynce bpushe
hE[async M ], Di −−→ hνc(E[c] | c J M ), Di hνc(S | c J V ), Di −−→ hS{V /c}, Di
S0 ≡ S00 hS00 , Di −−→ hS10 , D0 i S10 ≡ S1
0
bstre
hS0 , Di −−→ hS1 , D i
hS, Di −−→ hS 0 , D0 i hS1 , Di −−→ hS10 , D0 i
brese blpare
0
hνc S, Di −−→ hνc S , D i 0
hS1 | S2 , Di −−→ hS10 | S2 , D0 i

Figure 1. SKC reduction semantics.

V, V 0 ::= x | λx.M | c | () (Values)

We assume futures to appear only at runtime and not in initial systems. Moreover,
we consider a standard structural congruence ≡ that supports changing the scope
of restrictions to avoid name capture, and where parallel composition is associ-
ative, commutative, and has 0 as neutral element.

νc νc0 S ≡ νc0 νc S νc(S | S 0 ) ≡ νc S | S 0 if c 6∈ fn(S 0 )


S ≡S|0 S|S ≡S |S 0 0
(S | S ) | S ≡ S | (S 0 | S 00 )
0 00

We define the semantics of our calculus using evaluation contexts E and Eλ , to


evaluate, respectively, systems and functions.

E ::= c J Eλ Eλ ::= [−] | (λx.M )Eλ | Eλ M

We report in Figure 1 the semantics of serverless architectures hS, Di, expressed


as reduction rules. Rule bβe is the traditional function application of λ-calculus.
Rule brete retrieves the body of function f from the definition repository D. Rule
basynce models the execution of new functions: it creates a fresh future c and, in
parallel, it executes function M so that c will store the evaluation of M . When the
evaluation of a function reduces to a value, rule bpushe returns the value to the
associated future and removes both the terminated function and its restriction.
Rules bstre, brese, and blpare perform the closure under, respectively, structural
congruence, restriction, and parallel composition. We include in SKC standard
components (conditionals, etc.) and extend evaluation contexts (E) accordingly:

M ::= · · · | if M then M ' else M '' | fst M | snd M


V ::= · · · | True | False | (V ,V ')

We define standard macros for fixpoint, let and let rec declarations, and pairs.

fix , λf.(λx.f(xx))(λx.f(xx)) let x = M in M ' , (λx.M ') M


let rec x = M in M ' , let x = fix λx.M in M '
λ(x,y).M , λz.(λx.λy.M) (fst z) (snd z)
4 Gabbrielli et al.

2.1 SKCσ - A stateful extension of SKC


SKC considers static definition repositories, i.e. no rules mutate the state of D.
We now present SKCσ , an extension of SKC which includes two primitives to
define transformations on definition repositories. As shown in Section 3, SKCσ is
powerful enough to encode stateful services, like databases and message queues.

M, M 0 ::= · · · | set f M | take f

The first primitive included in SKCσ is set f M , which updates the definition
repository D to map f to M : users can use the set primitive to deploy new
function definitions or update/override existing ones. The second primitive is
take f, which removes the definition of f from D, returning it to the caller. We
report below the semantics of the new primitives.
futures(M ) = ∅
bsete
hE[set f M ], Di −−→ hE[f], D[f7→M]i
D(f) = M
btakee
hE[take f], Di −−→ hE[let rec f=M in M ], undef(D, f)i

The only restriction on the application of rule bsete is that the body M of the
newly deployed function f does not contain futures (futures(M ) is the set of
futures occurring in M ). This preserves the semantics of restriction of futures in
function evaluations (cf. rules basynce and bpushe). In the reductum, the rule
returns the name of the deployed function, useful to invoke it in the continuation.
Rule btakee removes the definition M of a deployed function f. For simplicity,
we define btakee applicable only if f is defined. In the reductum, the caller of
the take obtains the recursive let declaration of the function (useful for internal
application) while the association for f is removed from D by function undef.

2.2 SKCe - Event-based function composition in SKC


We present an idiom of SKC, called SKCe , which models event-based function
composition. SKCe captures one of the main limitations of current serverless
vendors: the lack of support for direct function invocation, replaced by an
event-handling/event-triggering invocation model. Indeed, current serverless im-
plementations, such as AWS Lambda, work as follows: they include infrastructural
stateful services, such as API gateways, that we can model using our stateful
extension SKCσ , and these services throw events. User-defined functions are
invoked as handlers of these events. User-defined functions can then invoke the
infrastructural services above. Notably, a user-defined function cannot directly
invoke another user-defined function. We will see an instance of the event-based
pattern in Section 3, while we describe below event handling mechanisms.
We model events (e and variations thereof) inside SKC as function names
associated with peculiar function bodies in the repository D that asynchronously
evaluate the corresponding event handler and discard the handler result. For
convenience, i) we package the asynchronous call of an event handler in the helper
A formal model for serverless computing 5

function callHandler below (hereafter, we assume that D contains callHandler)


and ii) we write _ for unused variable symbols in binding constructs.

callHandler 7→ λh.λx.let _ = async (h () x) in ()

Event e is defined in D as e 7→ callHandler λ_.he and its event handler as


he 7→ Me ; we wrap the name he in a lambda abstraction to avoid expansion
(via bRete) since function names are not values. Raising an event e with some
parameter v results in asynchronously executing the corresponding handler, as
shown by the derivation below (we abbreviate hS, Di −−→ hS 0 , Di as S −−→D S 0
and label reductions with the names of the most relevant applied rules).
brete brete
e v −−−→D callHandler λ_.he v −−−→D (λh.λx.let _=async (h () x) in ()) λ_.he v
bβe,bβe basynce
−−−−→D let _=async (λ_.he () v) in () −−−−→D νc(let _=c in () | c J λ_.he () v)
bβe,bβe brete
−−−−→D νc(() | c J he v) −−−→D () | νc(c J Me v)

3 An Illustrative Example
Let SKCσe be the compound of SKCσ and SKCe presented in Section 2. Here,
we illustrate how SKCσe can capture real-world serverless systems by encoding
a relevant portion (depicted in Figure 2) of Tailor [1], an architecture for user
registration, developed by Autodesk over AWS Lambda. Tailor mixes serverless
functions with vendor-specific services: API Gateways, key-value databases (Dy-
namoDB), and queue-based notification services (SNS). In the architecture, each
function defines a fragment of the logic of a user-registration procedure, like
the initiation of registration requests (talr-receptionist), request validation
(talr-validator), etc. To model Figure 2 in SKCσe , first, we install in D the
event handlers for the API Gateway, the DynamoDB, and SNS services3 :

eAPI 7→ callHandler(talr-receptionist) eSNS 7→ callHandler([...])


eDDB 7→ callHandler(talr-validator)

Then, we define the functions called by the handlers installed above, using the
same names of the AWS Lambda functions in Figure 2. Handler eAPI calls function
talr-receptionist, which validates the request and inserts the information
of the user in the key/value database. For brevity, we omit the behaviour of
talr-receptionist in case of invalid requests and the definition of auxiliary
functions validate_request, get_key, get_value in D:

talr-receptionist 7→ λx.if validate_request x then


write_db (get_key x,get_value x) else [...]

Handler eDDB invokes function talr-validator, which retrieves from the database
the status of task x, checks if it is complete, and sends a notification on SNS. We
omit the definitions of functions check and compose_msg and of the else branch.
3
We omit the name of the function called by eSNS , excluded in the excerpt of Figure 2.
6 Gabbrielli et al.

Figure 2. Scheme of the Autodesk Tailor system. Top, excerpt considered in the
example. Bottom, full architecture (circled elements belong to the excerpt).

talr-validator 7→ λx.let status = read_db x in


if check status then push (compose_msg x) else [...]
We conclude illustrating the definitions of functions write_db, read_db, and push
in D, which exemplify how SKCσe can encode stateful, event-triggering services.
Keys are represented as function names and values are stored in D; thus keys
are passed around wrapped in lambda abstractions (λ_.k) as done for events.
write_db 7→ λ(x,v).eDDB (set (x ()) v) read_db 7→ λx.x ()
push 7→ λ(x,v).eSNS (set (x ()) v)
Function write_db takes a key (wrapped as x = λ_.k) and a value v as parameters,
writes on the database by setting to v the body of a function called k, and notifies
the write, invoking eDDB 4 . Function read_db simply unwraps the key thus enabling
4
More involved variants of the database are possible. E.g. to avoid clashes among
services using the same key for different elements, we can either use scoping or prefix
key names with service names — e.g. Tailor uses service-specific tables in DynamoDB.
A formal model for serverless computing 7

retrieval from D. Similarly to write_db, function push publishes (set) a message


v on an SNS topic (represented as a function name) and triggers eSNS .

Remark 3.1. The example illustrates how SKC can capture (but not be restricted
by) one of the most prominent limitations of current serverless platform [18],
i.e. that i) user-defined functions can be only invoked by raising an event that
executes a new function (as done by callHandler, using the async primitive)
and ii) functions can invoke other functions only by interacting with some
event-triggering infrastructural service (e.g. a database, represented by function
write_db, or a notification queue, represented by function push).

4 Discussion and Conclusion

We propose SKC, the first core formal model to reason on serverless computing.
While the design of SKC strives for minimality, it captures the main ingredi-
ents [18, 24] of serverless architectures: i) the deployment and instantiation of
event-triggered, stateless functions and ii) the desiderata of direct function-to-
function invocation based on futures — in Section 3 we show how this mechanism
is powerful enough to cover also the current setting of serverless vendors, where
function invocation must rely on third-party services that handle event triggering.
Futures [5, 17], which are the main communication mechanism in SKC, are
becoming one of the de-facto standards in asynchronous systems [13, 15, 35, 37].
We considered using named channels (as in CCS/π-calculus [30, 34]) instead of
futures, but we found them too general for the needs of the serverless model
(they are bi-directional and re-usable). Besides, futures can encode channels [32].
The work closest to ours is [23], appeared during the submission of this work,
in the form of a technical report. It presents a detailed operational semantics
that captures the low-level details of current serverless implementations (e.g.
cold/warm components, storage, and transactions are primitive features of their
model) whereas SKC identifies a kernel model of serverless computing. Another
work close to SKC is [32], where the authors introduce a λ-calculus with futures.
Since the aim of [32] is to formalise and reason on a concurrent extension of
Standard ML, their calculus is more involved than SKC, as it contains primitive
operators (handlers and cells) to encode safe non-deterministic concurrent opera-
tions, which can be encoded as macros in SKC. An interesting research direction
is to investigate which results from [23, 32] can be adapted to SKC.
Being the first core framework to reason on serverless architectures, SKC
opens multiple avenues of future research. For example, current serverless tech-
nologies offer little guarantee on sequential execution across functions, which
compels the investigation of new tools to enforce sequential consistency [28] or
serialisability [33] of the transformations of the global state [18]. That challenge
can be tackled developing static analysis techniques and type disciplines [2, 20]
for SKC. Another direction concerns programming models, which should give to
programmers an overview over the overall logic of the distributed functions and
capture the loosely-consistent execution model of serverless [18]. Choreographic
8 Gabbrielli et al.

Programming [10, 31] is a promising candidate for that task, as choreographies are
designed to capture the global interactions in distributed systems [26], and recent
results [9, 11, 14] confirmed their applicability to microservices [12], a neighbour-
ing domain to that of serverless architectures. Other possible research directions,
that we do not discuss for space constraints, include monitoring, various kinds
of security analysis including “self-DDoS attacks” [25, 27, 36] and performance
analysis. This last one is particularly relevant in the per-usage model of serverless
architectures, yet requires to extend SKC with an explicit notion of time in order
to support quantitative behavioural reasoning for timed systems [7, 8].

Bibliography
[1] Alan Williams. Tailor - the AWS Account Provisioning Service.
https://github.com/alanwill/aws-tailor. Online; acc. 02/2019.
[2] D. Ancona et al. Behavioral types in programming languages. Foundations
and Trends in Programming Languages, 3(2-3):95–230, 2016.
[3] Apache. OpenWhisk. https://github.com/apache/incubator-openwhisk.
Online; acc. 02/2019.
[4] AWS. Lambda. https://aws.amazon.com/lambda/. Online; acc. 02/2019.
[5] H. C. Baker Jr and C. Hewitt. The incremental garbage collection of
processes. In ACM Sigplan Notices, volume 12(8), pages 55–59. ACM, 1977.
[6] I. Baldini et al. Serverless computing: Current trends and open problems.
In Research Advances in Cloud Computing, pages 1–20. Springer, 2017.
[7] T. Brengos and M. Peressotti. A uniform framework for timed automata.
In CONCUR, volume 59 of LIPIcs, pages 26:1–26:15. Schloss Dagstuhl -
Leibniz-Zentrum fuer Informatik, 2016.
[8] T. Brengos and M. Peressotti. Behavioural equivalences for timed systems.
Logical Methods in Computer Science, 15(1), 2019.
[9] M. Carbone and F. Montesi. Deadlock-freedom-by-design: multiparty asyn-
chronous global programming. In POPL, pages 263–274. ACM, 2013.
[10] L. Cruz-Filipe and F. Montesi. A core model for choreographic programming.
In FACS, Lecture Notes in Computer Science, pages 17–35, 2016.
[11] M. Dalla Preda et al. Dynamic choreographies: Theory and implementation.
Logical Methods in Computer Science, 13(2), 2017.
[12] N. Dragoni et al. Microservices: Yesterday, today, and tomorrow. In Present
and Ulterior Software Engineering., pages 195–216. Springer, 2017.
[13] ECMAScript. Ecmascript 2018 language specification. http://ecma-
international.org/ecma-262/9.0/index.html. Online; acc. 02/2019.
[14] S. Giallorenzo, F. Montesi, and M. Gabbrielli. Applied choreographies. In
FORTE, pages 21–40. Springer, 2018.
[15] B. Goetz et al. Java concurrency in practice. Pearson Education, 2006.
[16] Google. Cloud Functions. https://cloud.google.com/functions. Online;
acc. 02/2019.
[17] R. H. Halstead Jr. Multilisp: A language for concurrent symbolic com-
putation. ACM Transactions on Programming Languages and Systems
(TOPLAS), 7(4):501–538, 1985.
A formal model for serverless computing 9

[18] J. M. Hellerstein et al. Serverless computing: One step forward, two steps
back. In CIDR. www.cidrdb.org, 2019.
[19] S. Hendrickson et al. Serverless Computation with OpenLambda. In USENIX.
USENIX Association, 2016.
[20] H. Hüttel et al. Foundations of session types and behavioural contracts.
ACM Comput. Surv., 49(1):3:1–3:36, 2016.
[21] IBM. Cloud Functions. https://www.ibm.com/cloud/functions. Online;
acc. 02/2019.
[22] Iron.io. IronFunctions. https://open.iron.io. Online; acc. 02/2019.
[23] A. Jangda et al. Formal foundations of serverless computing. CoRR,
abs/1902.05870, 2019. URL http://arxiv.org/abs/1902.05870.
[24] E. Jonas et al. Cloud programming simplified: A berkeley view on serverless
computing. Technical report, EECS Department, University of California,
Berkeley, Feb 2019.
[25] K-Optional Software. Serverless out of Control.
https://koptional.com/2019/01/22/serverless-out-of-control/. On-
line; acc. 02/2019.
[26] N. Kavantzas, D. Burdett, G. Ritzinger, and Y. Lafon. Web services choreo-
graphy description language version 1.0, W3C candidate recommendation.
Technical report, W3C, 2005. http://www.w3.org/TR/ws-cdl-10.
[27] Kevin Vandenborne. Serverless: A lesson learned. The hard
way. https://sourcebox.be/blog/2017/08/07/serverless-a-lesson-
learned-the-hard-way/. Online; acc. 02/2019.
[28] L. Lamport. How to make a multiprocessor computer that correctly executes
multiprocess programs. IEEE Trans. Comput., 28(9):690–691, Sept. 1979.
[29] Microsoft. Azure Functions. https://azure.microsoft.com/services/func-
tions. Online; acc. 02/2019.
[30] R. Milner. A Calculus of Communicating Systems, volume 92 of Lecture
Notes in Computer Science. Springer, 1980.
[31] F. Montesi. Kickstarting choreographic programming. In WS-FM/BEAT,
pages 3–10. Springer, 2015.
[32] J. Niehren, J. Schwinghammer, and G. Smolka. A concurrent lambda
calculus with futures. Theor. Comput. Sci., 364(3):338–356, 2006.
[33] C. H. Papadimitriou. The serializability of concurrent database updates. J.
ACM, 26(4):631–653, 1979.
[34] D. Sangiorgi and D. Walker. The Pi-Calculus - a theory of mobile processes.
Cambridge University Press, 2001. ISBN 978-0-521-78177-0.
[35] M. Summerfield. Python in practice: create better programs using concur-
rency, libraries, and patterns. Addison-Wesley, 2013.
[36] Tom Wright. Beware “RunOnStartup” in Azure Functions — a
serverless horror story. http://blog.tdwright.co.uk/2018/09/06/beware-
runonstartup-in-azure-functions-a-serverless-horror-story/. Online;
acc. 02/2019.
[37] A. Williams. C++ concurrency in action. Manning, 2017.

You might also like