Sorbonne Universités,
Université Pierre et Marie Curie (Paris 06),
UMR 7606, Laboratoire d’Informatique de Paris 6,
4 Place Jussieu, 75005 Paris, France
jeremie.salvucci@lip6.fr
emmanuel.chailloux@lip6.fr
1 Introduction
Resource consumption analysis started in the mid-1970s with METRIC[8], which
targeted the worst-case execution time of programs written in a pure subset of
Lisp. Based on recurrence relations, it can be adapted to memory consumption
analysis. Since then, new methods have emerged from both the purely functional
and the imperative communities. Recently, two powerful type systems have been
proposed to obtain an upper bound on the memory allocated by purely functional
programs: sized types[5], used in the Embounded project developing HUME[7], and
automatic amortized analysis for the RAML[4] language. Neither requires any
action from the programmer to perform its analysis. This comes at a cost: only
linear and polynomial bounds can be inferred. Aside from type systems, two
projects target the Java language: COSTA[1], which relies on recurrence
relations, and JConsume[2], which is based on invariants on iteration spaces
that can be inferred in some cases but have to be provided by the programmer in
complex ones.
expressions                                    description
⟨e⟩ ::= ()                                     unit
      | ⟨b⟩                                    booleans
      | ⟨n⟩                                    integers
      | ⟨x⟩                                    variables
      | fun x → ⟨e⟩ @ ρ                        functions
      | ⟨e_0⟩ ⟨e_1⟩                            function application
      | if ⟨e_c⟩ then ⟨e_t⟩ else ⟨e_f⟩         conditional
      | let ⟨x⟩ = ⟨e_a⟩ in ⟨e_b⟩               variable binding
      | let rec ⟨f⟩ ⟨x⟩ = ⟨e_b⟩ in ⟨e⟩         recursive binding (function only)
      | (⟨e_a⟩, ⟨e_b⟩) @ ⟨ρ⟩                   couple construction
      | Π_i ⟨e⟩                                couple projections
      | ref ⟨e⟩ @ ⟨ρ⟩                          reference
      | ⟨e⟩ := ⟨e⟩                             assignment
      | !⟨e⟩                                   dereference
      | newrgn ()                              new region primitive
      | aliasrgn ρ in ⟨e⟩                      sharing a region handler
      | freergn ρ                              free region primitive
Fig. 1. Expressions
Fig. 2. Types
$$C;\ \Gamma \vdash e : \tau;\ C'$$
where C is a set of capabilities and Γ a typing environment. It reads as
follows: “given a set of capabilities C and a typing environment Γ, the
expression e has type τ and returns a set of capabilities C′”.
There are two kinds of variables: those bound to stack values and those
bound to region-allocated values. RVar is the rule for the latter. The type of
such an expression (see figure 2) contains a region name, and the rule checks
that access to the corresponding region is still sound. The ⊕ operator is a set
union with some constraints over linear capabilities.
$$\frac{\Gamma(x) = (\tau, r) \qquad C = r^{q} \oplus C'}{C;\ \Gamma \vdash x : (\tau, r);\ C}\ (\textsc{RVar})$$
Typing a function requires planning for future calls. For instance, if some free
variables are captured, then the relevant set of capabilities has to be
presented at each call site. To perform this verification, the arrow type is
annotated with C_in and C_out: C_in is the set of capabilities needed to
evaluate the function body, and C_out is the new set of capabilities once the
evaluation is over. Moreover, we need to be sure that the closure does not
capture any linear capability; the predicate unrestricted checks this.
The application rule, App, follows directly from the function rule. At each
call site, we check that C contains C_in, the set of capabilities required to
evaluate the function body. C_out represents the capabilities that are available
after typing the function body. Here ≤ can be seen as a subtyping relation: C_v
has to allow at least as many operations as C_in.
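$$\frac{C;\ \Gamma \vdash e_f : \tau_x \xrightarrow[C_{out}]{C_{in}} \tau;\ C_f \qquad C_f;\ \Gamma \vdash e_v : \tau_x;\ C_v \qquad C_v \le C_{in}}{C;\ \Gamma \vdash e_f\ e_v : \tau;\ C_v - (C_{in} - C_{out}) + (C_{out} - C_{in})}\ (\textsc{App})$$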
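As a worked illustration of the capability arithmetic in the conclusion of App, consider two hypothetical call sites. The region names r_a, r_b and r_c are made up for the example, and − and + are read here as plain set difference and set union, which is a simplifying assumption since the constraints attached to linear capabilities are not spelled out here.

$$
\begin{aligned}
&\text{callee frees } r_b: && C_v = \{r_a^1, r_b^1\},\quad C_{in} = \{r_b^1\},\quad C_{out} = \emptyset \\
& && C_v - (C_{in} - C_{out}) + (C_{out} - C_{in}) = \{r_a^1, r_b^1\} - \{r_b^1\} + \emptyset = \{r_a^1\} \\[4pt]
&\text{callee creates } r_c: && C_v = \{r_a^1, r_b^1\},\quad C_{in} = \emptyset,\quad C_{out} = \{r_c^1\} \\
& && C_v - (C_{in} - C_{out}) + (C_{out} - C_{in}) = \{r_a^1, r_b^1\} - \emptyset + \{r_c^1\} = \{r_a^1, r_b^1, r_c^1\}
\end{aligned}
$$

In both cases the capabilities consumed by the callee disappear from the caller's set and the ones it produces are added to it.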
$$\frac{r \notin C}{C;\ \Gamma \vdash \texttt{newrgn}\ () : \mathsf{hnd}\ r;\ r^{1} \oplus C}\ (\textsc{New})$$
Duplicating a capability is sometimes necessary, especially when the same region
handler has to be passed twice to a function. The aliasrgn primitive serves this
purpose. Leaving its scope requires some checks to ensure correctness.
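$$\frac{r^{+} \oplus C;\ \Gamma \vdash e : \tau;\ r^{+} \oplus C'}{r^{1} \oplus C;\ \Gamma \vdash \texttt{aliasrgn}\ \rho\ \texttt{in}\ e : \tau;\ r^{1} \oplus C'}\ (\textsc{Alias})$$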
The most interesting rule for the analysis is Free. Here, linearity is the
important part: it ensures that the region handler is not shared. Hence, the
corresponding region can be freed in a sound way.
$$\frac{\Gamma(\rho) = \mathsf{hnd}\ r}{r^{1} \oplus C;\ \Gamma \vdash \texttt{freergn}\ \rho : \mathsf{unit};\ C}\ (\textsc{Free})$$
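The way New, Alias and Free thread capabilities can be mimicked by a small OCaml sketch. It is only an illustrative model, not the type checker of the paper: capability sets are represented as maps from region names to a multiplicity, and the names `mult`, `new_region`, `alias_region`, `free_region` and `Type_error` are all assumptions introduced for this example.

```ocaml
(* Illustrative model only: a capability set maps a region name to its
   multiplicity. None of these names come from the paper. *)
module Caps = Map.Make (String)

type mult = Linear | Shared (* stands for r^1 and r^+ respectively *)

type caps = mult Caps.t

exception Type_error of string

(* New: r must not already be in C; the result gains the linear capability. *)
let new_region r (c : caps) : caps =
  if Caps.mem r c then raise (Type_error (r ^ " is not fresh"))
  else Caps.add r Linear c

(* Alias: r^1 becomes r^+ while the body is checked, and r^+ must still be
   present on exit so that r^1 can be restored. *)
let alias_region r (body : caps -> caps) (c : caps) : caps =
  match Caps.find_opt r c with
  | Some Linear -> (
      let c' = body (Caps.add r Shared c) in
      match Caps.find_opt r c' with
      | Some Shared -> Caps.add r Linear c'
      | _ -> raise (Type_error (r ^ " was not preserved by the alias body")))
  | _ -> raise (Type_error (r ^ " must be linear to be aliased"))

(* Free: only a linear capability can be consumed. *)
let free_region r (c : caps) : caps =
  match Caps.find_opt r c with
  | Some Linear -> Caps.remove r c
  | Some Shared -> raise (Type_error (r ^ " is shared and cannot be freed"))
  | None -> raise (Type_error (r ^ " is not live"))

let () =
  (* Two regions are created, then one of them is freed. *)
  let c = Caps.empty |> new_region "r" |> new_region "rr" in
  let c = free_region "rr" c in
  assert (Caps.mem "r" c && not (Caps.mem "rr" c))
```

In this model, attempting to free a region inside the scope of its own alias raises `Type_error`, which mirrors the role linearity plays in the Free rule.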
Hence, we have a set of rules giving information about the memory behavior
of our programs. That is essential to perform a memory consumption analysis.
3 Analysis
The goal of the analysis is to provide, at compile time, an upper bound on the
memory live at run time, taking freed memory into account. To do so, we rely on
the correctness of the type system presented above. The analysis computes the
sizes of the different regions involved symbolically. It proceeds by combining
several parameterized analyses, chosen according to the programming style used
at the function level.
4 Example
The following example shows how the analysis works. We assume that the language
is augmented with lists and pattern matching (this changes neither the language
nor the analysis). The main function is rev_append, which concatenates two lists,
reversing the first one so that it can be tail recursive. This function can be
written in at least two different styles: purely functional and imperative.
This program creates two regions, r and rr, and allocates two lists, ys and xs,
in r and rr respectively. Then, it concatenates xs and ys with rev_append and
frees the region rr.
let r = newrgn () in
let ys = [12; 15; 18] @ r in
let rr = newrgn () in
let xs = [3; 6; 9] @ rr in
let zs = rev_append xs ys r in
freergn rr
The left version is written in a purely functional style; the right version is
imperative and updates the reference rs by side effect throughout the
computation. The effect system captures this difference and allows different
analyses to be performed and combined.
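To make the two styles concrete, here is a minimal sketch in plain OCaml. It carries no region annotations, since OCaml has none, and the names `rev_append_pure` and `rev_append_imp` are illustrative rather than taken from the paper's listings; only the reference name `rs` follows the text.

```ocaml
(* Purely functional style: the accumulator is threaded through the
   tail-recursive calls and nothing is mutated. *)
let rec rev_append_pure xs ys =
  match xs with
  | [] -> ys
  | x :: tl -> rev_append_pure tl (x :: ys)

(* Imperative style: the result is built by side effect on the reference rs,
   which stays local to the function. *)
let rev_append_imp xs ys =
  let rs = ref ys in
  List.iter (fun x -> rs := x :: !rs) xs;
  !rs

let () =
  (* Same inputs as the example program: xs = [3; 6; 9], ys = [12; 15; 18]. *)
  assert (rev_append_pure [3; 6; 9] [12; 15; 18] = [9; 6; 3; 12; 15; 18]);
  assert (rev_append_imp [3; 6; 9] [12; 15; 18] = [9; 6; 3; 12; 15; 18])
```

Both versions allocate one list cell per element of the first list; they differ only in whether a reference is written along the way.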
The first list can be allocated in a region r_a and the second in a region r_b,
but in the end the returned list is allocated in region r_b. This information is
useful to track the different lifespans of the regions involved in the
computations.
Pure functions are analyzed with the automatic amortized analysis. It extracts
a set of constraints from the function and tries to minimize it. In rev_append,
the interesting part is the application of the data constructor Cons. If the
amount of memory available before this application is n, then the constraint
n ≤ K_Cons + n′, where n′ is the amount of memory available afterwards, needs to
be satisfied. K_Cons represents the amount of memory necessary to allocate one
element of a list. In this case, the extracted cost is proportional to the size
of the first list. Pure functions do not act on the program state, hence there
is no information to propagate.
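The claim that the cost is proportional to the size of the first list can be observed with a small OCaml sketch. It counts allocations at run time, whereas the analysis derives the bound statically, so it illustrates the result and not the method. The constant `k_cons` and the function `rev_append_cost` are hypothetical names, and the value 1 is an arbitrary unit.

```ocaml
(* Hypothetical cost of one list cell, in abstract memory units. *)
let k_cons = 1

(* rev_append instrumented to return, next to its result, the amount of
   memory it allocated. *)
let rev_append_cost xs ys =
  let rec go xs acc cost =
    match xs with
    | [] -> (acc, cost)
    | x :: tl -> go tl (x :: acc) (cost + k_cons) (* one Cons per element *)
  in
  go xs ys 0

let () =
  let zs, cost = rev_append_cost [3; 6; 9] [12; 15; 18] in
  assert (zs = [9; 6; 3; 12; 15; 18]);
  (* The cost depends only on the length of the first list. *)
  assert (cost = k_cons * List.length [3; 6; 9])
```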
The imperative version of rev_append is analyzed using invariants on iteration
spaces. Here, the side effect is local to the function. Hence, the amount of
allocated memory is the only information propagated. The invariant is
length !rs = size xs + size ys, where size ys is a constant. It is linear and
could be obtained automatically. In other cases, we would rely on programmer
annotations. From this invariant, we can deduce the amount of memory allocated.
In this case, it is also proportional to the length of the first list.
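The invariant can be checked dynamically on a loop-based OCaml sketch. The function `rev_append_checked` and the cursor `rest` are hypothetical; the per-iteration assertion is one possible generalization of the stated relation, introduced here as an assumption, while the assertion at loop exit is the relation given in the text.

```ocaml
let rev_append_checked xs ys =
  let rs = ref ys in
  let rest = ref xs in (* elements of xs not yet processed *)
  while !rest <> [] do
    (match !rest with
     | x :: tl ->
         rs := x :: !rs;
         rest := tl
     | [] -> ());
    (* Assumed loop invariant: cells already in rs plus elements still to
       process always add up to size xs + size ys. *)
    assert (List.length !rs + List.length !rest
            = List.length xs + List.length ys)
  done;
  (* Relation stated in the text, which holds at loop exit. *)
  assert (List.length !rs = List.length xs + List.length ys);
  !rs

let () =
  assert (rev_append_checked [3; 6; 9] [12; 15; 18] = [9; 6; 3; 12; 15; 18])
```

Since size ys is fixed before the loop starts, the number of cells allocated by the loop is exactly size xs.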
In this example, both analyses return similar results. We can then instantiate
the symbolic expressions to obtain the amount of memory necessary to execute the
program safely. Here, the region rr is freed at the end. If the program were
larger, this region would be considered non-existent when analyzing the rest of
the allocated memory.
5 Conclusion
This analysis allows us to treat programs mixing functional and imperative
features by combining analyses that work well on each style. An effect system
discriminates between these styles at the function level: automatic amortized
analysis is applied to pure functions and invariants on iteration spaces to
imperative ones. These analyses can be combined in a sound way thanks to a type
system based on regions. The regions give us the locations of side effects and
the lifespans of the regions involved in a computation.
Correctness of this analysis relies on the correctness of the type system, which
has been proved through progress and preservation lemmas. To validate this
approach in practice, a prototype is currently being developed.
References
1. Albert, E., Arenas, P., Genaim, S., Puebla, G.: Closed-form upper bounds in static
cost analysis. J. Autom. Reasoning 46(2), 161–203 (2011)
2. Braberman, V.A., Garbervetsky, D., Hym, S., Yovine, S.: Summary-based inference
of quantitative bounds of live heap objects. Sci. Comput. Program. 92, 56–84 (2014)
3. Fluet, M., Morrisett, G., Ahmed, A.J.: Linear Regions Are All You Need. In:
Proceedings of the 15th European Symposium on Programming. pp. 7–21. ESOP 06
(2006)
4. Hofmann, M., Jost, S.: Type-Based Amortised Heap-Space Analysis. In: Proceedings
of the 15th European Symposium on Programming. pp. 22–37. ESOP 06 (2006)
5. Hughes, J., Pareto, L., Sabry, A.: Proving the correctness of reactive systems using
sized types. In: Proceedings of the 23rd Symposium on Principles of Programming
Languages, POPL 96. pp. 410–423 (1996)
6. Tofte, M., Talpin, J.P.: Implementation of the Typed Call-by-value λ-calculus
using a Stack of Regions. In: Proceedings of the 21st Symposium on Principles of
Programming Languages. pp. 188–201. POPL 94 (1994)
7. Vasconcelos, P.: Space cost analysis using sized types. Ph.D. thesis, University of St
Andrews (2008)
8. Wegbreit, B.: Mechanical program analysis. Commun. ACM 18(9), 528–539 (Sep
1975)