
Take-home Exam in Advanced Programming

Deadline: Friday, 10 November 2023 at 15:00

Version 1.0

Preamble
This is the exam set for the individual, written take-home exam on the course Advanced
Programming, B1-2023. This document consists of 27 pages; make sure you have them all.
Please read the entire preamble carefully.
The exam consists of 2 questions. Your solution will be graded as a whole, on the 7-point
grading scale, with an external examiner. The questions each count for 50%. However, note
that you must have both some non-trivial working Haskell and Erlang code to get a passing
grade.
In the event of errors or ambiguities in an exam question, you are expected to state your
assumptions as to the intended meaning in your report. You may ask for clarifications
in the discussion forum on Absalon, but do not expect an immediate reply. If there is
no time to resolve a case, you should proceed according to your chosen (documented)
interpretation.

What To Hand In
To pass this exam you must hand in both a report and your source code:

• The report should be around 5–10 pages, not counting appendices, presenting (at
least) your solutions, reflections, and assumptions, if any. The report should contain
all your source code in appendices. The report must be a PDF document.
• The source code should be in a .ZIP file called code.zip, archiving one directory called
code, and following the structure of the handout skeleton files.

Make sure that you follow the format specifications (PDF and .ZIP). If you don’t, the hand-in
will not be assessed and will be treated as a blank hand-in. The hand-in is done via the Digital
Exam system (eksamen.ku.dk).

Learning Objectives
To get a passing grade you must demonstrate that you are both able to program a solution
using the techniques taught in the course and write up your reflections and assessments of

Advanced Programming DIKU, B1-2023/2024

your own work.

• For each question your report should give an overview of your solution, including
an assessment of how good you think your solution is and on which grounds you
base your assessment. Likewise, it is important to document all relevant design
decisions you have made.

• In your programming solutions emphasis should be on correctness, on demonstrating
that you have understood the principles taught in the course, and on clear separation
of concerns.

• It is important that you implement the required API, as your programs might be
subjected to automated testing as part of the grading. Failure to implement the correct
API may influence your grade.

• To get a passing grade, you must have some non-trivial working code in both Haskell
and Erlang. (This is a necessary, but not sufficient, condition.)

Exam Fraud
This is a strictly individual exam; thus you are not allowed to discuss any part of the exam
with anyone on, or outside the course. Submitting answers (code and/or text) you have
not written entirely by yourself, or sharing your answers with others, is considered exam
fraud.
You are allowed to ask (not answer) how an exam question is to be interpreted on the
course discussion forum on Absalon. That is, you may ask for official clarification of what
constitutes a proper solution to one of the exam problems, if this seems either underspecified
or inconsistently specified in the exam text. But note that this permission does not extend to
discussion of specific solution approaches or strategies, whether concrete or abstract.
This is an open-book exam, and so you are welcome to make use of any reading material
from the course, or elsewhere. However, make sure to use proper and specific citations for
any material from which you draw considerable inspiration – including what you may find
on the Internet, such as snippets of code. Similarly, if you reuse any significant amount of
code from the course assignments that you did not develop entirely on your own, remember to
clearly identify the extent of any such code by suitable comments in the source. Remember
also that AI-based coding tools (e.g., ChatGPT or Copilot) are strictly prohibited.
Also note that it is not allowed to copy any part of the exam text (or supplementary skeleton
files) and publish it on forums other than the course discussion forum (e.g., StackOverflow,
IRC, exam banks, chatrooms, or suchlike), whether during or after the exam, without
explicit permission of the author(s).
During the exam period, students are not allowed to answer questions on the discussion
forum; only teachers and teaching assistants are allowed to answer questions. Since some
students may have been granted additional exam time, the prohibition against discussing
the exam in public extends through Saturday, 11 November.
Breaches of the above policy will be handled in accordance with the Faculty of Science’s
disciplinary procedures.

Question 1: Agner: A concurrency tracer


Concurrency bugs, including especially deadlocks and race conditions, are notoriously
difficult to replicate and diagnose, even in otherwise well-behaved, message-oriented
languages such as Erlang. The problem is that, when multiple concurrent threads are
executing, the relative order of critical events may depend unpredictably on factors such as
the compiler version, CPU load, number of cores available, and other factors that are not
easy to vary, and that might still change in the future. For example, consider the Erlang
top-level expression:

P = self(), spawn(fun () -> P ! a end), P ! b, receive X -> X end.

In most current implementations, this would probably evaluate to b every time it is run.
However, if there had been more code between the spawn and the sending of b, such
as some nontrivial computations, or just timer:sleep(1), the result would be likely to
change to a. In even slightly more complicated examples it quickly becomes infeasible for
humans to consider all the possible execution scenarios and verify that they all lead to an
acceptable outcome.
However, because of the inherent simplicity and cleanliness of the Erlang concurrency
model, and exploiting the expressive power of Haskell with monads, it is quite feasible
to construct a usable concurrency tracer that systematically explores the consequences
of various scheduling choices and enumerates the ultimate possible outcomes, including
step-by-step scenarios of how each outcome could arise. This task is about implementing a
simple such tracer named Agner.

An overview of the Agner language


The syntax and semantics of Agner are closely based on Erlang, though quite a bit simpler.
Mostly, Agner programs will have the same meanings as in Erlang, but there are some
important differences. This overview presupposes a working knowledge of Erlang.

The sequential core

Like Erlang, Agner is an evaluation-oriented language, with a single-assignment store.
The following expression forms are available (the exact syntactic and lexical rules will be
specified later):

Atoms An atom is either a simple identifier starting with a lowercase letter, such as hello,
or an (almost) arbitrary sequence of characters enclosed in single quotes, such as
'Hello, world!'. The quotes are not part of the atom name, so that 'hello' and
hello denote exactly the same atom. Atoms evaluate to themselves.

Variables Variables are simple identifiers starting with an uppercase letter or underscore.
A variable can have a current value in the store, in which case it will evaluate to
that value; or it can be unbound, in which case evaluating it signals an exception
{badvar, 'X'}, where X is the name of the variable. (In Agner, unlike in Erlang,
unbound variables are detected only at runtime, rather than before execution.)
Variables get values by pattern-matching (see below); once a variable has a value,
that value cannot change.

Tuples A tuple expression is a comma-separated, possibly empty, sequence of expressions,
enclosed in curly braces, such as {foo,X,{}}. Like in Erlang, such an expression
simply evaluates to the tuple of values that its elements evaluate to.

Lists A list expression is either [] or of the form [Exp1 | Exp2]. Like in Erlang, the
generalized form [Exp1, ..., Expn] is syntactic sugar for

[Exp1 | [Exp2 | ... [Expn | []]]]

and [Exp1, ..., Expn | ExpT] for

[Exp1 | [Exp2 | ... [Expn | ExpT]]]

(As usual there is no requirement that ExpT must evaluate to a list.) In Agner,
unlike in Erlang, lists are actually not a separate data type, but just an abbreviation,
with [] representing the atom '[]', and [Exp1 | Exp2] representing the tuple
{'[|]', Exp1, Exp2}. For example, [a] is just a more convenient notation for
{'[|]', a, '[]'}.
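
The desugaring of list notation described above can be sketched in Haskell as follows. This is only an illustration under stated assumptions: the Exp type is a cut-down stand-in with just the constructors needed here, and the helper name desugarList is hypothetical, not part of any required API.

```haskell
import Data.Maybe (fromMaybe)

-- Cut-down stand-in for the expression type (assumption: only these
-- constructors are needed to illustrate list desugaring).
data Exp = EAtom String | EVar String | ETup [Exp]
  deriving (Eq, Show)

atomNil, atomCons :: String
atomNil  = "[]"
atomCons = "[|]"

-- | Desugar [e1, ..., en | tl] into nested '[|]' tuples.
-- 'Nothing' as the tail means the list ends in the atom '[]'.
-- (desugarList is a hypothetical helper name.)
desugarList :: [Exp] -> Maybe Exp -> Exp
desugarList elems mtail = foldr cons end elems
  where
    cons h t = ETup [EAtom atomCons, h, t]
    end      = fromMaybe (EAtom atomNil) mtail
```

With these definitions, desugarList [EAtom "a"] Nothing yields the tuple corresponding to {'[|]', a, '[]'}, and an empty element list with no tail yields the atom '[]'.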

Patterns and matches A pattern, like in Erlang, is a restricted expression built out
of atoms, variables, and tuples (with list-notation desugared like for expressions).
Additionally, the wildcard pattern _ acts like an anonymous variable (with each
occurrence distinct). For example, {a, X, [H|_]} is a pattern. Also like in Erlang,
and unlike in Haskell, a variable may occur multiple times in a pattern, as in {X,X}.
A match expression is of the form Pat = Exp. When evaluated, it evaluates Exp to a
value and checks that this value is of the shape described by the pattern, possibly
binding variables occurring in the pattern. It also returns the value. For example,
{a, X, [H|_]} = {a, b, [foo, bar, baz]} will succeed and bind X to b and H to
foo. If the pattern does not match the value, the exception {badmatch,V} is thrown,
where V is the value that the expression evaluated to.
If a variable in the pattern is already bound, the value it would be (re)bound to must be
the same as its current value. For example, after X = a, the pattern {X,b} matches
{a,b}, but not {c,b}. In particular, {X,X} = {Exp1,Exp2} (where X is previously
unbound), succeeds if and only if Exp1 and Exp2 evaluate to the same value.
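
The matching rules above (wildcards, repeated variables, and already-bound variables) can be illustrated by a small Haskell sketch. It rests on simplifying assumptions: values are restricted to atoms and tuples so that equality can be derived, the environment is a plain association list, and the function name match is hypothetical.

```haskell
import Control.Monad (foldM)

-- Simplified values (assumption: no closures or pids, so Eq can be derived).
data Value = VAtom String | VTup [Value]
  deriving (Eq, Show)

data Pat = PAtom String | PVar String | PTup [Pat] | PWild
  deriving (Eq, Show)

type Env = [(String, Value)]

-- | Match a pattern against a value, threading the environment left to
-- right. Nothing signals a mismatch; the caller's environment is untouched.
match :: Pat -> Value -> Env -> Maybe Env
match PWild _ env = Just env
match (PAtom a) (VAtom b) env
  | a == b = Just env
match (PVar x) v env =
  case lookup x env of
    Nothing -> Just ((x, v) : env)      -- unbound: bind it now
    Just v'
      | v' == v   -> Just env           -- bound to the same value already
      | otherwise -> Nothing            -- bound to a different value
match (PTup ps) (VTup vs) env
  | length ps == length vs = foldM (\e (p, v) -> match p v e) env (zip ps vs)
match _ _ _ = Nothing
```

Because the environment is threaded through the tuple elements in order, the second occurrence of X in {X,X} is checked against the binding made by the first.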

Sequences A sequence expression, often referred to as a body, is of the form

Exp1, ..., Expn

(where 𝑛 ≥ 1). It evaluates all of the expressions in turn (possibly binding variables
in matches), and returns the value of the last one. (If any of the expressions signal an
exception, any bindings are undone.) An expression sequence can appear anywhere
in an expression by enclosing it as begin Exp1, ..., Expn end, but many places
in the grammar also allow sequencing without an explicit begin.

Note that in Agner, unlike in Erlang, it is explicitly specified that all subexpressions
within a single expression are also evaluated from left to right. This means that, e.g.,
{X=a,X} evaluates to {a,a}, while {X,X=a} signals that X is unbound. (In Erlang,
both expressions are statically illegal.)
Case analysis A case expression is of the form
case Exp of Pat1 -> Body1; ...; Patn -> Bodyn end
(Erlang’s case guards are not supported.) Like in Erlang, first Exp is evaluated to a
value V, which is then matched against all the patterns in turn. If V matches some
Pati (possibly binding some variables), the corresponding Bodyi is evaluated and
its result becomes the result of the case-expression. (Any variables bound by the
expression, pattern-match, or body also remain bound after the case-expression.) If
none of the patterns match, the case-expression signals the exception {badcase,V}.
In Erlang, variables bound by a Bodyi are in general not available after the case-
expression, unless they are bound by all the cases. (Variables that fail this check are
called unsafe, and cause compilation to fail.) Agner, which only detects unbound
variables at runtime, imposes no such restrictions.
Functions A function call is an expression of the form Exp0(Exp1,...,Expn), where
𝑛 ≥ 0. Agner, like Erlang (and most other languages, except Haskell), has an eager
semantics, so first all of the expressions are evaluated (from left to right), and then
the function that Exp0 evaluates to is applied to the list of argument values.
Like in Erlang, there are two kinds of functions in Agner, named and anonymous.
For the former, if Exp0 evaluates to (usually, is already) an atom f, the program must
contain a declaration of a function with that name, of the form,
f(X1, ..., Xn) -> Body.
Then the formal parameters (the X’s) are bound to the values of the corresponding
actual parameters, the function body is evaluated, and its result becomes the result
of the function call. Function declarations may be (mutually) recursive, as usual, and
the functions may be declared in any order.
Note that, unlike in Erlang, there can only be a single clause in a function declaration,
rather than a collection of ;-separated ones. Consequently, the function parameters
must also be simple variable names, not more general patterns. Likewise, all the
parameter names must be distinct in Agner; this is considered a syntactic restriction.
Erlang’s multiple-clause functions can be expressed, slightly more verbosely, by explicit
case-expressions; for example, the Erlang function declaration
equal(X,X) -> true; equal(_,_) -> false.
could be written in Agner as
equal(X,Y) -> case X of Y -> true; _ -> false end.
Agner, like Erlang (and Haskell) is statically scoped. This means that any variables
bound at the place where the function is called are not available in the function body.
For example, with the declaration

foo(X) -> Z = {X,X}, Z.


the expression (sequence) Z = a, X = b, foo(c) will still evaluate to {c,c}, despite
the seemingly conflicting binding for Z. Conversely, in foo(c), {X,Z}, the caller’s X
and Z will remain unbound after the call, unless they already had values.
Unlike in Erlang, a named function can only have a single arity; that is, there cannot
be declarations for both foo(X) -> ... and foo(X,Y) -> ... in the same program.
Also, there are no modules, so all declared functions live in the same name space,
shared with the built-in functions (BIFs). (However, there is a lexical hack for : that
allows for better source-level compatibility with Erlang; see later.)
If the number of actual parameters in a function call does not agree with the number
of formal parameters in the declaration, the exception {badarity, F} is signaled,
where F is the function name. If the atom F is not declared as a function at all, the
exception will instead be {undef, F}. If F is neither an atom nor an anonymous
function value, the exception is {badfun, F}.

The second kind of functions are the anonymous function values, produced by function
expressions, fun (X1, ..., Xn) -> Body end. Calling an anonymous function is
very similar to calling a named one; for example,
F = fun(X) -> {X,X} end, F(a).
would evaluate to {a,a}. There is, however, a subtlety in that – again in accordance
with static scoping – variables that are already bound at the place where the function-
expression is evaluated (not where the resulting functional value is later called) are
also available in the function body. For example,
F = fun(X) -> fun (Y) -> {X,Y} end end, G = F(a), G(b).
would evaluate to {a,b} in both Erlang and Agner. Note that formal parameters
always shadow any existing variables of the same name, so that
X = a, F = fun(X) -> X end, {F(b), X}.
evaluates to {b,a} as expected. However, other variables, even if they are seemingly
“defined” in the function body by the pattern-matching syntax, may clash with already
existing bindings, so that (like in Erlang),
Y = a, F = fun(X) -> Y = X, {X,Y} end, F(b).
would actually signal a match exception. (But had the final call been F(a), the result
would still be {a,a}).
Functional values always compare as equal to themselves. For example, the match in
F = fun(X) -> X end, F = F. would always succeed. It is unspecified whether
two function values produced by different evaluations of identical or similar function
expressions compare as equal. For example, all of these may succeed or fail:
F = fun(X) -> X end, G = fun(X) -> X end, F = G.
F = fun(X) -> X end, G = fun(Y) -> Y end, F = G.
F = fun(X) -> X end, D = a, G = fun(X) -> X end, F = G.

Exceptions Unlike Erlang, Agner does not distinguish formally between various exception
classes, such as errors or thrown exceptions. Any Agner value (though usually
an atom or a tuple starting with an atom) can be signaled as an exception using the
BIF throw, e.g., throw({myErr,{X,Y}}) (where X and Y are already bound). Likewise,
many runtime errors also signal specific exceptions, as mentioned above. Unless it is
caught, an exception aborts evaluation of the entire expression.
Exceptions are caught by try-expressions, with syntax

try Body0 catch Pat1 -> Body1; ...; Patn -> Bodyn end

Like in Erlang, first Body0 is evaluated, and if this succeeds (i.e., does not signal an
exception), the result becomes the result of the whole try (and the handler cases are
ignored). But if Body0 signals an exception, the exception value is matched against
the patterns in turn (just like in a case-expression), and if a match succeeds, the
corresponding handler body is evaluated. (Any exceptions thrown by the handler are
not themselves caught by this try, but may be handled by an outer one.) If none of
the patterns match, the exception propagates to any enclosing handlers, or ultimately
to the top level of the expression.
If any variables were bound by Body0 before the exception was signaled, those
bindings are undone; but any bindings existing before the try are kept. For example,
after

X = a, try Y = b, throw(e) catch e -> Z = c end.

X will be bound to a and Z to c, but Y will be unbound.

The sequential subset of Agner is a reasonably expressive functional language in its own
right. The main difference from Erlang is the lack of any significant library of functions
such as lists:map; however, such functions can easily be defined explicitly using
pattern-matching and recursion. Likewise, but with more effort, rudimentary integer arithmetic
could be coded as a library, either using a unary representation, e.g., representing 3 as
{{{{}}}}, or – far more efficiently – as a list of bits, such as [one,one]. However, the real
interest of Agner is of course in the concurrency features, described next.

Concurrency

Agner shares Erlang’s basic concurrency model: a concurrent system consists of a
collection of processes that communicate only by sending messages to each other. Each
process has an incoming message queue, and once delivered to the queue, all messages will
be processed in the order in which they were delivered. Also, a process may selectively
receive only messages of a certain form, as specified by patterns. The additional expression
forms and BIFs relating to concurrency are as follows:

Process spawning The BIF spawn(F), where F is either a named or an anonymous
function of arity 0, starts a new process computing F(). The ultimate value (or
thrown exception) computed like this is discarded, but while it is running, the process
may engage in communication with other processes. The spawn call returns the
pid (process id) of the newly spawned process. A pid is a special value that can be
bound to variables and tested for equality against other values, but that cannot be
otherwise constructed. A pid is guaranteed to be distinct from any other pid issued in
the same run of the program (i.e. a pid is not reissued even if the process it originally
referenced has died.)
In Agner, pids have a user-readable representation as sequences of numbers <n1.n2...nk>.
The initial process is just <>; and the processes it spawns are called <0>, <1>, etc.
Likewise, process <1> may spawn <1.0>, <1.1>, and so on. But the structured pids
have no semantic significance (such as linking or supervision). Also, it is not possible
to “forge” a pid as a suitably named atom, as in '<1.2>' ! a.
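
The structured pid naming scheme can be sketched as below. The representation is an assumption made for illustration only (a pid as the path of child indices from the initial process); the names Pid, rootPid, childPid and showPid are hypothetical and need not coincide with the definitions in the handout.

```haskell
import Data.List (intercalate)

-- Assumption: a pid is the path of child indices from the initial process.
newtype Pid = Pid [Int]
  deriving (Eq, Show)

-- | The initial process, rendered as <>.
rootPid :: Pid
rootPid = Pid []

-- | Pid of the k-th child (counting from 0) spawned by the given parent.
childPid :: Pid -> Int -> Pid
childPid (Pid path) k = Pid (path ++ [k])

-- | User-readable rendering such as <1.0>.
showPid :: Pid -> String
showPid (Pid path) = "<" ++ intercalate "." (map show path) ++ ">"
```

Since every process numbers its own children independently, two distinct processes can never produce the same path, which gives the required uniqueness within a run.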
Message send A send expression has the syntax ExpP ! ExpM. Here, ExpP must evaluate to
a 𝑃 that is a (possibly defunct) pid (otherwise, {badpid, P} is signaled), while ExpM
can evaluate to any Agner value 𝑀. 𝑀 is delivered to the message queue of the
process 𝑃, if it is still alive; otherwise the message is just discarded. Messages sent by
a process are delivered in the same order in which they were sent, but if two different
processes are sending to the same pid, it is unspecified which one gets delivered first.
In particular, there are no inherent fairness guarantees (e.g., in a denial-of-service
scenario, where one process is uninterruptedly sending messages to a single pid, any
other sends to that pid may never get through.)
Message receive A receive expression is more involved, having the form
receive Pat1 -> Body1; ...; Patn -> Bodyn end
Like for case-expressions in Agner, there are no guards, nor is it possible to specify an
after (i.e., timeout) case.
The semantics of the expression is that the process first sleeps until its message queue
becomes non-empty. Then, it takes the first message from the queue and matches
it against all the patterns in turn, like for a case. But if none of the patterns match,
the message is not immediately put back on the queue, since trying to match it again
against the same patterns would be pointless. Instead, the message is stashed in an
internal list inside the process, and the next message from the incoming queue is
examined. (If the queue has become empty, the process goes back to sleep.) Once
a message is received that matches one of the patterns, the process first unstashes
its internal list back onto the front of the message queue (preserving the original
order), so that the next receive-expression will examine all the messages again from
the oldest one.
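
The stash-and-unstash behaviour of receive can be sketched as a pure function over a message queue. This is a simplification under stated assumptions: the patterns are abstracted into a single accept function, the queue is a plain list with the oldest message first, and "going back to sleep" is modelled by returning Nothing. The name selectiveReceive is hypothetical.

```haskell
-- | Scan the queue (oldest first) for the first message the matcher accepts.
-- Non-matching messages are stashed; on success the stash is put back in
-- front of the remaining queue, preserving the original order.
-- Nothing means no queued message matched (the process would sleep).
selectiveReceive :: (msg -> Maybe r) -> [msg] -> Maybe (r, [msg])
selectiveReceive accept = go []
  where
    go _ [] = Nothing
    go stash (m : rest) =
      case accept m of
        Just r  -> Just (r, reverse stash ++ rest)
        Nothing -> go (m : stash) rest   -- stash holds newest first
```

For a queue holding a, b, c and a matcher that only accepts b, the result is b together with the remaining queue a, c, so a later receive again starts from the oldest message.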
Self-identification A process can find out its own pid (typically to give it to someone else)
by the BIF self().
Logging A final, Agner-specific BIF (but roughly corresponding to Erlang’s io:format/2,
when used for debugging information, though without any actual formatting
functionality) is log(V), where V can be an arbitrary value. This adds (a textual
representation of) V to the end of an event log that can be printed once the program
stops (whether normally or by a deadlock). Also, the Agner system itself is allowed
to add its own messages to the log; such messages will be specially tagged so as not
to interfere with the user’s. The log function always returns the atom ok.

Log entries produced by the same process are recorded in the order in which they
are made, but messages from different processes that are not causally related may be
arbitrarily interleaved.

For simplicity, Agner does not support preemptive multi-threading: a well-behaved process
is expected to eventually perform a send or receive, at which time a different process may
be scheduled. Thus, processes spinning in a tight loop may cause the whole system to grind
to a halt. Also, “fork bombs” (i.e., a process spawning unbounded numbers of subprocesses),
like infinite recursion, are not handled gracefully in the basic version.

Examples
We can use the system to run the example from the introduction

$ agner "P = self(), spawn(fun () -> P ! a end), P ! b, receive X -> X end."
===== result : value b (with deterministic scheduler)

This (coincidentally) agrees with the standard Erlang implementation. However, we can
also explore the space of possibilities using the backtracking scheduler, which systematically
tries all possible interleavings of runnable processes:

$ agner -bt "P = self(), spawn(fun () -> P ! a end), P ! b, receive X -> X end."
===== result : value b (1 scenarios)
===== result : value a (1 scenarios)

If it seems puzzling how some of the final results can arise, we can ask the system to print
out its detailed event log leading to a particular outcome. (If there are multiple ways to
reach a particular result, only one, representative, scenario is shown.)

$ agner -sl -bt "P = self(), spawn(fun () -> P ! a end), P ! b, receive X -> X end."
===== result : value b (1 scenarios)
system: Process <> spawning process <0>
system: selecting #1 of 2 ready processes
system: process <> sending b to self
system: Process <> inspecting b
system: Process <> unstashed 0 messages
system: Process <> terminated with value b
system: finished!
===== result : value a (1 scenarios)
system: Process <> spawning process <0>
system: selecting #2 of 2 ready processes
system: process <0> sending a to ready <>
system: Process <0> terminated with value a
system: process <> sending b to self
system: Process <> inspecting a
system: Process <> unstashed 0 messages
system: Process <> terminated with value a
system: finished!

Note that the actual contents of the system log is not specified; you can be as chatty or silent
as you prefer, and the exact phrasing of particular events is also entirely up to you.
For a slightly larger example, we may define some helper functions to be used in the
top-level expression. We also select a Monte-Carlo scheduler that makes random scheduling
choices. (Newlines are added in the command line for readability):

$ agner -mc 1000
    -p "r() -> receive X -> X end.
        ss(P,X) -> spawn (fun () -> P ! X end)."
    "P = self(), ss(P,a), ss(P,b), ss(P,c), {r(), r(), r()}."
===== result : value {b,a,c} (163 / 1000 scenarios)
===== result : value {a,b,c} (173 / 1000 scenarios)
===== result : value {b,c,a} (183 / 1000 scenarios)
===== result : value {c,a,b} (169 / 1000 scenarios)
===== result : value {c,b,a} (157 / 1000 scenarios)
===== result : value {a,c,b} (155 / 1000 scenarios)

Running the tool with no arguments gives a brief usage summary.

The Agner implementation


The Agner system is divided into 4 main modules (as well as a couple of extra ones,
containing shared type definitions and the like), as follows:

• The Parser reads in the concrete syntax of Agner programs and top-level expressions
into abstract syntax trees.

• The Evaluator handles the sequential part of the language. It also contains part
of the support for the concurrency-related features with significant syntax-related
aspects, specifically receive-expressions.

• The Scheduler provides the main support for the concurrent features of the language.
It orchestrates the sequential evaluation of expressions in the individual processes,
as well as handling the communications aspects. It also allows various scheduling
choices to be explored.

• The main program, which contains the top-level stand-alone tool (including command-
line option processing, file I/O, log filtering, and the like).

The main program is already provided, so your task is to implement the remaining three
modules, according to the detailed specifications in the following. The sections are weighted
roughly as indicated below; but as always, the exam is graded as a whole, and the various
modules demonstrate complementary skills.

Program ::= epsilon
          | FDecl Program

FDecl ::= atom "(" Varz ")" "->" Body "."

Top ::= Body "."

Exp ::= atom
      | var
      | "(" Exp ")"
      | "{" Expz "}"
      | "[" ExpL "]"
      | Pat "=" Exp
      | "begin" Body "end"
      | "case" Exp "of" Cases "end"
      | Exp "(" Expz ")"
      | "fun" "(" Varz ")" "->" Body "end"
      | "try" Body "catch" Cases "end"
      | Exp "!" Exp
      | "receive" Cases "end"

Pat ::= atom
      | var
      | "(" Pat ")"
      | "{" Patz "}"
      | "[" PatL "]"
      | "_"

Cases ::= Case
        | Case ";" Cases

Case ::= Pat "->" Body

Body ::= Exps

Expz ::= epsilon
       | Exps

Patz ::= epsilon
       | Pats

Varz ::= epsilon
       | Vars

Exps ::= Exp
       | Exp "," Exps

Pats ::= Pat
       | Pat "," Pats

Vars ::= var
       | var "," Vars

ExpL ::= epsilon
       | Exps
       | Exps "|" Exp

PatL ::= epsilon
       | Pats
       | Pats "|" Pat

Figure 1: Concrete grammar of Agner

Question 1.1: The Parser (40%)


The concrete syntax of Agner programs is given in Figure 1 (in a notation similar to the
parser notes). This grammar is supplemented with the following stipulations:

Disambiguation The operators “=” and “!” have the same precedence and are
right-associative. Function calls have higher precedence than match and send, so that,
e.g., self() ! self() parses as (self()) ! (self()). Also, slightly unusually for
a functional language, function application is non-associative, so that f(a)(b) is illegal
syntax; it must be written as (f(a))(b).

Complex lexical tokens An identifier atom is a non-empty sequence of (Unicode) letters,
digits (0 through 9), underscores (_) and at-signs (@), starting with a lowercase letter. The
following Agner keywords are not allowed as identifiers: begin, case, catch, end, fun,
receive, and try. Other Erlang keywords, such as after, are allowed, though it may be
prudent to avoid them.
For pseudo-compatibility with Erlang, colons (:) are also allowed as identifier characters
(similarly to @), so that one could directly declare and call, e.g., a utility function named
lists:reverse.
Alternatively, a general atom is any (possibly even empty) sequence of characters, enclosed
in single quotes ('). If this sequence is to contain a single-quote or a backslash (\) character,
it must be written as \' or \\, respectively. The escape sequences \n and \t stand for
a newline and tab character; all others stand for just the second character. The outer quotes
are not included in the atom name.
The rules for variable names var are the same as for identifier atoms (including the special
inclusion of :), except that the first character must now be an uppercase letter or an
underscore. (However, it cannot consist of only an underscore, since that is reserved for
the wildcard pattern.) There is no analog of the general-atom syntax for variables.

Whitespace and comments Tokens may be surrounded by arbitrary whitespace, which
is ignored, except for separating tokens that would otherwise run together, such as
consecutive identifiers, variables, and keywords. Comments start with a percent sign
(%) and run until the following newline (or end of file). Comments also act as whitespace
for token separation.
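
The lexical rules above can be sketched with ReadP as follows. This is one possible reading, not a reference solution: the helper names (skipWS, lexeme, identAtom, variable, quotedAtom, runLexer) are hypothetical, the keyword list is copied from the text above, and "letter", "lowercase" and "uppercase" are interpreted through the Data.Char predicates, which is an assumption.

```haskell
import Data.Char (isAlpha, isDigit, isLower, isSpace, isUpper)
import Text.ParserCombinators.ReadP

-- Keywords exactly as listed in the text above.
keywords :: [String]
keywords = ["begin", "case", "catch", "end", "fun", "receive", "try"]

-- Characters allowed after the first character of an identifier.
isIdentChar :: Char -> Bool
isIdentChar c = isAlpha c || isDigit c || c `elem` "_@:"

-- | Skip whitespace and %-comments (which run to end of line or file).
skipWS :: ReadP ()
skipWS = do
  _ <- munch isSpace
  rest <- look
  case rest of
    '%' : _ -> munch (/= '\n') *> skipWS
    _       -> return ()

-- | Run a token parser, then discard trailing whitespace and comments.
lexeme :: ReadP a -> ReadP a
lexeme p = p <* skipWS

-- | Identifier atom: lowercase first character, not a keyword.
identAtom :: ReadP String
identAtom = lexeme $ do
  c  <- satisfy isLower
  cs <- munch isIdentChar
  let name = c : cs
  if name `elem` keywords then pfail else return name

-- | Variable: uppercase letter or underscore first, but not "_" alone.
variable :: ReadP String
variable = lexeme $ do
  c  <- satisfy (\x -> isUpper x || x == '_')
  cs <- munch isIdentChar
  let name = c : cs
  if name == "_" then pfail else return name

-- | General atom in single quotes, with backslash escapes.
quotedAtom :: ReadP String
quotedAtom = lexeme $ between (char '\'') (char '\'') (many quotedChar)
  where
    quotedChar = escaped +++ satisfy (`notElem` "'\\")
    escaped = do
      _ <- char '\\'
      c <- get
      return $ case c of
        'n' -> '\n'
        't' -> '\t'
        _   -> c

-- | Hypothetical test driver: succeed only on a unique, complete parse.
runLexer :: ReadP a -> String -> Maybe a
runLexer p s = case readP_to_S (skipWS *> p <* eof) s of
  [(x, "")] -> Just x
  _         -> Nothing
```

Using munch for identifier characters makes each token maximal, so that, for example, the input casex is read as one atom rather than as the keyword case followed by x.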
The abstract syntax is defined in file AST.hs:

data Exp =
    EAtom AName
  | EVar VName
  | ETup [Exp]
  | EMatch Pat Exp
  | ESeq Exp Exp
  | ECase Exp Cases
  | ECall Exp [Exp]
  | EFun [VName] Exp
  | ETry Exp Cases
  | ESend Exp Exp
  | EReceive Cases

data Pat =
    PAtom AName
  | PVar VName
  | PTup [Pat]
  | PWild

type Cases = [(Pat, Exp)]

type AName = String

atomNil = "[]"
atomCons = "[|]"

type VName = String

data FDecl = FDecl AName [VName] Exp

type Program = [FDecl]

The correspondence between the concrete and abstract syntax should be largely straight-
forward. Note, however, that there is no separate AST type corresponding to Body; instead,
commas used for sequencing correspond to the ESeq constructor.
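
As an illustration of how a Body could be mapped to ESeq, the sketch below folds a non-empty list of expressions into nested ESeq nodes. The nesting direction is an assumption (right-nested here), since the text above does not fix it; the Exp type is a cut-down stand-in and mkSeq is a hypothetical helper name.

```haskell
-- Cut-down stand-in for the expression type (assumption: only what is
-- needed to illustrate sequencing).
data Exp = EAtom String | EVar String | ESeq Exp Exp
  deriving (Eq, Show)

-- | Fold a list of expressions into nested ESeq nodes.
-- Assumption: right-nesting, i.e. e1, e2, e3 becomes ESeq e1 (ESeq e2 e3).
-- An empty body is not allowed by the grammar, hence the Maybe.
mkSeq :: [Exp] -> Maybe Exp
mkSeq [] = Nothing
mkSeq es = Just (foldr1 ESeq es)
```

A single-expression body produces no ESeq node at all, so a body consisting of just the atom a corresponds to EAtom "a".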
The Parser should export two functions, corresponding to the two possible start symbols of
the grammar:

parseProgram :: String -> Either String Program
parseTop :: String -> Either String Exp

Note that the parsing functions should also check that all functions are well formed, i.e.,
that there are no repeated variable names in a formal-parameter list.
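
The well-formedness check on formal-parameter lists can be sketched as a small helper. The name checkParams and the wording of the error message are hypothetical; only the requirement that parameter names be distinct comes from the text.

```haskell
import Data.List (nub)

-- | Accept a formal-parameter list only if no name occurs twice.
-- (checkParams and its error text are illustrative assumptions.)
checkParams :: [String] -> Either String [String]
checkParams vs
  | length (nub vs) == length vs = Right vs
  | otherwise = Left ("repeated parameter name in " ++ show vs)
```

The same check applies both to function declarations and to fun-expressions, since both use the Varz nonterminal.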
For implementing your parser, you may use either the ReadP or the Parsec combinator
library. If you use Parsec, then only plain Parsec is allowed, namely the following
submodules of Text.Parsec: Prim, Char, Error, String, and Combinator (or the compatibility
modules in Text.ParserCombinators.Parsec); in particular you may not use
Text.Parsec.Token, Text.Parsec.Language, or Text.Parsec.Expr. As always, don’t
worry about providing informative syntax-error messages if you use ReadP.
With ReadP’s lack of error messages, it may be a bit challenging to track down the location
of syntax errors in large programs. However, since Agner syntax is very close to Erlang,
you may be able to use the standard Erlang compiler on the failing file to see where the
error is.
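
As an illustration of treating comments as whitespace, here is a minimal ReadP sketch. The helper names skipWS, lexeme, symbol, and parseAll are hypothetical, and the error strings are placeholders; only functions from Text.ParserCombinators.ReadP and Data.Char in base are used. The left-biased choice (<++) keeps the skipping deterministic, which avoids the blow-up that a plain skipMany can cause in ReadP.

```haskell
import Data.Char (isSpace)
import Text.ParserCombinators.ReadP

-- Skip any mix of whitespace and %-comments (hypothetical helper).
skipWS :: ReadP ()
skipWS = do
  _ <- munch isSpace
  -- If a comment follows, consume it and keep skipping; otherwise stop.
  (comment >> skipWS) <++ return ()
  where
    -- A comment runs from '%' up to, but not including, the newline.
    comment = char '%' >> munch (/= '\n')

-- Run a token parser, then skip whatever separates it from the next token.
lexeme :: ReadP a -> ReadP a
lexeme p = p <* skipWS

-- Parse a fixed piece of punctuation as a token.
symbol :: String -> ReadP ()
symbol s = lexeme (string s >> return ())

-- Run a parser on the whole input, allowing leading whitespace/comments.
parseAll :: ReadP a -> String -> Either String a
parseAll p s =
  case readP_to_S (skipWS *> p <* eof) s of
    [(a, "")] -> Right a
    []        -> Left "syntax error"
    _         -> Left "ambiguous parse"
```

Note that symbol is not suitable for keywords or identifiers as it stands, since those additionally need a check that the token is not immediately followed by another identifier character.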


Question 1.2: The Evaluator (35%)


The evaluation model of Agner is value-oriented, where a value is given by the following
datatype (in Runtime.hs):

data Value =
    VAtom AName
  | VTup [Value]
  | VClos Env [VName] Exp
  | VPid Pid

type Env = [(VName,Value)]

type Exn = Value


mkExn s v = VTup [VAtom s, v]

type Outcome = Either Exn Value

The meanings of the constructors VAtom, VTup, and VPid should be obvious. VClos is
a function closure, used for functional values. It consists of the formal parameters and
body of the function expression, together with a copy of the environment at the time the
function-expression was evaluated. An environment is simply a list of variables and their
corresponding values, if any; unbound variables are not mentioned. As usual, only the first
occurrence of a variable in the list matters.
Finally, an Agner exception is simply a value; we also define a convenience function for
constructing exception values in the standard format. Outcome is a convenient abbreviation
for the final result of an evaluation.
(The file also includes a simple pretty-printer showV for rendering values in a more readable
form. To avoid potential confusion, it is not used as the default Show instance. The
concurrency-related definitions in Runtime will be discussed later.)
The central type definition in the Evaluator is the Eval monad:

newtype Eval a = Eval {runEval :: FEnv -> Env -> Req (Either Exn (a, Env))}

type FEnv = [(AName, (Int, [Value] -> Eval Value))]

An Eval-computation may draw upon a number of features of the evaluation context. First,
it has Reader-like access to the function environment, in which all available named functions
(both program-defined and built-in) are defined. The environment binds each such function
name to a pair of the function’s arity and the function itself, represented as a Haskell
function that takes a list of argument values and returns (a computation of) the result value.
Second, an Eval computation receives the current (value) environment, as previously
mentioned. Since expression evaluation may add bindings to the environment (using the
match-operator =), the updated environment is also returned together with the (successful)
computation result.


Since evaluation may fail, the result is actually an Either-type, with the left alternative
being a thrown exception. Note that no updated environment is returned in this case; rather,
the set of available bindings is rolled back to what it was when the failing computation
was started in a try-expression.
Finally, the entire computation is performed in the Req monad, a generalization of the
SimpleIO monad from the lectures. It is used only by the concurrency features discussed
later; so for now, you may think of Req as simply the identity monad.
Your first (minor) task is to complete the Monad instance for Eval. This should not depend
on what Req is, but only on the fact that it is some monad.
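
To illustrate the shape such an instance might take, the following sketch abstracts both FEnv and Req into type parameters (fenv and r), so that it is self-contained and visibly independent of what the inner monad is. This parametrized Eval is an assumption made for the sketch and differs from the two-level definition in the handout; the primitives bind and failWith are hypothetical and exist only to exercise the instance.

```haskell
import Control.Monad (ap, liftM)
import Data.Functor.Identity (Identity (..))

type VName = String
data Value = VAtom String | VTup [Value] deriving (Eq, Show)
type Env = [(VName, Value)]
type Exn = Value

-- Assumption: fenv and r are type parameters standing in for FEnv and Req.
newtype Eval fenv r a =
  Eval { runEval :: fenv -> Env -> r (Either Exn (a, Env)) }

instance Monad r => Functor (Eval fenv r) where
  fmap = liftM

instance Monad r => Applicative (Eval fenv r) where
  pure a = Eval $ \_ env -> return (Right (a, env))
  (<*>) = ap

instance Monad r => Monad (Eval fenv r) where
  m >>= f = Eval $ \fenv env -> do
    res <- runEval m fenv env
    case res of
      Left ex         -> return (Left ex)         -- propagate the exception
      Right (a, env') -> runEval (f a) fenv env'  -- thread the updated env

-- Hypothetical primitive: add a binding without any checks.
bind :: Monad r => VName -> Value -> Eval fenv r ()
bind x v = Eval $ \_ env -> return (Right ((), (x, v) : env))

-- Hypothetical primitive: fail with the given exception value.
failWith :: Monad r => Exn -> Eval fenv r a
failWith ex = Eval $ \_ _ -> return (Left ex)
```

Running such a computation with Identity as the inner monad, e.g. runIdentity (runEval m () []), corresponds to thinking of Req as the identity monad.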
Next, as usual, we define the operations accessing the features of the monad:

askFEnv :: Eval FEnv

getVar :: VName -> Eval Value


setVar :: VName -> Value -> Eval ()
getEnv :: Eval Env
inEnv :: Env -> Eval a -> Eval a

raise :: Exn -> Eval a


handle :: Eval a -> (Exn -> Eval a) -> Eval a

request :: Req a -> Eval a


wrapup :: FEnv -> Eval Value -> Req (Either Exn Value)

askFEnv simply returns the function environment.


getVar 𝑥 returns the binding of variable 𝑥 in the current (value) environment. If 𝑥 is
unbound, the function signals the exception mkExn "unbound" (VAtom 𝑥). Conversely,
setVar 𝑥 𝑣 binds 𝑥 to the value 𝑣, unless 𝑥 is already bound to a different value; in the
latter case, the analogous exception with "bound" is signaled.
getEnv returns the current value environment. The companion operation inEnv 𝑒𝑛𝑣 𝑚
runs the computation 𝑚 in the environment 𝑒𝑛𝑣 (and the unmodified function environ-
ment), and returns its result (value or raised exception). The current environment is not
modified.
raise 𝑒𝑥 signals the exception 𝑒𝑥. Conversely handle 𝑚 ℎ first tries to run the computation
𝑚. If that succeeds, the result is also returned from handle. However, if 𝑚 raises an
exception 𝑒𝑥, the result is given by applying ℎ to 𝑒𝑥.
Finally, request 𝑟 performs the request 𝑟 in the inner Req monad, and returns its result.
And wrapup fenv 𝑚 runs the computation 𝑚 in function environment fenv and the empty
value environment; it returns (as a Req-computation) the final result (value or exception),
and simply discards the final environment.
The remainder of the Evaluator (and the other modules) should not depend on the definition
of the Eval monad, but only rely on the above accessor functions.
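
The intended behavior of the environment-related operations can be sketched in a deliberately simplified model, where the function environment is dropped and Req is taken to be the identity, so that an Eval computation is just a function from an environment to a result. The type synonym below is an assumption made for illustration and is not the handout's newtype; the operation names and exception formats follow the descriptions above.

```haskell
type VName = String
data Value = VAtom String | VTup [Value] deriving (Eq, Show)
type Env = [(VName, Value)]
type Exn = Value

mkExn :: String -> Value -> Exn
mkExn s v = VTup [VAtom s, v]

-- Simplified model (assumption): no FEnv, and Req taken as identity.
type Eval a = Env -> Either Exn (a, Env)

getVar :: VName -> Eval Value
getVar x env = case lookup x env of
  Just v  -> Right (v, env)
  Nothing -> Left (mkExn "unbound" (VAtom x))

-- Binding is allowed if x is unbound or already bound to the same value.
setVar :: VName -> Value -> Eval ()
setVar x v env = case lookup x env of
  Nothing             -> Right ((), (x, v) : env)
  Just v' | v' == v   -> Right ((), env)
          | otherwise -> Left (mkExn "bound" (VAtom x))

getEnv :: Eval Env
getEnv env = Right (env, env)

-- Run m in env', then continue with the caller's environment unchanged.
inEnv :: Env -> Eval a -> Eval a
inEnv env' m env = case m env' of
  Left ex      -> Left ex
  Right (a, _) -> Right (a, env)

raise :: Exn -> Eval a
raise ex _ = Left ex

-- The handler runs in the environment from before m was started.
handle :: Eval a -> (Exn -> Eval a) -> Eval a
handle m h env = case m env of
  Left ex -> h ex env
  ok      -> ok
```

In this model the rollback of bindings on failure comes for free: a failing computation returns no environment, so handle can only continue from the one it started with.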

Using the monad, we first implement a central Agner operation:


match :: Pat -> Value -> Eval ()

match 𝑝 𝑣 attempts to match the value 𝑣 against the pattern 𝑝. If it succeeds, it updates
the environment with any new bindings and returns just (); if it fails, it raises mkExn
"badmatch" v. (This, and subsequent exceptions are as specified in the previous overview
of Agner.)
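
As a rough illustration of the matching rules, here is a pure model of match that threads the environment explicitly instead of using the Eval monad. The types are cut-down stand-ins for those in AST.hs and Runtime.hs. The sketch assumes that every kind of mismatch, including a variable already bound to a different value, is reported as badmatch on the whole value; the Agner overview is authoritative if it specifies otherwise.

```haskell
import Control.Monad (foldM)

type AName = String
type VName = String

data Pat = PAtom AName | PVar VName | PTup [Pat] | PWild
data Value = VAtom AName | VTup [Value] deriving (Eq, Show)

type Env = [(VName, Value)]
type Exn = Value

mkExn :: String -> Value -> Exn
mkExn s v = VTup [VAtom s, v]

-- Pure model: return the extended environment, or a badmatch exception.
match :: Pat -> Value -> Env -> Either Exn Env
match p v env0 = maybe (Left (mkExn "badmatch" v)) Right (go env0 (p, v))
  where
    go :: Env -> (Pat, Value) -> Maybe Env
    go env (PWild, _) = Just env
    go env (PAtom a, VAtom b)
      | a == b = Just env
    go env (PVar x, w) = case lookup x env of
      Nothing             -> Just ((x, w) : env)   -- fresh binding
      Just w' | w' == w   -> Just env              -- consistent rebinding
              | otherwise -> Nothing
    go env (PTup ps, VTup ws)
      | length ps == length ws = foldM go env (zip ps ws)
    go _ _ = Nothing
```

Because the bindings from earlier tuple components are threaded into later ones, a pattern such as {X, X} only matches tuples whose two components are equal.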
Then, we define the bulk of the Evaluator as three mutually recursive functions:

evalExp :: Exp -> Eval Value


apply :: Value -> [Value] -> Eval Value
evalCases :: Cases -> Value -> Eval () -> Eval Value -> Eval Value

The first is the main function for evaluating an expression; it has the obvious functionality.
apply 𝑣_𝑓 𝑣𝑠 calls the function specified by 𝑣_𝑓 (either an atom or an anonymous function
value) on the arguments 𝑣𝑠, raising exceptions for various error conditions, as previously
specified.
evalCases 𝑐𝑠 𝑣 ℎ𝑜𝑜𝑘 𝑛𝑚 matches the value 𝑣 against the patterns in 𝑐𝑠 and evaluates
the corresponding expression in the first matching case. The extra computation ℎ𝑜𝑜𝑘 is
run after the matching has successfully completed, but before the selected body expression.
(If no such computation is relevant, it can be simply specified as return ().) If no case
matches the value, the computation 𝑛𝑚 is run instead; ℎ𝑜𝑜𝑘 is not invoked in this case.
Finally, the Evaluator defines:

fenv0 :: FEnv
evalTop :: Program -> Exp -> Req Outcome

fenv0 is the initial function environment, containing bindings for the BIFs. (A sample
implementation for log is already provided.) You should add the remaining ones. You
may also include additional BIFs, but your black-box tests should not rely on any such
non-standard functions being available. Any definitions in the program override the BIFs,
although overriding a BIF is rarely a good idea.
Finally, evalTop 𝑝𝑔𝑚 𝑒 runs the top-level expression 𝑒 with the function declarations in
𝑝𝑔𝑚. If 𝑒 (and the program functions it calls) do not involve any concurrency features,
the computation should just be the trivial Req-computation of the final result of the
expression.
A fair bit of the evaluator can be implemented without using Req: pattern matching,
case-expressions, functions (both named and anonymous), and exceptions. However, for send,
receive, and spawn, we need to look at Req, which is covered in the next section.

Question 1.3: The Scheduler (25%)


The central interface between the Evaluator and the Scheduler is the Req monad:

data Req a =
    RDone a
  | RLog String (Req a)
  | RSend Pid Msg (Req a)
  | RRcv (Msg -> Req a)
  | RUnstash [Msg] (Req a)
  | RSpawn (Req Outcome) (Pid -> Req a)
  | RSelf (Pid -> Req a)

type Msg = Value

newtype Pid = Pid [Int]

A Req-computation result is either an already computed value (RDone), or a request to
perform some operation, together with a further computation to perform once that operation
has completed. That further computation may also depend on the result of the previous
operation, where relevant. The meanings of the various requests are as follows:

• RLog 𝑠 𝑟 : Add string 𝑠 to the log, then continue with 𝑟 .

• RSend 𝑝 𝑚 𝑟 : Send message 𝑚 to pid 𝑝, then continue with 𝑟 .

• RRcv ℎ: Wait for the requesting process’s message queue to become non-empty, then
retrieve the first (i.e., oldest) message from the queue and pass it to the function ℎ,
which will determine the next request, if any. Note that it becomes the responsibility
of the caller to preserve the retrieved message if it cannot be immediately processed
(for example, if it doesn’t match any of the patterns in the receive that led to the
RRcv request). Normally, such messages are stashed, to be looked at later.

• RUnstash 𝑞 𝑟 : Add the previously stashed messages 𝑞 back to the front of the caller’s
message queue, so that they will be retrieved again. While it is in principle possible to
unstash messages that had not previously been received (i.e., in effect, send a priority
message to oneself) using this mechanism, that is considered poor practice, as it
makes interpreting the event traces confusing.
On the other hand, it is often useful to explicitly unstash an empty queue, to indicate
that the immediately previously retrieved message was in fact accepted for further
processing, and will not need to be considered again.

• RSpawn 𝑟 ℎ: Spawn a new, independent process running request 𝑟 , and pass its pid
to ℎ, which will determine the next request. The spawned process may make its
own requests, including further spawns. When the spawned process makes a RDone
request (meaning: no further requests), the process terminates.

• RSelf ℎ: Pass the caller’s pid to ℎ, which will determine the next request.
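
To make the request structure concrete, the following sketch shows one way a Monad instance for Req might be written, together with a few smart constructors. The Value type is a cut-down stand-in for the one in Runtime.hs, and the names logMsg, send, receive, self, pingSelf, and leadingLogs are hypothetical; the handout's own Runtime module may define things differently.

```haskell
import Control.Monad (ap, liftM)

data Value = VAtom String | VTup [Value] deriving (Eq, Show)
type Msg = Value
type Outcome = Either Value Value
newtype Pid = Pid [Int] deriving (Eq, Show)

data Req a
  = RDone a
  | RLog String (Req a)
  | RSend Pid Msg (Req a)
  | RRcv (Msg -> Req a)
  | RUnstash [Msg] (Req a)
  | RSpawn (Req Outcome) (Pid -> Req a)
  | RSelf (Pid -> Req a)

instance Functor Req where
  fmap = liftM

instance Applicative Req where
  pure = RDone
  (<*>) = ap

-- Binding pushes the continuation f to the end of the request chain.
instance Monad Req where
  RDone a      >>= f = f a
  RLog s r     >>= f = RLog s (r >>= f)
  RSend p m r  >>= f = RSend p m (r >>= f)
  RRcv h       >>= f = RRcv (\m -> h m >>= f)
  RUnstash q r >>= f = RUnstash q (r >>= f)
  RSpawn r h   >>= f = RSpawn r (\p -> h p >>= f)
  RSelf h      >>= f = RSelf (\p -> h p >>= f)

-- Hypothetical smart constructors: one request each, then done.
logMsg :: String -> Req ()
logMsg s = RLog s (RDone ())

send :: Pid -> Msg -> Req ()
send p m = RSend p m (RDone ())

receive :: Req Msg
receive = RRcv RDone

self :: Req Pid
self = RSelf RDone

-- Example: log, ask for own pid, send a message to self, then receive.
pingSelf :: Req Msg
pingSelf = do
  logMsg "starting"
  me <- self
  send me (VAtom "ping")
  receive

-- Collect the log strings emitted before the first non-log request.
leadingLogs :: Req a -> [String]
leadingLogs (RLog s r) = s : leadingLogs r
leadingLogs _          = []
```

Note that pingSelf only builds a description of the requests; nothing happens until a scheduler interprets the resulting structure.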

The Scheduler itself defines two central types:

data Process = Proc Pid (Req Outcome) Int [Msg]


type ProcSys = Either Outcome ([Process], [Process])


A process consists of a pid, a Req-computation of an Outcome, a local counter for naming
spawned subprocesses, and a message queue. Depending on the exact contents of the
request and message queue, the process is said to be in one of three states:

Busy A busy process has a request that can be immediately fulfilled, without requiring
coordination with other (existing) processes, so there is no reason to postpone it.
This covers most of the Req constructors, with the following two exceptions:

Ready A process is said to be ready if it is making a request that could be fulfilled, but
doing so might irrevocably affect the subsequent behavior of the system, so an explicit
choice must be made to perform it now, rather than waiting. The only such situation
is an RSend request, because if two or more processes want to send messages,
there is a choice to be made about which one goes first.

Blocked A blocked, or sleeping, process is one that has made an RRcv request while its message
queue is empty, so that the process must wait until something is sent to it by another
process.
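
The three states can be summarized as a small classification function. The PState type and the function stateOf are hypothetical helpers introduced only for this sketch, the Value type is a cut-down stand-in, and the sketch assumes that a process whose request is RDone counts as busy (it can be handled immediately by terminating it).

```haskell
data Value = VAtom String | VTup [Value] deriving (Eq, Show)
type Msg = Value
type Outcome = Either Value Value
newtype Pid = Pid [Int] deriving (Eq, Show)

data Req a
  = RDone a
  | RLog String (Req a)
  | RSend Pid Msg (Req a)
  | RRcv (Msg -> Req a)
  | RUnstash [Msg] (Req a)
  | RSpawn (Req Outcome) (Pid -> Req a)
  | RSelf (Pid -> Req a)

data Process = Proc Pid (Req Outcome) Int [Msg]

-- Hypothetical helper type; not part of the handout.
data PState = Busy | Ready | Blocked
  deriving (Eq, Show)

-- Classify a process by its pending request and message queue.
stateOf :: Process -> PState
stateOf (Proc _ (RSend _ _ _) _ _) = Ready    -- a send needs a scheduling choice
stateOf (Proc _ (RRcv _) _ [])     = Blocked  -- receive on an empty queue
stateOf _                          = Busy     -- everything else can proceed
```

In particular, an RRcv request with a non-empty message queue is busy, not blocked, since the oldest message can be delivered right away.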

A process system is either completed with a final outcome (the result from the initial process),
or consists of two lists. The first of these is the ready queue, and contains only ready processes.
The scheduler will repeatedly pick a process from this queue and execute its send request. The
simple round-robin scheduler always takes the first process on the queue; other schedulers
may make different choices when there is more than one. If the ready queue should ever become
empty before the initial process has completed, the system is said to be deadlocked.
The second list in a non-completed system contains only the blocked processes. The order
of this list is not significant, since processes will in general be unblocked from the middle
of the list, as they receive messages.
There can only be one busy process at a time, the currently executing one. It will evolve until
it either terminates, or becomes ready or blocked. This evolution may spawn other processes,
which are themselves evolved until they enter one of the two non-busy states.
As processes evolve, they may add event entries to a (global) log maintained by the scheduler.
Such entries may be requested by RLog, or they may be generated by the scheduler itself,
and may be useful for debugging. The event log is just a list of String pairs, where the
first string is a tag (in the simple version, just "system" or "user"), and the second is the
associated text. The log is conveniently maintained in a standard Writer monad:

type Logger = Writer [(String, String)]

Most of the Scheduler’s work is done by two functions:

runLocal :: Process -> Logger ProcSys


runReady :: Process -> ([Process], [Process]) -> Logger ProcSys

The former takes a (potentially busy) process and processes all its requests, until it can be
placed in either of the two lists in a process system. If the process has no further requests
(RDone), it just disappears, unless it was the initial process (Pid []), in which case the
whole system goes into the completed state. Any processes spawned as part of running
runLocal are also evolved to become part of the system. If any such processes end up on
the ready queue, they will be placed after the process that spawned them.
The second function takes a ready process (i.e., one about to send), together with the remaining
processes in the system, and returns the new state of the system. In general, after a send, both
the sender and the recipient may become busy, and both are individually evolved,
starting with the sender. Each of them is put at the end of the ready queue if it ends
up in a ready state again. (If they become blocked, their position on the blocked list doesn’t
matter.)
Note that there are a priori four different situations that can arise in runReady, depending
on the send destination: a process can send a message to itself, to another ready process
(which should not affect its position in the queue), to a blocked process (which will unblock
it), or to a no longer existing process (which means that the message is discarded). You may
be able to handle some of these situations together, but think carefully about what should
happen in each one.
Using these functions, we can specify the main scheduler function:

scheduler :: Monad m => (Int -> m Int) -> Req Outcome -> Int
          -> m (Outcome, [(String, String)])

The call scheduler 𝑠𝑒𝑙 𝑟𝑞 𝑛 runs 𝑟𝑞 as the initial process, for at most 𝑛 “big” scheduling
steps (i.e., sends). If the initial process terminates within that limit, its outcome is returned.
If the system has not completed after 𝑛 steps, the outcome will be the exception (atom)
timeout. If the ready-queue runs empty, the outcome will be the exception deadlock.
Regardless of the outcome, the event log is included in the result.
The function 𝑠𝑒𝑙 is used to select which of the ready processes to run in each step. It may
be called with a positive integer 𝑘 (representing the length of the ready list), and it should
return an integer in the range 0 through (𝑘 − 1). To get the round-robin behavior, we can
take 𝑚 to be just the identity monad and let 𝑠𝑒𝑙 always return
0, making the scheduler always select the first ready process from the queue.
For the backtracking scheduler, we take 𝑚 to be the list monad, and make 𝑠𝑒𝑙 return a
nondeterministic choice between all the valid selections. This means that every time there
is a scheduling choice to be made, we consider all of the possible choices and collect all
of the possible scenarios in a big list which can then be inspected. This guarantees that
we do not miss any potential scenarios, but for complex process systems with lots of
concurrent communications, the number of possible interleavings may explode, making a
full backtracking search infeasible.
Therefore, another possibility is to make 𝑠𝑒𝑙 return a (pseudo-)random number in the range,
and run the scheduler from the top a large number of times. But since Haskell is purely
functional, a pure 𝑠𝑒𝑙 would just make the same pseudo-random choice every time; instead, we
must carry a random-number seed through the computation by taking 𝑚 to be a suitable state
monad. Then for each run, we start with a different seed (which could be just the number of the
run), to hopefully get a reasonable exploration of the space.
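
The three choices of 𝑚 can be illustrated with the following sketch. All names here (selFirst, selAll, Rand, selRandom, pickOrder) are hypothetical, the hand-rolled Rand state monad and its linear congruential step are stand-ins chosen only to keep the example dependency-free, and pickOrder is a toy driver that repeatedly selects from a shrinking list, not the scheduler itself.

```haskell
import Control.Monad (ap, liftM)
import Data.Functor.Identity (Identity (..))

-- Round-robin: identity monad, always pick the head of the queue.
selFirst :: Int -> Identity Int
selFirst _ = Identity 0

-- Backtracking: list monad, try every valid index.
selAll :: Int -> [Int]
selAll k = [0 .. k - 1]

-- A minimal state monad carrying an Int seed (assumption: stands in
-- for whatever state monad the handout's main program uses).
newtype Rand a = Rand { runRand :: Int -> (a, Int) }

instance Functor Rand where
  fmap = liftM

instance Applicative Rand where
  pure a = Rand $ \s -> (a, s)
  (<*>) = ap

instance Monad Rand where
  m >>= f = Rand $ \s -> let (a, s') = runRand m s in runRand (f a) s'

-- Pseudo-random: advance the seed with a simple LCG step.
selRandom :: Int -> Rand Int
selRandom k = Rand $ \s ->
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
  in ((s' `div` 65536) `mod` k, s')

-- Toy driver: repeatedly select an element from a shrinking list,
-- returning the elements in the order they were picked.
pickOrder :: Monad m => (Int -> m Int) -> [a] -> m [a]
pickOrder _ [] = return []
pickOrder sel xs = do
  i <- sel (length xs)
  let (before, x : after) = splitAt i xs
  rest <- pickOrder sel (before ++ after)
  return (x : rest)
```

With selFirst the driver returns the list in its original order, with selAll it returns every permutation, and with selRandom the order depends only on the initial seed passed to runRand.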
All three strategies can be selected from the handed-out main program.


Getting started
As usual, the recommended strategy is to develop a partial implementation of all three
modules, possibly postponing some of the more difficult aspects, rather than aiming for
completing each one before moving on to the next. Within the modules, we recommend
the following prioritizations:
In the Parser, first omit list syntax (in expressions and patterns), general atoms, and the finer
points of operator precedence and associativity, whitespace, etc. (But do make a list of
what you skip.)
In the Evaluator, focus on the single-assignment environment and basic expression forms:
atoms, variables, tuples, cases, and named functions. Wait with try/catch, function values,
and send/receive/spawn support. In particular, postpone “stashing” in receives; initially,
make unmatched messages just throw an exception.
In the Scheduler, first set up the logging infrastructure. Then do as many cases for runLocal
as you can, before moving on to runReady. Implement just a round-robin scheduler first,
which simply ignores the selection function, and get it to work before considering the
generalization to arbitrary monads and selection functions.

General instructions
All your code should be put into the provided skeleton files under code/agner/, as indicated.
In particular, the actual code for each module Mod should go into src/ModImpl.hs. Do not
modify src/AST.hs or src/Runtime.hs, which contain the common type definitions
presented above; nor should you modify the type signatures of the exported functions in
the APIs. Doing so will likely break our automated tests, which may affect your grade.
Be sure that your codebase builds with the provided app/Main.hs with stack build; if
any parts of your code contain syntax or type errors, be sure to comment them out for the
submission.
In your report, you should describe your main design and implementation choices for each
module in a separate (sub)section. Be sure to cover at least the specific points for each
module asked about in the text above (and emphasized with *** in the margin), as well as
anything else you consider significant. Low-level, technical details about the code should
normally be given as comments in the source itself.
For the assessment, we recommend that you use the same (sub)headings as in the weekly
Haskell assignments (Completeness, Correctness, Efficiency, Robustness, Maintainability,
Other). Feel free to skip aspects about which you have nothing significant to say. You may
assess each module separately, or make a joint assessment in which you assess each aspect
for all modules.


Question 2: Advanced Database


The task in this question is to implement a simple database called Advanced Database. It
uses a query language, Advanced Query Language (AQL), inspired by SQL.

General comments
This question consists of three sub-questions: Question 2.1 about implementing an API for
starting an Advanced Database server and for querying and updating it, Question 2.2 about
modelling the database, and Question 2.3 about writing QuickCheck tests against the API. Note
that Question 2.3 can be solved with a partial implementation of Question 2.1 and Question 2.2.
Remember that it is possible to make a partial implementation of the API that does not
support all features. If there are functions or features that you don’t implement, then make
them return the atom not_implemented.
There is a section at the end of this question, on page 26, that lists expected topics for your
report.

Terminology
An attribute is a name used to label data in a database. In Erlang attributes are represented
by atoms, for example name or grade.
A row is a mapping from a finite set of attributes to values. In Erlang a row is represented
by a map, for example #{name => "Alice", grade => 7}.
A bag is a set where elements may occur multiple times. In Erlang a bag can be represented
by a list of elements. Two such lists represent the same bag if one is a permutation of
the other. For example, [a, b, a] is a bag where a occurs twice and b occurs once. It is
different from the bag [a, b] (where a only occurs once), but is equivalent to [a, a, b]
(which lists the same elements, but sorted).
An AQL query computes a bag based on the contents of the database. It is represented by
an Erlang term, which can be one of the following:

• {list, Rows}, an explicit list of rows. It evaluates to the bag represented by Rows.
• {from, TableName}, the contents of a table. It evaluates to the contents of the table
TableName.
• {project, AttrNames, Q}, the projection of the query Q onto the attributes
AttrNames. It evaluates to the result of Q, but only includes the attributes listed
in AttrNames. For example, if Q evaluates to

    [#{name => "Alice", grade => 7},
     #{name => "Bob", grade => 10},
     #{name => "Charlie", grade => 7}]

then {project, [grade], Q} evaluates to

    [#{grade => 7}, #{grade => 10}, #{grade => 7}]

• {rename, OldAttr, NewAttr, Q}, the renaming of OldAttr to NewAttr in Q. It
evaluates to the result of Q, but all OldAttr => Value mappings are replaced by
NewAttr => Value. For example, if Q evaluates to

    [#{name => "Alice", grade => 7},
     #{name => "Bob", grade => 10},
     #{name => "Charlie", grade => 7}]

then {rename, grade, final_grade, Q} evaluates to

    [#{name => "Alice", final_grade => 7},
     #{name => "Bob", final_grade => 10},
     #{name => "Charlie", final_grade => 7}]

• {where, Pred, Q}, the filtering of Q by Pred. It evaluates to the subbag (i.e. “subset”,
but for bags) of rows of the result of Q for which Pred returns true. If Pred throws
an exception for any row, the entire query fails. For example, if Q evaluates to

    [#{name => "Alice", grade => 7},
     #{name => "Bob", grade => 10},
     #{name => "Charlie", grade => 7}]

then {where, fun(#{grade := Grade}) -> Grade < 10 end, Q} evaluates to

    [#{name => "Alice", grade => 7},
     #{name => "Charlie", grade => 7}]

• {union, Q1, Q2}, the union of Q1 and Q2. For bags represented as lists this is essen-
tially concatenation. For example, if Q1 evaluates to [#{grade => 7}, #{grade => 10}]
and Q2 evaluates to [#{grade => 7}, #{grade => 12}] then their union evaluates
to

[#{grade => 7}, #{grade => 10}, #{grade => 7}, #{grade => 12}]

• {join, Q1, Q2}, the (natural) join of Q1 and Q2. If Q1 and Q2 evaluate to R1 and R2
respectively, then the join evaluates to the bag

[Row || Row1 <- R1, Row2 <- R2, {ok, Row} <- [join_rows(Row1, Row2)]]

where join_rows (which can be found in the util module) merges two rows, but
only if they agree on the value of the attributes that they have in common.
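
The query semantics described above can be summarized in a small executable model. The sketch is written in Haskell for consistency with the other sketches in this document and is not Erlang code for the adb module: the types Val, Row, Bag, and Query and the functions sameBag, joinRows, and eval are hypothetical, from-queries are left out because they need a table store, and the behavior of a rename onto an attribute that already exists is an assumption (the new value overwrites the old one).

```haskell
import Data.List (sort)
import qualified Data.Map as M

data Val = I Int | S String
  deriving (Eq, Ord, Show)

type Row = M.Map String Val
type Bag = [Row]

-- Model of the query terms, except {from, TableName}.
data Query
  = List Bag
  | Project [String] Query
  | Rename String String Query
  | Where (Row -> Bool) Query
  | Union Query Query
  | Join Query Query

-- Two lists represent the same bag if one is a permutation of the other.
sameBag :: Bag -> Bag -> Bool
sameBag a b = sort a == sort b

-- Merge two rows if they agree on all attributes they have in common.
joinRows :: Row -> Row -> Maybe Row
joinRows r1 r2
  | and (M.elems (M.intersectionWith (==) r1 r2)) = Just (M.union r1 r2)
  | otherwise = Nothing

eval :: Query -> Bag
eval (List rows)        = rows
eval (Project attrs q)  = [M.filterWithKey (\k _ -> k `elem` attrs) r | r <- eval q]
eval (Rename old new q) = map ren (eval q)
  where
    ren r = case M.lookup old r of
      Nothing -> r
      Just v  -> M.insert new v (M.delete old r)
eval (Where p q)        = filter p (eval q)
eval (Union q1 q2)      = eval q1 ++ eval q2
eval (Join q1 q2)       =
  [r | r1 <- eval q1, r2 <- eval q2, Just r <- [joinRows r1 r2]]
```

Rows with no attributes in common always agree, so joining two such bags yields every pairwise combination of their rows.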


Question 2.1: The adb module


Implement an Erlang module adb with the following API. See the handout code for type
specifications.

• start() for starting an adb server. Returns {ok, S} on success, or {error, Reason}
if some fault occurred.
• stop(DB) for stopping the database server DB. It is unspecified whether stop/1 will
abort ongoing operations or wait for them to finish.
• create_table(DB, Name) for creating a table with name Name in the database DB.
It is an error if there is already a table called Name in the database. Returns ok on
success and {error, Reason} on error.
• select(DB, Query) for running Query on DB. Returns {ok, Rows} if the query
succeeds with the result Rows and {error, Reason} if an error occurred. See section 2
for the possible values of Query.
• insert(DB, Table, Query) for running Query and inserting the resulting rows
into the table Table. It is a non-blocking operation.
• delete(DB, Table, Pred) for deleting every row in Table for which Pred returns
true. It is a non-blocking operation.
• sync(DB) for waiting until there are no ongoing operations (select, insert or delete)
on DB; then it returns ok. In particular, all past updates by the same client (process)
must be visible to any future selects. For example, if an insert is followed by a sync
and then a select, the effects of the insert must be taken into account when computing
the select.

Database queries and operations are generally expected to execute concurrently, except
when there is a conflict such as multiple updates to the same table. Examples of scenarios
where concurrent execution is expected include:

• Multiple selects in general.


• Multiple deletes on different tables.
• An update (insert or delete) and a query (via select or another insert) which
does not depend on the table being updated.

Testing
The test_adb module should contain, at least, the following:

• A test_all/0 function that runs all your tests and only depends on the specified
adb API. Your tests should involve the required properties in this module, but should
also test aspects and functionality not covered by the required properties.
• We also evaluate your tests on your own implementation; for that, you should export
a test_everything/0 function (that could just call test_all/0).


How to get started


Start by implementing start/0 and create_table/2. Implement select/2 for the case
where the query is {from, Table} and insert/3 for the case where the query is {list, Elems}.
This subset of the API allows meaningful testing. Most of the complexity in the system
comes from handling queries concurrently, so consider making some simplifications initially.
For instance, you could start by assuming that no other query can run while an update
(insert or delete) is taking place.
Consider developing your test suite alongside your implementation. This will help you
greatly, should you see the need for refactoring your code.


Question 2.2: Modelling adb


Make a module adb_model with the following API. This module will serve as a model of
adb for testing purposes. The API is quite similar to adb, but instead of communicating
with a stateful server each function simply returns an updated state. See the handout code
for type specifications.

• start() for creating an empty database. It returns the empty database state.

• create_table(DB, Name) for creating a table. It returns {ok, NewDB} on success,
where NewDB is the state after adding a new table Name to DB. If there is already a
table Name in DB it returns {error, Reason}.

• select(DB, Query) for running Query on DB. Returns {ok, Rows} if the query
succeeds with the result Rows and {error, Reason} if an error occurred. See Section 2
for the possible values of Query.

• insert(DB, Table, Query) for running Query and inserting the resulting rows into
the table Table. Returns {ok, NewDB} (where NewDB is the state after insertion) on
success, and {error, Reason} if an error occurred.

• delete(DB, Table, Pred) for deleting every row in Table for which Pred returns
true. Returns {ok, NewDB} (where NewDB is the state after deletion) on success, and
{error, Reason} if an error occurred.
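
The idea of a model in which each operation returns an updated state can be sketched as follows. The sketch is written in Haskell for consistency with the other sketches in this document and is not the Erlang adb_model module: the state representation (a map from table names to lists of rows), the simplified Row type with string values, the function names, and the error reasons are all assumptions made for illustration, with Left and Right playing the roles of {error, Reason} and {ok, NewDB}.

```haskell
import qualified Data.Map as M

-- Simplification (assumption): attribute values are plain strings.
type Row = M.Map String String

-- The whole database state: table name -> bag of rows.
type DB = M.Map String [Row]

-- The empty database.
start :: DB
start = M.empty

-- Fails if the table already exists.
createTable :: String -> DB -> Either String DB
createTable name db
  | M.member name db = Left "table_exists"
  | otherwise        = Right (M.insert name [] db)

-- Append already-evaluated rows to an existing table.
insertRows :: String -> [Row] -> DB -> Either String DB
insertRows name rows db = case M.lookup name db of
  Nothing  -> Left "no_such_table"
  Just old -> Right (M.insert name (old ++ rows) db)

-- Remove every row for which the predicate holds.
deleteRows :: String -> (Row -> Bool) -> DB -> Either String DB
deleteRows name p db = case M.lookup name db of
  Nothing  -> Left "no_such_table"
  Just old -> Right (M.insert name (filter (not . p) old) db)

-- The contents of a table, as in a {from, Table} query.
fromTable :: String -> DB -> Either String [Row]
fromTable name db = maybe (Left "no_such_table") Right (M.lookup name db)
```

Because every operation is a pure function of the state, a sequence of operations can be replayed on the model and compared, up to bag equivalence, with what the server reports.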


Question 2.3: QuickCheck for adb


The adb_qc module should contain a QuickCheck property prop_model_check/0 which
checks that starting a new database, creating a number of tables, executing a number
of operations (insert and delete) on these tables and finally getting the contents of all
tables (using select(DB, {from, Table}) for each Table) produces the same result (up
to equivalence of bags) using adb and adb_model. For adb you should also call sync/1
between the updates and the selects to make sure that the effects of all the updates are
reflected in the result.
You are welcome (even encouraged) to make more QuickCheck properties than explicitly
required. Properties that only depend on the specified adb API should start with the
prefix prop_. If you have properties that are specific to your implementation of the adb
library (perhaps they are related to an extended API or you are testing sub-modules of your
implementation), they should start with the prefix myprop_, so that we know that these
properties most likely only work with your implementation of adb.

Topics for your report for Question 2


Your report should document:

• For Question 2.1:

– Clearly explain if you support the complete API or not. If not, which parts you
don’t support.
– If you have made any assumptions, especially simplifying assumptions.
– How many processes your solution uses, what role each process has, and how
they communicate.
– What data your processes maintain.
– Summarise how you have tested aspects and functionality not covered by the
properties in Question 2.3.
Likewise, as always, remember to include your tests and the result of running
your tests in your report (perhaps in an appendix).
– Overall, the interesting part of Question 2.1 is how concurrent your database is.
What are the conditions for concurrent execution in your implementation? Do
you obey the minimum concurrency requirements?

• For Question 2.2:

– Clearly explain if you support the complete API or not. If not, which parts you
don’t support.
– If you have made any assumptions, especially simplifying assumptions.
– How you model the database state.

• For Question 2.3:


– Clearly explain if you have implemented the requirements or not. If not, what
is missing.
– Explain the idea behind your property and how it works.
– The quality or limitations of your property based testing. Explain how you have
measured this quality.
