Professional Documents
Culture Documents
László Lövei, Zoltán Horváth, Tamás Kozsik, Anikó Vı́g, Tamás Nagy
Eötvös Loránd University, Hungary
printList(L) ->
forEach(fun(H) ->
tion is achieved by first abstracting over the free vari-
io:format("~p\n", [H]) end, L).
able H, and by making the generalised parameter a
forEach(F,[H|T]) -> function F. In the body of printList the expression
F(H), io:format("~p/n", H]) has been replaced with F
forEach(F, T); applied to the local variable H.
forEach(F,[]) -> true. The arity of the printList has thus changed; in or-
der to preserve the interface of the module, we create a
Figure 4. The program after renaming
new function, printList/1, as an application instance
of printList/2 with the first parameter supplied with
The example presented here is small-scale, but it is the function expression:
chosen to illustrate aspects of refactoring which can fun(H) -> io:format("~p/n", [H]) end.
scale to larger programs and multi-module systems. Note that this transformation gives printList a func-
In Figure 1, the function printList/1 has been tional argument, thus making it a characteristically
defined to print all elements of a list to the standard ‘functional’ refactoring.
output. Next, suppose the user would like to define Figure 4 shows the result of renaming printList/2
another function, broadcast/1, which broadcasts a to forEach/2. The new function name reflects the
message to a list of processes. broadcast/1 has a functionality of the function more precisely. In Figure
very similar structure to printList/1, as they both 5, function braodcast/1 is added as another applica-
iterate over a list doing something to each element in tion instance of forEach/2.
the list. Naı̈vely, the new function could be added by Refactorings to generalise a function definition and
copy, paste, and modification as shown in Figure 2. to rename an identifier are typical structural refactor-
However, a refactor then modify strategy, as shown in ings, implemented in our work on both Haskell and Er-
Figures 3 - 5, would make the resulting code easier to lang.
maintain and reuse.
Figure 3 shows the result of generalising the func-
3.1 Language Issues
tion printList on the sub-expression
In working with Erlang we have been able to com-
io:format("~p/n",~[H]) pare our experience with what we have done in writ-
The expression contains the variable H, which is only ing refactorings for Haskell. Erlang is a smaller lan-
in scope within the body of printList. Instead of guage than Haskell, and in its pure functional part, very
generalising over the expression itself, the transforma- straightforward to use. It does however have a number
of irregularities in its static semantics, such as the fact The Distel infrastructure helps us to integrate refac-
that it is possible torings with Emacs, and thus make them available
within the most popular Erlang IDE.
• to have multiple defining occurrences of identifiers,
and
• to nest scopes, despite the perception that there is no 4. Our Approaches
shadowing of identifiers in Erlang. Both University of Kent and Eötvös Loránd Univer-
sity are now in the process of building a refactoring
Erlang is also substantially complicated by its possi- tool for Erlang programs, however different techniques
bilities of reflection: function names, which are atoms, have been used to represent and manipulate the pro-
can be computed dynamically, and then called using gram under refactoring. The Kent approach uses the
the apply operator; similar remarks apply to modules. annotated abstract syntax tree (AAST) as the internal
Thus, in principle it is impossible to give a complete representation of Erlang programs, and program analy-
analysis of the call structure of an Erlang system stat- sis and transformation manipulate the AASTs directly;
ically, and so the framing of side-conditions on refac- whereas the Eötvös Loránd approach uses relational
torings which are both necessary and sufficient is im- database, MySQL, to store both syntactic and semantic
possible. information of the Erlang program under refactoring,
Two solutions to this present themselves. It is pos- therefore program analysis and transformation are car-
sible to frame sufficient conditions which prevent dy- ried out by manipulating the information stored in the
namic function invocation, hot code swap and so forth. database.
Whilst these conditions can guarantee that behaviour One thing that is common between the two refac-
is preserved, they will in practice be too stringent for toring tools is the interface. Both refactoring tools are
the practical programmer. The other option is to artic- embedded in the Emacs editing environment, and both
ulate the conditions to the programmer, and to pass the make use of the functionalities provided by Distel [8],
responsibility of complying with them to him or her. an Emacs-based user interface toolkit for Erlang, to
This has the advantage of making explicit the condi- manage the communication between the refactoring
tions without over restricting the programmer through tool and Emacs.
statically-checked conditions. It is, of course, possible In this section, we first illustrate the interface of the
to insert assertions into the transformed code to signal refactoring tools, explain how a refactoring can be in-
condition transgressions. voked, then give an overview of the two implementa-
Compared to Haskell users, Erlang users are more tion approaches. A preliminary comparison of the two
willing to stick to the standard layout, on which the approaches follows.
Erlang Emacs mode is based. Therefore a pretty-printer
which produces code according to the standard layout 4.1 The Interface
is more acceptable to Erlang users.
While the catalogue of supported refactorings is slightly
different at this stage, the interfaces of the two refac-
3.2 Infrastructure Issues toring tools share the same look and feel. In this paper,
A number of tools support our work with Erlang. we take the refactoring tool from University of Kent as
Notable among these is the syntax-tools package an example to illustrate how the tool can be used.
which provides a representation of the Erlang AST A snapshot of the Erlang refactorer, which is called
within Erlang. The extensible nature of the package Wrangler, is shown in Figure 6. To perform a refactor-
allows syntax trees to be equipped with additional in- ing, the source of interest has to be selected in the ed-
formation as necessary. For example, Erlang Syntax itor first. For instance, an identifier is selected by plac-
Tools provides functionalities for reading comment ing the cursor at any of its occurrences; an expression
lines from Erlang source code, and for inserting com- is selected by highlighting it with the cursor. Next, the
ments as attachments on the AST at the correct places; user chooses the refactoring command from the Refac-
and also the functionality for pretty printing of abstract tor menu, which is a submenu of the Erlang menu, and
Erlang syntax trees decorated with comments. input the parameter(s) in the mini-buffer if prompted.
refactoring is shown in Figure 7: the new parameter A
has been added to the definition of repeat/1, which
now becomes repeat/2, and the selected expression,
wrapped in a fun-expression because of the side-effect
problem, is now supplied to the call-site of the gener-
alised function as an actual parameter.
All the implemented refactorings are module-aware.
In the case that a refactoring affects more than one
module in the program, a message telling which un-
opened files, if there is any, have been modified by the
refactorer will be given after the refactoring has been
successfully done. The customize command from the
Refactor menu allows the user to specify the boundary
of the program, i.e. the directories that will be searched
Figure 6. A snapshot of Wrangler and analysed by the refactorer.
Undo is supported by the refactorer. Applying undo
After that, the refactorer will check the selected once will revert the program back to the status right
source is suitable for the refactoring, and the param- before the last refactoring performed.
eters are valid, and the refactoring’s side-conditions are
satisfied. If all checks are successful, the refactoring 4.2 The Kent Approach
will perform the refactoring and update the program In this approach, the refactoring engine is built on top
with the new result, otherwise it will give an error mes- of the infrastructure provided by SyntaxTools [1]. It
sage and abort the refactoring with the program un- uses the annotated abstract syntax tree (AAST) as
changed. the internal representation of Erlang programs; both
program analysis and transformation manipulate the
AASTs directly.
As an example, consider the code in Figure 9. This Figure 10. The Implementation Architecture
is one of the clauses of a function that computes the
greatest common divisor of two numbers. Each node of database (which represents the AST and the seman-
the AST is given a unique id. Every module also has its tic information), but the position information might
own id. These ids are written as subscripts in the code. 1
The price for the separation of tables containing syntactic infor-
The database representation of the AST is illustrated mation from tables containing semantic information is an increased
in Table 1. The table names clause, name, infix expr redundancy in the database. For example, the “names” table stores
and application refer to the corresponding syntactic the variable name for each occurrence of the same variable.
no longer reflect the actual positions in the program
source. In order to keep the position information up-
to-date, we build up the updated syntax tree from the
database and use the pretty-printer to refresh the code,
then the position information is updated by a simul-
taneous traversal of the syntax tree represented in the
database, and the AST generated by parsing the re-
freshed code.