You are on page 1of 15

A Refactoring Tool for Smalltalk

Don Roberts, John Brant, and Ralph Johnson

University of Illinois at Urbana-Champaign


Department of Computer Science
1304 W. Springfield Ave.
Urbana, IL 61801
e-mail: {droberts,brant,johnson}@cs.uiuc.edu

Abstract and the only portion of the system that must be


validated1.
Behavior-preserving transformations are
Refactoring is an important part of the evo-
known as refactorings [Opd92]. Refactorings
lution of reusable software and frameworks. Its
are changes whose purpose is to make a program
uses range from the seemingly trivial, such as
more reusable and easier to understand, rather
renaming program elements, to the profound,
than to add behavior. Refactorings are specified
such as retrofitting design patterns into an ex-
as parameterized program transformations along
isting system. Despite its importance, lack of
with a set of preconditions that guarantee be-
tool support forces programmers to refactor pro-
havior preservation if satisfied. Figure 1 gives an
grams by hand, which can be tedious and error-
example refactoring for adding a new class. This
prone. The Smalltalk Refactoring Browser is a
would be one of the first refactoring steps to
tool that carries out many refactorings automati-
create an abstract class.
cally, and provides an environment for improv-
A closely related field of research is that of
ing the structure of Smalltalk programs. It makes
object-oriented database schema evolution
refactoring safe and simple, and so reduces the
[BK87][Kim90][PS87][WP91][DZ91]. Many of
cost of making reusable software.
the manipulations that are studied in schema
evolution are similar to the refactorings that we
1. Introduction are studying. A major difference between the
two areas is that schema evolution must deal
Improving an object-oriented system often
with the existence of “live” instances of evolving
involves both redesign and extension. Pro-
objects and therefore focuses much more on the
grammers often do not distinguish between these
data component of the objects. Our research
two activities and intertwine them freely. We
does not consider live instances, but focuses
have found it useful to perform improvements by
much more on the manipulations of the methods
first transforming the design while preserving
of the objects.
the program behavior, and then extending the
Much of the prior work on refactoring has
better designed system
focused on developing a small, primitive set of
[OBHS86][Cas91][Cas92]. This approach al-
refactorings that are provably behavior-
lows the designer to validate the new design
preserving. These primitive refactorings often
separate from the extended functionality. Fur-
seem trivial, but they possess three important
thermore, if the designer uses automatic tools to
properties:
transform the design while ensuring that pro-
gram behavior is preserved, then the extensions
1
are the only possible source of additional error Since the refactoring is behavior-preserving, if there
were errors in the original code, there will be errors in
the refactored code.

1
Name: AddClass

Parameters: className : String


superclass : Class
subclasses : Set of Class

Preconditions: ∀c ∈ classes( Program). className ≠ name( c)


∧ ∀c ∈ subclasses.superclass(c) = superclass

Transformation:
superClass superClass

Sub1 Sub2 Sub3 Sub1 className

Subclasses

Sub2 Sub3

Figure 1 - The AddClass Refactoring for Smalltalk

ficult to understand because they are more com-


1. They can be completely automated. This
plicated than they need to be, with duplicated
frees programmers from having to per-
code and methods that perform more than one
form mundane restructuring tasks manu-
task. By using refactorings to reorganize the
ally.
code, a programmer can gain better understand-
ing as he factors out the various abstractions and
2. They are provably correct. This ensures
eliminates redundancy. In fact, often important
that the restructuring process will not in-
abstractions are discovered only after the code
troduce new errors into the system. This
has been rearranged to make it simpler. Kent
is especially important in refactorings
Beck refers to this phenomenon as “talking pro-
that affect code in many places, such as
grams” [Beck96]. Abstract classes are a classic
renaming methods and variables.
example of this. Often in the program develop-
ment process, the designer finds that code is be-
3. More complex refactorings can be cre-
ing duplicated in several classes. By creating a
ated by composing primitive ones. Since
common superclass and moving the common
each primitive refactoring preserves the
methods and instance variables into it, the com-
behavior of the program, the entire com-
mon abstraction often becomes evident. This
position is itself behavior-preserving.
syntactic manipulation reveals a design-level in-
At first glance, simply restructuring a system sight.
seems to be of limited utility. However, there are Even non-behavior-preserving changes can
several areas of software development that can be aided by refactoring. Quite often a change
be aided by refactoring. One area is in program can be made much more easily if the code to be
comprehension. Complex systems are often dif- changed can be brought into one place (class,

2
Instance/Class Variable Refactorings
add variable
rename variable Method Refactorings
remove variable add method
push down variable into subclass(es) rename method
pull up variable from subclass(es) remove method
create accessors for a variable push down method into subclass(es)
change all variable refs to accessor calls pull up method from subclass(es)
(abstract variable) add parameter to method
Class Refactorings move method across object boundary
create new class extract code as method
rename class
remove class
Table 1- Refactorings that have been automated
method, etc.). By first localizing the code to be toring. We eventually implemented an en-
changed, the change can be made much more tirely new browser, with many enhance-
accurately since the programmer does not have ments, but even this new browser looks very
to search through the entire system to determine similar to the standard system browser.
which parts are affected by the change. An ex- • The refactorings must be fast. Smalltalk
ample of this is using the Strategy pattern to lo- programmers are used to being able to im-
calize an algorithm within a single class, and mediately see the results of a change. This is
then replacing the strategy object with a differ- due to the dynamic nature of the Smalltalk
ent one which encapsulates a different algorithm environment. Therefore, refactorings that
[GHJV95]. take a long time to perform an analysis will
Since refactoring occurs at all levels within not be used. If a refactoring takes too long,
the software development life cycle, the ability a Smalltalk programmer will just do it by
to perform refactorings automatically is crucial hand and live with the consequences.
to software evolution. This is especially true • Avoid purely automatic reorganization.
with the advent of design patterns. Due to the In Smalltalk, names are very important
relatively recent development of design patterns, since that is one of the fundamental ways of
few existing programs use the flexible designs determining what a class, method, or vari-
typified by them. Adding these designs to exist- able is used for. Automatic reorganization
ing software can be a tedious process. Refactor- tools that have to create entities must name
ings simplify this process by automatically han- them. These names typically do not have
dling the details of the code [TB95]. any meaning in the problem domain and
serve to obfuscate the code. Whenever we
1.1 The Refactoring Browser have to name something, we always prompt
the user to provide a name.
The goal of our research is to move refac-
toring into the mainstream of program develop- • The refactorings must be reasonably cor-
ment. The only way that this can occur is to pre- rect. The programmers must have a degree
sent refactorings to developers in such a way of trust in the transformations. In a reflec-
that they cannot help but use them. To do this, tive environment such as Smalltalk, any
the refactoring tool must fit the way that they change to the system can be detected by the
work. This goal imposes the following design system. Therefore, it is possible to write
criteria on refactoring tools: programs that depend on the objects being a
• The refactorings must be integrated into particular size, or that call methods by get-
the standard development tools. In ting a string from the user and calling per-
Smalltalk, the standard development tool is form: with it. Therefore, it is impossible to
the browser. Initially, we simply added have totally correct, nontrivial refactorings.
menu items to the browser for each refac- However, the refactorings in our system

3
Figure 2 - Screenshot of Refactoring Browser during extract code as method
refactoring

handle most Smalltalk programs, but if a were markedly different. Sending translateBy:
system uses reflective techniques, the re- to a Geometric created a new object that is off-
factorings will be incorrect. set from the original by the argument (a point).
However, sending the same message to
A screen shot from the Refactoring Browser
GraphicsContext did not create a new in-
is shown in Figure 2. It currently implements the
stance of GraphicsContext, but changed its
refactorings presented in Table 1. Several of
state so that entities drawn on it were offset by a
these refactorings could be decomposed into
particular amount.
several more primitive refactorings, but we are
This situation was clearly undesirable.
seeking a set that programmers actually use,
Therefore, in release 4.1 of ParcPlace Smalltalk,
rather than a minimal set.
To demonstrate the usefulness of the Refac- the method defined in Geometric and its sub-
toring Browser, we present four case studies of classes was renamed to translatedBy:, a name
problems that have arisen in working Smalltalk that more accurately reflects its semantics. Of
systems and show how they can be easily solved course, all of the references to the new methods
with automatic refactorings. were updated within the image. The problem
arose when converting existing applications
from the original version of the image to the new
2. Renaming translateBy: version. All of the senders of translateBy: in
the application had be examined by a pro-
One, seemingly simple, problem that arose
grammer and the type of the receiver had to be
involved renaming methods. In release 4.0 of
determined. Since there may be many such loca-
ParcPlace Smalltalk, both GraphicsContext tions, this was a very tedious, error-prone proc-
and most subclasses of Geometric defined ess as several hundred Smalltalk programmers
methods named translateBy:. Although these that were given the job of upgrading existing
methods shared a common name, their semantics applications can attest to.

4
In Smalltalk, the type of an object is effec- class GraphicsContext will remain un-
tively the set of messages that it responds to. touched.
Any object that responds to the same set of mes- Renaming methods at first seems to be one
sages as another can replace it regardless of their of the simplest of program transformations, but
inheritance relationship. This is unlike C++ in a dynamically typed environment such as
where the inheritance hierarchy is also the type Smalltalk, where the type of an object is the set
hierarchy. Therefore, in Smalltalk, renaming a of methods that it responds to, renaming a
method in a class is effectively changing the type method is changing the type of the object. Two
of all of the instances of that class. This can get objects that are polymorphic may not be after a
complicated when two classes that are not re- method is renamed. Therefore, the Refactoring
lated by inheritance have methods that share a Browser prompts the user whenever the situation
common name. It cannot be easily determined if is ambiguous. There are two situations that may
these classes are used polymorphically, or if the arise whenever a renaming is specified:
names are purely coincidental. If they are poly-
1. The user wishes to rename all of the
morphic, then both methods must be renamed if
methods with a given name in the image.
either one is. If they are not polymorphic, then
In this case, all of the implementers and
either method can be renamed independent of
all of the senders are renamed. This can
the other, but determining which method is be-
be done using only the functions in the
ing referred to at each call site in the program is
base image for determining all senders
difficult. This is the case with translateBy:.
and all implementers of the method. All
Solution of the classes must be checked for con-
flicts with the new name.
To port an application that uses the Geo-
metric classes from Smalltalk R4.0 to R4.1, we 2. The user wishes to rename only a subset
must locate all of the sites that send trans- of the implementations of a method. This
lateBy: message to a subclass of Geometric. procedure requires runtime analysis of
These sites must be modified to send the the program and is only behavior-
translatedBy: message. This takes the follow- preserving if the classes that implement
ing steps with the Refactoring Browser: the method are not used polymorphically
1. Using the rename method refactoring, re- with the class in which the renaming is
name the translateBy: method in the occurring. If the classes are used poly-
Geometric class to translatedBy:. This morphically with the class in which the
will rename the method in the renaming occurs, then renaming the
Geometric class and all of its sub- method in one class and not the others
classes. (All of the subclasses are re- will make them no longer polymorphic,
named to preserve the common protocol breaking the program.
that all Geometrics understand.)

2. Exercise the application on a test suite. 3. Converting Values into


This step is necessary because the re- ValueHolders
name method refactoring is a dynamic
refactoring, which will be discussed in When ParcPlace released VisualWorks, cre-
Section 6.3. ating user interfaces became much simpler.
However, the granularity of what was considered
3. File out the application and file it into an a model in VisualWorks changed. Before, mod-
Objectworks 4.1 image. els were typically large, monolithic objects that
implemented the entire functionality of the ap-
After completing these steps, all of the ref- plication. Under VisualWorks, every widget
erences to the translateBy: method in the (e.g. button, input field) has its own model,
Geometric classes will be renamed while the which are typically very simple, containing a
references to the method of the same name in single value. These simple models are known as

5
foo foo
"Getter method for instance variable foo" "Getter method for instance variable foo"

^foo ^self fooHolder value

foo: aValue foo: aValue


"Setter method for instance variable foo" "Setter method for instance variable foo"

foo := aValue self fooHolder value: aValue

fooHolder
Figure 3 - Accessor methods for foo ^foo isNil
ifTrue: [foo := nil asValue]
ifFalse: [foo]
holders. Holders are the same as variables ex-
Figure 4 - Transformed accessor methods
cept that they notify their dependents whenever
their value changes. Therefore, to convert exist-
ing applications to use the new interface tools
A common strategy for refactoring is to lo-
that are in VisualWorks, many of the values that
calize the scope of change. A major reason that
are stored in domain models must be converted
maintenance is so expensive is that adding a sin-
into ValueHolders.
gle feature typically requires changes throughout
Solution a program. If the scope of the change can be
limited to as few classes and methods as possi-
Converting values to ValueHolders by ble, errors are less likely to occur and are easier
hand is tedious. Using the Refactoring Browser to track down. In this case, the variable accesses
to perform this transformation is extremely sim- are scattered throughout the class and its sub-
ple and quick. classes. The automatic refactoring allows the
1. Use the add method refactoring to cre- programmer to rapidly localize all of these ref-
ate a method that returns the initialized erences into two methods where the necessary
value holder. changes can be made simply. If this refactoring
occurs often, the procedure can be entirely
2. Perform the abstract instance variable automated, eliminating the need for the pro-
refactoring on the instance variable you grammer to manually change the accessor meth-
wish to convert. ods.

This refactoring does several things. First, it


creates accessor methods for the variable if none 4. Creating an Abstract
exist. For example, if the variable is named foo, Class
it creates the two methods shown in Figure 3. It
then replaces all direct references to the variable One of the most common design changes
with either the getter method, if it is a reference, that occurs when developing reusable software is
or the setter method, if it is an assignment. creating an abstract class from several concrete
Now, to make the variable a value holder, classes [JF88]. The abstract class typically rep-
simply change the accessor methods (using the resents a common abstraction that several
browser) to use the value holder method created classes share, hence the name. Finding the cor-
in step 1. The code will look like Figure 4. The rect abstractions for a particular problem is dif-
foo and foo: methods act the same as the origi- ficult. Typically, several concrete classes must
nal methods. The fooHolder methods gives pro- be created before the correct abstraction be-
grammers access to the valueHolder object comes apparent [OJ93]. Therefore, abstract
which can be passed to the user interface or any classes are created from a particular set of con-
other object than needs to be dependent on the crete classes by migrating variables and func-
value of foo. tions from the concrete classes into the new su-

6
perclass. When done correctly, this will not 3. Move common instance variables from
change the behavior of the system . the concrete subclasses into the super-
For example, the original code for the Re- class. This is accomplished by using the
factoring Browser had a navigational part that pull up instance variable refactoring.
allowed the user to choose classes and methods
to edit. This was commonly displayed at the top 4. Methods with similar functions should be
of the browser. In addition to the navigational renamed to have identical names. This is
component, there was also a selection compo- accomplished with the rename method
nent that the user used to pick several classes refactoring.
and selectors so that actions could be performed
on them (e.g., running a lint rule on the selected 5. Often, methods in the subclasses will be
classes). This selection component was origi- identical except for a few pieces of
nally developed for a testing tool and was retro- code. These methods will have to have
fitted into the Refactoring Browser. Since it was the differences factored into smaller
originally developed for a different application, methods. The extract code as method
it did not fit into the same hierarchy as the navi- refactoring allows the user to select a
gational component. As a result, many items portion of a method and have it ex-
were duplicated, and the selection component tracted as a new method. Figure 5 shows
lacked some features implemented by the navi- an example of making two methods
gational component. This was fixed in later ver- identical by factoring out differences.
sions of the Refactoring Browser, by creating an Before the extraction, every subclass
abstract class that represented these commonali- must override the entire getController
ties. method, duplicating much code. After
the extraction, the subclasses only need
Solution to define the one portion of the method
Creating an abstract class by hand takes sev- that is actually different across sub-
eral steps, most of which can be easily auto- classes.
mated [Opd92][Opd93][OJ93]. The following There are several complications that
steps create an abstract class using the Refac- can arise when attempting to extract
toring Browser (they do not necessarily have to code into a separate method. The prob-
be performed in exactly this order): lems occur in code that uses variables
local to the original method. If the vari-
1. Create an empty superclass between the ables are simply referenced and not as-
concrete classes and their current super- signed, they can be passed as param-
class. This is by using the create new eters to the new method. If the variables
class refactoring. This refactoring allows are assigned and referenced exclusively
the users to insert an empty class into the within the extracted code, they can be-
hierarchy by specifying its superclass and come temporaries of the new method.
which of the subclasses of the original However, if the variables are assigned
superclass will be its subclasses. within the extracted code and are refer-
enced in the original method, then the
2. Use the rename instance variable refac- code cannot be extracted. Since Small-
toring. Rename common instance vari- talk passes all parameters by reference,
ables to share a common name by using there is no simple way to get the new
the rename instance variable refactoring. values of the temporary variables back
Often classes have instance variables that to the original method.
represent the same concept but have dif- In the Refactoring Browser, the code
ferent names. This is especially true if to be extracted is analyzed and the user
the classes were developed by indepen- is prompted to enter a selector with the
dent developers. appropriate number of arguments. If the

7
Before Extraction:
View>>getController
controller == nil ifTrue: [self setController: StandardController new].
^controller
SpecialView>>getController
controller == nil ifTrue: [self setController: MySpecialController new].
^controller

After Extraction:
View>>getController
controller == nil ifTrue: [self setController: self defaultControllerClass new].
^controller
View>>defaultControllerClass
^StandardController
SpecialView>>defaultControllerClass
^MySpecialController

Figure 5 - Extracting code to make similar methods


code cannot be extracted, the user is in- Extending existing algorithms to deal with addi-
formed and the process is aborted. tional types of nodes is simply a matter of add-
ing the appropriate method to the new node for
6. Move identical methods from the sub- each algorithm. However, to add a new algo-
classes into the superclass using the pull rithm, a new method must be added to every
up method refactoring. node type. If adding new algorithms is a rare oc-
currence, then this is the correct way to imple-
ment the algorithm.
5. Retrofitting the Visitor The visitor pattern addresses the situation
where the types of nodes are fairly constant, but
Pattern adding new algorithms or refining existing algo-
One of the key features of design patterns is rithms occurs relatively frequently. In the visitor
that they balance a set of forces. The visitor pattern, algorithms are performed by a separate
pattern is a classic example of force-balancing. object known as a visitor. Each node type im-
The visitor pattern is used in situations where an plements the accept: method, which takes an ar-
algorithm must be performed on a complex, het- gument that is a visitor object. It then calls the
erogeneous data structure (e.g., generating ma- method of the visitor that corresponds to the
chine code from a parse tree in a compiler) type of the receiver. For instance, an assignment
[GHJV95]. node would implement the accept: aVisitor
One solution is to have each node in the data method by sending the
structure implement the portion of the algorithm acceptFromAssignmentNode: self method
that deals with itself. For example, an assign- back to the visitor. This callback method will
ment node in a parse tree could generate code implement the details of the algorithm. In this
for itself by asking its rvalue to generate code solution, each algorithm is entirely encapsulated
for itself and then generate the code to assign the in a single visitor object. However, adding a new
results returned from the code fragment to the type of node to the data structure will force the
storage location represented by the lvalue. This programmer to add a method for the new node
solution is appropriate if the types of algorithm type to every visitor object.
that operate on the data structure are fairly static.

8
ParseNode
optimize
generate

VariableNode ConstantNode AssignmentNode LoopNode


optimize optimize optimize optimize
generate generate generate generate

ParseNode

acceptVisitor:

VariableNode ConstantNode AssignmentNode LoopNode

acceptVisitor: acceptVisitor: acceptVisitor: acceptVisitor:

Visitor
acceptFromVariableNode:
acceptFromConstantNode:
acceptFromAssignmentNode:
acceptFromLoopNode:

OptimizingVisitor GeneratingVisitor
acceptFromVariableNode: acceptFromVariableNode:
acceptFromConstantNode: acceptFromConstantNode:
acceptFromAssignmentNode: acceptFromAssignmentNode:
acceptFromLoopNode: acceptFromLoopNode:

Figure 6 - The visitor pattern

9
ParseNode>>optimize AssignmentNode>>optimizeWithVisitor: aVisitor

self optimizeWithVisitor: OptimizingVisitor new aVisitor acceptFromAssignmentNode: self

Figure 7 - New optimize method Figure 8 - New optimizeWithVisitor: method


In the initial design process, it is usually not combine them into one optimize method
apparent which is the correct method of imple- in the ParseNode class.
mentation. Only after the system has been run-
ning and several extensions have been made will 5. Next, use the add parameter to method
it become apparent whether to use a visitor or a to add a parameter containing an Opti-
distributed algorithm. Therefore, the ability to mizingVisitor to all of the optimize-
easily refactor a design from one method to the WithVisitor methods. You must specify
other is essential. a default value to passed. In this case, the
default value is OptimizingVisitor new.
Solution This refactoring causes the optimize in
Figure 6 shows a classic example of a situ- class ParseNode to look like Figure 7.
ation that is a candidate for the visitor pattern.
This is a parse tree from within a compiler that 6. To move the code into the visitor, use the
implements both an optimize and a generate move method across boundary refactor-
method. The steps to implement the visitor pat- ing to move the body of each
tern are as follows: optimizeWithVisitor: method into the
1. Create an abstract visitor class with the appropriate visitor method. Figure 8
create new class refactoring. shows the code for the Assign-
mentNode.
2. Add methods for each type of node to be
visited with the add method refactoring.
7. Finally, since the optimizeWithVisitor:
The body of the method corresponding to
has nothing specific to optimization, use
the root of the node hierarchy will just be
the rename method refactoring to change
self subclassResponsibility since this is
its name to acceptVisitor:.
an abstract class. The body of the meth-
ods corresponding to each subclass will Adding the code for the generating visitor is
simply call the method corresponding to similar to above process. However, the
its superclass. In the parse tree example, acceptVisitor: will already be implemented.
the method acceptFrom-
AssignmentNode: will have ^self
acceptFromParseNode: as its body.
6. Implementing the Refac-
This arrangement handles the case where toring Browser
the optimize method is not implemented
on every node. It does this by simulating This section briefly discusses some of the
the methods that would be called by in- implementation details of the Refactoring
heritance. Browser. One of the key design criteria was to
create a tool that could refactor real Smalltalk
3. Extract the body of the optimize method programs with the same interactive style that
to the optimizeWithVisitor method us- Smalltalk developers are used to. Therefore, one
ing the extract code as method refac- of our guiding principles is that automatic re-
toring. factorings should take less time than performing
them by hand. To achieve this goal, analyses that
4. Now all of the optimize methods have take longer than a few seconds were unaccept-
the same body (i.e., self optimize- able. This led to the framework for analysis and
WithVisitor) and can be consolidated. transformation discussed in this section.
Use the pull up method refactoring to

10
Another of the key design criteria is that we
‘varName := `@anything’ -> ‘self varName: `@anything’
assume that there is an intelligence directing the
refactoring process. We do not attempt to auto- ‘varName’ -> ‘self varName’
matically refactor code. It has been our experi-
ence that systems that perform purely automatic Figure 9 - Matching patterns to abstract
code reorganization are of limited utility in pro- instance variable varName
duction programming environments since they
typically have no understanding of the problem aspects of reflection but are simply users of re-
domain. For example, simply creating abstract flection. Specifically, we used three reflective
classes everywhere there is duplicated code facilities when implementing the browser
rarely yields meaningful designs. Especially
since the algorithms assign names to the new 6.1 Parse Tree Matching
classes in some automatic fashion. This ignores
The fundamental facility that any program
the fact that abstract classes, while eliminating
transformation system requires is access to an
duplicated code, should also represent some ab-
easily manipulable representation of the pro-
straction in the problem domain. Divining this
gram. VisualWorks provides access to the
abstraction from the results of an automatic re-
Smalltalk compiler and its intermediate struc-
organization is often more difficult than finding
tures such as parse trees. The source code trans-
it in the original code due to the obfuscation in-
formations that refactorings require are much
troduced by the naming system.
easier to implement as parse tree to parse tree
However, the algorithms used in automatic
transformations rather than string to string trans-
reorganization systems such as Ivan Moore’s
formations. Other dialects of Smalltalk do not
[Moore96], can be valuable when augmented
provide this level of access to the system, which
with user input. For instance, rather than auto-
is one of the major reasons the Refactoring
matically generating names, the system could
Browser has not been ported to other environ-
prompt the user for names for the new abstrac-
ments.
tions that it discovers. Our approach is similar to
Originally, the refactorings were imple-
this in that we provide a tool that will search for
mented by using visitors that would iterate the
places in your program where code is dupli-
nodes of a parser tree and convert one parse tree
cated, or unused, and point them out. This al-
into another depending on the refactoring being
lows a human to make the final call whether or
performed. As we developed more refactorings,
not the code should be consolidated and to pro-
developing the enumerators became tedious. To
vide a semantically meaningful name.
simplify this process, we developed a tree
One of the key features of Smalltalk that al-
matching system.
lows us to perform the sorts of analyses and
The matching system uses specially anno-
source code transformations necessary for re-
tated parse trees that are created by extending
factoring is that it has reflective facilities, that is,
the Smalltalk language. There are annotations
the ability for the language to examine and
that support matching variables, message sends,
modify its own structures. For instance, the Re-
any object (A variable followed by a list of mes-
factoring Browser examines the context stacks
sage sends), a statements, and a list of state-
of executing programs to determine runtime
ments. Once a match is found, a block of code is
properties such as callers of a particular method
executed for that match.
or cardinality relationships between components
The parse tree matcher can also be used to
and aggregates. Refactoring can be performed in
modify the parse trees. In addition to supplying
the absence of reflective facilities, but then re-
an annotated Smalltalk expression to match
quires a separate, metalanguage that can be used
against, another expression specifies the con-
to manipulate the original program. Having a
version once a match is found. For example, to
single language simplifies the process.
rename the size message send to length, you
Reflection is currently a hot topic of re-
only need to supply a search string such as
search, but from the standpoint of this project we
`@anyObject size and its conversion string
are not concerned with many of the theoretical
`@anyObject length (where `@anyObject is a

11
wildcard that can match any series of message toring Browser computes some of the precondi-
sends to a variable). Once the parse tree has tions by performing the analysis dynamically
been modified, it is parsed back into Smalltalk rather than statically. By observing the actual
code and compiled. Figure 8 shows the abstract values that occur, the Refactoring Browser can
instance variable refactoring as implemented perform correct refactorings. As an example,
using the tree matcher. consider the rename method refactoring. To cor-
rectly rename a method, all calls to that method
6.2 Compilation Framework must be renamed. This is difficult in an envi-
ronment that uses polymorphism to the extent
Arbitrary program transformations are not that Smalltalk does. Smalltalk also allows dy-
behavior-preserving. To ensure that the behavior namically created messages to be sent via the
of a program does not change, every refactoring perform: message. If an application uses this
has associated with it a set of preconditions that approach, any automatic renaming process has
must be met in order to apply the transformation. the potential of failure. Under these conditions,
Often these preconditions are as simple as “no guaranteeing the safety of a rename is impossi-
class with this name already exists in the sys- ble.
tem.” The static properties of the program must The Refactoring Browser uses method
be analyzed to determine if these preconditions wrappers to collect runtime information. These
are satisfied before performing a refactoring. wrappers are activated when the wrapped
Fortunately, the compilation framework in method is called and when it returns. The wrap-
VisualWorks already provides many of the per can execute an arbitrary block of Smalltalk
checks that are necessary for correct refactor- code. To perform the rename method refactoring
ings. The framework uses classes as first class dynamically, the Refactoring Browser renames
objects to manage variables, methods, and class the initial method and then puts a method wrap-
hierarchies. Many of the analysis and trans- per on the original method. As the program runs,
formational parts of a refactoring simply require the wrapper detects sites that call the original
a couple of message sends to a class. For exam- method. Whenever a call to the old method is
ple, to add a method to a class, you must check detected, the method wrapper suspends execu-
that the method name does not conflict with a tion of the program, goes up the call stack to the
method defined in the class or its superclasses. sender and changes the source code to refer to
This can easily be checked by sending the the new, renamed method. Therefore, as the
canUnderstand: message the class with the se- program is exercised, it converges towards a
lector in question as an argument. correctly refactored program.
Since we reused many of the existing static This type of “lazy” behavior occurs quite
checks present in the compilation framework, often in many branches of computer science.
the refactoring framework is much simpler than CLOS and other object systems based on lisp
it would be if we had implemented all of these have a lazy instance conversion scheme
checks ourselves. [Steele90]. In fact, virtual memory can even be
considered a lazy allocation scheme where pages
6.3 Dynamic Analysis and Method are only loaded as needed.
Wrappers The major drawback to this style of refac-
toring is that the analysis is only as good as your
Some preconditions of refactorings are not test suite. If there are pieces of code that are not
simple to compute from the static program text. executed, they will never be analyzed, and the
With Smalltalk’s dynamic typing, determining refactoring will not be completed for that par-
the type of a particular variable can be extremely ticular section of code.
difficult and time consuming. Moreover, there
are analyses that defy static analysis in any lan-
guage, such as cardinality relationships between 7. Future Work
objects.
The Refactoring Browser does not yet sup-
Since, performing these types of analyses
port undo. This feature is desirable in that it
statically is difficult or impossible, the Refac-

12
further encourages the exploration of program
designs by allowing the programmer to try it one
8. Conclusions
way and easily change his mind. Since Smalltalk Refactoring is a common operation in the
provides a continuous change history, it should software life cycle and the Refactoring Browser
be fairly simple to extend this to implement un- provides automatic support for many of the
limited undo. More interesting, though, is al- common transformations that come up in
lowing the programmer to undo refactorings in Smalltalk development. The Refactoring
any order, and the system will determine if there Browser is a practical tool in that it can perform
are any later refactorings that depend upon the correct refactorings on nearly all Smalltalk pro-
undone refactoring and undo them also. grams. In fact, we regularly use the Refactoring
We are always searching for new refactor- Browser on itself. That is, we use the tool to re-
ings to add to the system. Our main resource for factor its own source code. Additionally, the Re-
refactorings up to this point is William Opdyke’s factoring Browser has been used to help develop
Ph.D. thesis [Opd92]. In general, we prefer to a wide range of frameworks from the HotDraw
add primitive refactorings that can be composed framework for graphical editors[BJ94], to finan-
into larger, more complex refactorings. This cial models being developed by Caterpillar, to
composition is a topic that we are examining prototypes and models for a major telecommu-
most closely. nications company.
One result from composing refactorings is The Smalltalk Refactoring Browser can be
that in addition to having a precondition that obtained by anonymous ftp at st.cs.uiuc.edu un-
must be satisfied, each refactoring has a post- der the directory:
condition that is true after the transformation. /pub/Smalltalk/st80_vw/RefactoringBrowser
For example, in the visitor pattern, we added a or on the World Wide Web at:
parameter to the optimize method that was ini- http://st-www.cs.uiuc.edu/~brant/Refactory/.
tialized with a new Visitor object. A postcondi-
tion of this refactoring is that the type of the
variable is a Visitor, a fact that would otherwise 9. Acknowledgments
require extensive analysis to determine. Keeping This research funded in part by a fellowship
track of postconditions can be used to satisfy from the Fanny and John Hertz Foundation and
preconditions of later refactorings without re- by a grant from the Union Bank of Switzerland.
quiring extensive, complex analysis of the
source code. Therefore, refactorings are some-
times easier to perform in a composition than 10. References
they are independently. Developing a system [Beck96] Kent Beck. Smalltalk Best Practice
that incorporates these batch refactorings is cur- Patterns, Volume 1: Coding. Pren-
rently being researched. tice-Hall, 1996.
Another area that we are researching is using
the tool to allow packages of refactorings to be [BJ94] K. Beck and R. Johnson. Patterns
released with new versions of a framework or Generate Architectures. In Pro-
Smalltalk image. The refactorings would be ap- ceedings of the 8th European
plied to old code filed into a new image. For ex- Conference, ECOOP '94. Lecture
ample, the problem in section 2 would be solved Notes in Computer Science, Vol.
by specifying the renaming refactoring. Then 821, pages 139–149, 1994
whenever any old application was filed into a
new ObjectWorks 4.1 image, all of the refer- [BK87] E. Blake and S. Cook. On including
ences to the old message would be renamed part hierarchies in object-oriented
during file in. These refactoring packages would languages, with an implementation
allow the developer of the framework to provide in smalltalk. In Proceedings of
a set of refactorings that would transform sys- ECOOP ’87, Special Issue of
tems derived from the original framework to the BIGRE, pp 45–54, June 1987.
new framework.

13
[Cas91] Eduardo Casais. Managing Evolu- ings of OOPSLA'96, 1996, pp. 235-
tion in Object-Oriented Environ- 250
ments: An Algorithmic Approach.
PhD thesis, University of Geneva. [OBHS86] Tim O'Shea, Kent Beck, Dan Hal-
1991. bert, and Kurt J. Schmucker. Panel
on: The learnability of object-
[Cas92] Eduardo Casais. An Incremental oriented programming systems. In
Class Reorganization Approach, in Proceedings of OOPSLA '86. pages
Proceedings ECOOP '92, ed. O. 502–504. November 1986.
Lehrmann. Madsen, LNCS 615,
Springer Verlag, 1992, pp. 114– [OJ93] William F. Opdyke and Ralph E.
132. Johnson. Creating abstract super-
classes by refactoring. In ACM
[DZ91] Christing Delcourt and Roberto 1993 Computer Science Confer-
Zicari. The design of an integrity ence. February 1993.
consistency checker for an object-
oriented database system. In Pro- [Opd92] William F. Opdyke. Refactoring
ceedings ECOOP '91, ed. Pierre Object-Oriented Frameworks. PhD
America, LNCS 512, Springer thesis, University of Illinois, 1992.
Verlag, 1991, pp. 97–117. Available at: ftp://st.cs.uiuc.edu/
pub/papers/refactoring/opdyke-
[GHJV95] E. Gamma, R. Helm, R. Johnson, thesis.ps.Z
and J. Vlissides, Design Patterns:
Elements of Reusable Object- [Opd93] Ralph Johnson and William Op-
Oriented Software, Addison- dyke. Refactoring and Aggregation,
Wesley, 1995. In Object Technologies for Ad-
vanced Software, LNCS 742,
[Gri91] William G. Griswold. Program Re- Springer Verlag, 1993, pp. 264-278
structuring as an Aid to Software
Maintenance. PhD thesis. Univer- [PS87] D. Jason Penney and Jacob Stein.
sity of Washington. August 1991. Class modifications in the Gem-
Stone object-oriented dbms. In
[JF88] Ralph E. Johnson and Brian Foote. Proceedings of OOPSLA ’87, 1987.
Designing reusable classes. Journal
of Object-Oriented Programming. [Steele90] Guy L. Steele, Jr. Common Lisp:
June/July 1988. The Language, Digital Press, 1990.

[JO93] Ralph E. Johnson and William F. [TB95] Lance Tokuda and Don Batory.
Opdyke. Refactoring and Aggrega- Automated Software Evolution via
tion. In Proceedings of ISOTAS Design Pattern Transformations. In
'93: International Symposium on Proceedings of the 3rd Interna-
Object Techniques for Advanced tional Symposium on Applied Cor-
Software. November 1993. porate Computing, Monterrey,
Mexico, October 1995. Also TR-
[Kim90] Won Kim. Introduction to Object- 95-06, Department of Computer
Oriented Databases. MIT Press, Sciences, University of Texas at
1990. Austin, February 1995.

[Moore96] Ivan Moore. Automatic Inheritance [WP91] Francis Wolinski and Jean-Francois
Hierarchy Restructuring and Perrot. Representation of complex
Method Refactoring. In Proceed- objects: Multiple facets with part-

14
whole hierarchies. In Proceedings
ECOOP '91, ed. Pierre America,
LNCS 512, Springer Verlag, 1991,
pp. 288–306.

15

You might also like