ABSTRACT
A pattern is a common solution to a problem that occurs in many different contexts. Patterns capture expert knowledge about “best practices” in software design in a form that allows that knowledge to be reused and applied in the design of many different types of software. Antipatterns are conceptually similar to patterns in that they document recurring solutions to common design problems. They are known as antipatterns because their use (or misuse) produces negative consequences. Antipatterns document common mistakes made during software development as well as their solutions.

While both patterns and antipatterns can be found in the literature, they typically do not explicitly consider performance consequences. This paper explores antipatterns from a performance perspective. We discuss performance problems associated with one well-known design antipattern and show how to solve them. We also propose three new performance antipatterns that often occur in software systems.

1.0 INTRODUCTION
The use of patterns in software development has its roots in the work of Christopher Alexander, an architect. Alexander developed a pattern language for planning towns and designing the buildings within them [1]. A pattern language is a collection of patterns that may be combined to solve a range of problems within a given application domain, such as architecture or software development. Alexander’s work codified much of what was, until then, implicit in the field of architecture and required years of experience to learn.

In addition to capturing design expertise and providing solutions to common design problems, patterns are valuable because they identify abstractions that are at a higher level than individual classes and objects. Now, instead of discussing software construction in terms of building blocks such as lines of code, or individual objects, we can talk about structuring software using patterns. For example, when we discuss using the Proxy pattern [5] to solve a problem, we are describing a building block that includes several classes as well as the interactions among them.
tify a bad situation and provide a way to rectify the problem. This is particularly true for performance because good performance is the absence of problems. Thus, by illustrating performance problems and their causes, performance antipatterns help build performance intuition in developers. Patterns, which do not contain performance problems, may be less useful for building performance intuition, especially if their performance characteristics are not discussed (as is typically the case).

While both patterns and antipatterns can be found in the literature, they typically do not explicitly consider performance consequences. It is important both to document design patterns that lead to systems with good performance and to point out common performance mistakes and how to avoid them. This is a supplement to software performance engineering that will improve the architectures and designs of software developers.

This paper explores antipatterns from a performance perspective. We discuss performance problems associated with one well-known design antipattern and show how to solve them. We also propose three new performance antipatterns that often occur in software systems.

While their emphasis is different, both patterns and antipatterns address common software problems and their solutions. The emphasis in the patterns community, however, is on quality attributes, such as reusability or maintainability, other than performance. As the use of patterns and antipatterns becomes more widespread, it is vital to also identify those that are likely to have good performance characteristics. We propose antipatterns for performance problems that we encounter in many different contexts but that have the same underlying pathology. Because we find them so often, it is important to document these antipatterns so developers will be able to recognize them before they occur, and select appropriate alternatives.

2.0 RELATED WORK
Antipatterns are derived from work on patterns. As noted in the introduction, this work is aimed at capturing expert software design knowledge. There is a large body of published work on patterns including [5], [3], and the proceedings of the Pattern Languages of Program Design (PLoP) conferences. While there is occasional mention of performance considerations in the work on patterns, the principal focus is on other quality attributes, such as modifiability and maintainability.

Meszaros [6] presents a set of patterns that address capacity and reliability in reactive systems such as telephony switches. Petriu and Somadder [7] extend these patterns for use in identifying and correcting performance problems in distributed layered client-server systems with multi-threaded servers.

Smith [10] presents a set of principles for constructing responsive software systems. While they were published before the work on software patterns began and presented with a different focus, they can be viewed as performance patterns.

Antipatterns extend the notion of patterns to capture common design errors and their solution. The most extensive work on this topic is by Brown et al. [2]. Their work, like the work on patterns, however, focuses principally on quality attributes other than performance.

This paper extends the work on antipatterns to explicitly address the performance of software architectures and designs. It presents three common performance mistakes made in software architectures. The first antipattern, the “god” class, was proposed by other authors as a problem with software quality. We show how it also leads to poorly performing software. We propose three additional antipatterns that also cause poor performance. They may also have negative impacts on other quality attributes, but those are not addressed here. Additional performance antipatterns appear in a forthcoming book by the authors.

Each of the antipatterns is defined in the following sections using this standard template:

• Name: the section title
• Problem: What is the recurrent situation that causes negative consequences?
• Solution: How do we avoid, minimize or refactor the antipattern?

3.0 THE “GOD” CLASS
This antipattern is known by various names, including the “god” class [8] and the “blob” [2]. Both Riel and Brown et al. discuss the impact of this phenomenon on quality attributes such as modifiability and maintainability. The presence of a “god” class in a design also has a negative impact on performance.

3.1 Problem
A “god” class is one that performs most of the work of the system, relegating other classes to minor, supporting roles. A design containing a “god” class is usually easy to recognize. It typically has a single, complex controller class (often with a name containing Controller or Manager) that is surrounded by simple classes that serve only as data containers. These classes typically contain only accessor operations (operations to get() and set() the data) and perform little or no computation of their own. The “god” class obtains the information it needs using the get() operations belonging to these data classes, performs some computation, and then updates the data using their set() operations.

The following (very) simplified example illustrates the effects of a “god” class. Consider an industrial process control application in which it is necessary to control the status of a valve (open or closed). Figure 1 shows a fragment of the class diagram for a possible design for this application. The Controller class in Figure 1 behaves like a “god” class. The Valve class has no intelligence. It simply reports its status (open or closed) and responds to open() and close() operation invocations. The Controller does all of the work; it requests information from the Valve, makes decisions, and tells the Valve what to do.

The Controller class is tightly coupled to the Valve class and requires extra messages to perform an operation, as shown by the following code fragment.
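The code fragment referred to in the “god” class discussion is not reproduced in this copy. As a stand-in, the following Python sketch shows the kind of Controller and Valve interaction described there. It is an illustration under stated assumptions, not the authors' code: the method names status() and open_valve(), the messages counter, and the SmartValve refactoring are all hypothetical and exist only to make the message traffic visible.

```python
class Valve:
    """Valve with no intelligence: it reports status and obeys open()/close()."""

    def __init__(self):
        self._is_open = False
        self.messages = 0  # instrumentation for this sketch only (assumption)

    def status(self):
        self.messages += 1
        return "open" if self._is_open else "closed"

    def open(self):
        self.messages += 1
        self._is_open = True

    def close(self):
        self.messages += 1
        self._is_open = False


class Controller:
    """'God' class: it asks the valve for its state, decides, then commands it."""

    def open_valve(self, valve):
        if valve.status() == "closed":  # message 1: request information
            valve.open()                # message 2: tell the valve what to do


class SmartValve(Valve):
    """Hypothetical refactoring: the decision moves into the valve itself."""

    def ensure_open(self):
        self.messages += 1              # single message from the caller
        if not self._is_open:
            self._is_open = True


plain_valve = Valve()
Controller().open_valve(plain_valve)

smart_valve = SmartValve()
smart_valve.ensure_open()

print(plain_valve.messages, smart_valve.messages)  # prints: 2 1
```

In this sketch the controller-driven design needs two messages to open a closed valve, while the valve that carries its own logic needs one. The absolute numbers matter less than the pattern: every decision made outside the data owner costs an extra round trip.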
Copyright 2000 by Software Engineering Research and L&S Computer Technology, Inc.
Appears in Proceedings 2nd International Workshop on Software and Performance, Sept. 2000
There is also a variant of the “god” class that, rather than performing all of the work, contains all of the system’s data [8]. The functions are then assigned to other classes. When one of the function

3.2 Solution
The solution to the “god” class problem is to refactor the design to distribute intelligence uniformly across the top-level classes in the
[Figure: bar chart comparing Design 1 and Design 2. Vertical axis: Number of Messages (0 to 70). Horizontal axis: Scenario.]
4.0 EXCESSIVE DYNAMIC ALLOCATION
With dynamic allocation, objects are created when they are first accessed (a sort of “just-in-time” approach) and then destroyed when they are no longer needed. This can often be a good approach to structuring a system, providing flexibility in highly dynamic situations. For example, in a graphics editor, creating an instance of a shape (such as a circle or rectangle) when it is drawn and destroying the instance when the shape is deleted may be a very useful approach. Excessive Dynamic Allocation, however, addresses frequent, unnecessary creation and destruction of objects of the same class.

The sequence diagram in Figure 4 illustrates Excessive Dynamic Allocation. This example is drawn from a call-processing application in which, when a customer lifts the telephone handset (an offHook event), the switch creates a call object to manage the call. When the call is completed (an onHook event), the call object is destroyed. (Details of the call processing are in the sequence diagram referenced by handleCall, which is not shown here.)

While constructing a single Call object may not seem excessive, a Call is a complex object that contains several other objects that also must be created. In addition, a switch can receive hundreds of thousands of offHook events each hour. In a case like this, the
[Figure 5: Total Service Time (vertical axis, 0 to 6000) plotted over horizontal-axis values 1 to 11, with curves for Depth=3, S=.001; Depth=5, S=.001; Depth=10, S=.001; and Depth=3, S=.005.]
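To make the call-processing example concrete, the Python sketch below contrasts creating a Call on every offHook event with recycling Call objects through a pool, which is the mechanism the text has in mind when it speaks of the time to allocate and return an object from the pool. Everything here is an assumption made for illustration: the CallPool class, its acquire() and release() methods, the three placeholder subordinate objects, and the simplification that calls are handled one after another.

```python
class Call:
    """Stand-in for the complex Call object, which owns subordinate objects."""

    def __init__(self):
        # Hypothetical subordinates; the paper does not enumerate them.
        self.parts = [object() for _ in range(3)]
        self.caller = None

    def reset(self):
        """Clear per-call state so the object can be reused."""
        self.caller = None


class CallPool:
    """Pre-allocates Call objects once and recycles them."""

    def __init__(self, size):
        self._free = [Call() for _ in range(size)]
        self.constructed = size

    def acquire(self, caller):
        if self._free:
            call = self._free.pop()
        else:
            call = Call()            # pool exhausted: fall back to allocation
            self.constructed += 1
        call.caller = caller
        return call

    def release(self, call):
        call.reset()
        self._free.append(call)


def handle_calls_naive(n_calls):
    """Create a Call at offHook and drop it at onHook, once per call."""
    constructed = 0
    for caller in range(n_calls):
        call = Call()                # offHook
        constructed += 1
        call.caller = caller
        del call                     # onHook
    return constructed


def handle_calls_pooled(n_calls, pool):
    """Borrow a Call at offHook and return it to the pool at onHook."""
    for caller in range(n_calls):
        call = pool.acquire(caller)  # offHook
        pool.release(call)           # onHook
    return pool.constructed


naive_count = handle_calls_naive(1000)
pooled_count = handle_calls_pooled(1000, CallPool(size=4))
print(naive_count, pooled_count)     # prints: 1000 4
```

With the pool, construction cost (including the subordinate objects) is paid once per pooled object rather than once per call. A real switch would size the pool for the expected number of simultaneous calls and would need thread-safe acquire and release operations, neither of which this sketch attempts.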
The first improvement approach affects the cost in Figure 5 by reducing the service time, S, to the time to allocate/return an object from the pool, and changing the depth to 1 because the pre-allocated objects have already created the subordinate objects. The improvement for the second approach is similar.

5.0 CIRCUITOUS TREASURE HUNT
Do you remember the child’s treasure hunt game that starts with a clue which leads to a location where the next clue is hidden, and so on until the “treasure” is finally located? The antipattern analogy is typically found in database applications. Software retrieves data from a first table, uses those results to search a second table, retrieves data from that table, and so on until the “ultimate results” are obtained.

5.1 Problem
The impact on performance is the large amount of database processing required each time the “ultimate results” are needed. It is especially problematic when the data is on a remote server and each access requires transmitting all the intermediate queries and their results via a network and perhaps through other servers in a multi-tier environment.

The computer-aided design case study originally described in [Williams, 1998] illustrates this antipattern. The ICAD application allows engineers to construct and view drawings that model structures, such as aircraft wings. A model is stored in a relational database and several versions of the model may exist within the database.

Figure 6 shows a portion of the ICAD class diagram with the relevant classes. A model consists of elements which may be: beams, which connect two nodes; triangles, which connect three nodes; or plates, which connect four or more nodes. A node is defined by its position in three-dimensional space (x, y, z). Additional data is associated with each type of element to allow solution of the engineer’s model.
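[Figure 6: fragment of the ICAD class diagram. Recoverable labels: classes Model and Element; attributes modelID : int and elementNo : int; operation draw(); multiplicities 1..N and 4..N.]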
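As a rough illustration of the table-to-table retrieval this section describes, the Python sketch below stores beams and nodes in in-memory tables and counts simulated database calls for the DrawMod scenario. The FakeDb class, its method names, and the synthetic coordinates are assumptions made for illustration and are not part of the ICAD application. The sizes follow the figures given for a typical model in this section (2000 beams, 1500 nodes, two nodes per beam), and the second retrieval function mirrors the alternative organization discussed under Solution, in which node coordinates are stored in the beam row.

```python
class FakeDb:
    """In-memory stand-in for the model database; counts every call made."""

    def __init__(self, n_beams=2000, n_nodes=1500):
        self.calls = 0
        # node number -> (x, y, z); coordinates are synthetic
        self.nodes = {n: (float(n), 0.0, 0.0) for n in range(n_nodes)}
        # beam number -> the two node numbers it connects
        self.beams = {b: (b % n_nodes, (b + 1) % n_nodes) for b in range(n_beams)}

    def find_model(self, model_id):
        """One call: locate the model and return its beam numbers."""
        self.calls += 1
        return sorted(self.beams)

    def retrieve_beam(self, beam_no):
        self.calls += 1
        return self.beams[beam_no]

    def retrieve_node(self, node_no):
        self.calls += 1
        return self.nodes[node_no]

    def retrieve_beam_with_coords(self, beam_no):
        """Alternative organization: coordinates live in the beam row."""
        self.calls += 1
        n1, n2 = self.beams[beam_no]
        return self.nodes[n1], self.nodes[n2]


def draw_mod_treasure_hunt(db, model_id=1):
    """Model -> beam row -> node rows: each clue leads to the next lookup."""
    segments = []
    for beam_no in db.find_model(model_id):
        n1, n2 = db.retrieve_beam(beam_no)
        segments.append((db.retrieve_node(n1), db.retrieve_node(n2)))
    return segments


def draw_mod_denormalized(db, model_id=1):
    """Model -> beam row that already carries the node coordinates."""
    return [db.retrieve_beam_with_coords(b) for b in db.find_model(model_id)]


hunt_db, flat_db = FakeDb(), FakeDb()
hunt_segments = draw_mod_treasure_hunt(hunt_db)
flat_segments = draw_mod_denormalized(flat_db)
print(hunt_db.calls, flat_db.calls)  # prints: 6001 2001
```

Both functions return the same coordinates for drawing; only the number of calls differs. The sketch ignores query cost and network transfer, which the text identifies as the dominant factors when the database is remote.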
This example focuses on the DrawMod scenario in which a model is retrieved from the database and drawn on the screen. Figure 7 shows a sequence diagram for this scenario. A typical model consists of 2000 beams and 1500 nodes (a single node may be connected to up to 4 beams). The software first finds the model id, then uses it to find the beams, and repeats the sequence of steps to: retrieve each beam row, use the node number from the beam row to find then retrieve the node row which contains the “ultimate results”: the node coordinates. This information is then used to draw the model. For a typical DrawMod scenario there are 6001 database calls: 1 for the model, 2000 for beams, and 4000 for the nodes.

A large number of database calls causes the most serious performance problems in systems with remote database accesses, because of the cost of the remote access, the processing of the query, and the network transfer of all the intermediate results.

Another instance of the antipattern is also found in object-oriented systems where operations have large “response sets.” In this case, one object invokes an operation in another object, that object then invokes an operation in another object, and so on until the “ultimate result” is achieved. Then each operation returns, one by one, to the object that made the original call.

The performance impact is the extra processing required to identify the final operation to be called and invoking it, especially in distributed object systems where objects may reside in other processes and on other processors. When the invocation causes the intermediate objects to be created and destroyed the performance impact is even greater. This behavior also has poor memory locality because each context switch may cause the working set of the called object to be loaded. The working sets of intermediate objects may need to be re-loaded later when the return executes.

The class diagram in Figure 6 shows a simple example. Suppose that the model data has been retrieved from the database and is now
[Figure 7: sequence diagram for the DrawMod scenario. Messages in order of appearance: «create» : Model, draw(), open(), find(modelID), find(modelID, beams), sort(beams), «create», retrieve(node1), «create», retrieve(node2), «create», draw() (four times), close().]
contained within each object. Then the model object must determine (from the association to Beam, probably a table of pointers) each Beam object to call, and each Beam must determine (from the association to Nodes) which Node objects to call. The Model calls the first Beam operation, then the Beam calls 2 Node operations, and so on.

5.2 Solution
If you find the database access problem early in development, you may have the option of selecting a different data organization. For example, the DrawMod database could store the node coordinates (x, y, z) in the beam table. The sequence diagram for the alternative database design is in Figure 8. With the node coordinates in the beam row, the database call to find and retrieve nodes is unnecessary and is omitted. For a typical DrawMod scenario with 2000 beams, there will be 4000 fewer database calls.

In general the number of calls saved will be:

$$c_s = \prod_{j \in \mathrm{rootpath}} a_j$$

where $c_s$ is the total number of calls saved and $a_j$ is the number of associated objects in the level below for each object in this level, for every object $j$ between the object originally containing the “ultimate result” (the leaf class) and the object containing the “first clue” (the root class). For example, for the leaf class (node) $a_j$ is 2 nodes per beam, and for the intermediate class (beam) $a_j$ is 2000 beams per model, so $c_s$ is 2 × 2000 = 4000.

There are some potential disadvantages to reorganizing the DrawMod data in this way. Optimizing the data organization for one scenario may de-optimize it for other scenarios. To determine the appropriateness of each alternative, the performance engineer will need to analyze the performance impact on other scenarios that use the database. It is unwise to optimize the database organization for a single scenario if it has a detrimental effect on all other scenarios; you want the “globally optimal” solution for the key performance scenarios. You evaluate the overall performance by revising each scenario that is affected by the change and comparing the model solutions. Details are beyond the scope of this discussion.

For distributed systems, if you cannot change the database organization, you can reduce the number of remote database calls by using the Adapter pattern [5] to provide a more reasonable interface for remote calls. The Adapter would then make all the other (local) database calls required to retrieve the “ultimate result” and return only those results to the remote caller. This reduces the number of remote calls and the amount of data transferred, but does not reduce the database processing.

For designs with large response sets, an alternative is to create a new association that leads directly to the “ultimate result.” For example, in Figure 6 we would add an association between the Model class and the Node class. In the DrawMod scenario, this would reduce the number of operations called from 6000 (2000
[Figure 8: sequence diagram for the alternative database design. Messages in order of appearance: «create» : Model, draw(), open(), find(modelID), find(modelID, beams), sort(beams), «create» (three times), draw() (four times), close().]
Beam calls plus 4000 Node calls) to 1500 (the number of Nodes per Model). The performance impact is substantial if these are remote calls that are made via middleware such as CORBA or DCOM.

6.0 THE ONE LANE BRIDGE
On the South Island of New Zealand there is a highway that has a one lane bridge that is even shared with a train. It isn’t a problem there because there is light traffic in that part of the country. It would be a problem though if it were in downtown Los Angeles.

6.1 Problem
The problem with a One Lane Bridge is that traffic may only travel in one direction at a time, and if there are multiple lanes of traffic all moving in parallel, they must merge and proceed across the bridge, one vehicle at a time. This increases the time required to get a given number of vehicles across the bridge and can also cause long back-ups.

The software analogy to the One Lane Bridge is a point in the execution where one, or only a few, processes may continue to execute concurrently. All other processes must wait. It frequently occurs in applications that access a database. Here, a lock ensures that only one process may update the associated portion of the database at a time. It may also occur when a set of processes make a synchronous call to another process that is not multi-threaded; all of the processes making synchronous calls must take turns “crossing the bridge.”

The sequence diagram in Figure 9 illustrates the database variant of the One Lane Bridge antipattern. Each order requires a database update for each item ordered. The structure selected for the database assigns a new order-item number to each item and inserts all items at the end of the table. If every new update must go to the same physical location, and all new items are “inserted,” then the update behaves like a One Lane Bridge because only one insert may proceed at a time; all others must wait. There is also a second problem in that these inserts are costly because they must update a database index for each key on the table.

Similar problems occur when the database key is a date/time stamp for an entity, or any key that increases monotonically. We have also seen this problem for periodic archives where processing must halt while state information is transferred to long term storage.

6.2 Solution
With vehicular traffic, you alleviate the congestion caused by a One Lane Bridge by constructing multiple lanes, constructing additional bridges (or other alternatives), or re-routing traffic.

The analogous solutions in the database update example above would be:

• use an algorithm for assigning new database keys that results in a “random” location for inserts,
• use multiple tables that may be consolidated later, or
• use another alternative such as pre-loading “empty” database rows and selecting a location to update that minimizes conflicts.
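The first option in the list above can be sketched in Python as hashing each new key to one of several insert locations. This is a minimal illustration, not a database design: the bucket count, the bucket_for() and insert_row() helpers, and the key format are hypothetical, and plain lists stand in for separate tables or insert points that could each be locked independently.

```python
import hashlib

N_BUCKETS = 8  # hypothetical number of independent insert locations


def bucket_for(key, n_buckets=N_BUCKETS):
    """Map a key to a bucket with a stable hash, so that monotonically
    increasing keys do not all land in the same place."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def insert_row(buckets, key, row):
    """Append the row to the bucket chosen by the key's hash."""
    buckets[bucket_for(key, len(buckets))].append((key, row))


buckets = [[] for _ in range(N_BUCKETS)]
for item_no in range(1000):
    # Keys increase monotonically, as order-item numbers do in the text.
    insert_row(buckets, f"order-item-{item_no:06d}", {"item": item_no})

print([len(bucket) for bucket in buckets])
```

Because consecutive keys scatter across buckets, concurrent inserts would contend for the same location far less often than when every insert goes to the end of one table. The price is that reading rows back in key order requires consulting every bucket.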
[Figure 9: sequence diagram for the order scenario. Messages in order of appearance: enterOrder, checkOut, updateDB, triggerOrderProcess, ack.]
For example, an alternative for assigning date-time keys is to use multiple “buckets” for inserts and use a hashing algorithm to assign new inserts to the “buckets.”

Reducing the amount of time required to cross the bridge also helps relieve congestion. One way to do this is to find alternatives for performing the update. For example, if the update must change multiple tables, it would be better to select a different data organization in which the update could be processed in a single table.

For the database example above, the magnitude of the improvement depends on the intensity of new item orders and the service time for performing updates. The relationship is:

$$RT = \frac{S}{1 - XS}$$

where RT is the residence time (elapsed time for performing the update), S is the service time for performing updates, and X is the arrival rate. For example, with S = 0.010 seconds and X = 50 requests per second, RT = 0.010 / (1 − 0.5) = 0.020 seconds.

Figure 10 shows a comparison of the residence time for various arrival rates for two different service times. The first curve assumes the service time for the update is 10 milliseconds (thus the arrival rate of update requests must be less than 100 requests per second), and shows how the residence time increases as the arrival rate approaches the maximum. The second curve shows the improvement if the update service time is reduced by 1 millisecond! The figure illustrates the improvement achievable by reducing the service time (the time required to cross the bridge). If you change the structure of the database so that you update in multiple locations so fewer processes wait for each update, this is equivalent to reducing the arrival rate and the figure also shows the relative benefit of this alternative.

Figure 10 also illustrates the relative importance of the One Lane Bridge antipattern: if the intensity of requests for the service is high, it may be a significant barrier to achieving responsiveness and scalability requirements.

This solution to the One Lane Bridge problem embodies the shared resources principle [Smith, 1988; Smith, 1990] because responsiveness improves when we minimize the scheduling time plus the holding time. Holding time is reduced by reducing the service time for the One Lane Bridge and by re-routing the work.

7.0 SUMMARY AND CONCLUSIONS
This paper has explored antipatterns from a performance perspective. We show the performance consequences of a recognized antipattern, introduce three new antipatterns with negative performance consequences, and quantify their impact on performance.

The value of both antipatterns and their predecessor, patterns, is that they capture expert software design knowledge. This value has been amply demonstrated by their acceptance within the development community. One serious shortcoming of both patterns and antipatterns has been their lack of focus on performance issues. While some authors focused on performance [Meszaros, 1996; Petriu and Somadder, 1997], most have considered it as an afterthought, if at all. Demonstrating the performance characteristics of patterns and antipatterns is vital so that developers using them in designing software can select alternatives that will meet their performance goals.

The work presented here goes beyond merely describing the characteristics of architectural or design antipatterns, however. The Excessive Dynamic Allocation, Circuitous Treasure Hunt, and One Lane Bridge antipatterns document common performance mistakes
and provide solutions for them. While these antipatterns may manifest themselves in a variety of ways (for example the One Lane Bridge problem may be caused by either database collisions or synchronization delays) the manifestations have a common underlying cause.

[Figure 10: Residence time (vertical axis, 0.0 to 10.0) versus Arrival rate (horizontal axis ticks: 20, 50, 80, 98, 99.5, 99.87, 110, 111), with curves for S = 0.010 and S = 0.009.]

The solutions to these antipatterns embody sound, well-accepted performance principles [Smith, 1988; Smith, 1990]. These principles are similar to patterns in that they provide guidelines for creating responsive software. The antipatterns presented here provide a complement to the performance principles by illustrating what not to do and how to fix a problem when you find it. A simple analogy from electrical engineering would be using examples of series and parallel circuits (i.e., patterns) to illustrate how to build proper circuits and an example of a short circuit (i.e., an antipattern) to show what to avoid. Feedback from students in our classes indicates that both types of example are needed to instill performance intuition.

More work is needed on both patterns and antipatterns that includes their impact on performance as well as other quality attributes. We are continuing to identify other performance-related patterns and antipatterns. Many of these are described in a forthcoming book.

8.0 REFERENCES

[1] C. Alexander, S. Ishikawa, M. Silverstein, M. Jacobson, I. Fiksdahl-King, and S. Angel, A Pattern Language, New York, NY, Oxford University Press, 1977.

[2] W. J. Brown, R. C. Malveau, H. W. McCormick III, and T. J. Mowbray, AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis, New York, John Wiley and Sons, Inc., 1998.

[3] F. Buschmann, R. Meunier, H. Rohnert, P. Sommerlad, and M. Stal, Pattern-Oriented Software Architecture: A System of Patterns, Chichester, England, John Wiley and Sons, 1996.

[4] M. Fowler, Refactoring: Improving the Design of Existing Code, Reading, MA, Addison-Wesley Longman, 1999.

[5] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Reading, MA, Addison-Wesley, 1995.

[6] G. Meszaros, "A Pattern Language for Improving the Capacity of Reactive Systems," in Pattern Languages of Program Design 2, J. M. Vlissides, J. O. Coplien and N. L. Kerth, ed., Reading, MA, Addison-Wesley, 1996, pp. 575-591.

[7] D. Petriu and G. Somadder, "A Pattern Language For Improving the Capacity of Layered Client/Server Systems with Multi-Threaded Servers," Proceedings of EuroPLoP'97, Kloster Irsee, Germany, July, 1997.

[8] A. J. Riel, Object-Oriented Design Heuristics, Reading, MA, Addison-Wesley, 1996.
[9] R. C. Sharble and S. S. Cohen, "The Object-Oriented Brewery: A Comparison of Two Object-Oriented Development Methods," Software Engineering Notes, vol. 18, no. 2, pp. 60-73, 1993.

[10] C. U. Smith, "Applying Synthesis Principles to Create Responsive Software Systems," IEEE Transactions on Software Engineering, vol. 14, no. 10, pp. 1394-1408, 1988.

[11] C. U. Smith, Performance Engineering of Software Systems, Reading, MA, Addison-Wesley, 1990.

[12] L. G. Williams and C. U. Smith, "Performance Engineering of Software Architectures," Proceedings Workshop on Software and Performance, Santa Fe, NM, Oct. 1998.