
2011 34th IEEE Software Engineering Workshop

Metrics and Antipatterns for Software Quality Evaluation

Francesca Arcelli Fontana
University of Milano Bicocca, Department of Computer Science, Milano, Italy
arcelli@disco.unimib.it

Stefano Maggioni
University of Milano Bicocca, Department of Computer Science, Milano, Italy
maggioni@disco.unimib.it

Abstract-Software evolution involves many activities. Among the most useful is evaluating the design quality of an evolving system, both to locate the parts that need particular refactoring or reengineering effort and to identify the parts that are well designed. This paper offers support for evaluating the code and design quality of a system; in particular, we suggest using metrics computation and antipattern detection together. We propose metrics computation based on particular kinds of micro-structures, together with the detection of structural and object-oriented antipatterns, with the aim of identifying areas of design improvement. The quality of a system can be evaluated from different perspectives, for example by understanding its global complexity, analyzing the cohesion and coupling of system modules, and locating the most critical and complex components that need particular refactoring or maintenance.

Keywords-metrics computation; antipatterns detection; software quality assurance; maintainability.

1550-6215/11 $26.00 © 2011 IEEE. DOI 10.1109/SEW.2011.13

I. INTRODUCTION

Software maintenance is very expensive because of all the activities needed to adapt, migrate and evolve the software according to new requirements and needs. Many tasks are strictly related to these activities. Among them, evaluating the code, design, and architectural quality of a system can be very useful for locating the parts that need more reengineering or refactoring effort. If the number of such components is large, this provides a first indication of the costs needed to maintain and evolve the system. Easily identifying bad quality code and design provides a way to prevent rapid decay of the system.

In this paper we propose to evaluate the design and code quality of Java systems through the computation of particular metrics and the detection of antipatterns [7, 9]. Software metrics [18, 19, 21] are exploited to understand both the complexity of the analyzed systems and their overall quality and stability. Software antipatterns are software structures and entities that, unlike design patterns, only seem to be an adequate solution to a certain design or programming issue; they are actually far from the optimal implementation. The presence of antipatterns inside an object-oriented system indicates a lack of modularity, a departure from object-oriented best practices, and difficulties in system maintenance, extension, and reuse. Hence, the identification of these structures helps in detecting the critical elements of the system, which engineers can regard as the main candidates for refactoring or restructuring activities. The detection of the antipatterns described in this paper is supported both by metrics analysis and by the detection of particular micro-structures.

A general evaluation of the design quality of a system can also be performed through the detection of design patterns, but their presence does not always imply good quality. Conversely, the presence of antipatterns indicates bad design quality and highlights system parts that need restructuring in order to improve overall quality and stability. Design quality is strictly related to the cohesion and coupling of a system, and consequently to its stability, which are also important characteristics for software reuse [11].

Through the metrics described in the paper, we are able to assess the impact of changes and refactoring during software evolution. If the value of some metrics is high, or if we detect some antipatterns, the risks and costs of any change could be very high. We are able to produce a dependency analysis, which helps in understanding which classes of the system are affected if a certain class undergoes modification or restructuring. Such an analysis helps in evaluating the percentage of the system that will potentially be affected by a single-point modification. It consequently helps in understanding the relationships and dependencies among system modules.

Moreover, dependency analysis proves necessary to support the integration of different systems, as integrating system parts with a low degree of coupling is easier than integrating highly coupled software modules. This analysis is also useful in a migration process towards service-oriented architectures (SOA), to locate components that can easily be exposed as services. The dependency analysis may also result in the detection of structural antipatterns.
The principal aim of this paper is to describe the metrics and the structural and object-oriented antipatterns that we are able to detect through an Eclipse plug-in we have developed, with the intent of providing a framework for code and design quality evaluation to support software evolution. Several tools for metrics computation have been developed, but fewer for antipattern detection (see Section II). The novelty of our approach with respect to previous ones is that the computation of metrics and the detection of antipatterns are performed by analyzing two sets of particular software micro-structures, namely Elemental Design Patterns (EDPs) [28] and micro patterns [12], which are described in Section III. Moreover, these activities are often performed through separate tools, while we integrate them in one plug-in for Eclipse, which we have developed to also support software architecture reconstruction views, design pattern detection and code smell detection. The metrics that we compute are all well-known metrics for quality evaluation, in particular for computing dependencies and dependents among classes and packages, but we provide a new interpretation of some of them.

We performed different experiments analyzing systems of different sizes, such as JHotDraw 6.0b1 (718 classes) [17], Adempiere (3105 classes) [1], JasperReports (2830 classes) [13], Plazma 0.1.8 (3345 classes) [30], and Sakai 2.6.0 (5417 classes) [32]. We also carried out other experiments with software provided by various companies, with the aim of evaluating the stability of a system, the effort of integrating it with other components, and the prediction of maintenance costs based on system complexity and dependency analysis.

The paper is organized as follows: in Section II we briefly introduce some related works; in Section III we introduce the micro-structures on which our approach is based; in Section IV we describe the overall architecture of our software architecture reconstruction module, which can also be exploited for metrics computation and antipatterns detection; in Section V we describe the different metrics which we are able to compute, outlining their relevance to system quality evaluation; in Section VI we describe the object-oriented and structural antipatterns we are able to detect. Finally, in Section VII we conclude by outlining some future developments.

II. RELATED WORKS

In this paper we concentrate on exploiting metrics and antipatterns for the evaluation of the code and design quality of a system, to better support software evolution. Many works in the literature outline the relevance of metrics, considering different metrics and different aims [6, 19, 22, 24, 27, 35]. Several tools for the computation of metrics currently exist, such as Understand, JDepend [14], SDMetrics [33], Metrics 1.3.6 [26] and JHawk [16]. In any case, many metrics exist and new metrics can be defined for specific purposes. Through metrics computation, new evaluations and analyses can be performed, which are useful to assist software maintenance and reengineering activities.

Antipatterns were introduced by Brown et al. [9], while dependence antipatterns have been discussed by Binkley et al. [7]. Some tools (such as Décor, iPlasma [30], Analyst4J [2], Structural Analysis for Java [34] and Rational Software Architect [31]) support the detection of some antipatterns. To our knowledge, no tool exists that is able to compute the metrics and to recognize both the object-oriented and the structural antipatterns that we describe in this paper. Moreover, the approach we follow is peculiar, since it is based on the detection of particular micro-structures, which in some cases correspond directly to the identification of critical classes that need refactoring.

III. AN APPROACH BASED ON MICRO-STRUCTURE DETECTION

The approach we describe for antipatterns detection and metrics computation is based on the detection of micro-structures, in particular elemental design patterns and micro patterns. Other kinds of micro-structures have been proposed, such as design pattern clues [23], but they have not been considered for metric-based and quality evaluations.

A. Elemental Design Patterns
Elemental design patterns (EDPs) were proposed by Smith and Stotts [28]. They provide solutions to very common programming problems (we can state that these problems occur in the everyday practice of every programmer). They share the same aim as design patterns, but they are applied to more restricted and specific issues. In fact, while design patterns propose solutions to problems which can involve a certain number of classes, EDPs address problems of much more limited dimensions, which generally do not involve more than three classes. There are 16 EDPs subdivided into three categories:
- Object Elements: contains three EDPs related to the creation and the referencing of objects, as well as to the presence of abstract methods inside an abstract class, or interface methods inside an interface;
- Method Invocation: collects twelve EDPs which represent the various forms of possible method calls;
- Type Relation: contains a single EDP representing the inheritance relationship between two classes, as well as the implementation of an interface.

A sample EDP is Create object, which belongs to the Object Elements category, whose intent is to ensure that newly allocated data structures conform to a set of assertions and preconditions before they are operated on by the rest of the system, and that can only be operated on in pre-defined ways [29]. More simply, it represents the fact that a certain class creates an instance of a class, which can be either the same class or a different one.

A sample EDP belonging to the Method Invocation category is Delegate, whose intent is to parcel out, or delegate, a portion of the current work to another method in another object [29]. This means that there are two different classes C1 and C2 which do not share any superclass, a method m1 of class C1 invokes a method m2 of class C2, and m1 and m2 have different signatures. For a complete description of each EDP refer elsewhere [28, 29]. EDPs can be detected inside Java systems through the Micro-structures Detector module [10], which is part of the MARPLE (Metrics and Architecture REconstruction PLugin for Eclipse) project [4, 10].

B. Micro Patterns
Micro patterns were introduced by Gil and Maman [12] in order to capture some very common programming techniques. Micro patterns can be thought of as class-level traceable patterns, i.e. structures similar to design patterns but which, unlike the design patterns of Gamma et al. [11], can be mechanically recognized and stand at a class abstraction level. A micro pattern is traceable if it can be expressed as a simple formal condition on the attributes, types, name and body of a software module and its components. Currently, there are 27 micro patterns subdivided into eight categories:
- Degenerate State and Behaviour: this category includes micro patterns describing interfaces and classes whose state and behaviour are degenerate. In most cases this means that the interface or class does not define any variable or method;
- Degenerate Behaviour: these micro patterns are related to classes with no methods or with very simple ones;
- Degenerate State: this category is related to classes which have no state (i.e. variables), or whose state is shared with other classes, or which are immutable;
- Controlled Creation: the micro patterns belonging to this category describe special protocols for creating objects;
- Wrappers: this category collects micro patterns dealing with classes which have a single central instance field and methods working on it, so that the main functionalities are delegated to this field;
- Data Managers: these micro patterns are related to classes whose main purpose is to manage the data stored in a set of instance variables;
- Base Classes: the micro patterns belonging to this category describe different ways in which a base class makes preparations for its subclasses;
- Inheritors: the micro patterns in this category correspond to three ways in which a class can use the definitions of its superclass, i.e. abstract method implementation, method overriding and interface enrichment.

Each micro pattern is described in this way: "A class/interface that ...". Therefore, the focus of each micro pattern is on whole single classes or interfaces, whose methods and/or attributes must satisfy the constraints defined by the micro pattern itself. This characteristic represents a strong difference with respect to EDPs, as this latter category of elements presents a lower detail level. In fact, EDPs can be detected by analyzing single method invocations, object creations and inheritance relationships. For a complete discussion about micro patterns, refer elsewhere [12]. Micro patterns are recognized through a prototype developed by Gil and Maman, based on byte code analysis [12], and with the Micro-structures Detector module of MARPLE [4, 10], based on source code analysis, which is the one we used for our analysis in this paper.

IV. METRICS COMPUTATION AND ANTIPATTERNS DETECTION IN MARPLE

We have developed a module to support software architecture reconstruction (SAR) and software evolution activities, called MARPLE SAR, that is devoted to the generation of views about the analyzed systems, the computation of metrics, and the detection of antipatterns and of particularly interesting classes that can be candidates for refactoring or reengineering. The module is structured according to Figure 1. MARPLE SAR presents a canonical three-layer architecture, which follows the extract-abstract-present model described by Tilley [36]. The input layer is formed by two kinds of input, represented by corresponding XML files, which are provided by the Micro-structures Detector module [10, 23].

Figure 1. The architecture of the MARPLE SAR module.

The System structure information input collects the whole set of packages and types, i.e. the classes and interfaces composing the analyzed system. For each type, the set of the defined methods and attributes is also specified. The Micro-structures information collects the whole set of micro-structures that have been detected in a system: for each type, it reports which EDPs and micro patterns are implemented within it.


The structural relationships among classes and the other functionalities are all derived and computed by the elaboration layer, based on the information provided by the two XML files just described (the analysis of the XML inputs is performed using the Apache XMLBeans technology [37]). This layer is composed of four sub-modules:
- The Class structure parser generates an abstracted representation of the types (classes and interfaces, with their corresponding methods and attributes) and packages composing the system, based on the information provided by the input layer.
- The Class core parser analyzes the micro-structures detected from source code (the EDPs) and subsequently identifies the relationships among classes.
- The Metrics computation sub-module computes object-oriented metrics based on the relationships generated by the Class core parser. The metrics that are currently computed are introduced in the next section.
- The Visualization sub-module organizes the whole abstracted information generated by the elaboration layer so that it can be exploited by end users. Packages and types are represented in package or class views as graph nodes, while the relationships among them identified by the Class core parser are depicted as edges connecting them. The views are produced using the functionalities provided by the JGraph libraries [15].

The results are finally presented by the output layer, which provides the set of generated structural views, the metrics computed on the system, and tags related to the identified antipatterns and other interesting or critical classes.

Among the various functionalities provided by MARPLE SAR, we consider in particular the computation of metrics that can be interpreted for quality evaluation, as well as the identification of structural and object-oriented antipatterns, which allows for the identification of critical system components in terms of quality, maintainability and reusability. We now go into further detail, explaining how the EDPs and the micro patterns are actually exploited to achieve the metrics computation and antipatterns detection functionalities. We do not consider in this paper the system-views generation functionality offered by the MARPLE SAR module just described.

V. METRICS COMPUTATION

The core concepts behind MARPLE SAR are the entities composing the system (i.e. packages, classes and interfaces), and the relationships connecting them with one another. While packages, classes and interfaces are derived by the Class structure parser sub-module analyzing the system structure information, the relationships among them are identified by the Class core parser, analyzing the micro-structures information input. Through the analysis of the EDPs given as input it is possible to generate representations of the association, generalization and implementation relationships. Table 1 indicates the EDPs that are exploited in the generation of the available relationships.

The relationships that can be obtained among classes and interfaces are the foundations for the computation of metrics within MARPLE SAR.

TABLE I. EDP USED IN THE RELATIONSHIPS AMONG SYSTEM ENTITIES

Relationship | Object Elements EDPs | Type Relation EDPs | Method Invocation EDPs
Association | Create Object, Retrieve | - | Delegate, Redirect, Delegate in family, Delegate in limited family, Redirect in limited family, Revert method, Extend method
Generalization | - | Inheritance | -
Implementation | - | Inheritance | -

Four main metrics are computed through the MARPLE SAR module based on micro-structures detection. They are:
- Local dependencies: given a type (either a class or an interface), the local dependencies of this type is the number of types this type depends on within the same package. It is obtained by counting the number of associations going out from the subject type to types belonging to the same package;
- Local dependents: given a type, the local dependents of this type is the number of types within the same package that depend on the functionalities provided by the subject type. It is obtained by counting the number of associations coming into the considered type from types belonging to the same package;
- External dependencies: given a type, the external dependencies of this type is the number of types this type depends on, considering the overall system and not the package the subject type is contained in. It is derived by counting the number of associations going out from the type to types belonging to different packages;
- External dependents: given a type, the external dependents of this type is the number of types that depend on the functionalities provided by the subject type but do not belong to the same package. It is computed by counting the number of associations coming into the considered type from types belonging to packages different from the one the subject type is contained in.
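The four counts defined above can be sketched as follows. This is a minimal illustration and not the MARPLE SAR implementation: the type names, the PACKAGE_OF mapping and the ASSOCIATIONS edge set are hypothetical inputs, and counting each distinct partner type once (rather than each individual association) is an assumption.

```python
from collections import namedtuple

# Hypothetical input: each type mapped to its package, plus direct
# association edges written as (source type, target type).
PACKAGE_OF = {
    "a.Shape": "a",
    "a.Circle": "a",
    "a.Canvas": "a",
    "b.Renderer": "b",
}
ASSOCIATIONS = {
    ("a.Circle", "a.Shape"),
    ("a.Canvas", "a.Shape"),
    ("a.Canvas", "b.Renderer"),
    ("b.Renderer", "a.Shape"),
}

TypeMetrics = namedtuple(
    "TypeMetrics",
    "local_dependencies local_dependents external_dependencies external_dependents",
)


def type_metrics(type_name, package_of, associations):
    """Count the direct association partners of one type, split by package."""
    own_package = package_of[type_name]
    # Types the subject type points to (outgoing associations).
    outgoing = {dst for src, dst in associations if src == type_name and dst != type_name}
    # Types pointing to the subject type (incoming associations).
    incoming = {src for src, dst in associations if dst == type_name and src != type_name}
    return TypeMetrics(
        local_dependencies=sum(package_of[t] == own_package for t in outgoing),
        local_dependents=sum(package_of[t] == own_package for t in incoming),
        external_dependencies=sum(package_of[t] != own_package for t in outgoing),
        external_dependents=sum(package_of[t] != own_package for t in incoming),
    )
```

In this toy system, a.Shape is used by two types of its own package and by one type of package b, while a.Canvas depends on one local and one external type.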

Dependencies and dependents can be related to packages as well. In this case, only the global case is considered, as a package can only expose relationships with other (external) packages. Therefore the dependencies of a package are the number of packages the subject package depends on, while the dependents of a package are the number of packages that depend on the functionalities provided by the subject package.

These metrics are used in different approaches and tools for software analysis, such as JDepend [14] and SA4J [34]. We adopt a different interpretation of the local and global concepts with respect to SA4J, as this tool does not allow the user to immediately distinguish intra-package from inter-package relationships, which is instead possible with MARPLE SAR. In fact, SA4J considers all the direct dependencies among classes as local. If class C1 of package P1 declares a reference to class C2 in package P2, C2 is a local dependency of C1, even though they are declared in different packages. On the other hand, all global dependencies are indirect dependencies. If C1 has a reference to C2, and C2 has a reference to C3, then C3 is a global dependency of C1, independently of the packages the three classes are declared in. This aspect does not allow a user to immediately distinguish between classes belonging to a certain package and those declared in different ones.

On the contrary, MARPLE SAR considers only direct dependency relationships, whether in the same package or in different packages. If the relationship is between classes belonging to the same package, then the dependency is said to be local. In the same way, if the relationship involves two entities belonging to different packages, then the relationship is said to be external. For further details and a concrete example of this issue, refer elsewhere [23]. Having a strong and clear distinction between these two kinds of relationships lets users evaluate the cohesion and coupling within single packages, as well as within the overall system.
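The two interpretations can be contrasted on the C1, C2, C3 example from the text. The sketch below is only an illustration of the distinction as described here, not code from SA4J or MARPLE SAR; the package names p1, p2 and p3, the REFERENCES map and all function names are hypothetical.

```python
def package_of(type_name):
    """Package part of a qualified name, e.g. 'p1.C1' -> 'p1'."""
    return type_name.rsplit(".", 1)[0]


def direct_dependencies(type_name, references):
    """Types referenced directly by the given type."""
    return set(references.get(type_name, ()))


def indirect_dependencies(type_name, references):
    """Types reachable only through at least one intermediate type."""
    direct = direct_dependencies(type_name, references)
    seen = set(direct)
    stack = list(direct)
    while stack:
        current = stack.pop()
        for nxt in references.get(current, ()):
            if nxt not in seen and nxt != type_name:
                seen.add(nxt)
                stack.append(nxt)
    return seen - direct


def split_direct_by_package(type_name, references):
    """Direct dependencies split into (same package, other packages)."""
    direct = direct_dependencies(type_name, references)
    local = {t for t in direct if package_of(t) == package_of(type_name)}
    return local, direct - local


# C1 references C2, and C2 references C3 (hypothetical packages).
REFERENCES = {"p1.C1": ["p2.C2"], "p2.C2": ["p3.C3"]}
```

Under the first reading, C2 is a direct (local) dependency of C1 and C3 an indirect (global) one; under the second, only the direct edge is considered and C2 is reported as external because it lives in a different package.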
Another fundamental metric that we compute is Abstractness [25], i.e. the number of abstract classes and interfaces inside a package relative to the total number of types composing it. It can be evaluated by considering the Abstract interface EDP, which indicates that a given type declares an abstract class method or an interface method, and that the declaring type is consequently an abstract class or an interface.

A. Derived Metrics
Starting from the five basic metrics above, five other metrics can be derived, both on classes and on packages. The metric that can be calculated on classes is:
- Belonging [34]: it represents how much a class is being used by its package, and is obtained by dividing the number of local dependencies and dependents by the overall number of dependencies and dependents of the class, considered both at the local and at the external level.

The metrics that can be obtained on packages are:
- Instability [25]: it indicates how much the classes are linked to their package, and it is obtained by dividing the number of external dependencies by the number of external dependencies and external dependents.
- Distance from the main sequence [25]: abstractness (A) and instability (I) are strictly related metrics. Given the graph depicted in Figure 2, two core categories of packages can be identified: those that are totally composed of abstract entities and are stable (represented by the (0, 1) point in the diagram), and those that contain only concrete entities and are completely unstable (represented by the (1, 0) point in the diagram). Obviously, not all of the packages of a system can belong to one of these two positions, as they generally have different degrees of abstractness and instability.
- Bonding [34]: it indicates how well the classes within the package are connected with one another, and can be obtained by dividing the number of local dependencies by the total number of dependencies, both local and external.
- Link density [34]: it indicates the mean number of relationships among classes within the package, giving an indication of how strong these relationships are; it is obtained by dividing the number of local dependencies and dependents by the total number of types contained in the package.

Figure 2. The Main Sequence

Many other metrics could actually be computed following our approach based on micro-structures. However, we decided to focus on this set of metrics for several reasons. Basic object-oriented metrics, such as the number of attributes (NOA) or methods (NOM) of a class [21], generally focus on evaluating the characteristics of single classes, and do not consider the system (or part of it) in its overall structure. To obtain usable information about the structure of a system they need to be combined with other metrics, and adequately analyzed and interpreted [18]. On the other hand, the metrics we consider (the dependencies, dependents and their derived metrics) are by their nature focused on the structure of a system: they are computed on single types or packages, but depend on the entities connected to these types or packages. They therefore provide structured information that is easy to exploit when evaluating the design quality of a system.
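As a rough illustration, the derived metrics reduce to a handful of ratios over the basic counts. The sketch below follows the definitions given in the text and takes the normalized distance from the main sequence as |A + I - 1| [25]; the function names, the way package-level counts are passed in, and the choice to return 0.0 when a denominator is empty are all assumptions made for the example.

```python
def ratio(numerator, denominator):
    """Safe division; returning 0.0 for an empty denominator is an assumption."""
    return numerator / denominator if denominator else 0.0


def belonging(local_deps, local_dependents, external_deps, external_dependents):
    """Class level: share of a class's relationships that stay in its package."""
    local = local_deps + local_dependents
    return ratio(local, local + external_deps + external_dependents)


def instability(external_deps, external_dependents):
    """Package level: external dependencies over all external relationships."""
    return ratio(external_deps, external_deps + external_dependents)


def abstractness(abstract_types, total_types):
    """Package level: abstract classes and interfaces over all types."""
    return ratio(abstract_types, total_types)


def distance_from_main_sequence(a, i):
    """Normalized distance D = |A + I - 1|."""
    return abs(a + i - 1)


def bonding(local_deps, external_deps):
    """Package level: local dependencies over all dependencies."""
    return ratio(local_deps, local_deps + external_deps)


def link_density(local_deps, local_dependents, total_types):
    """Package level: local relationships per type in the package."""
    return ratio(local_deps + local_dependents, total_types)
```

For instance, a hypothetical package with 3 external dependencies, 1 external dependent, and 1 abstract type out of 4 has I = 0.75 and A = 0.25, which places it exactly on the main sequence.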


B. Quality evaluation issues concerning metrics computation
The metrics we have described are well established means to assess the quality of a system in terms of its stability, cohesion, and ease of reuse. As these metrics consider the number of associations related to each single class, and as associations are derived by analyzing the EDPs characterizing each class, we can state that these metrics are derived from EDPs as well, without further inspection of the analyzed system.

Dependencies and dependents can be considered as a first means to understand the complexity of a system. A high quality system must pursue the high cohesion, low coupling principle [20]. The number of dependencies of a class is to be considered as an indication of the level of coupling of each single class. Classes with a high number of dependencies consequently increase the coupling of the system, worsening its overall quality. In the same way, the number of dependencies of a package can be seen as an indication of the cohesion of that package. The fewer dependencies the package has with the rest of the system, the more cohesive the package is, consequently improving the quality of the system. On the other hand, the number of dependents of a class gives an overview of how many classes in the system are affected if the subject class is changed. Dependencies and dependents are also strictly related to the identification of structural antipatterns, as discussed in the next section.

Concerning the other metrics, we can assert the following:
- Abstractness (A): packages with high abstractness values are easily extensible and reusable by other parts or modules of the system.
- Belonging: if the value equals 1, the class is completely used and referenced within its package, as it does not have any external dependencies or dependents.
- Instability (I): this metric is an indicator of the package's resilience to change. A value of zero indicates a completely stable package (as its classes do not refer to classes belonging to other packages, the package is completely self-contained) and a value of one indicates a completely unstable package (as its classes only refer to external types).
- Distance from the main sequence: a package with, for example, A = 0 and I = 0 is highly stable and totally concrete. Such packages are not desirable, as they are rigid: they cannot be extended because they are not abstract, and they are difficult to change because of their stability. Packages with A = 1 and I = 1 are not desirable either, as they are totally abstract but have no dependents, hence the abstractions cannot be extended. A package with A = 0.5 and I = 0.5 is partially extensible and partially stable, so that the extensions are not subject to maximal instability. In this case the stability of the package is usually in balance with its abstractness. In Figure 2, the line connecting the (0, 1) and (1, 0) points represents those packages whose abstractness is balanced with stability. This line is called the main sequence. As it is desirable for packages to be as close as possible to the main sequence, the (normalized) distance from the main sequence has been defined as D = |Abstractness + Instability - 1|. Values for this metric range in the interval [0, 1]. The closer the D value of a package is to zero, the nearer the package is to the main sequence, and hence the better balanced it is. Engineers can therefore focus on those packages with a D value far from zero, as they are the first candidates to be reanalyzed and restructured.
- Bonding: this metric gives an idea of how much a certain class exploits the functionalities provided by the other classes belonging to the same package: if it equals 1, the class is used only by classes of the same package.

VI. ANTIPATTERN DETECTION

We now describe the structural and object-oriented antipatterns we are able to detect through the MARPLE SAR module. We outline the quality issues that can be observed by detecting these antipatterns and the consequent reengineering or refactoring activities to perform.

A. Structural Antipatterns
Given the number of local and external dependencies and dependents, six structural antipatterns [34] can be identified:
- Local breakable: a class with many local dependencies, which has excessive responsibility within the system, and can typically be recognized by the presence of many long methods (even if the local dependencies metric can be used as well). The presence of breakables makes the code very difficult to understand, to maintain, and to reuse;
- Global breakable: a type that is often affected when any other entity within the system is changed, due to the high number of external dependencies it has. Global breakables have to be avoided, as they indicate fragility and lack of modularity in the system;
- Local butterfly: a type that has many local dependents. If a local butterfly is changed, the changes often have an important impact on the rest of the package. Hence, local butterflies are acceptable only as basic system interfaces or utility classes;
- Global butterfly: a type with many global dependents. If a global butterfly is changed, this produces heavy consequences on the rest of the system. Therefore, as in the local case, global butterflies should only be basic system interfaces or utility classes;
- Local hub: a type that has many immediate dependencies and many immediate dependents. It is both a local breakable and a local butterfly at the same time. Local hubs have too many responsibilities within the system, and also serve as utility components. Hubs make the code difficult to understand, to maintain, and to reuse, and they also make the code itself unstable;
- Global hub: a type with many global dependencies and many global dependents. It is both a global breakable and a global butterfly. If a modification within a system occurs, a global hub is often affected. Being a global butterfly, it also affects a significant part of the system if it changes. Global hubs indicate fragility and lack of modularity in the system.

The detection of these structural antipatterns helps engineers to identify areas of design improvement, for example components of a system that are critical in terms of their structure, i.e. in terms of their number of outgoing and incoming relationships with the rest of the system. These complex components are to be considered as the first candidates for a structural refactoring. Re-engineering these entities results in a more stable and self-contained system.

In MARPLE SAR, the minimum number of dependencies/dependents that an entity must present in order to be an instance of one of the structural antipatterns just described is 10. As a concrete example on JHotDraw, Table 2 indicates the classes that are instances of one of the structural antipatterns. In particular, they represent global antipatterns, as only their external relationships reach the threshold of 10. A further improvement of our module will consider the number of dependencies and dependents with respect to the number of types composing the single package or the global system. In fact, it seems sensible to assert that a class with ten local dependencies in a package containing, for example, ten classes is far more critical than a class with the same ten local dependencies in a larger package containing, for example, a hundred or more classes.
Class             External dependencies   External dependents   Global breakable   Global butterfly   Global hub
TextTool          14                      8                     Yes                No                 No
ImageFigure       11                      2                     Yes                No                 No
RectangleFigure   5                       37                    No                 Yes                No
LineConnection    13                      14                    Yes                Yes                Yes
TextFigure        16                      10                    Yes                Yes                Yes
PolyLineFigure    10                      10                    Yes                Yes                Yes
ElbowConnect.     10                      5                     Yes                No                 No

Figure 3. A global breakable class detected in JHotDraw.

B. Object-oriented Antipatterns

Four micro patterns are devoted to the identification of classes whose implementation is far from the object-oriented paradigm. We can regard them as a kind of object-oriented antipattern, and their identification lets engineers focus on these classes in order to solve the issues they present. The four micro patterns representing antipatterns are:

Cobol like: a class with a single static method and one or more static variables, but no instance methods or fields. The programming style represented by this micro pattern is far from object orientation. This micro pattern is mainly found in main classes developed by beginner programmers, although well-established systems and libraries may also present instances of it.

Pool: a class that declares only static final fields and no methods. Pool classes are considered antipatterns as they can generally be implemented as interfaces. This is known as the constant interface antipattern [8].

Pseudo class: a class with no instance fields and no concrete methods; only static fields and abstract methods are allowed. This kind of class constitutes an antipattern because it can be rewritten as an interface, which makes it a good candidate for an easy refactoring.

Record: a class in which all fields are public and no methods are declared (other than constructors and the methods inherited from java.lang.Object). Instances of this micro pattern look very similar to Pascal record types. Such classes run against the encapsulation principle of object orientation, according to which fields should be declared private (or protected) and accessed through appropriate getter and setter methods (as, for example, with data manager classes).
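The four definitions above can be turned into simple predicates over a class's declared members. The following Java sketch uses the standard reflection API and is only an illustration under stated assumptions, not the detection technique used by MARPLE: the class MicroPatternSketch, its methods, and the sample classes are hypothetical. Two extra conditions are assumptions added to keep the predicates disjoint: Pseudo class requires at least one abstract method, and Record requires at least one instance field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reflection-based sketch; not the MARPLE detector. */
class MicroPatternSketch {

    /** Declared fields, skipping compiler-generated (synthetic) ones. */
    static List<Field> fields(Class<?> c) {
        List<Field> out = new ArrayList<>();
        for (Field f : c.getDeclaredFields()) if (!f.isSynthetic()) out.add(f);
        return out;
    }

    /** Declared methods (constructors are not included), skipping synthetic ones. */
    static List<Method> methods(Class<?> c) {
        List<Method> out = new ArrayList<>();
        for (Method m : c.getDeclaredMethods()) if (!m.isSynthetic()) out.add(m);
        return out;
    }

    /** Single static method, one or more static fields, nothing else. */
    static boolean isCobolLike(Class<?> c) {
        List<Method> ms = methods(c);
        List<Field> fs = fields(c);
        if (c.isInterface() || ms.size() != 1 || fs.isEmpty()) return false;
        if (!Modifier.isStatic(ms.get(0).getModifiers())) return false;
        for (Field f : fs) if (!Modifier.isStatic(f.getModifiers())) return false;
        return true;
    }

    /** Only static final fields, no methods. */
    static boolean isPool(Class<?> c) {
        List<Field> fs = fields(c);
        if (c.isInterface() || fs.isEmpty() || !methods(c).isEmpty()) return false;
        for (Field f : fs) {
            int mod = f.getModifiers();
            if (!Modifier.isStatic(mod) || !Modifier.isFinal(mod)) return false;
        }
        return true;
    }

    /** No instance fields, no concrete methods (assumption: at least one abstract method). */
    static boolean isPseudoClass(Class<?> c) {
        List<Method> ms = methods(c);
        if (c.isInterface() || ms.isEmpty()) return false;
        for (Field f : fields(c)) if (!Modifier.isStatic(f.getModifiers())) return false;
        for (Method m : ms) if (!Modifier.isAbstract(m.getModifiers())) return false;
        return true;
    }

    /** All fields public, no declared methods (assumption: at least one instance field). */
    static boolean isRecord(Class<?> c) {
        List<Field> fs = fields(c);
        if (c.isInterface() || fs.isEmpty() || !methods(c).isEmpty()) return false;
        boolean hasInstanceField = false;
        for (Field f : fs) {
            if (!Modifier.isPublic(f.getModifiers())) return false;
            if (!Modifier.isStatic(f.getModifiers())) hasInstanceField = true;
        }
        return hasInstanceField;
    }

    /** Name of the first matching micro pattern, or "none". */
    static String detect(Class<?> c) {
        if (isCobolLike(c)) return "Cobol like";
        if (isPool(c)) return "Pool";
        if (isPseudoClass(c)) return "Pseudo class";
        if (isRecord(c)) return "Record";
        return "none";
    }

    // Illustrative sample classes, one per micro pattern.
    static class CobolLikeSample { static int counter; static void run() { counter++; } }
    static class PoolSample { static final int MAX = 10; static final String NAME = "pool"; }
    abstract static class PseudoSample { static int shared; abstract void act(); }
    static class RecordSample { public int x; public int y; }

    public static void main(String[] args) {
        Class<?>[] samples = {
            CobolLikeSample.class, PoolSample.class, PseudoSample.class, RecordSample.class
        };
        for (Class<?> c : samples) {
            System.out.println(c.getSimpleName() + " -> " + detect(c));
        }
    }
}
```

Working on loaded classes through reflection keeps the sketch short; a static analysis tool would more plausibly inspect source code or bytecode without loading the classes.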
In the next two tables we provide the results of the detection of these four antipatterns in three different releases of Apache Ant and JHotDraw. Table 3 considers three different releases of Apache Ant, namely 1.5.2, 1.6.2 and 1.7.1. As we can notice, the antipatterns codified by the four micro patterns are rarely present inside each implementation. For

TABLE II. SOME SAMPLE STRUCTURAL ANTIPATTERNS INSTANCES

As an example, Figure 3 shows a sample global breakable class identified in JHotDraw 6.0b1. The detection of these structural antipatterns helps engineers identify the components of a system that are critical in terms of their structure, i.e. in terms of their number of outgoing and incoming relationships with the rest of the system.


example, only one instance of the Pseudo class micro pattern can be found, and only in two releases. The most common antipatterns are Cobol like, whose presence increases across the releases, and Record, which increases as well. In any case, their presence does not seem to be critical for the overall system, as only 0.9% to 1.5% of classes present these micro patterns.
                Ant 1.5.2                         Ant 1.6.2                         Ant 1.7.1
Micro pattern   N. of classes   % of the system   N. of classes   % of the system   N. of classes   % of the system
Cobol like      10              1.4%              13              1.4%              17              1.5%
Pool            1               0.1%              2               0.2%              5               0.4%
Pseudo class    0               0%                1               0.1%              1               0.1%
Record          8               1.1%              10              1.1%              10              0.9%

solved in newer versions of the framework. Therefore, as can be noticed, the analysis of micro patterns, and in particular of those denoting object-oriented antipatterns, allows the evaluation and identification of software modifications and/or restructurings, which in some cases may produce side effects like those observed in the evolution from JHotDraw 6.0b1 to release 7.1. The detection of these micro patterns has two main advantages. First, if detected, they can be refactored according to the object-oriented paradigm, in order to fully comply with the rest of the system. Second, their absence can be considered, in some cases, an indication of good system quality, as it suggests that the system has been designed and implemented following object-oriented principles.

VII. CONCLUSION AND FUTURE DEVELOPMENTS

TABLE III. OBJECT-ORIENTED ANTIPATTERNS IN APACHE ANT RELEASES

Table 4 considers three different releases of JHotDraw, namely 5.1, 6.0b1 and 7.1. In the first two releases, micro patterns representing object-oriented antipatterns are very scarce. No instances of Pool can be found, only one instance of Record has been detected, and in only one case did the share of classes featuring an antipattern exceed 1% (namely for JHotDraw 5.1, where two instances of Cobol like have been detected).
                JHotDraw 5.1                          JHotDraw 6.0b1                        JHotDraw 7.1
Micro pattern   Number of classes   % of the system   Number of classes   % of the system   Number of classes   % of the system
Cobol like      2                   1.2%              4                   0.7%              12                  1.7%
Pool            0                   0%                0                   0%                17                  2.4%
Pseudo class    0                   0%                2                   0.4%              1                   0.1%
Record          1                   0.6%              1                   0.2%              15                  2.1%

TABLE IV. OBJECT-ORIENTED ANTIPATTERNS IN JHOTDRAW RELEASES
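The release-by-release comparison of the counts in Table 4 amounts to a simple per-pattern difference, sketched below in Java. The class ReleaseTrendSketch and its delta method are hypothetical names introduced for illustration; the instance counts are those reported in Table 4, and percentages are left out because the total number of classes per release is not restated here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: per-pattern change in instance counts between releases. */
class ReleaseTrendSketch {

    static final String[] PATTERNS = {"Cobol like", "Pool", "Pseudo class", "Record"};

    /** Element-wise difference newer - older; both arrays follow the order of PATTERNS. */
    static int[] delta(int[] older, int[] newer) {
        if (older.length != newer.length) {
            throw new IllegalArgumentException("count arrays must have the same length");
        }
        int[] d = new int[older.length];
        for (int i = 0; i < older.length; i++) {
            d[i] = newer[i] - older[i];
        }
        return d;
    }

    public static void main(String[] args) {
        // Instance counts per release, taken from Table 4 (insertion order is kept).
        Map<String, int[]> releases = new LinkedHashMap<>();
        releases.put("JHotDraw 5.1", new int[] {2, 0, 0, 1});
        releases.put("JHotDraw 6.0b1", new int[] {4, 0, 2, 1});
        releases.put("JHotDraw 7.1", new int[] {12, 17, 1, 15});

        List<String> names = new ArrayList<>(releases.keySet());
        for (int r = 1; r < names.size(); r++) {
            int[] d = delta(releases.get(names.get(r - 1)), releases.get(names.get(r)));
            System.out.println(names.get(r - 1) + " -> " + names.get(r));
            for (int i = 0; i < PATTERNS.length; i++) {
                // Prefix positive changes with '+' so increases stand out.
                String sign = d[i] > 0 ? "+" : "";
                System.out.println("  " + PATTERNS[i] + ": " + sign + d[i]);
            }
        }
    }
}
```

A large positive difference for several patterns at once, as between 6.0b1 and 7.1, is the kind of signal that may point to a restructuring worth inspecting.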

Interesting observations emerge from the results obtained on JHotDraw 7.1. In this case, the presence of object-oriented antipatterns notably increased: 17 instances of Pool have now been detected (compared with 0 instances for both release 5.1 and 6.0b1). Record instances increased from only 1 to 15, covering 2.1% of the entire system. Cobol like instances increased as well, while only Pseudo class decreased, now presenting a single instance. We think that this behavior is related to the fact that JHotDraw underwent an important restructuring from release 6.0b1 to 7.1. The architecture of the system changed heavily, as did the actual implementation, also in terms of the increased size of the system. It will be interesting to check if the issues identified by the considered micro patterns have been

In this paper we have described how micro-structures (in particular EDPs and micro patterns) can be exploited for software quality analysis. Micro patterns support quality evaluation because their detection reveals critical or peculiar classes and object-oriented antipatterns, as in the case of the Cobol like or Pseudo class micro patterns. For some of them, correlations with particular metrics values have been underlined and inspected in the foregoing paragraphs. Through the analysis of EDPs, it is possible to compute a set of common object-oriented and quality metrics, which help engineers analyze the complexity of the system and consequently focus on those components and modules that expose criticalities. This process is also supported by the detection of structural antipatterns, which are represented by types with a high number of dependencies and/or dependents with other entities. In this case, the engineer can analyze those classes whose implementation is not compliant with object-oriented principles, violating encapsulation and limiting the possibility of extension or reuse of the affected classes. Through the metrics computation and antipatterns detection we have described, we provide support for software evolution and maintenance. These two functionalities can also take advantage of the other functionality offered by MARPLE SAR, namely software architecture reconstruction views [4]. Regarding future developments, we are planning to extend the antipatterns and metrics computation functionalities offered by MARPLE. New metrics for reusability and extendibility will be considered in the future. We are implementing and integrating the detection of some antipatterns described by Brown et al. [9]. Currently we are able to detect the Blob, Functional decomposition, Lava flow, Spaghetti code and Swiss Army knife antipatterns.
Moreover, we are extending MARPLE with code smell detection capabilities and support for improving best practices [8]. We have recently experimented with several tools for code smell detection [5]; the detection techniques of these tools are usually based on metrics computation. We think that other information, such as the relations existing between smells or other possible


correlation between smells and micro patterns, could be exploited to improve the current detection techniques of the tools [3]. We are currently interested in performing correlation analysis, through data mining techniques, between antipatterns and micro patterns and between antipatterns and code smells. Moreover, we are interested in analyzing the correlations between some metrics, for example the Instability metric, and some antipatterns or micro patterns for system evolution analysis. Finally, since our tool computes several metrics related to the analysis of local/external dependencies and dependents, together with derived metrics such as instability and distance from the main sequence, we are interested in exploring how cluster detection algorithms proposed for community detection in complex networks can be exploited for dependency analysis. In this last research area, we have recently experimented with a cluster analyzer for dependency graphs in object-oriented systems, comparing the dependency analysis results of our tool against those obtained with other tools such as JDepend or SA4J.

REFERENCES
[1] Adempiere, http://www.adempiere.com/
[2] Analyst4j, http://www.codeswat.com/cswat/index.php?option=com_content&task=view&id=43&Itemid=63
[3] F. Arcelli, M. Zanoni, On investigating code smells correlations, Proceedings of the RefTest 2011 Workshop, Berlin, March 2011.
[4] F. Arcelli, M. Zanoni, A Tool for Design Pattern Detection and Software Architecture Reconstruction, Information Sciences Journal, Elsevier, doi:10.1016/j.ins.2011.
[5] F. Arcelli et al., An experience report on using code smells detection tools, Proceedings of the RefTest 2011 Workshop, Berlin, March 2011.
[6] V. R. Basili, L. C. Briand, W. L. Melo, A Validation of Object-Oriented Design Metrics as Quality Indicators, IEEE Transactions on Software Engineering, 22(10), 1996.
[7] D. Binkley, N. Gold, M. Harman, Z. Li, K. Mahdavi, J. Wegener, Dependence Anti Patterns, 23rd IEEE/ACM International Conference on Automated Software Engineering, 2008.
[8] J. Bloch, Effective Java Programming Language Guide, Addison-Wesley, 1st edition, June 2001.
[9] J. Brown, R. C. Malveau, H. W. McCormick, T. J. Mowbray, AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis, 1st edition, Wiley, 1998.
[10] Software Evolution and Reverse Engineering (ESSeRE) Lab, University of Milano-Bicocca, MARPLE, Micro Structures Structure Detector, http://www.essere.disco.unimib.it/reverse/Marple
[11] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison Wesley, 1994.
[12] Y. Gil, I. Maman, Micro Patterns in Java Code, Proceedings of the 20th annual ACM SIGPLAN conference on Object oriented programming systems languages and applications (OOPSLA 2005), October 2005.
[13] JasperReports, http://jasperforge.org/website/jasperreportswebsite/trunk/index.html?group_id=252
[14] JDepend, http://clarkware.com/software/JDepend.html
[15] JGraph, http://www.jgraph.com/
[16] JHawk, http://www.virtualmachinery.com/jhawkprod.htm
[17] JHotDraw, http://www.jhotdraw.org/
[18] C. Kaner, P. Bond, Software Engineering Metrics: What Do They Measure and How Do We Know?, Proceedings of the 10th International Software Metrics Symposium (METRICS 2004), 2004.
[19] R. A. Khan, K. Mustafa, S. I. Ahson, An Empirical Validation of Object Oriented Design Quality Metrics, Computer and Information Science, Vol. 19, pp. 1-16, 2007.
[20] C. Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development (3rd Edition), Prentice-Hall, October 2004.
[21] M. Lorenz, J. Kidd, Object-oriented software metrics: a practical guide, Prentice-Hall, Inc., 1994.
[22] M. Lanza, R. Marinescu, Object-Oriented Metrics in Practice, Springer-Verlag, 2006.
[23] S. Maggioni, Marple-Metrics and Architecture Reconstruction Plugin for Eclipse, PhD Thesis, University of Milano Bicocca, Italy, 2010.
[24] S. Maggioni, F. Arcelli, Metrics-Based Detection of Micro Patterns, Proceedings of the International Workshop on Emerging Trends in Software Metrics (WETSoM 2010), Cape Town, South Africa, May 2010.
[25] R. C. Martin, OO Design Quality Metrics, An Analysis of Dependencies, Proceedings of the Workshop on Pragmatics and Theoretical Directions in Object-Oriented Software Metrics, OOPSLA 94, ROAD Vol. 2 No. 3, 1995.
[26] Metrics 1.3.6, http://metrics.sourceforge.net/
[27] J. Schalken, H. van Vliet, Measuring where it Matters: Determining starting points for metrics collection, Journal of Systems and Software, Vol. 81 Issue 5, May 2008, pp. 603-615.
[28] J. McC. Smith, D. Stotts, Elemental Design Patterns: A Formal Semantics for Composition of OO Software Architecture, Proceedings of the Software Engineering Workshop, 27th Annual NASA Goddard/IEEE, Dec. 2002.
[29] J. McC. Smith, An Elemental Design Patterns Catalog, Tech. Rep. 02040, Computer Science Department, University of North Carolina at Chapel Hill, December 2002.
[30] iPlazma, http://loose.pt.ro/iplazma/
[31] Rational Software Architect, http://www01.ibm.com/software/awdtools/architect/swarchitect/
[32] Sakai, http://sakaiproject.org/portal
[33] SDMetrics, http://www.sdmetrics.com/LoM.html
[34] Structural Analysis for Java, http://www.alphaworks.ibm.com/tech/sa4j
[35] L. Tahvildari, K. Kontogiannis, A Metric-Based Approach to Enhance Design Quality through Meta-Pattern Transformations, Proceedings of the 7th European Conference on Software Maintenance and Reengineering (CSMR03), 2003.
[36] S. R. Tilley, S. Paul, D. B. Smith, Towards a framework for program understanding, Proceedings of International Workshop on Program Comprehension, IWPC, 1996.
[37] Apache XMLBeans, http://xmlbeans.apache.org/
