
Natl. Acad. Sci. Lett.

(January–February 2017) 40(1):21–27


DOI 10.1007/s40009-016-0472-y

SHORT COMMUNICATION

Package-Restructuring Based on Software Change History


Anshu Parashar1 • Jitender Kumar Chhabra1

Received: 21 June 2015 / Revised: 25 September 2015 / Accepted: 22 March 2016 / Published online: 16 April 2016
© The National Academy of Sciences, India 2016

1 National Institute of Technology, Kurukshetra, Haryana, India
Correspondence: Anshu Parashar, parashar.nitk@gmail.com

Abstract In component-based development of object-oriented software, modularization depends on how packages are structured or organized. Packages evolve over time and require restructuring after some time so that understandability and maintainability improve. In this paper, a new approach for package restructuring, named Change-History based Package-Restructuring (CHPR), is introduced to identify proper package-regrouping possibilities. The CHPR approach considers past co-change patterns of packages along with their structural coupling. A series of experiments on open source software systems has been performed, and preliminary results show that the CHPR approach is able to identify meaningful package restructuring.

Keywords Computer science • Object-oriented software engineering • Restructuring • Change-history

Software systems evolve continuously with increasing complexity, and over time their modularity goes down, which makes them less maintainable. Restructuring can be a solution at that stage. Fowler [1] describes restructuring as the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. Most software development approaches, including object-oriented (OO) ones, use restructuring to improve the structure of the software system, making it more understandable and easier to maintain [2–6]. Large object-oriented software systems comprise many sub-systems, represented as packages. As a result of bad design decisions and inconsistent, hasty and unplanned maintenance activities, the package structure becomes more coupled and requires restructuring. The identification of candidates (packages in the present context) for restructuring is one of the fundamental issues. One of the most common criteria for restructuring is minimization of inter-package coupling, which in turn increases cohesiveness [7, 8]. The coupling among software entities (e.g. packages) can be explored through source code as well as change history. Source code is useful to identify structural coupling among the software entities (i.e. packages), and change history tells which entities (i.e. packages) of the software system are coupled through common changes or co-changes. This co-change pattern of packages is also called change-coupling.

Several researchers have used the concept of clustering for software restructuring to identify the components of legacy systems [9–17]. In recent years, various approaches and tools have been developed to identify software restructuring opportunities based on various sources of information, e.g. structural and historical [18–23]. Cui et al. [17] have used agglomerative hierarchical clustering to identify components of software systems. Geipel [24] shows the correlation between modularity, dependency and change. Further, Geipel and Schweitzer have also investigated the relationship between structural and change dependencies [25]. Their empirical results stress the fact that the past co-change pattern also reflects dependencies or coupling. Earlier, Gall et al. [26, 27] detected logical coupling on the basis of release history. Zimmermann et al. [28] have also utilized change history to detect the coupling and change pattern. It has been found that apart from structural
coupling, change coupling is also vital to identify the coupling pattern. Hence, from the literature, it can be concluded that both structural and change coupling can be used together for identifying the concrete or actual coupling pattern. So, in the present approach, both structural and change coupling are used to minimize the inter-package coupling. The structural dependency is represented as design-coupling and the co-change dependency is represented as change-coupling among packages.

The primary contributions of the paper are:

• A Change-History based Package-Restructuring (CHPR) approach has been proposed that explores the change-history of the software system to recognize package-regrouping opportunities or recommendations.
• It has been investigated how source-code and change-history based coupling information for packages can be aggregated and used in order to improve the package structure. Further, clustering has been applied on the pool of Java packages to restructure the package organization.
• A case study has been done and the restructuring results have been visualized and analyzed.

The CHPR approach has three key phases: (1) extraction of structural and change coupling among the packages, (2) computation of concrete (overall) coupling among the packages based on the extracted structural and change coupling and (3) restructuring of the packages.

In phase one, the source-code of the software system is analyzed to extract the structural coupling among the packages. Secondly, the change (evolution) history is mined to extract package change-coupling based on their past co-change pattern. The source-code and change-history of the underlying software systems are extracted from software repositories. The software repository utilities svnsearch [29–31] and sourceforge [32–34] are used for this purpose. The change-history of packages is explored in terms of the change-commits logged by the development team during the software evolution. These change-commits correspond to change-reports related to packages of the software system. Each change-report maintains a set of co-changed packages. Further, the following types of reports are not used for computation: (1) Package change-reports consisting of a large number of change-commits. Usually such change-reports do not reflect any co-evolution pattern of packages. (2) Reports related to a single-package change-commit. (3) Packages which were deleted during successive development phases. These packages are not available for restructuring because they have already been removed.

After the above cleaning process, design and change couplings of packages are extracted from the source-code and the change-reports respectively. This coupling information works as input to the clustering phase for restructuring. It is also desirable to suitably represent coupling information. The representation of coupling can be either binary or weighted. The binary scheme is commonly employed in software clustering [13–15], and in most cases it is enough to know whether coupling exists or not. In our package-restructuring approach, coupling and extent of coupling among packages are practically the same. So, the binary scheme has been used to indicate the existence of coupling among packages. For the purpose of demonstration, let us consider an example software system S = {A, B, C, D} consisting of four packages. Structural coupling among the packages is represented as the Package Design-Coupling Set (PDCS). Packages are design-coupled if a static dependency exists between them. So, for each package P of software system S, PDCS(P) consists of the set of packages to which package P is structurally coupled as per the source-code of the underlying software system S. For example, PDCS(A) = {B, D} tells that package A is design-coupled with packages B and D. After extracting the PDCS for each package, it is represented as an m × m Design-Coupling Matrix (DCM). Here m is the total number of packages. In DCM(S), each row $R_i$ represents the PDCS of package i with the remaining packages. The design-coupling weights ($wd_{i,j}$) are binary (0 or 1). The value $wd_{i,j} = 1$ indicates that packages i and j are design-coupled and $wd_{i,j} = 0$ indicates no design coupling between i and j.

\[
DCM(S) =
\begin{bmatrix}
wd_{11} & wd_{12} & \cdots & wd_{1m} \\
wd_{21} & wd_{22} & \cdots & wd_{2m} \\
\vdots  & \vdots  & \ddots & \vdots  \\
wd_{m1} & wd_{m2} & \cdots & wd_{mm}
\end{bmatrix}
\]

Apart from design-coupling, change coupling among the packages is also considered. So, the change-history of packages has been analyzed to comprehend how they co-evolved or co-changed. In this approach, packages which co-changed in the past are considered change-coupled and are grouped together during restructuring. So, for each package P, a Package Change-Coupling Set (PChCS) has been constructed. PChCS(P) consists of the set of packages that have changed together with package P in the past according to the change-history of the underlying software system S. For example, consider two change-reports of software system S as shown below.

Changereport1          Changereport2
M/S/src/A/A1.java      M/S/src/A/B1.java
M/S/src/B/B1.java      M/S/src/C/C1.java

As per the above two change-reports, PChCS(A) = {B, C}, showing that package A is change-coupled with packages B and C. Now, change-coupling among the packages of the software system is represented as m × m
Change-Coupling Matrix (ChCM). Here m indicates the total number of packages. Each row $R_i$ of ChCM(S) represents the PChCS of package i with the other packages. The change-coupling weight ($wch_{i,j}$) represents the change-coupling among packages. Again, the $wch_{i,j}$ are binary (0 or 1) weights. The value $wch_{i,j} = 1$ indicates that packages i and j are change-coupled and $wch_{i,j} = 0$ indicates no coupling between packages i and j.

\[
ChCM(S) =
\begin{bmatrix}
wch_{11} & wch_{12} & \cdots & wch_{1m} \\
wch_{21} & wch_{22} & \cdots & wch_{2m} \\
\vdots   & \vdots   & \ddots & \vdots   \\
wch_{m1} & wch_{m2} & \cdots & wch_{mm}
\end{bmatrix}
\]

In phase two, both types of coupling are used together to comprehend the concrete or actual coupling pattern among packages. PDCS and PChCS show design-coupling and change-coupling among packages respectively. These two types of coupling (design and change) individually capture different aspects of coupling and thus are combined to get the concrete or actual coupling pattern of packages, which is a pre-requisite for restructuring. So, for each package P of software system S, a Package Concrete-Coupling Set, i.e. PCCS(P), has been formed that consists of the set of packages that are design-coupled or change-coupled (or both) with package P. That is, PCCS(P) = PDCS(P) ∪ PChCS(P). So, for package A, PCCS(A) = {B, C, D} says that package A is actually coupled with packages B, C and D. Now, the overall or concrete coupling pattern of packages of the software system is represented as an m × m Concrete-Coupling Matrix (CCM) as given below.

\[
CCM(S) =
\begin{bmatrix}
wc_{11} & wc_{12} & \cdots & wc_{1m} \\
wc_{21} & wc_{22} & \cdots & wc_{2m} \\
\vdots  & \vdots  & \ddots & \vdots  \\
wc_{m1} & wc_{m2} & \cdots & wc_{mm}
\end{bmatrix}
\]

Each concrete-coupling weight ($wc_{i,j}$) is calculated as:

\[
wc_{ij} =
\begin{cases}
1 & \text{if } wd_{ij} = 1 \text{ or } wch_{ij} = 1, \quad \forall\, i \le m,\; j \le m \\
0 & \text{if } wd_{ij} = 0 \text{ and } wch_{ij} = 0, \quad \forall\, i \le m,\; j \le m
\end{cases}
\]

The value 1 of $wc_{i,j}$ indicates that packages i and j are either design-coupled or change-coupled or both.

In phase three, after getting the concrete-coupling among packages, clustering has been applied to restructure the package hierarchy. In our approach, every cluster initially represents one package. Before performing the package restructuring, it is desirable to decide the number of packages required as a result of clustering. In the present context, it can be said that each cluster will have a collection of packages that are highly coupled with each other as per their structure and change-coupling patterns. So, to obtain the number of packages, i.e. clusters, we apply the following criterion:

\[
NClusters = NPkgs - \bigl( (NFCCS_1 + NFCCS_2 + \cdots + NFCCS_m) - NFCCS \bigr)
\]

Here, NClusters is the number of clusters required, NPkgs is the number of packages in the software system, $NFCCS_i$ is the number of packages in the i-th Frequently Change-Coupled Set (FCCS), and NFCCS is the total number of Frequently Change-Coupled Sets of packages of the software system.

$NFCCS_i$ and NFCCS are both measured from the Change-Coupling Matrix. $NFCCS_i$ represents the number of frequently changed packages in the i-th FCCS and NFCCS gives the count of frequently changed package sets. After clustering, each package will have sub-packages that are highly cohesive (i.e. connected with each other as per both their design and change coupling patterns). The details of the clustering methodology and the visualization of its results are described through a case study. The goals of the present case study are (a) to apply CHPR for package restructuring, (b) identification of highly change-coupled packages based on change-history and (c) analysis and validation of results. Experiments have been performed on three open source subject systems, namely EMMA [29, 32], JPF [31, 34] and JTRAC [30, 33], all developed using the Java programming language. Details of these systems are given in Table 1.

Both types of coupling among packages of these software systems are extracted by exploring their source code [32–34] and change-history [29–31]. After this, the Design-Coupling Matrix (DCM) and the Change-Coupling Matrix (ChCM) have been formed. Then, both types of coupling have been combined to get the overall coupling pattern among packages, and the Concrete-Coupling Matrix (CCM) has been formed. Finally, in order to perform package restructuring, three Clustering Methods (CM), namely Agglomerative, Repeated Bisection and Graph, have been employed [36]. Clustering is an unsupervised classification of objects based on the similarity or distance between them [35]. The results of clustering are influenced by both the clustering algorithm and the similarity function employed. So, for each clustering method, two types of Similarity Functions (SF) are used, i.e. Correlation Coefficient and Cosine. Each unique combination of CM and SF is defined as a Clustering Strategy (CS). So, a thorough study has been done by evaluating the outcome of six clustering strategies applied on the three software systems chosen for the case study. These clustering strategies (CS) are CS-1 (CM-Agglomerative, SF-Cosine), CS-2 (CM-Agglomerative, SF-Correlation coefficient), CS-3 (CM-Repeated Bisection, SF-Cosine), CS-4 (CM-Repeated Bisection, SF-Correlation coefficient), CS-5 (CM-Graph,

Table 1 Software systems used in case study

Name of software system   Number of packages   Number of change reports studied   Evolution period studied   Domain of software system
EMMA                      20                   300                                2004–2006                  Java Code Coverage Tool
JPF                       14                   467                                2004–2009                  Java Plugin Framework
JTRAC                     19                   1241                               2006–2008                  Java Issue Tracking System
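The first two phases described in the text can be illustrated on the four-package example system S = {A, B, C, D} (not on the systems of Table 1). The following is a minimal sketch, not the authors' tooling: the function names, the `max_paths` cut-off for "large" change-reports, and the PDCS rows other than PDCS(A) are assumptions introduced here for illustration only.

```python
from itertools import permutations
from typing import Dict, Iterable, List, Set

Matrix = List[List[int]]


def package_of(path: str) -> str:
    """Return the package directory of a path such as 'M/S/src/A/A1.java'."""
    return path.split("/")[-2]


def change_coupling_sets(
    reports: Iterable[List[str]],
    live_packages: Set[str],
    max_paths: int = 30,  # hypothetical cut-off for "large" change-reports
) -> Dict[str, Set[str]]:
    """Build PChCS(P) for every live package from a list of change-reports."""
    pchcs: Dict[str, Set[str]] = {p: set() for p in live_packages}
    for paths in reports:
        if len(paths) > max_paths:  # rule 1: skip very large reports
            continue
        # rule 3: ignore packages that no longer exist
        touched = {package_of(p) for p in paths} & live_packages
        if len(touched) < 2:  # rule 2: skip single-package reports
            continue
        for p, q in permutations(touched, 2):
            pchcs[p].add(q)
    return pchcs


def to_matrix(packages: List[str], sets: Dict[str, Set[str]]) -> Matrix:
    """Binary m x m matrix: entry [i][j] is 1 when package j is in the set of package i."""
    return [[1 if q in sets.get(p, set()) else 0 for q in packages] for p in packages]


def combine(dcm: Matrix, chcm: Matrix) -> Matrix:
    """Element-wise OR: wc[i][j] = 1 if wd[i][j] = 1 or wch[i][j] = 1, else 0."""
    return [[d | c for d, c in zip(drow, crow)] for drow, crow in zip(dcm, chcm)]


packages = ["A", "B", "C", "D"]
reports = [
    ["M/S/src/A/A1.java", "M/S/src/B/B1.java"],  # Changereport1
    ["M/S/src/A/B1.java", "M/S/src/C/C1.java"],  # Changereport2
]
pchcs = change_coupling_sets(reports, set(packages))

# PDCS(A) = {B, D} comes from the text; the other rows are assumed here.
pdcs = {"A": {"B", "D"}, "B": {"A"}, "C": set(), "D": {"A"}}

dcm = to_matrix(packages, pdcs)
chcm = to_matrix(packages, pchcs)
ccm = combine(dcm, chcm)

print(sorted(pchcs["A"]))  # ['B', 'C']
print(ccm[0])              # [0, 1, 1, 1], i.e. PCCS(A) = {B, C, D}
```

Row A of the combined matrix marks B, C and D, which agrees with PCCS(A) = PDCS(A) ∪ PChCS(A) = {B, C, D} as stated in the text.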

CS-1 [0={builder, filter, editor}, 1={maventesting}, 2={ctrl}, 3={test, properties}, 4={preference}, 5={merge, instr}, 6={ant}, 7={data}, 8={joblistener, report, action, decorator}, 9={rt, html}, 10={util}, 11={swtcomponents}]
CS-2 [0={builder}, 1={ctrl}, 2={maventesting}, 3={test, properties}, 4={merge}, 5={rt, html}, 6={data}, 7={joblistener, decorator, util, filter, report}, 8={preference, ant}, 9={instr}, 10={report}, 11={editor, swtcomponents}]
CS-3 [0={maventesting}, 1={ctrl}, 2={builder, filter, editor}, 3={preference}, 4={merge, instr}, 5={test, properties}, 6={rt, ant}, 7={data}, 8={joblistener, action, report, decorator}, 9={html}, 10={swtcomponents}, 11={util}]
CS-4 [0={ctrl}, 1={preference}, 2={merge, instr}, 3={maventesting}, 4={properties}, 5={test}, 6={builder, filter, editor}, 7={data}, 8={joblistener, decorator, report, util, action}, 9={swtcomponents}, 10={html}, 11={ant, rt}]
CS-5 [0={util}, 1={maventesting}, 2={report, action, joblistener}, 3={filter, builder, editor}, 4={properties, test}, 5={merge}, 6={html}, 7={ctrl}, 8={rt, data}, 9={swtcomponents}, 10={instr, preference, ant}, 11={decorator}]
CS-6 [0={util, decorator}, 1={ant, instr, rt, preference}, 2={action, report, joblistener}, 3={properties, test}, 4={html}, 5={filter, builder}, 6={maventesting}, 7={swtcomponents}, 8={editor}, 9={data}, 10={ctrl}, 11={merge}]

Fig. 1 Package re-structuring of EMMA based on six clustering strategies (NClusters = 12)
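Fig. 1 is drawn for a fixed number of clusters, and each clustering strategy pairs a clustering method with a similarity function. The sketch below shows the cluster-count criterion from the text together with the textbook definitions of the Cosine and Correlation Coefficient similarities applied to rows of a binary coupling matrix. It is an illustration under stated assumptions: the FCCS are taken to be disjoint sets, the example sets and rows are hypothetical, and the internal computations of the GCLUTO tool may differ from these plain definitions.

```python
import math
from typing import List, Sequence, Set


def n_clusters(n_pkgs: int, fccs: List[Set[str]]) -> int:
    """NClusters = NPkgs - ((NFCCS_1 + ... + NFCCS_m) - NFCCS)."""
    total_members = sum(len(s) for s in fccs)    # NFCCS_1 + ... + NFCCS_m
    return n_pkgs - (total_members - len(fccs))  # len(fccs) plays the role of NFCCS


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two coupling rows (0.0 if either row is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def correlation(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation, computed as the cosine of the mean-centred rows."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


# Hypothetical system with 10 packages and two frequently change-coupled sets.
fccs_example = [{"p1", "p2", "p3"}, {"p4", "p5"}]
print(n_clusters(10, fccs_example))  # 7

# Two hypothetical rows of a binary concrete-coupling matrix.
row_i = [1, 1, 0, 0]
row_j = [1, 0, 1, 0]
print(cosine(row_i, row_j))       # approximately 0.5
print(correlation(row_i, row_j))  # 0.0
```

Each frequently change-coupled set of k packages collapses into one cluster and therefore reduces the count by k - 1, which is what the criterion sums up. The two example rows also show why the choice of similarity function matters: they look moderately similar under Cosine but uncorrelated under the Correlation Coefficient.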

SF-Cosine) and CS-6 (CM-Graph, SF-Correlation coefficient). To automate the clustering strategies and visualize the results, the GCLUTO [36] clustering tool has been utilized. The Mountain visualization method has been used to visualize the package restructuring results. Through this, we view the package distribution among the different clusters/packages that have been formed as a result of our package-restructuring approach. The resultant package structures of the EMMA, JPF and JTRAC software systems have been represented as 3D terrain visualizations, as shown in Figs. 1, 2 and 3 respectively. Each figure describes the clustering strategy (CS), the resultant clusters (numbered) and their elements (i.e. packages). Mountain visualization is based on the relative similarity of clusters, their size, internal similarity and internal deviation. The shape of each peak indicates the estimated distribution of packages in each cluster, and the peak shape is a Gaussian curve. The height and volume of each peak are proportional to the cluster's internal similarity and the number of elements contained within the cluster, respectively. The color of a peak is proportional to the cluster's internal deviation. Red indicates low deviation between cluster elements; e.g. in Fig. 1, according to CS-1, the packages in cluster 8, i.e. joblistener, report, action and decorator, are close to each other. Blue indicates high deviation among cluster elements. In all other areas, the color is determined by blending to create a smooth transition. Only the color at the tip of a peak is significant [36].

To validate the clustering results, Precision, Recall and F-measure have been computed between the reference (expected)

CS-1 [0={plugin, model, util}, 1={demo}, 2={codecolorer, core}, 3={dbbrowser, template}, 4={jtds, mysql}, 5={minilib}, 6={toolbox, pluginbrowser, findreplace}]
CS-2 [0={plugin, model, util}, 1={jtds, mysql}, 2={demo}, 3={codecolorer, core}, 4={dbbrowser, template}, 5={toolbox, pluginbrowser}, 6={findreplace, minilib}]
CS-3 [0={plugin, model, util}, 1={demo}, 2={jtds, mysql}, 3={dbbrowser, template}, 4={codecolorer, core}, 5={toolbox, pluginbrowser}, 6={findreplace, minilib}]
CS-4 [0={plugin, model, util}, 1={demo}, 2={jtds, mysql}, 3={dbbrowser, template}, 4={codecolorer, core}, 5={toolbox, pluginbrowser}, 6={findreplace, minilib}]
CS-5 [0={template}, 1={core, jtds, findreplace, minilib}, 2={dbbrowser, toolbox, core}, 3={pluginbrowser}, 4={plugin, model, util}, 5={codecolorer}, 6={demo}]
CS-6 [0={plugin, model, util}, 1={jtds, mysql, core}, 2={pluginbrowser, toolbox, findreplace, minilib}, 3={template}, 4={dbbrowser}, 5={codecolorer}, 6={demo}]

Fig. 2 Package re-structuring of JPF based on six clustering strategies (NClusters = 07)
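The text validates each clustering against a reference package structure using Precision, Recall and F-measure, but does not spell out how these are computed for clusterings. One common choice, assumed here, is the pairwise formulation: a pair of packages counts as a hit when both the reference and the predicted structure place the two packages in the same cluster. The sketch below uses small hypothetical clusterings rather than the clusters of Fig. 2, since the reference structure is not listed in the paper.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Clustering = Dict[int, Set[str]]


def co_clustered_pairs(clustering: Clustering) -> Set[FrozenSet[str]]:
    """All unordered package pairs that share a cluster."""
    pairs: Set[FrozenSet[str]] = set()
    for members in clustering.values():
        for a, b in combinations(sorted(members), 2):
            pairs.add(frozenset((a, b)))
    return pairs


def pairwise_prf(reference: Clustering, predicted: Clustering) -> Tuple[float, float, float]:
    """Pairwise Precision, Recall and F-measure of predicted against reference."""
    ref_pairs = co_clustered_pairs(reference)
    pred_pairs = co_clustered_pairs(predicted)
    hits = len(ref_pairs & pred_pairs)
    precision = hits / len(pred_pairs) if pred_pairs else 0.0
    recall = hits / len(ref_pairs) if ref_pairs else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical reference (expected) and predicted (after restructuring) structures.
reference = {0: {"a", "b", "c"}, 1: {"d", "e"}}
predicted = {0: {"a", "b"}, 1: {"c"}, 2: {"d", "e"}}

precision, recall, f_measure = pairwise_prf(reference, predicted)
print(precision, recall, round(f_measure, 3))
```

In this toy case every predicted pair is also a reference pair (precision 1.0), but only two of the four reference pairs are recovered (recall 0.5). Other formulations, for example matching whole clusters, would give different numbers, so the values reported in the paper should not be assumed to follow this exact definition.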

and predicted (after restructuring) structure of packages. The reference package structure has been formed carefully in consultation with experts by analyzing the design and frequent co-change coupling among the packages. For all three subject systems, the results of all clustering strategies were compared with the reference set. The average values of Precision, Recall and F-Measure were found to be 0.78, 0.72 and 0.74 respectively. This indicates that the restructuring results largely agree with the expected results. Further, for the subject systems, the Agglomerative and Graph clustering strategies produced the best restructuring results, with average Precision, Recall and F-Measure ranging from 0.80 to 0.88. Furthermore, in order to evaluate the performance of restructuring, the values of intra-package and inter-package coupling among the packages before and after restructuring are compared. For all three software systems, after restructuring, average intra-package coupling increases by 61 % and average inter-package coupling decreases by 52 %. An increase in intra-package coupling and a decrease in inter-package coupling lead to improved cohesion and in turn improved quality. Hence, these results clearly indicate significant improvement in the package structure of the software systems.

Some recommendations for package re-grouping (based on the frequent co-change pattern of the packages), derived from the consolidated results of all six clustering strategies applied on the EMMA, JPF and JTRAC software systems, are recorded below.

• EMMA Assessment of the clustering results of EMMA indicates that packages properties and test should be restructured as one package. Both are frequently change coupled (confidence = 72 %). Similarly, packages filter, builder and editor should also be in the same package. These packages are also frequently coupled (confidence = 89 %).
• JPF Assessment of the clustering results of JPF indicates that packages plugin, util and model are to be restructured as one package. These packages are frequently coupled (confidence = 89 %). Packages jtds and mysql are frequently coupled (confidence = 80 %) and should be in one package. Further, packages toolbox and pluginbrowser could also be restructured in the same package (confidence = 75 %).
• JTRAC Assessment of the clustering results of JTRAC showed that packages mylyn, editor and ui are frequently coupled (confidence = 75 %) and should

CS-1 [0={maven}, 1={mylyn, ui, editor}, 2={exception, selenium}, 3={wicket, domain}, 4={web, tag}, 5={search}, 6={config, mail, acegi}, 7={yui}, 8={lucene, watij}, 9={util}, 10={hibernate}]
CS-2 [0={hibernate, config, mail, acegi}, 1={mylyn, ui, editor}, 2={domain}, 3={maven}, 4={lucene, watij}, 5={search}, 6={yui}, 7={tag, web}, 8={wicket}, 9={util}, 10={exception, selenium}]
CS-3 [0={maven}, 1={mylyn, ui, editor}, 2={search}, 3={yui}, 4={tag, web}, 5={domain}, 6={config, mail, acegi}, 7={lucene, watij}, 8={exception, selenium}, 9={hibernate, util}, 10={wicket, domain}]
CS-4 [0={maven}, 1={mylyn, ui, editor}, 2={search}, 3={tag, web}, 4={yui}, 5={domain}, 6={wicket}, 7={config, mail, acegi}, 8={hibernate, util}, 9={exception, selenium}, 10={lucene, watij}]
CS-5 [0={domain}, 1={yui}, 2={search, exception}, 3={hibernate}, 4={tag, web}, 5={mylyn, ui, editor}, 6={acegi, config, mail}, 7={maven}, 8={lucene, watij}, 9={util}, 10={wicket, selenium}]
CS-6 [0={mylyn, ui, editor, hibernate}, 1={domain}, 2={lucene, watij}, 3={mail, config, acegi}, 4={tag, web}, 5={util}, 6={maven}, 7={yui}, 8={search}, 9={wicket, selenium}, 10={exception}]

Fig. 3 Package re-structuring of JTRAC based on six clustering strategies (NClusters = 11)
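The re-grouping recommendations attach a confidence value to each group of frequently coupled packages, but the paper does not define how that value is computed. A plausible reading, assumed here, is the association-rule confidence used in version-history mining such as Zimmermann et al. [28]: among the change-reports that touch the antecedent packages, the fraction that also touch the consequent packages. The change-reports below are hypothetical and are not taken from the EMMA, JPF or JTRAC histories.

```python
from typing import List, Set


def confidence(reports: List[Set[str]], antecedent: Set[str], consequent: Set[str]) -> float:
    """Share of reports containing `antecedent` that also contain `consequent`."""
    with_antecedent = [r for r in reports if antecedent <= r]
    if not with_antecedent:
        return 0.0  # the antecedent never changed, so no evidence either way
    with_both = [r for r in with_antecedent if consequent <= r]
    return len(with_both) / len(with_antecedent)


# Hypothetical change-reports, each reduced to the set of packages it touched.
reports = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "D"},
    {"B", "C"},
    {"A", "B"},
]

print(confidence(reports, {"A"}, {"B"}))  # 0.75
print(confidence(reports, {"C"}, {"B"}))  # 1.0
print(confidence(reports, {"B"}, {"C"}))  # 0.5
```

Under this definition confidence is directional: C always changes together with B, but B changes with C only half of the time. A symmetric variant (for example, reports touching all packages of a group divided by reports touching any of them) would be an equally reasonable assumption for groups of three packages.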

be restructured in the same package. Further, packages config, mail and acegi should also be restructured in the same package (confidence = 71 %).

Apart from restructuring, these recommendations will also help the maintainer to understand change-coupling among the software entities (e.g. classes, packages). In order to verify the correctness of the restructuring recommendations, four experts having good experience in software development and research were consulted. They were asked to carefully explore the past co-change pattern among the packages and also to evaluate the results and rank them on a scale of 1–10. Their average rating is 7.2, which indicates that the adopted approach has significant potential to identify restructuring opportunities.

Once a system becomes stable, subsequent maintenance is carried out under tight time constraints and design principles are not completely followed. As a result, after some time, the package structure becomes more coupled as well as difficult to comprehend. At this stage restructuring of packages becomes desirable, so that future maintenance can be carried out more smoothly. At the micro-level, instead of moving packages, the classes which are more change coupled can be moved to other packages. While maintaining the software system, the present approach can be very useful, as (a) CHPR helps the maintainer to restructure the packages on the basis of the design and change dependencies, (b) PChCS and ChCM help the maintainer to identify change dependencies among the packages based on their past co-change pattern; such dependencies might not be reflected by the source code, (c) PCCS and CCM help the maintainer to explore the consolidated view of the design and change coupling pattern among the packages, (d) FCCS helps the maintainer to identify the frequently change coupled packages and (e) the maintainer can also explore the package-regrouping recommendations to identify the probable candidates (i.e. packages) for restructuring based on their past change-history.

Hence, it can be concluded that, for any particular software system, change coupling among the packages needs to be considered along with their structural coupling for knowing the actual coupling pattern among packages. The obtained results are promising and illustrate that the Change-History based Package-Restructuring approach opens the road towards optimum package-restructuring and
recommendations technique to provide more accurate suggestions for package-regrouping.

References

1. Fowler M, Beck K, Brant J, Opdyke W, Roberts D (1999) Refactoring: improving the design of existing code. Addison-Wesley, Boston
2. Czibula I, Serban G (2006) Improving systems design using a clustering approach. Int J Comput Sci Netw Secur 6:40–49
3. Serban G, Czibula I (2008) Object-oriented software systems restructuring through clustering. In: Proceedings of international conference on artificial intelligence and soft computing (ICAISC), pp 693–704
4. Brown WJ, Malveau RC, Hays W, McCormick I, Mowbray TJ (1998) AntiPatterns: refactoring software, architectures, and projects in crisis. Wiley, New York
5. Demeyer S, Ducasse S, Nierstrasz OM (2003) Object-oriented reengineering patterns. Morgan and Kaufmann, Los Altos
6. Mens T, Tourwe T (2004) A survey of software refactoring. IEEE Trans Softw Eng 30:126–139
7. Anquetil N, Laval J (2011) Legacy software restructuring: analyzing a concrete case. In: IEEE European conference on software maintenance and reengineering (CSMR), pp 279–286
8. Zanetti MS, Tessone CJ, Scholtes I, Schweitzer F (2014) Automated software remodularization based on move refactoring: a complex systems approach. In: Proceedings of modularity, pp 73–84
9. Maqbool O, Babri HA (2007) Hierarchical clustering for software architecture recovery. IEEE Trans Softw Eng 33:759–780
10. Shtern M, Tzerpos V (2009) Methods for selecting and improving software clustering algorithms. In: Proceedings of IEEE international conference on program comprehension, pp 248–252
11. Canfora G, Penta MD (2007) New frontiers of reverse engineering. In: Proceedings of future of software engineering, pp 326–341
12. Quinlan D, Yi Q, Kumfert G, Epperly T, Dahlgren T, Schordan M, White B (2005) Toward the automated generation of components from existing source code. In: Proceedings of the 2nd workshop on productivity and performance in high-end computing, pp 12–19
13. Wiggerts TA (1997) Using clustering algorithms in legacy systems remodularization. In: Proceedings of the 4th working conference on reverse engineering, pp 33–43
14. Mitchell BS, Mancoridis S (2006) On the automatic modularization of software systems using the bunch tool. IEEE Trans Softw Eng 32:193–208
15. Maqbool O, Babri HA (2004) The weighted combined algorithm: a linkage algorithm for software clustering. In: Proceedings of software maintenance and reengineering, pp 15–24
16. Mengn FC, Zhan DC, Xu XF (2005) Business component identification of enterprise information system: a hierarchical clustering method. In: Proceedings of the IEEE international conference on e-business engineering, pp 473–480
17. Cui JF, Chae HS (2011) Applying agglomerative hierarchical clustering algorithms to component identification for legacy systems. Inf Softw Technol 53:601–614
18. Seng O, Bauer M, Biehl M, Pache G (2005) Search-based improvement of subsystem decompositions. In: Proceedings of GECCO, pp 1045–1051
19. Abdeen H, Ducasse S, Sahraoui HA, Alloui I (2009) Automatic package coupling and cycle minimization. In: Proceedings of working conference on reverse engineering, pp 103–112
20. Corazza A, Martino SD, Scanniello G (2010) A probabilistic based approach towards software system clustering. In: Proceedings of conference on software maintenance and reengineering, pp 88–96
21. Bavota G, Lucia AD, Marcus A, Oliveto R (2013) Using structural and semantic measures to improve software modularization. Empir Softw Eng 18:901–932
22. Patel C, Hamou-Lhadj A, Rilling J (2009) Software clustering using dynamic analysis and static dependencies. In: Proceedings of the software maintenance and reengineering, pp 27–36
23. Ying ATT, Murphy GC, Ng R, Chu-Carroll MC (2004) Predicting source code changes by mining revision history. IEEE Trans Softw Eng 30:574–586
24. Geipel MM (2012) Modularity, dependence and change. Adv Complex Syst 15:6
25. Geipel MM, Schweitzer F (2012) The link between dependency and co-change: empirical evidence. IEEE Trans Softw Eng 38:1432–1444
26. Gall H, Jazayeri M, Krajewski J (2003) CVS release history data for detecting logical couplings. In: Proceeding of international workshop on principles of software evolution, pp 13–23
27. Gall H, Fluri B, Pinzger M (2009) Change analysis with evolizer and changedistiller. IEEE Softw 26:26–33
28. Zimmermann T, Weißgerber P, Diehl S, Zeller A (2005) Mining version histories to guide software changes. IEEE Trans Softw Eng 31:429–445
29. EMMA change history. http://svnsearch.org/svnsearch/repos/EMMA/search
30. JTRAC change history. http://svnsearch.org/svnsearch/repos/JTRAC/search
31. JPF change history. http://svnsearch.org/svnsearch/repos/JPF/search
32. EMMA. http://emma.sourceforge.net/
33. JTRAC. http://jtrac.info/
34. JPF. http://jpf.sourceforge.net/
35. Han J (2005) Data mining: concepts and techniques. Morgan Kaufmann Publishers, San Francisco
36. GCLUTO. http://glaros.dtc.umn.edu/gkhome/cluto/gcluto/overview