
Reasons of Code Cloning: An Investigation

1st Manu Singh
Computer Science and Engineering
Krishna Engineering College (AKTU)
Ghaziabad, India
manu.singh@krishnacollege.ac.in

2nd Kriti Priya Gupta
SCMS
Symbiosis International University (SIU)
Noida, India
kriti.gupta@scmsnoida.ac.in

3rd Vidushi Sharma
School of ICT
Gautam Buddha University
Greater Noida, India
vidushi@gbu.ac.in

4th Y.N. Prajapati
Computer Science and Engineering
Diwan Institute of Management Studies
Meerut, India
Ynp1581@gmail.com

Abstract—Code cloning is a common practice adopted by software developers. Although the practice is not generally harmful, too much code cloning can harm software quality. The present study examines software developers’ perspectives to understand their reasons for using code clones. The reasons for using code clones are categorized into three categories, namely ‘technical’, ‘organizational’ and ‘personal’. The study uses the survey method to obtain primary data pertaining to the perceptions of software developers working with various software development companies in NOIDA. The data obtained is statistically analyzed using one-sample t-tests and one-way ANOVA. The findings indicate that software developers generally do code cloning because of ‘technical’ and ‘organizational’ reasons.

Index Terms—Code cloning, Software developers.

I. INTRODUCTION

In the modern world, every business depends on software. In the light of this reality, software quality is one of the most crucial aspects for a software development company. One of the factors that affect software quality is code cloning. Code cloning refers to copying or modifying a block of code in a software program and is the most elementary technique of software reuse. Clones fall into various categories, and high-level cloning is extensively used by software developers for various reasons (Singh M. et al., 2013). High-level clones are said to be composed of multiple simple clones (Oumaziz M., 2020). Code cloning may cause long-term risks to software quality. Baker (1995) has considered cloning to be harmful to the quality of the source code. If a piece of code containing bugs is cloned, then the bugs are also cloned, thereby deteriorating the software quality. Although the impact of code cloning has been extensively studied by various researchers, there is a lack of research on understanding the opinions of software developers regarding the issue. Since software developers play a key role in the design, installation, testing and maintenance of software, it is important to understand their reasons for code cloning. This paper attempts to understand software developers’ perspectives on the reasons for code cloning in relation to various dimensions of software quality and maintainability. The remainder of the paper is organized as follows: Section 2 reviews prior research on the effects of code cloning. The research methodology is discussed in Section 3, and the data analysis and results are presented in Section 4. Section 5 concludes the paper, and Section 6 discusses the limitations and implications of the research.

II. LITERATURE REVIEW

Software engineering researchers have mixed arguments regarding the impact of code cloning on software quality. Cloning can happen at several levels of abstraction (Singh M. et al., 2017). Code cloning can unnecessarily increase code size. Code clones that are not well understood can introduce new bugs in the software system. Maintenance cost and effort can increase when problems or bugs must be fixed multiple times. Models for estimating software costs are summarized by Keim et al. (2014). Cloning code can lead to idle, or unused, code in the system (Singh M. et al., 2015). Some researchers argue in favour of code cloning and conclude that clones are not harmful (Kim et al., 2005; Krinke, 2007; Aversano, 2007; Hotta et al., 2010; Saha et al., 2010). Instead, clones can be helpful from different perspectives (Kapser et al., 2008). On the other hand, some researchers have shown that clones have negative effects on software quality, as cloning increases maintenance costs (Lozano and Wermelinger, 2008; Juergens et al., 2009; Lozano and Wermelinger, 2010). Although the effects of code cloning have been widely studied by various researchers, there is a dearth of work on systematically understanding the perspectives of software developers as to why they do code cloning while developing a software system (Zhang et al., 2012) and what effect code cloning has on software quality and maintainability. In a few studies, the developers’ intentions behind cloning practices have been identified as development strategies, maintenance benefits, language limitations, and developers’ capabilities (Kim et al., 2004; Kapser and Godfrey, 2006; Roy and Cordy, 2007). Zhang et al. (2012), in their industrial study, have investigated the reasons for cloning practices from technical, personal, and organizational perspectives.
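The maintenance concern raised in the literature, that a bug in cloned code must be found and fixed once per copy, can be illustrated with a small sketch. Everything in it (function names, the bug, the data) is hypothetical and invented for illustration; it is not taken from any of the studies cited here.

```python
# Hypothetical example: every name below is invented for illustration.

def average_order_value(orders):
    """Original block, containing a bug: it divides by a fixed count."""
    total = sum(orders)
    return total / 10  # bug: should divide by len(orders)


def average_refund_value(refunds):
    """Copy-and-modify clone of the block above; the bug is copied too."""
    total = sum(refunds)
    return total / 10  # same bug, now present in a second place


def mean_value(values):
    """Single shared helper: a fix made here reaches every caller."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


if __name__ == "__main__":
    sample = [20.0, 40.0]
    print(average_order_value(sample))   # buggy original
    print(average_refund_value(sample))  # buggy clone
    print(mean_value(sample))            # corrected shared version
```

Fixing the two cloned functions requires two separate edits, whereas the shared helper needs only one. The same example also shows why cloning is attractive: the clone was produced with almost no effort.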

III. RESEARCH METHODOLOGY


The study involves a descriptive research design wherein the survey method is used to obtain primary data pertaining to the perceptions of software developers regarding code cloning. Software developers who have worked for at least one year with software companies in NOIDA have been selected as the target respondents of the study. The questionnaire has been developed on the basis of the extant literature and discussions with a few academicians and practitioners. The questionnaire consists of questions related to the demographic profile (gender, age, language familiarity, experience) of the respondents and the reasons why they do code cloning. The reasons for code cloning are grouped into three categories: Technical reasons (which originate from the nature of the technical problems to be solved); Personal reasons (which are related to the developers’ skills, habits, experience etc.); and Organizational reasons (which are related to the management strategies of project teams in the developer’s organization). Measures for all the questions (except demographic variables) use a five-point Likert-type response format (1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree). The questionnaires were distributed to 250 randomly selected software developers, out of which 233 questionnaires were received. After removing unviable responses (incomplete responses, selection of more than one answer, unanswered), 204 usable responses were retained as the sample.

Fig. 1. Sample Profile

Fig. 2. Table 2
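The coding of the Likert anchors and the screening of unviable responses described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the item names, the input format (each item mapped to the list of options a respondent ticked) and the function names are hypothetical, and the study's actual data preparation is not documented beyond the description above.

```python
# Illustrative sketch; item names and the answer format are assumptions.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def code_response(raw_answers, items):
    """Return {item: score} for a usable response, or None if it is unviable.

    raw_answers maps an item name to the list of options the respondent
    ticked. A response is treated as unviable if any expected item is
    missing, unanswered, ticked more than once, or not a known anchor.
    """
    coded = {}
    for item in items:
        ticked = raw_answers.get(item, [])
        if len(ticked) != 1:
            return None
        label = ticked[0].strip().lower()
        if label not in LIKERT:
            return None
        coded[item] = LIKERT[label]
    return coded


def usable_responses(all_responses, items):
    """Keep only the responses that pass the screening, coded as 1 to 5."""
    coded = (code_response(r, items) for r in all_responses)
    return [c for c in coded if c is not None]
```

Applied to the 233 returned questionnaires, a screening of this kind would leave the usable subset; the paper reports 204 such responses.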

IV. DATA ANALYSIS AND INTERPRETATION

This section presents the analysis of the data collected through questionnaires and discusses the findings based on that analysis. The collected data is analyzed using SPSS 17.0.

A. Sample Profile

Table 1 depicts the profile of the respondents who participated in the survey. As can be observed, 54.5% of the participants in the study are male and 45.5% are female, which can be considered a good representation of both genders in the sample. The majority of the respondents are in the age groups of 25-30 years and 30-35 years, indicating that the software developers belong to the young and middle-aged groups. Developers with less work experience (1-3 years) have a smaller representation in the sample (8.4%); the respondents generally have more than 3 years of experience. As far as the language familiarity of software developers is concerned, three programming languages have been considered, namely C, Java and C++, and the sample has an equally good representation of respondents knowing each of the three languages.

B. Reasons for Code Cloning

Table 2 highlights the mean ratings for the reasons why software developers do code cloning.

To test the significance of the roles of the three reasons (technical, organizational, and personal) in motivating software developers to do code cloning, the following hypotheses have been formulated:

H1a: Software developers do code cloning due to technical reasons.
H1b: Software developers do code cloning due to organizational reasons.
H1c: Software developers do code cloning due to personal reasons.

The above hypotheses have been tested using one-sample t-tests, and the results obtained are exhibited in Table 3.

Fig. 3. Reasons of Code Cloning - Results of One-sample t-tests
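The two tests used in this section can be sketched in a few lines. The sketch below computes the one-sample t statistic for a set of Likert ratings and the one-way ANOVA F statistic for several groups of ratings. The study itself used SPSS 17.0 and does not report the value against which the means were tested; the neutral midpoint 3 used as the default here is an assumption, and the function names are hypothetical.

```python
import math
import statistics


def one_sample_t(ratings, test_value=3.0):
    """Return (t, df) for H0: population mean == test_value.

    test_value=3.0 (the neutral Likert midpoint) is an assumed default.
    """
    n = len(ratings)
    mean = statistics.fmean(ratings)
    sd = statistics.stdev(ratings)  # sample standard deviation (n - 1)
    t = (mean - test_value) / (sd / math.sqrt(n))
    return t, n - 1


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of rating groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual ratings around their own group mean.
    ss_within = sum(
        (x - statistics.fmean(g)) ** 2 for g in groups for x in g
    )
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

Converting t or F into a p-value requires the corresponding distribution, which the Python standard library does not provide. If SciPy is available, `scipy.stats.ttest_1samp` and `scipy.stats.f_oneway` return both the statistic and the p-value. Both functions above fail with a division by zero when the ratings show no variation at all.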
As can be noticed from Table 3, all three reasons are found to play a significant role in motivating software developers to do code cloning (p < .05). Further, to test whether these reasons play an equally important role in code cloning or whether one reason is preferred over the others, the following hypothesis has been formulated:

H2: The three reasons (technical, organizational, and personal) differ significantly in motivating software developers to do code cloning.

The above hypothesis is tested using one-way ANOVA and the results are summarized in Table 4. The results indicate that there are significant differences among the three reasons in influencing software developers to clone code (p < .05; F = 742.387). Further, from Table 4, it can be observed that technical reasons (mean score = 4.73) are most important, followed by organizational reasons (mean score = 4.61) and personal reasons (mean score = 4.14), in motivating software developers to do code cloning. Hence it can be concluded that although software developers do code cloning because of all three types of reasons (technical, organizational and personal), they are more influenced by technical reasons than by the other ones.

Fig. 4. Reasons of Code Cloning - Results of ANOVA

Furthermore, it can also be noticed from Table 4 that software developers generally use code clones because this provides a ready-made solution for their problems (mean score = 4.92). Reusing existing solutions in the form of code clones is considered preferable to developing new solutions altogether. Also, using code clones is less risky (mean score = 4.83), as cloned code has already been tested and is therefore more stable. Developing new code requires testing and involves stability issues. As far as the organizational reasons are concerned, time limitation (mean score = 4.80) is an important reason for using code clones. Because of time limitation, software developers are pressured to deliver a working product within a specified time limit and do not have time to restructure the system for better design; hence they go for code cloning. Although the personal reasons also play a significant role in code cloning, their role is not as important as that of technical and organizational reasons. Sometimes, software developers intentionally do code cloning just to explore a particular program with which they are not familiar. Code cloning helps them fulfil their task by modifying the source code of existing programs.

C. Conclusion

In the present study, we have analyzed software developers’ perceptions regarding code cloning. Based on the findings, we conclude that software developers generally do code cloning because of ‘technical reasons’, as code clones are available to them as a convenient solution for their problems and using code clones is less risky than developing new code altogether. One of the ‘organizational reasons’, i.e. time limitation, is also found to be a motive behind using code clones. Software developers are often pressured to complete the project by the stated deadline and hence find code clones to be a better way of completing the project in time.

D. Limitations and Scope for Further Research

In this research, only a few reasons that may be responsible for code cloning are explored. These reasons may vary with the software developers’ skill sets, working experience, age and other demographic variables. Thus further research needs to include more relevant variables in order to gain deeper insight into the factors that can influence developers to clone code. Future research can employ a more rigorous research methodology and examine the effect of code cloning on software quality and maintainability using statistical analysis.

REFERENCES

[1] Aversano, L., Cerulo, L., & Penta, M.D. (2007). How clones are maintained: An empirical study. Proc. 11th European Conference on Software Maintenance and Reengineering (CSMR’2007), 81-90. doi:10.1109/CSMR.2007.26
[2] Baker, B.S. (1995). On finding duplication and near-duplication in large software systems. Proceedings of the Second Working Conference on Reverse Engineering (WCRE ’95). Washington, DC, USA: IEEE Computer Society, 86-95.
[3] Bettenburg et al. (2009). An empirical study on inconsistent changes to code clones at release level. Proceedings of the 2009 16th Working Conference on Reverse Engineering. Washington, DC, USA: IEEE Computer Society, 85-94.
[4] Hotta et al. (2010). Is Duplicate Code More Frequently Modified than Non-duplicate Code in Software Evolution?: An Empirical Study on Open Source Software. International Workshop on Principles of Software Evolution (IWPSE), 73-82.
[5] Juergens et al. (2009). Do code clones matter? Proceedings of the 31st International Conference on Software Engineering. Washington, DC, USA: IEEE Computer Society, 485-495.
[6] Kapser, C.J., & Godfrey, M.W. (December 2008). "Cloning considered harmful" considered harmful: patterns of cloning in software. Empirical Software Engineering, 13(6), 645-692.
[7] Kim et al. (2005). An empirical study of code clone genealogies. Proc. European Software Engineering Conf. and Foundations of Software Engineering (ESEC/FSE), 187-196.
[8] Keim, Y., Bhardwaj, M., Saroop, S., Tandon, A. (2014). Software cost estimation models and techniques: A survey. International Journal of Engineering Research and Technology, 3(2), 1763-1768.
[9] Krinke, J. (2007). A study of consistent and inconsistent changes to code clones. Proceedings of the 14th Working Conference on Reverse Engineering. Washington, DC, USA: IEEE Computer Society, 170-178.
[10] Lozano, A., & Wermelinger, M. (2008). Assessing the effect of clones on changeability. Proceedings of the 24th IEEE International Conference on Software Maintenance, 227-236.
[11] Lozano, A., & Wermelinger, M. (2010). Tracking clones imprint. Proc. Int’l Workshop on Software Clones (IWSC), 65-72.
[12] Oumaziz, M. (2020). Cloning beyond source code: a study of the practices in API documentation and infrastructure as code. Software Engineering [cs.SE]. Université de Bordeaux, 2020. English. NNT: 2020BORD0007. tel-02879899.
[13] Roy, C., & Cordy, J. (2007). A survey on software clone detection research. Technical Report No. 2007-541, School of Computing, Queen’s University at Kingston, Ontario, Canada, 115.
[14] Saha et al. (2010). Evaluating code clone genealogies at release level: An empirical study. Proceedings of Source Code Analysis and Manipulation (SCAM), 2010, 87-96.
[15] Singh, M., & Sharma, V. (2013). "High Level Clones Classification". International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume-2, Issue-6.
[16] Singh, M., & Sharma, V. (2017). "ASCII based Sequential Multiple Pattern Matching Algorithm for High Level Cloning". International Journal of Advanced Computer Science and Applications (IJACSA), 8(6), 2017. http://dx.doi.org/10.14569/IJACSA.2017.080635
[17] Singh, M., & Sharma, V. (2015). "Detection of file level clone for high level cloning". Procedia Computer Science, vol. 57, 915-922.
[18] Zhang et al. (2012). Cloning practices: Why developers clone and what can be changed. ICSM 2012, IEEE International Conference on Software Maintenance, 285-29
