
Information and Software Technology 132 (2021) 106448


Big Data analytics in Agile software development: A systematic mapping study

Katarzyna Biesialska a,∗, Xavier Franch a, Victor Muntés-Mulero b

a Universitat Politècnica de Catalunya, Barcelona, Spain
b Beawre Digital S.L., Barcelona, Spain

ARTICLE INFO

Keywords: Agile software development; Software analytics; Data analytics; Machine learning; Artificial intelligence; Literature review

ABSTRACT

Context: Over the last decade, Agile methods have changed the software development process in an unparalleled way, and with the increasing popularity of Big Data, optimizing development cycles through data analytics is becoming a commodity.

Objective: Although a myriad of research exists on software analytics as well as on Agile software development (ASD) practice itself, there is no systematic overview of the research done on ASD from a data analytics perspective. Therefore, the objective of this work is to make progress by linking ASD with Big Data analytics (BDA).

Method: As the primary method for finding relevant literature on the topic, we performed manual search and snowballing on papers published between 2011 and 2019.

Results: In total, 88 primary studies were selected and analyzed. Our results show that BDA is employed throughout the whole ASD lifecycle. They reveal that data-driven software development focuses on the following areas: code repository analytics, defect/bug fixing, testing, project management analytics, and application usage analytics.

Conclusions: As BDA and ASD are fast-developing areas, improving the productivity of software development teams is one of the most important objectives BDA faces in industry. This study provides scholars with information about the state of software analytics research and the current trends and applications in the business environment. Practitioners, in turn, should be able to use this literature review to better understand how to obtain actionable insights from their software artifacts and which aspects of data analytics to focus on when investing in such initiatives.

1. Introduction

Over the last decade, Agile methods have changed the software development process in an unparalleled way. As opposed to traditional, plan-driven models of software development (e.g. the waterfall model), where processes are organized in a series of sequentially ordered stages, Agile software development (ASD) entails collaborative development with swift, incremental iterations. As a result, adaptability to frequently changing requirements and a strong emphasis on delivering value to customers represent the crux of ASD and have driven its wide acceptance among software practitioners in recent years. Furthermore, this paradigm shift from plan-driven software development processes to ASD accorded with social and technological advances.

When it comes to applying data-driven approaches to improve the work of software teams, software development practice has been in the midst of a major upheaval for some years now [1,2]. Plan-driven software development approaches follow, as the name suggests, a pre-defined plan. Being reactive in nature, plan-driven software development processes are less susceptible to immediate feedback than ASD; hence, their need for timely information is seemingly lower. Nevertheless, both plan-driven software development and ASD can benefit from applying BDA. However, their needs with respect to analytics may differ due to the defining characteristics of each development process. In our study, we show that BDA applied in ASD is often tailored to specific features of ASD, such as regression testing, feedback analysis, or iteration planning, among others. Consequently, with the increasing popularity of Big Data, which over the years has successfully entered the realms of computer science and business alike, software practitioners can now efficiently use it to improve software development processes [3]. Improving the

∗ Corresponding author.
E-mail address: katarzyna.biesialska@upc.edu (K. Biesialska).

https://doi.org/10.1016/j.infsof.2020.106448
Received 13 May 2020; Received in revised form 22 August 2020; Accepted 29 September 2020
Available online 13 October 2020
0950-5849/© 2020 Elsevier B.V. All rights reserved.

productivity of software development teams is one of the most important objectives every software company faces. Optimizing development cycles through data analytics can not only streamline the company's day-to-day business operations but also help the business venture overtake its competitors in the longer term. A data-driven approach may enhance decision-making processes in the company by providing insights, for instance, on the following topics: estimating whether a project is on track with its budget and timeline, predicting the number of tasks that can be accomplished within one cycle, prioritizing software feature releases based on software usage metrics, or advising on the best team composition for a particular project.

A myriad of research exists on software analytics, software metrics, as well as on ASD practice. Hence, there is a vast number of systematic literature studies summarizing the research published on each of those topics alone (e.g. [4–9]). Systematic literature reviews in ASD focus mainly on topics such as global software engineering, usability, and management-oriented approaches [7], and, at the same time, refrain from discussing data-driven approaches to Agile development methods. Therefore, our objective in this work is to make significant progress by linking ASD with data analytics. In this paper, we examine studies on BDA in the context of ASD. By means of a systematic mapping study, we aim at a comprehensive understanding of the state of research on data-driven ASD. Therefore, we first examine studies concerning the topic. After performing a thematic analysis, we develop a classification framework explaining the different ways in which software development teams, and organizations in general, adopt BDA to improve ASD.

The remainder of this paper is structured as follows: Section 2 introduces the topic of data analysis within the realm of ASD. In Section 3, we explain our approach to conducting a systematic literature study; namely, we provide a detailed description of search methods, inclusion and exclusion criteria, and study design. We also compare our mapping study with previous literature reviews on this topic. Section 4 describes our results as well as the topology and classification framework proposed in this paper. Section 5 provides a discussion of the results and the limitations of the study. Finally, Section 6 concludes the paper.

2. Background

2.1. Big data analytics

Although the phenomenon of Big Data is not new – it emerged in computer science in the late 1990s – the term itself still causes some confusion, as precise boundaries for Big Data are hard to define [10]. In general, Big Data is often described along three dimensions, the so-called three Vs: volume (large datasets), velocity (high-frequency data influx), and variety (data from heterogeneous data sources). These three characteristics are frequently extended with more factors, such as veracity (data accuracy and reliability), variability, value (the potential of data to support decisions), or visualization (visual representation of data). Therefore, the whole concept of Big Data can be understood as a process of gathering, storing, analyzing, and extracting knowledge from high-volume and/or complex data, often by means of Machine Learning (ML)/Artificial Intelligence (AI) algorithms. In that context, the term Big Data Analytics (BDA) can be defined as a sub-step in the Big Data process, focused on gaining insights through advanced analytics techniques — "Big data analytics is the process of using analysis algorithms running on powerful supporting platforms to uncover potentials concealed in big data, such as hidden patterns or unknown correlations" [11]. Therefore, BDA "must effectively mine massive datasets at different levels in real-time or near real-time — including modeling, visualization, prediction, and optimization" [11]. Business analytics has for years been viewed from three complementary perspectives, i.e. descriptive, predictive, and prescriptive [10,12]. The most recent addition to this group is adaptive analytics, which, instead of only analyzing the past, responds in real-time to the actions of a user. By keeping humans in the loop, this type of analytics is the embodiment of real-time decision making.

In the software engineering domain, BDA is closely related to the Mining Software Repositories (MSR) field. In particular, MSR analyzes data from various software repositories (e.g. version control systems, issue tracking systems) to gain insights related to the development of software systems and the evolution of software projects. As such, MSR is a subset of BDA. Our systematic mapping study features a wealth of articles that fall under the MSR category (e.g. [S4,S20,S81]). However, many selected primary studies also leverage datasets that cannot be classified as software repositories sensu stricto (e.g. organization data), and for that reason we prefer to use the BDA term in our research.

2.2. Agile software development

ASD is said to be the answer to multiple limitations of plan-driven approaches [9,13,14]. Specifically, frequent changes of the system's requirements, close collaboration between developers and their stakeholders, and iterative development delivered in increments are the cornerstone of ASD. The main phases of the ASD lifecycle (SDLC) are the concept, inception, construction, release, production, and retirement phases [15,16]. The first phase, the concept phase, is the stage where a project is envisioned; it can be considered the planning phase. The inception phase entails assembling the team, finding a project sponsor, securing the funding, and framing initial requirements; it resembles the analysis phase. The iteration/construction phase is about working products, hence the following activities take place in this phase: development of the software, requirements definition, and the customer feedback loop; this is the design and implementation phase. In the release phase, testing plays an important role; this is the testing and integration phase. The production phase is related to the maintenance of the software in a production environment; this is the maintenance phase. Finally, the retirement phase relates to software decommissioning.

ASD is an iterative process, where each iteration (known as a sprint in Scrum) is expected to create and deliver a working piece of software as soon as possible. Therefore, a typical iteration encompasses requirement, design and development, testing, delivery, and assessment stages, which form a loop, as numerous iterations might be needed to release a working product or a new feature. Notwithstanding the breaking of software development work into iterations/sprints, the aforementioned ASD phases may vary, also because Agile methods were initially created with small organizations in mind. With time, however, their large-scale variants have been adopted by large organizations [17,18], e.g. the Scaled Agile Framework (SAFe) [19], Large-Scale Scrum (LeSS) [20], Scrum-of-Scrums, Nexus, Scrum at Scale, or Disciplined Agile Delivery (DAD) [21]. Large organizations face a challenge related to coordination and communication between several development teams or different organizational units, often globally distributed [17]. Therefore, data analytics in software engineering can increase the ability of organizations to understand how to improve their software development processes.

3. Research method

A systematic mapping (SM) study (also referred to as a scoping study), similar to a systematic literature review (SLR), serves as a summary of scientific articles published on a particular research topic; however, while an SLR sets out to answer specific research questions, an SM study charts a broader research area at a coarser granularity. In light of this difference, the reason why we decided to perform an SM study of the topic, rather than an SLR, is twofold. First, when performing an initial analysis of our research domain, and before starting an actual systematic review, we identified the involved domain as a broad research topic, for which an SM study seemed to be a more appropriate way of summarizing academic findings than an SLR. Second, in this article, we intend to classify evidence and summarize the current state-of-the-art research related to applying data analytics to improve ASD practice. Our goal is not to answer very specific questions related to the said domain, but rather to provide a general overview.


Fig. 1. A protocol outlining stages of this SM study.

Table 1
Research questions and their sub-questions answered in this evaluation.

RQ1: What studies discuss BDA and ASD?
    RQ1.1: When were the studies published?
    RQ1.2: Where, and in what types of venues, have the studies been published?
    RQ1.3: What is the geographic distribution of the studies and their authors?
    RQ1.4: How are the selected studies distributed between industry and academia?
    RQ1.5: What companies are discussed in the studies and/or what are their authors' affiliations?
    RQ1.6: What is the distribution of the studies with regard to the research results?

RQ2: What are the approaches to using BDA in ASD?
    RQ2.1: What types of analytics have been used in the ASD domain?
    RQ2.2: What sources of data have been used?
    RQ2.3: What methods, models, or techniques have been utilized in the studies?

RQ3: In what ways is BDA used in ASD processes?
    RQ3.1: Where is BDA employed throughout the ASD lifecycle? What are the main research topics and sub-topics?
    RQ3.2: What Agile practices, techniques, or engineering practices have been used in the studies?
    RQ3.3: Do the approaches reported in the literature discuss tools? Have the tools been applied in practice?

3.1. Protocol development

In this work, we followed a protocol devised exclusively for the purpose of this SM study, inspired by the work of Petersen et al. [22]. Similar to other scholars, e.g. [23–25], our review protocol was divided into separate, yet collectively exhaustive steps, as outlined in Fig. 1. Below we present only a brief overview of the steps undertaken in the course of conducting this SM study. We started off by defining our research questions (outlined in Section 3.2) and a set of research papers that we identified as major contributions to the domain in recent years, published in leading journals and conferences in our research area (our search method is discussed in Section 3.4). As BDA and ASD are fast-developing areas, we assumed that papers published between 2011 and 2019 had cited all seminal pieces on the topic. This assumption is in accordance with the findings of Meidan et al. [26], who, in their SM study, showed that the number of papers discussing Agile and Lean development processes almost doubled after 2010. In the same vein, Laanti et al. [27] argue that scientific and quantitative studies on Agile methods were still rare in 2011. Therefore, we based our search for primary studies on manual search in the selected top-tier publication venues, and then performed both backward and forward snowballing from the selected papers. For papers identified as relevant through the first iteration of backward and forward snowballing, we reviewed the references appearing in those papers. Next, we applied a second iteration of the forward and backward snowballing technique, which turned out to be the final one, as no new study was found after that (i.e. during the third iteration of the method) [28,29].

3.2. Research questions

In order to gain a comprehensive understanding of how BDA is used in Agile development, the objective of this study was broken down into three main RQs, listed in Table 1 and explained below.

RQ1: What studies discuss BDA and ASD? RQ1 and its sub-questions provide information on the demographics and the general type of papers selected in this mapping study. Based on that, we were able to aggregate research papers with respect to quantity and present the distribution of publications over time (RQ1.1), by venue (RQ1.2), or by location (RQ1.3) to determine trends. Furthermore, we mapped the frequencies of industrial versus academic publications as well as authors' affiliations (RQ1.4) and the companies that took part in the case studies discussed in the selected primary studies (RQ1.5). In RQ1.6, we classified papers with respect to research results (i.e. the kind of contribution). This type of analysis was a stepping stone to the next two RQs and allowed us to identify research gaps in the field.

RQ2: What are the approaches to using BDA in ASD? The rationale behind the second RQ (including its sub-questions) is to classify BDA approaches according to the methods used. By approaches we understand the types of analytics used in the ASD domain (such as descriptive or predictive) [10], according to which we group studies in RQ2.1. Moreover, RQ2 helps with understanding what type of data triggers the adoption of BDA in ASD: in RQ2.2 we discuss what data sources the selected studies leveraged, whereas RQ2.3 provides information on what kinds of methods and techniques are utilized in BDA for ASD.

RQ3: In what ways is BDA used in ASD processes? By answering RQ3, we were able to determine how BDA aids ASD. Concretely, in RQ3.1, we determined which stages of the ASD lifecycle are most frequently supported by data analytics. In order to make our results more informative to a broad audience without specialized experience in any given sub-field, we decided to standardize our findings (where possible) by using the Software Engineering Body of Knowledge (SWEBOK) [30] sub-topics (RQ3.1). Such a thematic classification gave us confidence that: (1) we did not miss any important software engineering sub-field, and (2) our work can be compared with studies conducted by other researchers, as SWEBOK is widely accepted in the software engineering community. In RQ3.2, we utilized a classification somewhat similar to the one used in VersionOne's Annual State of Agile Report [31], analyzing primary studies from three perspectives: Agile practices, Agile techniques, and Agile engineering practices. Furthermore, the last RQ sheds light on the applicability of the concepts discussed in the selected papers. RQ3.3 answers the question of whether the papers proposed tools and, if so, which concepts have been applied in practice.

3.3. Study design

First, in Section 3.5, we investigated whether any previously published secondary study discussed our topic of interest extensively. We concluded that no such study existed and we started work on our


SM study. Our work analyzed papers published in 2019 and earlier, but not before 2011. Only central journals and conferences in our research area were evaluated in our work. The reason for limiting the selection to certain types of publications, such as those that appeared in selected top-tier venues, is that such venues need to publish the most relevant, high-quality studies, because otherwise they would not be able to sustain their ranking positions and reputation for quality. Therefore, our initial question was: Is the paper published in a venue where relevant papers are published? To answer this question, we relied on our domain expertise. It appears that the selection of journals and conferences proposed in our paper is well aligned with the one presented by Dingsøyr et al. [32] in their summary of a decade of ASD practice. Nevertheless, we would like to highlight that, before deciding which journals, conferences, and workshops to include, we performed a broad search by topic across different publishing venues to find as many relevant journals, conferences, and workshops on the topic as possible. Overall, we reviewed publications from 11 journals, 12 conferences, and 7 workshop proceedings. However, only 8 journals, 10 conferences, and 5 workshops published works that we finally selected in our study (see Appendix). No new venue emerged while applying the snowballing technique to find all relevant studies. The list of venues excluded due to the lack of relevant primary studies is available in the Appendix.

3.4. Search strategy

Before starting work on our literature survey, we first had to investigate whether any previously published secondary study discussed the topic of interest and, if so, whether such a paper formulated research questions (RQs) similar to ours. Therefore, in order to eliminate the risk of duplicating existing literature review studies, we decided to search within a vast and diverse range of sources using an automatic search technique. Hence, our approach to finding existing literature reviews differed from the main one used to find primary papers for our SM study. The main reason why we decided to deviate from our main approach was the need to cover all existing literature surveys in the involved domain, including those published in less reputable outlets. To perform an exhaustive search for surveys, we used the following digital libraries: Thomson Reuters's Web of Science, Elsevier's Scopus, IEEE Xplore, ACM Digital Library, Microsoft Academic, Google Scholar, and Semantic Scholar (see Appendix for the results of the search).

Furthermore, in order to have the best chance of finding review papers similar to ours, we also decided to cast a wide net with respect to the strings used in our search query, which we describe in detail below. Specifically, for analytics, a recurrent keyword in terms such as software analytics, data analytics, and big data analytics (among others), we used an asterisk (*) to widen the search. Following Kitchenham and Charters's [33] advice on constructing a search string, we defined ours as follows: (survey OR review OR "mapping study" OR "systematic map" OR "state?of?the?art" OR "meta?analysis") AND ("software engineering" OR "software development" OR "agile *development") AND ("*analytics" OR "data-driven" OR "big data"). As usual, the search string later needed to be adapted to conform to the requirements of the selected digital libraries. For instance, not all electronic databases supported the question mark (?), matching exactly one character, or the asterisk (*), matching zero or more characters. Hence, in such cases, we had to add the missing characters or words where applicable.

The main source of noise in our search results was related to various applications of Agile methods to improve data analytics processes. After eliminating irrelevant studies, two literature reviews satisfied our conditions (discussed in Section 3.5). Neither of them addressed the research topic extensively enough; hence, we concluded that our SM study could fill this gap.

Next, according to the next step in our protocol (presented in Section 3.1), we used a hybrid search technique to find a set of relevant publications. The detailed description of this stage is presented below:

Fig. 2. A primary study selection process.

Step 0. Studies from selected venues: A basic search for all potential candidates. This means taking into account all papers published within the selected publishing venues within the timeframe specified in our mapping study. A manual search technique was used.

Step 1. Title and abstract: Papers are limited to those that, according to their title and abstract, seem to be appropriate for the mapping study.

Step 2. Skimming: A quick read of the papers reveals which studies are not relevant.

Step 3. Full text: A full-text read reveals precisely whether the already selected papers meet the criteria.

After finding a set of relevant primary studies through manual search, we performed both backward and forward snowballing from the selected papers.

Step 4. 1st round of snowballing: We reviewed the references appearing in those papers; the first round of forward and backward snowballing.

Step 5. 2nd round of snowballing: For papers identified as relevant through the first iteration of snowballing, a second round of forward and backward snowballing was performed. After the second iteration of the forward and backward snowballing technique, no new study was found. Hence, the last iteration that resulted in new studies being included in our work was the second iteration of snowballing.

Fig. 2 shows the number of papers selected after each of the stages described above. The full list of papers selected in this mapping study is presented in the Primary Studies reference section at the end of this paper. For the sake of clarity, the selected primary studies are italicized and marked as S[i], where S denotes a primary study in our SM and i indicates its ordinal number.

From the study selection phase, we can conclude that we have certainly not included all works published in the involved domain. However, the snowballing method is regarded as comparable, in terms of the quality of its results, to other methods used in systematic studies [34]. Furthermore, building on Kitchenham and Charters's [33] argument that quality assessment of primary studies is not necessary for an SM study, we decided not to use quality assessment as one of our criteria for the study selection.
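The per-library adaptation of the search string described above can be illustrated with a small sketch: for a digital library that supports neither the `?` wildcard (exactly one character) nor `*` (zero or more characters), each wildcard is expanded into an explicit OR-group of phrase variants. The substitution lists below are our own illustrative assumptions, not the authors' actual tooling.

```python
# Sketch: expanding the wildcard search string for a digital library
# without '?' / '*' support. Substitution lists are illustrative only.
from itertools import product

Q_SUBS = ["-", " "]            # plausible stand-ins for '?' (one character)
STAR_SUBS = ["", "software "]  # plausible stand-ins for '*' (zero or more)

def expand(term: str) -> list[str]:
    """Expand every '?' and '*' in a term into all variant phrases."""
    parts, buf = [], ""
    for ch in term:
        if ch in "?*":
            parts.append([buf])                          # literal prefix so far
            parts.append(Q_SUBS if ch == "?" else STAR_SUBS)
            buf = ""
        else:
            buf += ch
    parts.append([buf])                                  # trailing literal
    return ["".join(combo) for combo in product(*parts)]

def or_group(terms: list[str]) -> str:
    """Join all expanded variants of a term list into one quoted OR group."""
    phrases = [p for t in terms for p in expand(t)]
    return "(" + " OR ".join(f'"{p}"' for p in phrases) + ")"

# The three AND-ed groups of the search string from Section 3.4.
query = " AND ".join([
    or_group(["survey", "review", "mapping study", "systematic map",
              "state?of?the?art", "meta?analysis"]),
    or_group(["software engineering", "software development",
              "agile *development"]),
    or_group(["*analytics", "data-driven", "big data"]),
])
```

For example, `expand("state?of?the?art")` yields both "state-of-the-art" and "state of the art" among its variants, mirroring the manual character substitutions the authors describe.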
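The iterative snowballing in Steps 4–5 amounts to computing a fixpoint over the citation graph: starting from the manually selected set, every referenced (backward) or citing (forward) paper that passes the screening is added, and the process repeats until an iteration contributes nothing new. A minimal sketch, where `references`, `cited_by`, and `is_relevant` are hypothetical stand-ins for the citation data and the manual screening described above, not tooling used by the authors:

```python
# Sketch: backward + forward snowballing to a fixpoint (Steps 4-5).
def snowball(seed, references, cited_by, is_relevant):
    """Iterate snowballing rounds until no new relevant study appears."""
    selected = set(seed)
    frontier = set(seed)
    while frontier:
        candidates = set()
        for paper in frontier:
            candidates |= references.get(paper, set())  # backward snowballing
            candidates |= cited_by.get(paper, set())    # forward snowballing
        # keep only new papers that pass title/abstract/full-text screening
        frontier = {p for p in candidates - selected if is_relevant(p)}
        selected |= frontier
    return selected
```

In the authors' procedure the loop terminated after the second round: the third iteration found no new study, which is exactly the condition of the frontier becoming empty.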


3.5. Comparison to previous studies

Although a quick look at the literature on software analytics reveals a number of short articles, summaries of panel discussions, or editorials discussing the topic [35–37], none of these can be considered an extensive literature review of the research area. Therefore, in the course of preparing this SM, we examined a body of literature in order to find secondary or tertiary studies covering our topic of interest. A detailed explanation of our search process can be found in Section 3.4.

Although neither of the two papers discussed in this section is entirely focused on ASD, we identified them as the closest to the research topic; hence, a brief description of those works is presented below.

The first secondary study, identified as similar to ours in our search for similar works, is by Abdellatif et al. [5]. The authors carried out an SLR in which they selected and analyzed 19 primary studies (out of 135) from the period of January 2000 to December 2014. Those papers were used to identify the software practitioners benefiting most from available software analytics studies, to understand which software engineering domains are covered by software analytics studies, and to explain which artifacts are consumed by analytics and whether they are linked together somehow. Abdellatif et al. [5] tried to answer the following research questions: (1) "Which software practitioners does the available SA research target?", (2) "Which domains are covered by SA studies?", (3) "Which software artifacts are extracted?", (4) "If different artifacts are used, are they linked together?"

The authors of the SLR found that 90% of all studies targeted developers (47% developers only, 21% developers and project managers, 11% developers, testers, and project managers, and 11% developers, portfolio, and project managers). Studies focused only on project managers constituted 10%, and this was the only group identified as being targeted on its own apart from developers. The review by Abdellatif et al. [5] showed that most available SA studies fell into one of the following domains: maintainability and reverse engineering (58%), team collaboration and dashboards (21%), incident management and defect prediction (11%), software analytics platforms (5%), and software effort estimation (5%). Furthermore, the researchers found that 47% of the studies used only one artifact as a data source (predominantly source code), while 53% of the works used more than one, including data from issue tracking systems, version control systems, operating system and service transaction logs, project management, and team wikis, as well as defect and bug reports (among others). All in all, the study concludes that it was not possible to find much relevant or mature research in the software analytics field.

Bagriyanik and Karahoca [8] also performed an SLR. The study included papers from January 2010 to October or November 2015 (the end month remains unclear: in the abstract it appears to be October, whereas in the body the authors mention November). The researchers selected 32 papers out of 326 studies found through Google Scholar. In their work, Bagriyanik and Karahoca tried to answer two main research questions: (1) "In which software engineering areas Big Data and Software Engineering are interacting and to what extent?" and (2) "Which software engineering artifacts are used for Big Data processing? What are the most frequently used artifacts?" [8].

Bagriyanik and Karahoca's RQ1 resembles our RQ3.1 (discussed in detail in Section 3.2), as it aimed at finding software engineering areas that benefit from Big Data. The authors concluded that software quality, development, project management, and human–computer interaction, as well as software evolution and software visualization, were the most active research areas in software engineering Big Data studies. Their RQ2, focused on determining the software engineering data sources used in Big Data, is similar to our RQ2.2. In terms of artifacts, source code, issue-related data, and operational data were identified as the most frequently used sources of data in the studies. This RQ was also covered in our study (see RQ2.2 in Section 3.2). As we will explain later in Section 4, the number of publications related to BDA in ASD has increased over the last few years, with the year 2014 being the caesura. As the work covered papers published over a span of 5 years starting from 2010, we can conclude that this literature review certainly requires an update. Furthermore, the two RQs formulated by the authors of [8], although helpful for understanding the progress of the research area, do not cover all related factors such as the type of software analytics, methods and

Table 2
Inclusion and exclusion criteria of this SM study.

Inclusion criteria
IC1: studies related to BDA and ASD
IC2: studies published from 2011 until 2019 (inclusive)
IC3: primary studies
IC4: studies published in selected publishing venues

Exclusion criteria
EC1: studies not accessible in full-text
EC2: recaps of conferences, panel discussions, editorials
EC3: workshop papers published before 2014
EC4: duplicated studies

3.6. Study selection criteria

The same research questions and inclusion/exclusion criteria are used in both search strategies: manual identification of papers in the selected venues and snowballing. The criteria in Table 2 determine when a study was selected for our survey.

Furthermore, articles published in the special sections of journals featuring the best works presented beforehand at conferences (i.e. articles which are revised and extended versions of the original papers) were considered on a case-by-case basis. Although, in general, this study reviewed papers published between 2011 and 2019 inclusive, we applied slightly modified criteria for workshops. Namely, workshops usually publish papers describing emerging results and work in progress; however, if considered important by the research community, such papers are later extended and published as fully-fledged conference or journal articles. For that reason, in this work, we do not review workshop papers published before 2014. Although our work could also cover secondary studies, and hence be considered a tertiary study (or a tertiary review), we decided not to analyze papers that do not describe new or original experimental work.

In the Appendix, we present the number of selected papers published by major journals or presented at top conferences and peer-reviewed workshops where papers on the research topic are featured.

3.7. Data extraction and synthesis

To answer our Research Questions (RQs), we prepared a form in which we tracked metadata together with other relevant information (see Table 3) extracted from each reviewed primary study (see Appendix).

The first iteration was performed for 34 studies (pilot extraction data). Based on this sample, we improved our process of extracting data and modified our data extraction form. Furthermore, we revised our RQs, which resulted in clarifying RQ2 and formulating mutually exclusive and exhaustive sub-questions for each main RQ.

In order to prepare a thematic classification, we first used the keywords mentioned in each primary study. Since content analysis is a popular empirical research method for inferring meaning from text data [38] and is applicable to all types of written texts (e.g. published journal or conference articles), we decided to use it for data analysis in this study. Our data analysis started with conceptualizing the ASD process as well as BDA. As guidance for initial codes, we followed Blackett's approach [39]; hence, our literature review classifies data analytics according to the depth of analysis: descriptive analytics, predictive analytics, prescriptive analytics, and adaptive analytics. As we opted for computer-aided data collection and analysis, we coded the data extraction form accordingly. We prepared a preliminary list of
techniques used, to name just a few. codes, the so-called start list of codes. Next, we employed an automatic
In sum, at the time of this writing, none of the previously published keyphrase extraction to find keywords for the text. A set of keyphrases
secondary studies (i.e. [5,8]) covered the topic of interest extensively for a document should ideally capture the main topics discussed in it,
enough by answering research questions similar to those that our work so that we can easily map them with a list of research questions and
tries to address. Therefore, with this work, we aim to fill this research hypotheses that we defined at the beginning of this study. The list of
gap. keywords is reported in Appendix.
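As an aside, the keyphrase-extraction step can be approximated with a simple frequency-based filter. The sketch below is illustrative only — the study does not name the exact extraction algorithm it used, and the stopword list, function name, and sample abstract are our own assumptions:

```python
import re
from collections import Counter

# Minimal stopword list (an assumption; a real pipeline would use a fuller one)
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for",
             "on", "we", "this", "with", "as", "are", "it", "by"}

def extract_keyphrases(text, top_n=5):
    """Return the top-n most frequent non-stopword terms as candidate keyphrases."""
    words = re.findall(r"[a-z]+", text.lower())
    terms = [w for w in words if w not in STOPWORDS and len(w) > 2]
    return [term for term, _ in Counter(terms).most_common(top_n)]

# Toy abstract (made up) to show how candidate keyphrases surface
abstract = ("Agile software development teams increasingly adopt analytics. "
            "Analytics supports agile planning, and analytics informs testing.")
keyphrases = extract_keyphrases(abstract, top_n=3)  # "analytics" ranks first
```

In practice, dedicated keyphrase extractors (e.g. TF-IDF- or graph-based methods) would replace the raw frequency count; the point here is only that the extracted terms can then be mapped onto the RQs and hypotheses.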

K. Biesialska et al. Information and Software Technology 132 (2021) 106448
Table 3
Data extraction form adopted in this study.
Data item Description/Value Relevant RQ
Study ID ID from the Appendix –
Paper title Article title –
Authors Authors’ names –
Year Publication year RQ1.1
Venue Publication source and type (journal, conference, workshop) RQ1.2
Country Country of affiliation RQ1.3
Authors’ affiliation Academia, industry, both and names of the institutions RQ1.4
Company Company name and country RQ1.5
Research results Procedure or technique, tool, solution, empirical study etc. RQ1.6
Analytics type Descriptive, predictive, prescriptive, adaptive RQ2.1
Data source Source of data: code repository, application usage etc. RQ2.2
Methods Methods, models, or techniques proposed in the studies RQ2.3
BDA purpose SWEBOK codes RQ3.1
Agile purposes Agile practices, techniques or engineering practices RQ3.2
Tools & their application Tool names and whether the approach has been applied in practice RQ3.3

Fig. 3. Distribution of studies per 2-year window.

Fig. 4. Distribution of studies over venue type.
4. Results

This SM study was performed according to the protocol described in Section 3.1. After executing all its steps, we identified 88 studies on the usage of BDA for ASD.

4.1. Study demographics and trends (RQ1)

This section focuses on describing the demographic data of the selected primary studies, i.e. the distribution of studies with respect to authors' affiliation, country, publication year, and publication venue, among others.

4.1.1. Publication years (RQ1.1)

The detailed analysis of the distribution of the selected primary studies over the years is presented in Fig. 3. It can be noticed that there was a significant increase in the total number of publications after 2013, which is clearly visible in Fig. 3 showing the volume of published research on the topic for two-year periods. This trend suggests a relatively growing research interest in the topic of BDA in the context of ASD, as the number of publications steadily increases every year, with record-breaking years 2018 and 2019, when 16 (18.2%) and 24 (27.3%) research papers were published on the topic respectively.

As discussed in Section 3.3, the design of our mapping study implied doing a pre-selection of the most important venues in our research area as well as performing a snowballing process to cover relevant papers as diligently as possible. The selected primary studies were distributed over 22 publication venues (see Appendix). We found that works presented at periodically held events (such as conferences and workshops) constituted the majority of studies in our area (see Fig. 4). Journals were, in general, slightly less popular (45.5%) among the scholars. Nevertheless, two of them (i.e., Information and Software Technology and IEEE Software) individually published the highest number of the selected primary studies, namely 21 (23.9%) split evenly.

To put it in context, we compared our numbers with the ratios reported by other mapping studies in the software engineering domain. Concretely, the data was obtained from a sample of 13 papers compiled by Ameller et al. [40] (by taking an average), which we extended with 4 recent SM studies [41–44]. We observe that, similar to the results reported in the above-mentioned reference mapping studies, conferences combined together with workshops were the main forum (54.5%) for researchers to showcase their research. However, in our case, the difference between journals and conferences combined with workshops is very small. In fact, the percentage of publications in journals is the highest among the analyzed studies.

4.1.2. Authors' affiliation (RQ1.4)

Fig. 5(a) shows the distribution of studies with respect to the status of employment of their authors: academia (university or research institute), industry (company, research lab, or public administration), or both, i.e. joint authorship. 47 (53.4%) papers were published by authors from academia, while 32 (36.4%) publications were a result of joint authorship. In total, 9 studies (10.2%) originated purely from the industry. Furthermore, 5 authors representing more than one entity had both academic and industry affiliation [S12,S40,S69,S73]. In sum, as shown in Fig. 5(a) using a two-year window, joint collaborations remained more or less at the same level throughout the years, whereas studies from academia were increasing every year, with an exception for 2016 (the decline in 2016 is synchronized with the overall decrease in the number of papers published that year). The number of papers originating solely from industry remained steady until the year 2019, with a record number of 5 (5.7%) industrial studies published that year. 3 of those studies were written by Microsoft researchers. Moreover,

in Fig. 5(b) we also presented summary statistics for the number of authors per each paper, such as the median observation (horizontal bar), the lower (Q1) and upper (Q3) quartiles, and the average (blue diamond). Irrespective of authors' affiliation, the median equals 4 authors. Articles written with industrial practitioners were more often authored by more than 4 researchers.

Fig. 5. Distribution of studies w.r.t. authors' affiliation.

We also analyzed how research on the topic spread across different countries and institutions. The graphical representation of this analysis is shown in Fig. 6, where numbers next to countries, regions, and institutions indicate the number of authors affiliated with each one of them. It appears that Europe overall leads when it comes to conducting research in the industrial context, having 27 papers (30.7%) written by one or more authors with industrial affiliations, out of which 13 studies (14.8%) were carried out entirely by European teams. Further, despite a largely fragmented research landscape in Europe, the continent was strongly represented by Sweden, whose researchers participated in 7 studies (8.0%), and the Netherlands with 5 studies (5.7%) undertaken with industrial partners co-authoring the papers. European research institutes were also very active in conducting research on the topic. Moreover, with respect to studies performed with industry, Europe was closely followed by North American countries: the USA having 15 such studies (17.0%) and Canada 5 studies (5.7%). For other countries, regardless of the region they belonged to, the number of research papers co-authored by researchers with industrial affiliations did not exceed 3 studies. In addition, 9 papers (10.2%) which originated entirely from industry were mostly written by authors affiliated with the US offices of the companies they represented (19 out of 39), again largely American (Microsoft, Google, Fannie Mae), or offices in India (14 out of 39). Furthermore, it is also worth mentioning that not all papers discussing industrial case studies had industry practitioners or researchers with industrial affiliations among their authors. Hence, some of the selected primary studies, even though set in the industrial context, were not covered in this section.

4.1.3. Companies (RQ1.5)

Our literature search also allowed us to draw a conclusion with regard to the extent of Agile research undertaken in different companies. In total, 23 companies, 1 research lab and 1 government-related entity were explicitly mentioned by name as the subjects of the selected primary studies. Not surprisingly, some of the authors of the selected studies were affiliated with the companies presented in the papers; however, some of the papers discussed industrial case studies without having any employee of those companies among their authors (e.g. [S37,S32,S75]). Although no dominant companies were identified in our study, the most frequently covered organization was Microsoft, followed by Mozilla with its numerous projects. Further, only three other companies were featured more than once, i.e. Ericsson, Google, and IBM. These 5 companies accounted for 18.2% of all studies (see Table 4). Moreover, 30 (34.1%) of the selected primary studies covered anonymous companies, i.e. their names were not disclosed, but in most of the cases we were able to extract information such as their size, core business, or country of origin. The relatively high number of anonymous companies may be attributed to the fact that the majority of the primary studies discussing industrial case studies partnered up with more than one company, e.g. Lindgren et al. [S47] in their article covered as many as 10 anonymous software development companies from Finland working in different domains. Similarly, [S55] featured 5 anonymous companies. Authors tend to be consistent with their approach to disclosing company names throughout their papers — if they mention companies, they either reveal the names of all of the covered organizations or none of them. Moreover, our SM study also included papers without any industrial case study. Those studies were grouped under a separate category, called N/A in Table 4. Furthermore, some of the selected primary studies covered companies with international presence, hence we included this information in Table 4 where possible. Namely, the Global label was used to indicate multinational corporations, with countries participating in the selected studies listed in brackets (if available). Similarly, other companies present in more than one country (but not being multinational corporations) were classified as international (the Intl. abbreviation was used in this case), again with countries participating in the selected studies listed in brackets (if available). It appears that the majority of multinational corporations discussed in the selected primary studies are companies established or having their headquarters in the US. In contrast, large companies and small and medium companies (SMEs) mainly hailed from Europe, with a large representation of Finnish companies (due to the study by Lindgren et al. [S47]).

As shown in Fig. 7(a), SMEs and corporations were most frequently studied. However, when grouping organizations by size, the largest group in our study would consist of corporations and large companies, together being the subject of 32 studies (36.4%). In total, only 2 research centers (2.3%) and 3 government-related entities (3.4%) took part in case studies. The small representation of public sector companies poses the question of whether, in less competitive environments, the adoption of Agile methods and data analytics is as prevalent as in the fast-changing business landscape.

Furthermore, as shown in Fig. 7(b), most studies were conducted in companies developing software and providing IT services to customers (39.8%). However, other domains, such as financial services, industry

Fig. 6. Distribution of studies w.r.t. authors' affiliation.

(e.g. oil and gas, manufacturing), healthcare, or sport and entertainment industry were also present. This can perhaps be attributed to the
growing trend of non-technology organizations becoming technology
companies with software developed in-house. However, in sum, the
majority of the selected papers studied software development practices
and application of BDA in organizations specializing specifically in
developing software in a broad context, such as Microsoft, Google, and
smaller software houses. Hence, we assume that most of the studies
in industry dealt with professional software developers with proven
competencies, whose software development practices generalize well.

4.1.4. Research results (RQ1.6)
Following Shaw's [45] classification of types of software engineering research results, we categorized our primary studies into seven groups, shown in Table 5. Each paper belongs to exactly one group. We observe a trend toward proposing new procedures and techniques. This can be attributed to a large number of academic papers that produced new approaches providing a lab-based or theoretical validation. However, examples of use in actual practice in the industry were less prevalent. Solutions, prototypes, and tools together constituted about a fifth (17) of all studies. Moreover, only 12 (13.6%) studies reported lessons learned, followed by a small representation of frameworks and guidelines with 9 (10.2%) studies. This analysis of the contribution-type facets indicates that this is still an evolving research field, which is developing new approaches and lacks evaluated models and theories.

4.2. Data analytics throughout the lifecycle of the Agile software development process (RQ2)
In this section, we report the results of our analysis of the different types of analytics employed in ASD, discuss the various sources of data used in the selected primary studies, and shed light on the numerous BDA methods and techniques used in ASD.

4.2.1. Analytics types (RQ2.1)
Fig. 7. Distribution of companies w.r.t. their type.

In order to analyze different analytics types used in ASD, we defined four types of analytics. To this end, we adopted the classification of analytics types as proposed by [10] and extended it with an additional type (i.e. adaptive) to cover the whole spectrum of analytics. Next, we

Table 4
Companies discussed in the studies.
Company Company size/type Company domain Country Relevant studies
Microsoft Corporation Consumer and enterprise software Global (USA, China, India) [S5,S17,S24,S30,S51,S59,S87]
Ericsson Corporation Telecommunications infrastructure Global [S57,S78]
Google Corporation Consumer and enterprise software Global [S29,S66]
IBM Corporation Enterprise software Global [S11,S27]
ABB Corporation Industrial technology Global (USA, Poland) [S6]
Accenture Corporation Technology services Global (India, Singapore) [S72]
Adyen Large company Payments solutions [S83]
AENSys Informatics SME Home IoT and IT security management [S8]
Amisoft SME Software development services Chile [S63]
Avalia Systems SME Software development services Intl. (Switzerland) [S33]
Ericpol SME IT outsourcing and consulting services Intl. (Poland) [S31]
Fannie Mae GSEa Financial services USA [S74]
F-Secure Corporation Cyber security and privacy solutions Global (Finland) [S27]
Grupo Saberes SME Software development services Colombia [S34]
ING Corporation Financial services Global (Netherlands) [S30]
LexisNexis Corporation Computer-assisted legal research and consulting services Global (USA) [S86]
Mozilla Corporation Free software community [S25,S75]
Multilogic SME Software development services Hungary [S68]
Pason Systems Corporation Drilling data management systems Canada [S67]
Plexina SME Healthcare IT solutions Canada [S32]
Salesforce Corporation Marketing and sales software Global (USA) [S14]
Virtus Not disclosed Software development services Brazil [S32]
VTT Research lab N/A Finland [S13]
Other Government agency N/A [S12,S69]
Not disclosed Not disclosed Not disclosed Intl. (Finland, Spain) [S49]
Not disclosed Large company N/A Italy [S1]
Not disclosed Large company Consumer and business security solutions Finland [S33]
Not disclosed Large company Subscription-based television services Italy [S35]
Not disclosed SME On-line business entertaining platforms Estonia [S33]
Not disclosed Research lab N/A Italy [S1]
Not disclosed Large university IT department Not disclosed [S52]
Not disclosed Not disclosed ICT system development Not disclosed [S53]
Not disclosed Not disclosed Secure communications and connectivity solutions Not disclosed [S82]
Not disclosed Not disclosed Software development tools Not disclosed [S82]
Not disclosed Large company Infrastructure provider company Not disclosed [S16]
Not disclosed SME Gaming Finland [S47]
Not disclosed SME ICT services Finland [S47]
Not disclosed Large company ICT services Finland [S47]
Not disclosed SME Sports Finland [S47]
Not disclosed SME Software development tools Finland [S47]
Not disclosed Large company Security Finland [S47]
Not disclosed Large company Telecom Finland [S47]
Not disclosed SME Multimedia Finland [S47]
Not disclosed Not disclosed Software company Belgium [S61]
Not disclosed SME Software development services USA [S65]
Not disclosed Not disclosed ICT services Canada [S7]
Not disclosed Not disclosed Digital media platform Brazil [S7]
Not disclosed SME Software development services Germany [S22]
Not disclosed Not disclosed Consumer electronics and software solutions Intl. [S55]
Not disclosed Not disclosed Software solutions and services Intl. [S55]
Not disclosed Not disclosed Experimentation solutions for websites and mobile Not disclosed [S55]
Not disclosed Not disclosed Experimentation solutions and website optimization Not disclosed [S55]
Not disclosed Not disclosed Booking and travel solution provider Not disclosed [S55]
Not disclosed Not disclosed Mobile app development services Not disclosed [S56]
Not disclosed Not disclosed Not disclosed N/A [S23,S35,S54,S84]
N/A N/A N/A N/A [S2,S3,S4,S9,S10,S15,S18,S19,S20,S21,S24,S26,S28,S29,S31,S34,S35,S43,S48,S50,S58,S60,S62,S64,S70,S71,S73,S76,S79,S80,S81,S85,S88]
a Government-sponsored enterprise.

Table 5
Selected primary studies mapped according to the research results.
Category Relevant studies # studies
Procedure or technique [S1,S2,S3,S8,S9,S11,S16,S19,S20,S21,S25,S32,S36,S37,S45,S46,S48,S49,S50,S51,S52,S53,S56,S57,S58,S60,S61,S62,S64,S65,S68,S70,S71,S73,S76,S80,S85,S86,S89] 39
Tool [S4,S5,S6,S18,S24,S39,S41,S43,S59,S66,S78] 11
Qualitative or descriptive model [S10,S30,S38] 3
Empirical model [S12,S23,S33,S54,S69] 5
Analytic model [S35] 1
Solution, prototype or judgment [S7,S13,S14,S15,S17,S22,S26,S27,S29,S31,S34,S42,S67,S79,S81,S83,S84] 17
Report [S28,S40,S44,S47,S55,S63,S72,S74,S75,S77,S82,S87] 12

Table 6
Types of analytics employed in selected primary studies.
Analytics type Description Relevant studies
Descriptive reactive, typically based on historical data, using descriptive [S4,S6,S7,S8,S12,S16,S18,S22,S26,S28,S30,S32,S33,S36,S41,S42,S43,
statistics to describe past in a summarized form (e.g. visualization, S44,S50,S57,S60,S65,S66,S72,S75,S78,S79,S81,S83,S84]
ad-hoc reporting, dashboards)
Predictive reactive, typically based on historical data and using AI/ML [S1,S2,S3,S5,S7,S9,S11,S13,S14,S15,S17,S19,S20,S21,S22,S23,S24,
techniques, this type of analytics uncovers hidden patterns and S25,S26,S27,S28,S29,S31,S34,S35,S37,S39,S40,S42,S45,S46,S48,S50,
makes predictions about the future (forecasting) (e.g. fraud S51,S52,S53,S54,S58,S59,S60,S61,S62,S63,S64,S69,S71,S72,S74,S76,
detection, sentiment analysis) S80,S85,S87]
Prescriptive reactive and proactive, typically based on historical data [S10,S67,S49,S56,S73,S70,S68,S80]
recommends actions that can be undertaken by decision makers
along with their implications (e.g. simulations)
Adaptive proactive, historical and real-time data, by interacting with an [S55,S70,S86]
environment automatically adapts to recent actions which
influences the present and, in result, improves the ongoing learning
process (e.g. reinforcement learning, counterfactual ML,
recommender systems)
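The adaptive row above mentions reinforcement learning; a multi-armed bandit, as used for A/B testing in [S55], is its simplest instance. The epsilon-greedy sketch below is our own illustration — the variant conversion rates and parameters are invented, and [S55]'s actual algorithm is not reproduced here:

```python
import random

def epsilon_greedy_ab(true_rates, rounds=10000, epsilon=0.1, seed=42):
    """Epsilon-greedy bandit: route most traffic to the empirically best
    A/B variant while reserving a fraction (epsilon) for exploration."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)   # impressions served per variant
    rewards = [0] * len(true_rates)  # conversions observed per variant
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in counts:
            arm = rng.randrange(len(true_rates))              # explore
        else:
            arm = max(range(len(true_rates)),
                      key=lambda i: rewards[i] / counts[i])   # exploit
        counts[arm] += 1
        rewards[arm] += 1 if rng.random() < true_rates[arm] else 0
    return counts, rewards

# Variant B (index 1) truly converts better, so it should receive most traffic
counts, rewards = epsilon_greedy_ab([0.05, 0.11])
```

Unlike a fixed 50/50 split test, the allocation adapts to incoming data, which is precisely what distinguishes adaptive analytics from the purely reactive types in the table.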

split our papers into four groups, which are discussed in Table 6 and presented in Fig. 8.

Fig. 8. Analytics type.

The majority of studies covered descriptive and predictive analytics. Both of them are primarily based on historical data and do not include more advanced ML concepts and applications such as simulation of different scenarios or reinforcement learning, among others. For instance, descriptive analytics solutions often took the form of dashboards with KPIs or reporting methods providing ASD-related metrics (e.g. [S6,S12,S16,S78,S81,S84]). For instance, the solutions proposed in [S78] and [S84] were based on typical ASD artifacts such as the number of features or test cases designed, developed, or integrated in a given period of time. Thanks to a descriptive analysis with a visual component performed in the two studies, the decision-makers in the companies gained a better overview of their ASD processes.

The more advanced the type of analytics, the smaller the number of papers discussing it; e.g. prescriptive analytics was less frequently covered in the selected primary studies. With regard to prescriptive analytics, we observed that studies mainly focus on optimization [S70,S68] as well as simulation and modeling [S49,S67,S73]. The last type, adaptive analytics, was covered in three primary studies, discussing topics such as dynamic re-prioritization of test cases, a reinforcement learning approach for A/B tests, and an optimization solution for experimenting in a continuous deployment environment. Specifically, the authors of [S70] observed that continuous experimentation at scale requires careful orchestration of experiments, as they often depend on each other and many constraints need to be considered (e.g. a different user base, the volatility of experiments, code development). Since product releases can be impacted by failed experiments, it is a non-trivial problem, which is resolved using search-based methods in the study [S70]. The prescriptive/adaptive component of the solution allows rescheduling of experiments, which are frequently adjusted to changing user behavior. Some primary studies discussed more than one type of analytics (e.g. [S42,S50,S70,S80]). There were also 4 studies that did not explicitly cover any type of analytics, i.e. [S38,S47,S77,S82], and hence were not included in Table 6. The authors of these papers carried out qualitative studies and, based on case study data (such as surveys, interviews), drew conclusions regarding experimentation in product development and release planning practices, among others. Although the topic of their studies is related to BDA in ASD, the papers themselves cannot be categorized into any of the 4 major groups of analytics. Interestingly, 2018 was the year when the largest number (4) of prescriptive analytics studies was produced. However, before that, there was a 2-year gap, when no such papers appeared. Papers covering adaptive analytics started to appear in 2018, with two such studies published in 2019. Hence, we could conclude that in the last two years more advanced BDA gained attention in the community. However, it remains to be seen if this trend of steady growth prevails.

4.2.2. Data sources (RQ2.2)

The quality of insights provided by analytics depends to a large degree on data. In our analysis, we divided data sources into 7 major groups, which are presented in Table 7.

As shown in Table 7, in the software domain, there is no dominant source of information for analytics, and many analytics engines use various data sources to provide a more holistic and comprehensive view of the studied environment. 44.3% of the studies used more than one data source, including data from issue reports, test results, commit logs, bug repositories, and version control systems, among others. One such example is an analytics solution presented in [S6]. The authors highlighted the importance of combining disparate data sources such as the source code repository, the defect backlog, and personnel data. For instance, the last type of data – in this case coming from a Lightweight Directory Access Protocol (LDAP) directory – proved to be helpful in resolving code ownership issues, such as identifying abandoned parts of the codebase due to the departure of staff or role changes. Similarly, Czerwonka et al. [S24] in their study used multiple data sources. Apart from source code repositories, which the authors considered the largest and most volatile sources of engineering data in their organization, they utilized test results, organization, and project execution data (e.g. release schedules and development milestones). Similarly, in [S17] the project repository was utilized specifically because it contained all source code information, such as the number of lines of code (LOC) for a class or dependency information for a method. Project execution data, although often not specified, frequently included sprint data [S9] and story points [S35,S61,S69,S68]. For instance, Batarseh et al. [S9] described in detail what data they collected from sprint data: i.e. sprint number,

Table 7
Summary of data sources and systems used to gather data.
Data source type Detailed data source Data source system Relevant studies
User feedback Online/app reviews e.g. the App Store, Google Play [S18,S36,S39,S48,S50,S88]
Controlled experiments [S30,S55]
Self-assessment questionnaires e.g. JIRA plug-ins [S43]
Interviews/surveys [S4,S10,S15,S16,S32,S38,S47,S55,S56,S57,S58,S65,S77,S82,S83]
Application usage e.g. Eclipse IDE, APM & program analysis tools [S8,S10,S41,S46,S50,S65,S70,S83]
Logs Test results & test suite execution [S13,S24,S29,S44,S59,S60,S86]
Commit logs [S59,S64,S75,S76]
System failures e.g. Microsoft Dynamics Ax [S17,S28]
Failure data [S54]
History of pull requests e.g. Azure DevOps [S5,S51]
Revision history [S85]
Workflow data logs [S40]
Event monitoring logs [S28,S40,S60,S83]
Deployment orchestration logs e.g. TravisCI [S40,S81]
Project artifacts Epic [S11,S22]
Features [S24,S56,S84]
User stories [S11,S22,S26,S35,S49,S52,S65,S69,S73]
Defects [S24]
Tasks [S45]
Product backlog e.g. JIRA [S11,S12,S78]
Defect backlog [S6,S78]
Deployment registry [S40]
Issue reports Issue tracking system e.g. JIRA [S2,S3,S19,S20,S21,S25,S28,S42,S53,S71,S75,S76,S79]
Bug repository e.g. Bugzilla [S37]
Source code & data model Source code [S6,S17,S24,S41,S51,S59,S74,S75,S76]
Ruby programs & Ruby on Rails [S34,S64]
Java programs e.g. Eclipse IDE plug-ins [S1,S4,S8,S14,S33,S81]
Function calls [S87]
Code metrics [S28]
Development repository e.g. GitHub [S17,S24,S27,S59,S81]
Test case e.g. Cucumber, Gherkin, Microsoft Dynamics Ax [S17,S64,S76,S86]
Code quality e.g. SonarQube [S28,S53]
Application data schema [S34]
Project execution Sprint data [S9,S52]
Story points e.g. JIRA agile [S22,S35,S43,S52,S61,S69,S68]
Project timeline e.g. Asana, internal Kanban tool [S31,S62,S78,S79]
Development effort e.g. Asana, internal Kanban tool [S31,S43,S49,S52,S62,S78,S79]
Test management system e.g. Selenium [S79]
Version control system e.g. Git, Azure DevOps [S7,S24,S25,S28,S51,S59,S79]
Organization data Demographic data e.g. LDAP [S6,S24,S51,S52,S67,S71,S76]
Communication structures [S43]
Other Meeting minutes [S63]
Time sheets [S63]
Meeting behavior [S43]
Conversations [S43,S65]
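Many of the selected studies combine several of the source types in Table 7 into a single timeline. A minimal sketch of such a merge follows; the field names and sample records are our own illustration and are not taken from any primary study:

```python
from datetime import datetime

# Hypothetical events drawn from two of the source types in Table 7
commits = [{"ts": "2021-03-01T10:00:00", "source": "commit log", "id": "abc123"}]
issues = [{"ts": "2021-03-01T09:30:00", "source": "issue tracker", "id": "PROJ-7"}]

def merge_event_streams(*streams):
    """Merge per-source event lists into a single timeline ordered by timestamp."""
    events = [event for stream in streams for event in stream]
    return sorted(events, key=lambda e: datetime.fromisoformat(e["ts"]))

timeline = merge_event_streams(commits, issues)  # the 09:30 issue event comes first
```

Real pipelines of course add schema normalization, identifier linking (e.g. matching commit messages to issue keys), and data-quality checks before analysis; the timestamp-ordered merge is only the first step.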

Fig. 9. Distribution of data sources presented in the studies w.r.t. analytics type.

Table 8
Classification of ML/AI methods and techniques.
Category Learning/Model type Method Relevant studies
Instance-based Discriminative Support Vector Machine (SVM) [S1,S2,S14,S20,S27,S28,S36,S42,
S52,S59,S61,S62,S65,S71,S86]
k-Nearest Neighbor (kNN) [S2,S28,S37,S42,S52,S61]
Decision Trees Discriminative Decision Tree (DT) [S28,S52,S61]
Classification and Regression Tree (CART) [S2]
C4.5 [S2,S19,S48]
Regression Discriminative Linear Regression [S1,S7,S9,S33,S51,S52], S88
Logistic Regression [S25,S36,S59,S65]
Exponential Regression [S9]
Generalized Linear Model (GLM) [S62]
Probabilistic Bayesian Ridge regression [S51,S52]
Clustering Connectivity Hierarchical Agglomerative Clustering (HAC) [S17]
Centroid K-Means [S42]
Ensemble Boosting AdaBoost [S2]
Bagging & Random Forests Random Forest (RF) [S2,S3,S11,S19,S20,S21,S27,S28]
Bagged AdaBoost (Adabag/BAB) [S2]
Gradient Boosting [S51]
Stochastic Gradient Boosting (SGB) [S2,S19,S20]
Extreme Gradient Boosting [S28,S59]
Bagging [S48]
Custom algorithm [S52]
Neural Networks Shallow Neural Networks Neural Network (e.g. Multi-Layer Perceptron) [S1,S19,S21,S36,S52,S59,S62]
Deep Learning Deep Neural Network [S19,S20,S21]
Probabilistic Graphical Models Generative Naive Bayes [S2,S19,S22,S27,S28,S32,S36,S48,
S53,S61,S76]
Bayesian Network [S52]
Hidden Markov Model (HMM) [S37]
Expectation Maximization for Naive Bayes (EMNB) [S18]
Directed Acyclic Graph (DAG) [S73]
Stochastic Automata Network (SAN) [S23]
Social Network Analysis (SNA) [S85]
Optimization Combinatorial optimization Multiple knapsack [S68]
Branch-and-bound [S68]
Branch-and-cut [S35]
Simulated Annealing (SA) [S70]
Genetic Algorithms (GA) [S70]
Distributed constraint optimization Custom algorithm (SMART) [S45]
Reinforcement learning Multi-armed bandit (MAB) [S55]
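To make the table concrete, the sketch below implements one of the simplest methods listed — multinomial Naive Bayes — for tagging issue reports as bug reports or feature requests. It is an illustrative toy under our own assumptions (the class name and training data are invented; none of the primary studies' actual models are reproduced):

```python
import math
from collections import Counter

class NaiveBayesText:
    """Minimal multinomial Naive Bayes text classifier with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.word_counts = {c: Counter() for c in self.classes}
        self.class_counts = Counter(labels)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            for word in doc.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, doc):
        def log_posterior(c):
            total = sum(self.word_counts[c].values())
            lp = math.log(self.class_counts[c] / sum(self.class_counts.values()))
            for word in doc.lower().split():
                lp += math.log((self.word_counts[c][word] + 1)
                               / (total + len(self.vocab)))
            return lp
        return max(self.classes, key=log_posterior)

# Invented issue summaries labeled as bug reports vs. feature requests
docs = ["crash when saving file", "error null pointer crash",
        "add dark mode option", "support export to csv option"]
labels = ["bug", "bug", "feature", "feature"]
clf = NaiveBayesText().fit(docs, labels)
```

Studies in the table trained such classifiers on large labeled corpora of issue reports; ensembles such as Random Forest or gradient boosting (also listed above) typically replace Naive Bayes when richer features are available.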

start and failure date, total time, Mean Time between Failures (MTBF). Importantly, some primary studies used synthetic data. For instance, [S4] used simulated Scrum data; also, the authors of [S49] augmented real data with artificial data (such as artificial user stories) to feed their simulation model. Data collection, processing, and further loading to an analytics engine often entailed executing complex steps along the way. For instance, Huijgens et al. [S40] mentioned combining data from the deployment registry, the configuration management database, the event monitoring data warehouse, and the deployment orchestration logging. Further, they used time stamp data to identify active configuration items and to extract deployment steps from the logged workflow data. Very rarely did studies provide information on data sources which they deliberately did not use, due to high costs, for example. However, in rare cases, such information was revealed; for instance, [S63] describes that a company covered in the paper decided to monitor neither its version control nor its defect tracker system. In a similar vein, numerous studies stressed the importance of data quality (including taking care of incomplete and inaccurate data) and the need for its verification before rolling out an analytics solution [S6,S12,S62].

When it comes to specific tools and systems providing data, JIRA turned out to be the most popular choice, with 9 (10.2%) studies using it [S2,S3,S19,S20,S21,S22,S42,S43,S53]. Also, various Microsoft products (such as Dynamics Ax, Windows, Windows Phone, Office, Exchange, Lync, MS SQL, Azure, Bing, and Xbox) were named in several studies [S5,S17,S24,S79]; however, Microsoft was an industrial partner in those studies. Other software products used as data sources included, for instance, Bugzilla [S37] — a web-based bug tracking and testing tool, or the Eclipse IDE (with deployed plug-ins) to capture Java developers' behaviors [S8,S33]. Further, [S14] used testing results and data from the live automation system at Salesforce. With regard to strictly Agile-related data and tools, several different approaches unfolded. Some studies relied on popular solutions such as GitLab, a web-based DevOps platform [S43,S70,S79], while others decided to leverage tools developed in-house. For instance, Fitzgerald et al. [S31] used an internally developed, web-based Kanban board tool to record metrics such as: allocated project number, date of project arrival, project development start date, expected and actual finish dates of the project, and estimated and actual development effort for the project. [S46] and [S66] utilized tools developed for the purpose of the studies, which are described in detail in Section 4.3.3.

For the majority of studies, data originated from software projects, both industrial and open-source (OSS). Certainly, the amount and the variety of data as well as the ability to process it, among other factors, depend on the organization and software development teams, which are the source of data. With regard to the size of industrial projects, the discussed primary studies covered a wide range of them: from small teams with around 10 developers (e.g. [S49]) to large settings covering as many as 300 different development teams in a large software company [S40]. Also, the number of projects varied. Some studies focused only on 1 project (e.g. [S8,S14,S17,S49,S78]), while others included hundreds of projects — e.g. [S31] studied 467 projects, and [S40] covered 750 projects.

With respect to OSS projects, Apache projects dominated, with 7 (7.9%) studies using their data [S4,S19,S20,S21,S61,S71,S85]. Other frequently utilized OSS projects included Usergrid, Appcelerator Studio,

K. Biesialska et al. Information and Software Technology 132 (2021) 106448

Fig. 10. Classification of ML/AI algorithms w.r.t. their performance and model categories. Pointers (i.e. arrows' pointed ends) indicate algorithms with better performance if such a comparison was reported in the selected primary studies. Studies having more than one type of analytics are depicted as a combination of appropriate shapes.

Aptana Studio, Spring XD, and Titanium SDK — each one of them was employed in three studies (e.g. [S21,S61,S71]). Furthermore, benchmark datasets such as the ISBSG dataset were also used by some studies [S62]. Fig. 9 depicts the association of the data sources used with the four types of analytics. Predictive analytics used the largest number of data sources. Especially issue reports, logs, and project execution data prove to be used more frequently for predictive analytics than for descriptive analytics. For prescriptive analytics, the usage of different data sources is more balanced, with the largest number of studies (3) using project artifacts. Studies using adaptive analytics, in turn, utilized user feedback, logs, and source code as data sources.

4.2.3. Methods, models and techniques (RQ2.3)

In this section, we discuss ML/AI methods, models, and techniques employed in the selected primary studies. A summarized version is presented in Table 8.

Application areas for BDA in ASD range from testing improvement to feedback elicitation. All of those are underpinned by a large variety of different ML models. Several studies covered more than one type of analytics (i.e. [S42,S50,S70,S80]). In some works, the application of classical ML algorithms was contrasted with more modern approaches such as Deep Learning (DL) (e.g. [S19,S21]).

Fig. 10 outlines different ML/AI methods used in the selected primary studies and, if applicable, their performance in comparison to other techniques. Studies reporting more than one type of analytics are represented as a combination of shapes (e.g. [S42]: descriptive and predictive analytics, [S70]: prescriptive and adaptive analytics). Some papers discussed more than one method, and if they were compared, it is marked by arrows in Fig. 10. Specifically, a method achieving the best results for a particular study is indicated by an arrow pointing from the worse performing models in the study to the best one. For instance, in [S2] various models were deployed, such as kNN, different decision trees, or ensemble methods. Therefore, overall, groups containing the largest number of arrowheads yielded the best results. Among the techniques, ensemble-based classifiers performed significantly better than regular classifiers, and Random Forest performed best in several studies (e.g. [S2,S19,S27]). Further, although neural networks (NNs) often yielded good results as individual classifiers (e.g. [S36]), they were outperformed by simpler methods such as SVM on several occasions (e.g. [S1,S62]). Interestingly, SVMs were used both as classifiers (e.g. [S42,S65]) and regression models (e.g. [S1,S20]). Overall, SVM was the most popular technique, used in 15 (17.0%) studies, followed by Naive Bayes, employed in 11 (12.5%) studies.
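The consistent advantage of ensemble methods reported above has a simple statistical intuition: if individual classifiers are better than chance and their errors are partly independent, a majority vote is right more often than any single member. A minimal, purely illustrative simulation (the accuracy figures and parameters are our assumptions, not results from the primary studies):

```python
import random

def simulate(p_correct: float, n_voters: int, n_trials: int, seed: int = 42):
    """Empirical accuracy of one classifier vs. a majority vote of
    n_voters classifiers, each independently correct with p_correct."""
    rng = random.Random(seed)
    single_hits = ensemble_hits = 0
    for _ in range(n_trials):
        votes = [rng.random() < p_correct for _ in range(n_voters)]
        single_hits += votes[0]                      # one arbitrary member
        ensemble_hits += sum(votes) > n_voters / 2   # majority vote wins
    return single_hits / n_trials, ensemble_hits / n_trials

single_acc, ensemble_acc = simulate(p_correct=0.7, n_voters=11, n_trials=10_000)
print(f"single: {single_acc:.3f}  majority-of-11: {ensemble_acc:.3f}")
```

With 11 voters at 70% individual accuracy, the vote lands around 92%; real ensembles such as Random Forest additionally decorrelate their members through bagging and feature subsampling, which is why perfectly independent errors are an idealization.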


Fig. 11. Distribution of data sources presented in the studies w.r.t. ML/AI algorithms.

Notwithstanding that many studies discussed ML methods or techniques applied, some papers provided a more coarse-grained description of implemented BDA systems without delving into the details or
types of methods used, hence they were not included in Table 8. One
such example is [S87], where the authors merely scratched the surface of the ML models used, reporting only that classification and clustering were performed over execution traces. Along the same lines, Pareto
et al. [S57] presented a mixed method combining qualitative, quantita-
tive, and analytical techniques to improve architecture documentation,
without providing much detail on the used techniques. Another such
example is [S6] in which authors reported on lessons learned from
implementing and designing dashboards to visualize metrics and KPIs.
Visualization was strongly represented by [S22,S26,S43,S78,S84].
In general, studies published in IEEE Software provided a higher level of abstraction, without discussing the particular ML algorithms used
(e.g. [S6,S24,S39,S41,S49,S50,S63,S74,S75,S80,S85]). Several papers
used their own, custom-designed methods. For instance, [S54] pro-
posed a model with domain-specific heuristics for the prioritization
of test cases. Similarly, [S56] proposed a solution called Satisfaction-
Dissatisfaction Optimizer (SDO) for Asymmetric Release Planning. Im-
portantly, some authors decided to compare their custom approaches
(e.g. RSTrace, a reviewer recommendation approach in [S76]) with
well-established benchmark ML algorithms, such as Naive Bayes. In
[S34], a model for automatically generated diagrams was demon-
strated. Whereas [S6,S12,S40,S43,S63,S81] developed BDA solutions
which relied on various metrics and indicators to assist decision-makers
with creating goals or to report on some activities, not necessarily employing any kind of AI/ML algorithms.
Other studies focused on descriptive or Exploratory Data Anal-
ysis (EDA) [S46], which are not ML/AI methods, per se. Similarly,
four studies (i.e. [S38,S47,S77,S82]) conducted qualitative studies and
semi-structured individual interviews with industry practitioners. The
papers, although they reported on BDA in ASD, did not themselves use any type of analytics to enhance ASD processes. As shown in Fig. 10,
the majority of papers that contributed to understanding what ML
techniques are used in ASD originated from academia, followed by
joint authorship papers. Importantly, industry-led research mentioning specific ML techniques appeared only in 2019. Nevertheless, most

Fig. 12. A thematic classification of studies w.r.t. SWEBOK knowledge areas.


Table 9
A thematic classification of studies based on SWEBOK knowledge areas.
Area Sub-area Relevant studies # studies
7.2 Software Project Planning and Cost Estimation 7.2.3. Effort, Schedule [S1,S4,S6,S8,S11,S20,S21,S22,S23, 24
S35,S42,S45,S46,S49,S51,S52,S58,
S61,S62,S63,S68,S69,S71,S74]
7.6 Software Engineering Measurement General category [S12,S15,S16,S28,S40,S41,S53, 12
S72,S78,S79,S80,S82]
10.1 Software Quality Fundamentals 10.1.4. Software Quality Improvement [S9,S16,S18,S19,S25,S30,S36,S37, 11
S41,S47,S66]
1.4. Requirements Analysis 1.4.4 Requirements Negotiation [S1,S15,S27,S34,S38,S48,S50,S63, 10
S73,S84]
4.5 Test Process 4.5.2. Test Activities [S14,S17,S29,S39,S44,S54,S55, 10
S67,S70,S86]
6.6 Software Release Management and Delivery 6.6.2. Software Release Management [S2,S3,S25,S56,S75,S77,S68,S88] 8
7.4. Review and Evaluation 7.4.2 Reviewing and Evaluating Performance [S12,S43,S53,S78,S83,S85] 6
7.2 Software Project Planning and Cost Estimation 7.2.4. Resource Allocation [S6,S23,S31,S45,S74] 5
11.2 Group Dynamics and Psychology 11.2.1 Dynamics of Working in Teams/Groups [S32,S43,S49,S72] 4
10.2 Software Quality Management Processes 10.2.3. Reviews and Audits [S5,S60,S76,S81] 4
1.3. Requirements Elicitation 1.3.2. Elicitation Techniques [S65,S77,S84] 3
3.4. Construction Technologies General category [S10,S33,S83] 3
7.2 Software Project Planning and Cost Estimation 7.2.5. Risk Management [S19,S59,S60] 3
5.2 Key Issues in Software Maintenance 5.2.2. Management Issues [S7,S81] 2
8.3 Software Process Assessment and Improvement General category [S24,S60] 2
2.2 Objectives of Testing 2.2.13 Usability and Human–Computer Interaction Testing [S39,S70] 2
1.4. Requirements Analysis General category [S4,S77] 2
11.3 Communication Skills 11.3.3 Team and Group Communication [S32,S43] 2
3.3 Practical Considerations 3.3.3. Construction Testing [S64] 1
2.3. Software Structure and Architecture 2.3.4 Architecture Design Decisions [S57] 1
4.6 Software Testing Tools 4.6.2. Categories of Tools [S13] 1
10.3 Practical Considerations 10.3.1. Software Quality Requirements [S87] 1
1.5 Requirements Specification 1.5.5 Software Requirements Specification [S26] 1
1.7 Practical Considerations 1.7.5 Measuring Requirements [S11] 1
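Effort and schedule estimation — the largest group in Table 9 — is typically cast as supervised regression over historical backlog data. The sketch below fits ordinary least squares to a mapping from story points to effort; the data are invented for illustration, and the primary studies generally use richer feature sets and models (cf. Table 8):

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y ≈ a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Invented backlog history: (story points, actual effort in hours)
points = [1, 2, 3, 5, 8]
hours = [4, 9, 13, 22, 35]
a, b = fit_linear(points, hours)
print(round(a * 13 + b, 1))  # predicted effort for a 13-point story, ~57.1 h
```

Once historical estimates and actuals are logged (e.g. in an issue tracker), even such a one-feature model gives teams a baseline against which the more advanced techniques in Table 8 can be compared.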

studies which disclosed the techniques and models used did not provide information on the set parameters and hyperparameters.

Fig. 11 provides information on the types of data sources used by specific ML techniques. The instance-based family of methods was by far the most frequently used set of algorithms, and it used predominantly issue reports, source code, and project execution data. Importantly, clustering algorithms used only 3 types of data. The majority of issue reports data was analyzed by instance-based, ensemble, and probabilistic graphical models. Optimization algorithms did not show any clear preference toward types of data used, while ensemble methods preferred issue reports over others. On the other hand, regression techniques utilized project execution data most frequently.

4.3. How data analytics improves Agile software development process (RQ3)

4.3.1. Research topics and sub-topics (RQ3.1)

Broad knowledge areas, such as requirements engineering, software design, development, testing, or project management, were narrowed down using the SWEBOK thematic classification (see Table 9).

BDA is applicable to almost all phases of the ASD process. 24 (27.3%) studies covered more than one area. Software engineering management topics such as project planning, risk management, and effort estimation (SWEBOK areas 7.2, 7.6, 7.4) turned out to be the most popular among the authors of our primary studies — 43 (48.9%) studies covered them in total. Quality improvement and assurance was the second most popular topic, with 17 (19.3%) studies. Issue resolution and detection of defects, technical debt, and code reviews remain one of the primary domains of research in BDA for ASD (SWEBOK areas 10.1 and 10.2). BDA applied to requirements analysis and elicitation (SWEBOK areas 1.4, 1.3, 1.5 and 1.7) also gained interest among the SE community; we identified 15 (17.0%) such studies. Our study also reveals that software testing was one of the major applications and the most diverse when it comes to different analytics types, as it covered all four of them (see Fig. 12(b)). In total, 11 (12.5%) studies discussed topics related to testing (SWEBOK areas 4.5 and 4.6). The year 2019 was the most diverse in terms of the SWEBOK areas covered — 9 different areas out of 10 (see Fig. 12(a)).

The selected prescriptive analytics studies focused mainly on test activities [S10,S67,S70], effort estimation, and delivery capability [S35,S49,S73].

Not only do different types of analytics use different data sources, but different phases of the ASD process also require different sets of information on input. As Fig. 13 shows, studies focused on software engineering management activities (as the only SWEBOK group) utilized the whole spectrum of data sources, with the majority of works using project execution data, followed by project artifacts and logs. Software requirements studies used two main data sources, user feedback and project artifacts, with a slight preference toward user feedback. Testing, on the other hand, had a clear inclination toward logs, while software quality used a more balanced set of data sources.

4.3.2. Agile practices, techniques or engineering practices (RQ3.2)

Our classification of Agile practices/methodologies is presented in Table 10. In terms of the popularity of reported Agile practices, 55.7% of studies did not specify which Agile methodology they used (category ''Other — Not Specified''). Despite this fact, we can assume they use Agile-inspired methodologies, as papers discussing them refer


Fig. 13. Distribution of data sources presented in the studies w.r.t. SWEBOK knowledge areas.

Table 10
Selected primary studies mapped according to Agile practices, techniques, and engineering practices.
Category Relevant studies # studies
Agile practices
Scrum [S4,S6,S22,S32,S34,S35,S38,S40,S42,S43,S45,S46,S49,S67,S77,S79] 16
Kanban [S31,S49,S77] 3
Custom Extreme Programming (XP) [S1,S44,S61,S77,S84] 5
Custom Hybrid (multiple methodologies) [S47,S57,S74,S79] 4
Other — Rational Unified Process (RUP) [S57,S58,S63] 3
Other — Not Specified [S2,S3,S8,S9,S10,S12,S13,S14,S15,S17,S18,S19,S20,S21,S23,S24,S27,S28,S29,S30,S33, 49
S36,S48,S50,S51,S52,S54,S55,S56,S58,S59,S62,S63,S64,S65,S66,S68,S69,S72,S73,S75,
S76,S77,S78,S81,S82,S83,S87,S88]
Agile engineering practices
Refactoring [S1,S32] 2
Pair programming [S1,S44,S61,S84] 4
Model-Driven Development (MDD) [S57] 1
Test-Driven development (TDD) [S1,S10,S33] 3
Behavior-driven development (BDD) [S39,S44,S64,S74] 4
Automated acceptance testing [S75] 1
Continuous integration [S14,S29,S16,S47,S51,S54,S59,S60,S79,S80,S81,S82,S86] 13
Continuous delivery [S16,S40,S41,S51,S79,S80] 6
Continuous deployment [S16,S51,S59,S60,S70,S79,S80] 7
Continuous experimentation [S55,S70] 2
Agile techniques
Story mapping [S1,S22,S26,S21,S27,S32,S34,S49,S52,S65,S69,S71,S84] 13
Iteration/sprint planning [S1,S9,S12,S22,S32,S34,S35,S42,S43,S45,S46,S49,S52,S61,S67,S84] 16
Release planning [S27,S32,S35,S38,S49,S56,S68,S84,S70,S75,S88] 11
Team-based estimation [S1,S15,S22,S42,S43,S45,S46,S61,S69] 9
Retrospectives [S38,S43,S49] 3
Standups [S43,S65] 2
Collective code ownership [S7,S52,S64] 3
Agile/Lean UX [S47] 1

to e.g. Planning Poker (as the method for requirements prioritization) [S15], project sprints taking two weeks each [S12], Test-Driven Development (TDD) [S33], or user stories and story points [S73]. In the same vein, in [S11] an IBM product was developed according to some variant of Agile methodology, having its own terminology, which nonetheless mapped easily to well-known Agile terms, e.g. a Plan Item is an epic. Also, a number of studies in this category discuss OSS Apache Software Foundation projects, which also employ Agile-inspired software development processes — they are self-governing and egalitarian, using a collaborative, consensus-based process [46]. For instance, although RUP is not ASD per se, it encompasses many Agile-inspired techniques. Therefore, similar to other authors of literature reviews on Agile, e.g. [47], we also included those practices in our study. Scrum was the second most popular practice, with 18.2% of studies mentioning it. Kanban, on the other hand, was rarely used (only 3.4% of studies), and 5.7% of studies chose XP. Customized lean methodologies were mentioned in 2 papers: [S78,S57]. The DevOps approach, in turn, was applied in projects from 5 (5.7%) studies: [S40,S60,S74,S77,S83].

To put it in context, we compared our numbers with the ratios reported by Svensson et al. [S77], where Scrum (43%) was the primary


Agile practice, followed by not-specified Agile (29%), Kanban (15%), DevOps (12%), and XP (1%). We can conclude that, similar to [S77], Scrum and general Agile practice were the dominant methodologies, while Kanban, XP, and DevOps were much less popular.

Agile techniques and engineering practices are listed in Table 10. Not all selected primary studies shed light on details of the embraced Agile practices or techniques. For instance, [S87] advocated the iterative Agile process with quick feedback loops. As Zhang et al. [S87] stressed, close collaboration between researchers and practitioners fosters continuous development and improvement. Nevertheless, the particular Agile practices employed remained undisclosed.

4.3.3. Tools (RQ3.3)

After reporting on software analytics applications, this section is devoted to a discussion regarding tools and models that were designed and developed by the authors of the selected primary studies. In this section, we try to understand how pervasive the transfer of research outcomes into practice is, and the state of industrial adoption of BDA solutions. We created three groups with regard to the degree of tools' adoption:

(i) No practical application — a tool with no immediate application. Experiments were performed based on publicly available data (e.g. Apache repositories) or in an academic setting.

(ii) Partial application — numerous studies conducted a case study at a company to develop a model or a tool together with stakeholders from the company. Such proofs-of-concept or pilots were only partially introduced in the company.

(iii) Practical application — a tool with practical application. Most often, an internal team within a company developed a tool that supported the organization's software development efforts in some way.

In Table 11, tools with practical application are denoted by Yes in the Adoption column, tools with no practical application as No, and partial application as Partial. Tools having their roots in industry proved to be research results widely exploited by practitioners (e.g. [S5,S6,S24,S31,S66]). For instance, a tool developed in [S5] is reported to run on 123 Microsoft repositories with rapidly increasing adoption. Furthermore, a substantial number of the tools adopted in industry were featured in papers that reported on their usage through experience reports. The majority of them (e.g. [S6,S24,S3,S74]) were covered in the IEEE Software journal which, in contrast with other publishing forums, particularly encourages contributions that cater to practitioners. Some tools included in our paper were developed for specific case studies, and their adaptation to other scenarios and organizations would require conducting further validations as well as substantial effort (e.g. [S46,S54,S67]). Furthermore, the dissemination of research results originating from academia was not prevalent. Although authors of such papers often claimed that their solutions could provide practical value, they indicated that further experiments on real-world datasets and empirical studies to validate the industrial applicability of the techniques would be needed. For instance, HASE was only tested in an academic setting (i.e. with a group of undergraduate students). However, the authors suggested that their goal is to design a situation-aware decision support system that would help global ASD teams to allocate tasks more efficiently. Similarly, the authors of the feature crumb management platform [S41] provided only a reference implementation, which was validated in a university capstone project.

In terms of the kind of tools described in the studies, many works incorporated additional functionalities into existing tools or integrated with solutions already in use by the studied organizations. In general, Eclipse plug-ins were especially well-represented among the tools (e.g. [S4,S10]), followed by JIRA plug-ins (e.g. [S26,S43]) and, to a lesser degree, Azure DevOps extensions [S5,S51,S79].

For instance, Augustine et al. [S6] used a third-party business intelligence (BI) platform, Qlik Sense, to build an in-house analytics solution called the Qlik Sense Metrics Portal (QSMP). The tool was used to visualize data and provide KPIs and other important team-related metrics. Thanks to this solution, the company's software development teams were provided with insights not available through the application lifecycle management (ALM) tool which they normally used. In a similar vein, teams at Fannie Mae IT [S74] complemented insights from the pre-build static-analysis tool SonarQube with an internally developed solution, AIP. Snyder and Curtis [S74] stressed the importance of leveraging existing platforms where possible and adding new solutions only when necessary. Similarly, the CODEMINE data platform [S24], an internal Microsoft tool used by hundreds of users across all major Microsoft product groups, is integrated with a test prioritization, failure and change risk prediction tool called CRANE. CODEMINE supplied CRANE with data related to source code, features, defects, and people from Microsoft's different product teams.

However, not all tools were used only as in-house solutions. Some were made available for use by the public (indicated by a footnote in Table 11 where applicable), or are planned to be released for the benefit of the broader software engineering community, as in [S51]. Two of those studies (i.e. [S18,S86]) did not provide access to their tool, but the authors made their datasets public. Some works featured models that the authors wished to improve and build into tools in the future — as they were not ready at the time of writing those papers, they are not listed in Table 11; however, in this mapping study we shortly comment on them. For instance, [S12] discussed a work-in-progress reporting model for Agile projects. The authors of the paper reported that at that time, the method was being validated with new clients, and they aimed to implement it in an automated tool focusing on backlog data from JIRA to automatically calculate metrics. Similarly, Lunesu et al. [S49] demonstrated a simulator which the authors planned to extend into a simulation tool sourcing data from issue tracking systems such as JIRA or Redmine. We observed that especially in 2019 the selected primary studies featured a large number of tools, some of which are available to the public.

5. Discussion

In this section, we provide a high-level classification of the common themes that emerged from the reading of the selected primary studies. Fig. 14 wraps up our key findings. We also comment on topics that the authors of the papers found particularly important. Additionally, we summarize best practices reported in the literature regarding applying BDA to ASD projects as well as the ASD process itself. The section also provides recommendations for the software engineering research community and practitioners alike. The review of our findings is complemented with a discussion of their implications and limitations.

5.1. Analytics is not one-size-fits-all

Even though all companies covered in the selected primary studies adopted some of the Agile principles, they tend to differ in terms of size, structure, IT landscape, and data ecosystem, among others. As those organizations are not homogeneous, their analytics needs and employed BDA solutions also differ, as we will show in this section.

5.1.1. Different analytics for different needs

Company size and budget. Even though at the beginning Agile methods were designed for bottom-up adoption in small organizations, our literature review shows that large companies and corporations are the most frequently studied Agile organizations in terms of BDA. Also, from the analyzed studies, it becomes evident that big companies are more focused on leveraging the potential of BDA because of the abundance of data, various data sources, and greater resources (i.e. money, labor). Larger budgets and headcount also allow bigger organizations to allocate more resources to collaborate with academia and publish their research. Consequently, not all software analytics efforts are created equal: some are more suitable for large organizations, whereas others are more appropriate for SMEs [S63,S84]. The lack of overlap,


Table 11
Tools discussed in the primary studies.
Tool name Description Adoption Relevant study
ScrumCity Conceptual Visualization Visualization linking code with functional requirements and development activity Partial [S4]
WhoDo Automatic code review system Yes [S5]
Qlik Sense Metrics Portal (QSMP) Dashboard with KPIs visualizing metrics Yes [S6]
Besouro (a) Automatic TDD behavior evaluation system Partial [S10]
eConferenceMT Automatic translation of text messages No [S15]
AR-Miner (b) Mobile app review mining platform No [S18]
Unnamed Visualization tool for identification of inaccurately estimated user stories Partial [S22]
CODEMINE Platform for collecting and analyzing engineering process data Yes [S24]
REVV-Light Tool (c) Visualization tool for ambiguity in user stories No [S26]
Q-Rapids Tool (d) Predicting evolution of external quality of software and factors influencing it Partial [S28,S82]
AppFlow Synthesizing highly robust, highly reusable UI tests Partial [S39]
Crumb Management Platform Feature increment assessment with additional knowledge sources No [S41]
ProDynamics Self-assessment and visualization tool for behavior-driven sprint retrospectives No [S43]
Probe Dock (e) Test analytics platform Partial [S44]
Human-centered Agile Software Online Agile project management (APM) tool No [S46]
Engineering (HASE)
PRLifetime Service Pull request completion time prediction system Yes [S51]
ROCKET Tool automating test selection and scheduling in continuous integration Partial [S54]
FastLane System predicting test outcomes and commit risk Partial [S59]
Unnamed Platform for analyzing project scheduling and delivery capabilities Yes [S63]
TAITI Tool detecting candidate application files for tested based on associated tasks No [S64]
Tricoder Program analysis platform Yes [S66]
Unnamed Simulator for evaluating degrees of test automation Partial [S67]
Project Insights & Visualization Toolkit (PIVoT) Dashboard-based framework linking the developer with project management decisions Yes [S72]
Application Intelligence Platform (AIP) Structural-quality analytics Yes [S74]
RSTraceBot (f) Automatic reviewer recommender system No [S76]
Unnamed Measurement system for monitoring bottlenecks Partial [S78]
Unnamed Unified software project monitoring platform No [S79]
CI-ODOR Automated CI anti-patterns reporting tool No [S81]
Unnamed Monitoring-aware IDE plug-in Partial [S83]
Feature Survival Charts+ (FSC+) Requirements scope tracking tool Partial [S84]
TERMINATOR (g) Automated UI test case prioritization tool Partial [S86]
StackMine Postmortem performance debugging system Yes [S87]

(a) https://github.com/brunopedroso/besouro
(b) https://sites.google.com/site/appsuserreviews
(c) https://github.com/RELabUU/revv-light
(d) https://github.com/q-rapids
(e) https://probedock.io
(f) https://figshare.com/s/27a35b4ae70269481a2c
(g) https://github.com/ai-se/Data-for-automated-UI-testing-from-LexisNexis

Fig. 14. Key takeaways and observations of our mapping study.

as Robbes et al. [S63] explained, lies in the different analytics needs of small companies. However, as the authors of [S63] emphasize, some ''decision-making scenarios are similar (such as release planning, targeting training, and understanding customers)''.

Reusing existing platforms, tools and building on top of them. Our literature review also allowed us to conclude that organizations exhibited a different degree of BDA adoption, also due to varying technological maturity. Many works incorporated additional functionalities into existing tools or integrated with solutions already in use by the studied organizations. For instance, numerous authors mention that they plan to integrate their analytics solutions with tools such as JIRA [S12,S49] — a popular choice among organizations for tracking software issues. Leveraging existing platforms wherever possible is a strategy advised by several works covered in our study [S24,S74].


5.1.2. Variety and availability of data sources — no silver bullet

Challenge to utilize heterogeneous data. Over a third of the selected studies used more than one data source. As illustrated in [S28,S40], managing heterogeneous data sources and data orchestration is a complex task: it involves data collection, processing, and loading for further use in analytics solutions, which is a time-consuming effort. Hence, providing a broad spectrum of data in a timely manner is a challenge for organizations [S40,S47]. For that reason, numerous studies (e.g. [S79,S82]) stressed the need for aggregating statuses of various activities during the development process. For example, [S79] emphasized that in large-scale ASD organizations with numerous projects running in parallel, it is challenging to consolidate and unify data to effectively monitor statuses of multiple projects using hybrid processes and various tools. Moreover, the complexity inevitably increases with ML-powered applications. Keeping track of the metadata characteristics of datasets and storing versioned datasets is an important challenge to consider [48,49]. As reported in the literature, data versioning is not as prevalent a practice as code versioning [49]. This is an impediment to adopting especially the more advanced forms of BDA.

Deficiencies in the availability of data. Often analytics teams and researchers need to resort to readily available data. Getting the exact type of data they would need for their analysis is frequently not possible [S47] or requires manual effort [S78], because mechanisms to automatically collect the data are not in place. There are numerous reasons for this: confidentiality restrictions, business constraints limiting the collection of such data (shortage of money, time, human resources), or simply lack of knowledge on the business side that such data could improve the business operations of the company. Furthermore, data availability has a direct influence on the type of BDA algorithms that can be used: for instance, [S5] contends that ‘‘we did not start off with a machine learning approach as it requires ground truth in the form of positive and negative labels’’.

The amount of useful data is capped. As stressed in [S77], too much data may inhibit the decision-making process; as a consequence, the abundance of data results in putting limits on its usage. For instance, several primary studies included in our mapping study demonstrated that the availability of data does not necessarily imply its usage [S5,S28,S63,S74]. For example, [S63] described a scenario where a company decided to monitor neither its version control nor its defect tracker system, even though in other studies these types of data sources were used (e.g. [S6,S25]). Furthermore, inadequate analysis of available data is pervasive [S47]. Therefore, it is important to recognize what type of data would be needed for a given scenario and ensure its completeness and good quality.

5.1.3. Organizational entropy and data sharing practices

Organizational silos and legacy systems. It is also important to ensure there is a single version of the truth, which is technology-agnostic and comparable across applications [S72,S74,S79]. For instance, because of organizational silos that implied having different data sources with no comparable measures, Snyder et al. [S74] decided to use only one measure for analyzing productivity, even though more than one was available. Apart from aligning data sources at the company level, it is also important to ensure a common connectivity mechanism between applications (e.g. through well-defined APIs) [S38] as well as to preserve backward compatibility [S24]. ASD is associated with a reduced emphasis on documentation. However, as indicated by [S4,S34,S57], a low degree of architecture documentation adversely affects the maintenance of existing applications, in particular legacy software. Pareto et al. [S57] stressed that ‘‘lack of a clear architecture’’ significantly increases projects’ lead time. In the same vein, [S38] contended that documenting software architecture in large-scale Agile development is an important challenge without a clear solution, and that it is especially important to provide detailed architectural guidance to the ASD teams at the beginning of projects. The authors of [S57] advocated prioritization of documentation in line with a single business goal to streamline development.

Data disclosure. As discussed in the previous paragraph, the availability of data within organizations often tends to be limited, not to mention data being shared with third parties or the public. Companies, in general, are focused on building a competitive advantage and, hence, are more sensitive about disclosing information considered intellectual property.

5.2. Analytics getting more sophisticated but with unbalanced growth

BDA solutions proposed by academics and practitioners differ in many respects. Limited data and AI/ML model sharing and scarce transfer of research outcomes from academia into practice widen the BDA inequality gap. ML techniques proposed by scholars today may no longer be competitive in the market in the future.

5.2.1. BDA adoption is not mature yet but is improving

As could be expected, more sophisticated analytics has been on the rise in recent years. Yet, descriptive analytics is still the most prevalent among all discussed types of analytics, perhaps as it is the most fundamental and hence most clearly understood type. Nowadays, even the most basic variant of analytics enhances ASD in practice. Concretely, as numerous studies (e.g. [S28,S79]) suggest, it is nowadays vital to manage the complexity that comes from vast amounts of continuously arriving data collected in different tools; therefore, industry tends to implement and use tools that provide actionable insights, often based on analysis of metrics, without an explicit predictive component. The low number of adaptive analytics papers may be explained by the fact that it is a relatively new concept. However, one may assume this will most likely change in the future. With the increased adoption of simple solutions, we predict more use of advanced types of analytics (prescriptive, adaptive) in the coming years. Furthermore, different applications of adaptive analytics (even when not explicitly named as such) are being proposed in the scientific literature (e.g. [50]). Enriching software with adaptive capabilities is a promising research avenue.

5.2.2. Algorithms achieving best results today and tomorrow

Dominating algorithms. With regard to specific methods employed in the selected primary studies, techniques such as SVMs, NBs, NNs, or different ensemble methods were frequently chosen. Ensemble models, which achieved strong results in many selected primary studies, are stacked models and hence often provide the best results. For instance, RF, popular among the authors of the studies, is an easy-to-train ensemble of decision trees. On the other hand, NNs and DL methods, in general, require large datasets and careful hyperparameter tuning. Hence, NNs need more time to train and are more prone to overfitting. In terms of methods, DL is gaining momentum (e.g. [S21]); however, traditional statistical methods and regression models are still pervasive. Process mining, in turn, sees relatively low usage for process-related analysis in the software domain.

Data maintenance and model upkeep. Even the best performing model may not provide satisfying results in the future. Models degrade with time. Several measures need to be taken to make AI/ML models usable for a long time and to sustain their performance. Data needs to be continuously updated and validated [S9]. It is important to constantly ensure that the model is adjusted to the changing data distribution. Accordingly, the management of data pipelines is critical so that the maintenance of the model is streamlined.

5.2.3. Little transparency of ML techniques used in industry, replication difficult

Half of the studies were written either as a joint collaboration between academia and industry or solely by industry practitioners, which is a positive sign.


Industry vs academia: different environments, different priorities. Industry-led research is less explicit when it comes to ML methods used in ASD than publications originating from academia. Apart from data disclosure policies, another possible explanation might be that business ventures provide high-level information because it is simply more critical for them to focus on business-relevant metrics and system design rather than on specific algorithms. In the same vein, as opposed to theoretical works, companies are more interested in satisfying business needs, which often require complex orchestration of existing systems. Further, the analysis of performance is oftentimes not a straightforward task. Unlike in academia, where a standard dataset can be used for evaluation, industry-led research is set in a specific context. As Zhang et al. [S87] contended, although researchers can measure intermediate results using evaluation criteria such as precision or recall, for real-world tasks these are often not sufficient, requiring empirical evaluations performed together with practitioners. Therefore, following the same evaluation procedure is frequently neither feasible nor practical.

Difficult operationalization and dissemination of academic research to industry. In order for ML models to work within the business environment, analytics teams need to consider many practical constraints. Our study reveals that papers focused on BDA in ASD favored new procedures and techniques — the large number of procedures and techniques can be attributed to the considerably high number of journal articles covered in our study. A possible explanation is that BDA is a highly applied discipline. Nevertheless, the number of real-world applications of BDA in ASD with well-documented scientific evidence is still scarce. Our analysis of the state of industrial adoption and dissemination of research results originating from academia shows that the transfer of research outcomes into practice is not pervasive.

Replication is difficult. Research papers produced by academics often utilized data from open source projects (e.g. [S76,S81]). Nevertheless, the number of standard datasets for evaluation is relatively low. For instance, as reported in [S62], in the field of software estimation, the datasets are often small, outdated, and suffer from homogeneity, which may cause under- or over-fitting while training an ML model. A possible explanation is that the generation of a high-quality dataset is time- and resource-intensive. Furthermore, it is often a non-trivial task, as software repositories may vary in terms of information granularity or quality. However, efforts to create benchmark datasets or make internal datasets public were undertaken by some scholars (e.g. [S21,S61,S18,S86]). For instance, Sülün et al. [S76] used only one OSS project in their study, but for a comprehensive comparison of their novel method against existing approaches they had to resort to a publicly available dataset compiled by [51]. Furthermore, most studies which disclosed the techniques and models used (mostly studies by academics) did not provide information on the parameters. Hence, replication of results based on the information provided might be cumbersome. Also, releasing the code is not a common practice in SE. Nevertheless, based on how scholars in other domains (i.e. AI/ML) benefit from releasing the code to reproduce experimental results or to streamline the software development process, we hypothesize that it could also help in SE to establish baselines and advance the state of the art.

5.3. Analytics is already present in many areas of Agile software development, but the domain is not consolidated

ASD implementations may differ from company to company, but the main objective for BDA remains unchanged. BDA is predominantly employed to support informed decision-making and minimize project-related risks. To achieve better results, ASD processes can be tuned to provide BDA with more valuable data.

5.3.1. Key areas for implementing BDA come down to risk management

Data-driven ASD is mostly focused on project management analytics (effort estimation, resource allocation), requirements engineering, and software quality assurance (defect/bug fixing, testing, code maintenance). Hence, managing risk related to software development turned out to be the common denominator of the majority of studies. The authors of selected primary studies recognized the importance of minimizing delays (e.g. predicting the delay of issues) and predicting time for completing activities (e.g. predicting delivery capability or time to fix bugs). For instance, [S22] analyzed reasons for inaccurate effort estimations in ASD. The authors reported that large user stories, or user stories dependent on other artifacts in the product backlog, are more frequently inaccurately estimated. This can be explained by the fact that organizations are willing to apply BDA especially to ASD areas where they could suffer possibly the greatest capital and resource losses. Although studies discussing those areas confirm the potential of using ML models to improve activities in the respective fields, they rarely reach consensus regarding the best approaches. Further studies are required to provide recommendations.

5.3.2. Variability in how companies interpret and implement Agile

There was considerable variability in how companies interpreted and implemented ASD. Also, the large number of corporations and big companies meant that large-scale Agile implementations were also present in our study, perhaps being over-represented. Nevertheless, as such large ASD implementations generate a vast amount of data, it is understandable that BDA is particularly suitable for such applications. Moreover, especially among purely academic publications with no industrial use cases, the Agile development context was rarely in focus. Given the scarce information on analytics suited to specific Agile practices/techniques, we may conclude that the domain is not consolidated. However, a few common themes emerged:

Effort estimation is highly context-dependent, but a set of key features for cross-project estimation exists for BDA. The efficiency of utilization of team resources depends to a large extent on the experience of the users and is context-dependent [S45]. Story point estimation in ASD is specific to teams and projects [S21]. ‘‘In an issue report, the fields containing a summary, description, names of related components, and issue type provide relevant features for story point estimation. Most frequently, these features are project dependent’’ [S61]. Furthermore, analyses of feature relevance show that although features are highly dependent on the project and prediction stage, certain properties are important for most projects and phases [S11], such as the requirement creator, the time remaining to the end of an iteration, the time since the last requirement summary change, and the number of times a requirement has been replanned for a new iteration [S27]. Cross-project estimation is possible, but less accurate than estimation within the same organization [S21]. Moreover, Dehghan et al. [S27] contend that although satisfying predictions can be made at the early stages, the performance of predictions improves over time by taking advantage of requirements’ progress data. Therefore, as advocated in [S1,S22], BDA-aided effort estimation may prove especially helpful for inexperienced developers with limited estimation and software implementation experience, who can use ML models to validate their estimates or as an initial first estimate [S1]. With respect to productivity, Balogh et al. [S8] stressed the importance of measuring productivity at a fine-grained level.

Data presentation is no less important. Numerous studies (e.g. [S4,S6,S36,S72,S76,S78,S84]) stressed the importance of visual design, navigation, and visualization techniques in BDA, as they improve comprehension of data. With regard to ASD, Alshakhouri et al. [S4] argued that visual analytics in software engineering mainly provides insights about intangible software product artifacts (such as program runtime behavior or software evolution), while software process visualization is far less popular and deserves more attention. In the same vein, Augustine et al. [S6] demonstrated the ability of visual analytics to improve


the consistency of process implementation among software development teams. Furthermore, numerous studies (e.g. [S6,S84]) proved that visual representation of data often offers the kind of insights that are especially useful for decision-makers, such as dashboards visualizing the most vital metrics and KPIs for ASD processes. Similar to [S77] and [52], we conclude that visualization that supports the decision-making process is a very important addition to the BDA toolchain.

5.3.3. Design and implementation of ASD processes are key to better support BDA

All too often, BDA is built to support existing ASD processes with a presumption that those processes cannot change. That is to say, approaches for leveraging BDA are mostly top-down: with an ASD process in place, once data is available, a BDA solution is built. In contrast, we perceive that the key to increasing the adoption of BDA in ASD is to implement ASD processes in organizations in a way that better accommodates BDA. Namely, a bottom-up approach could focus on identifying what value the BDA solution should deliver and what data it needs. Based on that, the focus should be shifted to continually improving the quality of data and Agile practices to better serve the BDA solution. The changes in the ASD process can be small but may greatly improve the quality of insights provided by the analytics and, as a result, decision-making. There is a wealth of research on adapting ASD processes for ML and AI, which is beyond the scope of this mapping study. Hence, here we will only highlight some of the approaches proposed in the selected primary studies to improve Agile practices so that they better support BDA.

□ Backlog: projects often have an ill-structured product backlog, with missing properties for product backlog items [S12]. The quality of backlog data can be improved with dedicated tools (e.g. JIRA, VersionOne) offering standardized backlog functionality [S12].
□ Features: it is important to capture information such as the requirement creator, the time remaining to the end of an iteration, the time since the last requirement summary change, and the number of times a requirement was replanned for a new iteration [S27].
□ User stories: user stories should be written with a higher level of detail, which is more suitable for automatic prediction of effort [S1]. Abrahamsson et al. suggest that ‘‘very short (containing a few words), informal user stories do not provide enough information to train any effort prediction model that would yield results with a required level of accuracy’’.

In general, it is advised to capture fine-grained data throughout the project’s lifecycle — for instance, detailed descriptions of product backlog items, their creators, the time remaining to the end of an iteration, the time passed since the last requirement summary change, and the number of times requirements are replanned for a new iteration. Such a well-structured approach to gathering evidence can be aided by dedicated Agile product lifecycle management tools. Various kinds of artifacts ‘‘(requirements, source code, test cases, document, risks, etc.) are produced throughout the lifecycle. These artifacts are not independent from each other and are often interlinked with each other via so-called traceability links. These artifacts and their traceability link information are typically stored in modern application lifecycle management (ALM) tools, hence, it is relatively easier to access this traceability information’’ [S76].

5.4. Feedback and behavior analysis should be at the heart of the BDA strategy for ASD

Measuring customer feedback unobtrusively (e.g. telemetry, sentiment analysis of online reviews) offers an unprecedented opportunity to understand customers’ needs and consequently set proper product objectives. Yet, the opportunity has not been fully exploited so far.

5.4.1. Mobile apps lead in analyzing explicit user feedback

According to our analysis, the majority of customer feedback is gathered and analyzed for mobile applications. Data comes predominantly from user reviews in application stores. However, as Chen et al. [S18] found, only about 35% of app reviews provide insights that are directly applicable so that developers can improve the applications. We would like to see more industrial studies investigating data gathered from customer feedback and experimentation other than mobile applications.

5.4.2. Leveraging usage data coming from different data sources

Systematic experiments with customers are rare [S47], even though the authors of the selected primary studies advocate them [S30]. Studies very rarely monitored the usage of their proposed solutions and evaluated the feedback given by users (both implicitly and explicitly) during their regular work. Analyzing implicit feedback is critical as it lacks many biases of explicit feedback. In many cases, product instrumentation only covered performance data and basic user demographics. However, companies’ awareness of the importance of product usage data is growing, as some studies suggest [S47]. Companies need to start collecting product usage data, such as telemetry from IDEs or various ALM systems. In the same vein, supporting development teams with up-to-date monitoring data (e.g. within their IDE environment) allows them to better understand how their software works (its performance, usage patterns, possible bugs) as well as to comprehend complex business processes [S83].

5.4.3. Guiding product development in ASD through BDA

Release planning based on user feedback has been gaining more attention in the SE research community in recent years (e.g. [S56,S88]). For instance, [S56] presented a solution for analyzing customers’ satisfaction and dissatisfaction and planning future software releases based on stakeholder needs. Another example is the online controlled experiments (OCEs) approach to rolling out changes to ML-centric software. We expect to see more research on BDA that guides product development in an Agile environment, which also includes AI-powered applications. Some of the covered studies already incorporated mechanisms to gather feedback. For example, [S66] described a tool that includes a feedback loop that helps to improve the platform.

5.5. Threats to validity

In order to ensure the high quality of our study in terms of completeness and rigor, we identified the following threats to validity and implemented appropriate mitigation strategies. Our threats to validity fall into the following main categories [53]:

Incompleteness of search strategy (internal validity). The snowballing method, despite being considered to give results comparable to the database search method [34], suffers in terms of exhaustiveness. Hence, the results of this study can be skewed, as not all papers might be included (i.e. some could be mistakenly excluded or missed by the first author doing the review). However, as Kitchenham et al. [54] contend, this is also a defining characteristic of mapping studies in general. What is more, limiting the number of venues based on their ranking and the relevancy of published papers also impacted the list of our selected primary studies. Nevertheless, to better understand the completeness of our study, we ran a database search in one digital library (i.e. Thomson Reuters’s Web of Science). Using a search string similar to the one in Section 3.4 to find relevant primary studies on the topic — i.e. ‘‘agile *development’’ AND (‘‘*analytics’’ OR ‘‘data-driven’’ OR ‘‘big data’’) — we identified 215 studies, which is comparable to the result of our manual search, through which we found 272 relevant studies. Hence, we conclude that the miss ratio is not substantial.


Sample size representativeness (conclusion validity). The sample size of the literature review was relatively small and we were clearly not able to capture all studies covering the topic for the selected period. Nevertheless, we believe that the sample size should be representative of this area of research. To reduce this threat, in our literature review, we extracted data for analytical reviews. As it turns out, similar to [5], IEEE Software happened to be the second most frequently represented publishing venue in our study. However, we do not consider this an anomaly. Unlike in [5], our study is not skewed by the 2013 special issue of IEEE Software on software analytics. Moreover, also in the work of Dingsøyr et al. [32] the aforementioned journal was the major journal outlet for ASD related works. There were companies that were more frequently represented in our study than others — i.e. Microsoft [S17,S24,S30,S87], Ericsson [S57,S78], Google [S29,S66], and IBM [S11,S27]. However, together they accounted for only about 15% of all studies. Hence, we did not face a problem similar to the one identified in [6], where the majority of primary studies were conducted with companies known for collaborating with empirical software engineering researchers on a regular basis. Nevertheless, 18 papers did not reveal the names of the industrial partners participating in those studies (some of those papers discussed more than one industrial case study). Therefore, in total, the names of 34 companies were not disclosed; hence we cannot entirely rule out the possibility of having a bias toward particular companies, as it might be implicit.

Researcher bias (construct validity). Since the study selection and data extraction were performed by a single researcher, in principle, there was a high probability that researcher bias could adversely affect the overall result of the mapping study. However, in order to reduce the risks of researcher bias, we first developed and later followed a structured review method when conducting this study (designed in the form of a research protocol — demonstrated in Section 3.1). Moreover, in case of a vague extraction or unclear categorization, the first author consulted such studies with at least one of the remaining authors. Hence, we argue that no systematic errors that could impact our findings were introduced in the course of the study selection and data extraction.

Naming confusion (external validity). Agile methodology, and ASD in particular, do not always appear in the titles, author keywords, index terms, or even abstracts of the selected primary studies (e.g. [S19,S21]). Furthermore, ASD is often referred to in papers as iterative software development (e.g. [S20]). In addition, several concepts such as regression testing, automatic app review classification (e.g. [S17,S36,S54]), or continuous integration, although popular in ASD, are not directly linked to Agile practices in the papers describing them. For that reason, it was not always clear if a particular work discussed only plan-driven software development approaches or rather focused on modern Agile development practices, where incremental and iterative development plays a vital role. For instance, although the Rational Unified Process (RUP) process framework is not ASD per se, it encompasses many Agile-inspired techniques. Therefore, we often had to infer some information from the text, and, as in the case of RUP (e.g. [S57,S58,S63]), such studies were also included in our survey. In the same vein, naming confusion also occurred for the BDA concept. Terms such as software analytics and data analytics are used interchangeably in the literature. Also, AI, ML, and DL can be found in a growing body of studies on SE in ASD. Even though their meanings are slightly different, they also share a number of common properties. Furthermore, software analytics and data analytics belong to a common category: in order to provide non-obvious insights and improve the decision-making process, they need to employ ML, DL, or AI methods in some shape or form. Those methods require a significant amount of data (often complex or unstructured) to train models. For that reason, we decided to use the term BDA as an umbrella term for all the concepts mentioned above.

6. Conclusion

By performing a mapping study, we examined a body of primary research studies, and we described how data analytics did and could improve ASD. We also found that past work in the concerned field is mainly focused on requirements and delivery capabilities. As much of the work discussed in this SM study takes the form of solution proposals, rather than empirical studies in industry, we suggest that future research would benefit from: (1) more generalizable, representative empirical studies discussing methods and their applications, not only at the level of experience reports or solution proposals; (2) more industrial studies investigating data gathered from customer feedback and experimentation other than mobile applications. This is especially important because it is said that up to 80% of products lose money due to wrongly set product objectives that result in building products that customers do not need [55]. We expect to see more research on BDA that guides product development in an Agile environment.

Further research on this topic may focus on reviewing gray literature which, as we hypothesize, covers approaches related to those discussed in the selected primary studies, but seemingly in a less rigorous manner. Finally, it appears that ASD and BDA alike are very much similar to other phenomena that, over the decades, were once popular and considered breakthroughs, but eventually became absorbed into what is called business as usual. In that respect, the ASD and BDA concepts will most probably not fade away, but rather become embedded in many activities of modern software companies. Hence, at that point, it will be pointless to single them out, as they will be an integral part of business operations.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported in part by the Catalan Agencia de Gestión de Ayudas Universitarias y de Investigación (AGAUR) through the FI Ph.D. grant. The research is also partially supported by the Spanish Ministerio de Economía, Industria y Competitividad through the GENESIS project (grant TIN2016-79269-R).

Appendix. Additional resources

The spreadsheets used to track the following metadata and RQ-related information from the reviewed primary studies:

• Additional material concerning demographics
• Data extraction form
• Selected and excluded publishing outlets

can be found under the following link: https://mydisk.cs.upc.edu/s/G5XfnkFntj6oKoe

References

[1] Helena Holmström Olsson, Jan Bosch, From opinions to data-driven software R&D: A multi-case study on how to close the ’open loop’ problem, in: 2014 40th EUROMICRO Conference on Software Engineering and Advanced Applications, IEEE, 2014, pp. 9–16.
[2] Foster Provost, Tom Fawcett, Data science and its relationship to big data and data-driven decision making, Big Data 1 (1) (2013) 51–59.
[3] Deanne Larson, Victor Chang, A review and future direction of agile, business intelligence, analytics and data science, Int. J. Inf. Manage. 36 (5) (2016) 700–710.
[4] Barbara Kitchenham, What’s up with software metrics? - A preliminary mapping study, J. Syst. Softw. 83 (1) (2010) 37–51.


[5] Tamer Mohamed Abdellatif, Luiz Fernando Capretz, Danny Ho, Software analytics to software practice: A systematic literature review, in: Proceedings of the First International Workshop on BIG Data Software Engineering, BIGDSE ’15, IEEE Press, Piscataway, NJ, USA, 2015, pp. 30–36.
[6] Eetu Kupiainen, Mika Mäntylä, Juha Itkonen, Using metrics in Agile and Lean Software Development - A systematic literature review of industrial studies, Inf. Softw. Technol. 62 (2015) 143–163.
[7] Rashina Hoda, Norsaremah Salleh, John Grundy, Hui Mien Tee, Systematic literature reviews in agile software development: A tertiary study, Inf. Softw. Technol. 85 (2017) 60–70.
[8] Selami Bagriyanik, Adem Karahoca, Big data in software engineering: A systematic literature review, Glob. J. Inf. Technol. 6 (1) (2016) 107–116.
[9] Tore Dybå, Torgeir Dingsøyr, Empirical studies of agile software development: A systematic review, Inf. Softw. Technol. 50 (9–10) (2008) 833–859.
[10] Longbing Cao, Data science: A comprehensive overview, ACM Comput. Surv. 50 (3) (2017) 43:1–43:42.
[11] Han Hu, Yonggang Wen, Tat-Seng Chua, Xuelong Li, Toward scalable systems for big data analytics: A technology tutorial, IEEE Access 2 (2014) 652–687.
[12] Hype Cycle for Emerging Technologies, Gartner Inc., 2003, https://www.gartner.com/doc/2571624/hype-cycle-emerging-technologies-. (Accessed 30 July 2018).
[13] Barry Boehm, Richard Turner, Balancing Agility and Discipline: A Guide for the Perplexed, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2003.
[14] Kai Petersen, Claes Wohlin, The effect of moving from a plan-driven to an incremental software development approach with agile practices, Empir. Softw. Eng. 15 (2010) 654–693.
[15] Scott W. Ambler, in: Bertrand Meyer, Jerzy R. Nawrocki, Bartosz Walter (Eds.), Balancing Agility and Formalism in Software Engineering, Springer-Verlag, Berlin, Heidelberg, 2008, pp. 1–12.
[16] The Agile System Development Lifecycle (SDLC), Ambler, S.W., 2005, http://www.ambysoft.com/essays/agileLifecycle.html. (Accessed 30 July 2018).
[17] Maria Paasivaara, Benjamin Behm, Casper Lassenius, Minna Hallikainen, Large-scale agile transformation at Ericsson: a case study, Empir. Softw. Eng. (2018).
[18] Torgeir Dingsøyr, Nils Brede Moe, Tor Erlend Fægri, Eva Amdahl Seim, Exploring software development at the very large-scale: a revelatory case study and research agenda for agile method adaptation, Empir. Softw. Eng. 23 (1) (2018) 490–520.
[19] Scaled Agile Framework, Scaled Agile Inc., 2015, http://scaledagileframework.com/. (Accessed 30 July 2018).
[20] Less Framework, The LeSS Company B.V., 2015, http://less.works/. (Accessed 30 July 2018).
[21] Ambler S.W. (Ed.), Disciplined Agile Delivery: A Practitioner’s Guide to Agile Software Delivery in the Enterprise, IBM Press, 2012.
[22] K. Petersen, R. Feldt, M. Shahid, M. Mattsson, Systematic mapping studies in software engineering, in: Proceedings of the Evaluation and Assessment in Software Engineering, EASE’08, Bari, Italy, 2008, pp. 1–10.
[23] Pearl Brereton, Barbara A. Kitchenham, David Budgen, Mark Turner, Mohamed Khalil, Lessons from applying the systematic literature review process within the software engineering domain, J. Syst. Softw. 80 (4) (2007) 571–583.
[24] Øyvind Hauge, Claudia P. Ayala, Reidar Conradi, Adoption of open source software in software-intensive organizations - A systematic literature review, Inf. Softw. Technol. 52 (11) (2010) 1133–1154.
[25] Claes Wohlin, Second-generation systematic literature studies using snowballing, in: Sarah Beecham, Barbara Kitchenham, Stephen G. MacDonell (Eds.), EASE, ACM, 2016, pp. 15:1–15:6.
[26] Ayman Meidan, Julián A. García-García, Isabel Ramos, María José Escalona, Measuring software process: A systematic mapping study, ACM Comput. Surv. 51 (3) (2018) 58:1–58:32.
[27] Maarit Laanti, Outi Salo, Pekka Abrahamsson, Agile methods rapidly replacing traditional methods at Nokia: A survey of opinions on agile transformation, Inf. Softw. Technol. 53 (3) (2011) 276–290.
[28] David Budgen, Mark Turner, Pearl Brereton, Barbara Kitchenham, Using mapping studies in software engineering, in: PPIG, Psychology of Programming Interest Group, 2008, p. 20.
[29] Claes Wohlin, Guidelines for snowballing in systematic literature studies and a replication in software engineering, in: Martin J. Shepperd, Tracy Hall, Ingunn Myrtveit (Eds.), EASE, ACM, 2014, pp. 38:1–38:10.
[30] Pierre Bourque, Richard E. Fairley (Eds.), SWEBOK: Guide to the Software Engineering Body of Knowledge, IEEE Computer Society, 2014, Version 3.0.
[31] The 12th Annual State of Agile Report, VersionOne, 2018, https://explore.versionone.com/state-of-agile/versionone-12th-annual-state-of-agile-report. (Accessed 25 February 2019).
[32] Torgeir Dingsøyr, Sridhar Nerur, VenuGopal Balijepally, Nils Brede Moe, A decade of agile methodologies: Towards explaining agile software development, J. Syst. Softw. 85 (6) (2012) 1213–1221, Special Issue: Agile Development.
[33] B. Kitchenham, S. Charters, Guidelines for Performing Systematic Literature Reviews in Software Engineering, Technical Report EBSE-2007-01, School of Computer Science and Mathematics, Keele University, 2007.
[34] Samireh Jalali, Claes Wohlin, Systematic literature studies: Database searches vs. backward snowballing, in: Proceedings of the ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM ’12, ACM, New York, NY, USA, 2012, pp. 29–38.
[35] Tim Menzies, Thomas Zimmermann, Goldfish bowl panel: Software development analytics, in: Proceedings of the 34th International Conference on Software Engineering, IEEE Press, 2012, pp. 1032–1033.
[36] Tim Menzies, Thomas Zimmermann, Software analytics: so what? IEEE Softw. 30 (4) (2013) 31–37.
[37] Tim Menzies, Thomas Zimmermann, Software analytics: What’s next? IEEE Softw. 35 (5) (2018) 64–70.
[38] K. Krippendorff, Models of messages: three prototypes, in: G. Gerbner, O.R. Holsti, K. Krippendorff, G.J. Paisly, Ph.J. Stone (Eds.), The Analysis of Communication Content, Wiley, New York, 1969.
[39] G. Blackett, Analytics network - O.R. & analytics, 2018, https://theorsociety.com/Pages/SpecialInterest/AnalyticsNetwork_analytics.aspx. (Accessed 03 June 2018).
[40] David Ameller, Xavier Burgués, Oriol Collell, Dolors Costal, Xavier Franch, Mike P. Papazoglou, Development of service-oriented architectures using model-driven development, Inf. Softw. Technol. 62 (C) (2015) 42–66.
[41] Ayman Meidan, J.A. García-García, Isabel M. Ramos, María José Escalona Cuaresma, Measuring software process: A systematic mapping study, ACM Comput. Surv. 51 (2018) 58:1–58:32.
[42] Ali Ahmadian Ramaki, Abbas Rasoolzadegan Barforoush, Abbas Ghaemi Bafghi, A systematic mapping study on intrusion alert analysis in intrusion detection systems, ACM Comput. Surv. 51 (2018) 55:1–55:41.
[43] Dong Qiu, Bixin Li, Shunhui Ji, Hareton K.N. Leung, Regression testing of web service: A systematic mapping study, ACM Comput. Surv. 47 (2014) 21:1–21:46.
[44] Davi Silva Rodrigues, Márcio Eduardo Delamaro, Cléber Gimenez Corrêa, Fátima de Lourdes dos Santos Nunes, Using genetic algorithms in test data generation: A critical systematic mapping, ACM Comput. Surv. 51 (2018) 41:1–41:23.
[45] Mary Shaw, Writing good software engineering research papers: Minitutorial, in: Proceedings of the 25th International Conference on Software Engineering, ICSE ’03, IEEE Computer Society, 2003, pp. 726–736.
[46] The Apache Software Foundation, 2018, https://www.apache.org/foundation/how-it-works.html. (Accessed 30 July 2018).
[47] Amadeu Silveira Campanelli, Fernando Silva Parreiras, Agile methods tailoring - A systematic literature review, J. Syst. Softw. 110 (2015) 85–100.
[48] Anders Arpteg, Björn Brinne, Luka Crnkovic-Friis, Jan Bosch, Software engineering challenges of deep learning, in: 2018 44th Euromicro Conference on Software Engineering and Advanced Applications, SEAA, IEEE, 2018, pp. 50–59.
[49] Saleema Amershi, Andrew Begel, Christian Bird, Robert DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, Thomas Zimmermann, Software engineering for machine learning: A case study, in: 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP, IEEE, 2019, pp. 291–300.
[50] Hoa Khanh Dam, Truyen Tran, John Grundy, Aditya Ghose, Yasutaka Kamei, Towards effective AI-powered agile project management, 2018, arXiv preprint arXiv:1812.10578.
[51] Jakub Lipčák, Bruno Rossi, Rev-rec source - code reviews dataset (seaa2018), 2018, http://dx.doi.org/10.6084/m9.figshare.6462380.v1.
[52] Marijn Janssen, Haiko van der Voort, Agung Wahyudi, Factors influencing big data decision-making quality, J. Bus. Res. 70 (2017) 338–345.
[53] Kai Petersen, Sairam Vakkalanka, Ludwik Kuzniarz, Guidelines for conducting systematic mapping studies in software engineering: An update, Inf. Softw. Technol. 64 (2015) 1–18.
[54] Barbara A. Kitchenham, David Budgen, O. Pearl Brereton, Using mapping studies as the basis for further research - A participant-observer case study, Inf. Softw. Technol. 53 (6) (2011) 638–651.
[55] Visions 40 (3) (2016).

Primary studies

[S1] P. Abrahamsson, I. Fronza, R. Moser, J. Vlasenko, W. Pedrycz, Predicting development effort from user stories, in: Empirical Software Engineering and Measurement (ESEM), 2011 International Symposium on, IEEE, 2011, pp. 400–403.
[S2] S.D.A. Alam, M.R. Karim, D. Pfahl, G. Ruhe, Comparative analysis of predictive techniques for release readiness classification, in: Proceedings of the 5th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering, ACM, 2016, pp. 15–21.
[S3] S.D.A. Alam, D. Pfahl, G. Ruhe, Release readiness classification: An explorative case study, in: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ACM, 2016, p. 27.
[S4] M. Alshakhouri, J. Buchan, S.G. MacDonell, Synchronized visualization of software process and product artifacts: concept, design and prototype implementation, Inf. Softw. Technol. 98 (2018) 131–145.
[S5] S. Asthana, R. Kumar, R. Bhagwan, C. Bird, C. Bansal, C.S. Maddila, S. Mehta, B. Ashok, WhoDo: automating reviewer suggestions at scale, in: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019.


[S6] V. Augustine, J. Hudepohl, P. Marcinczak, W. Snipes, Deploying software team analytics in a multinational organization, IEEE Softw. 35 (1) (2018) 72–76.
[S7] G. Avelino, L.T. Passos, F. Petrillo, M.T. Valente, Who can maintain this code?: Assessing the effectiveness of repository-mining techniques for identifying software maintainers, IEEE Softw. 36 (2019) 34–42.
[S8] G. Balogh, G. Antal, Á. Beszédes, L. Vidács, T. Gyimóthy, Á.Z. Végh, Identifying wasted effort in the field via developer interaction data, in: Software Maintenance and Evolution (ICSME), 2015 IEEE International Conference on, IEEE, 2015, pp. 391–400.
[S9] F.A. Batarseh, A.J. Gonzalez, Predicting failures in agile software development through data analytics, Softw. Qual. J. 26 (1) (2018) 49–66.
[S10] K. Becker, B. de Souza Costa Pedroso, M.S. Pimenta, R.P. Jacobi, Besouro: A framework for exploring compliance rules in automatic TDD behavior assessment, Inf. Softw. Technol. 57 (2015) 494–508.
[S11] K. Blincoe, A. Dehghan, A.-D. Salaou, A. Neal, J. Linåker, D.E. Damian, High-level software requirements and iteration changes: a predictive model, Empir. Softw. Eng. 24 (2018) 1610–1648.
[S12] M.P. Boerman, Z. Lubsen, D.A. Tamburri, J. Visser, Measuring and monitoring agile development status, in: Proceedings of the Sixth International Workshop on Emerging Trends in Software Metrics, IEEE Press, 2015, pp. 54–62.
[S13] P. Braione, G. Denaro, A. Mattavelli, M. Vivanti, A. Muhammad, Software testing with code-based test generators: data and lessons learned from a case study with an industrial software component, Softw. Qual. J. 22 (2) (2014) 311–333.
[S14] B. Busjaeger, T. Xie, Learning for test prioritization: an industrial case study, in: Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ACM, 2016, pp. 975–980.
[S15] F. Calefato, F. Lanubile, T. Conte, R. Prikladnicki, Assessing the impact of real-time machine translation on multilingual meetings in global software projects, Empir. Softw. Eng. 21 (3) (2016) 1002–1034.
[S16] G. Çalikli, M. Staron, W. Meding, Measure early and decide fast: transforming quality management and measurement to continuous deployment, in: Proceedings of the 2018 International Conference on Software and System Process, ACM, 2018, pp. 51–60.
[S17] R. Carlson, H. Do, A. Denton, A clustering approach to improving test case prioritization: An industrial case study, in: ICSM, IEEE Computer Society, 2011, pp. 382–391.
[S18] N. Chen, J. Lin, S.C.H. Hoi, X. Xiao, B. Zhang, AR-miner: mining informative reviews for developers from mobile app marketplace, in: P. Jalote, L.C. Briand, A. van der Hoek (Eds.), ICSE, ACM, 2014, pp. 767–778.
[S19] M. Choetkiertikul, H.K. Dam, T. Tran, A. Ghose, Predicting the delay of issues with due dates in software projects, Empir. Softw. Eng. 22 (3) (2017) 1223–1263.
[S20] M. Choetkiertikul, H.K. Dam, T. Tran, A. Ghose, J. Grundy, Predicting delivery capability in iterative software development, IEEE Trans. Softw. Eng. 44 (6) (2018) 551–573.
[S21] M. Conoscenti, V. Besner, A. Vetrò, D.M. Fernández, Combining data analytics and developers feedback for identifying reasons of inaccurate estimations in agile software development, J. Syst. Softw. 156 (2019) 126–135.
[S22] R.M. Czekster, P. Fernandes, L. Lopes, A. Sales, A.R. Santos, T. Webber, Stochastic performance analysis of global software development teams, ACM Trans. Softw. Eng. Methodol. 25 (3) (2016) 26.
[S23] J. Czerwonka, N. Nagappan, W. Schulte, B. Murphy, CODEMINE: Building a software development data analytics platform at Microsoft, IEEE Softw. 30 (4) (2013) 64–71.
[S24] D.A. da Costa, S. McIntosh, C. Treude, U. Kulesza, A.E. Hassan, The impact of rapid release cycles on the integration delay of fixed issues, Empir. Softw. Eng. 23 (2) (2018) 835–904.
[S25] F. Dalpiaz, I.V.D. Schalk, S. Brinkkemper, F.B. Aydemir, G. Lucassen, Detecting terminological ambiguity in user stories: Tool and experimentation, Inf. Softw. Technol. 110 (2019) 3–16.
[S26] A. Dehghan, A. Neal, K. Blincoe, J. Linaker, D. Damian, Predicting likelihood of requirement implementation within the planned iteration: an empirical study at IBM, in: Mining Software Repositories (MSR), 2017 IEEE/ACM 14th International Conference on, IEEE, 2017, pp. 124–134.
[S27] C. Ebert, J. Heidrich, S. Martínez-Fernández, A. Trendowicz, Data science: Technologies for better software, IEEE Softw. 36 (2019) 66–72.
[S28] S. Elbaum, G. Rothermel, J. Penix, Techniques for improving regression testing in continuous integration development environments, in: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, ACM, 2014, pp. 235–245.
[S29] A. Fabijan, P. Dmitriev, H.H. Olsson, J. Bosch, The evolution of continuous experimentation in software product development: from data to a data-driven organization at scale, in: ICSE, IEEE / ACM, 2017, pp. 770–780.
[S30] B. Fitzgerald, M. Musiał, K.-J. Stol, Evidence-based decision making in lean software project management, in: Companion Proceedings of the 36th International Conference on Software Engineering, ACM, 2014, pp. 93–102.
[S31] A.S. Freire, M. Perkusich, R.M. Saraiva, H.O. de Almeida, A. Perkusich, A Bayesian networks-based approach to assess and improve the teamwork quality of agile teams, Inf. Softw. Technol. 100 (2018) 119–132.
[S32] D. Fucci, H. Erdogmus, B. Turhan, M. Oivo, N. Juristo, A dissection of the test-driven development process: Does it really matter to test-first or to test-last? IEEE Trans. Softw. Eng. 43 (7) (2017) 597–614.
[S33] J. García, K. Garcés, Improving understanding of dynamically typed software developed by agile practitioners, in: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ACM, 2017, pp. 908–913.
[S34] M. Golfarelli, S. Rizzi, E. Turricchia, Multi-sprint planning and smooth replanning: An optimization model, J. Syst. Softw. 86 (9) (2013) 2357–2370.
[S35] E. Guzman, M. El-Haliby, B. Bruegge, Ensemble methods for app review classification: An approach for software evolution, in: ASE, IEEE Computer Society, 2015, pp. 771–776.
[S36] M. Habayeb, S.S. Murtaza, A.V. Miranskyy, A.B. Bener, On the use of hidden Markov model to predict the time to fix bugs, IEEE Trans. Softw. Eng. 44 (2018) 1224–1244.
[S37] V. Heikkilä, M. Paasivaara, K. Rautiainen, C. Lassenius, T. Toivola, J. Järvinen, Operational release planning in large-scale scrum with multiple stakeholders - A longitudinal case study at F-Secure Corporation, Inf. Softw. Technol. 57 (2015) 116–140.
[S38] G. Hu, L. Zhu, J. Yang, AppFlow: Using Machine Learning to Synthesize Robust, Reusable UI Tests, ESEC/SIGSOFT, FSE, 2018.
[S39] H. Huijgens, R. Lamping, D. Stevens, H. Rothengatter, G. Gousios, D. Romano, Strong agile metrics: mining log data to determine predictive power of software metrics for continuous delivery teams, in: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ACM, 2017, pp. 866–871.
[S40] J.O. Johanssen, A. Kleebaum, B. Brügge, B. Paech, Feature crumbs: Adapting usage monitoring to continuous software engineering, in: PROFES, 2018.
[S41] M.R. Karim, A. Alam, S. Didar, S.J. Kabeer, G. Ruhe, B. Baluta, S. Mahmud, Applying data analytics toward optimized issue management: An industrial case study, in: Proceedings of the 4th International Workshop on Conducting Empirical Studies in Industry, ACM, 2016, pp. 7–13.
[S42] F. Kortum, J. Klünder, K. Schneider, Behavior-driven dynamics in agile development: The effect of fast feedback on teams, in: 2019 IEEE/ACM International Conference on Software and System Processes, ICSSP, 2019, pp. 34–43.
[S43] O. Liechti, J. Pasquier, R. Reis, Supporting agile teams with a test analytics platform: a case study, in: Automation of Software Testing (AST), 2017 IEEE/ACM 12th International Workshop on, IEEE, 2017, pp. 9–15.
[S44] J. Lin, Context-aware task allocation for distributed agile team, in: Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering, IEEE Press, 2013, pp. 758–761.
[S45] J. Lin, H. Yu, Z. Shen, C. Miao, Studying task allocation decisions of novice agile teams with data from agile project management tools, in: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, ACM, 2014, pp. 689–694.
[S46] E. Lindgren, J. Münch, Raising the odds of success: the current state of experimentation in product development, Inf. Softw. Technol. 77 (2016) 80–91.
[S47] M. Lu, P. Liang, Automatic classification of non-functional requirements from augmented app user reviews, in: EASE, ACM, 2017, pp. 344–353.
[S48] M.I. Lunesu, J. Münch, M. Marchesi, M. Kuhrmann, Using simulation for understanding and reproducing distributed software development processes in the cloud, Inf. Softw. Technol. 103 (2018) 226–238.
[S49] W. Maalej, M. Nayebi, T. Johann, G. Ruhe, Toward data-driven requirements engineering, IEEE Softw. 33 (1) (2016) 48–54.
[S50] C.S. Maddila, C. Bansal, N. Nagappan, Predicting pull request completion time: a case study on large scale cloud services, in: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019.
[S51] O. Malgonde, K. Chari, An ensemble-based model for predicting agile software development effort, Empir. Softw. Eng. 24 (2018) 1017–1055.
[S52] M. Manzano, E. Mendes, C. Gómez, C.P. Ayala, X. Franch, Using Bayesian networks to estimate strategic indicators in the context of rapid software development, in: PROMISE, 2018.
[S53] D. Marijan, A. Gotlieb, S. Sen, Test case prioritization for continuous regression testing: An industrial case study, in: ICSM, IEEE Computer Society, 2013, pp. 540–543.
[S54] D.I. Mattos, J. Bosch, H.H. Olsson, Multi-armed bandits in the wild: Pitfalls and strategies in online experiments, Inf. Softw. Technol. 113 (2019) 68–81.
[S55] M. Nayebi, G. Ruhe, Asymmetric release planning: Compromising satisfaction against dissatisfaction, IEEE Trans. Softw. Eng. 45 (9) (2019) 839–857.
[S56] L. Pareto, A.B. Sandberg, P.S. Eriksson, S. Ehnebom, Collaborative prioritization of architectural concerns, J. Syst. Softw. 85 (9) (2012) 1971–1994.
[S57] F. Paz, C. Zapata, J.A. Pow-Sang, An approach for effort estimation in incremental software development using cosmic function points, in: M. Morisio, T. Dybå, M. Torchiano (Eds.), ESEM, ACM, 2014, pp. 44:1–44:4.
[S58] A.A. Philip, R. Bhagwan, R. Kumar, C.S. Maddila, N. Nagappan, FastLane: Test minimization for rapidly deployed large-scale online services, in: 2019 IEEE/ACM 41st International Conference on Software Engineering, ICSE, 2019, pp. 408–418.
[S59] R. Pietrantuono, A. Bertolino, G.D. Angelis, B. Miranda, S. Russo, Toward continuous software reliability testing in DevOps, in: 2019 IEEE/ACM 14th International Workshop on Automation of Software Test, AST, 2019, pp. 21–27.


[S60] S. Porru, A. Murgia, S. Demeyer, M. Marchesi, R. Tonelli, Estimating story points from issue reports, in: Proceedings of the 12th International Conference on Predictive Models and Data Analytics in Software Engineering, ACM, 2016, p. 2.
[S61] P. Pospieszny, B. Czarnacka-Chrobot, A. Kobylinski, An effective approach for software project effort and duration estimation with machine learning algorithms, J. Syst. Softw. 137 (2018) 184–196.
[S62] R. Robbes, R. Vidal, M.C. Bastarrica, Are software analytics efforts worthwhile for small companies? The case of Amisoft, IEEE Softw. 30 (5) (2013) 46–53.
[S63] T. Rocha, P. Borba, J.P. Santos, Using acceptance tests to predict files changed by programming tasks, J. Syst. Softw. 154 (2019) 176–195.
[S64] P. Rodeghero, S. Jiang, A. Armaly, C. McMillan, Detecting user story information in developer-client conversations to generate extractive summaries, in: Software Engineering (ICSE), 2017 IEEE/ACM 39th International Conference on, IEEE, 2017, pp. 49–59.
[S65] C. Sadowski, J. Van Gogh, C. Jaspan, E. Söderberg, C. Winter, Tricorder: Building a program analysis ecosystem, in: Proceedings of the 37th International Conference on Software Engineering, Vol. 1, IEEE Press, 2015, pp. 598–608.
[S66] Z. Sahaf, V. Garousi, D. Pfahl, R. Irving, Y. Amannejad, When to automate software testing? decision support based on system dynamics: an industrial case study, in: Proceedings of the 2014 International Conference on Software and System Process, ACM, 2014, pp. 149–158.
[S67] C.J.T. Salinas, J. Sedeño, M.J. Escalona Cuaresma, M. Mejías, Estimating, planning and managing agile web development projects under a value-based perspective, Inf. Softw. Technol. 61 (2015) 124–144.
[S68] A. Scheerer, S. Bick, T. Hildenbrand, A. Heinzl, The effects of team backlog dependencies on agile multiteam systems: A graph theoretical approach, in: T.X. Bui, R.H. Sprague Jr. (Eds.), HICSS, IEEE Computer Society, 2015, pp. 5124–5132.
[S69] G. Schermann, P. Leitner, Search-based scheduling of experiments in continuous deployment, in: 2018 IEEE International Conference on Software Maintenance and Evolution, ICSME, 2018, pp. 485–495.
[S70] E. Scott, D. Pfahl, Using developers’ features to estimate story points, in: ICSSP, 2018.
[S71] V.S. Sharma, R. Mehra, S. Podder, A.P. Burden, A journey toward providing intelligence and actionable insights to development teams in software delivery, in: 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE, 2019, pp. 1214–1215.
[S72] B. Snyder, B. Curtis, Using analytics to guide improvement during an Agile–DevOps transformation, IEEE Softw. 35 (1) (2018) 78–83.
[S73] R.R.G. Souza, C. von Flach G. Chavez, R.A. Bittencourt, Rapid releases and patch backouts: A software analytics approach, IEEE Softw. 32 (2) (2015) 89–96.
[S74] M. Staron, W. Meding, Monitoring bottlenecks in agile and lean software development projects - a method and its industrial use, in: International Conference on Product Focused Software Process Improvement, Springer, 2011, pp. 3–16.
[S75] E. Sülün, E. Tüzün, U. Dogrusöz, Reviewer recommendation using software artifact traceability graphs, in: Proceedings of the Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering, 2019.
[S76] R.B. Svensson, R. Feldt, R. Torkar, The unfulfilled potential of data-driven decision making in agile software development, in: International Conference on Agile Software Development, Springer, 2019, pp. 69–85.
[S77] A. Szoke, Conceptual scheduling model and optimized release scheduling for agile environments, Inf. Softw. Technol. 53 (6) (2011) 574–591.
[S78] E. Tüzün, Ç. Üsfekes, Y. Macit, G. Giray, Toward unified software project monitoring for organizations using hybrid processes and tools, in: 2019 IEEE/ACM International Conference on Software and System Processes, ICSSP, 2019, pp. 115–119.
[S79] E.L. Vargas, J. Hejderup, M. Kechagia, M. Bruntink, G. Gousios, Enabling real-time feedback in software engineering, in: 2018 IEEE/ACM 40th International Conference on Software Engineering: New Ideas and Emerging Technologies Results, ICSE-NIER, 2018, pp. 21–24.
[S80] C. Vassallo, S. Proksch, H.C. Gall, M.D. Penta, Automated reporting of anti-patterns and decay in continuous integration, in: 2019 IEEE/ACM 41st International Conference on Software Engineering, ICSE, 2019, pp. 105–115.
[S81] A.M. Vollmer, S. Martínez-Fernández, A. Bagnato, J. Partanen, L. López, P.R. Marín, Practical experiences and value of applying software analytics to manage quality, in: 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM, 2019, pp. 1–6.
[S82] J. Winter, M.F. Aniche, J. Cito, A. van Deursen, Monitoring-aware IDEs, in: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019.
[S83] K. Wnuk, T. Gorschek, D. Callele, E.-A. Karlsson, E. Åhlin, B. Regnell, Supporting scope tracking and visualization for very large-scale requirements engineering-utilizing FSC+, decision patterns, and atomic decision visualizations, IEEE Trans. Softw. Eng. 42 (1) (2016) 47–74.
[S84] L. Xiao, Z. Yu, B. Chen, X. Wang, How robust is your development team? IEEE Softw. 35 (1) (2018) 64–71.
[S85] Z. Yu, F.M. Fahid, T. Menzies, G. Rothermel, K. Patrick, S. Cherian, TERMINATOR: better automated UI test case prioritization, in: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019.
[S86] D. Zhang, S. Han, Y. Dang, J.-G. Lou, H. Zhang, T. Xie, Software analytics in practice, IEEE Softw. 30 (5) (2013) 30–37.
[S87] J. Zhang, Y. Wang, T. Xie, Software feature refinement prioritization based on online user review mining, Inf. Softw. Technol. 108 (2019) 30–34.
