Intro Thesis
Introduction
Contents
1.1 Presentation
1.2 Research Problem
1.2.1 Big Data
1.2.2 Software Development and Maintenance
1.2.3 Visual Analytics and Software Maintenance
1.3 Aims and Research Questions
1.4 Methodology and Outline
1.5 Research lines of this thesis
1.1 Presentation
Software systems are nearly omnipresent in daily life and are embedded in
almost all the devices that people and businesses use. Individuals use these
devices and the associated software, among many other things, to work; to
learn (take online courses, read and research topics of interest); to
entertain themselves (play, watch TV or videos, listen to music); to
communicate (with friends, family and co-workers, participate in forums,
collaborate and hold work meetings); to buy things; to do paperwork (banking,
taxes and other payments); and to telecommute [Charette 2005]. This has been
the social reality of the recent past, is the current social reality, and
will continue to be the everyday social reality in the future.
process and the involved data elements and their relationships, whereas
Chapter 2 provides a detailed explanation of that process.
structure before and after the change, and the source code and the changes
that were carried out on it [Estublier 2000]⁴.
A revision identifies the state of the project at the moment a change is
committed. Revisions are stored in a software repository under the control
of an SCM tool, and each one is associated with a revision number.
Consequently, SE is an iterative process made up of the accumulation of
changes and revisions during software maintenance and development
[Fernandez-Ramil 2008].
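
The notions of change, revision and repository described above can be pictured with a small data model. The Python sketch below is only illustrative: the class names `Change`, `Revision` and `Repository`, their fields, and the sample authors and file paths are assumptions made for this example and do not correspond to any particular SCM tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class Change:
    """A single modification to one file (hypothetical model)."""
    path: str
    kind: str  # "added", "modified" or "deleted"


@dataclass(frozen=True)
class Revision:
    """State of the project at the moment a set of changes was committed."""
    number: int
    author: str
    committed_at: datetime
    changes: Tuple[Change, ...]


@dataclass
class Repository:
    """Accumulates revisions and assigns consecutive revision numbers."""
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, author: str, changes: List[Change]) -> Revision:
        revision = Revision(
            number=len(self.revisions) + 1,
            author=author,
            committed_at=datetime.now(timezone.utc),
            changes=tuple(changes),
        )
        self.revisions.append(revision)
        return revision


# Hypothetical history: two commits by two programmers.
repo = Repository()
repo.commit("alice", [Change("src/Main.java", "added")])
latest = repo.commit(
    "bob",
    [Change("src/Main.java", "modified"), Change("src/Util.java", "added")],
)
print(latest.number, len(repo.revisions))
```

In this sketch the history of the project is simply the ordered list of revisions, which mirrors the idea that evolution is the accumulation of changes over time.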
SE usually spans several years, generating thousands and even millions of
Lines of Source Code (LOC) [Kagdi 2007a], hundreds of software components
and thousands of revisions [D’Ambros 2008]. Furthermore, relationships exist
among the software items of a project in the form of inheritance, interface
implementation, coupling and cohesion. In addition, source code is composed
of variables, constants, programming structures, methods and the
relationships among those elements. Besides logs, communication systems,
defect-tracking systems and LOC tools keep records with dates, comments, the
changes made to software projects and the associated users and programmers
[Hassan 2005].
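
As a rough illustration of how one of the relationships mentioned above (inheritance) might be recovered from source code, the following sketch walks a syntax tree with Python's standard `ast` module. It assumes Python source purely for self-containment; the sample classes and the helper name `inheritance_edges` are hypothetical, and only bases written as plain names are handled.

```python
import ast

# Hypothetical source code of a tiny project.
SOURCE = """
class Shape:
    pass

class Circle(Shape):
    pass

class Square(Shape):
    pass
"""


def inheritance_edges(source: str) -> list:
    """Return sorted (subclass, superclass) pairs found in the source."""
    tree = ast.parse(source)
    edges = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            for base in node.bases:
                # Only simple names are considered; dotted bases are skipped.
                if isinstance(base, ast.Name):
                    edges.append((node.name, base.id))
    return sorted(edges)


edges = inheritance_edges(SOURCE)
print(edges)
```

The resulting pairs form the edges of an inheritance graph; coupling or interface implementation would require additional, language-specific analysis.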
Accordingly, SEA requires the assistance of automatic analysis tools to aid
the understanding of a software project. It involves evaluating individual
revisions and comparing the output of those evaluations across a given
number of revisions, or across all the revisions associated with the
project. In this context, the analysis of an individual revision includes
understanding the structural characteristics of the project, the
relationships among software items, the software quality metrics, source
code facts, and the socio-technical relationships derived from the
development process.
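
The two steps described above, evaluating each revision and then comparing the outputs, can be sketched as follows. This is a minimal example under stated assumptions: a count of non-blank lines stands in for a real software quality metric, the revision snapshots are invented, and the function names `count_loc`, `analyze_revision` and `compare_revisions` are hypothetical.

```python
def count_loc(source: str) -> int:
    """Count non-blank lines; a deliberately simple stand-in for a metric."""
    return sum(1 for line in source.splitlines() if line.strip())


def analyze_revision(files: dict) -> dict:
    """Evaluate one revision: map each file path to its LOC."""
    return {path: count_loc(text) for path, text in files.items()}


def compare_revisions(older: dict, newer: dict) -> dict:
    """Compare two per-revision results: LOC delta for every file path."""
    paths = set(older) | set(newer)
    return {p: newer.get(p, 0) - older.get(p, 0) for p in sorted(paths)}


# Hypothetical snapshots of two revisions (file path -> source text).
rev1 = {"a.py": "x = 1\n\ny = 2\n"}
rev2 = {"a.py": "x = 1\ny = 2\nz = 3\n", "b.py": "print('hi')\n"}

metrics = [analyze_revision(r) for r in (rev1, rev2)]
deltas = compare_revisions(metrics[0], metrics[1])
print(deltas)
```

Applying the same comparison to every pair of consecutive revisions would yield a time series of metric changes, which is the kind of output an automatic analysis tool could then summarize or visualize.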
SEA also requires retrieving data from the source code, the software project
structure, communication systems, logs and the metadata⁵ records of
bug-tracking and SCM tools [García-Peñalvo 2011]. It therefore relies on a
set of techniques for recovering and analyzing software projects in order to
discover patterns and relationships, calculate software quality metrics and
extract facts from the results of comparing the analyses performed on
revisions [D’Ambros 2008]. Although,
⁴ The IEEE Standard 828-2005 [STA 2005] states that “SCM activities include the
identification and establishment of baselines; the review, approval, and control of changes;
the tracking and reporting of such changes; the audits and reviews of the evolving software
product; and the control of interface documentation and project supplier SCM. SCM is
the means through which the integrity and traceability of the software system are recorded,
communicated, and controlled during both development and maintenance.”
⁵ Metadata contains descriptive details about the data.
∗ Offer details about the software elements upon which their work
depends.
∗ Provide information about the software components that are being
simultaneously modified by other programmers.
Table 1.1: Correlation of the methodology with Chapters and research activities.

Methodology phases  Chapters                          Research elements
and activities
------------------  --------------------------------  ----------------------------------------
Plan/Revise plan    Introduction                      Goals
                                                      Objectives
                                                      Research questions
                                                      Research problem
                    Software Systems: Maintenance     Description of the software development
                    and Evolution                       and maintenance process
                                                      Discussion of software evolution
                                                        concepts and details
                    Visual Analytics                  Description of the visual analytics
                                                        process and its components
Diagnose            Systematic Mapping Study          Classification of research works
                                                      Identification of tasks, data elements
                                                        and visualizations in use
                                                      Analysis and discussion
                    Survey on the Use of Visual       Discussion and analysis of results
                    Tools in Software Development
                    and Maintenance
                    Focused analysis and discussion   Detailed review of selected research
                                                        works
Take action         A Visual Analytics Process        Process description
                    for Software Evolution            Architecture specification
                                                      Tool design and implementation
Evaluate            Architecture Validation           Use case scenarios
                                                      Case study
                                                      User assessment test
                                                      Analysis of results
Analyze findings    Conclusions                       Conclusions