Group 2
Héctor Bállega Fernández
Fabian Nonnenmacher
Ruiyang Ding
Motivation
(e.g. which are the critical parts to test, what should be refactored, etc.)
Unfortunately...
The more projects we analyzed, the more the conclusions varied (conclusion instability) ⇒
it is hard to find general policies for software engineering
Research
What does the paper do?
Methodology
1. Choose one project as the bellwether
- Divide the attributes into discrete and continuous classes
- Randomly sub-sample the training data (used to train the Random Forest algorithm) until:
- The training data contains positive and negative classes in a 1:2 ratio.
2. Create test datasets: all other projects, excluding one project (the holdout)
Defect Detection
Challenge:
● Manual code reviews are time-consuming (costly)
● Which parts of the code most likely contain bugs?
Approach
● Use static code attributes as features (Figure 5)
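As a toy illustration of the idea, modules can be ranked by static code attributes so reviewers inspect the riskiest code first. The module names, attribute values, and weights below are invented; the paper feeds such attributes into a learned model (Random Forest), not a hand-weighted score.

```python
# Hypothetical static code attributes per module (invented values).
modules = {
    "Parser.java":    {"loc": 1200, "cyclomatic": 35},
    "Utils.java":     {"loc": 150,  "cyclomatic": 4},
    "Scheduler.java": {"loc": 800,  "cyclomatic": 21},
}

def risk_score(attrs):
    # Assumed weights for illustration; a real model would learn them
    # from modules labeled defective/clean.
    return 0.01 * attrs["loc"] + 1.0 * attrs["cyclomatic"]

# Review the highest-scoring (riskiest) modules first.
ranked = sorted(modules, key=lambda m: risk_score(modules[m]), reverse=True)
```

Here the large, complex `Parser.java` ranks first and the small, simple `Utils.java` last, mirroring the intuition behind using static attributes as defect features.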
Defect Detection
RQ1: How prevalent is the “Bellwether effect”?
Defect Detection
RQ2: How does the bellwether dataset fare against within-project datasets?
Defect Detection
RQ3: How does the bellwether dataset fare against other cross-project (transfer) learners?
Defect Detection
RQ4: How much data is required to find the bellwether dataset?
Defect Detection
RQ5: How effectively do bellwethers mitigate conclusion instability?
Technical Questions
Questions
Code Smell Detection
Challenges:
● Unclear which code smells are really relevant (see Fig.2)
● Code smells are hard to detect (detection based on common metrics yields many false positives)
Fig. 4 Dataset
Questions
How does conclusion instability happen? Are there other reasons?
In the paper:
- The data sources change as the datasets grow (data drift)
=> performance instability
- The constant influx of potential new data sources
=> source instability
Structure
1. Motivation
2. Research Methods and Techniques
3. Results
4. Implications
5. Technical questions
6. Discussion