Bellwethers: A Baseline Method For Transfer Learning: Group 2 Héctor Bállega Fernández Fabian Nonnenmacher Ruiyang Ding

This paper proposes using "Bellwethers" as a baseline method for transfer learning to mitigate conclusion instability in software engineering research. A Bellwether is a project that is used to learn from and then apply those learnings to other projects. The paper evaluates using Bellwethers for tasks like code smell detection, effort estimation, issue lifetime estimation, and defect prediction. Results show the Bellwether approach can effectively mitigate conclusion instability compared to evaluating projects individually. The paper also discusses technical questions around how Bellwethers are selected and evaluated.

Bellwethers: A Baseline Method for Transfer Learning

Rahul Krishna, Tim Menzies

Published in: IEEE Transactions on Software Engineering

Citations: 10

Group 2
Héctor Bállega Fernández
Fabian Nonnenmacher
Ruiyang Ding

Motivation

Software managers want clear guidelines about software quality (e.g. which parts are critical to test, what should be refactored).

Unfortunately...

The more projects we analyze, the more the conclusions vary (conclusion instability)
⇒ this makes it hard to find general policies for software engineering

Research
What does the paper do?

● Proposes using “bellwethers” to mitigate conclusion instability

● Bellwether = a project to learn from, whose learnings are then applied to other projects

● Uses “bellwethers” as a baseline method for transfer learning, evaluated on:

○ code smell detection
○ effort estimation
○ issue lifetime estimation
○ defect prediction

Methodology
1. Choose one project as the bellwether
- Divide the class into discrete and continuous classes

- Randomly sub-sample the training data, with the Random Forest algorithm, until:
- The training data has only positive and negative classes in a ratio of 1:2

2. Create test datasets from all other projects, with one project (the holdout) excluded

3. Evaluate the performance of the bellwether on all test datasets

⇒ Select the bellwether that delivers consistently high performance
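
The steps above can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the slides: the function names (`subsample_one_to_two`, `train_threshold`, `accuracy`, `find_bellwether`) are hypothetical, a toy one-feature threshold learner stands in for Random Forest so the example runs with the standard library alone, "consistently high performance" is approximated by the median accuracy over the test datasets, and the project data is made up.

```python
import random
from statistics import median


def subsample_one_to_two(rows, seed=0):
    """Randomly sub-sample (feature, label) rows to a 1:2 positive:negative ratio."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[1] == 1]
    neg = [r for r in rows if r[1] == 0]
    n_pos = min(len(pos), len(neg) // 2)  # largest positive count that keeps 1:2 feasible
    return rng.sample(pos, n_pos) + rng.sample(neg, 2 * n_pos)


def train_threshold(rows):
    """Toy learner (assumption, stands in for Random Forest): midpoint of the class means.

    Needs at least one positive and one negative row.
    """
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    """Fraction of rows where 'feature > threshold' agrees with the positive label."""
    return sum((x > threshold) == (y == 1) for x, y in rows) / len(rows)


def find_bellwether(projects, holdout):
    """Train on each candidate project, test on all other non-holdout projects."""
    candidates = [name for name in projects if name != holdout]
    medians = {}
    for source in candidates:
        model = train_threshold(subsample_one_to_two(projects[source]))
        scores = [accuracy(model, projects[t]) for t in candidates if t != source]
        medians[source] = median(scores)
    best = max(medians, key=medians.get)  # highest median score wins
    return best, medians


# Made-up data: each row is (static attribute value, defective? 1/0).
projects = {
    "A": [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (8.0, 1), (9.0, 1)],
    "B": [(1.0, 0), (2.0, 0), (6.0, 1)],
    "C": [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (10.0, 1), (11.0, 1)],
    "D": [(2.0, 0), (3.0, 0), (7.0, 1)],  # holdout, never used for selection
}
best, medians = find_bellwether(projects, holdout="D")
best_model = train_threshold(subsample_one_to_two(projects[best]))
holdout_score = accuracy(best_model, projects["D"])
print(best, medians, holdout_score)
```

Keeping the holdout project out of the selection loop means the final score on it is measured on data that played no part in choosing the bellwether.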

Defect Detection
Challenge:
● Manual code reviews are time-consuming (costly)
● Which parts of the code most likely contain bugs?

Approach
● Use static code attributes as features (Figure 5)
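
As a rough illustration of what "static code attributes as features" can mean, the sketch below computes a few simple attributes from Python source using the standard `ast` module. The attribute set here (`loc`, `functions`, `classes`, `branches`) and the function name `static_attributes` are assumptions chosen for the example; they are not the attributes listed in Figure 5 of the paper.

```python
import ast


def static_attributes(source: str) -> dict:
    """Compute a few illustrative static attributes of a Python module."""
    nodes = list(ast.walk(ast.parse(source)))
    return {
        # Non-blank lines of code
        "loc": sum(1 for line in source.splitlines() if line.strip()),
        "functions": sum(isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) for n in nodes),
        "classes": sum(isinstance(n, ast.ClassDef) for n in nodes),
        # Very rough proxy for control-flow complexity
        "branches": sum(isinstance(n, (ast.If, ast.For, ast.While, ast.Try)) for n in nodes),
    }


SAMPLE = '''
class Stack:
    def push(self, item):
        if item is None:
            raise ValueError("item required")
        self.items.append(item)


def total(values):
    result = 0
    for v in values:
        result += v
    return result
'''

features = static_attributes(SAMPLE)
print(features)
```

Each module (or class) then becomes one row of such numbers, paired with a defective/non-defective label, which is the kind of table a defect predictor is trained on.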

Defect Detection
RQ1: How prevalent is the “Bellwether effect”?

Defect Detection
RQ2: How does the bellwether dataset fare against the within-project dataset?

Defect Detection
RQ3: How does the bellwether dataset fare against the within-project dataset?

Defect Detection
RQ4: How much data is required to find the bellwether dataset?

Defect Detection
RQ5: How effectively do bellwethers mitigate conclusion instability?

Technical Questions

- Why is the “holdout set” required to determine the best bellwether project?

- Estimation of issue lifetime: commits and comments are not available at issue creation time?

Questions

- Is a bellwether identified within a (homogeneous) community?

- Is it a legitimate approach to put all projects of the same organization into the same community?
- Would better results be achieved if the projects were clustered into smaller communities based on some other characteristics?

- How can an IT manager find the bellwether without having to analyze all projects first? How can they evaluate its performance?

Code Smell Detection
Challenges:
● Unclear which code smells are really relevant (see Fig. 2)
● Code smells are hard to detect (detection based on common metrics → many false positives)

Fig. 4 Dataset

Questions
How does conclusion instability happen? Are there any other reasons?

In the paper:
- Change of the data source as the datasets grow (data drift)
⇒ Performance instability
- The constant influx of potential new data sources
⇒ Source instability

Structure
1. Motivation
2. Research Methods and Techniques
3. Results
4. Implications
5. Technical questions
6. Discussion
