Fair and balanced?: bias in bug-fix datasets

Authors:
Christian Bird;Adrian Bachmann;Eirik Aune;John Duffy;Abraham Bernstein;Vladimir Filkov;Premkumar Devanbu
Affiliations:
University of California, Davis, Davis, CA, USA;University of Zurich, Zurich, Switzerland;University of California, Davis, Davis, CA, USA;University of California, Davis, Davis, CA, USA;University of Zurich, Zurich, Switzerland;University of California, Davis, Davis, CA, USA;University of California, Davis, Davis, CA, USA
Venue:
Proceedings of the 7th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
Year:
2009

Abstract

Software engineering researchers have long been interested in where and why bugs occur in code, and in predicting where they might turn up next. Historical bug-occurrence data has been key to this research. Bug tracking systems and code version histories record when, how, and by whom bugs were fixed; from these sources, datasets that relate file changes to bug fixes can be extracted. These historical datasets can be used to test hypotheses concerning processes of bug introduction, and also to build statistical bug prediction models. Unfortunately, processes and humans are imperfect: only a fraction of bug fixes are actually labelled in source code version histories and thus become available for study in the extracted datasets. The question naturally arises: are the bug fixes recorded in these historical datasets a fair representation of the full population of bug fixes? In this paper, we investigate historical data from several software projects, and find strong evidence of systematic bias. We then investigate the potential effects of "unfair, imbalanced" datasets on the performance of prediction techniques. We draw the lesson that bias is a critical problem that threatens both the effectiveness of processes that rely on biased datasets to build prediction models and the generalizability of hypotheses tested on biased data.