Predicting Faults from Cached History

Authors:
Sunghun Kim;Thomas Zimmermann;E. James Whitehead Jr.;Andreas Zeller
Affiliations:
Massachusetts Institute of Technology, USA;Saarland University, Germany;University of California, Santa Cruz, USA;Saarland University, Germany
Venue:
ICSE '07 Proceedings of the 29th international conference on Software Engineering
Year:
2007


Abstract

We analyze the version history of seven software systems to predict their most fault-prone entities and files. The basic assumption is that faults do not occur in isolation, but rather in bursts of several related faults. We therefore cache locations that are likely to have faults: starting from the location of a known (fixed) fault, we cache the location itself, any locations changed together with the fault, recently added locations, and recently changed locations. By consulting the cache at the moment a fault is fixed, a developer can detect likely fault-prone locations. This is useful for prioritizing verification and validation resources on the most fault-prone files or entities. In our evaluation of seven open-source projects with more than 200,000 revisions, the cache selects 10% of the source code files; these files account for 73%-95% of faults, a significant advance beyond the state of the art.
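The caching idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `FaultCache`, its method names, the fixed `capacity`, and the least-recently-used eviction rule are all assumptions made here for illustration, since the abstract does not say how cache entries are replaced. The sketch only shows the four loading rules (the faulty location itself, locations changed together with it, recently added locations, recently changed locations) and how a hit is counted when a fault is fixed.

```python
from collections import OrderedDict


class FaultCache:
    """Fixed-size cache of locations (e.g. file paths) suspected to be fault-prone.

    Hypothetical sketch: names, capacity handling, and LRU eviction are
    assumptions, not details taken from the abstract.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Keys are ordered from least recently loaded to most recently loaded.
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def __contains__(self, location):
        return location in self._entries

    def locations(self):
        """Return cached locations, least recently loaded first."""
        return list(self._entries)

    def _load(self, location):
        """Put a location in the cache, evicting the oldest entry if full."""
        if location in self._entries:
            self._entries.move_to_end(location)
            return
        self._entries[location] = True
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # assumed LRU eviction

    def record_added(self, location):
        """Rule: recently added locations are cached."""
        self._load(location)

    def record_changed(self, location):
        """Rule: recently changed locations are cached."""
        self._load(location)

    def record_fault_fixed(self, location, changed_together=()):
        """Count a hit or miss for a fixed fault, then apply the loading rules.

        The faulty location and every location changed together with it
        are cached. Returns True if the location was already cached.
        """
        hit = location in self._entries
        if hit:
            self.hits += 1
        else:
            self.misses += 1
        for other in changed_together:
            self._load(other)
        self._load(location)
        return hit

    def hit_rate(self):
        """Fraction of fixed faults whose location was already cached."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


# Usage with made-up file names.
cache = FaultCache(capacity=3)
cache.record_added("parser.c")
cache.record_fault_fixed("lexer.c", changed_together=["parser.c", "token.c"])
cache.record_fault_fixed("token.c")  # already cached, so this counts as a hit
```

In this sketch, consulting the cache when a fault is fixed amounts to calling `locations()`; the hit rate corresponds loosely to the share of faults found in cached files that the abstract reports, although the real evaluation setup may differ.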