Predicting Fault Incidence Using Software Change History

  • Authors: Todd L. Graves, Alan F. Karr, J. S. Marron, Harvey Siy

  • Affiliations: Los Alamos National Lab, Los Alamos, NM; National Institute of Statistical Sciences, Research Triangle Park, NC; Univ. of North Carolina at Chapel Hill, Chapel Hill, NC; Lucent Technologies, Naperville, IL

  • Venue: IEEE Transactions on Software Engineering
  • Year: 2000

Abstract

This paper is an attempt to understand the processes by which software ages. We define code to be aged or decayed if its structure makes it unnecessarily difficult to understand or change and we measure the extent of decay by counting the number of faults in code in a period of time. Using change management data from a very large, long-lived software system, we explore the extent to which measurements from the change history are successful in predicting the distribution over modules of these incidences of faults. In general, process measures based on the change history are more useful in predicting fault rates than product metrics of the code: For instance, the number of times code has been changed is a better indication of how many faults it will contain than is its length. We also compare the fault rates of code of various ages, finding that if a module is, on the average, a year older than an otherwise similar module, the older module will have roughly a third fewer faults. Our most successful model measures the fault potential of a module as the sum of contributions from all of the times the module has been changed, with large, recent changes receiving the most weight.
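The abstract's most successful model scores a module's fault potential as a weighted sum over its change history, with large, recent changes weighted most heavily. The abstract does not give the functional form, so the sketch below is purely illustrative: it assumes change size times an exponential down-weighting by age, with a hypothetical decay constant `tau`.

```python
import math

def fault_potential(changes, now, tau=365.0):
    """Illustrative fault-potential score: sum over a module's changes of
    change size, down-weighted exponentially by the change's age in days.
    `tau` (the decay constant) and the exponential form are assumptions,
    not the paper's actual weighting function."""
    return sum(size * math.exp(-(now - date) / tau) for date, size in changes)

# Example: three changes given as (day of change, lines changed).
# Large, recent changes dominate the score; old changes contribute little.
history = [(0, 200), (300, 50), (700, 400)]
score = fault_potential(history, now=730)
```

Under this kind of weighting, two modules of equal size and change count can still receive very different scores, which is consistent with the abstract's finding that change-history measures outperform static product metrics such as length.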