Finding Latent Code Errors via Machine Learning over Program Executions

  • Authors:
  • Yuriy Brun;Michael D. Ernst

  • Affiliations:
  • University of Southern California;Massachusetts Institute of Technology

  • Venue:
  • Proceedings of the 26th International Conference on Software Engineering
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper proposes a technique for identifying programproperties that indicate errors. The technique generates machinelearning models of program properties known to resultfrom errors, and applies these models to program propertiesof user-written code to classify and rank propertiesthat may lead the user to errors. Given a set of propertiesproduced by the program analysis, the technique selectssubset of properties that are most likely to reveal an error.An implementation, the Fault Invariant Classifier,demonstrates the efficacy of the technique. The implementationuses dynamic invariant detection to generate programproperties. It uses support vector machine and decision treelearning tools to classify those properties. In our experimentalevaluation, the technique increases the relevance(the concentration of fault-revealing properties) by a factorof 50 on average for the C programs, and 4.8 for the Javaprograms. Preliminary experience suggests that most of thefault-revealing properties do lead a programmer to an error.