Improving the Precision of Dependence-Based Defect Mining by Supervised Learning of Rule and Violation Graphs

  • Authors:
  • Boya Sun; Andy Podgurski; Soumya Ray

  • Venue:
  • ISSRE '10 Proceedings of the 2010 IEEE 21st International Symposium on Software Reliability Engineering
  • Year:
  • 2010

Abstract

Previous work has shown that applying graph mining techniques to system dependence graphs improves the precision of automatic defect discovery by revealing subgraphs that correspond to implicit programming rules and to rule violations. However, developers must still confirm, edit, or discard reported rules and violations, which is both costly and error-prone. To reduce developer effort and further improve precision, we investigate the use of supervised learning models for classifying and ranking rule and violation subgraphs. In particular, we present and evaluate separate logistic regression models for rules and for violations, both based on general dependence-graph features. Our empirical results indicate that (i) use of these models can significantly improve the precision and recall of defect discovery, (ii) our approach is superior to existing heuristic approaches to rule and violation ranking and to an existing static-warning classifier, and (iii) accurate models can be learned using only a few labeled examples.
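
To make the idea of classifying and ranking candidate subgraphs with logistic regression concrete, the following is a minimal sketch in Python, not the authors' implementation. The two features (a normalized support value and a data-dependence edge fraction), the training values, and all names such as `train_logistic` and `rank_candidates` are hypothetical and chosen only for illustration; the abstract does not specify the actual feature set.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights (bias stored last) by batch gradient descent on log-loss."""
    n_features = len(X[0])
    w = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1])
            err = p - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Predicted probability that a candidate subgraph is a valid rule."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])

def rank_candidates(w, candidates):
    """Sort (name, features) pairs by descending predicted probability."""
    scored = [(name, predict_proba(w, feats)) for name, feats in candidates]
    return sorted(scored, key=lambda t: t[1], reverse=True)

# Hypothetical labeled examples: [normalized support, data-dependence edge fraction].
# Label 1 = developer confirmed the mined rule, 0 = developer discarded it.
X = [[0.9, 0.8], [0.8, 0.7], [0.7, 0.9], [0.2, 0.1], [0.3, 0.2], [0.1, 0.3]]
y = [1, 1, 1, 0, 0, 0]

w = train_logistic(X, y)

# Hypothetical unlabeled candidates to be ranked for developer review.
ranking = rank_candidates(w, [("cand_b", [0.15, 0.2]), ("cand_a", [0.85, 0.75])])
for name, prob in ranking:
    print(name, round(prob, 3))
```

In this sketch, the predicted probability serves both purposes mentioned in the abstract: thresholding it yields a classification, and sorting by it yields a ranking, so that a developer would review the most plausible candidates first. A separate model of the same form could be trained for violation subgraphs with its own labeled examples.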