Statistical Debugging Using Latent Topic Models

Authors:
David Andrzejewski;Anne Mulhern;Ben Liblit;Xiaojin Zhu
Affiliations:
Computer Sciences Department, University of Wisconsin, Madison WI 53706, USA (all four authors)
Venue:
ECML '07 Proceedings of the 18th European conference on Machine Learning
Year:
2007

Abstract

Statistical debugging uses machine learning to model program failures and help identify root causes of bugs. We approach this task using a novel Delta-Latent-Dirichlet-Allocation model. We model execution traces attributed to failed runs of a program as being generated by two types of latent topics: normal usage topics and bug topics. Execution traces attributed to successful runs of the same program, however, are modeled by usage topics only. Joint modeling of both kinds of traces allows us to identify weak bug topics that would otherwise remain undetected. We perform model inference with collapsed Gibbs sampling. In quantitative evaluations on four real programs, our model produces bug topics highly correlated to the true bugs, as measured by the Rand index. Qualitative evaluation by domain experts suggests that our model outperforms existing statistical methods for bug cause identification, and may help support other software tasks not addressed by earlier models.