To determine the number and set of execution failures caused by particular faults, developers must undertake the arduous task of investigating and diagnosing each individual failure. Researchers have proposed failure-clustering techniques that categorize failures automatically, with the aim of isolating each culpable fault. Current techniques characterize each failure by its dynamic control flow and then cluster the failures on that basis. These techniques, however, are blind to the intent or purpose of each execution beyond what can be inferred from the control-flow profile. We hypothesize that semantically rich execution information can improve clustering effectiveness by categorizing failures according to the functionality they exhibit in the software. This paper presents a novel clustering method that uses latent-semantic-analysis techniques to categorize each failure by the semantic concepts expressed in the executed source code. We present an experiment comparing this new technique with traditional control-flow-based clustering. The results show that semantic-concept clustering was more precise in the number of clusters produced than the traditional approach, without sacrificing cluster accuracy.
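As a rough illustration of grouping failures by the vocabulary of the code they execute, the following minimal sketch builds a word-count vector for each failing execution from the identifiers it touched and groups failures whose vectors are similar. It is a simplified stand-in, not the paper's method: the dimensionality-reduction step of latent semantic analysis is omitted, and the function names, the sample identifiers, the greedy grouping rule, and the 0.5 threshold are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def term_vector(executed_identifiers):
    """Count the words found in the identifiers one execution touched."""
    words = Counter()
    for ident in executed_identifiers:
        words.update(split_identifier(ident))
    return words


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_failures(failures, threshold=0.5):
    """Greedy grouping: a failure joins the first cluster that contains
    a member whose similarity to it reaches the threshold (assumed rule)."""
    vectors = {name: term_vector(ids) for name, ids in failures.items()}
    clusters = []
    for name in failures:
        for cluster in clusters:
            if any(cosine(vectors[name], vectors[m]) >= threshold
                   for m in cluster):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical failing executions, each mapped to identifiers it executed.
failures = {
    "f1": ["parseXmlFile", "readXmlNode", "xml_parser"],
    "f2": ["parseXmlAttribute", "readXmlNode"],
    "f3": ["renderChart", "drawChartAxis", "chart_legend"],
    "f4": ["drawChartAxis", "renderChart"],
}

if __name__ == "__main__":
    for group in cluster_failures(failures):
        print(group)
```

On this toy input the XML-related failures and the chart-related failures fall into separate groups because they share no vocabulary. A fuller treatment would project the word-count vectors into a lower-dimensional concept space before comparing them.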