Automated, highly-accurate, bug assignment using machine learning and tossing graphs

Authors:
Pamela Bhattacharya;Iulian Neamtiu;Christian R. Shelton
Affiliations:
Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA;Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA;Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA
Venue:
Journal of Systems and Software
Year:
2012

Abstract

Empirical studies indicate that automating the bug assignment process has the potential to significantly reduce software evolution effort and costs. Prior work has used machine learning techniques to automate bug assignment but has employed a narrow band of tools that can be ineffective in large, long-lived software projects. To redress this situation, in this paper we employ a comprehensive set of machine learning tools and a probabilistic graph-based model (bug tossing graphs) that lead to highly accurate predictions and lay the foundation for the next generation of machine learning-based bug assignment. Our work is the first to examine the impact of multiple machine learning dimensions (classifiers, attributes, and training history), along with bug tossing graphs, on prediction accuracy in bug assignment. We validate our approach on Mozilla and Eclipse, covering 856,259 bug reports and 21 cumulative years of development. We demonstrate that our techniques can achieve up to 86.09% prediction accuracy in bug assignment and significantly reduce tossing path lengths. We show that, for our data sets, the Naive Bayes classifier coupled with product-component features, tossing graphs, and incremental learning performs best. Next, we perform an ablative analysis by unilaterally varying classifiers, features, and learning model to show their relative importance to bug assignment accuracy. Finally, we propose optimization techniques that achieve high prediction accuracy while reducing training and prediction time.
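
The abstract names bug tossing graphs without defining them. As a rough illustration only, the Python sketch below assumes one common reading: a tossing path lists the developers a bug was assigned to in order, the last developer is the one who fixed it, and the graph stores, for each developer who tossed a bug, the fraction of those bugs eventually fixed by each other developer. The function name `build_tossing_graph`, the path format, and the sample developer names are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def build_tossing_graph(tossing_paths):
    """Estimate toss probabilities from historical tossing paths.

    Assumption: each path is a list of developer names in assignment
    order, and the last entry is the developer who fixed the bug.
    """
    counts = defaultdict(Counter)
    for path in tossing_paths:
        if len(path) < 2:
            continue  # fixed by the first assignee, so nothing was tossed
        fixer = path[-1]
        for tosser in path[:-1]:
            if tosser != fixer:
                counts[tosser][fixer] += 1

    # Normalize the counts of each tosser into probabilities.
    graph = {}
    for tosser, fixers in counts.items():
        total = sum(fixers.values())
        graph[tosser] = {fixer: n / total for fixer, n in fixers.items()}
    return graph


# Hypothetical history of four bugs.
paths = [
    ["alice", "bob", "carol"],
    ["alice", "carol"],
    ["bob", "dave"],
    ["alice", "dave"],
]
graph = build_tossing_graph(paths)
```

Under these assumptions, a classifier's initial prediction could be refined by following the highest-probability edge out of the predicted developer; how the paper actually combines the two is not stated in the abstract.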