Selecting discriminating terms for bug assignment: a formal analysis

Authors:
Ibrahim Aljarah;Shadi Banitaan;Sameer Abufardeh;Wei Jin;Saeed Salem
Affiliations:
North Dakota State University, Fargo, ND;North Dakota State University, Fargo, ND;North Dakota State University, Fargo, ND;North Dakota State University, Fargo, ND;North Dakota State University, Fargo, ND
Venue:
Proceedings of the 7th International Conference on Predictive Models in Software Engineering
Year:
2011

Abstract

Background. The bug assignment problem is the problem of triaging new bug reports to the most qualified developer, that is, the developer who has enough knowledge of the specific area relevant to the reported bug. In recent years, bug triaging has received considerable attention from researchers. In previous work, bugs were represented as vectors of terms extracted from the bug reports' descriptions. Once the bugs are represented as vectors in the term space, traditional machine learning techniques are employed for bug assignment. Most previous algorithms suffer from low accuracy.

Aims. This paper formulates the bug assignment problem as a classification task and then examines the impact of several term selection approaches on classification effectiveness.

Method. Three variants of term selection based on the Log Odds Ratio (LOR) score are compared against methods based on the Information Gain (IG) score and on Latent Semantic Analysis (LSA). The LOR-based variants differ mainly in the process used to select the terms.

Results. Term selection techniques based on the Log Odds Ratio achieved up to a 30% improvement in precision and up to a 5% improvement in recall compared to other term selection methods such as Latent Semantic Analysis and Information Gain.

Conclusions. Experimental results showed that the effectiveness of bug assignment methods is directly affected by the terms selected for use in the classification methods.
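To make the idea of LOR-based term selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used form of the log odds ratio, log[p(1 - q) / ((1 - p)q)], where p and q are the smoothed fractions of bug reports containing a term among reports assigned to a given developer and among all other reports. The abstract does not specify the three selection variants, so the simple top-k rule shown here is only an assumption, as are the function names, the smoothing constant, and the toy data.

```python
import math
from collections import Counter


def log_odds_ratio_scores(docs, labels, target, smoothing=0.5):
    """Score every term by its log odds ratio for the class `target`.

    docs      : list of token lists, one per bug report (hypothetical input format)
    labels    : list of developer names aligned with `docs`
    smoothing : additive constant that keeps p and q strictly inside (0, 1)
    """
    pos = Counter()  # document frequency among reports assigned to `target`
    neg = Counter()  # document frequency among all other reports
    n_pos = n_neg = 0
    for tokens, label in zip(docs, labels):
        if label == target:
            n_pos += 1
            pos.update(set(tokens))
        else:
            n_neg += 1
            neg.update(set(tokens))

    scores = {}
    for term in set(pos) | set(neg):
        p = (pos[term] + smoothing) / (n_pos + 2 * smoothing)
        q = (neg[term] + smoothing) / (n_neg + 2 * smoothing)
        scores[term] = math.log(p * (1 - q) / ((1 - p) * q))
    return scores


def top_terms(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy bug reports, invented purely for illustration.
docs = [
    ["crash", "ui"],
    ["crash", "render"],
    ["network", "timeout"],
    ["network", "ui"],
]
labels = ["alice", "alice", "bob", "bob"]

scores = log_odds_ratio_scores(docs, labels, target="alice")
selected = top_terms(scores, k=2)
```

Terms that occur mostly in one developer's reports receive large positive scores for that developer, while terms spread evenly across developers (such as "ui" in the toy data) score near zero and are dropped before classification.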