Impact analysis of SCRs using single and multi-label machine learning classification

Authors:
Syed Nadeem Ahsan; Franz Wotawa
Affiliations:
Graz University of Technology, Austria; Graz University of Technology, Austria
Venue:
Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
Year:
2010


Abstract

For resolved software change requests (SCRs), the names of the impacted source files are known. In this paper, we address the question of whether this information can be used to predict the files that have to be changed when a new SCR is received. We present two approaches based on automatic text classification of SCRs. First, we use Latent Semantic Indexing (LSI) to index the key terms of SCRs. Then, for classification, we use two machine learning approaches: single-label and multi-label classification. We applied our approaches to the SCR data of the Gnome, Mozilla, and Eclipse OSS projects. Our initial experimental results are promising: the maximum precision values obtained for single-label and multi-label classification are 58.2% and 47.1%, respectively. Furthermore, the maximum precision values attained for any individual label are 86.5% and 92% for single-label and multi-label classification, respectively.
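The difference between the two classification settings can be illustrated with a minimal sketch. It is not the authors' implementation: the SCR texts, the file names, and the helper functions are hypothetical, the LSI indexing step is omitted, and both the way an SCR with several impacted files is turned into single-label examples and the set-based definition of precision are assumptions, since the abstract does not specify them.

```python
from typing import List, Set, Tuple

# Hypothetical resolved SCRs: summary text paired with the files changed to resolve it.
RESOLVED_SCRS: List[Tuple[str, Set[str]]] = [
    ("crash when opening bookmarks menu", {"bookmarks.c", "menu.c"}),
    ("bookmarks toolbar not refreshed", {"bookmarks.c"}),
    ("printing dialog ignores page range", {"print.c", "dialog.c"}),
]

# The label space: every source file that appears in at least one resolved SCR.
ALL_FILES: List[str] = sorted(set().union(*(files for _, files in RESOLVED_SCRS)))


def single_label_examples(scrs: List[Tuple[str, Set[str]]]) -> List[Tuple[str, str]]:
    """Assumed single-label framing: one (text, file) pair per impacted file,
    so every training example carries exactly one label."""
    return [(text, name) for text, files in scrs for name in sorted(files)]


def multi_label_examples(
    scrs: List[Tuple[str, Set[str]]], all_files: List[str]
) -> List[Tuple[str, List[int]]]:
    """Multi-label framing: one (text, indicator vector) pair per SCR,
    with a 1 for every file that the SCR impacted."""
    return [
        (text, [1 if name in files else 0 for name in all_files])
        for text, files in scrs
    ]


def precision(predicted: Set[str], actual: Set[str]) -> float:
    """Assumed metric: fraction of predicted files that were actually changed.
    Returns 0.0 when nothing was predicted."""
    if not predicted:
        return 0.0
    return len(predicted & actual) / len(predicted)
```

In this sketch, a classifier trained on the single-label pairs predicts one file per SCR, while one trained on the indicator vectors can predict a set of files, which `precision` then compares against the files that were really changed.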