Impact analysis of SCRs using single and multi-label machine learning classification

  • Authors: Syed Nadeem Ahsan; Franz Wotawa
  • Affiliations: Graz University of Technology, Austria; Graz University of Technology, Austria
  • Venue: Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
  • Year: 2010

Abstract

For resolved software change requests (SCRs), the names of the impacted source files are known. In this paper, we address the question of whether this information can be used to predict the files that have to be changed when a new SCR is received. We present two approaches, both based on automatic text classification of SCRs. First, we use Latent Semantic Indexing (LSI) to index the key terms of SCRs. Then, for classification, we use two machine learning approaches: single-label and multi-label classification. We applied our approaches to the SCR data of the Gnome, Mozilla, and Eclipse OSS projects. Our initial experimental results are promising: the maximum precision values obtained for single-label and multi-label classification are 58.2% and 47.1%, respectively. Furthermore, the maximum precision values attained for any individual label are 86.5% and 92% for single-label and multi-label classification, respectively.
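
To make the difference between the two classification settings concrete, the following minimal sketch predicts impacted files for a new SCR from a handful of resolved ones. It is not the authors' implementation: it swaps LSI for plain bag-of-words cosine similarity, and the SCR texts, file names, the `threshold` parameter, the per-prediction precision definition, and all function names are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical resolved SCRs: (summary text, set of files changed to resolve it).
RESOLVED = [
    ("crash when opening bookmark menu", {"bookmarks.c", "menu.c"}),
    ("bookmark import loses folder names", {"bookmarks.c", "import.c"}),
    ("toolbar icons render blurry on resize", {"toolbar.c", "render.c"}),
    ("menu shortcut keys ignored after resize", {"menu.c", "toolbar.c"}),
]


def vectorize(text):
    """Bag-of-words term counts (a simple stand-in for an LSI-reduced vector)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def file_scores(new_scr):
    """Score each file by summing the similarity of every resolved SCR that touched it."""
    query = vectorize(new_scr)
    scores = Counter()
    for text, files in RESOLVED:
        sim = cosine(query, vectorize(text))
        for name in files:
            scores[name] += sim
    return scores


def predict_single(new_scr):
    """Single-label setting: return exactly one file, the highest-scoring one."""
    scores = file_scores(new_scr)
    # Iterate in sorted order so ties are broken deterministically by file name.
    return max(sorted(scores), key=scores.get)


def predict_multi(new_scr, threshold=0.5):
    """Multi-label setting: return every file whose score reaches the threshold."""
    scores = file_scores(new_scr)
    return {name for name, score in scores.items() if score >= threshold}


def precision(predicted, actual):
    """Fraction of predicted files that were actually changed."""
    predicted = set(predicted)
    return len(predicted & set(actual)) / len(predicted) if predicted else 0.0
```

In this sketch the single-label predictor commits to one file per SCR, while the multi-label predictor may return several, so its precision depends on how many of the returned files were really changed. A call such as `predict_multi("bookmark menu crash on import")` returns a set of file names that can be compared against the files actually modified using `precision`.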