A machine learning approach for text categorization of fixing-issue commits on CVS

  • Authors:
  • Alessandro Murgia; Giulio Concas; Michele Marchesi; Roberto Tonelli

  • Affiliations:
  • University of Cagliari, Italy; University of Cagliari, Italy; University of Cagliari, Italy; University of Cagliari, Italy

  • Venue:
  • Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
  • Year:
  • 2010

Abstract

We studied data mining of the CVS repositories of two large object-oriented projects, Eclipse and Netbeans, focusing on "fixing-issue" commits. We highlight common characteristics of issue reporting and the problems involved in identifying these messages, and we compare traditional static approaches, such as Knowledge Engineering, with dynamic approaches based on Machine Learning (ML) techniques. We compare, for the first time, the performance of ML techniques in automatically classifying "fixing-issue" commits among commit messages. Our study calculates the precision and recall of different ML classifiers for the correct classification of issue-reporting commits. Our results show that some ML classifiers can correctly classify up to 99.9% of such commits.
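
To make the evaluation measures concrete, the following is a minimal sketch of how precision and recall could be computed for a static, keyword-based rule of the kind a Knowledge Engineering approach might use. It is not the authors' method: the pattern `FIX_PATTERN`, the helper names, and the six toy commit messages are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword rule (an assumption, not taken from the paper):
# flag a message if it mentions fix/fixes/fixed, bug, issue, or a "#123" id.
FIX_PATTERN = re.compile(r"\b(fix(es|ed)?|bug|issue)\b|#\d+", re.IGNORECASE)


def is_fixing_commit(message: str) -> bool:
    """Static rule: True if the message matches the keyword pattern."""
    return FIX_PATTERN.search(message) is not None


def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)    # flagged, but not a fix
    fn = sum(1 for p, a in pairs if a and not p)    # a fix that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy, hand-labelled messages (invented for this sketch).
COMMITS = [
    ("Fixed bug 12345: NPE in parser", True),
    ("Add new wizard page", False),
    ("issue #678 resolved", True),
    ("Update copyright headers", False),
    ("Corrected wrong offset computation", True),   # a fix with no keyword
    ("Refactor bug report dialog", False),          # keyword, but not a fix
]

if __name__ == "__main__":
    predicted = [is_fixing_commit(msg) for msg, _ in COMMITS]
    actual = [label for _, label in COMMITS]
    p, r = precision_recall(predicted, actual)
    print(f"precision={p:.2f} recall={r:.2f}")
```

The last two toy messages show why a fixed keyword list can both miss real fixes and flag unrelated commits, which is the kind of limitation that motivates comparing it against learned classifiers.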