An Empirical Comparison of Machine Learning Techniques in Predicting the Bug Severity of Open and Closed Source Projects

  • Authors:
  • K. K. Chaturvedi; V. B. Singh

  • Affiliations:
  • Indian Agricultural Statistics Research Institute, New Delhi, Delhi, India; Delhi College of Arts & Commerce, University of Delhi, New Delhi, Delhi, India

  • Venue:
  • International Journal of Open Source Software and Processes
  • Year:
  • 2012

Abstract

Bug severity is the degree of impact that a defect has on the development or operation of a component or system, and it can be classified into different levels based on that impact. Identifying the severity level helps a bug triager allocate the bug to the appropriate bug fixer. Various researchers have applied text mining techniques to predict the severity of bugs, detect duplicate bug reports, and assign bugs to suitable fixers. This paper compares the performance of several machine learning techniques, namely Support Vector Machine (SVM), probability-based Naïve Bayes (NB), decision-tree-based J48 (a Java implementation of C4.5), rule-based Repeated Incremental Pruning to Produce Error Reduction (RIPPER), and Random Forests (RF) learners, in predicting the severity level (1 to 5) of a reported bug by analyzing the summary, or short description, of the bug report. The bug report data were taken from NASA's PITS (Projects and Issue Tracking System) datasets as closed source projects and from components of the Eclipse, Mozilla, and GNOME datasets as open source projects. The analysis was carried out in the RapidMiner and STATISTICA data mining tools. The authors measured the performance of the machine learning techniques by considering (i) the values of accuracy and F-Measure for all severity levels and (ii) the number of best cases at different threshold levels of accuracy and F-Measure.
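
As an illustration of the two evaluation measures named in the abstract, the following minimal Python sketch computes overall accuracy and a per-level F-Measure from actual and predicted severity labels. It is not the authors' RapidMiner or STATISTICA workflow. The function names `per_level_scores` and `count_at_threshold`, the sample labels, and the 0.6 threshold are all hypothetical, and counting the levels whose score meets a threshold is only one possible reading of "number of best cases".

```python
from collections import Counter


def per_level_scores(actual, predicted, levels=(1, 2, 3, 4, 5)):
    """Return (accuracy, {level: F-Measure}) for severity predictions."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for a, p in zip(actual, predicted):
        if a == p:
            tp[a] += 1  # correct prediction for level a
        else:
            fn[a] += 1  # level a was missed
            fp[p] += 1  # level p was predicted wrongly
    accuracy = sum(tp.values()) / len(actual)
    f_measure = {}
    for level in levels:
        # F-Measure = 2*TP / (2*TP + FP + FN), the harmonic mean of
        # precision and recall; defined as 0.0 when the level never occurs.
        denom = 2 * tp[level] + fp[level] + fn[level]
        f_measure[level] = 2 * tp[level] / denom if denom else 0.0
    return accuracy, f_measure


def count_at_threshold(scores, threshold):
    """Count scores at or above a threshold (assumed reading of 'best cases')."""
    return sum(1 for s in scores if s >= threshold)


# Hypothetical labels, made up for illustration only (not data from the paper).
actual = [1, 2, 2, 3, 3, 3, 4, 5]
predicted = [1, 2, 3, 3, 3, 2, 4, 4]
accuracy, f_measure = per_level_scores(actual, predicted)
best_cases = count_at_threshold(f_measure.values(), 0.6)
print(accuracy, f_measure, best_cases)
```

Running the same function on the predictions of each learner would give one accuracy value and five F-Measure values per learner, which could then be compared across learners and datasets.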