Prediction of Developer Participation in Issues of Open Source Projects

  • Authors:
  • Andre Luis Schwerz;Rafael Liberato;Igor Scaliante Wiese;Igor Steinmacher;Marco Aurelio Gerosa;Joao Eduardo Ferreira

  • Venue:
  • SBSC '12 Proceedings of the 2012 Brazilian Symposium on Collaborative Systems
  • Year:
  • 2012

Abstract

Developers of distributed open source projects communicate through project management and issue-tracking tools. These tools produce a large volume of unstructured information, which makes issue triage difficult and increases developers' overhead. This problem is common to online communities based on volunteer participation. This paper shows that the content of comments in an open source project is important for building a classifier that predicts whether a developer will participate in an issue. To design this prediction model, we used two machine learning algorithms, Naive Bayes and J48. We evaluated the algorithms on data from three Apache Hadoop subprojects. Applying our approach to the most active developers of these subprojects, we achieved an accuracy ranging from 79% to 96%. The results indicate that the content of comments in issues of open source projects is a relevant factor in building a classifier of issues for developers.
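
As a rough illustration of the idea in the abstract, the sketch below trains a small multinomial Naive Bayes classifier on comment text to predict whether one developer participates in an issue. It is not the authors' implementation. The tokenizer, the Laplace smoothing, the class name CommentNaiveBayes, and the four toy comments with their labels are all assumptions made for this example. J48 (the name Weka uses for its C4.5 decision tree) is not shown.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract does not
    # describe the preprocessing that was actually used.
    return text.lower().split()


class CommentNaiveBayes:
    """Toy multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # issues seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: comment text from issues, labeled with whether
# one particular developer took part in the issue.
comments = [
    "namenode fails to start after hdfs upgrade",
    "hdfs block replication is slow on datanode",
    "update build script for maven release",
    "fix typo in documentation of mapreduce job",
]
participated = [True, True, False, False]

model = CommentNaiveBayes().fit(comments, participated)
print(model.predict("datanode crash during hdfs replication"))
print(model.predict("update maven build documentation"))
```

In this framing, one such binary classifier would be trained per developer, which matches the abstract's description of applying the approach to the most active developers of each subproject.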