Predicting OSS Development Success: A Data Mining Approach

  • Authors: Uzma Raja; Marietta J. Tretter
  • Affiliations: University of Alabama, USA; Texas A&M University, USA
  • Venue: International Journal of Information System Modeling and Design
  • Year: 2011

Abstract

Open Source Software (OSS) has reached new levels of sophistication and acceptance by users and commercial software vendors. This research creates, tests, and validates a model for predicting successful development of OSS projects. Widely available archival data on OSS projects from Sourceforge.net was used. The data is analyzed with multiple Data Mining techniques. Initially, three competing models are created using Logistic Regression, Decision Trees, and Neural Networks. These models are compared for precision and are refined in several phases. Text Mining is used to create new variables that improve the predictive power of the models. The final model is chosen based on best fit to separate training and validation data sets and on its ability to explain the relationships among variables. Model robustness is determined by testing it on a new dataset extracted from the Sourceforge (SF) repository. The results indicate that end-user involvement, project age, functionality, usage, project management techniques, project type, and team communication methods have a significant impact on the development of OSS projects.
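
The comparison step described in the abstract (scoring competing models for precision on held-out validation data and keeping the best one) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two features (project age in months, download count), the toy validation rows, and the two rule-based stand-in classifiers are hypothetical and are not the Logistic Regression, Decision Tree, or Neural Network models or the data used in the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

Row = Tuple[int, int]  # hypothetical features: (age_months, downloads)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Precision = true positives / all predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0


def pick_best(
    models: Dict[str, Callable[[Row], int]],
    x_val: Sequence[Row],
    y_val: Sequence[int],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on the validation set; return the best name and all scores."""
    scores = {
        name: precision(y_val, [predict(x) for x in x_val])
        for name, predict in models.items()
    }
    best_name = max(scores, key=lambda name: scores[name])
    return best_name, scores


# Toy validation data (made up): label 1 = successful project, 0 = not.
x_val = [(36, 500), (4, 20), (24, 15), (2, 900), (48, 1200), (6, 5)]
y_val = [1, 0, 0, 1, 1, 0]

# Hypothetical stand-ins for fitted models; each maps a row to a 0/1 prediction.
candidates = {
    "age_rule": lambda x: 1 if x[0] >= 12 else 0,
    "usage_rule": lambda x: 1 if x[1] >= 100 else 0,
}

best, scores = pick_best(candidates, x_val, y_val)
print(best, scores)
```

In practice each candidate would be a model fitted on the training partition, and precision alone would not settle the choice: the abstract also weighs how well the model explains the relationships among variables.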