Systematic literature review of machine learning based software development effort estimation models

  • Authors:
  • Jianfeng Wen; Shixian Li; Zhiyong Lin; Yong Hu; Changqin Huang

  • Affiliations:
  • Department of Computer Science, Sun Yat-sen University, Guangzhou, China; Department of Computer Science, Sun Yat-sen University, Guangzhou, China; Department of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China; Institute of Business Intelligence and Knowledge Discovery, Department of E-commerce, Guangdong University of Foreign Studies, Sun Yat-sen University, Guangzhou, China; Engineering Research Center of Computer Network and Information Systems, South China Normal University, Guangzhou, China

  • Venue:
  • Information and Software Technology
  • Year:
  • 2012

Abstract

Context: Software development effort estimation (SDEE) is the process of predicting the effort required to develop a software system. To improve estimation accuracy, many researchers have proposed machine learning (ML) based SDEE models (ML models) since the 1990s. However, there has been no attempt to analyze the empirical evidence on ML models systematically. Objective: This research aims to systematically analyze ML models from four aspects: type of ML technique, estimation accuracy, model comparison, and estimation context. Method: We performed a systematic literature review of empirical studies on ML models published in the last two decades (1991-2010). Results: We identified 84 primary studies relevant to the objective of this research. After investigating these studies, we found that eight types of ML techniques have been employed in SDEE models. Overall, the estimation accuracy of these ML models is close to the acceptable level and is better than that of non-ML models. Furthermore, different ML models have different strengths and weaknesses and thus favor different estimation contexts. Conclusion: ML models are promising in the field of SDEE. However, their application in industry is still limited, so more effort and incentives are needed to facilitate their adoption. To this end, based on the findings of this review, we provide recommendations for researchers as well as guidelines for practitioners.