Systematic literature review of machine learning based software development effort estimation models

Authors:
Jianfeng Wen; Shixian Li; Zhiyong Lin; Yong Hu; Changqin Huang
Affiliations:
Department of Computer Science, Sun Yat-sen University, Guangzhou, China; Department of Computer Science, Sun Yat-sen University, Guangzhou, China; Department of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China; Institute of Business Intelligence and Knowledge Discovery, Department of E-commerce, Guangdong University of Foreign Studies, Sun Yat-sen University, Guangzhou, China; Engineering Research Center of Computer Network and Information Systems, South China Normal University, Guangzhou, China
Venue:
Information and Software Technology
Year:
2012

Abstract

Context: Software development effort estimation (SDEE) is the process of predicting the effort required to develop a software system. To improve estimation accuracy, many researchers have proposed machine learning (ML) based SDEE models (ML models) since the 1990s. However, there has been no attempt to analyze the empirical evidence on ML models in a systematic way.
Objective: This research aims to systematically analyze ML models from four aspects: type of ML technique, estimation accuracy, model comparison, and estimation context.
Method: We performed a systematic literature review of empirical studies on ML models published in the last two decades (1991-2010).
Results: We identified 84 primary studies relevant to the objective of this research. After investigating these studies, we found that eight types of ML techniques have been employed in SDEE models. Overall, the estimation accuracy of these ML models is close to the acceptable level and is better than that of non-ML models. Furthermore, different ML models have different strengths and weaknesses and thus favor different estimation contexts.
Conclusion: ML models are promising in the field of SDEE. However, their application in industry is still limited, so more effort and incentives are needed to facilitate it. To this end, based on the findings of this review, we provide recommendations for researchers as well as guidelines for practitioners.