Background: Software effort estimation (SEE) is a task of strategic importance in software management. Recently, some studies have attempted to use ensembles of learning machines for this task.

Aims: We aim to (1) evaluate whether readily available ensemble methods generally improve the SEE given by single learning machines, and which of them are most useful; (2) gain insight into how to improve SEE; and (3) gain insight into how to choose machine learning (ML) models for SEE.

Method: A principled and comprehensive statistical comparison of three ensemble methods and three single learners was carried out using thirteen data sets. Feature selection and ensemble diversity analyses were performed to gain insight into how to improve SEE based on the approaches singled out. In addition, a risk analysis was performed to investigate robustness to outliers. The understanding and insight provided by the paper are therefore based on principled experiments rather than intuition or speculation.

Results: None of the compared methods is consistently the best, although regression trees and bagging with multilayer perceptrons (MLPs) are most frequently among the best, and these two approaches usually perform similarly. Regression trees place more important features at higher levels of the trees, suggesting that feature weighting matters when using ML models for SEE. The analysis of bagging with MLPs suggests that a self-tuning ensemble diversity method may help improve SEE.

Conclusions: Ideally, principled experiments should be carried out on an individual basis to choose a model. If an organisation lacks the resources for that, regression trees appear to be a good choice because of their simplicity. The analysis also suggests approaches to improve SEE.
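The kind of comparison described above can be sketched, purely as an illustration, with scikit-learn. This is a minimal sketch on synthetic project data, not the paper's actual data sets, tuning, or statistical testing procedure; the feature names and coefficients below are invented for the example.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.RandomState(0)

# Toy project features (hypothetical): size, team experience, complexity.
X = rng.rand(200, 3)
y = 10 * X[:, 0] + 3 * X[:, 1] + 5 * X[:, 2] + rng.normal(0, 0.5, 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Single learner: a regression tree.
tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_tr, y_tr)

# Ensemble: bagging with MLPs as base learners.
bag = BaggingRegressor(
    MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
    n_estimators=10,
    random_state=0,
).fit(X_tr, y_tr)

for name, model in [("regression tree", tree), ("bagged MLPs", bag)]:
    mae = mean_absolute_error(y_te, model.predict(X_te))
    print(f"{name}: MAE = {mae:.3f}")
```

As in the paper's findings, which of the two wins can vary with the data; a real study would repeat this over many data sets with proper evaluation criteria and statistical tests.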