Background: Many cost estimation papers follow the same pattern: propose a "new" estimation method, try it on one or two historical datasets, and "prove" that the new method outperforms linear regression.

Aim: This paper explains why this approach to model comparison is often invalid, and suggests that the PROMISE repository may be making things worse.

Method: We identify some of the theoretical problems with studies that compare different estimation models. We review some of the commonly used datasets from the viewpoint of the reliability of the data and the validity of the proposed linear regression models.

Discussion points: It is invalid to select one or two datasets to "prove" the validity of a new technique, because we cannot be sure that, of the many published datasets, those chosen are not the only ones that favour the new technique. When new models are compared with regression models, researchers need to understand how to apply regression analysis appropriately. The use of linear regression presupposes: a linear relationship between the dependent and independent variables; no significant outliers; no significant skewness; and no relationship between the variance of the dependent variable and the magnitude of the variable. If any of these conditions does not hold, standard statistical practice is to use robust regression or to transform the data. The logarithmic transformation is appropriate in many cases, and for the Desharnais dataset it gives better results than the regression model presented in the PROMISE repository.

Conclusions: Simplistic studies comparing data-intensive methods with linear regression are scientifically valueless if the regression techniques are applied incorrectly. They are also suspect if only a small number of datasets are used and the selection of those datasets is not scientifically justified.