Software Cost Estimation with Incomplete Data

Authors:
Kevin Strike; Khaled El Emam; Nazim Madhavji
Venue:
IEEE Transactions on Software Engineering
Year:
2001

Abstract

The construction of software cost estimation models remains an active topic of research. The basic premise of cost modeling is that a historical database of software project cost data can be used to develop a quantitative model to predict the cost of future projects. One of the difficulties faced by workers in this area is that many of these historical databases contain substantial amounts of missing data. Thus far, the common practice has been to ignore observations with missing data. In principle, such a practice can lead to gross biases and may be detrimental to the accuracy of cost estimation models. In this paper, we describe an extensive simulation where we evaluate different techniques for dealing with missing data in the context of software cost modeling. Three techniques are evaluated: listwise deletion, mean imputation, and eight different types of hot-deck imputation. Our results indicate that all the missing data techniques perform well with small biases and high precision. This suggests that the simplest technique, listwise deletion, is a reasonable choice. However, this will not necessarily provide the best performance. Consistent best performance (minimal bias and highest precision) can be obtained by using hot-deck imputation with Euclidean distance and a z-score standardization.
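The three families of techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation and covers only one of the eight hot-deck variants they evaluate. Several details are assumptions made for illustration: missing values are marked with `None`, donors are restricted to complete rows, the z-scores use the population standard deviation of the observed values, and the function names and the `projects` sample data (size, team size, effort) are hypothetical.

```python
import math
from statistics import mean, pstdev


def listwise_deletion(rows):
    """Keep only the rows that have no missing (None) values."""
    return [list(r) for r in rows if all(v is not None for v in r)]


def mean_imputation(rows):
    """Replace each missing value with the mean of its column's observed values."""
    n_cols = len(rows[0])
    col_means = [
        mean([r[j] for r in rows if r[j] is not None]) for j in range(n_cols)
    ]
    return [
        [col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows
    ]


def hot_deck_imputation(rows):
    """Fill missing values from the nearest complete row (the donor).

    Distance is Euclidean over z-score standardized columns, computed only
    on the columns that are observed in the incomplete row.
    """
    donors = listwise_deletion(rows)
    if not donors:
        raise ValueError("hot-deck imputation needs at least one complete row")

    # Per-column mean and standard deviation from the observed values.
    n_cols = len(rows[0])
    stats = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        sd = pstdev(observed)
        stats.append((mean(observed), sd if sd > 0 else 1.0))

    def z(value, j):
        col_mean, col_sd = stats[j]
        return (value - col_mean) / col_sd

    def distance(row, donor, cols):
        return math.sqrt(sum((z(row[j], j) - z(donor[j], j)) ** 2 for j in cols))

    result = []
    for r in rows:
        missing = [j for j, v in enumerate(r) if v is None]
        if not missing:
            result.append(list(r))
            continue
        observed_cols = [j for j, v in enumerate(r) if v is not None]
        donor = min(donors, key=lambda d: distance(r, d, observed_cols))
        result.append([donor[j] if v is None else v for j, v in enumerate(r)])
    return result


# Hypothetical project records: [size, team size, effort]
projects = [
    [100.0, 5.0, 400.0],
    [120.0, 6.0, None],
    [300.0, 12.0, 1500.0],
    [None, 11.0, 1400.0],
]

complete_only = listwise_deletion(projects)
mean_filled = mean_imputation(projects)
hot_deck_filled = hot_deck_imputation(projects)
```

In this sketch, listwise deletion discards half of the sample, mean imputation fills every gap in a column with the same value, and hot-deck imputation borrows the value from the most similar complete project. Standardizing before computing distances keeps a large-valued column such as effort from dominating the choice of donor.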