Assessing the Benefits of Imputing ERP Projects with Missing Data

Authors:
Ingunn Myrtveit;Ulf Olsson;Erik Stensrud
Venue:
METRICS '01 Proceedings of the 7th International Symposium on Software Metrics
Year:
2001




Abstract

Incomplete, or missing, data is likely to be encountered in empirical software engineering data sets. In this paper we evaluate several methods for handling missing data. The methods are presented and discussed in general and then applied to effort estimation of ERP projects. We found that two sampling-based methods, mean imputation (MI) and similar response pattern imputation (SRPI), waste less information than listwise deletion (LD). However, MI may introduce more bias than SRPI. Compared with sampling-based methods, likelihood-based imputation methods require data sets that are too large to be realistic in empirical software engineering. None of the sampling-based methods, such as MI and SRPI, seems able to correct bias. So, although imputation is an attractive idea, the available methods still have severe limitations.
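
To make the three sampling-based strategies named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The project data (users, sites, effort) are invented for illustration, and the function names are hypothetical. The SRPI variant shown is a simplification: it copies the missing value from the complete case that is closest, by squared distance, on the variables the incomplete case does have, without the standardization or matching-variable selection a full implementation would likely use.

```python
from statistics import mean

# Hypothetical ERP project records: [users, sites, effort]; None marks a missing value.
rows = [
    [100, 2, 400],
    [120, 3, 500],
    [300, 8, 1500],
    [115, None, 480],
    [None, 7, 1400],
]

def listwise_deletion(rows):
    """LD: keep only the cases with no missing values."""
    return [list(row) for row in rows if None not in row]

def mean_imputation(rows):
    """MI: replace each missing value with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = [
        mean(row[j] for row in rows if row[j] is not None)
        for j in range(n_cols)
    ]
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]

def similar_response_pattern_imputation(rows):
    """Simplified SRPI (assumption: unstandardized squared distance).

    Each incomplete case borrows its missing values from the complete
    case that is most similar on the variables both have observed.
    """
    donors = listwise_deletion(rows)
    result = []
    for row in rows:
        observed = [j for j, value in enumerate(row) if value is not None]
        if len(observed) == len(row):
            result.append(list(row))
            continue
        donor = min(
            donors,
            key=lambda d: sum((d[j] - row[j]) ** 2 for j in observed),
        )
        result.append(
            [donor[j] if value is None else value for j, value in enumerate(row)]
        )
    return result

if __name__ == "__main__":
    print("LD keeps", len(listwise_deletion(rows)), "of", len(rows), "cases")
    print("MI:  ", mean_imputation(rows))
    print("SRPI:", similar_response_pattern_imputation(rows))
```

In this toy data set LD discards two of the five cases, whereas MI and SRPI retain all five. MI fills every gap in a column with the same value, which reduces that column's variance. SRPI instead borrows a value from a similar observed case.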