Missing values in the software measurement data used for empirical analysis have prompted numerous proposed solutions. Imputation procedures, for example, fill in the missing values with plausible alternatives. We present a comprehensive study of imputation techniques using real-world software measurement datasets. The study uses two datasets with dramatically different properties, into which missing values are injected according to three missingness mechanisms: missing completely at random (MCAR), missing at random (MAR), and non-ignorable (NI). We consider missing values occurring in multiple attributes and compare three procedures: Bayesian multiple imputation, k-nearest-neighbor imputation, and mean imputation. We also examine the relationship between noise in the dataset and the performance of the imputation techniques, which has not been addressed previously. Our experiments demonstrate that Bayesian multiple imputation is an extremely effective imputation technique.
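To make the experimental setup concrete, the following is a minimal Python sketch of one injection-and-imputation round, not the procedure used in the study. It assumes a synthetic two-attribute dataset, covers only the MCAR mechanism, and implements only mean and k-nearest-neighbor imputation; Bayesian multiple imputation is not shown. All function names (`inject_mcar`, `mean_impute`, `knn_impute`, `mean_abs_error`), the distance measure, and the error measure are illustrative assumptions.

```python
import math
import random

def inject_mcar(rows, rate, rng):
    """Copy rows, replacing each value by None with probability `rate` (MCAR)."""
    return [[None if rng.random() < rate else v for v in row] for row in rows]

def mean_impute(rows):
    """Replace each missing value with the mean of its attribute's observed values."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption: fall back to 0.0 if an attribute has no observed values.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_impute(rows, k=3):
    """Replace each missing value with the average over the k nearest donors.

    Distance uses only the attributes observed in both rows (an assumption of
    this sketch); rows with no usable donor fall back to mean imputation.
    """
    fallback = mean_impute(rows)

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    result = []
    for i, row in enumerate(rows):
        filled = list(row)
        for j, v in enumerate(row):
            if v is not None:
                continue
            # Candidate donors: other rows that have attribute j observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            donors = [(d, val) for d, val in donors if d != math.inf][:k]
            if donors:
                filled[j] = sum(val for _, val in donors) / len(donors)
            else:
                filled[j] = fallback[i][j]
        result.append(filled)
    return result

def mean_abs_error(truth, masked, imputed):
    """Mean absolute error over the cells that were masked."""
    errors = [
        abs(t - p)
        for t_row, m_row, p_row in zip(truth, masked, imputed)
        for t, m, p in zip(t_row, m_row, p_row)
        if m is None
    ]
    return sum(errors) / len(errors) if errors else 0.0

# Synthetic data (assumption): two strongly correlated attributes.
rng = random.Random(0)
complete = [[x, 2.0 * x + rng.gauss(0, 0.5)]
            for x in (rng.uniform(0, 10) for _ in range(200))]
masked = inject_mcar(complete, 0.1, rng)

print("mean imputation MAE:", mean_abs_error(complete, masked, mean_impute(masked)))
print("kNN imputation MAE: ", mean_abs_error(complete, masked, knn_impute(masked, k=3)))
```

Because the complete data are known before injection, the imputed values can be scored directly against the originals. A MAR or NI variant would make the masking probability depend on other attributes or on the masked value itself, rather than on a constant rate.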