k nearest neighbor imputation (kNNI) is one of the most popular methods in empirical software engineering for imputing missing values. kNNI typically uses only complete cases as possible donors for imputation (called complete case kNNI or CCkNNI). Though it often produces reasonable results, CCkNNI is severely limited when the amount of missing data is large (and hence the number of complete cases is small). In response, a variant of CCkNNI called incomplete case k nearest neighbor imputation (ICkNNI) has been proposed as an attractive alternative. This work presents a detailed simulation comparing CCkNNI and ICkNNI using two different software measurement datasets. The empirical results show that using incomplete cases often increases the effectiveness of nearest neighbor imputation (especially at higher missingness levels), regardless of the type of missingness (i.e., the distribution of missing values in the data).
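
The difference between the two donor pools can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_impute`, the Euclidean distance over the recipient's observed attributes, the mean of donor values as the imputed value, and the ICkNNI donor rule (a case qualifies if it has values for the attribute being imputed and for every attribute the recipient has observed) are all assumptions made for illustration.

```python
import numpy as np

def knn_impute(X, k=3, use_incomplete=False):
    """Hypothetical helper: fill NaNs with the mean of the k nearest donors.

    use_incomplete=False -> CCkNNI: only complete cases may donate.
    use_incomplete=True  -> ICkNNI (assumed rule): a case may donate if it
    has the target attribute and every attribute the recipient has observed.
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    complete = observed.all(axis=1)
    result = X.copy()

    for i in np.flatnonzero(~complete):          # each recipient case
        shared = observed[i]                     # attributes usable for distance
        for j in np.flatnonzero(~observed[i]):   # each missing attribute
            if use_incomplete:
                donors = observed[:, shared].all(axis=1) & observed[:, j]
            else:
                donors = complete
            idx = np.flatnonzero(donors)
            if idx.size == 0:
                continue                         # no donor available; leave NaN
            diff = X[idx][:, shared] - X[i, shared]
            dist = np.sqrt((diff ** 2).sum(axis=1))
            nearest = idx[np.argsort(dist, kind="stable")[:k]]
            # Donor values come from the original data, never from imputed ones.
            result[i, j] = X[nearest, j].mean()
    return result

# Toy data (made up): two complete cases and two incomplete cases.
data = np.array([
    [1.0, 2.0, 3.0],
    [5.0, 6.0, 7.0],
    [2.0, np.nan, 4.0],
    [2.1, np.nan, np.nan],
])
cc = knn_impute(data, k=1, use_incomplete=False)
ic = knn_impute(data, k=1, use_incomplete=True)
print(cc[3], ic[3])
```

In this toy data, the last case has only its first attribute observed. Under CCkNNI its third attribute is taken from the nearest complete case, whereas under the assumed ICkNNI rule the incomplete third case also qualifies as a donor for that attribute and happens to be closer, so the two variants impute different values.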