GBKII: an imputation method for missing values

Authors:
Chengqi Zhang; Xiaofeng Zhu; Jilian Zhang; Yongsong Qin; Shichao Zhang
Affiliations:
Faculty of Information Technology, University of Technology, Sydney and Department of Information Systems, City University of Hong Kong, China; Department of Computer Science, Guangxi Normal University, China; Department of Computer Science, Guangxi Normal University, China; Department of Computer Science, Guangxi Normal University, China; Department of Computer Science, Guangxi Normal University, China
Venue:
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2007

Abstract

Missing data imputation is a current and challenging issue in machine learning and data mining, because missing values in a dataset can introduce bias that degrades the quality of the learned patterns or the classification performance. To deal with this issue, this paper proposes a Grey-Based K-NN Iteration Imputation method, called GBKII, for imputing missing values. GBKII is an instance-based imputation method, an approach known in statistics as non-parametric regression. It also handles categorical attributes efficiently. We experimentally evaluate our approach and demonstrate that GBKII is much more efficient than the k-NN and mean-substitution methods.
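The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea behind a grey-based k-NN iterative imputer, not the authors' GBKII algorithm. It assumes Deng's grey relational grade as the similarity measure (with the customary distinguishing coefficient rho = 0.5), a column-mean pre-fill, a fixed number of refinement passes, and numeric attributes only on comparable scales. The function names `grey_relational_grades` and `impute` and all parameters are hypothetical.

```python
def grey_relational_grades(ref, candidates, rho=0.5):
    """Deng's grey relational grade of each candidate row with respect to ref."""
    diffs = [[abs(r - c) for r, c in zip(ref, row)] for row in candidates]
    flat = [d for row in diffs for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every candidate equals the reference on every attribute
        return [1.0] * len(candidates)
    # Grade = mean over attributes of the grey relational coefficient.
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in diffs
    ]


def impute(data, k=2, iterations=5, rho=0.5):
    """Return a completed copy of data (rows of numbers, None = missing).

    Assumptions of this sketch: at least two columns, every column has at
    least one observed value, and attributes are already on comparable scales.
    """
    n_cols = len(data[0])
    missing = [(i, j) for i, row in enumerate(data)
               for j, v in enumerate(row) if v is None]

    # Step 1: pre-fill missing cells with the mean of the observed values.
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in data if row[j] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [[col_means[j] if v is None else float(v)
               for j, v in enumerate(row)] for row in data]

    # Step 2: repeatedly re-estimate each missing cell from its k most
    # grey-related rows, using the current estimates of the other cells.
    for _ in range(iterations):
        for i, j in missing:
            others = [r for r in range(len(filled)) if r != i]
            cols = [c for c in range(n_cols) if c != j]
            ref = [filled[i][c] for c in cols]
            cands = [[filled[r][c] for c in cols] for r in others]
            grades = grey_relational_grades(ref, cands, rho)
            top = sorted(range(len(others)),
                         key=lambda t: grades[t], reverse=True)[:k]
            filled[i][j] = sum(filled[others[t]][j] for t in top) / len(top)
    return filled
```

Calling `impute(rows, k=1)` on a small table with one missing cell fills that cell with the value from the single most grey-related row. Categorical attributes, which the abstract says GBKII supports, are outside the scope of this sketch; one plausible extension would replace the neighbour average with a majority vote.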