GBKII: an imputation method for missing values

  • Authors:
  • Chengqi Zhang; Xiaofeng Zhu; Jilian Zhang; Yongsong Qin; Shichao Zhang

  • Affiliations:
  • Faculty of Information Technology, University of Technology, Sydney and Department of Information Systems, City University of Hong Kong, China; Department of Computer Science, Guangxi Normal University, China; Department of Computer Science, Guangxi Normal University, China; Department of Computer Science, Guangxi Normal University, China; Department of Computer Science, Guangxi Normal University, China

  • Venue:
  • PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
  • Year:
  • 2007

Abstract

Missing data imputation is a current and challenging issue in machine learning and data mining, because missing values in a dataset can introduce bias that degrades the quality of the learned patterns or the classification performance. To deal with this issue, this paper proposes a Grey-Based K-NN Iteration Imputation method, called GBKII, for imputing missing values. GBKII is an instance-based imputation method, which corresponds to a non-parametric regression method in statistics. It also handles categorical attributes efficiently. We experimentally evaluate our approach and demonstrate that GBKII is much more efficient than the k-NN and mean-substitution methods.
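
The abstract names the ingredients of GBKII (grey-based similarity, k nearest neighbours, iteration) but does not give the algorithm itself. The following is only a minimal sketch of how such a scheme could look, not the authors' implementation. It assumes numeric attributes on comparable scales, missing cells marked with None, the standard grey relational coefficient with distinguishing coefficient zeta = 0.5, mean substitution as the starting guess, and neighbours drawn only from rows where the target attribute is observed. The function names grey_relational_grades and grey_knn_impute, and all parameters, are hypothetical. Categorical attributes are not covered.

```python
def grey_relational_grades(ref, candidates, zeta=0.5):
    """Grey relational grade of each candidate vector with respect to ref.

    Uses the standard grey relational coefficient
    (d_min + zeta * d_max) / (d + zeta * d_max), averaged over attributes.
    """
    diffs = [[abs(r - c) for r, c in zip(ref, cand)] for cand in candidates]
    flat = [d for row in diffs for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate is identical to ref on the compared attributes.
        return [1.0] * len(candidates)
    return [
        sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in row) / len(row)
        for row in diffs
    ]


def grey_knn_impute(rows, k=2, zeta=0.5, max_iter=10, tol=1e-6):
    """Iteratively fill None cells using grey-relational nearest neighbours.

    Assumption (not from the paper): every column has at least one observed
    value, and the table has at least two columns.
    """
    n, m = len(rows), len(rows[0])
    missing = [(i, j) for i in range(n) for j in range(m) if rows[i][j] is None]

    # Starting guess: replace each missing cell with its column mean.
    filled = [list(r) for r in rows]
    for j in range(m):
        observed = [rows[i][j] for i in range(n) if rows[i][j] is not None]
        mean_j = sum(observed) / len(observed)
        for i in range(n):
            if filled[i][j] is None:
                filled[i][j] = mean_j

    for _ in range(max_iter):
        largest_change = 0.0
        for i, j in missing:
            others = [c for c in range(m) if c != j]
            # Donors: other rows whose value in column j was actually observed.
            donors = [r for r in range(n) if r != i and rows[r][j] is not None]
            ref = [filled[i][c] for c in others]
            cands = [[filled[r][c] for c in others] for r in donors]
            grades = grey_relational_grades(ref, cands, zeta)
            # Keep the k donors with the highest grey relational grade.
            ranked = sorted(zip(grades, donors), key=lambda t: -t[0])[:k]
            new_value = sum(rows[r][j] for _, r in ranked) / len(ranked)
            largest_change = max(largest_change, abs(new_value - filled[i][j]))
            filled[i][j] = new_value
        if largest_change < tol:
            break  # estimates have stopped changing
    return filled


example = [
    [1.0, 2.0],
    [1.1, 2.2],
    [5.0, 10.0],
    [5.1, None],
]
completed = grey_knn_impute(example, k=1)
print(completed[3])  # [5.1, 10.0]
```

In this toy table the incomplete row is closest to the third row on its observed attribute, so with k = 1 it borrows that row's value. Mean substitution alone would have filled in the column mean, which is pulled toward the two small rows.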