POP algorithm: Kernel-based imputation to treat missing values in knowledge discovery from databases

  • Authors:
  • Yongsong Qin;Shichao Zhang;Xiaofeng Zhu;Jilian Zhang;Chengqi Zhang

  • Affiliations:
  • School of Computer Science and Information Technology, Guangxi Normal University, PR China;Institute of Logics, Zhongshan University, PR China and Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia;School of Computer Science and Information Technology, Guangxi Normal University, PR China;School of Computer Science and Information Technology, Guangxi Normal University, PR China;Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 12.05

Abstract

One way to fill in missing values is to exploit correlations among the attributes of the data. The difficulty is that such relations are hard to identify when the data themselves contain missing values. Accordingly, this paper develops a kernel-based method for missing data imputation. The approach aims at optimal inference on three statistical parameters (the mean, the distribution function, and the quantiles) after the missing data are imputed. We refer to this approach as the parameter optimization method (POP algorithm). We evaluate the approach experimentally and demonstrate that the POP algorithm (random regression imputation) is much better than deterministic regression imputation, both in efficiency and in the inference it yields on the above parameters.
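The contrast between deterministic and random regression imputation can be illustrated with a minimal sketch. This is not the authors' POP algorithm, whose details are not given in the abstract. It assumes a single fully observed attribute `x`, a target `y` with missing entries marked as NaN, a Nadaraya-Watson estimator with a Gaussian kernel, and a fixed bandwidth. The function names `nw_predict` and `kernel_impute` and all parameter values are hypothetical.

```python
import numpy as np


def nw_predict(x_obs, y_obs, x_query, bandwidth):
    """Nadaraya-Watson kernel regression estimate with a Gaussian kernel."""
    # Scaled pairwise differences, shape (n_query, n_obs).
    u = (x_query[:, None] - x_obs[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    # Weighted average of observed responses for each query point.
    # Assumes every query point has at least one neighbour with nonzero weight.
    return (w @ y_obs) / w.sum(axis=1)


def kernel_impute(x, y, bandwidth=0.3, random=True, rng=None):
    """Fill NaN entries of y using kernel regression on x.

    random=False: deterministic regression imputation (fitted value only).
    random=True:  random regression imputation (fitted value plus a
                  residual resampled from the observed cases).
    """
    rng = np.random.default_rng() if rng is None else rng
    missing = np.isnan(y)
    x_obs, y_obs = x[~missing], y[~missing]

    fitted = nw_predict(x_obs, y_obs, x[missing], bandwidth)
    if random:
        # Residuals of the kernel fit on the complete cases.
        residuals = y_obs - nw_predict(x_obs, y_obs, x_obs, bandwidth)
        fitted = fitted + rng.choice(residuals, size=fitted.shape[0], replace=True)

    filled = y.copy()
    filled[missing] = fitted
    return filled


# Synthetic illustration (made-up data, not from the paper).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 200)
y_miss = y.copy()
y_miss[::4] = np.nan  # drop every fourth value

y_det = kernel_impute(x, y_miss, random=False)
y_rnd = kernel_impute(x, y_miss, random=True, rng=rng)
print("std (deterministic):", y_det.std())
print("std (random):       ", y_rnd.std())
```

Deterministic imputation places every filled value exactly on the fitted curve, which tends to understate the spread of the completed data. Adding a resampled residual is one common way to restore that spread, and this matters when the targets of inference include the distribution function and quantiles as well as the mean.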