Optimized parameters for missing data imputation

  • Authors:
  • Shichao Zhang;Yongsong Qin;Xiaofeng Zhu;Jilian Zhang;Chengqi Zhang

  • Affiliations:
  • Department of Computer Science, Guangxi Normal University, China and Faculty of Information Technology, University of Technology Sydney, NSW, Australia;Department of Computer Science, Guangxi Normal University, China;Department of Computer Science, Guangxi Normal University, China;Department of Computer Science, Guangxi Normal University, China;Faculty of Information Technology, University of Technology Sydney, NSW, Australia

  • Venue:
  • PRICAI'06 Proceedings of the 9th Pacific Rim International Conference on Artificial Intelligence
  • Year:
  • 2006

Abstract

One way to complete missing values is to exploit correlations among the attributes of the data. However, such relations are difficult to identify when the data themselves contain missing values. Accordingly, we develop a kernel-based missing data imputation method in this paper. The approach aims to make two statistical parameters, the mean and the distribution function, optimal after the missing data are imputed. We refer to this approach as the parameter optimization method (the POP algorithm, a random regression imputation). We evaluate the approach experimentally and demonstrate that the POP algorithm is much better than deterministic regression imputation in the efficiency of inference on these two parameters. The results also show that the algorithm is computationally efficient, robust, and stable for missing data imputation.
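
The abstract does not give the details of the POP algorithm, so the following is only a minimal sketch of the general idea it contrasts: deterministic versus random regression imputation built on a kernel regression estimate. Everything in it is an assumption made for illustration rather than the authors' method: the Gaussian kernel, the Nadaraya-Watson estimator, the fixed bandwidth `h`, the choice to draw the random term from the observed residuals, and the names `kernel_predict` and `impute`.

```python
import math
import random


def kernel_predict(x0, xs, ys, h):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel.

    Illustrative helper (not from the paper); h is an assumed bandwidth.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed to zero: fall back to the plain mean.
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total


def impute(xs, ys, h=1.0, stochastic=True, rng=None):
    """Fill entries of ys that are None using kernel regression on xs.

    stochastic=False: deterministic regression imputation (prediction only).
    stochastic=True:  random regression imputation (prediction plus a
                      residual drawn at random from the observed cases).
    """
    rng = rng or random.Random()
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    if not observed:
        raise ValueError("need at least one observed y value")
    ox = [x for x, _ in observed]
    oy = [y for _, y in observed]

    # Residuals of the fitted curve on the complete cases.
    residuals = [y - kernel_predict(x, ox, oy, h) for x, y in observed]

    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            y = kernel_predict(x, ox, oy, h)
            if stochastic:
                y += rng.choice(residuals)
        filled.append(y)
    return filled


# Toy data, made up for this example only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 1.2, None, 2.9, None, 5.1]
deterministic = impute(xs, ys, h=1.0, stochastic=False)
randomized = impute(xs, ys, h=1.0, stochastic=True, rng=random.Random(0))
```

Deterministic imputation places every filled value exactly on the fitted curve, which tends to understate the spread of the completed data. Adding a random residual is one common way to better preserve the distribution function, which is one of the two parameters the abstract targets.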