Missing data imputation by utilizing information within incomplete instances

  • Authors:
  • Shichao Zhang; Zhi Jin; Xiaofeng Zhu

  • Affiliations:
  • Department of Computer Science, Zhejiang Normal University, Jinhua, China and State Key Laboratory for Novel Software Technology, Nanjing University, China; Key Lab of High Confidence Software Technologies, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; School of Information Technology & Electrical Engineering, University of Queensland, QLD 4072, Australia

  • Venue:
  • Journal of Systems and Software
  • Year:
  • 2011

Abstract

This paper proposes using the information within incomplete instances (instances with missing values) when estimating missing values. To this end, a simple and efficient nonparametric iterative imputation algorithm, called the NIIA method, is designed for iteratively imputing missing target values. The NIIA method re-imputes each missing value over successive iterations until the algorithm converges. In the first iteration, all the complete instances are used to estimate the missing values. From the second iteration onward, the information within incomplete instances is used as well. We conduct experiments to evaluate the method's efficiency and demonstrate that (1) using the information within incomplete instances makes it easier to capture the distribution of a dataset; and (2) the NIIA method outperforms existing methods in accuracy, an advantage that is most pronounced when datasets have a high missing ratio.
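The iteration scheme described in the abstract can be illustrated with a minimal sketch. The abstract states only that the estimator is nonparametric, so the Gaussian-kernel (Nadaraya-Watson) regression used here is an assumption, as are the fixed bandwidth, the stopping rule, and the setting of a single target column with fully observed features. The names `kernel_estimate` and `iterative_impute` are hypothetical and are not taken from the paper.

```python
import numpy as np


def kernel_estimate(X_train, y_train, x, bandwidth=1.0):
    """Nadaraya-Watson estimate of the target at x (Gaussian kernel; assumed estimator)."""
    d2 = np.sum((X_train - x) ** 2, axis=1)
    # Shifting by the minimum distance rescales all weights by the same
    # factor, so the estimate is unchanged but underflow to 0/0 is avoided.
    d2 = d2 - d2.min()
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return float(np.dot(w, y_train) / w.sum())


def iterative_impute(X, y, bandwidth=1.0, tol=1e-6, max_iter=100):
    """Iteratively impute NaN entries of y; returns (imputed y, iterations run)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).copy()
    missing = np.isnan(y)
    if not missing.any():
        return y, 0
    complete = ~missing
    if not complete.any():
        raise ValueError("at least one complete instance is required")
    missing_idx = np.flatnonzero(missing)

    # Iteration 1: estimate each missing target from complete instances only.
    X_complete, y_complete = X[complete], y[complete]
    for i in missing_idx:
        y[i] = kernel_estimate(X_complete, y_complete, X[i], bandwidth)

    # Iterations 2+: every other instance, including previously imputed
    # ones, contributes to the estimate of each missing target.
    for iteration in range(2, max_iter + 1):
        prev = y.copy()
        for i in missing_idx:
            others = np.ones(len(y), dtype=bool)
            others[i] = False
            y[i] = kernel_estimate(X[others], prev[others], X[i], bandwidth)
        # Assumed stopping rule: largest change in any imputed value.
        if np.max(np.abs(y - prev)) < tol:
            return y, iteration
    return y, max_iter


X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([0.0, 1.0, np.nan, 3.0, np.nan])
imputed, n_iter = iterative_impute(X, y)
```

In this sketch the observed targets are never modified, and each imputed value is a weighted average of the other targets, so it always lies within the range of the observed values. The bandwidth controls how local each estimate is and would need tuning for real data.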