Missing data imputation by utilizing information within incomplete instances

Authors:
Shichao Zhang;Zhi Jin;Xiaofeng Zhu
Affiliations:
Department of Computer Science, Zhejiang Normal University, Jinhua, China and State Key Laboratory for Novel Software Technology, Nanjing University, China;Key Lab of High Confidence Software Technologies, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China;School of Information Technology & Electrical Engineering, University of Queensland, QLD 4072, Australia
Venue:
Journal of Systems and Software
Year:
2011

Abstract

This paper proposes utilizing the information within incomplete instances (instances with missing values) when estimating missing values. Accordingly, a simple and efficient nonparametric iterative imputation algorithm, called NIIA, is designed for iteratively imputing missing target values. NIIA imputes each missing value repeatedly until the algorithm converges. In the first iteration, all the complete instances are used to estimate the missing values; from the second iteration onward, the information within incomplete instances is utilized as well. Experiments evaluating the efficiency of the method demonstrate that (1) utilizing the information within incomplete instances makes it easier to capture the distribution of a dataset, and (2) NIIA outperforms existing methods in accuracy, an advantage that is most pronounced when datasets have a high missing ratio.
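The iteration scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the estimator, so the code below assumes a Gaussian-kernel (Nadaraya-Watson) weighted average over a single numeric attribute. The names `kernel_estimate` and `iterative_impute` and the parameters `bandwidth`, `tol`, and `max_iter` are hypothetical choices made for illustration. This is not the authors' implementation.

```python
import math


def kernel_estimate(x, xs, ys, bandwidth):
    """Gaussian-kernel weighted average of ys, evaluated at x (assumed estimator)."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the plain mean.
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total


def iterative_impute(xs, ys, bandwidth=1.0, tol=1e-6, max_iter=100):
    """Impute missing targets (None entries in ys); return (filled, iterations)."""
    observed = [i for i, y in enumerate(ys) if y is not None]
    missing = [i for i, y in enumerate(ys) if y is None]
    if not observed:
        raise ValueError("at least one complete instance is required")

    filled = list(ys)

    # Iteration 1: estimate each missing target from complete instances only.
    obs_x = [xs[i] for i in observed]
    obs_y = [ys[i] for i in observed]
    for i in missing:
        filled[i] = kernel_estimate(xs[i], obs_x, obs_y, bandwidth)
    iterations = 1

    # Iteration 2 onward: also use the other incomplete instances,
    # with their imputed targets taken from the previous iteration.
    while missing and iterations < max_iter:
        previous = list(filled)
        for i in missing:
            others = [j for j in range(len(xs)) if j != i]
            filled[i] = kernel_estimate(
                xs[i],
                [xs[j] for j in others],
                [previous[j] for j in others],
                bandwidth,
            )
        iterations += 1
        # Stop once no imputed value moved by more than tol (assumed criterion).
        if max(abs(filled[i] - previous[i]) for i in missing) < tol:
            break

    return filled, iterations
```

Calling `iterative_impute([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, None, None, 6.0, 8.0])` returns the completed target list together with the number of iterations performed. Each imputed value is a weighted average of the other targets, so it always stays within the range of the observed values.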