Optimized parameters for missing data imputation

Authors:
Shichao Zhang;Yongsong Qin;Xiaofeng Zhu;Jilian Zhang;Chengqi Zhang
Affiliations:
Department of Computer Science, Guangxi Normal University, China and Faculty of Information Technology, University of Technology Sydney, NSW, Australia;Department of Computer Science, Guangxi Normal University, China;Department of Computer Science, Guangxi Normal University, China;Department of Computer Science, Guangxi Normal University, China;Faculty of Information Technology, University of Technology Sydney, NSW, Australia
Venue:
PRICAI'06 Proceedings of the 9th Pacific Rim International Conference on Artificial Intelligence
Year:
2006

Abstract

One way to fill in missing values is to exploit attribute correlations within the data. However, such relations are difficult to identify when the data themselves contain missing values. Accordingly, we develop a kernel-based missing data imputation method in this paper. The approach aims to make two statistical parameters, the mean and the distribution function, optimal after the missing data are imputed. We refer to this approach as the parameter optimization method (the POP algorithm, a random regression imputation). We evaluate our approach experimentally and demonstrate that the POP algorithm is much better than deterministic regression imputation at efficiently generating inferences on these two parameters. The results also show that our algorithm is computationally efficient, robust, and stable for missing data imputation.
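
The contrast the abstract draws between deterministic and random regression imputation can be illustrated with a minimal sketch. This is not the authors' POP algorithm, whose details the abstract does not give. It assumes a single fully observed predictor x, a Nadaraya-Watson estimator with a Gaussian kernel, and a random variant that adds a residual resampled from the complete cases. The function names, the bandwidth default, and the use of None to mark missing values are all hypothetical choices made for illustration.

```python
import math
import random


def nw_estimate(x_obs, y_obs, x0, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x0 - xi) / bandwidth) ** 2) for xi in x_obs]
    total = sum(weights)
    # Note: total can underflow to 0.0 when x0 is far from every observed x
    # relative to the bandwidth; this sketch does not handle that case.
    return sum(w * yi for w, yi in zip(weights, y_obs)) / total


def kernel_impute(x, y, bandwidth=0.5, randomize=True, seed=0):
    """Return a copy of y in which None entries are imputed from x.

    randomize=False: deterministic regression imputation (kernel estimate only).
    randomize=True:  random regression imputation (kernel estimate plus a
                     residual drawn with replacement from the complete cases).
    """
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    x_obs = [xi for xi, _ in observed]
    y_obs = [yi for _, yi in observed]

    # Residuals of the kernel fit on the complete cases.
    residuals = [
        yi - nw_estimate(x_obs, y_obs, xi, bandwidth) for xi, yi in observed
    ]

    rng = random.Random(seed)
    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            yi = nw_estimate(x_obs, y_obs, xi, bandwidth)
            if randomize:
                yi += rng.choice(residuals)
        filled.append(yi)
    return filled
```

Deterministic imputation places every imputed value on the fitted curve, which tends to understate the spread of the completed data. Adding a resampled residual is one common way to restore that spread, which matters when the target of inference is the distribution function rather than the mean alone.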