Locally linear reconstruction based missing value imputation for supervised learning

Authors:
Pilsung Kang
Affiliations:
Department of Industrial & Information Systems Engineering, College of Business and Technology, Seoul National University of Science & Technology (Seoultech), 139-743, 232 Gongreung ro, Nowon-gu, ...
Venue:
Neurocomputing
Year:
2013

Abstract

Most learning algorithms assume that the data are complete, that is, that every attribute of every instance holds a valid value. In practice, however, missing values are common in real datasets for a variety of reasons. In this paper, we propose a new single imputation method based on locally linear reconstruction (LLR) that improves the prediction performance of supervised learning (classification and regression) in the presence of missing values. First, we investigate how missing values degrade prediction performance at various missing ratios. Next, we compare the proposed LLR imputation method with six well-known single imputation methods, using five different learning algorithms on 13 classification and nine regression datasets. The experimental results show that (1) all imputation methods, even very simple ones, helped to improve prediction accuracy; (2) the proposed LLR imputation method improved modeling performance more than all other imputation methods, irrespective of the learning algorithm and the missing ratio; and (3) LLR was particularly effective when the missing ratio was relatively high, and its prediction accuracy was similar to that obtained with the complete dataset.
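
The abstract does not spell out the imputation procedure, so the following is only a minimal sketch of how an LLR-style single imputation could look, not the paper's implementation. It assumes that neighbors are drawn from complete instances only, that distances are Euclidean over the attributes observed in the incomplete instance, and that the reconstruction weights are constrained to sum to one and stabilized with a small ridge term (as is commonly done in locally linear embedding). The function name `llr_impute` and the parameters `k` and `reg` are hypothetical.

```python
import numpy as np


def llr_impute(X, k=5, reg=1e-3):
    """Fill NaN entries of X with a locally linear reconstruction sketch.

    Assumptions (not taken from the paper): donors are the complete rows,
    distances use only the observed attributes of the incomplete row, and
    weights sum to one with ridge regularization of strength `reg`.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    incomplete = np.isnan(X).any(axis=1)
    donors = X[~incomplete]
    if donors.shape[0] == 0:
        raise ValueError("at least one complete instance is required")
    k = min(k, donors.shape[0])

    for i in np.flatnonzero(incomplete):
        miss = np.isnan(X[i])
        obs = ~miss

        # Differences between every donor and the query on observed attributes.
        diffs = donors[:, obs] - X[i, obs]
        dist = np.sqrt((diffs ** 2).sum(axis=1))
        nn = np.argsort(dist)[:k]

        # Local Gram matrix of the k nearest donors, with a ridge term so the
        # linear system stays solvable when neighbors are (nearly) collinear.
        Z = diffs[nn]
        G = Z @ Z.T
        G += (reg * np.trace(G) + 1e-12) * np.eye(k)

        # Weights minimizing the reconstruction error subject to sum(w) == 1.
        w = np.linalg.solve(G, np.ones(k))
        w /= w.sum()

        # Missing attributes are the same weighted combination of the donors.
        filled[i, miss] = w @ donors[nn][:, miss]
    return filled


# Toy data in which the third attribute equals the sum of the first two.
X_demo = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 2.0],
    [0.5, 0.5, np.nan],
])
filled_demo = llr_impute(X_demo, k=4)
print(filled_demo[4])
```

Because the weights sum to one and reconstruct the observed attributes of the query, any attribute that is locally an affine function of the observed ones is recovered closely, which is the intuition behind using a reconstruction rather than a plain neighbor average. In practice the attributes would normally be scaled before distances are computed, and `k` and `reg` would be tuned on held-out data.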