Missing value imputation on missing completely at random data using multilayer perceptrons

Authors:
Esther-Lydia Silva-Ramírez;Rafael Pino-Mejías;Manuel López-Coello;María-Dolores Cubiles-de-la-Vega
Affiliations:
Department of Computer Languages and Systems, University of Cadiz, C/Chile N 1, 11003 Cadiz, Spain;Andalusian Prospective Center, Avda. Reina Mercedes s/n, 41012 Seville, Spain and Department of Statistics and Operational Research, University of Seville, Avda. Reina Mercedes s/n, 41012 Seville, ...;Department of Computer Languages and Systems, University of Cadiz, C/Chile N 1, 11003 Cadiz, Spain;Department of Statistics and Operational Research, University of Seville, Avda. Reina Mercedes s/n, 41012 Seville, Spain
Venue:
Neural Networks
Year:
2011

Abstract

Data mining relies on data files that often contain errors in the form of missing values. This paper presents a methodological framework for developing an automated data imputation model based on artificial neural networks. Fifteen real and simulated data sets, ranging in size from 47 to 1389 records, are subjected to a perturbation experiment in which missing values are generated at random, with the probability of a missing value set to 0.05. Several architectures and learning algorithms for the multilayer perceptron are tested and compared with three classic imputation procedures: mean/mode imputation, regression and hot-deck. The results, across different performance measures, suggest that this approach improves the quality of a database with missing values, and that the best results are clearly obtained with the multilayer perceptron model in data sets with categorical variables. Three learning rules (Levenberg-Marquardt, BFGS Quasi-Newton and Conjugate Gradient Fletcher-Reeves Update) and a small number of hidden nodes are recommended.
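The perturbation experiment and the simplest of the three baselines can be illustrated with a minimal sketch. It is not the authors' code: the function names `perturb_mcar` and `impute_mean_mode`, the use of `None` as the missing-value marker, and the toy table are all assumptions made for illustration. The sketch deletes each cell independently with probability 0.05 (missing completely at random), then fills numeric columns with the observed mean and categorical columns with the observed mode.

```python
import random
from statistics import mean, mode

MISSING = None  # assumed marker for a deleted cell


def perturb_mcar(rows, p=0.05, seed=0):
    """Return a copy of `rows` in which every cell is deleted
    independently with probability `p` (MCAR mechanism)."""
    rng = random.Random(seed)
    return [[MISSING if rng.random() < p else v for v in row] for row in rows]


def impute_mean_mode(rows):
    """Fill missing cells with the column mean (numeric columns)
    or the column mode (categorical columns)."""
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not MISSING]
        if not observed:
            fills.append(MISSING)  # no observed value to borrow from
        elif all(isinstance(v, (int, float)) for v in observed):
            fills.append(mean(observed))
        else:
            fills.append(mode(observed))
    return [
        [fills[j] if v is MISSING else v for j, v in enumerate(row)]
        for row in rows
    ]


if __name__ == "__main__":
    # Hypothetical toy table: one numeric and one categorical column.
    table = [[1.0, "a"], [3.0, "b"], [2.0, "a"], [2.0, "a"]] * 25
    perturbed = perturb_mcar(table, p=0.05, seed=42)
    completed = impute_mean_mode(perturbed)
    n_missing = sum(v is MISSING for row in perturbed for v in row)
    print(f"deleted {n_missing} of {len(table) * 2} cells")
    print("cells still missing:", sum(v is MISSING for row in completed for v in row))
```

A multilayer perceptron imputer would replace `impute_mean_mode` with a model trained on the complete records to predict each incomplete variable from the others. The comparison described in the abstract then amounts to scoring each method's filled-in cells against the original, unperturbed values.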