An Evolutionary Algorithm for Missing Values Substitution in Classification Tasks

Authors:
Jonathan A. Silva;Eduardo R. Hruschka
Affiliations:
University of Sao Paulo (USP), Brazil;University of Sao Paulo (USP), Brazil
Venue:
HAIS '09 Proceedings of the 4th International Conference on Hybrid Artificial Intelligence Systems
Year:
2009

Abstract

This paper proposes a method for substituting missing values that is based on an evolutionary algorithm for clustering. Missing values substitution has traditionally been assessed by measures of the prediction capability of imputation methods. Although this evaluation is useful, it does not reveal how imputed values influence the ultimate modeling task (e.g., classification). Alternative approaches to the so-called prediction capability evaluation are therefore needed, and we also discuss the influence of imputed values on the classification task. Preliminary results obtained on a bioinformatics data set illustrate that the proposed imputation algorithm can introduce less classification bias than three state-of-the-art algorithms (i.e., KNNimpute, SKNN and IKNN). Finally, we illustrate that better prediction results do not necessarily imply less classification bias.
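
The distinction the abstract draws between prediction capability and classification bias can be made concrete with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data, the use of column-mean imputation (not the proposed evolutionary method, nor KNNimpute, SKNN or IKNN), the nearest-centroid stand-in classifier, and in particular the definition of classification bias as the absolute change in classifier accuracy between the complete data and the imputed data.

```python
import math

def rmse(true_vals, imputed_vals):
    """Prediction-capability view: distance between imputed and held-out true values."""
    n = len(true_vals)
    return math.sqrt(sum((t - i) ** 2 for t, i in zip(true_vals, imputed_vals)) / n)

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column.
    A simple baseline imputer, used here only as a placeholder."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def centroid_accuracy(rows, labels):
    """Resubstitution accuracy of a nearest-centroid classifier (stand-in classifier)."""
    centroids = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    def predict(r):
        return min(centroids, key=lambda c: math.dist(r, centroids[c]))

    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def classification_bias(complete, imputed, labels):
    """Assumed definition: absolute accuracy change caused by imputation."""
    return abs(centroid_accuracy(complete, labels) - centroid_accuracy(imputed, labels))

# Hypothetical toy data: two well-separated classes, two features.
complete = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
            [3.0, 3.0], [3.2, 2.9], [2.8, 3.1]]
labels = [0, 0, 0, 1, 1, 1]

# Hide two known entries so the true values stay available for evaluation.
hidden = [(0, 1), (4, 0)]
with_missing = [row[:] for row in complete]
for i, j in hidden:
    with_missing[i][j] = None

imputed = mean_impute(with_missing)

true_vals = [complete[i][j] for i, j in hidden]
imputed_vals = [imputed[i][j] for i, j in hidden]

prediction_error = rmse(true_vals, imputed_vals)
bias = classification_bias(complete, imputed, labels)
print("prediction error (RMSE):", prediction_error)
print("classification bias:", bias)
```

The two numbers answer different questions: the first compares imputed values with the hidden true values, while the second only looks at how the downstream classifier behaves. An imputer can therefore score well on one and poorly on the other, which is the point the abstract makes.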