A conservative feature subset selection algorithm with missing data

  • Authors:
  • Alex Aussem; Sergio Rodrigues de Morais

  • Affiliations:
  • LIESP, University of Lyon, UCBL, 69622 Villeurbanne, France; LIESP, University of Lyon, INSA-Lyon, 69622 Villeurbanne, France

  • Venue:
  • Neurocomputing
  • Year:
  • 2010

Abstract

This paper introduces a novel conservative feature subset selection method for incomplete data sets. The method is conservative in the sense that it selects the minimal subset of features that renders the remaining features independent of the target (the class variable) without making any assumption about the missing data mechanism. It does so by determining the Markov blanket of the target under the worst-case assumption about the missing data mechanism, including the case where data are not missing at random. The method is applied to synthetic and real-world incomplete data to illustrate its practical relevance, and is compared against state-of-the-art approaches such as the expectation-maximization (EM) algorithm and the available case technique.
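For readers unfamiliar with the available case technique named as a baseline above, the following is a minimal sketch of how it is commonly applied to a single independence test between a discrete feature and the target: only records in which both variables are observed contribute to the statistic. This is not the conservative method proposed in the paper. The function name `available_case_chi2`, the use of `None` as the missing-value marker, and the choice of the Pearson chi-square statistic are all assumptions made for illustration.

```python
from collections import Counter


def available_case_chi2(xs, ys, missing=None):
    """Pearson chi-square statistic for X vs. Y on available cases only.

    Hypothetical helper for illustration: rows where either value equals
    the `missing` marker are dropped before the contingency table is built.
    Returns (statistic, degrees_of_freedom, number_of_rows_used).
    """
    # Keep only rows in which both the feature and the target are observed.
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not missing and y is not missing]
    n = len(pairs)
    if n == 0:
        return 0.0, 0, 0

    joint = Counter(pairs)                    # observed cell counts
    x_counts = Counter(x for x, _ in pairs)   # row marginals
    y_counts = Counter(y for _, y in pairs)   # column marginals

    stat = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n            # count expected under independence
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected

    dof = (len(x_counts) - 1) * (len(y_counts) - 1)
    return stat, dof, n


# Example: the last row has a missing feature value and is ignored.
feature = [0, 0, 1, 1, None]
target = [0, 0, 1, 1, 0]
stat, dof, n_used = available_case_chi2(feature, target)
```

Because rows are discarded per test, this baseline implicitly relies on the discarded rows being unrepresentative in no systematic way, which is the kind of assumption about the missing data mechanism that the abstract says the proposed method avoids.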