A Conservative Feature Subset Selection Algorithm with Missing Data

  • Authors:
  • Alex Aussem; Sergio Rodrigues de Morais

  • Venue:
  • ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
  • Year:
  • 2008

Abstract

This paper introduces a novel conservative feature subset selection method for incomplete data sets. The method is conservative in the sense that it selects the minimal subset of features that renders the remaining features independent of the target (the class variable) without making any assumption about the missing data mechanism. This is achieved by determining the Markov blanket of the target that reflects the worst-case assumption about the missing data mechanism, including the case where data are not missing at random. The method is applied to synthetic incomplete data to illustrate its practical relevance, and is compared against state-of-the-art approaches such as the expectation maximization (EM) algorithm and the available case technique.
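
To make two of the ideas named in the abstract concrete, the following minimal Python sketch may help. It is not the paper's algorithm: the function names `available_case_chi2` and `worst_case_bounds` are hypothetical, missing values are assumed to be encoded as `None`, and variables are assumed to be discrete. The first function illustrates the available case technique for a pairwise dependence statistic (only records where both variables are observed are used). The second shows the kind of bound one obtains when no assumption is made about the missing data mechanism: the probability of a value can lie anywhere between the case where no missing entry takes that value and the case where all of them do.

```python
from collections import Counter


def available_case_chi2(xs, ys):
    """Pearson chi-square statistic for X vs. Y, computed only on
    records where both values are observed (None marks a missing entry).
    Returns the statistic and the number of records actually used."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n == 0:
        return 0.0, 0
    joint = Counter(pairs)
    row = Counter(x for x, _ in pairs)
    col = Counter(y for _, y in pairs)
    stat = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat, n


def worst_case_bounds(values, target):
    """Lower and upper bounds on P(value == target) over every possible
    completion of the missing entries (no missingness assumption)."""
    n = len(values)
    hits = sum(1 for v in values if v == target)
    missing = sum(1 for v in values if v is None)
    return hits / n, (hits + missing) / n


if __name__ == "__main__":
    xs = [0, 0, 1, 1, None, 1]
    ys = [0, 0, 1, 1, 1, None]
    print(available_case_chi2(xs, ys))
    print(worst_case_bounds([1, 0, None, 1], target=1))
```

The interval returned by `worst_case_bounds` widens as the fraction of missing entries grows, which is one way to see why a selection method that respects such worst-case bounds tends to be conservative, whereas the available case statistic silently discards incomplete records.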