A novel Markov boundary based feature subset selection algorithm

  • Authors:
  • Sérgio Rodrigues de Morais;Alex Aussem

  • Affiliations:
  • LIESP, University of Lyon, INSA-Lyon, 69622 Villeurbanne, France;LIESP, University of Lyon, UCBL, 69622 Villeurbanne, France

  • Venue:
  • Neurocomputing
  • Year:
  • 2010

Abstract

We aim to identify the minimal subset of random variables that is relevant for probabilistic classification in data sets with many variables but few instances. A principled solution to this problem is to determine the Markov boundary of the class variable. In this paper, we propose a novel constraint-based Markov boundary discovery algorithm, called MBOR, that aims to improve accuracy while remaining scalable to very high-dimensional data sets and theoretically correct under the so-called faithfulness condition. We report extensive empirical experiments on synthetic data sets scaling up to tens of thousands of variables.
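
To make the notion of a Markov boundary concrete, the following minimal sketch may help. It is not MBOR, which has to discover the boundary from data using conditional independence tests. Instead it assumes the causal graph is already known and relies on the standard result that, under faithfulness, the Markov boundary of a variable in a Bayesian network consists of its parents, its children, and the other parents of its children (spouses). The function name `markov_boundary` and the toy graph `toy_dag` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def markov_boundary(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return the parents, children, and spouses of `target`.

    `parents` maps every node of a DAG to the set of its parents.
    This assumes the graph is known; it does not learn anything from data.
    """
    # Children are the nodes that list the target among their parents.
    children = {node for node, pa in parents.items() if target in pa}

    # Spouses are the other parents of those children.
    spouses: Set[str] = set()
    for child in children:
        spouses |= parents[child]

    # The union builds a new set, so the input graph is left unmodified.
    boundary = parents.get(target, set()) | children | spouses
    boundary.discard(target)
    return boundary


# Hypothetical toy DAG: A -> T, T -> C, B -> C, C -> D, and E is isolated.
toy_dag = {
    "A": set(),
    "B": set(),
    "E": set(),
    "T": {"A"},
    "C": {"T", "B"},
    "D": {"C"},
}

print(sorted(markov_boundary(toy_dag, "T")))  # ['A', 'B', 'C']
```

In this toy graph, D and E are excluded from the boundary of the class variable T: D is independent of T given C, and E is unrelated to T, so neither adds information for predicting T once A, B, and C are known.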