Towards scalable and data efficient learning of Markov boundaries

  • Authors:
  • Jose M. Peña;Roland Nilsson;Johan Björkegren;Jesper Tegnér

  • Affiliations:
  • IFM, Linköping University, SE-58183 Linköping, Sweden;IFM, Linköping University, SE-58183 Linköping, Sweden;CGB, Karolinska Institutet, SE-17177 Stockholm, Sweden;IFM, Linköping University, SE-58183 Linköping, Sweden and CGB, Karolinska Institutet, SE-17177 Stockholm, Sweden

  • Venue:
  • International Journal of Approximate Reasoning
  • Year:
  • 2007

Abstract

We propose algorithms for learning Markov boundaries from data without having to learn a Bayesian network first. We study their correctness, scalability and data efficiency. The last two properties are important because we aim to apply the algorithms to identify the minimal set of features that is needed for probabilistic classification in databases with thousands of features but few instances, e.g. gene expression databases. We evaluate the algorithms on synthetic and real databases, including one with 139,351 features.