Computational aspects of Bayesian partition models

  • Authors:
  • Mikko Koivisto; Kismat Sood

  • Affiliations:
  • University of Helsinki, Helsinki, Finland; National Public Health Institute, Helsinki, Finland

  • Venue:
  • ICML '05 Proceedings of the 22nd international conference on Machine learning
  • Year:
  • 2005

Abstract

The conditional distribution of a discrete variable y, given another discrete variable x, is often specified by assigning one multinomial distribution to each state of x. The cost of this rich parametrization is a loss of statistical power when the data actually fit a model with far fewer parameters. In this paper, we consider a model that partitions the state space of x into disjoint sets and assigns a single Dirichlet-multinomial to each set. We treat the partition as an unknown variable to be integrated out when the interest is in a coarser-level task, e.g., variable selection or classification. Based on two different computational approaches, we present two exact algorithms for integration over partitions. The respective complexity bounds are derived in terms of detailed problem characteristics, including the size of the data and the size of the state space of x. Experiments on synthetic data demonstrate the applicability of the algorithms.
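
To make the quantity being integrated concrete, here is a minimal brute-force sketch in Python. It is not either of the paper's two exact algorithms, which the abstract does not spell out. It simply enumerates every partition of the states of x, pools the y-counts within each block, scores each block with a Dirichlet-multinomial marginal likelihood, and averages over partitions. The symmetric Dirichlet hyperparameter `alpha`, the uniform prior over partitions, and all function names are assumptions made for illustration. The enumeration grows with the Bell number of the state-space size, so it is only feasible for a handful of states.

```python
from math import exp, lgamma, log


def log_dirichlet_multinomial(counts, alpha=1.0):
    """Log marginal likelihood of an observation sequence with the given
    category counts, under a symmetric Dirichlet(alpha) prior (assumed)."""
    k = len(counts)
    n = sum(counts)
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        out += lgamma(alpha + c) - lgamma(alpha)
    return out


def set_partitions(items):
    """Yield every partition of `items` into non-empty disjoint blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition


def log_marginal_over_partitions(counts_by_state, alpha=1.0):
    """Log marginal likelihood of y given x, averaged over all partitions
    of the states of x under a uniform partition prior (assumed).

    counts_by_state maps each state of x to a list of counts of y values.
    """
    states = list(counts_by_state)
    n_y = len(next(iter(counts_by_state.values())))
    log_terms = []
    for partition in set_partitions(states):
        lp = 0.0
        for block in partition:
            # States in one block share a single Dirichlet-multinomial,
            # so their y-counts are pooled before scoring.
            pooled = [sum(counts_by_state[s][j] for s in block)
                      for j in range(n_y)]
            lp += log_dirichlet_multinomial(pooled, alpha)
        log_terms.append(lp)
    # Log-sum-exp for numerical stability, then apply the uniform prior.
    m = max(log_terms)
    return m + log(sum(exp(t - m) for t in log_terms)) - log(len(log_terms))


if __name__ == "__main__":
    # Hypothetical counts: states "a" and "b" look alike, "c" differs.
    data = {"a": [9, 1], "b": [8, 2], "c": [1, 9]}
    print(log_marginal_over_partitions(data))
```

With three states of x the sketch sums over only five partitions. The exact algorithms announced in the abstract address the same integral, with complexity bounds stated in terms of the data size and the size of the state space of x.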