Multivariate information bottleneck

  • Authors:
  • Noam Slonim; Nir Friedman; Naftali Tishby

  • Affiliations:
  • Department of Physics and the Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ; School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel; School of Computer Science and Engineering, and Interdisciplinary Center for Neural Computation, Hebrew University, Jerusalem, Israel

  • Venue:
  • Neural Computation
  • Year:
  • 2006

Abstract

The information bottleneck (IB) method is an unsupervised, model-independent data organization technique. Given a joint distribution, p(X, Y), this method constructs a new variable, T, that defines partitions, or clusters, over the values of X that are informative about Y. Algorithms motivated by the IB method have already been applied to text classification, gene expression, neural code, and spectral analysis. Here, we introduce a general, principled framework for multivariate extensions of the IB method. This allows us to consider multiple systems of data partitions that are interrelated. Our approach uses Bayesian networks to specify both the systems of clusters and the information terms that should be maintained. We show that this construction provides insights into bottleneck variations and enables us to characterize the solutions of these variations. We also present four different algorithmic approaches that allow us to construct solutions in practice, and we apply them to several real-world problems.
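To make the single-variable starting point concrete, the following is a minimal sketch of the original (non-multivariate) IB idea: a soft assignment q(t|x) is refined by alternating self-consistent updates so that T stays informative about Y while compressing X. It is not one of the four algorithms the abstract mentions, and it does not implement the Bayesian-network framework. The function names (`mutual_information`, `iterative_ib`, `cluster_joint`), the trade-off parameter `beta`, the iteration count, and the toy distribution are all assumptions made for illustration; the update rule used is the standard one from the IB literature, q(t|x) ∝ q(t) exp(−β KL[p(y|x) ‖ q(y|t)]). The sketch assumes every value of X has nonzero probability.

```python
import math
import random


def mutual_information(pab):
    """I(A;B) in nats for a joint distribution given as a nested list pab[a][b]."""
    pa = [sum(row) for row in pab]
    pb = [sum(col) for col in zip(*pab)]
    mi = 0.0
    for a, row in enumerate(pab):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (pa[a] * pb[b]))
    return mi


def iterative_ib(pxy, n_clusters, beta, n_iter=200, seed=0):
    """Return soft assignments q(t|x) as a nested list qt_x[x][t].

    Illustrative sketch only; assumes p(x) > 0 for every x.
    """
    rng = random.Random(seed)
    nx, ny = len(pxy), len(pxy[0])
    px = [sum(row) for row in pxy]
    py_x = [[p / px[x] for p in pxy[x]] for x in range(nx)]

    # Random soft initialization of q(t|x).
    qt_x = []
    for _ in range(nx):
        w = [rng.random() + 1e-3 for _ in range(n_clusters)]
        s = sum(w)
        qt_x.append([v / s for v in w])

    tiny = 1e-300  # guards against log(0) and division by zero
    for _ in range(n_iter):
        # q(t) = sum_x p(x) q(t|x)
        qt = [sum(px[x] * qt_x[x][t] for x in range(nx))
              for t in range(n_clusters)]
        # q(y|t) = sum_x p(x) q(t|x) p(y|x) / q(t)
        qy_t = [[sum(px[x] * qt_x[x][t] * py_x[x][y] for x in range(nx))
                 / max(qt[t], tiny) for y in range(ny)]
                for t in range(n_clusters)]
        # q(t|x) proportional to q(t) * exp(-beta * KL[p(y|x) || q(y|t)])
        updated = []
        for x in range(nx):
            logits = []
            for t in range(n_clusters):
                kl = sum(p * math.log(p / max(qy_t[t][y], tiny))
                         for y, p in enumerate(py_x[x]) if p > 0)
                logits.append(math.log(max(qt[t], tiny)) - beta * kl)
            top = max(logits)
            w = [math.exp(v - top) for v in logits]
            s = sum(w)
            updated.append([v / s for v in w])
        qt_x = updated
    return qt_x


def cluster_joint(pxy, qt_x):
    """Joint q(t, y) = sum_x q(t|x) p(x, y), as a nested list [t][y]."""
    n_clusters, ny = len(qt_x[0]), len(pxy[0])
    return [[sum(qt_x[x][t] * pxy[x][y] for x in range(len(pxy)))
             for y in range(ny)] for t in range(n_clusters)]


# Toy joint p(X, Y): x0 and x1 share one conditional p(y|x), x2 and x3 another.
toy_pxy = [[0.20, 0.05], [0.20, 0.05], [0.05, 0.20], [0.05, 0.20]]
assignments = iterative_ib(toy_pxy, n_clusters=2, beta=10.0)
info_kept = mutual_information(cluster_joint(toy_pxy, assignments))
info_total = mutual_information(toy_pxy)
```

Because the update depends on x only through p(y|x), values of X with identical conditionals always receive identical assignments, and the information kept about Y, `info_kept`, can never exceed `info_total`. Larger values of `beta` favor preserving information about Y over compressing X.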