A generalized maximum entropy approach to Bregman co-clustering and matrix approximation

  • Authors:
  • Arindam Banerjee;Inderjit Dhillon;Joydeep Ghosh;Srujana Merugu;Dharmendra S. Modha

  • Affiliations:
  • University of Texas, Austin, TX;University of Texas, Austin, TX;University of Texas, Austin, TX;University of Texas, Austin, TX;IBM Almaden Research Center, San Jose, CA

  • Venue:
  • Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
  • Year:
  • 2004

Abstract

Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an information-theoretic co-clustering approach applicable to empirical joint probability distributions was proposed. In many situations, however, co-clustering of more general matrices is desired. In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation-based constraints can be imposed depending on the statistics that need to be preserved. Analysis of the co-clustering problem leads to the minimum Bregman information principle, which generalizes the maximum entropy principle and yields an elegant meta-algorithm that is guaranteed to achieve local optimality. Our methodology yields new algorithms and also encompasses several previously known clustering and co-clustering algorithms based on alternate minimization.
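
To make the alternate-minimization idea concrete, here is a minimal sketch (not the authors' code) of Bregman co-clustering specialized to the squared Euclidean divergence, i.e. the Bregman divergence generated by φ(x) = x², with each (row-cluster, column-cluster) block approximated by its mean. Rows and columns are alternately reassigned to reduce the total squared error; all names and defaults below are illustrative assumptions.

```python
import numpy as np

def coclus_sq_euclidean(Z, k, l, n_iter=20, seed=0):
    """Toy alternating-minimization co-clustering for squared Euclidean loss."""
    rng = np.random.default_rng(seed)
    m, n = Z.shape
    rho = rng.integers(0, k, size=m)      # row-cluster assignments
    gamma = rng.integers(0, l, size=n)    # column-cluster assignments

    def block_means(rho, gamma):
        # Mean of each (row-cluster, column-cluster) block; empty blocks -> 0.
        M = np.zeros((k, l))
        for g in range(k):
            for h in range(l):
                block = Z[np.ix_(rho == g, gamma == h)]
                if block.size:
                    M[g, h] = block.mean()
        return M

    for _ in range(n_iter):
        M = block_means(rho, gamma)
        # Reassign each row to the row cluster minimizing its squared error
        # against the block-mean approximation.
        for i in range(m):
            costs = [((Z[i] - M[g, gamma]) ** 2).sum() for g in range(k)]
            rho[i] = int(np.argmin(costs))
        M = block_means(rho, gamma)
        # Reassign each column analogously.
        for j in range(n):
            costs = [((Z[:, j] - M[rho, h]) ** 2).sum() for h in range(l)]
            gamma[j] = int(np.argmin(costs))

    return rho, gamma, block_means(rho, gamma)
```

Each reassignment step can only decrease (or keep) the objective, so the scheme converges to a local optimum; replacing the squared error with another Bregman divergence, and the block mean with the corresponding minimum Bregman information solution, gives the general meta-algorithm described in the abstract.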