A generalized maximum entropy approach to Bregman co-clustering and matrix approximation

  • Authors:
  • Arindam Banerjee;Inderjit Dhillon;Joydeep Ghosh;Srujana Merugu;Dharmendra S. Modha

  • Affiliations:
  • University of Texas, Austin, TX;University of Texas, Austin, TX;University of Texas, Austin, TX;University of Texas, Austin, TX;IBM Almaden Research Center, San Jose, CA

  • Venue:
  • Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
  • Year:
  • 2004

Abstract

Co-clustering is a powerful data mining technique with varied applications such as text clustering, microarray analysis and recommender systems. Recently, an information-theoretic co-clustering approach applicable to empirical joint probability distributions was proposed. In many situations, however, co-clustering of more general matrices is desired. In this paper, we present a substantially generalized co-clustering framework wherein any Bregman divergence can be used in the objective function, and various conditional expectation-based constraints can be imposed depending on the statistics that need to be preserved. Analysis of the co-clustering problem leads to the minimum Bregman information principle, which generalizes the maximum entropy principle and yields an elegant meta-algorithm that is guaranteed to achieve local optimality. Our methodology yields new algorithms and also encompasses several previously known clustering and co-clustering algorithms based on alternate minimization.
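
To make the alternate-minimization idea concrete, here is a minimal sketch (not the authors' code) of Bregman co-clustering specialized to the squared Euclidean divergence, i.e. the Bregman divergence generated by φ(x) = x², with each (row-cluster, column-cluster) block approximated by its mean. Rows and columns are alternately reassigned to reduce the total squared error; all names and defaults below are illustrative assumptions.

```python
import numpy as np

def coclus_sq_euclidean(Z, k, l, n_iter=20, seed=0):
    """Toy alternating-minimization co-clustering for squared Euclidean loss."""
    rng = np.random.default_rng(seed)
    m, n = Z.shape
    rho = rng.integers(0, k, size=m)      # row-cluster assignments
    gamma = rng.integers(0, l, size=n)    # column-cluster assignments

    def block_means(rho, gamma):
        # Mean of each (row-cluster, column-cluster) block; empty blocks -> 0.
        M = np.zeros((k, l))
        for g in range(k):
            for h in range(l):
                block = Z[np.ix_(rho == g, gamma == h)]
                if block.size:
                    M[g, h] = block.mean()
        return M

    for _ in range(n_iter):
        M = block_means(rho, gamma)
        # Reassign each row to the row cluster minimizing its squared error
        # against the block-mean approximation.
        for i in range(m):
            costs = [((Z[i] - M[g, gamma]) ** 2).sum() for g in range(k)]
            rho[i] = int(np.argmin(costs))
        M = block_means(rho, gamma)
        # Reassign each column analogously.
        for j in range(n):
            costs = [((Z[:, j] - M[rho, h]) ** 2).sum() for h in range(l)]
            gamma[j] = int(np.argmin(costs))

    return rho, gamma, block_means(rho, gamma)
```

Each reassignment step can only decrease (or keep) the objective, so the scheme converges to a local optimum; replacing the squared error with another Bregman divergence, and the block mean with the corresponding minimum Bregman information solution, gives the general meta-algorithm described in the abstract.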