Information theoretic pairwise clustering

Authors:
Avishay Friedman;Jacob Goldberger
Affiliations:
Engineering Faculty, Bar-Ilan University, Ramat-Gan, Israel;Engineering Faculty, Bar-Ilan University, Ramat-Gan, Israel
Venue:
SIMBAD'13 Proceedings of the Second international conference on Similarity-Based Pattern Recognition
Year:
2013


Abstract

In this paper we develop an information-theoretic approach to pairwise clustering. The Laplacian of the pairwise similarity matrix can be used to define a Markov random walk on the data points, a view that gives spectral clustering methods a probabilistic interpretation. We use this probabilistic model to define a novel clustering cost function: the mutual information between the clusters visited at consecutive steps of the Markov chain defined by the graph Laplacian matrix, which the clustering seeks to maximize. The complexity of the algorithm is linear on sparse graphs. The improved performance and reduced computational complexity of the proposed algorithm are demonstrated on several standard datasets.
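
The quantity described in the abstract can be illustrated with a minimal sketch. The abstract does not state the formulas, so the following are assumptions: the random walk uses the usual transition matrix P = D^{-1} W built from a symmetric, non-negative similarity matrix W, the chain is started from its stationary distribution (proportional to node degree), and the score is the mutual information between the cluster occupied at one step and the cluster occupied at the next. The function name `cluster_mutual_information` and the toy graph are hypothetical. The paper's actual cost function and its optimization procedure are not reproduced here.

```python
import numpy as np

def cluster_mutual_information(W, labels):
    """Mutual information (in nats) between the clusters visited at two
    consecutive steps of a stationary random walk on the graph W.

    Assumption (not taken from the paper): P = D^{-1} W with stationary
    distribution pi_i = d_i / sum(d), so the joint probability of visiting
    state i and then state j is W_ij / sum(W).
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels, dtype=int)
    k = labels.max() + 1
    Z = np.eye(k)[labels]                 # one-hot cluster membership, n x k
    joint = Z.T @ W @ Z / W.sum()         # joint over (cluster now, cluster next)
    p_now = joint.sum(axis=1)
    p_next = joint.sum(axis=0)
    indep = np.outer(p_now, p_next)       # joint under independence
    mask = joint > 0                      # treat 0 * log 0 as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

# Toy graph: two triangles joined by one weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.01

mi_good = cluster_mutual_information(W, [0, 0, 0, 1, 1, 1])  # one cluster per triangle
mi_bad = cluster_mutual_information(W, [0, 1, 0, 1, 0, 1])   # clusters cut across triangles
```

Under these assumptions, the partition that follows the two triangles scores higher than the one that cuts across them, because a walker that rarely leaves its cluster makes the next cluster highly predictable from the current one. The sketch uses dense matrices for brevity and makes no attempt to reproduce the linear complexity on sparse graphs claimed in the abstract.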