Clustering via matrix powering

Authors:
Hanson Zhou; David Woodruff
Affiliations:
Intelligence Laboratory, Cambridge, MA;Intelligence Laboratory, Cambridge, MA
Venue:
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2004

Abstract

Given a set of n points and a matrix of pairwise similarity measures, one would like to partition the points into clusters so that similar points are grouped together and dissimilar ones are kept apart. We present an algorithm requiring only matrix powering that performs well in practice and bears an elegant interpretation in terms of random walks on a graph. Under a certain mixture model involving planting a partition via randomized rounding of tailored matrix entries, the algorithm can be proven effective with only a single squaring. It is shown that the clustering performance of the algorithm degrades with larger values of the exponent, thus revealing that a single squaring is optimal.
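
The abstract does not spell out the individual steps, so the following is only a minimal sketch of the general idea, written under several assumptions. The similarity matrix is row-normalized into random-walk transition probabilities, raised to a small power (a single squaring by default), and points are then grouped greedily when their rows of the powered matrix lie within an L1 distance threshold. The normalization, the greedy grouping rule, the threshold value, and all function names (`row_normalize`, `matrix_power`, `cluster_by_powering`, `block_similarity`) are illustrative choices, not the authors' procedure.

```python
from typing import List

Matrix = List[List[float]]


def row_normalize(similarity: Matrix) -> Matrix:
    """Scale each row to sum to 1, giving random-walk transition probabilities."""
    normalized = []
    for row in similarity:
        total = sum(row)
        normalized.append([value / total for value in row])
    return normalized


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def matrix_power(m: Matrix, exponent: int) -> Matrix:
    """Raise a square matrix to a positive integer power by repeated multiplication."""
    result = m
    for _ in range(exponent - 1):
        result = matmul(result, m)
    return result


def cluster_by_powering(similarity: Matrix, exponent: int = 2,
                        threshold: float = 0.5) -> List[int]:
    """Assumed grouping rule: points share a label when their rows of the
    powered walk matrix are closer than `threshold` in L1 distance."""
    walk = matrix_power(row_normalize(similarity), exponent)
    labels = [-1] * len(walk)
    next_label = 0
    for i, row_i in enumerate(walk):
        if labels[i] != -1:
            continue  # already absorbed into an earlier cluster
        labels[i] = next_label
        for j in range(i + 1, len(walk)):
            if labels[j] == -1:
                distance = sum(abs(x - y) for x, y in zip(row_i, walk[j]))
                if distance < threshold:
                    labels[j] = next_label
        next_label += 1
    return labels


def block_similarity(sizes: List[int], within: float = 0.9,
                     across: float = 0.1) -> Matrix:
    """Hypothetical toy input: high similarity inside blocks, low across them."""
    membership = [c for c, size in enumerate(sizes) for _ in range(size)]
    n = len(membership)
    return [
        [1.0 if i == j else (within if membership[i] == membership[j] else across)
         for j in range(n)]
        for i in range(n)
    ]


labels = cluster_by_powering(block_similarity([3, 3]))
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

In this toy input the two-step walk distributions of points in the same block are nearly identical, while those of points in different blocks differ by more than 1 in L1 distance, so any threshold between those extremes recovers the two blocks. The `exponent` parameter is exposed only so that the effect of powers beyond a single squaring can be explored.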