Recently, spectral clustering (a.k.a. normalized graph cut) techniques have become popular for their ability to find irregularly shaped clusters in data. The input to these methods is a similarity measure between every pair of data points. If the clusters are well separated, the eigenvectors of the similarity matrix can be used to identify the clusters, essentially by identifying groups of points related by transitive similarity relationships. However, these techniques fail when the clusters are noisy and not well separated, or when the scale parameter used to map distances between points to similarities is set incorrectly. Our approach to solving these problems is to introduce a generative probability model that explicitly models noise and can be trained in a maximum-likelihood fashion to estimate the scale parameter. Exact inference is computationally intractable, but we describe tractable, approximate techniques for inference and learning. Interestingly, it turns out that greedy inference and learning in one of our models with a fixed scale parameter is equivalent to spectral clustering. On several data sets, we demonstrate that our method finds better clusters than spectral clustering.
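The baseline pipeline the abstract describes can be sketched as follows: map pairwise distances to similarities with a Gaussian kernel controlled by a scale parameter `sigma` (the parameter whose setting the abstract says these methods are sensitive to), embed the points using eigenvectors of the normalized graph Laplacian, and cluster the embedded rows. This is a minimal normalized-cut-style sketch, not the paper's generative model; the function name, the farthest-point k-means initialization, and the fixed iteration count are illustrative choices, not from the source.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0):
    """Minimal spectral clustering sketch (normalized-cut style).

    sigma is the scale parameter that maps pairwise distances to
    similarities; results are sensitive to this choice, which is one
    of the failure modes the abstract discusses.
    """
    # Pairwise squared distances -> Gaussian similarities.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized graph Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt

    # Embed each point as a row of the k smallest-eigenvalue
    # eigenvectors of L (eigh returns eigenvalues in ascending order),
    # then normalize rows to unit length.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialization, then a few plain
    # k-means iterations on the embedded rows.
    centers = [U[0]]
    for _ in range(1, k):
        d2c = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d2c)])
    centers = np.array(centers)
    for _ in range(20):
        labels = np.argmin(((U[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = U[labels == j].mean(axis=0)
    return labels
```

With two well-separated blobs and a reasonable `sigma`, the embedded rows are nearly piecewise constant across clusters, so the final k-means step recovers the groups; with overlapping, noisy clusters or a poorly chosen `sigma`, this is exactly where the plain method degrades, motivating the generative model proposed in the paper.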