Spectral clustering is a widely used method for organizing data that relies only on pairwise similarity measurements. This makes its application to non-vectorial data straightforward in principle, as long as all pairwise similarities are available. However, in recent years, numerous examples have emerged in which the cost of assessing similarities is substantial or prohibitive. We propose an active learning algorithm for spectral clustering that incrementally measures only those similarities that are most likely to remove uncertainty in an intermediate clustering solution. In many applications, similarities are not only costly to compute but also noisy. We extend our algorithm to maintain running estimates of the true similarities, as well as estimates of their accuracy. Using this information, the algorithm updates only those estimates that are relatively inaccurate and whose update would most likely remove clustering uncertainty. We compare our methods on several datasets, including a realistic example where similarities are expensive and noisy. The results show a significant improvement in performance compared to the alternatives.
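The abstract leaves the query-selection criterion at a high level. The sketch below fills it in with one plausible heuristic in NumPy: in each round, re-embed the data with the current (partially measured) similarity matrix, and query the unmeasured pair whose endpoints have the smallest k-means margin in the spectral embedding, i.e., the most ambiguous cluster assignments. The function names, the margin heuristic, and the 0.5 prior used for unmeasured entries are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def spectral_embedding(W, k):
    """First k eigenvectors of the symmetric normalized Laplacian."""
    d = W.sum(axis=1)
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - D_isqrt @ W @ D_isqrt
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    return vecs[:, :k]

def kmeans(emb, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [emb[0]]
    for _ in range(k - 1):
        d = np.min(((emb[:, None] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(emb[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((emb[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)
    return labels, centers

def active_spectral_clustering(query_sim, n, k, budget, prior=0.5):
    """Query one similarity per round, chosen at the most ambiguous pair
    (illustrative heuristic, not the paper's exact criterion)."""
    W = np.full((n, n), prior)           # unmeasured entries start at a prior
    np.fill_diagonal(W, 1.0)
    measured = np.eye(n, dtype=bool)
    for _ in range(budget):
        emb = spectral_embedding(W, k)
        labels, centers = kmeans(emb, k)
        # per-point margin: distance gap between the two nearest centers;
        # a small margin means the point's assignment is uncertain
        dists = np.sort(np.sqrt(((emb[:, None] - centers[None]) ** 2).sum(-1)), axis=1)
        margin = dists[:, 1] - dists[:, 0]
        best, best_score = None, np.inf
        for i in range(n):
            for j in range(i + 1, n):
                if not measured[i, j] and margin[i] + margin[j] < best_score:
                    best, best_score = (i, j), margin[i] + margin[j]
        if best is None:                 # every pair already measured
            break
        i, j = best
        W[i, j] = W[j, i] = query_sim(i, j)
        measured[i, j] = measured[j, i] = True
    labels, _ = kmeans(spectral_embedding(W, k), k)
    return labels, measured

# demo: two well-separated groups; similarity 1 within a group, 0 across
truth = np.array([0] * 5 + [1] * 5)
labels, measured = active_spectral_clustering(
    lambda i, j: float(truth[i] == truth[j]), n=10, k=2, budget=45)
```

The noisy-similarity extension described above would additionally keep, for each measured entry, a running mean and a variance estimate, and restrict queries to entries that are both inaccurate and influential; that bookkeeping is omitted here for brevity.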