Spectral clustering is a widely used method for organizing data that relies only on pairwise similarity measurements. This makes its application to non-vectorial data straightforward in principle, as long as all pairwise similarities are available. However, in recent years, numerous examples have emerged in which the cost of assessing similarities is substantial or prohibitive. We propose an active learning algorithm for spectral clustering that incrementally measures only those similarities that are most likely to remove uncertainty in an intermediate clustering solution. In many applications, similarities are not only costly to compute but also noisy. We extend our algorithm to maintain running estimates of the true similarities, as well as estimates of their accuracy. Using this information, the algorithm updates only those estimates that are relatively inaccurate and whose update would most likely remove clustering uncertainty. We compare our methods on several datasets, including a realistic example where similarities are expensive and noisy. The results show a significant improvement in performance compared to the alternatives.
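The abstract leaves the query-selection criterion at a high level. The sketch below fills it in with one plausible heuristic in NumPy: in each round, re-embed the data with the current (partially measured) similarity matrix, and query the unmeasured pair whose endpoints have the smallest k-means margin in the spectral embedding, i.e., the most ambiguous cluster assignments. The function names, the margin heuristic, and the 0.5 prior used for unmeasured entries are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def spectral_embedding(W, k):
    """First k eigenvectors of the symmetric normalized Laplacian."""
    d = W.sum(axis=1)
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - D_isqrt @ W @ D_isqrt
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    return vecs[:, :k]

def kmeans(emb, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [emb[0]]
    for _ in range(k - 1):
        d = np.min(((emb[:, None] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(emb[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((emb[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)
    return labels, centers

def active_spectral_clustering(query_sim, n, k, budget, prior=0.5):
    """Query one similarity per round, chosen at the most ambiguous pair
    (illustrative heuristic, not the paper's exact criterion)."""
    W = np.full((n, n), prior)           # unmeasured entries start at a prior
    np.fill_diagonal(W, 1.0)
    measured = np.eye(n, dtype=bool)
    for _ in range(budget):
        emb = spectral_embedding(W, k)
        labels, centers = kmeans(emb, k)
        # per-point margin: distance gap between the two nearest centers;
        # a small margin means the point's assignment is uncertain
        dists = np.sort(np.sqrt(((emb[:, None] - centers[None]) ** 2).sum(-1)), axis=1)
        margin = dists[:, 1] - dists[:, 0]
        best, best_score = None, np.inf
        for i in range(n):
            for j in range(i + 1, n):
                if not measured[i, j] and margin[i] + margin[j] < best_score:
                    best, best_score = (i, j), margin[i] + margin[j]
        if best is None:                 # every pair already measured
            break
        i, j = best
        W[i, j] = W[j, i] = query_sim(i, j)
        measured[i, j] = measured[j, i] = True
    labels, _ = kmeans(spectral_embedding(W, k), k)
    return labels, measured

# demo: two well-separated groups; similarity 1 within a group, 0 across
truth = np.array([0] * 5 + [1] * 5)
labels, measured = active_spectral_clustering(
    lambda i, j: float(truth[i] == truth[j]), n=10, k=2, budget=45)
```

The noisy-similarity extension described above would additionally keep, for each measured entry, a running mean and a variance estimate, and restrict queries to entries that are both inaccurate and influential; that bookkeeping is omitted here for brevity.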