Large-scale text datasets have long eluded a family of particularly elegant and effective clustering methods that exploit pairwise similarities between data points. The obstacle is the prohibitive cost, in both time and space, of operating on a similarity matrix: the state of the art is at best quadratic in time and in space. We present an extremely fast and simple method that also uses all pairwise similarities between data points. Experiments show that it matches previous methods in clustering accuracy while running in linear time and space, without sampling data points or sparsifying the similarity matrix.
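
The abstract does not say how the quadratic cost is avoided. As an illustration only, and not necessarily the authors' method, the sketch below assumes the similarity matrix is an inner-product similarity S = F Fᵀ over a feature matrix F (one row per document). Under that assumption, the product S v can be computed as F (Fᵀ v), which never materializes the n-by-n matrix. The function names, the degree normalization, and the iteration count are hypothetical choices made for this example.

```python
import numpy as np


def implicit_similarity_matvec(features, v):
    """Return (F F^T) v without forming the n-by-n matrix F F^T.

    Cost is proportional to the size of F, not to n squared.
    """
    return features @ (features.T @ v)


def power_iteration_embedding(features, num_iters=20, seed=0):
    """Run a degree-normalized power iteration on the implicit similarity matrix.

    Assumes every row of F F^T has a positive sum (for example, nonnegative
    features with no all-zero rows); otherwise the division below fails.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]

    # Row sums of S = F F^T, obtained implicitly as F (F^T 1).
    degrees = implicit_similarity_matvec(features, np.ones(n))

    # Random nonnegative start vector, normalized to unit L1 norm.
    v = rng.random(n)
    v /= np.abs(v).sum()

    for _ in range(num_iters):
        # One step of v <- D^{-1} S v, using only products with F and F^T.
        v = implicit_similarity_matvec(features, v) / degrees
        v /= np.abs(v).sum()
    return v
```

The resulting one-dimensional embedding could then be passed to any standard clustering routine such as k-means. Whether this matches the paper's actual procedure cannot be determined from the abstract alone.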