Recently, a large amount of work has been devoted to the study of spectral clustering, a powerful unsupervised classification method. This paper contributes to both its foundations and its applications to text classification. Departing from the mainstream, which is concerned with hard membership, we study the extension of spectral clustering to soft membership (probabilistic, EM-style) assignments. One of its key features is that it avoids the complexity gap of hard membership. We apply this theory to a challenging problem, text clustering for languages having permeable borders, via a novel construction of Markov chains from corpora. Experiments with readily available code clearly display the potential of the method, which yields a visually appealing soft distinction of the languages that may together make up a whole corpus.