Towards subjectifying text clustering

Authors:
Sajib Dasgupta; Vincent Ng
Affiliations:
Human Language Technology Research Institute, Richardson, TX, USA; Human Language Technology Research Institute, Richardson, TX, USA
Venue:
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Year:
2010

Abstract

Although it is common practice to produce only a single clustering of a dataset, in many cases text documents can be clustered along different dimensions. Unfortunately, traditional text clustering algorithms not only fail to produce multiple clusterings of a dataset, but the single clustering they do produce may not be the one that the user desires. In this paper, we propose a simple active clustering algorithm that is capable of producing multiple clusterings of the same data according to user interest. In comparison to previous work on feedback-oriented clustering, the amount of user feedback required by our algorithm is minimal. In fact, the feedback turns out to be as simple as a cursory look at a list of words. Experimental results are promising: our system is able to generate clusterings along the user-specified dimensions with reasonable accuracies on several challenging text classification tasks, thus providing suggestive evidence that our approach is viable.
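
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: generate several candidate clusterings of the same documents, summarize each by a short word list, and let the user pick one by glancing at the words. Everything here is an assumption made for illustration and not the authors' method: candidates come from a sign split on the top singular vectors of the centered document-term matrix, word lists come from differences in mean term frequency, and the helper names (`doc_term_matrix`, `candidate_clusterings`, `top_words`, `pick_by_seed_word`) and the toy documents are hypothetical.

```python
import numpy as np

def doc_term_matrix(docs):
    """Return (count matrix, vocabulary) for whitespace-tokenized documents."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d.split():
            X[i, index[w]] += 1
    return X, vocab

def candidate_clusterings(X, k):
    """Assumed candidate generator: one 2-way split per top-k singular direction."""
    Xc = X - X.mean(axis=0)                      # center each term column
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    # Split documents by the sign of their coordinate on each direction.
    return [(U[:, j] > 0).astype(int) for j in range(k)]

def top_words(X, vocab, labels, n):
    """For each side of a split, the n words most over-represented on that side."""
    diff = X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0)
    order = np.argsort(diff)
    side0 = [vocab[j] for j in order[:n]]        # most typical of cluster 0
    side1 = [vocab[j] for j in order[::-1][:n]]  # most typical of cluster 1
    return side0, side1

def pick_by_seed_word(X, vocab, candidates, seed, n=1):
    """Stand-in for the user's glance: first candidate whose word lists mention `seed`."""
    for labels in candidates:
        side0, side1 = top_words(X, vocab, labels, n)
        if seed in side0 or seed in side1:
            return labels
    return None

# Toy corpus that can be split by topic (book/movie) or by sentiment (good/bad).
docs = ["book book good", "book book bad", "movie movie good", "movie movie bad"]
X, vocab = doc_term_matrix(docs)
candidates = candidate_clusterings(X, k=2)

for labels in candidates:
    print(labels.tolist(), top_words(X, vocab, labels, n=1))

by_topic = pick_by_seed_word(X, vocab, candidates, "movie")
by_sentiment = pick_by_seed_word(X, vocab, candidates, "good")
```

In this sketch the only "feedback" is the choice of which word list matches the user's intent, which mirrors the abstract's point that a quick look at a list of words can be enough to select the desired clustering dimension.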