Document clustering is traditionally tackled from the perspective of grouping documents that are topically similar. However, many other criteria for clustering documents can be considered: for example, the documents' genre or the author's mood. We propose an interactive scheme for clustering document collections based on any criterion of the user's preference. The user takes an active role in the clustering process: first, she chooses the types of features suited to the underlying task, leading to a task-specific document representation. She can then provide examples of features when such examples are readily apparent; e.g., when clustering by the author's sentiment, words like 'perfect', 'mediocre', and 'awful' are intuitively good features. The algorithm proceeds iteratively, and the user can fix errors made by the clustering system at the end of each iteration. This interactive clustering method demonstrates excellent results on clustering by sentiment, substantially outperforming an SVM trained on a large amount of labeled data. Even if features are not provided because they are not intuitively obvious to the user (e.g., what would be good features for clustering by genre using part-of-speech trigrams?), our multi-modal clustering method performs significantly better than k-means and Latent Dirichlet Allocation (LDA).
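The interaction pattern described above (seed features from the user, iterative re-clustering, user corrections after each iteration) can be illustrated with a toy sketch. This is not the multi-modal clustering algorithm of the paper, whose details the abstract does not give. It swaps in a simple word-profile scorer, and every name in it (`seeded_interactive_clustering`, `seed_words`, `corrections`) is a hypothetical chosen for illustration.

```python
from collections import Counter

def tokenize(text):
    # Lowercase whitespace tokens with surrounding punctuation stripped.
    tokens = (w.strip(".,!?;:'\"").lower() for w in text.split())
    return [t for t in tokens if t]

def seeded_interactive_clustering(docs, seed_words, corrections=(), rounds=3):
    """Toy stand-in for an interactive clustering loop (hypothetical API).

    docs        : list of document strings
    seed_words  : dict mapping cluster label -> set of user-provided words
    corrections : corrections[r] is a dict {doc_index: label} that the
                  user supplies at the end of iteration r
    Returns one label per document, or None if no cluster matched.
    """
    bags = [Counter(tokenize(d)) for d in docs]
    # Cluster profiles start from the user's seed features only.
    profiles = {c: Counter({w: 1 for w in ws}) for c, ws in seed_words.items()}
    fixed = {}          # user-corrected documents stay where the user put them
    assignment = []
    for r in range(rounds):
        assignment = []
        for i, bag in enumerate(bags):
            if i in fixed:
                assignment.append(fixed[i])
                continue
            # Score = overlap with the cluster profile, normalized by profile size.
            scores = {
                c: sum(n * p[w] for w, n in bag.items()) / (sum(p.values()) or 1)
                for c, p in profiles.items()
            }
            best = max(scores, key=scores.get)
            assignment.append(best if scores[best] > 0 else None)
        # The user fixes errors at the end of the iteration.
        if r < len(corrections):
            fixed.update(corrections[r])
            for i, label in corrections[r].items():
                assignment[i] = label
        # Re-estimate profiles from the seeds plus the currently assigned documents.
        profiles = {c: Counter({w: 1 for w in ws}) for c, ws in seed_words.items()}
        for i, label in enumerate(assignment):
            if label is not None:
                profiles[label].update(bags[i])
    return assignment

docs = [
    "A perfect film, simply perfect",
    "An awful plot and awful acting",
    "The acting was wonderful",
    "Wonderful music",
]
seeds = {"pos": {"perfect"}, "neg": {"awful"}}
# After the first iteration the user moves document 2 into "pos".
labels = seeded_interactive_clustering(docs, seeds, corrections=[{2: "pos"}])
print(labels)
```

In this toy run, the single correction lets the word "wonderful" enter the positive profile, so the last document follows it in the next iteration. Without the correction, the shared word "acting" pulls the third document toward the negative cluster.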