Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Semi-supervised Clustering by Seeding
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Intelligent clustering with instance-level constraints
Intelligent clustering with instance-level constraints
Information-theoretic co-clustering
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
A probabilistic framework for semi-supervised clustering
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Incorporating prior knowledge with weighted margin support vector machines
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Document clustering with prior knowledge
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Text clustering with extended user feedback
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Pattern Recognition and Machine Learning (Information Science and Statistics)
Pattern Recognition and Machine Learning (Information Science and Statistics)
Enhancing semi-supervised clustering: a feature projection perspective
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Learning from labeled features using generalized expectation criteria
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Constrained locally weighted clustering
Proceedings of the VLDB Endowment
An active learning framework for semi-supervised document clustering with language modeling
Data & Knowledge Engineering
Document-Word Co-regularization for Semi-supervised Sentiment Analysis
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Sentiment analysis of blogs by combining lexical knowledge with text classification
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Text classification by labeling words
AAAI'04 Proceedings of the 19th national conference on Artifical intelligence
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
A unified approach to active dual supervision for labeling features and examples
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Interactive feature selection for document clustering
Proceedings of the 2011 ACM Symposium on Applied Computing
Semi-supervised document clustering with dual supervision through seeding
Proceedings of the 27th Annual ACM Symposium on Applied Computing
Enhancing semi-supervised document clustering with feature supervision
Proceedings of the 27th Annual ACM Symposium on Applied Computing
Personalized document clustering with dual supervision
Proceedings of the 2012 ACM symposium on Document engineering
Personalized document clustering with dual supervision
Proceedings of the 2012 ACM symposium on Document engineering
Hi-index | 0.00 |
Semi-supervised clustering algorithms for general problems use a small amount of labeled instances or pairwise instance constraints to aid the unsupervised clustering. However, user supervision can also be provided in alternative forms for document clustering, such as labeling a feature by associating it with a document or a cluster. Besides labeled documents, this paper also explores labeled features to generate cluster seeds to seed the unsupervised clustering. In this paper, we present a unified framework in which one can use both labeled documents and features in terms of seeding clusters and refine this information using intermediate clusters. We introduce two methods of using labeled features to generate cluster seeds. Experimental results on several real-world data sets demonstrate that constraining the clustering by both documents and features seeding can significantly improve document clustering performance over random seeding and document only seeding. We also demonstrate that the clustering performance can be improved even with only a fraction of clusters being seeded compared to unsupervised clustering.