This paper investigates incorporating incremental user feedback and a small number of sample documents, provided for some but not necessarily all clusters, into text clustering. To model each cluster, we use a local weight metric that reflects the importance of each feature for that particular cluster. The local weight metric is learned from both the unlabeled data and the constraints generated automatically from user feedback and sample documents. More precise constraints improve the quality of the local metric, which in turn enhances clustering performance. We have conducted extensive experiments on real-world news documents. The results demonstrate that user feedback combined with local metric learning can dramatically improve clustering performance.
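To make the idea of a per-cluster ("local") weight metric concrete, the following is a minimal sketch and not the learning procedure of the paper, whose details the abstract does not give. It assumes documents are dense feature vectors, that each cluster starts from a few sample documents, and that feature weights come from a simple inverse-dispersion heuristic. The function names `weighted_sq_dist`, `local_weights`, and `assign`, and the smoothing constant `eps`, are hypothetical.

```python
def weighted_sq_dist(doc, center, weights):
    """Squared Euclidean distance using one cluster's feature weights."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, doc, center))


def centroid(members):
    """Mean vector of the documents currently assigned to a cluster."""
    dims = len(members[0])
    return [sum(doc[j] for doc in members) / len(members) for j in range(dims)]


def local_weights(members, center, eps=1e-6):
    """Assumed heuristic (not from the paper): a feature on which the
    cluster's documents agree closely gets a large weight. Weights are
    normalized to sum to 1."""
    dims = len(center)
    dispersion = [
        sum((doc[j] - center[j]) ** 2 for doc in members) / len(members) + eps
        for j in range(dims)
    ]
    inverse = [1.0 / d for d in dispersion]
    total = sum(inverse)
    return [v / total for v in inverse]


def assign(doc, centers, weights):
    """Index of the cluster whose locally weighted distance is smallest."""
    dists = [weighted_sq_dist(doc, c, w) for c, w in zip(centers, weights)]
    return dists.index(min(dists))


# Hypothetical sample documents for two clusters (two features each).
# Cluster 0 is tight on feature 0; cluster 1 is tight on feature 1.
samples = [
    [[1.0, 0.0], [1.0, 4.0]],
    [[0.0, 6.0], [4.0, 6.0]],
]
centers = [centroid(m) for m in samples]
weights = [local_weights(m, c) for m, c in zip(samples, centers)]
uniform = [[0.5, 0.5] for _ in samples]

doc = [1.0, 5.5]
local_choice = assign(doc, centers, weights)
global_choice = assign(doc, centers, uniform)
```

In this toy setup the document matches cluster 0 exactly on the feature that cluster 0 cares about, so the local metric places it there, while a uniform metric places it in cluster 1. In a feedback loop of the kind the abstract describes, each round of user feedback would change the member sets, and the weights would be recomputed from them.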