This paper investigates a framework for semi-supervised document clustering that actively selects informative document pairs on which to obtain user feedback. We design a gain-directed document pair selection method that measures how much can be learned by revealing the judgments of the selected pairs. Estimates of term co-occurrence probabilities serve as a clue for finding informative document pairs, and the same probabilities are used in the clustering process to capture term-to-term dependence relationships. Each cluster is represented by a language model. Extensive experiments on several real-world corpora demonstrate that the proposed framework is effective.
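
To make the idea of representing each cluster by a language model concrete, here is a minimal sketch in Python. It is not the paper's model: it assumes a plain unigram model with Jelinek-Mercer smoothing against a collection-wide background model, and it ignores the term co-occurrence probabilities and the active pair selection described above. The function names, the mixing weight `lam`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def build_counts(docs):
    """Aggregate term counts over a list of tokenized documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc)
    return counts


def log_likelihood(doc, cluster_counts, background_counts, lam=0.5):
    """Log-probability of a document under a smoothed unigram cluster model.

    Assumption: Jelinek-Mercer smoothing, i.e. a linear mix of the
    cluster estimate and the background (whole-collection) estimate.
    """
    cluster_total = sum(cluster_counts.values())
    background_total = sum(background_counts.values())
    score = 0.0
    for term in doc:
        p_background = background_counts[term] / background_total
        if p_background == 0.0:
            continue  # skip terms never seen in the collection
        p_cluster = cluster_counts[term] / cluster_total if cluster_total else 0.0
        score += math.log(lam * p_cluster + (1.0 - lam) * p_background)
    return score


def assign(doc, cluster_models, background_counts, lam=0.5):
    """Return the index of the cluster whose model best explains the document."""
    return max(
        range(len(cluster_models)),
        key=lambda k: log_likelihood(doc, cluster_models[k], background_counts, lam),
    )


# Toy data (hypothetical): two clusters of tokenized documents.
sports = [["ball", "goal", "team"], ["team", "match", "goal"]]
finance = [["stock", "market", "price"], ["market", "trade", "price"]]

cluster_models = [build_counts(sports), build_counts(finance)]
background = build_counts(sports + finance)

sports_like = assign(["goal", "team", "price"], cluster_models, background)
finance_like = assign(["market", "price"], cluster_models, background)
```

In this sketch the smoothing keeps a document from receiving zero probability just because one of its terms is absent from a cluster. In a semi-supervised setting, user judgments on document pairs would additionally constrain which assignments are allowed; that part is left out here.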