Text clustering is most commonly treated as a fully automated task without user feedback. However, a variety of researchers have explored mixed-initiative clustering methods which allow a user to interact with and advise the clustering algorithm. This mixed-initiative approach is especially attractive for text clustering tasks where the user is trying to organize a corpus of documents into clusters for some particular purpose (e.g., clustering their email into folders that reflect various activities in which they are involved). This paper introduces a new approach to mixed-initiative clustering that handles several natural types of user feedback. We first introduce a new probabilistic generative model for text clustering (the SpeClustering model) and show that it outperforms the commonly used mixture of multinomials clustering model, even when used in fully autonomous mode with no user input. We then describe how to incorporate four distinct types of user feedback into the clustering algorithm, and provide experimental evidence showing substantial improvements in text clustering when this user feedback is incorporated.
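For readers unfamiliar with the baseline named above, the following is a minimal sketch of fully autonomous clustering with a mixture of multinomials, fitted by EM. It is not the SpeClustering model and not the authors' implementation. The function name `fit_multinomial_mixture`, the add-alpha smoothing, the random soft initialisation, and the toy count matrix are all assumptions made for illustration.

```python
import math
import random


def fit_multinomial_mixture(counts, n_clusters, n_iters=50, alpha=1.0, seed=0):
    """EM for a mixture of multinomials over word-count vectors.

    counts: list of documents, each a list of word counts of equal length.
    alpha:  add-alpha smoothing for the word distributions (an assumption).
    n_iters must be at least 1.
    Returns (priors, word_probs, responsibilities).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    # Random soft initialisation of p(cluster | document).
    resp = []
    for _ in range(n_docs):
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(n_iters):
        # M-step: re-estimate cluster priors (smoothed so none reaches zero).
        priors = [
            (1.0 + sum(resp[d][k] for d in range(n_docs))) / (n_clusters + n_docs)
            for k in range(n_clusters)
        ]
        # M-step: re-estimate per-cluster word distributions.
        word_probs = []
        for k in range(n_clusters):
            weighted = [
                alpha + sum(resp[d][k] * counts[d][w] for d in range(n_docs))
                for w in range(n_words)
            ]
            total = sum(weighted)
            word_probs.append([x / total for x in weighted])

        # E-step: posterior over clusters for each document, in log space.
        for d in range(n_docs):
            log_post = [
                math.log(priors[k])
                + sum(c * math.log(word_probs[k][w]) for w, c in enumerate(counts[d]))
                for k in range(n_clusters)
            ]
            peak = max(log_post)
            unnorm = [math.exp(lp - peak) for lp in log_post]
            norm = sum(unnorm)
            resp[d] = [u / norm for u in unnorm]

    return priors, word_probs, resp


# Toy corpus (hypothetical): rows are documents, columns are vocabulary words.
toy_counts = [[5, 4, 0, 0], [6, 5, 0, 0], [0, 0, 5, 6], [0, 1, 4, 5]]
priors, word_probs, resp = fit_multinomial_mixture(toy_counts, n_clusters=2)
# Hard cluster label per document: the cluster with the highest posterior.
labels = [max(range(2), key=lambda k: row[k]) for row in resp]
```

Each document is assigned to the cluster with the highest posterior. User feedback of the kinds the abstract describes would have to enter as constraints or priors on these estimates; how the paper does this is not specified in the abstract and is not shown here.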