Machine Learning
Constrained K-means Clustering with Background Knowledge
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
The Journal of Machine Learning Research
RCV1: A New Benchmark Collection for Text Categorization Research
The Journal of Machine Learning Research
A bootstrapping method for learning semantic lexicons using extraction pattern contexts
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Topic modeling: beyond bag-of-words
ICML '06 Proceedings of the 23rd international conference on Machine learning
Prototype-driven learning for sequence models
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Comparing clusterings---an information based distance
Journal of Multivariate Analysis
Constrained Clustering: Advances in Algorithms, Theory, and Applications
Constrained Clustering: Advances in Algorithms, Theory, and Applications
Incorporating domain knowledge into topic modeling via Dirichlet Forest priors
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Latent Dirichlet Allocation with topic-in-set knowledge
SemiSupLearn '09 Proceedings of the NAACL HLT 2009 Workshop on Semi-Supervised Learning for Natural Language Processing
Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1 - Volume 1
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Incorporating word correlation into tag-topic model for semantic knowledge acquisition
Proceedings of the 21st ACM international conference on Information and knowledge management
Discovering coherent topics using general knowledge
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Leveraging multi-domain prior knowledge in topic models
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Hi-index | 0.00 |
Topic models have great potential for helping users understand document corpora. This potential is stymied by their purely unsupervised nature, which often leads to topics that are neither entirely meaningful nor effective in extrinsic tasks (Chang et al., 2009). We propose a simple and effective way to guide topic models to learn topics of specific interest to a user. We achieve this by providing sets of seed words that a user believes are representative of the underlying topics in a corpus. Our model uses these seeds to improve both topic-word distributions (by biasing topics to produce appropriate seed words) and to improve document-topic distributions (by biasing documents to select topics related to the seed words they contain). Extrinsic evaluation on a document clustering task reveals a significant improvement when using seed information, even over other models that use seed information naïvely.