Topic models have been widely used to identify topics in text corpora. However, purely unsupervised models often produce topics that are not comprehensible to users in real applications. In recent years, a number of knowledge-based models have been proposed that allow the user to input prior knowledge of the domain to produce more coherent and meaningful topics. In this paper, we go one step further and study how prior knowledge from other domains can be exploited to help topic modeling in a new domain. This problem setting is important from both the application and the learning perspectives, because knowledge is inherently cumulative: we human beings acquire knowledge gradually and use old knowledge to help solve new problems. Existing models have major difficulties achieving this objective. In this paper, we propose a novel knowledge-based model, called MDK-LDA, which is capable of using prior knowledge from multiple domains. Our evaluation results demonstrate its effectiveness.
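As a rough illustration of the general idea behind knowledge-based topic models (this is a minimal sketch, not the MDK-LDA formulation itself), one common way to inject prior knowledge is to boost the topic-word Dirichlet prior for sets of words that earlier domains suggest belong together, so that inference prefers to place them in the same topic. The seed sets, function name, and vocabulary below are all hypothetical:

```python
import numpy as np

def seeded_topic_word_prior(vocab, seed_sets, base=0.01, boost=1.0):
    """Build an asymmetric topic-word Dirichlet prior matrix (K x V).

    Each seed set (assumed prior knowledge, e.g. word groupings carried
    over from previously modeled domains) is assigned to one topic, and
    that topic's prior mass on the seed words is boosted. A sampler or
    variational optimizer using this prior is then biased toward
    grouping the seed words together.
    """
    n_topics = len(seed_sets)
    word_idx = {w: i for i, w in enumerate(vocab)}
    eta = np.full((n_topics, len(vocab)), base)  # symmetric baseline
    for k, seeds in enumerate(seed_sets):
        for w in seeds:
            if w in word_idx:               # ignore out-of-vocabulary seeds
                eta[k, word_idx[w]] += boost
    return eta

# Hypothetical review-domain vocabulary and two seed sets.
vocab = ["battery", "life", "charge", "screen", "display", "bright"]
seeds = [{"battery", "charge"}, {"screen", "display"}]
eta = seeded_topic_word_prior(vocab, seeds)
```

The resulting matrix could be passed as an asymmetric prior to an LDA implementation that accepts per-topic, per-word priors; richer knowledge-based models go further by restructuring the generative process itself rather than only reweighting the prior.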