A natural evaluation metric for statistical topic models is the probability of held-out documents given a trained model. While exact computation of this probability is intractable, several estimators have been used in the topic modeling literature, including the harmonic mean method and the empirical likelihood method. In this paper, we demonstrate experimentally that these commonly used methods are unlikely to estimate the probability of held-out documents accurately, and we propose two alternative methods that are both accurate and efficient.
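For orientation, the harmonic mean method named above can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes a sampler has already produced log-likelihoods log p(w | z_s) of a held-out document under S posterior samples of the topic assignments, and the function name `log_harmonic_mean` is hypothetical. The estimate is the harmonic mean of the sampled likelihoods, computed in log space for numerical stability.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of likelihoods given as log values.

    Assumption: each entry is log p(w | z_s) for one posterior sample z_s.
    The harmonic mean estimate is S / sum_s(1 / p(w | z_s)), so in log
    space it equals log(S) - logsumexp(-log_likelihoods).
    """
    if not log_likelihoods:
        raise ValueError("at least one sample is required")
    negated = [-value for value in log_likelihoods]
    # Subtract the maximum before exponentiating to avoid overflow.
    peak = max(negated)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in negated))
    return math.log(len(negated)) - log_sum


if __name__ == "__main__":
    # Hypothetical per-sample log-likelihoods, for illustration only.
    samples = [-1052.3, -1049.8, -1060.1, -1051.0]
    print(log_harmonic_mean(samples))
```

Because the sum is dominated by the samples with the lowest likelihood, a single poorly fitting sample can pull the estimate down sharply, which is one way to see why such an estimator may be unreliable in practice.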