Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents

  • Authors:
  • Iulian Pruteanu-Malinici, Lu Ren, John Paisley, Eric Wang, Lawrence Carin

  • Affiliations:
  • Duke University, Durham (all authors)

  • Venue:
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Year:
  • 2010

Abstract

We consider the problem of inferring and modeling topics in a sequence of documents with known publication dates. The documents at a given time are each characterized by a topic, and the topics are drawn from a mixture model. The proposed model infers the change in the topic mixture weights as a function of time. The details of this general framework may take different forms, depending on the specifics of the model. For the examples considered here, we examine base measures constructed from independent multinomial-Dirichlet measures for the representation of topic-dependent word counts. The form of the hierarchical model allows efficient variational Bayesian inference, of interest for large-scale problems. We demonstrate results and make comparisons to the model when the dynamic character is removed, and also compare to latent Dirichlet allocation (LDA) and Topics over Time (TOT). We consider a database of Neural Information Processing Systems papers as well as the US Presidential State of the Union addresses from 1790 to 2008.
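
The abstract describes a generative structure in which each document carries a single topic, topics are multinomial distributions over words with Dirichlet priors, and the topic mixture weights evolve with the document time stamps. The sketch below is a minimal toy illustration of that structure only; it is not the paper's hierarchical construction or its variational Bayesian inference, and the random-walk drift on the mixture weights and all parameter values are assumptions chosen for illustration.

```python
# Toy generative sketch of a time-varying topic mixture with
# multinomial-Dirichlet topics. Illustrative only: the drift scheme
# and all parameter values are assumptions, not the paper's model.
import numpy as np

rng = np.random.default_rng(0)

K, V = 5, 100            # assumed number of topics and vocabulary size
T, docs_per_t = 20, 10   # assumed number of time stamps and documents per time
words_per_doc = 50

# Topic-dependent word distributions: multinomial parameters drawn from
# a symmetric Dirichlet prior (the multinomial-Dirichlet base measure).
phi = rng.dirichlet(np.full(V, 0.1), size=K)           # shape (K, V)

# Topic mixture weights that change over time: here a simple random walk
# on unnormalized log-weights (an assumption made for illustration).
log_w = rng.normal(size=K)
corpus = []  # list of (time_stamp, word_count_vector) pairs
for t in range(T):
    log_w = log_w + 0.3 * rng.normal(size=K)           # drift step
    weights = np.exp(log_w) / np.exp(log_w).sum()      # mixture weights at time t
    for _ in range(docs_per_t):
        z = rng.choice(K, p=weights)                   # each document gets one topic
        counts = rng.multinomial(words_per_doc, phi[z])
        corpus.append((t, counts))

print(len(corpus), "documents generated over", T, "time stamps")
```

In this sketch the change in topic prevalence over time is carried entirely by the per-time mixture weights, mirroring the abstract's statement that the model infers the topic mixture weights as a function of time, while the topic-dependent word distributions themselves remain fixed.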