Hierarchical modeling and reasoning are fundamental in machine intelligence, and here the two-parameter Poisson-Dirichlet process (PDP) plays an important role. The most popular MCMC sampling algorithm for the hierarchical PDP and the hierarchical Dirichlet process performs incremental sampling based on the Chinese restaurant metaphor, which originates from the Chinese restaurant process (CRP). In this paper, using the same metaphor, we propose a new table representation for hierarchical PDPs by introducing an auxiliary latent variable, called the table indicator, that records which customer takes responsibility for starting a new table. The new representation preserves full exchangeability, an essential condition for a correct Gibbs sampling algorithm. Based on this representation, we develop a block Gibbs sampling algorithm that jointly samples a data item and its table contribution. We test the algorithm on the hierarchical Dirichlet process variant of latent Dirichlet allocation (HDP-LDA) developed by Teh, Jordan, Beal and Blei. Experimental results show that the proposed algorithm outperforms their "posterior sampling by direct assignment" algorithm in both out-of-sample perplexity and convergence speed. The representation can be used with many other hierarchical PDP models.
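To make the Chinese restaurant metaphor concrete, the following is a minimal sketch of the two-parameter (Pitman-Yor) CRP seating rule, which also records which customer opened each table, in the spirit of the table-indicator variable described above. This is an illustrative sketch, not the paper's block Gibbs sampler; the function name and parameters are our own choices.

```python
import random

def crp_seating(num_customers, discount=0.5, concentration=1.0, seed=0):
    """Seat customers by the two-parameter (Pitman-Yor) Chinese
    restaurant process.

    Customer i joins existing table k with probability proportional to
    (n_k - discount), or opens a new table with probability proportional
    to (concentration + discount * K), where K is the current number of
    tables. Returns the per-table counts and, for each table, the index
    of the customer who opened it (an illustration of the table-indicator
    idea, not the paper's exact construction).
    """
    rng = random.Random(seed)
    counts = []    # counts[k] = number of customers at table k
    openers = []   # openers[k] = customer who started table k
    for i in range(num_customers):
        total = sum(counts)
        # Weights for each existing table, plus one slot for a new table;
        # they sum to total + concentration, so no renormalization needed.
        weights = [c - discount for c in counts]
        weights.append(concentration + discount * len(counts))
        r = rng.random() * (total + concentration)
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r < acc:
                break
        if k == len(counts):
            # Customer i opens a new table and is recorded as its opener.
            counts.append(1)
            openers.append(i)
        else:
            counts[k] += 1
    return counts, openers
```

In a hierarchical sampler, resampling a customer requires knowing whether removing it should also remove a table; tracking table openers (rather than only table counts) is what restores the exchangeability the abstract emphasizes.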