Hierarchical modeling and reasoning are fundamental in machine intelligence, and here the two-parameter Poisson-Dirichlet process (PDP) plays an important role. The most popular MCMC sampling algorithm for the hierarchical PDP and the hierarchical Dirichlet process performs incremental sampling based on the Chinese restaurant metaphor, which originates from the Chinese restaurant process (CRP). In this paper, using the same metaphor, we propose a new table representation for hierarchical PDPs by introducing an auxiliary latent variable, called the table indicator, that records which customer takes responsibility for starting a new table. The new representation preserves full exchangeability, an essential condition for a correct Gibbs sampling algorithm. Based on this representation, we develop a block Gibbs sampling algorithm that jointly samples a data item and its table contribution. We test the algorithm on the hierarchical Dirichlet process variant of latent Dirichlet allocation (HDP-LDA) developed by Teh, Jordan, Beal and Blei. Experimental results show that the proposed algorithm outperforms their "posterior sampling by direct assignment" algorithm in both out-of-sample perplexity and convergence speed. The representation can be used with many other hierarchical PDP models.
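To make the Chinese restaurant metaphor concrete, the following is a minimal sketch of the two-parameter (Pitman-Yor) CRP seating rule, which also records which customer opened each table, in the spirit of the table-indicator variable described above. This is an illustrative sketch, not the paper's block Gibbs sampler; the function name and parameters are our own choices.

```python
import random

def crp_seating(num_customers, discount=0.5, concentration=1.0, seed=0):
    """Seat customers by the two-parameter (Pitman-Yor) Chinese
    restaurant process.

    Customer i joins existing table k with probability proportional to
    (n_k - discount), or opens a new table with probability proportional
    to (concentration + discount * K), where K is the current number of
    tables. Returns the per-table counts and, for each table, the index
    of the customer who opened it (an illustration of the table-indicator
    idea, not the paper's exact construction).
    """
    rng = random.Random(seed)
    counts = []    # counts[k] = number of customers at table k
    openers = []   # openers[k] = customer who started table k
    for i in range(num_customers):
        total = sum(counts)
        # Weights for each existing table, plus one slot for a new table;
        # they sum to total + concentration, so no renormalization needed.
        weights = [c - discount for c in counts]
        weights.append(concentration + discount * len(counts))
        r = rng.random() * (total + concentration)
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w
            if r < acc:
                break
        if k == len(counts):
            # Customer i opens a new table and is recorded as its opener.
            counts.append(1)
            openers.append(i)
        else:
            counts[k] += 1
    return counts, openers
```

In a hierarchical sampler, resampling a customer requires knowing whether removing it should also remove a table; tracking table openers (rather than only table counts) is what restores the exchangeability the abstract emphasizes.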