Estimating Likelihoods for Topic Models

  • Authors:
  • Wray Buntine

  • Affiliations:
  • NICTA and Australian National University, Canberra, Australia 2601

  • Venue:
  • ACML '09 Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Topic models are a discrete analogue to principle component analysis and independent component analysis that model topic at the word level within a document. They have many variants such as NMF, PLSI and LDA, and are used in many fields such as genetics, text and the web, image analysis and recommender systems. However, only recently have reasonable methods for estimating the likelihood of unseen documents, for instance to perform testing or model comparison, become available. This paper explores a number of recent methods, and improves their theory, performance, and testing.