An n-gram topic model for time-stamped documents

  • Authors:
  • Shoaib Jameel;Wai Lam

  • Affiliations:
  • Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong;Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong

  • Venue:
  • ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
  • Year:
  • 2013

Abstract

This paper presents a topic model that captures the temporal dynamics of text data along with topical phrases. Previous approaches have relied on the bag-of-words assumption to model temporal dynamics in a corpus, which has resulted in inferior performance and less interpretable topics. Our topic model not only captures how topic structure changes over time but also maintains important contextual information in the text data. Finding topical n-grams, when possible based on context, instead of always presenting unigrams in topics removes many of the ambiguities that individual words may carry. We derive a collapsed Gibbs sampler for posterior inference. Our experimental results show an improvement over the current state-of-the-art topics over time model.