Automatic labelling of topic models

  • Authors:
  • Jey Han Lau; Karl Grieser; David Newman; Timothy Baldwin

  • Affiliations:
  • NICTA Victoria Research Laboratory and University of Melbourne; University of Melbourne; NICTA Victoria Research Laboratory and University of California Irvine; NICTA Victoria Research Laboratory and University of Melbourne

  • Venue:
  • HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
  • Year:
  • 2011

Abstract

We propose a method for automatically labelling topics learned via LDA topic models. We generate our label candidate set from the top-ranking topic terms, titles of Wikipedia articles containing the top-ranking topic terms, and sub-phrases extracted from the Wikipedia article titles. We rank the label candidates using a combination of association measures and lexical features, optionally fed into a supervised ranking model. Our method is shown to perform strongly over four independent sets of topics, significantly better than a benchmark method.
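The candidate-generation and ranking pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which association measures or lexical features are used, so document-level pointwise mutual information (PMI) stands in here as one assumed association measure, and the supervised ranking step is omitted. The function names, the toy title list, and the toy reference corpus are all hypothetical.

```python
import math


def sub_phrases(title):
    """Return every contiguous word n-gram of a title, including the full title."""
    words = title.lower().split()
    return {
        " ".join(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, len(words) + 1)
    }


def candidate_labels(topic_terms, titles):
    """Collect topic terms, titles containing a topic term, and their sub-phrases."""
    terms = {t.lower() for t in topic_terms}
    candidates = set(terms)
    for title in titles:
        if terms & set(title.lower().split()):
            # sub_phrases() includes the full title as its longest n-gram
            candidates |= sub_phrases(title)
    return candidates


def contains(doc, phrase):
    """Whole-word phrase match on a whitespace-tokenised, lowercase document."""
    return f" {phrase} " in f" {doc} "


def pmi(x, y, docs):
    """Document-level PMI of x and y; assumed here as the association measure."""
    n = len(docs)
    cx = sum(contains(d, x) for d in docs)
    cy = sum(contains(d, y) for d in docs)
    cxy = sum(contains(d, x) and contains(d, y) for d in docs)
    if cxy == 0:
        return 0.0  # simplification: treat "never co-occur" as no association
    return math.log(cxy * n / (cx * cy))


def rank_labels(candidates, topic_terms, docs):
    """Rank candidates by mean PMI with the topic terms, highest first."""
    scored = [
        (c, sum(pmi(c, t, docs) for t in topic_terms) / len(topic_terms))
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy data (hypothetical): top topic terms, article titles, reference corpus.
topic_terms = ["stock", "market", "trade"]
titles = ["Stock market", "New York Stock Exchange", "Music of Ireland"]
docs = [
    "the stock market fell as investors trade shares",
    "stock exchange trade volume rose on the market",
    "the new york stock exchange opened",
    "traditional music of ireland",
]

for label, score in rank_labels(candidate_labels(topic_terms, titles), topic_terms, docs)[:5]:
    print(f"{score:.3f}  {label}")
```

In this sketch a title qualifies only if it shares a whole word with the topic terms, so "Music of Ireland" contributes no candidates. A fuller version would combine several association measures with lexical features and could feed them to a learned ranker, as the abstract indicates.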