Document-topic hierarchies from document graphs

  • Authors:
  • Tim Weninger; Yonatan Bisk; Jiawei Han

  • Affiliations:
  • University of Illinois Urbana-Champaign, Urbana, IL, USA; University of Illinois Urbana-Champaign, Urbana, IL, USA; University of Illinois Urbana-Champaign, Urbana, IL, USA

  • Venue:
  • Proceedings of the 21st ACM International Conference on Information and Knowledge Management
  • Year:
  • 2012

Abstract

Topic taxonomies present a multi-level view of a document collection, where general topics live towards the top of the taxonomy and more specific topics live towards the bottom. Topic taxonomies allow users to quickly drill down into their topic of interest to find documents. We show that hierarchies of documents, where documents live at the inner nodes of the hierarchy tree, can also be inferred by combining document text with inter-document links. We present a Bayesian generative model by which an explicit hierarchy of documents is created. Experiments on three document-graph data sets show that the generated document hierarchies fit the observed data and that the levels in the constructed document hierarchy represent practical groupings.
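
To make the idea of a document hierarchy concrete, the following is a minimal sketch of a tree in which every node, including every inner node, is itself a document. It is not the Bayesian generative model described in the abstract: as a stand-in, it simply takes a breadth-first spanning tree of a small link graph. The graph `links`, the document names, and the helpers `bfs_hierarchy` and `path_to_root` are all hypothetical and chosen only for illustration.

```python
from collections import deque


def bfs_hierarchy(links, root):
    """Return a {child: parent} map for a breadth-first spanning tree.

    Assumption: this BFS tree is only a placeholder for an inferred
    hierarchy; the paper's model combines text and links probabilistically.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        doc = queue.popleft()
        for neighbor in links.get(doc, []):
            if neighbor not in parent:  # first visit fixes the parent
                parent[neighbor] = doc
                queue.append(neighbor)
    return parent


def path_to_root(parent, doc):
    """Return the chain of documents from the root down to `doc`."""
    path = [doc]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]


# Hypothetical document graph: each key links to the listed documents.
links = {
    "home": ["research", "teaching"],
    "research": ["home", "data-mining", "nlp"],
    "teaching": ["home", "cs101"],
    "data-mining": ["research", "nlp"],
    "nlp": ["research"],
    "cs101": ["teaching"],
}

parent = bfs_hierarchy(links, "home")
print(path_to_root(parent, "cs101"))
```

Here "research" and "teaching" are ordinary documents that also act as inner nodes, so drilling down from "home" passes through documents rather than through abstract topic labels.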