Latent contextual indexing of annotated documents

  • Authors:
  • Christian Sengstock;Michael Gertz

  • Affiliations:
  • Heidelberg University, Heidelberg, Germany;Heidelberg University, Heidelberg, Germany

  • Venue:
  • Proceedings of the 21st international conference companion on World Wide Web
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we propose a simple and flexible framework to index context-annotated documents, e.g., documents with timestamps or georeferences, by contextual topics. A contextual topic is a distribution over document features with a particular meaning in the context domain, such as a repetitive event or a geographic phenomenon. Such a framework supports document clustering, labeling, and search, with respect to contextual knowledge contained in the document collection. To realize the framework, we introduce an approach to project documents into a context-feature space. Then, dimensionality reduction is used to extract contextual topics in this context-feature space. The topics can then be projected back onto the documents. We demonstrate the utility of our approach with a case study on georeferenced Wikipedia articles.