Syntactic Query Models for Restatement Retrieval

  • Authors: Niranjan Balasubramanian; James Allan
  • Affiliations: Center for Intelligent Information Retrieval, University of Massachusetts Amherst, Amherst 01003 (both authors)
  • Venue: SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
  • Year: 2009

Abstract

We consider the problem of retrieving sentence-level restatements. Formally, we define restatements as sentences that contain all or some subset of the information present in a query sentence. Identifying restatements is useful for several applications, such as multi-document summarization, document provenance, text reuse, and novelty detection. In these settings, spurious partial matches and term dependence become important issues for restatement retrieval. To address these issues, we focus on query models that capture relative term importance and sequential term dependence. In this paper, we build query models using syntactic information such as subject-verb-object structures and phrases. Our experimental results on two different collections show that syntactic query models are consistently more effective than purely statistical alternatives.
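
As a rough illustration of the two ideas named in the abstract (relative term importance and sequential term dependence), the following Python sketch scores candidate sentences against a weighted query model. It is not the authors' method: the function names, the weights, and the scoring formula are all assumptions made for illustration, and the subject-verb-object terms and phrases are supplied by hand rather than produced by a parser.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    punct = ".,;:!?\"'()"
    return [t.strip(punct).lower() for t in text.split() if t.strip(punct)]


def build_query_model(svo_terms, phrases, svo_weight=2.0, phrase_weight=1.0):
    """Hypothetical query model (illustrative weights, not from the paper).

    svo_terms: terms assumed to fill subject/verb/object roles; they get a
               higher unigram weight to reflect relative term importance.
    phrases:   multi-word phrases; each adjacent word pair becomes an ordered
               bigram to reflect sequential term dependence.
    """
    unigrams = {t.lower(): svo_weight for t in svo_terms}
    bigrams = {}
    for phrase in phrases:
        toks = tokenize(phrase)
        for t in toks:
            unigrams.setdefault(t, 1.0)  # non-SVO phrase words get base weight
        for pair in zip(toks, toks[1:]):
            bigrams[pair] = phrase_weight
    return unigrams, bigrams


def score(sentence, model):
    """Fraction of the query model's total weight matched by the sentence."""
    unigrams, bigrams = model
    toks = tokenize(sentence)
    present = set(toks)
    adjacent = set(zip(toks, toks[1:]))
    matched = sum(w for t, w in unigrams.items() if t in present)
    matched += sum(w for pair, w in bigrams.items() if pair in adjacent)
    total = sum(unigrams.values()) + sum(bigrams.values())
    return matched / total if total else 0.0


# Hand-specified syntactic information for one query sentence (assumed input).
model = build_query_model(
    svo_terms=["senate", "passed", "bill"],
    phrases=["health care bill"],
)
restatement = "The Senate passed the health care bill on Tuesday."
partial_match = "The bill on health was about care, senators said."
print(score(restatement, model), score(partial_match, model))
```

In this toy setup the second sentence shares several words with the query but matches neither the phrase order nor the subject and verb, so it scores lower than the first. That is the kind of spurious partial match the abstract describes, though the actual models and their estimation are specified in the paper itself.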