Finding relevant passages using noun-noun compounds (poster session): coherence vs. proximity

  • Authors:
  • Eduard Hoenkamp; Rob de Groot

  • Affiliations:
  • Nijmegen Institute for Cognition and Information (NICI), PO Box 9104, Nijmegen, the Netherlands; Nijmegen Institute for Cognition and Information (NICI), PO Box 9104, Nijmegen, the Netherlands

  • Venue:
  • SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2000

Abstract

Intuitively, words forming phrases are a more precise description of content than words as a sequence of keywords. Yet evidence that phrases are more effective for information retrieval is inconclusive. This paper isolates a neglected class of phrases that is abundant in communication, has an established theoretical foundation, and shows promise for effectively expressing the user's information need: the noun-noun compound (NNC). In an experiment, a variety of meaningful NNCs were used to isolate relevant passages in a large and varied corpus. In a first pass, passages were retrieved based on textual proximity of the words or their semantic peers. A second pass retained only passages containing a syntactically coherent structure equivalent to the original NNC. This second pass showed a dramatic increase in precision. Preliminary results show the validity of our intuition about phrases in the special but very productive case of NNCs.
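
The two-pass idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so every name here (`proximity_pass`, `coherence_pass`, `LINKERS`, the window size of 5 tokens) is a hypothetical choice made for illustration. The first pass ignores "semantic peers" and matches the literal words only, and the second pass uses a few regular expressions as a crude stand-in for real syntactic analysis.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def proximity_pass(passages, modifier, head, window=5):
    """First pass (assumed form): keep passages in which both words of the
    compound occur within `window` tokens of each other, in any order."""
    kept = []
    for passage in passages:
        tokens = tokenize(passage)
        mod_pos = [i for i, t in enumerate(tokens) if t == modifier]
        head_pos = [i for i, t in enumerate(tokens) if t == head]
        if any(abs(i - j) <= window for i in mod_pos for j in head_pos):
            kept.append(passage)
    return kept

# Hypothetical set of prepositions used to paraphrase a compound,
# e.g. "heart disease" -> "disease of the heart".
LINKERS = ("of", "for", "in", "on", "from")

def coherence_pass(passages, modifier, head):
    """Second pass (assumed form): keep passages where the two words appear
    as the compound itself or as a simple prepositional paraphrase of it."""
    m, h = re.escape(modifier), re.escape(head)
    patterns = [rf"\b{m} {h}\b"]
    patterns += [rf"\b{h} {link} (?:the |a |an )?{m}\b" for link in LINKERS]
    regex = re.compile("|".join(patterns), re.IGNORECASE)
    return [p for p in passages if regex.search(p)]

passages = [
    "The heart disease was diagnosed early.",
    "A disease of the heart can go unnoticed.",
    "His heart sank when the disease returned.",
    "The weather was fine.",
]

candidates = proximity_pass(passages, "heart", "disease")
coherent = coherence_pass(candidates, "heart", "disease")
print(len(candidates), len(coherent))
```

In this toy example the third passage survives the proximity pass because both words occur close together, but it is dropped by the coherence pass because they do not form a structure equivalent to the compound. That is the kind of false positive the second pass is described as removing.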