The use of anaphoric resolution for document description in information retrieval

  • Authors: S. Bonzi; E. Liddy
  • Affiliation: School of Information Studies, Syracuse University (both authors)
  • Venue: SIGIR '88 Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 1988

Abstract

This study investigated two hypotheses concerning the use of anaphors in information retrieval. The first hypothesis, that anaphors tend to refer to integral concepts rather than to peripheral concepts, was well supported. Two samples of documents, one in psychology and the other in computer science, were examined by subject experts who judged the centrality of phrases which were referred to anaphorically. The second hypothesis, that various term weighting schemes are affected differently by anaphoric resolution, was also well supported. It was found that schemes which incorporate document length into the calculations produce much smaller increases in term weights for terms occurring in anaphoric resolutions than do those which do not consider document length. It is concluded that although anaphoric resolution has potential for better representing the “aboutness” of a document, care must be taken in choosing both the anaphoric classes to be resolved and the term weighting schemes to be used in measuring a document's topicality.
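The second finding can be illustrated with a minimal sketch. It is not the study's procedure or data: the toy sentence, the hand-written anaphor mapping, and the two weighting functions (raw term frequency, and term frequency divided by document length) are assumptions chosen only to show why a length-aware weight tends to rise less when anaphors are replaced by their referents.

```python
from collections import Counter

def raw_tf(tokens, term):
    """Raw term frequency: ignores document length."""
    return Counter(tokens)[term]

def length_normalized_tf(tokens, term):
    """Term frequency divided by the number of tokens in the document."""
    return Counter(tokens)[term] / len(tokens)

def resolve(tokens, referents):
    """Replace each anaphor with its referent phrase.

    `referents` is a hypothetical hand-made mapping from an anaphor
    to the tokens of the phrase it refers to.
    """
    out = []
    for tok in tokens:
        out.extend(referents.get(tok, [tok]))
    return out

# Toy document (illustrative only): both occurrences of "it" refer to "the parser".
original = "the parser reads input it builds a tree it reports errors".split()
resolved = resolve(original, {"it": ["the", "parser"]})

# Relative increase in the weight of "parser" after resolution.
raw_gain = raw_tf(resolved, "parser") / raw_tf(original, "parser")
norm_gain = (length_normalized_tf(resolved, "parser")
             / length_normalized_tf(original, "parser"))

print(f"raw tf gain:               {raw_gain:.2f}")
print(f"length-normalized tf gain: {norm_gain:.2f}")
```

In this toy case the resolved text is longer than the original, so the length-normalized weight grows by a smaller factor than the raw count does. The size of the gap here says nothing about the magnitudes reported in the study.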