On document relevance and lexical cohesion between query terms

  • Authors:
  • Olga Vechtomova;Murat Karamuftuoglu;Stephen E. Robertson

  • Affiliations:
  • Department of Management Sciences, University of Waterloo, Waterloo, Ont., Canada;Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey;Microsoft Research Cambridge, Cambridge, UK

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2006

Abstract

Lexical cohesion is a property of text, achieved through lexical-semantic relations between words in text. Most information retrieval systems make use of lexical relations in text only to a limited extent. In this paper we empirically investigate whether the degree of lexical cohesion between the contexts of query terms' occurrences in a document is related to its relevance to the query. Lexical cohesion between distinct query terms in a document is estimated on the basis of the lexical-semantic relations (repetition, synonymy, hyponymy and sibling) that exist between their collocates, i.e. words that co-occur with them in the same windows of text. Experiments suggest that there are significant differences in lexical cohesion between relevant and non-relevant document sets. A document ranking method based on lexical cohesion shows some performance improvements.
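The idea of linking query terms through their collocates can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the repetition relation (synonymy, hyponymy and sibling relations would need a lexical resource), and the function names `collocates` and `cohesion_links`, the symmetric fixed-size window, and the whitespace tokenization are all assumptions made for illustration.

```python
from itertools import combinations


def collocates(tokens, term, window=3):
    """Collect the words within `window` positions of each occurrence of `term`.

    Hypothetical helper: the window size and its symmetric shape are assumptions.
    """
    found = set()
    for i, tok in enumerate(tokens):
        if tok == term:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Keep every neighbouring word except the term itself.
            found.update(t for t in tokens[lo:hi] if t != term)
    return found


def cohesion_links(tokens, query_terms, window=3):
    """Count collocates shared by each pair of distinct query terms.

    Only repetition (the same word appearing near both terms) is counted;
    query terms themselves are excluded from the collocate sets.
    """
    query = set(query_terms)
    sets = {q: collocates(tokens, q, window) - query for q in query_terms}
    return sum(len(sets[a] & sets[b]) for a, b in combinations(query_terms, 2))


# Usage: a toy document and a two-term query.
doc = "the river bank flooded after heavy rain and the flood damaged the river bridge"
score = cohesion_links(doc.split(), ["bank", "bridge"], window=2)
```

In this toy document the two query terms share the collocates "the" and "river". No stopword filtering is applied here, so function words count as links; a real system would likely remove them and normalize the count.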