Inter and intra-document contexts applied in polyrepresentation for best match IR

  • Authors:
  • Mette Skov; Birger Larsen; Peter Ingwersen

  • Affiliations:
  • Information Interaction and Information Architecture, Royal School of Library and Information Science, Birketinget 6, DK-2300 Copenhagen S, Denmark (all three authors)

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2008

Abstract

The principle of polyrepresentation offers a theoretical framework for handling multiple contexts in information retrieval (IR). This paper presents an empirical laboratory study of polyrepresentation in a restricted mode of the information space, with a focus on inter- and intra-document features. The Cystic Fibrosis test collection, indexed in the best match system InQuery, constitutes the experimental setting. Overlaps between five functionally and/or cognitively different document representations are identified. Supporting the principle of polyrepresentation, the results show that, in general, overlaps generated by three or four representations of different nature have higher precision than those generated from two representations or from single fields. This result pertains to both the structured and the unstructured query mode in best match retrieval, with the latter query mode demonstrating higher performance. The retrieval overlaps containing search keys from the bibliographic references provide the best retrieval performance and minor MeSH terms the worst. It is concluded that a highly structured query language is necessary when implementing the principle of polyrepresentation in a best match IR system because the principle is inherently Boolean. Finally, a re-ranking test shows promising results when search results are re-ranked according to the precision obtained in the overlaps, whilst re-ranking by citations seems less useful when integrated into polyrepresentative applications.
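
The idea of identifying overlaps between representations and measuring their precision can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that each representation yields a plain set of retrieved document IDs, and that an overlap is simply the intersection of those sets. The field name `title`, the document IDs, and the relevance judgements are hypothetical; only bibliographic references and minor MeSH terms are named in the abstract.

```python
from itertools import combinations


def overlap_sets(results):
    """Map every combination of two or more representations to the
    documents retrieved by all representations in that combination."""
    overlaps = {}
    names = sorted(results)
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            overlaps[combo] = set.intersection(*(results[n] for n in combo))
    return overlaps


def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant (0.0 if none retrieved)."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical retrieval results per representation (document IDs are made up).
results = {
    "title": {1, 2, 3, 4},
    "minor_mesh": {2, 3, 4, 5},
    "references": {3, 4, 6},
}
relevant = {3, 4, 6}  # hypothetical relevance judgements

overlaps = overlap_sets(results)
for combo, docs in overlaps.items():
    print(combo, sorted(docs), round(precision(docs, relevant), 2))
```

In this toy data the overlap of all three representations is smaller but more precise than the overlap of `minor_mesh` and `title` alone, which mirrors the direction of the effect the abstract reports. A re-ranking step could then sort documents by the precision of the overlap they fall into, again only as an assumption about how such a test might be set up.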