Where to start reading a textual XML document?

  • Authors:
  • Jaap Kamps;Marijn Koolen;Mounia Lalmas

  • Affiliations:
  • University of Amsterdam, Amsterdam, Netherlands;University of Amsterdam, Amsterdam, Netherlands;Queen Mary: University of London, London, United Kingdom

  • Venue:
  • SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

In structured information retrieval, the aim is to exploit document structure to retrieve relevant components, allowing the user to go straight to the relevant material. This paper looks at the so-called best entry points (BEPs), which are intended to give the user the best starting point to access the relevant information in the document. We examine the relationship between BEPs and relevant components in the INEX 2006 ad hoc assessments. Our main findings are the following: First, although documents are short, assessors often choose the best entry point some distance from the start of the document. Second, many of the best entry points coincide with the first relevant character in relevant documents, showing a strong relation between the BEP and relevant text. Third, we find browsing BEPs in articles with a single relevant passages, and container BEPs or context BEPs in articles with more relevant passages.