Best-match querying from document-centric XML

  • Authors:
  • Jaap Kamps;Maarten Marx;Maarten de Rijke;Börkur Sigurbjörnsson

  • Affiliations:
  • University of Amsterdam, Amsterdam, The Netherlands;University of Amsterdam, Amsterdam, The Netherlands;University of Amsterdam, Amsterdam, The Netherlands;University of Amsterdam, Amsterdam, The Netherlands

  • Venue:
  • Proceedings of the 7th International Workshop on the Web and Databases: colocated with ACM SIGMOD/PODS 2004
  • Year:
  • 2004

Abstract

On the Web, there is a pervasive use of XML to give lightweight semantics to textual collections. Such document-centric XML collections require a query language that can gracefully handle structural constraints as well as constraints on the free text of the documents. Our main contributions are threefold. First, we outline two fragments of XPath tailored to users who have varying degrees of understanding of the XML structure used, and give both syntactic and semantic characterizations of these fragments. Second, we extend XPath with an about function that has best-match semantics, based on the relevance of the document component to the expressed information need. Third, we evaluate the resulting query language using the INEX 2003 test suite, and show that best-match approaches outperform exact-match approaches for evaluating content-and-structure queries.
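
The contrast between exact-match and best-match semantics can be illustrated with a minimal sketch. This is not the authors' retrieval model: the abstract does not specify how relevance is computed, so the scoring function here (a plain term-frequency ratio), the sample document, and the names `about_score`, `exact_match`, and `best_match` are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document (not from the paper or the INEX collection).
DOC = """
<article>
  <sec><title>XML retrieval</title><p>Ranking XML elements by relevance.</p></sec>
  <sec><title>Databases</title><p>Exact matching of structured queries.</p></sec>
  <sec><title>Evaluation</title><p>Relevance ranking evaluation for XML retrieval.</p></sec>
</article>
"""

def tokens(elem):
    """Lowercased word tokens of all text inside an element."""
    return re.findall(r"\w+", " ".join(elem.itertext()).lower())

def exact_match(root, path, terms):
    """Boolean semantics: keep only elements that contain every query term."""
    return [e for e in root.findall(path)
            if all(t in tokens(e) for t in terms)]

def about_score(elem, terms):
    """Toy relevance score (an assumption): share of tokens that are query terms."""
    toks = tokens(elem)
    if not toks:
        return 0.0
    return sum(toks.count(t) for t in terms) / len(toks)

def best_match(root, path, terms):
    """Ranked semantics: return elements with a non-zero score, best first."""
    scored = [(about_score(e, terms), e) for e in root.findall(path)]
    hits = [pair for pair in scored if pair[0] > 0]
    return [e for _, e in sorted(hits, key=lambda pair: pair[0], reverse=True)]

root = ET.fromstring(DOC)
query = ["xml", "retrieval", "elements"]

exact_titles = [e.findtext("title") for e in exact_match(root, ".//sec", query)]
ranked_titles = [e.findtext("title") for e in best_match(root, ".//sec", query)]
```

In this sketch the structural constraint (`.//sec`) is still applied strictly, while the content constraint is either a filter (exact match) or a ranking criterion (best match). The third section lacks the term "elements", so the exact-match query drops it, whereas the best-match query still returns it at a lower rank.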