TIJAH: Embracing IR Methods in XML Databases

  • Authors:
  • Vojkan Mihajlović;Johan List;Georgina Ramírez;Arjen P. de Vries;Djoerd Hiemstra;Henk Ernst Blok

  • Affiliations:
  • University of Twente, 7500 AE Enschede, The Netherlands;CWI, 1090 GB Amsterdam, The Netherlands;CWI, 1090 GB Amsterdam, The Netherlands;CWI, 1090 GB Amsterdam, The Netherlands;University of Twente, 7500 AE Enschede, The Netherlands;University of Twente, 7500 AE Enschede, The Netherlands

  • Venue:
  • Information Retrieval
  • Year:
  • 2005

Abstract

This paper discusses our participation in INEX (the Initiative for the Evaluation of XML Retrieval) using the TIJAH XML-IR system. TIJAH's system design follows a 'standard' layered database architecture, carefully separating the conceptual, logical and physical levels. At the conceptual level, we classify the INEX XPath-based query expressions into three different query patterns. For each pattern, we present its mapping into a query execution strategy. The logical layer exploits score region algebra (SRA) as the basis for query processing. We discuss the region operators used to select and manipulate XML document components. The logical algebra expressions are mapped into efficient relational algebra expressions over a physical representation of the XML document collection using the 'pre-post numbering scheme'. The paper concludes with an analysis of experiments performed with the INEX test collection.
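
To illustrate the general idea behind a pre-post numbering scheme, the following is a minimal sketch in Python. It is not TIJAH's implementation: the function names `number_nodes` and `descendants`, the tuple layout, and the sample document are all hypothetical and chosen only for illustration. Each element receives a preorder rank and a postorder rank, after which containment can be tested with two plain comparisons, which is the kind of predicate a relational engine can evaluate over a table.

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Return rows (pre, post, tag), one per element, in document order.

    Hypothetical helper for illustration; not taken from TIJAH.
    """
    rows = []
    counters = {"pre": 0, "post": 0}

    def visit(node):
        # The preorder rank is assigned when the element is first entered.
        row = [counters["pre"], None, node.tag]
        counters["pre"] += 1
        rows.append(row)
        for child in node:
            visit(child)
        # The postorder rank is assigned once all children are finished.
        row[1] = counters["post"]
        counters["post"] += 1

    visit(root)
    return [tuple(r) for r in rows]


def descendants(rows, ancestor):
    """Select rows contained in `ancestor` using only rank comparisons."""
    a_pre, a_post, _ = ancestor
    return [r for r in rows if r[0] > a_pre and r[1] < a_post]


# Assumed sample document, not from the INEX collection.
doc = ET.fromstring("<article><sec><p/><p/></sec><bib/></article>")
table = number_nodes(doc)
sec = next(r for r in table if r[2] == "sec")
print(descendants(table, sec))
```

An element d lies inside an element a exactly when d starts after a in preorder and finishes before a in postorder, so the containment test needs no tree traversal at query time.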