A cost-based join selection for XML twig content-based queries

  • Authors:
  • Radim Bača; Michal Krátký

  • Affiliations:
  • Technical University of Ostrava, Czech Republic

  • Venue:
  • DataX '08 Proceedings of the 2008 EDBT workshop on Database technologies for handling XML information on the web
  • Year:
  • 2008

Abstract

XML (Extensible Markup Language) has been embraced as a new approach to data modeling. Nowadays, more and more information is formatted as semi-structured data, e.g., articles in a digital library, documents on the web, and so on. Implementing an efficient system for storing and querying XML documents requires the development of new techniques. Many XML indexing techniques have been proposed in recent years. Within some classes of indexing methods, two kinds of joins can be distinguished for processing twig queries. The first join merges two sets retrieved from an inverted list. The second join uses the result of the first query to build the second query. Although authors have proposed improvements to their individual joins, the relative advantages of the different join operations have not yet been discussed. In this article, we propose a join selection based on the cost of a join. By choosing the more appropriate join operation, the efficiency of twig query processing is significantly improved.
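
The two kinds of joins and the idea of choosing between them by cost can be illustrated with a minimal sketch. It is not the authors' method: the abstract specifies neither a labeling scheme nor a cost model, so everything below is an assumption made for illustration. Nodes are assumed to carry (start, end) region labels that nest properly, as in an XML tree. `merge_join` stands in for the first join, `probe_join` for the second, and the cost estimates in `choose_join` (list lengths for merging, one logarithmic lookup per ancestor for probing) are hypothetical placeholders.

```python
from bisect import bisect_left, bisect_right
from math import log2

# Assumption: every node is a (start, end) interval and intervals nest
# properly, so d is a descendant of a when a.start < d.start < a.end.

def merge_join(ancestors, descendants):
    """First kind of join: merge two start-sorted lists in one pass."""
    result, stack, i = [], [], 0
    for d in descendants:
        # Push every ancestor candidate that starts before d.
        while i < len(ancestors) and ancestors[i][0] < d[0]:
            a = ancestors[i]
            # Drop candidates that closed before this one opens.
            while stack and stack[-1][1] < a[0]:
                stack.pop()
            stack.append(a)
            i += 1
        # Drop candidates that closed before d opens.
        while stack and stack[-1][1] < d[0]:
            stack.pop()
        # Everything left on the stack encloses d.
        result.extend((a, d) for a in stack)
    return result

def probe_join(ancestors, descendants):
    """Second kind of join: each first-query result shapes a range query."""
    starts = [d[0] for d in descendants]  # stands in for an index on start
    result = []
    for a in ancestors:
        lo = bisect_right(starts, a[0])
        hi = bisect_left(starts, a[1])
        result.extend((a, d) for d in descendants[lo:hi])
    return result

def choose_join(n_anc, n_desc):
    """Hypothetical cost model: pick the join with the lower estimate."""
    merge_cost = n_anc + n_desc                   # read both lists once
    probe_cost = n_anc * log2(max(n_desc, 2))     # one lookup per ancestor
    return "merge" if merge_cost <= probe_cost else "probe"

def join(ancestors, descendants):
    """Run whichever join the cost estimate prefers."""
    kind = choose_join(len(ancestors), len(descendants))
    run = merge_join if kind == "merge" else probe_join
    return run(ancestors, descendants)

if __name__ == "__main__":
    anc = [(1, 10), (2, 3), (4, 9)]
    desc = [(5, 6), (11, 12)]
    print(choose_join(len(anc), len(desc)), join(anc, desc))
```

Under this toy model, a small first result joined against a long list favors probing, while two lists of similar size favor merging. A realistic cost model would also have to account for index and I/O costs, which this sketch ignores.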