On the efficient search of an XML twig query in large DataGuide trees

  • Authors:
  • Radim Bača;Michal Krátký;Václav Snášel

  • Affiliations:
  • -;Technical University of Ostrava;Czech Republic

  • Venue:
  • IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications
  • Year:
  • 2008

Abstract

XML (Extensible Markup Language) has been embraced as a new approach to data modeling. More and more information is now formatted as semi-structured data, e.g., articles in a digital library or documents on the web. Implementing an efficient system for storing and querying XML documents requires the development of new techniques, and many XML indexing techniques have been proposed in recent years. For XML data, we can distinguish two trees: the XML tree, which is a tree of elements and attributes, and the DataGuide, which is a tree of element tags and attribute names. The XML tree of a document is obviously much larger than its DataGuide. Authors often consider the DataGuide a small tree and therefore treat the DataGuide search as a minor problem. However, we show that DataGuide trees are often massive for real XML documents. Consequently, a trivial DataGuide search may be time- and memory-consuming. In this article, we introduce efficient methods for searching an XML twig pattern in large, complex DataGuide trees.
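
To make the distinction between the two trees concrete, the following is a minimal Python sketch, not the method proposed in the paper. It builds a DataGuide from a small XML document by merging sibling elements that share a tag, and then runs the kind of trivial twig search the abstract describes as potentially costly. The names `GuideNode`, `build_dataguide`, `count_nodes`, and `matches_twig`, the `@` prefix for attribute names, the sample document, and the tuple encoding of a twig are all assumptions made for illustration; only parent-child edges are handled.

```python
import xml.etree.ElementTree as ET


class GuideNode:
    """One DataGuide node: a distinct label path from the root (hypothetical helper)."""

    def __init__(self, label):
        self.label = label
        self.children = {}  # child label -> GuideNode


def _merge(elem, node):
    # Attribute names become child nodes, prefixed with "@" (an assumed convention).
    for name in elem.attrib:
        node.children.setdefault("@" + name, GuideNode("@" + name))
    # Child elements with the same tag collapse into a single DataGuide node.
    for child in elem:
        sub = node.children.setdefault(child.tag, GuideNode(child.tag))
        _merge(child, sub)


def build_dataguide(xml_root):
    guide = GuideNode(xml_root.tag)
    _merge(xml_root, guide)
    return guide


def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children.values())


def matches_twig(node, twig):
    """Naive recursive twig check; twig = (label, [child twigs])."""
    label, branches = twig
    if node.label != label:
        return False
    return all(
        any(matches_twig(child, branch) for child in node.children.values())
        for branch in branches
    )


DOC = """<library>
  <article id="1"><title>A</title><author>X</author></article>
  <article id="2"><title>B</title><author>Y</author><author>Z</author></article>
</library>"""

xml_root = ET.fromstring(DOC)
guide = build_dataguide(xml_root)

# Elements plus attributes in the XML tree versus distinct label paths in the DataGuide.
xml_size = sum(1 + len(e.attrib) for e in xml_root.iter())
guide_size = count_nodes(guide)

twig = ("library", [("article", [("title", []), ("author", [])])])
found = matches_twig(guide, twig)
```

In this toy document the XML tree has 10 nodes (8 elements and 2 attributes) while the DataGuide has 5, and the twig is found. Note that a branching twig found in the DataGuide only shows that each label path occurs somewhere in the document, not that a single element instance has all the branches, so the XML data must still be consulted. The recursive check also revisits subtrees for every branch, which hints at why a trivial search may become expensive once the DataGuide itself is large.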