On the efficient processing regular path expressions of an enormous volume of XML data

  • Authors:
  • Michal Krátký;Radim Bača;Václav Snášel

  • Affiliations:
  • Department of Computer Science, VŠB - Technical University of Ostrava, Ostrava-Poruba, Czech Republic;Department of Computer Science, VŠB - Technical University of Ostrava, Ostrava-Poruba, Czech Republic;Department of Computer Science, VŠB - Technical University of Ostrava, Ostrava-Poruba, Czech Republic

  • Venue:
  • DEXA'07 Proceedings of the 18th international conference on Database and Expert Systems Applications
  • Year:
  • 2007

Abstract

XML (Extensible Markup Language) has recently been embraced as a new approach to data modeling. More and more information is formatted as semi-structured data, e.g. articles in a digital library, documents on the web and so on. Implementing an efficient system for storing and querying XML documents requires the development of new techniques. Indexing an XML document enables efficient evaluation of a user query. XML query languages, like XPath or XQuery, apply a form of path expressions for composing more general queries. With current approaches to indexing XML data, the evaluation of regular path expressions is not efficient enough: most approaches index single elements, and a query is processed by joining individual expressions. In this article we introduce an approach that makes it possible to efficiently process a query defined by regular path expressions. The approach indexes all root-to-leaf paths and stores them in multi-dimensional data structures, allowing the indexing and efficient querying of an enormous volume of XML data.
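
To make the idea of indexing root-to-leaf paths concrete, the following is a minimal Python sketch and not the authors' implementation. It extracts every root-to-leaf path from a small document and evaluates a simple path expression against whole paths, so no element-by-element join is needed. The function names, the supported query subset (`/` for child, `//` for descendant, `*` for any element name) and the sample document are assumptions made for illustration. A plain list with a linear scan stands in for the multi-dimensional data structure mentioned in the abstract.

```python
import re
import xml.etree.ElementTree as ET


def root_to_leaf_paths(xml_text):
    """Return every root-to-leaf path as a string such as '/a/b/c'."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        current = prefix + "/" + node.tag
        children = list(node)
        if not children:
            # A node without child elements ends a root-to-leaf path.
            paths.append(current)
        for child in children:
            walk(child, current)

    walk(root, "")
    return paths


def compile_path_expression(expr):
    """Translate a small XPath-like subset into a regular expression.

    Hypothetical subset: '/' is a child step, '//' a descendant step,
    '*' matches any single element name.
    """
    parts = []
    for axis, name in re.findall(r"(//|/)([^/]+)", expr):
        step = "[^/]+" if name == "*" else re.escape(name)
        if axis == "//":
            # Allow any number of intermediate elements before this step.
            parts.append("(?:/[^/]+)*/" + step)
        else:
            parts.append("/" + step)
    # The last step must end at an element boundary, so an expression
    # may also match an inner element of a longer root-to-leaf path.
    return re.compile("".join(parts) + r"(?=/|$)")


def query(paths, expr):
    """Return the indexed paths that contain a match for expr (linear scan)."""
    pattern = compile_path_expression(expr)
    return [p for p in paths if pattern.match(p)]


SAMPLE = (
    "<library>"
    "<book><title/><author><name/></author></book>"
    "<article><title/></article>"
    "</library>"
)

if __name__ == "__main__":
    index = root_to_leaf_paths(SAMPLE)
    for expr in ("//title", "/library/book//name", "/library/*/title"):
        print(expr, query(index, expr))
```

Because each indexed entry is a complete path, a query with several steps is answered by a single test per path. A real system along the lines of the abstract would replace the list scan with lookups in a multi-dimensional index.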