Current twig join algorithms incur high memory costs on queries that involve child-axis nodes. In this paper we provide an analytical explanation for this phenomenon. In a first large-scale study of the space complexity of evaluating XPath queries over indexed XML documents, we show that the space required depends on three factors: (1) whether the query is a path or a tree; (2) the types of axes occurring in the query and their occurrence pattern; and (3) the mode of query evaluation (filtering, full-fledged, or "pattern matching"). Our lower bounds imply that evaluating a large class of queries with child-axis nodes indeed requires large space. Our study also reveals that on some queries there is a large gap between the space needed for pattern matching and the space needed for full-fledged evaluation or filtering. This implies that many existing twig join algorithms, which work in the pattern matching mode, incur significant space overhead. We present a new twig join algorithm that avoids this overhead. On certain queries our algorithm is far more space-efficient than existing algorithms, sometimes bringing the space down from linear in the document size to constant.
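To make the three evaluation modes concrete, the following minimal Python sketch contrasts what each mode returns for the path query //a//b on a toy document. The document, the function names, and the reading of each mode (filtering returns a yes/no answer, full-fledged evaluation returns the distinct output nodes, pattern matching returns every binding of query nodes to document nodes) are assumptions made for illustration. The sketch is a naive in-memory traversal, not a twig join algorithm, and it shows only the difference in output, not the space bounds discussed above.

```python
# Toy document: each node is (id, tag, children). Invented for illustration.
DOC = (1, "a", [(2, "a", [(3, "b", []), (4, "b", [])])])


def subtree(node):
    """Yield node and all of its descendants in document order."""
    yield node
    for child in node[2]:
        yield from subtree(child)


def pattern_matches(root):
    """Pattern matching mode: every (a, b) id pair with b a proper descendant of a."""
    pairs = []
    for a in subtree(root):
        if a[1] != "a":
            continue
        for child in a[2]:
            for b in subtree(child):
                if b[1] == "b":
                    pairs.append((a[0], b[0]))
    return pairs


def full_evaluation(root):
    """Full-fledged mode: distinct b nodes that take part in at least one match."""
    return sorted({b for _, b in pattern_matches(root)})


def filtering(root):
    """Filtering mode: does the document match the query at all?"""
    return bool(pattern_matches(root))


if __name__ == "__main__":
    print(pattern_matches(DOC))   # four (a, b) bindings
    print(full_evaluation(DOC))   # two distinct b nodes
    print(filtering(DOC))         # a single boolean
```

On this document the pattern matching output holds four bindings, the full-fledged output holds two nodes, and filtering yields a single boolean, which hints at why the modes can differ in cost.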