XPath query evaluation based on the stack encoding

Authors:
Yangjun Chen;Donovan Cooke
Affiliations:
University of Winnipeg, Winnipeg, Manitoba, Canada;University of Winnipeg, Winnipeg, Manitoba, Canada
Venue:
C3S2E '09 Proceedings of the 2nd Canadian Conference on Computer Science and Software Engineering
Year:
2009

Citing 9
Cited 0

On supporting containment queries in relational database management systems. SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Holistic twig joins: optimal XML pattern matching. Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Query Optimization for XML. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Relational Databases for Querying XML Documents: Limitations and Opportunities. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Quilt: An XML Query Language for Heterogeneous Data Sources. Selected papers from the Third International Workshop WebDB 2000 on The World Wide Web and Databases
ViST: a dynamic index method for querying XML data by tree structures. Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Structural Joins: A Primitive for Efficient XML Query Pattern Matching. ICDE '02 Proceedings of the 18th International Conference on Data Engineering
On the Sequencing of Tree Structures for XML Indexing. ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Twig2Stack: bottom-up processing of generalized-tree-pattern queries over XML documents. VLDB '06 Proceedings of the 32nd international conference on Very large data bases

Abstract

The twig join, which finds all occurrences of a twig pattern in an XML database, is a core operation in XML query processing. Many strategies have been proposed for this problem, and they can be roughly classified into two groups. The first group decomposes a twig pattern (a small tree) into a set of binary relationships between pairs of nodes, such as parent-child and ancestor-descendant relations, and thereby transforms a tree matching problem into a series of simple relation look-ups. The second group decomposes a twig pattern into a set of paths. Among the methods of this second kind, the approach based on the so-called stack encoding by Bruno et al. [2] is of particular interest because it can represent in linear space a potentially exponential (in the number of query nodes) number of matching paths. However, the available processes for generating such compressed paths suffer from some redundancy and can be significantly improved. In this paper, we analyze this method and show that the time complexity of path generation in its two main procedures, PathStack and TwigStack, can be reduced from O(m^2 · n) to O(m · n), where m and n are the sizes of the query tree and the document tree, respectively. Experiments comparing our method with some existing strategies show that it needs much less time to generate matching paths.
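
To make the idea of a stack encoding concrete, the following is a minimal sketch, assuming a chain query a//b//c with one stack per query node, where each stack entry records how far down the parent stack its ancestors reach. The node labels, the tuple layout, and the function name `paths_ending_at` are hypothetical and chosen only for illustration. The enumeration shown is a naive expansion of the encoding; it is not the PathStack or TwigStack procedure and not the improved method analyzed in the paper.

```python
from typing import List, Tuple

# Hypothetical entry layout: (node label, index of the top entry of the
# parent stack at the time this entry was pushed). Every parent-stack
# entry at or below that index is an ancestor of this node.
Entry = Tuple[str, int]


def paths_ending_at(stacks: List[List[Entry]], level: int, index: int) -> List[List[str]]:
    """Expand all root-to-node paths encoded for stacks[level][index]."""
    label, parent_top = stacks[level][index]
    if level == 0:
        # Root query node: the path consists of this node alone.
        return [[label]]
    result = []
    # Any parent-stack entry from the bottom up to parent_top can be combined
    # with this node, so the encoding stores only one pointer per entry.
    for j in range(parent_top + 1):
        for prefix in paths_ending_at(stacks, level - 1, j):
            result.append(prefix + [label])
    return result


# Assumed document nesting: a1 > a2 > b1 > b2 > c1 (each contains the next).
stacks = [
    [("a1", -1), ("a2", -1)],  # stack for query node a (no parent stack)
    [("b1", 1), ("b2", 1)],    # both b nodes lie below a1 and a2
    [("c1", 1)],               # c1 lies below b1 and b2
]

for path in paths_ending_at(stacks, level=2, index=0):
    print(" // ".join(path))
```

In this assumed example, five stack entries stand for four matching paths. Adding further nested a or b nodes would add one entry each while multiplying the number of encoded paths, which is the space saving the abstract refers to.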