Finding all occurrences of a twig pattern in an XML document is a core operation for efficient evaluation of XML queries. A number of algorithms have been proposed to process twig queries based on region encoding. Because a region-encoding index assigns two or more numbers to each element of the source document, the index size grows linearly with the document, and algorithms based on region encoding perform worse as the source document grows large. In this paper, we address the problem by proposing a novel index structure called the Clustered Absolute Path Index (CAPI). This index greatly reduces the index size and grows slowly as the source document grows. Based on CAPI, we design novel join algorithms: Path-Match, which processes queries without branches, and Branch-Filter and RelatedPath-Join, which process queries with branches. Experimental results show that the proposed CAPI-based algorithms significantly outperform twig join and scale well.
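To make the size argument concrete, the following minimal sketch contrasts a region-encoding index with a grouping of elements by absolute path. It is an illustration under stated assumptions, not the CAPI structure itself, whose layout the abstract does not describe: the function names `region_encode`, `is_ancestor`, and `absolute_path_counts` and the sample document are hypothetical, and the (start, end, level) labeling is the commonly used form of region encoding.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def region_encode(root):
    """Label every element with (tag, start, end, level) in a depth-first walk.

    One entry is produced per element, so the index grows linearly
    with the document.
    """
    labels = []
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1
        entry = [node.tag, counter, None, level]  # end is filled in on exit
        labels.append(entry)
        for child in node:
            visit(child, level + 1)
        counter += 1
        entry[2] = counter

    visit(root, 0)
    return [tuple(e) for e in labels]


def is_ancestor(a, d):
    """True if label a strictly contains label d (interval containment)."""
    return a[1] < d[1] and d[2] < a[2]


def absolute_path_counts(root):
    """Group elements by root-to-node tag path (hypothetical stand-in, not CAPI).

    One key is produced per distinct path, so repeated structure
    does not add keys.
    """
    counts = Counter()

    def visit(node, prefix):
        path = prefix + "/" + node.tag
        counts[path] += 1
        for child in node:
            visit(child, path)

    visit(root, "")
    return counts


# Hypothetical sample document with repeated structure.
doc = ET.fromstring(
    "<lib>"
    "<book><title/><author/></book>"
    "<book><title/><author/><author/></book>"
    "</lib>"
)
labels = region_encode(doc)
paths = absolute_path_counts(doc)

print(len(labels), "region entries;", len(paths), "distinct absolute paths")
```

On this sample the region encoding holds one entry per element, while the path grouping holds one key per distinct root-to-node path. Adding more `book` elements of the same shape would enlarge the former but not the number of keys in the latter, which is the growth behavior the abstract attributes to a path-based index.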