Holistic twig joins: optimal XML pattern matching
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Indexing and Querying XML Data for Regular Path Expressions
Proceedings of the 27th International Conference on Very Large Data Bases
ViST: a dynamic index method for querying XML data by tree structures
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Structural Joins: A Primitive for Efficient XML Query Pattern Matching
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
PRIX: Indexing And Querying XML Using Prüfer Sequences
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
On the Sequencing of Tree Structures for XML Indexing
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Efficient structural joins on indexed XML documents
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
LCS-TRIM: dynamic programming meets XML indexing and querying
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
RRSi: indexing XML data for proximity twig queries
Knowledge and Information Systems
Effective pruning for XML structural match queries
Data & Knowledge Engineering
Key concepts for native XML processing
From active data management to event-based systems and more
Examining the impact of data-access cost on XML twig pattern matching
Information Sciences: an International Journal
Hi-index | 0.00 |
With the advent of XML as the new standard for information representation and exchange, indexing and querying of XML data is of major concern. In this paper, we propose a method for representing an XML document as a sequence based on a variation of Prüfer sequences. We incorporate new components in the node encodings such as level, number of a certain kind of descendants and develop methods for holistic processing of tree pattern queries. The query processing involves converting the query also into a sequence and performing subsequence matching on the document sequence. We establish certain interesting properties of the proposed method of sequencing that give rise to a new efficient pattern matching algorithm. The sequence data is stored in a two level B+-trees to support query processing. We also propose an optimization for parent-child axis to speed up the query processing. Our approach does not require any post-processing and guarantees results that are free of false positives and duplicates. Experimental results show that our system performs significantly better than previous systems in a large number of cases.