XML stream applications bring the challenge of efficiently processing queries over sequentially accessible, token-based data. Efficient query processing requires keeping memory usage low, which in turn requires that we avoid holding data in the query buffer by outputting it at the earliest possible time. In this paper, we propose a new class of stream algebra operators for efficient recursive XQuery stream processing. Our plan generator analyzes the query, and the schema when available, to determine which join operators in the query need recursive join support, and plugs in the less expensive just-in-time structural join whenever possible. In particular, we propose two strategies for implementing structural joins: (a) the just-in-time structural join strategy efficiently processes joins over non-recursive XML token streams; and (b) the recursive structural join strategy supports structural joins over recursive XML substreams, at the added cost of generating and comparing tuple-level IDs. Both strategies are complemented by an automata-driven invocation mechanism that triggers the execution of each join at the first possible moment, upon recognizing the end of the targeted input stream subelement. Further, we design the StructuralJoin operator itself to be context-aware: at run time it can switch from the efficient just-in-time join strategy, for elements recognized to be non-recursive, to the more powerful ID-based structural join strategy, for elements identified to be recursive. We incorporate the proposed techniques into the Raindrop stream engine. We also report on experimental studies using the ToXgene benchmark that demonstrate the performance improvements of these techniques.
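The difference between the two join strategies can be illustrated with a minimal sketch. It is not Raindrop's implementation: the token format, the function name `structural_join`, and the use of token positions as element IDs are all assumptions made for illustration. The sketch also assumes that descendant elements contain only text, and it defers each join to the close of the outermost ancestor element, which is a simplification of the invocation mechanism described above.

```python
# Hypothetical sketch of a context-aware ancestor/descendant structural join
# over an XML token stream. Tokens are (kind, value) pairs with kind in
# {"start", "text", "end"}; this format is an assumption, not Raindrop's.

def structural_join(tokens, anc, desc):
    """Return (ancestor_ordinal, descendant_text) pairs in join order."""
    results = []
    open_anc = []       # stack of (ordinal, start_pos) for open ancestors
    closed_anc = []     # (ordinal, start_pos, end_pos), used only if recursive
    desc_buf = []       # (start_pos, text) of buffered descendant elements
    recursive = False   # set once an ancestor is seen nested in another
    ordinal = 0         # document-order number of the next ancestor element
    desc_start = None   # start position of the descendant being read, if any
    text_parts = []

    for pos, (kind, value) in enumerate(tokens):
        if kind == "start" and value == anc:
            if open_anc:
                recursive = True          # nested ancestor: IDs are needed
            open_anc.append((ordinal, pos))
            ordinal += 1
        elif kind == "start" and value == desc and open_anc:
            desc_start, text_parts = pos, []
        elif kind == "text" and desc_start is not None:
            text_parts.append(value)
        elif kind == "end" and value == desc and desc_start is not None:
            desc_buf.append((desc_start, "".join(text_parts)))
            desc_start = None
        elif kind == "end" and value == anc:
            n, start = open_anc.pop()
            closed_anc.append((n, start, pos))
            if not open_anc:              # outermost ancestor closed: join now
                if not recursive:
                    # Just-in-time join: every buffered descendant belongs to
                    # this single ancestor, so no IDs are compared.
                    results.extend((n, text) for _, text in desc_buf)
                else:
                    # ID-based join: compare (start, end) intervals to test
                    # containment of each descendant in each ancestor.
                    for a_n, a_start, a_end in sorted(closed_anc):
                        for d_start, text in desc_buf:
                            if a_start < d_start < a_end:
                                results.append((a_n, text))
                closed_anc.clear()
                desc_buf.clear()
                recursive = False         # switch back for the next element
    return results


# <a><b>x</b></a><a><b>y</b><a><b>z</b></a></a>
demo_tokens = [
    ("start", "a"), ("start", "b"), ("text", "x"), ("end", "b"), ("end", "a"),
    ("start", "a"), ("start", "b"), ("text", "y"), ("end", "b"),
    ("start", "a"), ("start", "b"), ("text", "z"), ("end", "b"), ("end", "a"),
    ("end", "a"),
]
print(structural_join(demo_tokens, "a", "b"))
# [(0, 'x'), (1, 'y'), (1, 'z'), (2, 'z')]
```

In the demo stream, the first `a` element is non-recursive, so its buffered `b` is emitted without any ID comparison. The second `a` contains a nested `a`, so the sketch falls back to interval comparison, and the inner `b` joins with both of its enclosing `a` elements. The buffers are cleared as soon as the outermost element closes, which reflects the goal of keeping buffered data to a minimum.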