Bulk data in main memory-based XQuery evaluation

Authors:
Stefanie Scherzinger
Affiliations:
Saarland University Database Group, Saarbrücken, Germany
Venue:
XIME-P '07 Proceedings of the 4th international workshop on XQuery implementation, experience and perspectives
Year:
2007

Abstract

XQuery processors that load their input into main memory suffer from huge memory demands. Yet for the evaluation of many queries, large parts of the input are actually irrelevant. XML document projection recognizes this irrelevant data and does not load it in the first place. However, there are also queries where little can be gained by projection. We have observed that these queries tend to require large parts of the input only for generating output. This suggests that such "bulk" data may be stored and treated differently from data that is actually traversed in query evaluation. In this paper, we present a technique to recognize bulk data while loading XML documents for the evaluation of composition-free XQuery. Our approach is coupled with XML document projection and utilizes a finite automaton that is expressly suited for matching path expressions. We show in an exploratory analysis that bulk data arises in practice, and discuss ongoing work along the lines of bulk-bypassing in main memory-based XQuery engines.
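
The distinction between traversed, bulk, and irrelevant data can be illustrated with a minimal sketch. It is not the technique from the paper: it replaces the finite automaton with plain prefix comparison on root-to-node paths, supports only child-axis paths, and merely counts elements instead of building a buffer. The document `DOC`, the function names `classify` and `load_with_bulk`, and the two path sets are hypothetical and chosen for illustration. The path sets correspond to a query such as `for $b in /bib/book where $b/year > 2000 return $b/abstract`, which inspects `year` but only copies `abstract` to the output.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input document, for illustration only.
DOC = """<bib>
  <book><year>2007</year><title>A</title>
        <abstract><p>long</p><p>text</p></abstract><price>10</price></book>
  <book><year>1999</year><title>B</title>
        <abstract><p>more</p></abstract><price>20</price></book>
</bib>"""

def classify(path, traversed, output_only):
    """Classify an element by its root-to-node path (a tuple of tag names).

    'traverse': the query navigates to or through this node.
    'bulk':     the node lies in a subtree that is only copied to the output.
    'skip':     the node is irrelevant and could be projected away.
    """
    n = len(path)
    # Nodes on the way to (or equal to) an inspected path must stay navigable.
    if any(t[:n] == path for t in traversed):
        return "traverse"
    # Proper ancestors of an output-only subtree must stay navigable as well.
    if any(n < len(o) and o[:n] == path for o in output_only):
        return "traverse"
    # The root of an output-only subtree and everything below it is bulk.
    if any(path[:len(o)] == o for o in output_only):
        return "bulk"
    return "skip"

def load_with_bulk(xml_text, traversed, output_only):
    """Stream over the document once and count elements per class."""
    counts = {"traverse": 0, "bulk": 0, "skip": 0}
    stack = []  # tags on the path from the root to the current element
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            counts[classify(tuple(stack), traversed, output_only)] += 1
        else:
            stack.pop()
    return counts

if __name__ == "__main__":
    traversed = {("bib", "book", "year")}        # inspected by the where clause
    output_only = {("bib", "book", "abstract")}  # only copied to the result
    print(load_with_bulk(DOC, traversed, output_only))
```

In this sketch, the `title` and `price` elements are candidates for projection, while the `abstract` subtrees are needed but never inspected, so an engine could in principle store them in a more compact, non-navigable form.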