Efficient evaluation of XQuery over streaming data

Authors:
Xiaogang Li;Gagan Agrawal
Affiliations:
Ohio State University, Columbus, OH;Ohio State University, Columbus, OH
Venue:
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Year:
2005


Abstract

With the growing popularity of XML and the emergence of the streaming data model, processing queries over streaming XML has become an important topic. This paper presents a new framework and a set of techniques for processing XQuery over streaming data. Compared to existing work on supporting XPath/XQuery over data streams, we make the following three contributions:

1. We propose a series of optimizations which transform XQuery queries so that they can be correctly executed with a single pass on the dataset.
2. We present a methodology for determining when an XQuery query, possibly after the transformations we introduce, can be correctly executed with only a single pass on the dataset.
3. We describe a code generation approach which can handle XQuery queries with user-defined aggregates, including recursive functions. We aggressively use static analysis and generate executable code, i.e., we do not require a query plan to be interpreted at runtime.

We have evaluated our implementation using several XMark benchmark queries and three other XQuery queries driven by real applications. Our experimental results show that, compared to Qizx/Open, Saxon, and Galax, our system: 1) is at least 25% faster on XMark queries with small datasets, 2) is significantly faster on XMark queries with larger datasets, 3) is at least one order of magnitude faster on the queries driven by real applications because, unlike the other systems, we can transform them to execute with a single pass, and 4) executes queries efficiently on large datasets on which other systems often run out of memory.
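
To make the idea of single-pass evaluation concrete, here is a minimal sketch in Python. It is not the authors' system, which generates code from XQuery; it only illustrates how several aggregates that a naive evaluator might compute in separate traversals can be folded into one pass over an XML stream. The function name `single_pass_stats`, the `price` element, and the sample document are all hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def single_pass_stats(stream, tag="price"):
    """Compute count, sum, and max of <tag> values in one pass over an XML stream.

    Illustrative assumption: each matching element holds a single number.
    """
    count, total, maximum = 0, 0.0, None
    # iterparse yields each element as soon as its closing tag has been read,
    # so the document is consumed incrementally rather than loaded as a tree.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == tag and elem.text:
            value = float(elem.text)
            count += 1
            total += value
            maximum = value if maximum is None else max(maximum, value)
        # Release the element's text and children to keep memory use low.
        elem.clear()
    return count, total, maximum

# Hypothetical sample input standing in for a data stream.
doc = (
    b"<items>"
    b"<item><price>10.0</price></item>"
    b"<item><price>2.5</price></item>"
    b"<item><price>7.5</price></item>"
    b"</items>"
)
print(single_pass_stats(io.BytesIO(doc)))  # prints (3, 20.0, 10.0)
```

All three aggregates are updated as each element arrives, so the input never needs to be re-read or fully buffered. This is the property that the transformations described in the abstract aim to establish for more general queries.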