Efficient string-based XML stream prefiltering

Authors:
Stefan Böttcher;Rita Hartel;Steffen Weber
Affiliations:
University of Paderborn (Germany), Fürstenallee, Paderborn;University of Paderborn (Germany), Fürstenallee, Paderborn;University of Paderborn (Germany), Fürstenallee, Paderborn
Venue:
ADC '12 Proceedings of the Twenty-Third Australasian Database Conference - Volume 124
Year:
2012


Abstract

Whenever a huge XML document has to be evaluated against a given XPath or XQuery query, parsing the whole document, e.g. into SAX events, is the baseline common to all evaluators. Typically, however, only a few parts of the document are relevant and can contribute to the query result. We propose an approach to string-based prefiltering of an XML document D that outputs a smaller document D' containing the relevant parts of D, such that the query Q evaluated on D' yields the same result as Q evaluated on D. In contrast to previous approaches, our approach extends efficient string-based XML prefiltering with support for XML Schema instead of DTDs, for recursive schemata, and for attribute filters. Our experiments on a 1 GB XMark document, averaged over 22 queries, show that our approach outperforms previous prefiltering approaches and reaches an average speed-up factor of 8 compared to XQuery evaluation without prefiltering.
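
The prefiltering property stated in the abstract (the query yields the same result on D and on the smaller D') can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses no schema information, and the function names `prefilter` and `evaluate`, the sample document, and the query `.//name` are all hypothetical. Python's `str.find` stands in for a fast string-matching routine. The sketch assumes that target elements are not nested in each other, carry no attributes, and are never self-closing, and it flattens the ancestor structure, so it is only valid for a descendant-axis query on the element name.

```python
import xml.etree.ElementTree as ET


def prefilter(doc: str, root: str, target: str) -> str:
    """Copy only the <target>...</target> substrings of doc into a new <root>.

    Illustrative assumptions: target elements are not nested in each other,
    have no attributes, are never self-closing, and no comment or CDATA
    section contains tag-like text.
    """
    open_tag, close_tag = f"<{target}>", f"</{target}>"
    parts = [f"<{root}>"]
    pos = 0
    while True:
        # Jump directly to the next opening tag without parsing anything else.
        start = doc.find(open_tag, pos)
        if start == -1:
            break
        end = doc.find(close_tag, start)
        if end == -1:
            break
        end += len(close_tag)
        parts.append(doc[start:end])  # keep the relevant fragment verbatim
        pos = end
    parts.append(f"</{root}>")
    return "".join(parts)


def evaluate(doc: str, path: str) -> list:
    """Evaluate a simple ElementTree path and return the text of each match."""
    return [elem.text for elem in ET.fromstring(doc).findall(path)]


# Hypothetical sample document D, loosely modelled on XMark's people section.
D = (
    "<site><people>"
    "<person><name>Ada</name><email>ada@example.org</email></person>"
    "<person><name>Bob</name><email>bob@example.org</email></person>"
    "</people></site>"
)

D_prime = prefilter(D, "site", "name")
# D_prime == "<site><name>Ada</name><name>Bob</name></site>"

Q = ".//name"
print(evaluate(D, Q) == evaluate(D_prime, Q))  # prints True
```

Only the much shorter D' is handed to the XML parser for query evaluation. An approach along the lines of the abstract would additionally have to preserve the structure the query depends on and handle attributes and recursive schemata, all of which this sketch leaves out.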