Effective pruning for XML structural match queries

Authors:
Yefei Xin;Zhen He;Jinli Cao
Affiliations:
Department of Computer Science and Computer Engineering, La Trobe University, VIC 3086, Australia;Department of Computer Science and Computer Engineering, La Trobe University, VIC 3086, Australia;Department of Computer Science and Computer Engineering, La Trobe University, VIC 3086, Australia
Venue:
Data & Knowledge Engineering
Year:
2010

Abstract

Extensible Markup Language (XML) is becoming the de facto standard for exchanging information over the Internet, which has led to a proliferation of XML documents and to increased research interest in this area. One of the main challenges is processing large collections of XML documents efficiently. Most current methods suffer from two drawbacks: they cannot complement one another to further enhance query processing performance without modifying the existing query processing engine, and they cannot be customized for different structural and usage characteristics. This paper presents a new approach to structural query processing called the Property-Driven Pruning Algorithm (PDPA), which offers the twin features of structural query processing independence and plug-and-play properties to overcome both drawbacks. PDPA consists of two phases, an offline phase and an online phase. During the offline phase, a list of pruning properties is added to the original XML documents. During the online phase, the input queries are modified with a list of carefully selected properties, which are used during query processing to quickly prune non-matching candidate documents. We propose an exhaustive algorithm and a greedy heuristic algorithm. Experimental results for both algorithms demonstrate that PDPA can improve XML query processing performance by up to twofold in a variety of situations.
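
The two-phase idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the single pruning property used here (the set of element names in a document), the attribute name `pp_tags`, and the helper names `annotate`, `candidate` and `run_query` are all assumptions made for illustration. In this sketch the property check runs as a separate step before the structural match, rather than being written into the query itself.

```python
import xml.etree.ElementTree as ET


def annotate(xml_text):
    """Offline phase (sketch): parse a document and store a pruning
    property on its root element. The property, the set of element
    names, is a hypothetical choice and not taken from the paper."""
    root = ET.fromstring(xml_text)
    tags = sorted({el.tag for el in root.iter()})
    root.set("pp_tags", " ".join(tags))  # hypothetical attribute name
    return root


def candidate(root, query_tags):
    """Online phase (sketch): cheap check against the stored property.
    A document lacking any element name used by the query cannot match."""
    stored = set(root.get("pp_tags", "").split())
    return set(query_tags) <= stored


def run_query(docs, path, query_tags):
    """Evaluate a path query, skipping documents that the property
    check rules out before the structural match is attempted."""
    results = []
    for root in docs:
        if not candidate(root, query_tags):
            continue  # pruned without running the structural match
        if root.findall(path):
            results.append(root)
    return results


if __name__ == "__main__":
    docs = [annotate("<a><b><c/></b></a>"), annotate("<a><d/></a>")]
    hits = run_query(docs, ".//b/c", {"b", "c"})
    print(len(hits))
```

Because the check only reads an attribute stored on the document root, the structural matcher (here `findall` from the standard library) is left unchanged, which mirrors the engine-independence goal stated in the abstract.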