Principles of Holism for sequential twig pattern matching

Authors:
Federica Mandreoli;Riccardo Martoglia;Pavel Zezula
Affiliations:
DII, Univ. di Modena e Reggio Emilia, Modena, Italy;DII, Univ. di Modena e Reggio Emilia, Modena, Italy;Masaryk University, Brno, Czech Republic
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2009

Citing 29
Cited 1

On supporting containment queries in relational database management systems

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
XRel: a path-based approach to storage and retrieval of XML documents using relational databases

ACM Transactions on Internet Technology (TOIT)
Covering indexes for branching path queries

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Holistic twig joins: optimal XML pattern matching

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Indexing and Querying XML Data for Regular Path Expressions

Proceedings of the 27th International Conference on Very Large Data Bases
Maintaining order in a linked list

STOC '82 Proceedings of the fourteenth annual ACM symposium on Theory of computing
The XML benchmark project

The XML benchmark project
ViST: a dynamic index method for querying XML data by tree structures

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Structural Joins: A Primitive for Efficient XML Query Pattern Matching

ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Accelerating XPath evaluation in any RDBMS

ACM Transactions on Database Systems (TODS)
Efficient processing of XML twig patterns with parent child edges: a look-ahead approach

Proceedings of the thirteenth ACM international conference on Information and knowledge management
Vectorizing and Querying Large XML Repositories

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
On the Sequencing of Tree Structures for XML Indexing

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
On boosting holism in XML twig pattern matching using structural indexing techniques

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Indexing Useful Structural Patterns for XML Query Processing

IEEE Transactions on Knowledge and Data Engineering
Efficient processing of XML path queries using the disk-based F&B Index

VLDB '05 Proceedings of the 31st international conference on Very large data bases
From region encoding to extended dewey: on efficient processing of XML twig pattern matching

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Optimizing cursor movement in holistic twig joins

Proceedings of the 14th ACM international conference on Information and knowledge management
Sequencing XML data and query twigs for fast pattern matching

ACM Transactions on Database Systems (TODS)
MonetDB/XQuery: a fast XQuery processor powered by a relational engine

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Twig2Stack: bottom-up processing of generalized-tree-pattern queries over XML documents

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Efficient structural joins on indexed XML documents

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
A Parsing Model Based on Ordered Tree Inclusion Matching

SNPD '07 Proceedings of the Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing - Volume 03
Holistic twig joins on indexed XML documents

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Staircase join: teach a relational DBMS to watch its (axis) steps

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
XQuery on SQL hosts

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
The Space Complexity of Processing XML Twig Queries Over Indexed Documents

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Twiglist: make twig pattern matching fast

DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications

Efficient management of multi-version clinical guidelines

Journal of Biomedical Informatics

Quantified Score

Hi-index	0.00

Visualization

Abstract

Modern applications face the challenge of dealing with structured and semi-structured data. They have to deal with complex objects, most of them presenting some kind of internal structure, which often forms a hierarchy. Though XML documents are the most known, chemical compounds, CAD drawings, web-sites and many other applications have to deal with similar problems. In such environments, ordered and unordered tree pattern matching are the fundamental search operations. One of the main thrusts of research activities for tree pattern matching is the class of holistic approaches. Their ultimate goal is to evaluate a query twig as a whole by relying on sequential access patterns and non trivial auxiliary storage structures, typically stored in main memory. Based on the pre/post-order ranks of individual tree nodes, we establish strong theoretical bases as a foundation for correct and efficient holistic pattern matching algorithms. In particular, we define and prove sufficient and necessary conditions to minimize the amount of data retained in memory, thus introducing a correct and complete framework on which different holistic solutions can be compared. We also show how these rules can be applied for building algorithms for ordered and unordered tree-pattern matching. Thanks to the above theoretical achievements, each holistic algorithm gains in efficiency as it is directly implemented on the adopted numbering scheme, avoids expensive matching refinements and keeps memory requirements stable. An experimental analysis and comparison with previous approaches confirms the superiority of our approach tested on synthetic as well as real-life data sets.