XPath query processing improvements

Authors:
P. Mark Pettovello;Farshad Fotouhi
Affiliations:
Wayne State University, Detroit, Michigan;Wayne State University, Detroit, Michigan
Venue:
Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research
Year:
2010


Abstract

Much research has adapted relational technology to XML and XPath query processing; other efforts have focused on native XML databases, and some on hybrid approaches. This paper presents a hybrid design: we extend the use of path summary indexes by combining them with partitioned indexes on schema-less XML documents to accelerate XPath query processing. Efficient XPath query processing is important because XPath is the query language used for node selection within XQuery. To index an XML document, each node is assigned a path identifier, with one identifier for every distinct root-to-node path. A separate XML path summary index is created, itself encoded as an XML document, which summarizes the document structure by eliminating the path redundancies that are inherent in many XML document instances. Structure summaries of this kind are widely adopted. Two additional supporting indexes are used: first, the XML structure is placed into a structure index that is partitioned by path identifier; second, the XML element and attribute values are placed into a separate value index that is partitioned by the same path identifier. In this way we integrate structure summaries, complete structure, and values into a unified index. Supporting this integration requires implementation and query methods specific to our design. XPath queries, either partially or fully, are first executed against the summary index to derive candidate path identifiers, which are placed into a specialized hash map tree cursor. We introduce the partitioned branching path join, a twig join that enables efficient index nested loop joins between B+-tree partitions of the same structure relation, guided by the hash map tree cursor. We conclude with performance results for several queries on our lightweight prototype system, which demonstrate that our combination of methods matches or outperforms existing high-end database engines when determining node sequences for several XPath queries.
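
The indexing idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes plain Python dictionaries in place of B+-tree partitions, numbers nodes in document order, and supports only absolute name-test paths using `/` and `//`. The hash map tree cursor and the partitioned branching path join are not modeled, and the names `build_indexes`, `candidate_path_ids`, and `evaluate` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_indexes(xml_text):
    """Build a path summary plus structure and value indexes partitioned by path id."""
    root = ET.fromstring(xml_text)
    path_ids = {}                   # summary: label path string -> path id
    structure = defaultdict(list)   # path id -> node numbers (document order)
    values = defaultdict(list)      # path id -> (node number, text value)
    counter = 0

    def visit(elem, parent_path):
        nonlocal counter
        path = parent_path + "/" + elem.tag
        # Nodes that share a root-to-node label path share one path id.
        pid = path_ids.setdefault(path, len(path_ids) + 1)
        counter += 1
        node_no = counter
        structure[pid].append(node_no)
        text = (elem.text or "").strip()
        if text:
            values[pid].append((node_no, text))
        for child in elem:
            visit(child, path)

    visit(root, "")
    return path_ids, structure, values


def candidate_path_ids(path_ids, query):
    """Run the query against the summary only and return the matching path ids."""
    # '//' may skip any number of intermediate steps; '/' is a single child step.
    parts = [re.escape(p) for p in query.split("//")]
    pattern = re.compile("(?:/[^/]+)*/".join(parts))
    return sorted(pid for path, pid in path_ids.items() if pattern.fullmatch(path))


def evaluate(path_ids, structure, query):
    """Fetch nodes only from the partitions named by the candidate path ids."""
    nodes = []
    for pid in candidate_path_ids(path_ids, query):
        nodes.extend(structure[pid])
    return sorted(nodes)
```

For a document such as `<lib><book><title>A</title></book></lib>`, calling `evaluate(path_ids, structure, "//title")` consults the summary first and then reads only the partitions whose path ends in `title`, which is the pruning effect the abstract attributes to path-identifier partitioning.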