This article proposes algorithms for evaluating XPath queries over an XML tree that is partitioned both horizontally and vertically and is distributed across a number of sites. The key idea is partial evaluation: the whole query is sent to every site; the sites partially evaluate it in parallel and send their results, as compact (Boolean) functions, to a coordinator, which combines them to obtain the answer. This approach offers the following performance guarantees. First, each site is visited at most twice for data-selecting XPath queries and only once for Boolean XPath queries. Second, the network traffic is determined by the answer to the query rather than by the size of the tree. Third, the total computation is comparable to that of centralized algorithms on the tree stored at a single site, regardless of how the tree is fragmented and distributed. We also present a MapReduce algorithm for evaluating Boolean XPath queries, based on partial evaluation. In addition, we provide algorithms for evaluating XPath queries on very large XML trees in a centralized setting. We show both analytically and empirically that our techniques scale with large trees and complex XPath queries. These results, we believe, illustrate the usefulness and potential of partial evaluation, in distributed systems as well as centralized XML stores, for evaluating XPath queries and beyond.
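To make the idea of partial evaluation concrete, the following is a minimal Python sketch and not the article's algorithm. It assumes a single hard-coded Boolean query, //a//b ("is there an a node with a b descendant?"), a tuple-based tree encoding, and virtual nodes of the form ("@", fragment_id) that mark where another fragment hangs. All names (partial_eval, solve, evaluate, FRAGMENTS) are hypothetical. Each site returns Boolean formulas over variables that stand for the unknown answers of its child fragments, and the coordinator resolves them; here the sites are simulated by a sequential loop rather than run in parallel.

```python
# Sketch only: fixed query //a//b, hypothetical names and encodings.
# A node is (label, [children]); a virtual node is ("@", fragment_id).
# A formula is either True or a frozenset of variables read as a disjunction
# (the empty frozenset means False). A variable is (kind, fragment_id).

def disj(x, y):
    """Disjunction of two formulas."""
    if x is True or y is True:
        return True
    return x | y

def partial_eval(node):
    """Return (has_b, match) formulas for the subtree rooted at node.

    has_b: the subtree (including node itself) contains a node labeled b.
    match: the subtree contains an a node with a proper descendant labeled b.
    """
    label, children = node
    if label == "@":
        # Unknown at this site: leave variables for the child fragment.
        fid = children
        return frozenset({("has_b", fid)}), frozenset({("match", fid)})
    below_b, match = frozenset(), frozenset()
    for child in children:
        child_b, child_match = partial_eval(child)
        below_b = disj(below_b, child_b)
        match = disj(match, child_match)
    if label == "a":
        match = disj(match, below_b)
    has_b = True if label == "b" else below_b
    return has_b, match

def solve(formulas, var):
    """Coordinator step: resolve a variable. Fragments form a tree, so no cycles."""
    formula = formulas[var]
    if formula is True:
        return True
    return any(solve(formulas, v) for v in formula)

def evaluate(fragments, root_id):
    formulas = {}
    for fid, root in fragments.items():  # each iteration stands in for one site
        has_b, match = partial_eval(root)
        formulas[("has_b", fid)] = has_b
        formulas[("match", fid)] = match
    return solve(formulas, ("match", root_id))

# Example: the a node is in fragment F0, its b descendant is in fragment F1.
FRAGMENTS = {
    "F0": ("r", [("a", [("@", "F1")]), ("c", [("@", "F2")])]),
    "F1": ("x", [("b", [])]),
    "F2": ("c", []),
}

if __name__ == "__main__":
    print(evaluate(FRAGMENTS, "F0"))
```

In this sketch each fragment is traversed exactly once, and what a site sends to the coordinator is a pair of small formulas whose size depends on the number of virtual nodes in the fragment, not on the fragment's size. That is the intuition behind the single-visit and low-traffic guarantees stated for Boolean queries, though the article's actual algorithms and bounds are not reproduced here.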