Partial evaluation has recently proven to be an effective technique for evaluating Boolean XPath queries over a fragmented tree that is distributed over a number of sites. What was left open is whether the technique also applies to generic data-selecting XPath queries. In contrast to a Boolean query, which returns a single truth value, a generic XPath query returns a set of elements, which makes it difficult to avoid excessive data shipping. This paper settles the question in the affirmative by providing evaluation algorithms and optimizations for generic XPath queries in the same distributed and fragmented setting. These algorithms exploit parallelism and retain the performance guarantees of their counterparts for Boolean queries, regardless of how the tree is fragmented and distributed. First, each site is visited at most three times, and at most twice when the optimizations are in place. Second, the network traffic is determined by the final answer of the query rather than by the size of the tree, so no unnecessary data is shipped. Third, the total computation is comparable to that of centralized algorithms running on the whole tree stored at a single site. We show both analytically and experimentally that our algorithms and optimizations are scalable and efficient on large trees and complex XPath queries.
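To make the idea of partial evaluation over a fragmented tree concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a single hard-coded query, `//a[.//b]` (select every `a` element that has a `b` descendant), and simulates the sites inside one process. The names `Node`, `partial_eval`, `evaluate`, and the fragment ids `F0` to `F2` are hypothetical. Each site evaluates what it can locally and leaves one Boolean variable per missing sub-fragment; a coordinator resolves the variables, and only confirmed answers are collected.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    id: str
    label: str
    children: List["Node"] = field(default_factory=list)
    # If set, this node is a placeholder for a fragment stored at another site.
    virtual: Optional[str] = None


def partial_eval(root):
    """Run at one site. Returns ((known, deps), candidates).

    (known, deps) encodes "this fragment contains a 'b'": true if known is
    True or if any fragment in deps turns out to contain a 'b'.
    candidates lists each local 'a' node with the same kind of condition
    for "has a 'b' descendant".
    """
    candidates = []

    def visit(n):
        if n.virtual is not None:
            # Unknown locally: leave a variable naming the missing fragment.
            return False, frozenset({n.virtual})
        known, deps = False, frozenset()
        for child in n.children:
            k, d = visit(child)
            known, deps = known or k, deps | d
        if n.label == "a":
            candidates.append((n.id, known, deps))
        return known or n.label == "b", deps

    return visit(root), candidates


def evaluate(fragments):
    # Visit 1: every site partially evaluates its own fragment
    # (independent per site, so this step could run in parallel).
    partial = {fid: partial_eval(root) for fid, root in fragments.items()}

    # Coordinator: resolve the fragment variables from the small formulas.
    resolved = {}

    def resolve(fid):
        if fid not in resolved:
            (known, deps), _ = partial[fid]
            resolved[fid] = known or any(resolve(d) for d in deps)
        return resolved[fid]

    # Visit 2: collect only the candidates whose condition holds.
    answers = []
    for _, cands in partial.values():
        for node_id, known, deps in cands:
            if known or any(resolve(d) for d in deps):
                answers.append(node_id)
    return sorted(answers)


# A tree split into three fragments; F0 refers to F1, and F1 refers to F2.
fragments = {
    "F0": Node("r", "r", [
        Node("a1", "a", [Node("v1", "", virtual="F1")]),
        Node("a2", "a", [Node("c1", "c")]),
    ]),
    "F1": Node("x", "c", [Node("v2", "", virtual="F2"), Node("a3", "a")]),
    "F2": Node("b1", "b"),
}

print(evaluate(fragments))
```

In this sketch each site is touched twice, and what travels between sites is a small formula per fragment plus the identifiers of the selected nodes, not the fragments themselves. That mirrors the spirit of the guarantees stated above, under the simplifying assumptions listed in the lead-in.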