Parallelization of XPath queries using multi-core processors: challenges and experiences
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
The wide availability of commodity multi-core systems presents an opportunity to address the latency issues that have plagued XML query processing. However, simply executing multiple XML queries over multiple cores addresses only throughput; intra-query parallelization is needed to exploit multiple cores for lower latency. To this end, this paper investigates the parallelization of individual XPath queries on shared-address-space multi-core processors. Much previous work on parallelizing XPath targeted a distributed setting and did not exploit the shared-memory parallelism of multi-core systems. We propose a novel, end-to-end parallelization framework that determines the optimal way of parallelizing an XML query. The decision is statistics-based, relying on both the specifics of the query and statistics of the data. At each stage of the parallelization process, we evaluate three alternative approaches: data, query, and hybrid partitioning. For a given XPath query, our parallelization algorithm uses XML statistics to estimate the relative efficiencies of these alternatives and to find an optimal parallel XPath processing plan. Our experiments on well-known XML documents validate our parallel cost model and optimization framework, and demonstrate that XPath processing can be accelerated on commodity multi-core systems.
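The difference between data and query partitioning can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy document, the function names (`data_partition`, `query_partition`, `estimate_cost`), and the cost formula are all assumptions made for illustration, and hybrid partitioning is omitted. It uses the limited XPath subset of the standard `xml.etree.ElementTree` module, and in CPython the threads mainly show the structure of each plan rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
import xml.etree.ElementTree as ET

# Toy document (hypothetical, loosely modeled on auction-style benchmark data).
DOC = """<site><regions>
  <asia><item id="a1"/><item id="a2"/></asia>
  <europe><item id="e1"/></europe>
  <africa><item id="f1"/><item id="f2"/><item id="f3"/></africa>
</regions></site>"""


def data_partition(root, prefix, suffix, workers=2):
    """Data partitioning: evaluate `prefix` serially, split the matched
    context nodes into contiguous chunks, and evaluate `suffix` on each
    chunk in parallel. Contiguous chunks keep document order."""
    context = root.findall(prefix)
    size = max(1, -(-len(context) // workers))  # ceiling division
    chunks = [context[i:i + size] for i in range(0, len(context), size)]

    def run(chunk):
        return [hit for node in chunk for hit in node.findall(suffix)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(run, chunks))
    return [hit for part in parts for hit in part]


def query_partition(root, subqueries, workers=2):
    """Query partitioning: rewrite the query into independent sub-queries,
    run each over the whole document in parallel, and concatenate."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(root.findall, subqueries))
    return [hit for part in parts for hit in part]


def estimate_cost(serial_work, task_work, workers):
    """Toy cost model (an assumption, not the paper's): serial work plus the
    heaviest worker load when tasks are balanced greedily, largest first.
    This is an idealization; the functions above do not schedule this way."""
    loads = [0] * workers
    for work in sorted(task_work, reverse=True):
        loads[loads.index(min(loads))] += work
    return serial_work + max(loads)


root = ET.fromstring(DOC)

# Both plans answer the same query: regions/*/item
ids_data = [e.get("id") for e in data_partition(root, "regions/*", "item")]
ids_query = [e.get("id") for e in query_partition(
    root, ["regions/asia/item", "regions/europe/item", "regions/africa/item"])]

# Statistics: number of items under each region.
stats = {r.tag: len(r.findall("item")) for r in root.findall("regions/*")}
counts = list(stats.values())

plans = {
    # Serial prefix visits each region once, then items are split up.
    "data": estimate_cost(len(counts), counts, workers=2),
    # No serial step, but each sub-query repeats one prefix step.
    "query": estimate_cost(0, [c + 1 for c in counts], workers=2),
}
best = min(plans, key=plans.get)
print(ids_data, ids_query, plans, best)
```

Both plans return the same nodes; only the estimated cost differs. This is the kind of comparison a statistics-driven optimizer would make before committing to a plan.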