Many commercial RDBMSs use cost-based query optimization built on dynamic programming (DP) to efficiently generate the optimal query execution plan. However, optimization time increases rapidly for queries that join more than 10 tables. Randomized or heuristic search algorithms reduce optimization time for large join queries by considering fewer plans, at the cost of plan optimality. Although commercial systems that execute query plans in parallel have existed for over a decade, the optimization of such plans still occurs serially. While modern microprocessors employ multiple cores to accelerate computations, parallelizing query optimization to exploit multi-core parallelism is not as straightforward as it may seem. The DP used in join enumeration belongs to the challenging nonserial polyadic DP class because of its non-uniform data dependencies. In this paper, we propose a comprehensive and practical solution for parallelizing query optimization on multi-core processor architectures, including a parallel join enumeration algorithm and several alternative ways to allocate work to threads to balance their load. We also introduce a novel data structure, the skip vector array, that significantly reduces the generation of infeasible join partitions. This solution has been prototyped in PostgreSQL. Extensive experiments using various query graph topologies confirm that our algorithms allocate the work evenly, thereby achieving almost linear speedup. Our parallel join enumeration algorithm, enhanced with the skip vector array, outperforms the conventional generate-and-filter DP algorithm by up to two orders of magnitude for star queries: linear speedup due to parallelism and an order-of-magnitude performance improvement due to the skip vector array.
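The abstract does not spell out the baseline it compares against, so the following Python sketch is only an illustration of what a size-driven, generate-and-filter DP join enumerator might look like. It is not the paper's parallel algorithm and does not implement the skip vector array. The function name `dp_size`, the bitmask encoding of table sets, and the cost model (the sum of intermediate result sizes) are all assumptions made for this example; PostgreSQL's actual cost model is far more detailed. The counters `tried` and `feasible` show how many candidate pairs are generated only to be filtered out.

```python
def dp_size(card, edges):
    """Serial generate-and-filter DP over table subsets, grown by size.

    card  -- hypothetical base-table cardinalities, one per table
    edges -- hypothetical join graph: {(i, j): selectivity}
    Returns ((cost, rows, plan), tried, feasible) for the full join.
    """
    n = len(card)
    # best[mask] = (cost, rows, plan) for the table set encoded by mask
    best = {1 << i: (0.0, float(card[i]), f"T{i}") for i in range(n)}
    by_size = {1: [1 << i for i in range(n)]}
    tried = feasible = 0

    def cross_selectivity(left, right):
        # Product of selectivities of all edges crossing left/right,
        # or None when no join predicate connects the two sides.
        sel, connected = 1.0, False
        for (i, j), s in edges.items():
            if ((left >> i) & 1 and (right >> j) & 1) or \
               ((left >> j) & 1 and (right >> i) & 1):
                sel *= s
                connected = True
        return sel if connected else None

    for size in range(2, n + 1):
        by_size[size] = []
        # Plans of this size depend only on plans of smaller sizes.
        for small in range(1, size // 2 + 1):
            for left in by_size[small]:
                for right in by_size[size - small]:
                    tried += 1                      # generate ...
                    if left & right:                # ... then filter:
                        continue                    # overlapping sets
                    sel = cross_selectivity(left, right)
                    if sel is None:
                        continue                    # would be a cross product
                    feasible += 1
                    lcost, lrows, lplan = best[left]
                    rcost, rrows, rplan = best[right]
                    rows = lrows * rrows * sel
                    cost = lcost + rcost + rows     # assumed toy cost model
                    mask = left | right
                    if mask not in best:
                        by_size[size].append(mask)
                    if mask not in best or cost < best[mask][0]:
                        best[mask] = (cost, rows, f"({lplan} JOIN {rplan})")

    return best[(1 << n) - 1], tried, feasible


# Star query: hub T0 joined to three spokes (made-up statistics).
star_card = [1000, 10, 20, 30]
star_edges = {(0, 1): 0.01, (0, 2): 0.01, (0, 3): 0.01}
(cost, rows, plan), tried, feasible = dp_size(star_card, star_edges)
print(plan, cost, tried, feasible)
```

In this sketch, most candidate pairs for the star query are rejected because both sides contain the hub table. That is the kind of wasted generation the abstract says the skip vector array is designed to reduce. Because each size level reads only results from smaller levels, the pairs within one level are natural units of work to divide among threads, although how the paper actually allocates them is not described in the abstract.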