Many skew-handling algorithms have been proposed for simple join operations, but they remain generally inefficient for θ-joins and multi-joins. This paper proposes a new method for self-balancing equi-join operations on shared-nothing (SN) machines. It offers deterministic, near-perfect load balancing through flexible control of communications in intra-transaction parallelism. The new algorithm combines a balanced data-distribution strategy with a pure hash join. Its predictably low join-product skew and attribute-value skew make it suitable for repeated use in multi-join operations. The tradeoff between balancing overhead and speedup is analyzed in the bulk-synchronous parallel (BSP) computing model. The scalable model predicts negligible join-product skew and near-linear speedup for any combination of selectivity, skew, and number of processors. A series of tests confirms this prediction.
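To make the terms in the abstract concrete, the following minimal Python sketch shows how a pure hash join can suffer from join-product skew when one join value is frequent, together with the standard BSP cost of a superstep (w + h·g + l). It is an illustration only and not the paper's algorithm: the function names, the toy relations, and the use of `key % p` as a stand-in hash function are all assumptions made for this example.

```python
from collections import Counter

def hash_partition(keys, p):
    """Send each tuple to processor key % p (a stand-in hash, assumed here)."""
    buckets = [[] for _ in range(p)]
    for k in keys:
        buckets[k % p].append(k)
    return buckets

def join_output_per_processor(r_buckets, s_buckets):
    """Count the equi-join result tuples each processor produces locally."""
    sizes = []
    for r, s in zip(r_buckets, s_buckets):
        r_count, s_count = Counter(r), Counter(s)
        # Each join value contributes (occurrences in R) * (occurrences in S).
        sizes.append(sum(n * s_count[k] for k, n in r_count.items()))
    return sizes

def skew(loads):
    """Imbalance ratio: maximum load over average load (1.0 is perfect)."""
    avg = sum(loads) / len(loads)
    return max(loads) / avg if avg else 1.0

def bsp_superstep_cost(w, h, g, l):
    """Standard BSP superstep cost: local work w, an h-relation at gap g, barrier l."""
    return w + h * g + l

# Hypothetical toy relations: join value 0 is frequent in both R and S.
R = [0] * 6 + [1, 2, 3]
S = [0] * 4 + [1, 2, 3]
p = 4

r_buckets = hash_partition(R, p)
s_buckets = hash_partition(S, p)
out = join_output_per_processor(r_buckets, s_buckets)
print("join output per processor:", out)
print("join-product skew (max/avg):", round(skew(out), 2))
```

On this toy input, processor 0 produces 24 of the 27 result tuples, so the slowest processor dominates the superstep. A balanced data-distribution step of the kind the abstract describes aims to bring the max/avg ratio close to 1, at the price of extra communication that the BSP cost accounts for through the h·g term.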