Different types of data skew can cause load imbalance in parallel joins under the shared-nothing architecture. We study one important type of skew, join product skew (JPS). We propose a static approach based on frequency classes that assumes the distribution of the join attribute values is known in advance. The approach stems from the observation that the join selectivity can be expressed as a sum of products of the frequencies of the join attribute values. Consequently, an assignment of join sub-tasks that takes the magnitude of these frequency products into account can alleviate join product skew. Building on this observation, we present an algorithm, called Handling Join Product Skew (HJPS), that addresses this type of skew.
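The observation about frequency products can be illustrated with a small sketch. The Python code below is not the HJPS algorithm, whose details the abstract does not give. It only shows, under stated assumptions, that the size of an equi-join result (the numerator of the join selectivity) equals the sum over join values of the product of their frequencies in the two relations, and how those products could drive a simple load-balancing heuristic. The function names `frequency_products` and `assign_greedy`, the sample data, and the largest-product-first greedy rule are all hypothetical choices made for illustration.

```python
from collections import Counter
import heapq


def frequency_products(r_keys, s_keys):
    """Map each join value v to f_R(v) * f_S(v).

    The product is the number of result tuples that v contributes
    to the equi-join of the two relations.
    """
    f_r, f_s = Counter(r_keys), Counter(s_keys)
    # Only values present in both relations produce join results.
    return {v: f_r[v] * f_s[v] for v in f_r.keys() & f_s.keys()}


def assign_greedy(products, n_nodes):
    """Assign join values to nodes, largest product first.

    Illustrative heuristic (an assumption, not HJPS): each value goes
    to the node with the smallest accumulated result size so far.
    Returns the value-to-node mapping and the per-node loads.
    """
    heap = [(0, node) for node in range(n_nodes)]
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(products.items(), key=lambda kv: (-kv[1], str(kv[0])))
    for value, product in ordered:
        load, node = heapq.heappop(heap)
        assignment[value] = node
        heapq.heappush(heap, (load + product, node))
    loads = [0] * n_nodes
    for value, node in assignment.items():
        loads[node] += products[value]
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical join attribute columns of two relations R and S.
    r_keys = ["a"] * 4 + ["b"] * 2 + ["c"] + ["d"]
    s_keys = ["a"] * 3 + ["b"] * 3 + ["c"] * 2 + ["e"]

    products = frequency_products(r_keys, s_keys)
    join_size = sum(products.values())
    selectivity = join_size / (len(r_keys) * len(s_keys))
    assignment, loads = assign_greedy(products, n_nodes=2)

    print("frequency products:", products)
    print("join result size:", join_size)
    print("join selectivity:", selectivity)
    print("per-node result sizes:", loads)
```

In this sample the value `a` alone contributes 12 of the 20 result tuples, so a partitioning that looks only at input tuple counts could still leave one node producing most of the output. Balancing on the frequency products instead places `a` on one node and `b` and `c` on the other.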