Large enterprises rely on parallel database management systems (PDBMS) to process ever-increasing data volumes and complex queries. The scalability and performance of a PDBMS come from balancing load across all nodes in the system. Skewed processing significantly slows query response time and degrades overall system performance. Business intelligence tools used by enterprises frequently generate large numbers of outer joins and demand high performance from the underlying database system. Although handling skewed processing for inner joins in PDBMS has been researched extensively, there is no known research on data skew handling for parallel outer joins. We propose a simple and efficient outer join algorithm called OJSO (Outer Join Skew Optimization) to improve the performance and scalability of parallel outer joins. Our experimental results show that the OJSO algorithm significantly reduces query elapsed time in the presence of data skew.
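
To make the load-balancing point concrete, the following is a minimal sketch of how a skewed join key unbalances a hash-partitioned system. It is not the OJSO algorithm, which the abstract does not specify. The function names `partition_load` and `skew_ratio`, the four-node setup, and the use of modulo in place of a real hash function are all assumptions made for illustration.

```python
from collections import Counter


def partition_load(keys, num_nodes):
    """Count the rows each node receives when rows are partitioned on the join key.

    Assumption: keys are integers and `key % num_nodes` stands in for the
    hash function a real system would use to redistribute rows.
    """
    load = Counter()
    for key in keys:
        load[key % num_nodes] += 1
    return [load[node] for node in range(num_nodes)]


def skew_ratio(loads):
    """Largest node load divided by the mean load; 1.0 means perfectly balanced."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical inputs: 1000 rows each, partitioned across 4 nodes.
uniform_keys = list(range(1000))             # every key value appears once
skewed_keys = [7] * 900 + list(range(100))   # one key value dominates

uniform_loads = partition_load(uniform_keys, 4)
skewed_loads = partition_load(skewed_keys, 4)

print(uniform_loads, skew_ratio(uniform_loads))
print(skewed_loads, skew_ratio(skewed_loads))
```

In the skewed case, all rows sharing the dominant key land on one node, so that node determines the elapsed time of the join while the others sit mostly idle.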