Load Balancing for Parallel Query Execution on NUMA Multiprocessors

Authors:
Luc Bouganim;Daniela Florescu;Patrick Valduriez
Affiliations:
INRIA Rocquencourt, France. E-mail: luc.bouganim@inria.fr;INRIA Rocquencourt, France. E-mail: daniela.florescu@inria.fr;INRIA Rocquencourt, France. E-mail: patrick.valduriez@inria.fr
Venue:
Distributed and Parallel Databases
Year:
1999

Abstract

To scale up to high-end configurations, shared-memory multiprocessors are evolving towards Non-Uniform Memory Access (NUMA) architectures. In this paper, we address the central problem of load balancing during parallel query execution on NUMA multiprocessors. We first show that an execution model for NUMA should not use data partitioning (as shared-nothing systems do) but should strive to exploit efficient shared-memory strategies such as Synchronous Pipelining (SP). However, SP has problems in NUMA, especially with skewed data. Thus, we propose a new execution strategy that solves these problems. The basic idea is to allow partial materialization of intermediate results and to make them progressively public, i.e., able to be processed by any processor, as needed to avoid processor idle time. Hence, we call this strategy Progressive Sharing (PS). We conducted a performance comparison using an implementation of SP and PS on a 72-processor KSR1 computer, with many queries and large relations. With no skew, both SP and PS achieve linear speed-up. However, the impact of skew is very severe on SP performance, while it is insignificant on PS. Finally, we show that, in NUMA, PS can also be beneficial in executing several pipeline chains concurrently.
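
The intuition behind making work "progressively public" can be illustrated with a toy model. The sketch below is not the paper's PS algorithm or its SP baseline. It is a minimal discrete-time simulation under stated assumptions: every work unit costs one time step, remote access is free, and a hypothetical parameter `chunk` controls how many units a loaded processor publishes at a time. The function names `makespan_private` and `makespan_progressive` are likewise invented for illustration.

```python
def makespan_private(loads):
    """Baseline with no sharing: each processor handles only its own units,
    so the finish time is set by the most loaded processor."""
    return max(loads)


def makespan_progressive(loads, chunk=4):
    """Toy model of progressive publication (an assumption, not the paper's code).

    Each step, a processor with private work consumes one unit of it.
    An idle processor takes one unit from the public pool; if the pool is
    empty, the most loaded processor first publishes up to `chunk` units.
    """
    private = list(loads)
    public = 0
    steps = 0
    while sum(private) + public > 0:
        steps += 1
        for p in range(len(private)):
            if private[p] > 0:
                private[p] -= 1          # work on own (private) data
                continue
            if public == 0:
                # Publish work only when someone would otherwise be idle.
                donor = max(range(len(private)), key=lambda q: private[q])
                moved = min(chunk, private[donor])
                private[donor] -= moved
                public += moved
            if public > 0:
                public -= 1              # idle processor takes public work
    return steps


if __name__ == "__main__":
    skewed = [100, 4, 4, 4]              # one processor holds most of the work
    print("private only:", makespan_private(skewed))
    print("progressive :", makespan_progressive(skewed))
```

In this model the skewed workload finishes far sooner with publication than without, while a perfectly balanced workload is unaffected because nothing is ever published. A real NUMA system would also have to account for the cost of remote memory access, which this sketch ignores.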