To scale up to high-end configurations, shared-memory multiprocessors are evolving towards Non-Uniform Memory Access (NUMA) architectures. In this paper, we address the central problem of load balancing during parallel query execution in NUMA multiprocessors. We first show that an execution model for NUMA should not use data partitioning (as shared-nothing systems do) but should strive to exploit efficient shared-memory strategies like Synchronous Pipelining (SP). However, SP has problems in NUMA, especially with skewed data. Thus, we propose a new execution strategy which solves these problems. The basic idea is to allow partial materialization of intermediate results and to make them progressively public, i.e., able to be processed by any processor, as needed to avoid processor idle times. Hence, we call this strategy Progressive Sharing (PS). We conducted a performance comparison using an implementation of SP and PS on a 72-processor KSR1 computer, with many queries and large relations. With no skew, both SP and PS achieve linear speed-up. However, the impact of skew is very severe on SP performance while it is insignificant on PS. Finally, we show that, in NUMA, PS can also be beneficial in executing several pipeline chains concurrently.