Selection of nodes for distributing relations in parallel database

Authors:
Xuan Ping
Affiliations:
School of Computer Science and Technology, Heilongjiang University, Harbin, China
Venue:
CAR'10 Proceedings of the 2nd international Asia conference on Informatics in control, automation and robotics - Volume 1
Year:
2010

Abstract

In a parallel database system, good data placement can greatly improve the execution efficiency of multi-join queries. Network bandwidth is always the bottleneck of parallel database systems based on PC clusters, so data communication among nodes adds time cost when join operations are executed. This paper proposes a node selection algorithm that takes data redistribution into consideration and reduces the additional communication cost. Furthermore, it takes into account intra-operator parallelism, independent inter-operator parallelism, and pipelined parallelism in order to exploit the parallelism available in PC cluster systems. The experimental results indicate that the algorithm performs well and helps improve the execution efficiency of parallel multi-join queries.
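
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of choosing join nodes so that little data has to be redistributed. It is not the paper's method. The cost model (every byte stored on a non-selected node must be shipped once), the function names `redistribution_cost` and `select_nodes`, and the sample placement are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical placement format: relation name -> {node id: bytes stored there}.
Placement = Dict[str, Dict[int, int]]


def redistribution_cost(placement: Placement, nodes: List[int]) -> int:
    """Bytes to ship so that all join input resides on `nodes`.

    Assumed cost model: data already on a selected node is free,
    data on any other node must be sent over the network once.
    """
    chosen = set(nodes)
    return sum(
        size
        for fragments in placement.values()
        for node, size in fragments.items()
        if node not in chosen
    )


def select_nodes(placement: Placement, k: int) -> List[int]:
    """Pick the k nodes that already hold the most join input.

    Under the assumed cost model this minimizes the shipped bytes,
    because the cost equals total bytes minus bytes on selected nodes.
    """
    local: Dict[int, int] = {}
    for fragments in placement.values():
        for node, size in fragments.items():
            local[node] = local.get(node, 0) + size
    # Sort by descending local bytes; break ties by node id for determinism.
    ranked = sorted(local, key=lambda n: (-local[n], n))
    return sorted(ranked[:k])


# Illustrative data only: relations R and S spread over four nodes.
placement = {
    "R": {0: 400, 1: 100, 2: 50},
    "S": {1: 300, 3: 200},
}
best = select_nodes(placement, 2)
print(best, redistribution_cost(placement, best))
```

Here `k` stands in for the degree of intra-operator parallelism chosen for the join. A fuller treatment would also have to weigh load balance and the inter-operator and pipelined parallelism that the abstract mentions, which this sketch ignores.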