Considering data skew factor in multi-way join query optimization for parallel execution

  • Authors:
  • Kien A. Hua; Yo Lung Lo; Honesty C. Young

  • Affiliations:
  • University of Central Florida, Orlando, FL; University of Central Florida, Orlando, FL; IBM Research Division, Almaden Research Center, San Jose, CA

  • Venue:
  • The VLDB Journal — The International Journal on Very Large Data Bases - Parallelism in database systems
  • Year:
  • 1993

Abstract

A consensus has emerged on the parallel architecture for very large database management: a shared-nothing hardware organization. The computation model of this architecture, however, is very sensitive to skew in the tuple distribution. Several parallel join algorithms with dynamic load balancing capabilities have recently been proposed to address this issue, but none of them considers multi-way joins. In this article, we propose a dynamic load balancing technique for multi-way joins and investigate the effect of load balancing on query optimization. In particular, we present a join-ordering strategy that takes load balancing into account. Our performance study indicates that the proposed query optimization technique can provide a substantial performance improvement over conventional approaches.
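
To make the sensitivity to skew concrete, the following minimal sketch shows how a single frequent join-key value can overload one node when tuples are partitioned across a shared-nothing system. It is an illustration only, not the technique proposed in the article: the function names `partition_loads` and `skew_factor`, the modulo partitioning rule, the four-node setup, and the synthetic key lists are all assumptions made for this example.

```python
def partition_loads(join_keys, num_nodes):
    """Count the tuples each node receives when tuples are routed by key modulo num_nodes.

    Modulo routing stands in for a hash partitioning function (assumption for this sketch).
    """
    loads = [0] * num_nodes
    for key in join_keys:
        loads[key % num_nodes] += 1
    return loads


def skew_factor(loads):
    """Ratio of the heaviest node's load to the average load; 1.0 means perfectly balanced."""
    return max(loads) / (sum(loads) / len(loads))


# Synthetic data (hypothetical): 1000 tuples in each case.
uniform_keys = list(range(1000))                 # every key value appears once
skewed_keys = [0] * 700 + list(range(1, 301))    # key 0 accounts for 70% of the tuples

uniform_loads = partition_loads(uniform_keys, 4)
skewed_loads = partition_loads(skewed_keys, 4)

print(uniform_loads, skew_factor(uniform_loads))
print(skewed_loads, skew_factor(skewed_loads))
```

Because each node joins only its own partition, the most heavily loaded node tends to determine the response time of the whole join. In the skewed case above, one node receives far more tuples than the others, which is the kind of imbalance that dynamic load balancing is meant to correct.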