Parallel Processing of "GroupBy-Before-Join" Queries in Cluster Architecture

Authors:
David Taniar;J. Wenny Rahayu
Affiliations:
-;-
Venue:
CCGRID '01 Proceedings of the 1st International Symposium on Cluster Computing and the Grid
Year:
2001

Abstract

SQL queries in the real world are replete with group-by and join operations. Queries of this type are often known as "GroupBy-Join" queries. In some GroupBy-Join queries, it is desirable to perform the group-by before the join in order to achieve better performance. This subset of GroupBy-Join queries is called "GroupBy-Before-Join" queries. In this paper, we present a study on the parallelization of GroupBy-Before-Join queries, particularly by exploiting cluster architectures. From our study, we have learned that in parallel query optimization, processing group-by as early as possible is not always desirable. On many occasions, distributing the data before performing the group-by offers performance advantages. In this study, we also describe our cluster-based scheme.
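
To make the distinction concrete, the following is a minimal sketch, not the scheme described in the paper. It assumes the group-by attribute is also the join attribute, a case in which aggregating before the join gives the same answer as aggregating after it. The relations, the names `departments` and `employees`, and the modulo-based partitioning are hypothetical choices made only for illustration. The last function simulates distributing the data across nodes before grouping. It shows only that the results agree and says nothing about relative performance.

```python
from collections import defaultdict

# Hypothetical toy relations (assumptions for illustration, not from the paper).
departments = [(1, "Sales"), (2, "Research"), (3, "Support")]  # (dept_id, name)
employees = [(1, 100), (1, 150), (2, 200), (3, 50), (3, 70), (3, 30)]  # (dept_id, salary)


def group_after_join(departments, employees):
    """Join first, then aggregate the join result."""
    names = dict(departments)
    joined = [(d, names[d], s) for d, s in employees if d in names]
    totals = defaultdict(int)
    for d, name, s in joined:
        totals[(d, name)] += s
    return dict(totals)


def group_before_join(departments, employees):
    """Aggregate on the join attribute first, then join the smaller result."""
    partial = defaultdict(int)
    for d, s in employees:
        partial[d] += s
    names = dict(departments)
    return {(d, names[d]): total for d, total in partial.items() if d in names}


def parallel_group_before_join(departments, employees, n_nodes=2):
    """Simulate distributing both relations on dept_id before grouping.

    Each simulated node receives the rows whose dept_id maps to it, then
    runs group-by followed by join on its own partition.
    """
    result = {}
    for node in range(n_nodes):
        dept_part = [r for r in departments if r[0] % n_nodes == node]
        emp_part = [r for r in employees if r[0] % n_nodes == node]
        result.update(group_before_join(dept_part, emp_part))
    return result


result_after = group_after_join(departments, employees)
result_before = group_before_join(departments, employees)
result_parallel = parallel_group_before_join(departments, employees)
```

In this sketch all three strategies return the same mapping from `(dept_id, name)` to total salary. Which one is cheaper in a real cluster depends on factors such as data skew and communication cost, which is the question the paper studies.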