OLAP Query Evaluation in a Database Cluster: A Performance Study on Intra-Query Parallelism

Authors:
Fuat Akal; Klemens Böhm; Hans-Jörg Schek
Venue:
ADBIS '02 Proceedings of the 6th East European Conference on Advances in Databases and Information Systems
Year:
2002

Abstract

While cluster computing is well established, it is not clear how to coordinate clusters consisting of many database components in order to process high workloads. In this paper, we focus on Online Analytical Processing (OLAP) queries, i.e., relatively complex queries whose evaluation tends to be time-consuming, and we report on observations and preliminary results from our PowerDB project in this context. We investigate how many cluster nodes should be used to evaluate an OLAP query in parallel. Moreover, we provide a classification of OLAP queries, which is used to decide whether and how a query should be parallelized. We run extensive experiments to evaluate these query classes in quantitative terms. Our results are an important step towards a two-phase query optimizer. In the first phase, the coordination infrastructure decomposes a query into subqueries and ships them to appropriate cluster nodes. In the second phase, each cluster node optimizes and evaluates its subquery locally.
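
The two-phase idea described in the abstract can be illustrated with a minimal sketch. This is not the PowerDB implementation: it assumes in-memory SQLite databases as stand-ins for cluster nodes, a hypothetical `sales` fact table partitioned round-robin across nodes, and a single aggregate query (average amount per region). The coordinator decomposes the average into per-node `SUM` and `COUNT` subqueries, ships them to all nodes in parallel, and merges the partial results. Each node's own engine plans and runs its subquery locally.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical subquery shipped to every node: AVG is decomposed into
# SUM and COUNT so that partial results can be merged correctly.
SUBQUERY = "SELECT region, SUM(amount), COUNT(*) FROM sales GROUP BY region"


def make_node(rows):
    """Create a stand-in cluster node: an in-memory SQLite database
    holding one partition of the (hypothetical) sales fact table."""
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def partition(rows, n):
    """Round-robin partitioning of the fact table across n nodes."""
    return [rows[i::n] for i in range(n)]


def run_subquery(conn):
    """Phase 2: the node optimizes and evaluates its subquery locally."""
    return conn.execute(SUBQUERY).fetchall()


def evaluate_avg_by_region(nodes):
    """Phase 1: ship the subquery to all nodes, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(run_subquery, nodes))

    totals = {}  # region -> [running sum, running count]
    for partial in partials:
        for region, part_sum, part_count in partial:
            acc = totals.setdefault(region, [0.0, 0])
            acc[0] += part_sum
            acc[1] += part_count
    return {region: s / c for region, (s, c) in totals.items()}


ROWS = [
    ("north", 10.0),
    ("south", 20.0),
    ("north", 30.0),
    ("south", 40.0),
    ("east", 5.0),
]

nodes = [make_node(part) for part in partition(ROWS, 2)]
result = evaluate_avg_by_region(nodes)
print(result)
```

Merging `SUM` and `COUNT` rather than per-node averages matters because partitions may hold different numbers of rows per group. Averaging the averages would weight small partitions too heavily.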