While cluster computing is well established, it is not clear how to coordinate clusters consisting of many database components in order to process high workloads. In this paper, we focus on Online Analytical Processing (OLAP) queries, i.e., relatively complex queries whose evaluation tends to be time-consuming, and we report observations and preliminary results from our PowerDB project in this context. We investigate how many cluster nodes should be used to evaluate an OLAP query in parallel. Moreover, we provide a classification of OLAP queries that is used to decide whether and how a query should be parallelized. We run extensive experiments to evaluate these query classes quantitatively. Our results are an important step towards a two-phase query optimizer. In the first phase, the coordination infrastructure decomposes a query into subqueries and ships them to appropriate cluster nodes. In the second phase, each cluster node optimizes and evaluates its subquery locally.
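The two-phase idea can be illustrated with a minimal sketch. It is not the PowerDB implementation: the `sales` table, the range partitioning on `id`, the assumption that every node holds a full replica, and all function names are hypothetical choices made for illustration. In-memory SQLite databases stand in for cluster nodes, and the subqueries run sequentially rather than being shipped over a network.

```python
import sqlite3


def make_node(rows):
    """Create an in-memory SQLite database standing in for one cluster node.

    Assumption: every node holds a full replica of the (hypothetical) sales table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def decompose(lo, hi, n_nodes):
    """Phase 1: split the id range [lo, hi) into at most n_nodes contiguous subranges."""
    if hi <= lo or n_nodes < 1:
        return []
    step = -(-(hi - lo) // n_nodes)  # ceiling division
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def run_subquery(conn, id_range):
    """Phase 2: the node's own engine optimizes and evaluates its subquery locally."""
    sql = (
        "SELECT COALESCE(SUM(amount), 0), COUNT(*) "
        "FROM sales WHERE id >= ? AND id < ?"
    )
    return conn.execute(sql, id_range).fetchone()


def parallel_avg(nodes, lo, hi):
    """Coordinator: decompose AVG(amount), collect partial results, and merge them."""
    ranges = decompose(lo, hi, len(nodes))
    # Sequential loop for simplicity; a real coordinator would dispatch concurrently.
    partials = [run_subquery(conn, r) for conn, r in zip(nodes, ranges)]
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count if count else None


if __name__ == "__main__":
    rows = [(i, float(i)) for i in range(1, 11)]
    nodes = [make_node(rows) for _ in range(3)]
    print(parallel_avg(nodes, 1, 11))
```

Note that AVG is decomposed into SUM and COUNT partials because averages of subranges cannot be combined directly; whether a given query admits such a decomposition is the kind of question a query classification would have to answer.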