Today's data deluge enables organizations to collect massive amounts of data and analyze it with an ever-increasing number of concurrent queries. Traditional data warehouses (DWs) face a challenging problem in executing this task due to their query-centric model: each query is optimized and executed independently. This model results in high contention for resources. Thus, modern DWs move away from the query-centric model toward execution models that share common data and work. Our goal is to show when and how a DW should employ sharing. We experimentally evaluate two sharing methodologies, based on their original prototype systems, that exploit work-sharing opportunities among concurrent queries at run-time: Simultaneous Pipelining (SP), which shares intermediate results of common sub-plans, and Global Query Plans (GQP), which build and evaluate a single query plan with shared operators. First, after a short review of sharing methodologies, we show that SP and GQP are orthogonal techniques. SP can be applied to shared operators of a GQP, reducing response times by 20%-48% in workloads with numerous common sub-plans. Second, we corroborate previous results on the negative impact of SP on performance for cases of low concurrency. We attribute this behavior to a bottleneck caused by the push-based communication model of SP. We show that pull-based communication for SP eliminates the overhead of sharing altogether for low concurrency, and scales better on multi-core machines than push-based SP, further reducing response times by 82%-86% for high concurrency. Third, we perform an experimental analysis of SP, GQP, and their combination, and show when each one is beneficial. We identify a trade-off between low and high concurrency. In the former case, traditional query-centric operators with SP perform better, while in the latter case, GQP with shared operators enhanced by SP gives the best results.
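The difference between push-based and pull-based sharing of a common sub-plan's intermediate results can be illustrated with a minimal single-threaded sketch. It is not the evaluated prototypes' implementation: the names `scan_subplan`, `push_based_share`, and `SharedResultList` are hypothetical, and the assumption that the push model forwards each row once per consumer is one plausible reading of the bottleneck described above.

```python
from collections import deque


def scan_subplan(rows, predicate):
    """Hypothetical common sub-plan: a filtered scan yielding result rows."""
    for row in rows:
        if predicate(row):
            yield row


def push_based_share(producer, n_consumers):
    """Push model (assumed): the producer forwards every row to each consumer."""
    queues = [deque() for _ in range(n_consumers)]
    forwards = 0
    for row in producer:
        for q in queues:
            q.append(row)  # producer does one forwarding step per consumer
            forwards += 1
    return queues, forwards


class SharedResultList:
    """Pull model (assumed): the producer appends once; consumers pull."""

    def __init__(self):
        self.rows = []
        self.appends = 0

    def append(self, row):
        self.rows.append(row)
        self.appends += 1

    def reader(self):
        # Each consumer keeps its own cursor into the shared rows.
        pos = 0
        while pos < len(self.rows):
            yield self.rows[pos]
            pos += 1


def is_even(x):
    return x % 2 == 0


# Three concurrent queries share the same filtered scan.
queues, forwards = push_based_share(scan_subplan(range(10), is_even), 3)

shared = SharedResultList()
for row in scan_subplan(range(10), is_even):
    shared.append(row)
pulled = [list(shared.reader()) for _ in range(3)]

print(forwards, shared.appends)
```

In this sketch the producer's work grows with the number of consumers under the push model, while under the pull model it stays constant and the reading cost moves to the consumers. A real engine would also need synchronization and buffer reclamation, which are omitted here.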