QPipe: a simultaneously pipelined relational query engine

Authors:
Stavros Harizopoulos;Vladislav Shkapenyuk;Anastassia Ailamaki
Affiliations:
Carnegie Mellon University, Pittsburgh, PA;Rutgers University, Piscataway, NJ;Carnegie Mellon University, Pittsburgh, PA
Venue:
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Year:
2005


Abstract

Relational DBMS typically execute concurrent queries independently by invoking a set of operator instances for each query. To exploit common data retrievals and computation in concurrent queries, researchers have proposed a wealth of techniques, ranging from buffering disk pages to constructing materialized views and optimizing multiple queries. The ideas proposed, however, are inherently limited by the query-centric philosophy of modern engine designs. Ideally, the query engine should proactively coordinate same-operator execution among concurrent queries, thereby exploiting common accesses to memory and disks as well as common intermediate result computation.

This paper introduces on-demand simultaneous pipelining (OSP), a novel query evaluation paradigm for maximizing data and work sharing across concurrent queries at execution time. OSP enables proactive, dynamic operator sharing by pipelining the operator's output simultaneously to multiple parent nodes. This paper also introduces QPipe, a new operator-centric relational engine that effortlessly supports OSP. Each relational operator is encapsulated in a micro-engine serving query tasks from a queue, naturally exploiting all data and work sharing opportunities. Evaluation of QPipe built on top of BerkeleyDB shows that QPipe achieves a 2x speedup over a commercial DBMS when running a workload consisting of TPC-H queries.
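
The operator-centric idea in the abstract can be illustrated with a small Python sketch. It is not QPipe's implementation: the `MicroEngine` class, its `submit` and `run` methods, the task signature tuples, and the toy `scan` operator are all hypothetical names chosen for illustration. The sketch assumes a single-threaded engine that merges only identical tasks still waiting in the queue, and it pushes each output tuple to every attached parent.

```python
from collections import deque


class MicroEngine:
    """Hypothetical micro-engine: one per operator, serving tasks from a queue."""

    def __init__(self, operator):
        self.operator = operator   # callable returning an iterable of tuples
        self.queue = deque()       # pending tasks
        self.invocations = 0       # how many times the operator actually ran

    def submit(self, signature, args, consumer):
        # Work sharing (assumed policy): if an identical task is already
        # queued, attach the new parent instead of enqueuing a second run.
        for task in self.queue:
            if task["signature"] == signature:
                task["consumers"].append(consumer)
                return
        self.queue.append(
            {"signature": signature, "args": args, "consumers": [consumer]}
        )

    def run(self):
        while self.queue:
            task = self.queue.popleft()
            self.invocations += 1
            for row in self.operator(*task["args"]):
                # Simultaneous pipelining: every parent receives each tuple.
                for consumer in task["consumers"]:
                    consumer.append(row)


# Toy data and operator, purely illustrative.
TABLE = [(1, "a"), (2, "b"), (3, "c")]


def scan(table, min_key):
    return (row for row in table if row[0] >= min_key)


engine = MicroEngine(scan)
q1_out, q2_out, q3_out = [], [], []
engine.submit(("scan", "TABLE", 2), (TABLE, 2), q1_out)  # query 1
engine.submit(("scan", "TABLE", 2), (TABLE, 2), q2_out)  # query 2, same scan
engine.submit(("scan", "TABLE", 1), (TABLE, 1), q3_out)  # query 3, different scan
engine.run()
```

In this sketch, queries 1 and 2 share one scan invocation, while query 3 triggers its own. A real engine would also need to handle tasks that arrive while a matching operator is already running, which this simplification leaves out.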