Optimizing complex queries with multiple relation instances

Authors:
Yu Cao;Gopal C. Das;Chee-Yong Chan;Kian-Lee Tan
Affiliations:
National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore
Venue:
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Year:
2008

Abstract

Today's query processing engines do not take advantage of the multiple occurrences of a relation in a query to improve performance. Instead, each instance is treated as a distinct relation and has its own independent table access method. In this paper, we present MAPLE, a Multi-instance-Aware PLan Evaluation engine that enables multiple instances of a relation to share one physical scan (called SharedScan) with limited buffer space. During execution, as SharedScan pulls a tuple for any instance, that tuple is also pushed to the buffers of the other instances whose predicates it satisfies. To avoid buffer overflow, a novel interleaved execution strategy is proposed: whenever an instance's buffer becomes full, execution temporarily switches to a drainer (an ancestor blocking operator of the instance) that consumes all the tuples in the buffer. Thus, execution is interleaved between normal processing and drainers. We also propose a cost-based approach to generate a plan that maximizes the shared scan benefit and avoids interleaved-execution deadlocks. MAPLE is lightweight and can be easily integrated into existing RDBMS executors. We have implemented MAPLE in PostgreSQL, and our experimental study on the TPC-DS benchmark shows a significant reduction in execution time.
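
The push-and-drain behaviour described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not MAPLE's implementation: the names `Instance`, `shared_scan`, `drain`, and `capacity` are hypothetical, the table is an in-memory list, the scan is driven by a single loop rather than by operators pulling tuples, and the drainer (an ancestor blocking operator in the paper) is reduced to a plain list that absorbs the buffer. Plan generation and deadlock avoidance are not modelled.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Row = Dict[str, int]


class Instance:
    """One occurrence of the relation in the query (hypothetical helper)."""

    def __init__(self, name: str, predicate: Callable[[Row], bool], capacity: int):
        self.name = name
        self.predicate = predicate
        self.capacity = capacity          # bounded buffer size
        self.buffer: Deque[Row] = deque()
        self.drained: List[Row] = []      # stands in for the drainer's state
        self.drain_count = 0
        self.peak = 0                     # largest buffer occupancy observed

    def drain(self) -> None:
        """Stand-in for switching execution to the drainer: empty the buffer."""
        self.drained.extend(self.buffer)
        self.buffer.clear()
        self.drain_count += 1


def shared_scan(table: List[Row], instances: List[Instance]) -> int:
    """Read the table once and push each row to every matching instance.

    Returns the number of rows physically read.
    """
    reads = 0
    for row in table:
        reads += 1
        for inst in instances:
            if not inst.predicate(row):
                continue
            # A full buffer triggers a drain before the new row is pushed,
            # so occupancy never exceeds the capacity.
            if len(inst.buffer) >= inst.capacity:
                inst.drain()
            inst.buffer.append(row)
            inst.peak = max(inst.peak, len(inst.buffer))
    # At end of scan, whatever is still buffered is consumed as well.
    for inst in instances:
        if inst.buffer:
            inst.drain()
    return reads


# Usage: two instances of the same table with different predicates.
table = [{"id": i} for i in range(10)]
evens = Instance("evens", lambda r: r["id"] % 2 == 0, capacity=2)
large = Instance("large", lambda r: r["id"] >= 7, capacity=2)
rows_read = shared_scan(table, [evens, large])
print(rows_read, evens.drain_count, large.drain_count)
```

In this sketch both instances are served by one pass over the table, whereas independent access methods would read it once per instance. The small `capacity` forces several drains, which mirrors the interleaving between normal processing and drainers.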