Inspector joins

Authors:
Shimin Chen;Anastassia Ailamaki;Phillip B. Gibbons;Todd C. Mowry
Affiliations:
Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Intel Research Pittsburgh, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA
Venue:
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Year:
2005

Abstract

The key idea behind inspector joins is that the I/O partitioning phase of a hash-based join gives us the opportunity to examine the actual data and then use this knowledge in two ways: (1) to create specialized indexes, specific to the given query on the given data, that optimize the CPU cache performance of the subsequent join phase, and (2) to decide which join-phase algorithm best suits this specific query. We show that inspector joins, employing novel statistics and specialized indexes, match or exceed the performance of state-of-the-art cache-friendly hash join algorithms. For example, when run on eight or more processors, inspector joins achieve 1.1-1.4X speedups over these previous algorithms in our experiments, with the speedup increasing as the number of processors increases.
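
The two uses of the partitioning pass described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the index structure or the statistics, so the example assumes a plain dictionary that maps each build key to the cache-sized sub-partitions holding it, and an assumed statistic (the average number of sub-partitions per key) with a hypothetical threshold `max_avg_subs`. All function names are illustrative, and Python only models the control flow, not real cache behavior.

```python
from collections import defaultdict


def partition_and_inspect(build_rows, num_partitions, sub_size):
    """Partition build rows by key hash and inspect them in the same pass.

    Assumption: the 'specialized index' is a dict per partition mapping
    each key to the set of sub-partitions (chunks of sub_size rows) that
    contain it.
    """
    partitions = defaultdict(list)
    index = defaultdict(dict)
    for key, payload in build_rows:
        p = hash(key) % num_partitions
        sub = len(partitions[p]) // sub_size  # chunk this row falls into
        partitions[p].append((key, payload))
        index[p].setdefault(key, set()).add(sub)
    return partitions, index


def choose_join_phase(index, max_avg_subs):
    """Pick a join-phase algorithm from an assumed statistic: the average
    number of sub-partitions in which a build key appears."""
    counts = [len(subs) for part in index.values() for subs in part.values()]
    avg = sum(counts) / len(counts) if counts else 0.0
    return "cache_partitioned" if avg <= max_avg_subs else "plain_hash"


def hash_join(build_chunk, probes, results):
    """Build a hash table on build_chunk and probe it."""
    table = defaultdict(list)
    for key, payload in build_chunk:
        table[key].append(payload)
    for key, probe_payload in probes:
        for build_payload in table.get(key, ()):
            results.append((key, build_payload, probe_payload))


def inspector_join(build_rows, probe_rows, num_partitions=4, sub_size=2,
                   max_avg_subs=1.5):
    partitions, index = partition_and_inspect(build_rows, num_partitions,
                                              sub_size)
    strategy = choose_join_phase(index, max_avg_subs)

    probe_parts = defaultdict(list)
    for key, payload in probe_rows:
        probe_parts[hash(key) % num_partitions].append((key, payload))

    results = []
    for p, rows in partitions.items():
        probes = probe_parts.get(p, [])
        if strategy == "cache_partitioned":
            # Route each probe row only to the sub-partitions that the
            # index says contain its key; unmatched probes are dropped.
            by_sub = defaultdict(list)
            for key, payload in probes:
                for sub in index[p].get(key, ()):
                    by_sub[sub].append((key, payload))
            for sub, sub_probes in by_sub.items():
                chunk = rows[sub * sub_size:(sub + 1) * sub_size]
                hash_join(chunk, sub_probes, results)
        else:
            hash_join(rows, probes, results)
    return strategy, results
```

For example, `inspector_join([(1, "a"), (2, "b"), (3, "c"), (1, "d")], [(1, "x"), (3, "y"), (4, "z")])` returns the chosen strategy together with the matching `(key, build_payload, probe_payload)` tuples. Both strategies produce the same join result; only the way probe rows are grouped before probing differs.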