Join algorithm costs revisited

Authors:
Evan P. Harris;Kotagiri Ramamohanarao
Affiliations:
Department of Computer Science, The University of Melbourne, Parkville VIC 3052, Australia;Department of Computer Science, The University of Melbourne, Parkville VIC 3052, Australia
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
1996


Abstract

A method of analysing join algorithms based upon the time required to access, transfer and perform the relevant CPU-based operations on a disk page is proposed. The costs of variations of several of the standard join algorithms, including nested block, sort-merge, GRACE hash and hybrid hash, are presented. For a given total buffer size, the cost of these join algorithms depends on the parts of the buffer allocated for each purpose. For example, when joining two relations using the nested block join algorithm, the amount of buffer space allocated for the outer and inner relations can significantly affect the cost of the join. Analysis of expected and experimental results for various join algorithms shows that a combination of the optimal nested block and optimal GRACE hash join algorithms usually provides the greatest cost benefit, unless the relation size is a small multiple of the memory size. Algorithms to quickly determine a buffer allocation producing the minimal cost for each of these join algorithms are presented. When the relation size is a small multiple of the amount of main memory available (typically up to three to six times), the hybrid hash join algorithm is preferable.
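The dependence of the nested block join cost on the buffer split can be illustrated with a minimal sketch. It is not the cost model or the allocation algorithm from the paper. It assumes a simplified model in which every I/O request pays one fixed access (seek) time and every page pays one fixed transfer time, and it ignores CPU cost and any output buffer. The function names, parameter names and default timings are all hypothetical, and the exhaustive search stands in for the faster allocation algorithms the abstract mentions.

```python
from math import ceil


def nested_block_cost(v1, v2, b1, b2, t_seek=25.0, t_xfer=5.0):
    """Estimated I/O time for a nested block join under a simplified model.

    v1, v2  -- sizes of the outer and inner relations, in pages
    b1, b2  -- buffer pages allocated to the outer and inner relations
    t_seek  -- assumed fixed access time per I/O request (hypothetical value)
    t_xfer  -- assumed transfer time per page (hypothetical value)
    """
    outer_passes = ceil(v1 / b1)   # outer relation is read in chunks of b1 pages
    inner_reads = ceil(v2 / b2)    # I/O requests needed for one scan of the inner
    # The outer relation is read once: one access per chunk plus every page.
    outer_cost = outer_passes * t_seek + v1 * t_xfer
    # The inner relation is scanned in full once per outer chunk.
    inner_cost = outer_passes * (inner_reads * t_seek + v2 * t_xfer)
    return outer_cost + inner_cost


def best_allocation(v1, v2, total_buffer, **times):
    """Try every split of total_buffer pages and return (b1, b2, cost).

    Exhaustive search is used only for clarity; it is an assumption of this
    sketch, not the method proposed in the paper.
    """
    if total_buffer < 2:
        raise ValueError("need at least one page for each relation")
    candidates = (
        (nested_block_cost(v1, v2, b1, total_buffer - b1, **times), b1)
        for b1 in range(1, total_buffer)
    )
    cost, b1 = min(candidates)
    return b1, total_buffer - b1, cost


if __name__ == "__main__":
    # Hypothetical workload: 1000-page outer, 5000-page inner, 64 buffer pages.
    b1, b2, cost = best_allocation(1000, 5000, 64)
    even = nested_block_cost(1000, 5000, 32, 32)
    print(f"best split: outer={b1}, inner={b2}, cost={cost:.0f}")
    print(f"even split: outer=32, inner=32, cost={even:.0f}")
```

Under this model, a larger outer allocation reduces the number of full scans of the inner relation, while a larger inner allocation reduces the number of I/O requests per scan. Which effect dominates depends on the ratio of access time to transfer time, which is why the split has to be chosen rather than fixed in advance.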