In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and the fourth algorithm is the more traditional sort-merge. We study the performance of each algorithm under different tuple distribution policies, with the addition of bit vector filters, with varying amounts of main memory available for joining, and with non-uniformly distributed join attribute values. The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited. In that case, a more conservative algorithm such as sort-merge should be used. The Gamma database machine serves as the host for the performance comparison.
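To make the hash-based class concrete, the following is a minimal single-process sketch of the general Grace hash-join idea: partition both relations with the same hash function, then join each bucket pair independently. It is an illustration only, not the parallel implementation measured on Gamma. The function name `grace_hash_join`, its parameters, and the use of in-memory lists in place of disk-resident buckets are all assumptions made for the example.

```python
from collections import defaultdict

def grace_hash_join(outer, inner, key_outer, key_inner, num_buckets=4):
    """Equi-join two lists of tuples on the given key positions (illustrative sketch)."""
    # Phase 1: partition both relations with the same hash function, so that
    # tuples with equal join keys always fall into the same bucket pair.
    # (Hypothetical simplification: buckets are Python lists, not disk files.)
    inner_buckets = [[] for _ in range(num_buckets)]
    outer_buckets = [[] for _ in range(num_buckets)]
    for t in inner:
        inner_buckets[hash(t[key_inner]) % num_buckets].append(t)
    for t in outer:
        outer_buckets[hash(t[key_outer]) % num_buckets].append(t)

    # Phase 2: join each bucket pair on its own. Build a hash table on the
    # inner bucket, then probe it with every tuple of the outer bucket.
    result = []
    for inner_bucket, outer_bucket in zip(inner_buckets, outer_buckets):
        table = defaultdict(list)
        for t in inner_bucket:
            table[t[key_inner]].append(t)
        for t in outer_bucket:
            for match in table.get(t[key_outer], []):
                result.append(match + t)
    return result
```

In this sketch a skewed inner relation would show up as one `inner_bucket` growing much larger than the others, which is the situation in which the abstract reports that hash-based methods lose their advantage when memory is limited.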