The join operation of relational algebra is a cornerstone of relational database systems. Computing the join of several relations is NP-hard in general, whereas special (and typical) cases are tractable. This paper considers joins having an acyclic join graph, for which current methods initially apply a full reducer to efficiently eliminate tuples that will not contribute to the result of the join. From a worst-case perspective, previous algorithms for computing an acyclic join of k fully reduced relations, occupying a total of n≥k blocks on disk, use Ω((n+z)k) I/Os, where z is the size of the join result in blocks.

In this paper we show how to compute the join in a time bound that is within a constant factor of the cost of running a full reducer plus sorting the output. For a broad class of acyclic join graphs this is O(sort(n+z)) I/Os, removing the dependence on k from previous bounds. Traditional methods decompose the join into a number of binary joins, which are then carried out one by one. Departing from this approach, our technique is based on computing the size of certain subsets of the result, and using these sizes to compute the location(s) of each data item in the result.

Finally, as an initial study of cyclic joins in the I/O model, we show how to compute a join whose join graph is a 3-cycle, in O(n²/m+sort(n+z)) I/Os, where m is the number of blocks in internal memory.
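
The two ingredients named above, a full reducer and result sizes obtained without materializing the join, can be illustrated with a minimal in-memory sketch. It is not the I/O-efficient algorithm of the paper: it ignores blocks and external sorting and only shows the logical steps on a join tree. The function names `semijoin`, `full_reduce` and `join_size`, the list-of-tuples representation, and the `parent` array that encodes the join tree are all assumptions made for illustration.

```python
from collections import Counter

def semijoin(r, r_attrs, s, s_attrs):
    """Keep the tuples of r that agree with some tuple of s on the shared attributes."""
    shared = [a for a in r_attrs if a in s_attrs]
    key_r = [r_attrs.index(a) for a in shared]
    key_s = [s_attrs.index(a) for a in shared]
    seen = {tuple(t[i] for i in key_s) for t in s}
    return [t for t in r if tuple(t[i] for i in key_r) in seen]

def full_reduce(rels, attrs, parent):
    """Two semijoin passes over a join tree (assumed numbering: parent[i] < i, root is 0)."""
    rels = [list(r) for r in rels]
    # Bottom-up pass: each parent is filtered by its (already filtered) children.
    for i in range(len(rels) - 1, 0, -1):
        p = parent[i]
        rels[p] = semijoin(rels[p], attrs[p], rels[i], attrs[i])
    # Top-down pass: each child is filtered by its (already filtered) parent.
    for i in range(1, len(rels)):
        p = parent[i]
        rels[i] = semijoin(rels[i], attrs[i], rels[p], attrs[p])
    return rels

def join_size(rels, attrs, parent):
    """Count the tuples of the join result without building them."""
    # weights[i][j]: number of result tuples of the subtree rooted at i
    # that extend the j-th tuple of rels[i].
    weights = [[1] * len(r) for r in rels]
    for i in range(len(rels) - 1, 0, -1):
        p = parent[i]
        shared = [a for a in attrs[i] if a in attrs[p]]
        key_i = [attrs[i].index(a) for a in shared]
        key_p = [attrs[p].index(a) for a in shared]
        totals = Counter()
        for t, w in zip(rels[i], weights[i]):
            totals[tuple(t[j] for j in key_i)] += w
        weights[p] = [w * totals[tuple(t[j] for j in key_p)]
                      for t, w in zip(rels[p], weights[p])]
    return sum(weights[0])

# Hypothetical chain join R(a,b), S(b,c), T(c,d); the join tree is R - S - T.
attrs = [("a", "b"), ("b", "c"), ("c", "d")]
parent = [None, 0, 1]
R = [(1, 10), (2, 20), (3, 30)]
S = [(10, "x"), (10, "y"), (20, "z"), (40, "w")]
T = [("x", 100), ("z", 200), ("z", 201)]

reduced = full_reduce([R, S, T], attrs, parent)
size = join_size(reduced, attrs, parent)
```

In this toy instance the reducer drops the tuple of R with b = 30 and the tuples of S that have no partner in T or R, and the count is obtained by propagating per-tuple weights from the leaves to the root. Per-tuple counts of this kind are one simple way to know how much output each tuple accounts for before any output is written, which is the flavor of size information the abstract refers to.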