Storing a Sparse Table with O(1) Worst Case Access Time
Journal of the ACM (JACM)
The input/output complexity of sorting and related problems
Communications of the ACM
Dynamic Perfect Hashing: Upper and Lower Bounds
SIAM Journal on Computing
Heaps and heapsort on secondary storage
Theoretical Computer Science
Extendible hashing—a fast access method for dynamic files
ACM Transactions on Database Systems (TODS)
Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
On a model of indexability and its bounds for range queries
Journal of the ACM (JACM)
Cache-oblivious priority queue and graph algorithm applications
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
Lower bounds for external memory dictionaries
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Journal of Algorithms
Linear hashing: a new tool for file and table addressing
VLDB '80 Proceedings of the sixth international conference on Very Large Data Bases - Volume 6
Probabilistic computations: Toward a unified measure of complexity
SFCS '77 Proceedings of the 18th Annual Symposium on Foundations of Computer Science
Database Systems: The Complete Book
Database Systems: The Complete Book
Optimality in External Memory Hashing
Algorithmica
Algorithms and Data Structures for External Memory
Algorithms and Data Structures for External Memory
Optimal External Memory Planar Point Enclosure
Algorithmica
Dynamic indexability and lower bounds for dynamic one-dimensional range query indexes
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
The limits of buffering: a tight lower bound for dynamic membership in the external memory model
Proceedings of the forty-second ACM symposium on Theory of computing
Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Cheap and large CAMs for high performance data-intensive networked systems
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
On the cell probe complexity of dynamic membership
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Approximate MaxRS in spatial databases
Proceedings of the VLDB Endowment
Hash tables are one of the most fundamental data structures in computer science, in both theory and practice. They are especially useful in external memory, where their query performance approaches the ideal cost of just one disk access. Knuth [16] gave an elegant analysis showing that with simple collision resolution strategies such as linear probing or chaining, the expected average number of disk I/Os of a lookup is merely 1 + 1/2^{Ω(b)}, where each I/O can read and/or write a disk block containing b items. Inserting a new item into the hash table also costs 1 + 1/2^{Ω(b)} I/Os, which is again almost the best one can do if the hash table is entirely stored on disk. However, this requirement is unrealistic, since any algorithm operating on an external hash table must have some internal memory (at least Ω(1) blocks) to work with. The availability of a small internal memory buffer can dramatically reduce the amortized insertion cost to o(1) I/Os for many external memory data structures. In this paper we study the inherent query-insertion tradeoff of external hash tables in the presence of a memory buffer. In particular, we show that for any constant c > 1, if the expected average successful query cost is targeted at 1 + O(1/b^c) I/Os, then it is not possible to support insertions in less than 1 - O(1/b^{(c-1)/6}) I/Os amortized, which means that the memory buffer is essentially useless. If, however, the query cost is relaxed to 1 + O(1/b^c) I/Os for any constant c < 1, there is a simple dynamic hash table with o(1) insertion cost.
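The claim that a lookup costs barely more than one I/O when blocks hold many items can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes chaining with block-sized buckets, an idealized uniformly random hash function, and a load factor of 1/2, and the function name average_lookup_ios is hypothetical. It counts one I/O for the primary block of a bucket plus one for every overflow block that must be read to reach the item.

```python
import random


def average_lookup_ios(n_items, b, load=0.5, seed=0):
    """Mean I/Os per successful lookup under blocked chaining (illustrative only).

    Assumptions: each bucket is a chain of blocks holding b items each, the
    hash function is modeled by a uniform random bucket choice, and the table
    has enough buckets for the requested load factor.
    """
    rng = random.Random(seed)
    n_buckets = max(1, int(n_items / (b * load)))
    counts = [0] * n_buckets  # number of items currently in each bucket
    total_ios = 0
    for _ in range(n_items):
        bucket = rng.randrange(n_buckets)  # stand-in for a random hash value
        # The new item takes position counts[bucket] in its chain. Positions
        # 0..b-1 live in the primary block, b..2b-1 in the first overflow
        # block, and so on, so finding it later reads position // b + 1 blocks.
        total_ios += counts[bucket] // b + 1
        counts[bucket] += 1
    return total_ios / n_items


if __name__ == "__main__":
    for block_size in (1, 4, 16, 64):
        cost = average_lookup_ios(20000, block_size)
        print(f"b = {block_size:3d}: average successful lookup = {cost:.4f} I/Os")
```

Under these assumptions the measured average should fall toward 1 as b grows, which is the behavior the 1 + 1/2^{Ω(b)} bound describes. The simulation does not model the internal memory buffer or the insertion lower bound studied in the paper.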