In this paper procedures are studied for storing, accessing, updating, and reorganizing data in large files whose organization is direct, an organization used when a fast response time is required. "Distribution-dependent" hashing functions and the division method are compared as methods of indirect addressing.

"Distribution-dependent" hashing functions are characterized. These hashing functions generate addresses from a set of keys by using knowledge of the distribution of that key set within the key space or range of keys. A study of the performance measures obtained during tests of these functions on several key sets indicates that in certain cases, distribution-dependent methods perform better than the division method. This result is extended by a demonstration that distribution-dependent hashing functions can accommodate a change in the distribution of keys without being redefined. A number of insertions to and deletions from the key set can be made before a distribution-dependent hashing function gives poorer performance than the division method under identical circumstances.

If many additions are made to a set of keys, it becomes necessary to reorganize, in a larger storage area, the direct file of records identified by that key set. Although processor time must be sacrificed in order to redefine a distribution-dependent hashing function, the division method requires substantially greater access time in a reorganizational situation.
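
To make the contrast between the two addressing methods concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the functions evaluated in the paper: the abstract does not specify how the key distribution is modeled, so this sketch assumes integer keys and approximates the cumulative distribution by the rank of a key within a sorted sample. The names `division_hash` and `make_distribution_hash` are hypothetical.

```python
from bisect import bisect_left


def division_hash(key, table_size):
    """Division method: the address is the key modulo the table size."""
    return key % table_size


def make_distribution_hash(sample_keys, table_size):
    """Build an address function from the empirical distribution of a key sample.

    Assumption for illustration: the cumulative distribution F(key) is
    estimated as the fraction of sample keys strictly less than the key.
    """
    sorted_keys = sorted(sample_keys)
    n = len(sorted_keys)

    def distribution_hash(key):
        # Number of sample keys below this key; rank / n approximates F(key).
        rank = bisect_left(sorted_keys, key)
        # Scale F(key) to the address range and clamp keys above the sample.
        return min(table_size - 1, rank * table_size // n)

    return distribution_hash


# A skewed key set: a few small keys and a few widely spaced large ones.
keys = [10, 20, 30, 40, 1000, 2000, 3000, 4000]
table_size = 4

dist_hash = make_distribution_hash(keys, table_size)
division_addresses = [division_hash(k, table_size) for k in keys]
distribution_addresses = [dist_hash(k) for k in keys]
```

On this small key set the division method sends six of the eight keys to address 0, while the rank-based function places two keys at each of the four addresses and preserves key order. A function built this way must be rebuilt from a fresh sample if the key distribution drifts far enough, which is the redefinition cost the abstract mentions.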