The art of computer programming, volume 3: (2nd ed.) sorting and searching
A Dictionary for Minimum Redundancy Encoding
Journal of the ACM (JACM)
Minimal perfect hash functions made simple
Communications of the ACM
Understanding Natural Language
Anatomy of LISP
The research reported in this paper derives from the recent algorithm of Cichelli (1980) for computing machine-independent, minimal perfect hash functions of the form:

    hash value = key length + associated value of the key's first letter + associated value of the key's last letter

A minimal perfect hash function is one that provides single-probe retrieval from a minimally sized table of hash identifiers [keys]. Cichelli's hash function is machine-independent because the character code used by a particular machine never enters into the hash calculation.

Cichelli's algorithm uses a simple backtracking process to find an assignment of non-negative integers to letters that results in a minimal perfect hash function. Cichelli employs a twofold ordering strategy that rearranges the static set of keys so that hash value collisions occur, and are resolved, as early as possible during the backtracking process. This double ordering provides a necessary reduction in the size of the potentially large search space, considerably speeding the computation of associated values.

In spite of Cichelli's ordering strategies, his method is found to require excessive computation to find hash functions for sets of more than about 40 keys. Cichelli's method is also limited in that two keys with the same first letter, the same last letter, and the same length are not permitted, since such keys necessarily receive the same hash value.

Alternative algorithms and their implementations are discussed in the next section; these algorithms overcome some of the difficulties encountered when using Cichelli's original algorithm. Some experimental results are presented, followed by a discussion of the application of perfect hash functions to the problem of natural language lexicon design.
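The hash form and the backtracking search described above can be illustrated with a minimal Python sketch. It is not Cichelli's published code: the names `cichelli_hash`, `order_keys`, `find_values`, and the `max_value` bound are hypothetical, and `order_keys` only approximates the twofold ordering (sort by letter frequency, then pull forward any key whose first and last letters have both been seen already).

```python
from collections import Counter


def cichelli_hash(key, values):
    """Key length + value of the first letter + value of the last letter."""
    return len(key) + values[key[0]] + values[key[-1]]


def order_keys(keys):
    """Approximate the twofold ordering (an assumption, not the original code)."""
    freq = Counter()
    for k in keys:
        freq[k[0]] += 1
        freq[k[-1]] += 1
    # First ordering: keys with the most frequent end letters come first.
    pending = sorted(keys, key=lambda k: -(freq[k[0]] + freq[k[-1]]))
    ordered, seen = [], set()
    while pending:
        # Second ordering: prefer a key whose hash is already determined,
        # so that a collision is detected as early as possible.
        idx = next((i for i, k in enumerate(pending)
                    if k[0] in seen and k[-1] in seen), 0)
        k = pending.pop(idx)
        ordered.append(k)
        seen.update((k[0], k[-1]))
    return ordered


def find_values(keys, max_value=None):
    """Backtracking search for letter values; returns a dict or None."""
    if len({(k[0], k[-1], len(k)) for k in keys}) != len(keys):
        raise ValueError("two keys share first letter, last letter and length")
    ordered = order_keys(keys)
    n = len(ordered)
    limit = n if max_value is None else max_value  # assumed search bound
    values, used = {}, set()

    def assign(i, letters):
        key = ordered[i]
        if not letters:
            h = cichelli_hash(key, values)
            if h in used:
                return False  # collision: backtrack
            used.add(h)
            # Minimality: all hash values must fit in a span of n slots.
            if max(used) - min(used) < n and place(i + 1):
                return True
            used.discard(h)
            return False
        c, rest = letters[0], letters[1:]
        for v in range(limit + 1):
            values[c] = v
            if assign(i, rest):
                return True
        del values[c]
        return False

    def place(i):
        if i == n:
            return True
        key = ordered[i]
        new = [c for c in dict.fromkeys((key[0], key[-1])) if c not in values]
        return assign(i, new)

    return dict(values) if place(0) else None


# Usage: a small, illustrative keyword set (not taken from the paper).
keywords = ["begin", "end", "if", "then", "else", "while", "do"]
values = find_values(keywords)
table = {cichelli_hash(k, values): k for k in keywords}
```

Because the letter values are plain integers looked up by letter, the machine's character code never enters the calculation, which is the machine-independence property noted above. The exhaustive inner loop also suggests why the search becomes expensive as the key set grows.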