Lexicon design using perfect hash functions

  • Authors:
  • Nick Cercone; Max Krause; John Boates

  • Venue:
  • CHI '81 Proceedings of the Joint Conference on Easier and More Productive Use of Computer Systems. (Part - II): Human Interface and the User Interface - Volume 1981
  • Year:
  • 1981

Abstract

The research reported in this paper derives from the recent algorithm of Cichelli (1980) for computing machine-independent, minimal perfect hash functions of the form:

    hash value = hash key length + associated value of the key's first letter + associated value of the key's last letter

A minimal perfect hash function is one which provides single probe retrieval from a minimally-sized table of hash identifiers [keys]. Cichelli's hash function is machine-independent because the character code used by a particular machine never enters into the hash calculation.

Cichelli's algorithm uses a simple backtracking process to find an assignment of non-negative integers to letters which results in a perfect minimal hash function. Cichelli employs a twofold ordering strategy which rearranges the static set of keys in such a way that hash value collisions will occur and be resolved as early as possible during the backtracking process. This double ordering provides a necessary reduction in the size of the potentially large search space, thus considerably speeding the computation of associated values.

In spite of Cichelli's ordering strategies, his method is found to require excessive computation to find hash functions for sets of keys with more than about 40 members. Cichelli's method is also limited since two keys with the same first and last letters and the same length are not permitted.

Alternative algorithms and their implementations will be discussed in the next section; these algorithms overcome some of the difficulties encountered when using Cichelli's original algorithm. Some experimental results are presented, followed by a discussion of the application of perfect hash functions to the problem of natural language lexicon design.
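
To make the hash form and the backtracking search described in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' or Cichelli's implementation: the names `hash_value`, `order_keys`, `find_associated_values` and the bound `max_value` are hypothetical. The ordering step covers only a frequency-based pass, which is one plausible reading of the "twofold ordering" that the abstract does not spell out.

```python
from collections import Counter


def hash_value(key, assoc):
    """Key length plus the associated values of the first and last letters."""
    return len(key) + assoc[key[0]] + assoc[key[-1]]


def order_keys(keys):
    """Assumed ordering: keys whose first/last letters are most frequent come
    first, so that collisions tend to surface early in the search."""
    freq = Counter()
    for k in keys:
        freq[k[0]] += 1
        freq[k[-1]] += 1
    return sorted(keys, key=lambda k: -(freq[k[0]] + freq[k[-1]]))


def find_associated_values(keys, max_value):
    """Backtracking search for letter values in 0..max_value that give a
    minimal perfect hash. Returns a dict, or None if none exists in range."""
    keys = order_keys(keys)
    n = len(keys)

    # Keys with the same length and the same pair of end letters always
    # collide. The pair is unordered because the letter sum is symmetric.
    shapes = {(len(k), min(k[0], k[-1]), max(k[0], k[-1])) for k in keys}
    if len(shapes) < n:
        raise ValueError("two keys share length and first/last letters")

    assoc, used = {}, set()

    def fits(h):
        # Perfect: h is unused. Minimal: all hashes span fewer than n slots.
        if h in used:
            return False
        candidate = used | {h}
        return max(candidate) - min(candidate) < n

    def place(i):
        # Try to place key i, assigning any of its letters still unassigned.
        if i == n:
            return True
        key = keys[i]
        pending = [c for c in dict.fromkeys((key[0], key[-1])) if c not in assoc]
        return assign(i, pending)

    def assign(i, pending):
        if not pending:
            h = hash_value(keys[i], assoc)
            if not fits(h):
                return False
            used.add(h)
            if place(i + 1):
                return True
            used.remove(h)  # backtrack: free the slot
            return False
        letter = pending[0]
        for value in range(max_value + 1):
            assoc[letter] = value
            if assign(i, pending[1:]):
                return True
        del assoc[letter]  # backtrack: no value worked for this letter
        return False

    return dict(assoc) if place(0) else None


if __name__ == "__main__":
    words = ["if", "do", "end", "then", "else", "while"]
    table = find_associated_values(words, max_value=4)
    for w in sorted(words, key=lambda w: hash_value(w, table)):
        print(hash_value(w, table), w)
```

The search is exhaustive over the chosen value range, so its cost grows quickly with the number of distinct letters, which is consistent with the abstract's remark that the method becomes impractical for larger key sets. The `ValueError` check mirrors the stated limitation on keys that share length and end letters.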