Augmented order preserving minimal perfect hash functions for very large digital libraries

Authors:
Amjad M. Daoud;Hussain AbdelJaber;Jafar Ababneh
Affiliations:
WISE University, Amman, Jordan;WISE University, Amman, Jordan;WISE University, Amman, Jordan
Venue:
CIT'11 Proceedings of the 5th WSEAS international conference on Communications and information technology
Year:
2011

Citing 16
Cited 0

Graphical evolution: an introduction to the theory of random graphs.
Order-preserving key transformations. ACM Transactions on Database Systems (TODS).
Automatic text processing.
Fast hashing of variable-length text strings. Communications of the ACM.
Order preserving minimal perfect hash functions and information retrieval. SIGIR '90 Proceedings of the 13th annual international ACM SIGIR conference on Research and development in information retrieval.
Handbook of algorithms and data structures: in Pascal and C (2nd ed.).
Practical minimal perfect hash functions for large databases. Communications of the ACM.
Integrating IR and RDBMS using cooperative indexing. SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval.
Perfect hashing. Theoretical Computer Science.
Fast algorithms for sorting and searching strings. SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms.
Dynamic hashing schemes. ACM Computing Surveys (CSUR).
Ubiquitous B-Tree. ACM Computing Surveys (CSUR).
The study of an ordered minimal perfect hashing scheme. Communications of the ACM.
Transaction Processing: Concepts and Techniques.
Trie hashing. SIGMOD '81 Proceedings of the 1981 ACM SIGMOD international conference on Management of data.
Grow and Post Index Trees: Roles, Techniques and Future Potential. SSD '91 Proceedings of the Second International Symposium on Advances in Spatial Databases.

Abstract

Rapid access to information is essential for a wide variety of retrieval systems and applications. Hashing has long been used when the fastest possible direct search is desired, but it has been considered exotic [15] and inappropriate when sequential or range searches are also required. To change that, we extend order preserving perfect hash functions [11] to handle sequential access to lexicographically ordered records. To implement this, the lexicographically sorted key set is prefix-compressed and stored in blocks or pages, much like the leaf pages of a B+ tree with prefix compression [9]. The block number and the key prefix offset within the block are combined to form the key address. The key address is augmented with a signature of the prefix to form the key data. The key data is blended into the function specification to produce an augmented order preserving minimal perfect hash function. Our algorithm uses the bipartite graph approach to avoid problems with degenerate edges. It relaxes the acyclicity requirement on random graphs presented in [7] and can tolerate the presence of cyclic components. Moreover, the algorithm is designed to avoid the conditions described in [7] that make the Fox, Chen, Daoud, and Heath approach [11] exponential. Examples of these conditions are given, along with how the algorithm overcomes them. The algorithm produces order preserving minimal perfect hash functions (OPMHFs) with much higher success rates than the acyclic hypergraph approach [7], mostly on the first trial.
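
The storage layout described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name, bit width, and helper in it is an assumption made for illustration: `front_code` prefix-compresses a sorted key list into fixed-size blocks, `pack_key_data` combines a block number and an in-block position into a key address and appends a signature, and `scan_from` shows how a key address allows sequential access. The signature here is a truncated CRC-32 of the whole key, the in-block entry index stands in for the prefix offset, and a plain dictionary stands in for the hash function; the sketch does not construct a minimal perfect hash function.

```python
import zlib
from os.path import commonprefix

OFFSET_BITS = 12  # assumed width of the in-block position
SIG_BITS = 8      # assumed width of the signature


def front_code(sorted_keys, keys_per_block=4):
    """Prefix-compress sorted keys into blocks of (shared_len, suffix) entries."""
    blocks = []
    for start in range(0, len(sorted_keys), keys_per_block):
        block, prev = [], ""
        for key in sorted_keys[start:start + keys_per_block]:
            shared = len(commonprefix([prev, key]))
            block.append((shared, key[shared:]))
            prev = key
        blocks.append(block)
    return blocks


def decode_block(block):
    """Rebuild the full keys of one block from its compressed entries."""
    keys, prev = [], ""
    for shared, suffix in block:
        prev = prev[:shared] + suffix
        keys.append(prev)
    return keys


def signature(key):
    """Assumed signature: the low SIG_BITS bits of CRC-32 over the key."""
    return zlib.crc32(key.encode("utf-8")) & ((1 << SIG_BITS) - 1)


def pack_key_data(block_no, offset, key):
    """Key address = block number and offset; key data = address plus signature."""
    address = (block_no << OFFSET_BITS) | offset
    return (address << SIG_BITS) | signature(key)


def unpack_key_data(key_data):
    """Return (block_no, offset, signature) from packed key data."""
    sig = key_data & ((1 << SIG_BITS) - 1)
    address = key_data >> SIG_BITS
    return address >> OFFSET_BITS, address & ((1 << OFFSET_BITS) - 1), sig


def scan_from(blocks, key_data, count):
    """Read up to `count` keys in lexicographic order starting at a key address."""
    block_no, offset, _ = unpack_key_data(key_data)
    out = []
    while block_no < len(blocks) and len(out) < count:
        out.extend(decode_block(blocks[block_no])[offset:])
        block_no, offset = block_no + 1, 0
    return out[:count]


keys = sorted(["hash", "hashed", "hashing", "heap", "heapify", "index"])
blocks = front_code(keys, keys_per_block=4)

# Stand-in for the hash function: a dictionary from key to key data.
table = {}
for block_no, block in enumerate(blocks):
    for offset, key in enumerate(decode_block(block)):
        table[key] = pack_key_data(block_no, offset, key)

print(blocks[0])
print(scan_from(blocks, table["hashing"], 3))
```

In this sketch a direct lookup yields the key data in one step, the stored signature can be compared with the signature of the query key to reject keys that are not in the set, and a range search continues sequentially through the blocks from the decoded address.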