Partial-match retrieval using hashing and descriptors

Authors:
K. Ramamohanarao;James A. Thom;John W. Lloyd
Affiliations:
Univ. of Melbourne, Parkville, Victoria, Australia;Univ. of Melbourne, Parkville, Victoria, Australia;Univ. of Melbourne, Parkville, Victoria, Australia
Venue:
ACM Transactions on Database Systems (TODS)
Year:
1983

Abstract

This paper studies a partial-match retrieval scheme based on hash functions and descriptors. The emphasis is placed on showing how the use of a descriptor file can improve the performance of the scheme. Records in the file are given addresses according to hash functions for each field in the record. Furthermore, each page of the file has associated with it a descriptor, which is a fixed-length bit string, determined by the records actually present in the page. Before a page is accessed to see if it contains records in the answer to a query, the descriptor for the page is checked. This check may show that no relevant records are on the page and, hence, that the page does not have to be accessed. The method is shown to have a very substantial performance advantage over pure hashing schemes, when some fields in the records have large key spaces. A mathematical model of the scheme, plus an algorithm for optimizing performance, is given.
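
The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how addresses or descriptors are built, so everything concrete here is an assumption made for illustration. That includes the bit allocations `ADDR_BITS` and `DESC_BITS`, the CRC32-based helper `_h`, the one-bit-per-field descriptor encoding, and the `HashedFile` class. Each field contributes a few bits to the page address, each page keeps a descriptor formed by OR-ing the descriptors of its records, and a query reads a candidate page only if the page descriptor contains every bit of the query descriptor.

```python
import zlib
from itertools import product

# Assumed parameters (not taken from the paper): three fields per record.
ADDR_BITS = (2, 2, 2)   # page-address bits contributed by each field
DESC_BITS = (8, 8, 8)   # descriptor bits reserved for each field


def _h(value, salt):
    """Deterministic hash of a field value (hypothetical helper)."""
    return zlib.crc32(f"{salt}:{value}".encode())


def page_address(record):
    """Concatenate the per-field hash bits into one page address."""
    addr = 0
    for i, (value, bits) in enumerate(zip(record, ADDR_BITS)):
        addr = (addr << bits) | (_h(value, f"a{i}") % (1 << bits))
    return addr


def descriptor(fields):
    """Set one bit per specified field; None marks an unspecified field."""
    desc, offset = 0, 0
    for i, (value, width) in enumerate(zip(fields, DESC_BITS)):
        if value is not None:
            desc |= 1 << (offset + _h(value, f"d{i}") % width)
        offset += width
    return desc


class HashedFile:
    def __init__(self):
        self.pages = {}        # page address -> list of records
        self.descriptors = {}  # page address -> OR of record descriptors
        self.page_reads = 0    # pages actually accessed by queries

    def insert(self, record):
        addr = page_address(record)
        self.pages.setdefault(addr, []).append(record)
        self.descriptors[addr] = self.descriptors.get(addr, 0) | descriptor(record)

    def candidate_addresses(self, query):
        """Yield every address consistent with the specified fields."""
        choices = []
        for i, (value, bits) in enumerate(zip(query, ADDR_BITS)):
            if value is None:
                choices.append(range(1 << bits))
            else:
                choices.append([_h(value, f"a{i}") % (1 << bits)])
        for parts in product(*choices):
            addr = 0
            for part, bits in zip(parts, ADDR_BITS):
                addr = (addr << bits) | part
            yield addr

    def search(self, query):
        """Partial-match query: a tuple with None for unspecified fields."""
        qdesc = descriptor(query)
        results = []
        for addr in self.candidate_addresses(query):
            # Descriptor check: skip the page if any query bit is missing.
            if (self.descriptors.get(addr, 0) & qdesc) != qdesc:
                continue
            self.page_reads += 1
            results.extend(
                r for r in self.pages.get(addr, [])
                if all(q is None or q == f for q, f in zip(query, r))
            )
        return results


f = HashedFile()
for rec in [("smith", "melbourne", "1983"), ("jones", "sydney", "1979")]:
    f.insert(rec)
print(f.search(("smith", None, None)), f.page_reads)
```

With one field specified, pure hashing in this sketch would touch all 16 candidate pages, whereas the descriptor check lets the query skip any page whose descriptor lacks the query's bit. A descriptor match can still be a false positive, so the records on an accessed page are compared against the query before being returned.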