Optimal partial-match retrieval when fields are independently specified

  • Authors:
  • Alfred V. Aho; Jeffrey D. Ullman

  • Affiliations:
  • Bell Labs, Murray Hill, NJ; Princeton Univ., Princeton, NJ

  • Venue:
  • ACM Transactions on Database Systems (TODS)
  • Year:
  • 1979

Abstract

This paper considers the design of a system to answer partial-match queries from a file containing a collection of records, each record consisting of a sequence of fields. A partial-match query is a specification of values for zero or more fields of a record, and the answer to a query is a listing of all records in the file whose fields match the specified values.

A design is considered in which the file is stored in a set of bins. A formula is derived for the optimal number of bits in a bin address to assign to each field, assuming the probability that a given field is specified in a query is independent of what other fields are specified. Implications of the optimality criterion on the size of bins are also discussed.
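The bin-address design described in the abstract can be illustrated with a small sketch. It is not the paper's derivation or its closed-form formula. It assumes that field i contributes b_i bits to the bin address, so a query that leaves field i unspecified must examine every value of those b_i bits. Under the stated independence assumption, the expected number of bins examined is then the product over fields of p_i + (1 - p_i) * 2^(b_i), where p_i is the probability that field i is specified. The function names, and the brute-force search used in place of a formula, are illustrative assumptions.

```python
from itertools import product


def bins_examined(bits, specified):
    """Bins a single query must examine.

    bits[i] is the number of address bits given to field i (assumed layout);
    specified[i] is True when the query fixes a value for field i.
    Every unspecified field leaves its address bits free.
    """
    free_bits = sum(b for b, s in zip(bits, specified) if not s)
    return 2 ** free_bits


def expected_bins(bits, probs):
    """Expected bins examined when field i is specified independently
    with probability probs[i]."""
    result = 1.0
    for b, p in zip(bits, probs):
        # Specified: contributes a factor of 1. Unspecified: a factor of 2**b.
        result *= p + (1.0 - p) * 2 ** b
    return result


def best_allocation(total_bits, probs):
    """Exhaustively search allocations of total_bits across the fields.

    This is a brute-force stand-in for an analytical optimum and is only
    practical for a handful of fields and bits.
    """
    best = None
    for bits in product(range(total_bits + 1), repeat=len(probs)):
        if sum(bits) != total_bits:
            continue
        cost = expected_bins(bits, probs)
        if best is None or cost < best[1]:
            best = (bits, cost)
    return best


if __name__ == "__main__":
    # Hypothetical example: two fields, 4 address bits in total.
    print(best_allocation(4, (0.9, 0.1)))
```

In this hypothetical two-field example, the search gives all four bits to the field that is specified 90% of the time, which matches the intuition that address bits are most useful on fields that queries usually fix.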