Efficient Storage and Retrieval by Content and Address of Static Files

  • Authors:
  • Peter Elias

  • Affiliations:
  • Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA

  • Venue:
  • Journal of the ACM (JACM)
  • Year:
  • 1974

Quantified Score

Hi-index 0.00

Visualization

Abstract

We consider a set of static files or inventories, each consisting of the same number of entries, each entry a binary word of the same fixed length selected (with replacement) from the set of all binary sequences of that length, and the entries in each file sorted into lexical order. We also consider several retrieval questions of interest for each such file. One is to find the value of the jth entry, another to find the number of entries of value less than k.When a binary representation of such a file is stored in computer memory and an algorithm or machine which knows only the file parameters (i.e. number of entries, number of possible values per entry) accesses some of the stored bits to answer a retrieval question, the number of bits stored and the number of bits accessed per retrieval question are two cost measures for the storage and retrieval task which have been used by Minsky and Papert. Bits stored depends on the representation chosen: bits accessed also depends on the retrieval question asked and on the algorithm used.We give firm lower bounds to minimax measures of bits stored and bits accessed for each of four retrieval questions, and construct representations and algorithms for a bit-addressable machine which come within factors of two or three of attaining all four bounds at once for files of any size. All four factors approach one for large enough files.