Design of a signature file method that accounts for non-uniform occurrence and query frequencies

  • Authors:
  • Christos Faloutsos; Stavros Christodoulakis

  • Venue:
  • VLDB '85 Proceedings of the 11th international conference on Very Large Data Bases - Volume 11
  • Year:
  • 1985

Abstract

In this paper we study a variation of the signature file access method for text and attribute retrieval. In this method, the documents (or records) are stored sequentially in the "text file", and abstractions ("signatures") of the documents (or records) are stored in the "signature file". The latter serves as a filter on retrieval: it helps discard a large number of nonqualifying documents. We propose a signature extraction method that takes the query and occurrence frequencies into account, thus achieving better performance. The model we present is general enough that the results can be applied not only to text retrieval but also to files with formatted data.
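
The filtering role of the signature file can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not describe the extraction scheme, so the sketch assumes superimposed coding (each word sets a few hashed bits in a fixed-width signature). The names `SIG_BITS`, `word_bits`, `signature`, `search`, and the `bits_per_word` table are hypothetical. Letting some words set more bits than others is only one assumed way of reflecting non-uniform frequencies, and the bit counts used are made up for illustration.

```python
import hashlib

SIG_BITS = 64  # signature width in bits (illustrative assumption)


def word_bits(word, n_bits):
    """Return n_bits hashed bit positions for a word (positions may repeat)."""
    positions = []
    for i in range(n_bits):
        digest = hashlib.sha256(f"{word}:{i}".encode()).digest()
        positions.append(int.from_bytes(digest[:4], "big") % SIG_BITS)
    return positions


def signature(words, bits_per_word=None, default_bits=3):
    """OR together the bit patterns of all words (superimposed coding).

    bits_per_word is a hypothetical table that lets selected words set
    more or fewer bits than default_bits.
    """
    bits_per_word = bits_per_word or {}
    sig = 0
    for w in words:
        for pos in word_bits(w, bits_per_word.get(w, default_bits)):
            sig |= 1 << pos
    return sig


def search(query_words, text_file, signature_file, bits_per_word=None):
    """Scan the signature file first; read the text only for candidates."""
    q = signature(query_words, bits_per_word)
    hits = []
    for doc_id, doc_sig in enumerate(signature_file):
        if doc_sig & q != q:
            continue  # some query bit is missing: the document cannot qualify
        # Candidate document: check the text itself to eliminate false drops.
        doc_words = text_file[doc_id].split()
        if all(w in doc_words for w in query_words):
            hits.append(doc_id)
    return hits


text_file = [
    "signature files filter documents",
    "inverted index for text retrieval",
    "signature methods for text",
]
weights = {"signature": 5}  # made-up bit count for one word
signature_file = [signature(d.split(), weights) for d in text_file]
print(search(["signature", "text"], text_file, signature_file, weights))
```

A document whose signature lacks any bit of the query signature is skipped without touching the text file. Documents that pass the bit test are only candidates, since unrelated words can set the same bits, so the sketch confirms each one against the stored text.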