Design of a signature file method that accounts for non-uniform occurrence and query frequencies

Authors:
Christos Faloutsos; Stavros Christodoulakis
Venue:
VLDB '85 Proceedings of the 11th international conference on Very Large Data Bases - Volume 11
Year:
1985

Abstract

In this paper we study a variation of the signature file access method for text and attribute retrieval. According to this method, the documents (or records) are stored sequentially in the "text file". Abstractions ("signatures") of the documents (or records) are stored in the "signature file". The latter serves as a filter on retrieval: it helps discard a large number of nonqualifying documents. We propose a signature extraction method that takes into account the query and occurrence frequencies, thus achieving better performance. The model we present is general enough that the results apply not only to text retrieval but also to files with formatted data.
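
The filtering role of the signature file can be illustrated with a minimal Python sketch. It is not the extraction method proposed in the paper: it assumes a generic superimposed-coding scheme, a 64-bit signature width (`SIG_BITS`), SHA-256 hashing to pick bit positions, and a hypothetical caller-supplied `bits_per_word` function that stands in for whatever rule assigns more or fewer bits to a word based on its query and occurrence frequencies. All of these names and values are assumptions made for illustration.

```python
import hashlib

SIG_BITS = 64  # assumed signature width in bits; must be >= any per-word bit count


def word_bits(word, n_bits):
    """Return n_bits distinct bit positions for a word (illustrative hashing only)."""
    positions = []
    counter = 0
    while len(positions) < n_bits:
        digest = hashlib.sha256(f"{word}:{counter}".encode()).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        if pos not in positions:
            positions.append(pos)
        counter += 1
    return positions


def signature(words, bits_per_word):
    """Superimpose (OR together) the bit patterns of all distinct words."""
    sig = 0
    for w in {w.lower() for w in words}:
        for pos in word_bits(w, bits_per_word(w)):
            sig |= 1 << pos
    return sig


def search(query_words, documents, signatures, bits_per_word):
    """Scan the signatures first, then verify surviving candidates in the text."""
    q = signature(query_words, bits_per_word)
    hits = []
    for doc, sig in zip(documents, signatures):
        if sig & q != q:
            continue  # signature rules the document out; the text is never read
        # The signature matched, but this may be a false drop: check the text.
        doc_words = set(doc.lower().split())
        if all(w.lower() in doc_words for w in query_words):
            hits.append(doc)
    return hits


# Hypothetical weighting: words assumed to be queried often get more bits.
def example_bits_per_word(word):
    return 4 if word in {"signature", "retrieval"} else 2


docs = [
    "signature files filter documents",
    "inverted files index words",
    "text retrieval with a signature",
]
sigs = [signature(d.split(), example_bits_per_word) for d in docs]
print(search(["signature"], docs, sigs, example_bits_per_word))
```

The signature test can produce false positives but never false negatives, because every bit set by a query word is also set in the signature of any document containing that word. This is why each candidate is still checked against the text itself.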