FAST-INV: A Fast Algorithm for building large inverted files

Authors:
Edward A. Fox;Whay C. Lee
Affiliations:
-;-
Venue:
FAST-INV: A Fast Algorithm for building large inverted files
Year:
1991

Citing 0
Cited 3

Fast Incremental Indexing for Full-Text Information Retrieval
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Efficient single-pass index construction for text databases
Journal of the American Society for Information Science and Technology

Inverted files for text search engines
ACM Computing Surveys (CSUR)

Abstract

Inverted files are widely used in building bibliographic and other types of retrieval systems. In order to investigate the utility of advanced information retrieval methods for improving access to large online library catalogs, it was necessary to extend the SMART system in a variety of ways. One particular problem was to develop a fast method to produce an inverted file from hundreds of thousands of (partial) MARC records. The FAST-INV software was developed in 1986, taking advantage of the large primary memories available on modern computers and the order inherent in the input data. Using the new algorithm, processing in primary memory for N basic data elements has time complexity O(N), and processing of files that will not fit in primary memory can be accomplished in a fixed number of passes. Performance studies show this approach to be (at least) an order of magnitude faster than commonly used techniques. It is hoped that these findings will be of interest to database providers and will help them reduce costs relating to the building of inverted files, as we have been doing for the last five years.
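The abstract does not give the algorithm itself, so the following is only a minimal sketch of how an in-memory inversion of N elements can run in O(N) time when the input is already ordered by document. It assumes the input is a list of (document number, concept number) pairs with concept numbers in the range 0 to num_concepts - 1. The function names, the pair layout, and the counting approach are illustrative assumptions, not the FAST-INV code.

```python
from typing import List, Tuple


def invert_in_memory(
    pairs: List[Tuple[int, int]], num_concepts: int
) -> Tuple[List[int], List[int]]:
    """Invert (doc, concept) pairs with two linear scans (hypothetical sketch).

    Returns (offsets, postings): the postings of concept c are stored in
    postings[offsets[c]:offsets[c + 1]].
    """
    # Scan 1: count how many postings each concept will receive.
    offsets = [0] * (num_concepts + 1)
    for _doc, concept in pairs:
        offsets[concept + 1] += 1

    # Prefix sums turn the counts into the start position of each list.
    for concept in range(num_concepts):
        offsets[concept + 1] += offsets[concept]

    # Scan 2: drop each document number into the next free slot of its list.
    next_slot = offsets[:-1]  # slicing copies, so offsets stays intact
    postings = [0] * len(pairs)
    for doc, concept in pairs:
        postings[next_slot[concept]] = doc
        next_slot[concept] += 1
    return offsets, postings


def postings_for(offsets: List[int], postings: List[int], concept: int) -> List[int]:
    """Return the list of document numbers recorded for one concept."""
    return postings[offsets[concept]:offsets[concept + 1]]


if __name__ == "__main__":
    # Pairs are assumed to arrive in document order, as in a document vector file.
    sample = [(1, 0), (1, 2), (2, 0), (2, 1), (3, 2)]
    offs, posts = invert_in_memory(sample, num_concepts=3)
    for c in range(3):
        print(c, postings_for(offs, posts, c))
```

Because the pairs are read in document order and each is written to the next free slot, every posting list comes out sorted by document number without a comparison sort. For input too large for memory, one plausible extension is to repeat the same routine once per range of concept numbers that fits in memory, although the abstract does not say how the passes are organized.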