HYTREM: A Hybrid Text-Retrieval Machine for Large Databases

  • Authors:
  • Dik Lun Lee;Frederick H. Lochovsky

  • Affiliations:
  • Ohio State Univ., Columbus;Univ. of Toronto, Toronto, Canada

  • Venue:
  • IEEE Transactions on Computers
  • Year:
  • 1990

Quantified Score

Hi-index 14.98

Abstract

The design of a text-retrieval machine, called HYTREM (hybrid text-retrieval machine), for the support of large unformatted text databases is described. A signature file is used as an access method to reduce the amount of data that must be searched directly. Accordingly, HYTREM consists of two major subsystems: a signature processor and a text processor. The signature processor is based on a word-parallel, bit-serial organization, which is faster, more efficient, and more flexible than the word-serial, bit-parallel organization proposed by S.R. Ahuja and C.S. Roberts (1980). The text processor, called ALTEP (associative linear text processor), is a linear array of logic cells capable of matching regular expressions at a much higher speed than that of previous designs. Since both the signature processor and ALTEP are highly parallel processors, a high-speed multiple-response resolver is provided to facilitate data transfer between the processors and the controllers over a single common bus. Issues concerning the design of a cost-effective mass-storage system are also discussed, along with performance and implementation issues for HYTREM.
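
The two-stage idea in the abstract (a signature file as a coarse filter, followed by a direct text search of the surviving blocks) can be illustrated in software. The following is a minimal sketch only, not the HYTREM hardware design: it assumes superimposed-coding signatures, and the names `SIG_BITS`, `BITS_PER_WORD`, `word_signature`, `block_signature`, and `search` are hypothetical, as are the signature width and the hash function. Python's `re` module stands in for the ALTEP regular-expression matcher.

```python
import hashlib
import re

SIG_BITS = 64       # assumed signature width in bits (not from the paper)
BITS_PER_WORD = 3   # assumed number of bits each word sets (not from the paper)


def word_signature(word):
    """Hash a word to a SIG_BITS-wide bit pattern with up to BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        pos = int.from_bytes(digest[:4], "big") % SIG_BITS
        sig |= 1 << pos
    return sig


def block_signature(text):
    """Superimpose (bitwise OR) the signatures of all words in a text block."""
    sig = 0
    for word in re.findall(r"[A-Za-z]+", text):
        sig |= word_signature(word)
    return sig


def search(blocks, signatures, query_words, pattern):
    """Stage 1: signature filter. Stage 2: regex scan of the candidate blocks only."""
    query_sig = 0
    for word in query_words:
        query_sig |= word_signature(word)

    # A block qualifies when every bit of the query signature is set in its
    # signature. This can admit false drops but never misses a true match.
    candidates = [i for i, s in enumerate(signatures)
                  if s & query_sig == query_sig]

    # Only the candidates are searched directly, which removes the false drops.
    regex = re.compile(pattern, re.IGNORECASE)
    hits = [i for i in candidates if regex.search(blocks[i])]
    return candidates, hits
```

For example, after building `signatures = [block_signature(b) for b in blocks]`, a call such as `search(blocks, signatures, ["logic"], r"logic\s+cells")` scans only the blocks whose signatures cover the query signature. The loop over `signatures` here is sequential; the abstract describes a processor that performs the signature comparison in a word-parallel, bit-serial fashion instead.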