Recent Studies in Automatic Text Analysis and Document Retrieval

  • Authors: G. Salton
  • Affiliations: Cornell University, Department of Computer Science, Ithaca, New York
  • Venue: Journal of the ACM (JACM)
  • Year: 1973

Abstract

Many experts in mechanized text processing now agree that useful automatic language analysis procedures are largely unavailable and that the existing linguistic methodologies generally produce disappointing results. An attempt is made in the present study to identify those automatic procedures which appear most effective as a replacement for the missing language analysis. A series of computer experiments is described, designed to simulate a conventional document retrieval environment. It is found that a simple duplication, by automatic means, of the standard, manual document indexing and retrieval operations will not produce acceptable output results. New mechanized approaches to document handling are proposed, including document ranking methods, automatic dictionary and word list generation, and user feedback searches. It is shown that the fully automatic methodology is superior in effectiveness to the conventional procedures in normal use.