Fast text processing for information retrieval

  • Authors:
  • Tomek Strzalkowski; Barbara Vauthey

  • Venue:
  • HLT '91 Proceedings of the workshop on Speech and Natural Language
  • Year:
  • 1991

Abstract

We describe an advanced text processing system for information retrieval from natural language document collections. We use both syntactic processing and statistical term clustering to obtain a representation of documents that is more accurate than those obtained with more traditional keyword methods. A reliable top-down parser has been developed that allows fast processing of large amounts of text and precise identification of the desired types of phrases for statistical analysis. Two statistical measures are computed: the informational contribution of words in phrases, and the similarity between words.
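
The abstract does not give formulas for the two measures, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the parser has already produced (head, modifier) word pairs, and it swaps in two simple stand-ins: a relative-frequency share for the contribution of a word to a phrase, and Jaccard overlap of shared pair contexts for word similarity. All function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def pair_statistics(pairs):
    """Count (head, modifier) pairs and record each word's pair contexts.

    `pairs` is assumed to come from a parser; here it is just a list of tuples.
    """
    pair_counts = Counter(pairs)
    contexts = defaultdict(set)
    for head, modifier in pair_counts:
        # A head is characterised by its modifiers, and a modifier by its heads.
        contexts[head].add(("mod", modifier))
        contexts[modifier].add(("head", head))
    return pair_counts, dict(contexts)


def contribution(word, pair, pair_counts):
    """Stand-in measure: share of the word's pair occurrences that fall in `pair`."""
    total = sum(count for p, count in pair_counts.items() if word in p)
    return pair_counts[pair] / total if total else 0.0


def similarity(w1, w2, contexts):
    """Stand-in measure: Jaccard overlap of the contexts the two words occur in."""
    a = contexts.get(w1, set())
    b = contexts.get(w2, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical parser output, for illustration only.
sample_pairs = [
    ("retrieval", "information"),
    ("retrieval", "document"),
    ("retrieval", "text"),
    ("processing", "text"),
    ("processing", "information"),
    ("retrieval", "information"),
]

pair_counts, contexts = pair_statistics(sample_pairs)
print(contribution("information", ("retrieval", "information"), pair_counts))
print(similarity("information", "text", contexts))
```

In this toy data, "information" and "text" modify the same heads, so the stand-in similarity treats them as close; a real system would need far more text and a measure that discounts very frequent words.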