A common architecture for different text processing techniques in an information retrieval environment

  • Authors: G. Thurmair
  • Affiliations: SIEMENS AG, Munich
  • Venue: Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 1986

Abstract

This paper gives an overview of REALIST (Retrieval Aids by Linguistics and Statistics), a text processing software package that integrates different text processing techniques behind a common interface. It supports the user by offering the environment of a given term, derived by morphological, syntactic and statistical means. The user can call up the processing results, use them for indexing, classification or retrieval purposes, and combine them as desired, e.g. to set up a search logic. The text processing is done on a mainframe computer; the results are transferred to a minicomputer, where the evaluation is performed. REALIST is a stand-alone package that fits any existing search system. In the retrieval context, this technique reduces connect time and improves the search results. REALIST can run on English and German texts. Each REALIST component has been tested separately with good success. An integrated version is currently under test at the US Patent and Trademark Office using 150,000 English patent abstracts, and a German version is being tested with 12,000 legal texts of the European Community.
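
The idea of offering the "environment" of a term and combining the results into a search logic can be illustrated with a minimal sketch. This is not REALIST's implementation, which the abstract does not describe in detail. The toy documents, the prefix match standing in for morphological analysis, the co-occurrence count standing in for the statistical component, and all function names are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy collection; not taken from the REALIST test data.
DOCS = [
    "the valve controls the flow of fuel",
    "fuel valves are controlled by a sensor",
    "a sensor measures fuel pressure",
]

# Assumed stopword list for this toy example only.
STOPWORDS = frozenset({"the", "of", "a", "are", "by"})


def morphological_variants(term, vocabulary):
    """Crude stand-in for morphological analysis: words that start with the term."""
    return sorted(w for w in vocabulary if w.startswith(term))


def cooccurring_terms(variants, docs, top_n=1):
    """Statistical environment: most frequent words in documents containing a variant."""
    counts = Counter()
    for doc in docs:
        words = doc.split()
        if any(v in words for v in variants):
            counts.update(
                w for w in words if w not in variants and w not in STOPWORDS
            )
    return [w for w, _ in counts.most_common(top_n)]


def build_query(term, docs, top_n=1):
    """Combine both environments into a Boolean search expression."""
    vocabulary = {w for d in docs for w in d.split()}
    variants = morphological_variants(term, vocabulary)
    related = cooccurring_terms(variants, docs, top_n)
    return "(" + " OR ".join(variants) + ") AND (" + " OR ".join(related) + ")"


if __name__ == "__main__":
    print(build_query("valve", DOCS))
```

In this sketch the user would start from a single term such as "valve", inspect the variants and related terms, and keep only those that are useful before sending the combined expression to the search system.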