A lexically-intensive algorithm for domain-specific knowledge acquisition

  • Authors: René Schneider
  • Affiliations: Text Understanding Systems, Ulm, Germany
  • Venue: NeMLaP3/CoNLL '98, Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning
  • Year: 1998

Abstract

This paper outlines a statistical learning algorithm for information extraction systems. The algorithm performs a lexically intensive analysis of a small number of texts from a single domain, providing robust lemmatisation of word forms and collecting the most important syntagmatic dependencies as weighted regular expressions. The resulting lexical and syntactic knowledge is stored in a very compact knowledge base that enables the analysis of correct and partly incorrect texts or messages which, owing to transmission errors or spelling and grammatical mistakes, would otherwise be rejected by conventional systems.
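The abstract describes the approach only at a high level. As a rough illustration of the core idea, here is a minimal Python sketch, under the assumption that the knowledge base is a set of regular expressions each carrying a weight, and that a message is accepted once the summed weight of its matching patterns crosses a threshold. All patterns, weights, and names below are invented for illustration and are not taken from the paper.

```python
import re

# Hypothetical knowledge base: a few weighted regular expressions standing in
# for the paper's syntagmatic dependencies (patterns and weights invented).
WEIGHTED_PATTERNS = [
    (re.compile(r"\bdeliver\w*\b.*\bby\s+\w+day\b", re.I), 0.9),
    (re.compile(r"\border\s+(?:no\.?|number)\s*\d+", re.I), 0.8),
    (re.compile(r"\bcancel\w*\b", re.I), 0.5),
]

ACCEPT_THRESHOLD = 0.7  # invented cut-off for accepting a message


def score_message(text: str) -> float:
    """Sum the weights of every pattern that matches the input.

    Each dependency contributes independently, so a message in which one
    phrase is corrupted can still be accepted on the strength of the
    patterns that do match.
    """
    return sum(weight for pattern, weight in WEIGHTED_PATTERNS
               if pattern.search(text))


if __name__ == "__main__":
    # A partly corrupted message: "Pleese" and "deliverd" are misspelled,
    # yet the order-number and "deliver ... by <day>" dependencies match.
    msg = "Pleese confirm: order number 4711 must be deliverd by Friday"
    score = score_message(msg)
    verdict = "accepted" if score >= ACCEPT_THRESHOLD else "rejected"
    print(f"score={score:.2f} -> {verdict}")
```

Because the score is additive over independent patterns, a single corrupted phrase only lowers the score instead of causing outright rejection, which is the behaviour the abstract attributes to the system; the paper's actual weighting and lemmatisation machinery is of course richer than this toy matcher.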